Flag similarities between DataFrames in a new column

I want to compare two pandas DataFrames of different lengths and identify the matching index numbers. Where the values match, I want to flag them in a new column.

df1: 
Index Column 1 
41660 Apple 
41935 Banana 
42100 Strawberry 
42599 Pineapple 

df2: 
Index Column 1 
42599 Pineapple 

Output: 
Index Column 1 'Matching Index?' 
41660 Apple 
41935 Banana 
42100 Strawberry 
42599 Pineapple True

Source

2016-07-07 zbug

Possible duplicate of [Comparing 2 columns of two Python Pandas dataframes and getting the common rows](http://stackoverflow.com/questions/30291032/comparing-2-columns-of-two-python-pandas-dataframes-and-getting-the-common - ) – Andy

If these really are the indexes, you can use intersection on the indexes:

In [61]: 
df1.loc[df1.index.intersection(df2.index), 'flag'] = True 
df1 

Out[61]: 
     Column 1 flag 
Index     
41660  Apple NaN 
41935  Banana NaN 
42100 Strawberry NaN 
42599 Pineapple True
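
If a full boolean column is preferable to the True/NaN mix shown above, one possible variation is to test index membership directly with `Index.isin`. This is only a minimal sketch: it assumes the frames from the question are rebuilt with `Index` as the actual index, and the column name `flag` simply follows the answer.

```python
import pandas as pd

# Frames from the question, rebuilt with "Index" as the real index (assumption)
df1 = pd.DataFrame(
    {"Column 1": ["Apple", "Banana", "Strawberry", "Pineapple"]},
    index=pd.Index([41660, 41935, 42100, 42599], name="Index"),
)
df2 = pd.DataFrame(
    {"Column 1": ["Pineapple"]},
    index=pd.Index([42599], name="Index"),
)

# True where an index label of df1 also occurs in df2's index, False elsewhere
df1["flag"] = df1.index.isin(df2.index)
print(df1)
```

Unlike the `.loc` assignment, this fills every row, so the column has a plain boolean dtype.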

Otherwise, use isin:

In [63]: 
df1.loc[df1['Index'].isin(df2['Index']), 'flag'] = True 
df1 

Out[63]: 
    Index Column 1 flag 
0 41660  Apple NaN 
1 41935  Banana NaN 
2 42100 Strawberry NaN 
3 42599 Pineapple True

Source

2016-07-07 15:28:26 EdChum

Thanks, this solved my problem. – zbug

+1 to @EdChum's answer. If you can live with a value other than True in your matching column, try:

>>> df1.merge(df2,how='outer',indicator='Flag') 
    Index  Column  Flag 
0 41660  Apple left_only 
1 41935  Banana left_only 
2 42100 Strawberry left_only 
3 42599 Pineapple  both
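
If a True/False flag is still wanted, the indicator column can be collapsed afterwards. The following is a sketch under assumptions, not part of the answer itself: it rebuilds the question's frames with `Index` as an ordinary column, uses `how='left'` so that only the rows of df1 are kept, and the column name `Matching` is made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Index": [41660, 41935, 42100, 42599],
    "Column 1": ["Apple", "Banana", "Strawberry", "Pineapple"],
})
df2 = pd.DataFrame({"Index": [42599], "Column 1": ["Pineapple"]})

# With no "on" argument, merge joins on all shared columns
# ("Index" and "Column 1"); indicator="Flag" records where each row came from
merged = df1.merge(df2, how="left", indicator="Flag")

# Collapse the indicator into a boolean column (hypothetical name "Matching")
merged["Matching"] = merged["Flag"] == "both"
print(merged)
```

Because the join uses every shared column, a row is flagged only when both `Index` and `Column 1` agree. Duplicate keys in df2 would also duplicate the matching rows in the result.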

Source

2016-07-07 15:34:17 bernie

Using the isin() method:

import pandas as pd 

df1 = pd.DataFrame(
    data=[
        [41660, 'Apple'],
        [41935, 'Banana'],
        [42100, 'Strawberry'],
        [42599, 'Pineapple'],
    ],
    columns=['Index', 'Column 1'],
)

df2 = pd.DataFrame(
    data=[
        [42599, 'Pineapple'],
    ],
    columns=['Index', 'Column 1'],
)

df1['Matching'] = df1['Index'].isin(df2['Index']) 
print(df1)

Output:

Index Column 1 Matching 
0 41660  Apple False 
1 41935  Banana False 
2 42100 Strawberry False 
3 42599 Pineapple  True

Source

2016-07-07 15:39:57 Blind0ne

'isin' was already mentioned in my answer – EdChum
