How do I remove rows from a Pandas dataframe if the same row exists in another dataframe?

I have two dataframes:

df1 = row1;row2;row3 
df2 = row4;row5;row6;row2

I want my output dataframe to contain only the rows that are unique to df1, i.e.:

df_out = row1;row3

What is the most efficient way to get this?

This code does what I want, but uses two for loops:

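import pandas as pd

a = pd.DataFrame({0: [1, 2, 3], 1: [10, 20, 30]})
b = pd.DataFrame({0: [0, 1, 2, 3], 1: [0, 1, 20, 3]})

# For each row of a, record True if no identical row exists in b
match_ident = []
for i in range(len(a)):
    found = False
    for j in range(len(b)):
        if a[0][i] == b[0][j] and a[1][i] == b[1][j]:
            found = True
    match_ident.append(not found)

# Keep only the rows of a that were not found in b
a = a[match_ident]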

Asked 2017-06-22 by RRC

Not a duplicate, because I don't have a unique identifier that maps to common values in the two dataframes. – RRC

I can't flag it, but see https://stackoverflow.com/questions/28901683/pandas-get-rows-which-are-not-in-other-dataframe – victor

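You can use merge with the indicator parameter and an outer join, query to filter, and then drop to remove the helper column:

The DataFrames are joined on all common columns, so the on parameter can be omitted.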

print(pd.merge(a, b, indicator=True, how='outer')
        .query('_merge == "left_only"')
        .drop('_merge', axis=1))
   0   1
0  1  10
2  3  30
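
To see what the indicator column is doing, here is a minimal sketch of the same idea split into steps. It is a variation rather than the exact code above: it assumes a left join is enough (only rows of a are of interest) and that dropping duplicate rows of b first is acceptable. The names merged and only_in_a are just for illustration.

```python
import pandas as pd

a = pd.DataFrame({0: [1, 2, 3], 1: [10, 20, 30]})
b = pd.DataFrame({0: [0, 1, 2, 3], 1: [0, 1, 20, 3]})

# Left join on all common columns; indicator=True adds a "_merge" column
# saying whether each row was found in a only ("left_only") or in both.
# Dropping duplicates in b first (an assumption) avoids repeating matched rows.
merged = a.merge(b.drop_duplicates(), how="left", indicator=True)

# Keep the rows that exist only in a, then remove the helper column.
only_in_a = merged.loc[merged["_merge"] == "left_only"].drop(columns="_merge")
print(only_in_a)
```

Note that merge builds a new index for its result, so the index labels in the output are positions in the merged frame, not necessarily the original labels of a.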

Answered 2017-06-22 18:26:33 by jezrael

Awesome! I hadn't thought of using the indicator parameter. That solves my problem. – RRC

Glad I could help, have a nice day! – jezrael

You can convert a and b to Indexes, then use the Index.isin method to determine which rows they have in common:

import pandas as pd 
a = pd.DataFrame({0:[1,2,3],1:[10,20,30]}) 
b = pd.DataFrame({0:[0,1,2,3],1:[0,1,20,3]}) 

a_index = a.set_index([0,1]).index 
b_index = b.set_index([0,1]).index 
mask = ~a_index.isin(b_index) 
result = a.loc[mask] 
print(result)

yields

   0   1
0  1  10
2  3  30
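
If the column names are not known in advance, the same approach can be wrapped in a small function. This is only a sketch: rows_not_in is a hypothetical helper name, and it assumes both frames share the same columns with comparable dtypes.

```python
import pandas as pd


def rows_not_in(left, right):
    """Return the rows of left that have no identical row in right.

    Hypothetical helper; assumes right contains every column of left.
    """
    cols = list(left.columns)
    # Turn every row into one entry of an Index (a MultiIndex when
    # there is more than one column) so whole rows can be compared.
    left_index = left.set_index(cols).index
    right_index = right[cols].set_index(cols).index
    # True where the row of left does not appear anywhere in right
    mask = ~left_index.isin(right_index)
    return left.loc[mask]


a = pd.DataFrame({0: [1, 2, 3], 1: [10, 20, 30]})
b = pd.DataFrame({0: [0, 1, 2, 3], 1: [0, 1, 20, 3]})

result = rows_not_in(a, b)
print(result)
```

Because the mask is applied directly to left, the result keeps the original index labels of a.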

Answered 2017-06-22 18:33:22 by unutbu
