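Pandas: combine one DataFrame with the difference from another DataFrame of a different shape
I want to combine these two DataFrames (df1 and df2), but only add the records from df2 that are not already in the first DataFrame (df1). In the example below, I want the result to pick up only records 0, 1, 4 and 5 from df2 and not merge in records 2 and 3, because their complex/unit combination already appears in df1. I tried merge and np.where with no luck: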
np.where(df1[['complex','unit']] != df2[['complex','unit']])
This raises ValueError: Can only compare identically-labeled DataFrame objects
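The error most likely comes from the element-wise `!=` comparison, which needs both frames to carry the same row and column labels; here the two frames have different lengths and indexes. One possible way to express "rows of df2 whose (complex, unit) pair is absent from df1" is a left merge with `indicator=True`. This is a minimal sketch: the frames below are trimmed-down stand-ins for the ones in the question, and storing `unit` as strings is an assumption.

```python
import pandas as pd

# Trimmed-down stand-ins for the question's frames (assumed dtypes).
df1 = pd.DataFrame({
    "complex": [6, 6, 6, 6, 6],
    "unit": ["10", "11", "10", "A", "13"],
    "location": ["UpMaster"] * 5,
})
df2 = pd.DataFrame({
    "complex": [7, 4, 6, 6, 10, 6],
    "unit": ["1807", "7", "10", "A", "110A", "12"],
})

# Distinct (complex, unit) pairs already present in df1.
pairs = df1[["complex", "unit"]].drop_duplicates()

# Left-join df2 against those pairs; the _merge column records
# whether each df2 row found a match ("both") or not ("left_only").
flagged = df2.merge(pairs, on=["complex", "unit"], how="left", indicator=True)
new_rows = flagged.loc[flagged["_merge"] == "left_only", ["complex", "unit"]]

# Stack the unmatched df2 rows under df1; columns missing from df2 become NaN.
result = pd.concat([df1, new_rows], ignore_index=True)
print(result)
```

Because `pairs` is de-duplicated before the merge, each df2 row shows up at most once in `flagged`, even though df1 repeats some complex/unit pairs.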
DF1
company complex unit location datetime serial seq interval
3 6 10 UpMaster 2017-07-21 00:33:37 1505.0 3400.0 1554
4 6 11 UpMaster 2017-07-21 00:59:44 1505.0 3401.0 1567
5 6 10 UpMaster 2017-07-21 01:25:41 1505.0 3402.0 1557
6 6 A UpMaster 2017-07-21 01:51:45 1505.0 3403.0 1564
7 6 13 UpMaster 2017-07-21 02:17:48 1505.0 3404.0 1563
DF2
index complex unit
0 7 1807
1 4 7
2 6 10
3 6 A
4 10 110A
5 6 12
Desired result
company complex unit location datetime serial seq interval
3 6 10 UpMaster 2017-07-21 00:33:37 1505.0 3400.0 1554
4 6 11 UpMaster 2017-07-21 00:59:44 1505.0 3401.0 1567
5 6 10 Down 2017-07-21 01:25:41 1505.0 3402.0 1557
6 6 A UpMaster 2017-07-21 01:51:45 1505.0 3403.0 1564
7 6 13 UpMaster 2017-07-21 02:17:48 1505.0 3404.0 1563
8 7 1807 NaN NaN NaN NaN NaN
9 4 7 NaN NaN NaN NaN NaN
10 10 110A NaN NaN NaN NaN NaN
11 6 12 NaN NaN NaN NaN NaN
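Edit: the append method worked, thanks!
df1 = df1.append(df2[~df2['unit_id'].isin(df1['unit_id'].unique())], ignore_index=True)
The line above is the final solution, which I could use once a unit_id unique identifier had been added. Where no such identifier is available, the suggestion is to build a key from the two semi-unique fields: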
df1['key'] = df1['complex'].astype(str) + ' ' + df1['unit'].astype(str)
df2['key'] = df2['complex'].astype(str) + ' ' + df2['unit'].astype(str)
df1 = df1.append(df2[~df2['key'].isin(df1['key'].unique())], ignore_index=True)
df1 = df1.drop('key',axis=1)
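For readers on a recent pandas: `DataFrame.append` was removed in pandas 2.0, so the same idea can be written with `pd.concat`. The sketch below also swaps the string key for a direct comparison of (complex, unit) tuples via `MultiIndex.isin`, which avoids adding and dropping a helper column. It is only one possible variant, not the solution used above; the sample frames and the string dtype of `unit` are assumptions.

```python
import pandas as pd

# Trimmed-down stand-ins for the question's frames (assumed dtypes).
df1 = pd.DataFrame({
    "complex": [6, 6, 6, 6, 6],
    "unit": ["10", "11", "10", "A", "13"],
    "interval": [1554, 1567, 1557, 1564, 1563],
})
df2 = pd.DataFrame({
    "complex": [7, 4, 6, 6, 10, 6],
    "unit": ["1807", "7", "10", "A", "110A", "12"],
})

keys = ["complex", "unit"]

# Treat each (complex, unit) pair as one tuple-valued key.
df1_pairs = pd.MultiIndex.from_frame(df1[keys])
df2_pairs = pd.MultiIndex.from_frame(df2[keys])

# True for df2 rows whose pair does not occur anywhere in df1.
is_new = ~df2_pairs.isin(df1_pairs)

# pd.concat takes the place of the removed DataFrame.append.
combined = pd.concat([df1, df2[is_new]], ignore_index=True)
print(combined)
```

Note the `~` operator for negating the boolean mask; unary `-` on a boolean mask is rejected by current NumPy and pandas.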
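pd.concat, then drop_duplicates by 'complex' and 'unit' – Wen
Also, index 5 in df2: should it be included in the new DataFrame? – Wen
Good catch, Wen. I updated the desired result to account for that. I'm not sure how pd.concat with drop_duplicates on complex and unit would work, since there may be multiple records per unit/complex pair –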
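To illustrate the concern in the last comment, here is a small sketch (sample data and the string dtype of `unit` are assumptions) showing that a plain `pd.concat` followed by `drop_duplicates` on complex/unit also collapses the repeated pairs inside df1, and one possible workaround that only drops duplicated rows originating from df2.

```python
import pandas as pd

# Trimmed-down stand-ins; df1 deliberately repeats the pair (6, "10").
df1 = pd.DataFrame({
    "complex": [6, 6, 6, 6, 6],
    "unit": ["10", "11", "10", "A", "13"],
    "seq": [3400.0, 3401.0, 3402.0, 3403.0, 3404.0],
})
df2 = pd.DataFrame({
    "complex": [7, 4, 6, 6, 10, 6],
    "unit": ["1807", "7", "10", "A", "110A", "12"],
})

keys = ["complex", "unit"]
stacked = pd.concat([df1, df2], ignore_index=True)

# Naive version: the second (6, "10") row of df1 is lost as well.
deduped = stacked.drop_duplicates(subset=keys, keep="first")

# Workaround: drop a row only if it repeats an earlier pair AND came from df2.
from_df2 = stacked.index >= len(df1)
repeats_earlier = stacked.duplicated(subset=keys, keep="first")
result = stacked[~(repeats_earlier & from_df2)].reset_index(drop=True)

print(len(deduped), len(result))
```

With this data the naive version keeps 8 rows while the workaround keeps all 5 df1 rows plus the 4 new df2 rows. The workaround would also drop pairs repeated within df2 itself, which may or may not be what is wanted.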