2017-10-18 78 views
-2

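Concatenating two DataFrames along the columns with a non-unique index

I have two DataFrames and I want to concatenate them along the columns. The index is not unique: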

df1 = pd.DataFrame({'A': ['0', '1', '2', '2'], 'B': ['B0', 'B1', 'B2', 'B3'], 'C': ['C0', 'C1', 'C2', 'C3']})
   A   B   C
0  0  B0  C0
1  1  B1  C1
2  2  B2  C2
3  2  B3  C3

df2 = pd.DataFrame({'A': ['0', '2', '3'], 'E': ['E0', 'E1', 'E2']}, index=[0, 2, 3])
   A   E
0  0  E0
2  2  E1
3  3  E2

A should be my index. What I want is:

   A   B   C    E
0  0  B0  C0   E0
1  1  B1  C1  NaN
2  2  B2  C2   E1
3  2  B3  C3   E1

pd.concat([df1, df2], 1) gives me this error:

Reindexing only valid with uniquely valued Index objects 
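
The error points at duplicate index labels. As a minimal sketch (assuming the intent is to use A as the index on both frames, as the question states; the names `left`, `right` and `result` are only illustrative), the uniqueness of each index can be checked before attempting the column-wise concat:

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['0', '1', '2', '2'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'C': ['C0', 'C1', 'C2', 'C3']})
df2 = pd.DataFrame({'A': ['0', '2', '3'],
                    'E': ['E0', 'E1', 'E2']}, index=[0, 2, 3])

# Use A as the index on both frames, as the question asks.
left = df1.set_index('A')
right = df2.set_index('A')

print(left.index.is_unique)   # False: the label '2' appears twice
print(right.index.is_unique)  # True

# A column-wise concat has to align the two indexes. pandas versions that
# report the error above refuse to do that when an index has duplicates.
try:
    result = pd.concat([left, right], axis=1)
except Exception as exc:  # the exact exception class differs between versions
    result = None
    print(type(exc).__name__, exc)
```

If `is_unique` is False for either frame, matching on the values of A with a merge or join is usually a better fit than concat.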
+2

pd.concat([df1, df2], 1)

+0

Error: Reindexing only valid with uniquely valued Index objects

+0

Posted an answer...

Answers

4

Maybe you are looking for a left outer merge:

df1.merge(df2, how='left')
   A   B   C    E
0  0  B0  C0   E0
1  1  B1  C1  NaN
2  2  B2  C2   E1
3  2  B3  C3   E1
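
For reference, here is a self-contained sketch of the same merge with the key spelled out. Passing `on='A'` makes the join column explicit instead of relying on the shared column name, and `validate='many_to_one'` is an optional extra (not part of the answer above) that asks pandas to check that A is unique in df2:

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['0', '1', '2', '2'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'C': ['C0', 'C1', 'C2', 'C3']})
df2 = pd.DataFrame({'A': ['0', '2', '3'],
                    'E': ['E0', 'E1', 'E2']}, index=[0, 2, 3])

# Left join on the values of column A; the row labels of df2 are ignored.
# validate='many_to_one' raises if a key in A occurs more than once in df2.
merged = df1.merge(df2, on='A', how='left', validate='many_to_one')
print(merged)
```

Both rows of df1 with A equal to '2' receive E1, and the row with A equal to '1' has no match in df2, so its E value is NaN.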
0

Concatenate along the column axis with concat:

import pandas as pd 

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],'B': ['B0', 'B1', 'B2', 'B3'],'C': ['C0', 'C1', 'C2', 'C3']},index=[0, 1, 2, 2]) 

df2 = pd.DataFrame({'D': ['D0', 'D1'],'E': ['E0', 'E1']},index=[0, 2]) 

df = pd.concat([df1, df2], axis=1) 

Output:

    A   B   C    D    E
0  A0  B0  C0   D0   E0
1  A1  B1  C1  NaN  NaN
2  A2  B2  C2   D1   E1
2  A3  B3  C3   D1   E1
+0

Error: Reindexing only valid with uniquely valued Index objects

1

By using combine_first:

df1.combine_first(df2).dropna(subset=['A'], axis=0)
Out[320]:
    A   B   C    D    E
0  A0  B0  C0   D0   E0
1  A1  B1  C1  NaN  NaN
2  A2  B2  C2   D1   E1
2  A3  B3  C3   D1   E1

After your edit:

By using combine_first:

df1.combine_first(df2.set_index('A'))
Out[338]:
   A   B   C    E
0  0  B0  C0   E0
1  1  B1  C1  NaN
2  2  B2  C2   E1
3  2  B3  C3   E2

Or:

pd.concat([df1, df2.set_index('A')], axis=1)
Out[339]:
   A   B   C    E
0  0  B0  C0   E0
1  1  B1  C1  NaN
2  2  B2  C2   E1
3  2  B3  C3   E2
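
The question's desired output has E1 in both rows where A is 2, whereas the outputs above show E2 in the last row. One possible way to match on the values of A instead of on row labels is `DataFrame.join` with `on='A'`. This is a sketch of an alternative, not the answerer's method, and the names `joined` and `indexed` are only illustrative:

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['0', '1', '2', '2'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'C': ['C0', 'C1', 'C2', 'C3']})
df2 = pd.DataFrame({'A': ['0', '2', '3'],
                    'E': ['E0', 'E1', 'E2']}, index=[0, 2, 3])

# Look up each value of df1['A'] in the index of df2.set_index('A').
# join defaults to a left join, so every row of df1 is kept.
joined = df1.join(df2.set_index('A'), on='A')
print(joined)

# If A should end up as the (non-unique) index of the result:
indexed = joined.set_index('A')
print(indexed)
```

Because the lookup is by value, both rows with A equal to '2' pick up E1, and the key '3', which exists only in df2, does not appear in the result.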