2017-09-14 160 views

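pandas: select DataFrame columns based on another DataFrame's columns

I am trying to subset a pandas DataFrame based on the columns of another, similar DataFrame. I can do this easily in R: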

df1 <- data.frame(A=1:5, B=6:10, C=11:15) 
df2 <- data.frame(A=1:5, B=6:10) 

# Select columns in df1 that exist in df2
df1[df1 %in% df2]
  A  B
1 1  6
2 2  7
3 3  8
4 4  9
5 5 10

# Select columns in df1 that do not exist in df2
df1[!(df1 %in% df2)]
   C
1 11
2 12
3 13
4 14
5 15

How can I do the same with the following pandas DataFrames?

import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [6, 7, 8, 9, 10], 'C': [11, 12, 13, 14, 15]})
df2 = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [6, 7, 8, 9, 10]})

Answers

In [77]: df1[df1.columns.intersection(df2.columns)]
Out[77]:
   A   B
0  1   6
1  2   7
2  3   8
3  4   9
4  5  10

In [78]: df1[df1.columns.difference(df2.columns)]
Out[78]:
    C
0  11
1  12
2  13
3  14
4  15

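Or a similar, but less obvious, approach using sets (sets are unordered, so the column order of the result is not guaranteed):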

In [92]: df1[list(set(df1) & set(df2))]
Out[92]:
    B  A
0   6  1
1   7  2
2   8  3
3   9  4
4  10  5

In [93]: df1[list(set(df1) - set(df2))]
Out[93]:
    C
0  11
1  12
2  13
3  14
4  15
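
If df1's original column order should be kept, a boolean mask over the column labels is one option. This is a minimal sketch rather than part of the answer above; it assumes the same df1 and df2 as in the question and uses `Index.isin`, and the names `mask`, `common` and `only_df1` are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [6, 7, 8, 9, 10], 'C': [11, 12, 13, 14, 15]})
df2 = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [6, 7, 8, 9, 10]})

# Boolean array: True for each df1 column whose name also appears in df2
mask = df1.columns.isin(df2.columns)

# Columns shared with df2, in df1's own order
common = df1.loc[:, mask]

# Columns that appear only in df1
only_df1 = df1.loc[:, ~mask]
```

Because the mask is aligned with `df1.columns`, the selected columns come out in the order they have in df1.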

Using isin and dropna:

# axis=1 drops every column that contains a NaN
df1[df1.isin(df2)].dropna(axis=1)

   A   B
0  1   6
1  2   7
2  3   8
3  4   9
4  5  10


df1[~df1.isin(df2)].dropna(axis=1)

    C
0  11
1  12
2  13
3  14
4  15
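
Note that `isin` with a DataFrame argument compares cell values at matching index and column labels, not just column names, so this approach can give a different answer from the name-based ones when the values differ. The sketch below illustrates this with a hypothetical df2 (not the one from the question) in which one value in column B has been changed.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [6, 7, 8, 9, 10], 'C': [11, 12, 13, 14, 15]})
# Hypothetical df2: same column names as in the question, but the last value in B differs
df2 = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [6, 7, 8, 9, 99]})

# Value-based: the mismatch in B becomes a NaN, so dropna removes B along with C
by_value = df1[df1.isin(df2)].dropna(axis=1)

# Name-based: only column labels are compared, so B is kept
by_name = df1[df1.columns.intersection(df2.columns)]
```

Here `by_value` keeps only column A, while `by_name` keeps A and B. Which behaviour is wanted depends on whether "exists in df2" refers to the column's name or to its contents.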