2016-11-22 159 views

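Mapping values into a DataFrame from a different DataFrame in Pandas

Problem: I have two DataFrames, df1 and df2. My goal is to modify df1: whenever one of its rows is found in df2, some of its values should be replaced.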

import pandas as pd

# dataframe 1
data = {'A': [90, 20, 30, 25, 50, 60],
        'B': ['qq', 'ee', 'rr', 'tt', 'ii', 'oo'],
        'C': ['XX', 'VV', 'BB', 'NN', 'KK', 'JJ']}
df1 = pd.DataFrame(data)

# dataframe 2
convert_table = {'X': ['dd','ee','ff','gg','hh','ii','ll','mm','nn','oo','pp','qq','rr','ss','tt','uu'],
                 'Y': ['DD','VV','FF','GG','HH','KK','LL','MM','NN','JJ','PP','XX','BB','SS','NN','LL'],
                 'Z': [5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61]}
df2 = pd.DataFrame(convert_table)

# search values of df1 inside of df2 and replace values
for idx1, row1 in df1.iterrows():
    for idx2, row2 in df2.iterrows():
        if row1['B'] == row2['X'] and row1['C'] == row2['Y']:
            df1.replace(to_replace=row1['B'], value=row2['Z'], inplace=True)

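As you can see, I use two loops: for each row of df1 (row1), I check whether it can be found inside df2. If that condition is met, I replace the value contained in row1['B'] with the value contained in row2['Z'].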

This gives me the following result, which is exactly what I want:

In [120]: df1 
Out[120]: 
    A   B   C
0  90  43  XX
1  20   7  VV
2  30  47  BB
3  25  59  NN
4  50  19  KK
5  60  37  JJ

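Note how column B has changed.

Question: Can you suggest a better way to write my code? I would like it to run as fast as possible, using built-in functions provided by Pandas or Python.

Note: The data in the DataFrames is for demonstration purposes only.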

Answers

Use a merge on the two columns:

df1.merge(df2, left_on=['B','C'], right_on=['X','Y'], how='left') 

how='left' is the key here. If you don't see why, read the "Brief primer on merge methods (relational algebra)" section of the pandas documentation.
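To make the role of how='left' more concrete, here is a small sketch comparing it with the default how='inner'. The toy frames `left` and `right` are made up for illustration and are not the data from the question. Passing `indicator=True` adds a `_merge` column that records whether each row was matched.

```python
import pandas as pd

# Hypothetical toy frames, not the data from the question.
left = pd.DataFrame({'B': ['qq', 'ii'], 'C': ['XX', 'KK']})
right = pd.DataFrame({'X': ['qq'], 'Y': ['XX'], 'Z': [43]})

# Inner merge (the default): rows of `left` without a match are dropped.
inner = left.merge(right, left_on=['B', 'C'], right_on=['X', 'Y'], how='inner')

# Left merge: every row of `left` is kept; unmatched rows get NaN in X, Y, Z.
# indicator=True adds a `_merge` column ('both' or 'left_only').
kept = left.merge(right, left_on=['B', 'C'], right_on=['X', 'Y'],
                  how='left', indicator=True)

print(len(inner))               # matched rows only
print(len(kept))                # same number of rows as `left`
print(kept['_merge'].tolist())  # where each row came from
```

With an inner merge the ('ii', 'KK') row would silently disappear, which is why the left merge is the safer choice when the goal is to update df1 in place.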

I'll modify your example to create a case where an entry in df1 does not exist in df2, namely ('ii','KK'):

In [1]: 
# dataframe 2 
convert_table = {'X': ['dd','ee','ff','gg','hh','ll','mm','nn','oo','pp','qq','rr','ss','tt','uu'],
                 'Y': ['DD','VV','FF','GG','HH','LL','MM','NN','JJ','PP','XX','BB','SS','NN','LL'],
                 'Z': [5,7,11,13,17,19,23,29,37,41,43,47,53,59,61]}
df2 = pd.DataFrame(convert_table) 



In [2]: merged = df1.merge(df2, left_on=['B','C'], right_on=['X','Y'], how='left')
        merged
Out[2]:
    A   B   C    X    Y     Z
0  90  qq  XX   qq   XX  43.0
1  20  ee  VV   ee   VV   7.0
2  30  rr  BB   rr   BB  47.0
3  25  tt  NN   tt   NN  59.0
4  50  ii  KK  NaN  NaN   NaN
5  60  oo  JJ   oo   JJ  37.0

Then we build the final DataFrame:

In [3]: 
merged.loc[merged.Z.notnull(), 'B'] = merged.loc[merged.Z.notnull(), 'Z']
merged = merged[['A','B','C']] 
merged 

Out[3]: 
    A   B   C
0  90  43  XX
1  20   7  VV
2  30  47  BB
3  25  59  NN
4  50  ii  KK
5  60  37  JJ
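
The merge and the update can also be wrapped into a single step that returns a frame with the same columns as df1. This is only a sketch under some assumptions: the helper name `map_from_table` is made up here, the frames are a shortened version of the question's data, and the lookup table is assumed to contain each (X, Y) pair at most once (otherwise the left merge would produce extra rows and the row-by-row assignment would no longer line up).

```python
import pandas as pd

def map_from_table(df1, df2):
    """Return a copy of df1 where B is replaced by Z wherever (B, C) matches (X, Y) in df2.

    Hypothetical helper; assumes each (X, Y) pair occurs at most once in df2.
    """
    merged = df1.merge(df2, left_on=['B', 'C'], right_on=['X', 'Y'], how='left')
    out = df1.copy()
    # B will hold both strings and numbers, so make it object dtype explicitly.
    out['B'] = out['B'].astype(object)
    # Boolean mask of rows that found a match in df2.
    found = merged['Z'].notna().to_numpy()
    # Z becomes float after the merge because of the NaNs; cast the matched
    # values back to the dtype Z has in df2 before writing them into B.
    out.loc[found, 'B'] = merged.loc[found, 'Z'].astype(df2['Z'].dtype).to_numpy()
    return out

# Shortened, illustrative version of the question's data.
df1 = pd.DataFrame({'A': [90, 50], 'B': ['qq', 'ii'], 'C': ['XX', 'KK']})
df2 = pd.DataFrame({'X': ['qq', 'ee'], 'Y': ['XX', 'VV'], 'Z': [43, 7]})

result = map_from_table(df1, df2)
print(result)
```

Working on a copy leaves df1 untouched, and only column B is modified, so a value that happens to appear in another column is not replaced by accident.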

Is it possible to get an output with the same number of columns as the one in my example?

I did exactly that at the same moment you posted your comment :)

Thank you very much :)