2017-10-17 113 views

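Create a new column in a DataFrame using matching values from another DataFrame

I have two DataFrames: one with a small amount of information (df1) and one with all the data (df2). I am trying to add a new column to df1 that looks up the Total2 value in df2 and fills it in based on Name. Note that every name that appears in df1 will always have a match among the names in df2. Is there a function in pandas that already does this? My end goal is to create a bar chart.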

alldatapath = "all_data.csv" 
filteredpath = "filtered.csv" 

import pandas as pd 

df1 = pd.read_csv(
    filteredpath,  # file name 
    sep=',',     # column separator 
    quotechar='"',    # quoting character 
    na_values="NA",    # treat the string "NA" as a missing value (NaN) 
    usecols=[0,1],  # columns to use 
    decimal='.')    # symbol for decimals 

df2 = pd.read_csv(
    alldatapath,  # file name 
    sep=',',     # column separator 
    quotechar='"',    # quoting character 
    na_values="NA",    # treat the string "NA" as a missing value (NaN) 
    usecols=[0,1],  # columns to use 
    decimal='.')    # symbol for decimals 

df1 = df1.head(5) #trim to top 5 

print(df1) 
print(df2) 

Output (df1):

         Name  Total
0  Accounting      3
1   Reporting      1
2     Finance      1
3       Audit      1
4    Template      2

Output (df2):

          Name  Total2
0    Reporting     100
1   Accounting     120
2      Finance     400
3        Audit     500
4  Information      50
5     Template    1200
6      KnowHow    2000

The final output (df1) should look like this:

         Name  Total  Total2 (new column)
0  Accounting      3     120
1   Reporting      1     100
2     Finance      1     400
3       Audit      1     500
4    Template      2    1200

Answer


First build a Series from df2 indexed by Name, then use map to create the new column:

df1['Total2'] = df1['Name'].map(df2.set_index('Name')['Total2']) 
print (df1) 
         Name  Total  Total2
0  Accounting      3     120
1   Reporting      1     100
2     Finance      1     400
3       Audit      1     500
4    Template      2    1200
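
To try the lookup without the CSV files, here is a minimal sketch that rebuilds both frames inline from the values shown in the question. The names `lookup`, `mapped` and `merged` are only illustrative. It also shows a left `merge` as an alternative that should give the same column; as far as I know, `map` needs the lookup index (the names in df2) to be unique, and `validate="many_to_one"` makes the merge raise an error if it is not.

```python
import pandas as pd

# Sample data copied from the question (df1 already trimmed to five rows).
df1 = pd.DataFrame({
    "Name": ["Accounting", "Reporting", "Finance", "Audit", "Template"],
    "Total": [3, 1, 1, 1, 2],
})
df2 = pd.DataFrame({
    "Name": ["Reporting", "Accounting", "Finance", "Audit",
             "Information", "Template", "KnowHow"],
    "Total2": [100, 120, 400, 500, 50, 1200, 2000],
})

# Build a Name -> Total2 lookup Series, then map each df1 name through it.
lookup = df2.set_index("Name")["Total2"]
mapped = df1.assign(Total2=df1["Name"].map(lookup))

# Alternative: a left join on Name keeps every row of df1, in df1's order.
# validate="many_to_one" raises if a name appears more than once in df2.
merged = df1.merge(df2, on="Name", how="left", validate="many_to_one")

print(mapped)
print(merged)
```

If a name in df1 had no match in df2, both approaches would leave NaN in Total2 for that row; the question states that this case does not occur.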

Then use set_index and DataFrame.plot.bar to draw the chart:

df1.set_index('Name').plot.bar() 
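
A rough, self-contained version of the plotting step, assuming matplotlib is installed and that you want to save the chart to a file instead of showing it in a window. The file name `totals.png` and the `Agg` backend are my own choices for the sketch, not something from the question.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import pandas as pd

# df1 as it looks after the Total2 column has been added.
df1 = pd.DataFrame({
    "Name": ["Accounting", "Reporting", "Finance", "Audit", "Template"],
    "Total": [3, 1, 1, 1, 2],
    "Total2": [120, 100, 400, 500, 1200],
})

# Using Name as the index makes the names the x-axis labels;
# each remaining numeric column becomes one group of bars.
ax = df1.set_index("Name").plot.bar()
ax.figure.tight_layout()
ax.figure.savefig("totals.png")  # hypothetical output file name
```

Because Total is much smaller than Total2, its bars will be hard to see on a shared axis; passing `subplots=True` to `plot.bar` draws each column on its own axes if that matters for your chart.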

Thanks! I will look into these functions and apply them to my overall code. – Gonzalo
