2017-01-16 115 views

Pandas aggregation

I'm new to pandas and have been using it for a class, but I'm certainly not fluent in Panda-ese yet.

Say I have a DataFrame such as:

import pandas as pd

accord = pd.Series({'Manufacturer': 'Honda',
                    'Model': 'Accord',
                    'Drivetrain': 'FWD'})

civic = pd.Series({'Manufacturer': 'Honda',
                   'Model': 'Civic',
                   'Drivetrain': 'FWD'})

focus = pd.Series({'Manufacturer': 'Ford',
                   'Model': 'Focus',
                   'Drivetrain': 'FWD'})

mustang = pd.Series({'Manufacturer': 'Ford',
                     'Model': 'Mustang',
                     'Drivetrain': 'RWD'})

# Each Series becomes one row of the DataFrame.
cars_df = pd.DataFrame([accord, civic, focus, mustang])

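What I ultimately want is a list that gives, for each manufacturer, the total number of models and how many front-wheel-drive vehicles they make.

So I pull the unique values out of one Series and build a new DataFrame: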

manufacturer_s = cars_df['Manufacturer'].unique() 
manufacturer_df = pd.DataFrame(index=manufacturer_s) 

I add empty columns for the information I'm after:

manufacturer_df['FWD MODEL COUNT'] = 0 
manufacturer_df['MODEL COUNT'] = 0 

And I use "iterrows" to populate the data like this:

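for manufacturer, row in manufacturer_df.iterrows():
    # Write through .loc: assigning to `row` is not guaranteed to update manufacturer_df.
    manufacturer_df.loc[manufacturer, 'MODEL COUNT'] = len(
        cars_df[cars_df['Manufacturer'] == manufacturer])
    manufacturer_df.loc[manufacturer, 'FWD MODEL COUNT'] = len(
        cars_df[(cars_df['Manufacturer'] == manufacturer) &
                (cars_df['Drivetrain'] == 'FWD')])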

Now, my output looks like this:

       FWD MODEL COUNT  MODEL COUNT
Honda                2            2
Ford                 1            2

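(Edit: I found a typo, so this part works.) Now, this is not only verbose (and probably slow), it also doesn't feel "pandas-like".

I also tried the following: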

manufacturer_df['MODEL COUNT'] = manufacturer_df.apply(lambda car: 
     len(cars_df[cars_df['Manufacturer'] == car.index]), axis=1) 
manufacturer_df['FWD MODEL COUNT'] = manufacturer_df.apply(lambda car: 
    len(cars_df[(cars_df['Manufacturer'] == car.index) & 
       (cars_df['Drivetrain'] == 'FWD')]), axis=1) 

This doesn't work at all... So how should I do this, and (also) what am I doing wrong?
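On the "what am I doing wrong" part: with axis=1, apply passes each row as a Series whose index holds the column labels, while the row label (here the manufacturer) is available as the Series' name. The following is a minimal sketch of the apply attempt with car.index swapped for car.name. It rebuilds cars_df from a dict for brevity and assumes manufacturer_df already has its two zero-filled columns, as in the question.

```python
import pandas as pd

# Same data as in the question, built from a dict for brevity.
cars_df = pd.DataFrame({
    "Manufacturer": ["Honda", "Honda", "Ford", "Ford"],
    "Model": ["Accord", "Civic", "Focus", "Mustang"],
    "Drivetrain": ["FWD", "FWD", "FWD", "RWD"],
})

# One row per manufacturer, with both count columns initialised to 0.
manufacturer_df = pd.DataFrame(
    0,
    index=cars_df["Manufacturer"].unique(),
    columns=["FWD MODEL COUNT", "MODEL COUNT"],
)

# car.name is the row label (the manufacturer);
# car.index would be the column labels of manufacturer_df.
manufacturer_df["MODEL COUNT"] = manufacturer_df.apply(
    lambda car: len(cars_df[cars_df["Manufacturer"] == car.name]),
    axis=1,
)
manufacturer_df["FWD MODEL COUNT"] = manufacturer_df.apply(
    lambda car: len(cars_df[(cars_df["Manufacturer"] == car.name)
                            & (cars_df["Drivetrain"] == "FWD")]),
    axis=1,
)

print(manufacturer_df)
```

This still filters cars_df once per manufacturer, so it mainly explains the failure and is not the recommended approach.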

You probably want *sum* rather than *len*. Think about it.
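One way to read the comment above, as a sketch: a boolean mask sums to the number of True values, so the FWD count can be taken per manufacturer without filtering the frame and calling len. The names is_fwd and fwd_counts are illustrative and do not appear in the original thread.

```python
import pandas as pd

# Same data as in the question, built from a dict for brevity.
cars_df = pd.DataFrame({
    "Manufacturer": ["Honda", "Honda", "Ford", "Ford"],
    "Model": ["Accord", "Civic", "Focus", "Mustang"],
    "Drivetrain": ["FWD", "FWD", "FWD", "RWD"],
})

# True where the car is front-wheel drive.
is_fwd = cars_df["Drivetrain"] == "FWD"

# Summing booleans counts the True values; group them by manufacturer.
fwd_counts = is_fwd.groupby(cars_df["Manufacturer"]).sum()

print(fwd_counts)
```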

Answer

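You can use groupby().agg(), which lets you apply a different aggregation function to each column. Count the unique models per manufacturer with pd.Series.nunique, and count the number of True values in x == "FWD" within each group to get the total number of FWD vehicles: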

(cars_df.groupby("Manufacturer")
        .agg({"Model": "nunique",
              "Drivetrain": lambda x: (x == "FWD").sum()}))

#               Model  Drivetrain
# Manufacturer
# Ford              2           1
# Honda             2           2
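If the result should carry the column names used in the question, one option is named aggregation, which newer pandas versions (0.25 and later) support in groupby().agg(). This is a sketch of that variant, not part of the original answer. The variable name summary is illustrative, and the dict is unpacked with ** only because the column names contain spaces.

```python
import pandas as pd

# Same data as in the question, built from a dict for brevity.
cars_df = pd.DataFrame({
    "Manufacturer": ["Honda", "Honda", "Ford", "Ford"],
    "Model": ["Accord", "Civic", "Focus", "Mustang"],
    "Drivetrain": ["FWD", "FWD", "FWD", "RWD"],
})

# Named aggregation: output column name -> (input column, aggregation).
summary = cars_df.groupby("Manufacturer").agg(**{
    "MODEL COUNT": ("Model", "nunique"),
    "FWD MODEL COUNT": ("Drivetrain", lambda x: (x == "FWD").sum()),
})

print(summary)
```

The result is indexed by manufacturer, like manufacturer_df in the question, but is built in a single pass without an explicit loop.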