pandas, groupby with a function

I have a simple pandas DataFrame named purchase_cat_df with the following columns:

   email    cat 
0 [email protected] Mobiles & Tablets 
1 [email protected] Mobiles & Tablets 
2 [email protected] Mobiles & Tablets 
3 [email protected] Mobiles & Tablets 
4 [email protected]  Home & Living 
5 [email protected]  Home & Living

I group by 'email' and put 'cat' into a list like this:

test = purchase_cat_df.groupby('email').apply(lambda x: list(x.cat))

But then my DataFrame test is:

email 
[email protected] [Mobiles & Tablets, Mobiles & Tablets, Home & ... 
[email protected]         [Mobiles & Tablets] 
[email protected]     [Mobiles & Tablets, Home & Living]

I lost the index and the column name. How can I name the second column?

2014-09-23 woshitom

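I think you are getting a Series, not a DataFrame. – BrenBarn 2014-09-23 18:45:59

I'm not sure what your goal is, but I'd suggest calling purchase_cat_df = purchase_cat_df.set_index('email'); you can then get your list by calling purchase_cat_df.loc[emailX, 'cat'].tolist(). The same call without tolist() returns a Series. – ZJS 2014-09-23 19:07:28

Obviously, the index no longer makes sense, because each output row is produced from multiple input rows with different indices. – mdurant 2014-09-23 19:49:08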

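As @BrenBarn mentions in the comments, the column of lists has no name because you have a Series, not a DataFrame.

Try this instead:

test = purchase_cat_df.groupby('email').agg({'cat': list})

It returns a DataFrame with email set as the index and cat as the name of the new column.

You can also use it when you want to aggregate multiple columns. See the documentation for a few examples.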
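To make the Series versus DataFrame difference concrete, here is a minimal sketch. The e-mail addresses are placeholders (the real ones are hidden in the question), and agg with a dict is used as the way to get a named column; treat it as an illustration rather than the only way to do this.

```python
import pandas as pd

# Placeholder data shaped like the question's frame; the addresses are invented.
purchase_cat_df = pd.DataFrame({
    "email": ["a@example.com", "a@example.com", "b@example.com",
              "c@example.com", "a@example.com", "c@example.com"],
    "cat": ["Mobiles & Tablets", "Mobiles & Tablets", "Mobiles & Tablets",
            "Mobiles & Tablets", "Home & Living", "Home & Living"],
})

# Applying a function to the selected column gives a Series of lists.
as_series = purchase_cat_df.groupby("email")["cat"].apply(list)

# Aggregating with a dict gives a DataFrame whose column is named "cat".
test = purchase_cat_df.groupby("email").agg({"cat": list})

print(type(as_series).__name__)  # Series
print(type(test).__name__)       # DataFrame
print(test)
```

The lists can then be reached by column name, for example test.loc['a@example.com', 'cat'].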
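As a rough illustration of aggregating several columns at once, the sketch below adds a hypothetical price column that is not part of the original question, and again uses placeholder e-mail addresses. Each key of the dict passed to agg names a column and each value names the aggregation for it.

```python
import pandas as pd

# Placeholder data; "price" is a hypothetical extra column for illustration.
orders = pd.DataFrame({
    "email": ["a@example.com", "a@example.com", "b@example.com"],
    "cat": ["Mobiles & Tablets", "Home & Living", "Mobiles & Tablets"],
    "price": [300.0, 40.0, 250.0],
})

# Collect categories into a list and sum the prices, per e-mail address.
summary = orders.groupby("email").agg({"cat": list, "price": "sum"})

print(summary)
```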

2015-02-03 16:20:41 LondonRob

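If you do not want email to become the index of the result, you may be looking for something like this:

purchase_cat_df.groupby('email', as_index=False)

as_index=False keeps the grouping key (email) as a regular column instead of turning it into the index of the aggregated result. You can then keep addressing columns by name.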
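A short sketch of what this looks like once an aggregation is applied, assuming placeholder e-mail addresses and combining as_index=False with a dict aggregation (the combination is an assumption, the answer itself only shows the groupby call):

```python
import pandas as pd

# Placeholder data; the addresses are invented for the example.
purchase_cat_df = pd.DataFrame({
    "email": ["a@example.com", "b@example.com", "a@example.com"],
    "cat": ["Mobiles & Tablets", "Mobiles & Tablets", "Home & Living"],
})

# With as_index=False the group key stays a normal column and the
# result gets a fresh default integer index.
grouped = purchase_cat_df.groupby("email", as_index=False).agg({"cat": list})

print(grouped.columns.tolist())  # ['email', 'cat']
print(grouped)
```

Note that the row index of the input is not carried over; each output row summarises several input rows, so only the group key survives, here as the email column.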

2017-05-24 09:46:27 Axel
