pandas groupby and sum

I have a DataFrame as follows:

ref, type, amount 
001, foo, 10 
001, foo, 5 
001, bar, 50 
001, bar, 5 
001, test, 100 
001, test, 90 
002, foo, 20 
002, foo, 35 
002, bar, 75 
002, bar, 80 
002, test, 150 
002, test, 110

This is what I am trying to get:

ref, type, amount, foo, bar, test 
001, foo, 10, 15, 55, 190 
001, foo, 5, 15, 55, 190 
001, bar, 50, 15, 55, 190 
001, bar, 5, 15, 55, 190 
001, test, 100, 15, 55, 190 
001, test, 90, 15, 55, 190 
002, foo, 20, 55, 155, 260 
002, foo, 35, 55, 155, 260 
002, bar, 75, 55, 155, 260 
002, bar, 80, 55, 155, 260 
002, test, 150, 55, 155, 260 
002, test, 110, 55, 155, 260

So far I have this:

df.groupby('ref')['amount'].transform(sum)

But how do I filter it so that the sum above only counts rows where type is foo, bar, or test, one column per type?

Source
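For reference, here is a minimal sketch of what the attempt above computes and of one way to restrict the sum to a single type. It assumes the sample data is rebuilt by hand with ref stored as strings; the names totals, out, and masked are made up for illustration. Masking with Series.where is just one option, not necessarily what the answers below use.

```python
import pandas as pd

# Sample data from the question (ref kept as strings to preserve the leading zeros).
df = pd.DataFrame({
    "ref": ["001"] * 6 + ["002"] * 6,
    "type": ["foo", "foo", "bar", "bar", "test", "test"] * 2,
    "amount": [10, 5, 50, 5, 100, 90, 20, 35, 75, 80, 150, 110],
})

# The original attempt: one total per ref, summed over every type.
totals = df.groupby("ref")["amount"].transform("sum")

# Restricting the sum to one type at a time: zero out the other rows first,
# then broadcast the per-ref sum back onto every row.
out = df.copy()
for t in ["foo", "bar", "test"]:
    masked = out["amount"].where(out["type"] == t, 0)
    out[t] = masked.groupby(out["ref"]).transform("sum")

print(out)
```

The loop needs the type names listed up front, which is fine for three known types but gets awkward when the set of types is not known in advance.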

2016-09-28 Kvothe

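@EdChum Yes, I can filter the DataFrame, but I need three new columns holding the sum for each 'ref' and type. Does that make sense? – Kvothe

So why not group by ref and type? – EdChum

I can group by ref and type, but how would the columns work? I want to add the sum for each type as its own column. – Kvothe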

One solution uses a pivot table:

>>> import numpy as np
>>> import pandas as pd
>>> b = pd.pivot_table(df, values='amount', index=['ref'], columns=['type'], aggfunc=np.sum)
>>> b 
type bar foo test 
ref 
1  55 15 190 
2  155 55 260 

>>> pd.merge(df, b, left_on='ref', right_index=True) 
    ref type amount bar foo test 
0  1 foo  10 55 15 190 
1  1 foo  5 55 15 190 
2  1 bar  50 55 15 190 
3  1 bar  5 55 15 190 
4  1 test  100 55 15 190 
5  1 test  90 55 15 190 
6  2 foo  20 155 55 260 
7  2 foo  35 155 55 260 
8  2 bar  75 155 55 260 
9  2 bar  80 155 55 260 
10 2 test  150 155 55 260 
11 2 test  110 155 55 260
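A self-contained version of the pivot table approach, as a sketch: the DataFrame is rebuilt by hand from the question, and the names sums and result are only illustrative. It passes aggfunc="sum" as a string instead of np.sum, which does the same thing here and avoids needing NumPy.

```python
import pandas as pd

# Sample data from the question (ref kept as strings).
df = pd.DataFrame({
    "ref": ["001"] * 6 + ["002"] * 6,
    "type": ["foo", "foo", "bar", "bar", "test", "test"] * 2,
    "amount": [10, 5, 50, 5, 100, 90, 20, 35, 75, 80, 150, 110],
})

# One row per ref, one column per type, each cell holding the summed amount.
sums = pd.pivot_table(df, values="amount", index="ref", columns="type", aggfunc="sum")

# Attach the per-ref sums to every original row; the pivot table is indexed by ref.
result = df.merge(sums, left_on="ref", right_index=True)

# Reorder the columns to match the layout asked for in the question.
result = result[["ref", "type", "amount", "foo", "bar", "test"]]
print(result)
```

Because the columns come from the data itself, a new type shows up as a new column without any change to the code.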

Source

2016-09-28 14:47:18 3kt

Thanks, @3kt! This works too! – Kvothe

I think you need groupby with unstack, then merge the result with the original DataFrame:

df1 = df.groupby(['ref','type'])['amount'].sum().unstack().reset_index() 
print (df1) 
type ref bar foo test 
0  001 55 15 190 
1  002 155 55 260 

df = pd.merge(df, df1, on='ref') 
print (df) 
    ref type amount sums bar foo test 
0 001 foo  10 15 55 15 190 
1 001 foo  5 15 55 15 190 
2 001 bar  50 55 55 15 190 
3 001 bar  5 55 55 15 190 
4 001 test  100 190 55 15 190 
5 001 test  90 190 55 15 190 
6 002 foo  20 55 155 55 260 
7 002 foo  35 55 155 55 260 
8 002 bar  75 155 155 55 260 
9 002 bar  80 155 155 55 260 
10 002 test  150 260 155 55 260 
11 002 test  110 260 155 55 260
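Note that the printed result above includes a sums column that the two statements shown do not create. Its values match the per (ref, type) totals, so it was presumably added beforehand with a transform; that step is an assumption here, not something the answer shows. A runnable sketch under that assumption, with per_type and result as illustrative names:

```python
import pandas as pd

# Sample data from the question (ref kept as strings).
df = pd.DataFrame({
    "ref": ["001"] * 6 + ["002"] * 6,
    "type": ["foo", "foo", "bar", "bar", "test", "test"] * 2,
    "amount": [10, 5, 50, 5, 100, 90, 20, 35, 75, 80, 150, 110],
})

# Assumed extra step: the total for each row's own (ref, type) group.
df["sums"] = df.groupby(["ref", "type"])["amount"].transform("sum")

# Sum per (ref, type), then pivot the type level into columns.
per_type = df.groupby(["ref", "type"])["amount"].sum().unstack().reset_index()
per_type.columns.name = None  # drop the leftover "type" label on the column axis

# Broadcast the per-ref sums back onto every original row.
result = pd.merge(df, per_type, on="ref")
print(result)
```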

Timings:

In [506]: %timeit (pd.merge(df, df.groupby(['ref','type'])['amount'].sum().unstack().reset_index(), on='ref')) 
100 loops, best of 3: 3.4 ms per loop 

In [507]: %timeit (pd.merge(df, pd.pivot_table(df, values='amount', index=['ref'], columns=['type'], aggfunc=np.sum), left_on='ref', right_index=True)) 
100 loops, best of 3: 4.99 ms per loop
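Outside IPython, a comparable measurement can be sketched with the standard timeit module. The function names via_unstack and via_pivot_table are made up for this example, and the numbers will vary with machine, pandas version, and data size, so none are quoted here.

```python
import timeit

import pandas as pd

# Sample data from the question (ref kept as strings).
df = pd.DataFrame({
    "ref": ["001"] * 6 + ["002"] * 6,
    "type": ["foo", "foo", "bar", "bar", "test", "test"] * 2,
    "amount": [10, 5, 50, 5, 100, 90, 20, 35, 75, 80, 150, 110],
})


def via_unstack():
    # groupby + unstack, then merge on the ref column
    wide = df.groupby(["ref", "type"])["amount"].sum().unstack().reset_index()
    return pd.merge(df, wide, on="ref")


def via_pivot_table():
    # pivot_table, then merge against its ref index
    wide = pd.pivot_table(df, values="amount", index="ref", columns="type", aggfunc="sum")
    return pd.merge(df, wide, left_on="ref", right_index=True)


for fn in (via_unstack, via_pivot_table):
    seconds = timeit.timeit(fn, number=100)
    print(f"{fn.__name__}: {seconds / 100 * 1e3:.2f} ms per call")
```

On a 12-row frame the difference is dominated by fixed overhead, so it is worth repeating the measurement on data of realistic size before choosing between the two.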

Source

2016-09-28 14:43:30 jezrael

This is exactly what I needed. Thank you very much! – Kvothe

Glad I could help! – jezrael
