2014-11-21 38 views
3

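Finding the percentage of profitable investments with pandas groupby

I have a pandas DataFrame like this; it shows the history of stock investments. In the Profit column, 1 means a profit and 0 means a loss.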

Stock Year Profit Count 
AAPL 2012 0  23 
AAPL 2012 1  19 
AAPL 2013 0  20 
AAPL 2013 1  10 
GOOG 2012 0  26 
GOOG 2012 1  20 
GOOG 2013 0  23 
GOOG 2013 1  11 

I need to find the percentage of investments that were profitable:

Stock Year Profit CountPercent 
AAPL 2012 1  45.24 
AAPL 2013 1  33.33 
GOOG 2012 1  43.48 
GOOG 2013 1  32.35 

I tried the approach from this post, but it raises 'TypeError: Join on level between two MultiIndex objects is ambiguous'.

+0

Please provide your code. What does CountPercent mean? – Ffisegydd 2014-11-21 10:16:04

Answers

2

I've loaded your data into a DataFrame named stocks.

import pandas

# Get the count of profitable trades, indexed by stock + year:
count_profitable = stocks[stocks['Profit'] == 1].set_index(['Stock', 'Year'])['Count']
# Get the count of all trades, indexed by stock + year:
count_all = stocks.groupby(['Stock', 'Year'])['Count'].sum()
# Render nice percentages (note: this changes the float display format globally)
pandas.options.display.float_format = '{:.2f}%'.format
(count_profitable / count_all) * 100

This yields:

Stock Year 
AAPL 2012 45.24% 
     2013 33.33% 
GOOG 2012 43.48% 
     2013 32.35% 
Name: Count, dtype: float64 
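
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's table and returns the percentages as plain numbers instead of changing the global display format. The DataFrame construction and the use of round(2) are my own assumptions, not part of the answer above.

```python
import pandas as pd

# Rebuild the table from the question (assumed layout: one row per
# Stock/Year/Profit combination). The name `stocks` follows the answer above.
stocks = pd.DataFrame({
    "Stock": ["AAPL"] * 4 + ["GOOG"] * 4,
    "Year": [2012, 2012, 2013, 2013] * 2,
    "Profit": [0, 1] * 4,
    "Count": [23, 19, 20, 10, 26, 20, 23, 11],
})

# Profitable trades per (Stock, Year)
count_profitable = stocks[stocks["Profit"] == 1].set_index(["Stock", "Year"])["Count"]
# All trades per (Stock, Year)
count_all = stocks.groupby(["Stock", "Year"])["Count"].sum()

# The division aligns on the shared (Stock, Year) index; round instead of
# touching pandas.options so other float output is left alone.
count_percent = (count_profitable / count_all * 100).round(2).rename("CountPercent")
print(count_percent.reset_index())
```

Calling reset_index() at the end turns the MultiIndex back into ordinary Stock and Year columns, which is closer to the table asked for in the question.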
+0

Worked perfectly for me, thanks for your time... – ProgR 2014-11-21 15:23:18

2

You can use pivot_table:

In [38]: result = df.pivot_table(index=['Stock', 'Year'], columns='Profit', values='Count', aggfunc='sum') 

In [39]: result['CountPercent'] = result[1]/(result[0]+result[1]) 

In [41]: result['CountPercent'] 
Out[41]: 
Stock Year 
AAPL 2012 0.452381 
     2013 0.333333 
GOOG 2012 0.434783 
     2013 0.323529 
Name: CountPercent, dtype: float64 
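
The result above is a fraction rather than a percentage. As a rough sketch (the DataFrame construction, the names `wide`, `pct` and `table`, and the final reshaping are my own assumptions), the same pivot can be scaled to percent and flattened into the layout from the question:

```python
import pandas as pd

# Rebuild the table from the question (assumed layout).
df = pd.DataFrame({
    "Stock": ["AAPL"] * 4 + ["GOOG"] * 4,
    "Year": [2012, 2012, 2013, 2013] * 2,
    "Profit": [0, 1] * 4,
    "Count": [23, 19, 20, 10, 26, 20, 23, 11],
})

# One row per (Stock, Year), one column per Profit value (0 and 1).
wide = df.pivot_table(index=["Stock", "Year"], columns="Profit",
                      values="Count", aggfunc="sum")

# Share of profitable trades, as a percentage rounded to two decimals.
pct = (wide[1] / wide.sum(axis=1) * 100).round(2).rename("CountPercent")

# Flatten the index and add the constant Profit column the question shows.
table = pct.reset_index()
table.insert(2, "Profit", 1)
print(table)
```

Using wide.sum(axis=1) for the denominator keeps the expression valid if more outcome columns were ever added, although the question only has 0 and 1.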
1

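Assuming your DataFrame is in a consistent format (i.e. within each Stock/Year group, the row with 0 in the 'Profit' column comes before the row with 1), you can use the following groupby operation: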

>>> grouped = df.groupby(['Stock', 'Year']) 
>>> perc = grouped['Count'].last()/grouped['Count'].sum() 
>>> perc.reset_index() 
    Stock Year  Count 
0 AAPL 2012 0.452381 
1 AAPL 2013 0.333333 
2 GOOG 2012 0.434783 
3 GOOG 2013 0.323529 

This is just an ordinary DataFrame, so it should be straightforward to rename the 'Count' column, round it to two decimal places, and add the 'Profit' column.
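
Those last steps might look something like the sketch below. The DataFrame construction and the explicit sort_values call are my own additions; the sort is only there so the "0 before 1" ordering assumption holds even if the rows arrive shuffled.

```python
import pandas as pd

# Rebuild the table from the question (assumed layout).
df = pd.DataFrame({
    "Stock": ["AAPL"] * 4 + ["GOOG"] * 4,
    "Year": [2012, 2012, 2013, 2013] * 2,
    "Profit": [0, 1] * 4,
    "Count": [23, 19, 20, 10, 26, 20, 23, 11],
})

# Sort so the Profit == 1 row is the last row of every (Stock, Year) group.
ordered = df.sort_values(["Stock", "Year", "Profit"])
grouped = ordered.groupby(["Stock", "Year"])

# last() picks the profitable count, sum() the total for the group.
perc = grouped["Count"].last() / grouped["Count"].sum()

# Rename, convert to a rounded percentage, and add the Profit column.
result = perc.reset_index().rename(columns={"Count": "CountPercent"})
result["CountPercent"] = (result["CountPercent"] * 100).round(2)
result.insert(2, "Profit", 1)
print(result)
```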