2017-09-26 100 views
           star_rating       duration
Date       20170829 20170830 20170829 20170830
genre
Action       1038.1   1038.1  15917.0  16598.0
Adventure     595.0    595.0   9386.0  10113.0
Animation     490.7    490.7   5811.0   5989.0
Biography     596.9    596.9   9661.0  10002.0
Comedy       1211.7   1211.7  16616.0  16786.0

In[86]: df2.columns 
Out[86]: 
MultiIndex(levels=[['star_rating', 'duration'], [20170829, 20170830]], 
      labels=[[0, 0, 1, 1], [0, 1, 0, 1]], 
      names=[None, 'Date']) 
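
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds df2 from the values printed above. Building the columns with pd.MultiIndex.from_product is an assumption made for illustration; the post does not show how the original frame was created.

```python
import pandas as pd

# Column MultiIndex matching the df2.columns output shown above.
columns = pd.MultiIndex.from_product(
    [["star_rating", "duration"], [20170829, 20170830]],
    names=[None, "Date"],
)
genres = pd.Index(
    ["Action", "Adventure", "Animation", "Biography", "Comedy"], name="genre"
)
# Values copied from the printed table, in column order
# (star_rating 20170829, star_rating 20170830, duration 20170829, duration 20170830).
data = [
    [1038.1, 1038.1, 15917.0, 16598.0],
    [595.0, 595.0, 9386.0, 10113.0],
    [490.7, 490.7, 5811.0, 5989.0],
    [596.9, 596.9, 9661.0, 10002.0],
    [1211.7, 1211.7, 16616.0, 16786.0],
]
df2 = pd.DataFrame(data, index=genres, columns=columns)
print(df2)
```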

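Hi all, I have the DataFrame df2 shown above, and I want to insert a Diff column under each top-level column, computed as a simple subtraction: 20170830 - 20170829. In other words, I want to calculate a column in a pandas MultiIndex DataFrame. The desired result looks like this: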

           star_rating                duration
Date       20170829 20170830     Diff 20170829 20170830     Diff
genre
Action       1038.1   1038.1        0    15917    16598      681
Adventure       595      595        0     9386    10113      727
Animation     490.7    490.7        0     5811     5989      178
Biography     596.9    596.9        0     9661    10002      341
Comedy       1211.7   1211.7        0    16616    16786      170

If Date were the top level of the columns, I could simply use df2['diff'] = df2[20170830] - df2[20170829].

I'm new to MultiIndex and would appreciate any ideas to get me started. Thanks in advance.


Check this: https://stackoverflow.com/questions/43238183/python-pandas-add-subtotal-on-each-lvl-of-multiindex-dataframe – Wen

Answer


Let's try:

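import pandas as pd

# df is the DataFrame from the question (named df2 there).
# Diff consecutive dates within each top-level group, then drop each group's all-NaN first column.
df1 = df.T.groupby(level=0).diff().T.dropna(axis=1)
# Relabel the remaining date column of every group as 'diff'.
df1.columns = pd.MultiIndex.from_arrays(
    [df1.columns.get_level_values(0), ['diff'] * df1.shape[1]], names=df1.columns.names)
# Cast the date labels to str so they sort alongside 'diff'.
df.columns = df.columns.set_levels(df.columns.levels[1].astype(str), level=1)
# Join the original columns with the diffs and sort the column index.
df_out = pd.concat([df, df1], axis=1).sort_index(axis=1)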

Output:

           duration                   star_rating
Date       20170829 20170830     diff 20170829 20170830     diff
genre
Action      15917.0  16598.0    681.0   1038.1   1038.1      0.0
Adventure    9386.0  10113.0    727.0    595.0    595.0      0.0
Animation    5811.0   5989.0    178.0    490.7    490.7      0.0
Biography    9661.0  10002.0    341.0    596.9    596.9      0.0
Comedy      16616.0  16786.0    170.0   1211.7   1211.7      0.0
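
If the original column order (star_rating before duration) and the integer date labels matter, a different route may be worth considering. The sketch below is an alternative, not the method used in the answer: it takes each top-level block on its own, adds a Diff column to it, and glues the blocks back together with pd.concat. The frame is rebuilt from the values in the question, and the names parts and out as well as the 'Diff' label are illustrative choices.

```python
import pandas as pd

# Rebuild the question's frame (values copied from the post).
columns = pd.MultiIndex.from_product(
    [["star_rating", "duration"], [20170829, 20170830]], names=[None, "Date"]
)
genres = pd.Index(
    ["Action", "Adventure", "Animation", "Biography", "Comedy"], name="genre"
)
df = pd.DataFrame(
    [
        [1038.1, 1038.1, 15917.0, 16598.0],
        [595.0, 595.0, 9386.0, 10113.0],
        [490.7, 490.7, 5811.0, 5989.0],
        [596.9, 596.9, 9661.0, 10002.0],
        [1211.7, 1211.7, 16616.0, 16786.0],
    ],
    index=genres,
    columns=columns,
)

parts = {}
# Walk the top-level labels in their original order of appearance.
for top in df.columns.get_level_values(0).unique():
    block = df[top].copy()  # single-level columns: 20170829, 20170830
    block["Diff"] = block[20170830] - block[20170829]
    parts[top] = block

# The dict keys become the top column level again.
out = pd.concat(parts, axis=1)
out.columns = out.columns.set_names(df.columns.names)
print(out)
```

Because nothing is sorted here, the date labels can stay integers next to the string 'Diff'; the dates themselves are hard-coded, so this would need adjusting for more than two dates.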