2017-07-06 95 views

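I can't find an efficient way to add a single row to a MultiIndexed DataFrame. Adding the row flattens the MultiIndex into a plain index of tuples. Oddly, this is not a problem for MultiIndexed columns. How can I add a row to a pandas DataFrame without flattening the MultiIndex?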

System information:

Python 3.6.1 |Continuum Analytics, Inc.| (default, Mar 22 2017, 19:25:17) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin 
Type "help", "copyright", "credits" or "license" for more information. 
>>> import pandas as pd 
>>> pd.__version__ 
'0.19.2' 

Sample data: a DataFrame with a MultiIndex on both rows and columns

import numpy as np 
import pandas as pd 

index = pd.MultiIndex(levels=[['bar', 'foo'], ['one', 'two']], 
         labels=[[0, 0, 1, 1], [0, 1, 0, 1]], 
         names=['row_0', 'row_1']) 
columns = pd.MultiIndex(levels=[['dull', 'shiny'], ['a', 'b']], 
         labels=[[0, 0, 1, 1], [0, 1, 0, 1]], 
         names=['col_0', 'col_1']) 
df = pd.DataFrame(np.ones((4,4)),columns=columns, index=index) 

print(df) 

col_0        dull     shiny
col_1           a    b    a    b
row_0 row_1
bar   one     1.0  1.0  1.0  1.0
      two     1.0  1.0  1.0  1.0
foo   one     1.0  1.0  1.0  1.0
      two     1.0  1.0  1.0  1.0

Adding an extra column to the DataFrame is no problem:

df['last_col'] = 42 #define a new column and assign a value 

print(df) 

col_0        dull     shiny       last_col
col_1           a    b    a    b
row_0 row_1
bar   one     1.0  1.0  1.0  1.0        42
      two     1.0  1.0  1.0  1.0        42
foo   one     1.0  1.0  1.0  1.0        42
      two     1.0  1.0  1.0  1.0        42

However, if I do the same thing to add a row (using loc), the MultiIndex is flattened into a plain index of tuples:

df.loc['last_row'] = 43 #define a new row and assign a value 

print(df) 

col_0       dull       shiny        last_col
col_1          a     b     a     b
(bar, one)   1.0   1.0   1.0   1.0        42
(bar, two)   1.0   1.0   1.0   1.0        42
(foo, one)   1.0   1.0   1.0   1.0        42
(foo, two)   1.0   1.0   1.0   1.0        42
last_row    43.0  43.0  43.0  43.0        43

Does anyone know a simple and efficient way to add a row without flattening the index? Many thanks!!


Opened an issue: https://github.com/pandas-dev/pandas/issues/17024

Answer


I think you need a tuple that supplies a value for both levels of the MultiIndex:

df.loc[('last_row', 'a'), :] = 43 
print(df) 
col_0            dull       shiny
col_1               a     b     a     b
row_0    row_1
bar      one      1.0   1.0   1.0   1.0
         two      1.0   1.0   1.0   1.0
foo      one      1.0   1.0   1.0   1.0
         two      1.0   1.0   1.0   1.0
last_row a       43.0  43.0  43.0  43.0
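
As a self-contained check of the tuple approach, the sketch below rebuilds the example frame and confirms that the row index is still a MultiIndex after the assignment. It builds the indexes with pd.MultiIndex.from_product instead of the labels= constructor used in the question, because that keyword was renamed to codes= in later pandas releases; this substitution is an assumption made for portability, and the resulting frame should be equivalent.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question. from_product is used here
# (an assumption for portability) instead of the levels/labels constructor.
index = pd.MultiIndex.from_product([['bar', 'foo'], ['one', 'two']],
                                   names=['row_0', 'row_1'])
columns = pd.MultiIndex.from_product([['dull', 'shiny'], ['a', 'b']],
                                     names=['col_0', 'col_1'])
df = pd.DataFrame(np.ones((4, 4)), index=index, columns=columns)

# Pass one label per index level as a tuple, plus ":" for all columns.
df.loc[('last_row', 'a'), :] = 43

# Inspect the index type and number of levels after the enlargement.
print(type(df.index).__name__)
print(df.index.nlevels)
```

The key point is that the row key has exactly as many entries as the index has levels, so pandas does not have to guess how to place the new label.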

It works similarly for columns:

df[('last_col', 'a')] = 43 
print(df) 
col_0        dull     shiny       last_col
col_1           a    b    a    b         a
row_0 row_1
bar   one     1.0  1.0  1.0  1.0        43
      two     1.0  1.0  1.0  1.0        43
foo   one     1.0  1.0  1.0  1.0        43
      two     1.0  1.0  1.0  1.0        43
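
The column case can be checked the same way. This is a minimal sketch on a freshly built copy of the example frame (again using from_product as a portability assumption); it only verifies that the columns remain a two-level MultiIndex and that the new key is present.

```python
import numpy as np
import pandas as pd

# Fresh copy of the example frame from the question.
index = pd.MultiIndex.from_product([['bar', 'foo'], ['one', 'two']],
                                   names=['row_0', 'row_1'])
columns = pd.MultiIndex.from_product([['dull', 'shiny'], ['a', 'b']],
                                     names=['col_0', 'col_1'])
df = pd.DataFrame(np.ones((4, 4)), index=index, columns=columns)

# A full tuple names both column levels explicitly.
df[('last_col', 'a')] = 43

# The new column is appended as the last entry of the column MultiIndex.
print(df.columns.nlevels)
print(df.columns.tolist()[-1])
```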

Edit:

It seems you need to specify the columns as well; to assign to all of them, use a slice (:):

df.loc['last_row',:] = 43 
print(df) 
col_0            dull       shiny
col_1               a     b     a     b
row_0    row_1
bar      one      1.0   1.0   1.0   1.0
         two      1.0   1.0   1.0   1.0
foo      one      1.0   1.0   1.0   1.0
         two      1.0   1.0   1.0   1.0
last_row         43.0  43.0  43.0  43.0

The level that was not specified is filled with an empty string:

print(df.index) 
MultiIndex(levels=[['bar', 'foo', 'last_row'], ['one', 'two', '']], 
      labels=[[0, 0, 1, 1, 2], [0, 1, 0, 1, 2]], 
      names=['row_0', 'row_1']) 
# On a fresh copy of df: assign only to the 'dull' columns; the rest of the new row stays NaN
df.loc['last_row','dull'] = 43
print(df) 
col_0            dull       shiny
col_1               a     b     a     b
row_0    row_1
bar      one      1.0   1.0   1.0   1.0
         two      1.0   1.0   1.0   1.0
foo      one      1.0   1.0   1.0   1.0
         two      1.0   1.0   1.0   1.0
last_row         43.0  43.0   NaN   NaN
# On a fresh copy of df: assign only to the single column ('dull', 'a')
df.loc['last_row', ('dull', 'a')] = 43
print(df) 
col_0            dull       shiny
col_1               a     b     a     b
row_0    row_1
bar      one      1.0   1.0   1.0   1.0
         two      1.0   1.0   1.0   1.0
foo      one      1.0   1.0   1.0   1.0
         two      1.0   1.0   1.0   1.0
last_row         43.0   NaN   NaN   NaN
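
An alternative that is not part of the answer above: build the new row as its own one-row DataFrame with a matching MultiIndex and append it with pd.concat. This is only a sketch under the assumption that you know the full index key for the new row; the names new_index, new_row and result are illustrative.

```python
import numpy as np
import pandas as pd

# Fresh copy of the example frame from the question.
index = pd.MultiIndex.from_product([['bar', 'foo'], ['one', 'two']],
                                   names=['row_0', 'row_1'])
columns = pd.MultiIndex.from_product([['dull', 'shiny'], ['a', 'b']],
                                     names=['col_0', 'col_1'])
df = pd.DataFrame(np.ones((4, 4)), index=index, columns=columns)

# One-row frame whose index has the same levels and names as df.index.
new_index = pd.MultiIndex.from_tuples([('last_row', 'a')],
                                      names=df.index.names)
new_row = pd.DataFrame(43.0, index=new_index, columns=df.columns)

# concat returns a new frame; df itself is left unchanged.
result = pd.concat([df, new_row])

print(result.index.nlevels)
print(result.shape)
```

Because pd.concat copies the data each time, it is probably better to collect several new rows into one frame and concatenate once rather than calling it in a loop.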

Hi jezrael, cool, that looks good. Many thanks!!