2016-08-04

Pandas: divide DataFrame values

I am trying to divide all columns in a DataFrame by its index (1221 rows, 1000 columns).

  5000058004097 5000058022936 5000058036940 5000058036827 \ 

91.0  3.667246e+10 3.731947e+12 2.792220e+14 2.691262e+13 
94.0  9.869027e+10 1.004314e+13 7.514220e+14 7.242529e+13 
96.0  2.536914e+11 2.581673e+13 1.931592e+15 1.861752e+14 
... 

Here is the code I tried:

A = SHIGH.divide(SHIGH.index, axis =1) 

and I get this error:

ValueError: operands could not be broadcast together with shapes (1221,1000) (1221,) 

I have also tried

A = SHIGH.divide(SHIGH.index.values.tolist(), axis =1) 

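and I also tried resetting the index and dividing by the resulting column, which gave the same error.

If someone could point out my mistake, I would greatly appreciate it.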

Answers


You need to convert the Index object to a Series and divide along axis=0:

df.div(df.index.to_series(), axis=0) 

Example:

In [118]: 
df = pd.DataFrame(np.random.randn(5,3)) 
df 

Out[118]: 
      0   1   2 
0 0.828540 -0.574005 -0.535122 
1 -0.126242 2.152599 -1.356933 
2 0.289270 -0.663178 -0.374691 
3 -0.016866 -0.760110 -1.696402 
4 0.130580 -1.043561 0.789491 

In [124]: 
df.div(df.index.to_series(), axis=0) 

Out[124]: 
      0   1   2 
0  inf  -inf  -inf 
1 -0.126242 2.152599 -1.356933 
2 0.144635 -0.331589 -0.187345 
3 -0.005622 -0.253370 -0.565467 
4 0.032645 -0.260890 0.197373 
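
To make the role of axis=0 concrete, here is a minimal sketch on a hypothetical 3x2 frame with a float index like the one in the question (the column names and values are invented for illustration). It shows that dividing by df.index.to_series() along axis=0 amounts to dividing each row by its own label, which in plain NumPy terms means turning the index into a column vector before broadcasting.

```python
import numpy as np
import pandas as pd

# Hypothetical small frame with a float index, mimicking the question's layout.
df = pd.DataFrame(
    {"a": [91.0, 188.0, 288.0], "b": [182.0, 94.0, 96.0]},
    index=[91.0, 94.0, 96.0],
)

# axis=0 matches the divisor against the row labels, so each row is
# divided by its own index value.
by_series = df.div(df.index.to_series(), axis=0)

# The same arithmetic as plain NumPy broadcasting: the index becomes a
# column vector of shape (3, 1) that stretches across the columns.
by_numpy = df.values / df.index.values[:, None]

print(by_series)
print(np.allclose(by_series.values, by_numpy))
```

With axis=1 the divisor is matched against the column labels instead, which is why a divisor of length 1221 cannot be lined up with 1000 columns.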

You need to convert the index with to_series and then divide with div (divide is the same method under a longer name):

print (SHIGH.divide(SHIGH.index.to_series(), axis = 0)) 
     5000058004097 5000058022936 5000058036940 5000058036827 
91.0 4.029941e+08 4.101041e+10 3.068374e+12 2.957431e+11 
94.0 1.049896e+09 1.068419e+11 7.993851e+12 7.704818e+11 
96.0 2.642619e+09 2.689243e+11 2.012075e+13 1.939325e+12 

The timings of the two solutions are practically the same:

SHIGH = pd.DataFrame({'5000058022936': {96.0: 25816730000000.0, 91.0: 3731947000000.0, 94.0: 10043140000000.0}, 
       '5000058036940': {96.0: 1931592000000000.0, 91.0: 279222000000000.0, 94.0: 751422000000000.0}, 
       '5000058036827': {96.0: 186175200000000.0, 91.0: 26912620000000.0, 94.0: 72425290000000.0}, 
       '5000058004097': {96.0: 253691400000.0, 91.0: 36672460000.0, 94.0: 98690270000.0}}) 


print (SHIGH) 
     5000058004097 5000058022936 5000058036827 5000058036940 
91.0 3.667246e+10 3.731947e+12 2.691262e+13 2.792220e+14 
94.0 9.869027e+10 1.004314e+13 7.242529e+13 7.514220e+14 
96.0 2.536914e+11 2.581673e+13 1.861752e+14 1.931592e+15 

#[1200 rows x 1000 columns] in sample DataFrame 
SHIGH = pd.concat([SHIGH]*400).reset_index(drop=True) 
SHIGH = pd.concat([SHIGH]*250, axis=1) 

In [212]: %timeit (SHIGH.divide(SHIGH.index.values, axis = 0)) 
100 loops, best of 3: 14.8 ms per loop 

In [213]: %timeit (SHIGH.divide(SHIGH.index.to_series(), axis = 0)) 
100 loops, best of 3: 14.9 ms per loop 
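
Outside IPython, a similar comparison can be run with the standard timeit module. The sketch below is an assumption-laden stand-in rather than a reproduction of the measurement above: it uses random data of the same 1200 x 1000 shape and a hypothetical index starting at 1 so that no row is divided by zero. Absolute numbers will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical stand-in data: 1200 x 1000 random values, non-zero integer index.
rng = np.random.default_rng(0)
frame = pd.DataFrame(rng.random((1200, 1000)), index=np.arange(1, 1201))


def with_values():
    # Divisor passed as a plain NumPy array (matched by position).
    return frame.div(frame.index.values, axis=0)


def with_series():
    # Divisor passed as a Series (matched by index label).
    return frame.div(frame.index.to_series(), axis=0)


for fn in (with_values, with_series):
    seconds = timeit.timeit(fn, number=20) / 20
    print(f"{fn.__name__}: {seconds * 1e3:.2f} ms per call")

# Both variants should produce the same numbers.
print(np.allclose(with_values().values, with_series().values))
```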

Another way to do this is

df.div(df.index.values, axis=0) 

Example:

In [7]: df = pd.DataFrame({'a': range(5), 'b': range(1, 6), 'c': range(2, 7)}).set_index('a') 

In [8]: df.divide(df.index.values, axis=0) 
Out[8]: 
      b   c 
a      
0  inf  inf 
1 2.000000 3.000000 
2 1.500000 2.000000 
3 1.333333 1.666667 
4 1.250000 1.500000
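
The inf row in the output above comes from the index label 0. If such rows should be treated as missing rather than infinite, one possible follow-up (not part of the original answers, just a sketch reusing the same example frame) is to replace the infinities with NaN after dividing:

```python
import numpy as np
import pandas as pd

# Same example frame as above: column 'a' becomes the index 0..4.
df = pd.DataFrame({'a': range(5), 'b': range(1, 6), 'c': range(2, 7)}).set_index('a')

result = df.div(df.index.values, axis=0)

# Division by the zero label yields inf; swap those entries for NaN.
cleaned = result.replace([np.inf, -np.inf], np.nan)

print(cleaned)
```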