Resetting the index of a pandas DataFrame created from groupby or pivot?

I have data containing prices, volumes, and other information about various financial securities. My input data looks like this:

import numpy as np 
import pandas 

prices = np.random.rand(15) * 100 
volumes = np.random.randint(15, size=15) * 10 
idx = pandas.Series([2007, 2007, 2007, 2007, 2007, 2008, 
        2008, 2008, 2008, 2008, 2009, 2009, 
        2009, 2009, 2009], name='year') 
# DataFrame.from_items was removed in pandas 1.0; a dict keeps the same column order
df = pandas.DataFrame({'price': prices, 'volume': volumes})
df.index = idx 

# BELOW IS AN EXAMPLE OF WHAT INPUT MIGHT LOOK LIKE
# IT WON'T BE EXACT BECAUSE OF THE USE OF RANDOM 
#   price volume 
# year 
# 2007 0.121002  30 
# 2007 15.256424  70 
# 2007 44.479590  50 
# 2007 29.096013  0 
# 2007 21.424690  0 
# 2008 23.019548  40 
# 2008 90.011295  0 
# 2008 88.487664  30 
# 2008 51.609119  70 
# 2008 4.265726  80 
# 2009 34.402065  140 
# 2009 10.259064  100 
# 2009 47.024574  110 
# 2009 57.614977  140 
# 2009 54.718016  50

I would like to produce a DataFrame that looks like this:

year  2007  2008  2009 
0  0.121002 23.019548 34.402065 
1  15.256424 90.011295 10.259064 
2  44.479590 88.487664 47.024574 
3  29.096013 51.609119 57.614977 
4  21.424690 4.265726 54.718016

One way I know to produce the output above is with groupby:

df = df.reset_index() 
grouper = df.groupby('year') 
df2 = None 
for group, data in grouper: 
    series = data['price'].copy() 
    series.index = range(len(series)) 
    series.name = group 
    df2 = pandas.DataFrame(series) if df2 is None else pandas.concat([df2, series], axis=1)

I also know that you can pivot, which gives a DataFrame with NaN wherever a row index has no value for a given year:

# df = df.reset_index() 
df.pivot(columns='year', values='price') 

# Output 
# year  2007  2008  2009 
# 0  0.121002  NaN  NaN 
# 1  15.256424  NaN  NaN 
# 2  44.479590  NaN  NaN 
# 3  29.096013  NaN  NaN 
# 4  21.424690  NaN  NaN 
# 5   NaN 23.019548  NaN 
# 6   NaN 90.011295  NaN 
# 7   NaN 88.487664  NaN 
# 8   NaN 51.609119  NaN 
# 9   NaN 4.265726  NaN 
# 10   NaN  NaN 34.402065 
# 11   NaN  NaN 10.259064 
# 12   NaN  NaN 47.024574 
# 13   NaN  NaN 57.614977 
# 14   NaN  NaN 54.718016

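My question is:

Is there a way to build my output DataFrame inside the groupby without creating a Series for each group, or is there a way to re-index my input DataFrame so that pivot gives the desired output?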

Source

2017-01-10 aquil.abdullah

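You need to label the rows within each year 0-4. To do that, use cumcount after grouping. You can then pivot correctly using that new column as the index.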

df['year_count'] = df.groupby(level='year').cumcount() 
df.reset_index().pivot(index='year_count', columns='year', values='price') 

year    2007  2008  2009 
year_count         
0   61.682275 32.729113 54.859700 
1   44.231296 4.453897 45.325802 
2   65.850231 82.023960 28.325119 
3   29.098607 86.046499 71.329594 
4   67.864723 43.499762 19.255214
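As a self-contained check of the cumcount approach above, here is a minimal sketch. The small fixed DataFrame is an assumption made up for illustration (the question uses random data), and the name wide is hypothetical.

```python
import pandas

# Made-up stand-in for the question's random data: two rows per year.
df = pandas.DataFrame(
    {'price': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
     'volume': [10, 20, 30, 40, 50, 60]},
    index=pandas.Index([2007, 2007, 2008, 2008, 2009, 2009], name='year'),
)

# cumcount numbers the rows inside each year 0, 1, 2, ...
df['year_count'] = df.groupby(level='year').cumcount()

# Pivot on the within-year counter so each year's rows line up side by side.
wide = df.reset_index().pivot(index='year_count', columns='year', values='price')
print(wide)
```

Because every (year_count, year) pair occurs exactly once, pivot has a unique cell for each price and no NaN padding appears for groups of equal size.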

Source

2017-01-10 04:44:06

You can use groupby with apply to build a new Series from each group's values (a numpy array), then reshape with unstack:

print(df.groupby(level='year')['price'].apply(lambda x: pandas.Series(x.values)).unstack(0))
year  2007  2008  2009 
0  55.360804 68.671626 78.809139 
1  50.246485 55.639250 84.483814 
2  17.646684 14.386347 87.185550 
3  54.824732 91.846018 60.793002 
4  24.303751 50.908714 22.084445
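The same apply and unstack idea can be tried on a small fixed input. This is only a sketch: the data below is made up for illustration, and the names stacked and wide are hypothetical.

```python
import pandas

# Made-up stand-in for the question's random data: two rows per year.
df = pandas.DataFrame(
    {'price': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]},
    index=pandas.Index([2007, 2007, 2008, 2008, 2009, 2009], name='year'),
)

# Wrapping x.values in a new Series discards the repeated 'year' labels,
# so each group gets a fresh 0..n-1 index.
stacked = df.groupby(level='year')['price'].apply(lambda x: pandas.Series(x.values))

# stacked is indexed by (year, position); unstack(0) moves year to the columns.
wide = stacked.unstack(0)
print(wide)
```

Compared with the cumcount route, this avoids adding a helper column, at the cost of calling a Python lambda once per group.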

Source

2017-01-10 08:25:12 jezrael
