It seems you need reset_index and pivot:
df = df.reset_index().pivot(index='index', columns='Q', values='A')
print(df)
Q         a     b     c     d
index
A         h     i     j  None
B         l     m  None     k
C      None  None     n  None
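For anyone who wants to run this step on its own, here is a minimal sketch. The input frame is an assumption, reconstructed from the printed result (the answer does not show it at this point), and the name `wide` is only illustrative.

```python
import pandas as pd

# Assumed input (not shown in the answer): repeated row labels in the index,
# a key column Q and a value column A, with no duplicate (label, Q) pairs.
df = pd.DataFrame(
    {"Q": list("abcdabc"), "A": list("hijklmn")},
    index=list("AAABBBC"),
)

# reset_index() moves the row labels into a regular column named 'index',
# which pivot() can then use as the row index of the wide table.
wide = df.reset_index().pivot(index="index", columns="Q", values="A")
print(wide)
```

Each distinct value of Q becomes a column, and combinations that do not occur in the input are left as missing values.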
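Then, if necessary, add the missing columns with reindex (written as reindex_axis in older pandas, which was deprecated and later removed) and convert None to NaN with replace:
import numpy as np

cols = list('abcdefg')
print(df.reindex(columns=cols).replace({None: np.nan}))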
Q        a    b    c    d    e    f    g
index
A        h    i    j  NaN  NaN  NaN  NaN
B        l    m  NaN    k  NaN  NaN  NaN
C      NaN  NaN    n  NaN  NaN  NaN  NaN
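As a quick check of what the reindexing step does, the sketch below applies it to a small hand-built frame. The frame and the names `wide` and `full` are hypothetical and only stand in for the pivoted result.

```python
import pandas as pd

# Hypothetical stand-in for the pivoted table, with only columns a and b.
wide = pd.DataFrame(
    {"a": ["h", "l", None], "b": ["i", "m", None]},
    index=list("ABC"),
)

cols = list("abcdefg")

# reindex(columns=...) keeps existing columns and adds the missing ones,
# filled entirely with missing values.
full = wide.reindex(columns=cols)
print(full)
```

Columns that are already present keep their values, so the call can be used safely even when no column is missing.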
EDIT:
If the data contains duplicates, it is better to use groupby with join:
print(df)
   Q  A
A  a  h
A  b  i
A  c  j
B  d  k
B  a  l
B  b  m   <- duplicate (B, b)
B  b  t   <- duplicate (B, b)
C  c  n
df = df.reset_index().groupby(['index','Q'])['A'].apply(','.join).unstack()
print(df)
Q         a     b     c     d
index
A         h     i     j  None
B         l   m,t  None     k
C      None  None     n  None
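The duplicate case can be reproduced end to end with the sketch below. It rebuilds the frame printed above and first checks whether any (label, Q) pair repeats; the names `has_dupes` and `joined` are illustrative and not part of the answer.

```python
import pandas as pd

# The frame printed above, including the two (B, b) rows.
df = pd.DataFrame(
    {"Q": list("abcdabbc"), "A": list("hijklmtn")},
    index=list("AAABBBBC"),
)

flat = df.reset_index()

# True when at least one (index, Q) pair occurs more than once;
# in that case a plain pivot() cannot place the values unambiguously.
has_dupes = flat.duplicated(["index", "Q"]).any()

# Join the values of each (index, Q) group with a comma, then move Q
# from the row index into the columns.
joined = flat.groupby(["index", "Q"])["A"].apply(",".join).unstack()
print(has_dupes)
print(joined)
```

The separator is just the string that `join` is called on, so `";".join` or `" | ".join` would work the same way.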
Another possible solution is pivot_table:
# df here is the original long-format frame with duplicates (printed above)
# aggfunc='first' - keeps only the first value; the other values are lost
df1 = df.reset_index().pivot_table(index='index', columns='Q', values='A', aggfunc='first')
print(df1)
Q         a     b     c     d
index
A         h     i     j  None
B         l     m  None     k
C      None  None     n  None
# aggfunc='sum' - values are concatenated without a separator
df2 = df.reset_index().pivot_table(index='index', columns='Q', values='A', aggfunc='sum')
print(df2)
Q         a     b     c     d
index
A         h     i     j  None
B         l    mt  None     k
C      None  None     n  None
# aggfunc=','.join - values are concatenated with a separator
df3 = df.reset_index().pivot_table(index='index', columns='Q', values='A', aggfunc=','.join)
print(df3)
Q         a     b     c     d
index
A         h     i     j  None
B         l   m,t  None     k
C      None  None     n  None
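To compare the aggregation choices side by side, here is a self-contained sketch on the same duplicate data. It covers only 'first' and ','.join; the names `flat`, `first` and `joined` are illustrative.

```python
import pandas as pd

# Same long-format data as above, including the duplicate (B, b) rows.
df = pd.DataFrame(
    {"Q": list("abcdabbc"), "A": list("hijklmtn")},
    index=list("AAABBBBC"),
)
flat = df.reset_index()

# Keep only the first value of each (index, Q) group; 't' is dropped.
first = flat.pivot_table(index="index", columns="Q", values="A", aggfunc="first")

# Keep every value, joined with a comma.
joined = flat.pivot_table(index="index", columns="Q", values="A", aggfunc=",".join)

print(first.loc["B", "b"])   # m
print(joined.loc["B", "b"])  # m,t
```

Which one fits depends on whether losing the extra values is acceptable: 'first' keeps each cell a single value, while the join keeps everything at the cost of packing several values into one string.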