2017-06-14 59 views
1

Pandas: select one group and reshape the remaining groups into columns

I have a DataFrame that looks like this:

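from io import StringIO  # pandas.compat.StringIO was removed from later pandas releases

import pandas as pd

# sep=r'\s+' splits the columns on any run of whitespace
origin = pd.read_table(StringIO('''label type value
x a 1
x b 2
y a 4
y b 5
z a 7
z c 9'''), sep=r'\s+')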

origin
Out[5]:
  label type  value
0     x    a      1
1     x    b      2
2     y    a      4
3     y    b      5
4     z    a      7
5     z    c      9

I want to reshape it into something like:

  label type  value  y_value  z_value
0     x    a      1        4        7
1     x    b      2        5      NaN

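Here y_value and z_value are looked up by type: each row of label x gets the value that label y (or z) has for the same type, or NaN if that label has no such type.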

Answers

1

You can first filter with boolean indexing (for df2, also use isin to drop the rows whose type does not occur in df1['type']), then use pivot with add_suffix, and finally join:

a = 'x'
df = origin  # the DataFrame from the question
df1 = df[df['label'] == a]
df2 = df[(df['label'] != a) & (df['type'].isin(df1['type']))]
df3 = df2.pivot(index='type', columns='label', values='value').add_suffix('_value')
print (df3)
label  y_value  z_value
type
a          4.0      7.0
b          5.0      NaN

df3 = df1.join(df3, on='type') 
print (df3) 
  label type  value  y_value  z_value
0     x    a      1      4.0      7.0
1     x    b      2      5.0      NaN
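
If you need this for more than one label, the same steps could be wrapped in a small helper. This is only a sketch: the function name spread_other_labels and its parameters are made up for illustration, and it assumes each (type, label) pair occurs at most once, since pivot raises an error on duplicates.

```python
from io import StringIO

import pandas as pd


def spread_other_labels(df, keep_label):
    """Keep the rows of keep_label and add one '<label>_value' column
    per other label, matched on 'type' (hypothetical helper)."""
    kept = df[df['label'] == keep_label]
    # Rows of the other labels whose type also occurs in the kept group.
    others = df[(df['label'] != keep_label) & df['type'].isin(kept['type'])]
    # One row per type, one column per remaining label.
    wide = (others.pivot(index='type', columns='label', values='value')
                  .add_suffix('_value'))
    # Look up the wide columns through the 'type' column of the kept rows.
    return kept.join(wide, on='type')


origin = pd.read_csv(StringIO('''label type value
x a 1
x b 2
y a 4
y b 5
z a 7
z c 9'''), sep=r'\s+')

result = spread_other_labels(origin, 'x')
print(result)
```

Calling it with 'y' or 'z' instead of 'x' should give the corresponding table for that label.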
0

You can use pivot:

origin_temp = origin.pivot(index='type',columns='label',values='value') 

Output:

type    x    y    z
a     1.0  4.0  7.0
b     2.0  5.0  NaN
c     NaN  NaN  9.0

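Filter out what you are not interested in:

# drop type 'c' (label x has no such type) and turn the index into a column
origin_temp = origin_temp.drop('c').reset_index()
# drop the column of the label that is kept as rows
origin_temp = origin_temp.drop('x', axis=1)
# keep only the y and z columns (this also drops the 'type' column)
origin_temp = origin_temp[['y', 'z']]
origin_temp.columns = [i + '_value' for i in origin_temp]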

Output:

   y_value  z_value
0      4.0      7.0
1      5.0      NaN

Then filter the rows you want to keep:

origin_temp_2 = origin[origin['label'] == 'x']

Output:

  label type  value
0     x    a      1
1     x    b      2

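Finally, concat the two:

# axis=1 aligns on the index; this works here because both frames have index 0, 1 in the same type order
origine_final = pd.concat([origin_temp, origin_temp_2], axis=1)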

Output:

   y_value  z_value label type  value
0      4.0      7.0     x    a      1
1      5.0      NaN     x    b      2
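
Note that the concat step only lines up because the x rows happen to sit at index 0 and 1 in the same order as the types. A possible variant that matches on type instead, and avoids hard-coding 'c', is sketched below. The names keep, wide, kept and final are my own, and merge is swapped in for concat here.

```python
from io import StringIO

import pandas as pd

origin = pd.read_csv(StringIO('''label type value
x a 1
x b 2
y a 4
y b 5
z a 7
z c 9'''), sep=r'\s+')

keep = 'x'  # label whose rows are kept (assumption: chosen by the caller)

# One row per type, one column per label.
wide = origin.pivot(index='type', columns='label', values='value')
# Keep only the types that exist for the chosen label, then drop its column.
wide = wide[wide[keep].notna()].drop(columns=keep).add_suffix('_value')

kept = origin[origin['label'] == keep]
# Match on the 'type' value rather than on row position.
final = kept.merge(wide, left_on='type', right_index=True, how='left')
print(final)
```

The column order comes out as label, type, value, y_value, z_value, which is the layout asked for in the question.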