2017-06-22

Pandas: dummy-encoding strings

I have a DataFrame in this format:

id amenities      ... 
1  "TV,Internet,Shower,..."  ... 
2  "TV,Hot tub,Internet,..."  ... 
3  "Internet,Heating,Shower..." ... 
... 

I want to split the strings on commas and create a dummy column for each category, resulting in something like this:

id TV Internet Shower Hot tub Heating ... 
1  1  1   1   0   0   ... 
2  1  1   0   1   0   ... 
3  0  1   1   0   1   ... 
... 

How do I go about doing this?

Thanks

Answers


You can use str.get_dummies together with join or concat:

# Split each string on commas into 0/1 indicator columns, then join them back to the id column
df = df[['id']].join(df['amenities'].str.get_dummies(','))
print(df)
   id  Heating  Hot tub  Internet  Shower  TV
0   1        0        0         1       1   1
1   2        0        1         1       0   1
2   3        1        0         1       1   0

Or:

# Same result, concatenating the id column and the indicator columns side by side
df = pd.concat([df['id'], df['amenities'].str.get_dummies(',')], axis=1)
print(df)
   id  Heating  Hot tub  Internet  Shower  TV
0   1        0        0         1       1   1
1   2        0        1         1       0   1
2   3        1        0         1       1   0
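
For anyone who wants to try this end to end, here is a minimal self-contained sketch. The sample frame is an assumption: it is rebuilt from the three rows shown in the question, with the elided "..." parts left out, so real data may produce more columns. The variable names `dummies` and `result` are only for illustration.

```python
import pandas as pd

# Assumed sample data: the three rows from the question, without the elided "..." parts
df = pd.DataFrame({
    "id": [1, 2, 3],
    "amenities": [
        "TV,Internet,Shower",
        "TV,Hot tub,Internet",
        "Internet,Heating,Shower",
    ],
})

# Split every string on "," and build one 0/1 indicator column per distinct value
dummies = df["amenities"].str.get_dummies(sep=",")

# Put the id column back in front of the indicator columns (aligned on the index)
result = df[["id"]].join(dummies)
print(result)
```

Note that the separator is matched literally, so if the real strings look like "TV, Internet" (with a space after the comma), the resulting column names would keep the leading space unless the strings are cleaned up first.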

Perfect, thanks!