2017-02-27

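Python/Pandas: If a column has multiple values, convert to a single row with the multiple values in a list

In my DataFrame, I have many instances of the same AutoNumber with different KeyValue_String values. I would like to convert these instances into a single row, where KeyValue_String is a list of the multiple unique values.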

    AutoNumber  KeyValue_String  ReferralType  Description
0        50899               DD             3  Web Search
1        50905           Cheque             1  Gatestone Collections
2        50906               DD             2  Centum Mortgage Brokers
3        50907           Cheque             1  Financial Debt Recovery Ltd.
4        50908               DD             2  Centum Mortgage Brokers
5        50909               DD             2  Centum Mortgage Brokers
6        50910           Cheque             1  Allied International Credit
7        50911           Cheque             1  D&A Collection Corp
8        50912           Cheque             1  Gatestone Collections
9        50913           Cheque             1  Financial Debt Recovery Ltd.
10       50914           Cheque             3  Existing Customer - Refinancing
11       50914               DD             3  Existing Customer - Refinancing
12       50915           Cheque             1  Gatestone Collections
13       50916           Cheque             3  Existing Customer - Refinancing
14       50916           Cheque             3  Existing Customer - Refinancing

The desired output should look like this, but I would also like to keep all of the other columns:

 AutoNumber KeyValue_String 
0   50899   DD 
1   50905  Cheque 
2   50906   DD 
3   50907  Cheque 
4   50908   DD 
5   50909   DD 
6   50910  Cheque 
7   50911  Cheque 
8   50912  Cheque 
9   50913  Cheque 
10   50914 [Cheque, DD] 
11   50915  Cheque 
12   50916  Cheque 
13   50917  Cheque 
14   50918  Cheque 

Answer

If I understand correctly, you could use groupby, transform, and unique:

df['KeyValue_String'] = df.groupby('AutoNumber').KeyValue_String.transform('unique') 

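Then you can drop the duplicates, assuming (as mentioned in the comments) that rows with the same AutoNumber contain duplicate information in every column except KeyValue_String.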

df = df.drop_duplicates(subset='AutoNumber') 
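
As a self-contained illustration of the same idea, here is a minimal sketch that collapses the rows in one step with groupby and agg instead of transform followed by drop_duplicates. This is an alternative, not the method above. The sample rows are a small subset of the question's data, and taking "first" for the other columns relies on the same assumption that they are identical within each AutoNumber.

```python
import pandas as pd

# A few rows modeled on the question's data (subset, for illustration only).
df = pd.DataFrame({
    "AutoNumber": [50899, 50914, 50914, 50916, 50916],
    "KeyValue_String": ["DD", "Cheque", "DD", "Cheque", "Cheque"],
    "ReferralType": [3, 3, 3, 3, 3],
    "Description": [
        "Web Search",
        "Existing Customer - Refinancing",
        "Existing Customer - Refinancing",
        "Existing Customer - Refinancing",
        "Existing Customer - Refinancing",
    ],
})

# One row per AutoNumber: the unique KeyValue_String values become a list,
# and the other columns keep their first value (assumed identical per group).
result = (
    df.groupby("AutoNumber", as_index=False)
      .agg({
          "KeyValue_String": lambda s: list(s.unique()),
          "ReferralType": "first",
          "Description": "first",
      })
)

print(result)
```

Every cell in KeyValue_String ends up as a list, including the single-value groups, which matches the demo output further down.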

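I would suggest keeping everything in the column as an array, rather than spending effort on producing mixed types (scalars in some rows, lists in others), which would only be harder to work with anyway.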
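
To show why a uniformly list-valued column tends to be easier to handle, here is a short sketch. The `collapsed` frame and the `n_values` and `as_text` column names are made up for this example; the calls themselves (`Series.str.len`, `Series.str.join`, `DataFrame.explode`) are standard pandas.

```python
import pandas as pd

# Hypothetical collapsed frame where every KeyValue_String cell is a list.
collapsed = pd.DataFrame({
    "AutoNumber": [50899, 50914, 50916],
    "KeyValue_String": [["DD"], ["Cheque", "DD"], ["Cheque"]],
})

# Count the values in each row; this works because every cell is a list.
collapsed["n_values"] = collapsed["KeyValue_String"].str.len()

# Render each list as a plain string for display or export.
collapsed["as_text"] = collapsed["KeyValue_String"].str.join(", ")

# Go back to one row per value when needed.
long_form = collapsed.explode("KeyValue_String")

print(collapsed)
print(long_form)
```

With mixed scalars and lists in the same column, each of these steps would need a type check per row first.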

Demo

>>> df 
    AutoNumber KeyValue_String 
0  50899    DD 
1  50905   Cheque 
2  50906    DD 
3  50907   Cheque 
4  50908    DD 
5  50909    DD 
6  50910   Cheque 
7  50911   Cheque 
8  50912   Cheque 
9  50913   Cheque 
10  50914   Cheque 
11  50914    DD 
12  50915   Cheque 
13  50916   Cheque 
14  50916   Cheque 

>>> df['KeyValue_String'] = df.groupby('AutoNumber').KeyValue_String.transform('unique') 

>>> df.drop_duplicates(subset='AutoNumber') 

    AutoNumber KeyValue_String 
0  50899   [DD] 
1  50905  [Cheque] 
2  50906   [DD] 
3  50907  [Cheque] 
4  50908   [DD] 
5  50909   [DD] 
6  50910  [Cheque] 
7  50911  [Cheque] 
8  50912  [Cheque] 
9  50913  [Cheque] 
10  50914 [Cheque, DD] 
12  50915  [Cheque] 
13  50916  [Cheque] 

Thanks a bunch. I appreciate the help!