Complex groupby operation in Pandas to capture many-to-one scenarios

Below is a small sample of my data frame, which has millions of rows. Each row represents a Send_Customer sending money to a Pay_Customer.

In [14]: df1
Out[14]:
         Send_Customer         Pay_Customer
0  1000000000009548332  2000000000087113758
1  1000000000072327616  2000000000087113758
2  1000000000081537869  2000000000087113758
3  1000000000007725765  2000000000078800989
4  1000000000031950290  2000000000078800989
5  1000000000082570417  2000000000078800989
6  1000000000009548332  1000000000142041382
7  1000000000072327616  1000000000142041382
8  2000000000097181041  1000000000004033594

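I need to store a count for the Send_Customers that take part in many-to-one scenarios.

For example, Pay_Customers 2000000000087113758, 2000000000078800989, and 1000000000142041382 each receive money from multiple Send_Customers. So for each Send_Customer that sends money to one of them, the "count" value is 1.

Send_Customers 1000000000009548332 and 1000000000072327616 each take part in two many-to-one scenarios, with Pay_Customers 2000000000087113758 and 1000000000142041382, so their cumulative "count" should be 2.

Thanks in advance!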

2016-08-05 mysterious_guy

You can use groupby:

print(df1.groupby('Send_Customer')['Pay_Customer'].count())

Output:

Send_Customer 
1000000000007725765 1 
1000000000009548332 2 
1000000000031950290 1 
1000000000072327616 2 
1000000000081537869 1 
1000000000082570417 1 
2000000000097181041 1
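
To reproduce the output above, here is a minimal self-contained sketch. The DataFrame is rebuilt from the nine sample rows in the question. Storing the IDs as integers and the variable name `per_sender` are assumptions made for illustration.

```python
import pandas as pd

# Sample rows copied from the question; integer dtype is an assumption.
df1 = pd.DataFrame({
    "Send_Customer": [
        1000000000009548332, 1000000000072327616, 1000000000081537869,
        1000000000007725765, 1000000000031950290, 1000000000082570417,
        1000000000009548332, 1000000000072327616, 2000000000097181041,
    ],
    "Pay_Customer": [
        2000000000087113758, 2000000000087113758, 2000000000087113758,
        2000000000078800989, 2000000000078800989, 2000000000078800989,
        1000000000142041382, 1000000000142041382, 1000000000004033594,
    ],
})

# Number of payment rows per sender (non-null Pay_Customer values).
per_sender = df1.groupby("Send_Customer")["Pay_Customer"].count()
print(per_sender)
```

Note that this counts every payment row, so sender 2000000000097181041 still appears with a count of 1 even though its Pay_Customer has only one sender.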

Based on your comment, if you want to keep only the rows where the count is greater than 1, you can do this instead:

df1 = df1.groupby('Send_Customer')['Pay_Customer'].count().reset_index(name="count") 
df1 = df1[df1['count'] > 1]

Output:

1 1000000000009548332  2 
3 1000000000072327616  2

2016-08-05 02:03:13

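Hi. My data frame has millions of rows; the above is only a small sample. Sorry I did not mention that earlier. I only need counts for the customers that take part in many-to-one scenarios. So in this example, there is no need to count Send_customer 2000000000097181041, because its Pay_customer is not involved in a many-to-one scenario.

@mysterious_guy Please see my edit.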
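
The requirement described in the comment above could also be read as a two-step computation: first find the Pay_Customers that receive from more than one distinct Send_Customer, then count, per sender, how many of those payees it pays. The following is a hedged sketch of that reading, not the answerer's method. The names `senders_per_payee`, `many_to_one`, and `result` are hypothetical, and the integer ID dtype is an assumption.

```python
import pandas as pd

# Sample rows copied from the question; integer dtype is an assumption.
df1 = pd.DataFrame({
    "Send_Customer": [
        1000000000009548332, 1000000000072327616, 1000000000081537869,
        1000000000007725765, 1000000000031950290, 1000000000082570417,
        1000000000009548332, 1000000000072327616, 2000000000097181041,
    ],
    "Pay_Customer": [
        2000000000087113758, 2000000000087113758, 2000000000087113758,
        2000000000078800989, 2000000000078800989, 2000000000078800989,
        1000000000142041382, 1000000000142041382, 1000000000004033594,
    ],
})

# Distinct senders per payee, broadcast back onto every row of df1.
senders_per_payee = (
    df1.groupby("Pay_Customer")["Send_Customer"].transform("nunique")
)

# Keep only rows whose payee receives from more than one sender.
many_to_one = df1[senders_per_payee > 1]

# Count the distinct many-to-one payees that each sender pays.
result = (
    many_to_one.groupby("Send_Customer")["Pay_Customer"]
    .nunique()
    .reset_index(name="count")
)
print(result)
```

On the sample, this drops sender 2000000000097181041 and keeps the senders whose count is 1, which matches the expected values stated in the question. Both steps are vectorized groupby operations, so the approach should carry over to a frame with millions of rows, though that has not been measured here.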
