Pandas DataFrame: update a column using another column

I have a DataFrame df with two columns, phone and label, where label can only be 0 or 1.
Here is an example:

phone label 
    a  0 
    b  1 
    a  1 
    a  0 
    c  0 
    b  0

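What I want to do is count the number of 1s for each type of 'phone', and replace the 'phone' column with that number. What I came up with is groupby, but I'm not familiar with it.

The answer should be: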

Count the number of each 'phone' 
phone count 
    a   1 
    b   1 
    c   0 

replace the 'phone' with 'count' in the original table 
phone 
    1 
    1 
    1 
    1 
    0 
    1

2016-07-15 Fan

Do you want to find the number of rows in 'phone' given label == 1? –

Do you want: df.groupby('phone').sum()? – bernie

But how can I replace 'phone' with the 'sum'? – Fan

Taking into account that the label column can only contain 0 or 1, you can use the .transform('sum') method:

In [4]: df.label = df.groupby('phone')['label'].transform('sum') 

In [5]: df 
Out[5]: 
    phone label 
0  a  1 
1  b  1 
2  a  1 
3  a  1 
4  c  0 
5  b  1

Explanation:

In [2]: df 
Out[2]: 
    phone label 
0  a  0 
1  b  1 
2  a  1 
3  a  0 
4  c  0 
5  b  0 

In [3]: df.groupby('phone')['label'].transform('sum') 
Out[3]: 
0 1 
1 1 
2 1 
3 1 
4 0 
5 1 
dtype: int64
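
The transcript above overwrites label, while the question asks for the 'phone' column to be replaced. A minimal sketch of that variant, assuming the sample data from the question; the names counts and result are only illustrative:

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "phone": ["a", "b", "a", "a", "c", "b"],
    "label": [0, 1, 1, 0, 0, 0],
})

# Per-phone sum of label, broadcast back onto every row of the original frame.
# Because label is only 0 or 1, the sum equals the number of 1s.
counts = df.groupby("phone")["label"].transform("sum")

# Write the result into 'phone' and leave 'label' untouched.
result = df.assign(phone=counts)
print(result)
```

Using assign returns a new DataFrame, so the original df stays available for comparison.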

2016-07-15 07:11:51 MaxU

You can filter and group data in pandas. For your case, it would look as follows.

Assuming this data:

phone label 
0  a  0 
1  b  1 
2  a  1 
3  a  1 
4  c  1 
5  d  1 
6  a  0 
7  c  0 
8  b  0 

df.groupby(['phone','label'])['label'].count() 
phone label 
a  0  2 
     1  2 
b  0  1 
     1  1 
c  0  1 
     1  1 
d  1  1
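
The counts per (phone, label) pair can also be laid out as one row per phone. This is a sketch that is not part of the original answer, assuming the sample data above; it uses size() and unstack(fill_value=0), so a missing combination such as (d, 0) shows up as 0:

```python
import pandas as pd

# Sample data assumed in this answer.
df = pd.DataFrame({
    "phone": ["a", "b", "a", "a", "c", "d", "a", "c", "b"],
    "label": [0, 1, 1, 1, 1, 1, 0, 0, 0],
})

# Count rows per (phone, label) pair, then pivot label values into columns.
# fill_value=0 fills in combinations that never occur in the data.
table = df.groupby(["phone", "label"]).size().unstack(fill_value=0)
print(table)
```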

If you need the number of rows per phone group given label==1, then do this:

#first filter to get only label==1 rows 
phone_rows_label_one_df = df[df.label==1] 

#then do groupby 
phone_rows_label_one_df.groupby(['phone'])['label'].count() 

phone 
a 2 
b 1 
c 1 
d 1

To get the count as a new column in a DataFrame, do this:

phone_rows_label_one_df.groupby(['phone'])['label'].count().reset_index(name='count') 
    phone count 
0  a  2 
1  b  1 
2  c  1 
3  d  1
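
To put these counts back into the original table, one option is to map each phone to its count. The sketch below uses the data from the question rather than the data assumed above, because there phone 'c' has no label==1 row: filtering first drops it from the grouped result, so it has to be filled with 0. The variable names are illustrative only:

```python
import pandas as pd

# Data from the question; phone 'c' never has label == 1.
df = pd.DataFrame({
    "phone": ["a", "b", "a", "a", "c", "b"],
    "label": [0, 1, 1, 0, 0, 0],
})

# Keep only label == 1 rows, then count rows per phone.
counts = df[df["label"] == 1].groupby("phone")["label"].count()

# Look up each row's phone in counts; phones absent from counts become NaN,
# so fill them with 0 and convert back to integers.
mapped = df["phone"].map(counts).fillna(0).astype(int)

replaced = df.assign(phone=mapped)
print(replaced)
```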

2016-07-15 02:44:18

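Actually, I want to find the number of rows for each type of 'phone' given label == 1. – Fan

How can I replace 'phone' with the count in the original table? – Fan

@Fan Done. Pandas is awesome! –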
