Python: counting frequencies in a groupby result

I have a DataFrame:

import pandas as pd

df = pd.DataFrame({'id': ['one', 'one', 'two', 'two', 'three', 'three', 'three'],
                   'type': ['current', 'saving', 'current', 'current', 'current', 'saving', 'credit']})

I want to find the ids that have only the 'current' type, something like:

only_current_id_list = ['two']


2017-08-30 KEXIN WANG

Why should the result be 'two'? – RomanPerekhrest

Because user 'two' is the only one that has only the 'current' type. –

Use pd.crosstab:

ct = pd.crosstab(df.id, df.type)  # rows: id, columns: type, cells: row counts
ct.loc[ct.sum(axis=1) == ct.current].index.values[0]  # first id whose rows are all 'current'

Out[1065]: 'two'
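
For readers who want every matching id rather than only the first one, here is a minimal sketch of the same crosstab idea. The names `counts`, `mask` and `only_current` are illustrative and do not come from the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['one', 'one', 'two', 'two', 'three', 'three', 'three'],
    'type': ['current', 'saving', 'current', 'current', 'current', 'saving', 'credit'],
})

# One row per id, one column per type; each cell counts matching rows.
counts = pd.crosstab(df['id'], df['type'])

# An id has only 'current' rows when its row total equals its 'current' count.
mask = counts.sum(axis=1) == counts['current']
only_current = counts.loc[mask].index.tolist()
print(only_current)
```

Building the table under a separate name also leaves the original `df` untouched for later steps.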

Try this.

Or you can use groupby and nunique:

df['unique']=df.groupby('id')['type'].transform('nunique') 

df.loc[(df.unique==1)&(df.type=='current'),:].id.unique().tolist() 


Out[1085]: ['two']
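
If adding a helper column to `df` is not desirable, the same nunique check can be sketched on a per-id summary instead. This is only one possible variant; `summary` and `only_current` are names made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['one', 'one', 'two', 'two', 'three', 'three', 'three'],
    'type': ['current', 'saving', 'current', 'current', 'current', 'saving', 'credit'],
})

# Per id: how many distinct types it has, and what its first type is.
summary = df.groupby('id')['type'].agg(['nunique', 'first'])

# If an id has a single distinct type, that type equals the first one seen.
keep = (summary['nunique'] == 1) & (summary['first'] == 'current')
only_current = summary[keep].index.tolist()
print(only_current)
```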


2017-08-30 13:55:21 Wen

I think you need:

# Keep ids whose rows are all 'current' and that have exactly one such row.
L = (df.groupby('id')
       .filter(lambda x: (x['type'] == 'current').all() and
                         (x['type'] == 'current').sum() == 1)['id']
       .tolist())
print(L)

['two']

Edit:

df = pd.DataFrame({'id': ['one', 'one', 'two', 'three', 'three', 'three'],
                   'type': ['current', 'current', 'current', 'current', 'saving', 'credit']})
print(df)
      id     type
0    one  current
1    one  current
2    two  current
3  three  current
4  three   saving
5  three   credit

# All rows 'current' and exactly one row per id.
L = (df.groupby('id')
       .filter(lambda x: (x['type'] == 'current').all() and
                         (x['type'] == 'current').sum() == 1)['id']
       .tolist())
print(L)
['two']

# All rows 'current', any number of rows per id.
L = (df.groupby('id')
       .filter(lambda x: (x['type'] == 'current').all())['id']
       .unique()
       .tolist())
print(L)
['one', 'two']


2017-08-30 13:49:21 jezrael

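What does (x['type'] == 'current').sum() == 1 do? –

Hi jezrael, thanks for your answer, it works. But what if user 'two' has multiple 'current' types? –

Sorry, it is for when you need to filter only the ids that have a single 'current' value. I added a sample to explain it better. – jezrael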
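
The difference between the two conditions discussed in these comments can also be sketched without `filter`, by aggregating a boolean column per id. This uses the sample from the edit; `is_current`, `stats`, `all_current` and `n_current` are illustrative names, not part of the answer.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['one', 'one', 'two', 'three', 'three', 'three'],
    'type': ['current', 'current', 'current', 'current', 'saving', 'credit'],
})

# True where the row's type is 'current'.
is_current = df['type'].eq('current')

# Per id: are all rows 'current', and how many 'current' rows are there?
stats = is_current.groupby(df['id']).agg(all_current='all', n_current='sum')

# Ids with only 'current' rows, however many there are.
only_current = stats[stats['all_current']].index.tolist()

# Ids with only 'current' rows and exactly one of them.
exactly_one = stats[stats['all_current'] & (stats['n_current'] == 1)].index.tolist()

print(only_current)
print(exactly_one)
```

Dropping the `n_current == 1` test is what lets an id with several 'current' rows, such as 'one' here, pass.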

This is not pure pandas, but you can take the set difference between all ids and the ids that have any type != 'current':

>>> set(df["id"]) - set(df["id"][df["type"] != "current"]) 
{'two'}
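
A pandas-only counterpart of this set difference could look like the sketch below, assuming the DataFrame from the question. `other_ids` and `only_current` are hypothetical names introduced for the example.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['one', 'one', 'two', 'two', 'three', 'three', 'three'],
    'type': ['current', 'saving', 'current', 'current', 'current', 'saving', 'credit'],
})

# Ids that have at least one row whose type is not 'current'.
other_ids = df.loc[df['type'] != 'current', 'id']

# Keep rows whose id never appears in other_ids, then deduplicate.
only_current = df.loc[~df['id'].isin(other_ids), 'id'].unique().tolist()
print(only_current)
```

Unlike a Python set, `unique()` keeps the ids in order of first appearance.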


2017-08-30 13:51:32
