2016-04-28 59 views
-2
is_correct, question_id 
t   1 
t   1 
f   1 
f   1 
t   2 
t   2 

Expected result (how can I group and count by different conditions?):

correct_count, incorrect_count, question_id 
2    2    1 
2    0    2 

This is what I have, but it only gives me the count of correct answers:

df[df["is_correct"]].groupby("question_id")["question_id"].count() 

Possible duplicate of [Python pandas: how to group by and count with a condition for every value in a column?](http://stackoverflow.com/questions/31458703) –


It is a duplicate, although MaxU's solution to this question is better and more interesting than the answers on the other one. – samol


Then please mark the other question as a duplicate of this one, so that everyone is directed here. –

Answers

1

You can use the pivot_table function:

In [27]: import io
    ....: import pandas as pd

In [28]: data = """\
    ....: is_correct question_id
    ....: t   1
    ....: t   1
    ....: f   1
    ....: f   1
    ....: t   2
    ....: t   2
    ....: """

In [29]: df = pd.read_csv(io.StringIO(data), sep=r'\s+')

In [30]: df['count'] = 0

In [31]: df
Out[31]:
  is_correct  question_id  count
0          t            1      0
1          t            1      0
2          f            1      0
3          f            1      0
4          t            2      0
5          t            2      0

In [32]: df.pivot_table(index='question_id', columns='is_correct',
    ....:                values='count', aggfunc='count', fill_value=0)\
    ....:   .reset_index()
Out[32]:
is_correct  question_id  f  t
0                     1  2  2
1                     2  0  2
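
If the columns should carry the names from the question (correct_count, incorrect_count) rather than f and t, a similar result can be sketched with pd.crosstab, which avoids the helper count column. This is only one possible variation, and it assumes is_correct holds the strings 't' and 'f' as in the sample data:

```python
import pandas as pd

# Sample data from the question; is_correct is assumed to hold 't' / 'f' strings.
df = pd.DataFrame({
    'is_correct': ['t', 't', 'f', 'f', 't', 't'],
    'question_id': [1, 1, 1, 1, 2, 2],
})

# Count every (question_id, is_correct) pair; pairs that never occur become 0.
counts = pd.crosstab(df['question_id'], df['is_correct'])

result = (
    counts
    # Guarantee both columns exist even if one value is absent from the data.
    .reindex(columns=['t', 'f'], fill_value=0)
    .rename(columns={'t': 'correct_count', 'f': 'incorrect_count'})
    # Drop the leftover "is_correct" label on the column axis.
    .rename_axis(columns=None)
    .reset_index()
)
print(result)
```

The reindex step is there so that a data set with no incorrect (or no correct) answers at all still produces both count columns.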

@samol, did it help? – MaxU

0

Create another column that you can sum up with groupby:

import pandas as pd

df = pd.DataFrame({'is_correct': ['t', 't', 'f', 'f', 't', 't'], 'question_id': [1, 1, 1, 1, 2, 2]})
df['to_sum_up'] = 1

is_correct question_id to_sum_up 
t   1   1 
t   1   1 
f   1   1 
f   1   1 
t   2   1 
t   2   1 

df2 = df.groupby(['question_id','is_correct'],as_index = False).sum() 

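Once you have done the groupby, you only need to rearrange the data so that it fits the columns you want:

# .ix has been removed from pandas; use .loc. The summed column is 'to_sum_up'.
df2['correct_count'] = df2.loc[df2['is_correct'] == 't', 'to_sum_up']
df2['incorrect_count'] = df2.loc[df2['is_correct'] == 'f', 'to_sum_up']

Then, to get a tidy DataFrame as output:

# Rows that did not match the condition hold NaN; replace them with 0.
df2.loc[df2['correct_count'].isnull(), 'correct_count'] = 0
df2.loc[df2['incorrect_count'].isnull(), 'incorrect_count'] = 0
# Collapse to one row per question, then drop the helper columns.
df2 = df2.groupby('question_id', as_index=False).max()
df2 = df2.drop(columns=['to_sum_up', 'is_correct'])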

   question_id  correct_count  incorrect_count
0            1            2.0              2.0
1            2            2.0              0.0
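
The same idea (a helper column that is summed per group) can be written more compactly with boolean helper columns and named aggregation. This is a sketch rather than part of the answer above; the helper names correct and incorrect are made up for illustration, and is_correct is again assumed to contain 't' / 'f' strings:

```python
import pandas as pd

# Sample data from the question; is_correct is assumed to hold 't' / 'f' strings.
df = pd.DataFrame({
    'is_correct': ['t', 't', 'f', 'f', 't', 't'],
    'question_id': [1, 1, 1, 1, 2, 2],
})

# Boolean helper: True where the answer is correct.
correct = df['is_correct'].eq('t')

result = (
    # 'correct' and 'incorrect' are hypothetical helper columns.
    df.assign(correct=correct, incorrect=~correct)
      .groupby('question_id', as_index=False)
      # Summing booleans counts the True values in each group.
      .agg(correct_count=('correct', 'sum'),
           incorrect_count=('incorrect', 'sum'))
)
print(result)
```

Because booleans sum to integers, the counts stay integer-typed and no NaN filling is needed.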