I think you need groupby with apply:
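# output: one tuple per question, with the question text as the first element
# (assigned to a new name so that df stays available for the other variants)
out = df.groupby('q_body')['a_body'].apply(lambda x: tuple([x.name] + list(x)))
print(out)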
q_body
question 1 (question 1, answer 1, answer 2, answer 3)
question 2 (question 2, answer 1, answer 2)
Name: a_body, dtype: object
# output: one list per question, with the question text as the first element
out = df.groupby('q_body')['a_body'].apply(lambda x: [x.name] + list(x))
print(out)
q_body
question 1 [question 1, answer 1, answer 2, answer 3]
question 2 [question 2, answer 1, answer 2]
Name: a_body, dtype: object
# output: one list per question, answers only (no question text)
out = df.groupby('q_body')['a_body'].apply(list)
print(out)
q_body
question 1 [answer 1, answer 2, answer 3]
question 2 [answer 1, answer 2]
Name: a_body, dtype: object
# grouping by parent_id instead of q_body, answers only
out = df.groupby('parent_id')['a_body'].apply(list)
print(out)
parent_id
1 [answer 1, answer 2, answer 3]
2 [answer 1, answer 2]
Name: a_body, dtype: object
# output: one string per group, values concatenated with ', '
out = df.groupby('parent_id')['a_body'].apply(', '.join)
print(out)
parent_id
1 answer 1, answer 2, answer 3
2 answer 1, answer 2
Name: a_body, dtype: object
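To try the variants above end to end, here is a minimal self-contained sketch. The question's DataFrame is not shown here, so the sample data below is an assumption reconstructed from the printed outputs (columns parent_id, q_body, a_body); the names with_question, answers_only and joined are just illustrative.

```python
import pandas as pd

# Hypothetical sample data: column names follow the snippets above,
# the values are assumed from the printed outputs.
df = pd.DataFrame({
    'parent_id': [1, 1, 1, 2, 2],
    'q_body': ['question 1'] * 3 + ['question 2'] * 2,
    'a_body': ['answer 1', 'answer 2', 'answer 3', 'answer 1', 'answer 2'],
})

grouped = df.groupby('q_body')['a_body']

# Inside apply, each group is a Series whose .name is the group key,
# which is how the question text gets prepended to its answers.
with_question = grouped.apply(lambda x: [x.name] + list(x))

# Passing the builtin list collects only the answers of each group.
answers_only = grouped.apply(list)

# Grouping by parent_id and joining the answers into a single string.
joined = df.groupby('parent_id')['a_body'].apply(', '.join)

print(with_question.loc['question 1'])
print(answers_only.loc['question 2'])
print(joined.loc[1])
```

Each result is kept in its own variable, so df is never overwritten and every variant can run against the same original data.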
But if you need the output as a plain Python list, add tolist:
L = df.groupby('q_body')['a_body'].apply(lambda x: tuple([x.name] + list(x))).tolist()
print (L)
[('question 1', 'answer 1', 'answer 2', 'answer 3'), ('question 2', 'answer 1', 'answer 2')]
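If you prefer to avoid the lambda, iterating over the groupby object directly should give the same list of tuples. This is only a sketch of an alternative, not a replacement for the approach above, and the sample data is again assumed from the printed outputs.

```python
import pandas as pd

# Hypothetical sample data, assumed from the outputs shown above.
df = pd.DataFrame({
    'q_body': ['question 1'] * 3 + ['question 2'] * 2,
    'a_body': ['answer 1', 'answer 2', 'answer 3', 'answer 1', 'answer 2'],
})

# Iterating over a groupby yields (group key, group Series) pairs;
# unpacking the Series puts the answers right after the question text.
L = [(q, *answers) for q, answers in df.groupby('q_body')['a_body']]
print(L)
```

Groups come out sorted by key by default, which matches the order of the apply-based result.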
Thanks jezrael, I'll be using lambda more from now on. –
Glad I could help. Have a nice day. – jezrael