pandas: how to count each word in every row of a DataFrame


Suppose I have a pandas DataFrame that looks something like this:

sentences 
['this', 'is', 'a', 'sentence', 'and', 'this', 'one', 'as', 'well'] 
['this', 'is', 'another', 'sentence', 'and', 'this', 'sentence', 'looks', 'like', 'other', 'sentences']
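
For anyone who wants to reproduce this, here is a minimal sketch that builds a frame of that shape. The variable name `df` and the literal construction are assumptions for illustration; the real data may well come from somewhere else.

```python
import pandas as pd

# Hypothetical reconstruction of the example frame: one list of words per row
df = pd.DataFrame({
    'sentences': [
        ['this', 'is', 'a', 'sentence', 'and', 'this', 'one', 'as', 'well'],
        ['this', 'is', 'another', 'sentence', 'and', 'this', 'sentence',
         'looks', 'like', 'other', 'sentences'],
    ]
})

# Two rows, one column; each cell is a plain Python list
print(df.shape)
print(type(df.loc[0, 'sentences']))
```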

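I am trying to count the occurrences of each word in every row and store the counts in a way that I can easily use whenever I need them. So far I have failed, and I would appreciate some help.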

Thanks!

Source

2017-06-19 emreorta

Have you tried using df.column_name.value_counts() (https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.value_counts.html)? – Tbaki

You can use Counter with the DataFrame constructor, but you will get NaNs for words that are missing from a row:

from collections import Counter 

import pandas as pd 

print (type(df.loc[0, 'sentences'])) 
<class 'list'> 

df1 = pd.DataFrame([Counter(x) for x in df['sentences']]) 
print (df1) 
     a  and  another   as  is  like  looks  one  other  sentence  sentences  \
0  1.0    1      NaN  1.0   1   NaN    NaN  1.0    NaN         1        NaN
1  NaN    1      1.0  NaN   1   1.0    1.0  NaN    1.0         2        1.0

   this  well
0     2   1.0
1     2   NaN
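
The NaNs come from the fact that each Counter only has keys for the words in its own row, so the constructor has nothing to put in the other columns. A small sketch using only the standard library makes this visible; the names `rows`, `counts`, `vocabulary` and `missing` are made up for illustration.

```python
from collections import Counter

rows = [
    ['this', 'is', 'a', 'sentence', 'and', 'this', 'one', 'as', 'well'],
    ['this', 'is', 'another', 'sentence', 'and', 'this', 'sentence',
     'looks', 'like', 'other', 'sentences'],
]

# One Counter per row, exactly as in the list comprehension above
counts = [Counter(words) for words in rows]

# Every word that occurs anywhere in the data
vocabulary = set().union(*counts)

# Words that have no key in a given row's Counter; these become NaN
missing = [sorted(vocabulary - set(c)) for c in counts]

print(missing[0])
print(missing[1])

# Looking up an absent word in the Counter itself gives 0, not an error
print(counts[0]['another'])
```

The words listed in `missing` should line up with the NaN cells in the output above.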

If you need to replace the NaNs with 0, add DataFrame.fillna:

df1 = pd.DataFrame([Counter(x) for x in df['sentences']]).fillna(0).astype(int) 
print (df1) 
   a  and  another  as  is  like  looks  one  other  sentence  sentences  \
0  1    1        0   1   1     0      0    1      0         1          0
1  0    1        1   0   1     1      1    0      1         2          1

   this  well
0     2     1
1     2     0
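
Since the question asks for counts that are easy to use later, here is a rough sketch of a few lookups on the resulting frame. It rebuilds the example data so that it runs on its own; `totals` and `row0_counts` are illustrative names, not part of the answer above.

```python
from collections import Counter

import pandas as pd

# Hypothetical reconstruction of the example frame from the question
df = pd.DataFrame({
    'sentences': [
        ['this', 'is', 'a', 'sentence', 'and', 'this', 'one', 'as', 'well'],
        ['this', 'is', 'another', 'sentence', 'and', 'this', 'sentence',
         'looks', 'like', 'other', 'sentences'],
    ]
})

df1 = pd.DataFrame([Counter(x) for x in df['sentences']]).fillna(0).astype(int)

# Count of a single word in a single row
print(df1.loc[1, 'sentence'])

# Total count of every word across all rows
totals = df1.sum()
print(totals['this'])

# Counts for one row as a plain dict, leaving out words that do not occur
row0 = df1.loc[0]
row0_counts = row0[row0 > 0].to_dict()
print(row0_counts)
```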

Source

2017-06-19 08:36:18 jezrael

Thanks for the quick response! Is it possible to do this without the columns being rearranged alphabetically? – emreorta

Unfortunately not, because the 'DataFrame' constructor sorts them :( – jezrael

Well, it looks like we can't have everything :D Thanks again! – emreorta
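
One possible workaround for the column order, which is not from the answer above: collect the words in order of first appearance and pass that list to DataFrame.reindex. Whether the constructor sorts the columns may depend on the pandas version, so treat this as a sketch that sets the order explicitly either way. The name `order` is made up for illustration.

```python
from collections import Counter

import pandas as pd

# Hypothetical reconstruction of the example frame from the question
df = pd.DataFrame({
    'sentences': [
        ['this', 'is', 'a', 'sentence', 'and', 'this', 'one', 'as', 'well'],
        ['this', 'is', 'another', 'sentence', 'and', 'this', 'sentence',
         'looks', 'like', 'other', 'sentences'],
    ]
})

# Unique words in order of first appearance; dict keys keep insertion order
order = list(dict.fromkeys(w for words in df['sentences'] for w in words))

df1 = (
    pd.DataFrame([Counter(x) for x in df['sentences']])
    .reindex(columns=order)  # force the column order explicitly
    .fillna(0)
    .astype(int)
)

print(df1.columns.tolist())
```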
