OK, first thing: your dictionary is not actually a dictionary. It should be constructed like this:
d = {'document1':{"it's": 0,"they're": 2,"there's": 5,"he's": 1},
'document2':{"it's": 4,"they're": 2,"there's": 3,"he's": 0},
'document3':{"it's": 7,"they're": 0,"there's": 4,"he's": 1}}
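We can build a DataFrame from a dictionary with pandas, but to get it in the shape you want we first have to build a list of lists from the dictionary. Then we create the DataFrame, label the columns, and sort: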
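import pandas as pd

d = {'document1':{"it's": 0,"they're": 2,"there's": 5,"he's": 1},
'document2':{"it's": 4,"they're": 2,"there's": 3,"he's": 0},
'document3':{"it's": 7,"they're": 0,"there's": 4,"he's": 1}}
# Flatten the nested dictionary into [file, word, count] rows and label the columns
d = pd.DataFrame([[k,k1,v1] for k,v in d.items() for k1,v1 in v.items()], columns = ['File','Words','Count'])
# DataFrame.sort was removed from pandas; sort_values is the current equivalent.
# The index labels in the printed output depend on the dictionary's iteration order.
print(d.sort_values(['File','Count'], ascending=[True,True]))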
File Words Count
1 document1 it's 0
0 document1 he's 1
3 document1 they're 2
2 document1 there's 5
4 document2 he's 0
7 document2 they're 2
6 document2 there's 3
5 document2 it's 4
11 document3 they're 0
8 document3 he's 1
10 document3 there's 4
9 document3 it's 7
If you want the top n occurrences per file, you can sort and then use groupby() followed by head() or tail():
d = d.sort_values(['File','Count'], ascending=[True,True]).groupby('File').head(2)
File Words Count
1 document1 it's 0
0 document1 he's 1
4 document2 he's 0
7 document2 they're 2
11 document3 they're 0
8 document3 he's 1
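head(2) keeps the first two rows of each group in sorted order, so with the ascending sort above it returns the two lowest counts per file; use tail(2) to get the two highest.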
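As a minimal sketch of the top-n step, assuming a recent pandas where sort_values is available, the following sorts counts in descending order so that head(2) yields the two highest counts per file. The names rows, df and top2 are illustrative only and are not from the original post.

```python
import pandas as pd

d = {'document1': {"it's": 0, "they're": 2, "there's": 5, "he's": 1},
     'document2': {"it's": 4, "they're": 2, "there's": 3, "he's": 0},
     'document3': {"it's": 7, "they're": 0, "there's": 4, "he's": 1}}

# Flatten the nested dictionary into [file, word, count] rows.
rows = [[k, k1, v1] for k, v in d.items() for k1, v1 in v.items()]
df = pd.DataFrame(rows, columns=['File', 'Words', 'Count'])

# Files ascending, counts descending; then keep the first 2 rows of each file.
top2 = (df.sort_values(['File', 'Count'], ascending=[True, False])
          .groupby('File')
          .head(2))
print(top2)
```

Sorting descending and calling head(n) should select the same rows as sorting ascending and calling tail(n), just listed in the opposite order within each file.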
The list comprehension returns a list of lists that looks like this:
d = [['document1', "he's", 1], ['document1', "it's", 0], ['document1', "there's", 5], ['document1', "they're", 2], ['document2', "he's", 0], ['document2', "it's", 4], ['document2', "there's", 3], ['document2', "they're", 2], ['document3', "he's", 1], ['document3', "it's", 7], ['document3', "there's", 4], ['document3', "they're", 0]]
To build the dictionary correctly, you would just use something like:
d['document1']['it\'s'] = 1
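One way this could look when the counts are computed rather than typed in, sketched under the assumption that the raw input is a mapping from file name to a list of tokens. The tokens_by_file data and the contractions tuple are made up for illustration, and collections.Counter is just one convenient option, not necessarily what the question's author used.

```python
from collections import Counter

# Hypothetical input: file name -> list of tokens found in that file.
tokens_by_file = {
    'document1': ["there's", "they're", "he's", "there's", "they're"],
    'document2': ["it's", "it's", "there's"],
}
# Hypothetical list of words whose counts we want to record.
contractions = ("it's", "they're", "there's", "he's")

d = {}
for name, tokens in tokens_by_file.items():
    counts = Counter(tokens)
    # A Counter returns 0 for missing keys, so every word gets an entry.
    d[name] = {word: counts[word] for word in contractions}

print(d['document1']["they're"])  # prints 2
```

The result has the same nested shape as the dictionary at the top of this answer, so it can be flattened with the same list comprehension.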
If for some reason you have a list of tuples of (str, dict) instead, you can use this list comprehension:
[[i[0],k1,v1] for i in d for k1,v1 in i[1].items()]
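A small sketch of that variant on made-up data, assuming each tuple holds a file name followed by a word-count dictionary. The pairs list and the rows name are hypothetical.

```python
# Hypothetical input: a list of (file name, word-count dict) tuples.
pairs = [
    ('document1', {"it's": 0, "he's": 1}),
    ('document2', {"it's": 4, "he's": 0}),
]

# i[0] is the file name, i[1] is that file's word-count dictionary.
rows = [[i[0], k1, v1] for i in pairs for k1, v1 in i[1].items()]
print(rows)
```

The resulting rows have the same [file, word, count] layout as before, so they can be passed to pd.DataFrame with the same columns argument.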
That's not a dictionary by any stretch. – user2357112
You should use the Python pandas library to create the kind of data frame you showed in your post. –
Where do I start? Are there any methods I should look at? – blacksite