Counter() returns 1 for all words. How do I get the actual counts?

I have a text file and I'm trying to get its most frequently used words. I'm using Counter, but it seems to return 1 for every word.

I'm still learning, so I'm using the Simple Sabotage Field Manual as my text file.

import re 
from collections import Counter 
my_file = "fieldManual.txt" 

#### GLOBAL VARIABLES 
lst = [] # used in unique_words 
cnt = Counter() 

######### 

def clean_word(the_word): 
    #new_word = re.sub('[^a-zA-Z]', '',the_word) 
    new_word = re.sub('^[^a-zA-z]*|[^a-zA-Z]*$', '', the_word) 
    return new_word 

def unique_words(): 
    with open(my_file, encoding="utf8") as infile: 
     for line in infile: 
      words = line.split() 
      for word in words: 
       edited_word = clean_word(word) 
       if edited_word not in lst: 
        lst.append(edited_word) 
        cnt[edited_word] += 1 
    lst.sort() 
    word_count = Counter(lst) 
    return(lst) 
    return (cnt) 

unique_words() 
test = ['apple','egg','apple','banana','egg','apple'] 
print(Counter(lst)) # returns '1' for everything 
print(cnt) # same here

So print(Counter(test)) correctly returns

Counter({'apple': 3, 'egg': 2, 'banana': 1})

But when I try to print the most frequent words in my lst, it returns

Counter({'': 1, 'A': 1, 'ACTUAL': 1, 'AGREE': 1, 'AGREEMENT': 1, 'AK': 1, 'AND': 1, 'ANY': 1, 'ANYTHING': 1, 'AR': 1, 'AS-IS': 1, 'ASCII': 1, 'About': 1, 'Abstract': 1, 'Accident': 1, 'Act': 1, 'Acts': 1, 'Add': 1, 'Additional': 1, 'Adjust': 1, 'Advocate': 1, 'After': 1, 'Agricultural': 1, ...

Following an answer I found elsewhere, I tried adding cnt.update(edited_word) inside the if edited_word not in lst: block, but then printing cnt just gives me single characters:

Counter({'e': 2401, 'i': 1634, 't': 1470, "'": 1467, 'n': 1455, 'r': 1442, 'a': 1407, 'o': 1244, '1': 948, 'c': 862, 'd': 752, 'u': 651, 'p': 590, 'g': 564, 'm': 436, ...

How can I get the frequency of each unique word in my .txt file?

2017-08-08 BruceWayne

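You only append a word to the list, and only increment its counter, when it is not already in the list. As a result, every word is recorded exactly once.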
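A small sketch of this effect, using the `test` list from the question instead of the real file. Counting only inside the uniqueness check forces every count to 1, while counting every occurrence gives the real frequencies. The names `seen`, `deduped_counts` and `real_counts` are illustrative and do not appear in the original code.

```python
from collections import Counter

test = ['apple', 'egg', 'apple', 'banana', 'egg', 'apple']

# Mirrors the question's logic: a word is recorded only the first time it is seen.
seen = []
deduped_counts = Counter()
for word in test:
    if word not in seen:
        seen.append(word)
        deduped_counts[word] += 1  # runs once per distinct word

# Counting every occurrence, independent of the uniqueness check.
real_counts = Counter()
for word in test:
    real_counts[word] += 1

print(deduped_counts)              # every value is 1
print(real_counts.most_common())   # words ordered by actual frequency
```

Moving the increment out of the `if` block is the smallest change that keeps the rest of the question's loop intact.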

2017-08-08 04:39:29

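[Oh ... my ... goodness](http://gif-finder.com/wp-content/uploads/2015/02/Steve-Carell-Facepalm.gif). Thank you so much for pointing out what is now extremely obvious. That will do it! I was way overthinking this and didn't step back and talk it through out loud... – BruceWayne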

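There are a few mistakes here. You should increment the counter whether or not the word is already in the list, or simply call Counter on the list of words produced by splitting the string. You have two return statements back to back (the second one will never execute). You build a Counter from the list into word_count and then ignore that result (which would also be 1 for every word). Just cleaning up this code will probably help you pin down the problem.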
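One way the cleaned-up function might look, offered as a sketch and not as the asker's final code. It counts every cleaned word, skips empty strings, and returns both results from a single `return`. The name `count_words` and its `path` parameter are assumptions introduced here. The character class is written as `a-zA-Z` on both sides; the question's first class reads `a-zA-z`, which also covers a few punctuation characters such as `[` and `_`.

```python
import re
from collections import Counter


def clean_word(the_word):
    # Strip leading and trailing non-letters; interior characters are kept.
    return re.sub(r'^[^a-zA-Z]*|[^a-zA-Z]*$', '', the_word)


def count_words(path):
    """Return (sorted unique words, Counter of word frequencies) for a text file."""
    cnt = Counter()
    with open(path, encoding="utf8") as infile:
        for line in infile:
            cleaned = [clean_word(w) for w in line.split()]
            # update() takes an iterable of items: an iterable of words counts
            # whole words, whereas a bare string is counted character by character.
            cnt.update(w for w in cleaned if w)
    # A single return hands back both values as a tuple.
    return sorted(cnt), cnt
```

Calling `unique, counts = count_words("fieldManual.txt")` would then give the sorted unique words and their frequencies, and `counts.most_common(10)` would list the ten most frequent words.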

2017-08-08 04:43:36 theaustinseven

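Thank you very much for the suggestions. I'm learning Python (obviously, I guess) and was just trying to get this working; afterwards I'll go back and tighten it up, and may even take it to [CodeReview](https://codereview.stackexchange.com/). Also, thanks for the note about 'return', I'll put them on the same line. – BruceWayne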
