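How to read a file and compute a specific value

How can I find out how many keywords from one file also appear in another file? I have a file containing a list of words, and I am trying to figure out whether those words occur in another file.

I have a file of keywords (keywords.txt), and I am trying to find out whether another file (tweets.txt), which contains sentences, includes any of those keywords.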

def main():
    done = False
    while not done:
        try:
            keywords = input("Enter the filename titled keywords: ")
            with open(keywords, "r") as words:
                done = True
        except IOError:
            print("Error: file not found.")

total = 0
try:
    tweets = input("Enter the file Name titled tweets: ")
    with open(tweets, 'r') as tweets:
except IOError:
    print("Error: file not found.")

def sentiment_of_msg(msg_words_counter):
    summary = 0
    for line in tweets:
        if happy_dict in line:
            summary += 10 * **The number of keywords in the sentence of the file**
        elif veryUnhappy_dict in line:
            summary += 1 * quantity
        elif neutral_dict in line:
            summary += 5 * quantity
    return summary

Source

2016-11-09 HelloWorld4382

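First read the text and store it in memory. Right now you open the files but then do nothing with them. Do the calculations afterwards. – furas

Nobody wants to do your homework for you, for many reasons. Ask a specific question that addresses one part of your problem. Right now you are not even close. What do you expect to happen after "with open(tweets, 'r') as tweets:"? –

@AlexHall If you are not going to offer any suggestions or help, I'd appreciate it if you didn't comment. Thanks! – HelloWorld4382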

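I sense this is homework, so the best I can do is give you an outline of a solution.

If you can afford to load the files into memory:

Load keywords.txt, read its lines, split them into tokens, and build a set from them. You now have a data structure that supports fast membership queries (i.e. you can ask "if token in set" and get an answer in constant time).
Load the tweets file the same way you loaded the keywords and read its contents line by line (or in whatever units they are stored). You may need to do some preprocessing (strip whitespace, replace unwanted characters, remove invalid words, commas, etc.). For each line, split it to get the words of that tweet, and ask whether any of the split words is in the keyword set.

The pseudocode looks like this: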

# keywords and tweets hold the two file names
keywords_set = set()
with open(keywords) as file:
    for line in file.readlines():
        for word in line.split():
            keywords_set.add(word)

with open(tweets) as file:
    for line in file.readlines():
        line = preprocess(line)  # function with your custom logic
        for item in line.split():
            if item in keywords_set:  # look up in the set, not the file name
                do_stuff()  # function with your custom logic
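
One possible way to fill in the placeholders, as a minimal sketch rather than the definitive version: preprocess here is just an assumed cleaning rule (lowercase, drop ASCII punctuation), and load_keywords and count_keywords are hypothetical names that do not come from the question. count_keywords returns the "number of keywords in the sentence" that the question's scoring code was missing.

```python
import string

# Assumed cleaning rule: lowercase the line and drop ASCII punctuation.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)

def preprocess(line):
    return line.lower().translate(_PUNCT_TABLE)

def load_keywords(lines):
    """Build a set of lowercase keywords from an iterable of lines."""
    keywords_set = set()
    for line in lines:
        keywords_set.update(line.lower().split())
    return keywords_set

def count_keywords(line, keywords_set):
    """Return how many words in the line are keywords (repeats are counted)."""
    return sum(1 for word in preprocess(line).split() if word in keywords_set)
```

Both functions take any iterable of lines, so an open file object can be passed directly, since iterating over a file yields its lines.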

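If you need the frequency of each keyword, build a dictionary of the form {key: key_frequency}. Or take a look at Counter and think about how it could solve your problem.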
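
A minimal sketch of the Counter idea, assuming the keywords are already in a set and the tweets are available as an iterable of lines. keyword_frequencies is a made-up name for illustration; collections.Counter itself is part of the standard library.

```python
from collections import Counter

def keyword_frequencies(tweet_lines, keywords_set):
    """Count how often each keyword occurs across all tweet lines."""
    frequencies = Counter()
    for line in tweet_lines:
        # Only words that are keywords get counted.
        frequencies.update(
            word for word in line.lower().split() if word in keywords_set
        )
    return frequencies
```

A Counter returns 0 for keys it has never seen, so looking up a keyword that never occurred does not raise a KeyError.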

If you cannot load the tweets file into memory, consider a lazy solution that reads from the file using a generator.
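
A sketch of the lazy approach, assuming one tweet per line and a UTF-8 file: the generator yields one cleaned line at a time, so only the current line is held in memory. iter_tweets and count_matching_tweets are hypothetical names used only for this example.

```python
def iter_tweets(path):
    """Yield the tweets file one stripped, non-empty line at a time."""
    with open(path, encoding="utf-8") as file:
        for line in file:  # a file object is read lazily, line by line
            line = line.strip()
            if line:  # skip blank lines
                yield line

def count_matching_tweets(path, keywords_set):
    """Count the tweets that contain at least one keyword."""
    return sum(
        1
        for tweet in iter_tweets(path)
        if any(word in keywords_set for word in tweet.lower().split())
    )
```

The difference from readlines() is that readlines() builds the full list of lines up front, whereas iterating over the file object fetches lines on demand.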

Source

2016-11-10 08:15:50 themistoklik

Thank you! It is supposed to prompt the user for the file names, which is why I ask for them. I will take your suggestions into account! – HelloWorld4382
