Text analysis - unable to write the output of a Python program to a CSV or XLS file

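Hello, I am trying to do sentiment analysis with a Naive Bayes classifier in Python 2.x. It reads sentiments from a txt file and then gives a positive or negative output based on the sample txt file. I want the output in the same form as the input: for example, if I have a text file with, say, 1000 raw sentiments, I want the output to show positive or negative for each one. Please help. Below is the code I am using.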

import math 
import string 

def Naive_Bayes_Classifier(positive, negative, total_negative, total_positive, test_string):
    y_values = [0, 1]
    prob_values = [None, None]

    for y_value in y_values:
        posterior_prob = 1.0

        for word in test_string.split():
            word = word.lower().translate(None, string.punctuation).strip()
            if y_value == 0:
                if word not in negative:
                    posterior_prob *= 0.0
                else:
                    posterior_prob *= negative[word]
            else:
                if word not in positive:
                    posterior_prob *= 0.0
                else:
                    posterior_prob *= positive[word]

        if y_value == 0:
            prob_values[y_value] = posterior_prob * float(total_negative) / (total_negative + total_positive)
        else:
            prob_values[y_value] = posterior_prob * float(total_positive) / (total_negative + total_positive)

    total_prob_values = 0
    for i in prob_values:
        total_prob_values += i

    for i in range(0, len(prob_values)):
        prob_values[i] = float(prob_values[i]) / total_prob_values

    print prob_values

    if prob_values[0] > prob_values[1]:
        return 0
    else:
        return 1


if __name__ == '__main__':
    sentiment = open(r'C:/Users/documents/sample.txt')

    # Preprocessing of training set
    vocabulary = {}
    positive = {}
    negative = {}
    training_set = []
    TOTAL_WORDS = 0
    total_negative = 0
    total_positive = 0

    for line in sentiment:
        words = line.split()
        y = words[-1].strip()
        y = int(y)

        if y == 0:
            total_negative += 1
        else:
            total_positive += 1

        for word in words:
            word = word.lower().translate(None, string.punctuation).strip()
            if word not in vocabulary and word.isdigit() is False:
                vocabulary[word] = 1
                TOTAL_WORDS += 1
            elif word in vocabulary:
                vocabulary[word] += 1
                TOTAL_WORDS += 1

            # Training
            if y == 0:
                if word not in negative:
                    negative[word] = 1
                else:
                    negative[word] += 1
            else:
                if word not in positive:
                    positive[word] = 1
                else:
                    positive[word] += 1

    for word in vocabulary.keys():
        vocabulary[word] = float(vocabulary[word]) / TOTAL_WORDS

    for word in positive.keys():
        positive[word] = float(positive[word]) / total_positive

    for word in negative.keys():
        negative[word] = float(negative[word]) / total_negative

    test_string = raw_input("Enter the review: \n")

    classifier = Naive_Bayes_Classifier(positive, negative, total_negative, total_positive, test_string)
    if classifier == 0:
        print "Negative review"
    else:
        print "Positive review"


2017-05-04 hitesh

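Hi, as far as I understand, you want as output a CSV/xls file containing the sentences the user supplies as input. For each sentence, you want the corresponding sentiment (positive or negative) computed by the classifier. Is that right? Can you provide an example of the CSV/xls file you want? Thanks – Giordano

I will paste the contents of the csv file below: – hitesh

A good product - your work is interesting! Have enjoyed a good experience using it for years. Good product, good results. I don't use it any more. It has always been a stable product. Overall a very good product compared with the rest. The product works fine, but others tell me some other products are superior. Robust. Slow. Best of all. Unable to install. User friendly. Very bad. Hard to understand logs, and cumbersome to set up and deploy properly. Below is – hitesh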

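I have looked at the code in the GitHub repository that you posted in the comments. I tried to run the project, but I got some errors.

Anyway, I went through the project structure and the files used to train the Naive Bayes algorithm, and I think the following snippet can be used to write the result data to an Excel file (i.e. .xls):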

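import xlsxwriter

# Classify every line of the input file and collect (sentence, label) pairs.
data_to_be_written = []
with open("test11.txt") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        classifier = Naive_Bayes_Classifier(positive, negative, total_negative, total_positive, line)
        # The classifier returns 0 for a negative review and 1 for a positive one.
        result = 'Negative' if classifier == 0 else 'Positive'
        data_to_be_written.append((line, result))

# Create a workbook and add a worksheet.
# XlsxWriter produces the XLSX format, so the file uses the .xlsx extension.
workbook = xlsxwriter.Workbook('test.xlsx')
worksheet = workbook.add_worksheet()

# Start from the first cell. Rows and columns are zero indexed.
row = 0
col = 0

# Iterate over the data and write it out row by row.
for item, result in data_to_be_written:
    worksheet.write(row, col, item)
    worksheet.write(row, col + 1, result)
    row += 1

workbook.close()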

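In short, for each line of the file containing the sentences to test, I call the classifier and build up the structure that will be written to the file.
Then I loop over that structure and write it to the xls file.
For this I used a Python site-package called xlsxwriter.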
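
Since the question also mentions CSV, the same per-line idea can be sketched with the standard-library csv module. This is only a minimal Python 3 sketch and has not been run against the project: `write_predictions_csv`, `toy_classify` and the file name `predictions.csv` are hypothetical names made up for illustration, and `toy_classify` merely stands in for `Naive_Bayes_Classifier`.

```python
import csv


def write_predictions_csv(sentences, classify, out_path):
    """Write one (sentence, sentiment) row per non-empty input sentence.

    `classify` is assumed to be a callable that returns 0 for a negative
    sentence and 1 for a positive one, like the classifier in the question.
    """
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["sentence", "sentiment"])  # header row
        for sentence in sentences:
            sentence = sentence.strip()
            if not sentence:
                continue  # skip blank lines
            label = "Negative" if classify(sentence) == 0 else "Positive"
            writer.writerow([sentence, label])


def toy_classify(sentence):
    # Hypothetical stand-in for the real classifier, for illustration only.
    return 0 if "bad" in sentence.lower() else 1


if __name__ == "__main__":
    reviews = ["A good product", "Very bad", "", "User friendly"]
    write_predictions_csv(reviews, toy_classify, "predictions.csv")
```

To use it with the real classifier, one could pass an open file as `sentences` and a small wrapper around `Naive_Bayes_Classifier` as `classify`. The csv module takes care of quoting sentences that contain commas.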

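As I told you before, I had some problems running the project, so this code has not been tested either. In any case, let me know if you run into trouble.

Regards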


2017-07-01 09:23:40 Giordano

@Giordano - Thank you. I tried running it, but got some errors. – hitesh

Changed the code to the below - – hitesh

What kind of errors? Can you post them? – Giordano

> with open("test11.txt") as f:
>     for line in f:
>         classifier = Naive_Bayes_Classifier(positive, negative, total_negative, total_positive, line)
>         if classifier == 0:
>             f.write(line + 'Negative')
>         else:
>             f.write(line + 'Positive')
>
> #  result = 'Positive' if classifier == 0 else 'Negative'
> #  data_to_be_written += ([line, result],)
>
> # Create a workbook and add a worksheet.
> workbook = xlsxwriter.Workbook('test.xls')
> worksheet = workbook.add_worksheet()
>
> # Start from the first cell. Rows and columns are zero indexed.
> row = 0
> col = 0
>
> # Iterate over the data and write it out row by row.
> for item, cost in f:
>     worksheet.write(row, col, item)
>     worksheet.write(row, col + 1, cost)
>     row += 1
>
> workbook.close()


2017-07-07 06:20:50 hitesh

Still getting a zero error :( – hitesh
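
The thread does not say what the "zero error" is. If it is a ZeroDivisionError (an assumption), one plausible cause is that the classifier in the question multiplies by 0.0 for every word it has not seen in a class, so both class scores can end up at zero before they are normalized. A common way around that is add-alpha (Laplace) smoothing with log probabilities. The following is a hedged Python 3 sketch of that technique, not the original author's method: `classify_with_smoothing` and `alpha` are hypothetical names, and the function expects raw word counts per class, that is, the `negative` and `positive` dictionaries before they are divided by the class totals.

```python
import math
import string


def classify_with_smoothing(neg_counts, pos_counts, total_negative,
                            total_positive, text, alpha=1.0):
    """Return 0 (negative) or 1 (positive) using add-alpha smoothing.

    neg_counts / pos_counts map a word to its raw count in that class.
    Scores are summed in log space, so no score is ever exactly zero
    and no normalizing division is needed.
    """
    vocabulary = set(neg_counts) | set(pos_counts)
    total_docs = total_negative + total_positive
    scores = []
    for counts, class_docs in ((neg_counts, total_negative),
                               (pos_counts, total_positive)):
        # One extra vocabulary slot is reserved for unseen words.
        denominator = sum(counts.values()) + alpha * (len(vocabulary) + 1)
        # Smoothed log prior, defined even when a class has no examples.
        score = math.log((class_docs + alpha) / (total_docs + 2 * alpha))
        for word in text.lower().split():
            word = word.strip(string.punctuation)
            if not word:
                continue
            score += math.log((counts.get(word, 0) + alpha) / denominator)
        scores.append(score)
    # Same convention as the question: 0 only if negative scores strictly higher.
    return 0 if scores[0] > scores[1] else 1


if __name__ == "__main__":
    negative_counts = {"bad": 3, "slow": 2}
    positive_counts = {"good": 4, "stable": 1}
    print(classify_with_smoothing(negative_counts, positive_counts, 2, 2,
                                  "Very good product!"))
```

Because every word contributes a small non-zero probability, a sentence made up entirely of unseen words still gets a finite score for both classes instead of triggering a division by zero.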
