Python for loop using a file instead of a dictionary

I am using my own file instead of a Python dictionary, but when I apply a for loop to that file I get this error:

TypeError: string indices must be integers, not str

My code is given below, where "sai.json" is the file containing the dictionary.

import json 
from naiveBayesClassifier import tokenizer 
from naiveBayesClassifier.trainer import Trainer 
from naiveBayesClassifier.classifier import Classifier 

nTrainer = Trainer(tokenizer) 

ofile = open("sai.json","r") 

dataset=ofile.read() 
print dataset 

for n in dataset: 
    nTrainer.train(n['text'], n['category']) 

nClassifier = Classifier(nTrainer.data, tokenizer) 

unknownInstance = "Even if I eat too much, is not it possible to lose some weight" 

classification = nClassifier.classify(unknownInstance) 
print classification

2015-11-07 Neha

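'n' is a string, not a dictionary. Please do some research on how to parse JSON. Use the 'json' module. – Pynchia

You are importing the json module, but you aren't using it!

You can use json.load to load JSON data from an open file into a Python dict, or you can read the file into a string and then use json.loads to load that data into a dict.

For example,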

ofile = open("sai.json","r") 
data = json.load(ofile) 
ofile.close()

Or even better

with open("sai.json", "r") as ifile: 
    data = json.load(ifile)

Alternatively, using json.loads:

with open("sai.json", "r") as ifile: 
    dataset = ifile.read() 
data = json.loads(dataset)

Then you can access the contents of data with data['text'] and data['category'], assuming the dictionary has those keys.

You get an error because dataset is a string, so

for n in dataset: 
    nTrainer.train(n['text'], n['category'])

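loops over the string character by character, setting n to each character in turn as a one-element string. Strings can only be indexed by integers, not by other strings. There isn't much point in indexing into a one-element string anyway, because if s is a one-element string then s[0] has the same content as s.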
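
A minimal sketch of what that loop actually sees, using a made-up JSON string as a stand-in for the file contents (the sample text and the variable names are illustrative, not taken from the question):

```python
# Stand-in for the string returned by ofile.read(); the content is hypothetical.
raw = '[{"text": "hi", "category": "NO"}]'

# Iterating over a str yields one-character strings, not dicts.
chars = [n for n in raw]
print(chars[:3])

# Indexing a one-character string with 0 gives the same string back.
first = chars[0]
print(first[0] == first)

# Indexing a string with a string key is what raises the TypeError.
caught = None
try:
    first['text']
except TypeError as err:
    caught = err
    print("TypeError: %s" % err)
```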

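Here is the data you put in the comment. I had assumed your data was a list wrapped in a dictionary, but it's fine to have a plain list as a JSON object. I used print json.dumps(dataset, indent=4) to format it. Note that there is no comma after the last } in the file: a trailing comma there is fine in Python, but it is an error in JSON.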
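
To illustrate the trailing-comma point, here is a small sketch with two hypothetical one-record strings. It relies only on json.loads raising a ValueError (or a subclass of it) for invalid input:

```python
import json

# Two hypothetical inputs: identical except for a comma after the last }.
valid = '[{"category": "NO", "text": "hello everyone"}]'
trailing_comma = '[{"category": "NO", "text": "hello everyone"},]'

print(json.loads(valid))

# The json module rejects the trailing comma.
try:
    json.loads(trailing_comma)
    rejected = False
except ValueError as err:
    rejected = True
    print("invalid JSON: %s" % err)

# The same literal is accepted by Python itself.
py_list = [{"category": "NO", "text": "hello everyone"},]
print(len(py_list))
```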

sai.json

[ 
    { 
     "category": "NO", 
     "text": "hello everyone" 
    }, 
    { 
     "category": "YES", 
     "text": "dont use words like jerk" 
    }, 
    { 
     "category": "NO", 
     "text": "what the hell." 
    }, 
    { 
     "category": "yes", 
     "text": "you jerk" 
    } 
]

Now if we read that in with json.load, your code should work correctly. But here is a simple demo that just prints the contents:

with open("sai.json", "r") as f: 
    dataset = json.load(f) 

for n in dataset: 
    print "text: '%s', category: '%s'" % (n['text'], n['category'])

Output

text: 'hello everyone', category: 'NO' 
text: 'dont use words like jerk', category: 'YES' 
text: 'what the hell.', category: 'NO' 
text: 'you jerk', category: 'yes'
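
If it is unclear whether the file holds a plain list or a list wrapped in a dictionary, a small helper can normalize both before looping. This is only a sketch: iter_records and the "items" wrapper key are hypothetical names, and the wrapped layout is an assumption about how such a file might look.

```python
import json

def iter_records(data):
    """Yield (text, category) pairs from a list of dicts, or from a dict
    that wraps exactly one such list (the wrapper layout is an assumption)."""
    if isinstance(data, dict):
        # Assumed wrapper layout, e.g. {"items": [{...}, {...}]}
        lists = [v for v in data.values() if isinstance(v, list)]
        if len(lists) != 1:
            raise ValueError("expected exactly one list inside the dict")
        data = lists[0]
    for record in data:
        yield record['text'], record['category']

# Hypothetical inputs in the two layouts.
plain = json.loads('[{"category": "NO", "text": "hello everyone"}]')
wrapped = json.loads('{"items": [{"category": "YES", "text": "you jerk"}]}')

for text, category in iter_records(plain):
    print("text: '%s', category: '%s'" % (text, category))
```

In the question's loop, each pair could then be passed on as nTrainer.train(text, category).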

2015-11-07 07:18:18

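I am still getting the error "TypeError: string indices must be integers, not str". How do I change n from a string to an integer in this loop? – Neha

I am getting the error in this for loop: for n in dataset: nTrainer.train(n['text'], n['category']) – Neha

@Neha: That code looks exactly like the code in your question. But maybe you misunderstood what I said earlier. That code is **not correct**. Have you tried one of the three ways I showed of reading the "sai.json" file into a Python dictionary? If the file doesn't contain valid JSON data, the 'json' module will raise an error when it tries to load it. Once you have it loaded, the correct way to access the data fields you need depends on the structure of the JSON data. Maybe you should post a sample of your data in the question. –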
