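Unicode written correctly, read incorrectly
I have a .tsv file of movie titles and movie data that I am analyzing with the pydot package. The file is linked here. The file containing the JSON used to create it is linked here.
The file was written from parsed JSON using UTF-8 encoding. Although the contents are written to the file correctly, when I read the file back into Python the interpreter always seems to stop partway through the last line shown here: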
'Taken\t["Liam Neeson", " Maggie Grace", " Jon Gries", " David Warshofsky"]\n'
'The Walking Dead\t["Andrew Lincoln", " Steven Yeun", " Chandler Riggs",'
The output should look like this, which is how it is written in the file:
Taken ["Liam Neeson", " Maggie Grace", " Jon Gries", " David Warshofsky"]
The Walking Dead ["Andrew Lincoln", " Steven Yeun", " Chandler Riggs", " Norman Reedus"]
Toy Story 3 ["Tom Hanks", " Tim Allen", " Joan Cusack", " Ned Beatty"]
Here is the code used to create the text file:
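import codecs
import json

# Input: one JSON object per line; output: UTF-8 tab-separated text.
step3v2 = open('step3.txt', 'rU')
step4 = codecs.open('step4.txt', mode='w', encoding='utf-8')
data = []
merged = ''
for line in step3v2:
    data.append(json.loads(line))
for row in data:
    moviename = row[u'Title']
    # Split the comma-separated actor string into a list.
    row[u'Actors'] = row[u'Actors'].split(',')
    actors = json.dumps(row[u'Actors']) + '\r\n'
    merged += moviename + '\t'
    merged += actors
# Write all accumulated rows in one call.
step4.write(merged)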
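For comparison, here is a minimal sketch of the same write step that closes the output file before anything reads it back. It rests on an assumption that is not confirmed anywhere in this thread: a file handle that is still open may hold buffered data, which could make a later read look truncated. The function name `write_tsv`, the sample records, and the temporary output path are all hypothetical.

```python
import io
import json
import os
import tempfile


def write_tsv(json_lines, out_path):
    """Write one 'Title<TAB>[actors as JSON]' row per input JSON line."""
    # The with-block closes (and therefore flushes) the file on exit.
    # newline='' keeps the explicit '\r\n' terminators unchanged.
    with io.open(out_path, mode='w', encoding='utf-8', newline='') as out:
        for line in json_lines:
            row = json.loads(line)
            actors = row[u'Actors'].split(',')
            out.write(row[u'Title'] + u'\t' + json.dumps(actors) + u'\r\n')


# Hypothetical sample input, shaped like the records in the question.
sample = [
    u'{"Title": "Taken", "Actors": "Liam Neeson, Maggie Grace"}',
    u'{"Title": "Toy Story 3", "Actors": "Tom Hanks, Tim Allen"}',
]
out_path = os.path.join(tempfile.mkdtemp(), 'step4.txt')
write_tsv(sample, out_path)
```

Writing row by row also avoids building the whole output in one string, though that is a design preference rather than a fix.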
Here is the code that reads the file:
import pydot

graph = pydot.Dot(graph_type='graph', charset='utf8')
step4v2 = open('step4.txt', 'rU')
textfile = step4v2.readlines()
for line in textfile:
    print repr(line)
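A minimal sketch of reading the file back with an explicit UTF-8 decoder and parsing each row may help when checking what was actually stored. The helper name `read_tsv` and the sample file it builds are hypothetical, and it assumes each row is a title, a tab, and a JSON list, as in the expected output above.

```python
import io
import json
import os
import tempfile


def read_tsv(path):
    """Return a list of (title, actors) tuples from a UTF-8 TSV file."""
    rows = []
    with io.open(path, mode='r', encoding='utf-8') as handle:
        for line in handle:
            line = line.rstrip(u'\r\n')
            if not line:
                continue  # skip blank lines
            # Split only on the first tab; the rest is the JSON list.
            title, actors_json = line.split(u'\t', 1)
            rows.append((title, json.loads(actors_json)))
    return rows


# Build a tiny sample file so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'step4.txt')
with io.open(path, mode='w', encoding='utf-8', newline='') as handle:
    handle.write(u'Taken\t["Liam Neeson", " Maggie Grace"]\r\n')
    handle.write(u'Toy Story 3\t["Tom Hanks", " Tim Allen"]\r\n')

rows = read_tsv(path)
```

If a row were cut off mid-list, `json.loads` would raise an error here, which makes a truncated file easier to spot than inspecting `repr` output by eye.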
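What does "the interpreter seems to stop arbitrarily at the following line" mean? Is there an error? Or does it just hang, or something else? –
There is no error. The interpreter simply does not read any more of the string. I will edit the question to make this clearer. – Mike
Does this happen only sometimes, or always? –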