Unicode error in Python when extracting data from an XML file

import os, csv, io

from xml.etree import ElementTree

file_name = "example.xml"
full_file = os.path.abspath(os.path.join("xml", file_name))
dom = ElementTree.parse(full_file)
Fruit = dom.findall("Fruit")

with io.open('test.csv', 'w', encoding='utf8') as fp:
    a = csv.writer(fp, delimiter=',')
    for f in Fruit:
        Explanation = f.findtext("Explanation")
        Types = f.findall("Type")
        for t in Types:
            Type = t.text
            a.writerow([Type, Explanation])

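I am extracting data from an XML file and writing it to a CSV file. I get the error message shown below, probably because the extracted data contains a degree Fahrenheit symbol. How can I get rid of these Unicode errors without fixing the XML file by hand?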

On the last line of my code I get this error message: UnicodeEncodeError: 'ascii' codec can't encode character u'\xb0' in position 1267: ordinal not in range(128)
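The same failure can be reproduced without the XML file. This is only a minimal sketch, assuming that what goes wrong on the last line is an implicit conversion of the unicode text to bytes with the ASCII codec; the variable names are illustrative and not from the original code.

```python
# -*- coding: utf-8 -*-
# The degree sign is U+00B0, which lies outside the ASCII range (0-127).
text = u"32 \xb0F"

try:
    # Assumption: this mirrors the implicit ASCII conversion that fails in writerow().
    text.encode("ascii")
    message = None
except UnicodeEncodeError as exc:
    message = str(exc)

# Encoding explicitly with UTF-8 succeeds, because UTF-8 can represent U+00B0.
utf8_bytes = text.encode("utf-8")

print(message)
```

The position in the message is the index of the offending character within the string, which is why the original error points at position 1267 of a much longer value.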

<Fruits> 
<Fruit> 
    <Family>Citrus</Family> 
    <Explanation>They cannot grow at a temperature below 32 °F</Explanation> 
    <Type>Orange</Type> 
    <Type>Lemon</Type> 
    <Type>Lime</Type> 
    <Type>Grapefruit</Type> 
</Fruit> 
</Fruits>

2016-05-23 Alexander

Are you using Python 2 or Python 3? –

Can you provide a one-line sample XML file that demonstrates the problem? –

I am using Python 2.7. I have included a sample XML. – Alexander

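You did not say where the error occurs; it is probably the last line. The csv module in Python 2 does not support unicode input, so you have to encode the strings yourself: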

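# Python 2: open in binary mode, since the csv module writes byte strings
with open('test.csv', 'wb') as fp:
    a = csv.writer(fp, delimiter=',')
    for f in Fruit:
        explanation = f.findtext("Explanation")
        types = f.findall("Type")
        for t in types:
            # Encode each unicode value to UTF-8 bytes before passing it to csv
            a.writerow([t.text.encode('utf8'), explanation.encode('utf8')])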
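For reference, here is a self-contained sketch of the same idea that can be run as is. It embeds the sample XML from the question as a string instead of reading example.xml, and the names SAMPLE_XML, extract_rows and write_rows are hypothetical helpers introduced only for illustration. The Python 3 branch is an addition that is not part of the answer above; it assumes that on Python 3 the csv module accepts text directly, so no manual encoding is needed there.

```python
# -*- coding: utf-8 -*-
import csv
import io
import sys
from xml.etree import ElementTree

# Sample data from the question, embedded so the sketch is self-contained.
SAMPLE_XML = u"""<Fruits>
<Fruit>
    <Family>Citrus</Family>
    <Explanation>They cannot grow at a temperature below 32 \u00b0F</Explanation>
    <Type>Orange</Type>
    <Type>Lemon</Type>
    <Type>Lime</Type>
    <Type>Grapefruit</Type>
</Fruit>
</Fruits>"""


def extract_rows(xml_text):
    """Return one [type, explanation] row per <Type> element."""
    root = ElementTree.fromstring(xml_text.encode("utf-8"))
    rows = []
    for fruit in root.findall("Fruit"):
        explanation = fruit.findtext("Explanation")
        for t in fruit.findall("Type"):
            rows.append([t.text, explanation])
    return rows


def write_rows(path, rows):
    """Write the rows as UTF-8 encoded CSV on Python 2 or Python 3."""
    if sys.version_info[0] == 2:
        # Python 2: csv works on byte strings, so encode every cell first.
        with open(path, "wb") as fp:
            writer = csv.writer(fp, delimiter=",")
            for row in rows:
                writer.writerow([cell.encode("utf-8") for cell in row])
    else:
        # Python 3: csv works on text; the file object does the encoding.
        with io.open(path, "w", encoding="utf-8", newline="") as fp:
            csv.writer(fp, delimiter=",").writerows(rows)
```

Calling write_rows('test.csv', extract_rows(SAMPLE_XML)) writes four rows, one per Type element, each followed by the shared explanation text.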

2016-05-23 19:24:25 Daniel

'\xb0'.decode('latin1') == u'°', so it is probably 'latin1'. 'utf8' would raise an error. –

u'\xb0' is already a unicode string. The error occurs while **encoding**, not **decoding**. – Daniel

Yes, it is on the last line. – Alexander
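The encode/decode distinction raised in the comments above can be checked directly. The following is a small illustrative sketch, not taken from the thread; the variable names are made up for the example.

```python
# -*- coding: utf-8 -*-
# Decoding turns bytes into text; encoding turns text into bytes.
raw = b"\xb0"

# As Latin-1, the single byte 0xB0 decodes to the degree sign U+00B0.
as_latin1 = raw.decode("latin1")

# As UTF-8, a lone 0xB0 byte is invalid, so decoding fails.
try:
    raw.decode("utf8")
    utf8_decode_failed = False
except UnicodeDecodeError:
    utf8_decode_failed = True

# Going the other way, encoding the text u'\xb0' as UTF-8 works
# and yields two bytes.
encoded = u"\xb0".encode("utf8")
```

Since ElementTree has already produced unicode text by the time writerow() is called, only the encoding step matters for the error in the question.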
