UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2

I am creating an XML file in Python, and one field in my XML holds the contents of a text file. I do it like this:

f = open ('myText.txt',"r") 
data = f.read() 
f.close() 

root = ET.Element("add") 
doc = ET.SubElement(root, "doc") 

field = ET.SubElement(doc, "field") 
field.set("name", "text") 
field.text = data 

tree = ET.ElementTree(root) 
tree.write("output.xml")

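Then I get the UnicodeDecodeError. I have already tried putting the special comment # -*- coding: utf-8 -*- at the top of my script, but I still get the error. I also tried encoding my variable with data.encode('utf-8'), and the error is still there. I know this question is very common, but none of the solutions I found in other questions worked for me.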

UPDATE

Traceback: using only the special comment on the first line of the script

Traceback (most recent call last): 
    File "D:\Python\lse\createxml.py", line 151, in <module> 
    tree.write("D:\\python\\lse\\xmls\\" + items[ctr][0] + ".xml") 
    File "C:\Python27\lib\xml\etree\ElementTree.py", line 820, in write 
    serialize(write, self._root, encoding, qnames, namespaces) 
    File "C:\Python27\lib\xml\etree\ElementTree.py", line 939, in _serialize_xml 
    _serialize_xml(write, e, encoding, qnames, None) 
    File "C:\Python27\lib\xml\etree\ElementTree.py", line 939, in _serialize_xml 
    _serialize_xml(write, e, encoding, qnames, None) 
    File "C:\Python27\lib\xml\etree\ElementTree.py", line 937, in _serialize_xml 
    write(_escape_cdata(text, encoding)) 
    File "C:\Python27\lib\xml\etree\ElementTree.py", line 1073, in _escape_cdata 
    return text.encode(encoding, "xmlcharrefreplace") 
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 243: ordinal not in range(128)

Traceback: using .encode('utf-8')

Traceback (most recent call last): 
    File "D:\Python\lse\createxml.py", line 148, in <module> 
    field.text = data.encode('utf-8') 
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 227: ordinal not in range(128)

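I used .decode('utf-8') and the error message no longer appeared; it created my XML file successfully. The problem is that the XML is not viewable in my browser.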

2013-05-12 kagat-kagat

It would be useful to see the whole error message to find out where it comes from. Also try 'decode' instead of 'encode'. – 2013-05-12 14:49:30

Updated. When I use 'decode', it creates my XML successfully, but the file is not viewable in my browser. – 2013-05-12 15:00:36

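Note that '# -*- coding: utf-8 -*-' only lets you put non-ASCII characters in the Python source code. It does not affect how strings are encoded or decoded in any way. Also, if the file 'myText.txt' is not ASCII, you should use 'codecs.open' and supply the correct encoding: 'codecs.open('myText.txt', 'r', 'utf-8')'. – Bakuriu 2013-05-12 15:17:55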

You need to decode the data from the input string into unicode before using it, to avoid encoding problems.

field.text = data.decode("utf8")
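
A minimal sketch of what the decode step changes, assuming the file really is UTF-8 encoded. The sample bytes and the names `raw`, `text` and `ascii_ok` are made up for illustration. Byte 0xc2 only makes sense as the start of a multi-byte UTF-8 sequence, so the ASCII codec rejects it while the UTF-8 codec accepts it:

```python
# Hypothetical sample: the UTF-8 bytes for the copyright sign (0xc2 0xa9),
# as they would arrive from a file read in binary mode.
raw = b"Copyright \xc2\xa9"

# Decoding with the matching codec yields a text (unicode) string.
text = raw.decode("utf-8")

# The ASCII codec cannot handle byte 0xc2, which is the error in the question.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```

If the file turns out to use a different encoding, the codec name passed to decode() has to change accordingly.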

2013-05-12 15:33:17 uhbif19

I ran into a similar error with pywikipediabot. The .decode method is a step in the right direction, but for me it did not work without adding 'ignore':

fix_encoding = lambda s: s.decode('utf8', 'ignore')

2013-12-25 03:32:48 guaka

Note that ignoring encoding errors can cause data loss or produce incorrect output. – tripleee 2015-02-01 06:55:15
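
To illustrate the data-loss point, here is a small sketch comparing the error handlers of decode(). The sample bytes are an assumption: "café" encoded as Latin-1, so the final byte 0xe9 is not valid UTF-8.

```python
# Hypothetical sample: Latin-1 bytes that are not valid UTF-8.
raw = b"caf\xe9"

# 'ignore' silently drops the byte that cannot be decoded.
ignored = raw.decode("utf-8", "ignore")

# 'replace' keeps a visible U+FFFD marker where data was lost.
replaced = raw.decode("utf-8", "replace")

# The default handler ('strict') raises, which at least surfaces the problem.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
```

With 'ignore' the last character simply disappears, so it is probably safer to find the file's real encoding than to suppress the error.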

Python 2

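The error occurs because ElementTree did not expect to find non-ASCII byte strings set in the XML when it tries to write it out. You should use Unicode strings for non-ASCII text instead. A Unicode string can be made either by using the u prefix on a string literal (i.e. u'€') or by decoding a byte string with the appropriate encoding, as in mystr.decode('utf-8').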

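The best practice is to decode all text data as it is read, rather than decoding it mid-program. The io module provides an open() function which decodes text data to Unicode strings as it is read.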

ElementTree will be much happier with Unicode strings and will encode them properly when you write the tree out with its write() method.

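Also, for best compatibility and readability, make sure that ElementTree encodes to UTF-8 during write() and adds the relevant header (the XML declaration).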

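Presuming your input file is UTF-8 encoded (0xC2 is a common UTF-8 lead byte), putting everything together, and using the with statement, your code should look like this: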

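import io
import xml.etree.ElementTree as ET

# io.open decodes the file to Unicode text while reading it
with io.open('myText.txt', "r", encoding='utf-8') as f:
    data = f.read()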

root = ET.Element("add") 
doc = ET.SubElement(root, "doc") 

field = ET.SubElement(doc, "field") 
field.set("name", "text") 
field.text = data 

tree = ET.ElementTree(root) 
tree.write("output.xml", encoding='utf-8', xml_declaration=True)

Output:

<?xml version='1.0' encoding='utf-8'?> 
<add><doc><field name="text">data€</field></doc></add>
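
One way to check that the text survives the write is to serialize the tree and parse it back. This is only a sketch: it writes to an in-memory buffer instead of output.xml, and the sample string and the names `buf`, `xml_bytes` and `round_tripped` are assumptions made for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the decoded contents of myText.txt.
data = u"data\u20ac"

root = ET.Element("add")
doc = ET.SubElement(root, "doc")
field = ET.SubElement(doc, "field")
field.set("name", "text")
field.text = data

# Serialize to an in-memory buffer instead of a file on disk.
buf = io.BytesIO()
ET.ElementTree(root).write(buf, encoding="utf-8", xml_declaration=True)
xml_bytes = buf.getvalue()

# Parse the serialized bytes back and read the field text.
parsed = ET.fromstring(xml_bytes)
round_tripped = parsed.find("doc/field").text
```

If `round_tripped` equals `data`, the encoding step worked, and any remaining display problem in the browser is likely to lie elsewhere.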

2016-05-07 12:03:29

#!/usr/bin/python

# encoding=utf8

Try putting this at the start of your Python file.

2016-11-21 09:24:37
