Parsing a large XML file with Python - etree.parse error
I'm trying to parse the following XML file using the lxml.etree.iterparse function.
"sampleoutput.xml"
<item>
<title>Item 1</title>
<desc>Description 1</desc>
</item>
<item>
<title>Item 2</title>
<desc>Description 2</desc>
</item>
I tried the code from "Parsing Large XML file with Python lxml and Iterparse".
Before the etree.iterparse(MYFILE) call I did MYFILE = open("/Users/eric/Desktop/wikipedia_map/sampleoutput.xml", "r")
But it turns up the following error:
Traceback (most recent call last):
File "/Users/eric/Documents/Programming/Eclipse_Workspace/wikipedia_mapper/testscraper.py", line 6, in <module>
for event, elem in context :
File "iterparse.pxi", line 491, in lxml.etree.iterparse.__next__ (src/lxml/lxml.etree.c:98565)
File "iterparse.pxi", line 543, in lxml.etree.iterparse._read_more_events (src/lxml/lxml.etree.c:99086)
File "parser.pxi", line 590, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:74712)
lxml.etree.XMLSyntaxError: Extra content at the end of the document, line 5, column 1
Any ideas? Thanks!
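Could it be that your XML file is not well-formed? It contains neither an '<?xml' declaration nor a root element. – C0deH4cker 2012-07-09 04:33:36
I don't know lxml, but your example is not valid XML. An XML document must have exactly one root element. Yours has none. – 2012-07-09 04:35:06
You need a root element, not just child nodes. – pinkdawn 2012-07-09 05:39:11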
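To illustrate what the comments describe: the parser treats the first <item> as the whole document, so the second <item> starting on line 5 is reported as "Extra content at the end of the document". Below is a minimal sketch of one possible workaround, not necessarily what the asker ended up doing. It wraps the fragment in a single root element and then runs iterparse over it. The root tag name `items` and the helpers `wrap_in_root` and `iter_items` are made up for this example, and the fallback import to the standard library is only there so the sketch runs where lxml is not installed.

```python
import io

try:
    from lxml import etree  # the library used in the question
except ImportError:
    # Assumption: fall back to the stdlib parser, which offers a similar iterparse.
    import xml.etree.ElementTree as etree

# The fragment from the question: two sibling <item> elements and no root.
FRAGMENT = b"""<item>
<title>Item 1</title>
<desc>Description 1</desc>
</item>
<item>
<title>Item 2</title>
<desc>Description 2</desc>
</item>
"""


def wrap_in_root(fragment, root_tag=b"items"):
    """Enclose the fragment in one root element (the tag name is hypothetical)."""
    return b"<" + root_tag + b">\n" + fragment + b"</" + root_tag + b">\n"


def iter_items(source):
    """Yield (title, desc) for every <item>, clearing each element afterwards."""
    for event, elem in etree.iterparse(source, events=("end",)):
        if elem.tag == "item":
            yield elem.findtext("title"), elem.findtext("desc")
            elem.clear()  # release the children of the processed element


items = list(iter_items(io.BytesIO(wrap_in_root(FRAGMENT))))
print(items)  # [('Item 1', 'Description 1'), ('Item 2', 'Description 2')]
```

For a file that is really large, building the wrapped bytes in memory like this would defeat the purpose of iterparse. Fixing the program that writes sampleoutput.xml so that it emits a root element in the first place is probably the cleaner route.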