2014-12-19 56 views
-3

Read .txt and output .xml in Python 3.2

I would like to know how to read a .txt file and write the output in .xml format.

My input file looks like this:

 Paper 1/White Spaces are included 
Single Correct Answer Type 

1. Text of question 1 
    a) Option 1.a b) Option 1.b 
    c) Option 1.c d) Option 1.d 

2. Text of question 2 
    a) This is an example of Option 2.a 
    b) Option 2.b has a special char α 
    c) Option 2.c 
    d) Option 2.d 

3. Text of question 3 
    a) Option 3.a can span multiple 
    lines. 
    b) Option 3b 
    c) Option 3c 
    d) Option 3d 

My code:

from lxml import etree 
import csv 

root = etree.Element('data') 
#f = open('input1.txt','rb') 
rdr = csv.reader(open("input1.txt",newline='\n')) 
header = next(rdr) 
for row in rdr: 
    eg = etree.SubElement(root, 'eg') 
    for h, v in zip(header, row): 
     etree.SubElement(eg, h).text = v 

f = open(r"C:\temp\input1.xml", "w") 
f.write(etree.tostring(root)) 
f.close() 

I get an error like this:

Traceback (most recent call last): 
    File "E:\python3.2\input1.py", line 11, in <module> 
    etree.SubElement(eg, h).text = v 
    File "lxml.etree.pyx", line 2995, in lxml.etree.SubElement (src\lxml\lxml.etree.c:69677) 
    File "apihelpers.pxi", line 188, in lxml.etree._makeSubElement (src\lxml\lxml.etree.c:15691) 
    File "apihelpers.pxi", line 1571, in lxml.etree._tagValidOrRaise (src\lxml\lxml.etree.c:29249) 
ValueError: Invalid tag name ' Paper 1' 

I also want it to take the white space into account. I am using Python 3.2. Any suggestions?

+0

Possible duplicate of Python text file to XML (http://stackoverflow.com/questions/18739501/python-text-file-to-xml) – 2014-12-19 09:06:48

Answers

1

You can read this information from the txt file, organize it into objects of a class, and then serialize those objects.
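The "organize it into objects" step could look roughly like the sketch below. It is not a general parser: it assumes the layout shown in the question (lines before the first numbered question are headings, questions start with `1.`, `2.`, ..., options are labelled `a)` to `d)`, and an unlabelled line continues the previous option). The function name `parse_questions` and the dictionary keys are made up for illustration. Heading lines keep their leading white space; question and option text is trimmed.

```python
import re

# Assumed layout: "1. question text" starts a question,
# "a) text" (possibly several per line) starts an option.
QUESTION_RE = re.compile(r'^\s*(\d+)\.\s+(.*)$')
OPTION_SPLIT_RE = re.compile(r'(?:^|\s+)([a-d])\)\s+')


def parse_questions(lines):
    """Return (header_lines, questions) parsed from an iterable of lines."""
    header = []
    questions = []
    current = None
    for raw in lines:
        line = raw.rstrip('\r\n')
        if not line.strip():
            continue  # skip blank separator lines
        match = QUESTION_RE.match(line)
        if match:
            current = {'number': int(match.group(1)),
                       'text': match.group(2).strip(),
                       'options': []}
            questions.append(current)
        elif current is None:
            header.append(line)  # heading line, white space kept
        else:
            # parts = [text before first label, label, text, label, text, ...]
            parts = OPTION_SPLIT_RE.split(line)
            lead = parts[0].strip()
            if lead:  # continuation of the previous option or question
                if current['options']:
                    current['options'][-1]['text'] += ' ' + lead
                else:
                    current['text'] += ' ' + lead
            for label, text in zip(parts[1::2], parts[2::2]):
                current['options'].append({'label': label,
                                           'text': text.strip()})
    return header, questions


SAMPLE = """\
 Paper 1/White Spaces are included
Single Correct Answer Type

1. Text of question 1
    a) Option 1.a b) Option 1.b
    c) Option 1.c d) Option 1.d

3. Text of question 3
    a) Option 3.a can span multiple
    lines.
    b) Option 3b
"""

header, questions = parse_questions(SAMPLE.splitlines())
```

With a real file you would pass the open file object instead, for example `parse_questions(open('input1.txt', encoding='utf-8'))`; the encoding is a guess about your file.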

How to (de)serialize: http://code.activestate.com/recipes/577266-xml-to-python-data-structure-de-serialization/

Example:

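# Read every line of the input file.
with open('file.txt') as f:
    lines = f.readlines()

# Organize these lines into objects here (placeholder step): build
# txtInfoObjs, a list of your own objects, from `lines`.

# serialize() is not defined here; it stands for whatever serializer you choose.
xmlStrings = [serialize(pythonObj) for pythonObj in txtInfoObjs]

# Open the output file for writing; the default mode is read-only.
with open('file.xml', 'w') as g:
    g.write(xmlStrings[0])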
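For the serialization step, here is a minimal sketch that uses the standard library's `xml.etree.ElementTree` rather than the recipe linked above. The `ValueError` in the question arises because an XML tag name cannot contain spaces, so text such as ' Paper 1' has to go into element text or attributes, with fixed tag names. The tag names `paper`, `heading`, `question`, `text` and `option`, the function names, and the shape of the input dictionaries are all assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET


def build_xml(header, questions):
    """Build an element tree from heading lines and question dictionaries."""
    root = ET.Element('paper')
    for line in header:
        # Free text goes into element content, never into the tag name,
        # so leading and trailing spaces are kept as they are.
        ET.SubElement(root, 'heading').text = line
    for question in questions:
        q_el = ET.SubElement(root, 'question', number=str(question['number']))
        ET.SubElement(q_el, 'text').text = question['text']
        for option in question['options']:
            opt_el = ET.SubElement(q_el, 'option', label=option['label'])
            opt_el.text = option['text']
    return root


def save_xml(root, path):
    """Write the tree as UTF-8 so characters such as α survive."""
    ET.ElementTree(root).write(path, encoding='utf-8', xml_declaration=True)


# Hypothetical data in the shape this sketch expects.
header = [' Paper 1/White Spaces are included ']
questions = [{
    'number': 2,
    'text': 'Text of question 2',
    'options': [{'label': 'b', 'text': 'Option 2.b has a special char α'}],
}]

root = build_xml(header, questions)
xml_text = ET.tostring(root, encoding='unicode')  # str, handy for inspection
```

Calling `save_xml(root, 'output.xml')` writes the file. Letting `ElementTree.write` handle the output also sidesteps a second problem in the question's code: `etree.tostring` returns bytes by default, and bytes cannot be written to a file opened in text mode `"w"`.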