2013-10-21 64 views

Answers

1

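I realize this question was asked a while ago, but in the meantime an lxml API has been introduced that looks promising for solving this problem: http://lxml.de/api.html; specifically, see the section "Incremental XML generation".

I quickly tested it by streaming a 10M file, just as in your benchmark, and it took a fraction of a second on my old laptop. That is by no means very scientific, but it does the same job as your generate_large_xml() function.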
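A rough way to repeat that kind of quick test is sketched below. It is only an illustration, not the code used for the measurement above: the helper name `stream_records`, the file name `large.xml`, the record layout, and the record count are all assumptions made for the example.

```python
import time
from lxml import etree


def stream_records(path, count):
    """Write `count` <record> elements to `path`, one element at a time."""
    # etree.xmlfile accepts a file path as well as a binary file object.
    with etree.xmlfile(path, encoding="utf-8") as xf:
        with xf.element("root"):
            for i in range(count):
                rec = etree.Element("record", id=str(i))
                rec.text = "record text data"
                # The element is serialized here, so it does not need
                # to be kept around once the next iteration starts.
                xf.write(rec)


if __name__ == "__main__":
    start = time.perf_counter()
    stream_records("large.xml", 100000)  # hypothetical size, adjust as needed
    print("elapsed: %.2f s" % (time.perf_counter() - start))
```

Because only one record exists in memory at any time, memory use should stay roughly flat as the record count grows; the timing will of course depend on the machine.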

0

As Yury V. Zaytsev mentioned, lxml really does provide an API for generating XML documents in a streaming fashion.

Here is a working example:

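from lxml import etree

fname = "streamed.xml"
# xmlfile writes bytes, so open the file in binary mode
with open(fname, "wb") as f, etree.xmlfile(f) as xf:
    attribs = {"tag": "bagggg", "text": "att text", "published": "now"}
    # the root element stays open until this block exits
    with xf.element("root", attribs):
        xf.write("root text\n")
        for i in range(10):  # use xrange on Python 2
            # each record is built, written out, and then discarded
            rec = etree.Element("record", id=str(i))
            rec.text = "record text data"
            xf.write(rec)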

The resulting XML looks like this (reformatted here from a single-line XML document):

<?xml version="1.0"?> 
<root text="att text" tag="bagggg" published="now">root text 
    <record id="0">record text data</record> 
    <record id="1">record text data</record> 
    <record id="2">record text data</record> 
    <record id="3">record text data</record> 
    <record id="4">record text data</record> 
    <record id="5">record text data</record> 
    <record id="6">record text data</record> 
    <record id="7">record text data</record> 
    <record id="8">record text data</record> 
    <record id="9">record text data</record> 
</root>
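
To look at what xmlfile emits before any reformatting, the same pattern can be pointed at an in-memory buffer. This is a minimal sketch, assuming io.BytesIO as a stand-in for the file, a single attribute, and only three records; the names `buf` and `raw` are illustrative.

```python
from io import BytesIO
from lxml import etree

buf = BytesIO()  # in-memory stand-in for the output file
with etree.xmlfile(buf) as xf:
    with xf.element("root", {"tag": "bagggg"}):
        xf.write("root text\n")
        for i in range(3):
            rec = etree.Element("record", id=str(i))
            rec.text = "record text data"
            xf.write(rec)

raw = buf.getvalue()  # the bytes exactly as written, no pretty-printing
print(raw.decode("utf-8"))

# Parse the bytes back to confirm the stream is well-formed XML.
root = etree.fromstring(raw)
print(len(root), "records")
```

Parsing the output back is a cheap sanity check that every element opened during streaming was also closed.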