2013-07-08 32 views
0
<myroot> <data txt="some0" txt1 = "some1" txt2 = "some2" > 
       <data2> 
         < bank = "SBI" bank2 = "SBI2" > 
       <data2> 
       <data3> 
         <branch = "bang1" branch = "bang2" > 
       <data3> 
      </data> 

      <data txt="some0" txt1 = "some1" txt2 = "some2" > 
       <data2> 
         < bank = "citi" bank2 = "citi2" > 
       <data2> 
       <data3> 
         <branch = "bang3" branch = "bang4" > 
       <data3> 
      </data> </myroot> 

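The data above is stored in a variable, not in an XML file, and I cannot parse it. Please help me convert the data into valid XML and parse it. Below is the script I am working on. In short: how do I parse XML data that is stored in a variable?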

stdout = "<myroot>%s</myroot>" % stdout 
print'main data', stdout 
tree = ElementTree.fromstring(stdout) 
tree1 = ET.parse('tree') 

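In the first line of the script I wrap the data in a root tag; `stdout` ("main data") holds the XML shown above. I then try to parse it, but it throws an error.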

+0

Show us your error. – refi64

Answers

0

The error is raised because your XML is not well-formed.

Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1301, in XML 
    parser.feed(text) 
    File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1643, in feed 
    self._raiseerror(v) 
    File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1507, in _raiseerror 
    raise err 
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 3, column 25 

So look at line 3, column 25:

>>> stdout.split('\n')[2][25:] 
' bank = "SBI" bank2 = "SBI2" >' 
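
The same lookup can be done programmatically. The following is a minimal sketch, written for Python 3 (the traceback above comes from Python 2.7), that relies on the `position` attribute of `ParseError`. The helper name `locate_parse_error` and the shortened sample string are made up for illustration.

```python
import xml.etree.ElementTree as ET


def locate_parse_error(text):
    """Return (line, column, offending_line), or None if the text parses."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        # err.position is a (line, column) tuple; the line number is 1-based.
        line, column = err.position
        return line, column, text.splitlines()[line - 1]
    return None


# Shortened, hypothetical stand-in for the data in the question.
broken = '<myroot>\n<data>\n  < bank = "SBI" >\n</data>\n</myroot>'

result = locate_parse_error(broken)
if result is not None:
    line, column, source_line = result
    print("problem on line %d: %r" % (line, source_line))
```

The tag with no name (`< bank = ...`) is what the parser rejects, so the reported line is the one that needs fixing first.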
+0

This is the error: Exception: [Errno 2] No such file or directory: 'tree' Traceback: Traceback (most recent call last): – user2558589

+0

@user2558589, do you have a file or directory called 'tree'? –

+0

Nope, I stored the output of ElementTree.fromstring(stdout) in a variable named tree and then tried to parse it. – user2558589
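
To spell out the mix-up in this comment thread: `fromstring()` already returns the parsed root element, while `parse()` expects a filename or a file-like object, so `ET.parse('tree')` looks for a file literally named `tree`. Here is a small sketch in Python 3; the `xml_text` sample is a hypothetical, well-formed stand-in for the real data.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the text held in `stdout`.
xml_text = '<myroot><data txt="some0"><bank name="SBI"/></data></myroot>'

# fromstring() parses the text and returns the root Element directly,
# so no second parsing step is needed.
root = ET.fromstring(xml_text)

# parse() takes a filename or file object; wrap the text if you want a tree.
tree = ET.parse(io.StringIO(xml_text))
same_root = tree.getroot()

bank_names = [bank.get("name") for bank in root.iter("bank")]
print(root.tag, same_root.tag, bank_names)
```

In other words, once `fromstring()` succeeds, work with the element it returned instead of passing its variable name to `parse()`.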

0

It parses fine with BeautifulSoup:

>>> s = """<myroot> <data txt="some0" txt1 = "some1" txt2 = "some2" > 
...     <data2> 
...       < bank = "SBI" bank2 = "SBI2" > 
...     <data2> 
...     <data3> 
...       <branch = "bang1" branch = "bang2" > 
...     <data3> 
...    </data> 
... 
...    <data txt="some0" txt1 = "some1" txt2 = "some2" > 
...     <data2> 
...       < bank = "citi" bank2 = "citi2" > 
...     <data2> 
...     <data3> 
...       <branch = "bang3" branch = "bang4" > 
...     <data3> 
...    </data> </myroot>""" 

>>> from bs4 import BeautifulSoup 
>>> soup = BeautifulSoup(s) 
>>> print soup.prettify() 
<myroot> 
<data txt="some0" txt1="some1" txt2="some2"> 
    <data2> 
    &lt; bank = "SBI" bank2 = "SBI2" &gt; 
    <data2> 
    <data3> 
    <branch "bang1" = branch="bang2"> 
     <data3> 
     </data3> 
    </branch> 
    </data3> 
    </data2> 
    </data2> 
</data> 
<data txt="some0" txt1="some1" txt2="some2"> 
    <data2> 
    &lt; bank = "citi" bank2 = "citi2" &gt; 
    <data2> 
    <data3> 
    <branch "bang3" = branch="bang4"> 
     <data3> 
     </data3> 
    </branch> 
    </data3> 
    </data2> 
    </data2> 
</data> 
</myroot>
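
Note that in the output above the nameless `< bank ... >` tags end up as escaped text, and the unclosed `<data2>`/`<data3>` tags get nested inside each other. If you control how the data is produced, it may be easier to emit well-formed XML and use ElementTree. The sketch below (Python 3) is only a guess at the intended structure: the element names `bank` and `branch`, the renamed `branch2` attribute (XML forbids two attributes with the same name on one element), and the added closing tags are all assumptions.

```python
import xml.etree.ElementTree as ET

# Assumed well-formed version of the data: tag names, closing tags and the
# `branch2` attribute are guesses, not taken from the original output.
well_formed = """<myroot>
  <data txt="some0" txt1="some1" txt2="some2">
    <data2><bank bank="SBI" bank2="SBI2"/></data2>
    <data3><branch branch="bang1" branch2="bang2"/></data3>
  </data>
  <data txt="some0" txt1="some1" txt2="some2">
    <data2><bank bank="citi" bank2="citi2"/></data2>
    <data3><branch branch="bang3" branch2="bang4"/></data3>
  </data>
</myroot>"""

root = ET.fromstring(well_formed)

# Collect one (bank, branch) pair per <data> element.
rows = [
    (data.find("data2/bank").get("bank"),
     data.find("data3/branch").get("branch"))
    for data in root.findall("data")
]
print(rows)
```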