Retrieve all the text of an element, including its child elements, in Python

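I wrote some code to find the text inside a specific tag in an XML file. It works for tags that have no child tags.

Example 1: for <a>ajsaka</a> it works fine.

Example 2: for an element with a child, such as <b>ahsjd<c>jjiij</c>aa</b>, it does not work.

I want everything inside the tag, including the text of its child elements. I want it to print ahsjdjjiijaa, but it prints only ahsjd. The input file and my code so far are below.

Here is the input file.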

<level> 
<ex> 
<nt>[edit <topic-ref link-text="short-title" 
topic-id="13629">address</topic-ref>],</nt> 
<nt>[edit routing-instances <var>routing-instance-name</var 
    > <topic-ref link-text="short-title" topic-id="13629">address- 
assignment</topic-ref 
>]</nt> 
</ex> 
    <exam> 
    </exam> 
</level> 

from lxml import etree 
doc=etree.parse('C:/xx/bb.xml') 
root=doc.getroot() 
node=root.find('level') 
count=len(node.getchildren()) 
print (count) 
for elem in root.findall('level/ex/nt'): 
    print (elem.text)

How can I get that?

Source

2017-06-22 Shahul Hameed

There is no 'level' tag in your input XML. Please expand your input. – RomanPerekhrest

You can read your file as a string and then concatenate all the text between the tags:

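import xml.etree.ElementTree as ET

# Read the file as one string; itertext() yields every text fragment, children included.
with open('C:/xx/bb.xml') as f:
    text = f.read()
''.join(ET.fromstring(text).itertext())

Output: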

'ahsjdjjiijaa'
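
As background on why elem.text alone stops at ahsjd: ElementTree (and lxml) store only the text before the first child in .text, and the text that follows a child's closing tag in that child's .tail. The short sketch below uses the <b> example from the question to show where each fragment lives; itertext() walks all of them in document order.

```python
import xml.etree.ElementTree as ET

# The second example from the question.
b = ET.fromstring('<b>ahsjd<c>jjiij</c>aa</b>')
c = b.find('c')

print(b.text)  # text before the first child: ahsjd
print(c.text)  # text inside <c>: jjiij
print(c.tail)  # text after </c>, stored on the child: aa

# itertext() yields text and tail fragments in document order.
full_text = ''.join(b.itertext())
print(full_text)  # ahsjdjjiijaa
```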

Source

2017-06-22 12:51:13

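It works, but I think it returns all the content of my file as one string, doesn't it? Sorry if I am wrong. I only want the content inside a specific tag, even when that tag has child tags inside it. ''.join([x for x in elem.itertext()]) works when it is written inside the loop for elem in root.findall('hierarchy-level/example/statement').

Thank you. Now I understand.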
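
The per-element approach described in the comment above can be sketched as follows. This is a minimal illustration rather than the asker's final code: the inline string is a simplified stand-in for the input file, it assumes <level> is the root element, and element_texts is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for the question's input file (assumption: <level> is the root).
SAMPLE = """<level>
<ex>
<nt>[edit <topic-ref link-text="short-title" topic-id="13629">address</topic-ref>],</nt>
<nt>[edit routing-instances <var>routing-instance-name</var> <topic-ref link-text="short-title" topic-id="13629">address-assignment</topic-ref>]</nt>
</ex>
</level>"""


def element_texts(xml_string, path):
    """Return the full text of each element matching path, children included."""
    root = ET.fromstring(xml_string)
    # findall() paths are relative to the root element, so 'level' is not part of the path.
    return [''.join(elem.itertext()) for elem in root.findall(path)]


texts = element_texts(SAMPLE, 'ex/nt')
for text in texts:
    print(text)
```

Because the path is resolved relative to the root, 'ex/nt' is used here instead of 'level/ex/nt'; with a different root element the path would need to change accordingly.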
