Tips for finding prefixed tags in Python lxml?

I want to use lxml's ElementTree-compatible etree to find a specific tag in my XML document. The tag looks like this:

<text:ageInformation> 
    <text:statedAge>12</text:statedAge> 
</text:ageInformation>

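I was hoping to use etree.find("text:statedAge"), but the method doesn't like the "text" prefix. The error mentions that I should add "text" to the prefix map, but I'm not sure how to do that. Any tips?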

Edit: I want to be able to write to the hr4e-prefixed tags. Here is the relevant part of the XML document:

<?xml version="1.0" encoding="utf-8"?> 
<greenCCD xmlns="AlschulerAssociates::GreenCDA" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:hr4e="hr4e::patientdata" xsi:schemaLocation="AlschulerAssociates::GreenCDA green_ccd.xsd"> 
    <header> 
    <documentID root="18c41e51-5f4d-4d15-993e-2a932fed720a" /> 
    <title>Health Records for Everyone Continuity of Care Document</title> 
    <version> 
    <number>1</number> 
</version> 
<confidentiality codeSystem="2.16.840.1.113883.5.25" code="N" /> 
<documentTimestamp value="201105300211+0800" /> 
<personalInformation> 
    <patientInformation> 
    <personID root="2.16.840.1.113883.3.881.PI13023911" /> 
    <personAddress> 
     <streetAddressLine nullFlavor="NI" /> 
     <city>Santa Cruz</city> 
     <state nullFlavor="NI" /> 
     <postalCode nullFlavor="NI" /> 
    </personAddress> 
    <personPhone nullFlavor="NI" /> 
    <personInformation> 
     <personName> 
     <given>Benjamin</given> 
     <family>Keidan</family> 
     </personName> 
     <gender codeSystem="2.16.840.1.113883.5.1" code="M" /> 
     <personDateOfBirth value="NI" /> 
     <hr4e:ageInformation> 
     <hr4e:statedAge>9424</hr4e:statedAge> 
     <hr4e:estimatedAge>0912</hr4e:estimatedAge> 
     <hr4e:yearInSchool>1</hr4e:yearInSchool> 
     <hr4e:statusInSchool>attending</hr4e:statusInSchool> 
     </hr4e:ageInformation> 
    </personInformation> 
    <hr4e:livingSituation> 
     <hr4e:homeVillage>Putney</hr4e:homeVillage> 
     <hr4e:tribe>Oromo</hr4e:tribe> 
    </hr4e:livingSituation> 
    </patientInformation> 
</personalInformation>

2011-10-07 super

The namespace prefix must be declared (mapped to a URI). You can then use the {URI}localname notation to find text:statedAge and other elements. Like this:

from lxml import etree 

XML = """ 
<root xmlns:text="http://example.com"> 
<text:ageInformation> 
    <text:statedAge>12</text:statedAge> 
</text:ageInformation> 
</root>""" 

root = etree.fromstring(XML) 

ageinfo = root.find("{http://example.com}ageInformation") 
age = ageinfo.find("{http://example.com}statedAge") 
print(age.text)

This prints "12".

Another way to do it, passing a prefix-to-URI mapping:

ageinfo = root.find("text:ageInformation", 
        namespaces={"text": "http://example.com"}) 
age = ageinfo.find("text:statedAge", 
        namespaces={"text": "http://example.com"}) 
print(age.text)

You can also use XPath:

age = root.xpath("//text:statedAge", 
       namespaces={"text": "http://example.com"})[0] 
print(age.text)
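
Since the question's edit also asks about writing to prefixed tags, here is a minimal sketch of how that might look with the same toy document. The `NS` constant, the new `estimatedAge` element under the `text` prefix, and the values "13" and "10" are assumptions made for illustration only.

```python
from lxml import etree

XML = """
<root xmlns:text="http://example.com">
<text:ageInformation>
    <text:statedAge>12</text:statedAge>
</text:ageInformation>
</root>"""

# Prefix -> URI map, defined once and reused for every lookup (illustrative name).
NS = {"text": "http://example.com"}

root = etree.fromstring(XML)
ageinfo = root.find("text:ageInformation", namespaces=NS)

# Overwrite the text of an existing prefixed element.
age = ageinfo.find("text:statedAge", namespaces=NS)
age.text = "13"

# Add a new element in the same namespace using {URI}localname notation.
# The element name and value here are hypothetical.
estimated = etree.SubElement(ageinfo, "{http://example.com}estimatedAge")
estimated.text = "10"

# Serialize back to a str to inspect the result.
result = etree.tostring(root, encoding="unicode")
print(result)
```

When creating elements, the tag is given in {URI}localname form rather than with the prefix; the prefix is only a serialization detail that lxml derives from the namespace declarations already in scope.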

2011-10-08 09:06:28 mzjn

I keep getting NoneTypes. .. is my root document. I tried ageInfo = root.find("{hr4e::patientdata}ageInformation") – super

@super: It would help if you provided a complete example XML document (update the question). – mzjn

OK, I've included it. – super

In the end I had to use nested prefixes, spelling out the namespace of each element along the path:

from lxml import etree 

XML = """ 
<greenCCD xmlns="AlschulerAssociates::GreenCDA" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:hr4e="hr4e::patientdata" xsi:schemaLocation="AlschulerAssociates::GreenCDA green_ccd.xsd"> 
<personInformation> 
<hr4e:ageInformation> 
    <hr4e:statedAge>12</hr4e:statedAge> 
</hr4e:ageInformation> 
</personInformation> 
</greenCCD>""" 

root = etree.fromstring(XML) 
#root = etree.parse("hr4e_patient.xml") 

ageinfo = root.find("{AlschulerAssociates::GreenCDA}personInformation/{hr4e::patientdata}ageInformation") 
age = ageinfo.find("{hr4e::patientdata}statedAge") 
print(age.text)
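
A likely reason the earlier attempt returned None is that find() with a bare tag only matches direct children of the element it is called on, and ageInformation is nested one level deeper. As a sketch of an alternative to writing out the full path, a ".//" prefix searches all descendants. The `HR4E` and `NS` names are introduced here for illustration, and the document is a trimmed version of the one above.

```python
from lxml import etree

XML = """
<greenCCD xmlns="AlschulerAssociates::GreenCDA" xmlns:hr4e="hr4e::patientdata">
<personInformation>
<hr4e:ageInformation>
    <hr4e:statedAge>12</hr4e:statedAge>
</hr4e:ageInformation>
</personInformation>
</greenCCD>"""

root = etree.fromstring(XML)

HR4E = "hr4e::patientdata"  # namespace URI bound to the hr4e prefix

# A bare tag matches only direct children of root. The only child here is
# personInformation, so this lookup returns None.
direct = root.find("{%s}ageInformation" % HR4E)

# ".//" searches all descendants, so the intermediate path is not needed.
age = root.find(".//{%s}statedAge" % HR4E)

# The same search written with a prefix and a prefix -> URI map.
NS = {"hr4e": HR4E}
age_by_prefix = root.find(".//hr4e:statedAge", namespaces=NS)

print(direct)
print(age.text)
print(age_by_prefix.text)
```

The explicit path is stricter about where the element may appear; the descendant search is shorter but returns the first match anywhere in the tree.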

2011-10-11 22:53:26 super

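Great that it works for you (I think I gave a good answer to the original question, considering that important information about the actual namespaces was omitted). – mzjn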

I wouldn't have found my solution without your help. Thank you very much, kind sir. – super
