2015-10-16

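Remove an XML node if a child of its child node contains a specific value

I need to filter an XML file for a specific value: if a node contains this value, the node should be removed.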

<?xml version="1.0" encoding="utf-8" ?> 
<ogr:FeatureCollection 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://ogr.maptools.org/ TZwards.xsd" 
xmlns:ogr="http://ogr.maptools.org/" 
xmlns:gml="http://www.opengis.net/gml"> 
    <gml:boundedBy></gml:boundedBy>     
    <gml:featureMember> 
     <ogr:TZwards fid="F0"> 
      <ogr:Region_Nam>TARGET</ogr:Region_Nam> 
      <ogr:District_N>Kondoa</ogr:District_N> 
      <ogr:Ward_Name>Bumbuta</ogr:Ward_Name> 
     </ogr:TZwards> 
    </gml:featureMember> 
    <gml:featureMember> 
     <ogr:TZwards fid="F1"> 
      <ogr:Region_Nam>REMOVE</ogr:Region_Nam> 
      <ogr:District_N>Kondoa</ogr:District_N> 
      <ogr:Ward_Name>Pahi</ogr:Ward_Name> 
     </ogr:TZwards> 
    </gml:featureMember> 
</ogr:FeatureCollection> 

The Python script should keep a <gml:featureMember> node if its <ogr:Region_Nam> contains TARGET and remove all other nodes.

from xml.dom import minidom 
import xml.etree.ElementTree as ET 

tree = ET.parse('input.xml').getroot() 

removeList = list() 
for child in tree.iter('gml:featureMember'): 
    if child.tag == 'ogr:TZwards': 
     name = child.find('ogr:Region_Nam').text 
     if (name == 'TARGET'): 
      removeList.append(child) 

for tag in removeList: 
    parent = tree.find('ogr:TZwards') 
    parent.remove(tag) 

out = ET.ElementTree(tree) 
out.write(outputfilepath) 

Desired output:

<?xml version="1.0" encoding="utf-8" ?> 
<ogr:FeatureCollection> 
    <gml:boundedBy></gml:boundedBy>     
    <gml:featureMember> 
     <ogr:TZwards fid="F0"> 
      <ogr:Region_Nam>TARGET</ogr:Region_Nam> 
      <ogr:District_N>Kondoa</ogr:District_N> 
      <ogr:Ward_Name>Bumbuta</ogr:Ward_Name> 
     </ogr:TZwards> 
    </gml:featureMember> 
</ogr:FeatureCollection> 

My output still contains all the nodes. Do the namespaces need to be declared in the Python code?

Answers


import xml.etree.ElementTree as ET

tree = ET.parse('/tmp/input.xml').getroot()
# Map the prefixes used in the paths below to the namespace URIs declared in the XML.
namespaces = {'gml': 'http://www.opengis.net/gml', 'ogr': 'http://ogr.maptools.org/'}
# findall() returns a list, so removing children inside the loop is safe.
for child in tree.findall('gml:featureMember', namespaces=namespaces):
    region = child.find('ogr:TZwards/ogr:Region_Nam', namespaces=namespaces)
    # find() returns None when the element is missing, so test that explicitly.
    if region is not None and region.text != 'TARGET':
        tree.remove(child)

out = ET.ElementTree(tree)
out.write("/tmp/out.xml")
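For anyone who wants to try this without files on disk, here is a minimal self-contained sketch of the same filtering idea. The inlined `SAMPLE` string is a shortened copy of the question's XML, and the helper name `keep_region` is made up for illustration. It also calls `ET.register_namespace`, which is not part of the answer above; without it ElementTree writes generated prefixes such as `ns0:` instead of `ogr:` and `gml:`.

```python
import xml.etree.ElementTree as ET

# Namespace URIs taken from the question's XML.
NS = {'gml': 'http://www.opengis.net/gml', 'ogr': 'http://ogr.maptools.org/'}

# Register the prefixes so the serializer writes gml:/ogr: instead of ns0:/ns1:.
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)

# Shortened copy of the question's input (assumption: inlined for the demo).
SAMPLE = """<ogr:FeatureCollection
 xmlns:ogr="http://ogr.maptools.org/"
 xmlns:gml="http://www.opengis.net/gml">
  <gml:featureMember>
    <ogr:TZwards fid="F0"><ogr:Region_Nam>TARGET</ogr:Region_Nam></ogr:TZwards>
  </gml:featureMember>
  <gml:featureMember>
    <ogr:TZwards fid="F1"><ogr:Region_Nam>REMOVE</ogr:Region_Nam></ogr:TZwards>
  </gml:featureMember>
</ogr:FeatureCollection>"""


def keep_region(root, wanted):
    """Remove gml:featureMember children whose ogr:Region_Nam text is not `wanted`.

    Members that have no ogr:Region_Nam element are left alone.
    """
    # findall() returns a list, so removing inside the loop is safe.
    for member in root.findall('gml:featureMember', NS):
        region = member.find('ogr:TZwards/ogr:Region_Nam', NS)
        if region is not None and region.text != wanted:
            root.remove(member)
    return root


root = keep_region(ET.fromstring(SAMPLE), 'TARGET')
result = ET.tostring(root, encoding='unicode')
print(result)
```

Note that `root.remove(member)` only works here because every `gml:featureMember` is a direct child of the root element; for deeper nesting you would have to call `remove` on the actual parent.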

Can you confirm that this works? When I run the solution I still have the same number of featureMembers. Could it be a namespace problem? – toefftoefftoeff


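One of your problems is that the prefixes are not declared in your XML, so I removed them for my test. If you remove the prefixes from your XML and Python code... – djangoliv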


I removed them because I thought they were unnecessary for understanding the problem... I have updated the original post ^ – toefftoefftoeff
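To illustrate the point about prefixes discussed in these comments, here is a small sketch using two made-up one-line documents (the variable names are only for the demo). As far as I can tell, ElementTree stores a namespaced tag as `{uri}localname`, so a lookup by the literal prefixed name matches nothing, and a prefix with no `xmlns:` declaration is rejected by the parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-line documents: one declares the ogr prefix, one does not.
declared = '<ogr:FeatureCollection xmlns:ogr="http://ogr.maptools.org/"/>'
undeclared = '<ogr:FeatureCollection/>'

root = ET.fromstring(declared)
print(root.tag)  # {http://ogr.maptools.org/}FeatureCollection

# The literal prefixed name is compared against the expanded tag, so nothing matches.
matches = list(root.iter('ogr:FeatureCollection'))
print(len(matches))  # 0

# A prefix without an xmlns declaration cannot be parsed at all.
try:
    ET.fromstring(undeclared)
    parsed = True
except ET.ParseError as err:
    parsed = False
    print('parse failed:', err)
```

So the declarations need to stay in the XML, and the Python code needs either the `namespaces` mapping or the full `{uri}localname` form in its lookups.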