2011-03-29 123 views
1

Python lxml (objectify): XPath troubles

I am trying to parse an XML document with lxml objectify and extract data from it using XPath. Here is a snippet of the file:

<?xml version="1.0" encoding="UTF-8"?>
<Assets>
<asset name="Adham">
    <pos>
        <x>27913.769923</x>
        <y>5174.627773</y>
    </pos>
    <description>Ba bla bla</description>
    <bar>(null)</bar>
</asset>
<asset name="Adrian">
    <pos>
        <x>-179.477707</x>
        <y>5286.959359</y>
    </pos>
    <commodities/>
    <description>test test test</description>
    <bar>more bla</bar>
</asset>
</Assets>

I have the following method:

def getALLattributesX(self, _root):
    '''Uses getattributeX and parses through the attribute dict, assigning
    values as it goes. _root is the main document root'''
    for k in self.attrib:
        self.getattributeX(_root, self.attribPaths[k], k)

...which calls this method:

def getattributeX(self, node, x_path, _attrib):
    '''Gets a value from an xml node indicated by an xpath
    and assigns it to the appropriate attribute. If the node does not
    exist it assigns "error"
    '''

    # Debug output from testing; raises IndexError if the xpath matches nothing
    print(node.xpath(x_path)[0].text)
    try:
        self.attrib[_attrib] = node.xpath(x_path)
    except KeyError:
        self.misload = True
    #except AttributeError:
    #    self.attrib[_attrib] = "error loading " + _attrib
    #    self.misload = True

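The print statement is left over from testing. When I execute the first method, it parses through the XML document and successfully stops at every asset object. I have a dictionary of the variables for it to find, and a complementary dictionary of the paths for it to use, defined as follows: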

from lxml import objectify


class tAssetList:

    alist = {}  # dict of assets, keyed by asset name
    tlist = []
    tree = None  # XML tree
    root = None  # root elem

    def readXML(self, _filename):
        # Load file (read-only) and close it once parsing is done
        with open(_filename, "r") as fileobject:
            self.tree = objectify.parse(fileobject)
        self.root = self.tree.getroot()

        for elem in self.root.asset:
            temp_asset = tAsset()
            a_name = elem.get("name")  # get name, which is the key for dict
            temp_asset.getALLattributesX(elem)
            self.alist[a_name] = temp_asset


# obs.nxObject comes from the asker's own module, which is not shown here
class tAsset(obs.nxObject):
    def __init__(self):
        self.attrib = {"X_pos": None, "Y_pos": None}
        self.attribPaths = {"X_pos": '/pos/x', "Y_pos": '/pos/y'}

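However, the XPath does not seem to work when I call it on the node (which is an objectify XML node). It just returns [] if I assign the result directly, and if I try [0].text it raises an index out of range error.

What is going on here?

Answers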

4

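/pos/x and /pos/y are absolute XPath expressions. An absolute expression is evaluated from the document root, even when it is called on a child node, so these select no elements: the provided XML document has no top-level pos element.

Try the relative forms instead: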

pos/x 

pos/y 
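
To illustrate the difference, here is a minimal sketch. It assumes lxml is installed and uses a trimmed copy of the XML from the question; the names XML, asset, attrib_paths and values are illustrative only and do not come from the original code.

```python
from lxml import objectify

# Trimmed copy of the question's document (one asset only)
XML = """
<Assets>
  <asset name="Adham">
    <pos>
      <x>27913.769923</x>
      <y>5174.627773</y>
    </pos>
  </asset>
</Assets>
""".strip()

root = objectify.fromstring(XML)
asset = root.asset  # the first <asset> child of <Assets>

# Absolute path: evaluated from the document root, where the
# top-level element is <Assets>, not <pos>, so nothing matches.
absolute = asset.xpath("/pos/x")

# Relative paths: evaluated from the <asset> node itself.
relative = asset.xpath("pos/x")
explicit = asset.xpath("./pos/x")  # same thing, with the context node spelled out

# The question's path dictionary, rewritten with relative paths
attrib_paths = {"X_pos": "pos/x", "Y_pos": "pos/y"}
values = {key: asset.xpath(path)[0].text for key, path in attrib_paths.items()}

print(absolute)          # []
print(relative[0].text)  # 27913.769923
print(values)
```

Since xpath() returns a list, checking that the list is non-empty before indexing [0] would avoid the index out of range error when a path matches nothing.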

+1 for the correct distinction between absolute and relative expressions. – 2011-03-29 18:43:45


I thought it might have something to do with that, but I was not sure what the difference was. It works great, thanks! – Biosci3c 2011-03-29 19:51:30


@Biosci3c: You're welcome. – 2011-03-29 21:07:22