Cannot match with XPath under lxml

I cannot get an XPath expression to match using lxml in Python. Here is my code:

from lxml import etree

def extractContent(self, html):
    # Lenient XML parser: recover from broken markup, clean up namespaces
    parser = etree.XMLParser(ns_clean=True, recover=True)
    print(html.find('id="detail"'))
    tree = etree.fromstring(html, parser)
    if tree is not None:
        for c in self.contents:
            m = tree.xpath(c['xpath'])
            print(m, c['xpath'])
            if len(m) >= 1:
                print(c['name'] + ' : ' + m[0].text)

I want to match //*[@id="i-detail"]/li[1] in the HTML source, but it matches nothing.

Here is the output of the code above:

25803 
[] //*[@id="i-detail"]/li[1]

This is the HTML:

<div class="mc fore tabcon"> 
        <ul id="i-detail"> 
         <li title="XXXXXXXXX">**AAAAAAAAAAA**(what i want to match)</li> 
         <li>BBBBBBBBB</li> 
.......

I tried the XPath at the command line:

>>> root.xpath('//*[@id="i-detail"]/li')
[]
>>> root.xpath('//*[@id="i-detail"]/*')
[<Element {http://www.w3.org/1999/xhtml}li at 0x1007b7910>, <Element {http://www.w3.org/1999/xhtml}li at 0x1007b79b0>, <Element {http://www.w3.org/1999/xhtml}li at 0x1007b7a50>, <Element {http://www.w3.org/1999/xhtml}li at 0x1007b7aa0>, <Element {http://www.w3.org/1999/xhtml}li at 0x1007b7af0>, <Element {http://www.w3.org/1999/xhtml}li at 0x1007b7b40>, <Element {http://www.w3.org/1999/xhtml}li at 0x1007b7b90>]
>>> root.xpath('//*[@id="i-detail"]/*')[0]  # this one does get the target

Source

2012-07-11 MrROY

Use 'tree is not None'; 'None' is a singleton. – 2012-07-11 08:16:33

Please format your code. – 2012-07-11 08:17:20

This seems to work on my side:

>>> from lxml import etree
>>> s = """<div class="mc fore tabcon">
...         <ul id="i-detail">
...          <li title="XXXXXXXXX">**AAAAAAAAAAA**(what i want to match)</li>
...          <li>BBBBBBBBB</li>
...         </ul>
... </div>"""
>>> parser = etree.XMLParser(ns_clean=True, recover=True)
>>> root = etree.fromstring(s, parser)
>>> for node in root.xpath('//*[@id="i-detail"]/li[1]'):
...     print(node, node.text)
...
<Element li at 0x12534b8> **AAAAAAAAAAA**(what i want to match)
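
The snippet above contains no namespace declaration, which may be why it works here but not on the real page. The question's command-line output shows the elements as {http://www.w3.org/1999/xhtml}li, which suggests the full document declares the XHTML default namespace. Under that assumption an unprefixed li in XPath only matches elements in no namespace, so the result is empty. A minimal sketch of two possible workarounds follows; the page string, the prefix "x" and the variable names are made up for illustration and are not taken from the asker's document:

```python
from lxml import etree

# Assumption: the real page declares the XHTML default namespace on <html>,
# as the {http://www.w3.org/1999/xhtml}li reprs in the question suggest.
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical stand-in for the asker's page.
page = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<div class="mc fore tabcon">
  <ul id="i-detail">
    <li title="XXXXXXXXX">AAAAAAAAAAA</li>
    <li>BBBBBBBBB</li>
  </ul>
</div>
</body></html>"""

parser = etree.XMLParser(ns_clean=True, recover=True)
root = etree.fromstring(page, parser)

# Unprefixed `li` means "li in no namespace", so nothing matches.
plain = root.xpath('//*[@id="i-detail"]/li[1]')

# Option 1: bind a prefix (the name "x" is arbitrary) to the namespace.
prefixed = root.xpath('//*[@id="i-detail"]/x:li[1]',
                      namespaces={"x": XHTML_NS})

# Option 2: ignore namespaces and compare local names only.
by_local_name = root.xpath('//*[@id="i-detail"]/*[local-name()="li"][1]')

print(plain)
print(prefixed[0].text)
print(by_local_name[0].text)
```

The id attribute carries no prefix in the markup, so @id="i-detail" matches either way; only the element test needs the namespace. That would also fit the question's observation that /* finds the items while /li does not.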

Source

2012-07-11 08:38:29 Emmanuel
