HTML - Find all child tags of a given tag

Suppose I have an HTML page containing something like

<ul class ="good"> 
    <li>1</li> 
    <li>2</li> 
    <li>3</li> 
</ul> 

<ul class ="bad"> 
    <li>a</li> 
    <li>b</li> 
    <li>c</li> 
</ul>

I want to grab the <li> elements inside the first <ul>. From here I have basically copied the following (note: code edited per @twotwotwo's comment):

page, err := html.Parse(httpBody)
if err != nil {
    log.Fatal(err)
}

// f walks the node tree depth-first and reports every <ul> it meets.
var f func(*html.Node)
f = func(n *html.Node) {
    if n.Type == html.ElementNode && n.Data == "ul" {
        fmt.Println("ul found -> ", n)
    } else {
        fmt.Println(n.Data, "is not the correct one")
    }
    // Recurse into the children whether or not this node matched.
    for c := n.FirstChild; c != nil; c = c.NextSibling {
        f(c)
    }
}
f(page)

But the only output I get is

is not the correct one 
html is not the correct one 
head is not the correct one 
body is not the correct one

I don't know why the recursion stops at body. I tried it on motherfuckingwebsite.com, which does have tags inside its body.

P.S. I also tried

page := html.NewTokenizer(httpBody) 

for { 
    tokenType := page.Next() 
    if tokenType == html.ErrorToken { 
        return links
    } 
    token := page.Token()

But this seems to show all the tokens, without regard to the tree structure.

Source

2014-10-01 meto

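Not sure, but you may need to write a recursive search. I think it only searches the children of the node you start from, not the children's children and so on. – twotwotwo 2014-10-01 04:12:02

It appears that even simple markup like the sample you provided gets wrapped in standards-compliant markup. From what I gather, the parser adds wrapper nodes such as <html> and <body>. So, per @twotwotwo's comment, you will have to recurse to find what you want. – 2014-10-01 04:24:13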

I have used this package in the past: https://github.com/PuerkitoBio/goquery

It provides a jQuery-like interface for querying HTML documents. With that library, it is as simple as this:

package main

import (
    "bytes" 
    "fmt" 
    "log" 

    "github.com/PuerkitoBio/goquery" 
) 

var httpBody string = ` 
    <ul class ="good"> 
     <li>1</li> 
     <li>2</li> 
     <li>3</li> 
    </ul> 

    <ul class ="bad"> 
     <li>a</li> 
     <li>b</li> 
     <li>c</li> 
    </ul> 
` 

func main() {
    b := bytes.NewBufferString(httpBody)
    doc, err := goquery.NewDocumentFromReader(b)
    if err != nil {
        log.Fatal(err)
    }

    // Select only the <ul class="good"> list, then print each <li> in it.
    doc.Find("ul.good").Each(func(i int, ul *goquery.Selection) {
        ul.Find("li").Each(func(i int, li *goquery.Selection) {
            fmt.Println(li.Text())
        })
    })
}

It prints:

1 
2 
3

Source

2014-10-01 04:33:40
