How do I get the text from this HTML tag using jsoup?

I ran into a problem while extracting data with jsoup. The data looks like this:

This is a <strong>strong</strong> number <date>2013</date>

I want to get this: This is a number

How can I do that? Can anyone help?

2013-04-11 user2269351

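You can parse the HTML into a Document, select the body element, and get its text. ownText() returns only the text that belongs directly to the element, while text() also includes the text of all child elements.

Example: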

Document doc = Jsoup.parse("This is a <strong>strong</strong> number <date>2013</date>"); 

String ownText = doc.body().ownText(); 
String text = doc.body().text(); 

System.out.println(ownText); 
System.out.println(text);

Output:

This is a number 
This is a strong number 2013
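
Note that ownText() only sees text nodes that are direct children of the element it is called on. If the text sits inside another element (a <p>, for example), one alternative is to remove the unwanted elements first and then call text(). The following is a minimal sketch under that assumption; the class name StripTagsExample, the helper textWithout, and the wrapping <p> tag are hypothetical and not part of the original question.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class StripTagsExample {

    // Hypothetical helper: removes every element matching cssQuery
    // (including its text), then returns the remaining text of the body.
    static String textWithout(String html, String cssQuery) {
        Document doc = Jsoup.parse(html);
        doc.select(cssQuery).remove(); // drop matched elements and their content
        return doc.body().text();      // whitespace-normalized text of what is left
    }

    public static void main(String[] args) {
        // Assumed input: the question's markup wrapped in a <p>, so that
        // doc.body().ownText() alone would not reach the text.
        String html = "<p>This is a <strong>strong</strong> number <date>2013</date></p>";
        System.out.println(textWithout(html, "strong, date"));
    }
}
```

With the question's input, this should print "This is a number". The selector "strong, date" is just the pair of tags from the example; adjust it to whatever elements you want to leave out.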


2013-04-12 23:17:32 ollo

Thank you very much! – user2269351 2013-04-15 01:27:22

This should answer your question:

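public String escapeHtml(String source) {
    Document doc = Jsoup.parseBodyFragment(source);
    // Replace each <b> element with a text node holding its markup,
    // so the tag is escaped on output instead of being dropped.
    Elements elements = doc.select("b");
    for (Element element : elements) {
        element.replaceWith(new TextNode(element.toString(), ""));
    }
    // Keep only <a> tags with the listed attributes; other tags are stripped.
    // Note: Whitelist was renamed to Safelist in later jsoup releases.
    return Jsoup.clean(doc.body().toString(), new Whitelist().addTags("a").addAttributes("a", "href", "name", "rel", "target"));
}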

Jsoup - Howto clean html by escaping not deleting the unwanted html?


2013-04-12 23:17:32

Document doc = Jsoup.parse("This is a <strong>strong</strong> number <date>2013</date>");

// Html and Spanned come from the android.text package (Android only).
Spanned htmlDoc = Html.fromHtml(doc.toString());
String fromHtml = htmlDoc.toString();

System.out.println(fromHtml);


2015-09-18 19:05:09
