List HTML tags from a String

I have a string, and I want to list all the HTML tags present in it. Is there any library that can do this?

Any information would be very helpful to me.

2012-03-05 Tapas Bose

Take a look here, I think you'll find everything you need -> http://java-source.net/open-source/html-parsers – tartak 2012-03-05 11:59:43

You could perhaps use JTidy; see http://jtidy.sourceforge.net/howto.html – Sap 2012-03-05 12:00:34

http://htmlcleaner.sourceforge.net – edze 2012-03-05 12:02:36

You can use the code below to extract only the HTML tags from a string.

package com.overflow.stack;

/**
 * Prints every HTML tag found in a string by scanning for '<' and '>'.
 *
 * @author sarath_sivan
 */
public class ExtractHtmlTags {

    public static void getHtmlTags(String html) {
        // Find the first '<', then keep searching from the end of each tag.
        int beginIndex = html.indexOf('<');
        while (beginIndex != -1) {
            int endIndex = html.indexOf('>', beginIndex + 1);
            if (endIndex == -1) {
                break; // unterminated tag: stop instead of looping forever
            }
            System.out.println(html.substring(beginIndex, endIndex + 1));
            beginIndex = html.indexOf('<', endIndex + 1);
        }
    }

    public static void main(String[] args) {
        String html = "<html><body><h2>List HTML tags from a String</h2>hello<br /></body></html>";
        ExtractHtmlTags.getHtmlTags(html);
    }
}

However, I don't understand what you want to do with the extracted HTML tags. Good luck!
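If what you need is the list of distinct tag names rather than the raw tag text, the same scanning idea can be sketched with a regular expression. This is only a rough illustration using the JDK: the class name `TagNameLister` and the method `listTagNames` are made up for this example, and the pattern is a simplification that ignores comments, script contents, and attribute values containing `>`. For real-world HTML a proper parser is the safer choice.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
public class TagNameLister {

    // Matches opening or self-closing tags such as <h2>, <br /> or <a href="x">
    // and captures the tag name. Closing tags (</h2>) are skipped on purpose.
    private static final Pattern OPEN_TAG =
            Pattern.compile("<([A-Za-z][A-Za-z0-9]*)\\b[^>]*>");

    // Returns the distinct tag names in order of first appearance.
    public static Set<String> listTagNames(String html) {
        Set<String> names = new LinkedHashSet<>();
        Matcher matcher = OPEN_TAG.matcher(html);
        while (matcher.find()) {
            names.add(matcher.group(1).toLowerCase(Locale.ROOT));
        }
        return names;
    }

    public static void main(String[] args) {
        String html = "<html><body><h2>List HTML tags from a String</h2>hello<br /></body></html>";
        System.out.println(listTagNames(html)); // prints [html, body, h2, br]
    }
}
```

A `LinkedHashSet` is used so that duplicates are dropped while the order of first appearance is kept.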


2012-03-05 17:22:54

The parser from HtmlUnit can take a string and return a structured result:

http://htmlunit.sourceforge.net/apidocs/com/gargoylesoftware/htmlunit/html/HTMLParser.html
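HtmlUnit is a third-party dependency, so as a rough illustration of the same parser-based idea with nothing but the JDK, here is a sketch that uses the Swing HTML parser (`javax.swing.text.html.parser.ParserDelegator`) instead of HtmlUnit. The class name `ParserTagLister` and the method `listTagNames` are made up for this example. Keep in mind that this parser is lenient and targets old HTML, so it may also report tags it infers (for example `head`) that are not literally in the input.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper; a JDK-only stand-in for a real HTML parser library.
public class ParserTagLister {

    // Collects the names of all tags the parser reports, in order of first appearance.
    public static Set<String> listTagNames(String html) {
        final Set<String> names = new LinkedHashSet<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                names.add(tag.toString()); // tags with content, e.g. <h2>
            }

            @Override
            public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                names.add(tag.toString()); // tags without content, e.g. <br>
            }
        };
        try {
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            // Not expected when reading from an in-memory string.
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        String html = "<html><body><h2>List HTML tags from a String</h2>hello<br></body></html>";
        System.out.println(listTagNames(html));
    }
}
```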


2012-03-05 12:00:04

You could try http://jsoup.org/. I don't know whether it lets you get a list of tags directly, but you can build one by iterating over the DOM.


2012-03-05 12:03:01 StanislavL

require 'nokogiri'
require 'open-uri'
page = Nokogiri::HTML(URI.open('http://yoursite.com'))
page.css("*").map { |x| x.name }.uniq


2012-03-05 12:07:28 gayavat
