2011-08-26 45 views
0

How do I traverse this XML to get the data?

I want to extract each item's information from XML like this:

<item> 
    <title>The Colbert Report - Confused by Rick Parry With an "A" for America</title> 

    <guid isPermaLink="false">http://www.hulu.com/watch/267788/the-colbert-report-confused-by-rick-parry-with-an-a-for-america#http%3A%2F%2Fwww.hulu.com%2Ffeed%2Fpopular%2Fvideos%2Fthis_week%3Frd%3D0</guid> 
    <link>http://rss.hulu.com/~r/HuluPopularVideosThisWeek/~3/6aeJ5cWMBzw/the-colbert-report-confused-by-rick-parry-with-an-a-for-america</link> 
    <description>&lt;a href="http://www.hulu.com/watch/267788/the-colbert-report-confused-by-rick-parry-with-an-a-for-america#http%3A%2F%2Fwww.hulu.com%2Ffeed%2Fpopular%2Fvideos%2Fthis_week%3Frd%3D0"&gt;&lt;img src="http://thumbnails.hulu.com/507/40025507/40025507_145x80_generated.jpg" align="right" hspace="10" vspace="10" width="145" height="80" border="0" /&gt;&lt;/a&gt;&lt;p&gt;The fat cat media elites in Des Moines think they can sit in their ivory corn silos and play puppet master with national politics.&lt;/p&gt;&lt;p&gt;&lt;a href="http://www.hulu.com/users/add_to_playlist?from=feed&amp;video_id=267788"&gt;Add this to your queue&lt;/a&gt;&lt;br/&gt;Added: Fri Aug 12 09:59:14 UTC 2011&lt;br/&gt;Air date: Thu Aug 11 00:00:00 UTC 2011&lt;br/&gt;Duration: 05:39&lt;br/&gt;Rating: 4.7/5.0&lt;br/&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/HuluPopularVideosThisWeek/~4/6aeJ5cWMBzw" height="1" width="1"/&gt;</description> 

    <pubDate>Fri, 12 Aug 2011 09:59:14 -0000</pubDate> 
    <media:thumbnail height="80" width="145" url="http://thumbnails.hulu.com/507/40025507/40025507_145x80_generated.jpg" /> 
    <media:credit>Comedy Central</media:credit> 
    <dcterms:valid>start=2011-08-12T00:15:00Z; end=2011-09-09T23:45:00Z; scheme=W3C-DTF</dcterms:valid> 
    <feedburner:origLink>http://www.hulu.com/watch/267788/the-colbert-report-confused-by-rick-parry-with-an-a-for-america#http%3A%2F%2Fwww.hulu.com%2Ffeed%2Fpopular%2Fvideos%2Fthis_week%3Frd%3D0</feedburner:origLink></item> 
<item> 

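I need the title, the link, the media:thumbnail url, and the description.

I have used the method found at: http://www.rgagnon.com/javadetails/java-0573.html

It works fine for the title and the link, but not for the image URL and the description.

Can someone help me?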

+1

An answer that includes an XPath expression would be fine too. – JMelnik

Answers

0

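The problem here is that the description tag contains an escaped XML (or possibly HTML) string rather than plain XML child elements.

Probably the simplest approach is to get the text this tag contains and open another XML parser to parse it as a separate XML document. If it is actually an HTML fragment rather than well-formed XML, this may not work.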
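As a rough sketch of that suggestion, the snippet below takes the text of the description element (already unescaped by getTextContent()), wraps it in a synthetic root element because the fragment has several top-level nodes, and parses it as its own document. The class and method names (DescriptionParser, parseFragment, firstImageSrc) and the shortened sample string are illustrative assumptions, not part of the original answer.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class DescriptionParser {

    // Parses an already-unescaped markup fragment. The fragment is wrapped in a
    // synthetic <root> element because it may contain several top-level nodes.
    // Throws IllegalArgumentException if the fragment is not well-formed XML.
    public static Document parseFragment(String fragment) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            String wrapped = "<root>" + fragment + "</root>";
            return db.parse(new InputSource(new StringReader(wrapped)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Fragment is not well-formed XML", e);
        }
    }

    // Returns the src attribute of the first <img> in the fragment, or null if there is none.
    public static String firstImageSrc(String fragment) {
        Document doc = parseFragment(fragment);
        Element img = (Element) doc.getElementsByTagName("img").item(0);
        return img == null ? null : img.getAttribute("src");
    }

    public static void main(String[] args) {
        // Simplified stand-in for the text returned by getTextContent() on <description>.
        String description =
              "<a href=\"http://www.hulu.com/watch/267788\">"
            + "<img src=\"http://thumbnails.hulu.com/507/40025507/40025507_145x80_generated.jpg\" /></a>"
            + "<p>Duration: 05:39<br/>Rating: 4.7/5.0<br/></p>";
        System.out.println(firstImageSrc(description));
    }
}
```

Note that the description in the question unescapes to an href containing a bare & (from=feed&video_id=267788), which is not well-formed XML, so this approach would throw for that exact input. That is the case the answer warns about, and an HTML-tolerant parser would be needed there.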

3

You can use XPath to retrieve specific data from an XML document.

For example, to retrieve the content of the url attribute:

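XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
// Matching on local-name() avoids having to register a NamespaceContext for the
// "media" prefix, which "/item/media:thumbnail/@url" would otherwise require.
String url = xpath.evaluate("/item/*[local-name()='thumbnail']/@url", new InputSource("data.xml"));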
+0

+1 as the cleanest solution. Sometimes you only want a single value from the response, which can make the other approaches a bit of overkill. –
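An XPath expression that uses the media: prefix needs a NamespaceContext that maps the prefix to a namespace URI, because XPath.evaluate with an InputSource parses the document in namespace-aware mode. The following is a minimal sketch under two assumptions: that the feed declares xmlns:media as the Media RSS namespace http://search.yahoo.com/mrss/ (the question's excerpt does not show the declaration, so check the real feed), and that item is the document root. The class name ThumbnailXPath and method thumbnailUrl are made up for illustration.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class ThumbnailXPath {

    // Assumed namespace URI for the "media" prefix; verify against the feed's xmlns:media.
    static final String MEDIA_NS = "http://search.yahoo.com/mrss/";

    // Evaluates /item/media:thumbnail/@url against the given XML text.
    // Returns an empty string when no such attribute exists.
    public static String thumbnailUrl(String xml) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "media".equals(prefix) ? MEDIA_NS : XMLConstants.NULL_NS_URI;
            }

            @Override
            public String getPrefix(String namespaceURI) {
                return MEDIA_NS.equals(namespaceURI) ? "media" : null;
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                return MEDIA_NS.equals(namespaceURI)
                        ? Collections.singletonList("media").iterator()
                        : Collections.<String>emptyIterator();
            }
        });
        return xpath.evaluate("/item/media:thumbnail/@url",
                new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws XPathExpressionException {
        String xml = "<item xmlns:media=\"" + MEDIA_NS + "\">"
                + "<title>Example</title>"
                + "<media:thumbnail height=\"80\" width=\"145\" url=\"http://example.com/thumb.jpg\" />"
                + "</item>";
        System.out.println(thumbnailUrl(xml));
    }
}
```

Only getNamespaceURI matters for evaluating the expression; the other two methods are filled in just to satisfy the interface.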

2
// Imports needed by the snippet below (place the try block inside a method):
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

try {
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();

    // Parsing from a File lets the parser detect the encoding and close the stream itself.
    Document doc = db.parse(new File("item.xml"));
    NodeList nodes = doc.getElementsByTagName("item");

    // iterate over the items
    for (int i = 0; i < nodes.getLength(); i++) {
        Element element = (Element) nodes.item(i);

        NodeList title = element.getElementsByTagName("title");
        Element line = (Element) title.item(0);
        System.out.println("title: " + line.getTextContent());

        NodeList link = element.getElementsByTagName("link");
        line = (Element) link.item(0);
        System.out.println("link: " + line.getTextContent());

        // media:thumbnail is an empty element, so its text content is empty;
        // the value of interest is its url attribute.
        NodeList mt = element.getElementsByTagName("media:thumbnail");
        line = (Element) mt.item(0);

        Attr url = line.getAttributeNode("url");
        System.out.println("media:thumbnail -> url: " + url.getTextContent());
    }
} catch (Exception e) {
    e.printStackTrace();
}

For the URL, you first get the media:thumbnail element; then, because url is an attribute of media:thumbnail, you simply call getAttributeNode("url") on that element.
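One thing to watch with this element-then-attribute lookup is that item(0) returns null when an item has no matching child, so the follow-up call fails with a NullPointerException. Below is a small sketch of null-safe helpers for the same lookup. The names ItemReader, childText, childAttr and parse, and the two-item sample feed, are hypothetical and only for illustration. It relies on the default DocumentBuilderFactory not being namespace-aware, so "media:thumbnail" is matched as a plain tag name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ItemReader {

    // Text of the first descendant with the given tag name, or null if there is none.
    public static String childText(Element parent, String tag) {
        Element e = (Element) parent.getElementsByTagName(tag).item(0);
        return e == null ? null : e.getTextContent();
    }

    // Value of the given attribute on the first descendant with the given tag name,
    // or null if either the element or the attribute is missing.
    public static String childAttr(Element parent, String tag, String attr) {
        Element e = (Element) parent.getElementsByTagName(tag).item(0);
        return (e == null || !e.hasAttribute(attr)) ? null : e.getAttribute(attr);
    }

    // Parses XML text with the default (non-namespace-aware) DOM parser.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample: the second item deliberately has no thumbnail.
        String xml = "<rss>"
                + "<item><title>First</title>"
                + "<media:thumbnail url=\"http://example.com/a.jpg\" /></item>"
                + "<item><title>Second</title></item>"
                + "</rss>";
        NodeList items = parse(xml).getElementsByTagName("item");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            System.out.println(childText(item, "title") + " -> "
                    + childAttr(item, "media:thumbnail", "url"));
        }
    }
}
```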

2

For a pure DOM solution, you can use the following code to get the values you want:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance(); 
DocumentBuilder builder = factory.newDocumentBuilder(); 
Document doc = builder.parse("document.xml"); 

Element item = doc.getDocumentElement(); // assuming that item is a root element 
NodeList itemChilds = item.getChildNodes(); 

for (int i = 0; i != itemChilds.getLength(); ++i) 
{ 
    Node itemChildNode = itemChilds.item(i); 
    if (!(itemChildNode instanceof Element)) 
     continue; 
    Element itemChild = (Element) itemChildNode; 
    String itemChildName = itemChild.getNodeName(); 

    if (itemChildName.equals("title")) // possible switch in Java 7 
     System.out.println("title: " + itemChild.getTextContent()); 
    else if (itemChildName.equals("link")) 
     System.out.println("link: " + itemChild.getTextContent()); 
    else if (itemChildName.equals("description")) 
     System.out.println("description: " + itemChild.getTextContent()); 
    else if (itemChildName.equals("media:thumbnail")) 
     System.out.println("image url: " + itemChild.getAttribute("url")); 
} 

Result:

title: The Colbert Report - Confused by Rick Parry With an "A" for America 
link: http://rss.hulu.com/~r/HuluPopularVideosThisWeek/~3/6aeJ5cWMBzw/the-colbert.. 
description: <a href="http://www.hulu.com/watch/267788/the-colbert-report-confuse.. 
image url: http://thumbnails.hulu.com/507/40025507/40025507_145x80_generated.jpg