2012-02-09

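Getting a single line from Python output

In my last question, I asked how to parse the links out of the XML in an RSS feed. Using ideas from the help I received here, combined with some extra research, I was able to write this: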

import urllib  # Python 2: provides urllib.urlopen
from xml.dom import minidom

def GetRSS(RSSurl):
    url_info = urllib.urlopen(RSSurl)
    if url_info:
        xmldoc = minidom.parse(url_info)
        if xmldoc:
            # Each lookup below searches the whole document, not the current node.
            channel = xmldoc.getElementsByTagName('channel')
            for node in channel:
                item = xmldoc.getElementsByTagName('item')
                for node in item:
                    alist = xmldoc.getElementsByTagName('link')
                    for a in alist:
                        linktext = a.firstChild.data
                        print linktext

As I mentioned in my other question, I wrote this to get the links from the RSS feed on Redlettermedia.com. The code works fine, and the output I receive is:

http://redlettermedia.com 
http://redlettermedia.com/half-in-the-bag-b-fest-2012/ 
http://redlettermedia.com/an-update-from-red-letter-media/ 
http://redlettermedia.com/half-in-the-bag-red-tails/ 
http://redlettermedia.com/half-in-the-bag-the-devil-inside-and-flyin-ryan/ 
http://redlettermedia.com/newly-found-episode-iii-review-behind-the-scenes-footage/ 
http://redlettermedia.com/half-in-the-bag-the-girl-with-the-dragon-tattoo-and-2011-re-cap/ 
http://redlettermedia.com/mr-plinetts-indiana-jones-and-the-kingdom-of-the-crystal-skull-review/ 
http://redlettermedia.com/new-mr-plinkett-review-trailer/ 
http://redlettermedia.com/plinkett-fest/ 
http://redlettermedia.com/update/ 
http://redlettermedia.com 
http://redlettermedia.com/half-in-the-bag-b-fest-2012/ 
http://redlettermedia.com/an-update-from-red-letter-media/ 
http://redlettermedia.com/half-in-the-bag-red-tails/ 
http://redlettermedia.com/half-in-the-bag-the-devil-inside-and-flyin-ryan/ 
http://redlettermedia.com/newly-found-episode-iii-review-behind-the-scenes-footage/ 

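And so on. What I want to do now is print only the link to the latest update (the second line of the output, in this case "http://redlettermedia.com/half-in-the-bag-b-fest-2012/"). How would I print only that line?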

Can you install non-stdlib modules? How do you define "latest update link"? – Daenyth 2012-02-09 05:29:09

Answers

If it is always the second item in the list, you could try:

url = xmldoc.getElementsByTagName('link')[1].firstChild.data 
print url 
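
As a minimal, self-contained sketch of the same idea: the snippet below indexes the second `link` element once, outside any loop. It is written for Python 3 (so `print` is a function), and it parses an inline string instead of fetching the real feed. `SAMPLE_FEED`, the example.com URLs, and the helper name `second_link` are assumptions made up for illustration.

```python
from xml.dom import minidom

# Hypothetical two-item feed standing in for the real RSS document.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example</title>
    <link>http://example.com</link>
    <item>
      <title>Newest post</title>
      <link>http://example.com/newest/</link>
    </item>
    <item>
      <title>Older post</title>
      <link>http://example.com/older/</link>
    </item>
  </channel>
</rss>"""


def second_link(xml_text):
    """Return the text of the second <link> element, or None if there is none."""
    xmldoc = minidom.parseString(xml_text)
    # Elements come back in document order: index 0 is the channel's own
    # link, index 1 is the link of the first <item>.
    links = xmldoc.getElementsByTagName('link')
    if len(links) < 2:
        return None
    return links[1].firstChild.data


print(second_link(SAMPLE_FEED))
```

This relies on the channel's own `link` appearing before the first `item`, which matches the output shown in the question.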

This works perfectly, except that I get ten repeated lines of the URL I'm trying to get. What can I do so that I receive the URL I want only once? – Jordan 2012-02-09 05:41:03

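That is because you are printing it for every item in the list. You could most likely replace everything after 'for node in item:' with my suggestion, but I can't test it at the moment... – timc 2012-02-09 05:44:31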
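
To illustrate the repetition described in this comment, here is a rough Python 3 sketch comparing the lookup inside a `for node in item:` loop with the same lookup done once. The inline `SAMPLE_FEED` and the function names `collect_in_loop` and `collect_once` are hypothetical; results are collected into lists rather than printed so the counts are easy to compare.

```python
from xml.dom import minidom

# Hypothetical feed with two items; the feed in the question appears to have ten.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <link>http://example.com</link>
    <item><link>http://example.com/newest/</link></item>
    <item><link>http://example.com/older/</link></item>
  </channel>
</rss>"""


def collect_in_loop(xml_text):
    """Run the lookup once per <item>, as happens inside 'for node in item:'."""
    xmldoc = minidom.parseString(xml_text)
    found = []
    for node in xmldoc.getElementsByTagName('item'):
        # 'node' is never used: the lookup searches the whole document,
        # so every iteration yields the same URL.
        found.append(xmldoc.getElementsByTagName('link')[1].firstChild.data)
    return found


def collect_once(xml_text):
    """Run the same lookup a single time, outside any loop."""
    xmldoc = minidom.parseString(xml_text)
    return [xmldoc.getElementsByTagName('link')[1].firstChild.data]


print(len(collect_in_loop(SAMPLE_FEED)), len(collect_once(SAMPLE_FEED)))
```

The looped version returns one copy of the URL per `item` element, which would account for ten lines on a ten-item feed.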

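Hmm, I think that's what I should be doing, actually. I replaced everything inside 'for node in item:' with exactly what you suggested, but for some reason I still seem to get ten lines. – Jordan 2012-02-09 06:05:56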
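
The revised code is not shown in the thread, so the cause can only be guessed at. One plausible explanation is that the new lines are still indented under the `for` statement, so they still run once per item. A loop-free variant, sketched here in Python 3, avoids that by asking the first `item` element for its own `link`. It assumes the feed lists its newest entry first, which the thread does not confirm; `SAMPLE_FEED` and `latest_link` are hypothetical names.

```python
from xml.dom import minidom

# Hypothetical feed; assumed to list the newest item first.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <link>http://example.com</link>
    <item><link>http://example.com/newest/</link></item>
    <item><link>http://example.com/older/</link></item>
  </channel>
</rss>"""


def latest_link(xml_text):
    """Return the link of the first <item>, or None if the feed has no items."""
    xmldoc = minidom.parseString(xml_text)
    items = xmldoc.getElementsByTagName('item')
    if not items:
        return None
    # Search only inside the first item, not the whole document,
    # so the channel-level link is never picked up.
    links = items[0].getElementsByTagName('link')
    if not links or links[0].firstChild is None:
        return None
    return links[0].firstChild.data.strip()


print(latest_link(SAMPLE_FEED))
```

Because nothing here sits inside a loop, the URL is produced once regardless of how many items the feed contains.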