Feedparser (and urllib2) issue: Connection timed out

Using the urllib2 and feedparser libraries in Python, I get the following error most of the time when I try to connect to a specific URL and fetch its content:

urllib2.URLError: <urlopen error [Errno 110] Connection timed out>

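A minimal reproducible example is pasted below: the basic version calls feedparser.parse directly, and the advanced version first fetches the XML content with the urllib2 library.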

# test-1 
import feedparser 
f = feedparser.parse('http://www.zurnal24.si/index.php?ctl=show_rss') 
title = f['channel']['title'] 
print title 

# test-2 
import urllib2 
import feedparser 
url = 'http://www.zurnal24.si/index.php?ctl=show_rss' 
opener = urllib2.build_opener() 
opener.addheaders = [('User-Agent', 'Mozilla/5.0')] 
request = opener.open(url) 
response = request.read() 
feed = feedparser.parse(response) 
title = feed['channel']['title'] 
print title

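When I try a different URL (for example, http://www.delo.si/rss/), everything works fine. Note that all of the URLs point to non-English (i.e., Slovenian) RSS feeds.

I run my experiments from both a local and a remote machine (via ssh). The error occurs more frequently on the remote machine, although it is unpredictable even on the local host.

Any suggestions would be greatly appreciated.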

2011-11-23 Andrej

How often do the timeouts occur? If they are infrequent, you can wait after each timeout and then retry the request:

import sys
import time
import urllib2

import feedparser

url = 'http://www.zurnal24.si/index.php?ctl=show_rss'
opener = urllib2.build_opener()
opener.addheaders = [('User-Agent', 'Mozilla/5.0')]

# Try to connect a few times, waiting longer after each consecutive failure
MAX_ATTEMPTS = 8
for attempt in range(MAX_ATTEMPTS):
    try:
        request = opener.open(url)
        break
    except urllib2.URLError as e:
        sleep_secs = attempt ** 2
        print >> sys.stderr, 'ERROR: %s.\nRetrying in %s seconds...' % (e, sleep_secs)
        time.sleep(sleep_secs)
else:
    # Every attempt failed, so there is no response to parse
    sys.exit('Giving up after %d attempts' % MAX_ATTEMPTS)

response = request.read()
feed = feedparser.parse(response)
title = feed['channel']['title']
print title

2011-11-23 09:14:57

Sometimes I can make 10 consecutive requests before the error appears, and sometimes not even a single one. – Andrej

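As the error indicates, this is a connection problem. It could be your own network connection, or a problem with the server, its connection, or its bandwidth.

A simple workaround is to wrap your feed parsing in a while loop, keeping a maximum retry counter, of course.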
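
The while loop with a retry counter could look roughly like the sketch below. The helper name call_with_retries and its parameters (fetch, max_retries, delay_secs, retry_on) are made up for illustration and are not part of feedparser or urllib2. It assumes the failure surfaces as an exception; IOError is used as the default because urllib2.URLError is a subclass of it.

```python
import time


def call_with_retries(fetch, max_retries=5, delay_secs=1.0, retry_on=(IOError,)):
    """Call fetch() in a while loop until it succeeds or max_retries is reached.

    fetch is any zero-argument callable (hypothetical name); retry_on lists
    the exception types that should trigger another attempt.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return fetch()
        except retry_on:
            if attempts >= max_retries:
                raise  # out of retries: let the last error propagate
            time.sleep(delay_secs)  # fixed pause before the next attempt
```

With the opener from the question, this might be used as call_with_retries(lambda: opener.open(url).read()), passing the returned string on to feedparser.parse. Re-raising after the last attempt keeps the original error visible instead of hiding it behind a missing result.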

2011-11-23 09:14:06 hymloth
