2017-04-24 79 views

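Lambda Python Pool.map and urllib2.urlopen: retry only the failing processes, log errors only

I have an AWS Lambda function that calls a set of URLs using pool.map. If one of the URLs returns anything other than a 200, the Lambda function fails and is immediately retried. The problem is that the retry re-runs the entire Lambda function. I would like it to retry only the URL that failed and, if that URL still fails after the second attempt, call a fixed URL to log the error.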

Here is the code as it currently stands (with some details removed). It only works when all of the URLs succeed:

from __future__ import print_function 
import urllib2 
from multiprocessing.dummy import Pool as ThreadPool 

import hashlib 
import datetime 
import json 

print('Loading function') 

def lambda_handler(event, context): 

    f = urllib2.urlopen("https://example.com/geturls/?action=something"); 
    data = json.loads(f.read()); 

    urls = []; 
    for d in data: 
        urls.append("https://"+d+".example.com/path/to/action");

    # Make the Pool of workers 
    pool = ThreadPool(4); 

    # Open the urls in their own threads 
    # and return the results 
    results = pool.map(urllib2.urlopen, urls); 

    #close the pool and wait for the work to finish 
    pool.close(); 
    return pool.join(); 

I tried reading the official documentation, but it seems a bit thin when it comes to explaining the map function, and in particular its return value.
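
As a point of reference on the return value: `map` on a `multiprocessing.dummy` pool returns a list of results in the same order as the inputs, and if any call raises, `map` re-raises that exception in the caller instead of returning the list. A minimal sketch of that behaviour follows. `fake_fetch` and `run_map` are hypothetical names, and `fake_fetch` stands in for `urllib2.urlopen` so the example runs without network access:

```python
from multiprocessing.dummy import Pool as ThreadPool


def fake_fetch(url):
    # Hypothetical stand-in for urllib2.urlopen: fails for URLs containing "bad".
    if "bad" in url:
        raise IOError("simulated failure for " + url)
    return "ok:" + url


def run_map(urls):
    pool = ThreadPool(4)
    try:
        # On success, the results come back as a list in input order.
        return pool.map(fake_fetch, urls)
    finally:
        pool.close()
        pool.join()


good_results = run_map(["a", "b", "c"])

try:
    run_map(["a", "bad", "c"])
    map_raised = False
except IOError:
    # One failing call makes map raise; no result list is returned.
    map_raised = True
```

In other words, an exception caught around `pool.map` says that some call failed, but the results of the calls that succeeded are not handed back.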

Using the urlopen documentation, I tried changing my code as follows:

from __future__ import print_function 
import urllib2 
from multiprocessing.dummy import Pool as ThreadPool 

import hashlib 
import datetime 
import json 

print('Loading function') 

def lambda_handler(event, context): 

    f = urllib2.urlopen("https://example.com/geturls/?action=something"); 
    data = json.loads(f.read()); 

    urls = []; 
    for d in data: 
        urls.append("https://"+d+".example.com/path/to/action");

    # Make the Pool of workers 
    pool = ThreadPool(4); 

    # Open the urls in their own threads 
    # and return the results 
    try:
        results = pool.map(urllib2.urlopen, urls);
    except URLError:
        try:  # try once more before logging error
            urllib2.urlopen(URLError.url); # TODO: figure out which URL errored
        except URLError:  # log error
            urllib2.urlopen("https://example.com/error/?url="+URLError.url);

    #close the pool and wait for the work to finish 
    pool.close(); 
    return true; # always return true so we never duplicate successful calls 

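I'm not sure whether I'm handling the exceptions the right way, or even whether I'm using the correct Python exception syntax. Again, my goal is to retry only the URL that failed and, if it still fails after the second attempt, call a fixed URL to log the error.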

Answer


I figured out the answer thanks to a "lower-level" look at this question I posted here.

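The answer is to create my own custom wrapper around the urllib2.urlopen function, because the try/except has to happen inside each thread individually rather than around the whole pool.map call. The function looks like this: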

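def my_urlopen(url):
    try:
        return urllib2.urlopen(url)
    # URLError lives in the urllib2 module; HTTPError (raised for
    # non-2xx responses) is a subclass, so this catches both.
    except urllib2.URLError:
        # Report the failing URL, then return a placeholder result
        urllib2.urlopen("https://example.com/log_error/?url="+url)
        return None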

I put that def above the lambda_handler function declaration, and then inside the handler I can replace the whole try/except block, going from this:

try: 
    results = pool.map(urllib2.urlopen, urls); 
except URLError: 
    try:  # try once more before logging error
        urllib2.urlopen(URLError.url);
    except URLError:  # log error
        urllib2.urlopen("https://example.com/error/?url="+URLError.url);

to this:

results = pool.map(my_urlopen, urls); 

QED
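
The wrapper above logs on the first failure. If a single retry before logging is wanted, as the question asks, the same idea might be extended roughly as follows. This is only a sketch under stated assumptions: `make_retrying_fetch`, `fetch`, `log_error`, and `flaky_fetch` are hypothetical names, and the fetch function is passed in as a parameter (in the Lambda it would presumably be `urllib2.urlopen`, with `log_error` calling the error-logging URL) so the example runs without network access.

```python
from multiprocessing.dummy import Pool as ThreadPool


def make_retrying_fetch(fetch, log_error, attempts=2):
    """Wrap fetch so each URL is tried up to `attempts` times on its own."""
    def fetch_with_retry(url):
        for _ in range(attempts):
            try:
                return fetch(url)
            except Exception:
                # Broad catch for the sketch; with urllib2 this would
                # more likely be `except urllib2.URLError`.
                continue
        # Every attempt failed: record the URL and give map a placeholder.
        log_error(url)
        return None
    return fetch_with_retry


# Demonstration with stand-ins (assumptions, not part of the original code).
calls = {}
logged = []


def flaky_fetch(url):
    calls[url] = calls.get(url, 0) + 1
    if url == "always-fails":
        raise IOError("simulated failure")
    if url == "fails-once" and calls[url] == 1:
        raise IOError("simulated transient failure")
    return "ok:" + url


pool = ThreadPool(4)
wrapped = make_retrying_fetch(flaky_fetch, logged.append)
results = pool.map(wrapped, ["fine", "fails-once", "always-fails"])
pool.close()
pool.join()
```

Because each failure is handled inside its own call, `pool.map` itself does not raise (as long as `log_error` does not raise either), so the Lambda invocation is not retried as a whole; URLs that failed on every attempt show up as `None` in `results`.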
