Lambda Python Pool.map and urllib2.urlopen: retry only failing processes, log only errors

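I have an AWS Lambda function that calls a set of URLs using pool.map. If any one of the URLs returns anything other than a 200, the Lambda function fails and is retried immediately. The problem is that the retry reruns the entire Lambda function. I would like it to retry only the failed URLs and, if a URL still fails after the second attempt, call a fixed URL to log the error.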

Here is the code as it currently stands (with some details removed). It works only when every URL succeeds:

from __future__ import print_function
import urllib2
from multiprocessing.dummy import Pool as ThreadPool

import hashlib
import datetime
import json

print('Loading function')

def lambda_handler(event, context):

    f = urllib2.urlopen("https://example.com/geturls/?action=something")
    data = json.loads(f.read())

    urls = []
    for d in data:
        urls.append("https://" + d + ".example.com/path/to/action")

    # Make the Pool of workers
    pool = ThreadPool(4)

    # Open the urls in their own threads
    # and return the results
    results = pool.map(urllib2.urlopen, urls)

    # Close the pool and wait for the work to finish
    pool.close()
    return pool.join()

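I tried reading the official documentation, but it seems a bit lacking in its explanation of the map function, specifically its return value.

Using the urlopen documentation, I tried changing my code as follows: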

from __future__ import print_function
import urllib2
from multiprocessing.dummy import Pool as ThreadPool

import hashlib
import datetime
import json

print('Loading function')

def lambda_handler(event, context):

    f = urllib2.urlopen("https://example.com/geturls/?action=something")
    data = json.loads(f.read())

    urls = []
    for d in data:
        urls.append("https://" + d + ".example.com/path/to/action")

    # Make the Pool of workers
    pool = ThreadPool(4)

    # Open the urls in their own threads
    # and return the results
    try:
        results = pool.map(urllib2.urlopen, urls)
    except URLError:
        try:    # try once more before logging error
            urllib2.urlopen(URLError.url)  # TODO: figure out which URL errored
        except URLError:    # log error
            urllib2.urlopen("https://example.com/error/?url=" + URLError.url)

    # Close the pool and wait for the work to finish
    pool.close()
    return true  # always return true so we never duplicate successful calls

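I'm not sure whether I'm handling the exceptions the right way, or whether I even have the Python exception syntax right. Again, my goal is to retry only the failed URL and, if it still fails after the second attempt, call a fixed URL to log the error.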

Source

2017-04-24 Bing

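I figured out the answer thanks to a "lower-level" look at this question I posted here.

The answer is to create my own custom wrapper around the urllib2.urlopen function, because each thread needs to be wrapped in its own try/except, rather than the whole thing. The function looks like this: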

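def my_urlopen(url):
    try:
        return urllib2.urlopen(url)
    except urllib2.URLError:
        # HTTPError (raised for 4xx/5xx responses) is a subclass of URLError,
        # so failing status codes are caught here too. Report the URL and move on.
        urllib2.urlopen("https://example.com/log_error/?url=" + url)
        return None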
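
The wrapper above logs on the first failure. The goal stated in the question was to try a failed URL a second time before logging, so here is a minimal sketch of that variant. The name `fetch_with_retry`, the `opener` and `attempts` parameters, the percent-encoding of the logged URL, and the Python 2/3 import shim are assumptions added for illustration and are not part of the original answer. The log endpoint is the same example.com placeholder.

```python
try:  # Python 2, as in the original code
    from urllib2 import urlopen, URLError
    from urllib import quote
except ImportError:  # Python 3 equivalents
    from urllib.request import urlopen
    from urllib.error import URLError
    from urllib.parse import quote

LOG_URL = "https://example.com/log_error/?url="  # placeholder endpoint


def fetch_with_retry(url, opener=urlopen, attempts=2):
    """Try `url` up to `attempts` times; log and return None if all fail."""
    for _ in range(attempts):
        try:
            return opener(url)
        except URLError:
            continue  # HTTPError subclasses URLError, so 4xx/5xx land here too
    try:
        # Every attempt failed: report the percent-encoded URL to the logger.
        opener(LOG_URL + quote(url, safe=""))
    except URLError:
        pass  # a failing log call should not take the whole batch down
    return None
```

It would be passed to the pool the same way as the wrapper above, for example `pool.map(fetch_with_retry, urls)`. The `opener` parameter exists only so the function can be exercised with a stand-in instead of real network calls.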

I put that function above the def lambda_handler declaration, and then inside the handler I could replace the whole try/catch block, from this:

try:
    results = pool.map(urllib2.urlopen, urls)
except URLError:
    try:    # try once more before logging error
        urllib2.urlopen(URLError.url)
    except URLError:    # log error
        urllib2.urlopen("https://example.com/error/?url=" + URLError.url)

to this:

results = pool.map(my_urlopen, urls)
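
On the return value of map that the question asks about: pool.map returns a list with one result per input, in input order, and if any worker raises, map re-raises that exception in the caller and returns no list at all. That is why the errors have to be caught per URL. A small sketch with stand-in workers shows both behaviours. The `fake_fetch` and `safe_fetch` functions are hypothetical and make no network calls.

```python
from multiprocessing.dummy import Pool as ThreadPool


def fake_fetch(url):
    """Stand-in for urlopen: fails for any URL containing 'bad'."""
    if "bad" in url:
        raise IOError("cannot open " + url)
    return "body of " + url


def safe_fetch(url):
    """Per-item wrapper: turn a failure into None instead of raising."""
    try:
        return fake_fetch(url)
    except IOError:
        return None


urls = [
    "https://a.example.com",
    "https://bad.example.com",
    "https://c.example.com",
]

pool = ThreadPool(4)
try:
    results = pool.map(safe_fetch, urls)  # one entry per URL, in input order
    try:
        pool.map(fake_fetch, urls)        # unwrapped worker: map re-raises
        raised = False
    except IOError:
        raised = True
finally:
    pool.close()
    pool.join()

# Pair each URL with its result to see which ones failed.
failed = [u for u, r in zip(urls, results) if r is None]
print(results)
print(failed, raised)
```

With the wrapper returning None on failure, the failed URLs can be recovered afterwards by zipping the inputs with the results, as in the last lines.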

QED

Source

2017-05-26 19:01:48 Bing
