
GAE MapReduce ShufflePipeline causes TaskTooLargeError

I am trying to run a standard MapReduce job over the Datastore. The map pipeline runs fine, but the job then gets stuck in the ShufflePipeline. I get eight of these errors in the logs:
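
(For reference, the job is kicked off roughly like this; a simplified sketch with placeholder names, not my exact code:)

    from mapreduce import mapreduce_pipeline

    # Simplified sketch of the job setup (placeholder names throughout).
    pipeline = mapreduce_pipeline.MapreducePipeline(
        "my_job",                                          # job name (placeholder)
        "main.my_map",                                     # map function (placeholder)
        "main.my_reduce",                                  # reduce function (placeholder)
        "mapreduce.input_readers.DatastoreInputReader",    # read entities from the Datastore
        "mapreduce.output_writers.BlobstoreOutputWriter",  # write results out
        mapper_params={"entity_kind": "models.MyEntity"},  # placeholder entity kind
        reducer_params={"mime_type": "text/plain"},
        shards=16)
    pipeline.start()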

2013-05-13 08:26:18.154 /mapreduce/kickoffjob_callback 500 19978ms 2kb AppEngine-Google 

0.1.0.2 - - [13/May/2013:08:26:18 -0700] 
"POST /mapreduce/kickoffjob_callback HTTP/1.1" 500 2511 
"http://x.appspot.com/mapreduce/pipeline/run" "AppEngine-Google; 
"x" ms=19979 cpu_ms=9814 cpm_usd=0.000281 queue_name=default 
task_name=15467899496029413827 app_engine_release=1.8.0 
instance=00c61b117c2368b09b3a28374853f2e040692c68 


E 2013-05-13 08:26:18.055 

Task size must be less than 102400; found 105564 
Traceback (most recent call last): 
    File "/base/data/home/apps/x/1.367342714947958888/webapp2.py", line 1536, in __call__ 
    rv = self.handle_exception(request, response, e) 
    File "/base/data/home/apps/x/1.367342714947958888/webapp2.py", line 1530, in __call__ 
    rv = self.router.dispatch(request, response) 
    File "/base/data/home/apps/x/1.367342714947958888/webapp2.py", line 1278, in default_dispatcher 
    return route.handler_adapter(request, response) 
    File "/base/data/home/apps/x/1.367342714947958888/webapp2.py", line 1102, in __call__ 
    return handler.dispatch() 
    File "/base/data/home/apps/x/1.367342714947958888/webapp2.py", line 572, in dispatch 
    return self.handle_exception(e, self.app.debug) 
    File "/base/data/home/apps/x/1.367342714947958888/webapp2.py", line 570, in dispatch 
    return method(*args, **kwargs) 
    File "/base/data/home/apps/x/1.367342714947958888/mapreduce/base_handler.py", line 65, in post 
    self.handle() 
    File "/base/data/home/apps/x/1.367342714947958888/mapreduce/handlers.py", line 692, in handle 
    spec, input_readers, output_writers, queue_name, self.base_path()) 
    File "/base/data/home/apps/x/1.367342714947958888/mapreduce/handlers.py", line 767, in _schedule_shards 
    queue_name=queue_name) 
    File "/base/data/home/apps/x/1.367342714947958888/mapreduce/handlers.py", line 369, in _schedule_slice 
    worker_task.add(queue_name, parent=shard_state) 
    File "/base/data/home/apps/x/1.367342714947958888/mapreduce/util.py", line 265, in add 
    countdown=self.countdown) 
    File "/python27_runtime/python27_lib/versions/1/google/appengine/api/taskqueue/taskqueue.py", line 769, in __init__ 
    (max_task_size_bytes, self.size)) 
TaskTooLargeError: Task size must be less than 102400; found 105564 

How can I fix this? It looks like the problem comes from the internals of the MR library and the way it splits up its tasks. If so, how can I work around it?
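
If I read the traceback right, the serialized mapreduce spec (including mapper_params and, during the shuffle stage, the list of intermediate files produced by the map shards) ends up inside each worker task's payload, so anything that bloats it can push the task past the 100 KB (102400 byte) task queue limit. Below is a sketch of the kind of adjustment I am considering; this is based on my assumption about the cause, not a confirmed fix, and the names are placeholders:

    from mapreduce import mapreduce_pipeline

    pipeline = mapreduce_pipeline.MapreducePipeline(
        "my_job",
        "main.my_map",
        "main.my_reduce",
        "mapreduce.input_readers.DatastoreInputReader",
        "mapreduce.output_writers.BlobstoreOutputWriter",
        # Keep mapper_params small: pass keys or identifiers rather than large
        # values, since these params seem to be serialized into every worker task.
        mapper_params={"entity_kind": "models.MyEntity"},
        reducer_params={"mime_type": "text/plain"},
        shards=8)  # fewer map shards -> fewer intermediate files for the shuffle to carry
    pipeline.start()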
