How to run multiple mrjob jobs with different parameters
I have a job like this:
from mrjob.job import MRJob
from mrjob.step import MRStep
import urllib
import re
import httpagentparser

UA_STRING = re.compile(MYSUPERCOMPLEXREGEX)


class MRReferralAnalysis(MRJob):

    def mapper(self, _, line):
        for group in UA_STRING.findall(line):
            ua = httpagentparser.simple_detect(group)
            yield (ua, 1)

    def reducer(self, itemOfInterest, counts):
        yield (sum(counts), itemOfInterest)

    def steps(self):
        return [
            MRStep(mapper=self.mapper,
                   reducer=self.reducer)
        ]


if __name__ == '__main__':
    MRReferralAnalysis.run()
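Now I want to run this mrjob job multiple times (roughly twenty or so), each time with a different parameter that is read from another file and passed in as MYSUPERCOMPLEXREGEX. Is this even possible with mrjob, and how would I schedule the jobs? Or should I write a wrapper program that triggers the jobs?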
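To make the wrapper idea concrete, here is a rough sketch of what I have in mind; it is not a tested solution. It assumes the job script is changed to accept a hypothetical `--regex` command-line option (recent mrjob versions appear to support declaring one with `add_passthru_arg` inside `configure_args`) and to compile the pattern from that option instead of the module-level constant. The helper names `load_patterns`, `build_command` and `run_all` are made up for illustration, as is the one-pattern-per-line file layout.

```python
import subprocess
import sys


def load_patterns(path):
    """Return one regex per non-empty line of the parameter file."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def build_command(job_script, pattern, input_path):
    """Build the argv for a single run of the job script.

    --regex is a hypothetical pass-through option the job would have to
    define itself; the --regex=VALUE form keeps patterns that start with
    "-" from being mistaken for another option.
    """
    return [sys.executable, job_script, "--regex=" + pattern, input_path]


def run_all(job_script, patterns_path, input_path, runner=subprocess.run):
    """Run the job once per pattern, one after another.

    check=True makes subprocess.run raise on a non-zero exit code, so the
    loop stops at the first failing run.
    """
    for pattern in load_patterns(patterns_path):
        runner(build_command(job_script, pattern, input_path), check=True)
```

With hypothetical file names, a call such as `run_all("mr_referral_analysis.py", "patterns.txt", "access.log")` would launch the runs sequentially. Running them in parallel or through a scheduler would need something on top of this.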