MapReduce: ValueError: too many values to unpack (expected 2)

I am running the following Python code in MapReduce:
    from mrjob.job import MRJob
    import collections

    bigram = collections.defaultdict(float)
    unigram = collections.defaultdict(float)

    class MRWordFreqCount(MRJob):
        def mapper(self, _, line):
            # Now we loop over lines in the system input
            line = line.strip().split()
            # go through each word in sentence
            i = 0
            for word in line:
                if i > 0:
                    hist = word
                else:
                    hist = ''
                word = CleanWord(word)  # Get the new word
                # If CleanWord didn't return a string, move on
                if word == None: continue
                i += 1
                yield word.lower(), hist.lower(), 1.0

    if __name__ == '__main__':
        MRWordFreqCount.run()
The error I get is: ValueError: too many values to unpack (expected 2), but I can't figure out why. Any suggestions? The command I am running is: python myjob.py Test.txt --mapper
You are yielding 3 values from `mapper`, whereas it seems you can only yield 2. –
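To see why three yielded values produce this exact message, here is a minimal sketch that does not use mrjob at all. `bad_mapper` and `consume` are hypothetical stand-ins: the first mimics the three-value yield from the question, the second mimics any consumer that unpacks each record into a key and a value.

```python
def bad_mapper(line):
    # Yields three items per record, like the mapper in the question.
    for word in line.split():
        yield word.lower(), '', 1.0


def consume(records):
    # Hypothetical stand-in for a framework that expects (key, value) pairs.
    collected = []
    for key, value in records:  # unpacking a 3-tuple into 2 names fails here
        collected.append((key, value))
    return collected


try:
    consume(bad_mapper("hello world"))
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

The failure comes from ordinary tuple unpacking, so it appears on the first record the consumer reads.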
Thanks. Yes, you are right: an mrjob mapper function may only output a single key, value pair. https://pythonhosted.org/mrjob/guides/concepts.html#mapreduce-and-apache-hadoop – user1761806
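One possible way to keep both the word and its history is to pack them into a single composite key, so that each record is still a key, value pair. The sketch below is plain Python so it runs without mrjob. `clean_word` is a hypothetical stand-in for the `CleanWord` helper, which the question does not show, and treating `hist` as the previous cleaned word is an assumption about the intent.

```python
import re


def clean_word(word):
    # Hypothetical stand-in for CleanWord: drop non-alphanumeric characters
    # and return None when nothing is left.
    cleaned = re.sub(r'[^A-Za-z0-9]', '', word)
    return cleaned or None


def mapper_pairs(line):
    # Yield exactly two items per record: a composite key and a value.
    hist = ''  # assumed to mean "previous word"; empty at the start of a line
    for raw in line.strip().split():
        word = clean_word(raw)
        if word is None:
            continue
        word = word.lower()
        yield (word, hist), 1.0
        hist = word


pairs = list(mapper_pairs("The cat, the hat"))
for key, value in pairs:
    print(key, value)
```

Inside the `MRWordFreqCount.mapper` method, the equivalent change would be to replace the three-value yield with `yield (word.lower(), hist.lower()), 1.0`. With mrjob's default JSON protocols the tuple key should reach the reducer as a list.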