2014-12-10 107 views

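Hadoop mapper and reducer value type mismatch error

I'm new to Hadoop and ran into this problem. I'm trying to change the reducer's default output types from Text, IntWritable to Text, Text. The mapper should emit Text, IntWritable; in the reducer I want to keep two counters, incremented depending on the value, and then write both counters to the collector as Text.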

public class WordCountMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

    private final IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value,
            OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {

        String line = value.toString();
        String[] words = line.split(",");
        String[] date = words[2].split(" ");
        word.set(date[0] + " " + date[1] + " " + date[2]);
        if (words[0].contains("0"))
            one.set(0);
        else
            one.set(4);
        output.collect(word, one);
    }
}

----------------------------------------------------------------------------------- 

public class WordCountReducer extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, Text> {

    public void reduce(Text key, Iterator<IntWritable> values,
            OutputCollector<Text, Text> output, Reporter reporter)
            throws IOException {

        int sad = 0;
        int happy = 0;
        while (values.hasNext()) {
            IntWritable value = (IntWritable) values.next();
            if (value.get() == 0)
                sad++; // process value
            else
                happy++;
        }

        output.collect(key, new Text("sad:" + sad + ", happy:" + happy));
    }
}
--------------------------------------------------------------------------------- 

public class WordCount {

    public static void main(String[] args) {
        JobClient client = new JobClient();
        JobConf conf = new JobConf(WordCount.class);

        // specify output types
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        // specify input and output dirs
        FileInputFormat.addInputPath(conf, new Path("input"));
        FileOutputFormat.setOutputPath(conf, new Path("output"));

        // specify a mapper
        conf.setMapperClass(WordCountMapper.class);

        // specify a reducer
        conf.setReducerClass(WordCountReducer.class);
        conf.setCombinerClass(WordCountReducer.class);

        client.setConf(conf);
        try {
            JobClient.runJob(conf);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

I get this error, which then repeats itself several times:

14/12/10 18:11:01 INFO mapred.JobClient: Task Id : attempt_201412100143_0008_m_000000_0, Status : FAILED
java.io.IOException: Spill failed
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:425)
    at WordCountMapper.map(WordCountMapper.java:31)
    at WordCountMapper.map(WordCountMapper.java:1)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209)
Caused by: java.io.IOException: wrong value class: class org.apache.hadoop.io.Text is not class org.apache.hadoop.io.IntWritable
    at org.apache.hadoop.mapred.IFile$Writer.append(IFile.java:143)
    at org.apache.hadoop.mapred.Task$CombineOutputCollector.collect(Task.java:626)
    at WordCountReducer.reduce(WordCountReducer.java:29)
    at WordCountReducer.reduce(WordCountReducer.java:1)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.combineAndSpill(MapTask.java:904)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:785)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1600(MapTask.java:286)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:712)

Can someone explain why this error occurs? I searched for similar errors, but everything I found involved mapper and reducer key/value types that did not match, and as far as I can see my mapper and reducer key/value types do match. Thanks in advance.

Answers

Try commenting out

conf.setCombinerClass(WordCountReducer.class);

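and run the job again.

The error surfaces as a spill error because the combiner runs when the map output buffer fills up and is spilled to disk. The stack trace shows WordCountReducer being called from combineAndSpill: used as a combiner, it must emit the map output types (Text, IntWritable), but it emits Text values, hence "wrong value class".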
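To make the combiner failure concrete, here is a minimal toy model of the value-class check behind the "wrong value class" message. It is only a sketch, not Hadoop's actual code: SpillWriter, canSpill and the sample key are hypothetical names, and Integer and String stand in for IntWritable and Text.

```java
import java.io.IOException;

// Toy model (NOT Hadoop code) of the value-class check applied to combiner
// output during a spill. All names here are hypothetical.
public class CombinerTypeCheckSketch {

    /** Stand-in for the spill writer, configured with the map output value class. */
    public static final class SpillWriter {
        private final Class<?> valueClass;

        public SpillWriter(Class<?> valueClass) {
            this.valueClass = valueClass;
        }

        public void append(String key, Object value) throws IOException {
            // Require an exact class match, mirroring the message in the stack trace.
            if (value.getClass() != valueClass) {
                throw new IOException("wrong value class: " + value.getClass()
                        + " is not " + valueClass);
            }
            // A real writer would serialize the record here.
        }
    }

    /** Returns true if the combiner output is accepted, false if the check rejects it. */
    public static boolean canSpill(Class<?> mapOutputValueClass, Object combinerOutput) {
        SpillWriter writer = new SpillWriter(mapOutputValueClass);
        try {
            writer.append("Wed Dec 10", combinerOutput); // hypothetical key
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A combiner that emits the map output value type is accepted.
        System.out.println(canSpill(Integer.class, 4));
        // The reducer reused as a combiner emits text and is rejected.
        System.out.println(canSpill(Integer.class, "sad:1, happy:0"));
    }
}
```

In this job the simplest way out is to drop the combiner, since the reducer's Text summary cannot be fed back into a reducer that expects IntWritable values.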

Also include

job.setMapOutputKeyClass(Text.class); 
job.setMapOutputValueClass(IntWritable.class); 

job.setOutputKeyClass(Text.class); 
job.setOutputValueClass(Text.class); 

because the mapper and the reducer emit different key/value data types.

If both emit the same data types, then

job.setOutputKeyClass(); 
job.setOutputValueClass(); 

is enough.
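The rule above can be sketched with a toy model. It assumes, as the JobConf documentation describes, that the map output classes fall back to the job output classes when they are not set explicitly. The class and method below are hypothetical stand-ins rather than Hadoop API, with Integer and String in place of IntWritable and Text.

```java
// Toy model (NOT Hadoop code) of how the map output value class is resolved.
// Assumption: an unset map output class falls back to the job output class.
public class OutputTypeFallbackSketch {

    /**
     * Resolves the value class expected from the mapper.
     *
     * @param mapOutputValueClass explicitly configured class, or null if unset
     * @param outputValueClass    class configured for the final (reducer) output
     */
    public static Class<?> resolveMapOutputValueClass(Class<?> mapOutputValueClass,
                                                      Class<?> outputValueClass) {
        return mapOutputValueClass != null ? mapOutputValueClass : outputValueClass;
    }

    public static void main(String[] args) {
        // Only the final output class is set: the mapper is expected to emit it too.
        System.out.println(resolveMapOutputValueClass(null, String.class));
        // Both are set: mapper and reducer may emit different value types.
        System.out.println(resolveMapOutputValueClass(Integer.class, String.class));
    }
}
```

Under that assumption, setting only the final output value class to Text would make the framework expect Text from the mapper as well, which is why the map output classes are set separately when the two stages differ.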


Thank you very much! Commenting out 'conf.setCombinerClass(WordCountReducer.class)' did indeed do the trick. – Alek 2014-12-11 14:19:16

In the WordCount class, the line that sets the output value class should be

conf.setOutputValueClass(Text.class); 