
I am getting a NullPointerException when running my jar file in Hadoop, and I cannot figure out what is wrong.

Below is my driver class:

package mapreduce; 

import java.io.*; 

import org.apache.hadoop.fs.Path; 
import org.apache.hadoop.conf.*; 
import org.apache.hadoop.io.*; 
import org.apache.hadoop.mapred.*; 
import org.apache.hadoop.util.*; 


public class StockDriver extends Configured implements Tool 
{ 
     public int run(String[] args) throws Exception 
     { 
      //creating a JobConf object and assigning a job name for identification purposes 
      JobConf conf = new JobConf(getConf(), StockDriver.class); 
      conf.setJobName("StockDriver"); 

      //Setting configuration object with the Data Type of output Key and Value 
      conf.setOutputKeyClass(Text.class); 
      conf.setOutputValueClass(IntWritable.class); 

      //Providing the mapper and reducer class names 
      conf.setMapperClass(StockMapper.class); 
      conf.setReducerClass(StockReducer.class); 

      File in = new File(args[0]); 
      int number_of_companies = in.listFiles().length; 
      for(int iter=1;iter<=number_of_companies;iter++) 
      { 
       Path inp = new Path(args[0]+"/i"+Integer.toString(iter)+".txt"); 
       Path out = new Path(args[1]+Integer.toString(iter)); 
       //the HDFS input and output directory to be fetched from the command line 
       FileInputFormat.addInputPath(conf, inp); 
       FileOutputFormat.setOutputPath(conf, out); 
       JobClient.runJob(conf); 
      } 
      return 0; 
     } 

     public static void main(String[] args) throws Exception 
     { 
      int res = ToolRunner.run(new Configuration(), new StockDriver(),args); 
      System.exit(res); 
     } 
} 

Mapper class:

package mapreduce; 

import java.io.IOException; 
import gonn.ConstraintTree; 

import org.apache.hadoop.io.*; 
import org.apache.hadoop.mapred.*; 

public class StockMapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> 
{ 
     //hadoop supported data types 
     private static IntWritable send; 
     private Text word; 

     //map method that performs the tokenizer job and framing the initial key value pairs 
     public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException 
     { 
      //taking one line at a time and tokenizing the same 
      String line = value.toString(); 
      String[] words = line.split(" "); 
      String out = ConstraintTree.isMain(words[1]); 
      word = new Text(out); 

      send = new IntWritable(Integer.parseInt(words[0])); 
      output.collect(word, send); 
     } 
} 

Reducer class:

package mapreduce; 

import java.io.IOException; 
import java.util.Iterator; 

import org.apache.hadoop.io.*; 
import org.apache.hadoop.mapred.*; 

public class StockReducer extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> 
{ 
     //reduce method accepts the Key Value pairs from mappers, do the aggregation based on keys and produce the final output 
     public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException 
     { 
      int val = 0; 

      while (values.hasNext()) 
      { 
       val += values.next().get(); 
      } 
      output.collect(key, new IntWritable(val)); 
     } 
} 

Stack trace:

Exception in thread "main" java.lang.NullPointerException 
    at mapreduce.StockDriver.run(StockDriver.java:29) 
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) 
    at mapreduce.StockDriver.main(StockDriver.java:44) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:606) 
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212) 

When I run the jar with java -jar myfile.jar args... it works fine. But when I run it on the Hadoop cluster with hadoop jar myfile.jar [MainClass] args..., it gives the error above.

Just to clarify, line 29 is int number_of_companies = in.listFiles().length;


Are you running a separate MR job for each file in arg[0]? – blackSmith 2014-09-25 09:10:21


@blackSmith No, I am running the same MapReduce job in a loop, once for each file. – 2014-09-25 09:18:11

Answer


The root cause is the use of the java.io.File API to read a directory that lives in HDFS. When a File object is created with a path that does not exist, its listFiles() method returns null rather than an empty array. Since the input directory exists in HDFS (I assume) and not on the local filesystem, the NPE is thrown from:

in.listFiles().length 
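
To see the failure mode in isolation, here is a minimal local demo (the path is hypothetical): listFiles() returns null, not an empty array, for a directory that does not exist on the local filesystem, so chaining .length throws the NPE.

import java.io.File; 

public class ListFilesNullDemo 
{ 
    public static void main(String[] args) 
    { 
        // hypothetical path that exists only in HDFS, not locally 
        File in = new File("/user/hadoop/stocks"); 
        File[] contents = in.listFiles();  // null, because the local path does not exist 
        if (contents == null) 
        { 
            System.err.println("Not a local directory: " + in); 
        } 
    } 
} 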

Use the following to count the files in the HDFS directory instead:

// requires import org.apache.hadoop.fs.FileSystem (Path and Configuration are already imported in the driver) 
FileSystem fs = FileSystem.get(new Configuration()); 
int number_of_companies = fs.listStatus(new Path(args[0])).length;
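
A minimal sketch of how the driver's run() loop could look with this fix applied (assuming the i<N>.txt naming scheme from the question; getConf() is available because StockDriver extends Configured). Note setInputPaths rather than addInputPath, so each iteration's job reads only its own file instead of accumulating paths on the shared JobConf:

// inside run(), after building the JobConf as in the question 
FileSystem fs = FileSystem.get(getConf()); 
int number_of_companies = fs.listStatus(new Path(args[0])).length; 

for (int iter = 1; iter <= number_of_companies; iter++) 
{ 
    Path inp = new Path(args[0] + "/i" + iter + ".txt"); 
    Path out = new Path(args[1] + iter); 
    FileInputFormat.setInputPaths(conf, inp);  // replace the input path each iteration 
    FileOutputFormat.setOutputPath(conf, out); 
    JobClient.runJob(conf); 
} 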