0

I am getting an error when executing a JAR file against HDFS. The hadoop jar command and the resulting error are shown below:

#hadoop jar WordCountNew.jar WordCountNew /MRInput57/Input-Big.txt /MROutput57 
15/11/06 19:46:32 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same. 
15/11/06 19:46:32 INFO mapred.JobClient: Cleaning up the staging area hdfs://localhost:8020/var/lib/hadoop-0.20/cache/mapred/mapred/staging/root/.staging/job_201511061734_0003 
15/11/06 19:46:32 ERROR security.UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory /MRInput57/Input-Big.txt already exists 
Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory /MRInput57/Input-Big.txt already exists 
    at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:132) 
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:921) 
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:882) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:396) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278) 
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:882) 
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:526) 
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:556) 
    at MapReduce.WordCountNew.main(WordCountNew.java:114) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 
    at java.lang.reflect.Method.invoke(Method.java:597) 
    at org.apache.hadoop.util.RunJar.main(RunJar.java:197) 


My driver class is as follows:

    public static void main(String[] args) throws IOException, Exception { 
     // Configuration details w.r.t. the Job and JAR file 
     Configuration conf = new Configuration(); 
     Job job = new Job(conf, "WORDCOUNTJOB"); 

     // Setting Driver class 
     job.setJarByClass(MapReduceWordCount.class); 
     // Setting the Mapper class 
     job.setMapperClass(TokenizerMapper.class); 
     // Setting the Combiner class 
     job.setCombinerClass(IntSumReducer.class); 
     // Setting the Reducer class 
     job.setReducerClass(IntSumReducer.class); 
     // Setting the Output Key class 
     job.setOutputKeyClass(Text.class); 
     // Setting the Output value class 
     job.setOutputValueClass(IntWritable.class); 
     // Adding the Input path 
     FileInputFormat.addInputPath(job, new Path(args[0])); 
     // Setting the output path 
     FileOutputFormat.setOutputPath(job, new Path(args[1])); 

     // System exit strategy 
     System.exit(job.waitForCompletion(true) ? 0 : 1); 
    } 

Can someone please point out what needs to be corrected in my code?

Regards, Pranav

Answers

1

You need to check that the output directory does not already exist, and delete it if it does. MapReduce can't (or won't) write files to a directory that already exists; it needs to create the directory itself to be sure.

Add something like this:

Path outPath = new Path(args[1]); 
FileSystem dfs = FileSystem.get(outPath.toUri(), conf); 
// Remove the old output directory (recursively) if it is still there. 
if (dfs.exists(outPath)) { 
    dfs.delete(outPath, true); 
} 
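
For context, here is a minimal sketch of how that check could sit in the driver from the question, placed before the output path is registered and the job is submitted. TokenizerMapper and IntSumReducer are the classes from the question and are assumed to be on the classpath; the class name WordCountDriver is only for illustration.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "WORDCOUNTJOB");

        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(TokenizerMapper.class);   // mapper from the question
        job.setCombinerClass(IntSumReducer.class);   // combiner from the question
        job.setReducerClass(IntSumReducer.class);    // reducer from the question
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));

        // Delete a stale output directory (recursively) before the job's
        // output specification is checked, so the run does not fail with
        // FileAlreadyExistsException.
        Path outPath = new Path(args[1]);
        FileSystem fs = FileSystem.get(outPath.toUri(), conf);
        if (fs.exists(outPath)) {
            fs.delete(outPath, true);
        }
        FileOutputFormat.setOutputPath(job, outPath);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}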
0

The output directory must not exist before the program is executed. Either delete the existing directory, provide a new one, or delete the output directory from within the program.

I would prefer to delete the output directory from the command prompt before executing the program.

From the command prompt:

hdfs dfs -rm -r <your_output_directory_HDFS_URL> 

From Java:

Chris Gerken's code above is good enough. 
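
If a standalone deletion step from Java is preferred over doing it inside the driver, it could look roughly like this. DeleteOutputDir is a hypothetical helper class, not part of the question; it mirrors the hdfs dfs -rm -r command above.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper that deletes the HDFS path passed as the first argument,
// mirroring "hdfs dfs -rm -r <path>".
public class DeleteOutputDir {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path target = new Path(args[0]);                 // e.g. /MROutput57
        FileSystem fs = FileSystem.get(target.toUri(), conf);
        if (fs.exists(target)) {
            fs.delete(target, true);                     // true = delete recursively
        }
    }
}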
0
The output directory where you are trying to store the output is already present. So either delete the existing directory of the same name or change the name of the output directory.
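
If renaming is preferred, one option is to derive a fresh output path per run in the driver, so the directory never exists beforehand. This is only a sketch; the timestamp suffix and the class name UniqueOutputPath are illustrative.

import java.text.SimpleDateFormat;
import java.util.Date;

import org.apache.hadoop.fs.Path;

// Sketch: build a unique output path per run, e.g. /MROutput57-20151107-194632.
public class UniqueOutputPath {
    public static Path withTimestamp(String base) {
        String stamp = new SimpleDateFormat("yyyyMMdd-HHmmss").format(new Date());
        return new Path(base + "-" + stamp);
    }
}

The driver would then call FileOutputFormat.setOutputPath(job, UniqueOutputPath.withTimestamp(args[1])); instead of using args[1] directly.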

0

As others have already noted, you are getting the error because the output directory already exists, most likely because you have tried running this job before.

You can delete the existing output directory right before running the job, i.e.:

#hadoop fs -rm -r /MROutput57 && \ 
hadoop jar WordCountNew.jar WordCountNew /MRInput57/Input-Big.txt /MROutput57