Hadoop on Windows: getting the exception "is not a valid DFS filename"

I am new to Hadoop and struggling at the initial stage. In Eclipse I wrote a word count program and built a JAR for it.

I tried to run it with the following Hadoop command:

$ ./hadoop jar C:/cygwin64/home/PAKU/hadoop-1.2.1/wordcount.jar com.hadoopexpert.WordCountDriver file:///C:/cygwin64/home/PAKU/work/hadoopdata/tmp/dfs/ddata/file.txt file:///C:/cygwin64/home/PAKU/hadoop-dir/datadir/tmp/output 

and I got an exception like this:

Exception in thread "main" java.lang.IllegalArgumentException: Pathname /C:/cygwin64/home/PAKU/work/hadoopdata/tmp/mapred/staging/PAKU/.staging from hdfs://localhost:50000/C:/cygwin64/home/PAKU/work/hadoopdata/tmp/mapred/staging/PAKU/.staging is not a valid DFS filename. 
     at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:143) 
     at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:554) 
     at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:788) 
     at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:109) 
     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:942) 
     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936) 
     at java.security.AccessController.doPrivileged(Native Method) 
     at javax.security.auth.Subject.doAs(Unknown Source) 
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) 
     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936) 
     at org.apache.hadoop.mapreduce.Job.submit(Job.java:550) 
     at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580) 
     at com.hadoopexpert.WordCountDriver.main(WordCountDriver.java:30) 
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
     at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) 
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) 
     at java.lang.reflect.Method.invoke(Unknown Source) 
     at org.apache.hadoop.util.RunJar.main(RunJar.java:160) 

Note: I am running Hadoop on Windows using Cygwin.

Code:

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {

    public static void main(String[] args) {
        try {
            Job job = new Job();
            job.setMapperClass(WordCountMapper.class);
            job.setReducerClass(WordCountReducer.class);
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(IntWritable.class);

            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);

            job.setJarByClass(WordCountDriver.class);

            // args[0] = input path, args[1] = output path
            FileInputFormat.setInputPaths(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));

            try {
                System.exit(job.waitForCompletion(true) ? 0 : -1);
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}


import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    public void reduce(Text key, Iterable<IntWritable> value, Context context) {
        // Sum up all the counts emitted for this word.
        int total = 0;
        for (IntWritable i : value) {
            total += i.get();
        }
        try {
            context.write(key, new IntWritable(total));
        } catch (IOException e) {
            e.printStackTrace();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}


import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    public void map(LongWritable key, Text value, Context context) {
        // Emit (word, 1) for every space-separated token in the line.
        String s = value.toString();
        for (String word : s.split(" ")) {
            try {
                context.write(new Text(word), new IntWritable(1));
            } catch (IOException e) {
                e.printStackTrace();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}

Can anyone help me run my first Hadoop program?

Thanks in advance.


Post the code. I think you have specified an invalid path in 'main'. –


@AniMenon - I have added the code. Could you please help? – PKH


@AniMenon - How do I get the HDFS location from the command line? – PKH

Answers


You have specified local paths for FileInputFormat and FileOutputFormat.

Put the file into HDFS and then use the HDFS path.


Steps:

  1. First put (or copyFromLocal) the file into HDFS:

    hdfs dfs -put /local/file/location hdfs://ip_add:port/hdfs_location 
    
  2. You can check the file with ls:

    hdfs dfs -ls /hdfs_location/ 
    

Now pass the HDFS location as the input argument and give a new directory for the output.
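
For example, the corrected run might look like this (a sketch only; the namenode address hdfs://localhost:50000 is taken from the stack trace in the question, while the HDFS paths /user/PAKU/input and /user/PAKU/output are illustrative):

    # copy the local input file into HDFS (hadoop fs also works on Hadoop 1.x)
    hadoop fs -put file:///C:/cygwin64/home/PAKU/work/hadoopdata/tmp/dfs/ddata/file.txt hdfs://localhost:50000/user/PAKU/input/file.txt

    # run the job with HDFS input and a not-yet-existing HDFS output directory
    hadoop jar C:/cygwin64/home/PAKU/hadoop-1.2.1/wordcount.jar com.hadoopexpert.WordCountDriver hdfs://localhost:50000/user/PAKU/input/file.txt hdfs://localhost:50000/user/PAKU/output

If fs.default.name in core-site.xml already points at hdfs://localhost:50000, the shorter forms /user/PAKU/input/file.txt and /user/PAKU/output should work as well.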


- How do I get the HDFS location from the command line? – PKH


The answer has been updated. Follow these steps. –


I think you have not uploaded your file to HDFS yet. You can do that with Hadoop's put command. Once the file is in an HDFS directory, I think it should work.
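
A minimal sketch of that flow, assuming the local file path from the question and an illustrative HDFS target directory /user/PAKU/input:

    # upload the local file into HDFS and verify it is there
    hadoop fs -mkdir /user/PAKU/input
    hadoop fs -put file:///C:/cygwin64/home/PAKU/work/hadoopdata/tmp/dfs/ddata/file.txt /user/PAKU/input/
    hadoop fs -ls /user/PAKU/input/

After that, pass /user/PAKU/input/file.txt (or the full hdfs://localhost:50000/... form) as the job's input path instead of the file:/// path.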