
Hadoop Map Reduce - reading an HDFS file - FileAlreadyExists error

I am new to Hadoop. I am trying to read an existing file on HDFS using the code below. The configuration seems fine and the file path also appears to be correct.

public static class Map extends Mapper<LongWritable, Text, Text, Text> {

    private static Text f1, f2, hdfsfilepath;
    private static HashMap<String, ArrayList<String>> friendsData = new HashMap<>();

    @Override
    public void setup(Context context) throws IOException {
        Configuration conf = context.getConfiguration();
        Path path = new Path("hdfs://cshadoop1" + conf.get("hdfsfilepath"));
        FileSystem fs = FileSystem.get(path.toUri(), conf);
        if (fs.exists(path)) {
            BufferedReader br = new BufferedReader(new InputStreamReader(fs.open(path)));
            String line = br.readLine();
            while (line != null) {
                StringTokenizer str = new StringTokenizer(line, ",");
                String friend = str.nextToken();
                ArrayList<String> friendDetails = new ArrayList<>();
                while (str.hasMoreTokens()) {
                    friendDetails.add(str.nextToken());
                }
                friendsData.put(friend, friendDetails);
                line = br.readLine(); // advance to the next line so the loop can terminate
            }
            br.close();
        }
    }

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (String k : friendsData.keySet()) {
            context.write(new Text(k), new Text(friendsData.get(k).toString()));
        }
    }
}

I get the exception below when I run the code:

Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://cshadoop1/socNetData/userdata/userdata.txt already exists 
     at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:146) 
     at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:458) 
     at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:343) 

I am just trying to read an existing file. Any idea what I am missing here? Any help is appreciated.

Answer

The exception is telling you that your output directory already exists, when it should not. Delete it or give it a different name.
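
If you want the job to take care of this itself, a common approach is to delete the stale output directory from the driver before the job is submitted. Below is a minimal sketch of that idea; the class name OutputDirCleaner, the method name and the example call are hypothetical helpers, not part of the asker's code.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;

// Hypothetical driver-side helper, not from the original post.
public class OutputDirCleaner {

    // Deletes the output directory if it already exists, so the subsequent
    // job submission passes FileOutputFormat's "output must not exist" check.
    public static void deleteIfExists(Configuration conf, Path outputDir) throws IOException {
        FileSystem fs = FileSystem.get(outputDir.toUri(), conf);
        if (fs.exists(outputDir)) {
            fs.delete(outputDir, true); // true = delete recursively
        }
    }
}

Call it from the driver right before job.waitForCompletion(), or simply remove the directory by hand with hadoop fs -rm -r <output-dir> before re-running the job.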

Also, the name of your output directory, 'userdata.txt', looks like the name of a file, so check whether you have mixed up the input and output paths.
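
For comparison, a driver normally wires the two paths as in the sketch below: the input points at the existing file, and the output points at a directory that must not exist yet. The class name, job name and paths here are made-up placeholders rather than the asker's actual driver; passing the existing userdata.txt file to FileOutputFormat.setOutputPath instead of FileInputFormat.addInputPath would produce exactly the exception shown above.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver; assumes the Map class from the question is nested inside it.
public class FriendsDriver {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Side file that the mapper's setup() reads; same config key as in the question.
        conf.set("hdfsfilepath", "/socNetData/userdata/userdata.txt");

        Job job = Job.getInstance(conf, "read-friends");
        job.setJarByClass(FriendsDriver.class);
        job.setMapperClass(Map.class);            // the Map class from the question
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        // Input: an existing file. Output: a directory that does NOT exist yet.
        FileInputFormat.addInputPath(job, new Path("/socNetData/userdata/userdata.txt"));
        FileOutputFormat.setOutputPath(job, new Path("/socNetData/output"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}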