2017-02-12 79 views

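Error: Java heap space in reducer phase

I am getting a Java heap space error in my reducer phase. My application uses 41 reducers and a custom partitioner class. The job output is shown below, followed by the reducer code that throws the error.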

17/02/12 05:26:45 INFO mapreduce.Job: map 98% reduce 0% 
17/02/12 05:28:02 INFO mapreduce.Job: map 100% reduce 0% 
17/02/12 05:28:09 INFO mapreduce.Job: map 100% reduce 17% 
17/02/12 05:28:10 INFO mapreduce.Job: map 100% reduce 39% 
17/02/12 05:28:11 INFO mapreduce.Job: map 100% reduce 46% 
17/02/12 05:28:12 INFO mapreduce.Job: map 100% reduce 51% 
17/02/12 05:28:13 INFO mapreduce.Job: map 100% reduce 54% 
17/02/12 05:28:14 INFO mapreduce.Job: map 100% reduce 56% 
17/02/12 05:28:15 INFO mapreduce.Job: map 100% reduce 88% 
17/02/12 05:28:16 INFO mapreduce.Job: map 100% reduce 90% 
17/02/12 05:28:18 INFO mapreduce.Job: map 100% reduce 93% 
17/02/12 05:28:18 INFO mapreduce.Job: Task Id : attempt_1486663266028_2653_r_000020_0, Status : FAILED 
Error: Java heap space 
17/02/12 05:28:19 INFO mapreduce.Job: map 100% reduce 91% 
17/02/12 05:28:20 INFO mapreduce.Job: Task Id : attempt_1486663266028_2653_r_000021_0, Status : FAILED 
Error: Java heap space 
17/02/12 05:28:22 INFO mapreduce.Job: Task Id : attempt_1486663266028_2653_r_000027_0, Status : FAILED 
Error: Java heap space 
17/02/12 05:28:23 INFO mapreduce.Job: map 100% reduce 89% 
17/02/12 05:28:24 INFO mapreduce.Job: map 100% reduce 90% 
17/02/12 05:28:24 INFO mapreduce.Job: Task Id : attempt_1486663266028_2653_r_000029_0, Status : FAILED 
Error: Java heap space 

Here is my reducer code:

 public class MyReducer extends Reducer<NullWritable, Text, NullWritable, Text> { 

    private Logger logger = Logger.getLogger(MyReducer.class); 
    StringBuilder sb = new StringBuilder(); 
    private MultipleOutputs<NullWritable, Text> multipleOutputs; 

    public void setup(Context context) { 

     logger.info("Inside Reducer."); 

     multipleOutputs = new MultipleOutputs<NullWritable, Text>(context); 
    } 

    @Override 
    public void reduce(NullWritable Key, Iterable<Text> values, Context context) 
      throws IOException, InterruptedException { 

     for (Text value : values) { 
      final String valueStr = value.toString(); 
      if (valueStr.contains("Japan")) { 
       sb.append(valueStr.substring(0, valueStr.length() - 20)); 
      } else if (valueStr.contains("SelfSourcedPrivate")) { 
       sb.append(valueStr.substring(0, valueStr.length() - 29)); 
      } else if (valueStr.contains("SelfSourcedPublic")) { 
       sb.append(value.toString().substring(0, valueStr.length() - 29)); 
      } else if (valueStr.contains("ThirdPartyPrivate")) { 
       sb.append(valueStr.substring(0, valueStr.length() - 25)); 
      } 
     } 
     multipleOutputs.write(NullWritable.get(), new Text(sb.toString()), "MyFileName"); 
    } 

    public void cleanup(Context context) throws IOException, InterruptedException { 
     multipleOutputs.close(); 
    } 
} 

Can you suggest any changes that would solve my problem? Would using a combiner class improve things?

How many values are you trying to append to the string? Do you have a key with a very large number of values?
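
The point behind this comment can be illustrated without Hadoop. Because the reducer's key type is NullWritable, all values that reach one reducer are likely to arrive in a single reduce() call, and the StringBuilder in the question is appended to for every one of them before anything is written. The following is a minimal stand-alone sketch, not the poster's code: the class BufferGrowthDemo, its method names, and the Consumer used as a stand-in for the output writer are all hypothetical. It only compares how large the buffer gets when values are accumulated versus written one at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-alone demo; no Hadoop classes are involved. */
final class BufferGrowthDemo {

    /** One shared buffer and a single write at the end, as in the question. */
    static int peakWhenAccumulating(Iterable<String> values, Consumer<String> sink) {
        StringBuilder sb = new StringBuilder();
        int peak = 0;
        for (String v : values) {
            sb.append(v);                        // buffer keeps growing
            peak = Math.max(peak, sb.length());
        }
        sink.accept(sb.toString());              // everything written at once
        return peak;
    }

    /** A fresh buffer and one write per value. */
    static int peakWhenStreaming(Iterable<String> values, Consumer<String> sink) {
        int peak = 0;
        for (String v : values) {
            StringBuilder sb = new StringBuilder();
            sb.append(v);
            peak = Math.max(peak, sb.length());  // never larger than one value
            sink.accept(sb.toString());          // written immediately
        }
        return peak;
    }

    public static void main(String[] args) {
        List<String> values = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            values.add("record-" + i + ";");
        }
        Consumer<String> discard = s -> { };     // stand-in for the real output
        System.out.println("accumulating peak: " + peakWhenAccumulating(values, discard));
        System.out.println("streaming peak:    " + peakWhenStreaming(values, discard));
    }
}
```

In the accumulating variant the peak buffer size equals the combined length of all values in the group, while in the streaming variant it is bounded by the longest single value.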

Answer

In the end I managed to solve it.

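I just used multipleOutputs.write(NullWritable.get(), new Text(sb.toString()), strName); inside the for loop, and that solved my problem. The original version appended every value to a single StringBuilder and wrote it only once after the loop, so all values for the key had to fit in the heap at the same time. I tested the change with a very large data set, a 19 GB file, and it worked fine for me. Initially I thought it might create too many objects, but it works well, and the MapReduce job also completes very quickly. This is my final solution: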

    @Override
    public void reduce(NullWritable Key, Iterable<Text> values, Context context)
      throws IOException, InterruptedException {
     for (Text value : values) {

      final String valueStr = value.toString();
      // A fresh buffer per value, so nothing accumulates across records.
      StringBuilder sb = new StringBuilder();
      if (valueStr.contains("Japan")) {
       sb.append(valueStr.substring(0, valueStr.length() - 20));
      } else if (valueStr.contains("SelfSourcedPrivate")) {
       sb.append(valueStr.substring(0, valueStr.length() - 24));
      } else if (valueStr.contains("SelfSourcedPublic")) {
       sb.append(valueStr.substring(0, valueStr.length() - 25));
      } else if (valueStr.contains("ThirdPartyPrivate")) {
       sb.append(valueStr.substring(0, valueStr.length() - 25));
      }
      // Write each record immediately instead of once after the loop.
      // strName (the output base name) is not shown in this snippet.
      multipleOutputs.write(NullWritable.get(), new Text(sb.toString()),
        strName);
     }
    }
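
The if/else chain in this solution maps a marker string to a number of trailing characters to drop. One possible way to keep that mapping in a single place is a lookup table, sketched below in plain Java so it can be run without Hadoop. This is an illustration under assumptions, not the poster's code: the class SuffixTrimmer, its methods, and the Consumer standing in for multipleOutputs.write are hypothetical. The marker strings and trim lengths are copied from the final solution above. The Math.max guard is an added assumption; the original code would throw a StringIndexOutOfBoundsException for a value shorter than its trim length.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical helper; marker strings and lengths come from the final solution. */
final class SuffixTrimmer {

    // Marker -> number of trailing characters to drop.
    // LinkedHashMap keeps insertion order, so markers are checked
    // in the same order as the original if/else chain.
    private static final Map<String, Integer> TRIM = new LinkedHashMap<>();
    static {
        TRIM.put("Japan", 20);
        TRIM.put("SelfSourcedPrivate", 24);
        TRIM.put("SelfSourcedPublic", 25);
        TRIM.put("ThirdPartyPrivate", 25);
    }

    /** Returns the trimmed record, or "" when no marker matches. */
    static String trim(String value) {
        for (Map.Entry<String, Integer> e : TRIM.entrySet()) {
            if (value.contains(e.getKey())) {
                // Added guard (assumption): never ask for a negative end index.
                int end = Math.max(0, value.length() - e.getValue());
                return value.substring(0, end);
            }
        }
        return "";
    }

    /** Trims and emits each value immediately; nothing is kept between records. */
    static void writeAll(Iterable<String> values, Consumer<String> sink) {
        for (String v : values) {
            sink.accept(trim(v));
        }
    }
}
```

Inside a real reducer, the sink would be replaced by the multipleOutputs.write call shown above, wrapping each trimmed string in a Text.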