2017-10-20 61 views

Heap usage error when using H2O randomForest

I get this error when calling h2o.randomForest. The function call and the resulting error are shown below.

base_line_rf <- h2o.randomForest(x = 2:ncol(train),
           y = 1,
           ntrees = 10000,
           mtries = ncol(train) - 1,
           training_frame = train,
           model_id = model_id,   # named argument uses `=`, not `<-`
           stopping_rounds = 5,
           stopping_tolerance = 0,
           stopping_metric = "AUC",
           binomial_double_trees = TRUE
)

Error:

java.lang.AssertionError: I am really confused about the heap usage; MEM_MAX=7624720384 heapUsedGC=7626295912 
    at water.MemoryManager.set_goals(MemoryManager.java:97) 
    at water.MemoryManager.malloc(MemoryManager.java:265) 
    at water.MemoryManager.malloc(MemoryManager.java:222) 
    at water.MemoryManager.malloc8d(MemoryManager.java:281) 
    at hex.tree.DHistogram.init(DHistogram.java:281) 
    at hex.tree.DHistogram.init(DHistogram.java:240) 
    at hex.tree.ScoreBuildHistogram2$ComputeHistoThread.computeChunk(ScoreBuildHistogram2.java:326) 
    at hex.tree.ScoreBuildHistogram2$ComputeHistoThread.map(ScoreBuildHistogram2.java:306) 
    at water.LocalMR.compute2(LocalMR.java:84) 
    at water.LocalMR.compute2(LocalMR.java:76) 
    at water.LocalMR.compute2(LocalMR.java:76) 
    at water.LocalMR.compute2(LocalMR.java:76) 
    at water.H2O$H2OCountedCompleter.compute(H2O.java:1255) 
    at jsr166y.CountedCompleter.exec(CountedCompleter.java:468) 
    at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263) 
    at jsr166y.ForkJoinPool$WorkQueue.popAndExecAll(ForkJoinPool.java:904) 
    at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:977) 
    at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477) 
    at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104) 

What is the cause of this error?

Thanks

Please provide sample data so the example is reproducible: [example](https://stackoverflow.com/a/5963610/4421870) – Mako212

You may need more memory; see [this answer](https://stackoverflow.com/questions/45333883/h2o-server-crash). –

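This is an assertion error. Assertions are disabled by default, so you must have turned them on (for debugging?). If you turn them off again it may work, but a related error could also show up later. –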

Answer

1

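Based on your question, you need to start the H2O cluster with more memory to accommodate your 10,000-tree random forest. It looks like the H2O cluster (a Java process) was created with about 8 GB of memory, but with the 10,000-tree setting it needs more than the 8 GB it was given.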

max_mem_size - 7624.720384 MB (configured)
heapUsedGC   - 7626.295912 MB (required)
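
As a rough check on those two figures, the following base R sketch pulls the byte counts out of the assertion message quoted in the question and compares them. The helper name `extract_bytes` is made up for illustration and is not part of H2O; the only input is the message text from the stack trace.

```r
# Assertion message copied from the stack trace in the question.
msg <- "I am really confused about the heap usage; MEM_MAX=7624720384 heapUsedGC=7626295912"

# Hypothetical helper: return the number that follows "<key>=" in the message.
extract_bytes <- function(key, text) {
  as.numeric(sub(paste0(".*", key, "=([0-9]+).*"), "\\1", text))
}

mem_max   <- extract_bytes("MEM_MAX", msg)      # heap ceiling of the JVM, in bytes
heap_used <- extract_bytes("heapUsedGC", msg)   # heap in use, in bytes

mem_max_gib     <- mem_max / 1024^3             # bytes -> GiB
overshoot_bytes <- heap_used - mem_max          # how far usage exceeds the ceiling

cat(sprintf("Configured heap: %.2f GiB\n", mem_max_gib))
cat(sprintf("Usage exceeds the ceiling by %.0f bytes\n", overshoot_bytes))
```

The reported usage sits just above the configured ceiling, which is consistent with the heap being completely full at the moment the histogram allocation was attempted.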

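It looks like you are using H2O from R, so you can pass max_mem_size = "12G" to the h2o.init() function (meaning the H2O cluster will start with 12 GB of memory), which should fit the requirements of your random forest: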

h2o.init(max_mem_size="12G") 
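
One thing to watch for: if an H2O instance is already running, h2o.init() connects to it rather than starting a new one, so the new max_mem_size would not take effect. A minimal sketch of a restart helper follows, assuming the h2o package is installed. The function name `restart_h2o` and the two-second pause are illustrative assumptions, not something from the H2O documentation.

```r
# Hypothetical helper: stop the current H2O cluster (if any) and start a new
# one with a larger heap. Calls are namespaced so that defining the function
# does not itself require the h2o package to be attached.
restart_h2o <- function(max_mem_size = "12G") {
  # Shut down the cluster this session is attached to; if none is running,
  # h2o.shutdown() raises an error, which is deliberately ignored here.
  try(h2o::h2o.shutdown(prompt = FALSE), silent = TRUE)

  # Assumption: a short pause gives the old Java process time to exit.
  Sys.sleep(2)

  # Start a fresh local cluster with the requested maximum heap size.
  h2o::h2o.init(max_mem_size = max_mem_size)
}

# Usage (needs the h2o package and enough free RAM on the machine):
# restart_h2o("12G")
# Frames held by the old cluster are gone after a restart, so `train`
# has to be uploaded or imported again before refitting the model.
```

If 12 GB is still not enough, lowering ntrees is the other lever, since the stopping_rounds setting in the question may end training well before 10,000 trees anyway.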

You can also check the details of your H2O cluster with the following command:

> h2o.clusterInfo() 
R is connected to the H2O cluster: 
    H2O cluster uptime:   19 seconds 80 milliseconds 
    H2O cluster version:  3.14.0.3 
    H2O cluster version age: 27 days 
    H2O cluster name:   H2O_started_from_R_avkashchauhan_hwc594 
    H2O cluster total nodes: 1 
    H2O cluster total memory: 10.65 GB <=== This is the max memory size 
    H2O cluster total cores: 8 
    H2O cluster allowed cores: 8 
    H2O cluster healthy:  TRUE 
    H2O Connection ip:   localhost 
    H2O Connection port:  54321 
    H2O Connection proxy:  NA 
    H2O Internal Security:  FALSE 
    H2O API Extensions:   XGBoost, Algos, AutoML, Core V3, Core V4 
    R Version:     R version 3.4.1 (2017-06-30)