2016-11-29

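How to determine the epochs hyperparameter from grid search results

I ran a grid search with epochs as one of the hyperparameters. Now that I have selected the best model, how do I determine which epochs value was chosen for that particular model?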

Here is the summary of the model:

Model Details:
==============

H2OBinomialModel: deeplearning 
Model ID: dl_grid_model_19 
Status of Neuron Layers: predicting Churn, 2-class classification, bernoulli distribution, CrossEntropy loss, 4,226 weights/biases, 44.1 KB, 47,520 training samples, mini-batch size 1 
  layer units             type dropout       l1       l2 mean_rate rate_rms momentum mean_weight weight_rms
1     1    30            Input  0.00 %
2     2    32 RectifierDropout 20.00 % 0.000010 0.000010  0.009995 0.000000 0.501901   -0.011006   0.210611
3     3    32 RectifierDropout 20.00 % 0.000010 0.000010  0.009995 0.000000 0.501901   -0.035854   0.191687
4     4    32 RectifierDropout 20.00 % 0.000010 0.000010  0.009995 0.000000 0.501901   -0.029072   0.185352
5     5    32 RectifierDropout 20.00 % 0.000010 0.000010  0.009995 0.000000 0.501901   -0.057359   0.186863
6     6     2          Softmax         0.000010 0.000010  0.009995 0.000000 0.501901    0.122655   0.406789
  mean_bias bias_rms
1
2  0.401924 0.136989
3  0.938406 0.041128
4  0.950918 0.043826
5  0.915588 0.060796
6  0.019925 0.175195


H2OBinomialMetrics: deeplearning 
** Reported on training data. ** 
** Metrics reported on full training frame ** 

MSE: 0.1946901 
RMSE: 0.441237 
LogLoss: 0.5731371 
Mean Per-Class Error: 0.194215 
AUC: 0.8767996 
Gini: 0.7535992 

Confusion Matrix for F1-optimal threshold: 
         No  Yes    Error      Rate
No     1755  614 0.259181 =614/2369
Yes     308 2075 0.129249 =308/2383
Totals 2063 2689 0.194024 =922/4752

Maximum Metrics: Maximum metrics at their respective thresholds 
                        metric threshold    value idx
1                       max f1  0.216316 0.818218 266
2                       max f2  0.058723 0.889206 348
3                 max f0point5  0.306487 0.801744 216
4                 max accuracy  0.217122 0.805976 265
5                max precision  0.730797 1.000000   0
6                   max recall  0.006754 1.000000 398
7              max specificity  0.730797 1.000000   0
8             max absolute_mcc  0.216316 0.616944 266
9   max min_per_class_accuracy  0.257957 0.795636 242
10 max mean_per_class_accuracy  0.217122 0.805792 265

Gains/Lift Table: Extract with `h2o.gainsLift(<model>, <data>)` or `h2o.gainsLift(<model>, valid=<T/F>, xval=<T/F>)`

H2OBinomialMetrics: deeplearning
** Reported on validation data. ** 
** Metrics reported on full validation frame ** 

MSE: 0.1418929 
RMSE: 0.3766867 
LogLoss: 0.4374728 
Mean Per-Class Error: 0.2603761 
AUC: 0.8306744 
Gini: 0.6613489 

Confusion Matrix for F1-optimal threshold: 
         No Yes    Error      Rate
No     1075 201 0.157524 =201/1276
Yes     162 284 0.363229  =162/446
Totals 1237 485 0.210801 =363/1722

Maximum Metrics: Maximum metrics at their respective thresholds 
                        metric threshold    value idx
1                       max f1  0.323830 0.610097 183
2                       max f2  0.087110 0.740000 319
3                 max f0point5  0.514027 0.608666  94
4                 max accuracy  0.514027 0.800232  94
5                max precision  0.668538 0.875000  21
6                   max recall  0.011443 1.000000 389
7              max specificity  0.717464 0.999216   0
8             max absolute_mcc  0.323830 0.466764 183
9   max min_per_class_accuracy  0.229876 0.746082 238
10 max mean_per_class_accuracy  0.173814 0.753367 273

Gains/Lift Table: Extract with `h2o.gainsLift(<model>, <data>)` or `h2o.gainsLift(<model>, valid=<T/F>, xval=<T/F>)` 

Answer

To find out how many epochs a model used, the best approach is to look at its scoring history. For example, for a model m:

h2o.scoreHistory(m) 

(Or, for a graphical version, plot it: plot(m))

That may be more information than you need, so narrow it down to just the epochs column:

h2o.scoreHistory(m)[,c("epochs")] 

(I just noticed that h2o.scoreHistory(m)$epochs also works.)

To show the epochs value of the final model that was returned:

tail(h2o.scoreHistory(m)[,c("epochs")], 1)
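For a grid, the same lookup can be applied to the best model. The following is a minimal sketch, assuming a running H2O cluster and a grid that has already been trained; the grid ID "dl_grid" and the choice to sort by AUC are assumptions for illustration and do not come from the question.

```r
library(h2o)

# Assumption: a grid with the hypothetical ID "dl_grid" already exists
# on the running H2O cluster.
grid <- h2o.getGrid("dl_grid", sort_by = "auc", decreasing = TRUE)

# With this sort order, the first model ID is the best model by AUC.
best <- h2o.getModel(grid@model_ids[[1]])

# The epochs value that was requested for this model.
best@allparameters$epochs

# The epochs value reported in the last row of the scoring history.
tail(h2o.scoreHistory(best)$epochs, 1)
```

The two numbers need not match: the first is the setting the model was given, while the second is what the scoring history recorded when training finished.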

By the way, if you simply print the grid object, you should already see a column for epochs, provided it was one of your hyperparameters.

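To answer a question you didn't ask: take a look at early stopping. It saves you from having to guess in advance how many epochs you need, and therefore also removes one hyperparameter from your grid search.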
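As a rough illustration of early stopping, the sketch below sets a generous epochs ceiling and lets the stopping criteria end training. It assumes `train` and `valid` are H2OFrames that are already loaded, and that the response column is named "Churn" as in the model summary above; the stopping values are arbitrary examples, not recommendations.

```r
library(h2o)
h2o.init()

# Assumptions: `train` and `valid` are existing H2OFrames, and "Churn"
# is the response column (name taken from the model summary).
predictors <- setdiff(names(train), "Churn")

m <- h2o.deeplearning(
  x = predictors,
  y = "Churn",
  training_frame = train,
  validation_frame = valid,
  hidden = c(32, 32, 32, 32),
  epochs = 1000,                # upper bound only; early stopping may end sooner
  stopping_metric = "logloss",  # illustrative choice
  stopping_rounds = 5,          # illustrative choice
  stopping_tolerance = 1e-3     # illustrative choice
)

# Epochs reported in the last row of the scoring history.
tail(h2o.scoreHistory(m)$epochs, 1)
```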

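You can also simply build one model with the highest epochs value you are considering, then look in its scoring history at each of the other epochs values you are interested in to get the scores.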
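One way to read scores off the history at chosen epochs is sketched below. The helper name `rows_at_epochs` is hypothetical, and the toy data frame holds made-up numbers purely to show the call. Scoring only happens at intervals, so the helper returns the row nearest to each requested value rather than an exact match.

```r
# Return the scoring-history rows closest to each epochs value of interest.
# `history` is any data frame with a numeric `epochs` column, for example
# as.data.frame(h2o.scoreHistory(m)).
rows_at_epochs <- function(history, wanted) {
  idx <- vapply(
    wanted,
    function(e) which.min(abs(history$epochs - e)),
    integer(1)
  )
  history[idx, , drop = FALSE]
}

# Made-up numbers for illustration only; not results from any real model.
toy <- data.frame(
  epochs = c(0, 2.5, 5, 7.5, 10),
  validation_logloss = c(0.69, 0.52, 0.47, 0.45, 0.46)
)

rows_at_epochs(toy, c(5, 10))
```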
