Unable to predict in R with mxnet 0.94

I have been able to predict values with a regular backprop network using nnet and neuralnet, but for a number of reasons I have been struggling to do the same with MXNet in R (version 0.94).

Here is the data file (a simple CSV with a header; the columns are already normalized): https://files.fm/u/cfhf3zka

Here is the code I am using:

filedata <- read.csv("example.csv")

require(mxnet)

# First three columns are the (normalized) inputs, the fourth is the target
datain  <- filedata[, 1:3]
dataout <- filedata[, 4]

# Convert to numeric matrices for mxnet
lcinm  <- data.matrix(datain, rownames.force = NA)
lcoutm <- data.matrix(dataout, rownames.force = NA)
lcouta <- as.numeric(lcoutm)

# Three hidden layers of 3 sigmoid units each, one output unit
data <- mx.symbol.Variable("data")
fc1  <- mx.symbol.FullyConnected(data, name = "fc1", num_hidden = 3)
act1 <- mx.symbol.Activation(fc1, name = "sigm1", act_type = "sigmoid")
fc2  <- mx.symbol.FullyConnected(act1, name = "fc2", num_hidden = 3)
act2 <- mx.symbol.Activation(fc2, name = "sigm2", act_type = "sigmoid")
fc3  <- mx.symbol.FullyConnected(act2, name = "fc3", num_hidden = 3)
act3 <- mx.symbol.Activation(fc3, name = "sigm3", act_type = "sigmoid")
fc4  <- mx.symbol.FullyConnected(act3, name = "fc4", num_hidden = 1)
softmax <- mx.symbol.LogisticRegressionOutput(fc4, name = "softmax")

mx.set.seed(0)
mxn <- mx.model.FeedForward.create(softmax,
                                   X = lcinm,
                                   y = lcouta,
                                   array.layout = "rowmajor",
                                   learning.rate = 0.01,
                                   eval.metric = mx.metric.rmse)

preds <- predict(mxn, lcinm)

predsa <- array(preds)

predsa

The console output is:

Start training with 1 devices 
[1] Train-rmse=0.0852988247858687 
[2] Train-rmse=0.068769514264606 
[3] Train-rmse=0.0687647380075881 
[4] Train-rmse=0.0687647164103567 
[5] Train-rmse=0.0687647161066822 
[6] Train-rmse=0.0687647160828069 
[7] Train-rmse=0.0687647161241598 
[8] Train-rmse=0.0687647160882147 
[9] Train-rmse=0.0687647160594508 
[10] Train-rmse=0.068764716079949 
> preds <- predict(mxn, lcinm) 
Warning message: 
In mx.model.select.layout.predict(X, model) : 
    Auto detect layout of input matrix, use rowmajor.. 

> predsa <-array(preds) 
> predsa 
    [1] 0.6776764 0.6776764 0.6776764 0.6776764 0.6776764 0.6776764 0.6776764 0.6776764 0.6776764 
    [10] 0.6776764 0.6776764 0.6776764 0.6776764 0.6776764 0.6776764 0.6776764 0.6776764 0.6776764 

So it settles on a "mean value" but cannot predict the individual values. I have tried other methods and learning rates to avoid overtraining, but I never got output that varied at all.
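
A quick way to confirm that the network has collapsed to a constant is to compare the spread of the predictions with the spread of the targets (a minimal sketch in base R, using the predsa and lcouta objects defined above):

# Constant output shows up as (almost) zero spread in the predictions
range(predsa)   # min and max are identical when the output is constant
sd(predsa)      # near zero when predictions have collapsed
sd(lcouta)      # spread of the actual targets, for comparison
mean(lcouta)    # a saturated output tends toward the target mean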

Answer


I tried your example, and it looks like you are trying to predict a continuous output with LogisticRegressionOutput. I believe you should use LinearRegressionOutput instead; you can see an example here and a Julia example here. Also, since you are predicting a continuous output, a different activation function such as ReLU may work better; see this question for some of the reasons.

With these changes, I produced the following code:

data <- mx.symbol.Variable("data")
fc1  <- mx.symbol.FullyConnected(data, name = "fc1", num_hidden = 3)
act1 <- mx.symbol.Activation(fc1, name = "sigm1", act_type = "softrelu")
fc2  <- mx.symbol.FullyConnected(act1, name = "fc2", num_hidden = 3)
act2 <- mx.symbol.Activation(fc2, name = "sigm2", act_type = "softrelu")
fc3  <- mx.symbol.FullyConnected(act2, name = "fc3", num_hidden = 3)
act3 <- mx.symbol.Activation(fc3, name = "sigm3", act_type = "softrelu")
fc4  <- mx.symbol.FullyConnected(act3, name = "fc4", num_hidden = 1)
# Linear output layer, appropriate for a continuous target
softmax <- mx.symbol.LinearRegressionOutput(fc4, name = "softmax")

mx.set.seed(0)
mxn <- mx.model.FeedForward.create(softmax,
                                   X = lcinm,
                                   y = lcouta,
                                   array.layout = "rowmajor",
                                   learning.rate = 1,
                                   eval.metric = mx.metric.rmse,
                                   num.round = 100)

preds <- predict(mxn, lcinm)

predsa <- array(preds)

# Plot predicted vs. actual values; points should fall on the y = x line
require(ggplot2)
qplot(x = dataout, y = predsa, geom = "point", alpha = 0.6) +
  geom_abline(slope = 1)
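
To put a number on the fit shown in the plot, the error and correlation between predictions and targets can be computed directly (a minimal sketch in base R, using the predsa and dataout objects from above):

# Root-mean-square error between predicted and actual values
sqrt(mean((predsa - dataout)^2))

# Linear correlation; approaches 1 as predictions track the targets
cor(predsa, dataout)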

This gave me a steadily decreasing error rate:

Start training with 1 devices 
[1] Train-rmse=0.0725415842873665 
[2] Train-rmse=0.0692660343340093 
[3] Train-rmse=0.0692562284995407 
... 
[97] Train-rmse=0.048629236911287 
[98] Train-rmse=0.0486272021266279 
[99] Train-rmse=0.0486251858007309 
[100] Train-rmse=0.0486231872849457 

And as evidence that the predicted output starts to align with the actual output, there is this plot: [plot: predicted vs. actual values]


I have marked your solution as correct because you showed the cross-plot, which is enough for me, but your solution raises further questions for me: why LinearRegression rather than a curve-shaped sigmoid (or why doesn't Logistic work)? And why on earth is this network a thousand times slower than nnet and neuralnet (even with a sigmoid as the activation), all running on the same CPU? – David