2017-07-08 69 views

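Training a neural network with the neuralnet package in R

I am completely new to R and am trying to predict Weekly_Sales with a neural network, using the "neuralnet" package.

The data (train1) that I am training on, in order to predict Weekly_Sales for the test dataset: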

Store Dept Date  Temperature Fuel_Price MarkDown1 MarkDown2 MarkDown3 MarkDown4 MarkDown5 CPI   Unemployment IsHoliday Rank Weekly_Sales 
    1 1 5/2/2010  42.31  2.572  -2000  -500  -100  -500  -700 211.0963582  8.106   0   13  24924.50 
    1 1 12/2/2010  38.51  2.548  -2000  -500  -100  -500  -700 211.2421698  8.106   1   13  46039.49 
    1 1 19/02/2010  39.93  2.514  -2000  -500  -100  -500  -700 211.2891429  8.106   0   13  41595.55 
    1 1 26/02/2010  46.63  2.561  -2000  -500  -100  -500  -700 211.3196429  8.106   0   13  19403.54 
    1 1 5/3/2010  46.50  2.625  -2000  -500  -100  -500  -700 211.3501429  8.106   0   13  21827.90 
    1 1 12/3/2010  57.79  2.667  -2000  -500  -100  -500  -700 211.3501429  8.106   0   13  21827.90 

Splitting the data into train and test:

> ind <- sample(2, nrow(train1), replace = TRUE, prob = c(0.7, 0.3))
> train <- train1[ind == 1, ]
> test <- train1[ind == 2, ]

Train:

>head(train) 
Store Dept Date Temperature Fuel_Price MarkDown1 MarkDown2 MarkDown3 MarkDown4 MarkDown5  CPI Unemployment  IsHoliday Rank Weekly_Sales 
1 1 5/2/2010  42.31  2.572  -2000  -500  -100  -500  -700  211.0963582  8.106   0   13  24924.50 
1 1 26-02-2010  46.63  2.561  -2000  -500  -100  -500  -700  211.3196429  8.106   0   13  19403.54 
1 1 5/3/2010  46.50  2.625  -2000  -500  -100  -500  -700  211.3501429  8.106   0   13  21827.90 
1 1 19-03-2010  54.58  2.720  -2000  -500  -100  -500  -700  211.2156350  8.106   0   13  22136.64 
1 1 26-03-2010  51.45  2.732  -2000  -500  -100  -500  -700  211.0180424  8.106   0   13  26229.21 
1 1 2/4/2010  62.27  2.719  -2000  -500  -100  -500  -700  210.8204499  7.808   0   13  57258.43 

Test:

>head(test) 
Store Dept Date Temperature Fuel_Price MarkDown1 MarkDown2 MarkDown3 MarkDown4 MarkDown5 CPI  Unemployment IsHoliday Rank Weekly_Sales 
1 1 12/2/2010  38.51  2.548  -2000  -500  -100  -500  -700 211.2421698 8.106  1  13  46039.49 
1 1 19-02-2010  39.93  2.514  -2000  -500  -100  -500  -700 211.2891429 8.106  0  13  41595.55 
1 1 12/3/2010  57.79  2.667  -2000  -500  -100  -500  -700 211.3806429 8.106  0  13  21043.39 
1 1 7/5/2010  72.55  2.835  -2000  -500  -100  -500  -700 210.3399684 7.808  0  13  17413.94 
1 1 21-05-2010  76.44  2.826  -2000  -500  -100  -500  -700 210.6170934 7.808  0  13  14773.04 
1 1 28-05-2010  80.44  2.759  -2000  -500  -100  -500  -700 210.8967606 7.808  0  13  15580.43 

The code I am using looks like this:

>library(neuralnet) 



>n <-neuralnet(Weekly_Sales~Temperature+Fuel_Price+MarkDown1+MarkDown2+MarkDown3+MarkDown4+MarkDown5+CPI+Unemployment+IsHoliday+Rank,data= train,hidden=c(4,3),err.fct="sse",linear.output=FALSE) 
>plot(n) 
>output <- compute(n,test[,4:14]) 
>output1 <- output$net.result*(max(test$Weekly_Sales)-min(test$Weekly_Sales))+min(test$Weekly_Sales) 

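The neural network trains, but it reports an error on the order of 10^13. In addition, I get the same output every time I run the code, and the predictions are not even close to the actual Weekly_Sales in the test data. I have also tried the dataset of another department and still get the same predictions.

Output: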

>head(output$net.result) 
     [,1] 
2 0.9999999998 
3 0.9999999998 
6 0.9999999998 
14 0.9999999998 
16 0.9999999998 
17 0.9999999998 



> head(output1) 
    [,1] 
2 149743.97 
3 149743.97 
6 149743.97 
14 149743.97 
16 149743.97 
17 149743.97 

Answer


You need to normalize your data before applying neuralnet(). So, before splitting train1 into train/test, use the following code:

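# Scale only the numeric columns: Date is not numeric, so apply() over every column would fail
num_cols <- sapply(train1, is.numeric)
maximum <- apply(train1[num_cols], 2, max)
minimum <- apply(train1[num_cols], 2, min)
train1_scaled <- train1
train1_scaled[num_cols] <- as.data.frame(scale(train1[num_cols], center = minimum, scale = maximum - minimum))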
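
As a minimal sketch of what this min-max step does, the snippet below scales a small made-up data frame. `min_max_scale` and `toy` are hypothetical names used only for illustration; they are not part of neuralnet. The sketch also guards against constant columns, where max - min is zero and the division would otherwise produce NaN. In the rows shown in the question, MarkDown1 to MarkDown5 and Rank look constant, so this may matter for the real data as well.

```r
# Hypothetical helper: min-max scale the numeric columns of a data frame to [0, 1].
min_max_scale <- function(df) {
  num_cols <- names(df)[sapply(df, is.numeric)]
  mins <- sapply(df[num_cols], min)
  maxs <- sapply(df[num_cols], max)
  rng <- maxs - mins
  rng[rng == 0] <- 1  # constant column: avoid 0/0; the column becomes all zeros
  out <- df
  out[num_cols] <- as.data.frame(scale(df[num_cols], center = mins, scale = rng))
  list(data = out, mins = mins, maxs = maxs)
}

# Made-up toy data, loosely shaped like the columns in the question.
toy <- data.frame(
  Date = c("5/2/2010", "12/2/2010", "19/02/2010"),
  Temperature = c(42.31, 38.51, 39.93),
  MarkDown1 = c(-2000, -2000, -2000),
  Weekly_Sales = c(24924.50, 46039.49, 41595.55)
)
scaled <- min_max_scale(toy)
print(scaled$data)
```

Returning the minima and maxima alongside the scaled data keeps the values needed to convert predictions back to the original scale later.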

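Then split the data with your own code (applied to train1_scaled instead of train1) and fit the network as follows:

# linear.output should be TRUE because you are predicting a continuous dependent variable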
n <- neuralnet(Weekly_Sales~Temperature+Fuel_Price+MarkDown1+MarkDown2+MarkDown3+MarkDown4+MarkDown5+CPI+Unemployment+IsHoliday+Rank,data= train,hidden=c(4,3),err.fct="sse",linear.output=TRUE) 
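
To illustrate why linear.output matters here, the sketch below does not call neuralnet at all. With linear.output=FALSE the activation function is also applied to the output neuron, and neuralnet's default activation is the logistic function, so every prediction lies between 0 and 1. The pre-activation values in `z` are made up for illustration.

```r
# plogis() is base R's logistic (sigmoid) function.
z <- c(-5, 0, 5, 25, 500)   # made-up pre-activation values
out <- plogis(z)
print(out)                  # every value stays within [0, 1]

# A target on the scale of the unscaled Weekly_Sales cannot be reached by such
# an output, so the squared error per row stays very large however long the
# network trains.
target <- 24924.50          # first Weekly_Sales value shown in the question
print((target - max(out))^2)
```

This is consistent with the constant 0.9999999998 predictions in the question: with large unscaled inputs, the output unit is likely pushed far into the flat upper region of the logistic curve.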

The code below also needs a slight modification:

#basically to convert it back to non-scaled version, you need to do it using non-scaled original data not 'test' dataset 
output1 <- output$net.result*(max(train1$Weekly_Sales)-min(train1$Weekly_Sales))+min(train1$Weekly_Sales) 

#also the dependent variable in test dataset will need conversion 
test$Weekly_Sales_nonScaled <- test$Weekly_Sales*(max(train1$Weekly_Sales)-min(train1$Weekly_Sales))+min(train1$Weekly_Sales) 

#After this you can compare original data (test$Weekly_Sales_nonScaled) with predicted data (output1) 
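
As a rough sketch of the back-conversion and comparison described above, the snippet below uses made-up numbers in place of output$net.result and the scaled test target. `unscale` and `rmse` are hypothetical helper names, and `sales_min` and `sales_max` are assumed values, not figures taken from the real dataset.

```r
# Hypothetical helper: invert min-max scaling using the ORIGINAL data's min and max.
unscale <- function(x, min_val, max_val) x * (max_val - min_val) + min_val

# Hypothetical helper: root mean squared error between actual and predicted values.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

sales_min <- 10000   # assumed stand-in for min(train1$Weekly_Sales)
sales_max <- 60000   # assumed stand-in for max(train1$Weekly_Sales)

pred_scaled   <- c(0.30, 0.70, 0.65)   # stand-in for output$net.result
actual_scaled <- c(0.25, 0.75, 0.60)   # stand-in for the scaled test$Weekly_Sales

pred   <- unscale(pred_scaled, sales_min, sales_max)
actual <- unscale(actual_scaled, sales_min, sales_max)
print(data.frame(actual = actual, predicted = pred))
print(rmse(actual, pred))
```

An error measured on the original scale, such as this RMSE, is usually easier to interpret than the SSE that neuralnet reports on the scaled data.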

Please don't forget to let us know whether this helped :)