How to use the SOM algorithm for classification prediction

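I want to see whether the SOM algorithm can be used for classification prediction. I used the code below, but the classification results are far from correct. For example, on the test dataset I get many more distinct values than the 3 values in the training target variable. How can I create a prediction model that is consistent with the training target variable?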

library(kohonen)
library(HDclassif)
data(wine)
set.seed(7)

training <- sample(nrow(wine), 120)
Xtraining <- scale(wine[training, ])
Xtest <- scale(wine[-training, ],
               center = attr(Xtraining, "scaled:center"),
               scale = attr(Xtraining, "scaled:scale"))

som.wine <- som(Xtraining, grid = somgrid(5, 5, "hexagonal"))

som.prediction$pred <- predict(som.wine, newdata = Xtest,
                               trainX = Xtraining,
                               trainY = factor(Xtraining$class))

And the result:

$unit.classif 

[1] 7 7 1 7 1 11 6 2 2 7 7 12 11 11 12 2 7 7 7 1 2 7 2 16 20 24 25 16 13 17 23 22 
[33] 24 18 8 22 17 16 22 18 22 22 18 23 22 18 18 13 10 14 15 4 4 14 14 15 15 4


2017-07-14 mql4beginner

This might help:

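SOM is an unsupervised algorithm, so you should not expect it to be trained on the class labels in the dataset (if it were, it would need that information to work and would be useless on unlabelled datasets).
The idea is that it "converts" each input numeric vector into a network unit number (run your code again with a 1 by 3 grid, that is 3 units, and you will get the output you expected).
You then need to convert those network unit numbers back into the categories you are looking for (this is the key part missing from your code).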
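The conversion step in the last point can be sketched in base R as a majority vote: each unit gets the class that occurs most often among the training samples mapped to it. The vectors `train_units`, `train_classes`, and `test_units` are made-up toy values for illustration, not output from kohonen.

```r
# Toy data (hypothetical): unit assigned to each training sample, and its known class
train_units   <- c(1, 1, 1, 2, 2, 3, 3, 3)
train_classes <- c(1, 1, 2, 2, 2, 3, 3, 1)

# Majority vote per unit: the most frequent training class among its samples
unit_class <- sapply(split(train_classes, train_units), function(cls) {
    as.numeric(names(which.max(table(cls))))
})

# Units assigned to new samples; unit 4 never occurred during training
test_units <- c(3, 1, 2, 4)

# Look up each unit's class; units without training samples get 0 (ambiguous)
predicted <- unname(unit_class[as.character(test_units)])
predicted[is.na(predicted)] <- 0
print(predicted)
```

Building the lookup from training samples only keeps the test labels out of the model, so the accuracy measured afterwards is not inflated.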

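The reproducible example below prints a standard classification score (the fraction of correctly classified test samples). It includes one possible implementation of the "convert back" part that is missing from your original post.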

Note that for this particular dataset the model overfits pretty quickly: 3 units give the best results.

# Load the SOM implementation and the dataset (wine comes from HDclassif)
library(kohonen)
library(HDclassif)

# Set and scale a training set (-1 drops the class column)
data(wine)
set.seed(7)
training <- sample(nrow(wine), 120)
Xtraining <- scale(wine[training, -1])

# Scale the test set with the training centers and scales (-1 drops the class column)
Xtest <- scale(wine[-training, -1],
               center = attr(Xtraining, "scaled:center"),
               scale = attr(Xtraining, "scaled:scale"))

# Set the 2D grid resolution
# WARNING: it overfits pretty quickly
# Accuracy was about 36% for 1 unit, 63% for 2, 93% for 3, 89% for 4
som_grid <- somgrid(xdim = 1, ydim = 3, topo = "hexagonal")

# Train the model
som_model <- som(Xtraining, som_grid)

# Make a prediction on the test data
som.prediction <- predict(som_model, newdata = Xtest)

# Put together the original classes and the SOM unit assignments
error.df <- data.frame(real = wine[-training, 1],
                       predicted = som.prediction$unit.classif)

# For each unit, return the class most strongly associated with it in the
# training data (0 stands for ambiguous)
unit_to_class <- sapply(unique(som_model$unit.classif), function(x, df) {
    cls <- as.numeric(names(which.max(table(df[df$predicted == x, 1]))))
    if (length(cls) < 1) {
        cls <- 0
    }
    return(c(x, cls))
}, df = data.frame(real = wine[training, 1], predicted = som_model$unit.classif))

# Translate unit numbers into classes
error.df$corrected <- apply(error.df, MARGIN = 1, function(x, lookup) {
    cls <- lookup[2, which(lookup[1, ] == x["predicted"])]
    if (length(cls) < 1) {
        cls <- 0
    }
    return(cls)
}, lookup = unit_to_class)

# Compute the classification accuracy (fraction of correct predictions)
sum(error.df$corrected == error.df$real) / length(error.df$real)
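
A single score hides which classes get confused with each other. As a minimal base R sketch, a confusion matrix can be built from two vectors shaped like `error.df$real` and `error.df$corrected`. The values below are toy data chosen for illustration, not results from the wine dataset.

```r
# Toy values (hypothetical) standing in for error.df$real and error.df$corrected
real      <- c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3)
corrected <- c(1, 1, 2, 2, 2, 2, 3, 3, 1, 0)

# Rows are true classes, columns are predicted classes (0 = no class assigned)
confusion <- table(real = real, predicted = corrected)
print(confusion)

# Overall accuracy: fraction of samples whose prediction matches the true class
accuracy <- mean(corrected == real)

# Per-class recall: correct predictions divided by the number of samples in that class
recall <- sapply(split(corrected == real, real), mean)
print(accuracy)
print(recall)
```

The column for 0 makes it visible when test samples land on units that received no class during training.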


2017-07-22 16:47:13 kdallaporta

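Thanks @Kevin Dallaporta for the code example. I have 2 questions. First, I used trainY = factor(Xtraining$class), but I do not see it in your predict call. Second, how can I attach the class prediction results to the test dataset? – mql4beginner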

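I'm glad it helped! It seems the trainY = factor(...) argument existed in version 2.x of kohonen and disappeared in version 3.x. I don't know what it was supposed to do, but the return value is the same with or without it, and there is no trace of it in the March 2017 documentation. In the code I provided, the prediction results are in error.df$corrected, so you can attach them to the test set: test$predicted <- error.df$corrected – kdallaporta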
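
The assignment in the last comment assumes a `test` data frame whose rows are in the same order as `error.df`; with the answer's code that would be `wine[-training, ]`. A small sketch with toy stand-ins (both data frames and the `alcohol` column are hypothetical):

```r
# Toy stand-ins (hypothetical) for the held-out rows and the prediction table
test <- data.frame(class = c(1, 2, 3), alcohol = c(13.2, 12.4, 13.7))
error.df <- data.frame(real = c(1, 2, 3), corrected = c(1, 2, 2))

# Rows must line up, so check the sizes and the true classes first
stopifnot(nrow(test) == nrow(error.df), all(test$class == error.df$real))

# Attach the predicted class as a new column
test$predicted <- error.df$corrected
print(test)
```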
