Why do we use training data in the Weka evaluation function?


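I am using Weka for classification, with separate training and test datasets. I noticed that when we evaluate on the test dataset, we still pass the training data to the Evaluation constructor. Does anyone know why the training data is used there rather than the test data? In the code below, why do we pass trains on line 6 and not tests?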

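    import weka.classifiers.Evaluation;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    // The question's line numbers are kept as trailing comments.
    // model is a weka.classifiers.Classifier created elsewhere.
    DataSource trainsource = new DataSource(train_file_path);  // 1
    Instances trains = trainsource.getDataSet();               // 2
    trains.setClassIndex(0);                                   // 3

    DataSource testsource = new DataSource(test_file_path);    // 4
    Instances tests = testsource.getDataSet();                 // 5
    tests.setClassIndex(0);  // added: the test set needs its class index set too

    Evaluation evaluation = new Evaluation(trains);            // 6
    model.buildClassifier(trains);                             // 7
    evaluation.evaluateModel(model, tests);                    // 8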

Thanks in advance!


2016-05-14 Sangeeta

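Because that is how machine learning works: a classifier learns by being trained on "training data". WEKA typically reads training files in ARFF format. A training file lists many instances under its declared attributes. An example training file: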

@relation maitre 

@attribute patrons {none, some, full} 
@attribute waitEstation {0-10,10-30,30-60,>60} 
@attribute reservation {TRUE, FALSE} 
@attribute bar {TRUE, FALSE} 
@attribute alternative {TRUE, FALSE} 
@attribute sit {yes, no} 

@data 
some,0-10,TRUE,FALSE,TRUE,yes 
full,30-60,FALSE,FALSE,TRUE,no 
some,0-10,FALSE,TRUE,FALSE,yes 
full,10-30,FALSE,FALSE,TRUE,yes 
full,>60,TRUE,FALSE,TRUE,no 
some,0-10,TRUE,TRUE,FALSE,yes 
none,0-10,FALSE,TRUE,FALSE,no 
some,0-10,TRUE,FALSE,FALSE,yes 
full,>60,FALSE,TRUE,FALSE,no 
full,10-30,TRUE,TRUE,TRUE,yes 
none,0-10,FALSE,FALSE,FALSE,no 
full,30-60,FALSE,TRUE,TRUE,no

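The classifier can be of any type, for example Naive Bayes, J48, or SVM. Once the classifier has been trained on the training dataset, WEKA turns it into a "model". You can then use that model to validate against your "test set". So the test data is used to validate the model.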
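To make the split between training and validating concrete, here is a minimal plain-Java sketch. It is not WEKA code and not how J48 or Naive Bayes work internally: the class TrainTestSketch and its methods are hypothetical names, and the "classifier" is a deliberately simple rule that predicts, for each value of patrons, the most frequent sit value seen during training. It trains on the first 8 rows of the example file above and measures accuracy on the remaining 4 rows, which the model never saw. For comparison, as far as the WEKA Javadoc describes it, the Evaluation constructor receives the training set so it can read the dataset structure and class priors, while evaluateModel receives the test set.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only: a one-attribute majority rule, not a WEKA classifier.
public class TrainTestSketch {

    // (patrons, sit) pairs from rows 1-8 of the example training file.
    static final String[][] TRAIN = {
        {"some", "yes"}, {"full", "no"}, {"some", "yes"}, {"full", "yes"},
        {"full", "no"}, {"some", "yes"}, {"none", "no"}, {"some", "yes"}
    };

    // (patrons, sit) pairs from rows 9-12, held out for testing.
    static final String[][] TEST = {
        {"full", "no"}, {"full", "yes"}, {"none", "no"}, {"full", "no"}
    };

    // "Training": for each patrons value, remember the most frequent class.
    static Map<String, String> train(String[][] rows) {
        Map<String, Map<String, Integer>> counts = new HashMap<>();
        for (String[] r : rows) {
            counts.computeIfAbsent(r[0], k -> new HashMap<>()).merge(r[1], 1, Integer::sum);
        }
        Map<String, String> model = new HashMap<>();
        for (Map.Entry<String, Map<String, Integer>> e : counts.entrySet()) {
            String majority = Collections.max(
                e.getValue().entrySet(), Map.Entry.comparingByValue()).getKey();
            model.put(e.getKey(), majority);
        }
        return model;
    }

    // "Evaluation": fraction of test rows whose predicted class equals the true class.
    static double accuracy(Map<String, String> model, String[][] rows) {
        int correct = 0;
        for (String[] r : rows) {
            if (r[1].equals(model.get(r[0]))) {
                correct++;
            }
        }
        return (double) correct / rows.length;
    }

    public static void main(String[] args) {
        Map<String, String> model = train(TRAIN);  // learn only from training rows
        System.out.println("Held-out accuracy: " + accuracy(model, TEST));
    }
}
```

The point of the design is that train never sees TEST, so the accuracy reflects how the learned rule behaves on unseen rows.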

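Once the classifier has been trained on the training dataset above, it can predict unknown classes. For example, if you want to predict the attribute 'sit', you need test data like the following: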

@relation maitretest 

@attribute patrons {none, some, full} 
@attribute waitEstation {0-10,10-30,30-60,>60} 
@attribute reservation {TRUE, FALSE} 
@attribute bar {TRUE, FALSE} 
@attribute alternative {TRUE, FALSE} 
@attribute sit {yes, no} 

@data 
some,0-10,TRUE,FALSE,TRUE,? 
full,30-60,FALSE,FALSE,TRUE,?

Note the ? marking the position of the attribute 'sit'. You can now predict the unknown class. Hope this clears up your doubt :)
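As a rough illustration of filling in the ? values, the sketch below reads the @data rows as plain comma-separated strings, learns from all 12 labelled training rows, and predicts a class for each unlabelled row. It is plain Java, not the WEKA API: PredictUnknownSketch, train, and predict are hypothetical names, and the rule used (most frequent 'sit' value per 'patrons' value) is only a stand-in for a real classifier such as J48.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only: predicts the last field of a data row
// from its first field, using counts gathered from labelled rows.
public class PredictUnknownSketch {

    // The 12 @data rows of the training file, with class 'sit' as the last field.
    static final String[] TRAIN_LINES = {
        "some,0-10,TRUE,FALSE,TRUE,yes", "full,30-60,FALSE,FALSE,TRUE,no",
        "some,0-10,FALSE,TRUE,FALSE,yes", "full,10-30,FALSE,FALSE,TRUE,yes",
        "full,>60,TRUE,FALSE,TRUE,no", "some,0-10,TRUE,TRUE,FALSE,yes",
        "none,0-10,FALSE,TRUE,FALSE,no", "some,0-10,TRUE,FALSE,FALSE,yes",
        "full,>60,FALSE,TRUE,FALSE,no", "full,10-30,TRUE,TRUE,TRUE,yes",
        "none,0-10,FALSE,FALSE,FALSE,no", "full,30-60,FALSE,TRUE,TRUE,no"
    };

    // Count classes per 'patrons' value and keep the most frequent one.
    static Map<String, String> train(String[] dataLines) {
        Map<String, Map<String, Integer>> counts = new HashMap<>();
        for (String line : dataLines) {
            String[] f = line.trim().split(",");
            counts.computeIfAbsent(f[0], k -> new HashMap<>())
                  .merge(f[f.length - 1], 1, Integer::sum);
        }
        Map<String, String> model = new HashMap<>();
        for (Map.Entry<String, Map<String, Integer>> e : counts.entrySet()) {
            model.put(e.getKey(), Collections.max(
                e.getValue().entrySet(), Map.Entry.comparingByValue()).getKey());
        }
        return model;
    }

    // Predict the class of a row whose last field is '?'; null if the value was never seen.
    static String predict(Map<String, String> model, String dataLine) {
        return model.get(dataLine.trim().split(",")[0]);
    }

    public static void main(String[] args) {
        Map<String, String> model = train(TRAIN_LINES);
        String[] unlabelled = {"some,0-10,TRUE,FALSE,TRUE,?", "full,30-60,FALSE,FALSE,TRUE,?"};
        for (String row : unlabelled) {
            System.out.println(row + " -> " + predict(model, row));
        }
    }
}
```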


2016-05-14 11:08:54
