WEKA classification: class probabilities

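I would like to know whether WEKA has a way to output several "best guesses" for a classification.

My scenario is this: I classify my data, for example with cross-validation, and then WEKA's output gives me something like "these are the 3 best guesses for the classification of this instance". What I want is that, even if an instance is not classified correctly, I still get its 3 or 5 best guesses as output.

Example:

Classes: A, B, C, D, E. Instances: 1 ... 10

And the output would be: instance 1 is 90% likely to be class A, 75% likely to be class B, and 60% likely to be class C.

Thanks.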

2012-08-14 user1454263

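I don't know whether you can do this natively, but you can get the probability of each class, sort them, and take the top three.

The function you want is distributionForInstance(Instance instance), which returns a double[] giving the probability of each class.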
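The "sort them and take the top three" step could be sketched as follows. This is only a minimal illustration in plain Java: the class name `TopPredictions`, the helper `topN`, and the sample distribution are made up for the example, and the array stands in for whatever distributionForInstance returns for your classifier.

```java
import java.util.Arrays;
import java.util.Comparator;

public class TopPredictions {

    /** Returns the indices of the n largest entries, highest probability first. */
    public static int[] topN(double[] distribution, int n) {
        Integer[] indices = new Integer[distribution.length];
        for (int i = 0; i < indices.length; i++) {
            indices[i] = i;
        }
        // Sort the class indices by descending probability.
        Arrays.sort(indices,
                Comparator.comparingDouble((Integer i) -> distribution[i]).reversed());

        // Keep at most n indices (fewer if there are fewer classes).
        int count = Math.min(n, indices.length);
        int[] top = new int[count];
        for (int k = 0; k < count; k++) {
            top[k] = indices[k];
        }
        return top;
    }

    public static void main(String[] args) {
        // Hypothetical distribution over five classes A..E (indices 0..4).
        double[] distribution = {0.05, 0.50, 0.10, 0.30, 0.05};
        System.out.println(Arrays.toString(topN(distribution, 3))); // prints [1, 3, 2]
    }
}
```

The returned indices are positions in the class attribute, so they still have to be mapped back to class labels.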

2012-08-14 20:57:19 Antimony

OK, thanks, I'll try it out. – user1454263 2012-08-17 12:36:21

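Not in general. The information you need is not available from every classifier. In most cases (for decision trees, for example) the decision is crisp (although possibly wrong) and carries no confidence value. Your task needs a classifier that can handle uncertainty, such as naive Bayes.

Technically, the easiest thing to do is probably to train the model and then classify a single instance, for which Weka should give you the output you need. In general you can do this for a set of instances as well, but I don't think Weka provides that out of the box. You will probably need to write custom code or use it through an API (for example in R).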

2012-08-14 21:00:13

I am going to use it through the API. – user1454263 2012-08-17 12:37:04

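When you calculate the probabilities for an instance, how exactly do you do it?

I have posted my PART rules and the data for the new instance here, but I am not sure how to do the calculation by hand. Thanks!

Edit: it is now calculated with: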

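private float[] getProbDist(String split) {
    // Takes in something such as (52/2), which means 52 instances were
    // correctly classified and 2 were incorrectly classified.

    // Assumption: the post does not show how prob_dis is built; here the
    // parentheses are stripped and the string is split on "/".
    String[] prob_dis = split.replaceAll("[()\\s]", "").split("/");

    if (prob_dis.length > 2)
        return null;

    // A single count n is handled as the pair ("1", n), as posted.
    if (prob_dis.length == 1) {
        String temp = prob_dis[0];
        prob_dis = new String[2];
        prob_dis[0] = "1";
        prob_dis[1] = temp;
    }

    float p1 = Float.parseFloat(prob_dis[0]);
    float p2 = Float.parseFloat(prob_dis[1]);
    // Assumes two tags.
    float[] tag_prob = new float[2];

    // Compute tag_prob[0] first so that tag_prob[1] is its complement.
    tag_prob[0] = p2 / p1;
    tag_prob[1] = 1 - tag_prob[0];

    // Returns a float[] holding the probabilities.
    return tag_prob;
}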

2012-08-20 21:45:01 redrubia

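Weka's API has a method called Classifier.distributionForInstance() that can be used to get the prediction distribution of a classification. You can then sort the distribution by decreasing probability to get the top-N predictions.

Below is a function that prints out: (1) the ground-truth label of the test instance; (2) the predicted label from classifyInstance(); and (3) the prediction distribution from distributionForInstance(). I have used it with J48, but it should work with other classifiers.

The input parameters are the serialized model file (which you can create during the model training phase by applying the -d option) and the test file in ARFF format.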

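import weka.classifiers.Classifier;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public void test(String modelFileSerialized, String testFileARFF) 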
    throws Exception 
{ 
    // Deserialize the classifier. 
    Classifier classifier = 
     (Classifier) weka.core.SerializationHelper.read(
      modelFileSerialized); 

    // Load the test instances. 
    Instances testInstances = DataSource.read(testFileARFF); 

    // Mark the last attribute in each instance as the true class. 
    testInstances.setClassIndex(testInstances.numAttributes()-1); 

    int numTestInstances = testInstances.numInstances(); 
    System.out.printf("There are %d test instances\n", numTestInstances); 

    // Loop over each test instance. 
    for (int i = 0; i < numTestInstances; i++) 
    { 
     // Get the true class label from the instance's own classIndex. 
     String trueClassLabel = 
      testInstances.instance(i).toString(testInstances.classIndex()); 

     // Make the prediction here. 
     double predictionIndex = 
      classifier.classifyInstance(testInstances.instance(i)); 

     // Get the predicted class label from the predictionIndex. 
     String predictedClassLabel = 
      testInstances.classAttribute().value((int) predictionIndex); 

     // Get the prediction probability distribution. 
     double[] predictionDistribution = 
      classifier.distributionForInstance(testInstances.instance(i)); 

     // Print out the true label, predicted label, and the distribution. 
     System.out.printf("%5d: true=%-10s, predicted=%-10s, distribution=", 
          i, trueClassLabel, predictedClassLabel); 

     // Loop over all the prediction labels in the distribution. 
     for (int predictionDistributionIndex = 0; 
      predictionDistributionIndex < predictionDistribution.length; 
      predictionDistributionIndex++) 
     { 
      // Get this distribution index's class label. 
      String predictionDistributionIndexAsClassLabel = 
       testInstances.classAttribute().value(
        predictionDistributionIndex); 

      // Get the probability. 
      double predictionProbability = 
       predictionDistribution[predictionDistributionIndex]; 

      System.out.printf("[%10s : %6.3f]", 
           predictionDistributionIndexAsClassLabel, 
           predictionProbability); 
     } 

     System.out.printf("\n"); 
    } 
}
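To print only the best few guesses with their labels instead of the whole distribution, the inner loop could be replaced by a ranking step along these lines. This is a rough sketch, not part of the answer above: `RankedLabels`, `rank`, and the sample values are hypothetical, and the `labels` array is assumed to have been filled beforehand, for instance from testInstances.classAttribute().value(i) for each class index i.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class RankedLabels {

    /** Formats the n most probable classes as "label=0.500", most probable first. */
    public static List<String> rank(String[] labels, double[] distribution, int n) {
        // Collect the class indices and order them by descending probability.
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < distribution.length; i++) {
            order.add(i);
        }
        order.sort((a, b) -> Double.compare(distribution[b], distribution[a]));

        // Format the first n entries (or fewer if there are fewer classes).
        List<String> ranked = new ArrayList<>();
        int count = Math.min(n, order.size());
        for (int k = 0; k < count; k++) {
            int idx = order.get(k);
            ranked.add(String.format(Locale.ROOT, "%s=%.3f", labels[idx], distribution[idx]));
        }
        return ranked;
    }

    public static void main(String[] args) {
        // Hypothetical class labels and distribution for one test instance.
        String[] labels = {"A", "B", "C", "D", "E"};
        double[] distribution = {0.05, 0.50, 0.10, 0.30, 0.05};
        System.out.println(rank(labels, distribution, 3)); // prints [B=0.500, D=0.300, C=0.100]
    }
}
```

Because the ranking depends only on the distribution, it gives the same kind of output whether or not the top prediction matches the true label.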

2012-08-25 16:11:15 stackoverflowuser2010
