2014-11-23 67 views
4

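Cannot handle multi-valued nominal class - Java

I want to pass an .arff file to a LinearRegression object, but doing so throws the exception "Cannot handle multi-valued nominal class!".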

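What actually happens is this: I perform attribute selection with CfsSubsetEval as the evaluator and GreedyStepwise as the search method, and then pass the selected attributes to LinearRegression as follows: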

LinearRegression rl=new LinearRegression(); rl.buildClassifier(data);  

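Here data is the Instances object holding the data from the .arff file, which was previously converted to nominal values using Weka. What am I doing wrong? I searched Google for this error but could not find anything.

Code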

package com.attribute; 

import java.io.BufferedReader; 
import java.io.FileReader; 
import java.util.Random; 

import weka.attributeSelection.AttributeSelection; 
import weka.attributeSelection.CfsSubsetEval; 
import weka.attributeSelection.GreedyStepwise; 
import weka.classifiers.Evaluation; 
import weka.classifiers.functions.LinearRegression; 
import weka.classifiers.meta.AttributeSelectedClassifier; 
import weka.classifiers.trees.J48; 
import weka.core.Instances; 
import weka.core.Utils; 
import weka.filters.supervised.attribute.NominalToBinary; 

/** 
* performs attribute selection using CfsSubsetEval and GreedyStepwise 
* (backwards) and trains J48 with that. Needs 3.5.5 or higher to compile. 
* 
* @author FracPete (fracpete at waikato dot ac dot nz) 
*/ 
public class AttributeSelectionTest2 { 

    /** 
    * uses the meta-classifier 
    */ 
    protected static void useClassifier(Instances data) throws Exception { 
     System.out.println("\n1. Meta-classfier"); 
     AttributeSelectedClassifier classifier = new AttributeSelectedClassifier(); 
     CfsSubsetEval eval = new CfsSubsetEval(); 
     GreedyStepwise search = new GreedyStepwise(); 
     search.setSearchBackwards(true); 
     J48 base = new J48(); 
     classifier.setClassifier(base); 
     classifier.setEvaluator(eval); 
     classifier.setSearch(search); 
     Evaluation evaluation = new Evaluation(data); 
     evaluation.crossValidateModel(classifier, data, 10, new Random(1)); 
     System.out.println(evaluation.toSummaryString()); 
    } 

    /** 
    * uses the low level approach 
    */ 
    protected static void useLowLevel(Instances data) throws Exception { 
     System.out.println("\n3. Low-level"); 
     AttributeSelection attsel = new AttributeSelection(); 
     CfsSubsetEval eval = new CfsSubsetEval(); 
     GreedyStepwise search = new GreedyStepwise(); 
     search.setSearchBackwards(true); 
     attsel.setEvaluator(eval); 
     attsel.setSearch(search); 
     attsel.SelectAttributes(data); 
     int[] indices = attsel.selectedAttributes(); 
     System.out.println("selected attribute indices (starting with 0):\n" 
       + Utils.arrayToString(indices)); 
     useLinearRegression(indices, data); 
    } 

    protected static void useLinearRegression(int[] indices, Instances data) throws Exception{ 
     System.out.println("\n 4. Linear-Regression on above selected attributes"); 

     BufferedReader reader = new BufferedReader(new FileReader(
       "C:/Entertainement/MS/Fall 2014/spdb/project 4/healthcare.arff")); 
     Instances data1 = new Instances(reader); 
     data.setClassIndex(data.numAttributes() - 1); 
     /*NominalToBinary nb = new NominalToBinary(); 
     for(int i=0;i<=20; i++){ 
     //Still coding left here, create an Instance variable to store the data from 'data' variable for given indices 
      Instances data_lr=data1. 
     }*/ 
     LinearRegression rl=new LinearRegression(); //Creating a LinearRegression Object to pass data1 
     rl.buildClassifier(data1); 
    } 
    /** 
    * takes a dataset as first argument 
    * 
    * @param args 
    *   the commandline arguments 
    * @throws Exception 
    *    if something goes wrong 
    */ 
    public static void main(String[] args) throws Exception { 
     // load data 
     System.out.println("\n0. Loading data"); 
     BufferedReader reader = new BufferedReader(new FileReader(
       "C:/Entertainement/MS/Fall 2014/spdb/project 4/healthcare.arff")); 
     Instances data = new Instances(reader); 

     if (data.classIndex() == -1) 
      data.setClassIndex(data.numAttributes() - 14); 

     // 1. meta-classifier 
     useClassifier(data); 

     // 2. filter 
     //useFilter(data); 

     // 3. low-level 
     useLowLevel(data); 
    } 
} 

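Note: since I have not yet written the code that builds an Instances object from the attributes in "indices", I am loading the data from the same original file again (just so the program runs).

I don't know how to upload the sample data file, but it looks like this. Based on your data, it looks like your last attribute is a nominal data type (it mostly contains numbers, but there are some strings as well). [link](https://scontent-a-dfw.xx.fbcdn.net/hphotos-xfa1/t31.0-8/p552x414/10496920_756438941076936_8448023649960186530_o.jpg)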

+0

If it helps, the exception raised is **weka.core.UnsupportedAttributeTypeException** [link](http://weka.sourceforge.net/doc.dev/weka/core/UnsupportedAttributeTypeException.html) – Sashi 2014-11-23 21:08:55

+0

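To make it possible to help, could you upload a minimal reproducible example? That means a small sample of your dataset plus just enough code to recreate the error. It would make this much easier! – Walter 2014-11-23 21:28:52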

+0

@Walter Please check the edited question. – Sashi 2014-11-23 21:46:20

Answers

4

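LinearRegression does not allow predicting a nominal class.

One way to make sure your dataset works is to run it through the Weka Explorer with LinearRegression and check whether it produces the expected result. Once it does, the data will most likely work in your code as well.

Hope this helps!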
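As a quick way to spot a nominal class before running the classifier, the header of the .arff file can be inspected directly. The following is only a minimal sketch in plain Java that does not use Weka: the class name `ArffClassCheck` and its helper methods are hypothetical, and it assumes unquoted attribute names without spaces. With Weka on the classpath, the equivalent check would be `data.classAttribute().isNumeric()` after the class index has been set.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Weka: reads only the ARFF header text.
class ArffClassCheck {

    /** Returns the declared type of every @ATTRIBUTE line, in declaration order. */
    static List<String> attributeTypes(String arffText) {
        List<String> types = new ArrayList<>();
        for (String raw : arffText.split("\\R")) {
            String line = raw.trim();
            String lower = line.toLowerCase();
            if (lower.startsWith("@data")) {
                break; // the header ends where the data section starts
            }
            if (lower.startsWith("@attribute")) {
                // Assumption: attribute names are unquoted and contain no spaces.
                String[] parts = line.split("\\s+", 3);
                if (parts.length == 3) {
                    types.add(parts[2].trim());
                }
            }
        }
        return types;
    }

    /** True if the attribute at classIndex is declared with a numeric type. */
    static boolean isNumericClass(String arffText, int classIndex) {
        String type = attributeTypes(arffText).get(classIndex).toLowerCase();
        return type.equals("numeric") || type.equals("real") || type.equals("integer");
    }

    public static void main(String[] args) {
        String numericHeader = String.join("\n",
                "@RELATION house",
                "@ATTRIBUTE houseSize NUMERIC",
                "@ATTRIBUTE sellingPrice NUMERIC",
                "@DATA");
        String nominalHeader = String.join("\n",
                "@RELATION demo",
                "@ATTRIBUTE houseSize NUMERIC",
                "@ATTRIBUTE priceBand {low,medium,high}",
                "@DATA");
        System.out.println(isNumericClass(numericHeader, 1)); // true
        System.out.println(isNumericClass(nominalHeader, 1)); // false
    }
}
```

If the check returns false for the attribute you intend to predict, the file declares that attribute as nominal (a `{...}` value list), which is the situation the "Cannot handle multi-valued nominal class!" message describes.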

+0

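Thanks for the reply. In fact, I was able to pass another file to LinearRegression: an .arff file containing only two of the columns above. Now it gives me a different exception, **weka.classifiers.functions.LinearRegression: Class attribute not set!**. The data in that file is purely numeric. Can you help? – Sashi 2014-11-23 22:47:39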

+0

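You may need to set the class index. Perhaps you could load the data into the Weka Explorer first, confirm that it works there, and then go back to the code once that is confirmed. – 2014-11-23 22:49:28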

+0

I'll try that and let you know. Thanks Matthew :) – Sashi 2014-11-23 23:08:29

0

Here is a sample dataset that works with linear regression (source):

@RELATION house 
@ATTRIBUTE houseSize NUMERIC 
@ATTRIBUTE lotSize NUMERIC 
@ATTRIBUTE bedrooms NUMERIC 
@ATTRIBUTE granite NUMERIC 
@ATTRIBUTE bathroom NUMERIC 
@ATTRIBUTE sellingPrice NUMERIC 

@DATA 
3529,9191,6,0,0,205000 
3247,10061,5,1,1,224900 
4032,10150,5,0,1,197900 
2397,14156,4,1,0,189900 
2200,9600,4,0,1,195000 
3536,19994,6,1,1,325000 
2983,9365,5,0,1,230000
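
Every attribute in this dataset, including the class sellingPrice, is NUMERIC, which is why linear regression can be applied to it. To illustrate what a fit on such data computes, here is a minimal sketch of one-predictor ordinary least squares in plain Java, using only the houseSize and sellingPrice columns from the rows above. It is not Weka's LinearRegression (which handles several attributes at once); the class name `SimpleOls` and the method `fit` are hypothetical.

```java
// Hypothetical illustration, not Weka's implementation:
// ordinary least squares for a single predictor, y ~ intercept + slope * x.
class SimpleOls {

    /** Returns {intercept, slope} that minimize the sum of squared errors. */
    static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0.0; // sum of (x - meanX) * (y - meanY)
        double sxx = 0.0; // sum of (x - meanX)^2
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        return new double[] {intercept, slope};
    }

    public static void main(String[] args) {
        // houseSize and sellingPrice columns copied from the @DATA rows above.
        double[] houseSize = {3529, 3247, 4032, 2397, 2200, 3536, 2983};
        double[] sellingPrice = {205000, 224900, 197900, 189900, 195000, 325000, 230000};

        double[] model = fit(houseSize, sellingPrice);
        System.out.printf("sellingPrice ~ %.2f + %.4f * houseSize%n", model[0], model[1]);
    }
}
```

The computation only makes sense because the target is a number; with a nominal target such as `{low,medium,high}` there is no mean or squared error to compute, which is the reason a regression scheme rejects that kind of class attribute.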