Convert CSV to ARFF

I'm working on a data mining school project, and we were given CSV data from Kaggle. This is what the data looks like (2 lines out of 6970):
4,1970,Female,150,DomesticPartnersKids,Bachelor's Degree,Democrat,,Yes,No,No,No,Yes,Public,No,Yes,No,Yes,No,No,Yes,Science,Study first,Yes,Yes,No,No,Receiving,No,No,Pragmatist,No,No,Cool headed,Standard hours,No,Happy,Yes,Yes,Yes,No,A.M.,No,End,Yes,No,Me,Yes,Yes,No,Yes,No,Mysterious,No,No,,,,,,,,,,Mac,Yes,Cautious,No,Umm...,No,Space,Yes,In-person,No,Yes,Yes,No,Yay people!,Yes,Yes,Yes,Yes,Yes,No,Yes,,,,,,,,,,,,,,,,,No,No,No,Only-child,Yes,No,No
5,1997,Male,75,Single,High School Diploma,Republican,,Yes,Yes,No,,Yes,Private,No,No,No,Yes,No,No,Yes,Science,Study first,,Yes,No,Yes,Receiving,No,Yes,Pragmatist,No,Yes,Cool headed,Odd hours,No,Right,Yes,No,No,Yes,A.M.,Yes,Start,Yes,Yes,Circumstances,No,Yes,No,Yes,Yes,Mysterious,No,No,Tunes,Technology,Yes,Yes,Yes,Yes,No,Supportive,No,PC,No,Cautious,No,Umm...,No,Space,No,In-person,No,No,Yes,Yes,Grrr people,Yes,No,No,No,No,No,No,Yes,No,No,Yes,No,Own,Pessimist,Mom,No,No,No,No,Nope,Yes,No,No,No,Yes,No,Yes,No,Yes,No
We need to convert it to .arff format so that we can use it in Weka. I typed in the header manually (107 attributes):
@ATTRIBUTE user_id NUMERIC
@ATTRIBUTE yob NUMERIC
@ATTRIBUTE gender {Male,Female}
@ATTRIBUTE income {150,100,75,50,25,10}
@ATTRIBUTE householdstatus {MarriedKids,Married,DomesticPartnersKids,DomesticPartners,Single,SingleKids}
@ATTRIBUTE educationlevel {Bachelor's Degree,High School Diploma,Current K-12,Current Undergraduate,Master's Degree,Associate's Degree,Doctoral Degree}
@ATTRIBUTE party {Democrat,Republican}
@ATTRIBUTE Q124742 {Yes,No}
@ATTRIBUTE Q124122 {Yes,No}
With this header, I get the following error:
} expected at end of enumeration, read Token[EOL]
Then I tried using the Weka converter, but it also gave me an error:
wrong number of values. Read 2, expected 1, read Token[EOL], line 4. Problem encountered on line: 3
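
A possible cause, offered as an assumption rather than something confirmed in this post: the ARFF reader may treat the apostrophe in values such as Bachelor's Degree as the start of a quoted string, and nominal values that contain spaces generally need to be quoted. The Python sketch below shows one way to write a header and data section with such values quoted, apostrophes backslash-escaped, and empty CSV fields written as ? (the ARFF missing-value marker). The names arff_quote and csv_to_arff, the relation name voting, and the shortened six-column sample are hypothetical and used only for illustration.

```python
import csv
import io


def arff_quote(value):
    """Return a CSV field as an ARFF token (hypothetical helper)."""
    if value == "":
        return "?"  # ARFF marks missing values with a question mark
    # Quote anything containing spaces, quotes, commas or other special characters.
    if any(ch in value for ch in " '\",{}%\\"):
        escaped = value.replace("\\", "\\\\").replace("'", "\\'")
        return "'" + escaped + "'"
    return value


def csv_to_arff(csv_text, relation, names, numeric):
    """Build ARFF text from headerless CSV text.

    names   -- attribute names, one per CSV column
    numeric -- set of attribute names to declare as NUMERIC;
               all other columns become nominal attributes
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    lines = ["@RELATION " + relation, ""]
    for i, name in enumerate(names):
        if name in numeric:
            lines.append("@ATTRIBUTE %s NUMERIC" % name)
            continue
        # Collect the distinct non-empty values in first-seen order.
        seen = []
        for row in rows:
            if row[i] != "" and row[i] not in seen:
                seen.append(row[i])
        labels = ",".join(arff_quote(v) for v in seen)
        lines.append("@ATTRIBUTE %s {%s}" % (name, labels))
    lines += ["", "@DATA"]
    for row in rows:
        lines.append(",".join(arff_quote(v) for v in row))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Shortened sample: only six of the 107 columns, for illustration.
    sample = (
        "4,1970,Female,Bachelor's Degree,Democrat,\n"
        "5,1997,Male,High School Diploma,Republican,Yes\n"
    )
    names = ["user_id", "yob", "gender", "educationlevel", "party", "Q124742"]
    print(csv_to_arff(sample, "voting", names, {"user_id", "yob"}))
```

Deriving the nominal labels from the data avoids typing 107 attribute declarations by hand, but it only lists values that actually occur in the file, so labels that appear only in a separate test set would have to be added manually.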
Which Kaggle project? If I can get the data file, I'll give it a try. – zbicyclist
[Link](https://inclass.kaggle.com/c/can-we-predict-voting-outcomes). Thanks for your response. – candy