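I'm working through a course and I'm stuck on what I think must be a small problem. I want to find out which features SelectKBest ranks as most important (I varied k
over 2, 4, 6, 8). The pieces involved are SelectKBest and a tree classifier, and there is a small error somewhere :(
First I load the data: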
data_dict = pickle.load(open("final_project_dataset.pkl", "r"))
my_dataset = data_dict
data = featureFormat(my_dataset, feature_combo, sort_keys = True)
labels, features = targetFeatureSplit(data)
kbest = SelectKBest(k=2)
train_new= kbest.fit_transform(features,labels)
Using get_support,
I find the most important features and then try to use them with my classifier:
from sklearn import tree
clf1 = tree.DecisionTreeClassifier(min_samples_split=2)
test_classifier(clf1, my_dataset, feature_lists2)
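For the get_support step mentioned above, here is a minimal sketch of how the boolean mask can be mapped back to feature names. It uses synthetic data rather than my project data, and the three-column `feature_names` list and the random arrays are made up for illustration. With my real list I would presumably skip the first entry ('poi'), assuming it is the label column that targetFeatureSplit strips off.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest

# Synthetic stand-in data (an assumption, not the project dataset):
# three hypothetical feature columns and binary labels.
feature_names = ['salary', 'bonus', 'noise']  # label column not included
rng = np.random.RandomState(0)
labels = np.array([0, 1] * 20)
features = np.column_stack([
    labels * 10.0 + rng.normal(size=40),  # strongly tied to the label
    labels * 5.0 + rng.normal(size=40),   # also tied to the label
    rng.normal(size=40),                  # unrelated noise
])

kbest = SelectKBest(k=2)  # default score function is f_classif
kbest.fit(features, labels)

# get_support() returns a boolean mask with one entry per input column.
mask = kbest.get_support()
selected = [name for name, keep in zip(feature_names, mask) if keep]
print(selected)
```

The mask lines up with the columns passed to fit, so the name list has to be in the same order and must leave out the label.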
The feature list I used first, containing all the features, is called feature_combo:
feature_combo=['poi','salary','bonus','total_stock_value','long_term_incentive','restricted_stock_deferred','from_this_person_to_poi','shared_receipt_with_poi','newfeature_ratio','total_payments','deferral_payments','loan_advances', 'restricted_stock','director_fees','to_messages','from_messages']
After finding the most important ones, I created a feature list such as:
feature_lists2=['salary','bonus']
When I run it, I get a mysterious error:
Traceback (most recent call last):
  File "C:\Users\Stephan\Downloads\ud120-projects\final_project\poi_id.py", line 62, in <module>
    train_new= kbest.fit_transform(features,labels)
  File "C:\Users\Stephan\Anaconda\lib\site-packages\sklearn\base.py", line 429, in fit_transform
    return self.fit(X, y, **fit_params).transform(X)
  File "C:\Users\Stephan\Anaconda\lib\site-packages\sklearn\feature_selection\univariate_selection.py", line 300, in fit
    self._check_params(X, y)
  File "C:\Users\Stephan\Anaconda\lib\site-packages\sklearn\feature_selection\univariate_selection.py", line 405, in _check_params
    % self.k)
ValueError: k should be >=0, <= n_features; got 2.Use k='all' to return all features.
[Finished in 0.5s with exit code 1]
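Read literally, the message says k may not exceed n_features, so k=2 being rejected suggests the array reaching fit_transform has fewer than two columns. I have not confirmed that this is what happens in my script. A minimal sketch of the check I could run, using a made-up single-column array in place of my real features:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest

# Synthetic stand-in (an assumption): a feature array with only one column.
labels = np.array([0, 1] * 10)
features = (labels * 3.0 + np.arange(20) * 0.01).reshape(-1, 1)

# Number of columns SelectKBest will actually see.
n_features = np.asarray(features).shape[1]
print(n_features)

# Cap k so it never exceeds the available columns.
k = min(2, n_features)
kbest = SelectKBest(k=k)
train_new = kbest.fit_transform(features, labels)
print(train_new.shape)
```

Printing the shape right before the fit_transform call on line 62 would show whether the array really has the 15 feature columns I expect from feature_combo.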
Can anyone see what I'm doing wrong? (I'm a beginner.)