
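Recursive feature elimination and grid search using scikit-learn: DeprecationWarning

I'm building a grid search over multiple classifiers and want to use recursive feature elimination with cross-validation. I started with the code provided in "Recursive feature elimination and grid search using scikit-learn". Below is my working code: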

param_grid = [{'C': 0.001}, {'C': 0.01}, {'C': .1}, {'C': 1.0}, {'C': 10.0}, 
       {'C': 100.0}, {'fit_intercept': True}, {'fit_intercept': False}, 
       {'penalty': 'l1'}, {'penalty': 'l2'}] 

estimator = LogisticRegression() 
selector = RFECV(estimator, step=1, cv=5, scoring="roc_auc") 
clf = grid_search.GridSearchCV(selector, {"estimator_params": param_grid}, 
           cv=5, n_jobs=-1) 
clf.fit(X,y) 
print(clf.best_estimator_.estimator_)
print(clf.best_estimator_.ranking_)
print(clf.best_estimator_.score(X, y))

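I get a DeprecationWarning because the "estimator_params" parameter is apparently being removed in 0.18; I'm trying to figure out the correct syntax to use in line 4.

I tried...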

param_grid = [{'C': 0.001}, {'C': 0.01}, {'C': .1}, {'C': 1.0}, {'C': 10.0}, 
       {'C': 100.0}, {'fit_intercept': True}, {'fit_intercept': False}, 
       {'fit_intercept': 'l1'}, {'fit_intercept': 'l2'}] 
clf = grid_search.GridSearchCV(selector, param_grid, 
           cv=5, n_jobs=-1) 

This returns ValueError: Parameter values should be a list. And...

param_grid = {"penalty": ["l1","l2"], 
      "C": [.001,.01,.1,1,10,100], 
      "fit_intercept": [True, False]} 
clf = grid_search.GridSearchCV(selector, param_grid, 
           cv=5, n_jobs=-1) 

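This returns ValueError: Invalid parameter loss for estimator RFECV. Check the list of available parameters with estimator.get_params().keys(). Checking the keys shows that 'C', 'fit_intercept' and 'penalty' are all present as parameter keys. Trying...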

param_grid = {"estimator__C": [.001,.01,.1,1,10,100], 
       "estimator__fit_intercept": [True, False], 
       "estimator__penalty": ["l1","l2"]} 
clf = grid_search.GridSearchCV(selector, param_grid, 
           cv=5, n_jobs=-1) 

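This never finishes executing, so I'm guessing that this type of parameter assignment is not supported.

For now I have the warning set to be ignored, but I'd like to update the code with the appropriate syntax for 0.18. Any assistance would be appreciated!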

Answer


This is covered by an answer to a question previously posted on SO: https://stackoverflow.com/a/35560648/5336341. Credit to Paulo Alves for the answer.

Relevant code:

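from sklearn.feature_selection import RFECV
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# The "estimator__" prefix routes each parameter to the estimator wrapped by RFECV
params = {'estimator__max_depth': [1, 5, None],
          'estimator__class_weight': ['balanced', None]}
estimator = DecisionTreeClassifier()
selector = RFECV(estimator, step=1, cv=3, scoring='accuracy')
clf = GridSearchCV(selector, params, cv=3)
clf.fit(X_train, y_train)
clf.best_estimator_.estimator_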
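
The same prefix convention should carry over to the LogisticRegression setup from the question. Below is a minimal sketch rather than a tested drop-in replacement: the synthetic data from make_classification, the reduced grid, and the choice of solver="liblinear" (picked because it accepts both the l1 and l2 penalties) are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic data (an assumption; substitute your own X and y).
X, y = make_classification(n_samples=200, n_features=8, n_informative=4,
                           random_state=0)

# liblinear is an assumed choice: it supports both l1 and l2 penalties.
estimator = LogisticRegression(solver="liblinear")
selector = RFECV(estimator, step=1, cv=3, scoring="roc_auc")

# Keys carry the "estimator__" prefix so that GridSearchCV passes each
# value through RFECV to the wrapped LogisticRegression.
param_grid = {"estimator__C": [0.1, 1.0],
              "estimator__penalty": ["l1", "l2"]}

clf = GridSearchCV(selector, param_grid, cv=3)
clf.fit(X, y)

print(clf.best_params_)
print(clf.best_estimator_.ranking_)
```

Every grid point triggers a complete RFECV run on every outer fold, so the cost multiplies quickly. The grid in the question has 6 x 2 x 2 = 24 combinations, each evaluated on 5 outer folds, and each RFECV run in turn refits the model repeatedly on 5 inner folds. A very long run time is therefore plausible and does not by itself show that the syntax is unsupported; starting with a smaller grid is one way to check.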

To see which parameter names are available, use:

print(selector.get_params())
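
The full dictionary can be long, so it may help to keep only the nested keys, which are the ones a grid for the wrapped estimator would use. A small sketch, assuming the LogisticRegression selector from the question; the variable name nested_keys is made up for illustration:

```python
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression

selector = RFECV(LogisticRegression(), step=1, cv=5, scoring="roc_auc")

# Parameters of the wrapped estimator appear as keys containing "__".
nested_keys = sorted(k for k in selector.get_params() if "__" in k)
print(nested_keys)
```

Bare names such as C are not parameters of RFECV itself, which is consistent with the "Invalid parameter" error reported in the question.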