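Why are all my SVM results the same in scikit-learn?
I want to use scikit-learn to compute class probabilities for a multi-class dataset. However, for some reason I get exactly the same probabilities for every example. Any idea what is going on? Is this related to my model, to how I am using the library, or to something else? Any help is appreciated!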
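from sklearn import svm
# RBF-kernel SVC with probability estimates enabled
svm_model = svm.SVC(probability=True, kernel='rbf', C=1, decision_function_shape='ovr', gamma=0.001, verbose=100)
svm_model.fit(train_X, train_y)
preds = svm_model.predict_proba(test_X)  # one row per sample, one column per class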
train_X looks like this:
array([[2350, 5550, 2750.0, ..., 23478, 1, 3],
[2500, 5500, 3095.5, ..., 23674, 0, 3],
[3300, 6900, 3600.0, ..., 6529, 0, 3],
...,
[2150, 6175, 2500.0, ..., 11209, 0, 3],
[2095, 5395, 2595.4, ..., 10070, 0, 3],
[1650, 2850, 2000.0, ..., 25463, 1, 3]], dtype=object)
train_y looks like this:
0 1
1 2
10 2
100 2
1000 2
10000 2
10001 2
10002 2
10003 2
10004 2
10005 2
10006 2
10007 2
10008 1
10009 1
1001 2
10010 2
test_X looks like this:
array([[2190, 3937, 2200.5, ..., 24891, 1, 5],
[2695, 7000, 2850.0, ..., 5491, 1, 4],
[2950, 12000, 4039.5, ..., 22367, 0, 4],
...,
[2850, 5200, 3000.0, ..., 15576, 1, 1],
[3200, 16000, 4100.0, ..., 1320, 0, 3],
[2100, 3750, 2400.0, ..., 6022, 0, 1]], dtype=object)
My results look like this:
array([[ 0.07819139, 0.22727628, 0.69453233],
[ 0.07819139, 0.22727628, 0.69453233],
[ 0.07819139, 0.22727628, 0.69453233],
...,
[ 0.07819139, 0.22727628, 0.69453233],
[ 0.07819139, 0.22727628, 0.69453233],
[ 0.07819139, 0.22727628, 0.69453233]])
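One thing worth checking, offered as an assumption rather than a confirmed diagnosis: the features shown above are in the thousands and are not scaled. With an RBF kernel and gamma=0.001, the squared distance between almost any two samples is then so large that the kernel value underflows to zero, so every test point gets the same decision value and therefore the same probabilities. The sketch below reproduces this on synthetic data (the data, `make_split`, and the class centers are all made up for illustration) and compares it with a pipeline that standardises the features first. Using `gamma="scale"` in the second model is also a choice made here, not something taken from the question.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical stand-in data: 3 classes, 10 features with values in the
# thousands, stored as dtype=object like the arrays in the question.
rng = np.random.default_rng(0)
n_features = 10
centers = rng.uniform(2000, 20000, size=(3, n_features))

def make_split(n_per_class):
    """Draw n_per_class noisy samples around each class center."""
    X = np.vstack([c + rng.normal(scale=1000.0, size=(n_per_class, n_features))
                   for c in centers])
    y = np.repeat([0, 1, 2], n_per_class)
    return X.astype(object), y

train_X, train_y = make_split(100)
test_X, _ = make_split(20)

# Convert the object arrays to float up front so every step sees numeric data.
train_X = train_X.astype(float)
test_X = test_X.astype(float)

# Settings from the question, applied to the raw (unscaled) features.
raw_model = SVC(probability=True, kernel="rbf", C=1, gamma=0.001, random_state=0)
raw_model.fit(train_X, train_y)
raw_preds = raw_model.predict_proba(test_X)

# Same classifier behind a StandardScaler, with gamma derived from the data.
scaled_model = make_pipeline(
    StandardScaler(),
    SVC(probability=True, kernel="rbf", C=1, gamma="scale", random_state=0),
)
scaled_model.fit(train_X, train_y)
scaled_preds = scaled_model.predict_proba(test_X)

# True if every row equals the first row, i.e. the symptom from the question.
print("raw rows identical:   ", np.allclose(raw_preds, raw_preds[0]))
print("scaled rows identical:", np.allclose(scaled_preds, scaled_preds[0]))
```

If the raw model shows identical rows and the scaled one does not on your own data as well, feature scaling (or a much smaller gamma) is the likely fix. If both still give identical rows, the cause is probably elsewhere.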
Why does `train_y` have two columns? – lejlot
`train_y` has an index column. – user1507889
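To illustrate the point in the comments above: if `train_y` is a pandas Series (an assumption, based on how it prints), the left-hand column is just the index and is not passed to the model as data. The sketch below uses a few made-up rows that mirror the start of the output shown in the question. The string index is also an assumption, inferred from the 0, 1, 10, 100, ... ordering.

```python
import pandas as pd

# Hypothetical labels with a string index, mirroring the first rows of train_y.
train_y = pd.Series([1, 2, 2, 2], index=["0", "1", "10", "100"])

# A Series is one-dimensional: the index is a label, not a second column.
print(train_y.shape)

# Plain 1-D NumPy array of class labels, with the index dropped.
labels = train_y.to_numpy()
print(labels.ndim, labels)
```

Passing `labels` instead of the Series to `fit` makes it explicit that only the class values are used.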