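TypeError: unsupported operand type(s) for -: 'numpy.ndarray' and 'numpy.ndarray'

I am trying to calculate the root mean squared error between the predictions y_train_actual from my scikit-learn model and the original values salaries.

Problem: With mean_squared_error(y_train_actual, salaries), I get the error TypeError: unsupported operand type(s) for -: 'numpy.ndarray' and 'numpy.ndarray'. Passing list(salaries) instead of salaries as the second argument produces the same error.

With mean_squared_error(y_train_actual, y_valid_actual), I get the error Found array with dim 40663. Expected 244768

How can I convert my data to the correct array types for sklearn.metrics.mean_squared_error()?

Code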
import numpy as np
from sklearn.metrics import mean_squared_error
y_train_actual = [ np.exp(float(row)) for row in y_train ]
print mean_squared_error(y_train_actual, salaries)
Error
TypeError Traceback (most recent call last)
<ipython-input-144-b6d4557ba9c5> in <module>()
3 y_valid_actual = [ np.exp(float(row)) for row in y_valid ]
4
----> 5 print mean_squared_error(y_train_actual, salaries)
6 print mean_squared_error(y_train_actual, y_valid_actual)
C:\Python27\lib\site-packages\sklearn\metrics\metrics.pyc in mean_squared_error(y_true, y_pred)
1462 """
1463 y_true, y_pred = check_arrays(y_true, y_pred)
-> 1464 return np.mean((y_pred - y_true) ** 2)
1465
1466
TypeError: unsupported operand type(s) for -: 'numpy.ndarray' and 'numpy.ndarray'
Code
y_train_actual = [ np.exp(float(row)) for row in y_train ]
y_valid_actual = [ np.exp(float(row)) for row in y_valid ]
print mean_squared_error(y_train_actual, y_valid_actual)
Error
ValueError Traceback (most recent call last)
<ipython-input-146-7fcd0367c6f1> in <module>()
4
5 #print mean_squared_error(y_train_actual, salaries)
----> 6 print mean_squared_error(y_train_actual, y_valid_actual)
C:\Python27\lib\site-packages\sklearn\metrics\metrics.pyc in mean_squared_error(y_true, y_pred)
1461
1462 """
-> 1463 y_true, y_pred = check_arrays(y_true, y_pred)
1464 return np.mean((y_pred - y_true) ** 2)
1465
C:\Python27\lib\site-packages\sklearn\utils\validation.pyc in check_arrays(*arrays, **options)
191 if size != n_samples:
192 raise ValueError("Found array with dim %d. Expected %d"
--> 193 % (size, n_samples))
194
195 if not allow_lists or hasattr(array, "shape"):
ValueError: Found array with dim 40663. Expected 244768
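The message above indicates that the two arguments have different lengths (244768 and 40663), while an element-wise error needs one prediction per true value. A minimal sketch of a length check before calling the metric follows; the helper same_length and the toy array sizes are hypothetical and only illustrate the idea.

```python
import numpy as np

def same_length(a, b):
    """Return True when both sequences hold the same number of samples."""
    return len(a) == len(b)

# Toy stand-ins for the train and validation predictions (hypothetical sizes).
y_train_actual = np.exp(np.zeros(5))
y_valid_actual = np.exp(np.zeros(3))

# An element-wise metric cannot pair these up, so report the mismatch instead.
if not same_length(y_train_actual, y_valid_actual):
    print("length mismatch: %d vs %d" % (len(y_train_actual), len(y_valid_actual)))
```

If the lengths differ, each set of predictions would presumably need to be compared against its own targets rather than against the other set.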
Code
print type(y_train)
print type(y_train_actual)
print type(salaries)
Result
<type 'list'>
<type 'list'>
<type 'tuple'>
print y_train[:10]
[10.126631103850338, 10.308952660644293, 10.308952660644293, 10.221941283654663, 10.126631103850338, 10.126631103850338, 11.225243392518447, 9.9987977323404529, 10.043249494911286, 11.350406535472453]
print salaries[:10]
('25000', '30000', '30000', '27500', '25000', '25000', '75000', '22000', '23000', '85000')
print list(salaries)[:10]
['25000', '30000', '30000', '27500', '25000', '25000', '75000', '22000', '23000', '85000']
print len(y_train)
244768
print len(salaries)
244768
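The printed values point to one likely cause, offered here as a sketch rather than a confirmed diagnosis: salaries holds strings such as '25000', so NumPy builds a string array and the subtraction inside mean_squared_error has no numeric meaning. Assuming that is the case, converting both inputs to float arrays first should allow the error to be computed. The data below are the first ten values from the output above; the names y_train_sample, salaries_sample, mse and rmse are illustrative.

```python
import numpy as np

# First ten values copied from the printed output in the question.
y_train_sample = [10.126631103850338, 10.308952660644293, 10.308952660644293,
                  10.221941283654663, 10.126631103850338, 10.126631103850338,
                  11.225243392518447, 9.9987977323404529, 10.043249494911286,
                  11.350406535472453]
salaries_sample = ('25000', '30000', '30000', '27500', '25000',
                   '25000', '75000', '22000', '23000', '85000')

# Convert both inputs to float arrays; the salary strings become numbers here.
y_true = np.asarray(salaries_sample, dtype=float)
y_pred = np.exp(np.asarray(y_train_sample, dtype=float))

# Same formula as the line shown in the traceback, followed by a square root.
mse = np.mean((y_pred - y_true) ** 2)
rmse = np.sqrt(mse)
print(rmse)
```

With the full data, the same two float arrays could be passed to mean_squared_error(y_true, y_pred), and the square root of the result would give the root mean squared error.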
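Can you add the 'shape' of y_train? My guess is that y_train_actual is a 'list' of 'ndarrays', which is probably what conflicts inside 'mean_square_error()'. – fgb 2013-05-02 04:39:18
@fgb I get the error 'AttributeError: 'list' object has no attribute 'shape'' – Nyxynyx 2013-05-02 04:41:16
Right. Do you have any idea what the dimensions of y_train are? – fgb 2013-05-02 04:42:38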