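Python sklearn linear regression ValueError

I have been trying linear regression with sklearn. Sometimes I get a ValueError and sometimes it works fine, and I don't know which approach I should be using. The error message is: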

Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/sklearn/linear_model/base.py", line 512, in fit 
    y_numeric=True, multi_output=True) 
    File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/sklearn/utils/validation.py", line 531, in check_X_y 
    check_consistent_length(X, y) 
    File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/sklearn/utils/validation.py", line 181, in check_consistent_length 
    " samples: %r" % [int(l) for l in lengths]) 
ValueError: Found input variables with inconsistent numbers of samples: [1, 200] 

The code is:

import pandas as pd 
from sklearn.linear_model import LinearRegression 
data = pd.read_csv('http://www-bcf.usc.edu/~gareth/ISL/Advertising.csv', index_col=0); 
x = data['TV'] 
y = data['Sales'] 
lm = LinearRegression() 
lm.fit(x,y) 

Please help. I am a student and want to learn the basics of machine learning.

Answers

lm.fit expects X to be a

numpy array or sparse matrix of shape [n_samples, n_features]

Your x is one-dimensional. It has shape:

In [6]: x.shape 
Out[6]: (200,) 

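Just reshape it into a single column. A pandas Series has no reshape method in recent pandas versions, so go through the underlying NumPy array:

lm.fit(x.values.reshape(-1, 1), y) 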
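
To see what the reshape changes, here is a minimal sketch on synthetic data. The Series below is a made-up stand-in for data['TV'] (the values are invented, not taken from Advertising.csv). Because a Series has no reshape method in recent pandas versions, the sketch reshapes the underlying NumPy array obtained via .values.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for data['TV']: 200 invented values in a Series.
x = pd.Series(np.linspace(0.0, 300.0, 200), name="TV")

# A Series is one-dimensional: 200 values and no feature axis.
shape_1d = x.shape

# reshape(-1, 1) on the underlying array yields one column:
# 200 samples with 1 feature each, which is the layout fit() expects for X.
X = x.values.reshape(-1, 1)
shape_2d = X.shape

print(shape_1d, shape_2d)
```

The -1 tells NumPy to infer the number of rows from the data, so the same line works for any number of samples.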

Thanks! That works well.

You have to pass X as a DataFrame, not as a Series. For a single feature you can use [[]] ("double brackets") or to_frame():

import pandas as pd 
from sklearn.linear_model import LinearRegression 
data = pd.read_csv('http://www-bcf.usc.edu/~gareth/ISL/Advertising.csv', index_col=0); 
x = data[['TV']] 

or

x = data['TV'].to_frame() 
y = data['Sales'] 
lm = LinearRegression() 
lm.fit(x,y) 

Output:

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False) 
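
As a quick check that both forms produce the same two-dimensional input, here is a small sketch on synthetic data. The column names mirror the question, but the numbers are invented: Sales is set to exactly 2 * TV + 1 so the fitted slope and intercept are known in advance.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up data: Sales is an exact linear function of TV.
tv = np.arange(200, dtype=float)
data = pd.DataFrame({"TV": tv, "Sales": 2.0 * tv + 1.0})

x_brackets = data[["TV"]]        # double brackets select a one-column DataFrame
x_frame = data["TV"].to_frame()  # the same column, converted from a Series
y = data["Sales"]

lm = LinearRegression()
lm.fit(x_brackets, y)

slope = float(lm.coef_[0])
intercept = float(lm.intercept_)
print(x_brackets.shape, x_frame.shape, round(slope, 6), round(intercept, 6))
```

Either form can be passed to fit(); the double-bracket version also extends naturally to several features, for example by listing more column names inside the inner brackets.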

Thanks! That works well.