2015-10-07 90 views

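Time series forecasting in Python with a datetime index

I am trying to forecast a time series using SVR and GP. The time series is actually a pandas.Series with a pandas.DatetimeIndex as its index.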

Both algorithms are implemented in scikit-learn, and SVR does accept the modified X axis for prediction:

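import time 

import numpy as np 
import pandas as pd 
from sklearn.model_selection import GridSearchCV 
from sklearn.svm import SVR 

# ts is the pandas.Series with a DatetimeIndex described above 
train_size = 100 
svr = GridSearchCV(SVR(kernel='rbf', gamma=0.001), cv=5, 
                   param_grid={"C": [1e0, 1e1, 1e2, 1e3], 
                               "gamma": np.logspace(-2, 2, 5)}) 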

# Choose train_size items at random from the series as training data 
data_train = ts.sample(n=train_size) 
data_train_y = data_train.values 
data_train_x = (data_train.index.values).reshape(train_size, 1) 

t0 = time.time() 
svr.fit(data_train_x, data_train_y) 
svr_fit = time.time() - t0 
print("SVR complexity and bandwidth selected and model fitted in %.3f s" 
    % svr_fit) 

sv_ratio = svr.best_estimator_.support_.shape[0]/train_size 
print("Support vector ratio: %.3f" % sv_ratio) 


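# Prediction grid: hourly timestamps from the first sample to one day past the last 
min_date = ts.index.min() 
max_date = ts.index.max() + pd.Timedelta(days=1) 

X_plot = pd.date_range(min_date, max_date, freq='H') 
# Series.reshape is not available in current pandas, so reshape the NumPy array 
X_plot = X_plot.values.reshape(-1, 1) 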

t0 = time.time() 
y_svr = svr.predict(X_plot) 
svr_predict = time.time() - t0 
print("SVR prediction for %d inputs in %.3f s" 
    % (X_plot.shape[0], svr_predict)) 

The Gaussian process, however, refuses to work with it:

from sklearn import gaussian_process 
gp = gaussian_process.GaussianProcess(theta0=1e-2, thetaL=1e-4, thetaU=1e-1) 
gp.fit(data_train_x, data_train_y) 

It results in:

TypeError: ufunc add cannot use operands with types dtype('<M8[ns]') and dtype('<M8[ns]') 

Answers


Apparently the timestamp data type is not accepted. The timestamps should be converted to floats.
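
One possible way to do the conversion is sketched below. It expresses each timestamp as the number of days elapsed since the first observation, which keeps the feature values small compared with raw nanosecond epochs. Several things here are assumptions rather than part of the question: `datetime_index_to_float` is a hypothetical helper, the sine-wave `ts` is synthetic stand-in data, and `GaussianProcessRegressor` with an `RBF + WhiteKernel` kernel is swapped in for the older `GaussianProcess` class used in the question, which newer scikit-learn releases no longer provide.

```python
import numpy as np
import pandas as pd
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel


def datetime_index_to_float(index, origin=None):
    """Return days elapsed since `origin` as a float column of shape (n, 1).

    Hypothetical helper for illustration; not part of pandas or scikit-learn.
    """
    if origin is None:
        origin = index.min()
    # DatetimeIndex - Timestamp gives a TimedeltaIndex; dividing by a
    # one-day Timedelta turns it into plain floats.
    elapsed_days = (index - origin) / pd.Timedelta(days=1)
    return np.asarray(elapsed_days, dtype=float).reshape(-1, 1)


# Synthetic daily series standing in for `ts` from the question (assumption).
index = pd.date_range("2015-01-01", periods=200, freq="D")
ts = pd.Series(np.sin(np.arange(200) / 10.0), index=index)

# Sample training points, then convert their timestamps to floats.
origin = ts.index.min()
train = ts.sample(n=50, random_state=0).sort_index()
X_train = datetime_index_to_float(train.index, origin=origin)
y_train = train.values

# Kernel choice is illustrative only; tune it for real data.
kernel = RBF(length_scale=10.0) + WhiteKernel(noise_level=1e-3)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True, random_state=0)
gp.fit(X_train, y_train)

# Predict on the full index, using the same origin as for training.
X_plot = datetime_index_to_float(ts.index, origin=origin)
y_gp, y_std = gp.predict(X_plot, return_std=True)
```

The same `origin` has to be used for training and prediction, otherwise the two sets of floats would not refer to the same time axis. The float column can also be passed to the SVR grid search in place of the raw datetime values.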