2017-05-08

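I'm new to Keras and have been struggling to shape my data correctly. I have been trying for a few weeks now, and this is the closest I have gotten. I'm fairly sure I only forced things to work by adjusting the shape of the data until it fit. A few questions about how to specify input_shape or input_dim correctly: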

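  1. Does the model, loss, optimizer, or activation function determine the shape/dimensions that input_shape or input_dim needs?

  2. If not, how do I reshape the data into the correct form?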

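I tried reshaping the data to (1, 1, 59), but then I get a complaint that the target data has shape (1, 1, 19). Right now the only way I know to make it work is to cut the data in half so that both parts have the same shape, but I want to use only 20% of the data to create the new set.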
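The mismatch described above can be reproduced with NumPy alone. This is a minimal sketch, assuming the usual (samples, timesteps, features) reading of the three axes; the variable names are illustrative. It shows that reshaping to (1, 1, N) makes the feature count depend on the length of each set, while putting one value per sample leaves only the first axis different.

```python
import numpy as np

train = np.arange(60)        # 0..59, as in the question
valid = np.arange(60, 80)    # 60..79

X_train, Y_train = train[:-1], train[1:]   # 59 input/target pairs
X_valid, Y_valid = valid[:-1], valid[1:]   # 19 input/target pairs

# Putting every value on the last axis makes the feature count depend on
# the length of the set: 59 features for training, 19 for validation.
wide_train = X_train.reshape(1, 1, 59)
wide_valid = X_valid.reshape(1, 1, 19)
print(wide_train.shape, wide_valid.shape)

# One value per sample: only the first (sample) axis differs between sets.
X_train_3d = X_train.reshape(-1, 1, 1)
Y_train_3d = Y_train.reshape(-1, 1, 1)
X_valid_3d = X_valid.reshape(-1, 1, 1)
Y_valid_3d = Y_valid.reshape(-1, 1, 1)
print(X_train_3d.shape, X_valid_3d.shape)
```

With the second layout the training and validation sets can have different sizes (for example an 80/20 split), because only the number of samples changes, not the per-sample shape.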

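What I am trying to do is have the model learn the sequence from 1 to 100 and then, given a number, predict what the next number should be. My code: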

# Tool setup 
import numpy as np 
import matplotlib.pyplot as plt 
from keras.models import Sequential 
from keras.layers import Dense 
from keras.layers import LSTM 
from keras.layers import Dropout 

# Setup our dataset and testset. 
dataset = [] # Training set. 
validset = [] 
testset = [] 

dataset = list(range(60)) 
validset = list(range(60, 80)) 
testset = list(range(80, 100)) 

# Preprocess Data: 
X_train = dataset[:-1] # Drop the last element. 
Y_train = dataset[1:] # The second element is the target for prediction. 

# Reshape training data for Keras LSTM model 
# The training data needs to be (batchIndex, timeStepIndex, dimensionIndex) 
# Single batch, time steps, dimensions 
#print(np.array(X_train).shape) 
X_train = np.array(X_train).reshape(-1, 59, 1) 
Y_train = np.array(Y_train).reshape(-1, 59, 1) 

# Normalize the Data: 
#X_train = np.divide(X_train, 200) 
#Y_train = np.divide(Y_train, 200) 

X_test = validset[:-1] # Drop the last element. 
Y_test = validset[1:] # The second element is the target for prediction. 
#print(np.array(X_test).shape) 
X_test = np.array(X_test).reshape(-1, 19, 1) 
Y_test = np.array(Y_test).reshape(-1, 19, 1) 

# Build Model 
model = Sequential() 
#model.add(LSTM(100, input_dim=1, return_sequences=True,
#               activation='softmax'))
model.add(LSTM(100, input_dim=1, return_sequences=True)) 
model.add(Dense(1)) 
model.compile(loss='mse', optimizer='rmsprop', metrics=["accuracy"]) 
#model.add(Dropout(0.80)) 

# Train the Model 
history = model.fit(X_train, Y_train, validation_data=(X_test, Y_test), 
nb_epoch=10, batch_size=1, verbose=1) 

# The validation set is checked during training to monitor progress, and
# possibly for early stopping, but is never used for gradient descent.

# validation_data is used as held-out validation data. Will override
# validation_split.
# validation_data=(X_test, Y_test)

# validation_split is the fraction of the data to use as held-out
# validation data.
# validation_split=0.083

from IPython.display import SVG 
from keras.utils.visualize_util import model_to_dot 

SVG(model_to_dot(model).create(prog='dot', format='svg')) 

# list all data in history 
print(history.history.keys()) 
# summarize history for accuracy 
plt.plot(history.history['acc']) 
plt.plot(history.history['val_acc']) 
plt.title('model accuracy') 
plt.ylabel('accuracy') 
plt.xlabel('epoch') 
plt.legend(['train', 'validate'], loc='upper left') 
plt.show() 

# summarize history for loss 
plt.plot(history.history['loss']) 
plt.plot(history.history['val_loss']) 
plt.title('model loss') 
plt.ylabel('loss') 
plt.xlabel('epoch') 
plt.legend(['train', 'validate'], loc='upper left') 
plt.show() 

# Test the Model 
#print(np.array(testset).shape) 
testset = np.array(testset).reshape(-1, 5, 1) 
predict = model.predict(testset) 
# Undo the normalization step.
#predict = np.multiply(predict, 200)
predict = predict.reshape(-1) 
print(predict[0]) 
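The normalization lines are commented out in the code above. A minimal sketch of how that round trip could look, assuming a fixed scale of 200 as in those comments; the helper names normalize and denormalize are hypothetical, and fake_prediction only stands in for the output of model.predict:

```python
import numpy as np

SCALE = 200.0  # assumed constant, taken from the commented-out lines


def normalize(values, scale=SCALE):
    """Scale raw values down so they fall roughly into [0, 1]."""
    return np.asarray(values, dtype=float) / scale


def denormalize(values, scale=SCALE):
    """Undo normalize() so predictions are back in the original units."""
    return np.asarray(values, dtype=float) * scale


# Normalize first, then reshape to (samples, timesteps, features).
X_train = normalize(np.arange(59)).reshape(-1, 1, 1)

# Stand-in for model.predict(...): reuse the first three inputs.
fake_prediction = X_train[:3]
restored = denormalize(fake_prediction).reshape(-1)
print(restored)
```

The same scale has to be applied to the inputs and the targets, and the inverse applied to whatever the model predicts.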

Answer

Does the model, loss, optimizer, or activation function determine the shape/dimensions that input_shape or input_dim needs?

I'm inclined to say the answer is yes. Different layers and functions expect data with different dimensions.

For now, let's keep it simple and focus on the spirit of the question.

dataset = list(range(100)) 
validset = dataset[-20:] 
testset = dataset[-20:] 

It appears that Keras wants the data for an LSTM to be shaped like this: (batchIndex, timestepIndex, dimensionIndex)

print(np.array(X_train).shape) 
X_train = np.array(X_train).reshape(99, 1, 1) 
Y_train = np.array(Y_train).reshape(99, 1, 1) 
print(np.array(X_train).shape) 

Result: (99,) (99, 1, 1)
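The reshape above gives every sample a single time step. As an alternative that is not part of this answer, here is a minimal NumPy-only sketch that packs several consecutive values into each sample, so the timestep axis is longer than 1. The helper name make_windows and the window length of 5 are assumptions chosen for illustration.

```python
import numpy as np


def make_windows(sequence, window):
    """Split a 1-D sequence into overlapping windows and next-value targets."""
    seq = np.asarray(sequence)
    n = len(seq) - window          # number of (window, target) pairs
    X = np.array([seq[i:i + window] for i in range(n)])
    y = seq[window:]               # the value right after each window
    # Add the trailing feature axis: (samples, timesteps, features).
    return X.reshape(n, window, 1), y.reshape(n, 1)


dataset = list(range(100))
X, y = make_windows(dataset, window=5)
print(X.shape, y.shape)
print(X[0, :, 0], y[0])
```

Targets of shape (samples, 1) hold one value per window, so a model trained on them would probably need to emit one output per sample rather than one per time step (for an LSTM, that usually means return_sequences=False).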

The model simplifies to:

model = Sequential() 
model.add(LSTM(100, input_dim=1, return_sequences=True)) 
model.add(Dense(1)) 
model.compile(loss='mse', optimizer='rmsprop', metrics=["accuracy"]) 
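To see why the targets are reshaped to (99, 1, 1) as well, it can help to follow the shapes through this model by hand. The sketch below uses plain NumPy stand-ins rather than Keras itself, and assumes that Dense is applied along the last axis of a 3-D input (as in Keras 2). The zero-filled arrays are placeholders, not real weights.

```python
import numpy as np

batch, timesteps, units = 99, 1, 100

# Stand-in for the LSTM output when return_sequences=True: one vector of
# `units` values per time step.
lstm_out = np.zeros((batch, timesteps, units))

# Stand-in for Dense(1): a (units, 1) weight matrix applied to the last axis.
dense_kernel = np.zeros((units, 1))
dense_bias = np.zeros(1)
model_out = lstm_out @ dense_kernel + dense_bias
print(model_out.shape)

# The mean squared error compares output and target element by element,
# so the targets must have the same shape as the model output.
Y_train = np.arange(1, 100).reshape(batch, timesteps, 1)
mse = np.mean((model_out - Y_train) ** 2)
print(Y_train.shape, mse)
```

In this sense the layers and the loss do constrain the shapes: the first layer fixes the per-sample input shape, and the loss requires the targets to match whatever the last layer produces.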

Judging by the plots and the poor predictions, there is still a lot of work to do. At least this will get things started.