2017-04-03 203 views

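LSTM error in Python

Good morning. I am using Keras and want to train an LSTM to classify messages as spam or non-spam, but I ran into the following error: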

ValueError: Input 0 is incompatible with layer lstm_1: expected ndim = 3, found ndim = 4

Can someone help me understand where the problem is?

My code:

import sys 
import pandas as pd 
import numpy as np 
import math 
from keras.models import Sequential 
from keras.layers import Dense 
from keras.layers import LSTM 
from sklearn.preprocessing import MinMaxScaler 
from sklearn.metrics import mean_squared_error 
from sklearn.feature_extraction.text import CountVectorizer 

if __name__ == "__main__": 
    np.random.seed(7) 

    with open('SMSSpamCollection') as file: 
        dataset = [[x.split('\t')[0],x.split('\t')[1]] for x in [line.strip() for line in file]] 

    data = np.array([dat[1] for dat in dataset]) 
    labels = np.array([dat[0] for dat in dataset]) 

    dataVectorizer = CountVectorizer(analyzer = "word", 
          tokenizer = None, 
          preprocessor = None, 
          stop_words = None, 
          max_features = 5000) 
    labelVectorizer = CountVectorizer(analyzer = "word", 
          tokenizer = None, 
          preprocessor = None, 
          stop_words = None, 
          max_features = 5000) 

    data = dataVectorizer.fit_transform(data).toarray() 
    labels = labelVectorizer.fit_transform(labels).toarray() 
    vocab = labelVectorizer.get_feature_names() 

    print(vocab) 
    print(data) 
    print(labels) 

    data = np.reshape(data, (data.shape[0], 1, data.shape[1])) 

    input_dim = data.shape 
    tam = len(data[0]) 

    print(data.shape) 
    print(tam) 

    model = Sequential() 
    model.add(LSTM(tam, input_shape=input_dim)) 
    model.add(Dense(1)) 
    model.compile(loss='mean_squared_error', optimizer='adam') 
    model.fit(data, labels, epochs=100, batch_size=1, verbose=2) 

My SMSSpamCollection file looks like this:

ham Go until jurong point, crazy.. Available only in bugis n great world la e buffet... Cine there got amore wat... 
ham Ok lar... Joking wif u oni... 
spam Free entry in 2 a wkly comp to win FA Cup final tkts 21st May 2005. Text FA to 87121 to receive entry question(std txt rate)T&C's apply 08452810075over18's 
ham U dun say so early hor... U c already then say... 
ham Nah I don't think he goes to usf, he lives around here though 
spam FreeMsg Hey there darling it's been 3 week's now and no word back! I'd like some fun you up for it still? Tb ok! XxX std chgs to send, £1.50 to rcv 
ham Even my brother is not like to speak with me. They treat me like aids patent. 
... 

I also tried adding another dimension to the data array, but that did not help either.

Answer


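The problem is that the shape you pass as input_shape includes an extra dimension, the one for the samples. Keras adds the batch (sample) dimension itself, so input_shape should describe only a single sample, i.e. (timesteps, features). Try: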

input_dim = (data.shape[1], data.shape[2]) 

This should work.
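
To see why the error mentions ndim = 4, here is a minimal sketch of the shape arithmetic. It uses only NumPy and does not build a model. The array sizes and the helper `expected_ndim` are made up for illustration and are not part of Keras. The helper just encodes the rule that Keras prepends one batch axis to whatever is given as input_shape.

```python
import numpy as np

# Stand-in for the vectorized messages: 6 samples, 10 features (made-up sizes).
features = np.zeros((6, 10))

# Same reshape as in the question: (samples, timesteps=1, features).
data = np.reshape(features, (features.shape[0], 1, features.shape[1]))


def expected_ndim(input_shape):
    """Hypothetical helper: rank of a batch, given a per-sample input_shape.

    Keras prepends one batch axis to input_shape, so the rank of the
    batch is len(input_shape) + 1.
    """
    return 1 + len(input_shape)


wrong_shape = data.shape      # (6, 1, 10): still contains the sample axis
right_shape = data.shape[1:]  # (1, 10): (timesteps, features) only

print(data.ndim)                   # rank of the array passed to fit()
print(expected_ndim(wrong_shape))  # rank implied by the question's input_shape
print(expected_ndim(right_shape))  # rank implied by the corrected input_shape
```

With the full data.shape the implied rank is one higher than the 3 that an LSTM layer accepts, which matches the error message. With data.shape[1:], equivalent to (data.shape[1], data.shape[2]), the implied rank equals data.ndim.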