How to build an LSTM neural network for classification

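I have data consisting of various conversations between two people. Each sentence has some type of classification. I am attempting to use an NLP network to classify each sentence of the conversation. I tried a convolutional network and got decent results (nothing groundbreaking). Since this is a back-and-forth conversation, I figured an LSTM network might produce better results, because what was said previously may have a large impact on what follows.

[Image: network structure diagram, not reproduced]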

If I follow the structure above, I assume I am doing a many-to-many. My data looks like this:

X_train = [[sentence 1], 
      [sentence 2], 
      [sentence 3]] 
Y_train = [[0], 
      [1], 
      [0]]

The data has already been processed with word2vec. I then designed my network as follows.

model = Sequential()
model.add(Embedding(len(vocabulary), embedding_dim,
                    input_length=X_train.shape[1]))
model.add(LSTM(88))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop', loss='binary_crossentropy',
              metrics=['accuracy'])
# nb_epoch is the Keras 1 argument name; Keras 2 renamed it to epochs
model.fit(X_train, Y_train, verbose=2, nb_epoch=3, batch_size=15)

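I assume this setup feeds in one batch of sentences at a time. However, unless shuffle is set to False in model.fit, the network receives shuffled batches, so why is an LSTM network even useful in this case? From my research on the subject, to achieve a many-to-many structure one would also need to change the LSTM layer to

model.add(LSTM(88, return_sequences=True))

and the output layer would need to be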

model.add(TimeDistributed(Dense(1,activation='sigmoid')))

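When I switch to this structure I get an error about the input size. I am not sure how to reformat the data to meet this requirement, or how to edit the embedding layer to accept the new data format.

Any input would be greatly appreciated. Or, if you have suggestions for a better approach, I am more than happy to hear them!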


2017-02-19 DJK

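Your first attempt was good. The shuffling takes place between sentences: it only shuffles the training samples among themselves so that they do not always come in in the same order. The words inside each sentence are not shuffled.

Or maybe I did not understand the question correctly?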
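To make the point about shuffling concrete, here is a minimal pure-Python sketch. It only illustrates the idea and is not how Keras implements it internally; the toy data and the helper name `shuffle_samples` are made up for this example.

```python
import random

# Hypothetical toy data: each sentence is a list of word indices,
# with one label per sentence.
sentences = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
labels = [0, 1, 0, 1]


def shuffle_samples(sentences, labels, seed=0):
    """Shuffle the order of (sentence, label) pairs.

    The word order inside each sentence is left untouched, and every
    sentence keeps its own label.
    """
    order = list(range(len(sentences)))
    random.Random(seed).shuffle(order)
    return [sentences[i] for i in order], [labels[i] for i in order]


shuffled_sentences, shuffled_labels = shuffle_samples(sentences, labels)
for sentence, label in zip(shuffled_sentences, shuffled_labels):
    print(sentence, label)
```

Every printed sentence still reads in its original word order; only the order in which the sentences appear has changed. That also means the order of sentences within a conversation is lost, which is what the edit below addresses.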

EDIT:

After understanding the question better, here is my suggestion.

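Data preparation: slice your corpus into blocks of n sentences (the blocks can overlap). You should then have an array of shape (number_blocks_of_sentences, n, number_of_words_per_sentence), so basically a list of 2D arrays, each holding a block of n sentences. n should not be too big, because an LSTM cannot handle a huge number of elements in a sequence during training (vanishing gradient). Your target should be an array of shape (number_blocks_of_sentences, n, 1), so also a list of 2D arrays, each holding the class of every sentence in the corresponding block.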
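The slicing step could be sketched roughly as follows. This is only an illustration: the helper `make_blocks`, the `stride` parameter, and the toy data are assumptions made for this example, and the sentences are assumed to be already padded to the same number of word indices.

```python
def make_blocks(sentences, labels, n, stride=1):
    """Slice a conversation into blocks of n consecutive sentences.

    A stride smaller than n produces overlapping blocks.
    Returns nested lists shaped (num_blocks, n, n_words) and (num_blocks, n, 1).
    """
    X_blocks, Y_blocks = [], []
    for start in range(0, len(sentences) - n + 1, stride):
        X_blocks.append(sentences[start:start + n])
        Y_blocks.append([[label] for label in labels[start:start + n]])
    return X_blocks, Y_blocks


# Hypothetical conversation: 5 sentences, each padded to 4 word indices.
sentences = [
    [1, 2, 3, 0],
    [4, 5, 0, 0],
    [6, 7, 8, 9],
    [10, 0, 0, 0],
    [11, 12, 13, 0],
]
labels = [0, 1, 0, 0, 1]

X_blocks, Y_blocks = make_blocks(sentences, labels, n=3, stride=1)
print(len(X_blocks), len(X_blocks[0]), len(X_blocks[0][0]))  # blocks, n, words
print(len(Y_blocks), len(Y_blocks[0]), len(Y_blocks[0][0]))  # blocks, n, 1
```

Wrapping the two results in `numpy.array(...)` would give arrays with the shapes described above. Blocks should not span two different conversations, so in practice the helper would be applied to each conversation separately.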

Model:

from keras.models import Sequential
from keras.layers import Dense, Embedding, LSTM, Reshape, TimeDistributed

n_sentences = X_train.shape[1]  # number of sentences in a sample (n)
n_words = X_train.shape[2]      # number of words in a sentence

model = Sequential()
# Reshape the input because Embedding only accepts shape (batch_size, input_length), so we just turn the list of sentences into one long list of words
model.add(Reshape((n_sentences * n_words,), input_shape=(n_sentences, n_words)))
# Embedding layer - output shape will be (batch_size, n_sentences * n_words, embedding_dim), so each sample in the batch is a big 2D array of embedded words
model.add(Embedding(len(vocabulary), embedding_dim, input_length=n_sentences * n_words))
# Recreate the sentence-shaped array
model.add(Reshape((n_sentences, n_words, embedding_dim)))
# Encode each sentence - output shape is (batch_size, n_sentences, 88)
model.add(TimeDistributed(LSTM(88)))
# Go over the sentences and output the hidden state, which carries info about previous sentences - output shape is (batch_size, n_sentences, hidden_dim)
model.add(LSTM(hidden_dim, return_sequences=True))
# Predict the binary output class - output shape is (batch_size, n_sentences, 1)
model.add(TimeDistributed(Dense(1, activation='sigmoid')))
...

This should be a good start.

I hope this helps.
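As a quick sanity check, the shapes noted in the comments of the model above can be traced without Keras. This is a plain-Python sketch; the function `trace_shapes` and all numeric values except the batch size of 15 and the 88 LSTM units are hypothetical.

```python
def trace_shapes(batch_size, n_sentences, n_words,
                 embedding_dim, sentence_dim, hidden_dim):
    """Return (layer name, output shape) pairs for the model sketched above."""
    return [
        ("input", (batch_size, n_sentences, n_words)),
        ("Reshape", (batch_size, n_sentences * n_words)),
        ("Embedding", (batch_size, n_sentences * n_words, embedding_dim)),
        ("Reshape", (batch_size, n_sentences, n_words, embedding_dim)),
        ("TimeDistributed(LSTM)", (batch_size, n_sentences, sentence_dim)),
        ("LSTM(return_sequences=True)", (batch_size, n_sentences, hidden_dim)),
        ("TimeDistributed(Dense(1))", (batch_size, n_sentences, 1)),
    ]


# Assumed sizes for illustration only.
shapes = trace_shapes(batch_size=15, n_sentences=5, n_words=20,
                      embedding_dim=50, sentence_dim=88, hidden_dim=64)
for name, shape in shapes:
    print(f"{name:30s} {shape}")
```

The final shape, (batch_size, n_sentences, 1), lines up with the target array described in the data preparation step: one prediction per sentence in each block.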


2017-02-20 06:55:22

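So are you saying that the LSTM layer is only fed one word at a time? So even though the sentences are shuffled, each word in a sentence is passed to the LSTM separately to build an overall picture of the whole sentence? – DJK

I apologize if I did not state my question properly. Since the data is a conversation, what is said in the previous sentences matters for the sentences that follow. So I am trying to set up the network to learn the flow of the conversation and classify each sentence. That is why I was trying to use return_sequences, so that the network would keep information about the previous sentence while classifying the current one. – DJK

The LSTM is fed a sequence of vectors. In your case it is a sequence of word embeddings. It will return a vector of length 88 for each sentence, which the dense layer then reduces to 1 output. So it only cares about one sentence at a time. That is what you are currently doing. Is that what you want to do?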
