How can I change the maximum sequence length in a TensorFlow RNN model?

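I am currently trying to modify my TensorFlow classifier, which labels a sequence of words as positive or negative, so that it can handle longer sequences without retraining. My model is an RNN with a maximum sequence length of 210. One input is one word (300 dimensions); I vectorise the words with Google's word2vec, so I can feed sequences of up to 210 words. My question is how to change the maximum sequence length to, for example, 3000 in order to classify movie reviews.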

My working model with a fixed maximum sequence length of 210 (tf_version: 1.1.0):

n_chunks = 210 
chunk_size = 300 

x = tf.placeholder("float",[None,n_chunks,chunk_size]) 
y = tf.placeholder("float",None) 
seq_length = tf.placeholder("int64",None) 


with tf.variable_scope("rnn1"): 
     lstm_cell = tf.contrib.rnn.LSTMCell(rnn_size, 
              state_is_tuple=True) 

     lstm_cell = tf.contrib.rnn.DropoutWrapper (lstm_cell, 
                input_keep_prob=0.8) 

     outputs, _ = tf.nn.dynamic_rnn(lstm_cell,x,dtype=tf.float32, 
             sequence_length = seq_length) 

fc = tf.contrib.layers.fully_connected(outputs, 1000, 
             activation_fn=tf.nn.relu) 

output = tf.contrib.layers.flatten(fc) 

#*1 
logits = tf.contrib.layers.fully_connected(output, n_classes, 
              activation_fn=None) 

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits 
             (logits=logits, labels=y)) 
optimizer = tf.train.AdamOptimizer(learning_rate=0.01).minimize(cost) 

... 
#train 
#train_x is padded to fit (batch_size * n_chunks * chunk_size) 
#train_seq_lengths holds the true (unpadded) length of each sequence 
sess.run([optimizer, cost], feed_dict={x:train_x, y:train_y, 
                seq_length:train_seq_lengths}) 
#predict: 
... 

pred = tf.nn.softmax(logits) 
pred = sess.run(pred,feed_dict={x:word_vecs, seq_length:sq_l})

What I have tried so far:

1. Replacing n_chunks with None and simply feeding the data:

x = tf.placeholder(tf.float32, [None,None,300]) 
#model fails to build 
#ValueError: The last dimension of the inputs to `Dense` should be defined. 
#Found `None`. 
# at *1 

... 
#all entries in word_vecs still have the same length, for example 3000, 
#so word_vecs has shape batch_size * 3000 (!= n_chunks) * 300 
pred = tf.nn.softmax(logits) 
pred = sess.run(pred,feed_dict={x:word_vecs, seq_length:sq_l})

2. Changing x and then restoring the old model:

x = tf.placeholder(tf.float32, [None,n_chunks*10,chunk_size]) 
... 
saver = tf.train.Saver(tf.all_variables(), reshape=True) 
saver.restore(sess,"...") 
#fails as well: 
#InvalidArgumentError (see above for traceback): Input to reshape is a 
#tensor with 420000 values, but the requested shape has 840000 
#[[Node: save/Reshape_5 = Reshape[T=DT_FLOAT, Tshape=DT_INT32, 
#_device="/job:localhost/replica:0/task:0/cpu:0"](save/RestoreV2_5, 
#save/Reshape_5/shape)]] 

# run prediction
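
The value count in the restore error can be checked by hand. The sketch below assumes two classes (positive/negative; `n_classes` is never stated in the question) and the 1000-unit layer from the model above. Flattening the `[batch, n_chunks, 1000]` tensor gives the final layer a weight matrix of shape `[n_chunks * 1000, n_classes]`, so its size is tied to `n_chunks`. The helper names `logits_weight_shape` and `num_values` are hypothetical.

```python
def logits_weight_shape(n_chunks, fc_units=1000, n_classes=2):
    """Shape of the final dense layer's weight matrix after flatten()."""
    flattened = n_chunks * fc_units  # flatten merges the time and unit axes
    return (flattened, n_classes)


def num_values(shape):
    """Number of scalars stored in a 2-D weight matrix of this shape."""
    rows, cols = shape
    return rows * cols


trained = logits_weight_shape(210)
print(trained, num_values(trained))  # (210000, 2) 420000
```

Under these assumptions the checkpoint holds 420000 values for that matrix, which matches the count in the error message. Any other `n_chunks` asks for a different number of values, so the saved weights cannot be reshaped to fit.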

If it is possible, could you provide a working example, or otherwise explain why it is not?

2017-07-14 Tobi

I was just wondering: why not set n_chunks to 3000?

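In your first attempt you cannot use two None dimensions here, because TensorFlow then cannot work out the size of each dimension when it builds the flattened input to the dense layer. The first dimension is set to None because it depends on the batch size. In your second attempt you changed only one place; the other places that use n_chunks may conflict with the x placeholder.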
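
One way around the fixed-size dependence is to reduce over the time axis before the dense layers, for instance by keeping only the output at each sequence's last real step. The result then has shape `[batch, features]` whatever the padded length is. The sketch below shows the idea in plain Python rather than TensorFlow. It is an illustration, not the asker's model: `last_valid_outputs` is a hypothetical helper, and a network changed this way has different weights and would have to be trained again.

```python
def last_valid_outputs(outputs, seq_lengths):
    """Pick, for every sequence, the RNN output at its last real time step.

    outputs:     nested list shaped [batch][time][features], zero-padded
    seq_lengths: true length of each sequence, one entry per batch item
    returns:     nested list shaped [batch][features]
    """
    return [seq[length - 1] for seq, length in zip(outputs, seq_lengths)]


# Two padded batches with different time dimensions but equal feature size.
short_batch = [[[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]]   # time = 3, length 2
long_batch = [[[1.0, 2.0], [5.0, 6.0], [7.0, 8.0],
               [9.0, 1.0], [0.0, 0.0]]]                # time = 5, length 4

print(last_valid_outputs(short_batch, [2]))  # [[3.0, 4.0]]
print(last_valid_outputs(long_batch, [4]))   # [[9.0, 1.0]]
```

In TensorFlow 1.x the same value should be available as the `h` part of the final state returned by `tf.nn.dynamic_rnn` when `sequence_length` is passed, so no explicit indexing would be needed.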

2017-07-15 03:30:56 lerner

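Thanks for your answer. I did not set n_chunks to 3000 because that is not needed for training, where the maximum sequence length is 210. If I set n_chunks to 3000, I would have to pad every input with zero vectors to make it fit, which would make training very expensive, and if I ever got a sequence longer than n_chunks I would have to start over. In my second attempt I did also change the other places where n_chunks appears; I just forgot to mention it. – Tobi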
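
The padding cost described here comes from padding every batch to one global maximum. A rough sketch of the alternative, padding each batch only to its own longest sequence, is shown below in plain Python. It assumes the time dimension of the placeholder is `None` and that no layer after the RNN depends on the padded length, which is not the case for the flatten-based model in the question. `pad_batch` and the 2-dimensional toy vectors are made up for illustration.

```python
def pad_batch(sequences, dim):
    """Zero-pad each sequence to the longest length in this batch only.

    sequences: list of sequences, each a list of `dim`-sized word vectors
    returns:   (padded batch, list of true lengths)
    """
    lengths = [len(seq) for seq in sequences]
    max_len = max(lengths)
    padded = [seq + [[0.0] * dim for _ in range(max_len - len(seq))]
              for seq in sequences]
    return padded, lengths


batch = [
    [[0.1, 0.2], [0.3, 0.4]],  # 2 words
    [[0.5, 0.6]],              # 1 word
]
padded, lengths = pad_batch(batch, dim=2)
print(len(padded[0]), len(padded[1]), lengths)  # 2 2 [2, 1]
```

The returned lengths are what would be fed as `seq_length`, so the RNN can ignore the padded steps.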
