
I am using two stacked dynamic_rnn calls in my model, meaning the initial_state of the second dynamic_rnn is the final_state output by the first dynamic_rnn. My loss function is computed only on the output of the second dynamic_rnn. My question is: will the gradients propagate back to the first dynamic_rnn? Does TensorFlow backpropagate gradients through multiple dynamic_rnn calls?

You might ask why I go to the trouble of using two dynamic_rnn calls instead of one. The answer is that for my problem, the input sequences are all identical except for the last step. So, purely to save time, I run dynamic_rnn once over the common part of the input sequences and feed its final_state into a second dynamic_rnn, which consumes only the distinct last input element.

Suppose we have 3 sequences of length 10, all identical except for the last step (the 10th element). Simplified code:

import tensorflow as tf

# hidden_state_dim and input_element_dim are assumed to be defined elsewhere
cell = tf.nn.rnn_cell.BasicRNNCell(num_units=hidden_state_dim)
# the first dynamic_rnn, which handles the common part
first_outputs, first_states = tf.nn.dynamic_rnn(
    cell=cell,
    dtype=tf.float32,
    sequence_length=[9],  # only one sample, of length 9
    inputs=identical_input  # input with shape (1, 9, input_element_dim)
)
# tile first_states to accommodate the next dynamic_rnn:
# first_states goes from shape (1, hidden_state_dim) to (3, hidden_state_dim)
first_states = tf.tile(first_states, [3, 1])
# the second dynamic_rnn, which handles the distinct last element;
# it shares the same cell (i.e. the same weights) as the first call
# (depending on your TF version, the second call may need a reused variable scope)
second_outputs, second_states = tf.nn.dynamic_rnn(
    initial_state=first_states,
    cell=cell,
    dtype=tf.float32,
    sequence_length=[1, 1, 1],  # 3 samples with only one element each
    inputs=distinct_input  # input with shape (3, 1, input_element_dim)
)
# calculate loss based on second_outputs
loss = some_loss_function(second_outputs, ground_truth)

Answer


It should. If you run into a problem, please describe in detail the error you are seeing.
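
One minimal way to check this yourself is to ask TensorFlow for the gradient of the loss with respect to a tensor on the first dynamic_rnn's side of the graph. A sketch, assuming the simplified code above (loss and identical_input are the names from that snippet):

# sketch: confirm gradients reach the first dynamic_rnn
# loss and identical_input come from the simplified code above
grads = tf.gradients(loss, [identical_input])
# a non-None result here means backpropagation flows through both
# dynamic_rnn calls, since they live in one graph connected via first_states
print(grads)

Because the second dynamic_rnn's initial_state is an op (tile) applied to the first's final_state, the two calls form a single differentiable graph, so gradients from the loss flow back through the tiling into the first dynamic_rnn.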