TL;DR
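It appears that the code in the question shares weights and biases as intended. The multi-cells lstm1 and lstm2 will behave identically, while the cells inside each MultiRNNCell will have independent weights and biases. In pseudo-code: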
lstm1._cells[0].weights == lstm2._cells[0].weights
lstm1._cells[1].weights == lstm2._cells[1].weights
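Longer version
This is not a definitive answer yet, but it is what my research has turned up so far.
It looks like a hack, but we can override the get_variable method to see which variables are accessed. For example: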
import tensorflow as tf
from tensorflow.python.ops import variable_scope as vs

def verbose(original_function):
    # make a new function that prints the scoped variable name before calling original_function
    def new_function(*args, **kwargs):
        print('get variable:', '/'.join((tf.get_variable_scope().name, args[0])))
        result = original_function(*args, **kwargs)
        return result
    return new_function

# monkey-patch get_variable so that every variable lookup is logged
vs.get_variable = verbose(vs.get_variable)
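The same wrapping idea can be tried without TensorFlow. The following is a minimal sketch, assuming a hypothetical stand-in `get_variable` function and the use of `functools.wraps`, neither of which is part of the original answer. It only shows that a wrapper can record each requested name while passing the original result through unchanged.

```python
import functools

accessed = []  # names requested through the wrapped function

def verbose(original_function):
    """Wrap original_function so that each call records its first argument."""
    @functools.wraps(original_function)
    def new_function(*args, **kwargs):
        accessed.append(args[0])
        print('get variable:', args[0])
        return original_function(*args, **kwargs)
    return new_function

# Hypothetical stand-in for a variable lookup; this is not a TensorFlow function.
def get_variable(name, shape=None):
    return {'name': name, 'shape': shape}

# Rebind the name so that later calls go through the logging wrapper.
get_variable = verbose(get_variable)

w = get_variable('weights', shape=(4, 8))
b = get_variable('biases', shape=(8,))
```

Note that patching a module attribute in this way only affects callers that look the function up through the module at call time; code that imported the function object directly beforehand keeps the unwrapped version.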
Now we can run the following modified code:
# rnn, nstates, n_layers and x are defined as in the question
# (rnn is assumed here to be tensorflow.contrib.rnn)
def create_lstm_multicell(name):
    def lstm_cell(i, s):
        print('creating cell %i in %s' % (i, s))
        return rnn.LSTMCell(nstates, reuse=tf.get_variable_scope().reuse)
    lstm_multi_cell = rnn.MultiRNNCell([lstm_cell(i, name) for i in range(n_layers)])
    return lstm_multi_cell

# first multi-cell: creates the variables under the 'lstm' scope
with tf.variable_scope('lstm') as scope:
    lstm1 = create_lstm_multicell('lstm1')
    layer1, _ = tf.nn.dynamic_rnn(lstm1, x, dtype=tf.float32)
    val_1 = tf.reduce_sum(layer1)

# second multi-cell: re-enters the same scope with reuse enabled
with tf.variable_scope('lstm') as scope:
    scope.reuse_variables()
    lstm2 = create_lstm_multicell('lstm2')
    layer2, _ = tf.nn.dynamic_rnn(lstm2, x, dtype=tf.float32)
    val_2 = tf.reduce_sum(layer2)
The output looks like this (I removed duplicate lines):
creating cell 0 in lstm1
creating cell 1 in lstm1
get variable: lstm/rnn/multi_rnn_cell/cell_0/lstm_cell/weights
get variable: lstm/rnn/multi_rnn_cell/cell_0/lstm_cell/biases
get variable: lstm/rnn/multi_rnn_cell/cell_1/lstm_cell/weights
get variable: lstm/rnn/multi_rnn_cell/cell_1/lstm_cell/biases
creating cell 0 in lstm2
creating cell 1 in lstm2
get variable: lstm/rnn/multi_rnn_cell/cell_0/lstm_cell/weights
get variable: lstm/rnn/multi_rnn_cell/cell_0/lstm_cell/biases
get variable: lstm/rnn/multi_rnn_cell/cell_1/lstm_cell/weights
get variable: lstm/rnn/multi_rnn_cell/cell_1/lstm_cell/biases
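This output indicates that the lstm1 and lstm2 cells use the same weights and biases, while within each MultiRNNCell the first and second cells have separate weights and biases.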
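This reading of the log can also be checked mechanically by grouping the printed names by the multi-cell being built. The snippet below is a small sketch, assuming the log text is pasted in as a string; the `log` variable and the helper name `variables_by_multicell` are illustrative and not part of the original code. It also assumes that every "get variable" line belongs to the most recently created multi-cell, which holds here because dynamic_rnn is called right after each multi-cell is created.

```python
log = """\
creating cell 0 in lstm1
creating cell 1 in lstm1
get variable: lstm/rnn/multi_rnn_cell/cell_0/lstm_cell/weights
get variable: lstm/rnn/multi_rnn_cell/cell_0/lstm_cell/biases
get variable: lstm/rnn/multi_rnn_cell/cell_1/lstm_cell/weights
get variable: lstm/rnn/multi_rnn_cell/cell_1/lstm_cell/biases
creating cell 0 in lstm2
creating cell 1 in lstm2
get variable: lstm/rnn/multi_rnn_cell/cell_0/lstm_cell/weights
get variable: lstm/rnn/multi_rnn_cell/cell_0/lstm_cell/biases
get variable: lstm/rnn/multi_rnn_cell/cell_1/lstm_cell/weights
get variable: lstm/rnn/multi_rnn_cell/cell_1/lstm_cell/biases
"""

def variables_by_multicell(text):
    """Group logged variable names under the multi-cell created just before them."""
    groups = {}
    current = None
    prefix = 'get variable: '
    for line in text.splitlines():
        if line.startswith('creating cell'):
            current = line.rsplit(' ', 1)[-1]  # last word, e.g. 'lstm1'
            groups.setdefault(current, set())
        elif line.startswith(prefix) and current is not None:
            groups[current].add(line[len(prefix):])
    return groups

groups = variables_by_multicell(log)
shared = groups['lstm1'] == groups['lstm2']   # same names requested by both
n_distinct = len(groups['lstm1'])             # distinct names inside one multi-cell
print(shared, n_distinct)
```

If both multi-cells request the same set of names, and that set contains separate entries for cell_0 and cell_1, the log is consistent with weights being shared across lstm1 and lstm2 but not across layers.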
In addition, val_1 and val_2, the outputs of lstm1 and lstm2, are identical during optimization.
I think MultiRNNCell creates the namespaces cell_0, cell_1, etc. internally. Therefore the weights are reused between lstm1 and lstm2.
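To illustrate why identical scope paths would lead to shared weights, here is a toy model of a name-keyed variable store. It is a deliberately simplified sketch, not TensorFlow's implementation: `ToyStore` and `build_multicell` are hypothetical names, and the `cell_%d` sub-scopes simply mirror the names seen in the log output.

```python
class ToyStore:
    """Toy variable store: one object per full scoped name (not TensorFlow's code)."""

    def __init__(self):
        self._vars = {}

    def get_variable(self, name, reuse):
        if name in self._vars:
            if not reuse:
                raise ValueError('%s already exists' % name)
            return self._vars[name]
        if reuse:
            raise ValueError('%s does not exist' % name)
        self._vars[name] = [0.0]  # a mutable list stands in for a tensor
        return self._vars[name]

def build_multicell(store, scope, n_layers, reuse):
    """Request weights and biases for each layer under its own cell_<i> sub-scope."""
    cells = []
    for i in range(n_layers):
        prefix = '%s/multi_rnn_cell/cell_%d/lstm_cell' % (scope, i)
        cells.append({
            'weights': store.get_variable(prefix + '/weights', reuse),
            'biases': store.get_variable(prefix + '/biases', reuse),
        })
    return cells

store = ToyStore()
lstm1 = build_multicell(store, 'lstm/rnn', n_layers=2, reuse=False)  # creates variables
lstm2 = build_multicell(store, 'lstm/rnn', n_layers=2, reuse=True)   # looks them up again

# Same layer index in both multi-cells resolves to the same object ...
same_across = lstm1[0]['weights'] is lstm2[0]['weights']
# ... while different layers inside one multi-cell get different objects.
distinct_within = lstm1[0]['weights'] is not lstm1[1]['weights']
```

In this model, sharing follows purely from the full name: the per-layer sub-scope keeps the layers apart, and re-entering the same outer scope with reuse enabled maps each layer back to the variable created the first time.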
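What do you mean by reusing weights? Do you want to build a stateful process? – dv3
@dv3 No, I don't need a stateful LSTM. I just want lstm1 and lstm2 to behave identically, i.e. the weights of each cell in the multi-cell should be the same between lstm1 and lstm2. –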