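How to stop the gradient for some entries of a tensor in TensorFlow

I want to implement an embedding layer. The embedding will be initialized with pre-trained GloVe embeddings. For words that can be found in GloVe, the embedding stays fixed. For words that are not in GloVe, it is initialized randomly and is trainable. How do I do this in TensorFlow? I know there is tf.stop_gradient for a whole tensor; is there any stop_gradient API for this case? Or is there a workaround? Any suggestion is appreciated.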


2017-04-12 Jerrik Eph

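The idea is to use a mask together with tf.stop_gradient to crack this problem:

res_matrix = tf.stop_gradient(mask_h*E) + mask*E

Here, in the matrix mask, 1 marks an entry to which I want to apply the gradient and 0 marks an entry to which I don't (its gradient is set to 0). mask_h is the inverse of mask (1 flipped to 0, 0 flipped to 1). We can then fetch from res_matrix. Here is the test code: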

import tensorflow as tf
import numpy as np

def entry_stop_gradients(target, mask):
    # mask_h is the inverse of mask: it selects the entries to freeze
    mask_h = tf.abs(mask - 1)
    return tf.stop_gradient(mask_h * target) + mask * target

mask = np.array([1., 0, 1, 1, 0, 0, 1, 1, 0, 1])

emb = tf.constant(np.ones([10, 5]))

# expand_dims gives the mask shape [10, 1] so it broadcasts over each row
matrix = entry_stop_gradients(emb, tf.expand_dims(mask, 1))

parm = np.random.randn(5, 1)
t_parm = tf.constant(parm)

loss = tf.reduce_sum(tf.matmul(matrix, t_parm))
grad1 = tf.gradients(loss, emb)     # gradient w.r.t. the original embeddings
grad2 = tf.gradients(loss, matrix)  # gradient w.r.t. the masked result
print(matrix)
with tf.Session() as sess:
    print(sess.run(loss))
    print(sess.run([grad1, grad2]))
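
For checking the output of the test code, the gradients can be worked out by hand. The following is only a NumPy sketch of the values one would expect, not TensorFlow code; the fixed seed and the names expected_grad1 and expected_grad2 are assumptions made for illustration.

```python
import numpy as np

# Same mask as in the test code: 1 = trainable row, 0 = frozen row.
mask = np.array([1., 0, 1, 1, 0, 0, 1, 1, 0, 1])

# Stand-in for `parm`; the fixed seed is an assumption for reproducibility.
rng = np.random.default_rng(0)
parm = rng.standard_normal((5, 1))

# loss = sum(matrix @ parm), so d(loss)/d(matrix[i, j]) = parm[j] for every row i.
expected_grad2 = np.tile(parm.T, (10, 1))

# The stop_gradient branch contributes nothing to the backward pass, so
# d(loss)/d(emb[i, j]) = mask[i] * parm[j]: frozen rows get exactly zero.
expected_grad1 = mask[:, None] * expected_grad2

frozen_rows = np.flatnonzero(mask == 0)
print(frozen_rows)                  # rows 1, 4, 5 and 8
print(expected_grad1[frozen_rows])  # all zeros
```

In the forward pass the two branches add back up to the original tensor, since mask_h + mask is 1 everywhere, so only the backward pass is affected.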


2017-04-12 11:39:56

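I would suggest keeping two different tensors to hold your data: a tf.constant for your pre-trained data and a tf.Variable for the new data to be trained. You can then combine the two with concatenation and similar tensor-joining operations.

Since a tf.constant cannot be trained, you won't have to worry about stopping the gradient.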
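
For this to work, word ids have to line up with the rows of the concatenated table. A minimal sketch of one possible layout, in NumPy rather than TensorFlow: the glove dictionary, the vocabulary, and all names below are hypothetical stand-ins, and in a real model the two blocks would be wrapped in tf.constant and tf.Variable and joined with tf.concat.

```python
import numpy as np

dim = 3
# Hypothetical stand-in for loaded GloVe vectors (word -> vector).
glove = {"cat": np.full(dim, 0.1), "dog": np.full(dim, 0.2)}
vocab = ["dog", "zyzzyva", "cat", "foobar"]

# Put pre-trained words first, so ids 0..k-1 index the fixed block
# and ids k..n-1 index the trainable block.
known = [w for w in vocab if w in glove]
unknown = [w for w in vocab if w not in glove]
word_to_id = {w: i for i, w in enumerate(known + unknown)}

# Fixed block: would become a tf.constant.
fixed_block = np.stack([glove[w] for w in known])

# Trainable block, randomly initialized: would become a tf.Variable.
rng = np.random.default_rng(0)
trainable_block = rng.normal(scale=0.1, size=(len(unknown), dim))

# Row i of the concatenated table is the embedding of the word with id i.
table = np.concatenate([fixed_block, trainable_block], axis=0)
print(word_to_id)
print(table.shape)
```

The cost of this design is the reordering step itself: the vocabulary has to be split before any ids are assigned.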


2017-04-12 09:39:39

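That way I would have to do a lot of preprocessing, which would make my code look a bit ugly. I will try using gather and scatter together with stop_gradient and see whether that works. I really hope there is a function that supports this. Thanks.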

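I don't know much about word embeddings, but my understanding of your question is that you have a variable v and you want to train only certain (known) entries of it. You can achieve this with a "mask", i.e. a constant tensor of the same shape as v that has value 1 for trainable entries and 0 otherwise.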

v = your_variable()
loss = your_loss()  # some loss that uses v
mask = your_mask_as_explained_above()
opt = tf.train.GradientDescentOptimizer(learning_rate=0.1)

# Get the list (length 1 in our example) of (gradient, variable) pairs
# from the optimizer and extract the gradient w.r.t. v
grads_and_vars = opt.compute_gradients(loss, [v])
v_grad = grads_and_vars[0][0]

# Multiply the gradient by the mask before feeding it back to the optimizer.
# apply_gradients expects (gradient, variable) pairs, in that order.
sgd_step = opt.apply_gradients([(v_grad * mask, v)])

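Depending on the complexity of your problem, this might not be an efficient solution, though, because the full gradient w.r.t. v is still computed in each step. It just isn't applied in the optimizer update.

If you are not familiar with opt.compute_gradients and opt.apply_gradients, they are explained in the docs.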
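
To make the effect of the masked update concrete, here is a small NumPy sketch of a single plain gradient-descent step. It is only an illustration under assumptions: the toy loss 0.5 * sum(v ** 2) and the names full_grad and v_new are not from the answer above.

```python
import numpy as np

lr = 0.1
v = np.ones((4, 2))
# 1 = trainable entry, 0 = frozen entry
mask = np.array([[1., 1.],
                 [0., 0.],
                 [1., 0.],
                 [0., 1.]])

# Toy loss = 0.5 * sum(v ** 2), whose full gradient w.r.t. v is v itself.
full_grad = v.copy()

# The full gradient is computed for every entry; the mask only zeroes
# the part of it that reaches the update.
v_new = v - lr * (full_grad * mask)
print(v_new)
```

Entries where the mask is 0 keep their old value, while the others take an ordinary gradient step.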


2017-04-12 10:22:49 lballes

Thanks for your reply. I think your solution will work. I just came up with another idea and have posted it below.
