Backpropagating gradients through a sparse tensor?

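I have a normal feed-forward network that produces a vector v. The elements of v are then used as the non-zero entries of a sparse matrix M (assume the coordinates are predefined). The sparse matrix is then multiplied by a dense vector, and a loss is defined on the resulting scalar. I want to backpropagate the loss w.r.t. the network's weights, which means going through the sparse matrix.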

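This seems like a perfectly reasonable use case for a sparse matrix, but it appears that such functionality isn't supported. Indeed, even calling tf.gradients(M, [v]) produces an error: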

AttributeError: 'SparseTensor' object has no attribute 'value_index'

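Am I doing something wrong, or am I correct in assuming that this functionality doesn't (won't?) exist? If the latter, is there a workaround for this particular use case, short of rewriting all the sparse tensor operations to have gradients defined?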
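For reference, the gradient being asked about can be written in closed form. This is only a sketch: the symbols x (the dense vector), y (the product), (i_k, j_k) (the predefined coordinate of the k-th entry of v), and \ell (the loss) are notation introduced here, and the coordinates are assumed to be distinct.

```latex
y = M x, \qquad M_{i_k j_k} = v_k, \qquad
\frac{\partial \ell}{\partial v_k}
  = \frac{\partial \ell}{\partial y_{i_k}} \, x_{j_k}
```

The gradient with respect to the network's weights then follows from the ordinary chain rule through v.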

2017-02-03 zergylord

I'm fishing in the dark here, working from the code and documentation rather than from experience.

The Tensor class constructor is:

def __init__(self, op, value_index, dtype): 
    # value_index: An `int`. Index of the operation's endpoint that produces this tensor.

The value_index is used to generate the Tensor name.

The SparseTensor one is:

def __init__(self, indices, values, dense_shape):

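Nowhere in its defining file, tensorflow/tensorflow/python/framework/sparse_tensor.py, is value_index referenced.

Its arguments are tensors, each presumably with its own value_index.

Otherwise it looks like SparseTensor is another kind of IndexedSlices, which also contains tensors.

The inputs to tf.gradients are all: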

A `Tensor` or list of tensors

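The gradients definition file has an _IndexedSlicesToTensor method, but no equivalent for SparseTensor. So for IndexedSlices there seems to be some kind of automatic conversion to dense (with a warning if the result is too large), but not for SparseTensors. I don't know whether this is a case of incomplete development or an incompatibility that makes it impossible.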
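To make the "convert to dense" idea concrete, here is a minimal sketch in plain Python, with lists standing in for tensors. It is not TensorFlow code: densify, matvec, loss_fn, and grad_wrt_values are hypothetical helpers, and the sum-of-squares loss is an assumption chosen only so there is something to differentiate. It backpropagates through the dense matrix and then gathers the gradient at the sparse coordinates, checking the result against finite differences.

```python
# Illustrative sketch only: plain Python lists stand in for tensors.
# All helper names here are hypothetical, not TensorFlow functions.

def densify(indices, values, shape):
    """Scatter the non-zero values into a dense row-major matrix."""
    rows, cols = shape
    dense = [[0.0] * cols for _ in range(rows)]
    for (i, j), v in zip(indices, values):
        dense[i][j] = v
    return dense


def matvec(matrix, x):
    """Dense matrix-vector product."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in matrix]


def loss_fn(values, indices, shape, x):
    """Toy scalar loss (an assumption): sum of squares of M @ x."""
    y = matvec(densify(indices, values, shape), x)
    return sum(yi * yi for yi in y)


def grad_wrt_values(values, indices, shape, x):
    """Backpropagate through the dense matrix, then gather at the coordinates."""
    y = matvec(densify(indices, values, shape), x)
    dloss_dy = [2.0 * yi for yi in y]
    # dL/dM[i][j] = dL/dy[i] * x[j]
    dloss_dm = [[g * xj for xj in x] for g in dloss_dy]
    # Only the predefined coordinates correspond to entries of the values vector.
    return [dloss_dm[i][j] for i, j in indices]


indices = [(0, 0), (0, 2), (1, 1), (2, 0)]
values = [1.0, 2.0, 3.0, 4.0]
shape = (3, 3)
x = [0.5, -1.0, 2.0]

analytic = grad_wrt_values(values, indices, shape, x)

# Central finite differences as an independent check.
eps = 1e-6
numeric = []
for k in range(len(values)):
    up = list(values)
    down = list(values)
    up[k] += eps
    down[k] -= eps
    diff = loss_fn(up, indices, shape, x) - loss_fn(down, indices, shape, x)
    numeric.append(diff / (2 * eps))

print(analytic)
print(numeric)
```

Densifying defeats the purpose of a sparse matrix when it is large, which is presumably why the IndexedSlices conversion warns about size.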

2017-02-03 23:29:32 hpaulj

A slight variation on this does work: take the gradient with respect to the values of the SparseTensor directly:

import tensorflow as tf
# Non-zero entries; tf.identity wraps the variable (see the note below the output).
sparse_values = tf.identity(tf.Variable(tf.constant([1., 2., 3.])))
# Predefined coordinates: the diagonal of a 3x3 matrix.
sparse_indices = tf.constant([[0, 0], [1, 1], [2, 2]], dtype=tf.int64)
sparse_matrix = tf.SparseTensor(sparse_indices, sparse_values, [3, 3])
multiplied = tf.sparse_tensor_dense_matmul(sparse_matrix, tf.eye(3))
loss = tf.reduce_sum(multiplied)
# Differentiate w.r.t. the values tensor, not the SparseTensor itself.
gradients = tf.gradients(loss, [sparse_values])
with tf.Session() as session:
    tf.global_variables_initializer().run()
    print(session.run(gradients))

Prints (on TensorFlow 0.12.1):

[array([ 1., 1., 1.], dtype=float32)]

Why the tf.identity op is necessary for the gradient to be defined, I haven't figured out (probably something to do with ref dtypes).
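The printed gradient can be checked by hand. The sketch below uses plain Python lists instead of TensorFlow, so loss_fn and the variable names are illustrative assumptions only. Since the loss is the sum of all entries of S @ B, the derivative with respect to the k-th value is the sum of row j_k of the dense matrix B, which is 1 for every row of the identity.

```python
# Hand check of the gradient using plain Python lists (no TensorFlow).
indices = [(0, 0), (1, 1), (2, 2)]
values = [1.0, 2.0, 3.0]
# Stand-in for tf.eye(3).
dense = [[1.0 if r == c else 0.0 for c in range(3)] for r in range(3)]


def loss_fn(vals):
    """Sum of all entries of S @ dense, where S[i][j] = vals[k] at indices[k]."""
    total = 0.0
    for (i, j), v in zip(indices, vals):
        # Entry (i, j) of S contributes v * dense[j][c] to output entry (i, c).
        total += v * sum(dense[j])
    return total


# d loss / d vals[k] is the sum of row j_k of the dense matrix.
gradients = [sum(dense[j]) for _, j in indices]
print(gradients)
```

With a dense matrix other than the identity, the same formula gives the row sums of that matrix at the column indices of the non-zero entries.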

2017-02-08 00:34:21
