Theano with Python 2.7: SGD with multiple losses

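After hearing Theano praised, I thought I would take my first steps with a specific form of SGD. I have a parameter vector theta that I want to optimize, and my loss function returns a vector containing the column sums of the squared loss between the matrices A and B. Each element is the independent loss for one specific dimension of theta, computed with theta broadcast across the rows. Theta should be updated so that the loss for every dimension is lower in the next iteration. I chose this setup because the data (X, Y) is given this way.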

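Now, the tutorial says that T.grad() should be used to get the gradients for the updates. But T.grad does not let me compute the gradient of a non-scalar. The tutorial (http://deeplearning.net/software/theano/tutorial/gradients.html) says 'Scalar costs only can be directly handled by grad. Arrays are handled through repeated applications.' So I tried (admittedly an ugly attempt) to compute the gradient of each loss separately. How do I compute the gradients for multiple losses? Is there a clean, best-practice way? Is this even the right approach? Is there anything else I should consider?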

Martin

import numpy 
from theano import tensor as T 
from theano import function 
from theano import shared 

alpha = 0.00001 
theta = shared(numpy.random.rand(10), name='theta') 
X = T.dmatrix(name='X') 
Y = T.dmatrix(name='Y') 
losses = T.sqr(theta * X - Y).sum(axis=0)

This is where it gets weird: T.grad(losses, theta) throws TypeError: cost must be a scalar. So I ended up with this ugly attempt:

d_losses = [T.grad(losses[i], theta) for i in xrange(len(theta.get_value()))] 
updates = [(theta, theta - numpy.array(alpha) * d_losses)]

When I try to compile it, I get this:

>>> f = function(inputs=[A], outputs=loss, updates=updates) 
    Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File "/usr/local/lib/python2.7/dist-packages/theano/compile/function.py", line 266, in function 
    profile=profile) 
    File "/usr/local/lib/python2.7/dist-packages/theano/compile/pfunc.py", line 489, in pfunc 
    no_default_updates=no_default_updates) 
    File "/usr/local/lib/python2.7/dist-packages/theano/compile/pfunc.py", line 202, in rebuild_collect_shared 
    update_val = store_into.type.filter_variable(update_val) 
    File "/usr/local/lib/python2.7/dist-packages/theano/tensor/type.py", line 206, in filter_variable 
    other = self.Constant(type=self, data=other) 
    File "/usr/local/lib/python2.7/dist-packages/theano/tensor/var.py", line 732, in __init__ 
    Constant.__init__(self, type, data, name) 
    File "/usr/local/lib/python2.7/dist-packages/theano/gof/graph.py", line 443, in __init__ 
    self.data = type.filter(data) 
    File "/usr/local/lib/python2.7/dist-packages/theano/tensor/type.py", line 115, in filter 
    up_dtype = scal.upcast(self.dtype, data.dtype) 
    File "/usr/local/lib/python2.7/dist-packages/theano/scalar/basic.py", line 67, in upcast 
    rval = str(z.dtype) 
AttributeError: 'float' object has no attribute 'dtype'

Source

2015-10-06 Martin T.

Why do you want several losses? You can have a single scalar loss and take its gradient w.r.t. each component of theta.

So you mean d_loss = [T.grad(loss, theta[i]) for i in xrange(len(theta.get_value()))]? Or how would I do that? The initial idea was that each feature has its own loss, which I wanted to capture.

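As Mikael Rousson pointed out in a comment, you probably don't need to handle the losses separately for the purpose of the gradient; just sum all the loss components into a single scalar and then compute the partial derivatives with respect to the parameter vector, which yields a gradient vector.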

So add

loss = losses.sum()

or define the scalar loss directly

loss = T.sqr(theta * X - Y).sum()

and then use

d_losses = T.grad(loss, theta) 
updates = [(theta, theta - alpha * d_losses)]

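d_losses[0] is equal to the partial derivative of loss with respect to theta[0]. But the only term in loss that involves theta[0] is the component of the sum that comes from the first element of losses, so d_losses[0] is also equal to the partial derivative of losses[0] with respect to theta[0], which I think is exactly what you want.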
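The argument above can be checked numerically without Theano. The following is a minimal NumPy sketch, not part of the original answer: the array sizes, the synthetic data, the step size, and the helper names (grad_of_summed_loss, jacobian_fd) are all assumptions made for illustration. It compares a finite-difference Jacobian of losses against the analytic gradient of the summed loss, then runs a few plain gradient steps with the same update rule.

```python
import numpy as np

rng = np.random.default_rng(0)
n_rows, n_dims = 50, 10                 # illustrative sizes, not from the post
X = rng.normal(size=(n_rows, n_dims))
true_theta = rng.normal(size=n_dims)    # synthetic target, assumed for the demo
Y = true_theta * X + 0.01 * rng.normal(size=(n_rows, n_dims))
theta = rng.random(n_dims)

def losses(theta):
    # Same expression as in the post: one squared-error sum per column.
    return ((theta * X - Y) ** 2).sum(axis=0)

def grad_of_summed_loss(theta):
    # d(sum_k losses[k]) / d theta[j] = 2 * sum_i X[i, j] * (theta[j] * X[i, j] - Y[i, j])
    return 2.0 * (X * (theta * X - Y)).sum(axis=0)

def jacobian_fd(theta, eps=1e-6):
    # Central finite differences: J[k, j] approximates d losses[k] / d theta[j].
    J = np.empty((theta.size, theta.size))
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = eps
        J[:, j] = (losses(theta + step) - losses(theta - step)) / (2 * eps)
    return J

J = jacobian_fd(theta)
g = grad_of_summed_loss(theta)

# Off-diagonal entries vanish: losses[k] does not depend on theta[j] for j != k.
off_diag_max = np.abs(J - np.diag(np.diag(J))).max()
# The diagonal agrees with the gradient of the summed scalar loss.
diag_matches = np.allclose(np.diag(J), g, rtol=1e-4, atol=1e-4)

# A few gradient steps using the update rule theta <- theta - alpha * gradient.
alpha = 1e-3                            # illustrative; the post uses 0.00001
before = losses(theta)
for _ in range(100):
    theta = theta - alpha * grad_of_summed_loss(theta)
after = losses(theta)

print(off_diag_max, diag_matches, bool((after <= before).all()))
```

Because the Jacobian of losses is diagonal here, stepping along the gradient of the summed loss moves each theta[j] only according to its own column's loss, so the per-dimension losses can still be monitored individually through losses(theta).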

Source

2015-10-07 06:20:27

That makes sense. Thanks!
