Trying to wrap my head around how gradients are represented and how autograd works: why doesn't autograd produce gradients for intermediate variables?

import torch 
from torch.autograd import Variable 

x = Variable(torch.Tensor([2]), requires_grad=True) 
y = x * x 
z = y * y 

z.backward() 

print(x.grad) 
#Variable containing: 
#32 
#[torch.FloatTensor of size 1] 

print(y.grad) 
#None 

Why doesn't it produce a gradient for y? If y.grad = dz/dy, shouldn't it at least produce a Variable like y.grad = 2*y? Working it out by hand: z = y*y = x^4, so dz/dx = 4*x^3 = 32 at x = 2, and dz/dy = 2*y = 8 at y = 4.

I think this would be an interesting question to post on https://discuss.pytorch.org/

Answer

By default, gradients are retained only for leaf variables. Gradients for non-leaf variables are not retained for later inspection. This is by design, to save memory.

-soumith chintala

See also: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94
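To make the leaf/non-leaf distinction concrete, here is a minimal sketch using the current tensor API (in later PyTorch releases Variable was merged into Tensor; is_leaf is assumed from that newer API and may not exist on old Variable objects):

import torch

x = torch.tensor([2.0], requires_grad=True) # created directly by the user -> leaf
y = x * x                                   # produced by an operation -> non-leaf

print(x.is_leaf) # True: backward() populates x.grad
print(y.is_leaf) # False: y.grad stays None unless you ask for it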

Option 1:

Call y.retain_grad():

import torch 
from torch.autograd import Variable 

x = Variable(torch.Tensor([2]), requires_grad=True) 
y = x * x 
z = y * y 

y.retain_grad()  # keep the gradient on this non-leaf Variable after backward() 

z.backward() 

print(y.grad) 
#Variable containing: 
# 8 
#[torch.FloatTensor of size 1] 

Source: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/16
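For reference, the same fix in current PyTorch, where Variable has been merged into Tensor (a sketch; the printed value is shown as a comment):

import torch

x = torch.tensor([2.0], requires_grad=True)
y = x * x
z = y * y

y.retain_grad() # keep dz/dy on this non-leaf tensor
z.backward()

print(y.grad) # tensor([8.])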

Option 2:

Register a hook, which is basically a function that gets called when the gradient is computed. You can then save it, assign it somewhere, print it, whatever you need; a sketch of the "save it" variant follows after this option's source link.

from __future__ import print_function 
import torch 
from torch.autograd import Variable 

x = Variable(torch.Tensor([2]), requires_grad=True) 
y = x * x 
z = y * y 

y.register_hook(print) ## this can be anything you need it to be 

z.backward() 

Output:

Variable containing: 
 8 
[torch.FloatTensor of size 1] 

Source: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/2
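As mentioned above, the hook can also save the gradient instead of printing it. Here is a minimal sketch of that variant, still using the old Variable API from the question (the grads dict and the lambda are illustrative names, not part of the PyTorch API):

import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y

grads = {} # stash for intermediate gradients
y.register_hook(lambda grad: grads.update(y=grad)) # returns None, so the gradient flows on unchanged

z.backward()

print(grads['y']) # dz/dy = 8, same value as above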

See also: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/7
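Not covered in the original answers, but current PyTorch also exposes torch.autograd.grad, which returns the gradient of an intermediate variable directly instead of storing it anywhere (a sketch using the modern tensor API):

import torch

x = torch.tensor([2.0], requires_grad=True)
y = x * x
z = y * y

dz_dy, = torch.autograd.grad(z, y) # returns a tuple; nothing is written to y.grad
print(dz_dy) # tensor([8.])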

Thanks, I didn't know about the retain_grad() method.
