Trying to wrap my head around how gradients are represented and how autograd works: why doesn't autograd produce gradients for intermediate variables?

import torch 
from torch.autograd import Variable 

x = Variable(torch.Tensor([2]), requires_grad=True) 
y = x * x 
z = y * y 

z.backward() 

print(x.grad) 
#Variable containing: 
#32 
#[torch.FloatTensor of size 1] 

print(y.grad) 
#None 

Why doesn't it produce a gradient for y? If y.grad = dz/dy, shouldn't it at least produce a Variable like y.grad = 2*y? Working it out by hand: z = y*y = x^4, so dz/dx = 4*x^3 = 32 at x = 2, and dz/dy = 2*y = 8 at y = 4.

I think this would be an interesting question to post on https://discuss.pytorch.org/

Answer

By default, gradients are retained only for leaf variables. Gradients for non-leaf variables are not retained for later inspection. This is by design, to save memory.

-soumith chintala

See also: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94
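To make the leaf/non-leaf distinction concrete, here is a minimal sketch using the current tensor API (in later PyTorch releases Variable was merged into Tensor; is_leaf is assumed from that newer API and may not exist on old Variable objects):

import torch

x = torch.tensor([2.0], requires_grad=True) # created directly by the user -> leaf
y = x * x                                   # produced by an operation -> non-leaf

print(x.is_leaf) # True: backward() populates x.grad
print(y.is_leaf) # False: y.grad stays None unless you ask for it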

Option 1:

Call y.retain_grad():

import torch 
from torch.autograd import Variable 

x = Variable(torch.Tensor([2]), requires_grad=True) 
y = x * x 
z = y * y 

y.retain_grad()  # keep the gradient on this non-leaf Variable after backward() 

z.backward() 

print(y.grad) 
#Variable containing: 
# 8 
#[torch.FloatTensor of size 1] 

Source: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/16
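For reference, the same fix in current PyTorch, where Variable has been merged into Tensor (a sketch; the printed value is shown as a comment):

import torch

x = torch.tensor([2.0], requires_grad=True)
y = x * x
z = y * y

y.retain_grad() # keep dz/dy on this non-leaf tensor
z.backward()

print(y.grad) # tensor([8.])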

Option 2:

Register a hook, which is basically a function that gets called when the gradient is computed. You can then save it, assign it somewhere, print it, whatever you need; a sketch of the "save it" variant follows after this option's source link.

from __future__ import print_function 
import torch 
from torch.autograd import Variable 

x = Variable(torch.Tensor([2]), requires_grad=True) 
y = x * x 
z = y * y 

y.register_hook(print) ## this can be anything you need it to be 

z.backward() 

Output:

Variable containing: 
 8 
[torch.FloatTensor of size 1] 

Source: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/2
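As mentioned above, the hook can also save the gradient instead of printing it. Here is a minimal sketch of that variant, still using the old Variable API from the question (the grads dict and the lambda are illustrative names, not part of the PyTorch API):

import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y

grads = {} # stash for intermediate gradients
y.register_hook(lambda grad: grads.update(y=grad)) # returns None, so the gradient flows on unchanged

z.backward()

print(grads['y']) # dz/dy = 8, same value as above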

See also: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/7
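Not covered in the original answers, but current PyTorch also exposes torch.autograd.grad, which returns the gradient of an intermediate variable directly instead of storing it anywhere (a sketch using the modern tensor API):

import torch

x = torch.tensor([2.0], requires_grad=True)
y = x * x
z = y * y

dz_dy, = torch.autograd.grad(z, y) # returns a tuple; nothing is written to y.grad
print(dz_dy) # tensor([8.])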

Thanks, I didn't know about the retain_grad() method.
