
I am using the following equation to compute the gradient numerically in Python:

gradient = [f(x+h) - f(x-h)] / (2h)

I tested it with a linear function, but something is wrong. The code is here:

import numpy as np 

def evla_numerical_gradient(f, x):

    gradient = np.zeros(x.shape, dtype=np.float64)
    delta_x = 0.00001

    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])

    while not it.finished:
        index = it.multi_index
        x_old = x[index]

        # evaluate f at x + delta_x
        x[index] = x_old + delta_x
        fx_addh = f(x)
        print(fx_addh)

        # evaluate f at x - delta_x
        x[index] = x_old - delta_x
        fx_minush = f(x)
        print(fx_minush)

        # restore the original value
        x[index] = x_old

        # central difference
        print((fx_addh - fx_minush)/(2 * delta_x))
        gradient[index] = (fx_addh - fx_minush)/(2. * delta_x)

        it.iternext()

    return gradient


def lin(x):
    return x

if __name__ == '__main__': 
    x = np.array([0.001]) 
    grad = evla_numerical_gradient(lin, x) 
    print(grad) 

The result is:

[ 0.00101] 
[ 0.00099] 
[ 0.] 
[ 0.] 

Why is the gradient at x 0?

Answers


The problem with your code is the following combination of lines (I show the example for fx_addh; the case of fx_minush is similar):

fx_addh = f(x) 
x[index] = x_old 

You want to place the result of f(x) into fx_addh, but the problem is the way you have defined f(x): it is just a handle to your lin(x), which returns its argument directly.

Assignment statements in Python do not copy objects; they create a binding between the target (on the left-hand side of the =) and the object (on the right-hand side of the =). More about this here.

To convince yourself that this is happening, you can place another print(fx_addh) after the line that sets x[index] = x_old; you will see that it now contains zeros.
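A minimal sketch of this binding behaviour (not the original code, just an illustration reusing the same names):

import numpy as np

x = np.array([0.001])
fx_addh = x          # fx_addh is bound to the same array object, not a copy
x[0] = 0.0           # "restoring" x also changes what fx_addh contains
print(fx_addh)       # [ 0.]
print(fx_addh is x)  # True: both names refer to one object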

To fix this, you can modify your lin(x) function so that it returns a copy of the object passed as an argument:

import numpy as np 
import copy 

def evla_numerical_gradient(f, x):

    gradient = np.zeros(x.shape, dtype=np.float64)
    delta_x = 0.00001

    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])

    while not it.finished:
        index = it.multi_index
        x_old = x[index]

        # evaluate f at x + delta_x
        x[index] = x_old + delta_x
        fx_addh = f(x)
        print(fx_addh)

        # evaluate f at x - delta_x
        x[index] = x_old - delta_x
        fx_minush = f(x)
        print(fx_minush)

        # restore the original value
        x[index] = x_old

        # central difference
        print((fx_addh - fx_minush)/(2 * delta_x))
        gradient[index] = (fx_addh - fx_minush)/(2. * delta_x)

        it.iternext()

    return gradient


def lin(x):
    return copy.copy(x)  # return a copy of the array, not the array itself

if __name__ == '__main__': 
    x = np.array([0.001]) 
    grad = evla_numerical_gradient(lin, x) 
    print(grad) 

This will return:

[ 0.00101] 
[ 0.00099] 
[ 1.] 
[ 1.] 

indicating a gradient of 1, as you would expect.
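An alternative, which is not part of the answer above, is to copy at the call site inside the gradient routine instead of changing lin. A sketch under that assumption (the name eval_gradient_copy_at_call_site is hypothetical, and it assumes f returns an array of the same shape as x, as lin does):

import numpy as np

def eval_gradient_copy_at_call_site(f, x, delta_x=1e-5):
    gradient = np.zeros(x.shape, dtype=np.float64)
    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:
        index = it.multi_index
        x_old = x[index]

        x[index] = x_old + delta_x
        fx_addh = np.copy(f(x))    # materialize the value before x is restored

        x[index] = x_old - delta_x
        fx_minush = np.copy(f(x))

        x[index] = x_old           # safe: fx_addh/fx_minush no longer alias x

        gradient[index] = (fx_addh[index] - fx_minush[index]) / (2. * delta_x)
        it.iternext()
    return gradient

def lin(x):
    return x                       # the original, copy-free lin still works here

print(eval_gradient_copy_at_call_site(lin, np.array([0.001])))  # [ 1.]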


Because fx_addh and fx_minush point to the same memory as x. Change the lin function as follows:

def lin(x): 
    return x.copy() 

The result:

[ 0.00101] 
[ 0.00099] 
[ 1.] 
[ 1.]
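A side note beyond both answers: the aliasing only bites because lin returns the array itself. A typical loss function returns a plain scalar, in which case no copy is needed. A small check using the evla_numerical_gradient from the listings above (sum_of_squares is my example, not from the question):

import numpy as np

def sum_of_squares(x):
    return float(np.sum(x ** 2))   # returns a detached scalar, so no aliasing

x = np.array([0.001, -0.5])
grad = evla_numerical_gradient(sum_of_squares, x)
print(grad)    # approximately [ 0.002 -1.   ], matching the analytic gradient 2*x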