
Linear regression with manual gradient calculation

I understand that Code 1 performs linear regression using tf.train.GradientDescentOptimizer, which belongs to the TensorFlow library (a black box).

Code 2 is an example that does the same thing without GradientDescentOptimizer, i.e. without the black box.

I would like to add the bias term to Code 2 (# hypothesis = X * W + b). In that case, what should the corresponding code (gradient, descent, update, etc.) look like?

Code 1

import tensorflow as tf 

x_train = [1, 2, 3] 
y_train = [1, 2, 3] 

X = tf.placeholder(tf.float32) 
Y = tf.placeholder(tf.float32) 
W = tf.Variable(5.) 
b = tf.Variable(5.) 
hypothesis = X * W + b 
cost = tf.reduce_mean(tf.square(hypothesis - Y)) 
learning_rate = 0.1 

optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate) 
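# compute_gradients returns a list of (gradient, variable) pairs, here [(dcost/dW, W), (dcost/db, b)];
# apply_gradients then applies one gradient-descent step to each variable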
gvs = optimizer.compute_gradients(cost, [W, b]) 
apply_gradients = optimizer.apply_gradients(gvs) 

sess = tf.Session() 
sess.run(tf.global_variables_initializer()) 
for step in range(21): 
    gradient_val, cost_val, _ = sess.run(
     [gvs, cost, apply_gradients], feed_dict={X: x_train, Y: y_train}) 
    print("%3d Cost: %10s, W': %10s, W: %10s, b': %10s, b: %10s" % 
      (step, round(cost_val, 5), 
      round(gradient_val[0][0] * learning_rate, 5), round(gradient_val[0][1], 5), 
      round(gradient_val[1][0] * learning_rate, 5), round(gradient_val[1][1], 5))) 

Code 2

import tensorflow as tf 

x_train = [1, 2, 3] 
y_train = [1, 2, 3] 

X = tf.placeholder(tf.float32) 
Y = tf.placeholder(tf.float32) 
W = tf.Variable(5.) 
# b = tf.Variable(5.) # Bias 
hypothesis = X * W 
# hypothesis = X * W + b 
cost = tf.reduce_mean(tf.square(hypothesis - Y)) 
learning_rate = 0.1 

gradient = tf.reduce_mean((W * X - Y) * X) * 2 
descent = W - learning_rate * gradient 
update = tf.assign(W, descent) 

sess = tf.Session() 
sess.run(tf.global_variables_initializer()) 
print(sess.run(W)) 
for step in range(21): 
    gradient_val, update_val, cost_val = sess.run(
     [gradient, update, cost], feed_dict={X: x_train, Y: y_train}) 
    print(step, gradient_val * learning_rate, update_val, cost_val) 

A very interesting question! –

Answer


I referred to An Introduction to Gradient Descent and Linear Regression.
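For clarity (this derivation is added here; the original answer gives only the code), the two gradient expressions used in the code below come from differentiating the mean-squared-error cost:

cost(W, b) = \frac{1}{m} \sum_{i=1}^{m} (W x_i + b - y_i)^2

\frac{\partial\, cost}{\partial W} = \frac{2}{m} \sum_{i=1}^{m} (W x_i + b - y_i)\, x_i,
\qquad
\frac{\partial\, cost}{\partial b} = \frac{2}{m} \sum_{i=1}^{m} (W x_i + b - y_i)

Each step then updates W ← W − learning_rate · ∂cost/∂W and b ← b − learning_rate · ∂cost/∂b, which is exactly what the W_descent/W_update and b_descent/b_update lines implement.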

Code 2

import tensorflow as tf 

x_train = [1, 2, 3] 
y_train = [1, 2, 3] 

X = tf.placeholder(tf.float32) 
Y = tf.placeholder(tf.float32) 
W = tf.Variable(5.) 
b = tf.Variable(5.) 
hypothesis = X * W + b 
cost = tf.reduce_mean(tf.square(hypothesis - Y)) 
learning_rate = 0.1 

W_gradient = tf.reduce_mean((W * X + b - Y) * X) * 2 
b_gradient = tf.reduce_mean(W * X + b - Y) * 2 
W_descent = W - learning_rate * W_gradient 
b_descent = b - learning_rate * b_gradient 
W_update = tf.assign(W, W_descent) 
b_update = tf.assign(b, b_descent) 

sess = tf.Session() 
sess.run(tf.global_variables_initializer()) 
for step in range(21): 
    cost_val, W_gradient_val, W_update_val, b_gradient_val, b_update_val = sess.run(
     [cost, W_gradient, W_update, b_gradient, b_update], 
     feed_dict={X: x_train, Y: y_train}) 
    print("%3d Cost: %8s, W': %8s, W: %8s, b': %8s, b: %8s" % 
      (step, round(cost_val, 5), 
      round(W_gradient_val * learning_rate, 5), round(W_update_val, 5), 
      round(b_gradient_val * learning_rate, 5), round(b_update_val, 5)))
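
As an optional sanity check (a minimal sketch added here, not part of the original answer), the hand-derived gradients can be compared with what TensorFlow's automatic differentiation returns via tf.gradients; the two should agree up to floating-point rounding:

import tensorflow as tf

x_train = [1, 2, 3]
y_train = [1, 2, 3]

X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
W = tf.Variable(5.)
b = tf.Variable(5.)
hypothesis = X * W + b
cost = tf.reduce_mean(tf.square(hypothesis - Y))

# Hand-derived gradients (same formulas as in the answer above)
W_gradient = tf.reduce_mean((W * X + b - Y) * X) * 2
b_gradient = tf.reduce_mean(W * X + b - Y) * 2

# Gradients computed by TensorFlow's automatic differentiation
auto_W_gradient, auto_b_gradient = tf.gradients(cost, [W, b])

sess = tf.Session()
sess.run(tf.global_variables_initializer())
print(sess.run([W_gradient, auto_W_gradient, b_gradient, auto_b_gradient],
               feed_dict={X: x_train, Y: y_train}))
# The manual and automatic values should match.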