Difficulty understanding TensorFlow computations

I'm new to TensorFlow and I'm having trouble understanding how its computations work. I couldn't find an answer to my question online.

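In the code below, I print `d` on the last iteration of the for loop in the `train_neural_net()` function, and I expect those values to match what `test_distance.eval` prints. But they are different. Can anyone tell me why this happens? Isn't TensorFlow supposed to cache the variable values learned in the for loop and use them when `test_distance.eval` runs?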

def neural_network_model1(data): 
    nn1_hidden_1_layer = {'weights': tf.Variable(tf.random_normal([5, n_nodes_hl1])), 'biasses': tf.Variable(tf.random_normal([n_nodes_hl1]))} 
    nn1_hidden_2_layer = {'weights': tf.Variable(tf.random_normal([n_nodes_hl1, n_nodes_hl2])), 'biasses': tf.Variable(tf.random_normal([n_nodes_hl2]))} 
    nn1_output_layer = {'weights': tf.Variable(tf.random_normal([n_nodes_hl2, vector_size])), 'biasses': tf.Variable(tf.random_normal([vector_size]))} 

    nn1_l1 = tf.add(tf.matmul(data, nn1_hidden_1_layer["weights"]), nn1_hidden_1_layer["biasses"]) 
    nn1_l1 = tf.sigmoid(nn1_l1) 

    nn1_l2 = tf.add(tf.matmul(nn1_l1, nn1_hidden_2_layer["weights"]), nn1_hidden_2_layer["biasses"]) 
    nn1_l2 = tf.sigmoid(nn1_l2) 

    nn1_output = tf.add(tf.matmul(nn1_l2, nn1_output_layer["weights"]), nn1_output_layer["biasses"]) 

    return nn1_output 

def neural_network_model2(data): 
    nn2_hidden_1_layer = {'weights': tf.Variable(tf.random_normal([5, n_nodes_hl1])), 'biasses': tf.Variable(tf.random_normal([n_nodes_hl1]))} 
    nn2_hidden_2_layer = {'weights': tf.Variable(tf.random_normal([n_nodes_hl1, n_nodes_hl2])), 'biasses': tf.Variable(tf.random_normal([n_nodes_hl2]))} 
    nn2_output_layer = {'weights': tf.Variable(tf.random_normal([n_nodes_hl2, vector_size])), 'biasses': tf.Variable(tf.random_normal([vector_size]))} 

    nn2_l1 = tf.add(tf.matmul(data, nn2_hidden_1_layer["weights"]), nn2_hidden_1_layer["biasses"]) 
    nn2_l1 = tf.sigmoid(nn2_l1) 

    nn2_l2 = tf.add(tf.matmul(nn2_l1, nn2_hidden_2_layer["weights"]), nn2_hidden_2_layer["biasses"]) 
    nn2_l2 = tf.sigmoid(nn2_l2) 

    nn2_output = tf.add(tf.matmul(nn2_l2, nn2_output_layer["weights"]), nn2_output_layer["biasses"]) 

    return nn2_output 

def train_neural_net(): 
    prediction1 = neural_network_model1(x1) 
    prediction2 = neural_network_model2(x2) 

    distance = tf.sqrt(tf.reduce_sum(tf.square(tf.subtract(prediction1, prediction2)), reduction_indices=1)) 
    cost = tf.reduce_mean(tf.multiply(y, distance)) 
    optimizer = tf.train.AdamOptimizer().minimize(cost) 

    hm_epochs = 500 

    test_result1 = neural_network_model1(x3) 
    test_result2 = neural_network_model2(x4) 
    test_distance = tf.sqrt(tf.reduce_sum(tf.square(tf.subtract(test_result1, test_result2)), reduction_indices=1)) 

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())

        for epoch in range(hm_epochs):
            _, d = sess.run([optimizer, distance], feed_dict={x1: train_x1, x2: train_x2, y: train_y})
            print("Epoch", epoch, "distance", d)

        print("test distance", test_distance.eval({x3: train_x1, x4: train_x2}))

train_neural_net()


2017-03-01 Mehran

Each call to `neural_network_model1()` or `neural_network_model2()` creates a new set of variables, so there are four sets of variables in total.

The call to `sess.run(tf.global_variables_initializer())` initializes all four sets of variables.

When you train in the for loop, you only update the first two sets of variables, which are created by these lines:

prediction1 = neural_network_model1(x1) 
prediction2 = neural_network_model2(x2)

When you evaluate `test_distance.eval()`, the tensor `test_distance` depends only on the last two sets of variables, which are created by these lines:
```
test_result1 = neural_network_model1(x3) 
test_result2 = neural_network_model2(x4) 
```
These variables are never updated in the training loop, so the evaluation result is based on their random initial values.

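TensorFlow does include some code for sharing weights between multiple calls to the same function, using `with tf.variable_scope(...):` blocks. For more information on how to use them, see the tutorial on variables and sharing on the TensorFlow website.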


2017-03-01 21:20:37 mrry

Thanks for the answer. So my `neural_network_model(data)` function is more like a class, in that each time I run it I create a new object with its own variables. Is that correct? – Mehran

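I suppose you could think of it as a constructor, yes, although it's a bit different from a class because the `layer` dictionaries aren't stored as members of an object. TensorFlow uses global state (technically, members of a `tf.Graph` object) to collect together all of the `tf.Variable` objects that you create, and it uses this collection in `tf.global_variables_initializer()` and `tf.train.AdamOptimizer().minimize()`. – mrry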

Got it! Thanks. – Mehran

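You don't need to define two functions to build the models. You can use `tf.name_scope` and pass the model's name to a single function, so that the name is used as a prefix for the variable declarations.

You have also defined two distance tensors, `distance` and `test_distance`. Your model learns from the training data by minimizing `cost`, which depends only on `distance`. So `test_distance` is never used during training, and the models behind it never learn anything. Again, there is no need for two distance functions; you only need one. To compute the training distance, feed it the training data; to compute the test distance, feed it the test data. If you do want the second distance to work as written, you would have to declare another optimizer for it and train it the same way you trained the first.

You should also keep in mind that models learn from both their initial values and the training data. Even if you feed two models exactly the same training batches, you can't expect them to end up identical, because their weights start from different initial values, which can lead them into different local minima of the error surface.

Finally, note that every time you call `neural_network_model1` or `neural_network_model2`, you create new weights and biases, because `tf.Variable` creates new variables for you.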
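As a rough illustration of "one distance function, fed different data", here is a plain-Python sketch. Everything in it is assumed for the example: `make_linear` is a hypothetical single-layer stand-in for the question's networks, and `row_distance` mirrors the per-row Euclidean distance that the question computes with `tf.sqrt(tf.reduce_sum(tf.square(...), 1))`.

```python
import math
import random


def make_linear(n_in, n_out, rng):
    """Create the weights ONCE and return a function that applies them."""
    weights = [[rng.gauss(0.0, 1.0) for _ in range(n_out)] for _ in range(n_in)]

    def apply(rows):
        return [[sum(x * weights[i][j] for i, x in enumerate(row))
                 for j in range(n_out)] for row in rows]

    return apply


def row_distance(a, b):
    """Euclidean distance between matching rows of two output matrices."""
    return [math.sqrt(sum((p - q) ** 2 for p, q in zip(ra, rb)))
            for ra, rb in zip(a, b)]


rng = random.Random(0)
model1 = make_linear(5, 3, rng)  # built once, reused for every input
model2 = make_linear(5, 3, rng)


def distance(left_rows, right_rows):
    """The single distance function; callers decide which data to feed."""
    return row_distance(model1(left_rows), model2(right_rows))


train_left = [[1.0] * 5, [0.0] * 5]
train_right = [[1.0] * 5, [0.0] * 5]

d_train = distance(train_left, train_right)  # "training" evaluation
d_check = distance(train_left, train_right)  # later evaluation, same data
print("same result for same data:", d_train == d_check)
```

Because both calls go through the same weights, feeding the same data gives the same distances, which is the behaviour the question expected from `test_distance`.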


2017-03-01 21:31:18 goldIs
