TensorFlow random values

I am taking my first steps in deep learning and TensorFlow, so I have a few questions.

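Following the tutorials and the getting-started guide, I built a DNN with hidden layers as well as some simple softmax models. I used the dataset from https://archive.ics.uci.edu/ml/datasets/wine and split it into a training and a test dataset.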

from __future__ import print_function 
import tensorflow as tf 


num_attributes = 13 
num_types = 3 


def read_from_cvs(filename_queue): 
    reader = tf.TextLineReader() 
    key, value = reader.read(filename_queue) 
    record_defaults = [[] for col in range(
     num_attributes + 1)] 
    attributes = tf.decode_csv(value, record_defaults=record_defaults) 
    features = tf.stack(attributes[1:], name="features") 
    labels = tf.one_hot(tf.cast(tf.stack(attributes[0], name="labels"), tf.uint8), num_types + 1, name="labels-onehot") 
    return features, labels 


def input_pipeline(filename='wine_train.csv', batch_size=30, num_epochs=None): 
    filename_queue = tf.train.string_input_producer([filename], num_epochs=num_epochs, shuffle=True) 
    features, labels = read_from_cvs(filename_queue) 

    min_after_dequeue = 2 * batch_size 
    capacity = min_after_dequeue + 3 * batch_size 
    feature_batch, label_batch = tf.train.shuffle_batch(
     [features, labels], batch_size=batch_size, capacity=capacity, 
     min_after_dequeue=min_after_dequeue) 
    return feature_batch, label_batch 


def train_and_test(hidden1, hidden2, learning_rate, epochs, train_batch_size, test_batch_size, test_interval): 
    examples_train, labels_train = input_pipeline(filename="wine_train.csv", batch_size=train_batch_size) 
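    # NOTE: this reads wine_train.csv again; point it at the held-out test split to measure generalization.
    examples_test, labels_test = input_pipeline(filename="wine_train.csv", batch_size=test_batch_size) 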

    with tf.name_scope("first_layer"):  # scope names must not contain spaces
     x = tf.placeholder(tf.float32, [None, num_attributes], name="input") 
     weights1 = tf.Variable(
      tf.random_normal(shape=[num_attributes, hidden1], stddev=0.1), name="weights") 
     bias = tf.Variable(tf.constant(0.0, shape=[hidden1]), name="bias") 
     activation = tf.nn.relu(
      tf.matmul(x, weights1) + bias, name="relu_act") 
     tf.summary.histogram("first_activation", activation) 

    with tf.name_scope("second_layer"): 
     weights2 = tf.Variable(
      tf.random_normal(shape=[hidden1, hidden2], stddev=0.1), 
      name="weights") 
     bias2 = tf.Variable(tf.constant(0.0, shape=[hidden2]), name="bias") 
     activation2 = tf.nn.relu(
      tf.matmul(activation, weights2) + bias2, name="relu_act") 
     tf.summary.histogram("second_activation", activation2) 

    with tf.name_scope("output_layer"): 
     weights3 = tf.Variable(
      tf.random_normal(shape=[hidden2, num_types + 1], stddev=0.5), name="weights") 
     bias3 = tf.Variable(tf.constant(1.0, shape=[num_types+1]), name="bias") 
     output = tf.add(
      tf.matmul(activation2, weights3, name="mul"), bias3, name="output") 
     tf.summary.histogram("output_activation", output) 

    y_ = tf.placeholder(tf.float32, [None, num_types+1]) 

    with tf.name_scope("loss"): 
     cross_entropy = tf.reduce_mean(
      tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=output)) 
     tf.summary.scalar("cross_entropy", cross_entropy) 
    with tf.name_scope("train"): 
     train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(cross_entropy) 

    with tf.name_scope("tests"): 
     correct_prediction = tf.equal(tf.argmax(output, 1), tf.argmax(y_, 1)) 
     accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) 
     tf.summary.scalar("accuracy", accuracy) 

    summary_op = tf.summary.merge_all() 
    sess = tf.InteractiveSession() 
    writer = tf.summary.FileWriter("./wineDnnLow", sess.graph) 
    tf.global_variables_initializer().run() 
    coord = tf.train.Coordinator() 
    threads = tf.train.start_queue_runners(coord=coord, sess=sess) 


    try: 
     step = 0 
     while not coord.should_stop() and step < epochs: 
      # train 
      ex, lab = sess.run([examples_train, labels_train]) 
      _ = sess.run([train_step], feed_dict={x: ex, y_: lab}) 
      # test 
      if step % test_interval == 0: 
       ex, lab = sess.run([examples_test, labels_test]) 
       summary, test_accuracy = sess.run([summary_op, accuracy], feed_dict={x: ex, y_: lab}) 
       writer.add_summary(summary, step) 
       # Number both fields: mixing {0:f} with a bare {} raises ValueError.
       print("accuracy = {0:f} on step {1}".format(test_accuracy, step)) 
      step += 1 
    except tf.errors.OutOfRangeError: 
     print("Done training for %d steps" % (step)) 

    coord.request_stop() 
    coord.join(threads) 
    sess.close() 



def main(): 
    train_and_test(10, 20, 0.5, 700, 30, 10, 1) 


if __name__ == '__main__': 
    main()

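The problem is that the accuracy does not converge and seems to take random values. However, when I try tf.contrib.learn.DNNClassifier, my data is classified quite well. Can anyone give me a hint as to what is wrong with the DNN I built myself?

I also have a second question. During training I pass train_step to session.run(), but not during testing. Does that guarantee that the weights are unaffected, so that the graph does not learn from the test data?

Edit: if I use the MNIST dataset and its batching instead, my network behaves well. So I think the problem is caused by the input_pipeline.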

Source

2017-08-29 user98765

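Lower the learning rate and reduce the stddev of all layers. More generally, how did you come up with all these constants? It looks like you picked arbitrary initial values for every variable. – lejlot

I tried different learning rates, but the problem stays the same. Also, if I use the MNIST dataset with its batching, the network works fine. So I think the problem must be caused by my input_pipeline. – user98765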

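A quick look at the dataset tells me the first thing I would do is normalize it (subtract the mean, divide by the standard deviation). That said, it is still a very small dataset compared with MNIST, so don't expect everything to behave exactly the same.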
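A minimal sketch of that normalization, written in plain Python so it runs without extra dependencies. The helper names (`fit_standardizer`, `standardize`), the `eps` guard, and the toy rows are assumptions made for illustration; they are not part of the question's code or of the wine data. The statistics are computed on the training rows only and then reused for the test rows.

```python
import math


def fit_standardizer(rows):
    """Return per-column (means, stds) computed from the training rows."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / n)
        for col, m in zip(columns, means)
    ]
    return means, stds


def standardize(rows, means, stds, eps=1e-8):
    """Subtract the mean and divide by the standard deviation, column-wise."""
    return [
        [(v - m) / (s + eps) for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Made-up toy rows: two columns on very different scales.
train = [[14.0, 1000.0], [12.0, 600.0], [13.0, 800.0]]
test = [[13.5, 900.0]]

means, stds = fit_standardizer(train)
train_norm = standardize(train, means, stds)
test_norm = standardize(test, means, stds)  # reuse the training statistics
```

The normalized lists can then be fed through feed_dict in place of the raw attribute values.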

If you are unsure about the input pipeline, just load all the data into memory instead of using the input pipeline.
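One way this could look, as a rough sketch using only the standard library. The function names (`load_csv`, `minibatches`) and the in-memory sample rows are hypothetical stand-ins for the real CSV file; the label-first column layout and the 3 + 1 one-hot slots follow the question's code.

```python
import csv
import io
import random


def load_csv(fileobj, num_classes):
    """Parse rows of `label,feature,feature,...` into features and one-hot labels."""
    features, labels = [], []
    for row in csv.reader(fileobj):
        if not row:
            continue  # skip blank lines
        one_hot = [0.0] * num_classes
        one_hot[int(row[0])] = 1.0
        features.append([float(v) for v in row[1:]])
        labels.append(one_hot)
    return features, labels


def minibatches(features, labels, batch_size, rng):
    """Yield shuffled (features, labels) batches that cover the data once."""
    order = list(range(len(features)))
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        yield [features[i] for i in idx], [labels[i] for i in idx]


# Made-up stand-in for the CSV file. Labels run 1..3 as in the question,
# hence 3 + 1 one-hot slots.
sample = io.StringIO("1,0.5,2.0\n2,1.5,3.0\n3,2.5,4.0\n1,0.7,2.2\n")
features, labels = load_csv(sample, num_classes=4)

for batch_x, batch_y in minibatches(features, labels, batch_size=2, rng=random.Random(0)):
    # In the question's code, this is where
    # sess.run(train_step, feed_dict={x: batch_x, y_: batch_y}) would go.
    print(len(batch_x), len(batch_y))
```

With the data held in ordinary Python lists, it is easy to print a batch and check that features and labels are what you expect before blaming the network.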

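Some general notes:

Your input pipeline isn't saving you any time. Your dataset is small, so I would just use feed_dict; if it were large, you would be better off dropping the placeholders and using the output of input_pipeline directly (and building a separate graph for testing).

Use the tf.layers API for common layer types. For example, your inference section can effectively be reduced to the following three lines.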

activation = tf.layers.dense(x, hidden1, activation=tf.nn.relu) 
# The second layer takes the first layer's output, not x.
activation2 = tf.layers.dense(activation, hidden2, activation=tf.nn.relu) 
output = tf.layers.dense(activation2, num_types+1)

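(You won't get the same initialization, but you can specify it with optional arguments. The defaults are a good starting point, though.)

GradientDescentOptimizer is very primitive. My current favorite is AdamOptimizer, but experiment with others. If that looks too complicated, MomentumOptimizer usually offers a good trade-off between complexity and performance benefit.

Check out the tf.estimator.Estimator API. It makes much of what you are doing easier and forces you to separate data loading from the model itself (a good thing).

Check out the tf.contrib.data.Dataset API for data preprocessing. Queues have been in TensorFlow for a while, so that is what most tutorials are written with, but I find the Dataset API more intuitive and simpler. Again, it is overkill for this case, where you can easily load all the data into memory. For how to use a Dataset starting from a CSV file, see this.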
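On the optimizer note above: to give a feel for what momentum adds over plain gradient descent, here is a small sketch on a made-up one-dimensional loss. It follows the update rule documented for MomentumOptimizer (accumulation = momentum * accumulation + gradient; variable -= learning_rate * accumulation), but the toy loss, the function names, and the constants are assumptions chosen for illustration, so it says nothing about how either optimizer behaves on the wine data.

```python
def gradient(w):
    """Gradient of the toy loss f(w) = 0.5 * (w - 3)^2."""
    return w - 3.0


def plain_gradient_descent(w, learning_rate, steps):
    for _ in range(steps):
        w -= learning_rate * gradient(w)
    return w


def momentum_descent(w, learning_rate, momentum, steps):
    accumulation = 0.0
    for _ in range(steps):
        # Decayed running sum of past gradients.
        accumulation = momentum * accumulation + gradient(w)
        w -= learning_rate * accumulation
    return w


w_plain = plain_gradient_descent(0.0, learning_rate=0.01, steps=100)
w_momentum = momentum_descent(0.0, learning_rate=0.01, momentum=0.9, steps=100)
print(w_plain, w_momentum)  # both move from 0.0 toward the minimum at 3.0
```

In this particular setup the momentum variant ends up closer to the minimum after the same number of steps, because the accumulated gradient acts like a larger effective step size. With a larger learning rate the same accumulation can overshoot, which is one reason to experiment rather than pick constants blindly.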

Source

2017-08-29 23:51:30 DomJack

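Thanks. To clarify, I used the overkill input_pipeline on this small dataset because I want to use larger datasets later, and I thought it would be easier to learn on a small dataset while still using the "proper" approach. – user98765

Commendable, but I would get the simplest thing working first and then elaborate :). Bonus points if you go and convert to 'tfrecords' rather than parsing every CSV record each time you run through the dataset. Whatever you use (csv, tfrecords), you should not do two session runs per training step (one to fetch the data, one to feed it to the main graph); you should connect the two to avoid transferring data around unnecessarily. – DomJack

To avoid two runs per training step, do I have to remove the placeholders and feed the tensors in directly? And regarding "and building a separate graph for testing": how do I get an additional graph that has my trained state? Do I have to use tf.train.Saver to save and restore it, or is there another way? – user98765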
