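tensorflow: output layer always shows [1.]

This is a discriminative network that I am training so that I can use it in a generative network. I train it on a dataset with 2 features and do binary classification: 1 = meditating, 0 = not meditating. (The dataset is from one of Siraj Raval's videos.)

For some reason, the output layer (ol) always outputs [1] for every test case.

My dataset: https://drive.google.com/open?id=0B5DaSp-aTU-KSmZtVmFoc0hRa3c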

import pandas as pd 
import tensorflow as tf 

data = pd.read_csv("E:/workspace_py/datasets/simdata/linear_data_train.csv") 
data_f = data.drop("lbl", axis = 1) 
data_l = data.drop(["f1", "f2"], axis = 1) 

learning_rate = 0.01 
batch_size = 1 
n_epochs = 30 
n_examples = 999 # This is highly unsatisfying >:3 
n_iteration = int(n_examples/batch_size) 


features = tf.placeholder('float', [None, 2], name='features_placeholder') 
labels = tf.placeholder('float', [None, 1], name = 'labels_placeholder') 

weights = { 
      'ol': tf.Variable(tf.random_normal([2, 1], stddev= -12), name = 'w_ol') 
} 

biases = { 
      'ol': tf.Variable(tf.random_normal([1], stddev=-12), name = 'b_ol') 
} 

ol = tf.nn.sigmoid(tf.add(tf.matmul(features, weights['ol']), biases['ol']), name = 'ol') 

loss = -tf.reduce_sum(labels*tf.log(ol), name = 'loss') # cross entropy 
train = tf.train.AdamOptimizer(learning_rate).minimize(loss) 

sess = tf.Session() 
sess.run(tf.global_variables_initializer()) 

for epoch in range(n_epochs): 
    ptr = 0 
    for iteration in range(n_iteration): 
        epoch_x = data_f[ptr: ptr + batch_size] 
        epoch_y = data_l[ptr: ptr + batch_size] 
        ptr = ptr + batch_size 

        _, err = sess.run([train, loss], feed_dict={features: epoch_x, labels:epoch_y}) 
    print("Loss @ epoch ", epoch, " = ", err) 

print("Testing...\n") 

data = pd.read_csv("E:/workspace_py/datasets/simdata/linear_data_eval.csv") 
test_data_l = data.drop(["f1", "f2"], axis = 1) 
test_data_f = data.drop("lbl", axis = 1) 
#vvvHERE  
print(sess.run(ol, feed_dict={features: test_data_f})) #<<<HERE 
#^^^HERE 
saver = tf.train.Saver() 
saver.save(sess, save_path="E:/workspace_py/saved_models/meditation_disciminative_model.ckpt") 
sess.close() 

Output:

2017-10-11 00:49:47.453721: W C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations. 
2017-10-11 00:49:47.454212: W C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations. 
2017-10-11 00:49:49.608862: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:955] Found device 0 with properties: 
name: GeForce GTX 960M 
major: 5 minor: 0 memoryClockRate (GHz) 1.176 
pciBusID 0000:01:00.0 
Total memory: 4.00GiB 
Free memory: 3.35GiB 
2017-10-11 00:49:49.609281: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:976] DMA: 0 
2017-10-11 00:49:49.609464: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:986] 0: Y 
2017-10-11 00:49:49.609659: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 960M, pci bus id: 0000:01:00.0) 
Loss @ epoch 0 = 0.000135789 
Loss @ epoch 1 = 4.16049e-05 
Loss @ epoch 2 = 1.84776e-05 
Loss @ epoch 3 = 9.41758e-06 
Loss @ epoch 4 = 5.24522e-06 
Loss @ epoch 5 = 2.98024e-06 
Loss @ epoch 6 = 1.66893e-06 
Loss @ epoch 7 = 1.07288e-06 
Loss @ epoch 8 = 5.96047e-07 
Loss @ epoch 9 = 3.57628e-07 
Loss @ epoch 10 = 2.38419e-07 
Loss @ epoch 11 = 1.19209e-07 
Loss @ epoch 12 = 1.19209e-07 
Loss @ epoch 13 = 1.19209e-07 
Loss @ epoch 14 = -0.0 
Loss @ epoch 15 = -0.0 
Loss @ epoch 16 = -0.0 
Loss @ epoch 17 = -0.0 
Loss @ epoch 18 = -0.0 
Loss @ epoch 19 = -0.0 
Loss @ epoch 20 = -0.0 
Loss @ epoch 21 = -0.0 
Loss @ epoch 22 = -0.0 
Loss @ epoch 23 = -0.0 
Loss @ epoch 24 = -0.0 
Loss @ epoch 25 = -0.0 
Loss @ epoch 26 = -0.0 
Loss @ epoch 27 = -0.0 
Loss @ epoch 28 = -0.0 
Loss @ epoch 29 = -0.0 
Testing... 

[[ 1.] 
[ 1.] 
... (200 rows in total, every one of them [ 1.]) 
[ 1.]] 
Saving model... 
[Finished in 57.9s] 

Answer

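Main problem

First of all, this is not a valid cross-entropy loss. The formula you are using only works with 2 or more outputs. With a single sigmoid output you have to use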

-tf.reduce_sum(labels*tf.log(ol) + (1-labels)*tf.log(1-ol), name = 'loss') 

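Otherwise the optimal solution is to always answer "1" (which is exactly what is happening now).

Why?

Note that the labels are only 0 or 1, and your entire loss is the product of the label and the log of the prediction. So when the true label is 0, your loss is 0 regardless of your prediction, because 0 * log(x) = 0 no matter what x is (as long as log(x) is defined). Consequently, your model is only penalized for failing to predict "1" when it should, and never for predicting "1" when it should not, so it learns to output 1 all the time.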
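The argument above can be checked with a few lines of plain Python. This is only a sketch: the labels and predictions are made-up illustrative numbers, not values from the dataset in the question, and the function names `one_sided_loss` and `binary_cross_entropy` are chosen here for illustration.

```python
import math

def one_sided_loss(labels, preds):
    # The loss from the question: only the label == 1 term is present.
    return -sum(y * math.log(p) for y, p in zip(labels, preds))

def binary_cross_entropy(labels, preds):
    # Full binary cross-entropy: label == 1 and label == 0 terms together.
    return -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for y, p in zip(labels, preds))

labels = [1, 0, 1, 0]            # made-up labels
honest = [0.9, 0.1, 0.8, 0.2]    # a classifier that separates the classes
always_one = [0.999] * 4         # a model that answers "1" for everything

# With the one-sided loss, "always 1" scores better than the honest model.
print(one_sided_loss(labels, honest), one_sided_loss(labels, always_one))

# With the full cross-entropy, "always 1" is heavily penalized on label 0.
print(binary_cross_entropy(labels, honest),
      binary_cross_entropy(labels, always_one))
```

The one-sided loss rewards the degenerate model because the two label-0 examples contribute nothing to it, whereas the full cross-entropy charges roughly -log(0.001) for each of them.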

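A few other odd things

  1. You are passing a negative stddev to the normal distribution, which you should not do (unless this is some undocumented feature of random_normal; according to the documentation it should accept a single positive float, and you should provide a small number there).

  2. Computing the cross entropy like this (in a naive way) is not numerically stable; have a look at tf.nn.sigmoid_cross_entropy_with_logits, which takes the pre-sigmoid logits rather than the sigmoid output.

  3. You are not shuffling your dataset, so you always process the data in the same order, which can have bad consequences (periodic increases in loss, harder convergence, or no convergence at all).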
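Points 2 and 3 can be sketched without any framework. The snippet below is an illustration under assumptions, not the TensorFlow implementation: the first function uses the rearranged formula max(x, 0) - x*z + log(1 + exp(-|x|)), which is, as far as I know, the one given in the documentation of tf.nn.sigmoid_cross_entropy_with_logits, and `shuffled_batches` is a hypothetical helper name for drawing mini-batches in a new random order each epoch.

```python
import math
import random

def sigmoid_cross_entropy_with_logits(label, logit):
    # Equivalent to -[z*log(sigmoid(x)) + (1 - z)*log(1 - sigmoid(x))],
    # rewritten so that exp() only ever receives a non-positive argument
    # and log() never receives 0.
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

def shuffled_batches(n_examples, batch_size, rng):
    # Yield lists of row indices covering every example exactly once,
    # in a fresh random order on each call (call it once per epoch).
    order = list(range(n_examples))
    rng.shuffle(order)
    for start in range(0, n_examples, batch_size):
        yield order[start:start + batch_size]

# A huge logit would make log(1 - sigmoid(x)) evaluate log(0) in the
# naive formula; the stable form simply returns the logit.
print(sigmoid_cross_entropy_with_logits(0.0, 1000.0))

rng = random.Random(0)  # fixed seed only to make the sketch repeatable
for epoch in range(2):
    print(list(shuffled_batches(6, 2, rng)))
```

In the question's code this would correspond to feeding the result of tf.add(tf.matmul(...), ...) to the loss before applying the sigmoid, and to indexing data_f and data_l with a freshly shuffled index list at the start of every epoch.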