我是新來TensorFlow和神經網絡在一般情況下,我想開發一個神經網絡,可以預測一個屬性的值(這是在Kaggle.com上開始的比賽),我知道使用神經網絡可能不是解決迴歸問題的最佳模型,但我決定嘗試一下。Tensorflow與1-hdden層預測神經網絡不會改變 - 迴歸
當使用單層神經網絡(沒有隱藏層,這可能是一個線性迴歸)時,模型實際上預測值接近實際值,但是當我添加一個隱藏層時,預測的所有值都是相同的批次的20個輸入張量:
('real', array([[ 181000.],
[ 128900.],
[ 161500.],
[ 180500.],
[ 181000.],
[ 183900.],
[ 122000.],
[ 378500.],
[ 381000.],
[ 144000.],
[ 260000.],
[ 185750.],
[ 137000.],
[ 177000.],
[ 139000.],
[ 137000.],
[ 162000.],
[ 197900.],
[ 237000.],
[ 68400.]]))
('prediction ', array([[ 4995.10597687],
[ 4995.10597687],
[ 4995.10597687],
[ 4995.10597687],
[ 4995.10597687],
[ 4995.10597687],
[ 4995.10597687],
[ 4995.10597687],
[ 4995.10597687],
[ 4995.10597687],
[ 4995.10597687],
[ 4995.10597687],
[ 4995.10597687],
[ 4995.10597687],
[ 4995.10597687],
[ 4995.10597687],
[ 4995.10597687],
[ 4995.10597687],
[ 4995.10597687],
[ 4995.10597687]]))
更新: 我注意到,預測值只反射輸出層的偏見,而兩者的隱含層和輸出層的重量沒有變化,並始終零點
爲了進一步檢查發生了什麼問題,我生成了模型的圖(一次使用隱藏層,另一次使用隱藏層)比較兩個圖,看看是否有某些東西丟失,不幸的是它們都是看起來是正確的我,但我還是不明白,爲什麼樣板工程時,有沒有隱藏的圖層並採用了隱藏層
我的全代碼如下:
# coding: utf-8
import tensorflow as tf
import numpy as np
def loadDataFromCSV(fileName , numberOfFields , numberOfOutputFields , numberOfRecords):
XsArray = np.ndarray([numberOfRecords ,(numberOfFields-numberOfOutputFields)] , dtype=np.float64)
YsArray = np.ndarray([numberOfRecords ,numberOfOutputFields] , dtype=np.float64)
fileQueue = tf.train.string_input_producer(fileName)
defaultValues = [[0]]*numberOfFields
decodedLine = [[None]]*numberOfFields
reader = tf.TextLineReader()
key , singleLine = reader.read(fileQueue)
decodedLine = tf.decode_csv(singleLine,record_defaults=defaultValues)
inputFeatures = decodedLine[0:numberOfFields-numberOfOutputFields]
outputFeatures =decodedLine[numberOfFields-numberOfOutputFields:numberOfFields]
with tf.Session() as session :
tf.global_variables_initializer().run()
coor = tf.train.Coordinator()
threads = tf.train.start_queue_runners(coord=coor)
for i in range(numberOfRecords) :
XsArray[i,:] ,YsArray[i,:] = session.run([inputFeatures , outputFeatures])
coor.request_stop()
coor.join(threads)
return XsArray , YsArray
x , y =loadDataFromCSV(['/Users/mousaalsulaimi/Downloads/convertcsv.csv'] , 289 , 1, 1460)
num_steps = 10000
batch_size = 20
graph = tf.Graph()
with graph.as_default() :
with tf.name_scope('input'):
inputProperties = tf.placeholder(tf.float32 , shape=(batch_size ,287))
with tf.name_scope('realPropertyValue') :
outputValues = tf.placeholder(tf.float32,shape=(batch_size,1))
with tf.name_scope('weights'):
hidden1_w = tf.Variable(tf.truncated_normal([287,1000],stddev=math.sqrt(3/(287+1000)) , dtype=tf.float32))
with tf.name_scope('baises'):
hidden1_b = tf.Variable(tf.zeros([1000] , dtype=tf.float32))
with tf.name_scope('hidden_layer'):
hidden1 =tf.matmul(inputProperties,hidden1_w) + hidden1_b
#hidden1_relu = tf.nn.relu(hidden1)
#hidden1_dropout = tf.nn.dropout(hidden1_relu,.5)
with tf.name_scope('layer2_weights'):
output_w = tf.Variable(tf.truncated_normal([1000,1],stddev=math.sqrt(3/(1000+1)) , dtype=tf.float32))
with tf.name_scope('layer2_baises'):
output_b = tf.Variable(tf.zeros([1] , dtype=tf.float32))
with tf.name_scope('layer_2_predictions'):
output =tf.matmul(hidden1,output_w) + output_b
with tf.name_scope('predictions'):
predictedValues = (output)
loss = tf.sqrt(tf.reduce_mean(tf.square(predictedValues-outputValues)))
loss_l2 = tf.nn.l2_loss(hidden1_w)
with tf.name_scope('minimization') :
minimum = tf.train.AdamOptimizer(.5).minimize(loss+.004*loss_l2)
with tf.Session(graph=graph) as session:
tf.global_variables_initializer().run()
print("Initialized")
for step in range(num_steps):
# Pick an offset within the training data, which has been randomized.
# Note: we could use better randomization across epochs.
offset = (step * batch_size) % (y.shape[0] - batch_size)
# Generate a minibatch.
batch_data = x[offset:(offset + batch_size), 1:]
batch_labels = y[offset:(offset + batch_size), :]
print("real" , batch_labels)
# Prepare a dictionary telling the session where to feed the minibatch.
# The key of the dictionary is the placeholder node of the graph to be fed,
# and the value is the numpy array to feed to it.
feed_dict = {inputProperties : batch_data, outputValues : batch_labels}
_, l, predictions , inp = session.run([minimum, loss, predictedValues ,inputProperties ], feed_dict=feed_dict)
print("prediction " , predictions)
print("loss : " , l)
print("----------")
print('+++++++++++')
也是我在的情況下上傳數據文件convertcsv.csv here要看一看。
我很感謝任何幫助弄清楚我做錯了什麼。
謝謝
我不認爲這些是導致性能不佳的原因,但我注意到了3件事:首先,您使用'hidden1'而不是'hidden_dropout'來定義'output',所以您基本上只是在做線性迴歸,因爲層之間沒有激活功能。其次,您可能想要將'output_w'的正則化添加到'loss_l2'。最後,32位通常綽綽有餘,因此明確使用64位浮點數可能沒有什麼區別。 – Styrke
您也可以嘗試權重的初始化。如果使用Xavier初始化,標準偏差應該是'sqrt(3。/(in + out))'。對於'output_w',''hidden1_w'和'sqrt(3。/(1000 + 1))'是'sqrt(3。/(287 + 1000))'。 – Styrke
謝謝Styrke,我刪除了relu激活函數和退出,因爲我認爲他們導致問題的地方,我剛剛返回他們,我也嘗試了Xavier initalization,正如您所建議的那樣,但沒有改變,輸出層仍然不能正確預測任何事物。 –