
TensorFlow: network output does not have the expected shape

I am implementing a network in TensorFlow. The network takes a binary feature vector as input, and it should predict a floating-point value as output. I expect a (1, 1) tensor object as the output of my function multilayer_perceptron(); instead, when pred is run, it returns a vector the same length as my input data, shaped (X, 1).

Since I am new to this framework, I expect the mistake to be fairly trivial. What am I doing wrong?

import tensorflow as tf 
import numpy as np  # used below for np.reshape 

print "**** Defining parameters..." 
# Parameters 
learning_rate = 0.001 
training_epochs = 15 
batch_size = 1 
display_step = 1 

print "**** Defining Network..." 
# Network Parameters 
n_hidden_1 = 10 # 1st layer num features 
n_hidden_2 = 10 # 2nd layer num features 
n_input = Xa.shape[1] # data input (feature vector length); Xa is the sparse feature matrix, defined elsewhere 
n_classes = 1 # total classes (IC50 value) 

# tf Graph input 
x = tf.placeholder("int32", [batch_size, None]) 
y = tf.placeholder("float", [None, n_classes]) 

# Create model 
def multilayer_perceptron(_X, _weights, _biases): 
    lookup_h1 = tf.nn.embedding_lookup(_weights['h1'], _X) 
    layer_1 = tf.nn.relu(tf.add(tf.reduce_sum(lookup_h1, 0), _biases['b1'])) #Hidden layer with RELU activation 
    layer_2 = tf.nn.relu(tf.add(tf.matmul(layer_1, _weights['h2']), _biases['b2'])) #Hidden layer with RELU activation 
    pred = tf.matmul(layer_2, _weights['out']) + _biases['out'] 

    return pred 

# Store layers weight & bias 
weights = { 
      'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])), 
      'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])), 
      'out': tf.Variable(tf.random_normal([n_hidden_2, n_classes])) 
      } 
biases = { 
     'b1': tf.Variable(tf.random_normal([n_hidden_1])), 
     'b2': tf.Variable(tf.random_normal([n_hidden_2])), 
     'out': tf.Variable(tf.random_normal([n_classes])) 
} 

# Construct model 
pred = multilayer_perceptron(x, weights, biases) 

# Define loss and optimizer 
cost = tf.reduce_mean(tf.square(tf.sub(pred, y))) # MSE 
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost) #Gradient descent 

# Evaluate model 
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1)) 
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32)) 

# Initializing the variables 
init = tf.initialize_all_variables() 

print "**** Launching the graph..." 
# Launch the graph 
with tf.Session() as sess: 
    sess.run(init) 

    print "**** Training..." 
    # Training cycle 
    for epoch in range(training_epochs): 
        avg_cost = 0. 
        total_batch = int(Xa.tocsc().shape[0]/batch_size) 
        # Loop over all batches 
        for i in range(total_batch): 
            # Extract sample 
            batch_xs = Xa.tocsc()[i,:].tocoo() 
            batch_ys = np.reshape(Ya.tocsc()[i,0], (batch_size,1)) 
            # Extract sparse indices from the input matrix (they are used as the actual input) 
            ids = batch_xs.nonzero()[1] 
            # Fit training using batch data 
            sess.run(optimizer, feed_dict={x: ids, y: batch_ys}) 
            # Compute average loss 
            avg_cost += sess.run(cost, feed_dict={x: ids, y: batch_ys})/total_batch 
        # Display logs per epoch step 
        if epoch % display_step == 0: 
            print "Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(avg_cost) 
    print "Optimization Finished!" 

Since you are learning, it is a good idea to start with the tutorials. Take a look at the second assignment of the Udacity course. A worked solution is available here: https://github.com/napsternxg/Udacity-Deep-Learning/blob/master/udacity/2_fullyconnected.ipynb If you can't find the problem, tell me and I'll help you. That said, finding the answer by studying similar code will do you more good than a single answer. – Elmira


Thank you for the suggestion, I'll take a look at it right away. –


I really can't find a solution to this problem. Could you help me understand where the problem is? –

Answer


pred should have shape [n_input, n_classes], because that is how you defined weights['out'] and biases['out']. The only way you would get a (1, 1) tensor out of pred is if your n_classes = 1 ...
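
To make the shape propagation concrete, here is a minimal standalone sketch (toy shapes chosen for illustration, not the asker's real dimensions): the row count of a tf.matmul result always matches the row count of its left operand, so pred inherits one row per row of layer_2.

import tensorflow as tf 

# Toy illustration: the output of tf.matmul has as many rows as its left operand. 
layer_2 = tf.placeholder("float", [None, 10])   # one row per looked-up input index 
w_out = tf.Variable(tf.random_normal([10, 1]))  # [n_hidden_2, n_classes] with n_classes = 1 
b_out = tf.Variable(tf.random_normal([1])) 
pred = tf.matmul(layer_2, w_out) + b_out        # static shape (?, 1) 
print pred.get_shape()                          # -> (?, 1), i.e. (X, 1) rather than (1, 1) 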