Multi-GPU doesn't seem to work in TensorFlow 1.0

I'm using TensorFlow 1.0 and I've written a simple program to measure performance. I have a dummy model, as follows:

import numpy as np
import tensorflow as tf

def model(example_batch):
    h1 = tf.layers.dense(inputs=example_batch, units=64, activation=tf.nn.relu)
    h2 = tf.layers.dense(inputs=h1, units=2)
    return h2

and a simple function that runs the simulation:

def testPerformanceFromMemory(model, iter=1000, num_cores=2):
    example_batch = tf.placeholder(np.float32, shape=(64, 128))
    for core in range(num_cores):
        with tf.device('/gpu:%d' % core):
            prediction = model(example_batch)
    init_op = tf.global_variables_initializer()
    sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True))
    sess.run(init_op)
    tf.train.start_queue_runners(sess=sess)
    input_array = np.random.random((64, 128))
    for step in range(iter):
        myprediction = sess.run(prediction, feed_dict={example_batch: input_array})

If I run the Python script and then run the nvidia-smi command, I can see that GPU0 shows high utilization, but GPU1's utilization is 0%.
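To get per-GPU numbers instead of reading the nvidia-smi table by eye, a small helper along these lines might be used. This is only a sketch: the function names `parse_gpu_utilization` and `query_gpu_utilization` and the sample string are hypothetical, and it assumes every GPU reports a plain integer utilization value.

```python
import subprocess

def parse_gpu_utilization(csv_text):
    """Parse 'index, utilization' CSV lines into {gpu_index: percent}."""
    usage = {}
    for line in csv_text.strip().splitlines():
        index, percent = [field.strip() for field in line.split(",")]
        usage[int(index)] = int(percent)
    return usage

def query_gpu_utilization():
    """Ask nvidia-smi for per-GPU utilization (requires an NVIDIA driver)."""
    output = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=index,utilization.gpu",
        "--format=csv,noheader,nounits",
    ])
    return parse_gpu_utilization(output.decode("utf-8"))

# Hypothetical sample resembling the situation described above:
# GPU0 busy, GPU1 idle. It is not real measured output.
sample = "0, 97\n1, 0\n"
print(parse_gpu_utilization(sample))
```

Calling `query_gpu_utilization()` while the script is running would return one entry per GPU, so an idle second device shows up as a 0 in the dictionary.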

I have read this: https://www.tensorflow.org/tutorials/using_gpu and this: https://github.com/tensorflow/models/blob/master/tutorials/image/cifar10/cifar10_multi_gpu_train.py but I don't understand why my example doesn't run on multiple GPUs.

P.S. If I take the CIFAR-10 example from the TensorFlow repository, it does run in multi-GPU mode.

EDIT: As mrry pointed out, I was overwriting prediction, so I'm posting the correct approach here:

def testPerformanceFromMemory(model, iter=1000, num_cores=2):
    example_batch = tf.placeholder(np.float32, shape=(64, 128))
    prediction = []
    for core in range(num_cores):
        with tf.device('/gpu:%d' % core):
            # Keep every tower's output instead of overwriting it
            prediction.append([model(example_batch)])
    init_op = tf.global_variables_initializer()
    sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True))
    sess.run(init_op)
    tf.train.start_queue_runners(sess=sess)
    input_array = np.random.random((64, 128))
    for step in range(iter):
        myprediction = sess.run(prediction, feed_dict={example_batch: input_array})


2017-03-09 RdlP

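Looking at your program, you are creating multiple parallel subgraphs (often called "towers") on the different GPU devices, but you overwrite the prediction tensor in each iteration of the first for loop: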

for core in range(num_cores):
    with tf.device('/gpu:%d' % core):
        prediction = model(example_batch)
# ...
for step in range(iter):
    myprediction = sess.run(prediction, feed_dict={example_batch: input_array})

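As a result, when you call sess.run(prediction, ...) you only run the subgraph that was created in the final iteration of the first for loop, and that subgraph runs on only one GPU.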
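The rebinding itself is ordinary Python behaviour and can be seen without TensorFlow. Below is a minimal plain-Python sketch; `build_tower` is a hypothetical stand-in for calling model(example_batch) inside a tf.device block, and the strings merely stand in for tensors.

```python
# Plain-Python stand-in for the graph-building loop; no TensorFlow needed.

def build_tower(device):
    """Return a label standing in for the tensor built on `device`."""
    return "prediction on %s" % device

devices = ["/gpu:0", "/gpu:1"]

# Pattern from the question: the name is rebound on every iteration,
# so only the object from the last iteration is still referenced.
for device in devices:
    prediction = build_tower(device)
overwritten = prediction

# Pattern that keeps every tower: collect one entry per device.
towers = []
for device in devices:
    towers.append(build_tower(device))

print(overwritten)
print(towers)
```

With the first pattern only one object is left to pass to sess.run; with the second, the list holds one entry per device, so fetching the list asks for every tower.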


2017-03-09 17:20:27 mrry
