Tensorflow allocating GPU memory when using tf.device('/cpu:0')

System info: TensorFlow 1.1.0, GPU, Windows, Python 3.5; code is run in IPython consoles.

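I am trying to run two different Tensorflow sessions: one on the GPU (which does some batch work) and one on the CPU, which I use for quick tests while the other one is working.

The problem is that when I spawn the second session with tf.device('/cpu:0'), the session tries to allocate GPU memory and crashes my other session.

My code: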

import os 
os.environ["CUDA_VISIBLE_DEVICES"] = "" 
import time 

import tensorflow as tf 

with tf.device('/cpu:0'):
    with tf.Session() as sess:
        # Here 6 GBs of GPU RAM are allocated.
        time.sleep(5)

How do I force Tensorflow to ignore the GPU?

UPDATE:

As suggested in a comment by @Nicolas, I took a look at this answer and ran

import os 
os.environ["CUDA_VISIBLE_DEVICES"] = "" 
import tensorflow as tf 

from tensorflow.python.client import device_lib 
print(device_lib.list_local_devices())

It prints:

[name: "/cpu:0" 
device_type: "CPU" 
memory_limit: 268435456 
locality { 
} 
incarnation: 2215045474989189346 
, name: "/gpu:0" 
device_type: "GPU" 
memory_limit: 6787871540 
locality { 
    bus_id: 1 
} 
incarnation: 13663872143510826785 
physical_device_desc: "device: 0, name: GeForce GTX 1080, pci bus id: 0000:02:00.0" 
]

It seems to me that even though I explicitly tell the script to ignore any CUDA devices, it still finds and uses them. Could this be a bug in TF 1.1?

2017-06-12 GPhilo

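It turns out that setting CUDA_VISIBLE_DEVICES to the empty string does not hide the CUDA devices from the script.

From the documentation of CUDA_VISIBLE_DEVICES (emphasis added by me):

Only the devices whose index is present in the sequence are visible to CUDA applications and they are enumerated in the order of the sequence. If one of the indices is invalid, only the devices whose index precedes the invalid index are visible to CUDA applications. For example, setting CUDA_VISIBLE_DEVICES to 2,1 causes device 0 to be invisible and device 2 to be enumerated before device 1. Setting CUDA_VISIBLE_DEVICES to 0,2,-1,1 causes devices 0 and 2 to be visible and device 1 to be invisible.

It seems the empty string used to be handled as "no valid devices exist" but has since changed meaning, as it is not mentioned in the documentation.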
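
To make the quoted rule concrete, here is a minimal sketch that models it in plain Python. The name visible_devices and the device_count parameter are made up for illustration. The function does not call CUDA, it ignores UUID-style entries, and it deliberately returns None for the empty string, because the quoted passage does not say what happens in that case.

```python
def visible_devices(value, device_count):
    """Model the quoted rule for a comma-separated CUDA_VISIBLE_DEVICES value.

    Illustrative only: this mimics the documented behaviour and is not
    how the CUDA runtime is actually implemented.
    """
    if value.strip() == "":
        # The quoted documentation does not cover the empty string.
        return None
    visible = []
    for token in value.split(","):
        try:
            index = int(token.strip())
        except ValueError:
            break  # not an integer index: treat as invalid, stop here
        if index < 0 or index >= device_count:
            break  # invalid index: only the devices listed before it count
        visible.append(index)
    return visible


print(visible_devices("2,1", 3))       # [2, 1]
print(visible_devices("0,2,-1,1", 3))  # [0, 2]
print(visible_devices("-1", 1))        # []
```

Under this model, "-1" is an invalid index in first position, so nothing listed before it exists and no device is visible.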

Changing the code to os.environ["CUDA_VISIBLE_DEVICES"] = "-1" fixes the problem. Running

import os 
os.environ["CUDA_VISIBLE_DEVICES"]="-1"  
import tensorflow as tf 

from tensorflow.python.client import device_lib 
print(device_lib.list_local_devices())

now prints

[name: "/cpu:0" 
device_type: "CPU" 
memory_limit: 268435456 
locality { 
} 
incarnation: 14097726166554667970 
]

and instantiating a tf.Session no longer hogs GPU memory.
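
Because the variable is read when CUDA is initialized, it has to be set before TensorFlow touches the GPU. A small guard along these lines can catch the mistake of setting it too late. hide_gpus is a hypothetical helper, not part of TensorFlow, and checking sys.modules is a conservative assumption: importing tensorflow does not necessarily initialize CUDA, but setting the variable before the import is the safe order.

```python
import os
import sys


def hide_gpus():
    """Hide all CUDA devices from this process (hypothetical helper).

    Must be called before tensorflow is imported, so the value is in
    place by the time CUDA enumerates devices.
    """
    if "tensorflow" in sys.modules:
        raise RuntimeError(
            "tensorflow is already imported; set CUDA_VISIBLE_DEVICES earlier"
        )
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"


hide_gpus()
# import tensorflow as tf  # import only after the call above
```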

2017-06-13 10:00:13 GPhilo

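I took the liberty of aggregating (and citing) the various answers, including yours, in a documentation example; see stackoverflow.com/documentation/tensorflow/10621. I hope you don't mind. Feel free to edit it. – npf

Would you mind trying one of these config options?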

import tensorflow as tf

config = tf.ConfigProto()
# Allocate GPU memory on demand instead of reserving it all up front.
config.gpu_options.allow_growth = True
# or config.gpu_options.per_process_gpu_memory_fraction = 0.0
with tf.Session(config=config) as sess:
    ...

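According to the documentation, this should help you manage the GPU memory used by this particular session, so your second session should be able to run on the GPU.

EDIT: According to this answer, you should also try this: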

import os 
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID" # see issue #152 
os.environ["CUDA_VISIBLE_DEVICES"]="-1"

2017-06-13 05:52:36 npf

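The 'config.gpu_options.per_process_gpu_memory_fraction = 0.0' option does not work for me: it still tries to allocate memory on the GPU and kills the other session. (Interestingly, it is the other session that dies, not the second one.) – GPhilo

The 'config.gpu_options.allow_growth = True' option seems to work, although it is a mess IMO. 'allow_growth' only turns off the pre-allocation of GPU memory, but why would memory be pre-allocated at all when I have disabled CUDA devices for the script? – GPhilo

When you say session, do you mean 'tf.Session'? I ask because I create 2 'tf.Session' in a single python process: one for the GPU part and the other for the CPU part. – npf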
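
On the process question: CUDA_VISIBLE_DEVICES is an environment variable, so it applies to a whole process, not to an individual tf.Session. If the CPU-only work can live in its own process, one way to keep it away from the GPU is to launch it with the variable already set. The sketch below uses only the standard library; run_without_gpu is a hypothetical helper, and the child command is a stand-in that just echoes the variable.

```python
import os
import subprocess
import sys


def run_without_gpu(args):
    """Run a command in a child process that sees no CUDA devices."""
    env = dict(os.environ)            # copy, so the parent stays unchanged
    env["CUDA_VISIBLE_DEVICES"] = "-1"
    return subprocess.run(
        args, env=env, stdout=subprocess.PIPE, universal_newlines=True
    )


# Stand-in child: prints the value it was started with.
result = run_without_gpu(
    [sys.executable, "-c", "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"]
)
print(result.stdout.strip())  # -1
```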
