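I am trying to run the script below on a server whose GPU VRAM is more than 96% in use. Why does the TensorFlow session fail to start with an out-of-memory error on the GPU, even though I specify `device_count = {'CPU': 1, 'GPU': 0}`?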
import tensorflow as tf
a = tf.constant(1, name = 'a')
b = tf.constant(3, name = 'b')
c = tf.constant(9, name = 'c')
d = tf.add(a, b, name='d')
e = tf.add(d, c, name='e')
session_conf = tf.ConfigProto(
    device_count={'CPU': 1, 'GPU': 0},
    allow_soft_placement=True
)
sess = tf.Session(config=session_conf)
print(sess.run([d, e]))
It fails with a CUDA_ERROR_OUT_OF_MEMORY error that stops the program:
[email protected]:/scratch/test$ python3.5 shape.py
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
E tensorflow/core/common_runtime/direct_session.cc:137] Internal: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_OUT_OF_MEMORY; total memory reported: 18446744073709551615
Traceback (most recent call last):
  File "shape.py", line 20, in <module>
    sess = tf.Session(config=session_conf)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1187, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 552, in __init__
    self._session = tf_session.TF_NewDeprecatedSession(opts, status)
  File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.
Why does the level of VRAM usage interfere with my program, given that I specify `device_count={'CPU': 1, 'GPU': 0}, allow_soft_placement=True` when creating the TensorFlow session?
Make the GPUs invisible at the process level. Note that on Windows `export CUDA_VISIBLE_DEVICES=` will not work (as I found out the hard way [here](https://stackoverflow.com/questions/44500733/tensorflow-allocating-gpu-memory-when-using-tf-device-cpu0/44513295?noredirect=1#comment76027592_44513295)). To mask all GPUs effectively, you have to set `CUDA_VISIBLE_DEVICES=-1` (or any other invalid device number). – GPhilo
It works for me, and it is very widely used. In your case, there must be something unusual about your CUDA driver. –
Which CUDA SDK are you using? I am on version 8, and their documentation does not specify the behavior for an empty string. – GPhilo
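For reference, the masking approach suggested in the comments can also be applied from inside the script rather than from the shell. This is only a minimal sketch: `hide_all_gpus` is a hypothetical helper name, and the value `-1` follows GPhilo's comment rather than anything verified on the asker's machine. The assumption is that the variable has to be set before TensorFlow is imported, so that the CUDA runtime never sees the busy device.

```python
import os


def hide_all_gpus(env=os.environ):
    """Hypothetical helper: mask every CUDA device for this process.

    Uses -1 (an invalid device number) instead of an empty string,
    because the comments report that the empty string is not honored
    on every setup.
    """
    env["CUDA_VISIBLE_DEVICES"] = "-1"
    return env["CUDA_VISIBLE_DEVICES"]


# Assumption: this must run before `import tensorflow`, since the
# variable is read when the CUDA libraries are initialized.
hide_all_gpus()

# Only now import TensorFlow and build the session as in the question:
# import tensorflow as tf
# session_conf = tf.ConfigProto(device_count={'CPU': 1, 'GPU': 0},
#                               allow_soft_placement=True)
# sess = tf.Session(config=session_conf)
```

The shell equivalent on Linux would be to prefix the command, as in `CUDA_VISIBLE_DEVICES=-1 python3.5 shape.py`, which keeps the script itself unchanged.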