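[SOLVED] VirtualEnv tensorflow with Nvidia GPU: cuda-9.0 vs cuda-8.0, cuDNN-7.0 vs cuDNN-6.0
Today I installed tensorflow following the RTFM at tensorflow dot org install linux. I installed VirtualEnv + Python3 + CPU and tested the tensorflow Hello World; it works fine.
Then I went on to install VirtualEnv + Python + GPU on the nvidia path (GPU: GTX 970). RTFM (docs dot nvidia dot com cuda cuda-installation-guide-linux index dot html): cuda-9.0, cuDNN 7, all PATHs OK, .bashrc up to date, printenv LD_LIBRARY_PATH OK.
My GPU passes the cuda sample programs deviceQuery and bandwidthTest. All post-installation actions from the Nvidia checklist passed.
When I run Hello World in VirtualEnv + Python3 + GPU, I get the output below. (Cliffnote: tensorflow wants to load some CUDA 8.0 libraries from /usr/local/cuda-9.0/lib64, which is the 9.0 directory. I tried adding a symlink so that the 8.0 library name points to the 9.0 file, but then I got the same message for another CUDA library... repeating this trick for every CUDA library is not what I would call a fix ;-))
Thanks for any relevant insight you may have.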
[email protected]:~/Documents/Ordinateur/VirtualEnv$ source tensorflow_py3_gpu/bin/activate
(tensorflow_py3_gpu) [email protected]:~/Documents/Ordinateur/VirtualEnv$ python
Python 3.5.2 (default, Sep 14 2017, 22:51:06) [GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> # Python
... import tensorflow as tf
Traceback (most recent call last):
File "/home/alexandre/Documents/Ordinateur/VirtualEnv/tensorflow_py3_gpu/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow.py", line 41, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "/home/alexandre/Documents/Ordinateur/VirtualEnv/tensorflow_py3_gpu/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "/home/alexandre/Documents/Ordinateur/VirtualEnv/tensorflow_py3_gpu/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "/home/alexandre/Documents/Ordinateur/VirtualEnv/tensorflow_py3_gpu/lib/python3.5/imp.py", line 242, in load_module
return load_dynamic(name, filename, file)
File "/home/alexandre/Documents/Ordinateur/VirtualEnv/tensorflow_py3_gpu/lib/python3.5/imp.py", line 342, in load_dynamic
return _load(spec)
ImportError: libcublas.so.8.0: cannot open shared object file: No such file or directory
The last line above refers to a CUDA 8.0 library, which is obviously not among the CUDA 9.0 libraries. Here is the rest of the output.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
File "/home/alexandre/Documents/Ordinateur/VirtualEnv/tensorflow_py3_gpu/lib/python3.5/site-packages/tensorflow/__init__.py", line 24, in <module>
from tensorflow.python import *
File "/home/alexandre/Documents/Ordinateur/VirtualEnv/tensorflow_py3_gpu/lib/python3.5/site-packages/tensorflow/python/__init__.py", line 49, in <module>
from tensorflow.python import pywrap_tensorflow
File "/home/alexandre/Documents/Ordinateur/VirtualEnv/tensorflow_py3_gpu/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow.py", line 52, in <module>
raise ImportError(msg)
ImportError: Traceback (most recent call last):
File "/home/alexandre/Documents/Ordinateur/VirtualEnv/tensorflow_py3_gpu/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow.py", line 41, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "/home/alexandre/Documents/Ordinateur/VirtualEnv/tensorflow_py3_gpu/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "/home/alexandre/Documents/Ordinateur/VirtualEnv/tensorflow_py3_gpu/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "/home/alexandre/Documents/Ordinateur/VirtualEnv/tensorflow_py3_gpu/lib/python3.5/imp.py", line 242, in load_module
return load_dynamic(name, filename, file)
File "/home/alexandre/Documents/Ordinateur/VirtualEnv/tensorflow_py3_gpu/lib/python3.5/imp.py", line 342, in load_dynamic
return _load(spec)
ImportError: libcublas.so.8.0: cannot open shared object file: No such file or directory
Failed to load the native TensorFlow runtime.
See https tensorflow dot org slash install slash install_sources hashtag common_installation_problems for some common reasons and solutions. Include the entire stack trace above this error message when asking for help.
>>> hello = tf.constant('Hello, TensorFlow!')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'tf' is not defined
>>> sess = tf.Session()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'tf' is not defined
>>> print(sess.run(hello))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'sess' is not defined
>>> quit()
(tensorflow_py3_gpu) [email protected]:~/Documents/Ordinateur/VirtualEnv$ deactivate
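To see which versioned CUDA libraries the dynamic loader can actually find, independently of tensorflow, one option is to try opening them directly. This is only a minimal sketch using Python's standard ctypes module; the helper names can_load and probe are made up for illustration, and the library list is just a subset of the ones named in this post.

```python
import ctypes


def can_load(soname):
    """Return True if the dynamic loader can open the given shared library."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        # Raised when the file is not found on the loader's search path
        # (LD_LIBRARY_PATH, ldconfig cache, ...), or cannot be loaded.
        return False


def probe(names, versions):
    """Map each versioned soname (e.g. libcublas.so.8.0) to whether it loads."""
    result = {}
    for name in names:
        for version in versions:
            soname = "{}.so.{}".format(name, version)
            result[soname] = can_load(soname)
    return result


if __name__ == "__main__":
    # Hypothetical selection: two of the libraries and both versions from this post.
    report = probe(["libcublas", "libcudart"], ["8.0", "9.0"])
    for soname, ok in sorted(report.items()):
        print(soname, "found" if ok else "MISSING")
```

If the 9.0 names load and the 8.0 names do not, the CUDA install itself is fine and the import is simply asking for a different version.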
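- Update, the next day
A not-so-clean workaround: in /usr/local/cuda/lib64, for each library that is requested with the wrong version number, create a symlink with that name pointing to the file with the correct version number.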
[email protected]:/usr/local/cuda/lib64$ sudo ln -s libcurand.so.9.0 libcurand.so.8.0
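I did this for five CUDA libraries (libcusolver, libcublas, libcudart, libcurand, libcufft) and for the cuDNN library libcudnn (version 6 -> version 7).
Hello World now works in tensorflow... but if anyone can tell me why tensorflow calls the cuda-8.0 and cuDNN-6.0 libraries when I only installed cuda-9.0 and cuDNN-7.0, that would be very welcome.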
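The symlink workaround above could be scripted roughly as follows. This is a sketch under assumptions, not a recommended fix: the function names are hypothetical, the cuDNN file names libcudnn.so.6 and libcudnn.so.7 are assumed from the "version 6 -> version 7" remark, and a link across major versions does not guarantee that the library behind it is binary compatible. The demo runs against empty stand-in files in a temporary directory rather than the real /usr/local/cuda/lib64.

```python
import os
import tempfile

# The five CUDA libraries named in the post.
CUDA_LIBS = ["libcusolver", "libcublas", "libcudart", "libcurand", "libcufft"]


def build_mapping():
    """Return {requested old name: installed new name} for every library."""
    mapping = {name + ".so.8.0": name + ".so.9.0" for name in CUDA_LIBS}
    mapping["libcudnn.so.6"] = "libcudnn.so.7"  # assumed cuDNN file names
    return mapping


def link_old_sonames(lib_dir, mapping):
    """Create old-version symlinks next to the installed new-version files.

    Skips entries whose target is missing or whose link name already exists.
    Returns the sorted list of link names that were created.
    """
    created = []
    for old_name, new_name in sorted(mapping.items()):
        target = os.path.join(lib_dir, new_name)
        link = os.path.join(lib_dir, old_name)
        if os.path.exists(target) and not os.path.lexists(link):
            # Relative link, equivalent to `ln -s new_name old_name` run in lib_dir.
            os.symlink(new_name, link)
            created.append(old_name)
    return created


if __name__ == "__main__":
    # Dry run: empty stand-in files in a throwaway directory.
    demo_dir = tempfile.mkdtemp()
    for new_name in build_mapping().values():
        open(os.path.join(demo_dir, new_name), "w").close()
    print(link_old_sonames(demo_dir, build_mapping()))
```

Running it against the real library directory would need root privileges, just like the sudo ln -s command above.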
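[SOLVED... or nearly] Update: I found https://github.com/tensorflow/tensorflow/issues/12052, which explains almost all of this.
Cliffnote: tensorflow-1.3 uses cuda-8.0 and cuDNN-6.0 (which is why those libraries are explicitly requested when I run tensorflow). I was misled by the NVIDIA website, which had me download CUDA 9.0 and cuDNN 7.0, versions that tensorflow-1.3 does not support.
tensorflow-1.4 is expected to work with cuda-9.0 and cuDNN-7.0. tensorflow-1.4 may be released sometime in October 2017 (or soon; check the link above).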
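The version mismatch described above is already visible in the ImportError text, so it can be read off without guessing. Here is a small sketch that extracts the requested library name and version from such a message; the function name missing_library is hypothetical, and the regular expression only covers the "cannot open shared object file" wording shown in the traceback.

```python
import re

# Matches e.g. "libcublas.so.8.0: cannot open shared object file"
_PATTERN = re.compile(r"(lib\w+)\.so\.([\d.]+): cannot open shared object file")


def missing_library(message):
    """Return (library name, requested version) from a loader error, or None."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    error = ("ImportError: libcublas.so.8.0: cannot open shared object file: "
             "No such file or directory")
    print(missing_library(error))
```

The version it reports is the one the installed tensorflow binary was built against, which is the number to compare with the CUDA release that is actually installed.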
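I noticed something else. https://www.tensorflow.org/install/install_linux mentions that cuda-8.0 and cuDNN-6.0 are required to run tensorflow; but when you follow the procedure on the nvidia website, you automatically get cuda-9.0 and cuDNN-7.0. My next options are:
- Should I uninstall and switch back to cuda-8.0 and cuDNN-6.0?
- Should I ignore this (and rest my case) and reinstall tensorflow from source, so that it is compiled for certain features of my CPU (SSE4.1, SSE4.2, AVX, AVX2, FMA)?
– Taamer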