2016-09-17 59 views
2

(在下面的問題中很長的錯誤信息TL; DR,這裏是具體問題:爲什麼測試代碼不能在TX1的GPU上執行,我需要什麼要做到這一點?Theano:新Jetson TX1上GPU的使用情況

我剛剛閃過,並安裝了一個全新的與JetPack 2.3的Nvidia Jetson TX1。我試圖在TX1上安裝Theano,以便能夠使用板載GPU進行進一步的機器學習和神經網絡應用。

但是,我似乎無法讓GPU本身工作。

安裝Theano的從here採取:

sudo apt-get install python-numpy python-scipy python-dev python-pip python-nose g++ libblas-dev git 
pip install --upgrade --no-deps git+git://github.com/Theano/Theano.git --user # Need Theano 0.8(not yet released) or more recent 

安裝Theano版本是0.9.0.dev2,蟒蛇是2.7.12版本。

我從here使用的測試腳本:

from theano import function, config, shared, tensor 
import numpy 
import time 

vlen = 10 * 30 * 768 # 10 x #cores x # threads per core 
iters = 1000 

rng = numpy.random.RandomState(22) 
x = shared(numpy.asarray(rng.rand(vlen), config.floatX)) 
f = function([], tensor.exp(x)) 
print(f.maker.fgraph.toposort()) 
t0 = time.time() 
for i in range(iters): 
    r = f() 
t1 = time.time() 
print("Looping %d times took %f seconds" % (iters, t1 - t0)) 
print("Result is %s" % (r,)) 
if numpy.any([isinstance(x.op, tensor.Elemwise) and 
       ('Gpu' not in type(x.op).__name__) 
       for x in f.maker.fgraph.toposort()]): 
    print('Used the cpu') 
else: 
    print('Used the gpu') 

當運行建議:

THEANO_FLAGS=device=cuda0 python gpu_tutorial1.py 

我得到如下回應,充滿了錯誤,警告,以及執行的CPU上,而比GPU:

ERROR (theano.gpuarray): pygpu was configured but could not be imported 
Traceback (most recent call last): 
    File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 21, in <module> 
    import pygpu 
ImportError: No module named pygpu 
WARNING (theano.gof.cmodule): OPTIMIZATION WARNING: Theano was not able to find the default g++ parameters. This is needed to tune the compilation to your specific CPU. This can slow down the execution of Theano functions. Please submit the following lines to Theano's mailing list so that we can fix this problem: 
['# 1 "<stdin>"\n', '# 1 "<built-in>"\n', '# 1 "<command-line>"\n', '# 1 "/usr/include/stdc-predef.h" 1 3 4\n', '# 1 "<command-line>" 2\n', '# 1 "<stdin>"\n', 'Using built-in specs.\n', 'COLLECT_GCC=/usr/bin/g++\n', 'Target: aarch64-linux-gnu\n', "Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.2' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libquadmath --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-arm64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-arm64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-arm64 --with-arch-directory=aarch64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-multiarch --enable-fix-cortex-a53-843419 --disable-werror --enable-checking=release --build=aarch64-linux-gnu --host=aarch64-linux-gnu --target=aarch64-linux-gnu\n", 'Thread model: posix\n', 'gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.2) \n', "COLLECT_GCC_OPTIONS='-E' '-v' '-shared-libgcc' '-mlittle-endian' '-mabi=lp64'\n", ' /usr/lib/gcc/aarch64-linux-gnu/5/cc1 -E -quiet -v -imultiarch aarch64-linux-gnu - -mlittle-endian -mabi=lp64 -fstack-protector-strong -Wformat -Wformat-security\n', 'ignoring nonexistent directory "/usr/local/include/aarch64-linux-gnu"\n', 'ignoring nonexistent directory "/usr/lib/gcc/aarch64-linux-gnu/5/../../../../aarch64-linux-gnu/include"\n', '#include "..." search starts here:\n', '#include <...> search starts here:\n', ' /usr/lib/gcc/aarch64-linux-gnu/5/include\n', ' /usr/local/include\n', ' /usr/lib/gcc/aarch64-linux-gnu/5/include-fixed\n', ' /usr/include/aarch64-linux-gnu\n', ' /usr/include\n', 'End of search list.\n', 'COMPILER_PATH=/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/:/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/\n', 'LIBRARY_PATH=/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/5/../../../aarch64-linux-gnu/:/usr/lib/gcc/aarch64-linux-gnu/5/../../../../lib/:/lib/aarch64-linux-gnu/:/lib/../lib/:/usr/lib/aarch64-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/aarch64-linux-gnu/5/../../../:/lib/:/usr/lib/\n', "COLLECT_GCC_OPTIONS='-E' '-v' '-shared-libgcc' '-mlittle-endian' '-mabi=lp64'\n"] 
[Elemwise{exp,no_inplace}(<TensorType(float64, vector)>)] 
Looping 1000 times took 12.736936 seconds 
Result is [ 1.23178032 1.61879341 1.52278065 ..., 2.20771815 2.29967753 
    1.62323285] 
Used the cpu 

當我將設備標誌更改爲'gpu':

THEANO_FLAGS=device=gpu python gpu_tutorial1.py 

事情有所改善,在NVIDIA的Tegra的X1至少發現,雖然最終沒有被採用:

Using gpu device 0: NVIDIA Tegra X1 (CNMeM is disabled, cuDNN 5105) 
WARNING (theano.gof.cmodule): OPTIMIZATION WARNING: Theano was not able to find the default g++ parameters. This is needed to tune the compilation to your specific CPU. This can slow down the execution of Theano functions. Please submit the following lines to Theano's mailing list so that we can fix this problem: 
['# 1 "<stdin>"\n', '# 1 "<built-in>"\n', '# 1 "<command-line>"\n', '# 1 "/usr/include/stdc-predef.h" 1 3 4\n', '# 1 "<command-line>" 2\n', '# 1 "<stdin>"\n', 'Using built-in specs.\n', 'COLLECT_GCC=/usr/bin/g++\n', 'Target: aarch64-linux-gnu\n', "Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.2' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libquadmath --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-arm64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-arm64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-arm64 --with-arch-directory=aarch64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-multiarch --enable-fix-cortex-a53-843419 --disable-werror --enable-checking=release --build=aarch64-linux-gnu --host=aarch64-linux-gnu --target=aarch64-linux-gnu\n", 'Thread model: posix\n', 'gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.2) \n', "COLLECT_GCC_OPTIONS='-E' '-v' '-shared-libgcc' '-mlittle-endian' '-mabi=lp64'\n", ' /usr/lib/gcc/aarch64-linux-gnu/5/cc1 -E -quiet -v -imultiarch aarch64-linux-gnu - -mlittle-endian -mabi=lp64 -fstack-protector-strong -Wformat -Wformat-security\n', 'ignoring nonexistent directory "/usr/local/include/aarch64-linux-gnu"\n', 'ignoring nonexistent directory "/usr/lib/gcc/aarch64-linux-gnu/5/../../../../aarch64-linux-gnu/include"\n', '#include "..." search starts here:\n', '#include <...> search starts here:\n', ' /usr/lib/gcc/aarch64-linux-gnu/5/include\n', ' /usr/local/include\n', ' /usr/lib/gcc/aarch64-linux-gnu/5/include-fixed\n', ' /usr/include/aarch64-linux-gnu\n', ' /usr/include\n', 'End of search list.\n', 'COMPILER_PATH=/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/:/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/\n', 'LIBRARY_PATH=/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/5/../../../aarch64-linux-gnu/:/usr/lib/gcc/aarch64-linux-gnu/5/../../../../lib/:/lib/aarch64-linux-gnu/:/lib/../lib/:/usr/lib/aarch64-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/aarch64-linux-gnu/5/../../../:/lib/:/usr/lib/\n', "COLLECT_GCC_OPTIONS='-E' '-v' '-shared-libgcc' '-mlittle-endian' '-mabi=lp64'\n"] 
[Elemwise{exp,no_inplace}(<TensorType(float64, vector)>)] 
Looping 1000 times took 12.820628 seconds 
Result is [ 1.23178032 1.61879341 1.52278065 ..., 2.20771815 2.29967753 
    1.62323285] 
Used the cpu 

我確實打算髮送警示線到Theano郵件列表,但是這個警告似乎與我目前的主要問題無關:爲什麼這個測試代碼不能在TX1的GPU上執行,我該如何做才能做到這一點?

回答

2

事實證明,該網站上推薦的CLI調用不正確。正確的調用是:

THEANO_FLAGS='device=gpu,floatX=float32' python gpu_tutorial1.py 

這足以與一個滿意的加速(明顯,在輸出中報告),並擺脫採空區,smackingly長的錯誤警告的GPU執行。

將這兩個標誌放在.theanorc文件中也足夠,並簡化了調用。

+0

對於未來的用戶,'device = gpu'是一個不推薦的選項,它現在是'cuda *'。 https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end%28gpuarray%29 – tandem