2017-03-15 90 views

System: Ubuntu 16.04.2, cuDNN 5.1, CUDA 8.0. Theano/Chainer does not report the correct free VRAM on a K80 with 12GB of RAM.

I have installed Theano from git (latest version).

When I run the generate example from https://github.com/yusuketomoto/chainer-fast-neuralstyle/tree/resize-conv, it reports that it has run out of memory, whether on CPU or GPU.

python generate.py sample_images/tubingen.jpg -m models/composition.model -o sample_images/output.jpg -g 0 

WARNING (theano.sandbox.cuda): The cuda backend is deprecated and will be removed in the next release (v0.10). Please switch to the gpuarray backend. You can get more information about how to switch at this URL: 
https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end%28gpuarray%29 

/home/ubuntu/Theano/theano/sandbox/cuda/__init__.py:558: UserWarning: Theano flag device=gpu* (old gpu back-end) only support floatX=float32. You have floatX=float64. Use the new gpu back-end with device=cuda* for that value of floatX. 
    warnings.warn(msg) 
Using gpu device 0: Tesla K80 (CNMeM is enabled with initial size: 95.0% of memory, cuDNN 5105) 
Traceback (most recent call last): 
    File "generate.py", line 45, in <module> 
    y = model(x) 
    File "/home/ubuntu/chainer-fast-neuralstyle/net.py", line 56, in __call__ 
    h = F.relu(self.b2(self.c2(h), test=test)) 
    File "/usr/local/lib/python2.7/dist-packages/chainer/links/connection/convolution_2d.py", line 108, in __call__ 
    deterministic=self.deterministic) 
    File "/usr/local/lib/python2.7/dist-packages/chainer/functions/connection/convolution_2d.py", line 326, in convolution_2d 
    return func(x, W, b) 
    File "/usr/local/lib/python2.7/dist-packages/chainer/function.py", line 199, in __call__ 
    outputs = self.forward(in_data) 
    File "/usr/local/lib/python2.7/dist-packages/chainer/function.py", line 310, in forward 
    return self.forward_gpu(inputs) 
    File "/usr/local/lib/python2.7/dist-packages/chainer/functions/connection/convolution_2d.py", line 90, in forward_gpu 
    y = cuda.cupy.empty((n, out_c, out_h, out_w), dtype=x.dtype) 
    File "/usr/local/lib/python2.7/dist-packages/cupy/creation/basic.py", line 19, in empty 
    return cupy.ndarray(shape, dtype=dtype, order=order) 
    File "cupy/core/core.pyx", line 88, in cupy.core.core.ndarray.__init__ (cupy/core/core.cpp:6333) 
    File "cupy/cuda/memory.pyx", line 280, in cupy.cuda.memory.alloc (cupy/cuda/memory.cpp:5988) 
    File "cupy/cuda/memory.pyx", line 431, in cupy.cuda.memory.MemoryPool.malloc (cupy/cuda/memory.cpp:9256) 
    File "cupy/cuda/memory.pyx", line 447, in cupy.cuda.memory.MemoryPool.malloc (cupy/cuda/memory.cpp:9162) 
    File "cupy/cuda/memory.pyx", line 342, in cupy.cuda.memory.SingleDeviceMemoryPool.malloc (cupy/cuda/memory.cpp:7817) 
    File "cupy/cuda/memory.pyx", line 368, in cupy.cuda.memory.SingleDeviceMemoryPool.malloc (cupy/cuda/memory.cpp:7592) 
    File "cupy/cuda/memory.pyx", line 260, in cupy.cuda.memory._malloc (cupy/cuda/memory.cpp:5930) 
    File "cupy/cuda/memory.pyx", line 261, in cupy.cuda.memory._malloc (cupy/cuda/memory.cpp:5851) 
    File "cupy/cuda/memory.pyx", line 35, in cupy.cuda.memory.Memory.__init__ (cupy/cuda/memory.cpp:1772) 
    File "cupy/cuda/runtime.pyx", line 207, in cupy.cuda.runtime.malloc (cupy/cuda/runtime.cpp:3429) 
    File "cupy/cuda/runtime.pyx", line 130, in cupy.cuda.runtime.check_status (cupy/cuda/runtime.cpp:2241) 
cupy.cuda.runtime.CUDARuntimeError: cudaErrorMemoryAllocation: out of memory 


import theano.sandbox.cuda.basic_ops as sbcuda 
sbcuda.cuda_ndarray.cuda_ndarray.mem_info() 
(500105216L, 11995578368L) 
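
As a sketch of what these two numbers mean: `mem_info()` wraps CUDA's memory query and (to my understanding) returns `(free_bytes, total_bytes)` for the current device. The startup log above also says "CNMeM is enabled with initial size: 95.0% of memory", i.e. Theano pre-allocates almost the whole card at import time. The small calculation below, using the exact numbers from the output above, shows the reported ~500MB of free memory is roughly the ~5% that CNMeM leaves behind:

```python
# Numbers taken from the mem_info() output above: (free_bytes, total_bytes).
free_bytes, total_bytes = 500105216, 11995578368

GIB = 1024 ** 3
print("free:  %.2f GiB" % (free_bytes / GIB))   # about 0.47 GiB
print("total: %.2f GiB" % (total_bytes / GIB))  # about 11.17 GiB

# CNMeM was started with "initial size: 95.0% of memory", so at most ~5%
# of the card is left over for any other allocator (e.g. cupy's):
leftover = total_bytes * 0.05
print("5%% of total: %.2f GiB" % (leftover / GIB))  # about 0.56 GiB
```

That would explain why cupy's `malloc` fails in the traceback: almost all of the 12GB is already reserved before Chainer tries to allocate anything.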


lspci -vvv |grep -i -A 20 nvidia 
00:04.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1) 
Subsystem: NVIDIA Corporation GK210GL [Tesla K80] 
Physical Slot: 4 
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- 
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- 
Latency: 0 
Interrupt: pin A routed to IRQ 11 
Region 0: Memory at fd000000 (32-bit, non-prefetchable) [size=16M] 
Region 1: Memory at 400000000 (64-bit, prefetchable) [size=16G] 
Region 3: Memory at 800000000 (64-bit, prefetchable) [size=32M] 
Region 5: I/O ports at c000 [size=128] 
Capabilities: <access denied> 
Kernel driver in use: nvidia 
Kernel modules: nvidia_375_drm, nvidia_375 

What exactly do these numbers mean? Does Theano/Chainer only have access to ~500MB of VRAM?


Your GPU may have "zombie" allocations holding memory, or some other process on the GPU is using it. If any of this looks odd, try checking the output of `nvidia-smi` and/or reboot the system and re-run your test to see whether most of the memory is available. –


I managed to fix the problem by completely uninstalling Theano. I am puzzled as to why importing Chainer shows a Theano warning, but it does. Uninstalling Theano allowed the script to work. – Chris

Answer


I managed to fix the problem by completely uninstalling Theano. I am puzzled as to why importing Chainer shows a Theano warning, but it does. Uninstalling Theano allowed the script to work.
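
After `pip uninstall theano`, a quick stdlib check (the helper name here is my own, not part of either library) can confirm whether another copy of Theano, for example a git checkout left on `PYTHONPATH`, is still importable and could still trigger the warning:

```python
import importlib.util

def theano_installed():
    """Return True if a 'theano' package can be found on sys.path."""
    return importlib.util.find_spec("theano") is not None

# If this still prints True after uninstalling, a leftover copy (e.g. a
# git checkout on PYTHONPATH) is shadowing the uninstalled package.
print(theano_installed())
```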