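Calling a device function from a global function in PyCUDA

I am new to PyCUDA. I want to call a function declared with __device__ from a function declared with __global__. How can I do this in PyCUDA?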
import pycuda.driver as cuda
from pycuda.compiler import SourceModule
import numpy as n
import pycuda.autoinit
import pycuda.gpuarray as gp
d=gp.zeros(shape=(128,128),dtype=n.int32)
h=n.zeros(shape=(128,128),dtype=n.int32)
mod=SourceModule("""
__global__ void matAdd(int *a)
{
int px=blockIdx.x*blockDim.x+threadIdx.x;
int py=blockIdx.y*blockDim.y+threadIdx.y;
a[px*128+py]+=1;
matMul(px);
}
__device__ void matMul(int px)
{
px=5;
}
""")
m=mod.get_function("matAdd")
m(d,block=(32,32,1),grid=(4,4))
d.get(h)
The code above gives me the following error:
7-linux-i686.egg/pycuda/../include/pycuda kernel.cu]
[stderr:
kernel.cu(8): error: identifier "matMul" is undefined
kernel.cu(12): warning: parameter "px" was set but never used
1 error detected in the compilation of "/tmp/tmpxft_00002286_00000000-6_kernel.cpp1.ii".
]
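I'm not sure I understand the question. In PyCUDA you still write the device code in CUDA C. It would be no different if you were writing the host code in C++ instead of Python. So what are you asking? – talonmies 2012-08-10 13:29:28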
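
The "identifier "matMul" is undefined" error is consistent with ordinary C/C++ rules: the compiler has to see a declaration of matMul before the call inside matAdd, and in the posted source the __device__ function is defined only after the kernel. A minimal sketch of one possible fix, assuming nothing else about the kernel needs to change, is to move the __device__ definition above the __global__ kernel. The names KERNEL_SOURCE and run_on_gpu are introduced here purely for illustration and do not come from the original post.

```python
# Same kernel as in the question, but the __device__ function is now defined
# before the __global__ kernel that calls it, so it is declared at the call site.
KERNEL_SOURCE = """
__device__ void matMul(int px)
{
    px = 5;  // no effect on the caller: px is passed by value
}

__global__ void matAdd(int *a)
{
    int px = blockIdx.x * blockDim.x + threadIdx.x;
    int py = blockIdx.y * blockDim.y + threadIdx.y;
    a[px * 128 + py] += 1;
    matMul(px);
}
"""


def run_on_gpu():
    """Compile and launch the kernel (hypothetical helper).

    Requires PyCUDA, NumPy and a CUDA-capable GPU, so the imports are kept
    local to this function.
    """
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates the CUDA context on import)
    import pycuda.gpuarray as gpuarray
    from pycuda.compiler import SourceModule

    d = gpuarray.zeros((128, 128), dtype=np.int32)
    mat_add = SourceModule(KERNEL_SOURCE).get_function("matAdd")
    # 4x4 blocks of 32x32 threads cover the 128x128 array, one thread per element.
    mat_add(d, block=(32, 32, 1), grid=(4, 4))
    return d.get()
```

Calling run_on_gpu() on a machine with a working CUDA setup should return a 128x128 array in which every element has been incremented once. An alternative that leaves the original ordering alone would be to add a forward declaration, `__device__ void matMul(int px);`, above matAdd. Note that the "parameter "px" was set but never used" warning would likely remain either way, because matMul only assigns to its own by-value copy of px.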