2015-04-05 80 views
-2

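Copying a pointer array inside a struct using CUDA

I want to copy an array (held as a pointer) from one struct to another. The structs look like this: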

typedef struct COORD3D 
{ 
    int x,y,z; 
} 
COORD3D; 

typedef struct structName 
{ 
    double *volume; 
    COORD3D size; 
    // .. some other vars 
} 
structName; 

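To do this, I pass a function the address of an empty instance of the struct and the address of the struct holding the data I want to copy. At the moment I do the copy serially: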

void foo(structName *dest, structName *source) 
{ 

    // .. some other work 

    int size = source->size.x * source->size.y * source->size.z; 
    dest->volume = (double*)malloc(size*sizeof(double)); 

    int i; 
    for(i=0;i<size;i++) 
     dest->volume[i] = source->volume[i]; 
} 

I would like to do this with CUDA to speed the process up, as the array is very large (roughly 12 million elements).

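I have tried the following. The code compiles and runs, but the results stored in the array are incorrect (they look like very large random numbers).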

void foo(structName *dest, structName *source) 
{ 
    // .. some other work 

    int size = source->size.x * source->size.y * source->size.z; 
    dest->volume = (double*)malloc(size*sizeof(double)); 

    // Device Pointers 
    double *DEVICE_SOURCE, *DEVICE_DEST; 

    // Declare memory on GPU 
    cudaMalloc(&DEVICE_DEST,size); 
    cudaMalloc(&DEVICE_SOURCE,size); 

    // Copy Source to GPU 
    cudaMemcpy(DEVICE_SOURCE,source->volume,size, 
       cudaMemcpyHostToDevice); 

    // Setup Blocks/Grids 
    dim3 dimGrid(ceil(source->size.x/10.0), 
       ceil(source->size.y/10.0), 
       ceil(source->size.z/10.0)); 
    dim3 dimBlock(10,10,10); 

    // Run CUDA Kernel 
    copyVol<<<dimGrid,dimBlock>>> (DEVICE_SOURCE, 
            DEVICE_DEST, 
            source->size.x, 
            source->size.y, 
            source->size.z); 

    // Copy Constructed Array back to Host 
    cudaMemcpy(dest->volume,DEVICE_DEST,size, 
       cudaMemcpyDeviceToHost); 

} 

The kernel looks like this:

__global__ void copyVol(double *source, double *dest, 
         int x, int y, int z) 
{ 
    int posX = blockIdx.x * blockDim.x + threadIdx.x; 
    int posY = blockIdx.y * blockDim.y + threadIdx.y; 
    int posZ = blockIdx.z * blockDim.z + threadIdx.z; 

    if (posX < x && posY < y && posZ < z) 
    { 
     dest[posX+(posY*x)+(posZ*y*x)] = 
     source[posX+(posY*x)+(posZ*y*x)]; 
    } 
} 

Can anyone tell me where I am going wrong?

+0

'malloc(source->size,sizeof(double));' does not compile. – 2015-04-05 17:09:45

+0

Sorry, that was a typo. Edited now. – 2015-04-05 17:10:58

+0

Are there any other typos? – 2015-04-05 17:11:25

Answer

0

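I may be venturing a wrong answer here, but have you overlooked the size of the data type? cudaMalloc and cudaMemcpy take their size argument in bytes, not in elements.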

cudaMalloc(&DEVICE_DEST,size); 

should be

cudaMalloc(&DEVICE_DEST,size*sizeof(double)); 

And

cudaMemcpy(DEVICE_SOURCE,source->volume,size, cudaMemcpyHostToDevice); 

should be

cudaMemcpy(DEVICE_SOURCE,source->volume,size*sizeof(double), cudaMemcpyHostToDevice); 

and likewise for the other cudaMalloc and cudaMemcpy calls.
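To see why passing the element count goes wrong, here is a minimal host-only sketch in plain C. It uses memcpy as a stand-in for cudaMemcpy, since both take their length in bytes. The helper name elements_copied is made up for this illustration, and the 0xFF fill only simulates the uninitialised memory that malloc returns.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of the original code.
 * Copies `bytes` bytes from src into a freshly allocated buffer of n doubles
 * that was pre-filled with 0xFF bytes (a stand-in for uninitialised memory),
 * then returns how many leading elements of the buffer match src. */
static size_t elements_copied(const double *src, size_t n, size_t bytes)
{
    /* Never copy more than the destination can hold. */
    assert(bytes <= n * sizeof(double));

    double *dst = malloc(n * sizeof(double));
    if (dst == NULL)
        return 0;

    /* Simulate garbage in the destination before the copy. */
    memset(dst, 0xFF, n * sizeof(double));

    /* Like cudaMemcpy, memcpy takes a count in bytes, not in elements. */
    memcpy(dst, src, bytes);

    /* Count the elements that actually arrived intact. */
    size_t ok = 0;
    while (ok < n && memcmp(&dst[ok], &src[ok], sizeof(double)) == 0)
        ok++;

    free(dst);
    return ok;
}
```

For an array of 1000 doubles on a platform where sizeof(double) is 8, elements_copied(src, 1000, 1000) returns 125, while elements_copied(src, 1000, 1000 * sizeof(double)) returns 1000. The same thing presumably happens in the question's code: the copy back from the device only overwrites the first eighth of dest->volume, and the rest keeps whatever malloc left there, which would explain the large random-looking values. The kernel also reads and writes past the end of the undersized device buffers, so the byte count needs fixing in both cudaMalloc calls as well as both cudaMemcpy calls.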

+0

Nailed it, thanks again (it has been one of those days). – 2015-04-05 17:32:26