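Kernel crashes when trying to do a simple value assignment

I am learning CUDA and am still at a beginner level. I am trying a simple task, but my code crashes when I run it and I don't know why. Any help would be appreciated.

EDIT: The crash happens on cudaMemcpy, and in the Image structure pixelVal is of type int**. Is that the cause?

Original C++ code: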

void Image::reflectImage(bool flag, Image& oldImage) 
/*Reflects the Image based on users input*/ 
{ 
    int rows = oldImage.N; 
    int cols = oldImage.M; 
    Image tempImage(oldImage); 

    for(int i = 0; i < rows; i++)
    {
        for(int j = 0; j < cols; j++)
            tempImage.pixelVal[rows - (i + 1)][j] = oldImage.pixelVal[i][j];
    }
    oldImage = tempImage; 
}

My CUDA kernel and code:

#define NTPB 512 
__global__ void fliph(int* a, int* b, int r, int c) 
{ 
    int i = blockIdx.x * blockDim.x + threadIdx.x; 
    int j = blockIdx.y * blockDim.y + threadIdx.y; 

    if (i >= r || j >= c) 
     return; 
    a[(r - i * c) + j] = b[i * c + j]; 
} 
void Image::reflectImage(bool flag, Image& oldImage) 
/*Reflects the Image based on users input*/ 
{ 
    int rows = oldImage.N; 
    int cols = oldImage.M; 
    Image tempImage(oldImage); 
    if(flag == true) //horizontal reflection
    {
        //Allocate device memory
        int* dpixels;
        int* oldPixels;
        int n = rows * cols;
        cudaMalloc((void**)&dpixels, n * sizeof(int));
        cudaMalloc((void**)&oldPixels, n * sizeof(int));
        cudaMemcpy(dpixels, tempImage.pixelVal, n * sizeof(int), cudaMemcpyHostToDevice);
        cudaMemcpy(oldPixels, oldImage.pixelVal, n * sizeof(int), cudaMemcpyHostToDevice);
        int nblks = (n + NTPB - 1)/NTPB;
        fliph<<<nblks, NTPB>>>(dpixels, oldPixels, rows, cols);
        cudaMemcpy(tempImage.pixelVal, dpixels, n * sizeof(int), cudaMemcpyDeviceToHost);
        cudaFree(dpixels);
        cudaFree(oldPixels);
    }
    oldImage = tempImage; 
}

2013-04-04 Bhrugesh Patel

Your block and grid are one-dimensional. Why are you using 2D indexing in the kernel? The variable 'j' in the kernel will always be 0. – sgarizvi 2013-04-04 17:14:18

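From a quick review, the code looks fine (apart from @sgar91's note). I suggest adding error checking to your program to pin the problem down further. See [this post](http://stackoverflow.com/questions/14038589/what-is-the-canonical-way-to-check-for-errors-using-the-cuda-runtime-api). – stuhlo 2013-04-04 17:25:36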

I count 7 CUDA API calls and no error checking at all! Step one: check for errors and try to narrow down where the problem occurs. – talonmies 2013-04-04 18:03:36
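
On the edit in the question: the int** is very likely the cause, assuming pixelVal is the usual array of separately allocated row pointers (the Image class is not shown, so this is an assumption). cudaMemcpy copies one contiguous range of bytes, so passing pixelVal hands it the table of row pointers rather than n pixel values, and reading n * sizeof(int) bytes from there can run past the end of that table. One way around this is to stage the pixels in a contiguous row-major buffer first. A minimal host-only C++ sketch, where flatten and unflatten are hypothetical helper names that do not appear in the original code:

```cpp
#include <cstddef>
#include <vector>

// Copies a rows x cols image stored as an array of row pointers (int**)
// into one contiguous row-major buffer.
// Assumption: every pixelVal[i] points to at least `cols` ints.
std::vector<int> flatten(int* const* pixelVal, int rows, int cols)
{
    std::vector<int> flat(static_cast<std::size_t>(rows) * cols);
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j)
            flat[static_cast<std::size_t>(i) * cols + j] = pixelVal[i][j];
    return flat;
}

// Writes a contiguous row-major buffer back into the row-pointer layout,
// for example after the result has been copied back from the device.
void unflatten(const std::vector<int>& flat, int** pixelVal, int rows, int cols)
{
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j)
            pixelVal[i][j] = flat[static_cast<std::size_t>(i) * cols + j];
}
```

The host-to-device copy would then take flat.data() as its source, for example cudaMemcpy(oldPixels, flat.data(), n * sizeof(int), cudaMemcpyHostToDevice), and the device-to-host copy would write into a buffer of the same kind before calling unflatten.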

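To process the image with the 2D indices i and j, you have to launch a 2D grid. With the current 1D launch, j is always 0, so the kernel only touches the first element of each row.

To set up 2D indexing, create a 2D block and a 2D grid like this: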
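
It may also help to check the flat index itself on the host. The original C++ writes to pixelVal[rows - (i + 1)][j], which in a row-major buffer corresponds to (rows - 1 - i) * cols + j. The kernel's (r - i * c) + j is a different expression: for i = 0 it gives r + j, and for larger i it can become negative. The following is a small host-only C++ reference, assuming row-major storage; flatIndex and flipRows are names made up for this sketch:

```cpp
#include <cstddef>
#include <vector>

// Row-major flat index of element (row, col) in an image with `cols` columns.
inline std::size_t flatIndex(int row, int col, int cols)
{
    return static_cast<std::size_t>(row) * cols + col;
}

// Host-side reference for what one GPU thread per (i, j) would do:
// copy src[i][j] to dst[rows - 1 - i][j], mirroring the rows top to bottom.
// Assumption: src holds exactly rows * cols elements in row-major order.
std::vector<int> flipRows(const std::vector<int>& src, int rows, int cols)
{
    std::vector<int> dst(src.size());
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j)
            dst[flatIndex(rows - 1 - i, j, cols)] = src[flatIndex(i, j, cols)];
    return dst;
}
```

Comparing the kernel's output against a reference like this on a small non-square image is a quick way to catch indexing mistakes.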

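const int BLOCK_DIM = 16;

// 16 x 16 = 256 threads per block
dim3 Block(BLOCK_DIM,BLOCK_DIM);

// Round up so the grid covers the whole image: x spans the columns, y spans the rows
dim3 Grid;
Grid.x = (cols + Block.x - 1)/Block.x;
Grid.y = (rows + Block.y - 1)/Block.y;

// Note: the kernel in the question takes its row index i from x and its column
// index j from y, so for non-square images either swap i and j in the kernel
// or swap cols and rows in the two lines above
fliph<<<Grid, Block>>>(dpixels, oldPixels, rows, cols);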
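
The rounding in the grid size can be checked without a GPU. The sketch below emulates a 2D launch on the host and counts how often each pixel is visited, under the assumption that x indexes columns and y indexes rows; blocksFor and visitCounts are hypothetical names used only here:

```cpp
#include <cstddef>
#include <vector>

// Ceiling division: number of blocks so that blocks * blockDim >= extent.
inline int blocksFor(int extent, int blockDim)
{
    return (extent + blockDim - 1) / blockDim;
}

// Emulates a 2D launch with square blocks and returns, for every pixel,
// how many emulated threads passed the bounds check and landed on it.
// Convention assumed for this sketch: x -> column, y -> row.
std::vector<int> visitCounts(int rows, int cols, int blockDim)
{
    std::vector<int> visits(static_cast<std::size_t>(rows) * cols, 0);
    const int gridX = blocksFor(cols, blockDim);
    const int gridY = blocksFor(rows, blockDim);
    for (int by = 0; by < gridY; ++by)
        for (int bx = 0; bx < gridX; ++bx)
            for (int ty = 0; ty < blockDim; ++ty)
                for (int tx = 0; tx < blockDim; ++tx) {
                    const int col = bx * blockDim + tx;
                    const int row = by * blockDim + ty;
                    // Same role as the kernel's early return for surplus threads.
                    if (row >= rows || col >= cols)
                        continue;
                    ++visits[static_cast<std::size_t>(row) * cols + col];
                }
    return visits;
}
```

With this configuration every pixel should be visited exactly once, including for sizes that are not multiples of the block dimension. The surplus threads in the last block row and column are the reason the bounds check in the kernel is needed.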

2013-04-04 18:03:42 sgarizvi
