2011-09-27 46 views
1

Shared memory pointer arithmetic

I don't understand exactly what is happening in the following lines:

  1. unsigned char *membershipChanged = (unsigned char *)sharedMemory;

  2. float *clusters = (float *)(sharedMemory + blockDim.x);

I think that in #1 sharedMemory is effectively renamed to membershipChanged, but why is blockDim.x added to the sharedMemory pointer? Where does the resulting address point?

sharedMemory was created as extern __shared__ char sharedMemory[];

I found the code in a CUDA kmeans implementation.

void find_nearest_cluster(int numCoords, 
          int numObjs, 
          int numClusters, 
          float *objects,   // [numCoords][numObjs] 
          float *deviceClusters, // [numCoords][numClusters] 
          int *membership,   // [numObjs] 
          int *intermediates) 
{ 
extern __shared__ char sharedMemory[]; 

// The type chosen for membershipChanged must be large enough to support 
// reductions! There are blockDim.x elements, one for each thread in the 
// block. 
unsigned char *membershipChanged = (unsigned char *)sharedMemory; 
float *clusters = (float *)(sharedMemory + blockDim.x); 

membershipChanged[threadIdx.x] = 0; 

// BEWARE: We can overrun our shared memory here if there are too many 
// clusters or too many coordinates! 
for (int i = threadIdx.x; i < numClusters; i += blockDim.x) { 
    for (int j = 0; j < numCoords; j++) { 
     clusters[numClusters * j + i] = deviceClusters[numClusters * j + i]; 
    } 
} 
..... 

Answer

4

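sharedMemory + blockDim.x points blockDim.x bytes past the base of the shared memory region. Because sharedMemory is a char array, the pointer arithmetic is done in bytes.

The reason you might do something like this is to suballocate within shared memory. The launch site of the kernel containing find_nearest_cluster dynamically allocates some amount of shared storage for the kernel. The code implies that two logically distinct arrays reside in the shared storage pointed to by sharedMemory: membershipChanged and clusters. The first occupies blockDim.x bytes (one unsigned char per thread), and the second starts right after it. The pointer arithmetic is just a way to get a pointer to the second array.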
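To make the layout concrete, here is a minimal sketch of the same suballocation pattern. The kernel partition_demo, the helper run_partition_demo, and the sizes used are hypothetical and are not part of the kmeans code. The sketch also assumes blockDim.x is a multiple of 4, since the float pointer would otherwise be misaligned.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical demo kernel (not from the kmeans code): carves one dynamic
// shared-memory allocation into a byte array followed by a float array.
__global__ void partition_demo(float *out, int numFloats)
{
    extern __shared__ char sharedMemory[];

    // First blockDim.x bytes: one flag per thread.
    unsigned char *flags = (unsigned char *)sharedMemory;
    // Remaining bytes: floats. Assumes blockDim.x is a multiple of 4 so
    // that this pointer is 4-byte aligned.
    float *values = (float *)(sharedMemory + blockDim.x);

    flags[threadIdx.x] = 1;
    for (int i = threadIdx.x; i < numFloats; i += blockDim.x)
        values[i] = 2.0f * i;
    __syncthreads();

    // Copy the floats out, then let thread 0 append the sum of the flags.
    for (int i = threadIdx.x; i < numFloats; i += blockDim.x)
        out[i] = values[i];
    if (threadIdx.x == 0) {
        int sum = 0;
        for (unsigned t = 0; t < blockDim.x; ++t) sum += flags[t];
        out[numFloats] = (float)sum;
    }
}

// Returns the number of mismatches (0 on success), or -1 on a CUDA error.
int run_partition_demo()
{
    const int threads = 64, numFloats = 10;
    // Bytes for both arrays: flags first, then floats.
    size_t smemBytes = threads * sizeof(unsigned char)
                     + numFloats * sizeof(float);

    float hOut[numFloats + 1];
    float *dOut = nullptr;
    if (cudaMalloc(&dOut, sizeof(hOut)) != cudaSuccess) return -1;

    partition_demo<<<1, threads, smemBytes>>>(dOut, numFloats);
    cudaError_t err = cudaMemcpy(hOut, dOut, sizeof(hOut),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    if (err != cudaSuccess) return -1;

    int bad = 0;
    for (int i = 0; i < numFloats; ++i)
        if (hOut[i] != 2.0f * i) ++bad;
    if (hOut[numFloats] != (float)threads) ++bad;
    return bad;
}

int main()
{
    int bad = run_partition_demo();
    printf("mismatches: %d\n", bad);
    return bad != 0;
}
```

The two arrays never overlap because the float array begins exactly where the per-thread bytes end. If blockDim.x were not a multiple of 4, one option would be to round the offset up to the next multiple of sizeof(float) before casting.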

+0

And is 'extern __shared__ char sharedMemory[];' enough to tell CUDA to allocate all available shared memory? – Framester

+1

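No. Shared memory is allocated dynamically at kernel launch time using the following syntax: kernel<<<num_blocks, num_threads, num_bytes_smem>>>(args...). A pointer to the dynamically allocated shared memory is available through the extern __shared__ variable. –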
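
As a rough sketch of what the launch side could look like for the layout in the question: the helper smem_bytes_needed, the kernel touch_shared, and the sizes below are assumptions made for illustration, not code from the kmeans implementation. The byte count is blockDim.x flag bytes plus numClusters * numCoords floats, and it is checked against the device's per-block limit before launching, which is one way to guard against the overrun that the BEWARE comment warns about.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: only touches the first blockDim.x bytes of the
// dynamically sized shared array.
__global__ void touch_shared(int *out)
{
    extern __shared__ char sharedMemory[];  // size comes from the launch
    sharedMemory[threadIdx.x] = 1;
    __syncthreads();
    if (threadIdx.x == 0) {
        int sum = 0;
        for (unsigned t = 0; t < blockDim.x; ++t) sum += sharedMemory[t];
        *out = sum;
    }
}

// Hypothetical helper: bytes needed for one flag byte per thread followed
// by a numCoords x numClusters float array, as in the question's layout.
size_t smem_bytes_needed(int threadsPerBlock, int numClusters, int numCoords)
{
    return threadsPerBlock * sizeof(unsigned char)
         + (size_t)numClusters * numCoords * sizeof(float);
}

int main()
{
    const int threads = 128, numClusters = 4, numCoords = 3;
    size_t bytes = smem_bytes_needed(threads, numClusters, numCoords);

    // Refuse to launch if the request exceeds the per-block limit.
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) return 1;
    if (bytes > prop.sharedMemPerBlock) {
        printf("request of %zu bytes exceeds the per-block limit\n", bytes);
        return 1;
    }

    int *dOut = nullptr, hOut = 0;
    if (cudaMalloc(&dOut, sizeof(int)) != cudaSuccess) return 1;

    // The third launch parameter is the dynamic shared memory size in bytes.
    touch_shared<<<1, threads, bytes>>>(dOut);
    cudaError_t err = cudaMemcpy(&hOut, dOut, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    if (err != cudaSuccess) return 1;

    printf("requested %zu bytes, flags set: %d\n", bytes, hOut);
    return hOut == threads ? 0 : 1;
}
```

Only the bytes requested in the third launch parameter are available through sharedMemory, so the kernel and the launch site have to agree on the layout.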