2017-06-23

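OpenCL stops working only when called in a loop

My code breaks when I try to loop the data back into the kernel function: after a few iterations it stops working and returns only 0. Does anyone know why? It works if I loop over the entire method that calls the kernel, but that is much slower.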

cl_mem *ptrInput = &Pressure_BUFF; 
cl_mem *ptrOutput = &Pressure_OUT_BUFF; 

for(int i = 0; i<Interaction_per_frame; i++){ 

    clSetKernelArg(kernel_2, 4, sizeof(Pressure_BUFF), ptrInput); 
    clEnqueueNDRangeKernel(queue_2, kernel_2, 1, NULL,&work_units_per_kernel, NULL, 0, NULL, NULL); 
    clFinish(queue_2); // wait for the computation to finish

    cl_mem *ptrTpm = ptrInput; 
    ptrInput = ptrOutput; 
    ptrOutput = ptrTpm; 

} 

clEnqueueReadBuffer(queue_2, Pressure_OUT_BUFF, CL_TRUE, 0,sizeof(Pressure), Pressure, 0, NULL, NULL); 

Use 'clEnqueueCopyBuffer' instead of juggling cl_mem handles. – anil

Answer

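You can't change only the input memory buffer and leave the output untouched. The loop re-sets just the input argument, so once the pointers are swapped the kernel ends up reading from and writing to the same buffer.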
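
To illustrate why both arguments have to move together, here is a minimal host-only sketch in plain C with no OpenCL calls. It rests on assumptions: 'step' is a hypothetical stand-in for the kernel (it simply adds 1 to every element), 'ping_pong' is a hypothetical helper, and the two arrays play the roles of Pressure_BUFF and Pressure_OUT_BUFF. In the real loop, the 'in' and 'out' pointers would each be passed to clSetKernelArg on every iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel: reads `in`, writes `out`. */
static void step(const float *in, float *out, size_t n) {
    for (size_t k = 0; k < n; k++)
        out[k] = in[k] + 1.0f;
}

/* Runs `iterations` ping-pong steps over buffers a and b.
   Returns a pointer to whichever buffer holds the final result. */
static float *ping_pong(float *a, float *b, size_t n, int iterations) {
    assert(a != b); /* input and output must be distinct buffers */
    float *in = a;
    float *out = b;
    for (int i = 0; i < iterations; i++) {
        /* Real code: set BOTH kernel arguments here, input and output. */
        step(in, out, n);
        /* Swap roles: this step's output is the next step's input. */
        float *tmp = in;
        in = out;
        out = tmp;
    }
    /* After the final swap, `in` is the buffer that was written last. */
    return in;
}
```

After the loop the latest result sits in whichever buffer was written last, which is not always the one that started out as the output, so the read-back has to follow the swapped pointer as well.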

The simplest approach is to use 2 kernels, so you don't need to call clSetKernelArg and clFinish on every iteration.

//Create 2 buffers, A and B 
bufA = clCreateBuffer(...); 
bufB = clCreateBuffer(...); 

//Create 2 kernels with same parameters 
kernelAB = clCreateKernel(...); 
kernelBA = clCreateKernel(...); 

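//Set one to input A output B, and the other in reverse 
//(in and out are placeholders for the kernel's input and output argument indices)
clSetKernelArg(kernelAB, in, sizeof(cl_mem), &bufA); 
clSetKernelArg(kernelAB, out, sizeof(cl_mem), &bufB); 
clSetKernelArg(kernelBA, in, sizeof(cl_mem), &bufB); 
clSetKernelArg(kernelBA, out, sizeof(cl_mem), &bufA); 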

for(int i = 0; i<Interaction_per_frame; i++){ 
    clEnqueueNDRangeKernel(queue_2, i%2 ? kernelBA : kernelAB, 1, NULL,&work_units_per_kernel, NULL, 0, NULL, NULL);  
} 

//Odd iteration count: the last launch wrote B; even: it wrote A 
clEnqueueReadBuffer(queue_2, Interaction_per_frame%2 ? bufB : bufA, CL_TRUE, 0,sizeof(Pressure), Pressure, 0, NULL, NULL);
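
The parity test in the read-back can be checked with a small host-only sketch. Both helpers below are hypothetical and use no OpenCL: the first replays the loop's i%2 kernel selection and tracks which buffer was written last, and the second is the closed form used in the read-back, with 'A' and 'B' standing for bufA and bufB.

```c
#include <assert.h>

/* Returns 'A' or 'B': the buffer written by the last of `iterations`
   launches, replaying the loop's i%2 kernel selection.
   With zero iterations nothing is launched and the data is still in A. */
static char last_written(int iterations) {
    assert(iterations >= 0);
    char last = 'A';
    for (int i = 0; i < iterations; i++) {
        /* i odd -> kernelBA (writes A); i even -> kernelAB (writes B). */
        last = (i % 2) ? 'A' : 'B';
    }
    return last;
}

/* The closed form used in the read-back call. */
static char read_back_choice(int iterations) {
    assert(iterations >= 0);
    return (iterations % 2) ? 'B' : 'A';
}
```

The two functions agree for every non-negative iteration count, which is why the read-back can pick its buffer from the parity of Interaction_per_frame alone.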