CUDA pinned memory flushing from the device
Environment: CUDA 5, compute capability 3.5, VS 2012, 64-bit Windows Server 2012.
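There is no shared memory access between threads; every thread is independent.
I am using zero-copy pinned memory. On the host, I can only read the pinned memory written by the device after I issue cudaDeviceSynchronize on the host.
I would like to be able to:
- Flush the data to the pinned memory as soon as the device has updated it.
- Not block the device threads (possibly by using an asynchronous copy).
I have tried calling __threadfence_system and __threadfence after every device write, but that did not flush the data.
Below is a complete CUDA sample that demonstrates my problem: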
#include <cstdio>
#include <cstdlib>
#include "cuda.h"
#include "cuda_runtime.h"
#include "device_launch_parameters.h"

__global__ void Kernel(volatile float* hResult)
{
    int tid = threadIdx.x + blockIdx.x * blockDim.x;

    printf("Kernel %d: Before Writing in Kernel\n", tid);
    hResult[tid] = tid + 1;
    __threadfence_system();
    // expecting that the data is getting flushed to host here!
    printf("Kernel %d: After Writing in Kernel\n", tid);

    // time waster for-loop (sleep)
    for (int timeWaster = 0; timeWaster < 100000000; timeWaster++);
}

int main()
{
    size_t blocks = 2; // used as the number of threads in a single block
    volatile float* hResult;

    cudaHostAlloc((void**)&hResult, blocks * sizeof(float), cudaHostAllocMapped);
    // cudaHostAlloc does not zero the buffer, so clear it before polling for non-zero values
    for (size_t i = 0; i < blocks; i++) hResult[i] = 0.0f;

    Kernel<<<1, blocks>>>(hResult);

    size_t filledElementsCounter = 0;
    // naive polling loop; it could instead be implemented in
    // another host thread
    while (filledElementsCounter < blocks)
    {
        // blocks until the value changes; this moves sequentially
        // while threads have no order (fine for this sample).
        while (hResult[filledElementsCounter] == 0);
        printf("%f\n", hResult[filledElementsCounter]);
        filledElementsCounter++;
    }

    cudaFreeHost((void*)hResult);
    system("pause");
    return 0;
}
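Currently this sample waits indefinitely, because nothing is read from the device unless I issue cudaDeviceSynchronize. The sample below works, but it is not what I want, because it defeats the purpose of asynchronous copying: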
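int main()
{
    size_t blocks = 2; // used as the number of threads in a single block
    volatile float* hResult;

    cudaHostAlloc((void**)&hResult, blocks * sizeof(float), cudaHostAllocMapped);
    Kernel<<<1, blocks>>>(hResult);

    // blocks the host until the kernel has finished
    cudaError_t error = cudaDeviceSynchronize();
    if (error != cudaSuccess)
    {
        fprintf(stderr, "cudaDeviceSynchronize failed: %s\n", cudaGetErrorString(error));
        return 1;
    }

    for (size_t i = 0; i < blocks; i++)
    {
        printf("%f\n", hResult[i]);
    }

    cudaFreeHost((void*)hResult);
    system("pause");
    return 0;
}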
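For comparison, here is a minimal sketch of host-side polling that avoids cudaDeviceSynchronize until cleanup. It is not a confirmed fix. It rests on several assumptions that the question does not establish: that the launch may be sitting in a driver-side queue (which is why it issues a non-blocking cudaStreamQuery(0) right after the launch), that the device pointer should be obtained with cudaHostGetDevicePointer rather than passing the host pointer directly, and that the buffer needs to be zeroed before polling. The names publishKernel and pollMappedResults, the spin length, and the timeout are hypothetical and exist only for illustration.

```cuda
#include <cstdio>
#include <ctime>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread publishes one value to mapped host memory,
// then busy-waits so the kernel is still running while the host polls.
__global__ void publishKernel(volatile float* result, long long spinCycles)
{
    int tid = threadIdx.x + blockIdx.x * blockDim.x;
    result[tid] = tid + 1.0f;
    __threadfence_system();   // order the write system-wide before continuing

    long long start = clock64();
    while (clock64() - start < spinCycles) { }
}

// Hypothetical helper: polls the mapped buffer while the kernel runs.
// Copies observed values into out and returns how many slots were seen
// before the timeout, or -1 on a CUDA error.
int pollMappedResults(int threads, float* out, double timeoutSeconds)
{
    // Assumption: older toolkits need this flag before mapped allocations.
    // The call may fail harmlessly if a context already exists, so the
    // result is ignored and the recorded error is cleared.
    cudaSetDeviceFlags(cudaDeviceMapHost);
    cudaGetLastError();

    float* hostBuf = NULL;
    if (cudaHostAlloc((void**)&hostBuf, threads * sizeof(float),
                      cudaHostAllocMapped) != cudaSuccess)
        return -1;
    for (int i = 0; i < threads; ++i) hostBuf[i] = 0.0f;  // not zeroed by default

    float* devPtr = NULL;
    if (cudaHostGetDevicePointer((void**)&devPtr, hostBuf, 0) != cudaSuccess)
    {
        cudaFreeHost(hostBuf);
        return -1;
    }

    publishKernel<<<1, threads>>>(devPtr, 100000000LL);
    // Non-blocking query; assumed here to prompt the driver to submit the launch.
    cudaStreamQuery(0);

    volatile float* watch = hostBuf;
    clock_t deadline = clock() + (clock_t)(timeoutSeconds * CLOCKS_PER_SEC);
    int seen = 0;
    while (seen < threads && clock() < deadline)
    {
        if (watch[seen] != 0.0f)
        {
            out[seen] = watch[seen];
            ++seen;
        }
    }

    // Synchronize only for cleanup, after polling has finished or timed out.
    cudaError_t status = cudaDeviceSynchronize();
    cudaFreeHost(hostBuf);
    return status == cudaSuccess ? seen : -1;
}
```

Calling pollMappedResults(2, out, 5.0) with a two-element out array would return 2 if both writes become visible to the host before the timeout. A smaller return value means the host never observed the writes while the kernel was running, which is the behavior described in the question. The timeout is there so the sketch cannot hang the way the original polling loop does.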
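Did you solve this problem? Have you tried using dynamic parallelism to write the data to the CPU host's memory? Use 'cudaMemcpyAsync(uva_host_ptr, device_ptr, size);' inside the kernel function, as shown in the following link: http://on-demand.gputechconf.com/gtc/2012/presentations/S0338-GTC2012-CUDA-Programming-Model.pdf – Alex 2013-10-13 21:34:50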