How can I make different threads execute different parts of the code in CUDA?

I'm working with CUDA and have run into a problem with thread synchronization. In my code, I need different threads to execute different parts of the code, for example:

one thread ->
all threads ->
one thread ->

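That is what I want. In the first part of the code only one thread should execute, then a part should be executed by all threads, and then a single thread should execute again. The threads also execute in a loop. Can anyone tell me how to do this?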

2010-05-06 Vickey

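You can only synchronize threads within a single block. Synchronizing across multiple blocks is possible, but only in very specific circumstances. If you need global synchronization across all threads, the way to do it is to launch a new kernel.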
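To illustrate the "launch a new kernel" approach, here is a minimal sketch that splits the one thread / all threads / one thread pattern into three kernels. Kernels launched into the same stream run in order, so each launch acts as a global barrier. The kernel names, the helper runThreePhases, and the work done in each phase are made up for illustration, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Phase 1: launched with one thread; writes a scale factor.
__global__ void setupPhase(float *scale)
{
    *scale = 2.0f;
}

// Phase 2: launched with many threads in many blocks.
__global__ void parallelPhase(float *data, const float *scale, int n)
{
    int idx = threadIdx.x + blockIdx.x * blockDim.x;
    if (idx < n) // prevent buffer overruns
    {
        data[idx] *= *scale;
    }
}

// Phase 3: launched with one thread; sums the results of phase 2.
__global__ void finishPhase(const float *data, int n, float *sum)
{
    float s = 0.0f;
    for (int i = 0; i < n; ++i)
    {
        s += data[i];
    }
    *sum = s;
}

// Hypothetical host helper: fills n elements with 1.0f, runs the
// three phases back to back, and returns the sum from phase 3.
float runThreePhases(int n)
{
    std::vector<float> host(n, 1.0f);
    float *data, *scale, *sum;
    cudaMalloc(&data, n * sizeof(float));
    cudaMalloc(&scale, sizeof(float));
    cudaMalloc(&sum, sizeof(float));
    cudaMemcpy(data, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    setupPhase<<<1, 1>>>(scale);                        // one thread
    parallelPhase<<<blocks, threads>>>(data, scale, n); // all threads
    finishPhase<<<1, 1>>>(data, n, sum);                // one thread

    float result = 0.0f;
    // This blocking copy waits for the three kernels above to finish.
    cudaMemcpy(&result, sum, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(data);
    cudaFree(scale);
    cudaFree(sum);
    return result;
}
```

The cost is one kernel launch per phase, which is usually acceptable unless the phases are very short or repeated many times.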

Within a block, you can synchronize threads with __syncthreads(). For example:

__global__ void F(float *A, int N)
{
    int idx = threadIdx.x + blockIdx.x * blockDim.x;

    if (threadIdx.x == 0) // thread 0 of each block does this:
    {
        // Whatever
    }
    __syncthreads();

    if (idx < N) // prevent buffer overruns
    {
        A[idx] = A[idx] * A[idx]; // "real work"
    }

    __syncthreads();

    if (threadIdx.x == 0) // thread 0 of each block does this:
    {
        // Whatever
    }
}
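Since the question mentions that the threads run in a loop, here is a rough sketch of the same pattern repeated several times inside one kernel. It assumes thread 0 hands a value to the rest of its block through shared memory. The names phasedLoop, roundsDone, and runPhasedLoop are hypothetical, and error checking is omitted. Note that __syncthreads() sits outside the if statements so that every thread of the block reaches it.

```cuda
#include <cuda_runtime.h>
#include <vector>

__global__ void phasedLoop(float *data, int n, int rounds, int *roundsDone)
{
    __shared__ float factor; // written by thread 0, read by the whole block
    int idx = threadIdx.x + blockIdx.x * blockDim.x;

    for (int r = 0; r < rounds; ++r)
    {
        if (threadIdx.x == 0) // single-thread part
        {
            factor = 2.0f;
        }
        __syncthreads(); // wait until factor has been written

        if (idx < n) // all-thread part
        {
            data[idx] *= factor;
        }
        __syncthreads(); // wait until every thread has finished its work

        if (threadIdx.x == 0) // single-thread part
        {
            roundsDone[blockIdx.x] = r + 1;
        }
    }
}

// Hypothetical host helper: fills n elements with 1.0f, runs the kernel,
// and returns the last element.
float runPhasedLoop(int n, int rounds)
{
    std::vector<float> host(n, 1.0f);
    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    float *data;
    int *roundsDone;
    cudaMalloc(&data, n * sizeof(float));
    cudaMalloc(&roundsDone, blocks * sizeof(int));
    cudaMemcpy(data, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    phasedLoop<<<blocks, threads>>>(data, n, rounds, roundsDone);

    cudaMemcpy(host.data(), data, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(data);
    cudaFree(roundsDone);
    return host[n - 1];
}
```

Each block runs its own copy of the single-thread parts, so this only orders threads within a block, not across blocks.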

2010-05-06 15:00:53 mch

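This is a simple solution, but watch out for branching (which causes the current warp to be serialized). Where possible, try to make all threads in a half-warp follow the same execution path. – Ljdawson 2010-05-14 23:16:00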
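To make the point about branching concrete, here is a small sketch contrasting a branch that splits threads inside a warp with one that keeps each warp on a single path. The kernel names and the helper runWarpUniform are invented for illustration, and a warp size of 32 is assumed. With branches this trivial the compiler may well avoid real divergence, so treat it as a picture of the idea rather than a benchmark.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Divergent: even and odd threads of the same warp take different
// branches, so the warp may have to execute both branches in turn.
__global__ void divergent(int *out, int n)
{
    int idx = threadIdx.x + blockIdx.x * blockDim.x;
    if (idx >= n) return;

    if (threadIdx.x % 2 == 0)
        out[idx] = 1;
    else
        out[idx] = 2;
}

// Warp-uniform: the condition depends only on the warp index, so all
// threads of a given warp take the same branch.
__global__ void warpUniform(int *out, int n)
{
    int idx = threadIdx.x + blockIdx.x * blockDim.x;
    if (idx >= n) return;

    if ((threadIdx.x / warpSize) % 2 == 0)
        out[idx] = 1;
    else
        out[idx] = 2;
}

// Hypothetical host helper: runs warpUniform on n elements and
// returns the output array.
std::vector<int> runWarpUniform(int n)
{
    std::vector<int> host(n, 0);
    int *out;
    cudaMalloc(&out, n * sizeof(int));

    int threads = 128;
    int blocks = (n + threads - 1) / threads;
    warpUniform<<<blocks, threads>>>(out, n);

    cudaMemcpy(host.data(), out, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(out);
    return host;
}
```

A single `if (threadIdx.x == 0)` branch, as in the answer above, only affects the first warp of each block, so its cost is usually small.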

You need to use the thread ID to control what gets executed, for example:

if (thread_ID == 0) 
{ 
    // do single thread stuff 
} 

// do common stuff on all threads 

if (thread_ID == 0) 
{ 
    // do single thread stuff 
}

2010-05-06 09:44:48

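If your program uses multiple blocks, you need a custom synchronization mechanism across blocks. If your kernel launches only a single block, __syncthreads() will work.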
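As a sketch of the single-block case, the kernel below is meant to be launched with exactly one block and uses a block-stride loop so that block still covers all n elements. Under that assumption __syncthreads() is enough to separate the phases. The names singleBlock and runSingleBlock are hypothetical, and error checking is omitted.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Intended for a launch with exactly one block.
__global__ void singleBlock(float *data, int n, float *sum)
{
    __shared__ float factor;

    if (threadIdx.x == 0) // one thread
    {
        factor = 3.0f;
    }
    __syncthreads();

    // All threads: each handles elements threadIdx.x, threadIdx.x + blockDim.x, ...
    for (int i = threadIdx.x; i < n; i += blockDim.x)
    {
        data[i] *= factor;
    }
    __syncthreads();

    if (threadIdx.x == 0) // one thread
    {
        float s = 0.0f;
        for (int i = 0; i < n; ++i)
        {
            s += data[i];
        }
        *sum = s;
    }
}

// Hypothetical host helper: fills n elements with 1.0f, runs the kernel
// with a single block of 256 threads, and returns the sum.
float runSingleBlock(int n)
{
    std::vector<float> host(n, 1.0f);
    float *data, *sum;
    cudaMalloc(&data, n * sizeof(float));
    cudaMalloc(&sum, sizeof(float));
    cudaMemcpy(data, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    singleBlock<<<1, 256>>>(data, n, sum);

    float result = 0.0f;
    cudaMemcpy(&result, sum, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(data);
    cudaFree(sum);
    return result;
}
```

A single block runs on only one multiprocessor, so this trades most of the GPU's parallelism for simple synchronization.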

2012-08-24 19:27:05
