Converting OpenMP to TBB

I am having some difficulty converting OpenMP code to TBB. Can anyone help me?

I have the following code in OpenMP, which performs well:

# pragma omp parallel \ 
shared (b, count, count_max, g, r, x_max, x_min, y_max, y_min) \ 
private (i, j, k, x, x1, x2, y, y1, y2) 
{ 
# pragma omp for 

for (i = 0; i < m; i++) 
{ 
for (j = 0; j < n; j++) 
{ 
//cout << omp_get_thread_num() << " thread\n"; 
    x = ((double) ( j - 1) * x_max 
     + (double) (m - j ) * x_min) 
    /(double) (m  - 1); 

    y = ((double) ( i - 1) * y_max 
     + (double) (n - i ) * y_min) 
    /(double) (n  - 1); 

    count[i][j] = 0; 

    x1 = x; 
    y1 = y; 

    for (k = 1; k <= count_max; k++) 
    { 
    x2 = x1 * x1 - y1 * y1 + x; 
    y2 = 2 * x1 * y1 + y; 

    if (x2 < -2.0 || 2.0 < x2 || y2 < -2.0 || 2.0 < y2) 
    { 
     count[i][j] = k; 
     break; 
    } 
    x1 = x2; 
    y1 = y2; 
    } 

    if ((count[i][j] % 2) == 1) 
    { 
    r[i][j] = 255; 
    g[i][j] = 255; 
    b[i][j] = 255; 
    } 
    else 
    { 
    c = (int) (255.0 * sqrt (sqrt (sqrt ( 
     ((double) (count[i][j])/(double) (count_max)))))); 
    r[i][j] = 3 * c/5; 
    g[i][j] = 3 * c/5; 
    b[i][j] = c; 
    } 
} 
} 
}

The TBB version, however, is 10 times slower than the OpenMP one.

The TBB code is:

tbb::parallel_for (int(0), m, [&](int i) 
{ 
for (j = 0; j < n; j++) 
{ 
    x = ((double) ( j - 1) * x_max 
     + (double) (m - j ) * x_min) 
    /(double) (m  - 1); 

    y = ((double) ( i - 1) * y_max 
     + (double) (n - i ) * y_min) 
    /(double) (n  - 1); 

    count[i][j] = 0; 

    x1 = x; 
    y1 = y; 

    for (k = 1; k <= count_max; k++) 
    { 
    x2 = x1 * x1 - y1 * y1 + x; 
    y2 = 2 * x1 * y1 + y; 

    if (x2 < -2.0 || 2.0 < x2 || y2 < -2.0 || 2.0 < y2) 
    { 
     count[i][j] = k; 
     break; 
    } 
    x1 = x2; 
    y1 = y2; 
    } 

    if ((count[i][j] % 2) == 1) 
    { 
    r[i][j] = 255; 
    g[i][j] = 255; 
    b[i][j] = 255; 
    } 
    else 
    { 
    c = (int) (255.0 * sqrt (sqrt (sqrt ( 
     ((double) (count[i][j])/(double) (count_max)))))); 
    r[i][j] = 3 * c/5; 
    g[i][j] = 3 * c/5; 
    b[i][j] = c; 
    } 
} 
});

Source

2016-12-23 Gabriel Stanica

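TBB's default partitioner is 'auto_partitioner', which performs recursive work subdivision down to a single loop iteration per thread, and this can result in a huge amount of overhead. With many compilers the default schedule of the 'for' work-sharing construct is 'static', so you should pass a singleton instance of 'static_partitioner' to the 'parallel_for' algorithm to get the same work distribution in TBB as in OpenMP.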

I have changed the parallel_for to tbb::parallel_for(tbb::blocked_range2d(0, m, 0, n), [&](tbb::blocked_range2d s, static_partitioner()) { for (int i = s.rows().begin(); i

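Note the private (i, j, k, x, x1, x2, y, y1, y2) clause in the OpenMP version of the code. This list names the variables that are private (local) inside the parallel loop body. In the TBB version, however, many of these variables are captured by the lambda by reference ([&]), so the code is incorrect. It has data races, and in my opinion the slowdown is caused by multiple threads accessing these variables (cache-coherence overhead and loop indices being clobbered). So, if you want to fix the code, make these variables local, for example: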

tbb::parallel_for (int(0), m, [&](int i) 
{ 
double x, y, x1, x2, y1, y2; // !!!! 
int j, k;     // !!!! 
for (j = 0; j < n; j++) 
{ 
    x = ((double) ( j - 1) * x_max 
     + (double) (m - j ) * x_min) 
    /(double) (m  - 1); 
...
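One way to make this privatization hard to get wrong is to move the per-pixel work into functions whose temporaries are all locals. The sketch below does that for the iteration loop and the colouring step of the question's code; the names `escape_count` and `shade` are hypothetical, and `c`, which does not appear in the question's `private` list, is made local here as well.

```cpp
#include <array>
#include <cmath>

// Hypothetical helper: returns the iteration k at which the orbit of (x, y)
// leaves the [-2, 2] box, or 0 if it stays inside for count_max iterations.
// Every temporary is a local, so concurrent calls share no state.
int escape_count(double x, double y, int count_max)
{
    double x1 = x, y1 = y;
    for (int k = 1; k <= count_max; ++k) {
        const double x2 = x1 * x1 - y1 * y1 + x;
        const double y2 = 2 * x1 * y1 + y;
        if (x2 < -2.0 || 2.0 < x2 || y2 < -2.0 || 2.0 < y2)
            return k;
        x1 = x2;
        y1 = y2;
    }
    return 0;
}

// Hypothetical helper: maps a count to {r, g, b} the way the question does.
std::array<int, 3> shade(int count, int count_max)
{
    if (count % 2 == 1)
        return {255, 255, 255};
    const int c = static_cast<int>(
        255.0 * std::sqrt(std::sqrt(std::sqrt(
            static_cast<double>(count) / static_cast<double>(count_max)))));
    return {3 * c / 5, 3 * c / 5, c};
}
```

Inside the lambda, the body of the j loop would then reduce to computing x and y as locals, calling `escape_count(x, y, count_max)` to set count[i][j], and assigning the three values returned by `shade` to r[i][j], g[i][j] and b[i][j].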

Source

2016-12-23 15:49:24 Alex
