2017-02-16

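MPI_Wtime timer runs about 2 times faster in OpenMPI 2.0.2

After updating OpenMPI from 1.8.4 to 2.0.2, I get wrong time measurements from MPI_Wtime(). With version 1.8.4 the results matched those returned by the omp_get_wtime() timer; now the MPI_Wtime clock runs about 2 times faster, i.e. it reports intervals roughly twice as long.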

What could cause this behavior?

My example code:

#include <omp.h>
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

/* Busy work for one thread: fill an array, then shift it left many times. */
int some_work(int rank, int tid){
    /* const makes arr a fixed-size array in C++ (a VLA there is only a compiler extension) */
    const int count = 10000;
    int arr[count];
    for(int i=0; i<count; i++)
        arr[i] = i + tid + rank;
    for(int val=0; val<4000000; val++)
        for(int i=0; i<count-1; i++)
            arr[i] = arr[i+1];

    return arr[0];
}


int main (int argc, char *argv[]) {

    MPI_Init(NULL, NULL);
    int rank, size;

    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0)
        printf("there are %d mpi processes\n", size);

    MPI_Barrier(MPI_COMM_WORLD);

    /* Start both timers back to back, right before the parallel region. */
    double omp_time1 = omp_get_wtime();
    double mpi_time1 = MPI_Wtime();
    #pragma omp parallel
    {
        int tid = omp_get_thread_num();
        /* Only thread 0 does the work; the other threads wait at the end of the region. */
        if (tid == 0) {
            int nthreads = omp_get_num_threads();
            printf("There are %d threads for process %d\n", nthreads, rank);
            int result = some_work(rank, tid);
            printf("result for process %d thread %d is %d\n", rank, tid, result);
        }
    }

    MPI_Barrier(MPI_COMM_WORLD);
    /* Stop both timers; the MPI interval is nested inside the OpenMP interval. */
    double mpi_time2 = MPI_Wtime();
    double omp_time2 = omp_get_wtime();
    printf("process %d omp time: %f\n", rank, omp_time2 - omp_time1);
    printf("process %d mpi time: %f\n", rank, mpi_time2 - mpi_time1);
    printf("process %d ratio: %f\n", rank, (mpi_time2 - mpi_time1)/(omp_time2 - omp_time1));

    MPI_Finalize();

    return EXIT_SUCCESS;
}

Compiled with

g++ -O3 src/example_main.cpp -o bin/example -fopenmp -I/usr/mpi/gcc/openmpi-2.0.2/include -L /usr/mpi/gcc/openmpi-2.0.2/lib -lmpi 

Running it with

salloc -N2 -n2 mpirun --map-by ppr:1:node:pe=16 bin/example 

gives something like

there are 2 mpi processes 
There are 16 threads for process 0 
There are 16 threads for process 1 
result for process 1 thread 0 is 10000 
result for process 0 thread 0 is 9999 
process 1 omp time: 5.066794 
process 1 mpi time: 10.098752 
process 1 ratio: 1.993125 
process 0 omp time: 5.066816 
process 0 mpi time: 8.772390 
process 0 ratio: 1.731342 

The ratio here is not as consistent as I wrote above, but it is still large.

With OpenMPI 1.8.4 the results are fine:

g++ -O3 src/example_main.cpp -o bin/example -fopenmp -I/usr/mpi/gcc/openmpi-1.8.4/include -L /usr/mpi/gcc/openmpi-1.8.4/lib -lmpi -lmpi_cxx 

gives

result for process 0 thread 0 is 9999 
result for process 1 thread 0 is 10000 
process 0 omp time: 4.655244 
process 0 mpi time: 4.655232 
process 0 ratio: 0.999997 
process 1 omp time: 4.655335 
process 1 mpi time: 4.655321 
process 1 ratio: 0.999997 
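
To tell which of the two timers has drifted, it can help to compare each of them against a third, independent clock. Below is a minimal sketch, assuming a POSIX system with clock_gettime. The names wall_clock_fn, monotonic_seconds, sleep_ms and ratio_to_reference are made up for illustration and are not part of MPI or OpenMP, and the sleep is only a stand-in for the real workload.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Any wall-clock timer that returns seconds as a double. */
typedef double (*wall_clock_fn)(void);

/* Independent reference clock (POSIX), not tied to MPI or OpenMP. */
double monotonic_seconds(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Hypothetical stand-in workload: sleep for the given number of milliseconds. */
void sleep_ms(long ms) {
    struct timespec req = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&req, NULL);
}

/* Time the same interval with `timer` and with the reference clock.
   Returns timer_elapsed / reference_elapsed; the timer interval is
   nested inside the reference interval. */
double ratio_to_reference(wall_clock_fn timer, long work_ms) {
    assert(timer != NULL && work_ms > 0);
    double ref1 = monotonic_seconds();
    double t1 = timer();
    sleep_ms(work_ms);
    double t2 = timer();
    double ref2 = monotonic_seconds();
    return (t2 - t1) / (ref2 - ref1);
}
```

In the program above one could pass omp_get_wtime directly and a small wrapper function around MPI_Wtime() (an MPI implementation may provide MPI_Wtime as a macro, so a wrapper is the safe way to get a function pointer), and call some_work instead of sleep_ms. A ratio close to 1.0 means the timer agrees with the reference clock; a ratio close to 2.0 would point at that timer.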

Please post the code you use for the measurement, together with the specific measurement results (not just the factor). – Zulan


Added a small example; the ratio there is not consistent. – catbus

Answer


Maybe MPI_Wtime() is itself an expensive operation? Are the results more consistent if you avoid measuring the time consumed by MPI_Wtime() as part of the OpenMP time? For example:

double mpi_time1 = MPI_Wtime(); 
double omp_time1 = omp_get_wtime(); 
/* do something */ 
double omp_time2 = omp_get_wtime(); 
double mpi_time2 = MPI_Wtime();
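
The "expensive call" hypothesis can also be checked directly by measuring how long one call to a timer takes on average. The following is a rough sketch, assuming a POSIX system with clock_gettime as the reference clock. wall_clock_fn, monotonic_seconds and mean_call_cost are hypothetical names used only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Any wall-clock timer that returns seconds as a double. */
typedef double (*wall_clock_fn)(void);

/* Reference clock (POSIX) used to measure the timer under test. */
double monotonic_seconds(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Average cost, in seconds, of a single call to `timer`. */
double mean_call_cost(wall_clock_fn timer, int calls) {
    assert(timer != NULL && calls > 0);
    volatile double sink = 0.0;  /* keeps the result of each call "used" */
    double start = monotonic_seconds();
    for (int i = 0; i < calls; i++)
        sink = timer();
    double stop = monotonic_seconds();
    (void)sink;
    return (stop - start) / calls;
}
```

Passing a wrapper around MPI_Wtime() and a large call count (say 100000) gives the per-call cost. Each measured interval in the question contains only two timer calls, so the per-call cost would have to be on the order of seconds to account for the difference reported there.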