2016-06-30 127 views

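"ORTE_ERROR_LOG: Not found in file dpm_orte.c at line 167" crashes a Fortran program using OpenMPI

I am learning how to use OpenMPI with Fortran. Following the OpenMPI documentation, I tried to create a simple client/server program. However, when I run it, the client produces the following error: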

[Laptop:13402] [[54220,1],0] ORTE_ERROR_LOG: Not found in file dpm_orte.c at line 167 
[Laptop:13402] *** An error occurred in MPI_Comm_connect 
[Laptop:13402] *** reported by process [3553361921,0] 
[Laptop:13402] *** on communicator MPI_COMM_WORLD 
[Laptop:13402] *** MPI_ERR_INTERN: internal error 
[Laptop:13402] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, 
[Laptop:13402] *** and potentially your MPI job) 
------------------------------------------------------- 
Primary job terminated normally, but 1 process returned 
a non-zero exit code.. Per user-direction, the job has been aborted. 
------------------------------------------------------- 
-------------------------------------------------------------------------- 
mpiexec detected that one or more processes exited with non-zero status, thus causing 
the job to be terminated. The first process to do so was: 
    Process name: [[54220,1],0] 
    Exit code: 17 
-------------------------------------------------------------------------- 

The code for the server and the client is shown below:

server.f90

program name 
use mpi 
implicit none 

    ! type declaration statements 
    INTEGER :: ierr, size, newcomm, loop, buf(255), status(MPI_STATUS_SIZE) 
    CHARACTER(MPI_MAX_PORT_NAME) :: port_name 

    ! executable statements 
    call MPI_Init(ierr) 
    call MPI_Comm_size(MPI_COMM_WORLD, size, ierr) 
    call MPI_Open_port(MPI_INFO_NULL, port_name, ierr) 
    print *, "Port name is: ", port_name 

    do while (.true.) 
     call MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, newcomm, ierr) 

     loop = 1 
     do while (loop .eq. 1) 
      call MPI_Recv(buf, 255, MPI_INTEGER, MPI_ANY_SOURCE, MPI_ANY_TAG, newcomm, status, ierr) 
      print *, "Looping the loop." 
      loop = 0 

     enddo 

     call MPI_Comm_free(newcomm, ierr) 
     call MPI_Close_port(port_name, ierr) 
     call MPI_Finalize(ierr)  

    enddo 

end program name 

client.f90

program name 
use mpi 
implicit none 

    ! type declaration statements 
    INTEGER :: ierr, buf(255), tag, newcomm 
    CHARACTER(MPI_MAX_PORT_NAME) :: port_name 
    LOGICAL :: done 

    ! executable statements 
    call MPI_Init(ierr) 
    print *, "Please provide me with the port name: " 
    read(*,*) port_name 

    call MPI_Comm_connect(port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, newcomm, ierr) 

    done = .false. 
    do while (.not. done) 
     tag = 0 
     call MPI_Send(buf, 255, MPI_INTEGER, 0, tag, newcomm, ierr) 
     done = .true. 
    enddo 

    call MPI_Send(buf, 0, MPI_INTEGER, 0, 1, newcomm, ierr) 
    call MPI_Comm_Disconnect(newcomm, ierr) 
    call MPI_Finalize(ierr) 

end program name 

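I compile with mpif90 server.f90 -o server.out and mpif90 client.f90 -o client.out, and run the programs with mpiexec -np 1 server.out and mpiexec -np 1 client.out. The error occurs when I give the client the port name (that is, when I press Enter after the read).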

which dpm_orte.c reports dpm_orte.c not found

I am running Arch Linux and installed openmpi 1.10.3-1 from the extra repository.

@d_1999, I moved MPI_Finalize() and tried again, but the problem persists. – GLaDER

Answers

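This is a plain Fortran input-handling error and has nothing to do with MPI (apart from Open MPI emitting a completely incomprehensible error message). Just insert a line in client.f90 that prints the value of port_name immediately after it is read: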

print *, "Please provide me with the port name: " 
read(*,*) port_name 
print *, port_name 

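With an actual port name such as 2527592448.0;tcp://10.0.1.6,10.0.1.2,192.168.122.1,10.10.11.10:55837+2527592449.0;tcp://10.0.1.6,10.0.1.4,192.168.122.1,10.10.11.10::300 the output will be 2527592448.0. List-directed input treats ; as a separator and stops reading there, so the port address passed to MPI_COMM_CONNECT is incomplete. The commas and slashes in the port name would end an undelimited list-directed character value in the same way.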
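The truncation can be reproduced without MPI. The following is a minimal sketch, not part of the original program: it reads the same string twice from an internal file, once list-directed and once with an explicit '(A)' edit descriptor. The program name, the variable names, and the shortened port string are made up for illustration, and where exactly the list-directed read stops may depend on the compiler.

```fortran
program read_port_demo
    implicit none

    ! Shortened, hypothetical port string with the same punctuation
    ! (semicolon, colon, slashes) as a real Open MPI port name.
    character(len=64) :: line
    character(len=64) :: listed, formatted

    line = "2527592448.0;tcp://10.0.1.6:55837"
    listed = ""
    formatted = ""

    ! Internal reads stand in for reading the same line from standard input.
    read(line, *) listed          ! list-directed: stops at a separator
    read(line, '(A)') formatted   ! explicit format: takes the whole record

    print '(A,A)', "list-directed: ", trim(listed)
    print '(A,A)', "(A) format:    ", trim(formatted)

    ! The formatted read must return the complete string.
    if (trim(formatted) /= trim(line)) error stop "formatted read lost data"
end program read_port_demo
```

Compiled with a Fortran 2008 capable compiler, the first line printed should show only a prefix of the string, while the second shows all of it.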

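The solution is to replace read(*,*) port_name with

read(*,'(A)') port_name

Also, the loop in the server is badly written. You cannot call MPI_FINALIZE more than once. Closing the port inside the loop is also a bad idea, because you call MPI_COMM_ACCEPT again immediately afterwards. The correct loop would be: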

! executable statements 
call MPI_Init(ierr) 
call MPI_Comm_size(MPI_COMM_WORLD, size, ierr) 
call MPI_Open_port(MPI_INFO_NULL, port_name, ierr) 
print *, "Port name is: ", port_name 

do while (.true.) 
    call MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, newcomm, ierr) 

    loop = 1 
    do while (loop .eq. 1) 
     call MPI_Recv(buf, 255, MPI_INTEGER, MPI_ANY_SOURCE, MPI_ANY_TAG, newcomm, status, ierr) 
     print *, "Looping the loop." 
     loop = 0 
    enddo 

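    ! MPI_Comm_disconnect waits for pending communication, frees the
    ! communicator and sets newcomm to MPI_COMM_NULL, so a separate
    ! MPI_Comm_free call must not follow it.
    call MPI_Comm_disconnect(newcomm, ierr)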
enddo 

call MPI_Close_port(port_name, ierr) 
call MPI_Finalize(ierr) 
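
For completeness, here is a sketch of how the client might look with the read fix applied. It assumes the server above is already running and has printed its port name, and it sends exactly one message to match the single MPI_Recv per connection in the server loop. Echoing the port name and zeroing buf are additions for illustration and are not part of the original client.

```fortran
program client
    use mpi
    implicit none

    integer :: ierr, newcomm, buf(255)
    character(len=MPI_MAX_PORT_NAME) :: port_name

    call MPI_Init(ierr)

    print *, "Please provide me with the port name: "
    ! '(A)' reads the whole input line, separators included.
    read(*, '(A)') port_name
    ! Echo the value so that a truncated port name is easy to spot.
    print *, "Connecting to: ", trim(port_name)

    call MPI_Comm_connect(port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, newcomm, ierr)

    ! Initialise the payload before sending it to rank 0 of the server.
    buf = 0
    call MPI_Send(buf, 255, MPI_INTEGER, 0, 0, newcomm, ierr)

    call MPI_Comm_disconnect(newcomm, ierr)
    call MPI_Finalize(ierr)
end program client
```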

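Thank you very much for the clarification about read. As for the loop, I knew it was bad, but I was stuck on the other error and just did not want to do more. Thanks again! – GLaDER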