Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

c++ - cout slowest processor MPI

I am writing a program using MPI. Each processor executes a for loop:

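#include <boost/mpi.hpp>
#include <iostream>

int main(int argc, char** argv) {
  boost::mpi::environment env(argc, argv);

  for( int i=0; i<10; ++i ) {
    // std::endl already flushes the stream, so no extra std::flush is needed
    std::cout << "Index " << i << std::endl;
  }
}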

Is there a way to make the cout happen only on the last processor to hit index i? Or is there a flag so that a line is executed only on the last processor to get to it?

1 Answer

It might look trivial, but what you are asking for is actually extremely complex in distributed memory models such as MPI...

In a shared memory environment, such as OpenMP, this would be trivially solved by defining a shared counter that every thread increments atomically and then compares with the number of threads. The increment and the read of the resulting value have to be a single atomic operation, otherwise two threads could both see the final value. If the value a thread obtains equals the number of threads, then all threads have passed that point, and the current thread, being the last one, takes care of the printing.
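To make the shared-memory idea concrete, here is a minimal sketch. Note that it uses std::thread and std::atomic from standard C++ rather than OpenMP, purely to keep it self-contained; the function name count_last_arrivals and the thread ids are made up for the illustration and are not part of the original question.

```cpp
#include <atomic>
#include <iostream>
#include <thread>
#include <vector>

// Hypothetical helper: starts `num_threads` threads, each of which bumps a
// shared counter exactly once. Returns how many threads saw themselves as
// the last one to arrive (this should always be exactly one).
int count_last_arrivals(int num_threads) {
    std::atomic<int> arrived{0};     // the shared counter
    std::atomic<int> last_count{0};  // number of threads that were "last"
    std::vector<std::thread> workers;

    for (int id = 0; id < num_threads; ++id) {
        workers.emplace_back([&, id] {
            // fetch_add increments and returns the value *before* the
            // increment, all in one atomic step
            int before = arrived.fetch_add(1);
            if (before + 1 == num_threads) {
                // every other thread has already passed this point
                std::cout << "Thread " << id << " arrived last\n";
                last_count.fetch_add(1);
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return last_count.load();
}
```

Calling count_last_arrivals(4) prints one line, from whichever thread happened to arrive last; which thread that is can change from run to run. The important detail is that the increment and the read come from the same fetch_add call, which is the guarantee that is hard to get across processes in a distributed setting.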

In a distributed environment, defining and updating such a shared variable is very complex, since each process might run on a remote machine. To allow for it anyway, MPI has offered memory windows and one-sided communications since MPI-2.0. However, even with those, it wasn't possible to properly implement an atomic counter increment while also reliably getting its value. Only with MPI 3.0 and the introduction of the MPI_Fetch_and_op() function did this become possible. Here is an example implementation:

#include <mpi.h>
#include <iostream>

int main( int argc, char *argv[] ) {

    // initialisation and inquiring of rank and size
    MPI_Init( &argc, &argv);

    int rank, size;
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );

    // creation of the "shared" counter on process of rank 0
    int *addr = 0, winSz = 0;
    if ( rank == 0 ) {
        winSz = sizeof( int );
        MPI_Alloc_mem( winSz, MPI_INFO_NULL, &addr );
        *addr = 1; // initialised to 1 since MPI_Fetch_and_op returns value *before* increment
    }
    MPI_Win win;
    MPI_Win_create( addr, winSz, sizeof( int ), MPI_INFO_NULL, MPI_COMM_WORLD, &win );

    // atomic incrementation of the counter
    int counter, one = 1;
    MPI_Win_lock( MPI_LOCK_EXCLUSIVE, 0, 0, win );
    MPI_Fetch_and_op( &one, &counter, MPI_INT, 0, 0, MPI_SUM, win );
    MPI_Win_unlock( 0, win );

    // checking the value of the counter and printing by last in time process
    if ( counter == size ) {
        std::cout << "Process #" << rank << " did the last update" << std::endl;
    }

    // cleaning up
    MPI_Win_free( &win );
    if ( rank == 0 ) {
        MPI_Free_mem( addr );
    }
    MPI_Finalize();

    return 0;
}

As you can see, this is quite lengthy and complex for such a simple-looking request. Moreover, it requires MPI 3.0 support.

Unfortunately, Boost.MPI, which seems to be your target, only "supports the majority of functionality in MPI 1.1". So if you really want this functionality, you'll have to use some plain MPI programming.

