Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c++ - Exchange Data Between MPI processes (halo)

Given the following scenario: I have N MPI processes, each with an object. When the communication stage comes, data (usually small) from these objects will be exchanged. In general, there is data exchange between any two nodes.

What is the best strategy?

  • In any node X, create two buffers for each other node that has a connection with node X, and then do send/receive on a peer-to-peer basis.
  • In each node X, create one buffer to gather all the halo data to be communicated, and then "bcast" that buffer.

  • Is there any other strategy I am not aware of?


He who fights dragons for too long becomes a dragon himself; gaze into the abyss for too long, and the abyss will gaze back…

1 Answer


For nearest-neighbour halo swaps, one of the most efficient implementations is usually a set of MPI_Sendrecv calls, typically two per dimension:

Half-step one - Transfer of data in the positive direction: each rank receives from the rank on its left into its left halo and sends data to the rank on its right.

    +-+-+---------+-+-+     +-+-+---------+-+-+     +-+-+---------+-+-+
--> |R| | (i,j-1) |S| | --> |R| |  (i,j)  |S| | --> |R| | (i,j+1) |S| | -->
    +-+-+---------+-+-+     +-+-+---------+-+-+     +-+-+---------+-+-+

(S designates the part of the local data being communicated, while R designates the halo into which data is being received; (i,j) are the coordinates of the rank in the process grid)

Half-step two - Transfer of data in the negative direction: each rank receives from the rank on its right into its right halo and sends data to the rank on its left.

    +-+-+---------+-+-+     +-+-+---------+-+-+     +-+-+---------+-+-+
<-- |X|S| (i,j-1) | |R| <-- |X|S|  (i,j)  | |R| <-- |X|S| (i,j+1) | |R| <--
    +-+-+---------+-+-+     +-+-+---------+-+-+     +-+-+---------+-+-+

(X is that part of the halo region that has already been populated in the previous half-step)

Most switched networks support multiple simultaneous bi-directional (full duplex) communications and the latency of the whole exchange is

Both half-steps are repeated once for each dimension of the domain decomposition.
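The two half-steps can be sketched for a single dimension as a small serial C++ simulation. This is only an illustration of the data movement, not MPI code: the Rank struct and the functions shift_positive, shift_negative and exchange_halos are hypothetical names, the "ranks" are just elements of a std::vector, and a non-periodic line of ranks that all share the same halo width is assumed. In a real program each copy loop would be one MPI_Sendrecv call, roughly as shown in the comments.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One simulated rank. Layout of a:
//   [ left halo (w cells) | interior (n cells) | right halo (w cells) ]
struct Rank {
    std::size_t n;          // number of interior cells
    std::size_t w;          // halo width
    std::vector<double> a;  // n + 2*w cells, zero-initialised
    Rank(std::size_t n_, std::size_t w_) : n(n_), w(w_), a(n_ + 2 * w_, 0.0) {
        assert(n_ >= w_);   // the interior must be able to fill a neighbour's halo
    }
};

// Half-step one (positive direction): every rank sends its last w interior
// cells to the right; the right neighbour stores them in its left halo.
// Rough MPI equivalent on each rank (left/right e.g. from MPI_Cart_shift):
//   MPI_Sendrecv(&a[n], w, MPI_DOUBLE, right, 0,
//                &a[0], w, MPI_DOUBLE, left,  0, comm, MPI_STATUS_IGNORE);
void shift_positive(std::vector<Rank>& ranks) {
    for (std::size_t r = 0; r + 1 < ranks.size(); ++r) {
        const Rank& src = ranks[r];
        Rank& dst = ranks[r + 1];
        for (std::size_t k = 0; k < src.w; ++k)
            dst.a[k] = src.a[src.n + k];
    }
}

// Half-step two (negative direction): every rank sends its first w interior
// cells to the left; the left neighbour stores them in its right halo.
// Rough MPI equivalent on each rank:
//   MPI_Sendrecv(&a[w],     w, MPI_DOUBLE, left,  1,
//                &a[w + n], w, MPI_DOUBLE, right, 1, comm, MPI_STATUS_IGNORE);
void shift_negative(std::vector<Rank>& ranks) {
    for (std::size_t r = 1; r < ranks.size(); ++r) {
        const Rank& src = ranks[r];
        Rank& dst = ranks[r - 1];
        for (std::size_t k = 0; k < src.w; ++k)
            dst.a[dst.w + dst.n + k] = src.a[src.w + k];
    }
}

// Full one-dimensional halo swap: both half-steps in sequence.
void exchange_halos(std::vector<Rank>& ranks) {
    shift_positive(ranks);
    shift_negative(ranks);
}
```

The end ranks simply keep their outer halo untouched here. With MPI the same effect comes from a non-periodic Cartesian communicator, where MPI_Cart_shift returns MPI_PROC_NULL for a missing neighbour and communication with MPI_PROC_NULL does nothing.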

The process is simplified even further in version 3.0 of the MPI standard, which introduces the so-called neighbourhood collective communications. The whole multidimensional halo swap can then be performed with a single call to MPI_Neighbor_alltoallw.
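As a rough sketch of what such a call needs, the snippet below computes the count and byte-displacement arrays for a one-dimensional decomposition. It assumes a local array of doubles laid out as [left halo | interior | right halo] with n interior cells and halo width w. HaloLayout and make_layout are hypothetical helpers, and std::ptrdiff_t stands in for MPI_Aint so that the snippet compiles without an MPI library. The neighbour order (negative direction first, then positive) is the order the MPI standard defines for Cartesian communicators.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Arguments for a 1-D halo swap via a neighbourhood collective.
// Index 0 = negative (left) neighbour, index 1 = positive (right) neighbour.
struct HaloLayout {
    std::array<std::ptrdiff_t, 2> sdispls;  // byte offsets of the data sent to each neighbour
    std::array<std::ptrdiff_t, 2> rdispls;  // byte offsets of the halo filled from each neighbour
    std::array<int, 2> counts;              // number of doubles exchanged with each neighbour
};

HaloLayout make_layout(std::size_t n, std::size_t w) {
    assert(n >= w);  // the interior must be able to fill a neighbour's halo
    const auto elem = static_cast<std::ptrdiff_t>(sizeof(double));
    const auto in = static_cast<std::ptrdiff_t>(n);
    const auto iw = static_cast<std::ptrdiff_t>(w);
    HaloLayout lay;
    // Send the first w interior cells to the left, the last w to the right.
    lay.sdispls = {iw * elem, in * elem};
    // Receive into the left halo from the left, into the right halo from the right.
    lay.rdispls = {0, (iw + in) * elem};
    lay.counts = {static_cast<int>(w), static_cast<int>(w)};
    return lay;
}

// With an MPI library these arrays would be passed roughly as follows
// (sendbuf/recvbuf address the local array, cart_comm is a Cartesian
// communicator; all three names are assumptions):
//   MPI_Datatype types[2] = {MPI_DOUBLE, MPI_DOUBLE};
//   MPI_Neighbor_alltoallw(sendbuf, counts, sdispls, types,
//                          recvbuf, counts, rdispls, types, cart_comm);
```

For more than one dimension the arrays grow to two entries per dimension, and derived datatypes are typically used in place of MPI_DOUBLE to describe non-contiguous faces.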

