Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c - Why should I use a reduction rather than an atomic variable?

Assume we want to count something in an OpenMP loop. Compare the reduction

int counter = 0;
#pragma omp for reduction( + : counter )
for (...) {
    ...
    counter++;
}

with the atomic increment

int counter = 0;
#pragma omp for
for (...) {
    ...
    #pragma omp atomic
    counter++;
}

The atomic access provides the result immediately, while a reduction only assumes its correct value at the end of the loop. For instance, reductions do not allow this:

int t = counter;
if (t % 1000 == 0) {
    printf("%dk iterations\n", t / 1000);
}

thus providing less functionality.

Why would I ever use a reduction instead of atomic access to a counter?



1 Answer


Short answer:

Performance

Long Answer:

Because an atomic variable comes with a price, and that price is synchronization. To ensure there are no race conditions, i.e. two threads modifying the same variable at the same moment, threads must synchronize, which effectively means you lose parallelism: the updates to the variable are serialized.

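Reduction, on the other hand, is a general operation that can be carried out in parallel using parallel reduction algorithms. In OpenMP, each thread accumulates into its own private copy of the reduction variable, and the copies are combined only once the loop finishes, so the loop body itself needs no synchronization.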
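To make the comparison concrete, here is a minimal sketch of the same count written both ways. The function names and the "count the multiples of divisor" workload are assumptions made up for illustration; they are not from the question.

```c
/* Hypothetical helper: counts the multiples of `divisor` in [0, n)
   with a reduction. Each thread increments its own private copy of
   `counter`; OpenMP adds the copies together once, after the loop. */
long count_with_reduction(long n, long divisor)
{
    long counter = 0;
    #pragma omp parallel for reduction(+ : counter)
    for (long i = 0; i < n; i++) {
        if (i % divisor == 0)
            counter++;
    }
    return counter;
}

/* Hypothetical helper: same count, but every increment is an atomic
   update of the one shared variable, so threads contend for it on
   each hit. */
long count_with_atomic(long n, long divisor)
{
    long counter = 0;
    #pragma omp parallel for
    for (long i = 0; i < n; i++) {
        if (i % divisor == 0) {
            #pragma omp atomic
            counter++;
        }
    }
    return counter;
}
```

Both functions return the same value. Built with an OpenMP flag such as gcc's -fopenmp, the loops run in parallel; without it the pragmas are ignored and the loops run serially. How large the speed gap is depends on the machine and on how often the counter is touched, so it is worth measuring rather than assuming.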


Addendum: Getting a sense of how a parallel reduction works

Imagine you have 4 threads and want to reduce an 8-element array A. You can do this in 3 steps (check the attached image to get a better sense of what I am talking about):

  • Step 0. Threads with index i<4 compute A[i]=A[i]+A[i+4].
  • Step 1. Threads with index i<2 compute A[i]=A[i]+A[i+2] (that is, i+4/2).
  • Step 2. The thread with index i<1 (that is, i<4/4) computes A[i]=A[i]+A[i+1].

At the end of this process, the result of the reduction is in the first element of A, i.e. A[0].

[Image: diagram of the reduction steps listed above]
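The three steps can be sketched as a loop over a halving stride. This is only an illustration of the idea, assuming the array length is a power of two and at least 1; the function name tree_reduce_sum is hypothetical, and an OpenMP runtime is free to implement its reduction clause differently.

```c
#include <stddef.h>

/* Hypothetical helper: sums the n elements of a in place with a tree
   reduction and returns a[0]. Assumes n is a power of two and n >= 1
   (n = 8 in the example above). The array contents are overwritten. */
int tree_reduce_sum(int *a, size_t n)
{
    /* stride = 4, 2, 1 for n = 8: one pass per step of the list. */
    for (size_t stride = n / 2; stride > 0; stride /= 2) {
        /* Within a pass, each iteration writes a[i] and reads
           a[i + stride], which no other iteration touches, so the
           additions can run in parallel. The implicit barrier at the
           end of the parallel for keeps the passes in order. */
        #pragma omp parallel for
        for (size_t i = 0; i < stride; i++)
            a[i] += a[i + stride];
    }
    return a[0];
}
```

For n elements this takes log2(n) passes instead of n-1 sequential additions, which is where the parallel speedup comes from when enough threads are available.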



...