First you have a typo on the name
#pragma omp parallel for reduction (+: reasult[:i])
should actually be "result" not "reasult"
Nonetheless, why are you section the array with result[:i]
? Based on your code, it seems that you wanted to reduce the entire array, namely:
#pragma omp parallel for reduction (+: result)
for (i = 0; i < M; i++)
for(j = 0; j < N; j++)
if ( numbers[j] == i)
result[i]++;
When one's compiler does not support the OpenMP 4.5 array reduction feature one can alternatively explicitly implement the reduction (check this SO thread to see how).
As pointed out by @Hristo Iliev in the comments:
Provided that M * sizeof(result[0]) / #threads is a multiple of the
cache line size, and even if it isn't when the value of M is large
enough, there is absolutely no need to involve reduction in the
process. Unless the program is running on a NUMA system, that is.
Assuming that the aforementioned conditions are met, and if you analyze carefully the outermost loop iterations (i.e., variable i
) are assigned to the threads, and since the variable i
is used to access the result
array, each thread will be updating a different position of the result
array. Therefore, you can simplified your code to:
#pragma omp parallel for
for (i = 0; i < M; i++)
for(j = 0; j < N; j++)
if ( numbers[j] == i)
result[i]++;
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…