Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
532 views
in Technique[技术] by (71.8m points)

optimization - Template-ing a 'for' loop in C++?

I have a C++ snippet below with a run-time for loop,

for(int i = 0; i < I; i++)
  for (int j = 0; j < J; j++)
    A( row(i,j), column(i,j) ) = f(i,j);

The snippet is called repeatedly. The loop bounds 'I' and 'J' are known at compile time (I/J are the order of 2 to 10). I would like to unroll the loops somehow using templates. The main bottleneck is the row() and column() and f() functions. I would like to replace them with equivalent metaprograms that are evaluated at compile-time, using row<i,j>::enum tricks.

What I'd really love is something that eventually resolves the loop into a sequence of statements like:

A(12,37) = 0.5;
A(15,23) = 0.25;
A(14,45) = 0.25;

But I'd like to do so without wrecking the for-for structure too much. Something in the spirit of:

TEMPLATE_FOR<i,0,I>
  TEMPLATE_FOR<j,0,J>
     A( row<i,j>::value, column<i,j>::value ) = f<i,j>::value

Can boost::lambda (or something else) help me create this?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

A good compiler should do unrolling for you. For instance, in gcc compiling with the -O2 option turns on loop unrolling.

If you try to do it yourself manually, unless you measure things carefully and really know what you are doing, you are liable to end up with slower code. For example, in your case with manual unrolling you are liable to prevent the compiler from being able to do a loop interchange or stripmine optimization (look for --floop-interchange and -floop-strip-mine in the gcc docs)


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...