Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
298 views
in Technique[技术] by (71.8m points)

c++ - Fastest way to scan for bit pattern in a stream of bits

I need to scan for a 16 bit word in a bit stream. It is not guaranteed to be aligned on byte or word boundaries.

What is the fastest way of achieving this? There are various brute force methods; using tables and/or shifts but are there any "bit twiddling shortcuts" that can cut down the number of calculations by giving yes/no/maybe contains the flag results for each byte or word as it arrives?

C code, intrinsics, x86 machine code would all be interesting.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Using simple brute force is sometimes good.

I think precalc all shifted values of the word and put them in 16 ints so you got an array like this (assuming int is twice as wide as short)

 unsigned short pattern = 1234;
 unsigned int preShifts[16];
 unsigned int masks[16];
 int i;
 for(i=0; i<16; i++)
 {
      preShifts[i] = (unsigned int)(pattern<<i);  //gets promoted to int
      masks[i] = (unsigned int) (0xffff<<i);
 }

and then for every unsigned short you get out of the stream, make an int of that short and the previous short and compare that unsigned int to the 16 unsigned int's. If any of them match, you got one.

So basically like this:

  int numMatch(unsigned short curWord, unsigned short prevWord)
  {
       int numHits = 0;
       int combinedWords = (prevWord<<16) + curWord;

       int i=0;
       for(i=0; i<16; i++)
       {
             if((combinedWords & masks[i]) == preShifsts[i]) numHits++;
       }
       return numHits;
  }

Do note that this could potentially mean multiple hits when the patterns is detected more than once on the same bits:

e.g. 32 bits of 0's and the pattern you want to detect is 16 0's, then it would mean the pattern is detected 16 times!


The time cost of this, assuming it compiles approximately as written, is 16 checks per input word. Per input bit, this does one & and ==, and branch or other conditional increment. And also a table lookup for the mask for every bit.

The table lookup is unnecessary; by instead right-shifting combined we get significantly more efficient asm, as shown in another answer which also shows how to vectorize this with SIMD on x86.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...