For dense packing that doesn't incur a large overhead of reading, I'd recommend a struct with bitfields. In your example where you have four values ranging from 0 to 3, you'd define the struct as follows:
struct Foo {
unsigned char a:2;
unsigned char b:2;
unsigned char c:2;
unsigned char d:2;
}
This has a size of 1 byte, and the fields can be accessed simply, i.e. foo.a
, foo.b
, etc.
By making your struct more densely packed, that should help with cache efficiency.
Edit:
To summarize the comments:
There's still bit fiddling happening with a bitfield, however it's done by the compiler and will most likely be more efficient than what you would write by hand (not to mention it makes your source code more concise and less prone to introducing bugs). And given the large amount of structs you'll be dealing with, the reduction of cache misses gained by using a packed struct such as this will likely make up for the overhead of bit manipulation the struct imposes.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…