I often see the term "bit padding" in relation to data types, but I don't understand what it is or what exactly it does to them.
The gist of it is that padding bits are "wasted" space. I say "wasted" because, while padding bits make the object bigger, they can make working with the object much easier for the CPU (which means faster), so a small waste of space can yield large performance gains. In some cases padding is essential, because the CPU cannot access data that is not properly aligned.
Let's say you have a struct like this (all numbers are just examples; different platforms can have different values):
struct foo
{
    short a; // 16 bits
    char b;  // 8 bits
};
and the machine you are working with reads 32 bits of data in a single read operation. Reading a single foo is not a problem, since the entire object fits into that 32 bit chunk. What does become a problem is when you have an array. The important thing to remember about arrays is that they are contiguous: there is no space between elements. It's just one object immediately followed by another. So, if you have an array like
foo array[10]{};
then the first foo object is in a single 32 bit bucket. The next element of the array, though, starts in the first 32 bit bucket and ends in the second. This means that its member a is split across two separate buckets. Some processors can handle this (at a cost) and other processors will just crash if you try. To solve both of those problems the compiler adds padding bits to the end of foo to pad out its size. This means foo actually becomes
struct foo
{
    short a; // 16 bits
    char b;  // 8 bits
    char _;  // 8 bits of padding
};
And now it is easy for the processor to handle foo objects, by themselves or in an array. It doesn't need to do any extra work, and you've only added 8 bits per object. You'd need a lot of objects for that to start to matter on a modern machine.
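You don't have to take the trailing padding on faith; you can ask the compiler what it actually did. The following is a minimal sketch, not a definitive recipe: foo_tail_padding and print_foo_layout are hypothetical helper names invented for this illustration, and the byte counts in the comments assume a typical platform with a 2 byte short.

```cpp
#include <cstddef>
#include <cstdio>

struct foo
{
    short a; // typically 2 bytes
    char b;  // 1 byte
    // the compiler typically appends 1 byte of padding here
};

// Hypothetical helper: bytes left over after the last member,
// i.e. the trailing padding the compiler added.
constexpr std::size_t foo_tail_padding()
{
    return sizeof(foo) - (offsetof(foo, b) + sizeof(char));
}

// sizeof includes the trailing padding, so the size is always a multiple
// of the alignment and every array element starts on an aligned address.
static_assert(sizeof(foo) % alignof(foo) == 0, "array elements stay aligned");

// Hypothetical helper: print the layout chosen on this platform.
void print_foo_layout()
{
    foo array[10]{};
    std::printf("sizeof(foo)      = %zu bytes\n", sizeof(foo));
    std::printf("alignof(foo)     = %zu bytes\n", alignof(foo));
    std::printf("trailing padding = %zu bytes\n", foo_tail_padding());
    // Distance between neighbouring elements equals sizeof(foo).
    std::printf("array stride     = %td bytes\n",
                reinterpret_cast<char*>(&array[1]) -
                reinterpret_cast<char*>(&array[0]));
}
```

Calling print_foo_layout() from main should, on most mainstream platforms, report a size of 4 bytes rather than 3, but the exact numbers are up to the implementation.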
There are also times when you need padding between members of a type to avoid unaligned access. Let's say you have
struct bar
{
    char c; // 8 bits
    int d;  // 32 bits
};
Now bar is 40 bits wide and d will, more often than not, be stored across two different buckets again. To fix this the compiler adds padding bits between c and d, like
struct bar
{
    char c;    // 8 bits
    char _[3]; // 24 bits
    int d;     // 32 bits
};
and now d is guaranteed to go into a single 32 bit bucket.
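The padding between members can be observed the same way, using offsetof. Again this is only a sketch under assumptions: bar_inner_padding and print_bar_layout are hypothetical names made up for the example, and the "3 bytes" figure assumes a typical platform with a 4 byte, 4 byte aligned int.

```cpp
#include <cstddef>
#include <cstdio>

struct bar
{
    char c; // 1 byte
    // the compiler typically inserts 3 bytes of padding here
    int d;  // typically 4 bytes
};

// Hypothetical helper: bytes of padding between the end of c and the start of d.
constexpr std::size_t bar_inner_padding()
{
    return offsetof(bar, d) - (offsetof(bar, c) + sizeof(char));
}

// Whatever the platform, d must start at a multiple of int's alignment.
static_assert(offsetof(bar, d) % alignof(int) == 0, "d starts on an int boundary");

// Hypothetical helper: print the layout chosen on this platform.
void print_bar_layout()
{
    std::printf("offsetof(bar, c) = %zu\n", offsetof(bar, c));
    std::printf("offsetof(bar, d) = %zu\n", offsetof(bar, d));
    std::printf("inner padding    = %zu bytes\n", bar_inner_padding());
    std::printf("sizeof(bar)      = %zu bytes\n", sizeof(bar));
}
```

On most mainstream platforms print_bar_layout() would show d at offset 4 and a total size of 8 bytes instead of 5, though the standard leaves the exact layout to the implementation.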