Credit to Anton here please. His answer was first and his is correct.
I am posting this exposition because I know you won't believe him until you see the assembler:
Given:
#include <cstring>
#include <iostream>

// prevent the optimiser from eliding this function altogether
__attribute__((noinline))
float convert(int in)
{
    static_assert(sizeof(float) == sizeof(int), "Oops");
    float result;
    // copy the object representation of the int into the float
    std::memcpy(&result, &in, sizeof(result));
    return result;
}

int main(int argc, char * argv[])
{
    int a = 0x3f800000;
    float f = convert(a);
    std::cout << a << std::endl;
    std::cout << f << std::endl;
}
result:
1065353216
1
Compiled with -O2, here's the assembler output for the function convert, with some added comments for clarity:
#
# I'll give you £10 for every call to `memcpy` you can find...
#
__Z7converti:                           ## @_Z7converti
    .cfi_startproc
## BB#0:
    pushq   %rbp
Ltmp0:
    .cfi_def_cfa_offset 16
Ltmp1:
    .cfi_offset %rbp, -16
    movq    %rsp, %rbp
Ltmp2:
    .cfi_def_cfa_register %rbp
#
# here's the conversion - simply move the integer argument (edi)
# into the first float return register (xmm0)
#
    movd    %edi, %xmm0
    popq    %rbp
    retq
    .cfi_endproc
#
# did you see any memcpy's?
# nope, didn't think so.
#
Just to drive the point home, here's the same function compiled with -O2 and -fomit-frame-pointer:
__Z7converti:                           ## @_Z7converti
    .cfi_startproc
## BB#0:
    movd    %edi, %xmm0
    retq
    .cfi_endproc
Remember, this function only exists because I added the attribute to prevent the compiler from inlining it. In reality, with optimisations enabled, the entire function will be optimised away. Those 3 lines of code in the function and the call at the call site will vanish.
Modern optimising compilers are awesome.
but what I really wanted was this std::cout << *reinterpret_cast<float *>(&a) << std::endl;
and I think it expresses my intent perfectly well.
Well, yes it does. But C++ is designed with both correctness and performance in mind. Very often, the compiler would like to assume that two pointers or two references don't point to the same piece of memory. If it can do that, it can make all kinds of clever optimisations (usually involving not bothering to make reads or writes which aren't necessary to produce the required effect). However, because a write through one pointer could affect a read through the other (if they really point at the same object), in the interests of correctness the compiler may not assume that the two objects are distinct, and it must perform every read and write you indicated in your code, just in case one write affects a subsequent read... unless the pointers point to different types. If they point to different types, the compiler is allowed to assume that they will never point to the same memory (character types such as char and unsigned char are the main exception). This is the strict aliasing rule.
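To make that assumption concrete, here is a minimal sketch. The function names store_both and store_both_same are mine, purely for illustration, and whether a given compiler actually folds the return value depends on the compiler and the optimisation level.

```cpp
#include <cassert>

// Hypothetical example, not taken from any real code base.
// ip and fp point to different types, so the compiler may assume the
// store through fp cannot change *ip.
int store_both(int* ip, float* fp)
{
    *ip = 1;
    *fp = 2.0f;   // assumed not to touch *ip
    return *ip;   // an optimiser is allowed to turn this into "return 1"
}

// a and b have the same type, so they may legitimately alias.
// The compiler has to re-read *a after the store through b.
int store_both_same(int* a, int* b)
{
    *a = 1;
    *b = 2;
    return *a;    // 1 if a and b are distinct objects, 2 if they are the same
}
```

Called with distinct objects, both functions return 1. Calling store_both_same(&x, &x) is perfectly legal and returns 2, which is exactly why the compiler cannot skip the final read there.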
When you do this: *reinterpret_cast<float *>(&a), you're trying to read the same memory via an int pointer and a float pointer. Because the pointers are of different types, the compiler will assume that they point to different memory addresses - even though in your mind they do not.
This is the strict aliasing rule. It's there to help programs perform quickly and correctly. A reinterpret_cast like this undermines both.
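If you want something that reads closer to a cast at the call site, the memcpy approach can be wrapped in a small generic helper. This is only a sketch: the name bit_copy is my own invention, and the example values assume a 32-bit int and an IEEE-754 float. In C++20 the standard library provides std::bit_cast in <bit>, which does the same job.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical helper: reinterpret the bytes of `from` as a To by copying
// them, which is well defined, instead of aliasing through a cast pointer.
template <class To, class From>
To bit_copy(const From& from)
{
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value,
                  "To must be trivially copyable");
    static_assert(std::is_trivially_copyable<From>::value,
                  "From must be trivially copyable");

    To to;
    std::memcpy(&to, &from, sizeof(to));  // typically optimised to a register move
    return to;
}
```

Usage would look like float f = bit_copy<float>(a); which states the intent just as plainly as the reinterpret_cast version, without breaking the aliasing rule.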