Given the following struct...
#include <type_traits>

struct C {
    long a[16]{};
    long b[16]{};
    C() = default;
};

// For godbolt
C construct() {
    static_assert(not std::is_trivial_v<C>);
    static_assert(std::is_standard_layout_v<C>);
    C c;
    return c;
}
...gcc (version 10.2 on x86-64 Linux) with optimization enabled (at all three levels) produces the following assembly[1] for construct:
construct():
mov r8, rdi
xor eax, eax
mov ecx, 32
rep stosq
mov rax, r8
ret
Once I provide an empty default constructor...
#include <type_traits>

struct C {
    long a[16]{};
    long b[16]{};
    C() {} // <-- The only change
};

// For godbolt
C construct() {
    static_assert(not std::is_trivial_v<C>);
    static_assert(std::is_standard_layout_v<C>);
    C c;
    return c;
}
...the generated assembly changes to zeroing each array member separately instead of the single memset-style fill in the original:
construct():
mov rdx, rdi
mov eax, 0
mov ecx, 16
rep stosq
lea rdi, [rdx+128]
mov ecx, 16
rep stosq
mov rax, rdx
ret
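To relate the two listings: rep stosq stores the value in rax as rcx consecutive quadwords (8 bytes each) starting at rdi. The first version therefore zeroes 32 quadwords in one run, while the second zeroes 16, repositions rdi at offset 128, and zeroes 16 more. Below is a small sketch of the layout arithmetic behind those constants, written in terms of sizeof(long), which is 8 on x86-64 Linux. The helper fill_bytes is a made-up name for illustration and not part of the original code; static_assert without a message assumes C++17.

```cpp
#include <cstddef>

struct C {
    long a[16]{};
    long b[16]{};
    C() = default;
};

// Hypothetical helper: bytes written by `rep stosq` for a given count in rcx.
constexpr std::size_t fill_bytes(std::size_t quadwords) { return quadwords * 8; }

// b starts immediately after a, and the struct is exactly the two arrays.
// With an 8-byte long this gives offset 128 and size 256.
static_assert(offsetof(C, b) == 16 * sizeof(long));
static_assert(sizeof(C) == 32 * sizeof(long));
```

Under that 8-byte assumption, fill_bytes(32) covers the whole object in one pass, and two runs of fill_bytes(16) cover a and b respectively, which matches the 128 in lea rdi, [rdx+128].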
As the static_asserts show, both structs are equivalent in that neither is trivial and both are standard layout.
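For reference, here is a minimal sketch that puts the two variants side by side so the same checks can be applied to both in one translation unit. The names Defaulted, UserProvided and all_zero are invented for this illustration and assume C++17; the sketch only confirms the equivalence claimed above and that both variants end up with zeroed arrays. It says nothing about why gcc emits different code.

```cpp
#include <algorithm>
#include <iterator>
#include <type_traits>

// Hypothetical names for the two variants of C from the question.
struct Defaulted {
    long a[16]{};
    long b[16]{};
    Defaulted() = default;
};

struct UserProvided {
    long a[16]{};
    long b[16]{};
    UserProvided() {}
};

// Both variants get the same classification and the same size.
static_assert(!std::is_trivial_v<Defaulted> && !std::is_trivial_v<UserProvided>);
static_assert(std::is_standard_layout_v<Defaulted> &&
              std::is_standard_layout_v<UserProvided>);
static_assert(sizeof(Defaulted) == sizeof(UserProvided));

// Returns true if a default-initialized T has every element of a and b zero.
template <class T>
bool all_zero() {
    T t;  // default-initialization, as in construct()
    auto is_zero = [](long v) { return v == 0; };
    return std::all_of(std::begin(t.a), std::end(t.a), is_zero) &&
           std::all_of(std::begin(t.b), std::end(t.b), is_zero);
}
```

Calling all_zero<Defaulted>() and all_zero<UserProvided>() should both return true, since the default member initializers zero the arrays in either case.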
Is it just gcc missing an optimization opportunity, or is there more to it from the C++-the-language perspective?
The example is a stripped-down version of production code where this made a material difference in performance.
[1] Godbolt: https://godbolt.org/z/8n1Mae
question from:
https://stackoverflow.com/questions/65871288/different-machine-code-for-empty-default-constructor-v-implicitly-defined-one