Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
424 views
in Technique[技术] by (71.8m points)

llvm - Does the C++ standard allow for an uninitialized bool to crash a program?

I know that an "undefined behaviour" in C++ can pretty much allow the compiler to do anything it wants. However, I had a crash that surprised me, as I assumed that the code was safe enough.

In this case, the real problem happened only on a specific platform using a specific compiler, and only if optimization was enabled.

I tried several things in order to reproduce the problem and simplify it to the maximum. Here's an extract of a function called Serialize, that would take a bool parameter, and copy the string true or false to an existing destination buffer.

Would this function be in a code review, there would be no way to tell that it, in fact, could crash if the bool parameter was an uninitialized value?

// Zero-filled global buffer of 16 characters
char destBuffer[16];

void Serialize(bool boolValue) {
    // Determine which string to print based on boolValue
    const char* whichString = boolValue ? "true" : "false";

    // Compute the length of the string we selected
    const size_t len = strlen(whichString);

    // Copy string into destination buffer, which is zero-filled (thus already null-terminated)
    memcpy(destBuffer, whichString, len);
}

If this code is executed with clang 5.0.0 + optimizations, it will/can crash.

The expected ternary-operator boolValue ? "true" : "false" looked safe enough for me, I was assuming, "Whatever garbage value is in boolValue doesn't matter, since it will evaluate to true or false anyhow."

I have setup a Compiler Explorer example that shows the problem in the disassembly, here the complete example. Note: in order to repro the issue, the combination I've found that worked is by using Clang 5.0.0 with -O2 optimisation.

#include <iostream>
#include <cstring>

// Simple struct, with an empty constructor that doesn't initialize anything
struct FStruct {
    bool uninitializedBool;

   __attribute__ ((noinline))  // Note: the constructor must be declared noinline to trigger the problem
   FStruct() {};
};

char destBuffer[16];

// Small utility function that allocates and returns a string "true" or "false" depending on the value of the parameter
void Serialize(bool boolValue) {
    // Determine which string to print depending if 'boolValue' is evaluated as true or false
    const char* whichString = boolValue ? "true" : "false";

    // Compute the length of the string we selected
    size_t len = strlen(whichString);

    memcpy(destBuffer, whichString, len);
}

int main()
{
    // Locally construct an instance of our struct here on the stack. The bool member uninitializedBool is uninitialized.
    FStruct structInstance;

    // Output "true" or "false" to stdout
    Serialize(structInstance.uninitializedBool);
    return 0;
}

The problem arises because of the optimizer: It was clever enough to deduce that the strings "true" and "false" only differs in length by 1. So instead of really calculating the length, it uses the value of the bool itself, which should technically be either 0 or 1, and goes like this:

const size_t len = strlen(whichString); // original code
const size_t len = 5 - boolValue;       // clang clever optimization

While this is "clever", so to speak, my question is: Does the C++ standard allow a compiler to assume a bool can only have an internal numerical representation of '0' or '1' and use it in such a way?

Or is this a case of implementation-defined, in which case the implementation assumed that all its bools will only ever contain 0 or 1, and any other value is undefined behaviour territory?

Question&Answers:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Yes, ISO C++ allows (but doesn't require) implementations to make this choice.

But also note that ISO C++ allows a compiler to emit code that crashes on purpose (e.g. with an illegal instruction) if the program encounters UB, e.g. as a way to help you find errors. (Or because it's a DeathStation 9000. Being strictly conforming is not sufficient for a C++ implementation to be useful for any real purpose). So ISO C++ would allow a compiler to make asm that crashed (for totally different reasons) even on similar code that read an uninitialized uint32_t. Even though that's required to be a fixed-layout type with no trap representations.

It's an interesting question about how real implementations work, but remember that even if the answer was different, your code would still be unsafe because modern C++ is not a portable version of assembly language.


You're compiling for the x86-64 System V ABI, which specifies that a bool as a function arg in a register is represented by the bit-patterns false=0 and true=1 in the low 8 bits of the register1. In memory, bool is a 1-byte type that again must have an integer value of 0 or 1.

(An ABI is a set of implementation choices that compilers for the same platform agree on so they can make code that calls each other's functions, including type sizes, struct layout rules, and calling conventions.)

ISO C++ doesn't specify it, but this ABI decision is widespread because it makes bool->int conversion cheap (just zero-extension). I'm not aware of any ABIs that don't let the compiler assume 0 or 1 for bool, for any architecture (not just x86). It allows optimizations like !mybool with xor eax,1 to flip the low bit: Any possible code that can flip a bit/integer/bool between 0 and 1 in single CPU instruction. Or compiling a&&b to a bitwise AND for bool types. Some compilers do actually take advantage Boolean values as 8 bit in compilers. Are operations on them inefficient?.

In general, the as-if rule allows allows the compiler to take advantage of things that are true on the target platform being compiled for, because the end result will be executable code that implements the same externally-visible behaviour as the C++ source. (With all the restrictions that Undefined Behaviour places on what is actually "externally visible": not with a debugger, but from another thread in a well-formed / legal C++ program.)

The compiler is definitely allowed to take full advantage of an ABI guarantee in its code-gen, and make code like you found which optimizes strlen(whichString) to
5U - boolValue.
(BTW, this optimization is kind of clever, but maybe shortsighted vs. branching and inlining memcpyas stores of immediate data2.)

Or the compiler could have created a table of pointers and indexed it with the integer value of the bool, again assuming it was a 0 or 1. (This possibility is what @Barmar's answer suggested.)


Your __attribute((noinline)) constructor with optimization enabled led to clang just loading a byte from the stack to use as uninitializedBool. It made space for the object in main with push rax (which is smaller and for various reason about as efficient as sub rsp, 8), so whatever garbage was in AL on entry to main is the value it used for uninitializedBool. This is why you actually got values that weren't just 0.

5U - random garbage can easily wrap to a large unsigned value, leading memcpy to go into unmapped memory. The destination is in static storage, not the stack, so you're not overwriting a return address or something.


Other implementations could make different choices, e.g. false=0 and true=any non-zero value. Then clang probably wouldn't make code that crashes for this specific instance of UB. (But it would still be allowed to if it wanted to.) I don't know of any implementations that choose anything other what x86-64 does for bool, but the C++ standard allows many things that nobody does or even would want to do on hardware that's anything like current CPUs.

ISO C++ leaves it unspecified what you'll find when you examine or modify the object representation of a bool. (e.g. by memcpying the bool into unsigned char, which you're allowed to do because char* can alias anything. And unsigned char is guaranteed to have no padding bits, so the C++ standard does formally let you hexdump object representations without any UB. Pointer-casting to copy the object representation is different from assigning char foo = my_bool, of course, so booleanization to 0 or 1 wouldn't happen and you'd get the raw object representation.)

You've partially "hidden" the UB on this execution path from the compiler with noinline. Even if it doesn't inline, though, interprocedural optimizations could still make a version of the function that depends on the definition of another function. (First, clang is making an executable, not a Unix shared library where symbol-interposition can happen. Second, the definition in inside the class{} definition so all translation units must have the same definition. Like with the inline keyword.)

So a compiler could emit just a ret or ud2 (illegal instruction) as the definition for main, because the path of execution starting at the top of main unavoidably encounters Undefined Behaviour. (Which the compiler can see at compile time if it decided to follow the path through the non-inline constructor.)

Any program that encounters UB is totally undefined for its entire existence. But UB inside a function or if() branch that never actually runs doesn't corrupt the rest of the program. In practice that means that compilers can decide to emit an illegal instruction, or a ret, or not emit anything and fall into the next block / function, for the whole basic block that can be proven at compile time to contain or lead to UB.

GCC and Clang in practice do actually sometimes emit ud2 on UB, instead of even trying to generate code for paths of execution that make no sense. Or for cases like falling off the end of a non-void function, gcc will sometimes omit a ret instruction. If you were thinking that "my function will just return with whatever garbage is in RAX", you are sorely mistaken. Modern C++ compilers don't treat the language like a portable assembly language any more. Your program really has to be valid C++, without making assumptions about how a stand-alone non inlined version of your function might look in asm.

Another fun example is Why does unaligned access to mmap'ed memory sometimes segfault on AMD64?. x86 doesn't fault on unaligned integers, right? So why would a misaligned uint16_t* be a problem? Because alignof(uint16_t) == 2, and violating that assumption led to a segfault when auto-vectorizing with SSE2.

See also What Every C Programmer Should Know About Undefined Behavior #1/3, an article by a clang developer.

Key point: if the compiler noticed the UB at compile time, it could "break" (emit surprising asm) the path through your code that causes UB even if targeting an ABI where any bit-pattern is a valid object representation for bool.

Expect total hostility toward many mistakes by the programmer, especially things modern compilers warn about. This is why you should use -Wall and fix warnings. C++ is not a user-friendly language, and something in C++ can be unsafe even if it would be safe in asm on the target you're compiling for. (e.g. signed overflow is UB in C++ and compilers will assume it doesn't happen, even when compiling for 2's complement x86, unless you use clang/gcc -fwrapv.)

Compile-time-visible UB is always dangerous, and it's really hard to be sure (with link-time optimization) that you've really hidden UB from the compiler and can thus reason about what kind of asm it will generate.

Not to be over-dramatic; often compilers do let you get away with some things and emit code like you're expecting even when something is UB. But maybe it will be a problem in the future if compiler devs implement some optimization that gains more info about value-ranges (e.g. that a variable is non-negative, maybe allowing it to optimize sign-extension to free zero-extension on x86-64). For example, in current gcc and clang, doing tmp = a+INT_MIN doesn't optimize a<0 as always-false, only that tmp is always negative. (Because INT_MIN + a=INT_MAX is negative on this 2's complement target, and a can't be any higher than that.)

So gcc/clang don't currently backtrack to derive range info for the inputs of a calculation, only on the results based on the assumption of no signed overflow: <a href="https://godbolt.org/#g:!((g:!((g:!((h:codeEditor,i:(fontScale:1.2899450879999999,j:1,lang:c%2B%2B,source:'%23include+%3Climits.h%3E%0A%0Aint+range_check(int+a,+int+*sink1,+int+*sink2)+%7B%0A++++int+tmp+%3D+(a+%2B+INT_MIN)%3B++//+a+%3E%3D+1+to+avoid+UB%0A++++//+tmp+is+definitely+negative%0A++++//+a+is+definitely+positive%0A++++if+(tmp+%3E+0)+*sink1+%3D+0%3B++++++//+optimized+away%0A++++if+(a+%3C+0)+*sink2+%3D+0%3B++++++++//+not+optimized+away%0A++++return+tmp%3B%0A%7D%0A%0A%0A//+it+seems+that+gcc+and+clang+don!'t+derive+range+info+for+a%0A//+based+on+%60a-12345%60+not+overflowing,%0A//+but+they+do+derive+range+info+for+the+result.%0Aint+signed_overflow_rangecheck(int+a,+int+*sink)+%7B%0A++++int+tmp+%3D+(a+-+12345)%3B%0A++++//if+((unsigned)a+%3D%3D+0x80000000UL)+*sink%3D0%3B%0A++++if+(tmp+%3E+0x7ffffff0)+*sink+%3D+1%3B++//+optimized+out:+it+can!'t+have+wrapped+to+a+positive+this+close+to+INT_MAX%0A++++if+(a+%3C+(INT_MIN%2B12))+*sink+%3D+2%3B++//+not+optimized+out.++But+%60a%60+this+close+to+INT_MIN+would+mean+a-12345+wrapped.%0A++++return+tmp%3B%0A%7D%0A%0A'),l:'5',n:'0',o:'C%2B%2B+source+%231',t:'0')),k:37.77562439622385,l:'4',n:'0',o:'',s:0,t:'0'),(g:!((h:compiler,i:(compiler:clang700,filters:(b:'0',binary:'1',commentOnly:'0',demangle:'0',directives:'0',execute:'1',intel:'0',


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...