Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

c++ - Unoptimized clang++ code generates unneeded "movl $0, -4(%rbp)" in a trivial main()

I created a minimal C++ program:

int main() {
    return 1234;
}

and compiled it with clang++ 5.0 with optimization disabled (the default, -O0). The resulting assembly is:

  pushq %rbp
  movq %rsp, %rbp
  movl $1234, %eax # imm = 0x4D2
  movl $0, -4(%rbp)
  popq %rbp
  retq

I understand most of the lines, but I do not understand the "movl $0, -4(%rbp)". It seems the program initializes some local variable to 0. Why?

What compiler-internal detail leads to this store that doesn't correspond to anything in the source?


1 Answer


TL;DR: In unoptimized code, clang++ sets aside 4 bytes on the stack for the return value of main and initializes it to zero, because the C++ standards (including C++11) make 0 the default return value of main. Your main never reads that slot, so the store is unnecessary, but at -O0 nothing removes it. An unoptimized compiler often emits code it might need, ends up not needing it, and never cleans it up.


Because you are compiling with -O0, only a bare minimum of optimization is applied (-O0 may still remove some dead code, for example). Unoptimized output is full of extra loads, stores, and other artifacts of raw code generation, so trying to explain each one is usually a wasted exercise.

In this case main is special: the C99/C11 and C++ standards effectively say that reaching the closing } of main's outermost block returns 0 by default. The C11 standard says:

5.1.2.2.3 Program termination

1 If the return type of the main function is a type compatible with int, a return from the initial call to the main function is equivalent to calling the exit function with the value returned by the main function as its argument; reaching the } that terminates the main function returns a value of 0. If the return type is not compatible with int, the termination status returned to the host environment is unspecified.

The C++11 standard says:

3.6.1 Main function

5) A return statement in main has the effect of leaving the main function (destroying any objects with automatic storage duration) and calling std::exit with the return value as the argument. If control reaches the end of main without encountering a return statement, the effect is that of executing

 return 0;

In the version of clang++ you are using, unoptimized 64-bit code stores this default return value of 0 at dword ptr [rbp-4], which is the same location as -4(%rbp) in the AT&T syntax of your listing.
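As a rough source-level picture of what this means for the trivial program in the question, here is a hand-written sketch, not compiler output. The names trivial_main_model and retval are made up for illustration; retval stands in for the 4-byte stack slot.

```cpp
// Hypothetical model of the trivial main() at -O0; not generated by clang.
int trivial_main_model() {
    int retval = 0;   // models "movl $0, -4(%rbp)": the default return slot
    (void)retval;     // the slot is written but never read in this program
    return 1234;      // models "movl $1234, %eax": the explicit return value
}
```

The store to retval has no effect on the result, which is why an optimizing build would drop it.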

The problem is that your test code is too trivial to show how this default return value comes into play. Here is an example that should demonstrate it better:

int main() {
    int a = 3;
    if (a > 3) return 5678;
    else if (a == 3) return 42;
}

This code has two explicit exit points, return 5678 and return 42, but no final return at the end of the function. If the closing } is reached, the default is to return 0. The unoptimized output on godbolt (Compiler Explorer) looks like this:

main:                                   # @main
        push    rbp
        mov     rbp, rsp
        mov     dword ptr [rbp - 4], 0        # Default return value of 0
        mov     dword ptr [rbp - 8], 3
        cmp     dword ptr [rbp - 8], 3        # Is a > 3?
        jle     .LBB0_2
        mov     dword ptr [rbp - 4], 5678     # Set return value to 5678
        jmp     .LBB0_5                       # Go to common exit point .LBB0_5
.LBB0_2:
        cmp     dword ptr [rbp - 8], 3        # Is a == 3?
        jne     .LBB0_4
        mov     dword ptr [rbp - 4], 42       # Set return value to 42
        jmp     .LBB0_5                       # Go to common exit point .LBB0_5
.LBB0_4:
        jmp     .LBB0_5                       # Extraneous unoptimized jump artifact 
# This is common exit point of all the returns from `main`
.LBB0_5:
        mov     eax, dword ptr [rbp - 4]      # Use return value from memory
        pop     rbp
        ret

As one can see, the compiler has generated a common exit point that loads the return value (EAX) from the stack slot at dword ptr [rbp-4]. At the beginning of the function, that slot is explicitly set to 0. In your simpler case, the unoptimized code still emits that store, but the stored value is never used.
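The same control flow can be written back in C++ to make the return slot visible. This is a hand-written model under stated assumptions, not what clang actually emits: the function name main_like_lowered and the variable retval are hypothetical, and a is taken as a parameter so that all three paths can be exercised.

```cpp
// Hypothetical model of the -O0 lowering shown above; not compiler output.
// `retval` stands in for the stack slot at dword ptr [rbp - 4].
int main_like_lowered(int a) {
    int retval = 0;            // default return value, stored on entry
    if (a > 3) {
        retval = 5678;         // first explicit return
        goto common_exit;
    }
    if (a == 3) {
        retval = 42;           // second explicit return
        goto common_exit;
    }
    // Falling off the end: retval keeps its initial value of 0.
common_exit:
    return retval;             // single exit point reads the slot
}
```

With a equal to 3 this returns 42, as in the original example. Any value below 3 reaches the end without an explicit return and yields the default 0.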

If you build the code with the option -ffreestanding, you should see that the default return value for main is no longer set to 0. This is because the requirement for a default return value of 0 from main applies to a hosted environment, not a freestanding one.
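One way to check which of the two environments a translation unit is compiled for is the standard predefined macro __STDC_HOSTED__, which is 1 for a hosted implementation and 0 otherwise. The sketch below uses a hypothetical helper name. It assumes that -ffreestanding switches the macro to 0, which matches the GCC and Clang documentation but is worth confirming on your own toolchain.

```cpp
// Hypothetical helper: reports whether this translation unit was compiled
// for a hosted environment, where main's implicit "return 0" applies.
constexpr bool is_hosted_build() {
    return __STDC_HOSTED__ == 1;
}
```

In an ordinary build this is expected to return true. Under -ffreestanding it should return false, and main gets no special treatment.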

