c - Waste in memory allocation for local variables

Question

Welcome To Ask or Share your Answers For Others

c - Waste in memory allocation for local variables

asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

c - Waste in memory allocation for local variables

This my program:

void test_function(int a, int b, int c, int d){
    int flag;
    char buffer[10];

   flag = 31337;
   buffer[0] = 'A';
}

int main() {
    test_function(1, 2, 3, 4);
}

I compile this program with the debug option:

gcc -g my_program.c

I use gdb and I disassemble the test_function with intel syntax:

(gdb) disassemble test_function
Dump of assembler code for function test_function:
0x08048344 <test_function+0>:   push   ebp
0x08048345 <test_function+1>:   mov    ebp,esp
0x08048347 <test_function+3>:   sub    esp,0x28
0x0804834a <test_function+6>:   mov    DWORD PTR [ebp-12],0x7a69
0x08048351 <test_function+13>:  mov    BYTE PTR [ebp-40],0x41
0x08048355 <test_function+17>:  leave  
0x08048356 <test_function+18>:  ret    
End of assembler dump.

And I disassemble the main:

(gdb) disassemble main
Dump of assembler code for function main:
0x08048357 <main+0>:    push   ebp
0x08048358 <main+1>:    mov    ebp,esp
0x0804835a <main+3>:    sub    esp,0x18
0x0804835d <main+6>:    and    esp,0xfffffff0
0x08048360 <main+9>:    mov    eax,0x0
0x08048365 <main+14>:   sub    esp,eax
0x08048367 <main+16>:   mov    DWORD PTR [esp+12],0x4
0x0804836f <main+24>:   mov    DWORD PTR [esp+8],0x3
0x08048377 <main+32>:   mov    DWORD PTR [esp+4],0x2
0x0804837f <main+40>:   mov    DWORD PTR [esp],0x1
0x08048386 <main+47>:   call   0x8048344 <test_function>
0x0804838b <main+52>:   leave  
0x0804838c <main+53>:   ret    
End of assembler dump.

I place a breakpoint at this adresse: 0x08048355 (leave instruction for the test_function) and I run the program.

I watch the stack like this:

(gdb) x/16w $esp
0xbffff7d0:     0x00000041      0x08049548      0xbffff7e8      0x08048249
0xbffff7e0:     0xb7f9f729      0xb7fd6ff4      0xbffff818      0x00007a69
0xbffff7f0:     0xb7fd6ff4      0xbffff8ac      0xbffff818      0x0804838b
0xbffff800:     0x00000001      0x00000002      0x00000003      0x00000004

0x0804838b is the return adress, 0xbffff818 is the saved frame pointer (main ebp) and flag variable is stocked 12 bytes further. Why 12?

I don't understand this instruction:

0x0804834a <test_function+6>:   mov    DWORD PTR [ebp-12],0x7a69

Why we don't stock the content's variable 0x00007a69 in ebp-4 instead of 0xbffff8ac?

Same question for buffer. Why 40?

We don't waste the memory? 0xb7fd6ff4 0xbffff8ac and 0xb7f9f729 0xb7fd6ff4 0xbffff818 0x08049548 0xbffff7e8 0x08048249 are not used?

This the output for the command gcc -Q -v -g my_program.c:

Reading specs from /usr/lib/gcc-lib/i486-linux-gnu/3.3.6/specs
Configured with: ../src/configure -v --enable-languages=c,c++ --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared --enable-__cxa_atexit --with-system-zlib --enable-nls --without-included-gettext --enable-clocale=gnu --enable-debug i486-linux-gnu
Thread model: posix
gcc version 3.3.6 (Ubuntu 1:3.3.6-15ubuntu1)
 /usr/lib/gcc-lib/i486-linux-gnu/3.3.6/cc1 -v -D__GNUC__=3 -D__GNUC_MINOR__=3 -D__GNUC_PATCHLEVEL__=6 notesearch.c -dumpbase notesearch.c -auxbase notesearch -g -version -o /tmp/ccGT0kTf.s
GNU C version 3.3.6 (Ubuntu 1:3.3.6-15ubuntu1) (i486-linux-gnu)
        compiled by GNU C version 3.3.6 (Ubuntu 1:3.3.6-15ubuntu1).
GGC heuristics: --param ggc-min-expand=99 --param ggc-min-heapsize=129473
options passed:  -v -D__GNUC__=3 -D__GNUC_MINOR__=3 -D__GNUC_PATCHLEVEL__=6
 -auxbase -g
options enabled:  -fpeephole -ffunction-cse -fkeep-static-consts
 -fpcc-struct-return -fgcse-lm -fgcse-sm -fsched-interblock -fsched-spec
 -fbranch-count-reg -fcommon -fgnu-linker -fargument-alias
 -fzero-initialized-in-bss -fident -fmath-errno -ftrapping-math -m80387
 -mhard-float -mno-soft-float -mieee-fp -mfp-ret-in-387
 -maccumulate-outgoing-args -mcpu=pentiumpro -march=i486
ignoring nonexistent directory "/usr/local/include/i486-linux-gnu"
ignoring nonexistent directory "/usr/i486-linux-gnu/include"
ignoring nonexistent directory "/usr/include/i486-linux-gnu"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /usr/lib/gcc-lib/i486-linux-gnu/3.3.6/include
 /usr/include
End of search list.
 gnu_dev_major gnu_dev_minor gnu_dev_makedev stat lstat fstat mknod fatal ec_malloc dump main print_notes find_user_note search_note
Execution times (seconds)
 preprocessing         :   0.00 ( 0%) usr   0.01 (25%) sys   0.00 ( 0%) wall
 lexical analysis      :   0.00 ( 0%) usr   0.01 (25%) sys   0.00 ( 0%) wall
 parser                :   0.02 (100%) usr   0.01 (25%) sys   0.00 ( 0%) wall
 TOTAL                 :   0.02             0.04             0.00
 as -V -Qy -o /tmp/ccugTYeu.o /tmp/ccGT0kTf.s
GNU assembler version 2.17.50 (i486-linux-gnu) using BFD version 2.17.50 20070103 Ubuntu
 /usr/lib/gcc-lib/i486-linux-gnu/3.3.6/collect2 --eh-frame-hdr -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 /usr/lib/gcc-lib/i486-linux-gnu/3.3.6/../../../crt1.o /usr/lib/gcc-lib/i486-linux-gnu/3.3.6/../../../crti.o /usr/lib/gcc-lib/i486-linux-gnu/3.3.6/crtbegin.o -L/usr/lib/gcc-lib/i486-linux-gnu/3.3.6 -L/usr/lib/gcc-lib/i486-linux-gnu/3.3.6/../../.. /tmp/ccugTYeu.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc-lib/i486-linux-gnu/3.3.6/crtend.o /usr/lib/gcc-lib/i486-linux-gnu/3.3.6/../../../crtn.o

NOTE: I read the book "The art of exploitation" and I use the VM provides with the book.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Answer

深蓝 · Answer 1 · 2021-10-23T17:56:04+0000

The compiler is trying to maintain 16 byte alignment on the stack. This also applies to 32-bit code these days (not just 64-bit). The idea is that at the point before executing a CALL instruction the stack must be aligned to a 16-byte boundary.

Because you compiled with no optimizations there are some extraneous instructions.

0x0804835a <main+3>:    sub    esp,0x18        ; Allocate local stack space
0x0804835d <main+6>:    and    esp,0xfffffff0  ; Ensure `main` has a 16 byte aligned stack
0x08048360 <main+9>:    mov    eax,0x0         ; Extraneous, not needed
0x08048365 <main+14>:   sub    esp,eax         ; Extraneous, not needed

ESP is now 16-byte aligned after the last instruction above. We move the parameters for the call starting at the top of the stack at ESP. That is done with:

0x08048367 <main+16>:   mov    DWORD PTR [esp+12],0x4
0x0804836f <main+24>:   mov    DWORD PTR [esp+8],0x3
0x08048377 <main+32>:   mov    DWORD PTR [esp+4],0x2
0x0804837f <main+40>:   mov    DWORD PTR [esp],0x1

The CALL then pushes a 4 byte return address on the stack. We then reach these instructions after the call:

0x08048344 <test_function+0>:   push   ebp     ; 4 bytes pushed on stack
0x08048345 <test_function+1>:   mov    ebp,esp ; Setup stackframe

This pushes another 4 bytes on the stack. With the 4 bytes from the return address we are now misaligned by 8 bytes. To reach 16-byte alignment again we will need to waste an additional 8 bytes on the stack. That is why in this statement there is an additional 8 bytes allocated:

0x08048347 <test_function+3>:   sub    esp,0x28

0x08 bytes already on stack because of return address(4-bytes) and EBP(4 bytes)
0x08 bytes of padding needed to align stack back to 16-byte alignment
0x20 bytes needed for local variable allocation = 32 bytes. 32/16 is evenly divisible by 16 so alignment maintained

The second and third number above added together is the value 0x28 computed by the compiler and used in sub esp,0x28.

0x0804834a <test_function+6>:   mov    DWORD PTR [ebp-12],0x7a69

So why [ebp-12] in this instruction? The first 8 bytes [ebp-8] through [ebp-1] are the alignment bytes used to get the stack 16-byte aligned. The local data will then appear on the stack after that. In this case [ebp-12] through [ebp-9] are the 4 bytes for the 32-bit integer flag.

Then we have this for updating buffer[0] with the character 'A':

0x08048351 <test_function+13>:  mov    BYTE PTR [ebp-40],0x41

The oddity then would be why a 10 byte array of characters would appear from [ebp+40](beginning of array) to [ebp+13] which is 28 bytes. The best guess I can make is that compiler felt that it could treat the 10 byte character array as a 128-bit (16-byte) vector. This would force the compiler to align the buffer on a 16 byte boundary, and pad the array out to 16 bytes (128-bits). From the perspective of the compiler, your code seems to be acting much like it was defined as:

#include <xmmintrin.h>
void test_function(int a, int b, int c, int d){
    int flag;
    union {
        char buffer[10];
        __m128 m128buffer;      ; 16-byte variable that needs to be 16-bytes aligned
    } bufu;

   flag = 31337;
   bufu.buffer[0] = 'A';
}

The output on GodBolt for GCC 4.9.0 generating 32-bit code with SSE2 enabled appears as follows:

test_function:
        push    ebp     #
        mov     ebp, esp  #, 
        sub     esp, 40   #,same as: sub esp,0x28
        mov     DWORD PTR [ebp-12], 31337 # flag,
        mov     BYTE PTR [ebp-40], 65     # bufu.buffer,
        leave
        ret

This looks very similar to your disassembly in GDB.

If you compiled with optimizations (such as -O1, -O2, -O3), the optimizer could have simplified test_function because it is a leaf function in your example. A leaf function is one that doesn't call another function. Certain shortcuts could have been applied by the compiler.

As for why the character array seems to be aligned to a 16-byte boundary and padded to be 16 bytes? That probably can't be answered with certainty until we know what GCC compiler you are using (gcc --version will tell you). It would also be useful to know your OS and OS version. Even better would be to add the output from this command to your question gcc -Q -v -g my_program.c

Categories

c - Waste in memory allocation for local variables

c - Waste in memory allocation for local variables

Please log in or register to add a comment.

Please log in or register to answer this question.

1 Answer

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags