Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
196 views
in Technique[技术] by (71.8m points)

c++ - What are these seemingly-useless callq instructions in my x86 object files for?

I have some template-heavy C++ code that I want to ensure the compiler optimizes as much as possible due to the large amount of information it has at compile time. To evaluate its performance, I decided to take a look at the disassembly of the object file that it generates. Below is a snippet of what I got from objdump -dC:

0000000000000000 <bar<foo, 0u>::get(bool)>:
   0:   41 57                   push   %r15
   2:   49 89 f7                mov    %rsi,%r15
   5:   41 56                   push   %r14
   7:   41 55                   push   %r13
   9:   41 54                   push   %r12
   b:   55                      push   %rbp
   c:   53                      push   %rbx
   d:   48 81 ec 68 02 00 00    sub    $0x268,%rsp
  14:   48 89 7c 24 10          mov    %rdi,0x10(%rsp)
  19:   48 89 f7                mov    %rsi,%rdi
  1c:   89 54 24 1c             mov    %edx,0x1c(%rsp)
  20:   e8 00 00 00 00          callq  25 <bar<foo, 0u>::get(bool)+0x25>
  25:   84 c0                   test   %al,%al
  27:   0f 85 eb 00 00 00       jne    118 <bar<foo, 0u>::get(bool)+0x118>
  2d:   48 c7 44 24 08 00 00    movq   $0x0,0x8(%rsp)
  34:   00 00 
  36:   4c 89 ff                mov    %r15,%rdi
  39:   4d 8d b7 30 01 00 00    lea    0x130(%r15),%r14
  40:   e8 00 00 00 00          callq  45 <bar<foo, 0u>::get(bool)+0x45>
  45:   84 c0                   test   %al,%al
  47:   88 44 24 1b             mov    %al,0x1b(%rsp)
  4b:   0f 85 ef 00 00 00       jne    140 <bar<foo, 0u>::get(bool)+0x140>
  51:   80 7c 24 1c 00          cmpb   $0x0,0x1c(%rsp)
  56:   0f 85 24 03 00 00       jne    380 <bar<foo, 0u>::get(bool)+0x380>
  5c:   48 8b 44 24 10          mov    0x10(%rsp),%rax
  61:   c6 00 00                movb   $0x0,(%rax)
  64:   80 7c 24 1b 00          cmpb   $0x0,0x1b(%rsp)
  69:   75 25                   jne    90 <bar<foo, 0u>::get(bool)+0x90>
  6b:   48 8b 74 24 10          mov    0x10(%rsp),%rsi
  70:   4c 89 ff                mov    %r15,%rdi
  73:   e8 00 00 00 00          callq  78 <bar<foo, 0u>::get(bool)+0x78>
  78:   48 8b 44 24 10          mov    0x10(%rsp),%rax
  7d:   48 81 c4 68 02 00 00    add    $0x268,%rsp
  84:   5b                      pop    %rbx
  85:   5d                      pop    %rbp
  86:   41 5c                   pop    %r12
  88:   41 5d                   pop    %r13
  8a:   41 5e                   pop    %r14
  8c:   41 5f                   pop    %r15
  8e:   c3                      retq   
  8f:   90                      nop
  90:   4c 89 f7                mov    %r14,%rdi
  93:   e8 00 00 00 00          callq  98 <bar<foo, 0u>::get(bool)+0x98>
  98:   83 f8 04                cmp    $0x4,%eax
  9b:   74 f3                   je     90 <bar<foo, 0u>::get(bool)+0x90>
  9d:   85 c0                   test   %eax,%eax
  9f:   0f 85 e4 08 00 00       jne    989 <bar<foo, 0u>::get(bool)+0x989>
  a5:   49 83 87 b0 01 00 00    addq   $0x1,0x1b0(%r15)
  ac:   01 
  ad:   49 8d 9f 58 01 00 00    lea    0x158(%r15),%rbx
  b4:   48 89 df                mov    %rbx,%rdi
  b7:   e8 00 00 00 00          callq  bc <bar<foo, 0u>::get(bool)+0xbc>
  bc:   49 8d bf 80 01 00 00    lea    0x180(%r15),%rdi
  c3:   e8 00 00 00 00          callq  c8 <bar<foo, 0u>::get(bool)+0xc8>
  c8:   48 89 df                mov    %rbx,%rdi
  cb:   e8 00 00 00 00          callq  d0 <bar<foo, 0u>::get(bool)+0xd0>
  d0:   4c 89 f7                mov    %r14,%rdi
  d3:   e8 00 00 00 00          callq  d8 <bar<foo, 0u>::get(bool)+0xd8>
  d8:   83 f8 04                cmp    $0x4,%eax

The disassembly of this particular function continues on, but one thing I noticed is the relatively large number of call instructions like this one:

20:   e8 00 00 00 00          callq  25 <bar<foo, 0u>::get(bool)+0x25>

These instructions, always with the opcode e8 00 00 00 00, occur frequently throughout the generated code, and from what I can tell, are nothing more than no-ops; they all seem to just fall through to the next instruction. This begs the question, then, is there a good reason why all these instructions are generated?

I'm concerned about the instruction cache footprint of the generated code, so wasting 5 bytes many times throughout a function seems counterproductive. It seems a bit heavyweight for a nop, unless the compiler is trying to preserve some kind of memory alignment or something. I wouldn't be surprised if this were the case.

I compiled my code using g++ 4.8.5 using -O3 -fomit-frame-pointer. For what it's worth, I saw similar code generation using clang 3.7.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

The 00 00 00 00 (relative) target address in e8 00 00 00 00 is intended to be filled in by the linker. It doesn't mean that the call falls through. It just means you are disassembling an object file that has not been linked yet.

Also, a call to the next instruction, if that was the end result after the link phase, would not be a no-op, because it changes the stack (a certain hint that this is not what is going on in your case).


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

2.1m questions

2.1m answers

60 comments

56.9k users

...