There are three cases that you should be aware of.
- Leaf:
void foo(void) {}
- Tail call:
int foo(void) { return bar(); }
- Intermediate:
int foo(void) { int i; i = bar() + 4; return i; }
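One way to see how a compiler treats each case is to put all three in one file and read the generated assembly, for example with arm-none-eabi-gcc -O2 -S. The sketch below is only an illustration: the names foo_leaf, foo_tail and foo_intermediate are made up so the three versions of foo can coexist, bar is a stand-in with an arbitrary return value, and the noinline attribute assumes GCC or Clang.

```c
/* bar() is a stand-in for the callee in the list above; the value 38 is arbitrary.
   noinline (a GCC/Clang attribute) keeps the call visible in the generated assembly. */
__attribute__((noinline)) int bar(void) { return 38; }

/* Leaf: calls nothing, so lr is never overwritten. */
void foo_leaf(void) {}

/* Tail call: the call to bar() is the last thing the function does. */
int foo_tail(void) { return bar(); }

/* Intermediate: work remains after bar() returns, so the return address
   has to survive the call. */
int foo_intermediate(void) { int i; i = bar() + 4; return i; }
```

The exact instructions depend on the compiler, the optimization level and the target options, so treat whatever you see as one possible implementation.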
There are many ways to implement these calls. The samples below are not the only way to implement the prologue and epilogue in ARM assembler.
LEAF functions
Many functions are the leaf type and do not require saving of the lr. You simply use the bx lr to return. For example,
SubRoutine:
PUSH {r1,r2}
//code that changes r1 and r2
POP {r1,r2}
bx lr
Also, r1 and r2 are typically used to pass parameters, and a SubRoutine is free to use/destroy them (see the ARM calling conventions). This will be the case if you call a 'C' function from assembler. So typically no one would save r1 and r2, but as it is assembler you can do whatever you like (even if it is a bad idea). If you follow the standard, the example is actually only bx lr.
Tail Call
If your function is a leaf except for a final call to another function, you can use the following shortcut,
Sub_w_tail:
// Save callee-saved regs (for whatever calling convention you need)
// Leave LR as is.
// ... do stuff
B tail_call
The LR still holds the return address set by the caller of Sub_w_tail, so you just jump directly to tail_call, which returns to the original caller.
Intermediate function
This is the most complex. Here is a possible sequence,
SubRoutine:
PUSH {r1,r2,lr}
//code that changes r1 and r2
bl AnotherRoutine // AnotherRoutine uses bx lr to return here
// more code
POP {r1,r2,pc} // returns to caller of 'SubRoutine'
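To make it clearer why lr has to be saved here, the following is a toy model in C of the registers and stack that the sequence above touches. It is a sketch and not an emulator: the ToyCpu type, the helper names and the addresses are all invented for illustration. It only shows that bl overwrites lr, and that popping the saved value into pc is what performs the return.

```c
#include <stdint.h>

/* Toy model of the registers and stack used by SubRoutine above.
   This is an illustration only, not an emulator. */
typedef struct {
    uint32_t r1, r2, lr, pc;
    uint32_t stack[8];
    int sp;                 /* index into stack[]; the stack grows downward */
} ToyCpu;

/* PUSH {r1,r2,lr}: the lowest-numbered register ends up at the lowest address. */
static void push_r1_r2_lr(ToyCpu *c) {
    c->sp -= 3;
    c->stack[c->sp]     = c->r1;
    c->stack[c->sp + 1] = c->r2;
    c->stack[c->sp + 2] = c->lr;
}

/* POP {r1,r2,pc}: the saved lr is loaded straight into pc, which is the return. */
static void pop_r1_r2_pc(ToyCpu *c) {
    c->r1 = c->stack[c->sp];
    c->r2 = c->stack[c->sp + 1];
    c->pc = c->stack[c->sp + 2];
    c->sp += 3;
}

/* Walk through SubRoutine once and return the final register state.
   The addresses are arbitrary labels. */
ToyCpu run_subroutine(void) {
    ToyCpu c = { .r1 = 11, .r2 = 22, .lr = 0x1000, .pc = 0x2000, .sp = 8 };
    push_r1_r2_lr(&c);      /* prologue */
    c.r1 = 0; c.r2 = 0;     /* "code that changes r1 and r2" */
    c.lr = 0x2008;          /* bl AnotherRoutine overwrites lr */
    pop_r1_r2_pc(&c);       /* epilogue */
    return c;
}
```

After run_subroutine, pc holds the caller's address (0x1000 in this model) even though lr was overwritten by the nested call, and r1 and r2 are back to their original values.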
Some details of an older calling convention are in the ARM Link and frame registers question. You can use this convention. There are many different ways to perform the prologue and epilogue in ARM assembler.
The last is quite complex, or at least tedious to code. It is a lot better to let a compiler determine which registers to use and what to place on the stack. However, when writing assembler you usually only need to know how to code the first (LEAF function). It is most productive to code only an optimized sub-routine in assembler and call it from a higher-level language. It is still useful to know how all of them work in order to understand compiled code. You should also consider inline assembler so you don't have to deal with these nuances.
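As a rough sketch of the inline assembler route, the function below wraps a single instruction and leaves the prologue, epilogue and register choice to the compiler. It assumes GCC-style extended asm and a 32-bit ARM core with the DSP saturating instructions; sat_add is a hypothetical name, and the C fallback is there only so the sketch also builds on other targets.

```c
#include <stdint.h>

/* Saturating 32-bit add. sat_add is a hypothetical name for this sketch. */
static inline int32_t sat_add(int32_t a, int32_t b) {
#if defined(__GNUC__) && defined(__arm__) && defined(__ARM_FEATURE_DSP)
    int32_t result;
    /* The compiler chooses the registers for %0, %1 and %2. */
    __asm__ ("qadd %0, %1, %2" : "=r"(result) : "r"(a), "r"(b));
    return result;
#else
    /* Portable fallback so the sketch also builds on other targets. */
    int64_t wide = (int64_t)a + (int64_t)b;
    if (wide > INT32_MAX) return INT32_MAX;
    if (wide < INT32_MIN) return INT32_MIN;
    return (int32_t)wide;
#endif
}
```

Because the operands are declared as constraints, there is no PUSH, POP or bx lr to write by hand; the compiler emits whatever the surrounding function needs.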