Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


assembly - Why do function parameters occupy at least 4 bytes of stack on x86?

On x86, each function parameter passed on the stack is allocated at least 4 bytes via push/pop. This wastes memory on every call when a function has many parameters smaller than 4 bytes. One reason might be that push and pop operate on at least 4 bytes, but why not adjust esp directly to save stack space, packing four 1-byte parameters into a single 4-byte slot as below?

sub esp, 4                      ; reserve a single 4-byte slot
mov byte ptr [esp], para1       ; para1..para4 stand for 1-byte values
mov byte ptr [esp+1], para2
mov byte ptr [esp+2], para3
mov byte ptr [esp+3], para4
call func

1 Answer


Such behaviour is normally governed by the Application Binary Interface (ABI), and the most widely used x86 ABIs (Win32 and System V) simply require that each parameter occupy at least 4 bytes. This is mainly because most x86 implementations suffer performance penalties when data is not properly aligned. While your example would not misalign the stack, a subroutine taking only three byte-sized parameters would. Of course, one could define special rules in the ABI to overcome this, but that complicates things for little gain.
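The alignment point can be checked with a little arithmetic. The following is a minimal sketch, not any ABI's actual wording: the helper names `slot_size` and `stack_bytes` are hypothetical, and the 4-byte slot is simply the minimum mentioned above.

```python
def slot_size(arg_size: int, slot: int = 4) -> int:
    """Round an argument size up to a whole number of stack slots."""
    return (arg_size + slot - 1) // slot * slot


def stack_bytes(arg_sizes, packed: bool = False) -> int:
    """Total stack bytes used by the arguments, padded per slot or tightly packed."""
    if packed:
        return sum(arg_sizes)
    return sum(slot_size(size) for size in arg_sizes)


# Three 1-byte parameters: padded layout versus the packed layout from the question.
padded = stack_bytes([1, 1, 1])
packed = stack_bytes([1, 1, 1], packed=True)

print(padded, padded % 4 == 0)  # 12 True  -> esp stays 4-byte aligned
print(packed, packed % 4 == 0)  # 3 False  -> esp would be misaligned
```

With four 1-byte parameters the packed total happens to be 4, which is why the example in the question stays aligned while the three-parameter case does not.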

Keep in mind, too, that the x86 ABIs were designed around 1990. At that time, the number of instructions was a very good measure of the speed of a piece of code. Your example requires one extra instruction compared with four pushes if para1-para4 are in registers, and five extra instructions in the worst case, where all parameters must first be loaded from memory (x86 supports pushing memory locations directly).
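The instruction counts can be reproduced as follows. This is only a counting sketch with hypothetical function names; it assumes one `sub esp, N`, one byte store per parameter, and one extra load for each parameter that starts out in memory.

```python
def packed_instruction_count(n_params: int, in_registers: bool) -> int:
    """Instructions for the packed sequence: one sub, then a store per parameter.

    A parameter that lives in memory needs a load into a register first,
    because mov cannot copy memory to memory.
    """
    per_param = 1 if in_registers else 2
    return 1 + n_params * per_param


def push_instruction_count(n_params: int) -> int:
    """One push per parameter; push accepts a register or a memory operand."""
    return n_params


for in_registers in (True, False):
    extra = packed_instruction_count(4, in_registers) - push_instruction_count(4)
    print("registers" if in_registers else "memory", extra)
# registers 1
# memory 5
```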

Further, in your example you trade 12 bytes saved on the stack for 14 extra bytes of code: your sequence requires 18 bytes of code when para1-para4 are in registers (e.g. al-dl), while four pushes require 4 bytes. The stack saving applies to every live frame, whereas the code cost is paid once per call site, so overall you reduce the memory footprint only if your code recurses.
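The byte counts above can be tallied as a rough sketch. The instruction lengths are the standard 32-bit encodings for the register case (`sub esp, imm8` and `mov [esp], r8` are 3 bytes each, `mov [esp+disp8], r8` is 4, `push r32` is 1); the break-even helper `net_bytes_saved` is a hypothetical illustration, not part of the original answer.

```python
# Encoded lengths in bytes, 32-bit mode, parameters held in byte registers.
SUB_ESP_IMM8 = 3     # sub esp, 4
MOV_ESP_NO_DISP = 3  # mov byte ptr [esp], r8
MOV_ESP_DISP8 = 4    # mov byte ptr [esp+1..3], r8
PUSH_REG = 1         # push r32

packed_code = SUB_ESP_IMM8 + MOV_ESP_NO_DISP + 3 * MOV_ESP_DISP8
push_code = 4 * PUSH_REG
extra_code = packed_code - push_code  # code bytes added per call site
stack_saved = 4 * 4 - 4               # stack bytes saved per live frame


def net_bytes_saved(live_frames: int, call_sites: int = 1) -> int:
    """Stack bytes saved across live frames minus code bytes added at call sites."""
    return stack_saved * live_frames - extra_code * call_sites


print(packed_code, push_code, extra_code, stack_saved)  # 18 4 14 12
print(net_bytes_saved(1))  # -2: a single live frame costs more than it saves
print(net_bytes_saved(2))  # 10: the saving only appears once frames stack up
```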

