Code that starts on a word (8086) or dword (80386 and later) boundary executes faster because the processor fetches whole (d)words. If an instruction isn't aligned, there can be a stall while it is fetched.
However, you can't dword-align every instruction. Well, I guess you could, but then you'd be wasting space and the processor would have to execute the NOP instructions, which would kill any performance benefit of aligning the instructions.
In practice, aligning code on dword (or whatever) boundaries only helps when the instruction is the target of a branching instruction, and compilers typically will align the first instruction of a function, but won't align branch targets that can also be reached by fall through. For example:
MyFunction:
    cmp ax, bx
    jnz NotEqual
    ; ... some code here
NotEqual:
    ; ... more stuff here
A compiler that generates this code will typically align MyFunction because it is a branch target (reached by call), but it won't align NotEqual because doing so would insert NOP instructions that would have to be executed when falling through. That increases code size and makes the fall-through case slower.
If you're just learning assembly language, I'd suggest you not worry about things like this, which most often give only marginal performance gains. Just write your code to make things work. After it works, you can profile it and, if the profile data shows it's necessary, align your functions.
The assembler typically won't do it for you automatically.
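If you do decide to align something, you have to ask for it with a directive. Here is a minimal sketch of the example above with the function entry aligned, assuming MASM-style syntax (NASM spells the directive `align 4`). The value 4 is only an illustration; pick whatever your target processor prefers, and note that some assemblers won't accept an alignment larger than the alignment of the enclosing segment.

```asm
    ALIGN 4                 ; assumed MASM-style directive: pad so the next
                            ; label starts on a 4-byte boundary
MyFunction:
    cmp ax, bx
    jnz NotEqual
    ; ... some code here
    ; no ALIGN before NotEqual: the padding would sit on the
    ; fall-through path and have to be executed
NotEqual:
    ; ... more stuff here
```

The padding in front of MyFunction sits after whatever code precedes the function (normally the previous function's ret), so it is never executed and costs only a few bytes of space.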