While studying OSX 10.9.4's implementation of strlen, I notice that it always compares a chunk of 16-bytes and skips ahead to the following 16-bytes until it encounters a '
'. The relevant part:
3de0: 48 83 c7 10 add $0x10,%rdi
3de4: 66 0f ef c0 pxor %xmm0,%xmm0
3de8: 66 0f 74 07 pcmpeqb (%rdi),%xmm0
3dec: 66 0f d7 f0 pmovmskb %xmm0,%esi
3df0: 85 f6 test %esi,%esi
3df2: 74 ec je 3de0 <__platform_strlen+0x40>
0x10
is 16 bytes in hex.
When I saw that, I was wondering: this memory could just as well not be allocated. If I had allocated a C string of 20 bytes and passed it to strlen
, it would read 36 bytes of memory. Why is it allowed to do that? I started looking and found How dangerous is it to access an array out of bounds?
Which confirmed that it's definitely not always a good thing, unallocated memory might be unmapped, for example. Yet, there must be something that makes this work. Some of my hypotheses:
- OSX not only guarantees that its allocations are 16-byte aligned, but also that the "quantum" of an allocated is a 16-byte chunks. Said another way, allocating 5 bytes will actually allocate 16 bytes. Allocating 20 bytes will actually allocate 32 bytes.
- It's not harmful per se to read of the end of an array when you're writing asm, as it's not undefined behaviour, as long as its within bounds (within a page?).
What's the actual reason?
EDIT: just found Why I'm getting read and write permission on unallocated memory?, which seems to indicate my first guess was right.
EDIT 2: Stupidly enough, I had forgotten that even though Apple seems to have removed the source of most of its asm implementations (Where did OSX's x86-64 assembly libc routines go?), it left strlen: http://www.opensource.apple.com/source/Libc/Libc-997.90.3/x86_64/string/strlen.s
In the comments we find:
// returns the length of the string s (i.e. the distance in bytes from
// s to the first NUL byte following s). We look for NUL bytes using
// pcmpeqb on 16-byte aligned blocks. Although this may read past the
// end of the string, because all access is aligned, it will never
// read past the end of the string across a page boundary, or even
// accross a cacheline.
EDIT: I honestly think all answerers deserved an accepted answer, and basically all contained the information necessary to understand the issue. So I went for the answer of the person that had the least reputation.
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…