LuaJIT is designed to use 32-bit pointers. On x64 platforms the limit comes from the use of mmap and the MAP_32BIT flag.
MAP_32BIT (since Linux 2.4.20, 2.6):
Put the mapping into the first 2 Gigabytes of the process address space. This flag is supported only on x86-64, for 64-bit programs. It was added to allow thread stacks to be allocated somewhere in the first 2GB of memory, so as to improve context-switch performance on some early 64-bit processors.
In effect, this flag confines mappings to addresses that fit in 31 bits, not 32 bits as the name suggests. Have a look here for a nice overview of the 1GB limit using MAP_32BIT in the Linux kernel.
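To see the flag's effect directly, here is a minimal sketch, assuming Linux and glibc. The helper names (low_mapping_supported, map_low, fits_in_31_bits) are made up for illustration and this is not how LuaJIT's own allocator is written. It requests an anonymous mapping with MAP_32BIT where the flag exists and checks that the returned range ends below 2^31.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* glibc: make sure MAP_32BIT and MAP_ANONYMOUS are visible */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical helper: 1 if this platform's headers define MAP_32BIT
 * (x86-64 Linux), 0 otherwise. */
static int low_mapping_supported(void)
{
#ifdef MAP_32BIT
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: map `len` bytes of anonymous memory, asking the
 * kernel to place it in the low 2 GB when MAP_32BIT is available.
 * Returns NULL on failure. Release with munmap(p, len). */
static void *map_low(size_t len)
{
    assert(len > 0);
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
#ifdef MAP_32BIT
    flags |= MAP_32BIT;
#endif
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* True if the whole range [p, p + len) lies below 2^31. */
static int fits_in_31_bits(const void *p, size_t len)
{
    uintptr_t end = (uintptr_t)p + len;
    return end <= ((uintptr_t)1 << 31);
}
```

Calling map_low(1 << 20) on x86-64 Linux and passing the result to fits_in_31_bits should report that the mapping sits entirely in the low 2 GB; without MAP_32BIT the kernel is free to place it anywhere in the 64-bit address space.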
Even if you could have more than 1GB, the LuaJIT author explains why this would be bad for performance:
- A full GC takes 50% more time than the allocations themselves.
- If the GC is enabled, it doubles the allocation time.
- To simulate a real application, the links between objects are randomized in the third run. This doubles the GC time!
And that was just for 1GB! Now imagine using 8GB -- a full GC cycle would keep the CPU busy for a whopping 24 seconds!
Ok, so the normal mode is to use the incremental GC. But this just means the overhead is ~30% higher, it's mixed in between the allocations and it will evict the CPU cache every time. Basically your application will be dominated by the GC overhead and you'll begin to wonder why it's slow ....
tl;dr version: Don't try this at home. And the GC needs a rewrite (postponed to LuaJIT 2.1).
To summarize, the 1GB limit is a limitation of the Linux kernel and the LuaJIT garbage collector. It only applies to objects within the LuaJIT state and can be overcome by using malloc, which will allocate outside the lower 32-bit address space. Also, it's possible to use the x86 build on x64 in 32-bit mode and have access to the full 4GB.
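As a rough illustration of the malloc route, the sketch below keeps a large payload in memory obtained from the C allocator, so it is never a GC-managed object inside the LuaJIT state. The offheap_buf type and the offheap_new / offheap_free functions are hypothetical names invented for this example, not part of LuaJIT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical off-heap buffer: both the header and the payload come from
 * the C allocator, not from the LuaJIT garbage-collected heap. */
typedef struct {
    size_t len;            /* payload size in bytes */
    unsigned char *data;   /* zero-initialized payload */
} offheap_buf;

/* Allocate a zeroed buffer of `len` bytes. Returns NULL on failure. */
offheap_buf *offheap_new(size_t len)
{
    assert(len > 0);
    offheap_buf *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->data = calloc(len, 1);
    if (b->data == NULL) {
        free(b);
        return NULL;
    }
    b->len = len;
    return b;
}

/* Release a buffer created by offheap_new. Safe to call with NULL. */
void offheap_free(offheap_buf *b)
{
    if (b != NULL) {
        free(b->data);
        free(b);
    }
}
```

From Lua, functions like these would typically be declared with ffi.cdef and reached through the FFI after building them into a shared library. Since the collector does not track this memory, it has to be freed explicitly, or a finalizer attached with ffi.gc.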
Check out these links for more information: