Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
489 views
in Technique[技术] by (71.8m points)

firefox - How does jemalloc work? What are the benefits?

Firefox 3 came with a new allocator: jemalloc.

I have heard at several places that this new allocator is better. The top Google results don't gave any further information though and I am interested in how exactly it works.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

jemalloc first appeared for FreeBSD, the brainchild of one "Jason Evans", hence the "je". I would ridicule him for being egotistical had I not once written an operating system called paxos :-)

See this PDF for full details. It's a white paper describing in detail how the algorithms work.

The main benefit is scalability in multi-processor and multi-threaded systems achieved, in part, by using multiple arenas (the chunks of raw memory from which allocations are made).

In single-threaded situations, there is no real benefit to multiple arenas so a single arena is used.

However, in multi-threaded situations, many arenas are created (four times as many arenas as there are processors), and threads are assigned to these arenas in a round-robin fashion.

This means that lock contention can be reduced since, while multiple threads may call malloc or free concurrently, they'll only contend if they share the same arena. Two threads with different arenas will not affect each other.

In addition, jemalloc tries to optimise for cache locality since the act of fetching data from RAM is much slower than using data already in the CPU caches (no different in concept to the difference between fast fetching from RAM versus slow fetching from disk). To that end, it first tries to minimise memory usage overall since that is more likely to ensure the application's entire working set is in cache.

And, where that can't be achieved, it tries to ensure that allocations are contiguous, since memory allocated together tends to be used together.

From the white paper, these strategies seem to give similar performance to current best algorithms for single threaded use while offering improvements for multi-threaded usage.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...