c++ - How to control memory allocation strategy in third party library code?

asked Oct 17, 2021 in Technique by 深蓝 (71.8m points)

Previous header: "Must I replace global operators new and delete to change memory allocation strategy in third party code?"

Short story: We need to replace the memory allocation technique in a third-party library without changing its source code.

Long story:

Consider a memory-bound application that makes huge dynamic allocations (perhaps almost all available system memory). We use specialized allocators, and use them everywhere (shared_ptr's, containers, etc.). We have total control and power over every single byte of memory allocated in our application.

Also, we need to link against a third-party helper library. That nasty guy makes allocations in some standard way, using the default operators new, new[], delete and delete[], or malloc, or something else non-standard (let's generalize and say that we don't know how this library manages its heap allocation).

If this helper library makes allocations that are big enough, we can get HDD thrashing, memory fragmentation and alignment issues, out-of-memory bad_allocs and all sorts of problems.

We cannot (or do not want to) change the library's source code.

First attempt:

We never had such unholy "hacks" in release builds before. A first test with overriding operator new works fine, except that:

we do not know what gotchas await us in the future (and this is awful)
our users (and even our allocators) now have to allocate the same way that we do

Questions:

Are there ways to hook these allocations without overloading global operators? (local lib-only hooks?)
...and if we don't know what exactly it uses: malloc or new?

Is this list of signatures complete? (and there are no other things that we must implement):

void* operator new (std::size_t size) throw (std::bad_alloc);
void* operator new (std::size_t size, const std::nothrow_t& nothrow_value) throw();
void* operator new (std::size_t size, void* ptr) throw();
void* operator new[] (std::size_t size) throw (std::bad_alloc);
void* operator new[] (std::size_t size, const std::nothrow_t& nothrow_value) throw();
void* operator new[] (std::size_t size, void* ptr) throw();

void operator delete (void* ptr) throw();
void operator delete (void* ptr, const std::nothrow_t& nothrow_constant) throw();
void operator delete (void* ptr, void* voidptr2) throw();
void operator delete[] (void* ptr) throw();
void operator delete[] (void* ptr, const std::nothrow_t& nothrow_constant) throw();
void operator delete[] (void* ptr, void* voidptr2) throw();

Is anything different if that library is dynamic?

Edit #1

A cross-platform solution is preferable if possible (it looks like it is not very possible). If not, our major platforms are:

Windows x86/x64 (msvc 10)
Linux x86/x64 (gcc 4.6)

Edit #2

Almost 2 years have passed and a few OS and compiler versions have evolved, so I am curious whether there is something new and unexplored in this area. Any standard proposals? OS-specifics? Hacks? How do you write memory-thirsty applications today? Please share your experience.

1 Answer

深蓝 · Answer 1 · 2021-10-17T02:51:22+0000

Ugh, my sympathy. This is going to depend a lot on your compiler, your libc, etc. Some rubber-meets-road strategies that have "worked" to varying degrees for us in the past (/me braces for downvotes) are:

The operator new / operator delete overloads you suggested -- although note that some compilers are picky about not having throw() specs, some really want them, some want them for new but not for delete, etc. (I have a giant platform-specific #if/#elif block for all of the 4+ platforms we're working on now).
Also worth noting: you can generally ignore the placement versions, they don't allocate.
Look at __malloc_hook and friends -- note that these are deprecated (glibc 2.34 removed them from its API) and have thread race conditions -- but they're nice in that new/delete tend to be implemented in terms of malloc (but not always).
Providing a replacement malloc, calloc, realloc, and free and getting your linker args in the right order so that the overrides take effect (this is what gcc recommends these days, although I've had situations where it was impossible to do, and I had to use the deprecated __malloc_hook) -- again, new and delete tend to be implemented in terms of these, but not always.
Avoiding all the standard allocation methods (operator new, malloc, etc.) in "our code" and using custom functions instead -- not very easy with an existing codebase.
Tracking down the library author and delivering a savage beating, er, polite request or patch to change their library to allow you to specify a different allocator (it may be faster than doing this yourself) -- I think this has led to a cardinal rule of "client always specifies the allocator or does the allocation" with any libraries I write.
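
To make the first strategy in the list above a bit more concrete, here is a minimal sketch of replacing the global allocation functions, assuming a C++11 or later compiler. The names my_alloc, my_free and g_live_allocs are hypothetical and exist only for this example; they forward to malloc/free and count live blocks, standing in for whatever custom allocator you actually use. The placement forms are left out because they do not allocate and cannot be replaced, and the C++17 aligned overloads are omitted for brevity.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical bookkeeping for this sketch: number of blocks currently live.
static std::atomic<std::size_t> g_live_allocs{0};

// Hypothetical stand-in for a custom allocator; it just forwards to malloc.
static void* my_alloc(std::size_t size) noexcept {
    void* p = std::malloc(size != 0 ? size : 1);  // size 0 must still yield a unique pointer
    if (p != nullptr) g_live_allocs.fetch_add(1, std::memory_order_relaxed);
    return p;
}

static void my_free(void* p) noexcept {
    if (p == nullptr) return;  // deleting a null pointer is a no-op
    g_live_allocs.fetch_sub(1, std::memory_order_relaxed);
    std::free(p);
}

// Throwing forms: report failure with std::bad_alloc.
void* operator new(std::size_t size) {
    if (void* p = my_alloc(size)) return p;
    throw std::bad_alloc();
}
void* operator new[](std::size_t size) {
    if (void* p = my_alloc(size)) return p;
    throw std::bad_alloc();
}

// Non-throwing forms: report failure with a null pointer.
void* operator new(std::size_t size, const std::nothrow_t&) noexcept { return my_alloc(size); }
void* operator new[](std::size_t size, const std::nothrow_t&) noexcept { return my_alloc(size); }

// Matching deallocation functions, including the C++14 sized variants.
void operator delete(void* p) noexcept { my_free(p); }
void operator delete[](void* p) noexcept { my_free(p); }
void operator delete(void* p, std::size_t) noexcept { my_free(p); }
void operator delete[](void* p, std::size_t) noexcept { my_free(p); }
void operator delete(void* p, const std::nothrow_t&) noexcept { my_free(p); }
void operator delete[](void* p, const std::nothrow_t&) noexcept { my_free(p); }
```

Defining these in exactly one translation unit of the executable is enough for a statically linked library that uses new/delete to go through them too; a library that calls malloc directly is not affected by this sketch.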

Please note that this is not an answer in terms of what the standards say should happen, just my experience. I've worked with more than a few buggy/broken compilers and libc implementations in the past, so YMMV. I also have the luxury of working on fairly "sealed systems", and not being all that worried about portability for any specific application.

Regarding dynamic libraries: I'm currently in a bit of a pinch in this regard myself; our "app" gets loaded as a dynamic .so and we have to be pretty careful to pass any delete/free requests back to the default allocator if they didn't come from us. The current solution is to just cordon off our allocations to a specific area: if we get a delete/free from within that address range, we dispatch to our handler, otherwise back to the default... I've even toyed with (horrors) the idea of checking the caller address to see if it's in our address space. (The probability of going boom increases with such hacks, though.)
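
The "cordon off our allocations to a specific area" idea might look roughly like the sketch below. Everything here is an assumption made for illustration: the arena namespace, the 1 MiB static buffer, the bump allocation scheme and the our_alloc/our_free names are all hypothetical, and the code is not thread-safe. The only point is the dispatch in our_free: pointers inside our address range go to our handler, everything else goes back to the default allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

namespace arena {  // hypothetical names, for illustration only

constexpr std::size_t kSize = 1u << 20;                    // 1 MiB region that we "own"
constexpr std::size_t kAlign = alignof(std::max_align_t);
alignas(std::max_align_t) static unsigned char g_buf[kSize];
static std::size_t g_used = 0;                             // bump offset; not thread-safe

// True if p points into our cordoned-off address range.
bool owns(const void* p) noexcept {
    const auto a = reinterpret_cast<std::uintptr_t>(p);
    const auto lo = reinterpret_cast<std::uintptr_t>(g_buf);
    return a >= lo && a < lo + kSize;
}

// Bump allocation: returns nullptr when the request does not fit.
void* allocate(std::size_t n) noexcept {
    if (n == 0) n = 1;
    if (n > kSize) return nullptr;               // also avoids overflow when rounding
    n = (n + kAlign - 1) / kAlign * kAlign;      // round up so every block stays aligned
    if (n > kSize - g_used) return nullptr;
    void* p = g_buf + g_used;
    g_used += n;
    assert(g_used <= kSize);
    return p;
}

// A bump allocator never reuses memory, so releasing a block is a no-op here.
void deallocate(void*) noexcept {}

}  // namespace arena

// Allocate from our region if possible, otherwise fall back to the default.
void* our_alloc(std::size_t n) noexcept {
    if (void* p = arena::allocate(n)) return p;
    return std::malloc(n != 0 ? n : 1);
}

// Dispatch on the address: ours goes to our handler, anything else to free().
void our_free(void* p) noexcept {
    if (p == nullptr) return;
    if (arena::owns(p)) {
        arena::deallocate(p);
    } else {
        std::free(p);
    }
}
```

A real allocator would reserve the range from the OS and reuse freed blocks; the static buffer just keeps the range check easy to see.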

This may be a useful strategy even if you are the process lead and you're using an outside library: tag or restrict or otherwise identify your own allocs somehow (even going so far as to keep a list of allocs you know about), and then pass on any unknowns. All of this has ugly side-effects and limitations, though.
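
For the "keep a list of allocs you know about" variant, a minimal sketch could be a registry of pointers handed out by your own allocator. AllocRegistry and dispatch_free are hypothetical names, malloc/free stand in for the real custom allocator, and there is no locking, so treat this as single-threaded illustration only. Note also that the set allocates through the default operator new, so combining this with a global operator new replacement would need a container that does not recurse into it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <unordered_set>

// Hypothetical registry of blocks that came from "our" allocator.
class AllocRegistry {
public:
    // Allocates a block and remembers it; returns nullptr on failure.
    void* allocate(std::size_t n) {
        void* p = std::malloc(n != 0 ? n : 1);  // stand-in for the custom allocator
        if (p == nullptr) return nullptr;
        try {
            known_.insert(p);
        } catch (...) {                          // could not record it: do not leak
            std::free(p);
            return nullptr;
        }
        return p;
    }

    // Releases p if we know it. Returns false (and does nothing) for unknown pointers.
    bool release(void* p) {
        if (known_.erase(p) == 0) return false;
        std::free(p);                            // stand-in for the custom deallocator
        return true;
    }

    std::size_t live() const { return known_.size(); }

private:
    std::unordered_set<void*> known_;            // not thread-safe in this sketch
};

// Free p through the registry if it is ours, otherwise pass it on to the default.
void dispatch_free(AllocRegistry& reg, void* p) {
    if (p == nullptr) return;
    if (!reg.release(p)) std::free(p);
}
```

The lookup costs a hash probe on every free, which is one of the "ugly side-effects" to weigh against the address-range check.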

(Looking forward to other answers!)
