After trying some things, I finally managed to figure out how to do this.
First of all, in glibc, malloc is defined as a weak symbol, which means that it can be overridden by the application or a shared library. Hence, LD_PRELOAD is not necessarily needed. Instead, I implemented the following function in a shared library:
void*
malloc (size_t size)
{
    [ ... ]
}
This gets called by the application instead of glibc's malloc.
Now, to be equivalent to the functionality of the __malloc_hooks, a couple of things are still missing.
1.) the caller address
In addition to the original parameters to malloc, glibc's __malloc_hooks also provide the address of the calling function, which is actually the address that malloc will return to. To achieve the same thing, we can use the __builtin_return_address function that is available in gcc. I have not looked into other compilers, because I am limited to gcc anyway, but if you happen to know how to do such a thing portably, please drop me a comment :)
Our malloc function now looks like this:
void*
malloc (size_t size)
{
    void *caller = __builtin_return_address(0);
    [ ... ]
}
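To see what the builtin yields on its own, here is a minimal sketch. The helper name where_called_from is made up for illustration, and the noinline attribute is my own addition so that the helper keeps a frame of its own. Clang implements the same builtin, so this is a compiler extension rather than something strictly gcc-only, but it is still not standard C.

```c
#include <stddef.h>

/* Hypothetical helper, not part of glibc: returns the address at which
   its caller will resume once this function returns. Level 0 means
   "the return address of the current function". */
__attribute__((noinline))
void *where_called_from(void)
{
    return __builtin_return_address(0);
}
```

Printing the result with %p from inside some function gives an address that lies within that function, which is exactly the value the old hooks passed as their caller argument.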
2.) accessing glibc's malloc from within your hook
As I am limited to glibc in my application, I chose to use __libc_malloc to access the original malloc implementation. Alternatively, dlsym(RTLD_NEXT, "malloc") can be used, but with the possible pitfall that dlsym itself calls calloc on its first call, which can result in infinite recursion and a segfault if calloc is hooked as well.
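For completeness, here is a rough sketch of how the dlsym route could dodge the calloc pitfall. It is not what I use. The idea is to serve any allocation that arrives while dlsym is still running from a small static buffer. The names arena, resolving and resolve are made up for this sketch, the 4096 byte size is an arbitrary guess, and the code assumes a single thread and does no real error handling. Older glibc versions also need -ldl at link time.

```c
#define _GNU_SOURCE            /* glibc only exposes RTLD_NEXT with this */
#include <dlfcn.h>
#include <stddef.h>
#include <stdint.h>

static void *(*real_malloc)(size_t);
static void *(*real_calloc)(size_t, size_t);
static void (*real_free)(void *);
static int resolving;          /* nonzero while dlsym() is running */

/* Zero-initialised static buffer for requests made by dlsym() itself. */
static _Alignas(16) unsigned char arena[4096];
static size_t arena_used;

static void *arena_alloc(size_t bytes)
{
    if (bytes > sizeof arena)
        return NULL;
    bytes = (bytes + 15) & ~(size_t)15;        /* keep 16-byte alignment */
    if (bytes > sizeof arena - arena_used)
        return NULL;
    void *p = arena + arena_used;
    arena_used += bytes;
    return p;
}

static int from_arena(const void *p)
{
    uintptr_t a = (uintptr_t)p, lo = (uintptr_t)arena;
    return a >= lo && a < lo + sizeof arena;
}

static void resolve(void)
{
    resolving = 1;
    real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
    real_calloc = (void *(*)(size_t, size_t))dlsym(RTLD_NEXT, "calloc");
    real_free   = (void (*)(void *))dlsym(RTLD_NEXT, "free");
    resolving = 0;
}

void *malloc(size_t size)
{
    if (resolving)
        return arena_alloc(size);              /* re-entered from dlsym() */
    if (!real_malloc)
        resolve();
    return real_malloc ? real_malloc(size) : NULL;
}

void *calloc(size_t n, size_t size)
{
    if (resolving) {
        if (size != 0 && n > sizeof arena / size)
            return NULL;                       /* would not fit anyway */
        return arena_alloc(n * size);          /* arena memory is already zero */
    }
    if (!real_calloc)
        resolve();
    return real_calloc ? real_calloc(n, size) : NULL;
}

void free(void *p)
{
    /* Arena blocks, and anything freed in the middle of the lookup,
       are simply leaked. */
    if (p == NULL || resolving || from_arena(p))
        return;
    if (!real_free)
        resolve();
    if (real_free)
        real_free(p);
}
```

A realloc on a pointer from the static buffer would still reach glibc and crash, so a real implementation would have to wrap realloc too. That extra bookkeeping is the main reason I preferred __libc_malloc.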
complete malloc hook
My complete hooking function now looks like this:
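#include <stddef.h>

/* glibc's real malloc; exported, but not declared in a public header */
extern void *__libc_malloc(size_t size);
/* defined below */
void *my_malloc_hook(size_t size, void *caller);

int malloc_hook_active = 0;

void*
malloc (size_t size)
{
    void *caller = __builtin_return_address(0);
    if (malloc_hook_active)
        return my_malloc_hook(size, caller);
    return __libc_malloc(size);
}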
where my_malloc_hook looks like this:
void*
my_malloc_hook (size_t size, void *caller)
{
    void *result;
    // deactivate hooks for logging
    malloc_hook_active = 0;
    result = malloc(size);
    // do logging
    [ ... ]
    // reactivate hooks
    malloc_hook_active = 1;
    return result;
}
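One thing to keep in mind is that malloc_hook_active is a single global flag, so in a multi-threaded program allocations from other threads slip past the hook while one thread is busy logging. A possible variation, sketched below under my own assumptions, keeps one re-entrancy guard per thread instead. The names in_hook and hooked_allocations are invented for this sketch, the counter merely stands in for the logging, and unlike the version above this hook is always on and cannot be switched off.

```c
#include <stddef.h>

/* glibc's real malloc; exported, but not declared in a public header */
extern void *__libc_malloc(size_t size);

/* One guard per thread: nonzero while this thread is inside the hook.
   The initial-exec TLS model is an assumption on my side; it is meant
   to keep the first access from allocating memory itself when the
   code lives in a shared library. */
static __thread int in_hook __attribute__((tls_model("initial-exec")));

/* Hypothetical stand-in for the logging: counts hooked allocations. */
static unsigned long hooked_allocations;

static void *my_malloc_hook(size_t size, void *caller)
{
    (void)caller;                       /* a real hook would log this */
    void *result = __libc_malloc(size);
    /* Logging code that may call malloc itself would go here. */
    __atomic_fetch_add(&hooked_allocations, 1, __ATOMIC_RELAXED);
    return result;
}

void *malloc(size_t size)
{
    void *caller = __builtin_return_address(0);
    if (in_hook)
        return __libc_malloc(size);     /* nested call from the logging */
    in_hook = 1;
    void *result = my_malloc_hook(size, caller);
    in_hook = 0;
    return result;
}
```

The guard is set in the wrapper rather than in the hook, so a malloc call made by the logging code on the same thread goes straight to __libc_malloc, while other threads remain hooked.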
Of course, the hooks for calloc, realloc and free work similarly.
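To make "similarly" a bit more concrete, here is a sketch of the other three wrappers following the same pattern. It relies on __libc_calloc, __libc_realloc and __libc_free, which glibc exports alongside __libc_malloc. The hook names and the call counters are placeholders of mine for whatever logging you need, and malloc_hook_active is defined again here only so that the snippet compiles on its own.

```c
#include <stddef.h>
#include <stdlib.h>

/* Exported by glibc, but not declared in any public header. */
extern void *__libc_calloc(size_t nmemb, size_t size);
extern void *__libc_realloc(void *ptr, size_t size);
extern void  __libc_free(void *ptr);

/* Same flag as in the malloc hook; defined here only so that the
   sketch compiles standalone. */
int malloc_hook_active = 0;

/* Hypothetical stand-ins for the logging: plain call counters. */
static unsigned long calloc_calls, realloc_calls, free_calls;

static void *my_calloc_hook(size_t nmemb, size_t size, void *caller)
{
    (void)caller;                   /* a real hook would log this */
    malloc_hook_active = 0;         /* deactivate hooks for logging */
    void *result = calloc(nmemb, size);
    calloc_calls++;
    malloc_hook_active = 1;         /* reactivate hooks */
    return result;
}

static void *my_realloc_hook(void *ptr, size_t size, void *caller)
{
    (void)caller;
    malloc_hook_active = 0;
    void *result = realloc(ptr, size);
    realloc_calls++;
    malloc_hook_active = 1;
    return result;
}

static void my_free_hook(void *ptr, void *caller)
{
    (void)caller;
    malloc_hook_active = 0;
    free(ptr);
    free_calls++;
    malloc_hook_active = 1;
}

void *calloc(size_t nmemb, size_t size)
{
    void *caller = __builtin_return_address(0);
    if (malloc_hook_active)
        return my_calloc_hook(nmemb, size, caller);
    return __libc_calloc(nmemb, size);
}

void *realloc(void *ptr, size_t size)
{
    void *caller = __builtin_return_address(0);
    if (malloc_hook_active)
        return my_realloc_hook(ptr, size, caller);
    return __libc_realloc(ptr, size);
}

void free(void *ptr)
{
    void *caller = __builtin_return_address(0);
    if (malloc_hook_active) {
        my_free_hook(ptr, caller);
        return;
    }
    __libc_free(ptr);
}
```

As with malloc, each hook switches the flag off before touching the allocator, so the inner call falls through to the __libc_ variant instead of recursing.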
dynamic and static linking
With these functions, dynamic linking works out of the box. Linking the .so file containing the malloc hook implementation will result in all calls to malloc from the application, and also all library calls, being routed through my hook. Static linking is problematic though. I have not yet wrapped my head around it completely, but in static linking malloc is not a weak symbol, resulting in a multiple definition error at link time.
If you need static linking for whatever reason, for example to translate function addresses in 3rd party libraries to code lines via debug symbols, then you can link these 3rd party libs statically while still linking the malloc hooks dynamically, avoiding the multiple definition problem. I have not yet found a better workaround for this; if you know one, feel free to leave me a comment.
Here is a short example:
gcc -o test test.c -lmalloc_hook_library -Wl,-Bstatic -l3rdparty -Wl,-Bdynamic
3rdparty will be linked statically, while malloc_hook_library will be linked dynamically. This gives the expected behaviour, and the addresses of functions in 3rdparty can be translated via debug symbols in test. Pretty neat, huh?
Conclusion
The techniques above describe a non-deprecated, pretty much equivalent approach to the __malloc_hooks, but with a couple of mean limitations:
__builtin_return_address is a gcc builtin (gcc-compatible compilers such as Clang provide it too), not standard C
__libc_malloc only works with glibc
dlsym(RTLD_NEXT, [...]) is a GNU extension in glibc
the linker flags -Wl,-Bstatic and -Wl,-Bdynamic are specific to the GNU binutils.
In other words, this solution is utterly non-portable and alternative solutions would have to be added if the hooks library were to be ported to a non-GNU operating system.