Language linkage is the term used for linkage between C++
and non-C++
code fragments. Typically, in a C++ program, all function names, function types and even variable names have the default C++ language linkage.
A C++ object code can be linked to another object code which is produced using some other source language (like C
) using a predefined linkage specifier.
As you must be aware of the concept of name mangling
, which encodes function names, function types and variable names so as to generate a unique name for them. This allows the linker to differentiate between common names (as in the case of function overloading). Name mangling is not desirable when linking C modules with libraries or object files compiled with a C++ compiler. To prevent name mangling for such cases, linkage specifiers are used. In this case, extern "C"
is the linkage specifier. Let's take an example (c++ code mentioned here):
typedef int (*pfun)(int); // line 1
extern "C" void foo(pfun); // line 2
extern "C" int g(int) // line 3
...
foo( g ); // Error! // line 5
Line 1 declares pfun
to point to a C++ function, because it lacks a linkage specifier.
Line 2 therefore declares foo to be a C function that takes a pointer to a C++ function.
Line 5 attempts to call foo with a pointer to g, a C function, a type mis-match.
Diff in function name linkage:
Let's take two different files:
One with extern "c"
linkage (file1.cpp):
#include <iostream>
using namespace std;
extern "C"
{
void foo (int a, int b)
{
cout << "here";
}
}
int main ()
{
foo (10,20);
return 0;
}
One without extern "c"
linkage (file2.cpp):
#include <iostream>
using namespace std;
void foo (int a, int b)
{
cout << "here";
}
int main ()
{
foo (10,20);
return 0;
}
Now compile these two and check the objdump.
# g++ file1.cpp -o file1
# objdump -Dx file1
# g++ file2.cpp -o file2
# objdump -Dx file2
With extern "C" linkage, there is no name mangling for the function foo
. So any program that is using it (assuming we make a shared lib out of it) can directly call foo (with helper functions like dlsym
and dlopen
) with out considering any name mangling effects.
0000000000400774 <foo>:
400774: 55 push %rbp
400775: 48 89 e5 mov %rsp,%rbp
....
....
400791: c9 leaveq
400792: c3 retq
0000000000400793 <main>:
400793: 55 push %rbp
400794: 48 89 e5 mov %rsp,%rbp
400797: be 14 00 00 00 mov $0x14,%esi
40079c: bf 0a 00 00 00 mov $0xa,%edi
4007a1: e8 ce ff ff ff callq 400774 <foo>
4007a6: b8 00 00 00 00 mov $0x0,%eax
4007ab: c9 leaveq
On the other hand, when no extern "C"
is being used, func: foo
is mangled with some predefined rules (known to compiler/linker being used) and so an application can not directly call it from it specifying the name as foo
. You can however call it with the mangled name (_Z3fooii
in this case) if you want, but nobody use it for the obvious reason.
0000000000400774 <_Z3fooii>:
400774: 55 push %rbp
400775: 48 89 e5 mov %rsp,%rbp
...
...
400791: c9 leaveq
400792: c3 retq
0000000000400793 <main>:
400793: 55 push %rbp
400794: 48 89 e5 mov %rsp,%rbp
400797: be 14 00 00 00 mov $0x14,%esi
40079c: bf 0a 00 00 00 mov $0xa,%edi
4007a1: e8 ce ff ff ff callq 400774 <_Z3fooii>
4007a6: b8 00 00 00 00 mov $0x0,%eax
4007ab: c9 leaveq
4007ac: c3 retq
This page is also a good read for this particular topic.
A nice and clearly explained article about calling convention: http://www.codeproject.com/KB/cpp/calling_conventions_demystified.aspx