Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
149 views
in Technique[技术] by (71.8m points)

c++ - How does the main() method work in C?

I know there are two different signatures to write the main method -

int main()
{
   //Code
}

or for handling command line argument, we write it as-

int main(int argc, char * argv[])
{
   //code
}

In C++ I know we can overload a method, but in C how does the compiler handle these two different signatures of main function?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Some of the features of the C language started out as hacks which just happened to work.

Multiple signatures for main, as well as variable-length argument lists, is one of those features.

Programmers noticed that they can pass extra arguments to a function, and nothing bad happens with their given compiler.

This is the case if the calling conventions are such that:

  1. The calling function cleans up the arguments.
  2. The leftmost arguments are closer to the top of the stack, or to the base of the stack frame, so that spurious arguments do not invalidate the addressing.

One set of calling conventions which obeys these rules is stack-based parameter passing whereby the caller pops the arguments, and they are pushed right to left:

 ;; pseudo-assembly-language
 ;; main(argc, argv, envp); call

 push envp  ;; rightmost argument
 push argv  ;; 
 push argc  ;; leftmost argument ends up on top of stack

 call main

 pop        ;; caller cleans up   
 pop
 pop

In compilers where this type of calling convention is the case, nothing special need to be done to support the two kinds of main, or even additional kinds. main can be a function of no arguments, in which case it is oblivious to the items that were pushed onto the stack. If it's a function of two arguments, then it finds argc and argv as the two topmost stack items. If it's a platform-specific three-argument variant with an environment pointer (a common extension), that will work too: it will find that third argument as the third element from the top of the stack.

And so a fixed call works for all cases, allowing a single, fixed start-up module to be linked to the program. That module could be written in C, as a function resembling this:

/* I'm adding envp to show that even a popular platform-specific variant
   can be handled. */
extern int main(int argc, char **argv, char **envp);

void __start(void)
{
  /* This is the real startup function for the executable.
     It performs a bunch of library initialization. */

  /* ... */

  /* And then: */
  exit(main(argc_from_somewhere, argv_from_somewhere, envp_from_somewhere));
}

In other words, this start module just calls a three-argument main, always. If main takes no arguments, or only int, char **, it happens to work fine, as well as if it takes no arguments, due to the calling conventions.

If you were to do this kind of thing in your program, it would be nonportable and considered undefined behavior by ISO C: declaring and calling a function in one manner, and defining it in another. But a compiler's startup trick does not have to be portable; it is not guided by the rules for portable programs.

But suppose that the calling conventions are such that it cannot work this way. In that case, the compiler has to treat main specially. When it notices that it's compiling the main function, it can generate code which is compatible with, say, a three argument call.

That is to say, you write this:

int main(void)
{
   /* ... */
}

But when the compiler sees it, it essentially performs a code transformation so that the function which it compiles looks more like this:

int main(int __argc_ignore, char **__argv_ignore, char **__envp_ignore)
{
   /* ... */
}

except that the names __argc_ignore don't literally exist. No such names are introduced into your scope, and there won't be any warning about unused arguments. The code transformation causes the compiler to emit code with the correct linkage which knows that it has to clean up three arguments.

Another implementation strategy is for the compiler or perhaps linker to custom-generate the __start function (or whatever it is called), or at least select one from several pre-compiled alternatives. Information could be stored in the object file about which of the supported forms of main is being used. The linker can look at this info, and select the correct version of the start-up module which contains a call to main which is compatible with the program's definition. C implementations usually have only a small number of supported forms of main so this approach is feasible.

Compilers for the C99 language always have to treat main specially, to some extent, to support the hack that if the function terminates without a return statement, the behavior is as if return 0 were executed. This, again, can be treated by a code transformation. The compiler notices that a function called main is being compiled. Then it checks whether the end of the body is potentially reachable. If so, it inserts a return 0;


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

2.1m questions

2.1m answers

60 comments

56.9k users

...