Are the stages of compilation of a C++ program specified by the standard?
Yes and no.
The C++ standard defines 9 "phases of translation". Quoting from the N3242 draft (10MB PDF), dated 2011-02-28 (prior to the release of the official C++11 standard), section 2.2:
The precedence among the syntax rules of translation is specified by the following phases [see footnote].
1. Physical source file characters are mapped, in an implementation-defined manner, to the basic source character set (introducing new-line characters for end-of-line indicators) if necessary. [SNIP]
2. Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines. [SNIP]
3. The source file is decomposed into preprocessing tokens (2.5) and sequences of white-space characters (including comments). [SNIP]
4. Preprocessing directives are executed, macro invocations are expanded, and _Pragma unary operator expressions are executed. [SNIP]
5. Each source character set member in a character literal or a string literal, as well as each escape sequence and universal-character-name in a character literal or a non-raw string literal, is converted to the corresponding member of the execution character set; [SNIP]
6. Adjacent string literal tokens are concatenated.
7. White-space characters separating tokens are no longer significant. Each preprocessing token is converted into a token. (2.7). The resulting tokens are syntactically and semantically analyzed and translated as a translation unit. [SNIP]
8. Translated translation units and instantiation units are combined as follows: [SNIP]
9. All external entity references are resolved. Library components are linked to satisfy external references to entities not defined in the current translation. All such translator output is collected into a program image which contains information needed for execution in its execution environment.
[footnote] Implementations must behave as if these separate phases occur, although in practice different phases might be folded together.
As indicated by the [SNIP] markers, I haven't quoted the entire section, just enough to get the idea across.
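Several of the early phases have effects you can observe from inside a program. The following is a minimal sketch, not taken from the standard: the macro and function names (`GREETING`, `STRINGIZE`, `PHASE_COUNT`, `kEscapeThenConcat`, `check_observable_effects`) are made up for illustration. It relies on line splicing (phase 2), macro expansion (phase 4), escape-sequence conversion (phase 5) and string literal concatenation (phase 6).

```cpp
#include <cassert>
#include <string>

// Phase 2: each backslash-newline pair is deleted, so this definition is a
// single logical line by the time the directive is executed in phase 4.
#define GREETING "phases " \
                 "of "     \
                 "translation"

// Phase 4: macro expansion, including the # (stringize) operator.
// The extra level of indirection lets PHASE_COUNT expand before stringizing.
#define STRINGIZE_IMPL(x) #x
#define STRINGIZE(x) STRINGIZE_IMPL(x)
#define PHASE_COUNT 9

// Phase 5 comes before phase 6: the escape sequence \x12 is converted first,
// and only then are the two literals concatenated. The result is the two
// characters 0x12 and '3', not the single escape sequence \x123.
constexpr char kEscapeThenConcat[] = "\x12" "3";
static_assert(sizeof(kEscapeThenConcat) == 3,
              "two characters plus the terminating null");

// Phase 6: the three adjacent literals in GREETING become one string.
std::string greeting() { return GREETING; }

std::string phase_count_text() { return STRINGIZE(PHASE_COUNT); }

// Hypothetical self-check that exercises the pieces above.
void check_observable_effects() {
    assert(greeting() == "phases of translation");
    assert(phase_count_text() == "9");
    assert(kEscapeThenConcat[0] == 0x12 && kEscapeThenConcat[1] == '3');
}
```

Calling `check_observable_effects()` from `main` should pass silently. None of this tells you how a particular compiler organizes its work internally; it only shows the results that the phase ordering requires.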
To emphasize, compilers are not required to follow this exact model, as long as the final result is as if they did.
Phases 1-6 correspond more or less to the preprocessor, phase 7 to what you might normally think of as compilation, phase 8 deals with template instantiation, and phase 9 corresponds to linking.
(C's translation phases are similar, but C has only eight: it has no templates, so there is no counterpart to C++'s phase 8.)
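As a rough illustration of why template instantiation gets a step of its own: a template definition is analyzed in phase 7, but a dependent expression in its body can only be fully checked once a concrete type is supplied. The sketch below uses hypothetical types (`Named`, `Anonymous`) and a hypothetical function template (`describe`); real compilers typically perform instantiation as part of ordinary compilation, which the "as if" rule permits.

```cpp
#include <cassert>
#include <string>

// Hypothetical types, for illustration only.
struct Named {
    static std::string name() { return "Named"; }
};
struct Anonymous {};  // deliberately has no name() member

template <typename T>
std::string describe() {
    // T::name() depends on T, so whether it is valid is only known once
    // describe is instantiated for a specific type.
    return "type: " + T::name();
}

// Using describe<Named>() requires that specialization to be instantiated.
std::string describe_named() { return describe<Named>(); }

// describe<Anonymous> is never used, so it is never instantiated and the
// missing name() member causes no error. Uncommenting the next line would
// force the instantiation and make translation fail:
// std::string describe_anonymous() { return describe<Anonymous>(); }
```

Here `describe_named()` returns `"type: Named"`, and the program is well-formed even though `describe` could not be instantiated for `Anonymous`.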