cpp/language/translation phases

The C++ source file is processed by the compiler as if the following phases take place, in this exact order:

Phase 2
@1@ Whenever a backslash appears at the end of a line (immediately followed by the newline character), both characters are deleted, combining two physical source lines into one logical source line. This is a single-pass operation; a line ending in two backslashes followed by an empty line does not combine three lines into one. If a universal character name is formed in this phase, the behavior is undefined. @2@ If a non-empty source file does not end with a newline character after this step (whether it had no newline originally, or it ended with a newline immediately preceded by a backslash), a terminating newline character is added.
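The splicing rule can be illustrated with a short sketch. The names SUM_OF_THREE, spliced_value, sum_demo, and check_splicing are hypothetical and exist only for this example; the second declaration relies on its continuation line starting in the first column.

```cpp
#include <cassert>

// The backslash-newline pairs are deleted in phase 2, so the preprocessor
// later sees this macro definition as a single logical line.
#define SUM_OF_THREE(a, b, c) \
    ((a) +                    \
     (b) +                    \
     (c))

// Splicing is purely textual and can even join the halves of one token:
// the two physical lines below form the identifier spliced_value.
int spliced_\
value = 42;

// Hypothetical helper that expands the multi-line macro.
inline int sum_demo() { return SUM_OF_THREE(1, 2, 3); }

inline void check_splicing() {
    assert(sum_demo() == 6);
    assert(spliced_value == 42);
}
```

Calling check_splicing() from main exercises both cases.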

Phase 3
@1@ The source file is decomposed into comments, sequences of whitespace characters (space, horizontal tab, new-line, vertical tab, and form-feed), and preprocessing tokens, which are the following:
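 * @a@ header names such as <iostream> or "myfile.h"
 * @b@ identifiers
 * @c@ preprocessing numbers
 * @d@ character and string literals
 * @e@ operators and punctuators (including alternative tokens), such as +, <<=, <%, ##, or the alternative token and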
 * @f@ individual non-whitespace characters that do not fit in any other category
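As a small illustration of the operator and punctuator category, the following sketch uses alternative tokens, which are alternative spellings of certain operators and punctuators. The function names are hypothetical, and the sketch assumes a conforming compiler mode (some compilers need a strict-conformance switch before they accept and as a token).

```cpp
#include <cassert>

// <% and %> are alternative spellings of { and };
// "and" is an alternative spelling of &&.
inline bool both(bool a, bool b) <% return a and b; %>

// <: and :> are alternative spellings of [ and ].
inline int second_of_three() {
    int values<:3:> = {10, 20, 30};
    return values<:1:>;
}

inline void check_alternative_tokens() {
    assert(both(true, true));
    assert(!both(true, false));
    assert(second_of_three() == 20);
}
```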

@3@ Each comment is replaced by one space character.

Newlines are kept, and it is unspecified whether non-newline whitespace sequences may be collapsed into single space characters.
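The effect of replacing a comment with a space can be observed with the stringification operator, which preserves whitespace between tokens as a single space. This is a minimal sketch; STR, answer, stringized, and check_comment_replacement are hypothetical names.

```cpp
#include <cassert>
#include <cstring>

#define STR(x) #x

// The comment is replaced by one space, so "int" and "answer" remain two
// separate tokens even though no literal whitespace separates them.
int/* comment acting as whitespace */answer = 42;

// a/**/b is the two tokens a and b separated by whitespace, not the single
// identifier ab, so stringification produces "a b".
inline const char* stringized() { return STR(a/**/b); }

inline void check_comment_replacement() {
    assert(answer == 42);
    assert(std::strcmp(stringized(), "a b") == 0);
}
```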

If the input has been parsed into preprocessing tokens up to a given character, the next preprocessing token is generally taken to be the longest sequence of characters that could constitute a preprocessing token, even if that would cause subsequent analysis to fail. This is commonly known as maximal munch.
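A classic illustration of maximal munch is the character sequence a+++b. The sketch below uses hypothetical names (MunchResult, munch_demo) to show which reading the compiler picks.

```cpp
#include <cassert>

struct MunchResult {
    int sum;      // value of the expression
    int a_after;  // value of a once the expression has been evaluated
};

inline MunchResult munch_demo() {
    int a = 1;
    int b = 2;
    // Tokenized as a ++ + b, that is (a++) + b, and not as a + (++b):
    // after the first +, the longest possible token is ++.
    int sum = a+++b;
    return MunchResult{sum, a};
}

inline void check_maximal_munch() {
    MunchResult r = munch_demo();
    assert(r.sum == 3);      // 1 + 2, using the old value of a
    assert(r.a_after == 2);  // a was post-incremented
}
```

By the same rule, a+++++b is tokenized as a ++ ++ + b and fails to compile, even though a ++ + ++ b would have been valid.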

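The sole exceptions to the maximal munch rule are:

 * If the next character begins a sequence of characters that could be the prefix and initial double quote of a raw string literal, the next preprocessing token is a raw string literal, taken to be the shortest sequence of characters that matches the raw string pattern.
 * If the next three characters are <:: and the subsequent character is neither : nor >, the < is treated as a preprocessing token by itself rather than as the first character of the alternative token <:.
 * Header name preprocessing tokens are only formed within a #include directive.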
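One exception that is easy to observe, assuming a C++11 or later compiler, concerns the character sequence <::. Under strict maximal munch, <: would be read as the alternative token for [, which would break a template argument list that starts with a globally qualified name. The function name make_names is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// "<::" is followed by "s", which is neither ":" nor ">", so "<" is a token
// by itself and "::" begins the qualified name ::std::string.
inline std::vector<::std::string> make_names() {
    return {"phase", "three"};
}

inline void check_angle_colon_colon() {
    assert(make_names().size() == 2);
    assert(make_names()[1] == "three");
}
```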

Phase 4
@1@ The preprocessor is executed. @2@ Each file introduced with the #include directive goes through phases 1 through 4, recursively. @3@ At the end of this phase, all preprocessor directives are removed from the source.
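The following sketch shows typical phase 4 work: macro expansion and conditional inclusion. All macro and function names are hypothetical. After this phase, none of the lines starting with # remain in the translation unit.

```cpp
#include <cassert>

#define BUFFER_SIZE 16
#define SQUARE(x) ((x) * (x))

// Only one branch survives preprocessing; the other is discarded together
// with the directives themselves.
#if BUFFER_SIZE > 8
constexpr bool large_buffer = true;
#else
constexpr bool large_buffer = false;
#endif

// Expands to ((16) * (16)) before compilation proper begins.
inline int area() { return SQUARE(BUFFER_SIZE); }

inline void check_preprocessing() {
    assert(large_buffer);
    assert(area() == 256);
}
```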

Phase 6
Adjacent string literals are concatenated.
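A minimal sketch of string literal concatenation follows. APP_VERSION and banner are hypothetical names; the macro is expanded in phase 4, and the resulting neighbouring literals are joined in phase 6.

```cpp
#include <cassert>
#include <cstring>

#define APP_VERSION "1.0"  // hypothetical macro, for illustration only

// Literals separated only by whitespace (including newlines) form one literal.
inline const char* banner() {
    return "translation "
           "phases "
           "v" APP_VERSION;
}

// Four characters plus the terminating null character.
static_assert(sizeof("ab" "cd") == 5, "adjacent literals form one array");

inline void check_concatenation() {
    assert(std::strcmp(banner(), "translation phases v1.0") == 0);
}
```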

Phase 7
Compilation takes place: each preprocessing token is converted to a token. The tokens are syntactically and semantically analyzed and translated as a translation unit.

Phase 8
Each translation unit is examined to produce a list of required template instantiations, including the ones requested by explicit instantiations. The definitions of the templates are located, and the required instantiations are performed to produce instantiation units.
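The two ways an instantiation can end up on that list can be sketched as follows. The template twice and the helper implicit_demo are hypothetical names.

```cpp
#include <cassert>

template <typename T>
T twice(T value) { return value + value; }

// Explicit instantiation definition: requests twice<int> whether or not
// the translation unit uses it.
template int twice<int>(int);

// Implicit instantiation: the call requires twice<double>, so that
// specialization is added to the list of required instantiations.
inline double implicit_demo() { return twice(1.5); }

inline void check_instantiation() {
    assert(twice(21) == 42);
    assert(implicit_demo() == 3.0);
}
```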

Phase 9
Translation units, instantiation units, and library components needed to satisfy external references are collected into a program image which contains information needed for execution in its execution environment.