Currently, C++ implementations really only have two "things" that correspond to code: the source code that we humans write and edit, and the assembly that the compiler spits out based on that source.
Because C++ templates are "reified", separate assembly is spit out for each template instantiation. For that reason, no assembly can be produced where a template is defined, only where it is used. That is why templates have to live in header files: so they can basically be copy-pasted into the point of use (which is all #include really does).
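To make that concrete, here is a minimal sketch. The function name twice and the two wrapper functions are made up for illustration; the point is only that the template definition alone gives the compiler nothing to generate, while each use with a new type produces a separate instantiation.

```cpp
// Everything here would normally sit in a header so that every user can see
// the full template definition.
#include <string>

// No code can be generated for this definition by itself:
// the compiler does not yet know what T is.
template <typename T>
T twice(const T& value) {
    return value + value;
}

// Each use with a distinct type triggers a separate instantiation,
// and each instantiation gets its own generated code.
inline int twice_int() {
    return twice(21);                  // instantiates twice<int>
}

inline std::string twice_string() {
    return twice(std::string("ab"));   // instantiates twice<std::string>
}
```

If the body of twice were hidden in a .cpp file, a translation unit calling twice(21) would see only a declaration and could not instantiate twice<int> itself.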
The idea is to have a third representation of the code. Imagine that the compiler has some kind of internal representation after it has parsed the code but before it starts producing assembly. The "thing" it produces is ultimately some kind of abstract syntax tree (AST). It's basically exactly your program, mapped from the form that is easiest for humans to a form that is easiest for computers.
This is very roughly the idea behind modules (or at least their implementation). You take your code and spit out some kind of file representing the AST. This AST is a full representation of your program, so it's completely lossless: it knows everything about the templates you declared, and so on. When a module is loaded, the compiler just loads this file and can use it exactly as if it had all the source available. The point is that turning human-readable source into this AST is actually quite an expensive step, so starting from the AST can be a lot faster.
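As a toy illustration of "write the AST out, read it back, and use it as if you had the source", here is a sketch with a tiny expression tree. This is not how any real compiler stores modules; the Node type, the prefix text format and all function names are assumptions made up for the example.

```cpp
#include <memory>
#include <sstream>
#include <string>
#include <utility>

// A toy AST for expressions such as (1 + 2) * 3. Real compiler ASTs are far richer.
struct Node {
    char op = 0;                     // '+' or '*' for inner nodes, 0 for a literal
    int value = 0;                   // only meaningful when op == 0
    std::unique_ptr<Node> lhs, rhs;
};

std::unique_ptr<Node> literal(int v) {
    auto n = std::make_unique<Node>();
    n->value = v;
    return n;
}

std::unique_ptr<Node> binary(char op, std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
    auto n = std::make_unique<Node>();
    n->op = op;
    n->lhs = std::move(l);
    n->rhs = std::move(r);
    return n;
}

// Write the tree in prefix order, e.g. "* + 1 2 3 ".
// Reading this back needs no precedence rules or parentheses.
void serialize(const Node& n, std::ostream& out) {
    if (n.op == 0) {
        out << n.value << ' ';
        return;
    }
    out << n.op << ' ';
    serialize(*n.lhs, out);
    serialize(*n.rhs, out);
}

// Rebuild the tree from the prefix form produced by serialize().
std::unique_ptr<Node> deserialize(std::istream& in) {
    std::string token;
    in >> token;
    if (token == "+" || token == "*") {
        auto l = deserialize(in);
        auto r = deserialize(in);
        return binary(token[0], std::move(l), std::move(r));
    }
    return literal(std::stoi(token));
}

// Stand-in for "codegen": it works on the tree, never on source text.
int evaluate(const Node& n) {
    if (n.op == 0) return n.value;
    const int l = evaluate(*n.lhs);
    const int r = evaluate(*n.rhs);
    return n.op == '+' ? l + r : l * r;
}
```

The reloaded tree evaluates exactly like the original one, which is the "lossless" property in miniature. The reader is also much simpler than a parser for the human-friendly form would be, since the hard decisions were already made when the tree was built.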
If you only have one translation unit, this would be slower. After all, parsing -> codegen is still faster than parsing -> serialize -> deserialize -> codegen. But say you have 10 translation units that all #include <vector>. With plain headers you will parse the code in vector 10 times. At this point, the extra cost of serializing/deserializing is offset by the fact that you only have to parse once (and deserializing can be made much faster than parsing; the data format can be designed specifically to make deserializing fast, whereas source code is designed to be readable, backwards compatible, etc.).
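The trade-off can be written down as a back-of-the-envelope cost model. This is only a sketch: the Costs struct and both functions are hypothetical, and the units are arbitrary rather than measurements of any compiler.

```cpp
#include <cstddef>

// All costs are in arbitrary, made-up units; only the shape of the
// comparison matters.
struct Costs {
    double parse;        // parse the header's source text
    double serialize;    // write the AST out, done once
    double deserialize;  // load the AST back, per translation unit
    double codegen;      // generate code, per translation unit
};

// Textual inclusion: every translation unit re-parses the header.
constexpr double cost_with_includes(const Costs& c, std::size_t units) {
    return units * (c.parse + c.codegen);
}

// Module-style: parse and serialize once, then each unit only deserializes.
constexpr double cost_with_modules(const Costs& c, std::size_t units) {
    return c.parse + c.serialize + units * (c.deserialize + c.codegen);
}
```

With a single translation unit the module path always loses, because it pays for parsing plus the extra serialize and deserialize steps. As the number of units grows, the one-time parse is amortized, and the module path wins as long as deserializing is cheaper than parsing.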
Precompiled headers are in some sense a sneak preview of modules: https://clang.llvm.org/docs/PCHInternals.html