I'd like to extend BDL's answer. It is not only about perspective-correct interpolation, it is also about clipping. The space in which gl_Position is supposed to be provided is called clip space, and this is before the division by w.
The (default) clip volume of OpenGL is defined in clip space as
-w <= x,y,z <= w (with w varying per vertex)
After the division by w we get
-1 <= x,y,z <= 1 (in NDC coordinates).
However, if you try to do the clipping after the division by w and check against that cube in NDC, you get a problem, because all clip-space points fulfilling
w <= x,y,z <= -w (in clip space, which is only possible for w <= 0)
will also fulfill the NDC constraint.
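To make the difference concrete, here is a minimal sketch in Python (the function names and the sample point are made up for illustration, they are not part of any OpenGL API). It compares the clip-space test with a naive test done after the division:

```python
def inside_clip_volume(x, y, z, w):
    """Clip-space test: -w <= x, y, z <= w (can only hold when w >= 0)."""
    return all(-w <= c <= w for c in (x, y, z))


def inside_ndc_cube(x, y, z, w):
    """Naive test after the perspective divide: -1 <= x/w, y/w, z/w <= 1."""
    return all(-1.0 <= c / w <= 1.0 for c in (x, y, z))


# Hypothetical clip-space point with negative w, i.e. behind the camera.
# It fulfills w <= x,y,z <= -w.
behind = (0.5, -0.25, 0.75, -1.0)

print(inside_clip_volume(*behind))  # False: correctly rejected
print(inside_ndc_cube(*behind))     # True: wrongly accepted after the divide
```

The divide throws away the sign of w, so the NDC test alone cannot tell this point apart from a visible one.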
The thing here is that points behind the camera will be transformed to somewhere in front of the camera, mirrored (since x/-1 is the same as -x/1). This also happens to the z coordinate. One might argue that this is irrelevant, because any point behind the camera is projected behind (in the sense of farther away than) the far plane, as per the construction of the typical projection matrix, so it will lie outside of the viewing volume in either case.
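A small numeric sketch of both effects, assuming the usual OpenGL-style perspective matrix (camera looking down -z, w_clip = -z_eye) and only tracking x, z and w. The near/far values and the helper names are my own choices for illustration:

```python
def project(x_eye, z_eye, near=1.0, far=3.0):
    """Apply the x, z and w rows of a typical OpenGL perspective matrix.

    Assumption: x scale factor of 1, camera looks down the negative z axis.
    """
    a = -(far + near) / (far - near)
    b = -2.0 * far * near / (far - near)
    x_clip = x_eye
    z_clip = a * z_eye + b
    w_clip = -z_eye
    return x_clip, z_clip, w_clip


def to_ndc(x_clip, z_clip, w_clip):
    """Perspective divide for the tracked components."""
    return x_clip / w_clip, z_clip / w_clip


# Same x, once in front of the camera (z_eye = -2) and once behind it (z_eye = +2).
print(to_ndc(*project(1.0, -2.0)))  # (0.5, 0.5)
print(to_ndc(*project(1.0, 2.0)))   # (-0.5, 3.5)
```

The point behind the camera ends up with a mirrored x and with an NDC z greater than 1, that is, beyond the far plane.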
But if you have a primitive where at least one point is inside the view volume and at least one point is behind the camera, you should have a primitive which also intersects the near plane. However, after the division by w, it will intersect the far plane instead! So clipping in NDC space, after the division, is much harder to get right. I tried to visualize this in this drawing:
(the drawing is to-scale, the depth range of projection is much shorter than anyone would typically use, to better illustrate the issue).
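The same situation as a rough sketch in code. The two clip-space endpoints are assumed values (they correspond to eye-space depths of -2 and +2 under a projection with near = 1 and far = 3), and clip_to_near_plane is a hypothetical helper, not how any particular GPU implements it:

```python
def clip_to_near_plane(a, b):
    """Clip segment a->b, given as clip-space (x, y, z, w), against z >= -w.

    Assumes a is on the visible side of the near plane and b is not.
    """
    da = a[2] + a[3]  # >= 0 on the visible side of the near plane
    db = b[2] + b[3]  # < 0 here
    t = da / (da - db)
    return tuple(pa + t * (pb - pa) for pa, pb in zip(a, b))


a = (1.0, 0.0, 1.0, 2.0)    # inside the view volume
b = (1.0, 0.0, -7.0, -2.0)  # behind the camera (w < 0)

# Clipping in clip space finds the intersection with the near plane.
p = clip_to_near_plane(a, b)
print(p)            # (1.0, 0.0, -1.0, 1.0)
print(p[2] / p[3])  # -1.0, the near plane in NDC

# Dividing first gives NDC depths 0.5 and 3.5: the straight segment between
# them crosses z = +1 (far plane) and never reaches z = -1 (near plane).
za, zb = a[2] / a[3], b[2] / b[3]
print(za, zb)       # 0.5 3.5
```

Because the interpolation happens on the undivided coordinates, the intersection point gets a positive w and divides cleanly to the near plane.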
The clipping is done as a fixed-function stage in hardware and it has to be done before the division, hence you should provide the correct clip-space coordinates to work on.
(Note: actual GPUs might not use an extra clipping stage at all; they might instead use a clipless rasterizer, as is speculated in Fabian Giesen's blog article. There are algorithms for this, such as the one by Olano and Greer (1997). However, this all works by doing the rasterization directly in homogeneous coordinates, so we still need the w...)