Step 2-4 are slightly off.
If there is no input in the standard I/O buffer, getchar()
calls a function to reload the buffer. On a Unix-like system, that normally ends up calling the read()
system call, and the read()
system call puts the process to sleep until there is input to be processed, or the kernel knows there will be no input to be processed (EOF). When the read returns, the code adjusts the data structures so that getchar()
knows how much data is available. You description implies polling; the standard I/O system does not poll for input.
Step 5 uses the adjusted pointers to return the correct values.
There really isn't an EOF character; it is a state, not a character. Even though you type Control-D or Control-Z to indicate 'EOF', that character is not inserted into the input stream. In fact, those characters cause the system to flush any typed characters that are still waiting for 'line editing' operations (like backspace) to change them so that they are made available to the read()
system call. If there are no such characters, then read()
returns 0 as the number of available characters, which means EOF. Then getchar()
returns the value EOF
(usually -1
but guaranteed to be negative whereas valid characters are guaranteed to be non-negative (zero or positive)).
So basically, rather than polling, is it that hitting Return causes a certain I/O interrupt, and then when the OS receives this, it wakes up any processes that are sleeping for I/O?
Yes, hitting Return triggers interrupts and the OS kernel processes them and wakes up processes that are waiting for the data. The terminal driver is woken by the kernel when interrupt occurs, and decides what to do with the character(s) that were just received. They may be stashed for further processing (canonical mode) or made available immediately (raw mode), etc. Assuming, of course, that the input is a terminal; if the input is from a disk file, it is simpler in many ways — or if it is a pipe, or …
Nominally, it isn't the terminal app that gets woken by the interrupt; it is the kernel that wakes first, then the shell running in the terminal app that is woken because there's data for it to read, and only when there's output does the terminal app get woken.
I say 'nominally' because there's an outside chance that in fact the terminal app does mediate the I/O via a pty (pseudo-tty), but I think it happens at the kernel level and the terminal application is involved fairly late in the process. There's a huge disconnect really between the keyboard where you type and the display where what you type appears.
See also Canonical vs non-canonical terminal input.