I am surprised that there is a problem, but it does seem to be a problem on Linux (I tested on Ubuntu 16.04 LTS running in a VMWare Fusion VM on my Mac) — but it was not a problem on my Mac running macOS 10.13.4 (High Sierra), and I wouldn't expect it to be a problem on other variants of Unix either.
As I noted in a comment:
There's an open file description and an open file descriptor behind each stream. When the process forks, the child has its own set of open file descriptors (and file streams), but each file descriptor in the child shares the open file description with the parent. IF (and that's a big 'if') the child process closing the file descriptors first did the equivalent of lseek(fd, 0, SEEK_SET)
, then that would also position the file descriptor for the parent process, and that could lead to an infinite loop. However, I've never heard of a library that does that seek; there's no reason to do it.
See POSIX open()
and fork()
for more information about open file descriptors and open file descriptions.
The open file descriptors are private to a process; the open file descriptions are shared by all copies of the file descriptor created by an initial 'open file' operation. One of the key properties of the open file description is the current seek position. That means that a child process can change the current seek position for a parent — because it is in the shared open file description.
neof97.c
I used the following code — a mildly adapted version of the original that compiles cleanly with rigorous compilation options:
#include "posixver.h"
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>
enum { MAX = 100 };
int main(void)
{
if (freopen("input.txt", "r", stdin) == 0)
return 1;
char s[MAX];
for (int i = 0; i < 30 && fgets(s, MAX, stdin) != NULL; i++)
{
// Commenting out this region fixes the issue
int status;
pid_t pid = fork();
if (pid == 0)
{
exit(0);
}
else
{
waitpid(pid, &status, 0);
}
// End region
printf("%s", s);
}
return 0;
}
One of the modifications limits the number of cycles (children) to just 30.
I used a data file with 4 lines of 20 random letters plus a newline (84 bytes total):
ywYaGKiRtAwzaBbuzvNb
eRsjPoBaIdxZZtJWfSty
uGnxGhSluywhlAEBIXNP
plRXLszVvPgZhAdTLlYe
I ran the command under strace
on Ubuntu:
$ strace -ff -o st-out -- neof97
ywYaGKiRtAwzaBbuzvNb
eRsjPoBaIdxZZtJWfSty
uGnxGhSluywhlAEBIXNP
plRXLszVvPgZhAdTLlYe
…
uGnxGhSluywhlAEBIXNP
plRXLszVvPgZhAdTLlYe
ywYaGKiRtAwzaBbuzvNb
eRsjPoBaIdxZZtJWfSty
$
There were 31 files with names of the form st-out.808##
where the hashes were 2-digit numbers. The main process file was quite large; the others were small, with one of the sizes 66, 110, 111, or 137:
$ cat st-out.80833
lseek(0, -63, SEEK_CUR) = 21
exit_group(0) = ?
+++ exited with 0 +++
$ cat st-out.80834
lseek(0, -42, SEEK_CUR) = -1 EINVAL (Invalid argument)
exit_group(0) = ?
+++ exited with 0 +++
$ cat st-out.80835
lseek(0, -21, SEEK_CUR) = 0
exit_group(0) = ?
+++ exited with 0 +++
$ cat st-out.80836
exit_group(0) = ?
+++ exited with 0 +++
$
It just so happened that the first 4 children each exhibited one of the four behaviours — and each further set of 4 children exhibited the same pattern.
This shows that three out of four of the children were indeed doing an lseek()
on standard input before exiting. Obviously, I have now seen a library do it. I have no idea why it is thought to be a good idea, though, but empirically, that is what is happening.
neof67.c
This version of the code, using a separate file stream (and file descriptor) and fopen()
instead of freopen()
also runs into the problem.
#include "posixver.h"
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>
enum { MAX = 100 };
int main(void)
{
FILE *fp = fopen("input.txt", "r");
if (fp == 0)
return 1;
char s[MAX];
for (int i = 0; i < 30 && fgets(s, MAX, fp) != NULL; i++)
{
// Commenting out this region fixes the issue
int status;
pid_t pid = fork();
if (pid == 0)
{
exit(0);
}
else
{
waitpid(pid, &status, 0);
}
// End region
printf("%s", s);
}
return 0;
}
This also exhibits the same behaviour, except that the file descriptor on which the seek occurs is 3
instead of 0
. So, two of my hypotheses are disproven — it's related to freopen()
and stdin
; both are shown incorrect by the second test code.
Preliminary diagnosis
IMO, this is a bug. You should not be able to run into this problem.
It is most likely a bug in the Linux (GNU C) library rather than the kernel. It is caused by the lseek()
in the child processes. It is not clear (because I've not gone to look at the source code) what the library is doing or why.
GLIBC Bug 23151
GLIBC Bug 23151 - A forked process with unclosed file does lseek before exit and can cause infinite loop in parent I/O.
The bug was created 2019-05-08 US/Pacific, and was closed as INVALID by 2018-05-09. The reason given was:
Please read
http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_05_01,
especially this paragraph:
Note that after a fork()
, two handles exist where one existed before. […]
POSIX
The complete section of POSIX referred to (apart from verbiage noting that this is not covered by the C standard) is this:
An open file description may be accessed through a file descriptor, which is created using functions such as open()
or pipe()
, or through a stream, which is created using functions such as fopen()
or popen()
. Either a file descriptor or a stream is called a "handle" on the open file description to which it refers; an open file description may have several handles.
Handles can be created or destroyed by explicit user action, without affecting the underlying open file description. Some of the ways to create them include fcntl()
, dup()
, fdopen()
, fileno()
, and fork()
. They can be destroyed by at least fclose()
, close()
, and the exec
functions.
A file descriptor that is never used in an operation that could affect the file offset (for example, read()
, write()
, or lseek()
) is not considered a handle for this discussion, but could give rise to one (for example, as a consequence of fdopen()
, dup()
, or fork()
). This exception does not include the file descriptor underlying a stream, whether created with fopen()
or fdopen()
, so long as it is not used directly by the application to affect the file offset. The read()
and write()
functions implicitly affect the file offset; lseek()
explicitly affects it.
The result of function calls involving any one handle (the "active handle") is defined elsewhere in this volume of POSIX.1-2017, but if two or more handles are used, and any one of them is a stream, the application shall ensure that their actions are coordinated as described below. If this is not done, the result is undefined.
A handle which is a stream is considered to be closed when either an fclose()
, or freopen()
with non-full(1) filename, is executed on it (for freopen()
</a