The lowest-level way to do this is with the same Linux system calls ls uses.
So look at the output of strace -efile,getdents ls:
execve("/bin/ls", ["ls"], [/* 72 vars */]) = 0
...
openat(AT_FDCWD, ".", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
getdents(3, /* 23 entries */, 32768) = 840
getdents(3, /* 0 entries */, 32768) = 0
...
getdents is a Linux-specific system call. The man page says that it's used under the hood by libc's readdir(3) POSIX API function.
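For illustration, here is a minimal sketch of calling the system call directly, loosely following the example in the getdents(2) man page. It assumes Linux and uses the 64-bit variant, getdents64, through syscall(2). The struct is declared by hand because glibc doesn't export it, and list_raw is a hypothetical helper name, not part of any library.

```c
#define _GNU_SOURCE
#include <dirent.h>       // DT_DIR
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

// Record layout the kernel writes for getdents64, as documented in getdents(2).
struct linux_dirent64 {
    uint64_t       d_ino;     // inode number
    int64_t        d_off;     // opaque position cookie
    unsigned short d_reclen;  // length of this record in bytes
    unsigned char  d_type;    // DT_* value, possibly DT_UNKNOWN
    char           d_name[];  // NUL-terminated filename
};

// Hypothetical helper: print every entry in `path` (appending / to directories
// when d_type says so). Returns the number of entries, or -1 on error.
long list_raw(const char *path) {
    int fd = open(path, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
    if (fd < 0)
        return -1;

    _Alignas(8) char buf[32768];
    long count = 0;
    for (;;) {
        long nread = syscall(SYS_getdents64, fd, buf, sizeof buf);
        if (nread < 0) {
            close(fd);
            return -1;
        }
        if (nread == 0)       // 0 bytes means end of directory
            break;
        // The buffer holds variable-length records; d_reclen steps to the next one.
        for (long pos = 0; pos < nread; ) {
            struct linux_dirent64 *d = (struct linux_dirent64 *)(buf + pos);
            printf("%s%s\n", d->d_name, d->d_type == DT_DIR ? "/" : "");
            pos += d->d_reclen;
            count++;
        }
    }
    close(fd);
    return count;
}
```

Calling list_raw(".") lists the current directory, including the . and .. entries if the filesystem returns them. Unlike the readdir version, this sketch has no stat fallback, so entries reported as DT_UNKNOWN or DT_LNK are printed without a trailing slash.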
The lowest-level portable way (portable to POSIX systems) is to use the libc functions to open a directory and read its entries. POSIX doesn't specify the exact system call interface for directories, unlike for non-directory files (open, read, write).
These functions:
DIR *opendir(const char *name);
struct dirent *readdir(DIR *dirp);
can be used like this:
// print all directories, and symlinks to directories, in the CWD.
// like sh -c 'ls -1UF -d */' (single-column output, no sorting, append a / to dir names)
// tested and works on Linux, with / without working d_type
#define _GNU_SOURCE // includes _BSD_SOURCE for DT_UNKNOWN etc.
#include <dirent.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <stdio.h>
#include <stdlib.h>
int main(void) {
    DIR *dirhandle = opendir(".");  // POSIX doesn't require this to be a plain file descriptor. Linux uses open(".", O_DIRECTORY); to implement this
    if (!dirhandle) {
        perror("opendir");
        return EXIT_FAILURE;
    }
    struct dirent *de;
    while ((de = readdir(dirhandle))) {  // NULL means end of directory (or an error)
        _Bool is_dir;
#ifdef _DIRENT_HAVE_D_TYPE
        if (de->d_type != DT_UNKNOWN && de->d_type != DT_LNK) {
            // don't have to stat if we have d_type info, unless it's a symlink (since we stat, not lstat)
            is_dir = (de->d_type == DT_DIR);
        } else
#endif
        {   // the only method if d_type isn't available,
            // otherwise this is a fallback for FSes where the kernel leaves it DT_UNKNOWN.
            struct stat stbuf;
            // stat follows symlinks, lstat doesn't.
            if (stat(de->d_name, &stbuf) != 0)
                continue;  // e.g. a dangling symlink: not a directory, so skip it
            is_dir = S_ISDIR(stbuf.st_mode);
        }

        if (is_dir) {
            printf("%s/\n", de->d_name);
        }
    }
    closedir(dirhandle);
    return 0;
}
There's also a fully compilable example of reading directory entries and printing file info in the Linux stat(3posix) man page (not the Linux stat(2) man page; it has a different example).
The man page for readdir(3) says the Linux declaration of struct dirent is:
struct dirent {
    ino_t          d_ino;       /* inode number */
    off_t          d_off;       /* not an offset; see NOTES */
    unsigned short d_reclen;    /* length of this record */
    unsigned char  d_type;      /* type of file; not supported
                                   by all filesystem types */
    char           d_name[256]; /* filename */
};
d_type is either DT_UNKNOWN, in which case you need to stat to learn whether the directory entry is itself a directory, or it's DT_DIR or something else, in which case you can be sure it is or isn't a directory without having to stat it. (The exception is DT_LNK when you want to follow symlinks, as in the code above: you still have to stat to find out what the link points to.)
Some filesystems, like EXT4 I think, and very recent XFS (with the new metadata version), keep type info in the directory, so it can be returned without having to load the inode from disk. This is a huge speedup for find -name: it doesn't have to stat anything to recurse through subdirs. But for filesystems that don't do this, d_type will always be DT_UNKNOWN, because filling it in would require reading all the inodes (which might not even be loaded from disk).
Sometimes you're just matching on filenames and don't need type info, so it would be bad if the kernel spent a lot of extra CPU time (or especially I/O time) filling in d_type when it's not cheap. d_type is just a performance shortcut; you always need a fallback (except maybe when writing for an embedded system where you know what FS you're using and that it always fills in d_type, and you have some way to detect the breakage when someone in the future tries to use this code on another FS type).
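One rough way to see whether a given filesystem fills in d_type is to count how many entries come back as DT_UNKNOWN. The following is only a sketch under that assumption; count_unknown_dtype is a hypothetical helper name, and a result of zero for one directory doesn't prove the filesystem always provides type info.

```c
#define _GNU_SOURCE       // exposes d_type and the DT_* constants on glibc
#include <dirent.h>
#include <stdio.h>

// Hypothetical helper: returns how many entries in `path` were reported as
// DT_UNKNOWN, or -1 if the directory can't be opened.
// *total receives the total number of entries seen.
long count_unknown_dtype(const char *path, long *total) {
    DIR *dir = opendir(path);
    if (!dir)
        return -1;

    long unknown = 0;
    *total = 0;
    struct dirent *de;
    while ((de = readdir(dir)) != NULL) {
        ++*total;
#ifdef _DIRENT_HAVE_D_TYPE
        if (de->d_type == DT_UNKNOWN)
            unknown++;
#else
        unknown++;        // no d_type field at all: every entry needs stat
#endif
    }
    closedir(dir);
    return unknown;
}
```

If the return value is greater than zero for a directory you care about, code that skips the stat fallback would misclassify those entries.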