Linux API without lies for children

Last update:

One day my friend and I were arguing about the UNIX philosophy. The starting point was my explanation of /proc/cpuinfo to beginners, which went along these lines:

/proc/cpuinfo isn’t a file stored on disk: the kernel simply presents many types of information as virtual files, following the “everything is a file” principle of the UNIX philosophy.

His position was that telling beginners that “everything is a file” would do more harm than good. Well, to be fair, all catchy maxims are misleading when repeated without any context or applicability limits.

“Everything is a file” is simply false in all widely used UNIX-like systems. Has always been false, actually. I don’t even think it’s a worthy goal at all. After all, one likely reason why it became so prominent in early UNIX is that early computers often exposed most of the hardware interface as memory-mapped IO, so it was an easy abstraction to implement. For modern hardware, that abstraction is very leaky.

But the question is how to talk to beginners about the system interfaces of UNIX-like OSes.

For complete beginners, “everything is a file” may be an acceptable approximation because most things they are going to interact with directly are files, such as /proc/ and /sys files and block devices. It’s also not really a mental model that will break when beginners learn about kernel APIs that aren’t file-like.
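To make the /proc example concrete, here is a minimal Python sketch that reads /proc/cpuinfo with ordinary file calls. The helper name read_virtual_file is made up for this illustration. On the systems I have seen, stat() reports a size of zero for such files even though reading them returns data, but treat that as an observation rather than a guarantee.

```python
import os

PATH = "/proc/cpuinfo"  # the virtual file from the explanation above (Linux only)

def read_virtual_file(path):
    """Return (size reported by stat, bytes actually read). Hypothetical helper."""
    reported_size = os.stat(path).st_size
    with open(path, "rb") as f:
        data = f.read()
    return reported_size, data

if __name__ == "__main__":
    reported, data = read_virtual_file(PATH)
    # The kernel generates the content at read time, so the two numbers
    # usually disagree: the reported size is typically 0.
    print(f"stat() reports {reported} bytes, read() returned {len(data)} bytes")
```

The program never needs to know that the file is virtual, which is exactly why the approximation works for beginners.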

But let me do an exercise in presenting the true picture to curious and somewhat more prepared beginners without any lies.

System calls

All modern CPUs have the concept of interrupts. The original purpose of interrupts was to allow software to handle asynchronous events such as keyboard input. When an event occurs, the CPU interrupts the normal execution of the program that it’s running and transfers control to an interrupt handler that processes the event, then resumes running the original program code.

An interrupt number is assigned to every device. If the keyboard is assigned interrupt 0x20, the operating system kernel can register an interrupt handler for it: put the keyboard input handling code at, say, memory address 0x1000 and tell the CPU that whenever interrupt 0x20 occurs, it should transfer control to the subroutine stored at that address.

That concept was later generalized to software interrupts. A classic hardware interrupt is a purely external event produced by hardware. A software interrupt is generated by a program when it executes a special CPU instruction such as int 0x10, which means “generate interrupt number 0x10”.

The software interrupt mechanism became widely used to allow onboard firmware (such as the BIOS in the x86 IBM PC) and operating system kernels to provide services to applications. For example, to print a character on screen, a program would put the character’s code in an agreed location, such as a CPU register, and call a firmware subroutine by generating a software interrupt.

In a fantasy assembly language:

mov 0x1, r0 # Tell the firmware what we want to do — store subroutine code 0x1 (print) in r0
mov 0x20, r1 # Store character code 0x20 (space) in the CPU register r1
int 0x10 # Raise an interrupt to call the firmware subroutine
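
The same flow can be sketched as a toy model in Python. This is only a simulation of the idea: a real CPU does the dispatch in hardware, and the names registers, vector_table, screen, and firmware_services are invented for this illustration.

```python
# Toy model only: real CPUs dispatch interrupts in hardware, not in Python.
registers = {"r0": 0, "r1": 0}  # the fantasy CPU registers from the listing above
vector_table = {}                # interrupt number -> handler subroutine
screen = []                      # stands in for the display

def register_handler(number, handler):
    """What firmware or a kernel does at boot: bind a handler to a number."""
    vector_table[number] = handler

def raise_interrupt(number):
    """What `int N` does: pause the program, run the handler, then resume."""
    vector_table[number]()

def firmware_services():
    """Hypothetical firmware handler: r0 selects the service, r1 is its argument."""
    if registers["r0"] == 0x1:  # service 0x1 = print a character
        screen.append(chr(registers["r1"]))

register_handler(0x10, firmware_services)

# Equivalent of the three fantasy assembly lines:
registers["r0"] = 0x1    # mov 0x1, r0
registers["r1"] = 0x20   # mov 0x20, r1
raise_interrupt(0x10)    # int 0x10

print(screen)  # [' ']
```

The program and the firmware never call each other by name. They only agree on an interrupt number and on which registers carry the request.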

Many modern CPUs have special instructions for system calls, such as syscall in AMD64 and svc (SuperVisor Call) in ARM.

Linux provides a set of system calls through that mechanism. A program stores a system call number and its parameters in predetermined locations (CPU registers, on most architectures), then uses a software interrupt or a dedicated system call instruction to ask the kernel to handle it.

<example: write to stdout>
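
One possible sketch of such an example, in Python. Python cannot execute a CPU instruction directly, so this goes through the C library’s generic syscall() function, which loads the registers and runs the system call instruction for us. The helper name raw_write and the small number table are my own for illustration. The table covers only two architectures and is not complete.

```python
import ctypes
import os
import platform

# Number of the write system call. It differs between CPU architectures.
# Illustrative subset only; other architectures need their own entries.
SYS_WRITE = {"x86_64": 1, "aarch64": 64}

libc = ctypes.CDLL(None, use_errno=True)  # the C library already loaded in this process
libc.syscall.restype = ctypes.c_long

def raw_write(fd, data):
    """Invoke write(fd, data, len(data)) by number; return the bytes written."""
    number = SYS_WRITE[platform.machine()]
    result = libc.syscall(ctypes.c_long(number), ctypes.c_int(fd),
                          ctypes.c_char_p(data), ctypes.c_size_t(len(data)))
    if result < 0:  # syscall() returns -1 and sets errno on failure
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return result

if __name__ == "__main__":
    raw_write(1, b"Hello from a system call\n")  # file descriptor 1 is standard output
```

Nothing in this sketch mentions a function called write. The kernel only sees a number and three arguments.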

An unusual feature of the Linux kernel is that its system call numbers and their parameters form a stable ABI — kernel developers make a promise to never change them, and Linus Torvalds rejects every change that would break that promise.

Since Linux is just the kernel and doesn’t have any official libraries for high-level programming languages, having a stable ABI is a critical requirement. If it wasn’t stable, Linux-based operating system projects would be unable to safely upgrade to newer kernel versions.

Many UNIX-like operating systems, such as FreeBSD and OpenBSD, have their kernel and userspace parts developed in lock-step, so their kernel ABIs are not stable because they don’t have to be.
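
As a rough illustration of what the stable ABI buys on Linux, the sketch below asks for the process ID twice: once through the library wrapper and once by raw system call number. The helper raw_getpid and the two-entry number table are assumptions made for this example, and it again relies on the C library’s generic syscall() function.

```python
import ctypes
import os
import platform

# Number of the getpid system call; illustrative subset, not a complete table.
SYS_GETPID = {"x86_64": 39, "aarch64": 172}

libc = ctypes.CDLL(None, use_errno=True)
libc.syscall.restype = ctypes.c_long

def raw_getpid():
    """Ask the kernel for our process ID by system call number."""
    number = SYS_GETPID[platform.machine()]
    return libc.syscall(ctypes.c_long(number))

if __name__ == "__main__":
    # Both paths end in the same kernel entry point, so the answers match.
    print(raw_getpid() == os.getpid())
```

Hard-coding the numbers like this is only reasonable because they are promised never to change for a given architecture.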

Signals

File-like interfaces

System information files

Device files

ioctl

Sockets

Netlink