Linux Systems Boot Process

The Linux Boot Process: From Power Button to Kernel

Part 1 — From power button to the kernel’s first breath

Part 2 — Leaving real mode, stepping through 32‑bit land, and arriving in 64‑bit

Part 3 — Unpacking the real kernel, fixing addresses, and why Linux sometimes moves itself

You press the power button. A second later a wall of text scrolls by, or a logo fades in, and eventually Linux appears. What happens in between is not magic. It is a careful handshake between tiny programs and a very literal CPU. This part follows that handshake until the very first line of C code inside the Linux kernel runs.

When power stabilizes, the CPU resets itself to a tiny, old‑fashioned mode called real mode. Real mode dates back to the original 8086 chip. The rules are simple on purpose. Memory addresses are built from two values the CPU keeps in special fast storage called registers. You combine a segment and an offset like this: physical address = segment × 16 + offset. For example, segment 0x07C0 with offset 0x0000 gives physical address 0x7C00.
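
The segment and offset arithmetic can be sketched in a few lines of Python. This is only an illustration of the address calculation; the function name real_mode_address is made up for this example.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment and a 16-bit offset into a real-mode address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shifting left by 4 bits is the same as multiplying by 16.
    return (segment << 4) + offset


# The same physical address can be written with different segment:offset pairs.
print(hex(real_mode_address(0x0000, 0x7C00)))  # 0x7c00
print(hex(real_mode_address(0x07C0, 0x0000)))  # 0x7c00
```

Because many segment and offset pairs land on the same byte, real‑mode code has to agree on one convention and stick to it.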

If you see numbers like 0xFFFFFFF0, that is hexadecimal. Hex is base 16. We write 0x in front to make that clear. 0x10 is 16 in everyday counting. 0x100000 is 1 megabyte. Hex lines up nicely with how hardware stores bits, which is why you see it everywhere in low‑level code.

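Right after reset the CPU jumps to a special address called the reset vector at 0xFFFFFFF0. Think of it as a permanent bookmark that says “start here.” That address is higher than an ordinary real‑mode segment and offset can reach. At reset the CPU gives the code segment a special starting base of 0xFFFF0000 and sets the instruction pointer to 0xFFF0, and the two add up to 0xFFFFFFF0. There is room for almost nothing at that address, so manufacturers put a short jump there that passes control to the firmware on your motherboard.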

Tiny explainer: register

A register is a tiny slot inside the CPU. It holds a number the CPU is using right now. Names like CS and IP are register names. CS means “code segment,” which marks the current neighborhood for instructions. IP means “instruction pointer,” which marks which instruction comes next.

BIOS stands for Basic Input Output System. It is the older style. BIOS does a quick health check called POST, looks at the boot order, and tries each device. If it finds a disk whose very first 512‑byte sector ends with the marker bytes 0x55 and 0xAA, it treats that device as bootable. BIOS copies that sector to memory at 0x7C00 and jumps there. That sector is tiny, so it usually knows only how to load the next, larger piece.
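
The bootable‑sector test is simple enough to sketch. The snippet below checks an in‑memory buffer rather than a real disk, and the names SECTOR_SIZE and is_bootable_sector are illustrative, not part of any firmware interface.

```python
SECTOR_SIZE = 512


def is_bootable_sector(sector: bytes) -> bool:
    """Return True if a 512-byte sector ends with the bytes 0x55, 0xAA."""
    return len(sector) == SECTOR_SIZE and sector[510:512] == b"\x55\xAA"


# Build a fake first sector in memory and stamp the marker at the end.
fake_sector = bytearray(SECTOR_SIZE)
fake_sector[510] = 0x55
fake_sector[511] = 0xAA

print(is_bootable_sector(bytes(fake_sector)))   # True
print(is_bootable_sector(bytes(SECTOR_SIZE)))   # False
```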

UEFI is the modern replacement. It still starts the machine, but it understands filesystems directly and can load bigger boot programs without the old “first sector” dance. UEFI also passes richer information to the operating system. Different path, same goal: hand control to a boot program that can load Linux.

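On most Linux systems that boot program is GRUB. GRUB loads the kernel image into memory, including a small real‑mode setup program at the front of it. GRUB also fills out a small structure called the setup header with useful facts: where it placed the kernel, where the command line lives, where the initrd is if you have one. Then it jumps into the setup program.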

The setup program starts with housekeeping. It lines up the segment registers so memory copies behave the same way every time. Names you’ll see here are CS for code, DS for data, and SS for stack. It also clears a single CPU bit called the “direction flag” so copy instructions move forward through memory.

It creates a stack. The stack is a last‑in, first‑out workbench where functions store temporary values. SS says which segment the stack uses. SP is the pointer to the current top of the stack.

It clears a region called BSS. BSS is where global variables that must start as zero live. C code assumes BSS is zero. The setup program writes zeros over that entire span to keep that promise.

If you passed earlyprintk on the kernel command line, the setup code also programs the serial port so it can print very early messages. This is useful when graphics are not ready yet.

Finally the setup program asks the firmware “how much usable RAM do we really have and where are the holes.” On old BIOS this is a call people often nickname e820, which returns a simple list of usable and reserved ranges. The kernel will use that list to avoid stepping on the firmware’s toes.
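
To make the “list of usable and reserved ranges” concrete, here is a small sketch that decodes such a list. It assumes the classic 20‑byte entry layout (64‑bit base, 64‑bit length, 32‑bit type, with type 1 meaning usable and type 2 meaning reserved). The sample map is invented for the example and the function names are illustrative.

```python
import struct

# One entry: base address, length, type (little-endian, 20 bytes).
E820_ENTRY = struct.Struct("<QQI")
E820_TYPES = {1: "usable", 2: "reserved"}


def parse_e820(raw: bytes):
    """Decode raw entries into (start, end, kind) tuples."""
    entries = []
    for base, length, kind in E820_ENTRY.iter_unpack(raw):
        label = E820_TYPES.get(kind, f"type {kind}")
        entries.append((base, base + length, label))
    return entries


def usable_bytes(entries) -> int:
    """Total size of all ranges marked usable."""
    return sum(end - start for start, end, kind in entries if kind == "usable")


# A made-up memory map: low RAM, a small reserved hole, then RAM above 1 MB.
raw_map = b"".join([
    E820_ENTRY.pack(0x00000000, 0x0009FC00, 1),
    E820_ENTRY.pack(0x0009FC00, 0x00000400, 2),
    E820_ENTRY.pack(0x00100000, 0x7FF00000, 1),
])

for start, end, kind in parse_e820(raw_map):
    print(f"{start:#012x} - {end:#012x}  {kind}")
```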

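With the stack and BSS ready, the setup code calls its first C function, which is literally named main. In the kernel source, the early serial console setup and the memory query described above are among the first things main does. We are still in 16‑bit real mode at this point. The next job is to leave it.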

Modern Linux on PCs runs in long mode, which is the 64‑bit mode of x86_64. You cannot jump there directly from real mode. The path is real mode to protected mode and then protected mode to long mode. This part covers that path and explains the vocabulary on the way.

Protected mode is the 32‑bit world introduced to get past the limits of the 1980s. It adds two central ideas.

The Global Descriptor Table, or GDT, is a short list of segment descriptions. A description says “this segment starts here, covers this much, and is allowed to do these things.” Linux keeps this simple. It uses a flat model, which means the base is zero and the size covers the whole 32‑bit space. When everything is flat, addresses look like plain numbers again.

The Interrupt Descriptor Table, or IDT, is the matching list for interrupts. Each entry tells the CPU which handler to run for a given interrupt or exception number.
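
A flat segment description can be packed by hand to see how little is in it. The sketch below follows the standard 8‑byte x86 descriptor layout; the helper name gdt_entry and the particular access and flag values are chosen for illustration and are not taken from the kernel source.

```python
def gdt_entry(base: int, limit: int, access: int, flags: int) -> int:
    """Pack one 8-byte x86 segment descriptor into a 64-bit integer."""
    descriptor = limit & 0xFFFF                   # limit bits 0..15
    descriptor |= (base & 0xFFFFFF) << 16         # base bits 0..23
    descriptor |= (access & 0xFF) << 40           # type and permission byte
    descriptor |= ((limit >> 16) & 0xF) << 48     # limit bits 16..19
    descriptor |= (flags & 0xF) << 52             # granularity and size flags
    descriptor |= ((base >> 24) & 0xFF) << 56     # base bits 24..31
    return descriptor


# Flat model: base 0, limit 0xFFFFF counted in 4 KB units = the whole 4 GB.
FLAGS_4K_32BIT = 0xC   # 4 KB granularity, 32-bit segment
CODE_ACCESS = 0x9A     # present, ring 0, executable, readable
DATA_ACCESS = 0x92     # present, ring 0, data, writable

print(hex(gdt_entry(0, 0xFFFFF, CODE_ACCESS, FLAGS_4K_32BIT)))  # 0xcf9a000000ffff
print(hex(gdt_entry(0, 0xFFFFF, DATA_ACCESS, FLAGS_4K_32BIT)))  # 0xcf92000000ffff
```

The code and data entries differ only in the access byte, which is why a flat GDT can be so short.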

The setup code turns off the noisy parts first. It disables maskable interrupts with a single instruction. It quiets the old PIC chips so hardware interrupts are fully blocked for a moment. It opens the A20 line. This is a historical quirk. Early PCs made addresses wrap at 1 megabyte. Opening A20 removes that wrap so higher addresses work like you expect. It resets the math coprocessor so the floating point state is clean.
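
The A20 wrap is easier to believe once you see the arithmetic. This sketch models a closed A20 line as address bit 20 being forced to zero; the function name bus_address is hypothetical.

```python
A20_BIT = 1 << 20  # the 21st address line, worth exactly 1 megabyte


def bus_address(segment: int, offset: int, a20_enabled: bool) -> int:
    """Address that reaches memory for a real-mode segment:offset pair."""
    linear = (segment << 4) + offset
    if not a20_enabled:
        # With A20 closed, bit 20 never reaches the bus, so addresses
        # at and just above 1 megabyte wrap around to low memory.
        linear &= ~A20_BIT
    return linear


print(hex(bus_address(0xFFFF, 0x0010, a20_enabled=False)))  # 0x0
print(hex(bus_address(0xFFFF, 0x0010, a20_enabled=True)))   # 0x100000
```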

Then it loads a tiny GDT with only what we need right now and a tiny IDT. Finally it sets a single bit named PE in a control register named CR0 and performs a far jump. That jump reloads the code segment from the GDT and locks in protected mode. The code on the other side of the jump then reloads the data and stack segments and fixes the stack pointer to match the new flat world.

Tiny explainer: control registers

The CPU has a few special registers that act as on/off switches. CR0 turns on protected mode. CR3 holds the address of the top of the page tables, which we will need in a second. CR4 enables a set of extended features such as larger page table entries.

Long mode needs a couple of things in place first. Paging must be on. Paging is the translator between virtual addresses and physical addresses. Programs use virtual addresses. The hardware reads and writes physical memory. Page tables map one to the other in fixed‑size chunks called pages. On PCs a normal page is 4 kilobytes. There are also bigger pages. Early in boot the kernel uses 2 megabyte pages to describe low memory quickly.
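
To see how a virtual address is carved up for the page tables, here is a small sketch. It assumes the common x86_64 four‑level layout with 9 index bits per level, which the text above does not spell out; the level names used as dictionary keys are the conventional ones, and the function name is illustrative.

```python
def split_virtual_address(vaddr: int, huge_2m: bool = False) -> dict:
    """Split a virtual address into page-table indexes and a page offset."""
    parts = {
        "pml4": (vaddr >> 39) & 0x1FF,  # top-level table index
        "pdpt": (vaddr >> 30) & 0x1FF,  # second level
        "pd": (vaddr >> 21) & 0x1FF,    # third level
    }
    if huge_2m:
        # A 2 megabyte page ends the walk one level early,
        # so the low 21 bits are the offset inside the page.
        parts["offset"] = vaddr & 0x1FFFFF
    else:
        parts["pt"] = (vaddr >> 12) & 0x1FF  # fourth level
        parts["offset"] = vaddr & 0xFFF      # low 12 bits inside a 4 KB page
    return parts


print(split_virtual_address(0x1234567))
print(split_virtual_address(0x1234567, huge_2m=True))
```

With 2 megabyte pages there is one less table to build, which is part of why early boot code prefers them.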

Long mode also needs a single bit named LME (Long Mode Enable) to be set in a special register called EFER. EFER is a model‑specific register, or MSR, which is a fancy way of saying “a register outside the everyday set that switches certain CPU features on and off.”

The 32‑bit prologue builds a small set of page tables that say “for this region, virtual equals physical.” That is called an identity map. It is enough to flip paging on safely.

To make this work the code enables PAE in CR4 so larger entries are used. It builds a minimal set of tables that cover low memory in 2 megabyte chunks. It writes the address of the top table into CR3. Paging is now ready.
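
An identity map built from 2 megabyte pages can be modeled as a plain list. This is a simplified sketch of one page directory only (the upper table levels are left out), and the flag combination used here is an assumption for the example rather than the exact value the kernel writes.

```python
PAGE_2M = 2 * 1024 * 1024
ENTRIES_PER_TABLE = 512
PRESENT, WRITABLE, HUGE = 1 << 0, 1 << 1, 1 << 7


def build_identity_pd(size_bytes: int) -> list:
    """Page-directory entries that map [0, size_bytes) onto itself."""
    count = (size_bytes + PAGE_2M - 1) // PAGE_2M
    if count > ENTRIES_PER_TABLE:
        raise ValueError("one page directory covers at most 1 GB")
    # Entry i points at physical address i * 2 MB, so virtual == physical.
    return [(i * PAGE_2M) | PRESENT | WRITABLE | HUGE for i in range(count)]


def translate(pd: list, vaddr: int) -> int:
    """Look up a virtual address the way the hardware would."""
    index = vaddr // PAGE_2M
    if index >= len(pd) or not pd[index] & PRESENT:
        raise LookupError(f"page fault at {vaddr:#x}")
    frame = pd[index] & ~(PAGE_2M - 1)       # strip the flag bits
    return frame | (vaddr & (PAGE_2M - 1))   # add the offset back


pd = build_identity_pd(1 << 30)  # cover the first gigabyte
print(hex(pd[1]))                # 0x200083
print(hex(translate(pd, 0x1234567)))  # 0x1234567
```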

Finally it sets LME in EFER, switches paging on by setting the PG bit in CR0, and performs a far return into a label that is written as 64‑bit code. Long mode is now active. Segments are still “flat,” but addresses and registers are 64 bits wide.
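
The order of the switches matters, so here is a toy model of it. The bit positions are the architectural ones (PE is bit 0 and PG is bit 31 of CR0, PAE is bit 5 of CR4, LME is bit 8 of EFER). The dictionary standing in for the CPU and the function names are purely illustrative, and setting CR0.PG is the step that actually turns paging on.

```python
CR0_PE = 1 << 0    # protected mode enable
CR0_PG = 1 << 31   # paging enable
CR4_PAE = 1 << 5   # physical address extension (larger table entries)
EFER_LME = 1 << 8  # long mode enable


def enter_long_mode(cpu: dict, top_table_addr: int) -> dict:
    """Apply the switches in the order described in the text."""
    cpu["cr4"] |= CR4_PAE          # 1. use the larger page-table entries
    cpu["cr3"] = top_table_addr    # 2. point the CPU at the top table
    cpu["efer"] |= EFER_LME        # 3. allow long mode
    cpu["cr0"] |= CR0_PG           # 4. turn paging on
    return cpu


def long_mode_active(cpu: dict) -> bool:
    """Long mode needs every one of these bits set at the same time."""
    return bool(
        cpu["cr0"] & CR0_PE
        and cpu["cr0"] & CR0_PG
        and cpu["cr4"] & CR4_PAE
        and cpu["efer"] & EFER_LME
    )


# Start from protected mode with paging off.
cpu = {"cr0": CR0_PE, "cr3": 0, "cr4": 0, "efer": 0}
print(long_mode_active(cpu))                            # False
print(long_mode_active(enter_long_mode(cpu, 0x1000)))   # True
```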

Why all the extra care?

Switching modes while a live system runs is like changing a car tire while rolling. The code blocks interruptions, prepares the minimum needed tables, flips the bit, and only then invites interrupts back. Slow and steady prevents weird half‑switched states.

We have a 64‑bit CPU with paging on and a compressed kernel in memory. Now the small 64‑bit stub does the practical work: get out of the way if needed, unpack the kernel, fix addresses if the kernel is not at its default spot, and jump.

The stub first figures out where it is actually running. Early code is linked as if it lived at address zero and then computes its real base at runtime. If the planned destination for the decompressed kernel would overlap the stub, it copies itself to a safe place.

It loads a minimal IDT with two handlers: one for page faults and one for NMI. A page fault happens when the CPU cannot find a mapping for a virtual address it just tried to use. In our early identity‑mapped world, the tiny page fault handler can add the missing mapping on the fly and continue. The NMI handler is there so a non‑maskable interrupt does not crash the machine while we are still bringing things up.

A C function named extract_kernel takes over. It sets aside a tiny heap for temporary buffers, prints the classic “Decompressing Linux...” line, and unpacks the kernel using whichever algorithm the kernel was built with. gzip, xz, zstd, lzo, and others all plug into the same wrapper.
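
The “same wrapper, different algorithm” idea can be sketched with a lookup table. This uses the decompressors that ship with Python's standard library, so it covers only some of the formats the kernel supports; the table and the function name extract are hypothetical and unrelated to the kernel's own code.

```python
import bz2
import gzip
import lzma

# One entry per supported format; every function has the same shape:
# compressed bytes in, original bytes out.
DECOMPRESSORS = {
    "gzip": gzip.decompress,
    "bzip2": bz2.decompress,
    "xz": lzma.decompress,
}


def extract(payload: bytes, method: str) -> bytes:
    """Unpack a payload with whichever algorithm it was built with."""
    try:
        decompress = DECOMPRESSORS[method]
    except KeyError:
        raise ValueError(f"unsupported compression method: {method}") from None
    return decompress(payload)


original = b"pretend this is a kernel image " * 8
print(extract(gzip.compress(original), "gzip") == original)  # True
print(extract(lzma.compress(original), "xz") == original)    # True
```

The caller never needs to know which algorithm is in use beyond picking the right table entry.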

When the bytes are out, the decompressor reads the kernel’s ELF headers. ELF, short for Executable and Linkable Format, is both a file format and a map. It says which chunks are code, which are data, and exactly where each chunk wants to live. The decompressor copies each chunk where it belongs.
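
Here is a rough sketch of “read the map, copy each chunk where it belongs.” It parses the standard 64‑bit little‑endian ELF file header and program headers and copies every loadable chunk into a byte buffer that stands in for physical memory. The tiny image is fabricated for the example, and load_segments is an illustrative name.

```python
import struct

ELF_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")  # ELF64 file header, 64 bytes
PROGRAM_HEADER = struct.Struct("<IIQQQQQQ")      # ELF64 program header, 56 bytes
PT_LOAD = 1                                      # "copy this chunk into memory"


def load_segments(image: bytes, memory: bytearray) -> int:
    """Copy each loadable chunk to its address and return the entry point."""
    fields = ELF_HEADER.unpack_from(image, 0)
    ident, entry, phoff = fields[0], fields[4], fields[5]
    phentsize, phnum = fields[9], fields[10]
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    for i in range(phnum):
        (p_type, _flags, offset, _vaddr, paddr,
         filesz, memsz, _align) = PROGRAM_HEADER.unpack_from(
            image, phoff + i * phentsize)
        if p_type != PT_LOAD:
            continue
        if memsz < filesz or paddr + memsz > len(memory):
            raise ValueError("segment does not fit")
        memory[paddr:paddr + filesz] = image[offset:offset + filesz]
        # Anything beyond the file data (for example BSS) starts as zero.
        memory[paddr + filesz:paddr + memsz] = bytes(memsz - filesz)
    return entry


# Fabricate a minimal image: one header, one program header, 8 bytes of data.
payload = b"KERNEL!!"
ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)  # 64-bit, little-endian
header = ELF_HEADER.pack(ident, 2, 62, 1, 0x100, ELF_HEADER.size, 0, 0,
                         ELF_HEADER.size, PROGRAM_HEADER.size, 1, 0, 0, 0)
phdr = PROGRAM_HEADER.pack(PT_LOAD, 5, ELF_HEADER.size + PROGRAM_HEADER.size,
                           0x100, 0x100, len(payload), len(payload) + 8, 0x10)
image = header + phdr + payload

memory = bytearray(b"\xff" * 0x200)
entry_point = load_segments(image, memory)
print(hex(entry_point))  # 0x100
```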

If the kernel is being loaded at a different address than it was built for, the decompressor applies relocations. A relocation is a small fix‑up that adjusts a pointer or an instruction that contains an address. The decompressor walks a list of these and patches each place so it points to the right spot in the address space we are actually using.
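
A relocation pass boils down to “add the same difference at every listed spot.” The sketch below uses a deliberately simplified format, a list of byte offsets that each hold a 64‑bit pointer, which is not the kernel's real relocation table layout. The addresses and the function name are made up for the example.

```python
import struct

MASK_64 = 0xFFFFFFFFFFFFFFFF


def apply_relocations(memory: bytearray, reloc_offsets, delta: int) -> None:
    """Patch every listed 64-bit pointer by the load-address difference."""
    for off in reloc_offsets:
        (value,) = struct.unpack_from("<Q", memory, off)
        struct.pack_into("<Q", memory, off, (value + delta) & MASK_64)


# Hypothetical numbers: built for one address, actually loaded 2 MB higher.
built_for = 0x1000000
loaded_at = 0x1200000

memory = bytearray(32)
struct.pack_into("<Q", memory, 8, built_for + 0x20)  # a pointer into the image
struct.pack_into("<Q", memory, 24, 0x1234)           # plain data, not listed

apply_relocations(memory, [8], loaded_at - built_for)
print(hex(struct.unpack_from("<Q", memory, 8)[0]))   # 0x1200020
```

Only the listed offsets change; ordinary data that merely looks like a number is left alone.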

When everything is in place, the decompressor returns the address of the real kernel’s entry point, and the stub jumps there, passing a pointer to the boot parameters. From that moment you are in the full kernel. After a little more early setup, the first architecture‑independent function you meet is start_kernel, and the big initialization begins.

You may see kASLR mentioned in kernel logs. That stands for Kernel Address Space Layout Randomization. The idea is simple. If attackers do not know where the kernel lives in memory, certain attacks get a lot harder.

The decompressor is where the random spot gets picked. Before it chooses, it builds a do‑not‑touch list. That includes the decompressor itself, the compressed image, the initial ramdisk, the boot parameters page, and the command line buffer. It can also include ranges you reserve with a memmap= option on the command line.
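
Choosing a random home while respecting a do‑not‑touch list can be sketched as “list every aligned slot, drop the ones that collide, pick one.” This is a simplified model, not the kernel's actual selection code; the region, sizes, and function names are invented for the example.

```python
import random


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if the half-open ranges [a_start, a_end) and [b_start, b_end) meet."""
    return a_start < b_end and b_start < a_end


def candidate_slots(region, image_size: int, align: int, avoid) -> list:
    """All aligned start addresses where the image fits and hits nothing."""
    start, end = region
    first = (start + align - 1) // align * align  # round up to the alignment
    slots = []
    for base in range(first, end - image_size + 1, align):
        if not any(overlaps(base, base + image_size, lo, hi) for lo, hi in avoid):
            slots.append(base)
    return slots


def pick_slot(region, image_size: int, align: int, avoid, rng=random):
    """Pick one safe slot at random, or None if nothing fits."""
    slots = candidate_slots(region, image_size, align, avoid)
    return rng.choice(slots) if slots else None


# Hypothetical layout: 16 MB of usable RAM, a 4 MB image, two reserved ranges.
region = (0x1000000, 0x2000000)
avoid = [(0x1200000, 0x1300000), (0x1C00000, 0x1C01000)]
safe = candidate_slots(region, 0x400000, 0x200000, avoid)
print([hex(s) for s in safe])  # ['0x1400000', '0x1600000', '0x1800000']
print(hex(pick_slot(region, 0x400000, 0x200000, avoid)))
```

The more slots survive the filter, the harder the final address is to guess.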
