The quest to make Linux bulletproof

Part 2 This is the second half of a feature about work undertaken to harden and improve Linux, beginning with part 1 here.

Commercial Unix was expensive so it was carefully tended – and indeed tendered. Linux is free so it has to fend for itself.

Linux itself was inspired by the tried and tested designs of the proprietary Unixes that preceded it – or predeceased it – which it drove into extinction. Some of their tech continues to make its way into Linux, and some is being reinvented, usually to get round IP issues. The goals are to make Linux more resilient, meaning fault-tolerant and self-healing, and in general to lower the cost of its maintenance.

Just as desktop distros get their core tech from the lucrative server ones, some of the methods being used started out in old enterprise Unixes, or are reimplementations of tools and methods from them, but that's only the beginning of the influence.

A starting point is one of the longest-standing bits of enterprise IT: databases. They've been around for longer than minicomputers and their contents are usually very valuable, so lots of time, effort, and money have gone into research into how to make them more resilient. A core property has been to make them transactional – once an important buzzword for big commercial databases, and something that later filtered down to the smaller ones. The idea is to make every alteration of your precious business data into a transaction. Ideally, it completes fully, but if it doesn't, you have a record of what was going to happen, so you can fully undo it, thus putting things back exactly as they were before.
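To make the undo idea concrete, here is a minimal sketch in Python. It is a toy, not how any real database does it: the Transaction class, the transfer function, and the account names are all invented for illustration. Every change first records the old value in an undo log, so a failure part-way through can be rewound.

```python
class Transaction:
    """Toy transaction: remembers the old value of every key it changes."""

    def __init__(self, store):
        self.store = store
        self.undo_log = []  # entries are (key, key_existed, old_value)

    def set(self, key, value):
        # Record what was there before, then apply the change.
        self.undo_log.append((key, key in self.store, self.store.get(key)))
        self.store[key] = value

    def rollback(self):
        # Undo the changes in reverse order, newest first.
        for key, existed, old_value in reversed(self.undo_log):
            if existed:
                self.store[key] = old_value
            else:
                self.store.pop(key, None)
        self.undo_log.clear()


def transfer(accounts, src, dst, amount):
    """Move money between two accounts: either both sides change or neither."""
    tx = Transaction(accounts)
    try:
        tx.set(src, accounts[src] - amount)
        if accounts[src] < 0:
            raise ValueError("insufficient funds")
        tx.set(dst, accounts[dst] + amount)
    except Exception:
        tx.rollback()  # put things back exactly as they were
        return False
    return True
```

A transfer that would overdraw the source account, or that names an account that does not exist, returns False and leaves every balance untouched.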

The rules for how to make this reliable come from the database world of the great Ted Codd, and have since been extended to operating systems. Notably, this means the ACID properties: Atomicity, Consistency, Isolation, and Durability.

In the Linux world, this first appeared with journaling file systems. First, some proprietary ones were hacked out of commercial Unixes: we reported on SGI's XFS in 1999, and IBM's JFS in 2000. The new journaled ext3 file system was merged into kernel 2.4.15 in November 2001, although an intrepid Reg correspondent was already trying it out shortly before. Apple added journaling to HFS+ the following year.

Now efforts are afoot to bring some degree of transactionality to software installation too, and that leads us to executive summaries of the approaches of some of the more significant players.

The SUSE family of distros has long been given to using less-than-mainstream disk file systems. Way back in the 1990s, it defaulted to using ReiserFS while most others used ext2 or ext3 – The Reg mentioned this relatively exotic file system 21 years ago. ReiserFS became a less desirable choice after its eponymous lead developer was convicted in 2008 of murdering his wife. SUSE was already offering Btrfs instead a decade ago, it became the default file system by 2014, and the company remains committed to it.

Although other snapshot-capable file systems are out there, this remains a key strength of Btrfs: while OpenZFS on Linux is a thing, and is under active development, it's not part of the Linux kernel. Red Hat has yet to commit to its own Stratis, and despite years of development the new bcachefs remains unfinished. For now, only Btrfs offers copy-on-write (COW) snapshots and is part of the Linux kernel.

The great advantage of COW snapshots is that they are very quick. Essentially, they enable the OS to make a near-instant backup of the state of a set of files: from the moment a snapshot is created, any writes to those files will be redirected to a new copy of the relevant files, held elsewhere. It's fast and basically invisible to other programs on the system.
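As a rough illustration of why this is cheap, here is a toy in-memory model in Python. It is only a sketch: the CowVolume class and its method names are made up for this example, and a real file system works on disk blocks and extents rather than Python dictionaries. The point is that a snapshot copies only the small table saying which block belongs to which file, while writes always go to fresh blocks.

```python
class CowVolume:
    """Toy copy-on-write volume: data blocks are never overwritten in place."""

    def __init__(self):
        self.blocks = {}     # block id -> bytes, immutable once written
        self.live = {}       # file name -> block id for the running system
        self.snapshots = {}  # label -> frozen copy of the name-to-block table
        self._next_id = 0

    def write(self, name, data):
        # Redirect the write to a new block and repoint the file at it.
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        self.live[name] = block_id

    def read(self, name, snapshot=None):
        table = self.live if snapshot is None else self.snapshots[snapshot]
        return self.blocks[table[name]]

    def snapshot(self, label):
        # Quick: copies the lookup table only, never the file contents.
        self.snapshots[label] = dict(self.live)

    def rollback(self, label):
        # Point the live system back at the blocks the snapshot refers to.
        self.live = dict(self.snapshots[label])
```

Writing a new version of a file after taking a snapshot leaves the old block untouched, so reading through the snapshot, or rolling back to it, still returns the old contents.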

SUSE's Snapper integrates Btrfs snapshots into package management. Whenever the OS's package manager is told to install some new software, it first makes a Btrfs snapshot. If anything doesn't work afterwards, the user can revert to the snapshot before the most recent update, and get a working system back again. Snapshot handling is integrated into the boot manager too so this also applies to kernel updates. It's a key selling point of openSUSE's rolling-release distro, Tumbleweed. Some other distros have added Snapper too, including the Debian-based SpiralLinux, the rolling-release Debian sid-based Siduction and the Arch Linux derivative Garuda.

But SUSE itself is pushing ahead with more sophisticated plans. Once independent from UK COBOL-shifter Micro Focus, SUSE chose to grow by acquisition, and soon snapped up Kubernetes merchants Rancher. With a newfound appreciation for containers, SUSE's next-generation OS, codenamed Adaptable Linux Platform or ALP, aims to increase reliability by moving beyond simple snapshots and making the root file system read-only. The only way to install software, including updates, is during a reboot, using a new command, transactional-update. The OS can check that all its services come back up without errors, and if some fail, it can revert to an older snapshot and reboot itself.
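The update-check-revert cycle can be sketched in a few lines of Python. This is a simulation under stated assumptions, not SUSE's implementation: the function update_with_rollback, its parameters, and the dictionaries standing in for root file systems are all hypothetical. The update is applied to a copy of the current snapshot, never to the running one, and the new snapshot is kept only if the health check passes.

```python
def update_with_rollback(snapshots, current, apply_update, health_check):
    """Return the index of the snapshot the system should end up running.

    snapshots    -- list of dicts, each standing in for a read-only root
    current      -- index of the snapshot that is running now
    apply_update -- function that takes a copy of a root and returns the updated root
    health_check -- function that returns True if services came up cleanly
    """
    # Work on a copy so the running snapshot is never modified.
    candidate = apply_update(dict(snapshots[current]))
    snapshots.append(candidate)
    new_index = len(snapshots) - 1

    # A real system would reboot into the new snapshot at this point.
    if health_check(candidate):
        return new_index

    # Something failed to come up: fall back to the last known-good snapshot.
    return current
```

If the check fails, the function hands back the old index and the broken snapshot is simply never booted again.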

If you have a cluster of hosts running lots of containers, this should not be too intrusive: an orchestration tool, such as the ubiquitous Kubernetes, can migrate "pods" of containers off that machine, apply updates, and then bring containers back when the machine comes back up and rejoins the cluster. It's less convenient for a non-clustered machine, but it may prove possible to work around this using tools akin to the existing Distrobox, which builds a conventional, read-write OS instance in a container on a machine running an immutable OS.

Other companies, with less stake in sophisticated file systems, are taking different approaches, but with similar ultimate goals. Another way to make software installation transactional is to move the functionality into the package manager, rather than the file system.

We have explored this in some depth, but we hope you'll forgive a potted recap. The main contenders are Snap and Flatpak. Snap is the more controversial because of the perception that it's proprietary, and the way that recent Ubuntu releases force it on Ubuntu users. The now official Ubuntu Unity remix adds in Flatpak support as standard. Linux Mint goes further: it completely removes Snap support and only offers Flatpak.

So what are the differences, and why?

A few years ago, and despite some controversy, Canonical was actively working on integrating OpenZFS into Ubuntu. However, more recently, it looks like Ubuntu's ZSys tool is being deprecated.

Canonical nonetheless remains firmly committed to its Snap packaging system, and offers its own immutable distribution, Ubuntu Core; we recently looked at Ubuntu Core 22. Like SUSE MicroOS, the root file system is read-only, and the conventional package manager is gone. Ubuntu Core only supports Snap, and even the kernel is packaged as one.

Ubuntu's Snap format survived from Canonical's phone and tablet version of Ubuntu, although the company's efforts to crowdfund the hardware failed. Each Snap is a single compressed file, which is versioned and digitally signed. Snaps are squashfs files mounted as loop devices containing the relevant binaries and all the dependencies specific to that version. This means that, for example, the same Firefox snap can run on multiple different versions of Ubuntu, reducing Canonical's workload when updating Firefox across multiple versions of Ubuntu that are still in support and receiving updates.

And importantly, because Snap works with monolithic single file packages, it doesn't need a fancy file system underneath to implement rollback. When a Snap is updated, the package manager retains the old version so installation can be rewound by simply unmounting the newer version and remounting an older one.
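The bookkeeping this needs is small, as the following Python sketch suggests. It is an illustration only, not how the Snap tooling is written: the PackageRevisions class, the keep limit, and the file names in the usage note are assumptions made for the example. Each revision is treated as one opaque image file, and rolling back just means marking an older retained image as the active one.

```python
class PackageRevisions:
    """Toy model of a package whose revisions are whole, single-file images."""

    def __init__(self, keep=2):
        self.keep = keep      # how many revisions to retain on disk
        self.installed = []   # image file names, oldest first
        self.active = None    # the image currently "mounted"

    def refresh(self, image):
        # Add the new revision, drop the oldest beyond the limit, switch to it.
        self.installed.append(image)
        self.installed = self.installed[-self.keep:]
        self.active = image

    def revert(self):
        # "Unmount" the active image and "remount" the one before it.
        index = self.installed.index(self.active)
        if index == 0:
            raise RuntimeError("no older revision retained")
        self.active = self.installed[index - 1]
```

After refreshing from a hypothetical firefox_1.snap to firefox_2.snap, a call to revert() makes firefox_1.snap active again without touching the contents of either file.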

Snap is in active development, and some things that users have issues with may yet change. When the Snap-packaged Firefox appeared in Ubuntu 21.10, some users saw very slow launch times; in response, Canonical has added a choice of compression algorithms, and moved Firefox to a compression scheme that decompresses quicker. All the Snap files are loop-mounted during bootup, which slows system startup a little. The Reg FOSS desk wouldn't put it past Canonical's boffins to add a feature in future that marks some Snaps as not being essential for system startup, so that they can be mounted on-demand each time the app they contain is first launched.

An aspect of some bits of Linux tech that is largely invisible from the outside world is the pervasive influence of the tools that the community use to build the software itself. For instance, much FOSS product documentation is written, edited, formatted and output using an approach known as Docs as Code. The tools are not perfect for the job – but they are rich, powerful, capable tools, they are free, and perhaps most of all, they are well-known, debugged and optimized by the same teams who are building the products themselves. Since documentation writers need to work closely with those teams anyway, the benefits outweigh the slightly clunky tooling. The technical writers get to know the same tools, which makes it easier to talk to, and work with, the developers.

Git is a core part of this. It's very clever, can do amazing things, and is famously hard to master. Back in 2002, the source code of the Linux kernel was already becoming unwieldy, to the point that its creator Linus Torvalds adopted a proprietary tool, BitKeeper, to manage it. This proved controversial, and in 2005 he wrote his own tool to replace it. For any readers who aren't speakers of British English, the tool's name, git, is UK slang for an annoying or uncooperative person.

What Git does, in brief, is synchronize all the files in a whole directory tree across different computers in different places. When one develop

Source: The Register
