A Deep Dive into NixOS: From Configuration to Boot
CS5250: Advanced Operating Systems

Chen Jingwen
A0111764L
National University of Singapore

Abstract

Mature operating systems (e.g. Windows, Fedora) are inherently stateful and imperative, accumulating layers of complexity each time software is installed or upgraded. This causes side effects, such as breaking existing software when a shared library is upgraded without maintaining backwards compatibility. NixOS is a Linux distribution designed to be purely functional, where building everything from the kernel to the web browser has no side effects. System configuration files are written in the Nix language, a lazy, functional, domain-specific language with a declarative syntax, and software packages are managed by the Nix package manager. A distinct feature of NixOS is the ability to declare the configuration of an entire system in one file, which is then used to build a bootable system deterministically. This report gives an overview of NixOS and its motivations, followed by a deep dive into how the configuration of an operating system can be derived from a single file.

Contents

1 Introduction
2 Motivation
    2.1 Multiple versions
    2.2 Destructive updates
    2.3 Rollback difficulties
    2.4 Non-atomic upgrades
    2.5 Inability to reproduce builds
3 NixOS Architecture
    3.1 Package specifications and the Nix expression language
        3.1.1 Nix expression language
        3.1.2 Derivations
    3.2 Nix store
        3.2.1 Cryptographic hash
        3.2.2 Source to binary deployment
        3.2.3 Nix database
    3.3 Nix package manager
        3.3.1 Installation
        3.3.2 Immutability
        3.3.3 Dependency management
    3.4 Putting it all together: NixOS
        3.4.1 configuration.nix
        3.4.2 System profiles
        3.4.3 Garbage collection
    3.5 Advantages
        3.5.1 Version management
        3.5.2 Atomicity
        3.5.3 Easy rollbacks
        3.5.4 Reproducibility
4 Deep Dive: Switching Configurations
    4.1 System configuration
    4.2 nixos-rebuild.sh
    4.3 nix-instantiate
    4.4 nix-env
    4.5 Configuration switch
    4.6 Observations
5 Experience feedback
6 Summary
7 Appendix
A NixOS configuration file

1 Introduction

NixOS is an operating system built in a purely functional manner on top of the Linux kernel. At its heart is Nix, a package management system like APT [1] on Ubuntu and RPM [2] on Red Hat. Nix differentiates itself from other package managers by attempting to solve issues plaguing popular modern operating systems, such as the inability to easily manage multiple versions of a package, roll back updates quickly, reproduce system installations deterministically, and resolve shared-dependency conflicts.

The core approach of Nix package management is to describe each package with a pure function, known as a Nix expression. In each package's Nix expression, there is a derivation that describes the steps to take to build the package from source, along with its dependencies. Evaluating the derivations builds the package and stores the resulting contents in the Nix store, an immutable content-addressable folder where the entry name contains a hash of all input arguments of the derivation. Nix uses this metadata to build a closure of the dependency graph, which can be evaluated lazily and deterministically, since every node in the graph has no side effects. The expressions are written in .nix files using the Nix expression language, a purely functional, lazy programming language.

NixOS is the result of taking the Nix package manager one step further by using it to build an entire operating system from scratch. It aims to extend Nix's features to manage and centralize the configuration of the system, which includes components such as the kernel, network and filesystem drivers, bootloader, and graphical environments.
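The store naming scheme described above, where an entry name contains a hash of all inputs of its derivation, can be illustrated with a toy model. The sketch below is not Nix's actual path computation, which differs in detail; the function name `store_path`, the fingerprint format, and the version numbers are assumptions made purely for illustration. It shows only the key property: identical inputs always map to the same path, while any change to a package or to one of its dependencies yields a new path, so several versions can sit side by side.

```python
import hashlib
import json


def store_path(name, version, build_inputs, store_root="/nix/store"):
    """Return a toy store path whose entry name embeds a hash of all inputs.

    Hypothetical helper for illustration only; real Nix hashes the full
    derivation, not this simplified fingerprint.
    """
    # Serialize the inputs deterministically so equal inputs hash equally.
    fingerprint = json.dumps(
        {"name": name, "version": version, "inputs": sorted(build_inputs)},
        sort_keys=True,
    )
    digest = hashlib.sha256(fingerprint.encode("utf-8")).hexdigest()[:32]
    return f"{store_root}/{digest}-{name}-{version}"


# Hypothetical packages: a shared library and two versions of a program
# that depends on it. The dependency's path is itself an input to the hash.
glibc = store_path("glibc", "2.25", [])
foobar_v1 = store_path("foobar", "1.0", [glibc])
foobar_v2 = store_path("foobar", "2.0", [glibc])

# The two versions get distinct entries, so neither overwrites the other.
print(foobar_v1)
print(foobar_v2)
```

Because the path of each dependency feeds into the hash of its dependents, rebuilding `glibc` with different inputs in this model would also change the paths of everything built against it, which is the intuition behind the closure of the dependency graph.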

Eelco Dolstra designed and implemented Nix for his PhD thesis [3]. Armijn Hemel designed the first prototype of NixOS for his Master's thesis [4], and it was further developed and documented by Dolstra, Löh and Pierron in a comprehensive paper in 2010 [5]. It is under development as an open source project on GitHub [6] and is free to download and use [7]. It is currently used in commercial deployments worldwide.

We will first discuss the issues facing the current state of operating systems. Then, we will give an overview of the parts of NixOS and how their designs overcome these issues. Finally, we will do a deep dive into the core of the NixOS system, the configuration file /etc/nixos/configuration.nix, and follow the steps NixOS takes to turn that file into a bootable system.

[1] Apt - Debian Wiki. Mar. 2017. url: https://wiki.debian.org/Apt.
[2] rpm.org - Home. Mar. 2017. url: http://rpm.org.
[3] Eelco Dolstra. "The Purely Functional Software Deployment Model". In: Utrecht University 56.12 (2006), p. 281. issn: 14968975. doi: 10.1007/s12630-009-9179-6. url: http://www.st.ewi.tudelft.nl/~dolstra/pubs/phd-thesis.pdf.
[4] Armijn Hemel. "NixOS: the Nix based operating system". In: (2006).
[5] Eelco Dolstra, Andres Löh, and Nicolas Pierron. "NixOS: A purely functional Linux distribution". In: Journal of Functional Programming 20.5-6 (2010), pp. 577-615. issn: 0956-7968. doi: 10.1017/S0956796810000195.
[6] Official Nix/Nixpkgs/NixOS. Apr. 2017. url: https://github.com/nixos.
[7] NixOS Linux. Mar. 2017. url: http://nixos.org.

2 Motivation

The fundamental architecture of modern and conventional operating systems is an imperative model: every action that the user or system takes to install, update or remove software is a stateful action that modifies the global state. New versions of packages overwrite older ones. Some packages are shared and used by other packages in the form of both static and dynamic dependencies. Packages are scattered across the filesystem hierarchy of the operating system.
For example, on Unix systems, packages are spread across directories such as /etc, /usr, /bin, /var and /lib. The more complex a system grows, the more difficult it is to keep track of where everything is. It is difficult to determine whether a file or directory is required by the system or user, or is an unneeded residual file left behind by a system update some time ago.

The design of the imperative model stems from the early days of computing. Decades of accumulated effort have built up so much inertia that it is difficult to reengineer operating systems to address the following issues.

2.1 Multiple versions

In a typical Unix system, installing a package foobar writes the compiled binary directly to /usr/bin/foobar, or to some other directory in the user's $PATH where binaries are located. Assuming that foobar-v1 is installed, updating it to some later version, e.g. foobar-v2, will overwrite the v1 binary at /usr/bin/foobar. This works well if only one version of the package is needed at any time, and most operating systems assume this by storing only one version of each software package in the system. However, if more than one version of the package is required by the user or system, there is no straightforward way to get around this assumption.

A common workaround is to include the version of the package directly in its name. This is seen in the Python packages, where the version numbers are part of the package name, i.e. python27 and python34. This approach works as intended, but does not scale well as the number of versions grows, and it copes poorly even with minor version bumps.

2.2 Destructive updates

The side effects of a package update in an imperative model might be cascading, non-obvious, and hard to detect. To build an intuition, let's treat the operating system like a program written in a general-purpose programming language.
Popular shared libraries such as glibc are like the program's global variables with shared mutable state. Different parts of the program are able to access and update these global variables via side effects, and if care is not taken, the system might end up in an unpredictable state. Users of dynamic-link libraries (DLLs) know this problem as DLL hell, where a shared dependency that their application relies on is unknowingly modified by a third party, leaving their application in a non-working state should there be an incompatible API change.

This issue extends to the metadata of the software packages, such as configuration files and registry values. It is up to the software developer of each package to determine how to safely transition the package's metadata and whether to maintain backwards compatibility.

2.3 Rollback difficulties

As a consequence of destructive updates and the lack of multiple versions coexisting in a system, rolling the system back to a previous state is not a simple task. It requires manually reverting the steps taken by the update process, which may not be possible if crucial data was lost along the way. Modern operating systems (e.g. Windows [8]) periodically take system restore snapshots, but these snapshots take up a large amount of space and are usually not temporally granular enough to minimize the amount of lost state. Maintaining the integrity of the system is then left to the user in the form of backups and version control systems, which are not simple tools for the average user.

2.4 Non-atomic upgrades

Upgrading an operating system is a perilous and complex process. During a system upgrade, users are told to ensure that their machines have enough battery power to sustain the entire operation, or to be plugged into an electrical socket.