Theseus: an Experiment in Operating System Structure and State Management

Kevin Boos (Rice University), Namitha Liyanage (Yale University), Ramla Ijaz (Rice University), Lin Zhong (Yale University)

Abstract

This paper describes an operating system (OS) called Theseus. Theseus is the result of multi-year experimentation to redesign and improve OS modularity by reducing the states one component holds for another, and to leverage a safe programming language, namely Rust, to shift as many OS responsibilities as possible to the compiler.

Theseus embodies two primary contributions. First, an OS structure in which many tiny components with clearly-defined, runtime-persistent bounds interact without holding states for each other. Second, an intralingual approach that realizes the OS itself using language-level mechanisms such that the compiler can enforce invariants about OS semantics.

Theseus's structure, intralingual design, and state management realize live evolution and fault recovery for core OS components in ways beyond that of existing works.

1 Introduction

We report an experimentation of OS structural design, state management, and implementation techniques that leverage the power of modern safe systems programming languages, namely Rust. This endeavor was initially motivated by studies of state spill [16]: one software component harboring changed states as a result of handling an interaction from another component, such that their future correctness depends on said states. Prevalent in modern systems software, state spill leads to fate sharing between otherwise modularized and isolated components and thus hinders the realization of desirable computing goals such as evolvability and availability. For example, state spill in Android system services causes the entire userspace frameworks to crash upon a system service failure, losing the states and progress of all applications, even those not using the failed service [16]. Reliable microkernels further attest that management of states spilled into OS services is a barrier to fault tolerance [21] and live update [28].

Evolvability and availability of systems software are crucial in environments where reliability is necessary yet hardware redundancy is expensive or impossible. For example, systems software updates must be painstakingly applied without downtime or lost execution context in pacemakers [26] and space probes [25, 62]. Even in datacenters, where network switches are replicated for reliability, switch software failures and maintenance updates still lead to network outages [27, 48].

On the quest to determine to what extent state spill can be avoided in OS code, we chose to write an OS from scratch. We were drawn to Rust because its ownership model provides a convenient mechanism for implementing isolation and zero-cost state transfer between OS components. Our initial OS-building experience led to two important realizations. First, mitigating state spill, or better state management in general, necessitates a rethinking of OS structure because state spill (by definition) depends on how the OS is modularized. Second, modern systems programming languages like Rust can be used not just to write safe OS code but also to statically ensure certain correctness invariants for OS behaviors.

The outcome of our experimentation is Theseus OS, which makes two contributions to systems software design and implementation.

First, Theseus has a novel OS structure of many tiny components with clearly-defined, runtime-persistent bounds. The system maintains metadata about and tracks interdependencies between components, which facilitates live evolution and fault recovery of these components (§3).

Second, and more importantly, Theseus contributes the intralingual OS design approach, which entails matching the OS's execution environment to the runtime model of its implementation language and implementing the OS itself using language-level mechanisms. Through intralingual design, Theseus empowers the compiler to apply its safety checks to OS code with no gaps in its understanding of code behavior, and shifts semantic errors from runtime failures into compile-time errors, both to a greater degree than existing OSes. Intralingual design goes beyond safety, enabling the compiler to statically check OS semantic invariants and assume resource bookkeeping duties. This is elaborated in §4.

Theseus's structure and intralingual design naturally reduce states the OS must maintain, reducing state spill between its components. We describe Theseus's state management techniques to further mitigate the effects of state spill in §5.

To demonstrate the utility of Theseus's design, we implement live evolution and fault recovery (for availability) within it (§6). With this, we posit that Theseus is well-suited for high-end embedded systems and datacenter components, where availability is needed in the absence of or in addition to hardware redundancy. Therein, Theseus's limitations of being a new OS and needing safe-language programs have a lesser impact, as applications can be co-developed with the OS in an environment under a single operator's control.

We evaluate how well Theseus achieves these goals in §7. Through a set of case studies, we show that Theseus can easily and arbitrarily live evolve core system components in ways beyond prior live update works, e.g., joint application-kernel evolution, or evolution of microkernel-level components. As Theseus can gracefully handle language-level faults (panics in Rust), we demonstrate Theseus's ability to tolerate more challenging transient hardware faults that manifest in the OS core. To this end, we present a study of fault manifestation and recovery in Theseus and a comparison with MINIX 3 of fault recovery for components that necessarily exist inside the microkernel. Although performance is not a primary goal of Theseus, we find that its intralingual and spill-free designs do not impose a glaring performance penalty, but that the impact varies across subsystems.

Theseus is currently implemented on x86_64 with support for most hardware features, such as multicore processing, preemptive multitasking, SIMD extensions, basic networking and disk I/O, and graphical displays. It represents roughly four person-years of effort and comprises ~38000 lines of from-scratch Rust code, 900 lines of bootstrap assembly code, 246 crates of which 176 are first-party, and 72 unsafe code blocks or statements across 21 crates, most of which are for port I/O or special register access.

However, Theseus is far less complete than commercial systems, or experimental ones such as Singularity [33] and Barrelfish [8] that have undergone substantially more development. For example, Theseus currently lacks POSIX support and a full standard library. Thus, we do not make claims about certain OS aspects, e.g., efficiency or security; this paper focuses on Theseus's structure and intralingual design and the ensuing benefits for live evolution and fault recovery.

Theseus's code and documentation are open-source [61].

2 Rust Language Background

The Rust programming language [40] is designed to provide strong type and memory safety guarantees at compile time, combining the power and expressiveness of a high-level managed language with the C-like efficiency of no garbage collection or underlying runtime. Theseus leverages many Rust features to realize an intralingual, safe OS design and employs the crate, Rust's project container and translation unit, for source-level modularity. A crate contains source code and a dependency manifest. Theseus does not use Rust's standard library but does use its fundamental core and alloc libraries.

     1  fn main() {
     2      let hel: &str;
     3      {
     4          let hello = String::from("hello!");
     5          // consume(hello); // → "value moved" error in L6
     6          let borrowed_str: &str = &hello;
     7          hel = substr(borrowed_str);
     8      }
     9      // print!("{}", hel); // → lifetime error
    10  }
    11  fn substr<'a>(input_str: &'a str) -> &'a str {
    12      &input_str[0..3] // return value has lifetime 'a
    13  }
    14  fn consume(owned_string: String) {...}

When the owner's scope ends, e.g., at the end of a lexical block, the owned value is dropped (released) by virtue of the compiler inserting a call to its destructor. Destructors in Rust are realized by implementing the Drop trait for a given type, in which a custom drop handler can perform arbitrary actions beyond freeing memory. On L8 above, the hello string falls out of scope and is auto-deallocated by its drop handler.

Values can also be borrowed to obtain references to them (L6), and the lifetime of those references cannot outlast the lifetime of the owned value. The syntax in L11 gives the name 'a to the lifetime of the input_str argument, and specifies that the returned &str reference has that same lifetime 'a. That returned &str reference is assigned to hel in L7, which would result in a lifetime violation in L9 because hel would be used after the owned value it was originally borrowed from (hello) was dropped in L8. Rust's compiler includes a borrow checker to enforce these lifetime rules, as well as the core tenet of aliasing XOR mutability, in which there can be multiple immutable references or a single mutable reference to a value, but not both at once. This allows it to statically ensure memory safety for values on the stack and heap.

Theseus also extensively leverages Rust traits, a declaration of an abstract type that specifies the set of methods the type must implement, similar to polymorphic interfaces in OOP languages. Traits can be used to place bounds on generic type parameters. For example, the function fn print_str<T: Into<String>>(s: T){ } uses the trait bound T: Into<String> to specify that its argument named s must be of any abstract type T that can be converted into a String.

3 Theseus Overview and Design Principles

The overall design of Theseus specifies a system architecture consisting of many small distinct components, called cells, which can be composed and interchanged at runtime. A cell is a software-defined unit of modularity that serves as the core
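As a supplement to the destructor discussion in §2, the following minimal sketch shows a custom drop handler that does more than free memory. The Frame type, its fields, and the run_scope function are hypothetical names chosen for illustration and are not taken from Theseus. The sketch also uses Rust's standard library for convenience, which Theseus itself does not.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical resource type (not from Theseus): it records its own
/// release in a shared log so the drop order can be observed.
struct Frame {
    id: u32,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Frame {
    // Custom drop handler: the compiler inserts a call to this
    // when the owner of a Frame goes out of scope.
    fn drop(&mut self) {
        self.log
            .borrow_mut()
            .push(format!("released frame {}", self.id));
    }
}

/// Creates two frames in an inner block and returns the log
/// as it stands after that block has ended.
fn run_scope() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _a = Frame { id: 1, log: Rc::clone(&log) };
        let _b = Frame { id: 2, log: Rc::clone(&log) };
        log.borrow_mut().push("end of block".to_string());
    } // _b is dropped first, then _a (reverse declaration order)
    let entries = log.borrow().clone();
    entries
}

fn main() {
    for line in run_scope() {
        println!("{}", line);
    }
}
```

No explicit release call appears anywhere in run_scope: the bookkeeping happens at the closing brace of the inner block, which is the kind of duty the text describes shifting to the compiler.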

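The trait-bound example from §2 can be exercised with a short sketch. The print_str signature follows the one quoted in the text, but the function body and the String return value are assumptions added here so that the behavior is observable.

```rust
/// Accepts any type T that can be converted into a String.
/// The body and the return value are illustrative assumptions.
fn print_str<T: Into<String>>(s: T) -> String {
    let owned: String = s.into();
    println!("{}", owned);
    owned
}

fn main() {
    print_str("a string slice"); // &str implements Into<String>
    print_str(String::from("an owned String")); // String converts to itself
    print_str('c'); // char implements Into<String>
    // print_str(42u32); // would not compile: u32 does not implement Into<String>
}
```

The commented-out call is rejected at compile time rather than at runtime, because the bound is checked when the generic function is instantiated.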