doi:10.1145/2541883.2541895

Article development led by queue.acm.org What if all the software layers in a virtual appliance were compiled within the same safe, high-level language framework?

By Anil Madhavapeddy and David J. Scott Unikernels: The Rise of the Virtual Library

Cloud computing has been pioneering the business of renting computing resources in large data centers to multiple (and possibly competing) tenants. The basic enabling technology for the cloud is operating-system such as Xen1 or VMWare, which allows customers to multiplex virtual machines (VMs) on a shared cluster of physical machines. actions. They can be centrally backed Each VM presents as a self-contained up and migrated across different phys- computer, a standard operating- ical hosts without interrupting service. system kernel and running unmodified Today commercial providers such as applications just as if it were executing Amazon and Rackspace maintain vast on a physical machine. data centers that host millions of VMs. A key driver to the growth of cloud These cloud providers relieve their computing in the early days was server customers of the burden of managing consolidation. Existing applications data centers and achieve economies of were often installed on physical hosts scale, thereby lowering costs. that were individually underutilized, While operating-system virtual- and virtualization made it feasible to ization is undeniably useful, it adds pack them onto fewer hosts without yet another layer to an already highly requiring any modifications or code layered software stack now including: recompilation. VMs are also managed support for old physical protocols (for via software rather than physical example, disk standards developed

january 2014 | vol. 57 | no. 1 | communications of the acm 61 practice

in the 1980s such as IDE); irrelevant This problem has received a lot modular components that are flexible, optimizations (for example, disk el- of thought at the University of Cam- secure, and reusable in the style of a evator algorithms on SSD drives); bridge, both at the Computer Labora- library operating system. What would backward-compatible interfaces (for tory (where the Xen originat- the benefits be if all the software lay- example, Posix); user-space processes ed in 2003) and within the Xen Project ers in an appliance could be compiled and threads (in addition to VMs on a (custodian of the hypervisor that now within the same high-level language hypervisor); and managed-code run- powers the public cloud via companies framework instead of dynamically as- times (for example, OCaml, .NET, or such as Amazon and Rackspace). The sembling them on every boot? First, Java). All of these layers sit beneath the solution—dubbed MirageOS—has its some background information about application code. Are we really doomed ideas rooted in research concepts that appliances, library operating systems, to adding new layers of indirection have been around for decades but are and type-safe programming languages. and abstraction every few years, leav- only now viable to deploy at scale since The shift to single-purpose appli- ing future generations of program- the availability of cloud-computing re- ances. A typical VM running on the mers to become virtual archaeologists sources has become more widespread. cloud today contains a full operating- as they dig through hundreds of layers The goal of MirageOS is to restruc- system image: a kernel such as or of software emulation to debug even ture entire VMs—including all ker- Windows hosting a primary application the simplest applications?5,18 nel and user-space code—into more running in (for example, MySQL or Apache), along with second- Figure 1. Software layers and a stand-alone kernel compilation. ary services (for example, syslog or NTP) running concurrently. The generic soft- ware within each VM is initialized every Configuration Files Mirage Compiler time the VM is booted by reading con- application source code Application Binary figuration files from storage. configuration files Language Runtime hardware architecture Despite containing many flexible whole-system optimization layers of software, most deployed VMs Parallel Threads ultimately perform a single function User Processes Application Code such as acting as a database or Web OS Kernel Mirage Runtime specialized server. The shift toward single-purpose unikernel Hypervisor Hypervisor VMs is a reflection of just how easy it has become to deploy a new virtual Hardware Hardware } computer on demand. Even a decade ago, it would have taken more time and money to deploy a single (physical) ma- chine instance, so the single machine Figure 2. Logical workflow in MirageOS. would need to run multiple end-user applications and therefore be carefully configured to isolate the constituent OCaml code git device Programmer resolve services and users from each other. repositories drivers dependencies The software layers that form a VM have not yet caught up to this trend, dependency SAT git protocols analysis solver algorithms and this represents a real opportunity for optimization—not only in terms application full static type source source Branch-consistent of performance by adapting the appli- checking module source ance to its task, but also for improv- config application ing security by eliminating redundant compiler files Compiler functionality and reducing the attack surface of services running on the pub- config whole program tree http lic cloud. Doing so statically is a chal- optimization lenge, however, because of the struc- http ture of existing operating systems. linker Limitations of current operating systems. The modern hypervisor pro- xen service dns unikernel monitor dhcp vides a resource abstraction that can outputs be scaled dynamically—both verti- cally by adding memory and cores, openstack Compute binary and horizontally by spawning more Cloud repository Branch-consistent VMs. Many applications and operating binaries systems cannot fully utilize this capa- bility since they were designed before modern came about (and

62 communications of the acm | january 2014 | vol. 57 | no. 1 practice the physical analogues such as mem- existential questions about operating ory hotplug were never ubiquitous in systems. Several research groups have commodity hardware). Often, exter- proposed operating-system designs nal application-level load balancers based on an architecture known as a are added to traditional applications library operating system (or libOS). The running in VMs in order to make the The goal of first such systems were Exokernel6 and service respond elastically by spawn- MirageOS is to Nemesis10 in the late 1990s. In a libOS, ing new VMs when load increases. protection boundaries are pushed to Traditional systems, however, are not restructure entire the lowest hardware layers, resulting optimized for size or boot time (Win- VMs—including in: a set of libraries that implement dows might apply a number of patches mechanisms, such as those needed to at boot time, for example), so the load all kernel and drive hardware or talk network proto- balancer must compensate by keep- cols; and a set of policies that enforce ing idle VMs around to deal with load user-space code— access control and isolation in the ap- spikes, wasting resources and money. into more modular plication layer. Why couldn’t these problems with The libOS architecture has several operating systems simply be fixed? components that advantages over more conventional Modern operating systems are in- are flexible, secure, designs. For applications where per- tended to remain resolutely general formance—and especially predictable purpose to solve problems for a wide and reusable in performance—is required, a libOS audience. For example, Linux runs on the style of a library wins by allowing applications to ac- an incredibly diverse set of platforms, cess hardware resources directly with- from low-power mobile devices to operating system. out having to make repeated privilege high-end servers powering vast data transitions to move data between user centers. Compromising this flexibility space and kernel space. The libOS simply to help one class of users im- does not have a central networking prove application performance would service into which both high-priority not be acceptable. network packets (such as those from a On the other hand, a specialized videoconference call) and low-priority server appliance no longer requires packets (such as from a background an OS to act as a resource multiplexer file download) are forced to mix and since the hypervisor can do this at a interfere. Instead, libOS applications lower level. One obvious problem with have entirely separate queues, and this approach is that most existing packets mix together only when they code presumes the existence of large arrive at the network device. but rather calcified interfaces such as The libOS architecture has two big POSIX or the Win32 API. Another po- drawbacks. First, running multiple tential problem is that conventional applications side by side with strong operating systems provide services resource isolation is tricky (although such as a TCP/IP stack for communica- Nemesis did an admirable job of min- tion and a file-system interface for stor- imizing crosstalk between interactive ing persistent data: in our brave new applications). Second, device drivers world, where would these come from? must be rewritten to fit the new mod- The MirageOS architecture— el. The fast-moving world of commod- dubbed unikernels—is outlined in Fig- ity PC hardware meant that, no mat- ure 1. Unikernels are specialized OS ter how many graduate students were kernels that are written in a high-level tasked to write drivers, any research language and act as individual soft- libOS prototype was doomed to be- ware components. A full application come obsolete in a few short years. (or appliance) consists of a set of run- This approach worked only in the ning unikernels working together as a real-time operating-system space (for distributed system. MirageOS is based example, VxWorks) where hardware on the OCaml (http://ocaml.org) lan- support is narrower. guage and emits unikernels that run Happily, OS virtualization over- on the Xen hypervisor. To explain how comes these drawbacks on commod- it works, let’s look at a radical operat- ity hardware. A modern hypervisor ing-system architecture from the 1990s provides VMs with CPU time and that was clearly ahead of its time. strongly isolated virtual devices for Library operating system. This is not networking, block storage, USB, and the first time people have asked these PCI bridges. A libOS running as a VM

january 2014 | vol. 57 | no. 1 | communications of the acm 63 practice

needs to implement only drivers for MirageOS aims to unify these di- these virtual hardware devices and verse interfaces—both kernel and ap- can depend on the hypervisor to drive plication user spaces—into a single the real physical hardware. Isolation high-level language framework. Some between libOS applications can be of the benefits of modern program- achieved at low cost simply by using Although OS ming languages include: the hypervisor to spawn a fresh VM for virtualization has ˲˲ Static type checking. Compilers can each distinct application, leaving each classify program variables and func- VM free to be extremely specialized to made the libOS tions into types and reject code where its particular purpose. The hypervisor possible without a variable of one type is operated on layer imposes a much simpler, less as if it were a different type. Static fine-grained policy than a convention- needing an army type checking catches these errors at al operating system, since it just pro- compile time rather than runtime and vides a low-level interface consisting of device-driver provides a flexible way for a systems of virtual CPUs and memory pages, writers, protocol programmer to protect different parts rather than the and file-ori- of a program from each other without ented architecture found in conven- libraries are still depending solely on hardware mecha- tional operating systems. needed to replace nisms such as pag- Although OS virtualization has ing. The most obvious benefit of type made the libOS possible without the services checking is the resulting lack of memo- needing an army of device-driver writ- of a traditional ry errors such as buffer or integer over- ers, protocol libraries are still needed flows, which are still prevalent in the to replace the services of a traditional operating system. CERT (Computer Emergency Readi- operating system. Modern kernels ness Team) vulnerability database. A are written in C, which excels at low- more advanced use is capability-style level programs such as device drivers, access control,19 which can be entirely but lacks the abstraction facilities of enforced in a static type system such as higher-level languages and demands ML’s, as long as the code all runs with- careful manual tracking of resources in the same language runtime. such as memory buffers. As a result, ˲˲ Automatic . many applications contain memory- Runtime systems relieve programmers handling bugs, which often manifest of the burden of allocating and freeing as serious security vulnerabilities. Re- memory, while still permitting manual searchers have done an admirable job management of buffers (for example, of porting both Windows and Linux to for efficient I/O). Modern garbage col- a libOS model,16 but for us this provid- lectors are also designed to minimize ed the perfect excuse to explore a less application interruptions via incre- backward-compatible but more natu- mental and generational collection, rally integrated high-level language thus permitting their use in high-per- model. Figure 2 shows the logical work- formance systems construction.7,11 flow in MirageOS. Precise dependency ˲˲ Modules. When the code base tracking from source code (both local grows, modules partition it into logical and global libraries) and configuration components with well-defined inter- files lets the full provenance of the de- faces gluing them together. Modules ployed kernel binaries be recorded in help software development scale as immutable data stores, sufficient to internal implementation details can precisely recompile it on demand. be abstracted and the scope of a single Stronger programming abstractions. source-code change can be restricted. High-level languages are steadily gain- Some module systems, such as those ing ground in general application de- found in OCaml and Standard ML, are velopment and are increasingly used statically resolved at compilation time to glue components together via or- and are largely free of runtime costs. chestration frameworks (for example, The goal is to harness these module Puppet and Chef). Unfortunately, all systems to build whole systems, cross- this logic is typically scattered across ing traditional kernel and user-space software components and is written boundaries in one program. in several languages. As a result, it is ˲˲ Metaprogramming. If the runtime difficult to reason statically about the configuration of a system is partially whole system’s behavior just by analyz- understood at compile time, then a ing the source code. compiler can optimize the program

64 communications of the acm | january 2014 | vol. 57 | no. 1 practice much more than it would normally ing the OPAM package manager, with key reasons. It is a full-fledged sys- be able to. Without knowledge of the solvers from the Mancoosi project) to tems programming language with runtime configuration, the compiler’s search for compatible module imple- a flexible programming model that hands are tied, as the output program mentations from a published online supports functional, imperative, and must remain completely generic, just package set. Any mismatches in inter- object-oriented styles within a single, in case. The goal here is to unify config- faces are caught at compile time be- ML-inspired type system. It also fea- uration and code at compilation time cause of OCaml’s static type checking. tures a portable single-threaded run- and eliminate waste before deploying ˲˲ The compiler can then output a time that makes it ideal for porting to the public cloud. full stand-alone kernel instead of just to restricted environments such as Together, these features signifi- a executable. These unikernels a barebones Xen VM. The compiler cantly simplify the construction of are single-purpose libOS VMs that heavily emphasizes static type check- large-scale systems: managed memory perform only the task defined in their ing, and the resulting binaries are eliminates many resource leaks, type application source and configuration fast native code with minimal run- inference results in more succinct files, and they depend on the hypervi- time type information. Principal type source code, static type checking veri- sor to provide resource multiplexing inference allows type annotations to fies that code matches some abstrac- and isolation. Even the bootloader, be safely omitted, and the module tion criteria at compilation time rather which has to set up the virtual mem- system is among the most powerful than execution time, and module sys- ory page tables and initialize the lan- in a general-purpose programming tems allow the manipulation of this guage runtime, is written as a simple language in terms of permitting flex- code at the scales demanded by a full library. Each application links to the ible and safe code reuse and refactor- OS and application stack. specific set of libraries it needs and ing. Finally, there were several exam- can glue them together in applica- ples of large-scale uses of OCaml in A Functional Prototype In OCaml tion-specific ways. industry14 and within Xen itself,17 and We started building the MirageOS ˲˲ The specialized unikernels are the positive results were encouraging prototype in 2008 with the intention deployed on the public cloud. They before embarking on the large mul- of understanding how far we could have a significantly smaller attack sur- tiyear project that MirageOS turned unify the programming models un- face than the conventional virtualized out to be. derlying library operating systems equivalents and are more resource effi- Modular operating-system librar- and cloud-service deployment. The cient in terms of boot time, binary size, ies. OCaml supports the definition first design decision was to adopt the and runtime performance. of module signatures (a collection of principles behind functional program- Why OCaml? OCaml is the sole data-type and function declarations) ming to construct the prototype. Func- base language for MirageOS for a few that abstract the implementation of tional programming has an empha- sis on supporting abstractions that Figure 3. A partial module graph for a static Web server. make it easier to track mutability in programs, and previous research has shown that this need not come at the XenBoot : UnixELF : XEN UNIX price of performance.11 The challenge was to identify the correct modular abstractions to sup- XenStore : XenEvtchn : port the expression of an entire oper- XS(XEN) EVENT(XEN) ating system and application software stack in a single manageable struc- XenRing : ture. MirageOS has since grown into a RING(XS)(EVENT) mature set of almost 100 open-source libraries that implement a wide array of functionality, and it is starting to be XenNetif : UnixTuntap : NETIF(XEN)(RING) NETIF(UNIX) integrated into commercial products such as Citrix XenServer.17 Figure 2 illustrates MirageOS’s de- MirNet : sign. It grants the compiler a much ETH(NETIF) broader view of source-code dependen- cies than a conventional cloud deploy- MirTCP : UnixSocket : ment cycle: TCP(ETH) TCP(UNIX) ˲˲ All source-code dependencies of the input application are explicitly tracked, including all the libraries re- MyHomePage : Cohttp : quired to implement kernel function- APP(HTTP)(...) HTTP(TCP) ality. MirageOS includes a build system that internally uses a SAT solver (us-

january 2014 | vol. 57 | no. 1 | communications of the acm 65 practice

module structures (definitions of con- Consider a simple example. Fig- can recompile to switch away from us- crete data types and functions). Mod- ure 3 shows a partial module graph ing Unix sockets to the OCaml TCP/ ules can be parameterized over sig- for a static Web server. Libraries are IP stack shown by MirTCP in Figure natures, creating functors that define a module graph that abstract over 3. This still requires a Unix kernel operations across other data types. operating-system functionality, and but only as a to deliver Ether- (For more information about OC- the OPAM package manager solves net frames to the Web-server process aml modules, functors, and objects, constraints over the target architec- (which now incorporates an OCaml see Real World OCaml, published by ture. The application MyHomePage TCP/IP stack as part of the applica- O’Reilly and available at https://real- depends on a HTTP signature that tion). The last compilation strategy worldocaml.org.) We applied the OC- is provided by the Cohttp library. drops the dependency on Unix entire- aml module system to breaking the Developers just starting out want to ly and recompiles the MirNet mod- usually monolithic OS kernel func- explore their code interactively us- ule to link directly to a Xen network tionality into discrete units. This lets ing a Unix-style development envi- driver, which in turn pulls in all the programmers build code that can be ronment. The Cohttp library needs dependencies it needs to boot on Xen. progressively specialized as it is be- a TCP implementation to satisfy its This progressive recompilation is key ing written, starting from a process module signature, which can be pro- to the usability of MirageOS, since we in a familiar Unix environment and vided by the UnixSocket library. can evolve from the tried-and-tested ending up with a specialized cloud When the programmers are satis- Linux or FreeBSD functionality gradu- unikernel running on Xen. fied their HTTP logic is working, they ally but still end up with specialized unikernels that can be deployed on Figure 4. Virtual address space of the MirageOS Xen unikernel target. the public cloud. This modular op- erating-system structure has led to a number of other back ends being

IP header implemented in a similar vein to Xen. text and data MirageOS now has experimental back TCP header 4kB ends that implement a simulator in

120TB foreign tx data NS3 (for large-scale functional test- grants IP header ing), a FreeBSD kernel module back end, and even a JavaScript target by TCP header 4kB using the js_of_ocaml compiler. rx data reserved A natural consequence of this modu- by Xen larity is that it is easier to write por- table code that defines exactly what it 64-bit virtual address space 8x512 needs from a target platform, which is 4kB sectors increasingly difficult on modern oper- OCaml ating systems with the lack of a mod- minor heap ern equivalent of Posix (which has led Linux, FreeBSD, Mac OS X, and Win- 128TB OCaml dows to have numerous incompatible major heap APIs for high-performance services). Configuration and state. Libraries in MirageOS are designed in as func- tional a style as possible: they are re- entrant with explicit state handles, Figure 5. Boot time. which are in turn serializable so that they can be reconstructed explicitly. Linux PV+Apache Linux PV An application consists of a set of li- Mirage braries plus some configuration code,

3 all linked together. The configuration is structured as a tree roughly like a 2 , with subdirectories be- ing parsed by each library to initialize

Time (s) 1 their own values (reminiscent of the Plan 9 operating system). All of this is 0 connected by metaprogramming—an 128 256 512 1024 2048 3072 8 16 32 64 OCaml program generates more OC- aml code that is compiled until the Memory size (MiB) desired target is reached. The metaprogramming extends into storage as well. If an application

66 communications of the acm | january 2014 | vol. 57 | no. 1 practice uses a small set of files (which would dynamic linking support that requires normally require all the baggage of executable mappings to be added after block devices and a file system), Mira- the VM has booted.13 geOS can convert it into a static OCaml module that satisfies the file-system Benefits module signature, relieving it of the One downside Consider the life cycle of a traditional need for an external storage depen- to a unikernel application. First the source code is dency. The entire MirageOS home page compiled to a binary. Later, the binary (http://openmirage.org) is served in is the burden is loaded into memory and an OS pro- this manner. it places on cess is created to execute it. The first One (deliberate) consequence of thing the running process will do is metaprogramming is that large blocks the cloud read its configuration file and special- of functionality may be entirely miss- ize itself to the environment it finds ing from the output binary. This makes orchestration itself in. Many different applications dynamic reconfiguration of the most layers because will run exactly the same binary, ob- specialized targets impossible, and tained from the same binary package, a configuration change requires the of the need but with different configuration files. unikernel to be relinked. The lines of to schedule These configuration files are effectively active (that is, post-configuration) code additional program code, except they involved in a MirageOS Web server are many more are normally written in ad hoc languag- shown in Table 1, giving a sense of the VMs with es and interpreted at runtime rather small amount of code involved in such than compiled. a recompilation. greater churn. Deployment and management. Linking the Xen unikernel. In a Configuration is a considerable over- conventional OS, application source head in managing the deployment of code is first compiled into object files a large cloud-hosted service. The tra- via a native-code compiler and then ditional split between the compiled handed off to a linker that generates (code) and interpreted (configuration) an executable binary. After compila- is unnecessary with unikernel com- tion, a dynamic linker loads the exe- pilation. Application configuration is cutable and any shared libraries into code—perhaps as an embedded do- a process with its own address space. main-specific language—and the com- The process can then communicate piler can analyze and optimize across with the outside world by system the whole unikernel. calls, mediated by the operating-sys- In MirageOS, rather than treating tem kernel. Within the kernel, vari- the database, Web server, and so on, ous subsystems such as the network as independent applications that must stack or virtual memory system pro- be connected by configuration files, cess system calls and interacts with they are treated as libraries within a the hardware. single application, allowing the ap- In MirageOS, the OCaml compiler plication developer to configure them receives the source code for an en- using either simple library calls for tire kernel’s worth of code and links dynamic parameters or metaprogram- it into a stand-alone native-code ob- ming tools for static parameters. This ject file. It is linked against a minimal has the useful effect of making con- runtime that provides boot support figuration decisions explicit and pro- and the garbage collector. There is no grammable in a host language rather preemptive threading, and the kernel than manipulating many ad hoc text is event driven via an I/O loop that files and thus benefiting from static- polls Xen devices. analysis tools and the compiler’s type The Xen unikernel compilation checker. The result is a big reduction in derives its performance benefit from the effort needed to configure complex the fact that the running kernel has a multiservice application VMs. single virtual address space, designed One downside to a unikernel is the to run only the OCaml runtime. The vir- burden it places on the cloud orchestra- tual address space of the MirageOS Xen tion layers because of the need to sched- unikernel target is shown in Figure 4. ule many more VMs with greater churn Since all configuration information (since every reconfiguration requires is explicitly part of the compilation, the VM to be redeployed). The popular there is no longer a need for the usual orchestration implementations have

january 2014 | vol. 57 | no. 1 | communications of the acm 67 practice

grown rather organically in recent years and rented. At the same time, mult- down Linux kernel and MirageOS are and consist of many distributed compo- itenant services suffer from variability similar, but the inefficiency creeps into nents that are not only difficult to man- in load that encourages rapid scaling Linux as soon as it has to initialize the age, but also relatively high in latency of deployments—both up to meet cur- user-space applications. The MirageOS and resource consumption. rent demand and down to avoid wast- unikernel is ready to serve traffic as One of the first production uses for ing money. In MirageOS, features that soon as it boots. MirageOS is to fix the cloud-manage- are not used in a particular build are The MLton20 compiler pioneered ment stacks by evolving the OCaml not included, and whole-system op- WPO (whole program optimization), code within XenServer17 toward the timization techniques can be used to where an application and all of its li- structured unikernel worldview. This eliminate waste at compilation time braries are optimized together. In the turns the monolithic management rather than deployment time. In the libOS world, a whole program is ac- layer into a more agile set of intercom- most specialized mode, all configura- tually a whole operating system: this municating VMs that can be sched- tion files are statically evaluated, en- technique can now optimize all the way uled and restarted independently. abling extensive dead-code elimina- from application-level code to low-level MirageOS makes constructing these tion at the cost of having to recompile device drivers. Traditional systems es- single-purpose VMs easy: they are to reconfigure the service. chew WPO in favor of dynamic link- first built and tested as regular Unix The small binary size of the uniker- ing, sometimes in combination with applications before flipping a switch nels (on the order of hundreds of ki- JIT (just-in-time) compiling, where a and relinking against the Xen kernel lobytes in many cases) makes deploy- program is analyzed dynamically, and libraries (http://openmirage.org/blog/ ment to remote data centers across the optimized code is generated on the fly. xenstore-stub-domain). When they are Internet much smoother. Boot time is Whole-program, compile-time optimi- combined with Xen driver domains,3 also easily less than a second, making zation is more appropriate for cloud they can dramatically increase the it feasible to boot a unikernel in re- applications that care about resource security and robustness of the cloud- sponse to incoming network packets. efficiency and reducing their attack management stack. Figure 5 shows the comparison surface. Other research elaborates on Resource efficiency and custom- between the boot time of a service in the security benefits.13 ization. The cloud is an environment MirageOS and a Linux/Apache distri- An interesting recent trend is a where all resource usage is metered bution. The boot time of a stripped- move toward operating-system con- tainers in which each container is man- Table 1. Approximate size of libraries used by a typical MirageOS unikernel running aged by the same operating-system a Web server. kernel but with an isolated file system, network, and process group. Contain- Library C/kLOC OCaml/kLOC ers are quick to create since there is no Boot 18 0 need to boot a new kernel, and they are OCaml runtime 20 0 fully compatible with existing kernel threads 5 27 interfaces. However, these gains are interdomain comms trace 1 made at the cost of reduced security network driver 0 1 and isolation; unikernels share only TCP/IP trace 12 the minimal hypervisor services via a block driver 0 1 small API, which is easy to understand HTTP 0 11 and audit. Unikernels demonstrate Total 43 52 that layering language runtimes onto a hypervisor is a viable alternative to lightweight containers. A new frontier of portability. The Table 2. Other unikernel implementations. structure of MirageOS libraries shown in Figure 3 explicitly encodes what the library needs from its execution envi- Unikernel Language Targets ronment. While this has convention- 13 Mirage OCaml Xen, kFreeBSD, POSIX, WWW/js ally meant a Posix-like kernel and user Drawbridge17 C Windows “picoprocess” space, it is now possible to compile OC- HalVM8 Haskell Xen aml into more foreign environments, ErlangOnXen Erlang Xen including FreeBSD kernel modules, Ja- OSv2 C/Java Xen, KVM vaScript running in the browser, or (as GUK Java Xen the Scala language does) directly tar- geting the Java Virtual Machine (JVM). NetBSD “rump”9 C Xen, Linux kernel, POSIX Some care is still required for ex- 14 C++ Xen ClickOS ecution properties that are not ab- stractable in the OCaml type system. For example, floating-point numbers

68 communications of the acm | january 2014 | vol. 57 | no. 1 practice

are generally forbidden when running been exploring use cases ranging from References as a kernel module; thus, a modified managing personal data4 and facilitat- 1. barham, P. et al. Xen and the art of virtualization. In th 15 Proceedings of the 19 ACM Symposium on Operating compiler emits a type error if floating- ing anonymous communication, to Systems Principles (2003), 164–177. point code is used when compiling for building software-defined data-center 2. cloudius Systems. OSv; https://github.com/cloudius- systems/osv. that hardware target. infrastructure. 3. colp, P. et al. A. Breaking up is hard to do: Security Other third-party OCaml code of- and functionality in a commodity hypervisor. In Proceedings of the 23rd ACM Symposium on Operating ten exhibits a similar structure, mak- Acknowledgments Systems Principles (2011), 189–202. ing it much easier to work under Mi- The MirageOS effort has been a large 4. crowcroft, J., Madhavapeddy, A., Schwarzkopf, M., Hong, T. and Mortier, R. Unclouded vision. In rageOS. For example, Arakoon (http:// one and would not be possible with- Proceedings of the International Conference on arakoon.org) is a distributed key-value out the intellectual and financial Distributed Computing and Networking, 29–40. 5. Eisenstadt, M. My hairiest bug war stories. Commun. store that implements an efficient support of several sources. The core ACM 40, 4 (Apr. 1997), 30–37. 6. Engler, D. R., Kaashoek, M. F. and O’Toole, Jr., multi-Paxos consensus algorithm. team of Richard Mortier, Thomas J. : An operating system architecture The source-code patch to compile it Gazagnaire, Jonatham Ludlam, Haris for application-level resource management. In Proceedings of the 15th ACM Symposium on Operating under MirageOS touched just two files Rotsos, Balraj Singh, and Vincent Ber- Systems Principles, (1995), 251–266. and was restricted to adding a new nardoff have toiled to help us build 7. Eriksen, M. Your server as a function. In Proceedings of the 7th Workshop on Programming Languages and module definition that mapped the the clean-slate OCaml code, with con- Operating Systems, (2013), 5:1–5:7. Arakoon back-end storage to the Xen stant support and feedback from Jon 8. galois Inc. The Haskell Lightweight Virtual Machine (HaLVM) source archive; https://github.com/GaloisInc/ block driver interface. Crowcroft, Steven Hand, Ian Leslie, HaLVM. Derek McAuley, Yaron Minsky, An- 9. kantee, A. Flexible operating system internals: The design and implementation of the anykernel and Unikernels in the Wild drew Moore, Simon Moore, Alan My- rump kernels. Ph.D. thesis, Aalto University, Espoo, MirageOS is certainly not the only croft, Peter G. Neumann, and Robert Finland, 2012. 10. Leslie, I.M. et al. The design and implementation of an unikernel that has emerged in the past N.M. Watson. Space prevents us from operating system to support distributed multimedia few years, although it is perhaps the fully acknowledging all those who applications. IEEE Journal of Selected Areas in Communications 14, 7 (1996), 1280–1297. most extreme in terms of exploring contributed to this effort. We encour- 11. madhavapeddy, A., Ho, A., Deegan, T., Scott, D. and the clean-slate design space. Table 2 age readers to visit http://queue.acm. Sohan, R. Melange: Creating a “functional” Internet. SIGOPS Operating Systems Review 41, 3 (2007), shows some of the other systems that org for our full list. 101–114. 8 12. madhavapeddy, A., Mortier, R., Crowcroft, J. and Hand, build unikernels. HalVM is the clos- This work was primarily supported S. Multiscale not multicore: Efficient heterogeneous est to the MirageOS philosophy, but by Horizon Digital Economy Research, cloud computing. In Proceedings of ACM-BCS Visions of Computer Science. Electronic Workshops in it is based on the famously pure and RCUK grant EP/G065802/1. A portion Computing, (Edinburgh, U.K., 2010). lazy Haskell language rather than the was sponsored by DARPA (Defense Ad- 13. madhavapeddy, A. et al. Unikernels: Library operating systems for the cloud. In Proceedings of the 18th strictly evaluated OCaml. On the other vanced Research Projects Agency) and International Conference on Architectural Support end of the spectrum, OSv2 and rump AFRL (Air Force Research Laboratory), for Programming Languages and Operating Systems, 9 (2013), 461–472. kernels provide a compatibility layer under contract FA8750-11-C-0249. The 14. minsky, Y. OCaml for the masses. Commun. ACM 54, for existing applications, and deem- views, opinions, and/or findings con- 11 (Nov. 2011), 53–58. 15. mortier, R., Madhavapeddy, A., Hong, T., Murray, D. phasize the programming model im- tained in this report are those of the au- and Schwarzkopf, M. Using dust clouds to enhance provements and type safety that guides thors and should not be interpreted as anonymous communication. In Proceedings of the 18th 16 International Workshop on Security Protocols (2010). MirageOS. The Drawbridge project representing the official views or poli- 16. Porter, D.E., Boyd-Wickizer, S., Howell, J., Olinsky, R. converts Windows into a libOS with cies, either expressed or implied, of and Hunt, G.C. Rethinking the library OS from the top down. In Proceedings of the 16th International just a reported 16MB overhead per ap- DARPA or the Department of Defense. Conference on Architectural Support for Programming plication, but it exposes higher-level MirageOS is available freely at Languages and Operating Systems, (2011), 291–304. 17. scott, D., Sharp, R., Gazagnaire, T. and Madhavapeddy, interfaces than Xen (such as threads http://openmirage.org. We welcome A. Using functional programming within an industrial product group: perspectives and perceptions. In and I/O streams) to gain this efficiency. feedback, patches, and improbable Proceedings of the 15th ACM SIGPLAN International Ultimately, the public cloud should stunts using it. Conference on Functional Programming, (2010), 87–92. 18. Vinge, V. A Fire Upon the Deep. Tor Books, New York, support all these emerging projects as NY, 1992. first-class citizens just as Linux and 19. Watson, R.N.M. A decade of OS access-control Related articles extensibility. Commun. ACM 56, 2 (Feb. 2013), 52–63. Windows are today. The Xen Project 20. Weeks, S. Whole-program compilation in MLton. In aims to support a brave new world of on queue.acm.org Proceedings of the 2006 Workshop on ML. dust clouds: tiny one-shot VMs that Self-Healing in Modern Operating Systems run on hypervisors with far greater Michael W. Shapiro Anil Madhavapeddy is a Senior Research Fellow at the http://queue.acm.org/detail.cfm?id=1039537 University of Cambridge, based in the Systems Research density than is currently possible and Group. He was on the original team that developed the that self-scale their resource needs by Erlang for Concurrent Programming Xen hypervisor and XenServer management toolstack Jim Larson written in OCaml. XenServer has been deployed on constantly calling into the cloud fab- millions of hosts and drives critical infrastructure for many http://queue.acm.org/detail.cfm?id=1454463 ric. The libOS principles underlying Fortune 500 companies. MirageOS mean it is not limited to Passing a Language through Dave Scott is a Principal Architect at Citrix Systems where he works on the XenServer virtualization platform. running on a hypervisor platform— the Eye of a Needle Roberto Ierusalimschy, Luiz Henrique de His focus is on improving XenServer reliability and many of the libraries can be compiled performance through exploiting advances in open-source Figueiredo and Waldemar Celes software and high-level languages. to multiscale environments,12 ranging http://queue.acm.org/detail.cfm?id=1983083 from ARM smartphones to bare-metal OCaml for the Masses kernel modules. To understand the Yaron Minsky implications of this flexibility, we have http://queue.acm.org/detail.cfm?id=2038036 © 2014 ACM 0001-0782/14/01 $15.00

january 2014 | vol. 57 | no. 1 | communications of the acm 69