
Utilizing Rust Programming Language for EFI-Based Bootloader Design

Tunç Uzlu and Ediz Şaykol

Beykent University, Department of Engineering, Ayazağa, 34396, İstanbul, Turkey [email protected]; [email protected]

Abstract

Rust is a systems programming language that offers memory safety at zero cost, without any runtime penalty, unlike languages such as C, C++ or Cyclone. Systems programming languages are mainly used for low-level tasks such as the design of operating system components, web browsers and game engines, and for time-critical work such as signal processing. The main disadvantages of the existing systems languages are that they are memory unsafe and low level in design. Rust, on the other hand, offers high-level language semantics and an advanced standard library with a modern feature set that includes most of the features and functional elements of widely used programming languages. Moreover, Rust can be used as a scripting language like Python, as a functional language like Haskell, or as a low-level procedural language like C or C++, since it is both imperative and functional and has no garbage collector. These design choices make Rust a suitable match for low-level tasks while retaining high-level scalability and maintainability. Meanwhile, the EFI (Extensible Firmware Interface) specification aims to remove the limitations of legacy hardware. Hence, we present our analysis of utilizing the Rust language in EFI-based bootloader design for the x86 architecture, to make it useful for both practitioners and technology developers.

1 Introduction

The Rust programming language was designed by Graydon Hoare and is currently being actively developed by Mozilla. It is also being used in Servo, the Mozilla Foundation's massively parallel web browser engine, which is unique because of its concurrent rendering and compositing steps [JML15]. As a systems programming language, Rust is able to operate at the lowest level without any runtime penalty, like C, C++ or Cyclone, but unlike these languages it offers complete memory safety. Systems programming languages are crucial for time-critical tasks such as signal processing and for bare-metal work such as the design of operating system components, web browsers and game engines, where raw hardware access is a must. Existing systems languages are memory unsafe and extremely complicated because of their low-level nature.

Systems programming languages are considered essential for embedded systems because of low memory availability and scarce processing power [HL15]. The main reason is that they lack a garbage collector, which causes non-deterministic delays [LAC+15]. Garbage collectors provide very safe memory management, but they use memory space poorly and run unpredictably in the background. A garbage collector also affects energy consumption, which is very important for embedded systems, and changes the operating system design paradigm [LMP+05].

Rust, on the other hand, is both an imperative and a functional language. Although it combines different flavors, Rust is highly scalable, with a capable standard library comparable to those of high-level languages. Rich language semantics and the absence of a garbage collector make Rust a suitable match for low-level tasks while keeping maintainability high. Moreover, Rust can be used as a scripting language like Python or as a functional language like Haskell, because its feature set has mostly been adopted from modern languages.

C++ is the most powerful systems programming language today. Because of its multi-paradigm design and zero-cost runtime performance, it is widely used by numerous organizations and by people with different backgrounds. C++ has features with complicated runtime support, such as RTTI and exceptions, which are disabled in most bootloader applications. As it includes every element of its predecessor C, it also includes every memory safety pitfall of C. This makes C++ even more vulnerable to memory unsafety, especially as architects with a C background rely widely on these language elements. Cyclone, on the other hand, was developed as an extension to C to provide a Rust-like memory safety mechanism, with the ability to port code from C to Cyclone without much effort. However, this design choice caused the language semantics to become restrictive and unwieldy.

Another popular language that to some extent competes with Rust is Go, because of its gentle learning curve. Go is supported by Google and is a high-level language that can be compared to Python or Ruby. At the time of writing, Go neither has generic types nor provides safety guarantees over its concurrency model, goroutines. Rust has generics with monomorphisation, so they are statically dispatched and have good runtime performance [Bal15].

Here, we present our analysis of utilizing the Rust language in EFI-based bootloader design for the x86 architecture, to make it useful for both practitioners and technology developers. Our analysis starts by presenting the basics of the Rust language in detail in Section 2. Then, the basics of bootloading are presented in Section 3. Since the main idea behind using Rust is programming a critical-and-safe low-level task with high-level programming concepts, we found bootloader design a typical application for this purpose, and we discuss the design choices that make Rust suitable in Section 4. Finally, Section 5 concludes the paper and states future work.

2 Rust Language Details

Rust is an open source programming language with an issue system for bug reporting and a separate RFC tracker for language standardization, both hosted on GitHub. With the help of numerous contributors around the world, Rust provides a precompiled development environment for Linux, Windows and OS X. It is also possible to cross-compile Rust for iOS, Android, Raspberry Pi and other operating systems. As the Rust toolchain is separate from the operating system, it comes much closer to a deterministic code generation process; Rust is completely decoupled in this respect. Languages like C or C++, on the other hand, depend on header files and libraries provided through the operating system, so the many applications, operating system distributions and updates involved might influence the collection.

The Rust ecosystem includes not only the rustc compiler but also a very powerful package manager, Cargo, with its registry web page for crates, rustfmt for code formatting, and rustdoc for automatic documentation generation. Cargo has very good dependency management, as it allows strict versions of dependencies to be defined. It allows arbitrary flags to be passed to rustc, the Rust compiler, but most importantly, with the target argument [HL15] it is possible to cross-compile for a system that differs from the host operating system. There is also a features argument for conditional compilation. Cargo reads a project's meta information from a TOML file, which is much like JSON but more suitable for human editing than for data serialization.

2.1 Rust Programming Concepts

Ownership is one of the most important language semantics of Rust. Every value has exactly one owner. A value can be moved to a new owner, or it can be borrowed: it can be borrowed immutably any number of times as long as it is not borrowed mutably, and it can be borrowed mutably only once at a time. Ownership also applies to resources such as files or sockets, and across threads. Rust provides traits to offer functionality similar to inheritance [JML15]. For example, to duplicate an object Rust has the Clone trait [LAC+15], and there is also the Copy trait for bitwise copying. Closures (anonymous functions) are also defined in terms of traits: Fn or FnMut depending on mutability, and FnOnce if the closure can be called only once. At the time of writing, a closure cannot be named as a return type, so it is returned enclosed in a Box, which allocates space on the heap [Lig15].

Rust has structs that are very similar to those of C. The main difference is that a data structure itself may be public while its fields remain private. Rust offers algebraic enums, which are more functional and much more advanced than those of C++, which only add type checking. The generic Option type is a special enum with "maybe" semantics: it selects between a value, Some, and its absence, None, while the related Result type selects between a success value, Ok, and an error value, Err. Option takes the place of null pointers, so safe Rust code cannot produce null pointer errors. This representation is also suitable for null pointer optimization, as Rust uses the LLVM compiler infrastructure and benefits from the same back-end optimizations as the C language family. Pointer safety is guaranteed by lifetimes. Like types, reference lifetimes can usually be inferred by the Rust compiler; this is called lifetime elision. Sometimes explicit lifetime annotations are required, as the binding a reference originates from must live at least as long as the reference itself.

Concurrency is at the core of Rust. The same ownership mechanism applies across threads, and Rust provides thread safety mostly at compile time. A channel, for example, allows data to be sent safely across threads if its type satisfies the Send marker trait. Markers are Rust's internal means of enforcing safety rules. Other important markers are Sync (the type can be shared across threads) and Sized (the type has a size known at compile time). When multiple threads need to modify the same region of memory, classical lock mechanisms such as Mutex or RwLock are provided. The key point is that locking in Rust works on the data itself, not on the code, whereas software architects using C++ try to prevent data races by locking the code itself.

A well-known analysis of the cost of software testing [Pat01] states that if a design error costs about zero to 10 cents at the specification phase, it costs 1 to 10 dollars in the software testing phase and at least 100 dollars if it is found by the eventual user; the cost thus grows roughly tenfold at each stage. To help reduce errors, Rust is designed to be a strongly and statically typed language. Dynamic languages suffer from a lack of compiler aid or a lack of typing, depending on language design. They have a gentle learning curve and high portability or embeddability. Languages with strong typing such as Rust or Haskell, on the other hand, have a steeper learning curve but provide superior type safety at the compilation stage. Compilers are far better at catching bugs than the human eye. There are also weakly, statically typed languages. They offer automatic type conversion, and this unpredictability causes bugs just as in dynamic languages. Undefined behavior has always been a source of hard-to-find bugs. For example, C++, unlike Rust, does not define the size of its main integer type, int, and its char type can be signed or unsigned depending on factors such as the compiler, operating system or build flags.

Charles Petzold described the telegraph relay as the device a lazy operator would obtain by connecting the sounder's magnet to the key with a stick, because the two were moving simultaneously [Pet00]. Without such a mechanical aid, it is understandable that an operator who listens to Morse code all day and keys in each dash and dot by hand will make mistakes. Dynamic languages are in much the same position. Compiler support with strong type checking plays the role of the relay and is seriously important for preventing human errors. Rust takes this a step further by providing compile-time memory and thread safety. Runtime checks are done only when there is no other choice, as with bounds checking for arrays.

Rust has also borrowed functional elements from various languages, for example iterators. They are lazily evaluated and offer a number of higher-order functions once an iterator is defined or a collection is converted into one. The functional flavor is harder for a systems programming audience. Like borrowing a master chef's knife, the imperative paradigm is powerful when used correctly, but tends to fail because of its destructive effect on global data [Oka99].

2.2 Comparing Rust with C and C++

Rust is by design a remedy for numerous systems programming bugs. The first is buffer overflow or underflow on arrays. C++ has no bounds checking for arrays, so writing or reading out of bounds may cause corruption or a page fault, depending on the operation. Rust checks array bounds at runtime whenever an access cannot be verified at compile time. Rust also does not allow indexing with a negative argument: array elements are accessed through the Index trait, which is defined for unsigned indices only. Finally, integer overflow remains. Rust checks for arithmetic overflow in debug builds and provides explicit checked, wrapping and saturating operations. This type of corruption has been the main source of buffer-related attacks for years.

The second is iterator invalidation. In C++, modifying a collection while an iterator is looping over it invalidates the iterator; the data is corrupted or the iterator goes into an infinite loop, depending on the operation. In Rust, as the collection is borrowed by the iterator, it cannot be borrowed mutably by modifier functions such as push [Bei15].

The last is use-after-free. High-level languages prevent this kind of error with a garbage collector, while Rust uses its unique ownership and lifetime semantics to prevent it at zero runtime cost. Rust also has hygienic macros, which are part of the AST transformation [Lig15].

Rust has unsafe blocks for non-ideal conditions such as dereferencing raw pointers, transmuting types or using the foreign function interface. In Rust there is no possibility of causing a concurrency failure outside an unsafe block, even if the design of the application is tremendously bad. Raw pointers are ideal for storing the addresses of MMIO regions, interrupt controllers and system tables, as these reside at constant memory locations. C does not prevent pointers from being used outside their lifetime; in Rust this is a problem only when unsafe is used. Rust also offers a strong foreign function interface to C through the extern keyword, and calling into C has no runtime performance cost. This makes calling foreign functions from an EFI application extremely simple with a small binding module.
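The role of unsafe blocks and raw pointers described at the end of Section 2.2 can be sketched as follows. This is a minimal illustration under stated assumptions rather than code from an actual bootloader: the `Register` type, its methods and `demo` are hypothetical names, and an ordinary local variable stands in for a device register so that the sketch runs on a hosted system. On real hardware the pointer would instead hold a fixed MMIO address taken from the platform documentation.

```rust
use std::ptr::{read_volatile, write_volatile};

/// Hypothetical wrapper around one 32-bit memory-mapped register.
/// The raw pointer carries no lifetime, which suits a fixed hardware address.
pub struct Register {
    addr: *mut u32,
}

impl Register {
    /// Creating the wrapper is unsafe: the caller promises that `addr` is
    /// valid, aligned and stays valid for as long as the wrapper is used.
    pub unsafe fn new(addr: *mut u32) -> Register {
        Register { addr }
    }

    /// Volatile read, so the compiler cannot cache or remove the access.
    pub fn read(&self) -> u32 {
        unsafe { read_volatile(self.addr) }
    }

    /// Volatile write through the raw pointer.
    pub fn write(&mut self, value: u32) {
        unsafe { write_volatile(self.addr, value) }
    }

    /// Read-modify-write helper that sets the bits given in `mask`.
    pub fn set_bits(&mut self, mask: u32) {
        let current = self.read();
        self.write(current | mask);
    }
}

/// Exercises the wrapper against a stand-in "register".
/// Assumption: a local variable replaces the real device.
pub fn demo() -> u32 {
    let mut fake_device: u32 = 0x0000_00F0;
    let mut reg = unsafe { Register::new(&mut fake_device as *mut u32) };
    reg.set_bits(0x1);
    reg.read()
}

fn main() {
    println!("register value: {:#x}", demo());
}
```

The only unsafe operations are confined to the constructor call and the two volatile accesses, so the rest of the program works with `Register` through an ordinary safe interface.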

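The bounds and overflow checks discussed in Section 2.2 can also be made explicit in code instead of being left to a runtime panic. The sketch below uses only standard library calls (`slice::get`, `checked_add`, `wrapping_add`, `saturating_add`); the function names `read_at`, `end_offset` and `wrapping_and_saturating` are hypothetical and chosen for illustration only.

```rust
/// Returns the element at `index`, or None when the index is out of bounds.
/// `slice::get` performs the bounds check and reports failure as an Option
/// instead of panicking the way `data[index]` would.
pub fn read_at(data: &[u8], index: usize) -> Option<u8> {
    data.get(index).copied()
}

/// Adds a length to an offset without silent wrap-around:
/// `checked_add` yields None when the sum does not fit in a u32.
pub fn end_offset(offset: u32, len: u32) -> Option<u32> {
    offset.checked_add(len)
}

/// Explicit wrapping and saturating alternatives, for cases where
/// wrap-around or clamping is the intended behaviour.
pub fn wrapping_and_saturating(a: u8, b: u8) -> (u8, u8) {
    (a.wrapping_add(b), a.saturating_add(b))
}

fn main() {
    let buffer = [10u8, 20, 30];
    println!("in bounds:     {:?}", read_at(&buffer, 1));
    println!("out of bounds: {:?}", read_at(&buffer, 3));
    println!("overflow:      {:?}", end_offset(u32::MAX, 1));
    println!("wrap/saturate: {:?}", wrapping_and_saturating(250, 10));
}
```

Because the index type is `usize`, a negative index cannot even be expressed, and the Option return values force the caller to handle the failure case.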
3 Bootloading Basics

3.1 Legacy Bootloading

Bootloaders are responsible for building the memory map, finding the system tables and launching the operating system kernel. For backwards compatibility, CPUs with the x86 architecture start in 16-bit real mode, which only has access to 1 MB of memory. The typical first step of a bootloader is enabling higher memory through the A20 gate [Cor16]. Bootloading relies heavily on chipset specifications and BIOS interrupts. As these are designed by different hardware vendors, conflicts exist between systems. Such units have grown organically over the years and are poorly standardized.

The next step is enabling protected mode, which provides 32-bit addressing and paging. Activating paging, which becomes mandatory in long mode, is very useful, as it separates kernel pages from user application pages in terms of permissions. Paging is also the key to virtual memory and to creating non-executable pages, which prevent runtime code execution outside text sections. Paging is also used at a higher level: for example, guard pages are used to grow the stack when a page fault occurs at the end of the program stack. In real mode there is another memory management scheme, called segmentation. It works by using different selectors to section memory into code and data blocks. After the switch to protected mode segmentation is obsolete, but it is still active and has to be configured so that it provides the same flat addressing. Some segment registers are still used in the Linux kernel to detect buffer overflows over the function call return address on the stack.

Lastly, there is long mode, which provides 64-bit addressing in canonical form and removes historical features such as BCD [Cor16]. Different kernels have strict requirements about the state in which they are started. There are also various sub-modes, such as virtual-8086 mode for emulating real mode interrupts in protected mode, and system management mode for emulating devices that require complicated drivers in early modes. Between these mode switches the interrupt controller must be reconfigured correctly. In the past, real mode interrupts that invoked the appropriate BIOS support were used in place of device drivers to talk to the hardware.

As devices became much more complicated, operating systems took over all hardware interaction, and the BIOS came to be used as bootloader firmware. Its complexity was a burden, and its lack of interaction with modern technology such as network access led Intel to design the EFI specification, a modern platform firmware for bootloading. EFI can run applications just like an operating system and, most importantly, runs the system in long mode.

3.2 Unified Extensible Firmware Interface (UEFI)

The EFI specification was designed by Intel in 1999 and is now maintained by the UEFI consortium, which includes more than 160 companies [ZRM11]. EFI has many modern features such as networking, human interface device support and a bootloader driver model. It provides a safer way to update firmware through packages called Capsules, which enforce EEPROM validation [BZ15]. The flowchart of the EFI-based bootloading process is shown in Figure 1.

Figure 1: The flowchart of EFI (Source: en.wikipedia.org/wiki/Unified_Extensible_Firmware_Interface).

EFI is built from numerous modules, of which the boot, runtime and driver modules are mandatory. The boot module is the key to generating the memory map and locating the system tables. The x86 memory model, depending on the memory controller or chipset, has many gaps in memory [YZ15]. These include MMIO, configuration registers for PCI devices, legacy timers, video frame buffers, and regions belonging to ACPI or interrupt controller tables (reclaimable or not). As brute-forcing a memory map is extremely unreliable, EFI provides the map out of the box. The driver model allows drivers to be created for file systems or NIC devices, for a richer bootloading environment, while the runtime module offers monotonic timers, system time, power supply commands and firmware updating.

EFI bootloader applications can be developed with Rust like any other application that uses the foreign function interface, but without the standard library, which depends on an operating system. The Rust library is as rich as those of high-level languages, and most language characteristics are provided by the library rather than embedded in the language itself. Rust binaries must be linked into a final Portable Executable (PE) file. The PE file format is used by the Windows operating system and offers sectioning along with relocation [Hah14].

4 Designing EFI-based Bootloader with Rust

In order to create an EFI application with Rust, libcore must first be compiled for the target platform. Libcore is the bare-metal subset of the Rust standard library and has no operating system dependency. A few memory functions are needed to build libcore; these can be obtained from rlibc, or their C counterparts can be used. The EFI application, the rlibc library and libcore must be cross-compiled for the target system with the correct triplet. Although x86_64-pc-windows-gnu is the most suitable triplet for such a bootloader application (because of the eventual PE linkage), it is not sufficient. A custom target triplet definition file in JSON format is needed, and it should disable a few language features.

• The first is compiler-rt, because otherwise the helper library of the LLVM compiler infrastructure or the Rust language itself would have to be reconfigured and recompiled for the target architecture even though there is no need.

• The second is morestack. As there is no high-level memory management, morestack is not declared by the application and the stack is managed manually, so the compiler should not define morestack.

• The third is stack unwinding: when an exception occurs in a bootloader, there is little to no chance of recovery. Unwinding is also known as landing pads in Rust and can also be controlled with a compiler flag.

• Finally, floating point operations and optimizations must be disabled in the triplet configuration file. Floating point optimizations have been found to corrupt interrupt handlers in bare-metal Rust [HL15]. Also, in the bootloader environment the floating point stack or coprocessor has not yet been configured, and most operating system kernels do not provide floating point functionality in kernel space. Along with the FPU stack and SSE, there are other floating point units such as MMX and 3DNow!, depending on the CPU model. LLVM does not allow floating point support to be disabled in this state, because libcore contains floating point code; libcore must be modified and cleaned of floating point code before it can be used in kernel or bootloader programming. One example of the cost is that the FXSAVE and FXRSTOR instructions save and restore every FPU register on the stack between function calls.

The EFI application can then be linked with the subsystem 10 flag, placed on a FAT32 drive and tested on a computer or in a virtual machine. OVMF is an open source firmware for QEMU with EFI support. QEMU's -nographic option makes it easy to integrate into any development environment. There is also a tool called multirust, which creates Rust version overrides for folders and makes it easier to switch between nightly versions and the stable release of Rust. EFI also has a shell, which is a helpful tool for bootloader design. For example, the pci command lists PCI device paths and memmap shows the memory map. EFI Capsules also support I2C, which can be used to flash ROMs belonging to other hardware.

Historically, bootloaders consisted of two or three phases. They were loaded into memory step by step; each phase upgraded the system to a higher mode and prepared the environment for the next one. This is no longer required with EFI, but it is possible to keep this design. As an EFI application relies on its own binary structure and calling convention, it may be beneficial to use a second-stage bootloader started from EFI. This second-stage application is not subject to the EFI specification and is just a small kernel intended to run the real kernel.

There are numerous resources on operating system design with Rust, including [HL15] and [Lig15]. All resources for C are applicable to Rust, since the syntactic elements of the two languages are similar, and Rust's strong foreign function interface provides strong interaction. C is the lingua franca of systems languages. It has very good runtime performance and raw memory management capability. Its abstract machine model is a perfect fit for current hardware, which uses a program counter, registers and addressable memory, but its type system has aged [Pos14]. Rust, on the other hand, is fresh and brings many modern features from newer high-level designs. It offers safety at compile time, and its abstractions are zero-cost at runtime.

5 Conclusion and Future Work

In this paper, the advanced semantics of the Rust programming language are presented to clarify its possible use within the EFI-based bootloader design process. Various design alternatives and choices are mentioned, and the points that make Rust a better choice are discussed. Since one of the main ideas behind using Rust is programming a critical-and-safe low-level task with high-level programming concepts, we found bootloader design a typical application for this purpose.

As discussed, Rust offers high-level language semantics and an advanced standard library with a modern feature set that includes most of the features and functional elements of widely used programming languages. Moreover, Rust can be used both as a scripting language and as a functional language. Additionally, it can be used as a low-level procedural language, since it is both imperative and functional and has no garbage collector. These design choices make Rust a suitable match for low-level tasks while retaining high-level scalability and maintainability.

From the bootloading perspective, the future seems to be based on EFI on x86 hardware. It currently allows end users to download an operating system from the Internet and install it easily. Today memory unsafety causes serious problems; hence the adoption of Rust is not an economic or social matter but an intellectual one. As future work, we plan to develop a prototype based on this design process and validate the use of Rust through performance experiments.

References

[Bal15] I. Balbaert. Rust Essentials. Packt Publishing, May 2015.

[Bei15] A. Beingessner. You can't spell trust without Rust. Master's thesis, Carleton University, Department of Computer Science, 2015.

[BZ15] M. Bulusu and V. Zimmer. Challenges for UEFI and the cloud. In UEFI Plugfest 2015, May 2015.

[Cor16] Intel Corporation. Intel 64 and IA-32 architectures software developer's manual volume 3 (3A, 3B, 3C and 3D): System programming guide. Technical report, Order Number: 325384-058US, April 2016.

[Hah14] K. Hahn. Robust static analysis of portable executable malware. Master's thesis, HTWK Leipzig, Department of Computer Science, December 2014.

[HL15] H.W. Hoiby and S. Lefsaker. Rustygecko - developing Rust on bare-metal - an experimental embedded software platform. Master's thesis, Norwegian University of Science and Technology, 2015.

[JML15] T.B.L. Jespersen, P. Munksgaard, and K.F. Larsen. Session types for Rust. In Proceedings of the 11th ACM SIGPLAN Workshop on Generic Programming, WGP 2015, pages 13–22, New York, NY, USA, 2015. ACM.

[LAC+15] A. Levy, M.P. Andersen, B. Campbell, D. Culler, P. Dutta, B. Ghena, P. Levis, and P. Pannuto. Ownership is theft: Experiences building an embedded OS in Rust. In Proceedings of the 8th Workshop on Programming Languages and Operating Systems, PLOS'15, pages 21–26, New York, NY, USA, 2015. ACM.

[Lig15] A. Light. Reenix: Implementing a Unix-like operating system in Rust. Master's thesis, Brown University, Department of Computer Science, April 2015.

[LMP+05] P. Levis, S. Madden, J. Polastre, R. Szewczyk, A. Woo, D. Gay, J. Hill, M. Welsh, E. Brewer, and D. Culler. TinyOS: An operating system for sensor networks. In Ambient Intelligence, pages 115–148. Springer Verlag, 2005.

[Oka99] C. Okasaki. Purely Functional Data Structures. Cambridge University Press, 1999.

[Pat01] R. Patton. Software Testing. Sams Publishing, 2001.

[Pet00] C. Petzold. Code: The Hidden Language of Computer Hardware and Software. Microsoft Press, 2000.

[Pos14] R. Poss. Rust for functional programmers. http://science.raphael.poss.name/rust-for-functional-programmers.html, July 2014.

[YZ15] J. Yao and V. Zimmer. A tour beyond BIOS memory map design in UEFI BIOS. Technical report, Intel Corporation, February 2015.

[ZRM11] V. Zimmer, M. Rothman, and S. Marisetty. Beyond BIOS: Developing with the Unified Extensible Firmware Interface 2nd Edition. Intel Press, January 2011.