Minimal Virtual Machines on Iot Microcontrollers: the Case of Berkeley Packet Filters with Rbpf
Total Page:16
File Type:pdf, Size:1020Kb
If you cite this paper, please use the PEMWN 2020 reference: K. Zandberg, E. Baccelli. Minimal Virtual Machines on IoT Microcontrollers: The Case of Berkeley Packet Filters with rBPF. In Proc. of 9th IFIP/IEEE PEMWN, Dec. 2020. Minimal Virtual Machines on IoT Microcontrollers: The Case of Berkeley Packet Filters with rBPF Koen Zandberg∗, Emmanuel Baccelli∗y ∗Inria, France yFreie Universitat¨ Berlin, Germany Abstract—Virtual machines (VM) are widely used to host and aiming to mitigate network- and some software-based attack isolate software modules. However, extremely small memory and vectors [23]. low-energy budgets have so far prevented wide use of VMs Complementary mechanisms focus on isolating critical soft- on typical microcontroller-based IoT devices. In this paper, we explore the potential of two minimal VM approaches on such ware processes (e.g. access to specific sections of the address low-power hardware. We design rBPF, a register-based VM based space) from the rest of the application software running on- on extended Berkeley Packet Filters (eBPF). We compare it with board the microcontroller. This facilitates establishing a root of a stack-based VM based on WebAssembly (Wasm) adapted for trust on the microcontroller, which can bootstrap other security embedded systems. We implement prototypes of each VM, hosted mechanisms. in the IoT operating system RIOT. We perform measurements on commercial off-the-shelf IoT hardware. Unsurprisingly, we Concretely, we consider two categories of use-cases: observe that both Wasm and rBPF virtual machines yield 1) Isolating high-level business logic, updatable on-demand execution time and memory overhead, compared to not using remotely over the low-power network. This type of logic a VM. We show however that this execution time overhead is is rather long-lived, and has loose (non-real-time) timing tolerable for low-throughput, low-energy IoT devices. We further show that, while using a VM based on Wasm entails doubling the requirements. memory budget for a simple networked IoT application using a 2) Isolating debug/monitoring code snippets at low-level, 6LoWPAN/CoAP stack, using a VM based on rBPF requires only inserted and removed on-demand, remotely, over the negligible memory overhead (less than 10% more memory). rBPF network. Comparatively, this type of logic is short-lived is thus a promising approach to host small software modules, and exhibits stricter timing requirements. isolated from OS software, and updatable on-demand, over low- power networks. One approach is to modify the hardware architecture of mi- crocontrollers, adding specific hardware mechanisms to guar- I. INTRODUCTION antee such isolations. Such hardware functionalities facilitate establishing a root of trust on the microcontroller. Prominent The availability of cheap low-power microcontrollers and examples of this trend include TrustZone on Arm Cortex- low-power radios is driving the emergence of the Internet M architectures [18], Sanctum on RISC-V architectures [8], of Things (IoT). Typical microcontrollers based on archi- Sancus2.0 on MSP430 architectures [17]. tecture such as Arm Cortex-M are combined with various However, changing hardware is both (i) more difficult than sensors/actuators, and with a radio such as Bluetooth Low- upgrading software, and (ii) heavily dependent, by nature, on a arXiv:2011.12047v2 [cs.NI] 9 Dec 2020 Energy, LoRa or IEEE 802.15.4, on a small Arduino-like specific hardware architecture. Therefore, a legitimate question hardware module. This class of embedded hardware [5] trades which arises is: what software-only equivalent can be achieved, off limiting resources (slow processor, kilobytes of RAM to isolate the processes in our use-cases? and Flash memory. ), for small energy consumption (in the milliwatt range) and a small price tag (a few dollars). II. RELATED WORK In parallel, however, security concerns [16] grow with Different categories of software-based process isolation the emergence of IoT. Cyberphysical chain reactions [21], techniques have been developed specifically for microcon- or extended functionality attacks [19] expand the traditional trollers. Small virtual machines are used to host and isolate attack surface of networked systems. processes from other processes running on the microcontroller. To mitigate such a variety of attacks, specific security mech- For example Darjeeling [6] is a subset of the Java VM, anisms are needed at all levels of the system. For instance, modified to use a 16 bit architecture, designed for 8- and on-going work defines new network protocols and workflows 16-bit microcontrollers. Another example is WebAssembly (Wasm [12]), a virtual machine (VM) specification with a Acknowledgement: H2020 SPARTA partly funded this work. stack-based architecture, designed for process isolation in Web browsers, which has recently been ported to microcon- C/C++ Bindings WASI IoT Device trollers [20]. Beyond the low-power IoT domain, tiny VMs OS facilities are also used in other contexts for a long time. For instance JavaCard [13] uses a small Java VM running on smart cards. Rust Wasm EMCC Intermediate Sandboxed Elsewhere, in the Linux ecosystem, eBPF [15], [9] enables a Bytecode compilation execution TinyGo small VM hosting and isolating debug and inspection code, in VM sandbox LLVM EmSDK the Linux kernel, at run-time. Another type of approach uses scripted logic interpreters to Fig. 1. Wasm code development and execution workflow with the WASM3 isolate some processes. For instance, prior work such as [4] VM. uses a small JavaScript run-time container, hosting (update- C/C++ OS facilities IoT Device able) business logic, interpreted on-board a microcontroller, bindings OS facilities glued atop a real-time OS (RIOT). Yet another category of solution uses OS-level mechanisms Rust BPF Sandboxed LLVM for process isolation. For instance, Tock [14] is an OS written Bytecode execution in the Rust programming language, which offers strong isola- TinyGo tion between its kernel and application logic processes. How- VM sandbox ever, Tock requires hardware providing an memory protection unit (MPU) (only some Cortex-M and RISC-V hardware is Fig. 2. rBPF code development and execution workflow. supported so far). The WebAssembly VM uses both a stack and a flat heap for The goal we pursue in this paper is to explore in practice memory storage. The stack is required by the architecture, and solutions which: can be configured to any size. An interface for allocating heap • require minimal memory footprint; memory is provided by the standard. Specification mandates • do not depend on extra hardware-specific mechanisms to memory allocations in chunks of 64 KiB (pages). protect memory; Toolchain & SDK. The full workflow for development • offer tolerable code execution speed slump; and execution of Wasm applications is depicted in Figure 1. • require small data transfer over-the-air when isolated code Wasm uses the LLVM compiler: applications in any language is updated; supported by LLVM are possible, such as C/C++, D, Rust, and For this purpose, we explore approaches based on virtual TinyGo among others. A standardized interface is specified machines. More specifically, we consider two architectures for host access in a POSIX-like way is provided by the WASI of VMs: a stack-based VM based on WebAssembly, and a standard [24]. register-based VM based on eBPF, as described below. The Interpreter. Once the Wasm binary is created, it can be main contributions of this paper are: transferred to the IoT device, on which it is interpreted and • we design rBPF, an adaptation of eBPF providing a executed, as shown in Figure 1. Several interpreters exist, in software-based solution to isolate processes on low-power this paper we use the WASM3 [20] interpreter, which uses a microcontrollers; two-stage approach: the loaded application is first transpiled • we provide the implementation of two open source proto- to an optimized executable, which then can be executed in the types of VMs, using rBPF on one hand, and on the other interpreter. hand using Wasm, based on the WASM3 interpreter; • we evaluate the performance of our rBPF prototype B. Extended Berkeley Packet Filters compared to Wasm, via experiments running the VMs Berkeley Packet Filter (BPF [9]) is a small in-kernel VM on real microcontrollers; available on most Unix-like operating systems. Its original • we show that rBPF offers promising perspectives in purpose was network packet filtering, for example: only pass terms of smaller memory footprint, we discuss security to userspace packets matching a set of requirements. Within guarantees and potential next steps. the Linux kernel the VM is extended (to eBPF) to allow for multiple non-network related purposes. eBPF provides a small III. BACKGROUND and efficient facility for running custom code inside the kernel, A. WebAssembly hooking into various subsystems. WebAssembly (Wasm [12]) is a virtual instruction set The state-of-the-art eBPF architecture is 64-bit register architecture, standardized by the World Wide Web Consortium based VM with a fixed stack. The stack itself is specified as (W3C), primarily aimed at portable web applications. The fixed at 512 B. A heap is not contained in the specification. As instruction set allows for binaries small in size, to minimize an alternative, the Linux kernel provides an interface to key- transfer time to the client. The sandbox provided by imple- value maps for persistent storage between invocations. The mentations offers strong guarantees on memory access. Both limited stack size and absence of a heap put only minimal of these properties aim to ensure security while requiring only requirements on the RAM a platform has to provide for the limited memory footprint on the platform target. VM. The VM itself is inherently suitable for isolating the op- erating system from the virtualized application: all memory Admin Store access, including to the stack, happen via load and store instructions. Moreover, branch and jump instructions are also Application Store Script limited, the application has no access to the program counter fetch and a jump is always direct and relative to current program Isolated sandbox Value Store counter.