Control-Flow Integrity for Real-Time Embedded Systems
Total Page:16
File Type:pdf, Size:1020Kb
Control-flow Integrity for Real-time Embedded Systems by Nicholas Brown A Thesis Submitted to the Faculty of the WORCESTER POLYTECHNIC INSTITUTE In partial fulfillment of the requirements for the Degree of Master of Science in Computer Science by May 2017 APPROVED: Professor Robert J. Walls, Major Thesis Advisor Professor Craig A. Shue, Thesis Reader Professor Craig E. Wills, Head of Department Abstract As embedded systems become more connected and more ubiquitous in mission- and safety-critical systems, embedded devices have become a high-value target for hackers and security researchers. Attacks on real-time embedded systems software can put lives in danger and put our critical in- frastructure at risk. Despite this, security techniques for embedded systems have not been widely studied. Many existing software security techniques for general purpose computers rely on assump- tions that do not hold in the embedded case. This thesis focuses on one such technique, control-flow integrity (CFI), that has been vetted as an effective countermeasure against control-flow hijacking attacks on general purpose computing systems. Without the process isolation and fine-grained memory protections provided by a general purpose computer with a rich operating system, CFI cannot provide any security guarantees. This thesis explores a way to use CFI on ARM Cortex-R devices running minimal real-time operating systems. We provide techniques for protecting runtime structures, isolating processes, and instrumenting compiled ARM binaries with CFI protection. i Contents 1 Introduction 1 1.1 Outline........................................... 2 2 Background and Related Works3 2.1 ARM Architecture..................................... 3 2.1.1 ARM/Thumb/Thumb2 Instruction Sets..................... 4 2.1.2 Processing Modes................................. 5 2.1.3 Memory....................................... 7 2.2 Control-Flow Integrity................................... 7 2.2.1 Threat Model.................................... 8 2.2.2 How CFI Works.................................. 8 2.2.3 CFI Variations................................... 10 2.2.4 Existing CFI Implementations.......................... 12 2.2.5 Limitations of CFI................................. 13 2.3 Real-Time Operating Systems............................... 14 2.3.1 FreeRTOS...................................... 15 3 Control-Flow Integrity on ARM Cortex-R 17 3.1 Design Goals........................................ 17 3.2 System Model........................................ 18 3.3 Threat Model........................................ 18 3.4 Approach.......................................... 18 3.4.1 Binary Instrumentation.............................. 19 3.4.2 Shadow Stack.................................... 22 4 CFI Integration into FreeRTOS 25 4.1 Design Goals........................................ 25 4.2 System Model........................................ 26 4.3 Threat Model........................................ 26 4.4 Approach.......................................... 26 4.4.1 Shadow Stacks................................... 27 4.4.2 FreeRTOS Tasks.................................. 28 4.4.3 Context Switching................................. 28 5 Evaluation 31 5.1 Security Evaluation..................................... 31 5.1.1 Basic Exploitation Test (BET).......................... 31 5.2 Performance Impact.................................... 34 ii 5.2.1 CPU Benchmarks................................. 34 5.2.2 FreeRTOS Performance.............................. 35 5.2.3 Additional Resource Use.............................. 36 5.3 Discussion.......................................... 37 5.3.1 Limitations..................................... 37 5.3.2 Future Work.................................... 38 6 Conclusion 40 iii List of Figures 2.1 Simplified memory map for RM46L852 processor [1].................. 6 2.2 Example CFG of a sorting function............................ 9 3.1 Overview of our CFI system................................ 19 3.2 Overview of binary patching system........................... 19 3.3 Input and output of the binary patching program.................... 23 3.4 Connection between the new .text section and the .cfi section........... 23 5.1 CoreMark results with and without CFI. Default settings, 1000 iterations...... 35 5.2 Latency (in number of instructions) of CFI critical sections in FreeRTOS with two running tasks........................................ 36 iv List of Tables 2.1 ARM processors and their features [2].......................... 4 2.2 Register sets for ARM processing modes [3]....................... 5 2.3 Different indirect jump operations in ARM....................... 9 2.4 Different CFI implementations and their features.................... 12 v List of Listings 3.1 C function to instrument.................................. 20 3.2 Selected portions of disassembly for foo() function................... 20 3.3 Rewritten version of foo() function............................ 21 3.4 Function prologue instrumentation............................ 21 3.5 Indirect call instrumentation............................... 21 3.6 Function epilogue instrumentation............................ 22 3.7 Supervisor call handler................................... 23 3.8 Shadow stack type definition............................... 24 3.9 Shadow stack operations.................................. 24 4.1 C type definition and declaration for multiple shadow stacks.............. 27 4.2 FreeRTOS Task Control Block definition......................... 28 4.3 FreeRTOS save context routine with shadow stack................... 29 4.4 FreeRTOS restore context routine with shadow stack.................. 30 5.1 General format of indirect call instrumentation..................... 33 5.2 Function prologue and epilogue instrumentation..................... 34 vi Chapter 1 Introduction Modern real-time embedded systems have countless applications, varying in complexity. There are simple devices like thermostats or coffee makers, more complex systems like smartphone radios, and highly complex health and safety critical systems like jet engine controllers or automotive braking controllers. In safety critical systems especially, there is great risk when manufacturers release faulty devices. Failures in these systems can cause injury, and the designers of the system will be held liable [4]. Often these faults are in software and can be exploited by an intelligent adversary [5,6]. The most pernicious type of attack allows an attacker to execute arbitrary code on a device. Such attacks, commonly called control-flow hijacking, manipulate the execution of a program by redirecting control-flow transfers to either attacker-supplied code (e.g., stack smashing [7]) or useful code sequences already in the program (e.g., return-oriented programming [8]). State-of-the-art defenses against control-flow hijacking are largely based on the concept of control-flow integrity (CFI) [9]. Intuitively, CFI compares the behavior of a running program to a predefined model. If the program's behavior deviates from what is expected, an error is thrown. In particular, CFI monitors control-flow transfers and only allows a transfer if it is accepted by a precomputed control- flow graph (CFG) of the program. CFI is a subset of a larger category of security techniques called memory safety. While full memory safety is possible, the performance overhead (nearly 200% in some cases) makes it impractical for most applications [10,11]. CFI is an approach to memory safety on a small subset of memory, namely code pointers, which allows for significantly lower overhead as opposed to full memory safety. While myriad CFI implementations exist for general-purpose systems (e.g., desktops and smart- phones), real-time embedded systems present several unique challenges for CFI. First, many em- bedded systems do not have the hardware or software support for task isolation. Such isolation is common in general-purpose systems and relied upon by existing CFI solutions. Second, the scheduler in a real-time operating system (RTOS) can interrupt any instruction and can return to any arbitrary instruction. This makes the scheduler a high degree node in the CFG, which Carlini et al. have shown severely weakens the effectiveness of CFI [12]. Third, the majority of existing CFI solutions are for x86-based hardware, while embedded systems are commonly ARM-based, an 1 architecture which has several challenges for CFI, such as multiple instruction sets and the lack of a dedicated function return instruction. Finally, real-time embedded systems generally have lim- ited resources and strict timing requirements, limiting the amount of storage available for runtime structures required for CFI (e.g., shadow stacks). CFI instrumentation must adhere to the timing constraints in place by the real-time system. We propose a CFI scheme for real-time embedded systems that addresses these challenges and prevents control-flow hijacking attacks. For our initial efforts, we focus on ARM-based systems run- ning the FreeRTOS real-time operating system, but we anticipate that our system will be portable to any RTOS running on an ARM microcontroller, since the majority of the implementation is de- signed to be operating system agnostic. Existing approaches to CFI on embedded systems, such as C-FLAT [13] and TrackOS [14], move away from the traditional CFI approach but introduce new time-of-check to time-of-use vulnerabilities. Traditional approaches depend on process isolation in the presence