SPECS: A Lightweight Runtime Mechanism for Protecting Software from Security-Critical Processor Bugs

Matthew Hicks (University of Michigan)
Cynthia Sturton (University of North Carolina at Chapel Hill)
Samuel T. King (Twitter, Inc.)
Jonathan M. Smith (University of Pennsylvania)

Abstract

Processor implementation errata remain a problem, and worse, a subset of these bugs are security-critical. We classified 7 years of errata from recent commercial processors to understand the magnitude and severity of this problem, and found that of 301 errata analyzed, 28 are security-critical.

We propose the SECURITY-CRITICAL PROCESSOR ERRATA CATCHING SYSTEM (SPECS) as a low-overhead solution to this problem. SPECS employs a dynamic verification strategy that is made lightweight by limiting protection to only security-critical processor state. As a proof-of-concept, we implement a hardware prototype of SPECS in an open source processor. Using this prototype, we evaluate SPECS against a set of 14 bugs inspired by the types of security-critical errata we discovered in the classification phase. The evaluation shows that SPECS is 86% effective as a defense when deployed using only ISA-level state; incurs less than 5% area and power overhead; and has no software run-time overhead.

Categories and Subject Descriptors C.0 [General]: Hardware/software interfaces; C.0 [General]: System architectures; K.6.5 [Security and Protection]: Invasive software (e.g., viruses, worms, Trojan horses)

Keywords Processor errata; hardware security exploits; security-critical processor errata

|          | Catch All Bugs? | Catch Security-Critical Bugs? | Low Overhead? |
|----------|-----------------|-------------------------------|---------------|
| SW-only  | ✗               | ✗                             | ✓             |
| HW-only  | ✓               | ✓                             | ✗             |
| SPECS    | ✗               | ✓                             | ✓             |
| SPECS+SW | ✓               | ✓                             | ✓             |

Table 1: The design space for catching processor bugs: existing software-only approaches are limited, but practical; existing hardware-only approaches are powerful, but impractical; and SPECS, combined with existing software approaches, is both powerful and practical.

1. Introduction

Modern processors are imperfect. Processors are built from millions of lines of code [42], yielding chips with billions of transistors [22]. Verifying the functional correctness of hardware at such scales remains intractable. The resulting gaps in functional verification mean that today's production hardware typically ships with bugs. For example, the errata document for Intel's Core 2 Duo processor family [20] lists 129 known bugs.

Some processor bugs have the potential to weaken the security of software by allowing unprivileged code to modify or control privileged processor state. We deem these bugs security-critical (we do not claim that this set is unique). While some effects of bugs are detectable by software through periodic checks [8, 27] or special encodings [6, 40], other processor bugs affect software in a manner that software cannot detect practically. Such bugs are security-critical because they affect a critical subset of processor functionality trusted by the software tasked with detecting and recovering from imperfections. One example of a security-critical bug is a design flaw in the MIPS R4000 processor that allows user-mode code to control the location of an exception vector [26]. The effect of this bug is user-mode code that runs at the supervisor level, that is, privilege escalation.

These types of flaws compromise otherwise correct software, especially software that implements security policies. There are many documented examples [4, 5, 29–32] of the impact security-critical bugs have on software.
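The MIPS R4000 example can be read as a broken rule about who may update privileged state. The following is a minimal software sketch, under stated assumptions, of how such a rule could be checked before an update becomes visible to software; it is an illustration only and not the hardware mechanism itself. The `IsaState` fields, the `commit` and `run` functions, and the specific rule (the exception vector may change only while the core is in supervisor mode) are hypothetical and were chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IsaState:
    """Hypothetical, heavily simplified ISA-level state (illustration only)."""
    supervisor: bool       # True when the core runs at the supervisor level
    exception_vector: int  # address the core jumps to on an exception


class InvariantViolation(Exception):
    """Stands in for the hardware exception raised on a violated rule."""


def commit(current: IsaState, pending: IsaState) -> IsaState:
    """Accept a pending state update only if the assumed rule holds.

    Assumed rule: the exception vector may change only while the core is
    already in supervisor mode.
    """
    vector_changed = pending.exception_vector != current.exception_vector
    if vector_changed and not current.supervisor:
        raise InvariantViolation("user-mode update to the exception vector")
    return pending


def run(current: IsaState, pending: IsaState) -> IsaState:
    """Return the state software would observe after the attempted update."""
    try:
        return commit(current, pending)
    except InvariantViolation:
        # The update is discarded; recovery or repair software would run here.
        return current


user = IsaState(supervisor=False, exception_vector=0x8000_0000)
attack = IsaState(supervisor=False, exception_vector=0x0000_1000)
kernel = IsaState(supervisor=True, exception_vector=0x8000_0000)
legit = IsaState(supervisor=True, exception_vector=0x0000_2000)

blocked = run(user, attack)   # update attempted from user mode
allowed = run(kernel, legit)  # same kind of update from supervisor mode
```

In this model, the update attempted from user mode leaves the exception vector unchanged, while the same kind of update from supervisor mode is accepted.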
Software-only patching approaches have been explored (e.g., microcode [16, 43], re-compilation [28], and binary translation [35]), but are not able to address all processor bugs [33]. Further, many of the existing patching mechanisms incur large performance penalties [38, 39], mainly due to the coarse granularity and heavyweight nature of their consistency checks.

Alternatively, hardware-based bug defenses can address all processor bugs using checks on every cycle, but this comprehensiveness comes at a prohibitive cost in terms of hardware area and power. One example illustrating the tradeoffs of hardware patching mechanisms is Diva [1]. Diva relies on modular redundancy; a second formally-verified but lower-performance processor core checks the calculations of the buggy, high-performance core. Diva can protect against all processor bugs, but is complex (potentially creating new processor bugs) and has a high hardware cost. A second example is Phoenix [37], which detects potentially inconsistent processor state by comparing the hardware-level state of the processor to known buggy state values; Phoenix requires more hardware than Diva and cannot respond to unknown bugs.

Rather than incur the costs of protecting software from all processor bugs, we propose a hybrid approach, shown in Table 1: SPECS adds a small amount of additional hardware to the processor that dynamically verifies a subset of processor functionality. The goal is to protect software from security-critical processor bugs, especially those that it cannot protect itself from efficiently. SPECS works by enforcing instruction-set-specification-derived invariants on security-critical processor state dynamically. The invariants are enforced by a series of simple assertions in hardware (Figure 1(c)). The assertions analyze ISA-level state and events to validate in-flight (i.e., not yet visible at the software level) ISA-level state updates. If in-flight state violates a SPECS invariant, a combination of assertions fires, which triggers an exception that squashes the inconsistent in-flight state (Figure 1(d)). From here, existing software recovery/repair mechanisms take over, while SPECS continues to work in the background. Essentially, SPECS acts as an interposition layer for a range of existing processor repair and recovery approaches that allows software to execute past security-critical bugs safely.

Figure 1: Processor design flow with SPECS: (a) Hardware description language implementation of the instruction set. (b) At some point during design, a bug is introduced. (c) Assertions that implement SPECS invariants are added to the processor, with taps directly on the outputs of ISA state storage elements. (d) To squash in-flight state, SPECS adds write gating to the inputs of ISA state storing elements and attaches the result of dynamic verification to the exception logic, which triggers existing recovery/repair software in the event of an invariant violation.

We make three contributions in this paper:

• We identify and classify security-critical errata from recent commercial processors and discover that these bugs involve invalid updates to privileged ISA-level state, which is updated in a few simple ways (Section 2).

• We introduce a lightweight dynamic-verification strategy against security-critical processor bugs, providing an introspection point whenever security-critical state is updated outside the specification (Section 3).

• We implement (Section 4) and evaluate (Section 5) SPECS, showing that it is possible to protect software from a range of realistic security-critical processor bugs. The evaluation shows that SPECS: (1) is able to detect 86% of the implemented processor bugs using only ISA-level state; (2) requires less than 5% additional hardware; and (3) has no software run-time overhead.¹

¹ We make the errata used for our analysis and our hardware and software source code available publicly [19].

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
ASPLOS '15, March 14–18, 2015, Istanbul, Turkey.
Copyright is held by the owner/author(s).
ACM 978-1-4503-2835-7/15/03.
http://dx.doi.org/10.1145/2694344.2694366

2. Security-Critical Errata

| AMD ID# | Synopsis | Class | Subclass |
|---------|----------|-------|----------|
| 418 | Incorrect translation tables used | MA | |
| 561 | Incorrect page walks | MA | |
| 731 | Using large pages causes incorrect IO address translations | MA | |
| 77 | Using unaligned descriptor table allows software to jump past the end of the table without causing an exception | XR | |
| 170 | May use instructions from old page tables pointed to by old CR3 | EI | |
| 578 | Branch prediction causes invalid control flow | EI | |
| 639 | CALL instruction treated as a NOP | EI | |
| 776 | Branch prediction causes invalid control flow | EI | |
| 144 | Shadow RAM not invalidated | IR | |
| 784 | Control register data leaked through load operation | IR | |
| 165 | Guest can clear data in VMCB | IU | IL |
| 503 | Writes to APIC task priority register use the wrong register | IU | IL |
| 573 | Instruction pointer updated incorrectly | IU | IL |
| 734 | Incorrect data stored to and past VMCB | IU | IL |
| 770 | Privilege escalation | IU | IL |
| 171 | Execute breakpoint handler with mixed guest and host state | IU | IX |
| 248 | TLB pages not invalidated | IU | IX |
| 342 | Interrupts disabled for guest | IU | IX |
| 385 | Incorrect error address reported | IU | IX |
| 401 | Hypervisor doesn't get correct IP | IU | IX |
| 440 | CR3 not saved correctly | IU | IX |
| 563 | Flags not stored correctly in VMCB | IU | IX |
| 564 | Auto halt flag not set in SMM save state | IU | IX |
| 659 | VMCB flag bit not cleared | IU | IX |
| 691 | Cache not invalidated correctly | IU | IX |
| 704 | Incorrect IP stored | IU | IX |
| 738 | Incorrect SP stored in VMCB | IU | IX |
| 744 | A CC6 transition may not restore trap registers | IU | IX |

Table 2: Security-critical errata mined from AMD processors manufactured between 2006 and 2013. The security-critical errata fit into five classes: MA (arbitrary memory access), XR (exception related), EI (execute incorrect instruction), IR (incorrect results), and IU (invalid register update). Further, there are two types of IU: IL (illegal) and IX (incorrect).

Existing work on processor bugs as security vulnerabilities takes the approach of finding a single processor bug and realizing a usable software-level attack with that bug as a foothold [3, 11, 12, 23].
