A Systematic Evaluation of Transient Execution Attacks and Defenses

A Systematic Evaluation of Transient Execution Attacks and Defenses Claudio Canella1, Jo Van Bulck2, Michael Schwarz1, Moritz Lipp1, Benjamin von Berg1, Philipp Ortner1, Frank Piessens2, Dmitry Evtyushkin3, Daniel Gruss1 1 Graz University of Technology, 2 imec-DistriNet, KU Leuven, 3 College of William and Mary Abstract instruction which has not been executed (and retired) yet. Hence, to keep the pipeline full at all times, it is essential to Research on transient execution attacks including Spectre predict the control flow, data dependencies, and possibly even and Meltdown showed that exception or branch mispredic- the actual data. Modern CPUs, therefore, rely on intricate mi- tion events might leave secret-dependent traces in the CPU’s croarchitectural optimizations to predict and sometimes even microarchitectural state. This observation led to a prolifera- re-order the instruction stream. Crucially, however, as these tion of new Spectre and Meltdown attack variants and even predictions may turn out to be wrong, pipeline flushes may be more ad-hoc defenses (e.g., microcode and software patches). necessary, and instruction results should always be committed Both the industry and academia are now focusing on finding according to the intended in-order instruction stream. Pipeline effective defenses for known issues. However, we only have flushes may occur even without prediction mechanisms, as on limited insight on residual attack surface and the completeness modern CPUs virtually any instruction can raise a fault (e.g., of the proposed defenses. page fault or general protection fault), requiring a roll-back In this paper, we present a systematization of transient of all operations following the faulting instruction. With pre- execution attacks. Our systematization uncovers 6 (new) tran- diction mechanisms, there are more situations when partial sient execution attacks that have been overlooked and not pipeline flushes are necessary, namely on every misprediction. been investigated so far: 2 new exploitable Meltdown ef- The pipeline flush discards any architectural effects of pend- fects: Meltdown-PK (Protection Key Bypass) on Intel, and ing instructions, ensuring functional correctness. Hence, the Meltdown-BND (Bounds Check Bypass) on Intel and AMD; instructions are executed transiently (first they are, and then and 4 new Spectre mistraining strategies. We evaluate the they vanish), i.e., we call this transient execution [50, 56, 85]. attacks in our classification tree through proof-of-concept im- plementations on 3 major CPU vendors (Intel, AMD, ARM). While the architectural effects and results of transient in- Our systematization yields a more complete picture of the structions are discarded, microarchitectural side effects re- attack surface and allows for a more systematic evaluation of main beyond the transient execution. This is the foundation defenses. Through this systematic evaluation, we discover that of Spectre [50], Meltdown [56], and Foreshadow [85]. These most defenses, including deployed ones, cannot fully mitigate attacks exploit transient execution to encode secrets through all attack variants. microarchitectural side effects (e.g., cache state) that can later arXiv:1811.05441v3 [cs.CR] 15 May 2019 be recovered by an attacker at the architectural level. The field of transient execution attacks emerged suddenly and pro- 1 Introduction liferated, leading to a situation where people are not aware of all variants and their implications. This is apparent from CPU performance over the last decades was continuously the confusing naming scheme that already led to an arguably improved by shrinking processing technology and increasing wrong classification of at least one attack [48]. Even more clock frequencies, but physical limitations are already hin- important, this confusion leads to misconceptions and wrong dering this approach. To still increase the performance, ven- assumptions for defenses. Many defenses focus exclusively dors shifted the focus to increasing the number of cores and on hindering exploitation of a specific covert channel, instead optimizing the instruction pipeline. Modern CPU pipelines of addressing the microarchitectural root cause of the leak- are massively parallelized allowing hardware logic in prior age [45,47,50,91]. Other defenses rely on recent CPU features pipeline stages to perform operations for subsequent instruc- that have not yet been evaluated from a transient security per- tions ahead of time or even out-of-order. Intuitively, pipelines spective [84]. We also debunk implicit assumptions including may stall when operations have a dependency on a previous that AMD or the latest Intel CPUs are completely immune to in-place (IP) vs., out-of-place (OP) PHT-CA-IP ⭑ Meltdown-type effects, or that serializing instructions miti- mistraining Cross-address-space PHT-CA-OP ⭑ gate Spectre Variant 1 on any CPU. strategy Same-address-space PHT-SA-IP [48, 50] Spectre-PHT In this paper, we present a systematization of transient PHT-SA-OP ⭑ microarchitec- Spectre-BTB i.e. tural buffer BTB-CA-IP [13, 50] execution attacks, , Spectre, Meltdown, Foreshadow, and Spectre-RSB Cross-address-space Spectre-type BTB-CA-OP [50] related attacks. Using our decision tree, transient execution Spectre-STL [29] Same-address-space attacks are accurately classified through an unambiguous nam- BTB-SA-IP ⭑ ing scheme (cf. Figure1). The hierarchical and extensible na- prediction Cross-address-space BTB-SA-OP [13] Transient Same-address-space RSB-CA-IP [52, 59] cause? ture of our taxonomy allows to easily identify residual attack RSB-CA-OP [52] Meltdown-NM [78] Meltdown-US [56] fault fault type surface, leading to 6 previously overlooked transient execu- RSB-SA-IP [59] Meltdown-AC ⭐ Meltdown-P [85, 90] RSB-SA-OP [52, 59] tion attacks (Spectre and Meltdown variants) first described in Meltdown-type Meltdown-DE ⭐ Meltdown-RW [48] this work. Two of the attacks are Meltdown-BND, exploiting Meltdown-PF Meltdown-PK ⭑ a Meltdown-type effect on the x86 bound instruction on Intel Meltdown-UD ⭐ Meltdown-XD ⭐ and AMD, and Meltdown-PK, exploiting a Meltdown-type Meltdown-SS ⭐ Meltdown-SM ⭐ effect on memory protection keys on Intel. The other 4 attacks Meltdown-BR Meltdown-MPX [40] Meltdown-GP [8, 35] Meltdown-BND ⭑ are previously overlooked mistraining strategies for Spectre- PHT and Spectre-BTB attacks. We demonstrate the attacks Figure 1: Transient execution attack classification tree with in our classification tree through practical proofs-of-concept demonstrated attacks (red, bold), negative results (green, with vulnerable code patterns evaluated on CPUs of Intel, dashed), some first explored in this work (⭑ / ⭐). ARM, and AMD. Next, we provide a systematization of the state-of-the-art a processor supports, the available registers, the addressing defenses. Based on this, we systematically evaluate defenses mode, and describes the execution model. Examples of dif- with practical experiments and theoretical arguments to show ferent ISAs are x86 and ARMv8. The microarchitecture then which work and which do not or cannot suffice. This sys- describes how the ISA is implemented in a processor in the tematic evaluation revealed that we can still mount transient form of pipeline depth, interconnection of elements, execution execution attacks that are supposed to be mitigated by rolled units, cache, branch prediction. The ISA and the microarchi- out patches. Finally, we discuss how defenses can be designed tecture are both stateful. In the ISA, this state includes, for to mitigate entire types of transient execution attacks. instance, data in registers or main memory after a success- Contributions. The contributions of this work are: ful computation. Therefore, the architectural state can be ob- 1. We systematize Spectre- and Meltdown-type attacks, ad- served by the developer. The microarchitectural state includes, vancing attack surface understanding, highlighting mis- for instance, entries in the cache and the translation lookaside classifications, and revealing new attacks. buffer (TLB), or the usage of the execution units. Those mi- 2. We provide a clear distinction between Meltdown/Spectre, croarchitectural elements are transparent to the programmer required for designing effective countermeasures. and can not be observed directly, only indirectly. 3. We categorize defenses and show that most, including Out-of-Order Execution. On modern CPUs, individual in- deployed ones, cannot fully mitigate all attack variants. structions of a complex instruction set are first decoded and 4. We describe new branch mistraining strategies, highlight- split-up into simpler micro-operations (µOPs) that are then ing the difficulty of eradicating Spectre-type attacks. processed. This design decision allows for superscalar op- We responsibly disclosed the work to Intel, ARM, and AMD. timizations and to extend or modify the implementation of Experimental Setup. Unless noted otherwise, the experi- specific instructions through so-called microcode updates. mental results reported were performed on recent Intel Sky- Furthermore, to increase performance, CPU’s usually imple- lake i5-6200U, Coffee Lake i7-8700K, and Whiskey Lake i7- ment a so-called out-of-order design. This allows the CPU 8565U CPUs. Our AMD test machines were a Ryzen 1950X to execute µOPs not only in the sequential order provided by and a Ryzen Threadripper 1920X. For experiments on ARM, the instruction stream but to dispatch them in parallel, utiliz- an NVIDIA Jetson TX1 has been used. ing the CPU’s

A Systematic Evaluation of Transient Execution Attacks and Defenses

What Is an Operating System III 2.1 Compnents II an Operating System

Irmx® 286/Irmx® II Troubleshooting Guide

CS 151: Introduction to Computers

Parallel Architectures

BCIS 1305 Business Computer Applications

"LISARM: Embedded ARM Platform Design and Optimization" Thesis

How Can I Test My Memory

Energy Efficient Branch Prediction

CS5460/6460: Operating Systems Lecture 6: Interrupts and Exceptions

Constant-Time Foundations for the New Spectre Era

CS152: Computer Systems Architecture Virtual Memory

ARM Cortex-A Series Programmer's Guide