Leaky Processors

Lessons from Spectre, Meltdown, and Foreshadow

Jo Van Bulck (@jovanbulck)1, Daniel Gruss (@lavados)2 Red Hat Research Day, January 23, 2020

1 imec-DistriNet, KU Leuven, 2 Graz University of Technology v1, v2, v4, v5, v3, v3.1, v3a, Foreshadow-NG, Spectre-BTB, RDCL? L1TF? Spectre-RSB, ret2spec, SGXPectre, SmotherSpectre, NetSpectre? ZombieLoad, MDS? RIDL, Fallout?

Lessons from Spectre, Meltdown, and Foreshadow?

Spectre Meltdown Foreshadow

1 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Lessons from Spectre, Meltdown, and Foreshadow?

Spectre Meltdown Foreshadow v1, v2, v4, v5, v3, v3.1, v3a, Foreshadow-NG, Spectre-BTB, RDCL? L1TF? Spectre-RSB, ret2spec, SGXPectre, SmotherSpectre, NetSpectre? ZombieLoad, MDS? RIDL, Fallout?

1 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados)

• CPU builds “walls” for memory isolation between applications and privilege levels ↔ Architectural protection walls permeate microarchitectural side channels!

Processor security: Hardware isolation mechanisms

App App App Enclave

VM OS VM OS

Hypervisor (VMM)

• Different software protection domains: user processes, virtual machines, enclaves

2 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) ↔ Architectural protection walls permeate microarchitectural side channels!

Processor security: Hardware isolation mechanisms

App App App Enclave

VM OS VM OS

Hypervisor (VMM)

• Different software protection domains: user processes, virtual machines, enclaves • CPU builds “walls” for memory isolation between applications and privilege levels

2 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Processor security: Hardware isolation mechanisms

App App App Enclave

VM OS VM OS

Hypervisor (VMM)

• Different software protection domains: user processes, virtual machines, enclaves • CPU builds “walls” for memory isolation between applications and privilege levels ↔ Architectural protection walls permeate microarchitectural side channels!

2 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) National Geographic

CPU Cache

printf("%d", i); printf("%d", i);

5 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) CPU Cache

Cache miss printf("%d", i); printf("%d", i);

5 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) CPU Cache

Request Cache miss printf("%d", i); printf("%d", i);

5 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) CPU Cache

Request Cache miss printf("%d", i); printf("%d", i); Response

5 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) CPU Cache

Request Cache miss printf("%d", i); i printf("%d", i); Response

5 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) CPU Cache

Request Cache miss printf("%d", i); i printf("%d", i); Response

Cache hit

5 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) CPU Cache

DRAM access, slow

Request Cache miss printf("%d", i); i printf("%d", i); Response

Cache hit

5 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) CPU Cache

DRAM access, slow

Request Cache miss printf("%d", i); i printf("%d", i); Response

Cache hit No DRAM access, much faster 5 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Flush+Reload

Shared Memory ATTACKER VICTIM

flush access access

6 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Flush+Reload

Shared Memory ATTACKER VICTIM

flush cached Shared Memory access access cached

6 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Flush+Reload

Shared Memory ATTACKER VICTIM

flush Shared Memory access access

6 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Flush+Reload

Shared Memory ATTACKER VICTIM

flush access access

6 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Flush+Reload

Shared Memory ATTACKER VICTIM

flush access access

6 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Flush+Reload

Shared Memory ATTACKER VICTIM

flush Shared Memory access access

6 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Flush+Reload

Shared Memory ATTACKER VICTIM

flush Shared Memory access access

6 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Flush+Reload

Shared Memory ATTACKER VICTIM

flush Shared Memory access access

fast if victim accessed data, slow otherwise

6 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Memory Access Latency

Cache Hits

107

104

101 Number of Accesses 50 100 150 200 250 300 350 400 Latency [Cycles]

7 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Memory Access Latency

Cache Hits Cache Misses

107

104

101 Number of Accesses 50 100 150 200 250 300 350 400 Latency [Cycles]

7 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados)

We can communicate across protection walls using microarchitectural side channels! Leaky processors: Jumping over protection walls with side channels

App App App Enclave

VM OS VM OS

Hypervisor (VMM)

10 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados)

Side-channel attacks are known for decades already – what’s new?

4000

3000

2000 DO WE JUST SUCK AT... COMPUTERS? YUP. ESPECIALLY SHARED O 1000

1990 1994 1998 2002 2006 2010 2014 2018

Based on github.com/Pold87/academic-keyword-occurrence and xkcd.com/1938/

11 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados)

Can we do better? Can we demolish architectural protection walls instead of just peaking over? • Meltdown breaks user/kernel isolation • Foreshadow breaks SGX enclave and virtual machine isolation • Spectre breaks software-defined isolation on various levels • . . . many more – but all exploit the same underlying insights!

Leaky processors: Breaking isolation mechanisms

App App App Enclave

VM OS VM OS

Hypervisor (VMM)

13 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) • Foreshadow breaks SGX enclave and virtual machine isolation • Spectre breaks software-defined isolation on various levels • . . . many more – but all exploit the same underlying insights!

Leaky processors: Breaking isolation mechanisms

App App App Enclave

VM OS VM OS

Hypervisor (VMM)

• Meltdown breaks user/kernel isolation

13 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) • Spectre breaks software-defined isolation on various levels • . . . many more – but all exploit the same underlying insights!

Leaky processors: Breaking isolation mechanisms

App App App Enclave

VM OS VM OS

Hypervisor (VMM)

• Meltdown breaks user/kernel isolation • Foreshadow breaks SGX enclave and virtual machine isolation

13 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) • . . . many more – but all exploit the same underlying insights!

Leaky processors: Breaking isolation mechanisms

App App App Enclave

VM OS VM OS

Hypervisor (VMM)

• Meltdown breaks user/kernel isolation • Foreshadow breaks SGX enclave and virtual machine isolation • Spectre breaks software-defined isolation on various levels

13 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Leaky processors: Breaking isolation mechanisms

App App App Enclave

VM OS VM OS

Hypervisor (VMM)

• Meltdown breaks user/kernel isolation • Foreshadow breaks SGX enclave and virtual machine isolation • Spectre breaks software-defined isolation on various levels • . . . many more – but all exploit the same underlying insights!

13 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados)

• Modern CPUs are inherently parallel

⇒ Execute instructions ahead of time

Best-effort: What if triangle fails? → Commit in-order, roll-back square ... But side channels may leave traces (!)

Out-of-order and

Key discrepancy: • Programmers write sequential instructions

14 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Best-effort: What if triangle fails? → Commit in-order, roll-back square ... But side channels may leave traces (!)

Out-of-order and speculative execution

Key discrepancy: • Programmers write sequential instructions • Modern CPUs are inherently parallel

⇒ Execute instructions ahead of time

14 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Out-of-order and speculative execution

Key discrepancy: • Programmers write sequential instructions • Modern CPUs are inherently parallel

Overflow ⇒ Execute instructions ahead of time Roll-back exception Best-effort: What if triangle fails? → Commit in-order, roll-back square ... But side channels may leave traces (!)

14 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Transient world (microarchitecture) may temp bypass architectural software intentions:

Transient-execution attacks: Welcome to the world of fun!

CPU executes ahead of time in transient world • Success → commit results to normal world • Fail → discard results, compute again in normal, world /

15 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Transient world (microarchitecture) may temp bypass architectural software intentions:

Transient-execution attacks: Welcome to the world of fun!

Key finding of 2018 ⇒ Transmit secrets from transient to normal world

15 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Transient-execution attacks: Welcome to the world of fun!

Key finding of 2018 ⇒ Transmit secrets from transient to normal world

Transient world (microarchitecture) may temp bypass architectural software intentions:

Delayed exception handling Control flow prediction

15 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Transient-execution attacks: Welcome to the world of fun!

Key finding of 2018 ⇒ Transmit secrets from transient to normal world

Transient world (microarchitecture) may temp bypass architectural software intentions:

CPU access control bypass Speculative buffer overflow/ROP

15 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) The transient-execution zoo https://transient.fail

PHT-CA-IP Cross-address-space PHT-CA-OP Spectre-PHT PHT-SA-IP Same-address-space PHT-SA-OP

BTB-CA-IP Cross-address-space BTB-CA-OP Spectre-BTB Spectre-type BTB-SA-IP Same-address-space BTB-SA-OP

RSB-CA-IP Cross-address-space RSB-CA-OP Spectre-RSB Spectre-STL RSB-SA-IP Same-address-space RSB-SA-OP

Meltdown-US-L1 Transient cause Meltdown-US Meltdown-US-LFB Meltdown-US-SB

Meltdown-NM-REG Meltdown-P-L1 Meltdown-PF Meltdown-P-LFB Meltdown-P Meltdown-P-SB Meltdown-RW Meltdown-P-LP Meltdown-PK-L1 Meltdown-SM-SB

Meltdown-type Meltdown-MPX Meltdown-BR Meltdown-BND

Meltdown-CPL-REG Meltdown-GP Meltdown-NC-SB

Meltdown-AD-LFB Meltdown-AD Meltdown-MCA Meltdown-AD-SB Meltdown-AVX-LP

Canella et al. “A systematic evaluation of transient execution attacks and defenses”, USENIX Security 2019

16 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados)

Meltdown: Transiently encoding unauthorized memory

Unauthorized access

17 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Meltdown: Transiently encoding unauthorized memory

Unauthorized access Transient out-of-order window

oracle array secret idx

17 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Meltdown: Transiently encoding unauthorized memory

Unauthorized access Transient out-of-order window Exception (discard architectural state)

17 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Meltdown: Transiently encoding unauthorized memory

Unauthorized access Transient out-of-order window Exception handler

oracle array

cache hit

17 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Recovering from a Meltdown: Re-building protection walls? • Unmap kernel from user virtual address space → Unauthorized physical addresses out-of-reach (˜cookie jar)

user unmapped

context switch user kernel switch address space

context switch SMAP+SMEP kernel

Gruss et al. “KASLR is dead: Long live KASLR”, ESSoS 2017

Mitigating Meltdown: Unmap kernel addresses from user space

• OS software fix for faulty hardware( ↔ future CPUs)

18 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Mitigating Meltdown: Unmap kernel addresses from user space

• OS software fix for faulty hardware( ↔ future CPUs) • Unmap kernel from user virtual address space → Unauthorized physical addresses out-of-reach (˜cookie jar)

user unmapped

context switch user kernel switch address space

context switch SMAP+SMEP kernel

Gruss et al. “KASLR is dead: Long live KASLR”, ESSoS 2017

18 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) • Problem seemed to be solved • No attack surface left • That is what everyone thought

Problem Solved?

• Meltdown fully mitigated in software

19 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) • That is what everyone thought

Problem Solved?

• Meltdown fully mitigated in software • Problem seemed to be solved • No attack surface left

19 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Problem Solved?

• Meltdown fully mitigated in software • Problem seemed to be solved • No attack surface left • That is what everyone thought

19 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados)

Foreshadow-NG: Breaking the virtual memory abstraction

CPU micro-architecture

L1D Tag?

vadrs PT padrs walk?

L1 cache design: Virtually-indexed, physically-tagged

20 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Foreshadow-NG: Breaking the virtual memory abstraction

CPU micro-architecture

L1D Tag?

vadrs PT padrs walk?

Page fault: Early-out address translation

20 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Foreshadow-NG: Breaking the virtual memory abstraction

CPU micro-architecture

L1D Tag? Pass to out-of-order

vadrs PT padrs walk?

L1-Terminal Fault: match unmapped physical address (!)

20 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Foreshadow-NG: Breaking the virtual memory abstraction

CPU micro-architecture

L1D Tag? Pass to out-of-order

vadrs PT padrs SGX? walk?

Foreshadow-SGX: bypass enclave isolation

20 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Foreshadow-NG: Breaking the virtual memory abstraction

CPU micro-architecture

L1D Tag? Pass to out-of-order

guest host vadrs PT padrs EPT padrs SGX? walk? walk?

Foreshadow-VMM: bypass virtual machine isolation(!)

20 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Mitigating Foreshadow/L1TF: Hardware-software cooperation

21 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) • Not only the user-accessible check • There are many more page table bits and exception types. . .

P RW US WT UC R D S G Ignored Physical Page Number

Ignored X

Generalization – Lessons from Foreshadow

• Meltdown is a whole category of vulnerabilities

22 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) • There are many more page table bits and exception types. . .

P RW US WT UC R D S G Ignored Physical Page Number

Ignored X

Generalization – Lessons from Foreshadow

• Meltdown is a whole category of vulnerabilities • Not only the user-accessible check

22 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Generalization – Lessons from Foreshadow

• Meltdown is a whole category of vulnerabilities • Not only the user-accessible check • There are many more page table bits and exception types. . .

P RW US WT UC R D S G Ignored Physical Page Number

Ignored X

22 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Meltdown subtree: Exploiting page-table bits

Meltdown-US-L1 Meltdown-US Meltdown-US-LFB Meltdown-US-SB

Meltdown-P-L1 Pagefault Meltdown-P-LFB Meltdown-P Meltdown-P-SB Meltdown-RW Meltdown-P-LP Meltdown-PK-L1 Meltdown-SM-SB

23 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados)

May 2019: Meltdown redux

24 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) • Key take-aways: 1. Leakage from various intermediate buffers( ⊃ L1D) 2. Transient execution through microcode assists( ⊃ exceptions)

There is no noise. Noise is just someone else’s data

Microarchitectural data sampling: RIDL, ZombieLoad, Fallout

• May 2019: 3 new Meltdown-type attacks • Leakage from: line-fill buffer, store buffer, load ports

25 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) There is no noise. Noise is just someone else’s data

Microarchitectural data sampling: RIDL, ZombieLoad, Fallout

• May 2019: 3 new Meltdown-type attacks • Leakage from: line-fill buffer, store buffer, load ports • Key take-aways: 1. Leakage from various intermediate buffers( ⊃ L1D) 2. Transient execution through microcode assists( ⊃ exceptions)

25 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Microarchitectural data sampling: RIDL, ZombieLoad, Fallout

• May 2019: 3 new Meltdown-type attacks • Leakage from: line-fill buffer, store buffer, load ports • Key take-aways: 1. Leakage from various intermediate buffers( ⊃ L1D) 2. Transient execution through microcode assists( ⊃ exceptions)

There is no noise. Noise is just someone else’s data

25 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) MDS take-away 1: Microarchitectural buffers

CDB L1 Instruction Cache Reorder buffer ITLB µOP µOP µOP µOP µOP µOP µOP µOP

Scheduler Branch Instruction Fetch & PreDecode Predictor µOP µOP µOP µOP µOP µOP µOP µOP Instruction Queue

µOP Cache 4-Way Decode Frontend AGU Load data Load data

µOPs µOP µOP µOP µOP Store data Execution Engine

MUX ALU, Branch ALU, AES, . . . ALU, Vect, . . . ALU, FMA, . . .

Allocation Queue Execution Units

µOP µOP µOP µOP

Load Buffer Store Buffer

DTLB STLB L1 Data Cache LFB L2 Cache L3 Cache DRAM Memory Subsystem

26 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) • Need help? Re-issue the load with a microcode assist • assist == “microarchitectural fault”

• Example: setting A/D bits in the page table walk • Likely many more!

MDS take-away 2: Microcode assists

• Optimization: only implement fast-path in silicon • More complex edge cases (slow-path) in microcode

27 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) • Example: setting A/D bits in the page table walk • Likely many more!

MDS take-away 2: Microcode assists

• Optimization: only implement fast-path in silicon • More complex edge cases (slow-path) in microcode • Need help? Re-issue the load with a microcode assist • assist == “microarchitectural fault”

27 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) MDS take-away 2: Microcode assists

• Optimization: only implement fast-path in silicon • More complex edge cases (slow-path) in microcode • Need help? Re-issue the load with a microcode assist • assist == “microarchitectural fault”

• Example: setting A/D bits in the page table walk • Likely many more!

27 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Extended Meltdown tree with microcode assists https://transient.fail

Meltdown-US-L1 Meltdown-US Meltdown-US-LFB Meltdown-US-SB

Meltdown-NM-REG Meltdown-P-L1 Meltdown-PF Meltdown-P-LFB Meltdown-P Meltdown-P-SB Meltdown-RW Meltdown-P-LP Meltdown-PK-L1 Meltdown-UD Meltdown-SM-SB

Meltdown-MPX Transient cause Meltdown-type Meltdown-BR Meltdown-BND

Meltdown-CPL-REG Meltdown-GP Meltdown-NC-SB

Meltdown-AD-LFB Meltdown-AD Meltdown-AD-SB

Meltdown-MCA Meltdown-AVX-LP Meltdown-TAA-LFB Meltdown-TAA Meltdown-TAA-LP Meltdown-TAA-SB

28 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) 2018 era: Depth-first search (e.g., Foreshadow/L1TF)

Transient cause Meltdown-type Meltdown-PF Meltdown-P Meltdown-P-L1

• Meltdown is a category of attacks and not a single instance or bug • Systematic analysis (tree search) revealed several overlooked variants

Canella et al. “A Systematic Evaluation of Transient Execution Attacks and Defenses”, USENIX Security 2019.

29 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) 2019 era: Breadth-first search (e.g., Fallout)

Meltdown-US Meltdown-US-SB

Meltdown-PF Meltdown-P Meltdown-P-SB Meltdown-SM-SB

Transient cause Meltdown-type Meltdown-GP Meltdown-NC-SB

Meltdown-MCA Meltdown-AD Meltdown-AD-SB

Not “just another buffer”, include systematic fault-type analysis

30 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados)

⇒ New emerging and powerful class of transient-execution attacks ⇒ Importance of fundamental side-channel research ⇒ Security cross-cuts the system stack: hardware, hypervisor, kernel, compiler, application

Conclusions and take-away https://transient.fail/

Update your systems! (+ disable HyperThreading)

31 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Conclusions and take-away https://transient.fail/

Update your systems! (+ disable HyperThreading)

⇒ New emerging and powerful class of transient-execution attacks ⇒ Importance of fundamental side-channel research ⇒ Security cross-cuts the system stack: hardware, hypervisor, kernel, compiler, application

31 Jo Van Bulck (@jovanbulck), Daniel Gruss (@lavados) Leaky Processors

Lessons from Spectre, Meltdown, and Foreshadow

Jo Van Bulck (@jovanbulck)1, Daniel Gruss (@lavados)2 Red Hat Research Day, January 23, 2020

1 imec-DistriNet, KU Leuven, 2 Graz University of Technology

Appendix • Branch can be mistrained to speculatively (i.e., ahead of time) execute with idx ≥ LEN in the transient world • Insert explicit speculation barriers to tell the CPU to halt the transient world... • Huge manual, error-prone effort. . .

Spectre v1: Speculative buffer over-read

• Programmer intention: never access out-of-bounds

user buffer secret memory • Insert explicit speculation barriers to tell the CPU to halt the transient world... • Huge manual, error-prone effort. . .

Spectre v1: Speculative buffer over-read

• Programmer intention: never access out-of-bounds

user buffer secret memory • Branch can be mistrained to speculatively (i.e., ahead of time) execute with idx ≥ LEN in the transient world • Huge manual, error-prone effort. . .

Spectre v1: Speculative buffer over-read

• Programmer intention: never access out-of-bounds user buffer secret memory • Branch can be mistrained to speculatively (i.e., ahead of time) execute with idx ≥ LEN in the transient world • Insert explicit speculation barriers to tell the CPU to halt the transient world... Spectre v1: Speculative buffer over-read

• Programmer intention: never access out-of-bounds user buffer secret memory • Branch can be mistrained to speculatively (i.e., ahead of time) execute with idx ≥ LEN in the transient world • Insert explicit speculation barriers to tell the CPU to halt the transient world... • Huge manual, error-prone effort. . .