Ghosts in a Nutshell Moritz Lipp (@mlqxyz) Claudio Canella (@cc0x1f) Who am I?
Moritz Lipp PhD student @ Graz University of Technology 7 @mlqxyz R [email protected]
1 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Who am I?
Claudio Canella PhD student @ Graz University of Technology 7 @cc0x1f R [email protected]
2 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Media Media Meltdown & Spectre
• Side-Channel Vulnerability Variant 1, 2, 3 • and Variant 3a • Meltdown and Spectre have been disclosed
4 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Motivation
• Variant 1 - Bounds Check Bypass (BCB) • Variant 2 - Branch Target Injection (BTI) • Variant 3 - Rogue Data Cache Load (RDCL) • Variant 3a - Rogue System Register Read (RSRR) • Variant 4 - Speculative Store Bypass (SSB) • Variant 1.1 - Bounds Check Bypass Store (BCBS) • Variant 1.2 - Read-only protection bypass (RPB) • Lazy FP State Restore
5 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Motivation
• SpectreRSB - Return Mispredict • Foreshadow • L1 Terminal Fault (L1TF) • Portsmash • Netspectre • SMoTherSpectre • SPOILER
6 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Motivation
• KAISER patch / KPTI / KVA Shadow • Microcode Updates • IBRS / STIPB / IBPB • Retpoline • Taint Tracking • Serialization • InvisiSpec / SafeSpec / DAWG • RSB Stuffing • Site Isolation • SSBD / SSBB • ...
7 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Now you already lost me . . . • Give a comprehensible overview of all attacks and defenses • Show that systematic analysis allows to find new attacks and circumventions of countermeasures
What this talk is
• We want to shed some light and make it less confusing
8 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Show that systematic analysis allows to find new attacks and circumventions of countermeasures
What this talk is
• We want to shed some light and make it less confusing • Give a comprehensible overview of all attacks and defenses
8 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) What this talk is
• We want to shed some light and make it less confusing • Give a comprehensible overview of all attacks and defenses • Show that systematic analysis allows to find new attacks and circumventions of countermeasures
8 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Background • Information leaks due to underlying hardware • Exploit leakage through side-effects
Power Execution Microarchitectural consumption time elements
Side-channel Attacks
• Bug-free software does not mean safe execution
9 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Exploit leakage through side-effects
Power Execution Microarchitectural consumption time elements
Side-channel Attacks
• Bug-free software does not mean safe execution • Information leaks due to underlying hardware
9 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Power Execution Microarchitectural consumption time elements
Side-channel Attacks
• Bug-free software does not mean safe execution • Information leaks due to underlying hardware • Exploit leakage through side-effects
9 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Side-channel Attacks
• Bug-free software does not mean safe execution • Information leaks due to underlying hardware • Exploit leakage through side-effects
Power Execution Microarchitectural consumption time elements
9 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Interface between hardware and software • Microarchitecture is an ISA implementation
Architecture vs Microarchitecture
• Instruction Set Architecture (ISA) is an abstract model of a computer (x86, ARMv8, SPARC, . . . )
10 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Microarchitecture is an ISA implementation
Architecture vs Microarchitecture
• Instruction Set Architecture (ISA) is an abstract model of a computer (x86, ARMv8, SPARC, . . . ) • Interface between hardware and software
10 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Architecture vs Microarchitecture
• Instruction Set Architecture (ISA) is an abstract model of a computer (x86, ARMv8, SPARC, . . . ) • Interface between hardware and software • Microarchitecture is an ISA implementation
10 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Architecture vs Microarchitecture
• Instruction Set Architecture (ISA) is an abstract model of a computer (x86, ARMv8, SPARC, . . . ) • Interface between hardware and software • Microarchitecture is an ISA implementation
10 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Caches and buffer Predictor
• Transparent for the programmer • Timing optimizations → side-channel leakage
Microarchitectural Elements
• Modern CPUs contain multiple microarchitectural elements
11 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Transparent for the programmer • Timing optimizations → side-channel leakage
Microarchitectural Elements
• Modern CPUs contain multiple microarchitectural elements
Caches and buffer Predictor
11 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Timing optimizations → side-channel leakage
Microarchitectural Elements
• Modern CPUs contain multiple microarchitectural elements
Caches and buffer Predictor
• Transparent for the programmer
11 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Microarchitectural Elements
• Modern CPUs contain multiple microarchitectural elements
Caches and buffer Predictor
• Transparent for the programmer • Timing optimizations → side-channel leakage
11 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Caches & Cache Attacks Cache
printf("%d", i);
printf("%d", i);
12 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Cache
Cache miss printf("%d", i);
printf("%d", i);
12 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Cache
Cache miss Request printf("%d", i);
printf("%d", i);
12 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Cache
Cache miss Request printf("%d", i);
printf("%d", i); Response
12 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Cache
Cache miss Request printf("%d", i); i printf("%d", i); Response
12 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Cache
Cache miss Request printf("%d", i); i printf("%d", i); Response Cache hit
12 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Cache
DRAM access, slow
Cache miss Request printf("%d", i); i printf("%d", i); Response Cache hit
12 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Cache
DRAM access, slow
Cache miss Request printf("%d", i); i printf("%d", i); Response Cache hit
No DRAM access, much faster
12 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Caching speeds up Memory Accesses
·104 Cache hit 3 Cache miss
2
1 Number of accesses
0 100 200 300 400 500 600 700 800 900 1,000 1,100 1,200 Measured access time (CPU cycles)
13 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Flush+Reload
14 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Flush+Reload
14 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Flush+Reload
14 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Flush+Reload
14 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Flush+Reload
14 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Flush+Reload
14 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Flush+Reload
14 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Flush+Reload
14 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Only meta data. Not interesting and not in any threat model.
Cache Attacks
• Leak cryptographic keys • Leak information on co-located virtual machines • Monitor function calls of other applications • Build covert communication channels • ...
15 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Cache Attacks
• Leak cryptographic keys • Leak information on co-located virtual machines • Monitor function calls of other applications • Build covert communication channels • ...
Only meta data. Not interesting and not in any threat model.
15 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Out-of-order execution Out-of-order Execution
int width = 10, height = 5;
float diagonal= sqrt(width* width + height* height); int area= width* height;
printf("Area%dx%d=%d\n", width, height, area);
16 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Out-of-order Execution
Parallelize
int width = 10, height = 5;
float diagonal= sqrt(width* width + height* height); int area= width* height;
Dependency printf("Area%dx%d=%d\n", width, height, area);
16 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • dispatched to the backend • processed by individual execution units
Out-of-Order Execution
ITLB L1 Instruction Cache
Branch Instruction Fetch & PreDecode Predictor Instruction Queue
µOP Cache 4-Way Decode Frontend µOPs µOP µOP µOP µOP
MUX
Allocation Queue
µOP µOP µOP µOP Instructions are
CDB Reorder buffer µOP µOP µOP µOP µOP µOP µOP µOP • fetched and decoded in the front-end Scheduler
µOP µOP µOP µOP µOP µOP µOP µOP AGU Load data Load data Store data ALU, Branch ALU, Vect, . . . ALU, AES, . . . ALU, FMA, . . . Execution Engine Execution Units
Load Buffer Store Buffer
DTLB STLB L1 Data Cache Memory
Subsystem L2 Cache
17 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • processed by individual execution units
Out-of-Order Execution
ITLB L1 Instruction Cache
Branch Instruction Fetch & PreDecode Predictor Instruction Queue
µOP Cache 4-Way Decode Frontend µOPs µOP µOP µOP µOP
MUX
Allocation Queue
µOP µOP µOP µOP Instructions are
CDB Reorder buffer µOP µOP µOP µOP µOP µOP µOP µOP • fetched and decoded in the front-end Scheduler µOP µOP µOP µOP µOP µOP µOP µOP • dispatched to the backend AGU Load data Load data Store data ALU, Branch ALU, Vect, . . . ALU, AES, . . . ALU, FMA, . . . Execution Engine Execution Units
Load Buffer Store Buffer
DTLB STLB L1 Data Cache Memory
Subsystem L2 Cache
17 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Out-of-Order Execution
ITLB L1 Instruction Cache
Branch Instruction Fetch & PreDecode Predictor Instruction Queue
µOP Cache 4-Way Decode Frontend µOPs µOP µOP µOP µOP
MUX
Allocation Queue
µOP µOP µOP µOP Instructions are
CDB Reorder buffer µOP µOP µOP µOP µOP µOP µOP µOP • fetched and decoded in the front-end Scheduler µOP µOP µOP µOP µOP µOP µOP µOP • dispatched to the backend AGU Load data Load data Store data ALU, Branch ALU, Vect, . . . ALU, AES, . . . ALU, FMA, . . . Execution Engine • processed by individual execution units Execution Units
Load Buffer Store Buffer
DTLB STLB L1 Data Cache Memory
Subsystem L2 Cache
17 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • wait until their dependencies are ready • Later instructions might execute prior earlier instructions • retire in-order • State becomes architecturally visible • Exceptions are checked during retirement • Flush pipeline and recover state
Out-of-Order Execution
ITLB L1 Instruction Cache
Branch Instruction Fetch & PreDecode Predictor Instruction Queue Instructions µOP Cache 4-Way Decode Frontend µOPs µOP µOP µOP µOP MUX • are executed out-of-order Allocation Queue
µOP µOP µOP µOP
CDB Reorder buffer
µOP µOP µOP µOP µOP µOP µOP µOP
Scheduler
µOP µOP µOP µOP µOP µOP µOP µOP AGU Load data Load data Store data ALU, Branch ALU, Vect, . . . ALU, AES, . . . ALU, FMA, . . . Execution Engine Execution Units
Load Buffer Store Buffer
DTLB STLB L1 Data Cache Memory
Subsystem L2 Cache
18 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Later instructions might execute prior earlier instructions • retire in-order • State becomes architecturally visible • Exceptions are checked during retirement • Flush pipeline and recover state
Out-of-Order Execution
ITLB L1 Instruction Cache
Branch Instruction Fetch & PreDecode Predictor Instruction Queue Instructions µOP Cache 4-Way Decode Frontend µOPs µOP µOP µOP µOP MUX • are executed out-of-order Allocation Queue µOP µOP µOP µOP • wait until their dependencies are ready CDB Reorder buffer
µOP µOP µOP µOP µOP µOP µOP µOP
Scheduler
µOP µOP µOP µOP µOP µOP µOP µOP AGU Load data Load data Store data ALU, Branch ALU, Vect, . . . ALU, AES, . . . ALU, FMA, . . . Execution Engine Execution Units
Load Buffer Store Buffer
DTLB STLB L1 Data Cache Memory
Subsystem L2 Cache
18 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • retire in-order • State becomes architecturally visible • Exceptions are checked during retirement • Flush pipeline and recover state
Out-of-Order Execution
ITLB L1 Instruction Cache
Branch Instruction Fetch & PreDecode Predictor Instruction Queue Instructions µOP Cache 4-Way Decode Frontend µOPs µOP µOP µOP µOP MUX • are executed out-of-order Allocation Queue µOP µOP µOP µOP • wait until their dependencies are ready CDB Reorder buffer
µOP µOP µOP µOP µOP µOP µOP µOP
Scheduler • Later instructions might execute prior earlier instructions
µOP µOP µOP µOP µOP µOP µOP µOP AGU Load data Load data Store data ALU, Branch ALU, Vect, . . . ALU, AES, . . . ALU, FMA, . . . Execution Engine Execution Units
Load Buffer Store Buffer
DTLB STLB L1 Data Cache Memory
Subsystem L2 Cache
18 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • State becomes architecturally visible • Exceptions are checked during retirement • Flush pipeline and recover state
Out-of-Order Execution
ITLB L1 Instruction Cache
Branch Instruction Fetch & PreDecode Predictor Instruction Queue Instructions µOP Cache 4-Way Decode Frontend µOPs µOP µOP µOP µOP MUX • are executed out-of-order Allocation Queue µOP µOP µOP µOP • wait until their dependencies are ready CDB Reorder buffer
µOP µOP µOP µOP µOP µOP µOP µOP
Scheduler • Later instructions might execute prior earlier instructions µOP µOP µOP µOP µOP µOP µOP µOP • retire in-order AGU Load data Load data Store data ALU, Branch ALU, Vect, . . . ALU, AES, . . . ALU, FMA, . . . Execution Engine Execution Units
Load Buffer Store Buffer
DTLB STLB L1 Data Cache Memory
Subsystem L2 Cache
18 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Exceptions are checked during retirement • Flush pipeline and recover state
Out-of-Order Execution
ITLB L1 Instruction Cache
Branch Instruction Fetch & PreDecode Predictor Instruction Queue Instructions µOP Cache 4-Way Decode Frontend µOPs µOP µOP µOP µOP MUX • are executed out-of-order Allocation Queue µOP µOP µOP µOP • wait until their dependencies are ready CDB Reorder buffer
µOP µOP µOP µOP µOP µOP µOP µOP
Scheduler • Later instructions might execute prior earlier instructions µOP µOP µOP µOP µOP µOP µOP µOP • retire in-order AGU Load data Load data Store data ALU, Branch ALU, Vect, . . .
ALU, AES, . . . • State becomes architecturally visible ALU, FMA, . . . Execution Engine Execution Units
Load Buffer Store Buffer
DTLB STLB L1 Data Cache Memory
Subsystem L2 Cache
18 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Flush pipeline and recover state
Out-of-Order Execution
ITLB L1 Instruction Cache
Branch Instruction Fetch & PreDecode Predictor Instruction Queue Instructions µOP Cache 4-Way Decode Frontend µOPs µOP µOP µOP µOP MUX • are executed out-of-order Allocation Queue µOP µOP µOP µOP • wait until their dependencies are ready CDB Reorder buffer
µOP µOP µOP µOP µOP µOP µOP µOP
Scheduler • Later instructions might execute prior earlier instructions µOP µOP µOP µOP µOP µOP µOP µOP • retire in-order AGU Load data Load data Store data ALU, Branch ALU, Vect, . . .
ALU, AES, . . . • State becomes architecturally visible ALU, FMA, . . . Execution Engine Execution Units • Exceptions are checked during retirement
Load Buffer Store Buffer
DTLB STLB L1 Data Cache Memory
Subsystem L2 Cache
18 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Out-of-Order Execution
ITLB L1 Instruction Cache
Branch Instruction Fetch & PreDecode Predictor Instruction Queue Instructions µOP Cache 4-Way Decode Frontend µOPs µOP µOP µOP µOP MUX • are executed out-of-order Allocation Queue µOP µOP µOP µOP • wait until their dependencies are ready CDB Reorder buffer
µOP µOP µOP µOP µOP µOP µOP µOP
Scheduler • Later instructions might execute prior earlier instructions µOP µOP µOP µOP µOP µOP µOP µOP • retire in-order AGU Load data Load data Store data ALU, Branch ALU, Vect, . . .
ALU, AES, . . . • State becomes architecturally visible ALU, FMA, . . . Execution Engine Execution Units • Exceptions are checked during retirement
Load Buffer Store Buffer
DTLB STLB • Flush pipeline and recover state L1 Data Cache Memory
Subsystem L2 Cache
18 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) The state does not become architecturally visible but . . . • Do they leave any side effects? • Can we observe them?
Thinking
• What happens to the instructions that are dismissed?
19 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Can we observe them?
Thinking
• What happens to the instructions that are dismissed? • Do they leave any side effects?
19 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Thinking
• What happens to the instructions that are dismissed? • Do they leave any side effects? • Can we observe them?
19 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Toy example
*(volatile char*) 0; // raise_exception(); array[84 * 4096] = 0;
20 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • “Unreachable” code line was actually executed • Exception was only thrown afterwards • Out-of-order instructions leave microarchitectural traces • Give such instructions a name: transient instructions
Building the Code
• Flush+Reload over all pages of the array
500 400 300 [cycles] Access time 0 50 100 150 200 250 Page
21 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Exception was only thrown afterwards • Out-of-order instructions leave microarchitectural traces • Give such instructions a name: transient instructions
Building the Code
• Flush+Reload over all pages of the array
500 400 300 [cycles] Access time 0 50 100 150 200 250 Page
• “Unreachable” code line was actually executed
21 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Out-of-order instructions leave microarchitectural traces • Give such instructions a name: transient instructions
Building the Code
• Flush+Reload over all pages of the array
500 400 300 [cycles] Access time 0 50 100 150 200 250 Page
• “Unreachable” code line was actually executed • Exception was only thrown afterwards
21 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Give such instructions a name: transient instructions
Building the Code
• Flush+Reload over all pages of the array
500 400 300 [cycles] Access time 0 50 100 150 200 250 Page
• “Unreachable” code line was actually executed • Exception was only thrown afterwards • Out-of-order instructions leave microarchitectural traces
21 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Building the Code
• Flush+Reload over all pages of the array
500 400 300 [cycles] Access time 0 50 100 150 200 250 Page
• “Unreachable” code line was actually executed • Exception was only thrown afterwards • Out-of-order instructions leave microarchitectural traces • Give such instructions a name: transient instructions
21 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Does the CPU ignore the semantics of exceptions before handling it? • This isolation is a combination of hardware and software • User applications cannot access anything from the kernel
Memory Isolation
Userspace Kernelspace • Kernel is isolated from user space
Operating Applications System Memory
22 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • User applications cannot access anything from the kernel
Memory Isolation
Userspace Kernelspace • Kernel is isolated from user space • This isolation is a combination of hardware and software
Operating Applications System Memory
22 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Memory Isolation
Userspace Kernelspace • Kernel is isolated from user space • This isolation is a combination of hardware and software • User applications cannot access Operating Applications System Memory anything from the kernel
22 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Physical memory is organized in page frames • Virtual memory pages are mapped to page frames using page tables
Paging
• CPU support virtual address spaces to isolate processes
23 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Virtual memory pages are mapped to page frames using page tables
Paging
• CPU support virtual address spaces to isolate processes • Physical memory is organized in page frames
23 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Paging
• CPU support virtual address spaces to isolate processes • Physical memory is organized in page frames • Virtual memory pages are mapped to page frames using page tables
23 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Address Translation on x86-64
PML4 CR3 PML4E 0 PML4E 1 · · PDPT #PML4I PDPTE 0 · · PDPTE 1 PML4E 511 · · Page Directory #PDPTI PDE 0 · · PDE 1 PDPTE 511 · · Page Table PDE #PDI PTE 0 · · PTE 1 PDE 511 · · 4 KiB Page PTE #PTI Byte 0 · · Byte 1 PTE 511 · · Offset · · PML4I (9 b) PDPTI (9 b) PDI (9 b) PTI (9 b) Offset (12 b) Byte 4095 48-bit virtual address
24 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Address Translation on x86-64
PML4 CR3 PML4E 0 PML4E 1 · · PDPT #PML4I PDPTE 0 · · PDPTE 1 PML4E 511 · · Page Directory #PDPTI PDE 0 · · PDE 1 PDPTE 511 · · Page Table PDE #PDI PTE 0 · · PTE 1 PDE 511 · · 4 KiB Page PTE #PTI Byte 0 · · Byte 1 PTE 511 · · Offset · · PML4I (9 b) PDPTI (9 b) PDI (9 b) PTI (9 b) Offset (12 b) Byte 4095 48-bit virtual address
24 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Page Table Entry
P RW US WT UC R D S G Ignored Physical Page Number
Ignored PK X
• User/Supervisor bit defines in which privilege level the page can be accessed
25 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Then check whether any part of array is cached
Building the Code
• Add another layer of indirection to test
char data = *(char*) 0xffffffff81a000e0; // kernel address array[data * 4096] = 0;
26 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Building the Code
• Add another layer of indirection to test
char data = *(char*) 0xffffffff81a000e0; // kernel address array[data * 4096] = 0;
• Then check whether any part of array is cached
26 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Permission check is in some cases not fast enough
Building the Code
• Flush+Reload over all pages of the array
500 400 300 [cycles] Access time 0 50 100 150 200 250 Page
• Index of cache hit reveals data
27 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Building the Code
• Flush+Reload over all pages of the array
500 400 300 [cycles] Access time 0 50 100 150 200 250 Page
• Index of cache hit reveals data • Permission check is in some cases not fast enough
27 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f)
• Entire physical memory is typically accessible through kernel space
Meltdown
• Using out-of-order execution, we can read data at any address
28 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Meltdown
• Using out-of-order execution, we can read data at any address • Entire physical memory is typically accessible through kernel space
28 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Experiment where a thread flushes the value constantly and a thread on a different core reloads the value • Target data is not in the L1 cache of the attacking core • We can still leak the data at a lower reading rate • Meltdown might implicitly cache the data
Uncached memory
• Assumed that one can only read data stored in the L1 with Meltdown
29 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Target data is not in the L1 cache of the attacking core • We can still leak the data at a lower reading rate • Meltdown might implicitly cache the data
Uncached memory
• Assumed that one can only read data stored in the L1 with Meltdown • Experiment where a thread flushes the value constantly and a thread on a different core reloads the value
29 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • We can still leak the data at a lower reading rate • Meltdown might implicitly cache the data
Uncached memory
• Assumed that one can only read data stored in the L1 with Meltdown • Experiment where a thread flushes the value constantly and a thread on a different core reloads the value • Target data is not in the L1 cache of the attacking core
29 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Meltdown might implicitly cache the data
Uncached memory
• Assumed that one can only read data stored in the L1 with Meltdown • Experiment where a thread flushes the value constantly and a thread on a different core reloads the value • Target data is not in the L1 cache of the attacking core • We can still leak the data at a lower reading rate
29 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Uncached memory
• Assumed that one can only read data stored in the L1 with Meltdown • Experiment where a thread flushes the value constantly and a thread on a different core reloads the value • Target data is not in the L1 cache of the attacking core • We can still leak the data at a lower reading rate • Meltdown might implicitly cache the data
29 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Uncached memory
• Assumed that one can only read data stored in the L1 with Meltdown • Experiment where a thread flushes the value constantly and a thread on a different core reloads the value • Target data is not in the L1 cache of the attacking core • We can still leak the data at a lower reading rate • Meltdown might implicitly cache the data
29 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Which bits are left in the PTE? Page Table Entry
P RW US WT UC R D S G Ignored Physical Page Number
Ignored PK X
• Present bit defines whether a page is present in physical memory.
30 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Use virtual machine → completely controlled by attacker
Problem
• Can not really manage the PTE from user space
31 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Problem
• Can not really manage the PTE from user space • Use virtual machine → completely controlled by attacker
31 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Virtual Machine: Extended Page Table Nested Walk Nested Walk Nested Walk Nested Walk ...... #0 #1 #0 #1 #0 #1 #0 #1 #511 #511 #511 #511 Offset Byte 0 Byte 1 Byte 4095 Page Table 4 KiB Page PTE (guest) PDE (guest) Pointer Table Page Directory Page Directory PML4E (guest) PDPTE (guest) Pag-Map Level-4 Nested Walk CR3 (guest)
63 0
Sign extended (16 bit) PML4I (9 bit) PDPTI (9 bit) PDI (9 bit) PTI (9 bit) Offset (12 bit)
64-bit guest virtual address
32 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Present Bit
Page Table PTE 0 PTE 1 · · PTE #PTI · · PTE 511
L1 Cache
33 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Present Bit
Page Table PTE 0 PTE 1 · · present PTE #PTI · · PTE 511
L1 Cache
33 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Present Bit
Page Table PTE 0 PTE 1 · · present Guest Physical PTE #PTI · to Host Physical · PTE 511
L1 Cache
33 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Present Bit
Page Table PTE 0 PTE 1 · · present Guest Physical PTE #PTI Physical · to Host Physical · Page PTE 511
L1 lookup L1 with Cache physical address
33 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Present Bit
Page Table PTE 0 PTE 1 · · not present PTE #PTI · · PTE 511
L1 Cache
33 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Present Bit
Page Table PTE 0 PTE 1 · · not present PTE #PTI · · PTE 511
L1 lookup L1 with Cache virtual address
33 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Demo • Leak data from L1 data cache • Affects virtual machines (VM), hypervisors (VMM), operating systems (OS) and system management mode (SMM) • Read SGX-protected memory and leak machine’s private attestation key
Impact
• Foreshadow or L1TF
34 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Affects virtual machines (VM), hypervisors (VMM), operating systems (OS) and system management mode (SMM) • Read SGX-protected memory and leak machine’s private attestation key
Impact
• Foreshadow or L1TF • Leak data from L1 data cache
34 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Read SGX-protected memory and leak machine’s private attestation key
Impact
• Foreshadow or L1TF • Leak data from L1 data cache • Affects virtual machines (VM), hypervisors (VMM), operating systems (OS) and system management mode (SMM)
34 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Impact
• Foreshadow or L1TF • Leak data from L1 data cache • Affects virtual machines (VM), hypervisors (VMM), operating systems (OS) and system management mode (SMM) • Read SGX-protected memory and leak machine’s private attestation key
34 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) There are still bits left in the PTE Page Table Entry
P RW US WT UC R D S G Ignored Physical Page Number
Ignored PK X
• PK bits define the assigned protection key
35 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • 4 bits in PTE identify key for protected memory regions • Quick update of access rights • Available on Intel Xeon CPUs
Intel MPK
• Protection key for a group of pages
36 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Quick update of access rights • Available on Intel Xeon CPUs
Intel MPK
• Protection key for a group of pages • 4 bits in PTE identify key for protected memory regions
36 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Available on Intel Xeon CPUs
Intel MPK
• Protection key for a group of pages • 4 bits in PTE identify key for protected memory regions • Quick update of access rights
36 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Intel MPK
• Protection key for a group of pages • 4 bits in PTE identify key for protected memory regions • Quick update of access rights • Available on Intel Xeon CPUs
36 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Protected value is forwarded to transient instructions
Protection Keys
• Protection keys are lazily enforced
37 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Protection Keys
• Protection keys are lazily enforced • Protected value is forwarded to transient instructions
37 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Other exceptions? • Data used in transient execution • Attacker can determine data • First Meltdown-type attack on AMD
Bound Range Exceeded
• x86 provides dedicated instruction raising #BR exception if bound-range is exceeded
38 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Attacker can determine data • First Meltdown-type attack on AMD
Bound Range Exceeded
• x86 provides dedicated instruction raising #BR exception if bound-range is exceeded • Data used in transient execution
38 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • First Meltdown-type attack on AMD
Bound Range Exceeded
• x86 provides dedicated instruction raising #BR exception if bound-range is exceeded • Data used in transient execution • Attacker can determine data
38 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Bound Range Exceeded
• x86 provides dedicated instruction raising #BR exception if bound-range is exceeded • Data used in transient execution • Attacker can determine data • First Meltdown-type attack on AMD
38 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) We tested all of them Non-Executable Present Read/Write Alignment Check
Protection Key Page Fault User/Supervisor Stack Segment
SMAP General Protection
Device not available Bound-Range-Exceeded Illegal Opcode
MPX Bound Division-by-Zero
Take a closer look . . . again
Exceptions
39 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Non-Executable Present Read/Write Alignment Check
Protection Key User/Supervisor Stack Segment
SMAP General Protection
Device not available Bound-Range-Exceeded Illegal Opcode
MPX Bound Division-by-Zero
Take a closer look . . . again
Page Fault
Exceptions
39 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Alignment Check
Stack Segment
General Protection
Device not available Bound-Range-Exceeded Illegal Opcode
MPX Bound Division-by-Zero
Take a closer look . . . again
Non-Executable Present Read/Write
Protection Key Page Fault User/Supervisor
SMAP Exceptions
39 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Alignment Check
Stack Segment
General Protection
Bound-Range-Exceeded Illegal Opcode
MPX Bound Division-by-Zero
Take a closer look . . . again
Non-Executable Present Read/Write
Protection Key Page Fault User/Supervisor
SMAP Exceptions
Device not available
39 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Stack Segment
General Protection
Bound-Range-Exceeded Illegal Opcode
MPX Bound Division-by-Zero
Take a closer look . . . again
Non-Executable Present Read/Write Alignment Check
Protection Key Page Fault User/Supervisor
SMAP Exceptions
Device not available
39 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Stack Segment
General Protection
Bound-Range-Exceeded Illegal Opcode
MPX Bound
Take a closer look . . . again
Non-Executable Present Read/Write Alignment Check
Protection Key Page Fault User/Supervisor
SMAP Exceptions
Device not available
Division-by-Zero
39 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Stack Segment
General Protection
Bound-Range-Exceeded
MPX Bound
Take a closer look . . . again
Non-Executable Present Read/Write Alignment Check
Protection Key Page Fault User/Supervisor
SMAP Exceptions
Device not available Illegal Opcode
Division-by-Zero
39 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) General Protection
Bound-Range-Exceeded
MPX Bound
Take a closer look . . . again
Non-Executable Present Read/Write Alignment Check
Protection Key Page Fault User/Supervisor Stack Segment
SMAP Exceptions
Device not available Illegal Opcode
Division-by-Zero
39 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Bound-Range-Exceeded
MPX Bound
Take a closer look . . . again
Non-Executable Present Read/Write Alignment Check
Protection Key Page Fault User/Supervisor Stack Segment
SMAP Exceptions General Protection
Device not available Illegal Opcode
Division-by-Zero
39 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) MPX Bound
Take a closer look . . . again
Non-Executable Present Read/Write Alignment Check
Protection Key Page Fault User/Supervisor Stack Segment
SMAP Exceptions General Protection
Device not available Bound-Range-Exceeded Illegal Opcode
Division-by-Zero
39 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Take a closer look . . . again
Non-Executable Present Read/Write Alignment Check
Protection Key Page Fault User/Supervisor Stack Segment
SMAP Exceptions General Protection
Device not available Bound-Range-Exceeded Illegal Opcode
MPX Bound Division-by-Zero
39 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Take a closer look . . . again
Non-Executable Present Read/Write Alignment Check
Protection Key Page Fault User/Supervisor Stack Segment
SMAP Exceptions General Protection
Device not available Bound-Range-Exceeded Illegal Opcode
MPX Bound Division-by-Zero
39 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Take a closer look . . . again
Non-Executable Present Read/Write Alignment Check
Protection Key Page Fault User/Supervisor Stack Segment
SMAP Exceptions General Protection
Device not available Bound-Range-Exceeded Illegal Opcode
MPX Bound Division-by-Zero
39 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Propose: use exception/bit for naming attack → Meltdown = Meltdown-US → Foreshadow = Meltdown-P → Meltdown 3a = Meltdown-GP → Protection Keys = Meltdown-PK → ...
New Naming Scheme
• Names are still confusing
40 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) → Meltdown = Meltdown-US → Foreshadow = Meltdown-P → Meltdown 3a = Meltdown-GP → Protection Keys = Meltdown-PK → ...
New Naming Scheme
• Names are still confusing • Propose: use exception/bit for naming attack
40 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) → Foreshadow = Meltdown-P → Meltdown 3a = Meltdown-GP → Protection Keys = Meltdown-PK → ...
New Naming Scheme
• Names are still confusing • Propose: use exception/bit for naming attack → Meltdown = Meltdown-US
40 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) → Meltdown 3a = Meltdown-GP → Protection Keys = Meltdown-PK → ...
New Naming Scheme
• Names are still confusing • Propose: use exception/bit for naming attack → Meltdown = Meltdown-US → Foreshadow = Meltdown-P
40 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) → Protection Keys = Meltdown-PK → ...
New Naming Scheme
• Names are still confusing • Propose: use exception/bit for naming attack → Meltdown = Meltdown-US → Foreshadow = Meltdown-P → Meltdown 3a = Meltdown-GP
40 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) → ...
New Naming Scheme
• Names are still confusing • Propose: use exception/bit for naming attack → Meltdown = Meltdown-US → Foreshadow = Meltdown-P → Meltdown 3a = Meltdown-GP → Protection Keys = Meltdown-PK
40 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) New Naming Scheme
• Names are still confusing • Propose: use exception/bit for naming attack → Meltdown = Meltdown-US → Foreshadow = Meltdown-P → Meltdown 3a = Meltdown-GP → Protection Keys = Meltdown-PK → ...
40 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) How can we fix this? D1 Architecturally inaccessible data is also microarchitecturally inaccessible
Meltdown Defense Categorization
Meltdown defenses in 2 categories:
41 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Meltdown Defense Categorization
Meltdown defenses in 2 categories:
D1 Architecturally inaccessible data is also microarchitecturally inaccessible
41 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Why don’t we take the kernel addresses...
Take the kernel addresses...
• Kernel addresses in user space are a problem
42 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Take the kernel addresses...
• Kernel addresses in user space are a problem • Why don’t we take the kernel addresses...
42 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • User accessible check in hardware is not reliable
...and remove them
• ...and remove them if not needed?
43 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) ...and remove them
• ...and remove them if not needed? • User accessible check in hardware is not reliable
43 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Meltdown-US Mitigation: KAISER
Userspace Kernelspace
Operating Applications System Memory
44 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) KAISER
Kernel View User View
Userspace Kernelspace Userspace Kernelspace
Operating Applications System Memory Applications
context switch
45 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Apple: Released updates • Windows: Kernel Virtual Address (KVA) Shadow
Mitigations
• Linux: Kernel Page-table Isolation (KPTI)
46 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Windows: Kernel Virtual Address (KVA) Shadow
Mitigations
• Linux: Kernel Page-table Isolation (KPTI) • Apple: Released updates
46 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Mitigations
• Linux: Kernel Page-table Isolation (KPTI) • Apple: Released updates • Windows: Kernel Virtual Address (KVA) Shadow
46 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Meltdown Defense Categorization
Meltdown defenses in 2 categories:
D1 Architecturally inaccessible data is D2 Preventing occurrence of faults also microarchitecturally inaccessible
47 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Accesses which would normally fault → become architecturally and microarchitecturally valid accesses • But do not leak data • SGX Page Abort Semantics • Returns -1 when enclave memory is accessed from the outside world • Meltdown-NM mitigation: Use eager switching instead of lazy switching in the FPU • FPU is always available → never faults
D2: Preventing occurrence of faults
• Prevent the occurrence of faults in the first place
48 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • But do not leak data • SGX Page Abort Semantics • Returns -1 when enclave memory is accessed from the outside world • Meltdown-NM mitigation: Use eager switching instead of lazy switching in the FPU • FPU is always available → never faults
D2: Preventing occurrence of faults
• Prevent the occurrence of faults in the first place • Accesses which would normally fault → become architecturally and microarchitecturally valid accesses
48 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • SGX Page Abort Semantics • Returns -1 when enclave memory is accessed from the outside world • Meltdown-NM mitigation: Use eager switching instead of lazy switching in the FPU • FPU is always available → never faults
D2: Preventing occurrence of faults
• Prevent the occurrence of faults in the first place • Accesses which would normally fault → become architecturally and microarchitecturally valid accesses • But do not leak data
48 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Returns -1 when enclave memory is accessed from the outside world • Meltdown-NM mitigation: Use eager switching instead of lazy switching in the FPU • FPU is always available → never faults
D2: Preventing occurrence of faults
• Prevent the occurrence of faults in the first place • Accesses which would normally fault → become architecturally and microarchitecturally valid accesses • But do not leak data • SGX Page Abort Semantics
48 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Meltdown-NM mitigation: Use eager switching instead of lazy switching in the FPU • FPU is always available → never faults
D2: Preventing occurrence of faults
• Prevent the occurrence of faults in the first place • Accesses which would normally fault → become architecturally and microarchitecturally valid accesses • But do not leak data • SGX Page Abort Semantics • Returns -1 when enclave memory is accessed from the outside world
48 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • FPU is always available → never faults
D2: Preventing occurrence of faults
• Prevent the occurrence of faults in the first place • Accesses which would normally fault → become architecturally and microarchitecturally valid accesses • But do not leak data • SGX Page Abort Semantics • Returns -1 when enclave memory is accessed from the outside world • Meltdown-NM mitigation: Use eager switching instead of lazy switching in the FPU
48 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) D2: Preventing occurrence of faults
• Prevent the occurrence of faults in the first place • Accesses which would normally fault → become architecturally and microarchitecturally valid accesses • But do not leak data • SGX Page Abort Semantics • Returns -1 when enclave memory is accessed from the outside world • Meltdown-NM mitigation: Use eager switching instead of lazy switching in the FPU • FPU is always available → never faults
48 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) D2: Preventing occurrence of faults
• Prevent the occurrence of faults in the first place • Accesses which would normally fault → become architecturally and microarchitecturally valid accesses • But do not leak data • SGX Page Abort Semantics • Returns -1 when enclave memory is accessed from the outside world • Meltdown-NM mitigation: Use eager switching instead of lazy switching in the FPU • FPU is always available → never faults
48 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Flush L1 upon switching protection domains
Clear phyiscal address field of unmapped PTEs
Meltdown-P Mitigation
49 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Flush L1 upon switching protection domains
Meltdown-P Mitigation
Clear phyiscal address field of unmapped PTEs
49 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Meltdown-P Mitigation
Clear phyiscal address field of Flush L1 upon switching protection unmapped PTEs domains
49 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) What about other microarchitectural elements? Branch Prediction • . . . based on what happened in the past • Speculative execution of instructions • Correct prediction, . . . • . . . very fast • otherwise: Discard results
Branch Prediction
• CPU tries to predict the future, . . .
50 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Speculative execution of instructions • Correct prediction, . . . • . . . very fast • otherwise: Discard results
Branch Prediction
• CPU tries to predict the future, . . . • . . . based on what happened in the past
50 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Correct prediction, . . . • . . . very fast • otherwise: Discard results
Branch Prediction
• CPU tries to predict the future, . . . • . . . based on what happened in the past • Speculative execution of instructions
50 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • . . . very fast • otherwise: Discard results
Branch Prediction
• CPU tries to predict the future, . . . • . . . based on what happened in the past • Speculative execution of instructions • Correct prediction, . . .
50 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • otherwise: Discard results
Branch Prediction
• CPU tries to predict the future, . . . • . . . based on what happened in the past • Speculative execution of instructions • Correct prediction, . . . • . . . very fast
50 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Branch Prediction
• CPU tries to predict the future, . . . • . . . based on what happened in the past • Speculative execution of instructions • Correct prediction, . . . • . . . very fast • otherwise: Discard results
50 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Memory Pattern History Table Branch Target Return Stack Buffer Disambiguation (PHT) Buffer (BTB) (RSB) (STL)
Branch Prediction
• CPU has many different predictors...
51 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Memory Branch Target Return Stack Buffer Disambiguation Buffer (BTB) (RSB) (STL)
Branch Prediction
• CPU has many different predictors...
Pattern History Table (PHT)
51 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Memory Return Stack Buffer Disambiguation (RSB) (STL)
Branch Prediction
• CPU has many different predictors...
Pattern History Table Branch Target (PHT) Buffer (BTB)
51 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Memory Disambiguation (STL)
Branch Prediction
• CPU has many different predictors...
Pattern History Table Branch Target Return Stack Buffer (PHT) Buffer (BTB) (RSB)
51 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Branch Prediction
• CPU has many different predictors...
Memory Pattern History Table Branch Target Return Stack Buffer Disambiguation (PHT) Buffer (BTB) (RSB) (STL)
51 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Can we trick the CPU into executing code speculatively? And exploit that? Pattern History Table (PHT) Pattern History Table (PHT)
LUT index =0;
char* data = "textKEY";
if (index <4)
else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =0;
char* data ="textKEY";
if (index <4)
else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =0;
char* data ="textKEY";
if (index <4) Speculate else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =0;
char* data ="textKEY";
if (index <4) Execute else then Index ’t’ Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =1;
char* data = "textKEY";
if (index <4)
else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =1;
char* data = "textKEY";
if (index <4)
else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =1;
char* data = "textKEY";
if (index <4) Speculate Index ’e’ else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =1;
char* data = "textKEY";
if (index <4)
Index ’e’ else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =2;
char* data = "textKEY";
if (index <4)
else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =2;
char* data = "textKEY";
if (index <4)
else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =2;
char* data = "textKEY";
if (index <4) Speculate else then Prediction Index ’x’ LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =2;
char* data = "textKEY";
if (index <4)
else then Prediction Index ’x’ LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =3;
char* data = "textKEY";
if (index <4)
else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =3;
char* data = "textKEY";
if (index <4)
else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =3;
char* data = "textKEY";
if (index <4) Speculate else then Index ’t’ Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =3;
char* data = "textKEY";
if (index <4)
else then Index ’t’ Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =4;
char* data = "textKEY";
if (index <4)
else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =4;
char* data = "textKEY";
if (index <4)
else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =4;
Index ’K’ char* data = "textKEY";
if (index <4) Speculate else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =4;
Index ’K’ char* data = "textKEY";
if (index <4) Execute else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =5;
char* data = "textKEY";
if (index <4)
else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =5;
char* data = "textKEY";
if (index <4)
else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =5;
Index ’E’ char* data = "textKEY";
if (index <4) Speculate else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =5;
Index ’E’ char* data = "textKEY";
if (index <4) Execute else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =6;
char* data = "textKEY";
if (index <4)
else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =6;
char* data = "textKEY";
if (index <4)
else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =6;
char* data = "textKEY"; Index ’Y’
if (index <4) Speculate else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Pattern History Table (PHT)
LUT index =6;
char* data = "textKEY"; Index ’Y’
if (index <4) Execute else then Prediction
LUT[data[index] * 4096] 0
52 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Spectre Variant 1.1: Bounds Check Bypass on Stores • Intel: Side-Channel Vulnerability 1 • Propose: Spectre-PHT
Naming
• Spectre Variant 1: Bounds Check Bypass on Load
53 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Intel: Side-Channel Vulnerability 1 • Propose: Spectre-PHT
Naming
• Spectre Variant 1: Bounds Check Bypass on Load • Spectre Variant 1.1: Bounds Check Bypass on Stores
53 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Propose: Spectre-PHT
Naming
• Spectre Variant 1: Bounds Check Bypass on Load • Spectre Variant 1.1: Bounds Check Bypass on Stores • Intel: Side-Channel Vulnerability 1
53 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Naming
• Spectre Variant 1: Bounds Check Bypass on Load • Spectre Variant 1.1: Bounds Check Bypass on Stores • Intel: Side-Channel Vulnerability 1 • Propose: Spectre-PHT
53 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Branch Target Buffer (BTB) Branch Target Buffer (BTB)
Animal* a = bird;
a->move()
swim() swim() fly()
Prediction LUT[data[index] * 4096] 0
54 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Branch Target Buffer (BTB)
Animal* a = bird;
a->move() Speculate swim() swim() fly()
Prediction LUT[data[index] * 4096] 0
54 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Branch Target Buffer (BTB)
Animal* a = bird;
a->move()
swim() swim() fly()
Prediction LUT[data[index] * 4096] 0
54 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Branch Target Buffer (BTB)
Animal* a = bird;
a->move() Execute swim() swim() fly()
Prediction LUT[data[index] * 4096] 0
54 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Branch Target Buffer (BTB)
Animal* a = bird;
a->move()
swim() fly() fly()
Prediction LUT[data[index] * 4096] 0
54 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Branch Target Buffer (BTB)
Animal* a = bird;
a->move() Speculate swim() fly() fly()
Prediction LUT[data[index] * 4096] 0
54 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Branch Target Buffer (BTB)
Animal* a = bird;
a->move()
swim() fly() fly()
Prediction LUT[data[index] * 4096] 0
54 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Branch Target Buffer (BTB)
Animal* a = fish;
a->move()
swim() fly() fly()
Prediction LUT[data[index] * 4096] 0
54 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Branch Target Buffer (BTB)
Animal* a = fish;
a->move() Speculate swim() fly() fly()
Prediction LUT[data[index] * 4096] 0
54 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Branch Target Buffer (BTB)
Animal* a = fish;
a->move()
swim() fly() fly()
Prediction LUT[data[index] * 4096] 0
54 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Branch Target Buffer (BTB)
Animal* a = fish;
a->move() Execute swim() fly() fly()
Prediction LUT[data[index] * 4096] 0
54 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Branch Target Buffer (BTB)
Animal* a = fish;
a->move()
swim() swim() fly()
Prediction LUT[data[index] * 4096] 0
54 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Intel: Side-Channel Vulnerability 2 • Propose: Spectre-BTB
Naming
• Spectre Variant 2: Branch Target Injection
55 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Propose: Spectre-BTB
Naming
• Spectre Variant 2: Branch Target Injection • Intel: Side-Channel Vulnerability 2
55 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Naming
• Spectre Variant 2: Branch Target Injection • Intel: Side-Channel Vulnerability 2 • Propose: Spectre-BTB
55 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Return Stack Buffer (RSB) Return Stack Buffer (RSB)
function() ...
Victim Attacker
RSB
56 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Return Stack Buffer (RSB)
function() ...
Victim Attacker reg = secret reg = dummy
RSB
56 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Return Stack Buffer (RSB)
function() ...
Victim Attacker reg = secret reg = dummy call function(SHORT) RSB
&victim
56 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Return Stack Buffer (RSB)
function() ...
Victim Attacker reg = secret reg = dummy call function(SHORT) call function(LONG) RSB data[reg * 4096]
&attacker &victim
56 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Return Stack Buffer (RSB)
function() ...
Victim Attacker reg = secret reg = dummy call function(SHORT) call function(LONG) RSB data[reg * 4096]
&attacker &victim
56 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Return Stack Buffer (RSB)
function() ...
Victim Attacker reg = secret reg = dummy call function(SHORT) call function(LONG) RSB data[reg * 4096]
&victim
56 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Return Stack Buffer (RSB)
function() ...
Victim Attacker reg = secret reg = dummy call function(SHORT) call function(LONG) RSB data[reg * 4096]
&victim
56 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • SpectreRSB • ret2spec • Propose: Spectre-RSB
Naming
• Spectre-NG
57 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • ret2spec • Propose: Spectre-RSB
Naming
• Spectre-NG • SpectreRSB
57 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Propose: Spectre-RSB
Naming
• Spectre-NG • SpectreRSB • ret2spec
57 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Naming
• Spectre-NG • SpectreRSB • ret2spec • Propose: Spectre-RSB
57 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Memory Disambiguation (STL) → need to check for previous stores • Check is time consuming • Optimization: Speculate whether a store happened or not • no store: bypass check • stall
Memory Disambiguation (STL)
• Loads can be executed out-of-order
58 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Check is time consuming • Optimization: Speculate whether a store happened or not • no store: bypass check • stall
Memory Disambiguation (STL)
• Loads can be executed out-of-order → need to check for previous stores
58 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Optimization: Speculate whether a store happened or not • no store: bypass check • stall
Memory Disambiguation (STL)
• Loads can be executed out-of-order → need to check for previous stores • Check is time consuming
58 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • no store: bypass check • stall
Memory Disambiguation (STL)
• Loads can be executed out-of-order → need to check for previous stores • Check is time consuming • Optimization: Speculate whether a store happened or not
58 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • stall
Memory Disambiguation (STL)
• Loads can be executed out-of-order → need to check for previous stores • Check is time consuming • Optimization: Speculate whether a store happened or not • no store: bypass check
58 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Memory Disambiguation (STL)
• Loads can be executed out-of-order → need to check for previous stores • Check is time consuming • Optimization: Speculate whether a store happened or not • no store: bypass check • stall
58 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Propose: Spectre-STL
Naming
• Spectre Variant 4: Speculative Store Bypass
59 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Naming
• Spectre Variant 4: Speculative Store Bypass • Propose: Spectre-STL
59 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) What can you attack? OS / HV
Other Process
What can you attack?
Own Process
60 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Other Process
What can you attack?
OS / HV
Own Process
60 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) What can you attack?
OS / HV
Own Process
Other Process
60 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) What can you attack?
OS / HV
Own Process
Other Process
60 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) How can we mistrain the CPU? How can we mistrain?
Victim
same address space/ Victim in place branch
61 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) How can we mistrain?
Victim
same address space/ Congruent out of place branch Address collision
same address space/ Victim in place branch
61 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) How can we mistrain?
Victim
same address space/ Congruent out of place branch Address collision
same address space/ Victim in place branch
Shared Branch Prediction State
61 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) How can we mistrain?
Victim Attacker
same address space/ Congruent out of place branch Address collision
same address space/ Victim in place branch
Shared Branch Prediction State
61 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) How can we mistrain?
Victim Attacker
same address space/ Congruent out of place branch Address collision
same address space/ Victim Shadow cross address space/ in place branch branch in place
Shared Branch Prediction State
61 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) How can we mistrain?
Victim Attacker
same address space/ Congruent Congruent cross address space/ out of place branch branch out of place Address collision Address collision
same address space/ Victim Shadow cross address space/ in place branch branch in place
Shared Branch Prediction State
61 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) How can we fix this? syscall/hypercall ? context switch
How can we fix this?
OS / HV
Own Process
Other Process
62 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) ? context switch
How can we fix this?
syscall/hypercall OS / HV
Own Process
Other Process
62 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) ?
How can we fix this?
syscall/hypercall OS / HV
Own Process context switch
Other Process
62 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) How can we fix this?
syscall/hypercall OS / HV ? Own Process context switch
Other Process
62 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) C1 Mitigating or reducing the accuracy of covert channels
Spectre Defense Categorization
Spectre defenses in 3 categories:
63 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Spectre Defense Categorization
Spectre defenses in 3 categories:
C1 Mitigating or reducing the accuracy of covert channels
63 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) → System would be too slow • Remove flush instructions → Use eviction • Remove rdtsc → Many alternative high-resolution timers • Build new caches: DAWG, InvisiSpec, SafeSpec → Require hardware changes
• Other covert channels besides the cache: • PortSmash • SMoTHERSpectre • AVX
C1: Mitigating covert channels
• Deactivate the cache
64 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Remove flush instructions → Use eviction • Remove rdtsc → Many alternative high-resolution timers • Build new caches: DAWG, InvisiSpec, SafeSpec → Require hardware changes
• Other covert channels besides the cache: • PortSmash • SMoTHERSpectre • AVX
C1: Mitigating covert channels
• Deactivate the cache → System would be too slow
64 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) → Use eviction • Remove rdtsc → Many alternative high-resolution timers • Build new caches: DAWG, InvisiSpec, SafeSpec → Require hardware changes
• Other covert channels besides the cache: • PortSmash • SMoTHERSpectre • AVX
C1: Mitigating covert channels
• Deactivate the cache → System would be too slow • Remove flush instructions
64 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Remove rdtsc → Many alternative high-resolution timers • Build new caches: DAWG, InvisiSpec, SafeSpec → Require hardware changes
• Other covert channels besides the cache: • PortSmash • SMoTHERSpectre • AVX
C1: Mitigating covert channels
• Deactivate the cache → System would be too slow • Remove flush instructions → Use eviction
64 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) → Many alternative high-resolution timers • Build new caches: DAWG, InvisiSpec, SafeSpec → Require hardware changes
• Other covert channels besides the cache: • PortSmash • SMoTHERSpectre • AVX
C1: Mitigating covert channels
• Deactivate the cache → System would be too slow • Remove flush instructions → Use eviction • Remove rdtsc
64 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Build new caches: DAWG, InvisiSpec, SafeSpec → Require hardware changes
• Other covert channels besides the cache: • PortSmash • SMoTHERSpectre • AVX
C1: Mitigating covert channels
• Deactivate the cache → System would be too slow • Remove flush instructions → Use eviction • Remove rdtsc → Many alternative high-resolution timers
64 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) → Require hardware changes
• Other covert channels besides the cache: • PortSmash • SMoTHERSpectre • AVX
C1: Mitigating covert channels
• Deactivate the cache → System would be too slow • Remove flush instructions → Use eviction • Remove rdtsc → Many alternative high-resolution timers • Build new caches: DAWG, InvisiSpec, SafeSpec
64 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Other covert channels besides the cache: • PortSmash • SMoTHERSpectre • AVX
C1: Mitigating covert channels
• Deactivate the cache → System would be too slow • Remove flush instructions → Use eviction • Remove rdtsc → Many alternative high-resolution timers • Build new caches: DAWG, InvisiSpec, SafeSpec → Require hardware changes
64 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) C1: Mitigating covert channels
• Deactivate the cache → System would be too slow • Remove flush instructions → Use eviction • Remove rdtsc → Many alternative high-resolution timers • Build new caches: DAWG, InvisiSpec, SafeSpec → Require hardware changes
• Other covert channels besides the cache: • PortSmash • SMoTHERSpectre • AVX
64 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Spectre Defense Categorization
Spectre defenses in 3 categories:
C1 Mitigating or reducing C2 Mitigating or aborting the accuracy of covert speculation channels
65 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) C2: Mitigating or aborting speculation in hardware
Drilling template (@kreon nrw)
66 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Single Thread Indirect Branch Prediction (STIBP) • Indirect Branch Predictor Barrier (IBPB) • ARMv8.5-A: New barrier (sb) • Speculative Store Bypass Safe/Disable (SSBS/SSBD)
C2: Mitigating or aborting speculation in hardware
• Three new controls for ISA: • Indirect Branch Restricted Speculation (IBRS)
67 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Indirect Branch Predictor Barrier (IBPB) • ARMv8.5-A: New barrier (sb) • Speculative Store Bypass Safe/Disable (SSBS/SSBD)
C2: Mitigating or aborting speculation in hardware
• Three new controls for ISA: • Indirect Branch Restricted Speculation (IBRS) • Single Thread Indirect Branch Prediction (STIBP)
67 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • ARMv8.5-A: New barrier (sb) • Speculative Store Bypass Safe/Disable (SSBS/SSBD)
C2: Mitigating or aborting speculation in hardware
• Three new controls for ISA: • Indirect Branch Restricted Speculation (IBRS) • Single Thread Indirect Branch Prediction (STIBP) • Indirect Branch Predictor Barrier (IBPB)
67 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Speculative Store Bypass Safe/Disable (SSBS/SSBD)
C2: Mitigating or aborting speculation in hardware
• Three new controls for ISA: • Indirect Branch Restricted Speculation (IBRS) • Single Thread Indirect Branch Prediction (STIBP) • Indirect Branch Predictor Barrier (IBPB) • ARMv8.5-A: New barrier (sb)
67 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) C2: Mitigating or aborting speculation in hardware
• Three new controls for ISA: • Indirect Branch Restricted Speculation (IBRS) • Single Thread Indirect Branch Prediction (STIBP) • Indirect Branch Predictor Barrier (IBPB) • ARMv8.5-A: New barrier (sb) • Speculative Store Bypass Safe/Disable (SSBS/SSBD)
67 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Developers need to know where to put them • Retpoline: replacing indirect branches with return instructions • RSB Stuffing • Fill RSB with benign addresses on every context switch
C2: Mitigating or aborting speculation in software
• Use serializing instructions on branch outcomes (lfence, dsb sy)
68 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Retpoline: replacing indirect branches with return instructions • RSB Stuffing • Fill RSB with benign addresses on every context switch
C2: Mitigating or aborting speculation in software
• Use serializing instructions on branch outcomes (lfence, dsb sy) • Developers need to know where to put them
68 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • RSB Stuffing • Fill RSB with benign addresses on every context switch
C2: Mitigating or aborting speculation in software
• Use serializing instructions on branch outcomes (lfence, dsb sy) • Developers need to know where to put them • Retpoline: replacing indirect branches with return instructions
68 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Fill RSB with benign addresses on every context switch
C2: Mitigating or aborting speculation in software
• Use serializing instructions on branch outcomes (lfence, dsb sy) • Developers need to know where to put them • Retpoline: replacing indirect branches with return instructions • RSB Stuffing
68 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) C2: Mitigating or aborting speculation in software
• Use serializing instructions on branch outcomes (lfence, dsb sy) • Developers need to know where to put them • Retpoline: replacing indirect branches with return instructions • RSB Stuffing • Fill RSB with benign addresses on every context switch
68 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Spectre Defense Categorization
Spectre defenses in 3 categories:
C1 Mitigating or reducing C2 Mitigating or aborting C3 Ensuring secret data the accuracy of covert speculation cannot be reached channels
69 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Webkit: Index masking instead of array bounds checks • Webkit: Poison values to protect pointers • Chrome: Site Isolation
C3: Ensuring secret data cannot be reached
• What to do if an attacker attacks his own process (sandboxing)?
70 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Webkit: Poison values to protect pointers • Chrome: Site Isolation
C3: Ensuring secret data cannot be reached
• What to do if an attacker attacks his own process (sandboxing)? • Webkit: Index masking instead of array bounds checks
70 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Chrome: Site Isolation
C3: Ensuring secret data cannot be reached
• What to do if an attacker attacks his own process (sandboxing)? • Webkit: Index masking instead of array bounds checks • Webkit: Poison values to protect pointers
70 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) C3: Ensuring secret data cannot be reached
• What to do if an attacker attacks his own process (sandboxing)? • Webkit: Index masking instead of array bounds checks • Webkit: Poison values to protect pointers • Chrome: Site Isolation
70 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Systematization Transient cause? Spectre-type
prediction Transient cause? fault
Meltdown-type microarchitec- Spectre-PHT tural buffer Spectre-BTB
Spectre-RSB Spectre-type Spectre-STL
prediction Transient cause? fault
Meltdown-type mistraining strategy Cross-address-space microarchitec- Spectre-PHT Same-address-space tural buffer Spectre-BTB
Spectre-RSB Cross-address-space Spectre-type Spectre-STL Same-address-space
prediction Cross-address-space Transient Same-address-space cause? fault
Meltdown-type PHT-CA-IP ∗
PHT-CA-OP ∗
PHT-SA-OP ∗
BTB-SA-IP ∗
BTB-SA-OP ∗
in-place (IP) vs., out-of-place (OP)
mistraining strategy Cross-address-space microarchitec- Spectre-PHT Same-address-space tural buffer PHT-SA-IP Spectre-BTB
Spectre-RSB Cross-address-space Spectre-type BTB-CA-IP Spectre-STL Same-address-space BTB-CA-OP prediction Cross-address-space Transient Same-address-space cause?
fault RSB-CA-IP
RSB-CA-OP Meltdown-type RSB-SA-IP
RSB-SA-OP in-place (IP) vs., out-of-place (OP)
mistraining PHT-CA-IP ∗ strategy Cross-address-space PHT-CA-OP ∗ microarchitec- Spectre-PHT Same-address-space tural buffer PHT-SA-IP Spectre-BTB PHT-SA-OP ∗ Spectre-RSB Cross-address-space Spectre-type BTB-CA-IP Spectre-STL Same-address-space BTB-CA-OP prediction Cross-address-space BTB-SA-IP ∗ Transient Same-address-space cause? BTB-SA-OP ∗
fault RSB-CA-IP
RSB-CA-OP Meltdown-type RSB-SA-IP
RSB-SA-OP Meltdown-AC .
Meltdown-DE .
Meltdown-UD .
Meltdown-SS .
in-place (IP) vs., out-of-place (OP)
mistraining PHT-CA-IP ∗ strategy Cross-address-space PHT-CA-OP ∗ microarchitec- Spectre-PHT Same-address-space tural buffer PHT-SA-IP Spectre-BTB PHT-SA-OP ∗ Spectre-RSB Cross-address-space Spectre-type BTB-CA-IP Spectre-STL Same-address-space BTB-CA-OP prediction Cross-address-space BTB-SA-IP ∗ Transient Same-address-space cause? BTB-SA-OP ∗ Meltdown-NM fault RSB-CA-IP
RSB-CA-OP Meltdown-type RSB-SA-IP Meltdown-PF RSB-SA-OP
fault type
Meltdown-BR
Meltdown-GP in-place (IP) vs., out-of-place (OP)
mistraining PHT-CA-IP ∗ strategy Cross-address-space PHT-CA-OP ∗ microarchitec- Spectre-PHT Same-address-space tural buffer PHT-SA-IP Spectre-BTB PHT-SA-OP ∗ Spectre-RSB Cross-address-space Spectre-type BTB-CA-IP Spectre-STL Same-address-space BTB-CA-OP prediction Cross-address-space BTB-SA-IP ∗ Transient Same-address-space cause? BTB-SA-OP ∗ Meltdown-NM fault RSB-CA-IP Meltdown-AC . RSB-CA-OP Meltdown-type Meltdown-DE . RSB-SA-IP Meltdown-PF RSB-SA-OP Meltdown-UD . fault type Meltdown-SS .
Meltdown-BR
Meltdown-GP Meltdown-PK ∗
Meltdown-XD .
Meltdown-SM .
Meltdown-BND ∗
in-place (IP) vs., out-of-place (OP)
mistraining PHT-CA-IP ∗ strategy Cross-address-space PHT-CA-OP ∗ microarchitec- Spectre-PHT Same-address-space tural buffer PHT-SA-IP Spectre-BTB PHT-SA-OP ∗ Spectre-RSB Cross-address-space Spectre-type BTB-CA-IP Spectre-STL Same-address-space BTB-CA-OP prediction Cross-address-space BTB-SA-IP ∗ Transient Same-address-space cause? BTB-SA-OP ∗ Meltdown-NM Meltdown-US fault RSB-CA-IP Meltdown-AC . Meltdown-P RSB-CA-OP Meltdown-type Meltdown-DE . Meltdown-RW RSB-SA-IP Meltdown-PF RSB-SA-OP Meltdown-UD . fault type Meltdown-SS .
Meltdown-BR Meltdown-MPX
Meltdown-GP Meltdown-PK ∗
Meltdown-XD .
Meltdown-SM .
Meltdown-BND ∗
in-place (IP) vs., out-of-place (OP)
mistraining PHT-CA-IP ∗ strategy Cross-address-space PHT-CA-OP ∗ microarchitec- Spectre-PHT Same-address-space tural buffer PHT-SA-IP Spectre-BTB PHT-SA-OP ∗ Spectre-RSB Cross-address-space Spectre-type BTB-CA-IP Spectre-STL Same-address-space BTB-CA-OP prediction Cross-address-space BTB-SA-IP ∗ Transient Same-address-space cause? BTB-SA-OP ∗ Meltdown-NM Meltdown-US fault RSB-CA-IP Meltdown-AC . Meltdown-P RSB-CA-OP Meltdown-type Meltdown-DE . Meltdown-RW RSB-SA-IP Meltdown-PF RSB-SA-OP Meltdown-UD . fault type Meltdown-SS .
Meltdown-BR Meltdown-MPX
Meltdown-GP Meltdown-PK ∗
Meltdown-BND ∗
in-place (IP) vs., out-of-place (OP)
mistraining PHT-CA-IP ∗ strategy Cross-address-space PHT-CA-OP ∗ microarchitec- Spectre-PHT Same-address-space tural buffer PHT-SA-IP Spectre-BTB PHT-SA-OP ∗ Spectre-RSB Cross-address-space Spectre-type BTB-CA-IP Spectre-STL Same-address-space BTB-CA-OP prediction Cross-address-space BTB-SA-IP ∗ Transient Same-address-space cause? BTB-SA-OP ∗ Meltdown-NM Meltdown-US fault RSB-CA-IP Meltdown-AC . Meltdown-P RSB-CA-OP Meltdown-type Meltdown-DE . Meltdown-RW RSB-SA-IP Meltdown-PF RSB-SA-OP Meltdown-UD . Meltdown-XD . fault type Meltdown-SS . Meltdown-SM .
Meltdown-BR Meltdown-MPX
Meltdown-GP in-place (IP) vs., out-of-place (OP)
mistraining PHT-CA-IP ∗ strategy Cross-address-space PHT-CA-OP ∗ microarchitec- Spectre-PHT Same-address-space tural buffer PHT-SA-IP Spectre-BTB PHT-SA-OP ∗ Spectre-RSB Cross-address-space Spectre-type BTB-CA-IP Spectre-STL Same-address-space BTB-CA-OP prediction Cross-address-space BTB-SA-IP ∗ Transient Same-address-space cause? BTB-SA-OP ∗ Meltdown-NM Meltdown-US fault RSB-CA-IP Meltdown-AC . Meltdown-P RSB-CA-OP Meltdown-type Meltdown-DE . Meltdown-RW RSB-SA-IP Meltdown-PF Meltdown-PK ∗ RSB-SA-OP Meltdown-UD . Meltdown-XD . fault type Meltdown-SS . Meltdown-SM .
Meltdown-BR Meltdown-MPX
Meltdown-GP Meltdown-BND ∗ Are the defenses working? • STIPB, IBPB, SSBD: not enabled by default • Site Isolation: can still leak data of own process • Index Masking: only limits how far beyond leakage can occur • Speculative Load Hardening: Spectrum showed potential leakage • Timer reduction: can build new timers • Poison Value: works • Meltdown defenses
Defenses working?
• Evaluated all mitigations without hardware changes • Serialization: still leaks through TLB, AVX
72 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Site Isolation: can still leak data of own process • Index Masking: only limits how far beyond leakage can occur • Speculative Load Hardening: Spectrum showed potential leakage • Timer reduction: can build new timers • Poison Value: works • Meltdown defenses
Defenses working?
• Evaluated all mitigations without hardware changes • Serialization: still leaks through TLB, AVX • STIPB, IBPB, SSBD: not enabled by default
72 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Index Masking: only limits how far beyond leakage can occur • Speculative Load Hardening: Spectrum showed potential leakage • Timer reduction: can build new timers • Poison Value: works • Meltdown defenses
Defenses working?
• Evaluated all mitigations without hardware changes • Serialization: still leaks through TLB, AVX • STIPB, IBPB, SSBD: not enabled by default • Site Isolation: can still leak data of own process
72 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Speculative Load Hardening: Spectrum showed potential leakage • Timer reduction: can build new timers • Poison Value: works • Meltdown defenses
Defenses working?
• Evaluated all mitigations without hardware changes • Serialization: still leaks through TLB, AVX • STIPB, IBPB, SSBD: not enabled by default • Site Isolation: can still leak data of own process • Index Masking: only limits how far beyond leakage can occur
72 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Timer reduction: can build new timers • Poison Value: works • Meltdown defenses
Defenses working?
• Evaluated all mitigations without hardware changes • Serialization: still leaks through TLB, AVX • STIPB, IBPB, SSBD: not enabled by default • Site Isolation: can still leak data of own process • Index Masking: only limits how far beyond leakage can occur • Speculative Load Hardening: Spectrum showed potential leakage
72 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Poison Value: works • Meltdown defenses
Defenses working?
• Evaluated all mitigations without hardware changes • Serialization: still leaks through TLB, AVX • STIPB, IBPB, SSBD: not enabled by default • Site Isolation: can still leak data of own process • Index Masking: only limits how far beyond leakage can occur • Speculative Load Hardening: Spectrum showed potential leakage • Timer reduction: can build new timers
72 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Meltdown defenses
Defenses working?
• Evaluated all mitigations without hardware changes • Serialization: still leaks through TLB, AVX • STIPB, IBPB, SSBD: not enabled by default • Site Isolation: can still leak data of own process • Index Masking: only limits how far beyond leakage can occur • Speculative Load Hardening: Spectrum showed potential leakage • Timer reduction: can build new timers • Poison Value: works
72 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Defenses working?
• Evaluated all mitigations without hardware changes • Serialization: still leaks through TLB, AVX • STIPB, IBPB, SSBD: not enabled by default • Site Isolation: can still leak data of own process • Index Masking: only limits how far beyond leakage can occur • Speculative Load Hardening: Spectrum showed potential leakage • Timer reduction: can build new timers • Poison Value: works • Meltdown defenses
72 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Conclusion • Some mitigations cost more than gained by the feature they defend • Transient-execution attacks will keep us busy for a while • There are still many other microarchitectural elements left to explore
Conclusion
• Optimizations always come at a cost
73 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • Transient-execution attacks will keep us busy for a while • There are still many other microarchitectural elements left to explore
Conclusion
• Optimizations always come at a cost • Some mitigations cost more than gained by the feature they defend
73 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) • There are still many other microarchitectural elements left to explore
Conclusion
• Optimizations always come at a cost • Some mitigations cost more than gained by the feature they defend • Transient-execution attacks will keep us busy for a while
73 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f) Conclusion
• Optimizations always come at a cost • Some mitigations cost more than gained by the feature they defend • Transient-execution attacks will keep us busy for a while • There are still many other microarchitectural elements left to explore
73 Moritz Lipp (@mlqxyz), Claudio Canella (@cc0x1f)