UPTEC IT 20036 Degree project 30 credits September 2020

Dynamic Eviction Set Algorithms and Their Applicability to Cache Characterisation

Maria Lindqvist

Department of Information Technology

Abstract

Dynamic Eviction Set Algorithms and Their Applicability to Cache Characterisation

Maria Lindqvist

Eviction sets are groups of memory addresses that map to the same cache set. They can be used to perform efficient information-leaking attacks against the cache memory, so-called cache side channel attacks. In this project, two different algorithms that find such sets are implemented and compared. The second algorithm improves on the first by using a concept called group testing. It is also evaluated whether these algorithms can be used to analyse or reverse engineer the cache characteristics, which is a new area of application for this type of algorithm. The results show that the optimised algorithm performs significantly better than the previous state-of-the-art algorithm. This means that countermeasures developed against this type of attack need to be designed with the possibility of faster attacks in mind. The results also show, as a proof of concept, that it is possible to use these algorithms to create a tool for cache analysis.

Supervisor: Christos Sakalis
Subject reader: Stefanos Kaxiras
Examiner: Lars-Åke Nordén
UPTEC IT 20036
Printed by: Reprocentralen ITC

Acknowledgements

I would like to thank my supervisor Christos Sakalis for all the guidance, discussions and feedback during this thesis. I also want to thank Stefanos Kaxiras for the idea for the subject and for making the project possible. Finally, I would like to thank everyone close to me for all the moral support and patience.

Contents

1 Introduction
  1.1 Motivation and Purpose
  1.2 Limitations

2 Background
  2.1 Cache Memories
    2.1.1 Cache Hierarchy
    2.1.2 Cache Organisation
    2.1.3 Cache Hits and Cache Misses
    2.1.4 Eviction and Replacement Policies
  2.2 Memory Addresses and Virtual Memory
    2.2.1 Addresses and Cache Indexing
    2.2.2 Virtual Memory
  2.3 Cache Attacks and Eviction Sets
    2.3.1 Cache Attacks
    2.3.2 Prime+Probe, Evict+Time and Flush+Reload
    2.3.3 Eviction Sets
  2.4 Algorithms for Finding Eviction Sets
    2.4.1 Baseline Algorithm
    2.4.2 Group Testing
    2.4.3 Group Testing Algorithm
    2.4.4 Targeting Last Level Cache - Purpose and Challenges
    2.4.5 Implementation Complications

3 Related Work
  3.1 Characterising the Cache - Microbenchmarks

4 Method
  4.1 Pin
  4.2 Implementation Decisions
  4.3 Method to Determine Cache Presence
  4.4 Method of Measuring Number of Memory Accesses
  4.5 Comparing the Algorithms
    4.5.1 Parameters
    4.5.2 Testing Correctness
    4.5.3 Testing Performance and Scaling
  4.6 Characterising the Cache
    4.6.1 Using Baseline Algorithm
    4.6.2 Using Group Testing Algorithm
    4.6.3 Evaluation

5 Results
  5.1 Comparing Algorithms
    5.1.1 Correctness
    5.1.2 Scaling
    5.1.3 Performance
  5.2 Characterising the Cache
    5.2.1 Correctness
    5.2.2 Scaling

6 Discussion
  6.1 Comparing Algorithms
  6.2 Characterising the Cache
  6.3 Problems, Pitfalls and Lessons Learned
  6.4 Future Work

7 Conclusion

1 Introduction

In most computers today, cache memories are used as the bridge between the fast processor and the slower main memory. While they bring great performance advantages, caches can also be exploited in so-called side channel attacks. In a computer system, a side channel leaks information through measurable side effects on the system. By performing a side channel attack against the cache memory, it is possible to indirectly retrieve encryption keys by observing memory access patterns [1, 3, 16, 15]. More recent research has also shown that many modern processors are vulnerable to so-called speculative attacks [9, 8]. In that case, traces left in the cache, combined with misuse of performance mechanisms, make it possible to read another process's private memory contents. Cache side channel attacks can also break the separation between virtual machines (VMs) in a cloud computing setting, if multiple VMs are hosted on the same physical hardware [6]. Such separation is necessary to preserve the integrity of the users of the cloud environment.

To perform such cache side channel attacks, one needs to be able to control and examine the content of the cache. That is, to be able to remove data associated with a specific memory address from the cache, and some time later examine whether it has been brought back into the cache by some other process. If so, it can give away information about the actions of the other process. To know if the data is present in the cache, one typically measures the time it takes to access it: a long access time indicates that it is not in the cache memory. The removal of data from the cache memory is referred to as an eviction.

The most straightforward way to evict a target victim address is to use a flush-based approach [27], where an instruction that directly evicts the targeted address is used. While convenient, this type of attack can easily be mitigated and the instruction is not available in some environments. The other type of method is called conflict-based [10]. The aim is then to construct a so-called eviction set, which is a set of memory addresses. Accessing the addresses in the set brings the corresponding data from main memory into the smaller cache memory. Because of the organisation of the cache, the victim data will then be evicted once all the addresses have been accessed. To be able to use this approach in a practical attack, one wants the set to be as small as possible. Another requirement is that the set can be computed in an efficient way.

Eviction sets can be derived in a static way. This means forming the set manually, using an already known scheme that maps memory addresses to their corresponding locations in the cache. However, some countermeasures have been presented that prevent this. One is to randomise this mapping at some time interval [17], making it harder to use the static approach, since it needs to be redone after each randomisation. The alternative is instead to use a dynamic method, which requires less information about the system. The main idea is to randomly find a large set of addresses that evicts the target, and then reduce the set to its minimal core. Dynamic methods are what I examine in this thesis.

1.1 Motivation and Purpose

The previous state-of-the-art algorithm [10] dynamically finds eviction sets in quadratic time. In 2019, a new version of the algorithm was presented [24] that finds minimal eviction sets in linear time, which is a significant improvement. When an eviction set can be found in a shorter amount of time, some of the countermeasures against conflict-based attacks become less powerful. Thus, to be able to improve protections against these attacks [18], it is necessary to understand and analyse methods for quickly finding eviction sets.

The purpose of this thesis project is twofold. The first part is to examine these two algorithms for finding minimal eviction sets. This is done by implementing the previous state-of-the-art algorithm as well as the optimised algorithm, and then comparing their performance and correctness. The second part is to explore whether the algorithms can be used for a different purpose, namely characterising a cache memory. That is, to find out the different parameters or settings of the cache, without knowing these in advance. Specifically, these are the questions to be answered:

1. Does the increase in performance conform with the theoretical calculations?

2. What is the impact of different parameters and settings on the effectiveness of the two algorithms?

3. What are the challenges when implementing eviction set algorithms and finding the minimal eviction sets?

4. Can dynamic methods for finding eviction sets be used to characterise or analyse the cache parameters?

1.2 Limitations

This project focuses on algorithms for cache-based attacks. It does not cover other types of side-channel attacks, or attacks that target microarchitectural structures other than the cache memories. The attacks that use the discovered eviction sets will not be implemented. New eviction set algorithms will not be developed. Countermeasures against the methods of finding eviction sets will not be evaluated. This project uses dynamic methods for finding eviction sets; static methods (reverse engineering the mapping from addresses to cache sets) will not be explored.

2 Background

This section provides the background necessary to understand the purpose and meaning of eviction sets, as well as the mechanism behind the algorithms. It will first cover the basics of cache memories, memory addresses and conflict-based cache attacks. Then the two algorithms are presented. Some general challenges for the algorithms are also briefly discussed.

2.1 Cache Memories

Memory accesses from the CPU to the main memory introduce a long latency. This is both because of the memory technology used in the main memory, and because of its placement far away from the processor. To overcome this, smaller but faster cache memories are used. These caches store copies of recently used data, so that the processor can access it more quickly. Some basic characteristics of caches will now be covered.

2.1.1 Cache Hierarchy

Caches today are organised in a hierarchy, with the largest and slowest cache closest to the main memory. This is called the last level cache (LLC, typically the same as the L3 cache). Closest to the CPU reside the smallest caches, called L1 caches. Typically, there are two L1 caches, one for data (L1d) and one for instructions (L1i). On a multi-core system, the L1 and (usually) L2 caches are private to each core, while the LLC is shared between all of the cores. This is illustrated in Figure 1.

On modern systems, the LLC is divided into so-called slices, typically as many slices as there are cores. Which slice is used by which memory address could be determined by a simple function, but on recent Intel processors it is determined by an undocumented, so-called complex addressing scheme. This previously complicated some types of attacks against caches, but the addressing scheme has since been reverse engineered [12].


Figure 1: Example of a cache hierarchy. L1, L2 and L3/LLC are cache memories, where L3 is closest to the main memory and L1 is closest to the processor core. Here there are 2 cores, each with its own private L1 and L2 caches.

2.1.2 Cache Organisation

The cache memory is divided into entries called cache lines. Each cache line consists, at the very least, of a valid bit, an address tag and the data to be stored. The data corresponds to a contiguous chunk of data in the main memory. This allows the cache to benefit from spatial locality: memory locations that are physically close are likely to be accessed close together in time. When such a nearby memory location is then accessed, it may already be in the cache, brought in as part of the same cache line by an earlier access.

Caches are today most commonly divided into sets, where each set consists of multiple cache lines. The number of lines in each set determines the associativity, or number of ways, of the cache. For example, if each set consists of four cache lines, the cache is said to be 4-way associative. As a result, multiple cache lines can map to the same set and still all reside in the cache at the same time. A cache organised in this way is called a set-associative cache.

There are other possible cache organisations as well. In a fully-associative cache, each cache line can be placed anywhere in the cache. In a direct-mapped cache, each address maps to exactly one cache location. In this thesis, the caches that are targeted are set-associative.
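As a concrete illustration of this organisation, the following C sketch computes how many sets a cache has and which set a given address maps to. The parameters and names are illustrative, not the configuration used in this project, and a real sliced LLC would additionally hash the address across slices:

#include <stdint.h>
#include <stdio.h>

/* Illustrative parameters: a 32 kB, 8-way set-associative cache with
 * 64-byte lines. */
#define CACHE_SIZE (32 * 1024)
#define LINE_SIZE  64
#define WAYS       8
#define NUM_SETS   (CACHE_SIZE / (LINE_SIZE * WAYS)) /* = 64 sets */

/* Drop the line-offset bits, then keep log2(NUM_SETS) bits as the set
 * index. Works because NUM_SETS is a power of two. */
static unsigned set_index(uintptr_t addr) {
    return (unsigned)((addr / LINE_SIZE) % NUM_SETS);
}

int main(void) {
    uintptr_t a = 0x12345940u;
    printf("address %#lx maps to set %u of %d\n",
           (unsigned long)a, set_index(a), NUM_SETS);
    return 0;
}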


Figure 2: Different cache organisations. The cache on the left is a two-way associative cache; each index can hold two cache lines. The cache in the middle is direct-mapped; each index can hold exactly one cache line. The cache on the right is fully-associative; all cache lines map to the same index.

2.1.3 Cache Hits and Cache Misses

When accessing memory, the request first goes to the smallest cache in the hierarchy, the L1 cache. If the cache does not currently hold the requested data, it is called a cache miss and the request goes to the next level in the hierarchy. If the cache does hold the data, it is called a cache hit. A cache miss introduces some latency, since the larger and slower caches, or the main memory, need to be accessed.

Cache misses can occur for a number of different reasons, and one usually distinguishes three main types. Compulsory misses occur when accessing a cache line for the first time, such that it has to be loaded from memory. Capacity misses occur because the cache is too small: it cannot contain more cache lines and has to remove one of the lines already in the cache when a new one is accessed. The next time data from the removed line is accessed, it will be a miss, since it is no longer in the cache. Conflict misses occur when there is enough space in total, but the cache line maps to an already occupied index; for example, in a set-associative cache, when all the ways in a cache set are occupied. One of the cache lines will then be removed, resulting in a cache miss the next time data from that line is accessed.

For example, a fully-associative cache has no conflict misses (but it is less efficient to search). A direct-mapped cache has many conflicts, since each index only holds one line (but it is easier to search). Set-associative caches work as a trade-off between these two alternatives.

2.1.4 Eviction and Replacement Policies

If all ways in the set are occupied by cache lines, and the cache brings in a line that maps to the same set, one of the old lines needs to be removed. This is called eviction. The next time someone tries to access data from the evicted line in that cache, it will result in a cache miss. It will take a longer time to access that data, which, if measured, could indicate a miss.

Which one of the lines gets evicted is determined by the replacement policy. Examples of policies are least-recently-used (LRU), first-in-first-out (FIFO) and random replacement, with some variant of LRU or pseudo-LRU being the most common. In that case, the cache line that was least recently used is, as the name implies, evicted from the cache. However, some of the more recent architectures use policies that are undocumented and that are adaptive or randomised in their behaviour [24]. This can affect the performance of commonly used cache attacks and algorithms. In this project, the caches are assumed to use an LRU replacement policy.


Figure 3: In this example, all the slots at index 1 are filled. A new cache line maps to the same index, which means one of the previously stored cache lines will be evicted. Under LRU replacement, if the cache line in the second slot was least recently used, it will be the one to be evicted.

2.2 Memory Addresses and Virtual Memory

Virtual memory is a memory management technique that makes the physical memory easier to interact with. Depending on the cache hierarchy, cache memories can be indexed using either virtual memory addresses or physical memory addresses. This section explains how addresses in general are used to interact with the cache, as well as how the translation from a virtual to a physical address works.

2.2.1 Addresses and Cache Indexing

Memory addresses are used to store data into, and load data from, the cache. Different bits in the address have different uses. The least significant bits (LSBs) of the address are used to find a specific byte in a word. If we have a word size of 4 bytes, we need 2 bits to choose the correct byte. The next bits (counting from the LSB) are used to find the right word in the cache line. If we have a 64 B cache line, giving 16 words of 4 bytes each, we need 4 bits of the address for this (since 2^4 = 16). Choosing the correct word can be done using a multiplexer, which is a type of selector.

[Figure: a 32-bit example address, 01001110101011000101100100100101, in a system with a 64 kB cache, 64 B cache lines and 4 B words, divided into a tag (16 bits), an index (10 bits), a word-in-cache-line field (4 bits) and a byte-in-word field (2 bits).]

Figure 4: Example of the purpose of different bits in a memory address.

The next bits are used for indexing into the cache. How many bits are needed depends on the size of the cache, the cache line size and the associativity. If we, as an example, have a 64 kB direct-mapped cache with 64 B cache lines, we have 1024 entries. We then need 10 bits for indexing, since 2^10 = 1024, and we would not be able to index all entries with fewer bits.

The last bits (the most significant bits) are used as the tag. When retrieving data from the cache based on an address, we first find the right index using the index bits. The tag stored within the cache line at that index is then compared to the tag in the address. If they match, and the valid bit in the cache line is set, the access is a hit. In a set-associative cache, we need to compare the tags of all lines in the cache set to determine whether there is a hit. Another consequence of set-associativity is that the cache has fewer indices, and therefore needs fewer indexing bits.
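To make the bit manipulation concrete, here is a small C sketch that decomposes the 32-bit example address from Figure 4 into its fields. The masks and shifts follow directly from the field widths derived above; the macro names are invented for this illustration:

#include <stdint.h>
#include <stdio.h>

/* Field widths for the example in the text: 64 kB direct-mapped cache,
 * 64 B cache lines, 4 B words, 32-bit addresses. */
#define BYTE_BITS  2   /* 4-byte words  -> 2 bits for byte-in-word */
#define WORD_BITS  4   /* 16 words/line -> 4 bits for word-in-line */
#define INDEX_BITS 10  /* 1024 lines    -> 10 bits for the index   */

int main(void) {
    uint32_t addr = 0x4EAC5925; /* 01001110101011000101100100100101 */

    uint32_t byte_in_word = addr & ((1u << BYTE_BITS) - 1);
    uint32_t word_in_line = (addr >> BYTE_BITS) & ((1u << WORD_BITS) - 1);
    uint32_t index = (addr >> (BYTE_BITS + WORD_BITS))
                     & ((1u << INDEX_BITS) - 1);
    uint32_t tag = addr >> (BYTE_BITS + WORD_BITS + INDEX_BITS);

    printf("tag=%#x index=%u word=%u byte=%u\n",
           tag, index, word_in_line, byte_in_word);
    return 0;
}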


Figure 5: Example of how the address is used in a 2-way associative cache

2.2.2 Virtual Memory

When mentioning memory addresses, there are two types of addresses to discuss. The first is physical memory addresses, which refer to the actual location of data in the main memory. On the other hand, we have virtual (or logical) memory, which is an abstraction of the physical memory. It can be seen as a memory management technique that gives a program, and the programmer, the illusion of more memory being available than there actually is. It also frees the programmer from requiring detailed knowledge of the main memory, and from managing it manually. This is particularly useful since the program might not be written and compiled on the same machine that it will later run on. It also makes it easier to run multiple programs on the same machine.

The virtual address space is typically divided into so-called pages, which map to corresponding frames in the physical memory. A page table, which resides in main memory, is used to translate virtual addresses to physical addresses. The translation is managed by the Memory Management Unit (MMU). Recently made address translations are stored in a type of cache called the Translation Lookaside Buffer (TLB), to avoid unnecessary memory accesses. Since the TLB is a type of cache memory, it can suffer from misses as well, if the translation is not in the TLB. If a miss occurs, the translation is performed using the page table, which is a much slower process.

The memory translation is done in two steps. The first is to find the corresponding physical frame using the TLB or, failing that, the page table. Second, the actual address is formed using the least significant bits of the virtual address, the page offset.

Some architectures also support so-called large pages or huge pages, which can for example hold 2 MB of data, in contrast to the standard (on Linux) 4 kB pages. (There are also 1 GB huge pages, but they need to be allocated manually.) The main purpose is to increase performance, since fewer pages are needed to represent the translations. This also reduces the number of misses in the TLB: a page lookup brings the frame translation for one virtual page into one entry of the TLB, and if the page is large, many addresses can share that entry, using the offset to separate them. This way, the TLB can cover many more translations without increasing in size.

The first level cache (L1) is typically indexed using virtual addresses, while the other levels are indexed using physical addresses. This means that we might not directly have access to the indexing of the larger caches, which can be a problem when using the algorithms presented later. However, using large pages gives more information about physical addresses, since more bits are needed for the page offset, and these offset bits translate directly to the physical address. If all of these bits happen to overlap with the indexing bits, we can essentially use the virtual addresses as if they were physical. Even with this information, the slice index might still be unknown for a sliced cache. Large pages are also not supported on all systems.
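As a hedged, Linux-specific sketch of how such a huge page can be requested (assuming 2 MB huge pages are configured on the system; the constant name is ours):

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>

#define HUGE_PAGE_SIZE (2UL * 1024 * 1024) /* 2 MB */

int main(void) {
    /* Request one anonymous 2 MB huge page; this returns MAP_FAILED if
     * no huge pages are configured on the system. */
    void *buf = mmap(NULL, HUGE_PAGE_SIZE, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (buf == MAP_FAILED) {
        perror("mmap(MAP_HUGETLB)");
        return 1;
    }
    /* The low 21 bits of any address in buf equal its physical page
     * offset, which typically covers the cache index bits discussed
     * above. */
    printf("huge page mapped at %p\n", buf);
    munmap(buf, HUGE_PAGE_SIZE);
    return 0;
}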

2.3 Cache Attacks and Eviction Sets

Side channel attacks are a category of attacks where information is leaked through some microarchitectural structure (such as caches or other types of buffers), rather than through a weakness in the software itself. This includes measuring energy consumption, computation time, electromagnetic radiation, or cache timing, all of which can give us information about the system. The leaked information could for example be encryption keys or passwords. We will focus on the cache memories as the side channel. There are, however, other microarchitectural structures that can be exploited in a similar way, such as small specialised caches or internal buffers (e.g. the TLB or branch prediction buffers). Exploring attacks against these is out of scope for this project.

2.3.1 Cache Attacks

In the case of cache side channel attacks, monitoring the content of the caches can reveal which memory addresses the victim is accessing. While the actual data content is protected from being read by other processes, the access pattern is not. As an example, if the other process is performing an encryption, it is possible to draw conclusions about the encryption keys by observing the pattern.

Cache attacks are possible on all cache levels, though attacks against the L1 and L2 caches assume that the victim and the attacker run on the same core. This can be achieved using simultaneous multithreading (also called hyperthreading) [16], where multiple threads execute simultaneously to improve performance. L1 and L2 attacks can also be performed when processes simply share the same core, but this requires being able to interrupt the other process [27]. Since the LLC (e.g. the L3 cache) is shared between different cores, attacks against the LLC do not assume that the victim and the attacker are on the same core. However, some difficulties [10, 27] are introduced when using the LLC in contrast to, for example, the L1 cache. This is further discussed in Section 2.4.4.

The fact that the LLC can be used in these kinds of attacks is also relevant for virtualised environments, e.g. cloud computing services, since these often rely on multiple virtual machines (VMs) that run on the same hardware platform but on different cores. The VMs then share the LLC, which as discussed is susceptible to cache attacks. Cache related vulnerabilities in cloud systems have been researched many times before [6, 13, 19, 26, 28], and some of them have been mitigated.

2.3.2 Prime+Probe, Evict+Time and Flush+Reload

Cache attacks that are built upon eviction can be divided into two categories [18]: conflict-based attacks and flush-based attacks. Prime+Probe [15] is one of the most commonly used conflict-based attacks. The idea is to observe the effect on the cache made by, for example, an encryption. The attack consists of three steps.

1. Prime. The attacker fills the cache with its own cache lines.

2. Wait. The attacker waits for some time period.

3. Probe. The attacker accesses all cache lines again while measuring the time. If there is a long latency for some of the cache lines, the victim has accessed the cache set that those lines belong to, evicting some of the attacker's cache lines.

As a consequence, the last step (Probe) also serves as the first step (Prime) of the next round of measurement, since the cache is once again in an attacker-controlled state.


Figure 6: The three steps of Prime+Probe. In this example, the second cache line has been evicted by the victim process during the wait step. Because of this, that cache line will have a longer access time in step 3. This can give the attacker information about the victim's execution.

In the same work as Prime+Probe, Evict+Time [15] was presented, a second type of conflict-based attack. In this case the attacker manipulates the cache content and observes the effect on the execution time of the encryption. However, the authors point out some weaknesses in this type of attack: the timing is sensitive to variations in the encryption operations, and triggering an encryption can introduce noise. The Prime+Probe variant gives more control over the timing and is less sensitive to these kinds of variations. The following are the steps of Evict+Time:

1. Trigger an encryption.

2. Evict. The attacker evicts some cache lines.

3. Time. Trigger an encryption again and measure the time difference.

One example of a flush-based attack is Flush+Reload [27]. In this case, cache content is removed using an x86 instruction called clflush, and the content is then reloaded while measuring the time. The access time reveals whether the cache line has been brought back into the cache or not. clflush flushes a specific cache line from all levels in the cache hierarchy. Because of that, this type of attack has the advantage of a higher granularity than the previously described attacks, which target a whole set instead of a single line. The three steps of the attack are:

1. Flush. Use clflush to evict a cache line.

2. Wait. The attacker waits for some time period.

3. Reload. Access the cache line and measure the access time.


Figure 7: The three steps of Flush+Reload. In the first step, the clflush instruction is used to force eviction of a cache line. After the wait step, the attacker measures the time it takes to access the same cache line. If it has been brought back by the victim, the access time will be short; if it has to be fetched from main memory, the access time will be longer.

Flush+Reload can for example be used in exploits of the Rowhammer [20] bug, where bits in the main memory can be flipped by repeatedly doing Flush+Reload on surrounding memory rows. However, Flush+Reload relies on the clflush instruction, which is a highly specific instruction in the x86 instruction set architecture. It also depends on memory sharing when attacking across VMs, which is disabled on many cloud platforms [12]. Memory sharing is an optimisation technique used by the operating system. In short, it makes VMs share a write-protected copy of a virtual memory page, instead of having identical separate copies. This is good for performance but, unfortunately, opens up for attacks such as Flush+Reload. Both Prime+Probe and Evict+Time require the existence of an eviction set, which is defined next.

2.3.3 Eviction Sets

To be able to perform Prime+Probe and Evict+Time attacks, one needs to calculate so-called eviction sets. For a W-way associative cache, an eviction set is a collection of at least W memory addresses that map to the same cache set. This means that if all addresses in the eviction set are accessed, all the previous content of that cache set will be evicted. Apart from their use in timing attacks against the LLC, eviction sets are also used in Rowhammer attacks [4] (in case we cannot or do not want to use Flush+Reload), as well as in so-called speculative attacks [24]. One way of defining an eviction set is as a collection of addresses that are congruent [24] with each other, that is, addresses that map to both the same cache set and the same cache slice, if we have a sliced LLC. To be able to perform efficient attacks, we want the eviction set to be minimal, that is, as small as possible while still being an eviction set. An overly large eviction set would also introduce too much noise, apart from being inefficient.

Methods to find eviction sets can be divided into dynamic and static [4]. The static approach uses information about the mapping from virtual to physical addresses and the hash function for mapping into LLC slices. While the mapping to LLC slices can be trivial in some architectures, more recent (Intel) architectures use so-called complex addressing, making it harder to use a static approach if the addressing hash has not been reverse engineered. Some mitigation approaches are based on randomising the mapping from physical address into the cache, as presented in CEASER [17]. The remapping is then done periodically, meaning that the reversing of the mapping needs to be redone within a very short period. In this case, a dynamic method is better to use. Dynamic methods do not require any information about the indexing or slice mapping. However, the eviction sets have to be recalculated for each execution, making dynamic methods slower than a static approach [12], where the mapping is already known. The two algorithms presented in this thesis are examples of dynamic methods for finding an eviction set.

2.4 Algorithms for Finding Eviction Sets

In this section, the two different algorithms for finding eviction sets are presented. They will here be called the Baseline algorithm and the Group testing algorithm, respectively. Both start with a large set of addresses that is iteratively pruned down to a minimal core. Some difficulties of targeting the LLC, as well as challenges for the algorithms introduced by hardware features, are also discussed.

2.4.1 Baseline Algorithm

Vila et al. [24] describe two variants of an algorithm for finding minimal eviction sets. The first one is referred to as the "Baseline algorithm". It was introduced by Liu et al. [10] to perform Prime+Probe against the LLC, and has also been used, for example, by Oren et al. [14] to perform a web-based attack.

The algorithm starts with a large set of addresses (S) and a victim address (x) that we want to be able to evict. S is an eviction set for x, but not a minimal one. That is, if we access all addresses in S, x is evicted from the cache memory, but since most of the addresses do not contribute to the eviction, the set is too inefficient to be used in an attack. The output of the algorithm is a minimal eviction set for x, called R.

The main idea is to pick one address c from S at a time and check if S without c, combined with the previously found members of the minimal eviction set, still evicts x from the cache. If not, it means that c was necessary for S to be an eviction set, and c is added to R. The tested address c is removed from S and the procedure starts over, until R contains as many elements as the cache has ways.

This algorithm takes O(n^2) accesses to form the minimal eviction set, where n is the size of the initial set S. That is, the number of memory accesses grows quadratically with the starting set size n.

Algorithm 1: Baseline algorithm from Vila et al. [24]
input : S = starting set, x = victim address
output: R = minimal eviction set

1  R = {}
2  while |R| < W do
3      c = pick one address from S
4      if R ∪ (S \ {c}) does not evict x then
5          R = R ∪ {c}
6      end
7      S = S \ {c}
8  end
9  return R
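To make the reduction loop concrete, the following self-contained C sketch simulates the baseline algorithm on a toy model: "addresses" are plain integers, and the timing-based eviction test is replaced by a direct congruence check. The helpers evicts() and same_set() are invented for this illustration; the thesis implementation instead uses linked lists of real addresses (see Section 4.2):

#include <stdio.h>
#include <stdlib.h>

#define NUM_SETS 64
#define WAYS 4      /* associativity W */
#define START_N 256

/* Toy model: an address is congruent with the victim if it maps to the
 * same simulated cache set. */
static int same_set(unsigned a, unsigned victim) {
    return a % NUM_SETS == victim % NUM_SETS;
}

/* Stand-in for the timing test: a set evicts the victim iff it contains
 * at least W congruent addresses. */
static int evicts(const unsigned *s, int n, unsigned victim) {
    int congruent = 0;
    for (int i = 0; i < n; i++)
        congruent += same_set(s[i], victim);
    return congruent >= WAYS;
}

int main(void) {
    unsigned victim = 7;
    unsigned S[START_N];          /* starting set S */
    int n = START_N;
    for (int i = 0; i < n; i++)
        S[i] = (unsigned)rand();  /* assumed to make S an eviction set */

    unsigned R[WAYS];             /* minimal eviction set being built */
    int r = 0;

    while (r < WAYS && n > 0) {
        unsigned c = S[n - 1];    /* line 3: pick one address from S */
        unsigned probe[START_N + WAYS];
        int m = 0;
        for (int i = 0; i < n - 1; i++) probe[m++] = S[i]; /* S minus c */
        for (int i = 0; i < r; i++)     probe[m++] = R[i]; /* union R   */
        if (!evicts(probe, m, victim))  /* line 4: c was necessary */
            R[r++] = c;                 /* line 5 */
        n--;                            /* line 7: S = S minus c   */
    }
    printf("found %d of %d congruent addresses\n", r, WAYS);
    return 0;
}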

In Figure 8, we can see a small example run of the baseline algorithm. In this case the associativity is 2, so we want to find 2 green elements (marked with "!"), which will form a minimal eviction set. The following explains the different subfigures and how they connect to the pseudocode of Algorithm 1.

(a) First, we take out one element and test if the remaining set is an eviction set, that is, if it evicts the target address from the cache (lines 3-4 in the pseudocode). Here it is true, since we have more than 2 green elements left. We remove the element, since it was not needed (line 7).

(b) Next, we take out another element, which this time happens to be green. However, it is not necessary in the minimal set, since we only need two green elements and we still have two in the remaining set. We remove this tested element as well.

(c) When we find an element whose removal makes the remaining elements no longer form an eviction set, we save it to R (line 5). R is here illustrated with the dashed box. The next time we test for eviction, we combine the remaining elements in the set with the already found and saved elements in the dashed box (as on line 4 in the pseudocode).

(d) Here, the eviction test will be true, since we have two green elements combined, and the tested element can be removed.

(e) In this case, the test will be false, since only one green element remains if we take out one of the green elements. We save the tested element to R.

(f) When the number of found elements equals the associativity (in this case 2), we stop (line 2 in the algorithm).


Figure 8: Example run of the baseline algorithm with associativity 2. The green elements (marked with "!") represent addresses that map to the same target cache set and evict some victim address in that cache set. The red elements map to other cache sets and will not be part of the found eviction set.

To test whether a set evicts x, the following test is made. In short, the victim address is first accessed, to make sure that it is in the cache to begin with. Then each element S_i of the tested set is accessed. Lastly, the victim is accessed again while the time is measured. If the time taken is larger than some threshold (a previously calculated average time for a cache hit), return 1; this means that the tested set does evict the victim x. (Note that this is the general method, but since I use a cache simulator in this project, I will not use timing, see Section 4.3.)

Algorithm 2: Testing for eviction
input : S = tested set, x = victim address
output: 1 if x was evicted, otherwise 0

1  access(x)
2  for i = 1 to |S| do
3      access(S_i)
4  end
5  access_time = time(x)
6  return access_time > threshold
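On real x86 hardware, this test is typically implemented with serialised time stamp counter reads and a calibrated threshold. The following C sketch shows one common way to do this using compiler intrinsics; the threshold value is an assumption and must be calibrated per machine. (In this project the test is instead a direct lookup in the simulated cache, see Section 4.3.)

#include <stdint.h>
#include <x86intrin.h>

#define THRESHOLD 120 /* cycles; assumed value, must be calibrated */

/* Serialised timing of a single load, using the time stamp counter and
 * fences to limit re-ordering around the measurement. */
static inline uint64_t timed_access(volatile char *p) {
    _mm_mfence();
    uint64_t t0 = __rdtsc();
    _mm_lfence();
    (void)*p;                  /* the access being timed */
    _mm_lfence();
    uint64_t t1 = __rdtsc();
    _mm_mfence();
    return t1 - t0;
}

/* Algorithm 2 on hardware: access the victim, access every element of
 * the tested set, then time the victim reload. Returns 1 if the set
 * evicted the victim. */
static int test_eviction(volatile char **set, int n, volatile char *victim) {
    (void)*victim;             /* line 1: bring the victim into the cache */
    for (int i = 0; i < n; i++)
        (void)*set[i];         /* lines 2-4: access the tested set */
    return timed_access(victim) > THRESHOLD; /* lines 5-6 */
}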

2.4.2 Group Testing

The second algorithm is based on group testing [2] and reduces the number of memory accesses needed to find minimal eviction sets to O(n), where n again is the size of the initial set S. Group testing can be applied to many different areas, for example blood testing, where one wants to reduce the number of tests that need to be done. It can essentially be seen as the problem of finding the positive elements in a set of mixed positive and negative elements in as few steps as possible. If a test shows that a group contains no positive elements, the whole group can be discarded (for example, a negative blood test for a whole group). That way, one does not need to test each element separately.

2.4.3 Group Testing Algorithm

The problem of testing whether a set S evicts x can be seen as a group test [24]. It can be shown that if we partition S into W + 1 subsets (where W is the associativity), we can always find at least one subset whose removal leaves an eviction set. This is because we only need W elements that map to the same set (and possibly slice) as the victim in our final set, and we have W + 1 groups. Even in the worst case, where W of the groups contain one such element each, the remaining group can be removed, regardless of whether it also contains elements with the desired property.

The group testing algorithm leverages this principle. It is given the same input as before, that is, the initial set S and the victim address x. The algorithm splits the set into W + 1 subsets. It iterates over the subsets and tests whether S without the subset still evicts x. If so, the subset is removed from S. Otherwise, it keeps testing the other subsets until one that fulfils the requirement is found. Then S is once again split into W + 1 groups, and the same process starts all over. This is repeated until there are W elements left in S, that is, until S is a minimal eviction set. This way, we can discard a whole group of elements per iteration of the algorithm, instead of only one element as before.

Algorithm 3: Group testing algorithm from Vila et al. [24]
input : S = starting set, x = victim address
output: S = minimal eviction set

1  while |S| > W do
2      {T_1 ... T_(W+1)} = split S into W + 1 groups
3      i = 1
4      while S \ T_i does not evict x do
5          i = i + 1
6      end
7      S = S \ T_i
8  end
9  return S
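The following self-contained C sketch simulates this reduction on the same toy integer-address model as the baseline sketch above. Splitting into W + 1 groups is done by simple index striping, and evicts() again stands in for the real timing test; none of these names come from the thesis implementation:

#include <stdio.h>
#include <stdlib.h>

#define NUM_SETS 64
#define WAYS 4
#define START_N 512

/* Same stand-in test as in the baseline sketch. */
static int evicts(const unsigned *s, int n, unsigned victim) {
    int congruent = 0;
    for (int i = 0; i < n; i++)
        congruent += (s[i] % NUM_SETS == victim % NUM_SETS);
    return congruent >= WAYS;
}

int main(void) {
    unsigned victim = 7, S[START_N];
    int n = START_N;
    for (int i = 0; i < n; i++)
        S[i] = (unsigned)rand();

    while (n > WAYS) {                   /* line 1: while |S| > W */
        const int groups = WAYS + 1;     /* line 2: split into W+1 */
        unsigned rest[START_N];
        int removed = 0;
        for (int g = 0; g < groups && !removed; g++) {
            int m = 0;                   /* build S without group g; */
            for (int i = 0; i < n; i++)  /* element i belongs to group */
                if (i % groups != g)     /* i mod (W+1) */
                    rest[m++] = S[i];
            if (evicts(rest, m, victim)) {  /* line 4: g is removable */
                for (int i = 0; i < m; i++)
                    S[i] = rest[i];         /* line 7: S = S \ T_g */
                n = m;
                removed = 1;
            }
        }
        if (!removed)  /* only happens if S was not an eviction set */
            break;
    }
    printf("reduced to %d addresses\n", n);
    return 0;
}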

In Figure 9 we can see a small example of the group testing algorithm, with associativity W = 2. The green elements (marked "!") represent addresses that map to the same set as the victim address. The following is an explanation of the subfigures presented.

(a) The initial set is first divided into W + 1 = 3 groups (line 2 in the algorithm).

(b) When holding out the first group (line 4), the remaining elements still form an eviction set. The first group can be removed (line 7).

(c) The groups are combined and split into 3 new groups.

(d) The first group can be removed, since the remaining set still evicts the victim (we have enough green elements).

(e) The process is repeated one last time, by splitting the remaining elements into 3 groups. Holding out either of the first two groups leaves a set that does not evict the victim, since it no longer forms an eviction set. However, when holding out the third group (containing the red element), the test shows that the remaining set does evict the victim.

(f) After removing that group, we are left with only 2 elements, and the algorithm ends. The two elements, or the addresses they represent, constitute the minimal eviction set.


Figure 9: Example run of the group testing algorithm with associativity 2. The green elements (marked with "!") represent addresses that map to the same target cache set and evict some victim address in that cache set. The red elements map to other cache sets and will not be part of the found eviction set.

2.4.4 Targeting Last Level Cache - Purpose and Challenges

The described Prime+Probe attack and the eviction set algorithms can be used on any cache level in the hierarchy. However, attacks against the LLC benefit the most, since the LLC is too large to repeatedly evict and probe in its entirety, which can be feasible for a smaller cache. The LLC also opens up for cross-core attacks, for example between different virtual machines in a cloud service environment, since different cores share this cache memory, as mentioned in Section 2.1.1.

However, since the LLC is larger than, for example, the L1 and L2 caches (in a 3-level cache hierarchy), it is slower to attack using Prime+Probe [10]. This is the reason a minimal eviction set is needed: to be able to target specific cache sets instead of the whole cache. Performing Prime+Probe on a single cache set can also be slower on the LLC than on, for example, the L1 cache. One reason is that the larger cache typically also has higher associativity, so it takes more memory accesses to fill a set or evict a cache line. Another reason is the longer latency of accessing the LLC, especially on a cache miss (which leads to fetching the data from main memory).

Another issue is that the LLC is physically indexed. For a virtually indexed cache, it is possible to directly calculate addresses that form an eviction set. The physical addresses are not directly accessible, and we might not know the mapping from virtual to physical addresses. Even if we do, we might not know the hash function for the LLC slices. To overcome this, it is possible to calculate the set by reversing the mappings (static method), or by using the previously discussed algorithms (dynamic method). Using large pages can also help with this problem (see Section 2.2.2).

2.4.5 Implementation Complications

Modern hardware has many optimisation features that make programs run more efficiently. However, these features might affect the results of the previously described algorithms, unless some adaptations are made. This section covers what one needs to consider when implementing the algorithms on real hardware. Note that when evaluating the algorithms in this project, a software-based cache simulator is used, avoiding the effect of most of these complications.

Cache prefetching means that data or instructions are loaded into the cache before they are needed. That way they will already be available in the cache and we will not suffer a cache miss. Prefetching can be implemented either in hardware or in software. For example, when making accesses in a loop with predictable offsets, it can be easy to predict the upcoming accesses. When running the eviction set algorithms, we want to avoid prefetching, since it can interfere with the algorithms by bringing unintended content into the cache and giving false results. One way to avoid this is to access the elements in a random order [10].

Instruction re-ordering and out-of-order execution are other features that optimise performance. In short, when a program is compiled it is broken down into separate instructions that the processor can execute. Instead of executing them in exact program order, the processor can re-order instructions that are not dependent on each other, as long as the final result is the same. This way, the processor can use time slots that would otherwise be empty. However, in some parts of the algorithm we do not want re-ordering, for example when performing the time measurements used to check for eviction. Unwanted re-ordering can be avoided, for example, by using memory fences [10].

When running the algorithms, many memory accesses are made, which also results in more accesses to the TLB. Some of these will be misses, resulting in a page walk (see Section 2.2.2), which gives higher latency. This can introduce false positives/negatives (a timing result could be interpreted as the element not being in the cache, while the latency in reality comes from the TLB miss). One way to mitigate this is to use huge pages, since each page then covers more translations and fewer misses occur.

To be able to test whether an eviction has occurred, it is essential to have access to a good clock or timing method, whose availability depends on the platform. Since the timing can vary and be affected by a number of things, it is also important to average over a number of measurements to get a representative result.

Lastly, there can be interference or noise introduced by other processes that run on the same system and share the same cache and other microarchitectural structures.

3 Related Work

There have been many demonstrations of cache attacks, in different settings and on different platforms. The most common use case is to leak encryption keys. Some of the following examples use Prime+Probe as the method of attack, which needs an eviction set to work.

Liu et al. [10] present a method to find eviction sets without having to know the virtual-to-physical address mapping of the victim (the first version of what in this project is called the Baseline algorithm). This method is then used to launch Prime+Probe attacks against the last level cache, across VM boundaries, without relying on weaknesses in the VM itself. Qureshi [18] proposed a new version of this algorithm that reduced the time complexity of creating such eviction sets from O(N^2) to O(N). This would make it harder to perform randomisation-based mitigation, in this case by using cache encryption. Qureshi also presents a method for creating eviction sets that leverages the replacement policy. In parallel, Vila et al. [24] also presented an algorithm to find minimal eviction sets, based on threshold group testing. This is similar to the method used by Qureshi, and also reduced the time complexity to linear. The difference is that Vila et al. focus more on mathematically proving and evaluating the eviction set algorithm, while Qureshi focuses more on improving the mitigation technique so that it still holds when the time taken to find eviction sets is reduced. In this thesis, the algorithms are based on the work of Vila et al., but the idea is similar to the one by Qureshi.

Maurice et al. [12] built an automatic method for reverse engineering the complex addressing (to LLC slices) in Intel processors, using performance counters. Knowing the addressing, one can compute an eviction set and finally perform a Prime+Probe attack. This is a static approach to finding eviction sets. The results from the reverse engineered complex addressing scheme were then used in another work, by Gruss et al. [4]. In this case, the Rowhammer bug was exploited, but from Javascript. Most previous works on the Rowhammer attack, in contrast, use clflush, which is not available from Javascript. The paper also defines the concept of eviction strategies, and tries to find a strategy that works well for undocumented/adaptive replacement policies as well as variants of pseudo-LRU.

Osvik et al. [15] were the first to present Prime+Probe as well as Evict+Time, and used them to perform attacks against AES encryption. Numerous countermeasures are also discussed. Yarom and Falkner [27] presented the Flush+Reload attack as an alternative to Prime+Probe, with higher resolution, since it can target specific cache lines instead of cache sets. This attack uses the clflush instruction to perform the eviction and depends on memory page sharing to work.

Oren et al. [14] build on the work by Liu et al. and use Prime+Probe to build an attack against the LLC. The difference is that it does not assume the victim to be in close proximity to the attacker (e.g. on the same machine); it can be executed remotely, e.g. over the web using Javascript. It also does not require support for large pages, as in Liu et al. The result is that it is possible to track user behaviour. The paper discusses some difficulties introduced when using Javascript, for example the lack of access to physical or even virtual addresses, as Javascript has no notion of pointers. It is also harder to use memory barriers and to get high-resolution timing. The paper discusses the effect of the mapping strategy and replacement policies on cache behaviour, as well as different types of noise that affect side channel attacks and how to mitigate them. It also discusses countermeasures on the Javascript level as well as the hardware level.

Percival [16] uses a method similar to Prime+Probe to perform cache side channel attacks in the presence of "Hyper-threading", or simultaneous multithreading, which allows instructions from different threads to execute concurrently to better utilise resources. In that case, the threads also share access to caches, which makes the attacks possible. Countermeasures on the hardware, OS and library levels are proposed.

Attacks on portable code were examined by Genkin et al. [3]. Instead of Javascript, which is a lot less efficient than native code, the attacks are built around PNaCl and WebAssembly, which are somewhat more efficient. The algorithm for finding eviction sets is also based on Liu et al. [10], but slightly adjusted for portable settings. This is used to extract cryptographic keys from various libraries. The paper also discusses how to handle TLB noise.

Lu Shiting [11] wrote a master's thesis on data cache based timing attacks (CBTA). Both methods and countermeasures are described, and a hardware-based protection method is evaluated. Another master's thesis, by Sai Prashanth Josyula [7], explores the possibility of using cache side channel attacks against the Elliptic Curve Digital Signature Algorithm (ECDSA), using Flush+Reload.

3.1 Characterising the Cache - Microbenchmarks

A memory microbenchmark program can be used to analyse a memory architecture. For example, it is possible to create a graph of the access times for different memory accesses by stepping through an array and accessing one element of the array at a time [25]. Both the array size and the stride (step size) are varied. The measurements are repeated many times, to get an average access time.

From the output graph it is possible to draw conclusions about the cache hierarchy, such as the associativity, size, line size and access time of the L1, L2 and LLC caches of the system. One can also get information about the TLB entries, associativity and lookup time, as well as the page size and main memory access time.

As an example, if we use a vector that is smaller than the cache size, we will have a 100 % hit rate regardless of the stride (except for the first time we go through the vector), since the whole vector fits in the cache. If we instead use a vector that is larger than the cache, the hit ratio can tell us about the cache line size. If we have a 50 % hit rate, we know that we access each cache line twice, where the first access is a miss and the second one is a hit. That is because the second element is brought in through spatial locality, since it is on the same cache line. If we increase the stride and get to a 0 % hit rate, we know that we are accessing each cache line only once, which means that we have found the cache line size. In a similar way, we can also draw conclusions about the cache sizes.
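A minimal version of such a microbenchmark could look like the following C sketch. The timing is x86-specific (__rdtsc()), and the array size, stride ranges and repetition count are all illustrative:

#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>

#define REPEAT 10 /* repetitions; averaging amortises compulsory misses */

static volatile char array[16 * 1024 * 1024]; /* larger than a typical LLC */

static double cycles_per_access(size_t size, size_t stride) {
    uint64_t start = __rdtsc();
    uint64_t accesses = 0;
    for (int r = 0; r < REPEAT; r++)
        for (size_t i = 0; i < size; i += stride) {
            (void)array[i]; /* volatile read: cannot be optimised away */
            accesses++;
        }
    return (double)(__rdtsc() - start) / (double)accesses;
}

int main(void) {
    /* Sweep array size and stride; knees in the resulting curves reveal
     * the cache sizes, and the stride behaviour reveals the line size. */
    for (size_t size = 4096; size <= sizeof(array); size *= 2)
        for (size_t stride = 4; stride <= 512; stride *= 2)
            printf("size=%zu stride=%zu cycles/access=%.2f\n",
                   size, stride, cycles_per_access(size, stride));
    return 0;
}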

4 Method

In this section, the tools and methods used for the implementation and evaluation are presented. First a general description of Pin and the cache simulator is given, followed by an explanation of the decisions made when implementing the algorithms. Sections 4.3 and 4.4 describe the changes made to be able to determine whether an access is a hit or a miss, and to be able to measure the number of memory accesses. Finally, the tests run to compare the algorithms and evaluate the characterisation tool are described.

In previous sections, the advantages and issues of attacking the LLC have been discussed. However, when evaluating the algorithms using Pin, an L1 data cache is used for simplicity. This also means that we do not check slice indexing, since the L1 cache does not use slices.

4.1 Pin

Pin is a free-to-use tool for instrumentation and analysis of binaries, developed by Intel [5]. Instrumentation here means allowing the user to insert arbitrary code dynamically while running a program. By doing this, one can gain information about the execution of the program, such as the number of memory accesses made or other types of data. The user program will here be referred to as an application.

Pin can be seen as a just-in-time compiler which takes an executable application as its input. The original code is used as a reference as Pin generates a new sequence of code. This new code is what actually gets executed, and into it the user can inject their own instructions through instrumentation. Instrumentation consists of two parts: the instrumentation part tells Pin where to insert code (such as calls to other functions), while the analysis part refers to the actual functions that will be called or inserted. A Pintool contains both of these parts. It is possible to use the Pintools that are provided, or to write your own. The instrumentation code is only called once, while the analysis code is called each time the instrumented instruction is executed. Instrumentation can be done at different granularity levels; for example, for each routine (function) of a program, or for each instruction. It is also possible to replace an existing function in an application with a function that exists in the Pintool, using a built-in replacement mechanism.

Pin is in this case used to simulate a cache, to be able to analyse in detail what the algorithms do to the cache. By utilising this, we can compare the algorithms in a less noisy setting, avoiding some of the more complicated optimisation features of modern processors. For this purpose, a built-in cache simulator Pintool is used, customised for this project.

The cache simulator works in the following way. Each instruction in the application is analysed to see if it is a memory access, and whether it is a read (from memory) or a write (to memory). Depending on this, a call to a special function is inserted. The function stores the tag and the index part of the address that the access relates to. Since the program still gets its data from the "real" hardware cache, the actual data does not need to be stored in the simulated cache. When determining whether an access is a hit or a miss, or which data to replace, the simulated cache only needs to look at the stored address tags.

4.2 Implementation Decisions

The algorithms are implemented in C. The reason for using C is that it is suitable for low-level memory management. It is also compatible with the cache simulator tool that is used, which is written in C++. When needed, inline assembly is used in combination with C. The reason for this is to be able to control the exact instructions used and to prevent the compiler from removing important parts of the code. This is for example needed when doing memory accesses "on purpose", whose only role is to bring content into the cache: if the values we access are never used, the compiler could choose not to include these accesses, since they do not seem to be needed.

Several previous works [24, 14, 10, 22] have implemented eviction sets as linked lists, where each address is associated with a pointer to the next one in the set. The order of the addresses is also randomised. This is done to reduce the effect of hardware prefetching, and to make sure that all memory loads are executed in order [24]. In this project, the eviction sets are also implemented as linked lists. Even though prefetching has no effect on a simulated cache, this representation still lets us access all the elements in the set in an easy way, by simply traversing the linked list. Since the addresses in the set are the same as the addresses of the nodes in the linked list, no "extra" memory accesses are needed to reach the addresses in the set, as opposed to storing them in, for example, an array. With a linked list it is also convenient to split the set into groups, which is needed when implementing the group testing version of the algorithm. Some utility functions for the linked list are borrowed from the published source code of [24], which can be found on GitHub [23]. Each node in the linked list is padded to the size of a cache line, so that each cache line contains only one element.
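A sketch of what such a padded linked-list node can look like in C (the type and field names are ours, not taken from the thesis code):

#include <stddef.h>

#define CACHE_LINE_SIZE 64 /* bytes; assumed to match the simulated cache */

/* One element of the eviction set. The node's own address is the address
 * that participates in the set; padding makes each node fill exactly one
 * cache line, so no two elements share a line. */
typedef struct node {
    struct node *next;
    char pad[CACHE_LINE_SIZE - sizeof(struct node *)];
} node_t;

_Static_assert(sizeof(node_t) == CACHE_LINE_SIZE,
               "node must be exactly one cache line");

/* Accessing every address in the set is one pointer-chasing traversal. */
static void access_all(const node_t *head) {
    for (const node_t *n = head; n != NULL; n = n->next)
        ; /* the load of n->next is the memory access we want */
}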

4.3 Method to Determine Cache Presence

When running dynamic methods for finding eviction sets on a real hardware cache, timing measurements are necessary to determine if an access is hitting or missing in the cache. That is, if the access takes longer time than some threshold, we know it is not in the cache and it is a miss, otherwise it is a hit. This can be done using for example a processor time stamp counter [21, 16], or

some other type of clock. Reducing the accuracy of the available clock has been proposed as a countermeasure against timing attacks [22]. This can however be hard to achieve, since the reduction in accuracy can affect other programs that need the clock, and an attacker can overcome it by averaging over many samples.

On the simulated cache it is not possible to use the same timing-based techniques as on a hardware cache. Since no “real” memory accesses are made when using the Pin cache, measuring the time would only tell us how long the work of Pin takes, which is not necessarily connected to a hit or a miss in the cache. Instead, it is possible to directly check whether an address is in the cache, rather than inferring it from timing. This is possible because we have control over the content of the simulated cache, which we do not have on a real system. In this case it is done by adding a function inside the Pintool that checks for the requested address tag in the data structure that stores the tags. An empty function called inCache() is then added to the application, and in the instrumentation this function is replaced with the Pintool function. By doing this, one could in future work extend the program to use one type of inCache function when run on a real system (using timing), and another when run in the simulated environment (using the Pintool function).

The inCache() that is sent to Pin will however always give the correct information about the content of the simulated cache, while a function that uses timing on the hardware cache might be affected by noise.
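A sketch of how such a function replacement can be set up with Pin's routine-replacement facility follows; the Pintool-side lookup, SimulatedCacheContains, is an assumed placeholder for the tag-store check:

#include "pin.H"

// Assumed Pintool-side check against the simulated cache's tag store.
extern bool SimulatedCacheContains(void *addr);

// Replacement for the application's empty inCache() stub.
int PinInCache(void *addr) {
    return SimulatedCacheContains(addr) ? 1 : 0;
}

// Image-level instrumentation: when the application image is loaded,
// find its inCache() routine and replace it with the Pintool version.
VOID Image(IMG img, VOID *v) {
    RTN rtn = RTN_FindByName(img, "inCache");
    if (RTN_Valid(rtn))
        RTN_Replace(rtn, (AFUNPTR)PinInCache);
}

// Registered in main() with IMG_AddInstrumentFunction(Image, 0);
// PIN_InitSymbols() must be called first for the name lookup to work.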

4.4 Method of Measuring Number of Memory Accesses

The performance of the eviction set algorithms is defined in terms of the number of memory accesses, in relation to the size of the starting set. In order to confirm this experimentally, a method to measure the number of memory accesses is needed. This functionality is added to the Pintool, in the form of a counter that is incremented each time a read or write is made to the simulated cache. The result is written to an output file at the end, for further processing.

Which part of the program gets measured is controlled by toggling a variable. In this case, the interesting part is the function that tests whether a set is an eviction set, since that is where all “intentional” memory accesses are made, that is, the accesses made to move data into the cache in order to potentially evict the victim. When running the whole program, many other accesses will be made, but most of them are not relevant to the performance of the algorithm, and therefore not relevant when comparing the algorithms.
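A minimal sketch of the counting mechanism, extending the CacheAccess sketch from Section 4.1 (identifiers are illustrative):

// Incremented from the analysis routine that models the cache access;
// 'counting' is the toggle that selects which part of the application
// is measured.
static UINT64 accessCount = 0;
static volatile BOOL counting = FALSE;

VOID CacheAccess(VOID *ea, BOOL isWrite) {
    if (counting)
        accessCount++;
    // ... normal simulated-cache lookup/update for 'ea' ...
}

// The toggle is flipped around the eviction-testing function, and the
// final value of accessCount is written to the output file at the end.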

4.5 Comparing the Algorithms

One goal of the project was to compare the two algorithms against each other. To do so, the two implementations are run separately, while varying different

parameters, and either counting the number of successful runs or the number of memory accesses.

4.5.1 Parameters

It is possible to vary some parameters when running the cache simulator. The following parameters can be changed with command line arguments, given as flags to the Pintool (a sketch of how such flags can be declared follows the list):

• -a Associativity (1 gives a direct-mapped cache)
• -b Cache block size (line size)
• -c Cache size (in kB, must be a power of 2)
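Such flags could, for example, be declared with Pin's KNOB mechanism, roughly as follows (the default values here are illustrative assumptions):

#include "pin.H"

// Command-line flags for the cache simulator Pintool.
KNOB<UINT32> KnobAssoc(KNOB_MODE_WRITEONCE, "pintool",
                       "a", "8", "cache associativity (1 = direct mapped)");
KNOB<UINT32> KnobLineSize(KNOB_MODE_WRITEONCE, "pintool",
                          "b", "64", "cache block size in bytes");
KNOB<UINT32> KnobCacheSize(KNOB_MODE_WRITEONCE, "pintool",
                           "c", "32", "cache size in kB (power of 2)");

// The values are read after PIN_Init(), e.g. KnobCacheSize.Value().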

In the algorithms, it is also possible to vary the following parameters:

• Size of the initial set S

• Which method of reduction is used (group testing or regular)

4.5.2 Testing Correctness

Tests were made to measure how often the algorithms return the correct result. This was done for each of the two algorithms. The test first calls the function that finds a minimal eviction set for a given victim address. It is then checked whether the found set does evict the victim; if it does, the test program returns true.

The cache parameters (size and associativity) were varied in different combinations. The cache size was set to the values in [2, 4, 8, 16, 32] kB, while the associativity was set to the values in [2, 4, 8, 16, 32]. Each combination was tested 10 times, and the number of tests that returned successfully, that is, found a correct eviction set, was recorded. The results are presented in Section 5.1.1. Note that it is normal for the algorithms to sometimes fail to find a correct minimal eviction set, due to memory accesses other than those intended by the algorithms being made, or to general noise. These factors would be even worse on a real hardware cache.
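One such trial can be sketched as follows, reusing node_t and traverse() from the sketch in Section 4.2 (the other helpers stand in for the project's actual routines and are assumptions):

/* Assumed helpers, standing in for the actual implementation. */
extern node_t *find_minimal_eviction_set(node_t *S, void *victim);
extern void    access_victim(void *victim);  /* load victim into cache */
extern int     inCache(void *addr);          /* Pintool-backed check   */

/* One trial: returns 1 if the found set really evicts the victim. */
int run_trial(node_t *S, void *victim) {
    node_t *es = find_minimal_eviction_set(S, victim);
    access_victim(victim);    /* bring the victim line into the cache  */
    traverse(es);             /* access every address in the found set */
    return !inCache(victim);  /* evicted => the eviction set works     */
}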

4.5.3 Testing Performance and Scaling

To test and compare the performance of the two algorithms, the method described in Section 4.4 is used to count the number of memory accesses. Only the part containing “intentional” memory accesses is measured (the function that accesses the victim and all elements in the set). This is because it can be assumed

that the number of other memory accesses would be approximately the same for both of the algorithms.

To examine the performance under different cache parameters, cache size and associativity were both varied over the values [2, 4, 8, 16, 32], where the cache size is in kB. Each combination was run 100 times, from which the mean value and standard deviation were calculated. The results are presented in Section 5.1.3.

The parameter that could affect the performance and scaling the most is the size of the starting set S. To test a larger cache size, we would need a larger number of elements in the initial set, to make sure that the initial set is indeed an eviction set, although a non-minimal one. A larger starting set will take longer to prune down to a minimal core. Because of this, the performance under different starting sizes was compared. The size of S was varied over the values [50, 100, 500, 1000, 2000, 4000, 6000, 8000, 10000, 20000]. The other parameters were kept the same, with the associativity set to 2 and the cache size set to 1 kB, which were the smallest values possible while still having a set-associative cache. The results are presented in Section 5.1.2.

4.6 Characterising the Cache

The second part of this project is about finding a method for reverse-engineering the cache parameters, using the two eviction set algorithms. This section discusses the different methods tested to accomplish this. Both the baseline algorithm and the group testing algorithm are used when testing these methods. The parameters that we want to find are:

1. The associativity of the cache
2. The size of the cache

Here we assume that the cache line size is 64 B. This is the most common cache line size, but the program can easily be altered to use a different cache line size as well.

4.6.1 Using Baseline Algorithm

Associativity: If the baseline algorithm is slightly modified to look at all elements in the set, instead of stopping when it has found the right number of elements, it can work without knowing the associativity beforehand. The number of elements found by the algorithm will then be equal to the associativity of the cache. This test needs to be run multiple times, in case some of the runs are unsuccessful. This is even more important on a real system, where more noise and interference are present.

To implement this, a linked list is used as before, where each node represents one address in the set. Each node is tested as described in Section 2.4.1

to see if it contributes to the eviction set. If it does, it is appended to the end of the set; otherwise, it is removed. This process is repeated until all elements in the original set have been tested (and either removed or appended to the end). This way, the final set will consist of only the elements of the minimal eviction set. The size of this set can then be measured, which, as mentioned, should equal the associativity. Not stopping once the right number of elements has been found might lead to a somewhat longer average run-time; the maximum run-time should however be the same.

Size: The number of entries in the cache (the number of possible indexes) is defined as ENTRIES = SIZE / LINESIZE / ASSOC, where ASSOC stands for associativity. The offset from one address to the next address that maps to the same index/entry in the cache is then OFFSET = ENTRIES * LINESIZE = SIZE / ASSOC. The size can thus be derived as SIZE = ASSOC * OFFSET.

Assume that the addresses are chosen non-randomised from the same memory area and tested sequentially for eviction, as above. Then the found addresses in the final eviction set will be at a constant offset from each other. Using the relations above and the method for finding the associativity in the previous section, we can find the size of the cache. For example, if running the algorithm finds 4 addresses with an offset of 2048 from each other (e.g. addr1 − addr2 = 2048, etc.), we can draw the conclusion that the cache size is 2048 * 4 = 8 kB. Figure 10 shows an example of the method to find the associativity using the baseline algorithm, with associativity 2.

Algorithm 4: Finding size and associativity with the baseline algorithm
input : S = starting set, x = victim address
output: 1 if successful, otherwise 0

1  baseline_algorithm(S, x)          // reduces S to a minimal eviction set
2  guessed_assoc = len(S)
3  offset = addr(S) − addr(S→next)   // second found address
4  guessed_size = offset * guessed_assoc
5  return guessed_assoc == real_assoc && guessed_size == real_size   // only for verification
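A minimal C sketch of this derivation, reusing the node_t type from Section 4.2 (assuming the reduced set is handed over after the modified baseline run, with at least two nodes at a constant stride):

#include <stdlib.h>   /* labs */

/* Derive associativity and size from a reduced, minimal eviction set. */
void guess_parameters(node_t *S, size_t *assoc, size_t *size) {
    size_t n = 0;
    for (node_t *p = S; p != NULL; p = p->next)
        n++;                               /* set length = associativity */
    long offset = labs((long)((char *)S->next - (char *)S));
    *assoc = n;
    *size  = (size_t)offset * n;           /* SIZE = ASSOC * OFFSET */
}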

Figure 10: Example of the implemented method for finding associativity using the baseline algorithm. Each circle represents one node in a linked list. The green elements, marked with “!”, are the elements that map to the desired cache set.

4.6.2 Using Group Testing Algorithm

Associativity: In the group testing algorithm, the associativity is used to define how many groups the set is split into. Since we now do not know the associativity, the algorithm needs to be adjusted. One solution is to run the algorithm multiple times, trying different values of the associativity. This was implemented by starting with an associativity guess of 1 and testing whether the found set is an eviction set. If not, the tested value is doubled for each iteration, until the test succeeds. The tested value is then the associativity of the cache.

Size: The size can be found in a similar way as with the baseline method. However, it again needs to be noted that this only works if the addresses are chosen in sequential order and a standard indexing function is used.

Algorithm 5: Finding size and associativity with the group testing algorithm
input : S = starting set, x = victim address
output: 1 if successful, otherwise 0

1   i = 1
2   while i < max_assoc do
3       reset(S)
4       grouptesting_algorithm(S, x, i)   // reduces S
5       if S evicts x then break
6       i = i * 2
7   end
8   guessed_assoc = i
9   offset = addr(S) − addr(S→next)       // second found address
10  guessed_size = offset * guessed_assoc
11  return guessed_assoc == real_assoc && guessed_size == real_size   // only for verification
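The search loop can be sketched in C as follows (reset, grouptesting_algorithm and evicts stand in for the project's actual routines and are assumptions):

/* Assumed helpers from the actual implementation. */
extern void reset(node_t *S);
extern void grouptesting_algorithm(node_t *S, void *x, size_t assoc);
extern int  evicts(node_t *S, void *x);

/* Double the associativity guess until the reduced set evicts x. */
size_t find_associativity(node_t *S, void *x, size_t max_assoc) {
    size_t i = 1;
    while (i < max_assoc) {
        reset(S);                        /* restore the full starting set */
        grouptesting_algorithm(S, x, i); /* reduce S assuming assoc = i   */
        if (evicts(S, x))
            break;                       /* guess confirmed               */
        i *= 2;
    }
    return i;                            /* the cache's associativity     */
}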

4.6.3 Evaluation

To verify whether a run is successful, the found size and associativity are compared to the actual size and associativity of the simulated cache used in that run. These values are not known to the algorithm, but can be used for verification. To evaluate the tool, its correctness and scaling were tested in the same way as described in Section 4.5.2 and Section 4.5.3. Since the correctness of the found eviction set affects the evaluation of the tool, tests were also made where the eviction set was first verified.

5 Results

In this section, the results of the performed tests are presented. The motivation for and description of the tests are given in Section 4. The results are then discussed in Section 6.

5.1 Comparing Algorithms

5.1.1 Correctness

The following is the result of 250 runs with different combinations of size and associativity, as described in Section 4.5.2. The first two rows describe the results when verifying that the found set is an eviction set, which does not need to be minimal. The last two rows show the results when also checking that the found set is minimal, meaning that we want to find as many addresses as the associativity of the cache.

                                            Baseline algorithm   Group testing algorithm
Number of successful runs                   209/250              202/250
Percentage successful runs                  83.6 %               80.8 %
Number of successful runs, minimal set      199/250              201/250
Percentage successful runs, minimal set     79.6 %               80.4 %

Table 1: Correctness of the two algorithms.

We can see that there is a minor difference between the results when checking whether the found set is an eviction set, and when also checking whether it is minimal. This indicates that there are some cases where we include elements/addresses in the eviction set that are not actually part of the minimal eviction set.

5.1.2 Scaling

N is here the size of the starting set. First we can see the scaling of the two algorithms separately, where Figure 11 shows the baseline algorithm and Figure 12 shows the group testing algorithm. We can see that the baseline algorithm shows a quadratic behaviour while the group testing algorithm shows a linear one. We can also see that the group testing algorithm in general requires far fewer memory accesses.

Figure 11: Scaling of the baseline algorithm (memory accesses vs. starting set size N).

Figure 12: Scaling of the group testing algorithm (memory accesses vs. starting set size N).

In Figure 13 we can see both of the algorithms together. Here both the x-axis and the y-axis are logarithmic, to compensate for the big differences in scale. We can see that the baseline algorithm both starts with more memory accesses and scales much more steeply.

Figure 13: Logarithmic scaling of the algorithms (memory accesses vs. N, both axes logarithmic).

5.1.3 Performance

Figures 14 and 15 show the number of (intentional) memory accesses made by the algorithms while varying the cache parameters, that is, the memory accesses made inside the algorithm when testing for eviction. The size of the starting set is here kept the same, while the associativity and cache size are varied.

The baseline algorithm, shown in Figure 14, is not much affected by variation in either cache size or associativity. This is because it mainly depends on the size of the starting set, and does not use the other parameters in the algorithm. In reality, the size of the starting set would be chosen with respect to the cache size, to make sure that we have enough addresses to form an eviction set. This means that the cache size and the size of the starting set are in fact connected, but we test them separately here, as the results would otherwise be less clear.

The group testing algorithm, shown in Figure 15, shows more variation. It is mainly the associativity of the cache that affects the performance. With a smaller associativity, the algorithm can use and remove larger groups at a time. It also has to find fewer addresses to obtain the final eviction set.

Figure 14: The number of memory accesses when varying the associativity and cache size, when running the baseline algorithm. The size is in kB.

Figure 15: The number of memory accesses when varying the associativity and cache size, when running the group testing algorithm. The size is in kB.

5.2 Characterising the Cache

5.2.1 Correctness

The following is the result of 250 runs with different combinations of size and associativity, as described in Section 4.5.2. We can see that the success rate is similar to that of the algorithms themselves, which is expected: if the found minimal eviction set is not correct, the guesses for size and associativity will not be correct either. However, if we only look at the runs where the final eviction set is confirmed to actually be minimal and to evict the victim, the success rate is 100 % for the baseline algorithm and around 95 % for the group testing algorithm. The incorrect runs are in that case probably due to missing the right associativity in one iteration. (The correctness of the eviction set was in this test only verified at the end of the algorithm. A “false” eviction set could then lead to the correct associativity being missed.)

                                           Baseline   Group testing
Number of successful runs                  209/250    202/250
Percentage, successful runs                83.6 %     80.8 %
Successful runs, eviction set confirmed    182/182    192/201
Percentage, eviction set confirmed         100 %      95.5 %

Table 2: Success rate for the cache characterisation tool

5.2.2 Scaling

The scaling of the method for finding size and associativity, while using the two different algorithms, can be seen in Figure 16. N is here the size of the starting set. We can see that even though the group testing algorithm needs to be run multiple times in this method (once for each associativity value tested), it is still significantly better than the baseline algorithm. The total number of accesses is here the sum over all the repeated runs of the group testing algorithm.

Figure 16: Logarithmic scaling of the tool when using different start sizes N.

6 Discussion

In this section, I will discuss the results found when comparing the two algorithms, as well as the results of implementing a method for characterising the cache settings. Some problems encountered along the way will be mentioned as a reference and help for similar projects in the area. Finally, I will give some suggestions for future work and extensions to this project.

6.1 Comparing Algorithms

A first observation is that it was beneficial to use the Pin cache simulator to compare the performance of the two algorithms, since it made it possible to tune different settings. This would not have been possible when running directly on hardware without switching platform entirely, which would change more than just one or two parameters. It also reduces the amount of noise present during the comparison, as well as during the evaluation of the cache characterisation.

The algorithms have a high success rate when run on the simulated cache. It is not clear what causes the unsuccessful runs, but it could be corner cases or unintended memory accesses that interfere with the runs of the algorithms. It is nevertheless a good success rate compared to how a similar program would behave on a real system, which has more noise and interference (see Section 2.4.5). In such a case, the algorithms would need to be re-run many times to ensure good results.

We can see that the variation in cache size and associativity did not affect the performance much, especially not for the baseline algorithm. Since it looks at one element at a time, it is mostly the starting set size that affects its performance. In the group testing case, the associativity has some effect on the performance, since it decides the group sizes, and thus how many elements can be removed in each iteration.

Looking at the scaling, we can see that the group testing algorithm scales significantly better than the baseline algorithm. This is expected, since the baseline algorithm has to hold out one element at a time, and for each test access all the elements in the remaining set. The group testing algorithm can do this much faster, by testing a whole group of elements at once.

There are some important consequences of this. The first one is the possibility to attack the LLC in a faster way. Before the baseline algorithm was presented, it was considered too inefficient to use Prime+Probe to attack the LLC, since its size made it hard to prime and probe the whole cache. The baseline algorithm made such attacks possible, but the optimised group testing algorithm will make them more powerful. Attacks against the LLC are more dangerous than attacks on the smaller caches since, as discussed previously, the victim and the attacker do not need to be in proximity. For example, in a cloud environment, they can be on separate guest VMs that run on the same host, the LLC being the only shared cache. This opens up for information

leakage in the system that might not be easy to prevent. Since attacks using Prime+Probe from JavaScript have previously been shown to be possible, web-based attacks could also be enhanced using the optimised algorithm. This could have a negative impact on user privacy. Attacks that use speculative execution, such as Meltdown and Spectre, could also be performed more efficiently in the cases where we do not have access to the clflush instruction.

Secondly, an optimised eviction set algorithm creates new requirements for the design of countermeasures against these types of attacks. Protections that were previously seen as “good enough” might need to be reconsidered and redesigned with faster attacks in mind. This was for example realised by Qureshi [18], which led to a new version of the mitigation technique CEASER.

6.2 Characterising the Cache

The results show that it is possible to use the eviction set algorithms to characterise the cache memory. The success rate is not 100 %, but this is expected, since the algorithms themselves are not always correct, for the reasons previously discussed. That is, when an algorithm fails to find an eviction set, this method will also not work correctly. One possible solution is to first confirm that the found set is indeed an eviction set, and re-run if not successful. However, if the set is an eviction set but not a minimal one, this test will return true, yet the tool will still fail, because it assumes that a minimal set has been found when determining the associativity. The tool does however seem to work as intended in the cases where a correct minimal eviction set is found.

Another way of overcoming the problem with “false eviction sets” could be to run the tool multiple times on the same cache settings. Since the wrong results seem to appear randomly and the algorithms in general have a high success rate, it is probable that we will get the correct results a majority of the times we run the tool. The deviant results could then be discarded.

How would this tool perform in reality, that is, when characterising a real hardware cache? Assuming that the algorithms are modified to work well enough on a real system, according to the challenges discussed in Section 2.4.5, I believe that they would be able to determine the associativity with the method presented. To determine the cache size, the method might need to be slightly modified. This is because we want to access the addresses in a random order, to avoid prefetching, and then it is not certain that we end up with a set where the elements lie at regular offsets from which conclusions can be drawn. If the cache has a high associativity, we will however have more addresses to work with in the final set, which could give more information to the tool. Again, re-running the algorithm many times on the same machine/cache would also give more information to work with.

Compared to the microbenchmark method, this method has some of the same issues. The microbenchmark method described earlier also does not work in the presence of prefetching. One benefit of the eviction set algorithm method is that it does not require a manual review of the produced graph. (It is of course possible

that there are variations of microbenchmarking that also do not require this.) By having a more automatic analysis of the settings, the tool can act as a first step in a larger program, where the results are used as parameters in that program. It is also beneficial that the tool makes few assumptions. Even if there are other ways of retrieving the cache information from a system, these may differ depending on the system. This method might be more portable, and works as long as it is possible to calculate the minimal eviction set.

6.3 Problems, Pitfalls and Lessons Learned

One of the main lessons I learned from this project is that working with the cache is not very deterministic. Even though I used a simulated cache and thus avoided a lot of interference, it was sometimes difficult to know what output to expect during development. This is because there can be unintended memory accesses, which go to the cache like any other access and cause evictions. In Pin it is possible to log every memory access, or the accesses made from a certain function, but since there can still be a large amount of accesses, this is not a very simple way of debugging.

This is also important to remember when choosing the data structure to represent the eviction set. As a first trial step, for example, I stored the addresses in the initial set in an array. However, this leads to a “double” memory access, since we first need to access the array, and then the address stored in it. This could affect the cache in unintended ways. For this reason, the linked list worked better, as described earlier; the addresses can then be accessed by simply traversing the list.

A third thing to think about is what the compiler might do. An access to a value that is never used afterwards can be removed by the compiler for optimisation reasons, even if the access itself was what we wanted. This is why inline assembly is used in places, so that this does not happen. When doing similar projects, it can be a good idea to inspect the disassembled version of the program, to see that it looks as expected (checking that the accesses have not been compiled away).
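A minimal sketch of forcing such an access, assuming GCC/Clang-style inline assembly (the helper name is illustrative):

/* Perform a load from addr that the compiler may not remove: the empty
   asm statement consumes the loaded value as an input operand, so the
   load has to be carried out, and the "memory" clobber keeps it from
   being reordered or optimised away. */
static inline void force_access(void *addr) {
    asm volatile("" : : "r"(*(volatile char *)addr) : "memory");
}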

6.4 Future Work

To further test the applicability and performance of the algorithms and the tool, it would of course be interesting to evaluate them on a real cache as well, instead of the simulated one. To do so, some modifications would be necessary, as described in Section 2.4.5. The adaptations mainly aim to reduce the effects of reordering, prefetching, TLB contamination, interference from other processes, and other types of noise that occur when working on real hardware. Some newer processors use undocumented, adaptive replacement policies. It would be interesting to see how the algorithms perform under these policies, and how they can be adapted to work as well as possible. It would also

be interesting to see how the algorithms would work on more unusual cache designs, like skewed-associative caches.

Song and Liu [21] optimised the algorithm even further, by using other group sizes in the group testing algorithm and a different method for going through the list of elements. These optimisations could be incorporated into the tool created in this project, and compared for performance.

7 Conclusion

Methods for dynamically finding eviction sets are mainly used when performing efficient Prime+Probe attacks against the cache. They are especially useful in cases where another type of attack, Flush+Reload, cannot be used for various reasons, and for attacking the last level cache (LLC). In this work I have compared an optimised algorithm for finding eviction sets against the previous state-of-the-art algorithm. The results confirm the theory: the optimised algorithm has linear scaling, instead of the quadratic scaling of the previous algorithm. This improvement implies that more efficient attacks using Prime+Probe are possible, and that this more efficient approach needs to be considered when designing countermeasures against these types of attacks.

I have also explored a new area of application for the algorithms. There is potential for these algorithms to be used for characterising the cache parameters, since I was, to a great extent, able to determine the size and associativity of caches with many different combinations of settings. However, to know exactly how applicable this is in reality, the method needs to be further evaluated.

References

[1] D. J. Bernstein, “Cache-timing attacks on AES,” 2005.
[2] P. Damaschke, “Threshold group testing,” in General Theory of Information Transfer and Combinatorics. Springer, 2006, pp. 707–718.
[3] D. Genkin, L. Pachmanov, E. Tromer, and Y. Yarom, “Drive-by key-extraction cache attacks from portable code,” in International Conference on Applied Cryptography and Network Security. Springer, 2018, pp. 83–102.
[4] D. Gruss, C. Maurice, and S. Mangard, “Rowhammer.js: A remote software-induced fault attack in JavaScript,” in International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment. Springer, 2016, pp. 300–321.
[5] Intel, “Pin - a dynamic binary instrumentation tool,” https://software.intel.com/en-us/articles/pin-a-dynamic-binary-instrumentation-tool.

[6] G. Irazoqui, T. Eisenbarth, and B. Sunar, “S$A: A shared cache attack that works across cores and defies VM sandboxing – and its application to AES,” in 2015 IEEE Symposium on Security and Privacy. IEEE, 2015, pp. 591–604.
[7] S. P. Josyula, “On the applicability of a cache side-channel attack on ECDSA signatures: The Flush+Reload attack on the point multiplication in ECDSA signature generation process,” 2015.
[8] P. Kocher, J. Horn, A. Fogh, D. Genkin, D. Gruss, W. Haas, M. Hamburg, M. Lipp, S. Mangard, T. Prescher et al., “Spectre attacks: Exploiting speculative execution,” in 2019 IEEE Symposium on Security and Privacy (SP). IEEE, 2019, pp. 1–19.
[9] M. Lipp, M. Schwarz, D. Gruss, T. Prescher, W. Haas, S. Mangard, P. Kocher, D. Genkin, Y. Yarom, and M. Hamburg, “Meltdown,” arXiv preprint arXiv:1801.01207, 2018.

[10] F. Liu, Y. Yarom, Q. Ge, G. Heiser, and R. B. Lee, “Last-level cache side-channel attacks are practical,” in 2015 IEEE Symposium on Security and Privacy. IEEE, 2015, pp. 605–622.
[11] S. Lu, “Micro-architectural attacks and countermeasures,” 2011.
[12] C. Maurice, N. Le Scouarnec, C. Neumann, O. Heen, and A. Francillon, “Reverse engineering Intel last-level cache complex addressing using performance counters,” in International Symposium on Recent Advances in Intrusion Detection. Springer, 2015, pp. 48–65.
[13] C. Maurice, M. Weber, M. Schwarz, L. Giner, D. Gruss, C. A. Boano, S. Mangard, and K. Römer, “Hello from the other side: SSH over robust cache covert channels in the cloud,” in NDSS, vol. 17, 2017, pp. 8–11.

[14] Y. Oren, V. P. Kemerlis, S. Sethumadhavan, and A. D. Keromytis, “The spy in the sandbox: Practical cache attacks in JavaScript and their implications,” in Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, 2015, pp. 1406–1418.
[15] D. A. Osvik, A. Shamir, and E. Tromer, “Cache attacks and countermeasures: the case of AES,” in Cryptographers’ Track at the RSA Conference. Springer, 2006, pp. 1–20.

[16] C. Percival, “Cache missing for fun and profit,” 2005.
[17] M. K. Qureshi, “CEASER: Mitigating conflict-based cache attacks via encrypted-address and remapping,” in 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). IEEE, 2018, pp. 775–787.

[18] ——, “New attacks and defense for encrypted-address cache,” in Proceedings of the 46th International Symposium on Computer Architecture, 2019, pp. 360–371.
[19] T. Ristenpart, E. Tromer, H. Shacham, and S. Savage, “Hey, you, get off of my cloud: exploring information leakage in third-party compute clouds,” in Proceedings of the 16th ACM Conference on Computer and Communications Security, 2009, pp. 199–212.
[20] M. Seaborn and T. Dullien, “Exploiting the DRAM rowhammer bug to gain kernel privileges,” Black Hat, vol. 15, p. 71, 2015.

[21] W. Song and P. Liu, “Dynamically finding minimal eviction sets can be quicker than you think for side-channel attacks against the LLC,” in 22nd International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2019), 2019, pp. 427–442.
[22] E. Tromer, D. A. Osvik, and A. Shamir, “Efficient cache attacks on AES, and countermeasures,” Journal of Cryptology, vol. 23, no. 1, pp. 37–71, 2010.

[23] P. Vila, “Tool for testing and finding minimal eviction sets,” https://github.com/cgvwzq/evsets, 2019.
[24] P. Vila, B. Köpf, and J. F. Morales, “Theory and practice of finding eviction sets,” in 2019 IEEE Symposium on Security and Privacy (SP). IEEE, 2019, pp. 39–54.
[25] D. Wallin and E. Berg, “Microbenchmark,” http://user.it.uu.se/~danw/darkII/stridelab.html, accessed: 2020-05-12.
[26] Z. Wu, Z. Xu, and H. Wang, “Whispers in the hyper-space: high-bandwidth and reliable covert channel attacks inside the cloud,” IEEE/ACM Transactions on Networking, vol. 23, no. 2, pp. 603–615, 2014.
[27] Y. Yarom and K. Falkner, “Flush+Reload: a high resolution, low noise, L3 cache side-channel attack,” in 23rd USENIX Security Symposium (USENIX Security 14), 2014, pp. 719–732.

[28] Y. Zhang, A. Juels, M. K. Reiter, and T. Ristenpart, “Cross-VM side channels and their use to extract private keys,” in Proceedings of the 2012 ACM Conference on Computer and Communications Security, 2012, pp. 305–316.
