A Novel Scheduling Framework Leveraging Hardware Cache Partitioning for Cache-Side-Channel Elimination in Clouds
Total Page:16
File Type:pdf, Size:1020Kb
A Novel Scheduling Framework Leveraging Hardware Cache Partitioning for Cache-Side-Channel Elimination in Clouds Read Sprabery Konstantin Evchenko Abhilash Raj University of Illinois, University of Illinois, Oregon State University Urbana-Champaign Urbana-Champaign [email protected] [email protected] [email protected] Rakesh B. Bobba Sibin Mohan Roy H. Campbell Oregon State University University of Illinois, University of Illinois, [email protected] Urbana-Champaign Urbana-Champaign [email protected] [email protected] ABSTRACT on the Last-Level-Cache (LLC) that is shared across multiple cores While there exist many isolation mechanisms that are available [18, 23, 43, 44, 48] – these make defenses much harder. to cloud service providers, including virtual machines, containers, Many defenses against cache-side-channel attacks in cloud envi- etc. , the problem of side-channel increases in importance as a re- ronments have also been proposed (e.g., [6, 19, 21, 22, 24, 25, 28, 31, maining security vulnerability – particularly in the presence of 32, 37–40, 45, 51]). However, the proposed solutions suer from a shared caches and multicore processors. In this paper we present variety of drawbacks: (a) some are probabilistic [25, 37, 51]; (b) oth- a hardware-software mechanism that improves the isolation of ers do not protect applications when SMT is enabled [51]; (c) some cloud processes in the presence of shared caches on multicore require developers to re-write applications [19, 22, 31], (d) while chips. Combining the Intel CAT architecture that enables cache others require hardware changes [39, 40] impacting deployability; partitioning on the y with novel scheduling techniques and state (e) some depend on violating x86 semantics by modifying the reso- cleansing mechanisms, we enable cache-side-channel free comput- lution, accuracy or availability of timing instructions [21, 24, 38] ing for Linux-based containers and virtual machines, in particular, and consequently require signicant changes to the applications. those managed by KVM. We do a preliminary evaluation of our Compiler-based [6] and page coloring based cache-partitioning system using a CPU bound workload. Our system allows Simultane- [28, 32, 45] approaches have high overheads making them imprac- ous Multithreading (SMT) to remain enabled and does not require tical. application level changes. Defenses against cache-side-channel attacks that eliminate the attacks, rather than frustrate techniques employed by the attacker KEYWORDS are desirable. Shannon’s noisy-channel coding theorem states that information can be transmitted, regardless of the amount of noise on cache, side-channels, scheduling, hardware partitioning, cloud com- the channel. Probabilistic defenses (e.g., [25, 37, 51]) may decrease puting the bit-rate of attacks, but cannot fully eliminate them. In addition to a guaranteed defense, the solution must not severely impact (i) the performance of the applications or (ii) utilization of the machine. 1 INTRODUCTION In other words, defenses must minimize the performance impact of Cache-based side-channel attacks (e.g., [13, 26, 27]) are a threat enforcing hard isolation to remain practical. For instance, disabling to computing environments where a diverse set of users share hyper-threading [51], which many existing solutions do, can have hardware resources. Such attacks take advantage of observable a signicant performance impact on the applications. To the best side-eects (on hardware) due to the execution of software pro- of our knowledge, every cloud provider enables hyperthreading. grams. A number of these attacks focus on dierences in timing Furthermore, the solutions must be easy to adopt. History has shown while accessing shared processor caches. Recently, researchers have that solutions requiring additional development time (or signicant adapted these cache-based side-channel attacks to cloud computing changes to existing applications) have a harder time being adopted environments, especially Infrastructure-as-a-Service (IaaS) clouds (as shown in the Return Oriented Programming (ROP) community (e.g., [18, 23, 30, 47]), and showed that secrets and sensitive infor- [34]). Thus, solutions that require developers make application level mation can be extracted across co-located virtual machines (VMs). changes [19, 22] may be challenging to apply to existing workloads. Container frameworks such as Docker[7] that virtualize the oper- In this paper, we present a framework designed to eliminate ating system are even more susceptible to such attacks since they side-channel attacks in cloud computing systems that use multicore share the underlying operating system (e.g., [48]). processors with a shared LLC. The proposed framework uses a Initial cache-based side-channel attacks focused on schedulers, combination of Commodity-o-the-Shelf (COTS) hardware features either at the OS level or Virtual Machine Monitor (VMM) layer [13, along with novel scheduling techniques to eliminate cache-based side- 26, 47]. Other approaches also focused on resource sharing – e.g., channel attacks. In particular, we use Cache Allocation Technology using the processor core (for instance SMT on a single core) to (CAT) [17] that allows us to partition last-level caches at runtime. access shared L1 and L2 caches [27]. Recent attacks have focused CAT, coupled with state cleansing between context switches, and 1 selective sharing of common libraries removes any possibility of Broadwell and Skylake micro-architectures that succeeded Haswell cache-timing-based side-channel attacks between dierent security micro-architecture. domains. We implement a novel scheduling method, as an extension While our implementation and evaluation used an Intel Haswell to the commonly-used Completely-Fair-Scheduler (CFS) in Linux, to processor with Intel CAT technology the proposed approach is reduce the overheads inherent due to any such cleansing operation. generally applicable to any multi-core processor with a hierarchical Our solution provides a transparent1 way to eliminate side-channel cache, shared last-level cache, and a mechanism to partition the attacks, while still working with hyperthreading enabled systems. It shared last-level cache. works with containers, kernel-based virtual machines (KVMs) and any other schedulable entity that relies on the OS scheduler2. In summary, the proposed framework: 2.2 Attack Model While there are many potential threats to security in public cloud C1 Eliminates cache-based side-channel attacks for schedula- environments (e.g., [5, 35]) the focus of this paper is on cache-based ble units (e.g., containers, KVMs) side-channel attacks (e.g., [46, 48]). At a high-level, in cache-based C2 Requires no application level changes side-channel attacks an attacker deduces information about the C3 Allows providers to exploit hyperthreading, and victim’s computations by observing the victims cache usage. The C4 Imposes modest performance overheads information deduced can range from high-level information such 2 SYSTEM AND ATTACK MODEL as which tenant you are co-located with (e.g., [30]) to more ne grained details such as cryptographic keys or items in a shopping cart (e.g., [18, 48]). In such attacks an attacker process rst needs to co-locate (i.e., get assigned to the same physical server) itself with the target or vic- tim in the infrastructure. Methods to both achieve co-residency [30] and to thwart co-residency (e.g., [1, 14, 15, 25, 49]) have been discussed in the literature. In this work we assume that an at- tacker is able to co-locate with the victim and focus on thwart- ing side-channel attacks themselves. Our framework complements approaches that thwart co-residency. Figure 1: Hierarchical Cache in Modern Processors There are primarily two techniques to exploit cache based side- channels discussed in the literature, namely, ‘Prime+Probe” attacks 2.1 System Model [27] and ‘Flush-Reload” attacks [13]. It is important to note that while these techniques are popular, they are only possible because We consider public Platform-as-a-Service (PaaS) or Infrastructure- of measurable interference. Our solution addresses these specic as-a-cloud (IaaS) cloud environments. Such environments allow techniques and other techniques that leverage the cache as a side- for co-residency of multiple computational appliances (e.g., con- channel. tainers, VMs) belonging to potentially dierent security domains. (a) “Prime+Probe” attacks [27]: At a high level, the attack We assume that the cloud computing infrastructure is built using ‘primes’ the entire cache with random data. After waiting for a cer- commodity-o-the-shelf (COTS) components. In particular, we as- tain time (so that the victim executes), the adversary then ‘probes’ sume that the servers have multi-core processors with multiple the cache to see which of its own lines have been evicted – thus levels of caches, some of which are shared (See Figure 1). We also providing information on which lines were used by the victim. This assume that the servers have a runtime mechanism to partition the information combined with the knowledge of cache access patterns last-level shared cache. exhibited by victim programs (e.g., cryptographic algorithms) can For this work, we evaluated our approach using an Intel Haswell be used to extract information (e.g., cryptographic key being used) series processor that has a three-level cache hierarchy: private about the victim. level 1 (L1) and level 2 (L2) caches for each core (64KB