Avocado: A Secure In-Memory Distributed Storage System
Maurice Bailleu¹, Dimitra Giantsidi¹, Vasilis Gavrielatos¹, Do Le Quoc²*, Vijay Nagarajan¹, Pramod Bhatotia¹,³
¹University of Edinburgh, ²Huawei Research, ³TU Munich
*Do Le Quoc performed this work at TU Dresden.

Abstract

We introduce Avocado, a secure in-memory distributed storage system that provides strong security, fault-tolerance, consistency (linearizability) and performance for untrusted cloud environments. Avocado achieves these properties based on TEEs, which, however, are primarily designed for securing limited physical memory (enclave) within a single-node system. Avocado overcomes this limitation by extending the trust of a secure single-node enclave to the distributed environment over an untrusted network, while ensuring that replicas are kept consistent and fault-tolerant in a malicious environment. To achieve these goals, we design and implement Avocado building on cross-layer contributions involving the network stack, the replication protocol, scalable trust establishment, and memory management. Avocado is practical: in comparison to BFT, Avocado provides confidentiality and fault tolerance with fewer replicas and is significantly faster, by 4.5× to 65× for YCSB read and write heavy workloads, respectively.

1 Introduction

In-memory distributed key-value stores (KVS) [8, 29–31, 40, 44, 45, 61, 66, 88, 106] have been widely adopted as the underlying storage system infrastructure in the cloud because (i) they support latency sensitive applications by keeping data in main memory, and (ii) they are able to accommodate large datasets beyond the memory limits of a single server by adopting a scale-out distributed design.

At the same time, the transition to the cloud has increased the risk of security violations in storage systems [77]. In untrusted environments, an attacker can compromise the security properties of the stored data and query operations. In fact, several studies [36, 37, 83] show that software bugs, configuration errors, and security vulnerabilities pose a serious threat to storage systems. Further, a malicious cloud operator or co-located tenant presents an additional attack vector [78, 79].

To address these security threats, hardware-assisted trusted execution environments (TEEs), such as Intel SGX [5], ARM Trustzone [12], RISC-V Keystone [53, 76], and AMD-SEV [10] provide an appealing way to build secure systems. In particular, TEEs provide a hardware-protected secure memory region whose residing code and data are isolated from any layers in the software stack including the OS/hypervisor. Given this promise, TEEs are now commercially offered by major cloud computing providers [23, 34, 60].

Although TEEs provide a promising building block for securing systems against a powerful adversary, they also present significant challenges while designing a replicated secure distributed storage system. The fundamental issue is that the TEEs are primarily designed to secure the limited in-memory state of a single-node system, and thus, the security properties of TEEs do not naturally extend to a distributed infrastructure. Therefore we ask the question: How can we leverage TEEs to design a high-performance, secure, and fault-tolerant in-memory distributed KVS for untrusted cloud environments?

In this work we introduce Avocado, a secure, distributed in-memory KVS based on Intel SGX [5] as the foundational TEE that achieves the following properties: (a) strong security, in particular confidentiality (unauthorized reads are prevented) and integrity (unauthorized changes to the data are detected); (b) fault tolerance: the service continues uninterrupted in the presence of faults; (c) consistency: strong consistency semantics for a replicated store (linearizability), while protecting against rollback and forking attacks; and (d) performance: achieving all of these without compromising performance.

To achieve the aforementioned properties, we need to address the following four design challenges pertaining to the network stack, the replication protocol, trust establishment, and memory management in TEEs.

Firstly, in-memory distributed KVSs are increasingly built on high-performance network stacks, where they bypass the kernel using direct I/O [4, 30, 46]. Unfortunately, the prominent I/O mechanism employed by TEE frameworks [13, 65, 72] is based on asynchronous system calls [85], which exhibit significant overheads [104]. On the other hand, the direct I/O mechanism is fundamentally incompatible with TEEs as the data stored within the protected memory of TEEs cannot be directly accessed via the untrusted DMA connection.

To address this challenge, we design a high-performance network stack for TEEs based on eRPC [46]: it supports the complete transport and session layers, while enabling direct I/O within the protected TEE domain. Our network stack outperforms asynchronous syscalls by 66% for iperf (§ 6.1).

Secondly, in-memory distributed KVSs rely on data replication for fault tolerance. To ensure replicas are consistent in the presence of faults and an adversary, a secure replication protocol is deployed. While conventional wisdom requires the employment of BFT protocols [20, 52], they are prohibitively expensive for practical systems [22].
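To make the replica cost of BFT concrete, the following minimal sketch compares the standard textbook replica bounds. It assumes 3f+1 replicas for a classic BFT protocol (such as PBFT) and 2f+1 replicas for a majority-quorum crash-fault-tolerant protocol when tolerating f faulty replicas. These bounds and all function names are illustrative assumptions and are not taken from the measurements in this paper.

```python
def bft_replicas(f: int) -> int:
    """Replicas needed by a classic BFT protocol to tolerate f Byzantine faults.

    Assumption: the textbook 3f + 1 bound (e.g. PBFT).
    """
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


def cft_replicas(f: int) -> int:
    """Replicas needed by a majority-quorum crash-fault-tolerant protocol.

    Assumption: the textbook 2f + 1 bound for tolerating f crash faults.
    """
    if f < 0:
        raise ValueError("f must be non-negative")
    return 2 * f + 1


def replicas_saved(f: int) -> int:
    """Difference between the two bounds for the same f."""
    return bft_replicas(f) - cft_replicas(f)


if __name__ == "__main__":
    # Print the deployment size under each assumed bound.
    for f in (1, 2, 3):
        print(f"f={f}: BFT={bft_replicas(f)}, CFT={cft_replicas(f)}, "
              f"saved={replicas_saved(f)}")
```

Under these assumed bounds the difference is always exactly f replicas, which is one reason a design that can treat Byzantine behavior as crash faults needs a smaller deployment.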
To overcome the limitation, we design a secure replication protocol, which builds on top of any high-performance non-Byzantine protocol [56, 98]. Our key insight is to leverage TEEs to preserve the integrity of protocol execution, which allows us to model Byzantine behavior as a normal crash fault. Our replication protocol offers linearizable reads and writes, and outperforms BFT [87] by a factor of 4.5× to 65×, while requiring f fewer replicas and providing stronger security properties (§ 6.2).

Thirdly, a secure distributed system requires a scalable attestation mechanism to establish trust between the servers and clients. Unfortunately, the remote attestation mechanism in TEEs is designed for establishing a root of trust for a single node [68] and it does not provide collective trust establishment across the multiple nodes of a distributed system. Moreover, the attestation itself is based on the Intel attestation service (IAS) [3, 11], which suffers from scalability and latency issues.

To address this, we design a configuration and attestation service (CAS) that ensures scalability and flexibility in a distributed environment. Further, it provides configuration management, and improved performance of 18.3× compared to Intel's IAS attestation (§ 6.3).

And lastly, an in-memory distributed KVS requires fast access to a large amount of main memory on each server for the single-node KVS. Unfortunately, TEEs provide limited secure physical memory and rely on a prohibitively expensive paging mechanism to access data beyond the physical limit.

To address this limitation, we design a novel single-node KVS based on a partitioned skip list data structure, which overcomes the memory limitations of TEEs, while supporting lock-free scalable concurrent updates. Our KVS provides a fast lookup speed: 1.5× to 9× faster than ShieldStore [48], a state-of-the-art secure KVS for single-node systems (§ 6.4).

Based on these four contributions, we build Avocado as an end-to-end system from the ground up, and evaluate it on a real hardware cluster using the YCSB [7, 24] workloads. Our evaluation shows that Avocado is scalable and performs similarly in read heavy and write heavy

2 Background

2.1 Trusted computing

Trusted Execution Environments (TEEs) [5, 10, 12, 27, 76] are tamper-resistant processing environments that guarantee the authenticity, the integrity and the confidentiality of their executing code, data and runtime states, e.g. CPU registers, memory and others. Their content remains resistant against all software attacks even from privileged code (OS, hypervisor).

SGX, Intel's version of a TEE, offers the abstraction of an isolated memory called enclave. Enclave pages reside in the enclave page cache (EPC), a specific memory region (up to 128 MiB for v1 and 256 MiB for v2) which is protected by an on-chip memory encryption engine (MEE). To support applications with a larger memory footprint, SGX implements a paging mechanism. However, the EPC paging mechanism incurs high overheads [16, 65].

This isolation prohibits SGX applications from executing outside-of-the-enclave code directly. Thus, enclave threads need to exit the trusted environment and further copy all associated data out of the enclave since kernel code cannot access it. Afterwards, threads have to enter the enclave again. We refer to this as a world switch.

Our Avocado project leverages the advancements in shielded execution frameworks; in particular, we use Scone [13] to build a distributed storage system.

2.2 High-performance networking with eRPC

Traditionally the network stack and the I/O are handled inside the OS kernel where conventional applications perform system calls to send/receive messages. However, context switches due to system calls present a bottleneck and might sacrifice performance [2, 38, 43, 86, 101]. Consequently, approaches like RDMA and DPDK [4] are widely favored in high performance networking because they (i) can map a device into the user's address space, and (ii) replace the costly context switches with a polling-based approach.

In our work, we build our network stack based on eRPC [46], a state-of-the-art general-purpose and asynchronous remote procedure call (RPC) library for high-speed networking for lossy Ethernet or lossless fabrics. eRPC uses a polling-based network I/O along with userspace drivers, eliminating