Kurma: Efficient and Secure Multi-Cloud Storage Gateways For

Kurma: Efficient and Secure Multi-Cloud Storage Gateways for Network-Attached Storage

A Dissertation Presented by Ming Chen to The Graduate School in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Computer Science

Stony Brook University
Technical Report FSL-17-01
April 2017

Abstract

Cloud computing is becoming increasingly popular as utility computing is being gradually realized. Still, many organizations cannot enjoy the high accessibility, availability, flexibility, scalability, and cost-effectiveness of cloud systems because of security concerns and legacy infrastructure. A promising solution to this problem is the hybrid cloud model, which combines public clouds with private clouds and Network-Attached Storage (NAS). Many researchers tried to secure and optimize public clouds, but few studied the unique security and performance problems of such hybrid solutions.

This thesis explores hybrid cloud storage solutions that have the advantages of both public and private clouds. We focus on preserving the strong security and good performance of on-premises storage, while using public clouds for convenience, data availability, and economic data sharing. We propose Kurma, an efficient and secure gateway (middleware) system that bridges traditional NAS and cloud storage. Kurma allows legacy NAS-based programs to seamlessly and securely access cloud storage. Kurma optimizes performance by supporting and improving on the latest NFSv4.1 protocol, which contains new performance-enhancing features including compound procedures and delegations. Kurma also caches hot data in order to serve popular I/O requests from the faster, on-premises network. On-premises Kurma gateways act as sources of trust, and overcome the security concerns caused by the opaque and multi-tenant nature of cloud storage. Kurma protects data from untrusted clouds with end-to-end integrity and confidentiality, and efficiently detects replay attacks while allowing data sharing among geo-distributed gateways. Kurma uses multiple clouds as backends for higher availability, and splits data among clouds using secret sharing for higher confidentiality. Kurma can also efficiently detect stale data caused by replay attacks or due to the eventual consistency nature of clouds.

We have thoroughly benchmarked the in-kernel NFSv4.1 implementation and improved its performance by up to 11×. Taking advantage of NFSv4.1 compound procedures, we have designed and implemented a vectorized file-system API and library (called vNFS) that can further boost NFS performance by up to two orders of magnitude. Assuming a public cloud supporting NFSv4, we have designed and implemented an early Kurma prototype (called SeMiNAS) with a performance penalty of less than 18%, while still protecting integrity and confidentiality of files. Based on SeMiNAS, we developed Kurma which uses real public clouds including AWS S3, Azure Blob Store, Google Cloud Storage, and Rackspace Cloud Files. Kurma reliably stores files in multiple clouds with replication, erasure coding, or secret sharing to tolerate cloud failures. To share files among clients in geo-distributed offices, Kurma maintains a unified file-system namespace across geo-distributed gateways. Kurma keeps file-system metadata on-premises and encrypts data blocks before writing them to clouds. In spite of the eventual consistency of clouds, Kurma ensures data freshness using an efficient scheme that combines versioning and timestamping. Our evaluation showed that Kurma’s performance is around 52–91% that of a local NFS server while providing geo-replication, confidentiality, integrity, and high availability.

Our thesis is that cloud storage can be made efficient and highly secure for traditional NAS-based systems utilizing hybrid cloud solutions such as Kurma.
Contents

List of Figures
List of Tables
Acknowledgments

1 Introduction

2 Benchmarking Network File System
  2.1 NFS Introduction
  2.2 Benchmarking Methodology
    2.2.1 Experimental Setup
    2.2.2 Benchmarks and Workloads
  2.3 Benchmarking Data-Intensive Workloads
    2.3.1 Random Read
    2.3.2 Sequential Read
    2.3.3 Random Write
    2.3.4 Sequential Write
  2.4 Benchmarking Metadata-Intensive Workloads
    2.4.1 Read Small Files
    2.4.2 File Creation
    2.4.3 Directory Listing
  2.5 Benchmarking NFSv4 Delegations
    2.5.1 Granting a Delegation
    2.5.2 Delegation Performance: Locked Reads
    2.5.3 Delegation Recall Impact
  2.6 Benchmarking Macro-Workloads
    2.6.1 The File Server Workload
    2.6.2 The Web Server Workload
    2.6.3 The Mail Server Workload
  2.7 Related Work of NFS Performance Benchmarking
  2.8 Benchmarking Conclusions
    2.8.1 Limitations

3 vNFS: Maximizing NFS Performance with Compounds and Vectorized I/O
  3.1 vNFS Introduction and Background
  3.2 vNFS Design Overview
    3.2.1 Design Goals
    3.2.2 Design Choices
      3.2.2.1 Overt vs. covert coalescing
      3.2.2.2 Vectorized vs. start/end-based API
      3.2.2.3 User-space vs. in-kernel implementation
    3.2.3 Architecture
  3.3 vNFS API
    3.3.1 vread/vwrite
    3.3.2 vopen/vclose
    3.3.3 vgetattrs/vsetattrs
    3.3.4 vsscopy/vcopy
    3.3.5 vmkdir
    3.3.6 vlistdir
    3.3.7 vsymlink/vreadlink/vhardlink
    3.3.8 vremove
    3.3.9 vrename
  3.4 vNFS Implementation
    3.4.1 RPC size limit
    3.4.2 Protocol extensions
    3.4.3 Path compression
    3.4.4 Client-side caching
  3.5 vNFS Evaluation
    3.5.1 Experimental Testbed Setup
    3.5.2 Micro-workloads
      3.5.2.1 Small vs. big files
      3.5.2.2 Compounding degree
      3.5.2.3 Caching
    3.5.3 Macro-workloads
      3.5.3.1 GNU Coreutils
      3.5.3.2 tar
      3.5.3.3 Filebench
      3.5.3.4 HTTP/2 server
  3.6 Related Work of vNFS
    3.6.1 Improving NFS performance
    3.6.2 I/O compounding
    3.6.3 Vectorized APIs
  3.7 vNFS Conclusions

4 SeMiNAS: A Secure Middleware for Cloud-Backed Network-Attached Storage
  4.1 SeMiNAS Introduction
  4.2 SeMiNAS Background and Motivation
    4.2.1 A revisit of cryptographic file systems.
    4.2.2 An NFS vs. a key-value object back-end.
  4.3 SeMiNAS Design
    4.3.1 Threat Model
    4.3.2 Design Goals
    4.3.3 Architecture
    4.3.4 Integrity and Confidentiality
      4.3.4.1 Key Management
      4.3.4.2 File-System Namespace Protection
      4.3.4.3 Security Metadata Management
    4.3.5 NFSv4-Based Performance Optimizations
      4.3.5.1 NFS Data-Integrity eXtension
      4.3.5.2 Compound Procedures
    4.3.6 Caching
  4.4 SeMiNAS Implementation
    4.4.1 NFS-Ganesha
    4.4.2 Authenticated Encryption
    4.4.3 Caching
    4.4.4 Lines of Code
  4.5 SeMiNAS Evaluation
    4.5.1 Experimental Setup
    4.5.2 Micro-Workloads
      4.5.2.1 Read-Write Ratio Workload
      4.5.2.2 File-Creation Workload
      4.5.2.3 File-Deletion Workload
    4.5.3 Macro-Workloads
      4.5.3.1 Network File-System Server Workload
      4.5.3.2 Web-Proxy Workload
      4.5.3.3 Mail-Server Workload
  4.6 Related Work of SeMiNAS
    4.6.1 Secure Distributed Storage Systems
    4.6.2 Cloud NAS
    4.6.3 Cloud storage gateways
  4.7 SeMiNAS Conclusions
    4.7.1 Limitations

5 Kurma: Multi-Cloud Secure Gateways
  5.1 Kurma Introduction
  5.2 Kurma Background
  ...
