University POLITEHNICA of Bucharest
Faculty of Automatic Control and Computer Science

DOCTORAL THESIS

Optimizări și soluții de stocare distribuită

Distributed Storage Solutions and Optimizations

Author: Eng. Sorin-Andrei Piștirică

DOCTORAL COMMITTEE

President: Prof. Dr. Ing. Adina Magda Florea – Universitatea POLITEHNICA din București
Scientific advisor: Prof. Dr. Ing. Florica Moldoveanu – Universitatea POLITEHNICA din București
Reviewer: Prof. Dr. Ing. Nicolae Țăpuș – Universitatea POLITEHNICA din București
Reviewer: Prof. PhD. Eng. Alexandru Soceanu – Munich University of Applied Sciences
Reviewer: Prof. Dr. Ing. Ștefan Pentiuc – Universitatea „Ștefan cel Mare” din Suceava

București, 2015

Copyright © Sorin Andrei Pistirica, București 2015

TABLE OF CONTENTS

1 INTRODUCTION

1.1 SCIENTIFIC PUBLICATIONS IN CONNECTION WITH THIS THESIS
1.2 THESIS OUTLINE

2 STORAGE SYSTEMS: STATE-OF-THE-ART

2.1 STORAGES CLASSIFICATION
2.2 REQUIREMENTS FOR DISTRIBUTED STORAGES
2.3 STORAGE SYSTEMS SCALING METHODS
2.4 SHARING SEMANTICS
2.5 LOW LEVEL DATA ORGANIZATION
2.5.1 Data organization at device level
2.5.2 Networked storage data organization
2.5.3 Generalization of data organization
2.5.4 Storage units
2.5.5 Trends towards object based storage systems
2.6 DISTRIBUTED STORAGES ARCHITECTURES
2.6.1 Andrew File System
2.6.2 Google File System
2.6.3 General Parallel File System
2.6.4 Lustre File System
2.6.5 Ceph File System
2.6.5.1 Data storage: RADOS
2.6.5.2 Namespace management
2.6.5.3 Decentralized data distribution
2.6.5.4 Security in Ceph
2.6.6 Comparison
2.7 NETWORK TECHNOLOGIES FOR DISTRIBUTED STORAGES
2.7.1 The topology
2.7.2 Protocol stack
2.7.3 Real world considerations
2.8 INTERFACING WITH STORAGE SYSTEMS (CASE STUDY: CLOUDS)

3 CASE STUDY: INFRASTRUCTURE FOR ELEARNING ENVIRONMENTS

3.1 MOTIVATION
3.2 CLOUDS STORAGE SYSTEMS AND ENHANCED NETWORKS FOR ELEARNING ENVIRONMENTS
3.2.1 Data distribution for eLearning environments
3.2.2 Clouds network
3.3 CLOUDS AND ELEARNING ENVIRONMENT: PROPOSED ARCHITECTURE
3.3.1 System administration and eLearning components
3.3.2 The storage system
3.3.3 The network profile
3.3.4 Real world considerations

4 OPTIMIZATIONS OF DISTRIBUTED STORAGE SYSTEMS BASED ON HARDWARE ACCELERATORS

4.1 MOTIVATION
4.2 INTEGRATED PACKET PROCESSING ENGINES
4.3 HARDWARE PACKET PROCESSING
4.4 HARDWARE ACCELERATED CLUSTER NODES
4.5 CASE STUDY: HARDWARE ACCELERATED CEPH WITH QORIQ™
4.5.1 QorIQ™ packet processors
4.5.2 Accelerated Ceph's nodes
4.5.3 Accelerated RADOS based on queue prioritization
4.5.4 Micro-benchmark results

5 NETWORKING CHARACTERISTICS IN CONVERGED INFRASTRUCTURE

5.1 FIBRE CHANNEL
5.2 DATA CENTER BRIDGING
5.2.1 Priority Flow Control
5.2.2 Enhanced Transmission Selection
5.2.3 Data Center Bridging eXchange
5.2.4 Quantized Congestion Notification
5.3 QCN WEAKNESSES AND IMPROVEMENTS
5.3.1 The Fair QCN Solution

6 DYNAMIC LOAD BALANCING ALGORITHM BASED ON QCN NETWORKS

6.1 MOTIVATION
6.2 ALTERNATIVE SOLUTIONS
6.3 QCN WEIGHTED FLOW QUEUE RANKING
6.3.1 Defining the topology
6.3.2 QCN-WFQR indicatives
6.3.2.1 Flows
6.3.2.2 Ranking
6.3.2.3 Flow shares
6.3.2.4 Weighting
6.3.3 QCN-WFQR algorithm based on standard QCN
6.3.4 Algorithm analysis
6.3.4.1 Flow shares analysis
6.3.4.2 Congestion indicatives analysis
6.3.4.3 QCN-WFQR simulator
6.3.4.4 QCN-WFQR simulations analysis
6.3.5 QCN-WFQR applications
6.3.5.1 Elections in distributed systems with homogenous nodes
6.3.5.2 Dynamically balanced flows in computer networks

7 CONCLUSIONS AND FUTURE WORK

7.1 THE ORIGINAL CONTRIBUTIONS OF THE THESIS
7.2 FUTURE WORK

8 AUTHOR'S PUBLICATIONS

8.1 PATENTS
8.2 ARTICLES

9 BIBLIOGRAPHY

10 ACRONYMS AND DEFINITIONS

LIST OF FIGURES

Figure 1: Storage systems classification based on four basic characteristics
Figure 2: SNIA file system taxonomy
Figure 3: Storage systems scale-out diagram
Figure 4: Mass storage solution (RAID based)
Figure 5: Device: structure based data organization
Figure 6: Device: log based data organization
Figure 7: Device: tree based data organization (BTRFS leaves)
Figure 8: Hybrid storage system: SAN / NAS
Figure 9: Generalized data layout
Figure 10: Andrew File System architecture
Figure 11: Google File System architecture
Figure 12: General Parallel File System architecture
Figure 13: Lustre File System architecture
Figure 14: Lustre file mapping
Figure 15: Ceph file system architecture
Figure 16: Decentralized data distribution (layout)
Figure 17: Network topologies
Figure 18: Cloud components
Figure 19: OpenStack storage integration
Figure 20: eLearning / cloud – architecture proposal
Figure 21: Block diagram of a generic packet processing SoC
Figure 22: Typical stages for receiving packets
Figure 23: Hardware accelerated models
Figure 24: Ingress/Egress flow chart
Figure 25: QCN sampling probability
Figure 26: Quantized Congestion Notification (QCN) – high level view
Figure 27: Reaction point's rate limiter
Figure 28: Quantized Congestion Notification – Congestion Notification Domain defense
Figure 29: Quantized Congestion Notification – Congestion profiling
Figure 30: Port's queue ranking
Figure 31: Flow share synthetic tests
Figure 32: NS3-QCN-WFQR high-level view
Figure 33: QCN-WFQR (NS3): simulation results for variation of flow related indicatives
Figure 34: QCN-WFQR (NS3): simulation results for variation of queue related indicatives
Figure 35: Elections in distributed systems with homogenous nodes

LIST OF TABLES

Table 1: File sharing semantics taxonomy
Table 2: Block devices vs. Object Storage Devices
Table 3: Google File System semantics
Table 4: RADOS main messages
Table 5: Storage systems characteristics mapping
Table 6: Network topologies scaling properties
Table 7: Protocol stacks for storages
Table 8: Accelerated MDS micro-benchmark
Table 9: Accelerated RADOS micro-benchmark I
Table 10: Accelerated RADOS micro-benchmark II
Table 11: Congestion System Indicatives
Table 12: QCN-WFQR parameters

Abstract

Distributed storage systems are among the most I/O intensive systems; they are composed of multiple modules, each placing different demands on the networking subsystem, such as low latency or high throughput. This thesis begins with the state of the art of the technology, analyzed from different perspectives including scaling methods, architectures, topologies and protocols, which underlie a proposal for eLearning infrastructures. Packet processing at multi-gigabit rates in I/O intensive systems puts great pressure especially on the receiver side, while on the sender side there are multiple techniques to solve any bottleneck that occurs. A method of addressing the receiver's weaknesses, and even increasing its performance, is the use of integrated hardware systems that combine dedicated packet processors with general purpose cores. The performance of each node is therefore increased by combining software design with multicore capabilities and with the features of hardware engines for the classification and distribution of data flows. Combining applications that have specific traffic requirements with general purpose traffic often results in inefficiency. In this context, Ethernet has become the primary network protocol due to its undeniable advantages, such as low cost, high speeds and ease of management. Because Ethernet is a best effort protocol, the IEEE Data Center Bridging Task Group developed a series of Layer 2 enhancements, including Quantized Congestion Notification (QCN), which enable a lossless environment. QCN provides congestion control, but it does not make the congestion profile fair across the entire network. The proposed solution uses, cooperatively or automatically, several congestion indicatives computed by the Quantized Congestion Notification – Weighted Flow Queue Ranking (QCN-WFQR) algorithm to balance the traffic load and increase system responsiveness.

Acknowledgements

I would like to dedicate my thesis to my wonderful parents, who made all my accomplishments possible, for their patience and trust during all these years. I would also like to thank my advisor, Prof. Florica Moldoveanu, for her guidance and support during the past several years and for the opportunity to find out what research is. I gratefully acknowledge my friends Dr. Victor Asavei, who convinced me to start this journey, and Dr. Mihai Claudiu Caraman, for his excellent suggestions on my work. Last but not least, I am grateful for the opportunity to innovate within IBM Corporation and for the professional development within Freescale Semiconductor Inc.


1 INTRODUCTION

During the last ten years, internet traffic volume grew more than 300 times and the pressure on the technologies used for network infrastructure increased significantly. Besides the internet, there are systems such as clouds and data centers, where the pressure exerted by I/O intensive systems, like storage systems, shaped the evolution of network infrastructure towards lower capital costs without performance penalties, and pushed software design towards distribution.

Considering the above, I studied various features of storage systems and propose an infrastructure for eLearning technologies. Since storage systems are very I/O intensive, I propose a solution for increasing their performance based on dedicated hardware for packet processing. Furthermore, considering the trend of converged networks, which emerged from the need to ease management and reduce capital costs, I propose several improvements to the Quantized Congestion Notification protocol, as well as an algorithm for the dynamic balancing of traffic flows: Quantized Congestion Notification – Weighted Flow Queue Ranking.

In the context of distributed storage systems, the scaling-out methods based on file system layers are very important. Each layer requires a different type of network and thus generates a different type of distributed storage system. It is also important to see how these systems are characterized by classifications and what the main deficiencies of the existing implementations are. Therefore, I propose a simple, yet comprehensive classification based on four main characteristics: locality, sharing, distribution level and semantics of concurrent access. Low level data organization can be done in several ways, but all follow the same constituent elements, so I propose a generalized data layout suitable regardless of the distribution status of the system. Within the proposed layout, data is distributed in different units (i.e. files, objects, blocks or data segments), analyzed further, with an increasing trend towards objects due to the autonomous character of object devices. Afterwards, I studied five different architectures of distributed storage systems (Andrew File System, Google File System, General Parallel File System, Lustre File System and Ceph), with more focus on Ceph's particularities, used later as a case study and for performance measurements of the proposed optimizations. I then studied the way everything is linked together; using graph theory, I present several key characteristics that influence properties such as throughput, capital cost, fault tolerance or high availability, followed by studies of protocol stacks, used to identify alternatives suited for converged networks.

I propose several enhancements to solve issues related to system overheads caused by processing flows at multi-gigabit rates. The enhancements are based on integrated hardware systems that combine dedicated packet processors with general purpose cores. Besides system overhead issues, another challenge is enhancing the network hardware performance to make optimal use of the available network bandwidth. These problems are overcome by combining multi-core processors with parallel processing and multi-function network hardware capabilities. Thus, I propose a generic hardware packet processor, emphasizing its components along with the processing stages for each frame. I also designed two alternatives for the coexistence of applications with different traffic demands: Per Core Cluster Node and SMP Cluster Node. Using Ceph as a case study, I propose two different optimizations: acceleration methods for each node using the models listed above (i.e. A-MDS, A-MON and A-OSD) and a method of decreasing the latency of sensitive flows based on queue weighting (A-RADOS). The proposed solutions are supported by several micro-benchmark results based on the P2041 QorIQ™ integrated packet processor.

Furthermore, in the context of network convergence in data centers, I/O protocols (such as SCSI) do not have contention or retransmission support and require a lossless transmission environment, such as Fibre Channel – a high speed, low latency and lossless network by design. Ethernet is a best effort communication environment by design and, together with the IP protocol, it provides an end-to-end network for reliable transport protocols, such as TCP. In the absence of reliable protocols, Ethernet has been enriched with a set of protocols which enable a lossless medium: Priority Flow Control (PFC), Enhanced Transmission Selection (ETS), Data Center Bridging Capabilities eXchange (DCBx) and Quantized Congestion Notification (QCN). Besides the main purpose of these enhancements, there are other use cases, for example "TCP Incast", that can be improved using QCN. Basically, QCN provides congestion information to the source to avoid packet loss, but it does not solve congestion fairness within the entire network. For this, I propose QCN Weighted Flow Queue Ranking (QCN-WFQR), an algorithm based on Quantized Congestion Notification for dynamic workload balancing. The algorithm can be used in traditional computer networks or in the newer Software Defined Networks. It computes a series of congestion indicatives relative to each flow at each congested point, and a series of system-wide congestion indicatives. These indicatives can be used cooperatively (by all elements in the cluster) or automatically (by a system profiler or an SDN controller) to increase the overall system performance. For exemplification, I propose two different applications of these congestion indicatives. The former is a method of choosing replicas in distributed and parallel file systems so as to achieve a less congested network. The latter is a method of distributing traffic workload in the network by migrating already established flows to alternate, less congested paths, thus reducing the need to slow down the traffic at the source.


1.1 Scientific publications in connection with this thesis

Patents

[App. #20150023172] Sorin A. Pistirica, Dan A. Calavrezo, Casimer M. DeCusatis, Keshav G. Kamble, "Congestion Profiling of Computer Network Devices", Patent Pending, USPTO: http://patents.justia.com/patent/20150023172, Jul 16, 2013

[Patent #8891376] Sorin A. Pistirica, Dan A. Calavrezo, Keshav G. Kamble, Mihail-Liviu Manolachi, "Quantized Congestion Notification – defense mode choice extension for the alternate priority of congestion points", USPTO: http://patents.justia.com/patent/8891376, Oct 07, 2013

Articles

[PIST, 2013] Pistirica Sorin Andrei, Caraman Mihai Claudiu, Moldoveanu Florica, Moldoveanu Alin, Asavei Victor, "Hardware acceleration in CEPH Distributed File System", ISPDC: IEEE 12th International Symposium on Parallel and Distributed Computing, Bucharest, June 2013, pp. 209-215, IEEE Indexed

[PIST, 2014/1] Pistirica Sorin Andrei, Victor Asavei, Horia Geanta, Florica Moldoveanu, Alin Moldoveanu, Catalin Negru, Mariana Mocanu, "Evolution Towards Distributed Storage in a Nutshell", HPCC: The 16th IEEE International Conference on High Performance Computing and Communications, August 2014, Paris, pp. 1267-1274, IEEE Indexed

[PIST, 2014/2] Pistirica Sorin Andrei, Asavei Victor, Egner Alexandru, Poncea Ovidiu Mihai, "Impact of Distributed File Systems and Computer Network Technologies in eLearning environments", eLSE: Proceedings of the 10th International Scientific Conference "eLearning and Software for Education", Bucharest, April 2014, Volume 1, pp. 85-92, ISI Indexed

[PIST, 2015] Pistirica Sorin Andrei, Poncea Ovidiu, Caraman Mihai, "QCN based dynamically load balancing: QCN Weighted Flow Queue Ranking", CSCS: The 20th International Conference on Control Systems and Computer Science, Bucharest, May 2015, Volume 1, pp. 197-205, ISI Indexed


1.2 Thesis outline

- Chapter 1 – the introduction;
- Chapter 2 – provides an overview of the state of the art of storage systems, covering several topics including scaling methods, taxonomies, data organization, system architecture and networking;
- Chapter 3 – presents a proposal for an infrastructure designed for eLearning systems, as a case study;
- Chapter 4 – presents several optimizations proposed for nodes in distributed systems, based on hardware packet processing. A case study based on Ceph and QorIQ™ processors is also presented;
- Chapter 5 – deals with network convergence in data centers, emphasizing different weaknesses of the QCN protocol. A proposal is presented for preserving QoS policies from unwanted traffic injections through edge ports in the case of automatic QCN configuration;
- Chapter 6 – presents a novel algorithm for dynamic load balancing in congestion aware networks (QCN-WFQR), which proposes two sets of congestion indicatives used to balance the traffic load, achieve a more responsive system and increase the overall performance;
- Chapter 7 – concludes my contributions presented in this thesis.


Figure 1: Storage systems classification based on four basic characteristics

Figure 2: SNIA file system taxonomy

2 STORAGE SYSTEMS: STATE-OF-THE-ART

In general, any storage system comprises a set of devices and a collection of software modules for managing data units, such as blocks, objects or files. Distributed storages are special cases, where the storage elements are distributed over a set of nodes within a computer network. There are cases where only the storage devices are distributed, but to avoid single points of failure, or to satisfy different requirements (section 2.2), the software modules are distributed as well.

2.1 Storages classification

Storage systems classifications are very important, mostly because they capture the intrinsic characteristics of these systems and point out the deficiencies of existing implementations. Therefore, I propose a classification of storage systems based on four main characteristics (Figure 1) [PIST, 2014/1], as follows:

- Locality: direct attached (i.e. systems that have no network devices involved at any level) and networked (i.e. systems that contain network devices at least at one level);


- Sharing capability: private (i.e. unshared data – dedicated to only one tenant) or shared between multiple tenants;
- Distribution level: refers to the data unit used for distribution: containers (i.e. groups of files bound by the same organizational relationship, such as trees), files, chunks or segments, objects and blocks;
- Semantics: refers to the sharing semantics of concurrent access (section 2.4).

The main purpose of this classification is to include any storage system regardless of its data organization or architecture. The Storage Networking Industry Association (SNIA) proposes a different taxonomy [8] that splits file systems into three major categories: local, shared and network (Figure 2). Local file systems are co-located on the same machine with the client/server, shared file systems use blocks shared among clients, and network file systems use files instead of blocks, also shared among clients. The two classifications are not disjoint, but in my opinion SNIA's taxonomy is a mixture of architectural considerations and intrinsic storage characteristics.
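To make the four axes concrete, the sketch below expresses the proposed classification as a small data structure; the enum and class names are illustrative choices for this example, not part of the thesis itself.

# A minimal sketch of the four-axis classification (illustrative names).
from dataclasses import dataclass
from enum import Enum

class Locality(Enum):
    DIRECT_ATTACHED = "direct attached"   # no network devices at any level
    NETWORKED = "networked"               # network devices at one level at least

class Sharing(Enum):
    PRIVATE = "private"                   # dedicated to a single tenant
    SHARED = "shared"                     # shared between multiple tenants

class DistributionLevel(Enum):
    CONTAINER = "container"
    FILE = "file"
    CHUNK = "chunk/segment"
    OBJECT = "object"
    BLOCK = "block"

class Semantics(Enum):
    POSIX = "POSIX"
    SESSION = "session"
    IMMUTABLE = "immutable"
    TRANSACTIONAL = "transactional"

@dataclass
class StorageSystem:
    name: str
    locality: Locality
    sharing: Sharing
    level: DistributionLevel
    semantics: Semantics

# Example: Ceph, as later classified in Table 5.
ceph = StorageSystem("Ceph", Locality.NETWORKED, Sharing.SHARED,
                     DistributionLevel.OBJECT, Semantics.POSIX)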

2.2 Requirements for distributed storages

In the early '70s, at the beginning of computer networks, the main purpose of distributed storage systems was just to share small pieces of data between nodes. Later on, with the advent of cloud computing, HPC domains and internet applications such as Google or Facebook, the need for storage space increased exponentially. I therefore propose the following list of basic requirements, based on articles published in the field:

1) Data Sharing – sharing stored data in a multi-tenant system.
2) Storage Scalability – increasing or decreasing storage capacity by adding or removing nodes in the system.
3) Storage Elasticity – dynamic adjustment of storage capacity from the client's point of view, without affecting its data or the system structure.
4) Transparency – a more general term referring to the fact that clients are unaware of the system architecture; there are several types of transparency:
   a. Access transparency – preserving the access interface regardless of the system architecture: distributed system or directly attached.


   b. Location transparency – files do not have to contain location information.
   c. Failure transparency – preserving the client's data in case of system failures.
   d. Replication transparency – clients should be unaware of data replication.
   e. Migration transparency – clients should be unaware of data migration.
5) High Availability of data – ensuring a high degree of data availability in case of partial system failures.
6) Fault Tolerance – the system continues to work properly in case of partial failure.
7) System Recovery – the system has to be able to recover after unexpected partial or total failures.
8) Data Balanced Distribution – the system has to be able to distribute data uniformly among a large number of storage nodes.
9) Workload Balance – the system's ability to balance access requests among cluster nodes in order to achieve a certain performance (e.g. minimized response time, no node overload, optimal resource utilization).
10) Data Migration – data migration among storage nodes, balancing space utilization in case of storage scaling.
11) Concurrent and Consistent Data Access – the ability of multiple tenants to concurrently access a consistent view.
12) Snapshot – the system's ability to save and return to a specific system state.
13) Archive – the system's ability to provide a complete data copy of a snapshot.


Figure 3: Storage systems scale-out diagram

14) Network Infrastructure Performance – the ability of the network infrastructure to meet a certain performance degree in terms of throughput, latency or congestion.
15) Data Security – mechanisms for protecting data from unwanted actions of unauthorized tenants.

2.3 Storage systems scaling methods

To meet today's demands for storage space, which reach petascale or even exascale sizes, the number of storage devices within the cluster is very large (i.e. scale-out) [PIST, 2014/1]. This can be achieved by distributing data at different file system levels (Figure 3). File systems are generically composed of several abstraction layers (presented in Figure 3), as follows:

- hardware layer (storage devices);
- block or object layer (sequences of bits stored on storage devices);
- file layer (typed data managed by the underlying operating system);
- application layer.


Figure 4: Mass storage solution (RAID based) – (a) RAID group, (b) Storage Pool, (c) RAID based Storage Pool

Depending on the layer at which the system is decoupled and computer networks are introduced, different types of systems are built:

- SAN based distributed storage systems are built by decoupling the system at block I/O level, where protocols such as SCSI or SATA are carried over different lossless communication environments, such as Fibre Channel;
- NAS based distributed storage systems are built by decoupling the system at file level, where protocols such as CIFS – Common Internet File System – or NFS – Network File System – are used to carry file requests;
- Hybrid systems can be built by mixing the SAN and NAS types (Figure 8);
- Furthermore, distributed and parallel file systems (e.g. GFS, Lustre or Ceph) are built by abstracting storage devices, using data segments or objects as storage units.

Mass-storage systems

Redundant Arrays of Independent Disks (RAID) are mass-storage systems that can span data across multiple disks from a RAID group (i.e. a set of storage devices bound to a RAID type), Figure 4 – a. These systems are based on two main ideas: improve performance by accessing multiple disks in parallel (large data chunks striped over multiple disks, or small independent data chunks accessed in parallel) and improve high availability and fault tolerance by duplicating data over multiple devices. There are several RAID types (RAID0 through RAID6), differentiated by the granularity of stored data and the redundancy patterns. Storage systems may also combine different RAID types, obtaining what are called Nested RAIDs (e.g. RAID1+0). There are also non-standard implementations of RAID, such as Linux MD or Hadoop RAID.

A typical RAID group is composed of several disks (16 at most) with LUNs (Logical Unit Numbers, also called Volumes) configured on top (Figure 4 – a) and accessed with an offset (as a direct attached storage) called Logical Block Address (LBA). The storage space is expanded by adding new disks to the RAID group and extending the related LUNs. It is worth mentioning that RAID groups remain the only mass storage systems that guarantee a specific degree of performance for critical applications, for example in High Performance Computing systems.

Furthermore, EMC¹ introduced Storage Pools – collections of logical/physical disks or RAIDs managed together (Figure 4 – b), used to simplify space management. To expand a typical Storage Pool, a new disk is added and the LUNs can be expanded, so each host sees an increased storage space. RAID based Storage Pools are constructed by combining RAIDs with storage pools. Usually, multiple storage pools are configured in a storage system based on their characteristics, such as size, performance or latency. In contrast with standard RAID groups, Storage Pools can expand to hundreds of devices and are more flexible and easier to manage.

Fine grained space management is achieved through the virtualization of storage pools into Virtual LUNs or Virtual Volumes (Figure 4 – c) [3], to meet application requirements in terms of capacity and quality of service; each virtual volume is usually made of a number of equal-size RAID LUN slices (equal physical storage slices). The mapping between Virtual LUNs and RAID LUNs is made by a Virtualization Controller. Usually, when a new RAID group is added to the storage pool, the already stored data is not migrated (to rebalance the system) across all available disks; rather, the new data is stored using the new RAID group and the remaining space from the old groups.

From an architectural point of view, there are three main approaches to mass-storage virtualization:

- Host-based (e.g. Linux Logical Volume Management: LVM) [4];
- Device-based (e.g. disk arrays: RAIDs);
- Network-based (e.g. SAN systems) [5] [6].

1 EMC: corporation that provides IT storage hardware solutions.
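As a toy illustration of the Virtual LUN to RAID LUN mapping performed by a Virtualization Controller, the sketch below thin-provisions fixed-size virtual slices onto physical slices. The 256 MiB slice size, the round-robin policy and all names are assumptions made for the example, not the EMC implementation described above.

SLICE_SIZE = 256 * 2**20  # 256 MiB slices (assumed for illustration)

class VirtualizationController:
    def __init__(self, raid_luns):
        self.raid_luns = list(raid_luns)    # RAID LUN identifiers
        self.mapping = {}                   # (vlun, vslice) -> (raid_lun, pslice)
        self.next_free = {lun: 0 for lun in self.raid_luns}
        self.rr = 0                         # round-robin cursor over RAID LUNs

    def locate(self, vlun, byte_offset):
        """Translate a Virtual LUN offset into (raid_lun, physical_slice, offset)."""
        vslice, off = divmod(byte_offset, SLICE_SIZE)
        key = (vlun, vslice)
        if key not in self.mapping:         # allocate on first touch (thin provisioning)
            lun = self.raid_luns[self.rr % len(self.raid_luns)]
            self.rr += 1
            self.mapping[key] = (lun, self.next_free[lun])
            self.next_free[lun] += 1
        lun, pslice = self.mapping[key]
        return lun, pslice, off

ctrl = VirtualizationController(["RAID-A", "RAID-B"])
print(ctrl.locate("vlun0", 3 * SLICE_SIZE + 42))   # -> ('RAID-A', 0, 42)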


Semantics                Description
POSIX semantics          Every operation on a file is instantly visible to all clients.
Session semantics        No changes are visible until the file is closed.
Immutable files          No updates are possible.
Transactional semantics  Based on ACID properties – all or nothing.
Table 1: File sharing semantics taxonomy

2.4 Sharing semantics

One of the most important characteristics of distributed storage systems is their sharing semantics. Sharing semantics refers to the semantics of concurrent access by two or more tenants and defines the level of data consistency in different situations. Andrew S. Tanenbaum, in his book "Distributed Operating Systems" [7], proposed a simple and comprehensive set of four types of file sharing semantics, listed in Table 1. A type of consistency is chosen depending on the system architecture, its purpose and the required level of performance. In distributed systems, POSIX semantics is very difficult to achieve, mainly because it requires complicated synchronization mechanisms.

2.5 Low level data organization

2.5.1 Data organization at device level

In the context of data organization at device level, I consider three main organizational types: structured (Figure 5), log based (Figure 6) and tree based (Figure 7). The main difference between structured and log based organization is that the former has distinct areas for inodes and data blocks, while the latter stores them in continuous form (i.e. logs: a checkpoint area points to the right version of the inode map, while data is continuously appended to the log) – the advantage is that both inodes and data blocks are written sequentially, without moving the device head. Tree based organization (e.g. B-tree File System: BTRFS) uses a copy-on-write (COW) model, where data is organized in a B-tree structure. Basically, when a node is modified, a shadow tree is created and pointers are updated accordingly up to the root (and this is how snapshots are created).
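A minimal sketch of the copy-on-write idea behind tree based organization: updating a key clones only the path from the root to the touched node, so the previous root remains a consistent snapshot. This illustrates the COW model only (on a plain binary search tree), not the BTRFS on-disk format.

class Node:
    def __init__(self, key, value=None, left=None, right=None):
        self.key, self.value = key, value
        self.left, self.right = left, right

def cow_insert(root, key, value):
    """Return a NEW root; untouched subtrees are shared, the modified path is cloned."""
    if root is None:
        return Node(key, value)
    if key == root.key:
        return Node(key, value, root.left, root.right)
    if key < root.key:
        return Node(root.key, root.value, cow_insert(root.left, key, value), root.right)
    return Node(root.key, root.value, root.left, cow_insert(root.right, key, value))

snap0 = cow_insert(None, 10, "a")
snap1 = cow_insert(snap0, 5, "b")   # snap0 is untouched: a free snapshot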


Figure 5: Device: structure based data organization

Figure 6: Device: log based data organization

Figure 7: Device: tree based data organization (BTRFS leaves)

Figure 8: Hybrid storage system: SAN / NAS


On hard drives – also because Linux greatly increased in popularity in the last decade – the structure based organization of data is the most used: the Unix File System (UFS), also called the Berkeley Fast File System, BSD Fast File System or simply Fast File System. In Linux, the implementation that follows UFS is the Extended File System (EXT) [2]. Over time, its design has undergone massive changes to meet different requirements, such as journaling or increased storage space (i.e. EXT version 4). Briefly, the EXT layout [2] is organized into block groups, data management units (i.e. group descriptors and the data block bitmap), file management units (i.e. the inode bitmap and inode table) and the data area (i.e. data blocks). Block groups optimize disk seeks, and each group contains information about the status of the file system, called the superblock. With the following EXT revisions, the superblock became quite large, therefore only several block groups were chosen to store it (i.e. sparse superblock). Following this evolutionary line, Oracle released in 2007 a file system with a tree based data organization (the B-tree File System, BTRFS [1]), included in the Linux kernel in 2009. In the file system's tree, the leaves contain an array of items (Figure 7) and the inner nodes contain information about where to look for an item and where the next node is located. An item can be anything: an inode, a file, a directory and so on. Compared with UFS, BTRFS scales better and has also shown performance improvements.
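As a small worked example of the block-group layout, the sketch below shows the standard EXT arithmetic for locating an inode inside its block group, and which groups hold superblock copies under the sparse superblock scheme (groups 0 and 1, plus powers of 3, 5 and 7).

def locate_inode(inode_no, inodes_per_group):
    group = (inode_no - 1) // inodes_per_group   # which block group holds the inode
    index = (inode_no - 1) % inodes_per_group    # slot in that group's inode table
    return group, index

def has_superblock_copy(group):
    """Sparse superblock: copies live in groups 0, 1 and powers of 3, 5 and 7."""
    if group in (0, 1):
        return True
    for base in (3, 5, 7):
        g = base
        while g < group:
            g *= base
        if g == group:
            return True
    return False

print(locate_inode(8193, 8192))                           # -> (1, 0)
print([g for g in range(50) if has_superblock_copy(g)])   # 0, 1, 3, 5, 7, 9, 25, 27, 49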

2.5.2 Networked storage data organization

Generally speaking, in any distributed storage system data is divided into different types of units (e.g. files, blocks or objects), distributed across the nodes of a network using different techniques, such as mapping tables or hash functions. As in local file systems, in distributed file systems the organization of data segments is specified in a type of metadata structure, which may be distributed as any regular file, distributed in a private cluster, or stored in a centralized way. Usually, a distributed storage system does not have the concept of a partition, but sometimes there are mechanisms that can be considered similar concepts, for example volumes in the Andrew File System or the multiple file systems managed by the MGS in Lustre. SAN and NAS are two of the best known networked file systems. In terms of architecture, SAN based file systems usually use storage nodes with RAID disks, while NAS based file systems are often used along with SAN environments, where file servers are constructed over NAS and data is stored in a SAN medium (i.e. hybrid file systems), Figure 8.


Figure 9: Generalized data layout

2.5.3 Generalization of data organization

Regardless of file semantics (such as hierarchical or semantic file systems) or distribution status, any file system has two main layers (Figure 9):

- Metadata area: specifications of the data organization;
- Data area: a flat address space of data units.

2.5.4 Storage units

Stored data may be distributed using different types of units (as shown in Figure 9): containers, files, segments, blocks or objects. Files are self-contained pieces of information, organized (e.g. hierarchically, semantically and so on) and typed (e.g. binary, directory and so on), managed and interpreted by the underlying operating system. Besides the organization method and type, files also have access rights (e.g. ACL: Access Control List) for different operations (e.g. read, write or execute), per group of users or per user – data security. Data segments (i.e. chunks of data from files) are scarcely used for distribution (e.g. Google File System uses chunks of 64 MB as its distribution unit), due to their weaknesses, such as the lack of standardized support in operating systems.


Actions           Block Device          Object Storage Device
Partitions        Available             Available
Space management  External              Internal
Create object     N/A                   Available
Delete object     N/A                   Available
Read              Read block            Read object
Write             Write block           Write object
Get attributes    N/A                   Available
Set attributes    N/A                   Available
Addressing        Block range           Object / byte range
Security          Per device            Per object
Recovery          External mechanism    Internal mechanism
Snapshot          N/A                   Available
Containers        N/A                   Collection of objects
Table 2: Block devices vs. Object Storage Devices

Blocks and objects are the most used data units for distribution, and it is worth mentioning that the current tendency is to use mostly objects (Object Storage Devices – OSDs, composing an Object Store – ObS [9]), because OSDs are more autonomous devices with several strengths (Table 2), including:

- OSD v1: internal space management, a standard API for object manipulation and object based security [10];
- OSD v2: completes the feature list with snapshot capability, object containers and an improved recovery mechanism [11].

It is worth mentioning that an object may represent an entire file, file fragments, or even fragments from multiple files. The representation depends on the implementation and the requirements of the file system, such as providing multiple file fragments in parallel or improving the distribution profile (e.g. uniform distribution).

2.5.5 Trends towards object based storage systems

Currently, the tendency is to use file segments to distribute data, rather than entire files or even groups of files, because this way, for example, a higher degree of availability and better fault tolerance are achieved (two of the most important requirements of distributed systems). Although there are several types of data representation used for distribution, objects seem to gain in popularity, since researchers claim that OSDs take the advantages of the DAS, SAN and NAS architectures and group them in a single device [12]. Basically, they address issues related to security, scalability and management (Table 2). Security is improved by adding credentials to each operation, while in contrast a block device has only per-LUN security. Scalability and management are improved by moving space management from the metadata servers to the storage controllers. One management advantage that emerges from OSD characteristics is that object based storage systems simplify the data layout by replacing large block lists with small object lists and by distributing the low-level block allocation problem. Although it has many advantages, object based distribution does not solve the fundamental issue of distributing data among a very large number of devices.

Figure 10: Andrew File System architecture

2.6 Distributed storages architectures

There are many distributed storage systems that follow different strategies, each with advantages and disadvantages. In this subsection I analyze the architectures of five storage systems which I consider representative, covering several highly important topics such as low level organization, overall system design and sharing semantics: AFS (Andrew File System), GPFS (General Parallel File System), GFS (Google File System), Lustre and Ceph. The analysis concludes with a comparison of the characteristics of these systems, highlighting their advantages and disadvantages [PIST, 2014/1].

2.6.1 Andrew File System

Andrew File System (AFS) [13] [14] is one of the pioneers of the storage distribution field. Its main purpose was to globally share files in a transparent way (e.g. location transparency, access transparency, replication transparency and so on – listed in section 2.2, item 4). It has a client-server architecture and shares files through the NFS protocol, and therefore it can be considered a Network Attached Storage.


In terms of architecture, AFS offers a single global file namespace ("/afs") and organizes data in cells and volumes. Volumes are units smaller than traditional partitions (for example, a volume can be a directory such as "/home") and reside on a vice partition managed by a server (Figure 10). A cell comprises a collection of volumes and represents an entire organization (e.g. "/afs/cs.pub.ro"). Distribution is achieved at volume level, through clones. There are two types of clones: read-only copies (for workload balance) and backups (Figure 10). In terms of software design, a series of processes run on the servers for system monitoring and client interaction: they maintain volume locations, serve files, monitor volume availability, remove damaged volumes, handle client requests, perform different volume related tasks (e.g. data balancing, backups or clones), and find and correct volume inconsistencies.

2.6.2 Google File System

Google File System (GFS) [15] was built to manage and manipulate huge amounts of data, providing the strengths of expensive and reliable servers while compensating for the weaknesses of cheap hardware. It is not an open source system, but it serves as a model for open source systems like Hadoop (HDFS) [16]. From an architectural perspective, GFS is composed of one active GFS master server and a few clones (for enabling failover scenarios) that manage the system metadata, plus hundreds of GFS chunk servers that store chunks of files (i.e. data segments) in a distributed manner. The system implements a series of mechanisms to support a huge number of clients that access a consistent, shared file system in parallel, as well as mechanisms to overcome the faulty nature of commodity hardware (the read/write operations are depicted in Figure 11 – a, b). GFS's main characteristics:

- Files are divided into big data chunks for I/O performance optimization and distributed across a storage cluster built of cheap commodity hardware;
- Uses only one active server for metadata manipulation (atomic metadata mutations) to simplify the system design;
- Profiling showed that files are scarcely modified and appends are more frequent, therefore the system guarantees a simplified, POSIX-like sharing semantics model (Table 3);
- Rebalances the data in the cluster periodically;
- Provides logging and checkpointing for chunkserver fault tolerance;
- Provides snapshotting to create branch copies.


Figure 11: Google File System architecture – (a) read flow, (b) write flow

Actions             Writes                     Appends
Serial success      Defined                    Defined, intercalated with inconsistent
Concurrent success  Consistent but undefined   Defined, intercalated with inconsistent
Failures            Inconsistent               Inconsistent
Table 3: Google File System semantics


Figure 12: General Parallel File System architecture – (a) GPFS shared disk, (b) GPFS networked I/O, (c) GPFS multi-cluster

Hadoop (HDFS) is an open source implementation (by Yahoo) that follows the GFS architectural characteristics (since the GFS implementation is not open). In HDFS nomenclature, the Name Node is equivalent to the GFS master, while the Data Nodes are equivalent to the GFS chunk servers.
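A hedged sketch of the read flow from Figure 11 – a: the client converts a byte offset into a chunk index, asks the master for the chunk handle and replica locations, then reads directly from a chunkserver. Since GFS is closed source, every name and the lookup-table shape below are illustrative assumptions; only the 64 MB chunk size comes from the published design.

CHUNK_SIZE = 64 * 2**20   # 64 MB chunks, per the GFS paper

def chunk_index(offset: int) -> int:
    """Translate a byte offset inside a file into a chunk index."""
    return offset // CHUNK_SIZE

class Master:
    def __init__(self):
        # (path, chunk_index) -> (chunk_handle, [replica locations]); toy data
        self.table = {("/logs/a", 0): ("0xdeadbeef", ["cs1", "cs7", "cs9"])}

    def lookup(self, path, idx):
        return self.table[(path, idx)]

master = Master()
handle, replicas = master.lookup("/logs/a", chunk_index(12 * 2**20))
# The client caches (handle, replicas) and then reads the chunk data
# directly from a chunkserver, keeping the master off the data path.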

2.6.3 General Parallel File System

General Parallel File System (GPFS) [17] is the distributed and parallel file system solution from IBM. Like GFS, GPFS is closed source and many details about its architecture are unknown. GPFS supports RAID mass-storage systems over Fibre Channel (SAN), but it also has the ability to use LAN networks through Network Shared Disk (NSD) modules, which enable block access to data. The choice between these two access modes depends on the system's purpose: accessing data through a SAN is much faster but requires more expensive network equipment, while LAN access is less expensive but slower. GPFS supports deployments of thousands of storage devices, achieving a very good degree of system availability. It supports replication, logging and recovery, thus being a fault tolerant system as well. GPFS distributes data by means of two types of policies: the file placement policy distributes data to a specific set of disks (a disk pool), while the file management policy moves or replicates data across the system disks, preserving the file namespace.


Figure 13: Lustre File System architecture

The file namespace can be split into small groups by means of file sets, achieving management at a smaller granularity. Another GPFS characteristic is that metadata management is spread across all GPFS clients, thus avoiding a single point of failure and achieving better availability. It is also highly configurable, supporting three deployment types: shared-disk, networked I/O and multi-cluster (Figure 12).

2.6.4 Lustre File System

Lustre [18] is an open source system with several unique particularities. Briefly, it is composed of three main entities: metadata servers (for namespace management), management servers (used to manage all Lustre file systems) and object storage servers (used to manage objects stored on targets), Figure 13. Fault tolerance is achieved through failover configurations (i.e. redundant systems). Each Object Storage Server (OSS) can have a failover configuration achieved by using several targets (Object Store Targets – OSTs). A similar method is used for the Metadata Servers (MDS) and Management Servers (MGS). Lustre also distributes namespaces over a cluster of MDSs and MDTs: the Distributed Namespace (DNE) cluster, formerly Clustered Metadata (CMD) – directories are striped across several MDTs.


Figure 14: Lustre file mapping

The objects are managed by keeping a map similar to the inode map (as UFS does), which can be considered a drawback since it puts pressure on the storage servers (Figure 14). Lustre supports configurations using commodity storage devices, as well as SAN based storage.

2.6.5 Ceph File System

Ceph [19] is a new distributed storage system that assembles many advantages of previously existing implementations. It distributes everything: namespaces, monitoring, security and data; it is therefore very scalable, but also very complex, being considered "the new dream distributed file system" (Figure 15). One of the main characteristics of Ceph is that it completely isolates metadata management from data storage (RADOS – Reliable, Autonomic Distributed Object Store) [20]. Ceph stripes files into objects, in a way similar to the RAID model, to improve parallel I/O; RADOS then distributes these objects over a cluster of OSDs using a controlled hash function (CRUSH – Controlled Replication Under Scalable Hashing) [21].


Figure 15: Ceph file system architecture


#  RADOS messages           Description
1  Leader monitor election  Provides a consistent cluster map to all system parties; the leader monitor is elected based on simplified PAXOS, with a quorum for the serialization of map updates [22] pp. 127.
2  Leases                   Grants short-term leases to the active monitors, giving them the right to distribute copies of the cluster map to all system parties [22] pp. 127.
3  Cluster-map distribution Responses of the active monitors (which have been granted a lease) to map queries from MDSs, clients or newly added OSDs. Since an OSD cluster may have a very large number of devices [22] pp. 129, the monitors do not broadcast map updates; instead, map updates are exchanged between OSDs from the same placement group.
4  Heartbeat messages       To prevent access to inconsistent data, timely heartbeat messages are exchanged between OSDs [22] pp. 129.
5  I/O traffic              Object reads/writes and data replication (i.e. primary-copy, chain and splay) [22] pp. 131.
6  Recovery                 Failure recovery is based on the cluster-map epoch and the set of active OSDs in each placement group; it uses a peering algorithm to ensure a consistent view of the placement group across all OSDs within it [22] pp. 137.
Table 4: RADOS main messages

2.6.5.1 Data storage: RADOS

RADOS is a system composed of a large OSD cluster and a small cluster of monitors used as supervisors. After the files are striped into objects (using a RAID based model), RADOS maps these objects to placement groups (PGs). Each placement group has a replication level attached, so it controls the degree of replication declustering. Moreover, the system has to track a huge number of objects, which is more realistically done per group rather than per object. So, placement group ids are obtained by hashing unique object ids.


The OSD cluster layout is described by a cluster map containing: the map revision (for detecting unsynchronized elements in the cluster), the number of placement groups, the status of each OSD in the cluster and a hierarchical description of the OSDs (e.g. rooms, rows, racks, shelves and devices). By applying sets of placement policies on this map, different failure domains can be configured. Furthermore, a deterministic list of OSDs is obtained using the CRUSH algorithm, based on the placement group id and a set of placement rules. Only synchronized OSDs are used, by filtering this list by the status of each OSD from the cluster map. The consistency between replicas is ensured using a simplified PAXOS algorithm (quorum based election of a leader monitor – it requires an odd number of monitors). First, a leader monitor is proposed and elected to serialize map updates and manage consistency (Table 4 – 1); then, the monitor leader requests the map epoch from all monitors (which have a specific time frame to respond) and joins them to the quorum. The monitor leader must ensure that the majority of monitors have the most recent map epoch. Furthermore, the monitor leader distributes short-term leases to the active monitors, enabling them to publish copies of the cluster map to requesters (Table 4 – 2, 3). If acknowledgements are not received or leases are not renewed, a new leader monitor election is called. OSD failure detection (based on heartbeats), map synchronization (based on the map epoch) and data consistency (based on peering) are achieved by exchanging timely messages between OSDs in the same PG (Table 4 – 4). For data recovery, each OSD maintains a history of the placement groups to which it belongs, and in case of inconsistency the data is updated through a primary OSD chosen in the replication scheme (Table 4 – 6).
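The two-step placement described above can be sketched as follows: an object id is hashed to a placement group, and the placement group is mapped to an ordered list of OSDs by a deterministic function. In this sketch md5 stands in for Ceph's actual hash, and the rendezvous-style OSD ranking is a strong simplification of CRUSH; the object name and cluster size are made up for the example.

import hashlib

PG_NUM = 128   # number of placement groups (assumed)

def pg_of(object_id: str) -> int:
    """Step 1: hash the unique object id to a placement group id."""
    h = int(hashlib.md5(object_id.encode()).hexdigest(), 16)
    return h % PG_NUM

def osds_of(pg: int, osds: list[str], replicas: int = 3) -> list[str]:
    """Step 2: deterministically rank every OSD for this PG, take the top R."""
    ranked = sorted(osds, key=lambda o: hashlib.md5(f"{pg}:{o}".encode()).digest())
    return ranked[:replicas]

cluster = [f"osd.{i}" for i in range(12)]
pg = pg_of("rbd_data.1234.00000000")
print(pg, osds_of(pg, cluster))   # same inputs -> same acting set, no lookup server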

2.6.5.2 Namespace management

Namespace management is distributed across a metadata cluster, which uses a technique named adaptive workload distribution (i.e. dynamic sub-tree partitioning, to achieve scalable performance). Ceph migrates or replicates pieces of the file tree at directory fragment level, rather than at entire directory level, thus achieving better performance. The replication and migration decisions are taken based on a popularity metric, and in this way metadata high availability is achieved. Despite its advantages, distributed namespace management has an important drawback: the cost of management when the number of nodes is very large.

2.6.5.3 Decentralized data distribution

The main purpose of decentralized data distribution algorithms is to replace lookup servers. In this subsection I briefly present two such algorithms: Replication Under Scalable Hashing (RUSH) [23] and Controlled Replication Under Scalable Hashing (CRUSH) [21].


Figure 16: Decentralized data distribution (layout) – (1) RUSH layout example, (2) CRUSH layout example

The RUSH algorithm has several flavors of data placement, such as using prime numbers, support for removal and a tree based approach [24]. It has two main principles: identifying which objects must be moved to keep the cluster balanced, and deciding upon an object's destination based on hash functions. It also implies that devices are replaced in groups (e.g. on a per shelf basis), rather than one by one.

- RUSH (Figure 16 – a)

Briefly, the system is divided into c sub-clusters, where each sub-cluster i has m_i servers and each server has weight w_i. The sub-clusters are virtually arranged from the newest added to the oldest. The weights control which sub-cluster is chosen and how many replicas it stores, based on a hyper-geometric distribution (i.e. how many devices are chosen from each sub-cluster). The algorithm returns a list of L servers based on a pseudo-random hash function, the object x, the number R of replicas of the object and the cluster description (the number m_i of servers in each sub-cluster with their weights w_i, and the number c of sub-clusters in the system). The algorithm's output is deterministic and remains consistent if the system storage size is changed.

- CRUSH (Figure 16 – b)

The system is described by a hierarchical cluster map of weighted devices and buckets (e.g. the system is composed of rooms with m cabinets, where each cabinet has n_i shelves with p_j storage devices each). Based on placement rules, CRUSH decides which storage devices are chosen from the cluster map. The algorithm produces a deterministic, ordered list of L storage devices for object x, based on a pseudo-random hash function, the cluster map and the placement rules.
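The weighted, deterministic selection at the heart of these algorithms can be illustrated with a straw2-like draw: every device computes a hash-derived "straw" scaled by its weight, and the longest straws win, so the choice is reproducible from the cluster description alone and proportional to weight. The sketch below is illustrative, not the CRUSH implementation (real CRUSH walks a bucket hierarchy and re-draws per replica).

import hashlib, math

def _uniform(key: str) -> float:
    """Deterministic uniform(0, 1] value derived from a hash."""
    h = int(hashlib.sha256(key.encode()).hexdigest()[:16], 16)
    return (h + 1) / float(2**64)

def select(x: str, devices: dict[str, float], n: int) -> list[str]:
    """Pick n distinct devices for object x, proportionally to their weights."""
    def straw(dev):
        # log of a uniform draw is negative; dividing by a larger weight
        # makes the straw less negative, i.e. more likely to rank first
        return math.log(_uniform(f"{x}:{dev}")) / devices[dev]
    return sorted(devices, key=straw, reverse=True)[:n]

devices = {"osd.0": 1.0, "osd.1": 1.0, "osd.2": 2.0}  # osd.2 has double capacity
print(select("object-42", devices, 2))                # deterministic for this input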


Storage System   Locality    Sharing   Distribution Level   Semantics
AFS              Networked   Shared    File                 Session
GFS              Networked   Shared    Chunk/segment        Mimics POSIX
GPFS             Networked   Shared    Block                POSIX
Lustre           Networked   Shared    Object               POSIX
Ceph             Networked   Shared    Object               POSIX
Table 5: Storage systems characteristics mapping

Furthermore, CRUSH [21] completes the picture with the idea of controlled distribution across different failure domains, by means of placement policies.

2.6.5.4 Security in Ceph

Maat [25] is a security protocol that uses Merkle trees to define the authorized users for a group of files, and it proposes a few novel techniques to improve the performance of petascale file systems and high-performance parallel computing: extended capabilities, automatic revocation and secure delegation. Horus [26] is a proposed data encryption method that introduces a novel idea: generating encryption keys for different regions of a file using Merkle trees, in a structure named Keyed Hash Tree (KHT). The KHT allows the generation of keys for different ranges of blocks from a file shared by different clients, each with its own permissions over different regions of it. The main advantage of this method is that it offers multiple levels of security, but the KHT height and the region size at each KHT level must be pre-configured – a scalability penalty.
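A hedged sketch of the KHT idea: each node's key is derived from its parent's key with a keyed hash, so handing a client the key of a region lets it derive the keys for everything inside that region and nothing outside it. The branching factor, labels and HMAC construction below are assumptions made for illustration, not the exact Horus scheme.

import hmac, hashlib

BRANCH = 4  # children per KHT node (pre-configured, per the scalability note)

def child_key(parent_key: bytes, child_index: int) -> bytes:
    """Derive a child's key from its parent's key with a keyed hash."""
    return hmac.new(parent_key, str(child_index).encode(), hashlib.sha256).digest()

def range_key(root_key: bytes, path: list[int]) -> bytes:
    """Walk the KHT from the root down to the node covering a block range."""
    k = root_key
    for idx in path:
        k = child_key(k, idx)
    return k

root = b"\x00" * 32                  # file's root key (held by the owner only)
region = range_key(root, [1])        # key for the second quarter of the file
block = range_key(root, [1, 3])      # a client holding `region` can derive this...
assert block == child_key(region, 3) # ...but nothing outside its region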

2.6.6 Comparison

The classification of the studied systems according to my proposed taxonomy is listed in Table 5, while the advantages and disadvantages that emerged from their comparison are discussed below:

- The location of data in the distribution space can be handled in two ways: by using information maps (similar to the inode map from UFS) or by decentralized distribution algorithms such as CRUSH or RUSH. The drawback of mapping, compared with decentralized distribution algorithms, is the increased pressure on dedicated lookup servers.
- Distributing metadata management has pros and cons: in essence, distributed management brings better I/O performance, but a more complex architecture. Among the studied systems, Ceph and GPFS handle metadata in a distributed way, while GFS/HDFS uses shadow servers only for high availability, the actual activities being handled by a single active server.


- The distribution level refers to the data unit used for distribution, such as data segments, files or objects. The distribution level of each file system is listed in Table 5. There is a global tendency towards OSD usage due to several strengths, including: better data security (per object), flexible size (which can be adjusted to increase I/O performance), a standard API and local space management. Among the studied systems, Lustre and Ceph use OSDs, and the future leads to OSDs for AFS (OpenAFS) and GPFS as well.
- GPFS is sensitive to device failures, because it usually uses a declustered RAID approach. In terms of RAIDs, there is also a trend towards Object RAID [27], where PanFS is leading.
- In distributed storages there is a trend to add support for commodity hardware, where failures are common behavior rather than an exception. To overcome the resulting issues, the systems must implement different mechanisms to achieve a specific degree of fault tolerance and high availability.
- The access interface is an important characteristic of storage systems; distributed storages may therefore be divided into two main categories: systems that support a conventional API (i.e. Berkeley sockets) and systems that use a custom API. A conventional API has the advantage of transparency, while a custom API has fine tuning functionality to increase performance. Ceph mixes the two methods and uses a conventional API enriched with additional features for manipulating different system characteristics.
- Systems that use pieces of files to distribute data are less sensitive to failures and have a better degree of high availability compared to, for example, AFS, which distributes entire files (volume cloning).
- File sharing semantics are handled as follows:
  o AFS: Session semantics (weak);
  o GPFS, Lustre and Ceph: POSIX semantics;
  o GFS/HDFS: mimics POSIX semantics (more relaxed semantics).

2.7 Network technologies for distributed storages

The network infrastructure has a great influence on the performance of distributed storage systems [PIST, 2014/1] and is divided into two main topics:

- network topologies and
- protocol stacks.

2.7.1 The topology

The topologies used in distributed systems are defined by graphs, with the storage devices and network equipment as nodes and the links between them as edges [28]. There are intrinsic similarities between the graphs used to connect general purpose cores in multi-core systems and those used in distributed systems in general, since they aim at roughly the same characteristics: a perfect interconnection topology should have the smallest latency possible (round-trip time close to zero), maximized throughput (line rate for any packet size), minimum capital cost, and perfect fault tolerance and high availability [29].

In data centers (and distributed storage systems) the graph diameter influences latency and contention (a characteristic more relevant for SAN based systems). Also, a smaller node degree decreases the capital cost of the network equipment but increases latency, since the topology will have more hops; a smaller diameter lowers the latency and round-trip time and increases throughput; a larger bisection width facilitates left-to-right (east-west) traffic (important for data migration and data distribution among storage nodes); and a high number of edges influences the deployment. Usually, data centers use different tree based topologies with alternative paths, but there are other alternatives whose characteristics suit different traffic profiles and application purposes, such as:

1. Complete graph: an undirected graph where each pair of nodes is connected by one edge (Figure 17 – 1).
2. 2D Torus (k): an undirected graph in the form of a matrix of nodes, where each node connects with its nearest neighbors, plus wraparound connections (Figure 17 – 2).
3. Fat-tree (h): a tree where the edges become fatter as one moves towards the root (Figure 17 – 3). In industry, a fat-tree with redundancy is usually used to improve system high availability (Figure 17 – 3').
4. Hypertree (k, h): a type of 3D tree of height h and degree k. It can be viewed as a complete binary tree of height h from the bottom up and a complete k-ary tree (k ≥ 2) from the top down (Figure 17 – 4).
5. Hypercube (k): a graph with 2^k nodes and a total of k·2^(k−1) edges, where each node has exactly k neighbors, with mutually perpendicular sides (Figure 17 – 5).
6. Cube-connected-cycles (k): a graph obtained by replacing each node of a Hypercube of degree d by a cycle of length k, where k ≥ d (if k = d, then CCC(d, k) = CCC(k)), Figure 17 – 6.


Figure 17: Network topologies

#  Topology                     Nodes                            Edges                                      Degree           Diameter                  Bisection
1  Complete graph               N(n) = n                         E(n) = n(n−1)/2                            Δ(n) = n − 1     d(n) = 1                  bw_e(n) = (n/2)²
2  2D Torus (k), k nodes/line   N(k) = k²                        E(k) = 2k²                                 Δ(k) = 4         d(k) = k                  bw_e(k) = 2k
3  Binary fat-tree (h)          N(h) = 2^(h+1) − 1               E(h) = 2^(h+1) − 2                         Δ(h) = 2 + 1     d(h) = 2h ≈ 2·log₂(n)     bw_e(n) = 1
4  Hypertree (k, h), k > 2      N(k, h) = (k^(h+1) − 2)/(k − 2)  E(k, h) = (2k^(h+1) − k·2^(h+1))/(k − 2)   Δ(k, h) = k + 1  d(k, h) = 2h              bw_e(4, h) = 2^(h+1)
5  Hypercube (k), k > 0         N(k) = 2^k                       E(k) = k·2^(k−1)                           Δ(k) = k         d(k) = k                  bw_e(k) = 2^(k−1)
6  CCC (k), k > 3               N(k) = k·2^k                     E(k) = k·2^(k−1) + k·2^k                   Δ(k) = 3         d(k) = 2k + ⌊k/2⌋ − 2     bw_e(k) = 2^(k−1)
Table 6: Network topologies scaling properties


(a) Protocol stacks for block-level “networked” storage systems (below the application layer, the SCSI layer handles I/O requests):
 FCP over FC;
 FCP over FCIP over TCP/IP over Ethernet;
 iSCSI over TCP/IP over Ethernet;
 FCP over iFCP over TCP/IP over Ethernet;
 FCP over FCoE over Ethernet + DCB.

(b) Protocol stack for file-level “networked” storage systems:
 Application layer, then NFS/CIFS/SMB/AFS… over TCP over IP over Ethernet.

Table 7: Protocol stacks for storages

Analysis:

 The complete graph has the perfect diameter (value of 1); hypercubes, cube-connected-cycles, binary fat-trees and hypertrees have a logarithmic diameter (it is worth mentioning that hypertrees have the advantage of using constant speed links at all levels, while fat-trees must provide fatter links toward the tree root).  The node degree is constant for the torus, binary fat-trees and hypertrees. Nevertheless, hypertrees also have the ability to scale horizontally by changing the k-ary dimension. The node degree of the hypercube grows logarithmically with the number of nodes in the topology.  In terms of fault tolerance, except for fat-trees, all other topologies offer alternative routing paths. Hypertrees preserve the diameter in case of faulty nodes, a very important advantage. Cube style topologies have better fault tolerance, because they have many alternative routes between any two nodes. De Bruijn, a directed graph with m^n nodes labeled by n-tuples over an alphabet of m symbols, with N(m, n) = m^n, E(m, n) = m^(n+1), Δ(m) = 2m and d(n) = n, has also been taken into consideration. It supports a large number of nodes with few connections per node, while preserving short distances. When designing a topology, the graph is chosen depending on the traffic requirements: east-west/south-north traffic profile, low latency or high throughput and, of course, the balance between costs and performance.
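To make the comparison above concrete, the short sketch below evaluates some of the scaling formulas of Table 6 in Python for topologies of comparable size. It is illustrative only and not part of the original analysis; the function names and the chosen sizes are assumptions for the example.

def complete_graph(n):
    return {"nodes": n, "edges": n * (n - 1) // 2, "degree": n - 1, "diameter": 1}

def torus_2d(k):  # k x k matrix of nodes with wraparound links
    return {"nodes": k * k, "edges": 2 * k * k, "degree": 4, "diameter": k}

def hypercube(k):  # 2^k nodes, each with exactly k neighbours
    return {"nodes": 2 ** k, "edges": k * 2 ** (k - 1), "degree": k, "diameter": k}

def ccc(k):  # cube-connected cycles with cycle length k
    return {"nodes": k * 2 ** k, "edges": k * 2 ** (k - 1) + k * 2 ** k,
            "degree": 3, "diameter": 2 * k + k // 2 - 2}

# roughly 64-node instances of each topology, for a side-by-side view
for name, props in [("complete", complete_graph(64)), ("torus", torus_2d(8)),
                    ("hypercube", hypercube(6)), ("ccc", ccc(4))]:
    print(name, props)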

2.7.2 Protocol stack In terms of the protocol stack used, distributed storage systems are divided into systems that require lossless media and systems that can use best effort media. Usually, storage

devices are controlled using SCSI (Small Computer System Interface) standards (i.e. commands, protocols and interfaces). Distributed storages built at block device level carry SCSI commands through the interconnection medium – SAN based systems. Since SCSI does not have mechanisms for contention or retransmission, lossless transmission environments are more appropriate. In terms of the OSI model, the reliability problem can be solved at layer 2 (data link) or layer 4 (transport). Ethernet is the most common data link protocol used, but it is best effort by design, while Fibre Channel, for example, is a lossless medium. Basically, the Fibre Channel protocol (FCP) runs over Fibre Channel infrastructure, but there is also a solution where FCP frames are embedded into Ethernet frames: Fibre Channel over Ethernet (Table 7 – a). At layer 4, reliability can be handled by TCP, but with a noticeable performance penalty (Table 7 – b):

 iFCP – protocol that runs over IP and provides functionality similar to FCP;  iSCSI – protocol that runs over IP and provides functionality similar to SCSI;  FCIP – FCP packets are encapsulated into IP. Ethernet has been enriched by the IEEE with a group of protocols named DCB (Data Center Bridging) [42] for traffic control that avoids packet dropping; Ethernet has therefore become a lossless environment and an alternative to Fibre Channel (Table 7 – a). Also, there is a trend to merge general purpose networks (LANs) with storage networks – network convergence – where Ethernet becomes the solution (section 5). Distributed storages built at file level are less demanding and usually run over the TCP/IP/Ethernet stack (classic LAN) using protocols such as NFS, CIFS or SMB (Table 7 – b).

2.7.3 Real world considerations The capital cost of the storage system infrastructure is one of the most important criteria, therefore the tradeoff between an appealing price and good performance must be established when choosing an interconnection model. Switches are expensive and determine the way the network scales, by the number of ports per device and their speed. The speed of switch ports progressed in the last decade from 1 Gbps to 100 Gbps; nowadays an L2/3 switch offers a 10 Gbps port at about $1,250 and a 40 Gbps port at about $3,000, while a 100 Gbps port is still very expensive, at about $12,000 – so faster links increase expenses. Other than capital cost, the choice also depends on the data center performance requirements, such as latency and throughput. In this case the switch ports are chosen based on two criteria: to optimize the transfer speed of each node and to support the aggregate speed of a number of nodes in the data center.


Figure 18: Cloud components

In SAN environments, for a data segment of 512 KB transferred at 25 MBps, a 1 Gbps port supports up to 2-3 RAID devices and a 10 Gbps port supports up to 25 RAID devices [30]. A fat-tree topology is expensive because of the high bandwidth needed while moving up the tree. Nevertheless, there are methods to mitigate this issue, one of them being link aggregation (LAG). A butterfly topology requires fast links because the topology must accommodate the entire traffic, and it does not have the fault tolerance property of the other topologies. Meshes and tori are not very expensive, but they need extra connections to storage devices and are usually fit for storage systems built from blocks (e.g. the ones proposed by IBM and HP). Hypercubes are a special case of tori, but their bandwidth scales better [30]. There are other graphs that can be considered when building topologies designed for data centers, such as De Bruijn, which requires a low number of switches but whose capital costs put pressure on the NICs. Also, there is a trend to combine different topologies to lower the costs and to increase performance in different areas. For example, in the Ceph context the throughput capabilities have to be higher for RADOS, while for Monitors and Metadata servers the latency is more important.
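As a back-of-the-envelope check of the port capacity figures quoted above, the sketch below derates the raw line rate by an assumed 50% protocol and contention overhead; the derating factor is my assumption for the example, not a figure taken from [30].

def devices_per_port(port_gbps, device_mbps=25.0, efficiency=0.5):
    # Gbps -> MB/s, then derate for protocol/contention overhead (assumed 50%)
    usable_mbps = port_gbps * 1000.0 / 8.0 * efficiency
    return int(usable_mbps // device_mbps)

print(devices_per_port(1))    # -> 2  (matches the 2-3 devices per 1 Gbps port)
print(devices_per_port(10))   # -> 25 (matches the ~25 devices per 10 Gbps port)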

2.8 Interfacing with storage systems (case study: clouds) Basically, distributed storages are used as infrastructure for larger and more complex systems, such as Clouds. In the case of clouds, the infrastructure is composed of three main components: storage, networks and virtualization environments (Figure 18).


Figure 19: Open Stack storage integration

In general, storage systems have three main interfaces:  via hypervisor support (exposing a block device to the Linux guest);  via Linux guest driver support (e.g. Ceph’s Linux driver);  direct application access via HTTP, ReST or other standard or non-standard APIs. Open Stack, one of the most successful cloud systems, has two main modules that provide storage to guest virtual machines and applications: Swift (object storage) and Cinder (block storage). Swift is a complex component with many features, including high availability, high durability and a high degree of concurrent access, while Cinder offers only block level access to block devices via the iSCSI protocol. Other than these two, Ceph can be easily incorporated into Open Stack due to its multiple APIs: RBD (block access), ReST, librados and Ceph-FS (using FUSE or the Ceph Linux driver); it also has support in QEMU (Quick EMUlator) and KVM (Kernel-based Virtual Machine) and offers block device level access to Linux guests (Figure 19). GPFS can be used through the GPFS Linux driver; Lustre lacks direct support, but it can be accessed as NFS through an NFS server – hypervisor support. In the future, Hadoop will be incorporated into Open Stack for its Big Data support.
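As an illustration of the direct application access path, the sketch below uses python-rados (the Python binding of librados) to store and read back one object. It assumes a reachable Ceph cluster with a standard configuration file; the pool name and object name are hypothetical.

import rados  # python-rados, the Python binding of librados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('elearning')      # I/O context bound to one pool
ioctx.write_full('object-001', b'learning object payload')
print(ioctx.read('object-001'))              # -> b'learning object payload'
ioctx.close()
cluster.shutdown()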


CloudStack’s access methods are not very different from Open Stack’s. It has two types of storage: primary and secondary. Primary storage is used to expose a mount point to the Linux guest through the hypervisor (KVM or XEN) via iSCSI, Fibre Channel or NFS: RADOS can be used as primary storage through the RBD (RADOS Block Device) interface, Lustre via an NFS server, etc. Secondary storage is used to store templates, OS images and snapshots. Secondary storage can use Swift (the object based storage) from Open Stack with all its interfaces, or an NFS based file server. In essence, most cloud systems (e.g. Eucalyptus or OpenNebula) use SAN and/or NFS technologies to access shared storage, where the SAN-FS provides block storage to VMs, while NFS is used as secondary storage for application databases, backups etc. It is worth mentioning that there is also a trend to integrate all data/storage tiers.


3 CASE STUDY: INFRASTRUCTURE FOR ELEARNING ENVIRONMENTS

3.1 Motivation Usually, higher education is expensive, therefore eLearning platforms provide a way of delivering affordable education with a cost-effective investment. In the last decade, cloud computing research and adoption increased greatly due to its many advantages, including economic benefits, ease of management, power saving and so on. In essence, clouds provide the means to organize and deliver a wide variety of software services, including eLearning environments. Along with cloud computing, file system solutions were improved to meet the requirements imposed by the distribution characteristics of clouds. The distribution level of file systems, data management, data seek methods, network characteristics and many other features of distributed file systems influence the performance of eLearning environments. Therefore, I propose a high level architecture of cloud platforms adequate for eLearning software, outlining several advantages that overcome issues related to distributed eLearning, using Ceph as a data storage environment and several established network topologies [PIST, 2014/2].

Why Clouds and eLearning? eLearning environments have several issues, some of which may be better handled by cloud systems. One of the main issues is the infrastructure. Usually, eLearning infrastructure requires huge investments, thus clouds, being by definition an infrastructure provider, may be used to lay the foundation for eLearning applications. Clouds scale dynamically (on demand) and offer a collaborative environment as well [56] – an important feature for eLearning services. Basically, eLearning environments have huge databases of learning objects that can be stored in the cloud’s storage systems. Therefore, the way data is handled in the cloud influences different aspects of the eLearning mechanism, including searching and content delivery of learning objects. There is a wide variety of storage system implementations for clouds, encompassing many advantages. Other than the storage system, the way everything is linked together influences eLearning content delivery performance. There are many network topologies suitable for clouds, including fat-tree, hyper-tree, cube and hyper-cube. Researchers are still looking for better ways to link everything together (section 2.7). I took into consideration some characteristics that influence I/O performance: scalability, latency and capital costs.

3.2 Clouds storage systems and enhanced networks for eLearning environments Clouds aim to meet several characteristics that impose a set of main requirements on storage systems, including: sharing, scalability, transparency, high availability, fault tolerance,

concurrent and consistent data access and security (section 2.2). The design of storage systems may influence the delivery, maintenance and management of learning objects, while the network architecture influences the system I/O performance.

3.2.1 Data distribution for eLearning environments eLearning environments usually have tremendous databases of learning objects, which are organized per category, therefore a hierarchical organization will not help – an alternative could be semantic aware storage. Some researchers support the idea that hierarchical file systems will be replaced by semantic file systems [57], because besides solving categorization issues they also improve searching capabilities [58]. I propose a parallel and distributed file system architecture for a cloud storage system based on two layers: the first layer stores learning objects in a flat address space (for example, RADOS can be used), while the second layer provides semantic aware metadata for categorization and searching capabilities. Another characteristic of eLearning systems is that the learners (users) are geographically spread, using different network paths to access the cloud’s storage. To balance the traffic load, the system must take into account the learners’ profile distribution and split the data cluster into distribution domains, where basically each domain is accessed from a specific gateway. The object distribution has to be replicated in each of these domains, thus improving data access performance. There is an alternative solution based on delegation points – cacheable points of learning objects that are eventually distributed to a group of users. This method has a few drawbacks, including security delegation, local storage for cached learning objects, the lease method and so on. Users may access the learning system from their homes using just a web page, so even if the internet provider had such a method in place there would not be any visible gain; the user could still see a large latency. The consequence of this proposal is that the data has to be replicated on many storage devices – which can be considered acceptable, since the price per GB of storage is usually low. One more improvement could be to use the network links at optimal capacity and balance the load on the gateways, if the network equipment supports Quantized Congestion Notification. There is a novel method that balances the traffic load between network gateways, which uses a red-black tree to sort the available gateways based on congestion distance [59], or QCN-WFQR (section 6).


3.2.2 Clouds network A cloud can span a single or multiple data centers (public or private). A data center does not necessarily work as a cloud, thus specialized data centers have been developed. The design of a cloud’s network infrastructure capitalizes on the previous experience acquired with data centers (DCs) and adapts these designs to the new requirements. A data center connects servers to each other and to the outside world the same way a campus network provides users with the same type of access. Current DC networks follow a three layer architecture: core (provides the gateway to the outside world and high speed backplane connections between routers), aggregation (integrates many networking services, such as firewall, load balancing, tunnelling, intrusion detection/protection and security; usually this is the layer where L3 networks connect to L2 networks) and access (usually an L2 domain providing the necessary networking to the nodes, such as compute, storage and so on). The main drawback is that this architecture limits the allowed topologies to those that support a multi-layered organization, or to hybrid topologies. I propose a solution mostly based on fat-tree topologies with redundancy. Basically, it provides 1:1 subscription but is limited in height due to the necessity of providing an uplink capacity equal to the sum of all the downlinks. Due to this disadvantage, it is not used in a pure form; usually oversubscription increases along the way. Also, this topology is better suited when most of the traffic is north-south and is not recommended for configurations where east-west traffic is predominant, due to the fact that distant nodes cannot communicate directly. In an eLearning environment most of the data transfer is south-north, therefore this solution is a good choice. In order to accommodate an eLearning environment, a network composed of interconnected physical or virtual nodes needs to provide rapid elasticity. The most important node types considered for this proposal, given their elasticity characteristics, are:

 Compute: provides processing and memory resources for the eLearning environment. Elasticity is achieved through the use of virtualization, both for networking (by the use of virtual switches) and compute (by the use of virtual machines);  Storage: provides access to data. Storage for compute needs is local or NAS, while the eLearning objects are kept on a distributed storage system. Distributed storage systems offer intrinsic elasticity;  Switch: provides L2 packet switching and needs to support virtualizing network resources using VLAN, QinQ, VXLAN, IBM DOVE or similar technologies. The management plane needs access through a secure channel to the switches to make dynamic changes;


Figure 20: eLearning/ cloud – architecture proposal

 Router: provides L3 routing capabilities and high throughput links. Elasticity is not usually provided in current Cloud implementations, but is recommended due to the limitations imposed on the migration of the agents;  Intrusion Detection/Intrusion Prevention and Firewall: use packet inspection to detect/prevent intrusions. These nodes are part of the cloud network and provide intrinsic protection for the environment.

3.3 Clouds and eLearning environment: proposed architecture In the following section I propose a cloud architecture emphasizing different aspects related to the storage system and network topology that may influence eLearning environments, from an architectural and I/O performance point of view. The architecture is composed of three main infrastructure components: storage system, servers and network, and two main software components: the administration system and the multi-

agent eLearning system (Figure 20). The administration system manages server resources (creates, migrates and destroys virtual machines and virtual switches), while the eLearning multi-agent system creates, migrates and destroys agents, based on profile statistics.

3.3.1 System administration and eLearning components The eLearning software agents run on virtualized machines in the cloud environment. The entire system is managed by three main logic components: the resource agent (administrates virtual resources based on profile statistics), the management agent (administrates eLearning agents: creates, destroys and migrates agents) and the profiler agent (gathers information about resource utilization and provides statistics). In the absence of an eLearning system reference design, two main components were identified: the learning management system (delivering lectures to learners) and the learning content management system (development and reusability of lectures). Each system is a collection of software agents that can be created, migrated or destroyed based on their status and load.

3.3.2 The storage system In terms of storage, I propose an alternative to OGSA-GFS [60]: a two layered storage system based on Ceph components (a flat address space and semantic aware metadata), Figure 20. Basically, Ceph is composed of two main layers: the metadata cluster and RADOS. The metadata cluster manages file organization in a hierarchical form, while RADOS is a flat address space that distributes, replicates and load balances objects. I propose a metadata cluster that implements a semantic aware namespace instead of a hierarchical namespace. There are several methods to implement such a file organization: property-based, content-based or context-based [61], and several architectural directions: integrated, augmented and independent (native). I propose a native implementation with content-based semantics, because a hierarchical file organization brings no advantage, while the content-based architecture categorizes files and improves searching. Using Ceph/CRUSH, I propose a data distribution based on domains, and therefore I define a placement policy and cluster map that replicate each object three times in each domain, for high availability and fault tolerance.
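The sketch below illustrates the intent of this placement policy with a simplified, CRUSH-like deterministic mapping. It is illustrative only – it is not Ceph’s actual CRUSH algorithm – and the domain and OSD names are hypothetical; every object gets three replicas in each distribution domain, computable by any client without a central lookup.

import hashlib

DOMAINS = {"domain-a": ["osd.a%d" % i for i in range(8)],
           "domain-b": ["osd.b%d" % i for i in range(8)]}

def place(object_name, replicas=3):
    # deterministic, map-driven choice: every client computes the same result
    digest = int(hashlib.md5(object_name.encode()).hexdigest(), 16)
    placement = {}
    for domain, osds in DOMAINS.items():
        start = digest % len(osds)
        placement[domain] = [osds[(start + r) % len(osds)]
                             for r in range(replicas)]
    return placement

print(place("lecture-42.pdf"))  # three OSDs chosen per domain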

3.3.3 The network profile The high throughput traffic is mostly read-only and south-north, from the system to the learners, therefore there will be a small amount of replicated or rebalanced data (low east-west throughput

demands). Nevertheless, the RADOS monitors, which manage the OSD cluster by keeping storage data synchronized and collecting the state of each OSD, should have low latency. A hierarchical topology facilitates south-north traffic, along with a new link from the monitors group to each OSD group (Figure 20) to decrease the latency between monitors and OSDs. Also, it is worth mentioning that if the system grows, aggregation layers may be added, so in a hierarchical topology the east-west latency grows.

3.3.4 Real world considerations Storage systems are built on elastic platforms (e.g. Data Direct Network) including storage devices and network equipment. Each platform has a set of elements with a set of features (file, block or object storage devices, InfiniBand connections and so on) which determine the platform’s price. Performance analysis of storage systems depends on a wide variety of factors, therefore I split them into several groups, as follows:

 Physical storage profile: disk RPM, data transfer speed, disk cache size and speed, cache type, block size, connection type (e.g. SATA), RAID configuration etc.  Network profile: transfer protocol performance (Fibre Channel, InfiniBand or Ethernet), TCP window size, packet retransmissions etc.  Storage system profile: replication degree, replication method (e.g. RAID, software replication) etc.  Client system profile: number of CPU cores, core speed, page size, system load etc. There are quite a number of storage system alternatives, such as Lustre, Gluster, Ceph, GPFS, GFS/Hadoop and so on. I proposed a solution based on Ceph, mainly on RADOS, which I intend to use in a semantic aware system. Ceph’s performance has been analysed [11] using IOR (Interleaved Or Random, a tool for testing the performance of parallel file systems using scripts that emulate the behaviour of parallel applications). For OSDs, three file system layouts (among them BTRFS and XFS) were tested, where BTRFS had the most balanced results for I/O access. If the number of OSD replicas increases, the throughput has a hyperbolic profile, as expected; likewise, if small chunks of data are used the results are poor, as expected. Roughly, RADOS performance was about 70% of the native hardware [11].


4 OPTIMIZATIONS OF DISTRIBUTED STORAGE SYSTEMS BASED ON HARDWARE ACCELERATORS

4.1 Motivation Due to the advent of high speed Ethernet (40 Gbps, 100 Gbps or even 800 Gbps), hardware systems along with software architectures must adapt to the newly increased traffic. The capacity of servers to handle such high traffic volumes has been exceeded. Thus, increasing port speed and CPU computation power, along with leveraging different cache levels, were the first techniques used to improve throughput and latency at the server side. But in recent years the servers have failed to meet even higher traffic demands due to many factors, such as network convergence or high volumes of real time traffic (e.g. voice or video) [31]. One improvement would be to increase the degree of parallelism (i.e. increase the number of nodes in clusters and split tasks at a finer granularity), but this leads to increased complexity, power consumption and maintenance costs. Another improvement is to increase the performance of each node by offloading the CPU cores and moving specific tasks to dedicated hardware engines. Hardware engines have a double impact: first, being dedicated modules, they perform their role faster than software running on general purpose cores; second, they significantly reduce CPU utilization, thus increasing system performance and freeing CPU processing time. Intel investigations revealed several I/O bottlenecks at the receiver side, divided into three overhead categories: system overhead, TCP/IP processing and memory access [31]. To address these overheads, Intel developed the Accelerated High-Speed Networking technology (I/OAT) – a set of features that reduce the receiver side packet processing overhead (e.g. split headers, a DMA copy engine and multi-queue usage to receive frames). Intel’s solution showed improvements of about 38% in CPU utilization, while the number of transactions processed increased by 14% and the throughput by 12% [32]. I followed a similar idea to increase the performance of each node in a cluster by offloading packet processing at the receiver side from the general purpose cores [PIST, 2013]. Offloading comes with flow classification and multi-queue distribution, and by using Accelerated Receive Flow Steering [33] the parallelism of multicore systems is leveraged. In other words, the hardware engines are instructed to classify and distribute flows considering the SMP characteristics of multicore systems, increasing the performance of parallel I/O applications running on the same machine. Also, by adding queue weight adjustments, I propose a method of reducing the latency of sensitive flows, thus making the system more responsive.


Figure 21: Block diagram of a generic packet processing SoC

4.2 Integrated packet processing engines In general, a System on Chip (SoC) integrates a set of dedicated hardware modules along with general purpose cores into a single chip. In this thesis, I refer to a packet processing SoC as a chip that integrates multiple hardware engines dedicated to network specific tasks usually handled by general purpose cores, and I use the term offload (or hardware offload) for the migration of particular tasks from general purpose cores to dedicated engines, freeing processing time. Figure 21 presents a high-level block diagram of a generic packet processing SoC, emphasizing several important hardware engines used further:

1. Parser: dedicated engine for parsing frames, identifying protocols, checking packet integrity and identifying malformed frames; 2. Classifier: dedicated engine for classifying and distributing frames to specified flows; 3. Packet Manipulation: dedicated engine for frame manipulation (e.g. mangling frame headers to meet different protocol requirements); 4. Scheduler: dedicated engine that implements scheduling algorithms to meet different frame handling policies; 5. Memory Cells Manager: dedicated engine that handles memory allocations required by the hardware modules; 6. Cryptographer: dedicated engine that implements different cryptographic algorithms (e.g. AES, CRC and so on); 7. TCP: dedicated engine for offloading TCP tasks (e.g. segmentation).


Figure 22: Typical stages for receiving packets

A packet processing SoC may have many programmable flows; each transmitted or received frame is distributed to a specific flow, where each flow can be seen as a path through a graph with hardware engines as nodes and queues as edges. Also, the hardware engines are complex modules architected to handle near line rate traffic of small frames (e.g. 64 B/frame), thus most of them have multiple processing units to handle tasks in parallel.

4.3 Hardware packet processing Investigations of packet processing flows revealed that the highest resource consumption is on the receiver side rather than the transmitter side (where software implementations seem to be enough [32]). Based on the SoC block diagram presented earlier (Figure 21), I consider the following main stages² for each received frame (Figure 22): 1. Stage 1: frame headers and/or preset sections from the payload are passed to the packet processing engines; 2. Stage 2: the frame headers and/or preset sections from the payload are parsed and matched against preconfigured protocols, establishing packet integrity and identifying malformed frames; 3. Stage 3: the frame is classified and dispatched to preprogrammed flows based on the parse results (i.e. the frame is pushed to a specific queue or group of queues);

²Designing a generic packet processor that could handle any number of flows with any number of engines per flow, in any order, without performance penalty is practically impossible, thus boundaries are usually imposed by the hardware to preserve performance and meet specific design requirements.


Figure 23: Hardware accelerated models – (a) kernel space implementation; (b) user space implementation

4. Stage 4: based on preprogrammed scheduling algorithms (e.g. weighted queues, Decreasing Time Algorithm, Round Robin, Shortest Job First or others) the frame is scheduled for the next consumer (graph node: hardware engine or communication port); 5. Stage 5: the frame is pushed to the next node in the processing graph.

4.4 Hardware accelerated cluster nodes Most clusters use Linux as the operating system, which by default does packet processing in software (usually in the TCP/IP stack). By using dedicated hardware, packet processing tasks are moved to hardware engines and the OS environment must be adapted to this change. Basically, the way offloading is implemented influences system performance characteristics, such as throughput and latency. An OS is the usual solution for the coexistence of different applications on the same hardware system. If the applications are operating systems, then virtualization (i.e. hypervisors such as KVM or XEN) is the solution; if groups of applications (such as independent network stacks) need a specific isolation degree, then Linux containers (LXC) are the solution. Given the above motivations, I propose two less restrictive models as alternatives for the coexistence of applications with high and low traffic demands, using hardware acceleration and

parallel processing of flows: Per Core Cluster Node and SMP Cluster Node, with two implementation directions: in kernel space and in user space (Figure 23). The Per Core Cluster Node model is characterized by the fact that each application instance is bound to a core, thus the scalability is limited to the number of CPU cores. Modularity is achieved by dividing tasks at a smaller granularity using dedicated threads (i.e. consumers) that are not scheduled on different cores. Each consumer is bound to a specific hardware queue or group of queues that provides only specific frames to the consumer thread. The hardware engines must be instructed to classify and distribute flows to match the consumers. The SMP Cluster Node model is similar to the former, with the difference that the workload is distributed over a set of cores using the flow classification capabilities of the hardware engines. One solution would be to let the hardware engines distribute the workload by pushing frames to cores in a “round-robin” fashion (using hash based functions); another is to let the OS scheduler flatten the workload by scheduling consumers among the core set and instruct the hardware engines to consider where the specific consumers are scheduled and push frames accordingly. By configuring the hardware engines in concordance with the software architecture and scheduler affinity, multiple instances of these models, belonging to different applications in a converged system, may coexist with increased performance. The SMP Cluster Node model is clearly well suited for high traffic volume applications with multiple consumers; it is also the generic form, which can be configured to behave as a Per Core Cluster Node if the core affinity is set to only one core. To leverage the parallelism of multicore systems, I propose the usage of Accelerated Receive Flow Steering (ARFS) [33]: the frame distribution has to consider where the specific consumers were scheduled to run. This requirement imposes two things: first, each possible core where a consumer could be scheduled must have a specific queue or group of queues where the hardware engine is instructed to push the frames; second, the hardware engines must be able to steer and push the frames accordingly. In other words, the hardware engines collaborate with the OS scheduler for workload balance while achieving a better performance degree. Also, I propose two different implementations of these models: one uses drivers in kernel space (more efficient) and the other uses drivers in user space (a small performance penalty, while assuring rapid development and a flexible license policy). Figure 23 – (a) presents the high level architecture of the kernel space implementations for both models, while Figure 23 – (b) depicts the user space implementations.
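A minimal sketch of the consumer binding idea is shown below (my illustration, not the thesis implementation): one consumer thread is pinned per core with the Linux sched_setaffinity call, each draining its own software queue, which stands in here for a hardware frame queue that the engines were instructed to fill for that core.

import os
import queue
import threading

NUM_CORES = os.cpu_count()
rx_queues = [queue.Queue() for _ in range(NUM_CORES)]

def handle(frame):
    pass  # application-specific processing goes here

def consumer(core_id):
    os.sched_setaffinity(0, {core_id})    # pin the calling thread to one core (Linux only)
    while True:
        frame = rx_queues[core_id].get()  # blocks until a frame arrives
        if frame is None:                 # shutdown sentinel
            break
        handle(frame)

threads = [threading.Thread(target=consumer, args=(c,), daemon=True)
           for c in range(NUM_CORES)]
for t in threads:
    t.start()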


An issue for the user space model is the access method to a TCP/IP stack. By default, Linux has the network stack implemented in the kernel, thus in the case of a user space implementation the applications have to deal with raw frames and the lack of connection-oriented protocols. There are a few solutions for adding TCP/IP support: one would be to use network tun/tap interfaces (each frame received by the user space driver is inserted into the Linux TCP/IP stack through these interfaces and the frame’s payload is then received back via sockets – the endpoints of an inter-process communication flow across a computer network); another solution would be to use an off-the-shelf user space TCP/IP stack (this method is not encouraged by the open source community; nevertheless, in some cases a user space stack can provide better performance than the native stack [34]). Another issue for the user space model is the method of signaling the arrival of frames. The Linux kernel does not have a standard framework to signal interrupts to user space, therefore polling³ is one solution that does not violate Linux directives. Basically, the consumer puts itself to sleep if the queue is empty; when it is scheduled to run, it checks the queue and handles the frames within the time quantum set by the scheduler. To decrease the latency of sensitive traffic, I propose a solution based on queue weighting. Basically, the queues associated with the most sensitive flows have higher weights and are scheduled first, thus the sensitive flows are handled first. How flow sensitivity is established depends on the application’s purpose; as an example, I propose a solution for Ceph that classifies the traffic of Monitors and OSDs into three different classes, each associated with a different group of queues with specific weights.
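The polling consumer described above can be sketched as follows (illustrative only; the frame budget and sleep interval are assumptions): the loop drains at most a fixed budget of frames per scheduling quantum and yields the CPU when its queue is empty, instead of busy-waiting.

import time

def handle(frame):
    pass  # application-specific processing goes here

def poll_loop(rx_queue, budget=64, idle_sleep=0.0005):
    while True:
        processed = 0
        while processed < budget and not rx_queue.empty():
            handle(rx_queue.get_nowait())
            processed += 1
        if processed == 0:
            time.sleep(idle_sleep)  # queue empty: sleep instead of spinning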

4.5 Case study: Hardware accelerated Ceph with QorIQTM As a case study, I used the QorIQTM P2041 integrated packet processor to decrease the latency of sensitive flows and increase the performance of Ceph’s nodes: OSDs, monitors and metadata servers. The next subsections comprise a high-level view of QorIQTM, the particularities of the accelerated Ceph nodes and micro-benchmark results.

³POSIX OS signals are a faster inter-process communication mechanism that could be used to alert user space processes instead of polling.


4.5.1 QorIQTM packet processors QorIQTM combines multiple general purpose cores (of Power Architecture⁴) with various hardware engines, providing a very flexible infrastructure (Data Path Acceleration Architecture – DPAA) for processing high volumes of traffic at high speeds. Among the multiple hardware engines within DPAA, I focused on the following three components: the Queue Manager (QMan [35], pp. 6-1/299), the Buffer Manager (BMan [35], pp. 7-1/531) and the Frame Manager (FMan [35], pp. 8-1/594). QMan is the means by which data passes between hardware modules, BMan’s role is to reduce the software overhead of memory management, and FMan has three main functions (PCD): Parse frames to check packet integrity and identify the protocols within frame headers, Classify frames by means of generated keys (KeyGen [35], pp. 8-395/1557) based on the parsing results, and Distribute frames to queues following different distribution profiles. Apart from standard protocols (i.e. hard-wired parser capabilities), FMan can be configured to parse and detect up to three proprietary protocol headers or application defined fields by means of the Software Parser ([35], pp. 8-391/982) feature. It uses NetPDL (an XML-based language for describing packet headers [37]) to define non-standard fields, configured by the Frame Manager Configuration Tool. Nevertheless, the Software Parser has boundaries ([36], pp. 24) that significantly reduce its usability: each user defined field has to be embedded between standard protocols (basically, nested user defined fields are not supported), and each user defined field must follow a different standard protocol (meaning that multiple user defined protocols on top of the same standard protocol are not allowed). FMan’s processing flow for received frames starts with the parser (hard-wired and software), which identifies the protocols found in the packet headers; a 56-byte key is then generated based on the parse results and a classification plan, and based on this key a specific queue is selected (i.e. distribution). Despite the Software Parser’s reduced functionality, the idea of a hardware engine capable of recognizing user defined fields, on which flow classification may be based, is valuable and can be used to improve the performance of network based systems.

⁴Power Architecture is a type of microprocessor architecture with RISC instruction sets (https://www.power.org/).


Figure 24: Ingress/Egress flow chart

In the DPAA context, the unit of transmitted data is a frame (described by a frame descriptor), stored in frame queues (i.e. FIFO structures of frames). Frame queue granularity and creation depend on the software architecture and application purposes. Furthermore, frame queues are stored in work queues (to which QMan applies different scheduling policies: strict priority based or round robin), grouped in channels of 8 prioritized items (Figure 24). Each channel identifies one hardware entity (e.g. a CPU core, hardware engine or Ethernet port). Besides the hardware capabilities, Linux has its own traffic control, called queueing disciplines (qdiscs). Filters and traffic policers can be attached to ingress traffic using Linux traffic control, while egress traffic is directed into FIFOs on which different scheduling mechanisms can be applied (in this design, however, I have not used any Linux queueing discipline).

4.5.2 Accelerated Ceph’s nodes By combining the hardware engine capabilities with the software architecture, the performance of each node in a cluster can be increased, therefore fewer nodes are required for the same performance, which leads to smaller capital costs. Also, with the advent of converged

systems, each cluster node (following the models presented in section 4.4) may perform different roles at the same time. One key to increasing the performance of each node is the granularity at which the traffic of each Ceph service is divided into classes handled by specific consumers – following the models presented in section 4.4. Basically, high traffic volumes need to be divided among multiple consumers, queues and a related set of cores. In the RADOS context, OSD nodes have the highest traffic volume (mostly I/O), while MONs need low latency traffic, since they handle OSD failures. I propose A-OSD (Accelerated OSD), A-MDS (Accelerated MDS) and A-MON (Accelerated Monitor). For A-OSD I propose an SMP Cluster Node model, to be able to handle high I/O traffic, and the same for A-MDS, to be able to handle journals and metadata (stored by RADOS) and queries from clients. A-MON needs neither high processing power nor ports that support high traffic volumes, so in the context of converged systems monitors may share the node with different other applications; therefore I propose a Per Core Cluster Node model. Accelerated nodes can be implemented using QorIQTM packet processors with the FMan PCD capabilities and QMan scheduling features. By instructing FMan to classify and distribute frames according to the particularities of each class (e.g. see section 4.5.3 for the A-RADOS classification), a performance increase is achieved. For A-OSD and A-MDS I propose a uniform distribution over queues, distributed further uniformly over cores (in case ARFS is not used); this way, clients are distributed fairly across the system rather than facilitating the traffic of a single client. In case ARFS is used, the queues are uniformly distributed over the cores, but when a particular frame is pushed to a core’s queue, the hardware engine considers where the consumer was scheduled to run. For the classification of system traffic (i.e. system consistency messages, journaling and metadata) I propose using the user defined flow recognition facilitated by the Software Parser. A-MON has only user defined flows, therefore the only solution is the usage of the Software Parser capabilities.

4.5.3 Accelerated RADOS based on queue prioritization RADOS is composed of two distinct clusters: one of monitors and one of OSDs, where the monitors are responsible for managing a large number of OSDs. Two important keys of data clusters (including RADOS) are data reliability and availability ([22], pp. 119). Usually, to make consistent data available to all parties, messages must be handled in a timely fashion. For the above reasons, I propose an accelerated RADOS (A-RADOS) implementation based on packet processing engines. By classification and prioritization of RADOS messages, I

aim to decrease the latency of sensitive operations (listed in Table 4) and improve overall cluster responsiveness, thus improving data availability. In the monitor context, I consider that the most important message types are the ones that assure a consistent view of the system and are sensitive for cluster stability, followed by map distribution to the monitors, which are in turn responsible for distributing it further to the entire cluster. Therefore, I propose the following monitor traffic classes:

 Traffic Class I: election and lease messages;  Traffic Class II: cluster-map update distribution to Monitors;  Traffic Class III: cluster-map distribution to all other system parties. In the OSD cluster context, I consider that the messages required for system consistency are the most essential, followed by I/O traffic (i.e. reads/writes and replication strategies), while storage recovery is the least sensitive operation, so I propose the following classification:

 Traffic Class I: heartbeats and cluster-map update propagation;  Traffic Class II: data I/O traffic (e.g. reads/writes) and data replication;  Traffic Class III: OSD recovery. Prioritization of traffic classes is achieved by configuring weights on flow queues, where each flow queue (or group of flow queues – depending on the hardware architecture) deals with a specific class. Thus, considering the above classifications, the first traffic class has the highest weight and its flow queues are scheduled first, while the last traffic class has the lowest weight and its queues are scheduled last; the latency of sensitive traffic is therefore decreased while achieving better system responsiveness. Using a QorIQ packet processor, the prioritization of the three traffic classes is achieved by splitting the work queues into three groups and configuring weights to reflect the traffic prioritization listed above.
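The effect of the weighting can be sketched with a simple software weighted round-robin over the three classes. This is illustrative only – the actual prioritization is done by QMan work queue weights in hardware – and the weight values below are my assumptions.

from collections import deque

queues = {"class-1": deque(), "class-2": deque(), "class-3": deque()}
weights = {"class-1": 8, "class-2": 4, "class-3": 1}  # assumed weights

def schedule_round():
    # serve up to 'weight' frames per class, highest weight first
    served = []
    for cls in sorted(queues, key=lambda c: -weights[c]):
        for _ in range(weights[cls]):
            if queues[cls]:
                served.append(queues[cls].popleft())
    return served

queues["class-1"].extend(["heartbeat", "map-update"])
queues["class-2"].extend(["write", "replica-write"])
queues["class-3"].append("recovery-chunk")
print(schedule_round())  # class I frames are always dequeued first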

4.5.4 Micro-benchmark results I consider that the number of clients served per second by an MDS (the Ceph application chosen as an example) reflects its performance. For testing purposes, I measured how much time a particular MDS request takes (in core ticks) and simulated, using the Per Core Cluster Node model, 4 different MDS nodes that handle requests in parallel. Using this setup, I ran three different tests along with kernel adjustments: without hardware acceleration (i.e. without frame classification and distribution), and using Receive Flow Steering (software implementation only) with and without core affinity.


Requests per second
Model         node 1   node 2   node 3   node 4   Avg.   CPU util.
no acc.       6867     6877     6860     6882     6871   25%
RFS           7109     7117     7117     7119     7115   25%
RFS affinity  7672     7692     7717     7725     7701   27%
Table 8: Accelerated MDS micro-benchmark

Requests per second
Model                 Class I   Class II   Class III   Total
no prioritization     2851      2902       2959        8712
queue prioritization  6556      1118       348         8022
Table 9: Accelerated RADOS micro-benchmark I

Requests per second
Model                 Class I   Class II   Class III   Total
no prioritization     0         0          6957        6957
queue prioritization  0         0          6858        6858
Table 10: Accelerated RADOS micro-benchmark II

As the hardware platform I used a QorIQ P2041 machine with 4 general purpose cores and hardware acceleration for flow steering, flow distribution and queue management (DPAA). The platform supports network connections of 1 Gbps and 10 Gbps, but for compatibility with the lab systems I used only 1 Gbps connections. The topology was very simple, composed of one PC with a 1 Gbps Ethernet card as the client machine and the P2041 as the A-MDS. For each test, I counted the number of handled requests per second (i.e. transactions). From the test results listed in Table 8, the improvement in terms of transactions was about 3.5% for the test with RFS support and no affinity, at comparable CPU utilization, and about 12% with affinity. For A-RADOS I ran two kinds of tests: running concurrent requests of all three class types, and running only class III. The former test type shows that the highest priority class is handled before the other classes when hardware queue prioritization is used, while the latter shows that if only class III traffic is running, the number of handled requests is comparable (Table 9 and Table 10).


5 NETWORKING CHARACTERISTICS IN CONVERGED INFRASTRUCTURE The main reasons for Converged Networks (also known as Unified Network Infrastructures) are economic in nature, but also technological [38] [39] [40]: simplicity, cost saving, elasticity, requirements imposed by virtualization, better usage of server resources, simplified management and so on. A defining characteristic of I/O protocols for storage devices (such as SCSI) is that they do not handle lost data in a timely fashion, therefore a lossless communication environment is fitting. Usually, SANs use Fibre Channel (used by approximately 80% of the data center storage market [40]), which by design is a low latency, lossless, high speed environment (section 5.1). Ethernet, on the other hand, is by design a best effort communication environment which, along with the IP protocol, provides an end-to-end network for reliable transport protocols, such as TCP. Converged support for LANs and SANs imposes a set of Ethernet enhancements (i.e. the Data Center Bridging group of protocols) that enable a lossless medium. Also, Ethernet became a viable solution due to its advantages over FC networks, such as the supported high speeds (10 Gbps, 100 Gbps, 400 Gbps or even Intel’s new 800 Gbps, while FC supports up to 2, 4, 8, 16 or, just arriving, 32 Gbps) and lower capital costs. Nevertheless, there are several options that can be used with vanilla Ethernet too (e.g. iSCSI, iFCP or FCIP), but they require reliable protocols underneath to solve contention and congestion issues (e.g. TCP). Basically, these approaches come at a lower cost, but bring a significant performance penalty due to the extra encapsulation. Usually, they are used for small and medium businesses along with low cost Ethernet environments [41]. The Fibre Channel over Ethernet (FCoE) protocol enables the transmission of FC frames over Ethernet and relies exclusively on the DCB [42] extensions, which today comprise the following protocols:

 Priority Flow Control (PFC);  Enhanced Transmission Selection (ETS);  Data Center Bridging eXchange (DCBx);  Quantized Congestion Notification (QCN). However, these extensions can be used for any loss-sensitive application that does not have contention or congestion control mechanisms and requires only L2 protocols. There are even more ingenious ideas, such as improving the TCP incast communication problem (i.e. multiple servers simultaneously transmit TCP data to a single aggregator: TCP performance is degraded as a consequence of retransmission timeouts caused by packet loss due to overwhelmed queues at the network bridge level) [43].


In the next subsections I provide a brief description of these extensions, with more focus on Quantized Congestion Notification (QCN) and a few proposed improvements.

5.1 Fibre Channel Fibre Channel is a set of protocols developed by the American National Standards Institute (ANSI) [55]. Basically, it aims to support the fast transfer of high volumes of data over an interconnection topology called the Fibre Channel Switched Fabric (which includes characteristics such as many-to-many connectivity, device lookup, security, redundancy and zoning). In the Fibre Channel context, at the basis of the links lie two different ideas of transporting data to and from peripherals: channels and networks. Basically, channels are high speed, low overhead transport media, while networks are slower, with higher overhead, but very flexible. In terms of architecture, Fibre Channel is split into 5 layers, as follows:

 FC – 4 Protocol-mapping layer: SCSI, IP or FICON, etc.;  FC – 3 Common services layer: encryption, RAID redundancy algorithms, etc.;  FC – 2 Network layer: main framing protocols;  FC – 1 Data link layer: byte encoding/decoding;  FC – 0 Physical link: PHYs, links, etc. To enable different types of traffic, Fibre Channel supports three main service classes, each with different characteristics, as follows:

 Class 1: dedicated bandwidth, ordered delivery, acknowledged end-to-end;  Class 2: shared bandwidth, acknowledged end-to-end;  Class 3: shared bandwidth, lossless transmission through buffer credits. Usually, class 3 is used in SAN based storage systems, because it offers a lossless transfer medium and this way it eliminates the time penalty of SCSI recovery. Basically, recovery sequences introduce large delays in case of command loss due to congestion points. Losslessness is provided by a buffer credit mechanism, where the receiver indicates the number of frames that can be transmitted without it being forced to drop any of them – buffer to buffer flow control. Based on a similar idea, an enhancement for Ethernet was developed: Quantized Congestion Notification, detailed in section 5.2.4. QCN, together with several other protocols, was designed to enable a lossless Ethernet (a viable alternative to Fibre Channel), section 5.2.


5.2 Data Center Bridging

5.2.1 Priority Flow Control Ethernet flow control (the IEEE 802.3x PAUSE frame [44]) was first introduced in 1997; the receiver sends PAUSE frames to inform the sender not to send any more packets during the pause time. But this implies that multiple flows with different QoS requirements cannot coexist on the same network segment. To enable this scenario, distinct pause frames should control distinct communication channels independently within the same network segment (IEEE 802.1Qbb/PFC [45]); therefore, 8 distinct traffic classes were created, defining a quality of service prioritization scheme with 8 different priorities managed by separate queues (IEEE 802.1p/802.1Q).

5.2.2 Enhanced Transmission Selection PFC guarantees a lossless environment, but each traffic flow has different bandwidth requirements based on Service Level Agreements (SLAs) or architectural restrictions (e.g. the operating system implementation), therefore 802.1Qaz ETS [46] enables flow classification into traffic classes (e.g. LAN traffic, I/O SAN traffic, network management traffic such as LLDP, cluster management traffic such as heartbeats, multicast and so on). The main idea is that all flows mapped to a class share the same assigned bandwidth and the same class characteristics (e.g. loss). ETS and PFC are built to coexist, and the 8 priorities (defined by PFC) can be mapped to different traffic classes, where each class is assigned a bandwidth percentage at 1% granularity.

5.2.3 Data Center Bridging eXchange In order to achieve interoperability between multiple network domains with different capabilities, peers exchange information about supported features and configurations (PFC, ETS and application priority configuration TLVs), then select and agree on a specific configuration [47]. Basically, it uses the Link Layer Discovery Protocol [50] to exchange attributes embedded in Organizationally Specific TLVs (OUI) between partners.

5.2.4 Quantized Congestion Notification Quantized Congestion Notification [48] ([49], pp. 1071) enables peer-to-peer congestion management by dynamically adjusting throughput in response to changing bottlenecks, in the absence of an upper layer congestion management protocol such as TCP window sizing (i.e. dynamically determining how many frames to send at once without acknowledgements).


Figure 25: QCN sampling probability

It is composed of three main components: congestion, reaction and reflection points. Congestion Points (CPs) sample incoming frames and notify the sources (i.e. Reaction Points) of the sampled frames by sending Congestion Notification Messages (CNMs), so that they adjust their rates when the queue utilization is increasing too fast or is not decreasing fast enough against a preset equilibrium value (Figure 26). Congestion Points compute the feedback by combining the first derivative of the queue utilization (i.e. rate excess) with the instantaneous queue utilization relative to a considered equilibrium (i.e. queue size excess):

F_b = −(Q_offset + ω·Q_δ), where

Q_offset = Q − Q_equilibrium, (1)

Q_δ = Q − Q_old, and ω is the rate weight (taken to be 2 for baseline simulations).

At each sampling event, the sampling probability is updated as a linear function of |F_b|, with a minimum probability of 1% and a maximum of 10% (e.g. 100 frames/sampling event reflects a 1% probability), Figure 25.
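The congestion point computation of Eq. (1) and the sampling probability update can be sketched as follows (a minimal illustration; the queue values and the scaling of F_b to its quantized range are assumptions for the example):

W = 2.0  # rate weight, baseline value from the standard

def feedback(q, q_old, q_eq):
    q_offset = q - q_eq            # queue size excess, Eq. (1)
    q_delta = q - q_old            # rate excess (first derivative)
    return -(q_offset + W * q_delta)

def sampling_probability(fb_quantized, fb_max=64, p_min=0.01, p_max=0.10):
    # linear in |Fb|, clamped between 1% and 10%
    return p_min + (p_max - p_min) * min(abs(fb_quantized), fb_max) / fb_max

fb = feedback(q=30000, q_old=26000, q_eq=26000)  # bytes; negative -> congestion
print(fb, sampling_probability(12))              # a CNM is sent when fb < 0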


Figure 26: Quantized Congestion Notification (QCN) – high level view

Figure 27: Reaction point’s rate limiter


Reaction points’ rate limiters (RLs) adjust current rates (CRs) to reach targeted rates (TRs) using a byte counter, in phases (Figure 27), as follows:

 First, when feedback is received, the RL sets TR to CR and CR is decreased by no more than half, as shown below:

TR = CR

CR = CR·(1 − G_d·|F_b|), (2)

where G_d is chosen so that G_d·|F_b_max| = 1/2; F_b has a 6-bit quantization, therefore F_b-quantized = (F_b / F_b_max)·64, so G_d = 0.0078125 (= 1/128).

 Then, it enters the rate increase stage, composed of three phases: Fast Recovery (FR), Active Increase (AI) and Hyper-Active Increase (HAI). In the FR phase the reaction point tries to recover the lost bandwidth, while in the next phases it probes for extra bandwidth:

TR = TR + R

CR = (CR + TR)/2, (3)

where R is 0 in Fast Recovery, 5 Mbps in Active Increase and i·50 Mbps in Hyper-Active Increase (where i is the i-th cycle counted by the BC and the timer), for a 10 Gbps line rate.

The rate increase phases have 5 cycles each. At the end of each group of 5 cycles, if no F_b has been received, the RL enters the next phase (in the HAI phase, the RL throughput can reach line rate). In cases where the CRs are very small and the BC cycles measured in time are very large, the phases are controlled by timers. The timer is similar to the BC and counts 5 cycles of T ms each (e.g. T = 10 ms for a 10 Gbps line rate) in FR, and of T/2 ms in AI. Considering both the byte counter and the timer, the RL is in the FR phase if both the BC and the timer are in FR, and CR is updated when at least one of them completes a cycle. The RL is in the AI phase if at least one of them completes a group of 5 cycles, and finally the RL is in HAI if both the BC and the timer are (Figure 27).


From a network architecture standpoint, all congestion-aware points are grouped in Congestion Notification Domains (CNDs), and flows that are congestion controlled are named Congestion Controlled Flows (CCFs), Figure 26.

5.3 QCN weaknesses and improvements Due to the existence of different types of devices within a network, the coexistence of congestion-aware and congestion-unaware segments is a challenge. According to the standard, this situation is solved using priority regeneration tables: when frames are received from congestion-unaware segments with a priority that is congestion controlled, the priority is remapped to a different, unused one, which implies that at least one priority must be kept for this scenario (i.e. the alternate non-CNPV priority, [49], pp. 1071). The weakness is revealed when automatic configuration is enabled and alternate priorities are chosen automatically; according to the standard, the chosen priority is the lowest unused in the CNDs (Figure 28). This may affect QoS policies used for other traffic types and lead to unwanted traffic injections into that QoS domain from edge ports. One solution would be to check each bridge within the topology and verify all QoS requirements and priorities; this is impractical, because a topology may have a high number of network devices and the configuration would no longer be automatic. The proposed solution contains a hybrid defense mode choice that allows both manual and automatic settings to use the same alternate priority value (Figure 28), so the administrator is able to choose the alternate priority used by the automatic configuration and avoid QoS disruptions [Patent #8891376].


Figure 28: Quantized Congestion Notification – Congestion Notification Domain defense


Another weakness of QCN is the rate unfairness of multiple flows sharing the same bottleneck link. This comes from the fact that when a congestion point computes the feedback it does not take into consideration the rates of the reaction points. This issue has been addressed by two proposed algorithms: Approximate Fairness with QCN (AF-QCN [51]) and Fair Quantized Congestion Notification (FQCN [52]).

5.3.1 The Fair QCN Solution The main idea of FQCN is that every flow has a share rate with respect to the congested queue, and the feedback is computed per flow, proportional to the exceeded rate. For each congestion point (i.e. egress queue), at each sampling event t the feedback is computed as shown in Eq. (1). At moment t the total bytes received for each flow $f_i$ are denoted $B_i(t)$. Also, each flow $f_i$ has a preconfigured weight $w_i$; if all flows are fair with respect to each other, the weight is 1.

Furthermore, the fair share for each flow $f_i$ is computed proportionally to the weights and received bytes, as shown below:

$M_i(t) = \frac{w_i}{\sum_j w_j} \cdot \sum_j B_j(t)$   (4)

Then the culprits are identified by comparing the shares $M_i(t)$ with the actual rates $B_i(t)$; if the share is exceeded, the flow is added to the culprit list, denoted $S^H$.

Further, the share is fine-grained by considering only the culprits' shares, as shown below:

$M_i^H(t) = \frac{w_i}{\sum_{S^H} w_j} \cdot \sum_{S^H} B_j(t)$   (5)

So, the fine-grained shares $M_i^H(t)$ are compared with the actual rates $B_i(t)$; if the share is exceeded, the flow is added to the fine-grained culprit list, denoted $S^R$.


The actual feedback sent to each culprit is computed with respect to their shares, as shown below:

$\Psi_{Fb}(i, t) = \frac{B_i(t) / w_i}{\sum_{S^R} B_j(t) / w_j} \cdot \Psi_{Fb}(t)$   (6)

where $\Psi_{Fb}(t)$ is the feedback quantized to 64 levels (6-bit quantization).

Summarizing, the FQCN algorithm aims at two things: first, it identifies the overrated flows; second, the feedbacks are individual to each source, using per-flow monitoring. In my opinion, there are a few implementation drawbacks: the per-flow byte counter must be supported by hardware, because software implementations based on interrupts or polling may have problems at high rates, and the flow tables can be very large, in which case their management is hampered.
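A compact Python sketch of the FQCN computation (Eqs. (4)-(6)) follows, assuming per-flow byte counters and weights are available; the names and data layout are illustrative, not the FQCN authors' implementation:

def fqcn_feedbacks(bytes_rx: dict, weights: dict, fb_quantized: int) -> dict:
    """Sketch of FQCN per-flow feedback allocation (Eqs. 4-6).

    bytes_rx[i]  -- bytes B_i(t) received for flow i in the sampling window
    weights[i]   -- preconfigured weight w_i (1 for mutually fair flows)
    fb_quantized -- the queue-level quantized feedback Psi_Fb(t)
    """
    total = sum(bytes_rx.values())
    w_sum = sum(weights.values())

    # Eq. (4): fair share per flow, then the coarse culprit list S^H.
    m = {i: weights[i] / w_sum * total for i in bytes_rx}
    s_h = [i for i in bytes_rx if bytes_rx[i] > m[i]]

    # Eq. (5): fine-grained shares restricted to S^H, giving S^R.
    wh_sum = sum(weights[i] for i in s_h)
    bh_sum = sum(bytes_rx[i] for i in s_h)
    m_h = {i: weights[i] / wh_sum * bh_sum for i in s_h}
    s_r = [i for i in s_h if bytes_rx[i] > m_h[i]]

    # Eq. (6): split the feedback among culprits, proportional to B_i/w_i.
    ratio = {i: bytes_rx[i] / weights[i] for i in s_r}
    norm = sum(ratio.values())
    return {i: ratio[i] / norm * fb_quantized for i in s_r}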


Figure 29: Quantized Congestion Notification – Congestion profiling


6 DYNAMIC LOAD BALANCING ALGORITHM BASED ON QCN NETWORKS

6.1 Motivation It is almost impossible to predict the load profile of a network at a given time, or to build an optimal topology or configuration that ensures a uniform distribution of congestion domains; therefore, decision-making based on profiling may lead to better performance. I propose Quantized Congestion Notification – Weighted Flow Queue Ranking (QCN-WFQR), an algorithm based on the Quantized Congestion Notification protocol [PIST, 2015]. The main idea of the algorithm is to compute a series of congestion indicatives used by different entities (e.g. clients, Network Profilers or SDN controllers) to balance the traffic and make better use of the network bandwidth. The algorithm is generic, so it can be used both in traditional, well-known computer networks and in the novel software defined networking paradigm.

6.2 Alternative solutions In congestion-aware networks (i.e. QCN based), each reaction point receives feedback messages from multiple congestion points and adjusts its rate accordingly (Figure 29). To achieve a better congestion profile in the network, one solution would be for each congestion source (i.e. reaction point) to create a database with congestion indicatives for each congestion point from which it receives feedbacks. Thus, each reaction point in the system is associated with the indicatives received from each congestion point (CPID: congestion point ID embedded in each congestion notification message): t (time of occurrence), p (priority of the sampled frame), QnzFb (quantized value of feedback), Qoff (queue size excess) and Qδ (rate excess) [App. #20150023172], Figure 29. Furthermore, reaction points may use these databases for performance analysis, based on which system administrators can change the topology configuration and improve system performance. Or reaction points may use them to adapt and influence different aspects of the network devices and obtain a better congestion profile in the system. I chose to follow a different path: instead of databases created by each reaction point, each congestion point creates profiles for each reaction point it serves. The main idea of the algorithm is to compute different congestion indicatives, local (by each congestion point) and system wide. Based on these indicatives, the congestion profilers or SDN controllers are instructed to dynamically move flows between congestion points for a better congestion profile at system level. Also, targets can be instructed to choose a device that has the lowest contribution to system congestion.


Figure 30: Port’s queue ranking

6.3 QCN Weighted Flow Queue Ranking In general, in distributed systems, the knowledge database that reflects the system and network status is huge and constantly changing (e.g. new flows are added or old ones are removed). This database can be built from the incomplete and limited views of the status held by all nodes within the system. Regarding QCN-WFQR, the focus is only on the congestion status of the computer network. The main idea is that each node keeps a local information database that reflects the congestion status of a portion of the network from its point of view. Further, all local information databases are gathered by a central entity, which computes a database for the entire system. Depending on the purposes of the applications, system-wide or local congestion information may be used to achieve a better usage of the network and improve system performance. The local congestion indicatives are flow shares, queue weights and queue ranks, while the system congestion indicatives are flow weights and reaction point (or device) weights. It is worth mentioning that the system congestion indicatives are computed by a profiler (logically centralized or distributed) or by SDN controllers (depending on the network type). Furthermore, I propose a method of using QCN-WFQR in distributed and parallel file systems, where, by using system-wide congestion indicatives, a particular data clone is chosen to achieve a better-balanced traffic load within the network. Also, I propose a method of using QCN-WFQR for distributing traffic workload between multiple servers hosting the same application and migrating already established flows to alternate, less congested paths, thus reducing the need to slow down the traffic at the source.


Reaction Points | Flows | Node/queue flow shares ($Node_1/Queue_i$ ... $Node_k/Queue_j$) | Flow weights | RP weights
RP1 | $flow_1$ | $Share_{flow_1/1}$ ... $Share_{flow_1/k}$ | $Weight_{flow_1}$ | $Weight_{RP_1}$
RP2 | $flow_2$ | $Share_{flow_2/1}$ ... $Share_{flow_2/k}$ | $Weight_{flow_2}$ | $Weight_{RP_2}$
... | ... | ... | ... | ...
RPr | $flow_n$ | $Share_{flow_n/1}$ ... $Share_{flow_n/k}$ | $Weight_{flow_n}$ | $Weight_{RP_r}$
Table 11: Congestion System Indicatives

6.3.1 Defining the topology A topology is defined as sets of network nodes, ports, queues and reaction points, as follows:

 $\mathcal{N} = \{N_1, N_2, \ldots, N_n\}$ set of network nodes,

 $\mathcal{P} = \{P_1, P_2, \ldots, P_p\}$ set of ports,

 $\mathcal{Q} = \{q_1, q_2, \ldots, q_l\}$ set of queues,

 $\mathcal{R} = \{R_1, R_2, \ldots, R_r\}$ set of reaction points,

 each node $N_k$ has a sub-set $\mathcal{P}^{N_k} = \{P_i \mid P_i \in \mathcal{P},\ P_i\ \text{related to}\ N_k\}$ of ports,

 each port $P_k$ has a sub-set $\mathcal{Q}^{P_k} = \{q_i \mid q_i \in \mathcal{Q},\ q_i\ \text{related to}\ P_k\}$ of queues.

At moment t the network profile is defined as sets of flows and feedbacks, as follows:

 $\mathcal{F} = \{f_1, f_2, \ldots, f_m\}$ set of flows, where each queue $q_k$ controls a sub-set of flows:

 $\mathcal{F}^{q_k} = \{f_i \mid f_i \in \mathcal{F},\ f_i\ \text{sampled by}\ q_k\},\ q_k \in \mathcal{Q}$, and each flow $f_k$ is controlled by a sub-set of queues:

 $\mathcal{Q}^{f_k} = \{q_i \mid q_i \in \mathcal{Q},\ f_k\ \text{sampled by}\ q_i\}$, and each reaction point $R_k$ adjusts the rates of a sub-set of flows:

 $\mathcal{F}^{R_k} = \{f_i \mid f_i \in \mathcal{F},\ f_i\ \text{adjusted by}\ R_k\}$, where the rate of $f_k$ is adjusted at moment t based on the received feedback $Fb_t^{q_i}(f_k)$ from queue $q_i$, which sampled flow $f_k$ at moment t.

6.3.2 QCN-WFQR indicatives The QCN Weighted Flow Queue Ranking (QCN-WFQR) algorithm provides several congestion indicatives, per node (local) and system related, based on which decisions can be taken to achieve a better-balanced traffic load in a congestion-aware network, while obtaining an overall increase in system performance.

6.3.2.1 Flows A computing node (e.g. a server) in distributed systems offers services by means of flows. A service is identified by a service id, which can be encapsulated into congestion notification tags ([49], pp. 1096). As the standard states, because QCN tags are optional, a service can also be identified by a portion of the frame (e.g. UDP port, IP address and so on). Also, each application must define its services and register them. In general (to meet SDN requirements), a flow can be identified by the source address, destination address and service id, as shown below:

$f = \langle \text{source address},\ \text{destination address},\ \text{service id} \rangle$

6.3.2.2 Ranking A queue rank (7) within a port is defined as the number of flows within that queue (assuming that each flow has exactly one reaction point and exactly one destination, i.e. no multicast). To compute the ranks, each bridge (network node) keeps a table (Figure 30) per port, with an aging function, containing all the flows forwarded through each queue of the port. The algorithm is generic, so each port could have any number of queues (depending on the hardware implementation); however, the IEEE 802.1p standard enables only 8 queues.

The rank of a queue $q_k$ at moment t is computed as the number of flows within the queue, as follows:

$R_t^{q_k} = \mathrm{COUNT}_t\langle f_i \rangle,\quad f_i \in \mathcal{F}^{q_k}$   (7)
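A minimal Python sketch of such a per-queue flow table with aging, from which the rank of Eq. (7) is derived, could look as follows (the 250 ms window matches Table 12; everything else is an assumption):

import time

AGE_WINDOW = 0.25   # seconds; entries older than this are dropped (Table 12)

class FlowTable:
    """Sketch of a per-queue flow table with aging, used for queue ranking."""
    def __init__(self):
        self.last_seen = {}          # flow id -> timestamp of last sample

    def touch(self, flow_id, now=None):
        self.last_seen[flow_id] = now if now is not None else time.monotonic()

    def age(self, now=None):
        now = now if now is not None else time.monotonic()
        self.last_seen = {f: t for f, t in self.last_seen.items()
                          if now - t <= AGE_WINDOW}

    def rank(self) -> int:
        # Eq. (7): the rank is the number of live flows within the queue.
        self.age()
        return len(self.last_seen)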

6.3.2.3 Flow shares A flow share (10, 16) is defined as a congestion measure, from each topology node's point of view, for a specific flow related to a port's queue (i.e. a congestion point).


6.3.2.3.1 Flow shares based on QCN The flow share has to contain enough information to compare flows that share the same congestion points, as well as flows with different routes and different congestion points. Therefore, the share is a combination of three indicatives: the feedback average within a time frame, the frequency of feedbacks transmitted to a flow, and the queue rank within the time frame. The feedback is a measure of congestion related to the queue regardless of the source rate, so I assume that higher rates imply a higher probability of receiving feedbacks for a flow. The rank of a queue reflects the number of flows affected in case of congestion (i.e. the main idea is to affect the smallest number of flows in case of congestion). I eliminated the per-flow feedback tables and the associated aging routine by approximating the feedback average with an exponential moving average function, as shown below:

$EMA_t = \alpha_t Y_t + (1 - \alpha_t) \cdot EMA_{t-1}$
$\alpha_t = 1 - e^{-\Delta t / T}$   (8)

To reflect the frequency of transmitted feedbacks for a flow, when a feedback is sent to a specific reaction point for a specific flow, all the other flows within the queue recalculate their feedback averages by considering a null feedback sample; therefore, flows with smaller rates are more likely to accumulate a larger number of null samples than flows with higher rates, as shown below:

$\Psi_t^{q_k}(Fb_t^{q_k}, f_i) = \begin{cases} \alpha_t \cdot \Psi_t^{q_k}(Fb_t^{q_k}) + (1 - \alpha_t) \cdot \Psi_{t-1}^{q_k}(Fb_{t-1}^{q_k}, f_i), & q_k\ \text{sampled}\ f_i \\ (1 - \alpha_t) \cdot \Psi_{t-1}^{q_k}(Fb_{t-1}^{q_k}, f_i), & \text{otherwise} \\ \min_{f_j \in \mathcal{F}^{q_k}} \Psi_t^{q_k}(Fb_t^{q_k}, f_j), & \text{initial value} \end{cases}$   (9)

where $\Psi_t^{q_k}(Fb_t^{q_k})$ is the quantized feedback at moment t relative to queue $q_k$, and $\Psi_t^{q_k}(Fb_t^{q_k}, f_i)$ is the quantized feedback average for flow $f_i$ at moment t relative to queue $q_k$.

Also, the initial value has a big influence on the evolution of the EMA. Therefore, instead of starting from zero, I take the initial value to be the minimum value over the existing flows sampled by $q_k$ (this decision is based on simulations).


So, the flow share for flow $f_i$ computed by queue $q_k$ at moment t is:

$Share_t^{q_k}(f_i) = \Psi_t^{q_k}(Fb_t^{q_k}, f_i) \times R_t^{q_k}$   (10)
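As a companion to Eqs. (8)-(10), the following is a minimal Python sketch of the per-flow feedback EMA with null samples and minimum-value initialization; the data layout and parameter names are assumptions, not the thesis implementation:

import math

def alpha(dt: float, T: float) -> float:
    # Eq. (8): smoothing factor of the exponential moving average.
    return 1.0 - math.exp(-dt / T)

def update_shares(emas: dict, sampled_flow, fb: float, dt: float,
                  T: float, rank: int) -> dict:
    """Sketch of Eqs. (9)-(10): per-flow feedback EMA and flow shares.

    emas[f] holds the running feedback average Psi(Fb, f) per flow.
    """
    a = alpha(dt, T)
    if sampled_flow not in emas:
        # Initial value: the minimum EMA of the flows already sampled by q_k.
        emas[sampled_flow] = min(emas.values(), default=0.0)
    for f in emas:
        if f == sampled_flow:
            emas[f] = a * fb + (1 - a) * emas[f]
        else:
            # Null sample: flows not sampled this time decay their average.
            emas[f] = (1 - a) * emas[f]
    # Eq. (10): the flow share combines the feedback average with the queue rank.
    return {f: emas[f] * rank for f in emas}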

6.3.2.3.2 Flow shares based on FQCN (FQCN-WFQR) If the FQCN algorithm is used (briefly presented in section 5.3.1), the feedback is computed by considering the bytes received in a specific time frame T. Most network devices can provide the average throughput per queue, but here it is needed per flow, which may be considered a drawback since this feature is very hard to implement at hardware level. Considering the above, it is possible to approximate the throughput per flow using the same exponential moving average method, as shown in Eq. (11).

So, the throughput at moment t for flow $f_i$ related to queue $q_k$ is:

$Rate_t^{q_k}(f_i) = \alpha_t \cdot Bytes(frame \in f_i) + (1 - \alpha_t) \cdot Rate_{t-1}^{q_k}(f_i)$,
where $\alpha_t = 1 - e^{-\Delta t / T}$   (11)

Furthermore, computing the bytes received in time frame T for flow $f_i$ related to queue $q_k$ is straightforward:

$B_t^{q_k}(f_i) = T \cdot Rate_t^{q_k}(f_i)$   (12)

Based on Eq. (12), the FQCN shares and the FQCN fine-grained shares are computed as follows:

$M_t^{q_k}(f_i) = \frac{w_i^{q_k}}{\sum_j w_j^{q_k}} \cdot \sum_j B_t^{q_k}(f_j)$   (13)

$M_t^{H,q_k}(f_i) = \frac{w_i^{q_k}}{\sum_{S^H} w_j^{q_k}} \cdot \sum_{S^H} B_t^{q_k}(f_j)$   (14)


And the FQCN feedback for each culprit is computed as shown below:

$\Psi_t^{q_k}(Fb_t^{q_k}, f_i) = \frac{B_t^{q_k}(f_i) / w_i^{q_k}}{\sum_{S^R} B_t^{q_k}(f_j) / w_j^{q_k}} \cdot \Psi_t^{q_k}(Fb_t^{q_k})$   (15)

where $\Psi_t^{q_k}(Fb_t^{q_k})$ is the feedback quantized to 64 levels, $\Psi_t^{q_k}(Fb_t^{q_k}, f_i)$ is the quantized feedback for flow $f_i$, and $S^R$ is the fine-grained culprit list explained in section 5.3.1.

At each sampling event, the QCN-WFQR flow share is updated similarly to Eq. (10), as shown below:

$Share_t^{q_k}(f_i) = \Psi_t^{q_k}(Fb_t^{q_k}, f_i) \times R_t^{q_k}$   (16)

The difference between QCN-WFQR and FQCN-WFQR is that FQCN considers the flow rates when computing the feedback; therefore, the flow share contains enough information to compare flows that share the same congestion points, as well as flows with different routes and different congestion points.

6.3.2.4 Weighting A flow weight (17) is defined as a congestion measure, from the system's point of view, for a specific flow, while a reaction point weight (18) is defined as a congestion measure, from the system's point of view, for a specific reaction point.

Therefore, a flow weight at moment t for flow $f_k$ is computed as the sum of the flow shares received from all nodes (i.e. over $\mathcal{Q}^{f_k}$):

$Weight_t(f_k) = \sum_{q_i \in \mathcal{Q}^{f_k}} Share_t^{q_i}(f_k)$   (17)


And a reaction point weight at moment t for $R_k$ is computed as the sum of all flow weights for flows whose rates are adjusted by it (i.e. $\mathcal{F}^{R_k}$):

$Weight_t(R_k) = \sum_{f_i \in \mathcal{F}^{R_k}} Weight_t(f_i)$   (18)

A queue weight (19) at moment t is defined as a local congestion measure related to queue $q_k$, computed as the sum of all feedback averages of the flows controlled by the queue $q_k$ (i.e. $\mathcal{F}^{q_k}$) multiplied by its rank ($R_t^{q_k}$), as follows:

$Weight_t(q_k) = \left[ \sum_{f_i \in \mathcal{F}^{q_k}} \Psi_t^{q_k}(Fb_t^{q_k}, f_i) \right] \times R_t^{q_k}$   (19)

This weight enables the comparison of queues, based on which a specific flow may be routed or migrated, achieving better-balanced traffic within the network. There is a special situation in the weight calculus: when no congestion occurred in the considered time frame but the feedback averages are still substantially large. In this case the congestion indicatives do not reflect the actual network congestion profile, and decisions may lead to multiple congestion points. To avoid these circumstances, the feedback averages are reset if the last event occurred outside the time frame. The congestion indicatives are also heavily influenced by the starting rate of a reaction point. If the throughput of a reaction point is reset to line rate, then when it starts it will congest a node and may substantially change the congestion profile for a short period of time. To avoid this situation, the reaction point throughput must be set at a minimum rate (e.g. 10 Mbps when the line rate is 1 Gbps) rather than line rate; the throughput will then increase gradually but quickly, without causing a burst in the network congestion profile.
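To illustrate how a profiler could aggregate the reported local shares into the system indicatives of Table 11 (Eqs. (17) and (18)), here is a minimal Python sketch; the input layout is an assumption:

from collections import defaultdict

def system_indicatives(shares, flows_of_rp):
    """Sketch of Eqs. (17)-(18): flow weights and reaction point weights.

    shares[(queue, flow)] -- local flow shares reported by each queue
    flows_of_rp[rp]       -- flows whose rates are adjusted by reaction point rp
    """
    # Eq. (17): a flow's weight is the sum of its shares over all queues.
    flow_weight = defaultdict(float)
    for (_queue, flow), share in shares.items():
        flow_weight[flow] += share

    # Eq. (18): an RP's weight is the sum of the weights of its flows.
    rp_weight = {rp: sum(flow_weight[f] for f in flows)
                 for rp, flows in flows_of_rp.items()}
    return dict(flow_weight), rp_weight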


6.3.3 QCN-WFQR algorithm based on standard QCN

Local congestion indicatives calculus: flow shares, queue ranks and queue weights

procedure Q-DO-ENQUEUE(q, p)                        ᵒ add received packet p to sampled queue q
    Fb ← QCN-FEEDBACK(q, p)                         ᵒ compute feedback Fb for received packet p
    if Fb < 0 then                                  ᵒ if the computed Fb is negative, run the QCN-WFQR algorithm
        call QCN-WFQR-UPDATE-FLOW-TABLE(q, p)       ᵒ update flow table if packet p belongs to a new flow
        call QCN-WFQR-UPDATE-QUEUE-RANK(q)          ᵒ update queue rank (size of flow table)
        call QCN-WFQR-UPDATE-INDICATIVES(q, p, Fb)  ᵒ update QCN-WFQR congestion indicatives
    end if
    add (q, p)                                      ᵒ add packet p to queue q
end procedure

procedure QCN-WFQR-UPDATE-FLOW-TABLE(q, p)          ᵒ update flow table regarding packet p (if new flow)
    if flow(p) ∉ flow_table(q) then                 ᵒ if packet p belongs to a new flow, add the flow to the table
        add (flow(p), table(q))
    end if
    call QCN-WFQR-AGE-FLOW-TABLE(q)                 ᵒ remove from the table older flows slipped out of the time frame
end procedure

procedure QCN-WFQR-UPDATE-INDICATIVES(q, p, Fb)     ᵒ local congestion indicatives: flow shares, queue rank and weight
    c_time ← current time                           ᵒ current time (for EMA calculus)
    min_ema ← QCN-WFQR-GET-MIN-EMA(q)               ᵒ minimum EMA over all flows within the flow table
    q_weight ← 0
    for “each entry e” ∈ flow_table(q) do           ᵒ for each flow from the table related to queue q
        if e.ema_old = 0 then                       ᵒ initialize the old EMA with the minimum, if first sample
            e.ema_old ← min_ema
        end if
        e.ema ← EMA(p, Fb, c_time, e)               ᵒ new EMA for the flow related to entry e
        e.flow_share ← e.ema * rank(q)              ᵒ compute flow share for entry e
        q_weight ← q_weight + e.flow_share          ᵒ update queue weight
    end for
end procedure

System related congestion indicatives calculus: flow weights and reaction point weights

procedure QCN-WFQR-QUERY-INDICATIVES(cpid)          ᵒ query each CPID for changes in the flow table
    for “each flow f” ∈ CPID do                     ᵒ for each flow received
        call QCN-WFQR-UPDATE-SYS-TABLE(f)           ᵒ if indicatives changed, update the system table
    end for
end procedure

procedure QCN-WFQR-UPDATE-SYS-TABLE(flow)           ᵒ update the system table (Table 11)
    call QCN-WFQR-UPDATE-SHARE(flow)                ᵒ update the flow's share
    call QCN-WFQR-UPDATE-SHARE-WEIGHT(flow)         ᵒ update the flow's weight
    call QCN-WFQR-UPDATE-RP-WEIGHT(flow)            ᵒ update the reaction point weight
end procedure


Figure 31: Flow share synthetic tests

6.3.4 Algorithm analysis

6.3.4.1 Flow shares analysis The main idea of the flow share is that it has to reflect the reaction point rate and, at the same time, the feedback values. I assumed that the reaction point rate is reflected by the frequency with which feedback is sent to a specific flow, while the distance until congestion is outlined by the feedback value, so I distinguish two different situations considering two flows ($f_1$ and $f_2$), as follows:

1. $Rate(f_1) > Rate(f_2)$ and $EMA_{Fb}(f_1) = EMA_{Fb}(f_2)$ implies $Share(f_1) > Share(f_2)$

2. $Rate(f_1) = Rate(f_2)$ and $EMA_{Fb}(f_1) > EMA_{Fb}(f_2)$ implies $Share(f_1) > Share(f_2)$

Basically, both flow rates and feedback values are reflected in the flow share: the former situation enables the comparison of flows that share the same queue (same congestion conditions) but have different rates, while the latter situation enables the comparison of flows having the same rate but on different routes, with different queues and different congestion conditions (i.e. different feedback values). For the synthetic simulation I chose an aging window of 250 ms and a constant sampling rate of 50 ms. In the first 500 ms of simulation time, the first situation is emphasized by generating feedbacks of the same value, but one flow receives twice as many feedbacks as the other, so the rate difference is reflected by the flow share, as shown in Figure 31. Then, for the next 200 ms, the first flow stops and its flow share ages, while the flow share of the second flow grows. Furthermore, starting from 700 ms, the second situation is demonstrated by sending feedback values of 16 and 4, respectively, every 50 ms. In Figure 31 it can be seen that the flow share of the first flow grows, while the flow share of the second flow decreases, thus reflecting the feedback values. In conclusion, the flow share responds to both rate and feedback variation.

6.3.4.2 Congestion indicatives analysis At a particular moment a flow cannot have more than one congestion point, because a congestion point implies the smallest throughput along a flow's path. But if a different point becomes congested due to traffic profile changes, then the throughput must decrease even more, so the old congestion point is freed (the distance until congestion grows). In terms of flow comparison, I distinguish three different situations, as follows:
1. Solitary congestion islands: the compared flows share the same queue (congest the same point);
2. Disconnected congestion islands: the compared flows congest different queues (different congestion points);
3. Hybrid congestion islands: the compared flows are in a solitary congestion island, but occasionally different flows may create other congestion points and move flows into disconnected congestion islands, or vice-versa (within the measuring time frame T).
In the first situation, the flow with the lowest throughput will have the lowest flow share, and the flow weight will have only one component. In the second situation, the compared flows will have different shares according to all three parameters (rate, queue ranks and feedback values) related to their congestion points. The last situation is a composition of the first two, and the congestion indicatives are sums.

6.3.4.3 QCN-WFQR simulator For simulations, an open-source simulator was used: NS-3 (a discrete-event network simulator) [53]. The QCN-WFQR implementation has the following main objects: reaction point, congestion point, and profiler (or controller, in the case of SDN) (Figure 32).


Figure 32: NS3-QCN-WFQR high-level view

Congestion Node                      Reaction Point
Queue Len.: 150 KB                   Byte Counter threshold: 150 KB
Queue Eq.: 33 KB                     Timer threshold: 10 ms
ω: 2                                 R-AI: 0.5 Mbps
Share age window: 2000 ms            R-HAI: 5 Mbps
Table age window: 250 ms             MIN Rate: 1 Mbps
RTT around 50 µs, link rate 1 Gbps
Table 12: QCN-WFQR parameters

The simulation sequence starts at the application side, where frames are enqueued in a huge backlog queue (thus starvation is avoided). The rate limiter, relying on the current rate, moves frames from the backlog to the TX queue, and the TX queue is dequeued based on the link rate. The CR is modified by the BC and the Timer according to received feedbacks and QCN parameters. Furthermore, the CP samples the outgoing queue and computes feedbacks, based on which the local congestion indicatives are computed. The profiler and the SDN controller interrogate the topology nodes and compute the different system congestion indicatives. Support for a centralized profiler/controller that simulates a "statistics service" was added to NS-3. The feedback message paths were left unmodified to avoid overloading the profiler/controller (i.e. it is not aware of the feedbacks, and decisions to limit the rate are taken by the switches themselves). In the SDN context this may be considered a small deviation from the standard definition of a centralized control plane but, given the high rate of feedbacks generated by a congested port, it is an acceptable trade-off.


6.3.4.4 QCN-WFQR simulations analysis

6.3.4.4.1 Flows contributions to system congestion profile This first test aims to emphasize the flows' contributions to the system congestion profile: flow shares, flow weights and reaction point weights. In other words, reaction points with low flow rates have lower weights, so tenants can cooperate and decide upon different service locations, achieving a fair congestion profile and better system performance (see the example in section 6.3.5.1). For the simulation, a binary tree topology was used, with three congestion controlled nodes (Figure 33 – a). Each child node has a device with a reaction point and two flows each, while the parent node has a device with a server (as target). Reaction points inject traffic as UDP clients, while the target consumes it as a UDP server. From the QCN standpoint, each congestion controlled node has a transmission queue used for feedback calculus, with the parameters listed in Table 12. From the QCN-WFQR standpoint, the flow share calculus uses a window of 2 seconds, but the flow table has an aging function of 250 ms, so if a flow does not get any feedback for 250 ms its share is reset (Table 12).

The simulation starts with RP1's flows, and five seconds later the second device begins to inject traffic. The congestion profile is heavily influenced by the fact that flows 1 and 2 have higher throughput capabilities, and therefore a higher probability of receiving feedbacks (the rates are displayed in Figure 33 – b).

In the first five seconds, the flows from RP1 transmit at much higher rates, while CP3's queue usage is kept around equilibrium (Figure 33 – c). After that, the flows from the second device start injecting traffic and CP3's queue pushes the rate limiters' throughput lower (and the rates of flows 1 and 2 decrease significantly). The other congestion points never get congested, because CP3 pushes the throughput low enough that their queues stay below equilibrium for the entire simulation time. And since only CP3 gets congested, the flow weights are the same as the flow shares reported by CP3 (Figure 33 – d, e). Also, as can be seen from Figure 33 – (b) and (e), the flow weights follow relatively the same profile as their rates.

From the device weights standpoint, RP1's flows (flows 1 and 2) have higher throughput capabilities and therefore a higher probability of receiving feedbacks, so RP1's weight is higher than RP2's weight (Figure 33 – f).


(a) Simulation topology (b) Flows rates variation

(c) Congestion points queue variation (d) CP3 local flow shares variation

(e) Flow weights variation (f) Reaction points (devices) weights variation

Figure 33: QCN-WFQR (NS3): simulation results for variation of flow related indicatives


(a) Simulation topology

(b) Flow rates variations (c) CP3, CP4 queues weight variation

Figure 34: QCN-WFQR (NS3): simulation results for variation of queue related indicatives

6.3.4.4.2 Queues weights variations The second simulation aims to emphasize that variations in flow rates are reflected in variations of the queue weights, so migration decisions may be taken to balance traffic and achieve a uniform congestion profile among the system's congestion points (see the example in section 6.3.5.2). For the simulation, a tree-based topology with alternative paths was used, so each of the four devices' flows has two alternative paths. In the first five seconds of simulation time, the congestion point CP3 forwards and samples three flows (flows 1, 2 and 3), while congestion point CP4 forwards and samples one flow (flow 4). It can be seen from Figure 34 – (c) that CP3's queue weight is significantly higher than the queue weight of CP4, which controls only one flow.


Figure 35: Elections in distributed systems with homogenous nodes

Furthermore, flow 1 is moved from CP3 to CP4 by the profiler/controller, which chooses an alternative path based on the queue weights. Figure 34 – (c) shows that the queue weights of the two congestion points converge, and the rates of all four flows converge as well (depicted in Figure 34 – b), achieving fair rates and a better-balanced network load.

6.3.5 QCN-WFQR applications

6.3.5.1 Elections in distributed systems with homogenous nodes In a distributed system (or cloud) with multiple classes of services, where each class has many homogenous nodes, I propose a method of choosing a particular service node based on the QCN-WFQR algorithm. The method aims to maximize bandwidth usage while minimizing congestion in the network, improving the overall system response time (Figure 35). If the number of clients varies and their needs are unknown, then the network congestion pattern is basically nondeterministic. When a particular service is required, I propose that the choice of a node be based on the congestion indicatives provided by the QCN-WFQR system profiler. As an example, in distributed and parallel file systems a file is divided into data chunks (or objects) replicated on multiple storage devices, so a data segment can be transferred from any replica. Most distributed file systems use a static distribution (e.g. CRUSH – Controlled Replication Under Scalable Hashing), but with the advent of converged networks a static distribution may not be the best choice. Therefore, using QCN-WFQR, the replica may be chosen dynamically, improving system responsiveness by trying to avoid network bottlenecks. In this case it is worth mentioning that the reaction points are the storage devices. The heart of this method is that the local congestion indicatives communicated to the system profiler, and the congestion indicatives communicated to the client, must reflect the current system congestion status. So, if the information is communicated too often, the system profiler may be overwhelmed; if too seldom, the information will not reflect the current congestion status of the system. This means that the algorithm is very sensitive to wild congestion profile variations, so I propose using larger data chunks. Besides the communication of congestion indicatives, the performance and implementation of the system profiler (i.e. a distributed but logically centralized implementation, or a completely centralized one) greatly influence the system behavior. In addition, the profiler must have a very low latency connection with each congestion point and each client, so that the congestion indicatives are exchanged in a timely fashion. I consider the following 'election' main steps (Figure 35):

 Step-1: standard QCN flow (feedbacks from the congestion-aware network – CPs – towards the service nodes – RPs). The CPs compute the local congestion indicatives (i.e. flow shares, queue ranks and queue weights).
 Step-2: The system profiler registers the received local congestion indicatives and then computes the system congestion indicatives.
 Step-3: The client asks the system profiler for a subset of congestion indicatives.
 Step-4: The client chooses a service location based on the received congestion indicatives (basically, the entity with the smallest indicative; a minimal sketch follows).
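The following minimal Python sketch illustrates step 4 under the assumption of a hypothetical profiler query API (query_rp_weights); it is not the thesis implementation:

def elect_replica(replica_rps, profiler):
    """Sketch of the election step: choose the replica hosted by the
    reaction point (storage device) with the smallest congestion weight.

    replica_rps -- reaction points holding a copy of the wanted data chunk
    profiler    -- hypothetical client stub for the QCN-WFQR system profiler
    """
    weights = profiler.query_rp_weights(replica_rps)      # step 3 (assumed API)
    return min(replica_rps, key=lambda rp: weights[rp])   # step 4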

6.3.5.2 Dynamically balanced flows in computer networks The algorithm's performance is directly influenced by the latency with which the Network Profiler or SDN controller gets the relevant congestion indicatives, decides, and moves flows to achieve better-balanced traffic within the system topology. The profiler/controller design follows either a centralized or a distributed model [54]. Each method has its own pros and cons. For example, a centralized design has scaling limitations, a bigger bottleneck probability and a single point of failure, but it is simpler (basically it follows a multi-threaded design over SMP systems) and has strong semantic consistency. A decentralized design scales up easily, meets performance requirements, handles data plane resilience and scalability better, and is fault tolerant, but it has weak consistency semantics (and it is worth mentioning that strong consistency implies a complicated implementation of synchronization algorithms) and a much more complex implementation [54]. How the congestion indicatives are communicated is strategic to the relevance of the decisions based on the QCN-WFQR algorithm: either information is transmitted at short intervals, overloading the controller, or it is filtered and condensed at the source before being transmitted to the controller.
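As an illustration, a controller-side balancing step consistent with the migration of section 6.3.4.4.2 could look like the following Python sketch; the names and the hysteresis margin are assumptions, not part of the algorithm's definition:

def rebalance(flow, current_queue, alt_queues, queue_weight, margin=1.2):
    """Sketch: move a flow to a less congested alternative queue.

    queue_weight -- mapping queue -> Weight_t(q) (Eq. 19), gathered by the
                    profiler/SDN controller
    margin       -- assumed hysteresis factor to avoid oscillating migrations
    """
    best = min(alt_queues, key=lambda q: queue_weight[q])
    if queue_weight[current_queue] > margin * queue_weight[best]:
        return best       # instruct the controller to reroute `flow` via `best`
    return current_queue  # otherwise keep the flow where it is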


7 CONCLUSIONS AND FUTURE WORK This thesis begins with a general analysis of the following distributed storage systems: AFS, GFS, GPFS, Lustre and Ceph, studying different aspects such as taxonomies, architectural details and network characteristics. The analysis is followed by a proposal for eLearning infrastructures, as a case study. Afterwards, several optimizations to distributed storage systems are proposed, together with a novel algorithm for dynamic load balancing in congestion-aware networks: QCN-WFQR.

7.1 The original contributions of the thesis

Contributions to the analysis of distributed storage systems (chapter 2)
I proposed a simple and comprehensive taxonomy for storage systems considering four important characteristics: locality, sharing, distribution level and semantics (section 2.1). For the characterization of distributed storage systems I proposed a relevant set of requirements based on existing documentation in the field (section 2.2). Also, I highlighted the scale-out methods at different layers of file systems, which build different types of storage systems, such as SAN or NAS (section 2.3). And to emphasize the similarities between distributed and centralized storage systems I proposed a generalized layout (section 2.5.3). The analysis of distributed storage systems and the related contributions were published in [PIST, 2014/1].

Contributions to the infrastructure for eLearning environments (chapter 3)
In this chapter I proposed a parallel and distributed file system adapted for eLearning environments, in turn hosted by cloud systems. The proposed storage system can be considered an alternative solution to OGSA-GFS. It has two main components: a flat address space based on RADOS (Ceph's object store) and a module for managing the semantic-aware metadata. I proposed a native implementation of the semantic-aware metadata with content-based semantics to improve data searches in the system. In terms of network, I proposed a hierarchical topology that facilitates south-north throughput, with links added to decrease the latency between RADOS's nodes. These contributions were published in [PIST, 2014/2].

Optimizations based on packet processing engines (chapter 4)
The main contribution in this chapter is the proposal of several solutions to optimize traffic-intensive clusters using integrated systems of multiple general-purpose cores with packet processing engines (sections 4.2 and 4.3). While Ceph addresses different optimizations at system level, my contributions focus on improving each node's performance, which in the end is reflected in the overall system performance (section 4.5). First, the performance of each node is improved by classifying and distributing requests among cores, facilitating the parallelization of traffic consumers. Furthermore, the OS scheduler decisions are leveraged by using Accelerated Receive Flow Steering mechanisms (section 4.4). Therefore, I proposed two models and two different implementations for each model: one in kernel, targeting efficiency, and one in user space, facilitating faster development and a flexible license policy. As a case study, I created a model using QorIQTM platforms (section 4.5.1) for Ceph's nodes. Each model for each node type is designed based on the related flow types and traffic volume (section 4.5.2). Second, the latency of sensitive operations is decreased, and the overall cluster responsiveness and availability are improved, by adjusting the weights of hardware queues. Accelerated RADOS (A-RADOS) divides the traffic of monitors and OSDs into three classes with different priorities, based on their importance to Ceph (section 4.5.3). Performance measurements showing an improved number of transactions per second, together with the results of RADOS messages processed based on priorities, were presented in section 4.5.4. Besides the presented improvements, there are inherent advantages of using multi-core integrated systems, such as low power consumption and an appealing price per performance. The proposed optimizations based on packet processing engines and the micro-benchmark results were published in [PIST, 2013].

Solutions for issues occurring in converged infrastructure for data centers (chapter 5)
A group of protocols was added to Ethernet in the context of converged infrastructure for data centers (i.e. collapsing tiers and unifying general purpose networks with networks for distributed storage systems): PFC, ETS, DCBx and QCN (section 5.2). The main focus was on QCN, which has several weaknesses, such as unwanted traffic injections into QoS domains from edge ports and rate unfairness between flows that share the same bottleneck (section 5.2). I was part of a team that proposed a patented solution for the automatic configuration of QCN parameters through LLDP in a hybrid network, where segments with and without QCN coexist under different QoS requirements. The solution introduces a new hybrid defense mode choice that allows both manual and automatic settings to use the same alternate priority value, thus avoiding the injection of unwanted traffic into a specific QoS domain (section 5.3). The patent is public and can be inspected at [Patent #8891376].


Proposed algorithm for dynamic load balancing: QCN-WFQR (chapter 6)
The QCN protocol alleviates congestion in the network, but it does not address the congestion problem as a whole, where a uniform distribution of congestion domains would lead to better bandwidth utilization. Therefore, I proposed a partial solution to this issue. The proposal was initially based on choosing the gateway with the biggest distance until congestion; later on, it led to a more general, patented solution, where the congestion sources (i.e. reaction points) build databases for each congestion point. This database can be used for performance analysis, or to influence different aspects of the network devices (section 6.2). I then followed a different path and proposed a novel and more complete algorithm (Quantized Congestion Notification – Weighted Flow Queue Ranking) that tries to solve this problem. The QCN-WFQR algorithm computes multiple congestion indicatives that measure the network load generated by flows at different points in the network, based on which different cooperative or automatic decisions may be taken to balance the workload and achieve a less congested network (section 6.3.2). The algorithm is capable of computing the local contribution of each flow at different network points ($Share_t^{q_k}(f_i)$) and also the system-wide contribution of each flow ($Weight_t(f_k)$). Besides the flow contributions, the system is also capable of computing the contribution of each congestion source ($Weight_t(R_k)$) to the network load. Based on these congestion indicatives, any component within the system can determine the most appropriate service (congestion source/reaction point) location, in order to cooperatively balance the workload in the network (sections 6.3.1, 6.3.2 and 6.3.3). Furthermore, I presented the analysis of flow shares and of the system-wide congestion geometry (i.e. congestion islands) related to each flow (sections 6.3.4.1 and 6.3.4.2). It is worth mentioning that the QCN-WFQR algorithm makes use of the exponential moving average to store flow shares, in order to minimize memory usage on network nodes (section 6.3.2.1). Along with the congestion indicatives listed above, the algorithm is capable of computing the importance of each queue ($R_t^{q_k}$) to the congestion in the system, in terms of the number of handled flows, and the congestion degree of each queue ($Weight_t(q_k)$) related to all handled flows (the definition of a topology is detailed in sections 6.3.1 and 6.3.2). Using these congestion indicatives, a network can be reprogrammed to optimally balance the workload between different congestion points. For simulation purposes I implemented QCN-WFQR in NS-3 (an open-source discrete-event network simulator), where, besides the algorithm modules, I also implemented the following QCN modules: rate limiter, reaction point and congestion point (section 6.3.4.3). I performed two different simulations, targeting the flows' contributions to the congestion profile and the variations of queue weights, using tree-based topologies (section 6.3.4.4). The former simulation revealed that the traffic of congestion sources is reflected in the congestion indicatives of flows and reaction points (section 6.3.4.4.1), while the latter revealed its reflection in the congestion indicatives of queues (section 6.3.4.4.2). I proposed two applications of QCN-WFQR: the election of replicas in distributed storage systems based on congestion indicatives, in order to achieve a better-balanced congestion profile and a more responsive system, and a dynamic load balancer, more appropriate for SDN networks. The patented solution can be inspected at [App. #20150023172], while the QCN-WFQR algorithm and the simulation results were published in [PIST, 2015].

7.2 Future work Until now, nodes were optimized only at the ingress side; therefore, I am now investigating optimization methods for the egress side as well. Besides the classification and distribution of flows to achieve better performance, the usage of other hardware engines is under consideration, such as cryptographic or TCP engines. Also, because only micro-benchmarks were done, all optimizations will be included in a wider system, and tests will be conducted to see how the performance improvements at each node are reflected in a wider system with a larger number of nodes. The QCN-WFQR algorithm has a sound approach with a promising impact in the networking field, especially in Software Defined Networking. Therefore, the research now focuses on the implementation of different algorithms, such as Dijkstra's algorithm, using the different congestion indicatives proposed in the present thesis as path costs. Since distributed systems have several characteristics related to scaling methods, investigations are directed towards the implementation of logically decentralized controllers with QCN-WFQR support, and towards the relevance of decisions based on latency and network topology architectures. The QCN-WFQR algorithm can be used to select the locations of different services in more complex systems (such as clouds) and improve the responsiveness, stability and I/O performance of those systems. Therefore, the relevance of QCN-WFQR must be analyzed for different traffic profiles (such as bursts and streaming), to determine the services for which this algorithm is adequate.


8 AUTHOR’S PUBLICATIONS

8.1 Patents
[App. #20150023172] Sorin A. Pistirica, Dan A. Calavrezo, Casimer M. DeCusatis, Keshav G. Kamble, "Congestion Profiling of Computer Network Devices", Patent Pending, USPTO: http://patents.justia.com/patent/20150023172, Jul 16, 2013
[Patent #8891376] Sorin A. Pistirica, Dan A. Calavrezo, Keshav G. Kamble, Mihail-Liviu Manolachi, "Quantized Congestion Notification—defense mode choice extension for the alternate priority of congestion points", USPTO: http://patents.justia.com/patent/8891376, Oct 07, 2013

8.2 Articles
[PIST, 2013] Pistirica Sorin Andrei, Caraman Mihai Claudiu, Moldoveanu Florica, Moldoveanu Alin, Asavei Victor, "Hardware acceleration in CEPH Distributed File System", ISPDC: IEEE 12th International Symposium on Parallel and Distributed Computing, Bucharest, June 2013, pp. 209-215, IEEE Indexed
[PIST, 2014/1] Pistirica Sorin Andrei, Victor Asavei, Horia Geanta, Florica Moldoveanu, Alin Moldoveanu, Catalin Negru, Mariana Mocanu, "Evolution Towards Distributed Storage in a Nutshell", HPCC: The 16th IEEE International Conference on High Performance Computing and Communications, August 2014, Paris, pp. 1267-1274, IEEE Indexed
[PIST, 2014/2] Pistirica Sorin Andrei, Asavei Victor, Egner Alexandru, Poncea Ovidiu Mihai, "Impact of Distributed File Systems and Computer Network Technologies in eLearning environments", eLSE: Proceedings of the 10th International Scientific Conference "eLearning and Software for Education", Bucharest, April 2014, Volume 1, pp. 85-92, ISI Indexed
[PIST, 2015] Pistirica Sorin Andrei, Poncea Ovidiu, Caraman Mihai, "QCN based dynamically load balancing: QCN Weighted Flow Queue Ranking", CSCS: The 20th International Conference on Control Systems and Computer Science, Bucharest, May 2015, Volume 1, pp. 197-205, ISI Indexed
[ASAV, 2014] Asavei Victor, Moldoveanu Alin, Moldoveanu Florica, Pistirica Sorin Andrei, "Lightweight 3D MMO Framework with High GPU Offloading", ICSTCC: 18th International Conference On System Theory, Control and Computing, October 2014, Sinaia, pp. 708-714, ISI Indexed
[SIMI, 2015] Simion Andrei, Asavei Victor, Pistirica Sorin Andrei, Poncea Ovidiu, "Practical GPU and voxel-based indirect illumination for real time computer games", CSCS: The 20th International Conference on Control Systems and Computer Science, May 2015, Bucharest, Volume 1, pp. 379-384, ISI Indexed
[GRAD, 2015] Alexandru Grădinaru, Alin Moldoveanu, Victor Asavei, Sorin Andrei Pistirica, "Case Study - OpenSimulator for 3D MMO Education", eLSE: Proceedings of the 11th International Scientific Conference "eLearning and Software for Education", Bucharest, April 2015, ISI Indexed
[ASAV, 2015] Victor Asavei, Alexandru Gradinaru, Alin Moldoveanu, Sorin Andrei Pistirica, Ovidiu Poncea, Alexandru Butean, "Massively Multiplayer Online virtual spaces - classification, technologies and trends", U.P.B. Sci. Bull., 2015 (in press, accepted for publication)


9 BIBLIOGRAPHY [1] B-tree File system, https://btrfs.wiki.kernel.org/index.php/Btrfs_design [2] Extended File System, http://e2fsprogs.sourceforge.net/ext2intro.html [3] EMC Storage Pool Deep Dive: Design Considerations & Caveats, http://vjswami.com/2011/03/05/emc-storage-pool-deep-dive-design-considerations-caveats/, 2011 [4] Linux Volume Management, http://tldp.org/HOWTO/LVM-HOWTO/ [5] G.C.Foxy, K.A.Hawick, A.B.White, "Characteristics of HPC Scientific and Engineering Applications", January 1996 [6] Chang-Soo Kim, Gyoung-Bae Kim, Bum-Joo Shin, "Volume Management in SAN Environment", Parallel and Distributed Systems ICPADS, 2001, pp. 500-505 [7] Andrew S. Tanenbaum, “Distributed Operating Systems”, August 25th 1994 by Prentice Hall [8] Christian Bandulet, “The Evolution of File Systems”, http://www.snia- europe.org/objects_store/Christian_Bandulet_SNIATutorial%20Basics_EvolutionFileSystems.pdf, Storage Networking Industry Association, 2012 [9] Michael Factor, Kalman Meth, Dalit Naor, Julian Satran, Ohad Rodeh, “Object Storage: The Future Building Block for Storage Systems”, Local to Global Data Interoperability - Challenges and Technologies. IEEE Computer Society. pp. 119–123, 2005 [10] Information Technology - SCSI Object Based Storage Device Commands (OSD), Revision 3, 1 October 2000 [11] Information Technology - SCSI Object Based Storage Device Commands -2 (OSD-2), Revision 4, 24 July 2008 [12] J. Satran A., Teperman, “Object Store Based SAN File Systems”, International Symposium of Santa Caterina on Challenges in the Internet and Interdisciplinary Research, 2004 [13] OpenAFS, Open source implementation of the Andrew Distributed File System, http://www.openafs.org/ [14] ARLA, Free AFS implementation from KTH, http://www.stacken.kth.se/project/arla/html/arla.html [15] Sanjay Ghemawat, Howard Gobioff and Shun-Tak Leung, The Google File System‖, 19th ACM Symposium on Operating Systems principles, pp. 29-43, December 2003 [16] Konstantin Shvachko, Hairong Kuang, Sanjay Radia, Robert Chansler, “The Hadoop Distributed File System”, 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies, pp. 1-10 [17] Frank Schmuck and Roger Haskin, “GPFS: A Shared-Disk File System for Large Computing Clusters”, Proceedings of the Conference on File and Storage Technologies, pp. 231–244, January 2002 [18] Peter J. Braam et al, The Lustre Storage Architecture, available at www.lustre.org, 2004. [19] Sage A. Weil, Scott A. Brandt, Ethan L. Miller and Darrell D. E. Long, ―Ceph: A Scalable, High- performance Distributed File System‖, 7th Conference on Operating Systems Design and Implementation, November 2006 [20] Sage A. Weil, Andrew W. Leung, Scott A. Brandt and Carlos Maltzahn, ―RADOS: A Scalable, Reliable Storage Service for Petabyte-scale Storage Clusters, Petascale Data Storage Workshop, November 2007 [21] Sage A. Weil, Scott A. Brandt, Ethan L. Miller, Carlos Maltzahn, "CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data", Proceedings of the 2006 ACM/IEEE conference on Supercomputing, 2006 [22] Sage A. Weil, Reliable, Scalable and High-Performance Distributed Storage, PhD thesis, 2007 [23] Honicky, R.J. ,Miller, E.L., “RUSH: Balanced, Decentralized Distribution for Replicated Data in Scalable Storage Clusters”, Proceedings of the 20th IEEE - 11th NASA Goddard Conference on Mass Storage Systems and Technologies, 2003, pages 146–156

96 Distributed Storage Solutions And Optimizations

[24] R. J. Honicky, Ethan L. Miller, "Replication Under Scalable Hashing: A Family of Algorithms for Scalable Decentralized Data Distribution", Proceedings of the 18th International Parallel and Distributed Processing Symposium, 2004 [25] Andrew W. Leung, Ethan L. Miller, Stephanie Jones, “Scalable Security for Petascale Parallel File Systems”, Proceedings of the 2007 ACM/IEEE conference on Supercomputing, 2007 [26] Yan Li, Nakul Sanjay Dhotre, Yasuhiro Ohara, Thomas M. Kroeger, Ethan L. Miller, Darrell D. E. Long, “Horus: Fine-Grained Encryption-Based Security for Large-Scale Storage”, Proceedings of the sixth workshop on Parallel Data Storage, 2013, pp. 19-24 [27] PanFS, Object RAID, https://www.panasas.com/products/panfs/object-raid [28] Goodman, J.R.,Sequin," Hypertree: A Multiprocessor Interconnection Topology", IEEE Transactions on Computers, 1981, pp. 923-933 [29] Direct interconnection networks, http://pages.cs.wisc.edu/~tvrdik/5/html/Section5.html [30] Andy D. Hospodor, Ethan L. Miller, “Interconnection Architectures for Petabyte-Scale High- Performance Storage Systems”, 12th NASA Goddard Conference on Mass Storage Systems and Technologies, April 2004 [31] White Paper, Accelerating High-Speed Networking with Intel® I/O Acceleration Technology [32] Karthikeyan Vaidyanathan, Dhabaleswar K. Panda, “Benefits of I/O Acceleration Technology (I/OAT) in Clusters”, International Symposium on Performance Analysis of Systems & Software, pp. 220-229, 2007 [33] Scaling in the Linux Networking Stack, https://github.com/torvalds/linux/blob/master/Documentation/networking/scaling.txt, December 2011 [34] Luigi Rizzo, “netmap: a novel framework for fast packet I/O”, Proceedings of the 2012 USENIX Annual Technical Conference, pp. 101-112, June 2012 [35] Freescale Semiconductor, Inc., ―QorIQ Data Path Acceleration Architecture (DPAA) Reference Manual‖, 2011 [36] Freescale Semiconductor, Inc., ―Frame Manager Configuration Tool Example Configuration and Policy, 2013 [37] Fulvio Risso, and Mario Baldi, NetPDL: An Extensible XML-based Language for Packet Header Description, Computer Networks (COMPUT NETW), vol. 50, no. 5, pp. 688-706, 2006 [38] Sangam Racherla, Silvio Erdenberger, Harish Rajagopal, Kai Ruth, "IBM Red Book, Storage and Network Convergence Using FCoE and iSCSI", International Technical Support Organization: Storage and Network Convergence Using FCoE and iSCSI, January 2014 [39] White Paper, Emulex, Top-5 Reasons for Deploying Network Convergence, 2009 [40] White Paper, Forrester Consulting, "Benefits Of SAN/LAN Convergence. Evaluating Interest In And Readiness For Unified Fabric", 2009 [41] White Paper, “Fabric convergence with lossless Ethernet and Fibre Channel over Ethernet”, 2008 [42] White Paper, Ethernet Alliance, Data Center Bridging, 2010 [43] Devkota P., Reddy A.L.N., "Performance of Quantized Congestion Notification in TCP Incast Scenarios of Data Centers", Modeling, Analysis & Simulation of Computer and Telecommunication Systems, 2010 [44] IEEE Standard 802.3x PAUSE, 1997 [45] 802.1Qbb PFC – Priority-based Flow Control, 802.1Qbb Media Access Control (MAC) Bridges and Virtual Bridged Local Area Networks Amendment 17, 2010 [46] 802.1Qaz ETS – Enhanced Transmissions Protocol, 802.1az Media Access Control (MAC) Bridges and Virtual Bridged Local Area Networks Amendment 18,2011 [47] Manoj Wadekar (Qlogic), et al, DCB Capability Exchange Protocol Base Specification Rev 1.01 [48] 802.1Qau QCN – Quantized Congestion Notification, , 802.1Qau Media Access Control (MAC)

97 Distributed Storage Solutions And Optimizations

Bridges and Virtual Bridged Local Area Networks Amendment 13,2010 [49] IEEE Standard for Local and metropolitan area networks—Media Access Control (MAC) Bridges and Virtual Bridge Local Area Networks, 31 August 2011 [50] 802.1AB Station and Media Access Control Connectivity Discovery, IEEE Standard for Local and metropolitan area networks, 2009 [51] Abdul Kabbani, Mohammad Alizadeh, Masato Yasuda, Rong Panz and Balaji Prabhakar, AF-QCN: Approximate Fairness with Quantized Congestion Notification for Multi-tenanted Data Centers, Proceedings of the 18th IEEE Symposium on High Performance Interconnects, pp. 58-65, 2010 [52] Yan Zhang, Nirwan Ansari, Fair Quantized Congestion Notification in Data Center Networks, IEEE Transactions on Communications, pp. 4690 - 4699, 28 November 2013 [53] Discrete Network Simulator, http://www.nsnam.org/ [54] Diego Kreutz, Fernando M. V. Ramos, Paulo Verissimo, Christian Esteve Rothenberg, Siamak Azodolmolky and Steve Uhlig, Software-Defined Networking: A Comprehensive Survey, Proceedings of the IEEE Vol. 103, No. 1, January 2015 [55] Fibre Channel Backbone – 5 (FC-BB-5), INCITS working draft proposed American National Standard for Information Technology, Rev. 2.0, June 4, 2009 [56] Shudayfat Eman Ahmad, Moldoveanu Alin, Moldoveanu Florica, Gradinaru Alexandru. Virtual Reality-based Biology Learning Module, 9th International Conference eLearning and Software for Education. Carol I Natl Defence Univ Publishing House, vol. 2, ISSN 2066-026X, pp. 621 --626, April 2013 [57] Venu Vasudevan, Paul Pazandak, Semantic File System Survey, http://www.objs.com/survey/OFSExt.htm, 1996 [58] Margo Seltzer, Nicholas Murphy, “Hierarchical File Systems are Dead”, USENIX HOTOS, May 2009 [59] Ip.com, IPCOM000230977D, A Method and System for Storage Server Selection based on a Network Congestion Status, September 20, 2013 [60] SungHo Chin, JongHyuk Lee, HwaMin Lee, DaeWon Lee, HeonChang Yu, Pillwoo Lee, “OGSA-GFS : A OGSA based Grid File System”, Proceedings of the First International Conference on Semantics, Knowledge, and Grid (SKG 2005), 2006 [61] OGSA-DAI (Open Grid Services Architecture-Data Access and IntegrationHung Ba Ngo, Christian Bac, Frederique Silber-Chaussumier, Thang Quyet Le, “Towards Ontology-based Semantic File Systems”, 2007 IEEE International Conference on Research, Innovation and Vision for the Future, pp. 8-13


10 ACRONYMS AND DEFINITIONS

DAS Direct Attached Storage: refers to a storage system directly attached to the machine accessing the data.

SAN Storage Area Network: refers to a dedicated network that provides block-level access to storage systems.

NAS Network Attached Storage: refers to a storage system that provides file-level access through a computer network.

SCSI Small Computer System Interface: a set of interface standards for communication with peripheral hardware, such as printers, storage devices and so on.

SAS Serial Attached SCSI: a serial protocol that moves data to and from storage devices.

SATA Serial ATA: a computer bus interface that connects host bus adapters to mass storage devices.

Cloud Computing: a complex system for hosting service-based applications in a networked system that is invisible to clients.

HPC High Performance Computing: high-speed computing system (nanosecond-scale computation).

Data block: the smallest data structure, an uninterpreted sequence of bytes or bits with a well-defined length, managed by a storage device.

Data file: self-contained information organized as a sequence of bytes plus metadata, managed by the overlying operating system.

Data object: a data structure composed of a sequence of bytes or bits of flexible size and a variable amount of metadata that describes it.

inode Index node: metadata information about a stored element in file systems implemented in *nix operating systems.

FC Fibre Channel: high-speed network technology used to connect storage devices.


FCP Fibre Channel Protocol: transport protocol predominantly used for SCSI commands over FC networks.

RAID Redundant Array of Independent Disks: a system of multiple storage devices combined into a single logical unit for redundancy and performance improvement.
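
As a concrete illustration (RAID 5 specifically, one of several RAID levels; standard RAID arithmetic, not a scheme defined in this thesis), the parity block of a stripe is the bitwise XOR of its data blocks:

P = D1 ⊕ D2 ⊕ D3

Any single lost block can then be reconstructed by XOR-ing the survivors, e.g. D2 = P ⊕ D1 ⊕ D3.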

LUN Logical Unit Number: a logical number used to identify the device addressed through the SCSI protocol.

LBA Logical Block Addressing: a method of locating data blocks on computer storage devices, typically using a linear scheme.
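
As a worked illustration (the classical cylinder/head/sector mapping; standard disk arithmetic, not a scheme introduced in this thesis), a drive with HPC heads per cylinder and SPT sectors per track maps the triple (C, H, S) to the linear address

LBA = (C × HPC + H) × SPT + (S - 1)

so the first sector (0, 0, 1) maps to LBA 0 and (0, 1, 1) maps to LBA SPT.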

POSIX Portable Operating System Interface: a set of standard operating system interfaces based on the UNIX OS.

OSD Object Storage Device: device that implements the standard (i.e. an extension to SCSI) in which data is organized and accessed as objects.

COW Copy on Write: optimization strategy that provides pointers to resources to callers until the callers change or copy them.
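
A minimal sketch of the mechanism in C (illustrative only; cow_buf, cow_share and cow_make_private are names invented for this example, not an API from the thesis): readers share one reference-counted buffer, and a writer obtains a private copy only when the buffer is actually shared.

#include <stdlib.h>
#include <string.h>

typedef struct {
    int    refs;   /* number of handles currently sharing this buffer */
    size_t len;
    char  *bytes;
} cow_buf;

/* Sharing hands out another pointer to the same buffer; no copy yet. */
static cow_buf *cow_share(cow_buf *b) { b->refs++; return b; }

/* Called before any in-place write; copies only if the buffer is shared. */
static cow_buf *cow_make_private(cow_buf *b)
{
    if (b->refs == 1)
        return b;                      /* sole owner: write in place */
    cow_buf *p = malloc(sizeof *p);    /* error handling omitted for brevity */
    p->refs  = 1;
    p->len   = b->len;
    p->bytes = malloc(b->len);
    memcpy(p->bytes, b->bytes, b->len);
    b->refs--;                         /* detach from the shared buffer */
    return p;
}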

LAN Local Area Network: computer network that interconnects nodes within a limited area.

IP Internet Protocol: communication protocol where a node is identified by an IP address.

TCP Transmission Control Protocol: reliable transport protocol.

FCoE Fibre Channel over Ethernet: network technology that encapsulates Fibre Channel frames over an Ethernet medium.

FCIP Fibre Channel over IP: protocol that encapsulates FC frames over IP.

iFCP Internet Fibre Channel Protocol: protocol that runs over IP and provides functionality similar to FCP.

iSCSI Internet Small Computer System Interface: protocol that runs over IP and provides functionality similar to SCSI.

SoC System on Chip: a hardware system that integrates multiple computer components into a single chip.


DCB Data Center Bridging: a set of Ethernet enhancements for data center infrastructure.

PFC Priority Flow Control (IEEE 802.1Qbb)

ETS Enhanced Transmission Selection (IEEE 802.1Qaz)

QCN Quantized Congestion Notification (IEEE 802.1Qau)

DCBx Data Center Bridging eXchange

LLDP Link Layer Discovery Protocol: link-layer protocol used for advertising node capabilities within a computer network.

TLV Type, Length, Value: encoding scheme for advertising optional information between nodes within a computer network.
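
A minimal sketch in C, assuming the LLDP-style TLV layout (a 16-bit header holding a 7-bit type and a 9-bit length, followed by the value bytes); tlv_write is a helper invented for this example:

#include <stdint.h>
#include <stddef.h>

/* Packs one TLV into buf and returns the number of bytes written. */
static size_t tlv_write(uint8_t *buf, uint8_t type, uint16_t len,
                        const uint8_t *value)
{
    uint16_t hdr = (uint16_t)(((type & 0x7Fu) << 9) | (len & 0x1FFu));
    buf[0] = (uint8_t)(hdr >> 8);    /* 7-bit type + top bit of the length */
    buf[1] = (uint8_t)(hdr & 0xFFu); /* low 8 bits of the length */
    for (size_t i = 0; i < len; i++) /* value bytes follow the header */
        buf[2 + i] = value[i];
    return (size_t)2 + len;
}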

CCF Congested Controlled Flow

CND Congestion Notification Domain

SLA Service Level Agreement
