A Ground-Up Approach to High-Throughput Cloud Computing in High-Energy Physics
Università degli Studi di Torino
Scuola di Dottorato in Scienze ed Alte Tecnologie, XXVI cycle
Doctoral Thesis

A ground-up approach to High-Throughput Cloud Computing in High-Energy Physics

Dario Berzano

Supervisor: Prof. Massimo Masera, Università degli Studi di Torino
Co-supervisors: Dr. Gerardo Ganis, CERN; Dr. Stefano Bagnasco, INFN Torino
External examiner: Prof. Marco Paganoni, Università degli Studi di Milano-Bicocca

CERN-THESIS-2014-376, 16/04/2014
Academic Year 2012–2013

«Simplicity is the ultimate sophistication.»
Apple II commercial (1977)

Table of Contents

1 Virtualization and Cloud Computing in High Energy Physics
  1.1 Computing in the LHC era
    1.1.1 Hierarchy of computing centres
    1.1.2 Event processing paradigm
    1.1.3 Analysis framework and data format: ROOT
    1.1.4 The ALICE Experiment
  1.2 Virtualization and Cloud Computing
    1.2.1 State of the art
  1.3 From the Grid to clouds?
    1.3.1 Grid over clouds and more
  1.4 Overview of this work

2 Building a Private Cloud out of a Grid Tier-2
  2.1 Overview of the INFN Torino Computing Centre
    2.1.1 Grid VOs and ALICE
  2.2 Rationale of the Cloud migration
    2.2.1 No performance loss through virtualization
    2.2.2 Abstraction of resources in the Grid and the cloud
    2.2.3 Harmonization of diverse use cases in a private cloud
    2.2.4 IaaS: separating administrative domains
      2.2.4.1 Virtual farm administrator
      2.2.4.2 Rolling updates
    2.2.5 Computing on demand and elasticity
    2.2.6 Preparing for the near future
  2.3 The cloud controller: OpenNebula
    2.3.1 Datacenter virtualization vs. infrastructure provisioning
    2.3.2 OpenNebula features
  2.4 Designing and building the private cloud
    2.4.1 Management of the physical infrastructure
      2.4.1.1 Provisioning: Cobbler
      2.4.1.2 Configuration: Puppet
      2.4.1.3 Monitoring: Zabbix
      2.4.1.4 Network connectivity: NAT and proxy
    2.4.2 Classes of hypervisors
      2.4.2.1 Working class hypervisors
      2.4.2.2 Service hypervisors
    2.4.3 Level 2 and 3 networking
    2.4.4 Shared storage
      2.4.4.1 GlusterFS
      2.4.4.2 Virtual machine image repository
    2.4.5 Disks of running virtual guests
    2.4.6 Benchmarks
  2.5 Performance tunings and other issues
    2.5.1 Kernel SamePage Merging issues
    2.5.2 OpenNebula backing database: MySQL vs. SQLite
    2.5.3 Fast creation of ephemeral storage
      2.5.3.1 Allocation of large files
      2.5.3.2 Filesystem formatting
    2.5.4 Direct QEMU GlusterFS support
    2.5.5 Fast deployment: caching and snapshotting
      2.5.5.1 LVM
      2.5.5.2 QCow2
  2.6 Custom virtual guest images

3 Running elastic applications on the Cloud
  3.1 Virtualizing the Grid Tier-2
    3.1.1 Preparation and update of a worker node
      3.1.1.1 Creation of the base image
      3.1.1.2 Contextualization
      3.1.1.3 Dynamic configuration: Puppet
      3.1.1.4 Creating images vs. contextualizing
    3.1.2 Automatic addition of worker nodes to TORQUE
    3.1.3 Nodes self registration to Puppet
    3.1.4 CPU efficiency measurements
    3.1.5 Grid services on the cloud
  3.2 Sandboxed virtual farms
    3.2.1 Virtual networks: ebtables
    3.2.2 Virtual Router: OpenWRT
      3.2.2.1 Building a custom OpenWRT image
      3.2.2.2 Contextualization
      3.2.2.3 Virtual Routers management and security
      3.2.2.4 VPN for geographically distributed virtual farms
    3.2.3 OpenNebula EC2 interface: econe-server
      3.2.3.1 EC2 credentials
      3.2.3.2 Mapping flavours to templates
      3.2.3.3 SSH keys and user-data
      3.2.3.4 Improved econe-server Elastic IPs support
      3.2.3.5 EC2 API traffic encryption
    3.2.4 Elastic IPs
      3.2.4.1 Integration of econe-server and Virtual Routers
    3.2.5 Creating a user sandbox
    3.2.6 Using the cloud
  3.3 M5L-CAD: an elastic farm for medical imaging
    3.3.1 Challenges
    3.3.2 The WIDEN web frontend
    3.3.3 Processing algorithms
    3.3.4 Elastic cloud backend
      3.3.4.1 Evolution towards a generic Virtual Analysis Facility
    3.3.5 Privacy and patient confidentiality

4 PROOF as a Service: deploying PROOF on clouds and the Grid
  4.1 Overview of the PROOF computing model
    4.1.1 Event based parallelism
    4.1.2 Interactivity
    4.1.3 PROOF and ROOT
    4.1.4 Dynamic scheduling
    4.1.5 Data locality and XRootD
    4.1.6 PROOF-Lite
  4.2 Static and dynamic PROOF deployments
    4.2.1 Dedicated PROOF clusters
    4.2.2 Issues of static deployments
    4.2.3 Overcoming deployment issues
    4.2.4 PROOF on Demand
  4.3 ALICE Analysis Facilities and their evolution
    4.3.1 Specifications
      4.3.1.1 The analysis framework
      4.3.1.2 Deployment
    4.3.2 Users authentication and authorization
    4.3.3 Data storage and access model
      4.3.3.1 PROOF datasets
      4.3.3.2 The AliEn File Catalog
    4.3.4 Issues of the AAF data model
      4.3.4.1 Very slow first time data access
      4.3.4.2 Lack of a pre-staging mechanism
      4.3.4.3 Data integrity concerns and resiliency
    4.3.5 The dataset stager
      4.3.5.1 Features of the dataset stager
      4.3.5.2 Robustness of the staging daemon
      4.3.5.3 Daemon status monitoring: the MonALISA plugin
      4.3.5.4 The ALICE download script
      4.3.5.5 The dataset management utilities for ALICE
      4.3.5.6 Alternative usage of the staging daemon
    4.3.6 A PROOF interface to the AliEn File Catalog
      4.3.6.1 Dataset queries
      4.3.6.2 Issuing staging requests
    4.3.7 The first virtual Torino Analysis Facility for ALICE
      4.3.7.1 Overview
      4.3.7.2 Storage and datasets model
      4.3.7.3 Software distribution model
      4.3.7.4 Dynamically changing PROOF resources
      4.3.7.5 Virtual machine images and Tier-2 integration
    4.3.8 Spawning and scaling the TAF
  4.4 The Virtual Analysis Facility
    4.4.1 Cloud awareness
      4.4.1.1 PROOF is cloud aware
    4.4.2 Components
    4.4.3 The CernVM ecosystem
      4.4.3.1 µCernVM: an on-demand operating system
      4.4.3.2 CernVM-FS
      4.4.3.3 CernVM Online
    4.4.4 External components
      4.4.4.1 HTCondor
      4.4.4.2 PROOF on Demand
    4.4.5 Authentication: sshcertauth
      4.4.5.1 Rationale
      4.4.5.2 Components
      4.4.5.3 The web interface
      4.4.5.4 Users mapping plugins
      4.4.5.5 The keys keeper
      4.4.5.6 Restricting access to some users
    4.4.6 PROOF dynamic workers
      4.4.6.1 Chain of propagation of the new workers
      4.4.6.2 User’s workflow
      4.4.6.3 Updating workers list in xproofd
      4.4.6.4 Configuring workers from the PROOF master
      4.4.6.5 Adding workers