Thesis no: MSEE-2016:20

Evaluating Energy Consumption of Distributed Storage Systems: A Comparative Analysis

Samuel Sushanth Kolli

Faculty of Computing
Blekinge Institute of Technology
SE-371 79 Karlskrona, Sweden

This thesis is submitted to the Faculty of Computing at Blekinge Institute of Technology in partial fulfillment of the requirements for the degree of Master of Science in Electrical Engineering with Emphasis in Telecommunication Systems. The thesis is equivalent to 20 weeks of full-time studies.

Contact Information:
Author: Samuel Sushanth Kolli
E-mail: [email protected], [email protected]

External advisor:
Stefan Bernbo, CEO, Compuverde AB
E-mail: [email protected]

University advisor:
Dr. Dragos Ilie, Assistant Professor of Telecommunication Systems
Department of Communication Systems
E-mail: [email protected]

Faculty of Computing
Blekinge Institute of Technology
SE-371 79 Karlskrona, Sweden
Internet: www.bth.se
Phone: +46 455 38 50 00
Fax: +46 455 38 50 57

ABSTRACT

Context. Big Data and Cloud Computing nowadays require large amounts of storage that are accessible by many servers. The energy consumed by these servers, as well as by the hosts providing the storage, has grown rapidly in recent years. There are various approaches to save energy at both the hardware and the software level. In the context of software, this challenge requires identifying development methodologies that can help reduce the energy footprint of a Distributed Storage System. Reducing this footprint has remained difficult because few methodologies targeting it have been implemented. To tackle this challenge, we evaluate the energy consumption of Distributed Storage Systems by using the Power Application Programming Interface (PowerAPI), which monitors, in real time, the energy consumed at the granularity of a system process. Objectives. In this study we investigate the energy consumption of distributed storage systems. We also attempt to understand the effect on energy consumption of various patterns of video streams, and we survey different measurement approaches for energy performance. Methods. The method is to use a power-measuring software library while a synthetic load generator produces the load, i.e., video data streams. The tool that generates the workload is the Standard Performance Evaluation Corporation Solution File Server (SPECsfs 2014), and PowerAPI is the software power-monitoring library used to evaluate the energy consumption of the GlusterFS and Compuverde distributed storage systems. Results. The mean and median values of the power samples, in milliwatts, were higher for Compuverde than for Gluster. For Compuverde, the mean and median values up to a load of three streams stayed around 400 milliwatts, whereas the mean and median values for the Gluster system increased gradually with load. Conclusions. The results show Compuverde having a higher energy consumption than Gluster, as it runs a larger number of processes that implement additional features not present in Gluster. We also conclude that Compuverde performed better for higher values of load, i.e., more video data streams.

Keywords: Distributed storage, Energy Consumption, Erasure coding, Streaming.

ACKNOWLEDGEMENTS

First and above all, I praise God, the almighty for providing me this opportunity and giving me the capability to proceed successfully.

I would like to express my deepest gratitude to my supervisor Dr. Dragos Ilie for his valuable support and guidance throughout the completion of my thesis. His guidance, patience, comments and encouragement helped me a lot to explore the topic further. I thank him for the valuable time he gave to my thesis despite his busy schedule. It was great to have such an advisor for my thesis.

I would like to thank my examiner Dr. Kurt Tutschku for his valuable comments. His guidance in the ATS course helped me a lot in my master thesis.

I would like to thank my external supervisor Stefan Bernbo, CEO of Compuverde AB, for his guidance and suggestions for my thesis. I thank Carl (manager), Eddie Rapp (system administrator), Shishir Saha and Jim Gunnarsson (software developers) and finally Nikoleta Dodova (tester) for helping me with the issues I faced while performing the thesis and for making my project a successful one.

I am thankful to my thesis partner Poojitha Rajana for her coordination and help throughout the thesis. I thank all my friends who helped me directly or indirectly throughout my master studies.

I would like to express my indebtedness to my parents for their endless love, encouragement and moral support throughout my thesis and my life.

LIST OF ABBREVIATIONS

CIFS Common Internet File System
DSS Distributed Storage System
DHT Distributed Hash Table
GFS Google File System
HDD Hard Disk Drive
IT Information Technology
IOPS Input/Output Operations Per Second
IaaS Infrastructure as a Service
NIST National Institute of Standards and Technology
NAS Network Attached Storage
NFS Network File System
MFS Moose File System
PaaS Platform as a Service
SaaS Software as a Service
SN Storage Node
SSD Solid State Drive
SMB Server Message Block
SPEC Standard Performance Evaluation Corporation
SUT Solution Under Test
SWBUILD Software Build
VDA Video Data Acquisition
VDI Virtual Desktop Infrastructure


CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
LIST OF ABBREVIATIONS
CONTENTS
LIST OF TABLES
LIST OF FIGURES
1 INTRODUCTION
1.1 BACKGROUND
1.1.1 Cloud computing
1.1.2 Big Data
1.1.3 Why Cloud Computing for Big Data?
1.1.4 Distributed Storage Systems
1.1.5 CONVINcE
1.1.6 Energy Consumption using PowerAPI
1.2 AIM OF THE THESIS
1.3 OBJECTIVES
1.4 RESEARCH QUESTIONS
1.5 SPLIT OF WORK
2 RELATED WORK
3 METHODOLOGY
3.1 METHOD USED
3.1.1 Why SPECsfs 2014?
3.1.2 SPECsfs 2014
3.1.3 Description of the Four Workloads
3.1.4 Why PowerAPI?
3.1.5 PowerAPI
3.2 EXPERIMENTAL SETUP
3.2.1 GlusterFS setup
3.2.2 Compuverde Setup
3.3 TESTBED OF THE ENERGY EVALUATION EXPERIMENT
3.3.1 Internal structure of SPECsfs 2014 for generating load for PowerAPI evaluations
3.3.2 Internal structure of PowerAPI
3.3.3 Gluster test configuration
3.3.4 Compuverde test configuration
3.4 ENERGY CONSUMPTION METRICS
3.5 EVALUATING PROCEDURE
3.5.1 Gluster File System Installation
3.5.2 Compuverde File System Installation
3.5.3 Mounting the File Systems
3.5.4 SPECsfs Workload Generation
3.5.5 PowerAPI's Power Samples
4 RESULTS AND ANALYSIS
4.1 ENERGY CONSUMPTION ANALYSIS FOR GLUSTER AND COMPUVERDE
4.1.1 Mean values for Compuverde NFS vs Gluster NFS
4.1.2 Mean values for Gluster SMB vs Compuverde SMB
4.1.3 Median values for Gluster NFS vs Compuverde NFS
4.1.4 Median values for Gluster SMB vs Compuverde SMB
4.1.5 Standard Deviation values for Gluster NFS vs Compuverde NFS
4.1.6 Standard Deviation values for Gluster SMB vs Compuverde SMB
5 CONCLUSION AND FUTURE WORK


5.1 CONCLUSION ANALYSIS FROM GRAPHS
5.2 CONCLUSION ANALYSIS FROM INVALID RUNS
5.3 CONCLUSION FROM ARCHITECTURE
5.4 CONCLUSION ANALYSIS FROM INTERNAL ARCHITECTURE
5.5 FUTURE WORK
REFERENCES
APPENDIX


LIST OF TABLES

Table 1 Split of work
Table 2 Types of workload objects in VDA
Table 3 Types in Database Workload
Table 4 Configuration for 12 HDD Chassis
Table 5 Configuration for 24 HDD Chassis
Table 6 Configuration for 36 HDD Chassis
Table 7 Configuration for 60 HDD Chassis
Table 8 List of supported Erasure codes
Table 9 Results of Mean using NFS v4.0
Table 12 Results of Mean using SMB v1.0
Table 10 Results of Median using NFS v4.0
Table 13 Results of Median using SMB v1.0
Table 11 Results of STDEV using NFS v4.0
Table 14 Results of STDEV using SMB v1.0


LIST OF FIGURES

Figure 1 Basic view of Cloud Computing
Figure 2 Cloud Computing energy footprint
Figure 3 Energy consumption by information and communication technologies (ICT)
Figure 4 Example Topology
Figure 5 PowerAPI Architecture
Figure 6 GlusterFS setup
Figure 7 Internal structure of Gluster nodes
Figure 8 Setup of Distributed disperse volume
Figure 9 Compuverde File System Architecture
Figure 10 Internal structure of Compuverde nodes
Figure 11 Test bed for Energy Evaluation
Figure 12 Internal structure of SPECsfs
Figure 13 Internal structure of PowerAPI
Figure 14 Graph of Median using NFS v4.0
Figure 15 Graph of Mean using NFS v4.0
Figure 16 Graph of Mean using SMB v1.0
Figure 17 Graph of Median using NFS v4.0
Figure 18 Graph of Median using SMB v1.0
Figure 19 Graph of STDEV using NFS v4.0
Figure 20 Graph of STDEV using SMB v1.0


1 INTRODUCTION

In this document, we present a thesis report on the energy consumption of distributed storage systems. The document is partitioned into several chapters. Chapter 1 gives the background, which includes information about cloud computing, Big Data, distributed storage systems, CONVINcE, and the aim, objectives and research questions of this thesis. Chapter 2 reviews related work from previous research. Chapter 3 describes the methodology used to evaluate energy consumption and to answer the research questions. Chapter 4 presents the results obtained from the experiments and their analysis. Chapter 5 presents the conclusions of the thesis and future work.

1.1 Background

Big Data and Cloud Computing nowadays require large amounts of storage that are accessible by many servers. Previously, simple shared storage systems were used; examples are VMware VMFS and Red Hat Global File System. They often suffered performance problems and constituted a single point of failure. When shared storage is scaled up, it often hits a throughput bottleneck, and when many users try to access it at once, their requests queue up and latencies increase. To overcome these drawbacks, more advanced Distributed Storage Systems (DSS) are used nowadays. Examples of software-defined DSS are Nexenta, Quobyte and EMC ScaleIO; open-source DSS include Ceph, Gluster and Hadoop; and examples of commercial DSS vendors are NetApp, Dell EMC and Nutanix.

1.1.1 Cloud computing

Cloud Computing is a model for providing computing resources, i.e., for storing data and accessing it over the Internet. It allows hardware and software resources to be shared by industries and individuals across the world with scalability, security and elasticity. It includes different services such as network, storage, applications and databases [1][2][3].

Figure 1 Basic view of Cloud Computing


Figure 1 shows how mobile devices, users, data centers, desktops, laptops, storage databases, storage nodes and SANs can access cloud computing systems directly.

Figure 2 Cloud Computing energy footprint

Figure 2 shows how an energy footprint is generated by the everyday devices we use, such as laptops and mobile phones. When we use these devices, they rely on the data centers of large companies such as Google, Facebook and Microsoft, which increases the electricity consumption of those data centers and, in turn, their consumption of non-renewable electricity sources. This contributes to indirect emissions of greenhouse gases. According to NIST, the cloud consists of four deployment models and three service models. The deployment models are private cloud, public cloud, community cloud and hybrid cloud, and the service models are Platform as a Service (PaaS), Software as a Service (SaaS) and Infrastructure as a Service (IaaS). Cloud computing operations do not always require Internet access for processing data: they may run on a corporate network (a private cloud) or on a cloud accessible from the Internet (a public cloud) [3][2][1].

Service Models:

The three service models are explained in the following [2][3][1]:
a) PaaS: The customer is provided with hardware, an operating environment and a network, and deploys applications created on the cloud infrastructure using the provider's libraries, programming languages and tools. Example: databases.
b) SaaS: The customer uses business applications running on a cloud infrastructure, where the provider supplies the hardware, operating system, application and network. Examples: mail and virtual desktops.


c) IaaS: The customer is provided only with hardware and network, and has to install or develop the operating system and applications. Examples: virtual machines or lightweight containers.

1.1.2 Big Data

Big data refers to data sets characterized by high volume, velocity and variety. Volume relates to the size of the data (e.g., GB or TB), velocity indicates how fast data is created and changed, and variety captures the different formats used for storing data [4].

1.1.3 Why Cloud Computing for Big Data?

Cloud computing enables small and medium businesses to implement big data methods, which helps reduce the cost of hardware and processing. In particular, the Platform as a Service (PaaS) model helps reduce hardware costs. Big data also promotes distributed storage, which does not depend on the local storage of a single computer and is typically based on cloud computing [5][6].

1.1.4 Distributed Storage Systems

Distributed Storage Systems consist of a distributed collection of storage nodes that provide data redundancy by replicating data among the nodes. This approach avoids data loss in case of hardware failures: when a node becomes unavailable, other nodes can serve replicas of the data from the missing node. Such systems are reliable and scalable and are promising storage solutions for cloud computing. Their main advantage is that they provide effective and reliable storage for large amounts of data. A few examples of distributed storage systems are Ceph, Gluster, OpenStack, Hadoop and Sheepdog [7].

Below is a list of terms that are central to the terminology used for distributed storage systems:

a) Data center: A data center is a facility that houses and manages an organization's information technology (IT) systems, storing and managing its data. It is home to the storage systems, servers, applications, telecommunications equipment and databases [8][9].
b) Structured Storage: Structured storage refers to data saved in a structured or standardized format, as in file-based storage systems.
c) Unstructured Storage: Unstructured storage refers to data that is not kept in a structured format and has no pre-defined model. Object-based storage systems fall under unstructured storage.
d) Storage Nodes: A storage node (SN) is often a physical server with one or more storage disks, i.e., Hard-Disk Drives (HDDs) or Solid-State Drives (SSDs), and network IPs [10].
e) Object copies: Object copies create duplicate copies of files during storage operations, ensuring that files are replicated in full to two or more storage nodes.


f) Erasure Coding: Erasure coding is a technique for data redundancy and protection in which the encoded file gains a number of parity blocks computed from the original data. This results in a robust protection scheme that can tolerate high levels of failure.
g) NFS protocol: NFS, the Network File System, is a distributed file system protocol used mainly for UNIX and Linux operating systems that allows files to be shared between servers, desktops, etc. It is a client/server application that allows users to access files on a remote machine as if they were on their own computer. Using the NFS protocol, a user or system administrator can mount a file system [11].
h) File-based storage: File-based storage, also known as file-level storage or file storage, is commonly used for storing data in a hierarchical structure. The stored data is accessed using the industry-standard protocols Network File System (NFS) and Server Message Block (SMB); NFS is used for accessing data in UNIX/Linux and SMB in Windows. File storage keeps files and folders such that both the clients accessing the data and the system storing it see the same structure [12][13].
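A toy example makes it easier to see why parity blocks allow lost data to be reconstructed. The sketch below (illustrative only) uses a single XOR parity block over four equal-sized data blocks; real erasure codes such as the 4+2 scheme used later in this thesis rely on more elaborate coding that tolerates several simultaneous losses, but the principle is the same: a lost block is rebuilt from the surviving blocks and the parity.

```python
from functools import reduce

def xor_parity(blocks):
    """Compute one parity block as the bytewise XOR of all (equal-sized) blocks."""
    return reduce(lambda x, y: bytes(a ^ b for a, b in zip(x, y)), blocks)

def reconstruct(blocks, parity):
    """Repair at most one missing block (marked None) by XOR-ing the survivors with the parity."""
    missing = [i for i, b in enumerate(blocks) if b is None]
    if len(missing) > 1:
        raise ValueError("a single parity block can repair only one lost block")
    if missing:
        survivors = [b for b in blocks if b is not None] + [parity]
        blocks[missing[0]] = xor_parity(survivors)
    return blocks

if __name__ == "__main__":
    data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]   # four data blocks
    parity = xor_parity(data)                     # one parity block
    damaged = [b"AAAA", None, b"CCCC", b"DDDD"]   # one block lost
    print(reconstruct(damaged, parity))           # -> [b'AAAA', b'BBBB', b'CCCC', b'DDDD']
```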

1.1.5 CONVINcE

BTH participates in CONVINcE, a European research project that addresses the challenge of reducing the power consumption of IP-based video transmission over the Internet. The project helps European industry develop new technologies that reduce the energy footprint of video delivery networks [14][15]. Among other things, the project investigates the use of distributed storage systems for storing the video data required for streaming. Transcoding generates read and write operations towards the distributed storage systems while converting high-definition video data into lower-definition video streams. The hypothesis is that doing transcoding in the cloud rather than on the terminals (e.g., smartphones or tablets) will reduce the overall energy consumption. Here we mainly consider two distributed storage systems: Gluster and Compuverde.

Figure 3 Energy consumption by information and communication technologies (ICT)


As shown in Figure 3, the Center for Energy Efficient Telecommunications (CEET) estimates that data centers account for only a modest share of the energy used by the cloud: around 1.4 terawatt-hours (TWh) in 2012, approximately 15 percent of the 9.2 TWh that CEET estimated for the cloud as a whole [16].
a) Gluster: Gluster is a structured distributed storage system that was acquired by Red Hat in 2011. Gluster provides a file-based storage system called GlusterFS, a scale-out network attached storage (NAS) system. It supports Linux, Mac OS, OpenSolaris and NetBSD. It offers good performance, scalability and reliability thanks to its no-metadata-server architecture. The servers are accessed using the industry-standard file storage protocols NFS and SMB, and the system is designed for high-performance virtualized cloud environments. GlusterFS supports both file copies and erasure coding. It runs on off-the-shelf hardware, allowing large distributed storage solutions to be built for data analysis and media streaming services. GlusterFS is open-source software that can be installed freely on both physical and virtual servers [17][18].
b) Compuverde: Compuverde is both a structured and an unstructured distributed storage system that provides a unified solution with no single metadata server, offering high performance and scalability. It provides scale-out network attached storage (NAS) that is completely hardware agnostic and massively scalable, making it a good choice for enterprise storage solutions. It supports both object copies and erasure coding for maintaining data redundancy. It is designed with no single point of failure so that performance does not degrade. The servers are accessed through the industry-standard protocols NFS and SMB. This storage system supports the Linux and Windows operating systems [19][20].

1.1.6 Energy Consumption using PowerAPI

Unlike the usual way of measuring the energy consumption of hardware equipment with general-purpose energy meters or dedicated attached hardware, PowerAPI does not require any specific attached device to measure energy. It is a pure software library in which the estimation is based on energy models that give the consumption of the various hardware components. PowerAPI has a simple modular architecture in which each module is a measurement unit for one hardware component, as described in Section 3.1.5 [21].

1.2 Aim of the Thesis

The aim of my thesis is to study different distributed storage systems (Gluster and Compuverde) and to evaluate their energy consumption under different load conditions.

1.3 Objectives

a) Detailed study of Distributed Storage Systems (Gluster and Compuverde).
b) Evaluating the energy consumption of different Distributed Storage Systems.
c) Understanding the relationship between energy consumption and various patterns of I/O operations (read and write).
d) Detailed study of benchmark tools and measurement approaches for storage performance.


1.4 Research Questions

The research work focuses on evaluating the energy consumption of different distributed storage systems. The following research questions are addressed:
1. What is the energy consumption of different distributed storage systems (Gluster and Compuverde) for different variations in load?
2. What are the various performance metrics related to energy consumption?

1.5 Split of work

This section details the split of work between my project partner and me.

Sections: 1.1.1, 1.1.2, 1.1.3, 1.1.4, 1.1.5
Topics: Cloud computing, Big Data, Why Cloud Computing for Big Data, Distributed storage systems, CONVINcE
Contributors: Samuel Sushanth Kolli, Poojitha Rajana

Sections: 1.2, 1.3, 1.4, 1.1.6
Topics: Aim of the Thesis, Objectives, Research questions, Energy Consumption using PowerAPI
Contributor: Samuel Sushanth Kolli

Sections: 2
Topics: Related work
Contributors: Samuel Sushanth Kolli, Poojitha Rajana

Sections: 3.1.1, 3.1.2, 3.1.3, 3.1.4, 3.1.5
Topics: Why SPECsfs 2014, SPECsfs 2014, Description of Four workloads, Why PowerAPI, PowerAPI
Contributor: Samuel Sushanth Kolli

Sections: 3.2.1.1, 3.2.1.2, 3.2.1.3, 3.2.2.1, 3.2.2.2, 3.2.2.3
Topics: Architecture of Gluster, Internal Architecture of Gluster, Distributed Disperse Volume, Architecture of Compuverde, Internal Architecture of Compuverde, File Coding Technology
Contributors: Samuel Sushanth Kolli, Poojitha Rajana

Sections: 3.3.1, 3.3.2, 3.3.3, 3.3.4
Topics: Internal Structure of SPECsfs, Internal Structure of PowerAPI, Gluster Test Configuration, Compuverde Test Configuration
Contributor: Samuel Sushanth Kolli

Sections: 3.4, 3.5.1, 3.5.2, 3.5.3, 3.5.4, 3.5.5
Topics: Energy Consumption Metrics, Gluster File System Installation, Compuverde File System Installation, Mounting the File Systems, SPECsfs Workload Generator, PowerAPI's Power Samples
Contributor: Samuel Sushanth Kolli

Sections: 4.1.1, 4.1.2, 4.1.3, 4.1.4, 4.1.5, 4.1.6
Topics: Mean Values of Compuverde and Gluster for NFS, Mean Values of Compuverde and Gluster for SMB, Median Values of Compuverde and Gluster for NFS, Median Values of Compuverde and Gluster for SMB, Standard Deviation Values of Compuverde and Gluster for NFS, Standard Deviation Values of Compuverde and Gluster for SMB
Contributor: Samuel Sushanth Kolli

Sections: 5.1, 5.2, 5.3, 5.4, 5.5
Topics: Conclusion Analysis from Graphs, Conclusion Analysis from Invalid Runs, Conclusion from Architecture, Conclusion Analysis from Internal Architecture, Future Work
Contributor: Samuel Sushanth Kolli

Table 1 Split of work


2 RELATED WORK

This chapter discusses previous research related to storage systems. Earlier work has focused on different aspects of distributed storage systems; the studies we reviewed before conducting our research are summarized below.

Ping Chen et al. [22] proposed a new storage architecture based on a distributed file system. The authors address issues such as availability, scalability, reliability of data transmission, and file read/write performance when a traditional file system is used to manage massive resources in networked systems. A performance comparison was done between two file systems, the Google File System (GFS) and the Moose File System (MFS). They found MFS to be the better file system compared with GFS, because GFS has a single point of failure; the file performance comparison also showed that MFS mainly handles smaller files whereas GFS supports larger files.

Noureddine et al. [21] compared different implementations of an algorithm using different programming languages. The study reveals that the choice of programming language affects energy consumption; for example, interpreted programming languages consume more energy than compiled ones. In this paper, a runtime energy monitoring framework reports the energy consumption of system processes at the OS level. The authors used this library to better understand the potential energy leaks of legacy software and the impact of algorithmic choices and programming languages on energy consumption. Their preliminary study is an empirical evaluation of eight implementations of the Towers of Hanoi problem.

Noureddine et al. [23] delved deeper into the previous result by monitoring not only the process but also its source code, looking for energy hotspots in the application code. In this paper the authors developed a finely grained runtime energy monitoring framework to help developers remove energy hotspots with better accuracy. The approach has a two-layer architecture: OS level and process level. The OS-level layer monitors the energy consumption of processes, while the process-level layer monitors energy consumption at the class or method level and then focuses on the potential energy hotspots. Their preliminary validation demonstrates the approach on Jetty web servers monitored under various stress conditions.

Songlin Bai et al. [24] studied the concept and features of distributed file systems and evaluated different systems, namely Hadoop, MooseFS and Lustre. During the experimentation, the authors examined file backup. MooseFS sets the number of backup copies per file or folder; by default each file has a single extra copy, i.e., MFS stores two copies of a file at any time (one original file and one backup file). This backup functionality is not supported in Lustre, whereas systems like Hadoop and MFS usually keep multiple backup copies for data redundancy. In MFS, when a storage node goes down, the storage system automatically copies the lost data of that node to the remaining nodes, but only one additional copy is made because one copy was lost. Lustre, in contrast, separates the Object Storage Target (OST) and the Object Storage Server (OSS), with OSTs being managed by OSSs in the Lustre storage system. For data balancing, Hadoop and MFS choose which storage servers store the data; automatic data balancing is not supported in Lustre. Hadoop, MFS and Lustre all support scale-out and data blocks.


In order to check the reliability and cost effectiveness of storage systems, Lars et al. [7] evaluated four large distributed storage systems. The authors compared two unstructured storage systems (Compuverde Unstructured and OpenStack's Swift) and two structured storage systems (Compuverde Structured and Gluster) for cloud computing. Distributed Hash Tables (DHTs) were used to track the distributed data and multicasting was used to access the stored data. Performance measurements were taken for read/write/delete operations, and the time taken to recover when a node goes down was also noted. Through the experiments they observed that the performance of Compuverde was superior to that of the other systems, especially when a large number of clients requested simultaneous access. Recovery tests also showed that Compuverde performs better than the others. Finally, the authors concluded that multicasting performed better than DHTs.

In order to identify bottlenecks, Paulo Maciel et al. [25] evaluated the performance and capacity of the Sheepdog storage system. To identify the bottlenecks, they measured write and read throughput while increasing the number of nodes and the number of data copies in the cluster. The Bonnie++ benchmark was used for analyzing the performance of the Sheepdog storage system. Their results showed that the best read performance is obtained with small cluster sizes, where the cluster size is the number of nodes in the Sheepdog cluster; for performance benefits, systems with small cluster sizes should run the VM inside the cluster rather than outside it. For frequent write operations, the opposite of the read case was preferred, that is, large clusters performed better. Their results also showed that increasing the number of nodes did not noticeably degrade write throughput, although they did observe a performance loss as data redundancy increased.

Zhang et al. [26] studied the Ceph architecture on an OpenStack cloud. Its functionality was checked using different benchmarks such as Bonnie++, DD, Rados Bench, OSD Tell, IPerf and Netcat. These benchmarks were used to check the data copying speed and the read/write performance of Ceph. The results indicated good performance and scalability of Ceph in an OpenStack cloud: as the number of client requests increased, the authors observed only a slow decline in performance, which is a good result.

Lars et al. [27] studied the fault tolerance of a cache together with a two-step evaluation approach that makes it possible to quantify how different design decisions affect performance for different workload cases. In this study they had two versions of the cache implementation and considered three cases: private, shared and mixed workloads. Results showed an increase in performance for the first version of the cache implementation in the private and mixed cases, but the improvement was limited for the shared case. With the final version of the cache, the authors observed performance improvements for all cases; compared to the case with no cache, throughput improved by as much as a factor of 50-75. The authors also quantified how different design decisions affect the performance of the different workload cases using workload transparency and the two-step evaluation approach.


3 METHODOLOGY

3.1 Method Used

The method is to use the Power Application Programming Interface (PowerAPI) software library to measure energy consumption while a synthetic workload generator, the Standard Performance Evaluation Corporation Solution File Server (SPECsfs 2014), generates streaming video data. PowerAPI is used to evaluate the energy consumption of the GlusterFS and Compuverde distributed storage systems: the workload generator produces various levels of workload and the OS-level energy monitoring tool PowerAPI measures the energy consumed by the distributed storage system.

3.1.1 Why SPECsfs 2014?

SPECsfs 2014 is the latest version of the Standard Performance Evaluation Corporation file-based benchmark that measures the efficiency of storage solutions. It differs from other file-based benchmark tools in that it provides standard methods for comparing performance across different vendor platforms and gives meaningful results. SPECsfs 2014 follows the standards put forward by SPEC, a vendor-independent industry standards body for performance benchmarking. The results are trustworthy because the reports are audited by a SPEC team and published on the SPEC SFS website [28].

3.1.2 SPECsfs 2014

SPECsfs 2014 is a Standard Performance Evaluation Corporation benchmark tool for measuring file server performance using two industry-standard protocols, NFS and SMB; it will run on any version of these protocols. It measures the maximum throughput and the response time of a storage system. SPECsfs 2014 is protocol independent and performance is measured at the application level. The operating systems supported by SPECsfs 2014 are Linux, Windows 8, Windows 7, Windows Server 2003, Windows Server 2008, Windows Server 2012, Mac OS, BSD, Solaris and AIX, and it can be used for testing any type of file system offered by these systems. SPECsfs 2014 represents real data-processing file system environments with multiple workloads that run independently. Figure 4 shows an example topology of the SUT (Solution Under Test), which must have at least one load-generating client and use one of the supported protocols, i.e., NFS or SMB [28][29].

Figure 4 Example Topology


Figure 4 depicts how, when SPECsfs is installed on the load-generator hardware, the main load generator, called the prime client, together with the other load generators, called non-prime clients, connects to the storage servers/nodes over the network using the file access protocols SMB and NFS.

SPECsfs 2014 provides four different workloads, each representing a specific application area [28]:
1. VDA (Video Data Acquisition)
2. VDI (Virtual Desktop Infrastructure)
3. SWBUILD (Software Build)
4. DATABASE (Database workload)

With SPECsfs, performance is measured using metrics such as throughput in operations per second and response time in milliseconds. These metrics are produced for whichever of the four workloads is tested. The entire Solution Under Test (SUT) consists of the storage servers (the file-based servers), the network and the load generators [28][29].

SUT (Solution Under Test): The Solution Under Test comprises all the software and hardware components that provide file services to the benchmark, including both the physical and the virtual components of the storage servers or storage systems and the load generators [29].

3.1.3 Description of the Four Workloads

1. Video Data Acquisition (VDA): The business metric, or load parameter, for VDA is STREAMS. This workload represents applications that store data captured from a live source such as a surveillance camera. A stream refers to one application instance storing data from a single source, i.e., a single video feed. The workload consists of two workload objects, as shown in Table 2.

Workload Object   Type
VDA1              Data Streams
VDA2              Companion Application

Table 2 Types of workload objects in VDA

As shown in Table 2, each VDA1 stream corresponds roughly to a 36 Mb/s bit rate, i.e., the stream of a high-definition video, and consists of 100% sequential write operations.

VDA2 is the companion application, which can also be described as the application providing user access. The companion application is connected to the stream: it provides the stream with the protocols needed to publish live stream data to a web page or a user's device from the server where the video is hosted. Since this is a simulation of a real-world video stream, VDA2 takes the place of that software application. VDA2 consists of 84% random write operations, 5% read, 3% readdir, 2% read-modify-write, 2% access, 2% stat, 1% create and 1% unlink operations; the sketch below illustrates this operation mix.
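To make the operation mix concrete, the sketch below draws a synthetic sequence of operations with exactly these weights. It is only an illustration of how a weighted mix can be generated; it is not SPECsfs code and does not reproduce SPECsfs's internal scheduling.

```python
import random

# VDA2 operation mix, taken from the percentages listed above (they sum to 100).
VDA2_MIX = {
    "write": 84, "read": 5, "readdir": 3, "read-modify-write": 2,
    "access": 2, "stat": 2, "create": 1, "unlink": 1,
}

def draw_operations(n, seed=42):
    """Draw n synthetic operations according to the VDA2 weights."""
    rng = random.Random(seed)
    ops, weights = zip(*VDA2_MIX.items())
    return rng.choices(ops, weights=weights, k=n)

if __name__ == "__main__":
    sample = draw_operations(10_000)
    for op in VDA2_MIX:
        print(f"{op:18s} {sample.count(op) / len(sample):6.1%}")
```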


For example, consider an IP camera streaming live to a website. This is achieved in two steps. First, an encoder takes the pictures from the IP camera and transmits them in a suitable format to a video hosting service; DaCast is one example of such an encoder. The second step is to take that video from the hosting service and publish it on a web page by embedding it in a video player. The encoding step is needed because the default protocol used by IP cameras, the Real Time Streaming Protocol (RTSP), is intended for use within a network, not for video streaming on the web. For this reason a software application (the companion application) is needed to implement protocols such as the Real Time Messaging Protocol (RTMP) to deliver the video to the web page (user or device) [28][30].

2. Virtual Desktop Infrastructure (VDI): The business metric, or load parameter, for VDI is DESKTOPS. This workload represents VDI environments, capturing the traffic between the hypervisor and the storage when virtual machines run on hypervisors such as KVM, ESXi, Hyper-V and Xen [28].
3. Software Build (SWBUILD): The business metric, or load parameter, for Software Build is SWBUILD. This is a classic metadata-intensive build workload, derived from the analysis of software builds and traces collected on real systems [28].
4. Database workload (DATABASE): The business metric, or load parameter, for the database workload is DATABASE. This workload represents the behaviour of a database and is a combination of the DB_LOG and DB_TABLE workloads; Table 3 describes the two components [28].

Workload   Type
DB_LOG     Log writer component of a database operation
DB_TABLE   Database component

Table 3 Types in Database Workload

The above describes the four workload types available in SPECsfs 2014. We chose the VDA (Video Data Acquisition) workload as our load parameter because our focus is on how distributed storage systems perform in terms of I/O when storing high-definition video data streams [29].

3.1.4 Why PowerAPI?

PowerAPI is a middleware monitoring tool for building software-defined power meters without the use of external hardware. A software-defined power meter is assembled from configurable software libraries that can estimate the power consumption of software processes in real time at the OS level and at the process level. PowerAPI sensor modules collect raw metrics from a wide range of sources (e.g., processor interfaces, hardware counters, physical meters, OS counters) and deliver power consumption estimates via different channels (including the file system, network, web and graphical interfaces). As a middleware monitoring tool, PowerAPI offers the capability of assembling power meters to accommodate user requirements.


3.1.5 PowerAPI

PowerAPI monitors the energy spent at the OS level and at the process level. Unlike typical state-of-the-art approaches, it does not require any external device to measure energy consumption. Instead, it uses analytical energy models to estimate the consumption of hardware components such as the CPU, memory or disk. It is a purely software approach with a modular architecture, where each module provides the metric for one hardware component [21][23].

Figure 5 PowerAPI Architecture

Figure 5 shows that PowerAPI has three sensors, measuring the CPU, the memory and the disk, and that the formula modules contain the corresponding power formulae for the CPU, memory and disk. The raw metrics gathered by the sensors from these resources are collected on the event bus through the PowerAPI APIs.

PowerAPI is designed with a modular architecture whose core consists of power modules that can be started or stopped at runtime. A set of OS-dependent sensor modules (e.g., for the CPU or the network) collects raw data about the hardware resources and their utilization from the devices or through the operating system. The raw data is then passed to a set of OS-independent formula modules that use power models to compute the energy consumption of each hardware component; the energy of a running process is estimated per hardware resource. A local database stores the configuration of the hardware resources. All of these modules are managed by a life-cycle module, which allows modules to be started, stopped, added, removed or modified depending on the monitoring needs and the commands issued by applications [23][21].
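The modular sensor/formula/life-cycle design can be illustrated with a small conceptual model. The sketch below is not PowerAPI's actual API; it is a simplified Python illustration in which an OS-dependent sensor collects raw CPU utilization for a process and an OS-independent formula converts it into a power estimate using an assumed linear power model (the idle and peak wattages are made-up values, not measurements).

```python
from dataclasses import dataclass

@dataclass
class CpuSample:
    """Raw data an OS-dependent CPU sensor might collect for one process."""
    pid: int
    cpu_share: float  # fraction of CPU time used by the process (0..1)

class CpuSensor:
    """OS-dependent module: reads raw utilization (a canned value here)."""
    def read(self, pid):
        return CpuSample(pid=pid, cpu_share=0.12)  # placeholder measurement

class CpuFormula:
    """OS-independent module: maps raw data to power with a linear model."""
    def __init__(self, idle_w=5.0, peak_w=13.0):   # assumed, not measured, values
        self.idle_w, self.peak_w = idle_w, peak_w

    def to_power(self, sample):
        return self.idle_w + sample.cpu_share * (self.peak_w - self.idle_w)

class PowerMeter:
    """Life-cycle role: wires the sensor to the formula and reports per-process power."""
    def __init__(self):
        self.sensor, self.formula = CpuSensor(), CpuFormula()

    def sample(self, pid):
        return self.formula.to_power(self.sensor.read(pid))

if __name__ == "__main__":
    print(f"Estimated CPU power for PID 1234: {PowerMeter().sample(1234):.2f} W")
```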


3.2 Experimental setup

This section deals with the experimental setup of both GlusterFS and Compuverde on bare metal.

3.2.1 GlusterFS setup

Below are the hardware specifications used to set up GlusterFS on bare-metal nodes in the data center. Basically, we set up a six-node GlusterFS cluster.

The hardware specification of the Gluster nodes we used is as follows:
CPU: Intel Atom Processor D525
RAM: 2x 2 GB
Motherboard: SuperMicro X7SPE-H-D525
RAID Adapter: Adaptec RAID ASR-51645 09 ROHS
Network: 1 Gbit
Boot disk: SS-DM016-Phi 16 GB
Cache disk: Intel 330 60 GB SATA (connected to the motherboard SATA)
Data disks: 4x WD Green 2 TB SATA (connected to the chassis backplane -> RAID Adapter)
Chassis: Promise 16 slot (Promise is the chassis name)

3.2.1.1 Architecture of the GlusterFS setup

Figure 6 shows the Gluster setup. It is deployed on the bare-metal nodes, and the network interface between the storage nodes and the clients is 1 Gbit. A detailed explanation of the setup is given in the following sections [17].

Figure 6 GlusterFS setup

Figure 6 shows the interconnection between the storage nodes, the switch and the management and gateway VM, all using a 1 Gbit network interface.


3.2.1.2 Internal architecture of the Gluster nodes

Figure 7 shows the internal architecture of GlusterFS. The Gluster storage concepts are explained in the following [18][17]:

a. Brick: A brick is the basic unit of Gluster storage, defined by an export directory on a server in the trusted storage pool.

b. Volume: A volume is a collection of bricks; all operations in Gluster storage take place on volumes.

c. Translator: A translator connects to one or more subvolumes and itself offers a subvolume connection, processing the operations that pass through it.

d. Sub volume: A subvolume is a brick after it has been processed by at least one translator.

Figure 7 Internal structure of Gluster nodes

Figure 7 shows that users can access the Gluster cloud storage through the NFS and CIFS protocols. Glusterd is the Gluster daemon that manages the storage, deciding where and how files are stored on the storage nodes; the files themselves are stored in the bricks. The administrator controls the storage cloud through the command line interface.

e. Server: A server may be a physical (bare-metal) machine or a virtual machine that hosts the file systems in which the data is stored.

f. Glusterd: Glusterd is the Gluster management daemon, which has to run on all servers in the trusted storage pool.


g. Metadata: Conventional systems use a metadata server, but Gluster storage does not require one; it locates files algorithmically using an elastic hashing algorithm.

h. Trusted storage pool: A trusted storage pool is a trusted network of storage servers. When you start the first server, the storage pool contains that server alone; further storage servers are added to the pool with the probe command issued from a server already in the trusted network.

i. Client: A client is a machine that mounts the volumes exported by the servers.

j. File system: A file system is a set of methods and data structures used to create, store and organize data on a storage device.

3.2.1.3 Distributed disperse volume

The figure below shows how the distributed disperse volume, which helps protect the data, is used in the Gluster storage system [31].

Figure 8 Setup of Distributed disperse volume

As shown in Figure 8, a distributed dispersed volume contains multiple sets of bricks per server, also called subvolumes, in which the data is stored with erasure coding. The files that are stored or created are distributed over the sets of bricks (subvolumes) with erasure coding. With this method, losing up to the redundant number of bricks within a dispersed subvolume causes no data loss, because the stored data can still be reconstructed from the remaining bricks; the erasure-coded subvolume thus gives the data extra protection. The distributed dispersed volume in Gluster supports only certain erasure-coding configurations, and the recommended brick configurations for creating a distributed disperse volume are given below [32][31].


Recommended Brick configuration [33]:

a) For storage consisting of a 12-HDD chassis:

12 HDD Chassis
Supported configurations: 4+2, 8+4, 8+3
Redundancy level: 2, 4, 3
Bricks per server (sub volume): (2,1), (1,2 or 4), (1,2)
Nodes that can be lost: (1,2), (1,2,4), (1)
Maximum brick failure count within a sub volume: (2,2), (4,4,4), (3)
Nodes/servers that can be increased: (3,6), (3,6,12), (6)
Minimum number of sub volumes: (6,12), (3,6,12), (6)

Table 4 Configuration for 12 HDD Chassis

Table 4 shows the supported configurations, redundancy level, bricks per server, nodes that can be lost, and so on. A configuration of 4+2 implies redundancy of up to two nodes, and a configuration of 8+4 implies redundancy of up to four nodes. Of the erasure coding types above, we have chosen Erasure (4+2) with a total of six bare-metal nodes. In general, for Erasure (a+b):
b = the maximum number of nodes that can be lost in a storage cluster
a + b = the number of nodes the cluster must have
a = the number of nodes holding data after b nodes are lost

So for Erasure (4+2), the cluster must have six storage nodes in total; it can lose two nodes and still allow the original data to be reconstructed from the remaining nodes [34].
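This arithmetic can be written down compactly. The sketch below simply restates the a + b relationship for the configurations listed in Table 4 (assuming one brick per node, as in our six-node 4+2 setup); it does not model brick placement.

```python
def erasure_properties(a, b):
    """Properties of an Erasure (a+b) scheme: a data fragments, b redundancy fragments."""
    return {
        "nodes_required": a + b,           # total storage nodes in the cluster
        "nodes_that_can_be_lost": b,       # data is still reconstructable
        "storage_efficiency": a / (a + b), # fraction of raw capacity holding user data
    }

if __name__ == "__main__":
    for a, b in [(4, 2), (8, 3), (8, 4)]:  # configurations from Table 4
        p = erasure_properties(a, b)
        print(f"Erasure ({a}+{b}): {p['nodes_required']} nodes, "
              f"tolerates {p['nodes_that_can_be_lost']} node failures, "
              f"{p['storage_efficiency']:.0%} usable capacity")
```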

b) For storage consisting of a 24-HDD chassis:

24 HDD Chassis
Supported configurations: 4+2, 8+4
Redundancy level: 2, 4
Bricks per server (sub volume): (2,1), (1,2 or 4)
Nodes that can be lost: (1,2), (1,2,4)
Maximum brick failure count within a sub volume: (2,2), (4,4,4)
Nodes/servers that can be increased: (3,6), (3,6,12)
Minimum number of sub volumes: (12,24), (6,12,24)


Table 5 Configuration for 24 HDD Chassis

c) For storage consisting of a 36-HDD chassis:

36 HDD Chassis
Supported configurations: 4+2, 8+4, 8+3
Redundancy level: 2, 4, 3
Bricks per server (sub volume): (2,1), (1,2 or 4), (1,2)
Nodes that can be lost: (1,2), (1,2,4), (1)
Maximum brick failure count within a sub volume: (2,2), (4,4,4), (3)
Nodes/servers that can be increased: (3,6), (3,6,12), (6)
Minimum number of sub volumes: (18,36), (9,18,36), (19)

Table 6 Configuration for 36 HDD Chassis

d) For storage consisting of a 60-HDD chassis:

60 HDD Chassis
Supported configurations: 4+2, 8+4, 8+3
Redundancy level: 2, 4, 3
Bricks per server (sub volume): (2,1), (1,2 or 4), (1,2)
Nodes that can be lost: (1,2), (1,2,4), (1)
Maximum brick failure count within a sub volume: (2,2), (4,4,4), (3)
Nodes/servers that can be increased: (3,6), (3,6,12), (6)
Minimum number of sub volumes: (30,60), (15,30,60), (32)

Table 7 Configuration for 60 HDD Chassis

In our setup we used a 24-HDD chassis, based on our hardware availability, with a maximum of six bare-metal nodes and erasure coding 4+2 for the entire setup, following the brick configuration for 4+2 shown for the 24-HDD chassis above. Gluster does not provide a management tool with a GUI, so all configuration is done from the Linux terminal [33].
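Because the configuration is done from the terminal, the volume for a setup like this is created with the gluster command-line tool. The sketch below only assembles and prints a command line of the general form used for a distributed disperse volume; the host names and brick paths are hypothetical, and the exact syntax and brick ordering should be verified against the Gluster documentation for the installed version.

```python
def disperse_volume_cmd(volume, hosts, bricks_per_host, data, redundancy):
    """Build a `gluster volume create` command string for a (data+redundancy)
    distributed disperse volume. Host names and brick paths are placeholders."""
    bricks = [f"{host}:/bricks/{volume}/brick{i}"
              for i in range(1, bricks_per_host + 1)   # group bricks so that each
              for host in hosts]                       # disperse set spans all hosts
    return " ".join(["gluster", "volume", "create", volume,
                     "disperse", str(data + redundancy),
                     "redundancy", str(redundancy)] + bricks)

if __name__ == "__main__":
    # Six hypothetical storage nodes with four bricks each -> 24 bricks,
    # i.e. four 4+2 disperse subvolumes (matching the 24-HDD configuration above).
    hosts = [f"node{i}" for i in range(1, 7)]
    print(disperse_volume_cmd("vda_vol", hosts, bricks_per_host=4, data=4, redundancy=2))
```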

3.2.2 Compuverde Setup

Below are the requirements for setting up Compuverde on bare metal (physical servers) in the data center.
General requirements:
Minimum number of nodes: 3
The IP addresses below are not a requirement; one is free to choose any set of IP addresses. We list ours to give a general description of our network.
Server IP range: 192.168.12.17/24 to 192.168.12.22/24
Management node IP: 192.168.12.9/24 (172.168.12.9 to reach it externally)


Hardware specifications:
CPU: Intel Atom Processor D525
RAM: 2x 2 GB
Motherboard: SuperMicro X7SPE-H-D525
RAID Adapter: Adaptec RAID ASR-51645 09 ROHS
Network: 1 Gbit
Boot disk: SS-DM016-Phi 16 GB
Cache disk: Intel 330 60 GB SATA (connected to the motherboard SATA)
Data disks: 4x WD Green 2 TB SATA (connected to the chassis backplane -> RAID Adapter)
Chassis name: Promise Card 16 slot

3.2.2.1 Architecture of the Compuverde setup

Figure 9 shows the Compuverde setup. It is deployed on bare-metal nodes (physical servers), and the network link between the storage nodes and the clients is a 1 Gbit interface. A detailed explanation of the setup is given in the following sections.

Figure 9 Compuverde File System Architecture

Figure 9 shows the Compuverde setup, from the storage nodes and cloud nodes to the client running the Compuverde management tool.


3.2.2.2 Internal architecture of the Compuverde nodes

Compuverde has a web-scale architecture, i.e., the storage has the flexibility that is critical to the future of any business and to the rise of the Internet of Things, while still supporting native protocols such as NFS and SMB. The file system it creates is available throughout the nodes and ensures a reliable service [34].
Compuverde cluster: A Compuverde cluster is a group of storage nodes. The storage nodes have spinning drives and cache drives, as shown in Figure 10 [34].

Figure 10 Internal structure of Compuverde nodes

Figure 10 shows the internal architecture of the Compuverde nodes and how Compuverde stores files on the spinning drives and SSDs. It illustrates that all the storage nodes, including every storage disk, collectively form the storage solution.

Compuverde node: A Compuverde node can be a physical (bare-metal) or virtual computer system on which the Compuverde software is installed. A node typically has a sufficient number of storage drives, such as hard-disk drives (HDDs) and solid-state drives (SSDs), which are used as data and cache drives [34]. All the nodes in the storage cluster are connected to each other through a shared network and together form a file system spanning all the storage nodes in the cluster. Compuverde supports both web-scale and hyper-converged architectures. In the web-scale architecture, all nodes in the storage cluster have identical roles, which eliminates performance bottlenecks. The system is designed to automatically rebalance data from time to time; this automation ensures that the data is evenly distributed among the nodes without disrupting the storage service [34]. Compuverde provides a GUI-based management tool, an easy-to-use application with control functions that help manage the underlying Compuverde solution effectively. It has built-in functions to configure, monitor and administer the Compuverde storage cluster, with dynamic control functions such as reading statistics, changing configurations and monitoring logs [34].


3.2.2.3 File coding technology in Compuverde

We used erasure coding to protect the data spread over the different nodes in case of failure. The following describes how Compuverde uses erasure coding and which erasure codes it supports [34]. In Table 8, when the erasure coding type is Erasure (2+1), the total number of nodes used is three and one storage node can be lost; similarly, for Erasure (4+2) the total number of nodes used is six and two storage nodes can be lost.

Supported erasure coding by Compuverde [34]:

Type of erasure coding   Number of nodes   Nodes that can be lost
Erasure (2+1)            3                 One storage node
Erasure (2+2)            4                 Two storage nodes
Erasure (4+1)            5                 One storage node
Erasure (4+2)            6                 Two storage nodes
Erasure (6+1)            7                 One storage node
Erasure (6+2)            8                 Two storage nodes
Erasure (8+1)            9                 One storage node
Erasure (8+2)            10                Two storage nodes
Erasure (9+3)            12                Three storage nodes
Erasure (10+1)           11                One storage node
Erasure (10+2)           12                Two storage nodes
Erasure (10+6)           16                Six storage nodes
Erasure (12+1)           13                One storage node
Erasure (12+2)           14                Two storage nodes

Table 8 List of supported Erasure codes

Of all the erasure coding types in Table 8, we have chosen Erasure (4+2) with a total of six bare-metal nodes. The table can be read as follows: for Erasure (a+b), b is the maximum number of nodes that can be lost in a storage cluster and a + b is the number of nodes the cluster must have. So for Erasure (4+2), the cluster must have six storage nodes in total; it can lose two nodes and still allow the original data to be reconstructed from the remaining nodes [34].


With Compuverde, losing a disk is not the same as losing a node of the storage cluster. A node may comprise more than one disk, and in that case, when a single disk of the node is lost, data can still be recovered from the remaining disks. So if we have six nodes, each with four disks, and Erasure (4+2) applied to the file share, then losing three disks may not affect data recovery, as long as the required data can be reconstructed from the data present on the other disks of the nodes. However, if the cluster loses three whole nodes, the data cannot be recovered [34].
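The difference between losing disks and losing whole nodes can be illustrated with a toy placement model. The sketch assumes that each file keeps one coded fragment per node, on one randomly chosen disk of that node, and that a file survives as long as at most two of its six fragments are lost; this is a deliberate simplification for illustration, not Compuverde's actual placement logic.

```python
import random

NODES, DISKS_PER_NODE, TOLERATED_LOSSES = 6, 4, 2   # six nodes, 4+2 coding

def file_survives(failed_disks, rng):
    """One fragment per node on a random disk; the file survives if <= 2 fragments are lost."""
    placement = [(node, rng.randrange(DISKS_PER_NODE)) for node in range(NODES)]
    lost = sum(1 for fragment in placement if fragment in failed_disks)
    return lost <= TOLERATED_LOSSES

def survival_rate(failed_disks, trials=10_000):
    rng = random.Random(1)
    return sum(file_survives(failed_disks, rng) for _ in range(trials)) / trials

if __name__ == "__main__":
    rng = random.Random(2)
    all_disks = [(n, d) for n in range(NODES) for d in range(DISKS_PER_NODE)]
    three_disks = set(rng.sample(all_disks, 3))                              # three individual disks
    three_nodes = {(n, d) for n in range(3) for d in range(DISKS_PER_NODE)}  # three whole nodes
    print(f"3 random disks lost: {survival_rate(three_disks):.1%} of files survive")
    print(f"3 whole nodes lost : {survival_rate(three_nodes):.1%} of files survive")
```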

3.3 Testbed of the Energy Evaluation Experiment

As discussed in Sections 3.1.2 and 3.1.5, where SPECsfs 2014 and PowerAPI are described in detail, we configured two file-based distributed storage systems. Figure 11 shows the testbed with PowerAPI and SPECsfs configured for the following two distributed storage systems:
1. GlusterFS storage
2. Compuverde storage
Hardware of the SPECsfs load generator:
1. CPU: Intel Xeon Processor E5-2620
2. RAM: 16 GB
3. Motherboard: SuperMicro X9SRi-F

PowerAPI has no dedicated hardware requirements, since it is installed at the operating-system level.

Figure 11 Test bed for Energy Evaluation


Figure 11 shows the overall setup of the energy evaluation of the storage systems. The management and gateway VMs are used for controlling all the nodes; because they have two network interfaces, they also play an important role in transferring files between networks. The workload generator is the SPECsfs workload generator, and the six Compuverde and six Gluster nodes are also shown; all of these are connected to each other through a switch. SPECsfs is software and must be installed on hardware (the workload generators). A workload generator is also called a client; if there are multiple workload generators, they form a network of workload generators (clients) and one of them is designated the prime client. The SPECsfs software can be installed on multiple workload generators, and a configuration file called sfs_rc on the prime client (or the only client, if there is just one load generator) is edited according to the experiment. We have one such client, which therefore acts as our workload generator. For the energy evaluation measurements, each storage node is configured with the PowerAPI software library, and the load-generator client is configured with PowerAPI as well, as shown in Figure 11. We used the same load configurations for both storage systems, i.e., for both Gluster and Compuverde. As discussed in Section 3.1.2, SPECsfs offers four workload types; we chose VDA, whose load metric is streams [28].

3.3.1 Internal structure of SPECsfs 2014 for generating load for PowerAPI evaluations: The figure below shows how the performance measurements are applied to the storage systems under the VDA workload.

Figure 12 Internal structure of SPECsfs

Figure 12 shows the internal structure of SPECsfs, explained below. SPECsfs generates a distributed load from multiple clients and supports both UNIX-based and Windows-based testing. Protocols like NFS and SMB are responsible for the communication between the benchmark/workload generator and the storage system. The performance and energy evaluations are done through these protocols mounted between the benchmark (client) and the storage nodes (server). The SPECsfs clients (if there is more than one client, the main client is called the prime client and the rest are non-prime clients) work simultaneously to report the results. In the following, you can see the benchmark parameters that are used in the configuration of the benchmark execution.


There is a configuration file called sfs_rc with the following parameters, which are adjusted according to the experiment we conduct [28][29]; an illustrative example of the file follows the list:

a) BENCHMARK: Name of the benchmark that needs to be run in the configuration. Here we have considered VDA.

b) LOAD: Each workload has its own load metric; for VDA the load metric is STREAMS. If the load consists of multiple values, it is used in conjunction with INCR_LOAD and NUM_RUNS. We have considered the VDA workload (streams), so increasing this parameter increases the number of streams.

c) INCR_LOAD: The value by which the load is increased for successive load increments. This parameter can be used only if LOAD has been chosen as a multiple value, and in that case the LOAD and INCR_LOAD values must be equal.

d) NUM_RUNS: The number of load points (runs) the experiment has to execute.

e) CLIENT_MOUNTPOINT: Specifies the shares, local directories and local mount points on which the test has to be run.
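A minimal sketch of what such an sfs_rc configuration could look like for this setup is shown below; the parameter values and the mount-point string are illustrative assumptions, not the exact file used in the experiment:

BENCHMARK=VDA
LOAD=1
INCR_LOAD=1
NUM_RUNS=6
CLIENT_MOUNTPOINT=loadgen:/mnt/share

With these values the benchmark would start at one stream and add one stream per run, for six runs in total, matching the load points 1 to 6 used in this thesis.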

3.3.2 Internal structure of PowerAPI

Figure 13 Internal structure of PowerAPI

Figure 13 shows the internal structure of PowerAPI. The life cycle manager manages the power sample values in PowerAPI: it collects the power samples through the sensor modules, sends them to the formula modules, then collects the final power sample values and stores them in the database. The values are then displayed on the console/terminal or printed to a file, as requested by the user through the terminal. The sensor modules are used to sense the load a process puts on the system, and the formula modules implement the necessary formulas such as mean, median, etc.


3.3.3 Gluster test configuration:

Gluster has been installed on Ubuntu 14.04 LTS server with the latest version of Gluster at the time, version 3.8, released in June 2016. The hardware used in this configuration is discussed in section 3.2.1; the storage system consists of 6 storage nodes with four disks per node. Each disk has a capacity of 2 TB, i.e., a total of 8 TB per storage node. The communication protocol used between the proxy servers and the load-generating clients is NFS. Gluster can be installed on either Ubuntu or CentOS. The erasure coding used for Gluster is Erasure (4+2) only. At the brick level, the distributed dispersed volume (4+2) is used and is set up in such a way that it is equal to the Erasure (4+2) in Compuverde.

3.3.4 Compuverde test configuration:

Compuverde has been installed on CentOS 7.0 with the latest version of Compuverde at the time, version 1.3.203, released in July 2016. The hardware used in this configuration is discussed in section 3.2.2; the storage system consists of 6 storage nodes with four disks per node. Each disk has a capacity of 2 TB, i.e., a total of 8 TB per storage node. The communication protocol used between the proxy servers and the load-generating clients is NFS. The Compuverde storage system is available as an ISO file with CentOS as the default operating system. An Ubuntu system could also have been used for the Compuverde storage system, but installing from the ISO file is much easier than installing Compuverde on Ubuntu. The key goal of choosing a file coding such as Erasure (4+2) on a file system is to achieve reliability, capacity, performance or availability of the data stored on the file share; the selection of the file encoding strongly influences these factors.

3.4 Energy Consumption Metrics

Here we have considered three key energy consumption metrics for the power analysis of the two distributed storage systems; these metrics help in identifying the better storage technique. The experiment was repeated at least 10 times to collect the power/energy samples. For example, the mean is a mean of 10 repeated experiments collected on 6 nodes, with each node's PowerAPI set to collect 5 samples at load 1, which means that the mean value at load 1 is based on 10*6*5 = 300 power samples. In a similar manner, the median and standard deviation values are based on 10 experiments for loads 1 to 6. PowerAPI reports the samples in mW. Power is the rate of doing work, i.e., energy consumed per unit time, so in the description below and throughout the document power and energy are used as synonyms. The power consumption metrics are the following:


a) Mean power/energy: The mean power/energy consumed by the ongoing process or application. It is the number obtained by adding up a given set of power/energy consumption values and dividing by the number of power/energy samples collected.

The mathematical formula for the mean power/energy is:

Mean, x̄ = (x1 + x2 + x3 + … + xn) / n = (Σ xi) / n

where x1, x2, x3, …, xn are the n power/energy samples and n is the total number of samples in the data set.

b) Median power/energy: The median power/energy sample values are very important in this experiment. When PowerAPI is collecting power/energy samples, it continuously polls the process and estimates the load it puts on the microprocessor; if a process is not active in the Linux kernel at the moment the poll arrives, the power/energy sample collected at that point is 0 W. A larger number of 0 W values lowers the mean, and the median compensates for this problem. The median power/energy consumed by the ongoing process or application is the value at the midpoint of the frequency distribution of the power/energy samples.

The mathematical formula for the median power/energy is:

If the total number of samples (n) is odd:

Median = the ((n + 1) / 2)-th term

If the total number of samples (n) is even:

Median = [ (n / 2)-th term + (n / 2 + 1)-th term ] / 2

c) Standard deviation of power/energy: A value showing how much the gathered samples vary from the mean of all the samples; this is called the standard deviation. The standard deviation of the power/energy consumed by the ongoing process or application shows how much the power/energy samples differ from the mean of all the power/energy samples gathered.


The mathematical formula for the standard deviation of power/energy is:

Standard deviation, S = sqrt( Σ (xi − x̄)² / (n − 1) )

where xi is each power sample value in the data set, x̄ is the mean of the power samples, and n is the total number of samples in the data set.
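To make the three metrics concrete, here is a small worked example with five hypothetical power samples (the values are purely illustrative and include one 0 W sample to show why the median matters):

Samples (mW): 0, 400, 420, 450, 480
Mean   = (0 + 400 + 420 + 450 + 480) / 5 = 1750 / 5 = 350 mW
Median = the ((5 + 1) / 2) = 3rd term of the sorted samples 0, 400, 420, 450, 480 = 420 mW
Standard deviation: Σ(xi − x̄)² = 350² + 50² + 70² + 100² + 130² = 156 800, so S = sqrt(156 800 / 4) ≈ 198 mW

The single 0 W sample pulls the mean down to 350 mW while the median stays at 420 mW, which is exactly the effect described for the PowerAPI samples above.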

3.5 Evaluating Procedure

We configured six Compuverde nodes, six Gluster nodes, one workload generator, one Windows 10 virtual machine, a management VM and a gateway VM. We used the management virtual machine (VM) to SSH and tunnel a connection through PuTTY to the nodes in our private network from the Compuverde internal network. Each node is configured with 4 spinning drives (HDDs) and 1 cache drive (SSD).

3.5.1 Gluster File System Installation

For installing GlusterFS, the nodes need an operating system, so we installed Ubuntu 14.04 server on each Gluster node. We then formatted the spinning drives with XFS using parted. Next, we installed the latest version of the GlusterFS server on the Ubuntu server operating system; the latest version available at the time of the experiment was GlusterFS 3.8 server. We used the latest version in order to avoid the bugs mostly found in the early versions of the GlusterFS server tar file. The tar file of the Gluster server is extracted, and the py-compile file and the configure file are removed beforehand from the extracted Gluster server folder; if these files are not removed, the installation does not run properly and causes errors. Then the packages needed by the Gluster server are installed on the OS; the list of packages to be installed is given in the appendix. Make sure that the firewall between the nodes is turned off, or that any ufw firewall is disabled. You can also configure each node so that it allows the pool of servers used for the storage nodes. Make sure again that py-compile has not been regenerated, and start the installation of the Gluster server from the folder to the operating system of the node. After the folder installs, also download and install the repositories for the Gluster 3.8 server as shown in the appendix. Both of these steps are important for Gluster: the installation must be done both from the folder and from the repositories, because the folder contains some important packages necessary for the server to run. Now start the GlusterFS daemon and check that the latest version has been installed on the system. In GlusterFS, a brick is a spinning disk (HDD), and a general replicated volume is formed from a set of bricks, with one brick from each node. As we have a total of 24 bricks, 4 bricks on each of the 6 nodes, 4 volumes are formed; each volume has 6 bricks in total, one brick per node.
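As a minimal sketch of preparing a single brick, following the appendix commands (the device name /dev/sdb1 and the mount point are the ones used in the appendix and may differ on other hardware; the thesis text mentions parted, and mkfs.xfs shown here is just one possible way to create the XFS file system):

mkfs.xfs /dev/sdb1
mkdir -p /export/sdb1
mount /dev/sdb1 /export/sdb1
mkdir -p /export/sdb1/brick
echo "/dev/sdb1 /export/sdb1 xfs defaults 0 0" >> /etc/fstab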


The volumes are then changed from general replicated volumes to erasure-coded volumes (4+2). (This is a bug in GlusterFS: first a general volume needs to be formed and only then the distributed dispersed volume is formed.) The Gluster volumes are then added with the read-write option and the no_root_squash option in the NFS exports configuration, which lists the local physical file systems on an NFS server that are accessible to NFS clients. Gluster has root_squash on by default, which causes errors and does not allow a remote client connection; no_root_squash eliminates this by allowing diskless clients access to the node. Now restart nfs-kernel-server and save the exports file for the changes to take effect. Then mount each volume once anywhere on the local Linux OS on any node, and from this local mount point it is mounted on any remote client. (This is another bug in Gluster: a volume cannot be directly mounted on a remote client.)

3.5.2 Compuverde File System Installation

Compuverde is a hardware-agnostic storage system, so in order to install the Compuverde FS the nodes do not need any particular hardware. The Compuverde FS is also software-agnostic, so it does not require a specific platform to run, i.e., any Linux or Windows OS can be used for installing the Compuverde software, although Compuverde offers a recommended choice of operating system. There is a Compuverde ISO file available with the Compuverde software pre-installed on CentOS, so we chose to install the Compuverde software on the storage nodes through the ISO file to avoid bugs and save time. The Compuverde ISO contains a CentOS 7.0 operating system; for this experiment the choice between Ubuntu 14.04 and CentOS 7.0 makes no difference. After installing the Compuverde ISO file on a node, an interface appears that needs to be configured as described in the appendix. It is necessary to set up a JBOD (Just a Bunch Of Disks) for the spinning disks on the Compuverde nodes in order for them to be visible in the management tool on the client. We used a Windows 10 virtual machine in the same network for installing the Compuverde client (management tool). Using Windows 10 is better than Windows 8 or 7 because it allows the installation of the latest Compuverde client (the tool automatically installs only the version supported by the OS, and if the OS is not Windows 10 the client software does not install the latest version of the management tool). The network interface is configured in the Compuverde management tool, the spinning disks are configured for storage and the SSDs as cache drives, Erasure (4+2) is selected, and an NFS or SMB file system is created from the management tool. The respective protocol is then mounted from the loadgen.

3.5.3 Mounting the File Systems

In order to mount the file systems on the loadgen, all the necessary NFS 4.0 and SMB 1.0 packages, servers and clients are installed on the nodes and on the loadgen. The versions we used were the latest versions compatible with both storage systems at the time the experiment was conducted. For Gluster, the 4 volumes are mounted on the loadgen with the NFS and SMB protocols one at a time, the load is generated by SPECsfs, and PowerAPI is installed on every node to measure the power consumption. For Compuverde, all 6 nodes were mounted from the server to the loadgen using the SMB or NFS protocols; since the gateway is tested, all the nodes are mounted on the loadgen.
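A minimal sketch of the export and mount steps, pieced together from the appendix commands (the IP address, volume path and exports syntax below are illustrative for this testbed and would differ in other setups):

# on each Gluster node: export the locally mounted volume with no_root_squash
echo "/mnt/glusterfs *(rw,no_root_squash)" >> /etc/exports
exportfs -ra
service nfs-kernel-server restart

# on the load generator: mount the exported volume over NFSv4
mkdir -p /mnt/glusterfs
mount -t nfs4 192.168.12.11:/mnt/glusterfs /mnt/glusterfs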


3.5.4 SPECsfs Workload Generation

The SPECsfs Load generator is installed on the CentOS server through a series of steps as stated in the appendix.

We have conducted the experiment using a software power-monitoring library (PowerAPI), which can evaluate the power/energy consumed by any software at the process level. We also used a tool (SPECsfs) that simulates real-world video streaming data (a surveillance camera, an online streaming movie, etc.). SPECsfs refers to the video data streams as VDA. When the VDA workload is applied to the storage systems, the power/energy consumed by the storage systems is evaluated by PowerAPI. Each VDA stream has identical content (video data) of 24 gigabytes per stream, streamed at a rate of 36 Mb/s, and roughly 10 MB is added every time the load is increased. The VDA workload (streams) consists of two sub-workloads: VDA1, the live data stream, and VDA2, a companion application; the need for companion applications is stated in section 3.1.3. So if the VDA workload is set to two, there are two data streams and two companion applications. Each data stream has a bit rate of 36 Mb per second (like a stream for an HD video, a 1080-pixel video to be accurate). VDA1 is basically a set of high-bitrate sequential writes.
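As a rough, back-of-the-envelope illustration of what these figures mean (the calculation below is our own, not part of the SPECsfs documentation):

24 GB per stream ≈ 24 × 1024 × 8 Mb ≈ 196 608 Mb
196 608 Mb ÷ 36 Mb/s ≈ 5 461 s ≈ 91 minutes

so one VDA stream corresponds to roughly an hour and a half of HD video written sequentially to the storage system.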

The standards of SPECsfs are set by the SPEC and SFC communities and are accepted industry-wide for storage systems. As seen in section 3.1.3, each of VDA1 and VDA2 has certain file operations, and SPECsfs performs the following work on the storage systems in order to satisfy the file-operation standards put forward by the storage standards community. The operations performed per second are 10 ops/sec (9 VDA1 ops/sec and 1 VDA2 ops/sec). The custom file size is 1048576 KB, i.e., 1 GiB. The run time and warm-up time can be changed in the configuration file sfs_rc, but a minimum of 5 minutes is compulsory, as each successive workload increment requires at least 5 minutes. The total number of workload generators we used is 1, so the prime client itself is the client, i.e., the single workload generator.

On the prime client (workload generator), when the VDA workload is 1 (stream), or simply the load 1 condition, the custom file is not an executable but 1 GiB of data on which write, rmw, stat, access, unlink, randread, read, create and readdir operations are performed. Two copies of the custom file are run on the one prime client; the prime client has a total memory of 1024 MiB and 512 MiB of memory per process. The prime client runs 2 processes (named Test client 0 and Test client 1). Test client 0 is the VDA1 stream and Test client 1 is the VDA2 application; the Test client 0 process (VDA1) runs at 9 ops/sec and Test client 1 (VDA2) at 1 ops/sec. The warm-up time is 5 minutes and the test run time is also 5 minutes; in both cases, 5 minutes is the minimum time required between successive load increments of one VDA stream. During the warm-up time the workload is put on the storage system; during the run time the workload is applied and the performance metrics are recorded by SPECsfs. The total VDA workload here, (VDA1 + VDA2) 1 stream, is applied to the system in steps of 10%.


On the prime client (workload generator), when the VDA workload is 2 (streams), or simply the load 2 condition, 4 copies of the custom file are run on the one prime client; the prime client has a total memory of 1024 MiB and 256 MiB of memory per process. The prime client runs 4 processes (named Test client 0, Test client 1, Test client 2 and Test client 3). Since there are 4 processes, there are two VDA1 streams, Test client 0 and Test client 2, and the other two processes are VDA2 applications, Test client 1 and Test client 3. In this case VDA1 has an ops rate of 18 ops/sec and VDA2 an ops rate of 2 ops/sec.

On the prime client (workload generator), when the VDA workload is 6 (streams), or simply the load 6 condition, 12 copies of the custom file are run on the one prime client; the prime client has a total memory of 1024 MiB and 87 MiB of memory per process. The prime client runs 12 processes (named Test client 0, Test client 1, Test client 2, …, Test client 11). Since there are 12 processes, there are six VDA1 streams, Test client 0, Test client 2, …, Test client 10, and the other six processes are VDA2 applications, Test client 1, Test client 3, …, Test client 11. In this case VDA1 has an ops rate of 56 ops/sec and VDA2 an ops rate of 6 ops/sec.

Above we have seen how the workload is generated and applied to the system. In all the cases (load 1 to load 6), when the VDA1 and VDA2 workloads, i.e., the streams and the companion applications, are fully (100%) applied to the system, the power/energy consumption values are recorded through PowerAPI.
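The three load points above follow an approximate pattern; the relations below are our own summary of the reported numbers and are only approximate (the memory per process and the VDA1 ops rate reported at load 6 deviate slightly from the exact formulas):

For a VDA workload of N streams:
  processes          = 2 × N   (N VDA1 streams + N VDA2 companion applications)
  memory per process ≈ 1024 MiB / (2 × N)
  VDA1 ops rate      ≈ 9 × N ops/sec
  VDA2 ops rate      = N ops/sec
  total ops rate     ≈ 10 × N ops/sec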

3.5.5 PowerAPI’s Power samples

The PowerAPI software libraries are first installed on the Linux operating system.

PowerAPI sample collection: The SPECsfs load is then put on the storage nodes, and PowerAPI must be installed on the OS of every node, with the thermal design power of the respective processor set in the configuration file. The energy consumed by the solution is measured in milliwatts. The energy values are collected as individual samples (mW), or as mean or median values of a set of samples (mW). The mean values (mW) are then averaged so that a single average value is obtained on every node, and the per-node values are averaged again to obtain one value over the six nodes for the load 1 condition.

In a similar manner, the load levels are increased from 1 to 6 and the values are collected for the mean and the median. Then the standard deviation is taken over the mean-value samples. PowerAPI reports its samples in mW. Since, given a time duration, energy can be computed from power and vice versa, power and energy are used interchangeably.
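As a minimal sketch of the per-node averaging step (the file name and its format, one mW value per line, are assumptions about how the PowerAPI output was stored, not a documented PowerAPI format):

# average the power samples (mW) collected on one node
awk '{ sum += $1; n++ } END { if (n > 0) print sum / n " mW" }' node1_samples.txt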

The thermal design power (TDP) is the maximum amount of heat emitted by the CPU of a computer; it is specified so that, when the computer is designed, the cooling system can be dimensioned accordingly. This value is configured in the PowerAPI library, the required parameters are placed in the command itself and run in the shell, and the output gives the power/energy samples as requested. A clear description of the input parameters of the command, and of the steps necessary for this procedure, can be found in the appendix.


Figure 14 Terminal view of the PowerAPI library

Figure 14 shows the terminal view of the PowerAPI library. The appendix has a clear description of how to set up the PowerAPI library. The figure also shows how to form the command to obtain the power samples: the command (./bin/powerapi) inside the stage folder produces the output shown.
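For orientation, a hypothetical shape of such an invocation is sketched below; the module name, the flags and the process id are assumptions for illustration only and were not verified against the PowerAPI version used here (the exact command used is listed in the appendix):

# hypothetical PowerAPI invocation: monitor one process and print samples to the console
# (module name and flags are assumptions, not taken from the thesis setup)
./bin/powerapi modules procfs-cpu-simple \
    monitor --frequency 1000 --pids 4242 --console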


4 RESULTS AND ANALYSIS

This section includes the results obtained from the energy evaluation of the different storage systems under various load conditions. The energy tests between Gluster and Compuverde are conducted on two identical clusters with 6 nodes each. All tests are conducted using the software power-monitoring library PowerAPI and the synthetic benchmark tool SPECsfs 2014. The test case configuration that we have considered to perform the tests is as follows:

4.1 Energy Consumption Analysis for Gluster and Compuverde

For this case, the PowerAPI power analysis tool measures the mean, median and standard deviation of the power samples for the storage systems, while the SPECsfs 2014 tool is used to generate the load; we have configured the tool to produce different load values. In this test we have considered video streaming as the stored data, and the tests have been done by reading and writing files in the file system. The change in the energy consumed by Gluster and Compuverde, respectively, is calculated in milliwatts at each stage of the VDA workload increment from 1 to 6 streams and compared in the graphs. Process-level estimation is used here to estimate the power consumption of the Gluster file system and the Compuverde file system under the VDA video-stream workload. For Gluster, the processes whose power consumption we estimated are the GlusterFS daemon, the GlusterFS file system and the SSH daemon. For Compuverde, the processes are Compuverde Object Storage, Compuverde Gateway, Compuverde Bootstrap and the SSH daemon. The graphs below plot energy consumption in mW against the VDA workload: the X-axis shows the VDA workload as it is incremented, and the Y-axis shows the energy consumption of the different storage systems in mW. The mean and the median of the power samples are presented, and the standard deviation values are computed from the mean values. They have an important purpose here: they show the real deviation of the power samples from the mean of the original distribution. The median is also important when the mean values deviate from the original figure, so it is good to have a corresponding median figure, which helps the evaluation identify errors and eliminate them. All the graphs are bar graphs: each bar graph shows the respective mean, median or standard deviation of the power/energy samples on the Y-axis for both storage systems in the same graph, which makes the values easy to compare. As this is a comparative analysis, bar graphs fit the purpose very well.

The graphs below show the energy consumption (mW) of the Compuverde and Gluster storage systems; a deeper analysis is given below each graph:


4.1.1 Mean values for Compuverde NFS vs Gluster NFS:

Business Metric (VDA Streams)    Mean (Gluster) mW    Mean (Compuverde) mW
1                                102,471              418,45
2                                112,76               476,23
3                                202,289              467,24
4                                287,171              513,66
5                                447,717              677,3
6                                630,04               794,51

Table 9 Results of Mean using NFS v4.0

[Bar chart: Video Data Acquisition vs Energy Consumption (NFS); X-axis: VDA workload (streams) 1–6; Y-axis: energy consumed (mW); series: Mean (Gluster), Mean (Compuverde).]

Figure 15 Graph of Mean using NFS v4.0

Table 9 shows mean values for Gluster from 102 mW to 630 mW and for Compuverde from 418 mW to 794 mW with NFS. From Figure 15, when the business metric, in our case the VDA workload, was incremented from one stream to 6 streams, the mean values of the power samples increased, but the mean values for Compuverde were higher than those for Gluster. The mean values are not always reliable for the PowerAPI power samples, so we need to check whether they are close to the median values at each point. Compuverde started at about 400 mW and was at about 800 mW at the 6th stream; Compuverde had valid runs with the video streams up to a VDA workload of 10 streams and above. Gluster started at about 100 mW and ended at the 6th stream with about 600 mW, but at the 7th stream it could not produce a valid run. In this case the protocol used by the storage systems was NFS.


4.1.2 Mean values for Gluster SMB vs Compuverde SMB

Business Metric (VDA Streams)    Mean (Gluster) mW    Mean (Compuverde) mW
1                                95,57                404,952
2                                107,69               568,63
3                                244,47               585,58
4                                243,14               776,657
5                                293,72               1014,32
6                                541,69               1018,66

Table 10 Results of Mean using SMB v1.0

[Bar chart: Video Data Acquisition vs Energy Consumption (SMB); X-axis: VDA workload (streams) 1–6; Y-axis: energy consumed (mW); series: Mean (Gluster), Mean (Compuverde).]

Figure 16 Graph of Mean using SMB v1.0

Table 10 shows the mean values for Gluster from 95 mW to 541 mW and for Compuverde from 404 mW to 1018 mW with SMB.

From Figure 16, the protocol used here was SMB. The mean power values for Compuverde with SMB started at about 400 mW and went up to 1018 mW. The IOPS count with SMB was very high and Compuverde outperformed Gluster in this case. The IOPS counts are reported in Performance Analysis of Distributed Storage Systems [35]; comparing them with the energy consumption values in this document shows the difference in IOPS. A customer usually aims for a storage system with better IOPS performance rather than lower energy consumption, even if the consumption is higher. So although the energy consumption of Compuverde is higher, the performance of Compuverde is better than that of Gluster; since energy consumption is usually considered after performance, a customer tends to choose the storage with the higher IOPS count. The Gluster values started at 95 mW and went up to 541 mW. Gluster performed better with SMB than with its NFS protocol, but it still could not compete with the IOPS count of Compuverde.


4.1.3 Median values for Gluster NFS vs Compuverde NFS

Business Metric (VDA Streams)    Median (Gluster) mW    Median (Compuverde) mW
1                                122,44                 435,31
2                                147,087                468,78
3                                117,375                496,97
4                                220,547                498,43
5                                434,569                631,85
6                                579,13                 753,333

Table 11 Results of Median using NFS v4.0

[Bar chart: Video Data Acquisition vs Energy Consumption (NFS); X-axis: VDA workload (streams) 1–6; Y-axis: energy consumed (mW); series: Median (Gluster), Median (Compuverde).]

Figure 17 Graph of Median using NFS v4.0

Table 11 shows the median values for Gluster from 122 mW to 579 mW and for Compuverde from 435 mW to 753 mW with NFS.

The median graphs are important for the statistical analysis: the median usually provides a better approximation of the central tendency than the mean when the sample distribution is skewed. From Figure 17 we see that averaging the median values over a number of successive runs brings the median close to the true value. For Compuverde the median values also start at approximately 400 mW and end at about 750 mW. Both the mean and the median thus show a high energy level from the start, irrespective of the protocol, so there must be something other than the load, i.e., internal processes, putting load on the CPU.


4.1.4 Median values for Gluster SMB vs Compuverde SMB

Business Metric (VDA Streams)    Median (Gluster) mW    Median (Compuverde) mW
1                                80,11                  319,41
2                                95,79                  444,12
3                                233,13                 553,16
4                                283,31                 672,74
5                                380,52                 973,13
6                                538,67                 940,78

Table 12 Results of Median using SMB v1.0

[Bar chart: Video Data Acquisition vs Energy Consumption (SMB); X-axis: VDA workload (streams) 1–6; Y-axis: energy consumed (mW); series: Median (Gluster), Median (Compuverde).]

Figure 18 Graph of Median using SMB v1.0

Table 12 shows median values for Gluster from 80 mW to 538 mW and for Compuverde from 319 mW to 940 mW with SMB. In Figure 18 the SMB protocol is used to access the file system. The median values for Compuverde are a little lower than the Compuverde means for SMB, but clearly higher than the median values for Gluster. The SMB medians confirm that the SMB mean values are sound: the SMB means show that the energy consumption is strongly increased for Compuverde. Compuverde still delivered a very good performance value at the 6th video stream, whereas at that point Gluster was loaded with a high number of data streams, consumed more energy, and after that produced invalid runs. From the mean and median power samples of Compuverde we see that as the IOPS increase, the energy consumption also increases. This answers the second research question: as the performance increases, the energy consumption increases.


4.1.5 Standard Deviation values for Gluster NFS vs Compuverde NFS

Business Metric (VDA Streams)    STDEV (Gluster) mW    STDEV (Compuverde) mW
1                                108,63                545,951
2                                158,85                729,71
3                                2021,11               685,62
4                                264,9                 955,166
5                                178,49                1029,63
6                                277,569               1039,33

Table 13 Results of STDEV using NFS v4.0

[Bar chart: Video Data Acquisition vs Energy Consumption (NFS); X-axis: VDA workload (streams) 1–6; Y-axis: energy consumed (mW); series: STDEV (Gluster), STDEV (Compuverde).]

Figure 19 Graph of STDEV using NFS v4.0

Table 13 shows standard deviation values for Gluster from 108 mW to 277 mW and for Compuverde from 545 mW to 1039 mW with NFS.

The standard deviation is an important metric for knowing how far the estimated mean deviates from the underlying sample values, so the standard deviation graph in Figure 19 should be read together with the mean values in Figure 15. PowerAPI estimates the mean by collecting a number of power samples. These samples are in effect random values that will most likely differ between runs, so we will most likely get a different estimated mean with each run. If the samples collected during a run have values very close to the mean, the standard deviation will be close to zero; this can be seen from the term xi − x̄ in the standard deviation formula. So if the variation is high, the deviation is also very high. If we look at the worst-case scenario in Figure 19, the standard deviation for Compuverde (NFS) is 1039,33 mW and the corresponding mean is 794,51 mW. According to Chebyshev's inequality, at least 75% of the samples are within 2 standard deviations of the mean, which means that individual sample values can still be as large as 794,51 + 2 × 1039,33 ≈ 2873 mW, or even larger for the samples outside that interval.
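The bound follows directly from Chebyshev's inequality; as a short worked check of the numbers used above:

P(|X − μ| ≥ k·σ) ≤ 1 / k², so for k = 2 at most 25% of the samples lie outside μ ± 2σ.
With μ = 794,51 mW and σ = 1039,33 mW:
μ + 2σ = 794,51 + 2 × 1039,33 = 2873,17 mW

so samples of roughly 2,9 W are still consistent with the reported mean and standard deviation.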


4.1.6 Standard Deviation values for Gluster SMB vs Compuverde SMB

Business Metric (VDA Streams)    STDEV (Gluster) mW    STDEV (Compuverde) mW
1                                128,23                325,681
2                                157,29                518,08
3                                365,78                516,49
4                                289,49                540,473
5                                311,95                770,38
6                                326,09                917,43

Table 14 Results of STDEV using SMB v1.0

[Bar chart: Video Data Acquisition vs Energy Consumption (SMB); X-axis: VDA workload (streams) 1–6; Y-axis: energy consumed (mW); series: STDEV (Gluster), STDEV (Compuverde).]

Figure 20 Graph of STDEV using SMB v1.0

Table 14 shows standard deviation values for Gluster from 128 mW to 326 mW and for Compuverde from 325 mW to 917 mW with SMB.

The standard deviation of the power samples is shown in Figure 20 above; particularly at the VDA workload 3 condition, the deviation for Gluster was somewhat higher. The standard deviation, mean and median are calculated by the PowerAPI tool when the energy samples are produced: we configure the shell command itself, specify the parameters we want and the location where the output should be stored, and it produces the mean, median or standard deviation values. It produces a set of mean values (in the mean case), the values are averaged again for every node, and a final value is then stated in the table for the respective metric.


5 CONCLUSION AND FUTURE WORK

Compuverde has a higher energy consumption than Gluster, as can be seen in Section 4. We have drawn our conclusions not only from the energy evaluation graphs but also from the values in the document Performance Analysis of Distributed Storage Systems [35], since we feel that concluding that a DSS consumes less energy without considering its performance is not fair. The hardware used was a test environment and not a production environment, so the values might differ in production environments. We have taken two distributed storage systems, Gluster and Compuverde, both of which are structured storage systems. Gluster and Compuverde both use the same file coding, namely erasure coding with redundancy two, where two nodes can be lost. All the results are compared using the software power-monitoring library PowerAPI, which gives the power/energy consumed in milliwatts at various loads, and the synthetic benchmark tool SPECsfs 2014, which offers different workloads; we have considered VDA (streams) as the workload and compared the results. We have considered both NFS and SMB to mount the file share and access the files. The first research question, the energy consumption of different distributed storage systems under various types of load, is answered in Sections 5.1 and 5.2: the energy consumption of the different DSSs increases as the load is increased to different levels. The second research question, how different performance metrics vary with energy consumption, is answered by cross-referencing Performance Analysis of Distributed Storage Systems [35]: as performance metrics such as IOPS increase, the energy consumption also increases.

5.1 Conclusion Analysis from Graphs

From the results and analysis section we have analyzed the mean, median and standard deviation graphs, which imply the conclusions mentioned here. Firstly, from the graphs we see a constant rise in the energy consumption for both Compuverde and Gluster, but the energy consumed by Gluster at each load was lower; the consumption should indeed increase when a higher workload is put on the storage system. When the IOPS counts of the storage systems at the various loads are compared with the mean and median power samples, it turns out that Gluster has a slightly lower IOPS count and much higher latency values. So although the power samples were somewhat higher for Compuverde than for Gluster, Gluster does not deliver performance equivalent to Compuverde, and we conclude that Compuverde performed better overall, although its energy consumption was higher. Secondly, we see from the mean and median graphs that Compuverde stayed at an average of nearly 400 mW (instead of increasing) over the first three streams, which leads to the conclusion that Compuverde has some internal processes that consume a minimum of about 400 mW up to load 3. Here, too, the energy consumption was higher for Compuverde.

5.2 Conclusion Analysis from Invalid Runs

Incrementing beyond a VDA workload of 6 resulted in invalid runs for Gluster (as seen from the graphs), i.e., Gluster was not able to store the streams according to the SPC storage standards. This implies that Gluster is not well suited for storing video streams: it stored only up to 6 streams, whereas Compuverde could handle more than 10 streams. Supporting only 6 streams gives the product less value in the storage market.


Here we conclude that, since Gluster did not produce valid values for more than 6 streams (in the graphs), Compuverde, despite its higher energy consumption, performed better at higher VDA workloads.

5.3 Conclusion from Architecture

Seen from the architecture side, at the process level Compuverde has three processes: Compuverde Protocol, Compuverde Gateway with cache, and Compuverde Object Store. The Compuverde Protocol process ensures the communication between the client and the system. The Compuverde Gateway with cache process is responsible for the main system logic. The Compuverde Object Store process is responsible for data distribution and storage. For Gluster, the architecture at the process level has two processes: the Gluster daemon and the Gluster file system. The Gluster daemon is the administration daemon and the backbone of the file system; it serves as the Gluster flexible volume administrator, supervising the glusterfs processes and coordinating dynamic volume operations, for example starting and deleting volumes over various storage servers without problems. The glusterfs client process is responsible for the communication through the protocols between the storage nodes and for self-healing when nodes go down. The user's data/files are taken care of more thoroughly by the Compuverde processes than by the GlusterFS storage. As we have seen, there is an additional cache process running in the background, and we conclude that the increase in energy was also due to this additional process.

5.4 Conclusion Analysis from Internal Architecture

In the internal architecture we see that Gluster stores the data/files in bricks and volumes: the whole storage is divided into bricks and volumes. Compuverde instead stores the data across the whole cluster equally, without dividing it further. So when we evaluated the energy consumption by applying the load, the load was not put on the whole system at once for Gluster (the storage was internally divided into four parts/volumes in our case). This could reduce the stress on the system, which could be the reason for both the lower energy consumption and the lower performance.

5.5 Future Work

Future work on this experiment could aim at reducing the energy consumed by the DSS, either by changing the architecture of the system or by introducing an additional process that shuts down processes when they are not in use; this would reduce the energy footprint of the DSS in data centers. A scalability test can also be done with the storage systems, by measuring the performance and energy consumption as the number of storage nodes in the DSS is increased. This evaluation is important because it reveals the point up to which the DSS can perform before the performance drops. The other workloads in SPECsfs, such as the Database workload, the Software Build workload and the Virtual Desktop Infrastructure workload, can be applied to the DSS, followed by a corresponding energy evaluation; this comparison reveals the suitability of the DSS for various workloads. PowerAPI has other metrics such as geometric mean, logsum, variance and powerspy/g5k-omegawatt. A power evaluation with geometric-mean power samples gives a more accurate result when the power samples are fractions. The logsum shows relative change, which can be used to measure the change in power consumption between two successive increments. Finally, the variance reveals how far a sample is from the mean value; applied to the power samples it shows how far each sample is from the mean of the samples.


REFERENCES
[1] "Introduction to Cloud Computing." [Online]. Available: https://www.priv.gc.ca/resource/fs-fi/02_05_d_51_cc_e.pdf. [Accessed: 24-Jul-2016].
[2] "The NIST definition of cloud computing - nistspecialpublication800-145." [Online]. Available: http://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-145.pdf. [Accessed: 24-Jul-2016].
[3] "CloudComputingHuthCebula." [Online]. Available: https://www.us-cert.gov/sites/default/files/publications/CloudComputingHuthCebula.pdf. [Accessed: 24-Jul-2016].
[4] "Big Data Fundamentals." [Online]. Available: http://www.cse.wustl.edu/~jain/cse570-13/ftp/m_10abd.pdf. [Accessed: 24-Jul-2016].
[5] "Big data using Cloud Computing." [Online]. Available: http://www.aabri.com/manuscripts/131457.pdf. [Accessed: 24-Jul-2016].
[6] "The rise of 'big data' on cloud computing: Review and open research issues." [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.665.1326&rep=rep1&type=pdf. [Accessed: 24-Jul-2016].
[7] S. Shirinbab, L. Lundberg, and D. Erman, "Performance evaluation of distributed storage systems for cloud computing," Publ. Int. J. Comput. Their Appl., pp. 195–207, 2013.
[8] "Datacenter overview." [Online]. Available: https://www.cisco.com/c/en/us/td/docs/solutions/Enterprise/Data_Center/DC_Infra2_5/DCInfra_1.pdf. [Accessed: 24-Jul-2016].
[9] "Data Center - Gartner IT Glossary." [Online]. Available: http://www.gartner.com/it-glossary/data-center/. [Accessed: 24-Jul-2016].
[10] "Oracle NoSQL Database Storage." [Online]. Available: http://www.oracle.com/technetwork/database/nosqldb/overview/nosql-storage-497226.html. [Accessed: 24-Jul-2016].
[11] "NFS vs. CIFS." [Online]. Available: http://searchstorage.techtarget.com/answer/NFS-vs-CIFS. [Accessed: 24-Jul-2016].
[12] "File storage." [Online]. Available: http://searchstorage.techtarget.com/definition/file-storage. [Accessed: 24-Jul-2016].
[13] "File Level Storage vs. Block Level Storage - StoneFly's iSCSI." [Online]. Available: http://www.iscsi.com/resources/File-Level-Storage-vs-Block-Level-Storage.asp. [Accessed: 24-Jul-2016].
[14] "CONVINcE Project."
[15] "CONVINcE leaflet-start.pub - CONVINcE-start_lq." [Online]. Available: https://bscw.celticplus.eu/pub/bscw.cgi/d985/CONVINcE-start_lq.pdf. [Accessed: 24-Jul-2016].
[16] "Cloud Wireless Access Burns More Energy Than Data Centres - Report." [Online]. Available: http://www.techweekeurope.co.uk/workspace/cloud-wireless-energy-use-data-centre-113227. [Accessed: 17-Nov-2016].
[17] "Overview - Gluster Docs." [Online]. Available: http://gluster.readthedocs.io/en/latest/Install-Guide/Overview/. [Accessed: 24-Jul-2016].
[18] "GlusterFS Introduction - Gluster Docs." [Online]. Available: https://gluster.readthedocs.io/en/latest/Administrator%20Guide/GlusterFS%20Introduction/. [Accessed: 24-Jul-2016].
[19] "Compuverde summary." [Online]. Available: http://compuverde.com/media/96530/compuverde.pdf. [Accessed: 24-Jul-2016].
[20] compuverde.com, "vNAS – hyperscale storage," Software defined storage with hybrid cloud support | Compuverde. [Online]. Available: http://compuverde.com/products/vnas/. [Accessed: 24-Jul-2016].
[21] "PowerAPI: A Software Library to Monitor the Energy Consumed at the Process-Level." [Online]. Available: http://ercim-news.ercim.eu/en92/special/powerapi-a-software-library-to-monitor-the-energy-consumed-at-the-process-level. [Accessed: 30-Jul-2016].


[22] P. Chen, J. Li, and X. Gou, "Research of distributed file system based on massive resources and application in the network teaching system," in 2011 International Conference on Advanced Intelligence and Awareness Internet (AIAI 2011), 2011, pp. 154–158.
[23] A. Noureddine, A. Bourdon, R. Rouvoy, and L. Seinturier, "Runtime monitoring of software energy hotspots," in 2012 Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2012, pp. 160–169.
[24] "The Performance Study on Several Distributed File Systems." [Online]. Available: https://www.infona.pl/resource/bwmeta1.element.ieee-art-000006079385. [Accessed: 24-Jul-2016].
[25] "IEEE Xplore Abstract - Performance evaluation of sheepdog distributed storage system." [Online]. Available: http://ieeexplore.ieee.org.miman.bib.bth.se/xpls/abs_all.jsp?arnumber=6974448. [Accessed: 24-Jul-2016].
[26] X. Zhang, S. Gaddam, and A. T. Chronopoulos, "Ceph Distributed File System Benchmarks on an Openstack Cloud," in 2015 IEEE International Conference on Cloud Computing in Emerging Markets (CCEM), 2015, pp. 113–120.
[27] L. Lundberg, H. Grahn, D. Ilie, and C. Melander, "Cache Support in a High Performance Fault-Tolerant Distributed Storage System for Cloud and Big Data," in Parallel and Distributed Processing Symposium Workshop (IPDPSW), 2015 IEEE International, 2015, pp. 537–546.
[28] "SPEC SFS® 2014 User's Guide." [Online]. Available: https://www.spec.org/sfs2014/docs/usersguide.pdf. [Accessed: 24-Jul-2016].
[29] "SPEC SFS® 2014 Run and Reporting Rules." [Online]. Available: https://www.spec.org/sfs2014/docs/runrules.pdf. [Accessed: 24-Jul-2016].
[30] "Live IP Camera Streaming," DaCast, 03-Apr-2015. [Online]. Available: http://www.dacast.com/blog/live-ip-camera-streaming/. [Accessed: 17-Nov-2016].
[31] "6.8. Creating Dispersed Volumes." [Online]. Available: https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3.1/html/Administration_Guide/chap-Red_Hat_Storage_Volumes-Creating_Dispersed_Volumes_1.html. [Accessed: 24-Jul-2016].
[32] "6.9. Creating Distributed Dispersed Volumes." [Online]. Available: https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3.1/html/Administration_Guide/sect-Creating_Distributed_Dispered_Volumes_1.html. [Accessed: 24-Jul-2016].
[33] "Chapter 27. Recommended Configurations - Dispersed Volume." [Online]. Available: https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3.1/html/Administration_Guide/chap-Recommended-Configuration_Dispersed.html#chap-Recommended-Configuration_Dispersed-Table_Configuraton. [Accessed: 24-Jul-2016].
[34] "Technical Overview of Compuverde - compuverde_vnas_and_hyperconverged." [Online]. Available: http://compuverde.com/media/131395/compuverde_vnas_and_hyperconverged.pdf. [Accessed: 24-Jul-2016].
[35] "Performance Analysis of Distributed Storage Systems." [Online]. Available: http://bth.diva-portal.org. [Accessed: 24-Jul-2016].


APPENDIX

Steps followed to install the Gluster physical nodes:

a) Steps to set up the hardware and install new Gluster nodes

1. Allocate a chassis to modify the setup of the system.
2. Make sure ONLY the SATA DOM module is connected to one of the first SATA ports; no other SATA devices must be connected at this point.
3. Power on the machine and install Ubuntu on every physical server.
4. When the machine is ready with the installation, power it off and unplug it.
5. Install the Adaptec RAID adapter.
6. Make sure only 4 disks are connected to the enclosure.
7. Connect a 60 GB SSD, preferably to the last SATA port.
8. Power on the machine and boot Ubuntu on every physical server.
9. Install arcconf to configure the disks; all 4 disks are configured as JBOD.
10. Install the Gluster storage on all the physical nodes; the setup is then ready.

b) Commands to set up the JBOD (Just a Bunch of Disks) for Gluster

sudo su
mkdir /opt/arcconf
scp [email protected]:/opt/arcconf/arcconf /opt/arcconf/arcconf
chmod +x /opt/arcconf/arcconf
/opt/arcconf/arcconf delete 1 logical drive all noprompt
/opt/arcconf/arcconf create 1 jbod 0 0 0 1 0 2 0 3 noprompt

Note: The controller IDs (0, 1, 2, 3, … in the create command above) depend on the HDD allocation of the physical server. The command used to identify the controller ID is "/opt/arcconf/arcconf getconfig 1".

c) Steps to install Gluster storage on the physical nodes

The following are the prerequisites before installing the Gluster storage. The packages necessary for building the GlusterFS server on the physical machine are:
- GNU Autotools: Automake, Autoconf, Libtool
- lex (generally flex)
- GNU Bison
- OpenSSL
- libxml2
- Python 2.x
- libaio
- libibverbs
- librdmacm
- readline
- lvm2
- glib2
- liburcu


- cmocka

d) Install all the above packages as follows:

apt-get purge libav-tools libavcodec-dev libavformat-dev libavutil-dev
sudo apt-get install libav-tools libavcodec-dev libavformat-dev libavutil-dev
rm -fr py-compile
apt-get install autoconf
apt-get install build-essential libtool
apt-get install flex
apt-get install bison
apt-get install libssl-dev
apt-get install uuid-dev
apt-get install libuuid1
apt-get install uuid-runtime
apt-get install libuuid-perl
apt-get install sqlite
apt-get install libacl1-dev
apt-get install libsqlite0-dev
apt-get install sqlite0-dev
apt-get install sqlite3
apt-get install libsqlite3-dev
apt-get install libxml2-dev
apt-get install libxml
apt-get install liburcu-bp
apt-get install liburcu-dev
apt-get install python-dev

e) Packages needed for other purposes:

apt-get install lvm2
sudo apt-get install make automake autoconf libtool flex bison pkg-config libssl-dev libxml2-dev python-dev libaio-dev libibverbs-dev librdmacm-dev libreadline-dev liblvm2-dev libglib2.0-dev liburcu-dev libcmocka-dev

To remove the previous configure file:
rm -rf configure

f) Then make sure the firewall is turned off (also ufw in the case of an Ubuntu server). From each node, allow every other node:

iptables -I INPUT -p all -s 192.168.12.12 -j ACCEPT
iptables -I INPUT -p all -s 192.168.12.13 -j ACCEPT
iptables -I INPUT -p all -s 192.168.12.14 -j ACCEPT
iptables -I INPUT -p all -s 192.168.12.15 -j ACCEPT
iptables -I INPUT -p all -s 192.168.12.16 -j ACCEPT
iptables -F
iptables -X
iptables -L -n
ufw status
ufw disable

g) Follow these steps in order to start the installation of the GlusterFS server:

rm -fr py-compile
make clean
ls


rm -rf configure
./autogen.sh
./configure
(or, to enable debugging)
./configure --enable-debug
make
make install

To start the daemon:
service glusterd start

Check the glusterfs version to verify the installed version:
glusterfs --version

If an error occurs after make, use the command below and continue:
make clean

h) Then the basic setup to install Gluster:

sudo apt-get install python-software-properties
sudo apt-get install software-properties-common
sudo add-apt-repository ppa:gluster/glusterfs-3.8
sudo apt-get update
sudo apt-get install glusterfs-server

Then, for Gluster, each brick is a spinning disk:

mkdir -p /export/sdb1
mount /dev/sdb1 /export/sdb1
mkdir -p /export/sdb1/brick

To auto-mount on boot, add the mount to the fstab file:
echo "/dev/sdb1 /export/sdb1 xfs defaults 0 0" >> /etc/fstab

rm -rf /export/sdc1/brick
mkdir -p /export/sdc1/brick

After creating the bricks and mounting them, we have 24 bricks in total on the 6 physical servers, i.e., 4 bricks per node (physical server). Then create the volumes (we created the Gluster volumes with erasure coding, which is known as a distributed dispersed volume in Gluster terminology); 4 volumes in total are created.

Commands for creating a Gluster volume

Replica 2 volume:
gluster volume create test-volume replica 2 192.168.12.11:/export/sdc1/brick 192.168.12.12:/export/sdc1/brick 192.168.12.13:/export/sdc1/brick 192.168.12.14:/export/sdc1/brick 192.168.12.15:/export/sdc1/brick 192.168.12.16:/export/sdc1/brick force

Distributed dispersed volume: gluster volume create test-volume disperse 6 redundancy 2 192.168.12.11:/export/sdc1/brick 192.168.12.12:/export/sdc1/brick 192.168.12.13:/export/sdc1/brick 192.168.12.14:/export/sdc1/brick 192.168.12.15:/export/sdc1/brick 192.168.12.16:/export/sdc1/brick force

For the info regarding the volume and starting the volume: gluster volume info


gluster volume start test-volume

Testing the volumes and Bricks: Below you can see the screenshots of the volumes and bricks that are created in our setup.

i) Commands to mount the NFS server / mounting the NFS volume:

service nfsd status
service nfs-kernel-server status
service nfs-kernel-server start


mkdir /mnt/glusterfs
mount.glusterfs 192.168.12.11:/test-volume /mnt/glusterfs/
df -h

On each server add rw,no_root_squash:
exit
sudo exportfs -ra
/etc/init.d/nfs-kernel-server restart   (this is a very important command)
systemctl restart rpcbind
systemctl status rpcbind

j) On the client, to check the mount:
mount -t nfs4 192.168.12.11:/mnt/glusterfs /mnt/glusterfs

k) Commands to mount the SMB server / mounting the SMB shares for Gluster:
sudo apt-get install samba
apt-get install smbclient
sudo apt-get install cifs-utils
sudo service samba status
sudo service samba start
vi /etc/samba/smb.conf
sudo service samba reload
ip a
sudo smbpasswd -a root
mount -t cifs \\\\192.168.12.11\\glusterfs /mnt/glusterfsc11cifs/ -o user=root,pass=C0mpuverde!

l) Test volume memory status
The screenshot below shows the memory status information for the test volume.

m) Client connections and status for volume The below screenshot shows the information about client connections


The below screenshot shows the status for the volume


Steps followed to install the Compuverde physical nodes:

n) Steps to set up the hardware and install new Compuverde nodes

- Allocate a chassis to modify the setup of the system.
- Make sure ONLY the SATA DOM module is connected to one of the first SATA ports; no other SATA devices must be connected at this point.
- Install the Adaptec RAID adapter.
- Make sure only 4 disks are connected to the enclosure.
- Connect a 60 GB SSD, preferably to the last SATA port.
- Install arcconf to configure the disks; all 4 disks are configured as JBOD.

Steps to install a Compuverde storage node using the CD

Step 1: Insert the CD
Turn on the node and insert the installation media in your source drive.

Step 2: Reboot to start the installation
Boot the node from the source drive. Typically, nodes are not set up to boot automatically from an external source drive. Pressing F10 or F12 during the initial system start-up will normally allow you to enter a one-time boot menu, which gives you the option to boot from the external drive.


Step 3: Set a hostname for each node
Set the hostname for the node and press Enter. The hostname is a label assigned to the node that will later be used to identify it in the Compuverde Management Tool.

Step 4: Select system disk


The installation wizard will list all available disks in the node. Select the disk that should be used as a boot / system disk for the Compuverde installation, and press Enter.

The installer proceeds to install the packages of the Compuverde solution. At this point, no further user input is required until all packages have been installed. The time required for the installation depends on the number of packages and the speed of the hardware.

After the installation is completed, the node will automatically reboot and bring you to the login screen. There is no need to log in, you can leave the node as it is, or power down. Repeat the procedure for all nodes you want to install.

o) Commands to setup the JBOD (Just a Bunch of Disks) for Compuverde

sudo su
mkdir /opt/arcconf
scp [email protected]:/opt/arcconf/arcconf /opt/arcconf/arcconf


chmod +x /opt/arcconf/arcconf
/opt/arcconf/arcconf delete 1 logical drive all noprompt
/opt/arcconf/arcconf create 1 jbod 0 0 0 1 0 2 0 3 noprompt

Note: The controller IDs (0, 1, 2, 3, … in the create command above) depend on the HDD allocation of the physical server. The command used to identify the controller ID is "/opt/arcconf/arcconf getconfig 1".

p) Installing the Compuverde management tool and setting up the storage cluster

The Management Tool should be installed on a Windows system connected to the same back-end network as the storage cluster.

Step 1: Run the '.exe' file of the Compuverde Management tool (CompuverdeSetup.exe) to start the installation wizard.
Step 2: Start the Compuverde Management tool through one of the shortcuts created.
Step 3: The Compuverde Management tool requires a network interface to communicate with the storage cluster.

The tool discovers the network devices on the system and, in case there is only one active network interface connection existing on the host, selects the existing network interface. This configuration will be used as the default interface to communicate with the storage cluster.

This brings up the Compuverde Management tool. To set up the storage cluster, one needs to have a Compuverde license.

Step 1: In the Compuverde Management Tool, right-click Compuverde in the upper left corner of the tree view and choose Create Cluster.

Enter a name for your new cluster, e.g. BTH. Cluster names should not exceed 25 characters. Copy and paste your license token into the License key field and click Activate.


Once you have provided your license token, Compuverde verifies it and displays the cluster type and the feature list allowed by your license. Click OK and Close.

Step 2: The new cluster is now visible in the left column of the Tool.

Adding nodes to your cluster
Before a cluster is ready for use, all its constituent nodes have to be added to it and configured. In the tree view to the left, click Storage clusters > Maintenance tab. Here you see a list of all nodes that have been installed with the Compuverde software. Before continuing, make sure that the nodes you want to add to the cluster appear in the list Unconfigured Compuverde nodes. The nodes that have an IP address in the Cluster column already belong to a cluster, but you do not have a license for that cluster. The nodes that do not yet belong to any cluster are marked as "Unconfigured".

Step 1: Add a Node to the Cluster. Select the cluster to which you wish to add the node and click Next. This assigns your node to the selected cluster, and the node will be updated with the Compuverde cluster settings.


Step 2: Set IP Address
Public IP: Select a network interface from the available list and set the node's public IP address, subnet mask and NTP server. If you are configuring only the public address, the default gateway and DNS must be entered here. If you are also going to configure the private address, they can be omitted here and set in the private address configuration instead.

Private IP: Click Private and select Use protocols network interface. Select a network interface from the list of available interfaces. Fill in the desired IP address, subnet mask, default gateway and DNS. Make sure that the private address and the host computer where you run the Compuverde Management Tool are on the same subnet. If there is no separate private network, skip this part; the default (public) network will then be used for all traffic.


Step 3: Disk Configuration. Compuverde detects the disks available in the node and lists them as unconfigured (left side).

1. Select one or more disks to be used as cache disks (usually the fastest and smallest) and drag and drop them into the Cache box.
2. Select one or more disks to be used as storage disks (usually the larger ones) and drag and drop them into the Storage data box.

The figure below shows how the node looks on the left side of the tool after the configuration is done.

Creating a File system


Make sure all the nodes in the management tab are in the Online state. The Compuverde file system ensures that all nodes in the cluster have a synchronized view of the stored data.
Prerequisites: to create a file system, the following prerequisites must be met:
o The storage cluster contains at least three nodes that are online.
o All disks in the storage nodes are online.

Using the Configuration Wizard
Step 1: Verifying prerequisites. Select Storage clusters, then the name of the cluster you wish to configure (here, "BTH") and go to the Wizard tab. The Configuration Wizard confirms the minimum requirements to configure a file share and then makes the Next button active. Click Next to proceed.

Step 2: Select protocols. The protocols permitted by your license will be listed. Select one or more protocols that you wish to activate for the file system used in the test runs and click Next. Here we have selected both NFS and SMB shares.

Step 3: Give a name / set file encoding. Give a name to the file share through which the root of your file system can be accessed, e.g. "share". This is the name you will navigate to in order to access your files, e.g. \\public-ipaddress\share. Select a file encoding method; the selected coding method will be used as the default. Click Next.


Step 4: Domain name and user. Allocate a domain name, username and password for your file share and click Finish. The domain name, username and password are used for accessing the file share. The password must be at least 6 characters long.

A successful configuration creates a file system and a file share on the cluster. Files can be created and accessed immediately through any of the nodes in the cluster and the supported protocols.

Step 5: The file system is created. Now we can start the test runs using both the NFS and SMB shares. The screenshot below shows the Compuverde log file.

Steps to set up the SPECsfs2014 benchmark tool on CentOS 7:


q) First, update the mount-points file with the latest mount points following the IP of the load generator. Then edit the sfs_rc file and set the parameters needed to run the SPEC tool: the load, the load increment and the number of runs (a sketch of the relevant sfs_rc parameters is given after the command below). To run the SPECsfs tool, issue the following command, giving a file name under which the results are stored. Afterwards, cd into the results directory, where all the results for the run can be found.

python SfsManager -r sfs_rc -s vda-run01
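For reference, a minimal sketch of the sfs_rc parameters referred to above. The parameter names follow the SPECsfs 2014 run-file format; the values, user and mount-point source shown here are illustrative assumptions, not necessarily the exact settings used for the thesis runs:

BENCHMARK=VDA                        # video data acquisition (streaming) workload
LOAD=1                               # starting number of streams (illustrative)
INCR_LOAD=1                          # load increment between runs (illustrative)
NUM_RUNS=6                           # number of load points (illustrative)
CLIENT_MOUNTPOINTS=mountpoints_file  # or a list of "client_name:/mount_point_path" entries
EXEC_PATH=/root/SPECsfs2014/binaries/linux/x64/netmist
USER=root                            # assumed run user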

SPECsfs Installation on hardware (Load Generator) using ssh:

-- Master (Runs on Master)

#!/bin/bash

# First run the following commands
mkdir .ssh
rm -f .ssh/id_rsa.pub
rm -f .ssh/authorized_keys
ssh-keygen -t rsa

# Once done, run the rest of the following commands
cat .ssh/id_rsa.pub >> .ssh/authorized_keys
chmod 700 .ssh
chmod 600 .ssh/*

-- Slave/Nodes

#!/bin/bash

# Only the client hosts (not the prime client)
hosts="192.168.0.157 192.168.0.158 192.168.0.159 192.168.0.160 192.168.0.161"

# Run the following commands
for host in $hosts
do
  ssh root@$host 'ssh-keygen -t rsa'
  ssh root@$host 'cat .ssh/id_rsa.pub' | cat - >> ~/.ssh/authorized_keys
done

# Once done, run the following commands
for host in $hosts
do
  scp .ssh/authorized_keys $host:.ssh
  ssh root@$host 'chmod 700 .ssh ; chmod 600 .ssh/*'
done
#exit
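As a quick check (a minimal sketch, not part of the original scripts), passwordless SSH to every client can be verified before starting SPECsfs; this reuses the $hosts list defined above, and each host should print its hostname without asking for a password:

# Verify key-based login to all clients (assumes $hosts from the script above)
for host in $hosts
do
  ssh -o BatchMode=yes root@$host hostname
done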

Configure hosts under /etc/hosts:
192.168.12.200 BTHLG01

#######################


# If required
curl -O https://www.python.org/ftp/python/3.4.4/Python-3.4.4.tgz
# or
wget https://www.python.org/ftp/python/3.4.4/Python-3.4.4.tgz
tar -zxvf Python-3.4.4.tgz
# http://www.lfd.uci.edu/~gohlke/pythonlibs/#matplotlib

easy_install pip
pip install six-1.10.0-py2.py3-none-any.whl
pip install python_dateutil-2.5.0-py2.py3-none-any.whl
pip install pyparsing-2.0.7-py2.py3-none-any.whl
yum install numpy python-matplotlib
cd SPECsfs2014_install
python SfsManager --install-dir=/root/SPECsfs2014
cd ../SPECsfs2014
vi netmist_license_key    # LICENSE KEY 4545
chmod a+x *
chmod a+x /root/SPECsfs2014/binaries/linux/x64/netmist

Mount-point format: client_name:/mount_point_path; if a mountpoints_file is used, one entry per line, e.g. "client1 /mnt".

client_name /mount_point_path entries for NFS:

192.168.12.17 /mnt/nfs
192.168.12.18 /mnt/nfs
192.168.12.19 /mnt/nfs
192.168.12.20 /mnt/nfs
192.168.12.21 /mnt/nfs
192.168.12.22 /mnt/nfs

mkdir /mnt/nfs
yum install nfs-utils
mount -t nfs4 -o minorversion=0,soft 192.168.12.200:/specsfs /mnt/nfs

client_name /mount_point_path entries for SMB:

192.168.12.17 /mnt/smb
192.168.12.18 /mnt/smb
192.168.12.19 /mnt/smb
192.168.12.20 /mnt/smb
192.168.12.21 /mnt/smb
192.168.12.22 /mnt/smb

On the client, to check the mount:
mount -t nfs4 192.168.12.11:/mnt/glusterfs /mnt/glusterfs

Commands for the NFS mount for both Gluster and Compuverde:
mount -t nfs4 192.168.12.17:/share01 /mnt/share01


Mounting the SMB shares for Gluster and Compuverde:

sudo apt-get install samba
apt-get install smbclient
sudo apt-get install cifs-utils
sudo service samba status
sudo service samba start
vi /etc/samba/smb.conf
sudo service samba reload
ip a
sudo smbpasswd -a root

Gluster:
mount -t cifs \\\\192.168.12.11\\glusterfs /mnt/glusterfsc11cifs/ -o user=root,pass=C0mpuverde!

Compuverde (from every node to the client):
mount -t cifs \\\\192.168.12.17\\share01 /mnt/share01/ -o user=Administrator,pass=C0mpuverde!
(or)
mount -t cifs \\\\192.168.12.17\\share01 /mnt/share01/ -o noserverino,vers=1.0,user=Administrator,pass=C0mpuverde!

The final commands used to run the test using VDI for both Gluster and Compuverde:

chmod a+x binaries/linux/x64/netmist
python SfsManager -r sfs_rc -s vdi-run01

Power command for Compuverde NFS for loads 1 to 6 (console output):

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps nfsd,Compuverde+ --agg mean --console duration 5

COMPUVERDE-nfs node07-load06

PowerAPI setup:

1) First, untar the software library.
2) Enter the project-cli.
3) Run the following commands to install the software library:
   sbt
   project powerapi-cli
   stage
   exit

4) Find out the Thermal Design Power (TDP) value of the processor and add it as follows to the powerapi.conf file in the conf folder:

   enter the text "powerapi.cpu.tdp=" followed by the TDP value
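For example (the number below is a placeholder; use the TDP of the processor actually installed in the node), for a CPU with a TDP of 80 W the line would read:

powerapi.cpu.tdp=80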

5) Save and exit.

6) Now move into the stage folder with the following command:

   cd powerapi-cli/target/universal/stage


7) For help with the PowerAPI command, use:

   ./bin/powerapi (use -h for help)

8) For example:

   ./bin/powerapi modules cpu-simple monitor --frequency 1000 --targets [application_name | application_pid] [--console | --file | --chart]

9) Below are the PowerAPI commands run on the nodes to obtain the mean, median and standard deviation of the power samples:
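Note: the values passed to --pids below are the process IDs of the monitored storage processes on each node at the time of the run; they change after a reboot or service restart. A minimal sketch of how such IDs can be listed on a node, assuming the process names match those used with --apps elsewhere in this appendix:

# List candidate PIDs of the monitored processes (process names assumed)
pgrep nfsd
pgrep -f Compuverde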

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 16356,16341,16557 --agg mean --file /root/node07powerapiresults/vda-compu-nfs-power-mean-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 16356,16341,16557 --agg median --file /root/node07powerapiresults/vda-compu-nfs-power-median-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 16356,16341,16557 --agg stdev --file /root/node07powerapiresults/vda-compu-nfs-power-stdev-load06.txt duration 5

node08

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,Compuverde+ --agg mean --file /root/node08powerapiresults/vda-compu-nfs-power-mean-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,Compuverde+ --agg median --file /root/node08powerapiresults/vda-compu-nfs-power-median-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,Compuverde+ --agg stdev --file /root/node08powerapiresults/vda-compu-nfs-power-stdev-load06.txt duration 5

node09

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 847,628,644 --agg mean --file /root/node09powerapiresults/vda-compu-nfs-power-mean-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 847,628,644 --agg median --file /root/node09powerapiresults/vda-compu-nfs-power-median-load06.txt duration 5


./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 847,628,644 --agg stdev --file /root/node09powerapiresults/vda-compu-nfs-power-stdev-load06.txt duration 5

node10

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 6154,5952,5937 --agg mean --file /root/node10powerapiresults/vda-compu-nfs-power-mean-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 6154,5952,5937 --agg median --file /root/node10powerapiresults/vda-compu-nfs-power-median-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 6154,5952,5937 --agg stdev --file /root/node10powerapiresults/vda-compu-nfs-power-stdev-load06.txt duration 5

node11

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 32412,32397,32615 --agg mean --file /root/node11powerapiresults/vda-compu-nfs-power-mean-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 32412,32397,32615 --agg median --file /root/node11powerapiresults/vda-compu-nfs-power-median-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 32412,32397,32615 --agg stdev --file /root/node11powerapiresults/vda-compu-nfs-power-stdev-load06.txt duration 5

node13

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 27086,27288,27070 --agg mean --file /root/node13powerapiresults/vda-compu-cifs-power-mean-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 27086,27288,27070 --agg median --file /root/node13powerapiresults/vda-compu-cifs-power-median-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 27086,27288,27070 --agg stdev --file /root/node13powerapiresults/vda-compu-cifs-power-stdev-load06.txt duration 5

Power command for Compuverde SMB for loads 1 to 6 (console output):

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,Compuverde+ --agg mean --console duration 5


COMPUVERDE-cifs node07-load06

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 16356,16341,16557 --agg mean --file /root/node07powerapiresults/vda-compu-cifs-power-mean-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 16356,16341,16557 --agg median --file /root/node07powerapiresults/vda-compu-cifs-power-median-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 16356,16341,16557 --agg stdev --file /root/node07powerapiresults/vda-compu-cifs-power-stdev-load06.txt duration 5

node08

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,Compuverde+ --agg mean --file /root/node08powerapiresults/vda-compu-cifs-power-mean-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,Compuverde+ --agg median --file /root/node08powerapiresults/vda-compu-cifs-power-median-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,Compuverde+ --agg stdev --file /root/node08powerapiresults/vda-compu-cifs-power-stdev-load06.txt duration 5

node09

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 847,628,644 --agg mean --file /root/node09powerapiresults/vda-compu-cifs-power-mean-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 847,628,644 --agg median --file /root/node09powerapiresults/vda-compu-cifs-power-median-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 847,628,644 --agg stdev --file /root/node09powerapiresults/vda-compu-cifs-power-stdev-load06.txt duration 5

node10

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 6154,5952,5937 --agg mean --file /root/node10powerapiresults/vda-compu-cifs-power-mean-load06.txt duration 5


./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 6154,5952,5937 --agg median --file /root/node10powerapiresults/vda-compu-cifs-power-median-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 6154,5952,5937 --agg stdev --file /root/node10powerapiresults/vda-compu-cifs-power-stdev-load06.txt duration 5

node11

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 32412,32397,32615 --agg mean --file /root/node11powerapiresults/vda-compu-cifs-power-mean-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 32412,32397,32615 --agg median --file /root/node11powerapiresults/vda-compu-cifs-power-median-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 32412,32397,32615 --agg stdev --file /root/node11powerapiresults/vda-compu-cifs-power-stdev-load06.txt duration 5

node13

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 27086,27288,27070 --agg mean --file /root/node13powerapiresults/vda-compu-cifs-power-mean-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 27086,27288,27070 --agg median --file /root/node13powerapiresults/vda-compu-cifs-power-median-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --pids 27086,27288,27070 --agg stdev --file /root/node13powerapiresults/vda-compu-cifs-power-stdev-load06.txt duration 5

Power command for GlusterFS NFS for loads 1 to 6 (console output):

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps nfsd,glusterfsd,glusterfs --agg mean --console duration 5

gluster-nfs node01-load06

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps nfsd,glusterfsd,glusterfs --agg mean --file /home/bth/node01powerapiresults/vda-gluster-nfs-power-mean-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps nfsd,glusterfsd,glusterfs --agg median --file /home/bth/node01powerapiresults/vda-gluster-nfs-power-median-load06.txt duration 5


./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps nfsd,glusterfsd,glusterfs --agg stdev --file /home/bth/node01powerapiresults/vda-gluster-nfs-power-stdev-load06.txt duration 5

node02

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps nfsd,glusterfsd,glusterfs --agg mean --file /home/bth/node02powerapiresults/vda-gluster-nfs-power-mean-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps nfsd,glusterfsd,glusterfs --agg median --file /home/bth/node02powerapiresults/vda-gluster-nfs-power-median-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps nfsd,glusterfsd,glusterfs --agg stdev --file /home/bth/node02powerapiresults/vda-gluster-nfs-power-stdev-load06.txt duration 5

node03

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps nfsd,glusterfsd,glusterfs --agg mean --file /home/bth/node03powerapiresults/vda-gluster-nfs-power-mean-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps nfsd,glusterfsd,glusterfs --agg median --file /home/bth/node03powerapiresults/vda-gluster-nfs-power-median-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps nfsd,glusterfsd,glusterfs --agg stdev --file /home/bth/node03powerapiresults/vda-gluster-nfs-power-stdev-load06.txt duration 5

node04

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps nfsd,glusterfsd,glusterfs --agg mean --file /home/bth/node04powerapiresults/vda-gluster-nfs-power-mean-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps nfsd,glusterfsd,glusterfs --agg median --file /home/bth/node04powerapiresults/vda-gluster-nfs-power-median-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps nfsd,glusterfsd,glusterfs --agg stdev --file /home/bth/node04powerapiresults/vda-gluster-nfs-power-stdev-load06.txt duration 5

node05

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps nfsd,glusterfsd,glusterfs --agg mean --file /home/bth/node05powerapiresults/vda-gluster-nfs-power-mean-load06.txt duration 5


./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps nfsd,glusterfsd,glusterfs --agg median --file /home/bth/node05powerapiresults/vda-gluster-nfs-power-median-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps nfsd,glusterfsd,glusterfs --agg stdev --file /home/bth/node05powerapiresults/vda-gluster-nfs-power-stdev-load06.txt duration 5

node06

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps nfsd,glusterfsd,glusterfs --agg mean --file /home/bth/node06powerapiresults/vda-gluster-nfs-power-mean-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps nfsd,glusterfsd,glusterfs --agg median --file /home/bth/node06powerapiresults/vda-gluster-nfs-power-median-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps nfsd,glusterfsd,glusterfs --agg stdev --file /home/bth/node06powerapiresults/vda-gluster-nfs-power-stdev-load06.txt duration 5

Power command for GlusterFS SMB for loads 1 to 6 (console output):

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,glusterfsd,glusterfs --agg mean --console duration 5

gluster-cifs node01-load01

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,glusterfsd,glusterfs --agg mean --file /home/bth/node01powerapiresults/vda-gluster-cifs-power-mean-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,glusterfsd,glusterfs --agg median --file /home/bth/node01powerapiresults/vda-gluster-cifs-power-median-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,glusterfsd,glusterfs --agg stdev --file /home/bth/node01powerapiresults/vda-gluster-cifs-power-stdev-load06.txt duration 5

node02

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,glusterfsd,glusterfs --agg mean --file /home/bth/node02powerapiresults/vda-gluster-cifs-power-mean-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,glusterfsd,glusterfs --agg median --file /home/bth/node02powerapiresults/vda-gluster-cifs-power-median-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,glusterfsd,glusterfs --agg stdev --file /home/bth/node02powerapiresults/vda-gluster-cifs-power-stdev-load06.txt duration 5


node03

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,glusterfsd,glusterfs --agg mean --file /home/bth/node03powerapiresults/vda-gluster-cifs-power-mean-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,glusterfsd,glusterfs --agg median --file /home/bth/node03powerapiresults/vda-gluster-cifs-power-median-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,glusterfsd,glusterfs --agg stdev --file /home/bth/node03powerapiresults/vda-gluster-cifs-power-stdev-load06.txt duration 5

node04

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,glusterfsd,glusterfs --agg mean --file /home/bth/node04powerapiresults/vda-gluster-cifs-power-mean-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,glusterfsd,glusterfs --agg median --file /home/bth/node04powerapiresults/vda-gluster-cifs-power-median-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,glusterfsd,glusterfs --agg stdev --file /home/bth/node04powerapiresults/vda-gluster-cifs-power-stdev-load06.txt duration 5

node05

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,glusterfsd,glusterfs --agg mean --file /home/bth/node05powerapiresults/vda-gluster-cifs-power-mean-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,glusterfsd,glusterfs --agg median --file /home/bth/node05powerapiresults/vda-gluster-cifs-power-median-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,glusterfsd,glusterfs --agg stdev --file /home/bth/node05powerapiresults/vda-gluster-cifs-power-stdev-load06.txt duration 5

node06

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,glusterfsd,glusterfs --agg mean --file /home/bth/node06powerapiresults/vda-gluster-cifs-power-mean-load06.txt duration 5


./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,glusterfsd,glusterfs --agg median --file /home/bth/node06powerapiresults/vda-gluster-cifs-power-median-load06.txt duration 5

./bin/powerapi modules procfs-cpu-simple monitor --frequency 1000 --apps sshd,glusterfsd,glusterfs --agg stdev --file /home/bth/node06powerapiresults/vda-gluster-cifs-power-stdev-load06.txt duration 5
