
ACCELERATING BIG DATA ANALYTICS ON TRADITIONAL HIGH-PERFORMANCE COMPUTING SYSTEMS USING TWO-LEVEL STORAGE

A Dissertation Presented to the Graduate School of Clemson University

In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy Computer Science

by Pengfei Xuan December 2016

Accepted by:
Dr. Feng Luo, Committee Chair
Dr. Pradip Srimani
Dr. Rong Ge
Dr. Jim Martin

Abstract

High-performance Computing (HPC) clusters, which consist of a large number of compute nodes, have traditionally been widely employed in industry and academia to run diverse compute-intensive applications. In recent years, the revolution in data-driven science has produced large volumes of data, often terabytes or petabytes in size, and has caused data-intensive applications to grow exponentially. Data-intensive computing presents new challenges to HPC clusters due to different workload characteristics and optimization objectives. One of those challenges is how to efficiently integrate software frameworks developed for big data analytics, such as Hadoop and Spark, with traditional HPC systems to support both data-intensive and compute-intensive workloads.

To address this challenge, we first present a novel two-level storage system, TLS, that integrates a distributed in-memory storage system with a parallel file system. The former renders memory-speed high I/O performance and the latter renders consistent storage with large capacity. We model and compare its I/O throughput to those of the Hadoop Distributed File System (HDFS) and OrangeFS (formerly PVFS2). We further build a prototype of TLS with Alluxio (formerly Tachyon) and OrangeFS, and evaluate its performance using MapReduce benchmarks. Both analyses and experiments on real systems show that the proposed storage architecture delivers higher aggregate I/O throughput than HDFS and OrangeFS while retaining weak scalability on both read and write.

However, statically configured in-memory storage may leave inadequate space for compute-intensive jobs, or may lose the opportunity to utilize more available space for data-intensive applications. We therefore develop a dynamic memory controller, DynIMS, which infers the memory demands of compute tasks in real time and employs a feedback-based control model to dynamically adjust the capacity of the in-memory storage system. DynIMS is able to quickly release in-memory storage capacity for compute-intensive workloads, as well as maximize the in-memory storage capacity for data-intensive applications when other compute workloads are finished. We test DynIMS using mixed HPCC and Spark workloads on a production HPC cluster. Experimental results show that DynIMS can achieve up to 5× performance improvement compared to systems with static memory allocation.

We expect the work in this dissertation to help further accelerate the adoption of big data frameworks for solving data-intensive problems on traditional HPC systems, and to gear up the converged computing infrastructure for both academia and industry.


Dedication

To my wife, daughter, and all of my family, friends, and teachers. Thank you for all of your support and inspiration. You have all touched me deeply and made this work possible.


Acknowledgments

I would like to sincerely thank the following people, without whose kindness and patience this thesis would not have been possible:

- My research adviser and co-advisers: Feng Luo, Pradip K. Srimani, Rong Ge

- Various faculty at School of Computing, Clemson University: Amy W. Apon, Jim Martin, Stephen T. Hedetniemi, D. E. Stevenson

- Various faculty at Electrical and Computer Engineering, Clemson University: Walter B. Ligon, Kuang-Ching Wang

- Academic and industry collaborators: Anna Blenda, Stephen Ficklin, Boyd Wilson, Jeffrey Denton, Calvin Jia, Bin Fan

- Lab members: Yuehua Zhang, Yunsheng Wang, Bo Li

- Friends and family from outside of work - thanks for making life even more exciting.


Table of Contents

Title Page ...... i

Abstract ...... ii

Dedication ...... iv

Acknowledgments ...... v

List of Tables ...... ix

List of Figures ...... x

Chapter 1: Introduction ...... 1

1.1 Background ...... 1

1.2 Problem Statement ...... 3

1.3 Proposed Conceptual Framework ...... 8

1.4 Solution Summary ...... 9

1.5 Publication Contributions ...... 11

1.6 Dissertation Organization ...... 12

Chapter 2: Two-Level Storage ...... 14

2.1 Introduction ...... 14

2.2 Background ...... 16

2.2.1 HPC and Hadoop Architectures ...... 16

2.2.2 Data Access Performance ...... 17

2.3 Design and Implementation ...... 20

2.3.1 Data Layout Mapping ...... 22

2.3.2 I/O Modes of the Two-level Storage ...... 23

2.4 I/O Throughput Analysis ...... 25

2.4.1 I/O Modeling of Different Storage Systems ...... 25

2.4.2 I/O Modeling of HDFS ...... 26

2.4.3 I/O Modeling of OrangeFS ...... 27

2.4.4 I/O Modeling of Tachyon ...... 28

2.4.5 I/O Modeling of the Two-level Storage ...... 29

2.4.6 Aggregate I/O Throughput Comparison ...... 29

2.4.7 MapReduce I/O Cost Analysis ...... 31

2.4.8 I/O Cost in Map Phase ...... 32

2.4.9 I/O Cost in Shuffle Phase ...... 33

2.4.10 I/O Cost in Reduce Phase ...... 33

2.5 Evaluation ...... 34

2.5.1 Experimental Setup ...... 34

2.5.2 I/O Characterization for Two-level Storage ...... 36

2.5.3 Performance Evaluation Using TeraSort ...... 39

2.5.4 Performance Evaluation Using K-means ...... 42

2.6 Related Work ...... 48

2.7 Summary ...... 52

Chapter 3: In-Memory Storage Controller ...... 54

3.1 Introduction ...... 54

3.2 Background ...... 58

3.2.1 Memory Usage Pattern of HPC Applications ...... 59


3.2.2 Memory Pressure and HPC Application Performance ...... 59

3.2.3 Memory Management ...... 61

3.3 System Design ...... 62

3.3.1 Architecture Overview ...... 62

3.4 Experimental Evaluation ...... 70

3.4.1 Experiment Setup ...... 70

3.4.2 Microbenchmarks ...... 71

3.4.3 Performance of Spark Applications ...... 74

3.4.4 Impact of Insufficient Storage Memory ...... 76

3.4.5 Performance and Scalability of DynIMS Controller ...... 78

3.4.6 Stability and Responsiveness of DynIMS Control Model ...... 80

3.5 Related Work ...... 81

3.6 Summary ...... 83

Chapter 4: Conclusion and Future Works ...... 84

Reference ...... 87


List of Tables

Table 2-1 Storage Capacity and Network Bandwidth of Compute Nodes on Several National HPC Clusters...... 18
Table 2-2 List of Notations...... 26
Table 2-3 Hardware Configurations of Selected Nodes on Palmetto Cluster...... 35
Table 3-1 Parameter values of the memory controller...... 69
Table 3-2 Hardware Configurations of Selected Nodes on Palmetto Cluster...... 70

List of Figures

Figure 1-1 HPC and BDS framework comparison...... 3
Figure 1-2 Hardware topology abstractions for BDS and HPC cluster...... 5
Figure 1-3 Conceptual framework on the proposed studies...... 8
Figure 2-1 Architectural abstractions for Hadoop and HPC infrastructures...... 17
Figure 2-2 I/O throughput of a single compute node on national HPC clusters...... 19
Figure 2-3 System architecture of two-level storage...... 21
Figure 2-4 Input file partition and checkpointing data file striping on the two-level storage system...... 22
Figure 2-5 I/O operation modes of two-level storage...... 24
Figure 2-6 Read I/O flow of two-level storage. The numbers indicate the order of distance-based read policy...... 25
Figure 2-7 Aggregate read throughput (left) and write throughput (right) of HDFS, parallel file system and two-level storage...... 31
Figure 2-8 Storage I/O accesses in MapReduce job...... 32
Figure 2-9 The storage mountain of two-level storage system...... 37
Figure 2-10 The read throughput of two-level storage shows the data locality. The graph shows a slice of Figure 2-9 when skip size is 0 KB...... 38
Figure 2-11 The impact of read skip size on read throughput...... 39
Figure 2-12 Performance profiling metrics for TeraSort benchmark suite on three storage systems...... 40
Figure 2-13 Accumulated running time of K-means algorithm with three different storage systems...... 43
Figure 2-14 Running time on each K-means iteration...... 44
Figure 2-15 Average running times of K-means using 256 GB data on 16 machines with varying amounts of data in Tachyon (first iteration running time excluded)...... 46
Figure 2-16 Weak scalability of three storage systems. CN: compute nodes, DN: data nodes...... 47
Figure 3-1 Memory usage pattern on HPCC benchmark...... 59
Figure 3-2 Performance impact on system memory pressure...... 60
Figure 3-3 High-level abstraction of memory usage on compute node...... 61
Figure 3-4 DynIMS system architecture...... 63
Figure 3-5 Feedback-based control on memory adjustment...... 66
Figure 3-6 Memory eviction and allocation size for different values of the gain parameter λ = 2.5, 0.1, 0.5...... 68
Figure 3-7 The aggregated throughputs on sequential read...... 72
Figure 3-8 The aggregated throughputs on random read...... 72
Figure 3-9 Running time of different machine learning applications (320 GB datasets) using different memory configurations...... 75
Figure 3-10 K-means application with different problem sizes using different memory configurations...... 77
Figure 3-11 Iteration time on K-means workload (320 GB datasets) using different memory configurations...... 77
Figure 3-12 The CPU utilization of the controller cluster over different cluster sizes...... 79
Figure 3-13 The system memory statistics of K-means (320 GB dataset) and HPCC workloads during the peak memory demand...... 80

Chapter 1: Introduction

1.1 Background

High-performance Computing (HPC) clusters, derived from the concept of Beowulf clusters [1], provide a cost-effective computing infrastructure by combining a large number of commodity computing devices, and have been widely adopted by both industry and academia [2], [3]. The architecture of HPC clusters has been designed to target compute-intensive computing, and software and system service layers have been built to support HPC, including provisioning (OSCAR [4], Rocks [5], and Warewulf [6]), workflow (Swift [7], Kepler [8], Taverna [9], Wings [10], and more [11], [12]), scheduling and resource management (OpenPBS [13], Slurm [14], Torque [15], and SGE [16]), parallel programming interfaces and runtimes (MPI [17], OpenMP [18], OpenCL [19], and exascale runtimes [20]), and parallel file systems (PVFS/OrangeFS [21], [22]). However, the popularity and adaptability of HPC software infrastructures lag far behind the computational power enabled by the Beowulf cluster hardware infrastructure [23].

In recent years, data-intensive computing has become a major workload of more and more scientific applications as data volumes grow rapidly in many research fields. The Fourth Paradigm [24] opens a new chapter of computational science beyond computer simulations of natural phenomena. The fast growth of data-intensive scientific computing (DISC) is quickly entering the HPC environment and presents a significant challenge for traditional HPC systems in both the hardware and software layers. To cope with the challenge of ever-growing data, the software stack for cluster computing has shifted. New software frameworks with different levels of data-processing abstraction have been designed, including distributed data systems (e.g., Hadoop HDFS [25], its acceleration solutions [26] and alternatives [27], and LinkedIn Espresso [28]), cluster management and monitoring systems (YARN [29] at Yahoo, Mesos [30] at Twitter, Helix [31] at LinkedIn, Corona [32] at Facebook, and the recent ultra-high-throughput, low-latency Sparrow [33] scheduler), parallel programming models (MapReduce [34], Tez [35], REEF [36], Dryad [37], Spark [38], Storm [39], and Flink [40]), NoSQL databases and SQL interfaces (HBase [41], MongoDB [42], Cassandra [43], Hive, BlinkDB, and more [44]), large-scale graph and machine learning frameworks (Giraph [45], GraphX [46], Mahout [47], MLBase [48]), and much more [49]. With nearly a decade of evolution, the fast-growing open-source community has created a new ecosystem for big data analytics, the so-called Big Data Stack (BDS), which shows great potential for addressing the challenges posed by data-intensive computing. The similarities in hardware and software architecture between HPC and BDS clusters strongly suggest a high degree of integration between the two paradigms. Fox et al. [50] have illustrated these similarities from an architectural perspective and provided a reference architecture called HPC-ABDS in their integration initiative [51]. The integration of HPC systems with big data frameworks is still in its nascent stages. In the next section, we discuss the detailed issues that prevent the integration of traditional HPC systems and big data frameworks.


1.2 Problem Statement

The wide adoption of BDS has generated significant interest in both academia and industry. The convergence of HPC hardware infrastructure with emerging big data frameworks is becoming particularly important for computational scientists who need a practical data analytics platform for processing terabyte- or even petabyte-scale data sets. Researchers and engineers have put considerable effort into adopting big data frameworks on their traditional HPC clusters. Figure 1-1 depicts an architectural abstraction comprising four major layer mappings between BDS and HPC systems. Architectural differences inherited from their workloads lead to different degrees of integration at different layers [50].

Figure 1-1 HPC and BDS framework comparison. (The figure maps four layers: Workload: data-intensive applications (BDS) vs. compute-intensive applications (HPC); Programming Model: MapReduce/Spark/Flink vs. MPI/Bash/Swift; Scheduler: YARN/Mesos vs. PBS/Slurm/SGE; File System: HDFS vs. PVFS.)

In BDS, the software frameworks such as the file system, scheduler, and programming model are typically designed for and implemented in loosely coupled distributed environments consisting of a large number of low-cost commodity machines, with an emphasis on fault tolerance, scalability, and productivity. HPC systems, however, mainly focus on extreme performance on tightly coupled hardware infrastructures with highly optimized codes. Although HPC systems are equipped with high-performance components and seem like a natural fit for data-intensive applications, a few challenging problems remain for adopting BDS on HPC systems. These problems are inherited from the compute-centric design of HPC hardware components as well as of its software frameworks, which not only results in poor deployability but also limits scalability on data-centric workloads. From bottom to top, these problems are:

1) BDS and HPC systems have different designs for data access. BDS relies on data locality to minimize the network overhead produced by data-intensive workloads, while data access in HPC systems comes mainly from a remote parallel file system. From an architectural perspective, BDS combines data nodes and compute nodes on a single machine, while HPC systems use dedicated data nodes and compute nodes connected by a high-performance network. Figure 1-2 gives a high-level abstraction of the difference between BDS and HPC hardware topologies.

HPC clusters typically have two types of data storage: a globally shared, centralized persistent file system and a local temporary file system. In HPC, user data is hosted on a group of dedicated data nodes across a high-speed storage network and is transferred to the compute nodes where computational tasks are scheduled. User data hosted on local storage devices is ephemeral and is purged after jobs complete or when resource limits are reached. The separation between compute nodes and data nodes makes data locality impossible without introducing co-located storage on compute nodes. In contrast, local storage devices on compute nodes are used for persistent data storage in BDS (e.g., the framework of [52]), and computational tasks are scheduled to the physical machine where the required data is stored in order to achieve maximum data locality. Comparing BDS and HPC systems, the underlying data access mechanisms are essentially orthogonal.

Figure 1-2 Hardware topology abstractions for BDS and HPC cluster.

Simply deploying BDS storage, such as the Hadoop Distributed File System (HDFS) [25], over a parallel file system across the distributed data nodes makes data-intensive computing suffer poor scalability or poor throughput. On the other hand, deploying HDFS on the local storage devices (e.g., HDD and RAM) of the compute nodes results in a small aggregate storage capacity. More specifically, if Hadoop uses the global parallel file system as its storage, the average I/O throughput of each compute node decreases as the number of compute nodes participating in the computation increases; that is, the I/O throughput does not scale. Since data locality is a critical feature for scaling data-intensive applications, the adoption of BDS on HPC systems has to overcome the architectural difference between the distributed data systems and eliminate limitations such as the scalability issue of the globally shared storage system and the ephemerality issue of local storage devices.

2) BDS and HPC systems have different scheduling granularity. HPC systems typically use coarse-grained schedulers (such as PBS, Slurm, and SGE) with a unified resource allocation policy over the cluster job queue, in which workloads are scheduled at the job level. In contrast, BDS uses application-level schedulers to achieve a higher degree of optimization of resource usage, in which data-intensive workloads are typically chunked into fine-grained, loosely coupled concurrent tasks and then scheduled to the compute nodes that are close to the data. Today, the integration of BDS and HPC resource management systems is typically achieved using a two-level scheduling approach, in which the BDS scheduler is embedded as a secondary scheduler to redistribute the computing resources reserved by the HPC scheduler. Previous works, such as MyHadoop [53] and Magpie [54], are two practical projects that deploy BDS schedulers (MapReduce JobTracker and YARN) inside HPC schedulers (PBS and Slurm). While these frameworks can spawn BDS across an HPC cluster, dedicated compute nodes or isolated resources are required to achieve an optimized solution. Running big data workloads with dynamic tuning on shared computing resources such as local memory and storage remains an open question. A deeper and more sophisticated integration, especially optimization between the remote storage system and the local memory resources, is needed to harness the full computing power of HPC clusters.

3) Level of programming model abstraction, interoperation with legacy applications, and runtime environment integration. In HPC systems, programming and computing environments are often highly heterogeneous [55]–[57]. Scientific applications are typically implemented in Fortran, C/C++, MATLAB, Python, Perl, and Bash, and are scaled to multiple CPUs, GPUs, and machines using MPI, OpenCL/CUDA, Swift, and so on. Many of those applications and libraries have been written for the HPC POSIX [58] environment by experts from the science community over the past decades and have become de facto standards in their research fields. However, the programming and runtime environments used in BDS are primarily JVM-based, which makes code-level integration between the BDS runtime and HPC applications very difficult. Existing integration techniques, such as the Java Native Interface (JNI), Java Native Access (JNA), Java Native Runtime (JNR), or standard streams, have many limitations that affect portability and performance. On the other hand, it is impractical to reimplement legacy applications using the new programming frameworks in BDS, because scientific computing spans many research domains and each of them has an enormous number of applications supported historically by its own community. There must be a flexible integration with a generic programming abstraction between the data-compute layer driven by legacy applications and the data-parallel layer controlled by BDS. Ideally, a pipeline-based design may improve computational efficiency. In that way, the data-compute layer and the data-parallel layer can be designed and implemented independently to avoid architectural mismatches and conflicts.

1.3 Proposed Conceptual Framework

The proposed high-performance data analytics (HPDA) system architecture includes two main building blocks to improve the interoperability and scalability of data-intensive applications on HPC systems. Figure 1-3 provides a conceptual framework for deeply integrating the big data stack over traditional HPC infrastructures; the shaded parts are our proposed studies.

Figure 1-3 Conceptual framework on the proposed studies. (The stack runs from data-intensive applications such as genomics analysis and stochastic simulation, through big data programming frameworks (MapReduce, Spark) and resource management systems (PBS, Slurm, YARN) on the compute nodes, down to the two-level storage (TLS): an in-memory file system (Tachyon/Alluxio), regulated by the dynamic memory controller DynIMS, over a parallel file system (OrangeFS, Lustre) on the data nodes.)

From bottom to top, the proposed studies include:

• Two-level Storage (TLS): The study of TLS focuses on improving the scalability of data-intensive workloads by accelerating the I/O throughput of the parallel file system. Our earlier prototype integrates Tachyon, a memory-centric storage system, with OrangeFS, a parallel file system. Future work will extend the prototype to other file systems. Chapter 2 provides a detailed discussion of the design of TLS.

• Dynamic Memory Controller (DynIMS): This is an integrated control module used to monitor, optimize, and regulate the cluster memory resources consumed by task execution and data storage. The controller is integrated with the HPC resource management system, the BDS resource manager, and the TLS core service using both passive and active measurements. Chapter 3 presents the detailed research results.

• Data-intensive Applications: At the top of the proposed HPDA framework, we built a few data-intensive scientific applications and pipelines to study the efficiency and performance behavior of the tightly integrated system. In particular, we select two applications from different areas, genomics and bioinformatics and theoretical and computational chemistry, to demonstrate the success of our study.

1.4 Solution Summary

In this dissertation, we address the limitations of using big data frameworks on HPC systems through storage system enhancement and memory usage optimization. Our preliminary results confirm the improvements over state-of-the-art solutions. In particular, our solutions and achievements are as follows.

We present a novel two-level storage system, TLS, that integrates an upper-level in-memory file system with a lower-level parallel file system. TLS delivers high I/O throughput to big data applications at memory speed while retaining a large storage capacity in HPC systems. In this work, we develop four mathematical I/O models, for HDFS, PFS, Tachyon, and TLS, and one I/O cost model for MapReduce jobs. Further, we implement a TLS prototype using a Tachyon-backed OrangeFS storage architecture to demonstrate the efficiency as well as the effectiveness of the proposed storage architecture. In our evaluation, both theoretical modeling and experiments show that the proposed two-level storage delivers higher aggregate I/O throughput than HDFS and OrangeFS and achieves scalable performance for both read and write. Specifically, theoretical modeling indicates that the two-level storage increases the aggregate read bandwidth by about 25% at a 0.2 in-memory hit ratio and about 95% at a 0.5 in-memory hit ratio. The K-means experiment shows that the running time on TLS is 3.6× and 1.6× shorter than on OrangeFS and HDFS, respectively.

To maximize cluster throughput on mixed HPC and big data workloads, we introduce the concept of elastic in-memory computing, which overcomes the limitation of resource sharing on statically configured computing clusters. In this work, we explore techniques to dynamically regulate the capacity of the in-memory storage system and leave the right amount of space for compute jobs. We frame our work with a discussion of DynIMS, a dynamic memory controller that infers the memory demands of compute tasks online and employs a feedback-based control model to dynamically adjust the Alluxio in-memory storage space. Experimental benchmarks with mixed HPCC and Spark workloads show that DynIMS boosts the performance of the big data workload close to its theoretical maximum while having no performance impact on HPC jobs. In addition, self-tuning of cluster resources largely avoids execution exceptions caused by insufficient memory and significantly increases the operability of publicly shared HPC clusters.
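To make the feedback loop concrete, the following is a minimal sketch of such a controller. The interfaces, the 10% free-memory target, and the gain constant LAMBDA are illustrative assumptions for exposition only; they are not the actual DynIMS implementation, which is described in Chapter 3.

    // Minimal sketch of a feedback-based in-memory storage controller.
    // Assumptions (not from the dissertation): a MemoryStats source that reports
    // free host memory, an AlluxioAdmin handle that can shrink or grow the
    // in-memory store, a proportional gain LAMBDA, and a fixed free-memory target.
    public class InMemoryStorageController {

        interface MemoryStats { long freeBytes(); long totalBytes(); }

        interface AlluxioAdmin {
            void evict(long bytes);     // release in-memory storage back to the OS
            void expand(long bytes);    // reclaim idle memory for the in-memory store
        }

        private static final double LAMBDA = 0.5;             // controller gain (illustrative)
        private static final double TARGET_FREE_RATIO = 0.10; // keep ~10% of host memory free

        private final MemoryStats stats;
        private final AlluxioAdmin storage;

        public InMemoryStorageController(MemoryStats stats, AlluxioAdmin storage) {
            this.stats = stats;
            this.storage = storage;
        }

        /** One control step: compare observed free memory with the target and
         *  adjust the in-memory storage capacity proportionally to the error. */
        public void step() {
            long target = (long) (stats.totalBytes() * TARGET_FREE_RATIO);
            long error = target - stats.freeBytes();   // > 0: compute tasks need memory
            long adjustment = (long) (LAMBDA * error);
            if (adjustment > 0) {
                storage.evict(adjustment);              // shrink storage for HPC tasks
            } else if (adjustment < 0) {
                storage.expand(-adjustment);            // grow storage for data analytics
            }
        }
    }

In the real system the adjustment is applied periodically, and the gain (the λ of Figure 3-6) trades responsiveness against stability.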

Overall, this dissertation helps the HPC community create a comprehensive theoretical model of the I/O subsystem and develop a converged storage system for the integration of big data and HPC systems. Ultimately, we hope the system architecture and theoretical models developed here will contribute to symbiotic efforts to develop next-generation HPC systems that support both compute- and data-intensive workloads.

1.5 Publication Contributions

During my Ph.D. study period, the following publications have been developed:

1. P. Xuan, P. Srimani, R. Ge, F. Luo, “DynIMS: A Dynamic Memory Controller for In-Memory Storage on HPC Systems”, SC16: The 7th International Workshop on Big Data Analytics: Challenges, and Opportunities (BDAC), 2016

2. Y. Zhang, P. Xuan, P. Srimani, F. Luo, “A de novo Genome Assembler based on MapReduce and Bi-directed de Bruijn Graph”, The IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2016

3. P. Xuan, W. Ligon, P. Srimani, R. Ge, F. Luo, “Accelerating Big Data Analytics on HPC Clusters using Two-Level Storage”, Parallel Computing, 2016

4. P. Xuan, J. Denton, P. Srimani, R. Ge, F. Luo, “Big Data Analytics on Traditional HPC Infrastructure Using Two-Level Storage”, SC15: The International Workshop on Data-Intensive Scalable Computing Systems (DISCS), 2015

5. S. Niu, G. Yang, N. Sarma, P. Xuan, M. Smith, P. Srimani, F. Luo, “Combining Hadoop and GPU to Preprocess Large Affymetrix Microarray Data”, IEEE International Conference on Big Data, 2014

6. P. Xuan, Y. Zheng, S. Sarupria, A. Apon, “SciFlow: A Dataflow-Driven Model Architecture for Scientific Computing using Hadoop”, Big Data and Science: Infrastructure and Services Workshop, Part of the IEEE International Conference on Big Data, 2013

7. I. Verde, A. Abbott, P. Xuan, et al., “The high quality draft genome of peach (Prunus persica) identifies unique patterns of genetic diversity, domestication and genome evolution”, Nature Genetics, 2013.

8. Q. Wang, O. Apul, P. Xuan, F. Luo, T. Rieck, J.R, “Development of 3D QSPR model for adsorption of aromatic compounds by carbon nanotubes: Comparison among multiple linear regression, artificial neural network and support vector machine”, RSC Advances, 2013.

9. L. Zhang, L. Wang, P. Xuan, et al. "Sesame: A new bioinformatics semantic workflow design system." In Bioinformatics and Biomedicine (BIBM), IEEE International Conference on. IEEE, 2013.

10. P. Xuan, Y. Zheng, S. Sarupria, A. Apon, “Large-scale Molecular Dynamics Simulation with Forward Flux Sampling on Hadoop”, Clemson 1st Annual GRADS Event, 2013.

11. P. Xuan, K. Ferguson, C. Marshall, J. McCann, L. Ngo, Y. Zheng, A. Apon, “An Infrastructure to Support Data Integration and Curation for Higher Educational Research,” 8th IEEE International Conference on eScience, 2012

1.6 Dissertation Organization

This dissertation is organized as follows. Chapter 2 presents a novel two-level storage system that integrates an upper-level in-memory file system with a lower-level parallel file system to address the performance bottleneck in the I/O subsystem. Chapter 3 develops a new control layer that dynamically manages the memory resources shared by application execution and data storage, which further improves system throughput on HPC systems running both compute-intensive and data-intensive workloads. Finally, we conclude and discuss possible research directions for future work in Chapter 4.


Chapter 2: Two-Level Storage

Data-intensive applications that are inherently I/O bound have become a major workload on traditional HPC clusters. Simply employing data-intensive computing storage such as HDFS, or using parallel file systems available on HPC clusters, to serve such applications incurs performance and scalability issues. In this chapter, we present a novel two-level storage system that integrates an upper-level in-memory file system with a lower-level parallel file system. The former renders memory-speed high I/O performance and the latter renders consistent storage with large capacity. We build a two-level storage system prototype with Tachyon and OrangeFS, and analyze the resulting I/O throughput for typical MapReduce operations. Theoretical modeling and experiments show that the proposed two-level storage delivers higher aggregate I/O throughput than HDFS and OrangeFS and achieves scalable performance for both read and write. We expect this two-level storage approach to provide insights on system design for big data analytics on HPC clusters.

2.1 Introduction

Computer clusters consisting of a large number of compute nodes provide an indispensable computing infrastructure for scientific and engineering modeling and simulation. Traditionally, such a cluster has network-attached storage that is managed by parallel file systems. These clusters have been widely employed in industry and academia to run diverse compute-intensive HPC applications [1]–[3]. HPC system software, including parallel file systems and storage [55], [59], plays an essential role in delivering high computational throughput to scientific and engineering applications.

An emergent and prevalent workload on computer clusters is big data applications that process large volumes of data, often of size in terabytes or petabytes [60]–[62]. To meet the unique requirements of these applications for data access and storage, numerous packages at each layer of the system software stack are developed, including HDFS [25] and Espresso [28] distributed data storage, cluster resource management systems [29], [30], data-parallel programming frameworks [34], [38], [63], [64], and high-level application-oriented libraries [46]–[48], [65].

Data-intensive computing presents a new challenge when its applications migrate to computer clusters geared for traditional HPC applications [24], [59], [66], [67], mainly due to the different workload characteristics and optimization objectives. Data storage and I/O access are among the causal issues. To support the emergent data-intensive applications, previous studies either directly use HPC parallel file systems and storage [68]–[70] or deploy distributed file systems such as HDFS on compute nodes [53], [71]. Using HPC file systems and storage provides high capacity with low-cost fault tolerance, but suffers poor performance limited by the network and disk I/O bandwidth of the storage nodes. On the other hand, deploying a data-intensive file system on compute nodes takes advantage of data locality and results in high aggregate I/O throughput, but incurs costly data fault tolerance and low storage capacity.

In this chapter, we present a novel approach to accelerate data-intensive computing on HPC clusters. We develop a two-level storage system by integrating an in-memory file system with a parallel file system, and build a prototype with Tachyon [72] and OrangeFS [21]. Both theoretical modeling and experimental measurements show that the two-level storage system can increase the aggregate I/O throughput while maintaining low-cost data fault tolerance and achieving a high storage capacity. We expect this two-level storage approach to provide insights on system design for big data analytics on HPC clusters.

2.2 Background

2.2.1 HPC and Hadoop Architectures

System architectures are designed to best support the typical workloads running on the clusters and to optimize the dominating operations of these workloads. Traditional HPC systems aim to deliver high CPU compute rates to compute-intensive workloads with high ratios of compute to data access. In contrast, data-intensive computing frameworks provide high data access throughput, especially for disk I/O accesses, to big data analytics with very low ratios of compute to data access. Consequently, the two architectures differ, as shown in Figure 2-1.

The HPC architecture physically separates data nodes from compute nodes and connects the nodes with high-speed networks. The data nodes are typically connected to large-volume storage devices. As such, HPC systems consist of two types of data storage services: a global parallel file system on the data nodes and local file systems on the compute nodes. The former has a large capacity and stores consistent user input and output, and the latter has a small capacity and stores temporary data. During execution, user data is transferred from the global storage on the data nodes to the compute nodes that perform the computation. Data hosted on local storage devices is ephemeral and purged when jobs complete or storage is full. Physically separating compute nodes from data nodes provides easy data sharing, but fails to provide spatial data locality for computation tasks and cannot achieve scalable data-intensive computing.

Figure 2-1 Architectural abstractions for Hadoop and HPC infrastructures.

To deliver high I/O throughput, data-intensive computing frameworks such as Hadoop co-locate compute nodes and data nodes on the same physical machines. The local storage device on each compute node accounts for a part of the persistent system storage. Computation tasks are launched on the physical machines where the required data reside to maximally leverage data locality. Hadoop systems allow multiple computation tasks to access data from different nodes simultaneously and achieve higher aggregate I/O throughput with more nodes.

2.2.2 Data Access Performance

We use several national HPC clusters as examples to demonstrate the importance of data locality for data access performance. The clusters in Table 2-1 include four national HPC clusters [73] as well as the Palmetto cluster [74] at Clemson University. An HPC compute node often uses a single hard disk drive (HDD) as local storage and deploys high-end DRAM modules. The memory capacity is comparable to the local disk capacity on each node. The nodes are connected via high-speed Ethernet or InfiniBand. As these clusters have existed for some time, the memory modules are DDR2/DDR3 DRAMs. The system-wide storage is located on dedicated data nodes and managed by parallel file systems, such as Lustre and OrangeFS.

Table 2-1 Storage Capacity and Network Bandwidth of Compute Nodes on Several National HPC Clusters.

HPC        Disk (GB)   DRAM (GB)   Parallel FS (PB)¹   CPU (Cores)   Network
Stampede   80          32          14                  16            56 Gbit/s FDR InfiniBand
Maverick   240         256         20                  20            56 Gbit/s FDR InfiniBand
Gordon     280         64          1.6                 16            40 Gbit/s QDR InfiniBand
Trestles   50          64          1.4                 32            40 Gbit/s QDR InfiniBand
Palmetto   900         128         0.2                 20            10 Gbit/s Ethernet
Avg.       310         109         7.4                 21            40 Gbit/s

¹ 1 PB = 10^15 bytes; 1 TB = 10^12 bytes; 1 GB = 10^9 bytes; 1 MB = 10^6 bytes; 1 KB = 10^3 bytes.

We empirically evaluate the I/O performance of a program on a compute node when the program reads from and writes to different storage locations. These locations include local memory and the local disk drive, remote memory on a different compute node across the network, and the global disk storage accessed through the data nodes across the network. We run the Linux built-in tool “dd” in sequential mode and use the direct I/O option to bypass buffer caches. We measure local and remote disk I/O performance by consecutively reading and writing 16 different 1 GB files, and the memory performance with files of 10 GB. This larger size can saturate the memory bandwidth and ensures measurement accuracy. Each measurement is repeated five times on three different compute nodes.
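The exact dd invocations are not reproduced here. The sketch below, written in Java for consistency with the other examples in this chapter, shows one representative way to drive such a measurement; the target path, block size, and file count are illustrative assumptions rather than the settings actually used.

    import java.io.IOException;
    import java.util.List;

    // Representative driver for the "dd"-based measurements described above.
    public class DirectIoBenchmark {
        public static void main(String[] args) throws IOException, InterruptedException {
            String target = "/scratch/benchmark/file.bin";   // hypothetical test file

            // Sequential write of a 1 GB file with direct I/O (bypasses the page cache).
            run(List.of("dd", "if=/dev/zero", "of=" + target,
                        "bs=1M", "count=1024", "oflag=direct"));

            // Sequential read of the same file with direct I/O.
            run(List.of("dd", "if=" + target, "of=/dev/null",
                        "bs=1M", "iflag=direct"));
        }

        private static void run(List<String> cmd) throws IOException, InterruptedException {
            // dd prints its throughput summary on stderr; inherit it so it is visible.
            new ProcessBuilder(cmd).inheritIO().start().waitFor();
        }
    }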

As shown in Figure 2-2, the I/O performance is the highest when local memory devices serve as the storage, and drops to the lowest when local disks serve as the storage.

For read, I/O performance improves about 10× when storage moves from local disk to local memory, and about 2.65× compared to the global storage. Similarly, for write, I/O performance improves 6.57× and 4×, respectively. The remote memory throughput, denoted by network, is measured using “Iperf” over the IPoIB link layer. We get a reduced TCP throughput on the high-performance InfiniBand network due to the low MTU value (2,044) hard-coded on the HPC compute nodes; higher MTU values should be able to deliver a higher throughput.

Figure 2-2 I/O throughput of a single compute node on national HPC clusters.


2.3 Design and Implementation

Simply employing HDFS over the compute nodes, or a parallel file system on the global storage, makes data-intensive computing suffer poor scalability or poor throughput. On the other hand, deploying HDFS over the local disk storage of the compute nodes results in a small aggregate storage capacity. If Hadoop uses the global parallel file system as storage, the average I/O throughput of each compute node decreases as more compute nodes participate in the computation; that is, the I/O throughput does not scale.

We propose the two-level storage system shown in Figure 2-3 to support high-performance and scalable data-intensive computing on HPC infrastructure using Hadoop. This two-level storage system combines an in-memory file system on the compute nodes and a parallel file system on the data nodes. As the compute nodes in HPC clusters are often equipped with large memory, the in-memory file system can have a storage capacity comparable to local-storage-based HDFS. In addition, the I/O throughput of the in-memory file system is significantly larger than the local disk I/O throughput, and it scales with system size. At the same time, the parallel file system provides relatively low-cost data fault tolerance and a large storage capacity. Thus, the two-level storage scheme exploits the advantages of both the in-memory file system and the parallel file system.

We implement a prototype of the two-level storage system by integrating the in-memory file system Tachyon-0.6.0 with the parallel file system OrangeFS-2.9.0. Tachyon is implemented in Java and OrangeFS is implemented in C. The two levels are tightly integrated through the following two modules, indicated in the shaded area of Figure 2-3.

• Tachyon-OFS Plug-in: A Java plug-in that provides an interface to translate the functionality of the Tachyon in-memory file system to the functionality of the OrangeFS parallel file system. The plug-in also provides hints with storage layout support to allow deep tuning between the two file systems.

• OrangeFS JNI Shim: A Java API that forwards all function calls from the Tachyon-OFS Plug-in to the OrangeFS Direct Interface. To deliver high bandwidth, the shim layer uses the Java Native Interface (JNI) with non-blocking I/O (NIO) APIs and an optimized buffer size to minimize the overheads introduced in the JVM.

While designing and implementing our prototype, we added new features to the OrangeFS and Tachyon projects and contributed our work back to the open-source communities. Related patches have been merged into both the OrangeFS trunk and the Tachyon master branch.
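To illustrate the layering, the sketch below shows how an under-storage adapter might forward checkpoint writes from the in-memory level to an OrangeFS client. The interface and class names are simplified, hypothetical stand-ins for exposition; they are not the actual Tachyon under-filesystem API or the OrangeFS JNI shim shipped with the prototype, in which the client is reached through JNI over the OrangeFS Direct Interface.

    import java.io.IOException;
    import java.nio.ByteBuffer;

    // Simplified, hypothetical view of the two integration modules described above.
    public class OrangeFsUnderStorage {

        /** Stand-in for the JNI shim over the OrangeFS Direct Interface. */
        public interface OrangeFsClient {
            long open(String path, boolean createForWrite) throws IOException;
            int write(long fd, ByteBuffer buf, long offset) throws IOException;
            int read(long fd, ByteBuffer buf, long offset) throws IOException;
            void close(long fd) throws IOException;
        }

        // 4 MB lower buffer between the in-memory level and OrangeFS, as in the text.
        private static final int LOWER_BUFFER_SIZE = 4 * 1024 * 1024;

        private final OrangeFsClient client;

        public OrangeFsUnderStorage(OrangeFsClient client) { this.client = client; }

        /** Persist one in-memory block to the parallel file system (the checkpointing path). */
        public void checkpointBlock(String ufsPath, ByteBuffer block) throws IOException {
            long fd = client.open(ufsPath, true);
            try {
                long offset = 0;
                byte[] chunk = new byte[LOWER_BUFFER_SIZE];
                while (block.hasRemaining()) {
                    int n = Math.min(chunk.length, block.remaining());
                    block.get(chunk, 0, n);                      // copy from the in-memory block
                    int written = client.write(fd, ByteBuffer.wrap(chunk, 0, n), offset);
                    if (written != n) throw new IOException("short write to OrangeFS");
                    offset += n;
                }
            } finally {
                client.close(fd);
            }
        }
    }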

Figure 2-3 System architecture of two-level storage.

2.3.1 Data Layout Mapping

In our two-level storage system, OrangeFS and Tachyon have different data layouts. As shown in Figure 2-4, an input file is transparently stored in Tachyon as a set of fixed-size logical blocks, and each block is materialized into a Tachyon data file cached in the local in-memory space. The block size controls the data-parallel granularity and can be predefined in the configuration. In contrast, the data file is stored in OrangeFS as stripes. Each OrangeFS data file is then striped at the disk level, which is usually performed by the hardware RAID (redundant array of independent disks) built into each data node. Data fault tolerance of the two-level storage system is ensured by the low-level erasure coding inside each data node.

Figure 2-4 Input file partition and checkpointing data file striping on the two-level storage system.


The mapping between the Tachyon and OrangeFS data layouts affects the load balance among data nodes and the aggregate I/O throughput. To preserve the design of each layer, we organize the data with the layout of the destination file system whenever there is a data transfer between Tachyon and OrangeFS. In our current design, we use two parameters to control the mapping. The first parameter, stripe_size, determines the data granularity across the list of data nodes and has a direct impact on the aggregate I/O throughput. The second parameter, layout, decides the ordered set of data nodes that store the distributed OrangeFS data file and shapes the I/O load patterns between the Tachyon data files and the remote data nodes. To achieve optimal performance, we tune several parameters, including Tachyon's block size and OrangeFS' stripe size. The Tachyon parameters are decided before runtime, while the OrangeFS parameters are dynamically adjustable through hints implemented in our plug-in.
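The following sketch illustrates how a byte offset of an input file maps to a Tachyon block and, after checkpointing, to a data node, assuming a simple round-robin stripe distribution. The concrete values (128 MB block, 64 KB stripe, 4 data nodes) are illustrative, not the tuned settings used in the prototype.

    // Sketch of the two-level data layout mapping, assuming round-robin striping.
    public class LayoutMapping {

        static final long BLOCK_SIZE  = 128L * 1024 * 1024; // Tachyon logical block size (illustrative)
        static final long STRIPE_SIZE = 64L * 1024;         // OrangeFS stripe_size hint (illustrative)
        static final int  DATA_NODES  = 4;                  // size of the ordered layout set (illustrative)

        /** Tachyon block that holds a given byte offset of the input file. */
        static long blockIndex(long fileOffset) {
            return fileOffset / BLOCK_SIZE;
        }

        /** Data node (position in the layout list) that stores the stripe holding
         *  a given byte offset of the checkpointed OrangeFS data file. */
        static int dataNodeIndex(long fileOffset) {
            long stripeIndex = fileOffset / STRIPE_SIZE;
            return (int) (stripeIndex % DATA_NODES);
        }

        public static void main(String[] args) {
            long offset = 300L * 1024 * 1024;   // byte 300 MB of the input file
            System.out.println("Tachyon block:  " + blockIndex(offset));    // -> 2
            System.out.println("OrangeFS node:  " + dataNodeIndex(offset)); // -> 0
        }
    }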

2.3.2 I/O Modes of the Two-level Storage

Currently, we have implemented synchronous I/O on the prototype two-level storage system. The prototype provides three write modes and three read modes. The three write modes are: (1) data is stored only in Tachyon; (2) data bypasses Tachyon and is written to OrangeFS; and (3) data is synchronously written to OrangeFS when data is created or updated in Tachyon. Figure 2-5 (a-c) shows these three modes respectively. The read modes are similar and shown in Figure 2-5 (d-f): (1) data is read from Tachyon only; (2) data is read from OrangeFS directly without being cached in Tachyon; and (3) data is read partially from Tachyon and partially from OrangeFS. The read mode in Figure 2-5 (f) is the primary usage pattern in data-intensive computing. It improves read performance by caching reusable data and adopting a proper data replacement policy such as LRU.

Figure 2-5 I/O operation modes of two-level storage.

Reading data from remote data nodes, especially from overloaded data nodes, is very expensive. To minimize I/O congestion and contention, we apply a distance-based read policy over local and remote storage. The data access pipeline has two adjustable I/O buffers: one (the upper buffer) between the application and Tachyon, and the other (the lower buffer) between Tachyon and OrangeFS, as illustrated in Figure 2-6. A read I/O request always goes to the next closest available storage device hosting the target data. Since Hadoop schedules computing tasks based on data locality, most computing tasks take step 1 to fetch data from the upper buffer and, if that fails, take step 2 to fetch data from the Tachyon storage. If both attempts fail (the data is not present in Tachyon), the read request goes to the lower level to load the block from the OrangeFS persistent storage layer. We use 1 MB as the upper buffer size and 4 MB as the lower buffer size. These sizes are experimentally determined as they deliver good I/O throughput and latency.


Figure 2-6 Read I/O flow of two-level storage. The numbers indicate the order of distance-based read policy.

2.4 I/O Throughput Analysis

2.4.1 I/O Modeling of Different Storage Systems

Consider an HPC system consisting of n compute nodes and m data nodes. We make the following assumptions to simplify the modeling effort:

• All nodes have identical hardware configurations and are connected to each other via non-blocking switches.

• Workload is uniformly distributed among compute nodes, i.e., each node has the same amount of I/O load.

• The centralized switch and the bisection bandwidth of the network provide a uniform non-blocking backplane throughput Φ, and each node is connected by a full-duplex network interface with bandwidth β.

• There is no network-level interference, such as TCP congestion and Incast/Outcast.

We focus on I/O-bound computational tasks that are evenly distributed on compute nodes without data skew. Hadoop can use four different types of storage: HDFS, OrangeFS, Tachyon, and the two-level storage system. We model the I/O throughput of each compute node with these four types of storage when it reads/writes a fixed size D of data from/to storage. Table 2-2 lists the notations used in the models.

Table 2-2 List of Notations.

Symbol   Meaning
D        The size of data each node processes
n        Number of compute nodes
m        Number of data nodes
f        The ratio of the size of data in Tachyon to the total size of data
Φ        Bandwidth of switch backplane, bisection bandwidth of network (Mbps)
β        Bandwidth of network interface of compute and data nodes (Mbps)
μ        I/O throughput of local hard drives on compute nodes (Mbps)
μ′       I/O throughput of local hard drives on data nodes (Mbps)
ν        I/O throughput of local memory (Mbps)
r, w     Average read and write I/O throughput received on compute nodes (Mbps)

2.4.2 I/O Modeling of HDFS

With HDFS, Hadoop reads from the local hard drives. In this case, the I/O throughput μ of the local hard drive determines the read throughput of each compute node. If the data is not available on the local hard drive, Hadoop reads from other nodes across the network. In this case, the read throughput is determined by the minimum of three factors: β, the bandwidth of the network interface of each node; Φ/n, the shared backplane throughput; and μ, the I/O throughput of the local hard drive. Thus, the read throughput of each node, r_hdfs, is

    r_hdfs = μ                   (local access)
    r_hdfs = min(β, Φ/n, μ)      (remote access)        (3-1)

To maintain data fault tolerance, by default Hadoop synchronously writes one copy of the data to the local hard drive and streams two replicas to two other nodes through the network. Again, the write throughput of each node is given by the minimum of three factors: the bandwidth of the network interface of each node, the shared backplane throughput, and the I/O throughput of the local hard drive. Considering the entire cluster, all nodes write three copies of data to local storage, so the maximum write throughput of each node to its local hard drive is μ/3. Each node also writes two copies of data to the network, so the throughput of the network interface of each node is limited by β/2 and the average throughput of the bisection backplane is bounded by Φ/(2n). Thus, the write throughput of each node, w_hdfs, can be estimated as

    w_hdfs = min(β/2, Φ/(2n), μ/3)        (3-2)
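As a quick check of (3-2), plugging in the per-node averages used in the case study of Section 2.4.6 (β = 1,170 MB/s for the network interface, 116 MB/s for local disk write), and assuming the backplane Φ is not the bottleneck, gives

    w_hdfs = min(1170/2, Φ/(2n), 116/3) = min(585, Φ/(2n), 38.7) ≈ 38.7 MB/s per node,

so with three-way replication the local disk, not the network, limits HDFS write throughput in this configuration.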

2.4.3 I/O Modeling of OrangeFS

With OrangeFS as the storage for Hadoop, n compute nodes read/write data from m data nodes. All read and write traffic must pass through the network. Thus, both the write and read throughput are determined by the throughput of the following four resources: (1) the bandwidth of the network interface of a compute node, β; (2) the shared throughput of the bisection backplane: since all nodes share the bandwidth of the switch backplane, the average throughput received by each compute node is Φ/n, assuming n > m; (3) the shared throughput of the network interfaces of the data nodes: the aggregate network interface throughput of m data nodes is m×β, shared by n compute nodes, so the average network interface throughput of the data nodes that each compute node receives is mβ/n; (4) the shared I/O throughput of the local hard drives on the data nodes: the aggregate local hard drive I/O throughput of m data nodes is m×μ′, shared by n compute nodes, so the average I/O throughput of the local hard drives on the data nodes is mμ′/n. Together, the read throughput, r_ofs, and write throughput, w_ofs, of each compute node are

    r_ofs = w_ofs = min(β, Φ/n, mβ/n, mμ′/n)        (3-3)

2.4.4 I/O Modeling of Tachyon

The architecture of the Tachyon system is similar to that of HDFS except that: (1) it uses DRAM, rather than the local hard drive, to store data; and (2) it uses lineage-based recovery, rather than data replication, to achieve data fault tolerance. Tachyon can potentially improve the write throughput significantly.

With Tachyon as storage, Hadoop reads from the DRAM of each compute node. The read throughput of each compute node is determined by the I/O throughput ν of DRAM. If the data is not available in local DRAM, Hadoop reads from the DRAM of other nodes across the network. In this case, the read throughput is determined by the minimum of three throughputs: the network interface bandwidth, β; the shared backplane bandwidth, Φ/n; and the I/O throughput ν of DRAM. Namely, the read throughput of each node, r_tachyon, is

    r_tachyon = ν                   (local access)
    r_tachyon = min(β, Φ/n, ν)      (remote access)        (3-4)

With Tachyon as storage, Hadoop writes the data to the DRAM of each compute node. The write throughput of each compute node, w_tachyon, is then limited by the throughput to memory:

    w_tachyon = ν        (3-5)


2.4.5 I/O Modeling of the Two-level Storage

Our analysis focuses on the third write mode in Figure 2-5 (c) and the third read mode in Figure 2-5 (f) of the two-level storage system, where Hadoop reads/writes data from/to both Tachyon and OrangeFS. In the third write mode, the data is synchronously written to Tachyon and OrangeFS at the same time. As the write throughput to Tachyon is much higher than that to OrangeFS, the write throughput of each compute node on the two-level storage, w_tls, is bounded by the write throughput to OrangeFS:

    w_tls = min(w_tachyon, w_ofs) = w_ofs        (3-6)

Let f be the ratio of the size of data in Tachyon over the total size of data, D. Then the size of data in Tachyon is f×D and the size of data in OrangeFS is (1−f)×D. Hadoop reads the f×D data from Tachyon with throughput ν (Tachyon in the two-level storage does not read data from other compute nodes) and the (1−f)×D data from OrangeFS with throughput r_ofs. Combined, the read throughput of each compute node is

    r_tls = 1 / (f/ν + (1−f)/r_ofs)        (3-7)

If f = 1, data is read from Tachyon only, and if f = 0, data is read from OrangeFS only. The higher the value of f, the larger the read throughput provided by the two-level system.
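A useful consequence of (3-7) is the limiting aggregate read throughput. If r_ofs is bounded by the data-node side at an aggregate of Φ_pfs, so that each of the n compute nodes receives r_ofs ≈ Φ_pfs/n, then the aggregate two-level read throughput is

    n·r_tls = n / (f/ν + (1−f)·n/Φ_pfs) → Φ_pfs / (1−f)    as n grows,

since the memory term f/ν becomes negligible. In other words, caching a fraction f of the data in Tachyon stretches the parallel file system's read bandwidth by roughly a factor of 1/(1−f): about 1.25× for f = 0.2 and nearly 2× for f = 0.5, consistent with the 25% and 95% gains reported in the next subsection.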

2.4.6 Aggregate I/O Throughput Comparison

The aggregate read/write throughput of HDFS can scale linearly with the number of compute nodes. On the other hand, the aggregate read/write throughput of parallel file systems (such as OrangeFS) and of the two-level storage are bounded by the network bandwidth and the aggregate throughput of the local disks on the data nodes. To understand the aggregate I/O throughput of parallel file systems and the two-level storage compared to that of HDFS, we have done a case study using the average I/O throughput of HPC clusters (Figure 2-2). The network bandwidth is set to 1,170 MB/s per node. The local disk read throughput is 237 MB/s and the local disk write throughput is 116 MB/s. The local memory throughput is 6,267 MB/s. We have tested two parallel file system aggregate throughputs: 10 GB/s and 50 GB/s. We assume that HDFS is deployed on the single hard disk of each HPC compute node. We do not consider the storage capacity that the systems can support, but focus only on throughput.

We examine the aggregate read/write throughput for three different values of f. As shown in Figure 2-7, at 10 GB/s aggregate bandwidth of the parallel file system, HDFS needs 43 nodes to achieve a higher aggregate read bandwidth than that of the parallel file system, and needs 53 nodes (f = 0.2) and 83 nodes (f = 0.5) to achieve higher aggregate read bandwidths than that of the two-level storage. At 50 GB/s aggregate bandwidth of the parallel file system, HDFS needs 211 nodes to have a higher aggregate read bandwidth than the parallel file system has, and needs 262 nodes (f = 0.2) and 414 nodes (f = 0.5) to have higher aggregate read bandwidths than the two-level storage. Our results show that the two-level storage increases the aggregate read bandwidth by about 25% at f = 0.2 (from 10 GB/s to 12.5 GB/s or from 50 GB/s to 62 GB/s) and about 95% at f = 0.5 (from 10 GB/s to 19.6 GB/s or from 50 GB/s to 98 GB/s). Thus, use of the two-level storage can increase the number of compute nodes on which Hadoop can be deployed without sacrificing read performance.


Two.level"(f=0.5,"50GB/s)" HDFS" Two.level"(f=0.2,"50GB/s)" Parallel"FS"(50GB/s)" HDFS" Two.level"(f=0.5,"10GB/s)" Parallel"FS"(10GB/s)" Parallel"FS,"Two.level"(50GB/s)" Two.level"(f=0.2,"10GB/s)" Parallel"FS,"Two.level"(10GB/s)" 120" 120" Read' Write' 100" 100"

80" 80"

60" 60"

40" 40"

20" 20" Aggregate'I/O'Bandwidth'(GB/s)'

0" 0" 1" 64" 128" 192" 256" 320" 384" 448" 1" 192" 384" 576" 768" 960" 1152" 1344" Number'of'compute'nodes' Number'of'compute'nodes' Figure 2-7 Aggregate read throughput (left) and write throughput (right) of HDFS, parallel file system and two-level storage.

At 10 GB/s aggregate bandwidth of the parallel file system, HDFS needs 259 nodes to have a higher aggregate write bandwidth than the parallel file system and the two-level storage have; at 50 GB/s, HDFS needs 1,294 nodes. The write throughput of HDFS is significantly smaller than its read throughput since Hadoop needs to write two copies of the data through the network. Thus, write throughput is usually not the constraint on using Hadoop on HPC with the parallel or two-level storage.
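The crossover points quoted above follow directly from the models in Section 2.4; the sketch below recomputes them from the same per-node averages (237 MB/s disk read, 116 MB/s disk write, 6,267 MB/s memory, 1,170 MB/s network interface), assuming the switch backplane is not the bottleneck. The class is a worked example only, not part of the prototype.

    // Recomputes the HDFS-versus-PFS/two-level crossover node counts quoted above,
    // using the per-node throughputs from Figure 2-2 (MB/s).
    public class CrossoverCalculator {

        static final double DISK_READ = 237, DISK_WRITE = 116, MEMORY = 6267, NIC = 1170;

        // Aggregate HDFS throughput with n nodes (local reads; 3-way replicated writes).
        static double hdfsRead(int n)  { return n * DISK_READ; }
        static double hdfsWrite(int n) { return n * Math.min(NIC / 2, DISK_WRITE / 3); }

        // Aggregate two-level read throughput with n nodes, in-memory hit ratio f,
        // and a parallel file system delivering pfs MB/s in aggregate (Eq. 3-7).
        static double twoLevelRead(int n, double f, double pfs) {
            return n / (f / MEMORY + (1 - f) * n / pfs);
        }

        static int crossover(java.util.function.IntToDoubleFunction hdfs,
                             java.util.function.IntToDoubleFunction other) {
            int n = 1;
            while (hdfs.applyAsDouble(n) <= other.applyAsDouble(n)) n++;
            return n;
        }

        public static void main(String[] args) {
            for (double pfs : new double[]{10_000, 50_000}) {
                System.out.printf("PFS %.0f MB/s: read vs PFS at n=%d, vs TLS(f=0.2) at n=%d,"
                        + " vs TLS(f=0.5) at n=%d, write vs PFS at n=%d%n",
                        pfs,
                        crossover(CrossoverCalculator::hdfsRead, n -> pfs),
                        crossover(CrossoverCalculator::hdfsRead, n -> twoLevelRead(n, 0.2, pfs)),
                        crossover(CrossoverCalculator::hdfsRead, n -> twoLevelRead(n, 0.5, pfs)),
                        crossover(CrossoverCalculator::hdfsWrite, n -> pfs));
            }
            // Prints 43, 53, 83, 259 for 10 GB/s and 211, 262, 414, 1294 for 50 GB/s.
        }
    }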

2.4.7 MapReduce I/O Cost Analysis

The main idea of the MapReduce programming model is to split a large problem into sub-parts, compute partial solutions on the sub-parts independently, and then assemble the partial solutions into the final solution. The standard MapReduce programming model includes three major phases, Map, Shuffle, and Reduce, as shown in Figure 2-8. Here, we analyze the I/O cost of each phase in MapReduce when different storage systems are used.

Figure 2-8 Storage I/O accesses in MapReduce job.

2.4.8 I/O Cost in Map Phase

In the Map phase, the tasks load input data from the storage system and spill intermediate data to local storage. When HDFS or OrangeFS is used, the tasks write the intermediate data to local disk. Let $D_{in}$ denote the input data size, $D_m$ the size of the on-disk intermediate data spilled in the Map phase, $B_r$ the read bandwidth of the underlying storage system, $B_d$ the local disk bandwidth, and $B_{mem}$ the local in-memory bandwidth. The I/O cost of the Map phase when HDFS or OrangeFS is used can then be given as:

$C_{map} = C_{read} + C_{write} = \frac{D_{in}}{B_r} + \frac{D_m}{B_d}$ (3-8)

When Tachyon or the two-level storage is used, the tasks write the intermediate data to the in-memory file system. So the I/O cost of the Map phase when Tachyon or the two-level storage is used can be given as (assuming the in-memory file system can store all intermediate data):

$C_{map} = C_{read} + C_{write} = \frac{D_{in}}{B_r} + \frac{D_m}{B_{mem}}$ (3-9)

2.4.9 I/O Cost in Shuffle Phase

In the Shuffle phase, the tasks pull the Map output from local storage, sort it and send it to the corresponding local storage on the Reduce side. When HDFS or OrangeFS is used, the tasks read the Map output from local disk and write the intermediate data to local disk. We define the size of the materialized intermediate data in the Shuffle phase as $D_s$. Thus, when HDFS or OrangeFS is used, the I/O cost of the Shuffle phase is:

$C_{shuffle} = C_{read} + C_{write} = \frac{D_m}{B_d} + \frac{D_s}{B_d}$ (3-10)

When Tachyon or the two-level storage is used, the tasks read the Map output from local memory and write the intermediate data to local memory. So, when Tachyon or the two-level storage is used, the I/O cost of the Shuffle phase is:

$C_{shuffle} = C_{read} + C_{write} = \frac{D_m}{B_{mem}} + \frac{D_s}{B_{mem}}$ (3-11)

2.4.10 I/O Cost in Reduce Phase

In the Reduce phase, the tasks load the sorted intermediate data from the Shuffle phase from local storage, process it and write the output back to the underlying storage system. When HDFS or OrangeFS is used, the tasks read the Shuffle output from local disk. Let $D_o$ denote the size of the output data generated in the Reduce phase and $B_w$ the write bandwidth of the underlying storage system. Then, when HDFS or OrangeFS is used, the I/O cost of the Reduce phase is:

$C_{reduce} = C_{read} + C_{write} = \frac{D_s}{B_d} + \frac{D_o}{B_w}$ (3-12)

When Tachyon or the two-level storage is used, the tasks read the Shuffle output from local memory. Then, the I/O cost of the Reduce phase is:

$C_{reduce} = C_{read} + C_{write} = \frac{D_s}{B_{mem}} + \frac{D_o}{B_w}$ (3-13)
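The following is a minimal sketch of the phase-level cost model in Eqs. (3-8) to (3-13). The bandwidth and data-size values in the example call are placeholders chosen only to show the intended use, not measurements from this chapter.

# A minimal sketch of the phase-level I/O cost model (Eqs. 3-8 to 3-13).
# Sizes are in GB and bandwidths in GB/s; the numbers in the example call below
# are placeholders, not measurements from this chapter.

def map_cost(d_in, d_m, b_read, b_spill):
    """Read the input (d_in) from the storage system, spill intermediates (d_m)."""
    return d_in / b_read + d_m / b_spill

def shuffle_cost(d_m, d_s, b_local):
    """Re-read the Map output (d_m) and materialize the shuffle data (d_s) locally."""
    return d_m / b_local + d_s / b_local

def reduce_cost(d_s, d_out, b_local, b_write):
    """Read the shuffle output (d_s) locally and write the final output (d_out)."""
    return d_s / b_local + d_out / b_write

def job_cost(d_in, d_m, d_s, d_out, b_read, b_write, b_local):
    """Total I/O cost of one MapReduce cycle under a given storage configuration."""
    return (map_cost(d_in, d_m, b_read, b_local)
            + shuffle_cost(d_m, d_s, b_local)
            + reduce_cost(d_s, d_out, b_local, b_write))

# Example: the same job with disk-backed vs. memory-backed intermediate storage.
disk_backed = job_cost(256, 128, 128, 256, b_read=0.065, b_write=0.2, b_local=0.065)
mem_backed = job_cost(256, 128, 128, 256, b_read=13.2, b_write=0.2, b_local=6.0)
print(f"disk-backed: {disk_backed:.0f} s, memory-backed: {mem_backed:.0f} s")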

2.5 Evaluation

In this section, we evaluate our two-level storage system using three experiments.

We first characterize the I/O throughput behavior of the two-level storage. Then, using the TeraSort benchmark, we compare the CPU, disk and network I/O utilization of each compute node and the disk and network I/O utilization of each data node when Hadoop is deployed on HDFS, OrangeFS and the two-level storage, respectively. Finally, we use K-means to investigate the performance of iterative workflows on the three storage systems.

2.5.1 Experimental Setup

All experiments are performed on the Palmetto HPC cluster hosted at Clemson University. We select nodes with the same hardware configuration (Table 2-3) for our experiments. Each compute node is attached to a single SATA hard disk, and each data node is attached to a 12 TB disk array. Although we cannot control the bandwidth of the switch backplane, it is much higher than the network interface bandwidth and is not a bottleneck resource in our experiments.

Table 2-3 Hardware Configurations of Selected Nodes on Palmetto Cluster.

Component   Specification
CPU         Xeon E5-2670 v2, 20 × 2.50 GHz
HDD         1 TB 7200 RPM SATA
RAID        12 TB LSI Logic MegaRAID SAS
DRAM        128 GB DDR3-1600
Network     Intel 10 Gigabit Ethernet
Switch      Brocade MLXe-32 with 6.4 Tbps backplane

For the first experiment, we use Tachyon's built-in performance evaluation program as the benchmark tool to measure the average read throughput of the two-level storage under a range of data sizes with different skip sizes. In this experiment, we conduct our measurements between one compute node and one data node. We allocate 16 GB of Tachyon storage space on the compute node, and the data node hosts a 12 TB OrangeFS file system. The skip size is defined as the fragment of data skipped per 1 MB access. Since OrangeFS has much higher access latency than Tachyon, a large skip size has a larger impact on the I/O throughput of OrangeFS than on that of Tachyon. The data size is varied from 1 GB to 256 GB. For each data size, we test a range of skip sizes from 0 KB to 64 MB.

For the second and third experiments, we run the TeraSort benchmark and K-means on 256 GB datasets using a 17-node Hadoop cluster with a 2-node OrangeFS back-end storage system. In the Hadoop cluster, one machine is used as the head node to host YARN's ResourceManager (RM) and Tachyon's Master service, and the remaining 16 compute nodes run the MapReduce workloads. On each compute node, we assign 16 containers to occupy 16 CPU slots and leave the remaining 4 CPU slots to handle extra system overhead. Thus, we can run 256 Mappers or Reducers, and the workload achieves full system utilization when CPU utilization reaches 80%.

The capacity of the Tachyon storage on each compute node is 32 GB, so the total capacity of Tachyon is 512 GB. The Tachyon block size is set to 512 MB. Each block is striped into 8 chunks with a stripe size of 64 MB that are evenly distributed across the 2 data nodes in a round-robin fashion. Before each test, we empty the OS page caches to measure actual I/O costs. The concurrent write and read throughput on the local disk of each compute node is about 65 MB/s. The concurrent write throughput on the RAID of each OrangeFS data node is about 200 MB/s, and the read throughput is close to 400 MB/s.

2.5.2 I/O Characterization for Two-level Storage

As illustrated in Figure 2-9, we generate a two-dimensional function of read throughput versus data size and skip size. This function is similar to the memory mountain that characterizes the capabilities of a memory system; we call it the storage mountain of the two-level storage system. The storage mountain reveals the performance characteristics of our prototype two-level storage system. There are two ridges on the storage mountain: the high ridge corresponds to the throughput of Tachyon and the low ridge reflects the throughput of OrangeFS. The I/O buffer size between applications and Tachyon is set to 1 MB (Section 2.3.2).
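The sweep that produces the storage mountain can be scripted as below. This is a sketch only: the actual measurements use Tachyon's built-in benchmark, and modeled_read_throughput is a toy stand-in (with assumed numbers) that merely reproduces the two-ridge shape described in the text.

# A sketch of the data-size x skip-size sweep behind the storage mountain.
# The real measurements come from Tachyon's built-in benchmark; the function
# below is a toy stand-in with assumed numbers that only mimics the two-ridge
# shape (Tachyon ridge vs. OrangeFS ridge) described in the text.
import itertools

DATA_SIZES_GB = [1, 2, 4, 8, 16, 32, 64, 128, 256]
SKIP_SIZES_KB = [0, 1, 4, 16, 64, 256, 1024, 4096, 16384, 65536]  # 0 KB .. 64 MB

def modeled_read_throughput(data_size_gb, skip_size_kb,
                            mem_capacity_gb=16, tachyon_mbs=13_200, ofs_mbs=800):
    """Data that fits in Tachyon reads at memory speed (high ridge); larger data
    spills to OrangeFS (low ridge). Skips beyond the I/O buffer size add a
    penalty that hurts OrangeFS more than Tachyon."""
    in_memory = data_size_gb <= mem_capacity_gb
    base_mbs = tachyon_mbs if in_memory else ofs_mbs
    penalty = 1.0 if skip_size_kb <= 1024 else (0.8 if in_memory else 0.4)
    return base_mbs * penalty

mountain = {(size, skip): modeled_read_throughput(size, skip)
            for size, skip in itertools.product(DATA_SIZES_GB, SKIP_SIZES_KB)}
print(mountain[(8, 0)], mountain[(256, 65536)])   # one point on each ridge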

Figure 2-9 The storage mountain of two-level storage system (read throughput in MB/s versus data size and skip size, showing the Tachyon and OrangeFS ridges and the slopes between them).

There is a sharp slope between the two ridges when the data size is larger than 16 GB, which is the size of the Tachyon storage. If we take a sliding window through the mountain with a fixed skip size, as in Figure 2-10, we can see the impact of the in-memory file system size on the read throughput. For data sizes smaller than 16 GB, the reads come from Tachyon only and the read throughput reaches the peak of about 13.2 GB/s. For data sizes greater than 16 GB, that is, when the data cannot fit in the in-memory space, the read throughput drops sharply to less than 1 GB/s. The farther the data size exceeds 16 GB, the lower the read throughput. For data of size 256 GB, the read throughput is reduced to less than 512 MB/s.

Figure 2-10 The read throughput of two-level storage shows the data locality. The graph shows a slice of Figure 2-9 when the skip size is 0 KB (read throughput in MB/s versus data size from 256 GB down to 1 GB).

Slicing through the storage mountain in the opposite direction gives us insight into the impact of the data access pattern on the read throughput. Figure 2-11 shows the read performance for two fixed data sizes: 16 GB and 32 GB. For skip sizes up to 1 MB, every read request is guaranteed to have at least one hit in either of the two I/O buffers or the Tachyon in-memory space, and the read throughput remains at a high level. Once the skip size reaches the secondary I/O buffer size (4 MB), every read request misses in the I/O buffers and must spend extra cycles fetching the data. As shown in Figure 2-11, large skip sizes cause a more significant degradation of read throughput for the 32 GB data than for the 16 GB data.

Figure 2-11 The impact of read skip size on read throughput (read throughput in MB/s versus skip size from 0 KB to 64 MB, for 16 GB and 32 GB data sizes).

In addition, there are two slopes on both ridges when the skip size is larger than 1 MB. The read throughput also decreases when the data size is small, because extra overheads, such as scheduling cost and data serialization, become noticeable when the I/O cost of small data is low.

The storage mountain shows that the performance of the two-level storage is affected by multiple factors, such as data size and skip size. Since the ridge of Tachyon is much higher than that of OrangeFS, we need to keep frequently used data in Tachyon to achieve better performance.

2.5.3 Performance Evaluation Using TeraSort

In this experiment, we profile detailed performance metrics with the TeraSort benchmark workload. The TeraSort benchmark has three stages: the TeraGen stage generates and writes input data to storage; the TeraSort stage loads the input data, sorts it and writes the output data to storage; and the TeraValidate stage reads and validates the sorted output data. Since the TeraSort stage reads once and writes once and is an I/O-bound task, we use this stage to evaluate the I/O performance of the three storage systems: HDFS, OrangeFS and the two-level storage.

20" 20" 20" Compute Node Compute Node Compute Node 16" 16" 16" 12" 12" 12" Mapper" 8" 8" 8" Reducer" HDFS" OFS" TwoFlevel" 800" (#(Tasks)( 800" Workload( 4" 4" 4" 682" 0" 0" 0" 640" 640" 100" 100" 100" Compute Node Compute Node Compute Node 480" 480" 80" 80" 80" 359" 260" 60" 60" 60" System" 320" 320" 207" 187" 153" (%)( 40" 40" 40" User" 160" 160" 20" 20" 20"

CPU(U

We first run the TeraGen stage using a Map-only job to generate 256 GB of data and store it in the three storage systems: HDFS, OrangeFS and the two-level storage (one copy in Tachyon and one copy in OrangeFS). We then run the TeraSort stage using one MapReduce cycle. The Mapper reads the data from storage and the Reducer writes the sorted data back to storage. We profile the CPU, disk and network I/O utilization of each compute node and the disk and network I/O utilization of each data node (Figure 2-12 (a-e)).

(a-e)).

With HDFS, the Mapper reads from and the Reducer writes to local disks on compute nodes. With OrangeFS, the Mapper reads from and the Reducer writes to OrangeFS on data nodes. With the two-level storage, the Mapper reads from Tachyon (RAM) on compute nodes and the Reducer writes to OrangeFS on data nodes.

Since we can store all data in the Tachyon tier of the two-level storage in our experiments, the Mapper can achieve peak read throughput (the Tachyon ridge in the storage mountain), as shown in Figure 2-12 (f). The Mapper on the two-level storage achieves about 5.4× and 4.2× speedup compared to the Mapper on HDFS and OrangeFS, respectively (Figure 2-12 (f)). The high read throughput even pushes the Mapper to full CPU usage (Figure 2-12 (c)). Keeping part of the data in the Tachyon tier of the two-level storage also reduces network traffic; in our extreme case, there is no network traffic from data nodes for Mappers using the two-level storage (Figure 2-12 (e)).

Writing to OrangeFS through Tachyon can also slightly improve performance compared to writing directly to OrangeFS (Figure 2-12 (b, c, f)). This benefit comes from the unidirectional I/O access from Tachyon to OrangeFS, in which the OS page caches of the data nodes can fully engage in optimizing write loads. As a comparison, the data nodes handle both read and write loads when only OrangeFS is used. The Reducer running time on OrangeFS and the two-level storage is slightly longer than that on HDFS (Figure 2-12 (f)) when we use only two data nodes. However, the write throughput of OrangeFS and the two-level storage can be steadily improved by scaling the data nodes. For example, when a new data node is added to our testing system, roughly an extra 200 MB/s of concurrent write throughput can be achieved. The running time of the TeraSort reduce phase decreases by 1.9× and 4.5× when the number of data nodes increases from 2 to 4 and from 2 to 12, respectively (Figure 2-12 (g)). In all tests, performance is bounded by either the aggregate disk throughput or the CPU FLOPs of the compute nodes, rather than by network bandwidth. As shown in Figure 2-12 (a, b, c, d, e), the network throughput never reaches its limit.

The data used in our current experiments is relatively small and can be completely stored in the Tachyon tier of the two-level storage. If we use larger data that needs to be stored in both the Tachyon and OrangeFS tiers of the two-level storage, the performance of TeraSort using the two-level storage degrades gracefully. However, according to our theoretical analysis (Figure 2-7), we still expect the two-level storage to always provide better performance than OrangeFS. Finally, the two-level storage has the added advantage of delivering higher I/O throughput and larger storage capacity than HDFS when the number of compute nodes is limited.

2.5.4 Performance Evaluation Using K-means

Next, we use K-means to examine the performance of iterative workflows on the three storage systems. The K-means clustering algorithm partitions n data points into K clusters. It has two steps: assignment and update. In the assignment step, the algorithm assigns each data point to the cluster whose centroid is closest to it. In the update step, the algorithm recalculates the centroid of each cluster based on the new assignment. The K-means algorithm iterates these two steps until it converges.

In our experiments, we sample 10 clusters of 27-dimensional data points using HiBench [66]. The total input data size is about 256 GB. The data is initially stored in HDFS and OrangeFS. For the two-level storage, the data is stored in Tachyon after the first iteration. We run the K-means algorithm from Apache Mahout on Hadoop with the three different storage systems for 30 iterations. In each iteration, the Mappers load all input data and centroids and emit the nearest centroid for each data point; the Reducers collect the emitted data, calculate the new centroids, and save them onto disk.
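A single Mahout-style K-means iteration can be summarized as the map and reduce functions below. Plain Python stands in for the actual Hadoop job; the function names and the toy data are illustrative, not the Mahout implementation.

# One MapReduce-style K-means iteration, as described above: Mappers emit
# (nearest-centroid-id, point) pairs and Reducers recompute the centroids.
# Plain Python stands in for the Hadoop job; names and data are illustrative.
from collections import defaultdict

def nearest(point, centroids):
    """Index of the centroid closest (squared Euclidean distance) to point."""
    return min(range(len(centroids)),
               key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])))

def map_phase(points, centroids):
    for point in points:
        yield nearest(point, centroids), point        # (cluster id, data point)

def reduce_phase(pairs, dims):
    """Average the points assigned to each cluster to obtain the new centroids."""
    sums, counts = defaultdict(lambda: [0.0] * dims), defaultdict(int)
    for cluster_id, point in pairs:
        counts[cluster_id] += 1
        for d, value in enumerate(point):
            sums[cluster_id][d] += value
    return {k: [s / counts[k] for s in sums[k]] for k in sums}

# One iteration over a toy 2-D dataset with two initial centroids.
data = [(0.0, 0.1), (0.2, 0.0), (9.8, 10.1), (10.2, 9.9)]
centroids = [(0.0, 0.0), (10.0, 10.0)]
print(reduce_phase(map_phase(data, centroids), dims=2))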

Figure 2-13 Accumulated running time of K-means algorithm with three different storage systems (HDFS, OrangeFS and two-level storage) after 1, 2, 3, 10, 20 and 30 iterations.

Figure 2-13 shows the running times of the first three iterations and the 10th, 20th and 30th iterations, respectively. In the first iteration, the input data is fully loaded from HDFS and OrangeFS directly. Thus, the running times of the first iteration are mainly dominated by the I/O cost. The cost of HDFS is lower than those of OrangeFS and the two-level storage in the first iteration, because the aggregate read bandwidth of HDFS is about 1,040 MB/s (Eq. (1)), which is higher than the aggregate bandwidth (800 MB/s) of OrangeFS. The cost of the two-level storage in the first iteration is slightly higher than that of OrangeFS, which indicates a small overhead from adding the Tachyon storage layer. In subsequent iterations, the I/O throughput of the two-level storage becomes significantly higher since the data is kept in Tachyon. Thus, the running time of K-means on the two-level storage becomes significantly shorter. After 30 iterations, the running time of K-means on the two-level storage is 3.6× and 1.6× shorter than those on OrangeFS and HDFS, respectively. The K-means workload becomes CPU-bound (after the first I/O-bound iteration) as the data is fully cached in memory.

Figure 2-14 Running time on each K-means iteration (seconds per iteration over 30 iterations for HDFS, OrangeFS and two-level storage).

Figure 2-14 shows the running times of the K-means iterations on the three different storage systems. The running times of the K-means iterations on HDFS and OrangeFS are reduced after the first iteration because both HDFS and OrangeFS employ caching to improve performance. The running times of the K-means iterations on OrangeFS are only slightly reduced after the first iteration because the OrangeFS client-side cache is either too small or uses an unmatched cache eviction policy (e.g., LRU), leading to a poor cache hit rate. In contrast, the running times of the K-means iterations on HDFS are reduced significantly due to the large distributed page caches on each compute node. The running times of the K-means iterations on the two-level storage are stable after the first iteration: the capacity of Tachyon in our test is larger than the data size, so all input and intermediate data is kept in Tachyon, providing stable cache hit rates.

The running times of the K-means iterations on HDFS have relatively wide variations for two reasons. First, the K-means iteration workload leads to repeated reads of the input datasets and results in all three replicas of the data blocks being fetched into the page cache. Eventually, the size of the occupied data blocks can be much larger than the size of the page cache allocated on the compute nodes, and a portion of the reusable input datasets is evicted. Second, compared to the two-level storage, the use of the page cache in HDFS does not provide the fine-grained control that the application workload actually needs. The OS-controlled cache size and eviction policy can become problematic when Hadoop runtime tasks or other applications place a heavy I/O load on the local disk during K-means execution. The page cache pollution and interference suffered by HDFS result in an unstable hit rate on the input datasets.

If the in-memory file system is not large enough to hold all data, part of the data has to be stored in the OrangeFS tier of the two-level storage. To understand how the capacity of Tachyon affects performance, we run K-means with only part of the input data stored in Tachyon. Figure 2-15 shows the average running times of K-means with varying amounts of input data kept in Tachyon. Overall, performance is reduced when less input data is kept in Tachyon. However, the running time degrades by only 15.5% when 25% of the input dataset is moved out of Tachyon. This resilience comes from the convergence of aggregate I/O bandwidth across individual compute nodes and the reduction of I/O contention on the underlying parallel file system: local in-memory access to 75% of the input data reduces the load on both the network and OrangeFS by more than 75%. The local in-memory access to input data greatly reduces the probability of network congestion and improves the throughput of concurrent disk access on the remote data nodes. Thus, a higher in-memory hit rate not only increases I/O bandwidth, but also improves compute efficiency when the storage bandwidth gets saturated. Once less than 75% of the data is kept in Tachyon, the average running time increases roughly linearly as the percentage of in-memory data decreases.

Figure 2-15 Average running times of K-means using 256 GB data on 16 machines with varying amounts of data in Tachyon (first iteration running time excluded).

To further investigate the scalability of K-means on the three storage systems, we conduct experiments using different combinations of compute nodes and data nodes. The ratio of the number of compute nodes to the number of data nodes is fixed at 8:1. The size of the input data is increased linearly, from 128 GB for 8 compute nodes and 1 data node to 1,024 GB for 64 compute nodes and 8 data nodes. For each experiment, we run K-means in two stages: an iteration stage of 10 iterations and a write stage that writes the output to the storage system.

Figure 2-16 Weak scalability of three storage systems (running times of the iteration and write stages for HDFS, OrangeFS and two-level storage on 8 to 64 compute nodes and 128 GB to 1,024 GB inputs). CN: compute nodes, DN: data nodes.

Figure 2-16 shows the running times of the two stages for different configurations. For the iteration stage, with the benefit of data locality, the running times of K-means on both HDFS and the two-level storage show no performance degradation. On the other hand, the K-means running times of the iteration stage on OrangeFS gradually increase as the number of compute/data nodes and the input data size increase, which indicates that the overhead of reading data from OrangeFS grows with the number of compute/data nodes. For the write stage, the running times of K-means on both OrangeFS and the two-level storage show no performance degradation. The K-means running time of the write stage on OrangeFS is much higher than that on the two-level storage. This is due to the implementation characteristics of the K-means algorithm in Mahout: the K-means application in Mahout first reads a copy of the input data and then writes the output back to storage. With the two-level storage it reads the data from Tachyon, while it reads the data directly from OrangeFS when only OrangeFS is used. The K-means running time of the write stage on HDFS gradually increases as the number of compute/data nodes and the input data size increase. Since HDFS uses a synchronous pipeline to duplicate two additional replicas of the output data on remote machines, some machines may get overwhelmed and the network may get congested, which lowers the performance of the write stage. This phenomenon becomes more significant as the number of compute nodes increases. Overall, the two-level storage achieves weak scalability on both the iteration and write stages.

2.6 Related Work

There are three major research directions for integrating Hadoop with HPC infrastructure. Previous work has explored directly deploying Hadoop atop existing parallel file systems, such as GPFS [75], [68] and Lustre [76]. These efforts mainly focus on showing the performance enhancement obtained by exploring a suitable mapping between parallel file systems and Hadoop, such as increasing the size of the stripe unit, using different layout distributions, and applying optimal data prefetching. However, the performance of data-intensive workloads is still tightly coupled with the available I/O bandwidth of the parallel file systems.

Instead of using dedicated data servers, some previous studies deploy Hadoop on compute nodes only. Tantisiriroj et al. [27] explore the I/O performance benefit of migrating the data servers to compute nodes with an emulated HDFS-style data layout, replication and consistency semantics. In their experiments, the performance of PVFS (v2.8.2) is very close to, and even higher than, that of HDFS (v0.20.1) on a 51-node OpenCloud cluster when using an optimized I/O buffer size, data mapping and layout. Other studies have deployed parallel file systems, such as Gfarm [77], GlusterFS and QFS [69], on the compute nodes of their production clusters. However, the limited capacity, performance and consistency of local disks in traditional HPC clusters limit the usability of deploying Hadoop on compute nodes.

The third approach deploys Hadoop on data nodes. Xu et al. [78] study the performance enhancement of running MapReduce on the storage servers of an HPC system. This deployment can access data on persistent storage natively. It works for small workloads, but it can have scalability issues when a job mixes CPU-intensive and data-intensive workloads, because data nodes on HPC systems are usually equipped with relatively slow and limited computing units. This is especially true for many CPU-bound data analysis workloads [79].

One project [80] is similar in spirit to ours, but approaches the problem from a different direction. It uses two optimized scheduling techniques (Enhanced Load Balancer and Congestion-Aware Task Dispatching) to improve the I/O performance of local disks, whereas our solution focuses on the integration of two storage systems.

Wang et al. [81] have also utilized memory to increase the I/O performance of a parallel file system. They introduce a dedicated buffer layer deployed at the front end of the HPC data nodes to absorb burst I/O. In our system, we use the memory of the compute nodes as part of the storage.

PortHadoop [82] and Triple-H [83] are the two research efforts closest to the proposed two-level storage for accelerating Hadoop/Spark workloads on HPC clusters. These solutions share the same goal but use different building blocks and interception techniques. However, both of them rely on the HDFS interface and services (the NameNode service, the DataNode service, or both) and thus maintain a namespace isolated from the parallel file system. The two-level storage decouples this dependence through the unified namespace between Tachyon and the parallel file system. For example, if in-memory files have been synchronized to the underlying parallel file system, HPC applications can still access those files transparently using the namespace of the parallel file system. In addition, Tachyon also provides lineage APIs to ensure data fault tolerance, which could significantly improve the write throughput of the two-level storage if further integration is implemented.

Recently, the search engine Baidu reported using Tachyon as a transparent layer for data exchange between the Baidu file system (BFS) hosted in data centers in China and those in its USA research center [84]. Depending on the workload type, the overall improvement was a 30 to 60 times speedup.

User-space file systems are also related to TLS, as they can provide a storage interface between TLS and native applications. Much work has been done in this research area. The I/O part of native code (legacy scientific applications) is typically implemented using the POSIX interface, which is incompatible with the semantics of distributed storage systems such as HDFS and Tachyon. To run in this environment, the applications require middleware that can transparently adapt the POSIX interface to the semantics supported by the distributed storage systems; such middleware is known as an interposition agent. There are many techniques for interposing services between an application and the underlying system, each with particular strengths and weaknesses.

Parrot [85] is a transparent user-level access tool that can be used to attach existing programs to a variety of storage systems such as HDFS, iRODS, Chirp, and FTP. It can be deployed and operated without special privileges or kernel changes, and applications do not need to be rewritten to access remote storage. It works as a proxy agent that intercepts an application's system calls through the Ptrace debugging interface and replaces them with the desired I/O operations.

FUSE (Filesystem in Userspace) is an I/O interface that allows applications to export a file system to the kernel from user space, and it is available in most operating systems, such as Linux, FreeBSD, OpenBSD, NetBSD, OpenSolaris, Android and OS X.

There are many implementations built atop FUSE for different underlying file systems. All of them allow non-privileged users to access their own file systems without editing kernel code or changing applications. Under the FUSE I/O semantics, developers can largely avoid worrying about the compatibility of their file systems with future kernel versions. In particular, jnr-fuse [86] is a FUSE binding for the Java language using Java Native Runtime (JNR) technology, which could be a very useful tool for bridging HPC native applications with Java-based distributed file systems such as HDFS and Tachyon.

Finally, many other projects have been proposed to support the development of user-space file systems. For example, UserFS [87] worked on Linux kernels up to version 2.2. It included a kernel module for registering UserFS through VFS, and a file descriptor is used as the communication interface for all requests to this file system through a user-space library.

2.7 Summary

In this chapter, we develop a prototype of the two-level storage by integrating the in-memory file system, Tachyon, and the parallel file system, OrangeFS. In TLS, Tachyon is deployed on compute nodes and OrangeFS is deployed on data nodes. Tachyon provides the mechanism to exploit temporal locality, so that frequently used data does not need to be retrieved from data nodes over the network. Our theoretical modeling and experimental evaluation show that the current version of the two-level storage can increase read throughput. Since write throughput is usually not a bottleneck for running Hadoop on HPC, the higher read throughput of the two-level storage allows Hadoop to scale with the number of compute nodes.

Although running Hadoop on Tachyon alone can also take advantage of high I/O throughput and data locality, it has two issues. First, the capacity of Tachyon is limited compared to the large storage capacity of the data nodes. Second, Tachyon uses lineage to recover data when there is a fault, and this recovery incurs a recomputation cost. In our two-level storage, local data always has a copy in OrangeFS; thus, OrangeFS provides fault tolerance for Tachyon.

Public HPC clusters are usually shared by a large number of users, and each user is usually allocated a limited number of compute nodes. The two-level storage can provide higher read and write throughput with a limited number of compute nodes. Thus, running Hadoop with the two-level storage may provide a better-performing solution for big data analytics on traditional HPC infrastructures.


Chapter 3: In-Memory Storage Controller

In order to boost the performance of data-intensive computing on HPC systems, in-memory computing frameworks, such as Apache Spark and Flink, use local DRAM for data storage. Optimizing the memory allocation to data storage is critical to delivering performance to traditional HPC compute jobs and throughput to data-intensive applications sharing the HPC resources. Current practices that statically configure in-memory storage may leave inadequate space for compute jobs or lose the opportunity to utilize available space for data-intensive applications. In this chapter, we explore techniques to dynamically adjust in-memory storage and make the right amount of space available to compute jobs. We have developed a dynamic memory controller, DynIMS, which infers the memory demands of compute tasks online and employs a feedback-based control model to adapt the capacity of in-memory storage. We test DynIMS using mixed HPCC and Spark workloads on an HPC cluster. Experimental results show that DynIMS can achieve up to 5× performance improvement compared to systems with static memory allocations.

3.1 Introduction

Data-intensive computing is increasingly challenging the capacity and capability of existing HPC I/O systems that are designed and provisioned for compute-centric workloads. For example, the national science user facilities operated by the Department of Energy (DoE) Office of Science (SC), including particle accelerators, colliders, light sources, telescopes, genomic sequencers and neutron sources, are generating vast amounts of scientific data at unprecedented rates, and the aggregated data will reach multiple exabytes in the next few years [88]. The workload shift from a compute-centric paradigm to a data-centric paradigm indicates a trend towards the convergence of data and computing: HPC storage systems need to be re-architected to accommodate data-centric requirements. However, reconciling these demands is non-trivial. Our study in Chapter 2 demonstrates that the current HPC storage architecture does not scale with an increasing number of compute nodes due to the I/O bottleneck between data and computing.

To alleviate the I/O bottleneck, previous studies have provided three types of solutions at different points along the storage hierarchy. First, direct optimization of parallel file systems, such as data layout, collective I/O and access patterns, has been studied extensively [27], [89], [90]. Second, dedicated I/O forwarding nodes, such as burst buffers [91], are inserted between compute nodes and data nodes to absorb peak I/O requests. Finally, in-memory computing frameworks, such as Apache Spark and Flink, use local DRAM to store data [63], [92]. The last solution is a very promising approach for addressing the I/O bottleneck fundamentally, since its computing architecture is close to the concept of in-situ data processing through data locality. There has been substantial progress on adopting Spark on HPC systems for large-scale scientific data analytics in recent years, due to its highly flexible, generic and scalable programming framework.

Recently, large-scale Spark deployments over traditional HPC systems have been investigated on several national and leadership computing clusters. In particular, the National Energy Research Scientific Computing Center (NERSC) performed a comprehensive evaluation as well as enhancements to accelerate this adoption, including data format support for HDF5 and netCDF [93], -based deployment and writeable cache support [94], matrix factorization benchmarks [95], and scalability measurements on Cray XC systems (Edison and Cori) [96], [97]. However, the additional complexities associated with dynamically managing memory between task execution and data storage, and with transparently staging actual data in and out between parallel file systems and local in-memory storage, still present significant challenges (the RAMDisk size is fixed and not transparent, and thus cannot be globally shared). As a result, a Spark cluster on an HPC system is typically configured with an isolated memory space at the time it is initially started. The amount of memory available for task execution and data storage is one of the key resources that must be specified at setup time. It is sometimes very difficult for a user to determine how much memory a data processing job will require in order to run, not to mention run efficiently and quickly. Some workloads, such as machine learning applications, run fast only with a significant amount of memory that can hold the full computation datasets. The available memory is used for task execution as well as data storage, and the amount of data that a workload processes can also vary over time, increasing or decreasing the amount of memory needed to run well. The amount of memory configured for the big data framework also depends on the size of the machine and the amount of memory available on it. A long-running HPC cluster often includes different types of compute nodes across a range of technology generations with various memory capacities: in a production environment, some nodes have very large amounts of memory, while others have less. Statically configuring such a system with heterogeneous computing resources results in suboptimal performance.

In fact, there is a need for system support and runtime mechanisms to improve system memory utilization and to reduce task-to-task as well as task-to-data memory contention. Current practices that statically configure in-memory storage cannot effectively address the memory conflicts between data-intensive workloads and traditional HPC workloads. Data-intensive computing frameworks, such as Spark, provide an interface for users to specify a DRAM space as the in-memory storage. Though Spark can adjust the partition between execution and storage memory regions within the framework [98], [99], it cannot manage the memory outside of the framework. In a production environment, it is often hard or even impossible to determine a suitable memory allocation between compute jobs and data-intensive workloads. A small in-memory space may not be optimal for data-intensive workloads, while a large in-memory space may leave inadequate space for compute jobs and severely hurt their performance. Furthermore, the memory demand varies significantly between compute jobs and during the execution of a single job.

A promising approach is to dynamically adjust the memory distribution between the in-memory storage and the execution runtime of traditional HPC jobs according to the latter's demand. This approach gives priority to HPC compute workloads and meets their memory demands first; it can then opportunistically allocate the remaining available space to the data-intensive frameworks. In the previous chapter, we integrated a distributed in-memory storage system [72], [100] with a parallel file system to improve the performance of data-intensive jobs. However, the size of the DRAM allocated for the in-memory storage system is fixed, and deploying a separate in-memory store reduces the memory available to compute jobs and hurts the performance of the overall job mix on the system. Building on our prior work, we explore dynamically adjusting the in-memory data storage space at runtime to maintain the performance of HPC compute applications and accelerate the execution of data-intensive workloads.

In this chapter, we present DynIMS, a new dynamic memory controller that manages the capacity of the in-memory storage system on HPC clusters. DynIMS can improve HPC system throughput when there are mixed compute- and data-intensive workloads.

Specifically, our contributions are as follows:

1) We empirically investigate the impact of memory pressure on HPC workload performance.

2) We design a self-adaptive memory controller model, in which we use feedback control for dynamic capacity eviction and allocation of the in-memory storage system.

3) We implement a prototype of DynIMS to control the Alluxio [72], [100] in-memory storage system deployed on the compute nodes of an HPC cluster.

We evaluate DynIMS using mixed HPCC and Spark workloads and show up to 5× performance improvement over static memory allocation.

3.2 Background

We now provide background information that motivates our work, including a brief discussion of the memory usage pattern of HPC applications, the relationship between memory pressure and HPC application performance, and memory management in DynIMS.


3.2.1 Memory Usage Pattern of HPC Applications

To understand the peak memory usage pattern of HPC applications, we run the HPCC benchmark suite, which includes HPC workloads such as HPL, DGEMM, STREAM, PTRANS, RandomAccess, FFT, and a set of tests that measure networking bandwidth and latency. The peak memory usage of the HPCC benchmark is close to 75 GB (Figure 3-1). Thus, if we deploy a big data framework with a static configuration on compute nodes with 120 GB of available memory (with 5 GB reserved for the OS) that also run the HPCC benchmark, we can only give 25 GB to in-memory storage and another 20 GB to the big data runtime and application execution. As shown in Figure 3-1, at least 40 GB of memory is unused during most of the HPCC benchmark's running time. The static configuration of in-memory storage therefore leads to low memory usage most of the time.
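The arithmetic behind the static split above is summarized in the short sketch below; the 75 GB, 20 GB and 5 GB figures come from the text, while the 35 GB off-peak usage is an illustrative assumption used only to show how much space a dynamic scheme could reclaim.

# The static memory split described above, plus the space a dynamic scheme could
# reclaim when the HPC job is below its peak. The peak/runtime/reserved numbers
# come from the text; the 35 GB off-peak usage is an illustrative assumption.
TOTAL_GB, OS_RESERVED_GB = 125, 5
HPC_PEAK_GB, BIGDATA_RUNTIME_GB = 75, 20

usable_gb = TOTAL_GB - OS_RESERVED_GB                                  # 120 GB
static_storage_gb = usable_gb - HPC_PEAK_GB - BIGDATA_RUNTIME_GB       # 25 GB, fixed

hpc_offpeak_gb = 35                                                    # hypothetical
dynamic_storage_gb = usable_gb - hpc_offpeak_gb - BIGDATA_RUNTIME_GB   # 65 GB

print(f"static in-memory storage: {static_storage_gb} GB, "
      f"potential off-peak storage: {dynamic_storage_gb} GB")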

3.2.2 Memory Pressure and HPC Application Performance

Figure 3-1 Memory usage pattern on HPCC benchmark: HPC application (75 GB), big data runtime and application (20 GB), and in-memory storage (25 GB) under a static split, with HPCC memory usage plotted over elapsed time.

To understand the relationship between memory pressure and the performance of HPC applications, we run the High Performance Linpack (HPL) benchmark on a single compute node with 24 CPU cores and 125 GB of memory. The problem size of Linpack varies from 5 GB to 100 GB. While each Linpack instance is running, we use other programs to control the overall system memory utilization and keep it at certain levels. Once memory utilization reaches 100%, we can further request more memory to engage the swap space. In our experiments, we control the utilization of the swap space at 0.5% and 1% of the physical memory. Figure 3-2 plots the measured performance of the HPL benchmark at various system memory utilizations. HPL performance drops sharply as the system memory utilization approaches 100%. This behavior indicates that HPC applications are very sensitive to memory pressure: performance degradation or even application failure will happen if memory pressure is not released in a timely manner. Therefore, a sub-second or even millisecond-level response is required to avoid execution exceptions.

Figure 3-2 Performance impact of system memory pressure (Linpack performance in GFlops for 5 GB, 25 GB and 100 GB problem sizes at various memory and swap utilizations).


3.2.3 Memory Management

Figure 3-3 High-level abstraction of memory usage on a compute node: the OS (N) and buffer space (S) form the reserved memory, the in-memory storage (O) forms the storage memory, and the HPC application (E) and big data runtime/application (B) form the execution memory.

Figure 3-3 illustrates a high-level abstraction of memory usage on a typical compute node with mixed HPC and big data workloads. On a shared compute node, the sizes of E, O and B must be predefined and statically chosen from users' estimations based on their previous knowledge and experience, where E is often estimated from the peak memory usage of the HPC application, and O and B are derived from the residual memory available on the same node. Obviously, a static configuration can only deliver suboptimal performance if the memory demands of the mixed workloads change over time. There are two main limitations. First, in production HPC clusters, the memory required by an HPC application is often over-estimated and is isolated from other applications or services running on the same node; to get the actual peak memory demand, users have to perform detailed performance profiling of each of their applications for different combinations of computation parameters, which involves a lot of extra work and becomes impractical and cumbersome. Second, it is also hard to choose an optimal memory partition ratio between task execution and data storage in a big data application. Although Spark introduced unified memory (borrowing storage memory from execution) in version 1.6.0 [98], the total amount of memory used by the Spark runtime still needs to be predefined. Thus, the memory contention problem still exists between HPC and Spark workloads.

To overcome these limitations, the proposed solution makes the above three sizes (E, O and B) dynamically adjustable at runtime. It reserves only a small portion of system memory for the OS runtime N and the buffer space S, and leaves the rest of the system memory as a shared resource pool to support both task execution and data storage. As a result, the boundaries among E, O and B become fully crossable. The in-memory storage can use the entire free space as needed if the memory utilization is below the required threshold. On the other hand, when the HPC or big data workload gets a burst in execution memory requirements and memory pressure arises, the storage can shrink its size by evicting some of the occupied in-memory data blocks.

3.3 System Design

This section provides an overview of the proposed system, DynIMS. We first describe the system architecture and implementation details, and then introduce the memory control model.

3.3.1 Architecture Overview

Our dynamic memory controller, DynIMS, implements a runtime monitoring scheme and consists of four major building blocks (Figure 3-4):

Figure 3-4 DynIMS system architecture: monitoring agents (collectd), a messaging bus (Kafka), a stream processor (Flink) and a memory controller (Vert.x) in the control cluster drive the in-memory storage systems backed by the parallel file system in the computing cluster.

• Monitoring Agents: they collect the memory usage statistics. We use collectd [101] as our monitoring agents, for which we configure the memory and Kafka plugins to collect and forward the memory metrics. The JSON format is used to encode structural information.

• Stream Processor: it computes the optimized in-memory storage space for each node online and is powered by Apache Flink [40]. The stream processor is implemented as a streaming service and scales across the whole control cluster. It includes a simple and flexible interface to programmatically interact with the aggregated memory metrics.

• Memory Controller: it determines and sends out the memory eviction and allocation instructions. Its implementation is based on the Vert.x [102] framework. We also implement communication adapters and control interfaces between the memory controller and the in-memory storage.

• Messaging Bus: it transports the memory usage metrics and aggregated statistics. Apache Kafka [103], a distributed messaging system, is adopted to build the messaging bus bridging the above three modules.

The design of DynIMS emphasizes generality, modularity, and scalability. DynIMS provides an out-of-the-box solution that supports the majority of in-memory storage systems. The memory controller is driven by a self-adaptive control model to dynamically regulate the in-memory storage capacity. The input of DynIMS is a sequence of real-time memory usage metrics collected by each monitoring agent, and the output of DynIMS is the corresponding memory capacity adjustment instruction for the in-memory storage on each compute node. Monitoring agents (collectd daemons) are distributed to each compute node for cluster-wide memory usage monitoring and forwarding. To enable dynamic memory adjustment of in-memory storage systems at runtime, a control interface is implemented on top of the file systems' APIs through an RPC or REST interface. The implementation of the control interface is often straightforward and only requires a light patch on existing in-memory storage systems. For example, we have implemented two interfaces for existing storage systems to work with DynIMS: 50 lines of code (LoC) for Alluxio and 30 LoC for HDFS.
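To make the data flow concrete, the sketch below consumes per-node memory metrics from the Kafka messaging bus in Python. The production pipeline described above uses collectd, Kafka, Flink and Vert.x; this stand-in is illustrative only, and the topic name, broker address and JSON field names are assumptions rather than the actual DynIMS schema.

# An illustrative sketch of the controller's input path: consuming per-node
# memory metrics from the Kafka messaging bus. The real pipeline is
# collectd -> Kafka -> Flink -> Vert.x; this Python stand-in only shows the
# data flow. The topic name, broker address and JSON field names ("host",
# "mem_used_bytes") are assumptions, not the actual DynIMS schema.
import json
from kafka import KafkaConsumer   # pip install kafka-python

consumer = KafkaConsumer(
    "collectd-memory",                        # assumed metrics topic
    bootstrap_servers="controller:9092",      # assumed broker address
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

latest_usage = {}                             # host -> most recent memory usage
for message in consumer:
    metric = message.value
    latest_usage[metric["host"]] = metric["mem_used_bytes"]
    # A real controller would feed latest_usage into the feedback model of
    # Section 3.3.2 and issue evict/allocate calls to the storage workers.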

The mechanisms and frameworks used to build DynIMS also have deep implications for the memory controller's performance and scalability. DynIMS is carefully architected to reach a sub-second response to accommodate bursts of memory usage. To deliver a low-latency control cycle across the whole computing cluster, each of the four building blocks has a scalable architecture that bounds its processing time. The performance of the messaging bus and stream processor relies on the underlying Kafka and Flink clusters; both frameworks are well proven to deliver throughput on the order of millions to tens of millions of events per second with latencies as low as a few tens of milliseconds [104], [105]. In addition, the memory controller uses the event-driven and non-blocking architecture provided by Vert.x to handle highly concurrent adjustment signals at low cost, and can be scaled to multiple machines using Hazelcast- or JGroups-based clustering techniques. As a result, the integrated prototype is built from stream-processing components from the ground up and runs very efficiently, with low overhead for memory monitoring and model computation. In our tests, the monitoring overhead is between 0.3% and 1.0% utilization of a single CPU core with a 100 ms sampling interval, and the average computation cost on the aggregated monitoring stream received from 4 compute nodes is below 10% utilization of a single CPU core.

3.3.2 Memory Control Model

As discussed in Section 3.2.2, memory utilization has a close correlation with application performance as well as system stability, and it is desirable to maintain the memory pressure below a critical threshold. We apply a closed-loop, feedback-based control model to adjust the storage memory (Figure 3-5). We continuously monitor the memory usage of each compute node. The usage information drives a controller that computes the next optimized size for the in-memory storage, and the controller signals the compute node to adjust the in-memory storage accordingly.

Figure 3-5 Feedback-based control on memory adjustment: the measured output of the compute node is compared with the reference input, and the resulting control error drives the controller's control input.

Let $u_k$ and $m_k$ be the capacity of the in-memory storage and the system memory usage of a compute node with total memory size $M$ during the $k$-th control interval. In addition, $e_k = m_k / M$ denotes the memory utilization ratio of the compute node in the same interval. The memory controller computes the suitable in-memory storage capacity for the next, $(k+1)$-st, interval using the following equation:

$u_{k+1} = u_k - \lambda (m_k - rM)$ (4-1)

where $r$ is the threshold of the memory utilization ratio on the compute node, and the controller integral gain $\lambda$ is a constant parameter that determines the aggressiveness of the tuning of the in-memory storage. Intuitively, when the memory utilization error is negative, $u_{k+1}$ will be larger than $u_k$ to allocate more space to the in-memory store. On the other hand, a positive memory utilization error indicates raised memory pressure on the compute node, and the controller decreases $u_{k+1}$ and triggers an eviction request to meet the memory utilization goal. The detailed implementation of the proposed controller model is the following:

Algorithm 4-1. DynIMS Control Algorithm
1: set M ← total memory size
2: set U_min ← minimum in-memory storage size
3: set U_max ← maximum in-memory storage size
4: set B ← in-memory storage block size
5: set r ← desired memory utilization
6: set Δ ← 0

/* API call to the external in-memory file system */
7: procedure SetInMemorySize(host, u)
8:   if Δ > 0 then
9:     EvictWithLFU(host, u′ − u)   /* sends an eviction signal */
10:  if Δ < 0 then
11:    Allocate(host, u)            /* sends an allocation signal */

12: while true do
13:   for each host ∈ cluster do    /* runs the control model on all compute nodes */
14:     m(k) ⇐ get previous memory usage
15:     Δ = m(k) − rM               /* the deviation of used memory from the target */
16:     if |Δ| ≥ 2B then            /* recomputes the size only if the deviation ≥ 2 blocks */
17:       u(k+1) = u(k) − λΔ        /* infers the next control size (Eq. 4-1) */
18:       u = ⌈u(k+1)/B⌉ · B        /* aligns the control size to the block size */
19:       u = min(max(u, U_min), U_max)   /* clamps the control size to the valid range */
20:       if u ≠ u′ then            /* checks if the control size has changed */
21:         u′ = u
22:         SetInMemorySize(host, u)
23:   end for
24: end while
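The control law in Algorithm 4-1 can be exercised with the small, self-contained simulation below, which mirrors the deviation check, block alignment and clamping steps and shows how the gain λ changes the response (cf. Figure 3-6). The synthetic memory-usage trace is an assumption, not a measured HPCC profile.

# A self-contained simulation of the control law in Algorithm 4-1, illustrating
# how the gain lambda affects stability (cf. Figure 3-6). The synthetic
# memory-usage trace below is an assumption, not a measured HPCC profile.
M_GB, R, BLOCK_GB = 125.0, 0.95, 0.5       # total memory, target ratio, block size
U_MIN, U_MAX = 0.0, 60.0                   # allowed in-memory storage range

def next_storage_size(u_k, mem_used_gb, lam):
    delta = mem_used_gb - R * M_GB         # deviation of used memory from target
    if abs(delta) < 2 * BLOCK_GB:          # ignore small deviations
        return u_k
    u = u_k - lam * delta                  # feedback update (Eq. 4-1)
    u = round(u / BLOCK_GB) * BLOCK_GB     # align to the storage block size
    return min(max(u, U_MIN), U_MAX)       # clamp to the valid range

def simulate(lam, steps=30):
    u, trace = U_MAX, []
    for k in range(steps):
        hpc_used = 100.0 if k < 20 else 40.0   # synthetic burst, then release
        mem_used = hpc_used + u                # execution memory plus storage
        u = next_storage_size(u, mem_used, lam)
        trace.append(u)
    return trace

for lam in (0.1, 0.5, 2.5):
    print(f"lambda={lam}: {simulate(lam)}")

With these assumptions, λ = 0.5 settles near the target within a few intervals, λ = 0.1 responds sluggishly, and λ = 2.5 oscillates, matching the qualitative behavior discussed below for Figure 3-6.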


Figure 3-6 Memory eviction and allocation size for different values of the gain parameter λ = 2.5, 0.1, 0.5 (in-memory storage size in GB versus control sequence number, with the target size shown for reference).

The main challenge here is to choose a suitable gain parameter λ such that the closed-loop system is stable and the measured memory utilization tracks the reference value r as quickly as possible. In our prototype, we have empirically evaluated the stability of the control model using a range of controller integral gains (0 < λ ≤ 2.5) against a fixed memory utilization threshold (r = 95%). Figure 3-6 shows the detailed control behavior when running mixed HPCC and K-means workloads on the cluster, where the allocation of the in-memory storage size is managed by the DynIMS controller. Multiple experiments were run with the gain parameter λ set to different values. In each test, DynIMS starts shrinking the in-memory storage size when a burst of execution memory is raised by the PTRANS benchmark (control sequence 1 to 20) and readjusts the storage memory back to the initial size once the memory pressure is released (control sequence 21 to 30). As shown in Figure 3-6, for λ = 2.5 the response to the change of memory pressure is fast and aggressive, which results in unstable oscillations of the in-memory storage size around the target. For λ = 0.1, the eviction and allocation of the in-memory storage size are sluggish and do not track the target fast enough, leading to either memory contention or low utilization of the shared memory resource. Among the three tested values, λ = 0.5 provides the best tradeoff between stability and fast tracking for the HPCC workloads. Table 3-1 lists the optimized parameters we adopt throughout all our experiments. While this λ value may not be applicable to other workloads or to the same workload deployed on a computing cluster of a different size, one can always find an optimal λ value by recalibrating the model.

Table 3-1 Parameter values of the memory controller.

Parameter                                Value
M (total memory size)                    125 GB
r (desired memory utilization)           0.95
λ (controller integral gain)             0.5
U_min (minimum in-memory storage size)   0 GB
U_max (maximum in-memory storage size)   60 GB
T (control interval)                     100 ms

To ensure that the adjusted in-memory storage size is within a machine-specific range, we require U_min ≤ u_k ≤ U_max, where U_min = 0 and U_max = θM with 0 < θ < 1; θ is a machine-specific parameter. In addition, the control interval T is also a very important parameter that directly affects system performance and stability. To maintain high sensitivity to memory pressure, we set the control interval as small as possible while keeping the monitoring and adjustment overheads within a reasonable range.


3.4 Experimental Evaluation

In this section, we measure the performance of the DynIMS controller and demonstrate the integration of Alluxio with the DynIMS memory controller through a series of Spark applications and problem sizes on our campus HPC cluster.

3.4.1 Experiment Setup

We evaluate DynIMS on the Palmetto HPC cluster at Clemson University. Table 3-1 lists the parameters of DynIMS used in our experiments. We test DynIMS while running Spark applications and the HPCC benchmark simultaneously. Alluxio is co-deployed with Spark using RAMDisk as the storage medium. We apply an LFU eviction policy on Alluxio, backed by the OrangeFS parallel file system, to form a two-level storage system [106].

Table 3-2 Hardware Configurations of Selected Nodes on Palmetto Cluster.

CPU Intel Xeon E5-2680 v3 24×2.50 GHz HDD 1 TB 7200RPM SATA RAID 12 TB LSI Logic MegaRAID SAS RAM 125 GB DDR3-1600 Network Intel 10 Gigabit Ethernet Switch Brocade MLXe-32 with 6.4 Tbps backplane

We select nodes with the same hardware configuration (Table 3-2) for our experiments to get a consistent test environment. Each compute node has a single 1 TB SATA hard disk and a 60 GB RAMDisk, so the maximum capacity of Alluxio cannot exceed 60 GB on a compute node. Each data node is equipped with a 12 TB disk array backed by an 80 GB OS buffer cache. All nodes are connected through a 10 Gigabit Ethernet network. Although we cannot control the bandwidth of the switch backplane, the backplane bandwidth is several orders of magnitude higher than the aggregated network throughput and thus is not a bottleneck resource in our experiments.

We run experiments on five compute nodes and a 2-node OrangeFS storage cluster. We select one compute node as the head/control node to host system services, including the Spark Master, Alluxio Master, Flink Master and Worker, Kafka broker, ZooKeeper server, and the DynIMS controller. We deploy Spark executors, MPI runners, Alluxio Workers, and collectd (as a Kafka producer) on each compute node.

The peak execution memory required by the HPCC workloads is about 75 GB on each compute node, and each Spark executor requires at least 20 GB of execution memory to avoid the extra overhead caused by frequent JVM garbage collection (GC). Therefore, after reserving another 5 GB to prevent memory pressure, only 25 GB of memory is left for data storage on each compute node during the peak memory usage of HPCC. Thus, 25 GB is the memory capacity we can assign to Alluxio with a static configuration.

3.4.2 Microbenchmarks

For the first experiment, we use DFS-Perf [107] as the benchmark tool to measure the aggregated I/O throughput under four different configuration scenarios: 1) direct access to OrangeFS, 2) Alluxio-backed OrangeFS with 25 GB of static in-memory storage space, 3) DynIMS with 60 GB of dynamic in-memory storage space, and 4) Alluxio-backed OrangeFS with 60 GB of static in-memory storage space. To put the memory resources under stress, the HPCC workloads are launched simultaneously when each test starts. In the experiment, we run DFS-Perf in multi-process and multi-thread mode, where each compute node runs 5 DFS-Perf workers with 4 I/O threads each. Before starting the tests, we release the OS buffer cache to reduce caching interference. Each I/O thread writes and reads four 1 GB data files exclusively, resulting in a total 320 GB dataset across the 4 compute nodes.

Figure 3-7 The aggregated throughputs on sequential read (aggregated throughput in GB/s over elapsed time for OrangeFS, Alluxio(25GB), DynIMS(60GB) and Alluxio(60GB)).

Figure 3-8 The aggregated throughputs on random read (aggregated throughput in GB/s over elapsed time for OrangeFS, Alluxio(25GB), DynIMS(60GB) and Alluxio(60GB)).

For all microbenchmarks, DFS-Perf read measurements with sequential and random I/O operations are performed repeatedly against the HPCC workload. The random I/O operation is conducted as a series of access intervals, with both the skip size and the access size defined as 1 MB. Figure 3-7 and Figure 3-8 show the changes in aggregated read throughput over the execution period of the HPCC benchmark. Depending on the configuration of the underlying storage, the received I/O bandwidth gets a significant boost when data is cached in the in-memory storage space. For instance, in the DFS-Perf sequential read test, the average throughput is 7.8× higher when using the statically configured Alluxio (25 GB) compared with direct access to OrangeFS. Even more important, the read throughput achieves up to a 32.1× improvement when dynamic tuning is applied. DynIMS performs satisfactorily when memory contention happens. In the beginning, the read throughput through DynIMS reaches its maximal speed with the datasets fully cached across the local in-memory storage and the remote OS buffer cache. Once memory pressure rises (triggered by the PTRANS workload in the HPCC benchmark), DynIMS keeps dropping locally cached in-memory blocks and the aggregated throughput degrades gradually, while remaining comparable to Alluxio with 25 GB of static in-memory storage space. Later, when the memory pressure is released, DynIMS aggressively recovers the read throughput back to its theoretical maximum through the feedback control.

In the random read test, performance behavior similar to the sequential read is observed. Overall, random access shows about a 25% average degradation compared with sequential access. However, random access over OrangeFS presents a reversed result: an almost doubled I/O throughput. Fundamentally, the improvement comes from a higher cache hit rate in the OS buffer cache. In the random read test, the amount of data actually touched is reduced by about half, so the remote datasets hosted on the data nodes can be fully cached in the OS buffer cache, resulting in better random-access performance.

This indicates that DynIMS not only optimizes memory usage on the compute nodes, but also improves caching efficiency on the data nodes.

3.4.3 Performance of Spark Applications

To obtain a comprehensive study on practical applications, we run Spark workloads with the following four memory configurations (a configuration sketch for the first two follows the list):

• Configuration 1, Spark(45GB): We assign 45 GB of total memory for both execution and storage of Spark on each compute node; the data is read from OrangeFS through Alluxio without caching. This configuration is static and reserves about 25 GB for RDD caching that is immune to eviction by execution (spark.memory.storageFraction = 0.56).

• Configuration 2, Spark(20GB)/Alluxio(25GB): We assign 20 GB of execution memory to Spark and offload the remaining 25 GB to Alluxio; the data is read from OrangeFS and cached in Alluxio. This configuration is also static.

• Configuration 3, Spark(20GB)/DynIMS(60GB): We assign all 60 GB of RAMdisk to Alluxio initially. At runtime, we run DynIMS to adjust the capacity of Alluxio dynamically.

• Configuration 4, Spark(20GB)/Alluxio(60GB): We assign all 60 GB of RAMdisk to Alluxio. Different from Configuration 3, here we disable the PTRANS workload in the HPCC benchmark so that no memory burst occurs. This configuration delivers the upper bound of Spark application performance and serves as the reference for evaluating the efficiency and effectiveness of DynIMS.

Four different Spark applications are evaluated: K-means, logistic regression, linear regression, and support vector machine (SVM).
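For reference, the first two configurations could be expressed with standard Spark properties roughly as sketched below in PySpark. This is an illustration, not the exact submission used in the experiments; the application names are hypothetical, and the mapping from spark.memory.storageFraction to absolute gigabytes depends on spark.memory.fraction and JVM overheads, so the 25 GB figure is approximate.

```python
from pyspark.sql import SparkSession

# Configuration 1, Spark(45GB): a single unified 45 GB pool per node for
# execution and storage, with ~56% of the unified region protected for RDD
# caching (roughly 25 GB immune to eviction by execution).
cfg1_builder = (
    SparkSession.builder
    .appName("cfg1-spark-45g")                 # hypothetical name
    .config("spark.executor.memory", "45g")
    .config("spark.memory.storageFraction", "0.56")
)

# Configuration 2, Spark(20GB)/Alluxio(25GB): Spark keeps only 20 GB for
# execution; input data is cached off-heap by Alluxio rather than in Spark's
# storage pool, so no RDD cache space is reserved inside the executor.
cfg2_builder = (
    SparkSession.builder
    .appName("cfg2-spark-20g-alluxio-25g")     # hypothetical name
    .config("spark.executor.memory", "20g")
)

# In practice each configuration is a separate job submission; calling
# cfg1_builder.getOrCreate() (or cfg2_builder.getOrCreate()) would start the
# corresponding session.
```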

[Figure: running time (mins, 0–400) of K-means, linear regression, logistic regression, and SVM under Spark(45GB), Spark(20GB)/Alluxio(25GB), Spark(20GB)/DynIMS(60GB), and Spark(20GB)/Alluxio(60GB).]

Figure 3-9 Running time of different machine learning applications (320 GB datasets) using different memory configurations.

We run each experiment with 10 iterations against the 320 GB datasets stored in Hadoop SequenceFile format. For each experiment, we start the HPCC and Spark workloads together once the input datasets have been generated. Figure 3-9 shows the experimental results. With dynamic memory adjustment using DynIMS, the Spark applications run 5.1× and 3.8× faster than with the two static configurations, and their performance is comparable to the reference upper bound. The speedups come from a better hit rate on the input datasets: through dynamic memory adjustment, DynIMS sustains an in-memory hit ratio of up to 75% on the compute nodes during most of the execution period. Moreover, the high in-memory hit ratio further increases the efficiency of the OS buffer cache on the data nodes by migrating hot datasets from data nodes to compute nodes, which reduces the on-disk access overhead on the data nodes. As a comparison, statically configured Alluxio can reach at most a 31% in-memory hit ratio on the compute nodes and has to read at least 69% (220 GB) of the dataset from the remote data nodes. Because the data nodes have only 160 GB of aggregate memory, the remote data cannot fit into their OS buffer caches, and the Spark workloads therefore experience significant I/O degradation [108], [109]. Lastly, the running time when caching a portion of the input dataset (100 GB) in the Spark RDD cache is 1.3× slower than when keeping it in Alluxio. This is because the deserialized SequenceFile is often larger than its original size and needs more caching space; reducing the amount of data that can be cached in the Spark RDD cache leads to a poor cache hit rate.
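The static-configuration hit ratio quoted above is consistent with the capacity numbers: with 25 GB of in-memory storage on each of the 4 compute nodes,

\[
\frac{4 \times 25\ \text{GB}}{320\ \text{GB}} = \frac{100\ \text{GB}}{320\ \text{GB}} \approx 31\%,
\]

leaving roughly 220 GB to be fetched from the remote data nodes.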

3.4.4 Impact of Insufficient Storage Memory

If the capacity of in-memory storage on the compute nodes is not large enough to hold all data, part of the data needs to be stored in the remote OS buffer cache or on disk. In our next experiment, we scale the input data size from 80 GB to 400 GB for the K-means application and run it with the four memory configurations.


[Figure: K-means running time (mins, 0–400) for problem sizes of 80 GB, 160 GB, 240 GB, 320 GB, and 400 GB under Spark(45GB), Spark(20GB)/Alluxio(25GB), Spark(20GB)/DynIMS(60GB), and Spark(20GB)/Alluxio(60GB).]

Figure 3-10 K-means application with different problem sizes using different memory configurations.

[Figure: per-iteration running time (min, 0–32) versus job ID (0–11) under Spark(45GB), Spark(20GB)/Alluxio(25GB), Spark(20GB)/DynIMS(60GB), and Spark(20GB)/Alluxio(60GB).]

Figure 3-11 Iteration time on K-means workload (320 GB datasets) using different memory configurations.

As shown in Figure 3-10, with DynIMS the K-means running time increases much more slowly than it does with the static configurations. The K-means performance with the static configurations using OrangeFS and Alluxio starts to degrade significantly when the problem size reaches 160 GB and 240 GB, respectively. Therefore, DynIMS not only improves the in-memory hit ratio, but also increases compute efficiency and scalability as the problem size scales up.

In the HPCC workloads, there is a burst of memory usage in the PTRANS benchmark, which introduces about 40 GB of memory contention between execution and storage.

DynIMS starts dropping cached in-memory blocks when memory contention is detected and rebalances the Alluxio in-memory storage space based on the amount of free memory. Figure 3-11 depicts the detailed iteration times for the 320 GB dataset under the different memory configurations. During the memory burst, the running times of K-means iterations (iterations 1, 2, and 3) using DynIMS gradually increase toward those of the statically configured Alluxio (25 GB). After the memory burst disappears, the running time of K-means iterations using DynIMS recovers back to its upper bound. This demonstrates that DynIMS is able to maximize system throughput once memory pressure is released.

3.4.5 Performance and Scalability of DynIMS Controller

DynIMS is designed to deliver sub-second response for small to moderate clusters. To maintain a low-latency control cycle and high processing capacity, each component of DynIMS can scale to multiple cores and multiple nodes as the cluster size grows. To systematically study the monitoring and control overhead on a large computing cluster, we run the DynIMS control cluster on a single node with 24 CPU cores and scale the size of the computing cluster against a range of control intervals to measure the aggregate overhead.


[Figure: CPU-core utilization of the control cluster (log scale, 1–256 cores) versus control interval (400, 200, 100, 50, and 25 ms) for 4, 16, 64, and 256 compute nodes.]

Figure 3-12 The CPU utilization of the controller cluster over different cluster sizes.

With a cluster of 256 compute nodes, we observed that the network bandwidth and memory usage of the control node are at the MB and GB level, respectively, which is below the capacity of a typical server with a moderate hardware configuration. However, the computation cost grows steadily and shows a linear relation with the size of the computing cluster. Figure 3-12 illustrates the quantitative CPU overhead of the control cluster against different numbers of compute nodes for a range of control intervals. The overhead includes the costs of ingesting monitoring data on the messaging bus, computing memory eviction/allocation decisions in the stream processor, and dispatching signals from the memory controller. The most noticeable overhead is incurred when scaling the cluster to 256 nodes with a 25 ms interval, which requires 17.3 CPU cores to process the aggregated messages. In a practical system, DynIMS reaches an optimized performance with a 100 ms interval; thus, the actual overhead for managing 256 nodes can be reduced to 4.8 CPU cores. Because of restrictions on the total available cluster resources, we could not scale the cluster beyond 256 nodes. However, the highly linear relation shown in Figure 3-12 suggests that a single control server with 24 CPU cores is able to manage up to 1280 compute nodes, which is a reasonable setup for a production HPC cluster.
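Under that linear-scaling assumption, the extrapolation is

\[
256\ \text{nodes} \times \frac{24\ \text{cores}}{4.8\ \text{cores}} = 1280\ \text{nodes}.
\]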

3.4.6 Stability and Responsiveness of DynIMS Control Model

As shown in Figure 3-1, the HPCC workloads exhibit a dynamic demand on memory resources within a small burst region. We expect DynIMS to detect and adapt to such memory bursts in real time through its feedback-based control model. The control model must be properly designed and its parameters correctly selected; otherwise, DynIMS may become overly sensitive to small disturbances in the measurements, resulting in large oscillations when adjusting the capacity of the in-memory storage.

[Figure: execution memory, free memory, and storage memory (GB, 0–125) versus elapsed time (s, 0–1680).]

Figure 3-13 The system memory statistics of K-means (320 GB dataset) and HPCC workloads during the peak memory demand.

To understand the stability and responsiveness of the control model in DynIMS, we have examined the system memory statistics of a mixed HPCC and K-means workload.


Figure 3-13 shows the statistics of execution memory, free memory, storage memory, and control memory on the compute nodes with DynIMS. Alluxio starts with a capacity of 60 GB and then adaptively shrinks its capacity to keep memory usage below the predefined threshold (95%) when there is a memory burst in the HPCC workloads. After the memory burst disappears, Alluxio gradually recovers its capacity back to its initial size. The low variance of the in-memory storage capacity indicates that the proposed control model has good stability, while the closely correlated sizes of execution memory and storage memory show the fast response of DynIMS with the selected control parameters. A minimal sketch of such a feedback loop is given below.
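For illustration only, the following is a minimal sketch of a threshold-based feedback loop of the kind described above. It is not the DynIMS implementation; the memory-reading and capacity-setting callbacks, the per-node memory size, and the recovery rate are assumptions.

```python
import time

TOTAL_MEM_GB = 125.0   # assumed usable memory per compute node
THRESHOLD = 0.95       # keep overall memory usage below 95%
MAX_STORAGE_GB = 60.0  # initial in-memory storage capacity (RAMdisk size)
INTERVAL_S = 0.1       # 100 ms control interval

def control_loop(read_non_storage_usage_gb, set_storage_capacity_gb):
    """Periodically resize the in-memory storage tier from observed memory usage.

    read_non_storage_usage_gb(): hypothetical callback returning execution plus
        OS memory usage (GB) collected from the monitoring bus.
    set_storage_capacity_gb(gb): hypothetical callback that applies the new
        capacity, triggering block eviction when the capacity shrinks.
    """
    capacity = MAX_STORAGE_GB
    while True:
        used = read_non_storage_usage_gb()
        # Storage budget that keeps total usage below the threshold.
        budget = THRESHOLD * TOTAL_MEM_GB - used
        target = max(0.0, min(MAX_STORAGE_GB, budget))
        if target < capacity:
            capacity = target                       # shrink quickly under pressure
        else:
            capacity += 0.25 * (target - capacity)  # recover gradually afterwards
        set_storage_capacity_gb(capacity)
        time.sleep(INTERVAL_S)
```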

3.5 Related Work

Recently, in-memory computing has become a key approach to reducing the cost of on-disk access for data-intensive workloads. NERSC performed a comprehensive evaluation, as well as enhancements, to scale Spark [38] on traditional HPC systems [93]–[97]. Early versions of Spark manage the memory space statically, with isolated execution and storage memory. Since Spark v1.6.0, a unified memory management scheme [98] eliminates the boundary between execution and storage. Xu et al. [99] enhanced this concept by adding DAG-dependency-based eviction for the RDD cache. However, the total amount of system memory used by Spark must still be statically determined when configuring the cluster and cannot change at runtime; how to release system memory dynamically from the Spark runtime remains an open question. DynIMS aims to build a more general framework that supports both HPC and big data workloads and provides an out-of-the-box solution for executing mixed workloads on HPC systems.

In addition, a few other research projects relate directly to in-memory storage systems. Pu et al. [110] studied and analyzed strategies for fair allocation in multi-user shared-memory systems. Uta et al. [111] demonstrated performance improvements by dynamically scaling the cluster size of MemFS in and out based on application memory demand. Jeong et al. [112] proposed a set of system primitives (APIs) that enable dynamic adjustment of allocated memory resources.

Caching optimizations for parallel I/O systems are not new and have been widely explored. Panache [113] adds a scalable caching layer atop GPFS that can continuously cache data hosted in a remote storage cluster. In modern HPC file systems, dedicated I/O nodes are often used for data buffering and I/O forwarding [91]. DynIMS provides an architecture complementary to these techniques, and further integration could be considered to boost I/O system performance.

Chen et al. [114] proposed an algorithm-level feedback-controlled adaptive (AFA) prefetcher to improve the flexibility and efficiency of data prefetching rather than data caching. AFA can dynamically determine an appropriate prefetching algorithm at runtime for different access patterns using a data-access history cache (DAHC) [115]. This work is orthogonal and complementary to ours; DynIMS could adopt a similar strategy to select an optimized eviction algorithm adaptively.

Finally, much work has been done on dynamic memory management for virtualization environments, such as memory-mapped I/O [116], ballooning [117], [118], hotplug [119], and page sharing. While these previous works share an optimization goal very similar to that of DynIMS, the runtime environment and design requirements are distinct. For example, virtual machines are often long-lived services gauged by service-level objectives (SLOs) on shared virtualized infrastructure. In contrast, our work deals with task-level memory sharing among heterogeneous applications and requires much lower response times at a high degree of concurrency.

3.6 Summary

In this chapter, we design, implement, and evaluate a dynamic memory controller, DynIMS, for in-memory storage systems to accelerate mixed HPC and Spark workloads on HPC systems. DynIMS detects memory contention between task execution and data storage in real time, adaptively determines an optimized in-memory storage capacity with its feedback system, and enables fine-grained control over memory allocation and eviction. This improves the performance of Spark on HPC systems. With a storage memory that is either too small or too large, statically configured Spark can lead to low resource sharing or deprive other execution tasks of the memory they need to compute efficiently. Performance evaluation of DynIMS shows up to 5× improvement on mixed HPCC and Spark workloads across a range of problem sizes compared with the static configurations. Finally, self-tuning of cluster resources largely avoids execution exceptions caused by insufficient memory and significantly increases the operability of publicly shared HPC clusters.


Chapter 4: Conclusion and Future Work

In this dissertation, we focus on the following two challenges: 1) how can we accelerate data-intensive workloads on computing clusters while preserving existing HPC resources, and 2) how can we maximize system performance and throughput for both HPC and big data jobs without compromising cluster efficiency? The research advances proposed in this dissertation include three building blocks with a reference architecture to answer these questions. In Chapter 2, we implemented a two-level storage system that allows high-performance data processing on HPC clusters, transparently using the parallel file system at memory speed. In Chapter 3, we applied DynIMS at the runtime of the in-memory storage system to allow dynamic memory tuning between task execution and data storage for mixed HPC and big data workloads. Finally, outside of this dissertation, we further demonstrated the efficiency and effectiveness of the proposed architecture by implementing several large-scale genomics analysis and stochastic simulation pipelines; these side projects resulted in several high-impact publications across different research domains.

Overall, this dissertation develops a systematic understanding of, and a complementary solution for, an important class of large-scale data-centric computing on traditional HPC systems, using both theoretical and experimental approaches. Our research lays a solid foundation for future work in designing and building a converged computing infrastructure that can efficiently support both compute- and data-intensive workloads. Eventually, this broader vision, combined with emerging technologies, should lead to the next innovation in the capabilities of HPC clusters.

Many potential research directions can be explored in the future. The main areas for extension are:

• Data-intensive scientific applications: The data-intensive applications developed in this dissertation show a very promising approach to scaling native scientific code for large-scale scientific discovery. Beyond stochastic simulation and genomic analysis, computational workflows in other research domains often have similar data-processing patterns and thus can be assembled with the same approach and deployed on the same framework. We plan to build a series of genomics analysis pipelines for processing next-generation sequencing (NGS) datasets for different scientific research domains.

• Parallel file system optimization: Although the two-level storage system, TLS, has reached a production version in its initial integration with the OrangeFS file system, there is still considerable room to further improve performance and usability. In the current integration, the optimized configuration is hard-coded, focusing on the data-layout mapping and the buffer size between the upper in-memory storage system and the underlying parallel file system. To accommodate different I/O access patterns, we deem it interesting to further study dynamic configuration for supporting different data storage schemas, including columnar storage formats (Parquet and RCFile), data serialization systems (Avro and SequenceFile), in-memory columnar formats (Arrow), and so on. In future work, we will continue to enhance the integration between HPC parallel file systems and emerging data systems introduced in the big data ecosystem.

• Dynamic tuning of the OS buffer cache: Originally, the dynamic memory controller, DynIMS, was designed to explicitly adjust the in-memory storage size at runtime. The management of the OS buffer caches on the compute nodes and data nodes is fully controlled by the Linux kernel, in which the buffer memory remains isolated from the in-memory storage space. How to orchestrate the two-level cache hierarchy and leverage both compute nodes and data nodes to cache the same dataset is still an open question. We are actively exploring new caching policies and eviction techniques that can be tightly integrated with the Linux kernel. In the future, a more sophisticated controller will be built to maximize cache efficiency through write hints, transparent informed prefetching, and heuristic feedback control.

We hope that continued research with the fast-moving technologies in this area will help address current challenges and lead to next-generation HPC systems for both compute-intensive and data-intensive applications.


References

[1] D. J. Becker, T. Sterling, D. Savarese, J. E. Dorband, U. A. Ranawak, and C. V. Packer, “BEOWULF: A parallel workstation for scientific computation,” in Proceedings, International Conference on Parallel Processing, 1995, vol. 95.

[2] G. Bell and J. Gray, “What’s next in high-performance computing?,” Communications of the ACM, vol. 45, no. 2, pp. 91–95, 2002.

[3] “Development over Time | TOP500 Supercomputer Sites.” [Online]. Available: http://www.top500.org/statistics/overtime/.

[4] M. J. Brim, T. G. Mattson, and S. L. Scott, “OSCAR: open source cluster application resources,” in Ottawa Linux Symposium, 2001.

[5] “Rocks Project.” [Online]. Available: http://www.rocksclusters.org. [Accessed: 12-Nov-2016].

[6] “Warewulf Project.” [Online]. Available: http://warewulf.lbl.gov. [Accessed: 12-Nov-2016].

[7] M. Wilde, M. Hategan, J. M. Wozniak, B. Clifford, D. S. Katz, and I. Foster, “Swift: A language for distributed parallel scripting,” Parallel Computing, vol. 37, no. 9, pp. 633–652, 2011.

[8] B. Ludäscher et al., “Scientific workflow management and the Kepler system,” Concurrency and Computation: Practice and Experience, vol. 18, no. 10, pp. 1039– 1065, 2006.

[9] T. Oinn et al., “Taverna: a tool for the composition and enactment of bioinformatics workflows,” Bioinformatics, vol. 20, no. 17, pp. 3045–3054, Nov. 2004.

[10] Y. Gil et al., “Wings: Intelligent workflow-based design of computational experiments,” Intelligent Systems, IEEE, vol. 26, no. 1, pp. 62–72, 2011.

[11] D. Talia, “Workflow Systems for Science: Concepts and Tools,” ISRN Software Engineering, vol. 2013, pp. 1–15, 2013.

[12] E. Deelman, D. Gannon, M. Shields, and I. Taylor, “Workflows and e-Science: An overview of workflow system features and capabilities,” Future Generation Computer Systems, vol. 25, no. 5, pp. 528–540, May 2009.

[13] “PBS Professional Open Source Project.” [Online]. Available: http://www.openpbs.org.


[14] A. B. Yoo, M. A. Jette, and M. Grondona, “SLURM: Simple Linux Utility for Resource Management,” in Job Scheduling Strategies for Parallel Processing, D. Feitelson, L. Rudolph, and U. Schwiegelshohn, Eds. Springer Berlin Heidelberg, 2003, pp. 44–60.

[15] Adaptive Computing, “Torque Resource Manager.” [Online]. Available: http://www.adaptivecomputing.com, 2012.

[16] W. Gentzsch, “Sun Grid Engine: towards creating a compute power grid,” in First IEEE/ACM International Symposium on Cluster Computing and the Grid, 2001. Proceedings, 2001, pp. 35–36.

[17] “Message Passing Interface.” [Online]. Available: http://www.mcs.anl.gov/research/projects/mpi/. [Accessed: 13-Nov-2016].

[18] L. Dagum and R. Menon, “OpenMP: an industry standard API for shared-memory programming,” IEEE Computational Science and Engineering, vol. 5, no. 1, pp. 46– 55, Jan. 1998.

[19] J. E. Stone, D. Gohara, and G. Shi, “OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems,” Comput Sci Eng, vol. 12, no. 3, pp. 66–72, May 2010.

[20] P. Beckman et al., “Exascale operating systems and runtime software report,” US Department of Energy, Technical Report, December, 2012.

[21] M. Moore et al., “OrangeFS: Advancing PVFS,” FAST poster session, 2011.

[22] “Lustre.” [Online]. Available: http://lustre.org/. [Accessed: 13-Nov-2016].

[23] S. Jha, M. Cole, D. S. Katz, M. Parashar, O. Rana, and J. Weissman, “Distributed computing practice for large-scale science and engineering applications,” Concurrency and Computation: Practice and Experience, pp. 1559–1585, 2012.

[24] A. J. Hey, S. Tansley, and K. M. Tolle, “The fourth paradigm: data-intensive scientific discovery,” 2009.

[25] K. Shvachko, H. Kuang, S. Radia, and R. Chansler, “The Hadoop Distributed File System,” in 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), 2010, pp. 1–10.

[26] N. S. Islam et al., “High performance RDMA-based design of HDFS over InfiniBand,” in Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, Los Alamitos, CA, USA, 2012, p. 35:1–35:35.


[27] W. Tantisiriroj, S. W. Son, S. Patil, S. J. Lang, G. Gibson, and R. B. Ross, “On the duality of data-intensive file system design: reconciling HDFS and PVFS,” in Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, New York, NY, USA, 2011, p. 67:1–67:12.

[28] L. Qiao et al., “On brewing fresh espresso: LinkedIn’s distributed data serving platform,” in Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, New York, NY, USA, 2013, pp. 1135–1146.

[29] Vinod Kumar Vavilapalli et al., “Apache Hadoop YARN: Yet Another Resource Negotiator,” in Proceedings of the Fourth ACM Symposium on Cloud Computing, Santa Clara, CA, USA, 2013.

[30] B. Hindman et al., “Mesos: A platform for fine-grained resource sharing in the data center,” in Proceedings of the 8th USENIX conference on Networked systems design and implementation, 2011, pp. 22–22.

[31] K. Gopalakrishna et al., “Untangling cluster management with Helix,” in Proceedings of the Third ACM Symposium on Cloud Computing, New York, NY, USA, 2012, p. 19:1–19:13.

[32] “Facebook Hadoop Corona.” [Online]. Available: https://github.com/facebook/hadoop-20/tree/master/src/contrib/corona.

[33] K. Ousterhout, P. Wendell, M. Zaharia, and I. Stoica, “Sparrow: distributed, low latency scheduling,” in Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, 2013, pp. 69–84.

[34] J. Dean and S. Ghemawat, “MapReduce: simplified data processing on large clusters,” Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.

[35] “Apache Tez Project.” [Online]. Available: http://incubator.apache.org/projects/tez.html.

[36] B.-G. Chun et al., “REEF: Retainable Evaluator Execution Framework,” in International Conference on Very Large Data Bases, 2013.

[37] M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly, “Dryad: distributed data-parallel programs from sequential building blocks,” in Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007, New York, NY, USA, 2007, pp. 59–72.

[38] M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica, “Spark: cluster computing with working sets,” in Proceedings of the 2nd USENIX conference on Hot topics in cloud computing, 2010, pp. 10–10.

[39] “Storm.” [Online]. Available: http://storm-project.net/.

[40] “Apache Flink: Scalable Batch and Stream Data Processing.” [Online]. Available: http://flink.apache.org/.

[41] “Apache HBase.” [Online]. Available: http://hbase.apache.org/.

[42] “MongoDB.” [Online]. Available: http://www.mongodb.org/.

[43] “Apache Cassandra.” [Online]. Available: http://cassandra.apache.org/.

[44] “NoSQL,” Wikipedia, the free encyclopedia. 18-Oct-2013.

[45] “Apache Giraph.” [Online]. Available: http://giraph.apache.org/.

[46] R. S. Xin, J. E. Gonzalez, M. J. Franklin, and I. Stoica, “GraphX: A Resilient Distributed Graph System on Spark,” in First International Workshop on Graph Data Management Experiences and Systems, New York, NY, USA, 2013, p. 2:1–2:6.

[47] S. Owen, R. Anil, T. Dunning, and E. Friedman, Mahout in Action. Greenwich, CT, USA: Manning Publications Co., 2011.

[48] T. Kraska, A. Talwalkar, J. C. Duchi, R. Griffith, M. J. Franklin, and M. I. Jordan, “MLbase: A Distributed Machine-learning System.,” in CIDR, 2013.

[49] “The Hadoop Ecosystem Table,” 2014. [Online]. Available: http://hadoopecosystemtable.github.io/.

[50] S. Jha, J. Qiu, A. Luckow, P. Mantha, and G. C. Fox, “A Tale of Two Data-Intensive Paradigms: Applications, Abstractions, and Architectures,” arXiv:1403.1528 [cs], Mar. 2014.

[51] “Kaleidoscope of (Apache) Big Data Stack (ABDS) and HPC Technologies.” [Online]. Available: http://hpc-abds.org/.

[52] “Apache Hadoop Project.” [Online]. Available: http://hadoop.apache.org/. [Accessed: 19-Nov-2016].

[53] S. Krishnan, M. Tatineni, and C. Baru, “myHadoop-Hadoop-on-Demand on Traditional HPC Resources,” San Diego Supercomputer Center Technical Report TR-2011-2, University of California, San Diego, 2011.

[54] “Magpie.” [Online]. Available: https://github.com/chu11/magpie.

[55] V. R. Basili et al., “Understanding the High-Performance-Computing Community: A Software Engineer’s Perspective,” IEEE Software, vol. 25, no. 4, pp. 29–36, 2008.

[56] A. R. Brodtkorb, C. Dyken, T. R. Hagen, J. M. Hjelmervik, and O. O. Storaasli, “State-of-the-art in heterogeneous computing,” Scientific Programming, vol. 18, no. 1, pp. 1–33, Jan. 2010.

[57] S. H. Fuller and L. I. Millett, “Computing Performance: Game Over or Next Level?,” Computer, vol. 44, no. 1, pp. 31–38, 2011.

[58] “POSIX,” Standards. IEEE. [Online]. Available: http://standards.ieee.org/develop/wg/POSIX.html.

[59] D. A. Reed and J. Dongarra, “Exascale Computing and Big Data,” Commun. ACM, vol. 58, no. 7, pp. 56–68, Jun. 2015.

[60] H. V. Jagadish et al., “Big Data and Its Technical Challenges,” Commun. ACM, vol. 57, no. 7, pp. 86–94, Jul. 2014.

[61] R. T. Kouzes, G. A. Anderson, S. T. Elbert, I. Gorton, and D. K. Gracio, “The Changing Paradigm of Data-Intensive Computing,” Computer, vol. 42, no. 1, pp. 26– 34, Jan. 2009.

[62] J. Manyika et al., “Big data: The next frontier for innovation, competition, and productivity,” 2011.

[63] M. Zaharia et al., “Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing,” in Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation, 2012, pp. 2–2.

[64] S. Ewen, “Programming abstractions, compilation, and execution techniques for massively parallel data analysis,” Berlin, Technische Universität Berlin, Diss., 2014, 2015.

[65] C. Engle et al., “Shark: Fast Data Analysis Using Coarse-grained Distributed Memory,” in Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, New York, NY, USA, 2012, pp. 689–692.

[66] S. Huang, J. Huang, J. Dai, T. Xie, and B. Huang, “The HiBench benchmark suite: Characterization of the MapReduce-based data analysis,” in 2010 IEEE 26th International Conference on Data Engineering Workshops (ICDEW), 2010, pp. 41– 51.

[67] P. Xuan, Y. Zheng, S. Sarupria, and A. Apon, “SciFlow: A dataflow-driven model architecture for scientific computing using Hadoop,” in 2013 IEEE International Conference on Big Data, 2013, pp. 36–44.


[68] C. Maltzahn, E. Molina-Estolano, A. Khurana, A. J. Nelson, S. A. Brandt, and S. Weil, “Ceph as a scalable alternative to the Hadoop Distributed File System,” login: The USENIX Magazine, 2010.

[69] M. Ovsiannikov, S. Rus, D. Reeves, P. Sutter, S. Rao, and J. Kelly, “The Quantcast File System,” Proc. VLDB Endow., vol. 6, no. 11, pp. 1092–1101, Aug. 2013.

[70] E. H. Wilson, M. T. Kandemir, and G. Gibson, “Will They Blend?: Exploring Big Data Computation atop Traditional HPC NAS Storage,” in Distributed Computing Systems (ICDCS), 2014 IEEE 34th International Conference on, 2014, pp. 524–534.

[71] “TACC Hadoop Cluster.” [Online]. Available: http://www.tacc.utexas.edu/tacc-projects/hadoop-cluster.

[72] H. Li, A. Ghodsi, M. Zaharia, S. Shenker, and I. Stoica, “Tachyon: Reliable, Memory Speed Storage for Cluster Computing Frameworks,” in Proceedings of the ACM Symposium on Cloud Computing, New York, NY, USA, 2014, p. 6:1–6:15.

[73] “The Extreme Science and Engineering Discovery Environment.” [Online]. Available: http://www.xsede.org.

[74] “The Palmetto Cluster.” [Online]. Available: http://citi.clemson.edu/palmetto/.

[75] Z. Fadika, E. Dede, M. Govindaraju, and L. Ramakrishnan, “MARIANE: MApReduce Implementation Adapted for HPC Environments,” in 2011 12th IEEE/ACM International Conference on Grid Computing (GRID), 2011, pp. 82–89.

[76] N. Rutman, “Map/reduce on lustre,” Technical report, Xyratex Technology Limited, Havant, Hampshire, UK, 2011.

[77] S. Mikami, K. Ohta, and O. Tatebe, “Using the Gfarm File System As a POSIX Compatible Storage Platform for Hadoop MapReduce Applications,” in Proceedings of the 2011 IEEE/ACM 12th International Conference on Grid Computing, Washington, DC, USA, 2011, pp. 181–189.

[78] C. Xu, R. J. Goldstone, Z. Liu, H. Chen, B. Neitzel, and W. Yu, “Exploiting Analytics Shipping with Virtualized MapReduce on HPC Backend Storage Servers,” IEEE Transactions on Parallel and Distributed Systems, vol. PP, no. 99, pp. 1–1, 2015.

[79] K. Ousterhout, R. Rasti, S. Ratnasamy, S. Shenker, B.-G. Chun, and Vm. ICSI, “Making Sense of Performance in Data Analytics Frameworks.”

[80] T. Wang, S. Oral, Y. Wang, B. Settlemyer, S. Atchley, and W. Yu, “BurstMem: A high-performance burst buffer system for scientific applications,” in 2014 IEEE International Conference on Big Data (Big Data), 2014, pp. 71–79.


[81] Y. Wang, R. Goldstone, W. Yu, and T. Wang, “Characterization and Optimization of Memory-Resident MapReduce on HPC Systems,” in Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, Washington, DC, USA, 2014, pp. 799–808.

[82] X. Yang, N. Liu, B. Feng, X.-H. Sun, and S. Zhou, “PortHadoop: Support direct HPC data processing in Hadoop,” in 2015 IEEE International Conference on Big Data (Big Data), 2015, pp. 223–232.

[83] N. S. Islam, X. Lu, M. Wasi-ur-Rahman, D. Shankar, and D. K. Panda, “Triple-H: A Hybrid Approach to Accelerate HDFS on HPC Clusters with Heterogeneous Storage Architecture,” in 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), 2015, pp. 101–110.

[84] S. Liu, “Fast Big Data Analytics with Spark on Tachyon,” presented at the Tachyon Meetup, 2015.

[85] P. Donnelly, P. Bui, and D. Thain, “Attaching Cloud Storage to a Campus Grid Using Parrot, Chirp, and Hadoop,” in 2010 IEEE Second International Conference on Cloud Computing Technology and Science (CloudCom), 2010, pp. 488–495.

[86] “SerCeMan/jnr-fuse,” GitHub. [Online]. Available: https://github.com/SerCeMan/jnr-fuse. [Accessed: 10-May-2016].

[87] “Userfs.” [Online]. Available: https://www.goop.org/~jeremy/userfs/. [Accessed: 10-May-2016].

[88] W. Bethel, M. Greenwald, K. van Dam, M. Parashar, S. Wild, and H. Wiley, “Management, Analysis, and Visualization of Experimental and Observational Data - The Convergence of Data and Computing,” U.S. Department of Energy, ANL/MCS-P6017-0616, 2016.

[89] J. He et al., “I/O Acceleration with Pattern Detection,” in Proceedings of the 22Nd International Symposium on High-performance Parallel and Distributed Computing, New York, NY, USA, 2013, pp. 25–36.

[90] S. Seo, R. Latham, J. Zhang, and P. Balaji, “Implementation and Evaluation of MPI Nonblocking Collective I/O.”

[91] N. Liu et al., “On the role of burst buffers in leadership-class storage systems,” in 2012 IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST), 2012, pp. 1–11.


[92] P. Xuan, J. Denton, F. Luo, P. K. Srimani, and R. Ge, “Big Data Analytics on Traditional HPC Infrastructure Using Two-Level Storage,” in The 2015 International Workshop on Data-Intensive Scalable Computing Systems, Austin, TX, USA, 2015.

[93] J. Liu, E. Racah, Q. Koziol, and R. S. Canon, “H5Spark: Bridging the I/O Gap between Spark and Scientific Data Formats on HPC Systems,” presented at the Cray User Group, 2016.

[94] S. Canon and D. Jacobsen, “Shifter: Containers for HPC,” presented at the Proceedings of the Cray User Group, 2016.

[95] A. Gittens et al., “Matrix Factorization at Scale: a Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies,” arXiv:1607.01335 [cs], Jul. 2016.

[96] N. Chaimov, A. Malony, S. Canon, C. Iancu, K. Z. Ibrahim, and J. Srinivasan, “Performance Evaluation of Apache Spark on Cray XC Systems.”

[97] N. Chaimov, A. Malony, S. Canon, C. Iancu, K. Z. Ibrahim, and J. Srinivasan, “Scaling Spark on HPC Systems,” in Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, New York, NY, USA, 2016, pp. 97–110.

[98] “[SPARK-10000] Consolidate storage and execution memory management - ASF JIRA.” [Online]. Available: https://issues.apache.org/jira/browse/SPARK-10000. [Accessed: 05-May-2016].

[99] L. Xu, M. Li, L. Zhang, A. R. Butt, Y. Wang, and Z. Z. Hu, “MEMTUNE: Dynamic Memory Management for In-memory Data Analytic Platforms.”

[100] H. Li, A. Ghodsi, M. Zaharia, S. Shenker, and I. Stoica, “Reliable, memory speed storage for cluster computing frameworks,” Proc. SoCC, 2014.

[101] “Collectd – The system statistics collection daemon.” [Online]. Available: https://collectd.org/. [Accessed: 06-May-2016].

[102] “Vert.x.” [Online]. Available: http://vertx.io/. [Accessed: 06-May-2016].

[103] “Apache Kafka.” [Online]. Available: http://kafka.apache.org/. [Accessed: 06-May-2016].

[104] R. Kurian, “How We Monitor and Run Kafka At Scale,” Confluent, 14-Apr-2016.


[105] K. Tzoumas, S. Ewen, and R. Metzger, “High-throughput, low-latency, and exactly-once stream processing with Apache Flink – data Artisans,” 05-Aug-2015.

[106] P. Xuan, W. B. Ligon, P. K. Srimani, R. Ge, and F. Luo, “Accelerating big data analytics on HPC clusters using two-level storage,” Parallel Computing.

[107] R. Gu et al., “DFS-Perf: A Scalable and Unified Benchmarking Framework for Distributed File Systems,” 2016.

[108] Z. Chen, Y. Zhou, and K. Li, “Eviction-based Cache Placement for Storage Caches.,” in USENIX Annual Technical Conference, General Track, 2003, pp. 269– 281.

[109] S. Jiang and X. Zhang, “LIRS: An Efficient Low Inter-reference Recency Set Replacement Policy to Improve Buffer Cache Performance,” in Proceedings of the 2002 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, New York, NY, USA, 2002, pp. 31–42.

[110] Q. Pu and A. Khandelwal, “Fair Allocation for In-Memory File Systems.”

[111] A. Uta, A. Sandu, S. Costache, and T. Kielmann, “Scalable In-Memory Computing,” in 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), 2015, pp. 805–810.

[112] J. S. Jeong, W.-Y. Lee, Y. Lee, Y. Yang, B. Cho, and B.-G. Chun, “Elastic Memory: Bring Elasticity Back to In-Memory Big Data Analytics,” presented at the 15th Workshop on Hot Topics in Operating Systems (HotOS XV), 2015.

[113] M. Eshel, R. L. Haskin, D. Hildebrand, M. Naik, F. B. Schmuck, and R. Tewari, “Panache: A Parallel File System Cache for Global File Access.,” in FAST, 2010, pp. 155–168.

[114] Y. Chen, H. Zhu, H. Jin, and X.-H. Sun, “Algorithm-level Feedback-controlled Adaptive data prefetcher: Accelerating data access for high-performance processors,” Parallel Computing, vol. 38, no. 10–11, pp. 533–551, Oct. 2012.

[115] Y. Chen, S. Byna, and X. H. Sun, “Data access history cache and associated data prefetching mechanisms,” in Proceedings of the 2007 ACM/IEEE Conference on Supercomputing, 2007. SC ’07, 2007, pp. 1–12.

[116] J. R. Lange and P. Dinda, “SymCall: Symbiotic Virtualization Through VMM-to-guest Upcalls,” in Proceedings of the 7th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, New York, NY, USA, 2011, pp. 193–204.

[117] P. Barham et al., “Xen and the Art of Virtualization,” in Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles, New York, NY, USA, 2003, pp. 164–177.

[118] C. A. Waldspurger, “Memory Resource Management in VMware ESX Server,” SIGOPS Oper. Syst. Rev., vol. 36, no. SI, pp. 181–194, Dec. 2002.

[119] S. S. Pinter, Y. Aridor, S. S. Shultz, and S. Guenender, “Improving machine virtualisation with ‘hotplug memory,’” International Journal of High Performance Computing and Networking, vol. 5, no. 4, pp. 241–250, Jan. 2008.
