
Exploiting HPC Technologies to Accelerate Big Data Processing


Talk at Open Fabrics Workshop (April 2016) by

Dhabaleswar K. (DK) Panda, The Ohio State University
E-mail: [email protected]
http://www.cse.ohio-state.edu/~panda

Introduction to Big Data Applications and Analytics
• Big Data has become one of the most important elements of business analytics
• Provides groundbreaking opportunities for enterprise information management and decision making
• The amount of data is exploding; companies are capturing and digitizing more information than ever
• The rate of information growth appears to be exceeding Moore's Law

Data Generation in Internet Services and Applications
• Webpages (content, graph)
• Clicks (ad, page, social)
• Users (OpenID, FB Connect, etc.)
• e-mails (Hotmail, Y!Mail, etc.)
• Photos, Movies (YouTube, Video, etc.)
• Cookies / tracking info (see Ghostery)
• Installed apps (Android Market, App Store, etc.)
• Location (Latitude, Loopt, Foursquared, Now, etc.)
• User generated content (Wikipedia & co, etc.)
• Ads (display, text, DoubleClick, Yahoo, etc.)
• Comments (Discuss, etc.)
• Reviews (Yelp, Y!Local, etc.)
• Social connections (LinkedIn, Facebook, etc.)
• Purchase decisions (Netflix, Amazon, etc.)
• Instant messaging (YIM, Skype, Gtalk, etc.)
• Search terms (Google, Bing, etc.)
• News articles (BBC, NYTimes, Y!News, etc.)
• Blog posts (Wordpress, etc.)
• Microblogs (Twitter, Jaiku, Meme, etc.)
• Link sharing (Facebook, Delicious, Buzz, etc.)

[Chart: Number of Apps in the Apple App Store, Android Market, Blackberry, and Windows Phone (2013); Android Market: <1200K, Apple App Store: ~1000K]
Courtesy: http://dazeinfo.com/2014/07/10/apple-inc-aapl-ios-google-inc-goog-android-growth-mobile-ecosystem-2014/

Not Only in Internet Services - Big Data in Scientific Domains
• Scientific Data Management, Analysis, and Visualization
• Application examples
  – Climate modeling
  – Combustion
  – Fusion
  – Astrophysics
  – Bioinformatics
• Data Intensive Tasks
  – Run large-scale simulations on supercomputers
  – Dump data on parallel storage systems
  – Collect experimental / observational data
  – Move experimental / observational data to analysis sites
  – Visual analytics – help understand data visually

Typical Solutions or Architectures for Big Data Analytics
• Hadoop: http://hadoop.apache.org
  – The most popular framework for Big Data Analytics
  – HDFS, MapReduce, HBase, RPC, Hive, Pig, ZooKeeper, Mahout, etc.
• Spark: http://spark-project.org
  – Provides primitives for in-memory cluster computing; jobs can load data into memory and query it repeatedly
• Storm: http://storm-project.net
  – A distributed real-time computation system for real-time analytics, online machine learning, continuous computation, etc.
• S4: http://incubator.apache.org/s4
  – A distributed system for processing continuous unbounded streams of data
• GraphLab: http://graphlab.org
  – Consists of a core C++ GraphLab API and a collection of high-performance machine learning and data mining toolkits built on top of the GraphLab API
• Web 2.0: RDBMS + Memcached (http://memcached.org)
  – Memcached: a high-performance, distributed memory object caching system

Big Data Processing with Hadoop Components
• Major components included in this tutorial:
  – MapReduce (Batch)
  – HBase (Query)
  – HDFS (Storage)
  – RPC (Inter-process communication)
• Underlying Hadoop Distributed File System (HDFS) used by both MapReduce and HBase
• Model scales, but the high amount of communication during intermediate phases can be further optimized (the word-count sketch below shows where that intermediate data comes from)
[Hadoop Framework diagram: User Applications, MapReduce and HBase, HDFS, Hadoop Common (RPC)]
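To make the layering above concrete, here is a minimal word-count job against the stock Apache Hadoop 2.x MapReduce API; it is illustrative only and is not part of the RDMA-enhanced packages discussed later. The (word, 1) pairs emitted by the mappers are the intermediate data whose movement the later shuffle designs try to accelerate.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Minimal word count: map emits (word, 1); the shuffle groups pairs by word
// (the communication-intensive intermediate phase); reduce sums the counts.
public class WordCount {

  public static class TokenizerMapper
      extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      for (String token : value.toString().split("\\s+")) {
        if (!token.isEmpty()) {
          word.set(token);
          context.write(word, ONE);   // intermediate (key, value) pairs go through the shuffle
        }
      }
    }
  }

  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) sum += v.get();
      context.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));    // input lives in HDFS
    FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output written back to HDFS
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```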

Spark Architecture Overview
• An in-memory data-processing framework
  – Iterative machine learning jobs
  – Interactive data analytics
  – Scala based implementation
  – Standalone, YARN, Mesos
• Scalable and communication intensive
  – Wide dependencies between Resilient Distributed Datasets (RDDs)
  – MapReduce-like shuffle operations to repartition RDDs (a minimal shuffle-producing job is sketched below)
  – Sockets based communication
[Architecture diagram: Driver (SparkContext), Master (with Zookeeper), Workers, backed by HDFS]
http://spark.apache.org
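As a small, concrete illustration of the RDD model and the shuffle it implies, the sketch below runs a word count with the stock Spark 1.x Java API (matching the Spark 1.5.x versions evaluated later in the talk); the repartitioning behind reduceByKey is the sockets-based shuffle path that the RDMA design later targets. The HDFS paths are placeholders, and the job would normally be launched with spark-submit.

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class SparkWordCount {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("SparkWordCount");
    JavaSparkContext sc = new JavaSparkContext(conf);

    // RDD backed by a file in HDFS; transformations stay lazy until an action runs.
    JavaRDD<String> lines = sc.textFile("hdfs:///data/input.txt");   // placeholder path

    JavaPairRDD<String, Integer> counts = lines
        .flatMap(line -> Arrays.asList(line.split("\\s+")))          // Spark 1.x flatMap returns an Iterable
        .mapToPair(word -> new Tuple2<>(word, 1))
        // reduceByKey introduces a wide dependency: the RDD is repartitioned by key,
        // i.e. the MapReduce-like shuffle mentioned above.
        .reduceByKey((a, b) -> a + b);

    counts.saveAsTextFile("hdfs:///data/output");                    // placeholder path
    sc.stop();
  }
}
```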

Memcached Architecture
[Architecture diagram: multiple servers, each with CPUs, main memory, SSD and HDD, connected through high performance networks]
• Distributed Caching Layer
  – Allows to aggregate spare memory from multiple nodes
  – General purpose
• Typically used to cache database queries, results of API calls (see the cache-aside sketch below)
• Scalable model, but typical usage very network intensive
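The cache-aside pattern over this layer looks roughly like the sketch below, shown with the open-source spymemcached Java client (any socket-based client behaves similarly; host names, port, and keys are placeholders). Every GET miss and every SET is a network round trip, which is why typical usage is so network intensive.

```java
import java.net.InetSocketAddress;
import net.spy.memcached.MemcachedClient;

public class CacheAsideExample {
  public static void main(String[] args) throws Exception {
    // Connect to two memcached servers that pool their spare memory.
    MemcachedClient cache = new MemcachedClient(
        new InetSocketAddress("mc-node1", 11211),   // placeholder hosts
        new InetSocketAddress("mc-node2", 11211));

    String key = "user:42:profile";                 // made-up key
    Object value = cache.get(key);                  // network round trip (GET)
    if (value == null) {
      value = loadFromDatabase(key);                // e.g. a MySQL query on a cache miss
      cache.set(key, 300, value);                   // cache for 300 s (SET round trip)
    }
    System.out.println(value);
    cache.shutdown();
  }

  // Stand-in for the real database access performed on a cache miss.
  private static Object loadFromDatabase(String key) {
    return "row-for-" + key;
  }
}
```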

Data Management and Processing on Modern Clusters
• Substantial impact on designing and utilizing data management and processing systems in multiple tiers
  – Front-end data accessing and serving (Online)
    • Memcached + DB (e.g. MySQL), HBase
  – Back-end data analytics (Offline)
    • HDFS, MapReduce, Spark

[Diagram: Front-end Tier for data accessing and serving: Internet, Web Servers, Memcached + DB (MySQL), NoSQL DB (HBase); Back-end Tier: Data Analytics Apps/Jobs over MapReduce, Spark, and HDFS]

Trends in HPC Technologies
• Advanced Interconnects and RDMA protocols
  – InfiniBand
  – 10-40 Gigabit Ethernet/iWARP
  – RDMA over Converged Enhanced Ethernet (RoCE)
• Delivering excellent performance (Latency, Bandwidth and CPU Utilization)
• Has influenced re-designs of enhanced HPC middleware
  – Message Passing Interface (MPI) and PGAS
  – Parallel File Systems (Lustre, GPFS, ..)
• SSDs (SATA and NVMe)
• NVRAM and Burst Buffer

How Can HPC Clusters with High-Performance Interconnect and Storage Architectures Benefit Big Data Applications?

• Can RDMA-enabled high-performance interconnects benefit Big Data processing?
• Can HPC clusters with high-performance storage systems (e.g. SSD, parallel file systems) benefit Big Data applications?
• How much performance benefit can be achieved through enhanced designs?
• Can the bottlenecks be alleviated with new designs by taking advantage of HPC technologies?
• What are the major bottlenecks in current Big Data processing middleware (e.g. Hadoop, Spark, and Memcached)?
• How to design benchmarks for evaluating the performance of Big Data middleware on HPC clusters?

Bring HPC and Big Data processing into a “convergent trajectory”!

Designing Communication and I/O Libraries for Big Data Systems: Challenges

[Layered design diagram:
Applications, Benchmarks
Big Data Middleware (HDFS, MapReduce, HBase, Spark and Memcached) (upper level changes?)
Programming Models (Sockets) (other protocols?)
Communication and I/O Library: Point-to-Point Communication, Threaded Models and Synchronization, Virtualization, I/O and File Systems, QoS, Fault-Tolerance
Commodity Computing System Architectures (Multi- and Many-core architectures and accelerators), Networking Technologies (InfiniBand, 1/10/40/100 GigE and Intelligent NICs), Storage Technologies (HDD, SSD, and NVMe-SSD)]

The High-Performance Big Data (HiBD) Project
• RDMA for Apache Spark
• RDMA for Apache Hadoop 2.x (RDMA-Hadoop-2.x)
  – Plugins for Apache, Hortonworks (HDP) and Cloudera (CDH) Hadoop distributions
• RDMA for Apache Hadoop 1.x (RDMA-Hadoop)
• RDMA for Memcached (RDMA-Memcached)
Available for InfiniBand and RoCE
• OSU HiBD-Benchmarks (OHB)
  – HDFS and Memcached Micro-benchmarks
• http://hibd.cse.ohio-state.edu
• Users Base: 166 organizations from 22 countries
• More than 15,900 downloads from the project site
• RDMA for Apache HBase and Impala (upcoming)

Different Modes of RDMA for Apache Hadoop 2.x

• HHH: Heterogeneous storage devices with hybrid replication schemes are supported in this mode of operation to have better fault-tolerance as well as performance. This mode is enabled by default in the package. (The heterogeneous-storage idea is illustrated with a sketch after this list.)
• HHH-M: A high-performance, in-memory based setup has been introduced in this package that can be utilized to perform all I/O operations in-memory and obtain as much performance benefit as possible.
• HHH-L: With parallel file systems integrated, HHH-L mode can take advantage of the Lustre available in the cluster.
• MapReduce over Lustre, with/without local disks: Besides the HDFS-based solutions, this package also provides support to run MapReduce jobs on top of Lustre alone. Here, two different modes are introduced: with local disks and without local disks.
• Running with Slurm and PBS: Supports deploying RDMA for Apache Hadoop 2.x with Slurm and PBS in different running modes (HHH, HHH-M, HHH-L, and MapReduce over Lustre).
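As a rough analogy for the heterogeneous-storage placement behind HHH and HHH-M, stock Apache Hadoop 2.6+ already exposes per-path storage policies; the sketch below uses that vanilla HDFS API purely for illustration and is not the configuration interface of the RDMA for Apache Hadoop package itself. Paths are placeholders.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

// Illustration only: stock HDFS storage policies (Hadoop 2.6+), not the
// RDMA-Hadoop package's own configuration mechanism.
public class StoragePolicyExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    if (!(fs instanceof DistributedFileSystem)) {
      throw new IllegalStateException("Expected an HDFS file system");
    }
    DistributedFileSystem dfs = (DistributedFileSystem) fs;

    // Keep hot data on RAM disk (memory), analogous in spirit to an in-memory mode.
    dfs.setStoragePolicy(new Path("/scratch/hot"), "LAZY_PERSIST");  // placeholder path

    // Keep one replica on SSD and the rest on disk for warm data.
    dfs.setStoragePolicy(new Path("/data/warm"), "ONE_SSD");         // placeholder path
  }
}
```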

Acceleration Case Studies and Performance Evaluation

• RDMA-based Designs and Performance Evaluation
  – HDFS
  – MapReduce
  – RPC
  – HBase
  – Spark
  – Memcached (Basic and Hybrid)
  – HDFS + Memcached-based Burst Buffer

Design Overview of HDFS with RDMA

• Design Features
  – RDMA-based HDFS write
  – RDMA-based HDFS replication
  – Parallel replication support
  – On-demand connection setup
  – InfiniBand/RoCE support
• Enables high performance RDMA communication, while supporting traditional socket interface
• JNI Layer bridges Java based HDFS with communication library written in native code (see the JNI sketch below)
[Design diagram: Applications, HDFS (Write, Others), Java Socket Interface or Java Native Interface (JNI), OSU Design (Verbs), 1/10/40/100 GigE and IPoIB Network or RDMA Capable Networks (IB, iWARP, RoCE ..)]

N. S. Islam, M. W. Rahman, J. Jose, R. Rajachandrasekar, H. Wang, H. Subramoni, C. Murthy and D. K. Panda, High Performance RDMA-Based Design of HDFS over InfiniBand, Supercomputing (SC), Nov 2012
N. Islam, X. Lu, W. Rahman, and D. K. Panda, SOR-HDFS: A SEDA-based Approach to Maximize Overlapping in RDMA-Enhanced HDFS, HPDC '14, June 2014
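The JNI bridging mentioned in the design above follows the standard Java native-method pattern. The sketch below is a generic, hypothetical illustration of that pattern (class, method, and library names are invented here and are not the actual OSU design): Java declares native methods, and a C/C++ library built on the verbs interface implements them.

```java
// Hypothetical sketch of bridging Java to a native RDMA communication library
// via JNI. All names (class, methods, library) are invented for illustration;
// this is not the actual OSU HDFS design. Running it requires the native
// library to be built and available on java.library.path.
public class NativeRdmaBridge {
  static {
    // Loads libnativerdma.so (hypothetical library name) at class-load time.
    System.loadLibrary("nativerdma");
  }

  // Declared in Java, implemented in native code. The corresponding C symbol
  // would follow the JNI naming convention, e.g.:
  //   JNIEXPORT jlong JNICALL Java_NativeRdmaBridge_connect(JNIEnv *, jobject, jstring, jint);
  public native long connect(String host, int port);
  public native int writeBlock(long connectionHandle, byte[] data, int offset, int length);
  public native void close(long connectionHandle);
}
```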

Enhanced HDFS with In-Memory and Heterogeneous Storage (Triple-H)
• Design Features
  – Three modes
    • Default (HHH)
    • In-Memory (HHH-M)
    • Lustre-Integrated (HHH-L)
  – Policies to efficiently utilize the heterogeneous storage devices
    • RAM, SSD, HDD, Lustre
  – Eviction/Promotion based on data usage pattern
  – Hybrid Replication
  – Lustre-Integrated mode:
    • Lustre-based fault-tolerance
[Design diagram: Applications, Triple-H (Data Placement Policies, Eviction/Promotion, Hybrid Replication), Heterogeneous Storage (RAM Disk, SSD, HDD, Lustre)]

N. Islam, X. Lu, M. W. Rahman, D. Shankar, and D. K. Panda, Triple-H: A Hybrid Approach to Accelerate HDFS on HPC Clusters with Heterogeneous Storage Architecture, CCGrid '15, May 2015

Design Overview of MapReduce with RDMA
• Design Features
  – RDMA-based shuffle
  – Prefetching and caching map output
  – Efficient Shuffle Algorithms
  – In-memory merge
  – On-demand Shuffle Adjustment
  – Advanced overlapping
    • map, shuffle, and merge
    • shuffle, merge, and reduce
  – On-demand connection setup
  – InfiniBand/RoCE support
• Enables high performance RDMA communication, while supporting traditional socket interface
• JNI Layer bridges Java based MapReduce with communication library written in native code
[Design diagram: Applications, MapReduce (Job Tracker, Task Tracker, Map, Reduce), Java Socket Interface or Java Native Interface (JNI), OSU Design (Verbs), 1/10/40/100 GigE and IPoIB Network or RDMA Capable Networks (IB, iWARP, RoCE ..)]
M. W. Rahman, X. Lu, N. S. Islam, and D. K. Panda, HOMR: A Hybrid Approach to Exploit Maximum Overlapping in MapReduce over High Performance Interconnects, ICS, June 2014

Performance Benefits – RandomWriter & TeraGen in TACC-Stampede

[Charts: Execution Time (s) vs Data Size (80, 100, 120 GB) for RandomWriter (reduced by 3x) and TeraGen (reduced by 4x) relative to IPoIB (FDR)]
Cluster with 32 nodes with a total of 128 maps
• RandomWriter
  – 3-4x improvement over IPoIB for 80-120 GB file size
• TeraGen
  – 4-5x improvement over IPoIB for 80-120 GB file size

Performance Benefits – Sort & TeraSort in TACC-Stampede

[Charts: Execution Time (s) vs Data Size (80, 100, 120 GB) for Sort (reduced by 52%) and TeraSort (reduced by 44%), IPoIB (FDR) vs OSU-IB (FDR)]
Sort: cluster with 32 nodes with a total of 128 maps and 57 reduces; TeraSort: cluster with 32 nodes with a total of 128 maps and 64 reduces
• Sort with single HDD per node
  – 40-52% improvement over IPoIB for 80-120 GB data
• TeraSort with single HDD per node
  – 42-44% improvement over IPoIB for 80-120 GB data

Evaluation of HHH and HHH-L with Applications
[Charts: MR-MSPolyGraph execution time vs concurrent maps per host (4, 6, 8) for HDFS, Lustre, and HHH-L (reduced by 79%); CloudBurst execution time for HDFS (FDR) 60.24 s vs HHH (FDR) 48.3 s]
• MR-MSPolygraph on OSU RI with 1,000 maps
  – HHH-L reduces the execution time by 79% over Lustre, 30% over HDFS
• CloudBurst on TACC Stampede
  – With HHH: 19% improvement over HDFS

Evaluation with Spark on SDSC Gordon (HHH vs. Tachyon/Alluxio)
[Charts: Execution Time (s) vs Cluster Size : Data Size (GB) (8:50, 16:100, 32:200) for TeraGen (reduced by 2.4x) and TeraSort (reduced by 25.2%), comparing IPoIB (QDR), Tachyon, and OSU-IB (QDR)]
• For 200GB TeraGen on 32 nodes
  – Spark-TeraGen: HHH has 2.4x improvement over Tachyon; 2.3x over HDFS-IPoIB (QDR)
  – Spark-TeraSort: HHH has 25.2% improvement over Tachyon; 17% over HDFS-IPoIB (QDR)

N. Islam, M. W. Rahman, X. Lu, D. Shankar, and D. K. Panda, Performance Characterization and Acceleration of In-Memory File Systems for Hadoop and Spark Applications on HPC Clusters, IEEE BigData '15, October 2015

Design Overview of Shuffle Strategies for MapReduce over Lustre
• Design Features
  – Two shuffle approaches
    • Lustre read based shuffle
    • RDMA based shuffle
  – Hybrid shuffle algorithm to take benefit from both shuffle approaches
  – Dynamically adapts to the better shuffle approach for each shuffle request, based on profiling values for each Lustre read operation (the selection idea is sketched below)
  – In-memory merge and overlapping of different phases are kept similar to the RDMA-enhanced MapReduce design
[Design diagram: Map 1 / Map 2 / Map 3, Intermediate Data Directory on Lustre, Lustre Read or RDMA, Reduce 1 / Reduce 2 with in-memory merge/sort and reduce]
M. W. Rahman, X. Lu, N. S. Islam, R. Rajachandrasekar, and D. K. Panda, High Performance Design of YARN MapReduce on Modern HPC Clusters with Lustre and RDMA, IPDPS, May 2015
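To picture the dynamic adaptation, here is a small hypothetical sketch of the selection idea: profiling of completed Lustre reads feeds a running cost estimate, and each shuffle request takes whichever path currently looks cheaper. This only illustrates the concept; it is not the published algorithm's actual policy or code.

```java
// Hypothetical sketch of per-request selection between a Lustre-read shuffle
// and an RDMA shuffle, driven by profiled Lustre read times. Illustrative only.
public class HybridShuffleSelector {
  public enum ShufflePath { LUSTRE_READ, RDMA }

  private double avgLustreReadMicros = 0.0;  // running average from profiling
  private long samples = 0;

  // Record the measured cost of a completed Lustre read operation.
  public synchronized void recordLustreRead(double micros) {
    samples++;
    avgLustreReadMicros += (micros - avgLustreReadMicros) / samples;
  }

  // Choose a path for the next shuffle request, given an estimate of the RDMA cost.
  public synchronized ShufflePath choose(double estimatedRdmaMicros) {
    if (samples == 0) {
      return ShufflePath.RDMA;  // no profile yet: fall back to the RDMA path
    }
    return (avgLustreReadMicros <= estimatedRdmaMicros)
        ? ShufflePath.LUSTRE_READ
        : ShufflePath.RDMA;
  }
}
```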

Performance Improvement of MapReduce over Lustre on TACC-Stampede
• Local disk is used as the intermediate data directory
[Charts: Job Execution Time (sec) vs Data Size (300, 400, 500 GB), reduced by 44%; and vs Data Size (20 GB to 640 GB) on clusters of 4 to 128 nodes, reduced by 48%; IPoIB (FDR) vs OSU-IB (FDR)]

• For 500GB Sort in 64 nodes
  – 44% improvement over IPoIB (FDR)
• For 640GB Sort in 128 nodes
  – 48% improvement over IPoIB (FDR)
M. W. Rahman, X. Lu, N. S. Islam, R. Rajachandrasekar, and D. K. Panda, MapReduce over Lustre: Can RDMA-based Approach Benefit?, Euro-Par, August 2014

Case Study - Performance Improvement of MapReduce over Lustre on SDSC-Gordon
• Lustre is used as the intermediate data directory
[Charts: Job Execution Time (sec) vs Data Size (GB) for Sort (40, 60, 80 GB; reduced by 34%) and TeraSort (40, 80, 120 GB; reduced by 25%), comparing IPoIB (QDR), OSU-Lustre-Read (QDR), OSU-RDMA-IB (QDR), and OSU-Hybrid-IB (QDR)]
• For 80GB Sort in 8 nodes
  – 34% improvement over IPoIB (QDR)
• For 120GB TeraSort in 16 nodes
  – 25% improvement over IPoIB (QDR)

Acceleration Case Studies and Performance Evaluation

• RDMA-based Designs and Performance Evaluation
  – HDFS
  – MapReduce
  – RPC
  – HBase
  – Spark
  – Memcached (Basic and Hybrid)
  – HDFS + Memcached-based Burst Buffer

Design Overview of Spark with RDMA

• Design Features
  – RDMA based shuffle (see the configuration sketch below)
  – SEDA-based plugins
  – Dynamic connection management and sharing
  – Non-blocking data transfer
  – Off-JVM-heap buffer management
  – InfiniBand/RoCE support
• Enables high performance RDMA communication, while supporting traditional socket interface
• JNI Layer bridges Scala based Spark with communication library written in native code
[Design diagram: Spark Applications (Scala/Java/Python); Spark (Scala/Java) with Tasks, BlockManager, BlockFetcherIterator; Shuffle Server / Shuffle Fetcher via Java NIO (default), Netty (optional), or RDMA (plug-in); RDMA-based Shuffle Engine (Java/JNI); Java Socket over 1/10 Gig Ethernet/IPoIB (QDR/FDR) or Native InfiniBand (QDR/FDR)]
X. Lu, M. W. Rahman, N. Islam, D. Shankar, and D. K. Panda, Accelerating Spark with RDMA for Big Data Processing: Early Experiences, Int'l Symposium on High Performance Interconnects (HotI'14), August 2014

Performance Evaluation on SDSC Comet – SortBy/GroupBy
[Charts: Time (sec) vs Data Size (64, 128, 256 GB) for SortByTest Total Time (80% reduction) and GroupByTest Total Time (57% reduction), IPoIB vs RDMA, 64 Worker Nodes, 1536 cores]
• InfiniBand FDR, SSD, 64 Worker Nodes, 1536 Cores, (1536M 1536R)
• RDMA-based design for Spark 1.5.1
• RDMA vs. IPoIB with 1536 concurrent tasks, single SSD per node
  – SortBy: Total time reduced by up to 80% over IPoIB (56Gbps)
  – GroupBy: Total time reduced by up to 57% over IPoIB (56Gbps)

Performance Evaluation on SDSC Comet – HiBench Sort/TeraSort

[Charts: Time (sec) vs Data Size (64, 128, 256 GB) for Sort Total Time (38% reduction) and TeraSort Total Time (15% reduction), IPoIB vs RDMA, 64 Worker Nodes, 1536 cores]
• InfiniBand FDR, SSD, 64 Worker Nodes, 1536 Cores, (1536M 1536R)
• RDMA-based design for Spark 1.5.1
• RDMA vs. IPoIB with 1536 concurrent tasks, single SSD per node
  – Sort: Total time reduced by 38% over IPoIB (56Gbps)
  – TeraSort: Total time reduced by 15% over IPoIB (56Gbps)

Performance Evaluation on SDSC Comet – HiBench PageRank

[Charts: Time (sec) vs Data Size (Huge, BigData, Gigantic) for PageRank Total Time on 32 Worker Nodes / 768 cores (37% reduction) and 64 Worker Nodes / 1536 cores (43% reduction), IPoIB vs RDMA]
• InfiniBand FDR, SSD, 32/64 Worker Nodes, 768/1536 Cores, (768/1536M 768/1536R)
• RDMA-based design for Spark 1.5.1
• RDMA vs. IPoIB with 768/1536 concurrent tasks, single SSD per node
  – 32 nodes/768 cores: Total time reduced by 37% over IPoIB (56Gbps)
  – 64 nodes/1536 cores: Total time reduced by 43% over IPoIB (56Gbps)

Acceleration Case Studies and Performance Evaluation

• RDMA-based Designs and Performance Evaluation
  – HDFS
  – MapReduce
  – RPC
  – HBase
  – Spark
  – Memcached (Basic and Hybrid)
  – HDFS + Memcached-based Burst Buffer

RDMA-Memcached Performance (FDR Interconnect)
[Charts: Memcached GET Latency, Time (us) vs Message Size (1 B to 4 KB), latency reduced by nearly 20X; Memcached Throughput, Thousands of Transactions per Second (TPS) vs No. of Clients (16 to 4080), 2X improvement; OSU-IB (FDR) vs IPoIB (FDR)]
Experiments on TACC Stampede (Intel SandyBridge Cluster, IB: FDR)
• Memcached Get latency
  – 4 bytes OSU-IB: 2.84 us; IPoIB: 75.53 us
  – 2K bytes OSU-IB: 4.49 us; IPoIB: 123.42 us
• Memcached Throughput (4 bytes)
  – 4080 clients OSU-IB: 556 Kops/sec, IPoIB: 233 Kops/sec
  – Nearly 2X improvement in throughput

Micro-benchmark Evaluation for OLDP workloads
[Charts: Latency (sec), reduced by 66%, and Throughput (Kq/s) vs No. of Clients (64 to 400), Memcached-IPoIB (32Gbps) vs Memcached-RDMA (32Gbps)]
• Illustration with Read-Cache-Read access pattern using modified mysqlslap load testing tool
• Memcached-RDMA can
  – improve query latency by up to 66% over IPoIB (32Gbps)
  – improve throughput by up to 69% over IPoIB (32Gbps)
D. Shankar, X. Lu, J. Jose, M. W. Rahman, N. Islam, and D. K. Panda, Can RDMA Benefit On-Line Data Processing Workloads with Memcached and MySQL, ISPASS'15

Performance Benefits of Hybrid Memcached (Memory + SSD) on SDSC-Gordon
[Charts: Average latency (us) vs Message Size (Bytes) and Throughput (million trans/sec) vs No. of Clients (64 to 1024), comparing IPoIB (32Gbps), RDMA-Mem (32Gbps), and RDMA-Hybrid (32Gbps); up to 2X throughput]
• ohb_memlat & ohb_memthr latency & throughput micro-benchmarks
• Memcached-RDMA can
  – improve query latency by up to 70% over IPoIB (32Gbps)
  – improve throughput by up to 2X over IPoIB (32Gbps)
  – No overhead in using hybrid mode when all data can fit in memory

Performance Evaluation on IB FDR + SATA/NVMe SSDs

[Stacked-bar chart: Latency (us) for Set and Get under IPoIB-Mem, RDMA-Mem, RDMA-Hybrid-SATA, and RDMA-Hybrid-NVMe, for "Data Fits In Memory" and "Data Does Not Fit In Memory"; latency components: slab allocation (SSD write), cache check+load (SSD read), cache update, server response, client wait, miss-penalty]
– Memcached latency test with Zipf distribution, server with 1 GB memory, 32 KB key-value pair size; total size of data accessed is 1 GB (when data fits in memory) and 1.5 GB (when data does not fit in memory)
– When data fits in memory: RDMA-Mem/Hybrid gives 5x improvement over IPoIB-Mem
– When data does not fit in memory: RDMA-Hybrid gives 2x-2.5x over IPoIB/RDMA-Mem

On-going and Future Plans of OSU High Performance Big Data (HiBD) Project
• Upcoming releases of RDMA-enhanced packages will support
  – HBase
  – Impala
• Upcoming releases of OSU HiBD Micro-Benchmarks (OHB) will support
  – MapReduce
  – RPC
• Advanced designs with upper-level changes and optimizations
  – Memcached with Non-blocking API
  – HDFS + Memcached-based Burst Buffer

Concluding Remarks

• Discussed challenges in accelerating Hadoop, Spark and Memcached
• Presented initial designs to take advantage of HPC technologies to accelerate HDFS, MapReduce, RPC, Spark, and Memcached
• Results are promising
• Many other open issues need to be solved
• Will enable the Big Data community to take advantage of modern HPC technologies to carry out their analytics in a fast and scalable manner

Second International Workshop on High-Performance Big Data Computing (HPBDC)
HPBDC 2016 will be held with the IEEE International Parallel and Distributed Processing Symposium (IPDPS 2016), Chicago, Illinois USA, May 27th, 2016
Keynote Talk: Dr. Chaitanya Baru, Senior Advisor for Data Science, National Science Foundation (NSF); Distinguished Scientist, San Diego Supercomputer Center (SDSC)

Panel Moderator: Jianfeng Zhan (ICT/CAS)
Panel Topic: Merge or Split: Mutual Influence between Big Data and HPC Techniques

Six Regular Research Papers and Two Short Research Papers
http://web.cse.ohio-state.edu/~luxi/hpbdc2016
HPBDC 2015 was held in conjunction with ICDCS'15
http://web.cse.ohio-state.edu/~luxi/hpbdc2015

Thank You!
[email protected]

Network-Based Computing Laboratory
http://nowlab.cse.ohio-state.edu/

The High-Performance Big Data Project
http://hibd.cse.ohio-state.edu/
