Tachyon-Further-Improve-Sparks

Tachyon: A Reliable Memory Centric Storage for Big Data Analytics Haoyuan (HY) Li, Ali Ghodsi, Matei Zaharia, Scott Shenker, Ion Stoica UC Berkeley Outline • Overview • Research • Open Source • Future Outline • Overview • Research • Open Source • Future Memory is King • RAM throughput increasing exponen9ally • Disk throughput increasing slowly Memory-locality key to interac9ve response 9me Realized by many… • Frameworks already leverage memory 5 Problem solved? 6 An Example: - • Fast in-memory data processing framework – Keep one in-memory copy inside JVM – Track lineage of operaons used to derive data – Upon failure, use lineage to recompute data Lineage Tracking map join reduce filter map Issue 1 Different jobs share data: Slow writes to disk storage engine & Spark Task Spark Task execution engine same process Spark block 1 mem Spark block 3 mem (slow writes) block manager block 3 block manager block 1 block 1 block 2 HDFS block 3 block 4 disk 8 Issue 1 Different frameworks share data: Slow writes to disk storage engine & Spark Task Hadoop MR execution engine same process Spark block 1 mem YARN (slow writes) block manager block 3 block 1 block 2 HDFS block 3 Block 4 disk 9 Issue 2 execution engine & Spark Task storage engine same process block 1 Spark memory block 3 block manager block 1 block 2 HDFS block 3 block 4 disk 10 Issue 2 execution engine & crash storage engine same process block 1 Spark memory block 3 block manager block 1 block 2 HDFS block 3 block 4 disk 11 Issue 2 Process crash: lose all cache execution engine & storage engine crash same process block 1 block 2 HDFS block 3 block 4 disk 12 Issue 3 duplicate memory per job & GC execution engine & Spark Task Spark Task storage engine same process Spark block 1 mem Spark block 3 mem (duplication & GC) block manager block 3 block manager Block 1 block 1 block 2 HDFS block 3 block 4 disk 13 Tachyon Reliable data sharing at memory-speed within and across cluster frameworks/jobs 14 Soluon Overview Basic idea • Push lineage down to storage layer • Use memory aggressively Facts • One data copy in memory • Rely on recomputaon for fault-tolerance Stack Computation Frameworks (Spark, MapReduce, Impala, Tez, …) Tachyon Existing Storage Systems (HDFS, S3, GlusterFS, …) Architecture 17 Lineage 18 Issue 1 revisited Different frameworks share at memory-speed execution engine & storage engine Spark Task Hadoop MR same process (fast writes) Spark block 1 mem YARN block 1 block 2 Tachyon HDFS block 3 block 4 in-memory disk block 1 block 2 HDFS Block 3 Block 4 disk 19 Issue 2 revisited execution engine & Spark Task storage engine same process block 1 Spark memory block manager block 1 block 2 Tachyon HDFS block 3 block 4 in-memory disk 20 Issue 2 revisited execution engine & crash storage engine same process Spark memory block manager block 1 block 2 Tachyon HDFS block 3 block 4 in-memorydisk block 1 block 2 HDFS block 3 block 4 disk 21 Issue 2 revisited process crash: keep memory- cache execution engine & storage engine crash same process block 1 block 2 Tachyon HDFS block 3 block 4 in-memorydisk block 1 block 2 HDFS block 3 block 4 disk 22 Issue 3 revisited Off-heap memory storage one memory copy & no GC execution engine & storage engine Spark Task Spark Task same process (no duplication & GC) Spark mem Spark mem block 1 block 2 Tachyon HDFS block 3 block 4 in-memory disk block 1 block 2 HDFS Block 3 Block 4 disk 23 Outline • Overview • Research • Open Source • Future Ques9on 1: How long to get missing data back? That server contains the data computed last month! 25 Lineage enables Asynchronous Checkpoinng 26 Edge Algorithm • checkpoint leaves • checkpoint hot files • Bounded Recovery cost 27 Ques9on 2: How to allocate recomputaon resource? Would recomputaon slow down my high priority jobs? Priority Inversion? 28 Recomputaon Resource Allocaon • Priority Based Scheduler • Fair Sharing Based Scheduler 29 comparison with in Memory HDFS Further Improve Spark’s Performance Grep Master Faster Recovery Recomputaon Resource Consumpon 33 Outline • Overview • Research • Open Source • Future A Open Source Status • Apache License, Version 0.4.1 (Feb 2014) • 15+ Companies • Spark and MapReduce applications can run without any code change Tachyon 0.5: Release Growth - ? contributors Tachyon 0.4: - 30 contributors Tachyon 0.3: - 15 contributors Tachyon 0.2: - 3 contributors Apr ‘13 Oct‘13 Feb ‘14 July ‘14 36 contributors Outside of Berkeley More than 75% 37 Tachyon is the Default Off-Heap Storage Soluon for . Thanks to Redhat! Thanks to Redhat! Commercially Supported By Ageo Thanks to Redhat! in 41 Spark/MapReduce/Shark without Tachyon • Spark – val file = sc.textFile(“hdfs://ip:port/path”) • Hadoop MapReduce – hadoop jar hadoop-examples-1.0.4.jar wordcount hdfs://localhost:19998/input hdfs://localhost: 19998/output • Shark – cREATE TABLE orders_cached AS SELEcT * FROM orders; Spark/MapReduce/Shark with Tachyon • Spark – val file = sc.textFile(“tachyon://ip:port/path”) • Hadoop MapReduce – hadoop jar hadoop-examples-1.0.4.jar wordcount tachyon://localhost:19998/input tachyon:// localhost:19998/output • Shark – cREATE TABLE orders_tachyon AS SELEcT * FROM orders; Spark OFF_HEAP with Tachyon // Input data from Tachyon’s Memory val file = sc.textFile(“tachyon://ip:port/path”) // Store RDD OFF_HEAP in Tachyon’s Memory file.persist(OFF_HEAP) Thanks to our code contributors! Aaron Davidson Chang Cheng Jey Kottalam Raymond Liu Achal Soni Du Li Joseph Tang Reynold Xin Ali Ghodsi Fei Wang Lukasz Jastrzebski Robert Metzger Andrew Ash Gerald Zhang Manu Goyal Rong Gu Anurag Khandelwal Grace Huang Mark Hamstra Sean Zhong Aslan Bekirov Hao Cheng Nick Lanham Srinivas Parayya Bill Zhao Haoyuan Li Orcun Simsek Tao Wang Calvin Jia Henry Saputra Pengfei Xuan Timothy St. Clair Vamsi Chitters Colin Patrick McCabe Hobin Yoon Qifan Pu Xi Liu Shivaram Venkataraman Huamin Chen Qianhao Dong Xiaomin Zhang 45 Outline • Overview • Research • Open Source • Future Goal? 47 Beper Assist Other components MapRe Spark Tez Shark GraphX Impala …… duce Tachyon Gluster HDFS S3 Localfs NFS Ceph …… fs Welcome Collaboraon! Short Term Roadmap • Ceph Integraon (Redhat) • Hierarchical Local Storage (Intel) • Further improver Shark Performance (Yahoo) • Beper support for Mul9-tenancy (AMPLab) • Many more from AMPLab and Industry collaborators. 49 Your Requirements? Tachyon Summary • High-throughput, fault-tolerant memory centric storage, with lineage as a first class citizen • Further improve performance for frameworks such as Spark, Hadoop, and Shark etc. • Healthy community with 15+ companies contributing Thanks! Quesons? • More Informa,on: – h5p://tachyon-project.org – h5ps://github.com/amplab/tachyon • Email: – [email protected] .

Tachyon-Further-Improve-Sparks

Discretized Streams: Fault-Tolerant Streaming Computation at Scale

The Design and Implementation of Declarative Networks

Apache Spark: a Unified Engine for Big Data Processing

A Ditya a Kella

Alluxio: a Virtual Distributed File System by Haoyuan Li A

Donut: a Robust Distributed Hash Table Based on Chord

UC Berkeley UC Berkeley Electronic Theses and Dissertations

A Networking Abstraction for Distributed Data-Parallel Applications

Heterogeneity and Load Balance in Distributed Hash Tables

Big Data Analysis with Apache Spark

Table of Contents

An Architecture for Fast and General Data Processing on Large Clusters