DSP Frameworks DSP Frameworks We Consider
Total Page:16
File Type:pdf, Size:1020Kb
Università degli Studi di Roma “Tor Vergata” Dipartimento di Ingegneria Civile e Ingegneria Informatica DSP Frameworks Corso di Sistemi e Architetture per Big Data A.A. 2017/18 Valeria Cardellini DSP frameworks we consider • Apache Storm (with lab) • Twitter Heron – From Twitter as Storm and compatible with Storm • Apache Spark Streaming (lab) – Reduce the size of each stream and process streams of data (micro-batch processing) • Apache Flink • Apache Samza • Cloud-based frameworks – Google Cloud Dataflow – Amazon Kinesis Streams Valeria Cardellini - SABD 2017/18 1 Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications – Initially developed by Twitter • Topology – DAG of spouts (sources of streams) and bolts (operators and data sinks) Valeria Cardellini - SABD 2017/18 2 Stream grouping in Storm • Data parallelism in Storm: how are streams partitioned among multiple tasks (threads of execution)? • Shuffle grouping – Randomly partitions the tuples • Field grouping – Hashes on a subset of the tuple attributes Valeria Cardellini - SABD 2017/18 3 Stream grouping in Storm • All grouping (i.e., broadcast) – Replicates the entire stream to all the consumer tasks • Global grouping – Sends the entire stream to a single task of a bolt • Direct grouping – The producer of the tuple decides which task of the consumer will receive this tuple Valeria Cardellini - SABD 2017/18 4 Storm architecture • Master-worker architecture Valeria Cardellini - SABD 2017/18 5 Storm components: Nimbus and Zookeeper • Nimbus – The master node – Clients submit topologies to it – Responsible for distributing and coordinating the topology execution • Zookeeper – Nimbus uses a combination of the local disk(s) and Zookeeper to store state about the topology Valeria Cardellini - SABD 2017/18 6 Storm components: worker • Task: operator instance – The actual work for a bolt or a spout is done in the task • Executor: smallest schedulable entity – Execute one or more tasks related to same operator • Worker process: Java process running one or more executors worker process • Worker node: computing JAVA PROCESS executor executor resource, a container for THREAD THREAD one or more worker processes task task task task task Valeria Cardellini - SABD 2017/18 7 Storm components: supervisor • Each worker node runs a supervisor The supervisor: • receives assignments from Nimbus (through ZooKeeper) and spawns workers based on the assignment • sends to Nimbus (through ZooKeeper) a periodic heartbeat; • advertises the topologies that they are currently running, and any vacancies that are available to run more topologies Valeria Cardellini - SABD 2017/18 8 Twitter Heron • Realtime, distributed, fault-tolerant stream processing engine from Twitter • Developed as direct successor of Storm – Released as open source in 2016 https://twitter.github.io/heron/ – De facto stream data processing engine inside Twitter • Goal of overcoming Storm’s performance, reliability, and other shortcomings • Compatibility with Storm – API compatible with Storm: no code change is required for migration Valeria Cardellini - SABD 2017/18 9 Heron: in common with Storm • Same terminology of Storm – Topology, spout, bolt • Same stream groupings – Shuffle, fields, all, global • Example: WordCount topology Valeria Cardellini - SABD 2017/18 10 Heron: design goals • Isolation – Process-based topologies rather than thread-based – Each process should run in isolation (easy debugging, profiling, and troubleshooting) – Goal: overcoming Storm’s performance, reliability, and other shortcomings • Resource constraints – Safe to run in shared infrastructure: topologies use only initially allocated resources and never exceed bounds • Compatibility – Fully API and data model compatible with Storm Valeria Cardellini - SABD 2017/18 11 Heron: design goals • Backpressure – Built-in rate control mechanism to ensure that topologies can self-adjust in case components lag – Heron dynamically adjusts the rate at which data flows through the topology using backpressure • Performance – Higher throughput and lower latency than Storm – Enhanced configurability to fine-tune potential latency/throughput trade-offs • Semantic guarantees – Support for both at-most-once and at-least-once processing semantics • Efficiency – Minimum possible resource usage Valeria Cardellini - SABD 2017/18 12 Heron topology architecture • Master-work architecture • One Topology Master (TM) – Manages a topology throughout its entire lifecycle • Multiple Containers – Each Container multiple Heron Instances, a Stream Manager, and a Metrics Manager – A Heron Instance is a process that handles a single task of a spout or bolt – Containers communicate with TM to ensure that the topology forms a fully connected graph Valeria Cardellini - SABD 2017/18 13 Heron topology architecture Valeria Cardellini - SABD 2017/18 14 Heron topology architecture • Stream Manager (SM): routing engine for data streams – Each Heron connects to its local SM, while all of the SMs in a given topology connect to one another to form a network – Responsbile for propagating back pressure Valeria Cardellini - SABD 2017/18 15 Topology submit sequence Valeria Cardellini - SABD 2017/18 16 Self-adaptation in Heron • Dhalion: framework on top of Heron to autonomously reconfigure topologies to meet throughput SLOs, scaling resource consumption up and down as needed • Phases in Dhalion: - Symptom detection (backpressure, skew, …) - Diagnosis generation - Resolution • Adaptation actions: parallelism changes Valeria Cardellini - SABD 2017/18 17 Heron environment • Heron supports deployment on Apache Mesos • Can also run on Mesos using Apache Aurora as a scheduler or using a local scheduler Valeria Cardellini - SABD 2017/18 18 Batch processing vs. stream processing • Batch processing is just a special case of stream processing Valeria Cardellini - SABD 2017/18 19 Batch processing vs. stream processing • Batched/stateless: scheduled in batches – Short-lived tasks (Hadoop, Spark) – Distributed streaming over batches (Spark Streaming) • Dataflow/stateful: continuous/scheduled once (Storm, Flink, Heron) – Long-lived task execution – State is kept inside tasks Valeria Cardellini - SABD 2017/18 20 Native vs. non-native streaming Valeria Cardellini - SABD 2017/18 21 Apache Flink • Distributed data flow processing system • One common runtime for DSP applications and batch processing applications – Batch processing applications run efficiently as special cases of DSP applications • Integrated with many other projects in the open-source data processing ecosystem • Derives from Stratosphere project by TU Berlin, Humboldt University and Hasso Plattner Institute • Support a Storm-compatible API Valeria Cardellini - SABD 2017/18 22 Flink: software stack • On top: libraries with high-level APIs for different use cases, still in beta Valeria Cardellini - SABD 2017/18 23 Flink: programming model • Data stream – An unbounded, partitioned immutable sequence of events • Stream operators – Stream transformations that generate new output data streams from input ones Valeria Cardellini - SABD 2017/18 24 Flink: some features • Supports stream processing and windowing with Event Time semantics – Event time makes it easy to compute over streams where events arrive out of order, and where events may arrive delayed • Exactly-once semantics for stateful computations • Highly flexible streaming windows Valeria Cardellini - SABD 2017/18 25 Flink: some features • Continuous streaming model with backpressure • Flink's streaming runtime has natural flow control: slow data sinks backpressure faster sources Valeria Cardellini - SABD 2017/18 26 Flink: APIs and libraries • Streaming data applications: DataStream API – Supports functional transformations on data streams, with user-defined state, and flexible windows – Example: how to compute a sliding histogram of word occurrences of a data stream of texts WindowWordCount in Flink's DataStream API Sliding time window of 5 sec length and 1 sec trigger interval Valeria Cardellini - SABD 2017/18 27 Flink: APIs and libraries • Batch processing applications: DataSet API • Supports a wide range of data types beyond key/value pairs, and a wealth of operators Core loop of the PageRank algorithm for graphs Valeria Cardellini - SABD 2017/18 28 Flink: program optimization • Batch programs are automatically optimized to exploit situations where expensive operations (like shuffles and sorts) can be avoided, and when intermediate data should be cached Valeria Cardellini - SABD 2017/18 29 Flink: control events • Control events: special events injected in the data stream by operators • Periodically, the data source injects checkpoint barriers into the data stream by dividing the stream into pre- checkpoint and post-checkpoint • More coarse-grained approach than Storm: acks sequences of records instead of individual records • Watermarks signal the progress of event-time within a stream partition • Flink does not provide ordering guarantees after any form of stream repartitioning or broadcasting – Dealing with out-of-order records is left to the operator implementation Valeria Cardellini - SABD 2017/18 30 Flink: fault-tolerance • Based on Chandy-Lamport distributed snapshots • Lightweight mechanism – Allows to maintain high throughput rates and provide strong consistency guarantees at the same time Valeria Cardellini - SABD 2017/18 31 Flink: performance and memory management • High performance and low latency • Memory management