Apache Flink Fast and Reliable Large-Scale Data Processing
Total Page:16
File Type:pdf, Size:1020Kb
Apache Flink Fast and Reliable Large-Scale Data Processing Fabian Hueske @fhueske 1 What is Apache Flink? Distributed Data Flow Processing System • Focused on large-scale data analytics • Real-time stream and batch processing • Easy and powerful APIs (Java / Scala) • Robust execution backend 2 What is Flink good at? It‘s a general-purpose data analytics system • Real-time stream processing with flexible windows • Complex and heavy ETL jobs • Analyzing huge graphs • Machine-learning on large data sets • ... 3 Flink in the Hadoop Ecosystem Libraries Dataflow Table API Table ML Library ML Gelly Library Gelly Apache MRQL Apache SAMOA DataSet API (Java/Scala) DataStream API (Java/Scala) Flink Core Optimizer Stream Builder Runtime Environments Embedded Local Cluster Yarn Apache Tez Data HDFS Hadoop IO Apache HBase Apache Kafka Apache Flume Sources HCatalog JDBC S3 RabbitMQ ... 4 Flink in the ASF • Flink entered the ASF about one year ago – 04/2014: Incubation – 12/2014: Graduation • Strongly growing community 120 100 80 60 40 20 0 Nov.10 Apr.12 Aug.13 Dec.14 #unique git committers (w/o manual de-dup) 5 Where is Flink moving? A "use-case complete" framework to unify batch & stream processing Data Streams • Kafka Analytical Workloads • RabbitMQ • ETL • ... • Relational processing Flink • Graph analysis • Machine learning “Historic” data • Streaming data analysis • HDFS • JDBC • ... Goal: Treat batch as finite stream 6 Programming Model & APIs HOW TO USE FLINK? 7 Unified Java & Scala APIs • Fluent and mirrored APIs in Java and Scala • Table API for relational expressions • Batch and Streaming APIs almost identical ... ... with slightly different semantics in some cases 8 DataSets and Transformations Input filter First map Second ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment(); DataSet<String> input = env.readTextFile(input); DataSet<String> first = input .filter (str -> str.contains(“Apache Flink“)); DataSet<String> second = first .map(str -> str.toLowerCase()); second.print(); env.execute(); 9 Expressive Transformations • Element-wise – map, flatMap, filter, project • Group-wise – groupBy, reduce, reduceGroup, combineGroup, mapPartition, aggregate, distinct • Binary – join, coGroup, union, cross • Iterations – iterate, iterateDelta • Physical re-organization – rebalance, partitionByHash, sortPartition • Streaming – Window, windowMap, coMap, ... 10 Rich Type System • Use any Java/Scala classes as a data type – Tuples, POJOs, and case classes – Not restricted to key-value pairs • Define (composite) keys directly on data types – Expression – Tuple position – Selector function 11 Counting Words in Batch and Stream case class Word (word: String, frequency: Int) DataSet API (batch): val lines: DataSet[String] = env.readTextFile(...) lines.flatMap {line => line.split(" ") .map(word => Word(word,1))} .groupBy("word").sum("frequency") .print() DataStream API (streaming): val lines: DataStream[String] = env.fromSocketStream(...) lines.flatMap {line => line.split(" ") .map(word => Word(word,1))} .window(Count.of(1000)).every(Count.of(100)) .groupBy("word").sum("frequency") .print() 12 Table API • Execute SQL-like expressions on table data – Tight integration with Java and Scala APIs – Available for batch and streaming programs val orders = env.readCsvFile(…) .as('oId, 'oDate, 'shipPrio) .filter('shipPrio === 5) val items = orders .join(lineitems).where('oId === 'id) .select('oId, 'oDate, 'shipPrio, 'extdPrice * (Literal(1.0f) - 'discnt) as 'revenue) val result = items .groupBy('oId, 'oDate, 'shipPrio) .select('oId, 'revenue.sum, 'oDate, 'shipPrio) 13 Libraries are emerging • As part of the Apache Flink project – Gelly: Graph processing and analysis – Flink ML: Machine-learning pipelines and algorithms – Libraries are built on APIs and can be mixed with them • Outside of Apache Flink – Apache SAMOA (incubating) – Apache MRQL (incubating) – Google DataFlow translator 14 Processing Engine WHAT IS HAPPENING INSIDE? 15 System Architecture Client (pre-flight) Master Type extraction Recovery Flink stack metadata Program Cost-based Task optimizer scheduling Workers Memory manager Coordination Data serialization stack Out-of-core ... ... algos Pipelined or Blocking Data Transfer 16 Cool technology inside Flink • Batch and Streaming in one system • Memory-safe execution • Built-in data flow iterations • Cost-based data flow optimizer • Flexible windows on data streams • Type extraction and serialization utilities • Static code analysis on user functions • and much more... 17 Pipelined Data Transfer STREAM AND BATCH IN ONE SYSTEM 18 Stream and Batch in one System • Most systems are either stream or batch systems • In the past, Flink focused on batch processing – Flink‘s runtime has always done stream processing – Operators pipeline data forward as soon as it is processed – Some operators are blocking (such as sort) • Stream API and operators are recent contributions – Evolving very quickly under heavy development 19 Pipelined Data Transfer • Pipelined data transfer has many benefits – True stream and batch processing in one stack – Avoids materialization of large intermediate results – Better performance for many batch workloads • Flink supports blocking data transfer as well 20 Pipelined Data Transfer Large Interm. map Program Input DataSet Small join Result Input Pipeline 2 Large map No intermediate Input Pipelined materialization! Execution Small Probe Build join Result Input HT HT Pipeline 1 21 Memory Management and Out-of-Core Algorithms MEMORY SAFE EXECUTION 22 Memory-safe Execution • Challenge of JVM-based data processing systems – OutOfMemoryErrors due to data objects on the heap • Flink runs complex data flows without memory tuning – C++-style memory management – Robust out-of-core algorithms 23 Managed Memory • Active memory management – Workers allocate 70% of JVM memory as byte arrays – Algorithms serialize data objects into byte arrays – In-memory processing as long as data is small enough – Otherwise partial destaging to disk • Benefits – Safe memory bounds (no OutOfMemoryError) – Scales to very large JVMs – Reduced GC pressure 24 Going out-of-core Single-core join of 1KB Java objects beyond memory (4 GB) Blue bars are in-memory, orange bars (partially) out-of-core 25 Native Data Flow Iterations GRAPH ANALYSIS 26 Native Data Flow Iterations • Many graph and ML algorithms require iterations • Flink features native data flow iterations – Loops are not unrolled – But executed as cyclic data flows 0.1 2 0.3 1 0.7 5 • Two types of iterations 0.5 0.4 – Bulk iterations 0.9 3 – Delta iterations 0.2 4 • Performance competitive with specialized systems 27 Iterative Data Flows • Flink runs iterations „natively“ as cyclic data flows – Operators are scheduled once – Data is fed back through backflow channel – Loop-invariant data is cached • Operator state is preserved across iterations! Replace initial interm. interm. join reduce result result result result other datasets 28 45000000 40000000 Delta Iterations 35000000 30000000 25000000 20000000 • Delta iteration computes 15000000 – Delta update of solution set 10000000 5000000 # of elements updated # of elements 0 – Work set for next iteration 1 6 11 16 21 26 31 36 41 46 51 56 61 # of iterations • Work set drives computations of next iteration – Workload of later iterations significantly reduced – Fast convergence • Applicable to certain problem domains – Graph processing 29 Iteration Performance 30 Iterations 61 Iterations (Convergence) PageRank on Twitter Follower Graph 30 Roadmap WHAT IS COMING NEXT? 31 Flink’s Roadmap Mission: Unified stream and batch processing • Exactly-once streaming semantics with flexible state checkpointing • Extending the ML library • Extending graph library • Interactive programs • Integration with Apache Zeppelin (incubating) • SQL on top of expression language • And much more… 32 tl;dr – What’s worth to remember? • Flink is general-purpose analytics system • Unifies streaming and batch processing • Expressive high-level APIs • Robust and fast execution engine 34 I Flink, do you? ;-) If you find this exciting, get involved and start a discussion on Flink‘s ML or stay tuned by subscribing to [email protected] or following @ApacheFlink on Twitter 35 36 BACKUP 37 Data Flow Optimizer • Database-style optimizations for parallel data flows • Optimizes all batch programs • Optimizations – Task chaining – Join algorithms – Re-use partitioning and sorting for later operations – Caching for iterations 38 Data Flow Optimizer val orders = … val lineitems = … val filteredOrders = orders .filter(o => dataFormat.parse(l.shipDate).after(date)) .filter(o => o.shipPrio > 2) val lineitemsOfOrders = filteredOrders .join(lineitems) .where(“orderId”).equalTo(“orderId”) .apply((o,l) => new SelectedItem(o.orderDate, l.extdPrice)) val priceSums = lineitemsOfOrders .groupBy(“orderDate”) .sum(“l.extdPrice”); 39 Data Flow Optimizer Reduce sort[0,1] hash-part [0,1] Best plan depends on Combine relative sizes Reduce partial sort[0,1] of input files sort[0,1] Join Join Hybrid Hash Hybrid Hash buildHT probe buildHT probe broadcast forward hash-part [0] hash-part [0] Filter DataSource Filter DataSource lineitem.tbl lineitem.tbl DataSource DataSource orders.tbl orders.tbl 40.