Portable stateful big data processing in Apache Beam
Kenneth Knowles
Apache Beam PMC Software Engineer @ Google https://s.apache.org/ffsf-2017-beam-state [email protected] / @KennKnowles Flink Forward San Francisco 2017 Agenda
1. What is Apache Beam? 2. State 3. Timers 4. Example & Little Demo What is Apache Beam? TL;DR
(Flink draws it more like this) 4 DAGs, DAGs, DAGs Apache Beam Apache Flink Apache Cloud Hadoop Apache Apache Dataflow Spark Samza MapReduce Apache Apache Apache (paper) Storm Gearpump Apex (incubating) FlumeJava (paper) Heron MillWheel (paper) Dataflow Model (paper)
2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 Apache Flink local, on-prem, The Beam Vision cloud
Cloud Dataflow: Java fully managed input.apply( Apache Spark Sum.integersPerKey()) local, on-prem, cloud
Sum Per Key Apache Apex Python local, on-prem, cloud input | Sum.PerKey() Apache Gearpump (incubating) ⋮ ⋮ 6 Apache Flink local, on-prem, The Beam Vision cloud
Cloud Dataflow: Python fully managed input | KakaIO.read() Apache Spark local, on-prem, cloud
KafkaIO Apache Apex ⋮ local, on-prem, cloud
Apache Java Gearpump (incubating) class KafkaIO extends UnboundedSource { … }
⋮ 7 The Beam Model
PTransform Pipeline
PCollection (bounded or unbounded)
8 The Beam Model
What are you computing? (read, map, reduce)
Where in event time? (event time windowing)
When in processing time are results produced? (triggers)
How do refinements relate? (accumulation mode)
9 What are you computing?
Read ParDo Grouping Composite Parallel connectors to Per element Group by key, Encapsulated external systems "Map" Combine per key, subgraph "Reduce"
10 State What are you computing?
Read ParDo Grouping Composite Parallel connectors to Per element Group by key, Encapsulated external systems "Map" Combine per key, subgraph "Reduce"
12 State for ParDo
ParDo(DoFn)
13 ParDo.of(new DoFn<...>() { "Example" // declare some state
@ProcessElement public void process(...) { // update quantiles // output if needed } }) Per Key Quantiles
14 Partitioned
Quantiles Quantiles Quantiles
15 Parallelism! new DoFn
… }
Quantiles Quantiles Quantiles DoFn DoFn DoFn
16 Windowed
Window into Fixed windows of one hour
Quantiles Quantiles Quantiles DoFn DoFn DoFn
Expected result: Quantiles for each hour
17 Windowed
Window into windows of 30 min sliding by 10 min
Quantiles Quantiles Quantiles DoFn DoFn DoFn
Expected result: Quantiles for 30 minutes sliding by 10 min
18 State is per key and window
...
"x" 3 7 15
"y" "fizz" "7" "fizzbuzz"
...
Bonus: automatically garbage collected when a window expires (vs manual clearing of keyed state) Windowed
Window into Fixed windows of one hour
Quantiles Quantiles Quantiles DoFn DoFn DoFn
Expected result: Quantiles for each hour
20 What about Combine?
Window into Fixed windows of one hour
Quantiles Quantiles Quantiles Combiner Combiner Combiner
Expected result: Quantiles for each hour
21 Combine vs State (naive conceptualization)
Window hourly Window hourly
Quantiles Quantiles Combiner DoFn
Expected result: Expected result: Quantiles for each hour Quantiles for each hour
22 Combine vs State (likely execution plan)
Quantiles Quantiles non-associative* Combiner Combiner non-commutative* Shuffle Elements multi-output Shuffle side outputs Accumulators (user is in control)
Quantiles Quantiles associative Combiner DoFn commutative single-output enables optimizations (engine is in control) Combine vs State
Quantiles Combiner Quantiles DoFn
output governed by trigger
(data/computation unaware) "output only when there's an interesting change"
(data/computation aware) Example: Arbitrary indices
E
B
A
C
D non-associative
Assign indices non-commutative
non-deterministic E,4
B,3 and totally fine!
A,2
C,1
D,0 Kinds of state
● Value - just a mutable cell for a value ● Bag - supports "blind writes" ● Combining - has a CombineFn built in; can support blind writes and lazy accumulation ● Set - membership checking ● Map - lookups and partial writes Timers new DoFn<...>() { Timers for ParDo @TimerId("timeout") private final TimerSpec timeoutTimer = TimerSpecs.timer(TimeDomain.PROCESSING_TIME);
@OnTimer("timout") public void timeout(...) { // access state, set new timers, output } … }
Stateful ParDo
28 Timers in processing time
buffer request
"call me back in 5 On Element seconds"
On Timer
batched RPC
output based only output return on incoming value of batched element RPC Timers in event time
"call me back when store speculative result the watermark hits the end of the On Element window"
On Timer
output output replacement speculative result value only if immediately changed More example uses for state & timers
● Per-key arbitrary numbering ● Output only when result changes ● Tighter "side input" management for slowly changing dimension ● Streaming join-matrix / join-biclique ● Fine-grained combine aggregation and output control ● Per-key "workflows" like user sign up flow w/ expiration ● Low-latency deduplication (let the first through, squash the rest) Performance considerations (cross-runner)
● Shuffle to colocate keys ● Linear processing of elements for key+window ● Window merging ● Storage of state and timers ● GC of state
32 Demo
33 Summary
State and Timers in Beam...
● … unlock new uses cases
● … they "just work" with event time windowing
● … are portable across runners (implementation ongoing) Thank you for listening! This talk: ● Me - @KennKnowles ● These Slides - https://s.apache.org/ffsf-2017-beam-state
Go Deeper ● Design doc - https://s.apache.org/beam-state ● Blog post - https://beam.apache.org/blog/2017/02/13/stateful-processing.html
Join the Beam community: ● User discussions - [email protected] ● Development discussions - [email protected] ● Follow @ApacheBeam on Twitter
You can contribute to Beam + Flink ● New types of state ● Easy launch of Beam job on Flink-on-YARN ● Integration tests at scale ● Fit and finish: polish, polish, polish! ● … and lots more!
https://beam.apache.org