Apache Apex: Next Gen Big Data Analytics

Apache Apex: Next Gen Big Data Analytics Thomas Weise <[email protected]> @thweise PMC Chair Apache Apex, Architect DataTorrent Apache Big Data Europe, Sevilla, Nov 14th 2016 Stream Data Processing Data Delivery Transform / Analytics Real-time visualization, … Declarative SQL API Data Beam Beam SAMOA Operator SAMOA DAG API Sources Library Events Logs Oper1 Oper2 Oper3 Sensor Data Social Databases CDC (roadmap) 2 Industries & Use Cases Financial Services Ad-Tech Telecom Manufacturing Energy IoT Real-time Call detail record customer facing (CDR) & Supply chain Fraud and risk Smart meter Data ingestion dashboards on extended data planning & monitoring analytics and processing key performance record (XDR) optimization indicators analysis Understanding Reduce outages Credit risk Click fraud customer Preventive & improve Predictive assessment detection behavior AND maintenance resource analytics context utilization Packaging and Improve turn around Asset & Billing selling Product quality & time of trade workforce Data governance optimization anonymous defect tracking settlement processes management customer data HORIZONTAL • Large scale ingest and distribution • Enforcing data quality and data governance requirements • Real-time ELTA (Extract Load Transform Analyze) • Real-time data enrichment with reference data • Dimensional computation & aggregation • Real-time machine learning model scoring 3 Apache Apex • In-memory, distributed stream processing • Application logic broken into components (operators) that execute distributed in a cluster • Unobtrusive Java API to express (custom) logic • Maintain state and metrics in member variables • Windowing, event-time processing • Scalable, high throughput, low latency • Operators can be scaled up or down at runtime according to the load and SLA • Dynamic scaling (elasticity), compute locality • Fault tolerance & correctness • Automatically recover from node outages without having to reprocess from beginning • State is preserved, checkpointing, incremental recovery • End-to-end exactly-once • Operability • System and application metrics, record/visualize data • Dynamic changes and resource allocation, elasticity 4 Native Hadoop Integration • YARN is the resource manager • HDFS for storing persistent state 5 Application Development Model A Stream is a sequence of data Directed Acyclic Graph (DAG) tuples A typical Operator takes one or Filtered Enriched more input streams, performs Stream Stream computations & emits one or more output streams Output Operator • Each Operator is YOUR custom Stream business logic in java, or built-in operator from our open source library Tuple Operator Operator Operator Operator • Operator has many instances that run in parallel and each instance is single-threaded Filtered Enriched Stream Stream Directed Acyclic Graph (DAG) is Operator made up of operators and streams 6 Development Process Kafka Word JDBC Parser Filter Input Counter Output Lines Words Filtered Counts Kafka Database Apex Application • Operators from library or develop for custom logic • Connect operators to form application • Configure operator properties • Configure scaling and other platform attributes • Test functionality, performance, iterate 7 Application Specification DAG API (compositional) Java Stream API (declarative) 8 Developing Operators 9 Operator Library Messaging NoSQL RDBMS • Kafka • Cassandra, HBase • JDBC • JMS (ActiveMQ, …) • Aerospike, Accumulo • MySQL • Kinesis, SQS • Couchbase/ CouchDB • Oracle • Flume, NiFi • Redis, MongoDB • MemSQL • Geode File Systems Parsers Transformations • HDFS/ Hive • XML • Filter, Expression, Enrich • NFS • JSON • Windowing, Aggregation • S3 • CSV • Join • Avro • Dedup • Parquet Analytics Protocols Other • Dimensional Aggregations • HTTP • Elastic Search (with state management for • FTP • Script (JavaScript, Python, R) historical data + query) • WebSocket • Solr • MQTT • Twitter • SMTP 10 Stateful Processing with Event Time Event Stream k=A k=B k=B k=A k=A t=4:00 t=5:00 t=5:59 t=4:30 t=5:00 Processing Time +30s +60s +90s (All) : 1 (All) : 4 (All) : 5 t=4:00 : 1 t=4:00 : 2 t=4:00 : 2 State t=5:00 : 2 t=5:00 : 3 k=A, t=4:00 : 1 k=A, t=4:00 : 2 k=A, t=4:00 : 2 k=A, t=5:00 : 1 K=B, t=5:00 : 2 k=B, t=5:00 : 2 11 Windowing - Apache Beam Model Event-time Session windows Watermarks Accumulation Triggers Keyed or Not Keyed Allowed Lateness Accumulation Mode Merging streams ApexStream<String> stream = StreamFactory .fromFolder(localFolder) .flatMap(new Split()) .window(new WindowOption.GlobalWindow(), new TriggerOption().withEarlyFiringsAtEvery(Duration.millis(1000)).accumulatingFiredPanes()) .countByKey(new ConvertToKeyVal()).print(); 12 Fault Tolerance • Operator state is checkpointed to persistent store ᵒ Automatically performed by engine, no additional coding needed ᵒ Asynchronous and distributed ᵒ In case of failure operators are restarted from checkpoint state • Automatic detection and recovery of failed containers ᵒ Heartbeat mechanism ᵒ YARN process status notification • Buffering to enable replay of data from recovered point ᵒ Fast, incremental recovery, spike handling • Application master state checkpointed ᵒ Snapshot of physical (and logical) plan ᵒ Execution layer change log 13 Checkpointing State . Distributed, asynchronous . No artificial latency . Periodic callbacks . Pluggable storage 14 Buffer Server & Recovery • In-memory PubSub Container 1 Container 2 • Stores results until committed Operator Buffer Operator • Backpressure / spillover to disk 1 Server 2 • Ordering, idempotency Node 1 Node 2 Independent pipelines Downstream Operators reset (can be used for speculative execution) 15 Recovery Scenario sum … EW2, 1, 3, BW2, EW1, 4, 2, 1, BW1 0 sum … EW2, 1, 3, BW2, EW1, 4, 2, 1, BW1 7 sum … EW2, 1, 3, BW2, EW1, 4, 2, 1, BW1 10 sum … EW2, 1, 3, BW2, EW1, 4, 2, 1, BW1 7 16 Processing Guarantees At-least-once • On recovery data will be replayed from a previous checkpoint ᵒ No messages lost ᵒ Default, suitable for most applications • Can be used to ensure data is written once to store ᵒ Transactions with meta information, Rewinding output, Feedback from external entity, Idempotent operations At-most-once • On recovery the latest data is made available to operator ᵒ Useful in use cases where some data loss is acceptable and latest data is sufficient Exactly-once ᵒ At-least-once + idempotency + transactional mechanisms (operator logic) to achieve end-to-end exactly once behavior 17 End-to-End Exactly Once • Important when writing to external systems • Data should not be duplicated or lost in the external system in case of application failures • Common external systems ᵒ Databases ᵒ Files ᵒ Message queues • Exactly-once = at-least-once + idempotency + consistent state • Data duplication must be avoided when data is replayed from checkpoint ᵒ Operators implement the logic dependent on the external system ᵒ Platform provides checkpointing and repeatable windowing 18 Scalability Unifier NxM Partitions Logical DAG Logical Diagram 0 1 2 3 0 1 2 Physical DAG with (1a, 1b, 1c) and (2a, 2b): Bottleneck on intermediate Unifier 1a 2a Physical Diagram with operator 1 with 3 partitions 0 1b Unifier Unifier 3 1 2b 1c Physical DAG with (1a, 1b, 1c) and (2a, 2b): No bottleneck 0 1 Unifier 2 1a Unifier 2a 1 0 1b Unifier 3 Unifier 2b 1c 19 Advanced Partitioning Parallel Partition Cascading Unifiers Logical DAG Logical Plan 0 1 2 3 4 uopr dopr Execution Plan, for N = 4; M = 1 Physical DAG uopr1 1a uopr2 Container 0 Unifier 2 3 4 unifier dopr NIC 1b uopr3 uopr4 Physical DAG with Parallel Partition Execution Plan, for N = 4; M = 1, K = 2 with cascading unifiers uopr1 Container 1a 2a 3a unifier NIC NIC 0 Unifier 4 uopr2 Container unifier dopr NIC 1b 2b 3b uopr3 Container unifier NIC NIC uopr4 20 Dynamic Partitioning 2a 2a 1a 2a 1a 2b 1a 2b 3a 3 3 1b 2b 1b 2c 1b 2c 3b 2d 2d Unifiers not shown • Partitioning change while application is running ᵒ Change number of partitions at runtime based on stats ᵒ Determine initial number of partitions dynamically • Kafka operators scale according to number of kafka partitions ᵒ Supports re-distribution of state when number of partitions change ᵒ API for custom scaler or partitioner 21 How dynamic partitioning works • Partitioning decision (yes/no) by trigger (StatsListener) ᵒ Pluggable component, can use any system or custom metric ᵒ Externally driven partitioning example: KafkaInputOperator • Stateful! ᵒ Uses checkpointed state ᵒ Ability to transfer state from old to new partitions (partitioner, customizable) ᵒ Steps: • Call partitioner • Modify physical plan, rewrite checkpoints as needed • Undeploy old partitions from execution layer • Release/request container resources • Deploy new partitions (from rewritten checkpoint) ᵒ No loss of data (buffered) ᵒ Incremental operation, partitions that don’t change continue processing • API: Partitioner interface 22 Compute Locality • By default operators are deployed in containers (processes) on different nodes across the Hadoop cluster • Locality options for streams ᵒ RACK_LOCAL: Data does not traverse network switches ᵒ NODE_LOCAL: Data transfer via loopback interface, frees up network bandwidth ᵒ CONTAINER_LOCAL: Data transfer via in memory queues between operators, does not require serialization ᵒ THREAD_LOCAL: Data passed through call stack, operators share thread • Host Locality ᵒ Operators can be deployed on specific hosts • (Anti-)Affinity ᵒ Ability to express relative deployment without specifying a host 23 Performance: Throughput vs. Latency? https://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-

Apache Apex: Next Gen Big Data Analytics

Details

Download

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

Support