"Web Age Speaks!" Webinar Series
Getting Started with Apache NiFi Introduction
Mikhail Vladimirov Director, Curriculum Architecture
Web Age Solutions
Providing a broad spectrum of regular and customized training classes in programming, system administration, and architecture to our clients across the world for over ten years
©WebAgeSolutions.com
Agenda
What is Apache NiFi
Apache NiFi concepts
Getting Started with Apache NiFi
What is Apache NiFi
Apache NiFi (short for NiagaraFiles), pronounced [nī fī], is a platform for distributed data flow automation
Based on the "NiagaraFiles" software previously developed by the NSA (the National Security Agency of the USA) and open-sourced as part of its technology transfer program in 2014
Written in Java
Software development and commercial support are currently offered by Hortonworks
Among many other things, NiFi is well positioned to handle Internet of Things (IoT) use cases efficiently
Distributed Data Flow Challenges
System failures due to network problems, data formats and sizes, data ingestion rates (add to the list from your own experience …)
Resource underprovisioning
Compliance and security
Difficult to maintain and upgrade when in production
Some of the NiFi Features
Apache NiFi supports capabilities for data routing, transformation, and system mediation logic for messages (called FlowFiles)
    Think: the distributed ESB
Web-based UI for flow creation and control
Dataflow tracking from beginning to end through data provenance
Backpressure
Guaranteed delivery
Secure
Flow-specific QoS (latency vs throughput, loss tolerance, etc.)
Artifacts for developing custom data processors
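To make the FlowFile and backpressure ideas above more concrete, here is a minimal conceptual sketch in Python. It is not NiFi's Java API: the `FlowFile` and `Connection` classes, the `offer`/`poll` method names, and the threshold of 2 objects are all hypothetical and chosen only for illustration. The idea it shows is that a queue between two processors refuses new items once a configured threshold is reached, so the upstream side has to slow down.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Toy stand-in for a FlowFile: key/value attributes plus a content payload."""
    content: bytes
    attributes: dict = field(default_factory=dict)


class Connection:
    """Hypothetical bounded queue sitting between two processors."""

    def __init__(self, max_objects: int):
        self.max_objects = max_objects  # illustrative backpressure threshold
        self._queue = deque()

    def is_full(self) -> bool:
        return len(self._queue) >= self.max_objects

    def offer(self, flowfile: FlowFile) -> bool:
        """Enqueue if below the threshold; return False to signal backpressure."""
        if self.is_full():
            return False
        self._queue.append(flowfile)
        return True

    def poll(self):
        """Hand the oldest queued FlowFile to the downstream side, or None if empty."""
        return self._queue.popleft() if self._queue else None


conn = Connection(max_objects=2)
# The upstream side tries to push three FlowFiles; the third is refused.
accepted = [
    conn.offer(FlowFile(b"reading-%d" % i, {"filename": f"r{i}.txt"}))
    for i in range(3)
]

# Once the downstream side drains one item, the upstream side may continue.
drained = conn.poll()
accepted_after_drain = conn.offer(FlowFile(b"reading-3", {"filename": "r3.txt"}))
```

In this toy model the producer simply sees a refusal and can retry later; the point is only that the threshold lives on the link between processors rather than inside either processor.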
NiFi Architecture (Single Node)
Source: Apache NiFi
NiFi Clustered Architecture
Source: Apache NiFi
The Web UI
The interactive NiFi workflow construction and monitoring environment
Adding a Processor
Processor Types
Data Transformation: CompressContent, ConvertCharacterSet, EncryptContent, ReplaceText, TransformXml
Routing and Mediation: ControlRate, DetectDuplicate, DistributeLoad, MonitorActivity, RouteOnContent, ScanContent
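As a rough illustration of what a content-based routing processor does, the Python sketch below sends a message to the first named route whose regular expression matches its content. This is a simplified assumption for teaching purposes, not the implementation of RouteOnContent: the function name `route_on_content`, the route names, and the patterns are all hypothetical, and the first-match-wins rule is a choice made here to keep the example short.

```python
import re


def route_on_content(content: str, routes: dict) -> str:
    """Return the name of the first route whose regex matches the content.

    `routes` maps a route name to a regular expression. Content that matches
    no pattern falls through to a catch-all route named "unmatched".
    """
    for name, pattern in routes.items():
        if re.search(pattern, content):
            return name
    return "unmatched"


# Hypothetical routes for log lines.
routes = {
    "errors": r"\bERROR\b",
    "warnings": r"\bWARN\b",
}

error_route = route_on_content("2014-11-01 ERROR disk full", routes)
warn_route = route_on_content("2014-11-01 WARN queue is filling up", routes)
other_route = route_on_content("2014-11-01 INFO started", routes)
```

Each returned name plays the role of an outgoing link from the processor, so downstream processing can differ per route.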
Processors … Wait, there is more!
Database access
Attribute extraction
System integration
Data ingestion: GetFile, GetSFTP, GetJMSQueue, GetHTTP, GetHDFS, FetchS3Object
Data Egress / Sending Data: PutEmail, PutSFTP, PutSQL, …
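The ingestion and egress categories above form the two ends of a flow, with transformation in between. The Python sketch below imitates that shape with plain functions and temporary directories. It is only an analogy under stated assumptions: `ingest`, `transform`, `egress`, and `run_flow` are hypothetical names, the "temp=" to "temperature=" rewrite is an invented example, and none of this reflects how the processors listed above are implemented.

```python
import tempfile
from pathlib import Path


def ingest(source_dir: Path):
    """Yield (name, text) for each .txt file, loosely like a file-ingestion step."""
    for path in sorted(source_dir.glob("*.txt")):
        yield path.name, path.read_text()


def transform(text: str) -> str:
    """Rewrite the content, loosely like a text-replacement step."""
    return text.replace("temp=", "temperature=")


def egress(target_dir: Path, name: str, text: str) -> None:
    """Write the result to the target directory, loosely like a sending step."""
    (target_dir / name).write_text(text)


def run_flow(source_dir: Path, target_dir: Path) -> int:
    """Chain the three steps and return the number of files processed."""
    count = 0
    for name, text in ingest(source_dir):
        egress(target_dir, name, transform(text))
        count += 1
    return count


# Demo with throwaway directories so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    (Path(src) / "sensor.txt").write_text("temp=21")
    processed = run_flow(Path(src), Path(dst))
    result = (Path(dst) / "sensor.txt").read_text()
```

In NiFi the equivalent wiring is done by dragging processors onto the canvas and connecting them in the web UI rather than by writing code.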
Configure a Processor
Building a Simple Data Flow
Data Provenance
Source: Apache NiFi
Provenance Lineage Graph
Source: Apache NiFi
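One way to think about a lineage graph is as the ordered list of events recorded for a single piece of data as it moves through the flow. The Python sketch below models that idea under simplifying assumptions: `ProvenanceEvent` and `ProvenanceLog` are hypothetical classes, the identifiers "ff-1" and "ff-2" are made up, and real lineage can branch when data is forked or joined, which this linear toy model ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvenanceEvent:
    """One recorded step in the life of a tracked item."""
    flowfile_id: str
    event_type: str
    component: str


class ProvenanceLog:
    """Hypothetical append-only log of provenance events."""

    def __init__(self):
        self._events = []

    def record(self, flowfile_id: str, event_type: str, component: str) -> None:
        self._events.append(ProvenanceEvent(flowfile_id, event_type, component))

    def lineage(self, flowfile_id: str) -> list:
        """Return the events for one item in the order they were recorded."""
        return [e for e in self._events if e.flowfile_id == flowfile_id]


log = ProvenanceLog()
log.record("ff-1", "RECEIVE", "GetFile")
log.record("ff-2", "RECEIVE", "GetFile")
log.record("ff-1", "CONTENT_MODIFIED", "ReplaceText")
log.record("ff-1", "SEND", "PutSFTP")

# The lineage of ff-1, rendered as "event@component" steps.
path = [f"{e.event_type}@{e.component}" for e in log.lineage("ff-1")]
```

Reading `path` from left to right gives the beginning-to-end trace that the lineage view draws as a graph.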
Status History
Source: Apache NiFi
Q & A
Q? & A!