"Web Age Speaks!" Webinar Series

Getting Started with Apache NiFi Introduction

• Mikhail Vladimirov
• Director, Curriculum Architecture
• [email protected]

• Web Age Solutions
• Providing a broad spectrum of regular and customized training classes in programming, system administration, and architecture to our clients across the world for over ten years

©WebAgeSolutions.com

Agenda

• What is Apache NiFi
• Apache NiFi concepts

Getting Started with Apache NiFi

What is Apache NiFi

• Apache NiFi (short for NiagaraFiles), pronounced [nī fī], is a platform for distributed data flow automation
• Based on the "NiagaraFiles" software previously developed by the NSA (the National Security Agency of the USA) and open-sourced as a part of its technology transfer program in 2014
• Written in Java
• … and commercial support is currently offered by …
• Among many other things, NiFi is well positioned for efficient handling of Internet of Things (IoT) use cases

Distributed Data Flow Challenges

• System failures due to network problems, data formats and sizes, data ingestion rates (add to the list from your own experience …)
• Resource underprovisioning
• Compliance and security
• Difficult to maintain and upgrade when in production

Some of the NiFi Features

• Apache NiFi supports capabilities for data routing, transformation, and system mediation logic for messages (called FlowFiles)
• Think: the distributed ESB
• Web-based UI for flow creation and control
• Dataflow tracking from beginning to end through data provenance
• Backpressure
• Guaranteed delivery
• Secure
• Flow-specific QoS (latency vs throughput, loss tolerance, etc.)
• Artifacts for developing custom data processors
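To make the FlowFile and backpressure ideas above concrete, here is a minimal Python sketch. It is not NiFi code and does not use the NiFi API: the `FlowFile`, `Connection`, and `run_producer` names, and the `backpressure_threshold` parameter, are hypothetical stand-ins chosen for illustration. The assumption is that a FlowFile can be modelled as attributes plus content, and that a connection between two processors is a bounded queue that makes the upstream side pause (rather than drop data) once it fills.

```python
from collections import deque
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class FlowFile:
    """Toy stand-in for a FlowFile: a content payload plus key/value attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)
    flowfile_id: str = field(default_factory=lambda: str(uuid4()))


class Connection:
    """Bounded queue between two processors (hypothetical name and API)."""

    def __init__(self, backpressure_threshold: int):
        self.backpressure_threshold = backpressure_threshold
        self._queue = deque()

    def is_full(self) -> bool:
        # Backpressure applies once the queue reaches the threshold.
        return len(self._queue) >= self.backpressure_threshold

    def offer(self, flowfile: FlowFile) -> bool:
        # Refuse new FlowFiles while backpressure applies; nothing is dropped.
        if self.is_full():
            return False
        self._queue.append(flowfile)
        return True

    def poll(self):
        # Hand the oldest queued FlowFile to the downstream processor.
        return self._queue.popleft() if self._queue else None


def run_producer(connection: Connection, payloads):
    """Enqueue payloads until backpressure applies; return those still pending."""
    pending = list(payloads)
    while pending and connection.offer(FlowFile(content=pending[0])):
        pending.pop(0)
    return pending
```

With a threshold of 2 and three payloads, the producer stops after two and resumes only after the consumer calls `poll()`, which is the behaviour the "Backpressure" bullet refers to in spirit.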

NiFi Architecture (Single Node)

Source: Apache NiFi

NiFi Clustered Architecture

Source: Apache NiFi

The Web UI

• The interactive NiFi workflow construction and monitoring environment

Adding a Processor

Processor Types

• Data Transformation:
    • CompressContent
    • ConvertCharacterSet
    • EncryptContent
    • ReplaceText
    • TransformXml
• Routing and Mediation:
    • ControlRate
    • DetectDuplicate
    • DistributeLoad
    • MonitorActivity
    • RouteOnContent
    • ScanContent
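As an illustration of what a routing processor in the list above does conceptually, the following Python sketch routes a payload to a named relationship based on its content, loosely in the spirit of RouteOnContent. It is a simplified assumption of the behaviour, not the processor's implementation: the `route_on_content` function, the `rules` mapping, and the relationship names used below are all hypothetical.

```python
import re


def route_on_content(content: bytes, rules: dict) -> str:
    """Return the first relationship whose regex is found in the content.

    `rules` maps a relationship name to a regular expression (illustrative
    only). Content that matches no rule goes to the "unmatched" relationship.
    """
    text = content.decode("utf-8", errors="replace")
    for relationship, pattern in rules.items():
        if re.search(pattern, text):
            return relationship
    return "unmatched"


# Hypothetical routing rules: relationship name -> pattern to look for.
rules = {"errors": r"ERROR", "warnings": r"WARN"}
destination = route_on_content(b"2016-01-01 ERROR disk full", rules)
```

In a real flow, each relationship would be wired to a different downstream processor in the web UI; here the function only returns the relationship name.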

Processors … Wait, There Is More!

• Database access
• Attribute extraction
• System integration
• Data ingestion:
    • GetFile
    • GetSFTP
    • GetJMSQueue
    • GetHTTP
    • GetHDFS
    • FetchS3Object
• Data Egress / Sending Data:
    • PutEmail
    • PutSFTP
    • PutSQL
    • …

Configure a Processor

Building a Simple Data Flow

Data Provenance

Source: Apache NiFi

Provenance Lineage Graph

Source: Apache NiFi
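The idea behind a lineage graph can be sketched in a few lines of Python: record an event each time a component touches a FlowFile, then walk parent links backwards to recover the history. This is a rough model under stated assumptions, not NiFi's provenance repository. The `ProvenanceEvent` class and `lineage` function are hypothetical, the event list is assumed to be in chronological order, and the event type and component names in the usage example are only illustrative.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProvenanceEvent:
    """One recorded step in a FlowFile's life (hypothetical structure)."""
    event_type: str                  # e.g. "RECEIVE", "CLONE", "SEND"
    flowfile_id: str
    component: str                   # the processor that emitted the event
    parent_id: Optional[str] = None  # set when this FlowFile derives from another


def lineage(events: List[ProvenanceEvent], flowfile_id: str) -> List[ProvenanceEvent]:
    """Return the events for a FlowFile and all its ancestors, oldest first."""
    # Map each derived FlowFile to the FlowFile it came from.
    parents = {e.flowfile_id: e.parent_id for e in events if e.parent_id}
    chain = set()
    current = flowfile_id
    # Follow parent links; the membership check guards against cycles.
    while current is not None and current not in chain:
        chain.add(current)
        current = parents.get(current)
    # Keep the original (chronological) order of the event list.
    return [e for e in events if e.flowfile_id in chain]


events = [
    ProvenanceEvent("RECEIVE", "ff-1", "GetFile"),
    ProvenanceEvent("CLONE", "ff-2", "RouteOnContent", parent_id="ff-1"),
    ProvenanceEvent("SEND", "ff-2", "PutSFTP"),
    ProvenanceEvent("DROP", "ff-3", "UnrelatedProcessor"),
]
history = lineage(events, "ff-2")
```

One simplification to keep in mind: the sketch includes every event of an ancestor, even those recorded after the child was derived, which a real lineage view would treat more carefully.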

Status History

Source: Apache NiFi

Q & A

Q? & A!
