HDF® Stream Developer: 3 Days
TRAINING OFFERING | DEV-371
HDF® STREAM DEVELOPER
3 DAYS

This course is designed for Data Engineers, Data Stewards and Data Flow Managers who need to automate the flow of data between systems and to create real-time applications that ingest and process streaming data sources in Hortonworks DataFlow (HDF) environments. Specific technologies covered include Apache NiFi, Apache Kafka and Apache Storm. The course culminates in an end-to-end exercise that spans this HDF technology stack.

PREREQUISITES
Students should be familiar with programming principles and have previous experience in software development. First-hand experience with Java programming and with developing in an IDE is required. Experience with Linux and a basic understanding of dataflow tools would be helpful. No prior Hadoop experience is required.

TARGET AUDIENCE
Developers, Data & Integration Engineers, and Architects who need to automate data flow between systems and/or develop streaming applications.

FORMAT
50% Lecture/Discussion
50% Hands-on Labs

AGENDA SUMMARY
Day 1: Introduction to HDF Components, Apache NiFi dataflow development
Day 2: Apache Kafka, NiFi integration with HDF/HDP, Apache Storm architecture
Day 3: Storm management options, multi-language support, Kafka integration

DAY 1 OBJECTIVES
• Introduce HDF’s components: Apache NiFi, Apache Kafka, and Apache Storm
• NiFi architecture, features, and characteristics
• NiFi user interface: processors and connections in detail
• NiFi dataflow assembly
• Process Groups and their elements
• Remote Process Groups and their elements
• NiFi Expression Language
• NiFi Attributes
• NiFi Templates

DAY 1 LABS AND DEMONSTRATIONS
• Demonstration: The NiFi user interface
• Building a NiFi dataflow
• Working with Process Groups
• Working with Remote Process Groups
• Exploring NiFi Expression Language
• Demonstration: NiFi Attributes
• Demonstration: NiFi Templates

DAY 2 OBJECTIVES
• Understand NiFi’s Data Provenance
• Kafka introduction and architecture
• Integrating NiFi with other HDF components
• Integrating NiFi with HDP components
• Storm introduction and component model: tuple, stream, topology, spout, and bolt
• Storm runtime elements: nimbus, supervisor, worker process, executor, and task
• Understand the purpose of stream grouping and define its types

About Hortonworks
Hortonworks is a leading innovator at creating, distributing and supporting enterprise-ready open data platforms. Our mission is to manage the world’s data. We have a single-minded focus on driving innovation in open source communities such as Apache Hadoop, NiFi, and Spark. Our open Connected Data Platforms power Modern Data Applications that deliver actionable intelligence from all data: data-in-motion and data-at-rest. Along with our 1600+ partners, we provide the expertise, training and services that allow our customers to unlock the transformational value of data across any line of business.
We are Powering the Future of Data™.

Contact
For further information visit www.hortonworks.com
+1 408 675-0983
+1 855 8-HORTON
INTL: +44 (0) 20 3826 1405

© 2011-2019 Hortonworks Inc. All Rights Reserved.

DAY 2 LABS AND DEMONSTRATIONS
• Demonstration: Data Provenance with NiFi
• Creating and managing Kafka topics
• Integrating NiFi with Kafka
• Integrating NiFi with HDP components
• Creating a Word Count Storm Topology

DAY 3 OBJECTIVES
• List the tools used to manage Apache Storm
• Identify how to interpret the metrics available in the Storm UI console
• Compare and contrast distributed and local mode topology submission
• Explore the polyglot programming options available for Storm development
• Identify the differences between reliable and unreliable operation
• Diagram a tuple tree and identify its branches
• List the two requirements for reliable operation
• Describe how to integrate Storm with Kafka

DAY 3 LABS AND DEMONSTRATIONS
• Demonstration: Management/monitoring UIs
• Submit Storm Topology in local mode
• Develop Storm components with Python
• Integrating Storm with Kafka