TRAINING OFFERING

HORTONWORKS DATA PLATFORM (HDP®) ADMINISTRATION FAST TRACK
5 DAYS | FOUNDATION

This five-day training course is designed primarily for systems administrators and platform architects who need to understand HDP cluster capabilities and manage HDP clusters. Topics include understanding HDP capabilities, Apache YARN, HDFS, and other Hadoop ecosystem components. Students will learn how to administer, manage, and monitor HDP clusters.

PREREQUISITES

Students should be familiar with server or platform software concepts and have a basic understanding of system administration.

TARGET AUDIENCE

Students ranging from those who understand server software concepts to system administrators and platform architects who plan to administer HDP clusters.

FORMAT

50% Lecture/Discussion
50% Hands-on Labs

AGENDA SUMMARY

Day 1: Introduction to Hadoop and Ambari
Day 2: Managing HDFS, YARN Architecture and Management
Day 3: The YARN Capacity Scheduler, High Availability, Monitoring and Backups
Day 4: Advanced HDFS & YARN Services
Day 5: Additional HDP Components and Tuning

DAY 1 OBJECTIVES

• Describe Apache Hadoop
• Summarize the Purpose of the Hortonworks Data Platform Software Frameworks
• List Hadoop Cluster Management Choices
• Describe Apache Ambari
• Identify Hadoop Cluster Deployment Options
• Plan for Hadoop Cluster Deployments
• Perform an Interactive HDP Installation using Apache Ambari
• Install Apache Ambari
• Describe the Differences Between Hadoop Users, Hadoop Service Owners and Ambari Users
• Manage Users, Groups and Permissions
• Identify Hadoop Configuration Files
• Summarize Operations of the Web UI Tool
• Manage Hadoop Service Configuration Properties using the Ambari Web UI
• Manage Client Configuration Files Using the Command-line Interface

DAY 1 LABS

• Setting Up the Lab Environment
• Installing HDP
• Managing Apache Ambari Users and Groups
• Managing Hadoop Services

About Hortonworks

Hortonworks is a leading innovator in creating, distributing and supporting enterprise-ready open data platforms. Our mission is to manage the world’s data. We have a single-minded focus on driving innovation in open source communities such as Apache Hadoop, NiFi, and Spark. Our open Connected Data Platforms power Modern Data Applications that deliver actionable intelligence from all data: data-in-motion and data-at-rest. Along with our 1600+ partners, we provide the expertise, training and services that allow our customers to unlock the transformational value of data across any line of business. We are Powering the Future of Data™.

Contact

For further information visit www.hortonworks.com
+1 408 675-0983
+1 855 8-HORTON
INTL: +44 (0) 20 3826 1405
© 2011-2016 Hortonworks Inc. All Rights Reserved.
Privacy Policy | Terms of Service

DAY 2 OBJECTIVES

• Describe the Hadoop Distributed File System (HDFS)
• Perform HDFS Shell Operations
• Use the Ambari Files View
• Use WebHDFS
• Protect Data using HDFS Access Control Lists (ACLs)
• Describe HDFS Architecture and Operation
• Manage HDFS using Ambari Web, NameNode and DataNode UIs
• Manage HDFS using Command-line Tools
• Enable and Manage HDFS Quotas
• Identify Reasons to Add, Replace and Delete Worker Nodes
• Configure and Run HDFS Balancer
• Decommission and Re-commission a Worker Node
• Move a Master Component
• Summarize the Purpose and Benefits of Rack Awareness
• Configure Rack Awareness

DAY 2 LABS

• Using Hadoop Storage
• Using WebHDFS
• Using HDFS Access Control Lists
• Managing Hadoop Storage
• Managing HDFS Quotas
• Adding, Decommissioning, and Re-commissioning Worker Nodes
• Configuring Rack Awareness

DAY 3 OBJECTIVES

• Describe YARN Resource Management
• Summarize YARN Architecture and Operation
• Identify and Use YARN Management Options
• Summarize YARN Response to Component Failure
• Understand the Basics of Running a Sample YARN Application, Including:
  o MapReduce and Tez
• Summarize the Purpose and Operation of the YARN Capacity Scheduler
• Configure and Manage YARN Queues
• Control Access to YARN Queues

DAY 3 LABS

• Managing the YARN Service Using the Apache Ambari Web UI
• Managing the YARN Service Using CLI Commands
• Running Sample YARN Applications
• Setting Up for the Capacity Scheduler
• Managing YARN Containers and Queues
• Managing YARN ACLs and User Limits
• Working with YARN Node Labels

DAY 4 OBJECTIVES

• Summarize the Purpose of NameNode HA
• Configure NameNode HA using Ambari
• Summarize the Purpose of ResourceManager HA
• Configure ResourceManager HA using Ambari
• Summarize the Purpose and Operation of Ambari Metrics
• Describe Features and Benefits of the Ambari Dashboard
• Summarize the Purpose and Operation of Ambari Alerts
• Configure Ambari Alerts
• Summarize Hadoop Backup Considerations
• Enable and Manage HDFS Snapshots
• Copy Data Using DistCp
• Use Snapshots and DistCp Together
• Identify the Purpose and Operation of Heterogeneous HDFS Storage
• Identify HDFS NFS Gateway Use Cases
• Install and Configure an HDFS NFS Gateway
• Summarize the Purpose and Operation of HDFS Centralized Caching

DAY 4 LABS

• Configuring NameNode High Availability
• Configuring ResourceManager High Availability
• Managing Apache Ambari Alerts
• Managing HDFS Snapshots
• Using DistCp
• Configuring HDFS Storage Policies
• Configuring an NFS Gateway
• Configuring HDFS Centralized Cache

DAY 5 OBJECTIVES

• Configure YARN Queues, Tez, and Hive Properties to Support Performance Goals
• Recall Basic Facts About Hive and the Hive Architecture
• Recall the Requirements and Benefits of Hive HA
• Summarize the Hive HA Architecture and Operation
• Configure and Test Hive HA
• Recall the Purpose, Job Types, Structure and Benefits of Oozie
• Install and Configure Oozie using Ambari
• Deploy and Manage a Sample Oozie Workflow
• Identify Characteristics of Ambari Local Versus LDAP Users and Groups
• Integrate Ambari Server with LDAP
• Summarize the Purpose and Benefits of Ambari Blueprints
• Recall the Process Used to Deploy a Cluster Using Ambari Blueprints
• Configure Ambari Blueprints Logical Cluster Configuration Files
• Recall the Definition of an HDP Stack and Interpret its Version Number
• View the Current Stack and Identify Compatible Ambari Software Versions
• Recall the Types and Methods of Upgrades Available in HDP
• Describe the Rolling Upgrade Process, Restrictions, and Pre-Upgrade Checklist
• Perform a Rolling Upgrade Using the Ambari Web UI

DAY 5 LABS

• Configuring Apache Hive High Availability
• Managing Workflows Using
• Integrating Apache Ambari with AD/LDAP
• Automating Cluster Provisioning using Apache Ambari
• Performing an HDP Upgrade

Revised 02/27/2018
