HDP Developer: Apache Pig and Hive

Overview

This course is designed for developers who need to create applications to analyze Big Data stored in Apache Hadoop using Pig and Hive. Topics include: Hadoop, YARN, HDFS, MapReduce, data ingestion, workflow definition, and using Pig and Hive to perform data analytics on Big Data. Labs are executed on a 7-node HDP cluster.

Duration

4 days

Target Audience

Software developers who need to understand and develop applications for Hadoop.

Course Objectives

  • Describe Hadoop, YARN and use cases for Hadoop
  • Describe Hadoop ecosystem tools and frameworks
  • Describe the HDFS architecture
  • Use the Hadoop client to input data into HDFS
  • Transfer data between Hadoop and a relational database
  • Explain YARN and MapReduce architectures
  • Run a MapReduce job on YARN
  • Use Pig to explore and transform data in HDFS
  • Use Hive to explore and analyze data sets
  • Understand how Hive tables are defined and implemented
  • Use the new Hive windowing functions
  • Explain and use the various Hive file formats
  • Create and populate a Hive table that uses ORC file formats
  • Use Hive to run SQL-like queries to perform data analysis
  • Use Hive to join datasets using a variety of techniques, including Map-side joins and Sort-Merge-Bucket joins
  • Write efficient Hive queries
  • Create ngrams and context ngrams using Hive
  • Perform data analytics like quantiles and page rank on Big Data using the DataFu Pig library
  • Explain the uses and purpose of HCatalog
  • Use HCatalog with Pig and Hive
  • Define a workflow using Oozie
  • Schedule a recurring workflow using the Oozie Coordinator

Hands-On Labs

  • Lab: Starting an HDP 2.3 Cluster
  • Demo: Block Storage
  • Lab: Using HDFS commands
  • Lab: Importing and Exporting Data in HDFS
  • Lab: Using Flume to import log files into HDFS
  • Demo: MapReduce
  • Lab: Running a MapReduce Job
  • Demo: Apache Pig
  • Lab: Getting started with Apache Pig
  • Lab: Exploring data with Apache Pig
  • Lab: Splitting a dataset
  • Use Sqoop to transfer data between HDFS and an RDBMS
  • Run MapReduce and YARN application jobs
  • Explore and transform data using Pig
  • Split and join a dataset using Pig
  • Use Pig to transform and export a dataset for use with Hive
  • Use HCatLoader and HCatStorer
  • Use Hive to discover useful information in a dataset
  • Describe how Hive queries get executed as MapReduce jobs
  • Perform a join of two datasets with Hive
  • Use advanced Hive features: windowing, views, ORC files
  • Use Hive analytics functions
  • Write a custom reducer in Python
  • Analyze and sessionize clickstream data
  • Compute quantiles of NYSE stock prices
  • Use Hive to compute ngrams on Avro-formatted files
  • Lab: Exploring Spark SQL
  • Lab: Defining an Oozie workflow

Prerequisites

Students should be familiar with programming principles and have experience in software development. SQL knowledge is also helpful. No prior Hadoop knowledge is required.

Format

  • 50% Lecture/Discussion
  • 50% Hands-on Labs

Certification

Hortonworks offers a comprehensive certification program that identifies you as an expert in Apache Hadoop.

About Hortonworks

Hortonworks develops, distributes and supports the only 100 percent open source distribution of Apache Hadoop explicitly architected, built and tested for enterprise-grade deployments.

US: 1.855.846.7866
International: +1.408.916.4121
www.hortonworks.com
5470 Great America Parkway
Santa Clara, CA 95054 USA
