Amazon Glue Studio User Guide

Table of Contents

What is Amazon Glue Studio?
  Features of Amazon Glue Studio
    Visual job editor
    Job script code editor
    Job performance dashboard
    Support for dataset partitioning
  When should I use Amazon Glue Studio?
  Accessing Amazon Glue Studio
  Pricing for Amazon Glue Studio

Setting up
  Sign up for Amazon
  Create an IAM administrator user
  Signing in as an IAM user
  IAM permissions needed for the Amazon Glue Studio user
    Amazon Glue service permissions
    Amazon CloudWatch permissions
  Job-related permissions
    Data source and data target permissions
    Permissions required for deleting jobs
    Amazon Key Management Service permissions
    Additional permissions when using connectors
  Set up IAM permissions for Amazon Glue Studio
  Configuring a VPC for your ETL job
  Populate the Amazon Glue Data Catalog

Tutorial: Getting started
  Prerequisites
  Step 1: Start the job creation process
  Step 2: Edit the data source node in the job diagram
  Step 3: Edit the transform node of the job
  Step 4: Edit the data target node of the job
  Step 5: View the job script
  Step 6: Specify the job details and save the job
  Step 7: Run the job
  Next steps

Creating jobs
  Start the job creation process
  Create jobs that use a connector
  Next steps for creating a job in Amazon Glue Studio

Editing jobs
  Accessing the job diagram editor
  Job editor features
    Using schema previews in the visual job editor
    Using data previews in the visual job editor
    Restrictions when using data previews
  Editing the data source node
    Using Data Catalog tables for the data source
    Using a connector for the data source
    Using files in Amazon S3 for the data source
    Using a streaming data source
  Editing the data transform node
    Overview of mappings and transforms
    Using ApplyMapping to remap data property keys
    Using SelectFields to remove most data property keys
    Using DropFields to keep most data property keys
    Renaming a field in the dataset
    Using Spigot to sample your dataset
    Joining datasets
    Using SplitFields to split a dataset into two
    Overview of SelectFromCollection transform
    Using SelectFromCollection to choose which dataset to keep
    Filtering keys within a dataset
    Find and fill missing values in a dataset
    Using a SQL query to transform data
    Creating a custom transformation
  Configuring data target nodes
    Overview of data target options
    Editing the data target node
  Editing or uploading a job script
    Creating and editing Scala scripts in Amazon Glue Studio
    Creating and editing Python shell jobs in Amazon Glue Studio
  Adding nodes to the job diagram
  Changing the parent nodes for a node in the job diagram
  Deleting nodes from the job diagram

Using connectors and connections
