SageMaker End-to-End Managed ML Platform

Lee Pang, Kevin Jorissen

© 2017, , Inc. or its Affiliates. All rights reserved. The Amazon AI/DL/ML Stack

APPLICATION SERVICES

Rekognition Transcribe Translate Polly Comprehend Lex

PLATFORM SERVICES

Amazon SageMaker AWS DeepLens

FRAMEWORKS & INTERFACES AWS AMIs Apache Caffe2 CNTK MXNet PyTorch TensorFlow Torch Keras Gluon

INFRASTRUCTURE

GPU (P3) CPU IoT & Edge Mobile

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Platforms

APPLICATION SERVICES

Rekognition Transcribe Translate Polly Comprehend Lex

PLATFORM SERVICES

Amazon SageMaker AWS DeepLens

FRAMEWORKS & INTERFACES AWS Deep Learning AMIs Apache Caffe2 CNTK MXNet PyTorch TensorFlow Torch Keras Gluon

INFRASTRUCTURE

GPU (P3) CPU IoT & Edge Mobile

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon SageMaker

Afully managed service that enables data scientists and developers to quickly and easily build machine-learning based models into production smart applications.

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Machine learning process is hard…

Fetch data 1. Data wrangling • Setup and manage 3. Deployment Monitor/ Clean & Notebook environments debug/refresh format data • Setup and manage • Get data to inference clusters notebooks securely • Manage and auto scale inference APIs • Testing, versioning, and monitoring Prepare & Integrate transform with prod data 2. Experimentation • Setup and manage clusters

Evaluate • Scale/distribute ML model Train model algorithms

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon SageMaker Build, train, and deploy machine learning models at scale

$

End-to-End Zero setup Flexible Model Pay by the second Machine Learning Training Platform

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon SageMaker 1 2 3 4

I I I I Notebook Instances Algorithms ML Training Service ML Hosting Service

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 1 Zero Setup For Exploratory Data Analysis

I “Just add data” Notebook Instances • Recommendations/Personalization • Fraud Detection • Forecasting

• Image Classification • Churn Prediction

• Marketing Email/Campaign Targeting

• Log processing and anomaly detection

• Speech to Text • More…

Access to S3 Data Authoring & ETL Access to AWS Lake Notebooks Database services

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 2 Amazon SageMaker: 10x better algorithms

Training code I Algorithms • Matrix Factorization • Regression • Principal Component Analysis • K-Means Clustering Bring Your Own Script (SM builds the Container) • Gradient Boosted Trees • And More! SM Estimators in Amazon provided Algorithms Apache Spark Bring Your Own Algorithm (You build the Container)

Streaming Train faster, in a Greater datasets, for single pass reliability on cheaper training extremely large datasets

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 3 Managed Distributed Training with Flexibility – Secured

Save Model Artifacts

Fetch Training data

Save Inference Image I Training code ML Training Service Amazon ECR

• Matrix Factorization • Regression • Principal Component Analysis • K-Means Clustering Bring Your Own Script (SM builds the Container) • Gradient Boosted Trees • And More! SM Estimators in Amazon provided Algorithms Apache Spark Bring Your Own Algorithm (You build the Container)

CPU GPU HPO

Fully managed –

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 4 Easy Model Deployment to Amazon SageMaker

Model Artifacts In feren ce E n d p o in t

In stan ceT yp e: c3.4xlarge

In itialIn stan ceC o u n t: 3

ModelName: prod I VariantName: primary

ML Hosting Service In itialV arian tW eig h t: 5 0 30 50

Versions of the same in feren ce co d e saved in ProductionVariant in feren ce co n tain ers. Prod is th e p rim ary one, 50% of the traffic must be served there! 10 10 One-Click!

In feren ce Im ag e

Model versions

EndpointConfiguration Amazon Provided Algorithms Amazon ECR

Amazon SageMaker © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 4 Easy Model Deployment to Amazon SageMaker

ü Auto-Scaling Inference APIs I ML Hosting Service ü A/B Testing (more to come)

ü Low Latency & High Throughput

ü Bring Your Own Model

ü Python SDK

Amazon SageMaker © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Let’s get started!

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Learning Objectives

• End-to-End machine learning with SageMaker

• Deep learning frameworks and distributed training

• Bringing your own model

• Leveraging public datasets

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Prerequisites

AWS Account AWS Region Your own (recommended) with a Choose one of the following for all user or role with full permissions to: resources created in this workshop: • AWS IAM • Oregon (us-west-2) • • N. Virginia (us-east-1) • Amazon SageMaker • Ohio (us-east-2) • Ireland (eu-west-1)

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Lab Content

Download from: https://bit.ly/2HhD2SG

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Setup

1. Create an S3 Bucket: 1. Name: smworkshop-firstname-lastname 2. Region: your region of choice 2. Launch a Notebook instance 1. Region: your region of choice 2. Instance Type: ml.m4.xlarge 3. IAM role: “Create a new role” 4. S3 Bucket: (the one you created above)

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Lab 1

Introduction to Amazon SageMaker and Amazon Algorithms

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon SageMaker – End to End

Fully managed One-click Hyperparameter One-click Pre-built Built-in, high hosting with auto- notebooks for performance training optimization deployment scaling common algorithms problems

BUILD TRAIN DEPLOY

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon SageMaker – End to End

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Lab 2

Distributed Training with TensorFlow

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Distributed GPU State

GPU State

GPU State

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Shared State Local GPU State

Shared State Local GPU State

Local GPU State

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Lab 3

Bringing Your Own Algorithms

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Client application

Amazon SageMaker Amazon ECR

Training code

Model Training (on EC2)

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Client application

Amazon SageMaker Amazon ECR Training data Training code Training code Helper code

Model Training (on EC2)

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Client application

Amazon SageMaker Amazon ECR

In feren ce co d e Model artifacts Training data Training code Training code Helper code

Model Training (on EC2)

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Client application

Amazon SageMaker Amazon ECR

In feren ce co d e Helper code In feren ce co d e

Model Hosting (on EC2) Model artifacts Training data Training code Training code Helper code

Model Training (on EC2)

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Client application

In feren ce resp o n se In feren ce req u est Amazon SageMaker Amazon ECR

In feren ce E n d p o in t

In feren ce co d e Helper code In feren ce co d e

Model Hosting (on EC2) Model artifacts Training data Training code Training code Helper code

Model Training (on EC2)

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Client application

In feren ce resp o n se In feren ce req u est Amazon SageMaker Amazon ECR

In feren ce E n d p o in t Ground Truth

In feren ce co d e Helper code In feren ce co d e

Model Hosting (on EC2) Model artifacts Training data Training code Training code Helper code

Model Training (on EC2)

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Lab 4

Using Public Datasets

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Public Data on AWS AWS hosts a variety of public datasets that anyone can access for free. https://aws.amazon.com/public-datasets/

1000 Genomes Project The 1000 Genomes Project is an international collaboration which has established the most detailed catalogue of human genetic variation, including SNPs, structural variants, and their haplotype context.

https://aws.amazon.com/1000genomes/

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Lab 5

Classifying Buildings in Vietnam MXNet, GPU instances, and Open Map Data

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. A real world example

Bring your Integrated Open Data GPU Based own model Framework Training

Developed by developmentSEED.org https://developmentseed.org/blog/2018/01/19/sagemaker-label-maker-case/

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Clean-up!

Avoid charges for resources you no longer need after this workshop • Endpoints • Notebook instances • S3 Bucket

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Review

ü End-to-End machine learning with SageMaker • Linear Learner binary classification of MNIST

ü Deep learning frameworks and distributed training • TensorFlow CNN on MNIST

ü Bringing your own model • Deploying scikit-learn decision trees

ü Leveraging public datasets • K-means clustering of 1000 Genomes data

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon SageMaker Resources

• Getting started with Amazon SageMaker: https://aws.amazon.com/sagemaker/

• Use the Amazon SageMaker SDK: • For Python: https://github.com/aws/sagemaker-python-sdk • For Spark: https://github.com/aws/sagemaker-spark

• SageMaker Examples: https://github.com/awslabs/amazon-sagemaker-examples

• Let us know what you build!

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.