Predictive Analytics with SageMaker

Steve Shirkey Specialist SA, AWS (Singapore)

© 2018 , Inc. or its Affiliates. All rights reserved. 2 Explosion in AI and ML Use Cases

Image recognition and tagging for photo organization

Object detection, tracking and navigation for Autonomous Vehicles

Speech recognition & synthesis in Intelligent Voice Assistants

Algorithmic trading strategy performance improvement

Sentiment analysis for targeted advertisements

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. ~1997

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Thousands of Amazon Engineers Focused on

Fulfillment & Search & Existing New At logistics discovery products products AWS

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. © 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. © 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved.

© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. © 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved.

© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Over 20 years of AI at Amazon…

• Applied research • Q&A systems • Core research • Supply chain optimization • Alexa • Advertising • Demand forecasting • Machine translation • Risk analytics • Video content analysis • Search • Robotics • Recommendations • Lots of computer vision… • AI services • NLP/NLU

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. ML @ AWS Put machine learning in the OUR MISSION hands of every developer and data scientist

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Customers Running Machine Learning On AWS Today

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. The Amazon Machine Learning Stack

APPLICATION SERVICES

Rekognition Transcribe Translate Polly Comprehend Lex

PLATFORM SERVICES

Amazon SageMaker Amazon EMR (Spark ML) AWS DeepLens

FRAMEWORKS & INTERFACES AWS AMIs Apache Caffe2 CNTK PyTorch TensorFlow Keras Gluon MXNet

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Let’s Review the ML Process

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Business Problem – The Machine Learning Process

Re-training ML problem framing Data Collection

Data Integration

Monitoring & Data Preparation & Feature Engineering Debugging Cleaning

Data Augmentation Data Model Training & Parameter Tuning Data Visualization & – Predictions Analysis

Model Evaluation Model Deployment

No Yes Augmentation Feature Are Business Goals met?

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Business Problem – Discovery: The Analysts

Re-training ML problem framing Data Collection

Data Integration

Monitoring & • Help formulate the right Data Preparation & Feature Engineering Debugging questions Cleaning • Domain Knowledge

Data Augmentation Data Model Training & Parameter Tuning Data Visualization & – Predictions Analysis

Model Evaluation Model Deployment

No Yes Augmentation Feature Are Business Goals met?

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Business Problem – Integration: The Data Architecture

Retraining ML problem framing Data Collection

Data Integration

Monitoring & • Build the data platform: Data Preparation & Feature Engineering Debugging • Cleaning • AWS Glue • Amazon Athena

Data Augmentation Data Model Training & • Amazon EMR Parameter Tuning – Predictions • Data Visualization & Spectrum Analysis

Model Evaluation Model Deployment

No Yes Augmentation Feature Are Business Goals met?

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Why We built Amazon SageMaker: Model Training Undifferentiated Heavy Lifting

• Setup and manage Notebook Environments

• Setup and manage Feature Engineering Training Clusters

• Write Data Connectors Model Training & Parameter Tuning Data Visualization & • Scale ML algorithms to Analysis large datasets Model Evaluation

• Distribute ML training algorithm to multiple machines

• Secure Model artifacts

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Why We built Amazon SageMaker: Model Deployment Undifferentiated Heavy Lifting Business Problem –

• Setup and manage Model Inference Clusters

Monitoring & • Manage and Scale Model Debugging Inference APIs

• Monitor and Debug Model – Predictions Predictions

• Models versioning and Model Deployment performance tracking

• Automate New Model version promotion to production (A/B testing)

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon SageMaker

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon SageMaker

Easily build, train, and deploy machine learning models

Collect and Choose and Set up and Train and tune Deploy model Scale and prepare training optimize your manage model in production manage the data ML algorithm environments for (trial and error) production training environment

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon SageMaker

Easily build, train, and deploy machine learning models

Pre-built Built-in, high notebooks performance Set up and Train and Scale and Deploy model for common algorithms manage tune model manage the problems environments (trial and in production production for training error) environment

BUILD

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Training ML Models Using Amazon SageMaker

SageMaker Built-in Algorithms k-Means Clustering PCA Neural Topic Modelling Factorisation Machines Linear Learner XGBoost Latent Dirichlet Allocation Image Classification Seq2Seq DeepAR Forecasting BlazingText () Random Cut Forest k-Nearest Neighbor Object Detection

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Training ML Models Using Amazon SageMaker

SageMaker Built-in Bring Your Own Algorithms Algorithms

K-means Clustering ML Algorithms PCA R Neural Topic Modelling MXNet Factorisation Machines TensorFlow Linear Learner – Regression Caffe XGBoost PyTorch Latent Dirichlet Allocation Keras Image Classification CNTK Seq2Seq … Linear Learner – Classification DeepAR Forecasting

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Training ML Models Using Amazon SageMaker

SageMaker Built-in Bring Your SageMaker Algorithms Own Framework SDKs Algorithms K-means Clustering ML Algorithms TensorFlow SDK PCA R MXNet (Gluon) SDK Neural Topic Modelling MXNet Chainer SDK Factorisation Machines TensorFlow PyTorch SDK Linear Learner – Regression Caffe XGBoost PyTorch Latent Dirichlet Allocation Keras Image Classification CNTK Seq2Seq … Linear Learner – Classification DeepAR Forecasting

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Training ML Models Using Amazon SageMaker

SageMaker Built-in Bring Your SageMaker Apache Spark Algorithms Own Framework SDK Estimator Algorithms TensorFlow SDK Apache Spark Python library K-means Clustering ML Algorithms MXNet (Gluon) SDK Apache Spark Scala library PCA R Chainer SDK Neural Topic Modelling MXNet PyTorch SDK Factorisation Machines TensorFlow Linear Learner – Regression Caffe XGBoost PyTorch Latent Dirichlet Allocation Keras Image Classification CNTK Amazon EMR Seq2Seq … Linear Learner – Classification DeepAR Forecasting

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon SageMaker

Easily build, train, and deploy machine learning models

Pre-built Built-in, high One-click Hyperparameter training tuning notebooks performance Scale and for common algorithms Deploy model manage the problems in production production environment

BUILD TRAIN

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon SageMaker

Easily build, train, and deploy machine learning models

Pre-built Built-in, high One-click Hyperparameter One-click Fully managed notebooks performance training tuning deployment hosting with for common algorithms auto-scaling problems

BUILD TRAIN DEPLOY

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Reference Architecture Prepare Train & Optimise Deploy

SageMaker Training Notebooks Algorithm

Algorithm Raw Prepared Container User HPO Data Data Interactions

Training Data Inference requests

Amazon S3 SageMaker SageMaker Training Hosting AWS API Trained Lambda Gateway Model Trained Model

Amazon S3 © 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Walkthrough: SageMaker Console

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon SageMaker Architecture

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Client application

Amazon SageMaker Amazon ECR

Training code

Model Training (on EC2)

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Client application

Amazon SageMaker Amazon ECR Trainingdata Training code Training code Helper code

Model Training (on EC2)

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Client application

Amazon SageMaker Amazon ECR

Inference code

Modelartifacts Trainingdata Training code Training code Helper code

Model Training (on EC2)

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Client application

Amazon SageMaker Amazon ECR

Inference code Helper code Inference code

Model Hosting (on EC2)

Modelartifacts Trainingdata Training code Training code Helper code

Model Training (on EC2)

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Client application

Inference response Inference request Amazon SageMaker Amazon ECR

Inference Endpoint

Inference code Helper code Inference code

Model Hosting (on EC2)

Modelartifacts Trainingdata Training code Training code Helper code

Model Training (on EC2)

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Client application

Inference response Inference request Amazon SageMaker Amazon ECR

Inference Endpoint Ground Ground Truth

Inference code Helper code Inference code

Model Hosting (on EC2)

Modelartifacts Trainingdata Training code Training code Helper code

Model Training (on EC2)

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Hyperparameter Tuning

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Photo credit: https://www.flickr.com/photos/ceasedesist/5821282085 (Licensed for commercial use) © 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. © 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Option 1: Grid Search Regularization

Learning Rate

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. © 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Option 2: Random Search

“Compared with neural networks configured by a pure grid search, we find that random search over the same domain is able to find models that are as

Regularization good or better within a small fraction of the computation time.” Learning Rate

Bergstra, Bengio, “Random Search for Hyper-Parameter Optimization” https://dl.acm.org/citation.cfm?id=2188395

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. © 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Option 3: Bayesian Optimization Papers

• A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical (https://arxiv.org/abs/1012.2599)

• Practical Bayesian Optimization of Machine Learning Algorithms (https://arxiv.org/abs/1206.2944)

• Taking the Human Out of the Loop: A Review of Bayesian Optimization (https://ieeexplore.ieee.org/document/7352306)

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Hyperparameter Tuning in Amazon SageMaker

Amazon SageMaker’s Hyperparameter Tuning feature is based on an implementation of Bayesian Optimization, along with some additional optimizations

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Walkthrough: Hyperparameter Tuning

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Additional Resources

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Next Steps with Amazon SageMaker

• Getting started with Amazon SageMaker: • https://aws.amazon.com/sagemaker/

• Use the Amazon SageMaker SDK: • For Python: https://github.com/aws/sagemaker-python-sdk • For Spark: https://github.com/aws/sagemaker-spark

• SageMaker Code Samples / Workshops: • https://github.com/awslabs/amazon-sagemaker-examples • https://github.com/awslabs/amazon-sagemaker-workshop

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon ML Solutions Lab

Leverage Amazon experts with decades of ML experience with technologies like , , Prime Air and

Amazon ML Solutions Lab provides ML expertise Brainstorming Modeling Teaching

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. AWS Training Offer

Make your data driven decisions count, and make a career in Big Data on AWS Certified Big Data - Specialty AWS. Follow the Big Data Specialty learning path and become a specialist in Big Data: • Implement core AWS Big Data services according to best practices Big Data on AWS – 3-day Classroom Training • Design and maintain Big Data • Leverage tools to automate data analysis Free AWS digital training: Big Data Technology Fundamentals Who should attend

• Enterprise solutions • Big Data solutions architects architects

Certified Cloud • Data scientists • Data analysts Associate-level Certification Practitioner

Visit www.aws.training to find out more. Free AWS digital training: Foundational knowledge

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Q&A

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Thank You For Attending AWS Data Driven Decisions Webinar Series.

We hope you found it interesting! A kind reminder to complete the survey. Let us know what you thought of today’s event and how we can improve the event experience for you in the future.

[email protected] youtube.com/user/AmazonWebServices

twitter.com/AWSCloud slideshare.net/AmazonWebServices

facebook.com/AmazonWebServices .tv/aws

© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved.