Predictive Analytics with Amazon SageMaker
Steve Shirkey Specialist SA, AWS (Singapore)
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. 2 Explosion in AI and ML Use Cases
Image recognition and tagging for photo organization
Object detection, tracking and navigation for Autonomous Vehicles
Speech recognition & synthesis in Intelligent Voice Assistants
Algorithmic trading strategy performance improvement
Sentiment analysis for targeted advertisements
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. ~1997
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Thousands of Amazon Engineers Focused on Machine Learning
Fulfillment & Search & Existing New At logistics discovery products products AWS
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. © 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. © 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved.
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. © 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved.
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Over 20 years of AI at Amazon…
• Applied research • Q&A systems • Core research • Supply chain optimization • Alexa • Advertising • Demand forecasting • Machine translation • Risk analytics • Video content analysis • Search • Robotics • Recommendations • Lots of computer vision… • AI services • NLP/NLU
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. ML @ AWS Put machine learning in the OUR MISSION hands of every developer and data scientist
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Customers Running Machine Learning On AWS Today
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. The Amazon Machine Learning Stack
APPLICATION SERVICES
Rekognition Transcribe Translate Polly Comprehend Lex
PLATFORM SERVICES
Amazon SageMaker Amazon EMR (Spark ML) Amazon Mechanical Turk AWS DeepLens
FRAMEWORKS & INTERFACES AWS Deep Learning AMIs Apache Caffe2 CNTK PyTorch TensorFlow Chainer Keras Gluon MXNet
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Let’s Review the ML Process
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Business Problem – The Machine Learning Process
Re-training ML problem framing Data Collection
Data Integration
Monitoring & Data Preparation & Feature Engineering Debugging Cleaning
Data Augmentation Data Model Training & Parameter Tuning Data Visualization & – Predictions Analysis
Model Evaluation Model Deployment
No Yes Augmentation Feature Are Business Goals met?
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Business Problem – Discovery: The Analysts
Re-training ML problem framing Data Collection
Data Integration
Monitoring & • Help formulate the right Data Preparation & Feature Engineering Debugging questions Cleaning • Domain Knowledge
Data Augmentation Data Model Training & Parameter Tuning Data Visualization & – Predictions Analysis
Model Evaluation Model Deployment
No Yes Augmentation Feature Are Business Goals met?
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Business Problem – Integration: The Data Architecture
Retraining ML problem framing Data Collection
Data Integration
Monitoring & • Build the data platform: Data Preparation & Feature Engineering Debugging • Amazon S3 Cleaning • AWS Glue • Amazon Athena
Data Augmentation Data Model Training & • Amazon EMR Parameter Tuning – Predictions • Amazon Redshift Data Visualization & Spectrum Analysis
Model Evaluation Model Deployment
No Yes Augmentation Feature Are Business Goals met?
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Why We built Amazon SageMaker: Model Training Undifferentiated Heavy Lifting
• Setup and manage Notebook Environments
• Setup and manage Feature Engineering Training Clusters
• Write Data Connectors Model Training & Parameter Tuning Data Visualization & • Scale ML algorithms to Analysis large datasets Model Evaluation
• Distribute ML training algorithm to multiple machines
• Secure Model artifacts
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Why We built Amazon SageMaker: Model Deployment Undifferentiated Heavy Lifting Business Problem –
• Setup and manage Model Inference Clusters
Monitoring & • Manage and Scale Model Debugging Inference APIs
• Monitor and Debug Model – Predictions Predictions
• Models versioning and Model Deployment performance tracking
• Automate New Model version promotion to production (A/B testing)
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon SageMaker
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon SageMaker
Easily build, train, and deploy machine learning models
Collect and Choose and Set up and Train and tune Deploy model Scale and prepare training optimize your manage model in production manage the data ML algorithm environments for (trial and error) production training environment
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon SageMaker
Easily build, train, and deploy machine learning models
Pre-built Built-in, high notebooks performance Set up and Train and Scale and Deploy model for common algorithms manage tune model manage the problems environments (trial and in production production for training error) environment
BUILD
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Training ML Models Using Amazon SageMaker
SageMaker Built-in Algorithms k-Means Clustering PCA Neural Topic Modelling Factorisation Machines Linear Learner XGBoost Latent Dirichlet Allocation Image Classification Seq2Seq DeepAR Forecasting BlazingText (word2vec) Random Cut Forest k-Nearest Neighbor Object Detection
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Training ML Models Using Amazon SageMaker
SageMaker Built-in Bring Your Own Algorithms Algorithms
K-means Clustering ML Algorithms PCA R Neural Topic Modelling MXNet Factorisation Machines TensorFlow Linear Learner – Regression Caffe XGBoost PyTorch Latent Dirichlet Allocation Keras Image Classification CNTK Seq2Seq … Linear Learner – Classification DeepAR Forecasting
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Training ML Models Using Amazon SageMaker
SageMaker Built-in Bring Your SageMaker Algorithms Own Framework SDKs Algorithms K-means Clustering ML Algorithms TensorFlow SDK PCA R MXNet (Gluon) SDK Neural Topic Modelling MXNet Chainer SDK Factorisation Machines TensorFlow PyTorch SDK Linear Learner – Regression Caffe XGBoost PyTorch Latent Dirichlet Allocation Keras Image Classification CNTK Seq2Seq … Linear Learner – Classification DeepAR Forecasting
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Training ML Models Using Amazon SageMaker
SageMaker Built-in Bring Your SageMaker Apache Spark Algorithms Own Framework SDK Estimator Algorithms TensorFlow SDK Apache Spark Python library K-means Clustering ML Algorithms MXNet (Gluon) SDK Apache Spark Scala library PCA R Chainer SDK Neural Topic Modelling MXNet PyTorch SDK Factorisation Machines TensorFlow Linear Learner – Regression Caffe XGBoost PyTorch Latent Dirichlet Allocation Keras Image Classification CNTK Amazon EMR Seq2Seq … Linear Learner – Classification DeepAR Forecasting
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon SageMaker
Easily build, train, and deploy machine learning models
Pre-built Built-in, high One-click Hyperparameter training tuning notebooks performance Scale and for common algorithms Deploy model manage the problems in production production environment
BUILD TRAIN
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon SageMaker
Easily build, train, and deploy machine learning models
Pre-built Built-in, high One-click Hyperparameter One-click Fully managed notebooks performance training tuning deployment hosting with for common algorithms auto-scaling problems
BUILD TRAIN DEPLOY
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Reference Architecture Prepare Train & Optimise Deploy
SageMaker Training Notebooks Algorithm
Algorithm Raw Prepared Container User HPO Data Data Interactions
Training Data Inference requests
Amazon S3 SageMaker SageMaker Training Hosting AWS API Trained Lambda Gateway Model Trained Model
Amazon S3 © 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Walkthrough: SageMaker Console
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon SageMaker Architecture
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Client application
Amazon SageMaker Amazon ECR
Training code
Model Training (on EC2)
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Client application
Amazon SageMaker Amazon ECR Trainingdata Training code Training code Helper code
Model Training (on EC2)
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Client application
Amazon SageMaker Amazon ECR
Inference code
Modelartifacts Trainingdata Training code Training code Helper code
Model Training (on EC2)
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Client application
Amazon SageMaker Amazon ECR
Inference code Helper code Inference code
Model Hosting (on EC2)
Modelartifacts Trainingdata Training code Training code Helper code
Model Training (on EC2)
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Client application
Inference response Inference request Amazon SageMaker Amazon ECR
Inference Endpoint
Inference code Helper code Inference code
Model Hosting (on EC2)
Modelartifacts Trainingdata Training code Training code Helper code
Model Training (on EC2)
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Client application
Inference response Inference request Amazon SageMaker Amazon ECR
Inference Endpoint Ground Ground Truth
Inference code Helper code Inference code
Model Hosting (on EC2)
Modelartifacts Trainingdata Training code Training code Helper code
Model Training (on EC2)
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Hyperparameter Tuning
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Photo credit: https://www.flickr.com/photos/ceasedesist/5821282085 (Licensed for commercial use) © 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. © 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Option 1: Grid Search Regularization
Learning Rate
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. © 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Option 2: Random Search
“Compared with neural networks configured by a pure grid search, we find that random search over the same domain is able to find models that are as
Regularization good or better within a small fraction of the computation time.” Learning Rate
Bergstra, Bengio, “Random Search for Hyper-Parameter Optimization” https://dl.acm.org/citation.cfm?id=2188395
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. © 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Option 3: Bayesian Optimization Papers
• A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning (https://arxiv.org/abs/1012.2599)
• Practical Bayesian Optimization of Machine Learning Algorithms (https://arxiv.org/abs/1206.2944)
• Taking the Human Out of the Loop: A Review of Bayesian Optimization (https://ieeexplore.ieee.org/document/7352306)
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Hyperparameter Tuning in Amazon SageMaker
Amazon SageMaker’s Hyperparameter Tuning feature is based on an implementation of Bayesian Optimization, along with some additional optimizations
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Walkthrough: Hyperparameter Tuning
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Additional Resources
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Next Steps with Amazon SageMaker
• Getting started with Amazon SageMaker: • https://aws.amazon.com/sagemaker/
• Use the Amazon SageMaker SDK: • For Python: https://github.com/aws/sagemaker-python-sdk • For Spark: https://github.com/aws/sagemaker-spark
• SageMaker Code Samples / Workshops: • https://github.com/awslabs/amazon-sagemaker-examples • https://github.com/awslabs/amazon-sagemaker-workshop
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon ML Solutions Lab
Leverage Amazon experts with decades of ML experience with technologies like Amazon Echo, Amazon Alexa, Prime Air and Amazon Go
Amazon ML Solutions Lab provides ML expertise Brainstorming Modeling Teaching
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. AWS Training Offer
Make your data driven decisions count, and make a career in Big Data on AWS Certified Big Data - Specialty AWS. Follow the Big Data Specialty learning path and become a specialist in Big Data: • Implement core AWS Big Data services according to best practices Big Data on AWS – 3-day Classroom Training • Design and maintain Big Data • Leverage tools to automate data analysis Free AWS digital training: Big Data Technology Fundamentals Who should attend
• Enterprise solutions • Big Data solutions architects architects
Certified Cloud • Data scientists • Data analysts Associate-level Certification Practitioner
Visit www.aws.training to find out more. Free AWS digital training: Foundational knowledge
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Q&A
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved. Thank You For Attending AWS Data Driven Decisions Webinar Series.
We hope you found it interesting! A kind reminder to complete the survey. Let us know what you thought of today’s event and how we can improve the event experience for you in the future.
[email protected] youtube.com/user/AmazonWebServices
twitter.com/AWSCloud slideshare.net/AmazonWebServices
facebook.com/AmazonWebServices twitch.tv/aws
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved.