Predictive Analytics with Amazon SageMaker
Steve Shirkey, Specialist SA, AWS (Singapore)
© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Explosion in AI and ML Use Cases
• Image recognition and tagging for photo organization
• Object detection, tracking and navigation for autonomous vehicles
• Speech recognition & synthesis in intelligent voice assistants
• Algorithmic trading strategy performance improvement
• Sentiment analysis for targeted advertisements

~1997

Thousands of Amazon Engineers Focused on Machine Learning
• Fulfillment & logistics
• Search & discovery
• Existing products
• New products
• At AWS

Over 20 years of AI at Amazon…
• Applied research
• Q&A systems
• Core research
• Supply chain optimization
• Alexa
• Advertising
• Demand forecasting
• Machine translation
• Risk analytics
• Video content analysis
• Search
• Robotics
• Recommendations
• Lots of computer vision…
• AI services
• NLP/NLU

ML @ AWS
OUR MISSION: Put machine learning in the hands of every developer and data scientist

Customers Running Machine Learning On AWS Today
The Amazon Machine Learning Stack
• APPLICATION SERVICES: Rekognition, Transcribe, Translate, Polly, Comprehend, Lex
• PLATFORM SERVICES: Amazon SageMaker, Amazon EMR (Spark ML), Amazon Mechanical Turk, AWS DeepLens
• FRAMEWORKS & INTERFACES: AWS Deep Learning AMIs, Apache MXNet, Caffe2, CNTK, PyTorch, TensorFlow, Chainer, Keras, Gluon

Let’s Review the ML Process

The Machine Learning Process
• Business Problem
• ML problem framing
• Data Collection
• Data Integration
• Data Preparation & Cleaning
• Data Visualization & Analysis
• Feature Engineering
• Model Training & Parameter Tuning
• Model Evaluation
• Are Business Goals met?
  • No: Data Augmentation, Feature Augmentation
  • Yes: Model Deployment
• Predictions
• Monitoring & Debugging
• Re-training

Discovery: The Analysts
• Help formulate the right questions
• Domain knowledge

Integration: The Data Architecture
• Build the data platform:
  • Amazon S3
  • AWS Glue
  • Amazon Athena
  • Amazon EMR
  • Amazon Redshift Spectrum
Why We Built Amazon SageMaker: Model Training
Undifferentiated Heavy Lifting
• Set up and manage notebook environments
• Set up and manage training clusters
• Write data connectors
• Scale ML algorithms to large datasets
• Distribute ML training algorithms to multiple machines
• Secure model artifacts
Process steps shown: Feature Engineering, Model Training & Parameter Tuning, Data Visualization & Analysis, Model Evaluation

Why We Built Amazon SageMaker: Model Deployment
Undifferentiated Heavy Lifting
• Set up and manage model inference clusters
• Manage and scale model inference APIs
• Monitor and debug model predictions
• Model versioning and performance tracking
• Automate new model version promotion to production (A/B testing)
Process steps shown: Business Problem, Model Deployment, Predictions, Monitoring & Debugging

Amazon SageMaker
Easily build, train, and deploy machine learning models
• Collect and prepare training data
• Choose and optimize your ML algorithm
• Set up and manage environments for training
• Train and tune model (trial and error)
• Deploy model in production
• Scale and manage the production environment

Amazon SageMaker
Easily build, train, and deploy machine learning models
• BUILD: Pre-built notebooks for common problems
• BUILD: Built-in, high performance algorithms
• Set up and manage environments for training
• Train and tune model (trial and error)
• Deploy model in production
• Scale and manage the production environment
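The deployment bullet on promoting a new model version through A/B testing can be sketched as an endpoint configuration with two weighted production variants. This is a minimal sketch, not the presenter's method: it only builds the request dictionary for the SageMaker CreateEndpointConfig API and sends nothing to AWS. The config name, model names, instance type, and the 90/10 traffic split are hypothetical values chosen for illustration.

```python
def ab_endpoint_config(config_name, current_model, candidate_model,
                       candidate_share=0.1, instance_type="ml.m4.xlarge"):
    """Build a CreateEndpointConfig request that splits traffic between two models."""
    if not 0.0 < candidate_share < 1.0:
        raise ValueError("candidate_share must be strictly between 0 and 1")

    def variant(name, model, weight):
        # One production variant: a model, its instances, and its traffic weight.
        return {
            "VariantName": name,
            "ModelName": model,
            "InitialInstanceCount": 1,
            "InstanceType": instance_type,
            "InitialVariantWeight": weight,
        }

    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            variant("current", current_model, 1.0 - candidate_share),
            variant("candidate", candidate_model, candidate_share),
        ],
    }


# Hypothetical names; both models are assumed to exist in SageMaker already.
request = ab_endpoint_config("example-ab-config", "example-model-v1", "example-model-v2")

# To submit (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("sagemaker").create_endpoint_config(**request)
```

Traffic is divided in proportion to each variant's weight, so raising `candidate_share` in steps is one way to promote the new version gradually.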
Training ML Models Using Amazon SageMaker

SageMaker Built-in Algorithms
• k-Means Clustering
• PCA
• Neural Topic Modelling
• Factorisation Machines
• Linear Learner (Regression)
• Linear Learner (Classification)
• XGBoost
• Latent Dirichlet Allocation
• Image Classification
• Seq2Seq
• DeepAR Forecasting
• BlazingText (word2vec)
• Random Cut Forest
• k-Nearest Neighbor
• Object Detection

Bring Your Own Algorithms
• ML Algorithms
• R
• MXNet
• TensorFlow
• Caffe
• PyTorch
• Keras
• CNTK
• …

SageMaker Framework SDKs
• TensorFlow SDK
• MXNet (Gluon) SDK
• Chainer SDK
• PyTorch SDK

Apache Spark Estimator
• Apache Spark Python library
• Apache Spark Scala library
• Amazon EMR
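As an illustration of the first option above, training with a built-in algorithm, the following sketch assembles a request for the SageMaker CreateTrainingJob API. It is a sketch under stated assumptions rather than a complete recipe: the job name, role ARN, S3 locations, instance settings, and hyperparameter values are hypothetical, and the training image URI is left as a placeholder because it differs by region and algorithm. Nothing is sent to AWS.

```python
def training_job_request(job_name, image_uri, role_arn, train_s3_uri,
                         output_s3_uri, hyperparameters,
                         instance_type="ml.m4.xlarge", instance_count=1):
    """Build a CreateTrainingJob request for an algorithm packaged as a container image."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,      # container holding the training code
            "TrainingInputMode": "File",     # copy the S3 data to the instance first
        },
        "RoleArn": role_arn,
        # The API expects hyperparameter values as strings.
        "HyperParameters": {k: str(v) for k, v in hyperparameters.items()},
        "InputDataConfig": [{
            "ChannelName": "train",
            "ContentType": "text/csv",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},  # model artifacts land here
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


# All values below are hypothetical placeholders.
job = training_job_request(
    job_name="example-xgboost-job",
    image_uri="<xgboost-training-image-uri-for-your-region>",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    train_s3_uri="s3://example-bucket/prepared/train/",
    output_s3_uri="s3://example-bucket/models/",
    hyperparameters={"objective": "binary:logistic", "num_round": 100, "max_depth": 5},
)

# To submit (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("sagemaker").create_training_job(**job)
```

Raising `instance_count` is how the same request would ask for training distributed across several machines, provided the chosen algorithm supports it.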
Amazon SageMaker
Easily build, train, and deploy machine learning models
• BUILD: Pre-built notebooks for common problems
• BUILD: Built-in, high performance algorithms
• TRAIN: One-click training
• TRAIN: Hyperparameter tuning
• DEPLOY: One-click deployment
• DEPLOY: Fully managed hosting with auto-scaling

Reference Architecture
Stages: Prepare, Train & Optimise, Deploy
Diagram labels: Raw Data, Prepared Data, Training Data, Amazon S3, SageMaker Notebooks, SageMaker Training, Training Algorithm, Algorithm Container, HPO, Trained Model, SageMaker Hosting, AWS Lambda, API Gateway, User Interactions, Inference requests

Walkthrough: SageMaker Console

Amazon SageMaker Architecture
Diagram labels (step 1): Client application, Amazon SageMaker, Amazon ECR, Training code, Model Training (on EC2)
Diagram labels (step 2 adds): Training data, Helper code
Diagram labels (step 3 adds): Inference code, Model artifacts
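The Deploy stage of the reference architecture routes inference requests through API Gateway and AWS Lambda to SageMaker Hosting. A minimal sketch of such a Lambda function follows. It assumes an API Gateway proxy integration (the request body arrives in `event["body"]`), a model that accepts CSV input, and a hypothetical endpoint name. The optional `runtime` argument is an addition for local testing and is not part of the Lambda handler contract.

```python
import json

ENDPOINT_NAME = "example-endpoint"  # hypothetical SageMaker endpoint name


def handler(event, context, runtime=None):
    """Forward the request body to a SageMaker endpoint and return its prediction."""
    if runtime is None:
        # Imported lazily so the module can be loaded and tested without boto3.
        import boto3
        runtime = boto3.client("sagemaker-runtime")

    payload = event.get("body") or ""
    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="text/csv",  # assumes the hosted model accepts CSV rows
        Body=payload,
    )
    # The response body is a stream; read and decode it.
    prediction = response["Body"].read().decode("utf-8")

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"prediction": prediction}),
    }
```

Keeping the Lambda function between API Gateway and the endpoint leaves room for input validation or feature preparation before the model is called.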
Diagram labels (step 4): Client application, Amazon SageMaker, Amazon ECR, Inference code, Helper code, Inference code, Model Hosting (on EC2), Model artifacts, Training data, Training