A I M 3 0 7 SageMaker: A deep dive

Shyam Srinivasan Julien Simon Nils Mohr Product Marketing Lead, ML Principal Evangelist, AI/ML Flight Data Recorder Amazon Web Services Programming Analyst British Airways

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. “Great things

team Steve Jobs Agenda

What is Amazon SageMaker?

Why did we build Amazon SageMaker?

Ongoing challenges with and how to overcome them

Demo

Seeing is believing – Amazon SageMaker at British Airways Amazon SageMaker

Machine learning for every developer & data scientist Build, train, and deploy ML models at scale Amazon SageMaker Getting models from concept to production

Pre-built AWS Marketplace Amazon EC2 Amazon Auto-scaling Amazon notebooks for ML P3dn Instances SageMaker Neo infrastructure Elastic Inference

Amazon SageMaker Amazon Managed Spot Open-source Multi-model Operators for Ground Truth SageMaker RL Training runtime endpoints Kubernetes

1 Built-in, Hundreds2 of One-click Model One-click Inference and high-performance notebook3 examples training optimization deployment inter-operability algorithms

Scale and manage Collect and Choose and Train and tune Optimize models Deploy models in the production prepare training optimize your ML models for the cloud and production data algorithm the edge environment Amazon SageMaker Build, train, and deploy machine learning models quickly at scale

ML Algorithms Training & Reinforcement Deployment & Ground Truth Notebooks Neo Marketplace & Frameworks Tuning Learning Hosting

Amazon SageMaker But, what are we seeing as the ongoing challenges to machine learning? Amazon SageMaker Addressing challenges to machine learning

Amazon SageMaker Model Monitor Amazon SageMaker Amazon SageMaker Studio Experiments

Amazon SageMaker Amazon SageMaker Notebooks Autopilot (Preview) Amazon SageMaker Debugger Build, train, and deploy machine learning models quickly at scale

NEW! Amazon SageMaker Studio IDE

NEW! NEW! NEW! NEW! NEW! Ground ML Algorithms SageMaker Training Reinforcement SageMaker SageMaker SageMaker Neo Deployment SageMaker Truth Marketplace and Notebooks and Learning Experiments Debugger Autopilot and Model Frameworks Tuning Hosting Monitor Amazon SageMaker Multiple tools needed for different phases of the ML workflow +

Lack of an integrated experience

+ Machine learning is iterative, Large number of iterations involving dozens of tools and hundreds of iterations =

Cumbersome, lengthy processes, resulting in loss of productivity NEW Introducing Amazon SageMaker Studio The first fully integrated development environment (IDE) for machine learning

Share scalable notebooks Organize, track, and Get accurate models for Automatically debug Code, build, train, deploy without tracking code compare thousands of with full visibility & control errors, monitor models & & monitor in a unified dependencies experiments without writing code maintain high quality visual interface Set up and manage resources + Collaboration across multiple data scientists + Different data science Data science and collaboration projects have different needs to be easy resource needs = Managing notebooks and collaborating across multiple data scientists is highly complicated NEW Introducing Amazon SageMaker Notebooks (Available in Preview) Fast-start shareable notebooks

Start your notebooks Share your notebooks Dial up or down Access your notebooks in Administrators manage without spinning up as a URL with a single click compute resources seconds with your access and permissions compute resources corporate credentials Thousands of experiments

+ Hundreds of parameters per experiment Managing trials and experiments + is cumbersome Compare and evaluate

=

Very cumbersome and error-prone NEW Introducing Amazon SageMaker Experiments Organize, track, and compare training experiments

Tracking at scale Custom organization Visualization Metrics and logging Fast Iteration Track parameters & Easily visualize Organize experiments by Log custom metrics using Quickly go back & forth metrics across experiments and teams, goals & hypotheses the Python SDK & APIs & identify the best experiments & users compare performing experiments Large neural networks with many layers

+

Data capture with many connections Analysis & debugging + Additional tooling for analysis while training ML models is painful and debug

=

Extraordinarily difficult to inspect, debug, and profile the "black box" NEW Introducing Amazon SageMaker Debugger Analysis & debugging, explainability, and alert generation

Relevant data Data analysis & Automatic error Improved productivity Visual analysis capture debugging detection with alerts and debugging Visually analyze & Data is automatically Analyze & debug data with Take corrective action Errors are automatically debug from Amazon captured for analysis based on alerts no code changes detected based on rules SageMaker Studio Concept drift due to divergence of data

+ Model performance can change due to Deploying a model is not the end unknown factors + You need to continuously monitor Continuous monitoring involves a models in production and iterate lot of tooling and expense

=

Model monitoring is cumbersome but critical NEW Introducing Amazon SageMaker Model Monitor Continuous monitoring of models in production

Amazon Continuous Flexibility Visual Automatic data CloudWatch monitoring with rules data analysis collection integration See monitoring results, Define a monitoring Use built-in rules to Automate corrective Data is automatically data statistics, and schedule and detect detect data drift or write actions based on Amazon collected from your violation reports in changes in quality against your own rules for CloudWatch alerts endpoints Amazon SageMaker a pre-defined baseline custom analysis Studio Largely explorative & iterative + Requires broad and complete Successful ML requires knowledge of ML domain complex, hard to discover + combinations Lack of visibility = of algorithms, data, parameters Time-consuming, Error-prone process, even for ML experts NEW Introducing Amazon SageMaker Autopilot Automatic model creation with full visibility & control

Automatic Quick to start Visibility & control Recommendations & model creation optimization

Provide your data in a Get ML models with feature Get notebooks for your Get a leaderboard & continue tabular form & specify target engineering & automatic model models with source code to improve your model prediction tuning automatically done Build, train, and deploy machine learning models quickly at scale

NEW! Amazon SageMaker Studio IDE

NEW! NEW! NEW! NEW! NEW! Ground ML Algorithms SageMaker Training Reinforcement SageMaker SageMaker SageMaker Neo Deployment SageMaker Truth Marketplace and Notebooks and Learning Experiments Debugger Autopilot and Model Frameworks Tuning Hosting Monitor Amazon SageMaker © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Aircraft Information Intelligence Architecture & intentions

Amazon S3 Data optimization and Condition monitoring Amazon SageMaker storage metadata generation © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Using built-in algorithms

Unsupervised Supervised

• K-Means • XGBoost Clustering of discrete data points Further classification/root cause detection

• Random Cut Forest • DeepAR Anomaly detection in continuous data Time-series forecasting Our dataset ACARS Timeseries data Airspeed (kt) 300 QX QUKMEBA LONKFBA 150 .LONAXBA 221134 0 Heading (deg) OFF01 360 BA414 G-NEOV 221134 180

0 1134DIA02EGLLEDDT154853 Altitude (ft) 40000 15.4t fuel 85.3t take-off weight 20000

0 © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Example dataset

1.2

1

0.8 abnormal 0.6

0.4

normal 0.2

0

-0.2

-0.4 Resulting Amazon SageMaker scores

4

3

2

1

0

4

3

2

1

0 Condition monitoring over multiple flights

2

1.5

1

0.5

0

2

1.5

1

0.5

0 A year of using Amazon SageMaker

• About 4 flights earlier detection than a “traditional” detection mechanism

• Amazon SageMaker built-in algorithms allow us to focus on our domain expertise – handling flight data

• BUT: We found some “false” positives, which turned out to be genuine events of a different type Expect the unexpected

1.6

1.4

1.2

1

0.8

0.6

0.4

0.2

0 © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Build, train, and test with Amazon SageMaker

Train Model Model Endpoint config Endpoint config

Engineer Jupyter lab Run historic data a Run historic data gainst model against model Interfacing with Amazon SageMaker

Amazon SQS Checks config and Amazon SQS Creates dataset for Evaluation of the data creates endpoints inference creation of alerts if needed

Amazon SageMaker Amazon SageMaker Endpoint Configuration Endpoint Build, train, and deploy machine learning models quickly at scale

NEW! Amazon SageMaker Studio IDE

NEW! NEW! NEW! NEW! NEW! Ground ML Algorithms SageMaker Training Reinforcement SageMaker SageMaker SageMaker Neo Deployment SageMaker Truth Marketplace and Notebooks and Learning Experiments Debugger Autopilot and Model Frameworks Tuning Hosting Monitor Amazon SageMaker Related sessions

Session ID Title When Where AIM361-R1 Optimizing your ML models with Wed, Dec. 4, Venetian, Level 4, Amazon SageMaker Autopilot 3:15 PM Lando 4304

AIM214-R1 Introducing Amazon SageMaker Wed, Dec. 4, Venetian, Level 2, Studio 6:15 PM Titian 2202

AIM362-R1 Build, train, tune & debug, and deploy Thu, Dec. 5, Aria, Level 1 East, & monitor with Amazon SageMaker 12:15 PM Joshua 2

AIM213-R1 Introducing Amazon SageMaker Thu, Dec. 5, Venetian, Level 4, Model Monitor 1 PM Delfino 405

AIM215-R1 Introducing Amazon SageMaker Thu, Dec. 5, Mirage, Events Center Autopilot 1:45 PM C3 Thank you!

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.