A I M 3 0 3 Stop guessing: Use AI to understand customer conversations
Dirk Fröhner Boaz Ziniman Senior Solutions Architect Principal Technical Evangelist Amazon Web Services Amazon Web Services
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Agenda
• Introduction Amazon Web Services (AWS) AI/ML offering • Introduction Amazon Connect • Architecture of the labs • Work on the labs • Wrap-up Related breakouts
• AIM211 - AI document processing for business automation • AIM212 - ML in retail: Solutions that add intelligence to your business • AIM222 - Monetizing text-to-speech AI • AIM302 - Create a Q&A bot with Amazon Lex and Amazon Alexa © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Natural language processing (NLP)
• Automatic speech recognition (ASR) • Natural language understanding (NLU) • Text to speech • Translation @ziniman
@ziniman Use cases for NLP
Voice of customer Knowledge management Education applications
Customer service/ Semantic search Accessibility call centers
Enterprise Captioning workflows Information bots digital assistant
Personalization Localization The Amazon ML stack: Broadest & deepest set of capabilities
Vision Speech Language Chatbots Forecasting Recommendations
A I s e r v i c e s A m a z o n A m a z o n T r a n s l a t e A m a z o n A m a z o n A m a z o n A m a z o n A m a z o n C o m p r e h e n d A m a z o n A m a z o n A m a z o n Rekognition Rekognition T e x t r a c t P o l l y T r a n s c r i b e & A m a z o n L e x F o r e c a s t Personalize i m a g e v i d e o Comprehend M e d i c a l
A m a z o n S a g e M a k e r
ML s e r v i c e s B u i l d T r a i n D e p l o y Prebuilt algorithms & notebooks One-click model training & tuning One-click deployment & hosting
Data labeling (Ground Truth) Optimization (NEO) Algorithms & models (AWS Marketplace Reinforcement learning for machine learning)
F r a m e w o r k s I n t e r f a c e s Infrastructure
ML frameworks & infrastructure E C 2 P 3 E C 2 C 5 F P G A s A W S I o T A m a z o n & P 3 d n G r e e n g r a s s E l a s t i c i n f e r e n c e AI services
Vision Speech Language Chatbots Forecasting Recommendations
A m a z o n A m a z o n T r a n s l a t e A I s e r v i c e s A m a z o n A m a z o n A m a z o n A m a z o n A m a z o n C o m p r e h e n d A m a z o n A m a z o n A m a z o n Rekognition Rekognition T e x t r a c t P o l l y T r a n s c r i b e & A m a z o n L e x F o r e c a s t Personalize i m a g e v i d e o Comprehend M e d i c a l
Pretrained AI services that require Ability to easily add intelligence to Quality and accuracy from no ML skills or training your existing apps and workflows continuously learning APIs AI services
Vision Speech Language Chatbots Forecasting Recommendations
A m a z o n A m a z o n T r a n s l a t e A I s e r v i c e s A m a z o n A m a z o n A m a z o n A m a z o n A m a z o n C o m p r e h e n d A m a z o n A m a z o n A m a z o n Rekognition Rekognition T e x t r a c t P o l l y T r a n s c r i b e & A m a z o n L e x F o r e c a s t Personalize I m a g e V i d e o Comprehend M e d i c a l
Pretrained AI services that require Ability to easily add intelligence to Quality and accuracy from no ML skills or training your existing apps and workflows continuously learning APIs Turn text into lifelike speech using deep learning
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Amazon Polly: Use cases
Podcasting, voiced blogs Navigation & news articles
Special needs AI assistant Amazon Polly Contact centers Language Voiced videos learning & presentations Amazon Polly: Text in, lifelike speech out
“Today in Seattle, WA, “Today in Seattle, Washington, it’s 11°F” it’s 11 degrees Fahrenheit”
Amazon Polly
29 languages Amazon Polly: Text in, lifelike speech out
“Genießt ihr den Rest “Genießt ihr den Rest des Tages” des Tages”
Amazon Polly
29 languages Add semantic meaning to text https://www.w3.org/TR/speech-synthesis/
Text Sentence to synthesize.
ˈsɛntəns tə ˈsɪnθəˌsaɪz. Standard vs. neural TTS
Text Sentence to synthesize.
ˈsɛntəns tə ˈsɪnθəˌsaɪz. Phonetic transcription
Concatenative TTS Neural TTS
ˈsɛnt sɛntəns tə ˈsɪnθ əˌsaɪz. Neural TTS with Amazon Polly Standard vs. neural
“Hi. My name is Boaz, and I will be your host today during this Innovate online session.” Neural TTS with Amazon Polly Standard vs. neural
“Hi. My name is Boaz, and I will be your host today during this Innovate online session.” Neural TTS with Amazon Polly Standard vs. neural Styles with neural TTS
“Hi. My name is Boaz, and I will “Hi. My name is Boaz, and I will be your host today during this be your host today during this Innovate online session.” Innovate online session.” Add voice to your app
Amazon Polly Automatic speech recognition
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Automatic speech recognition service
“Hello. This is Allan speaking.”
Amazon Transcribe Amazon Transcribe: Key features
Custom Speaker Channel vocabulary identification identification
Punctuation and Word-level Word-level capitalization time stamps confidence scores Amazon Transcribe: Streaming transcription
Bidirectional stream over:
HTTP/2 protocol
or
WebSocket (WSS) protocol
Audio stream
Source file or Amazon Transcribe microphone Text stream Speech to text
RingDNA is an end-to-end communications platform for sales teams. Hundreds of enterprise organizations use RingDNA to increase productivity, engage in smarter sales conversations, gain predictive sales insights, and improve their win rate.
"A critical component of RingDNA’s Conversation AI requires best-of-breed speech-to-text to deliver transcriptions of every phone call. RingDNA is excited about Amazon Transcribe since it provides high-quality speech recognition at scale, helping us to better transcribe every call to text.”
Howard Brown, CEO & Founder RingDNA Natural and accurate language translation
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Key features
54 languages $15/1M characters 2,804 combinations Or $0.000075 per word; pay as you go, 2M characters monthly free tier
Real-time < / > Tag handling < 500ms / sentence on average XML tag placement maintains < 150ms / conversational / short form styling and formatting through translation Data security Ease of use Data ownership Simple API calls and partner Encryption solutions Access management
HIPAA eligible Amazon Translate
Natural and fluent language translation
“Hello, what’s up? Do "Hallo, wat is er? Wil je you want to go see a vanavond naar de film movie tonight?” gaan?" Amazon Translate Translate API example import boto3 translate = boto3.client("translate") lang_flag_pairs = [("fr", ""), ("de", ""), ("es", ""), ("pt", ""),("zh", ""), ("ja", ""), ("ru", ""),("it", ""), ("zh-TW", ""), ("tr", ""), ("cs", ""), ("he", "")] for lang, flag in lang_flag_pairs: print(flag) print(translate.translate_text( Text="Hello, World", SourceLanguageCode="en", TargetLanguageCode=lang )['TranslatedText']) Translate API example https://github.com/ziniman/aws-translate-demo Bonjour, Monde Привет, Мир Hallo, Welt Ciao, Mondo Hola, Mundo 大家好,世界 Olá, Mundo Merhaba, Dünya. 您好,世界 Ahoj, světe. שלום, עולם. ハローワールド Discover insights and relationships in text
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Amazon Comprehend Discover insights and relationships in text
Entities
Key phrases
Language
Sentiment Amazon Comprehend Syntax
Grouping Run Amazon Comprehend on an S3 bucket import boto3 import json s3 = boto3.resource('s3') bucket_name = 'my_bucket' region_name = 'us-east-1' bucket = s3.Bucket(bucket_name) comprehend = boto3.client(service_name='comprehend', region_name=region) for obj in bucket.objects.all(): body = obj.get()['Body'].read() text = body
sentiment_response = comprehend.detect_sentiment(Text=text, LanguageCode='en') print(json.dumps(sentiment_response, sort_keys=True, indent=4)) "The Babel fish is small, yellow, leech-like—and probably the oddest thing in the universe. It feeds on brain wave energy, absorbing all unconscious frequencies and then excreting telepathically a matrix formed from the conscious frequencies and nerve signals picked up from the speech centres of the brain, the practical upshot of which is that if you stick one in your ear, you can instantly understand anything said to you in any form of language: the speech you hear decodes the brain wave matrix."
Douglas Adams The Hitchhiker’s Guide to the Galaxy
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Build your own Babel fish
Person speaks in English Build your own Babel fish
Amazon Transcribe
Person speaks in Transcribes voice to text English in English Build your own Babel fish
Amazon Amazon Transcribe Translate
Person speaks in Transcribes voice to text Translates text to English in English German Build your own Babel fish
Amazon Amazon Amazon Transcribe Translate Polly
Person speaks in Transcribes voice to text Translates text to Speaks in English in English German German https://voice.boaz.cloud/
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Amazon Connect
Easy-to-use, cloud-based contact center solution that scales to support businesses of any size With tools that grow with your needs
Skills-based routing Call Real-time and High-quality (Automatic call distribution [ACD]) recording historical analytics voice capability Improving contact centers with artificial intelligence
A m a z o n L e x
A m a z o n C o n n e c t
T r a n s c r i p t
A m a z o n A m a z o n An a ly tics T r a n s c r i b e C o m p r e h e n d © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Our use-case today
Processing of contact center call recordings
Continual analyses for sentiment and key words
Actionable insights from data as strategic asset
Cloud-first & serverless
Photo: Dirk Fröhner Choose your path
Labs for different experience levels to respect a heterogeneous audience
Foundation lab to generate or obtain sample audio recordings
Exploration path for less advanced participants
Deep-dive path for more advanced participants Photo: Dirk Fröhner Choose your path
Photo: Dirk Fröhner Foundation
Foundation pillar for both paths
Create or obtain your sample call recordings
Photo: Dirk Fröhner Foundation: Lab 0
Create or obtain your sample call recordings, three options available Foundation: Lab 0
Create or obtain your sample call recordings, three options available
Amazon Connect Foundation: Lab 0
Create or obtain your sample call recordings, three options available
Amazon Connect
Amazon Polly Foundation: Lab 0
Create or obtain your sample call recordings, three options available
Amazon Connect
Amazon Polly
Amazon S3 Foundation: Lab 0
Create or obtain your sample call recordings, three options available
Amazon Connect
Amazon Amazon Polly S3
Amazon S3 Exploration path
Manual soldering in the AWS management console
Learning objectives: • Explore speech and language AI services • Understand how these services can help you without the need to train your own ML models
Photo: Dirk Fröhner Exploration: Labs 1–3
Manually run through ASR and NLP analyses, look into language translation as well Exploration: Labs 1–3
Manually run through ASR and NLP analyses, look into language translation as well
Lab 1
Amazon Amazon S3 Transcribe Exploration: Labs 1–3
Manually run through ASR and NLP analyses, look into language translation as well
Lab 2
Amazon Translate
Lab 1
Amazon Amazon S3 Transcribe Exploration: Labs 1–3
Manually run through ASR and NLP analyses, look into language translation as well
Lab 2
Amazon Translate
Lab 1
Amazon Amazon S3 Transcribe
Lab 3
Amazon Comprehend Deep dive path
Launching a processing pipeline to automatically run ASR + NLP analyses
Learning objectives: • Explore how to use speech and language AI services with their APIs programmatically • Understand how to further analyze and visualize these results with AWS analytics services
Photo: Dirk Fröhner Deep dive: Lab 4
Launch a processing pipeline to automatically run ASR, NLP, and further analytics Deep dive: Lab 4
Launch a processing pipeline to automatically run ASR, NLP, and further analytics
Call recordings
Amazon S3 Deep dive: Lab 4
Launch a processing pipeline to automatically run ASR, NLP, and further analytics
Call Invoke recordings transcription
Amazon AWS S3 Lambda Deep dive: Lab 4
Launch a processing pipeline to automatically run ASR, NLP, and further analytics
Transcription workflow
Call Invoke recordings transcription AWS Step Functions
Amazon AWS S3 Lambda
AWS Amazon Lambda Transcribe Deep dive: Lab 4
Launch a processing pipeline to automatically run ASR, NLP, and further analytics
Transcription workflow
Call Invoke Transcription recordings transcription results AWS Step Functions
Amazon AWS Amazon S3 Lambda S3
AWS Amazon Lambda Transcribe Deep dive: Lab 4
Launch a processing pipeline to automatically run ASR, NLP, and further analytics
Transcription Invoke workflow NLP analyses
Call Invoke Transcription recordings transcription results AWS Step AWS Functions Lambda
Amazon AWS Amazon S3 Lambda S3
AWS Amazon Amazon Lambda Transcribe Comprehend Deep dive: Lab 4
Launch a processing pipeline to automatically run ASR, NLP, and further analytics
Transcription Invoke workflow NLP analyses
Call Invoke Transcription NLP analyses recordings transcription results results AWS Step AWS Functions Lambda
Amazon AWS Amazon Amazon S3 Lambda S3 S3
AWS Amazon Amazon Lambda Transcribe Comprehend Deep dive: Labs 5–6
Launch a processing pipeline to automatically run ASR, NLP, and further analytics Deep dive: Labs 5–6
Launch a processing pipeline to automatically run ASR, NLP, and further analytics
Lab 5 Deep dive: Labs 5–6
Launch a processing pipeline to automatically run ASR, NLP, and further analytics
Lab 5
NLP analyses results
Amazon S3 Deep dive: Labs 5–6
Launch a processing pipeline to automatically run ASR, NLP, and further analytics
Lab 5 Data dictionary
NLP analyses AWS results Glue
Amazon S3 Deep dive: Labs 5–6
Launch a processing pipeline to automatically run ASR, NLP, and further analytics
Lab 5 Data dictionary
NLP analyses AWS Queries on results Glue NLP results
Amazon Amazon S3 Athena Deep dive: Labs 5–6
Launch a processing pipeline to automatically run ASR, NLP, and further analytics
Lab 5 Data dictionary
Lab 6 NLP analyses AWS Queries on results Glue NLP results
Amazon Amazon S3 Athena Deep dive: Labs 5–6
Launch a processing pipeline to automatically run ASR, NLP, and further analytics
Lab 5 Data dictionary
Lab 6 NLP analyses AWS Queries on results Glue NLP results
Amazon Amazon S3 Athena Deep dive: Labs 5–6
Launch a processing pipeline to automatically run ASR, NLP, and further analytics
Lab 5 Data dictionary
Lab 6 NLP analyses AWS Queries on Sentiment results Glue NLP results visualization
Amazon Amazon Amazon S3 Athena Quicksight © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Resources / call to action
AWS solution: AI-powered speech analytics for Amazon Connect http://bit.ly/2019-aim303-r1 Blog posts that inspired this workshop Part 1: http://bit.ly/2019-aim303-r2 Part 2: http://bit.ly/2019-aim303-r3 Where do you go from here? Think about use-case opportunities Get continual insights about callers’ sentiments Search for and reason on top-of-call metadata Report on agent performance / callers’ happiness or frustration (agent ID, caller ID) Find the content of this slide also on the lab guide website Learn ML with AWS Training and Certification The same training that our own developers use, now available on demand
Role-based ML learning paths for developers, data scientists, data platform engineers, and business decision makers
70+ free digital ML courses from AWS experts let you learn from real-world challenges tackled at AWS
Validate expertise with the AWS Certified Machine Learning - Specialty exam
Visit https://aws.training/machinelearning
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Thank you!
Dirk Fröhner Boaz Ziniman [email protected] | @dirk_f5r [email protected] | @ziniman
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.