Using Advanced Video AI in Media to answer… “What’s In My Content?”

Martin Wahl
Principal Program Manager, Cloud AI & Cognitive Services
Redmond, WA
mawahl@.com

Over $150B in IT Spend expected ($19B in Media & Entertainment)

AI over any data, anywhere

Attributes: Productive, Hybrid, Comprehensive, Enterprise-proven

1. AI platform in the Cloud & at the Edge
2. Enterprise data estate (on-premises data, cloud data)
3. Solutions

Data sources: LOB, CRM, Graph, Image, Social, Device / IoT
Platform layers: Azure AI Services, Tools, Azure Infrastructure

Imagine if you could…

• Automatically transcribe your video or audio content
• Flag any adult or objectionable content before you publish to web sites
• Enable true video search – for any spoken word, face, object, topic, or even a product – across your entire video archive
• Create captions for your video content – in any language – and then distribute your videos to anyone, anywhere in the world, playable on any device

• Understand which parts of your video were most popular/interesting to your viewers
• Create automated summaries or highlight reels of video content based on specific people, topics, or scenes within the video

Video AI is the key to solving these challenges…

▪ Improve Content Discoverability
▪ Increase Content Value
▪ Personalize the Viewing Experience
▪ Uncover Hidden Content Insights
▪ Augment Manual Labor
▪ Auto Transcripts
▪ Find Footage Quickly
▪ Automatically Create Highlight Reels

Speech-to-Text & Translation: Convert audio to text based on acoustic language models and translate to any language
Face detection & recognition: Find when each face appears in the video; identify faces from well-known sources or custom models
Video stabilization: Create smooth videos from videos captured by moving camera
Video OCR: Extract text that appears in videos as overlay, slides or background
Face/Image redaction: Detect faces and choose which ones you want to redact

Motion detection: Detect when motion has occurred in videos
Emotion & Sentiment Analysis: Recognize the emotion of a person or crowd based on facial expressions, text used
Video summarization: Create summaries of long videos to enable quick previews based on key metadata
Content moderation: Detect and prevent explicit visual or objectionable content
Object detection & Recognition: Detect objects based on a pre-defined object model

Video Indexer

http://video.ai
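
As a rough illustration of the captioning idea from the earlier slides, the sketch below turns timed transcript segments into a WebVTT caption file. The segment shape (start and end in seconds plus text) and the function names are assumptions made for this example; they are not the Video Indexer output format.

```python
from typing import List, Tuple

# Hypothetical transcript shape: (start_seconds, end_seconds, text).
Segment = Tuple[float, float, str]


def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp, HH:MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(segments: List[Segment]) -> str:
    """Build a WebVTT document with one cue per transcript segment."""
    lines = ["WEBVTT", ""]
    for start, end, text in sorted(segments):
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [
        (0.0, 2.5, "Welcome to the show."),
        (2.5, 6.0, "Tonight we look at video AI."),
    ]
    print(to_webvtt(demo))
```

Translated captions could reuse the same function by passing in translated text with the original timings.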

Video Indexer | Typical Workflow

Video Indexer JSON insights

Big Brother Going Into The Cloud

“Adopting this ground-breaking technology will completely revolutionize the way we produce our global formats and opens up an unprecedented level of creative freedom. This is the first of many innovations we are working on that will optimize our productions around the world, giving our audiences across multiple platforms a much richer experience. [This] isn’t about cost cutting or optimization, this is actually about being able to tell great stories in totally different ways.”

— Lisa Perrin, CEO, Creative Networks, EndemolShine

[Architecture diagram]
Components: On-premise encoder, Azure Logic App, Azure Media Services (live archive, Premium Encoder), Azure Storage, Cosmos DB, Content Delivery Network (CDN), Editor
Low-res live stream (RTMP): every minute, then run analytics (face redaction, speech-to-text, motion detection)
High-res live stream: 5 min MXF segments; upload MXF file to Azure Storage
Video AI Services: Face API, Speech Analytics, Subclip Logger
Subclipping: output copy and stitching, asset cleaning, encoder config generation, asset preparation, transcoding job launch
Output: MP4 file, export to social media, import to on-premise
Legend: low resolution flow (4 Mbps); high resolution flow (XDCAM 4:2:2, 50 Mbps); scene download request (MXF or MP4); content flagged for follow-up
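
The diagram mentions 5-minute MXF segments, scene download requests, and output stitching. One way to picture how a scene request might map onto fixed-length segments is sketched below. The segment length constant, the function name, and the returned tuple layout are assumptions for illustration, not details of the production workflow.

```python
from typing import List, Tuple

SEGMENT_SECONDS = 300  # 5-minute segments, as labelled in the diagram


def segments_for_scene(
    start: float, end: float, segment_seconds: int = SEGMENT_SECONDS
) -> List[Tuple[int, float, float]]:
    """Return (segment_index, trim_in, trim_out) for each segment the scene touches.

    trim_in and trim_out are offsets in seconds from the start of that segment,
    so stitching the trimmed pieces in order reproduces the requested scene.
    """
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    pieces = []
    index = int(start // segment_seconds)
    while index * segment_seconds < end:
        seg_start = index * segment_seconds
        trim_in = max(start, seg_start) - seg_start
        trim_out = min(end, seg_start + segment_seconds) - seg_start
        pieces.append((index, trim_in, trim_out))
        index += 1
    return pieces


if __name__ == "__main__":
    # A scene from 4:50 to 10:10 spans three 5-minute segments.
    for piece in segments_for_scene(290, 610):
        print(piece)
```

Only the first and last pieces need trimming; any segments in between are used whole.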

Architecture

Diagram labels: n times, Content workflow, Quality workflow, Web App

Example query: “Show me all programs with this actor” → “actor” → Face at (28, 456) → “Matt Damon” → List of assets
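
The query flow on this slide (a question about an actor, a detected face, a recognized name, a list of assets) can be pictured as a lookup in an index that maps recognized names to the assets in which they appear. The sketch below is a toy version: the record fields, asset IDs, and function names are made up for this example, and a real system would rely on a face recognition service rather than pre-labelled detections.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Set


class FaceDetection(NamedTuple):
    """Hypothetical record: one recognized face at pixel position (x, y) in an asset."""
    asset_id: str
    name: str
    x: int
    y: int


def build_face_index(detections: Iterable[FaceDetection]) -> Dict[str, Set[str]]:
    """Map each recognized name (lower-cased) to the set of asset IDs it appears in."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for det in detections:
        index[det.name.lower()].add(det.asset_id)
    return index


def assets_with(index: Dict[str, Set[str]], name: str) -> List[str]:
    """Return the sorted asset IDs in which the named person was recognized."""
    return sorted(index.get(name.lower(), set()))


if __name__ == "__main__":
    detections = [
        FaceDetection("asset-001", "Matt Damon", 28, 456),
        FaceDetection("asset-002", "Matt Damon", 120, 310),
        FaceDetection("asset-002", "Another Actor", 400, 200),
    ]
    index = build_face_index(detections)
    print(assets_with(index, "Matt Damon"))
```

Keeping the index keyed by name means the expensive face recognition step runs once at ingest time, and each query is a simple dictionary lookup.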

Natural language

Who are my audiences?

How many users are viewing my content now?

What devices are my customers using to consume content?

What is the peak consumption time?

What do my customers like to watch?

How do I enable my customers to find content that matches their preferences?

What content should be recommended to my users for maximum engagement?

How do I find related programs for targeted advertisements?

What is the likelihood of success of a new program?

What kind of content should I acquire?

Which communities do my audiences belong to?

Ingest → Merge & Match → Analyze → Persist & Publish → Visualize → SaaS Solutions

Linear TV Data Sources: Linear TV Viewership, Audience Research, Set-Top-Box Data, Other analog data
Online Data Sources: DMP, Clickstream Data, OTT, Other online data
Msft Data Sources: XBOX, MSN, LinkedIn, Bing
Processing: Extraction, Transformation & Loading; Advanced Analytics
Output: Audience Analytics Apps

Source, ingest → Consolidate → Build, model, predict → Persist and publish → Explore, visualize → Audience Analytics Apps

Sophisticated pretrained models …

Most comprehensive set of pretrained services: Vision, Speech, Language, Search

Popular frameworks (open & interoperable): PyTorch, TensorFlow, Keras, ONNX

Productive services (machine learning at scale): Azure Databricks, Azure Machine Learning, Machine Learning VMs

Powerful infrastructure (most comprehensive): lowest cost inferencing using FPGAs

Flexible deployment (from cloud to edge): On-premises, Cloud, Edge

https://azure.microsoft.com/en-us/services/cognitive-services

Martin Wahl
Principal Program Manager, Cloud AI & Cognitive Services
Redmond, WA