
ML and with Cloud Platform

The power of on any data, any size

Alex Osterloh
Solution Engineer, Google
[email protected] @BigDataWizard

An Evolving Cloud

1st Wave: Colocation
Your kit, someone else’s building. Yours to manage.

2nd Wave: Virtualized Data Centers
Standard virtual kit, for rent. Still yours to manage.

3rd Wave: Automated Services, Scalable Data
Focus on insight, not infrastructure.

“Google is living a few years in the future and sending the rest of us messages.”
Doug Cutting, Chief Architect, Cloudera

Google in Data Technologies

[Timeline figure, 2002 to 2013: GFS, MapReduce, Dremel, Flume, Megastore, Colossus, Millwheel, PubSub, Spanner, F1]

Google Research Publications referenced are available here: http://research.google.com/pubs/papers.html
The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, 2009: http://research.google.com/pubs/pub35290.html

10+ Years of Tackling Data Problems

[Timeline figure, 2002 to 2016]
Google Papers: GFS, MapReduce, BigTable, Dremel, FlumeJava, PubSub, Millwheel, TensorFlow
Open Source: Apache Beam
Google Cloud Products: BigQuery, Pub/Sub, Dataflow, Bigtable, ML

“We don’t really use MapReduce anymore.”
Urs Hölzle, SVP Technical Infrastructure, Google

Google Cloud Platform Confidential & Proprietary 11

[Figure: Google Cloud Platform product areas: Compute, Storage, Networking, Services, ML, Management, Developer Tools, Mobile]

The Big Data Lifecycle

Stages: Capture → Store → Process → Analyze → Learn
Products: Pub/Sub, Storage, SQL, Datastore, BigTable, Dataflow, BigQuery, Cloud ML

Enterprise Big Data Architecture on Google

[Architecture diagram]
Sources and services: Your Data, PubSub, Dataflow, Cloud Storage, Bigtable, BigQuery
Hadoop: GCS-Hadoop Connector, unmanaged Hadoop on Compute Engine, managed Hadoop on Cloud Dataproc
Consumers: Applications + Reports, BI Tools, Spreadsheets, Coworkers
Callouts: Fast ETL, Regex, JSON, UDFs

http://blog.shinetech.com/2015/10/14/google-cloud-dataproc-and-the-17-minute-train-challenge/

Applications that can see, hear & understand

Google confidential | Do not distribute

Examples of applying ML

[Figure: Input → Neural Networks → Output]

Machine Learning Use Cases

Structured Data

Classification / Regression
● Customer Churn Analysis
● Product Diagnostics
● Forecasting

Recommendation
● Content Personalization
● Product X-Sells/Up-sells

Anomaly Detection
● Fraud Detection
● Asset Sensor Diagnostics
● Log Metric Anomalies

Unstructured Data

Image Analytics
● Identify damaged shipments
● Explicit Content Classification
● Identify “styles” in images

Text Analytics
● Call Center log analysis
● Language Identification
● Topic Classification
● Sentiment Analysis

The Spectrum of Machine Learning

Use pretrained models: Cloud Translate API, Cloud Vision API, Cloud Speech API

Or use your own data to train models

The Machine Learning Spectrum

[Figure: spectrum from academic / research to industry / applications]
TensorFlow: OSS SDK
Cloud Machine Learning: Managed Infrastructure
Cloud Datalab: Notebook experience
Machine Learning APIs: Translate API, Vision API, Speech API

Google Cloud Vision API

● Detect faces, landmarks, logos, text, and more
● Perform sentiment analysis
● Straightforward REST API
● Works on a base64-encoded image
● Connects to
● Returns label, score pair
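The request and response shape behind these bullets can be sketched in Python. This is a minimal sketch, not the only way to call the service: it assumes the public `images:annotate` REST endpoint and its `labelAnnotations` response field, builds the body without sending it, and uses placeholder bytes and a made-up sample response for illustration. Authentication (API key or OAuth token) is left out.

```python
import base64
import json

# Public REST endpoint for image annotation (credentials are not shown here).
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(image_bytes, feature_types=("LABEL_DETECTION",), max_results=5):
    """Build the JSON body for one base64-encoded image and a list of features."""
    content = base64.b64encode(image_bytes).decode("ascii")
    features = [{"type": t, "maxResults": max_results} for t in feature_types]
    return {"requests": [{"image": {"content": content}, "features": features}]}


def label_score_pairs(response):
    """Extract (label, score) pairs from an annotate response dictionary."""
    annotations = response["responses"][0].get("labelAnnotations", [])
    return [(a["description"], a["score"]) for a in annotations]


# Placeholder bytes stand in for a real image read with open(path, "rb").
body = build_annotate_request(b"not-a-real-image", ("LABEL_DETECTION", "FACE_DETECTION"))
print(json.dumps(body, indent=2))

# Hypothetical response, shaped like the labelAnnotations field of a real reply.
sample_response = {"responses": [{"labelAnnotations": [
    {"description": "cat", "score": 0.98},
    {"description": "pet", "score": 0.91},
]}]}
print(label_score_pairs(sample_response))
```

To send the request, POST `body` as JSON to `VISION_ENDPOINT` with valid credentials.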

Google Cloud Speech API

● Pass raw audio data and language

● Returns a transcript of the audio data

● Works across >80 languages

● Receive responses in streaming or non-streaming mode
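A non-streaming request matching these bullets might look like the sketch below. It assumes the v1 REST method `speech:recognize` with `LINEAR16` audio at 16 kHz; the audio bytes and the sample response are placeholders, and nothing is sent over the network. Streaming recognition uses a different interface and is not covered.

```python
import base64

# v1 REST method for non-streaming recognition (credentials are not shown here).
SPEECH_ENDPOINT = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_request(audio_bytes, language_code="en-US", sample_rate_hertz=16000):
    """Build the JSON body: raw audio (base64) plus the language to transcribe."""
    return {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


def best_transcript(response):
    """Join the top alternative of each result into one transcript string."""
    parts = [r["alternatives"][0]["transcript"] for r in response.get("results", [])]
    return " ".join(p.strip() for p in parts)


# Placeholder bytes stand in for real 16-bit PCM audio.
speech_body = build_recognize_request(b"\x00\x01" * 8, language_code="de-DE")

# Hypothetical response, shaped like a real reply with one result.
sample_speech_response = {"results": [{"alternatives": [
    {"transcript": "what are you sinking about", "confidence": 0.9},
]}]}
print(best_transcript(sample_speech_response))
```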

Speech API

● Enable voice interface to devices and applications
● Transcribe audio from stored media
● Multiple language support
● Access from mobile devices

Speech API Demo

“What are you sinking about?”

Google Cloud Translate API

● Translate text between thousands of language pairs
● Lets websites and programs integrate with Google Translate programmatically
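Programmatic integration can be sketched as a single HTTP call. The sketch below assumes the v2 REST endpoint and its `q`, `target`, `source`, `format`, and `key` parameters; the API key is a placeholder, the sample response is made up, and no request is actually sent.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# v2 REST endpoint for text translation.
TRANSLATE_ENDPOINT = "https://translation.googleapis.com/language/translate/v2"


def build_translate_url(text, target, source=None, api_key="YOUR_API_KEY"):
    """Build a request URL; omit source to let the service detect the language."""
    params = {"q": text, "target": target, "format": "text", "key": api_key}
    if source:
        params["source"] = source
    return TRANSLATE_ENDPOINT + "?" + urlencode(params)


def translated_texts(response):
    """Pull the translated strings out of a v2 response dictionary."""
    return [t["translatedText"] for t in response["data"]["translations"]]


url = build_translate_url("Guten Morgen", target="en", source="de")
print(url)

# Hypothetical response, shaped like a real v2 reply.
sample_translate_response = {"data": {"translations": [{"translatedText": "Good morning"}]}}
print(translated_texts(sample_translate_response))
```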

A brief look at TensorFlow

Largest Machine Learning repository on GitHub

Operates over tensors: n-dimensional arrays
Using a flow graph: data flow computation framework

● Train on CPUs, GPUs

● Run wherever you like (local, cloud, mobile)
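The "tensors flowing through a graph" idea can be illustrated without TensorFlow itself. The sketch below is plain Python and is not TensorFlow's API: `Node`, `placeholder`, `constant`, `matmul`, `add`, and `run` are hypothetical names, and nested lists stand in for 2-dimensional tensors. The graph is built first and only evaluated when `run` is called with input data.

```python
class Node:
    """One vertex in the graph: an operation plus the nodes that feed it."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op          # callable applied to the evaluated inputs
        self.inputs = inputs  # upstream nodes
        self.name = name      # set only for placeholders


def placeholder(name):
    """A graph input whose value is supplied at run time."""
    return Node(op=None, name=name)


def constant(value):
    """A node that always produces the same tensor."""
    return Node(op=lambda: value)


def matmul(a, b):
    """Matrix product of two 2-D tensors (lists of rows)."""
    def op(x, y):
        return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]
    return Node(op, (a, b))


def add(a, b):
    """Element-wise sum of two 2-D tensors with the same shape."""
    def op(x, y):
        return [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(x, y)]
    return Node(op, (a, b))


def run(node, feed):
    """Evaluate a node by first evaluating everything upstream of it."""
    if node.name is not None:
        return feed[node.name]
    return node.op(*[run(n, feed) for n in node.inputs])


# Build the graph y = x * w + b, then feed data through it.
x = placeholder("x")
w = constant([[1, 2], [3, 4]])
b = constant([[10, 20]])
y = add(matmul(x, w), b)

result = run(y, {"x": [[1, 1]]})
print(result)  # [[14, 26]]
```

Separating graph construction from execution is what lets a framework place the same graph on CPUs, GPUs, or other targets.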

What Cloud Machine Learning Can Do

● Fully managed service

● Train using a custom TensorFlow graph

● Batch and online predictions, at scale

● Integrated Datalab experience

● Regression and classification tasks
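An online prediction call might be shaped like the sketch below. It assumes the v1 REST surface of the service (`projects/.../models/...:predict` with an `instances` list), which may differ from what was available when these slides were presented. The project name, model name, feature names, and sample response are all hypothetical, and no request is sent.

```python
def build_predict_request(project, model, instances, version=None):
    """Return the predict URL and JSON body for a deployed model."""
    name = "projects/{}/models/{}".format(project, model)
    if version:
        name += "/versions/{}".format(version)
    url = "https://ml.googleapis.com/v1/{}:predict".format(name)
    return url, {"instances": list(instances)}


def predictions(response):
    """Return the predictions list, or raise if the service reported an error."""
    if "error" in response:
        raise RuntimeError(response["error"])
    return response["predictions"]


# Hypothetical project, model, and feature names for a churn classifier.
predict_url, predict_body = build_predict_request(
    "my-project", "churn", [{"tenure_months": 3, "plan": "basic"}], version="v1"
)
print(predict_url)

# Hypothetical response; the fields inside each prediction depend on the model.
sample_predict_response = {"predictions": [{"churn_probability": 0.82}]}
print(predictions(sample_predict_response))
```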

Want more? → http://bit.ly/gcp16data

Thank You

Alex Osterloh [email protected]