The definition of a professional is one who sees the job through to the end.

Deep Learning - Curated

Naveed Hussain Global CSA (Data & AI) – OCP Microsoft

People who think they know their stuff are a great annoyance to those who actually know their stuff.

Agenda

Microsoft Leadership Vision & Landscape AI Pitfalls

Overview of Deep Dive into Microsoft Azure Services Portfolio Microsoft Leadership Vision

“Next decade computing will become even more ubiquitous and intelligence will become ambient”

“The core currency of any business going forward will be the ability to convert their data into AI that drives competitive advantage”

“Our goal is to democratise AI to empower every person and every organisation to achieve more.”

“Every developer can be an AI developer, and every company can become an AI company” Modern Disruptors in the IT Field

Machine learning Cloud Computing Quantum Computing

Deep Neural Networks

Data Explosion

Science Fiction Becomes Reality Microsoft Data & AI Ethics Fairness | Accountability | Transparency | Ethics

AI must maximize efficiencies without destroying the dignity of people

AI must be designed to assist humanity

AI must guard against bias

AI needs accountability so humans can undo unintended harm

AI must be designed for intelligent privacy

AI must be transparent AI Landscape AI Applications Landscape What AI is not? Weak vs Strong OR Human Assisted vs Self-Aware

The frame reference problem is that a machine can't recognize what knowledge it should use when it is assigned a task.

The symbol grounding problem is that a machine can't understand a concept that puts knowledge together because it only recognizes knowledge as a mark.

The feature engineering problem in machine learning is that a machine can't find out what the feature is for objects.

The Microsoft AI platform

Services
• Conversational AI: Azure Bot Service
• Trained Services: Cognitive Services
• Custom Services: Azure ML
• Coding & Management Tools: VS Tools for AI, Azure ML Studio, Azure ML Workbench, Others (PyCharm, Jupyter Notebooks…)

Infrastructure
• AI on Data: Cosmos DB, SQL DB, SQL DW, Data Lake
• AI Compute: Batch AI, Spark, DSVM, ACS, Edge
• CPU, GPU, FPGA

Deep Learning Frameworks
• Cognitive Toolkit, TensorFlow
• 3rd Party and Others (Scikit-learn, MXNet, Chainer, Gluon…)

Azure Data & AI Stack

[Diagram: the stack grouped under Orchestration, Data Storage, Machine Learning, Intelligence, Integrations, Big Data & Analytics, IoT, and Dashboards & Visualizations. Components shown: Data Factory, Data Catalog, Event Hubs, Stream Analytics, Data Lake Store, SQL Data Warehouse, SQL Azure, Storage Blob, CosmosDB, Table Storage, SQL Server 2017, ML Workbench, ML Experimentation Service, ML Model Management Service, ML Server, Data Science & Deep Learning VM, Operationalization, Functions & Serverless, Cognitive Services, Bot Framework, Cortana, HDInsight / Databricks, Data Lake Analytics, Analysis Services, IoT Hub, Power BI]

[Diagram: Azure data services laid out across the stages Data Sources, Streaming, Storage, OLTP, OLAP, Analytics, and Presentation. Components shown: sensors, IoT Edge, gateways (field, protocol), IoT Hub, Event Hubs, Event Grid, Service Bus, Stream Analytics, Azure Storage (blob, table, queues; archive, cool, hot, premium), on-prem storage (CSVs, etc.), Data Lake Store, Data Lake Analytics (U-SQL), HDInsight (Spark ML, MapReduce, YARN, Tez, LLAP), Cosmos DB (graph, table, JSON; global scale, low latency), Azure SQL (in-memory, graph), SQL 2017, MySQL, PostgreSQL, SQL Data Warehouse (PolyBase, columnstore, elasticity, replication), Data Factory V2, SSIS, SQL Data Sync, Azure Data Catalog, Time Series Insights, Batch, Jupyter, data wrangling, Machine Learning models and Workbench, Cognitive Services]

AI Options?

Machine learning product: what it is; what you can do with it

In the cloud

• Azure Machine Learning service: managed cloud service for ML; train, deploy, and manage models in Azure using Python and CLI
• Azure Machine Learning Studio: drag-and-drop visual interface for ML; build, experiment, and deploy models using preconfigured algorithms
• Azure Databricks: Spark-based analytics platform; build and deploy models and data workflows
• Azure Cognitive Services: Azure services with pre-built AI and ML models; easily add intelligent features to your apps
• Azure Data Science Virtual Machine: virtual machine with pre-installed data science tools; develop ML solutions in a pre-configured environment

On-premises

• SQL Server Machine Learning Services: analytics engine embedded in SQL; build and deploy models inside SQL Server
• Microsoft Machine Learning Server: standalone enterprise server for predictive analysis; build and deploy models with R and Python

Developer tools

• ML.NET: open-source, cross-platform ML SDK; develop ML solutions for .NET applications
• Windows ML: Windows 10 ML platform; evaluate trained models on a Windows 10 device

Azure Roadmap for AI + Machine Learning

Azure continues to grow; to keep informed, see the roadmap: https://azure.microsoft.com/en-us/roadmap/ • Generally Available • In Public Preview • Development

*** Strategy roadmaps are shared with partners and customers under strict NDA

Pitfalls of AI

Understanding
• Where to Start and Finish – Scope of AI
• Qualification Criteria

Management of AI
• Tech. Management
• Skills Management
• IP Management

• Use and Abuse of AI (not every problem is an AI problem)

Traditional vs. Modern Analytics

Demo Notebooks – Statistical and ML Classic vs. Modern AI

DS vs ML vs AI

• Data science produces insights.

• Machine learning produces predictions. • Deep Learning does it over deep iterations simulated in ANNs Architecture

• Artificial intelligence produces actions. • RPAs or IPAs Quick Comparison

Machine Learning
We use a machine algorithm to parse data, learn from that data, and make informed decisions based on what it has learned. For example:
• Find-S
• Decision trees
• Random forests
• Support Vector Machines
• Artificial Neural Networks

Deep Learning
Deep Learning is used in layers to create an Artificial “Neural Network” that can learn and make intelligent decisions on its own. Any Deep Neural Network will consist of three types of layers:
• The Input Layer
• The Hidden Layer
• The Output Layer

Feature Comparison of ML and DL

ML
• Data Dependencies – Small Data
• Hardware Dependencies – Low-end Machines
• Feature Engineering – Manual
• Problem Solving – Divided Approach
  • object detection
  • object recognition
• Execution Time – Short (Less Accurate and Generalizable)
• Interpretability – Easy to Moderate

DL
• Data Dependencies – Big Data
• Hardware Dependencies – High-end Machines (GPU – Matrix Operations)
• Feature Engineering – Automatic
• Problem Solving – Unified Approach
• Execution Time – Long (More Accurate and Generalizable)
• Interpretability – Hard (Mathematical Model based Proofs)

Human vs. Computer

Overfitting vs Underfitting – Model Training Error (Learnt via Gradient Descent)

• Analysis of Bias + Analysis of Variance = Error
• Cost = Generated Output – Actual or Desired Output
• Gradient = Rate at which cost / error changes w.r.t. weight and bias
• Loss = Amount of information lost during reconstruction
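The cost and gradient definitions above can be illustrated with a minimal gradient-descent sketch. The toy data, the mean-squared form of the cost, the learning rate, and the epoch count are all assumptions chosen for illustration; they do not come from the slides.

```python
# Minimal gradient-descent sketch for a one-weight, one-bias linear model.
# Data, learning rate, and epoch count are illustrative assumptions.

def predict(x, weight, bias):
    return weight * x + bias

def mean_squared_cost(xs, ys, weight, bias):
    # Cost: average squared gap between generated and desired output.
    return sum((predict(x, weight, bias) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradients(xs, ys, weight, bias):
    # Rate at which the cost changes w.r.t. the weight and the bias.
    n = len(xs)
    d_weight = sum(2 * (predict(x, weight, bias) - y) * x for x, y in zip(xs, ys)) / n
    d_bias = sum(2 * (predict(x, weight, bias) - y) for x, y in zip(xs, ys)) / n
    return d_weight, d_bias

def train(xs, ys, learning_rate=0.05, epochs=2000):
    weight, bias = 0.0, 0.0
    for _ in range(epochs):
        d_weight, d_bias = gradients(xs, ys, weight, bias)
        # Step against the gradient to reduce the cost.
        weight -= learning_rate * d_weight
        bias -= learning_rate * d_bias
    return weight, bias

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1
weight, bias = train(xs, ys)
print(round(weight, 4), round(bias, 4), mean_squared_cost(xs, ys, weight, bias))
```

With this data the learned weight and bias approach 2 and 1, and the cost shrinks towards zero as training proceeds.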

Trade-off Model Complexity Graph

Splitting Data – Post Sampling

Evaluation Metrics: Confusion Matrix, Accuracy

Precision vs Recall

[Figure: a high-recall classifier compared with a high-precision classifier, each annotated with its precision and recall]

Precision vs Recall: Arithmetic vs Harmonic Mean

Harmonic Mean
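As a small worked example of why the harmonic mean is preferred over the arithmetic mean when combining precision and recall, the sketch below derives the metrics from confusion-matrix counts. The labels are made up for illustration, and the function names are hypothetical helpers rather than any library's API.

```python
# Confusion-matrix metrics for a binary classifier (illustrative data).

def confusion_counts(actual, predicted, positive=1):
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    return tp, fp, fn, tn

def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)

def precision(tp, fp):
    return tp / (tp + fp) if tp + fp else 0.0

def recall(tp, fn):
    return tp / (tp + fn) if tp + fn else 0.0

def harmonic_mean(p, r):
    # The F1 score: dominated by the smaller of the two values.
    return 2 * p * r / (p + r) if p + r else 0.0

actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # a very cautious classifier

tp, fp, fn, tn = confusion_counts(actual, predicted)
p, r = precision(tp, fp), recall(tp, fn)
arithmetic = (p + r) / 2
f1 = harmonic_mean(p, r)
print(accuracy(tp, fp, fn, tn), p, r, arithmetic, f1)
```

Here precision is 1.0 but recall is only 0.25. The arithmetic mean (0.625) hides the weak recall, while the harmonic mean (0.4) stays close to the lower value.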

Deep Learning Theory? What is ANN (Artificial Neural Network)?

Neural Networks = Pattern Recognition Types of ANN

FeedForward (Classifiers/Regressors)

FeedBackward (Forecasters) Early ANNs?

• Vanilla Neural Nets = Multilayer Perceptrons (MLPs) Input(s) x Weight(s) + Bias(es) = Output(s) Data Pattern Complexity
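The MLP rule quoted under Early ANNs, Input(s) x Weight(s) + Bias(es) = Output(s), can be sketched as a forward pass through two layers. The sigmoid activation and the hand-picked weights (chosen so that the network computes XOR) are assumptions for illustration, not values from the slides.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense_layer(inputs, weights, biases):
    # One row of weights per neuron: inputs x weights + bias, then activation.
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def mlp_forward(inputs, layers):
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations

# Hand-picked (not trained) weights: hidden neurons act as OR and NAND,
# and the output neuron combines them with AND, which yields XOR.
xor_layers = [
    ([[20.0, 20.0], [-20.0, -20.0]], [-10.0, 30.0]),  # hidden layer
    ([[20.0, 20.0]], [-30.0]),                        # output layer
]

for x1 in (0, 1):
    for x2 in (0, 1):
        output = mlp_forward([x1, x2], xor_layers)[0]
        print(x1, x2, round(output))
```

A single layer cannot separate XOR, which is why the hidden layer is needed even for this small pattern.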

• Simple – Use Methods like SVM

• Moderate – Deep Nets Outperform

• Complex – Deep Nets – Only Practical Choice CPUs vs GPUs vs FPGAs

• Parallel Programming = Parallelism implemented at software level • Partitioning, Grouping, Threading & Multi-Tasking (Pseudo-Parallelism)

• Parallel Processing = Parallelism implemented at Hardware level • Shared Memory = GPU (100-1000s of core) • Distributed Computing = Data, Model & Pipeline Parallelism How to choose a Deep Net?

Problem Domain: Choice of Algorithms (Techniques)
• Classification: MLP/RELU, Deep Belief Net
• Time Series Analysis: Recurrent Net

Issues = Vanishing Gradient Descent (VGD)

Backpropagation with dropout (Random Pruning)

KL Divergence = Compare Actual to Recreation Deep Learning Tools and Libraries
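The KL divergence line (compare actual to recreation) can be made concrete with a short sketch for discrete distributions. The two example distributions are made up, and the function assumes the recreation assigns non-zero probability wherever the actual distribution does.

```python
import math

def kl_divergence(actual, recreation):
    # D_KL(actual || recreation) = sum of p * log(p / q) over outcomes.
    # Assumes recreation[i] > 0 wherever actual[i] > 0.
    return sum(p * math.log(p / q) for p, q in zip(actual, recreation) if p > 0)

actual = [0.5, 0.5]
recreation = [0.9, 0.1]

forward = kl_divergence(actual, recreation)
reverse = kl_divergence(recreation, actual)
print(round(forward, 4), round(reverse, 4), kl_divergence(actual, actual))
```

The divergence is zero only when the recreation matches the actual distribution, and it is not symmetric: swapping the arguments gives a different value.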

Implementation Libraries

• TensorFlow • CNTK • PyTorch • Caffe • MXNet

[Diagram labels: Code, Keras, Training / Execution Platform (CPUs, GPUs & FPGAs), Model]

RBM (Restricted Boltzmann Machine)

• Feature Extractor Neural Net (automatically recognises inherent patterns in data)

• Set of Autoencoders, as it doesn’t need labelled data; it creates its own patterns (encodes its own structure) and identifies them

DBN (Deep Belief Net)

• Stack of RBM

• Hidden layer of one RBM is the visible layer of the one above it

• RBM 1..n (Training) – each RBM layer is trained on the full input … called Pre-Training

• Use fine-tuning in the cost/error learning phase, with dropout (random pruning) to limit overfitting; the layer-wise pre-training is what mitigates VGD (the vanishing gradient problem).

Convolutional Neural Networks (CNN)

• Good for Image Processing

• Consists of a series of Convolutional & Pooling Layers • Convolutional Layer filters through for a specific pattern

• RELU Layer is an Activation Layer

• Pooling is used for Dimensionality Reduction (to reduce the identified complex patterns); it is an optimisation technique

• Fully Connected Layer (to classify data samples)

Recurrent Neural Networks (RNN)
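The convolution, RELU, and pooling stages described for CNNs can be sketched in plain Python on a tiny image. The 5x5 image and the 2x2 vertical-edge kernel are illustrative assumptions; like most deep learning frameworks, the "convolution" here is implemented as cross-correlation (no kernel flip), without padding and with stride 1.

```python
def convolve2d(image, kernel):
    # Slide the kernel over the image and sum the element-wise products.
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(feature_map):
    # Activation layer: negative responses are set to zero.
    return [[max(0, value) for value in row] for row in feature_map]

def max_pool(feature_map, size=2):
    # Dimensionality reduction: keep the strongest response in each block.
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]

image = [[0, 0, 1, 1, 0] for _ in range(5)]  # a bright vertical stripe
kernel = [[-1, 1], [-1, 1]]                  # responds to dark-to-bright edges

feature_map = convolve2d(image, kernel)
activated = relu(feature_map)
pooled = max_pool(activated)
print(feature_map[0], activated[0], pooled)
```

The filter responds positively at the left edge of the stripe and negatively at the right edge; RELU removes the negative response, and pooling shrinks the 4x4 map to 2x2 while keeping the detected edge.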

• Built with a feedback loop (good for use as a forecasting engine)

• Input Sequence and Output Singular • Image Captioning • Document Classification

• Uses Sequences of Inputs and Produces Sequences • Classify Videos frame by frame

• With Introduction of Time Delay • Can do Statistical Forecasting

• Difficult to train due to VGD being exponential • Gating is solution (LSTM and GRU) • Better Optimisers, Steeper Gates & Gradient Clipping

• GPU optimises by 250% Autoencoders
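The RNN training difficulty noted above (VGD being exponential) and the gradient clipping remedy can be illustrated numerically. This sketch assumes a constant per-step factor for the backpropagated gradient, which is a simplification of a real RNN, and the clipping helper is a hypothetical function rather than a library call.

```python
import math

def gradient_through_time(per_step_factor, steps):
    # Backpropagating through `steps` time steps multiplies the gradient
    # by the per-step factor once per step (simplified, constant factor).
    gradient = 1.0
    for _ in range(steps):
        gradient *= per_step_factor
    return gradient

def clip_by_norm(gradients, max_norm):
    # Rescale the gradient vector if its length exceeds max_norm.
    norm = math.sqrt(sum(g * g for g in gradients))
    if norm <= max_norm:
        return list(gradients)
    scale = max_norm / norm
    return [g * scale for g in gradients]

vanishing = gradient_through_time(0.5, 20)  # factor below 1 shrinks exponentially
exploding = gradient_through_time(1.5, 20)  # factor above 1 grows exponentially
clipped = clip_by_norm([3.0, 4.0], max_norm=1.0)
print(vanishing, exploding, clipped)
```

Clipping caps gradients that grow too large; it does not restore gradients that have already vanished, which is where gating (LSTM and GRU) comes in.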

• RBM is stack of autoencoders

• Feature Extraction Engine

• It is a shallow model: Input → Hidden → Output • Encode • Decode

• Backpropagation using Loss not Cost

• Dimensionality reduction traits, and better than PCA (its predecessor)

RNTN (Recursive Neural Tensor Nets)
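The autoencoder structure (Input → Hidden → Output, encode then decode, scored by reconstruction loss) can be sketched with fixed weights. The weights below are hand-picked rather than learned, the layers are linear for simplicity, and the mean-squared loss is an assumed choice; a real autoencoder would learn its weights by backpropagating this loss.

```python
def linear_layer(inputs, weights):
    # One row of weights per output unit.
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

# Hand-picked weights (assumption): the encoder averages adjacent pairs,
# compressing 4 inputs into 2 hidden values; the decoder copies each
# hidden value back to both positions of its pair.
ENCODER_WEIGHTS = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
DECODER_WEIGHTS = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

def encode(inputs):
    return linear_layer(inputs, ENCODER_WEIGHTS)

def decode(hidden):
    return linear_layer(hidden, DECODER_WEIGHTS)

def reconstruction_loss(inputs):
    # Information lost when the input is squeezed through the hidden layer.
    rebuilt = decode(encode(inputs))
    return sum((x - y) ** 2 for x, y in zip(inputs, rebuilt)) / len(inputs)

print(encode([1.0, 1.0, 3.0, 3.0]), reconstruction_loss([1.0, 1.0, 3.0, 3.0]))
print(encode([1.0, 2.0, 3.0, 5.0]), reconstruction_loss([1.0, 2.0, 3.0, 5.0]))
```

Inputs that fit the pattern the hidden layer captures are rebuilt with zero loss; inputs that do not fit lose information, and that loss is the training signal.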

• Good for Sentiment Analysis using Context (Grouping based)

• Can be used for image processing when an image or scene has many different components

• Uses a recursive sentence parsing and grouping technique

• S → NP → VP … Deep Learning Use Cases

• Machine Vision • Image Classification / Tagging • Object Recognition • Video Analysis / Recognition (Automated Vehicles, Video Insight & Security)

• Speech Recognition • Text Processing (TTS, STT) • Sentiment Analysis • Voice Recognition Demos – Pre-canned

1. Vision Demos 2. Speech Demos 3. Text Demos 4. Metamind 5. Tensorflow Playground Domain Applications

• Medical
  • Cancer Detection (taking 6,642 factors into consideration)
  • Drug Discovery (molecular activity challenge, on cell-biology-based chemical structures)
  • Radiology (tumours / cancers via MRI, fMRI, EKG, CT scan, etc.)
• Financial
  • Portfolio Allocations
  • Risk profiles
  • Short Term Trading
  • Long Term Trading
• Digital Advertising
  • Segmentation
  • Ad-space Bidding
• Fraud Detection
• Market & Sales
  • Cross & Up Sell
• Agriculture
  • Satellite and sensor feed analysis for problematic environmental field conditions

What Areas of Deep Learning?

Optimisation Techniques using HPT

• Pre-Training

• Dropouts

• Regularisation (L0, L1 & L2) – penalisation (penalty) on weights and biases

• Max Norm Constraint (capping technique) on weights and biases

• Epochs, Batch Size & Iterations

• Parallelism Parallelism
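Three of the optimisation techniques listed above (L1 and L2 regularisation, the max norm constraint, and the relationship between epochs, batch size, and iterations) can be sketched as follows. The helper names, example weights, and strengths are illustrative assumptions, not settings from the slides.

```python
import math

def l1_penalty(weights, strength):
    # L1: penalty proportional to the absolute size of each weight.
    return strength * sum(abs(w) for w in weights)

def l2_penalty(weights, strength):
    # L2: penalty proportional to the squared size of each weight.
    return strength * sum(w * w for w in weights)

def apply_max_norm(weights, max_norm):
    # Capping technique: rescale the weight vector if it is too long.
    norm = math.sqrt(sum(w * w for w in weights))
    if norm <= max_norm:
        return list(weights)
    return [w * max_norm / norm for w in weights]

def iterations_per_epoch(n_samples, batch_size):
    # One iteration processes one batch; one epoch covers every sample once.
    return math.ceil(n_samples / batch_size)

weights = [3.0, -4.0]
print(l1_penalty(weights, 0.01), l2_penalty(weights, 0.01))
print(apply_max_norm(weights, max_norm=2.5))

epochs = 10
total_iterations = epochs * iterations_per_epoch(n_samples=1000, batch_size=32)
print(total_iterations)
```

The penalty is added to the cost during training, so larger weights make the cost worse; the max norm constraint instead rescales the weights directly after an update.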

• Data Parallelism

• Model Parallelism

• Pipeline Parallelism Architecting the DNNs? – Hyperparameters
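Data parallelism, the first of the three forms of parallelism listed, can be sketched by splitting a batch into shards, computing a gradient per shard, and combining the results. In this sketch the "workers" run sequentially in one process, and the model (a single weight fitted with squared error) and the data are assumptions for illustration.

```python
def gradient_sum(xs, ys, weight):
    # Sum of per-sample gradients of squared error for the model y = weight * x.
    return sum(2 * (weight * x - y) * x for x, y in zip(xs, ys))

def split_into_shards(items, n_shards):
    # Round-robin split so every worker gets a similar share of the batch.
    return [items[i::n_shards] for i in range(n_shards)]

def data_parallel_gradient(xs, ys, weight, n_workers):
    x_shards = split_into_shards(xs, n_workers)
    y_shards = split_into_shards(ys, n_workers)
    # Each worker holds a full copy of the model and sees only its shard.
    partial_sums = [
        gradient_sum(x_shard, y_shard, weight)
        for x_shard, y_shard in zip(x_shards, y_shards)
    ]
    # Combine the partial results into the mean gradient over the full batch.
    return sum(partial_sums) / len(xs)

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0 * x for x in xs]
weight = 0.5

single_device = gradient_sum(xs, ys, weight) / len(xs)
four_workers = data_parallel_gradient(xs, ys, weight, n_workers=4)
print(single_device, four_workers)
```

The combined gradient matches the single-device gradient, which is why data parallelism speeds up training without changing the update. Model and pipeline parallelism instead split the network itself across devices.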

• Layers
  • Input
  • Hidden = Convolution, Pooling, Activation & Loss
  • Matrix Operation – Data Shaping
  • Output Layers – Cost Function (Sum of Squares, Cross Entropy)
• Neurons
  • Growing (small to large)
  • Pruning (large to small)
• Activations
  • Sigmoid, RELU, TANH… etc.
• Regularisation
  • L0, L1 & L2
• Learning Rate (start small … 0.001 … etc.)
• Epochs, Iterations & Batch Size… etc.

Transfer Learning
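Two of the hyperparameter choices from the architecture list, the activation function and the output-layer cost function, can be written out directly. This is a minimal sketch using the standard definitions; the example prediction and target are made up.

```python
import math

# Activation functions named in the list.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    return max(0.0, z)

def tanh(z):
    return math.tanh(z)

# Output-layer cost functions named in the list.
def sum_of_squares(predicted, target):
    return sum((p - t) ** 2 for p, t in zip(predicted, target))

def cross_entropy(predicted_probabilities, one_hot_target):
    # Only the probability assigned to the true class contributes.
    return -sum(
        t * math.log(p)
        for p, t in zip(predicted_probabilities, one_hot_target)
        if t > 0
    )

predicted = [0.8, 0.2]  # model output for two classes
target = [1.0, 0.0]     # the true class is the first one

print(sigmoid(0.0), relu(-3.0), relu(2.0), tanh(0.0))
print(sum_of_squares(predicted, target), cross_entropy(predicted, target))
```

Sum of squares is the usual choice for regression outputs, while cross entropy suits classification outputs that are probabilities.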

• Take model good at one task and tweak it to solve another task

• Drop input and output layers and train with new I/O layer and domain specific data for the new use case

• neural-storyteller is a recurrent neural network that generates little stories about images. This repository contains code for generating stories with your own images, as well as instructions for training new models.

• *We were barely able to catch the breeze at the beach , and it felt as if someone stepped out of my mind . She was in love with him for the first time in months , so she had no intention of escaping . The sun had risen from the ocean , making her feel more alive than normal . She 's beautiful , but the truth is that I do n't know what to do . The sun was just starting to fade away , leaving people scattered around the Atlantic Ocean . I d seen the men in his life , who guided me at the beach once more .*

GAN (Generative Adversarial Networks)

• Are a class of artificial intelligence algorithms used in unsupervised machine learning, implemented by a system of two neural networks contesting with each other in a zero-sum game framework

• This technique can generate photographs that look at least superficially authentic to human observers, having many realistic characteristics (though in tests people can tell real from generated in many cases)

• Deep Q Network (DQN) Coded Demos

• Regression • Classification • Clustering • Text Processing • Image Processing

References

• https://dzone.com/articles/comparison-between-deep-learning-vs-machine-learni
• https://becominghuman.ai/cheat-sheets-for-ai-neural-networks-machine-learning-deep-learning-big-data-678c51b4b463
• Luis Serrano (YouTube Channel) - https://www.youtube.com/channel/UCgBncpylJ1kiVaPyP-PZauQ
• Deep Learning TV (YouTube Channel) - https://www.youtube.com/channel/UC9OeZkIwhzfv-_Cb7fCikLQ