Definition of a professional: one who sees the job through to the end.
Deep Learning - Curated
Naveed Hussain Global CSA (Data & AI) – OCP Microsoft
People who think they know their stuff are a great annoyance to those who actually know their stuff.
Agenda
• Microsoft Leadership Vision & Landscape
• AI Pitfalls
• Overview of Deep Learning
• Deep Dive into Microsoft Azure Services Portfolio
Microsoft Leadership Vision
“Next decade computing will become even more ubiquitous and intelligence will become ambient”
“The core currency of any business going forward will be the ability to convert their data into AI that drives competitive advantage”
“Our goal is to democratise AI to empower every person and every organisation to achieve more.”
“Every developer can be an AI developer, and every company can become an AI company”
Modern Disruptors in the IT Field
Machine learning Cloud Computing Quantum Computing
Deep Neural Networks
Data Explosion
Science Fiction Becomes Reality
Microsoft Data & AI Ethics
Fairness | Accountability | Transparency | Ethics
AI must maximize efficiencies without destroying the dignity of people
AI must be designed to assist humanity
AI must guard against bias
AI needs accountability so humans can undo unintended harm
AI must be designed for intelligent privacy
AI must be transparent
AI Landscape
AI Applications Landscape
What AI is not?
Weak vs Strong OR Human Assisted vs Self-Aware
• The frame reference problem is that a machine can't recognize what knowledge it should use when it is assigned a task.
• The symbol grounding problem is that a machine can't understand a concept that puts knowledge together because it only recognizes knowledge as a mark.
• The feature engineering problem in machine learning is that a machine can't find out what the feature is for objects.
The Microsoft AI platform
Services
• CONVERSATIONAL AI: Azure Bot Service
• TRAINED SERVICES: Cognitive Services
• CUSTOM SERVICES: Azure Machine Learning
CODING & MANAGEMENT TOOLS
• VS Tools for AI, Azure ML Studio, Azure ML Workbench, Others (PyCharm, Jupyter Notebooks…)
DEEP LEARNING FRAMEWORKS
• Cognitive Toolkit, TensorFlow (3rd party), Caffe (3rd party), Others (Scikit-learn, MXNet, Keras, Chainer, Gluon…)
Infrastructure
• AI ON DATA: Cosmos DB, SQL DB, SQL DW, Data Lake
• AI COMPUTE: Spark, DSVM, Batch AI, ACS, Edge; CPU, GPU, FPGA
Azure Data & AI Stack
Orchestration: Data Factory, Integration Services, Data Catalog
Data Storage: Data Lake Store, SQL Data Warehouse, SQL Azure, Storage Blob, CosmosDB, Table Storage, SQL Server 2017
Machine Learning: ML Workbench, ML Experimentation Service, ML Model Management Service, ML Server, Data Science & Deep Learning VM
Intelligence: Cognitive Services, Operationalization, Functions & Serverless, Bot Framework, Cortana
IoT: Event Hubs, Stream Analytics, IoT Hub
Big Data & Analytics: HDInsight / Databricks, Data Lake Analytics
Dashboards & Visualizations: Power BI, Analysis Services
[Diagram: Azure data platform by stage: Data Sources, Streaming, Storage, OLTP, OLAP, Analytics, Presentation. Components shown include sensors, field and protocol gateways, IoT Edge, IoT Hub, Event Hubs, Service Bus, Event Grid, Stream Analytics, Time Series Insights, Azure Storage (blob, table, queues; archive, cool, hot and premium tiers), on-prem storage, Data Lake Store, Data Lake Analytics (U-SQL), SQL Data Warehouse, PolyBase, Azure SQL, SQL Server 2017, MySQL, PostgreSQL, Cosmos DB, HDInsight (MapReduce, YARN, Tez, LLAP), Spark ML, Batch, Jupyter, Machine Learning Workbench, Machine Learning models, Cognitive Services, SSIS, Data Sync, Azure Data Catalog, Data Factory V2]
AI Options?
Machine learning product | What it is | What you can do with it
In the cloud
Azure Machine Learning service | Managed cloud service for ML | Train, deploy, and manage models in Azure using Python and CLI
Azure Machine Learning Studio | Drag-and-drop visual interface for ML | Build, experiment, and deploy models using preconfigured algorithms
Azure Databricks | Spark-based analytics platform | Build and deploy models and data workflows
Azure Cognitive Services | Azure services with pre-built AI and ML models | Easily add intelligent features to your apps
Azure Data Science Virtual Machine | Virtual machine with pre-installed data science tools | Develop ML solutions in a pre-configured environment
On-premises
SQL Server Machine Learning Services | Analytics engine embedded in SQL | Build and deploy models inside SQL Server
Microsoft Machine Learning Server | Standalone enterprise server for predictive analysis | Build and deploy models with R and Python
Developer tools
ML.NET | Open-source, cross-platform ML SDK | Develop ML solutions for .NET applications
Windows ML | Windows 10 ML platform | Evaluate trained models on a Windows 10 device
Azure Roadmap for AI + Machine Learning
Azure continues to grow; to keep informed, see the roadmap: https://azure.microsoft.com/en-us/roadmap/
• Generally Available
• In Public Preview
• Development
*** Strategy roadmaps are shared with partners and customers under strict NDA
Pitfalls of AI
Understanding of AI
• Where to Start and Finish – Scope
• Qualification Criteria
Management of AI
• Tech. Management
• Skills Management
• IP Management
Use of AI
• Use and Abuse of AI (not every problem is an AI problem)
Traditional vs. Modern Analytics
Demo Notebooks – Statistical and ML
Classic vs. Modern AI
DS vs ML vs AI
• Data science produces insights.
• Machine learning produces predictions.
  • Deep learning does this over deep iterations simulated in an ANN architecture
• Artificial intelligence produces actions.
  • RPAs or IPAs
Quick Comparison
Machine Learning
We use a machine algorithm to parse data, learn from that data, and make informed decisions based on what it has learned. For example:
• Find-S
• Decision trees
• Random forests
• Support Vector Machines
• Artificial Neural Networks
Deep Learning
Deep Learning is used in layers to create an Artificial “Neural Network” that can learn and make intelligent decisions on its own. Any Deep Neural Network will consist of three types of layers:
• The Input Layer
• The Hidden Layer
• The Output Layer
Feature Comparison of ML and DL
ML
• Data Dependencies – Small Data
• Hardware Dependencies – Low-end Machines
• Feature Engineering – Manual
• Problem Solving – Divided Approach
  • object detection
  • object recognition
• Execution Time – Short (Less Accurate and Generalizable)
• Interpretability – Easy to Moderate
DL
• Data Dependencies – Big Data
• Hardware Dependencies – High-end Machines (GPU – Matrix Operations)
• Feature Engineering – Automatic
• Problem Solving – Unified Approach
• Execution Time – Long (More Accurate and Generalizable)
• Interpretability – Hard (Mathematical Model based Proofs)
Human vs. Computer
Overfitting vs Underfitting – Model Training Error – (Learnt via Gradient Descent)
• Analysis of Bias + Analysis of Variance = Error
• Cost = Generated Output – Actual or Desired Output
• Gradient = Rate at which cost / error changes w.r.t. weight and bias
• Loss = Amount of Information Lost during reconstruction
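The cost and gradient definitions above can be sketched on a toy problem. This is a minimal illustration, not the deck's demo code: it assumes a one-feature linear model, a mean-squared-error cost (the squared form of "generated output minus desired output"), and made-up data; every function name and parameter value is hypothetical.

```python
# Minimal gradient-descent sketch for a one-feature linear model y ≈ w*x + b.
# Assumptions: mean-squared-error cost, toy data, illustrative names throughout.

def predict(w, b, x):
    return w * x + b

def mean_squared_cost(w, b, xs, ys):
    # Cost: average squared gap between generated and desired output.
    return sum((predict(w, b, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradients(w, b, xs, ys):
    # Gradient: rate at which the cost changes w.r.t. the weight and the bias.
    n = len(xs)
    dw = sum(2 * (predict(w, b, x) - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (predict(w, b, x) - y) for x, y in zip(xs, ys)) / n
    return dw, db

def fit(xs, ys, learning_rate=0.05, epochs=2000):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        dw, db = gradients(w, b, xs, ys)
        # Step against the gradient to reduce the cost.
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # toy data generated from y = 2x + 1
w, b = fit(xs, ys)
print(f"w={w:.4f}, b={b:.4f}, cost={mean_squared_cost(w, b, xs, ys):.2e}")
```

On this noise-free data the fitted weight and bias should approach 2 and 1; with real data the learning rate and epoch count would need tuning.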
Trade-off Model Complexity Graph
Splitting Data – Post Sampling
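One common way to split data is a shuffled train / validation / test partition. The sketch below assumes a 60/20/20 split and a fixed seed; both the proportions and the helper name `split_dataset` are illustrative choices, not something the slide specifies.

```python
import random

def split_dataset(samples, train_frac=0.6, val_frac=0.2, seed=42):
    """Shuffle samples and split them into train / validation / test lists."""
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains is held out for testing
    return train, val, test

train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

The model is fitted on the training set, tuned against the validation set, and scored once on the untouched test set.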
Evaluation Metrics
Confusion Matrix
Accuracy
Precision vs Recall
[Figure: a high-recall model compared with a high-precision model]
Arithmetic vs Harmonic Mean
Harmonic Mean
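To make the comparison concrete, the sketch below derives accuracy, precision, and recall from confusion-matrix counts and compares the arithmetic mean of precision and recall with their harmonic mean (the F1 score). The counts are invented for illustration and the function assumes non-zero denominators.

```python
def classification_metrics(tp, fp, fn, tn):
    """Metrics from confusion-matrix counts (assumes non-zero denominators)."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)  # of everything predicted positive, how much was right
    recall = tp / (tp + fn)     # of everything actually positive, how much was found
    arithmetic_mean = (precision + recall) / 2
    # The harmonic mean (F1) is dragged towards the smaller of the two values.
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "arithmetic_mean": arithmetic_mean,
        "f1": f1,
    }

# Hypothetical high-recall, low-precision model: it finds every positive
# but raises many false alarms.
m = classification_metrics(tp=10, fp=90, fn=0, tn=900)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Here the arithmetic mean (0.55) looks respectable while the harmonic mean (about 0.18) exposes the poor precision, which is why F1 is usually preferred.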
Deep Learning Theory?
What is an ANN (Artificial Neural Network)?
Neural Networks = Pattern Recognition
Types of ANN
FeedForward (Classifiers/Regressors)
Feedback (Forecasters)
Early ANNs?
• Vanilla Neural Nets = Multilayer Perceptrons (MLPs)
Input(s) × Weight(s) + Bias(es) = Output(s)
Data Pattern Complexity
• Simple – Use Methods like SVM
• Moderate – Deep Nets Outperform
• Complex – Deep Nets – Only Practical Choice
CPUs vs GPUs vs FPGAs
• Parallel Programming = Parallelism implemented at software level
  • Partitioning, Grouping, Threading & Multi-Tasking (Pseudo-Parallelism)
• Parallel Processing = Parallelism implemented at hardware level
  • Shared Memory = GPU (100s to 1000s of cores)
  • Distributed Computing = Data, Model & Pipeline Parallelism
How to choose a Deep Net?
Problem Domain | Choice of Algorithms (Techniques)
Classification | MLP/RELU – Deep Belief Net
Time Series Analysis | Recurrent Net
Issues = Vanishing Gradient Descent Backpropagation with drop out (Random Pruning)
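The vanishing-gradient issue can be illustrated numerically. During backpropagation the gradient reaching an early layer is multiplied by one activation derivative per layer. The sketch below is a simplification: it assumes every weight is 1.0 and every layer sees the same pre-activation value, so only the activation derivatives matter; the function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)  # never larger than 0.25

def relu_grad(z):
    return 1.0 if z > 0 else 0.0

def backprop_factor(activation_grad, pre_activation, depth):
    """Product of per-layer activation derivatives (weights assumed to be 1.0)."""
    factor = 1.0
    for _ in range(depth):
        factor *= activation_grad(pre_activation)
    return factor

for depth in (1, 5, 10, 20):
    sig = backprop_factor(sigmoid_grad, 0.0, depth)  # best case for sigmoid
    rel = backprop_factor(relu_grad, 1.0, depth)     # active ReLU unit
    print(f"depth={depth:2d}  sigmoid={sig:.3e}  relu={rel:.1f}")
```

Even in the sigmoid's best case the factor shrinks by 4x per layer, while an active ReLU passes the gradient through unchanged, which is one reason ReLU appears in the table above.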
KL Divergence = Compare Actual to Recreation
Deep Learning Tools and Libraries
Implementation Libraries
• TensorFlow
• CNTK
• PyTorch
• Caffe
• MXNet
[Diagram labels: Code, Keras, Training / Execution Platform (CPUs, GPUs & FPGAs), Model]
RBM (Restricted Boltzmann Machine)
• Feature Extractor Neural Net (automatically recognises inherent patterns in data)
• Belongs to the autoencoder family: it needs no labelled data, encodes the data's own structure into patterns, and identifies them
DBN (Deep Belief Net)
• Stack of RBMs
• The hidden layer of one RBM is the visible layer of the one above it
• RBM 1..n (Training) – each RBM layer learns the full input … called Pre-Training
• Fine-tuning happens in the cost/error learning phase… uses dropout (random pruning) to avoid VGD (vanishing gradient descent).
Convolutional Neural Networks (CNN)
• Good for Image Processing
• Consists of a series of Convolutional & Pooling Layers
  • A Convolutional Layer filters the input for a specific pattern
• The RELU Layer is an Activation Layer
• Pooling is used for Dimensionality Reduction (to reduce the identified complex patterns); it is an optimisation technique
• Fully Connected Layer (to classify data samples)
Recurrent Neural Network (RNN)
• Built with a feedback loop (good for use as a forecasting engine)
• Single input, sequence output
  • Image Captioning
• Sequence input, single output
  • Document Classification
• Uses sequences of inputs and produces sequences
  • Classify videos frame by frame
• With the introduction of a time delay
  • Can do statistical forecasting
• Difficult to train because VGD compounds exponentially
  • Gating is the solution (LSTM and GRU)
  • Better Optimisers, Steeper Gates & Gradient Clipping
• GPU optimises by 250%
Autoencoders
• RBM is stack of autoencoders
• Feature Extraction Engine
• It is a shallow model: Input → Hidden → Output
  • Encode
  • Decode
• Backpropagation using Loss not Cost
• Performs dimensionality reduction, often better than PCA (its predecessor)
RNTN (Recursive Neural Tensor Nets)
• Good for Sentiment Analysis using Context (Grouping based)
• Can be used for image processing when an image or scene has many different components
• Uses recursive parsing and grouping of sentences
• S → NP → VP …
Deep Learning Use Cases
• Machine Vision
  • Image Classification / Tagging
  • Object Recognition
  • Video Analysis / Recognition (Automated Vehicles, Video Insight & Security)
• Speech Recognition
  • Text Processing (TTS, STT)
  • Sentiment Analysis
  • Voice Recognition
Demos – Pre-canned
1. Vision Demos
2. Speech Demos
3. Text Demos
4. Metamind
5. Tensorflow Playground
Domain Applications
• Medical
  • Cancer Detection (capturing 6,642 factors in consideration)
  • Drug Discovery (molecular activity challenge – on cell biology based chemical structures)
  • Radiology (Tumours / Cancers via MRI, fMRI, EKG, CT scan, etc.)
• Financial
  • Portfolio Allocations
  • Risk profiles
  • Short Term Trading
  • Long Term Trading
• Digital Advertising
  • Segmentation
  • Ad-space Bidding
• Fraud Detection
• Market & Sales
  • Cross & Up Sell
• Agriculture
  • Satellite and Sensor Feed Analysis for problematic environment field conditions
What Areas of Deep Learning?
Optimisations Techniques using HPT
• Pre-Training
• Dropouts
• Regularisation (L0, L1 & L2) – penalisation (penalty) on weights and biases
• Max Norm Constraint (capping technique) on weights and biases
• Epochs, Batch Size & Iterations
• Parallelism
Parallelism
• Data Parallelism
• Model Parallelism
• Pipeline Parallelism
Architecting the DNNs? – Hyperparameters
• Layers
  • Input
  • Hidden = Convolution, Pooling, Activation & Loss
    • Matrix Operation – Data Shaping
  • Output Layers – Cost Function (Sum of Squares, Cross Entropy)
• Neurons
  • Growing (small to large)
  • Pruning (large to small)
• Activations
  • Sigmoid, RELU, TANH… etc.
• Regularisation
  • L0, L1 & L2
• Learning Rate (start small … 0.001 … etc.)
• Epochs, Iterations & Batch Size… etc.
Transfer Learning
• Take a model that is good at one task and tweak it to solve another task
• Drop the input and output layers, then train new I/O layers on domain-specific data for the new use case
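The idea can be sketched in a few lines, under heavy simplification. The example below pretends that a two-neuron hidden layer was already trained on an earlier task, keeps it frozen, and trains only a new output layer on data for a new task. It replaces just the output layer (not the input layer), and the weights, data, and names are all invented for illustration.

```python
import math

# Hypothetical "pre-trained" hidden layer: (weights, bias) per neuron, kept frozen.
FROZEN_HIDDEN = [([1.0, -1.0], 0.0), ([-1.0, 1.0], 0.0)]

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def features(x):
    # Reuse the frozen layer as a feature extractor.
    return [relu(sum(w * xi for w, xi in zip(ws, x)) + b) for ws, b in FROZEN_HIDDEN]

def train_new_head(data, learning_rate=0.5, epochs=500):
    """Train only the new output layer; the frozen hidden layer is never updated."""
    w = [0.0] * len(FROZEN_HIDDEN)
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            h = features(x)
            p = sigmoid(sum(wi * hi for wi, hi in zip(w, h)) + b)
            err = p - y  # cross-entropy gradient w.r.t. the output pre-activation
            w = [wi - learning_rate * err * hi for wi, hi in zip(w, h)]
            b -= learning_rate * err
    return w, b

def predict(x, w, b):
    return sigmoid(sum(wi * hi for wi, hi in zip(w, features(x))) + b)

# New task (domain-specific data): output 1 when the two inputs differ.
new_task = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
w, b = train_new_head(new_task)
for x, y in new_task:
    print(x, y, round(predict(x, w, b), 3))
```

Because the frozen features already separate the new classes, training the small output layer is enough; in practice a framework would load real pre-trained weights instead of the hand-written ones assumed here.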
• neural-storyteller
  • neural-storyteller is a recurrent neural network that generates little stories about images. This repository contains code for generating stories with your own images, as well as instructions for training new models.
  • *We were barely able to catch the breeze at the beach , and it felt as if someone stepped out of my mind . She was in love with him for the first time in months , so she had no intention of escaping . The sun had risen from the ocean , making her feel more alive than normal . She 's beautiful , but the truth is that I do n't know what to do . The sun was just starting to fade away , leaving people scattered around the Atlantic Ocean . I d seen the men in his life , who guided me at the beach once more .*
GAN (Generative Adversarial Networks)
• GANs are a class of artificial intelligence algorithms used in unsupervised machine learning, implemented by a system of two neural networks contesting with each other in a zero-sum game framework
• This technique can generate photographs that look at least superficially authentic to human observers, having many realistic characteristics (though in tests people can tell real from generated in many cases)
Reinforcement Learning
• Deep Q Network (DQN)
Coded Demos
• Regression
• Classification
• Clustering
• Text Processing
• Image Processing
References
• https://dzone.com/articles/comparison-between-deep-learning-vs-machine-learni
• https://becominghuman.ai/cheat-sheets-for-ai-neural-networks-machine-learning-deep-learning-big-data-678c51b4b463
• Luis Serrano (YouTube Channel) - https://www.youtube.com/channel/UCgBncpylJ1kiVaPyP-PZauQ
• Deep Learning TV (YouTube Channel) - https://www.youtube.com/channel/UC9OeZkIwhzfv-_Cb7fCikLQ