Master Class: Deep Learning for Signals
Total Page:16
File Type:pdf, Size:1020Kb
Master Class: Deep Learning for Signals Abhijit Bhattacharjee Senior Application Engineer MathWorks © 2015 The MathWorks, Inc.1 Music Genre Classification 2 3 4 Agenda Why deep learning? Deep learning with signal data (Demo) Speech Command Recognition (Demo) LSTM Networks Enabling Features in MATLAB Deploying deep learning 5 What is Deep Learning? 6 Deep learning is a type of machine learning in which a model learns to perform tasks directly from image, time-series or text data. Deep learning is usually implemented using a neural network architecture. 7 What is Deep Learning? ▪ Subset of machine learning with automatic feature extraction – Can learn features directly from data – More Data = better model Machine Deep Learning Learning Deep Learning 8 Types of Datasets Tabular Time Series/ Image Data Signal Data ML or LSTM CNN LSTM or CNN 9 Image Example: Object recognition using deep learning Training Millions of images from 1000 (GPU) different categories Real-time object recognition using Prediction a webcam connected to a laptop 10 Signals Example: Analyzing signal data using deep learning Signal Classification using LSTMs Speech Recognition using CNNs 11 Deep Learning Workflow ACCESS AND EXPLORE LABEL AND PREPROCESS DEVELOP PREDICTIVE INTEGRATE MODELS WITH DATA DATA MODELS SYSTEMS Files Data Augmentation/ Hardware-Accelerated Desktop Apps Transformation Training Databases Labeling Automation Hyperparameter Tuning Enterprise Scale Systems Sensors Import Reference Network Visualization Embedded Devices and Models Hardware 12 How Does a Convolutional Neural Network Work? Edges Shapes Objects 13 Speech Command Recognition Using Convolutional Neural Networks No Up Right Off 14 Example: Speech Command Recognition Using Deep Learning CNN Network for Audio Classification audioDatastore 1. Train Stop Up Yes auditorySpectrogram <none> Google speech command dataset 2. Predict audioDeviceReader "Up" auditorySpectrogram 15 Command Recognition Demo 16 17 18 19 20 21 22 23 24 25 26 https://www.isca-speech.org/archive/interspeech_2015/papers/i15_1478.pdf 27 28 29 30 31 32 33 34 35 MATLAB Based Algorithm Wins the 2017 PhysioNet/CinC Challenge to Automatically Detect Atrial Fibrillation Challenge Design an algorithm that uses machine learning to detect atrial fibrillation and other abnormal heart rhythms in noisy, single-lead ECG recordings Solution Use MATLAB to analyze ECG data, extract features using signal processing and wavelet techniques, and evaluate Block diagram for Black Swan’s atrial fibrillation detection different machine learning algorithms to train and algorithm. implement a best-in-class classifier to detect AF “I don’t think MATLAB has any strong competitors for Results signal processing and wavelet analysis. When you add in ▪ First place in PhysioNet/CinC Challenge achieved its statistics and machine learning capabilities, it’s easy to ▪ ECG data visualized in multiple domains see why nonprogrammers enjoy using MATLAB, ▪ Feature extraction accelerated with parallel particularly for projects that require combining all these processing methods.” - Ali Bahrami Rad, Aalto University Link to user story 36 Agenda Why deep learning? Deep learning with signal data (Demo) Speech Command Recognition (Demo) LSTM Networks Enabling Features in MATLAB Deploying deep learning 37 I was born in France. I speak __________ ? 38 Recurrent Neural Networks ▪ Take previous data into account when making new predictions ▪ Signals, text, time series 39 I was born in France… [2000 words] … I speak __________ ? 40 Long Short Term Memory Networks ▪ RNN that carries a memory cell throughout the process ▪ Sequence Problems c0 C1 Ct 41 LSTM Demo 42 Time Series Classification (Human Activity Recognition) Long short-term memory networks ▪ Dataset is accelerometer and gyroscope signals captured with a smartphone ▪ Data is a collection of time series with 9 channels sequence to one Deep Learning 43 44 Agenda Why deep learning? Deep learning with signal data (Demo) Speech Command Recognition (Demo) LSTM Networks Enabling Features in MATLAB Deploying deep learning 45 Deep Learning on CPU, GPU, Multi-GPU and Clusters H OW TO TARGE T ? Single Single CPU CPU Single GPU Single CPU, Multiple GPUs On-prem server with Cloud GPUs GPUs (AWS) 46 Audio Datastore ads = audioDatastore('.\Dataset', ... 'IncludeSubfolders',true, ... 'FileExtensions','.wav', ... 'LabelSource','foldernames') ads = ▪ Programmatic interface to large Datastore with properties: Files: { collections of audio files ' ...\Temp\speech_commands_v0.01\_background_noise_ ' ...\Local\Temp\speech_commands_v0.01\_background_noise_ ' ...\Local\Temp\speech_commands_v0.01\_background_noise_ ▪ ... and 64724 more (Optional) auto-generation of } Labels: [_background_noise_; _background_noise_; _background_noise_ ... and 64724 more categorical] labels from folder names ReadMethod: 'File' OutputDataType: 'double' ads = getSubsetDatastore(ads,isCommand|isUnknown); ▪ Label-based indexing and countEachLabel(ads) ans = partitioning 11×2 table Label Count _______ _____ ▪ Random file sampling down 2359 go 2372 left 2353 no 2375 ▪ Automatic sequential file reading off 2357 on 2367 right 2367 stop 2380 ▪ Parallel access for pre- unknown 4143 up 2375 processing, feature extraction, or yes 2377 data augmentation [adsTrain,adsValidation,adsTest] = splitData(ads,datafolder); 47 “I love to label and preprocess my data” ~ Said no engineer, ever. 48 Audio Labeler ▪ Work on collections of recordings or record new audio directly within the app ▪ Navigate dataset and playback interactively ▪ Define and apply labels to – Entire files – Regions within files ▪ Import and export audio folders, label definitions and datastores 49 50 Signal Analyzer App Analyze multiple signals in time, frequency and time-frequency domains • Navigate through signals • Extract regions of interest • Generate MATLAB scripts 51 Feature Extraction: Time-Frequency Analysis Reassigned spectrogram Persistent spectrum Wavelet scalogram Constant Q transform and synchrosqueezing Empirical mode decomposition Kurtogram Instantaneous frequency Hilbert-Huang transform Instantaneous BW 52 Enabling Features Summary (For Audio Applications) Audio System Toolbox Neural Network TB (For Signal Applications) Signal Processing Toolbox Stats & ML TB Prototype! Ground Process Record & Read & Extract Truth & Train Predict store Partition Features Labeling augment ads = Datastore with properties: Files: { ' ...\Temp\speech_commands_v0.01\_background_noise_\doing_the_dishes.wav'; ' ...\Local\Temp\speech_commands_v0.01\_background_noise_\dude_miaowing.wav'; ' ...\Local\Temp\speech_commands_v0.01\_background_noise_\exercise_bike.wav' ... and 64724 more } Labels: [_background_noise_; _background_noise_; _background_noise_ ... and 64724 more categorical] ReadMethod: 'File' OutputDataType: 'double' 53 Agenda Why deep learning? Deep learning with signal data (Demo) Speech Command Recognition (Demo) LSTM Networks Enabling Features in MATLAB Deploying deep learning 54 Deploying Deep Learning Models for Inference NVIDIA TensorRT & cuDNN Libraries ARM Code Compute Generation Library CNN Intel MKL-DNN Library 55 GPU Coder Fills a Gap in the Deep Learning Solution Training Inference Access Data Preprocess Select Network Train Deploy Image Neural GPU Image Acq. PCT Processing Network Coder Computer Vision 56 With GPU Coder, MATLAB is fast GPU Coder is faster than TensorFlow, MXNet and Pytorch TensorFlow MXNet GPU Coder PyTorch Intel® Xeon® CPU 3.6 GHz - NVIDIA libraries: CUDA9 - cuDNN 7 - Frameworks: TensorFlow 1.8.0, MXNet 1.2.1, PyTorch 0.3.1 57 MathWorks can help you do Deep Learning Free resources More options ▪ Guided evaluations with a ▪ Consulting services MathWorks deep learning ▪ Training courses engineer ▪ Technical support ▪ Proof-of-concept projects ▪ Advanced customer support ▪ Deep learning hands-on ▪ Installation, enterprise, and cloud workshop deployment ▪ Seminars and technical deep ▪ MATLAB for Deep Learning dives ▪ Deep learning onramp course 58 Thank you! 59 Extra Slides 60 Machine learning vs. deep learning Deep learning performs end-to-end learning by learning features, representations and tasks directly from images, text, sound, and signals Machine Learning Machine Learning Deep Learning Deep Learning Subset of machine learning with automatic feature extraction 61 Feature Extraction: Spectral Analysis Welch periodogram BW measurements Lomb periodogram Octave spectrum Harmonic distortion Spectral statistics 62 Features Extraction and Signal Segmentation For Machine and Deep Learning ▪ Mel Frequency Cepstral Coefficients (MFCC) ▪ Pitch ▪ Voice Activity Detection 63 Mel Frequency Cepstral Coefficients (MFCC) ▪ Variation of "perceptually-adjusted" spectrum content over time ▪ Most common voice and speech features for Machine Learning applications ▪ Option of: – mfcc function, – cepstralFeatureExtractor System object – Cepstral Feature Extractor block (Simulink) ▪ Returns: – MFCC – "Delta" (= mfcc[n]-mfcc[n-1]) – "Delta delta" (= delta[n]-delta[n-1]) 64 Pitch Extraction ▪ Estimate audio pitch over time ▪ Popular feature for machine learning in voice and speech processing ▪ Choice of 5 different popular algorithms – 'NCF' – Normalized Correlation Function – 'PEF' – Pitch Estimation Filter – 'CEP' – Cepstrum Pitch Determination – 'LHS' – Log-Harmonic Summation – 'SRH' – Summation of Residual Harmonics [ft, idx] = pitch(audioIn, fs); 65 Voice Activity Detection (VAD) ▪ Standard algorithm to segment voice and speech signals ▪ voiceActivityDetector System object for MATLAB ▪