Learning Methods: Supervised, Semi-supervised, Weakly-supervised, and Unsupervised Learning (30 mins)


Hao Dong, Peking University, 2019

Outline
• Supervised, semi-supervised, weakly-supervised, and unsupervised learning from a data point of view
• Unsupervised learning
• Semi-supervised learning
• Weakly-supervised learning
• Summary

From a Data Point of View
• Supervised learning: data is available for both the input $x$ and the output $y$, with a known mapping between them; we learn the mapping $f$ in $y = f(x)$. Examples: image classification, object detection.
• Unsupervised learning: data is available for the input only; we learn a self-mapping $f$, with the learned features acting as the output, e.g. $\hat{x} = f(x)$ for an autoencoder. Examples: autoencoders, GANs.
• Semi-supervised learning: data is available for both the input $x$ and the output $y$, but the mapping is known only for part of the data; we still learn the mapping $f$ in $y = f(x)$.
• Weakly-supervised learning: data is available for the input $x$ and an output $y$ with a known mapping, but we learn the mapping $f$ to another output $y'$, i.e. $y' = f(x)$. Example: learning segmentation from classification labels.
(Figure: overview of weakly-supervised learning settings, from https://niug1984.github.io/paper/niu_tdlw2018.pdf)

Unsupervised Learning
• Unsupervised learning addresses problems where we have no labeled answers, such as clustering, dimensionality reduction, and anomaly detection.
  • Clustering: EM
  • Dimensionality reduction: PCA

• Autoencoder
  • In practice it is difficult to obtain a large amount of labeled data, but easy to obtain a large amount of unlabeled data.
  • Learning a good feature extractor from unlabeled data and then training the classifier on the labeled data can improve performance.
  • The network consists of an encoder (input layer to hidden layer) and a decoder (hidden layer to output layer); the hidden units are usually fewer than the inputs, so the autoencoder performs dimension reduction, i.e. feature learning.
  • Given $N$ data samples, the reconstruction loss is
    $\mathcal{L}_{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left\|\hat{x}^{(i)} - x^{(i)}\right\|^2$
  • The autoencoder learns an approximation to the identity function, so that the input is compressed into the hidden features, discovering interesting structure in the data.
  • An autoencoder is an unsupervised learning method if we regard the features as the "output". It is also a self-taught learning method: a type of supervised learning in which the training labels are determined by the input data itself. Word2Vec is another unsupervised, self-taught learning example.
  (Figure: autoencoder for the MNIST dataset, 28×28×1 = 784 pixels.)

• Sparse Autoencoder
  • Even when the number of hidden units is large (perhaps even greater than the number of input pixels), we can still discover interesting structure by imposing other constraints on the network.
  • In particular, if we impose a sparsity constraint on the hidden units, the autoencoder will still discover interesting structure in the data even though the hidden layer is large.
  (Figure: sigmoid hidden units with example activations 0.02 and 0.01 labeled "inactive", 0.97 and 0.98 labeled "active".)
  • Given $N$ data samples and the sigmoid activation function, the active ratio of hidden neuron $j$ is
    $\hat{\rho}_j = \frac{1}{N}\sum_{i=1}^{N} a_j^{(i)}$
  • To make the hidden output sparse, we enforce the constraint $\hat{\rho}_j = \rho$, where $\rho$ is a "sparsity parameter" such as 0.2 (20% of the neurons active).
  • The penalty term is as follows, where $s$ is the number of hidden neurons:
    $\mathcal{L}_{\rho} = \sum_{j=1}^{s} \mathrm{KL}(\rho \,\|\, \hat{\rho}_j) = \sum_{j=1}^{s}\left(\rho\log\frac{\rho}{\hat{\rho}_j} + (1-\rho)\log\frac{1-\rho}{1-\hat{\rho}_j}\right)$
  • The total loss: $\mathcal{L}_{total} = \mathcal{L}_{MSE} + \mathcal{L}_{\rho}$
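To make the loss concrete, here is a minimal sketch of a sparse autoencoder in PyTorch. The framework, layer sizes, penalty weight, and training details are illustrative assumptions rather than anything specified on the slides; only the form of the loss, $\mathcal{L}_{MSE}$ plus the KL sparsity penalty with $\rho = 0.2$, follows the formulas above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseAutoencoder(nn.Module):
    """One-hidden-layer autoencoder with sigmoid units, as in the slides."""
    def __init__(self, n_in=784, n_hidden=256):   # sizes are assumed, not from the slides
        super().__init__()
        self.encoder = nn.Linear(n_in, n_hidden)
        self.decoder = nn.Linear(n_hidden, n_in)

    def forward(self, x):
        a = torch.sigmoid(self.encoder(x))        # hidden activations in (0, 1)
        x_hat = torch.sigmoid(self.decoder(a))    # reconstruction
        return x_hat, a

def kl_sparsity(a, rho=0.2):
    # rho_hat_j: average activation of hidden neuron j over the batch,
    # then sum_j KL(rho || rho_hat_j), following the penalty term above.
    rho_hat = a.mean(dim=0).clamp(1e-6, 1 - 1e-6)
    return (rho * torch.log(rho / rho_hat)
            + (1 - rho) * torch.log((1 - rho) / (1 - rho_hat))).sum()

model = SparseAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(64, 784)                           # stand-in batch for MNIST-sized inputs
x_hat, a = model(x)
loss = F.mse_loss(x_hat, x) + 0.1 * kl_sparsity(a)  # L_total = L_MSE + beta * L_rho (beta assumed)
opt.zero_grad()
loss.backward()
opt.step()
```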
  • A smaller sparsity parameter $\rho$ gives a sparser hidden representation.
  (Figure: MNIST reconstructions comparing the input $x$, a plain autoencoder's $\hat{x}$, and a sparse autoencoder's $\hat{x}$.)
  • Visualizing the learned features: each hidden neuron acts as one feature extractor.
  • Two common configurations:

    Method   | Hidden activation | Reconstruction activation | Loss function
    Method 1 | Sigmoid           | Sigmoid                   | $\mathcal{L}_{total} = \mathcal{L}_{MSE} + \mathcal{L}_{\rho}$
    Method 2 | ReLU              | Softplus                  | $\mathcal{L}_{total} = \mathcal{L}_{MSE} + \lambda\,\mathcal{L}_{1}$ on the hidden activations

• Denoising Autoencoder
  • Apply dropout between the input and the first hidden layer, so the network must reconstruct the clean input from a corrupted version.
  • This improves the robustness of the learned features.
  (Figure: features learned by a sparse autoencoder vs. a denoising autoencoder, and MNIST reconstructions for the input $x$, a plain autoencoder, a sparse autoencoder, and a denoising autoencoder.)

• Stacked Autoencoder
  • Unsupervised step 1: train an autoencoder on the input data; its encoder (hidden 1) becomes the feature extractor for the input data.
  • Unsupervised step 2: train a second autoencoder (hidden 2) on the output of the first encoder, i.e. a feature extractor for the first feature extractor.
  • Unsupervised step 3: train a third autoencoder (hidden 3) on the output of the second encoder.
  • Unsupervised step 4: train a fourth autoencoder (hidden 4) on the output of the third encoder.
  • At each step only the newly added encoder/decoder weights are trainable (red lines on the slides); the previously trained encoders are fixed/non-trainable (black lines).
  • Supervised step: learn to classify the input data by using the labels together with the high-level features from the stacked encoders.
  • Finally, fine-tune the entire model for classification, with all weights trainable; a sketch of the whole layer-wise procedure follows below.
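A compact sketch of this layer-wise procedure, assuming PyTorch, made-up layer widths, and random stand-in data; it illustrates greedy layer-wise pretraining followed by supervised fine-tuning, not the exact setup used in the lecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def train_ae(encoder, decoder, inputs, epochs=10, lr=1e-3):
    """Train one encoder/decoder pair; earlier encoders stay fixed because
    only this pair's parameters are given to the optimizer."""
    opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=lr)
    for _ in range(epochs):
        h = torch.sigmoid(encoder(inputs))
        recon = torch.sigmoid(decoder(h))
        loss = F.mse_loss(recon, inputs)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.sigmoid(encoder(inputs)).detach()   # features fed to the next autoencoder

x = torch.rand(256, 784)              # stand-in unlabeled data
y = torch.randint(0, 10, (256,))      # stand-in labels for the supervised stage
sizes = [784, 256, 64]                # assumed layer widths

# Unsupervised stage: pretrain one encoder at a time on the previous layer's features.
encoders, feats = [], x
for n_in, n_out in zip(sizes[:-1], sizes[1:]):
    enc, dec = nn.Linear(n_in, n_out), nn.Linear(n_out, n_in)
    feats = train_ae(enc, dec, feats)
    encoders.append(enc)

# Supervised stage: stack the pretrained encoders, add a classifier, fine-tune everything.
model = nn.Sequential(*[nn.Sequential(e, nn.Sigmoid()) for e in encoders],
                      nn.Linear(sizes[-1], 10))
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss = F.cross_entropy(model(x), y)
opt.zero_grad()
loss.backward()
opt.step()
```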
" # $ �% �$ �$ �$ �$ Red lines indicate the trainable weights � & Black lines indicate the fixed/nontrainable weights 25 Unsupervised Learning • Variational Autoencoder, VAE input layer hidden layer output layer Encoder Decoder Can the hidden output be a prior distribution, e.g., Normal distribution? �! �#! Decoder �#! �" �1 �#" �1 �#" �# �2 �## �2 �## If yes, we can have a Generator that �$ �3 �#$ �(0, 1) can map �(0, 1) to data space �3 �#$ �% �4 �#% �4 �#% �& �#& �#& 26 Unsupervised Learning • Variational Autoencoder, VAE Data Space Latent Space � �(0, 1) Bidirectional Mapping 27 The penalty term is as follow: ℒ ( = ��(�(0, 1) ||�g) Unsupervised Learning • Generative Adversarial Network, GAN � fake �(0, 1) � �T real � Normal/uniform distribuon min max � �, � = ��~5 log �(�) + ��~5 log(1 − �(� � ) 1 2 '()( * Unidirectional Mapping ℒ 2 = − ��~5'()( log �(�) − ��~5* log(1 − �(� � ) ℒ 1 = − ��~5* log �(� � ) 28 Latent space Data space � � � � � � � � � � Semi-supervised Learning 29 Semi-supervised Learning • Motivation: • Unlabeled data is easy to be obtained • Labeled data can be hard to get • Goal: • Semi-supervised learning mixes labeled and labeled data to produce better models. • vs. Transductive Learning: • Semi-supervised learning is eventually applied to the testing data • Transductive learning is only related to the unlabelled data 30 Semi-supervised Learning • Unlabeled data can help Unlabeled data can help to find a better boundary �' �' �& �& 31 Semi-supervised Learning • Pseudo-Labelling 32 Semi-supervised Learning • Generative Methods • EM with some labelled data Cluster and then label 33 Semi-supervised Learning • GeneraEve Methods Unlabeled data may hurt the learning Wrong Correct 34 Multi-variables Gaussian model Semi-supervised Learning • Graph-based Methods 1. De~ine the similarity s(�7, �)) 2. Add edges 1. KNN 2. e-Neighborhood 3. Edge weight is proportional to s(�7, �)) 4. Propagate through the graph 35 Semi-supervised Learning • Low-density separation • Semi-supervised SVM (S3VM) == Transductive SVM (TSVM) 36 Weakly-supervised Learning 39 Weakly-supervised Learning • Weakly supervised learning is a machine learning framework where the model is trained using examples that are only partially annotated or labeled. 40 Weakly-supervised Learning • Attention CycleGAN: Learn the segmentation via synthesis 41 Unsupervised Attention-guided Image-to-Image Translation. Mejjati, Y. A., Richardt, C., & Cosker, D. NIPS, 2018 Weakly-supervised Learning • Semantic Image Synthesis: Learn the segmentation via synthesis �4 = �(�,