
Learning Methods: Supervised, Semi-supervised, Weakly-supervised, and Unsupervised Learning
Hao Dong, 2019, Peking University

Outline
• Supervised, Semi-supervised, Weakly-supervised, and Unsupervised Learning from the data point of view
• Unsupervised Learning
• Semi-supervised Learning
• Weakly-supervised Learning
• Summary

From the Data Point of View
• Supervised Learning: data is available for both the input x and the output y, with a known mapping between them. Learn the mapping f in y = f(x). Examples: image classification, object detection, …
• Unsupervised Learning: data is available for the input only. Learn the self-mapping f in x = f(x). Examples: autoencoders (where the output is the learned features), GANs, …
• Semi-supervised Learning: data is available for both the input x and the output y, but the mapping is only partially known. Learn the mapping f in y = f(x).
• Weakly-supervised Learning: data is available for both the input x and the output y with a known mapping for y; learn the mapping f for another output y', i.e. y' = f(x). Example: learn segmentation via classification labels.
(For a taxonomy of weakly-supervised learning, see https://niug1984.github.io/paper/niu_tdlw2018.pdf)

Unsupervised Learning
• Unsupervised learning addresses problems where we have no labeled answers, such as clustering, dimensionality reduction, and anomaly detection.
• Clustering: EM
• Dimension reduction: PCA
• …

Unsupervised Learning • Autoencoder
• In practice it is difficult to obtain a large amount of labeled data, but easy to get a large amount of unlabeled data.
• Learning a good feature extractor from unlabeled data and then learning the classifier from labeled data can improve performance.
• An autoencoder has an encoder (input layer → hidden layer) and a decoder (hidden layer → output layer). The number of hidden units is usually smaller than the number of inputs: dimension reduction, i.e. feature learning.
• Given N data samples, the reconstruction loss is
  ℒ_MSE = (1/N) Σ_{n=1}^{N} ‖x̂^{(n)} − x^{(n)}‖²
• The autoencoder learns an approximation to the identity function, so that the input is "compressed" into the features, discovering interesting structure in the data.
• An autoencoder is an unsupervised learning method if we consider the features as the "output". It is also a self-taught learning method: a type of supervised learning in which the training labels are determined by the input data itself. Word2Vec is another unsupervised, self-taught learning example.
• Example: an autoencoder for the MNIST dataset (28×28×1, 784 pixels).

Unsupervised Learning • Sparse Autoencoder
• Even when the number of hidden units is large (perhaps even greater than the number of input pixels), we can still discover interesting structure by imposing other constraints on the network.
• In particular, if we impose a "sparsity" constraint on the hidden units (with Sigmoid activations, e.g. 0.02 is "inactive" and 0.97 is "active"), the autoencoder will still discover interesting structure in the data.
• Given N data samples and the Sigmoid activation function, the active ratio of hidden neuron j is
  ρ̂_j = (1/N) Σ_{n=1}^{N} a_j^{(n)}
• To make the hidden output "sparse", we enforce the constraint ρ̂_j = ρ, where ρ is a "sparsity parameter" such as 0.2 (the neuron is active for 20% of the samples).
• The penalty term, where s is the number of hidden (encoder output) neurons, is
  ℒ_ρ = Σ_{j=1}^{s} KL(ρ ‖ ρ̂_j) = Σ_{j=1}^{s} [ ρ log(ρ/ρ̂_j) + (1−ρ) log((1−ρ)/(1−ρ̂_j)) ]
• The total loss: ℒ_total = ℒ_MSE + ℒ_ρ (a code sketch follows below)
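To make the reconstruction loss and the KL sparsity penalty above concrete, here is a minimal sketch, assuming PyTorch (the slides do not specify a framework); the 784→196 layer sizes, the batch size, and ρ = 0.2 are illustrative choices, not values from the slides.

```python
# Minimal sparse autoencoder sketch: L_total = L_MSE + L_rho (sigmoid/KL variant).
import torch
import torch.nn as nn

class SparseAE(nn.Module):
    def __init__(self, n_in=784, n_hidden=196):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_in, n_hidden), nn.Sigmoid())
        self.decoder = nn.Sequential(nn.Linear(n_hidden, n_in), nn.Sigmoid())

    def forward(self, x):
        a = self.encoder(x)         # hidden activations a in (0, 1)
        return self.decoder(a), a   # reconstruction x_hat and activations

def sparse_ae_loss(x, x_hat, a, rho=0.2):
    # L_MSE: mean squared reconstruction error over the N samples
    l_mse = ((x_hat - x) ** 2).mean()
    # rho_hat_j: average activation of hidden neuron j over the N samples
    rho_hat = a.mean(dim=0).clamp(1e-6, 1 - 1e-6)
    # L_rho = sum_j KL(rho || rho_hat_j)
    l_rho = (rho * torch.log(rho / rho_hat)
             + (1 - rho) * torch.log((1 - rho) / (1 - rho_hat))).sum()
    return l_mse + l_rho            # L_total = L_MSE + L_rho

model = SparseAE()
x = torch.rand(64, 784)             # a batch of flattened 28x28 images in [0, 1]
x_hat, a = model(x)
loss = sparse_ae_loss(x, x_hat, a)
loss.backward()
```

Because the hidden activations are sigmoids in (0, 1), ρ̂_j is simply the average activation of neuron j over the batch; the clamp only guards the logarithms against values of exactly 0 or 1.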
Unsupervised Learning • Sparse Autoencoder
• A smaller ρ gives a sparser code.
• Autoencoders on the MNIST dataset: input x, plain autoencoder reconstruction x̂, sparse autoencoder reconstruction x̂.
• Visualizing the learned features: one neuron == one feature extractor.
• Two common training recipes:
  Method   | Hidden activation | Reconstruction activation | Loss function
  Method 1 | Sigmoid           | Sigmoid                   | ℒ_total = ℒ_MSE + ℒ_ρ
  Method 2 | ReLU              | Softplus                  | ℒ_total = ℒ_MSE + λ·ℒ_1 on the hidden activations

Unsupervised Learning • Denoising Autoencoder
• Apply dropout between the input and the first hidden layer.
• This improves the robustness of the learned features.
• Compare the features learned by the sparse autoencoder and the denoising autoencoder.
• Autoencoders on the MNIST dataset: input x, autoencoder x̂, sparse autoencoder x̂, denoising autoencoder x̂.

Unsupervised Learning • Stacked Autoencoder
Greedy layer-wise training (red lines indicate the trainable weights, black lines the fixed/non-trainable weights):
• Unsupervised: train the first autoencoder (input → hidden 1 → output); hidden 1 is the feature extractor for the input data.
• Unsupervised: train the second autoencoder (hidden 1 → hidden 2 → output); hidden 2 is the feature extractor for the first feature extractor.
• Unsupervised: train the third autoencoder; hidden 3 is the feature extractor for the second feature extractor.
• Unsupervised: train the fourth autoencoder; hidden 4 is the feature extractor for the third feature extractor.
• Supervised: learn to classify the input data using the labels and the high-level features.
• Supervised: fine-tune the entire model for classification.

Unsupervised Learning • Variational Autoencoder (VAE)
• Can the hidden output follow a prior distribution, e.g. the Normal distribution N(0, 1)?
• If yes, the decoder becomes a Generator that can map N(0, 1) to the data space.
• The encoder and decoder form a bidirectional mapping between the data space and the latent space.
• The penalty term is ℒ_KL = KL(N(0, 1) ‖ ẑ), where ẑ is the distribution of the encoded latent variables.

Unsupervised Learning • Generative Adversarial Network (GAN)
• A generator G maps z drawn from a Normal/uniform distribution to a fake sample x̂ = G(z); a discriminator D tries to separate fake samples from real samples x.
• A GAN learns a unidirectional mapping from the latent space to the data space.
• The objective (a code sketch appears at the end of these notes):
  min_G max_D V(D, G) = E_{x∼p_data}[log D(x)] + E_{z∼p_z}[log(1 − D(G(z)))]
  ℒ_D = −E_{x∼p_data}[log D(x)] − E_{z∼p_z}[log(1 − D(G(z)))]
  ℒ_G = −E_{z∼p_z}[log D(G(z))]

Semi-supervised Learning
• Motivation: unlabeled data is easy to obtain, while labeled data can be hard to get.
• Goal: semi-supervised learning mixes labeled and unlabeled data to produce better models.
• vs. transductive learning: semi-supervised learning is eventually applied to the testing data, whereas transductive learning only concerns the given unlabeled data.

Semi-supervised Learning • Unlabeled data can help
• Unlabeled data can help to find a better decision boundary.

Semi-supervised Learning • Pseudo-Labelling
• Train on the labeled data, predict labels for the unlabeled data, and retrain on both (a code sketch appears at the end of these notes).

Semi-supervised Learning • Generative Methods
• EM with some labelled data: cluster the data, then label the clusters.
• Unlabeled data may also hurt learning, e.g. when the assumed multi-variable Gaussian model is wrong.

Semi-supervised Learning • Graph-based Methods
1. Define the similarity s(x_i, x_j)
2. Add edges: KNN or ε-neighborhood
3. Set the edge weight proportional to s(x_i, x_j)
4. Propagate labels through the graph

Semi-supervised Learning • Low-density Separation
• Semi-supervised SVM (S3VM), also known as the Transductive SVM (TSVM).

Weakly-supervised Learning
• Weakly-supervised learning is a machine learning framework where the model is trained using examples that are only partially annotated or labeled.

Weakly-supervised Learning • Attention CycleGAN
• Learn the segmentation via synthesis.
  Unsupervised Attention-guided Image-to-Image Translation. Mejjati, Y. A., Richardt, C., & Cosker, D. NIPS, 2018.

Weakly-supervised Learning • Semantic Image Synthesis
• Learn the segmentation via synthesis: x̂ = G(x, …
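Returning to the GAN objective from the unsupervised-learning section, here is a minimal sketch of the discriminator and generator losses written on that slide, assuming PyTorch; the MLP architectures, sizes, learning rate, and flattened-image data format are illustrative assumptions, not details from the slides.

```python
# Minimal GAN training step: L_D and L_G as written on the GAN slide.
import torch
import torch.nn as nn

latent_dim, data_dim = 100, 784
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                  nn.Linear(256, data_dim), nn.Tanh())      # generator: z -> x_hat
D = nn.Sequential(nn.Linear(data_dim, 256), nn.ReLU(),
                  nn.Linear(256, 1), nn.Sigmoid())          # discriminator: x -> [0, 1]

bce = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

def train_step(x_real):
    z = torch.randn(x_real.size(0), latent_dim)              # z ~ N(0, 1)

    # L_D = -E[log D(x)] - E[log(1 - D(G(z)))]
    opt_d.zero_grad()
    d_real = D(x_real)
    d_fake = D(G(z).detach())                                # do not update G here
    loss_d = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    loss_d.backward()
    opt_d.step()

    # L_G = -E[log D(G(z))]  (non-saturating form, as on the slide)
    opt_g.zero_grad()
    d_fake = D(G(z))
    loss_g = bce(d_fake, torch.ones_like(d_fake))
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()
```

Here x_real would be a batch of real samples (e.g. flattened MNIST images scaled to [−1, 1] to match the Tanh output); the discriminator and generator are updated alternately, following the min-max objective above.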