Transfer Learning and the Importance of Datasets Application Note
Recommended publications
-
Transfer Adaptation Learning: a Decade Survey
Journal of LaTeX Class Files, Vol. -, No. -, March 2019. Transfer Adaptation Learning: A Decade Survey. Lei Zhang, Xinbo Gao. Abstract: The world we see is ever-changing, and it always changes with people, things, and the environment. A domain refers to the state of the world at a certain moment. A research problem is characterized as transfer adaptation learning (TAL) when it needs knowledge correspondence between different moments/domains. Conventional machine learning aims to find a model with the minimum expected risk on test data by minimizing the regularized empirical risk on the training data, which, however, supposes that the training and test data share a similar joint probability distribution. TAL aims to build models that can perform tasks in a target domain by learning knowledge from a semantically related but distributionally different source domain. It is an energetic research field of increasing influence and importance, with a sharply rising publication trend. This paper surveys the advances of TAL methodologies in the past decade, and the technical challenges and essential problems of TAL are observed and discussed with deep insights and new perspectives. Broader solutions of transfer adaptation learning being created by researchers are identified, i.e., instance re-weighting adaptation, feature adaptation, classifier adaptation, deep network adaptation and adversarial adaptation, which go beyond the early semi-supervised and unsupervised split. The survey helps researchers rapidly but comprehensively understand and identify the research foundation, research status, theoretical limitations, future challenges and under-studied issues (universality, interpretability, and credibility) to be resolved in the field, toward universal representation and safe applications in open-world scenarios. -
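The contrast this abstract draws between conventional learning and TAL can be written out as a short sketch. The notation here is an assumption chosen for illustration and is not taken from the survey: $\ell$ is a loss, $\Omega$ a regularizer with weight $\lambda$, and $P_s$, $P_t$ are the joint distributions of the training (source) and test (target) data.

```latex
% Conventional learning: minimize the regularized empirical risk on n training pairs
\hat{f} = \arg\min_{f \in \mathcal{F}} \; \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(f(x_i), y_i\bigr) + \lambda\, \Omega(f),
\qquad (x_i, y_i) \sim P_s(x, y)

% Quantity we actually care about: the expected risk on test data
R_t(f) = \mathbb{E}_{(x, y) \sim P_t}\bigl[\ell\bigl(f(x), y\bigr)\bigr]

% Conventional assumption:        P_s(x, y) = P_t(x, y)
% Transfer adaptation setting:    P_s(x, y) \neq P_t(x, y)
```

When the two distributions coincide, a small empirical risk is a reasonable proxy for $R_t$; when they differ, that proxy can fail, which is the gap the adaptation methods listed in the abstract try to close.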
Backpropagation and Deep Learning in the Brain
Backpropagation and Deep Learning in the Brain. Simons Institute, Computational Theories of the Brain, 2018. Timothy Lillicrap (DeepMind, UCL). With: Sergey Bartunov, Adam Santoro, Jordan Guerguiev, Blake Richards, Luke Marris, Daniel Cownden, Colin Akerman, Douglas Tweed, Geoffrey Hinton. The “credit assignment” problem. The solution in artificial networks: backprop. Credit assignment by backprop works well in practice and shows up in virtually all of the state-of-the-art supervised, unsupervised, and reinforcement learning algorithms. Why isn’t backprop “biologically plausible”? Neuroscience evidence for backprop in the brain? A spectrum of credit assignment algorithms. How to convince a neuroscientist that the cortex is learning via [something like] backprop: To convince a machine learning researcher, an appeal to variance in gradient estimates might be enough. But this is rarely enough to convince a neuroscientist. So what lines of argument help? What do I mean by “something like backprop”? That learning is achieved across multiple layers by sending information from neurons closer to the output back to “earlier” layers to help compute their synaptic updates. 1. Feedback connections in cortex are ubiquitous and modify the -
Deep Learning for Electromyographic Hand Gesture Signal Classification
Deep Learning for Electromyographic Hand Gesture Signal Classification Using Transfer Learning. Ulysse Côté-Allard, Cheikh Latyr Fall, Alexandre Drouin, Alexandre Campeau-Lecours, Clément Gosselin, Kyrre Glette, François Laviolette†, and Benoit Gosselin†. Abstract: In recent years, deep learning algorithms have become increasingly more prominent for their unparalleled ability to automatically learn discriminant features from large amounts of data. However, within the field of electromyography-based gesture recognition, deep learning algorithms are seldom employed as they require an unreasonable amount of effort from a single person, to generate tens of thousands of examples. This work’s hypothesis is that general, informative features can be learned from the large amounts of data generated by aggregating the signals of multiple users, thus reducing the recording burden while enhancing gesture recognition. Consequently, this paper proposes applying transfer learning on aggregated data from multiple users, while leveraging the capacity of deep learning algorithms to learn discriminant features from large datasets. Two datasets comprised of 19 and 17 able-bodied participants […] 
[…] widely adopted both in research and clinical settings. The sEMG signals, which are non-stationary, represent the sum of subcutaneous motor action potentials generated through muscular contraction [1]. Artificial intelligence can then be leveraged as the bridge between sEMG signals and the prosthetic behavior. The literature on sEMG-based gesture recognition primarily focuses on feature engineering, with the goal of characterizing sEMG signals in a discriminative way [1], [2], [3]. Recently, researchers have proposed deep learning approaches [4], [5], [6], shifting the paradigm from feature engineering to feature learning. Regardless of the method employed, the end-goal remains the improvement of the classifier’s robustness. -
A Comprehensive Survey on Transfer Learning
A Comprehensive Survey on Transfer Learning. Fuzhen Zhuang, Zhiyuan Qi, Keyu Duan, Dongbo Xi, Yongchun Zhu, Hengshu Zhu, Senior Member, IEEE, Hui Xiong, Fellow, IEEE, and Qing He. Abstract: Transfer learning aims at improving the performance of target learners on target domains by transferring the knowledge contained in different but related source domains. In this way, the dependence on a large amount of target domain data can be reduced for constructing target learners. Due to its wide application prospects, transfer learning has become a popular and promising area in machine learning. Although there are already some valuable and impressive surveys on transfer learning, these surveys introduce approaches in a relatively isolated way and lack the recent advances in transfer learning. Due to the rapid expansion of the transfer learning area, it is both necessary and challenging to comprehensively review the relevant studies. This survey attempts to connect and systematize the existing transfer learning research, as well as to summarize and interpret the mechanisms and the strategies of transfer learning in a comprehensive way, which may help readers have a better understanding of the current research status and ideas. Unlike previous surveys, this survey paper reviews more than forty representative transfer learning approaches, especially homogeneous transfer learning approaches, from the perspectives of data and model. The applications of transfer learning are also briefly introduced. In order to show the performance of different transfer learning models, over twenty representative transfer learning models are used for experiments. The models are evaluated on three different datasets, i.e., Amazon Reviews, Reuters-21578, and Office-31. -
Automated Elastic Pipelining for Distributed Training of Transformers
PipeTransformer: Automated Elastic Pipelining for Distributed Training of Transformers. Chaoyang He¹, Shen Li², Mahdi Soltanolkotabi¹, Salman Avestimehr¹. Abstract: The size of Transformer models is growing at an unprecedented rate. It has taken less than one year to reach trillion-level parameters since the release of GPT-3 (175B). Training such models requires both substantial engineering efforts and enormous computing resources, which are luxuries most research teams cannot afford. In this paper, we propose PipeTransformer, which leverages automated elastic pipelining for efficient distributed training of Transformer models. In PipeTransformer, we design an adaptive on-the-fly freeze algorithm that can identify and freeze some layers gradually during training, and an elastic pipelining system that can dynamically allocate resources to train the remaining active layers. More specifically, PipeTransformer automatically excludes frozen layers from the pipeline, packs active layers into fewer GPUs, […]
[…] the-art convolutional networks ResNet-152 (He et al., 2016) and EfficientNet (Tan & Le, 2019). To tackle the growth in model sizes, researchers have proposed various distributed training techniques, including parameter servers (Li et al., 2014; Jiang et al., 2020; Kim et al., 2019), pipeline parallel (Huang et al., 2019; Park et al., 2020; Narayanan et al., 2019), intra-layer parallel (Lepikhin et al., 2020; Shazeer et al., 2018; Shoeybi et al., 2019), and zero redundancy data parallel (Rajbhandari et al., 2019).
[Figure 1. Interpretable Freeze Training: DNNs converge bottom up (Results on CIFAR10 using ResNet). Panels: T0 (0% trained), T1 (35% trained), T2 (75% trained), T3 (100% trained); axis: Layer (end of training); scale: similarity score.] -
Personalizing EEG-Based Affective Models with Transfer Learning
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16). Personalizing EEG-Based Affective Models with Transfer Learning. Wei-Long Zheng¹ and Bao-Liang Lu¹,²,³,*. ¹Center for Brain-like Computing and Machine Intelligence, Department of Computer Science and Engineering; ²Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering; ³Brain Science and Technology Research Center; Shanghai Jiao Tong University, Shanghai, China. {weilong, bllu}@sjtu.edu.cn. Abstract: Individual differences across subjects and the non-stationary characteristic of electroencephalography (EEG) limit the generalization of affective brain-computer interfaces in real-world applications. On the other hand, it is very time consuming and expensive to acquire a large number of subject-specific labeled data for learning subject-specific models. In this paper, we propose to build personalized EEG-based affective models without labeled target data using transfer learning techniques. We mainly explore two types of subject-to-subject transfer approaches. […]
[…] way to enhance current BCI systems with an increase of information flow, while at the same time without additional cost. Therefore, aBCIs have attracted increasing interest in both research and industry communities, and various studies have presented their efficiency and feasibility [Mühl et al., 2014a; 2014b; Jenke et al., 2014; Eaton et al., 2015]. Although large progress has been made in the development of aBCIs in recent years, there still exist some challenges such as the adaptation to changing environments and individual differences. Until now, most studies emphasized choices of features and classifiers [Singh et al., 2007; Jenke et al., 2014; […] -
Convolutional Neural Network Transfer Learning for Robust Face
Convolutional Neural Network Transfer Learning for Robust Face Recognition in NAO Humanoid Robot. Daniel Bussey¹, Alex Glandon², Lasitha Vidyaratne², Mahbubul Alam², Khan Iftekharuddin². ¹Embry-Riddle Aeronautical University, ²Old Dominion University.
Background
• Artificial Neural Networks: Computing systems whose model architecture is inspired by biological neural networks [1]. Capable of improving task performance without task-specific programming. Show excellent performance at classification based tasks such as face recognition [2], text recognition [3], and natural language processing [4].
• Deep Learning: A subfield of machine learning that focuses on algorithms inspired by the function and structure of the brain called artificial neural networks. The method used to allow computers to learn through a process called training.
Research Approach
• Retrain AlexNet on the CASIA-WebFace dataset to configure the neural network for face recognition tasks.
• Acquire an input image using NAO’s camera or a high resolution camera to run through the convolutional neural network.
• Extract the features of the input image using the neural networks AlexNet and VGG-Face.
• Compare the features of the input image to the features of each image in the people database.
• Determine whether a match is detected or if the input image is a photo of a […]
Results
• VGG-Face shows better results in every performance benchmark measured compared to AlexNet, although AlexNet is able to extract features from an image 800% faster than VGG-Face.
• Resolution of the input image does not have a statistically significant impact on the performance of VGG-Face and AlexNet.
• AlexNet’s performance decreases with respect to distance from the camera, where VGG-Face shows no performance loss.
• Both frameworks show excellent performance when eliminating false positives. -
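The "compare features, then decide whether a match is detected" steps of the poster's approach can be illustrated with a minimal sketch. The poster excerpt does not say which distance measure or decision rule was used, so the cosine similarity, the `threshold` value, and the names `cosine_similarity` and `best_match` are all assumptions for illustration; the short vectors stand in for features that a network such as AlexNet or VGG-Face would produce.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def best_match(query, database, threshold=0.8):
    """Compare a query feature vector to every entry in a people database.

    database maps a person's name to a stored feature vector.
    Returns (name, score) for the most similar entry, or (None, score)
    when even the best score falls below the (assumed) threshold.
    """
    best_name, best_score = None, -1.0
    for name, features in database.items():
        score = cosine_similarity(query, features)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Hypothetical stored features; real ones would come from the network.
people = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
name, score = best_match([0.9, 0.1, 0.0], people)
unknown, _ = best_match([0.0, 0.0, 1.0], people)
```

Rejecting low-scoring queries with a threshold is one simple way to limit false positives; the value 0.8 is arbitrary here and would need tuning on real features.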
Tiny Imagenet Challenge
Tiny ImageNet Challenge. Jiayu Wu, Qixiang Zhang, Guoxi Xu (Stanford University). Abstract: We present image classification systems using Residual Network (ResNet), Inception-ResNet and Very Deep Convolutional Networks (VGGNet) architectures. We apply data augmentation, dropout and other regularization techniques to prevent over-fitting of our models. What’s more, we present error analysis based on per-class accuracy. We also explore the impact of initialization methods, weight decay and network depth on system performance. Moreover, visualizations of intermediate outputs and convolutional filters are shown. Besides, we complete an extra object localization system based upon a combination of Recurrent Neural Network (RNN) and Long Short Term Memory (LSTM) units. Our best classification model achieves a top-1 test error rate of 43.10% on the Tiny ImageNet dataset, and our best localization model can localize with high accuracy more than one object, given training images with one object labeled.
[…] Our best model, a fine-tuned Inception-ResNet, achieves a top-1 error rate of 43.10% on the test dataset. Moreover, we implemented an object localization network based on an RNN with LSTM [7] cells, which achieves precise results. In the Experiments and Evaluations section, we will present thorough analysis on the results, including per-class error analysis, intermediate output distribution, the impact of initialization, etc.
2. Related Work. Deep convolutional neural networks have enabled the field of image recognition to advance at an unprecedented pace over the past decade. [10] introduces AlexNet, which has 60 million parameters and 650,000 neurons. The model consists of five convolutional layers, and some of them are followed by […] -
Deep Learning Is Robust to Massive Label Noise
Deep Learning is Robust to Massive Label Noise. David Rolnick*¹, Andreas Veit*², Serge Belongie², Nir Shavit³. Abstract: Deep neural networks trained on large supervised datasets have led to impressive results in image classification and other tasks. However, well-annotated datasets can be time-consuming and expensive to collect, lending increased interest to larger but noisy datasets that are more easily obtained. In this paper, we show that deep neural networks are capable of generalizing from training data for which true labels are massively outnumbered by incorrect labels. We demonstrate remarkably high test performance after training on corrupted data from MNIST, CIFAR, and ImageNet. For example, on MNIST we obtain test accuracy above 90 percent even after each clean training example has been diluted with 100 randomly-[…]
[…] Thus, annotation can be expensive and, for tasks requiring expert knowledge, may simply be unattainable at scale. To address this limitation, other training paradigms have been investigated to alleviate the need for expensive annotations, such as unsupervised learning (Le, 2013), self-supervised learning (Pinto et al., 2016; Wang & Gupta, 2015) and learning from noisy annotations (Joulin et al., 2016; Natarajan et al., 2013; Veit et al., 2017). Very large datasets (e.g., Krasin et al. (2016); Thomee et al. (2016)) can often be obtained, for example from web sources, with partial or unreliable annotation. This can allow neural networks to be trained on a much wider variety of tasks or classes and with less manual effort. The good performance obtained from these large, noisy datasets indicates that deep learning approaches can tolerate modest amounts of noise in the training set. -
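To make the "diluted" training set described in this abstract concrete, here is a minimal sketch of one way such a set could be built. The excerpt is cut off mid-phrase, so this assumes the dilution means adding extra examples whose labels are drawn uniformly at random; the function name `dilute_with_noise`, the parameter `alpha` (noisy examples per clean example), and the toy data are all hypothetical and not taken from the paper.

```python
import random


def dilute_with_noise(clean, noise_inputs, alpha, num_classes, seed=0):
    """Mix clean labeled examples with alpha randomly labeled ones each.

    clean        : list of (input, true_label) pairs
    noise_inputs : list of extra inputs that will receive random labels
    alpha        : number of noisy examples added per clean example
    num_classes  : labels are drawn uniformly from range(num_classes)
    """
    rng = random.Random(seed)  # fixed seed so the mix is reproducible
    needed = alpha * len(clean)
    if len(noise_inputs) < needed:
        raise ValueError("not enough extra inputs for the requested dilution")
    # Assumption: the label is independent of the input (pure label noise).
    noisy = [(x, rng.randrange(num_classes)) for x in noise_inputs[:needed]]
    mixed = list(clean) + noisy
    rng.shuffle(mixed)
    return mixed


# Toy stand-ins for images: two clean examples, five noisy ones per clean.
clean_set = [("clean_0", 3), ("clean_1", 7)]
extra_inputs = ["extra_%d" % i for i in range(10)]
mixed_set = dilute_with_noise(clean_set, extra_inputs, alpha=5, num_classes=10)
```

With `alpha=100` and ten classes, roughly one in ten of the noisy examples would carry the correct label by chance, so correct labels would still be heavily outnumbered, which is the regime the abstract describes.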
Transfer Learning: Introduction & Application
Transfer Learning: Introduction & Application. Yun Gu, Department of Automation, Shanghai Jiao Tong University, Shanghai, China. June 16th, 2014. Outline: 1 Overview of Transfer Learning; 2 Categorization (Three Research Issues, Different Settings); 3 Applications (Image Annotation, Image Classification, Deep Learning); 4 Conclusion. Transfer Learning: What to “Transfer”? How to “Transfer”? When to “Transfer”? Machine Learning Scheme. Relationship with other ML tech? In Top-Level Conference: 7 papers in CVPR 2014 are related with Transfer Learning; In Top-Level Journal: M. Guillaumin, et al., "ImageNet Auto-Annotation with Segmentation Propagation", IJCV, 2014. What is Transfer Learning? [Pan and Yang, 2010] Naive View (Transfer Learning): Transfer Learning (i.e. Knowledge Transfer, Domain Adaptation) aims at applying knowledge learned previously to solve new problems faster or with better solutions. -
Transfer Learning for Reinforcement Learning Domains: a Survey
Journal of Machine Learning Research 10 (2009) 1633-1685. Submitted 6/08; Revised 5/09; Published 7/09. Transfer Learning for Reinforcement Learning Domains: A Survey. Matthew E. Taylor*, Computer Science Department, The University of Southern California, Los Angeles, CA 90089-0781. Peter Stone, Department of Computer Sciences, The University of Texas at Austin, Austin, Texas 78712-1188. Editor: Sridhar Mahadevan. Abstract: The reinforcement learning paradigm is a popular way to address problems that have only limited environmental feedback, rather than correctly labeled examples, as is common in other machine learning contexts. While significant progress has been made to improve learning in a single task, the idea of transfer learning has only recently been applied to reinforcement learning tasks. The core idea of transfer is that experience gained in learning to perform one task can help improve learning performance in a related, but different, task. In this article we present a framework that classifies transfer learning methods in terms of their capabilities and goals, and then use it to survey the existing literature, as well as to suggest future directions for transfer learning work. Keywords: transfer learning, reinforcement learning, multi-task learning. 1. Transfer Learning Objectives. In reinforcement learning (RL) (Sutton and Barto, 1998) problems, learning agents take sequential actions with the goal of maximizing a reward signal, which may be time-delayed. For example, an agent could learn to play a game by being told whether it wins or loses, but is never given the “correct” action at any given point in time. The RL framework has gained popularity as learning methods have been developed that are capable of handling increasingly complex problems. -
The History Began from Alexnet: a Comprehensive Survey on Deep Learning Approaches
The History Began from AlexNet: A Comprehensive Survey on Deep Learning Approaches. Md Zahangir Alom¹, Tarek M. Taha¹, Chris Yakopcic¹, Stefan Westberg¹, Paheding Sidike², Mst Shamima Nasrin¹, Brian C Van Essen³, Abdul A S. Awwal³, and Vijayan K. Asari¹. Abstract: In recent years, deep learning has garnered tremendous success in a variety of application domains. This new field of machine learning has been growing rapidly, and has been applied to most traditional application domains, as well as some new areas that present more opportunities. Different methods have been proposed based on different categories of learning, including supervised, semi-supervised, and un-supervised learning. Experimental results show state-of-the-art performance using deep learning when compared to traditional machine learning approaches in the fields of image processing, computer vision, speech recognition, machine translation, art, medical imaging, medical information processing, robotics and control, bio-informatics, natural language processing (NLP), cybersecurity, and many others.
I. Introduction. Since the 1950s, a small subset of Artificial Intelligence (AI), often called Machine Learning (ML), has revolutionized several fields in the last few decades. Neural Networks (NN) are a subfield of ML, and it was this subfield that spawned Deep Learning (DL). Since its inception DL has been creating ever larger disruptions, showing outstanding success in almost every application domain. Fig. 1 shows the taxonomy of AI. DL (using either deep architecture of learning or hierarchical learning approaches) is a class of ML developed largely from 2006 onward. Learning is a procedure consisting of estimating […]