Automatic Data Augmentation for Generalization in Reinforcement Learning

Pages: 16 | File type: PDF | Size: 1020 KB

Roberta Raileanu (1), Max Goldstein (1), Denis Yarats (1,2), Ilya Kostrikov (1), Rob Fergus (1)
(1) New York University, New York City, NY, USA; (2) Facebook AI Research, New York City, NY, USA. Correspondence to: Roberta Raileanu <[email protected]>.
arXiv:2006.12862v2 [cs.LG] 20 Feb 2021

Abstract

Deep reinforcement learning (RL) agents often fail to generalize beyond their training environments. To alleviate this problem, recent work has proposed the use of data augmentation. However, different tasks tend to benefit from different types of augmentations and selecting the right one typically requires expert knowledge. In this paper, we introduce three approaches for automatically finding an effective augmentation for any RL task. These are combined with two novel regularization terms for the policy and value function, required to make the use of data augmentation theoretically sound for actor-critic algorithms. Our method achieves a new state-of-the-art on the Procgen benchmark and outperforms popular RL algorithms on DeepMind Control tasks with distractors. In addition, our agent learns policies and representations which are more robust to changes in the environment that are irrelevant for solving the task, such as the background. Our implementation is available at https://github.com/rraileanu/auto-drac.

1. Introduction

Generalization to new environments remains a major challenge in deep reinforcement learning (RL). Current methods fail to generalize to unseen environments even when trained on similar settings (Farebrother et al., 2018; Packer et al., 2018; Zhang et al., 2018a; Cobbe et al., 2018; Gamrian & Goldberg, 2019; Cobbe et al., 2019; Song et al., 2020). This indicates that standard RL agents memorize specific trajectories rather than learning transferable skills. Several strategies have been proposed to alleviate this problem, such as the use of regularization (Farebrother et al., 2018; Zhang et al., 2018a; Cobbe et al., 2018; Igl et al., 2019), data augmentation (Cobbe et al., 2018; Lee et al., 2020; Ye et al., 2020; Kostrikov et al., 2020; Laskin et al., 2020), or representation learning (Zhang et al., 2020a;c). In this work, we focus on the use of data augmentation in RL. We identify key differences between supervised learning and reinforcement learning which need to be taken into account when using data augmentation in RL.

More specifically, we show that a naive application of data augmentation can lead to both theoretical and practical problems with standard RL algorithms, such as unprincipled objective estimates and poor performance. As a solution, we propose Data-regularized Actor-Critic, or DrAC, a new algorithm that enables the use of data augmentation with actor-critic algorithms in a theoretically sound way. Specifically, we introduce two regularization terms which constrain the agent's policy and value function to be invariant to various state transformations. Empirically, this approach allows the agent to learn useful behaviors (outperforming strong RL baselines) in settings in which a naive use of data augmentation completely fails or converges to a suboptimal policy. While we use Proximal Policy Optimization (PPO, Schulman et al. (2017)) to describe and validate our approach, the method can be easily integrated with any actor-critic algorithm with a discrete stochastic policy such as A3C (Mnih et al., 2013), SAC (Haarnoja et al., 2018), or IMPALA (Espeholt et al., 2018).

The current use of data augmentation in RL either relies on expert knowledge to pick an appropriate augmentation (Cobbe et al., 2018; Lee et al., 2020; Kostrikov et al., 2020) or separately evaluates a large number of transformations to find the best one (Ye et al., 2020; Laskin et al., 2020). In this paper, we propose three methods for automatically finding a useful augmentation for a given RL task. The first two learn to select the best augmentation from a fixed set, using either a variant of the upper confidence bound algorithm (UCB, Auer (2002)) or meta-learning (RL², Wang et al. (2016)). We refer to these methods as UCB-DrAC and RL2-DrAC, respectively. The third method, Meta-DrAC, directly meta-learns the weights of a convolutional network, without access to predefined transformations (MAML, Finn et al. (2017)). Figure 1 gives an overview of UCB-DrAC.
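The UCB-based selection described above can be sketched as a simple bandit over a fixed set of augmentations. The following is a minimal illustration, assuming each augmentation is scored by the mean of its recent episode returns plus a UCB1-style exploration bonus; the exploration coefficient `c` and the sliding window of returns are illustrative assumptions, not the paper's exact hyperparameters.

```python
import math
import random
from collections import deque

class UCBAugmentationSelector:
    """Pick an image augmentation with a UCB1-style bandit (illustrative sketch)."""

    def __init__(self, augmentations, c=0.1, window=10):
        self.augmentations = augmentations      # e.g. ["crop", "random-conv", "cutout-color"]
        self.c = c                              # exploration coefficient (assumed value)
        self.counts = {f: 0 for f in augmentations}
        self.returns = {f: deque(maxlen=window) for f in augmentations}
        self.t = 0

    def select(self):
        self.t += 1
        # Try every arm once before applying the UCB rule.
        untried = [f for f in self.augmentations if self.counts[f] == 0]
        if untried:
            return random.choice(untried)

        def ucb(f):
            mean_return = sum(self.returns[f]) / len(self.returns[f])
            bonus = self.c * math.sqrt(math.log(self.t) / self.counts[f])
            return mean_return + bonus

        return max(self.augmentations, key=ucb)

    def update(self, augmentation, episode_return):
        self.counts[augmentation] += 1
        self.returns[augmentation].append(episode_return)
```

At each rollout, the selected transformation is applied to the observations before the regularized update, and the return obtained with it is fed back to the bandit.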
We evaluate these approaches on the Procgen generalization benchmark (Cobbe et al., 2019), which consists of 16 procedurally generated environments with visual observations. Our results show that UCB-DrAC is the most effective among these at finding a good augmentation, and is comparable or better than using DrAC with the best augmentation from a given set. UCB-DrAC also outperforms baselines specifically designed to improve generalization in RL (Igl et al., 2019; Lee et al., 2020; Laskin et al., 2020) on both train and test. In addition, we show that our agent learns policies and representations that are more invariant to changes in the environment which do not alter the reward or transition function (i.e. they are inconsequential for control), such as the background theme.

Figure 1. Overview of UCB-DrAC. A UCB bandit selects an image transformation (e.g. random-conv) and applies it to the observations. The augmented and original observations are passed to a regularized actor-critic agent (i.e. DrAC) which uses them to learn a policy and value function which are invariant to this transformation.

To summarize, our work makes the following contributions: (i) we introduce a principled way of using data augmentation with actor-critic algorithms, (ii) we propose a practical approach for automatically selecting an effective augmentation in RL settings, (iii) we show that the use of data augmentation leads to policies and representations that better capture task invariances, and (iv) we demonstrate state-of-the-art results on the Procgen benchmark and outperform popular RL methods on four DeepMind Control tasks with natural and synthetic distractors.

2. Background

We consider a distribution q(m) of Partially Observable Markov Decision Processes (POMDPs, Bellman (1957)) m ∈ M, with m defined by the tuple (S_m, O_m, A, T_m, R_m, γ), where S_m is the state space, O_m is the observation space, A is the action space, T_m(s'|s, a) is the transition function, R_m(s, a) is the reward function, and γ is the discount factor. During training, we restrict access to a fixed set of POMDPs, M_train = {m_1, ..., m_n}, where m_i ∼ q, ∀ i = 1, ..., n. The goal is to find a policy π_θ which maximizes the expected discounted reward over the entire distribution of POMDPs, J(π_θ) = E_{q, π, T_m, p_m} [ Σ_{t=0}^{T} γ^t R_m(s_t, a_t) ].

In practice, we use the Procgen benchmark, which contains 16 procedurally generated games. Each game corresponds to a distribution of POMDPs q(m), and each level of a game corresponds to a POMDP sampled from that game's distribution, m ∼ q. The POMDP m is determined by the seed (i.e. integer) used to generate the corresponding level. Following the setup from Cobbe et al. (2019), agents are trained on a fixed set of n = 200 levels (generated using seeds from 1 to 200) and tested on the full distribution of levels (generated by sampling seeds uniformly at random from all computer integers).

Proximal Policy Optimization (PPO, Schulman et al. (2017)) is an actor-critic algorithm that learns a policy π_θ and a value function V_θ with the goal of finding an optimal policy for a given MDP. PPO alternates between sampling data through interaction with the environment and maximizing a clipped surrogate objective function J_PPO using stochastic gradient ascent. See Appendix A for a full description of PPO. One component of the PPO objective is the policy gradient term J_PG, which is estimated using importance sampling:

    J_PG(θ) = Σ_{a ∈ A} π_θ(a|s) Â_{θ_old}(s, a) = E_{a ∼ π_{θ_old}} [ (π_θ(a|s) / π_{θ_old}(a|s)) Â_{θ_old}(s, a) ],    (1)

where Â(·) is an estimate of the advantage function, π_{θ_old} is the behavior policy used to collect trajectories (i.e. that generates the training distribution of states and actions), and π_θ is the policy we want to optimize (i.e. that generates the true distribution of states and actions).
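The importance-sampling estimate in equation (1) is what makes naive augmentation problematic: the ratio is only meaningful if the states fed to π_θ are the ones the behavior policy actually saw. A minimal sketch of this ratio and of PPO's clipped surrogate built on it, assuming log-probabilities of the taken actions are already available as tensors, could look like this.

```python
import torch

def policy_gradient_term(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Importance-sampled, clipped policy-gradient objective in the style of PPO.

    new_log_probs: log pi_theta(a|s) for the actions taken in the rollout
    old_log_probs: log pi_theta_old(a|s) recorded when the data was collected
    advantages:    advantage estimates A_hat(s, a)
    """
    ratio = torch.exp(new_log_probs - old_log_probs)   # pi_theta(a|s) / pi_theta_old(a|s)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return torch.min(unclipped, clipped).mean()         # quantity to maximize
```

Replacing s with f(s) only in the numerator, as in equation (2) below, changes the distribution the expectation is taken under, which is the kind of unprincipled objective estimate the paper argues against.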
3. Automatic Data Augmentation for RL

3.1. Data Augmentation in RL

Image augmentation has been successfully applied in computer vision for improving generalization on object classification tasks (Simard et al., 2003; Cireşan et al., 2011; Ciregan et al., 2012; Krizhevsky et al., 2012). As noted by Kostrikov et al. (2020), those tasks are invariant to certain image transformations such as rotations or flips, which is not always the case in RL. For example, if your observation is flipped, the corresponding reward will be reversed for the left and right actions and will not provide an accurate signal to the agent. While data augmentation has been previously used in RL settings without other algorithmic changes (Cobbe et al., 2018; Ye et al., 2020; Laskin et al., 2020), we argue that this approach is not theoretically sound.

If transformations are naively applied to observations in PPO's buffer, as done in Laskin et al. (2020), the PPO objective changes and equation (1) is replaced by

    J_PG(θ) = E_{a ∼ π_{θ_old}} [ (π_θ(a|f(s)) / π_{θ_old}(a|s)) Â_{θ_old}(s, a) ],    (2)

Algorithm 1 DrAC: Data-regularized Actor-Critic
Black: unmodified actor-critic algorithm. Cyan: image transformation. Red: policy regularization. Blue: value function regularization.
1: Hyperparameters: image transformation f, regularization loss coefficient α_r, minibatch size M, replay buffer size T, number of updates K.
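The excerpt ends before the full DrAC update is given, but the two regularizers it describes constrain the policy and the value function to produce the same outputs on original and augmented observations. A minimal sketch consistent with that description follows; the specific choice of a KL penalty for the policy, a squared error for the value function, and a single shared coefficient alpha_r are assumptions for illustration, not the paper's stated forms.

```python
import torch
import torch.nn.functional as F

def drac_regularizers(actor_critic, obs, f, alpha_r=0.1):
    """Sketch of DrAC-style invariance regularizers (assumed functional forms).

    actor_critic(obs) is assumed to return (action_logits, value); f is an
    image transformation applied to a batch of observations.
    """
    aug_obs = f(obs)
    logits, values = actor_critic(obs)
    aug_logits, aug_values = actor_critic(aug_obs)

    # Policy regularizer: keep pi(.|f(s)) close to pi(.|s), with a stop-gradient target.
    with torch.no_grad():
        target_probs = F.softmax(logits, dim=-1)
    g_pi = F.kl_div(F.log_softmax(aug_logits, dim=-1), target_probs, reduction="batchmean")

    # Value regularizer: keep V(f(s)) close to V(s).
    g_v = F.mse_loss(aug_values, values.detach())

    return alpha_r * (g_pi + g_v)
```

Because the augmented observations enter only through these regularizers, the policy-gradient term in equation (1) can still be computed on the original, untransformed data.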
Recommended publications
  • Improving Sequence-To-Sequence Speech Recognition Training with On-The-Fly Data Augmentation
IMPROVING SEQUENCE-TO-SEQUENCE SPEECH RECOGNITION TRAINING WITH ON-THE-FLY DATA AUGMENTATION
Thai-Son Nguyen (1), Sebastian Stüker (1), Jan Niehues (2), Alex Waibel (1)
(1) Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology; (2) Department of Data Science and Knowledge Engineering (DKE), Maastricht University

ABSTRACT: Sequence-to-Sequence (S2S) models recently started to show state-of-the-art performance for automatic speech recognition (ASR). With these large and deep models, overfitting remains the largest problem, outweighing performance improvements that can be obtained from better architectures. One solution to the overfitting problem is increasing the amount of available training data and the variety exhibited by the training data with the help of data augmentation. In this paper we examine the influence of three data augmentation methods on the performance of two S2S model architectures. One of the data augmentation methods comes from the literature, while the other two are our own development: a time perturbation in the frequency domain and sub-sequence sampling.

... augmentation method together with a long training schedule to reduce overfitting, they have achieved a large gain in performance superior to many modifications in network architecture. To date, there have been different sequence-to-sequence encoder-decoder models [12, 13] reporting superior performance over the HMM hybrid models on standard ASR benchmarks. While [12] uses Long Short-Term Memory (LSTM) networks for both encoder and decoder, [13] employs self-attention layers to construct the whole S2S network. ... augmentation methods for S2S speech recognition, two of which are proposed in this work and the last was recently discovered [12]. We contrast and analyze both LSTM-based and self-attention S2S models that were trained with the proposed ...
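As a rough sketch of one such on-the-fly perturbation, the snippet below resamples the time axis of a log-mel feature matrix with a randomly drawn factor each time it is called; the stretch range and the nearest-neighbour resampling are illustrative assumptions, not the authors' exact method.

```python
import numpy as np

def time_stretch_features(features, low=0.8, high=1.2, rng=np.random):
    """Randomly stretch or compress the time axis of a spectrogram-like array.

    features: (num_frames, num_freq_bins) array of acoustic features.
    A new factor is drawn per call, so the augmentation happens on the fly.
    """
    factor = rng.uniform(low, high)
    num_frames = features.shape[0]
    new_len = max(1, int(round(num_frames * factor)))
    # Nearest-neighbour resampling of frame indices.
    idx = np.minimum((np.arange(new_len) / factor).astype(int), num_frames - 1)
    return features[idx]

# Example: each utterance is perturbed differently every epoch.
utterance = np.random.randn(300, 80)   # 300 frames of 80-dim log-mel features
augmented = time_stretch_features(utterance)
```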
  • A Group-Theoretic Framework for Data Augmentation
A Group-Theoretic Framework for Data Augmentation
Shuxiao Chen (Department of Statistics, University of Pennsylvania), Edgar Dobriban (Department of Statistics, University of Pennsylvania), Jane H. Lee (Department of Computer Science, University of Pennsylvania)

Abstract: Data augmentation has become an important part of modern deep learning pipelines and is typically needed to achieve state of the art performance for many learning tasks. It utilizes invariant transformations of the data, such as rotation, scale, and color shift, and the transformed images are added to the training set. However, these transformations are often chosen heuristically and a clear theoretical framework to explain the performance benefits of data augmentation is not available. In this paper, we develop such a framework to explain data augmentation as averaging over the orbits of the group that keeps the data distribution approximately invariant, and show that it leads to variance reduction. We study finite-sample and asymptotic empirical risk minimization and work out as examples the variance reduction in certain two-layer neural networks. We further propose a strategy to exploit the benefits of data augmentation for general learning tasks.

1 Introduction
Many deep learning models succeed by exploiting symmetry in data. Convolutional neural networks (CNNs) use the fact that image identity is roughly invariant to translations: a translated cat is still a cat. Such invariances are present in many domains, including image and language data. Standard architectures are invariant to some, but not all transforms. CNNs induce an approximate equivariance to translations, but not to rotations. This is an inductive bias of CNNs, and the idea dates back at least to the neocognitron [30].
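The orbit-averaging view in this abstract amounts to replacing each per-example loss with its average over transformed copies of the input. A minimal sketch, under the assumption of a finite set of label-preserving transforms, could look like this.

```python
import torch

def orbit_averaged_loss(model, loss_fn, x, y, group_transforms):
    """Empirical risk with the loss averaged over a finite orbit of each input.

    group_transforms: list of functions g(x) assumed to keep the label
    distribution invariant, e.g. flips or small rotations for natural images.
    """
    losses = [loss_fn(model(g(x)), y) for g in group_transforms]
    return torch.stack(losses).mean()

# Example with two transforms: identity and horizontal flip.
transforms = [lambda x: x, lambda x: torch.flip(x, dims=[-1])]
```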
  • Real-Time Monitoring for Hydraulic States Based on Convolutional Bidirectional LSTM with Attention Mechanism
sensors Article
Real-Time Monitoring for Hydraulic States Based on Convolutional Bidirectional LSTM with Attention Mechanism
Kyutae Kim and Jongpil Jeong *
Department of Smart Factory Convergence, Sungkyunkwan University, 2066 Seobu-ro, Jangan-gu, Suwon 16419, Korea; [email protected]
* Correspondence: [email protected]
Received: 30 October 2020; Accepted: 9 December 2020; Published: 11 December 2020

Abstract: By monitoring a hydraulic system using artificial intelligence, we can detect anomalous data in a manufacturing workshop. In addition, by analyzing the anomalous data, we can diagnose faults and prevent failures. However, artificial intelligence, especially deep learning, needs to learn from a large amount of data, and it is often difficult to get enough data at the real manufacturing site. In this paper, we apply augmentation to increase the amount of data. In addition, we propose real-time monitoring based on a deep-learning model that uses a convergence of a convolutional neural network (CNN), a bidirectional long short-term memory network (BiLSTM), and an attention mechanism. CNN extracts features from input data, and BiLSTM learns the feature information. The learned information is then fed to the sigmoid classifier to determine whether the state is normal or abnormal. Experimental results show that the proposed model works better than other deep-learning models, such as CNN or long short-term memory (LSTM).

Keywords: hydraulic system; CNN; bidirectional LSTM; attention mechanism; classification; data augmentation

1. Introduction
Mechanical fault diagnosis and condition monitoring using artificial intelligence technology are a significant part of the smart factory and fourth industrial revolution [1]. In recent decades, condition monitoring of hydraulic systems has become increasingly important in industry, energy and mobile hydraulic applications as a significant part of the condition-based maintenance strategy, which can reduce machine downtime and maintenance costs and significantly improve planning for the security of the production process.
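A minimal sketch of the described CNN, BiLSTM, attention and sigmoid pipeline in PyTorch is shown below; the layer sizes, kernel size, and single-output sigmoid head are assumptions for illustration, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class CNNBiLSTMAttention(nn.Module):
    """Sketch of the described pipeline: CNN features -> BiLSTM -> attention -> sigmoid."""

    def __init__(self, in_channels, cnn_channels=32, lstm_hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(in_channels, cnn_channels, kernel_size=5, padding=2),
            nn.ReLU(),
        )
        self.bilstm = nn.LSTM(cnn_channels, lstm_hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * lstm_hidden, 1)
        self.classifier = nn.Linear(2 * lstm_hidden, 1)

    def forward(self, x):                       # x: (batch, sensors, time)
        h = self.cnn(x).transpose(1, 2)         # (batch, time, cnn_channels)
        h, _ = self.bilstm(h)                   # (batch, time, 2 * lstm_hidden)
        weights = torch.softmax(self.attn(h), dim=1)
        context = (weights * h).sum(dim=1)      # attention-weighted summary over time
        return torch.sigmoid(self.classifier(context))   # probability of "abnormal"

model = CNNBiLSTMAttention(in_channels=8)
probs = model(torch.randn(4, 8, 128))           # 4 windows of 8 sensors x 128 timesteps
```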
  • Data Augmentation and Feature Selection for Automatic Model Recommendation in Computational Physics
Mathematical and Computational Applications Article
Data Augmentation and Feature Selection for Automatic Model Recommendation in Computational Physics
Thomas Daniel (1,2,*), Fabien Casenave (1), Nissrine Akkari (1) and David Ryckelynck (2)
1 SafranTech, Rue des Jeunes Bois, Châteaufort, 78114 Magny-les-Hameaux, France; [email protected] (F.C.); [email protected] (N.A.)
2 Centre des matériaux (CMAT), MINES ParisTech, PSL University, CNRS UMR 7633, BP 87, 91003 Evry, France; [email protected]
* Correspondence: [email protected]

Abstract: Classification algorithms have recently found applications in computational physics for the selection of numerical methods or models adapted to the environment and the state of the physical system. For such classification tasks, labeled training data come from numerical simulations and generally correspond to physical fields discretized on a mesh. Three challenging difficulties arise: the lack of training data, their high dimensionality, and the non-applicability of common data augmentation techniques to physics data. This article introduces two algorithms to address these issues: one for dimensionality reduction via feature selection, and one for data augmentation. These algorithms are combined with a wide variety of classifiers for their evaluation. When combined with a stacking ensemble made of six multilayer perceptrons and a ridge logistic regression, they enable reaching an accuracy of 90% on our classification problem for nonlinear structural mechanics.

Keywords: machine learning; classification; automatic model recommendation; feature selection; data augmentation; numerical simulations

1. Introduction
Classification problems can be encountered in various disciplines such as handwritten text recognition [1], document classification [2], and computer-aided diagnosis in the ...
  • Generative Adversarial Networks to Improve the Robustness of Visual Defect Segmentation by Semantic Networks in Manufacturing Components
applied sciences Article
Generative Adversarial Networks to Improve the Robustness of Visual Defect Segmentation by Semantic Networks in Manufacturing Components
Fátima A. Saiz (1,2,*), Garazi Alfaro (1), Iñigo Barandiaran (1) and Manuel Graña (2)
1 Vicomtech Foundation, Basque Research and Technology Alliance (BRTA), Donostia, 20009 San Sebastián, Spain; [email protected] (G.A.); [email protected] (I.B.)
2 Computational Intelligence Group, Computer Science Faculty, University of the Basque Country, UPV/EHU, 20018 San Sebastián, Spain; [email protected]
* Correspondence: [email protected]

Abstract: This paper describes the application of Semantic Networks for the detection of defects in images of metallic manufactured components in a situation where the number of available samples of defects is small, which is rather common in real practical environments. In order to overcome this shortage of data, the common approach is to use conventional data augmentation techniques. We resort to Generative Adversarial Networks (GANs) that have shown the capability to generate highly convincing samples of a specific class as a result of a game between a discriminator and a generator module. Here, we apply the GANs to generate samples of images of metallic manufactured components with specific defects, in order to improve training of Semantic Networks (specifically DeepLabV3+ and Pyramid Attention Network (PAN) networks) carrying out the defect detection and segmentation. Our process carries out the generation of defect images using the StyleGAN2 with the DiffAugment method, followed by a conventional data augmentation over the entire enriched dataset, achieving a large balanced dataset that allows robust training of the Semantic Network.
  • Posture Recognition Using Ensemble Deep Models Under Various Home Environments
applied sciences Article
Posture Recognition Using Ensemble Deep Models under Various Home Environments
Yeong-Hyeon Byeon (1), Jae-Yeon Lee (2), Do-Hyung Kim (2) and Keun-Chang Kwak (1,*)
1 Department of Control and Instrumentation Engineering, Chosun University, Gwangju 61452, Korea; [email protected]
2 Intelligent Robotics Research Division, Electronics Telecommunications Research Institute, Daejeon 61452, Korea; [email protected] (J.-Y.L.); [email protected] (D.-H.K.)
* Correspondence: [email protected]; Tel.: +82-062-230-6086
Received: 26 December 2019; Accepted: 11 February 2020; Published: 14 February 2020

Abstract: This paper is concerned with posture recognition using ensemble convolutional neural networks (CNNs) in home environments. With the increasing number of elderly people living alone at home, posture recognition is very important for helping elderly people cope with sudden danger. Traditionally, to recognize posture, it was necessary to obtain the coordinates of body points, depth, frame information of video, and so on. In conventional machine learning, there is a limitation in recognizing posture directly using only an image. However, with advancements in the latest deep learning, it is possible to achieve good performance in posture recognition using only an image. Thus, we performed experiments based on VGGNet, ResNet, DenseNet, InceptionResNet, and Xception as pre-trained CNNs using five types of preprocessing. On the basis of these deep learning methods, we finally present an ensemble deep model that combines the individual networks by majority and average voting. The experiments were performed on a posture database constructed at the Electronics and Telecommunications Research Institute (ETRI), Korea. This database consists of 51,000 images with 10 postures from 51 home environments.
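The "majority and average" combination mentioned above can be sketched as hard and soft voting over the per-network class probabilities; the array shapes below are assumptions for illustration.

```python
import numpy as np

def ensemble_predict(prob_outputs, method="average"):
    """Combine per-model class probabilities of shape (n_models, n_samples, n_classes)."""
    prob_outputs = np.asarray(prob_outputs)
    if method == "average":
        return prob_outputs.mean(axis=0).argmax(axis=-1)      # soft (average) voting
    votes = prob_outputs.argmax(axis=-1)                      # (n_models, n_samples)
    # Majority voting: most frequent predicted class per sample.
    n_classes = prob_outputs.shape[-1]
    counts = np.apply_along_axis(lambda v: np.bincount(v, minlength=n_classes), 0, votes)
    return counts.argmax(axis=0)
```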
  • Automatic Relational Data Augmentation for Machine Learning
ARDA: Automatic Relational Data Augmentation for Machine Learning
Nadiia Chepurko (1), Ryan Marcus (1), Emanuel Zgraggen (1), Raul Castro Fernandez (2), Tim Kraska (1), David Karger (1)
1 MIT CSAIL; 2 University of Chicago

ABSTRACT: Automatic machine learning (AML) is a family of techniques to automate the process of training predictive models, aiming to both improve performance and make machine learning more accessible. While many recent works have focused on aspects of the machine learning pipeline like model selection, hyperparameter tuning, and feature selection, relatively few works have focused on automatic data augmentation. Automatic data augmentation involves finding new features relevant to the user's predictive task with minimal "human-in-the-loop" involvement. We present ARDA, an end-to-end system that takes as input a dataset and a data repository, and outputs an augmented data set such that training a predictive model on this ...

... try to find the best model and tune hyper-parameters, but also perform automatic feature engineering. In their current form, AML tools can outperform experts [36] and can help to make ML more accessible to a broader range of users [28,41]. However, what if the original dataset provided by the user does not contain enough signal (predictive features) to create an accurate model? For instance, consider a user who wants to use the publicly-available NYC taxi dataset [24] to build a forecasting model for taxi ride durations. Suppose the dataset contains trip information over the last five years, including license plate numbers, pickup locations, destinations, and pickup times. A model built only on this data may not be very accurate because there are other major external factors that impact the duration of a taxi ride.
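The augmentation this abstract describes operates on tables rather than images: new candidate features come from joining repository tables onto the user's dataset. A naive sketch of that join step is below; a real system like the one described would also rank candidate joins and prune features that do not help the downstream model, which is omitted here.

```python
import pandas as pd

def augment_with_repository(base, candidate_tables, key):
    """Naive relational augmentation: left-join candidate tables onto the base data.

    base: DataFrame with the user's rows (including the prediction target).
    candidate_tables: dict of {name: DataFrame} from a data repository, each containing `key`.
    """
    augmented = base.copy()
    for name, table in candidate_tables.items():
        # Prefix foreign columns to avoid clashes, but keep the join key name intact.
        extra = table.add_prefix(f"{name}_").rename(columns={f"{name}_{key}": key})
        augmented = augmented.merge(extra, on=key, how="left")
    return augmented
```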
  • Data Augmentation Schemes for Deep Learning in an Indoor Positioning Application
    electronics Article Data Augmentation Schemes for Deep Learning in an Indoor Positioning Application Rashmi Sharan Sinha 1, Sang-Moon Lee 2, Minjoong Rim 1 and Seung-Hoon Hwang 1,* 1 Division of Electronics and Electrical Engineering, Dongguk University-Seoul, Seoul 04620, Korea; [email protected] (R.S.S.); [email protected] (M.R.) 2 JMP Systems Co., Ltd, Gyeonggi-do 12930, Korea; [email protected] * Correspondence: [email protected]; Tel.: +82-2-2260-3994 Received: 25 April 2019; Accepted: 13 May 2019; Published: 17 May 2019 Abstract: In this paper, we propose two data augmentation schemes for deep learning architecture that can be used to directly estimate user location in an indoor environment using mobile phone tracking and electronic fingerprints based on reference points and access points. Using a pretrained model, the deep learning approach can significantly reduce data collection time, while the runtime is also significantly reduced. Numerical results indicate that an augmented training database containing seven days’ worth of measurements is sufficient to generate acceptable performance using a pretrained model. Experimental results find that the proposed augmentation schemes can achieve a test accuracy of 89.73% and an average location error that is as low as 2.54 m. Therefore, the proposed schemes demonstrate the feasibility of data augmentation using a deep neural network (DNN)-based indoor localization system that lowers the complexity required for use on mobile devices. Keywords: augmentation; deep learning; CNN; indoor positioning; fingerprint 1. Introduction Identifying the location of a mobile user is an important challenge in pervasive computing because their location provides a lot of information about the user with which adaptive computer systems can be created.
  • Augmenting Image Classifiers Using Data Augmentation Generative Adversarial Networks
Augmenting Image Classifiers using Data Augmentation Generative Adversarial Networks
Antreas Antoniou (1), Amos Storkey (1), and Harrison Edwards (1,2)
1 University of Edinburgh, Edinburgh, UK, {a.antoniou,a.storkey,h.l.edwards}@sms.ed.ac.uk, https://www.ed.ac.uk/
2 Open AI, https://openai.com/

Abstract. Effective training of neural networks requires much data. In the low-data regime, parameters are underdetermined, and learnt networks generalise poorly. Data Augmentation alleviates this by using existing data more effectively, but standard data augmentation produces only limited plausible alternative data. Given the potential to generate a much broader set of augmentations, we design and train a generative model to do data augmentation. The model, based on image conditional Generative Adversarial Networks, uses data from a source domain and learns to take a data item and augment it by generating other within-class data items. As this generative process does not depend on the classes themselves, it can be applied to novel unseen classes. We demonstrate that a Data Augmentation Generative Adversarial Network (DAGAN) augments classifiers well on Omniglot, EMNIST and VGG-Face.

1 Introduction
Over the last decade Deep Neural Networks have enabled unprecedented performance on a number of tasks. They have been demonstrated in many domains [12] including image classification [25, 17, 16, 18, 21], machine translation [44], natural language processing [12], speech recognition [19], and synthesis [42], learning from human play [6] and reinforcement learning [27, 35, 10, 40, 13] among others. In all cases, very large datasets have been utilized, or in the case of reinforcement learning, extensive play.
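Used at training time, an image-conditional generator of this kind turns each labeled example into several within-class variants. The sketch below assumes a hypothetical trained `generator(images, z)` callable and a latent size of 100; both the interface and the numbers are illustrative assumptions.

```python
import torch

def conditional_gan_augment(generator, images, n_per_image=4, latent_dim=100):
    """Sketch of image-conditional augmentation: sample new within-class variants.

    generator(images, z) is a hypothetical trained image-conditional generator.
    """
    augmented = [images]
    for _ in range(n_per_image):
        z = torch.randn(images.size(0), latent_dim)
        with torch.no_grad():
            augmented.append(generator(images, z))
    return torch.cat(augmented, dim=0)   # originals plus variants share their labels
```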
  • GAN-Based Image Data Augmentation
GAN-Based Image Data Augmentation
Stanford CS229 Final Project: Computer Vision
David Liu (Department of Mathematics, Stanford University), Nathan Hu (Department of Computer Science, Stanford University)

Abstract: Generative adversarial networks (GANs) are powerful generative models that have led to breakthroughs in image generation. In this project, we investigate the use of GANs in generating synthetic data from the MNIST dataset to either augment or replace the original data when training classifiers. We demonstrate that training classifiers on purely synthetic data achieves comparable results to those trained solely on pure data and show that for small sets of training data, augmenting the dataset by first training GANs on the data can lead to dramatic improvement in classifier performance. We also begin to explore using GAN-generated data to recursively train other GANs.

1 Introduction and Related Work
Described by Yann LeCun, Director of AI Research at Facebook AI, as "the most interesting idea in the last 10 years in Machine Learning", Generative Adversarial Networks (GANs) are powerful generative models which reformulate the task of learning a data distribution as an adversarial game. A fundamental bottleneck in machine learning is data availability, and a variety of techniques are used to augment datasets to create more training data. As powerful generative models, GANs are good candidates for data augmentation. In recent years, there has been some development in exploring the use of GANs in generating synthetic data for data augmentation given limited or imbalanced datasets [1]. Aside from augmenting real data, there are scenarios in which one may wish to directly substitute real data with synthetic data; for example, when people provide images in a medical context, having a GAN as the "middle man" would grant confidentiality to the parties providing the original data.
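Mixing GAN samples into a small training set, as the abstract describes, can be sketched as follows; `generator(z, labels)` is a hypothetical trained conditional generator, and the latent dimension, batch size, and class count for MNIST are assumptions used only for illustration.

```python
import torch
from torch.utils.data import TensorDataset, ConcatDataset, DataLoader

def augment_with_gan_samples(real_images, real_labels, generator, n_synthetic, latent_dim=100):
    """Sketch: extend a small training set with class-conditional GAN samples."""
    z = torch.randn(n_synthetic, latent_dim)
    fake_labels = torch.randint(0, 10, (n_synthetic,))       # MNIST has 10 classes
    with torch.no_grad():
        fake_images = generator(z, fake_labels)
    real_ds = TensorDataset(real_images, real_labels)
    fake_ds = TensorDataset(fake_images, fake_labels)
    return DataLoader(ConcatDataset([real_ds, fake_ds]), batch_size=64, shuffle=True)
```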
  • Image-Based Perceptual Learning Algorithm for Autonomous Driving
Image-based Perceptual Learning Algorithm for Autonomous Driving
DISSERTATION
Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University
By Yunming Shao, M.S., Graduate Program in Geodetic Science, The Ohio State University, 2017
Dissertation Committee: Dr. Dorota A. Grejner-Brzezinska, Advisor; Dr. Charles Toth, Co-advisor; Dr. Alper Yilmaz; Dr. Rongjun Qin
Copyrighted by Yunming Shao 2017

Abstract
Autonomous driving is widely acknowledged as a promising solution to modern traffic problems such as congestion and accidents. It is a complicated system including sub-modules such as perception, path planning, and control. During the last few years, research and experimentation have shifted from the academic to the industrial sector. Different sensor configurations exist depending on the manufacturer, but an imaging component is used by every company. In this dissertation, we mainly focus on innovating and improving the camera perception algorithms using deep learning. In addition, we propose an end-to-end control approach which can map the image pixels directly to control commands. This dissertation contributes to the development of autonomous driving in the three following aspects: Firstly, a novel dynamic object detection architecture using still images is proposed. Our dynamic detection architecture utilizes a Convolutional Neural Network (CNN) with an end-to-end training approach. In our model, we consider multiple requirements for dynamic object detection in autonomous driving in addition to accuracy, such as inference speed, model size, and energy consumption. These are crucial to the deployment of a detector in a real autonomous vehicle.
  • Empirical Evaluation of Variational Autoencoders for Data Augmentation
Empirical Evaluation of Variational Autoencoders for Data Augmentation
Javier Jorge, Jesús Vieco, Roberto Paredes, Joan Andreu Sánchez and José Miguel Benedí
Departamento de Sistemas Informáticos y Computación, Universitat Politècnica de València, Valencia, Spain

Keywords: Generative Models, Data Augmentation, Variational Autoencoder.

Abstract: Since the beginning of Neural Networks, different mechanisms have been required to provide a sufficient number of examples to avoid overfitting. Data augmentation, the most common one, is focused on the generation of new instances by performing different distortions on the real samples. Usually, these transformations are problem-dependent, and they result in a synthetic set of, likely, unseen examples. In this work, we have studied a generative model, based on the encoder-decoder paradigm, that works directly in the data space, that is, with images. This model encodes the input in a latent space where different transformations will be applied. After completing this, we can reconstruct the latent vectors to get new samples. We have analysed various procedures according to the distortions that we could carry out, as well as the effectiveness of this process to improve the accuracy of different classification systems. To do this, we could use both the latent space and the original space after reconstructing the altered version of these vectors. Our results have shown that using this pipeline (encoding-altering-decoding) helps the generalisation of the classifiers that have been selected.

1 INTRODUCTION
Several of the successful applications of machine learning techniques are based on the amount of data available nowadays, such as millions of images, days ... which others could harm the model's generalisation. Currently, one of the most critical issues is finding a way of making the most of the tons of instances that are unlabeled and, using just a few labelled examples, being able to obtain useful classifiers.
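The encode-alter-decode pipeline described above can be sketched in a few lines; `vae.encode(x)` is assumed to return the mean and log-variance of the latent posterior and `vae.decode(z)` to map latents back to image space, and the Gaussian perturbation of the latent code is just one possible choice of "alteration".

```python
import torch

def encode_alter_decode(vae, x, noise_scale=0.1):
    """Sketch of the encode-alter-decode augmentation pipeline."""
    mu, _logvar = vae.encode(x)                       # posterior parameters of the latent code
    z = mu + torch.randn_like(mu) * noise_scale       # alter: perturb the latent vector
    return vae.decode(z)                              # decode: a new, synthetic sample
```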