
Effective Techniques of the Use of Data Augmentation in Classification Tasks


Degree in Computer Engineering

Final Degree Project

Author: Unai Garay Maestre. Tutors: Antonio Javier Gallego, Jorge Calvo

June 2018

Abstract

This report follows the research and development of a final degree project in Computer Engineering. The purpose of this project is to develop a new method to overcome the lack of data. In the literature, the strategy customarily used to achieve this is data augmentation, a method that artificially creates new data based on modifications of the existing data. The heuristics underlying these modifications depend heavily on which transformations are suitable for the classification task at hand. In this project we introduce an alternative using Variational Autoencoders, which are powerful generative models. These are capable of extracting latent values from the input variables to generate new information without the user having to take task-specific decisions.

Acknowledgments

First of all, I would not have been interested in this topic, Neural Networks, if it were not for the subject “Desafíos de la Programación”, taught by Juan Ramón Rico. I asked him to supervise this Final Project and he introduced me to my current tutors: Antonio Javier Gallego and Jorge Calvo. I am very grateful to have them. I wanted to use Neural Networks on images and they gave me this fantastic idea. They have supported me with this research all along the way, answering and helping me whenever I needed it. Thank you for everything.

I could not be here if it were not for my friends. They have supported me and helped me all along the way. Thank you very much.

I also want to thank my family and friends who have supported me from the beginning until the end, in every phase of my life and with every decision I have made, and they will still do. Thank you with all my heart.

It is to them that I dedicate this work.


Intelligence is the ability to adapt to change.

Stephen Hawking.


Contents

1 Introduction
    1.1 Motivation
    1.2 Objectives

2 State of the Art
    2.1 Data Augmentation
    2.2 Improving Data Augmentation
    2.3 Autoencoders
        2.3.1 Problem with Autoencoders
    2.4 Variational Autoencoders

3 Technologies
    3.1 Python
        3.1.1 Tensorflow
        3.1.2 Keras Library
        3.1.3 Numpy, OpenCV, Matplotlib And Others
    3.2 Jupyter Notebook
    3.3 Google Colaboratory

4 Methodology
    4.1 Introduction
    4.2 First Stage
        4.2.1 Dataset
        4.2.2 Train Variational Autoencoder
        4.2.3 Generate New Samples
    4.3 Second Stage
        4.3.1 Dataset + Generated Samples
        4.3.2 Train CNN
        4.3.3 Prediction

5 Implementation
    5.1 Introduction
    5.2 Autoencoder and Variational Autoencoder
        5.2.1 Simplest Possible Autoencoder
        5.2.2 Variational Autoencoder
        5.2.3 Variational Autoencoder with Convolutions
    5.3 Convolutional Neural Network
        5.3.1 Keras Image Augmentation API
        5.3.2 Model
        5.3.3 Parameters

6 Experimentation
    6.1 MNIST dataset
    6.2 Train VAEs
        6.2.1 Configuration of VAEs
    6.3 Tests
        6.3.1 50 Images
        6.3.2 100 Images
        6.3.3 250 Images
        6.3.4 500 Images
        6.3.5 1000 Images
        6.3.6 Analysis of the results
    6.4 New way of feeding the CNN
        6.4.1 Implementation
        6.4.2 Results and Comparison
        6.4.3 Analysis of the results
    6.5 Using 3 dimensions on the latent space
        6.5.1 Results and Comparison
        6.5.2 Using 4 and 8 dimensions on the latent space
        6.5.3 Analysis of the results
    6.6 Variational Autoencoders with Keras data augmentation
        6.6.1 Implementation
        6.6.2 Results and Comparison
        6.6.3 Analysis of the results
    6.7 Summary Results

7 Conclusion
    7.1 Project Summary
    7.2 Evaluation
    7.3 Further Work
    7.4 Final Thoughts

Bibliography

List of Figures

2.3.1 Encoder of a CNN
2.3.2 Autoencoder structure
2.3.3 Autoencoder MNIST 2D latent space
2.4.1 Variational Autoencoder structure
2.4.2 Sampled vector
2.4.3 Encoding variation comparison
2.4.4 Ideally encoding vs. possible encoding
2.4.5 Pure KL loss results in 2D latent space
2.4.6 Reconstruction loss and KL loss results in 2D latent space
4.1.1 First stage diagram
4.1.2 Second stage diagram
4.2.1 Train test percentage
4.2.2 Dataset steps
4.2.3 Encoder decoder reconstruction
4.3.1 Data augmented workflow
4.3.2 Methodology summary
5.2.1 VAE encoded structure
6.1.1 MNIST's handwritten digits
6.1.2 MNIST digit as a matrix
6.1.3 MNIST's directory structure
6.2.1 Convolutional VAE using MNIST and a 2D latent space
6.2.2 Convolutional VAE using MNIST digits
6.2.3 Graph validation loss VAE digit 0
6.3.1 Generated digits using 50 images and a 2D latent space
6.3.2 Generated digits using 100 images and a 2D latent space
6.3.3 Generated digits using 250 images and a 2D latent space
6.3.4 Generated digits using 500 images and a 2D latent space
6.3.5 Generated digits using 1000 images and a 2D latent space
6.5.1 3D latent space digit 0
6.5.2 Generated samples of the digit 0 using a 3D latent space
6.5.3 Graph comparison using F1 score of 2, 3, 4 and 8 dimensions
6.6.1 Graph comparison using F1 score augmentation and augmentation with VAE


List of Tables

6.1 VAE validation loss for digit 0
6.2 CNN with no extra images using 50 images
6.3 CNN average results of 50 images
6.4 CNN VAE using 50 images and a 2D latent space
6.5 CNN VAE average results 50 images 2D latent space
6.6 CNN Keras data augmentation using 50 images
6.7 CNN with data augmentation average results using 50 images
6.8 Summary of average results different methods using 50 images and a 2D latent space
6.9 CNN with no extra images using 100 images
6.10 CNN average results using 100 images
6.11 CNN VAE using 100 images and a 2D latent space
6.12 CNN VAE average results using 100 images and a 2D latent space
6.13 CNN with Keras data augmentation using 100 images
6.14 CNN with Keras data augmentation average results using 100 images
6.15 Average results of the methods using 100 images and a 2D latent space
6.16 CNN average results 250 images
6.17 CNN VAE average results using 250 images and a 2D latent space
6.18 CNN with Keras data augmentation average results using 250 images
6.19 Average results of the methods using 250 images and a 2D latent space
6.20 CNN average results using 500 images
6.21 CNN VAE average results using 500 images and a 2D latent space
6.22 CNN with Keras data augmentation average results using 500 images
6.23 Average results of the methods using 500 images and a 2D latent space
6.24 CNN average results using 1000 images
6.25 CNN VAE average results using 1000 images and a 2D latent space
6.26 CNN with Keras data augmentation average results using 1000 images
6.27 Average results of the methods using 1000 images and a 2D latent space
6.28 Comparison of a new way of feeding data to the VAE and the rest of the methods using 50 images
6.29 Comparison of a new way of feeding data to the VAE and the rest of the methods using 100 images
6.30 Comparison of a new way of feeding data to the VAE and the rest of the methods using 250 images


6.31 Comparison of a new way of feeding data to the VAE and the rest of the methods using 500 images
6.32 Comparison of a new way of feeding data to the VAE and the rest of the methods using 1000 images
6.33 Comparison VAE with 3D latent space and other methods using 50 images
6.34 Comparison VAE with 3D latent space and other methods using 100 images
6.35 Comparison VAE with 3D latent space and other methods using 250 images
6.36 Comparison VAE with 3D latent space and other methods using 500 images
6.37 Comparison VAE with 3D latent space and other methods using 1000 images
6.38 Comparison using F1 score of Keras data augmentation and Keras data augmentation with VAE
6.39 Summary of experiments

1 Introduction

Artificial Intelligence (AI) has become one of the most fashionable topics out there. Everybody wants to make their own contribution to it. It is also pure marketing: nowadays anything that has AI in it seems better than anything else (or at least that is what the buyer thinks). This revolution around AI has been attempted before, but we did not have the resources to make it happen. So, what do we have now? This time we have three important factors in our favour: almost unlimited computing power, efficient algorithms and enormous amounts of data.

Some of us confuse Machine Learning and Artificial Intelligence. Machine Learning is a type of Artificial Intelligence where we no longer write rules to generate intelligence; instead, we create algorithms that can learn from data. In conventional programming we write the logic and give it an input, and the program produces the output. In Machine Learning we give the system a set of associated inputs and outputs, and the system builds a model that maps those inputs to the outputs. Once that is done, we can use the system to produce outputs from new inputs.

Machine Learning itself can be classified based on the nature of learning into:

Supervised Learning (Input and output is specified for training)

Unsupervised Learning (Only input is given to recognise patterns)

Reinforcement learning (Real world feedback is provided to system on the go)

Other classifications are also available based on the kind of output produced, like classification, regression, etc.

Now, what is a neural network? Artificial neural networks are computing systems used for Deep Learning. Deep Learning is a type of Machine Learning built from blocks (function compositions) which can be adjusted on the go to produce better results; this is done by adjusting blocks that are far away from the output. Neural networks apply rules inspired by the human brain to generate intelligence; it is about mimicking human neurons on silicon. Usually the blocks are arranged into multiple layers to form a deep neural network.


Deep Learning is nothing but large neural networks: they can be thought of as a flow chart where data comes in on one side and inference/knowledge comes out on the other.

So neural networks are fed with data; it is from that data that the network learns. There is an interesting, almost linear relationship between the amount of data required and the size of the model. Therefore, if the complexity of the problem is high (as in image classification), the number of parameters and the amount of data required are also very large.

1.1. Motivation

When working on a problem specific to your domain, the amount of data needed to build models of this size is often impossible to find. Building a dataset yourself costs a lot of money. You could search for a dataset on the Internet, but sometimes the dataset you are looking for simply does not exist. This is where the biggest companies have the advantage: since they have the money and resources, they can build a dataset that suits the problem they are facing.

When you do not have the money or the resources, you can barely build a dataset, and unless you are willing to spend a great deal of time on it, that dataset will not be large enough to solve the problem you are dealing with.

One solution to this problem is to generate new samples from the dataset you already have. There is a standard way to do this with images: data augmentation, which is well known and generates new samples that are modifications of the original ones, in the way you specify.

I have researched another solution that addresses this with better results. This is where generative models come in. A Variational Autoencoder (VAE) does the work for this task. But why?

When using generative models, you could simply want to generate a random, new output that looks similar to the training data, and that can certainly be done with Autoencoders (AE) too. But more often, you would like to alter, or explore variations on, data you already have, and not just in a random way, but in a desired, specific direction. This is where Variational Autoencoders (VAEs) work better than any other method currently available.

1.2. Objectives

This project is all about research. When we started it there were no projects like this one, focused on generating datasets with VAEs, although during the course of the project another work appeared that tried to replicate what we are doing here. In addition, we compare this technique with one of the most widely used approaches today, data augmentation.

The objectives of this project are as follows:

Build an architecture for Variational Autoencoders. We are going to explain, understand and build a main architecture that we will use and then adapt to every circumstance.

Generate new images with VAEs. We will build a generative script which will generate images using the models that we saved from our Variational Autoencoders.

Build a standard CNN with VAEs. We will build a script that uses the images we have just generated, together with images from the original dataset, using a Convolutional Neural Network architecture.

Test both VAEs and CNN with different sizes of datasets. Using different amounts of data, we will test how the VAEs learn from each amount, what the images they produce look like, and the improvement over the CNN without VAE augmentation.

Compare CNN vs. CNN with VAE vs. CNN with data augmentation. Here we will compare whether VAEs improve over data augmentation or not.

2 State of the Art

2.1. Data Augmentation

The term data augmentation refers to methods for constructing iterative optimization or sampling algorithms via the introduction of unobserved data or latent variables. It can reduce the overfitting of models trained on small datasets, since such models do not generalize well to the validation and test sets.

Data augmentation means increasing the number of data points. In terms of images, it may mean increasing the number of images in the dataset. In terms of traditional row/column data, it means increasing the number of rows or objects. We will focus on this technique for images.

There are many ways to augment data. With images, the original image can be rotated, its lighting conditions can be changed, or it can be cropped differently, so from one image you can generate several different sub-samples. This way the overfitting of your classifier can be reduced.

As we have seen, there are many techniques of data augmentation:

Scaling.

Translation.

Rotation (at 90 degrees).

Rotation (at finer degrees).

Flipping.

Adding salt and pepper noise.

Lighting condition.

Perspective transform.


Though the above list of image augmentation methods is not exhaustive, it comprises many widely used ones. Also, based on the use case of the problem you are trying to solve and the type of dataset you have, you may use only those augmentations that add value to your dataset. These augmentations can also be combined to produce an even larger number of images.
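As an illustration only (this is not the implementation used in this project), a minimal sketch of a few of these transformations with OpenCV and NumPy could look as follows; the function names and parameter values are arbitrary choices for the example:

import cv2
import numpy as np

def rotate(img, angle=15):
    # Rotate the image around its centre by a small angle
    h, w = img.shape[:2]
    m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(img, m, (w, h))

def translate(img, tx=3, ty=3):
    # Shift the image a few pixels horizontally and vertically
    h, w = img.shape[:2]
    m = np.float32([[1, 0, tx], [0, 1, ty]])
    return cv2.warpAffine(img, m, (w, h))

def flip_horizontal(img):
    # Mirror the image along the vertical axis
    return cv2.flip(img, 1)

def salt_and_pepper(img, amount=0.02):
    # Set a random fraction of pixels to black or white
    noisy = img.copy()
    mask = np.random.rand(*img.shape[:2])
    noisy[mask < amount / 2] = 0
    noisy[mask > 1 - amount / 2] = 255
    return noisy

Each call produces one new sub-sample of the original image, and the transformations can be chained to multiply the number of generated variants.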

Another way to augment data is with Generative Adversarial Networks (GANs). GANs have been a powerful technique for performing unsupervised generation of new images for training. They have also proven extremely effective in many data generation tasks, such as novel paragraph generation. Furthermore, GANs have been effective even with relatively small sets of data.

In this paper [5] all these methods are explored in depth.

2.2. Improving Data Augmentation

What we have done here is research into other methods to generate images. As we already know, data augmentation is a method that artificially creates new data based on modifications of the existing data. The heuristics underlying these modifications depend heavily on which transformations are suitable for the classification task at hand. The expectation of this project is to improve on data augmentation. This is a Final Degree Project, so we are limited in how far we can improve over it.

To make the comparison consistent, we will be using a specific way to augment data that works very well. This method uses Variational Autoencoders. But before talking about them, we have to introduce Autoencoders. First, some history.

The idea of autoencoders was first mentioned in 1986, in an article extensively analysing back-propagation [11]. In the following years, the idea resurfaced in more research papers. A 1989 paper by Baldi and Hornik helped further introduce autoencoders by offering “a precise description of the salient features of the surface attached to E (the error function) when the units are linear” [13]. Another notable paper, by Hinton and Zemel from 1994, describes a new objective function for training autoencoders that allows them to discover non-linear, factorial representations [2]. However, it is hard to attribute the ideas about autoencoders because the literature is diverse and the terminology has evolved over time. A more recently emerging learning algorithm is the extreme learning machine, where the hidden node parameters are randomly generated and the output weights are computed, thus learning a linear model in a way that is faster than back-propagation.

It is worth noting that all currently used variants of autoencoders have been defined only in the last ten years:

Vincent et al 2008 – Denoising autoencoders

Goodfellow et al 2009 – Sparse autoencoders

Rifai et al 2011 – Contractive autoencoders

Kingma et al 2013 – Variational autoencoders

In addition, we should remark that autoencoders were traditionally used for dimensionality reduction and feature learning. But, as we discuss in this project, thanks to the connection between autoencoders and latent variable models, they are now also being used as generative models.

In the next section, we will explain Autoencoders in depth and introduce Variational Autoencoders which are the ones we are going to need in this project, and why we are using them.

2.3. Autoencoders

First of all, we should talk about Autoencoders: how they work and what their structure is.

An autoencoder network is actually a pair of two connected networks, an encoder and a decoder. An encoder network takes in an input, and converts it into a smaller, dense representation, which the decoder network can use to convert it back to the original input.

Figure 2.3.1 is a graphic example of an encoder of a CNN.

Figure 2.3.1: Encoder of a CNN

An encoder is a network that takes in an input and produces a much smaller representation (the encoding), that contains enough information for the next part of the network to process it into the desired output format. Typically, the encoder is trained together with the other parts of the network, optimized via back-propagation, to produce encodings specifically useful for the task at hand. In CNNs, the encodings produced are such that they are specifically useful for classification.

Autoencoders take this idea, and slightly flip it on its head, by making the encoder generate encodings specifically useful for reconstructing its own input.

Figure 2.3.2 is a graphic representation of an autoencoder structure:

Figure 2.3.2: Autoencoder structure

The entire network is usually trained as a whole. The loss function is usually either the mean-squared error or the cross-entropy between the output and the input, known as the reconstruction loss, which penalizes the network for creating outputs different from the input.

As the encoding (which is simply the output of the hidden layer in the middle) has far fewer units than the input, the encoder must choose to discard information. The encoder learns to preserve as much of the relevant information as possible in the limited encoding, and to intelligently discard the irrelevant parts. The decoder learns to take the encoding and properly reconstruct it into a full image.

2.3.1. Problem with Autoencoders

Standard Autoencoders learn to generate compact representations and reconstruct their inputs well, but aside from a few applications like denoising autoencoders, they are fairly limited.

The fundamental problem with Autoencoders, for generation, is that the latent space they convert their inputs to, where their encoded vectors lie, may not be continuous or allow easy interpolation.

For example, training an Autoencoder on the MNIST dataset, and visualizing the encodings from a 2D latent space reveals the formation of distinct clusters.

Figure 2.3.3 shows how it would look like.

Figure 2.3.3: Autoencoder MNIST 2D latent space

But when you are building a generative model, you do not want to replicate the same image you put in. You want to randomly sample from the latent space, or generate variations on an input image, from a continuous latent space.

I will explain it in depth in a tutorial which starts building an Autoencoder and finishes with a generative model as Variational Autoencoders.

2.4. Variational Autoencoders

Variational Autoencoders (VAEs) have one fundamentally unique property that sepa- rates them from vanilla autoencoders, and it is this property that makes them so useful for generative modeling: their latent spaces are, by design, continuous, allowing easy random sampling and interpolation.

It achieves this by outputting two vectors of size n, instead of one: a vector of means, µ, and another vector of standard deviations, σ.

It can be observed that in Fig. 2.4.1.

Figure 2.4.1: Variational Autoencoder structure

They form the parameters of a vector of random variables of length n, with the i-th elements of µ and σ being the mean and standard deviation of the i-th random variable, X_i, from which we sample to obtain the sampled encoding, which we pass onward to the decoder (as seen in Fig. 2.4.2).

Figure 2.4.2: Sampled vector

This stochastic generation means that, even for the same input, while the mean and standard deviation remain the same, the actual encoding will somewhat vary on every single pass, simply due to sampling.

It can be appreciated in Fig. 2.4.3.

Figure 2.4.3: Encoding variation comparison

Intuitively, the mean vector controls where the encoding of an input should be centered around, while the standard deviation controls the “area”: how much the encoding can vary from the mean. As encodings are generated at random from anywhere inside the “circle” (the distribution), the decoder learns that not only does a single point in latent space refer to a sample of that class, but all nearby points refer to it as well. This allows the decoder to decode not just single, specific encodings in the latent space (which would leave the decodable latent space discontinuous), but ones that vary slightly too, as the decoder is exposed to a range of variations of the encoding of the same input during training. In code (the sampler takes a single argument so that it can be wrapped in a Keras Lambda layer):

latent_size = 5

mean = Dense(latent_size)(hidden)
log_stddev = Dense(latent_size)(hidden)

def sampler(args):
    mean, log_stddev = args
    # sample a batch_size x latent_size matrix from the standard normal
    # (taking minibatches into account)
    std_norm = K.random_normal(shape=(K.shape(mean)[0], latent_size),
                               mean=0, stddev=1)
    # sampling from Z ~ N(mu, sigma^2) is the same as sampling mu + sigma*X with X ~ N(0, 1)
    return mean + K.exp(log_stddev) * std_norm

latent_vector = Lambda(sampler)([mean, log_stddev])
# pass latent_vector as input to the decoder layers

In the code above we are encoding the intermediate layer. The “hidden” layer we start from is the last layer before µ and σ, as can be seen in Fig. 2.4.1. These are Dense layers, which are simply layers where each unit or neuron is connected to every neuron in the next layer. We also use them in the code above to connect the last Dense layer to the two new layers, reducing them to the size of the latent space we specify.

We will go in depth with the code in a tutorial further down.

The model is now exposed to a certain degree of local variation by varying the encoding of one sample, resulting in smooth latent spaces on a local scale, that is, for similar samples. Ideally, we want overlap between samples that are not very similar too, in order to interpolate between classes. However, since there are no limits on what values vectors µ and σ can take on, the encoder can learn to generate very different µ for different classes, clustering them apart, and minimize σ, making sure the encodings themselves do not vary much for the same sample (that is, less uncertainty for the decoder). This allows the decoder to efficiently reconstruct the training data.

What we ideally want are encodings, all of which are as close as possible to each other while still being distinct, allowing smooth interpolation and enabling the construction of new samples. Figure 2.4.4 shows what this would look like ideally and what we could end up with instead.

Figure 2.4.4: Ideally encoding vs. possible encoding

In order to force this, we introduce the KL divergence into the loss function. The KL divergence between two probability distributions simply measures how much they diverge from each other. Minimizing the KL divergence here means optimizing the probability distribution parameters (µ and σ) to closely resemble that of the target distribution. Equation 2.1 shows the KL divergence.

\sum_{i=1}^{n} \left( \sigma_i^2 + \mu_i^2 - \log(\sigma_i) - 1 \right) \qquad (2.1)

For VAEs, the KL loss is equivalent to the sum of all the KL divergences between each component X_i ~ N(µ_i, σ_i^2) in X and the standard normal. It is minimized when µ_i = 0, σ_i = 1.
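As a quick numerical sketch of Equation 2.1 (using NumPy and illustrative values), the KL term evaluates to zero exactly when every µ_i = 0 and σ_i = 1, and grows as the encoding moves away from the standard normal:

import numpy as np

def kl_term(mu, sigma):
    # Equation 2.1: sum over the latent dimensions of sigma^2 + mu^2 - log(sigma) - 1
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    return np.sum(sigma ** 2 + mu ** 2 - np.log(sigma) - 1)

print(kl_term([0.0, 0.0], [1.0, 1.0]))   # 0.0: the encoding matches the standard normal
print(kl_term([2.0, -1.0], [0.1, 3.0]))  # much larger: the encoding is far from it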

Now, using purely KL loss results in encodings densely placed randomly, near the center of the latent space, with little regard for similarity among nearby encodings. The decoder finds it impossible to decode anything meaningful from this space, simply because there really is not any meaning. Figure 2.4.5 shows pure KL loss results in latent space.

Figure 2.4.5: Pure KL loss results in 2D latent space

Optimizing the two together (reconstruction loss and KL divergence loss), however, results in the generation of a latent space which maintains the similarity of nearby encodings on the local scale via clustering, yet globally, is very densely packed near the latent space origin (compare the axes with the original). Figure 2.4.6 shows reconstruction loss and KL divergence loss results in latent space.

Figure 2.4.6: Reconstruction loss and KL loss results in 2D latent space

Intuitively, this is the equilibrium reached by the cluster-forming nature of the reconstruction loss, and the dense packing nature of the KL loss, forming distinct clusters the decoder can decode. This is great, as it means when randomly generating, if you sample a vector from the same prior distribution of the encoded vectors, N(0, I), the decoder will successfully decode it. And if you are interpolating, there are no sudden gaps between clusters, but a smooth mix of features a decoder can understand.

VAEs work with remarkably diverse types of data, sequential or non-sequential, continuous or discrete, even labelled or completely unlabelled, making them highly powerful generative tools.

3 Technologies

In this chapter we are going to talk about the technologies we have employed. When talking about Neural Networks it is common to use Python as the main programming language, and that is the first tool we intend to use.

3.1. Python

Python comes with a huge number of libraries, many of them aimed at Artificial Intelligence and Machine Learning. Some of the most used ones are Tensorflow (which is a high-level neural network library), Scikit-Learn (for data mining, data analysis and machine learning), Pylearn2 (more flexible than Scikit-Learn), etc. The list goes on and on. Python also has an easy interface to OpenCV. What makes Python everyone's favourite is its powerful yet easy implementation. With other languages, students and researchers need to get to know the language before getting into ML or AI; this is not the case with Python, where even a programmer with very basic knowledge can manage easily. Apart from that, the time spent writing and debugging code in Python is far less than in C, C++ or Java. This lets you focus directly on what you want to do, which is exactly what students of AI and ML want.

One of the most important libraries from Python that we are going to use is Tensorflow.

3.1.1. Tensorflow

Tensorflow is an open source software library for high-performance numerical computation. Its flexible architecture allows easy deployment of computation across a variety of platforms (CPUs, GPUs, TPUs), and from desktops to clusters of servers to mobile and edge devices. Originally developed by researchers and engineers from the Google Brain team within Google's AI organization, it comes with strong support for machine learning and deep learning, and its flexible numerical computation core is used across many other scientific domains.


This library works as the backend in this project; another library, called Keras, runs on top of it.

3.1.2. Keras Library

Keras is an open source neural network library written in Python. It is capable of running on top of Tensorflow, among other backends. Designed to enable fast experimentation with deep neural networks, it focuses on being user-friendly, modular and extensible. It does not itself handle low-level operations such as tensor products, convolutions and so on; instead, it delegates that work to the backend (such as Tensorflow, which does the job perfectly).

Keras contains numerous implementations of commonly used neural network building blocks such as layers, objectives, activation functions, optimizers, and a host of tools to make working with image and text data easier. The code is hosted on GitHub, and community support forums include the GitHub issues page. Moreover, it has got some datasets included such as MNIST or CIFAR10 which can be easily imported.
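For instance, loading one of these bundled datasets takes a single call (a minimal sketch; the shapes in the comments correspond to the standard MNIST split):

from keras.datasets import mnist

# Downloaded on first use and cached locally
(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape)  # (60000, 28, 28) grayscale images
print(y_train.shape)  # (60000,) integer labels from 0 to 9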

3.1.3. Numpy, OpenCV, Matplotlib And Others

There are many other libraries that we can import which can be very useful. As you may already know, Numpy is one of the most used libraries for Machine Learning. Numpy adds support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.

Open Source Computer Vision (OpenCV) is a common library with C++, Python and Java interfaces that supports Windows, Linux, Mac OS, iOS and Android. OpenCV is a library of programming functions mainly aimed at real-time computer vision, and it supports the deep learning frameworks TensorFlow, Torch/PyTorch and Caffe. This library has plenty of applications in Machine Learning. We will use it to load and save data, more specifically images.

One of the most used and powerful libraries for visualizing data is Matplotlib. Matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. Matplotlib can be used in Python scripts, the Python and IPython shells, Jupyter notebooks, web application servers, and four graphical user interface toolkits. You can generate plots, histograms, power spectra, bar charts, error charts, scatterplots, etc., with just a few lines of code.

We will use some other libraries that are indispensable for our task. One of them is Random, which is a module that implements pseudo-random number generators for various distributions. Another library that we intend to use is Scikit-Learn, which is a free software machine learning library for Python. It features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.

There are some more but these are the main ones that we will use for this task.

3.2. Jupyter Notebook

The notebook extends the console-based approach to interactive computing in a qualitatively new direction, providing a web-based application suitable for capturing the whole computation process: developing, documenting, and executing code, as well as communicating the results. The Jupyter notebook combines two components:

A web application: a browser-based tool for interactive authoring of documents which combine explanatory text, mathematics, computations and their rich media output.

Notebook documents: a representation of all content visible in the web application, including inputs and outputs of the computations, explanatory text, mathematics, images, and rich media representations of objects.

This tool is very useful when you want to test some code. It allows you to edit your code inside the browser just as easily and comfortably as in your favourite code editor. It also allows you to execute code from the browser, with the results of the computations attached to the code which generated them, and to display those results using rich media representations such as HTML, LaTeX, PNG, SVG, etc. For example, publication-quality figures rendered by the matplotlib library can be included inline. There are many more features like these, but the essence is that the code can be edited and executed inside the browser for testing.

3.3. Google Colaboratory

Colaboratory is a Google research project created to disseminate research and education content on machine learning. It is a Jupyter Notebook environment that requires no configuration and runs completely in the cloud.

Colaboratory notebooks are stored in Google Drive, and can be shared as with Google Sheets or Google Docs. Colaboratory is a free service.

Colaboratory allows running Python 2 and Python 3 code, supports Tensorflow, and lets graphs be visualized with matplotlib, among other things.

The main reason we are going to use this tool is that it gives access to a powerful graphics card, which lets deep neural networks run much faster. The graphics card we will use is the Nvidia Tesla K80, and it allows you to run your code for twelve hours non-stop.

4 Methodology

4.1. Introduction

In this chapter we are about to describe the methodology we will follow on our project. The methodology we will adopt is based on two stages:

FIRST STAGE: In this stage we work with a dataset which we use to train Variational Autoencoders. There is one VAE for every class in the dataset, so we can train them separately. This way each VAE learns to reconstruct its class independently and generates new samples for that class in the last step of this stage. This process can be seen in Fig. 4.1.1.

Dataset → Train VAE → Generate New Samples

Figure 4.1.1: First stage diagram

SECOND STAGE: In this final stage we put together the original dataset and the generated samples we created in the previous stage. Subsequently, we would build an appropriate standard CNN that fits the actual problem and train it. Finally, we would make predictions with the CNN trained and observe how well it works. Figure 4.1.2 shows a diagram of this process.

Dataset + Generated Samples → Train CNN → Prediction

Figure 4.1.2: Second stage diagram


4.2. First Stage

4.2.1. Dataset

The first step of the first stage is to choose the data we will feed to the VAEs. As we already know, the need to generate new data comes from the problem of having a small dataset. Even though the dataset tends to be small, we still have to split it into training and test sets. This is because we will need to test the CNN that we build in the second stage, and the VAEs should not learn to reconstruct the test set. The test set is used to tell us how well the CNN classifies unseen data.

Before doing anything else, we have to separate the data by class: because we train one VAE per class, we need to prepare the data for each VAE. Nevertheless, the data may already be separated into classes due to the dataset structure.

It is important to have a balanced dataset. Because we are going to feed a section of the original dataset to the Variational Autoencoders, we need to ensure that the data is varied. So we shuffle the original dataset and only then split it into training and test sets.

The separation of the data could be 84% for the training set and 16% for the test set. That would suit a small dataset that someone owns. But here we are doing research and we need as much data as we can obtain to test the CNN later. If we use the entire dataset, then it is 84% for training and 16% for test. But if we reduce the training set to see how the VAEs learn, we should not shrink the test set, because then the results would not be consistent. So the test set keeps its percentage over the entire dataset. Figure 4.2.1 represents the train and test percentages as a pie chart.

TRAIN 84% TEST 16%

Figure 4.2.1: Train test percentage

This percentage generally gives good results, but depending on the problem that you are facing you should choose your own percentage.
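A minimal sketch of this shuffle and split for the samples of one class, assuming they are already loaded into NumPy arrays (the 16% test fraction matches the figure above and the variable names are illustrative):

import numpy as np
from sklearn.model_selection import train_test_split

def split_class(images, labels, test_fraction=0.16, seed=42):
    # Shuffle the samples of a single class and split them into train and test sets
    x_train, x_test, y_train, y_test = train_test_split(
        images, labels,
        test_size=test_fraction,
        shuffle=True,
        random_state=seed)
    return x_train, x_test, y_train, y_test

Calling this function once per class keeps the split balanced, because every class contributes the same proportion of samples to the test set.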

Diagram 4.2.2 sums up all the actions that we have done so far inside the Dataset step.

Split data for each class → Shuffle data → Split Training set and Test set

Figure 4.2.2: Dataset steps

Now, the training set that we have obtained by splitting the data is the one that we feed to the Variational Autoencoders. In the next step we explain the training of the VAEs using this data.

4.2.2. Train Variational Autoencoder

In this step of the first stage we build a Variational Autoencoder for every class of the dataset. We have already seen how Variational Autoencoders work so we would just have to build it in Python as we will see in the implementation section.

After that we have to feed every VAE with the data that we have prepared for every class in the last step. As we already know, we have to split the data in classes, shuffle every class and split it in training and test set.

Once we have the training and test sets with their respective labels, we start to train every VAE. Then we plot the results in the latent space to see how well each Variational Autoencoder has learned to reconstruct the data. Moreover, we can plot some examples of the new data by building a generator, from which we will benefit later. This way we can see what the generated samples will look like.

To end this second step of the first stage, we save the models in a folder. There is one folder for every class with its models in it. We save the VAE model, the encoder model and the generator model.
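A sketch of this saving step, assuming the three Keras models built for one class are available (the folder layout and file names are illustrative, not necessarily the exact ones used in this project):

import os

def save_models(digit, vae, encoder, generator, base_dir='models'):
    # One folder per class, containing the full VAE, the encoder and the generator
    folder = os.path.join(base_dir, str(digit))
    os.makedirs(folder, exist_ok=True)
    vae.save(os.path.join(folder, 'vae.h5'))
    encoder.save(os.path.join(folder, 'encoder.h5'))
    generator.save(os.path.join(folder, 'generator.h5'))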

4.2.3. Generate New Samples

In this last step of the first stage, we finally generate samples from our Variational Autoencoders. We have already used a generator to get an idea of what the new data would look like, so we adopt the same model structure applied in the VAEs to do it. In order to do so, we have to load the models saved for every class.

Once the models are loaded, we are ready to start generating new samples. The generated samples are saved in another folder that we will use later. We need to generate new samples, but we cannot do it without changing the encoded part, because otherwise we would get samples too similar to the original ones, and that would not increase the accuracy of the CNN we build later.

As we observe in Fig. 4.2.3, you could reconstruct the same images if you fed the decoder the same values the encoder produces, that is, the Z value, which is a vector. Instead, we can generate a random vector between two values, with the size of the latent space we are dealing with, feed it to the generator (the decoder) and predict new samples.

Figure 4.2.3: Encoder decoder reconstruction

The goal is to generate enough data that a proper result can be obtained with the neural network built later. So, for instance, if you had 500 samples of the original data you could generate 20,000 more to get better results. It depends on the type of data you are dealing with.
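A sketch of that sampling step for one class, assuming a saved generator model with a 2D latent space (the sampling range, the number of samples and the file names are illustrative):

import os
import cv2
import numpy as np
from keras.models import load_model

def generate_samples(generator_path, out_dir, n_samples=1000, latent_dim=2):
    generator = load_model(generator_path)
    os.makedirs(out_dir, exist_ok=True)
    # Random latent vectors taken from a range where the training encodings lie
    z = np.random.uniform(-2.0, 2.0, size=(n_samples, latent_dim))
    decoded = generator.predict(z)  # e.g. (n_samples, 28, 28, 1) for MNIST
    for i, img in enumerate(decoded):
        # Rescale from [0, 1] to [0, 255] before writing the image to disk
        out = (img.squeeze() * 255).astype('uint8')
        cv2.imwrite(os.path.join(out_dir, '%d.png' % i), out)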

Now we move to the second and final stage.

4.3. Second Stage

4.3.1. Dataset + Generated Samples

In this step of the second stage, we bring together the original dataset and the generated samples we created in the previous stage. As we said earlier, the original dataset was separated into training and test sets, and the training set was what we fed to the Variational Autoencoders to train them. The test set remains intact and we will use it now.

The training set of the original data and the generated samples come together to form the training set for the new neural network, and the test set is the one we use to find out how well the NN learns.
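In code, this union is just a concatenation of the two sets before training (a sketch with illustrative variable names; both sets are assumed to share the same image shape and label encoding):

import numpy as np

# x_train, y_train: original training images and their labels
# x_gen, y_gen: images generated by the VAEs and the labels of their classes
x_combined = np.concatenate([x_train, x_gen], axis=0)
y_combined = np.concatenate([y_train, y_gen], axis=0)

# Shuffle so that original and generated samples are mixed within every batch
perm = np.random.permutation(len(x_combined))
x_combined, y_combined = x_combined[perm], y_combined[perm]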

Figure 4.3.1 explains how the union of the data works.

Dataset → Train VAE → Generate New Samples

Dataset + Generated Samples → Train CNN → Prediction

Figure 4.3.1: Data augmented workflow

After preprocessing the data we go to the next step which is to train the neural network.

4.3.2. Train CNN

In this second step of the second stage, we build a Convolutional Neural Network. This CNN is trained with the original dataset to see how well the CNN is doing with that data.

Subsequently, we feed the CNN with the dataset plus the new samples that we generated, and compare it with the CNN trained on just the original dataset.

We also train the CNN using data augmentation to see if our methodology is improving over it.

4.3.3. Prediction

In the last step of the second and final stage of our methodology, we predict values with every CNN we have trained. We then compare and analyze the results to see which method performs best. Finally, we repeat the whole process with different sizes of the dataset to see in which situations each method is better than the others.
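A sketch of this evaluation step with Scikit-Learn, assuming a trained Keras model and a one-hot encoded test set; the F1 scores reported in the next chapter can be obtained this way (macro averaging is used here only as an example):

import numpy as np
from sklearn.metrics import f1_score

# predictions: softmax outputs of the CNN for the test images
predictions = model.predict(x_test)
y_pred = np.argmax(predictions, axis=1)
y_true = np.argmax(y_test, axis=1)

# Average the per-class F1 scores over the ten digit classes
print('F1:', f1_score(y_true, y_pred, average='macro'))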

The figure 4.3.2 shows a summary of the entire process.

VAE branch: Dataset → Train VAE → Generate New Samples → Dataset + Generated Samples → Train CNN → Prediction

Data Augmentation branch: Dataset → Train CNN (with Keras augmentation) → Prediction

CNN branch: Dataset → Train CNN → Prediction

The predictions of the three branches feed a final Compare Results step.

Figure 4.3.2: Methodology summary

As we observe in this figure, there are three processes, and at the end we compare them to see which one is the best and which one is the worst. We run this process with different sizes of the dataset, so it is a loop in which the only thing that changes between iterations is the size of the original dataset. In the case of the Data Augmentation branch, the extra data is implicit in the “Train CNN” process.

Now that we know the methodology we intend to follow, its implementation can begin.

5 Implementation

5.1. Introduction

In this chapter we dig deeper into the implementation. We will start from the bottom. We have already talked about Autoencoders, which are the basis for understanding Variational Autoencoders, our main goal. Here we will see how we can go from a simple Autoencoder to a complex Variational Autoencoder, but this time we will go deeper and show how to implement each step in Python. Furthermore, we will implement a CNN for the experiments later on. This section follows these steps:

Autoencoder and Variational Autoencoder

− Simplest Possible Autoencoder: A basic Autoencoder to understand the basis.

− Variational Autoencoder: A basic implementation of a Variational Au- toencoder.

− Convolutional Variational Autoencoder: A more complex VAE with convolutions.

Convolutional Neural Network: Build a general CNN to make experiments with different techniques and configurations.


5.2. Autoencoder and Variational Autoencoder

5.2.1. Simplest Possible Autoencoder

We start with a simple example of an autoencoder which consists of a single fully-connected neural layer as encoder and as decoder. As we have seen before, an encoder is a network that takes in an input and produces a much smaller representation (the encoding) that contains enough information for the next part of the network to process it into the desired output format. It learns to preserve as much of the relevant information as possible in the limited encoding and to intelligently discard the irrelevant parts. On the other hand, we have the decoder, which learns to take the encoding and properly reconstruct the original input. We will be working with images, so the input is an image that passes through the encoder, and then the decoder tries to reconstruct it completely.

First, we declare two variables for the encoded and decoded parts. The encoded part takes the original image as input, which could be a vector of 784 dimensions (input_img), and encodes it to “encoding_dim”, which is set to 32. The decoded part takes the encoded part as input and tries to reconstruct the original image, giving an output of the same size as the image that was fed in, which in this case would be 784.

encoded = Dense(encoding_dim, activation='relu')(input_img)
decoded = Dense(784, activation='sigmoid')(encoded)

Afterwards, we declare a model that maps an input image to its reconstruction (decoded representation), which we call “autoencoder”. We also declare another variable called “encoder”, a model that maps the original image to its encoded representation.

autoencoder = Model(input_img, decoded)
encoder = Model(input_img, encoded)

To finish with the models, we declare a decoder model which takes an “encoded_input” (a placeholder for an encoded input) and maps it with “decoder_layer”, the last layer of the autoencoder model, applied to that placeholder.

decoder = Model(encoded_input, decoder_layer(encoded_input))

We compile the “autoencoder” model with “adadelta” [1] as the optimizer, which works fine, and “binary_crossentropy” as the loss function.

autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

If we wanted to predict a result to see how the autoencoder reconstructs the input, we would use the predict() function of the encoder on the test set, which gives us encoded images that we then feed to the predict() function of the decoder:

encoded_imgs = encoder.predict(x_test)
decoded_imgs = decoder.predict(encoded_imgs)

This would be the implementation of a basic autoencoder, which fits with what we have already explained in other chapters. But we need to reconstruct the data with a variation from the original input, so we need Variational Autoencoders. Let us talk about the implementation of a basic Variational Autoencoder.

5.2.2. Variational Autoencoder

We have already talked about Variational Autoencoders. The main property that separates them from Autoencoders is their latent space, which is continuous, allowing easy random sampling and interpolation. For what we need, a generative model that suits the main goal of this project, they work perfectly.

We have already talked about how they work mathematically and functionally. Here we are about to show how they are implemented.

First, we map the inputs to our latent distribution parameters:

x = Input(shape=(original_dim,))
h = Dense(intermediate_dim, activation='relu')(x)
z_mean = Dense(latent_dim)(h)
z_log_sigma = Dense(latent_dim)(h)

The term “original_dim” refers to the size of the input which, for instance, could be 784. It has the same meaning as the “input_img” term explained above. “intermediate_dim” is the size of the encoded part and “latent_dim” is the size of the latent space.

We can deduce that “z_mean” is related to µ and “z_log_sigma” to σ.

We use these parameters to sample new, similar points from the latent space:

def sampling(args):
    z_mean, z_log_sigma = args
    epsilon = K.random_normal(shape=(batch_size, latent_dim),
                              mean=0., std=epsilon_std)
    return z_mean + K.exp(z_log_sigma) * epsilon

z = Lambda(sampling, output_shape=(latent_dim,))([z_mean, z_log_sigma])

The “Lambda” function is a layer provided by Keras that wraps an arbitrary expression as a Layer object. Using the “sampling” method and Lambda, we obtain the sampled encoding from the mean and standard deviation. This sampled encoding is then passed onward to the decoder.

This last process we have just completed is the one that we already explained in the introduction. Figure 5.2.1 shows the already explained structure that this process follows.

Figure 5.2.1: VAE encoded structure

What we have done so far allows us to instantiate 3 models:

An end-to-end autoencoder mapping inputs to reconstructions.

An encoder mapping inputs to the latent space.

A generator that can take points on the latent space and will output the corresponding reconstructed samples.

These models are declared in the following code (the decoder layers are defined first, since the end-to-end model needs x_decoded_mean):

# Decoder layers, reused both in the end-to-end model and in the generator
decoder_h = Dense(intermediate_dim, activation='relu')
decoder_mean = Dense(original_dim, activation='sigmoid')
h_decoded = decoder_h(z)
x_decoded_mean = decoder_mean(h_decoded)

# End-to-end autoencoder and encoder
vae = Model(x, x_decoded_mean)
encoder = Model(x, z_mean)

# Generator
decoder_input = Input(shape=(latent_dim,))
_h_decoded = decoder_h(decoder_input)
_x_decoded_mean = decoder_mean(_h_decoded)
generator = Model(decoder_input, _x_decoded_mean)

We train the model using the end-to-end model, with a custom loss function: the sum of a reconstruction term and the KL divergence regularization term.

def vae_loss(x, x_decoded_mean):
    xent_loss = objectives.binary_crossentropy(x, x_decoded_mean)
    kl_loss = -0.5 * K.mean(1 + z_log_sigma - K.square(z_mean) - K.exp(z_log_sigma),
                            axis=-1)
    return xent_loss + kl_loss

vae.compile(optimizer='rmsprop', loss=vae_loss)

We include the function that we have just created as the loss function of the model, applying a different optimizer.

As Dense layers are not enough when dealing with images or more complex inputs, we need a deeper and stronger model. That is why we now build a Convolutional Variational Autoencoder.

5.2.3. Variational Autoencoder with Convolutions

Convolutional Neural Networks (CNN) are a category of Neural Networks that have proven very effective in areas such as image recognition and classification. CNNs derive their name from the “convolution” operator. The primary purpose of convolution in the case of a CNN is to extract features from the input image. Convolution preserves the spatial relationship between pixels by learning image features using small squares of input data.

In Convolutional Neural Networks, you use a matrix called a filter or kernel that you slide over the input image, computing the dot product at each position. These filters act as feature detectors on the original input image. In practice, a CNN learns the values of these filters on its own during the training process. The more filters we have, the more image features get extracted and the better our network becomes at recognizing patterns in unseen images.

So instead of using Dense layers we use Convolutional layers.

Below are the changes we would have to make to the encoder.

x = Input(shape=original_img_size)
conv_1 = Conv2D(img_chns,
                kernel_size=(2, 2), padding='same', activation='relu')(x)
conv_2 = Conv2D(filters,
                kernel_size=(2, 2), padding='same', activation='relu',
                strides=(2, 2))(conv_1)
conv_3 = Conv2D(filters,
                kernel_size=num_conv, padding='same', activation='relu',
                strides=1)(conv_2)
conv_4 = Conv2D(filters,
                kernel_size=num_conv, padding='same', activation='relu',
                strides=1)(conv_3)
flat = Flatten()(conv_4)
hidden = Dense(intermediate_dim, activation='relu')(flat)

z_mean = Dense(latent_dim)(hidden)
z_log_var = Dense(latent_dim)(hidden)

As for the decoder, we need convolutional layers too, but this time of the Conv2DTranspose type (also called deconvolution), used to decode the latent input back to the shape of the encoder's input.

decoder_hid = Dense(intermediate_dim, activation='relu')
decoder_upsample = Dense(filters * 14 * 14, activation='relu')

if K.image_data_format() == 'channels_first':
    output_shape = (batch_size, filters, 14, 14)
else:
    output_shape = (batch_size, 14, 14, filters)

decoder_reshape = Reshape(output_shape[1:])
decoder_deconv_1 = Conv2DTranspose(filters,
                                   kernel_size=num_conv,
                                   padding='same',
                                   strides=1,
                                   activation='relu')
decoder_deconv_2 = Conv2DTranspose(filters,
                                   kernel_size=num_conv,
                                   padding='same',
                                   strides=1,
                                   activation='relu')

if K.image_data_format() == 'channels_first':
    output_shape = (batch_size, filters, 29, 29)
else:
    output_shape = (batch_size, 29, 29, filters)

decoder_deconv_3_upsamp = Conv2DTranspose(filters,
                                          kernel_size=(3, 3),
                                          strides=(2, 2),
                                          padding='valid',
                                          activation='relu')
decoder_mean_squash = Conv2D(img_chns,
                             kernel_size=2,
                             padding='valid',
                             activation='sigmoid')

hid_decoded = decoder_hid(z)
up_decoded = decoder_upsample(hid_decoded)
reshape_decoded = decoder_reshape(up_decoded)
deconv_1_decoded = decoder_deconv_1(reshape_decoded)
deconv_2_decoded = decoder_deconv_2(deconv_1_decoded)
x_decoded_relu = decoder_deconv_3_upsamp(deconv_2_decoded)
x_decoded_mean_squash = decoder_mean_squash(x_decoded_relu)

The penultimate layer, “decoder_deconv_3_upsamp”, uses larger strides to increase the size of the image, as it is just one layer away from outputting the reconstructed image.

The rest is the same; we just needed a better model for the encoder and the decoder.

5.3. Convolutional Neural Network

In this section we build a general Convolutional Neural Network, which is the one used to train on the actual problem. We want to be able to run it on the original dataset without any added data, on the original dataset together with the samples generated by the Variational Autoencoders, and with other data augmentation techniques.

The technique we adopt for standard data augmentation is the Keras Image Augmentation API.

5.3.1. Keras Image Augmentation API

Like the rest of Keras, the image augmentation API is simple and powerful.

Keras provides the ImageDataGenerator class, which defines the configuration for image data preparation and augmentation. This includes capabilities such as:

Sample-wise standardization.

Feature-wise standardization.

ZCA whitening.

Random rotation, shifts, shear and flips.

Dimension reordering.

Save augmented images to disk.

We use the ImageDataGenerator class as shown in the code below:

datagen = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=False
)

Then we fit the model using the fit_generator() function instead of fit():

model.fit_generator(datagen.flow(X_train, Y_train, batch_size=args.b),
                    steps_per_epoch=len(X_train),
                    validation_data=(X_test, Y_test),
                    epochs=args.e,
                    verbose=1 if args.v == True else 2,
                    callbacks=[early_stopping, checkpoint])

This function calls the Keras data augmentation method in each iteration, generating random augmented samples from the input data with the configuration that has been specified.

5.3.2. Model

This section describes the CNN model employed to evaluate the different data augmentation techniques explained above.

The CNN model is the same for the three configurations. What changes is the pre-processing of the data, depending on whether you use just the original dataset or also add the generated samples, and the fitting of the model.

In Python, the model looks like this:

model = Sequential()
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', padding='same',
                 input_shape=input_shape))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu', padding='same'))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

This is a strong model that performs well both with a small dataset and with a large dataset such as the one generated by the VAEs.

5.3.3. Parameters

To keep the code in a single file and be able to run the three techniques, we parametrize it:

    parser = argparse.ArgumentParser(description='CNN')
    parser.add_argument('-path', required=True, help='Dataset path')
    parser.add_argument('-vae', default=None, help='Path to VAE augmentation folder')
    parser.add_argument('--aug', action='store_true', help='Use Keras augmentation')
    parser.add_argument('-limtr', default=100, help='Limit train size. -1 to load all images')
    parser.add_argument('-limte', default=125, help='Limit test size. -1 to load all images')
    parser.add_argument('-e', default=100, help='nb epochs')
    parser.add_argument('-b', default=32, help='batch size')
    parser.add_argument('--v', action='store_true', help='Activate verbose')
    args = parser.parse_args()

As shown above, there is a parameter to load the samples produced by the VAEs and another to enable Keras data augmentation; if neither is given, the CNN is trained in the original way.

In the next chapter we run experiments with the techniques we have just described to see whether our method is worthwhile.

6 Experimentation

In this chapter we test our method against the standard way of doing data augmentation.

To test it we use a common dataset that has proven to be a good way to teach the basics of Machine Learning and, due to its simplicity, also a convenient way to compare different methods: the MNIST handwritten digit dataset.

6.1. MNIST dataset

The MNIST dataset (Modified National Institute of Standards and Technology dataset) is a large dataset of handwritten digits that is commonly used for training image processing systems. It was created by “re-mixing” the samples from NIST's original dataset (paper [14] gives more information about NIST's dataset).

The Fig. 6.1.1 shows four examples of this dataset.

Figure 6.1.1: MNIST’s handwritten digits

It also includes labels for each image, telling us which digit it is. For example, the labels for the images of Figure 6.1.1 are 5, 0, 4, and 1.

The goal of a model for this dataset is to predict, from an input image, which digit it represents.

The MNIST data is split into two parts: 60,000 data points of training data and 10,000 points of test data.

As mentioned earlier, every MNIST data point has two parts: an image of a handwritten digit and a corresponding label. Both the training set and the test set contain images and their corresponding labels.

Each image is 28 pixels by 28 pixels. We can interpret this as a big array of numbers as observed in the Fig. 6.1.2.

Figure 6.1.2: MNIST digit as a matrix

So we have 60,000 images of training that we can represent as matrices and 10,000 images of test that we can also represent as matrices.

But we do not want the entire dataset, we want a small amount of it. We will be testing out the methods with different sizes of the original dataset.

Figure 6.1.3 shows the directory structure of the dataset.

MNIST Dataset
├── TRAIN
│   └── 0..9 (one folder per digit)
└── TEST
    └── 0..9 (one folder per digit)

Figure 6.1.3: MNIST’s directory structure

To train the VAEs we only use the “TRAIN” folder, splitting it again into train and test. We use both folders to train and evaluate the CNN.

For this dataset we use a specific method to load the data. It is the same method for the VAEs and for the CNN, but the output vectors of the function are different because the data feeds different models. For the VAEs, this method looks like this:

    def load_images(path):
        # 'digit', 'TOTAL_IMAGES' and 'list_files' are defined elsewhere in the script
        X = []
        Y = []
        label = str(digit)
        for fname in list_files(path, ext='png'):
            img = cv2.imread(fname, cv2.IMREAD_UNCHANGED)
            X.append(img)
            Y.append(label)

        combined = list(zip(X, Y))
        random.shuffle(combined)          # shuffled with a fixed seed set elsewhere
        X[:], Y[:] = zip(*combined)

        X = X[:TOTAL_IMAGES]
        Y = Y[:TOTAL_IMAGES]

        X = np.asarray(X).astype('float32')
        X = X / 255.                      # normalize pixel values to [0, 1]

        Y = np.asarray(Y)

        return X, Y

First, we load the entire training folder. Then we shuffle it to obtain varied data, using a fixed seed so that the same subset is selected every time the script runs. After that, we keep only the number of images we want to load, given by the variable “TOTAL_IMAGES”. Finally, we normalize the data and return it.

Moreover, we split the data into training and test, as we also need a test set; the split is 84% for the training set and 16% for the test set.
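A minimal sketch of this 84% / 16% split, assuming X and Y are the (already shuffled) arrays returned by load_images():

    # Split into 84% training and 16% test, as described above.
    split = int(len(X) * 0.84)
    X_train, Y_train = X[:split], Y[:split]
    X_test,  Y_test  = X[split:], Y[split:]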

6.2. Train VAEs

Once we have loaded the data we can start training our VAEs. Before training a VAE for every digit, we show a couple of visualizations that a VAE can produce. For this, we train a single VAE on all the digits using the entire dataset.

As it is trained on all the classes at once, this VAE learns how the different digits vary from one another.

Because we are dealing with a two-dimensional latent space, we can look at the neighbourhoods of the different classes on a 2D plane. See Fig. 6.2.1.

Figure 6.2.1: Convolutional VAE using MNIST and a 2D latent space

Each of the coloured clusters in Fig. 6.2.1 is a type of digit. Close clusters are digits that are structurally similar.

As we are also building a generator, we can use it to produce new images: varying the encoded (latent) values changes how the generated digits look. Because this VAE is trained on all the digits at once, we cannot choose which digit we generate (see Fig. 6.2.2). That is why we implement one VAE per digit: with one VAE for every digit we can generate variations of the same kind of digit.

Figure 6.2.2: Convolutional VAE using MNIST digits

6.2.1. Configuration of VAEs

As we will be dealing with small amounts of data, we need to choose which configuration to use when training every VAE for each dataset size. To do that, we run some tests on the VAE for the digit 0, which gives us an idea of which configuration to use for all the digits.

For the tests we use different numbers of epochs (5, 50, 100, 250 and 500) and different dataset sizes.

The results of the test are shown in table 6.1 and represented graphically in Fig. 6.2.3.

Table 6.1: VAE validation loss for digit 0

                 5 epochs   50 epochs   100 epochs   250 epochs   500 epochs
    5 images       416.68      265         264           253.01       424.05
    50 images      374.40      205.85      177.38        185.66       250.60
    500 images     219.50      163.83      160.38        165.60       196.78
    5000 images    156.51      144.31      150.00        162.24       179.79

Figure 6.2.3: Validation loss of the VAE for digit 0 as a function of the number of epochs (one curve per training-set size: 5, 50, 500 and 5000 images)

As we can observe, the fewer images we have, the higher the validation loss of the VAE, which means it makes bigger mistakes when reconstructing the digits.

We can also see that, in general, the more epochs we train, the lower the validation loss. This is usually true except when we train for so many epochs that the model overfits and the validation loss starts increasing: the model fits the seen data too closely and does not generalize well.

This could be mitigated by adding a callback to the model. This is the typical situation in which to use Early Stopping, which stops the training of the model when the validation loss has not improved for a given number of epochs (the patience).

The problem here is that the validation loss is so volatile that it cannot be told whether the model will keep improving or not. Either we add a large “patience” to the callback, so that the model does not overfit regardless of the number of input images, or we increase or decrease the number of epochs depending on the amount of images.
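For reference, a minimal sketch of such a callback using the standard Keras callbacks API; the patience value of 50 is only illustrative and the fit call is shown only as a comment, since it would simply extend the training call used elsewhere in this project:

    from keras.callbacks import EarlyStopping

    # Stop training once the validation loss has not improved for 50 epochs in a row.
    early_stopping = EarlyStopping(monitor='val_loss', patience=50)

    # The callback is then passed to the existing training call, e.g.:
    # vae.fit(..., callbacks=[early_stopping])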

Generally, the sweet spot is between 50 and 200 epochs, depending on the amount of data. As Fig. 6.2.3 shows, for large amounts you should use fewer epochs, as the black line increases between 50 and 100 epochs; for small amounts you should use more epochs, as the blue line keeps decreasing between 100 and 250 epochs.

Overall, we will be using small amounts of images so we will use more than 100 epochs. We will not use the callback, as we want to see for every size of the dataset how many epochs we need to achieve the sweet spot.

6.3. Tests

Now we are about to run some varied tests to see if VAEs improve over data augmentation and over a normal CNN without any enhancement. The structure of this section is as follows:

50 images: train with 4 images for each digit and validate with 1.

100 images: train with 8 images for each digit and validate with 2.

250 images: train with 21 images for each digit and validate with 4.

500 images: train with 42 images for each digit and validate with 8.

1000 images: train with 84 images for each digit and validate with 16.

In every one of them we put the methodology into practice which, as a reminder, is to train the corresponding VAEs, generate new images, put together the original dataset and the generated images and then train the CNN. At the same time, we run the same test training the CNN without any enhancement and with data augmentation. Subsequently, we compare the results.

6.3.1. 50 Images

First of all, we have to train the VAEs. Since 5 images per VAE is an extremely small amount of data, and 1 of those images is reserved for testing (84% training and 16% testing), the VAEs struggle to learn how to reconstruct the digits, so the generated digits are not going to look that good.

In this test we want to see whether digits that do not look quite like the originals can still help to improve the result of the original CNN.

Training VAEs and generating new images

First we train the VAEs using 5 images per digit. For the configuration of the VAEs we use 200 epochs, as this works well for small amounts of data.

Figure 6.3.1: Generated digits using 50 images and a 2D latent space

As we can see, the digits do not look that good, but considering that only 4 images per digit were used for training and 1 for testing, they have been reconstructed quite well. We would say that the VAE works surprisingly well even for that small amount of data. We can also see that the VAE for the digit 5 struggles to reconstruct it, introducing a bit of noise, which can be bad for the CNN.

These digits are shown only as an example of what the VAEs can generate; we have not generated the new samples yet.

We run the generator, which uses the same structure as the VAE, but now we save the generated images to a folder using OpenCV so we can use them later with the CNN.
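A minimal sketch of this step, assuming a Keras decoder model named generator (built in the implementation chapter) that maps latent vectors to 28x28x1 images in [0, 1]; the output folder name is hypothetical:

    import os
    import numpy as np
    import cv2

    latent_dim = 2
    n_samples = 2500                      # samples to generate for one digit
    out_dir = 'vae_generated/0'           # hypothetical output folder for digit 0
    if not os.path.exists(out_dir):
        os.makedirs(out_dir)

    z = np.random.normal(size=(n_samples, latent_dim))   # sample latent vectors
    digits = generator.predict(z)                        # 'generator' is assumed in scope

    for i, img in enumerate(digits):
        cv2.imwrite(os.path.join(out_dir, '%d.png' % i),
                    (img.squeeze() * 255).astype('uint8'))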

Now the question is: how many samples should we generate to increase the accuracy of the CNN without overfitting it? We try with 2500 samples per digit. That is 25,000 extra samples in total which, added to the original dataset, sum 25,050 images.

It is time to see the results of the method that we have used compared to the usual method and data augmentation.

First we train the CNN without any enhancements, using the usual method.

CNN with no extra images

Before we show the results, we have to explain some technicalities about them. Apart from the loss and accuracy on the test set, we also report other metrics: precision, recall, f-measure (f1-score) and support.

The precision is the ratio tp / (tp + fp) where tp is the number of true positives (digits correctly classified) and fp the number of false positives (samples of other classes incorrectly classified). The precision is intuitively the ability of the classifier not to label as positive a sample that is negative.

The recall is the ratio tp / (tp + fn) where tp is the number of true positives and fn the number of false negatives (samples of the positive class incorrectly classified as negatives). The recall is intuitively the ability of the classifier to find all the positive samples.

The F-beta score can be interpreted as a weighted harmonic mean of the precision and recall, where an F-beta score reaches its best value at 1 and its worst score at 0.

The F-beta score weights recall and precision by a factor of beta. beta == 1.0 means recall and precision are equally important.
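For reference, these definitions can be written compactly as follows, with $tp$, $fp$ and $fn$ as defined above ($\beta = 1$ gives the f1-score reported in the tables):

\[
\mathrm{precision} = \frac{tp}{tp + fp}, \qquad
\mathrm{recall} = \frac{tp}{tp + fn}, \qquad
F_\beta = (1 + \beta^2)\,\frac{\mathrm{precision}\cdot\mathrm{recall}}{\beta^2\,\mathrm{precision} + \mathrm{recall}}
\]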

The support is the number of occurrences of each class.

These are metrics of binary classification, where a class is evaluated (positive class) against the rest (negative classes). These metrics are calculated for all the classes and, ultimately, the mean result is returned.
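Per-class tables like the ones below can be obtained, for instance, with scikit-learn's classification_report; this is a hedged sketch (the report does not state which tool produced the tables), assuming the trained CNN model and the one-hot encoded test arrays from the previous sections:

    from sklearn.metrics import classification_report
    import numpy as np

    y_pred = np.argmax(model.predict(X_test), axis=1)   # predicted digit per sample
    y_true = np.argmax(Y_test, axis=1)                  # Y_test is one-hot encoded
    print(classification_report(y_true, y_pred, digits=2))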

Now, let us start with table 6.2, which shows how well the neural network has classified every digit in terms of precision, recall, f1-score and support.

Table 6.2: CNN with no extra images using 50 images

    Digit   precision   recall   f1-score   support
    0            0.71     0.82       0.76       700
    1            0.92     0.90       0.91       700
    2            0.73     0.82       0.77       700
    3            0.61     0.77       0.68       700
    4            0.91     0.73       0.81       700
    5            0.80     0.72       0.76       700
    6            0.94     0.72       0.81       700
    7            0.83     0.70       0.76       700
    8            0.66     0.60       0.63       700
    9            0.66     0.83       0.74       700

The digit that the CNN predicts best is the 1; the f1-score gives us a consistent picture of which digits are predicted best. The support is always 700 because we provide 7,000 test samples in total, 700 per digit.

The average results of training the CNN with 50 images are given in the following table 6.3.

Table 6.3: CNN average results of 50 images

    Loss        0.7904
    Acc         0.760
    Precision   0.7771
    Recall      0.7610
    F1          0.7630
    Support     7000

Overall, in the following tests we will look at the loss, the accuracy and the f1-score as the parameters that decide which method is better in each situation.

As for this test, using just 50 images the CNN reaches an accuracy of 76%, which is very good for the small amount of data that we are using. This is due to the simplicity of the images and of the model.

CNN with VAE generated images

Now we are going to show the results of the CNN using the 25000 samples that we generated with the VAEs we trained.

Table 6.4: CNN VAE using 50 images and a 2D latent space

    Digit   precision   recall   f1-score   support
    0            0.92     0.81       0.86       700
    1            0.77     0.99       0.86       700
    2            0.79     0.69       0.74       700
    3            0.81     0.85       0.83       700
    4            0.79     0.80       0.79       700
    5            0.94     0.86       0.90       700
    6            0.82     0.90       0.86       700
    7            0.93     0.61       0.73       700
    8            0.66     0.77       0.71       700
    9            0.73     0.78       0.76       700

The digit that is predicted best is now the 5. The 1 was predicted better by the plain CNN, but overall this model seems to be doing better.

The average results are shown in the next table 6.5.

Table 6.5: CNN VAE average results 50 images 2D latent space

    Loss        1.8515
    Acc         0.8049
    Precision   0.8155
    Recall      0.8049
    F1          0.8037
    Support     7000

As for the loss, it is clearly much worse than that of the plain CNN, which is due to the amount of data fed to the neural network and to some of the samples introducing extra noise; it also suggests the model could be overfitting. As for the accuracy, it is doing better than the CNN itself.

CNN with data augmentation

Now we will see the results of the CNN with the data augmentation process of Keras in table 6.6.

Table 6.6: CNN Keras data augmentation using 50 images

    Digit   precision   recall   f1-score   support
    0            0.93     0.85       0.89       700
    1            0.96     0.99       0.98       700
    2            0.95     0.86       0.90       700
    3            0.85     0.90       0.88       700
    4            0.91     0.96       0.93       700
    5            0.96     0.95       0.95       700
    6            0.83     0.92       0.87       700
    7            0.80     0.60       0.69       700
    8            0.84     0.70       0.76       700
    9            0.66     0.90       0.76       700

The predictions are much better than with the previous methods. The digits 1 and 5 are the best predicted ones, and the model does better overall.

Now let us take a look at the average results:

Table 6.7: CNN with data augmentation average results using 50 images

    Loss        0.4608
    Acc         0.8621
    Precision   0.8690
    Recall      0.8621
    F1          0.8610
    Support     7000

The loss is lower than that of the CNN itself and much lower than that of the CNN with VAE. The accuracy is also higher.

Summary

To end this 50-image test, we compare side by side the average results of every method in table 6.8.

Table 6.8: Summary of average results of the different methods using 50 images and a 2D latent space

                CNN      VAE      AUG
    Loss        0.7667   1.8515   0.4608
    Acc         0.7611   0.8049   0.8621
    Precision   0.7785   0.8155   0.8690
    Recall      0.7611   0.8049   0.8621
    F1          0.7626   0.8037   0.8610
    Support     7000     7000     7000

As we have already said, the loss of the data augmentation method is lower than that of the other methods, which tells us it makes fewer errors on the validation set. The accuracy of the CNN with VAE is 4% higher than the CNN itself but 6% lower than data augmentation. The f1-score tells the same story as the accuracy in this case.

The conclusion for this test is that, with a really small amount of data, our method does much better than the CNN itself but is still far from the currently used method, data augmentation.

6.3.2. 100 Images

Now we will be testing 100 images in total, which is 10 images per digit.

Training VAEs and generating new images

Figure 6.3.2: Generated digits using 100 images and a 2D latent space

Some of the generated digits look better than in the last test, but some are not well reconstructed. Adding new images does not necessarily mean that the digits are reconstructed better; it also means a bigger variety of digits, that is, more different ways of writing them.

In this case we are doubling the amount of data of the last test; let us see how the methods learn and predict from 100 images.

CNN with no extra images

Table 6.9 shows the results of the CNN trained with 100 images.

Table 6.9: CNN with no extra images using 100 images

    Digit   precision   recall   f1-score   support
    0            0.90     0.89       0.90       700
    1            0.97     0.91       0.94       700
    2            0.76     0.87       0.81       700
    3            0.77     0.80       0.78       700
    4            0.89     0.91       0.90       700
    5            0.93     0.80       0.86       700
    6            0.86     0.90       0.88       700
    7            0.88     0.82       0.85       700
    8            0.76     0.71       0.73       700
    9            0.76     0.82       0.79       700

The CNN itself is predicting the digit 1 better again. It is also getting good results with the digits 0 and 4.

Table 6.10: CNN average results using 100 images

    Loss        0.5051
    Acc         0.8433
    Precision   0.8471
    Recall      0.8433
    F1          0.8438
    Support     7000

The average results show that doubling the amount of data has made the model improve by 8% on average, decreasing the loss too.

CNN with VAE generated images

Table 6.11: CNN VAE using 100 images and a 2D latent space

    Digit   precision   recall   f1-score   support
    0            0.96     0.95       0.95       700
    1            0.94     0.98       0.96       700
    2            0.82     0.91       0.86       700
    3            0.85     0.87       0.86       700
    4            0.93     0.92       0.93       700
    5            0.81     0.93       0.86       700
    6            0.96     0.94       0.95       700
    7            0.86     0.87       0.87       700
    8            0.88     0.59       0.71       700
    9            0.81     0.85       0.83       700

The results of the CNN with VAE have changed: now it predicts the digits 0, 1 and 6 best, instead of the 5 as in the last test.

Table 6.12: CNN VAE average results using 100 images and a 2D latent space

    Loss        0.9062
    Acc         0.8801
    Precision   0.8823
    Recall      0.8801
    F1          0.8776
    Support     7000

As for the average results, they have improved over the last test by 8% on average. The loss has decreased as well.

CNN with Keras data augmentation

Table 6.13: CNN with Keras data augmentation using 100 images

    Digit   precision   recall   f1-score   support
    0            0.95     0.93       0.94       700
    1            0.97     0.99       0.98       700
    2            0.91     0.97       0.94       700
    3            0.94     0.93       0.93       700
    4            0.97     0.97       0.97       700
    5            0.97     0.99       0.98       700
    6            0.98     0.96       0.97       700
    7            0.96     0.91       0.93       700
    8            0.91     0.91       0.91       700
    9            0.94     0.92       0.93       700

Data augmentation with 100 images predicts every digit very well.

Table 6.14: CNN with Keras data augmentation average results using 100 images

    Loss        0.2006
    Acc         0.9486
    Precision   0.9488
    Recall      0.9486
    F1          0.9485
    Support     7000

The average results show that it has also improved by 8% with respect to the last test.

Summary

Table 6.15: Average results of the methods using 100 images and a 2D latent space

                CNN      VAE      AUG
    Loss        0.5051   0.9062   0.2006
    Acc         0.8433   0.8801   0.9486
    Precision   0.8471   0.8823   0.9488
    Recall      0.8433   0.8801   0.9486
    F1          0.8438   0.8776   0.9485
    Support     7000     7000     7000

As every method has improved by roughly 8% from the previous test, the difference between them is maintained. Data augmentation is still above the other two methods and the VAE remains in the middle, as before.

6.3.3. 250 Images

Now we test 250 images in total, which is 25 images per digit. The number of epochs we employ now is 150, as Fig. 6.2.3 shows that this is the sweet spot for this amount of data.

Figure 6.3.3 shows how the digits will look like.

Training VAEs and generating new images

Figure 6.3.3: Generated digits using 250 images and a 2D latent space

Overall they look better than before, but some of them, like the digit 5, are losing some information.

CNN with no extra images

From now on we will not show the table with the per-digit results, but only the table with the average results.

So the average results of the CNN trained without extra images using 250 images are as shown in table 6.16.

Table 6.16: CNN average results 250 images

    Loss        0.2522
    Acc         0.9309
    Precision   0.9309
    Recall      0.9309
    F1          0.9303
    Support     7000

Overall the CNN has improved by 9% over the last test, although it required more than double the amount of data. We are getting closer to accuracy levels where the differences between the methods become neck and neck.

CNN with VAE generated images

Let us see the average results of the CNN using the generated samples of the VAEs.

Table 6.17: CNN VAE average results using 250 images and a 2D latent space

    Loss        0.7026
    Acc         0.9011
    Precision   0.9071
    Recall      0.9011
    F1          0.9012
    Support     7000

The results have improved by only 2% over the last test, which is quite poor considering that there is more than double the amount of data of the last test. This may be due to the random selection of images from the dataset not having picked the most varied samples. It could also mean that the model is overfitting, or even that we have reached the amount of data where the VAE no longer helps that much. We will see in the next tests.

CNN with data augmentation

Table 6.18 shows the average results of the CNN with data augmentation of Keras.

Table 6.18: CNN with Keras data augmentation average results using 250 images

    Loss        0.1112
    Acc         0.9711
    Precision   0.9713
    Recall      0.9711
    F1          0.9712
    Support     7000

The results have improved by 3% from the previous test. As the result was already good the improvement is less significant.

Summary

Table 6.19: Average results of the methods using 250 images and a 2D latent space

                CNN      VAE      AUG
    Loss        0.2522   0.7026   0.1112
    Acc         0.9309   0.9011   0.9711
    Precision   0.9309   0.9071   0.9713
    Recall      0.9309   0.9011   0.9711
    F1          0.9303   0.9012   0.9712
    Support     7000     7000     7000

As we can see in table 6.19, the plain CNN is now better than the CNN with the samples of the VAEs. This could be due to many reasons, so we will keep running the two remaining tests. As for data augmentation, it is now just 4% above the CNN.

6.3.4. 500 Images

Now we will be testing 500 images in total, which is 50 images per digit.

The digits generated by the VAEs look like this:

Training VAEs and generating new images

Figure 6.3.4: Generated digits using 500 images and a 2D latent space

CNN with no extra images

The average results of the CNN trained without extra images using 500 images are as shown in table 6.20.

Table 6.20: CNN average results using 500 images

    Loss        0.2319
    Acc         0.9429
    Precision   0.9431
    Recall      0.9429
    F1          0.9428
    Support     7000

The results improve by 1% overall.

CNN with VAE generated images

Table 6.21: CNN VAE average results using 500 images and a 2D latent space

    Loss        0.3275
    Acc         0.9424
    Precision   0.9437
    Recall      0.9424
    F1          0.9424
    Support     7000

The average results of the CNN with VAE show a clear improvement of 4% over the last test, where it was worse than the CNN itself.

CNN with Keras data augmentation

Table 6.22: CNN with Keras data augmentation average results using 500 images

    Loss        0.0768
    Acc         0.9811
    Precision   0.9812
    Recall      0.9811
    F1          0.9811
    Support     7000

The data augmentation method improves by 1% overall.

Summary

Table 6.23: Average results of the methods using 500 images and a 2D latent space

                CNN      VAE      AUG
    Loss        0.2319   0.3275   0.0768
    Acc         0.9429   0.9424   0.9811
    Precision   0.9431   0.9437   0.9812
    Recall      0.9429   0.9424   0.9811
    F1          0.9428   0.9424   0.9811
    Support     7000     7000     7000

Now the results of the CNN and the VAE are neck and neck, but the VAE should be at least better than the CNN itself. Data augmentation shows again how well this method works, at least for the MNIST dataset.

6.3.5. 1000 Images

Now we are going to be testing 1000 images in total, which is 100 images per digit.

Training VAEs and generating new images

Figure 6.3.5: Generated digits using 1000 images and a 2D latent space

They all look very good except one example of the digit 5, but, overall, great.

CNN with no extra images

The average results of the CNN with no extra images look like this:

Table 6.24: CNN average results using 1000 images

    Loss        0.1309
    Acc         0.9654
    Precision   0.9657
    Recall      0.9654
    F1          0.9654
    Support     7000

It has improved by 2%, as there is now a more generous amount of data. Clearly, with this amount of data good results can be obtained without any enhancement.

CNN with VAE generated images

Table 6.25: CNN VAE average results using 1000 images and a 2D latent space

    Loss        0.1726
    Acc         0.9621
    Precision   0.9623
    Recall      0.9621
    F1          0.9621
    Support     7000

The results have improved by 2% from the last test.

CNN with Keras data augmentation

Table 6.26: CNN with Keras data augmentation average results using 1000 images

    Loss        0.0529
    Acc         0.9836
    Precision   0.9837
    Recall      0.9836
    F1          0.9836
    Support     7000

The average results have improved slightly from the last test, but only by 0.2%. As the results were already very high, the improvement is small.

Summary

Table 6.27: Average results of the methods using 1000 images and a 2D latent space

                CNN      VAE      AUG
    Loss        0.1309   0.1726   0.0529
    Acc         0.9654   0.9621   0.9836
    Precision   0.9657   0.9623   0.9837
    Recall      0.9654   0.9621   0.9836
    F1          0.9654   0.9621   0.9836
    Support     7000     7000     7000

The comparison between CNN and VAE says a lot about what the problem with the VAE is. The generated samples are now introducing a lot of noise into the training and, moreover, the neural network seems to be overfitting due to the amount of data.

6.3.6. Analysis of the results

The results of the CNN with the generated samples of the Variational Autoencoders become comparatively poorer as we introduce bigger amounts of data. Even with small amounts, the data augmentation method is much better.

There could be two problems and two solutions for this:

One of the problems is that the model could be overfitting due to the amount of data and to the fact that we feed all of it from the beginning. Keras data augmentation augments the data in every epoch, adding a small amount to the original training set, but this data does not accumulate across epochs. We could do the same with the CNN and the images generated by the VAEs, introducing only a different small portion of the generated images in every epoch.

The other problem may be the number of dimensions of the latent space. We are using only 2 dimensions, which could translate into a poor reconstruction of the digits, as if the VAE were not learning the most representative features of every digit. We could start by trying 3 dimensions on the latent space, for example.

Next we try to improve the proposed method based on these conclusions.

6.4. New way of feeding the CNN

We are about to implement the solution to the first problem that we found. The CNN with the new samples generated by the VAEs was probably overfitting, so we looked for another way of feeding the neural network: on every epoch it receives a different small portion of the generated samples, which should avoid overfitting.

The structure of this section follows these steps:

Implementation: Subsection where we explain how we implement this method.

Results and Comparison: Subsection where we show the results of the method and compare them with the old one.

Analysis of the results

6.4.1. Implementation

We now explain how we have implemented this method. Below are the modifications made to the training function.

The code:

    pagination_size = 100

    for i in range(0, args.e):
        print("EPOCH %d" % (i + 1))
        since = i * pagination_size
        until = (i + 1) * pagination_size

        X_aux = X_vae[since:until]
        Y_aux = Y_vae[since:until]

        X = np.concatenate((X_train, X_aux), axis=0)
        Y = np.concatenate((Y_train, Y_aux), axis=0)

        model.fit(X, Y,
                  epochs=1,
                  batch_size=args.b,
                  verbose=1,
                  validation_data=(X_test, Y_test))

The code uses a pagination size of 100, which means we feed 100 new generated samples in every epoch without accumulating them. We iterate in a loop whose total number of iterations is the number of epochs; in each epoch we take the next slice of 100 generated samples (the first 100 in the first epoch, the following 100 in the second, and so on), concatenate it with the original training set and fit the model for just 1 epoch, repeating this until the total number of epochs is completed.

6.4.2. Results and Comparison

Now we show the results of the method that we have just implemented to avoid overfitting. Moreover, we will be comparing it with the old method that we used to train the CNN using the generated samples and with the rest of methods.

Let us start from the very beginning, with 50 images in total.

50 Images

Table 6.28: Comparison of a new way of feeding data to the VAE and the rest of the methods using 50 images

                CNN      OLD VAE   NEW VAE   AUG
    Loss        0.7667   1.8515    0.7899    0.4608
    Acc         0.7611   0.8049    0.8499    0.8621
    Precision   0.7785   0.8155    0.8526    0.8690
    Recall      0.7611   0.8049    0.8499    0.8621
    F1          0.7626   0.8037    0.8489    0.8610
    Support     7000     7000      7000      7000

In the table above we clearly see the improvement of the new method (“NEW VAE”) over the previous one (“OLD VAE”), which is of 4-5%, although it is still below data augmentation.

100 Images

Table 6.29: Comparison of a new way of feeding data to the VAE and the rest of the methods using 100 images

                CNN      OLD VAE   NEW VAE   AUG
    Loss        0.5051   0.9062    0.5260    0.2006
    Acc         0.8433   0.8801    0.9090    0.9486
    Precision   0.8471   0.8823    0.9114    0.9488
    Recall      0.8433   0.8801    0.9090    0.9486
    F1          0.8438   0.8776    0.9090    0.9485
    Support     7000     7000      7000      7000

With 100 images the improvement is smaller than before, just 2-3% over the previous method. In this case, data augmentation is further ahead of our current method than in the last test.

250 Images

Table 6.30: Comparison of a new way of feeding data to the VAE and the rest of the methods using 250 images

                CNN      OLD VAE   NEW VAE   AUG
    Loss        0.2522   0.7026    0.2667    0.1112
    Acc         0.9309   0.9011    0.9484    0.9711
    Precision   0.9309   0.9071    0.9490    0.9713
    Recall      0.9309   0.9011    0.9484    0.9711
    F1          0.9303   0.9012    0.9484    0.9712
    Support     7000     7000      7000      7000

With 250 images the improvement is remarkable, as it is 4-5% better. We have to remember that this is also due to the issues that the old VAE had in this test. Data augmentation is still better.

500 Images

Table 6.31: Comparison of a new way of feeding data to the VAE and the rest of the methods using 500 images

                CNN      OLD VAE   NEW VAE   AUG
    Loss        0.2319   0.3275    0.2768    0.0768
    Acc         0.9429   0.9424    0.9566    0.9811
    Precision   0.9431   0.9437    0.9566    0.9812
    Recall      0.9429   0.9424    0.9566    0.9811
    F1          0.9428   0.9424    0.9565    0.9811
    Support     7000     7000      7000      7000

With 500 images the improvement is less notable but still worthwhile. Nevertheless, data augmentation is still on top by far.

1000 Images

Table 6.32: Comparison of a new way of feeding data to the VAE and the rest of the methods using 1000 images

                CNN      OLD VAE   NEW VAE   AUG
    Loss        0.1309   0.1726    0.1825    0.0529
    Acc         0.9654   0.9621    0.9724    0.9836
    Precision   0.9657   0.9623    0.9726    0.9837
    Recall      0.9654   0.9621    0.9724    0.9836
    F1          0.9654   0.9621    0.9724    0.9836
    Support     7000     7000      7000      7000

As we have already said, as we get closer to these high percentages the results become neck and neck. With 1000 images the new method is close to data augmentation, though still behind, and improves the old method of the CNN with generated samples by 1%.

6.4.3. Analysis of the results

To conclude this section, there are noticeable improvements over the previous method thanks to the overfitting that this new way of feeding the data avoids.

There is still much room for improvement, so let us see how we implement the solution to the second problem.

6.5. Using 3 dimensions on the latent space

We are about to implement the solution to the second problem. The VAEs we were training used only two dimensions in the latent space. The latent space is where the encoding lives, the compressed representation of the image, from which the decoder has to reconstruct the image. Two dimensions means the VAE only has that much space to keep the main features of the image before reconstructing it. More than two dimensions can be used: the more dimensions, the less compressed the representation is.

For this test we use 3 dimensions in the latent space and look at the results. We do not need to share code, because simply changing the variable latent_dim from 2 to 3 does the job.

We show how the 3-dimensional latent space of the digit 0 looks for the case of 1000 images which, for the digit 0 alone, is just 100 images (100 images per digit):

Figure 6.5.1: 3D latent space digit 0

Figure 6.5.2 shows an example of the digits that could be generated with this VAE.

Figure 6.5.2: Generated samples of the digit 0 using a 3D latent space

The digits may or may not look good to us, but they can still be useful for the CNN.

6.5.1. Results and Comparison

We run the same tests as before, but now “OLD VAE” refers to the last method we used, the one that feeds the generated images using pagination.

50 Images

Table 6.33: Comparison VAE with 3D latent space and other methods using 50 images

                CNN      OLD VAE   NEW VAE   AUG
    Loss        0.7667   0.7899    0.8169    0.4608
    Acc         0.7611   0.8499    0.8549    0.8621
    Precision   0.7785   0.8526    0.8588    0.8690
    Recall      0.7611   0.8499    0.8549    0.8621
    F1          0.7626   0.8489    0.8540    0.8610
    Support     7000     7000      7000      7000

100 Images

Table 6.34: Comparison VAE with 3D latent space and other methods using 100 images

                CNN      OLD VAE   NEW VAE   AUG
    Loss        0.5051   0.5260    0.4457    0.2006
    Acc         0.8433   0.9090    0.9203    0.9486
    Precision   0.8471   0.9114    0.9211    0.9488
    Recall      0.8433   0.9090    0.9203    0.9486
    F1          0.8438   0.9090    0.9198    0.9485
    Support     7000     7000      7000      7000

250 Images

Table 6.35: Comparison VAE with 3D latent space and other methods using 250 images

                CNN      OLD VAE   NEW VAE   AUG
    Loss        0.2522   0.2667    0.2185    0.1112
    Acc         0.9309   0.9484    0.9594    0.9711
    Precision   0.9309   0.9490    0.9595    0.9713
    Recall      0.9309   0.9484    0.9594    0.9711
    F1          0.9303   0.9484    0.9593    0.9712
    Support     7000     7000      7000      7000

500 Images

Table 6.36: Comparison VAE with 3D latent space and other methods using 500 images

                CNN      OLD VAE   NEW VAE   AUG
    Loss        0.2319   0.2768    0.2361    0.0768
    Acc         0.9429   0.9566    0.9639    0.9811
    Precision   0.9431   0.9566    0.9639    0.9812
    Recall      0.9429   0.9566    0.9639    0.9811
    F1          0.9428   0.9565    0.9638    0.9811
    Support     7000     7000      7000      7000

1000 Images

Table 6.37: Comparison VAE with 3D latent space and other methods using 1000 images

                CNN      OLD VAE   NEW VAE   AUG
    Loss        0.1309   0.1825    0.1278    0.0529
    Acc         0.9654   0.9724    0.9774    0.9836
    Precision   0.9657   0.9726    0.9776    0.9837
    Recall      0.9654   0.9724    0.9774    0.9836
    F1          0.9654   0.9724    0.9774    0.9836
    Support     7000     7000      7000      7000

6.5.2. Using 4 and 8 dimensions on the latent space

In this section we implement the VAEs with 4 and 8 dimensions in the latent space, which gives them a larger space in which to encode the information of every image, so in principle they can reconstruct the images better. As the input image is 28x28 (784 values), a latent space larger than 2 or 3 dimensions should mean better performance, but not necessarily. Let us check that with a couple of tests.

Fig. 6.5.3 compares the results obtained with 2, 3, 4 and 8 dimensions in the latent space.

Figure 6.5.3: F1 score as a function of the number of training images, comparing latent spaces of 2, 3, 4 and 8 dimensions

6.5.3. Analysis of the results

Generally, the samples generated using 3 dimensions in the latent space seem to help the neural network more than those generated with 2 dimensions. The improvement ranges from 0.5% up to 2% overall.

We have also tried with more dimensions. We have done a test for 4 dimensions and another one for 8 dimensions. As the Fig. 6.5.3 shows, more dimensions on the latent space are no longer helping to improve the results.

After modifying the latent space there is another test we can do. Data augmentation is known to work very well for this dataset, MNIST, which is why it performs so well overall. We can try putting together the samples generated by the VAEs with the 3-dimensional latent space and the Keras data augmentation, to see whether the combination improves over data augmentation alone.

6.6. Variational Autoencoders with Keras data augmentation

Finally, we combine the method that trains the CNN with the samples of the Variational Autoencoders and the data augmentation method used to train the CNN.

6.6.1. Implementation

To do this, we combine both pieces of code as follows:

    datagen = ImageDataGenerator(
        rotation_range=20,
        width_shift_range=0.2,
        height_shift_range=0.2,
        horizontal_flip=False)

    pagination_size = 50

    for i in range(0, args.e):
        print("EPOCH %d" % (i + 1))
        since = i * pagination_size
        until = (i + 1) * pagination_size

        X_aux = X_vae[since:until]
        Y_aux = Y_vae[since:until]

        X = np.concatenate((X_train, X_aux), axis=0)
        Y = np.concatenate((Y_train, Y_aux), axis=0)

        # The combined set (original data + VAE samples) is fed to the Keras generator
        model.fit_generator(datagen.flow(X, Y, batch_size=args.b),
                            steps_per_epoch=len(X),
                            validation_data=(X_test, Y_test),
                            epochs=1,
                            verbose=1)

We declare the ImageDataGenerator with the same parameters as in the data augmentation experiments. We also declare the pagination size, but this time with a value of 50 because otherwise we would have too much data. The process is the same as the new way of feeding the Variational Autoencoder samples, but instead of fitting the model with fit() we fit it with fit_generator(), as is done for data augmentation.

6.6.2. Results and Comparison

We now compare the results of this experiment with those of the data augmentation runs we already did. We compare them using the F1 score, which has given us fair and consistent results.

Table 6.38: Comparison using F1 score of Keras data augmentation and Keras data augmentation with VAE

                  Augmentation   Augmentation + VAE
    50 Images           0.8610               0.9086
    100 Images          0.9485               0.9515
    250 Images          0.9712               0.9797
    500 Images          0.9811               0.9846
    1000 Images         0.9836               0.9886

Figure 6.6.1: F1 score as a function of the number of training images, comparing Keras augmentation and Keras augmentation combined with VAE samples

6.6.3. Analysis of the results

As we can see in table 6.38 and in Fig. 6.6.1, the combination of Data Augmentation and Variational Autoencoders improves over the original Data Augmentation, overall.

These results are remarkable because, thanks to this method, using just 5 images per digit (50 in total) a 90% success rate is achieved, for which the network alone would need about 200 images, or about 100 if we only use Keras data augmentation.

6.7. Summary Results

To conclude the experiments, we summarize all the results obtained in this chapter.

Table 6.39: Summary of experiments (F1 score)

    Latent Space   Training Size   CNN      VAE      AUG      AUG + VAE
    2              50              0.7626   0.8489   0.8610   -
    2              100             0.8438   0.9090   0.9485   -
    2              250             0.9303   0.9484   0.9712   -
    2              500             0.9428   0.9565   0.9811   -
    2              1000            0.9654   0.9724   0.9836   -
    3              50              0.7626   0.8540   0.8610   0.9086
    3              100             0.8438   0.9198   0.9485   0.9515
    3              250             0.9303   0.9593   0.9712   0.9797
    3              500             0.9428   0.9638   0.9811   0.9846
    3              1000            0.9654   0.9774   0.9836   0.9886
    4              50              0.7626   0.8377   0.8610   -
    4              100             0.8438   0.9175   0.9485   -
    4              250             0.9303   0.9516   0.9712   -
    4              500             0.9428   0.9628   0.9811   -
    4              1000            0.9654   0.9738   0.9836   -
    8              50              0.7626   0.8446   0.8610   -
    8              100             0.8438   0.9128   0.9485   -
    8              250             0.9303   0.9524   0.9712   -
    8              500             0.9428   0.9595   0.9811   -
    8              1000            0.9654   0.9750   0.9836   -

7 Conclusion

7.1. Project Summary

The purpose of this project is to develop a new method to overcome the lack of data. In the literature, the strategy commonly used to achieve this is data augmentation, a method that artificially creates new data based on modifications of the existing data. The heuristics underlying these modifications depend heavily on which transformations are suitable for the classification task at hand. In this project we introduce an alternative using Variational Autoencoders, which are powerful generative models capable of extracting latent values from the input variables to generate new information without the user having to take specific decisions. We explain and implement them from scratch and try to make the most out of them. The results we obtain are compared side by side with current methods. After the comparisons, we draw some conclusions and give some final thoughts.

7.2. Evaluation

The results obtained in the experimentation chapter were very consistent and promising. Variational Autoencoders by themselves could not beat data augmentation in the end. At the beginning of the experiments the results improved over the neural network trained without any enhancement, but were far behind those of Keras data augmentation. This is already an improvement in itself, since our method learns the transformations, whereas in traditional data augmentation the programmer has to choose the types of transformations to carry out, because not every transformation is good for every type of data (for example, it does not make sense to flip digits or letters, but it does make sense to rotate them a bit).

However, with several improvements to the way the Variational Autoencoder samples are used, we achieved results close to the current best method, data augmentation. Moreover, we tried combining it with our method to see whether this improved the results. And, in fact, it did: feeding the original data together with the samples generated by the Variational Autoencoders into the data augmentation method improved over data augmentation alone.


Data augmentation is well known to be a very good solution for some well-studied datasets; for the MNIST dataset that we have used, it works quite well. Nevertheless, Variational Autoencoders promise to be a more general solution, as they can work with any type of data.

7.3. Further Work

As we have already mentioned, we have stuck to the MNIST dataset to compare the results of the different methods. Considering that Variational Autoencoders work for any kind of data, we cannot list all the future work that could come out of this project, but here are some directions:

New datasets: It is unfortunate that we could not try other datasets for which data augmentation is not that effective. Some further work could be to try new datasets in the same way that we have done with the MNIST dataset.

Experiment with the latent dimension: The latent dimension of the Variational Autoencoders could be used to compress data or to generate new data of any type, as is done for instance in this report [6].

Alter existing data: We have used VAEs to generate new samples from known ones with some variation. Another purpose of VAEs is to alter existing data: you could, for instance, add a pair of glasses to a person who does not wear them in the original image.

Create: With Variational Autoencoders you could create new pieces of art such as drawings, new 3D models of any kind and even music. The possibilities are endless.

7.4. Final Thoughts

I was a novice in neural networks when I started this project. I had done just a few Machine Learning projects, so I had huge room for improvement. When I started this project I knew it was going to be difficult, but that made it more fascinating. I could not have done it without the help of my tutors, who have been there for any question I had. I have definitely learned a lot from this project; I now have a much deeper understanding of neural networks thanks to it. Nevertheless, there is a long way to go to get a really good comprehension of Machine Learning, and I am very excited to explore it in depth.

I am very grateful that this project accomplished the objectives it was meant to achieve. In some way, with this research we have contributed to the community, opening a door to any improvement or extension of this work that other researchers may want to pursue.

Bibliography

[1] Adadelta: An adaptive learning rate method. https://arxiv.org/abs/1212.5701.

[2] Autoencoders, minimum description length, and Helmholtz free energy. https://www.cs.toronto.edu/~hinton/absps/cvq.pdf.

[3] Building autoencoders in Keras. https://blog.keras.io/building-autoencoders-in-keras.html.

[4] Data augmentation techniques in CNN using Tensorflow. https://medium.com/ymedialabs-innovation/data-augmentation-techniques-in-cnn-using--371ae43d5be9.

[5] The effectiveness of data augmentation in image classification using deep learning. https://arxiv.org/pdf/1712.04621.pdf.

[6] Extracting a biologically relevant latent space from cancer transcriptomes with variational autoencoders. https://www.ncbi.nlm.nih.gov/pubmed/29218871.

[7] Image augmentation for deep learning with Keras. https://machinelearningmastery.com/image-augmentation-deep-learning-keras/.

[8] ImageDataGenerator class and its methods. https://keras.io/preprocessing/image/.

[9] An introduction to artificial intelligence. https://hackernoon.com/understanding-understanding-an-intro-to-artificial-intelligence-be76c5ec4d2e.

[10] Intuitively understanding variational autoencoders. https://towardsdatascience.com/intuitively-understanding-variational-autoencoders-1bfe67eb5daf.

[11] Learning internal representations by error propagation. https://dl.acm.org/citation.cfm?id=104293.

[12] Nanonets: How to use deep learning when you have limited data. https://medium.com/nanonets/nanonets-how-to-use-deep-learning-when-you-have-limited-data-f68c0b512cab.

[13] Neural networks and principal component analysis: Learning from examples without local minima. https://www.researchgate.net/publication/222438485.

[14] NIST Special Database 19: Handprinted forms and characters database. https://www.nist.gov/sites/default/files/documents/srd/nistsd19.pdf.