Turcsány, Diána (2016) Deep learning models of biological visual information processing. PhD thesis, University of Nottingham.

Access from the University of Nottingham repository: http://eprints.nottingham.ac.uk/35561/1/thesis_DianaTurcsany.pdf

Copyright and reuse:

The Nottingham ePrints service makes this work by researchers of the University of Nottingham available open access under the following conditions.

This article is made available under the University of Nottingham End User licence and may be reused according to the conditions of the licence. For more details see: http://eprints.nottingham.ac.uk/end_user_agreement.pdf

For more information, please contact [email protected]

The University of Nottingham

School of Computer Science

Deep Learning Models of Biological Visual Information Processing

Diána Turcsány

Thesis submitted to The University of Nottingham for the degree of Doctor of Philosophy

July, 2016

Abstract

Improved computational models of biological vision can shed light on key processes contributing to the high accuracy of the human visual system. Deep learning models, which extract multiple layers of increasingly complex features from data, have recently achieved breakthroughs on visual tasks. This thesis proposes such flexible data-driven models of biological vision and also shows how insights regarding biological vision can lead to advances within deep learning. To harness the potential of deep learning for modelling the retina and early vision, this work introduces a new dataset and a task simulating an early visual processing function and evaluates deep belief networks (DBNs) and deep neural networks (DNNs) on this input. The models are shown to learn feature detectors similar to retinal ganglion and V1 simple cells and to execute early vision tasks. To model high-level visual information processing, this thesis proposes novel deep learning architectures and training methods. Biologically inspired Gaussian receptive field constraints are imposed on restricted Boltzmann machines (RBMs) to improve the fidelity of the data representation to encodings extracted by visual processing neurons. Moreover, concurrently with learning local features, the proposed local receptive field constrained RBMs (LRF-RBMs) automatically discover advantageous non-uniform feature detector placements from data. Following the hierarchical organisation of the visual pathway, novel LRF-DBN and LRF-DNN models are constructed using LRF-RBMs with gradually increasing receptive field sizes to extract consecutive layers of features. On a challenging face dataset, unlike DBNs, LRF-DBNs learn a feature hierarchy exhibiting hierarchical part-based composition. Also, the proposed deep models outperform DBNs and DNNs on face completion and dimensionality reduction, thereby demonstrating the strength of methods inspired by biological visual processing.

List of Publications

Research presented in this thesis has resulted in the following publications:

Turcsany, D., Bargiela, A., and Maul, T. (2016). Local receptive field constrained deep networks. Information Sciences, 349–350:229–247. doi: 10.1016/j.ins.2016.02.034

Turcsany, D. and Bargiela, A. (2014). Learning local receptive fields in deep belief networks for visual feature detection. In Neural Information Processing, volume 8834 of Lecture Notes in Computer Science, pages 462–470. Springer International Publishing. doi: 10.1007/978-3-319-12637-1_58

Turcsany, D., Bargiela, A., and Maul, T. (2014). Modelling retinal feature detection with deep belief networks in a simulated environment. In Proceedings of the European Conference on Modelling and Simulation, pages 364–370. doi: 10.7148/2014-0364

Acknowledgements

First and foremost, I would like to gratefully thank my supervisor, Prof. Andrzej Bargiela, for supporting my research directions—which allowed me to explore ideas and a research topic I greatly enjoyed—as well as for extending help, encouragement, and motivation throughout the years of this PhD. I also wish to convey my gratitude to Dr Tomas Maul, my second supervisor, for providing support, encouragement, and invaluable comments on my work. Furthermore, I would like to kindly thank Prof. Tony Pridmore who offered supervision and help in the final year of my PhD. Thanks also go to Andy for insightful comments on my annual reports. Likewise, it is a pleasure to express my appreciation to friends and colleagues from the department as well as to the University for providing an excellent environment for research, including an HPC service, which facilitated my experiments. I would like to gratefully acknowledge my friends and previous lecturers who helped and encouraged my choice to pursue a PhD. Also, I wish to deeply thank all my family members for their assistance, with special thanks to my mother, Katalin, whose relentless support and dedication has been an invaluable help. My sincerest gratitude is also greatly deserved by my partner, James, for years of support, care, and inspiration, which made my work and this thesis possible.

Contents

Abstract
List of Publications
Acknowledgements
Table of Contents
List of Figures
List of Tables
Nomenclature
Abbreviations
Notation

1 Introduction
1.1 Motivation
1.1.1 Computational Vision Systems
1.1.2 Biological Visual Information Processing
1.1.3 Computational Modelling of Biological Vision
1.2 The Proposed Approach
1.2.1 Early Visual Processing
1.2.2 High-Level Visual Information Processing
1.2.3 Novel Methods
1.3 Goals and Contributions
1.4 Overview

2 Deep Learning
2.1 The Deep Learning Shift
2.1.1 Impact in Computer Vision
2.1.2 Deep Network Models
2.2 Multi-layer Representations in the Brain
2.3 Restricted Boltzmann Machines
2.3.1 The RBM Model
2.3.2 Contrastive Divergence Learning
2.3.3 RBMs with Gaussian Visible Nodes
2.4 Deep Belief Networks
2.4.1 Generative Model
2.4.2 Deep Neural Networks and Classification
2.4.3 Autoencoders
2.5 Learning the Structure of Deep Networks
2.5.1 Bayesian Non-parametrics
2.5.2 Hyperparameter Learning
2.6 Feature Visualisation
2.7 Summary

3 Modelling the Retina
3.1 Motivation for Retinal Modelling
3.2 Anatomy and Physiology of the Retina
3.2.1 Photoreceptors
3.2.2 Horizontal Cells
3.2.3 Bipolar Cells
3.2.4 Amacrine Cells
3.2.5 Ganglion Cells
3.3 Retinal Modelling on Different Scales
3.4 A New Retinal Modelling Approach
3.4.1 Traditional Approach: Modelling the retina to the 'best of our knowledge'
3.4.2 Novel Approach: Discovering the neural structure of the retina from data
3.4.3 Considerations
3.4.4 Training and Evaluation Protocol
3.4.5 Advantages of the Proposed Approach
3.5 Open Questions in Retinal Research
3.5.1 Eye Movements
3.5.2 Retinal Development
3.5.3 Colour Opponency
3.5.4 Rod Pathway
3.5.5 Photosensitive Ganglion Cells
3.5.6 Digital Versus Analogue Computation
3.5.7 Retinal Prostheses
3.5.8 Population Codes
3.5.9 Neuronal Diversity
3.5.10 Plasticity of the Brain
3.6 Summary

4 Modelling the Retina with Deep Networks
4.1 Experiments
4.2 Modelling the Retina
4.3 A Multi-layer Retinal Model
4.4 Simulated Photoreceptor Input Dataset
4.4.1 Video Dataset
4.4.2 Image Dataset and Classification Task
4.4.3 Training and Test Set
4.4.4 Advantages of Simulated Data
4.5 Methods
4.6 Experimental Set-Up
4.6.1 Training Protocol
4.6.2 Evaluation Protocol
4.7 Results
4.7.1 Learnt Features
4.7.2 Generative Model
4.7.3 Reconstruction
4.7.4 Classification
4.8 Summary

5 Learning Local Receptive Fields in Deep Networks
5.1 Motivation
5.2 The Proposed Methods
5.2.1 Contributions
5.3 Related Work
5.3.1 Convolutional Networks
5.3.2 Feature Detector Placement
5.4 Local Constrained RBMs
5.4.1 Training with Local Receptive Fields
5.4.2 Automatically Learning Receptive Field Centres
5.5 Local Receptive Field Constrained Deep Networks
5.5.1 Pretraining
5.5.2 Fine-Tuning
5.5.3 Notation
5.6 Summary

6 Deep Network Models of Visual Processing
6.1 Experiments
6.2 Dataset
6.2.1 The 'Labeled Faces in the Wild' Dataset
6.2.2 Task
6.2.3 Preprocessing
6.2.4 The MNIST Digit Dataset
6.3 Experimental Set-Up
6.3.1 Training Protocol
6.3.2 Evaluation Protocol
6.4 Results
6.4.1 Learnt Features
6.4.2 Face Completion
6.4.3 Reconstruction
6.5 Summary

7 Conclusions
7.1 Overview
7.2 Key Findings and Contributions
7.2.1 Modelling of Early Visual Processing
7.2.2 Modelling of High-Level Visual Processing
7.3 Summary

8 Future Work
8.1 Local Receptive Field Constrained Models
8.1.1 Extensions
8.1.2 Applications
8.2 Retinal and Biological Vision Modelling
8.2.1 Deep Learning of Retinal Models
8.2.2 Datasets for Modelling Biological Visual Processing
8.3 Deep Learning and Structure Learning
8.3.1 Deep Learning and Biological Vision Modelling

Appendices
A Additional Figures from Chapter 4
B Additional Figures from Chapter 6

References

List of Figures

2.1 Deep learning and the organisation of the visual cortex
2.2 Diagram of an RBM
2.3 Diagram of a DBN
2.4 Diagram of a DBN generative model
2.5 Diagram of a DNN classifier
2.6 Diagram of a DNN autoencoder
3.1 The organisation of the retina
3.2 Difference-of-Gaussians filters
4.1 Structural similarities of the retina and a DBN
4.2 Gabor filters
4.3 Example training and test video frames
4.4 Examples of positive and negative class images in the training and the test set
4.5 Random samples of DBN feature detectors
4.6 Visualisation of the DNN feature hierarchy
4.7 Visualisation of the output layer in a DNN
4.8 New samples of data generated by a DBN
4.9 Positive and negative class reconstructions
4.10 Example feature detectors in a neural network trained without unsupervised pretraining
4.11 Precision and recall scores
4.12 F-measure scores
5.1 Diagram of an LRF-RBM
5.2 Diagram of a hidden node receptive field mask
5.3 Diagram of LRF-DBN and LRF-DNN training
6.1 LRF-RBM training on face images
6.2 Examples of RBM features
6.3 Examples of non-face-specific LRF-RBM features
6.4 Example features learnt by an RBM on the second layer of an LRF-DBN
6.5 LRF-RBM receptive field maps learnt on MNIST
6.6 LRF-DBN receptive field and centre maps
6.7 Comparison of the LRF-DBN and DBN feature hierarchies
6.8 SREs on the left, right, top, and bottom face completion tasks
6.9 SREs on the eyes, mouth, and random area face completion tasks
6.10 Comparison of DBN and LRF-DBN infilling iterations
6.11 Example left and right completions
6.12 Example top and bottom completions
6.13 Example eye and mouth area completions
6.14 Example random area completions
6.15 Comparison of LRF-RBM and RBM SRE scores on a reconstruction task
6.16 Comparison of LRF-DNN and DNN autoencoder SRE scores
6.17 Comparison of LRF-DNNs trained with different pretraining methods
6.18 Comparison of different LRF-DNN parameter choices
6.19 Example RBM and LRF-RBM test image reconstructions
6.20 Example DNN and LRF-DNN test image reconstructions calculated from 500-length codes
6.21 Example DNN and LRF-DNN test image reconstructions calculated from 100-length codes
A.1 Example frames from the complete set of training and test videos
A.2 Example positive and negative class test images
B.1 Examples of non-face-specific LRF-RBM features learnt on LFW
B.2 Visualisation of the LRF-DBN and DBN feature hierarchies

List of Tables

6.1 SREs of DBNs and LRF-DBNs on the left, right, top, bottom, eyes, mouth, and random area completion tasks
6.2 SREs of DNN and LRF-DNN reconstructions obtained from 500-length codes
6.3 SREs of DNN and LRF-DNN reconstructions obtained from 100-length codes

Nomenclature

Abbreviations

CD contrastive divergence learning

CD1 single-step contrastive divergence learning

CDn n-step contrastive divergence learning
ConvNet convolutional neural network
DAG directed acyclic graph
DBN deep belief network
DNN deep neural network
DoG difference-of-Gaussians
EM expectation-maximisation
FFA fusiform face area
GPU graphics processing unit
IBP Indian buffet process
LFW Labeled Faces in the Wild
LGN lateral geniculate nucleus
LN linear-nonlinear model
LRF-DBN local receptive field constrained deep belief network
LRF-DNN local receptive field constrained deep neural network
LRF-RBM local receptive field constrained restricted Boltzmann machine


OFA occipital face area
OMS object motion sensing
PCA principal component analysis
RBM restricted Boltzmann machine
SD standard deviation
SPI simulated photoreceptor input
SRE squared reconstruction error
SVM support vector machine

Notation

⟨.⟩_φ   expectation under the distribution φ
·(L)    architecture of hidden layer trained by an LRF-RBM
a       visible node biases in an RBM
b       hidden node biases in an RBM
C       recall
E(v, h) energy given visible and hidden node states
ε       learning rate
F       F-measure score
fn      number of false negatives
fp      number of false positives
G(σ_RF, k) Gaussian filter with SD σ_RF and filter size k
h       hidden node states in an RBM
H       hidden layer in an RBM
H(., .) cross-entropy error
H_i     i-th hidden layer in a DBN or DNN
I_i     weight image of hidden node h_i in an LRF-RBM
k       filter size used during LRF-RBM training
k       filter sizes used for training consecutive layers of an LRF-DBN
n       number of visible nodes in an RBM
N       element-wise transformation used during LRF-RBM training
m       number of hidden nodes in an RBM
p       ground-truth label
p̂       predicted probability of classes
P       precision
p(.)    probability
R       receptive field masks in an LRF-RBM
σ       SDs of Gaussian visible nodes in an RBM
σ_RF    SD of Gaussian receptive field constraints in an LRF-RBM
σ_RF    SDs of Gaussian receptive field constraints on consecutive layers of an LRF-DBN
tn      number of true negatives
tp      number of true positives
v       visible node states in an RBM
V       visible layer in an RBM or DBN
W       weights between visible and hidden nodes in an RBM

1 Introduction

In this chapter, I provide the background and motivation behind the research described in this thesis, introduce the approach taken, and summarise the main goals and contributions of the presented work.

1.1 Motivation

Research described in this thesis is motivated by the importance of understanding more about neural processing of visual information, the challenge of improving the performance of algorithms for high-level interpretation of visual data, and the widespread potential advantages of developing better computational models of information processing units within the visual pathway.

1.1.1 Computational Vision Systems

The design of computational methods for automatically interpreting visual data holds huge commercial and academic value but remains a highly challenging task. Advancements in this area can generate great impact within technology, security, and medicine, with potential applications ranging from face recognition, intelligent image search engines, and advanced modes of human-computer interaction to the design of navigation aids and devices for visually impaired people.


Fuelled by such interest, much research within the fields of machine learning and computer vision has focused on the development of learning algorithms for understanding high-level semantics in visual data. Even with recent advancements, artificial vision systems still lag behind the highly efficient and accurate human visual system. It is therefore highly plausible that gaining more insight into how visual recognition is implemented within the brain holds the key to lifting the performance of computational vision systems to a greater level.

1.1.2 Biological Visual Information Processing

The highly accurate vision of humans and other biological systems has long been at the centre of interest in neuroscience and biology. This great effort has brought about important discoveries regarding the morphology and functionality of neural cells and networks. However, our knowledge of visual information processing circuits and of the specific roles of cells within the visual pathway is still far from complete. Although numerous experiments have been conducted on different parts of the visual pathway, the high number, diverse functionality, and complex connection patterns of neurons mean that understanding the mechanisms behind biological visual processing remains a challenging open problem. Even well-studied processing units, such as retinal neural networks, contain a number of mysterious cell types whose functionality is yet to be revealed (Masland, 2012).

1.1.3 Computational Modelling of Biological Vision

Extending our knowledge of neural processing of visual information is not only important for neuroscientific and medical research but is also a key challenge underpinning many areas of machine learning, cognitive science, and intelligent systems research. Designing computational models of circuits within the visual pathway can greatly improve our understanding of biological visual information processing and can hence provide a more informed background for the design of visual data processing units, such as retinal implants. Furthermore, modelling mechanisms of biological vision can yield important practical benefits for algorithm design in machine learning, image processing, and computer vision. A key motivation behind the modelling of biological vision is, therefore, to answer questions about the way biological systems process visual information and to transfer this knowledge into the production of artificial vision systems, such as machine learning and computer vision algorithms or retinal implant designs.

1.2 The Proposed Approach

Modelling studies have traditionally focused on directly implementing functionalities of visual information processing cells and circuits based on our current knowledge, obtained from biological studies, of the workings of the visual system. Nowadays, however, with the constant advancement in experimental equipment and methodology, this 'current knowledge' changes ever more rapidly. Because our understanding of visual processing mechanisms in biological systems is incomplete, designing robust computational models of these processes requires accounting for uncertainty and unknown details. Consequently, this thesis proposes a flexible data-driven approach that offers great potential for modelling in this uncertain environment. To model visual information processing mechanisms, I investigate existing deep learning methods and probabilistic models, e.g. deep belief networks (DBNs) (Hinton et al., 2006), and propose novel deep learning algorithms.

Deep learning methods, such as DBNs, deep neural networks (DNNs), and convolutional neural networks (ConvNets) (LeCun et al., 1998), extract multiple layers of feature detectors from data, where the consecutive layers correspond to increasingly abstract representations of information. DBNs can be trained efficiently through a layer-wise unsupervised pretraining phase using restricted Boltzmann machines (RBMs) (Smolensky, 1986) on each consecutive layer, which can be followed by supervised fine-tuning with backpropagation, resulting in a DNN (Hinton et al., 2006). Deep learning methods have gained outstanding popularity in recent years owing to the state-of-the-art results they have achieved on a number of challenging machine learning and computer vision benchmarks. The retina and the visual cortex exhibit an inherently multi-layered structure which lends itself naturally to modelling with deep learning models. The first part of this thesis investigates deep networks for modelling the earliest stages of the visual pathway, such as mechanisms implemented by the retina. Subsequently, novel deep learning methods are proposed for solving complex visual tasks which require higher-level understanding of visual information. An example of such a task investigated here is face completion, which involves the occipital face area (OFA) and the fusiform face area (FFA) of the visual cortex (Chen et al., 2010).

1.2.1 Early Visual Processing

Modelling biological information processing within the early stages of the visual pathway, such as the retina, the thalamic lateral geniculate nucleus (LGN), and visual area V1, has key potential for advancing the field of retinal implant engineering and for improving image processing and computer vision algorithms. As opposed to strict traditional methods, I advocate a data-driven approach to modelling the retina and early visual processing. Instead of making assumptions regarding the morphology of retinal circuits, I strive to construct models which can automatically learn functional properties of the retinal network. To this end, I train deep learning models, DBNs and DNNs, on my specially devised dataset which simulates input reaching a small area of the retina. Although DBNs have been used to model the functionality of certain V1 and V2 cells (Lee et al., 2008), so far less attention has been given to utilising the great potential of DBNs for modelling lower-level mechanisms, such as retinal processes, in detail. Here, I address this issue by showing that the proposed models are capable of discovering retinal functionality by learning from simulated retinal photoreceptor input data in an unsupervised setting. Among other features of early visual processing, the networks learn feature detectors similar to retinal ganglion cells. Additionally, classification performance is measured on a circular spot detection task, which approximates an early visual processing functionality, and the results confirm the advantage of multi-layered network models. The experiments thereby demonstrate that DBNs and DNNs have great potential for modelling feature detection implemented by the retina. Algorithms that can learn to replicate retinal functionality may provide important benefits for retinal prosthesis design. Such methods can increase the fidelity of visual information processing in a prosthetic device to real retinal functions, resulting in better vision restoration. The photoreceptor input dataset and the analysis of experiments with DBN- and DNN-based retinal models trained on these data were previously described by Turcsany et al. (2014).

1.2.2 High-Level Visual Information Processing

In this thesis, I introduce novel deep learning methods and, using challenging visual tasks inspired by cortical functions, demonstrate the methods' great capability for modelling high-level visual information processing. Despite successful attempts at applying deep learning methods to model visual cortical areas (Lee et al., 2008), the primary emphasis in the literature has been on improving the performance of deep learning on visual recognition tasks, rather than on increasing the fidelity of deep architectures to neural circuits of the visual pathway. Consequently, a main goal of the research presented here is to introduce deep learning architectures which more closely resemble biological neural networks of the visual pathway, while retaining the models' flexibility and strong performance on visual recognition tasks. To address this, I investigate the introduction of biologically inspired structural constraints into deep learning architectures and training methods, and show how such architectural changes can improve RBM, DBN, and DNN models. I also propose a novel RBM training method which improves the capability of the model to adapt its structure according to particularities of the input data. Utilising this method during deep network training results in models with increased flexibility.

1.2.3 Novel Methods

This thesis proposes a novel unsupervised RBM-based model, the local receptive field constrained RBM (LRF-RBM), which imposes Gaussian constraints on feature detector receptive fields, thereby limiting the spatial scope of detectors. Concurrently with learning the appearance of features, LRF-RBMs can discover an advantageous non-uniform placement of feature detectors automatically from data. In this way, LRF-RBM training encourages the emergence of local feature detectors and, at the same time, improves feature detector coverage over key areas of the visual space. Furthermore, this thesis introduces novel biologically inspired deep network architectures, the local receptive field constrained DBN and DNN (LRF-DBN and LRF-DNN). Training of both models starts with an unsupervised phase using LRF-RBMs with increasing receptive field sizes to learn consecutive layers. For LRF-DNNs, fine-tuning of the network weights follows, which in the experiments presented here comprises training autoencoder networks with backpropagation on an image reconstruction task. On the challenging 'Labeled Faces in the Wild' (LFW) face dataset, I show how LRF-RBM feature detectors automatically converge to important areas within face images, e.g., eyes and mouth, forming feature hubs. Also, a comparison of generative models confirms that LRF-DBNs, trained layer-wise with LRF-RBMs, perform face completion tasks better than DBNs. Moreover, unlike DBNs, LRF-DBNs learn a feature hierarchy which exhibits part-based compositionality. On the same dataset, dimensionality reduction experiments demonstrate the superiority of LRF-DNN autoencoders over DNNs for reconstructing previously unseen face images with a limited number of nodes. In addition to obtaining lower reconstruction errors, LRF-DNN reconstructions better retain fine details of faces, such as mouth and nose shapes, gaze directions, and facial expressions. The LRF-RBM, LRF-DBN, and LRF-DNN models, together with the analysis of the conducted experiments, were introduced by Turcsany and Bargiela (2014) and Turcsany et al. (2016).

1.3 Goals and Contributions

The work in this thesis set out to show the important benefits of forging a stronger connection between deep learning and the computational modelling of biological vision. The advantages, I argue, are twofold: on the one hand, flexible deep learning methods can reduce the need for handcrafted circuits by learning powerful models of visual information processing units automatically from data. On the other hand, improving the similarity of deep learning models to neural networks of the visual pathway can boost the performance of deep learning on challenging visual tasks.

The key contributions of this thesis are as follows:

(i) deep learning (DBN and DNN) models of the retina and early vision, which comply with a novel data-driven approach to retinal modelling;

(ii) a new simulated photoreceptor input dataset and a classification task approximating early visual processing functionalities;

(iii) experiments demonstrating the great generative, reconstruction, and classification capabilities of the proposed retinal models;

(iv) experiments confirming the models’ capability to learn feature detectors similar to difference-of-Gaussians and Gabor filters, common models of retinal ganglion, LGN, and V1 simple cells, and thereby demonstrating the suitability of DBNs and DNNs for modelling functions of the retina and early vision;

(v) development of a novel local receptive field constrained RBM (LRF-RBM), which uses biologically inspired receptive field constraints to encourage the emergence of local features;

(vi) an adaptation of contrastive divergence learning (Hinton, 2002) which includes receptive field constraints, and

(vii) a mechanism which, concurrently with the learning of LRF-RBM features, can discover advantageous non-uniform feature detector placements from data by automatically identifying important locations within the visual space;

(viii) the development of LRF-DBN and LRF-DNN deep learning models, where LRF-RBMs are utilised during training to obtain architectures inspired by visual cortical organisation;

(ix) experiments on the 'Labeled Faces in the Wild' face image dataset demonstrating LRF-DBNs can learn a feature hierarchy which exhibits part-based composition;

(x) experiments showing that LRF-DBN generative models perform face completion better than DBNs, and

(xi) LRF-DNN autoencoders obtain better dimensionality reduction results than DNNs, thereby demonstrating LRF-DBNs and LRF-DNNs can extract highly compact representations of visual information and can constitute powerful models of high-level visual information processing.

This thesis presents some novel methods and experimental results which were first described by Turcsany and Bargiela (2014) and Turcsany et al. (2014, 2016).

1.4 Overview

This thesis is organised as follows. Chapter 2 introduces the concept of deep learning and training algorithms for deep networks. Chapter 3 presents my approach to retinal modelling and describes findings and open questions regarding the workings of the retina. My proposed models of the retina and early vision are evaluated in Chapter 4. Chapter 5 introduces my novel deep learning architectures and training algorithms. Experiments evaluating these novel methods and the proposed models of high-level visual information processing are presented in Chapter 6. Chapter 7 summarises the findings of this work and concludes the thesis. Finally, possible avenues for future work are suggested in Chapter 8.

2 Deep Learning

This chapter introduces the concept of deep learning and describes the learning algorithms behind some established deep learning methods, which will be referred to in Chapters 4 and 5.

2.1 The Deep Learning Shift

Deep learning has emerged as a highly popular direction within machine learning research, with the aim of developing methods which can extract multiple layers of features automatically from data. Deep learning algorithms create multi-layer representations of patterns present in the data, where each consecutive layer executes the detection of increasingly complex features. Higher-level feature detectors are constructed using features from lower layers; hence, this representation scheme exhibits increased abstraction levels on each consecutive layer. A strong analogy with biological neural information processing can be noticed, as it is widely accepted that the brains of humans and other biological entities also utilise multi-level representation schemes for the efficient extraction of information from sensory input (Coogan and Burkhalter, 1993; Felleman and Van Essen, 1991; Hinton, 2007; Konen and Kastner, 2008; Van Essen and Maunsell, 1983; Wessinger et al., 2001). Such an organisation with multiple consecutive processing layers can be observed in, for example, the retina and the cortex.

Figure 2.1 (a) Layers I–VI of the visual cortex are shown in a cross-section, illustrating the hierarchical organisation of the visual cortex. Increasing numbers correspond to higher-level processing areas. Original image source: Schmolesky (2000). (b) The concept of deep learning, whereby consecutive layers of representation correspond to features of increasing complexity, is illustrated in the context of visual processing. Lower layers detect simple features, such as edges and shapes, while higher layers correspond to complex features, such as faces and objects.

Figure 2.1(a) demonstrates the multi-layer structure of the visual cortex in a cross-section, while Figure 3.1(a)–(b) show the numerous processing layers of the retina. Figure 2.1(b) illustrates the concept of deep learning and the similarity between cortical processing and deep learning. A highly advantageous property of deep learning algorithms is their ability to learn features automatically from data, which eliminates the need for handcrafted feature detectors. These flexible models have been shown to outperform traditional approaches on numerous information processing tasks, leading to their application not only in academic research but in a variety of industrial sectors. The explored application domains range from computer vision (see Section 2.1.1), speech recognition (Collobert and Weston, 2008; Graves et al., 2013), text analysis (Salakhutdinov and Hinton, 2009b), reinforcement learning in games (Mnih et al., 2013, 2015), robot navigation (Giusti et al., 2016), and time-series prediction (Kuremoto et al., 2014; Prasad and Prasad, 2014) to even art (Gatys et al., 2015; Mordvintsev et al., 2015). An area where deep learning has had a highly remarkable influence is the visual information processing field.

2.1.1 Impact in Computer Vision

Many areas of visual information processing research, such as computer vision, have traditionally relied on the handcrafted SIFT (Lowe, 2004), SURF (Bay et al., 2008), or similar local feature detectors (Li et al., 2014c; Rublee et al., 2011), often followed by bag-of-words-type quantisation techniques (Sivic and Zisserman, 2003; Tao et al., 2014; Turcsany et al., 2013), for the successful design of accurate systems (e.g. Battiato et al., 2007; Lin and Lin, 2009; Zhang et al., 2015). With the emergence of deep learning, however, focus has shifted towards automatically learning a hierarchy of relevant features directly from input data instead of applying generic, handcrafted feature detectors. Such feature hierarchies can be well captured by multi-layer network models, such as DBNs, DNNs, or ConvNets. When applied to visual recognition, multi-level representations are especially suitable for modelling complex relationships of visual features, and the hierarchical structure allows one to represent principles such as composition and part-sharing in an intuitive way. Consequently, deep architectures possess great potential for solving visual tasks, and in recent years such methods have demonstrated exceptional improvements over the existing state of the art, reaching and in some cases surpassing human-level performance (e.g. Cireşan et al., 2012;

Giusti et al., 2016; He et al., 2015b; Mnih et al., 2015; Sun et al., 2015). Deep learning methods routinely achieve state-of-the-art results on computer vision benchmarks, including datasets for handwritten digit recognition (Cireşan et al., 2012, 2011b), object recognition and classification (Cireşan et al., 2011b; He et al., 2015a,b; Krizhevsky et al., 2012; Simonyan and Zisserman, 2015; Szegedy et al., 2014), face verification (Ding and Tao, 2015; Schroff et al., 2015; Sun et al., 2014), and human pose estimation (Charles et al., 2016; Pfister et al., 2015; Tompson et al., 2014; Toshev and Szegedy, 2014).

2.1.2 Deep Network Models

Deep neural networks and deep learning models have been investigated since the late 1960s (see, e.g., Ivakhnenko, 1971; Ivakhnenko and Lapa, 1965); however, for a number of years, interest in such models was largely overshadowed by the success of 'shallow' models, such as support vector machines (SVMs) (Cortes and Vapnik, 1995; Vapnik, 1995). A recent, remarkable surge in the popularity of deep learning was triggered by the work of Hinton et al. (2006), in which a highly efficient DBN model and training algorithm was introduced. This has been followed by applications of the method to image classification, dimensionality reduction, and document clustering and retrieval (Hinton and Salakhutdinov, 2006a; Salakhutdinov and Hinton, 2009b). To learn a multi-layer generative model of the data where each higher layer corresponds to a more abstract representation of information, Hinton et al. (2006) first train a DBN layer by layer using RBMs. The network parameters learnt during this unsupervised pretraining phase are subsequently fine-tuned in a supervised manner with backpropagation (Linnainmaa, 1970, 1976; Werbos, 1982), resulting in a DNN. Such models were shown to be superior to principal component analysis (PCA) for dimensionality reduction (Hinton and Salakhutdinov, 2006a). Since the introduction of this efficient training method for deep networks, deep learning research has gained increased interest, resulting in wide-ranging academic and commercial adoption of deep learning techniques. A number of successful applications of DBNs, DNNs, and other deep architectures, for example convolutional DBNs (Lee et al., 2009a), ConvNets (LeCun et al., 1998), deep recurrent neural networks (Fernández et al., 2007; Graves et al., 2013), sum-product networks (Poon and Domingos, 2011), and deep Boltzmann machines (Salakhutdinov and Hinton, 2009a), have been presented. Within computer vision, the potential of these and similar methods for automatically learning meaningful features from input data and thereby providing improved models has been demonstrated on:

(i) object recognition (He et al., 2015a,b; Kavukcuoglu et al., 2010; Krizhevsky et al., 2012; Le et al., 2012; Nair and Hinton, 2010; Rozantsev et al., 2015; Simonyan and Zisserman, 2015; Szegedy et al., 2014; Xia et al., 2015),

(ii) image classification (Cireşan et al., 2011a, 2012, 2011b; Giusti et al., 2016; Hinton and Salakhutdinov, 2006a; Jaderberg et al., 2015; Larochelle et al., 2007; Ranzato et al., 2006; Salakhutdinov and Hinton, 2007),

(iii) face image analysis (Ding and Tao, 2015; Huang et al., 2012; Nair and Hinton, 2010; Ranzato et al., 2011; Schroff et al., 2015; Sun et al., 2015, 2014; Turcsany and Bargiela, 2014; Turcsany et al., 2016; Zhu et al., 2013),

(iv) human pose estimation (Charles et al., 2016; Pfister et al., 2015, 2014; Tompson et al., 2014; Toshev and Szegedy, 2014), and on

(v) further visual tasks (Dosovitskiy et al., 2015; Eslami et al., 2012; Gatys et al., 2015; Mnih et al., 2013, 2015; Su et al., 2015; Turcsany et al., 2014).

Besides the analysis of visual data, deep learning methods have been used with great success on problems concerning the analysis of

(i) text (Hinton and Salakhutdinov, 2006a; Salakhutdinov and Hinton, 2009b),

(ii) speech (Amodei et al., 2015; Collobert and Weston, 2008; Fernández et al., 2007; Graves et al., 2013; Lee et al., 2009b),

(iii) further types of audio data (Lee et al., 2009b; Maniak et al., 2015), and

(iv) time-series data (Kuremoto et al., 2014; Prasad and Prasad, 2014).

Through distributed implementations (Le et al., 2012) or with the application of graphics processing units (GPUs) (Cireşan et al., 2011a; Krizhevsky et al., 2012), deep learning methods have been shown to scale up and provide excellent performance even on large-scale problems.

2.2 Multi-layer Representations in the Brain

As mentioned in Section 2.1, the concept of deep learning shows similarity to neural processing, since brains also represent information extracted from sensory input on multiple abstraction levels (see Figure 2.1). Furthermore, deep network models display structural resemblance to biological visual information processing units. For example, as Chapter 3 will demonstrate, the structure of mammalian retinae is inherently multi-layered: the different cell types (i.e. rods, cones, horizontal cells, bipolar cells, amacrine cells, and ganglion cells) are organised into multiple consecutive processing layers, such as the photoreceptor, outer plexiform, and inner plexiform layers, which implement increasingly complex functions. On a larger scale, visual information processing is executed by consecutive areas of the visual pathway (i.e. retina, LGN, V1, V2, etc.) through the extraction of more and more abstract representations of patterns. Given this structural analogy, it is not surprising that deep learning methods have been used successfully to model certain neural information processing units. For example, Lee et al. (2008) have shown the suitability of DBNs for modelling feature detection in the V1 and V2 areas of the visual cortex. While their study did not focus on the exact replication of neural connectivity patterns in visual cortical areas, features automatically learnt by the network show similarity to typical V1 and V2 feature detectors. These results indicate that such models can successfully learn functionalities implemented by neural networks of the visual pathway and can therefore be applied to model biological visual information processing on a more abstract level. Despite this achievement in neural modelling, the primary emphasis in prior work has been on improving the performance of deep learning on visual recognition tasks, rather than on increasing the fidelity of deep architectures to real neural circuits of the visual pathway. A key contribution of my work has been to take a step towards filling this gap by proposing deep network structures that more closely resemble biological neural networks of the visual pathway. The proposed models not only retain the characteristic flexibility of deep networks but further increase their performance on visual recognition tasks. Such architectures possess high potential for learning improved computational models of visual information processing in the retina and the visual cortex. Chapter 4 introduces my DBN-based retinal model, which provides further evidence of the suitability of deep learning methods for modelling neural information processing units. This model has been shown to successfully learn ganglion cell functionality from simulated photoreceptor input data and execute an early vision task (Turcsany et al., 2014). Chapter 5 describes my LRF-DBN and LRF-DNN models (Turcsany and Bargiela, 2014; Turcsany et al., 2016), which have been designed with the aim of increasing the structural similarity of deep network models to neural networks of the visual pathway. Chapter 6 demonstrates that these novel deep learning models can successfully implement face completion tasks, known to be executed in high-level processing areas of the human visual cortex (Chen et al., 2010). Chapters 4 to 6 build on certain concepts of DBN training, which will be introduced in the following sections.
These sections describe the steps of RBM and DBN training and introduce common DBN-based models, such as autoencoders.

2.3 Restricted Boltzmann Machines

The first, unsupervised phase of Hinton et al.'s (2006) DBN training utilises RBMs for learning each layer of the representation. RBMs (Smolensky, 1986) are probabilistic graphical models which originate from Boltzmann machines (Hinton and Sejnowski, 1983). The energy-based Boltzmann machine model contains visible nodes, whose states are observed, and unobserved hidden nodes, and it places no constraints on which types of nodes can be connected. RBMs, on the other hand, do not contain links between nodes of the same type, hence the 'restriction' (see Figure 2.2 for an illustration). RBMs are therefore bipartite graphs, consisting of a visible and a hidden layer, where symmetric connections exist between layers but not within layers. As a result, they are significantly quicker to train than Boltzmann machines due to the conditional independence of hidden nodes given visible nodes and vice versa.

This efficiency in training has resulted in widespread use of RBMs and their variants within machine learning, including applications to:

(i) document retrieval (Hinton and Salakhutdinov, 2006a; Xie et al., 2015),

(ii) collaborative filtering (Georgiev and Nakov, 2013; Salakhutdinov et al., 2007),

(iii) link prediction (Bartusiak et al., 2016; Li et al., 2014a,b; Liu et al., 2013),

(iv) multi-objective optimisation problems (Shim et al., 2013),

(v) visual tasks (Elfwing et al., 2015; Eslami et al., 2012; Hinton and Salakhutdinov, 2006a; Kae et al., 2013; Memisevic and Hinton, 2007; Nair and Hinton, 2010; Nie et al., 2015; Ranzato et al., 2011; Salakhutdinov and Hinton, 2009a; Sutskever et al., 2008; Turcsany and Bargiela, 2014; Turcsany et al., 2014, 2016; Wu et al., 2013; Xu et al., 2015; Zhao et al., 2016),

(vi) modelling of motion capture data (Mittelman et al., 2014; Sutskever et al., 2008; Taylor and Hinton, 2009; Taylor et al., 2006) as well as

(vii) speech and audio data analysis (Dahl et al., 2010; Lee et al., 2009b; Maniak et al., 2015).

In visual recognition tasks, visible nodes usually correspond to visual input coordinates (e.g. image pixels), while hidden nodes represent image feature detectors and can thus be seen as models of neurons in the visual pathway. The following sections first discuss how RBMs with binary visible and hidden nodes are trained, then show how Gaussian visible nodes can be used to model real-valued data. For more details regarding the training of RBMs, refer to Hinton (2002, 2012), Hinton et al. (2006), and Nair and Hinton (2010).

Figure 2.2 Diagram showing a restricted Boltzmann machine. The model consists of a visible (V) and a hidden (H) layer, where connections only exist between layers but not within layers. V corresponds to the input data coordinates, while hidden nodes in H are used for learning features from the input data.

2.3.1 The RBM Model

The probability that an RBM model assigns to a configuration (v, h) of visible and hidden nodes can be calculated using the energy function (Hopfield, 1982) of the model. In the case of binary visible and hidden nodes, the RBM’s energy function takes the form:

E(v, h) = -a^T v - b^T h - v^T W h ,   (2.1)

where v ∈ {0, 1}^n and h ∈ {0, 1}^m describe the binary states of the visible and the hidden nodes, respectively, W ∈ R^{n×m} is the weight matrix defining the symmetric connections between visible and hidden nodes, while a ∈ R^n and b ∈ R^m provide the biases of visible and hidden nodes, respectively. The probability of a joint configuration (v, h) is then given by:

p(v, h) = \frac{e^{-E(v, h)}}{\sum_{\eta, \mu} e^{-E(\eta, \mu)}} .   (2.2)

The probability an RBM assigns to a given configuration v of visible nodes can be obtained after marginalising out h:

p(v) = \frac{\sum_{h} e^{-E(v, h)}}{\sum_{\eta, \mu} e^{-E(\eta, \mu)}} .   (2.3)

Unsupervised learning in an RBM aims at increasing the log probability of the training data, which is equivalent to reducing the energy of the training data. During the training phase, the probability of a given training example can be increased (the energy reduced) by altering the weights and biases. The following learning rule can be applied to maximise the log probability of the training data by stochastic steepest ascent:

\Delta w_{ij} = \epsilon ( \langle v_i h_j \rangle_{data} - \langle v_i h_j \rangle_{model} ) ,   (2.4)

where ε is the learning rate and ⟨.⟩_φ is used for denoting expectations under the distribution φ.

Due to the conditional independence properties in RBMs, sampling v_i h_j according to the distribution given by the data is simple. In the case of an RBM with binary visible and hidden nodes, the probability of a hidden node h_j turning on given a randomly chosen training example v is:

p(h_j = 1 \mid v) = \frac{1}{1 + \exp(-b_j - \sum_i v_i w_{ij})} ;   (2.5)

that is, the logistic sigmoid function applied to the total input of the hidden node.

An example of an unbiased sample is then given by v_i h_j. Sampling for the visible nodes is similarly easy, i.e. the probability of a visible node v_i being 1 given the states h of the hidden nodes is:

p(v_i = 1 \mid h) = \frac{1}{1 + \exp(-a_i - \sum_j h_j w_{ij})} .   (2.6)

On the other hand, calculating an unbiased sample of ⟨v_i h_j⟩_model would require a long sequence of alternating Gibbs sampling between the visible and hidden layers; therefore, in practice, approximations are normally applied.
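To make the quantities above concrete, the following minimal NumPy sketch (an illustration, not code from the thesis) evaluates the energy of Equation (2.1) and the conditionals of Equations (2.5) and (2.6) for a small binary RBM; the layer sizes and randomly initialised parameters are assumptions chosen only for demonstration.

```python
# Minimal sketch of binary-binary RBM quantities (Eqs. 2.1, 2.5, 2.6).
import numpy as np

rng = np.random.default_rng(0)
n, m = 6, 4                                   # numbers of visible and hidden nodes
W = 0.01 * rng.standard_normal((n, m))        # symmetric weights between the layers
a, b = np.zeros(n), np.zeros(m)               # visible and hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def energy(v, h):
    # E(v, h) = -a^T v - b^T h - v^T W h, Equation (2.1)
    return -a @ v - b @ h - v @ W @ h

def p_h_given_v(v):
    # p(h_j = 1 | v), Equation (2.5)
    return sigmoid(b + v @ W)

def p_v_given_h(h):
    # p(v_i = 1 | h), Equation (2.6)
    return sigmoid(a + W @ h)

v = rng.integers(0, 2, size=n).astype(float)            # a random binary example
h = (p_h_given_v(v) > rng.random(m)).astype(float)      # sampled hidden states
print(energy(v, h), p_h_given_v(v), p_v_given_h(h))
```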

2.3.2 Contrastive Divergence Learning

Contrastive divergence learning (CD) (Hinton, 2002) is an efficient approximate training method for RBMs. Even though CD only broadly approximates the gradient of the log probability of the training data (Hinton, 2002, 2012), in practice CD has been found to produce good models. RBMs trained efficiently using CD and also ‘stacked’ RBMs, used for pretraining a DBN, are powerful tools for learning generative models of visual, text, and further types of complex data.

In the single-step version (CD1) of contrastive divergence learning, each training step corresponds to one step of alternating Gibbs sampling between the visible and hidden layers starting from a training example. The algorithm proceeds as follows:

(i) first, the visible states are initialised to a training example;

(ii) then, binary hidden states can be sampled in parallel (owing to the conditional independence of hidden nodes given visible nodes) according to Equation (2.5);

(iii) followed by the reconstruction phase, where visible states are sampled using Equation (2.6);


(iv) finally, the weights are updated according to:

\Delta w_{ij} = \epsilon ( \langle v_i h_j \rangle_{data} - \langle v_i h_j \rangle_{reconst} ) ,   (2.7)

where ε is the learning rate, the correlation between the activations of visible node v_i and hidden node h_j measured after step (ii) gives ⟨v_i h_j⟩_data, while the correlation after the reconstruction phase (iii) provides ⟨v_i h_j⟩_reconst. (A similar learning rule is applied to the biases.)

In order to obtain an improved model, the sampling stage in each step can be continued for multiple iterations, resulting in the general form of the CD algorithm: CDn, where n denotes the number of alternating Gibbs sampling iterations used in a training step.
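The steps (i)–(iv) above can be illustrated with the following hedged NumPy sketch of a single CD1 update; mini-batches, momentum, and weight decay, which practical implementations typically add, are omitted, and the learning rate and the use of probabilities on the negative phase are common practical choices rather than settings from this thesis.

```python
# Sketch of one CD1 parameter update for a binary-binary RBM.
import numpy as np

def cd1_step(v0, W, a, b, lr=0.1, rng=np.random.default_rng()):
    """One contrastive divergence (CD1) update starting from training example v0."""
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

    ph0 = sigmoid(b + v0 @ W)                        # (ii) hidden probabilities, Eq. (2.5)
    h0 = (ph0 > rng.random(ph0.shape)).astype(float) #      sampled binary hidden states

    pv1 = sigmoid(a + W @ h0)                        # (iii) reconstruction, Eq. (2.6)
    v1 = (pv1 > rng.random(pv1.shape)).astype(float)
    ph1 = sigmoid(b + v1 @ W)                        # hidden probabilities after reconstruction

    W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))  # (iv) weight update, Eq. (2.7)
    a += lr * (v0 - v1)                                # analogous updates for the biases
    b += lr * (ph0 - ph1)
    return W, a, b
```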

2.3.3 RBMs with Gaussian Visible Nodes

While RBMs with binary visible nodes are generally easier to train than RBMs with Gaussian visible nodes, the latter can learn better models of real-valued data, such as the simulated photoreceptor input data in Chapter 4 or the face images studied in Chapter 6. In the case of Gaussian visible nodes the energy function changes to:

E(v, h) = \sum_i \frac{(v_i - a_i)^2}{2\sigma_i^2} - \sum_j b_j h_j - \sum_{i,j} \frac{v_i}{\sigma_i} h_j w_{ij} ,   (2.8)

where σ_i is the standard deviation corresponding to visible node v_i. The probability of hidden node activation becomes:

where σi is the standard deviation corresponding to visible node vi. The proba- bility of hidden node activation becomes:

p(h_j = 1 \mid v) = \frac{1}{1 + \exp(-b_j - \sum_i (v_i / \sigma_i) w_{ij})} ,   (2.9)

and the expected value of a Gaussian visible node (i.e. the reconstructed value) is given by:

\langle v_i \rangle_{reconst} = a_i + \sigma_i \sum_j h_j w_{ij} .   (2.10)

When Gaussian visible nodes are used instead of binary visible nodes, certain learning parameters may need to be adjusted (e.g. the learning rate generally has to be lowered).
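As an illustration of how the Gaussian-visible case changes the computations, the small sketch below implements Equations (2.9) and (2.10); treating sigma as a free per-node vector is an assumption made for illustration, since in practice the data are often standardised so that sigma can be fixed to 1.

```python
# Sketch of Gaussian-visible RBM conditionals (Eqs. 2.9 and 2.10).
import numpy as np

def p_h_given_v_gaussian(v, W, b, sigma):
    # p(h_j = 1 | v): visible values are scaled by their standard deviations
    x = b + (v / sigma) @ W
    return 1.0 / (1.0 + np.exp(-x))

def reconstruct_visible_mean(h, W, a, sigma):
    # <v_i>_reconst = a_i + sigma_i * sum_j h_j w_ij, Equation (2.10)
    return a + sigma * (W @ h)
```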

2.4 Deep Belief Networks

Deep belief networks are probabilistic graphical models capable of learning a generative model of the input data in the form of a multi-layer network. Within these models, multiple consecutive layers facilitate the extraction of highly non-linear features, making DBNs capable of learning powerful models of even complex data distributions. It has been shown that DBNs, given certain architectural constraints, are universal approximators (Sutskever and Hinton, 2008). Furthermore, the trained models define a generative process, which enables the construction of new datapoints that fit the modelled input data distribution. Such a generative DBN is illustrated in Figure 2.3. Deep belief networks can be trained efficiently according to the method of Hinton et al. (2006) by first pretraining the network in a 'greedy' layer-by-layer fashion with unsupervised RBMs in order to learn a favourable initialisation of the network weights. In this process, after an RBM has been used to train a hidden layer, its weights are fixed, and the activations on this hidden layer provide the input to the next RBM. Once pretraining has been completed, the multi-layer network can be fine-tuned either as a generative model or discriminatively, to solve a specific task.

Figure 2.3 Diagram of a DBN with 4 hidden layers trained on an image dataset. V denotes the visible and H1–H4 the hidden layers. The blue arrows correspond to the generative model, while the upward-pointing red arrows indicate the direction of recognition. In the generative model, the top two layers form an associative memory.

Fine-tuning can be conducted by, e.g., backpropagation or a variant of the 'wake-sleep' algorithm (Hinton, 2007; Hinton et al., 2006; Hinton and Salakhutdinov, 2006a). In the following, common use cases and applications of DBN models are introduced.
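The greedy layer-wise procedure can be summarised by the following illustrative sketch, in which each layer is trained with a minimal CD1 inner loop and then frozen; the layer sizes, learning rate, and epoch count are assumptions chosen for illustration rather than the settings used in the experiments of this thesis.

```python
# Illustrative sketch of greedy layer-wise pretraining (after Hinton et al., 2006).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, lr=0.1, epochs=5, rng=np.random.default_rng(0)):
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    a, b = np.zeros(n_visible), np.zeros(n_hidden)
    for _ in range(epochs):
        for v0 in data:                                  # one CD1 step per example
            ph0 = sigmoid(b + v0 @ W)
            h0 = (ph0 > rng.random(n_hidden)).astype(float)
            v1 = sigmoid(a + W @ h0)                     # mean-field reconstruction
            ph1 = sigmoid(b + v1 @ W)
            W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
            a += lr * (v0 - v1)
            b += lr * (ph0 - ph1)
    return W, a, b

def pretrain_dbn(data, layer_sizes=(500, 500, 2000)):
    """Stack RBMs: activations of each trained layer become the next layer's input."""
    layers, layer_input = [], data
    for n_hidden in layer_sizes:
        W, a, b = train_rbm(layer_input, n_hidden)
        layers.append((W, a, b))                         # freeze this layer's weights
        layer_input = sigmoid(b + layer_input @ W)       # deterministic up-pass
    return layers
```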

2.4.1 Generative Model

Hinton et al.’s (2006) DBN generative model is composed of multiple consecutive layers, where the lower layers contain directed connections, while the top two layers have undirected connections and form an associative memory (Hopfield, 1982). If certain conditions are met, it can be shown that adding a further hidden layer to a DBN produces a better generative model of the data (Hinton et al., 2006). Fine-tuning of a DBN with the aim of obtaining an improved generative model can be conducted using the ‘wake-sleep’ algorithm (Hinton et al., 1995, Chapter 2 Deep Learning 26

H4

H3 Class labels

H2

H1

V

Figure 2.4 Schematic of a DBN generative model containing 4 hidden layers (H1–H4) used for generating images together with their class labels. V denotes the visible and H1–H4 the hidden layers. The blue arrows correspond to the generative model, while the upward-pointing red arrows indicate the direction of recognition. H3, extended with the class label nodes, and H4 form an associative memory.

2006) after the greedy layer-wise pretraining phase. A DBN generative model can either be trained completely unsupervised or, alternatively, labels can be introduced. The model proposed by (Hinton, 2007; Hinton et al., 2006) learns to generate new data together with the appropriate class labels. To this end, the penultimate layer of a DBN is extended with nodes corresponding to class labels, and the top RBM learns to model the data jointly with the labels (see diagram in Figure 2.4). It is possible to generate new images from a given class by executing alternating Gibbs sampling between the top two layers while appropriately clamping the value of the class labels, then calculating top-down activations. The capability of such a DBN to learn a generative model of handwritten digits was demonstrated on the MNIST dataset (Hinton et al., 2006). The trained model can also be used for classification. In this case, bottom-up (i.e. recognition) activations are calculated after feeding in an example; then, Chapter 2 Deep Learning 27

Output Layer

H3

H2

H1

V

Figure 2.5 Schematic of a DNN used for classification. The network contains a visible layer V corresponding to the input, 3 hidden layers (H1–H3), and an output layer with nodes corresponding to the classes. while the bottom-up activations on the penultimate layer are kept fixed, alternat- ing Gibbs sampling is conducted between the top two layers, and the probability of each class label turning on is compared.
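The class-conditional generation procedure described in this section can be sketched as follows; the weight matrices, biases, and layer sizes are assumed to come from an already trained model, so the function below only illustrates the control flow of clamped Gibbs sampling at the top RBM followed by a top-down generative pass.

```python
# Hedged sketch of class-conditional sampling from a DBN with label nodes.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_from_class(label_index, n_labels, W_top, a_top, b_top,
                      down_weights, down_biases, n_gibbs=200,
                      rng=np.random.default_rng(0)):
    n_pen = W_top.shape[0] - n_labels               # penultimate-layer size
    label = np.zeros(n_labels)
    label[label_index] = 1.0                        # clamped one-hot class label
    pen = (rng.random(n_pen) > 0.5).astype(float)   # random initial penultimate state

    for _ in range(n_gibbs):                        # alternating Gibbs sampling
        vis = np.concatenate([pen, label])          # 'visible' layer of the top RBM
        ph = sigmoid(b_top + vis @ W_top)
        h = (ph > rng.random(ph.shape)).astype(float)
        pv = sigmoid(a_top + W_top @ h)
        pen = (pv[:n_pen] > rng.random(n_pen)).astype(float)  # labels stay fixed

    x = pen.copy()
    for W, c in zip(down_weights, down_biases):     # top-down (generative) pass
        x = sigmoid(c + x @ W)
    return x                                        # generated visible-layer activations
```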

2.4.2 Deep Neural Networks and Classification

Using backpropagation, a pretrained DBN can be explicitly tuned to solve a classification task. After the network has been pretrained layer by layer as a generative model, an output layer is added, where the nodes correspond to class labels. Then, backpropagation is performed on the training set to minimise the classification error, which can, for example, be measured using the cross-entropy error:

H(p, \hat{p}) = -\sum_i p_i \ln \hat{p}_i ,   (2.11)

where, given a training example, the true label defines p, while \hat{p}_i denotes the probability of the example belonging to category i according to the prediction of the model.

When backpropagation is used for fine-tuning, the resulting model is a deep neural network (DNN)3. A schematic diagram of such a classifier DNN is shown in Figure 2.5. These discriminatively fine-tuned models generally provide better performance on classification tasks than the generative classifier described in Section 2.4.1 (see Hinton, 2007). Such DNN models have achieved excellent results on challenging datasets, including the MNIST handwritten digit classification problem (Hinton et al., 2006). In the context of retinal modelling, Chapter 4 analyses the performance of DNNs on a discriminative circle detection task and confirms the importance of pretraining.

2.4.3 Autoencoders

Autoencoder DNNs are used for dimensionality reduction of input data. Their multi-layer network architecture consists of an encoder and a decoder part, where the encoder part is trained to generate reduced-dimensional codes for input datapoints. From such an encoding, the decoder part is capable of calculating a reconstruction which approximates the original input example. A diagram of an autoencoder is shown in Figure 2.6. Autoencoder DNNs are pretrained as DBNs where the top layer has fewer hidden nodes than the input dimension. Representations obtained on this layer therefore provide a lower-dimensional encoding of the input. This pretraining constitutes an initialisation for the encoder part of the deep autoencoder, while to obtain the decoder part, the pretrained layers are ‘unrolled’ by transposing the weights on each layer. The weight initialisations obtained for the encoder and decoder layers provide a starting point for backpropagation, whereby the encoder and decoder are fine-tuned as one multi-layer network. During optimisation, the squared reconstruction error (SRE) is minimised, which is defined as the squared distance between the original data and its reconstruction.

3 Although in some of the literature, the resulting DNNs are also referred to as DBNs.


Figure 2.6 Schematic of a deep autoencoder neural network used for dimensionality reduction. The network consists of an encoder part, which calculates reduced-length codes (H2) for input examples, and a decoder part that can obtain a reconstruction (H4) from such an encoding.

Salakhutdinov and Hinton (2009b) apply such an autoencoder to produce compact binary codes for documents in a text retrieval task. Chapter 5 introduces a new type of deep autoencoder network, the LRF-DNN autoencoder, and compares results with the traditional method on a face image reconstruction task. A schematic of deep autoencoder training on this face dataset is shown in Figure 5.3(b) of Chapter 5. A more detailed description of autoencoder training is provided by Hinton and Salakhutdinov (2006a), where fine-tuning of DBNs as classifiers is also described.
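The ‘unrolling’ step and the SRE objective described above can be sketched as follows, assuming each pretrained RBM is summarised by a weight matrix together with its visible and hidden biases; this is only an illustration of the idea, not the exact training code used in this thesis.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def unroll_autoencoder(pretrained):
    """'Unroll' a stack of pretrained RBMs, each given as (W, b_vis, b_hid),
    into encoder and decoder parameter lists; the decoder reuses the
    transposed weights of the encoder, as described above."""
    encoder = [(W, b_hid) for (W, b_vis, b_hid) in pretrained]
    decoder = [(W.T, b_vis) for (W, b_vis, b_hid) in reversed(pretrained)]
    return encoder, decoder

def reconstruct(x, encoder, decoder):
    h = x
    for W, b in encoder:                 # encode: reduced-dimensional code
        h = sigmoid(h @ W + b)
    code = h
    for W, b in decoder:                 # decode: approximate reconstruction
        h = sigmoid(h @ W + b)
    return code, h

def squared_reconstruction_error(x, x_rec):
    """SRE: squared distance between the original data and its reconstruction."""
    return np.sum((x - x_rec) ** 2)
```

During fine-tuning, backpropagation would then adjust the encoder and decoder weights jointly so as to minimise this SRE over the training set.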

2.5 Learning the Structure of Deep Networks

Structure learning can improve the flexibility of a modelling approach by identifying the best structural parameters of a model automatically during training. In the case of DBNs, such parameters describe the network architecture, i.e. the number of layers and the number of nodes in each layer.

A variety of approaches have been proposed in the literature for learning the structure of belief networks or the architectural parameters of a neural network. However, due to scalability issues, the use of structure learning methods on complex real-world problems is currently limited. In the following, a few promising approaches to structure learning are introduced with a primary focus on Bayesian non-parametric methods.

For learning the structure of belief networks, heuristic procedures (Heckerman et al., 1995) and evolutionary algorithms (Pelikan et al., 2001) have been explored in the literature, along with a structure learning extension of the expectation-maximisation (EM) algorithm (Dempster et al., 1977), called model selection EM, introduced by Friedman (1997). Efficient structure learning is considerably easier to implement in certain restricted types of models, such as mixtures-of-trees, for which Meila and Jordan (2001) proposed an EM-based algorithm. In a more general setting, a method based on Mercer kernels was introduced by Bach and Jordan (2002) for learning the structure of graphical models. For a specific class of neural networks, Angelov and Filev (2004) developed a method that can adapt the networks’ structure in an online manner, thereby providing improved flexibility.

A popular approach is the use of Bayesian non-parametric methods for inferring structure (Ghahramani, 2012), which provides a principled way of learning the structural parameters of a probabilistic model from data.

2.5.1 Bayesian Non-parametrics

Bayesian non-parametric methods allow for the introduction of suitable prior probability distributions over structural components of a model. Using standard probabilistic inference methods, the model structure can be inferred together with the model parameters. Commonly used priors include the Chinese restaurant process (Aldous, 1985; Pitman, 2002), applied in the infinite mixture model, and the Indian buffet process (Griffiths and Ghahramani, 2005, 2011) for the infinite latent feature model.

In a mixture model, each example belongs to a category, whereas a latent feature model represents objects by their sets of features. Each object can possess multiple features, and in the infinite latent feature model, the total number of available features is infinite. Fitting this model can be formalised as the problem of learning a sparse binary matrix, where rows correspond to objects, columns correspond to features, and elements of the matrix describe whether the given object possesses the given feature. In the infinite model, the number of objects (rows) is finite and the number of features (columns) is infinite. To regularise the effective number of features in the model, i.e. features that are actually possessed by the objects, a suitable prior distribution has to be used. The Indian buffet process (IBP) defines a distribution on infinite sparse binary matrices (finitely many rows, infinitely many columns) and therefore provides a suitable prior. Using the IBP, the expected number of features per object follows a Poisson distribution.

As demonstrated by Adams et al. (2010), the cascading (stacked) IBP is suitable for learning the structure of a sparse DBN from data, including the number of layers, the number of nodes per layer, and the type of each node: either binary or continuous valued. In this work, a flexible 2-parameter version of the IBP is used and inference is conducted using sampling. An IBP prior is also utilised in the work of Feng and Darrell (2015), where the structure of a deep convolutional network is learnt from data.

Non-parametric approaches are highly promising for learning the structure of networks; however, for some distributions, variational inference methods are not available, and inference using sampling can take an extensive amount of time on complex models, which limits current use.
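For illustration, the following sketch draws a sparse binary object-feature matrix from the standard one-parameter IBP ‘restaurant’ construction; Adams et al. (2010) use a more flexible two-parameter variant, so this is only meant to show the kind of prior over infinite binary matrices involved.

```python
import numpy as np

def sample_ibp(n_objects, alpha, rng=np.random.default_rng(0)):
    """Draw a sparse binary feature matrix Z from a one-parameter IBP prior.

    Customer i takes each previously sampled dish k with probability m_k / i
    (m_k = number of earlier customers who took it), then samples
    Poisson(alpha / i) brand-new dishes, so the number of columns (features)
    is unbounded a priori while each row stays sparse.
    """
    Z = np.zeros((n_objects, 0), dtype=int)
    for i in range(1, n_objects + 1):
        m = Z[:i - 1].sum(axis=0)                  # popularity of existing dishes
        Z[i - 1] = rng.random(Z.shape[1]) < m / i  # revisit popular dishes
        n_new = rng.poisson(alpha / i)             # try some new dishes
        new_cols = np.zeros((n_objects, n_new), dtype=int)
        new_cols[i - 1] = 1
        Z = np.hstack([Z, new_cols])
    return Z

Z = sample_ibp(n_objects=10, alpha=2.0)
print(Z.shape, Z.sum(axis=1))   # number of features per object is Poisson(alpha) distributed
```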

2.5.2 Hyperparameter Learning

In the case of neural networks, practical application of the methods is not only complicated by the difficulty of identifying a suitable network structure, but successful training also often requires extensive tuning of certain hyperparameters, such as the learning rate and momentum. By calculating the gradients of cross-validation performance with respect to the hyperparameters, Maclaurin et al. (2015) propose a method for automatically optimising any type of hyperparameter (e.g. network architecture, learning rate, and the distribution used for weight initialisation) in neural networks trained using stochastic gradient descent with momentum.

2.6 Feature Visualisation

In order to gain a deeper understanding of the internal representations used by a DBN to encode input data or the process a classifier DNN follows to make predictions, it would be highly beneficial to examine the feature detectors learnt on each layer of the network. In the following, I will describe how feature detectors on each layer of a deep network can be visualised when the network has been trained on image data.

As the weights of nodes on the first hidden layer correspond to image locations, a visualisation of these features can be obtained in a straightforward manner by displaying the hidden nodes’ weight vectors in the shape of the input image data.

Visualisation of features on consecutive layers in a DBN or DNN is complicated by the presence of non-linearities on higher layers. However, by applying an approximate method used in prior work (Erhan et al., 2009; Lee et al., 2008, 2009a), a useful visualisation can still be obtained. For each hidden node on a higher layer, a feature visualisation can be constructed by using the connection weights to calculate a linear combination of those features from the previous layer that exhibited the strongest connections to the given higher-layer node. This way, feature visualisations can be calculated in a layer-by-layer manner. Chapters 4 and 6 will use the above described method to show features of DBNs and DNNs trained for modelling the retina and face images.
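A minimal sketch of this layer-by-layer visualisation procedure is given below, assuming the weight matrices are stored with lower-layer units along the rows; keeping only the top_k most strongly connected lower-layer features per node is an illustrative simplification.

```python
import numpy as np

def visualise_features(weights, image_shape, top_k=10):
    """Approximate feature visualisations for every layer of a deep network.

    weights: list of weight matrices; weights[l] has shape
             (units on layer l, units on layer l+1), with layer 0 the image.
    Layer-1 features are the weight vectors themselves; each higher-layer
    feature is a linear combination of the most strongly connected features
    from the layer below, as described in the text.
    """
    layer_vis = [weights[0].T]                     # (n_hidden1, n_pixels)
    for W in weights[1:]:
        prev = layer_vis[-1]
        vis = np.zeros((W.shape[1], prev.shape[1]))
        for j in range(W.shape[1]):                # each node on the higher layer
            w = W[:, j]
            strongest = np.argsort(np.abs(w))[-top_k:]
            vis[j] = w[strongest] @ prev[strongest]
        layer_vis.append(vis)
    # reshape each layer's visualisations back into image form for display
    return [v.reshape((-1,) + image_shape) for v in layer_vis]
```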

2.7 Summary

This chapter has introduced the concept of deep learning together with recent advances within this line of machine learning research. Key steps of DBN and DNN training have been provided, including a description of the RBM model and the contrastive divergence learning algorithm.

As discussed, in recent years deep learning algorithms have obtained state-of-the-art status on a large number of machine learning and computer vision benchmarks. It is, however, not surprising given the recency of deep learning developments that various theoretical and practical questions in this area have not yet been answered. It is still unclear what architectures are the most powerful for learning multi-layer representations and what training methods guarantee the most reliable learning performance alongside efficient computation. This applies especially to problems where the type of data or the learning task differs from the conventional cases. From a practical point of view, application of deep networks to challenging ‘real-world’ problems is non-trivial and the process often involves extensive parameter tuning.

Such challenges and the potential great benefits make investigations concerning deep architectures and suitable training algorithms highly important for the advancement of machine learning research and related applications.

3 Modelling the Retina

This chapter describes the motivation behind designing computational models of the retina, summarises key knowledge concerning retinal cells and circuits, introduces previous computational models, and contrasts these methods with my novel approach to retinal modelling. I will also give examples of open questions regarding the workings of the retina and provide arguments for the suitability of my retinal modelling approach for the study of these questions.

3.1 Motivation for Retinal Modelling

Extending our knowledge of information processing mechanisms within the retina could lead to major advances in the design of learning algorithms for artificial retinae. Such algorithms are much sought after, as building better computational models of the retina is crucial for advancing the field of retinal implant engineering and can also be beneficial for improving image processing and computer vision algorithms. Despite insights into the anatomy, physiology, and morphology of neural structures, gained through experimental studies, understanding biological vision systems still remains an open problem due to the abundance, diversity, and complex interconnectivity of neural cells. Retinal cells and networks are not exempt from this rule even though they are relatively well studied.

For example, recent studies have revealed a lot more complexity in the functionality implemented by retinal cells than what was traditionally believed (Gollisch and Meister, 2010; Moritoh et al., 2013). As a consequence, currently, designing computational models of the retina involves dealing with a great amount of uncertainty and only partially known information. In such circumstances, a competitive modelling approach has to produce computational models which are flexible, in order to enable the discovery of yet unknown retinal features, and are easily adaptable in case new information becomes available regarding the retina. Consequently, my aim is to utilise flexible modelling approaches and develop computational models which exhibit substantial fidelity to the retina’s currently known neural structure, whilst being highly adaptable. My research towards the construction of these flexible models builds on deep learning and probabilistic modelling techniques within machine learning, such as DBNs.

The following section provides an introduction to cell types and circuits identified in mammalian retinae and summarises key information that influenced the modelling approach proposed in this thesis. The description is based on review papers by Kolb et al. (1995), Kolb (2003), Troy and Shou (2002), Wässle (2004), Field and Chichilnisky (2007), Masland (1996, 2012), and Euler et al. (2014). For more in-depth discussions, please refer to the original papers.

3.2 Anatomy and Physiology of the Retina

Mammalian retinae contain a variety of cell types, which are organised into consecutive layers. The 6 main types of retinal cells are: rods, cones, horizontal cells, bipolar cells, amacrine cells, and ganglion cells. Within the main cell categories, there also exist a number of variants, which can exhibit differences in their receptive fields (i.e. the area of the visual space in which a stimulus can result in a response of the neuron) and have diverse morphological, electrophysiological, and functional properties. The exact number of these variant types can differ between species and in many cases the catalogue is not yet thought to be complete (Masland, 2012). There is also an on-going debate in the literature regarding what constitutes a unique type. Therefore, in many cases the following description provides a range or an approximate value for the number of cell types.

Figure 3.1 illustrates the spatial relationship between different retinal cells and the stratification of the retina. In Figure 3.1(b)–(c) the layers of retinal cells are shown in schematic diagrams, while a cross-section of a mouse retina is provided in Figure 3.1(a), showing immunostained retinal cells.

3.2.1 Photoreceptors

Visual information processing in the retina starts in the layer of photoreceptors by the detection of light. Photoreceptors consist of rods and cones, and the cell bodies of these cells form the outer nuclear layer of the retina. Cones provide the first processing step in colour vision, predominantly utilised during daytime, whereas rods are capable of sensing light in dark environments and thus are essential for night vision. Mammalian retinae contain 2 to 3 types of cones, which are capable of sensing different wavelength light. Colour vision is made possible by comparing the response obtained from different wavelength cones. The retina of humans and that of other trichromatic primates contain 3 types of cones: blue cones sense short wavelength (i.e. blue) light, green cones medium, and red cones long wavelength light.


Figure 3.1 Neural structure of the retina. (a) Retinal cells stained different colours are shown in a cross-section of immunostained mouse retina. The bottom layer consisting of purple and blue coloured cells corresponds to the photoreceptors. Bipolar cells are shown in green, while amacrine and ganglion cells are stained red. Original image source: Morgan and Wong (2007). (b) A schematic diagram illustrates the layered structure of the retina. The layer of photoreceptors on the bottom, consisting of rods and cones, is the first unit in the visual information processing pipeline. This is followed by the outer plexiform layer featuring synaptic connections between photoreceptors, horizontal cells, and bipolar cells, while the subsequent inner plexiform layer corresponds to the connections between bipolar cells, amacrine cells, and ganglion cells. Original image source: Kolb (2003, p. 30), permission for use granted. (c) Schematic diagram showing the location of the retina on the back of the eye and a close-up of the retinal network. Original image source: Kolb (1995).

A dichromat’s retina, on the other hand, has not evolved to differentiate between medium and long wavelength light and consequently only contains 2 types of cones: a short wavelength blue cone and a long wavelength cone, termed a green cone.

3.2.2 Horizontal Cells

The synaptic connections between photoreceptors, horizontal cells, and bipolar cells are referred to as the outer plexiform layer of the retina. With their large receptive fields, horizontal cells connect to a high number of photoreceptors. Moreover, they can connect to neighbouring horizontal cells through gap junctions, thereby further extending the area they receive information from. The main role of horizontal cells is local gain control, whereby a horizontal cell measures the average illumination within its receptive field and sends inhibitory feedback to photoreceptors. Furthermore, evidence strongly suggests bipolar cells also receive horizontal cell feedback (Euler et al., 2014; Masland, 2012). Through such feedback mechanisms, the photoreceptor–horizontal cell circuit guarantees that the input signal reaching further processing units of the retina is kept within the appropriate range. Additionally, this method also provides some degree of edge enhancement.

3.2.3 Bipolar Cells

Bipolar cells receive input from photoreceptor cells and send their output to ganglion and amacrine cells. In addition, they can receive feedback from different types of amacrine cells.

Bipolar cells can be divided into two main categories based on whether the cells connect to the ON or OFF pathways. The ON pathway is responsible for detecting light objects on dark background, whereas the OFF pathway detects dark objects on light background. The temporal response properties to light stimulus can also differ among bipolar cells. Some of these cells show a transient response at the onset of light, while other bipolar cells express sustained responses, and yet others exhibit more complex temporal patterns. Currently, it appears that all (or most) anatomical types of bipolar cells have been discovered, and recent lists in the literature suggest the existence of around 12–13 distinct bipolar cell types (Euler et al., 2014; Masland, 2012). Among these, one type of bipolar cell is rod specific, whereas the others connect to cones.

A characteristic trait of the bottom-up connections is that information is processed through multiple parallel pathways. This phenomenon can be observed even at the earliest stages of visual information processing, as the output of a single cone is sent to multiple bipolar cells which represent all suitable bipolar cell types. As for the top-down connection patterns, bipolar cells generally exhibit non-selective connection patterns and connect to all cones within the region covered by their dendrites. Exceptions are a type of cone bipolar cell that connects only to blue cones and a few other types which ignore blue cones.

Some bipolar cells are non-chromatic and have the putative role of conveying brightness information. Other, wavelength specific, bipolar cells implement a step towards colour vision by sending their output to ganglion cells that compare short and long wavelength excitations. In a dichromatic retina, for example, the blue cone specific bipolar cell connects to the ‘blue ON/green OFF’ ganglion cell, which can detect the presence of a blue blob in the centre of its receptive field. Recently, it has been demonstrated that bipolar cells implement more versatile functions, e.g., they even have roles in contrast adaptation (Brown and Masland, 2001).

3.2.4 Amacrine Cells

The cell bodies of bipolar, horizontal, and amacrine cells form the inner nuclear layer of the retina. In addition, there exist a few so-called displaced amacrine cells (Pérez De Sevilla Müller et al., 2007), whose cell bodies are located in the ganglion cell layer. Amacrine cells receive input from bipolar cells and, depending on their type, can either send their output to ganglion cells or other types of amacrine cells or provide feedback to bipolar cells.

The approximately 30–42 different types of amacrine cells implement a variety of diverse tasks within the retina (Euler et al., 2014; Masland, 2012). For example, a polyaxonal, wide-field amacrine cell has a crucial role in sensing differential motion of an object and its background (Ölveczky et al., 2003). The process involves a specific, object motion sensing (OMS) ganglion cell, whose role is to determine if an object in the ganglion cell’s receptive field centre moves with a different speed than the rest of the receptive field area, which corresponds to the background. Central motion of an object results in the OMS ganglion cell receiving excitatory signals from bipolar cells located within the centre of its receptive field. The motion of the larger background area fuels polyaxonal amacrine cells, which subsequently send inhibitory signals to the OMS ganglion cell. If the speed of the background and the centre is the same, the inhibitory signals cancel the excitatory signals before they can reach the ganglion cell. If, however, the object’s speed does not agree with the background’s speed, the ganglion cell can sense the difference in object and background motion.

Other than a few amacrine cell types that have been examined in such detail, there still exist a number of amacrine cells whose functionality is not yet understood.

3.2.5 Ganglion Cells

The highest level of information processing within the retina is implemented by ganglion cells. These cells receive input from a number of bipolar cells, which can represent a mixture of different types, and occasionally from amacrine cells, as in the case of the above described OMS ganglion cell. The synapses between bipolar cells, amacrine cells, and ganglion cells constitute the inner plexiform layer of the retina.

Different ganglion cell types detect specific visual features. Subsequently, information extracted by these cells is transmitted, through the optic nerve and the LGN, towards higher visual processing areas located in the visual cortex. According to current knowledge, between 11 and 22 types of ganglion cells can be present in mammalian retinae, but the list of existing types is still likely to be incomplete (Masland, 2012). Also, the visual information processing mechanisms implemented by some ganglion cells are still not fully understood. For example, half of the ganglion cell types in the rabbit retina execute yet undiscovered functionalities. Nevertheless, extensive knowledge has been obtained regarding the operational mechanisms of the more common ganglion cell types.

The most well-known types of ganglion cells play roles in local edge detection and have receptive fields which exhibit centre-surround antagonism (Kuffler, 1953): containing either an ON or OFF type centre with the opposite type surround. ON-centre cells receive excitatory signals when light appears in their receptive field centre and inhibitory signals resulting from light in the surround, while the opposite signalling pattern characterises OFF-centre ganglion cells. The receptive fields of these ganglion cells are most often modelled as difference-of-Gaussians (DoG) (see, e.g., Chen et al., 2014; Crook et al., 2011; Dacey et al., 2000; Enroth-Cugell and Robson, 1966; McMahon et al., 2004; Rodieck, 1965).

Figure 3.2 Difference-of-Gaussians filters: common models of ON (left) and OFF (right) type ganglion cell receptive fields.
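For reference, a DoG receptive field of the kind shown in Figure 3.2 can be generated as follows; the filter size and the two standard deviations are illustrative values, not parameters used elsewhere in this thesis.

```python
import numpy as np

def dog_filter(size=15, sigma_centre=1.5, sigma_surround=3.0, on_centre=True):
    """Difference-of-Gaussians model of a ganglion cell receptive field.

    An ON-centre filter is a narrow positive Gaussian (centre) minus a wider
    Gaussian (surround); the OFF-centre filter is simply its negation.
    """
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx ** 2 + yy ** 2
    centre = np.exp(-r2 / (2 * sigma_centre ** 2)) / (2 * np.pi * sigma_centre ** 2)
    surround = np.exp(-r2 / (2 * sigma_surround ** 2)) / (2 * np.pi * sigma_surround ** 2)
    dog = centre - surround
    return dog if on_centre else -dog
```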

Figure 3.2 shows an ON and an OFF type DoG filter.

In recent years, ganglion cells exhibiting more complex behaviours have been identified in mammalian retinae. Most of these cells show interesting dynamic properties and execute functionalities related to the detection of specific motion patterns in the visual input. The response of these cell types can depend heavily on context, for example, as seen before, the OMS ganglion cell’s response depends not only on the motion within the receptive field centre but also on motion patterns of the background. Other examples of ganglion cell functionalities related to specific motion patterns in the input include direction selectivity (Barlow et al., 1964; Vaney et al., 2012), saccadic suppression (Roska and Werblin, 2003), the anticipation of motion direction (Hosoya et al., 2005), and the detection of approaching stimuli (Münch et al., 2009). Additionally, the collective behaviour of ganglion cells can also implement specific tasks. For example, synchronised firing of a ganglion cell population in the tiger salamander retina can signal the reversal of motion (Chen et al., 2014; Schwartz et al., 2007).

Furthermore, a surprising finding has been the discovery of intrinsically photosensitive retinal ganglion cells (Berson et al., 2002), which are capable of directly sensing light through melanopsin, thereby constituting a distinct type of photoreceptor. These ganglion cells have important roles in the regulation of circadian rhythms by signalling extended, constant occurrences of light.

An interesting property of primate and human retinae is the existence of a small central area, the fovea, which contains an abundance of cones and exhibits near one-to-one connections between cones, bipolar cells, and ganglion cells. The existence of these so-called midget ganglion cells is essential for achieving high resolution colour vision in the centre of the visual field. Similarly, it is usual to find differences between the centre and the periphery of vertebrate retinae with respect to the density of ganglion cell coverage. For example, in the cat retina, Wässle et al. (1981) show the alpha ganglion cells populate the centre, called the area centralis, more densely than the periphery. Peripheral alpha cells, in return, have larger dendritic arbours and hence cover larger areas. As a typical property of the retina, the layout of ganglion cells exhibits a ‘tiling’ organisation, whereby cells are placed close enough to provide good coverage but far enough to avoid much overlap. Accordingly, the increased spread of peripheral ganglion cell dendritic arbours is necessary for maintaining an almost perfect coverage in all areas of the retina.

3.3 Retinal Modelling on Different Scales

Neural computation processes can be modelled on different abstraction levels, ranging from detailed models of single neurons, through models of smaller neural circuits, to large-scale neural networks. This is no different in the case of retinal modelling, where the literature has investigated:

(i) models of single cells (Fohlmeister et al., 1990; Fohlmeister and Miller, 1997; Kameneva et al., 2011; Liu and Kourennyi, 2004; Miller et al., 2006; Tsai et al., 2012; Usui et al., 1996a,b; Vallerga et al., 1980; Velte and Miller, 1997),

(ii) populations of neurons (Field et al., 2010; Lennie et al., 1991; Nirenberg and Pandarinath, 2012; Pillow et al., 2008; Shlens et al., 2008; Tsai et al., 2012; Werblin and Jacobs, 1996), and

(iii) networks of interacting retinal cell types (Chen et al., 2014; Cottaris and Elfar, 2005; Gaudiano, 1992; Gollisch and Meister, 2010; Hennig and Funke, 2001; Kien et al., 2012b; Maul et al., 2011; Publio et al., 2009; Smith and Vardi, 1995; Teeters et al., 1997, 1991; Werblin, 1991; Zaghloul and Boahen, 2006).

In the works of Fohlmeister et al. (1990), Fohlmeister and Miller (1997), and Kameneva et al. (2011), models capturing single cell firing patterns of ganglion cells are proposed and validated against experimental data obtained from intracellular recordings. Liu and Kourennyi (2004) propose a complete computational model of the rod and show that the model can reproduce response properties of rods measured by patch clamp experiments.

To study the implementation of colour opponency in the retina, Lennie et al. (1991) use a simulation of large-scale ganglion cell populations to aid with the examination of hypotheses regarding their wiring patterns. Werblin and Jacobs (1996) develop a cellular neural network model of a retinal bipolar cell population, resulting in the discovery of a new edge enhancement mechanism. Still on the scale of neural populations, Pillow et al. (2008) show that synchronised firing patterns of a retinal ganglion cell population can be described accurately by a model which takes into account the dependence of neighbouring ganglion cells.

The literature also contains a number of models which describe interactions between different retinal cell types. Publio et al. (2009) model circuitries of the rod pathway, while the cone pathway is studied by Teeters et al. (1997) and Cottaris and Elfar (2005). Teeters et al. (1991) provide a simulation to analyse a potential circuitry responsible for the inhibition of ganglion cell responses to stimuli in the receptive field centre at times when movement occurs in the surround. On a larger scale, each of the main retinal cell types is modelled using predefined low-pass filters in a retinal simulation study by Hennig and Funke (2001).

Motion reversal detection is known to be implemented by populations of ganglion cells in the salamander retina; however, commonly used models of ganglion cells, such as a linear-nonlinear (LN) model or an LN model with gain control, cannot reproduce this response (Chen et al., 2014; Schwartz et al., 2007). To this end, Chen et al. (2014) propose the more complex, adaptive cascade model, which produces results consistent with experiments.

As a model of the outer plexiform layer, Maul et al. (2011) propose a recurrent neural network, where the parameters are learnt by evolutionary optimisation and possible functions of the outer plexiform layer are investigated.

In prior art, neural network models implementing different types of interacting retinal cells are predominantly of limited complexity and scale; also, the networks typically do not contain more than a single (in some cases recurrent) hidden layer.

With respect to the modelling of early vision, this thesis focuses on a higher level description of retinal circuits and mechanisms. Consequently, instead of requiring fidelity to the characteristics of single neuron behaviour, the primary goal is to develop methods which can identify and execute higher-level functionalities implemented by retinal circuits. Such functionalities include, for example, preprocessing mechanisms employed by the outer plexiform layer and features detected by ganglion cells. In the centre of my interest stands the assessment of large-scale deep network models for retinal modelling and the identification of such training algorithms that have the potential of automatically learning functionalities implemented by the retina.
In the forthcoming chapter, I will investigate how the use of deep belief and deep neural networks, equipped with a high number of layers and nodes, can provide a powerful new way of modelling the retina.

3.4 A New Retinal Modelling Approach

The following sections compare two different approaches towards modelling the retina. Section 3.4.1 describes a common approach, often found in the computational neuroscience, neuroengineering, and biological modelling literature, based on strict implementation of known mechanisms in the retina. Subsequently, my novel approach to retinal modelling, which proposes the use of flexible probabilistic models and structure learning, is introduced in Section 3.4.2. To highlight areas where this novel approach can be advantageous, in Section 3.5 some intriguing open questions are presented regarding the retina, and the feasibility of their investigation using the proposed retinal modelling approach is evaluated.

3.4.1 Traditional Approach: Modelling the retina to the ‘best of our knowledge’

One straightforward way of designing computational retinal models starts with an assembly of readily available information regarding neuronal structures and processes in the retina, followed by the replication of these mechanisms with computer simulation. Such carefully ‘handcrafted’ retinal models can be used to validate and better understand experimental findings. Also, this approach provides a viable option for retinal prosthesis design, and examples of such systems are numerous in the literature (see, e.g., Gollisch and Meister, 2010; Hallum et al., 2004; Humayun et al., 1996; Maul et al., 2011; Zaghloul and Boahen, 2006).

When following this modelling approach, expert knowledge regarding neural structures and processes previously discovered in the retina can be collected from recent neuroscience studies (e.g. Field et al., 2010; Gollisch and Meister, 2010; Kolb, 2003; Masland, 2012; Wässle, 2004). Relevant findings include information on connection patterns of retinal cells or the functionalities associated with cells or cell populations, e.g. contrast gain control, object motion detection (Ölveczky et al., 2003), or the detection of an approaching object (Münch et al., 2009). Based on such findings a model can be formulated which describes how cells are thought to co-operate. The behaviour of the simulated retina can then be tested when, for example, new input is used or changes are introduced to the neural structures. Within a similar framework, it is also possible to propose and investigate the feasibility of some hypothesised functionality of the retina. In Maul et al.’s (2011) work, for example, candidate features are selected first, then experiments are conducted to determine which one of these features can be implemented by circuitries of their proposed retinal model.

3.4.1.1 Critique of the Traditional Approach

Although it seems straightforward to construct a computational model of the retina using hard-wired connection patterns, such a modelling approach has two main shortcomings.

Firstly, a hard-wired model can only be as close to a real retinal neural network as the available information allows. The problem is not only the fact that our knowledge regarding the working of the retina is limited but, also, that discoveries and theories in neuroscience sometimes get proven wrong. A number of theories about the retina were believed to be true until someone with advanced equipment or a new testing protocol proved the opposite. For example, it is only recently that scientists started to realise the retina is more than just a simple filtering mechanism using difference-of-Gaussians filters. This is due to ganglion cells with more complex functionality being harder to identify through standard experiments. As an example, object motion sensitive ganglion cells were not discovered until the right type of input was presented to the photoreceptors (Gollisch and Meister, 2010; Ölveczky et al., 2003). A computational model built by the ‘traditional’ approach therefore may need to be re-designed at a later stage when new findings are obtained from clinical experiments. Also, when proposing novel candidate functionalities of retinal cells using this type of model, one would need to incorporate the uncertainty of any previous findings upon which the model is built.

A second concern regarding the traditional modelling approach is its limited use for identifying potential, yet undiscovered, functionalities implemented by the retina. This is due to the strict hand-selected architecture and, more generally, a modelling approach driven by expert knowledge instead of data. Even in the case when such models are used for testing the feasibility of a hypothesised retinal functionality, selection of the candidate functionality depends entirely on the choice of the modeller. Such a choice or intuition is influenced by our current knowledge of visual processing; therefore, more complex functionalities, not yet thought of, might get missed.

3.4.2 Novel Approach: Discovering the neural structure of the retina from data

With the goal of this work being the design of computational models which enable the discovery of yet unknown retinal features, instead of the traditional route, I propose a data-driven approach towards retinal modelling. Such a modelling approach repositions the main focus from what is known about the retinal neural network to how and why such a network has developed. Instead of simulating known neural processes in the retina by strict implementation, this data-driven approach strives to discover retinal functionalities automatically from data.

From the biological modelling point of view, the goal of this approach can be interpreted as inferring the whys and wherefores of retinal evolution and development, albeit without strictly replicating such processes. If one considers that a key component of visual information processing is the encoding of patterns present in input data, focusing on methods which can learn such encodings efficiently shows high potential for retinal modelling and can provide insights into the reasons behind the development of certain retinal functions.

Learning a computational model of the retina from data encompasses multiple aspects, such as ensuring the functional and the structural fidelity of the model to a retina. These two aspects are distinct: a model could learn functionalities executed by a retina (e.g. contrast gain control, DoG filtering), thereby demonstrating functional similarities, without adopting the retina’s neural network structure. The second aspect of modelling would comprise approximating the neural network structure of the retina by automatically learning the structure from data. This structure has been optimised through evolution to provide fast computation of various informative features as a first step towards abstract information encoding for visual recognition. Examples of such feature detections implemented by the retina include the presence of dark blobs on light background and vice versa, the direction of motion (Barlow et al., 1964), detection of an approaching dark object (Münch et al., 2009), and motion reversal detection (Chen et al., 2014; Schwartz et al., 2007). A method capable of learning such a network structure from data could be used for simulating how the neural network of the retina has evolved and developed to solve feature detection tasks.

While both the traditional and the novel data-driven approach can discover useful clues about the routines followed by retinal cells, the latter has much greater potential for identifying yet undiscovered functionalities in the retina. This thesis therefore proposes a data-driven approach for modelling the retina, and the following chapters introduce machine learning algorithms which support such an approach. Furthermore, experimental analysis is provided to assess the ability of these methods to identify the types of encodings that are extracted by the retina and subsequently conveyed towards higher cortical areas.

3.4.3 Considerations

To learn retinal models in a data-driven way, both supervised and unsupervised learning protocols can be applied, and Chapter 4 investigates the suitability of both types of methods. The evaluation focuses on deep learning methods, which are highly efficient at identifying complex patterns in input data and thereby learning multi-layered models with data-specific feature detectors. Although model architectures examined in Chapter 4 are predefined prior to training, the high number of parameters learnt by the model from data greatly improves flexibility and allows for the encoding of key features even in the case of challenging datasets.

It is also possible to design an algorithm for learning the architecture of the model directly from data. For this, as described in Section 2.5 of Chapter 2, one can build on previous work in the areas of evolutionary algorithms (Pelikan et al., 2001), evolving neural networks (Angelov and Filev, 2004), Bayesian non-parametrics (Adams et al., 2010), structural extensions of EM (Friedman, 1997; Meila and Jordan, 2001), and kernel-based approaches (Bach and Jordan, 2002).

A description of Bayesian non-parametric approaches for structure learning was provided in Section 2.5.1 of Chapter 2, while my proposed deep learning method that incorporates a novel structure learning element is introduced in Chapters 5 and 6.

3.4.3.1 Specifics of Retinal Modelling

In the literature, standard methods for learning the structure of a neural network have mainly been applied in the case of some restricted types of structures, e.g. directed acyclic graphs (DAGs). Therefore, to be able to learn the architecture of the retina, these methods would need to be adapted in order to incorporate the learning of more complex neural structures, such as feedback circuits, which are prevalent within the retina. When designing a computational model of the retina, it is also important to take into account the following specifics of retinal information processing:

(i) the characteristics of retinal processing can change with time,

(ii) the firing of neurons has an effect on other nearby neurons, and

(iii) the temporal characteristics of neuronal firing are important for information encoding.

The significance of accounting for the effects of neighbouring neurons on each other’s firing patterns has been demonstrated by Pillow et al. (2008). To give an example which emphasises the importance of considering temporal effects, as shown by Ölveczky et al. (2007), the retina is most sensitive to the onset of object motion and thereafter adapts its behaviour by lowering the firing rate in response to the same stimulus. A fully functional model of the retina should, therefore, not be static over time but, rather, be a dynamic model that can alternate between different states. The model should also be able to adapt to long and short term changes within the environment and account for the complex dynamics of neuronal populations.

3.4.4 Training and Evaluation Protocol

When using a data-driven modelling approach, it is especially important to carefully construct the training dataset. For example, in the case of a supervised task, one has to make sure the training data and the corresponding labels show a high level of similarity to the type of input and output correspondences the retinal neural network would produce. If it was known exactly what input-output correspondences were present during retinal information processing, the evaluation protocol could be constructed by simulating the type of input the retina would receive and comparing the model’s output to the required output. However, when modelling the retina with the intention of facilitating the discovery of novel functionalities, the required output is not completely known and inferring it is indeed our task.

3.4.4.1 Experimental Data

Nirenberg and Pandarinath (2012) implement a similar approach by measuring firing patterns of a group of mouse retinal ganglion cells to known input using multielectrode recordings, followed by an approximation of each ganglion cell’s processing function by a linear-nonlinear cascade. Shortcomings of the method include the need for laboratory experiments, the necessity of hand-selecting and identifying a set of retinal cells with which the modeller conducts the experiments, and the limitation on the amount of data that can be collected with such an experiment. Also, the use of the LN cascade may be suboptimal for approximating ganglion cell functionality, as such shallow models do not account for the multi-layer structure of the retina, thereby ignoring important processing elements, such as bipolar and amacrine cells.

In summary, acquiring labelled data which describes input-output correspondences of selected neural circuits in the retina is possible with experimental techniques, e.g. multielectrode recordings; however, with such an approach, the modeller’s choices regarding the experimental set-up pose inherent constraints on what cell and circuit functions can be studied, and the obtainable data are highly limited.
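For reference, the linear-nonlinear cascade mentioned above reduces a cell’s processing to a single linear filter followed by a static nonlinearity. A minimal sketch is given below, with an arbitrary rectifying nonlinearity as a placeholder; it is only meant to illustrate why such a shallow model ignores the retina’s intermediate processing layers.

```python
import numpy as np

def ln_response(stimulus, linear_filter, nonlinearity=lambda x: np.maximum(x, 0.0)):
    """Generic linear-nonlinear (LN) cascade model of a ganglion cell.

    stimulus and linear_filter are arrays of the same shape (e.g. a
    spatio-temporal patch of input and the cell's receptive field); the
    filtered value is passed through a static nonlinearity to give a rate.
    """
    drive = np.sum(stimulus * linear_filter)    # linear stage
    return nonlinearity(drive)                  # static nonlinear stage
```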

3.4.4.2 Output of Higher-Level Processes

One of the complementary approaches proposed in this thesis uses the output of higher-level processes to help infer certain lower-level functionalities. In the human brain, the final output of visual processing is, in some way, easier to estimate than the output of lower processing layers, because humans can interpret and directly use this output at every moment in visual scene understanding. This higher-level representation of information describes, for example, objects which can be recognised in the scene and the direction of their motion, faces and their emotions, or the activities conducted in visual scenes. Furthermore, the output can have components utilised during different classification processes (e.g. determining whether the environment is dangerous), the initiation of eye movements (e.g. calculating the coordinates of the next saccadic eye movement), or decision making (e.g. after recognising the approach of an object, making the decision to move away). Additionally, by building on knowledge from neuroscience experiments, some assumptions can also be made regarding intermediate level tasks the visual pathway executes.

These observations lead to a complementary way of evaluating the performance of a retinal model by making use of knowledge concerning the output of more complex, higher-level encoding processes. In Chapters 4 and 6, I will give examples of how intermediate and higher-level tasks can be used for guiding the optimisation, and I will show the proposed models can discover feature detectors analogous to the ones implemented by certain retinal neural circuits.

An apparent caveat of this approach is the increased complexity of the tasks used for learning a computational model of the retina. Nonetheless, by following this method, information can also be discovered about functionalities of higher-level processing layers in the visual cortex. It becomes somewhat harder to determine which of the learnt features are plausible functionalities implemented by the retina. However, if there exist some well-known retinal feature detectors among the detectors automatically learnt from data, other features identified by the model could also be considered as putative functions of the retinal network. Further guarantees can be achieved through the addition of constraints and validation mechanisms to the learning algorithm. For example, receptive field constraints introduced in Chapter 5 can be beneficial for improving the structural similarity of the model to a retina.

3.4.4.3 Unsupervised Methods

A third approach, which I will also investigate through experiments in Chapter 4, utilises unsupervised methods for automatically learning feature detectors from adequate training data. Such detectors are then treated as possible functions implemented by the retinal network or the visual cortex. This method builds on the view that the evolution of a large percentage of visual information processing functionalities within the visual pathway (or, at least, in the earlier parts of the visual pathway) was driven by the goal to detect and efficiently encode patterns present in input data (see the efficient coding hypothesis investigated by Atick and Redlich, 1992; Attneave, 1954; Barlow, 1961; Dong and Atick, 1995; Graham and Field, 2009; Simoncelli and Olshausen, 2001).

Following the ‘unsupervised’ approach, one can bypass the difficulty of selecting suitable classification tasks, acquiring labelled data, and problems that can arise from noisy classification labels. In this case, it is still important to assemble a suitable training dataset which reflects well the particularities of the input processed by retinal cells. For example, if the input was provided in the form of a colour video stream of natural scenes, one would have to make sure typical image preprocessing techniques, such as noise removal or contrast adjustment, are only applied in moderation, if at all. Within the visual pathway, the implementation of such preprocessing functions is mainly the task of the retina; therefore, a complete model of the retina should, indeed, incorporate the execution of these preprocessing tasks and should not rely on prior image processing.

3.4.5 Advantages of the Proposed Approach

Instead of building strict handcrafted retinal models, I propose a data-driven approach for learning a computational model of the retina. Such an approach is advantageous, as it provides a suitable framework for the discovery of yet unknown potential functionalities of retinal cells and circuits. There still exist a number of ganglion cell types whose functions are unknown or not fully understood (Masland, 2012). Without having to hypothesise what processing mechanisms can be implemented in the retina, methods complying with the proposed data-driven approach strive to discover these features automatically from data.

As opposed to the traditional approach driven by expert knowledge regarding the retina, with a data-driven method, it becomes harder to guarantee that the resulting models are plausible neural networks for retinal modelling. However, the flexibility of learning retinal functionality automatically from data is a great advantage when the expert knowledge is incomplete or inaccurate. Also, the novel approach possesses much greater potential for furthering our knowledge regarding the workings of the retina and biological visual information processing in general. As a validation, the discovery of some well-known features of the retina can indicate that other features identified by the model could also be implemented in the retina. Furthermore, I will show how structural constraints can be utilised in the learning algorithm for improving the fidelity of the resulting model to biological neural networks of the visual pathway.

3.5 Open Questions in Retinal Research

In the rest of this chapter I will describe some exciting research directions and open questions within retinal research and modelling, while also discussing whether the proposed data-driven approach and, particularly, methods for network structure learning are suitable for investigating these questions.

3.5.1 Eye Movements

In psychology, neurophysiology, and related areas, remarkable attention has been paid towards understanding eye motion control and the relationship between attention and eye movements (Duhamel et al., 1992; Groner and Groner, 1989; Hoffman and Subramaniam, 1995; Kustov and Robinson, 1996; Liversedge and Findlay, 2000; Shepherd et al., 1986). Although much of eye motion control is conducted from higher processing areas in the brain, such as the superior colliculus (Kustov and Robinson, 1996; Robinson, 1972; Wurtz and Goldberg, 1972), retinal cells have important roles in relation to the guidance of attention and the execution of eye movements. For example, Roska and Werblin (2003) have found evidence of saccadic suppression in the rabbit retina, i.e. an inhibiting mechanism acting upon certain ganglion cells during fast eye movements, called saccades.

A straightforward way to investigate this area is through the analysis of eye motion data. If such data are available, the control of eye movements can be used as a higher-level task to guide the optimisation within the data-driven framework, as described in Section 3.4.4. This could enable the method to discover features at the retina level which have important roles in the control mechanisms of eye movements.

3.5.2 Retinal Development

Exciting questions accompany the development of neural circuits in the retina. Experiments with light-deprived developing mice have shown a deviation in ganglion cell dendritic distribution (Tian, 2008), which indicates that changes in the environment can have a strong effect on the development and maturation of retinal circuits. A structure learning approach has great potential for modelling circuit development and investigating the effects of sending abnormal input to the photoreceptors during the learning phase.

3.5.3 Colour Opponency

There still remain various details to be learnt regarding the way red-green colour opponency within the visual input is conveyed by the retina towards the brain in trichromatic primates. The difference between the signals received from the centre and surround of a midget ganglion cell is crucial for channelling information about colour opponency.

According to the ‘random wiring hypothesis’ (Paulus and Kröger-Paulus, 1983), the receptive field centre of a midget ganglion cell receives input originating selectively from one type of cone, whereas the surround receives mixed signals gained from both cone types. Here, cone type refers to the wavelength (i.e. red or green colour sensitivity) of the cone. Evidence supporting the random wiring hypothesis has grown significantly in recent years (Crook et al., 2011; Field et al., 2010; Jusuf et al., 2006), due in large part to the use of high resolution multielectrode recordings (Field et al., 2010).

Presently, the most likely hypothesis for the implementation of surround antagonism is based on horizontal cell negative feedback to cones (Crook et al., 2011). Such a finding would provide additional support for the random wiring hypothesis, as these horizontal cells, indeed, connect to both red and green cones non-selectively. In further agreement with the random wiring hypothesis, it is possible that during evolution trichromatic colour vision was implemented using pre-existing circuits, corresponding to the centre-surround midget ganglion cell receptive fields, without the need for rewiring.

On a lower processing level, it is interesting to examine the connection patterns between cones and bipolar cells. In general, midget bipolar cells have connections with only one type of cone (Field et al., 2010), but there exist some bipolar cells which have inter-chromatic connections and, thus, likely convey brightness information (Masland, 2012). From the notion that horizontal cells provide the surround of midget ganglion cells, it also follows that colour opponency can be observed as early as at the level of midget bipolar cells.

The proposed data-driven and structure learning approaches are highly suitable for investigating how these connection patterns, pathways, and mechanisms evolved for colour vision. By providing an opportunity to examine the dominant trends in connection patterns between cones and other cells in the learnt networks, these approaches allow for evaluating the feasibility and likelihood of different hypotheses regarding the implementation of colour vision in the retina.

3.5.4 Rod Pathway

As described above, colour opponency has likely been implemented in the retina by simply building on the existing centre-surround receptive field structure of midget ganglion cells, without major changes in the retinal circuits (Crook et al., 2011). This highlights a trend that is observable in retinal evolution: often, a new functionality is implemented by utilising pre-existing neural circuits instead of resorting to rewiring the neural structure of the retina. Another example is the evolution of the rod pathway, which utilises the previously evolved cone pathway. This phenomenon gives a hint about the particularities of retinal evolution, which can be incorporated within the optimisation algorithm of a structure learning method.

Circuits connecting rods to ganglion cells have been widely studied in the neuroscience literature (see, e.g., Deans et al., 2002; DeVries and Baylor, 1995; Field et al., 2005; Hack et al., 1999; Pahlberg and Sampath, 2011; Soucy et al., 1998; Tsukamoto et al., 2001). Creating computational models of this network is, on its own, a highly interesting task, and the problem lends itself well to data-driven modelling approaches. However, trying to learn the circuits of the rod pathway from data makes the task even more challenging and especially suitable for investigation with structure learning algorithms.

3.5.5 Photosensitive Ganglion Cells

As discussed in Section 3.2.5, melanopsin-containing ganglion cells with intrinsic photoreceptive capabilities (Berson et al., 2002) were a remarkable addition to the line of rod and cone photoreceptors. Furthermore, their morphology and supposed roles turned out to be much more varied than once thought (Schmidt et al., 2011). Apart from enabling the execution of the pupillary light reflex and the photoentrainment of circadian rhythms (Golombek and Rosenstein, 2010; Wright et al., 2013), i.e. synchronisation of the body’s biological clock with the 24 hour cycles of night and daytime, some of these ganglion cells may even have effects on image-forming vision (Schmidt et al., 2011). Recently, the existence of a further type of photosensitive ganglion cell has been suggested, which, through the use of neuropsin, photoentrains the retina’s local circadian clock without depending on the body’s central biological clock (Buhr et al., 2015). It follows from the novelty of these findings that a number of questions concerning the mechanisms involved with photosensitive ganglion cells remain unanswered. The modelling of these cells and circuits would involve extending the proposed data-driven framework with models of non-image-forming visual functions. While this may amount to a more complex task, data-driven approaches could play an important role in studying the evolutionary processes behind the formation of these exceptional ganglion cells and the circuitries involved.

3.5.6 Digital Versus Analogue Computation

To understand the information processing mechanisms employed by a neuronal population in the brain, it is interesting to investigate whether the cells conduct ‘analogue’ or ‘digital’ computation, i.e. whether signalling is executed through graded potentials or action potentials (spikes), and the possible reasons behind either choice. Shu et al. (2006) show examples of both types of computation in the cerebral cortex. Furthermore, the retina has also been found to employ both graded and action potentials for information encoding (Baden et al., 2013). Analogue systems can provide highly efficient computation, but the accumulation of noise is often a problem in large-scale systems. Sarpeshkar (1998) therefore hypothesises that the brain works as a distributed hybrid system (i.e. conducting both digital and analogue computation) using a large number of noisy neurons and wires in order to ensure highly efficient computation. Averaging can play an important role in noise reduction, and it is also possible the brain implements some structure similar to an Analogue/Digital/Analogue adapter in electronics to mitigate noise (Sarpeshkar, 1998). Retinal circuits implement a number of noise reduction mechanisms. For example, it is now accepted that rod bipolar cells in the retina threshold the incoming signals from rods, and the most likely explanation for this is thought to be the improvement of the signal-to-noise ratio (Pahlberg and Sampath, 2011). To examine what encoding mechanisms can provide the best balance between efficient computation and noise-resistant coding in the retina, a data-driven approach could be highly advantageous. Here, both supervised and unsupervised methods, tailored to the architectural constraints and specifics of the retina, can be used to investigate efficient information encoding paradigms. Due to the lack of large distances within the retinal network, which would warrant the digitalisation of signals, a large portion of information processing in the retina is conducted through graded potentials (Purves et al., 2001). While ganglion cells were known to send information towards the brain using noise-resistant digital signals in the form of action potentials, photoreceptors, horizontal cells, and bipolar cells, on the other hand, were long thought to produce only graded potentials (Dowling, 1987; Kaneko, 1970; Masland, 2001; Werblin and Dowling, 1969). Recent findings (Baden et al., 2013, 2011; Dreosti et al., 2011; Protti et al., 2000), however, have weakened this theory by identifying examples of both spiking and graded responses recorded from bipolar cells. It can be beneficial to investigate what reasons, apart from noise removal, exist for the digitalisation of signals in certain neural cells. Adams et al.’s (2010) method for learning the structure of deep belief networks incorporates Frey’s (1997) continuous sigmoidal belief network training algorithm, thereby acquiring the ability to infer whether a given hidden node of the network is discrete or continuous. Also, Maul’s (2013) neural diversity machine provides a way to learn a different activation function per node in a neural network through a type of evolutionary optimisation.
The outcome of such learning methods can provide useful clues for deducing which stages of visual information processing can benefit from digital or analogue computation and, thereby, the reasons behind the existence of neurons with different types of signal encoding characteristics in the retina and the brain.

3.5.7 Retinal Prostheses

The importance of computational models for improving the design of retinal prostheses provides great motivation for research in retinal modelling. Dowling (2008) and Kien et al. (2012a) give detailed descriptions of various prosthesis design principles proposed in the literature. Although access to clinical experimental equipment and test subjects is necessary to finalise the development of a retinal implant, the theoretical foundations of prosthesis design can benefit greatly from data-driven modelling of retinal information encoding mechanisms. An example of a prosthesis system (Nirenberg and Pandarinath, 2012) benefiting from data-driven modelling of ganglion cell functions was introduced in Section 3.4.4.1. Unlike the works of Eckmiller et al. (2005), Hallum et al. (2004), Zaghloul and Boahen (2006), and Humayun et al. (1996), my research investigates theoretical rather than engineering aspects of retinal prosthesis design. As opposed to developing a carefully engineered system based on well-established knowledge regarding some retinal circuits, in this thesis, primary focus is given to methods suitable for automatically learning retinal functions and potentially discovering new information about the retina. While this approach is more indirect, the discovery of new putative features of the retina would inevitably prove highly useful for guiding retinal prosthesis design.

3.5.8 Population Codes

It has become a central topic in neuroscience to examine and model populations of neural cells (Schneidman et al., 2003; Stevenson et al., 2008), as opposed to modelling only single cells. The same trend exists within retinal research (Field et al., 2010; Pillow et al., 2008; Shlens et al., 2008), leading to the recognition of synchronised firing patterns in certain retinal cell populations. For example, synchronised firing has been observed by Pillow et al. (2008) within parasol ganglion cell populations of the peripheral macaque retina. As a computational model of ganglion cell firing, they propose a generalised linear model which takes into account dependencies between neighbouring cells using coupling filters. This model is shown to better fit the measured response patterns of the examined retinal ganglion cell population than a model assuming independence, suggesting correlations exist between the firing patterns of the cell population. Such findings indicate that when designing a retinal model, it is important to take into account the effects of synchronised firing among nearby neurons. Such dependencies can be built into a data-driven model in the form of constraints (similarly to the coupling filters used by Pillow et al., 2008). Alternatively, when applying a structure learning algorithm, these dependencies could be discovered automatically from data.

3.5.9 Neuronal Diversity

The presence of diversity within the neuronal responses to the same stimulus is typical in the cortex, with response properties of even nearby cortical cells being greatly varied (Chelaru and Dragoi, 2008; Hubel and Wiesel, 1962; Ringach et al., 2002; Soltesz, 2006). Chelaru and Dragoi (2008) found this inhomogeneity results in improved population codes. Diversity in neural wiring among individuals is also strongly pronounced, which may have important evolutionary roles (Soltesz, 2006). Differences exist between the connection patterns of any two brains (Freund et al., 2013); however, individuals still manage to process information in similar ways and execute the same basic tasks. This indicates that, on more abstract levels, the information encodings extracted in the two brains are analogous. Most structure learning frameworks are highly suitable for representing such differences in the connection patterns of individual neural networks. When evolutionary algorithms are used for structure learning, a whole population of networks is learnt, where key differences exist among the networks. With Bayesian nonparametric structure learning approaches, model structure is usually optimised through sampling, making it possible to assemble a diverse population of plausible models. Such methods result in a group of networks where the individual connection patterns are likely to differ but the majority of the population is able to solve the designated task. It is possible for the population to contain some incapable networks; however, a good structure learning process should ensure that the probability of retaining such networks is low.

3.5.10 Plasticity of the Brain

The brain is capable of displaying a great degree of plasticity, i.e. adaptation in response to changes and faults in neuronal structures or in the type of input received by the sensory systems (Freund et al., 2013; Kolb and Whishaw, 1998; Pascual-Leone et al., 2005, 2011). A remarkable neuroplasticity has been demonstrated by experiments with adult dichromatic monkeys (Mancuso et al., 2009), where the monkeys acquired trichromatic colour vision after an L-cone photopigment was added to their retinae. Also, the brain’s plasticity has key importance for current retinal prosthesis designs. Signals produced by these prostheses cannot replicate the exact input signals received by ganglion cells in an intact retina; however, the brain can still learn to make sense of the new type of input, at least to a certain degree (Nirenberg and Pandarinath, 2012). It is an engaging topic to study how this adaptability is ensured through the use of existing neural pathways and possible rewiring. Data-driven methods can model the effects of changing the normal type of sensory input, and structure learning algorithms are highly suitable for modelling the potential rewiring executed by the brain. Such methods can be especially useful when data from clinical experiments are available. Additionally, using data-driven and structure learning methods, one can evaluate the suitability of different network structures and training algorithms for the implementation of plasticity.

3.6 Summary

In summary, I propose to follow a data-driven approach towards modelling the retina and biological visual information processing in general. I argue that data-driven methods suit the ultimate goal of identifying new features useful for visual recognition. Such informative features may also be computed by retinal cells; therefore, this approach can facilitate the discovery of yet unknown retinal functions. The proposed data-driven approach is highly suitable for investigating a large spectrum of the research questions described in this chapter, and the following chapters will show examples of such investigations.

4 Modelling the Retina with Deep Networks

As described in Chapter 3, I advocate a novel data-driven approach towards retinal modelling. This chapter evaluates my proposed retinal model, introduced by Turcsany et al. (2014), which uses DBNs and discriminatively fine-tuned DNN classifiers to learn early vision functionalities automatically. I will also introduce a new dataset that simulates photoreceptor input, together with a task suitable for supervised fine-tuning of the proposed retinal models.

4.1 Experiments

Chapter 3 discussed facts, open questions, and current theories regarding the working of the retina. It has been emphasised that—due to the high number, diverse functions, and complex connection patterns of neurons—our factual knowledge of the retinal network is incomplete and surrounded by a substantial level of uncertainty. In such an environment, I propose to develop data-driven modelling techniques and computational models which exhibit substantial fidelity to the retina’s currently known neural structure, while also being flexible and highly adaptable. As seen in Chapter 2, recent research has demonstrated the capability of flexible, data-driven deep learning methods, such as DBNs and DNNs, for solving a variety of visual recognition tasks. Promising results have been obtained with RBMs and DBNs in the context of modelling elements of biological vision. However, primary focus has been directed towards higher-level visual features and later stages of visual processing found in the brain, e.g. certain V1 and V2 functionalities (Lee et al., 2008). So far, the great potential of DBNs has not been explored for modelling lower-level processes in detail, such as those implemented in the retina. Work described here addresses this issue and demonstrates how the retina’s inherently multi-layered structure lends itself naturally to modelling with deep networks. My experiments are conducted using a newly constructed dataset and a corresponding detection task, which simulates an early vision functionality. Using this input, DBNs are capable of learning feature detectors similar to retinal ganglion cells, and, after fine-tuning, the resulting DNNs achieve high accuracy on the corresponding detection task. These experiments thereby demonstrate the potential of deep networks for modelling the earliest stages of visual processing.

4.2 Modelling the Retina

Mammalian retinae contain 6 main cell types: rods, cones, horizontal cells, bipolar cells, amacrine cells, and ganglion cells, which are organised into consecutive layers as illustrated in Figure 4.1(a) and in Figure 3.1 of Chapter 3. Visual information processing starts in the layer of light-sensitive photoreceptors, which consist of rods and cones. Rods are essential for sensing light in darker, low-light environments, whereas cones are mainly utilised during daylight and provide the first step towards the implementation of colour vision. Retinal models described in this chapter focus on early-stage processing units underpinning visual processing in the brain in daylight conditions, with primary emphasis on the pathway between cones and ganglion cells.


Figure 4.1 Diagrams illustrating the structural resemblance of the retinal network and a DBN. (a) Schematic of interacting cell types in the multi-layer retinal network. (b) Diagram showing an example of the proposed retinal models: a 4-hidden-layer DBN that provides a generative model of the input data. V denotes the visible and H1–H4 the hidden layers. Blue arrows correspond to the generative model, while the upward pointing red arrows indicate the direction of recognition (i.e. class label prediction).

Neural computation units in the retina have been modelled at different abstraction levels, ranging from single neuron models to networks. My work focuses on a higher-level description of retinal circuits and the development of large-scale network models. The primary goal of this study was to assess the capability of DBN and DNN models to learn functionalities of retinal cells automatically from data without resorting to hard-wired circuitries. Special attention was assigned to learning models of top-layer units, corresponding to the inner plexiform layer of the retina (see Figure 3.1(b) in Chapter 3), such as retinal ganglion cells. The majority of ganglion cells and bipolar cells exhibit centre-surround receptive field organisation and either join onto the ON or the OFF pathway, which are respectively responsible for detecting light objects on a dark background and vice versa. Ganglion cells implement the highest level of information processing in the retina.

Figure 4.2 Examples of Gabor filters. Such functions are commonly used models of V1 simple cell receptive fields.

Ganglion cells detect specific visual features and transmit the extracted information through the optic nerve and the LGN towards higher processing areas located in the visual cortex. Common ganglion cell feature detectors include the ‘ON-centre’ and the ‘OFF-centre’ ganglion cells. The former receives excitatory signals when light appears in its receptive field centre and inhibitory signals resulting from light in the surround, while the opposite is true for OFF-centre ganglion cells. The receptive fields of these cells are most often modelled as difference-of-Gaussians (DoG) filters (see Figure 3.2 in Chapter 3). Centre-surround organisation is also typical of most LGN cells (Piscopo et al., 2013), while V1 simple cell receptive fields are commonly modelled using Gabor functions (Lauritzen and Miller, 2003), shown in Figure 4.2. The following sections will demonstrate that the proposed DBN- and DNN-based retinal models can learn feature detectors with centre-surround receptive fields, similar to DoG filters, and also some Gabor-like features automatically from data, thereby confirming the adequacy of deep learning algorithms for modelling the retina and early vision.
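To make these receptive field models concrete, the short Python sketch below constructs a DoG filter and a Gabor filter of the kind referred to above. It is included purely for illustration: the filter sizes and parameter values are arbitrary choices and do not correspond to any measurements or settings used elsewhere in this thesis.

import numpy as np

def dog_filter(size=21, sigma_centre=1.5, sigma_surround=3.0):
    # ON-centre difference-of-Gaussians: narrow excitatory centre minus
    # a wider inhibitory surround.
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx ** 2 + yy ** 2
    centre = np.exp(-r2 / (2 * sigma_centre ** 2)) / (2 * np.pi * sigma_centre ** 2)
    surround = np.exp(-r2 / (2 * sigma_surround ** 2)) / (2 * np.pi * sigma_surround ** 2)
    return centre - surround

def gabor_filter(size=21, wavelength=6.0, theta=0.0, sigma=4.0, phase=0.0):
    # Gabor filter: a sinusoidal grating windowed by a Gaussian envelope.
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    x_rot = xx * np.cos(theta) + yy * np.sin(theta)
    envelope = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * x_rot / wavelength + phase)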

4.3 A Multi-layer Retinal Model

As discussed, the retina possesses a multi-layer structure, where each layer contains different cell types with specific functions. This distinctive structure can best be captured through a model which exhibits a similar deep architecture and utilises multiple abstraction levels to encode visual features. This analogy in structure can be observed in the schematic diagrams in Figure 4.1, where the layered organisation of the retina is shown in Figure 4.1(a), while Figure 4.1(b) illustrates a generative DBN model with 4 hidden layers (H1–H4) trained on simulated photoreceptor input data. Deep learning methods, such as DBNs and DNNs, constitute a powerful means for learning successive layers of representation from data, where features on each consecutive layer are of increasing complexity. Such methods, which can automatically learn a hierarchy of distinctive features from data, are highly suitable for constructing a data-driven retinal model, as proposed in Chapter 3. Additionally, there exist some gaps in our understanding regarding the morphology and functionality of retinal cells and circuits. Probabilistic models, e.g. DBNs and RBMs, can provide the required flexibility for modelling in the presence of uncertainty. The following experiments investigate my proposed DBN- and DNN-based models of early visual processing, which incorporate both the outer and inner plexiform layers of the retina. The primary goals of the following experiments and analysis are to:

(i) design a dataset, which simulates input received by a small area of the retina;

(ii) on this dataset, set out a classification task that approximates a functionality of early visual processing circuits;

(iii) investigate the capability of unsupervised RBMs and DBNs to learn features similar to the ones detected in the retina and the brain during the early stages of visual processing;

(iv) analyse images produced automatically by these DBNs when used as generative models;

(v) evaluate the reconstruction performance of the proposed DBN models;

(vi) fine-tune DBNs in a supervised manner on the specified classification task and compare the performance of the resulting DNN classifiers to ‘traditional’ DNNs (i.e. DNNs trained with backpropagation without pretraining); and thereby

(vii) evaluate the suitability of the proposed DNNs and DBNs for modelling the retina and early vision in comparison with traditional models.

4.4 Simulated Photoreceptor Input Dataset

The retinal network implements the first stages of object recognition and visual information processing in general by performing a variety of image enhancement (e.g. contrast adjustment) and feature extraction routines (e.g. local edge detection). To test the suitability of DBNs and DNNs for learning retinal and early vision functionalities, I have designed a simulated photoreceptor input (SPI) dataset and task, which approximates the detection of light spots or ‘blobs’ in visual input received by photoreceptors. The detection or localisation of blobs, objects, and object parts in scenes constitutes an integral part of visual information processing both within the visual pathway and in computer vision systems. Retinal ganglion cells which have centre-surround receptive fields are capable of signalling the presence of a small light spot in their receptive field centre. The specific response patterns in reaction to different coloured spots depend on the type of the ganglion cell. For example, in an ON-centre ganglion cell, the centre receives excitatory signals resulting from light, while the presence of light in the antagonistic surround causes inhibition. Consequently, maximum excitation occurs when a light spot occupies the centre but not the surround, while the opposite scenario, when light covers the surround but not the centre, results in inhibition. Patterns in-between these extremal cases cause intermediate responses (e.g. moving edges can cause excitations somewhat weaker than the maximal excitatory response triggered by a light spot covering just the centre or, similarly, a large spot covering both the centre and parts of the surround can evoke weaker excitatory responses). The SPI dataset was constructed with the aim of providing an easily controllable test-bed with large amounts of labelled data for evaluating methods on a circular spot detection task. The dataset contains circular spots of various colours and sizes superimposed on a different-coloured, uniform background. To approximate different wavelength light input and obtain a challenging, varied dataset, images with multiple background colours are included, showing circular spots which can overlap each other and thereby build up complex arrangements and shapes. In this simulated environment, there is no limit on the amount of data that can be generated, and classification labels for the detection of circular spots (or blobs) are readily available. Similar input data, such as movies with flashing light spots of various sizes or moving greyscale spots, have been used in experimental studies, e.g., for measuring LGN neural responses in vivo (Piscopo et al., 2013). In our case, circular spots of different sizes and colours on a uniform background simulate the input reaching a small area of the retina, and automatically generated classification labels model responses of early visual processing neurons. Although greyscale image datasets are popular in image processing, computer vision, and machine learning, experiments presented in this chapter use RGB images in order to keep stronger similarity to the photoreceptor input of trichromats. The SPI dataset contains both video data and an image dataset assembled using the videos.

4.4.1 Video Dataset

First, a video dataset of simulated photoreceptor input was obtained by generating several videos consisting of moving circular spots in front of a background. Each video had a different background colour, which was unchanged throughout the video. Multiple groups of videos were produced exhibiting different statistics with respect to the average size and speed of the circles and the expected number of circles per frame. This video dataset is intended for learning models of retinal circuits dedicated to the detection of motion-related features, such as circuits for differential motion detection (Ölveczky et al., 2003) or for the prediction of motion direction (Hosoya et al., 2005).

4.4.2 Image Dataset and Classification Task

Experiments presented in this chapter focus on learning the spatial receptive field structure of retinal cells automatically from suitable input data. To obtain this input, a ‘static’ image dataset was generated by extracting reduced-size subwindows from randomly sampled snapshots taken from a group of videos. The size of the subwindows is similar to the diameter of the largest circles in the dataset. This image dataset was compiled in order to simulate input reaching a small area of the retina, approximately corresponding to the receptive field of a ganglion cell.

Some of the subwindows were chosen to be centred around a circular spot, while the rest did not have a circle positioned in the exact centre of the subwindow. This way, the resulting image data, consisting of the subwindows, can be categorised into two distinct classes. The classification task is thereby given as predicting whether the given image contains a circle centred in the middle, which defines a circle (or blob) detection task (in a different terminology, this task is called circle localisation). This task mimics how ganglion cells are capable of signalling when contrasting light patterns reach the centre and surround of their receptive fields. Results in Section 4.7.1 will show that, in order to solve this task, DBNs develop feature detectors similar to those implemented in the earliest stages of visual processing, including analogues of retinal ganglion cells with centre-surround receptive fields.

4.4.3 Training and Test Set

To generate the image dataset, a number of videos were randomly assigned to either the training or the test set. Example frames taken from the training and test set videos are shown in Figure 4.3(a) and (b), respectively, while Figure A.1 in Appendix A contains example frames from all the training and test set videos used in the experiments. The training set of images was created from 10 000 subwindows sampled in equal numbers from 20 training videos, while the test set contains 5000 subwindows extracted from 10 test videos. The 20 background colours present in the training set and the 10 background colours of the test set are different in order to ensure substantial dissimilarity between training and test examples. The extracted subwindows were 64×64(×3) RGB images, resulting in an input data dimension of 12 288.
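The exact drawing routine is not reproduced here, but a minimal Python sketch of how such labelled 64×64×3 subwindows could be generated is shown below. The circle radii, the number of distractor circles, and the colour sampling are illustrative assumptions only; they are not the parameters used to create the SPI dataset.

import numpy as np

rng = np.random.default_rng(0)
SIZE = 64  # subwindow side length; the input dimension is 64 * 64 * 3 = 12 288

def draw_circle(img, cx, cy, radius, colour):
    # Paint a filled circle of the given colour onto an RGB image.
    yy, xx = np.mgrid[0:SIZE, 0:SIZE]
    img[(xx - cx) ** 2 + (yy - cy) ** 2 <= radius ** 2] = colour

def make_subwindow(positive, background, n_distractors=2):
    # Uniform background plus distractor circles anywhere in the image;
    # positive examples additionally contain a circle at the exact centre.
    img = np.tile(background, (SIZE, SIZE, 1)).astype(np.float32)
    for _ in range(n_distractors):
        draw_circle(img, rng.integers(0, SIZE), rng.integers(0, SIZE),
                    rng.integers(5, 25), rng.random(3))
    if positive:
        draw_circle(img, SIZE // 2, SIZE // 2, rng.integers(8, 28), rng.random(3))
    return img, int(positive)

image, label = make_subwindow(positive=True, background=rng.random(3))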

Figure 4.3 Example frames of the (a) training and (b) test videos, from which subwindows were extracted in order to generate (a) the training and (b) the test set of images. The snapshots show circles of different sizes and colours against a uniform background. Note the presence of overlapping circles. The sets of background colours used in the training and the test videos are mutually exclusive.

Figure 4.4 Examples of subwindows from the (a)–(b) training and (c)–(d) test set, representing the (a) and (c) positive and (b) and (d) negative classes of the circle detection task. As opposed to negative class images, examples of the positive class have a circle positioned in the exact centre of the image. Images can contain circles that overlap and ones that have a highly similar colour to the background.

Examples of the positive class, which contains images with a circle positioned in the exact centre, are shown in Figures 4.4(a) and (c) for the training and test set, respectively, while negative class examples (i.e. images without a centre circle) of the training and test set are shown in Figures 4.4(b) and (d). Additional positive and negative class examples of the test set can be seen in Figure A.2 in Appendix A and in the first rows of Figures 4.9(a) and (b).

4.4.4 Advantages of Simulated Data

Experiments in this chapter utilise synthetic data and automatically generated labels to learn a model of the retina and early vision. As a different approach, electrophysiological data can also be used for retinal modelling: see, e.g., Nirenberg and Pandarinath’s (2012) method discussed in Section 3.4.4.1 of Chapter 3. Yet a different alternative is explored in Chapter 6, which presents my proposed hierarchical LRF-DNN model of visual information processing trained on images of faces taken with heterogeneous camera settings. As opposed to camera-generated natural images or electrophysiological data from animal experiments, the simulated photoreceptor input and the corresponding class labels can be obtained in unlimited quantity at no cost and provide the advantage of good control over the quality of the data. The SPI data and task provide an easy-to-control simulated environment for comparing methods on a circle detection task, which approximates the detection of different coloured spots in retinal input received by photoreceptors. Consequently, the primary goal of the following experiments is to show that even when using a simulated environment during training, deep networks can learn feature detectors similar to those implemented by retinal ganglion cells and can execute key early visual processing tasks. An alternative type of input could be constructed from natural images; however, in this case labels would not be readily available. Another advantage of utilising synthetic datasets is having better control over the data quality: if natural images were used, one would have to ensure that the recording conditions of the footage make it adequate for modelling retinal input. This is far from straightforward, as the kinds of noise and artefacts typically present in natural images are not necessarily the same as those a retina has to deal with. During the training of deep learning methods, particularly ConvNets, synthetic data are often used for complementing or replacing other data sources (see, e.g., Dosovitskiy et al., 2015; Rozantsev et al., 2015; Su et al., 2015). Deep learning methods have been found to make great use of synthetic data, and state-of-the-art computer vision systems routinely apply such datasets and data augmentation in the form of automatically generated or transformed images. Most importantly, however, a key advantage of deep learning, especially of unsupervised methods, is the ease of transferring the training algorithm to other types of data. As the features are learnt directly from input, the proposed models can be adapted in a straightforward manner with the aim of learning retinal features from suitable camera-generated images (e.g. patches of high resolution natural images) or even from substantially different types of data, such as recordings from clinical experiments.

4.5 Methods

As outlined in the previous sections, I propose to use DBNs and DNNs (pretrained as DBNs) for modelling the earliest stages of visual processing and to test the performance of these models on a circular spot detection task using the SPI dataset. Although DBNs with binary visible nodes are quicker to train and finding suitable learning parameters for them is easier, when the data are continuous valued, binary visible nodes are in most cases inadequate, and DBNs with Gaussian visible nodes can provide better models. As the SPI dataset is continuous valued, my proposed models use Gaussian visible nodes, while the hidden nodes are binary. In the following experiments, unsupervised DBN and classifier DNN models were trained as described in Section 2.4.1 and Section 2.4.2 of Chapter 2, respectively, according to the method of Hinton et al. (2006). RBMs were used to pretrain each consecutive layer of the networks, and classifier DNNs were subsequently fine-tuned in a supervised manner on the circle detection task, a two-class classification problem, using backpropagation. For a diagram illustrating a DBN generative model trained on SPI, see Figure 4.1(b).

4.6 Experimental Set-Up

The following sections summarise the details of training the proposed early vision models as well as the quantitative and qualitative evaluation measures used.

4.6.1 Training Protocol

Experiments were conducted on the ‘static’ image dataset of SPI, which contains subwindows extracted from the video dataset. Tests were conducted to evaluate a number of different DBN and DNN parameter choices, including different settings for the number of training iterations and the learning rate. From the trained models, the top performing ones were selected for discussion here. Settings used for training parameters that are not detailed below are the same as those provided by Hinton and Salakhutdinov (2006a,b). DBN architectures with hidden node numbers of 100, 500, and 2000 were examined, and the network depth ranged from 1 to 5 hidden layers. The number of visible nodes on the first layer was given by the dimension of the input data, namely 12 288. Deep networks pretrained as generative models usually require larger numbers of hidden nodes than models trained purely discriminatively (Bengio, 2012). In agreement with the architectural choices described above, Hinton and Salakhutdinov’s (2006a) 3-hidden-layer DBN-based digit classifier also uses 500- and 2000-sized hidden layers on a dataset containing 60 000 training examples (6 times as many as SPI), albeit with lower dimensionality (784). Despite the larger dimension of SPI images, a need for greatly increased layer sizes was not expected, as SPI images can often contain highly correlated pixels. As predicted, my experiments showed that networks with 500 nodes per hidden layer were already capable of learning good models of SPI images, and increasing the layer size required undue additional training time.

Pretraining DBNs were first trained unsupervised in a layer-by-layer manner using RBMs, in order to extract key features from the image data. RBMs were trained with CD1 on mini-batches of size 100 for 10 000 iterations using the whole of the training set in each iteration. Training was conducted with a learning rate of 0.005 on the first hidden layer and 0.1 on consecutive layers. An initial momentum of 0.5 was used for 5 epochs, which was then changed to 0.9. On the first layer of each DBN, an RBM with Gaussian visible nodes and binary hidden nodes was used for pretraining, while on consecutive layers, RBMs with binary visible and hidden nodes were used.
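For illustration, the sketch below outlines a single CD1 weight update for a first-layer RBM with Gaussian visible and binary hidden nodes, using the learning rate and momentum quoted above. It is a simplification rather than the implementation used in the experiments: the visible units are assumed to have unit variance (i.e. standardised inputs), and bias updates and weight decay are omitted for brevity.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b_vis, b_hid, dW, lr=0.005, momentum=0.5):
    # Positive phase: hidden probabilities and samples given the data.
    h0_prob = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)
    # Negative phase: mean-field reconstruction of the Gaussian visibles,
    # followed by the hidden probabilities of the reconstruction.
    v1 = h0 @ W.T + b_vis
    h1_prob = sigmoid(v1 @ W + b_hid)
    # CD1 gradient estimate for the weights, with momentum.
    grad = (v0.T @ h0_prob - v1.T @ h1_prob) / v0.shape[0]
    dW = momentum * dW + lr * grad
    return W + dW, dW

# One mini-batch of 100 standardised 12 288-dimensional inputs, 500 hidden nodes.
W = 0.01 * rng.standard_normal((12288, 500))
dW = np.zeros_like(W)
batch = rng.standard_normal((100, 12288))
W, dW = cd1_step(batch, W, np.zeros(12288), np.zeros(500), dW)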

Fine-Tuning The features learnt by each DBN were subsequently fine-tuned during the supervised training phase in order to classify images into two classes based on the presence of a centre circle. For this, first an output layer was added to enable class label predictions; then, the multi-layer network was fine-tuned using 200 iterations of backpropagation, minimising the cross-entropy error (as described in Section 2.4.2 in Chapter 2).
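A minimal sketch of this supervised stage is given below: the pretrained weights are stacked into a feed-forward network, a two-way softmax output layer is added, and all weights are adjusted by backpropagation of the cross-entropy error. Plain batch gradient descent with an illustrative learning rate is used here for brevity; the optimiser details of the actual experiments follow the cited set-up rather than this sketch.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def finetune_step(x, y_onehot, weights, biases, lr=0.1):
    # Forward pass through the pretrained sigmoid layers, then the softmax output.
    acts = [x]
    for W, b in zip(weights[:-1], biases[:-1]):
        acts.append(sigmoid(acts[-1] @ W + b))
    probs = softmax(acts[-1] @ weights[-1] + biases[-1])
    # Backward pass: cross-entropy gradient propagated down through the layers.
    delta = (probs - y_onehot) / x.shape[0]
    for i in range(len(weights) - 1, -1, -1):
        grad_W = acts[i].T @ delta
        grad_b = delta.sum(axis=0)
        if i > 0:  # propagate the error before updating this layer's weights
            delta = (delta @ weights[i].T) * acts[i] * (1 - acts[i])
        weights[i] -= lr * grad_W
        biases[i] -= lr * grad_b
    return weights, biases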

4.6.2 Evaluation Protocol

The following sections provide qualitative and quantitative analyses of the proposed DBN- and DNN-based retinal models. The DBN models are evaluated by visualising the feature hierarchy learnt by the multi-layer network and examining how the generative model can produce new input data, as well as how it can encode and reconstruct test examples. The performance of fine-tuned classifier DNNs is measured by calculating the precision, recall, and F-measure scores on the classification task described above.

4.6.2.1 Learnt Features

When a deep network is trained on image data, visualising features learnt on the first hidden layer is straightforward, as the in-going weights of a hidden node can be written in the form of the original input image data. Features present on consecutive layers of a deep network can be visualised by an approximate method described in Section 2.6 in Chapter 2, using linear combinations of features on the previous layer to compose the visualisation of a higher-layer feature. Using these methods, visualisations of randomly sampled features will be shown for each layer of a DBN trained unsupervised on the SPI dataset. The process of obtaining visualisations for higher-layer features through composition of previous-layer features will also be demonstrated. To this end, randomly selected features from a higher layer will be displayed together with those lower-layer features that have the strongest connection in magnitude to the given higher-layer feature. Feature compositions on the output layer of a fine-tuned classifier DNN will be examined in order to determine which features are the most important for predicting class labels. It is important to emphasise that, due to the non-linearities on each hidden layer, the feature visualisation technique used on higher layers can only provide an approximation of the implemented function. Consequently, the visualisations shown in Section 4.7.1 cannot fully communicate the complex functionalities of non-linear features at the top layers.
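The following few lines sketch this approximate visualisation: a higher-layer node's in-going weights are projected down to pixel space by composing the lower-layer weight matrices linearly, so the non-linearities are ignored. The weights[k] convention (one matrix per layer, of shape nodes-below × nodes-above) is an assumption made for the sketch.

import numpy as np

def visualise_feature(weights, layer, node, image_shape=(64, 64, 3)):
    # In-going weights of the chosen node, projected down layer by layer.
    projection = weights[layer][:, node]
    for k in range(layer - 1, -1, -1):
        projection = weights[k] @ projection
    return projection.reshape(image_shape)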

4.6.2.2 Generative Model

DBNs trained in an unsupervised manner using RBMs provide a multi-layer generative model of the input data. Newly constructed input can be obtained by first running alternating Gibbs sampling between the top two layers of the network for a large number of steps. Subsequently, a down pass is performed through the network layers using the generative weights to calculate probabilities, according to Equation (2.6) on higher layers and Equation (2.10) on the bottom (visible) layer. Alternatively, generation can be started from an initialisation of the network activations obtained by feeding in an input image from the dataset, where hidden node activations are calculated using Equation (2.9) on the first hidden layer and Equation (2.5) on consecutive layers. This is followed by alternating Gibbs sampling between the top two layers—with, in some cases, added noise (see, e.g., Taylor et al., 2006)—to gradually diverge from the example image. Section 4.7.2 will examine examples of new input images generated by a DBN model trained on the SPI dataset.
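A hedged sketch of this generation procedure is given below: alternating Gibbs sampling is run in the top RBM, followed by a deterministic top-down pass through the generative weights. Binary hidden layers and Gaussian (mean-valued) visible units are assumed, with weights[k] connecting layer k to layer k+1 and vis_biases[k]/hid_biases[k] holding the corresponding RBM biases; these conventions are assumptions of the sketch, not the exact implementation used in the experiments.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def generate(weights, vis_biases, hid_biases, n_gibbs=1000):
    # Alternating Gibbs sampling between the top two layers (the top RBM).
    W_top = weights[-1]
    h_top = (rng.random(W_top.shape[1]) < 0.5).astype(float)
    for _ in range(n_gibbs):
        p_below = sigmoid(W_top @ h_top + vis_biases[-1])
        h_below = (rng.random(W_top.shape[0]) < p_below).astype(float)
        h_top = (rng.random(W_top.shape[1]) <
                 sigmoid(h_below @ W_top + hid_biases[-1])).astype(float)
    # Top-down pass through the remaining layers using the generative weights.
    state = h_below
    for k in range(len(weights) - 2, 0, -1):
        state = sigmoid(weights[k] @ state + vis_biases[k])
    return weights[0] @ state + vis_biases[0]  # Gaussian visibles: return the means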

4.6.2.3 Reconstruction

As a further means of analysis, the ability of unsupervised DBNs to encode and reconstruct previously unseen images will be evaluated qualitatively by examining the reconstruction of positive and negative class examples randomly selected from the test set. A reduced-dimensional encoding of an input image can be obtained in the top layer by first initialising the visible node activations using the input, then calculating activations with a bottom-up pass through the network layers. From such an encoding, a reconstruction can be obtained in the visible layer by calculating top-down activations through the network.
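Using the same assumed parameter layout as in the generation sketch above, the encoding and reconstruction passes can be written as follows (again a mean-field simplification, not the exact implementation):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reconstruct(weights, vis_biases, hid_biases, image):
    # Bottom-up pass to the top-layer code, then top-down pass back to the visibles.
    code = image
    for W, c in zip(weights, hid_biases):
        code = sigmoid(code @ W + c)
    recon = code
    for k in range(len(weights) - 1, 0, -1):
        recon = sigmoid(weights[k] @ recon + vis_biases[k])
    return weights[0] @ recon + vis_biases[0], code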

4.6.2.4 Classification

DNN classifiers, fine-tuned on the circle detection task in a supervised manner using backpropagation, will be evaluated based on the accuracy of classification on the unseen test set. The change in classification performance during the epochs (or iterations) of backpropagation is measured by calculating the precision (P), recall (C), and F-measure (F) scores on the test set. Let tp denote the number of true positives, fp false positives, tn true negatives, and fn false negatives; then the precision is defined as:

P = tp / (tp + fp),    (4.1)

the recall is:

C = tp / (tp + fn),    (4.2)

and the F-measure is given by:

F = 2PC / (P + C).    (4.3)

The precision provides the fraction of true positives out of all positive class predictions made by the classifier. The recall gives the fraction of positive class examples the classifier is able to retrieve. Finally, the F-measure is defined as the harmonic mean of the precision and recall values and provides a single measurement of the detection system’s performance (with 1 constituting the best achievable score).
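Equations (4.1)–(4.3) translate directly into code; a small example with hypothetical counts is included only to show the arithmetic.

def precision_recall_f(tp, fp, fn):
    # Direct transcription of Equations (4.1)-(4.3).
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Hypothetical counts: tp = 450, fp = 50, fn = 30
# give P = 0.90, C = 0.9375 and F approximately 0.918.
print(precision_recall_f(450, 50, 30))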

4.7 Results

Experiments presented in this chapter investigate the capability of DBNs and DNNs to discover features similar to retinal feature detectors and to learn retinal functions without resorting to hard-wired circuits. For this analysis, visualising feature detectors learnt by the deep network models and evaluating classification performance on the circle detection task are equally important.

4.7.1 Learnt Features

Figure 4.5 shows random samples of features learnt unsupervised by RBMs on consecutive layers of a DBN, which had 5 hidden layers with 500 nodes on each hidden layer. These feature visualisations reveal the success of the network in learning a variety of functions typical of early visual processing units within the retina, LGN, and the visual cortex. For example, DoG filters—common models of retinal ganglion cells and LGN cells (see Figure 3.2 in Chapter 3)—and Gabor filters (see Figure 4.2), which approximate V1 simple cell receptive fields, have their analogues within the learnt features (both monochrome and colour variants). Fine-tuning uses backpropagation and therefore commits only minor changes to the weights in lower layers. Instead of making major alterations to feature detectors learnt during the unsupervised phase, the primary goal of fine-tuning is to optimise the weights of the output layer and thereby provide a way of combining high-level features to make classification predictions. Figures 4.6 and 4.7 show features after the fine-tuning stage in a DNN with 3 hidden layers, each containing 500 nodes. In each plot, the first column shows features randomly chosen from a higher layer, while the consecutive columns display those features from the previous layer that have the strongest connections to the corresponding higher-level feature. The connection weight can be seen above each feature.

Figure 4.5 Random samples of features learnt on consecutive layers (panels (a)–(e): first to fifth hidden layers) during the unsupervised pretraining phase in a DBN with 5 hidden layers (500 nodes per hidden layer). Features on higher layers are visualised by linear combinations of previous-layer features. The learnt features contain a number of DoG detectors, Gabor-like filters, and Gaussian derivatives.

Consistent with the previous plot, the majority of feature detectors shown in Figure 4.6 exhibit similarity to retinal ganglion cells (and LGN cells) with easily recognisable centre-surround receptive fields, while the rest contains examples of Gaussian derivatives and detectors similar to Gabor filters. Figure 4.7 shows those feature detectors from the highest hidden layer which possess the strongest connections to the output nodes. As this ensemble consists of well-defined DoG and Gabor-like features, it can be concluded that the network assigns the highest importance to these types of detectors in making classification predictions.


Figure 4.6 Visualisations show how (a) second- and (b) third-layer features in a DNN are composed of (a) first- and (b) second-layer features, respectively. The first column contains higher-layer features, and consecutive images in each row show the strongest connected features from the previous layer together with their weights. Note that RBMs on each layer often group features with similar appearance to form a higher-level feature.


Figure 4.7 Visualisation of the output layer and its connections to the highest, third-layer features in a DNN classifier. The first image in each row shows an output layer node, with the top row corresponding to the positive class (images with a circle centred in the middle) and the bottom row to the negative class. The consecutive images in each row visualise features from the previous layer that have the strongest connections to the given output node (weight shown above). The displayed features contribute strongly to the classifier’s predictions. Note the dominance of various DoG filters and examples of Gabor-like filters.

Results shown in Figures 4.6 and 4.7 thereby demonstrate that, on the SPI dataset and the classification task of circle detection, the network can successfully learn feature detectors which mimic functionalities implemented in the early visual system. It is important to note that an algorithm would not necessarily have to apply DoG filters in order to solve this classification task, as circles can be detected using different types of features. The fact that the deep networks trained here learnt DoG filters to execute this task is therefore an important discovery. Although the resulting networks implement similar functionalities to retinal ganglion cells, the underlying mechanisms, such as the network structure, may exhibit differences. This is because specific connections between retinal cells are not hard-coded into the algorithm but, instead, learnt in a primarily unsupervised fashion. Chapters 5 and 6 will explore how the compositional behaviour of a deep network, whereby higher-level features have a more complex structure than lower-level ones, can be improved using a novel adaptation of the training method and architectural constraints.

Figure 4.8 New instances of data produced by a DBN generative model trained on SPI images.

4.7.2 Generative Model

The generative capabilities of a DBN trained on SPI are analysed in Figure 4.8, which shows samples of new data generated automatically by the DBN after the unsupervised layer-wise pretraining phase. The network contained 5 hidden layers with 500 nodes per hidden layer. These network ‘fantasies’ consist of images with different background colours, typically showing one or two circles which can overlap. Both positive and negative class examples are represented among the generated data, since some of the images contain a circle positioned in the centre while others do not.

4.7.3 Reconstruction

An evaluation of the dimensionality reduction performance of a DBN trained on SPI is provided in Figure 4.9. To this end, randomly selected examples of the test set are displayed in the first row of each plot, followed by their reconstructions in the consecutive row. Figure 4.9(a) shows positive and (b) negative class images. Reconstructions were generated by a 5-hidden-layer DBN with 500 nodes on each hidden layer after the unsupervised layer-wise pretraining phase. This means features of the input images were encoded with a 500-length code in the top layer, from which reconstructions were calculated by a top-down pass.

Figure 4.9 Randomly selected examples of the test set are displayed in the top row of each plot, and their reconstructions generated by a DBN with 5 hidden layers are shown in the consecutive rows. Key features of images are retained, even though reconstructions were calculated from a limited-length (500) code. Reconstructions of positive (i.e. images with a circle centred in the middle) and negative class examples (i.e. images without a circle centred in the middle) are shown in (a) and (b), respectively.

It is important to emphasise that these results were obtained without specifically fine-tuning the model for the optimisation of reconstruction performance. As can be seen from the reconstructed images, the examined DBN model learnt to encode the input data with a limited number of nodes in such a way that key features of even previously unseen test examples are retained in the reconstructions.

Figure 4.10 Example features of a neural network, containing a single hidden layer with 500 hidden nodes, are visualised. The network was trained with backpropagation for 200 iterations on the SPI dataset without pretraining. The features with the highest L2 norms are shown. Note that these features are noisy and less distinctive compared to the previously shown DBN and DNN features obtained with the use of pretraining.

4.7.4 Classification

The precision, recall, and F-measure values achieved on the test set by DNNs with different numbers of hidden layers are shown in Figures 4.11 and 4.12. For comparison, results obtained by single-hidden-layer neural networks trained either with or without the use of pretraining are also displayed. The performance measures are shown as a function of the number of backpropagation epochs. As can be seen, the two networks trained using backpropagation without pretraining perform poorly compared to any one of the pretrained networks. Also, the features learnt by networks without pretraining were noisy and much less distinctive. Example features are shown in Figure 4.10 (the rest are similar in nature). Consequently, this analysis has confirmed that unsupervised layer-wise pretraining of neural networks can significantly improve classification results. Furthermore, the classification experiments also contrasted the performance of single-hidden-layer neural networks with DNNs containing 2–4 hidden layers. As expected, superior results were obtained by multi-layered networks compared to shallow networks, with the 4-hidden-layer DNN showing the best overall performance and achieving the highest F-measure scores. It has been shown that, if certain conditions are met, additional hidden layers improve a DBN generative model (Hinton et al., 2006).


Figure 4.11 Precision and recall. The change in (a) precision and (b) recall during the backpropagation epochs is shown on the unseen test set. Networks contained between 1 and 4 hidden layers, and each hidden layer had 500 hidden nodes, except for the last network, containing 2000 nodes on a single hidden layer. Hidden node counts of network layers are shown in the graphs. In the first four cases, the networks were pretrained using RBMs before the start of backpropagation, as opposed to the last two cases (500 and 2000 BP only), where backpropagation alone was used for training. Pretrained DNNs significantly outperform single-hidden-layer networks.


Figure 4.12 F-measure. The change in F-measure score during the backpropagation epochs is shown on the unseen test set. Networks contained between 1 and 4 hidden layers, and each hidden layer had 500 hidden nodes, except for the last network, containing 2000 hidden nodes on a single hidden layer. In the first four cases, the networks had been pretrained using RBMs before the start of backpropagation, as opposed to the last two cases (500 and 2000 BP only), where only backpropagation was used for training. The F-measure provides a single-value measurement of performance and confirms that pretrained DNNs outperform shallow networks, with the 4-hidden-layer DNN achieving the best scores.

The strength of utilising multi-layered generative models to pretrain DNNs is revealed when examining the graphs in Figures 4.11 and 4.12: deep networks with at least 2 hidden layers can achieve high precision and recall after only a few backpropagation epochs. This shows that the unsupervised pretraining phase initialises the weights of the network to a favourable range, and therefore fewer epochs of backpropagation are sufficient to guarantee good classification performance.

4.8 Summary

In accordance with the flexible probabilistic and data-driven approaches outlined in Chapter 3, this chapter introduced novel DBN- and DNN-based models of the retina and early vision. These deep learning models do not rely on hard-wired circuitries to model the retinal network or the visual cortex; instead, learning of retinal and V1 functionalities is driven by data. Furthermore, it was emphasised that a remarkable structural resemblance exists between the inherently multi-layered retinal network and deep network models. A new simulated photoreceptor input dataset and task have been constructed in order to provide an easy-to-control environment for the modelling of early vision functionalities. On this dataset, first unsupervised DBNs were trained and their generative and reconstruction capacities were analysed. Subsequently, supervised DNN models were evaluated with respect to their classification performance on a circle detection task inspired by early visual processing functions. Analysis revealed that multi-layer models achieved superior classification results compared to shallow models, that pretraining of networks was important for ensuring good performance, and that the DNN containing the highest number of hidden layers produced the overall best results. The analysis also examined features learnt automatically by the networks from simulated photoreceptor input in a primarily unsupervised fashion, with additional supervised fine-tuning. The models successfully learnt a variety of feature detectors resembling DoG filters and also a number of Gabor-like filters. In the literature, DoG filters are the most commonly used models of retinal ganglion and LGN cells with centre-surround receptive fields, while Gabor functions are well-known models of V1 simple cell receptive fields.

The quantitative and qualitative analyses have thereby confirmed that DBNs and DNNs not only bear structural similarity to the retinal network but are capable of learning the functionality of retinal, LGN, and V1 neural structures. Consequently, such methods are highly suitable for modelling early stages of visual processing. Chapter 5 will introduce my proposed extensions of these models: the local receptive field constrained DBN and DNN. These models show improved compositional behaviour, whereby higher-level features have more complex structures and larger scopes than lower-level ones, and implement a mechanism for learning advantageous network structure from data.

5 Learning Local Receptive Fields in Deep Networks

This chapter introduces a new biologically inspired RBM model and training algorithm, the LRF-RBM. This unsupervised method is capable of simultaneously learning local features and an advantageous non-uniform feature detector placement. Furthermore, by utilising LRF-RBMs, novel deep network architectures are developed (LRF-DBNs and LRF-DNNs), which more closely resemble neural structures involved in visual information processing compared to traditional models. The LRF-RBM and the deep learning models described in this chapter were first proposed and evaluated by Turcsany and Bargiela (2014) and Turcsany et al. (2016).

5.1 Motivation

As seen in Chapters 2 and 4, deep learning methods, such as DBNs and DNNs, are powerful tools for learning a hierarchy of feature detectors and thereby providing improved models of the input data. Deep learning systems have achieved state-of-the-art results on a number of challenging tasks, which underlines the importance of developing such flexible models.

Taking inspiration from neural information processing has proven highly beneficial for boosting the performance of machine learning methods. For example, the very concept of deep learning has been inspired by the multi-layer organisation of the cortex.1 While the focus of deep learning research has not been the exact replication of neural connectivity patterns in the brain, DBNs have been shown suitable for modelling visual information processing systems on a more abstract level (Lee et al., 2008; Turcsany et al., 2014).

Developing deep architectures that show greater similarity to biological neural networks, while retaining their flexibility and efficacy on visual tasks, is a crucial step towards improved computational modelling of information processing units within the visual pathway. Consequently, increasing the fidelity of representations learnt by DBNs and DNNs to encodings extracted by visual processing neurons could provide benefits for numerous areas of visual information processing research, including both biological vision modelling and computer vision. My proposed models presented in this chapter increase the structural similarity of deep network models to neural networks of the visual pathway, which, I argue, represents a key step towards this goal.

Receptive Fields Visual neurons in consecutive processing layers typically only receive information from neurons in a local area of restricted size within the previous layer. The receptive field (Sherrington, 1906) of a neuron refers to the area of visual space within which a stimulus can result in a neural response. Receptive fields are highly local in the case of retinal cells, and receptive field size gradually increases from local to more global through different areas of the visual cortex.

A key goal of my research was the extension of deep neural networks with local receptive fields in a way that the training process, the final architecture, and the encoding process, by which representations are calculated, closely resemble biological neural networks of the visual pathway.

1 For a more detailed description of similarities between deep learning algorithms and information processing in the brain, refer to Section 2.2 in Chapter 2.

5.2 The Proposed Methods

My proposed deep learning models and training methods implement local receptive fields within deep belief and deep neural network architectures. These methods have been described by Turcsany and Bargiela (2014) and Turcsany et al. (2016).

LRF-RBM Turcsany and Bargiela (2014) introduced a new type of restricted Boltzmann machine2 (RBM) model, the local receptive field constrained RBM (LRF-RBM), together with a suitable training algorithm. The LRF-RBM training procedure allows for seamless integration of local receptive field constraints into RBMs. Also proposed is a method for concurrently finding advantageous receptive field placement while training the LRF-RBM. This enables the network to utilise a non-uniform distribution of feature detectors over the visual space.

LRF-DBN & LRF-DNN In a subsequent work, Turcsany et al. (2016) described how deep network models can be constructed by stacking LRF-RBMs and pro- posed the local receptive field constrained deep belief network (LRF-DBN) and local receptive field constrained deep neural network (LRF-DNN) models. The training method makes use of the DBN training principles (Hinton et al., 2006) described in Chapter 2, i.e. greedy layer-by-layer pretraining, followed by fine- tuning; however, RBMs are replaced with LRF-RBMs. By utilising LRF-RBMs during pretraining, the proposed method results in a deep network which more

2For an introduction to RBMs, see Section 2.3 in Chapter 2 Chapter 5 Learning Local Receptive Fields in Deep Networks 100 closely resembles biological processing and provides an improved encoding of image features.

5.2.1 Contributions

The rest of this chapter summarises related work and provides a detailed description of my proposed novel methods and contributions, which include:

(i) an adaptation of contrastive divergence learning (CD) (Hinton, 2002) that introduces local receptive field constraints for hidden nodes of RBMs, thereby enabling the training of LRF-RBMs;

(ii) a method for automatically identifying locations of high importance within the visual input space during LRF-RBM training, and utilising these locations as receptive field centres to obtain a compact, yet powerful, encoding of visual features;

(iii) showing that, by using LRF-RBMs with gradually larger Gaussian receptive field constraints for pretraining each consecutive DBN layer, a biologically inspired local receptive field constrained DBN can be constructed, where consecutive layers detect features of increasing size and complexity;

(iv) demonstrating the superior performance of this LRF-DBN generative model compared to DBNs on face completion, a task inspired by processing capabilities of higher areas in the visual cortex (Chen et al., 2010);

(v) furthermore, showing that supervised fine-tuning can be implemented for LRF-DBNs analogously to DBNs, and the resulting LRF-DNNs outperform traditional DNNs (fine-tuned DBNs) on dimensionality reduction of face images.

I have chosen biologically inspired Gaussian receptive field constraints to encourage the learning of local features. These constraints are implemented in the form of Gaussian masks over the network weights, where the weights are written in the shape of the input image data. Training is conducted in such a way that the learnt features are not restricted to have Gaussian shape; rather, the Gaussian constraints encourage the growth of localised features as opposed to global ones.

5.3 Related Work

The following sections introduce related methods and at the same time highlight some important distinguishing properties of LRF-RBMs and LRF-DNNs.

5.3.1 Convolutional Networks

Local receptive fields have been modelled in deep learning through convolutional neural networks (ConvNets) proposed by LeCun et al. (1998). Since then, ConvNets have been widely applied within machine learning and computer vision (see, e.g., Cireşan et al., 2011b; Huang et al., 2012; Kavukcuoglu et al., 2010; Krizhevsky et al., 2012; Lee et al., 2009a; Pfister et al., 2014). Although the architecture of ConvNets was inspired by the organisation of the visual cortex, the prime objective has not been the modelling of biological visual processing. The main emphasis has been on improving the efficiency of learning in multi-layer networks on visual tasks through the use of convolution, thereby scaling up deep learning algorithms to high-dimensional problems.

In contrast with ConvNets, which have a fixed network architecture and hand-crafted feature detector organisation, my proposed adaptive network model supports the exploration of problem-specific feature detector placement. The layout of local receptive fields and the connection patterns are learnt automatically from data, making it possible to exploit particularities of the task at hand.

5.3.1.1 Weight-Sharing

In ConvNets, weights between visible and hidden layers are the same across all image locations, making the feature detection translation invariant. When weights are shared, the same feature detectors operate on each part of the image within a layer and the training procedure can therefore utilise convolution operations.

Translation invariant feature detection sometimes simplifies the learning task; however, spurious detections can often arise, and in some visual recognition tasks, translation invariance may not be advantageous. In many cases, image or video data have strong priors on the location of certain feature types within the visual space. For example, images of landmarks tend to capture sky in the top area; in portraits, eyes are located above the mouth; also, videos are often taken by a fixed camera (e.g. surveillance videos) and contain a large portion of static background in the peripheral areas. In such cases, keeping the feature detection translation invariant throws away important information particular to the given dataset and task. This can be detrimental to performance due to the possibility of false detections arising. For example, when face recognition is conducted on aligned images, the mouth always appears in the same area; therefore, a positive detection of a mouth elsewhere will be false.

In contrast, the proposed LRF-RBM model does not use weight-sharing and it is, therefore, capable of learning the most relevant feature detectors at any one image location. In this model, hidden nodes can only receive input from visible nodes which fall within their local receptive field, the shape of which is controlled by a Gaussian constraint.

5.3.2 Feature Detector Placement

Some deep architectures have been proposed in previous work where local receptive fields are used without weight-sharing; however, these systems typically apply a fixed grid layout for their rectangular receptive fields (Le et al., 2012; Ranzato et al., 2011; Zhu et al., 2013). Coates and Ng (2011) propose a method to select receptive fields for higher-layer nodes by grouping together similar features from the previous layer, which can be useful in the case of non-image data.

In contrast with these methods, the locations of hidden node receptive fields in an LRF-RBM are learnt concurrently with learning the features themselves. Hidden nodes of an LRF-RBM move around the visual space during training to find the best location for their Gaussian receptive field centre. The LRF-DBN training procedure thereby leads to the construction of a deep architecture where the spatial distribution of feature detectors is non-uniform. Also, instead of resorting to a hard-wired feature detector placement, on each layer, an advantageous placement is learnt automatically from data.

5.3.2.1 Feature Hubs

It is often the case for visual data that some image regions express higher variability within the input, with multiple distinctive features being present, while other areas are quasi-uniform and therefore less interesting. By letting the detectors move to locations of interest, dense feature detector coverage can emerge in image regions where the training data have high variation, while more uniform areas will attract fewer detectors. The areas densely covered with feature detectors will be termed 'feature hubs'.

When following this method of training, the structure of the network keeps morphing throughout the training procedure.

Figure 5.1 LRF-RBM training on face images. LRF-RBM model schematic, showing an input image in the visible layer V and local feature detector receptive fields (blue cones) in the hidden layer H. Feature hubs are located around the eyes and mouth.

This self-adaptation results in a network architecture which is capable of extracting compact representations of visual information, enables very quick query time, and, by combining local features as building blocks, the network is strong at reconstructing previously unseen images.

A diagram illustrating the learning of local receptive field locations on face images is shown in Figure 5.1. Furthermore, receptive field distributions in LRF-RBMs after training are provided in Figure 6.1(b) of Chapter 6 and examples of automatically learnt local features in Figure 6.1(c). Figure 6.5 analyses the result of LRF-RBM training on handwritten digit data by showing the spatial distributions of the most distinctive features per digit class.

It is important to note that a non-uniform distribution of feature detectors is also utilised in biological neural networks. As seen in Chapter 3, the structural organisation of the retinal network shows a distinct difference between the centre and the periphery of the visual space. Density of ganglion cells is highest in the centre (fovea), and these midget ganglion cells have very small receptive fields. On the other hand, the periphery contains ganglion cells which connect to a number of bipolar cells and therefore receive input from larger receptive fields (Kolb, 2003; Masland, 2012; Wässle, 2004). This non-uniform organisation provides high resolution vision in the centre and much lower resolution in the periphery.

5.4 Local Receptive Field Constrained RBMs

Section 2.3 in Chapter 2 provided a description of the RBM model and the CD algorithm, an approximate but highly efficient training method for RBMs. Assuming familiarity with these methods, in the following sections I introduce my LRF-RBM model—a new version of RBMs—together with my proposed training algorithm, which can automatically identify advantageous receptive field centres while learning local feature detectors.

5.4.1 Training with Local Receptive Fields

A visual information processing neuron in early stages of the visual pathway typically only receives input from neurons located within a highly local area of the previous processing layer. The area of the photoreceptor layer in which a stimulus can result in a neural response constitutes the receptive field of a visual processing neuron. Moving up the layers, the receptive fields of neurons become gradually larger and their structures show increasing complexity. For example, as discussed in Chapters 3 and 4, the receptive field structure of retinal ganglion cells can be modelled closely by difference-of-Gaussians, while receptive fields of V1 simple cells can be modelled by Gabor filters.

5.4.1.1 Receptive Field Masks

In LRF-RBMs, receptive field constraints are applied to hidden nodes in order to outline the spatial area from which the hidden node is most likely to receive input. These constraints are given in the form of receptive field masks (denoted by $R$) that operate on the RBM weights $W$.

Figure 5.2 Diagram illustrating how the receptive field mask of hidden node $h_j$ is applied in an LRF-RBM. An input image is shown in the visible layer $V$, the weight vector $w_j$ of $h_j$ is displayed in the shape of the input image, and $r_j$ denotes the Gaussian receptive field mask of $h_j$. Receptive field mask $r_j$ provides an additional weighting on top of $w_j$. Darker areas in the mask contain near-zero elements, while transparent areas indicate values closer to one.

Figure 5.2 shows an illustration of a receptive field mask when written in the shape of the input images. Each mask has a centre location which corresponds to a hidden node's location in visual space. An element of the mask describes the likelihood of a connection being present between the underlying visible node and the hidden node, given the distance between the two nodes. The likelihood of a connection converges to 0 as the distance goes to infinity, which means visible nodes further away are less likely to have a connection to the given hidden node.

$R = [r_1, \ldots, r_m] \in [0, 1]^{n \times m}$, which denotes the collection of receptive field masks of all the hidden nodes, is of the same dimension as $W$, with $r_{ij}$ being the receptive field constraint on the connection between visible node $v_i$ and hidden node $h_j$.

The elements of $R$ represent the likelihood of a given connection being present, implying the viability of a training method which samples connections in each training step. However, learning in this way would be prohibitive on a complex task. Here, instead, the elements of $R$ will be used as additional weights on top of $W$. $R$ thereby narrows down the scope of hidden nodes to local neighbourhoods. Note that from the biological modelling point of view, $R$ only provides a constraint or regulariser on the receptive field structure; the actual receptive fields are specified by $R$ and $W$ together. After training, the combination of $R$ and $W$ can show significantly different structures compared to $R$ alone. Still, to keep the description simple, here $R$ will be referred to as the receptive fields.

In the following, I will show how the training described in Section 2.3 in Chapter 2 can be adapted for LRF-RBMs. I conducted experiments showing LRF-RBMs with disk- or square-shaped receptive fields are capable of discovering local features efficiently; however, LRF-RBMs with Gaussian receptive fields learn superior local feature detectors and provide smoother reconstructions with better detail. To keep the computation efficient, Gaussian receptive fields can be truncated. Also, when modelling biological neurons in the early stages of visual processing, Gaussian receptive field constraints are more adequate. Therefore, the description here will focus on the case of Gaussian receptive field constraints. It will also be assumed that a fixed standard deviation (SD) is used for each receptive field within a hidden layer, which will be denoted by $\sigma^{RF}$.
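To make the masks concrete, the sketch below shows one way such Gaussian receptive field constraints could be generated as image-shaped masks and collected into a matrix R of the same dimensions as W. This is a minimal NumPy illustration under my own naming; it assumes a single-channel input written on a height-by-width grid and an optional truncation radius, neither of which is prescribed by the thesis.

```python
import numpy as np

def gaussian_rf_mask(height, width, centre, sigma_rf, truncate=None):
    """Build one receptive field mask r_j in [0, 1], written in image shape.

    The mask peaks at the hidden node's 2D centre and decays with a Gaussian
    of standard deviation sigma_rf; values far from the centre can optionally
    be truncated to exactly zero for efficiency.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    cy, cx = centre
    dist_sq = (ys - cy) ** 2 + (xs - cx) ** 2
    mask = np.exp(-dist_sq / (2.0 * sigma_rf ** 2))      # value 1 at the centre
    if truncate is not None:                             # drop far-away connections
        mask[dist_sq > (truncate * sigma_rf) ** 2] = 0.0
    return mask

def build_R(height, width, centres, sigma_rf):
    """Stack the masks of all m hidden nodes into R with shape (n, m)."""
    masks = [gaussian_rf_mask(height, width, c, sigma_rf).ravel() for c in centres]
    return np.stack(masks, axis=1)   # n = height * width visible nodes, m hidden nodes
```

For multi-channel (e.g. RGB) inputs, the same 2D mask would simply be repeated per channel so that its flattened length matches the number of visible nodes.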

5.4.1.2 Modified Contrastive Divergence Learning

Compared to the RBM model, the energy function of an LRF-RBM incorporates the receptive field masks $R$, and a suitable training algorithm can be obtained by modifying the CD method described in Section 2.3.2 of Chapter 2. Accordingly, the energy functions provided in Equations (2.1) and (2.8) in Sections 2.3.1 and 2.3.3 and, also, Equations (2.5), (2.6), (2.9) and (2.10) can be adapted by substituting $w_{ij}$ with $r_{ij}w_{ij}$, where $r_{ij}$ is the receptive field constraint on the connection between $v_i$ and $h_j$. Therefore, provided both the visible and the hidden nodes are binary, the probability of a hidden node activation is given by:

$$p(h_j = 1 \mid v) = \frac{1}{1 + \exp\left(-b_j - \sum_i v_i r_{ij} w_{ij}\right)}\,, \qquad (5.1)$$

and visible node states can be sampled according to:

$$p(v_i = 1 \mid h) = \frac{1}{1 + \exp\left(-a_i - \sum_j h_j r_{ij} w_{ij}\right)}\,. \qquad (5.2)$$

In the case of LRF-RBMs with Gaussian visible and binary hidden nodes, the activation of a hidden node can be calculated according to:

$$p(h_j = 1 \mid v) = \frac{1}{1 + \exp\left(-b_j - \sum_i (v_i/\sigma_i)\, r_{ij} w_{ij}\right)}\,, \qquad (5.3)$$

while the expected value of a visible node is given by:

X < vi >reconst= ai + σi hjrijwij . (5.4) j

Additionally, the receptive field masks are used to provide a weighting scheme for the weight updates: a connection $w_{ij}$ will receive a stronger update if $v_i$ and $h_j$ are located closer to each other. The modified weight update equation takes the form:

∆wij = rij(< vihj >data − < vihj >reconst) . (5.5) Chapter 5 Learning Local Receptive Fields in Deep Networks 109

5.4.2 Automatically Learning Receptive Field Centres

In a network containing local receptive fields, the layout of hidden nodes can be allocated manually. For example, one could place receptive field centres of hidden nodes at uniform distances from each other in a fixed grid layout, such as in the works of Le et al. (2012), Ranzato et al. (2011) and Zhu et al. (2013). However, such a method does not allow the network architecture to adapt to specific properties of the input data. Non-uniform feature detector distribution can assist in obtaining compact representations by exploiting spatial patterns in the dataset, such as that aligned faces have key facial features at specific locations and that most natural images have the centre of interest in the middle. When solving a task, some areas of the visual space may need to be represented at better resolution, using multiple different feature detectors, while other areas may not convey much information. In the human visual system, the retina also shows a non-uniform distribution of information processing cells between the centre and the periphery, the former being generally denser. This way, better resolution is obtained in the centre of the visual space. Accordingly, the LRF-RBM model introduced in this chapter also utilises a non-uniform feature detector distribution. A training method is proposed that allows for the identification of areas in the visual input space which need a higher number of feature detectors in order to obtain a data representation with great descriptive power.

5.4.2.1 Update of Receptive Fields

In the following, my proposed method is described, which enables the learning of advantageous feature detector placements in LRF-RBMs alongside learning the features themselves. When training the model with modified CD (as described in Section 5.4.1.2), after each complete pass through the training dataset, the receptive fields of hidden nodes undergo an updating process. For each hidden node, this update moves the receptive field centre to the local area which has the strongest connections to the given hidden node and therefore provides the most well-defined feature.

The updating process of receptive field $r_j$, corresponding to hidden node $h_j$, proceeds as follows:

(i) the weights of the hidden node are written in the shape of the input image data (the resulting weight image is denoted by $I^j$);

(ii) an element-wise transformation $N$ is applied to $I^j$;

(iii) values over channels are combined by taking the maximum: $\max_c(N(I^j)_c)$;

(iv) subsequently, the resulting image is filtered using a Gaussian filter $G(\sigma^{RF}, k)$ with SD $\sigma^{RF}$ and filter size $k$;

(v) the location with maximum response is selected as the new centre, i.e. the updated location of $h_j$:

$$(p^j, q^j) = \operatorname*{arg\,max}_{rs}\left[\max_c\left(N(I^j)_c\right) * G(\sigma^{RF}, k)\right]_{rs}\,; \qquad (5.6)$$

(vi) finally, the values of the updated receptive field are given by a Gaussian distribution with SD $\sigma^{RF}$ and mean $(p^j, q^j)$.

I examined element-wise transformations including the identity, absolute value, and squared value, and found the latter two worked similarly well and were superior to identity. The results in Chapter 6 are shown with squared value.
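As an illustration of steps (i)–(vi), a minimal sketch of the centre update for one hidden node is given below, using the squared-value transformation and SciPy's Gaussian filter. The function name, the assumed image memory layout, and the way the filter size k is mapped onto SciPy's truncate parameter are my own assumptions, not details from the thesis.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def update_rf_centre(w_j, height, width, channels, sigma_rf, k):
    """Recompute the receptive field centre of hidden node h_j (steps i-vi).

    w_j : flattened weight vector of h_j, length height * width * channels.
    Returns the new centre (p_j, q_j) and the updated Gaussian mask.
    """
    # (i) write the weights in the shape of the input image
    I_j = w_j.reshape(height, width, channels)
    # (ii) element-wise transformation N: here the squared value
    N_I = I_j ** 2
    # (iii) combine channels by taking the maximum
    resp = N_I.max(axis=2)
    # (iv) smooth with a Gaussian of SD sigma_rf; `truncate` is chosen so that
    #      the effective filter width roughly matches k (an assumption)
    smoothed = gaussian_filter(resp, sigma=sigma_rf, truncate=k / (2.0 * sigma_rf))
    # (v) the location of the maximum response becomes the new centre
    p_j, q_j = np.unravel_index(np.argmax(smoothed), smoothed.shape)
    # (vi) the updated mask is a Gaussian with SD sigma_rf centred at (p_j, q_j)
    ys, xs = np.mgrid[0:height, 0:width]
    r_j = np.exp(-((ys - p_j) ** 2 + (xs - q_j) ** 2) / (2.0 * sigma_rf ** 2))
    return (p_j, q_j), r_j
```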

5.5 Local Receptive Field Constrained Deep Networks

As described in Section 2.4 in Chapter 2, DBNs and DNNs can be trained efficiently according to Hinton et al.'s (2006) protocol of first pretraining the network in a greedy layer-by-layer fashion using RBMs, then fine-tuning the network weights by, e.g., backpropagation. Similarly, my proposed training procedure for LRF-DBNs and LRF-DNNs adopts a two-stage protocol.

5.5.1 Pretraining

In line with the DBN training procedure, one can train multiple layers of feature detectors on top of an LRF-RBM hidden layer using either RBMs or LRF-RBMs. This new type of deep learning model will be termed the local receptive field constrained DBN (LRF-DBN). As an example, the biologically inspired LRF-DBN architecture, with which the experiments in Chapter 6 were conducted, utilises LRF-RBMs with increasing receptive field sizes to pretrain consecutive layers of the network. Pretraining of an LRF-DBN starts by:

(i) training an LRF-RBM on the input image data to learn the first hidden layer;

(ii) subsequently, the 2D positions of hidden nodes become fixed, and the receptive field masks (R) will not change further;

(iii) hidden node activations are calculated according to Equation (5.1), in the case of binary visible and hidden nodes, or Equation (5.3), in the case of Gaussian visible and binary hidden nodes; and

(iv) these hidden node activations can subsequently be used as input data to train the consecutive hidden layer of the LRF-DBN.

When traditional RBMs are used for training higher layers of the network, from this point on, the training procedure is the same as in the DBN case described earlier. If, on the other hand, LRF-RBMs are applied when training consecutive layers, the hidden node activations on the trained layer need to be arranged in 2D space to provide suitable input data to the next LRF-RBM. During this process the locations of hidden nodes, optimally chosen by the LRF-RBM, have to be preserved. The input used for training the next layer with an LRF-RBM is obtained by the following procedure:

(i) for each example image of the dataset, first a single-channel zero-valued image of the same width and height as the original input data is constructed;

(ii) the activation of each hidden node is then calculated and added to the value at the fixed location of the given hidden node;

(iii) subsequently, an LRF-RBM with binary visible nodes is trained on this newly constructed input, in the same way as described in Section 5.4 but with the receptive field constraints of the higher layer being applied during training, as sketched below.
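A minimal sketch of this input construction is given below: each hidden node's activation is added into a single-channel image at that node's fixed 2D location, and the resulting images serve as the visible data of the next LRF-RBM. The function name and array shapes are my own choices for illustration.

```python
import numpy as np

def next_layer_input(hidden_acts, centres, height, width):
    """Arrange hidden activations in 2D to form input for the next LRF-RBM.

    hidden_acts : (batch, m) activations of the trained LRF-RBM layer
    centres     : list of m (row, col) hidden node locations learnt by that layer
    Returns a (batch, height * width) array of single-channel images.
    """
    batch, m = hidden_acts.shape
    images = np.zeros((batch, height, width), dtype=hidden_acts.dtype)
    for j, (p, q) in enumerate(centres):
        # add each node's activation at its fixed location; several nodes may
        # share a pixel, so their contributions accumulate
        images[:, p, q] += hidden_acts[:, j]
    return images.reshape(batch, -1)
```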

In the majority of my experiments presented in Chapter 6, receptive field constraints of consecutive layers are given by Gaussians with gradually increasing standard deviation. Such models are inspired by the hierarchical organisation of the visual cortex, where each consecutive processing layer utilises feature detectors with increasing scope. It is also intuitive to construct models where the lower layers are trained with LRF-RBMs, while the highest layers use traditional unconstrained RBMs. Chapter 6 summarises my experimental analysis with both types of models.

Figure 5.3 (a) Schematic of an LRF-DBN containing a visible (V) and 3 hidden layers (H1–H3) with receptive fields of increasing scope (blue cones). (b) Schematic of an LRF-DNN autoencoder used for dimensionality reduction of face images. Prior to supervised fine-tuning with backpropagation, the pretrained LRF-DBN layers are used for initialising the encoder (V–H3) and, after unrolling, the decoder part (H3–H6) of the autoencoder. (c) LRF-DBN used for face completion, where missing pixels of the input are filled in by the generative model through up-down passes in the network.

A schematic diagram of an LRF-DBN model with receptive fields of increasing sizes is shown in Figure 5.3(a), while Figure 5.3(c) illustrates the application of an LRF-DBN generative model on a face completion task, where the generative model is used for filling in missing pixel values in an image.

5.5.2 Fine-Tuning

Once an LRF-DBN has been trained with either RBMs or LRF-RBMs on higher layers, the model can be fine-tuned analogously to DBNs. Multiple types of DBN and DNN models and suitable fine-tuning methods have been introduced in Chapter 2 (for details, refer to Section 2.4). When LRF-DBNs are fine-tuned discriminatively with backpropagation, the resulting neural networks will be called local receptive field constrained DNNs (LRF-DNNs).

In Section 6.4.3 of Chapter 6, experiments conducted with autoencoder LRF-DNNs are evaluated. These results demonstrate the ability of LRF-DNNs to learn a compact reduced-dimensional encoding of image features and generate high-quality reconstructions from these codes. A schematic diagram illustrating the construction and training procedure of LRF-DNN autoencoders is shown in Figure 5.3(b), where an LRF-DNN is applied to dimensionality reduction of face images. The encoder part of the network is initialised with the pretrained LRF-DBN layers, which are subsequently unrolled to obtain initialisation for the decoder part. Finally, backpropagation is used for minimising the squared reconstruction error.
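To illustrate the unrolling step, the sketch below initialises an autoencoder from a stack of pretrained (LRF-)RBM layers: the encoder uses the mask-weighted weights, and the decoder reuses their transposes in reverse order, before backpropagation fine-tunes the whole network on the squared reconstruction error. Treating every unit as logistic and folding the masks into the effective weights are simplifying assumptions of mine, not a description of the thesis implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def unroll_autoencoder(pretrained):
    """Initialise an autoencoder from pretrained (LRF-)RBM layers.

    pretrained : list of (W, R, b_hid, a_vis) tuples, bottom layer first;
                 for unconstrained RBM layers, R can be an all-ones matrix.
    The encoder uses the mask-weighted weights R * W; the decoder reuses
    their transposes (the 'unrolled' copy later refined by backpropagation).
    """
    encoder = [(R * W, b) for (W, R, b, _) in pretrained]
    decoder = [((R * W).T, a) for (W, R, _, a) in reversed(pretrained)]
    return encoder + decoder

def forward(layers, x):
    """Propagate data through all layers (logistic units assumed throughout)."""
    for W, b in layers:
        x = sigmoid(x @ W + b)
    return x

def squared_reconstruction_error(layers, x):
    """The objective minimised during fine-tuning: ||reconstruction - data||^2."""
    return np.sum((forward(layers, x) - x) ** 2)
```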

5.5.3 Notation

In the following, a general notational guide for LRF-DBNs and LRF-DNNs is provided, which will be utilised in Chapter 6. In the case of DBNs, the number of hidden nodes on consecutive layers will be referred to as the architecture of the DBN and these counts will be used as a notation, while the architecture of an LRF-DBN will be defined by the number of hidden nodes on consecutive layers and also the type of model that was used for training each layer. In the notation, LRF-RBM layers will have (L) attached. For example, one would denote the architecture of an LRF-DBN that contains 2000, 1000, 500, and 100 hidden nodes on consecutive hidden layers and was trained using LRF-RBMs on the first two and traditional RBMs on the top two hidden layers as 2000(L)-1000(L)-500-100.

If an LRF-DBN or DBN is then fine-tuned as an autoencoder, technically, the architecture includes the unrolled decoder layers too. However, to keep the description and notation concise, it is convenient to refrain from mentioning the hidden node counts of the decoder when referring to or denoting the architecture of an LRF-DNN or DNN autoencoder. For the same reason, the number of visible nodes will also be omitted from the notation in the following chapters.

5.6 Summary

This chapter introduced my proposed LRF-DBN and LRF-DNN multi-layered models, whose structure has been inspired by neural information processing units in the visual pathway. During LRF-DBN training, my novel LRF-RBM model is used for pretraining individual layers of the network, followed by fine-tuning of the entire network. To train LRF-RBMs unsupervised, a modified version of the CD method has been presented together with a mechanism for automatically learning advantageous feature detector placement from data. Using LRF-RBMs for pretraining enables the deep network to utilise a non-uniform spatial distribution of feature detectors and to learn local feature detectors of increasing scope on each consecutive layer.

6 Deep Network Models of Visual Processing

This chapter evaluates my novel deep learning models, introduced in Chapter 5, on tasks inspired by information processing capabilities of the human visual cortex. LRF-DBNs are tested on face completion, known to be executed by high-level processing areas of the visual cortex. The performance of LRF-DNN autoencoders is measured on dimensionality reduction of image data, as low-dimensional encoding of input data patterns is an essential requisite of efficient information processing. Analysis in this chapter is based on experiments described by Turcsany and Bargiela (2014) and Turcsany et al. (2016).

6.1 Experiments

The following experiments aim at testing the ability of the proposed LRF-RBM and LRF-DBN models to accumulate feature hubs in important areas of the visual space and learn a hierarchy of distinctive local feature detectors. Using the challenging Labeled Faces in the Wild (LFW) (Huang et al., 2007) face recognition dataset, I will demonstrate how the LRF-DBN feature hierarchy can be utilised to represent characteristic details of previously unseen face images.

In my first experiments, LRF-DBNs were trained on a training set, assembled from LFW, to learn generative models of the image data. The quality of the generative models was evaluated on multiple face completion tasks, where the models were required to restore missing pixels in previously unseen test images. Face completion is a complex task implemented in the human brain that involves higher-level processes in the OFA and FFA of the visual cortex (Chen et al., 2010).

Subsequently, LRF-DNN autoencoders were trained to learn a compact low-dimensional encoding of image features, followed by the evaluation of reconstruction performance on an unseen test set. According to the efficient coding hypothesis (Barlow, 1961), the drive to develop compact encoding mechanisms is thought to underlie the evolution of numerous visual information processing functions in the visual pathway.

6.2 Dataset

6.2.1 The ‘Labeled Faces in the Wild’ Dataset

The LFW1 dataset contains 13 233 RGB images of public figures, and it is currently one of the most popular face image datasets within computer vision and machine learning. The version of LFW used in my experiments consists of images automatically aligned by deep funnelling (Huang et al., 2012). The dataset has been constructed from an internet-based collection of real-life photos taken of public figures in unconstrained settings. Consequently, one can notice significant variations in appearance and camera settings among the images. Example LFW face images can be seen in the first rows of Figure 6.11 and Figure 6.19.

1 Available at http://vis-www.cs.umass.edu/lfw/

6.2.2 Task

The dataset is commonly used for evaluating methods on face verification, i.e. predicting whether two face images are taken of the same person, without having previously seen any images of the person(s) during training. RBMs with rectified linear or binary hidden nodes, first trained in an unsupervised manner on single faces and subsequently fine-tuned in a supervised manner on pairs of images, have been shown to achieve good results on this face recognition task (Nair and Hinton, 2010).

Applying supervised fine-tuning methods on pairs of face images lies outside the scope of my work, which focuses on developing novel models of biological visual systems. Therefore, the study presented here will primarily concentrate on investigating the capability of the proposed LRF-RBM, LRF-DBN, and LRF-DNN models to:

(i) learn a feature hierarchy constructed from local feature detectors,

(ii) identify regions of high importance in LFW face images, and

(iii) utilise these as feature hubs to provide compact representations of visual information.

Accordingly, the analysis will compare:

(i) the quality and location of features learnt by LRF-RBMs and RBMs,

(ii) the generative capabilities of LRF-DBN and DBN models, and

(iii) the ability of LRF-DNN and DNN autoencoders to encode and reconstruct previously unseen face images.

6.2.3 Preprocessing

I applied similar preprocessing to LFW images as Nair and Hinton (2010) and trained RBMs with binary hidden nodes on single faces using their published parameter settings. These models were then compared with LRF-RBMs trained on the same data.

The preprocessing involved cropping the central 105×153 pixel part of the original 250×250 RGB images, thereby eliminating much of the background. These background pixels are known to unintentionally provide helpful context for recognition. The cropped images were then subsampled to 27×39(×3), resulting in an input data dimension of 3159. Finally, the input data were standardised along each component to have zero mean and unit variance, which makes it possible to write σ_i = 1 in Equations (2.8) to (2.10) in Chapter 2 and Equations (5.3) and (5.4) in Chapter 5, thereby simplifying the training.

A training and test set were formed, with 4000 training and 1700 test examples. The test set did not contain any images of persons present in the training set.
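For concreteness, one possible implementation of this preprocessing pipeline is sketched below using Pillow and NumPy. The exact crop offsets, the resampling filter, and the function name are my assumptions; only the crop size, the 27 × 39 × 3 = 3159 input dimension, and the per-component standardisation come from the description above.

```python
import numpy as np
from PIL import Image

def preprocess_lfw(paths):
    """Crop, subsample, and standardise LFW images as described above."""
    data = []
    for p in paths:
        img = Image.open(p).convert('RGB')                 # 250 x 250 RGB image
        left, top = (250 - 105) // 2, (250 - 153) // 2     # central crop (assumed centred)
        img = img.crop((left, top, left + 105, top + 153)) # 105 x 153 pixels
        img = img.resize((27, 39))                         # subsample to 27 x 39
        data.append(np.asarray(img, dtype=np.float64).ravel())   # length 3159
    X = np.stack(data)                                     # (num_images, 3159)
    # Standardise each component over the set to zero mean and unit variance
    mean, std = X.mean(axis=0), X.std(axis=0) + 1e-8
    return (X - mean) / std
```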

6.2.4 The MNIST Digit Dataset

The MNIST database of handwritten digits (LeCun et al., 1998) is the most commonly used publicly available dataset for handwritten digit recognition. The dataset contains 60 000 training and 10 000 test set examples of the 10 digit classes. Example images are shown in the first row of Figure 6.5. In addition to the experiments on LFW faces, I also trained LRF-RBMs on MNIST, in order to identify which areas in the handwritten digit images would attract feature detectors and thereby result in the emergence of feature hubs.

6.3 Experimental Set-Up

6.3.1 Training Protocol

My experiments aimed at comparing LRF-RBMs, LRF-DBN generative models, and LRF-DNN autoencoders with traditional RBMs, DBNs, and DNNs as well as evaluating different parameter choices for my proposed methods.

6.3.1.1 RBM and DBN Training

To establish baseline results, I trained fully connected RBM and DBN models on LFW without receptive field constraints according to the training procedure of Nair and Hinton (2010) and Hinton and Salakhutdinov (2006a) (described in Sections 2.3 and 2.4 of Chapter 2).

RBMs were trained with CD1 on mini-batches of size 100 for 2000 iterations. An optimal learning rate of 0.001 was used during training; higher learning rates failed. An initial momentum of 0.5 was applied for 5 epochs, which was then changed to 0.9. To extract the first representational layer of a DBN from input data, RBMs with Gaussian visible nodes and binary hidden nodes were used. Subsequent layers were trained with RBMs containing binary visible and hidden nodes. Hidden node numbers of 4000, 2000, 1000, 500, and 100 were tested, and each consecutive layer in a network had a smaller size than the previous layer. All architectures satisfying these requirements were included.

6.3.1.2 LRF-RBM and LRF-DBN Training

LRF-DBNs were trained using the same settings, except for the learning rate, where a value of 0.1 was found optimal for LRF-RBMs on the first layer with Gaussian visible nodes, while a value of 1 worked best for LRF-RBMs on subsequent layers with binary visible nodes. When top layers of an LRF-DBN were trained with RBMs, the learning rate was set to 0.001 on these layers. Both RBMs and LRF-RBMs were found to be able to learn good models within a few hundred iterations, after which performance only slightly improved. In the following, results are displayed for models trained for 2000 iterations.

The analysis focused on LRF-DBNs where higher layers had increasing receptive field sizes. The Gaussian receptive field constraints investigated had σ^RF between 0.5 and 13.0, and filter size k ranging from 1 to 19. I also studied LRF-DBNs where the top few layers were trained using RBMs without receptive field constraints. In Section 6.4, if not stated otherwise, the receptive field constraint parameters on n consecutive LRF-RBM layers are given by the first n elements of σ^RF = (3, 5, 7, 9, 11) and k = (5, 7, 9, 11, 13).
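The sketch below collects these settings into a single hypothetical configuration object, mainly to make the layer-wise differences in learning rate and receptive field parameters easy to see at a glance. The dictionary keys and structure are mine; the values are those reported in this section and in Section 6.3.1.1.

```python
# Hypothetical configuration summarising the reported training settings;
# the key names and structure are mine, not taken from the original code.
TRAINING_CONFIG = {
    'cd_steps': 1,                      # CD1
    'batch_size': 100,
    'iterations': 2000,
    'momentum': {'initial': 0.5, 'after_5_epochs': 0.9},
    'learning_rate': {
        'rbm_layers': 0.001,            # unconstrained RBM layers
        'lrf_rbm_first_layer': 0.1,     # Gaussian visible nodes
        'lrf_rbm_higher_layers': 1.0,   # binary visible nodes
    },
    # Default receptive field constraints for consecutive LRF-RBM layers:
    # the first n elements are used for an n-layer LRF-RBM stack.
    'sigma_rf': (3, 5, 7, 9, 11),
    'filter_size_k': (5, 7, 9, 11, 13),
}
```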

6.3.1.3 Autoencoder Training

Using the weights learnt by LRF-DBN and DBN models for initialisation, LRF-DNN and DNN autoencoder networks were trained in an analogous manner to Hinton and Salakhutdinov (2006a), as described in Sections 2.4.3 and 5.5.2 in Chapters 2 and 5. Backpropagation was conducted for 200 iterations to minimise the squared reconstruction error (SRE) (i.e. the squared distance between the original data and its reconstruction) on the training set. From this definition follows that the lowest achievable SRE is 0, where the reconstruction exactly matches the data, and, in general, there is no upper bound on the SRE. If the components of both the data and the model predictions follow a standard normal distribution, then 4 gives an upper bound on the average SRE per component.

On this dimensionality reduction task, all those architectures mentioned above were used where the size of the top hidden layer, which provides the reduced-dimensional encoding, was either 100 or 500 nodes.

6.3.2 Evaluation Protocol

The evaluation concentrates on assessing the quality of the LRF-DBN and DBN generative models, contrasting the feature hierarchies learnt by the two types of models, and comparing the reconstruction performance of LRF-DNN and DNN autoencoders.

6.3.2.1 Learnt Features

The spatial distributions of feature detectors were examined and feature hubs were identified in LRF-RBMs used for training either the first or consecutive layers of DBNs. As a visualisation tool, 'centre maps' were constructed using Parzen-window density estimation with a Gaussian kernel on 2D feature detector locations, in order to identify densely populated areas. Another means of evaluation was given by 'receptive field maps': heat maps generated by combining all receptive fields within an LRF-RBM through pixel-wise summation. (In the case of MNIST images, per class receptive field maps were also examined as described in Section 6.4.1.2.) When visualisations are shown (see, e.g., Figure 6.6), both in centre maps and in receptive field maps, darker red areas correspond to larger values, i.e. locations which are more densely covered by feature detectors, while blue areas indicate the presence of fewer detectors.

Feature detectors learnt by the different models were compared using the feature visualisation method described in Section 2.6 of Chapter 2. As the weights of a hidden node in an RBM correspond to pixels, the visualisation of RBM features can be obtained by showing the weight vectors of hidden nodes in the shape of the input images. For LRF-RBMs, the same visualisation method can be used after the receptive field masks are applied to the weights by element-wise multiplication. Visualisation of higher-layer features in a DBN or an LRF-DBN is obtained by the approximate method described in Chapter 2, whereby these higher-level feature visualisations are calculated using linear combinations of feature visualisations from the previous layer. By examining centre maps, receptive field maps, and feature visualisations, DBN and LRF-DBN features were compared based on the distinctiveness of their appearance and locations.
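A minimal sketch of these two visualisations is given below: the centre map smooths the learnt 2D detector locations with a Gaussian kernel (a Parzen-window estimate), and the receptive field map sums all masks pixel-wise. The function names and the bandwidth value are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def centre_map(centres, height, width, bandwidth=2.0):
    """Parzen-window density estimate of 2D feature detector locations.

    Placing a unit mass at each learnt centre and smoothing with a Gaussian
    kernel is equivalent to Parzen-window estimation with that kernel; the
    bandwidth value here is an arbitrary choice for illustration.
    """
    counts = np.zeros((height, width))
    for p, q in centres:
        counts[p, q] += 1
    return gaussian_filter(counts, sigma=bandwidth)

def receptive_field_map(R, height, width):
    """Heat map combining all receptive field masks by pixel-wise summation.

    R : (n, m) matrix whose columns are the flattened masks of the m hidden nodes.
    """
    return R.sum(axis=1).reshape(height, width)
```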

6.3.2.2 Face Completion Task

To evaluate the quality of LRF-DBN generative models, multiple face completion tasks were set out. In these problems, certain pixels of the input images are occluded and, with the location of the occlusion being given, the model is required to fill in the missing pixels. The occlusions and the infilling procedure were similar to the ones used by Ranzato et al. (2011), and performance was evaluated on seven different face completion tasks:

(i) occlusion of the left,

(ii) right,

(iii) bottom, or

(iv) top half of the image (these tasks collectively will be referred to as the ‘side completion’ tasks),

(v) rectangular occlusion around the eyes or

(vi) mouth area, and

(vii) occlusion of 70% of the image pixels chosen at random.

The areas of occlusion can be seen in Figures 6.11 to 6.14. A schematic diagram in Figure 5.3(c) in Chapter 5 illustrates the use of an LRF-DBN generative model on a random area completion task.

To solve a completion task with a generative model, when presented with an image, the occluded pixels were first set to 0 and the image was used as input to the model. Bottom-up propagation through the network layers was first performed, where hidden node probabilities were given by Equation (5.3) on the first and Equation (5.1) on the consecutive hidden layers. This was followed by a top-down pass through the network. When reaching the visible layer at the bottom, the states of visible nodes corresponding to uncorrupted pixels were left unchanged, while the missing pixels were filled in using Equation (5.4). Multiple iterations of bottom-up and top-down passes through the network layers were used to compute a restored image.

Face completion performance of LRF-DBN and DBN generative models was compared on the unseen test set quantitatively, by calculating the average SRE per missing pixel, and qualitatively, by evaluating example completions.
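The infilling procedure can be sketched as follows: occluded pixels are zeroed, repeated bottom-up and top-down passes are performed, and after each pass the known pixels are clamped back to their original values while the occluded ones keep the model's prediction. The sketch below uses deterministic mean-field passes instead of sampling and assumes a Gaussian visible layer with unit variance; these, together with the data structures and names, are simplifications of mine.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def complete_face(v, known_mask, layers, n_passes=15):
    """Fill in occluded pixels with a (LRF-)DBN generative model.

    v          : (n,) image vector with occluded entries set to 0
    known_mask : (n,) boolean array, True for uncorrupted pixels
    layers     : list of (W_eff, b_hid, a_vis) per layer, bottom first, where
                 W_eff already includes any receptive field masks (R * W)
    """
    v = v.copy()
    for _ in range(n_passes):
        # Bottom-up pass (Eq 5.3 on the first layer, Eq 5.1 above it)
        h = v
        for W, b, _ in layers:
            h = sigmoid(h @ W + b)
        # Top-down pass; intermediate layers use logistic units, while the
        # visible layer is reconstructed with the linear Gaussian mean (Eq 5.4)
        for i, (W, _, a) in enumerate(reversed(layers)):
            pre = h @ W.T + a
            h = pre if i == len(layers) - 1 else sigmoid(pre)
        # Clamp: keep the uncorrupted pixels, accept the infilled ones
        v[~known_mask] = h[~known_mask]
    return v
```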

6.3.2.3 Reconstruction Task

LRF-RBM and RBM reconstructions were calculated on the test set during the iterations of CD using Equations (5.3) and (5.4) and Equations (2.9) and (2.10), respectively. The analysis involved comparing both the obtained SRE scores and examples of reconstructed images.

LRF-DNN and DNN image encodings and reconstructions can be obtained by computing the activations through the layers of the fine-tuned networks after feeding in an example image. The dimensionality reduction performance of LRF-DNN and DNN autoencoders was evaluated on the test set quantitatively, by measuring the average SRE per pixel, and qualitatively, by displaying example reconstructions.

Figure 6.1 LRF-RBM training on face images. (a) LRF-RBM model schematic, showing feature hubs around the eyes and mouth. (b) Receptive field maps of four LRF-RBM models, containing 4000 hidden nodes each, trained with different parameter settings. In each map, automatically learnt feature detector receptive fields are combined to show areas attracting more detectors. Darker red areas indicate higher feature density. Feature hubs emerged around average eye and mouth locations at pixels (9, 15), (20, 15), and (15, 30). (c) Distinctive-looking detectors located in feature hubs in a 4000-hidden-node LRF-RBM. From top to bottom row: features, learnt unsupervised, detecting the persons' right eye, left eye, nose, and mouth can be seen. The second map in (b) belongs to the LRF-RBM which learnt the local features in (c).

6.4 Results

6.4.1 Learnt Features

Visualisation methods described in Section 6.3.2.1 will be used to identify the types of features LRF-RBMs on consecutive layers of an LRF-DBN extract from LFW face images and also to examine the spatial distribution of feature detectors learnt on LFW and MNIST.

Figure 6.2 A sample of RBM features with high L2 norms, exhibiting a local peak around parts of the face contour.

6.4.1.1 First-Layer Features Learnt on LFW

Figures 6.1(c) and 6.7(a), columns 2–5 show local facial feature detectors learnt by LRF-RBMs, while RBM features can be seen in Figures 6.2 and 6.7(d), first row, columns 2–5. All the traditional RBMs have learnt detectors similar in nature to the ones shown, with the majority exhibiting global structure and focusing on the main outline of a given type of face (as can be seen in Figure 6.7(d), first row, columns 2–5). While there exists a small number of detectors containing a local peak around the face contour, these are elementary in structure. Examples of such feature detectors are displayed in Figure 6.2. When examining RBM and DBN models, the lack of detectors focusing on local facial features is apparent. It is hard to find any well-defined local detector modelling face parts or a single facial feature, such as a dedicated eye or nose detector.

The LRF-RBMs, on the other hand, have attracted feature hubs around eye and mouth regions and, by focusing on these areas, have learnt a number of distinctive eye, mouth, and nose detectors. Receptive field maps in Figures 6.1(b) and 6.6(a) show the spatial arrangement of detectors learnt by LRF-RBMs with different parameter settings. The layout of features, with the emergence of feature hubs around key areas in face images, demonstrates how LRF-RBMs can identify important regions within the input data, which need a higher density of feature detectors in order to represent their details.

Figure 6.3 Examples of non-face-specific features learnt by an LRF-RBM with 4000 hidden nodes. DoG and Gabor filter shaped features are common among the detectors.

Figure 6.1(c) shows a set of features sampled from areas around feature hubs in an LRF-RBM with 4000 hidden nodes. These visualisations show well-defined local features corresponding to distinctive-looking detectors of the eyes, mouth, and nose. Alongside these face specific detectors, Gabor filters (illustrated in Figure 4.2 in Chapter 4) and DoG detectors (see Figure 3.2 in Chapter 3) were also common among the learnt features, especially in areas around the face contour. Examples of such features are shown in Figure 6.3, with additional examples provided in Figure B.1 in Appendix B. As seen in Chapters 3 and 4, in the literature, DoG filters are the most common models of retinal ganglion cells with centre-surround receptive fields (see Section 3.2.5), while Gabor functions are well-known models of V1 simple cells. The experiments thereby confirmed LRF-RBMs and LRF-DBNs not only bear structural similarity to networks in the visual pathway but have the capability to learn functions of certain retinal and V1 cells from image data in an unsupervised manner.

If a second layer is trained with an RBM (without receptive field constraints) on top of the examined LRF-RBM, global face detectors emerge. These features are well-defined and varied-looking, as can be seen in Figure 6.4, which displays a sample of the second-layer features in a 4000(L)-1000 LRF-DBN.

Figure 6.4 A sample of second-layer features learnt by a 4000(L)-1000 LRF-DBN, where the first layer was trained with an LRF-RBM and the second with a traditional RBM. The learnt features correspond to characteristic-looking faces.

6.4.1.2 Features Learnt on MNIST

On the MNIST digit dataset LRF-RBMs learnt feature detectors corresponding to local digit parts and also features of lower complexity, akin to DoG and Gabor filters. LRF-RBMs learn hidden node locations in an unsupervised manner; however, for visualisation purposes on the MNIST dataset, one has the opportunity to use the available class labels. Receptive field maps in Figure 6.5 show, for each digit class, the location of those feature detectors which are the most selective for the given class. These maps give an insight into which image areas are most important for distinguishing between different digits.

Figure 6.5 LRF-RBM receptive field maps learnt on the MNIST digit dataset. The top row shows example images of digits from 0 to 9. In the bottom row, for each digit class, a heat map shows locations of detectors which most selectively fire on images of the given digit.

LRF-RBM receptive field & centre maps LRF-DBN, k: (5, 7, 9, 11), σRF : (3, 5, 7, 9)

Filter size: 1 5 9 13 Layer: 1 2 3 4

10 10 20 20 30 30

10 20 10 20

(a) (b)

LRF-DBN, k: (5, 7, 9, 11), σRF : (1, 3, 5, 7) LRF-DBN, k: (9, 11, 13, 15), σRF : (3, 5, 7, 9)

Layer: 1 2 3 4 Layer: 1 2 3 4

10 10 20 20 30 30

10 20 10 20

(c) (d) Figure 6.6 Receptive field and centre maps. (a) First-hidden-layer receptive field and centre maps shown in the top and bottom row, respectively, corresponding to different k values. (b)–(d) Receptive field maps in top rows and centre maps in bottom rows shown for consecutive layers from left to right in 2000(L)-1000(L)-500(L)-100(L) LRF- DBNs trained with different parameter choices for SD σRF and filter size k.

6.4.1.3 Feature Hierarchy

Figure 6.6(b)–(d) shows receptive field and centre maps learnt by 2000(L)-1000(L)-500(L)-100(L) LRF-DBNs trained using different parameter settings. These maps reveal that feature hubs have typically emerged in the centre region, around average eye, mouth, and nose locations.

A sample of higher-layer features learnt in an unsupervised manner with a 2000(L)-1000(L)-500(L)-100 LRF-DBN is visualised in Figure 6.7(a)–(c). This feature hierarchy demonstrates how increasing the size of receptive field constraints between consecutive layers results in features of gradually increasing scope and complexity. Additional example features are provided in Figure B.2(a)–(c) in Appendix B.

Figure 6.7 Feature hierarchy of (a)–(c) an LRF-DBN and (d) a DBN with 4 hidden layers. Visualisation of features demonstrates how consecutive layers are composed using features from the previous layer. The first image in each row in (a)–(d) visualises a feature from a higher layer, while consecutive images in the row illustrate those features from the previous layer which have the strongest connections to the given higher-layer feature. The LRF-DBN feature hierarchy in (a)–(c) is formed of features with gradually increasing size and demonstrates part-based composition.

The receptive field and centre maps corresponding to this network can be seen in Figure 6.6(b), columns 1–3, with feature hubs around average eye and mouth locations and parts of the face contour. First-layer features include highly local detectors focusing on parts of the eyes, mouth, and nose. Additionally, DoG and Gabor filter shaped detectors were also common (see, e.g., rows 2 and 10 in Figure B.2(a) in Appendix B). Among the second-layer features, one can easily notice well-defined eye, nose, and mouth detectors, while the third layer contains detectors of varying scope, including face part detectors and larger, near-global face detectors. Finally, Figure 6.7(c) demonstrates how the unconstrained RBM on the top layer is capable of combining different face part detectors to create global face detectors. Consequently, the experiments have shown that, from highly local features, the network is capable of building increasingly larger features and executes hierarchical part-based composition.

Features learnt by a traditional DBN of the matching 2000-1000-500-100 architecture are shown in Figure 6.7(d). A larger set of example features is provided in Figure B.2(d)–(f) in Appendix B. DBN features are mainly global, even on the first layer, and correspond to whole faces. Features on higher layers respond to clusters of faces with similar appearance by utilising the combination of lower-layer global face detectors. In contrast to our LRF-DBN feature detectors, DBN features do not show apparent part-based composition.

Architecture                        Left   Right  Top    Bottom  Eyes   Mouth  Rand

DBN
500                                 0.89   0.91   0.93   0.92    0.59   0.61   0.23
1000-500                            0.89   0.91   0.92   0.90    0.56   0.58   0.26
2000-1000-500                       0.90   0.92   0.94   0.92    0.50   0.52   0.27
4000-2000-1000-500                  0.90   0.92   0.93   0.90    0.45   0.47   0.26

LRF-DBN
500(L)                              0.88   0.94   0.96   0.95    0.59   0.53   0.18
1000(L)-500(L)                      0.78   0.83   0.85   0.82    0.37   0.39   0.17
2000(L)-1000(L)-500(L)              0.74   0.76   0.76   0.69    0.33   0.36   0.20
2000(L)-1000(L)-500                 0.87   0.84   0.87   0.85    0.37   0.40   0.20
4000(L)-2000(L)-1000(L)-500(L)      0.73   0.76   0.77   0.66    0.34   0.37   0.22

Table 6.1 Average squared reconstruction error per missing pixel on the previously unseen test set shown for different generative models on the left, right, top, bottom, eyes, mouth, and random area completion tasks. The table compares multiple DBN and LRF-DBN architectures (also included are a 500 RBM and a 500(L) LRF-RBM). On all seven tasks, LRF-DBNs achieved the best scores.

6.4.2 Face Completion

Figures 6.8 and 6.9 and Table 6.1 show quantitative comparisons of RBM, DBN, LRF-RBM, and LRF-DBN generative models on the face completion tasks on the previously unseen test set, while a qualitative evaluation is provided through example completions in Figures 6.10 to 6.14.

6.4.2.1 Quantitative Comparison

Figures 6.8 and 6.9 and Table 6.1 compare SRE scores of an LRF-RBM, an RBM, and multiple LRF-DBN and DBN architectures achieved on the face completion tasks on the unseen test set. In Figures 6.8 and 6.9, SRE per missing pixel is shown as a function of the number of up-down passes that were performed through the network layers to infer missing values. On all the completion tasks, Chapter 6 Deep Network Models of Visual Processing 133

[Figure 6.8 plots: (a) Left completion, (b) Right completion, (c) Top completion, (d) Bottom completion; squared error per pixel vs. number of up-down passes; the legend lists the compared RBM, DBN, LRF-RBM, and LRF-DBN models, grouped by 100 and 500 top nodes.]

Figure 6.8 Average SRE per missing pixel on the test set shown on the (a) left, (b) right, (c) top, and (d) bottom face completion tasks as a function of the number of up-down passes. The comparison includes an LRF-RBM, an RBM, and multiple LRF-DBN and DBN architectures after the layer-wise pretraining (without fine-tuning). Trends on the different side completion tasks are similar, with LRF-DBNs outperforming DBNs.

[Figure 6.9 plots: (a) Completion (eyes), (b) Completion (mouth), (c) Completion (random); squared error per pixel vs. number of up-down passes; the legend lists the compared RBM, DBN, LRF-RBM, and LRF-DBN models, grouped by 100 and 500 top nodes.]

Figure 6.9 Average SRE per missing pixel shown on the test set as a function of the number of up-down passes on the (a) eyes, (b) mouth, and (c) random area face completion tasks. Networks are compared after layer-wise pretraining (without fine-tuning). LRF-DBNs show significant improvement with multiple up-down passes and outperform DBNs.

On all the completion tasks, those LRF-DBNs which were trained using LRF-RBMs on each layer show gradual improvement with the number of passes and achieve lower SREs than DBNs. In contrast, after a few passes, the reconstruction error in DBNs steeply increases and, as the qualitative analysis also confirms, DBN completions can become very dissimilar to the original image.


Figure 6.10 From a left-occluded input, shown in the first column, DBN (top row) and LRF-DBN (bottom row) completions are calculated after 1, 5, and 30 up-down passes (left to right). Unlike DBNs, LRF-DBNs generate completions which gradually become more similar to the original image (shown in last column).

Completions generated by a 2000-1000-500 DBN and the matching 2000(L)-1000(L)-500(L) LRF-DBN for an example test image with left-side occlusion are shown in Figure 6.10 after 1, 5, and 30 passes through the network. Visual inspection confirms that, unlike the completions generated by the DBN, LRF-DBN completions gradually show more likeness to the original image as the number of infilling iterations increases.

Table 6.1 displays SRE scores of the networks examined in Figures 6.8 and 6.9. On each completion task, SREs of LRF-DBNs and the LRF-RBM are calculated after the 15th pass, while in the case of DBNs and the RBM the best SRE value achieved during the first 15 passes is shown.2 On each task, the top performing models were LRF-DBNs.

Face completion is, in general, a difficult task requiring higher-level knowledge of patterns in the data. In fact, face completion is implemented by higher processing areas in the human visual cortex. Therefore it is not surprising (as seen in Figures 6.8 and 6.9(a)–(b)) that LRF-RBMs, having highly local features, do not perform that well on these tasks. Through composition of local features, deeper LRF-DBN models, however, have the capability to extract higher-complexity features from the input and thereby achieve superior results.

On most of the side completion tasks, the lowest errors were achieved by the 4-hidden-layer LRF-DBN, while the 2000(L)-1000(L)-500(L) LRF-DBN performed best on eye and mouth area completion. The 2-hidden-layer LRF-DBN outperformed deeper networks on random occlusion infilling, which is likely due to this task being less reliant on global features.

2 As a result, the displayed scores for LRF-DBNs are suboptimally chosen, whereas for each DBN the optimal, best-achieved scores are shown. This avoids giving any potential advantage to LRF-DBNs.

6.4.2.2 Qualitative Comparison

Figures 6.11 to 6.14 qualitatively compare face completions generated by the 2000(L)-1000(L)-500(L) LRF-DBN and the 2000-1000-500 DBN on example images of the unseen test set (shown in the top rows). Results on the left, right, top, and bottom completion tasks are displayed in Figures 6.11 and 6.12 in this order. For each task, the first row illustrates the area of missing pixels, followed by DBN and LRF-DBN completions in the subsequent rows. The number of up-down passes in each case was selected as described above for Table 6.1.

It can be concluded that the overall quality of LRF-DBN completions is superior: images look smoother, more face-like, and show more similarity to the original faces. The LRF-DBN also managed to better recover small facial details, such as mouth shapes and smiles (see, e.g., column 6, left completion and column 10, right completion in Figure 6.11) as well as eye shapes and direction of gaze (see, e.g., last two columns, right completion in Figure 6.11). The same tendency can be observed on the eyes, mouth, and random area completion tasks, for which the results are displayed in Figures 6.13 and 6.14.


Figure 6.11 Example test images (first row) and their left and right completions (from top to bottom). For each task, the first row shows the occluded image, the second row shows completions generated by a DBN, and the third row shows those computed by an LRF-DBN. LRF-DBN completions have better image quality, show more likeness to the original image, and better retain eye and mouth shapes and facial expressions.


Figure 6.12 Example test images (first row) and their top and bottom completions (from top to bottom). For each task, the first row shows the occluded image, the second row shows completions generated by a DBN, and the third row shows those computed by an LRF-DBN. LRF-DBN completions have better image quality: the images are smoother, more face-like, and show greater similarity to the original face images.


Figure 6.13 Example test images (first row) and their eye and mouth area completions. For each task, the first row shows the occluded image, the second row shows completions generated by a DBN, and the third row shows those computed by an LRF-DBN. Facial expressions show greater similarity to the original images in the LRF-DBN completions, compared to DBNs, and an overall better image quality is achieved.


Figure 6.14 Example test images and their random area completions. Following the original images in the first row, the second row shows the input occlusion (where 70% of image pixels were occluded), while the third and fourth rows contain the DBN and LRF-DBN completions, respectively. LRF-DBN completions contain less noise and exhibit a high level of similarity to the original images.

The superior image quality and face similarity of LRF-DBN completions are especially pronounced on the mouth completion task, where mouth shapes, smiles, skin colours, and facial expressions are well retained by LRF-DBNs and the completed area blends in with the surrounding pixels.

6.4.3 Reconstruction

To provide a quantitative analysis, SRE scores of LRF-DNN and DNN autoencoders are compared on the unseen test data in Tables 6.2 and 6.3 and Figures 6.15 and 6.16, demonstrating the superior encoding and reconstruction capabilities of LRF-DNNs.

[Figure 6.15 plot: error per pixel vs. number of CD iterations; legend: LRF-RBM 500, LRF-RBM 4000, RBM 500, RBM 1000, RBM 4000.]

Figure 6.15 Average SRE per pixel shown on the test set as a function of the iterations of CD. Method names and hidden node counts constitute the labels in the graph. The LRF-RBMs were trained with σRF = 3 and k = 5. LRF-RBMs outperform RBMs of the matching architecture.

This is confirmed by the qualitative evaluation of examples displayed in Figures 6.19 to 6.21. Furthermore, the following sections also examine different architectural and parameter choices for LRF-DNNs in Figures 6.17 and 6.18.
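For readers less familiar with the training iterations referred to in Figure 6.15, a single CD-1 update for a binary RBM, with an optional receptive field mask restricting weight updates as in an LRF-RBM, can be sketched roughly as follows (the masking strategy and all names are illustrative assumptions rather than the exact procedure of Chapter 5):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def cd1_step(v0, W, b, c, lr=0.01, rf_mask=None):
        # v0:      batch of visible vectors, shape (batch, n_visible)
        # W:       weight matrix, shape (n_visible, n_hidden)
        # b, c:    visible and hidden biases
        # rf_mask: optional mask of shape W.shape restricting each hidden unit
        #          to a local receptive field (None recovers a standard RBM)
        h0 = sigmoid(v0 @ W + c)
        h0_sample = (np.random.rand(*h0.shape) < h0).astype(float)
        v1 = sigmoid(h0_sample @ W.T + b)           # reconstruction of the visible layer
        h1 = sigmoid(v1 @ W + c)
        dW = (v0.T @ h0 - v1.T @ h1) / len(v0)      # positive minus negative phase
        if rf_mask is not None:
            dW *= rf_mask                           # keep updates inside the local receptive field
        W += lr * dW
        b += lr * (v0 - v1).mean(axis=0)
        c += lr * (h0 - h1).mean(axis=0)
        return W, b, c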

6.4.3.1 Quantitative Comparison

In Figure 6.15, SREs obtained on the unseen test set by LRF-RBMs and RBMs are shown as a function of the iterations of CD.3 With the same node count, LRF-RBMs achieve significantly lower errors than RBMs. Moreover, a 500-node LRF-RBM performs similarly to a 4000-node RBM.

Figure 6.16 shows autoencoder results with a code length of 500 on the left and 100 on the right. For a given architecture, both the traditional and the LRF variant networks are included, and SREs are shown as a function of the number of backpropagation iterations. Figure 6.16(a)–(b) displays SREs for the complete set of LRF-DNN and DNN architectures described in Section 6.3.1, while Figure 6.16(c)–(d) compares different architectural choices in detail.

3 Consequently, these models are not fine-tuned.

[Figure 6.16 plots: (a) reconstruction, all networks with 500-length code; (b) reconstruction, all networks with 100-length code; (c) reconstruction with 500-length code, individual architectures; (d) reconstruction with 100-length code, individual architectures; squared error per pixel vs. number of backpropagation iterations; legends in (a)–(b) group the curves into fine-tuned RBM, DNNs, fine-tuned LRF-RBM, and LRF-DNNs.]

Figure 6.16 Average SRE per pixel on the unseen test set shown as a function of autoencoder fine-tuning iterations. The code length is 500 in (a) and (c) and 100 in (b) and (d). The ‘fine-tuned RBM’ and all DNNs were previously pretrained with RBMs on each layer, while LRF-DNNs and the ‘fine-tuned LRF-RBM’ were pretrained with LRF-RBMs. The LRF-DNNs achieve lower SREs than DNNs.

Autoencoders in Figures 6.16(b) and (d) had a hidden layer with 100 nodes at the top; consequently, training aimed at reducing any 3159-dimensional input image of the dataset to a compact 100-length code, from which the image can be reconstructed.
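Schematically, the encoding and decoding path of such an autoencoder looks as follows (a forward-pass sketch only, with randomly initialised weights, no biases, and logistic units; in the actual experiments the weights come from (LRF-)RBM pretraining and are subsequently fine-tuned with backpropagation on the reconstruction error):

    import numpy as np

    rng = np.random.default_rng(0)
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

    layer_sizes = [3159, 2000, 1000, 500, 100]       # encoder half, ending in the 100-length code
    enc_W = [rng.normal(0, 0.01, (m, n)) for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
    dec_W = [W.T.copy() for W in reversed(enc_W)]    # decoder initialised as the transposed encoder

    def encode(x):
        for W in enc_W:
            x = sigmoid(x @ W)
        return x                                     # 100-length code

    def decode(code):
        for W in dec_W:
            code = sigmoid(code @ W)
        return code                                  # 3159-length reconstruction

    x = rng.random(3159)                             # stand-in for one input image vector
    recon = decode(encode(x))
    sre = np.mean((x - recon) ** 2)                  # squared error per pixel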

Architecture                      SRE

DNN
500                               0.093
1000-500                          0.162
2000-1000-500                     0.172
4000-2000-1000-500                0.178

LRF-DNN
500(L)                            0.086
1000(L)-500(L)                    0.079
2000(L)-1000(L)-500(L)            0.090
2000(L)-1000(L)-500               0.103
4000(L)-2000(L)-1000(L)-500(L)    0.102

Table 6.2 SRE per pixel of DNN and LRF-DNN (including fine-tuned RBM and LRF-RBM) autoencoders with a code length of 500. Scores were measured on the test set after 200 backpropagation iterations. Superior results were obtained by LRF-DNNs.

Results show that all multi-layer LRF-DNN architectures compared favourably to any one of the DNNs, and even the single-layer LRF autoencoder, the fine-tuned LRF-RBM, provided comparable reconstruction errors to the best performing DNN. As expected, multi-layer LRF-DNNs surpassed shallow architectures containing a single hidden layer. This is in contrast with DNNs, where most multi-layer architectures provided worse reconstructions than a shallow network.

In Figures 6.16(a) and (c), results with 500-length encodings are shown. Having a higher dimensional code layer is known to reduce the advantage of deep autoencoders compared to shallow networks (Hinton and Salakhutdinov, 2006a). In the case of DNN encoders, the advantage of deep models completely diminished on this task as none of the multi-layer DNNs performed better than the single-layer models, the fine-tuned RBM or LRF-RBM. On the other hand, a number of LRF-DNNs still retained superior performance compared to the shallow models. Akin to the 100-length encoding case, all multi-layer LRF-DNNs achieved better results than DNNs, and the LRF-RBM outperformed the RBM.

Architecture                              SRE

DNN
100                                       0.266
500-100                                   0.278
1000-500-100                              0.300
2000-1000-500-100                         0.293
4000-2000-1000-500-100                    0.293

LRF-DNN
100(L)                                    0.255
500(L)-100(L)                             0.191
1000(L)-500(L)-100(L)                     0.180
2000(L)-1000(L)-500(L)-100(L)             0.180
2000(L)-1000(L)-500(L)-100                0.222
4000(L)-2000(L)-1000(L)-500(L)-100(L)     0.184

Table 6.3 SRE per pixel of DNN and LRF-DNN (including fine-tuned RBM and LRF-RBM) autoencoders with a code length of 100. Scores were measured on the test set after 200 backpropagation iterations. Results confirm the superior performance of LRF-DNNs, with the 3- and 4-hidden-layer LRF-DNNs achieving best scores.

To further compare the architectures shown in Figure 6.16(c)–(d), Tables 6.2 and 6.3 display SRE scores calculated at the end of fine-tuning, after 200 iterations of backpropagation. SREs in the case of 100-length codes (as shown in Table 6.3) were the lowest at 0.180, achieved by the LRF-DNNs with 3 and 4 hidden layers: 1000(L)-500(L)-100(L) and 2000(L)-1000(L)-500(L)-100(L), while the 2-hidden-layer 1000(L)-500(L) LRF-DNN produced the best 500-length codes with an SRE of 0.079 (see Table 6.2).

6.4.3.2 Architectural and Parameter Choices

Figures 6.17 and 6.18 examine different architectural and parameter choices for LRF-DNNs. Cases where RBMs were used for training some of the higher layers of an LRF-DNN are evaluated in Figure 6.17.

[Figure 6.17 plot: reconstruction with different pretraining methods; squared error per pixel vs. number of backpropagation iterations; legend: 2000-1000-500-100 DNN, 2000(L)-1000-500-100, 2000(L)-1000(L)-500-100, 2000(L)-1000(L)-500(L)-100, 2000(L)-1000(L)-500(L)-100(L) LRF-DNN.]

Figure 6.17 Comparison of SREs achieved by networks trained with different pretraining methods. All networks had 2000, 1000, 500, and 100 nodes on consecutive hidden layers. The architecture pretrained entirely with LRF-RBMs outperformed all other networks, while the DNN (pretrained with RBMs on all layers) obtained the highest errors.

As can be seen, these networks provide inferior reconstruction scores compared to an LRF-DNN pretrained using only LRF-RBMs; however, they still compare favourably to a DNN.

The choice of SD σRF is evaluated in Figure 6.18(a) in the case of a 2000(L)-1000(L)-500(L) network architecture and in Figure 6.18(b) for a 2000(L)-1000(L)-500(L)-100(L) architecture. Filter sizes were kept at k = (5, 7, 9) and k = (5, 7, 9, 11), respectively. The network with σRF = (3, 5, 7) performed the best among the 3-hidden-layer networks, while in the case of the 4-hidden-layer architecture the network with σRF = (5, 7, 9, 11) achieved the lowest SREs. In the filter size comparison in Figures 6.18(c) and (d), σRF was fixed at (3, 5, 7) and (3, 5, 7, 9), respectively, and the networks using k = (9, 11, 13) and k = (9, 11, 13, 15) on consecutive layers obtained the best scores within the two groups of networks.

Both the SD and filter size tuning experiments showed a good stability in performance within a large range of parameter values.

[Figure 6.18 plots: (a) receptive field SD σRF, 500-length code; (b) receptive field SD σRF, 100-length code; (c) receptive field filter size k, 500-length code; (d) receptive field filter size k, 100-length code; squared error per pixel vs. number of backpropagation iterations; legends list the σRF and k settings compared and the DNN of the corresponding architecture.]

Figure 6.18 Comparison of different parameter choices for (a)–(b) SD σRF and (c)–(d) filter size k. The architecture of the LRF-DNNs compared in (a) and (c) was 2000(L)-1000(L)-500(L), while the architecture was 2000(L)-1000(L)-500(L)-100(L) in (b) and (d). The graphs also include comparisons to the DNNs of the corresponding architecture. LRF-DNNs show good stability over different σRF and k values.

With most settings, variation in SREs was small, especially when compared to the performance difference between LRF-DNNs and DNNs. This trend is confirmed for both the 500- and 100-length encoding tasks. Consequently, it can be concluded that the reconstruction performance of LRF-DNNs is not too sensitive to the choice of σRF and k.
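To give an intuition for what the two hyperparameters control, a Gaussian receptive field profile with standard deviation σRF, truncated to a k x k neighbourhood around a centre location, could be constructed as below (an illustrative reading of these parameters; the precise constraint definition and the way centres are learnt follow Chapter 5, and the image size and centre used here are placeholders):

    import numpy as np

    def gaussian_rf_mask(img_h, img_w, centre, sigma_rf, k):
        # Gaussian receptive field weights around `centre`, zero outside a k x k window
        cy, cx = centre
        ys, xs = np.mgrid[0:img_h, 0:img_w]
        mask = np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma_rf ** 2))
        half = k // 2
        outside = (np.abs(ys - cy) > half) | (np.abs(xs - cx) > half)
        mask[outside] = 0.0                  # restrict the field to the local neighbourhood
        return mask

    # e.g. one first-layer hidden unit with sigma_rf = 3 and filter size k = 5
    # (64 x 48 is an illustrative image size, not the dimensions of the LFW crops)
    mask = gaussian_rf_mask(64, 48, centre=(32, 24), sigma_rf=3, k=5)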


Figure 6.19 Example RBM and LRF-RBM reconstructions. To obtain reconstructions, the models were first presented with an unseen test image (shown in the first row) and the corresponding activations of the 4000 hidden nodes were measured, followed by the calculation of reconstructions in the visible layer. The second and third rows show reconstructions produced by an RBM and an LRF-RBM, respectively, trained without fine-tuning. Note how small details, e.g. eye and mouth shapes or the direction of gaze, are better retained with the LRF-RBM due to the number of specialised eye and mouth detectors. Note, also, how images of side-facing people can confuse the RBM but not the LRF-RBM (see columns 9–10).

6.4.3.3 Qualitative Comparison

Using unseen test images, Figure 6.19 compares the reconstruction quality of an LRF-RBM and an RBM, both containing 4000 binary hidden nodes. The models were not fine-tuned, and results are displayed after 2000 iterations of CD. Reconstructions are given by the top-down activations after feeding in a test image. As seen in Figure 6.15, LRF-RBMs obtain lower SREs. This superior reconstruction performance is confirmed by the qualitative analysis. LRF-RBMs better retain distinctive details of face images; moreover, the original facial expressions, direction of gaze, and emotions are easily recognisable in the reconstructed images.


Figure 6.20 Example reconstructions from 500-length codes. A sample of the test data is shown in the first row. Reconstructions generated by a 2000-1000-500 DNN and a 2000(L)-1000(L)-500(L) LRF-DNN autoencoder are displayed in the second and third rows, respectively. Note how distinctive details, such as eye and mouth shapes (see, e.g., column 6) or direction of gaze (e.g. column 8), are better retained with an LRF-DNN due to the number of specialised eye and mouth detectors. LRF-DNN reconstructions also show an overall superior image quality.

Side-facing people provide a challenge for the models as such head poses are less frequent in the dataset. However, unlike the RBM, the LRF-RBM can still provide good quality reconstructions that capture the side-facing pose of these faces (see columns 9–10).

A qualitative evaluation was also conducted on the deep autoencoder networks, which were trained to reduce the dimension of the input to either 500- or 100-length codes. Examination of reconstructed test images in Figures 6.20 and 6.21 confirms the superior performance of LRF-DNN autoencoders compared to DNNs. Example test images are shown in the first rows, while their reconstructions obtained from reduced-length codes with a DNN and an LRF-DNN of the matching architecture are given in the second and third rows, respectively. The dimension of the encoding was 500 in Figure 6.20 and 100 in Figure 6.21.


Figure 6.21 Example reconstructions from 100-length codes. A sample of the test data is shown in the first row. Reconstructions generated by a 2000-1000-500-100 DNN and a 2000(L)-1000(L)-500(L)-100(L) LRF-DNN autoencoder are displayed in the second and third rows, respectively. Note the superior image quality and how distinctive details, such as eye and mouth shapes (see, e.g., last column), are better preserved with an LRF-DNN due to the number of specialised eye and mouth detectors.

Close similarity of the original and the reconstructed images indicates that LRF-DNNs can encode and reconstruct key facial features and facial expressions of even the unseen test images using a code with a very limited length. DNN reconstructions show less likeness to the original faces and facial expressions. Characteristic details, especially around the eyes, nose, and mouth, are much better preserved with an LRF-DNN (see, e.g., third column in Figure 6.20 or the last column in Figure 6.21). Such facial features can be of crucial importance for distinguishing persons. The superior image quality of LRF-DNN reconstructions is also apparent, with the LRF-DNN providing conspicuously smoother and more natural-looking images than the DNN.

The quantitative and qualitative analyses thereby confirmed that LRF-DNNs outperform DNNs (and, also, LRF-RBMs outperform RBMs) when reconstructing previously unseen data.

6.5 Summary

Section 6.4.1 has shown how LRF-RBMs are capable of identifying important regions in image data, termed feature hubs, which require a higher density of feature detectors to obtain an improved representation. Feature detectors learnt on consecutive layers in traditional DBNs and LRF-DBNs from the LFW face images were also analysed. Unlike DBNs, LRF-DBNs were found to learn feature hierarchies exhibiting part-based composition, ranging from DoG and Gabor filters and local feature detectors of the eyes, mouth, and nose to face part and whole face detectors.

As discussed in Sections 6.4.2 and 6.4.3, the LRF-DBN and LRF-DNN models achieved the best scores on face completion and reconstruction tasks on the unseen test set. This superior performance underlines the importance of part-based compositionality and demonstrates the great generalisation capability of hierarchical models containing features of gradually increasing scale and complexity.

Furthermore, experiments presented here also confirmed that increasing the similarity of deep learning methods to information processing mechanisms in the visual pathway can lead to performance improvements on complex visual tasks.

7 Conclusions

The following chapter summarises the theoretical foundations and contributions of this thesis, presented in Chapters 2, 3 and 5, and discusses findings gathered through experimental analysis in Chapters 4 and 6. Both the advantages and shortcomings of the proposed methods are discussed, based on which, in Chapter 8, I suggest possible extensions of this work and attractive directions for related future research.

7.1 Overview

Chapter 2 discussed how the concept of deep learning gained remarkable popularity in recent years, leading to deep neural networks and convolutional neural networks largely replacing shallow models, such as SVMs, in machine learning and computer vision. This changeover was sparked by the development of an efficient training algorithm for neural networks with multiple hidden layers (Hinton et al., 2006; Hinton and Salakhutdinov, 2006a).

Chapter 2 provided a detailed description of this method, whereby multi-layered network models are first pretrained layer by layer as a DBN using unsupervised RBMs. DBNs can serve as generative models, which allow for the production of new data according to the input data distribution.

The pretraining phase is followed by fine-tuning of the entire network model to optimise performance on a specific task. For example, classifier or autoencoder deep neural networks can be obtained by fine-tuning with backpropagation.

Despite great uptake in computer vision, apart from a limited number of successful attempts, deep learning has not been widely applied within biological vision modelling. Additionally, the focus of those few studies has mainly been the modelling of higher-level processes in the visual cortex; consequently, the earliest stages of visual processing, implemented by the retina, have not been modelled in detail.

Chapter 3 emphasised that developing improved models of the retina can have widespread benefits across different domains, such as medical research, retinal prosthesis design, and also visual information processing and computer vision research. As described, there remain a number of uncertain details and open questions regarding the workings of the retina. Therefore, I argued that for modelling in this environment a flexible data-driven approach is most adequate. In agreement with such an approach, Chapter 4 has demonstrated the great suitability of deep learning methods for modelling the retina and early vision.

Still less knowledge is available about higher-order processing in the visual cortex. To address the modelling of such complex mechanisms, I proposed novel deep learning architectures and flexible training algorithms. Chapter 5 introduced my local receptive field constrained DBN and DNN models and Chapter 6 showed their great capability for executing high-level visual information processing tasks, such as face completion.

LRF-DBNs and LRF-DNNs are pretrained with my proposed novel version of an RBM, the LRF-RBM, which imposes biologically inspired Gaussian constraints on feature detector receptive fields. These constraints limit the spatial scope of detectors, thereby encouraging the emergence of local features. Furthermore, concurrently with learning the appearance of local feature detectors, LRF-RBMs are able to identify areas of the visual space that require a higher number of feature detectors to represent their details. This way, during training an advantageous non-uniform feature detector placement is learnt, where feature hubs (i.e. areas densely covered by feature detectors) can emerge in key locations. Thereby, the LRF-RBM training procedure implements a form of structure learning.

Inspired by the hierarchical organisation of the visual pathway, training of LRF-DBNs and LRF-DNNs can utilise LRF-RBMs with increasing receptive field sizes as building blocks for constructing a network model, where each consecutive layer contains feature detectors of increasing scope and complexity.

Furthermore, experimental analysis in Chapters 4 and 6 confirmed that the examined deep learning models not only show structural resemblance to the inherently multi-layered retinal network and visual cortex but are capable of learning information processing functions implemented in these areas automatically from data.
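The training pipeline summarised above can be condensed into a few lines of pseudocode (a schematic sketch only; train_rbm and hidden_activations are placeholders for the (LRF-)RBM training and inference routines of Chapters 2 and 5):

    def pretrain_dbn(data, layer_sizes, train_rbm, hidden_activations):
        # Greedy layer-wise pretraining: each layer is trained as an RBM on the
        # activations produced by the layer below.
        rbms, layer_input = [], data
        for n_hidden in layer_sizes:
            rbm = train_rbm(layer_input, n_hidden)             # unsupervised CD training
            rbms.append(rbm)
            layer_input = hidden_activations(rbm, layer_input)  # input for the next layer
        return rbms                                             # the stacked RBMs form the DBN

    # The stack can then be unrolled into a deep autoencoder or topped with a
    # classifier layer and fine-tuned end-to-end with backpropagation.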

7.2 Key Findings and Contributions

For the modelling of visual information processing, this thesis proposed novel methods inspired by the hierarchical, multi-layered organisation of the retina and the visual cortex, whereby higher-layer features are composed of lower-layer ones and generally exhibit larger scope and higher complexity. The experimental analysis evaluated deep learning models on both early-stage and high-level visual tasks.

7.2.1 Modelling of Early Visual Processing

Experiments in Chapter 4 confirmed the suitability of a flexible data-driven approach and the use of deep learning methods and probabilistic models, such as DBNs, for modelling the retina and early vision. A new dataset of simulated retinal input (SPI) was created, showing circular spots of various sizes and colours against different backgrounds. A circular blob detection task, approximating an early stage visual processing functionality, was set out on SPI. Using this dataset and task, DBN and DNN models were trained in order to obtain models of the retina and early vision. Without the need for building in hard-wired circuits, DBNs trained in an unsupervised manner were capable of learning DoG and Gabor filters, common models of retinal ganglion cell and V1 simple cell receptive fields.

When training DNN classifiers on the detection task, pretraining of the network layers was found to be crucial for performance, and results confirmed the superiority of DNNs containing larger numbers of hidden layers. The accurate classification performance of DNNs demonstrates their great efficacy in learning typical early visual processing tasks.

These findings serve as a strong validation for following the proposed data-driven approach and harnessing the potential of deep learning for the modelling of biological visual information processing.

7.2.2 Modelling of High-Level Visual Processing

Development of the novel LRF-RBM model and training algorithm was driven by the goal to improve the compositional behaviour of DBNs and DNNs. By using LRF-RBMs for the pretraining of LRF-DBN and LRF-DNN models, one can obtain a network architecture which adopts a hierarchical organisation analogous to that of the visual cortex. Experiments in Chapter 6 demonstrated the outstanding potential of these novel deep learning methods for modelling neural processing of visual information, from the early stages to higher-level mechanisms.

When LRF-RBMs were trained on an aligned version of the challenging LFW face dataset, feature hubs accumulated around key locations within the face images, such as average eye, mouth, and nose positions and areas around the face contour.

When multi-layer models were compared on LFW, unlike DBNs, LRF-DBNs were found to learn a hierarchy of local feature detectors which showed part-based composition. Detectors with increasing scope and complexity emerged, corresponding to progressively more global facial features, with lower layers showing DoG and Gabor filters as well as eye, mouth, and nose detectors and higher layers containing face part and whole face detectors. This analysis revealed that LRF-RBMs on consecutive layers of an LRF-DBN can learn to combine local parts to obtain higher-level features, thereby building up a hierarchical model with remarkable generalisation capability.

The novel methods' performance was assessed on two challenging, high-level visual tasks. Firstly, LRF-DBNs were evaluated as generative models on a face completion problem, where missing pixel values in previously unseen test images had to be recovered. Secondly, in the context of a dimensionality reduction task, LRF-DNNs were fine-tuned as autoencoders to produce low-dimensional codes for input images, from which the original input can be reconstructed.

The face completion and reconstruction results showed a superior performance for LRF-DBN generative models and LRF-DNN autoencoders compared to DBNs and DNNs. In addition to obtaining lower reconstruction errors, my proposed models produced reconstructions which better retained fine details of faces, such as eye, mouth, and nose shapes, direction of gaze, and facial expressions.

These results highlight the benefits of the models' part-based compositionality, the flexibility resulting from a structure learning mechanism, and the focus of attention on key areas in the visual space resulting in data-driven learning of specialised eye, mouth, and nose detectors.

Therefore, it can be concluded that LRF-DBNs and LRF-DNNs compute improved, highly compact representations of input data patterns, and, as a result, these novel deep learning methods can provide powerful models of visual information processing.

7.3 Summary

Work presented in this thesis aimed at strengthening the connection between deep learning and the modelling of biological visual information processing. In the Introduction, it was argued that such a connection could result in improvements within both areas. This claim has been verified through experiments in Chapters 4 and 6. Firstly, I have shown how DBNs and DNNs, when trained on simulated photoreceptor input data, can learn powerful models of early visual processing. Secondly, it has been demonstrated that LRF-DBNs and LRF-DNNs, my proposed novel deep learning architectures inspired by biological visual information processing, can provide improved performance compared to DBNs and DNNs on visual tasks.

8 Future Work

This chapter identifies possible ways of improving the proposed methods and provides a general overview of interesting avenues and intriguing research questions within the areas of deep learning and biological vision modelling.

8.1 Local Receptive Field Constrained Models

8.1.1 Extensions

In order to further improve the flexibility of the proposed models and to extend their structure learning capabilities, in future work I will test the effectiveness of certain modifications to the LRF-RBM training algorithm and the LRF-DBN and LRF-DNN architectures (introduced in Chapter 5).

These modifications include the learning of different receptive field constraints on each network layer automatically from data, which, in the case of Gaussian constraints, translates to learning different values for the variance of the Gaussian. This modification could result in networks where a layer is allowed to contain multiple types of receptive field constraints, which are suitably selected through data-driven learning. Also, instead of using receptive field masks, I will test the effect of sampling edges in each LRF-RBM training step according to the receptive field constraints, as described in Section 5.4.1.1 in Chapter 5.


Different types of hidden node activation functions, e.g. rectified linear (Nair and Hinton, 2010), softplus (Glorot et al., 2011), or maxout (Goodfellow et al., 2013), can be incorporated into the model, which could likely result in improvements. To this end, I will investigate how local receptive field constraints can be applied to different types of hidden nodes. To provide increased flexibility, a mechanism can be developed to identify the most suitable activation function for each node (see, e.g., Frey, 1997; Maul, 2013), enabling the modelling of ‘analogue’ and ‘digital’ neural activations (refer to Section 3.5.6 in Chapter 3). These modifications would increase the flexibility of the proposed models by allowing for more variations in the structure of the learnt networks.

Another type of modification is directed towards the mechanism of automatically learning receptive field centre locations in the LRF-RBM. I will examine whether slowing down the movement of receptive field centres gradually over time improves learning. It is also interesting to test if limiting the length of movement allowed per feature in each training step, a constraint that would increase similarity to biological processing, would produce improved feature detectors.

Furthermore, the implementation of a mechanism, e.g. pooling (LeCun et al., 1998; Lee et al., 2009a), to address small differences in image alignment could benefit the methods' accuracy on benchmark tests. To achieve additional performance boosts, one could include the commonly used ‘dropout’ regularisation technique (Srivastava et al., 2014) during training or a ‘spatial transformer’ module (Jaderberg et al., 2015), which improves the invariance of the model. Finally, my future work will also examine how local receptive field constraints can be applied within different learning algorithms.
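For reference, the activation functions mentioned above have the following standard forms (shown only to make the options concrete; the maxout unit returns the maximum over a group of linear responses):

    import numpy as np

    def relu(x):
        return np.maximum(0.0, x)            # rectified linear (Nair and Hinton, 2010)

    def softplus(x):
        return np.log1p(np.exp(x))           # smooth approximation of the rectifier

    def maxout(linear_responses):
        # `linear_responses` holds the pre-activations of one group of linear units;
        # the maxout unit outputs their maximum (Goodfellow et al., 2013)
        return np.max(linear_responses, axis=-1)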

8.1.2 Applications

I plan to conduct further experiments with LRF-DBNs and LRF-DNNs to evaluate the methods on multiple datasets and problems, such as further face datasets as well as object and scene classification problems. An important step would be to examine how the methods' strong performance shown on LFW can be transferred to video data, high resolution images, and large-scale object recognition datasets.

As seen in Chapter 3, the structural organisation of the retinal network shows a distinct difference between the centre and the periphery of the visual space: the density of ganglion cells is highest in the centre, where the cells have small receptive fields, while the periphery is less populated and contains larger receptive fields. In LRF-RBMs, feature detectors can move to different areas of the visual space during training and could therefore adopt a similar pattern. Consequently, for retinal modelling, I will apply LRF-RBMs on such datasets and problems where different structural organisation between the centre and the periphery is likely to evolve.

8.2 Retinal and Biological Vision Modelling

8.2.1 Deep Learning of Retinal Models

It can be beneficial to extend the DBN retinal model, introduced in Chapter 4, with constraints and mechanisms aimed at improving the quality of the learnt features (e.g. by reducing features that are ill-defined or duplicated), the compactness and generalisation capability of the model, and the similarity to retinal or visual cortical circuits. Suggestions for such constraints are provided below.

Sparsity constraints are often applied in deep learning models during training in order to guarantee that activations of the learnt features will be sparse. Models with sparse features can have a number of advantageous properties, e.g., they can model inhibition between feature detectors and have been shown to learn V1-like local features from natural image patches (Lee et al., 2006; Olshausen and Field, 1996). Sparse activations in DBNs are most often encouraged by a regularisation that penalises deviation from a predefined, low frequency of hidden node firing (refer to, e.g., Lee et al., 2008).

Chapters 5 and 6 showed how, with the help of LRF-RBMs, the size of receptive fields can be controlled in a deep model. In future work, I will investigate extending the structure learning capacities of the models and applying further architectural constraints within the networks (e.g. for the number of hidden nodes per layer). Such constraints will be used to encourage the learning of models that exhibit an even closer resemblance in structure to neural networks of the retina and the visual cortex.

Experiments in Chapter 6 demonstrated how LRF-DBNs can utilise part-based composition within their feature hierarchy, leading to a superior generalisation capability. With this in mind, it is important to analyse mechanisms that can further improve the compositional behaviour of the proposed network models.
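One common form of the sparsity regularisation mentioned above penalises the squared deviation of each hidden unit's mean activation from a low target value (a generic illustration rather than the exact regulariser of the cited works; the target value p is a placeholder):

    import numpy as np

    def sparsity_penalty(hidden_probs, p=0.05):
        # Penalise deviation of each hidden unit's mean activation from a low target p.
        q = hidden_probs.mean(axis=0)        # observed firing frequency per hidden unit
        return np.sum((q - p) ** 2)          # penalty term added to the training objective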

8.2.2 Datasets for Modelling Biological Visual Processing

Chapter 4 has shown that the proposed DBN retinal model can successfully discover DoG type feature detectors automatically from data in an unsupervised setting, demonstrating the model's capability to learn functionality similar to retinal ganglion cells. In future work, I will address further aspects of retinal modelling mentioned in Chapter 3, e.g. modelling the development of the retinal network, eye motion analysis, or learning from biological experimental data. For the analysis of these problems, it will be necessary to identify and assemble suitable datasets.

Future work will extend the experiments with DBN retinal models (described in Chapter 4) and with LRF-DBN and LRF-DNN models of visual processing (introduced in Chapter 6) to further datasets and tasks. Different types of input can be examined (e.g. natural images, images with added noise, video data, and data from electrophysiological recordings), and a number of suitable tasks can be identified corresponding to retinal ganglion cell, early vision, or high-level visual information processing functionalities. Potential tasks involving lower-level processes include contrast gain control, differential motion detection, edge detection, and prediction of the direction or speed of movement, while higher-level reasoning tasks can focus on, e.g., scene, object, face, or facial expression recognition. Based on the success of experiments presented in this thesis, it can be anticipated that the proposed models may develop retinal or visual cortical functionalities in order to solve these tasks when trained on adequate data.

The flexibility of the advocated data-driven modelling approach means the developed methods can easily be adapted for use on further visual datasets and even on different types of data. In their current form, LRF-DBNs are suitable for application on images, but it would also be possible to accommodate other datasets where a spatial proximity can be defined between input coordinates and the learning of local features is advantageous. Therefore, suitable data from electrophysiological experiments could also be utilised for improving the fidelity of the proposed models to biological visual information processing.

The SPI dataset, used in Chapter 4, has been constructed with the intention to provide an easily controllable environment for simulating an early vision functionality. My experiments showed that, from these synthetic data, DBNs were capable of learning feature detectors exhibiting strong similarity to ganglion cells. This model could potentially be improved by further increasing the similarity of the approximation to input processed by retinal cells. Also, a different type of approximation could be provided by extraction of small image patches from natural images. Section 4.4.4 in Chapter 4 describes advantages and shortcomings of natural image datasets and synthetic data, and in future work both approaches will be explored.

Accordingly, both types of image datasets and also video data will be generated for the purpose of testing the proposed retinal models and the models of higher-order visual information processing on different visual tasks, including object detection, object motion anticipation, and the prediction of eye motion.

As described in Chapter 3, the retina contains a high number of motion-related feature detectors. Therefore, from the point of view of retinal modelling, adaptation of the proposed methods for the detection of motion patterns in video data is an important task. Some temporal extensions of RBMs and DBNs have been proposed in the literature, e.g. recurrent temporal RBMs (Sutskever and Hinton, 2007; Sutskever et al., 2008), which can provide a starting point for development in this direction.

8.3 Deep Learning and Structure Learning

As deep learning methods have only recently achieved their current popularity, a number of practical and theoretical aspects of deep learning have not yet been fully investigated. There still exist questions regarding what architectures and training methods can guarantee the most reliable learning performance alongside efficient computation. This applies especially to large-scale problems and those data types that are different from the commonly investigated vision, text, or speech datasets.

Also, in most cases, finding an optimal model architecture or parameter choice still involves excessive tuning. This burden could be reduced by the development of automatic structure learning methods, which strive to discover the optimal structural parameters of the model from data.

As seen in Section 2.5 in Chapter 2, a large percentage of earlier structure learning methods either impose strict restrictions on the model structure, e.g. mixtures-of-trees (Meila and Jordan, 2001), or do not guarantee efficient convergence: see, e.g., evolutionary (Pelikan et al., 2001) or heuristic algorithms (Heckerman et al., 1995). A different, commonly used approach, the Bayesian non-parametric methods described in Section 2.5.1, provides a principled way for inferring model structure.

As mentioned, LRF-DBNs implement a form of structure learning by automatically discovering an advantageous feature detector placement from data. In future work, the flexibility of these models can be improved by learning more parameters of the model structure from data, including the number of nodes within layers or the number of layers, and automatically selecting suitable hidden node types, similarly to Adams et al. (2010) and Frey (1997). Accordingly, the training of LRF-DBNs could be augmented with a Bayesian non-parametric structure learning routine to implement structure learning within a probabilistic framework.

8.3.1 Deep Learning and Biological Vision Modelling

Through experiments provided in this thesis, I have shown how combining elements of deep learning and biological vision modelling can bring mutual benefits to these two research areas.

On the one hand, the use of deep learning methods for modelling biological vision introduces a data-driven approach to this domain and reduces the extensive reliance on knowledge-driven handcrafting of computational models.

On the other hand, increasing the similarity of deep learning methods to neural processing of visual information, by building on knowledge regarding these processes, not only increases the biological plausibility of deep learning models but can greatly improve their performance.

Therefore, I suggest that, in future research, such alternatives to current models may prove highly beneficial, popularising interdisciplinary approaches within these two areas.

Appendices

A Additional Figures from Chapter 4


(a) Training video frames

(b) Test video frames

Figure A.1 Example frames of the (a) training and (b) test videos showing circles of different size and colour against uniform background. Note the presence of overlapping circles. The background colours used in the training and the test set are different.

(a) Positives

(b) Negatives

Figure A.2 Examples of subwindows from the test set representing the (a) positive and (b) negative classes of the circle detection task. As opposed to negative class images, examples of the positive class have a circle positioned in the exact centre of the image. Images can contain circles that overlap and ones that have highly similar colour to the background.

B Additional Figures from Chapter 6


Figure B.1 Examples of non-face-specific features learnt by an LRF-RBM with 4000 hidden nodes. Many of the features show strong similarity to either DoG (see top three rows) or Gabor filters (see examples in the bottom three rows).

(a) Second-layer LRF-DBN features composed using first-layer features

(b) Third-layer LRF-DBN features composed using second-layer features

(c) Fourth-layer LRF-DBN features composed using third-layer features

(d) Second-layer DBN features composed using first-layer features

(e) Third-layer DBN features composed using second-layer features

(f) Fourth-layer DBN features composed using third-layer features

Figure B.2 Visualisation of (a)–(c) LRF-DBN and (d)–(f) DBN features demonstrates how consecutive layers are composed using features from the previous layer. The first image in each row in (a)–(f) shows a higher-layer feature, while consecutive images in the row illustrate those features from the previous layer which have the strongest connections to the given higher-layer feature. Unlike the DBN features, the LRF-DBN feature hierarchy in (a)–(c) demonstrates part-based composition.

References

Adams, R. P., Wallach, H. M., and Ghahramani, Z. (2010). Learning the struc- ture of deep sparse graphical models. Journal of Machine Learning Research: Workshop and Conference Proceedings (AISTATS), 9:1–8.

Aldous, D. (1985). Exchangeability and related topics. École d’Été de Probabilités de Saint-Flour XIII–1983, pages 1–198.

Amodei, D., Anubhai, R., Battenberg, E., Case, C., Casper, J., Catanzaro, B. C., Chen, J., Chrzanowski, M., Coates, A., Diamos, G., Elsen, E., Engel, J., Fan, L., Fougner, C., Han, T., Hannun, A. Y., Jun, B., LeGresley, P., Lin, L., Narang, S., Ng, A. Y., Ozair, S., Prenger, R., Raiman, J., Satheesh, S., Seetapun, D., Sengupta, S., Wang, Y., Wang, Z., Wang, C., Xiao, B., Yogatama, D., Zhan, J., and Zhu, Z. (2015). Deep Speech 2: End-to-end speech recognition in English and Mandarin. arXiv preprint arXiv:1512.02595.

Angelov, P. P. and Filev, D. P. (2004). Flexible models with evolving structure. International Journal of Intelligent Systems, 19(4):327–340.

Atick, J. J. and Redlich, A. N. (1992). What does the retina know about natural scenes? Neural Computation, 4(2):196–210.

Attneave, F. (1954). Some informational aspects of visual perception. Psychological Review, 61(3):183–193.

Bach, F. R. and Jordan, M. I. (2002). Learning graphical models with Mercer kernels. In Advances in Neural Information Processing.

Baden, T., Berens, P., Bethge, M., and Euler, T. (2013). Spikes in mammalian bipolar cells support temporal layering of the inner retina. Current Biology, 23(1):48–52.


Baden, T., Esposti, F., Nikolaev, A., and Lagnado, L. (2011). Spikes in retinal bipolar cells phase-lock to visual stimuli with millisecond precision. Current Biology, 21(22):1859–1869.

Barlow, H. B. (1961). Possible principles underlying the transformations of sensory messages. In Rosenblith, W. A., editor, Sensory Communication, chapter 13, pages 217–234. MIT Press, Cambridge, MA.

Barlow, H. B., Hill, R. M., and Levick, W. R. (1964). Retinal ganglion cells responding selectively to direction and speed of image motion in the rabbit. The Journal of Physiology, 173(3):377–407.

Bartusiak, R., Kajdanowicz, T., Wierzbicki, A., Bukowski, L., Jarczyk, O., and Pawlak, K. (2016). Cooperation prediction in GitHub developers network with restricted Boltzmann machine. In Intelligent Information and Database Systems, volume 9622 of Lecture Notes in Computer Science.

Battiato, S., Gallo, G., Puglisi, G., and Scellato, S. (2007). SIFT features tracking for video stabilization. In Proceedings of the IEEE International Conference on Image Analysis and Processing.

Bay, H., Ess, A., Tuytelaars, T., and Van Gool, L. (2008). Speeded-up robust features (SURF). Computer Vision and Image Understanding, 110(3):346–359.

Bengio, Y. (2012). Practical recommendations for gradient-based training of deep architectures. In Montavon, G., Orr, G. B., and Müller, K.-R., editors, Neural Networks: Tricks of the Trade: Second Edition, pages 437–478. Springer, Berlin, Heidelberg.

Berson, D. M., Dunn, F. A., and Takao, M. (2002). Phototransduction by retinal ganglion cells that set the circadian clock. Science, 295(5557):1070–1073.

Brown, S. P. and Masland, R. H. (2001). Spatial scale and cellular substrate of contrast adaptation by retinal ganglion cells. Nature Neuroscience, 4(1):44–51.

Buhr, E. D., Yue, W. W. S., Ren, X., Jiang, Z., Liao, H.-W. R., Mei, X., Vemaraju, S., Nguyen, M.-T., Reed, R. R., Lang, R. A., Yau, K.-W., and Gelder, R. N. V. (2015). Neuropsin (OPN5)-mediated photoentrainment of local circadian oscillators in mammalian retina and cornea. Proceedings of the National Academy of Sciences of the United States of America, 112(42):13093–13098.

Charles, J., Pfister, T., Magee, D., Hogg, D., and Zisserman, A. (2016). Personalizing human video pose estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

Chelaru, M. I. and Dragoi, V. (2008). Efficient coding in heterogeneous neuronal populations. Proceedings of the National Academy of Sciences of the United States of America, 105(42):16344–16349.

Chen, E. Y., Chou, J., Park, J., Schwartz, G., and Berry, M. J. (2014). The neural circuit mechanisms underlying the retinal response to motion reversal. The Journal of Neuroscience, 34(47):15557–15575.

Chen, J., Zhou, T., Yang, H., and Fang, F. (2010). Cortical dynamics underlying face completion in human visual system. The Journal of Neuroscience, 30(49):16692–16698.

Cireşan, D., Meier, U., Masci, J., and Schmidhuber, J. (2011a). A committee of neural networks for traffic sign classification. In International Joint Conference on Neural Networks.

Cireşan, D., Meier, U., and Schmidhuber, J. (2012). Multi-column deep neural networks for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

Cireşan, D. C., Meier, U., Masci, J., Gambardella, L. M., and Schmidhuber, J. (2011b). Flexible, high performance convolutional neural networks for image classification. In Proceedings of the International Joint Conference on Artificial Intelligence.

Coates, A. and Ng, A. Y. (2011). Selecting receptive fields in deep networks. In Advances in Neural Information Processing.

Collobert, R. and Weston, J. (2008). A unified architecture for natural language processing: Deep neural networks with multitask learning. In Proceedings of the International Conference on Machine Learning.

Coogan, T. A. and Burkhalter, A. (1993). Hierarchical organization of areas in rat visual cortex. The Journal of Neuroscience, 13:3749–3749.

Cortes, C. and Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3):273–297.

Cottaris, N. P. and Elfar, S. D. (2005). How the retinal network reacts to epireti- nal stimulation to form the prosthetic visual input to the cortex. Journal of Neural Engineering, 2(1):S74–S90.

Crook, J. D., Manookin, M. B., Packer, O. S., and Dacey, D. M. (2011). Horizontal cell feedback without cone type-selective inhibition mediates “red-green” color opponency in midget ganglion cells of the primate retina. The Journal of Neuroscience, 31(5):1762–1772.

Dacey, D. M., Diller, L. C., Verweij, J., and Williams, D. R. (2000). Physiology of L- and M-cone inputs to H1 horizontal cells in the primate retina. Journal of the Optical Society of America A, 17(3):589–596.

Dahl, G., Ranzato, M., Mohamed, A., and Hinton, G. E. (2010). Phone recognition with the mean-covariance restricted Boltzmann machine. In Advances in Neural Information Processing.

Deans, M. R., Völgyi, B., Goodenough, D. A., Bloomfield, S. A., and Paul, D. L. (2002). Connexin36 is essential for transmission of rod-mediated visual signals in the mammalian retina. Neuron, 36(4):703–712.

Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 39(1):1–38.

DeVries, S. H. and Baylor, D. A. (1995). An alternative pathway for signal flow from rod photoreceptors to ganglion cells in mammalian retina. Proceedings of the National Academy of Sciences of the United States of America, 92(23):10658–10662.

Ding, C. and Tao, D. (2015). Robust face recognition via multimodal deep face representation. arXiv preprint arXiv:1509.00244.

Dong, D. W. and Atick, J. J. (1995). Temporal decorrelation: A theory of lagged and nonlagged responses in the lateral geniculate nucleus. Network: Computation in Neural Systems, 6(2):159–178.

Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T. (2015). FlowNet: Learning optical flow with convolutional networks. In Proceedings of the IEEE International Conference on Computer Vision.

Dowling, J. (2008). Current and future prospects for optoelectronic retinal prostheses. Eye, 23(10):1999–2005.

Dowling, J. E. (1987). The Retina: An Approachable Part of the Brain, chapter 4 Neuronal responses. Harvard University Press, Cambridge, MA.

Dreosti, E., Esposti, F., Baden, T., and Lagnado, L. (2011). In vivo evidence that retinal bipolar cells generate spikes modulated by light. Nature Neuroscience, 14(8):951–952.

Duhamel, J. R., Colby, C. L., and Goldberg, M. E. (1992). The updating of the representation of visual space in parietal cortex by intended eye movements. Science, 255(5040):90–92.

Eckmiller, R., Neumann, D., and Baruth, O. (2005). Tunable retina encoders for retina implants: why and how. Journal of Neural Engineering, 2(1):S91–S104.

Elfwing, S., Uchibe, E., and Doya, K. (2015). Expected energy-based restricted Boltzmann machine for classification. Neural Networks, 64:29–38.

Enroth-Cugell, C. and Robson, J. G. (1966). The contrast sensitivity of retinal ganglion cells of the cat. The Journal of Physiology, 187(3):517–552.

Erhan, D., Bengio, Y., Courville, A., and Vincent, P. (2009). Visualizing higher-layer features of a deep network. Technical report, University of Montreal.

Eslami, S. M. A., Heess, N., and Winn, J. (2012). The shape Boltzmann machine: a strong model of object shape. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

Euler, T., Haverkamp, S., Schubert, T., and Baden, T. (2014). Retinal bipolar cells: elementary building blocks of vision. Nature Reviews Neuroscience, 15(8):507–519.

Felleman, D. J. and Van Essen, D. C. (1991). Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex, 1(1):1–47.

Feng, J. and Darrell, T. (2015). Learning the structure of deep convolutional networks. In Proceedings of the IEEE International Conference on Computer Vision.

Fernández, S., Graves, A., and Schmidhuber, J. (2007). Sequence labelling in structured domains with hierarchical recurrent neural networks. In Proceedings of the International Joint Conference on Artificial Intelligence.

Field, G. D. and Chichilnisky, E. J. (2007). Information processing in the primate retina: Circuitry and coding. Annual Review of Neuroscience, 30:1–30.

Field, G. D., Gauthier, J. L., Sher, A., Greschner, M., Machado, T. A., Jepson, L. H., Shlens, J., Gunning, D. E., Mathieson, K., Dabrowski, W., Paninski, L., Litke, A. M., and Chichilnisky, E. J. (2010). Functional connectivity in the retina at the resolution of photoreceptors. Nature, 467(7316):673–677.

Field, G. D., Sampath, A. P., and Rieke, F. (2005). Retinal processing near absolute threshold: from behavior to mechanism. Annual Review of Physiology, 67:491–514.

Fohlmeister, J. F., Coleman, P. A., and Miller, R. F. (1990). Modeling the repetitive firing of retinal ganglion cells. Brain Research, 510(2):343–345.

Fohlmeister, J. F. and Miller, R. F. (1997). Impulse encoding mechanisms of ganglion cells in the tiger salamander retina. Journal of Neurophysiology, 78(4):1935–1947.

Freund, J., Brandmaier, A. M., Lewejohann, L., Kirste, I., Kritzler, M., Krüger, A., Sachser, N., Lindenberger, U., and Kempermann, G. (2013). Emergence of individuality in genetically identical mice. Science, 340(6133):756–759.

Frey, B. J. (1997). Continuous sigmoidal belief networks trained using slice sampling. In Advances in Neural Information Processing.

Friedman, N. (1997). Learning belief networks in the presence of missing values and hidden variables. In Proceedings of the International Conference on Machine Learning.

Gatys, L. A., Ecker, A. S., and Bethge, M. (2015). A neural algorithm of artistic style. arXiv preprint arXiv:1508.06576.

Gaudiano, P. (1992). A unified neural network model of spatiotemporal processing in X and Y retinal ganglion cells. Biological Cybernetics, 67(1):23–34.

Georgiev, K. and Nakov, P. (2013). A non-IID framework for collaborative filtering with restricted Boltzmann machines. In Proceedings of the International Conference on Machine Learning.

Ghahramani, Z. (2012). Bayesian non-parametrics and the probabilistic approach to modelling. Philosophical Transactions of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, 371(1984):20110553.

Giusti, A., Guzzi, J., Ciresan, D., He, F.-L., Rodriguez, J. P., Fontana, F., Faessler, M., Forster, C., Schmidhuber, J., Di Caro, G., Scaramuzza, D., and Gambardella, L. (2016). A machine learning approach to visual perception of forest trails for mobile robots. IEEE Robotics and Automation Letters (in press).

Glorot, X., Bordes, A., and Bengio, Y. (2011). Deep sparse rectifier neural networks. In Proceedings of the International Conference on Artificial Intelligence and Statistics.

Gollisch, T. and Meister, M. (2010). Eye smarter than scientists believed: Neural computations in circuits of the retina. Neuron, 65(2):150–164.

Golombek, D. A. and Rosenstein, R. E. (2010). Physiology of circadian entrainment. Physiological Reviews, 90(3):1063–1102.

Goodfellow, I. J., Warde-Farley, D., Mirza, M., Courville, A., and Bengio, Y. (2013). Maxout networks. In Proceedings of the International Conference on Machine Learning.

Graham, D. J. and Field, D. J. (2009). Natural images: Coding efficiency. Encyclopedia of Neuroscience, 6:19–27.

Graves, A., Mohamed, A., and Hinton, G. (2013). Speech recognition with deep recurrent neural networks. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing.

Griffiths, T. L. and Ghahramani, Z. (2005). Infinite latent feature models and the Indian buffet process. In Advances in Neural Information Processing.

Griffiths, T. L. and Ghahramani, Z. (2011). The Indian buffet process: An introduction and review. Journal of Machine Learning Research, 12:1185–1224.

Groner, R. and Groner, M. T. (1989). Attention and eye movement control: An overview. European Archives of Psychiatry and Neurological Sciences, 239(1):9–16.

Hack, I., Peichl, L., and Brandstätter, J. H. (1999). An alternative pathway for rod signals in the rodent retina: Rod photoreceptors, cone bipolar cells, and the localization of glutamate receptors. Proceedings of the National Academy of Sciences of the United States of America, 96(24):14130–14135.

Hallum, L. E., Suaning, G. J., Taubman, D. S., and Lovell, N. H. (2004). Towards photosensor movement-adaptive image analysis in an electronic retinal prosthesis. In Annual International Conference of the IEEE Engineering in Medicine and Biology Society.

He, K., Zhang, X., Ren, S., and Sun, J. (2015a). Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385.

He, K., Zhang, X., Ren, S., and Sun, J. (2015b). Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In Proceedings of the IEEE International Conference on Computer Vision.

Heckerman, D., Geiger, D., and Chickering, D. M. (1995). Learning Bayesian networks: The combination of knowledge and statistical data. Machine Learning, 20(3):197–243.

Hennig, M. H. and Funke, K. (2001). A biophysically realistic simulation of the vertebrate retina. Neurocomputing, 38–40:659–665.

Hinton, G. E. (2002). Training products of experts by minimizing contrastive divergence. Neural Computation, 14(8):1771–1800.

Hinton, G. E. (2007). Learning multiple layers of representation. Trends in Cognitive Sciences, 11(10):428–434.

Hinton, G. E. (2012). A practical guide to training restricted Boltzmann machines. In Montavon, G., Orr, G. B., and Müller, K.-R., editors, Neural Networks: Tricks of the Trade, volume 7700 of Lecture Notes in Computer Science, pages 599–619. Springer Berlin Heidelberg.

Hinton, G. E., Dayan, P., Frey, B. J., and Neal, R. M. (1995). The “wake-sleep” algorithm for unsupervised neural networks. Science, 268(5214):1158–1161.

Hinton, G. E., Osindero, S., and Teh, Y.-W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18(7):1527–1554.

Hinton, G. E. and Salakhutdinov, R. R. (2006a). Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507.

Hinton, G. E. and Salakhutdinov, R. R. (2006b). Supporting online material for “Reducing the dimensionality of data with neural networks”. Science, 313(5786):504–507.

Hinton, G. E. and Sejnowski, T. J. (1983). Optimal perceptual inference. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

Hoffman, J. E. and Subramaniam, B. (1995). The role of visual attention in saccadic eye movements. Perception & Psychophysics, 57(6):787–795.

Hopfield, J. J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences of the United States of America, 79(8):2554–2558.

Hosoya, T., Baccus, S. A., and Meister, M. (2005). Dynamic predictive coding by the retina. Nature, 436(7047):71–77.

Huang, G. B., Mattar, M. A., Lee, H., and Learned-Miller, E. (2012). Learning to align from scratch. In Advances in Neural Information Processing.

Huang, G. B., Ramesh, M., Berg, T., and Learned-Miller, E. (2007). Labeled Faces in the Wild: A database for studying face recognition in unconstrained environments. Technical Report 07-49, University of Massachusetts, Amherst.

Hubel, D. H. and Wiesel, T. N. (1962). Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. The Journal of Physiology, 160(1):106–154.

Humayun, M. S., de Juan Jr, E., Dagnelie, G., Greenberg, R. J., Propst, R. H., and Phillips, D. H. (1996). Visual perception elicited by electrical stimulation of retina in blind humans. Archives of Ophthalmology, 114(1):40–46.

Ivakhnenko, A. G. (1971). Polynomial theory of complex systems. IEEE Transactions on Systems, Man and Cybernetics, 1(4):364–378.

Ivakhnenko, A. G. and Lapa, V. G. (1965). Cybernetic Predicting Devices. Purdue University School of Electrical Engineering.

Jaderberg, M., Simonyan, K., Zisserman, A., and Kavukcuoglu, K. (2015). Spatial transformer networks. In Advances in Neural Information Processing.

Jusuf, P. R., Martin, P. R., and Grünert, U. (2006). Random wiring in the midget pathway of primate retina. The Journal of Neuroscience, 26(15):3908–3917.

Kae, A., Sohn, K., Lee, H., and Learned-Miller, E. (2013). Augmenting CRFs with Boltzmann machine shape priors for image labeling. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

Kameneva, T., Meffin, H., and Burkitt, A. N. (2011). Modelling intrinsic electrophysiological properties of ON and OFF retinal ganglion cells. Journal of Computational Neuroscience, 31(3):547–561.

Kaneko, A. (1970). Physiological and morphological identification of horizontal, bipolar and amacrine cells in goldfish retina. The Journal of Physiology, 207(3):623–633.

Kavukcuoglu, K., Sermanet, P., Boureau, Y. L., Gregor, K., Mathieu, M., and LeCun, Y. (2010). Learning convolutional feature hierarchies for visual recognition. In Advances in Neural Information Processing.

Kien, T. T., Maul, T., and Bargiela, A. (2012a). A review of retinal prosthesis approaches. In International Journal of Modern Physics: Conference Series, volume 9.

Kien, T. T., Maul, T., Ren, L. J., and Bargiela, A. (2012b). Outer plexiform layer receptive fields as underlying factors of the Hermann grid illusion. In Proceedings of the IEEE-EMBS International Conference on Biomedical Engineering and Sciences.

Kolb, B. and Whishaw, I. Q. (1998). Brain plasticity and behavior. Annual Review of Psychology, 49(1):43–64.

Kolb, H. (1995). Simple anatomy of the retina. Webvision: The Organization of the Retina and Visual System. Retrieved Sep 10, 2013, from http://webvision.med.utah.edu/.

Kolb, H. (2003). How the retina works. American Scientist, 91(1):28–35.

Kolb, H., Nelson, R., Fernandez, E., and Jones, B., editors (Web, from 1995). Webvision: The Organization of the Retina and Visual System. Moran Eye Center. Retrieved Aug 22, 2015, from http://webvision.med.utah.edu/.

Konen, C. S. and Kastner, S. (2008). Two hierarchically organized neural systems for object information in human visual cortex. Nature Neuroscience, 11(2):224–231.

Krizhevsky, A., Sutskever, I., and Hinton, G. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing.

Kuffler, S. W. (1953). Discharge patterns and functional organization of mammalian retina. Journal of Neurophysiology, 16(1):37–68.

Kuremoto, T., Kimura, S., Kobayashi, K., and Obayashi, M. (2014). Time series forecasting using a deep belief network with restricted Boltzmann machines. Neurocomputing, 137:47–56.

Kustov, A. A. and Robinson, D. L. (1996). Shared neural control of attentional shifts and eye movements. Nature, 384(6604):74–77.

Larochelle, H., Erhan, D., Courville, A., Bergstra, J., and Bengio, Y. (2007). An empirical evaluation of deep architectures on problems with many factors of variation. In Proceedings of the International Conference on Machine Learning.

Lauritzen, T. Z. and Miller, K. D. (2003). Different roles for simple-cell and complex-cell inhibition in V1. The Journal of Neuroscience, 23(32):10201–10213.

Le, Q. V., Monga, R., Devin, M., Corrado, G., Chen, K., Ranzato, M. A., Dean, J., and Ng, A. Y. (2012). Building high-level features using large scale unsupervised learning. In Proceedings of the International Conference on Machine Learning.

LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324.

Lee, H., Battle, A., Raina, R., and Ng, A. Y. (2006). Efficient sparse coding algorithms. In Advances in Neural Information Processing.

Lee, H., Ekanadham, C., and Ng, A. (2008). Sparse deep belief net model for visual area V2. In Advances in Neural Information Processing.

Lee, H., Grosse, R., Ranganath, R., and Ng, A. Y. (2009a). Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In Proceedings of the International Conference on Machine Learning.

Lee, H., Largman, Y., Pham, P., and Ng, A. Y. (2009b). Unsupervised feature learning for audio classification using convolutional deep belief networks. In Advances in Neural Information Processing.

Lennie, P., Haake, P. W., and Williams, D. R. (1991). The design of chromatically opponent receptive fields. In Landy, M. and Movshon, J. A., editors, Computational Models of Visual Processing, pages 71–82. MIT Press, Cambridge, MA.

Li, K., Gao, J., Guo, S., Du, N., Li, X., and Zhang, A. (2014a). LRBM: A restricted Boltzmann machine based approach for representation learning on linked data. In IEEE International Conference on Data Mining.

Li, X., Du, N., Li, H., Li, K., Gao, J., and Zhang, A. (2014b). A deep learning approach to link prediction in dynamic networks. In Proceedings of the SIAM International Conference on Data Mining.

Li, Y., Liu, W., Li, X., Huang, Q., and Li, X. (2014c). GA-SIFT: A new scale invariant feature transform for multispectral image using geometric algebra. Information Sciences, 281:559–572.

Lin, T.-C. and Lin, C.-M. (2009). Wavelet-based copyright-protection scheme for digital images based on local features. Information Sciences, 179(19):3349–3358.

Linnainmaa, S. (1970). The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors. Master’s thesis, University of Helsinki.

Linnainmaa, S. (1976). Taylor expansion of the accumulated rounding error. BIT Numerical Mathematics, 16(2):146–160.

Liu, F., Liu, B., Sun, C., Liu, M., and Wang, X. (2013). Deep learning approaches for link prediction in social network services. In Neural Information Processing, volume 8227 of Lecture Notes in Computer Science.

Liu, X.-D. and Kourennyi, D. E. (2004). Effects of tetraethylammonium on Kx channels and simulated light response in rod photoreceptors. Annals of Biomedical Engineering, 32(10):1428–1442.

Liversedge, S. P. and Findlay, J. M. (2000). Saccadic eye movements and cognition. Trends in Cognitive Sciences, 4(1):6–14.

Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91–110.

Maclaurin, D., Duvenaud, D., and Adams, R. P. (2015). Gradient-based hyperparameter optimization through reversible learning. In Proceedings of the International Conference on Machine Learning.

Mancuso, K., Hauswirth, W. W., Li, Q., Connor, T. B., Kuchenbecker, J. A., Mauck, M. C., Neitz, J., and Neitz, M. (2009). Gene therapy for red–green colour blindness in adult primates. Nature, 461(7265):784–787.

Maniak, T., Jayne, C., Iqbal, R., and Doctor, F. (2015). Automated intelligent system for sound signalling device quality assurance. Information Sciences, 294(0):600–611.

Masland, R. H. (1996). Processing and encoding of visual information in the retina. Current Opinion in Neurobiology, 6(4):467–474.

Masland, R. H. (2001). The fundamental plan of the retina. Nature Neuroscience, 4(9):877–886.

Masland, R. H. (2012). The neuronal organization of the retina. Neuron, 76(2):266–280.

Maul, T. (2013). Early experiments with neural diversity machines. Neurocomputing, 113:36–48.

Maul, T. H., Bargiela, A., and Ren, L. J. (2011). Cybernetics of vision systems: Toward an understanding of putative functions of the outer retina. IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans, 41(3):398–409.

McMahon, M. J., Packer, O. S., and Dacey, D. M. (2004). The classical receptive field surround of primate parasol ganglion cells is mediated primarily by a non-GABAergic pathway. The Journal of Neuroscience, 24(15):3736–3745.

Meila, M. and Jordan, M. I. (2001). Learning with mixtures of trees. Journal of Machine Learning Research, 1:1–48.

Memisevic, R. and Hinton, G. (2007). Unsupervised learning of image transformations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

Miller, R. F., Staff, N. P., and Velte, T. J. (2006). Form and function of ON-OFF amacrine cells in the amphibian retina. Journal of Neurophysiology, 95(5):3171–3190.

Mittelman, R., Kuipers, B., Savarese, S., and Lee, H. (2014). Structured recurrent temporal restricted Boltzmann machines. In Proceedings of the International Conference on Machine Learning.

Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. (2013). Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602.

Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., and Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540):529–533.

Mordvintsev, A., Olah, C., and Tyka, M. (2015). Inceptionism: going deeper into neural networks. Google Research Blog, 20.

Morgan, J. and Wong, R. (2007). Development of cell types and synaptic connections in the retina. Webvision: The Organization of the Retina and Visual System. Retrieved Sep 10, 2013, from http://webvision.med.utah.edu/.

Moritoh, S., Komatsu, Y., Yamamori, T., and Koizumi, A. (2013). Diversity of retinal ganglion cells identified by transient GFP transfection in organotypic tissue culture of adult marmoset monkey retina. PloS ONE, 8(1):e54667.

Münch, T. A., da Silveira, R. A., Siegert, S., Viney, T. J., Awatramani, G. B., and Roska, B. (2009). Approach sensitivity in the retina processed by a multifunctional neural circuit. Nature Neuroscience, 12(10):1308–1316.

Nair, V. and Hinton, G. E. (2010). Rectified linear units improve restricted Boltzmann machines. In Proceedings of the International Conference on Machine Learning.

Nie, S., Wang, Z., and Ji, Q. (2015). A generative restricted Boltzmann machine based method for high-dimensional motion data modeling. Computer Vision and Image Understanding, 136:14–22.

Nirenberg, S. and Pandarinath, C. (2012). Retinal prosthetic strategy with the capacity to restore normal vision. Proceedings of the National Academy of Sciences, 109(37):15012–15017.

Olshausen, B. A. and Field, D. J. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381(6583):607–609.

Ölveczky, B. P., Baccus, S. A., and Meister, M. (2003). Segregation of object and background motion in the retina. Nature, 423(6938):401–408.

Ölveczky, B. P., Baccus, S. A., and Meister, M. (2007). Retinal adaptation to object motion. Neuron, 56(4):689–700.

Pahlberg, J. and Sampath, A. P. (2011). Visual threshold is set by linear and nonlinear mechanisms in the retina that mitigate noise: How neural circuits in the retina improve the signal-to-noise ratio of the single-photon response. BioEssays, 33(6):438–447.

Pascual-Leone, A., Amedi, A., Fregni, F., and Merabet, L. B. (2005). The plastic human brain cortex. Annual Review of Neuroscience, 28:377–401.

Pascual-Leone, A., Freitas, C., Oberman, L., Horvath, J. C., Halko, M., Eldaief, M., Bashir, S., Vernet, M., Shafi, M., Westover, B., Vahabzadeh-Hagh, A. M., and Rotenberg, A. (2011). Characterizing brain cortical plasticity and network dynamics across the age-span in health and disease with TMS-EEG and TMS-fMRI. Brain Topography, 24(3-4):302–315.

Paulus, W. and Kröger-Paulus, A. (1983). A new concept of retinal colour coding. Vision Research, 23(5):529–540.

Pelikan, M., Sastry, K., and Goldberg, D. E. (2001). Evolutionary algorithms + graphical models = scalable black-box optimization. Technical report, Illinois Genetic Algorithms Laboratory, Department of General Engineering, University of Illinois at Urbana-Champaign.

Pérez De Sevilla Müller, L., Shelley, J., and Weiler, R. (2007). Displaced amacrine cells of the mouse retina. Journal of Comparative Neurology, 505(2):177–189.

Pfister, T., Charles, J., and Zisserman, A. (2015). Flowing ConvNets for human pose estimation in videos. In Proceedings of the IEEE International Conference on Computer Vision.

Pfister, T., Simonyan, K., Charles, J., and Zisserman, A. (2014). Deep convolutional neural networks for efficient pose estimation in gesture videos. In Proceedings of the Asian Conference on Computer Vision.

Pillow, J. W., Shlens, J., Paninski, L., Sher, A., Litke, A. M., Chichilnisky, E. J., and Simoncelli, E. P. (2008). Spatio-temporal correlations and visual signalling in a complete neuronal population. Nature, 454(7207):995–999.

Piscopo, D. M., El-Danaf, R. N., Huberman, A. D., and Niell, C. M. (2013). Diverse visual features encoded in mouse lateral geniculate nucleus. The Journal of Neuroscience, 33(11):4642–4656.

Pitman, J. (2002). Combinatorial stochastic processes. Technical report, Department of Statistics, University of California, Berkeley. Lecture notes for St. Flour course.

Poon, H. and Domingos, P. (2011). Sum-product networks: A new deep architecture. In Proceedings of the Conference on Uncertainty in Artificial Intelligence.

Prasad, S. C. and Prasad, P. (2014). Deep recurrent neural networks for time series prediction. arXiv preprint arXiv:1407.5949.

Protti, D. A., Flores-Herr, N., and von Gersdorff, H. (2000). Light evokes Ca2+ spikes in the axon terminal of a retinal bipolar cell. Neuron, 25(1):215–227.

Publio, R., Oliveira, R. F., and Roque, A. C. (2009). A computational study on the role of gap junctions and rod Ih conductance in the enhancement of the dynamic range of the retina. PloS ONE, 4(9):e6970–e6970.

Purves, D., Augustine, G. J., Fitzpatrick, D., Katz, L. C., LaMantia, A. S., McNamara, J. O., and Williams, S. M., editors (2001). Neuroscience, Second Edition, chapter 11 Vision: The eye – Phototransduction. Sinauer Associates, Inc., Sunderland, MA.

Ranzato, M., Poultney, C., Chopra, S., and LeCun, Y. (2006). Efficient learning of sparse representations with an energy-based model. In Advances in Neural Information Processing.

Ranzato, M., Susskind, J., Mnih, V., and Hinton, G. (2011). On deep generative models with applications to recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

Ringach, D. L., Shapley, R. M., and Hawken, M. J. (2002). Orientation selectivity in macaque V1: Diversity and laminar dependence. The Journal of Neuroscience, 22(13):5639–5651.

Robinson, D. A. (1972). Eye movements evoked by collicular stimulation in the alert monkey. Vision Research, 12(11):1795–1808.

Rodieck, R. W. (1965). Quantitative analysis of cat retinal ganglion cell response to visual stimuli. Vision Research, 5(12):583–601.

Roska, B. and Werblin, F. (2003). Rapid global shifts in natural scenes block spiking in specific ganglion cell types. Nature Neuroscience, 6(6):600–608.

Rozantsev, A., Lepetit, V., and Fua, P. (2015). On rendering synthetic images for training an object detector. Computer Vision and Image Understanding, 137:24–37.

Rublee, E., Rabaud, V., Konolige, K., and Bradski, G. (2011). ORB: an efficient alternative to SIFT or SURF. In Proceedings of the IEEE International Conference on Computer Vision.

Salakhutdinov, R. R. and Hinton, G. E. (2007). Using deep belief nets to learn covariance kernels for Gaussian processes. In Advances in Neural Information Processing.

Salakhutdinov, R. R. and Hinton, G. E. (2009a). Deep Boltzmann machines. In Proceedings of the International Conference on Artificial Intelligence and Statistics.

Salakhutdinov, R. R. and Hinton, G. E. (2009b). Semantic hashing. International Journal of Approximate Reasoning, 50(7):969–978.

Salakhutdinov, R. R., Mnih, A., and Hinton, G. E. (2007). Restricted Boltzmann machines for collaborative filtering. In Proceedings of the International Conference on Machine Learning.

Sarpeshkar, R. (1998). Analog versus digital: Extrapolating from electronics to neurobiology. Neural Computation, 10(7):1601–1638.

Schmidt, T. M., Chen, S.-K., and Hattar, S. (2011). Intrinsically photosensitive retinal ganglion cells: many subtypes, diverse functions. Trends in Neurosciences, 34(11):572–580.

Schmolesky, M. (2000). The primary visual cortex. Webvision: The Organization of the Retina and Visual System. Retrieved Oct 30, 2015, from http://webvision.med.utah.edu/.

Schneidman, E., Bialek, W., and Berry II, M. J. (2003). Synergy, redundancy, and independence in population codes. The Journal of Neuroscience, 23(37):11539–11553.

Schroff, F., Kalenichenko, D., and Philbin, J. (2015). FaceNet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

Schwartz, G., Taylor, S., Fisher, C., Harris, R., and Berry, M. J. (2007). Synchronized firing among retinal ganglion cells signals motion reversal. Neuron, 55(6):958–969.

Shepherd, M., Findlay, J. M., and Hockey, R. J. (1986). The relationship between eye movements and spatial attention. The Quarterly Journal of Experimental Psychology Section A: Human Experimental Psychology, 38(3):475–491.

Sherrington, C. S. (1906). Observations on the scratch-reflex in the spinal dog. The Journal of Physiology, 34(1-2):1–50.

Shim, V. A., Tan, K. C., Cheong, C. Y., and Chia, J. Y. (2013). Enhancing the scalability of multi-objective optimization via restricted Boltzmann machine-based estimation of distribution algorithm. Information Sciences, 248(0):191–213.

Shlens, J., Rieke, F., and Chichilnisky, E. J. (2008). Synchronized firing in the retina. Current Opinion in Neurobiology, 18(4):396–402.

Shu, Y., Hasenstaub, A., Duque, A., Yu, Y., and McCormick, D. A. (2006). Modulation of intracortical synaptic potentials by presynaptic somatic membrane potential. Nature, 441(7094):761–765.

Simoncelli, E. P. and Olshausen, B. A. (2001). Natural image statistics and neural representation. Annual Review of Neuroscience, 24(1):1193–1216.

Simonyan, K. and Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. In Proceedings of the International Conference on Learning Representations.

Sivic, J. and Zisserman, A. (2003). Video Google: A text retrieval approach to object matching in videos. In Proceedings of the IEEE International Conference on Computer Vision.

Smith, R. G. and Vardi, N. (1995). Simulation of the AII amacrine cell of mammalian retina: Functional consequences of electrical coupling and regenerative membrane properties. Visual Neuroscience, 12:851–860.

Smolensky, P. (1986). Information processing in dynamical systems: Foundations of harmony theory. In Rumelhart, D. E., McClelland, J. L., and the PDP Research Group, editors, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, volume 1: Foundations, pages 194–281. MIT Press, Cambridge, MA.

Soltesz, I. (2006). Diversity in the neuronal machine: Order and variability in interneuronal microcircuits. Oxford University Press.

Soucy, E., Wang, Y., Nirenberg, S., Nathans, J., and Meister, M. (1998). A novel signaling pathway from rod photoreceptors to ganglion cells in mammalian retina. Neuron, 21(3):481–493.

Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R. (2014). Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15(1):1929–1958.

Stevenson, I. H., Rebesco, J. M., Miller, L. E., and Körding, K. P. (2008). Inferring functional connections between neurons. Current Opinion in Neurobiology, 18(6):582–588.

Su, H., Qi, C. R., Li, Y., and Guibas, L. J. (2015). Render for CNN: Viewpoint estimation in images using CNNs trained with rendered 3D model views. In Proceedings of the IEEE International Conference on Computer Vision.

Sun, Y., Liang, D., Wang, X., and Tang, X. (2015). DeepID3: Face recognition with very deep neural networks. arXiv preprint arXiv:1502.00873.

Sun, Y., Wang, X., and Tang, X. (2014). Deep learning face representation from predicting 10,000 classes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

Sutskever, I. and Hinton, G. E. (2007). Learning multilevel distributed representations for high-dimensional sequences. In Proceedings of the International Conference on Artificial Intelligence and Statistics.

Sutskever, I. and Hinton, G. E. (2008). Deep, narrow sigmoid belief networks are universal approximators. Neural Computation, 20(11):2629–2636.

Sutskever, I., Hinton, G. E., and Taylor, G. W. (2008). The recurrent temporal restricted Boltzmann machine. In Advances in Neural Information Processing.

Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2014). Going deeper with convolutions. arXiv preprint arXiv:1409.4842.

Tao, W., Zhou, Y., Liu, L., Li, K., Sun, K., and Zhang, Z. (2014). Spatial adjacent bag of features with multiple superpixels for object segmentation and classification. Information Sciences, 281:373–385.

Taylor, G. W. and Hinton, G. E. (2009). Factored conditional restricted Boltzmann machines for modeling motion style. In Proceedings of the International Conference on Machine Learning.

Taylor, G. W., Hinton, G. E., and Roweis, S. T. (2006). Modeling human motion using binary latent variables. In Advances in Neural Information Processing.

Teeters, J., Jacobs, A., and Werblin, F. (1997). How neural interactions form neural responses in the salamander retina. Journal of Computational Neuroscience, 4(1):5–27.

Teeters, J. L., Eeckman, F. H., and Werblin, F. S. (1991). A four neuron circuit accounts for change sensitive inhibition in salamander retina. In Advances in Neural Information Processing.

Tian, N. (2008). Synaptic activity, visual experience and the maturation of retinal synaptic circuitry. The Journal of Physiology, 586(18):4347–4355.

Tompson, J. J., Jain, A., LeCun, Y., and Bregler, C. (2014). Joint training of a convolutional network and a graphical model for human pose estimation. In Advances in Neural Information Processing.

Toshev, A. and Szegedy, C. (2014). DeepPose: Human pose estimation via deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

Troy, J. B. and Shou, T. (2002). The receptive fields of cat retinal ganglion cells in physiological and pathological states: Where we are after half a century of research. Progress in Retinal and Eye Research, 21(3):263–302.

Tsai, D., Chen, S., Protti, D. A., Morley, J. W., Suaning, G. J., and Lovell, N. H. (2012). Responses of retinal ganglion cells to extracellular electrical stimulation, from single cell to population: Model-based analysis. PloS ONE, 7(12):e53357.

Tsukamoto, Y., Morigiwa, K., Ueda, M., and Sterling, P. (2001). Microcircuits for night vision in mouse retina. The Journal of Neuroscience, 21(21):8616–8623.

Turcsany, D. and Bargiela, A. (2014). Learning local receptive fields in deep belief networks for visual feature detection. In Neural Information Processing, volume 8834 of Lecture Notes in Computer Science.

Turcsany, D., Bargiela, A., and Maul, T. (2014). Modelling retinal feature detection with deep belief networks in a simulated environment. In Proceedings of the European Conference on Modelling and Simulation.

Turcsany, D., Bargiela, A., and Maul, T. (2016). Local receptive field constrained deep networks. Information Sciences, 349–350:229–247.

Turcsany, D., Mouton, A., and Breckon, T. P. (2013). Improving feature-based object recognition for x-ray baggage security screening using primed visual words. In Proceedings of the IEEE International Conference on Industrial Technology.

Usui, S., Ishihara, A., Kamiyama, Y., and Ishii, H. (1996a). Ionic current model of bipolar cells in the lower vertebrate retina. Vision Research, 36(24):4069–4076.

Usui, S., Kamiyama, Y., Ishii, H., and Ikeno, H. (1996b). Reconstruction of retinal horizontal cell responses by the ionic current model. Vision Research, 36(12):1711–1719.

Vallerga, S., Covacci, R., and Pottala, E. W. (1980). Artificial cone responses: A computer-driven hardware model. Vision Research, 20(5):453–457.

Van Essen, D. C. and Maunsell, J. H. R. (1983). Hierarchical organization and functional streams in the visual cortex. Trends in Neurosciences, 6:370–375.

Vaney, D. I., Sivyer, B., and Taylor, W. R. (2012). Direction selectivity in the retina: symmetry and asymmetry in structure and function. Nature Reviews Neuroscience, 13(3):194–208.

Vapnik, V. (1995). The Nature of Statistical Learning Theory. Springer-Verlag New York, Inc., New York, NY.

Velte, T. J. and Miller, R. F. (1997). Spiking and nonspiking models of starburst amacrine cells in the rabbit retina. Visual Neuroscience, 14(6):1073–1088.

Wässle, H. (2004). Parallel processing in the mammalian retina. Nature Reviews Neuroscience, 5(10):747–757.

Wässle, H., Peichl, L., and Boycott, B. B. (1981). Morphology and topography of on- and off-alpha cells in the cat retina. Proceedings of the Royal Society of London B: Biological Sciences, 212(1187):157–175.

Werblin, F. (1991). Synaptic connections, receptive fields, and patterns of activity in the tiger salamander retina. Investigative Ophthalmology & Visual Science, 32(3):459–483.

Werblin, F. and Jacobs, A. (1996). CNN-based retinal model uncovers a new form of edge enhancement in biological visual processing. In Proceedings of the IEEE International Workshop on Cellular Neural Networks and their Applications.

Werblin, F. S. and Dowling, J. E. (1969). Organization of the retina of the mudpuppy, Necturus maculosus. II. Intracellular recording. Journal of Neurophysiology, 32(3):339–355.

Werbos, P. J. (1982). Applications of advances in nonlinear sensitivity analysis. In System Modeling and Optimization, volume 38 of Lecture Notes in Control and Information Sciences.

Wessinger, C. M., VanMeter, J., Tian, B., Van Lare, J., Pekar, J., and Rauschecker, J. P. (2001). Hierarchical organization of the human auditory cortex revealed by functional magnetic resonance imaging. Journal of Cognitive Neuroscience, 13(1):1–7.

Wright, K. P., McHill, A. W., Birks, B. R., Griffin, B. R., Rusterholz, T., and Chinoy, E. D. (2013). Entrainment of the human circadian clock to the natural light-dark cycle. Current Biology, 23(16):1554–1558.

Wu, Y., Wang, Z., and Ji, Q. (2013). Facial feature tracking under varying facial expressions and face poses based on restricted Boltzmann machines. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

Wurtz, R. H. and Goldberg, M. E. (1972). Activity of superior colliculus in behaving monkey. III. Cells discharging before eye movements. Journal of Neurophysiology, 35(4):575–586.

Xia, Y., Zhang, L., Xu, W., Shan, Z., and Liu, Y. (2015). Recognizing multi-view objects with occlusions using a deep architecture. Information Sciences, 320:333–345.

Xie, P., Deng, Y., and Xing, E. (2015). Diversifying restricted Boltzmann machine for document modeling. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.

Xu, L., Li, Y., Wang, Y., and Chen, E. (2015). Temporally adaptive restricted Boltzmann machine for background modeling. In Proceedings of the AAAI Conference on Artificial Intelligence.

Zaghloul, K. A. and Boahen, K. (2006). A silicon retina that reproduces signals in the optic nerve. Journal of Neural Engineering, 3(4):257–267.

Zhang, C., Cheng, J., Zhang, Y., Liu, J., Liang, C., Pang, J., Huang, Q., and Tian, Q. (2015). Image classification using boosted local features with random orientation and location selection. Information Sciences, 310:118–129.

Zhao, F., Huang, Y., Wang, L., Xiang, T., and Tan, T. (2016). Learning relevance restricted Boltzmann machine for unstructured group activity and event understanding. International Journal of Computer Vision (in press), pages 1–17.

Zhu, Z., Luo, P., Wang, X., and Tang, X. (2013). Deep learning identity-preserving face space. In Proceedings of the IEEE International Conference on Computer Vision.