Computational Models of Primary Visual Cortex and the Structure of Natural Images

Submitted by Diplom-Informatiker (Dipl.-Inf.) Hauke Bartsch

Submitted to Faculty IV - Electrical Engineering and Computer Science of the Technische Universität Berlin for the award of the academic degree

Doktor der Naturwissenschaften – Dr. rer. nat. –

Doctoral committee: Chair: Prof. Dr.-Ing. R. Orglmeister; Reviewer: Prof. Dr. rer. nat. K. Obermayer; Reviewer: Prof. Dr. sci. nat. F. Wysotzki

Date of the scientific defense: 2003-12-01

Berlin 2004 / D 83

Computational Models of Primary Visual Cortex and the Structure of Natural Images

PhD Thesis
Department of Electrical Engineering and Computer Science
Technical University of Berlin, Germany

Abstract

Understanding the function of the brain appears to be mainly an introspective task. Nevertheless, large parts of our brain serve as a direct interface to our environment: they constantly acquire information and control our actions. This dynamic interweaving of brain and environment opens a dual route to the understanding of cortical processing. Based on that insight, the present work focuses on the following two issues: (a) analyzing the functional role of cortical networks, and (b) analyzing the special requirements imposed on the brain by the statistics of its input, namely the statistics of natural images.

After an introduction to both natural images and the mammalian visual system, we first analyze the connection scheme found in the primary visual cortex at different levels of abstraction. We start by deriving a system of coupled differential equations describing a column of excitatory and inhibitory neurons. Phenomena such as contrast-invariant orientation tuning and contrast saturation, which have been found to be important features of cortical neurons, are investigated. The initial model is then extended to also explain response properties related to contextual effects, in which stimuli presented outside the classical receptive field of a neuron can modulate its response. Principal difficulties in producing cross-orientation modulations by iso-orientation-specific patchy connections are shown. In explaining cross-orientation modulations we analyze the effect of the distinctive spatial layout of the cortex. We find that two opposing effects contribute to the observed contextual modulation: (i) local inhibition induced by a local change in the input (leading to suppression), and (ii) dis-inhibition.

The second part deals with the input of the visual system, namely pictures of scenes encountered in the surrounding world. We formulate the hypothesis that higher-order features in spatial patterns can be described in terms of intrinsic invariance and symmetry, and we introduce a mathematical formulation of smooth local symmetries. Applications to object classification, image alignment, and landmark detection illustrate the principal advantage of our structure analysis over methods of shape analysis.

Two new algorithms are introduced to efficiently learn higher-order features. The first introduces a centralized Gaussian mixture model that extracts second-order features by estimating the density of the data. The obtained code is shown to outperform other known linear codes by being well distributed and by showing a high population sparseness, both of which are preferable properties for coding natural images. Originating from geometrical considerations of manifolds in high-dimensional spaces, we introduce a non-linear transformation, and thereby a family of feature spaces, that is shown to be useful for detecting correlations of a specific order in the data. Moreover, it is shown that these correlations can be learned in the feature space by linear methods. This general property of the transformation is interesting for a large class of algorithms in the field of explorative data analysis. In the context of independent component analysis, this transformation defines a feature space in which the assumption of independent sources can be fulfilled for a set of over-complete basis functions.

The goal of this work is to contribute to the understanding of the structures of the human brain. Large parts of our brain function as a direct interface to our environment. Between environment and brain, information is continuously processed and actions are initiated. The dynamic interplay between environment and brain makes it necessary, when analyzing the wiring structures of the brain, to also examine their respective inputs, here mostly sensory signals. Based on this duality of brain structures and sensory signals, this thesis addresses the following two topics:

a) the analysis of the function of cortical circuits, and

b) the special demands that the processing of images, in particular of their statistical properties, places on the brain.

After a short introduction to the statistics of natural images and to the anatomy and function of the mammalian visual system, we investigate the wiring structures found in the first visual area. In particular, the phenomena of contrast-invariant responses to oriented gratings as visual stimuli and the saturation behavior of cells at high input contrasts are investigated analytically and supported by computer simulations. The differential-equation model of coupled cell populations is extended step by step so that context-dependent effects can also be studied. Here the response of a cell depends on the stimuli in its wider surround. We show, among other things, that two different effects contribute to the contextual modulations: on the one hand local inhibition, which is determined by a change in the structure of the input, and on the other hand the effect of dis-inhibition.

The analysis of images of our environment is the main goal of the second part of this thesis. These images are the input to the visual system. We formulate the hypothesis that important properties of images are defined by their inherent invariances and symmetries. To test this hypothesis, we introduce a mathematical measure of local symmetries in spatial patterns. Applications in the areas of object identification, object alignment, and landmark detection underline the advantages of structure analysis over a pure shape analysis.

Two new algorithms are presented for learning higher-order properties of images. The first is based on a centralized Gaussian mixture model and extracts features by learning a model of the distribution function of the data. The learned features are superior to those of other first-order models with respect to their population response properties. Starting from geometrical considerations of manifolds in high-dimensional spaces, we introduce a transformation into a non-linear feature space in which correlations of arbitrary order can be learned with linear methods. In the context of independent component analysis, as an example of an algorithm for explorative data analysis, the transformation can be used to learn over-complete basis functions.

Contents

1. Introduction 2
1.1. Scope and Goals ...... 4
1.2. Plan of the Manuscript ...... 4
1.3. The Input of the Visual System ...... 6
1.3.1. A Statistical Description of Images ...... 8
1.3.2. First Order Statistics ...... 9
1.3.3. Second Order Statistics ...... 10
1.3.4. Decomposition into Basis Sets ...... 12

2. The Mammalian Visual System 15
2.1. The Retina ...... 17
2.2. The Visual Pathway and the LGN ...... 19
2.3. The Primary Visual Cortex ...... 20
2.3.1. Classical Receptive Field Measurements ...... 21
2.3.2. Complex Stimuli in the CRF ...... 25
2.3.3. Anatomy of Lateral Connections in V1 ...... 27
2.3.4. Non-classical Receptive Field Measurements ...... 29
2.3.5. Texture Segmentation and Line Completion ...... 32
2.3.6. Origin of Orientation Selectivity ...... 33
2.3.7. Iceberg-Model ...... 35
2.4. Discussion ...... 39

3. From Columns to Hypercolumns and Lattices 40
3.1. Introduction ...... 40
3.2. A Mean-Field Model of Neuronal Population Activity ...... 41
3.2.1. Modeling Orientation Selectivity with Two Cell Types ...... 44
3.2.2. Hypercolumns with Multiple Populations ...... 48
3.2.3. Hypercolumn Model Setup for Contextual Effects ...... 62
3.2.4. Numerical Simulations ...... 64
3.3. A Lattice Model for Contextual Effects ...... 69
3.3.1. Model description ...... 70
3.3.2. Results ...... 72
3.4. Concluding Remarks ...... 77


4. Invariance and Symmetry 78
4.1. Related Models ...... 79
4.1.1. Learning Invariance ...... 79
4.1.2. Learning Symmetry ...... 81
4.1.3. The Lie Transformation Group Model ...... 85
4.1.4. Symmetries of the Visual Cortex ...... 86
4.2. A Structure Preserving Transformation ...... 87
4.2.1. Co-variation and the Quadratic Form ...... 91

5. The Binary Valued Quadratic Form 93
5.1. The Choice of a Binary Valued A ...... 95
5.2. The Properties of S ...... 96
5.2.1. S is Linear in Contrast ...... 98
5.2.2. The Moments of S ...... 98
5.2.3. Dependence on RFS and Preferred Orientation ...... 99
5.2.4. Stability Analysis ...... 100
5.3. Inverting the Symmetry Detection ...... 101
5.3.1. Minimizing F by Metropolis Algorithm ...... 105
5.3.2. Minimizing F by Gradient Descent ...... 107
5.4. Applications ...... 108
5.4.1. Object classification ...... 108
5.4.2. Image Alignment ...... 110
5.4.3. Landmark Detection ...... 112
5.5. Concluding Remarks ...... 113

6. The Real Valued Quadratic Form 114
6.1. A Tensor Form for Symmetry Detection ...... 114
6.2. Independent Component Analysis ...... 116
6.2.1. Overcomplete ICA ...... 118
6.2.2. Algorithms for Learning Overcomplete Dictionaries ...... 121
6.3. Solving Overcomplete ICA by Mixtures of Gaussians ...... 123
6.3.1. Learning of the Mixture Model by EM ...... 124
6.3.2. Conclusion ...... 126
6.4. Solving Overcomplete ICA by Introducing New Dimensions ...... 127
6.4.1. Feature Spaces and Manifolds ...... 127
6.4.2. A Feature Space for Overcomplete ICA ...... 129
6.4.3. Conclusion ...... 143
6.5. Interpretation of the Directions in Diabolo Space ...... 143
6.6. Applications ...... 147
6.6.1. Toy-Example: 6 Observations, 12 Sources ...... 147
6.6.2. Application to Natural Images ...... 152


6.6.3. Application to Natural Images II ...... 160
6.7. Biological Implementation by Dendritic Micro-circuits ...... 164

7. Summary and Outlook 167

A. Measuring the Entropy of Natural Images 174
A.1. Kernel Density Estimation ...... 174
A.2. Optimal Bandwidth for Kernel Density Estimation ...... 177

A.3. Entropy of the Density f_X ...... 179
A.4. Results ...... 180

B. Derivations 182
B.1. Dependence of Symmetry on RFS ...... 182
B.2. Dependence of Symmetry on Preferred Orientation ...... 183
B.3. Dependence of Symmetry on Noise ...... 185

C. General Linear Independence of Monomial Spaces 190

1. Introduction

“Vision is our primary sensory channel for inter- action with the outside world. It allows us to recognize familiar faces and creatures, and ob- jects; it allows us to orient ourselves in space and to navigate from place to place. It is a pathway for esthetic enjoyment and for information trans- mission. The visual system is one of the many miracles of nature.”

(Shapley and Enroth-Cugell, 1984)

The brain is responsible for our ability to do complicated things like singing, playing chess or writing a thesis. If it is able to perform such complex things, we would expect the brain itself to be complex. But how can we measure its complexity? An equally valid question is: how can we measure the complexity of our environment? Both questions are related because large parts of the brain are involved in the analysis of sensory information. In analyzing either the device (our brain) or the sensory information we hope to learn something about the successful interplay between the two. At the end of the 19th century Ramón y Cajal was the first to establish that the building blocks of the brain, the nerve cells, act as independent units. We understand that a large part of the complexity of the brain is due to the connections between these units. But complexity is more than the number of connections. A fully connected neural network has a large number of connections, but it takes only a single line in Matlab¹ to define a simple network with a large number of connections

y = tanh( ones(10^14) · x + Θ ).

It would be huge; in fact its number of connections is about the same (10^15) as the expected number of connections in our brain, but intuitively we would not suspect it of being complex. A similar reasoning can be applied to our sensory information. Writing a program that enumerates all possible combinations of

¹ Matlab (© The MathWorks, Inc.) is a tool for doing numerical computations with vectors and matrices.


(discrete) stimuli is easy, but it will tell us little about the complexity of our surrounding world. What is the basis of our intuition about what is complex and what is not?

There are measures for the complexity of objects. One is Kolmogorov complexity (Solomonoff, 1997; Kolmogorov, 1965; Chaitin, 1966). Whereas Shannon's information theory (Shannon and Weaver, 1948) is concerned with the average information of a random source, the Kolmogorov complexity of an object is a form of absolute information of the individual object. It can be defined as the size (number of binary digits, or bits) of the shortest program that, without additional data, computes the object and terminates. In this sense the network specified above is not complex because of its short description length. Unfortunately this measure is of rather theoretical use because there is no way to produce the shortest program (or even to recognize that a program is the shortest possible). However, it is useful in the context of comparing different programs and appears widely in the disguise of Occam's Razor or the minimum description length principle.

Other concepts that are related to complexity are structure and redundancy. Redundancies are repeating parts of an object. If we have detected redundancies we have also found the essential parts, the structure in the data. An example is the sequence 01010101 and its representation as 01 ∗ 4. The redundancy defines the structure of the sequence, and using this structure we can represent the sequence in a compact way².

Back to our starting point. To learn something about how we perform complex tasks we need to find the essential structures in either the brain or the environment. We know that the brain is highly ordered: into areas, layers, functional columns, and distinct neuron classes. Defining computer models of the part of the human brain that is concerned with vision is the topic of the first part of this thesis. The second part deals with the dual problem of finding structure in the respective sensory channel, the visual stimuli. By analyzing both the brain and its input we hope to deepen our understanding of the function of the brain, of how it is organized and of how it can handle vast amounts of information so astonishingly efficiently.

² Yes, we have to add the length of the algorithm that performs the ∗ operation, and the length of the blueprint that was used to build the computer that performs the algorithm, and the description length of the basic physical laws and constants that define the universe in which the machine is built that performs the algorithm which produces the sequence. That is what is meant by Kolmogorov complexity not being practical.
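The connection between redundancy and description length can be made concrete with a small numerical sketch (not from the thesis): the length of a losslessly compressed string is a crude, computable upper bound on its Kolmogorov complexity, which itself is uncomputable. The example below uses Python's zlib; the variable names are illustrative only.

import random
import zlib

def description_length(s: str) -> int:
    # Bytes needed by a general-purpose compressor: a rough upper
    # bound on the Kolmogorov complexity of the string.
    return len(zlib.compress(s.encode()))

random.seed(0)
regular = "01" * 1000                                          # 01010101..., highly redundant
irregular = "".join(random.choice("01") for _ in range(2000))  # no structure to exploit

print(description_length(regular))     # small: the repeating structure is captured
print(description_length(irregular))   # much larger, although both strings have the same length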


1.1. Scope and Goals

Building a model of what (we think) is relevant to information processing is mostly based on a lot of assumptions about what (we think) is irrelevant. So any results obtained with models have to be viewed in the light of the underlying assumptions. In this sense biological modeling is helpful for proving or disproving concepts (about how biology could work) but not for establishing how it actually works. All statements directed towards the function of the brain could be prefaced with the phrase: 'If I were the brain, this is how I would do it.'

To name only some of the facts which are not in the scope of this thesis: we will neglect any temporal receptive field structure of neurons as found by Ringach, Sapiro and Shapley (1997) and DeAngelis, Ohzawa and Freeman (1995), the fact that vision is an active process, and any detailed modeling of neurons at the level of channels, synapses, or spikes. Instead, after an introduction to basic findings on the statistics of natural images and the general layout of the primate visual system, we will head directly for (i) the interaction of populations of neurons (Chapter 3 on page 40) and (ii) the functional role of lateral connections in analyzing natural stimuli (Chapter 4 on page 78).

Especially in the later chapters we will assume that the reader is familiar with some standard algorithms of machine learning and data analysis. Introductions to these algorithms will be very brief, and the reader is directed to standard books about brain theory and neural networks, for example the always very helpful Arbib (1998).

1.2. Plan of the Manuscript

This manuscript is concerned with the functional architecture of the primary visual cortex (visual area V1, striate cortex), which serves as an excellent model system for the human sensory system. Figure 1.1 lays out the general arrangement of chapters.

The first part of the thesis reviews the relevant anatomical and physiological findings together with various functional models proposed in the literature. In the second part of the manuscript computational models are developed which address the interplay of orientation selective neuron populations (i) locally within one column of visual cortex, (ii) between different columns (hypercolumn model), and (iii) between different hypercolumns (lattice model). The third part analyses in a more formal framework how the statistics of the visual input influence the shape of the response characteristics of model neurons. In analyzing natural images we make predictions about the receptive


Figure 1.1.: Overview of chapters. Brain: interactions inside one population, between two columns, between hypercolumns, and in a lattice model. Environment: higher order features, namely invariance and symmetry, the binary valued quadratic form, and overcomplete ICA.


fields of cortical neurons. On more technical grounds, an algorithm is given which solves the problem of overcomplete source separation.

1.3. The Input of the Visual System

There are reasons to believe that the brain is tuned to the special requirements of its input signals. A benefit of this could be to overcome the limitations of the receptors, e.g., the finite sampling of the signal by the retinal light receptors (Ruderman and Bialek, 1992). Another reason could be that in a constantly changing environment a system that can adapt is more effective in terms of information transmission and representation (Barlow, 1961). Natural selection would work in favor of systems with superior performance, thus driving the evolution of highly efficient and error-tolerant systems.

But adaptation of the brain also happens on shorter than evolutionary time scales. During ontogeny a refinement of the neuronal connections is observed in the sensory areas of many species. The principles behind these processes are still under debate. Candidate theories to explain the changes of the cortex during ontogenesis are based either on intra-cortical mechanisms or on constraints imposed on the system by the specific form of the input. A main result of the refinement during ontogeny is the topographic layout of many sensory areas in the adult cortex: nearby neurons tend to code for nearby stimuli³. The formation of topographic maps has been intensely studied and serves as an archetypal system for the study of developmental processes.

Map Formation by Intra-cortical Constraints
An intra-cortical property that has been used to explain topographic mappings is the overall wiring length (Allman and Kaas, 1974; Koulakov and Chklovskii, 2001). Short overall wiring lengths are favorable for the speed of processing and for the metabolism of the animal. Arguments against this purely intra-cortical explanation of the cortical layout are: (i) the traveling speed of action potentials can be varied independently of the wiring length, e.g. by changing the diameter of the wire; (ii) there are examples of animals with a 'salt-and-pepper' organization of the respective sensory area (the visual system of the rat), so wiring length minimization may not be a crucial requirement⁴. Our understanding of the metabolic constraints of the system is limited. One assumes that on the order of 20% of our energy is used up by the central nervous

³ The visual cortex displays a spatially topographic layout, the auditory cortex a frequency-topographic layout.
⁴ Rats rely mostly on their auditory and olfactory senses, and the rat auditory cortex is, for example, organized (frequency-)topographically.

system. Keeping the energy consumption of the brain low may have been a worthwhile strategy favored by natural selection in human evolution.

Map Formation Influenced by the Stimulus
The topographic layout of neurons processing sensory information can also be explained by peculiarities of the stimulus. Models that exploit local correlations in the data also explain the emergence of topographic maps. Takeuchi and Amari (1979) carried out a one-dimensional analysis of a continuous version of a neural activity model. They showed that an ordered map results when the width of the input stimuli is smaller than the extent of the lateral interactions in cortex. This ansatz to explain the intra-cortical structures was motivated first by the finding of prominent local correlations in images, that is, by the observation that most images contain redundant information.

Experiments indicated that the initial orientation and ocular-dominance maps are largely independent of visual experience (Crair, Gillespie and Stryker, 1998). This indicates that the general layout of the cortical maps is genetically predefined. After an initial period the stimulus can strongly influence the maps (Singer, 1981), and the cortex remains versatile in adult animals. Because map formation is a very general tool used at large by the cortex, it is very likely that its basics are laid down as early as observed, perhaps even enforced by genetic factors (Kaschube, Wolf, Geisel and Löwel, 2002).

Other features of the neurons may depend more strongly on the specific type of input. Roe, Pallas, Hahm and Sur (1990) performed a drastic experiment along these lines. It is plain that neurons in the auditory cortex have to detect different features than the neurons in the visual cortex, which results in the different lateral connection schemes observed in the two areas. Long-range connections in the auditory cortex, for example, do not show the patch-like terminals of the long-range connections in primary visual cortex. In their work Roe et al. (1990) demonstrate that re-routing the information of the retina to the auditory system results in an orderly map of visual space in the auditory cortex. Lateral connections change accordingly, and visual information can be processed by the auditory cortex in much the same way as in the visual cortex. From these experiments one can conclude that the sensory inputs can direct the formation of cortical circuitry to a large extent.

Structure and Redundancy of the Stimulus
An important relation can be drawn between the redundancies and the structure of the input. Whereas redundancy describes the part of the code that does not transport (additional) information, structure describes the essential parts of the input. If a part of the input has a structure it will be easily compressible


and therefore it contains redundant information. Thus, in exploring redundancies in the code one searches for structure. The idea of structure detection is connected with the measure of Kolmogorov complexity (see the Introduction of this thesis) and the principle of minimum description length.

1.3.1. A Statistical Description of Images

In order to explain the structure of the brain that is used for the processing of visual information we review the literature on the statistics of natural images, that is to say, of pictures of scenes encountered in the surrounding world. One way to analyse the structure of natural images is to describe them in terms of statistics. A statistical description of images assumes that the images presented by the environment are instantiations of random vectors. In the mammalian eye regularly spaced light-sensitive detectors receive the reflected light of objects and report the light intensity. Using rough numbers for the resolution and sensitivity of the optical system (see Section 2.1 on page 17) we can derive the number of distinguishable visual inputs, which will be in the order of

100^{10^6} = 10^{2,000,000},    (1.1)

for 10^6 neurons, each with a sensitivity range of one to one hundred. Although this is a large number⁵ we can be certain that the visual input that arrives from our environment is only a fraction of that huge state space of possible images. First of all, there is simply not enough time for an individual to sample the state space. The fact that we nevertheless handle vision quite well implies structure in the input. So there are reasons to believe that the images we perceive are highly coherent, that there are properties by which we can distinguish natural images from merely possible images.

In our above calculation we counted, for example, all possible random patterns as equally probable images. But images are seldom random (Field, 1987), because nearby pixels tend to be highly correlated. Also, the probability of encountering any known image by a random process is obviously very low. In Appendix A on page 174 we compare the entropy of a set of natural images with the entropy of possible images. Indeed we find that the distribution of natural images is far from a uniform distribution (the assumption for possible images). We will see that the information content of natural images is considerably reduced by some basic properties. The remaining state space appears to be

⁵ The number of atoms in the universe is in the order of 10^72.
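Spelling out the arithmetic behind Eq. (1.1): with roughly 10^6 receptors, each reporting one of about one hundred distinguishable levels, the number of possible inputs is

100^{10^6} = (10^2)^{10^6} = 10^{2 \cdot 10^6} = 10^{2,000,000}.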


complicated, and to handle that complexity we will introduce higher order statistics in this thesis.

Figure 1.2.: Drastic changes at the level of single pixels do not impair our perception

1.3.2. First Order Statistics

The first order statistics, e.g., the gray level histogram of an image, is found not to be of much use. One can strongly change the histogram of an image, for example by changes in illumination, without much affecting its perception (see Figure 1.2). Ruderman (1994) found linear tails when plotting log(number of occurrences of gray level) versus log(gray level). The log statistic is often used if one suspects the presence of power laws (Gisiger, 2001). Let us consider the function y = a x^α, where a and α are real constants and x is a variable. By taking the log of both sides one obtains log y = log a + α log x, which, when plotted on a log-log scale, is a straight line of slope α that intersects the ordinate axis at log a. Important in this context is the scale-invariance of power laws: rescaling the variable x by a factor β, i.e. substituting βx for x, yields (a β^α) x^α, which is again a power law with exponent α. Only the constant of proportionality has changed from a to a β^α.

Another reason to use log statistics in image analysis is that one is usually interested in studying the light intensity arriving at the lens. By using the log of the gray values one obtains a linear relationship between gray level and intensity. Also, the histogram of an image should be invariant under multiplication of the gray levels by a constant. One way to achieve this invariance is to study log(I/I_0) instead of I, where I_0 is the mean gray level of the image. The low information content of single pixels indicates that information which is contained
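A minimal numerical check of the statements above (not from the thesis; all parameter values are arbitrary illustration choices): on a log-log scale a power law is a straight line whose slope recovers the exponent, and rescaling the argument changes only the prefactor.

import numpy as np

a, alpha, beta = 2.0, -1.3, 5.0
x = np.logspace(0, 3, 200)                     # 1 ... 1000
y = a * x**alpha                               # power law y = a * x^alpha

slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
print(slope, np.exp(intercept))                # approximately -1.3 and 2.0

y_rescaled = a * (beta * x)**alpha             # substitute beta*x for x
slope2, intercept2 = np.polyfit(np.log(x), np.log(y_rescaled), 1)
print(slope2, np.exp(intercept2))              # same slope, prefactor a * beta**alpha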


Figure 1.3.: Surrogate data demonstrate that information is contained in the higher order statistics. Randomizing the phase (middle) destroys higher order information but keeps mean and variance. Randomizing the amplitude (right) destroys mean and variance

in the statistics of single pixels is highly redundant. Even if the amplitude component of an image is completely destroyed, recognition of objects is still possible (see the example in Figure 1.3). This highlights that important information is conveyed in the higher order statistics of natural images. What is detected by lower order statistics is mostly redundant information, and systems engaged in analyzing images should incorporate some kind of adaptation to light levels. The importance of higher order statistics is also reflected in the success of local, edge-based image coding (compression) algorithms based on wavelets over compact coding schemes derived from non-local features like Fourier spectra or PCA.
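Surrogate images like those in Figure 1.3 can be generated with the Fourier transform. The sketch below is an assumption about how such surrogates are typically built, not the thesis's own code; img is taken to be a 2-D grayscale numpy array, and for simplicity the Hermitian symmetry of a real image's spectrum is not enforced, so the real part of the inverse transform is used.

import numpy as np

def surrogates(img, seed=0):
    """Return (phase-randomized, amplitude-randomized) versions of img."""
    rng = np.random.default_rng(seed)
    F = np.fft.fft2(img)
    amplitude, phase = np.abs(F), np.angle(F)

    # Keep the amplitude spectrum (and with it mean and second-order statistics),
    # destroy the phase, i.e. the higher-order structure such as edges.
    random_phase = rng.uniform(-np.pi, np.pi, size=F.shape)
    phase_randomized = np.real(np.fft.ifft2(amplitude * np.exp(1j * random_phase)))

    # Keep the phase, replace the amplitude spectrum.
    random_amplitude = rng.uniform(0.0, amplitude.max(), size=F.shape)
    amplitude_randomized = np.real(np.fft.ifft2(random_amplitude * np.exp(1j * phase)))

    return phase_randomized, amplitude_randomized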

1.3.3. Second Order Statistics

Second order statistics deal, in contrast to first order statistics, with the statistics of combined events. One assumes the image I to be a continuous function from IR² into IR. Co-occurrences describe the complete second order statistics. From the assumption of translational invariance it follows that we can consider the statistics with respect to an arbitrary pixel. A function of this kind can be defined as

coo(i, j, x) = P(I_0 = i  &  I_x = j)    (1.2)

where x is a position in the image and i, j are gray levels. With this measure the relative frequency of the occurrence of specific pairs of gray values can be quantified.
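A direct estimate of Eq. (1.2) from a quantized image can be written in a few lines. This is a sketch under assumptions not stated in the text: img is a 2-D numpy array of integer gray levels in 0 .. n_levels-1, and the displacement x is given as non-negative pixel offsets (dy, dx).

import numpy as np

def co_occurrence(img, dy, dx, n_levels):
    """Relative frequency of gray-level pairs (i, j) at displacement x = (dy, dx), Eq. (1.2)."""
    h, w = img.shape
    reference = img[:h - dy, :w - dx]          # pixels I_0
    displaced = img[dy:, dx:]                  # pixels I_x
    counts = np.zeros((n_levels, n_levels))
    np.add.at(counts, (reference.ravel(), displaced.ravel()), 1)
    return counts / counts.sum()

# Example: statistics of horizontally adjacent pixels of a 16-level image:
# coo = co_occurrence(quantized_img, dy=0, dx=1, n_levels=16)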

10 Hauke Bartsch, 2002 1.3. The Input of the Visual System tendency of two features (pixel) to vary together (in their gray values) cov(X, Y ) = E ((X E(X))(Y E(Y ))) (1.3) − − ρ(X, Y ) = cov(X, Y )/ E (X E(X))2 E (Y E(Y ))2 1/2 (1.4) − − here E(.) is the expectation overmany images. The measure of correlation is widely used to analyse effects of distance and orientation on the co-variability of the image intensities. Results indicate that images are mostly smooth, in other words they have finite spatial correlations with occasional rapid changes in contrast (edges). Field (1987) analysed the spatial frequency of a number of natural images. He found that when averaging over all orientations, the power at a given frequency in the images was proportional to 1/frequency (accord- 1/f ing to Billock (2000) 1/f β with a β of 0.9 . . . 1.2). This indicates that nearby positions in the images are highly correlated (because the Fourier transform of an image can be converted by the Wiener-Khintchin theorem (Connor, 1982) to the auto-correlation function). It also shows that natural images are highly non-Gaussian. Also Ruderman and Bialek (1994) found that distributions of local quanti- ties such as contrast are scale invariant and have nearly exponential tails which reflects that there is no typical scale at which objects are seen. Arguments for Higher Order Features Turiel and Parga (2000) decomposed pixels in the image, into sets, the fractal components of the image, so that each set contains only points characterized by fractal a fixed strength of the singularity of the contrast gradient in its neighborhood. components They found that under changes in scale each fractal component exhibits its own transformation law and scaling exponent, e.g., how sharp or soft a change in contrast is at a given point can be quantified in terms of the value of the scaling exponent at that site. This indicates that there is not a well-defined scale for the components of an image but at the same time the scene is not scale invariant globally. Baddeley (1997) analysed in detail the correlation structure of different sets of natural images. Open and urban landscapes were used to estimate the degree of correlation between image intensity measurement pairs as a function of both distance and orientation. He found that psychophysical findings on distance estimation (Cormack and Cormack, 1974) can be explained by the slower decaying rates of horizontal correlations compared to more vertical6 correlations. There are different ways to explain the approximately power law fashion of the observed correlation structure. One possibility suggested by Field (1987) is 6More specific, the direction of smallest decay was image-set specific and in the range of 20◦ 45◦. −

One possibility suggested by Field (1987) is that images are self-similar or fractal (see also Billock (2000)). This implies that there is no special scale for objects. Baddeley (1997) considers another model. There, idealized, randomly sized and shaped "objects" are viewed at a number of random distances. Within each object the image pixels are perfectly correlated, but across the boundary of an object the characteristics are completely uncorrelated. If all object sizes are equally probable the model explains the observed correlation of natural images. Alvarez, Gousseau and Morel (1999) also showed that the sizes of objects in natural images exhibit a scale invariant property. Objects were defined as connected components within which the contrast does not exceed a certain threshold.

The simplicity of these explanations for the observed power laws raises the question of how informative the average correlation is, and what other methods can be used to extract the information hidden in the images. One goal of this thesis is to introduce a less averaged version of the correlation model. Its essence is that more than one correlation matrix is learned simultaneously, and the whole model seeks to explain the underlying causes of the input, its constituting structure.

Second order moments (correlations) mostly reveal the locally smooth nature of natural images, thus their redundant information. Edges, on the contrary, represent higher order features (see Figure 1.3) and contain important information about the visual scene. This can be made plain by computing the relative frequencies of sets of image patches that appear in a scene. According to Shannon and Weaver's (1948) definition of information, most information is contained in the image patch configurations that appear with the lowest frequencies⁷. In Figure 1.4 on the facing page (center) the likelihood of image patches is computed as the negative log frequency of the summed absolute response of a set of ICA filters for the image shown on the left. The filter bank is used in order to reduce the dimensionality of the problem, in other words, the number of possible images over which we have to integrate when computing the likelihood. It turns out that predominantly edges are among the least probable image features; thus they convey the largest amount of information in the images. This also corresponds to the findings of Geman and Koloydenko (1998) that edges are "the most probable non-background" micro-image configurations.

⁷ Information is connected to surprise.

1.3.4. Decomposition into Basis Sets

Other studies deal with a decomposition of images into linear combinations of basis images. An image X is considered as a random vector of size 1 × MN.


Figure 1.4.: Center: Likelihood of patches x from the image on the left, coded in gray value as −log(p(x)). Right: the corresponding histogram of occurrences of image features obtained from 10 images. The probability of a single patch was computed by counting the summed absolute responses to a filter bank of edge detector units (obtained by FastICA on the same image stack). Edges are the least probable image features and thus contain the highest amount of information

It is written as

X = Σ_{i=1}^{n} a_i A_i = aA    (1.5)

where a_i is a random variable, A_i is a fixed image, also called a basis image, and A is a matrix containing the A_i as columns. Note that the number of basis images can be larger or smaller than the dimension of X (that is, A need not be a square matrix). One imposes a condition on the a_i and learns the A_i which describe the images. In principal component analysis one assumes the a_i to be pairwise uncorrelated. The A_i are obtained as eigenvectors of the covariance matrix of the images or image patches. For natural images the A_i are non-local and resemble a Fourier basis (Olshausen and Field, 1996). The assumption that the responses of the set of filters should be pairwise statistically independent leads to spatially restricted Gabor-like receptive fields which encode lines at certain positions in visual space (Bell and Sejnowski, 1996). Localized edge detectors can also be obtained from the assumption of sparseness in the neuronal response (Olshausen and Field, 1996). Sparseness is related to the idea that in neuronal assemblies most neurons should be silent most of the time to save metabolic energy. Because of the similarity of the obtained filters to the receptive fields of simple cells in primary visual cortex, one concludes that the primary visual cortex operates so as to reduce the redundancy of its input (Barlow, 1961).

A non-linear extension of the model above is the introduction of polynomial basis functions. Here A_i is modeled as a basis function φ_i(X):


F(X) = Σ_{i=1}^{n} a_i φ_i(X)    (1.6)

If the basis functions are, for example, cross products of k or fewer coordinates of the input vector X, then F(X) is a polynomial of degree k. In the context of neural networks these polynomial classifiers⁸ have been 're-named' sigma-pi units (Σ for sum, Π for product) or higher-order nets. Starting with Chapter 4 on page 78 we will analyse models of this kind in order to learn non-linear basis functions for natural images.

Summarizing: natural images can be well characterized by their local contrast, and efficient algorithms should capture the invariances and redundancies found in the images. In the next chapter we will see that most of the listed properties of images comply with anatomical structures or physiological findings in the mammalian visual system. Further aspects of the statistics of natural scenes are reviewed in Atick and Redlich (1992), Field (1994) and Ruderman (1994).

⁸ An elementary two-class discrimination is performed by comparing the output to a threshold.
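As an illustration of the decomposition in Eq. (1.5) under the decorrelation condition, the following sketch (not the thesis's own code; patches is assumed to be an array of vectorized image patches of shape (n_patches, p*p)) computes basis images as eigenvectors of the patch covariance matrix and the corresponding pairwise uncorrelated coefficients.

import numpy as np

def pca_basis_images(patches, n_components):
    """Basis images A (one per row) and coefficients a with X ~ a @ A + mean (Eq. 1.5)."""
    mean_patch = patches.mean(axis=0)
    X = patches - mean_patch
    C = np.cov(X, rowvar=False)                     # covariance matrix of the patches
    eigenvalues, eigenvectors = np.linalg.eigh(C)   # eigenvalues in ascending order
    order = np.argsort(eigenvalues)[::-1][:n_components]
    A = eigenvectors[:, order].T                    # row i is the basis image A_i
    a = X @ A.T                                     # coefficients are pairwise uncorrelated
    return A, a, mean_patch

# For natural images the leading A_i are non-local, Fourier-like patterns;
# reshaping a row of A to (p, p) displays the corresponding basis image.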

2. The Mammalian Visual System

The visual system of humans is organized as a serially connected system called the visual pathway (see the figure on the following page). The retina of each eye consists of a plate having three layers of cells, one of which contains the over 125 million light-sensitive receptor cells, the rods and cones. The two retinas send their output to two peanut-size nests of cells deep within the brain, the lateral geniculate bodies. These structures in turn send their fibers to the striate or primary visual cortex (V1). From there, after being passed from layer to layer through several sets of synaptically connected cells, the information is sent to several neighboring higher visual areas (V2-V*) (see Figure 2.1). In terms of information processing we will see that the retina computes a compact code which is transmitted through the bottleneck of the optic fibers into the relay station LGN. The LGN also receives massive back-projections from the next stage of cortical processing, which may influence the signal from the retina. Little is known about the functional role of this back-projection.

Figure 2.1.: General flow of information between the first areas of visual information processing (areas shown: LGN, V1, V2, V3, V4, MT, MST, parietal cortex, temporal cortex)

The next section is based on Hubel’s (1995) excellent book.



Figure 2.2.: Left: Spatial arrangement of receptor cells and the summing areas of retinal ganglion cells (RGC) and horizontal cells in the retina (photoreceptor grid, horizontal cell summing area, bipolar cell summing area). Center: Response of an on-center off-surround RGC to different stimuli. Right: Responses of an off-center RGC (center and right figures are adapted from Hubel (1995))

2.1. The Retina

The neural signal which leaves the retina consists of trains of impulses carried by the axons of retinal ganglion cells. One can define the response of the retina to visual stimulation as the change of rate of the firing impulses. Since the work of Hartline (1940), Barlow (1953), and Kuffler (1953) one knows that each retinal ganglion cell generates responses to stimulation over a limited area of the retina, and this area is defined as the receptive field of that ganglion cell. Kuffler (1953) found that the receptive fields of cat retinal ganglion cells consist of two concentric zones which he called the center and surround. The center and surround were mutually antagonistic¹. In on-center cells, in which the center caused excitatory responses to increments of light, the surround would cause inhibitory responses to increments. In off-center cells, in which the central region was inhibitory during an increment, the surround would be excitatory during an increment. The on- and off-center cells and their center-surround organization are illustrated in Figure 2.2. Because of their design retinal ganglion cells are very good at spatial comparisons: judging which of two neighboring regions is brighter or darker. Our efficiency in doing this allows us to distinguish differences on the order of 2%. Most remarkably, this is largely invariant to the level of illumination.

Luminance is an objective measure of the amount of light emanating from a luminous source or reflecting object, weighted by the observer's spectral sensitivity function. Illumination can be expressed in terms of effective quanta of light per unit time per unit area of the surface on which the light is falling. Whereas the apparent brightness, which is our subjective sensation of how light or dark an object is, does not change, the illumination can change dramatically.

¹ In terms of information processing a center-surround receptive field performs a decorrelation of the input (Atick and Redlich, 1992).


Figure 2.3.: Ganglion cell model output (right) for the image shown on the left. The simple model consists of filtering the input image with a difference-of-Gaussians function which models the center-surround structure of on-center off-surround retinal ganglion cells

For example, when we read a sheet of paper in a room or in daylight we always perceive white paper and black letters, even though the black letters outdoors send twice as much light to our eyes as the white paper indoors. For us, the important thing is the amount of light relative to the amount reflected by surrounding objects. Most neurons have a limited response range of a factor of one hundred from noise to ceiling, but they encode three to five log units of stimulus level. It follows that in order to achieve this illumination insensitivity the retina has to adapt to its input. The mechanism is to increase the contrast sensitivity and the contrast gain as the illumination increases, finally leveling off to asymptotic values in bright light. A summary of the visual adaptation and retinal gain control mechanisms can be found in Shapley and Enroth-Cugell (1984).

Due to the mechanisms described above, contrast is the important quantity. This is nicely illustrated by the Cornsweet illusion (Figure 2.4 on the facing page). The contrast ramp in the middle of the picture determines our perceived brightness in the left and right halves of the figure. The fact that we receive signals only where contrast is changing is again depicted in Figure 2.3, where the output of a ganglion cell model for a stimulus is shown.

Contrast
Contrast is a physical property of the visual stimulus (for example, a grating); it is the magnitude of luminance variation in the stimulus relative to the average luminance. Contrast can be defined by two related formulas. The Rayleigh contrast C_R is the mean-to-peak amplitude of the grating divided by the mean; the Weber contrast C_W, or Weber fraction, is the peak-to-peak amplitude divided by the luminance at the trough of the luminance profile.

C_W = (L_object − L_background)/L_background,    C_R = (L_max − L_min)/(2 L_mean)    (2.1)

For low contrasts as used in most experiments both measures are approximately the same.
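A minimal sketch of the ganglion-cell model described for Figure 2.3 and of the two contrast measures in Eq. (2.1). The code is illustrative only: img is assumed to be a 2-D grayscale numpy array, and the ratio of the two Gaussian widths is an arbitrary choice, not a value taken from the thesis.

from scipy.ndimage import gaussian_filter

def ganglion_output(img, sigma_center=1.0, sigma_surround=3.0):
    """On-center off-surround response: narrow excitatory center minus broader inhibitory surround."""
    return gaussian_filter(img, sigma_center) - gaussian_filter(img, sigma_surround)

def weber_contrast(l_object, l_background):
    return (l_object - l_background) / l_background        # C_W of Eq. (2.1)

def rayleigh_contrast(l_max, l_min):
    l_mean = 0.5 * (l_max + l_min)
    return (l_max - l_min) / (2.0 * l_mean)                # C_R of Eq. (2.1)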


Figure 2.4.: Brightness depends on border contrast. This is an illustration of the Craik-O'Brien-Cornsweet illusion. The entire right half of the field appears brighter than the left half, yet the luminances of the two half fields are equal away from the border between them. The luminance profile is drawn underneath the image

Adult humans are sensitive to a broad range of spatial scales, ranging from very coarse scales (< 0.1 cycles per degree) to frequencies as high as 60 cycles per degree (Billock, 2000; Owsley, Sekuler and Siemens, 1983).

One should keep in mind that the retina is highly developed and performs numerous computations which we do not cover here. One example is lateral inhibition, which serves to 'sharpen the edges' in the retinal image, or the separation of the visual information into different channels, e.g., motion, form, and color. Knowledge about the statistics of the data can also be used to increase the spatial resolution of the retina (Ruderman and Bialek, 1992). In terms of efficient coding this can be understood as forming a compact code at the early stage of visual information processing. Only this pre-processing allows the retina to send the information coming from 125 million receptor cells through roughly one million fibers into the lateral geniculate nucleus.

2.2. The Visual Pathway and the LGN

Our eyes are directed in such a way that their areas of sight overlap. The visual pathway ensures that all points to the right of a vertical line through any point we are looking at are projected onto the left hemisphere. The optic fibers coming from the retina cross and distribute at the optic chiasm before reaching the left and right lateral geniculate nucleus (LGN) (see the figure on page 16). Fibers from the left half of the left retina go to the geniculate on the same side, whereas fibers from the left half of the right retina cross at the optic chiasm and go to the opposite geniculate. Similarly, the output of the two right half-retinas ends up in the right hemisphere. This peculiarity of the brain, that each hemisphere deals with the opposite side of the environment, is found not only in vision but also in motor control and in the auditory cortex.

The LGN is a layered structure which receives topographically organized input from both retinae and projects to the cerebral cortex (see the figure on


page 16). It consists of several layers of neurons separated by intervening layers of axons and dendrites. The 1.5 million cells comprising the layers are of different types. The so-called magno- (= large) cellular and parvo- (= small) cellular cells are believed to be the anatomical segregation of the pathways conveying form and movement signals. The layers are further distinguished by having different inputs (from the ipsi-lateral and contra-lateral eyes) and different receptive field organization (ON- and OFF-type cells), which is inherited from the retinal organization.

Contrast-response curves from the LGN and cortical potentials are quite different from those of the retina in that their amplitudes increase approximately linearly with log contrast over a 2-log-unit range (1 to 100%) (Ohzawa, Sclar and Freeman (1985), cat data). But apart from this contrast normalization we will treat the LGN as a simple relay station for information coming from the retina to the primary visual cortex. This picture may be wrong because the LGN receives many more back-projecting fibers from the cortex than input fibers from the retina. It is not yet known what kind of information processing happens in this region.

2.3. The Primary Visual Cortex

The primary visual cortex corresponds to Brodmann's area 17 at the posterior tip of the brain. It is also known as striate cortex because of the highly distinctive layering structure that shows up in a Nissl stain (which marks cell bodies only). Because of its location on the upper and lower lips of the calcarine ('spur-shaped') sulcus, the striate cortex is also known as the calcarine cortex. Yet another name is visual area one, or V1.

Neurons arranged vertically to the surface of the cortex (neuronal columns) often have common properties (Hubel and Wiesel, 1962). Neurons in one column have, for example, overlapping receptive fields which correspond to the same region of the retina of one eye. A careful estimate of the size of a cortical column of the monkey brain was given by Peters and Sethares (1996) (based on clusters of apical dendrites): the modules are spaced at an average center-to-center distance of 56 µm and contain on the order of 200 neurons. The neuronal columns for one eye are grouped together and form (depending on the species) stripes or patches which are known as ocular dominance columns. Within the ocular dominance columns, sub-columns of neurons which are sensitive to particular orientations in space, known as orientation columns, can be found. To complicate matters further, color-sensitive columns known as blobs pierce the centers of the ocular dominance

columns, resulting in inter-blob neurons which are orientation-sensitive, and blob neurons, which are not. Moreover, blobs are specialized on the basis of wavelength ('color') sensitivity. Neurons are also selective to the spatial frequency of the stimuli (Das and Gilbert, 1999). Across the surface of the cortex the orientation preference, response latency and temporal frequency vary systematically. With respect to other parameters, as for example the spatial phase of the stimulus, no systematic changes are observed (DeAngelis, Ghose, Ohzawa and Freeman, 1999). In the later chapters we will focus on additional features primary visual neurons may respond to in order to capture the structure of natural images. What is common to all of the feature maps found is the restricted focus of the neurons in retinal space (termed the receptive field). The maps are formed in a topographic manner, retaining spatial relationships. In summary, the cortex appears to be a substrate of interwoven, more or less topographically organized feature maps.

2.3.1. Classical Receptive Field Measurements

By stimulating the visual field with random dot patterns one finds a position in retinal space for which a neuron responds. A small stimulus is centered at this position, and while the stimulus diameter is increased the response of the neuron is measured. The point where the response of the measured cell saturates or starts to decline (see end-stopping on page 31) defines an area which is termed the classical receptive field of that neuron.

One should keep in mind, firstly, that the concept of the classical receptive field is not as well defined as it seems. There are other measurement approaches by which the receptive field may be defined differently; for example, one may start with large stimulus sizes and shrink the diameter up to the point where the response of the cell starts to decline. Secondly, the classical receptive field is stimulus dependent; neurons have been reported which respond strongly to center-surround stimulus configurations where neither stimulus component alone was effective (Sillito, Grieve, Jones, Cudeiro and Davis, 1995). In Section 2.3.4 on page 29 we list some more observed context effects. One has to be careful in comparing the different experiments in terms of the type of animal used and also the specific measurement protocol, because it has been demonstrated that receptive field properties can change through time (Gilbert and Wiesel, 1990; DeAngelis et al., 1995).

Oriented Stimuli
The majority of neurons in the primary visual cortex are known to respond best to oriented bars or gratings at certain positions in visual space; they are


Figure 2.5.: Cortical simple cells respond to, and can therefore be defined by, their preferred stimulus orientation, spatial phase and spatial frequency

orientation selective² (Hubel and Wiesel, 1962). Depending upon how they respond to grating stimuli they are classified as either simple or complex cells. For simple cells the response depends on the stimulus in an approximately linear fashion. Troyer, Krukowski and Miller (2002) observed that the input to a simple cell obtained for a drifting grating stimulus can more exactly be described as the sum of two terms, a linear term and a non-linear term. The linear term represents the temporal modulation of the input, and the non-linear term represents the mean input, which grows with stimulus contrast. The sensitivity of simple cells also depends upon the spatial phase (position) of the grating. De Valois, Albrecht and Thorell (1982) and de Valois and de Valois (1988) found that at each eccentricity the human visual system is sensitive to a spatial frequency range of three to five octaves. In a paper based on these findings Lee (1996) derived a family of self-similar 2D Gabor wavelets that are suitable to model and analyse the linear characteristics of simple cell receptive fields. Complex cells' responses are insensitive to the spatial phase of the stimulus. The distinct orientation which elicits the maximum response is called the preferred orientation of that neuron. The sharpness of orientation tuning is often measured as the half-width at half height of the orientation-tuning curve (see Figure 2.6 on the facing page).

² The feature of orientation selectivity depends on the species and on the neuronal layer. In cat the neurons in layer 4 are orientation selective, in monkey they are not.
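The half-width at half height just mentioned can be read off a sampled tuning curve directly. The sketch below is illustrative only; the Gaussian-shaped curve is a stand-in for measured data, and the function assumes a single peak without circular wrap-around.

import numpy as np

def half_width_at_half_height(angles_deg, response):
    """Half-width at half height of a single-peaked tuning curve sampled at angles_deg."""
    baseline = response.min()
    half_level = baseline + 0.5 * (response.max() - baseline)
    above = angles_deg[response >= half_level]      # angles where the curve exceeds half height
    return 0.5 * (above.max() - above.min())

angles = np.linspace(-90.0, 90.0, 181)
tuning = 2.0 + 30.0 * np.exp(-angles**2 / (2.0 * 20.0**2))   # toy curve, sigma = 20 degrees
print(half_width_at_half_height(angles, tuning))             # about 23.5 degrees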


Figure 2.6.: Left: Preferred orientation remains constant but selectivity sharpens (figure adapted from Volgushev et al. (1995), cat data). Center: Orientation tuning width of simple and complex cells in cat primary visual cortex (figure adapted from Carandini and Ferster (2000)). Right: (adapted from Pei et al. (1994))

Response of Cortical Neurons to Contrast
The response of cortical neurons to different levels of contrast is analogous to what has been proposed for retinal light adaptation (see Section 2.1 on page 17). Contrast adaptation allows cortical neurons to maintain a high differential sensitivity to changes in the contrast of a stimulus despite the limitations of a restricted response range (Sanchez-Vives, Nowak and McCormick, 2000). However, it is worth noting that some cells show no adaptation behavior. Thus, information about the overall absolute contrast may be transmitted to the cortex. It is not clear whether this information is really used for recognition (see Figure 1.3 on page 10). The contrast-response curves can be described largely by a thresholded linear function that saturates well below the maximum firing rate of the neurons, or even declines for high contrasts (called super-saturation) (Albrecht and Hamilton, 1982). In Sections 3.2.1 and 3.2.2 we will show how contrast saturation can be understood as a network effect.
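The 'thresholded linear function that saturates' can be written down directly. The following is only an illustrative sketch; the threshold, gain and saturation values are arbitrary choices, not fitted to the data cited above.

import numpy as np

def contrast_response(c, threshold=0.05, gain=120.0, r_sat=35.0):
    """Firing rate as a thresholded linear function of contrast c (in 0..1),
    saturating well below the cell's maximum possible rate."""
    return np.clip(gain * (c - threshold), 0.0, r_sat)

contrasts = np.array([0.01, 0.05, 0.1, 0.2, 0.5, 1.0])
print(contrast_response(contrasts))   # zero below threshold, linear rise, then saturation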

Signaling of Multiple Orientations
Numerous models have been proposed for the generation or the sharpening of orientation selectivity inside V1 (Somers, Nelson and Sur, 1995; Carandini and Ringach, 1997; Ben-Yishai, Hansel and Sompolinsky, 1997; Bartsch, Stetter and Obermayer, 1997; Stetter, Adorján, Bartsch and Obermayer, 1998). They are based on (i) excitation and inhibition arranged in a Mexican-hat like fashion, enforcing long-range or global inhibition and more localized excitation; this results in lateral inhibition and is used as a mechanism to sharpen orientation-tuning curves. The models assume (ii) a high intracortical coupling strength to achieve


the observed sharp orientation tuning from an initially broadly tuned thalamic input. This enforces high competition, a winner-take-all strategy of lateral inhibition, and therefore signaling the presence of multiple orientations is difficult (Carandini and Ringach, 1997; Bartsch et al., 1997). Carandini and Ringach (1997) have pointed out that one could test the assumptions made by measuring the response of neurons in V1 to stimuli inside the classical receptive field composed of different orientations. Little data on this are available. DeAngelis, Robson, Ohzawa and Freeman (1992) measured the response of V1 neurons for stimuli that consist of two superimposed gratings at the size of the classical receptive field. They found a reduced activation compared to a single grating at the optimal orientation. This is an argument against a simple linear summation. Attraction and repulsion between orientations are observed psychophysically as tilt-illusions and are also reported physiologically in V1 (Gilbert and Wiesel, 1990). It is known that the ability of animals and humans to carry out perceptual tasks, such as the discrimination of two similar stimuli, improves with practice for that specific direction only, not for substantially different orientations or spatial frequencies (Walk, 1978). This suggests that learning is due to changes at early stages of the sensory pathway, where stimuli characterized by very different parameters are represented by different neurons (Mato and Sompolinsky, 1996). The problem of generating both sharp contrast-invariant orientation tuning and a reliable signaling of multiple orientations can probably be solved by the idea of Adorján, Schwabe, Piepenbrock and Obermayer (2000) to incorporate a rapid change in cortical connection strength during a fixation period (200 ms), implemented by fast synaptic depression (see Schwabe, Adorján and Obermayer (2000) for a similar ansatz that uses spike-frequency adaptation). Initially, high competition is used to extract the salient features first (by high recurrent coupling), and in a second, less competitive phase the precise signaling of multiple orientations becomes possible. It is argued that this could be optimal in terms of information transfer. At the beginning of a fixation period only a limited number of spikes are obtained, so the signal-to-noise ratio is too low to reliably detect multiple orientations. Most information can therefore be extracted at the beginning by attempting to extract only one orientation with the robust high-competition regime. In later phases more spikes are collected and the signal-to-noise ratio improves, so it becomes possible to detect multiple orientations. It remains to be shown to what degree dynamic coding is really implemented, because there are other ways to cope with low signal-to-noise ratios, such as coding by a population response. Tsodyks and Sejnowski (1995) demonstrated that instead of integrating over time it is efficient to integrate over the response

of many similar neurons, which can reflect changes in the stimulus conditions nearly instantaneously. Another approach to understand the conditions under which models can encode multiple orientations was proposed by Zemel and Pillow (2000). They optimized the recurrent weights so that the intra-cortical computation results in a stable activity profile that resembles a convolution of cell responses to the different orientations present in the stimulus. The obtained connectivity profile closely resembles a Mexican-hat like function, as most models assume. It was hypothesized that the fine structure of the profile causes the enhanced ability of the model to encode multiple orientations. Unfortunately their model also operates closer to the linear phase (reduced intra-cortical connectivity strength) than the model by Carandini and Ringach (1997), which has the drawback of less pronounced sharpening and a more contrast-dependent orientation tuning (cf. Section 3.2.1 on page 45).
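
The trade-off between integrating over time and integrating over a population can be illustrated with a toy Poisson-spiking calculation (an assumption-laden caricature, not the model of Tsodyks and Sejnowski (1995)): the precision of a rate estimate is set by the total number of spikes collected, whether they come from one neuron observed over many windows or from many similar neurons in a single window.

```python
import numpy as np

rng = np.random.default_rng(0)
rate = 50.0          # true firing rate in Hz
dt = 0.01            # 10 ms readout window
n_neurons = 100
n_bins = 100
trials = 2000

# (a) one neuron, integrated over n_bins consecutive windows
counts_time = rng.poisson(rate * dt, size=(trials, n_bins)).sum(axis=1)
est_time = counts_time / (n_bins * dt)

# (b) n_neurons similar neurons, read out in a single window
counts_pop = rng.poisson(rate * dt, size=(trials, n_neurons)).sum(axis=1)
est_pop = counts_pop / (n_neurons * dt)

print("temporal integration : mean %.1f Hz, std %.1f Hz" % (est_time.mean(), est_time.std()))
print("population averaging : mean %.1f Hz, std %.1f Hz" % (est_pop.mean(), est_pop.std()))
# Both estimates have comparable precision, but the population estimate is
# already available after a single 10 ms window.
```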

2.3.2. Complex Stimuli in the CRF

Searching for optimal or nearly optimal stimuli is traditionally performed manually using a limited set of stimuli. One has to assume that the relevant parameters of the stimulus are known in advance. The selection of the relevant parameters by the experimenter and the search procedure itself introduce an element of subjectivity into such experiments. Because in higher sensory areas even the relevant parameters are unknown, a more effective approach is needed. The assumed relevant stimulus parameters of simple cells in primary visual cortex are orientation and spatial frequency. This motivates many functional models of the visual cortex to use 2D Gabor wavelets to model the receptive field of linear visual cortical neurons (simple cells). Gabor wavelets provide the best trade-off between resolution in the signal domain and in the frequency domain (Gabor, 1946). The validity of this ansatz was verified by careful mappings of the receptive fields of simple cells by Jones and Palmer (1987).
Measuring Receptive Fields by Reverse Correlation
In their experiment simple cell responses were measured with a micro electrode. The receptive field of a cell was mapped location by location by projecting a dot-like stimulus onto a homogeneous screen viewed by the corresponding eye. The method is called reverse correlation and measures the spike-triggered average response in the presence of a white-noise stimulus (deBoer and Kuyper, 1968). The estimate is the best linear model explaining the firing rate given the stimulus (see the excellent book of Dayan and Abbott (2001) on this topic). Reverse correlation can be used to obtain the most


Figure 2.7.: Response of a V2 cell to grating and contour stimuli. Color-coded mean response of an individual cell to 128 stimuli. Stimulus orientation is normalized. The bar plots at the bottom of the panel show the mean responses + SEM of the given cell to the most effective stimuli. Image from (Hegdé and Essen, 2000)

effective stimulus in that it relates the optimal kernel for firing rate estimation to the stimulus. Given a stimulus with constant energy, the most effective stimulus is one that is proportional to the optimal linear kernel. Because of the linear ansatz, reverse correlation can be computed off-line, but it is limited to linear or nearly linear neurons.
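
A minimal sketch of the reverse-correlation estimate, assuming a hypothetical one-dimensional linear-nonlinear Poisson model neuron (the kernel below is invented): the spike-triggered average of white-noise stimuli recovers a scaled version of the linear kernel.

```python
import numpy as np

rng = np.random.default_rng(1)

# hypothetical 1D linear kernel standing in for a receptive field profile
x = np.linspace(-3, 3, 31)
kernel = -x * np.exp(-x**2)                 # odd-symmetric "edge detector"

n_frames = 20000
stimuli = rng.standard_normal((n_frames, kernel.size))   # white-noise frames

# linear-nonlinear Poisson toy neuron
drive = stimuli @ kernel
rate = np.maximum(drive, 0.0)               # rectification
spikes = rng.poisson(rate)                  # spike count per frame

# reverse correlation: spike-triggered average of the stimulus
sta = (spikes[:, None] * stimuli).sum(axis=0) / spikes.sum()

corr = np.corrcoef(sta, kernel)[0, 1]
print(f"correlation between STA and true kernel: {corr:.3f}")
```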

Measuring Receptive Fields by Gradient Ascent
There is an interesting alternative that should also work for non-linear neurons. The gradient ascent method can be applied, in principle, to neurons in higher cortical areas (Földiák, 2001). Here, starting with a blank stimulus, white noise is added to the (rapidly changing) stimulus and the change in response is measured on-line. The stimulus is moved in the direction of larger responses (the duration of the experiment is 5 to 10 minutes). Simple cells show the expected bright and dark elongated regions. For complex cells repeated optimization runs result in similar power spectra, but the stimuli cannot be aligned pixel by pixel, indicating the non-linearity of complex cells. The method produces, after initial symmetry breaking, locally optimal solutions. Interestingly, these local solutions could not be fitted well by simple 2D Gabor functions. This indicates problems with the idea that complex cells are built by convergent input from simple cells.
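
The on-line search can be caricatured as noisy hill climbing on the measured response. The sketch below is only a toy version under that assumption (the receptive field profile, step size and noise level are invented) and is not the published procedure of Földiák (2001).

```python
import numpy as np

rng = np.random.default_rng(2)

size = 16
x = np.linspace(-2, 2, size)
rf = np.outer(np.exp(-x**2), np.cos(4 * x))   # hypothetical separable RF profile
rf /= np.linalg.norm(rf)

def response(stimulus):
    """Noisy rate of a rectified linear model neuron."""
    drive = np.sum(rf * stimulus)
    return max(drive, 0.0) + 0.05 * rng.standard_normal()

stim = np.zeros((size, size))       # start from a blank stimulus
step = 0.05
for _ in range(2000):
    probe = step * rng.standard_normal((size, size))   # add white noise
    candidate = stim + probe
    # keep stimulus energy fixed so the optimum is a direction, not a magnitude
    candidate *= 1.0 / max(np.linalg.norm(candidate), 1e-9)
    if response(candidate) > response(stim):
        stim = candidate            # move towards larger responses

match = np.sum(stim * rf)           # both are unit norm, so this is a cosine
print(f"cosine similarity of optimized stimulus and RF: {match:.2f}")
```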

Measuring Receptive Fields by Specific Stimuli
Other studies also indicate that relevant features of neurons early in the visual pathway cannot be described by their preference for oriented bars or gratings


Table 2.1.: Table summarizing the most prominent cell types found in primary visual cortex. Neurons are assumed to be inhibitory if they are fast spiking, smooth stellate, and have beaded dendrites. Data are taken from McGuire et al., 1991, Lund and Wu, 1997, Anderson et al., 1993, and Azouz et al., 1997

cell type/layer      +/–   connection type        extent (µm)   literature (m = macaque, c = cat)
pyramidal            +     long-range steppy      400/3000      McGuire 1991 (m)
spiny stellate       +     steppy                 150/440       Lund 1997 (m)
chandelier/2         –     local                  150/200       Anderson 1993 (m)
small basket/2       –     local                  200/800       Azouz 1997 (c)
large basket/3       –     local with extens.     370/1200      Azouz 1997 (c)
clutch/4             –     local                  270/460       Azouz 1997 (c)
smooth stellate/3    –     local with extens.                   (c)

alone. We explicitly talk here about neurons in V1 and V2; we are aware that neurons in later stages of the visual pathway are known to be highly specific in their responses to faces or objects. As found by Hegdé and Van Essen (Hegdé and Essen, 2000; Hegdé and Essen, 1999), most neurons in areas V2 and V1 show stronger responses to complex stimuli than to the optimal grating stimuli presented in their receptive field (see Figure 2.7). As complex stimuli Lie-figures were used. For a more detailed description of the generation of Lie-figures see Section 4.1.3 on page 85. Results in cat V1 found by Shevelev (1999) indicate that around 40% of all neurons studied (114/289) gave a larger response to a flashed cross, corner, or Y-like figure centered in the receptive field than to an optimal single bar. Various forms of selectivity or invariant sensitivity of neurons to shape and orientation were also observed by Versavel, Orban and Lagae (1990) and Dobbins, Zucker and Cynader (1987).

2.3.3. Anatomy of Lateral Connections in V1

A large part of the problems listed in the last section arise because of our limited knowledge about the precise couplings between neurons in the visual pathway. This is in part due to the methodological difficulties involved in tracing neurons over long distances (as, for example, from the LGN to the visual cortex). It is easier to analyse the anatomical connections within one cortical area.



Figure 2.8.: Left: Cell body density of the layers of V1 (Nissl staining). Right: Lateral spread of axonal projections within 4C. Spiny stellate cells in mid 4C have a mean step size of 250–650 µm. The step sizes are comparable to the wider projections in upper 4B.
The primary visual cortex in particular has been studied intensively. One finds various types of neurons, defined by their response characteristics and by their morphology (spiny = excitatory, smooth = inhibitory). The main groups are excitatory pyramidal cells, which together with the more locally connecting spiny stellate cells (also excitatory) constitute 80% of the overall number of cells in the cortex. Inhibitory neurons are more diverse and range from large basket cells to more local chandelier cells. Whereas the spiny stellates and the basket cells are found to be isotropic, connecting to all neurons in their vicinity, the picture gets more interesting in the case of the pyramidal neurons. Lateral connections in layers 1–3 of primary visual cortex (macaque monkey) form patch-like terminal zones and link together neurons sharing common physiological properties. The patches are 200–300 µm in diameter, separated by gaps of similar width, and run the full depth of layers 1–3 (Rockland and Lund, 1983). The overall region in which connections are formed is found to be elongated, with an average aspect ratio of 1.8 : 1 and a long axis measuring up to 3.7 mm (Yoshioka, Blasdel, Levitt and Lund, 1996). By correlating these zones with optically imaged maps it has been shown that these connections predominantly, but not exclusively, link together points of similar orientation preference, ocular dominance, and CO-rich or CO-poor compartments (Yoshioka, Blasdel, Levitt and Lund, 1992; Yoshioka et al., 1996; Malach, Amir, Harel and Grinvald, 1993). Interestingly, the functional properties in deeper layers of the cortex differ from those in the upper layers. Yousef, Bonhoeffer, Kim, Eysel, Tóth and Kisvárday (1999) quantitatively analyzed the degree of orientation selectivity of long-range intrinsic connections with respect to the different cortical layers. Using a combination of


Figure 2.9.: Spatial layout of cells (a-d) in macaque primary visual cortex V1. Spatial relationship between bar-shaped terminal fields in layers 4B-upper 4Cα (dark gray stripes) and patch-like terminal zones in overlying layers 2–3 (light gray patches). Data courtesy of Alessandra Angelucci
optical imaging and injections of both latex micro-spheres and biocytin, they analyzed connections in supra-granular, granular, and infra-granular layers of cat area 18. Layer 4 lateral networks are found to be in general much shorter (about 50%) than layer 3 networks and display a less clear patchy pattern. Moreover, long-range (> 500 µm) connections in layer 4 were distributed almost equally across orientations (iso, 35%; oblique, 34%; cross, 31%), suggesting that the long-range layer 4 circuitry has a different functional role from that of the iso-orientation biased layer 2/3 circuitry. Asi, Lund, Blasdel, Angelucci and Levitt (in press) found in macaque monkey that the lateral connections in the deeper layers 4B-upper 4Cα predominantly form bar-shaped terminal fields. These terminal zones have a mean width and length of about 230 and 1050 µm, respectively, and are separated by 250 µm-wide gaps. The overall labeled field was found to be anisotropic (on average 2.7 × 1.8 mm). By optical imaging of intrinsic signals it was demonstrated that the labeled regions cover equal areas for either eye and show a bias in orientation preference. Additional columnar tracer injections involving layers 1–4Cα reveal an alignment of upper-layer terminal patches with lower-layer terminal stripes, suggesting a coherent columnar framework despite laminar differences.

2.3.4. Non-classical Receptive Field Measurements

In the last couple of years some efforts have been made to understand the development and the distinct role of orientation selective receptive fields in


Figure 2.10.: Left: Illustration of the context dependency of recognition. A triangle is inscribed in the circle: one edge is present and seen, one edge is not present but seen, and one edge is present but cannot be seen. Right: Different configurations of center-surround stimuli. The orientation of the surround stimulus can alter the center response

processing visual information. It was found that the response characteristics of these neurons depend on the specific form and size of the stimulus (see Figure 2.5). Surprisingly, even stimuli outside the receptive field (see Figure 2.10) can alter the response characteristics of neurons. In cat, for example, the receptive field of a neuron near the area centralis is ≈2° of visual angle. The firing responses, however, can be modulated by the concomitant stimulation of a surround region up to 10° of relative eccentricity (DeAngelis, Freeman and Ohzawa, 1994; Bringuier, Chavane, Glaeser and Frégnac, 1999). A number of studies in monkeys (Kapadia, Ito, Gilbert and Westheimer, 1995; Sillito et al., 1995; Levitt and Lund, 1997) and cats (Blakemore and Tobin, 1972; Gilbert and Wiesel, 1990; Polat, Mizobe, Pettet, Kasamatsu and Norcia, 1998) have analysed this phenomenon. Whereas a surround stimulus alone is unable to evoke a response, it can considerably modulate the response of a cell to a stimulus within its classical receptive field. In other words, the response of the cell to a local feature depends on the visual context into which this feature is embedded3. Therefore, this class of phenomena is often referred to as contextual effects. Examples of the stimuli used can be seen in Figure 2.10, right. The left part of the figure shows an illusory figure connected with the phenomenon that the perception of lines depends on the local context. Hidden in the figure is an equilateral triangle. Depending on the context, the lines of the triangle are clearly visible (heavy line), visible but not present (illusory edge), or not visible but

3 Sillito et al. (1995) reported neurons which respond to specific stimulus configurations. According to this, a non-optimally oriented center stimulus can elicit a response when it is presented together with a stimulus in the surround.


Table 2.2.: Table summarizing the findings concerning facilitation and suppression depending on the orientation in the non-classical receptive field region

low center contrast:
• iso-facilitation (Toth et al., 1996, cat)
• collinear facilitation (Polat et al., 1993/94/98, cat)
• in some cases suppression at all directions (Levitt et al. 1997, macaque)
• cross-suppression at all contrasts (Polat et al. 1996, cat)

high center contrast:
• iso-suppression (Toth et al. 1996, cat; Levitt et al. 1997, macaque)
• in other directions/motion, suppression often disappears (Levitt et al. 1997, macaque)
• in some cases the orthogonal surround was most suppressive (Levitt et al. 1997, macaque)
• orthogonal flanks reduce the response (Polat et al. 1998, cat)
• at 80% contrast 2/5 showed facilitation (Polat et al. 1998, cat)

present (line hidden among many parallel lines). The response to stimuli presented within the receptive field can be facilitated or suppressed by surround modulation. For high-contrast surround stimuli that match the preferred orientation, the response of the center neuron is reduced. This effect is called iso-orientation suppression. For orthogonal orientation, the response is slightly (Levitt and Lund, 1997) or strongly (Sillito et al., 1995) facilitated. This effect is called cross-orientation facilitation. A related effect is called end-stopping. For growing stimulus size, the response of a neuron reaches its maximum when the stimulus size fits the dimensions of the classical receptive field. If the stimulus size further increases in length or width, it begins to cover part of the non-classical surround of the cell and causes a suppressive effect. If the center contrast is low, contextual effects can change their characteristics. Some studies in cats (Polat et al., 1998) and monkeys (Kapadia et al.,


1995) report that iso-orientation suppression turns into facilitation for low center contrast, which is consistent with a fill-in paradigm. Other studies (Levitt and Lund, 1997) report cross-orientation facilitation to turn into suppression. Contextual effects depend also on the geometrical properties of the stimulus in the non-classical surround. If the center stimulus is accompanied by two small flanking bars co-aligned with the central oriented stimulus, iso-orientation facilitation has been observed (Polat et al., 1998; Kapadia, Sigman and Gilbert, 1999a).

2.3.5. Texture Segmentation and Line Completion
Reducing the response in the presence of an iso-oriented annular surround stimulus might be a possible mechanism for texture-based segmentation, where a contour is defined by an abrupt change in the orientation of an elongated texture. Figure 3.12a on page 60 shows an example of a visual scene in which the boundary between two extended gratings with different orientations pops out. One possible mechanism for the amplification would be an increased response of orientation-selective neurons with receptive fields near the border. Those neurons would see different orientations within and outside their classical receptive fields and would therefore respond more strongly. In contrast, neurons far away from the border would see the same orientation within and outside their classical receptive field, and their responses would be decreased by iso-orientation suppression. But in a different stimulus paradigm, namely if the non-classical receptive field is stimulated by small flanking grating patches or bars instead of full annuli, iso-oriented surround stimuli can also facilitate the response of a neuron (Sengpiel, Sen and Blakemore, 1997; Polat et al., 1998; Kapadia et al., 1999a). This observation, which apparently contradicts the previous findings, could serve as the physiological basis of line completion, which is schematically illustrated in Figure 3.12b on page 60. Line segments which are aligned are perceptually linked together into parts of a continuous contour, and we perceive an interrupted circle. A possible anatomical substrate mediating these interactions is the orientation-specific long-range connections formed by excitatory pyramidal neurons (Rockland and Lund, 1983). Models incorporating these patchy connections show that non-classical receptive field effects can be mediated by these fibers (Pawelzik, Ernst, Wolf and Geisel, 1996; Mundel, Dimitrov and Cowan, 1996; Todorov, Siapas and Somers, 1996; Bartsch et al., 1997; Stetter, Bartsch and Obermayer, 2000b; Bartsch, Stetter and Obermayer, 2001).
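
The boundary-enhancement argument can be illustrated with a toy computation (not a model from this thesis): each unit is divisively suppressed in proportion to how iso-oriented its surround is, so units whose surround contains the other texture orientation, i.e. units at the border, are suppressed less.

```python
import numpy as np

# 1D array of orientation-selective units looking at a texture with a
# boundary in the middle: left half 0 deg, right half 90 deg.
n = 40
texture = np.where(np.arange(n) < n // 2, 0.0, 90.0)   # local orientation (deg)

def response_map(texture, surround_radius=3, k_supp=0.8):
    resp = np.zeros_like(texture)
    for i in range(len(texture)):
        lo, hi = max(0, i - surround_radius), min(len(texture), i + surround_radius + 1)
        surround = np.delete(texture[lo:hi], i - lo)    # exclude the unit itself
        # orientation difference in [0, 90] deg between center and surround
        diff = np.abs(((surround - texture[i]) + 90.0) % 180.0 - 90.0)
        iso_fraction = np.mean(diff < 30.0)             # how iso-oriented the surround is
        resp[i] = 1.0 / (1.0 + k_supp * iso_fraction)   # divisive iso-orientation suppression
    return resp

resp = response_map(texture)
print("response far from border :", round(float(resp[5]), 2))
print("response at the border   :", round(float(resp[n // 2]), 2))
```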


A more technical ansatz that models contextual effects was presented by Li (1998) and Li (1999) to explain visual segmentation and contour integration. Visual segmentation can be defined as locating the boundary between different image textures. Contour integration means grouping local contours into boundaries that may represent underlying objects. Experiments in V1 show that activity levels near simple texture boundaries are robustly higher 10–15 ms after the initial cell responses (Gallant, Essen and Nothdurft, 1995). The model uses hypercolumns that are arranged on a discrete grid. Each orientation column is modeled as a coupled pair of an excitatory and an inhibitory neuron. Stimuli were presented to the excitatory neurons only, as orientations which are translated into a Gaussian-like tuning curve for the hypercolumn. Lateral connections in the model depend on the position of the neurons, on their orientation, and on the type of the pre-synaptic neuron (excitatory or inhibitory). Iso-orientation excitation was implemented explicitly by excitatory connections only to neurons with receptive field positions aligned to the preferred orientation (within a certain angular distance). Cross-orientation suppression was also built in, by coupling excitatory neurons to inhibitory neurons with receptive field positions that are orthogonal to their preferred orientation (within a certain angular distance). The connection scheme coincides with the observed orientation-specific excitatory long-range connections. Long-range cross-orientation connections are not found anatomically, but the connection radii of inhibitory basket cells, which are large in comparison to those of excitatory pyramidal neurons, could mediate effective cross-orientation inhibition. In this respect, and by using an idealized grid of hypercolumns, the model (Li, 1998; Li, 1999) is a phenomenological model. It cannot fully explain the emergence of contextual effects from the observed biological circuitry, but it highlights the role of cross-orientation suppression and iso-orientation facilitation in contour integration and texture segmentation tasks performed at early stages of visual information processing. To understand the role of long-range connections for the formation of contextual effects in V1, we now summarize some approaches. Section 3.3 on page 69 will show that signaling cascades formed by short-range connections only are able to mediate contextual effects.
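
The wiring rule can be sketched as a toy connection function in the spirit of, but not identical to, the model of Li (1998): excitatory links are made to columns whose receptive-field positions lie roughly along the presynaptic preferred orientation and whose preferred orientations are similar, while connections onto inhibitory cells target flanking, orthogonally displaced positions. All distance and angle thresholds below are invented.

```python
import numpy as np

def connection(pos_pre, theta_pre, pos_post, theta_post,
               max_dist=4.0, angle_tol=np.deg2rad(20.0)):
    """Toy lateral-connection rule on a grid of hypercolumns.
    Returns ('E', w), ('I', w) or (None, 0.0).  Orientations in radians."""
    d = np.asarray(pos_post, float) - np.asarray(pos_pre, float)
    dist = np.hypot(d[0], d[1])
    if dist == 0.0 or dist > max_dist:
        return None, 0.0
    axis = np.arctan2(d[1], d[0])                        # direction to the target
    # smallest angle between the connection axis and the preferred orientation
    along = np.abs((axis - theta_pre + np.pi / 2) % np.pi - np.pi / 2)
    dtheta = np.abs((theta_post - theta_pre + np.pi / 2) % np.pi - np.pi / 2)
    w = np.exp(-dist / 2.0)
    if along < angle_tol and dtheta < angle_tol:
        return "E", w            # iso-orientation excitation along the contour
    if abs(along - np.pi / 2) < angle_tol:
        return "I", w            # drives inhibition at flanking positions
    return None, 0.0

# example: a horizontally tuned unit at the origin
for target in ([2, 0], [0, 2], [2, 2]):
    kind, w = connection([0, 0], 0.0, target, 0.0)
    print(target, "->", kind, round(w, 2))
```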

2.3.6. Origin of Orientation Selectivity
Orientation selective neurons in primary visual cortex show a half-width at half-height tuning of 23° ± 8° (Carandini and Ferster, 2000; cat data; see Figure 2.6, center). The origin of orientation selectivity is still under debate. The relay cells in the LGN are known to have only moderately elongated receptive fields (mean aspect ratio of 1.26 (cat) (Soodak, Shapley and Kaplan, 1987),


Figure 2.11.: Left: Wiring from thalamus to cortex in cat is very precise. The thick lines correspond to the on- and off-regions of a simple cell in primary visual cortex. The circles represent receptive field centers of geniculate relay cells (type depicted by dashed or solid lines) that fired in a highly correlated fashion with the simple cell, indicating a mono-synaptic connection between them (Reid and Alonso (1995)). Right: Mono-synaptically coupled LGN cell and simple cell with overlapping receptive fields (dotted line). Figure adapted from Alonso et al. (2001)

1.62 (ferret) (Tavazoie and Reid, 2000)). So orientation selectivity in V1 can be generated either by convergent thalamic input (feed-forward) or by intra-cortical connections (see Figure 2.11). Simple cells in V1 share many features with their relay cells, such as the sizes of simple-cell subfields (Reid and Alonso, 1995), and their responses fall naturally into the same categories as those of relay cells, including X and Y, or lagged and non-lagged, which supports a dominant input from the LGN. Alonso et al. (2001) furthermore reported a surprising precision of connectivity that goes beyond simple retinotopy to include many other response properties, such as receptive-field sign, timing, subregion strength, and size (see Figure 2.11, left). Geniculate cells provide synaptic input to simple cells only when their receptive fields spatially overlap a simple receptive field and match the sign (on or off) of the overlapping subregion (Reid and Alonso, 1995; Alonso et al., 2001). There is also evidence that cortical inactivation does not change orientation specificity significantly (Chung and Ferster, 1998). For a more complete review of the experimental support for the feed-forward model see also Ferster and Miller (2000) and Reid and Alonso (1996). Objections against a purely feed-forward origin of orientation selectivity come from Carandini and Ferster (2000). By measuring the orientation tuning of the membrane potential of cat primary visual cortical neurons they found that a substantial sharpening takes place at the spike threshold. Membrane


potential itself is only broadly tuned to orientation (38° ± 15°). Feed-forward models also fail to explain important features of orientation selectivity, for example its invariance against changes in stimulus contrast (Sclar and Freeman, 1982; Skottun, Freeman, Sclar, Ohzawa and Freeman, 1987). Furthermore, they do not assign any function to the intracortical circuitry, which is known to be much stronger (85% in the number of synaptic connections) than the feed-forward circuitry (15%). In Section 2.3.7 on page 38 we will discuss two models which address a purely intracortical origin of orientation selectivity. So there are good reasons to believe in both a feed-forward and an intracortical contribution to orientation selectivity. It was suggested that the recurrent excitatory circuits of the cortex may amplify an initial feed-forward thalamic signal, subserving dynamic modifications of the functional properties of cortical neurons (Stratford, Tarczy-Hornoch, Martin, Bannister and Jack, 1996; Carandini and Ferster, 2000). Based on this assumption, in Section 3.2 on page 41 we present the evolution of a model that incorporates weakly orientation-tuned thalamic input and strong cortical interactions.

2.3.7. Iceberg-Model
One important early model for orientation tuning, first formulated by Hubel and Wiesel (1977), explains orientation selectivity as directly derived from thalamic input in a feed-forward fashion. It has some serious flaws, in that it cannot, for instance, explain contrast-invariant orientation tuning, but we include it here because it is one of the most intuitive models. The receptive fields of simple cells consist of elongated subfields with alternating ON- and OFF-responses. The Iceberg-model proposes that orientation selectivity is generated by filtering oriented input with the receptive field profile (calculating its overlap with the profile) and feeding the result through a rectifying non-linearity. The model assumes that orientation selectivity is generated purely by feed-forward processing of the input (mediated by the feed-forward fibers from the retina over the LGN to a cortical neuron) and local processing within the neuron. In particular, no function is assigned to the intracortical circuitry or to the feedback circuitry to the LGN; these connections are neglected in the Iceberg model. Orientation selectivity is generated in the model framework as follows: Let us assume that the stimulus is an oriented structure such as a sine wave grating. If the orientation and the phase of the sine wave are aligned with the orientation and phase of the receptive field profile, the total synaptic input will be maximal. If the structures are orthogonal, the total synaptic input is zero. This reflects a linear model for the generation of orientation selectivity (see Section 2.3.2 on page 25).



Figure 2.12.: The Iceberg-model of orientation selectivity. The output of the cortical neuron is a thresholded version of its input. The output is orientation selective (receptive field below), but its tuning width increases with stimulus contrast

The row at the bottom of Figure 2.12 shows differently oriented Gabor filters. In the example shown, the overlap between the stimulus and the filter is maximal for the middle stimulus and decreases with the difference between the orientations. The resulting synaptic input for two different contrast levels of the stimulus grating is plotted in the diagram against the stimulus orientation. The response in the top diagram of Figure 2.12 is orientation selective, but the Iceberg model nevertheless has some serious drawbacks which motivate the consideration of more sophisticated models of cortical function:

• In biology, the sharpness of orientation tuning is independent of stimulus contrast (Sclar and Freeman, 1982). This independence might be important as it ensures that stimulus orientation and stimulus contrast are coded independently of each other in the cortex, which facilitates the independent readout of both quantities. The iceberg-model predicts a strongly contrast-dependent orientation tuning width (see the numerical sketch after this list).

• In biology the orientation tuning curves are sharper than predicted by the iceberg model, and the output of neurons is often more sharply tuned than their input (Volgushev, Pernberg and Eysel, 2000). If rectifying non-linearities of LGN cells are taken into account in the model instead of the linear filter model assumed here, its simulated tuning curves become even broader.

• The iceberg model does not assign any function to intracortical circuitry, which in fact is even much stronger than the feed-forward circuitry. This leaves the question of which operations are performed by lateral intracortical connections.
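
The first drawback can be demonstrated numerically. The sketch below passes a broadly tuned, contrast-scaled input through a fixed threshold, as in the Iceberg model, and measures the half-width at half-height of the output; the width grows with contrast. The tuning curve shape, threshold and contrast values are arbitrary choices for illustration.

```python
import numpy as np

theta = np.linspace(-90.0, 90.0, 181)                 # stimulus orientation (deg)

def lgn_input(theta_deg, contrast, eps=0.3):
    """Broadly tuned feed-forward drive, scaled by contrast."""
    return contrast * (1.0 - eps + eps * np.cos(2.0 * np.deg2rad(theta_deg)))

def iceberg_response(theta_deg, contrast, threshold=1.0):
    return np.maximum(lgn_input(theta_deg, contrast) - threshold, 0.0)

def half_width_at_half_height(resp, theta_deg):
    peak = resp.max()
    if peak <= 0.0:
        return 0.0
    above = theta_deg[resp >= 0.5 * peak]
    return 0.5 * (above.max() - above.min())

for c in (1.2, 1.5, 2.0):
    resp = iceberg_response(theta, c)
    print(f"contrast {c:.1f} -> tuning half-width "
          f"{half_width_at_half_height(resp, theta):5.1f} deg")
```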

In light of these observations, we face the necessity of building a more sophisticated model of cortical function which involves the dense recurrent intracortical circuitry (Somers et al., 1995; Ben-Yishai, Bar-Or and Sompolinsky, 1995; Ben-Yishai et al., 1997; Bartsch et al., 1997; Hansel and Sompolinsky, 1998). A sufficiently large cortical circuit may contain millions of densely interconnected neurons, which cannot be modeled at a single-neuron level because of the prohibitive computational expense. We face the task of formulating a model framework which allows a simplified yet valuable theoretical description of a large neuron population at a mesoscopic level. One important type of mesoscopic cortical description has been provided by mean-field models of cortical function, which will be introduced in Section 3.2 on page 41.


Models for Intra-cortical Origin of Orientation Selectivity

In the hypercolumn model by Adorján, Levitt, Lund and Obermayer (1999), all cortical cells receive thalamic input from a circularly symmetric region; hence the thalamic input to individual cortical cells does not provide an orientation bias (but see Tavazoie and Reid (2000)). Cells with receptive fields that are aligned in visual space are assumed to belong to one orientation column. They are activated together whenever there is a line at this position and orientation. In the model all these neurons are strongly connected, thus forming dense connectivity inside a model column. By this mechanism the total geniculate input for a neuron depends on the orientation of the stimulus. This results in an orientation preference that corresponds to the orientation of the alignment axis of the receptive fields in visual space. As a testable prediction, this model predicts a complete loss of orientation selectivity when recurrent excitation is blocked. Another recent approach to explain the emergence of orientation selectivity by a purely intracortical mechanism is that of Ernst, Pawelzik, Tsodyks and Sahar-Pikielny (2000). In their geometrical model the intracortical connection strength is high enough to greatly amplify small random fluctuations, which produces activity blobs following the maxima in the input. Furthermore, Ernst et al. introduce spatial inhomogeneities modeled as fluctuations of neuron positions. So some neurons along random directions are coupled more strongly than others and will therefore be more easily activated by a specifically oriented stimulus. The position of the activity blobs in the model is thus determined by both the maximum afferent input and the lateral coupling. A further model using non-isotropic lateral connections for the intra-cortical generation of orientation preference was presented by Shouval, Goldberg, Jones, Beckerman and Cooper (2000). Already at birth some orientation selectivity is present, and further development is then guided by visual experience. Findings supporting this are: (i) normal development of orientation selectivity can be prevented by rearing animals in the dark (Frégnac and Imbert, 1984; Chapman and Stryker, 1993); (ii) most experiments have found that in a visual environment with a restricted set of orientations more cells become selective to the prevalent orientations (Sengpiel, Stawinski and Bonhoeffer, 1999). Nevertheless, Gödecke and Bonhoeffer (1996) have shown that two eyes without common visual experience develop similar orientation maps. To explain these contradictory results on the plasticity of orientation selectivity, Shouval et al. (2000) assume that an anisotropic lateral connectivity already present at birth forms a scaffold that sets the orientation map, produces broadly tuned cells, and bi-

ases the development of orientation selectivity. By this an orientation map can be laid down independently of visual experience, and orientation selectivity then develops through experience-dependent modifications of the feed-forward synaptic connections. The current opinion is that an initially broadly tuned input arrives in cortex via convergent thalamo-cortical connections and is sharpened by intra-cortical connections. This results in a membrane potential that is still broadly tuned to orientation and is further sharpened by the spike generation mechanism (Azouz, Gray, Nowak and McCormick, 1997).

2.4. Discussion

Numerous models have been proposed that try to give answers to the possible origin of orientation selectivity. The success of different models using different assumptions about the origin of orientation selectivity is surprising. At least this indicates that the basic assumptions of all of these models must be essentially the same. We suppose that a basic assumption made by many models is the use of a Mexican-hat like intracortical coupling function and the lateral competition it induces. Lateral competition moves a model from a regime where the neuronal activation is a linear reflection of the input into a regime where a sparse representation of the input in the neuronal activation is enforced. The models also use a similar level of abstraction; for example, they describe the interaction of populations of neurons. The effects of the non-linear mechanisms of spike generation are left out. In the proposed models the sharp orientation tuning obtained from an initially broadly tuned or even untuned input can only be observed when one assumes relatively high intracortical couplings, i.e., high competition. Also the contrast-invariance of orientation tuning only works for a high level of competition. The drawback of the high-competition regime is that it cannot represent multiple orientations; this works only in the linear, less competitive phase. Incorporating more knowledge about the cortical circuitry, for example by using a more realistic neuron model, should help to distinguish between the models.

3. From Columns to Hypercolumns and Lattices

Modeling in the framework of mean-field descriptions has proved to be a powerful tool to analyse large networks. Here we show how to initially set up a mean-field model and link its parameters to an underlying model of populations of spiking neurons. The model is changed according to the subject under investigation. By modeling the different populations of neurons in a single column more explicitly, we arrive at a model that combines sharpening of orientation tuning curves, contrast-invariant orientation tuning, and saturating contrast response functions. To explain contextual effects we extend this model first to a full orientation hypercolumn and afterwards to a system of two coupled orientation hypercolumns. One receives input from the center stimulus and the second hypercolumn from the surround. By this arrangement we show principal difficulties in having cross-orientation modulations mediated by iso-orientation specific patchy connections. We then derive a model for analyzing the influence of local cortical connections on the activities of neurons in V1. This time a set of orientation columns is arranged according to a measured orientation map, and orientation columns are connected by local excitatory and slightly more distributed inhibitory fibers. We demonstrate that (i) sharp and contrast-invariant orientation tuning curves are combined with contrast saturation, (ii) the strength of cortical amplification can be localized in orientation space, and (iii) anisotropic contextual suppression by iso-oriented flanking stimuli arises as an emergent property and can be mediated by local connections.

3.1. Introduction

This chapter is grouped into three parts. First, we discuss some effects found in conjunction with the idea of the classical receptive field of neurons in V1. Models for a purely feed-forward or a purely intra-cortical origin of orientation selectivity are discussed. Step by step we introduce a model that combines weakly tuned feed-forward input and recurrent intra-cortical connectivity. This leads to a hypercolumn model with three different neuron populations that describes the responses of neuron populations in orientation space. By this

we explain orientation sharpening, contrast saturation, and contrast-invariant orientation tuning. Part two deals with effects observed for complex stimuli in the classical receptive field, like end-stopping or cross-stimulus suppression. We point out the inability of models with high lateral competition (high recurrent excitation) to represent different orientations simultaneously and refer to possible solutions. In the third part of this chapter we will discuss contextual effects elicited by stimuli outside the classical receptive field. In a model consisting of two coupled hypercolumns we predict wiring patterns that can explain iso-orientation facilitation and cross-orientation suppression by long-range connections. Based on the predictions of the hypercolumn model, a geometrical model is proposed which demonstrates that contextual effects can be mediated by short-range connections only.

3.2. A Mean-Field Model of Neuronal Population Activity

Modeling of cortical signal processing can be considerably simplified by taking into account the columnar structure of the cortex. Each cortical column contains hundreds to thousands of neurons, which receive approximately the same afferent and intracortical input and show similar response properties. Even if some response properties (such as direction selectivity in macaque V1) change with depth, there is still a topographic mapping of functional aspects: in most cases, nearby neurons show similar selectivities.

Principle and Basic Assumptions

Based on these observations, Ben-Yishai et al. (1995) and Bartsch et al. (1997) have formulated a mean-field description of cortical processing. Firstly, if two neurons k and l within a cortical column receive similar total synaptic inputs, hk ≈ hl, it seems a good approximation to assume that each neuron within a given population α (for example, all excitatory (α = "e") or all inhibitory (α = "i") neurons within a column) receives the same input, hk,α ≡ hα for all neurons k of the population. In other words, instead of feeding its actual input to every neuron, all neurons within one population are assumed to receive the same mean input. For this reason, this kind of model is referred to as a mean-field model1. Secondly, within each population many

1This nomenclature has been borrowed from solid state physics, where mean-field models describe the behavior of atomic magnetic moments under the mean magnetic fields of their neighbors instead of the actual fluctuating magnetic field.



Figure 3.1.: (a) Principle of binary stochastic neurons: The neuron randomly toggles back and forth between an active (right) and inactive (left) state. If the neuron is driven by excitatory input, the probability of activation increases and the inactivation probability decreases, resulting in a higher mean activity. (b) Semi-linear activation function g(h).
neurons encode similar input properties. It seems reasonable that only the overall activity of a whole neuron population is important. The fluctuations within the activities of individual neurons, which can be viewed as non-linear stochastic units, can be neglected at this level. This corresponds to the assumption that important stimulus properties are encoded in population activities (for aspects of population coding see (Paradiso, 1988; Zohary, 1992)). Note that omitting random fluctuations does not imply that we exclude non-random collective phenomena, i.e., the mean population activity can still be strongly time-dependent. For a model that explicitly includes fluctuations we refer to (Tsodyks and Sejnowski, 1995). Though neurons of one cell type within a cortical column are natural candidates for a neuron population, the framework can be applied in a more general way: arbitrary sets of neurons with similar inputs and responses can be combined into a population.

Dynamics of a Neuron Population

We are now ready to write down a simplified description of neuronal population dynamics. Let us assume that individual neurons act as binary stochastic units which can flip back and forth between an active (spiking) and an inactive (non-spiking) state. The probability per unit time for an inactive neuron to be activated is denoted by the activation rate γ and its inactivation rate by δ (Figure 3.1a). Below, we will relate γ and δ to the mean synaptic input of the population: a high excitatory input will lead to a high activation rate

42 Hauke Bartsch, 2002 3.2. A Mean-Field Model of Neuronal Population Activity and a low inactivation rate which causes a higher fraction of neurons to be active. Low input or inhibitory input in contrast, will reduce the activation rate and the neurons settle down to the inactive state. If we consider a popula- tion of N binary stochastic neurons with identical activation and inactivation rates, the number of active neurons A changes within the small time interval ∆t according to

ΔA = γ Δt (N − A) − δ Δt A = (γN − (γ + δ)A) Δt.   (3.1)

In the limit Δt → 0, this relation transforms into a rate equation for the fraction of neurons m = A/N that are active at time t:

d/dt m = γ − (γ + δ) m.   (3.2)

Now we assume for simplicity that the inactivation rate behaves inversely to the activation rate. This is reasonable because neurons which are strongly driven by input are less likely to stop firing spontaneously. If γ_max denotes the maximum possible activation rate, we arrive at δ = γ_max − γ and

d/dt m = γ − γ_max m.   (3.3)

One possible interpretation of the maximum activation rate is related to the refractory period of biological neurons. If a neuron is to undergo two subsequent activations, it must at least fire a spike and wait for the refractory period τ until it can be activated to fire the next spike. This means that we can identify the maximum activation rate with γ_max = 1/τ. The rate equation for the population activity becomes

τ d/dt m = −m + τγ =: −m + g,   (3.4)

where 0 ≤ g = τγ ≤ 1 denotes the activation probability (relative to τ) for the neuron population. For realistic regimes of operation, Equation 3.4 can be interpreted as the dynamics of a pool of spiking neurons. This view can be motivated as follows: the (absolute) refractory period τ is in the range of 1–2 ms, which means that electrically driven neurons can reach spike rates of approximately 500–1000 Hz. In contrast, the spike rates observed for visually stimulated cortical neurons range around 50 Hz: realistic activation rates are much smaller than the maximal rate, γ ≪ γ_max (g ≪ 1), and consequently the inactivation rate is very fast: δ ≈ γ_max = 1/τ. Each time a neuron becomes activated and fires a spike, it immediately becomes inactivated again at a rate close to the inverse of its

refractory period. In other words, a binary stochastic neuron which is activated fires an individual spike and automatically inactivates again: in the limit of low mean activation, our mean-field model describes a population of spiking (instead of binary) neurons. Now we have to specify how the activation probability changes with the synaptic input. If the input of a neuron population falls below a threshold T, all neurons of the population are inactive; otherwise, neurons should be activated the more frequently, the more excitatory input they receive. The simplest function which preserves this important rectifying nonlinearity present in biological neuronal systems is a semi-linear function, and we can formulate the dependence of the activation probability g on the mean synaptic input h of the population as

g(h) = max(β(h − T), 0).   (3.5)

g(h) is referred to as the activation function of the population (Figure 3.1b). Note that at this point we have made use of the mean-field assumption. In Equation 3.5, h has to be the mean synaptic input of the population. If we had used the actual synaptic input, the activation function would have had different values for each neuron in the population and the ensemble average could not be as easily written down as in Equation 3.4 on the page before.
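
Equations 3.1–3.5 can be checked with a small Monte Carlo sketch: a population of binary stochastic neurons with activation rate γ = g(h)/τ and inactivation rate δ = (1 − g(h))/τ settles to a fraction of active neurons close to g(h), the fixed point of Equation 3.4. The population size, time step and input value below are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(3)

N = 5000            # number of binary stochastic neurons in the population
tau = 0.002         # refractory time constant (s), gamma_max = 1/tau
dt = 0.0002         # simulation time step (s)
steps = 500

beta, T = 2.0, 0.2
def g(h):
    """Semi-linear activation probability, Eq. 3.5 (clipped to [0, 1])."""
    return float(np.clip(beta * (h - T), 0.0, 1.0))

h = 0.5                              # constant mean synaptic input
active = np.zeros(N, dtype=bool)     # all neurons start inactive
m_field = 0.0                        # mean-field activity, Eq. 3.4

for _ in range(steps):
    gamma = g(h) / tau               # activation rate
    delta = (1.0 - g(h)) / tau       # inactivation rate (delta = gamma_max - gamma)
    switch_on = (~active) & (rng.random(N) < gamma * dt)
    switch_off = active & (rng.random(N) < delta * dt)
    active = (active | switch_on) & ~switch_off
    m_field += dt / tau * (-m_field + g(h))

print(f"simulated fraction active : {active.mean():.3f}")
print(f"mean-field prediction g(h): {g(h):.3f}")
```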

3.2.1. Modeling Orientation Selectivity with Two Cell Types

In contrast to the considerations of the iceberg model, in (Bartsch, Stetter, Weber and Obermayer, 2000b; Stetter, Bartsch and Obermayer, 2000a) we explored how the afferent input (transformed visual signals) is processed by a recurrent cortical network. The primary visual cortex processes visual input locally: each local visual feature is processed and represented by a patch of cortex about 1–2 mm in diameter, which is called a hypercolumn. The neurons within a hypercolumn are densely connected by lateral intracortical fibers and form a strongly coupled recurrent network. By formulating a mean-field model of a cortical hypercolumn (Ben-Yishai et al., 1995; Ben-Yishai et al., 1997; Bartsch et al., 1997; Bartsch, Stetter and Obermayer, 2000a; Stetter et al., 2000a), one can efficiently describe the dynamics of hundreds of thousands of neurons within a hypercolumn at a population level.



Figure 3.2.: Simulation of orientation representation in a hypercolumn with two types of neurons. (a) Weakly tuned input hLGN(θ) to the hypercolumn for stimulation at θ0 = 0 degrees with three log contrast levels. (b) Mean activities of the excitatory neurons of the orientation columns in the stationary state for stimulation as shown in (a). The activity pattern is sharply tuned and its width is independent of contrast

Orientation Tuning and Contrast Saturation in a Hypercolumn with Two Cell Types
Figure 3.2 shows how a hypercolumn of recurrently connected cortical orientation columns represents an oriented stimulus. The presence of an oriented contour or grating within the aggregate field of the hypercolumn evokes a broadly tuned input, which is shown in Figure 3.2a for three log contrast levels. The resulting activity patterns for the three contrast levels are shown in Figure 3.2b. The activity pattern is more sharply tuned in orientation space than the input, and its tuning width is independent of contrast, as observed in biological systems. Summarizing the results from Stetter et al. (2000a), there is a phase boundary for the strength of the lateral excitation. Above this phase boundary, in the marginal phase, stimulus orientation is represented by a sharp orientation tuning curve, independently of the strength of the afferent orientation bias and independently of stimulus contrast. In the marginal phase it can be shown that the contrast response curve cannot saturate: the excitatory activation increases at least proportionally to log-contrast, and saturation cannot occur. In the linear phase the contrast response curve saturates as the result of the activation of inhibitory neurons with a high activation threshold. The model predicts that their contrast threshold coincides with the stimulus contrast at which the excitatory neurons begin to saturate. To explain the phase boundaries of the linear and marginal regimes, we determined them using a numerical simulation.

Connection patterns are assumed with local excitation balanced by flat inhibition, which may be closest to the wiring patterns in area 17. Because the linear phase is characterized by a finite slope of the contrast-response curve close to the activation threshold, c ≃ Te, we calculated the contrast gain (the slope of the contrast-response curve) at c = Te. Its divergence marks the boundary of the linear regime. The marginal phase is characterized by the generation of a narrow activation blob from initially untuned input. We stimulated a hypercolumn with weakly tuned input (ε = 0.01) and calculated the resulting orientation tuning width θc,e. A decrease of the orientation tuning width below 90 degrees marks the boundary of the marginal phase. Figure 3.3 on the next page shows the behavior of the contrast gain at the threshold (solid/circles) and the orientation tuning width (crosses) as a function of the connection strength S ≡ E0 = E2 = I0. The dotted vertical lines mark the boundaries of the linear and of the marginal phase (Stetter et al., 2000a). Analytical and numerical values for the phase boundaries agree well with each other, and the simulations demonstrate that the linear and marginal phases do not overlap. For the boundary of the linear phase, the close correspondence of analytical and numerical values is due to the fact that in the simulation the input was weakly tuned and evoked a flat activation pattern. All orientation columns have similar average activities and can therefore be approximated by a single orientation column. For the boundary of the marginal phase, the small deviation between analytical and numerical values is an artifact of the finite step size in the connection strength used for the simulation. The contrast response curves do not change if the simulations are repeated using a more strongly orientation-biased input (ε = 0.3), whereas the orientation tuning curves become continuously sharper with increasing connection strength. In either case, the separation of orientation tuning is contrast dependent outside the marginal phase. We can summarize as follows: it can be shown that a hypercolumn model with two neuron populations is not sufficient to account for the representation of stimulus contrast and stimulus orientation as observed in the primary visual cortex of many mammals. In the next section we will introduce a mean-field model which accounts for the large variety of different inhibitory neuron types observed in the cortex of cats and monkeys (Bartsch, Stetter and Obermayer, 1999a; Bartsch, Stetter and Obermayer, 1999b; Bartsch et al., 2000a).



Figure 3.3.: The behavior of the contrast gain at activation threshold (solid line) and the orientation tuning width (crosses) as a function of the connection strength S for a hypercolumn with cosine shaped excitatory connection scheme and flat inhibition. Vertical dotted lines mark the analytical results. Insets illustrate criteria used for calculation of the curves. Both analytical and numerical results predict that there is no overlapping regime of co-occurrence of contrast saturation and contrast-invariant orientation tuning



Figure 3.4.: (a) Setup of a mean-field hypercolumn with many different cell types. Recurrent couplings depend on the source and target orientations only. (b) Structure of a single orientation column. It consists of populations of Ne excitatory and Ni inhibitory cell types with, in general, different properties

3.2.2. Hypercolumns with Multiple Populations
Now we want to explore the possible influence of the diversity of neurons in the cortex on its functional characteristics. For this we extend the model of an orientation column with a more complex structure: we take into account the fact that cortical tissue contains many different cell types and combine each of these cell types into a separate population. In general, an orientation column, indexed by its preferred orientation θ, now contains Ne populations with different excitatory neuron types and Ni different populations of inhibitory neurons (Figure 3.4b). The n-th excitatory population is indexed by (e, n) and the n-th inhibitory population by (i, n). We henceforth refer to the sub-populations as model neurons or simply "neurons". The strength of the recurrent intracortical couplings is assumed to depend only on the source and target orientation columns but not on the particular target neuron n. The mean connection strength from neuron (α, n) within column θ′ to neuron (β, m) in column θ (α, β = e, i) is given by

S^{m,n}_{β,α}(θ, θ′) ≡ S_α(θ − θ′).   (3.6)

The generalization compared to the previous section consists of the fact that different neuron subpopulations can have different mean cellular properties

48 Hauke Bartsch, 2002 3.2. A Mean-Field Model of Neuronal Population Activity and wiring patterns. To start with a simple case, we keep all properties of the neurons up to their activation functions identical for the present considerations and assume that the neurons differ only in their mean activation thresholds. The activity of neuron (α, n), α = e, i in response to synaptic input h is given by a semi-linear activation function g (h) = max(β (h T ), 0), (3.7) α,n α − α,n where βα denotes its slope and Tα,n its activation threshold. The activities of neurons (e, n) and (i, n) in column θ, me,n(θ, t) and mi,n(θ, t), evolve according to d m (θ, t) = m (θ, t) + g hlat(θ, t) + hLGN(θ, t) (3.8) dt α,n − α,n α,n π/2  lat h (θ, t) = dθ0S (θ θ0)m (θ0, t) (3.9) α − β,n β=e,i n πZ/2 X X− hLGN(θ θ ) = c(1 ε + ε cos(2(θ θ ))). (3.10) − 0 − − 0 Note that hLGN and hlat are identical for all subpopulations.

Analytical Treatment of Contrast Saturation

We wish to understand how the contrast-response curve of the orientation column – or a representative subpopulation therein – depends on the distribution of activation thresholds. Again it seems reasonable to analyze an isolated but intrinsically coupled orientation column with Ne excitatory and Ni inhibitory neurons (Figure 3.4b, on the preceding page). In the stationary state, the total synaptic input, H, which is the same for all neurons in the orientation column, is given by

H = h^{LGN} + S_e Σ_{n=1}^{N_e} M_{e,n} − S_i Σ_{n=1}^{N_i} M_{i,n},   (3.11)

where, according to Equation 3.7, M_{α,n} = g_{α,n}(H) ≡ M_{α,n}(T_{α,n}, H) are the steady-state activations of the model neurons and S_α ≡ S_α((θ − θ′) = 0) abbreviates the identical intra-column connection strengths between the neurons. Now we assume that the activation thresholds T_e and T_i are distributed over the orientation column according to pdfs p_e(T_e) and p_i(T_i), respectively. In the limit of infinitely many neurons, we can replace the sums in Equation 3.11 by ensemble averages over the threshold distributions and obtain

H = h^{LGN} + S_e ∫_{−∞}^{∞} M_e(T_e, H) p_e(T_e) dT_e − S_i ∫_{−∞}^{∞} M_i(T_i, H) p_i(T_i) dT_i.   (3.12)


Because of the definition of the semi-linear transfer function Equation 3.7 on the preceding page, we know that neurons with T_α ≥ H are silent and therefore do not contribute to the sums or integrals in Equations 3.11 on the page before and 3.12 on the preceding page. Conversely, for T_α < H, the activation function can be replaced by its linear part, M_α(T_α, H) = β_α(H − T_α). Therefore we can replace the upper limits of the integrals in Equation 3.12 on the page before by H:

H = h^{LGN} + β_e S_e ∫_{−∞}^{H} (H − T_e) p_e(T_e) dT_e − β_i S_i ∫_{−∞}^{H} (H − T_i) p_i(T_i) dT_i.   (3.13)

Equation 3.13 represents a self-consistent relation between the total synaptic input H and the afferent input h^{LGN}. By solving this equation we can write down an analytical solution for the stationary activation

M_e(T_e, H) ≡ M_e(T_e, H(h^{LGN})) = M_e(T_e, h^{LGN})   (3.14)

as a function of the external instead of the total synaptic input, which is the contrast-response function of the neurons. Carrying out the integrals in Equation 3.13 yields

H = h^{LGN} + S_e β_e G_e(H) − S_i β_i G_i(H)   (3.15)

G_α(H) = ∫_{−∞}^{H} dH′ ∫_{−∞}^{H′} dT p_α(T),   d²G_α(T)/dT² = p_α(T),   α = e, i.   (3.16)

By defining the function

F(H) = H − β_e S_e G_e(H) + β_i S_i G_i(H)   (3.17)

Equation 3.15 reduces to F(H) = h^{LGN} and we can express the steady-state activations M_α by

M_α(T_α, h^{LGN}) = β_α (H − T_α) = β_α ( F^{−1}(h^{LGN}) − T_α )   (3.18)

Equation 3.18 provides an analytical relationship between geniculate input and the response of the recurrent cortical circuit. Note that it only holds for one isolated orientation column and if F is invertible. The latter condition corresponds to the boundary condition for the linear phase. Figure 3.5 on the facing page illustrates the meaning of Equation 3.18 for the special case of only one excitatory and one inhibitory neuron type and only


Figure 3.5.: Analytical solution Equation 3.18 on the facing page for one iso- lated orientation column and δ-peaked threshold distributions. Top: The distributions and the resulting second integrals Gα(H). Bottom: The function F (H) Equation 3.17 on the preceding page (thin line) and its inverse (thick line) as resulting from the sce- nario in the top part. The thick line relative to the small coordi- nate system schematically illustrates the behavior of the contrast response function. Figure from (Bartsch et al., 2000a)


two threshold values Te and Ti. In this case the two threshold distributions reduce to Dirac delta functions around the two thresholds, p_e(T) = δ(T − T_e) and p_i(T) = δ(T − T_i), and their second integrals become semi-linear functions, G_α(H) = max(H − T_α, 0) (Figure 3.5 top, on the preceding page). The function F(H) (Equation 3.17 on page 50) becomes

F(H) = H                                               for H ≤ T_e
F(H) = H − β_e S_e (H − T_e)                           for T_e < H ≤ T_i   (3.19)
F(H) = H − β_e S_e (H − T_e) + β_i S_i (H − T_i)       for H > T_i

In this scenario, the resulting contrast-response function Equation 3.18 on page 50 shows a typical saturating behavior. The gradients of F^{−1}(h^{LGN}) are a = (1 − β_e S_e)^{−1}, where a β_e is the initial contrast gain of the contrast-response function, and b = (1 − β_e S_e + β_i S_i)^{−1} for higher contrast levels.

Numerical Simulations of Contrast Responses

For the following simulations we assumed threshold distributions which for excitatory neurons are Gaussian, p_e(T_e) = N(µ_e, δ_e), and for inhibitory neurons are either Gaussian, p_i(T_i) = N(µ_i, δ_i), or bimodal according to two superimposed Gaussian functions, p_i = 0.5 (N(µ_{i,1}, δ_{i,1}) + N(µ_{i,2}, δ_{i,2})). Inhibitory mean activation thresholds are set to be higher (µ_i = 2) than excitatory mean activation thresholds (µ_e = 1). Also, simulations will use β_e = 0.5 and β_i = 1, but the special choice of parameters does not strongly influence the results. Figure 3.6 on the next page compares the numerical solution of the differential equation 3.8 on page 49 (solid line) with the analytical expression in Equation 3.18 on page 50 (circles) for two unimodal and fairly narrow threshold distributions (histograms) in the linear phase. It demonstrates that the analytical solution approximates the solution of the differential equation very well. The dashed and dash-dotted lines plot G_e and G_i for the distributions used. The behavior of this system can be understood as follows: First, only excitatory neurons are active and, because we operate in the linear phase, act as linear amplifiers. For higher contrast levels, more and more inhibitors become active and reduce the contrast gain. Different from the case of only two thresholds, the contrast-response curve gradually changes its gain over contrast. This gradual contrast saturation can be qualitatively understood as follows: With increasing afferent input h^{LGN}, more and more inhibitory neuron subpopulations are recruited (become active); the increase in their number is proportional to p_i(F^{−1}(h^{LGN})). The more neurons are recruited, the stronger the decrease in contrast gain. In other words, we expect a relationship between the second derivative of the contrast-response function at h^{LGN} and the density of neurons with activation thresholds T_i = F^{−1}(h^{LGN}).
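For this simple two-threshold case the analytical contrast-response curve can be evaluated directly by inverting F numerically. The sketch below is our own illustration of Equations 3.17 to 3.19 with assumed parameter values; it is not the simulation code behind the figures.

    import numpy as np

    beta_e, beta_i, S_e, S_i = 0.5, 1.0, 1.0, 1.0
    T_e, T_i = 1.0, 2.0

    def F(H):
        # piecewise-linear F(H) of Eq. 3.19
        return (H
                - beta_e * S_e * np.maximum(H - T_e, 0.0)
                + beta_i * S_i * np.maximum(H - T_i, 0.0))

    H_grid = np.linspace(0.0, 20.0, 4001)
    F_grid = F(H_grid)                               # monotone in the linear phase

    def M_e(h_lgn):
        # Eq. 3.18 via numerical inversion of F
        H = np.interp(h_lgn, F_grid, H_grid)
        return beta_e * np.maximum(H - T_e, 0.0)

    # initial gain a*beta_e and saturated gain b*beta_e as given in the text
    a = 1.0 / (1.0 - beta_e * S_e)
    b = 1.0 / (1.0 - beta_e * S_e + beta_i * S_i)
    print(M_e(np.array([1.5, 5.0])), a * beta_e, b * beta_e)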


Figure 3.6.: Simulation of a contrast-response curve for a set of 400 coupled model neurons (200 exc., 200 inh.) in the linear phase (S_e = 1, S_i = 1) and for unimodal Gaussian threshold distributions p_e(T_e) = N(1, 0.1), p_i(T_i) = N(2, 0.1) (cf. threshold histograms in the diagram). Solid line: Numerical solution of the differential equation 3.8 on page 49. Circles: Evaluation of the analytical expression 3.18 on page 50. Both curves agree very well. Dashed and dash-dotted lines show G_e and G_i, respectively. Figure from (Bartsch et al., 2000a)


Figure 3.7.: Simulation for a set of 400 coupled model neurons (200 exc., 200 inh.) in the marginal phase (S_e = S_i = 6). Solid lines: Contrast-response curves; dark gray: histograms of excitatory thresholds p_e(T_e) = N(1, 0.1); light gray: histograms of inhibitory thresholds p_i(T_i). Top: p_i(T_i) unimodal, small variance (N(2, 0.1)); Middle: p_i(T_i) = N(2, 1), unimodal, large variance. Bottom: A bimodal distribution of p_i(T_i) is used (µ_{i,1} = 1, µ_{i,2} = 2, δ_{i,1} = δ_{i,2} = 0.1). A bimodal distribution p_i is necessary and sufficient for a graded contrast response and contrast saturation also in the marginal phase. Figure from (Bartsch et al., 2000a)


This relationship can be quantified by forming the 2nd derivative of the steady state activation Equation 3.18 on page 50 with respect to the LGN input. We arrive at the following relationship between the curvature of the contrast-response function and the distributions of activation thresholds pα(Tα):

d²M_e / d(h^{LGN})² = β_e ( S_e β_e p_e(H) − S_i β_i p_i(H) ) / ( 1 − S_e β_e G_e′(H) + S_i β_i G_i′(H) )³,   (3.20)

H = F^{−1}(h^{LGN})   (3.21)

The denominator of Equation 3.20 is positive in the linear phase, because the gain of F has to be finite (invertibility of F). The contrast-response curve shows a negative curvature, i.e. saturation, if more inhibitory than excitatory neurons are recruited by a small increase in the input, i.e. if S_e β_e p_e(H) < S_i β_i p_i(H) holds. Otherwise, the contrast-response function increases its gain. Besides a quantitative understanding of the structural origin of contrast gain in the linear phase, it might be even more important to see whether a gradually increasing and finally saturating contrast response can also be stabilized in the marginal phase by some threshold distribution. If this could be achieved, we would succeed in formulating necessary conditions for cortical circuitry to show constant orientation tuning and contrast saturation for a single parameter setting. Figure 3.7 on the facing page shows the contrast-response curve of an excitatory neuron with threshold T_e = 1 for different cases of the inhibitory threshold distribution in the marginal phase. If the threshold distribution is narrow and unimodal (top), the contrast response shows a pseudo-binary switch-on behavior, as observed in the marginal phase with two neuron types (cf. Section 3.2.1 on page 45). This behavior remains stable as long as the distribution is unimodal, even if it is very wide (Figure 3.7 middle, on the facing page). As soon as the threshold distribution becomes bimodal (Figure 3.7 on the preceding page, bottom), the contrast response first increases from zero and later saturates, as observed in biology. This demonstrates that two inhibitory neuron populations, one with a low and the other with a higher activation threshold, are necessary and sufficient to stabilize contrast saturation in the marginal phase.
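The saturation criterion can be checked directly for given threshold densities. The following sketch (our own, with illustrative parameters) evaluates the second integral G_α in closed form for Gaussian thresholds and the sign condition derived from Equation 3.20:

    import numpy as np
    from scipy.stats import norm

    beta_e, beta_i, S_e, S_i = 0.5, 1.0, 1.0, 1.0
    pe = norm(loc=1.0, scale=0.1)                    # p_e(T_e) = N(1, 0.1)
    pi_ = norm(loc=2.0, scale=0.1)                   # p_i(T_i) = N(2, 0.1)

    def G(dist, H):
        # second integral of the threshold density, so that G''(H) = p(H) (Eq. 3.16)
        z = (H - dist.mean()) / dist.std()
        return (H - dist.mean()) * norm.cdf(z) + dist.std() * norm.pdf(z)

    H = np.linspace(0.0, 4.0, 401)
    curv_sign = np.sign(S_e * beta_e * pe.pdf(H) - S_i * beta_i * pi_.pdf(H))
    # curv_sign < 0 marks input levels at which the contrast-response curve saturates
    print(H[curv_sign < 0][:3], G(pi_, 2.5))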

Orientation and Contrast Response with Three Neuron Types

One can easily combine many structured orientation columns into a full hypercolumn. The orientation columns are mutually coupled by lateral connections with π-periodic Gaussian profiles (Equation 3.25 on page 63), and are driven by weakly orientation-biased input.


Figure 3.8.: Top: Orientation tuning curve and Bottom: contrast-response curve of an excitatory neuron with preferred orientation 0° for a hypercolumn with 21 orientation columns (50 excitatory neurons and 100 inhibitory neurons each) in the marginal phase. The system shows a graded and saturating contrast response, which is combined with a contrast-invariant orientation tuning width. Parameters: S_e = 6, S_i = −6, σ_e = 34 deg, σ_i = ∞, δ_e = δ_{i,1} = δ_{i,2} = 0.1, µ_e = µ_{i,1} = 1, µ_{i,2} = 2. Figure from (Bartsch et al., 2000a)


Figure 3.9.: The behavior of the contrast gain at activation threshold (solid line) and the orientation tuning width (crosses) as a function of the connection strength S ≡ E0 = E2 = I0 (I2 = 0) for a hypercolumn with one excitatory and two inhibitory neuron populations. Other parameters were: β_e = 0.5, β_i = 1, T_e = 1, T_{i1} = 1, T_{i2} = 2, ε = 0.01, contrast for the orientation tuning width: c = 2.0. Insets illustrate criteria used for calculation of the curves. There is a wide range (4 ≤ S ≤ 180) over which the linear and marginal phases coincide. Steps in the solid line are finite-size effects


Figure 3.10.: (a) Orientation tuning curves (top) and the contrast-response function of the zero-deg. orientation column (bottom). (b) Schematic illustration of the corresponding wiring scheme: low-threshold lateral inhibitors (e.g., basket neurons) and high-threshold local inhibitors (e.g., chandelier cells). Parameters: S_e = S_{i1} = S_{i2} = 50 (marginal phase); (b) σ_e = 34 deg; (σ_{i1}, σ_{i2}) = (∞, 34) deg. T_e = T_{i1} = 1; T_{i2} = 1.5. The hypercolumn operates properly, as in Figure 3.8 on page 56

Figure 3.8 on page 56 shows orientation tuning curves of the m_{e,1}(θ) excitatory populations of 21 orientation columns (top) and the contrast-response curves of a subset of 5 excitatory subpopulations of the θ = 0° column (bottom) for unimodal p_e(T_e) and bimodal p_i(T_i). Even though the system operates in the marginal phase, where orientation tuning is independent of contrast, the contrast-response curve shows pronounced saturation at the same time. This behavior is independent of the detailed shape of the threshold distributions, as long as the inhibitory distribution is bimodal. A phase diagram determined by the initial contrast gain and the orientation sharpening (cf. Figure 3.3 on page 47) for a hypercolumn with one excitatory and two inhibitory (low- and high-threshold) neuron types is plotted in Figure 3.9 on the page before. Due to low-threshold inhibition, the linear phase with finite initial contrast gain is stabilized up to very strong recurrent excitation strengths (S ≈ 180, compared to S = 2 for two-neuron hypercolumns), and there is a wide range of coupling strengths in which orientation tuning is invariant and the contrast response saturates. In summary, this finding predicts that the experimentally observed cortical response properties require essentially two functionally distinct inhibitory neuron types to be present: Inhibitors


Figure 3.11.: (a) Orientation tuning curves (top) and the contrast-response function of the zero-deg. orientation column (bottom) for reverse properties of the inhibitory neurons: (b) low-threshold local inhibitors and high-threshold lateral inhibitors. Parameters: S_e = S_{i1} = S_{i2} = 50 (marginal phase); (b) σ_e = 34 deg; (σ_{i1}, σ_{i2}) = (34, ∞) deg. T_e = T_{i1} = 1; T_{i2} = 1.5

with a low activation threshold (or tonically active inhibitors) stabilize the contrast gain near the contrast threshold to finite values, whereas inhibitors with high activation thresholds cause the saturation of the contrast-response curves at higher contrast levels. In Figure 3.8 on page 56, both types of inhibitors were assumed to distribute lateral inhibition between different orientations. Possible candidates for such inhibitors are basket cells with axonal arborizations of up to 1200 µm (Lund, 1987a), but it seems more reasonable to identify the two functionally different inhibitors with two anatomically distinguishable biological neuron types. Many inhibitory cell types apart from basket cells act locally, contacting only postsynaptic neurons within the same or closely adjacent orientation columns. One important local inhibitor is the chandelier cell. Therefore we may ask under which conditions a hypercolumn with pyramidal neurons as the excitatory population, basket cells as lateral inhibitors and chandelier cells as local inhibitors still shows the behavior seen in Figure 3.8 on page 56. Orientation tuning and contrast response for a hypercolumn with three neuron types are provided in Figure 3.10 on the facing page and in Figure 3.11, for two different combinations of wiring profiles and activation thresholds of the inhibitory neurons. If the low-threshold cells mediate lateral inhibition and the high-threshold neurons local inhibition (Figure 3.10b on the facing page),


Figure 3.12.: (a) Example image for the demonstration of texture-based segmentation (a contour is defined by texture boundaries). (b) Example image for the demonstration of line-completion (aligned line segments are perceptually grouped as an interrupted diamond). Figure from (Bartsch et al., 2001)

orientation tuning is sharp and constant with a saturating contrast-response function (Figure 3.10a on page 58). If the properties are reversed (low-threshold chandelier cells and high-threshold basket cells, Figure 3.11b on the preceding page), orientation sharpening is weak, unstable and contrast-dependent (Figure 3.11a on the page before). These simulations emphasize the important role of inhibition for the observed cortical representation of orientation and contrast of a stimulus (cf. Eysel, Shevelev, Lazareva and Sharaev (1998)), but additionally provide the following prediction: A hypercolumn needs two different inhibitors for the generation of the experimentally observed contrast and orientation representation. At least one of the cell types must mediate lateral inhibition (e.g., basket cells), and this cell type must have a low activation threshold. If local inhibitors (e.g., chandelier cells) contribute to the recurrent circuit as modeled, they should have a high activation threshold.


Figure 3.13.: Mean-field model of two coupled hypercolumns a = 1, 2; each orientation column θ contains one excitatory (’e’) and two inhibitory (’i1’, ’i2’) neuron populations. Both hypercolumns receive weakly orientation biased geniculocortical inputs h^{a,LGN}, a = 1, 2, from adjacent but nonoverlapping patches of the visual scene, which correspond to the center and the nonclassical surround of hypercolumn 1. Orientation columns within each hypercolumn are densely interconnected by short-range connections S_{α,β}(θ − θ′), where α denotes the type of the target population and β the type of the source population (α, β = ’e’, ’i’). In addition, both hypercolumns are mutually interconnected by symmetrical and excitatory long-range connections L_{α,β}(θ − θ′)


3.2.3. Hypercolumn Model Setup for Contextual Effects

Contextual effects are assumed to be mediated by laterally spreading signals or signaling cascades. These signals have to travel distances spanning more than one hypercolumn to connect neurons with non-overlapping receptive fields. One approach towards a model of contextual effects therefore consists naturally of two neighboring and coupled mean-field hypercolumns a = 1, 2 within the primary visual cortex. A different approach, incorporating a lattice of model neurons, is presented in Section 3.3 on page 69. Hypercolumn 1 is considered to process the visual input within the considered receptive field and is referred to as the “center” hypercolumn. The aggregate field of hypercolumn 2 is assumed to be adjacent to but disjoint from the considered receptive field. It processes the nonclassical receptive field of the “center” hypercolumn and modulates it via their mutual couplings. Figure 3.13 on the page before schematically illustrates the model setup. Again, each hypercolumn consists of a set of orientation columns, indexed by their preferred orientations θ, and each column consists of an excitatory (e) and two inhibitory neuron populations (i1, i2). The activity of neuron (α), α = e, i1, i2, in response to synaptic input h is given by a semi-linear activation function g_α(h) = max(β_α(h − T_α), 0), where β_α denotes its slope and T_α its activation threshold. Similarly to Equation 3.8 on page 49, the dynamics of a neuron population α in hypercolumn a and column θ, m^a_e(θ, t), m^a_{i1}(θ, t) and m^a_{i2}(θ, t), are described by the following set of differential equations:

d/dt m^a_α(θ, t) = −m^a_α(θ, t) + g_α( h^{a,lat}_α(θ, t) + h^{a,LGN}(θ, t) )   (3.22)

h^{a,lat}_α(θ, t) = Σ_{β=e,i1,i2} ∫_{−π/2}^{π/2} dθ′ [ S_{α,β}(θ − θ′) m^a_β(θ′, t) + L_{α,β}(θ − θ′) m^{b≠a}_β(θ′, t) ]   (3.23)

h^{a,LGN}(θ) = c (1 − ε + ε cos(2(θ − θ^a))),   (3.24)

where θ^a is the stimulus orientation presented to the center hypercolumn (a = 1) or to the surround hypercolumn (a = 2). Intracortical couplings are symmetric between both hypercolumns and all long-range connections are excitatory, i.e. L_{α,β} = L_{α,e} =: L_α. They depend only on the difference in preferred


Figure 3.14.: (a) Modulation of the center response by an oriented stimulus in the nonclassical surround. Compared to stimulation of the center hypercolumn alone (dashed line), the surround stimulus causes iso-orientation suppression (circles), but has only a weak impact in the cross-orientation stimulus condition. The surround stimulus alone cannot drive (solid line) but only modulate the center hypercolumn. (b) Connectivity needed for the behavior in (a). Long-range connections must predominantly drive inhibitory interneurons for iso-orientation suppression. Parameters were L_{e;i1;i2} = 0.5, 0.5, 3, λ_{αβ} ≡ λ_β = 34 deg, β = e, i1, i2

orientations and are assumed to be Gaussian functions in orientation space:

S_{α,β}(∆θ) = sign_β S_{αβ} N_σ exp( − Φ²(∆θ) / (2 σ²_{αβ}) )   (3.25)

L_α(∆θ) = L_α N_λ exp( − Φ²(∆θ) / (2 λ²_α) ),   (3.26)

where L_α ≥ 0 is the integral strength of the long-range connections to population α and N_λ is a normalization constant. Alternatively, one could use the von Mises distribution, which is the circular equivalent of the Gaussian function.
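For a discretized ring of orientation columns, the profiles of Equations 3.25 and 3.26 can be written as coupling matrices. The sketch below is our own illustration: Φ is assumed to be the π-periodic angular distance, a row-wise normalization stands in for N_σ and N_λ, and the numerical values are examples only.

    import numpy as np

    n_col = 21
    theta = np.linspace(-90.0, 90.0, n_col, endpoint=False)   # preferred orientations (deg)

    def phi(dtheta):
        # pi-periodic angular distance in degrees (assumed form of Phi)
        return np.minimum(np.abs(dtheta), 180.0 - np.abs(dtheta))

    def gauss_profile(width):
        # normalized Gaussian profile over all pairs of columns
        d = phi(theta[:, None] - theta[None, :])
        w = np.exp(-d**2 / (2.0 * width**2)) if np.isfinite(width) else np.ones_like(d)
        return w / w.sum(axis=1, keepdims=True)

    # short-range couplings S_{alpha,beta}: sign_beta = +1 for excitatory,
    # -1 for inhibitory source populations (Eq. 3.25)
    S_from_e = +50.0 * gauss_profile(34.0)
    S_from_i1 = -50.0 * gauss_profile(np.inf)        # flat lateral inhibition
    # long-range couplings L_alpha (Eq. 3.26), excitatory only
    L_to_e = 0.5 * gauss_profile(34.0)
    print(S_from_e.shape, L_to_e[0, :3])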


3.2.4. Numerical Simulations

Based on the coupled hypercolumn model we can now explore if and how long-range connections can modulate local cortical processing. For the following simulations we used a strong local recurrent connectivity with identical strengths S_α = 50 and widths σ_{αβ} ≡ σ_β, σ_e = σ_{i2} = 34 deg, σ_{i1} = ∞ (cf. Figure 3.10 on page 58). Afferent input had an intermediate orientation bias ε = 0.3 and c = 2.5, and the parameters for the activation functions were chosen as β_e = 0.5, β_{i1} = β_{i2} = 1, T_e = T_{i1} = 1 and T_{i2} = 1.5. The results reported do not qualitatively depend on these choices as long as the system operates in the overlap region of the linear and marginal regimes (central part in Figure 3.9 on page 57). Figure 3.14a, on the page before, demonstrates how a stimulus presented in the non-classical surround of hypercolumn 1 (the center hypercolumn) can modulate its response to a stimulus within the receptive field. Compared to the dashed line, which marks its response to center stimulation alone, the activity of the center column is reduced (circles) if an oriented stimulus is presented to the non-classical surround (see icons above plot). If center and surround orientations are identical or similar, the suppression is strongest, i.e., this system shows iso-orientation suppression. In contrast, if both stimuli are orthogonal to each other, only a weak suppressive effect is observed. In particular, over all orientation differences the sign of the modulatory effect is the same. The solid line in Figure 3.14a, on the preceding page, shows the response of the center hypercolumn to surround stimulation alone and demonstrates that the surround stimulus cannot activate but only modulate the neurons in the center hypercolumn. Modulatory suppression as shown in Figure 3.14a, on the page before, requires a particular connection scheme for the long-range connections, which is summarized in Figure 3.14b on the preceding page: (i) Long-range connections should predominantly connect columns with similar preferred orientations, which is supported by experiments (Malach et al., 1993; Bosking, Zhang, Schofield and Fitzpatrick, 1997). (ii) The fibers must drive at least one inhibitory neuron type more strongly than the excitatory populations, which has been suggested by recent experiments (Das and Gilbert, 1999). For high contrast levels, where all neuron populations are active, the effect does not depend on which inhibitory neuron type (i1 or i2) is driven strongest. Figure 3.15 on the next page shows a parameter regime in which the surround stimulus facilitates the center response (circles vs. dashed line), but cannot drive the neurons of the center hypercolumn alone (solid line). Again, cross-orientation modulation is weak and has the same sign as the center modulation. Facilitation is observed if the long-range connections drive excitatory


Figure 3.15.: (a) Iso-orientation facilitation for the same circuit as in Figure 3.14 on page 63, but this time the long-range connections drive excitatory target neurons more strongly than the inhibitors (b). Parameters: L_{e;i1;i2} = 1, 0.5, 0.5, λ_β = 34 deg, β = e, i1, i2

target neurons more strongly than inhibitory ones (Figure 3.15b, on this page). The angular profile of the nonclassical modulation is mostly determined by the orientation specificity of the long-range couplings L_{α,β}(θ − θ′). This is demonstrated in Figure 3.16a on the next page, which compares the suppressive modulation caused by strongly orientation-specific long-range connections (circles) with the suppression for more broadly tuned long-range connectivity (triangles). The orientation specificity of long-range connections is strongly correlated with the orientation tuning of the suppression. In contrast, the profile of non-classical modulation depends only weakly on the local connectivity within the hypercolumns, as long as they operate in the marginal phase (data not shown). In order to understand why the modulatory effects are determined predominantly by the properties of the long-range connections, we have to realize that the activity pattern within each hypercolumn is determined by the local recurrent circuitry. Because the hypercolumns operate in the marginal regime, the local circuit forms a sharply tuned activity patch around the orientation column which matches the stimulus orientation (Figure 3.17 on page 67). The shape of the patch is relatively rigid and can only be weakly influenced by external (afferent or lateral) input. In particular, any synaptic input that is mediated by long-range connections can only modulate the activity level of


Figure 3.16.: (a) Dependence of the suppression profile on the orientation specificity of the long-range connections. Circles: Strongly orientation-specific long-range couplings (λ_β = 6 deg) cause a narrowly tuned iso-orientation suppression. Triangles show the same curve as in Figure 3.14 on page 63 (λ_β = 34 deg) for comparison. Again, the dashed and solid lines mark the response to center-alone and surround-alone stimulation. (b) Schematic illustration of the long-range connectivity used. Parameters: L_{e;i1;i2} = 0.5, 0.5, 3, λ_β = 6 deg, β = e, i1, i2


Figure 3.17.: Cross-orientation modulation in the marginal phase. Iso-orientation specific long-range connections cannot evoke any cross-orientation modulation, because all basket cells that signal across orientations are silenced by the local recurrent circuitry (shaded orientation column)

active neurons, but cannot activate silent neurons. This behavior is schematically illustrated in Figure 3.17 for cross-orientation stimulation. The shaded orientation column is driven by long-range connections, but cannot become active because its state is determined by the local recurrent dynamics. Consequently, we can only expect a non-classical modulation to occur if there are long-range fibers which connect active source neurons with active target neurons. In other words, the range of a surround modulation in orientation space is approximately given by the width λ of the long-range connection profile plus the width of the cortical activity pattern. Because the activity patterns have approximately constant shape, the angular profile of the surround modulation is determined by the angular profile of the long-range connections in orientation space. Figure 3.17 also helps to understand why cross-orientation modulation is hard to achieve with iso-orientation specific patchy connections. It sketches the situation of cross-oriented stimuli and purely iso-orientation specific long-range fibers. The activity patterns of source and target neurons are disjoint in orientation space, and therefore no cross-orientation effect can be observed. In particular, orientation-contrast sensitivity (iso-orientation suppression combined with cross-orientation facilitation) in the marginal phase cannot be caused by dis-inhibition as suggested earlier (Pawelzik et al., 1996). Figure 3.18a, on the next page, shows a simulation in which iso-orientation suppression is combined with a weak cross-orientation facilitation (sensitivity


Figure 3.18.: (a) Iso-orientation suppression combined with a weak cross-orientation facilitation appears as soon as long-range connections to excitatory target neurons are more broadly tuned than connections to inhibitory target neurons (b). Parameters: L_{e;i1;i2} = 0.5, 0.5, 3, λ_{i1} = λ_{i2} = 17 deg, λ_e = 34 deg.

to orientation contrast). This behavior is caused by long-range connections which are more broadly tuned for excitatory target neurons than for inhibitory target neurons (Figure 3.18b on this page). As a consequence, long-range modulation via inhibitory interneurons dominates at small orientation differences between source and target orientation column, whereas for larger differences in orientation direct excitation dominates. In other words, the profile of the long-range connections implements an inverted Mexican hat in orientation space, which directly translates into orientation-contrast sensitivity.
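The sign flip of the net long-range drive can be made explicit with a small calculation. The sketch below is our own simplified illustration: it only compares the feed-forward long-range profiles and ignores the recurrent dynamics, and the lumped inhibitory strength is an assumption based on the parameters of Figure 3.18.

    import numpy as np

    def pi_periodic_gauss(dtheta_deg, lam_deg):
        # pi-periodic Gaussian profile over orientation differences (degrees)
        phi = np.minimum(np.abs(dtheta_deg), 180.0 - np.abs(dtheta_deg))
        return np.exp(-phi**2 / (2.0 * lam_deg**2))

    dtheta = np.linspace(-90.0, 90.0, 181)
    L_e, L_i = 0.5, 3.0                              # drive onto excitatory vs. inhibitory targets
    lam_e, lam_i = 34.0, 17.0                        # broader tuning for excitatory targets
    net = L_e * pi_periodic_gauss(dtheta, lam_e) - L_i * pi_periodic_gauss(dtheta, lam_i)

    # net < 0 around dtheta = 0: iso-orientation suppression;
    # net > 0 for large |dtheta|: weak cross-orientation facilitation
    print(net[len(dtheta) // 2], net[0])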


3.3. A Lattice Model for Contextual Effects

Presenting an iso-oriented annular surround stimulus outside the classical receptive field of a neuron, in addition to a centered stimulus, suppresses its response (Levitt and Lund, 1997). This can be seen as a possible mechanism for texture-based segmentation, where a contour is defined by an orientation contrast. In a different stimulus paradigm, however, iso-oriented surround stimuli can also facilitate the response of a neuron (Polat et al., 1998; Kapadia, Sigman and Gilbert, 1999b). Long-range connections formed by excitatory pyramidal neurons were proposed to mediate these effects. Nevertheless, Das and Gilbert (1999) showed that contextual effects might also be mediated by short-range connections. In Section 3.2.2 on page 48 we have shown that, in addition to excitatory neurons, two groups of inhibitory neurons per orientation column (with low and high activation thresholds, respectively) are necessary and sufficient to generate both contrast saturation and contrast-invariant orientation tuning. We have also demonstrated how long-range connectivity determines the characteristics of contextual effects. However, the model, which acted solely in orientation space, neglected the geometrical arrangement of orientation columns and consequently could not model poly-synaptic lateral signaling cascades.

Figure 3.19.: Left: Intra-cortical relation between two neurons with non-overlapping receptive fields. Because of the large extent of the intra-cortical connections, a neuron with a cortical position in the shaded area (right) can mediate information between the two neurons

In this section we set up a mean–field model of V1, which takes into ac- count this geometric arrangement and systematically characterizes its conse- quences for local cortical processing. We take into account only local con-

nections, which are all unspecific: (i) excitatory neurons connecting to neurons up to 400 µm away, (ii) inhibitory neurons connecting up to 750 µm and (iii) inhibitory neurons connecting only to neurons up to 250 µm away (Lund, 1987b; Fitzpatrick, Lund and Blasdel, 1985). However, we omit long-range excitatory connections (up to 3000 µm) because at this point we are interested in poly-synaptic signaling cascades by short-range connections only.

3.3.1. Model description

We model a cortical layer by a two-dimensional sheet of coupled orientation selective neuron populations (Bartsch et al., 2001). We describe their dynamics within a mean-field framework. A measured orientation map


Figure 3.20.: Schematic sketch of the model setup (left). (A) In a sheet of model neurons there are three types of neuron populations at each cortical position (one excitatory (e) and two inhibitory (i1, i2)). Neuron populations are interconnected following the lateral connectivity given in Equation 3.32 on page 72 (arrows). (B) Receptive fields of neuron populations are modeled as Gabor filters, resembling the basic properties of simple-cell receptive fields in V1. (C) The input consists of a sinusoidal grating with orientation θ0. Right: color-coded preferred orientation of all neuron populations. Figures from (Bartsch et al., 2001)

(macaque monkey, Blasdel, Obermayer and Kiorpes (1995)), which represents


4 mm × 3 mm of the cortex, was divided into a grid of 50 × 37 orientation columns. Their positions are denoted by vectors x_j, which also represent their retinal positions in the visual field. Each position hosts an excitatory (e) and two inhibitory (i1, i2) neuron populations. The preferred orientation θ of each population depends on its cortical position and is read out from the orientation map θ(x). Activations of model neurons are described by the mean firing rate m_α(x_j, t) of population α at position x_j:

d/dt m_α(x_j, t) = −m_α(x_j, t) + g_α( h^{LGN}(x_j, t) + h^{lat}_α(x_j, t) ),   (3.27)

α = e, i1, i2;  j = 1, . . . , N.

h^{LGN} and h^{lat}_α are the mean geniculate and intracortical synaptic inputs, respectively, and g_α(h) = max(0, β_α(h − T_α)) represents a thresholded nonlinear activation function. Parameters of the activation transfer function were chosen according to (Bartsch, Stetter and Obermayer, 1999c) to allow a wide parameter range with realistic contrast-independent orientation tuning and saturating contrast-response functions, namely T_e = T_{i1} = 1, T_{i2} = 3, β_e = β_{i1} = 0.5, β_{i2} = 1, and were fixed for all simulations. To model the geniculate input h^{LGN}, first the correlation of a two-dimensional stimulus t_{θ0}(x) and a Gabor filter p_θ(x), which models the receptive field of a cortical simple cell at x, is computed. Using absolute values of the synaptic input results in oriented π-periodic input that behaves like quadrature pairs in the neuron populations. This value is then fed through a logarithmic non-linear function mimicking the influence of the LGN,

h^{LGN}(x_j) = ln( c | ∫ t_{θ0}(x) p_θ(x − x_j) dx | + 1 ),   (3.28)

and c being the contrast of the stimulus in arbitrary units. p_θ(x) is formulated as a DC-free filter (cf. (Daugman, 1988)),

p_θ(x) = (k²/σ²) exp( −k² x² / (2σ²) ) [ exp(i k_θ · x) − exp(−σ²/2) ],   ||p_θ(x)||² ≈ k²,   (3.29)

where k_θ = R(θ) 2π/60 is the center frequency vector and R(θ) is a vector orthogonal to the preferred orientation θ. The center frequency k = 2π/60 was adjusted to yield a wavelength of 60 pixels for one wave. σ = 2 ensures essentially two oscillations within the Gabor field. Artificial stimuli are modeled as combinations of circular regions of complex wave functions t_{θ0}(x),

t_{θ0}(x) = exp(i κ_{θ0} · x)   (3.30)


where κ_{θ0} = R(θ0) 2π/30 is the wave vector of the sinusoidal grating and R(θ0) is a vector orthogonal to the direction of the grating. The lateral input, h^{lat}_α, is given by

h^{lat}_α(x_j, t) = Σ_{β=e,i1,i2} Σ_l S_β(x_j − x_l) m_β(x_l, t),   (3.31)

where S_β(x_j − x_l) denotes the mean connection strength from a neuron population of type β at x_l to all neuron populations at point x_j. Note that all three target neuron populations at a given location x_j receive identical input. The intracortical connection strength was chosen to be independent of the distance from population β at x_l to the population at x_j:

S_β(x − x_l) = S_β/N_β  if |x − x_l| < σ_β,  and 0 otherwise,   β = e, i1, i2.   (3.32)

N_β is the overall number of connections formed by this particular population, and S_β is a scalar value describing the overall strength of the connections. σ_e refers to the radius of the excitatory connections, σ_{i1} and σ_{i2} to the radii of connections from inhibitors with low and high thresholds, respectively.
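To make the feed-forward stage and the coupling kernel concrete, the following sketch implements Equations 3.28, 3.29 and 3.32 in a simplified form. It is our own illustration: grid sizes, the grating example and the orientation convention of the wave vector are assumptions, not the original implementation.

    import numpy as np

    def gabor(theta_deg, size=61, k=2 * np.pi / 60.0, sigma=2.0):
        # DC-free complex Gabor filter p_theta(x), cf. Eq. 3.29
        y, x = np.mgrid[-(size // 2):size // 2 + 1, -(size // 2):size // 2 + 1]
        th = np.deg2rad(theta_deg)
        kx, ky = k * np.sin(th), k * np.cos(th)      # wave vector orthogonal to theta (assumed convention)
        env = (k**2 / sigma**2) * np.exp(-k**2 * (x**2 + y**2) / (2 * sigma**2))
        return env * (np.exp(1j * (kx * x + ky * y)) - np.exp(-sigma**2 / 2))

    def h_lgn(stimulus, theta_deg, c=1.0):
        # geniculate input at the patch center: absolute value plus log, Eq. 3.28
        p = gabor(theta_deg, size=stimulus.shape[0])
        return np.log(c * np.abs(np.sum(stimulus * p)) + 1.0)

    def disc_kernel(positions_um, sigma_beta, S_beta):
        # top-hat lateral coupling of Eq. 3.32 between cortical positions (micrometers)
        d = np.linalg.norm(positions_um[:, None, :] - positions_um[None, :, :], axis=-1)
        mask = d < sigma_beta
        N_beta = mask.sum(axis=1, keepdims=True)     # number of connections formed
        return np.where(mask, S_beta / N_beta, 0.0)

    # example: a sinusoidal grating of orientation 30 deg probed by matched and mismatched Gabors
    yy, xx = np.mgrid[0:61, 0:61]
    kappa = 2 * np.pi / 30.0
    grating = np.exp(1j * kappa * (np.sin(np.deg2rad(30)) * xx + np.cos(np.deg2rad(30)) * yy))
    print(h_lgn(grating, 30.0), h_lgn(grating, -60.0))

    positions = np.random.rand(50 * 37, 2) * np.array([4000.0, 3000.0])   # stand-in cortical grid (um)
    print(disc_kernel(positions, 160.0, 100.0).shape)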

3.3.2. Results

For high recurrent connectivity the model shows a hexagonal arrangement of activation blobs, depending on the wavelength of the coupling. We fixed the principal wavelength of the intracortical couplings to fit the wavelength of the underlying orientation map (550 µm, σ_e = 160 µm, σ_{i1} = 290 µm). The principal wavelength of the intracortical couplings was computed by calculating the maximum of the Fourier transform of

[ H(x + σ_e) − H(x + σ_e − 2σ_e) ] / (2σ_e) · S_e − [ H(x + σ_{i1}) − H(x + σ_{i1} − 2σ_{i1}) ] / (2σ_{i1}) · S_i,   (3.33)

where H(x) denotes the Heaviside function. Using larger wavelengths (800 µm) leads to inconsistent behavior of the model: Some regions show no activation although they receive excitatory input, because the cortical couplings enforce a roughly hexagonal arrangement of activity centers which is not consistent with the underlying iso-orientation domains. First we are interested in the effect of changing the strength of the recurrent excitation on the response properties of the neuron populations. We computed orientation-tuning curves for different levels of input contrast. The result (see Figure 3.21 on the facing page) shows that the model neurons produce saturating contrast-response curves. At the preferred orientation of the neuron


Figure 3.21.: (a) Response of an excitatory neuron population (preferred orientation 7°) to a stimulus patch of different orientations (sinusoidal grating) for different stimulus contrasts. The solid line without marker shows the thalamo-cortical input for this neuron population for c = 1. The stimulus has an optimal phase for the shown cell. The coupling strength was chosen to be S_β = 100. The tuning is asymmetric and its sharpness differs between the left and the right flank. (b) Number of activated neuron populations together with the global synaptic input for the neuron population x_j (c = 4) over stimulus orientation


population (≈ 7°), the response change for equidistant contrast steps is reduced at high contrast levels. The reason is that for strong recurrent excitation, high-threshold inhibitory neuron populations are activated and suppress the response of the other neuron populations. Figure 3.21b, on the page before, shows that the number of neuron populations which take part in the cortical circuitry depends on the stimulus orientation. At θ = 50 degrees, more neuron populations contribute to the recurrent feedback loop than at θ = −50 degrees; hence competition is stronger at θ = 50 degrees and leads to a contrast-invariant onset of the tuning curve, as opposed to θ = −50 degrees. In other words, local inhomogeneities in the orientation map cause a varying number of neurons to participate in the local recurrent circuit. If the number of active neuron populations is higher, the circuit is shifted into a regime (the marginal phase) where orientation tuning is contrast-invariant (right half of Figure 3.21a on the preceding page). Now we address the issue of context effects in this model. Stimuli in the non-classical surround of a neuron population x_j have, according to experimental results, only modulatory effects. Figure 3.23a on page 76 shows the responses of all excitatory neuron populations to a stimulus of diameter 600 µm centered above the neuron shown in Figure 3.21 on the preceding page. Figure 3.23b on page 76 shows the response to two flanking stimuli outside the CRF. In accordance with experiments, these stimuli cannot elicit any response at the center location. Figure 3.22 on the facing page shows that facilitation and suppression arise depending on the mutual configuration of the center and flanking surround stimuli. Figure 3.22 on the next page (solid polygon) plots the activity of the cell (radius encodes response level) versus the axis between the flanking stimuli (angle encodes the orientation of the axis). This particular cell is slightly facilitated if the flanking stimuli are spatially aligned with the center orientation (thick bar), whereas it is strongly suppressed for orthogonal flanking stimuli. The anisotropy of the contextual modulation is induced because the input to the center neuron differs depending on the orientation and the arrangement of the flanking stimuli. Figure 3.22b, on the facing page, and Figure 3.22c, on the next page, display gray-level plots of the cortical synaptic input evoked by a center stimulus flanked by aligned (b) and orthogonal (c) stimuli. The inputs at the points marked #1 and #2 differ for changes in the alignment of the surround stimulation. Activity blobs #1 and #2 are more active for orthogonally aligned stimuli (Figure 3.22c, on the facing page) than for co-aligned stimuli (Figure 3.22b on the next page). Figure 3.23 on page 76 demonstrates how these changed inputs can evoke contextual modulation by local connections. The cell populations which are excited by the center and the flanking stimuli are


Figure 3.22.: Facilitation and suppression for different configurations of input patterns because of changes in the local input. (a) Response of the considered neuron population to different configurations of surround patches shows facilitation for stimuli aligned along the axis of preferred orientation (thick bar in the middle) and substantial suppression for orthogonally aligned stimuli. The activation of the center neuron population for the center stimulus alone was 0.788. (b) Input pattern for co-aligned stimuli along the preferred orientation. (c) Input pattern for co-aligned stimuli orthogonal to the center stimulus. Parameters match those of Figure 3.23 on the next page. Insets show the used stimulus paradigm. Figures from (Bartsch et al., 2001)


Figure 3.23.: Stimulation of center and surround changes the response of cortical neuron populations via short-range interactions. The responses of excitatory neuron populations for an oriented grating and two co-linear stimuli ((a) center stimulus alone, (b) surround stimulus alone and (c) center plus surround) are shown. Additionally, two activation blobs are marked for further reference in the text (#1, #2). Surround stimulation alone cannot elicit a response. Parameters are: radius of grating patches 300 µm, center-to-center distance of co-aligned patches (same radius) 600 µm. Insets show the used stimulus paradigm. Figures from (Bartsch et al., 2001)

not completely disjoint. The patches marked #1 and #2 are driven by both center and surround, and communicate contextual effects via a bi-synaptic or poly-synaptic signaling cascade. Suppression is observed if a change in activity induces inhibition in their respective surround, that is, at the center position of the stimulus. This reduces the response at the center; for co-aligned stimuli in the surround we therefore get a suppressive effect. Facilitation can be explained as follows: activity at the intermediate positions #1 and #2 can be reduced by the surround stimuli in comparison to the activity induced by the center stimulus alone. This leads to an explanation of facilitatory effects by dis-inhibition. We propose that two opposite effects contribute to the observed contextual modulation: (i) local inhibition that is induced by a local change in input (and leads to suppression), and (ii) dis-inhibition. By changing the configuration of the stimulus, different regions of the orientation map are activated. Changes in the local structure then define which is more prominent, suppression or facilitation.

3.4. Concluding Remarks

In the lattice model we observe a slight shift of the activity blobs depending on the stimulus (see Figure 3.23 on the preceding page). This leads to a divergent behavior of the neuron activation: Some populations may actually be facilitated while, at the same time, other neuron populations in the surround may show suppressed activity. This shift is induced by the enforced hexagonal arrangement of activity centers. Our model shows strong interactions mediated by poly-synaptic connections. For the finding of Das and Gilbert (1999) that contextual effects might also be mediated by short-range connections we found two possible mechanisms: lateral inhibition and dis-inhibition. In addition, anisotropic contextual modulation can be an emergent property of a network with ideally isotropic and local connectivity. We suggest that it results from a different weighting of inhibition and dis-inhibition depending on the configuration of flanking stimuli. In this chapter we analysed the role of the intra-cortical networks in the generation, sharpening, and modulation of orientation preference as one of the main features of primary cortical neurons. The mathematical theory of interacting hypercolumns in primary visual cortex incorporated details concerning the arborization of three different neuron populations. As an analytical method we used stationary-state analysis. While we took special care not to introduce time-periodic patterns, the dynamic properties of the cortex would also be worthwhile to analyse. They can be compared, for example, to the findings of Volgushev et al. (1995) (see Figure 2.6 left, on page 23). Bressloff and Cowan (2002) introduced bifurcation theory for the ring model of the orientation hypercolumn. They derived non-linear equations for the amplitude and phase of the population tuning curves. It would be worthwhile to repeat this analysis for our model incorporating the third neuron population. Unfortunately, both of the above projects could not be performed as part of this thesis. We now leave the field of biological modeling in favor of more theoretical ground, which is concerned with the special nature of the input to the visual system.

4. Invariance and Symmetry

“They were masters of geometry. 60 stones build a circle of exactly 360 degree.”

TV documentary

Biological Motivation

A popular rationale for the response characteristics of visual cortical neurons is that of convergent, weighted input from more specialized cells. For example, the outputs of ON-OFF retinal ganglion cells converge onto orientation-selective simple cells in primary visual cortex. Higher up in the hierarchy, for example in the inferior temporal visual cortex (IT), we find neurons that respond to objects or faces independently of their positions on the retina over many degrees (Gross, Desimone, Albright and Schwartz, 1985). But neurons in lower visual areas also respond well to complex stimuli (see Section 2.3.2 on page 26), sometimes even more strongly than to the classical moving grating stimuli. Also the experiment of Földiák (2001) on the non-linear reconstruction of complex cell receptive fields indicates that additional information is extracted by these neurons. To explain the apparent contradiction of having neurons early in the visual pathway that show strong responses to complex stimuli, we hypothesize that neurons possess receptive-field characteristics that can be described in terms of second-order correlations of image intensities. Here, we implicitly assume that the observed specificities of neural responses cannot fully be described by linear models, which weigh single pixels only. This does not exclude that some neurons (e.g., simple cells) can be reasonably described by linear models, but as we go to higher levels of visual processing, some non-linear processing strategies are likely to be used by the visual system; in the case of complex cells this may even happen at the very first stages of visual cortical processing. This seems reasonable also in the light of the complex nature of the visual input (see Section 1.3).

Statistical Motivation

We have already seen in Section 1.3 about natural images that there is a strong statistical motivation for symmetry detection. Natural images were characterized as containing a huge amount of redundancy.


To repeat some of that reasoning: A representation of images by their non-redundant parts is preferable because of its short description length. By using a short representation we select the correct representation in terms of (Kolmogorov) complexity (see Section 1). The non-redundant parts are also found to be highly structured, containing for example edges. Interestingly, we demand here a search for structure in the data: only the structured components hidden in the data will allow an efficient data representation. In this work we propose that symmetry is a measure of structure; thus highly symmetric components carry a large amount of structure and can be used to represent the data efficiently. To avoid any confusion from the outset: the symmetry we are talking about differs from the abstract geometrical form of symmetry. We will instead introduce a symmetry measure that is smooth (more a scalar than a binary value) and can be evaluated locally in digitized images.

4.1. Related Models

Response properties of visual cortical neurons are mostly described in terms of selectivity and invariance, meaning that a neuron responds selectively to some stimulus feature while at the same time it does not respond to some other feature in the stimulus. A complete description based on a neuron's selectivities is difficult because of the unknown number of possible features the neuron may respond to. Therefore, one generally intermixes the two statements, stating that complex cells are selective to the orientation of a grating stimulus but invariant to the spatial phase of the pattern.

4.1.1. Learning Invariance

Models for learning invariances in natural images try to cope with the transformations that images undergo if the position of the observer changes. If images shift, scale, and rotate, the networks in the brain have to learn these invariances in order to represent the information efficiently and to ensure stable representations. Models for learning invariances either rely on temporal sequences of input patterns undergoing transformations (Rao and Ballard, 1998; Földiák, 1991; Wiskott and Sejnowski, 2002; Einhäuser, Kayser, König and Körding, 2002) or on modifications to the distance metric for comparing input images to stored templates (Simard, LeCun and Denker, 1993). Rao and Ballard (1998), for example, used a transformation-invariant coding strategy to code images based on a first-order Taylor expansion, and obtained


localized, oriented receptive fields from natural image inputs. Due to the first-order approximation only small transformations (e.g., small invariances) could be learned; the performance of the model was reduced for large transformations. Rao and Ruderman (1999) introduced a Lie-group approach (see Section 4.1.3) which could handle 1-D transformations and 2-D rotations. The model is based on the notion of continuous transformations and Lie group theory. In this approach a matrix G is learned that is called the generator of the transformation group. Applying this matrix to the input, one assumes an infinitesimally small change of the input according to the transformation. A macroscopic transformation can be produced by chaining together a number of these infinitesimal transformations. This leads to an exponential-based generative model of an image,

I(x) = exp(xG) I_0,   (4.1)

where I_0 is the initial or ’reference’ input. The model then learns the generator matrix G from a series of before-and-after images (before the transformation and after the transformation). It turns out to be problematic to learn different transformation generators G at once. Also, the single transformations have to be small to ensure successful learning. Another example of a model that uses the temporal sequence of the input is slow feature analysis (Wiskott and Sejnowski, 2002). It is based on the assumption that a slowly varying representation can be considered to be of a higher abstraction level than a quickly varying one. For a moving grating stimulus, for example, we observe a large change in the illumination of single pixels, but the orientation of the grating is a more slowly varying feature and thus of better use to describe the stimulus than the changes at the level of single pixels. The model uses first- and second-degree monomials to describe the neurons' responses by a polynomial. Another model that utilizes slowly varying features in order to learn invariances is the one by Eglen, Bray and Stone (1997). By jointly maximizing the long-term variance of the output and minimizing its short-term variance, a network model could learn to discover stereo disparity and feature orientation. The model of Einhäuser et al. (2002) also relies on temporal sequences. As a hierarchical network model it learns simple and complex cell receptive-field properties based on a sparseness constraint (see (Olshausen and Field, 1996; Hyvärinen and Hoyer, 2000) for other models that use sparseness). Crucial for models of temporal sequence learning is that either the type of invariance has to be predefined or the transformations between successive inputs have to be comparatively small in order to learn the right invariances.
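As a toy illustration of the generative model in Equation 4.1 (our own sketch, not the implementation of Rao and Ruderman), the generator of one-dimensional periodic translations can be written as a discrete derivative matrix; the matrix exponential then produces continuous shifts of a reference pattern:

    import numpy as np
    from scipy.linalg import expm

    n = 64
    I0 = np.exp(-0.5 * ((np.arange(n) - 20) / 3.0) ** 2)   # reference input: a Gaussian bump

    # generator of (periodic) rightward shifts: (G f)_i = (f_{i-1} - f_{i+1}) / 2
    G = (np.roll(np.eye(n), -1, axis=1) - np.roll(np.eye(n), 1, axis=1)) / 2.0

    def transform(I0, x):
        # I(x) = expm(x G) I0 (Eq. 4.1): a continuous shift by roughly x samples
        return expm(x * G) @ I0

    shifted = transform(I0, 5.0)
    print(np.argmax(I0), np.argmax(shifted))               # bump moves from index 20 to about 25

Learning G from pairs of images before and after a transformation then amounts to regressing the observed changes onto this exponential model.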


It is questionable whether the visual input coming from the eyes produces such a slowly varying stimulus transformation. For example, rapid eye movements (saccades) are rather common, shifting our focus of interest over large proportions of the input image. Micro-saccades occurring between saccades, on the other hand, introduce only relatively small shifts. A model that learns invariances (or a related feature class) from independently drawn input signals would therefore be favorable. We also need to learn more than one transformation simultaneously.

4.1.2. Learning Symmetry

It is difficult to learn invariances without at least two closely connected patterns, one pattern before and one after the transformation in question was applied. In that sense a pattern is invariant if a mathematical or physical process has not changed the pattern. If we look for invariances we normally refer to processes like shifts of gaze, micro-saccades or object movements, or rotations that cause the 'after transformation' pattern. For independently drawn patterns these processes are not directly observable. There is a way around this. A related measure that is a feature of a single pattern, i.e., accessible from independently drawn samples, is symmetry or self-similarity. Like invariance it reflects the property of a pattern not to be changed by some transformation. Normally the transformations we speak of in symmetry are rigid motions of geometrical figures, but we would like to weaken this 'geometrical' symmetry in favor of a more general continuous property that is closer to invariance (Bartsch and Obermayer, 2002; Bartsch and Obermayer, 2001; Masuda, Yamamoto and Yamada, 1993; Zabrodsky, Peleg and Avnir, 1995). As we will see, symmetry can be detected by local visual processes relying on correlations, so symmetry detection does not demand great computational resources. One example of invertebrates using symmetry as a visual feature are bees, which according to Moller (1995) can detect the symmetry of flowers. In general, symmetry can signal the presence of objects in the visual scene and therefore is likely to be used to direct visual attention.

Evidence for Symmetry as a Visual Feature

Symmetry has been dealt with in both art (architecture, sculpture, painting) and science (mathematics, physics, chemistry). Thus symmetry can be discussed under different aspects. In this section we will be mainly concerned with psychophysical findings on the detection of spatial visual symmetry.


Symmetry has been found to be a pre-attentive feature similar to size, brightness, color or movement. That means it is detected very fast, requiring less than one second (Corbalis and Roldan, 1974; Royer, 1981). In particular it is processed faster than form, shape, and structure, which all require registration in short-term memory (Attneave, 1955), and it is also independent of color processing (Morales and Pashler, 1999).

Different types of symmetry are known to be detected with different reaction times. The detection times for rotational symmetries were found to be longer than those for mirror symmetries (Royer, 1981; Corbalis and Roldan, 1974). Among the mirror symmetries, vertical symmetry (a vertical symmetry axis) was found to be the most salient and most easily perceived.

In this context the question arises whether these findings are better explained by a mental rotation or by a template matching model. In particular, Royer (1981) found that the reaction times for the detection of vertical symmetry are shorter than those for horizontal and diagonal mirror symmetry, which in turn are shorter than the reaction time for the detection of rotational symmetry. In favor of the template model, Corbalis and Roldan (1974) found that the reaction time to detect symmetry in a random dot cluster increased substantially as the angle to the vertical increased.

One of the earliest explanations of how the visual cortex detects symmetry was given by Julesz (1971). He pointed out that the saliency of vertical symmetry is due to interactions between the symmetry of the pattern and the bilateral symmetry of the visual system. Detection of symmetry could be performed by a point-by-point comparison based on a neuroanatomy which is symmetric about the fovea. One concludes that the required symmetric projection to the visual system would be destroyed if the fixation point for symmetry detection was not on the symmetry axis. Nevertheless, Barlow and Reeves (1979) and Masame (1983) showed that although there is a decrease in performance, symmetry is also detected if the symmetry axis is displaced to the right or left of the fixation point.

Dakin and Herbert (1998) showed that symmetry detection exhibits scale invariance. The size of the integration region for detecting symmetry was measured with spatially band-pass filtered noise images in which symmetrical patches were embedded (see Figure 4.1 for an example stimulus). They showed that the size of the integration region varies in inverse proportion to spatial frequency and is elongated in the direction of the axis of symmetry, with an aspect ratio of approximately 2:1. These results are compatible with a central role for spatial filtering in symmetry detection.


Figure 4.1.: Stimulus similar to the ones used by Dakin and Herbert (1998) for measuring the integration region of symmetry detection in humans. The inset top left shows the Gaussian filter used for band-pass filtering of the random dot image

Models for Symmetry Detection

There is a rich literature on models for detecting symmetries in images, primarily used in object recognition or shape representation (Gofman and Kiryati, 1996; Blum and Nagel, 1978; Brady and Asada, 1984; Pizer, Oliver and Bloomberg, 1987; Bruckstein and Shaked, 1995; Zabrodsky et al., 1995; Ponce, 1990). Here symmetry is thought of as a global feature of the image or of the object displayed therein. In contrast, later on we will assume that symmetry can also be defined as a local feature of parts of the image. The input of some early algorithms for symmetry detection was assumed to come from a successful segmentation procedure. These shape-based algorithms rely heavily on the preprocessing. Because general segmentation of images is a problem that is still not solved, the applicability of shape-based symmetry analysis algorithms is limited. An example of the problems with shape analysis is presented in Section 5.4.2 on page 110, where an algorithm based on gray values outperforms a shape-based approach for image alignment. Other algorithms compute symmetry by analyzing an edge or line representation rather than a binary image (Cham and Cipolla, 1995; Ogawa, 1991; Ylä-Jääski and Ade, 1996; Brady and Asada, 1984). Basically the same problem appears in this context: extracting clean edges is as difficult as segmentation. Most pre-processing algorithms destroy some type of information in the data in order to highlight another. Pre-processing is bound to be task specific because what is negligible information in one task is crucial in another. For this reason symmetry analysis is preferably done on gray level data.



Figure 4.2.: Walsh basis functions w_{0,0} to w_{3,3} used for symmetry detection; the four panels group them into double, horizontal, vertical, and rotational symmetry

Our previous observation that symmetry detection is performed pre-attentively also indicates an early position for symmetry detection in the visual pathway. Some basis function approaches have been suggested explicitly for symmetry detection. One example are the Walsh basis functions (see Figure 4.2). The Walsh functions form an orthogonal and complete set of functions representing a discretized function (Rao, 1983). There is also a direct relationship between Walsh functions and wavelet decomposition: any wavelet component can be obtained as a superposition of all the Walsh components of the same order.

The basis functions, denoted by W_{n,m} (where n, m are integer values), can be separated into four different classes: (i) vertical mirror symmetry (m even, n odd), (ii) horizontal mirror symmetry (m odd, n even), (iii) double mirror symmetry (m even, n even) and (iv) rotational symmetry (m odd, n odd). A vector of four values, one per symmetry class, can be calculated for a given image. The entropy of these values, for example, represents the symmetry information of that image. Another basis function approach was used by Bigün (1988) to detect rotationally symmetric images. Basis functions were defined as spirals with a varying number of 'arms' and variable curvature (see Figure 4.3, left). Given an image of radius R the basis functions are given by

\phi_{m,n}(r, \theta) = \exp(i(m \omega r + n \theta))    (4.2)

where \omega = 2\pi/R and (r, \theta) are polar coordinates. Note that m, n are natural numbers (the sign of n defines the 'handedness' of the spiral).


Figure 4.3.: Left: Basis functions to detect rotational and circular symmetry in images. Right: Basis functions to detect orientations with different spatial frequencies

The above equation describes the local (in m, n) solutions of a differential equation obtained from a set of operators by the Lie Transformation Group model, which is explained in more detail in the next section. Note that there are more models connected to symmetry detection, for example fractal coding (Jacquin, 1990), a technique to find a transformation of an image for which the result of its application upon the image leaves the image unchanged. This transformation was then used in the context of image coding, e.g., image compression.
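As an illustration of Equation 4.2, the following sketch (an assumption of this edit, not the original implementation) samples one complex spiral basis function on a pixel grid and projects an image patch onto it; a large magnitude of the coefficient indicates the corresponding rotational or spiral structure.

import numpy as np

def spiral_basis(size, m, n):
    # phi_{m,n}(r, theta) = exp(i (m * omega * r + n * theta)), omega = 2*pi/R,
    # sampled on a size x size grid with R taken as half the grid width.
    c = (size - 1) / 2.0
    y, x = np.mgrid[0:size, 0:size]
    r = np.hypot(x - c, y - c)
    theta = np.arctan2(y - c, x - c)
    omega = 2 * np.pi / (size / 2.0)
    return np.exp(1j * (m * omega * r + n * theta))

patch = np.random.rand(33, 33)
coeff = np.vdot(spiral_basis(33, m=1, n=2), patch)   # projection onto one basis function
print(abs(coeff))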

4.1.3. The Lie Transformation Group Model

Hoffman (1965) first formulated the concept of the Lie Transformation Group (LTG) in the context of visual perception. The model claims that it can explain how the locally smooth phenomena that occur in the visual cortex lead to a model of human vision. Our eyes receive constantly changing inputs due to the movement of our body and head. The majority of this movement in our sensory input is described by transformations from the two-dimensional affine transformation group. In order to have continuity in the perception of objects over time, human vision should account for such transformations. The LTG model proposes a set of three basic pairs of operators, or transformations, which can be used to construct a model of human vision. These three pairs come in the form of mutually orthogonal orbits.

The first operator pair accounts for simple translation in the x and y direction, respectively:

\mathcal{L}_x = \partial/\partial x, \qquad \mathcal{L}_y = \partial/\partial y.    (4.3)

Examples of the corresponding orbits can be seen in Figure 4.3, right. Directions of equal gray value indicate the changes to which the pattern recognition process should be transparent. The patterns are generated as local¹ solutions of a differential equation calculated as

\phi_{m,n}(X, Y) = \exp(i(\omega/2 \, m X - \omega/2 \, n Y))    (4.4)

where \omega was defined as in Equation 4.2 and X and Y are grid positions in the x- and y-direction, respectively. As well as lateral movement we can also include rotations and scalings:

\mathcal{L}_S = x(\partial/\partial x) + y(\partial/\partial y), \qquad \mathcal{L}_R = -y(\partial/\partial x) + x(\partial/\partial y).    (4.5)

An example of this operator pair and the corresponding orbits can be seen in Figure 4.3 on the page before, left. The corresponding solution of the differential equation can be found in Equation 4.2. It is apparent that these operator pairs are not unique. A third pair is usually defined as:

\mathcal{L}_\beta = x(\partial/\partial x) - y(\partial/\partial y), \qquad \mathcal{L}_b = y(\partial/\partial x) + x(\partial/\partial y)    (4.6)

4.1.4. Symmetries of the Visual Cortex

Bressloff, Cowan, Golubitsky, Thomas and Wiener (2001) used the assumption that the anatomical connection structure of the visual cortex itself exhibits symmetries, rendering it invariant under the actions of the Euclidean group E(2) (see (Alexander, Sheridan and Bourke, 1997) for a similar ansatz to explain orientation selectivity). If a system is constrained by some symmetries, this defines the patterns that the system will generate spontaneously. For an unconstrained system generally a spatially constant solution emerges, which by definition has the largest number of symmetries. Each restriction introduced into the system reduces (breaks) the maximum-symmetry solution and leaves only specific symmetries as solutions. Bressloff et al. (2001) could show that the patterns created by a specific set of actions, which were motivated by the anatomical structure of the visual cortex, predict common visual hallucinations. Using this model one can explain the patterns that appear due to abnormal states of the brain, as they occur in the case of migraine or epilepsy.

¹In this case also global solutions.


Figure 4.4.: Example image and sketch of the procedure of calculating S: the image patch X is rotated in steps (X^{20}, ..., X^{340}) and S(X) = (cov(X, X), cov(X, X^{20}), ..., cov(X, X^{340}))^T

Whereas the latter model analyses the structure of the cortex in order to explain phenomena of human vision, we will reverse this ansatz in the following chapters. We will derive and analyse a model of the visual world in order to make predictions about the mechanisms of the visual cortex.

4.2. A Structure Preserving Transformation

In the light of symmetry as a pre-attentive feature to detect the presence of objects in a scene, we now introduce a measure of local rotational symmetry on gray valued digital images. Later on we will see that this model is compatible with a more general model of a quadratic form, by which in the end we will be able to learn local features from natural images. One basic idea to detect rotational objects is to calculate a 'product of the pattern with the rotated pattern' (Bartsch and Obermayer, 2001). Correspondingly, a stimulus is interesting if the dot-product of the stimulus with a rotated version of the stimulus is large. Operations on pixel coordinates (like rotations) work in general only for images in a continuous function space. It is difficult to find this space for digital images. One way is to define a prior on the image structure (for example smoothness) for the re-calculation of coordinates. The idea of Ruderman and Bialek (1992), for example, could also be used to construct a more educated guess about the sub-pixel structure of images based on the largely scale-invariant statistics of natural images. Instead, we will here rely on the smoothness assumption. A simple interpolation technique is used to approximate sub-pixel positions within the equally spaced (Cartesian) grid of pixels. This of course introduces discretization errors that will show up as prior structure if the smoothness assumption is not valid. In the following we will refer to X as a stimulus or input in the receptive


field of diameter r_max of a neuron implementing S. Let S be a transformation IR^2 \to IR^{(2\pi/\theta_{min})} that is sensitive to structure:

S(X) = (C(\theta_{min}), \ldots, C(2\pi - \theta_{min}))^T, \qquad C(\theta) = cov(X, X^\theta)    (4.7)

X^\theta describes the rotation of the center of the receptive field of a neuron in IR^2 space by \theta, and cov(\cdot, \cdot) is the covariance function, which is used to measure the similarity of the original and the rotated image patch (see Figure 4.4). It is problematic to compute X^\theta because the rotation in pixel coordinates is not well defined; however, we will assume for the moment that with certain tricks² known from the computer vision community we can be pretty close to the correct solution (continuous pixels). We also do not need to apply our transformations (rotations) successively as in the generator model of Rao and Ruderman (1999). There are several ways of defining a scalar value for the symmetry. For example, we can use the L1-norm of S(X). This will result in a single value per pixel and thus it can be visualized as a response image. At this point this procedure is only used to illustrate the general process of calculating some 'symmetry' value. To analyse images this way we would rely on a more elaborate approach calculating, for example, the entropy of the symmetry vectors obtained for different images. A set of 256 image patches, each with dimension 60 × 60, were drawn randomly from an image (Figure 4.5, left). Mirror symmetries were tested for a set of orientations {\theta_0, \ldots, \theta_9} of the mirror axis which were equally spaced in the range of [0, \pi]. To avoid edge effects a circular region was used. Figure 4.5, right, shows the results of a re-sorting of the images according to the calculated symmetry values ||S||_1 for each random position (sorted from left to right, top to bottom). The values for ||S||_1 are shown in Figure 4.6 and decrease exponentially, which indicates a high specificity of S. One sees that the most symmetric images of this size contain mostly edges, whereas the least symmetric ones contain mostly unstructured noise. This is not really surprising because at this scale one can basically classify image patches into the two categories 'noise' and 'contrast edge' (any more symmetric patch, for example a patch containing a circle, is extremely unlikely). The question remains why patches containing edges should be at any rate more rotationally symmetric than noisy patches. First of all, we could suspect that edges are better at having an overlap with (mildly) rotated versions of themselves plus 'negative' overlap with 180° rotated versions, which is elevated by the L1-norm. Secondly, our transformation S uses covariances.

²Mainly we computed all rotated versions from one unchanged 'master' image, the original X, and in a backward way, e.g. finding for each pixel in X^\theta the most likely original pixel in X.
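A minimal sketch of Equation 4.7, assuming a simple interpolation-based rotation (scipy.ndimage.rotate) instead of the 'backward' pixel mapping described in the footnote above:

import numpy as np
from scipy.ndimage import rotate

def rotational_symmetry(patch, theta_min_deg=20):
    # S(X): covariance of a circular patch with rotated copies of itself.
    size = patch.shape[0]
    c = (size - 1) / 2.0
    y, x = np.mgrid[0:size, 0:size]
    mask = (x - c) ** 2 + (y - c) ** 2 <= (size / 2.0) ** 2   # circular receptive field
    a = patch[mask] - patch[mask].mean()
    S = []
    for theta in range(theta_min_deg, 360, theta_min_deg):
        rotated = rotate(patch, theta, reshape=False, order=1, mode='nearest')
        b = rotated[mask] - rotated[mask].mean()
        S.append(np.mean(a * b))                              # cov(X, X^theta)
    return np.array(S)

patch = np.random.rand(61, 61)
print(np.abs(rotational_symmetry(patch)).sum())               # scalar value: the L1-norm of S

For a pure noise patch this L1-norm tends to be small, while a patch containing a strong contrast edge yields a larger value, in line with the sorting shown in Figure 4.5.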


Figure 4.5.: Edges are the most rotationally symmetric image patches of natural images. Left: Microscopic image (1400 × 1400). Right: Resorted random samples from a single image according to the symmetry values S plotted in Figure 4.6

Figure 4.6.: The values of ||S||_1 for the 256 randomly drawn image patches shown in Figure 4.5


Therefore C depends heavily on the variance of the pixels in X. As was found by Reinagel and Zador (1999), high variance patches are most likely edges, thus explaining our bias towards edges when calculating rotational symmetries. Although here we focus on detecting rotational symmetries, other types of symmetries can also be implemented in this framework. Detecting mirror symmetries, for example, requires only an additional mirroring of X^\theta around its medial axis. If we look in more detail into the model presented in Equation 4.7 on page 88, we notice that we have computed products of pixel intensities which were summed up to get a corresponding C value. In order to present a single symmetry value for a region in Figure 4.6 we used the L1-norm of all C values, which is also a sum of absolute values. So our whole procedure can be viewed, in first approximation, as a large sum of pixel products.

S(X) = \sum_{\theta = \theta_{min}}^{2\pi - \theta_{min}} \sum_{i,j} (x_i - \mu)(y_j - \mu), \qquad x_i \in X, \ y_j \in X^\theta    (4.8)

If we assume that the absolute values are not essential to the function of the transformation, the above equation can be represented by a single sum of products over a specific set {s} of pixel pairs:

S'(X) = \sum_{(i,j) \in \{s\}} x_i x_j    (4.9)

We emphasize here that the crucial ingredient is the choice of the pixel pairs and not the grouping of the products. In the attempt to calculate rotational symmetries we have selected a specific subset {s} of all possible pixel pairs. If we like to detect mirror symmetries, we simply select another set {s'} of pixel pairs, but the overall structure of the algorithm remains. Thus the detection of rotational symmetry as done by Equation 4.9 is a special case of a model F: IR^n \to IR that can be parameterized as a quadratic form

F(X) = \sum_i a_i \phi_i(X) = x^T A x, \qquad x = vec(X),    (4.10)

where \phi_i is a cross product of the coordinates of the input vector X and a_i is an entry a_{kl} of the matrix A which is 1 if x_k x_l is in the corresponding set {s} and 0 elsewhere. In general we are interested in matrices A which weight pairs of pixels. This is in contrast to basis function approaches that are interested in weights of single pixels (\phi(X) = A_i). The above formulation relates the quadratic form to models of polynomial classifiers and notably to sigma-pi units or higher order nets. From now on we will call A a parameterization and x^T A x our model.
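The equivalence of Equations 4.9 and 4.10 can be checked directly; the following sketch (with an arbitrary example pair set, not the one used in the thesis) confirms that the sum of products over selected pixel pairs equals x^T A x for a symmetric binary A:

import numpy as np

rng = np.random.default_rng(0)
n = 25                                                  # a 5 x 5 patch, vectorized
x = rng.standard_normal(n)

pairs = [(0, 24), (1, 23), (2, 22), (5, 19), (7, 17)]   # example pair set {s}
A = np.zeros((n, n))
for i, j in pairs:
    A[i, j] = A[j, i] = 0.5                             # symmetric A so that x^T A x = sum x_i x_j

pair_sum = sum(x[i] * x[j] for i, j in pairs)
print(np.isclose(pair_sum, x @ A @ x))                  # True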


Figure 4.7.: Orthogonal basis functions for rotation (left), scaling (center) and shift (right) computed as eigenvectors of a quadratic form. (Figure from (Bartsch and Obermayer, 2003))

4.2.1. Co-variation and the Quadratic Form

Assuming an ensemble X of gray level images (E(x) = 0 for x \in X), what is the information about the image that is explored by A? Using an eigenvalue decomposition of the matrix A,

x^T A x = x^T \sum_i^N \lambda_i n_i (n_i^T x) = \sum_i^N \lambda_i (n_i^T x)(x^T n_i),    (4.11)

it is simple to show that the expectation of the quadratic form equals

\langle x^T A x \rangle_x = \sum_i^N \lambda_i \langle n_i^T x x^T n_i \rangle_x = \sum_i^N \lambda_i n_i^T \langle x x^T \rangle_x n_i = \sum_i^N \lambda_i n_i^T C n_i.    (4.12)

Only the correlation matrix C = \langle x x^T \rangle_x of the data will be of importance for defining A (statistics of pairs of pixels). A single entry a_{i,j} in this model represents the covariance between pixel i and pixel j. By assuming shift invariance we arrive at the usual covariance function C(x) = E(I_0 I_x) (assuming zero mean), where 0 is an arbitrary origin and x a position in the image (E(.) is the expectation). This interpretation of the quadratic form as the selection of a covariance structure points to a way to visualize the information, e.g., the symmetry, coded in a specific matrix A. The covariance of a data distribution uniquely defines a multivariate normal distribution in that space. The principal axes of this hyper-ellipsoid can be calculated as the eigenvectors of the covariance matrix. In the next section we will define different matrices A_rot, A_scal and A_shift which code for the model equivalents of rotations, scalings, and shifts respectively. The resulting eigenvectors are shown in Figure 4.7, sorted according to their eigenvalues, and reflect nicely the underlying transformations by patterns showing

the respective invariances. Because the matrices are symmetric, their eigenvalues are real. Nevertheless, negative eigenvalues occur (all basis functions after the gap) because the matrices were not obtained from real images. Because of the orthogonality of the eigenvectors they form an orthogonal basis set. By learning matrices A, i.e., by selecting image sets and computing their covariance matrices, one could in principle explore the respective symmetries and would arrive at positive definite or semi-definite matrices. The problem remains how to build sets of images in a meaningful way. Preferably the algorithm itself learns different A matrices from one set of images. That will be the task in Section 6. The next sections will look in more detail into possible settings for A, how to constrain A, and how to build an energy functional that can be used to invert the symmetry transformation.

5. The Binary Valued Quadratic Form

In this chapter we analyse the model presented in the previous chapter in the case of a binary valued¹ matrix A. Each entry a_{i,j} selects a pair of pixels (if it is set to 1). Different (supervised) choices for A are used to illustrate the information content of the quadratic form and its use in advanced methods of image analysis. This chapter is a prerequisite to the next chapter, in which real valued quadratic forms are learned from the data in an unsupervised manner. Throughout this work we will assume that the matrix A is symmetric (which implies that its eigenvalues are real). We will refer to the transformation as

S(x) = x^T A x, \qquad a_{i,j} \in \{0, 1\},    (5.1)

where A is chosen in order to select the pixel pairs that occur for successive rotations of the original X. Alternatively, we will use the more intuitive formulation based on successive rotations from Equation 4.7 on page 88 (a scalar value is obtained from that form by the L1-norm). Let us introduce a toy example in order to illustrate the information contained in the statistics of pairs of pixels. This is of importance because the models introduced to detect invariances or symmetries (see Section 4.1) rely either on a basis function approach, which is related to linear models, or on non-linear models. One of the simplest non-linear models can be expressed in the form of a quadratic model (introduced in Section 4.2 on page 87). Let us build two long sequences of the symbols 0, 1, and 2 and analyse them in the context of a first order Markov chain:

Seq1 = 1 1 2 1 1 1 2 1 1 1 0 1 2 0 0 2 2 0 . . .    (5.2)
Seq2 = 1 0 1 1 2 1 1 0 0 0 0 2 0 2 0 0 2 1 . . .    (5.3)

Both sequences were built such that every symbol in sequence 1 and in sequence 2 appears with probability p(0) = .39, p(1) = .39, p(2) = .21. Thus, measuring the probabilities of single events, the sequences appear to be iid (i.e., sampled from the same probability density). But the probability of one symbol

¹The term binary quadratic form usually denotes a quadratic form in two variables. Here, we refer to the type of values in the matrix A, which is independent of the number of dimensions.

at position t occurring after some other symbol at position t − 1 is found to be

p_{Seq1}(\alpha_{t+1} | \alpha_t) = \begin{pmatrix} .13 & .19 & .07 \\ .16 & .19 & .07 \\ .13 & .02 & .07 \end{pmatrix}, \qquad p_{Seq2}(\alpha_{t+1} | \alpha_t) = \begin{pmatrix} .15 & .15 & .08 \\ .15 & .16 & .08 \\ .08 & .08 & .05 \end{pmatrix},

where \alpha = 0, 1, 2. A position a_{1,2} in these (stochastic, Markov²) matrices codes for the probability p(\alpha_{t+1} = 0 | \alpha_t = 1). By this measure the two sequences can easily be distinguished from one another. The probability of the occurrence of one symbol followed by a second symbol in the sequence is reflected by the covariance between successive symbols, which relates this toy example to the proposed model. The first sequence was constructed by first using a probability of 1/3 for each symbol. After that we arbitrarily deleted three times every symbol 2 which was followed by a symbol 1. In doing so we introduced correlations between consecutive symbols. The second sequence was constructed by independent sampling according to the first order probabilities of the first sequence. The probabilities of each symbol in the first and second sequence are the same (by design), but the two sequences are generated by distinct processes. We see that the choice of the statistic has to be adequate. By counting probabilities of single events we could not distinguish the two sequences³. This was only possible after measuring the statistics of the occurrence of pairs of events. Obviously only the second matrix is symmetric, whereas the first matrix indicates the asymmetric generation of the underlying sequence: there we removed only symbol 2 followed by symbol 1, and not symbol 1 followed by symbol 2. Also we see that the probabilities of the single events show up in the column sums of the matrices (p(0) = p(0|0) + p(0|1) + p(0|2)). In other words, by measuring the complete statistics of pairs of events we measure additional and not just complementary information. This will be of importance because the second order model presented is able to detect features that are also found by linear models (namely edges in natural images). In doing so it reflects the generality of the ansatz compared to linear models and a comparable performance to polynomial models. Polynomial models of second order can be defined by

P(x) = x^T A x + b^T x + c.    (5.4)

²A Markov matrix is a stochastic matrix where the rows (or columns) sum to one.
³Choosing the appropriate statistic by measuring the probabilities of joint events, we are also able to better predict a new symbol, if that is the goal.
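The toy example can be reproduced in a few lines; the sketch below (an illustration under assumptions of this edit, with sequence lengths and deletion probability chosen arbitrarily) estimates the pair statistics of two symbol sequences and shows that only the independently sampled one yields an approximately symmetric matrix:

import numpy as np

def pair_statistics(seq, n_symbols=3):
    # relative frequencies of successive symbol pairs; row: symbol at t+1, column: symbol at t
    counts = np.zeros((n_symbols, n_symbols))
    for a, b in zip(seq[:-1], seq[1:]):
        counts[b, a] += 1
    return counts / counts.sum()

rng = np.random.default_rng(1)
seq2 = rng.choice(3, size=5000, p=[0.39, 0.39, 0.22])          # independent sampling

raw = list(rng.choice(3, size=7000, p=[1/3, 1/3, 1/3]))
# introduce pair correlations by deleting some symbols 2 that are followed by a symbol 1
seq1 = [a for a, b in zip(raw[:-1], raw[1:]) if not (a == 2 and b == 1 and rng.random() < 0.7)]

print(np.round(pair_statistics(np.array(seq1)), 2))            # clearly asymmetric
print(np.round(pair_statistics(seq2), 2))                      # approximately symmetric

Note that, unlike in the constructed sequences above, the single-symbol marginals of the two sketched sequences do not match exactly; the point here is only the (a)symmetry of the pair statistics.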


Figure 5.1.: Connection structure. Right: Three binary matrices A (color coded) for image patches of 15 × 15 pixels; the color code identifies the cases in which the pixel pairs are on the same radius to the center position (rotation, A_rot), on the same angle relative to the center position (scaling, A_sca), outside a circular receptive field (r > r_max), or none of these. Left: For every matrix element a_{ij} = 1 the corresponding pixel pair (i, j) is connected by a line

Note that the quadratic form (x^T A x) appears in this formula next to a linear term (b^T x) and a constant c. The model produces a scalar output dependent on the values of A, b, and c. A parameterization of this type was used by Wiskott and Sejnowski (2002) in the slow feature analysis model. Different from their model, we will reduce the polynomial model by assuming b = 0 and c = 0. One advantage of using only the quadratic form instead of the full polynomial model is the reduced number of parameters. A second advantage is that all the parameters a_{i,j} (entries of the matrix A) are of one type, i.e., of comparable size/variance/meaning, which is favorable for algorithms exploring state space. Analysis of the state space will be done in more detail in Section 6.1 on page 114.

5.1. The Choice of a Binary Valued A

Before we go into more detail with respect to general properties of the transformation S, we point out other 'natural' choices for a binary A matrix. In Section 4.2 on page 87 we have seen that the formalism of comparing an image with the transformed image boils down to a specific quadratic form (under certain assumptions). In the quadratic form each matrix element a_{i,j} selects a specific pair of pixels. The knowledge of which pairs of pixels we would like to combine therefore defines different quadratic forms (matrices A). We have seen already

95 5. The Binary Valued Quadratic Form that the choice we make here can be interpreted as designing a two-point cor- relation between pixel. In the following, the coordinates of a pixel i will be denoted by π-periodic4 polar coordinates (ri, θi) in order to simplify the definitions. Rotation: The goal is to let A = Arot encode the model equivalent of ro- tational symmetry of the image pixels around their mean position. This cor- responds to the model in Equation 4.9. Again, we do not attempt to ’rotate’ the image by A but code the general correlation structure found in perfect rotationally symmetric (centered) pattern. We can incorporate our knowledge about the pixel pairs that appear if we rot make successive rotations of an image patch as described in Section 4.2. aij (i = j) will be set to one if ri rj < ε (i = j) and otherwise to zero. ε can arbitrarily6 be chosen to be the| distance− | of two6 neighboring pixels on the outer perimeter of the circular receptive field (see Figure. 5.1, left). sca Scale: In a similar way A is constructed to encode scaling. Entries aij of the matrix will be set to 1 only if the angular distance between i and j is sufficiently small, i.e., if θi θj < ε (i = j) (see Figure 5.1, left). We are aware that we also code for| −symmetries| that6 include inversions: scaling must not allow for pixel to cross the origin. We find that the choice of a specific pixel pair subset does not uniquely define a single symmetry operation. This points to a principle weakness of the ansatz of using the quadratic form. We may not uniquely define a symmetry transformation by using a binary choice for ai,j. Shift: Ashift is constructed to encode the pixel pairs that match for transla- tions along the vertical direction. Correspondingly a (i = j) is set to 1 only if ij 6 the x-coordinate of the pixel xi equals the x-coordinate of the pixel xj (xi = xj) and otherwise to zero. Mirror: Amirror is constructed to encode the pixel pairs that occur for mir- roring pixels along a chosen mirror axis (assumed to go through the image center). To illustrate the use of this transformation in Figure 5.2 on the next page we computed the resulting scalar value of the quadratic form for some (equally sampled) directions of the mirror axis and displayed them in a color code. The image shows an auto-radiogram of a gerbil brain slice (data courtesy by Andreas Hess).

5.2. The Properties of S

To understand the properties of S it helps to show what information is lost or retained if X is projected into the space of S(X). For example, it is easy to see that S is invariant to changes in sign of the local contrast gradient.

⁴We do not distinguish a point x from a point −x.


Figure 5.2.: Mirror axes detection example. The image displayed left is an auto-radiograph of a gerbil brain slice (data courtesy of Andreas Hess). Right: symmetry values are obtained by repeatedly selecting 5000 pixel pairs corresponding to a specific mirror axis (see Figure 5.5) and calculating the sum of their pixel gray value products. The inset shows the profile of the symmetry value over orientation


Figure 5.3.: Expressing data as dot-products yields a sparse representation. Two independent (Gaussian) sources s1 and s2 are transferred into the space of dot-products. The histogram of the marginal distribution s1·s2 is shown. Whereas the underlying sources are Gaussian (solid line), in product space we observe a high fourth order moment (kurtosis > 0)

We have already shown that S is sensitive to the correlations among particular input pairs or groups, which makes it a more powerful model than linear or threshold units (see Section 4.2.1 on page 91).

5.2.1. S is Linear in Contrast

Covariances in S (using Equation 4.7) can be expressed as correlations weighted by the non-normalized contrast of the image. Assuming that X is a random variable with finite second moments it follows that cov(X, X^θ) = σ² ρ(X, X^θ), where ρ(·,·) is the correlation function and σ² is the variance of the pixel intensities. σ is also a non-normalized measure of the contrast in X (compare with Section 2.1 on page 17). The covariance will therefore be large if the variance, i.e., the local contrast, is large. Contrast sensitivity was found to be a crucial ingredient in explaining human active vision. In an active vision task humans prefer to look at image regions with high spatial contrast (Reinagel and Zador, 1999). It would be of interest how this preference changes for contrast-normalized images.

5.2.2. The Moments of S

If we assume the input patch X to be a random variable, then S(X) is also a random variable. For known input distributions X it is now of interest how the transformation changes the distribution p(X). To verify this we analysed the

transformation for one of the simplest distributions p(X), that of a white noise distribution. We found that the distribution p(S) is non-white, with high skewness and high kurtosis (see Appendix B.3 for the detailed calculation and Figure 5.5). This 'generation' of higher order moments is not surprising if we keep in mind that in calculating pixel pairs we introduce correlations between pairs of pixels, because a single pixel appears in more than one product. Algorithms relying on distributions which are strongly non-Gaussian could therefore benefit from this data representation (we will use this property in later chapters in the context of feature extraction by ICA). On the other hand, it is clear that we have introduced redundancies (the number of pixel pairs is always much larger than the number of single pixels) which have to be sensible to be of use for the algorithm. Here we have to take care of two interwoven effects. One is the re-coding of information by a change of the coordinate system (as in PCA), whose use is clear on intuitive grounds; in our transformation the data is explicitly changed into the coordinate system of the two-point covariances. The other effect is the introduction of new dimensions, which in our case is bound to the goal of the re-coding.

5.2.3. Dependence on RFS and Preferred Orientation

Analytical solutions for S(X) can also be derived for specific stimulus structures. Only the results are repeated here; for the derivation see the Appendix. Given a horizontal edge (binary values) in the receptive field of size r_max, we found (see Figure 5.4, A, and Appendix B.2 on page 183 for the derivation) that the symmetry value depending on the orientation θ of the edge is:

C(\theta) = \begin{cases} \pi/2 + \theta & : -\pi \le \theta < 0 \\ \pi/2 - \theta & : 0 \le \theta < \pi \end{cases}    (5.5)

C(\theta) will produce zero response if \theta is orthogonal to the edge orientation and non-zero response otherwise. Note that this can be used to implement orientation selectivity in S if \theta is restricted to a subset of all orientations. Changing the stimulus to a disc-like structure with radius r we found (see B.1 on page 182) that

S(r, r_{max}) = 2\pi r \left( 1 - 2 \frac{r^2}{r_{max}^2} + \frac{r^4}{r_{max}^4} \right).    (5.6)

This latter function (see Figure 5.4, B) is zero at r = 0 and r = r_max and has a maximum in between at r = r_max/\sqrt{5} \approx 0.45 r_max.


Figure 5.4.: Edge stimulus (A, top) and the corresponding analytical solution (A, bottom) of C(θ), implementing orientation selectivity. Disc stimulus (B, top) and the corresponding analytical solution (B, bottom). Non-zero response is restricted to the intermediate range 0 ≤ r ≤ r_max

So a disc stimulus leads to maximum response if it fills nearly half the receptive field. This finding is interesting because it coincides with measurements of receptive fields of cortical neurons. One usually distinguishes between the stimulus size that elicits maximum response (the receptive field) and the extra-classical receptive field or integration region, which is found to be (i) much larger and (ii) acting inhibitorily (Solomon, White and Martin (2002), LGN, marmoset; Cleland, Lee and Vidyasagar (1983), LGN, cat; but see Felisberti and Derrington (2001) for contradicting results).
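A short numerical check of the maximum of Equation 5.6 (not from the thesis):

import numpy as np

r_max = 1.0
r = np.linspace(0.0, r_max, 100001)
S = 2 * np.pi * r * (1 - 2 * r**2 / r_max**2 + r**4 / r_max**4)
print(r[np.argmax(S)], r_max / np.sqrt(5))   # both approximately 0.447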

5.2.4. Stability Analysis

The computational cost of computing S is of the order O(n²), where n is the number of pixels in our receptive field (because the symmetry measure is based on pixel pair statistics). Compared to the filtering techniques usually used (e.g., linear filters), an algorithm using this measure may not be feasible in real-time applications. To test its real-time performance we implemented the rotational symmetry detection algorithm in C++ using DirectX 8.1 on a laptop computer (600 MHz, PIII). Using a commercial web-cam (320 × 200 pixels, Philips ToUCam Pro) we could calculate symmetry values at receptive field sizes of 5 × 5 pixels in the order of 1 frame per second.


Figure 5.5.: Example of sparse mirror selection filters for specific orientations (0°, 45°, 90°, 135°, 180°). Pixel pairs are shown by connecting each pixel pair by a line

One of the simplest methods that can be used to speed up this computation is to reduce the number of pixel pairs processed. To demonstrate the stability of S we used the quadratic form x^T A^mirror x for the detection of mirror symmetries (see Section 5.1 on page 96). We reduced the number of pixel pairs (entries a_{ij} = 1 in the matrix) by random deletion from the original set. The resulting structures for sparse A^mirror matrices are shown in Figure 5.5, sorted for different (approximate) mirror axes. The filters, parameterized by their mirror axes, were applied repeatedly to the image in Figure 5.2 on page 97 (left). We found that approximate symmetry values can be computed reliably over sub-populations of pixel pairs (see Figure 5.6 for results). In computer simulations the algorithm seems to be stable even if only very few pairs are used. This remarkable stability in the face of large numbers of missing values is largely explained by the relative smoothness of the image.
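The stability test can be sketched as follows (an illustration with a toy image and a hypothetical mirror-pair set, not the gerbil data or the original code); the normalized symmetry value stays nearly constant as the fraction of retained pairs shrinks:

import numpy as np

def sparse_pair_symmetry(image, pairs, keep_fraction, rng):
    # approximate symmetry value from a random sub-population of pixel pairs,
    # normalized by the number of pairs used so that different fractions are comparable
    n_keep = max(1, int(keep_fraction * len(pairs)))
    idx = rng.choice(len(pairs), size=n_keep, replace=False)
    return float(np.mean([image[pairs[k][0]] * image[pairs[k][1]] for k in idx]))

rng = np.random.default_rng(0)
img = rng.random((64, 64))
# partners mirrored about the vertical axis through the image center
pairs = [((y, x), (y, img.shape[1] - 1 - x)) for y in range(64) for x in range(32)]
for frac in (1.0, 0.1, 0.02):
    print(frac, sparse_pair_symmetry(img, pairs, frac, rng))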

5.3. Inverting the Symmetry Detection

In the following section we will concentrate on the rotational symmetry case (A = A^rot). The stochastic algorithm proposed is, however, not restricted to this specific choice of A. It is used here as a general tool to visualize the information content of the 'naive' symmetry detection algorithm (binary valued quadratic form) for which A is known. By inverting a transformation we can, in principle, obtain the original data. This idea is widely used to enhance the result of an image acquisition task (de-noising). Let us assume that a process T corrupts an image x ∈ IR^n during measurement, T(x) = y. If the characteristics of the process, i.e., its transformation T, are known, then by computing the inverse transformation T^{-1}: IR^n → X (provided that it exists) we can obtain the undisturbed data T^{-1}(y) = x̃ = x. The problem is always to invert T. A Gaussian low-pass filter, for example, will generally not be invertible, which states that we have lost information about the image by applying the low-pass. If the process T acts in a linear way, in


Figure 5.6.: Left: Mean symmetry value over the orientation of the mirror axis for different numbers of pixel pairs for the gerbil brain data of Figure 5.2 on page 97. Values in the legend (0.0002 to 0.0243) are ratios of the number of pixel pairs with respect to the maximum number of pixel pairs for a given mirror axis (half of the number of pixels in the circle). Right: Low variance over 10 runs indicates that even with 2 percent of the pixel pairs the mirror axis could be computed reliably

other words, T can be expressed as an (m, n)-matrix; this matrix may be either singular, close to singular, or not square. In all three cases we cannot invert T. If T has full rank n we can use the pseudo-inverse T^+ = (T^T T)^{-1} T^T as a 'workaround'. It can be used to describe the projection of the signals onto the subspace generated by a finite family of basis signals (the rows of the matrix T). Apart from computing the pseudo-inverse of the transformation matrix, one can use the framework of an energy functional. In this framework additional knowledge about the data can easily be incorporated in order to get a good approximation of x. Let T(x) = S(x) = y. A goal function is used to describe the quality of an estimated signal x̃, which is successively improved during the optimization procedure. In our setup an x̃ will be considered as having low energy or low error if T(x̃) is sufficiently close to T(x), that is, we compare x and x̃ in terms of their representation in feature space. Starting from an initial x̃ (e.g., Gaussian white noise) we measure its likelihood by comparing the corresponding ỹ with the known data y. Successively we change x̃ in order to lower the corresponding energy (or error) function by an algorithm adopted from a Markov random field (MRF). Now we would like to present the energy function used. The values y obtained by S(X) represent the local correlation of a sub-set of pixels in the image patch. The sub-set is defined by the choice of A = A^rot. Here all pixel pairs contain


Figure 5.7.: The pixel-o-centric picture of the world. Left: For pixel pair (i, j) three example circles representing possible associated cliques are shown (living on the perimeter). Line AB is a line perpendicular to ij onto which all these cliques are centered. The connection between i and j is therefore defined as incorporating their respective pixel values but also information about the covariance structure on the line AB. Right: The length of the perpendicular lines AB scales with the size of the receptive field of X and with the distance between the pixels (|AB| = rfs − |ij| + 1)

103 5. The Binary Valued Quadratic Form

Figure 5.8.: Right: Original image X (rice.tif from Matlab). Left: The coefficients c^k = ||S(x_k)||_1 for each corresponding pixel position (rfs = 5). Edges elicit the highest contrast in the image

product of pixel values x_i and x_j equals the associated covariance c^k_{ij}:

F(\tilde{x}, c) = \sum_i \frac{1}{N} \sum_{j \in I_i} \Big( \sum_k (\tilde{x}_i \tilde{x}_j - c^k_{ij}) \Big)^2    (5.7)

I_i is the index set of all pixels in a receptive field of size 2·rfs around i, N is the overall number of single operations of the type (x y) − c, and k counts a discrete number of clique centers (symmetry values of x_k). All values c^k are assumed to be equally important. Because they are obtained for cliques of changing radius, and thus changing spatial distance between the pixels, we herein implicitly assume that the covariance between any two pixels does not depend on their distance.
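A minimal sketch of this energy, assuming the clique structure is given as a mapping from pixel pairs to their target covariances (names and data below are illustrative only):

import numpy as np

def energy(x_tilde, cliques, N):
    # Equation 5.7: squared mismatch between pixel products and their target covariances
    F = 0.0
    for (i, j), c_list in cliques.items():
        residual = sum(x_tilde.flat[i] * x_tilde.flat[j] - c for c in c_list)
        F += residual ** 2 / N
    return F

x0 = np.random.randn(2, 2)
targets = {(0, 3): [0.2], (1, 2): [-0.1]}          # hypothetical c^k_ij values
print(energy(x0, targets, N=len(targets)))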

Minimizing F

There are several ways to minimize the above energy function. We use a stochastic algorithm, which is very simple to implement, and furthermore a gradient descent algorithm, which has some advantages in speed. In both cases x̃ was initially set to (uncorrelated) Gaussian white noise. The c^k_{ij} were computed beforehand by the symmetry detection algorithm. The low energy state of the system therefore corresponds to the image that most likely causes the observed symmetry values. In Figure 5.8 an example image and the corresponding covariances c = ||S(x)||_1 are shown. High values are found predominantly at edges in the image.


Figure 5.9.: Left: Reconstructed image x̃ with the algorithm of Metropolis et al. (1953). Right: High values of the energy function used (Equation 5.7) are shown in white and indicate errors remaining because of the deterministic annealing scheme

5.3.1. Minimizing F by Metropolis Algorithm

Minimizing our energy function is done by using the idea that a system in thermal equilibrium at temperature T has its energy probabilistically distributed among all different energy states F.

Prob(F) \approx \exp\left( -\frac{F}{kT} \right)    (5.8)

The quantity k (Boltzmann's constant) is a constant of nature that relates temperature to energy. Using this formula there is a corresponding chance for the system to get out of a local minimum: sometimes it goes uphill as well as downhill (in the energy landscape), but the lower the temperature, the less likely is any significant uphill excursion. This is implemented by the algorithm of Metropolis, Rosenbluth, Rosenbluth, Teller and Teller (1953). We need to specify: (i) a description of the possible system configurations (x̃ initialized by white noise); (ii) a generator of random changes in the configuration; these changes are the 'options' presented to the system (done by adding Gaussian noise to single pixel gray values); (iii) an objective function whose minimization is the goal of the procedure (Equation 5.7); and last (iv) a control parameter T (which works analogously to temperature) and an annealing schedule which tells how it is lowered from high to low values, i.e., after how many random changes in configuration one downward step in T is taken (and how large that step is).
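The recipe (i)-(iv) can be condensed into a short sketch (an illustration of the Metropolis step and a simple annealing schedule, not the thesis implementation; the toy energy below only matches two pixel products to hypothetical target covariances):

import numpy as np

def metropolis_step(x_tilde, energy_fn, T, rng, sigma=0.1):
    # perturb one random pixel with Gaussian noise and accept with probability exp(-dF/T)
    i = rng.integers(x_tilde.size)
    proposal = x_tilde.copy()
    proposal.flat[i] += sigma * rng.standard_normal()
    dF = energy_fn(proposal) - energy_fn(x_tilde)
    if dF < 0 or rng.random() < np.exp(-dF / T):
        return proposal
    return x_tilde

rng = np.random.default_rng(0)
targets = {(0, 3): 0.2, (1, 2): -0.1}
E = lambda x: sum((x.flat[i] * x.flat[j] - c) ** 2 for (i, j), c in targets.items())
x = rng.standard_normal((2, 2))
T = 1.0
for _ in range(2000):
    x = metropolis_step(x, E, T, rng)
    T *= 0.999                                   # deterministic annealing schedule
print(E(x))                                      # energy close to zero after annealing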



Figure 5.10.: Switching context for judging foreground versus background. The figure A resembles either four objects that are in close vicinity (B) or a single star-like object (C)

The value of x̃ corresponding to the low energy state of the procedure can be seen in Figure 5.9 on the page before. Due to the symmetric nature of the products, the algorithm cannot distinguish between different signs of the local contrast gradients (see Section 5.2 on page 96). This explains why the algorithm finds white as well as black representations for the elongated objects (compare with Figure 5.8 on page 104, left). Because the objects in the image are larger than the field size of our Markov random field, solutions found at opposite ends of these objects are initially uncoupled and may by chance contain a change in sign. Whereas both solutions are optimal on their own, they evolve together during the optimization and enforce an edge. These types of errors light up in Figure 5.9, right, where the final energy is plotted. Interestingly, the algorithm repeatedly inserts 'circles' in x̃ which have no correspondence in the original image (see Figure 5.8, left, and Figure 5.9, left). We can explain this behavior by the observation that, in a close-up look, the corresponding regions in the original image contain structured background configurations. Because there is no clear distinction between the role of figure and background, our algorithm interprets 'interesting' formations of background as structured objects, which are represented in the reconstructed image by a proto-typical object, e.g., an open circle. Figure 5.10 illustrates this effect of ambiguous assignment of foreground and background, which can also be found in many illusory figures. In this sense the representation of images in terms of symmetry lacks a prior about what is background and what is foreground in the image. Additional information from outside the receptive field is needed to prefer one solution over the other. Information about the global distribution

of light and dark regions could provide such a bias, because objects in the foreground tend to be more compact, i.e., enclosed by background regions. We also observe a relatively high amount of noise in the background. This is due to the initialization of x̃ as Gaussian white noise, which is preserved by the algorithm, because low symmetry values in the corresponding regions result in low overall energy in those regions, regardless of the pixel variance. In the next section we will deal with this problem by introducing an additional weight decay.

5.3.2. Minimizing F by Gradient Descent

An annealing algorithm finds (in principle) the global minimum of an energy function. But this holds only for a very time consuming and therefore impractical annealing schedule. To speed up the computation we now introduce a gradient descent algorithm. It involves a measure of the direction, or gradient, in parameter space along which the energy becomes smaller. Given Equation 5.7 on page 104 we differentiate our energy F(x̃, c) with respect to a pixel x̃_i:

\frac{dF_j}{d\tilde{x}_i} = \frac{1}{N} \sum_{j \in I_i} \Big( \sum_k 2(\tilde{x}_j \tilde{x}_i - c^k_{ij}) \tilde{x}_j \Big)    (5.9)

Using this formula we iteratively adjust the gray value of a single pixel by

\tilde{x}_i^{new} = \tilde{x}_i^{old} + \nu \frac{dF_j}{d\tilde{x}_i}    (5.10)

where \nu is the learning rate and is kept fixed at \nu = 0.5. Again we observe the same basic result as for the Metropolis algorithm (see Figure 5.9). As a short extension to the previous algorithm we now introduce a weight decay term that should deal with the high variance in the background regions (see Figure 5.11 on the following page, left). We add a constraint on the variance of x̃ to the energy function 5.7:

\sum_i x_i^2 = 1.    (5.11)

Using the method of Lagrange multipliers, our new energy function F^2 is now (\sum_i \tilde{x}_i^2 - 1 = 0)

F^2(\tilde{x}, c) = \sum_i \frac{1}{N} \sum_{j \in I_i} \Big( \sum_k (\tilde{x}_i \tilde{x}_j - c^k_{ij}) \Big)^2 + \lambda \Big( \frac{1}{M} \sum_m x_m^2 - 1 \Big)    (5.12)


Figure 5.11.: Left: Reconstructed image x̃ by gradient descent. Right: High values of the energy function used (Equation 5.7) are shown in white. In background regions (bottom right) stable noise patterns can be observed

where m goes over all M pixels in the image. Our learning rule is therefore slightly changed compared to Equation 5.9 to

\frac{dF^2_j}{d\tilde{x}_i} = \frac{1}{N} \sum_{j \in I_i} \Big( 2(\tilde{x}_j \tilde{x}_i - c^k_{ij}) \tilde{x}_j \Big) + 2\tilde{x}_i    (5.13)

λ was chosen arbitrarily, i.e., it was set to 1. The result of this operation can be seen in Figure 5.12 on the next page. As expected, the constraint acts as a decay term removing the high pixel variance.
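A sketch of the corresponding gradient descent with weight decay (again on a toy target set; the downhill step and the treatment of the 1/M factor are simplifications of this edit, not the original implementation):

import numpy as np

def gradient_step(x_tilde, targets, nu=0.05, weight_decay=True):
    # gradient of the pair-covariance energy (cf. Equation 5.9) plus the
    # weight decay term of Equation 5.13 with lambda = 1
    grad = np.zeros_like(x_tilde)
    for (i, j), c in targets.items():
        r = x_tilde.flat[i] * x_tilde.flat[j] - c
        grad.flat[i] += 2.0 * r * x_tilde.flat[j]
        grad.flat[j] += 2.0 * r * x_tilde.flat[i]
    if weight_decay:
        grad += 2.0 * x_tilde
    return x_tilde - nu * grad                    # step downhill in energy

rng = np.random.default_rng(0)
targets = {(0, 3): 0.2, (1, 2): -0.1}             # hypothetical target covariances
x = rng.standard_normal((2, 2))
for _ in range(200):
    x = gradient_step(x, targets)
print(sum((x.flat[i] * x.flat[j] - c) ** 2 for (i, j), c in targets.items()))

The remaining mismatch is small but non-zero because the decay term pulls the pixel variance toward the constraint.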

5.4. Applications

This section summarizes some attempts to use the methods derived so far in the context of object classification, image alignment, and as a tool for automatic landmark detection. The work presented here about the binary valued quadratic form is merely a prelude to the next chapter. Correspondingly, the applications presented are thought of as illustrations of the potential applicability of the method.

5.4.1. Object Classification

Object classification can be formulated as an incomplete data problem. In the classification process we search for an unknown class label j associated with each pixel, indicating which component density function was responsible for its generation.


Figure 5.12.: Left: Reconstructed image x̃ by gradient descent and additional weight decay. Right: High values of the energy function 5.7 used are shown in white

Because we know that single pixels contain little information, one generally uses a feature space representation of the image for the classification of pixels. We are interested now in using the symmetry measure S(X) as a representation of the data in feature space. The classification algorithm is applied on toy data and outlines a particular direction of research which we think is worthwhile to follow (see Figure 5.13). One of the best known algorithms for incomplete data problems is based on probability density estimation with a Gaussian mixture model, which can be trained with the Expectation Maximization (EM) algorithm (Dempster, Laird and Rubin, 1977). A model of 9 multivariate Gaussian densities (diagonal covariance matrices) was trained on feature vectors extracted from overlapping image patches of size 32 × 32 according to S(X) = (s_1(X), . . . , s_20(X)) (Equation 4.7 on page 88). The focus of the symmetry transformation was therefore just large enough to contain one circular or triangular object, respectively, in the toy data image. For learning we selected feature vectors in the upper 70% region of the L1-norm of S. This restricts the algorithm to non-background regions around the objects. This heuristic can be understood as a first approach to active vision, in which we change the relative frequency of occurrence of particular image regions according to their (pre-attentive) features. In our case this was necessary because of the small number of training examples (only one image was used for training and testing). In Figure 5.13 one out of nine clusters, respectively its posterior probabilities P(x|j), is plotted in dark gray overlaid on the image (light gray). It turns out that for this image clusters in the feature space could be found that correspond to solely circular or triangular objects (the cluster for the circular data is not shown).


Figure 5.13.: Unsupervised image segmentation. Left: Test image containing triangular and circular shaped objects. Right: For cluster nr. 2 the posterior probabilities overlaid on the image are shown (selecting the central regions of the triangular objects). Note that the model is learned on only this single image. Two clusters (only one is shown) correspond to the two object classes.

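A compact way to reproduce this kind of experiment is to fit a standard Gaussian mixture to the symmetry feature vectors and inspect the posterior probabilities per patch. The sketch below is only an illustration: it assumes a feature matrix `S_feat` of shape (n_patches, 20) holding s_1(X), …, s_20(X) for every patch position; the threshold on the L1-norm and the number of components follow the text, while the library choice and variable names are arbitrary.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def cluster_symmetry_features(S_feat, n_components=9, keep_fraction=0.7, seed=0):
    """Unsupervised clustering of symmetry feature vectors (one per patch).

    S_feat : array of shape (n_patches, 20) with the symmetry responses s_k(X).
    Returns the fitted mixture and the posterior probabilities for all patches.
    """
    l1 = np.abs(S_feat).sum(axis=1)
    # keep the patches in the upper 70% of the L1-norm (non-background regions)
    threshold = np.quantile(l1, 1.0 - keep_fraction)
    train = S_feat[l1 >= threshold]

    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="diag",
                          random_state=seed).fit(train)
    posteriors = gmm.predict_proba(S_feat)     # P(j | x) for every patch
    return gmm, posteriors
```

Overlaying one column of `posteriors` at the corresponding patch positions produces plots of the kind shown in Figure 5.13, right.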

5.4.2. Image Alignment

One case in which shape analysis is regularly used in practice is the registration of digital images. Often single images from different measurements display connected information, images of consecutive brain slices for example, but due to the measurement procedure the single images are distorted. One of the first pre-processing steps before one can attempt a 3D segmentation or reconstruction is to realign (register) the images, in the simplest and most often used case by performing rigid transformations. In order to automatically recover the original axis of rotation of a stack of images, for each image its major and minor axis are computed. Major and minor axis represent the directions in which the object shows its maximum and minimum elongation, respectively. Knowing these axes, each image is rotated onto a common reference axis.

The procedure is the following: a binary map is computed which represents object and background regions. The eigenvectors of the covariance of the object pixel coordinates in the map then represent the major and minor axis of the shape of the object. Obviously this procedure works only for images showing a clearly preferred axis of elongation. The brain slice in Figure 5.2 on page 97 shows such an axis. In effect the procedure is sensitive to the proximal parts of the frontal and of the occipital lobe of the cerebrum. A mismatch due to the measurement procedures is very likely in these regions.


Figure 5.14.: Left: Shape analysis detects the largest variance direction. Right: Symmetry detection correctly finds the direction of largest (mirror) symmetry.

Because the method to extract the eigenvectors is very sensitive to the variance (that is what it measures) it may come up with directions that are largely influenced by errors in the measurement and not by the shape of the object. Principal component analysis is also not applicable if the shape of the object is nearly circular, if the shape border is noisy, or if the shape is mainly defined by changes due to the measuring method. In order to introduce our method we assume now that every image which is subject to re-alignment possesses structural information in its gray values that is used in manual alignments.

Detection of the mirror axis is not solely based on the shape of the object but also on its gray value structure; thus it is predestined to explore the symmetrical structure of the two hemispheres visible in the horizontal section. In order to automatically recover the direction of the maximum mirror symmetry we used the Nelder-Mead simplex algorithm (Nelder and Mead, 1965), which adjusts the parameters of a goal function. The goal function (which is minimized during the optimization) was calculated dependent on a direction (mirror axis), position, and size of the receptive field. Using the result from the stability analysis (Section 5.2.4 on page 100) we selected 5000 pixel pairs from the pool of possible pairs (positively correlated pixel pairs) for the given receptive field size. We summed over the products of the selected gray value pairs to obtain a mirror symmetry index MI(X). The gray values in the image are assumed to be positive, therefore MI(X) is positive (no negative entries in A_mirror). Its inverse (1/MI(·) ≥ 0) was finally minimized by the Simplex algorithm.



Figure 5.15.: Left: The L1-norm of S (rotational symmetry). Right: Found landmarks overlaid onto the original image.

The obtained maximum mirror symmetry solution is shown in Figure 5.14, right. The algorithm converges fast and reliably, mainly due to the quality of the features and the apparently smooth energy landscape. For different orientations of the mirror axis the symmetry value (which is the inverse of the energy) is shown in the inset of Figure 5.2 on page 97. Notably, the size of the receptive field found by the mirror detection algorithm is greater than the outer perimeter of the brain slice. This reflects the analytical result in Section 5.2.3 on page 99 that the symmetry of an isolated object is maximal if the receptive field size is approximately twice as large as the object.

To construct a benchmark case we sub-sampled the brain slice of Figure 5.2 on page 97, reducing its vertical resolution by a factor of 2, and rotated the resulting image by an arbitrary angle, mimicking an extreme measurement error. By sampling 5000 pixels from the inside of the figure the major axis of the object was computed by principal component analysis. Due to the sub-sampling procedure the resulting direction of highest variance was the one shown in Figure 5.14, left. The direction of largest mirror symmetry is shown right and corresponds to the expected direction.
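The search over mirror-axis parameters can be reproduced with a generic simplex optimizer. The sketch below is an illustration, not the original implementation: mirror-symmetric pixel pairs are sampled at random inside a circular receptive field (the thesis selects them deterministically via A_mirror), and scipy's Nelder-Mead routine stands in for the optimizer; the starting values are arbitrary.

```python
import numpy as np
from scipy.optimize import minimize

def mirror_symmetry_index(image, angle, center, size, n_pairs=5000, seed=0):
    """Sum of gray value products over mirror-symmetric pixel pairs, MI(X).

    Pairs are sampled inside a circular receptive field of radius `size`
    and reflected across the axis through `center` with orientation `angle`.
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape
    cy, cx = center
    # random offsets inside the receptive field
    r = size * np.sqrt(rng.random(n_pairs))
    phi = 2 * np.pi * rng.random(n_pairs)
    dy, dx = r * np.sin(phi), r * np.cos(phi)
    # reflect the offsets across the mirror axis with direction (sin a, cos a)
    a = np.array([np.sin(angle), np.cos(angle)])
    proj = dy * a[0] + dx * a[1]
    ry, rx = 2 * proj * a[0] - dy, 2 * proj * a[1] - dx
    yi = np.clip(np.round(cy + dy), 0, h - 1).astype(int)
    xi = np.clip(np.round(cx + dx), 0, w - 1).astype(int)
    yj = np.clip(np.round(cy + ry), 0, h - 1).astype(int)
    xj = np.clip(np.round(cx + rx), 0, w - 1).astype(int)
    return float(np.sum(image[yi, xi] * image[yj, xj]))

def find_mirror_axis(image, x0=(0.0, 64.0, 64.0, 80.0)):
    """Minimize 1/MI(X) over (angle, center_row, center_col, receptive field radius)."""
    def cost(params):
        angle, cy, cx, size = params
        return 1.0 / (mirror_symmetry_index(image, angle, (cy, cx), abs(size)) + 1e-12)
    return minimize(cost, x0=np.array(x0), method="Nelder-Mead").x
```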

5.4.3. Landmark Detection

We also tested if symmetry detection can be used to reliably detect local features in images. Again we used the brain slice data set introduced in the last sections. If one can reliably find local positions (landmarks) in two images, one can use these points to perform a warping of one image into the other. This case of a usually non-rigid transformation of the image is used, for example, to align slices obtained in different experiments (from different animals).

Later on one is interested, for example, in the improved statistics of the point-wise mean.

Landmark detection was started by computing the local rotational symmetry values of the image (L1-norm of S). The first landmark was selected by finding the pixel position with maximum symmetry value. At this position an image with a Gaussian gray value profile was subtracted from the symmetry value image⁵. In the changed image the procedure of finding the maximum symmetry value was continued until a fixed number of landmarks could be found. Special care has to be taken for large numbers (> 100) of landmarks, because the iterative subtraction of the Gaussian profiles constantly lowers the mean gray value of the image. In effect the maxima selected late in the process are from regions where few Gaussian functions were subtracted, e.g., from the background of the image. Because the size of the Gaussian function influences the mean distance of detected landmarks, it can be used beforehand to derive the maximum number of landmarks that can be detected by the method. The resulting landmark set is relatively uniformly distributed and prefers to select corresponding points in the two hemispheres (see Figure 5.15).
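The greedy peak-picking scheme is easy to sketch. The version below assumes a precomputed rotational symmetry map (the L1-norm of S at every pixel); the Gaussian width, the amplitude of the subtracted profile, and the number of landmarks are free parameters of the illustration rather than values from the thesis.

```python
import numpy as np

def detect_landmarks(symmetry_map, n_landmarks=50, sigma=5.0):
    """Iteratively pick maxima of a symmetry map, suppressing each found peak
    by subtracting a Gaussian profile centered at the landmark position."""
    s = symmetry_map.astype(float).copy()
    h, w = s.shape
    yy, xx = np.mgrid[0:h, 0:w]
    landmarks = []
    for _ in range(n_landmarks):
        y, x = np.unravel_index(np.argmax(s), s.shape)
        landmarks.append((y, x))
        # subtract a Gaussian bump so the same region is not selected again
        s -= s[y, x] * np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2.0 * sigma ** 2))
    return landmarks
```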

5.5. Concluding Remarks

In this chapter we have analysed a special case of our model. Assuming that we select specific pairs of pixels for the statistics, we have introduced a binary valued matrix A and suggested a design principle to implement symmetry transformations by that matrix. We found that the model could be used to code for rotations, scalings, and shifts. Symmetry detection can be applied to digital images reliably and is insensitive to noise. Symmetry detection was also shown to be selective to both foreground and background object configurations. In the application section it was shown that the binary quadratic form can outperform shape analysis and can be used for landmark detection.

The selection of the pixel pairs, and by this of an associated symmetry group, may sometimes appear to be obvious. But one has to be careful with intuition. The question remains: is there a better symmetry group to describe our data? Apart from the settings in a specific problem class it may not be obvious which type of transformation is best. In the next chapter the transformation A is learned in an unsupervised manner from the data and can therefore be adjusted to an unknown problem class.

⁵ This assumes that interesting points are modeled by singular points in the image that are subject to a Gaussian point spread function.

6. The Real Valued Quadratic Form

As we have seen in Section 4.2.1, the quadratic form can be analyzed in terms of a set of orthogonal basis functions. The basis functions appear as the eigenvectors of the matrix A in the quadratic form. They represent patterns that are invariant under the transformation. Each one is thought of as coding for one symmetry transformation. Using an ensemble of images and calculating the expectation of the quadratic form we found a non-trivial¹ solution by assigning to A the covariance matrix of the data. Image covariance calculated over many samples is a smooth function of the distance between pixels (see Section 1.3), i.e., it reflects that nearby pixels are correlated and distant pixels are uncorrelated or anti-correlated. Because of the extensive summing further aspects of the data are lost. That is, if there are underlying factors that produced the data and these factors are subject to different statistics, only their common part survives.

It is reasonable to expect that natural images are composed of different factors. These factors can, for example, be the objects present in a scene or specific (non-accidental) parts of objects. We have to point out here that they are certainly not added linearly to produce the image. Because of occlusions of (mainly opaque) objects in a visual scene a logical operation would be more appropriate, reflecting the presence of either one or the other object. But this is beyond the scope of the current work. For now, let us assume that these factors are added linearly. Each factor is represented by a specific symmetry operation and a large number of symmetry transformations may constitute our visual input. To learn more than one symmetry transformation we have, in essence, to find a model that can cope with more than one matrix A.

6.1. A Tensor Form for Symmetry Detection

The polynomial model of Equation 5.4 on page 94 can be extended to incorporate more than one constant (now a vector), more than one linear weight vector (now a matrix), and more than one quadratic form (now a tensor of third order):

\mathcal{P}_i(x) = \sum_{j,k} W_{ijk}\, x_j x_k + \sum_j V_{ij}\, x_j + u_i \qquad (6.1)

¹ Different from a diagonal matrix with a_{i,i} = 1 (identity matrix).


It is essential for our learning algorithm to keep from this model only the tensor W. We point here to the reasoning on page 95 and use the reduced model

\Phi_i(x) = \sum_{j,k} W_{ijk}\, x_j x_k. \qquad (6.2)

We can arrive at the same model by the following, more intuitive reasoning: we want to find a number N of symmetries, each one expressed by its own matrix A^i. Let us define a vector valued function Φ which maps the data into the space of the different symmetry operators:

\Phi(x) = \left(x^T A^1 x,\; x^T A^2 x,\; \ldots,\; x^T A^N x\right)^T \qquad (6.3)

Additional assumptions can be made for the distribution p(Φ), incorporating for example sparseness, independence, or uncorrelatedness. But before doing this, we first cast the problem into the notation of dot-products. This allows the analysis of a linear model in which assumptions can be justified more easily. Any quadratic form can be written as a linear weighting in the space of dot-products:

x^T A^i x = \left(a^i_{11}, \ldots, a^i_{kk}\right)\left(x_1 x_1, \ldots, x_n x_m\right)^T = w^i\, \mathrm{vec}(x x^T) = w^i p, \qquad (6.4)

where p consists of monomials of constant order 2 and vec(·) denotes the vector formed from the entries of its matrix argument. We readily see that, because of the commutativity of multiplication (x_i x_j = x_j x_i), the full matrix A generates each basis function j ≠ i twice. The respective feature space would therefore always be degenerate (data points are concentrated in a linear subspace). By removing the obsolete basis functions we obtain the property of an effective transformation. We eliminate the respective entries from w by introducing the lower triangular form (vech) of the matrix A. Adding more dimensions to the model Φ, i.e., more quadratic forms x^T A^i x, we arrive at

Φ(p) = W p. (6.5)

Here, W is a square matrix, but Φ performs the same computation as in the tensor notation of Equation 6.2. It can be viewed as a linear mixture, expressed by the mixing matrix W, of the signals p in the space of dot-products. In the case of a single quadratic form A^0, W is a row vector and the model reduces to the quadratic form. In order to learn this model we compute the correlation matrix C = E_x(x x^T) of the data and set W to its vectorized, lower triangular form W = vech(C) (see Section 4.2.1 on page 91). Consequently, for a square


matrix W (with as many A matrices as components in vech(xx^T)) we need to learn a mixture of covariance matrices on the data. To arrive at a learning algorithm for this non-linear model, (overcomplete) independent component analysis is introduced. In solving this special case of ICA, a novel learning algorithm will provide us with an efficient way to learn the parameters of the model in Equation 6.5.
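The mapping from a patch x to the dot-product vector p = vech(xx^T) is the only non-linear step of the model and is straightforward to implement. The following sketch is a minimal illustration; patch size and variable names are arbitrary.

```python
import numpy as np

def vech(M):
    """Stack the lower triangular part (including the diagonal) of a square matrix."""
    rows, cols = np.tril_indices(M.shape[0])
    return M[rows, cols]

def dot_product_features(x):
    """Map a patch x (flattened to a vector) into the space of second order
    monomials p = vech(x x^T), cf. Equations 6.4 and 6.5."""
    x = np.asarray(x, dtype=float).ravel()
    return vech(np.outer(x, x))

# a quadratic form x^T A x then becomes the linear function w . p with
# w = vech(A + A.T - np.diag(np.diag(A))), since every off-diagonal monomial
# x_i x_j appears only once in p
```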

6.2. Independent Component Analysis

Let us assume that the data x at a point in time t consists of a number m of observed variables. All together we have T different observations x_i(t), where i = 1, . . . , m and t = 1, . . . , T (T ≫ m). We are interested in a function that maps the m-dimensional space given by x(t) to an n-dimensional space, such that the transformed variables explicitly contain information on the data that is otherwise hidden. The transformed variables should capture the underlying factors that describe the essential structure of the data. Ideally the factors correspond to processes that have generated the data in the first place. But this correspondence is non-trivial and is usually established afterwards, by looking at the found factors and proposing some ideas about a connected physical process. A simple model of the data is based on linear functions². Every factor or component s_i(t) can be expressed in this framework as a linear combination of the observed variables:

s_i(t) = \sum_j w_{ij}\, x_j(t). \qquad (6.6)

Assuming that the observations x(t) are independent from each other, i.e., that a re-ordering of the observations does not destroy essential information, we can consider x a realization of a random process. In matrix notation the linear system is defined by

s = W x. (6.7)

Because we know only the observations x and have to calculate the de-mixing or separation matrix W together with the underlying factors s, this type of problem is termed blind source separation. The forward model for the generation of the data is a linear basis plus additive noise:

x = W^{-1} s + \nu, \qquad (6.8)

² There is a (hidden) sign in every lab stating: 'Please Lord, let the world be linear, stationary, and Gaussian.'


where W^{-1} is an M × N mixing matrix with N = M and ν is additive Gaussian white noise. The basis functions appear as columns of this mixing matrix. The corresponding data likelihood is

\log P(x; W^{-1}, s) \approx -\frac{1}{2\sigma^2}\left(x - W^{-1}s\right)^2, \qquad (6.9)

where σ² is the variance of the noise. Because the de-mixing matrix W is expensive to compute, there is a strong incentive to find basis functions that are easy to invert.

Algorithms of principal component analysis are known to de-correlate the data, thus learning an orthogonal basis set. Independent component analysis assumes the statistical independence of the sources, which incorporates de-correlatedness as a pre-requisite. By statistical independence we indicate that the value of one component x gives no information on the values of the other component y. We can define this mathematically in terms of the probability densities of the two involved random variables. They are statistically independent if and only if the joint density (or the cumulative distribution function) factorizes into the product of the marginal densities, p_{x,y}(x, y) = p_x(x) p_y(y). Only for Gaussian distributions are uncorrelatedness and statistical independence equivalent, i.e., uncorrelated Gaussian components are always independent. This is indicated by the fact that all higher order moments (higher than 2) of a Gaussian distribution can be expressed by its first two moments. Thus the higher order moments contain no additional information. This is used explicitly in the transformation of the moments of a function to the cumulants of the function (see Appendix). For a Gaussian distribution the cumulants of order higher than 2 are always 0. This is only approximately true for finite samples from Gaussian distributions. The higher the empirical moment, the higher its sensitivity to outliers.

An intuitive principle of estimating independent components is based on the measurement of higher order moments or cumulants. One assumes that densities of mixtures of statistically independent sources are more similar to Gaussian densities than the densities of any single (unmixed) component. Thus we can judge the quality of an estimated de-mixing matrix W̃ by the similarity of the densities p(s̃_i) (W̃ x = s̃) to the Gaussian distribution. By finding sources which are as non-Gaussian as possible we solve our problem. Non-Gaussianity is usually measured as non-zero third or fourth order cumulants.
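The heuristic that mixtures look more Gaussian than the unmixed sources can be checked numerically with the empirical kurtosis. The snippet below is only a toy demonstration with Laplacian sources and a random mixing matrix; all names and parameter choices are arbitrary.

```python
import numpy as np

def excess_kurtosis(v):
    """Empirical excess kurtosis E[(v - mean)^4] / var^2 - 3 (zero for a Gaussian)."""
    v = v - v.mean()
    return float(np.mean(v**4) / np.mean(v**2) ** 2 - 3.0)

rng = np.random.default_rng(0)
s = rng.laplace(size=(2, 100_000))      # two sparse (super-Gaussian) sources
A = rng.standard_normal((2, 2))         # random square mixing matrix
x = A @ s                               # observed mixtures

print([excess_kurtosis(si) for si in s])  # clearly positive (super-Gaussian)
print([excess_kurtosis(xi) for xi in x])  # closer to zero: mixtures are more Gaussian
```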

It helps to think of W as a set of basis vectors w_i (rows of W), and the assumption of non-Gaussianity is expressed in the non-Gaussianity of the marginal distributions of the dot-products w_i^T x. This fact relates ICA to a method called projection pursuit, which is used


mainly for visualization purposes and looks for maximally interesting distributions, which are those with large higher order cumulants.

6.2.1. Overcomplete ICA

As we have seen in the previous section, algorithms of independent component analysis seek to find the de-mixing matrix W and the underlying sources s given a hopefully large set of observations x(t). The model assumes that the observations are stationary, linearly mixed sources:

x = W^{-1} s = A s. \qquad (6.10)

W^{-1} is only defined if there are as many sensors (entries of x) as sources. This is a strong assumption because in many applications one does not know the correct number of sources. Three cases are possible: (i) square ICA, where we have as many sources as sensors; (ii) more sensors than sources (in the case of no noise and linear mixtures this can be detected easily by de-correlation); and (iii) more sources than sensors³. In case (iii) the mixing matrix A is not square and we assume more sources than sensors. Because a complete set of basis functions⁴ is given by as many sources as sensors, the above case is termed overcomplete ICA. In Figure 6.1, left, we have symbolized the case of as many sensors as sources (a square mixing matrix A) and, right, the case of more sources than sensors, on which we will focus our attention. In the literature this problem is also known as the estimation of an overcomplete dictionary.

Analyzing natural images with (square) ICA one finds indications for an underlying overcomplete basis set. By increasing the number of basis functions Lewicki and Olshausen (1999) find a better sampling of position, orientation, and scale. Starting ICA algorithms with different initializations one can find different sets of independent components. Overcomplete bases are interesting because (i) they admit some advantages in terms of interpolation, for example an increased stability of the representation in response to small perturbations of the signal (Simoncelli, Freeman, Adelson and Heeger, 1992); wavelet transforms, on the other hand, can be unstable with respect to translations and rotations. (ii) They also form a tight frame (Lee, 1996), that is, they allow stable reconstruction of images by linear superposition of the basis functions with their own projection coefficients (preservation of the signal energy). Through this they (iii) provide sparse coding representations of images using coarse neuronal responses, i.e., sparsity in the representation (Olshausen and Field, 1997).

³ Another (the) problem for real world applications is that the model of a stationary linear mixture is wrong.
⁴ Given for example by the corresponding eigenvectors of the correlation matrix of the observations.



Figure 6.1.: Generative model. Left: As many sources as sensors, the mixing matrix A is square. Right: More sources than sensors, the mixing matrix A is non-square.

A criticism of overcomplete representations is that they are redundant, i.e., a given data point can have many possible representations. Additional assumptions can select a representation by utilizing prior knowledge about the shape of the density functions of the single signals.

There is also a more technical reason to be interested in overcomplete dictionaries. If we look closely at a particular model generating data (no noise, 6-dimensional observations and 12 sources, thus a two times overcomplete dictionary) we see that

\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{pmatrix} = A s = \begin{pmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,12} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,12} \\ a_{3,1} & a_{3,2} & \cdots & a_{3,12} \\ a_{4,1} & a_{4,2} & \cdots & a_{4,12} \\ a_{5,1} & a_{5,2} & \cdots & a_{5,12} \\ a_{6,1} & a_{6,2} & \cdots & a_{6,12} \end{pmatrix} \begin{pmatrix} s_1 \\ s_2 \\ s_3 \\ \vdots \\ s_{12} \end{pmatrix} \qquad (6.11)

The observations x are computed by the sources s and the mixing matrix A in


such a way that s_i multiplies only the i-th column of A:

\sum_{i=1}^{6} x_i = \sum_{i=1}^{12} s_i \sum_{j=1}^{6} a_{j,i} \qquad (6.12)

Thus s_i acts as a weight of the column vector a_{·,i}. An observation x(t) is modeled by adding these weighted vectors. If s_i(t) is near zero its contribution to x(t) is small, but if the source s_i(t) is large (present) for one i only, the corresponding a_{·,i} will solely define the observation x. Therefore we can view each column a_{·,i} of the mixing matrix A as a spatial⁵ pattern defining the structure of a source s_i. Sometimes the patterns are also called maps because they map the structure of the found source.

If we do not know the number of sources, how many possible patterns are there? Infinitely many, because we can also ask how many pairwise distinct columns can be defined for the matrix A (the entries of the matrix are scalar values). If we assume that the patterns contain binary values only, the possible number of patterns is finite. The model would still be valuable in real world applications because a single binary value in the pattern can represent cortical activity and code for an either activated or inactivated area. The number of possible patterns (columns) for the mixing matrix is in this case 2⁶ = 64 (so A should be modeled by a 6 × 64 matrix). Even if not all 64 patterns are generated by the cortical circuitry it may be very likely that more than 6 patterns are present. With square ICA we can only detect 6 of them (as many patterns as sensors).

Introducing more sensors offers only a poor alternative. First of all, the sensor signals will be highly correlated and thus the amount of additional information is very low. Second, adding more sensors gives the model the opportunity to explore the patterns between sub-groups of sensors (i.e., it will enhance the spatial resolution of the method). But it will not solve the original problem of obtaining 12 patterns from 6 sensors. Therefore we need more effective methods to solve overcomplete ICA.

A drawback connected with overcomplete bases is that the values of the independent components cannot be exactly recovered even if the mixing matrix is known. This is because the mixing matrix is not invertible. Therefore, even after estimating the mixing matrix A, the problem of optimal estimation of the realizations of the independent components needs to be solved. The naive approach of using the pseudo-inverse of the mixing matrix, ŝ = A⁺x = Aᵀ(AAᵀ)⁻¹x, does often not work well (Hyvärinen and Inki, 2002; Olshausen and Field, 1997). More advanced algorithms incorporate

⁵ Spatial because it appears in a time step x(t).

some knowledge about the sources at this stage of the problem, for example a smoothness assumption about the change of the active sources in time. In the context of image analysis we are interested in obtaining the basis functions as the underlying structure of the images, which consist of the columns of the mixing matrix; thus we need not cover the additional problem of obtaining the sources (Olshausen and Field, 1997). In summary, we are interested in recovering the mixing matrix (ICA problem), which represents the patterns or maps of the sources, and we ignore the additional problem of blind source separation (BSS).

Please keep in mind that we are still interested in solutions of the extended quadratic form in Equation 6.5. The connection of the two problems, that of learning features which describe symmetries and that of solving a non-square ICA problem, is the following: both will use the same non-linear transformation into a higher dimensional feature space. Under the assumption that natural images can be explained by the model of ICA we can use the algorithm of non-square ICA to solve Equation 6.5 and find the independent factors of natural images.
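The shortcoming of the naive pseudo-inverse estimate can be observed in a small simulation. The sketch below generates sparse sources (cubed Gaussians, as used for the toy data later in this chapter), mixes them with a random 6 × 12 matrix, and compares the pseudo-inverse estimate with the true sources; all numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sensors, n_sources, T = 6, 12, 10_000

s = rng.standard_normal((n_sources, T)) ** 3   # sparse, super-Gaussian sources
A = rng.standard_normal((n_sensors, n_sources))
x = A @ s                                      # overcomplete generative model, Eq. 6.10

s_hat = np.linalg.pinv(A) @ x                  # naive estimate s^ = A^+ x

# the estimate lives in the 6-dimensional row space of A, so it cannot
# reproduce 12 independent sources; correlation with the truth stays modest
for i in range(n_sources):
    c = np.corrcoef(s[i], s_hat[i])[0, 1]
    print(f"source {i:2d}: corr(s, s_hat) = {c:+.2f}")
```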

6.2.2. Algorithms for Learning Overcomplete Dictionaries

Developing efficient algorithms to solve problems with overcomplete dictionaries is an active area of research. A first approach to learning an overcomplete basis set is, for example, to select the basis functions according to a low entropy description (Chen, Donoho and Sanders, 1996; Coifman and Wickerhauser, 1992; Mallat and Zhang, 1993) of a particular signal or a class of signals such as texture (Zhu, Wu and Mumford, 1997). These approaches pre-define a set of basis functions on intuitive criteria, such as using 2D Gabor functions for modeling oriented structures in images. An entropy measure is then used to select the best solution from this set. Although overcomplete bases can be more flexible in terms of how the signal is represented, there is no guarantee that pre-designed basis vectors will be well matched to the structure of the data. Ideally, we would like the basis itself to be adapted to the data, so that for a signal class of interest each basis function captures a maximal amount of structure.

An approach which learns the basis functions is motivated by the sparsity assumption of neuronal codes (Olshausen and Field, 1996; Olshausen and Field, 1997; Lewicki and Olshausen, 1998; Lewicki and Sejnowski, 1998; Lewicki and Sejnowski, 2000). The redundancy problem of the overcomplete basis set is solved by assuming a particular prior P(s) on the activation of each component. The sparse coding principle states that only few neurons (s_i) should be active at any time


to save metabolic costs, but if a neuron is active it should respond strongly in order to signal the presence of a feature⁶. The corresponding probability density of the firing of such a neuron should have a high peak at zero and 'heavy tails', which resembles the typical shape of a super-Gaussian distribution. One example of such a distribution is plotted in Figure 5.3 on page 98. It is obtained as the marginal distribution of a Gaussian distribution projected into the space of dot-products. Sparse coding favors non-Gaussian distributions, ergo an ICA solution.

In the previous model the prior on the source distributions (the density of the activation of a unit) has to be chosen in advance. Usually the models work best if the prior used by the algorithm corresponds to the prior used in the generation of the data. In real world applications the correct priors are usually unknown. Thus it is preferable to learn the correct priors from the data at the time the mixture is estimated. This idea of adaptive source densities is implemented by an algorithm termed independent factor analysis (Attias, 1999). Here, the source distributions are modeled as factorial mixtures of Gaussians. Each source density i (prior distribution P_i(x)) is modeled by a mixture of n Gaussians. The term factorial mixture of Gaussians refers to the fact that the Gaussian distributions for one source are bound together.

All proposed algorithms are computationally very demanding. Hyvärinen (1999) introduces a fast algorithm for the estimation of an approximate overcomplete basis. The algorithm uses the interesting property of quasi-orthogonality (Kohonen, 1995). The observation is that one can place a lot of vectors in higher dimensions that are close to orthogonal (quasi-orthogonal). In fact, if the dimension grows, the angles between pairs of vectors can be made arbitrarily close to 90 degrees. The point here is that in two dimensions only two vectors can be orthogonal, but in 50 dimensions many hundred vectors can be made 'quasi-orthogonal'. Algorithms like support vector machines rely (implicitly) on this property if they use projections into higher dimensional spaces. The probability of arriving at an orthogonal set by an arbitrary projection is high; thus algorithms relying on linear separation work well in high dimensional feature spaces.
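Quasi-orthogonality is easy to verify empirically: random directions in a high-dimensional space are almost always nearly perpendicular. The toy check below uses arbitrary numbers of vectors and dimensions.

```python
import numpy as np

def pairwise_angles(dim, n_vectors=500, seed=0):
    """Angles (degrees) between all pairs of random unit vectors in `dim` dimensions."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal((n_vectors, dim))
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    cos = np.clip(v @ v.T, -1.0, 1.0)
    iu = np.triu_indices(n_vectors, k=1)        # upper triangle, excluding the diagonal
    return np.degrees(np.arccos(cos[iu]))

for dim in (2, 50, 500):
    ang = pairwise_angles(dim)
    print(f"dim={dim:4d}: mean angle {ang.mean():5.1f} deg, std {ang.std():4.1f} deg")
```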

⁶ A metabolically efficient code must balance between silent and spiking neurons. Laughlin and Attwell (2000) have produced estimates of the actual costs of dormancy and spiking in rat cerebral cortex neurons and they found that the maintenance of a neuron at rest absorbs a significant amount of energy. The generation of one action potential uses as much energy as maintaining a single neuron and its attendant glial cells at rest for only about 2 s.



Figure 6.2.: Two Gaussian functions can model a sparse distribution. Left: A histogram of one-dimensional data x which is drawn from a mixture of two independent Gaussian distributions (N_1(0, σ_1 = 1), N_2(0, σ_2 = 4)). Right: the kurtosis (k = E(x⁴) − 3(E(x²))² for zero mean data) of the distribution dependent on the variances of the first and the second mixture component. The circle indicates the parameter settings for the two functions left.

6.3. Solving Overcomplete ICA by Mixtures of Gaussians

We present now a solution to the problem of overcomplete ICA. We solve the source separation problem by a method of statistical data modeling. A source will be described by a multivariate distribution parameterized by A_i. Whereas density estimation is usually a harder problem than source separation, we can use a crude model of the densities to reliably obtain the source directions of an additive linear mixture. We would like to describe the observed variables x_i (independently drawn, correlated sensor signals) by a set of L unobserved variables s_j which are mutually independent. The simplest description of the problem is given by the linear model

x_i = \sum_{j=1}^{L} w_{ij}\, s_j + \nu, \qquad (6.13)

where w_{ij} describes an entry of the stationary mixing matrix W and ν is a random vector reflecting corrupted source, sensor, or mixing signals. We assume therein that the model is correct, thus the observed data can be produced by the model with some parameters. Let us assume that our source densities are sparse and symmetric. The ICA solution we are looking for provides linear sources that explain our data as

a factorizing distribution. In order to obtain the sources we now introduce a model of a mixture of Gaussian functions that is learned in order to estimate the factorizing distribution we are looking for.

We have to be sure that the sparsely distributed sources can be modeled sufficiently well by a mixture of Gaussian functions. More specifically, we want to model n source directions by very few, e.g., n + 1, Gaussians. Generally this cannot be achieved; we need at least two functions to model one sparse distribution. In Figure 6.2 on the page before, two Gaussian functions are used to model sparse distributions. Given that the density function of the data is symmetric and unimodal, we can fix the means of both Gaussian functions to zero. Only their respective variances σ_1 and σ_2 act as free parameters and produce distributions with arbitrarily high kurtosis, which we use here to indicate sparseness. For this reason the model is termed a centralized Gaussian mixture model.

In two dimensions the restriction to zero mean symmetric basis functions gives us a vantage point for analyzing sparse data. One source distribution can be efficiently modeled by two Gaussian functions, one large and elongated and the other small and circular. For every additional (sparse) source present in the data the model needs only one additional (large and elongated) Gaussian function, because the small circularly shaped distribution can be utilized by both sources. Therefore we can model each sparse source distribution by one Gaussian function, with an overhead of one additional Gaussian in the model. Note that we still have to know the number of sources.
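That a zero-mean mixture of two Gaussians with different variances is super-Gaussian can be checked directly. The snippet below samples from such a mixture and reports its kurtosis in the sense of Figure 6.2; the mixing proportion of 0.5 is an arbitrary choice, and the standard deviations 1 and 4 loosely follow that figure.

```python
import numpy as np

def sample_centralized_mixture(n, sigma1=1.0, sigma2=4.0, p1=0.5, seed=0):
    """Draw n samples from a zero-mean mixture of two Gaussians (the 1D
    centralized Gaussian mixture illustrated in Figure 6.2)."""
    rng = np.random.default_rng(seed)
    pick = rng.random(n) < p1
    sigma = np.where(pick, sigma1, sigma2)
    return rng.standard_normal(n) * sigma

x = sample_centralized_mixture(200_000)
kurt = np.mean(x**4) - 3 * np.mean(x**2) ** 2   # kurtosis for zero-mean data (Figure 6.2)
print(f"kurtosis: {kurt:.1f}  (> 0, i.e., super-Gaussian)")
```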

6.3.1. Learning of the Mixture Model by EM

To find the parameters of the model we employ the Expectation Maximization (EM) algorithm of Dempster et al. (1977). The EM algorithm is able to find a locally optimal solution for the estimation of the densities. A common problem in learning mixtures of Gaussian densities with the EM algorithm is the occurrence of singularities due to source densities that concentrate onto a single data point. The corresponding covariance matrix eventually becomes singular whereas the likelihood of the model is maximized. This can be seen by calculating the likelihood of a Gaussian distribution

L(x) = -\log P(x) \approx -\log(p(x)\Delta) = -\log p(x) - \log \Delta \qquad (6.14)
     = \frac{(x - \mu)^2}{2\sigma^2} + \frac{1}{2}\log 2\pi + \log \sigma - \log \Delta \qquad (6.15)

As Δ (which scales the area under the function) goes to zero the likelihood goes to infinity. Regularizing Σ can help in this case. We use an update rule for the covariance matrices based on the Wishart density (Buntine, 1994;



Figure 6.3.: Mixtures of two respectively three independent sparse sources learned by EM. (a) The first figure is a scatter plot of the data (ICA source model, g(u) = u³). The other four figures display in color the posterior probabilities of the four Gaussian functions. The respective 2-dimensional Gaussian functions are shown as ellipses. (b) The corresponding result for three sources in two observations.

Ormoneit and Tresp, 1996). We constrain the likelihood by adding the sum of the inverses of the eigenvalues of the covariance matrix defining the variances of the Gaussian functions. The respective change in the update rule compared to the standard M-step is

\Sigma_i^{unr} = \frac{\sum_{k=1}^{m} h_i^k (x^k - \mu'_i)(x^k - \mu'_i)^t}{\sum_{l=1}^{m} h_i^l} \;\Rightarrow\; \Sigma_i^{reg} = \frac{\sum_{k=1}^{m} h_i^k (x^k - \mu'_i)(x^k - \mu'_i)^t + 2\beta I}{\sum_{l=1}^{m} h_i^l + 1}, \qquad (6.16)

where I is the unity matrix and β a regularization parameter. In effect this cannot prevent that, for a Gaussian, an eigenvalue converges to zero if all other eigenvalues are sufficiently large. Nevertheless, it worked fine in all our experiments.

Observations on the un-regularized version of the original EM algorithm suggest that in regions with one source density and two or more model densities the densities do not cooperate. Only one component is used to explain the data density whereas the other components converge to single data points. This is of importance because in the centralized Gaussian mixture model we fix all mixtures at zero mean, thus they strongly compete for the data. But this setting in fact helps, because there is only a single data point (at zero) for a singularity to occur (but see the regularization); thus, by sacrificing a component with an approximately uniform distribution, the other components (one for each sparse independent component) can explain the data. The direction of

the first principal axis of each component is used as an indicator of the source direction.

In Figure 6.3 on the preceding page the model of centralized Gaussian mixtures was trained with the proposed algorithm for two toy data problems (with 2 respectively 3 sources in 2 observations). The data are obtained as linear mixtures of independent super-Gaussian (N_{1,2,3}(µ = 0, σ = 1)³) sources and are shown as scatter plots. We trained a mixture of l = 1, . . . , 4 respectively l = 1, . . . , 5 Gaussian densities on a data set of size 10,000. Classifiers of new inputs x are obtained by the maximum posterior class probability p(l|x) computed from the class-conditional data likelihood p(x|l); p(x|l) and p(l) are estimated from the sample data by the EM algorithm of Dempster et al. (1977) with regularization by Equation 6.16. In each maximization step of the algorithm the centers of the Gaussian functions are kept fixed to the center of mass of the data distribution.

In Figure 6.3 the posterior probabilities of each component l are coded in color (red for high probability of x being generated by l). Overlaid is an ellipse illustrating the respective iso-probability curve of the two dimensional Gaussian function. In each case, (a) and (b), two components could be learned with small respectively large circular Gaussian distributions. All other components correspond in their first principal axis to source directions in the data.
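A compact EM implementation of the centralized Gaussian mixture model can look as follows. It is a sketch under the assumptions stated in the text: all component means are fixed at the data's center of mass, covariances are full, and the M-step uses the regularized update of Equation 6.16; initialization, convergence checks, and numerical details are simplified.

```python
import numpy as np

def centralized_gmm_em(X, n_components, beta=1e-3, n_iter=100, seed=0):
    """EM for a mixture of zero-mean Gaussians (centralized Gaussian mixture model).

    X : data array of shape (n_samples, dim).
    Returns mixing proportions, covariance matrices, and responsibilities.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    X = X - X.mean(axis=0)                      # fix all means to the center of mass
    pi = np.full(n_components, 1.0 / n_components)
    covs = np.stack([np.eye(d) * (1.0 + rng.random()) for _ in range(n_components)])

    for _ in range(n_iter):
        # E-step: responsibilities h_i^k = p(component i | x^k)
        log_p = np.empty((n, n_components))
        for i in range(n_components):
            _, logdet = np.linalg.slogdet(covs[i])
            sol = np.linalg.solve(covs[i], X.T)             # Sigma_i^-1 x^k
            maha = np.einsum('kd,dk->k', X, sol)            # x^kT Sigma_i^-1 x^k
            log_p[:, i] = np.log(pi[i]) - 0.5 * (maha + logdet + d * np.log(2 * np.pi))
        log_p -= log_p.max(axis=1, keepdims=True)
        h = np.exp(log_p)
        h /= h.sum(axis=1, keepdims=True)

        # M-step: mixing proportions and regularized covariances (Eq. 6.16)
        pi = h.mean(axis=0)
        for i in range(n_components):
            S = (h[:, i, None] * X).T @ X                   # sum_k h_i^k x^k x^kT
            covs[i] = (S + 2 * beta * np.eye(d)) / (h[:, i].sum() + 1.0)
    return pi, covs, h
```

The first principal axis of each elongated component (e.g., the leading eigenvector from np.linalg.eigh of the corresponding covariance) then indicates a source direction, as described above.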

6.3.2. Conclusion

The proposed model estimates the source densities (parameterized by the covariance matrices) similarly to the model presented in Xu, Cheung, Yang and Amari (1997) and the independent factor analysis of Attias (1999). Also Olshausen and Millman (2000) proposed a model in which a mixture of Gaussian priors is learned to characterize the (sparse) posterior distribution over the coefficients. Interestingly, the last model also tried to model a single source direction by 3 Gaussian distributions and found that two large variance distributions converged to the same mean position. Subsequently the results in Olshausen and Millman (2000) are obtained in the case where two Gaussian distributions per source are used to model one (sparse) source distribution.

In our approach each component independently models the data. Therefore, we cannot fully clarify which set of components is used to model one source direction. Nevertheless the results on the toy data set give us a clear intuition of how to interpret the found solution. Basically one component at the center position is utilized by all other components to produce a factorizing distribution, similar to the one dimensional case illustrated in Figure 6.2 on page 123. Before presenting first results for the GEM algorithm on data obtained from natural images we propose an alternative

algorithm to learn a mixture model of quadratic forms. This algorithm allows us to extend the concept of higher order feature detectors to correlations of any order.

6.4. Solving Overcomplete ICA by Introducing New Dimensions

An intuitive reason for the problems of non-square ICA is that we do not have enough space to make our data statistically independent. A linear mixture of two distributions can be made statistically independent in two dimensions, regardless of the specific shape of the distributions. Wrong choices of the transfer function result only in errors in the scaling of the found sources. This is not true for the projection of a linear mixture of three or more distributions onto two dimensions. Introducing independence in the data is only possible if we can transform the data in such a way that every unique source direction corresponds to a unique space dimension. Therefore we would like to introduce new dimensions. The idea is to project the data into a higher dimensional space. In the next section we give a short introduction to the notion of feature spaces.

6.4.1. Feature Spaces and Manifolds

The goal of data processing is to obtain a better representation of the information contained in the data. Classification is an extreme case where a high dimensional input is projected onto a single value describing a class label. Often the data is projected into a space of higher dimension, for example by computing the response of each location in the image to a bank of filters. Selecting a specific bank of filters results in highlighting specific information, which is also called the features of the data. For this reason the space defined by a projection is called feature space, whereas the space of our original data we will call input space.

Projections may or may not destroy information in the data. This is expressed in the notion of (non-)invertible projections or functions. If a function is invertible no information is lost, whereas if a function is non-invertible information about the data is lost during the transformation. Of some importance is the dimensionality of the feature space with respect to the dimensionality of the input space. If the feature space is of lower dimension, information is certainly lost during the transformation⁷, but often the dimension of the

⁷ It gets more complicated here if the information in input space fills only a sub-space. For data from this sub-space no information may be destroyed, but this requires additional knowledge about the data and should be handled by pre-processing.


Figure 6.4.: Example spaces. Left: Intrinsic dimension 1, extrinsic dimension 3. Center: Intrinsic dimension 2, extrinsic dimension 3. Right: The space enclosed in the sphere is a manifold with intrinsic dimension 3 and extrinsic dimension 3

feature space equals or is larger than the dimension of the input space. A set of spatial filters [Φ_i] applied at each position i in the image X is an example of a projection f(i) : IR → IR^m of a single pixel into a higher dimensional feature space (m filter coefficients). Because of the transformation there is no additional information which magically appears to fill the additional (m − 1) dimensions; during the transformation some information is only 'borrowed' from neighboring positions j : |i − j| < r_fs in the image. So at best we can re-distribute the information contained in the data evenly. It depends on this re-distribution process whether we lose or retain information, i.e., whether the transformation is (non-)invertible. Important is that in the case of no noise the transformation cannot effectively fill the higher dimensional space. The data will be restricted to a sub-space, also called a manifold, in the feature space.

A manifold is the generalization of the concept of surfaces or shapes to higher dimensions. It is defined as a set of points with an associated coordinate system. In our above example a point of the manifold S is defined by the m filter coefficients obtained from a single image patch. By a coordinate system we mean a one-to-one mapping from S to IR^n, which allows to specify each point in S using a vector of n real numbers (the coordinates of the corresponding point). The natural number n is called the intrinsic dimension of S (writing n = dim S). Let S be a manifold and φ : S → IR^n be a coordinate system for S. Then φ maps each point p in S to n real numbers: φ(p) = [χ_1(p), . . . , χ_n(p)] = [χ_1, . . . , χ_n]. These are the coordinates of the point p. The n functions χ_i : S → IR (i = 1, . . . , n) are called the coordinate functions.

We are free in choosing a coordinate system. An interesting candidate is given by the coordinate system of our input space. In the case of a set of spatial filters the inverse of the transformation into the feature space, f^{-1}(p) = φ, maps a point p in S into IR^n, n being the dimension of an image patch i in

128 Hauke Bartsch, 2002 6.4. Solving Overcomplete ICA by Introducing New Dimensions

X. In linear filtering the coordinate functions χ_i are simply the rows of the inverse of the transformation matrix (provided the inverse exists). Therefore the introduction of a high dimensional feature space can be understood as the definition of manifolds into which the data is projected.

The dimension we are talking about in colloquial terms is often the intrinsic dimension, not the extrinsic dimension. Thus, a curve in 3-D space is a one-dimensional manifold (intrinsic dimension 1) of extrinsic dimension 3. The surface of a ball is a two-dimensional manifold (intrinsic dimension 2) of extrinsic dimension 3. Figure 6.4 shows three example spaces and the corresponding intrinsic and extrinsic dimensions.

The important point is that a deterministic transformation of an n-dimensional space into a higher, m > n dimensional space can never enlarge the intrinsic dimension of the data. The data will always lie in a lower-dimensional manifold. We can only enlarge its extrinsic dimension. A transformation which enlarges the extrinsic dimension we will call an effective transformation. Let us do an example of a non-effective transformation. If we assume the input space as being x := (x_1, x_2) ∈ IR², and we construct the feature space according to f(x) = y := (x_1, x_2, x_1 + x_2), we do not enlarge the extrinsic dimension of the data from 2 to 3 dimensions. This can be seen by performing a change of variables x'_1 = 2/3 y_1 − 1/3 y_2 + 1/3 y_3, x'_2 = −1/3 y_1 + 2/3 y_2 + 1/3 y_3, and x'_3 = 0 (which inverts our feature space transformation, x'_i being the i-th coordinate function). We can express the data in a coordinate system with only 2 non-zero dimensions. This indicates that all data in feature space are concentrated in a 2-dimensional plane. In general, every linear transformation of the form y = Ax is not suitable to effectively construct a high dimensional feature space. In our example above y is constructed from

y = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = Ax. \qquad (6.17)

The coordinate functions x'_i are computed from the pseudo-inverse of A,

A^{+}y = \begin{pmatrix} 2/3 & -1/3 & 1/3 \\ -1/3 & 2/3 & 1/3 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}. \qquad (6.18)

6.4.2. A Feature Space for Overcomplete ICA

As seen in the last section, the effective mapping into a higher dimensional feature space cannot be done by linear methods, because there the extrinsic


Figure 6.5.: Three examples of manifolds S (surfaces in 3 dimensions) defined by non-linear transformations where specific directions (depicted by lines) are lines, i.e., for some tangent vectors ψ each aψ is in S (a ∈ IR).

dimension of the data remains unchanged. The linear transformations can only construct rotated and scaled hyper-planes. The de-correlation applied as a pre-processing step in many algorithms easily 'destroys' the additional dimensions. To obtain an effectively enlarged representation of our data we therefore need non-linear transformations. Only they can provide an embedding of our data into a higher dimensional space that cannot be undone by linear methods. Whereas many non-linear transformations will enlarge the extrinsic dimension of the data, not all of them can be used in our context of linear source separation.

Sorry, now it gets a little bit complicated: our goal is to make the data in feature space statistically independent, which solves the problem of linear source separation. If the data has extrinsic dimension m in a space of m dimensions then linear ICA provides a solution to our goal by estimating a de-mixing and a corresponding mixing matrix. Once this is done the source patterns are given simply as the columns of the mixing matrix. Using the algorithms of FastICA the columns of the mixing matrix are found as directions in which the data is sparsely distributed. But our data is in that space restricted to a lower dimensional manifold. We conclude that the transformation has to implement the property that directions, i.e., scaled vectors, are on the manifold (the manifold is closed under scalar multiplication). This may appear to be contradictory, because on the one hand we want to use non-linear transformations and on the other hand we want our data in feature space to be concentrated along directions in Euclidean space. A solution to this problem is shown by the observation that we can combine


linearity and non-linearity in orthogonal directions in the feature space. Three examples of manifolds with this property are shown in Figure 6.5 on the facing page. Every one of them displays a curved non-Euclidean 2D manifold in 3D space. Yet, in each of them some of the tangent vectors (depicted as lines) are in the manifold. Only the first and the third example correspond to feature spaces in which all tangent vectors also touch the mean of the data, which is important because only those can be found by linear ICA.

A non-linear mapping \mathcal{F} : IR^n → IR^m that projects scaled vectors αγ⃗ in input space onto scaled vectors of g⃗(γ⃗) in feature space must obey the decomposition:

\mathcal{F}(x) = f(x)\,\vec{g}\!\left(\frac{x}{\|x\|_2}\right) = f(r, \vec{\gamma})\,\vec{g}(\vec{\gamma}), \qquad (6.19)

where the tuple (r, γ⃗) describes the data in polar coordinates. The non-linearity is banned explicitly to appear only in g⃗, i.e., in directions orthogonal to the source directions⁸. f(·) and g⃗(·) are functions that separately describe either a scaling of the data in the direction g⃗ or the non-linear mapping (bending) of directions by g⃗. g⃗ is a vector valued function of dimension 1 × m, m being the dimension of the feature space. In ordinary ICA one would choose m as the number of sources to separate. An assumption that can be placed onto the scaling function f(r, γ⃗) is that our problem is rotationally symmetric, i.e., the scaling should not depend on direction:

f(r, \vec{\gamma}) = f(r). \qquad (6.20)

The mapping \mathcal{F} is therefore expressed in general form as

\mathcal{F}(r, \gamma_1, \ldots, \gamma_{n-1}) = f(r) \begin{pmatrix} g_1(\gamma_1, \ldots, \gamma_{n-1}) \\ g_2(\gamma_1, \ldots, \gamma_{n-1}) \\ \vdots \\ g_m(\gamma_1, \ldots, \gamma_{n-1}) \end{pmatrix}. \qquad (6.21)

Because we decoupled our class of functions it is sufficient to consider the scaling property implemented by f and the bending property implemented by g⃗ separately.

Desired Properties of the Transformation

Now we introduce some properties that further constrain the space of possible functions, which up to now only obeys the form of Equation 6.21. In here

⁸ Our manifold in feature space is like Flatland with its inhabitant A-Square exploring the curved nature of his 2D space in 3D (Sphereland: A Fantasy About Curved Spaces and an Expanding Universe, Dionys Burger (1965)).


we borrow some methods of orthogonal functions, differential geometry, and information geometry (Amari and Nagaoka, 1993). We will mark properties that are needed in order to perform linear methods in feature space as necessary and explain also how, for a given transformation, the properties can be tested. Properties of the function class F are:

1. Linear decomposition (necessary): The function is of the form F(r, γ⃗) = f(r, γ⃗) g⃗(γ⃗).
Test: by linear algebra.

2. Linear independence of basis functions (necessary): The m functions g_i(γ⃗), i = 1, . . . , m are called linearly dependent in a set G if there exist m constants c_1, . . . , c_m, not all zero, for which the function c_1 g_1 + c_2 g_2 + · · · + c_m g_m is zero. If such constants do not exist, the m functions are called linearly independent. In our framework this property of a feature space assures that the data can effectively fill the feature space. In the context of projecting the data into higher dimensional feature spaces it enforces non-linear basis functions.
Test: Linear independence of functions is equivalent to a non-zero determinant (Wronskian)

W(g_1, g_2, \ldots, g_{m-1}) = \begin{vmatrix} g_1 & g_2 & \cdots & g_{m-1} \\ \frac{\partial g_1}{\partial \vec{\gamma}} & \frac{\partial g_2}{\partial \vec{\gamma}} & \cdots & \frac{\partial g_{m-1}}{\partial \vec{\gamma}} \\ \frac{\partial^2 g_1}{\partial \vec{\gamma}^2} & \frac{\partial^2 g_2}{\partial \vec{\gamma}^2} & \cdots & \frac{\partial^2 g_{m-1}}{\partial \vec{\gamma}^2} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^{m-2} g_1}{\partial \vec{\gamma}^{m-2}} & \frac{\partial^{m-2} g_2}{\partial \vec{\gamma}^{m-2}} & \cdots & \frac{\partial^{m-2} g_{m-1}}{\partial \vec{\gamma}^{m-2}} \end{vmatrix} \neq 0. \qquad (6.22)

3. General linear independence (necessary): g⃗ should ensure that any m pair-

wise different vectors g⃗(γ⃗_0), g⃗(γ⃗_1), . . . , g⃗(γ⃗_{m−1}) are linearly independent. This is connected with the goal of detecting arbitrary, pairwise different combinations of source directions. If this property does not hold it is possible that m (highly super-Gaussian) sources populate an m − k, 0 < k < m, dimensional sub-space (by being linearly or nearly linearly dependent).
Test: General linear independence is equivalent to the matrix M having a non-zero determinant for all possible assignments of γ:

|M| = \begin{vmatrix} g_1(\vec{\gamma}_1) & g_2(\vec{\gamma}_1) & \cdots & g_m(\vec{\gamma}_1) \\ g_1(\vec{\gamma}_2) & g_2(\vec{\gamma}_2) & \cdots & g_m(\vec{\gamma}_2) \\ \vdots & \vdots & \ddots & \vdots \\ g_1(\vec{\gamma}_{n-1}) & g_2(\vec{\gamma}_{n-1}) & \cdots & g_m(\vec{\gamma}_{n-1}) \end{vmatrix} \neq 0 \qquad (6.23)

(A numerical version of this test is sketched after the list below.)


4. Differentiable: g⃗ should be rotationally symmetric if the problem is rotationally symmetric. This favors differentiable, periodic functions.

5. Factorizing distributions remain: If we observe a factorizing distribution the projection into feature space should not introduce correlations in the data. This is a difficult problem, because the data is concentrated in a manifold (for example a bent plane in 3D). This will introduce correlations in the data. Because whitening is part of many ICA methods, factorizing distributions will be spoiled.

6. Invertible: F should be invertible, so that back-projection of data is possible.

It remains to show (i) if there is such a feature space and (ii) if these considerations define a singular form of g⃗. Because the linear methods applied in the feature space themselves change the space linearly, they can cope with different linear transformations. Therefore any extended general form can be chosen,

\mathcal{F}(r, \gamma_1, \ldots, \gamma_{n-1}) = f(r, \vec{\gamma})\, P\, \vec{g}(\vec{\gamma}), \qquad (6.24)

where P is a square, positive definite, real valued matrix of size (m − 1) × (m − 1). By P one can produce any linear transformation on g⃗, which would correspond for example to rotations, scalings, or translations of the manifold. In the following we will examine some feature spaces for their properties regarding the restrictions we have placed so far onto \mathcal{F}(r, γ⃗), in order to reasonably perform linear methods such as ICA.
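Property 3 can be checked numerically for a candidate set of basis functions: evaluate the g_i at several distinct angles and test whether the resulting matrix has full (numerical) rank. The sketch below does this for a user-supplied list of functions of a single angle γ; the choice of test angles, the example basis, and the rank tolerance are assumptions of the illustration.

```python
import numpy as np

def general_linear_independence(basis_funcs, gammas):
    """Check property 3 (Eq. 6.23): the vectors g(gamma_j), evaluated at pairwise
    different angles, should be linearly independent, i.e., the matrix
    M[j, i] = g_i(gamma_j) should have full rank."""
    M = np.array([[g(gamma) for g in basis_funcs] for gamma in gammas])
    rank = np.linalg.matrix_rank(M, tol=1e-10)
    return rank == min(M.shape), M

# example: three periodic basis functions of one angle
basis = [lambda g: np.cos(g), lambda g: np.sin(g), lambda g: np.cos(2 * g)]
ok, M = general_linear_independence(basis, gammas=[0.3, 1.1, 2.5])
print("full rank:", ok)
```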

Example 1: Polynomial Spaces
For polynomial spaces, which are most frequently used for defining non-linear feature spaces, directions in data space are mapped onto curves in feature space, as can be seen for a polynomial space of order 2. The mapping of a data point $\mathbf{x}$ into feature space is given by

$$
\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \;\xrightarrow{FS}\; \left(1,\; x_1,\; x_2,\; x_1 x_2,\; x_1^2,\; x_2^2\right)^T. \tag{6.25}
$$
Consequently, a point $(u, au)^T$ lying on a direction with slope $a$ is mapped onto the vector

$$
\left(1,\; u,\; au,\; au^2,\; u^2,\; (au)^2\right)^T, \tag{6.26}
$$
which is not compatible with the general form of Equation 6.21 on page 131. For example, in the sub-space spanned by $u$ and $u^2$ a source direction is mapped


Figure 6.6.: Feature spaces defined by three radial basis functions. Left: The basis functions for one of the feature spaces, equally spaced in the interval [0, 2π]. Right: Each manifold is coded in a different color (blue, purple, red) with decreasing variances of the respective basis functions.

onto a parabolic function. Therefore a polynomial space is not compatible with linear methods. One may suspect that the reason for this incompatibility is the different order of the individual terms; later, in the example on monomial spaces of constant order (page 137), we will define a feature mapping with terms of constant order.
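To make the incompatibility concrete, the following short sketch (NumPy assumed; not part of the original text) maps points along a single input-space direction through the order-2 polynomial feature map of Equation 6.25 and checks that their images do not lie on a single feature-space direction, i.e., the direction becomes a curve.

```python
import numpy as np

def poly2_map(x1, x2):
    """Order-2 polynomial feature map of Eq. 6.25."""
    return np.array([1.0, x1, x2, x1 * x2, x1**2, x2**2])

a = 0.7                                  # slope of the source direction (u, a*u)
us = np.array([0.5, 1.0, 2.0])           # three points on that direction
F = np.stack([poly2_map(u, a * u) for u in us])

# If the direction were mapped onto a single feature-space direction, all rows
# would be scalar multiples of each other and the matrix would have rank 1.
print("rank of mapped points:", np.linalg.matrix_rank(F))   # > 1: a curve, not a line
```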

Example 2: RBF Spaces
If we use radial basis functions (RBF) as basis functions, our feature space is of the form

$$
\vec{g}(\vec{\gamma}) = \begin{pmatrix}
G_1(\gamma_1, \ldots, \gamma_{m-1}) \\
G_2(\gamma_1, \ldots, \gamma_{m-1}) \\
\vdots \\
G_n(\gamma_1, \ldots, \gamma_{m-1})
\end{pmatrix}, \tag{6.27}
$$
where $G_i$ is a Gaussian function with mean $\mu_i$ and variance $\sigma_i$. We can illustrate the mapping from data space into feature space for the case $m = 2$ and $n = 3$ (see Figure 6.6). Doing ICA in this space one would expect to extract three independent components from two observations. First of all, the feature space is not rotationally invariant (it is not a closed curve), which is due to the non-periodic basis functions. Second, directions can become linearly dependent for small variances of the underlying Gaussian functions. This property is explained in more detail in the next section, which treats circular radial basis functions that share the same disadvantage.
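The near-linear dependence for small variances can be seen directly with a few lines of code. The sketch below (NumPy assumed; the center positions are arbitrary example values) builds the three-RBF feature map of Equation 6.27 and compares the determinant of Equation 6.23 for a small and a large width.

```python
import numpy as np

def rbf_map(gamma, mus=(0.5, np.pi, 2.0 * np.pi - 0.5), sigma=0.3):
    """Feature map built from three Gaussian radial basis functions (Eq. 6.27)."""
    mus = np.asarray(mus)
    return np.exp(-0.5 * ((gamma - mus) / sigma) ** 2)

# Three pairwise different directions that all fall far from the RBF centers:
angles = np.array([1.4, 1.6, 1.8])
M_small = np.stack([rbf_map(a, sigma=0.3) for a in angles])
M_large = np.stack([rbf_map(a, sigma=1.5) for a in angles])

# For a small sigma the rows are almost parallel (tiny determinant), i.e. the
# three mapped directions are nearly linearly dependent -- property 3 fails.
print("det, sigma=0.3 :", np.linalg.det(M_small))
print("det, sigma=1.5 :", np.linalg.det(M_large))
```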


Figure 6.7.: Shape space for a moving first Gaussian component (its mean is shifted over [0, 2π], left to right, top to bottom); the means of the second and third components are kept fixed. The variance was σ = 0.3.

We summarize that equally spaced Gaussian radial basis functions may behave rather badly if the basis functions have small variances. Usually one learns the parameters of the Gaussian functions from the data in order to get better results. The RBFs can be tuned by learning their means and variances. Clustering algorithms, for example, ensure that at best each single Gaussian function is placed at the direction of a component (because the data is concentrated in that direction). One may suspect that non-equal spacings are favorable, even though this assumes knowledge about the unknown components. To explore the utility of adapting the center positions to the data, Figure 6.7 shows the shape space for a moving first component (its mean varies in the interval [0, 2π]). We plot not the 3D space but the curves obtained from the intersection of the manifold with a plane tangent to the unit sphere. Each point on the curve represents a possible source direction. The variances of the basis functions are relatively small (σ = 0.3), and in all cases one can find three directions, i.e., three pairwise different points on the curve, that are (nearly) linearly dependent.

Example 3: Circular Radial Basis Functions

Using ordinary Gaussians as radial basis functions for circular data is questionable because they cannot deal with the periodic nature of the data. In the next section we therefore move to a different form of basis function, the von Mises distribution, which is the circular equivalent of the normal distribution.

The von Mises Distribution
The circular equivalent of the normal distribution is the von Mises distribution


Figure 6.8.: Feature space defined by three von Mises distributions with decreasing concentration parameter κ. Left: The three basis functions, equally spaced, with a concentration parameter of κ = 0.3. Right: Each manifold is coded in a different color (dark blue, light blue, red) with decreasing κ = 1.5, 1, 0.3, i.e., increasing variance.

Figure 6.9.: Manifold in 3D (in blue) cut by a plane through the origin representing a 2D sub-space. It is possible to define three directions in feature space that all fall into the 2D sub-space. There is no guarantee that unknown source directions fill the 3D space (i.e., are linearly independent).


Figure 6.10.: Feature space defined by three monomials. Left: The three basis functions cos²(γ), sin²(γ), √2 cos(γ) sin(γ) for γ ∈ [0, π]. Right: The three-dimensional feature space, which we termed Diabolo space due to its shape. The convex shape implies linear independence for every combination of three different vectors, and the space has perfect circular symmetry.

(Mardia and Jupp (2000), page 36),
$$
M(\theta; \mu, \kappa) = \frac{1}{2\pi I_0(\kappa)} \exp\!\left(\kappa \cos(\theta - \mu)\right), \tag{6.28}
$$
where $I_0$ denotes the modified Bessel function of the first kind and order 0, which is defined as
$$
I_0(\kappa) = \frac{1}{2\pi} \int_0^{2\pi} d\theta\, \exp(\kappa \cos\theta). \tag{6.29}
$$
The parameter $\kappa$ is known as the concentration parameter and behaves inversely to the variance of a normal distribution ($\kappa = 0$ gives a flat distribution). Note that $M(\theta; \mu + \pi, \kappa)$ and $M(\theta; \mu, -\kappa)$ are the same distribution. The shape of the function is very close to that of the normal distribution, as can be seen in Figure 6.8 on the preceding page. But again, for a large concentration parameter $\kappa$ there are combinations of three directions that are linearly dependent or nearly linearly dependent. This is illustrated in more detail in Figure 6.9 on the facing page.
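The same determinant check as before applies to the von Mises basis. The sketch below (NumPy and SciPy assumed; the centers and the test angles are arbitrary example values) shows that a large concentration parameter again produces nearly linearly dependent directions; the rows are normalized so that only the geometry, not the overall scale, enters the comparison.

```python
import numpy as np
from scipy.special import i0   # modified Bessel function of the first kind, order 0

def von_mises(theta, mu, kappa):
    """Von Mises density of Eq. 6.28."""
    return np.exp(kappa * np.cos(theta - mu)) / (2.0 * np.pi * i0(kappa))

def vm_map(gamma, mus=(0.0, 2.0 * np.pi / 3.0, 4.0 * np.pi / 3.0), kappa=0.3):
    """Periodic feature map built from three von Mises basis functions."""
    return np.array([von_mises(gamma, mu, kappa) for mu in mus])

angles = np.array([1.4, 1.6, 1.8])         # three pairwise different directions
for kappa in (0.3, 8.0):
    M = np.stack([vm_map(a, kappa=kappa) for a in angles])
    M = M / np.linalg.norm(M, axis=1, keepdims=True)   # compare directions only
    print(f"kappa={kappa}: |M| = {np.linalg.det(M):.3e}")  # tiny for large kappa
```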

Example 4: Monomial Spaces of Constant Order
In the example about polynomial spaces we suspected that the different order of the monomials provides the main reason for their incompatibility with

linear methods. Based on this insight we now introduce monomial spaces of constant order. Let each basis function $\vec{g}$ be defined in terms of monomials in $n$ variables of order $d$:
$$
\vec{g}_n(d) = \left\{ x_1^{e_1} x_2^{e_2} \cdots x_n^{e_n} \;\middle|\; e_1 + e_2 + \cdots + e_n = d,\; e_i \geq 0 \right\}. \tag{6.30}
$$
It is easier to define the basis functions in Cartesian coordinates, but the formulation in polar coordinates is straightforward. For example, the monomials of order 2 in $m$ variables $\mathbf{x} = (x_1, \ldots, x_m)^T$ are defined by
$$
\vec{g}(\mathbf{x}) = \mathbf{x}\mathbf{x}^T. \tag{6.31}
$$
In order to increase the intrinsic dimension, for each monomial term we count only the commutative monomials (vech = lower triangular form):
$$
\vec{g}_2(x_1, \ldots, x_m) = \operatorname{vech}
\begin{pmatrix}
x_1^2 & & & \\
x_2 x_1 & x_2^2 & & \\
\vdots & \vdots & \ddots & \\
x_m x_1 & x_m x_2 & \cdots & x_m^2
\end{pmatrix}
= \begin{pmatrix} x_1^2 \\ x_2 x_1 \\ \vdots \\ x_m^2 \end{pmatrix}. \tag{6.32}
$$
Defining a feature space by monomials of constant order $d$ ensures that the space is compatible with the general form of Equation 6.21 on page 131, because every dimension is of the form $r^d \operatorname{cossin}(\vec{\gamma})$, where $\operatorname{cossin}(\cdot)$ is a product of sine and cosine terms.

We now introduce a linear scaling of the basis functions and show that all orthogonal directions in a two-dimensional input space are mapped onto orthogonal directions in the space of monomials of constant order. Let $d$ be the order of the monomials. If two vectors in input space are orthogonal, their scalar product is zero:
$$
(x_1, y_1) \begin{pmatrix} x_2 \\ y_2 \end{pmatrix} = x_1 x_2 + y_1 y_2 = 0, \tag{6.33}
$$
$$
x_1 x_2 = - y_1 y_2. \tag{6.34}
$$
This also implies that $x_1^e x_2^e = (-1)^e\, y_1^e y_2^e$. For $d = 2$ we now have to test whether the scalar product of the projected vectors is zero as well:
$$
(x_1^2, \alpha x_1 y_1, y_1^2)\,(x_2^2, \alpha x_2 y_2, y_2^2)^T = y_1^2 y_2^2 - \alpha^2 y_1^2 y_2^2 + y_1^2 y_2^2 = 0, \tag{6.35}
$$
where we used the fact that $x_1 x_2 = -y_1 y_2$ implies $x_1^2 x_2^2 = y_1^2 y_2^2$. Here $\alpha$ is our unknown scaling factor, which obviously has to be $\alpha = \sqrt{2}$. Let us do this again for a term of order $d = 3$:
$$
(x_1^3, \alpha x_1^2 y_1, \alpha x_1 y_1^2, y_1^3)\,(x_2^3, \alpha x_2^2 y_2, \alpha x_2 y_2^2, y_2^3)^T = -y_1^3 y_2^3 + \alpha^2 y_1^3 y_2^3 - \alpha^2 y_1^3 y_2^3 + y_1^3 y_2^3 = 0. \tag{6.36}
$$


Figure 6.11.: Input space is folded into the space of monomials of constant order (Diabolo space). Shown is the procedure of folding a plane disc (in yellow) into the Diabolo space with one intermediate step in green

Here we see that α can be set to 1 and the scalar product is zero (i.e., the directions remain orthogonal). Depending on the order d the following pattern emerges:

$$
\Big( (-1)^d + \alpha^2 (-1)^{d-1} + \alpha^2 (-1)^{d-2} + \cdots + \alpha^2 (-1) + 1 \Big)\, y_1^d\, y_2^d = 0, \tag{6.37}
$$
where

$$
\alpha = \begin{cases} \sqrt{2} & \text{if } d \text{ is even}, \\ 1 & \text{if } d \text{ is odd}. \end{cases} \tag{6.38}
$$
The $\sqrt{2}$ term is only needed for even orders of the monomials and for basis functions of the form $a_i a_j$, $i \neq j$. Because of its length, the proof of the general linear independence of monomials of constant order is presented in Section C on page 190. To illustrate the mapping of directions onto Diabolo space, Figure 6.11 shows the two spaces, input space and Diabolo space, together with a linear interpolation (0.5 · input space + 0.5 · Diabolo space). In this example the scaling function $f$ was chosen to be $f(x) = x|x|$, ensuring that points in the

positive quadrants are mapped onto the positive and points in the negative quadrants are mapped onto the negative quadrants. The function is therefore invertible.

The model is easily extended to monomials of higher degree or more variables. In the case of monomials of order $n$ in $d$ variables there are
$$
\binom{n + d - 1}{n} \tag{6.39}
$$
commutative monomials (this equals the number of feature dimensions). For monomials in $d = 2$ variables (with polar coordinates $r = \sqrt{u^2 + v^2}$, $\gamma = \operatorname{atan}(v/u)$) and order $n = 2, 3, 4$ the monomial feature spaces can be defined as
$$
\mathcal{F}(r, \gamma) =
r^2 \begin{pmatrix} \cos^2\gamma \\ \sin\gamma\cos\gamma \\ \sin^2\gamma \end{pmatrix};\quad
r^3 \begin{pmatrix} \cos^3\gamma \\ \cos^2\gamma\sin\gamma \\ \sin^2\gamma\cos\gamma \\ \sin^3\gamma \end{pmatrix};\quad
r^4 \begin{pmatrix} \cos^4\gamma \\ \cos^3\gamma\sin\gamma \\ \sin^2\gamma\cos^2\gamma \\ \sin^3\gamma\cos\gamma \\ \sin^4\gamma \end{pmatrix}. \tag{6.40}
$$
One may be worried about the periodicity of the basis functions used. Directions of linearly mixed centered distributions are $\pi$-periodic, but that is not always true for our sine and cosine functions. It is easy to show that, depending on the order $d = d_1 + d_2$,

$$
g(\gamma + \pi) = \cos^{d_1}(\gamma + \pi)\,\sin^{d_2}(\gamma + \pi) = (-1)^{d_1 + d_2}\, g(\gamma). \tag{6.41}
$$
For odd orders $d$ this gives $g(\gamma + \pi) = -g(\gamma)$, and for even orders $d$ it gives $g(\gamma + \pi) = g(\gamma)$. Either the data in the positive and negative directions are both mapped onto the positive direction in feature space, or the negative direction of the data is mapped onto the negative direction in feature space. To keep the distributions in feature space symmetric for even orders $d$, one can employ the scaling function $f$ to map negative directions in data space onto negative directions in feature space (by, for example, $f(x) = x|x|$).

Correlations Introduced by the Transformation
It should be apparent by now that the transformation $\mathcal{F}$ of order 2 is, up to the scaling, identical to our symmetry transformation $\mathcal{S}$ expressed as a quadratic form in Equation 4.10 on page 90. It remains to show how the transformation $\mathcal{S}$ behaves with respect to correlations that are introduced when the data is projected into feature space. We generated a data cloud from a circular two-dimensional Gaussian density and projected the data into the three-dimensional feature space defined by $\mathcal{S}(x) = x|x|\,(x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)^T$. Because the data was generated in input space


Figure 6.12.: Correlations introduced by the transformation of a circular Gaussian density. Left: Ellipse representing the iso-probability lines of a Gaussian function fitted to the data in feature space (Diabolo space indicated by the manifold in blue). The data distribution is elongated in the direction of the cones. Right: The one-dimensional cumulants of the data in feature space projected onto vectors from the unit sphere.


Table 6.1.: Summary of the properties of the different feature spaces

                                      polynomials   RBF   v. Mises   monomials
  invertible projection                    √          √       √          √
  linear decomposition                     –          √       √          √
  linear indep. of basis func.             √          √       √          √
  differentiable basis func.               –          √       √          √
  general linear independence              –          –       –          √
  factorizing distributions remain         –          –       –          √

with variance one in all directions we can interpret directions of high variance in the data as directions in which new correlations are introduced because of the transformation.

In Figure 6.12 on the page before we analysed the above case of the circular Gaussian distribution in two dimensions after projection into the three-dimensional feature space. In the left figure, the diabolo drawn in blue indicates the manifold onto which the data is projected. Additionally we computed the eigenvalue decomposition of the data in the feature space. The resulting two-dimensional surface representing the iso-variance of the data is elongated in the direction of the mid-line of the two cones. This indicates that correlations are introduced into the data. Note that this is a problem for all data that are not perfectly super-Gaussian (sparse). Additionally, in Figure 6.12, right, we plotted the first four cumulants of the data after projection onto linear sub-spaces, to make explicit which information is used by the ICA algorithm. We sampled all linear sub-spaces in three-dimensional space (points on the surface of a unit ball), projected each data point into that space (by a dot product), and indicated the cumulants of the resulting distributions by a color code. To better visualize the structure of the cumulants as a function of the axis of projection we applied a Mollweide map transformation and obtained the ellipsoidally shaped two-dimensional maps shown on the right. As already indicated by the eigen-ellipse on the left, the variance of the data is highest in the direction of the cones of the Diabolo. Notably, the kurtosis, as an important measure of the higher order moments, is found to be high in the sub-space defined by the manifold of the data in feature space.
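The effect can be reproduced numerically. The sketch below (NumPy assumed) projects a circular 2D Gaussian into the order-2 monomial space with the √2-scaled cross term and inspects the covariance eigenvalues of the projected data; a clearly non-flat spectrum signals the correlations introduced by the transformation. Taking the sign factor from the sign of the first coordinate is an illustrative simplification, not the thesis' exact choice of scaling function.

```python
import numpy as np

def diabolo_map(X):
    """Order-2 commutative monomial (Diabolo) map with the sqrt(2)-scaled cross term.

    The sign factor keeps antipodal input directions on opposite sides of feature
    space; deriving it from sign(x1) is an illustrative simplification.
    """
    x1, x2 = X[:, 0], X[:, 1]
    sign = np.where(x1 >= 0.0, 1.0, -1.0)
    return sign[:, None] * np.stack([x1**2, np.sqrt(2.0) * x1 * x2, x2**2], axis=1)

# Orthogonal input directions stay orthogonal in feature space (Eq. 6.35):
u, v = np.array([[1.0, 0.7]]), np.array([[-0.7, 1.0]])
print("feature-space dot product:", float(diabolo_map(u)[0] @ diabolo_map(v)[0]))  # ~ 0

# A circular Gaussian acquires correlations after the projection (cf. Figure 6.12):
X = np.random.default_rng(1).standard_normal((20000, 2))
evals = np.linalg.eigvalsh(np.cov(diabolo_map(X).T))
print("covariance eigenvalues in feature space:", np.round(evals, 2))  # clearly non-flat
```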


6.4.3. Conclusion
As we have seen, of the analysed transformations only the one based on monomials of constant order fulfills the necessary properties proposed on page 131. In Table 6.1 on the facing page we have summarized the findings for the different transformations. Obviously the list does not cover all possible non-linear transformations. Our particular choice was motivated by the wish to illustrate the process of model selection, i.e., how to decide which transformation can be selected to effectively construct a non-linear feature space. Nevertheless, the properties obtained point to a single class of feature spaces, obtained from a non-linear transformation based on monomials of constant order, in which linear methods can be applied. Note that particular transformations from this set can be selected by choosing the order of the monomials (which will depend on the number of sources we want to detect). Feature spaces are equivalent if they are built from different rotations of a single feature space, expressed by an orthonormal matrix multiplied onto the vector of basis functions (this only affects the orientation of the diabolo in feature space).

6.5. Interpretation of the Directions in Diabolo Space

For solving an overcomplete dictionary problem by means of the space expansion with commutative monomials of constant order, the following steps have to be performed in sequence (a code sketch of the first two steps follows the list):

1. project the data x(t) into the space of monomials of constant order d > 1

2. apply a method of your choice (e.g., FastICA) in the feature space to obtain the components

3. back-project the found components one by one into the input space.
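A minimal sketch of steps 1 and 2 is given below, assuming NumPy and the FastICA implementation from scikit-learn as a stand-in for the FastICA package used in the thesis; the vech_order2 helper and its √2 scaling follow Equation 6.32 and the scaling argument of Equation 6.35, and the data here is only a placeholder. Step 3 is sketched after the back-projection derivation below.

```python
import numpy as np
from sklearn.decomposition import FastICA   # stand-in for the FastICA package used in the text

def vech_order2(X):
    """Map data of shape (N, m) onto the commutative order-2 monomials (Eq. 6.32).

    Cross terms are scaled by sqrt(2) so that orthogonal input directions stay
    orthogonal in feature space (Eq. 6.35).
    """
    N, m = X.shape
    cols = []
    for i in range(m):
        for j in range(i, m):
            scale = 1.0 if i == j else np.sqrt(2.0)
            cols.append(scale * X[:, i] * X[:, j])
    return np.stack(cols, axis=1)            # shape (N, m*(m+1)/2)

# Step 1: project the observations into feature space (placeholder data here).
X = np.random.default_rng(0).standard_normal((5000, 6))
F = vech_order2(X)
F *= np.random.default_rng(1).choice([-1.0, 1.0], size=(F.shape[0], 1))  # symmetrize the sign

# Step 2: run a linear ICA method in the 21-dimensional feature space.
ica = FastICA(n_components=12, random_state=0)
ica.fit(F)
directions = ica.components_                 # rows of the estimated un-mixing matrix
```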

For most projection methods it is better to visualize the results in the input space in order to judge the quality of the method. The question is now: how to project a direction, i.e., a row of the mixing matrix estimated in feature space, back onto a direction in input space.

Recovering Source Directions from within the Manifold
Here we show that a source direction found by some method in the feature space of monomials of constant order d = 2 can be used to obtain the corresponding source direction in input space. Given that a direction $\tilde{s}^i$ has been found in feature space which is a tangent vector of the manifold, it corresponds to a direction $\mathbf{x}$ in the input space. From


Figure 6.13.: Lines of constant $\Delta^2$ are hyper-ellipsoids. The principal axes of the hyper-ellipsoid are defined by the eigenvectors $u_i$ and eigenvalues $\lambda_i$ of $\Sigma^{-1}$ as $s_i = \lambda_i^{-1/2} u_i$. Large $\lambda_i$ result in a small variance of $\Delta^2(x, y)$ and thus a large gradient $\partial\Delta^2/\partial u_i$. The neuron is most sensitive to changes in the direction of $u_{\max}$ (red arrow).

Equation 6.32 on page 138 it follows that $\tilde{s}^i = \operatorname{vech}(\mathbf{x}\mathbf{x}^T)$ for some unknown $\mathbf{x}$. Because of the symmetry of the matrix $\mathbf{x}\mathbf{x}^T$ we can construct from the source direction $\tilde{s}^i$ the matrix $\mathcal{C}^i = \mathbf{x}\mathbf{x}^T$. It is easy to show that for $\sum_i x_i^2 > 0$ this matrix has a single non-zero eigenvalue with corresponding eigenvector $u_0 = \mathbf{x}$. Let $\lambda_0$ be a non-zero eigenvalue and $u_0$ the corresponding eigenvector. We arrive at

$$
\mathbf{x}\mathbf{x}^T u_0 = \lambda_0 u_0, \tag{6.42}
$$
which states that $u_0$ is changed only in length by the multiplication with the matrix $\mathbf{x}\mathbf{x}^T$. Setting $u_0 = \mathbf{x}$ we see that

$$
\mathbf{x}\,\underbrace{\mathbf{x}^T\mathbf{x}}_{=\,\sum_i x_i^2} \;=\; \lambda_0 \mathbf{x}, \tag{6.43}
$$
$$
\left( \sum_i x_i^2 \right) \mathbf{x} = \lambda_0 \mathbf{x}. \tag{6.44}
$$
Thus for $\lambda_0 = \sum_i x_i^2$ the vector $u_0 = \mathbf{x}$ is an eigenvector of $\mathcal{C}$. $\mathcal{C}$ can be interpreted as the correlation matrix of a data distribution $p(\mathbf{y})$ which is restricted to a linear sub-space (a direction) in the input space. This implies that $\lambda_0$ is the only non-zero eigenvalue of $\mathcal{C}$, because for a linear sub-space $\mathcal{C}$ has rank 1. We can think of $\mathcal{C}$ as being the correlation matrix of some random vector $\mathbf{y} = \alpha \mathbf{x}$, $\alpha \in \mathbb{R}$, $\mathbf{x} = \text{const}$, $E(\alpha^2) = 1$:
$$
E_\alpha\!\left( \alpha \mathbf{x}\, (\alpha \mathbf{x})^T \right) = E_\alpha\!\left( \alpha^2 \right) \mathbf{x}\mathbf{x}^T = \mathbf{x}\mathbf{x}^T = \mathcal{C}, \tag{6.45}
$$
that is, of data contained in a one-dimensional sub-space in the direction $\mathbf{x} \in \mathbb{R}^n$. This directly indicates that the rank of $\mathcal{C}$ has to be 1 and that all other eigenvalues $\lambda_{1,\ldots,n}$ are zero, with corresponding eigenvectors orthogonal to $\mathbf{x}$.
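Step 3 of the procedure can then be written down directly. The following sketch (NumPy assumed; the helper names are mine) rebuilds the symmetric matrix $\mathcal{C}$ from a feature-space direction, undoes the √2 scaling of the cross terms, and reads off the input-space direction as the leading eigenvector; the eigenvalue ratio serves as the 'trustworthiness' score discussed in the error section below.

```python
import numpy as np

def unvech_order2(s, m):
    """Rebuild the symmetric matrix C = x x^T from an order-2 feature-space direction s.

    Off-diagonal entries are divided by sqrt(2) to undo the scaling of the forward map.
    """
    C = np.zeros((m, m))
    k = 0
    for i in range(m):
        for j in range(i, m):
            value = s[k] if i == j else s[k] / np.sqrt(2.0)
            C[i, j] = C[j, i] = value
            k += 1
    return C

def back_project(s, m):
    """Return the input-space direction and a rank-one 'trust' ratio (close to 1 is good)."""
    C = unvech_order2(s, m)
    evals, evecs = np.linalg.eigh(C)
    order = np.argsort(np.abs(evals))[::-1]          # sort by eigenvalue magnitude
    evals, evecs = evals[order], evecs[:, order]
    trust = np.abs(evals[0]) / (np.abs(evals).sum() + 1e-12)
    return evecs[:, 0], trust

# A direction that truly lies on the manifold back-projects with trust close to 1:
x = np.array([1.0, -2.0, 0.5])
s = np.array([x[i] * x[j] * (1.0 if i == j else np.sqrt(2.0))
              for i in range(3) for j in range(i, 3)])
direction, trust = back_project(s, m=3)
print(direction, trust)
```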


Multivariate Gaussians and Model Sensitivity
Here we show how the directions found by some method in the feature space of monomials of constant order can be interpreted as the response of a model that reflects its sensitivity to certain features of the input. The difference to the above back-projection technique is that we can now also give an interpretation to directions which do not lie directly on the manifold. In the case of commutative monomials of constant order 2 in two variables $x_1, x_2$ the model of a quadratic polynomial $\Delta^2 = a_1 x_1^2 + a_2 x_1 x_2 + a_3 x_2^2 + a_4 x_1 + a_5 x_2 + c$ reduces to
$$
\Delta^2 = a_1 x_1^2 + a_2 x_1 x_2 + a_3 x_2^2 =
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}^{\!T}
\underbrace{\begin{pmatrix} a_1 & \tfrac{1}{2} a_2 \\ \tfrac{1}{2} a_2 & a_3 \end{pmatrix}}_{=:\,\Sigma^{-1}}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}. \tag{6.46}
$$
This reveals the structure of a two-dimensional multivariate Gaussian distribution. In the case of more than two variables $x_1, x_2, \ldots, x_n$ the multivariate Gaussian is of dimension $n$. We have abbreviated the matrix as $\Sigma^{-1}$ to make the connection to the distance measure called the Mahalanobis distance. It defines the distance of a data point $\mathbf{x}$ to the center of a Gaussian function, incorporating the quadratic distortion of the space expressed by the matrix $\Sigma$. Its contours are defined by curves of constant density $\Delta^2$. The equi-potential lines of constant $\Delta^2$ are hyper-ellipsoids⁹. The principal axes of the hyper-ellipsoids are given by the eigenvectors $u_i$ of $\Sigma$, which satisfy

$$
\Sigma u_i = \lambda_i u_i, \tag{6.47}
$$
and the corresponding eigenvalues $\lambda_i$ give the variances along the respective principal axes. It becomes immediately clear that in calculating the eigenvalues of the matrix $\Sigma^{-1}$ we obtain $\Sigma^{-1} u_i = \lambda_i^{-1} u_i$, i.e., the same eigenvectors are solutions but with reciprocal eigenvalues. The directions of largest $\lambda_i$ therefore point in the directions of smallest variance (see Figure 6.13 on the facing page). Drawing the corresponding eigenvector, we look at a direction in which the data can be changed without much changing the response of the model, i.e., the model is insensitive to that pattern. This idea of back-projecting directions in feature space by computing both the most sensitive and the most insensitive direction is similar to the one used by the model of Wiskott and Sejnowski (2002). There a polynomial function of second order is learned from successively drawn image patches undergoing transformations.

Sources of Errors
Because of the specific form of the transformation we can go one step further. We are interested in directions in the data, that is, in points that lie on the manifold defined by the feature-space transformation.

⁹ For sign(a₁) = sign(a₃) it is an ellipsoid.


Figure 6.14.: Deviation from the optimal solution (largest eigenvector shown by the red arrow) for c = 0, 0.2, 0.5, 1, 10. In two dimensions the optimal solution (a direction in Diabolo space) is of the form (α², 2αβ + c, β²) with c = 0, because that defines a direction that corresponds to a direction in input space. For c > 0 the solution is no longer elliptical.

Because the ICA algorithm is applied in the feature space without using this knowledge, it sometimes comes up with directions that are not in the manifold. In Figure 6.14 the effect of a wrongly found direction in feature space is shown for the solution in input space. There are several reasons for problems of this kind. One would be that the algorithm does not have enough data to learn the correct source directions in the high-dimensional input space. Another is that assumptions about the source distributions are wrong (e.g., the linearity of the mixture).

The whitening used by many ICA methods may be a source of a methodical error. Whitening is performed in order to de-correlate the data (in feature space). Let us assume that the data is uncorrelated in input space. For the case of highly sparse sources (super-Gaussian) the projected data is also uncorrelated. But if the data is not that sparse or has a large noise component, the transformation into the feature space will introduce additional correlations (see Figure 6.12 on page 141). This happens because the data is restricted to a manifold which does not cover the space between the sources equally. The whitening procedure in the feature space will attempt to undo these faulty correlations, thereby correlating the sources. As the working examples presented in the next section show, this may not be a practical problem.

Nevertheless, we can utilize this as a source of additional¹⁰ information about the quality of our estimated sources. If we are only interested in components that correspond to a linear source direction in input space, we would prefer sources $\tilde{s}^i$ for which the corresponding $\mathcal{C}^i$ has rank one. A ratio of the largest eigenvalue to the sum of all eigenvalues that is near one indicates a 'trustworthy' source direction.

10Information about the quality of a source is also contained in the amount of power in the signal explained by the source.


Figure 6.15.: The 12 columns of the mixing matrix $w_{\cdot,\,i=1\ldots12}$, represented as patterns associated with each source $s_i$.

6.6. Applications

6.6.1. Toy Example: 6 Observations, 12 Sources
We now solve a two-times overcomplete ICA problem with FastICA in the space of monomials of constant order. Given are 12 statistically independent and sparsely distributed sources $s_{1\ldots12}$, each drawn according to the density function $p(s) \propto \exp(-s^2/2)^3$. The observations are 6-dimensional; therefore the model generating the data is defined using a non-square $(6 \times 12)$ mixing matrix $W$:

$$
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{pmatrix}
=
\begin{pmatrix}
w_{1,1} & w_{1,2} & \cdots & w_{1,12} \\
w_{2,1} & w_{2,2} & \cdots & w_{2,12} \\
w_{3,1} & w_{3,2} & \cdots & w_{3,12} \\
w_{4,1} & w_{4,2} & \cdots & w_{4,12} \\
w_{5,1} & w_{5,2} & \cdots & w_{5,12} \\
w_{6,1} & w_{6,2} & \cdots & w_{6,12}
\end{pmatrix}
\begin{pmatrix} s_1 \\ s_2 \\ s_3 \\ \vdots \\ s_{12} \end{pmatrix}. \tag{6.48}
$$
An intuitive interpretation of the columns of the mixing matrix was already given in our section about overcomplete ICA on page 119. They contain the 'spatial' patterns (sometimes called maps) that can appear. As one can see in the above equation, each column $i$ is multiplied only by the corresponding $s_i$. If $s_i$ is sparsely distributed, the pattern in column $i$ also appears sparsely in the mixture (which is obtained by adding all the weighted columns). A pattern appears sparse if it is mostly inactive but sometimes expressed strongly.

For our toy problem we designed the mixing matrix to contain binary values only. This restricts us to $2^6$ pairwise different patterns $p_i$; if we count the patterns $p_i$ and $-p_i$ as one, $2^6/2 = 32$ patterns can be constructed. This identification can be made because the ICA algorithm we are using is ambiguous in the sign of the solution.
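The toy data can be generated along these lines (NumPy assumed). The random selection of binary patterns and the concrete way of drawing the sparse sources are illustrative choices, since the thesis selected its 12 binary patterns by hand.

```python
import numpy as np

rng = np.random.default_rng(0)

# 12 pairwise different binary (+1/-1) patterns as columns of the 6 x 12 mixing matrix;
# p and -p are treated as the same pattern, as in the text.
patterns = []
while len(patterns) < 12:
    p = tuple(int(v) for v in rng.choice([-1, 1], size=6))
    if p not in patterns and tuple(-v for v in p) not in patterns:
        patterns.append(p)
W = np.array(patterns).T                      # shape (6, 12)

# Sparse (super-Gaussian) sources; cubing Gaussian samples is an illustrative
# stand-in for the sparse source density quoted in the text.
S = rng.standard_normal((12, 10000)) ** 3

X = W @ S                                     # 6-dimensional observations, Eq. 6.48
print(X.shape)                                # (6, 10000)
```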



Figure 6.16.: Some of the mixtures.

Notably this gives an upper bound for the number of independent binary patterns, which is nevertheless approximately 5 times larger than the number of patterns that can be extracted with a square ICA algorithm. We selected from this set (by hand) 12 pairwise different vectors based on good visual identifiability. The 12 patterns $w_{1\ldots6,\,i}$ are shown in Figure 6.15 on the page before, each represented by 6 color-coded discs. Some of the observations (mixtures) are displayed in Figure 6.16. Given many observations $\mathbf{x}(t)$, the task is now to find the patterns representing the hidden states of the system (the unknown columns of the mixing matrix).

Applying FastICA to Overcomplete Dictionary Problems
We now want to apply ICA to an overcomplete dictionary problem. Because for most real-world applications we do not know the correct number of sources, it is worthwhile to examine the algorithm in a case where the number of sources is twice as large as the number of observations. The hope is that the algorithm is, for example, able to find some of the correct sources, or a different set in each run. Both is in general not true. To demonstrate this we applied the FastICA method (Hyvärinen, 1999) to the data. We chose the symmetric approach and as non-linearity $g(u) = u^3$, which matches the density of the function generating the sources. That is, the model fits the data in terms of being an additive linear mixture with the correct type of densities; only the number of sources is not correct. In Figure 6.17 the 6 found sources are shown. The algorithm converges to components that are mostly mixtures of the true sources. The problem is especially hard for the ICA algorithm because the frequency of appearance is the same for all 12 patterns, and the amplitude of all patterns was also the same.



Figure 6.17.: Result of FastICA on the overcomplete mixtures

Figure 6.18.: Column-wise L2-norm of the calculated mixing matrix over the ICA components. The bar plot displays a measure of the linearity of each found source, computed as the fraction of variance that is explained by the first eigenvector displayed in Figure 6.20.


Applying ICA in Diabolo Space
Using the transformation into the space of monomials of constant order we represent our data in a 21-dimensional space built from the commutative monomials of order 2.

$$
\mathbf{x} = (x_1, \ldots, x_6) \;\xrightarrow{FS}\;
\big( x_1^2, x_1 x_2, x_1 x_3, x_1 x_4, x_1 x_5, x_1 x_6,\;
x_2^2, x_2 x_3, x_2 x_4, x_2 x_5, x_2 x_6,\;
x_3^2, x_3 x_4, x_3 x_5, x_3 x_6,\;
x_4^2, x_4 x_5, x_4 x_6,\;
x_5^2, x_5 x_6,\; x_6^2 \big) = \mathbf{p}. \tag{6.49}
$$
To enforce a zero mean of the data, half of the data points were switched in sign. We also corrected each term of the form $x_i x_j$, $i \neq j$, by a factor of $\sqrt{2}$ to obtain rotational invariance (see the example on monomial spaces of constant order, page 137). Linear ICA was performed on this data with the FastICA package in Matlab. Using the symmetric approach and the default non-linearity $g(u) = u^3$, the algorithm converged after 20 steps. The resulting independent components were sorted according to a decreasing norm of their corresponding row in the estimated de-mixing matrix (see Figure 6.18). This sorting preserves the relative importance of the found sources by selecting sources that contribute most to the power of the signal (Lee, Jung, Lee and Lee, 2000).

In Figure 6.19 on the facing page the first 12 components are shown. Because each ICA component is represented by a vector in feature space (see Equation 6.49), we display its entries as colored lines connecting the respective pixel positions of each dimension. In this way the second entry of an ICA component, for example, is drawn as a line connecting the points $x_1$ and $x_2$. The color of the line indicates the respective value in the component. The result is best read using the interpretation of a weighting of a feature-space dimension as a correlation of the corresponding two points: similar colors represent a similar fate of the points. Because ICA detects directions only up to a change in the sign of the component, we may negate each component. All points connected by red lines in the first component can therefore be assumed to have highly correlated values, whereas the top left point behaves differently. This corresponds to the original source number 12 shown in Figure 6.19 on the next page. By this reasoning one can confirm that all 12 sources are found as the first 12 components.

The graphs in Figure 6.19 are hard to read, but they contain the full information about the found directions. If we assume that the underlying sources are independent and the mixing is linear, we can apply the back-projection method presented in Section 6.5 on page 143 to visualize the found sources in input space.


Figure 6.19.: Result of FastICA in Diabolo space. The independent components are sorted according to the row norm of the calculated mixing matrix. It turns out that the first 7 components correspond to the original sources.


Figure 6.20.: Result of FastICA in Diabolo space. For the first 12 components the first eigenvector of the corresponding back-projection shows that all sources are correctly estimated. The components were sorted according to the L2-norm of the columns of the mixing matrix



Figure 6.21.: ICA components 13...21, shown by their first principal component. Some of the eigenvectors explain only very little of the overall variance in the ICA component, suggesting a direction found by ICA that is not a direction in input space.

Constructing for each component the equivalent quadratic form, we display its first eigenvector in Figure 6.20 on the page before. A comparison with the true sources in Figure 6.15 on page 147 confirms that the overcomplete ICA problem was solved successfully. We also analysed the remaining 9 sources. Especially the components 13, 14 and 15 are of interest because they have similar explanatory power, as suggested by Figure 6.18 on page 149. It turns out that they can be interpreted as correcting for small mismatches between the first components and the true sources (Figure 6.21).

6.6.2. Application to Natural Images

We applied the algorithm of the centralized mixture of Gaussian functions to an ensemble of pre-whitened natural images obtained from the homepage of Bruno Olshausen (Olshausen and Field, 1996). 128 components were learned from 30,000 image patches (64-dimensional). This results in a two-times overcomplete dictionary. After 30 iterations of the EM algorithm, the direction of largest variance of each cluster (a vector of dimension 1 × 64) is shown in Figure 6.22. We assume here that the direction of largest variance corresponds to a source direction in the data (see Figure 6.3 on page 125). The obtained spatial filters resemble mostly localized Gabor-like receptive fields, but also a whole spectrum of curved edges. Notably, the algorithm converges very quickly, and the basic structure of the components is fixed after around 10 iterations. This is surprising because of the large number of parameters in the model (we trained a full covariance matrix).
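The fit of the centralized (zero-mean) mixture of Gaussians can be sketched as a standard EM loop in which only the covariance matrices and mixing weights are updated. The sketch below (NumPy and SciPy assumed) is my reading of such an algorithm with a ridge-type regularization constant β; it is an assumption about the implementation, not code from the thesis.

```python
import numpy as np
from scipy.stats import multivariate_normal

def gem_fit(X, n_components=128, n_iter=30, beta=1e-5, seed=0):
    """EM for a mixture of zero-mean Gaussians (a sketch of the centralized mixture)."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    pi = np.full(n_components, 1.0 / n_components)           # mixing weights
    # Initialize covariances from randomly chosen data points plus a ridge.
    covs = np.array([np.outer(x, x) + np.eye(D) for x in X[rng.choice(N, n_components)]])

    for _ in range(n_iter):
        # E-step: responsibilities of the zero-mean components.
        log_r = np.stack([np.log(pi[k]) +
                          multivariate_normal(mean=np.zeros(D), cov=covs[k],
                                              allow_singular=True).logpdf(X)
                          for k in range(n_components)], axis=1)
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)

        # M-step: means stay fixed at zero; update weights and covariances,
        # regularized by beta to avoid collapse onto single data points.
        Nk = r.sum(axis=0)
        pi = Nk / N
        for k in range(n_components):
            covs[k] = (X * r[:, k, None]).T @ X / Nk[k] + beta * np.eye(D)
    return pi, covs
```

A call like `pi, covs = gem_fit(patches, n_components=128)`, followed by an eigendecomposition of each covariance matrix, yields the per-component directions of largest variance displayed in Figure 6.22.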


Figure 6.22.: Source directions learned by a centralized mixture of Gaussians (results were sorted left to right, top to bottom by decreasing sum of eigenvalues). Each image displays the eigenvector with largest eigenvalue and indicates a found source direction. In Figure 6.26 two more eigenvectors for the 16 mixtures in the first row are shown of the large number of parameters in the model (we trained a full covariance matrix). We suspect that the nice convergence behavior is obtained because of the restriction of zero mean Gaussian distributions. The original Gaussian mixture model trained with EM in a first phase has to adapt the means of the components before fine-tuning the covariance matrices. By fixing the means the covariance matrices can be optimized beginning with the first iteration. Of course we cannot rule out the possibility of a plateau in the energy function (to balance this problem we trained the model for 30 generations). For displaying purposes the components are sorted according to the sum of their respective eigenvalues. Thus we prefer ’large’ Gaussians over ones that converge to the single data point at the origin. The singularity was prevented by the regularization described above. In Figure 6.23 the actual eigenvalues for all eigenvectors are plotted (note, that we plot the log of the data to elevate the exponential nature of the eigenvalue spectrum). Indeed, we find that the eigenvector with the smallest sum of the eigenval- ues has a flat eigenvalue spectrum. Also the next 2 3 eigenvalues correspond to nearly flat spectra. Therefore an interpretation−of the structure in the re- spective direction of the eigenvector with the largest eigenvalue is difficult (last 3 4 source directions of Figure 6.22). The other source directions can −



Figure 6.23.: A plot of the log eigenvalues corresponding to each of the first eigenvectors displayed in Figure 6.22. Only the last 3-4 mixture components show a relatively flat eigenvalue spectrum and can thus be regarded as coding only for the center distribution (with vanishing covariance; note, however, the regularization by β = 0.00001).

The other source directions can be described as containing mostly edges, but sometimes also curved and more complex structures are found. Notably, the edge patterns are not as localized as the edge patterns found, for example, in the model by Olshausen and Field (1996). We suspect that this is due to the regularization, which counteracts the sparsest solution (which would correspond to singular Σᵢ's of rank one).

At first glance the receptive fields appear to be localized in space. This is confirmed by calculating the mean correlation over the distance of pixel pairs (see Figure 6.25). Pixel pairs with small lateral displacement have predominantly high correlations. It is also interesting that the algorithm selects very few negative correlations between pixels. But this conclusion may be misleading, because the number of pixel pairs is not distributed equally over the distance between pairs (see the histogram of the number of pixel pairs per displacement in Figure 6.25 on page 157, right).

Because each component is learned as a multivariate Gaussian, the description of one component by its direction of highest variance is insufficient; the density estimate obtained by the model assumed a full covariance matrix. To further analyse the found components, we plot in Figure 6.26 two more of the eigenvectors for the first 16 components (first row in Figure 6.22). Most eigenvectors resemble stimuli with similar orientation and location but changed phase and spatial frequency.


Table 6.2.: Lifetime and population response kurtosis as given by Willmore and Tolhurst (2001) for pseudo-whitened images, together with the values obtained for the learned code (GEM) of Figure 6.22. The variations for the GEM code are found to be ±3.48 for the lifetime and ±0.21 for the population response kurtosis.

              Lifetime response kurtosis   Population response kurtosis
  Gaussian              8.93                          0.52
  DoG                  11.20                          1.74
  Olshausen            17.21                          2.17
  PCA                   8.13                          3.07
  Walsh                10.91                          4.01
  Sinusoid             12.05                          4.62
  Gabor                18.47                          5.37
  GEM                   8.35                          8.83

The percentage values on top of the figures represent the variance explained by the respective eigenvector and can thus be interpreted as a relative weighting of the patterns. Neurons with fast decreasing eigenvalues can be assumed to respond to the stimulus in a linear way, because they are sufficiently described by their first eigenvector alone. Neurons with similar eigenvalues resemble more the complex type of cell. This indicates the remarkable property of the second-order neuron model to model the response characteristics of simple and complex cells independently from one another. Notably, in the model we find complex cells which are not composed of simple cells. Further eigenvectors with decreasing variances show the usual high-frequency checker-board patterns (data not shown).

To evaluate the quality of the found sources we analysed the sparseness of the code. This is done in order to compare the non-linear filters with filters previously obtained by linear methods (e.g., ICA, PCA, Walsh, DoG, Gaussian, Olshausen and Field, ...).

Sparseness in the Representation
We now analyse the code in terms of its sparseness. Let p(x) describe the relative frequency of occurrence of specific responses in our model. One source i is parameterized by a multivariate distribution with matrix $A_i$. The response of source i to some data $\mathbf{x}$ is given by the length of that vector in a space distorted by $A_i$. This measure (or metric) is given by the Mahalanobis distance $y$ between a vector $\mathbf{x}$ and a multivariate distribution described by its variance-covariance matrix $A$:



Figure 6.24.: Scatter plot of the data from Table 6.2. The code obtained by the centralized Gaussian mixture model trained by EM (GEM) outperforms the linear codes with respect to population sparseness and is comparable to the PCA code in its lifetime sparseness.

$$
y = \mathbf{x}^T A^{-1} \mathbf{x}. \tag{6.50}
$$

The inverse is used because the quadratic form $\mathbf{x}^T A \mathbf{x}$ only describes the variance along the direction $\mathbf{x}$. Directions with small variances correspond to directions with a steep gradient $\partial(\mathbf{x}^T A \mathbf{x})/\partial\mathbf{x}$, thus the distance to a data point is large. The inverse of $A$ solves the problem because it inverts the eigenvalues of $A$ (the eigenvectors stay the same, see Section 6.5 on page 145). If $A$ is a covariance matrix of some data, its eigenvalues will be real and positive (or zero), and therefore the response of the model unit to the data calculated by the quadratic form is $\geq 0$. Every distribution we obtain is therefore asymmetric with respect to an activation of 0. Assuming that both $\mathbf{x}$ and $-\mathbf{x}$ are equally probable, we symmetrize our distribution by
$$
y_{\text{sym}} = \epsilon\, y, \qquad P(\epsilon = 1) = 0.5, \quad P(\epsilon = -1) = 0.5 \tag{6.51}
$$
(flipping the sign of the response of each unit randomly). This is reasonable because by the quadratic form we measure variances of a multivariate Gaussian in the direction $\mathbf{x}$, which is symmetric around its center. There are two major ways of obtaining distributions that describe the unit responses.


Figure 6.25.: Left: Distributions of the coefficients obtained from applying the 128 quadratic forms to the data (lifetime responses). The distributions are non-Gaussian with a mean kurtosis of 8.35 (±3.48). Right: Predominantly local correlations are selected. Mean correlations over the distances between the respective pixel pairs; the histogram counts, for all 128 components, the number of connections over the pixel pair distance.

One is the distribution of the response of one unit over many data points (lifetime measure), the other is the distribution of the responses of all units to a single image patch. Both can be used to describe the sparseness of the code by computing the kurtosis of the respective distribution. Usually the population response kurtosis is of higher importance because it describes a property of the whole population and not an averaged property of single units. Willmore and Tolhurst (2001) found that both measures are largely uncorrelated. The first measure, the lifetime kurtosis, is usually defined by

$$
K_L = \left\{ \frac{1}{M} \sum_{i=1}^{M} \left( \frac{y_i - \bar{y}}{\sigma_y} \right)^{\!4} \right\} - 3, \tag{6.52}
$$
where $M$ is the number of image patches (30,000) and $y_i$ is the response of the neuron to patch $i$, calculated by the quadratic form of Equation 6.51. The population kurtosis is computed in the same manner, but as a mean over the responses of the $N$ neurons:

$$
K_P = \left\{ \frac{1}{N} \sum_{j=1}^{N} \left( \frac{y_j - \bar{y}}{\sigma_y} \right)^{\!4} \right\} - 3. \tag{6.53}
$$
In Table 6.2 the measurements of Willmore and Tolhurst (2001) are shown together with the values obtained by our algorithm (last line). The lifetime kurtosis of the GEM code is comparable with the one observed for PCA.
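Both sparseness measures are straightforward to compute from a matrix of unit responses. The sketch below (NumPy assumed; variable names are mine and the data is a placeholder) implements the symmetrized quadratic-form response of Equations 6.50-6.51 and the lifetime and population kurtosis of Equations 6.52-6.53.

```python
import numpy as np

def responses(X, covs, rng):
    """Symmetrized quadratic-form responses y_sym (Eqs. 6.50 and 6.51).

    X: (M, D) image patches; covs: list of (D, D) covariance matrices of the N units.
    """
    Y = np.stack([np.einsum('md,de,me->m', X, np.linalg.inv(C), X) for C in covs], axis=1)
    signs = rng.choice([-1.0, 1.0], size=Y.shape)    # random sign flip, Eq. 6.51
    return signs * Y                                 # shape (M, N)

def excess_kurtosis(v):
    """Kurtosis minus 3, as in Eqs. 6.52 / 6.53."""
    v = np.asarray(v, dtype=float)
    return np.mean(((v - v.mean()) / v.std()) ** 4) - 3.0

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 8))                   # placeholder patches
covs = [np.eye(8) for _ in range(16)]                # placeholder unit covariances
Y = responses(X, covs, rng)

lifetime_k = np.mean([excess_kurtosis(Y[:, j]) for j in range(Y.shape[1])])    # mean over units
population_k = np.mean([excess_kurtosis(Y[i, :]) for i in range(Y.shape[0])])  # mean over patches
print(lifetime_k, population_k)
```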



Figure 6.26.: Sub-space selected by each of the first 16 mixture components of Figure 6.22. For each component the three eigenvectors with the largest eigenvalues are shown; the percentage on top of each eigenvector indicates the variance it explains.

This simply reflects that each single neuron is parameterized by a single multivariate Gaussian distribution. In terms of population sparseness the mixture of Gaussian functions learned by the EM algorithm (GEM) outperforms all other codes, as can also be seen from the scatter plot in Figure 6.24.

As a reason for the higher population kurtosis of Gabors, sinusoids, principal components, and Walsh functions, Willmore and Tolhurst (2001) observed a large variability of the named codes in terms of representing spatial frequency. Since natural images are known to have amplitude spectra that are approximately proportional to 1/f, there is a large amount of variance in the low-frequency Fourier coefficients. Thus the low-frequency filters can be expected to have larger response magnitudes than the high-frequency filters. As a result, the few low-frequency filters often produce responses that are large compared with the responses of the many high-frequency filters, and the resulting representations often have high population sparseness. This argument cannot fully be applied here, because pseudo-whitened images were used for training and testing, and pseudo-whitened images have their amplitude spectra approximately flattened. So there is no longer a concentration of variance at low spatial frequencies. As we can observe in Figure 6.22 on page 153, our model indeed shows only a slight tendency to produce filters with varying spatial frequency.

Spread of Variance in the Representation
Another important factor is how evenly variance is spread amongst the population of coding units. Field (1994) discussed the idea that variance should be coded evenly amongst neurons (a preference for evenly 'distributed' or 'dispersed' codes). In contrast to compact codes like PCA, a distributed code is more noise-insensitive because it does not depend on the precise firing of a few neurons. Based on the very low variation of the population response kurtosis in our model (±0.21), the code found by GEM is indeed well distributed.


Figure 6.27.: Left: A pattern (400×400) used to test the receptive field properties orientation preference and spatial frequency. It consists of a circular sinusoid with increasing spatial frequency. Right: The response of all 128 neurons to the test pattern, obtained by repeatedly applying the quadratic form to patches extracted from local positions.

This also indicates that the population response distribution is non-Gaussian. Taken together, the Gaussian mixture model trained with EM produces a sparse and well-distributed code, which appears to be a useful strategy for encoding visual information.

Representation of Orientation Selectivity and Spatial Frequency
We also analysed the quadratic forms obtained from the Gaussian mixture model in terms of their similarity to simple cells (that is, how non-linear the second-order filters are). This analysis ignores large parts of the structure of the receptive field of the neurons, because it concentrates solely on orientation preference and spatial frequency, but in doing so it mimics the currently used methods for the analysis of cortical neurons. We generated a spatial pattern containing all orientations and a range of spatial frequencies (Figure 6.27, left). For each neuron $i$ we successively extracted overlapping 8×8 image patches $\mathbf{x}$ from the circle pattern and computed the scalar-valued quadratic form $\mathbf{x}^T C^i \mathbf{x}$. In Figure 6.27, right, the corresponding coefficients for each neuron are displayed in a color code, black indicating large responses of the neuron to this region of the pattern. The localized form of most solutions in angular and radial direction indicates that a large part of the neurons resembles orientation-selective neurons with a preferred spatial frequency (some neurons appear to be selective to two or three orientations in the stimulus). We also analysed the linear response of each neuron so that it can be compared to the response obtained from the quadratic form.


Figure 6.28.: The linear response part of the neurons, obtained by spatially filtering the test pattern of Figure 6.27, left, with the first eigenvector. For visual presentation the results are smoothed to alleviate the effects of interference patterns due to sub-sampling; the same smoothing was also applied to the images in Figure 6.27, right.

In Figure 6.28 the linear part of the neuronal response is approximated by a linear filter defined by the largest eigenvector of the corresponding second-order neuron. Again, this filter was applied to the test pattern in Figure 6.27, left. The degree of change between corresponding patterns in the two figures can be used as an indicator of how well the neurons can be described as simple cells.

6.6.3. Application to Natural Images II

We applied FastICA in the space of monomials of constant order to the ensemble of images shown in Figure A.1 on page 175 (from the homepage of Patrick Hoyer). From these images we extracted 100,000 circular patches with a diameter of 7 pixels. This was done in order to reduce the number of pixels, thus the number of pixel pairs, and by this the number of dimensions in the feature space (≈ 700). Using the symmetric approach of FastICA we reduced the dimension of the data by using only the first 49 principal components in the whitened feature space. This results in 49 independent components found by the ICA procedure. For each component we display in Figure 6.30 on page 162 the eigenvectors corresponding to the two (absolute) largest eigenvalues. The ratio between the eigenvalues is shown beneath each image pair. A negative value indicates that one eigenvalue is negative (we switched the sign of the components so that the larger of the two has a positive sign).


Figure 6.29.: Symmetry detectors from natural images learned by PCA. The first 49 principal components, sorted by their eigenvalue, are shown (they explain 94% of the variance in the data). Each eigenvector lives in dot-product space and is represented by two images (rows B, C) showing the patterns that maximize and minimize, respectively, the filter response (quadratic form).


Figure 6.30.: Symmetry detectors learned by FastICA. For each of the 49 found independent components two images (A, B) are shown. A is the eigenvector with the largest eigenvalue and maximizes the output of the filter; B is the eigenvector with the smallest eigenvalue and minimizes the output. The number below each independent component codes for the ratio of the largest to the second largest eigenvalue; a negative sign indicates that they have opposite signs. The (absolute) ratio can be used as an indicator for the linearity of the unit. (Figure from Bartsch and Obermayer, 2003.)

The pattern that corresponds to a negative eigenvalue will minimize our quadratic form; we therefore interpret the two patches of each component as an antagonistic pair describing patterns that either facilitate or inhibit the output of the unit. As we already know, localized edge detectors form a set of independent sources for natural images (Olshausen and Field, 1996; Bell and Sejnowski, 1996). Differently from ours, these models work with a representation of the sources in 'image space' (linear models) rather than in a 'product space' as our (non-linear) model does. Apparently edge-like attributes are learned by the model. The edges are also, to some degree, localized in space. Further analysis of this model and comparisons of the properties of the units in terms of sparseness are needed.

Two image pairs were analysed further. The first is unit nr. 43 (in the lower left corner).


Figure 6.31.: A neuron selective to texture components in its sub-fields. All eigenvectors of filter nr. 43 are orientation selective in the respective sub-fields. Neurons like this can be used to detect texture boundaries. (Figure from Bartsch and Obermayer, 2003.)

Both the most positive and the most negative eigenvector show the same basic structure, that of an edge. This is surprising because the corresponding neuron would be both activated (positive) and in-activated (negative) by an apparently similar pattern. To clarify this apparent contradiction we analysed some more of the eigenvectors of this unit. The eigenvectors depicted in Figure 6.31 indicate that the neuron responds differentially to texture components in its on- and off-sub-fields. The unit's response, as computed by the quadratic form, is amplified for texture components in one sub-field and suppressed for texture components in the other sub-field. Because a change in texture is likely at object boundaries in images, units of this type can be used to detect an important sub-population of edges. Experiments in V1 also show that activity levels near simple texture boundaries are increased 10-15 ms after an initial cell response (Gallant et al., 1995).

Some of the components could not be sufficiently explained as edge detectors. Notably, the components nr. 47 and nr. 49 are better described by a yin-yang type of pattern. Aware that we might be over-interpreting, we nevertheless assigned a function to these units. In Figure 6.32 on the next page we constructed examples of the orbit defined by the linearly weighted superposition of the respective two patterns. Approximately in the middle position both components resemble edge detectors. Therefore, an edge of the corresponding orientation will elicit a balanced amount of excitation and inhibition for these units. Only if the edge curvature is changed in the direction of either the left or the right pattern will the respective unit respond by being either facilitated or inhibited.


Figure 6.32.: The yin-yang type patterns of components nr. 49 and nr. 47 resemble bent edges and can thus be regarded as coding edge curvature. In each row the leftmost pattern shows the eigenvector $u_0$ of the largest (positive) eigenvalue and the rightmost pattern the eigenvector $u_{49}$ of the smallest (negative) eigenvalue. The patterns in between are obtained by the linear superposition $(1 - \beta)u_0 + \beta u_{49}$, $\beta \in [0, 1]$. (Figure from Bartsch and Obermayer, 2003.)

6.7. Biological Implementation by Dendritic Micro-circuits

Having shown that symmetry based on local spatial correlation is a 'useful' feature in terms of information processing, we now speculate how a neuronal module could implement this apparently abstract framework. Assuming a basically one-to-one (topographic) projection from retinal space to the visual cortex, lateral connections within the cortex as well as converging projections from the LGN to the primary visual cortex are candidates for implementing structure detection at the level of V1. For now we focus only on the lateral connections. Short-range lateral connections, in contrast to the long-range steppy connections, are known to be unspecific, connecting every neuron up to a certain range. Also the bar-shaped steppy connections in layers 4B and upper 4Cα show only a slight tendency to connect similar orientations (Asi et al., in press). We speculate that these patterns can be thought of as an anatomical substrate for spatial information processing. We now point out how single neurons can detect local (spatial) correlation structure by micro-circuits implemented in their dendritic trees. This approach is tempting because of the availability of data about the morphology of single neurons from staining experiments; it is technically much more difficult to map the feed-forward connections to (complex) cells in visual cortex because of the large distance of the respective axons. Mel (1994) summarized some ideas of dendritic function as:


• The spatially extended nature of a dendritic tree permits useful spatiotemporal interactions among active synapses.
• One dendritic tree can have multiple pseudo-independent processing sub-units.
• Nonlinear membrane mechanisms, appropriately deployed, can allow the dendritic tree of a single neuron to act as a powerful multi-layer computational (e.g., logical) network.

For the 'biological' implementation of structure detection we need two ingredients: (i) a multiplicative comparison of many spatially extended input features and (ii) a summation of the result. This can be achieved by known properties of dendritic arbors: the delays from dendrites to soma are in the order of one membrane time constant (Agmon-Snir and Segev, 1993). In contrast, the local charging times on thin dendritic branches may be an order of magnitude faster than the membrane time constant τ. Therefore distal dendritic arbors may function more as coincidence detectors for local synaptic inputs, whereas the function of the soma is more that of an integrator. With this functionality a dendritic tree can calculate a sum of products, i.e., a measure of structure. Which position in retinal coordinates is connected with which other position is, in this framework, defined by selective contacts on the dendritic tree.

There are other models for biologically inspired multiplication. One example is multiplication based on coincidence detection (Bugmann, 1997). Neurons are sensitive to spike timing because the biological integration is leaky. If we suppose that for n inputs, after (n − 1) spike increments, the membrane potential is such that the n-th spike causes firing, the probability of firing P(n, τ, Δt) is:

$$
P(n, \tau, \Delta t) = \Delta t\, \tau^{\,n-1}\, n \prod_{i=1}^{n} f_i,
$$
where $f_i$ is the firing rate of the $i$-th input. Hence the output firing rate is
$$
f_{\text{out}} = \frac{P(n, \tau, \Delta t)}{\Delta t} = \tau^{\,n-1}\, n \prod_{i=1}^{n} f_i,
$$
where $\tau$ is the length of the time window. Multiplication could also be generated by assuming that the addition of voltages is accompanied by non-linearities between spike activity and membrane potential. If (i) the synapse produces voltage from spike activity through a compressive non-linearity (a logarithmic function) and (ii) the spike generation at the axon hillock is an accelerating process (an exponential function), the

165 6. The Real Valued Quadratic Form output of the neuron would be the product of the inputs, xy = exp(ln(x)+ln(y)) (see Payne and Peters (2002), page 376).
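The two multiplication schemes above lend themselves to a compact numerical illustration. The following sketch is a toy example of ours, not a model from the thesis; all function and variable names are made up.

```python
import numpy as np

def coincidence_multiplication(rates, tau):
    """Output rate of an ideal coincidence detector: f_out = tau^(n-1) * prod(f_i).

    rates : firing rates f_i of the n inputs (Hz); tau : coincidence window (s).
    """
    rates = np.asarray(rates, dtype=float)
    return tau ** (len(rates) - 1) * np.prod(rates)

def log_exp_multiplication(x, y):
    """Multiplication via a compressive synapse (log) followed by an accelerating
    spike-generation nonlinearity (exp): x * y = exp(ln x + ln y)."""
    return np.exp(np.log(x) + np.log(y))

if __name__ == "__main__":
    # two inputs firing at 40 Hz and 25 Hz, a 10 ms coincidence window
    print(coincidence_multiplication([40.0, 25.0], tau=0.010))  # 10.0 Hz
    print(log_exp_multiplication(40.0, 25.0))                   # 1000.0
```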

7. Summary and Outlook

Our understanding of the brain is still very fragmentary. Even in areas like early vision, where large amounts of data have been collected during the last decade, discoveries are repeatedly being made that change our view on cortical information processing. Ideas about the function of cortical neurons remain adequate only as long as they comply with the known facts. If new experiments come up with conflicting data, the models need to be modified to bring them back in line with the observations. For example, the basic description of cells in primary visual cortex as either simple or complex is known to be a 'lie to children', that is, a simplification for the sake of clarity. The cells are better described as populating a continuum along which we find a graded change from neurons acting solely as simple cells to neurons with complex cell responses. Along the same line goes the explanation of complex cells as being built from converging simple cells with overlapping receptive fields1. In this thesis we first questioned the role of intra-cortical networks in the generation, sharpening, and modulation of orientation preference as one of the main features of primary cortical neurons. Step by step a model is derived and changed according to the subject under investigation. By modeling more explicitly the different populations of neurons in a single column we found that network effects can account for a large variety of phenomena like contrast invariant orientation tuning and contrast saturation. The model was extended to predict response properties related to context effects, that is, to modulations of neuronal responses by stimuli applied outside the classical receptive field. First a full orientation hypercolumn and afterwards a system of two coupled orientation hypercolumns is used to show principal difficulties in having cross-orientation modulations by iso-orientation specific patchy connections. Taking the spatial layout of cortex better into account we derive a model for analyzing the influence of local cortical connections on the activities of neurons in V1. A set of orientation columns is arranged according to a measured orientation map, and orientation columns are connected by local excitatory and slightly more distributed inhibitory fibers. We found that two opposite effects contribute to the observed contextual modulation: (i) local inhibition that is induced by a local change in input (leads to suppression), and (ii) dis-inhibition.

1Some animals have complex cells but no simple cells.

By changing the configuration of the stimulus different regions of the orientation map are activated. Changes in the local structure then define which effect is more prominent, suppression or facilitation. In the second part of the thesis we analysed the input into the visual system. Starting from the observation that neurons in primary visual cortex already respond to a wide range of different stimuli, we formulated the hypothesis that higher order features in spatial patterns can be described in terms of their intrinsic invariance and symmetry. A mathematical formulation of smooth local symmetry was given and led to the framework of polynomial functions. In order to draw out and test its logical and empirical consequences we analysed the descriptive power of the model. We found that, based on intuitive reasoning, orthogonal basis functions for the detection of invariances to rotation, scaling and shifts could be defined. Also, applications for object classification, image alignment, and landmark detection illustrate the principal advantage of structure analysis over methods of shape analysis. One of the main points of this thesis is to introduce two new learning methods for high-order models. In the context of neuronal nets high-order models are known to have higher memory capacity (Poirazi and Mel, 2001) and to be computationally richer than linear or threshold units; for example, one can implement parity, exclusive-or, or lookup table functions. Furthermore, such models also better represent the operations of real neurons containing highly branched dendrites with voltage-dependent membrane conductances (Mel, 1994). We showed that symmetry detection can be formulated as a linear model in the space of dot-products. The first presented algorithm for the second-order model extracted the underlying (sparsely represented) causes of the data by estimating the probability density of the data. The proposed algorithm is based on an algorithm for missing data, the Gaussian mixture model trained by EM. We showed how the estimated centralized densities can be used to extract a linear overcomplete basis set for natural images. The obtained non-linear code outperforms other known linear codes in being well distributed and having a high population sparseness. Finally, based on geometrical considerations we have shown how correlations in the data can be learned by means of linear methods. By this we extended the use of linear methods to an important group of non-linear transformations. In the context of independent component analysis the results indicate a distinguished transformation of the data into a feature space in which the independence assumption can be fulfilled for a set of overcomplete basis functions. There are, of course, many open questions not addressed in this thesis. In the following some ideas for further research are discussed, sometimes only briefly, sometimes more comprehensively.


The proposed correlation space for statistical analysis of data, in particular natural images, was recently used in the context of classification of handwritten characters (Schoelkopf, Simard, Smola and Vapnik, 1998). Here local correlations in the images were used to perform a linear classification task by means of support vector machines. Decision boundaries are defined in the space of all possible products of d pixels by using a kernel k(x, y) = (x \cdot y)^d. Notably, prior knowledge about the local correlation nature of natural images was incorporated by a pyramidal sampling of the pixel pairs, selecting more pixel pairs with short displacement. Our results in Figure 6.25 on page 157 also indicated that this strategy of reducing the dimensionality of the feature space is valid for our type of data. Intuitively this behavior may change for large receptive field sizes. The presence of objects in a visual scene implicitly defines long-range correlations in the range of the size of the object. Therefore it would be worthwhile to explicitly search for long-range correlations in images because they can be assumed to be rare events, i.e., to be informative. It is tempting to apply the proposed algorithms to larger receptive field sizes, which would comply with structure detection in higher visual areas. If we change the size of the receptive field we also change the scale on which structure is detected. Therefore different cortical areas, parameterized by the mean size of the receptive fields of their neurons, can independently 'work' on the same input. But do different areas have direct access to the visual input? For V2 and V3 it is known, at least in cat, that these areas are also innervated by fibers coming from the LGN. For monkeys direct connections from the retina to visual areas apart from the connections to V1 are not known. Nevertheless, latency measurements have revealed higher latencies for connections within one area compared to latencies between areas (Nowak and Bullier, 1997). Fast inter-area connections can therefore relay the visual input so that computations in the cortical areas can proceed in quasi-parallel. An architecture that models the interplay between cortical areas based on these ideas could be analysed in terms of the interplay of latencies of responses related to intra-cortical connectivity and top-down connectivity from the higher visual cortical areas. This relates to attentional effects and, in the context of vision, to tasks of active vision. It is also interesting to follow the path of using even higher order correlations in the data. A model that incorporates third order correlations can be easily formulated using the mathematical background presented. By incorporating the fourth order tensor in the polynomial model we arrive at:

\mathcal{P}_N(x) = X_{ijkl}\, x_j x_k x_l + W_{ijk}\, x_j x_k + V_{ij}\, x_j + u. \qquad (7.1)

Again, reducing each polynomial model to its form of highest constant order monomials (three variables, d = 3) we can apply the linear methods in feature space.
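The kernel mentioned above can be written down in a few lines. The sketch below is our own illustration (the data and names are made up); it evaluates the polynomial kernel k(x, y) = (x · y)^d between image patches, i.e., it implicitly spans the space of all products of d pixels without ever constructing that space explicitly.

```python
import numpy as np

def polynomial_kernel(X, Y, d=3):
    """Gram matrix of k(x, y) = (x . y)^d between two sets of row vectors.

    Each entry implicitly sums over all products of d pixel values, i.e.,
    it works in the correlation space discussed in the text."""
    return (X @ Y.T) ** d

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patches = rng.standard_normal((5, 49))     # five hypothetical 7x7 patches
    K = polynomial_kernel(patches, patches, d=3)
    print(K.shape)                              # (5, 5)
```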

It is apparent that only the degree of the polynomial, which is a free variable, defines the form of highest constant order monomials. Another interesting extension of the method is connected with the notion of canonical correlations. There we want to find linear combinations of the variables which give us the maximum correlation between the combinations. If we assume the preferred solution to be on the manifold (that is, if we are interested in linear overcomplete ICA) we can constrain the search space of the algorithm to solutions lying exactly on the manifold. This corresponds to a specific type of parameterized non-linear transformations in the input space. Basically we have to preserve the property of linearity of source directions in feature space. First experiments indicate that models of this type can be learned by maximizing the entropy of the data distribution in feature space. Probability density functions as estimated in Appendix A.1 on page 174 are powerful tools that can be utilized, for example, to compare different sets of images or as a parameterization of generative models. The obtained form is appealing because it uses a very weak assumption about natural images. The computations performed by a cortical neuron may be expressed by the spatial layout of its connection pattern. We assume here that at least as much information is stored in the existence or non-existence of synaptic connections as in their strengths. Apart from the feed-forward pathway (converging LGN input produces an orientation bias) the lateral connection structure is an interesting candidate to explain the response properties of cortical neurons. To test implications of this we analysed the spatial structure imposed by our binary quadratic forms defined in Section 5.1 on page 95. The first two patterns (A_rot and A_scal) from Figure 5.1 on page 95 were selected for this purpose. We derived an additional connection pattern using a random subset of all possible connections (using 20% of all possible connections). Taking care of the hemi-retinal to V1 distortion2 we computed density profiles indicating the mean density of connecting fibers (i.e., lines) between pixels at different distances to the center of the pattern. In Figure 7.1 on the facing page, left, it can be seen that the profiles are notably different in all three cases. Especially the density near the center of the structure suffices to distinguish between the artificial classes of neurons. We used the method also to analyse lateral connection structures measured in macaque primary visual cortex. Lateral connections in this region exhibit a rich spatial structure in their intra-laminar connections. Short-range connections throughout the depth of the cortex are found to be isotropic, connecting to all neurons in the local vicinity of the neuron.

2The global retinotopic input-mapping from the hemi-retinal image to the primary visual cortex can be approximated by the function of a complex variable given by Schwartz (1980): G(z) := log[(z + 0.333)/(z + 6.66)], where z = x + iy, |z| ∈ [0, 1], x > 0.
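For completeness, the mapping quoted in the footnote is easy to evaluate numerically. The sketch below is our own illustration and simply implements G(z) = log[(z + 0.333)/(z + 6.66)] with the parameter values as given by Schwartz (1980); the grid of test points is made up.

```python
import numpy as np

def retino_cortical_map(x, y):
    """Approximate hemi-retina -> V1 mapping G(z) = log((z + 0.333)/(z + 6.66)),
    with z = x + iy, |z| <= 1 and x > 0 (Schwartz, 1980)."""
    z = x + 1j * y
    return np.log((z + 0.333) / (z + 6.66))

if __name__ == "__main__":
    # map a small grid of retinal positions in the right hemifield
    x, y = np.meshgrid(np.linspace(0.05, 0.8, 4), np.linspace(-0.5, 0.5, 5))
    w = retino_cortical_map(x, y)
    print(w.real.round(2))   # one cortical coordinate (log-eccentricity)
```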



Figure 7.1.: Left: The type of operation performed by a neuron shows up in its lateral connection pattern. The density of the connections (mean over all orientations, in arbitrary units) was computed from the pixel image of Figure 5.1a on page 95 (artificial data); the curves correspond to r_1 = r_2 (rot.), θ_1 = θ_2 (scal.), and the random pattern (else). Center: The spatial layout of the cell in Figure 2.9a (macaque monkey). Color coded are the local regions in which connections between spatial positions are assumed. Right: Profile of connection density over distance from the injection site (µm).

In the superficial layers 2/3 long-range connections project mainly to patches of similar stimulus preference (see Section 2.3.3 on page 27). Interestingly, in layer 4B long-range connections are also found that indicate a functional role different from the iso-orientation biased layer 2/3 circuitry (see Figure 2.9 on page 29). As a first attempt we analysed the spatial layout of one layer 4B/upper 4Cα cell (see Figure 2.9a). The obtained profile of the connection pattern density is shown in Figure 7.1, center. Comparing it with the profiles for scaling and rotation (same figure, right), its multi-modal appearance indicates a more complex structure, which is not surprising taking into account its bar-like structure. If we assume that lateral connections influence the function of the neurons based on their selection of spatial correlations, we can predict, given the lateral layout of the connections, the response properties of the corresponding neuron. To clarify the influence of this structure on the orientation preference and preferred spatial frequency of the neuron in question we reconstructed a quadratic form based on some assumptions about this particular neuron. (i) The neuron selectively samples spatial locations in the input, and these locations are defined by the regions of dense terminal labeling in Figure 2.9 on page 29. (ii) The operation performed by the neuron in its dendritic tree is assumed to depend only on the structure found in the stimulus and can be modeled in the framework of second-order models (local multiplication followed by global summing).


Figure 7.2.: Left: Test pattern (600 × 600) similar to the one used in Figure 6.27 on page 159 to test the orientation preference and preferred spatial frequency of a spatial filter (42 × 42). Right: The corresponding response image of the quadratic form defined by the lateral connection structure of the layer 4B/upper 4Cα cell in Figure 2.9a on page 29.

In particular we assume that apart from summing there is no dendritic micro-circuit between the bar-like patterns. Based on these assumptions we extracted from the shape of the lateral connections in Figure 2.9a on page 29 N = 7 distinct regions (coded in color in Figure 7.1, center). In each region we assume that correlations between the terminal zones are computed. A 'correlation' matrix C was computed as the mean

C = \frac{1}{N} \sum_{i=1}^{N} x_i x_i^T \qquad (7.2)

where x_i is obtained as the vectorized binary image selecting exclusively the pixels corresponding to the i-th colored region. Using C we calculated a hypothetical response of the neuron in question by its quadratic form. Note that C is positive semi-definite and has rank 7 (= N). In Figure 7.2, left, a test pattern is shown which we used already in Section 6.6.2 to estimate the orientation preference and preferred spatial frequency. In the corresponding response image red indicates a high response amplitude. As one can observe, the neuron is selective for a particular orientation. This is remarkable because no particular orientation preference was assumed beforehand. The selective spatial sampling alone is sufficient to introduce a bias for a particular orientation. The way this is achieved is similar to the emergence of orientation selectivity in the local receptive fields of second order neurons. Inputs in the bar-like sub-fields implement a 'common-fate' mechanism producing large responses for uniform stimuli that fall into these regions.
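A minimal sketch of this construction is given below. It is our own illustration, not the original analysis: the region masks are random stand-ins for the N = 7 labeled regions, and all names are ours; it only shows how Equation 7.2 and the quadratic-form response could be computed.

```python
import numpy as np

def correlation_matrix(region_masks):
    """C = (1/N) sum_i x_i x_i^T, where x_i is the vectorized binary mask of
    the i-th region of dense terminal labeling (Equation 7.2)."""
    X = np.array([m.ravel().astype(float) for m in region_masks])
    return X.T @ X / len(region_masks)

def quadratic_form_response(C, stimulus):
    """Hypothetical response of the neuron: s^T C s for a vectorized stimulus."""
    s = stimulus.ravel().astype(float)
    return s @ C @ s

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    masks = [rng.random((42, 42)) < 0.02 for _ in range(7)]   # 7 toy regions
    C = correlation_matrix(masks)
    print(np.linalg.matrix_rank(C))                           # at most 7
    print(quadratic_form_response(C, rng.random((42, 42))))
```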


In this way, predominantly the mean orientation of the single bars defines the preferred orientation of the neuron. This finding may serve as the end-point of this work. It relates the analysis of second order models, starting from Section 4, to the role of lateral connections in primary visual cortex, which was our subject in the first part of this thesis.

A. Measuring the Entropy of Natural Images

This chapter summarizes some preliminary ideas about a framework to analyse natural images in terms of their probability distribution. Whereas previous attempts mostly started by defining a more or less arbitrary feature set and, based on that, analysed distributions of coefficients, here we would like to work with a more direct approach. We make only the very weak assumption that images can be described by positions on an N-dimensional hyper-sphere (vectorized image patches are assumed to have length one). We show how one can estimate the probability distribution of natural images on the hyper-sphere and compute its entropy. Measuring the entropy of a distribution is one way to measure the closeness of the distribution to the uniform distribution and by this its information content. If images obtained from natural scenes (X_A) occupy only a small portion of the overall space of possible images (X_B) we expect that the entropies H of the two densities f_{X_A} and f_{X_B} differ significantly from each other. More precisely, we would expect the entropy of X_B to be the larger one, because we can assume a uniform density for f_{X_B} whereas f_{X_A} should be more structured. The entropy or uncertainty of a random variable X is defined by the quantity

H(X) = -\sum_i f_X(x_i) \log f_X(x_i) = E_{f_X}[-\log f_X(X)] \qquad (A.1)

where f_X(x) \log f_X(x) = 0 whenever f_X(x) = 0. We will assume that the structure in natural images is well preserved if we treat each x \in X as a vector (x_1, \ldots, x_p)^T of length one. The set of all possible images therefore lives on the hyper-sphere S^{p-1} = \{x : x^T x = 1\} with radius one, embedded in p dimensions. So, by Equation A.1 we need an estimate of the density f_{X_A} and a sampling scheme to compute the expectation and by this the entropy of (a set of) natural images.

A.1. Kernel Density Estimation

To estimate the density of X_A on S^{p-1} we use a kernel density estimation approach. It is an efficient way to estimate the density in a case where we have only a small number of samples.


Figure A.1.: Sample 'natural images' from Patrick Hoyer's homepage



Figure A.2.: Density of the von Mises-Fisher distribution M(0, κ) for κ = 10^{-8}, 0.5, 1, 100, 500.

For example, for image patches of size 7 × 7 pixels the data space is the hyper-sphere S^{48}. All practical samples from this space are small samples. Given observations x_1, \ldots, x_n, we replace each data point x_i by a kernel function which assigns nearby (in S^{p-1}) data points non-zero probability. Note that it is not clear which distance measure (metric) to use. Two data points may be nearby given one metric; given another metric the two data points may be far from each other. In our case image patches which are similar up to white noise of some amplitude are assumed to be similar, i.e., to have small distances from each other. Because we have a periodic space it is natural to use the von Mises-Fisher distribution M_p(x_i, \kappa) as a kernel function with probability density function

\left(\frac{\kappa}{2}\right)^{p/2-1} \frac{1}{\Gamma(p/2)\, I_{p/2-1}(\kappa)} \exp\{\kappa \mu^T x_i\} \qquad (A.2)

(Mardia and Jupp, 2000, p. 197). Here \kappa \ge 0 and ||\mu|| = 1 are called the concentration parameter and mean direction, respectively. I_p(.) is the modified Bessel function of the first kind and order p and \Gamma(.) is the Gamma function. Note that the scalar product in the exponential function is taken over data in Cartesian coordinates as a measure of similarity between the mean \mu and the data x. The corresponding kernel density estimate is given by (Mardia and Jupp (2000), p. 277)

\hat{f}_F(x; \kappa) = n^{-1} \sum_{i=1}^{n} a_p(\kappa) \exp(\kappa x^T x_i), \qquad (A.3)

where the constant κ determines the degree of smoothing (see Figure A.2 and Section A.2), and a_p(κ) is the normalizing constant

a_p(\kappa) = \left(\frac{\kappa}{2}\right)^{p/2-1} \left[\Gamma\!\left(\frac{p}{2}\right) I_{p/2-1}(\kappa)\right]^{-1}. \qquad (A.4)
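A direct transcription of Equations A.2-A.4 might look as follows. This is our own rough sketch (data and names made up), evaluating \hat f_F(x; κ) for unit-norm patches; it uses the exponentially scaled Bessel function for numerical stability.

```python
import numpy as np
from scipy.special import ive, gammaln

def log_ap(kappa, p):
    """log a_p(kappa) = (p/2-1) log(kappa/2) - log Gamma(p/2) - log I_{p/2-1}(kappa).

    log I_v(k) is computed as log(ive(v, k)) + k to avoid overflow/underflow."""
    v = p / 2.0 - 1.0
    return v * np.log(kappa / 2.0) - gammaln(p / 2.0) - (np.log(ive(v, kappa)) + kappa)

def vmf_kde(x, samples, kappa):
    """Kernel density estimate f_hat(x) = (1/n) sum_i a_p(kappa) exp(kappa x^T x_i)
    on the unit hyper-sphere (Equation A.3)."""
    n, p = samples.shape
    log_terms = log_ap(kappa, p) + kappa * samples @ x
    return np.exp(log_terms).mean()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patches = rng.standard_normal((1000, 49))                   # toy 7x7 patches
    patches /= np.linalg.norm(patches, axis=1, keepdims=True)   # project onto S^48
    print(vmf_kde(patches[0], patches, kappa=1.77))
```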

A.2. Optimal Bandwidth for Kernel Density Estimation

Free parameters in kernel density estimation are the specific choice of the kernel and the bandwidth of the kernel. The bandwidth changes the smoothness of the density estimate. For a large bandwidth the resulting density estimate will be smooth, but generally the density will be overestimated (bias). If the bandwidth is small the resulting density estimate has a large variance, so we need a large number of data points to estimate the density correctly. Because of this tradeoff we can assume that there is an optimal bandwidth for a given number of data points.

In general the kernel density estimate \hat{f}_h(x) for x_1, \ldots, x_n is

\hat{f}_h(x) = E_{x_i}[K_h(x - x_i)] = h^{-1} E_{x_i}\!\left[K\!\left(\frac{x - x_i}{h}\right)\right] \qquad (A.5)

where h is the bandwidth of the kernel K(x). For the optimal bandwidth h_0 the mean integrated squared error (MISE) of the kernel density estimator should be minimal. It is composed of a variance and a bias term.

MISE(h) = \int_{-\infty}^{\infty} E\!\left[\left(\hat{f}_h(x) - f(x)\right)^2\right] dx \qquad (A.6)

= \int_{-\infty}^{\infty} E\!\left[\left(\hat{f}_h(x) - E[\hat{f}_h(x)]\right)^2\right] + \left(E[\hat{f}_h(x)] - f(x)\right)^2 dx \qquad (A.7)

= \int_{-\infty}^{\infty} \mathrm{Var}(\hat{f}_h(x))\, dx + \int_{-\infty}^{\infty} \mathrm{Bias}^2(\hat{f}_h(x))\, dx \qquad (A.8)

Asymptotically, for the number of data points n \to \infty, the MISE(h) is according to Silverman (1986):

AMISE(h) = \frac{1}{nh}\int K^2(x)\, dx + h^{2k}\left(\mu_k(K)/k!\right)^2 \int \left(f^{(k)}(x)\right)^2 dx \qquad (A.9)



Figure A.3.: The order of the optimal bandwidth for different dimensions of image patches (p = q + 1 = 3 × 3, 5 × 5, 7 × 7) over the number of samples. Note the logarithmic scale on both the x- and y-axes.

where k is the order of the kernel K (normally k = 2) and \mu_j(L) = \int x^j L(x)\, dx for any function L. Note that for this approach f has to be at least k times differentiable with f^{(k)} bounded or square integrable. The optimum kernel width h_0 is the minimizer of AMISE(h) and can be obtained by differentiating with respect to h and calculating the root of the derivative. Härdle and Müller (2000) point out that for a multivariate kernel density estimator (Gaussian kernel, k = 2) the optimal bandwidth depends strongly on the dimension q of the data and is of the order of:

h_0 = O(n^{-1/(4+q)}). \qquad (A.10)

Figure A.3 displays this relationship for different dimensions p = 3^2, 5^2, 7^2 of the data (square image patches). The order of the bandwidth scales badly with the dimension. For image patches of size 7 × 7 we need 500000 samples to achieve an optimal bandwidth of 0.777; twice as many samples would lower the optimal bandwidth only slightly, to 0.767. This indicates that using kernel density estimation in spaces of large dimension is not feasible with kernels of small bandwidth. One would expect a large variance of the estimate. Härdle and Müller (2000) derived a rule of thumb for the bandwidth in the case that we have a diagonal bandwidth matrix and a multivariate normal distribution as a reference distribution:

\hat{h}_j = \left(\frac{4}{q+2}\right)^{1/(q+4)} n^{-1/(q+4)}\, \sigma_j. \qquad (A.11)
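The scaling in Equation A.10 and the rule of thumb in Equation A.11 can be tabulated directly. The sketch below is our own (assuming a diagonal bandwidth matrix, as in the text, and an intrinsic dimension q = p − 1 for patches on the sphere); it reproduces the kind of numbers plotted in Figure A.3.

```python
import numpy as np

def optimal_bandwidth_order(n, q):
    """Order of the optimal bandwidth, h0 = O(n^(-1/(4+q))) (Equation A.10)."""
    return n ** (-1.0 / (4.0 + q))

def rule_of_thumb_bandwidth(samples):
    """Haerdle & Mueller (2000) rule of thumb, Equation A.11:
    h_j = (4/(q+2))^(1/(q+4)) * n^(-1/(q+4)) * sigma_j, per dimension j."""
    n, q = samples.shape
    sigma = samples.std(axis=0, ddof=1)
    return (4.0 / (q + 2.0)) ** (1.0 / (q + 4.0)) * n ** (-1.0 / (q + 4.0)) * sigma

if __name__ == "__main__":
    for q in (8, 24, 48):                  # q = p - 1 for 3x3, 5x5, 7x7 patches
        print(q, round(optimal_bandwidth_order(500000, q), 3))   # 48 -> ~0.777
    toy = np.random.default_rng(0).standard_normal((1000, 8))
    print(rule_of_thumb_bandwidth(toy)[:3].round(3))
```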


A.3. Entropy of the Density f_{X_A}

Using Equation A.3 we calculate the entropy as the expectation over random samples from the density \hat{f}_{X_A}:

H(X_A) = E_{\hat{f}_{X_A}}\!\left(-\log(\hat{f}_{X_A})\right) \qquad (A.12)

To calculate the expectation we have to sample from the distribution f_{X_A}. Because it is non-trivial¹ to sample according to the density in Equation A.3, we introduce a new distribution g and sample from this distribution, because

E_f(-\log f(x)) = -\int f(x) \log f(x)\, dx = -\int g(x) \frac{f(x)}{g(x)} \log f(x)\, dx = E_g\!\left(-\frac{f(x)}{g(x)} \log f(x)\right) \qquad (A.13)

will give us the desired expectation (Metropolis et al., 1953; Marshall, 1956); this is known as importance sampling. The expectation will converge faster, i.e., have smaller variance, for a distribution g that is similar to f. Fortunately there is a distribution g, the multivariate normal distribution, that is very similar to the von Mises distribution, and there are efficient tools for sampling from that distribution. It has probability density function:

g(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\!\left(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right), \qquad (A.14)

where d is the dimension (in our case d = p - 1 because g is defined on the hyper-sphere S^{p-1}) and \Sigma is the variance-covariance matrix (\Sigma_{ii} = \sigma). The normal distribution N(\mu, \sigma^2) wrapped around the circle gives the wrapped normal distribution WN(\mu, \exp(-\sigma^2/2)). The wrapped normal distribution is a close (first-order) approximation of the von Mises distribution,

M(\mu, \kappa) \simeq WN\!\left(\mu,\ A(\kappa) = \frac{I_1(\kappa)}{I_0(\kappa)}\right) \qquad (A.15)

for high and intermediate values of \kappa (Kent, 1978; Stephens, 1963). I_p is the modified Bessel function of the first kind and order p. Schou (1978) has given an approximation for A(\kappa) as

A(\kappa) = 1 - \frac{1}{2\kappa} - \frac{1}{8\kappa^2} + O\!\left(\frac{1}{\kappa^3}\right) \qquad (A.16)

¹Sampling by the transformation method, for example, we would have to calculate the inverse function of the integral of \hat{f}.



Figure A.4.: Comparison of a Gaussian function N(\mu = 0, \sigma = 0.777) and a von Mises-Fisher distribution M(\mu = 0, \kappa = 2.1428).

This approximation is useful for large \kappa, i.e., the small variance case in which the non-periodic function g can reasonably be compared to the (periodic) von Mises distribution function. Because of the above identities the normal distribution N(\mu, \sigma) that is closest to the von Mises distribution for large concentration parameter \kappa is²

N\!\left(\mu,\ \sqrt{6\log(2) - 2\log\frac{8\kappa^2 - 4\kappa - 1}{\kappa^2}}\right) \simeq M(\mu, \kappa). \qquad (A.17)

Using the above definition for the variance of g we can sample from g by drawing p - 1 values from a multivariate normal distribution. Calculation of the entropy in Equation A.1 can now be done by Equation A.13.
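As a one-dimensional toy demonstration of the identity in Equation A.13 (not the full hyper-spherical procedure of this appendix), the following sketch estimates the entropy of a target density from samples drawn only from a broader Gaussian proposal; the distributions and names are our own choices.

```python
import numpy as np
from scipy.stats import laplace, norm

def entropy_importance_sampling(f, g, m=500000, seed=0):
    """Estimate H(f) = E_f[-log f] via E_g[-(f/g) log f] (Equation A.13),
    drawing all samples from the proposal distribution g."""
    z = g.rvs(size=m, random_state=seed)
    w = f.pdf(z) / g.pdf(z)                 # importance weights f/g
    return float(np.mean(-w * np.log(f.pdf(z))))

if __name__ == "__main__":
    f = laplace(loc=0.0, scale=1.0)         # target density
    g = norm(loc=0.0, scale=3.0)            # broader Gaussian proposal
    print(entropy_importance_sampling(f, g))  # analytic value: 1 + log 2 = 1.693...
```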

A.4. Results

We sampled 500000 image patches of size 7 × 7 from the 13 natural images shown in Figure A.1 on page 175. The bandwidth of the Gaussian g was chosen to be 0.777 (see Section A.2 and Figure A.4). This defines in turn a concentration parameter of κ = 1.7722 for the von Mises kernel (by Equation A.17). In Figure A.5 the result is plotted over increasing values of n and kernel sizes κ. The entropy for small κ (large variance) approaches the entropy of the uniform density f_{X_B}. The large negative entropy for small kernel variances indicates that the distribution of natural images on the unit sphere is highly non-uniform.

²For small \kappa a better approximation is A(\kappa) = \frac{1}{2}\kappa - \frac{1}{16}\kappa^3 + O(\kappa^5).



Figure A.5.: Entropy E(-\log(f_{X_A})) = \frac{1}{n}\sum_i -\log(f_{X_A}) of natural images for different concentration parameters \kappa of the kernel (\kappa = 10^{-8}, 1, 100, 500, 10^4) over growing sampling sizes n. The negative entropy for small kernels indicates a high non-uniformity of the density f_{X_A}.

B. Derivations

B.1. Dependence of Symmetry on RFS

To calculate the (rotational) symmetry value for a given object in an image X we introduced the L1-norm of a vector calculated from the covariances between the original image and N successively rotated versions of that image (see Equation 4.8 on page 90). Let us now analyse how this measure of the structure in X depends on the size of the object. For this purpose we assume that the figure is perfectly symmetric,

x_{r,\theta_0} = x_{r,(\theta_0 - \tau)}, \quad x \in X,\ \tau \in [0, 2\pi], \qquad (B.1)

e.g., the figure can be described by a disc with a certain radius \alpha. In this case, the symmetry value will be the same for all \tau.

\mathcal{S}(X) = N\, \mathrm{cov}_\tau(X, Y) = N \int_0^{2\pi}\!\!\int_0^{r_{max}} (x_{r,\theta_0} - \bar{x})^2\, dr\, d\theta_0 \qquad (B.2)

where Y describes a version of X rotated by \tau (because of rotational symmetry Y = X). We will omit the constant factor N for the next calculations. A disc with radius \alpha can be modeled by a single Heaviside function H (in polar coordinates):

x_r = -H(r - \alpha) + 1 \qquad (B.3)

\mathrm{cov}_\tau(X, Y) = 2\pi \int_0^{r_{max}} \left(-H(r - \alpha) + 1 - E[-H(r - \alpha) + 1]\right)^2 dr \qquad (B.4)

where E(.) is the expectation of (.).

E[-H(r - \alpha) + 1] = \frac{1}{\pi r_{max}^2}\, \pi\alpha^2 = \frac{\alpha^2}{r_{max}^2} \qquad (B.5)


By simple analysis it follows that:

\mathrm{cov}_\tau(X, Y) = 2\pi \int_0^{r_{max}} \left(-H(r - \alpha) + 1 - \frac{\alpha^2}{r_{max}^2}\right)^2 dr \qquad (B.6)

= 2\pi \int_0^{\alpha} \left(1 - \frac{\alpha^2}{r_{max}^2}\right)^2 dr = 2\pi \int_0^{\alpha} \left(1 - 2\frac{\alpha^2}{r_{max}^2} + \frac{\alpha^4}{r_{max}^4}\right) dr \qquad (B.7)

= 2\pi\alpha \left(1 - 2\frac{\alpha^2}{r_{max}^2} + \frac{\alpha^4}{r_{max}^4}\right) \qquad (B.8)

This latter function is zero at \alpha = 0 and \alpha = r_{max} and has a maximum in between at \alpha = \frac{1}{\sqrt{5}} r_{max} \approx 0.45\, r_{max}. So a disc has maximum symmetry if it fills nearly half the receptive field. A neuron implementing this strategy of integration over the input would have a measured receptive field (standard method) of half its 'real' integration field and would furthermore show a strong inhibitory surround (for a disc-like stimulus). A curious aside: the position of the maximum seems (not) to be correlated with the golden ratio. The ratio 1 - \frac{1}{\sqrt{5}} = \frac{\sqrt{5} - 1}{\sqrt{5}} is near the value of the golden ratio \frac{\sqrt{5} - 1}{2} (since \sqrt{5} = 2.236 \approx 2).
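The location of this maximum is easy to verify numerically; the following toy check of ours simply maximizes the expression of Equation B.8 over the disc radius.

```python
import numpy as np

def cov_disc(alpha, r_max=1.0):
    """Equation B.8: cov_tau(X, Y) = 2*pi*alpha*(1 - 2 alpha^2/r^2 + alpha^4/r^4)."""
    a2 = (alpha / r_max) ** 2
    return 2.0 * np.pi * alpha * (1.0 - 2.0 * a2 + a2 ** 2)

if __name__ == "__main__":
    alphas = np.linspace(0.0, 1.0, 100001)
    best = alphas[np.argmax(cov_disc(alphas))]
    print(best, 1.0 / np.sqrt(5.0))          # both ~ 0.447
```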

B.2. Dependence of Symmetry on Preferred Orientation

To calculate the (rotational) symmetry value for a given object in an image X we introduced the L1-norm of a vector calculated from the covariances between the original image and N successively rotated versions of that image (see Equation 4.8 on page 90).

\mathcal{S}(X, \tau) = \int_0^{r_{max}}\!\!\int_0^{2\pi} (x_{r,\theta} - \bar{x})(x_{r,\theta-\tau} - \bar{x})\, d\theta\, dr \qquad (B.9)

Let us now analyse how the measure of structure depends on the orientation of a contrast gradient in the image. This example is important because contrast gradients, or edges, are common features in natural images. Without loss of generality we can assume that the edge is a horizontal one. This follows from the observation that we can rotate the original image and nevertheless obtain the same vector \mathcal{S}. Whereas this implies that \mathcal{S} is not specific for a certain orientation (has no preferred orientation), we show now that the

entries of \mathcal{S} depend linearly on the difference between the edge orientation and the rotation angle \tau. The mean of an image containing a centered edge is \bar{x} = (\max(X) - \min(X))/2. For now we want to restrict ourselves to the case of a binary image, for which \bar{x} = 1/2. We first look at the inner integral, which describes for a given radius r the pixel pair products between the original image (x_{r,\theta}) and the image rotated by \tau (x_{r,\theta-\tau}):

\int_0^{2\pi} (x_{r,\theta} - \bar{x})(x_{r,\theta-\tau} - \bar{x})\, d\theta = \int_0^{2\pi} (x_{r,\theta} - \tfrac{1}{2})(x_{r,\theta-\tau} - \tfrac{1}{2})\, d\theta \qquad (B.10)

= \int_0^{2\pi} \underbrace{x_{r,\theta} x_{r,\theta-\tau} - \tfrac{1}{2}(x_{r,\theta} + x_{r,\theta-\tau}) + \tfrac{1}{4}}_{f_\tau(\theta)}\, d\theta \qquad (B.11)

We can split the latter equation into a sum of three integrals depending on x_{r,\theta} and x_{r,\theta-\tau}:

= \int_{x_{r,\theta} = x_{r,\theta-\tau} = 1} f_\tau(\theta)\, d\theta + \int_{x_{r,\theta} = x_{r,\theta-\tau} = 0} f_\tau(\theta)\, d\theta + 2 \int_{x_{r,\theta} \neq x_{r,\theta-\tau}} f_\tau(\theta)\, d\theta \qquad (B.12)

= \int_\tau^\pi f_\tau(\theta)\, d\theta + \int_{\pi+\tau}^{2\pi} f_\tau(\theta)\, d\theta + 2 \int_0^\tau f_\tau(\theta)\, d\theta \qquad (B.13)

= \int_\tau^\pi \left(1 - 1 + \tfrac{1}{4}\right) d\theta + \int_{\pi+\tau}^{2\pi} \tfrac{1}{4}\, d\theta + 2 \int_0^\tau \left(-\tfrac{1}{4}\right) d\theta = \frac{\pi}{2} - \tau \qquad (B.14)

The outer integral, and therefore \mathcal{S}(X, \tau), is now simply:

\int_0^{r_{max}} \left(\frac{\pi}{2} - \tau\right) dr = \left(\frac{\pi}{2} - \tau\right) r_{max}. \qquad (B.15)

So

\mathcal{S}(X) = \left\{\left(\frac{\pi}{2} - \tau\right) r_{max}\right\}_{0 \le \tau \le 2\pi} \qquad (B.16)

and \mathcal{S} depends linearly on \tau. In particular the symmetry value will be zero for an edge oriented orthogonally (\tau = \frac{\pi}{2}) to the rotation axis.



Figure B.1.: Example of a set of four pixels (x_1, x_2, x_3, x_4) and the corresponding 6 pixel pairs

B.3. Dependence of Symmetry on Noise

We show now that for white noise input the random variable defined by the symmetry detection procedure is characterized by large skewness and kurtosis as measures of higher order statistics. The distribution has mean 0 because we use only covariances (x_i x_j, i \neq j) and because of the independence assumption in white noise. Let X be a random variable with p(X) = N(0, 1) and let Y be a sequence of the form (x_1, x_2), (x_3, x_4), \ldots, (x_n, x_{n+1}) where x_i \in X. This defines Y as a clique. A clique in graph theory is a collection of sites such that any two sites are neighbors (see Figure B.1). Here a clique defines a set of pixels such that any pixel in the clique has a connection weighted by 1 with all other pixels in the clique. The order of a clique refers to the number of distinct sites that appear multiplicatively. Note that the structure detection algorithm uses an average sum of cliques. For the detection of rotational symmetries all pixels at an 'equal' distance to the center of the patch are in one clique. The random variable Y has a specific structure for symmetry pixel pairs that violates the independence assumption of x_i sampled iid. This is the consequence of using a single pixel in more than one element of Y. More generally, we look for the third and fourth order central moments of the sum of a sequence X of N = \frac{M(M-1)}{2} monomials of order 2. We can define a minimal non-trivial example of such a sequence for monomials of order 2 in 3 variables (M = 3). The sum of pixel pairs is in this case:

Y := x_1 x_2 + x_2 x_3 + x_3 x_1. \qquad (B.17)

Let each x_i be drawn independently from a normal distribution N(0, 1). It follows from the definition that

E(x_i) = 0, \quad E(x_i^2) = 1, \quad E(x_i^3) = 0, \quad E(x_i^4) = 3 \qquad (B.18)


The corresponding cumulants are zero, indicating no higher order information in X. Using the above results we now explicitly calculate the third and fourth order central moments of Y. We assume that Y is a random variable which depends non-linearly on x_1, x_2, x_3, which in turn are sampled iid from a Gaussian distribution with mean 0 and variance 1. In the following E(.) will always stand for the expectation over all enclosed x_i.

Y := \sum_{i < j} x_i x_j \qquad (B.19)

E(Y^4) = E\!\left([A + B + C]^4\right), \quad (A = x_1 x_2,\ B = x_1 x_3,\ C = x_2 x_3) \qquad (B.29)

[A + B + C]^4 = x_1^4 x_2^4 + x_1^4 x_3^4 + x_2^4 x_3^4 + 4x_1^4 x_2^3 x_3 + 4x_1^3 x_2^4 x_3 + 4x_1^4 x_2 x_3^3 + 4x_1^3 x_2 x_3^4 + 4x_1 x_2^4 x_3^3 + 4x_1 x_2^3 x_3^4 + 6x_1^4 x_2^2 x_3^2 + 6x_1^2 x_2^4 x_3^2 + 6x_1^2 x_2^2 x_3^4 + 12x_1^3 x_2^3 x_3^2 + 12x_1^3 x_2^2 x_3^3 + 12x_1^2 x_2^3 x_3^3 \qquad (B.30)


Figure B.2.: Histograms of Y for the cases M = 3, 13, 23, 33 (number of pixels N). The empirical moments per panel were: M = 3: E(Y) = -0.00, E(Y^2) = 3.02, E(Y^3) = 5.61, E(Y^4) = 80.10; M = 13: E(Y) = -0.02, E(Y^2) = 76.94, E(Y^3) = 1671.65, E(Y^4) = 76929.81; M = 23: E(Y) = 0.11, E(Y^2) = 259.27, E(Y^3) = 11208.00, E(Y^4) = 962610.66; M = 33: E(Y) = 0.08, E(Y^2) = 530.07, E(Y^3) = 33546.00, E(Y^4) = 4253039.34.

Again we reduce this formula with the previously defined moments of the x_i (Equation B.18) to

E(Y^4) = E\!\left(x_1^4 x_2^4 + x_2^4 x_3^4 + x_1^4 x_3^4 + 6x_1^4 x_2^2 x_3^2 + 6x_1^2 x_2^4 x_3^2 + 6x_1^2 x_2^2 x_3^4\right) \qquad (B.31)

= E(x_1^4)E(x_2^4) + E(x_2^4)E(x_3^4) + E(x_1^4)E(x_3^4) + 6E(x_1^4)E(x_2^2)E(x_3^2) + 6E(x_1^2)E(x_2^4)E(x_3^2) + 6E(x_1^2)E(x_2^2)E(x_3^4) \qquad (B.32)

E(Y^4) = 81 \qquad (B.33)
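This moment calculation can be double-checked symbolically; the short script below is our own sketch, expanding Y^k and applying the standard normal moments of Equation B.18 term by term.

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
Y = x1 * x2 + x2 * x3 + x3 * x1

def gaussian_moment(k):
    """E(x^k) for x ~ N(0, 1): 0 for odd k, (k-1)!! for even k."""
    return 0 if k % 2 else sp.factorial2(k - 1)

def expectation(expr):
    """Apply E(.) monomial by monomial, using independence of x1, x2, x3."""
    poly = sp.Poly(sp.expand(expr), x1, x2, x3)
    return sum(c * gaussian_moment(e1) * gaussian_moment(e2) * gaussian_moment(e3)
               for (e1, e2, e3), c in poly.terms())

print(expectation(Y**3))   # 6,  as in Equation B.34
print(expectation(Y**4))   # 81, as in Equation B.33
```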

Collecting the above results we obtain for Y the first four central moments:

E(Y) = 0, \quad E(Y^2) = 1, \quad E(Y^3) = 6, \quad E(Y^4) = 81 \qquad (B.34)

To compare these numbers with the best fitting normal distribution we now introduce cumulants. Cumulants are normalized versions of the higher order central moments. They take into account that for a Gaussian distribution the higher order moments factorize, i.e., all higher order moments of a normal distribution can be expressed as combinations of its first two moments (mean and variance). The normalized versions of the central moments are corrected to yield zero for any term of order higher than 2.


In particular the third order cumulant is called skewness and is computed from the third order moment as:

s(Y) = \frac{1}{(N-1)\sigma^3} \sum_{i=1}^{N} (y_i - \bar{y})^3 = \frac{6}{(3-1) \cdot 1^3} = 3 \qquad (B.35)

which incorporates the standard deviation \sigma = \sqrt{E(Y^2)} and the mean \bar{y} = E(Y) of the sequence Y. The kurtosis is the cumulant of fourth order and is defined as:

k(Y) = \frac{1}{(N-1)\sigma^4} E(Y^4) - 3 = \frac{81}{(3-1) \cdot 1^4} - 3 = 37.5 \qquad (B.36)

As we can see, Y has a skewed and highly kurtotic distribution. This is confirmed by the shape of the distributions in Figure B.2 on the page before. As can be seen from the simulation, there is a trend to introduce higher order moments by monomials in more variables (which correspond to more pixels for growing radii).

Multivariate Gaussian Distribution in Two Variables

For the model of a multivariate Gaussian distribution

p(X) = N\!\left(\mu,\ \Sigma = \begin{pmatrix} \sigma_X & c_X \\ c_X & \sigma_X \end{pmatrix}\right) \qquad (B.37)

the expectation of Y can be expressed as:

E(Y) = E\!\left(\sum_{i \neq j} x_i x_j\right) \qquad (B.38)

= \sum_{i \neq j} E(x_i x_j). \qquad (B.39)

Note that our covariance structure is restricted to be symmetric (\Sigma_{12} = \Sigma_{21}) and rotationally invariant (\Sigma_{11} = \Sigma_{22}). Due to the fact that E(XY) = \mathrm{Cov}(X, Y) + E(X)E(Y):

E(Y) = \sum_{i \neq j} \left(c_X + E(x_i)E(x_j)\right) = \sum_{i \neq j} \left(c_X + \mu^2\right)


For a zero mean (\mu = 0) Gaussian distribution P(X) we end up with an expectation for Y of

E(Y) = \frac{N(N-1)}{2}\, c_X \qquad (B.40)

where c_X is the covariance between any two pixels, \mathrm{cov}(x_i, x_j), i \neq j, in X and N is the order of the respective clique (number of pixels). If we take the expectation with respect to the number of pixel pairs we arrive at

E(Y ) = cX . (B.41)

Let us combine the cliques of growing order so that we can compute the expectation of S(X). Again we assume that X has a probability distribution that is multivariate Gaussian (Equation B.37 on the facing page). If N is the order of the clique Y_N, we can assume at each radius r around the center position of X a number N = 2\pi r of pixels forming a clique:

E(S(X)) = E\!\left(\frac{1}{rfs} \sum_{i=1}^{rfs} Y_{2\pi i}\right) = \frac{1}{rfs} \sum_{i=1}^{rfs} E(Y_{2\pi i}) \qquad (B.42)

= c_X. \qquad (B.43)

C. General Linear Independence of Monomial Spaces

Let us analyse a set of non-linear basis functions in order to prove that we can apply linear methods in this feature space. Monomials of constant order d are of the form

\mathcal{F}_n^{(d)} = \left\{ x_1^{e_1} x_2^{e_2} \cdots x_n^{e_n} \mid e_1 + e_2 + \cdots + e_n = d,\ e_i \ge 0 \right\}. \qquad (C.1)

Linear methods have to identify directions in the feature space that correspond to directions in the input space, i.e., directions on the manifold. In order to reliably extract n independent source directions we have to ensure that the data projection into the feature space is effective, i.e., that the n unknown source directions span the whole space. General linear independence of monomial spaces of constant order: We have to show that any n pairwise different vectors \gamma_{1...n} \in [0 \ldots \pi) are linearly independent if they are on the manifold defined by n monomials of constant order. Commutative monomials of order n in two variables can be expressed in polar coordinates as (\gamma \in [0 \ldots 2\pi))

(y^0 x^n, y x^{n-1}, \ldots, y^{n-1} x, y^n x^0) = r^n \left(\cos^n(\gamma), \sin(\gamma)\cos^{n-1}(\gamma), \ldots, \sin^{n-1}(\gamma)\cos(\gamma), \sin^n(\gamma)\right) \qquad (C.2)

Proving strong linear independence can be done by proving that the matrix M =

\begin{pmatrix}
\cos^n(\gamma_1) & \sin(\gamma_1)\cos^{n-1}(\gamma_1) & \ldots & \sin^{n-1}(\gamma_1)\cos(\gamma_1) & \sin^n(\gamma_1) \\
\cos^n(\gamma_2) & \sin(\gamma_2)\cos^{n-1}(\gamma_2) & \ldots & \sin^{n-1}(\gamma_2)\cos(\gamma_2) & \sin^n(\gamma_2) \\
\vdots & \vdots & & \vdots & \vdots \\
\cos^n(\gamma_n) & \sin(\gamma_n)\cos^{n-1}(\gamma_n) & \ldots & \sin^{n-1}(\gamma_n)\cos(\gamma_n) & \sin^n(\gamma_n)
\end{pmatrix} \qquad (C.3)

has a non-zero determinant in the range \gamma_{1...n} \in [0 \ldots \pi]. This matrix can be decomposed into a product of a diagonal matrix V_{i,i} = \cos^n(\gamma_i) and a matrix

W = \begin{pmatrix}
1 & \tan(\gamma_1) & \tan^2(\gamma_1) & \ldots & \tan^n(\gamma_1) \\
1 & \tan(\gamma_2) & \tan^2(\gamma_2) & \ldots & \tan^n(\gamma_2) \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \tan(\gamma_n) & \tan^2(\gamma_n) & \ldots & \tan^n(\gamma_n)
\end{pmatrix} \qquad (C.4)


because \tan = \sin / \cos. W is a Vandermonde matrix. The determinant of this matrix is known to be \det(W) = \prod_{i,j,\ i>j} (\tan(\gamma_i) - \tan(\gamma_j)). Together with V the determinant of the matrix M in Equation C.3 on the preceding page is therefore

\det(M) = \prod_{i=1}^{n} \cos^n(\gamma_i) \prod_{i,j,\ i>j} (\tan(\gamma_i) - \tan(\gamma_j)) \qquad (C.5)

Because \tan is a strictly monotonic function, the only case in which the last term can be zero is if for some i \neq j, \gamma_i = \gamma_j, which violates our requirement of having n different vectors \gamma_{1...n}. Because the cosine function has only one zero-crossing in [0 \ldots \pi) and we require that all \gamma_i are different, there can be only a single \gamma_i for which \cos(\gamma_i) = 0. The only case in which the determinant can be zero is therefore if, without loss of generality, we assume that \cos(\gamma_1) = 0. If \cos(\gamma_1) = 0 and \cos(\gamma_{2...n}) \neq 0 it follows that in the first row of the matrix M only the last entry \sin^n(\gamma_1) \neq 0. The determinant of the matrix M in Equation C.3 reduces to

\det(M) = (-1)^{n+1} \sin^n(\gamma_1) \begin{vmatrix}
\cos^n(\gamma_2) & \sin(\gamma_2)\cos^{n-1}(\gamma_2) & \ldots & \sin^{n-1}(\gamma_2)\cos(\gamma_2) \\
\vdots & \vdots & & \vdots \\
\cos^n(\gamma_n) & \sin(\gamma_n)\cos^{n-1}(\gamma_n) & \ldots & \sin^{n-1}(\gamma_n)\cos(\gamma_n)
\end{vmatrix} \qquad (C.6)

The latter sub-determinant can again be expressed as the determinant of the product of a diagonal matrix V_{i,i} = \cos^n(\gamma_i) and a Vandermonde matrix

\begin{pmatrix}
1 & \tan(\gamma_2) & \tan^2(\gamma_2) & \ldots & \tan^{n-1}(\gamma_2) \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \tan(\gamma_n) & \tan^2(\gamma_n) & \ldots & \tan^{n-1}(\gamma_n)
\end{pmatrix}. \qquad (C.7)

Because the \gamma_{2...n} are different vectors and \cos(\gamma_{2...n}) \neq 0, the determinant of the matrix M is non-zero also if \cos(\gamma_1) = 0. The presented proof uses as a crucial ingredient the strict monotonicity of the \tan function in the interval [0, \pi). This opens a path to define other feature spaces by introducing other strictly monotonic functions. Important in this respect is that the property of linear separability has to be fulfilled for these feature spaces as well. For example, with two variables x, y and n = 2, by exchanging \tan with \sin we arrive at basis functions (x^2, x^2 y, x^2 y^2) which are not separable into a scalar part and a vectorial part depending only on \gamma. Of course we can also exchange the \cos entries in the diagonal matrix. Exchanging \cos^n by \sin^n in the example above we arrive at separable basis functions of constant order (y^2, y^3/x, y^4/x^2).
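The Vandermonde identity used in the proof can be checked numerically for a small example. This is our own sanity check, not part of the appendix; the chosen angles are arbitrary.

```python
import numpy as np
from itertools import combinations

def vandermonde_det(gammas):
    """det of W with rows (1, tan(g), tan^2(g), ..., tan^(n-1)(g))."""
    t = np.tan(np.asarray(gammas))
    W = np.vander(t, increasing=True)
    return np.linalg.det(W)

def product_formula(gammas):
    """Closed form det(W) = prod_{i>j} (tan(g_i) - tan(g_j))."""
    t = np.tan(np.asarray(gammas))
    return np.prod([t[i] - t[j] for j, i in combinations(range(len(t)), 2)])

if __name__ == "__main__":
    gammas = [0.1, 0.7, 1.2, 2.5]            # four distinct angles in [0, pi)
    print(vandermonde_det(gammas), product_formula(gammas))  # the two agree
```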

Bibliography

Adorján, P., Levitt, J. B., Lund, J. S. and Obermayer, K. (1999). A model for the intracortical origin of orientation preference and tuning in macaque striate cortex, Visual Neuroscience 16: 303–318.

Adorján, P., Schwabe, L., Piepenbrock, C. and Obermayer, K. (2000). Recurrent cortical competition: Strengthen or weaken?, in S. A. Solla, T. K. Leen and K.-R. Müller (eds), Advances in Neural Information Processing Systems, Vol. 12, MIT Press, p. in press.

Agmon-Snir, H. and Segev, I. (1993). Signal delay and propagation velocity in passive dendritic trees, J. Neurophys. 70: 2066–2085.

Albrecht, D. G. and Hamilton, D. B. (1982). Striate cortex of monkey and cat: contrast response function, J. Neurophysiol. 48(1): 217–237.

Alexander, D. M., Sheridan, P. and Bourke, P. D. (1997). An algebraic-geometric model of the receptive field properties of the macaque striate cortex, Proc. Aust. Neuro- science Soc.

Allman, J. M. and Kaas, J. H. (1974). The organization of the second visual area (v ii) in the owl monkey: a second order transformation of the visual hemifield, Brain Res. 76: 247–265.

Alonso, J. M., Usrey, W. M. and Reid, R. C. (2001). Rules of connectivity between geniculate cells and simple cells in cat primary visual cortex, 21: 4002–4015.

Alvarez, L., Gousseau, Y. and Morel, J. (1999). The size of objects in natural images, CMLA preprint, Ecole Normale Sup.- Cachan.

Amari, S. and Nagaoka, H. (1993). Methods of Information Geometry, Vol. 191, Amer- ican Mathematical Society, Oxford, University Press.

Arbib, M. A. (1998). The handbook of brain theory and neural networks, The MIT Press, Cambridge, Massachusetts, London, England.

Asi, H., Lund, J. S., Blasdel, G. G., Angelucci, A. and Levitt, J. B. (in press). Stripe-like patterns of lateral connections in layers 4b and upper 4cα of macaque monkey primary visual cortex, area v1, Journal of Comparative Neurology.

Atick, J. J. and Redlich, N. (1992). What does the retina know about natural scenes, Neural Computation 4: 196–210.


Attias, H. (1999). Independent factor analysis, Neural Computation 11(4): 803–851.

Attneave, F. (1955). Symmetry information and memory for patterns, American Jour- nal of Psychology 68: 209–222.

Azouz, R., Gray, C. M., Nowak, L. G. and McCormick, D. A. (1997). Physiological properties of inhibitory interneurons in cat striate cortex, Cerebral Cortex 7: 534– 545.

Baddeley, R. (1997). The correlational structure of natural images and the calibration of spatial representations, Cognitive Science 21(3): 351–372.

Barlow, H. and Reeves, B. (1979). The versatility and absolute efficiency of detecting mirror symmetry in rd displays, Vision Research 19: 783–793.

Barlow, H. B. (1953). Summation and inhibition in the frog’s retina, J. Physiol., Lond. 119: 69–88.

Barlow, H. B. (1961). The coding of sensory messages, in W. H. Thorpe and O. L. Zangwill (eds), Current Problems in Animal Behavior, pp. 331–360.

Bartsch, H. and Obermayer, K. (2001). Detecting structure in images by detecting symmetry, in N. Elsner and G. W. Kreutzberg (eds), Proceedings of the 4th Meeting of the German Neuroscience Society 2001, Vol. 1, Georg Thieme Verlag, p. 277.

Bartsch, H. and Obermayer, K. (2002). A structure preserving image transformation as the goal of visual sensory coding, Neurocomputing 44-46: 729–734.

Bartsch, H. and Obermayer, K. (2003). Second order statistics of natural images, Neurocomputing p. in press.

Bartsch, H., Stetter, M. and Obermayer, K. (1997). A model for orientation tuning and contextual effects of orientation selective receptive fields., in W. Gerstner, A. Germond, M. Hasler and J.-D. Nicoud (eds), Artificial Neural Networks - ICANN ’97, Vol. 1327 of Lecture notes in computer science, Springer Berlin.

Bartsch, H., Stetter, M. and Obermayer, K. (1999a). About the influence of neuronal variability in a mean-field model of the visual cortex., in N. Elsner and U. Eysel (eds), From molecular neurobiology to clinical neuroscience., Thieme Stuttgart.

Bartsch, H., Stetter, M. and Obermayer, K. (1999b). On the influence of threshold variability in a model of the visual cortex., Artificial Neural Networks – ICANN ’99, pp. 73–78.

Bartsch, H., Stetter, M. and Obermayer, K. (1999c). On the influence of threshold variability in a model of the visual cortex., in D. Willshaw and A. Murray (eds), Artificial Neural Networks - ICANN ’99, Vol. 470 of Conference Publication No.470, Institution of Electrical Engineers, London, pp. 73–78.


Bartsch, H., Stetter, M. and Obermayer, K. (2000a). The influence of threshold vari- ability on the response of visual cortical neurons, Neurocomputing 32-33: 37–43.

Bartsch, H., Stetter, M. and Obermayer, K. (2001). Contextual effects by short range connections in a mean-field model of v1, Neurocomputing 38-40: 475–481.

Bartsch, H., Stetter, M., Weber, C. and Obermayer, K. (2000b). Influence of the geom- etry of lateral connections in v1 on orientation selectivity: A model study, Soc. Neurosci. Abstr., Vol. 26.

Bell, A. J. and Sejnowski, T. J. (1996). Edges are the ’independent components’ of natural scenes, Advances in Neural Information Processing Systems 9.

Ben-Yishai, R., Bar-Or, R. L. and Sompolinsky, H. (1995). Theory of orientation tuning in visual cortex., Proc. Natl. Acad. Sci. USA 92: 3844–3848.

Ben-Yishai, R., Hansel, D. and Sompolinsky, H. (1997). Traveling waves and the pro- cessing of weakly tuned inputs in a cortical network module., J. Comput. Neu- rosci. 4: 57–77.

Bigün, J. (1988). Recognition of local symmetries in gray level images by harmonic functions, International Conference on Pattern Recognition pp. 345–347.

Billock, V. A. (2000). Neural acclimation to 1/f spatial frequency spectra in natural images transduced by the human visual system, Physica D pp. 379–391.

Blakemore, C. and Tobin, E. A. (1972). Lateral inhibition between orientation detec- tors in the cat’s visual cortex., Exp. Brain Res. 15: 439–440.

Blasdel, G. G., Obermayer, K. and Kiorpes, L. (1995). Organization of ocular domi- nance and orientation columns in the striate cortex of neonatal macaque mon- keys., Vis. Neurosci. 12: 589–603.

Blum, H. and Nagel, R. N. (1978). Shape description using weighted symmetric axis features, Pattern Recognition 10: 167–180.

Bosking, W. H., Zhang, Y., Schofield, B. and Fitzpatrick, D. (1997). Orientation selec- tivity and the arrangement of horizontal connections in tree shrew striate cortex, J. Neurosci. 17(6): 2112–2127.

Brady, M. and Asada, H. (1984). Smoothed local symmetries and their implementa- tion, Int. J. Robotics Research 3: 36–61.

Bressloff, P. C. and Cowan, J. D. (2002). An amplitude equation approach to contextual effects in visual cortex, Neural Computation 14: 493–525.

Bressloff, P. C., Cowan, J. D., Golubitsky, M., Thomas, P. J. and Wiener, M. C. (2001). Geometric visual hallucinations, euclidean symmetry, and the functional archi- tecture of striate cortex, Phil. Trans. Royal Soc. London B 356: 299–330.


Bringuier, V., Chavane, F., Glaeser, L. and Frégnac, Y. (1999). Horizontal propagation of visual activity in the synaptic integration field of area 17 neurons, Science 283: 695–699.

Bruckstein, A. M. and Shaked, D. (1995). Skew symmetriy detection via invariant signatures, Proc. 6th nt. conf. on Comput. Analysis of Images and Patterns (CAIP), Prague pp. 17–24.

Bugmann, G. (1997). Biologically plausible neural computation, Biosystems 40: 11– 19.

Buntine, W. (1994). Operations for learning with graphical models, Journal of Artificial Intelligence Research 2: 159–225.

Carandini, M. and Ferster, D. (2000). Membrane potential and firing rate in cat pri- mary visual cortex, J. Neuroscience 20: 470–84.

Carandini, M. and Rigach, D. L. (1997). Predictions of a recurrent model of orientation selectivity, Vision Res. 37(21): 3061–3071.

Chaitin, G. J. (1966). On the length of programs for computing finite binary sequences by bounded-transfer turing machines, AMS Notices 13: 133.

Cham, T. J. and Cipolla, R. (1995). Symmetry detection through local skewe symme- tries, Image and Vision Computing 13: 439–450.

Chapman, B. and Stryker, M. P. (1993). Development of orientation selectivity in ferret visual cortex and effects of deprivation, J. Neurosci. 13: 5251–5262.

Chen, S., Donoho, D. L. and Sanders, M. A. (1996). Atomic decomposition by basis pursuit, Technical report, Dept. Stat., Stanford Univ., Stanford, CA.

Chung, S. and Ferster, D. (1998). Strength and orientation tuning of the thalamic input to simple cells revealed by electrially evoked cortical suppression, Neuron 20: 1177–89.

Cleland, B. G., Lee, B. B. and Vidyasagar, T. R. (1983). Response of neurons in the cat’s lateral geniculate nucleus to moving bars of different length, J. Neurosci. 3: 108–116.

Coifman, R. R. and Wickerhauser, M. V. (1992). Entropy-based algorithms for best basis selection, IEEE Transactions on Information Theory 38: 713–718.

Connor, F. R. (1982). Noise, Edward Arnold, London.

Corbalis, C. and Roldan, E. (1974). On the perception of symmetrical and repeated patterns, Perception and Psychophysics 16: 136–142.


Cormack, E. O. and Cormack, R. H. (1974). Stimulus configuration and line orien- tation in the horizontal-vertical illusion., Perception and Psychophysics 16: 208– 212.

Crair, M. C., Gillespie, D. C. and Stryker, M. P. (1998). The role of visual experience in the development of columns in cat visual cortex, Science 279: 566–570.

Dakin, S. C. and Herbert, A. M. (1998). The spatial region of integration for visual symmetry detection, Proceedings of the Royal Society of London, B265 pp. 659– 664.

Das, A. and Gilbert, C. D. (1999). Topography of contextual modulations mediated by short-range interactions in primary visual cortex, Nature.

Daugman, J. D. (1988). Complete discrete 2-d gabor transforms by neural networks for image analysis and compression, IEEE Trans. Acoustics, Speech, and Signal Processing 36: 1169–1179.

Dayan, P. and Abbott, L. F. (2001). Theoretical Neuroscience, MIT Press.

de Valois, R. L. and de Valois, K. K. (1988). Spatial vision, New York: Oxford Univ. Press.

DeAngelis, G. C., Freeman, R. D. and Ohzawa, I. (1994). Length and width tuning of neurons in the cat’s primary visual cortex, J. Neurophysiol. 71: 347–74.

DeAngelis, G. C., Ghose, G. M., Ohzawa, I. and Freeman, R. D. (1999). Functional micro-organization of primary visual cortex: receptive field analysis of nearby neurons, The Journal of Neuroscience 19: 4046–4064.

DeAngelis, G. C., Ohzawa, I. and Freeman, R. D. (1995). Receptive-field dynamics in the central visual pathways, Trends Neurosci. 18: 451–458.

DeAngelis, G. C., Robson, J. G., Ohzawa, I. and Freeman, R. D. (1992). Organization of suppression in receptive fields of neurons in cat visual cortex, J. Neurophysiol. 68(1): 144–63.

deBoer, E. and Kuyper, P. (1968). Triggered correlation, IEEE Transact. Biomed. Eng. 15: 169–179.

Dempster, A., Laird, N. and Rubin, D. (1977). Maximum likelihood from incomplete data via the em algorithm, J. Royal Star. Soc., Series B 39: 1–38.

Dobbins, A., Zucker, S. W. and Cynader, M. S. (1987). Endstopped neurons in the visual cortex as a substrate for calculating curvature, Nature 329: 438–441.

Eglen, S., Bray, A. and Stone, J. (1997). Unsupervised discovery of invariances, Net- work: Comput. Neural Syst. 8: 441–452.


Einhäuser, W., Kayser, C., König, P. and Körding, K. P. (2002). Learning the invariance properties of complex cells from their responses to natural stimuli, European Journal of Neuroscience 15: 475–486.

Ernst, U., Pawelzik, K., Tsodyks, M. and Sahar-Pikielny, C. (2000). Spontaneous emer- gence of orientation preference and direction selectivity through lateral intracor- tical interactions, Neurocomputing.

Eysel, U. T., Shevelev, I. A., Lazareva, N. A. and Sharaev, G. A. (1998). Orientation tuning and receptive field structure in cat striate neurons during local blockade of in tracortical inhibition., Neuroscience 84: 25–36.

Felisberti, F. and Derrington, A. (2001). Long-range interactiosn in the lateral genicu- late nucleus of the new-world monkey, callithrix jacchus, Vis. Neurosci. 18: 209– 218.

Ferster, D. and Miller, K. D. (2000). Neural mechanisms of orientation selectivity in the visual cortex, Annual Reviews of Neuroscience.

Field, D. (1987). Relations between the statistics of natural images and the re- sponse properties of cortical cells, Journal of The Optical Society of America A. 4(12): 2379–2394.

Field, D. J. (1994). What is the goal of sensory coding, Neural Computation 6: 559– 601.

Fitzpatrick, D., Lund, J. S. and Blasdel, G. G. (1985). Intrinsic connections of macaque striate cortex: Afferent and efferent connections of layer 4C, J. Neurosci. 5: 3329– 3349.

Földiák, P. (1991). Learning invariance from transformation sequences, Neural Comput. 3: 194–200.

Földiák, P. (2001). Stimulus optimisation in primary visual cortex, Neurocomputing 38-40: 1217–1222.

Frégnac, Y. and Imbert, M. (1984). Development of neuronal selectivity in the primary visual cortex of the cat, Physiol. Rev. 64: 325–434.

Gabor, D. (1946). Theory of communication, J. IEE 72: 429–459.

Gallant, J. L., Essen, D. C. V. and Nothdurft, H. C. (1995). Two-dimensional and three-dimension texture processing in visual cortex of the macaque monkey, in T. Papathomas, C. Chubb, A. Gorea and E. Kowler (eds), Early Vision and Beyond, MIT Press, pp. 89–98.

Geman, D. and Koloydenko, A. (1998). Invariant statistics and coding of natural microimages, CVPR 99, SCTV workshop.


Gilbert, C. D. and Wiesel, T. N. (1990). The influence of contextual stimuli on the orientation selectivity of cells in primary visual cortex of the cat, Vision Res. 30(11): 1689–701.

Gisiger, T. (2001). Scale invariance in biology: coincidence or footprint of a universal mechanism?, Biol. Rev. 76: 161–209.

Gödecke, I. and Bonhoeffer, T. (1996). Development of identical orientation maps for two eyes without common visual experience, Nature 379: 251–254.

Gofman, Y. and Kiryati, N. (1996). Detecting symmetry in grey level images: The global optimization approach, ICPR p. A94.2.

Gross, C. G., Desimone, R., Albright, T. D. and Schwartz, E. L. (1985). Inferior temporal cortex and pattern recognition, Experimental Brain Research 11: 179–201.

Hansel, D. and Sompolinsky, H. (1998). Modeling feature selectivity in local cortical circuits, in C. Koch and I. Segev (eds), Methods in Neural Modeling, MIT Press, Cambridge MA, chapter 13, pp. 499–567.

Härdle, W. and Müller, M. (2000). Multivariate and semiparametric kernel regression, in M. G. Schimek (ed.), Smoothing and Regression: Approaches, Computation, and Application, Wiley, Europe.

Hartline, H. K. (1940). The receptive fields of fibers, Am. J. Physiol. 130: 690–699.

Hegdé, J. and Essen, D. C. V. (1999). Selectivity for complex shapes in primate visual area v1, Soc. Neurosci. Abstr. 25: 1548.

Hegdé, J. and Essen, D. C. V. (2000). Selectivity for complex shapes in primate visual area v2, J. Neurosci. 0: RC61 (1–6).

Hoffman, W. C. (1965). The neuron as a lie group germ and a lie product, Quarterly Journal of Applied Mathematics 25: 423–440.

Hubel, D. H. (1995). Eye, brain, and vision, Scientific American Library.

Hubel, D. H. and Wiesel, T. N. (1962). Receptive fields, binocular interaction and functional architecture in the cat's visual cortex, J. Physiol. 160: 106–154.

Hubel, D. H. and Wiesel, T. N. (1977). Functional architecture of macaque monkey visual cortex, Proc. R. Soc. Lond. B 198: 1–59.

Hyvärinen, A. (1999). Fast and robust fixed-point algorithms for independent component analysis, IEEE Transactions on Neural Networks 10(3): 626–634.

Hyvärinen, A. and Hoyer, P. (2000). Emergence of phase and shift invariant features by decomposition of natural images into independent feature subspaces, Neural Comput. 12: 1705–1720.

Hyvärinen, A. and Inki, M. (2002). Estimating overcomplete independent component bases for image windows, Mathematical Imaging and Vision, in press.

Jacquin, A. E. (1990). Fractal image coding based on a theory of iterated contractive image transformations, SPIE: Visual Communications and Image Processing 1360: 227–239.

Jones, J. P. and Palmer, L. A. (1987). An evaluation of the two-dimensional gabor filter model of simple receptive fields in the cat striate cortex, J. Neurophysiology 58: 1233–1258.

Julesz, B. (1971). Foundations of cyclopean perception, University of Chicago Press, Chicago.

Kapadia, M. K., Ito, M., Gilbert, C. D. and Westheimer, G. (1995). Improvement in visual sensitivity by changes in local context: parallel studies in human observers and in V1 of alert monkeys, Neuron 15: 843–856.

Kapadia, M. K., Sigman, M. and Gilbert, C. D. (1999a). Lateral interactions in cortical area V1 and their role in perception., Soc. Neurosci. Abstr. 25(1): 1049.

Kapadia, M., Sigman, M. and Gilbert, C. (1999b). Lateral interactions in cortical area v1 and their role in perception, Soc. Neurosci. Abstr., Vol. 25, Part 1, p. 1049.

Kaschube, M., Wolf, F., Geisel, T. and Löwel, S. (2002). Genetic influence on quantitative features of neocortical architecture, The Journal of Neuroscience 22: 7206–7217.

Kent, J. T. (1978). Limiting behaviour of the von mises-fisher distribution., Math. Proc. Cambridge Phil. Soc. 84: 531–536.

Kohonen, T. (1995). Self-Organizing Maps, Springer Verlag, Berlin, Heidelberg, New York.

Kolmogorov, A. N. (1965). Three approaches to the quantitative definition of information, Problems Inform. Transmission 1(1): 1–7.

Koulakov, A. A. and Chklovskii, D. B. (2001). Orientation preference patterns in mammalian visual cortex: A wire length minimization approach, Neuron 29: 519–527.

Kuffler, S. W. (1953). Discharge patterns and functional organization of mammalian retina, J. Neurophysiol. 16: 37–68.

Laughlin, S. B. and Attwell, D. (2000). An energy budget for glutamatergic signalling in grey matter of the rat cerebral cortex, J. Physiol. 525: 61.

Lee, J.-H., Jung, H.-J., Lee, T.-W. and Lee, S.-Y. (2000). Speech feature extraction using independent component analysis, IEEE International Conference on Acoustics, Speech and Signal Processing III: 1631–4.

Lee, T. S. (1996). Image representation using 2d gabor wavelets, IEEE Transactions on Pattern Analysis and Machine Intelligence 18(10): 959–971.

Levitt, J. B. and Lund, J. S. (1997). Contrast dependence of contextual effects in primate visual cortex, Nature 387: 73–76.

Lewicki, M. and Olshausen, B. (1998). Inferring sparse, overcomplete image codes using an efficient coding framework, Advances in Neural Information Processing Systems 10 (Proc. NIPS*97), pp. 815–821.

Lewicki, M. and Sejnowski, T. J. (1998). Learning overcomplete representations, Advances in Neural Information Processing Systems 10 (Proc. NIPS*97), pp. 556–562.

Lewicki, M. S. and Olshausen, B. A. (1999). A probabilistic framework for the adaptation and comparison of image codes, J. Opt. Soc. of Am. A: Optics, Image Science, and Vision (submitted).

Lewicki, M. S. and Sejnowski, T. J. (2000). Learning overcomplete representations, Neural Computation 12: 337–365.

Li, Z. (1998). A neural model of contour integration in the primary visual cortex, Neural Comput. 10: 903–940.

Li, Z. (1999). Visual segmentation by contextual influences via intra-cortical interactions in the primary visual cortex, Network: Comput. Neural Syst. 10: 187–212.

Lund, J. (1987a). Local circuit neurons of macaque monkey striate cortex: I. Neurons of laminae 4C and 5A, J. Comp. Neurol. 257: 60–92.

Lund, J. S. (1987b). Local circuit neurons of macaque monkey striate cortex: I. Neu- rons of laminae 4C and 5A, J. Comp. Neurol. 257: 60–92.

Malach, R., Amir, Y., Harel, M. and Grinvald, A. (1993). Relationship between intrinsic connections and functional architecture revealed by optical imaging and in vivo targeted biocytin injections in primate striate cortex, Proc. Natl. Acad. Sci. USA 90(22): 10469–73.

Mallat, S. G. and Zhang, Z. F. (1993). Matching pursuits with time-frequency dictionaries, IEEE Transactions on Signal Processing 41: 3397–3415.

Mardia, K. V. and Jupp, P. E. (2000). Directional Statistics, Wiley series in probability and statistics.

Marshall, A. (1956). The use of multi-stage sampling schemes in monte carlo computations, in M. Meyer (ed.), Symposium on Monte Carlo Methods, New York: Wiley, pp. 123–140.

Masame, K. (1983). Detection of symmetry in complex patterns: Is symmetrical projections to the visual system necessary for the perception of symmetry, Tohoku Psychologia Folia 42: 27–33.

Masuda, T., Yamamoto, K. and Yamada, H. (1993). Detection of partial symmetry using correlation with rotated-reflected images, Pattern Recognition 26: 1245–1253.

Mato, G. and Sompolinsky, H. (1996). Neural network models of perceptual learning of angle discrimination, Neural Computation 8: 270–299.

Mel, B. W. (1994). Information processing in dendritic trees, Neural Computation 6: 1031–1085.

Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. and Teller, E. (1953). Equation of state calculations by fast computing machines, Journal of Chemical Physics 21: 1087–1091.

Møller, A. P. (1995). Bumblebee preference for symmetrical flowers, Proceedings of the National Academy of Science 92: 2288–2292.

Morales, D. and Pashler, H. (1999). No role for colour in symmetry perception, Nature 399: 115–116.

Mundel, T., Dimitrov, A. and Cowan, J. D. (1996). Visual cortex circuitry and orientation tuning., in M. C. Mozer, M. I. Jordan and T. Petsche (eds), Neural Information Processing Systems, Vol. 9, MIT Press Cambridge, MA.

Nelder, J. A. and Mead, R. (1965). A simplex method for function minimization, Comput. J. 7: 308–313.

Nowak, L. G. and Bullier, J. (1997). The timing of information transfer in the visual system, in R. et al. (ed.), Cerebral Cortex, Plenum Press, New York, pp. 205–241.

Ogawa, H. (1991). Symmetry analysis of line drawings using the hough transform, Pattern Recognition Letters 12: 9–12.

Ohzawa, I., Sclar, G. and Freeman, R. D. (1985). Contrast gain control in the cat’s visual system, J. Neurophysiol. 54(3): 651–667.

Olshausen, B. A. and Field, D. J. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images., Nature 381: 607–609.

Olshausen, B. A. and Field, D. J. (1997). Sparse coding with an overcomplete basis set: A strategy employed by v1?, Vision Research.

Olshausen, B. A. and Millman, K. J. (2000). Learning sparse codes with a mixture-of-gaussians prior, in S. A. Solla, T. K. Leen and K. R. Müller (eds), Advances in Neural Information Processing Systems, Vol. 12, MIT Press, pp. 841–847.

Ormoneit, D. and Tresp, V. (1996). Improved gaussian mixture density estimates using bayesian penalty terms and network averaging, in D. S. Touretzky, M. C. Mozer and M. E. Hasselmo (eds), Advances in Neural Information Processing Systems, Vol. 8, The MIT Press, pp. 542–548.

Owsley, C., Sekuler, R. and Siemens, D. (1983). Contrast sensitivity throughout adulthood, Vision Res. 23: 689–699.

Paradiso, M. A. (1988). A theory for the use of visual orientation information which exploits the columnar structure of striate cortex, Biol. Cybern. 58: 35–49.

Pawelzik, K. R., Ernst, U., Wolf, F. and Geisel, T. (1996). Orientation contrast sensitivity from long-range interactions in visual cortex., in M. C. Mozer, M. I. Jordan and T. Petsche (eds), Neural Information Processing Systems, Vol. 9, MIT Press Cambridge MA.

Payne, B. R. and Peters, A. (2002). The cat primary visual cortex, Academic Press.

Pei, X., Vidyasagar, T. R., Volgushev, M. and Creutzfeldt, O. D. (1994). Receptive field analysis and orientation selectivity of postsynaptic potentials of simple cells in cat visual cortex, J. Neurosci. 14(11): 7130–40.

Peters, A. and Sethares, C. (1996). Myelinated axons and the pyramidal cell modules in monkey primary visual cortex, J. Comp. Neurol. 365: 232–255.

Pizer, S. M., Oliver, W. R. and Bloomberg, S. H. (1987). Hierarchical shape description via the multiresolution symmetric axis transform, IEEE Trans. Pattern Anal. Machine Intell. 9: 505–511.

Poirazi, P. and Mel, B. W. (2001). Impact of active dendrites and structural plasticity on the memory capacity of neural tissue, Neuron 29: 779–796.

Polat, U., Mizobe, K., Pettet, M. W., Kasamatsu, T. and Norcia, A. M. (1998). Collinear stimuli regulate visual responses depending on cell’s contrast threshold, Nature 391: 580–584.

Ponce, J. (1990). On characterizing ribbons and finding skewed symmetries, Comput. Vision Graph. Image Process. 52: 328–340.

Rao, G. P. (1983). Piecewise constant orthogonal functions and their application to systems and control, Springer-Verlag, Berlin.

Rao, R. P. N. and Ballard, D. H. (1998). Development of localized oriented receptive fields by learning a translation-invariant code for natural images, Network: Comput. Neural Syst. 9: 219–234.

Rao, R. P. N. and Ruderman, D. L. (1999). Learning lie groups for invariant visual perception, in M. S. Kearns, S. A. Solla and D. A. Cohn (eds), Advances in Neural Info Processing Systems, Vol. 11, MIT Press, pp. 810–816.

Reid, R. C. and Alonso, J. M. (1995). Specificity of monosynaptic connections from thalamus to visual cortex, Nature 378: 281–284.

Reid, R. C. and Alonso, J. M. (1996). The processing and encoding of information in the visual cortex, Current Opinion in Neurobiology 6: 475–480.

Reinagel, P. and Zador, A. M. (1999). Natural scene statistics at the center of gaze, Network: Comput. Neural Syst. 10: 341–350.

Ringach, D. L., Sapiro, G. and Shapley, R. (1997). A subspace reverse-correlation technique for the study of visual neurons., Vision Res. 37(17): 2455–64.

Rockland, K. S. and Lund, J. S. (1983). Intrinsic laminar lattice connections in primate visual cortex, J. Comp. Neurol. 216(1): 303–318.

Roe, A. W., Pallas, S. L., Hahm, J. and Sur, M. (1990). A map of visual space induced in primary auditory cortex, Science 250: 818–820.

Royer, F. (1981). Detection of symmetry, Journal of Experimental Psychology: Human Perception and Performance 7: 1186–1210.

Ruderman, D. L. (1994). Designing receptive fields for highest fidelity, Network 5: 147–155.

Ruderman, D. L. and Bialek, W. (1992). Seeing beyond the nyquist limit, Neural Computation 5: 682–690.

Ruderman, D. L. and Bialek, W. (1994). Statistics of natural images – scaling in the woods, Physical Review Letters 73(6): 814–817.

Sanchez-Vives, M. V., Nowak, L. G. and McCormick, D. A. (2000). Membrane mechanisms underlying contrast adaptation in cat area 17 in vivo, J. Neuroscience 20: 4267–4285.

Schoelkopf, B., Simard, P. Y., Smola, A. and Vapnik, V. (1998). Prior knowledge in support vector kernels, Neural Information Processing Systems Conference. MIT Press.

Schou, G. (1978). Estimation of the concentration parameter in von mises-fisher distributions., Biometrika 65: 369–377.

Schwabe, L., Adorján, P. and Obermayer, K. (2000). Spike–frequency adaptation as a mechanism for dynamic coding in v1, Neurocomputing 38-40: 351–358.

Schwartz, E. L. (1980). Computational anatomy and functional architecture of striate cortex: a spatial mapping approach to perceptual coding, Vision Research 20: 645–669.

Sclar, G. and Freeman, R. D. (1982). Invariance of orientation tuning with stimulus contrast, Exp. Brain Res. 46: 457–461.

Sengpiel, F., Sen, A. and Blakemore, C. (1997). Characteristics of surround inhibition in cat area 17., Exp. Brain Res. 116: 216–228.

Sengpiel, F., Stawinski, P. and Bonhoeffer, T. (1999). Influence of experience on orientation maps in cat visual cortex, Nat. Neurosci. 2: 727–732.

Shannon, C. and Weaver, W. (1948). A mathematical theory of communication, Champaign, IL: University of Illinois Press, pp. 341–350.

Shapley, R. and Enroth-Cugell, C. (1984). Visual adaptation and retinal gain controls, in N. N. Osborne and G. J. Chader (eds), Progress in Retinal Research, Vol. 3, Oxford: Pergamon Press, pp. 263–346.

Shevelev, I. A. (1999). What image characteristics are selected by neurons in the cat primary visual cortex?, Ross Fiziol Zh Im I M Sechenova 85: 767–80.

Shouval, H. Z., Goldberg, D. H., Jones, J. P., Beckerman, M. and Cooper, L. N. (2000). Structured long–range connections can provide a scaffold for orientation maps, J. Neurosci. 20(3): 1119–1128.

Sillito, A. M., Grieve, K. L., Jones, H. E., Cudeiro, J. and Davis, J. (1995). Visual cortical mechanisms detecting focal orientation discontinuities, Nature 378: 492–496.

Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis, Chapman & Hall, New York.

Simard, P., LeCun, Y. and Denker, J. (1993). Efficient pattern recognition using a new transformation distance, Advances in Neural Information Processing Systems V, pp. 50–58.

Simoncelli, E. P., Freeman, W. T., Adelson, E. H. and Heeger, D. J. (1992). Shiftable multi-scale transforms, IEEE Transactions on Information Theory.

Singer, W. (1981). Topographic organization of orientation columns in the cat visual cortex, Exp. Brain Res. 44: 431–436.

Skottun, B. C., Bradley, A., Sclar, G., Ohzawa, I. and Freeman, R. D. (1987). The effects of contrast on visual orientation and spatial frequency discrimination: a comparison of single cells and behavior, J. Neurophysiol. 57: 773–786.

Solomon, S. G., White, A. J. R. and Martin, P. R. (2002). Extraclassical receptive field properties of parvocellular, magnocellular, and koniocellular cells in the primate lateral geniculate nucleus, The Journal of Neuroscience 22: 338–349.

Solomonoff, R. J. (1997). The discovery of algorithmic probability, Journal of Computer and System Sciences 55(1): 73–88.

Somers, D. C., Nelson, S. B. and Sur, M. (1995). An emergent model of orientation selectivity in cat visual cortical simple cells, J. Neurosci.

Soodak, R. E., Shapley, R. M. and Kaplan, E. (1987). Linear mechanism of orientation tuning in the retina and lateral geniculate nucleus of the cat, J. Neurophysiol. 58: 267–275.

Stephens, M. A. (1963). Random walk on a circle, Biometrika 50: 385–390.

Stetter, M., Adorján, P., Bartsch, H. and Obermayer, K. (1998). Modeling contrast adaptation and contextual effects in primary visual cortex, The Fifth International Conference on Neural Information Processing, ICONIP 98-Kitakyushu 2: 669–672.

Stetter, M., Bartsch, H. and Obermayer, K. (2000a). A mean field model for orientation tuning, contrast saturation and contextual effects in area 17, Biol. Cybern. 82: 291–304.

Stetter, M., Bartsch, H. and Obermayer, K. (2000b). A mean field model for orientation tuning, contrast saturation and contextual effects in the primary visual cortex., Biol. Cybern. 82: 291–304.

Stratford, K. J., Tarczy-Hornoch, K., Martin, K. A. C., Bannister, N. J. and Jack, J. J. B. (1996). Excitatory synaptic inputs to spiny stellate cells in cat visual cortex, Nature 382: 258–261.

Takeuchi, A. and Amari, S. (1979). Formation of topographic maps and columnar microstructures in nerve fields, Biol. Cybern. 35: 63–72.

Tavazoie, S. F. and Reid, R. C. (2000). Diverse receptive fields in the lateral geniculate nucleus during thalamocortical development, Nature Neurosci. 3: 608–616.

Todorov, E., Siapas, A. and Somers, D. (1996). A model of recurrent interactions in primary visual cortex., in M. C. Mozer, M. I. Jordan and T. Petsche (eds), Advances in Neural Information Processing Systems, Vol. 8, MIT Press Cambridge, Massachusetts.

Troyer, T. W., Krukowski, A. E. and Miller, K. D. (2002). LGN input to simple cells and contrast-invariant orientation tuning: an analysis, J. Neurophysiol. 87: 2741–2752.

Tsodyks, M. V. and Sejnowski, T. (1995). Rapid state switching in balanced cortical network models., Network 6: 111–124.

Turiel, A. and Parga, N. (2000). The multifractal structure of contrast changes in natural images: from sharp edges to textures, Neural Computation 12: 763–793.

Valois, R. L. D., Albrecht, D. G. and Thorell, L. G. (1982). Spatial frequency selectivity of cells in macaque visual cortex, Vision Research 22: 545–559.

Versavel, M., Orban, G. A. and Lagae, L. (1990). Response of visual cortical neurons to curved stimuli and chevrons, Vision Res. 30: 235–248.

Volgushev, M., Pernberg, J. and Eysel, U. T. (2000). Comparison of the selectivity of postsynaptic potentials and spike responses in cat visual cortex., Eur. J. Neurosci. 12: 257–263.

Volgushev, M., Vidyasagar, T. R. and Pei, X. (1995). Dynamics of the orientation tuning of postsynaptic potentials in the cat visual cortex, Visual Neuroscience 12: 621–28.

Walk, R. D. (1978). Perceptual learning, in E. C. Carterette and M. P. Friedman (eds), Handbook of Perception, Vol. IX, Academic Press, New York, pp. 257–298.

Willmore, B. and Tolhurst, D. J. (2001). Characterizing the sparseness of neuronal code, Network: Computations in Neural Systems 12: 255–270.

Wiskott, L. and Sejnowski, T. (2002). Slow feature analysis: unsupervised learning of invariances, Neural Computation 14: 715–770.

Xu, L., Cheung, C., Yang, H. and Amari, S. I. (1997). Independent component analysis by the information-theoretic approach with mixture of densities, in P. B. Watta, M. Akkal and M. H. Hassoun (eds), Proceedings of the IEEE International Confer- ence on Neural Networks, ICNN’97, Vol. 3, pp. 1821–1826.

Ylä-Jääski, A. and Ade, F. (1996). Grouping symmetrical structures for object segmentation and description, Computer Vision and Image Understanding 63: 399–417.

Yoshioka, T., Blasdel, G. G., Levitt, J. B. and Lund, J. S. (1992). Patterns of lateral connections in macaque visual area v1 revealed by biocytin histochemistry and functional imaging, Soc. Neurosci. Abstr. 18: 299.

Yoshioka, T., Blasdel, G. G., Levitt, J. B. and Lund, J. S. (1996). Relation between patterns of intrinsic lateral connectivity, ocular dominance, and cytochrome oxidase-reactive regions in macaque monkey striate cortex, Cerebral Cortex 6(6): 297–310.

Yousef, T., Bonhoeffer, T., Kim, D. S., Eysel, U. T., Tóth, E. and Kisvárday, Z. F. (1999). Orientation topography of layer 4 lateral networks revealed by optical imaging in cat visual cortex (area 18), Eur. J. Neurosci. 11: 4291–4308.

Zabrodsky, H., Peleg, S. and Avnir, D. (1995). Symmetry as a continuous feature, IEEE Transactions on Pattern Analysis and Machine Intelligence.

Zemel, R. S. and Pillow, J. (2000). Encoding multiple orientations in a recurrent network, in J. M. Bower (ed.), Computational Neuroscience: Trends in Research, Elsevier Science, pp. 609–616.

Zhu, S. C., Wu, Y. N. and Mumford, D. (1997). Minimax entropy principle and its application to texture modeling, Neural Computation 9: 1627–1660.

Zohary, E. (1992). Population coding of visual stimuli by cortical neurons tuned to more than one dimension., Biol. Cybern. 66: 265–272.

Index

1/f, 11
active vision, 110
basis
    basis function, 13
    basis image, 13
    Walsh basis functions, 84
bending space, 131
blind source separation, 117
cells
    complex cell, 22, 26
    simple cell, 13, 22
centralized Gaussian mixture model, 124
clique, 103
co-occurrence, 10
coincidence detection, 166
contextual effects, 30
Contrast, 18
    Adaptation
        in cortex, 23
        in retina, 18
    Rayleight contrast, 18
    Weber contrast, 18
correlation, 10
covariance, 10
cumulants, 118, 186
de-noising, 101
EM, 110, 125
    singularities in EM, 124
end-stopping, 31
energy function, 102
extended general form, 133
extrinsic dimension, 129
factor analysis, 117
factorial mixture of Gaussians, 123
feature space, 127
figure-ground, 107
first order statistics, 9
fractal coding, 85
fractal components, 11
Gabor filter, 71
general form, 131
gradient ascent method, 26
gradient descent, 108
ICA
    overcomplete ICA, 119
    square ICA, 118
important sampling, 179
incomplete data problem, 110
independent factor analysis, 122, 126
integration region, 100
intrinsic dimension, 128
Kolmogorov complexity, 3, 8, 79
kurtosis, 124, 187
    life time kurtosis, 156
    population kurtosis, 157
lateral geniculate nucleus, 19
Lie transformation group, 85
Lie-figures, 27
linear model, 123
log statistics, 9
luminance, 17
Mahalanobis distance, 145, 155
major-, minor axis, 111
manifold, 128

Markov
    Markov chain, 93
    Markov random field, 103
matrix
    Vandermonde, 189
    Wronskian matrix, 132
mental rotation, 82
minimum description length, 8
Mollweide map transformation, 142
natural images, 8
ocular dominance, 20
orientation columns, 20
overcomplete dictionary, 119
principal component analysis, 13, 118
pseudo-inverse, 102, 121, 130
quasi-orthogonality, 123
radial basis functions, 134
receptive field, 21, 100
    ganglion cells, 17
    V1 neuron, 21
reverse correlation, 25
scaling space, 131
second order statistics, 10
sigma-pi units, 14, 90
    high-order nets, 14
simulated annealing, 106
skewness, 187
slow feature analysis, 80, 95
sparse coding, 120, 122, 124, 155
statistical independence, 118
support vector machines, 169
symmetry, 81
template matching, 82
tilt-illusion, 24
topography, 6
transformation invariant coding, 79
V1, 20
    calcarine cortex, 20
    striate cortex, 20
    visual area one, 20
visual pathway, 15
von Mises distribution, 135
wavelets, 120
Wishart density, 125
