Binocular Fusion and Invariant Category Learning Due to Predictive Remapping During Scanning of a Depthful Scene with Eye Movements

Total Pages: 16

File Type: PDF, Size: 1020 KB

ORIGINAL RESEARCH ARTICLE
published: 14 January 2015
doi: 10.3389/fpsyg.2014.01457

Binocular fusion and invariant category learning due to predictive remapping during scanning of a depthful scene with eye movements

Stephen Grossberg*, Karthik Srinivasan and Arash Yazdanbakhsh
Center for Adaptive Systems, Graduate Program in Cognitive and Neural Systems, Center of Excellence for Learning in Education, Science and Technology, Center for Computational Neuroscience and Neural Technology, and Department of Mathematics, Boston University, Boston, MA, USA

Edited by: Chris Fields, Independent Scientist, USA
Reviewed by: Greg Francis, Purdue University, USA; Christopher W. Tyler, Smith-Kettlewell Eye Research Institute, USA
*Correspondence: Stephen Grossberg, Center for Adaptive Systems, Boston University, 677 Beacon Street, Boston, MA 02215, USA; e-mail: [email protected]

How does the brain maintain stable fusion of 3D scenes when the eyes move? Every eye movement causes each retinal position to process a different set of scenic features, and thus the brain needs to binocularly fuse new combinations of features at each position after an eye movement. Despite these breaks in retinotopic fusion due to each movement, previously fused representations of a scene in depth often appear stable. The 3D ARTSCAN neural model proposes how the brain does this by unifying concepts about how multiple cortical areas in the What and Where cortical streams interact to coordinate processes of 3D boundary and surface perception, spatial attention, invariant object category learning, predictive remapping, eye movement control, and learned coordinate transformations. The model explains data from single neuron and psychophysical studies of covert visual attention shifts prior to eye movements. The model further clarifies how perceptual, attentional, and cognitive interactions among multiple brain regions (LGN, V1, V2, V3A, V4, MT, MST, PPC, LIP, ITp, ITa, SC) may accomplish predictive remapping as part of the process whereby view-invariant object categories are learned. These results build upon earlier neural models of 3D vision and figure-ground separation and the learning of invariant object categories as the eyes freely scan a scene. A key process concerns how an object’s surface representation generates a form-fitting distribution of spatial attention, or attentional shroud, in parietal cortex that helps maintain the stability of multiple perceptual and cognitive processes. Predictive eye movement signals maintain the stability of the shroud, as well as of binocularly fused perceptual boundaries and surface representations.

Keywords: depth perception, perceptual stability, predictive remapping, saccadic eye movements, object recognition, spatial attention, gain fields, category learning

1. INTRODUCTION

1.1. STABILITY OF 3D PERCEPTS ACROSS EYE MOVEMENTS

Our eyes continually move from place to place as they scan a scene to fixate different objects with their high-resolution foveal representations. Despite the evanescent nature of each fixation, we perceive the world continuously in depth. Such percepts require explanation, if only because each eye movement causes the fovea to process a different set of scenic features, and thus there are breaks in retinotopic fusion due to each movement. Within a considerable range of distances and directions of movement, the fused scene appears stable in depth, despite the fact that new retinotopic matches occur after each movement. How does the brain convert such intermittent fusions into a stable 3D percept that persists across eye movements?

This article develops the 3D ARTSCAN model to explain and simulate how the brain does this, and makes several predictions to further test model properties. The model builds upon and integrates concepts and mechanisms from earlier models:

FACADE (Form-And-Color-And-DEpth) is a theory of 3D vision and figure-ground separation that proposes how 3D boundaries and surfaces are formed from 3D scenes and 2D pictures that may include partially occluding objects (Grossberg, 1994, 1997; Grossberg and McLoughlin, 1997; Grossberg and Kelly, 1999; Kelly and Grossberg, 2000; Grossberg et al., 2002, 2007, 2008; Grossberg and Swaminathan, 2004; Cao and Grossberg, 2005, 2012; Grossberg and Yazdanbakhsh, 2005; Fang and Grossberg, 2009). The articles that develop FACADE also summarize and simulate perceptual and neurobiological data supporting the model’s prediction that 3D boundary and surface representations are, indeed, the perceptual units of 3D vision.

aFILM (Anchored Filling-In Lightness Model) simulates psychophysical data about how the brain generates representations of anchored lightness and color in response to psychophysical displays and natural scenes (Hong and Grossberg, 2004; Grossberg and Hong, 2006).

ARTSCAN (Grossberg, 2007, 2009; Fazl et al., 2009) models and simulates perceptual, attentional, and neurobiological data about how the brain can coordinate spatial and object attention across the Where and What cortical streams to learn and recognize view-invariant object category representations as it scans a 2D scene with eye movements. These category representations form in the inferotemporal cortex in response to 2D boundary and surface representations that form across several parts of the visual cortex. In order to learn view-invariant object categories, the model showed how spatial attention maintains its stability in head-centered coordinates during eye movements as a result of the action of eye-position-sensitive gain fields.

These earlier models did not, however, consider how 3D boundary and surface representations that are formed from binocularly fused information from the two eyes are maintained as the eyes move to fixate different sets of object features. The current article shows how the stability of 3D boundary and surface representations and of spatial attention are ensured using gain fields. With this new competence incorporated, the 3D ARTSCAN model can learn view-invariant object representations as the eyes scan a depthful scene.

3D ARTSCAN is also consistent with the pARTSCAN (positional ARTSCAN) model (Cao et al., 2011), which clarifies how an observer can learn both positionally-invariant and view-invariant object categories in a 2D scene; the dARTSCAN (distributed ARTSCAN) model (Foley et al., 2012), which clarifies how visual backgrounds do not become dark when spatial attention is focused on a particular object, how Where stream transient attentional components and What stream sustained attentional components interact, and how prefrontal priming interacts with parietal attention mechanisms to influence search efficiency; and the ARTSCAN Search model (Chang et al., 2014), which, in addition to supporting view- and positionally-invariant object category learning and recognition using Where-to-What stream interactions, can also search a scene for a valued goal object using reinforcement learning, cognitive-emotional interactions, and What-to-Where stream interactions. It thereby proposes a neurobiologically-grounded solution of the Where’s Waldo problem. With the capacity of searching objects in depth added, which the results herein about 3D perceptual stability permit, a 3D ARTSCAN Search model could learn and recognize [...]

The main theoretical goal of the current article is to demonstrate the property of perceptual stability of 3D visual boundaries and surfaces across saccadic eye movements, which has been clarified using a variety of experimental paradigms (Irwin, 1991; Carlson-Radvansky, 1999; Cavanagh et al., 2001; Fecteau and Munoz, 2003; Henderson and Hollingworth, 2003; Beauvillain et al., 2005). The article also predicts how this process interacts with processes of spatial and object attention, invariant object category learning, predictive remapping, and eye movement control, notably how they all regulate and/or respond to adaptive coordinate transformations. As explained more fully below, the brain can prevent a break in binocular fusion after an eye movement occurs by using predictive gain fields to maintain 3D boundary and surface representations in head-centered coordinates, even though these representations are not maintained in retinotopic coordinates. This property is demonstrated by simulations using 2D geometrical shapes and natural objects that are viewed in 3D. In particular, the simulations show that the 3D boundary and surface representations of these objects are maintained in head-centered coordinates as the eyes move.

These simulation results generalize immediately to 3D objects that have multiple 2D planar surfaces, since the simulations do not depend upon a particular binocular disparity. Indeed, other modeling studies have demonstrated how the same retinotopic binocular mechanisms can process object features at multiple disparities (Grossberg and McLoughlin, 1997; Grossberg and Howe, 2003; Cao and Grossberg, 2005, 2012), including objects perceived from viewing stereograms (Fang and Grossberg, 2009) and natural 3D scenes (Cao and Grossberg, submitted), as well as objects that are slanted in depth (Grossberg and Swaminathan, 2004). All these results should be preserved under the action of predictive gain fields to convert their retinotopic boundary and surface representations into head-centered ones, since the gain fields merely predictively shift the representations that are created by the retinotopic mechanisms. The key point is thus [...]
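The mechanism highlighted above, eye-position-sensitive gain fields that predictively shift retinotopic representations into head-centered coordinates, can be stated very compactly in code. The sketch below is only an illustration of that coordinate-transform idea, not the 3D ARTSCAN model; the map size, the integer eye-position offsets, and the helper names are assumptions invented for the example.

```python
import numpy as np

def remap_to_head_centered(retinotopic_map, eye_position):
    """Shift a retinotopic activity map by the current eye position.

    This is the gain-field idea in its simplest form: adding the eye-position
    offset to retinal coordinates yields (approximately) head-centered
    coordinates, so the remapped map stays aligned across saccades.
    """
    dy, dx = eye_position
    return np.roll(retinotopic_map, shift=(dy, dx), axis=(0, 1))

# A toy "scene" defined in head-centered coordinates: one bright square.
scene = np.zeros((64, 64))
scene[20:28, 30:38] = 1.0

def retinal_image(scene, eye_position):
    """What the retina sees: the scene shifted opposite to the eye position."""
    dy, dx = eye_position
    return np.roll(scene, shift=(-dy, -dx), axis=(0, 1))

# Two fixations separated by a saccade.
fixation_1 = (0, 0)
fixation_2 = (5, -9)

head_centered_1 = remap_to_head_centered(retinal_image(scene, fixation_1), fixation_1)
head_centered_2 = remap_to_head_centered(retinal_image(scene, fixation_2), fixation_2)

# The retinal images differ after the saccade, but the remapped maps agree,
# which is the sense in which the representation remains stable.
assert not np.allclose(retinal_image(scene, fixation_1), retinal_image(scene, fixation_2))
assert np.allclose(head_centered_1, head_centered_2)
print("retinal images differ, head-centered maps match")
```

Predictive remapping corresponds to applying this shift with the planned eye position before the saccade lands, so fused boundary, surface, and shroud representations never have to be rebuilt from scratch after the movement.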
Recommended publications
  • Interactions Between Motion and Form Processing in the Human Visual System
    REVIEW ARTICLE
    published: 20 May 2013
    COMPUTATIONAL NEUROSCIENCE
    doi: 10.3389/fncom.2013.00065

    Interactions between motion and form processing in the human visual system

    George Mather 1*, Andrea Pavan 2, Rosilari Bellacosa Marotti 3, Gianluca Campana 3 and Clara Casco 3
    1 School of Psychology, University of Lincoln, Lincoln, UK
    2 Institut für Psychologie, Universität Regensburg, Regensburg, Germany
    3 Department of General Psychology, University of Padua, Padua, Italy
    Edited by: Timothy Ledgeway, University of Nottingham, UK
    Reviewed by: Cees Van Leeuwen, Katholieke Universiteit Leuven, Belgium; Mark Edwards, Australian National University, Australia
    *Correspondence: George Mather, School of Psychology, University of Lincoln, Brayford Pool, Lincoln LN6 7TS, UK; e-mail: [email protected]

    The predominant view of motion and form processing in the human visual system assumes that these two attributes are handled by separate and independent modules. Motion processing involves filtering by direction-selective sensors, followed by integration to solve the aperture problem. Form processing involves filtering by orientation-selective and size-selective receptive fields, followed by integration to encode object shape. It has long been known that motion signals can influence form processing in the well-known Gestalt principle of common fate; texture elements which share a common motion property are grouped into a single contour or texture region. However, recent research in psychophysics and neuroscience indicates that the influence of form signals on motion processing is more extensive than previously thought. First, the salience and apparent direction of moving lines depends on how the local orientation and direction of motion combine to match the receptive field properties of motion-selective neurons.
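The review excerpt above characterizes the first stage of motion processing as filtering by direction-selective sensors. One standard textbook way to make such a sensor concrete is a correlation-type (Reichardt) detector: the signal at one location is delayed and multiplied with the signal at a neighboring location, and the opponent difference of two mirror-symmetric subunits signals direction. The sketch below is an illustrative toy along those lines, not code from the cited article; the 1D drifting grating, the one-sample spatial offset, and the one-frame delay are assumptions chosen for simplicity.

```python
import numpy as np

def reichardt_response(stimulus, dt=1, dx=1):
    """Opponent correlation-type motion detector on a 1D space-time stimulus.

    stimulus: array of shape (time, space).
    Returns the mean opponent response; positive values indicate rightward
    motion, negative values leftward motion.
    """
    delayed = stimulus[:-dt]   # signal delayed by dt frames
    current = stimulus[dt:]    # undelayed signal
    # Subunit preferring rightward motion: delayed left input times current right input.
    right = delayed[:, :-dx] * current[:, dx:]
    # Mirror-symmetric subunit preferring leftward motion.
    left = delayed[:, dx:] * current[:, :-dx]
    return float(np.mean(right - left))

def drifting_grating(direction, n_t=64, n_x=64, wavelength=16.0):
    """A 1D sinusoidal grating drifting one sample per frame."""
    t, x = np.meshgrid(np.arange(n_t), np.arange(n_x), indexing="ij")
    return np.sin(2 * np.pi * (x - direction * t) / wavelength)

print(reichardt_response(drifting_grating(+1)))  # > 0 for rightward drift
print(reichardt_response(drifting_grating(-1)))  # < 0 for leftward drift
```

Each such detector only reports the motion component matched to its own offset and orientation, which is exactly why the aperture problem discussed in the later excerpts arises.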
  • Neural Models of Motion Integration, Segmentation, and Probabilistic Decision-Making
    Chapter 13
    Neural Models of Motion Integration, Segmentation, and Probabilistic Decision-Making
    Stephen Grossberg

    Abstract: What brain mechanisms carry out motion integration and segmentation processes that compute unambiguous global motion percepts from ambiguous local motion signals? Consider, for example, a deer running at variable speeds behind forest cover. The forest cover is an occluder that creates apertures through which fragments of the deer’s motion signals are intermittently experienced. The brain coherently groups these fragments into a trackable percept of the deer and its trajectory. Form and motion processes are needed to accomplish this using feedforward and feedback interactions both within and across cortical processing streams. All the cortical areas V1, V2, MT, and MST are involved in these interactions. Figure-ground processes in the form stream through V2, such as the separation of occluding boundaries of the forest cover from boundaries of the deer, select the motion signals which determine global object motion percepts in the motion stream through MT. Sparse, but unambiguous, feature tracking signals are amplified before they propagate across position and are integrated with far more numerous ambiguous motion signals. Figure-ground and integration processes together determine the global percept. A neural model predicts the processing stages that embody these form and motion interactions. Model concepts and data are summarized about motion grouping across apertures in response to a wide variety of displays, and probabilistic decision making in parietal cortex in response to random dot displays.

    S. Grossberg (*) Department of Cognitive and Neural Systems, Center for Adaptive Systems, Center of Excellence for Learning in Education, Science and Technology, Boston University, Boston, MA, USA; e-mail: [email protected]
  • Optical Illusion - Wikipedia, the Free Encyclopedia
    Optical illusion - Wikipedia, the free encyclopedia
    From Wikipedia, the free encyclopedia

    This article is about visual perception. See Optical Illusion (album) for information about the Time Requiem album.

    An optical illusion (also called a visual illusion) is characterized by visually perceived images that differ from objective reality. The information gathered by the eye is processed in the brain to give a percept that does not tally with a physical measurement of the stimulus source. There are three main types: literal optical illusions that create images that are different from the objects that make them, physiological ones that are the effects on the eyes and brain of excessive stimulation of a specific type (brightness, tilt, color, movement), and cognitive illusions where the eye and brain make unconscious inferences.

    [Figure: An optical illusion. The square A is exactly the same shade of grey as square B. See Same color illusion.]

    Contents: 1 Physiological illusions; 2 Cognitive illusions; 3 Explanation of cognitive illusions (3.1 Perceptual organization; 3.2 Depth and motion perception; 3.3 Color and brightness)
  • From Moving Contours to Object Motion: Functional Networks for Visual Form/Motion Processing
    Chapter 1
    From Moving Contours to Object Motion: Functional Networks for Visual Form/Motion Processing
    Jean Lorenceau

    Abstract: Recovering visual object motion, an essential function for living organisms to survive, remains a matter of experimental work aiming at understanding how the eye–brain system overcomes ambiguities and uncertainties, some intimately related to the sampling of the retinal image by neurons with spatially restricted receptive fields. Over the years, perceptual and electrophysiological recordings during active vision of a variety of motion patterns, together with modeling efforts, have partially uncovered the dynamics of the functional cortical networks underlying motion integration, segmentation and selection. In the following chapter, I shall review a subset of the large amount of available experimental data, and attempt to offer a comprehensive view of the building up of the unitary perception of moving forms.

    1.1 Introduction

    An oriented slit of moving light, a microelectrode and an amplifier! Such were Hubel and Wiesel’s scalpel used during the 1960s (1959–1968) to uncover the properties of the visual brain of cat and monkey. A very simple visual stimulus indeed which, coupled with electrophysiological techniques, nevertheless allowed the analysis of many fundamental aspects of the functional architecture of primary visual cortex in mammals: distribution of orientation and direction selective neurons in layers, columns and hyper columns, discovery of simple, complex and hyper complex cells, distribution of ocular dominance bands, retinotopic organization of striate visual areas, etc.

    J. Lorenceau (*) Equipe Cogimage, UPMC Univ Paris 06, CNRS UMR 7225, Inserm UMR_S 975, CRICM, 47 boulevard de l’Hôpital, Paris, F-75013, France; e-mail: [email protected]
  • Motion Perception and Mid-Level Vision
    Motion Perception and Mid-Level Vision
    Josh McDermott and Edward H. Adelson
    Dept. of Brain and Cognitive Science, MIT

    Note: the phenomena described in this chapter are very difficult to understand without viewing the moving stimuli. The reader is urged to view the demos when reading the chapter, at: http://koffka.mit.edu/~kanile/master.html

    Like many aspects of vision, motion perception begins with a massive array of local measurements performed by neurons in area V1. Each receptive field covers a small piece of the visual world, and as a result suffers from an ambiguity known as the aperture problem, illustrated in Figure 1. A moving contour, locally observed, is consistent with a family of possible motions (Wallach, 1935; Adelson and Movshon, 1982). This ambiguity is geometric in origin: motion parallel to the contour cannot be detected, as changes to this component of the motion do not change the images observed through the aperture. Only the component of the velocity orthogonal to the contour orientation can be measured, and as a result the actual velocity could be any of an infinite family of motions lying along a line in velocity space, as indicated in Figure 1. This ambiguity depends on the contour in question being straight, but smoothly curved contours are approximately straight when viewed locally, and the aperture problem is thus widespread. The upshot is that most local measurements made in the early stages of vision constrain object velocities but do not narrow them down to a single value; further analysis is necessary to yield the motions that we perceive.
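The "line in velocity space" described in the excerpt above can be written out explicitly. With our own notation (not the chapter's), let the contour's unit normal be (cos θ, sin θ) and let s be the speed measured along that normal through the aperture; then every candidate object velocity satisfies one linear constraint, and the ambiguity is exactly the free parameter λ below.

```latex
% Aperture-problem constraint: one local measurement pins down only the
% normal component of velocity, leaving a whole line of candidate velocities.
v_x \cos\theta + v_y \sin\theta = s,
\qquad
(v_x, v_y) = s\,(\cos\theta, \sin\theta) + \lambda\,(-\sin\theta, \cos\theta),
\quad \lambda \in \mathbb{R}.
```

A second, differently oriented contour supplies a second such line, and intersecting the two lines removes the ambiguity; that intersection-of-constraints step is sketched in code after the next excerpt.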
  • Motion Perception What You Should Be Reading…
    Motion Perception
    Josh McDermott
    9.35 — April 6, 2004

    What you should be reading:
    • For today: Chapter 9 (on motion)
    • For Thursday: Chapter 6, pp. 219-242, on object recognition
    • For Tuesday 4/13: Chapter 6, pp. 243-246, on attention

    One way to think about detecting image motion: a local detector will respond as long as the perpendicular component of velocity matches its space-time offset. (Images removed due to copyright considerations.)

    PROBLEM: Local motion detectors cannot determine the true direction of motion. These local computations are incapable of determining the specific motion direction and speed of a 2D signal; they can only "see" the component at their orientation. How, then, does the visual system detect the motion of 2D features?

    The local detectors give a constraint line in velocity space. Each detector gives a constraint line, so if two are combined, the true velocity can be derived from the intersection of the two lines.

    This suggests a two-stage model of motion processing: first, the stimulus is decomposed into a set of 1D components ("wavelet", Gabor, or Fourier components); these 1D components are then combined during a subsequent stage of processing using a form of intersection of constraints, implying that 2D features (like points, corners, etc.) are derived. Physiological experiments have since confirmed that this occurs: neurons in area V1 respond only to 1D motion, whereas area MT (V5) has neurons that respond to 2D motion.
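The slide notes above state that combining two constraint lines yields the true velocity at their intersection. Numerically this is a small linear solve of the constraint equation given earlier over the available measurements. The following is an illustrative sketch of that intersection-of-constraints computation, not code from the course; the orientations and speeds are synthetic values used only to check the recovery.

```python
import numpy as np

def intersection_of_constraints(thetas, speeds):
    """Recover a 2D velocity from normal-speed measurements.

    thetas: orientations of the contour normals (radians).
    speeds: measured speeds along those normals.
    Each measurement i constrains v to the line
        v_x * cos(theta_i) + v_y * sin(theta_i) = s_i,
    so two (or more, via least squares) measurements determine v.
    """
    A = np.column_stack([np.cos(thetas), np.sin(thetas)])
    s = np.asarray(speeds, dtype=float)
    v, *_ = np.linalg.lstsq(A, s, rcond=None)
    return v

# True velocity used to generate synthetic measurements from two contours.
v_true = np.array([3.0, -1.0])
thetas = np.array([np.deg2rad(20.0), np.deg2rad(70.0)])
speeds = np.cos(thetas) * v_true[0] + np.sin(thetas) * v_true[1]

print(intersection_of_constraints(thetas, speeds))  # ~ [ 3. -1.]
```

Using least squares rather than a bare 2x2 solve lets the same helper pool more than two local measurements, which is closer in spirit to the integration stage attributed to area MT.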
  • Microstimulation of the Superior Colliculus Focuses Attention Without Moving the Eyes
    Microstimulation of the superior colliculus focuses attention without moving the eyes
    James R. Müller*†, Marios G. Philiastides‡§, and William T. Newsome*
    *Howard Hughes Medical Institute and Department of Neurobiology, Stanford University School of Medicine, and §Department of Electrical Engineering, Stanford University, Stanford, CA 94305
    This contribution is part of the special series of Inaugural Articles by members of the National Academy of Sciences elected on May 2, 2000. Contributed by William T. Newsome, November 8, 2004.

    The superior colliculus (SC) is part of a network of brain areas that directs saccadic eye movements, overtly shifting both gaze and attention from position to position, in space. Here, we seek direct evidence that the SC also contributes to the control of covert spatial attention, a process that focuses attention on a region of space different from the point of gaze. While requiring monkeys to keep their gaze fixed, we tested whether microstimulation of a specific location in the SC spatial map would enhance visual performance at the corresponding region of space, a diagnostic measure of covert attention.

    [...] have shown that microstimulation can bias perceptual choices in discrimination tasks (20–22) or serve as a substitute for a stimulus that is not actually present (23). But no manipulation of this sort has improved an animal’s ability to discern what is actually present in the sensory world. To explore this important phenomenon further, we adopted the general approach of Moore and Fallah to test whether electrical stimulation of the SC, like that of the FEF, exerts a causal influence on covert attention. We sought to determine [...]
  • Motion Andy Rider [email protected]
    Motion
    Andy Rider, [email protected]

    Motion aftereffect
    • If you stare at something that contains motion, and then look at something that’s stationary, you perceive illusory motion in the opposite direction. First described by Aristotle (384–322 BC), then by Lucretius (99–55 BC).
    • Addams, R. (1834). An account of a peculiar optical phænomenon seen after having looked at a moving body. London and Edinburgh Philosophical Magazine and Journal of Science, 5, 373–4. (Demo: Fall of Foyers, https://www.youtube.com/watch?v=6MK9qQ_ApHQ)
    • Storm Illusion demo: https://www.youtube.com/watch?v=OAVXHzAWS60
    • Wohlgemuth (1911).

    Direction-selective cells in visual cortex
    • Hubel & Wiesel (1959): neuron in cat primary visual cortex (V1).
    • Hubel & Wiesel (1968): neuron in monkey primary visual cortex (V1).

    Adaptation of direction-selective neurons
    • Vautin & Berkley (1977): neuron in cat primary visual cortex (V1). [Figure: average response over time — blank screen, pattern moving in preferred direction, blank screen.]
    • Presentation of motion in the neuron’s preferred direction causes substantial adaptation.
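The adaptation result quoted above (Vautin & Berkley, 1977) is the usual mechanistic account of the motion aftereffect: prolonged stimulation weakens the channel tuned to the adapted direction, so a subsequent stationary pattern drives an opponent pair unequally and the difference signal points the other way. The sketch below is a deliberately crude illustration of that opponent-adaptation story, not a model from the lecture; the gain dynamics, time constants, and stimulus coding are invented for the example.

```python
import numpy as np

def run_opponent_pair(stimulus, tau=50.0, k_adapt=0.3, dt=1.0):
    """Two opponent direction-selective channels with slowly adapting gains.

    stimulus: sequence of drive values (+1 rightward motion, -1 leftward, 0 static test).
    Returns the net direction signal (rightward channel minus leftward channel).
    """
    gain_r, gain_l = 1.0, 1.0
    net = []
    for s in stimulus:
        drive_r = max(s, 0.0) + 0.5   # 0.5 = baseline response to any pattern, moving or not
        drive_l = max(-s, 0.0) + 0.5
        resp_r, resp_l = gain_r * drive_r, gain_l * drive_l
        # Each gain recovers toward 1 but is pulled down by its own response (adaptation).
        gain_r += dt * ((1.0 - gain_r) - k_adapt * resp_r) / tau
        gain_l += dt * ((1.0 - gain_l) - k_adapt * resp_l) / tau
        net.append(resp_r - resp_l)
    return np.array(net)

# Adapt to rightward motion for 300 steps, then view a stationary pattern.
stimulus = np.concatenate([np.full(300, 1.0), np.zeros(300)])
net = run_opponent_pair(stimulus)
print(net[:3].round(3))       # positive: rightward motion is signaled during adaptation
print(net[300:303].round(3))  # negative: the static test is signaled as moving leftward (aftereffect)
```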
  • The Visual World As Illusion the Ones We Know and the Ones We Don’T
    Chapter 7
    The Visual World as Illusion: The Ones We Know and the Ones We Don’t
    Stephen Grossberg

    EXPECTATION, IMAGINATION, AND ILLUSION

    When you open your eyes in the morning, you usually see what you expect to see. Often it will be your bedroom, with things where you left them before you went to sleep. What if you opened your eyes and unexpectedly found yourself in a steaming tropical jungle? Why do we have expectations about what is about to happen to us? Why do we get surprised when something unexpected happens to us? More generally, why are we Intentional Beings who are always projecting our expectations into the future? How does having such expectations help us to fantasize and plan events that have not yet occurred? Without this ability, all creative thought would be impossible, and we could not imagine different possible futures for ourselves or our hopes and fears for them.

    [...] that the line is there, but it is invisible, or not seen. Such a percept is called amodal in this chapter. A modal percept is one that does carry some visible brightness or color difference. This use of the term amodal generalizes the more traditional use by authors such as Albert Michotte, Fabio Metelli, and Gaetano Kanizsa because the mechanistic analysis of how boundaries and surfaces interact, which is summarized later in this chapter, supports a more general terminology in which perceptual boundaries are formed without any corresponding visible surface qualia. Simple images like this turn our naive views about the function of seeing upside down. For example, one introspectively appealing answer to the question “Why do we see?” is that “We see things in order to recognize them.” But we can [...]
  • Fractional-Order Information in the Visual Control of Locomotor Interception Reinoud J
    Fractional-Order Information in the Visual Control of Locomotor Interception
    Reinoud J. Bootsma, Rémy Casanova, Frank T. J. M. Zaal

    To cite this version: Reinoud J. Bootsma, Rémy Casanova, Frank T. J. M. Zaal. Fractional-Order Information in the Visual Control of Locomotor Interception. Perception, SAGE Publications, 2019, 48 (1_suppl), pp. 187-187. 10.1177/0301006618824879. hal-02185336
    HAL Id: hal-02185336, https://hal-amu.archives-ouvertes.fr/hal-02185336, submitted on 16 Jul 2019. HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

    Perception 2019, Vol. 48(S1) 1–233, © The Author(s) 2019. 41st European Conference on Visual Perception (ECVP) 2018, Trieste. DOI: 10.1177/0301006618824879. Article reuse guidelines: sagepub.com/journals-permissions; journals.sagepub.com/home/pec

    Welcome Address: The 41st European Conference on Visual Perception (ECVP) took place in Trieste (Italy), from August 26 to 30, 2018. This edition was dedicated to the memory of our esteemed colleague and friend Tom Troscianko, with an emotional Memorial lecture in his honour held by Peter Thompson during the opening ceremony. The conference saw the participation of over 900 fellow vision scientists coming from all around the world; the vast majority of them actively participated, allowing us to offer an outstanding scientific program.
  • Bifurcation Study of a Neural Field Competition Model with an Application to Perceptual Switching in Motion Integration
    J Comput Neurosci, DOI 10.1007/s10827-013-0465-5

    Bifurcation study of a neural field competition model with an application to perceptual switching in motion integration

    J. Rankin · A. I. Meso · G. S. Masson · O. Faugeras · P. Kornprobst
    Received: 25 January 2013 / Revised: 19 May 2013 / Accepted: 20 May 2013. © The Author(s) 2013. This article is published with open access at SpringerLink.com.

    Abstract: Perceptual multistability is a phenomenon in which alternate interpretations of a fixed stimulus are perceived intermittently. Although correlates between activity in specific cortical areas and perception have been found, the complex patterns of activity and the underlying mechanisms that gate multistable perception are little understood. Here, we present a neural field competition model in which competing states are represented in a continuous feature space. Bifurcation analysis is used to describe the different types of complex spatio-temporal dynamics produced by the model in terms of several parameters and for different inputs. The dynamics of the model was then compared to human perception investigated psychophysically during long presentations of an ambiguous, multistable motion pattern known as the barberpole illusion. In order to do this, the model is operated in a parameter range where known physiological response properties are reproduced whilst also [...] broad range of perceptual competition problems in which spatial interactions play a role.

    Keywords: Multistability · Competition · Perception · Neural fields · Bifurcation · Motion

    1 Introduction

    Perception can evolve dynamically for fixed sensory inputs and so-called multistable stimuli have been the attention of much recent experimental and computational investigation. The focus of many modelling studies has been to reproduce the switching behaviour observed in psychophysical experiment and provide insight into the underlying mechanisms (Laing and Chow 2002; Freeman 2005; Kim et al. [...]
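The abstract above describes a neural field competition model whose slow dynamics produce perceptual switching. The essential ingredients — mutual inhibition between competing populations, slow adaptation, and noise — can be illustrated with a reduced two-population rate model. The sketch below is our own toy of that model class, not the Rankin et al. model; every parameter value is an arbitrary choice that happens to yield alternations.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_rivalry(T=20000, dt=0.5, tau=10.0, tau_a=800.0,
                     inhibition=2.0, adaptation=1.0, drive=1.2, noise=0.05):
    """Two mutually inhibitory populations with slow adaptation and noise.

    Returns the time course of both firing rates; the dominant population
    alternates over time, mimicking perceptual switching for an ambiguous stimulus.
    """
    def f(x):
        return 1.0 / (1.0 + np.exp(-4.0 * (x - 0.5)))   # sigmoid firing-rate function

    r = np.array([0.6, 0.4])   # firing rates of populations 1 and 2
    a = np.zeros(2)            # slow adaptation variables
    rates = np.empty((T, 2))
    for t in range(T):
        # Each population is driven by the input, suppressed by its rival,
        # and weakened by its own accumulated adaptation.
        inp = drive - inhibition * r[::-1] - adaptation * a
        r += dt * (-r + f(inp)) / tau + np.sqrt(dt) * noise * rng.standard_normal(2)
        r = np.clip(r, 0.0, 1.0)
        a += dt * (-a + r) / tau_a
        rates[t] = r
    return rates

rates = simulate_rivalry()
dominant = (rates[:, 0] > rates[:, 1]).astype(int)
n_switches = int(np.sum(np.abs(np.diff(dominant))))
print(f"number of dominance switches: {n_switches}")
```

In the full neural field setting the two discrete populations become a continuum over a feature space (e.g., motion direction), and the bifurcation analysis in the paper maps out which parameter regimes give winner-take-all selection, slow switching, or travelling activity patterns.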
  • Computational Models and Systems for Gaze Guidance
    From the Institut für Neuro- und Bioinformatik, Universität zu Lübeck. Director: Univ.-Prof. Dr. rer. nat. Thomas Martinetz.

    Computational models and systems for gaze guidance

    Inaugural dissertation for the degree of Doctor, Universität zu Lübeck, Faculty of Technology and Natural Sciences, submitted by Michael Dorr from Hamburg. Lübeck, 2009. First reviewer: Prof. Dr.-Ing. Erhardt Barth. Second reviewer: Prof. Dr. rer. nat. Heiko Neumann. Date of the oral examination: 23 April 2010. Approved for printing: Lübeck, 26 April 2010.

    Contents: Acknowledgements (iv); Zusammenfassung [German summary] (v).
    Part I, Introduction and basics (1): 1 Introduction (3): 1.1 Thesis organization (6); 1.2 Previous publications (7). 2 Basics (9): 2.1 Image processing basics (10); 2.2 Spectra of spatio-temporal natural scenes (11); 2.3 Gaussian multiresolution pyramids (13); 2.4 Laplacian multiresolution pyramids (16); 2.5 Spatio-temporal “natural” noise (20); 2.6 Movie blending (21); 2.7 Geometry of image sequences (23); 2.8 Geometrical invariants (26); 2.9 Multispectral image sequences (26); 2.10 Orientation estimation (27); 2.11 Motion estimation (28); 2.12 Generalized structure tensor (31); 2.13 Support vector machines (33); 2.14 Human vision basics (34); 2.15 Eye movements (40).
    Part II, Models (43): 3 Eye movements on natural videos (47): 3.1 Previous work (47); 3.2 Our work (48); 3.3 Methods (50); 3.4 Results (60); 3.5 Discussion (68); 3.6 Chapter conclusion (72). 4 Prediction of eye movements (75): 4.1 Bottom-up and top-down eye guidance (76); 4.2 Saccade target selection based on low-level features at fixation [...]