Things and Places The Jean Nicod Lectures Franc¸ois Recanati, editor

The Elm and the Expert: Mentalese and Its Semantics, Jerry A. Fodor (1994)

Naturalizing the Mind, (1995)

Strong Feelings: Emotion, Addiction, and Human Behavior, Jon Elster (1999)

Knowledge, Possibility, and Consciousness, John Perry (2001)

Rationality in Action, John R. Searle (2001)

Varieties of Meaning, Ruth Garrett Millikan (2004)

Sweet Dreams: Philosophical Obstacles to a Science of Consciousness, Daniel C. Dennett (2005)

Reliable Reasoning: Induction and Statistical Learning Theory, and Sanjeev Kulkarni (2007)

Things and Places: How the Mind Connects with the World, Zenon W. Pylyshyn (2007) Things and Places

How the Mind Connects with the World

Zenon W. Pylyshyn

A Bradford Book The MIT Press Cambridge, Massachusetts London, England ( 2007 Massachusetts Institute of Technology

All rights reserved. No part of this book may be reproduced in any form by any elec- tronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.

MIT Press books may be purchased at special quantity discounts for business or sales promotional use. For information, please e-mail [email protected] or write to Special Sales Department, The MIT Press, 55 Hayward Street, Cambridge, MA 02142.

This book was set in Stone Serif and Stone Sans on 3B2 by Asco Typesetters, Hong Kong and was printed and bound in the United States of America.

Library of Congress Cataloging-in-Publication Data

Pylyshyn, Zenon W., 1937–. Things and places : how the mind connects with the world / by Zenon W. Pylyshyn. p. cm.—(The Jean Nicod lectures) ‘‘A Bradford book.’’ Includes bibliographical references and indexes. ISBN 978-0-262-16245-6 (hardcover : alk. paper) 1. Visual perception. 2. Cognition. 3. Mental representation. I. Title. BF241.P953 2007 152.14—dc22 2006035507

10987654321 Contents

Series Foreword vii Preface and Acknowledgments ix

1 Introduction to the Problem: Connecting Perception and the World 1 1.1 Background 1 1.2 What’s the Problem of Connecting the Mind with the World? Doesn’t Every Computational Theory of Vision Do That? 2 1.3 The Need for a Direct Way of Referring to Certain Individual Tokens in a Scene 10 1.4 Some Empirical Phenomena Illustrating the Role of Indexes 22 1.5 What Are We to Make of Such Empirical Demonstrations? 29

2 Indexing and Tracking Individuals 31 2.1 Individuating and Tracking 31 2.2 Indexes and Primitive Tracking 34 2.3 What Goes On in MOT? 37 2.4 Other Empirical and Theoretical Issues Surrounding MOT 40 2.5 Infants Capacity for Individuating and Tracking Objects 49 2.6 Summary and Implications for the Foundations of 52

3 Selection: The Key to Linking Representations and Things 59 3.1 Selection: The Role of Focal Attention 59 3.2 Selection and Demonstrative Reference: The Role of FINSTs 67 3.3 Problems with Selection by Location 79 3.4 Feature Placing and Sentience 91 3.5 What Do FINSTs Select? Some Consequences of the Present View 94

4 Conscious Contents and Nonconceptual Representation 99 4.1 Nonconceptual Representation and Perceptual Beliefs 99 4.2 The Role of Conscious Experience in the Study of Perception and Cognition 101 4.3 What Subjective Experience Reveals about Psychological Processes 112 vi Contents

4.4 The Phenomenal Experience of Seeing 119 4.5 The Phenomenal Experience of Mental Imagery 125 4.6 Does Phenomenal Appearance Correspond to a Level of Representation? 143

5 How We Represent Space: Internal versus External Constraints 147 5.1 What Does It Mean to Represent Space? 148 5.2 Internalizing General Spatial Constraints 151 5.3 Internalizing Spatial Properties by Mapping Them Onto an Inner Space 156 5.4 What Is Special about Representing Space? 169 5.5 Externalizing Spatial Properties: The Index Projection Hypothesis 179 5.6 Index Projection in Nonvisual Modalities 191

Conclusions 205

References 211 Index 235 Series Foreword

The Jean Nicod Lectures are delivered annually in Paris by a leading philos- opher of mind or philosophically oriented cognitive scientist. The 1993 inaugural lectures marked the centenary of the birth of the French philoso- pher and logician Jean Nicod (1893–1931). The lectures are sponsored by the Centre National de la Recherche Scientifique (CNRS), in cooperation with the Ecole des Hautes Etudes en Sciences Sociales (EHESS) and the Ecole Normale Superieure (ENS). The series hosts the texts of the lectures or the monographs they inspire.

Jean Nicod Committee

Jacques Bouveresse, President Je´roˆme Dokic and Elisabeth Pacherie, Secretary Franc¸ois Recanati, Editor of the Series Daniel Andler Jean-Pierre Changeux Stanislas Dehaene Emmanuel Dupoux Jean-Gabriel Ganascia Pierre Jacob Philippe de Rouilhan Dan Sperber

Preface and Acknowledgments

This book is based in part on the Jean Nicod Lectures that I delivered in Paris in May–June 2004. The temporal gap between the lectures and the publication is not entirely due to my slow typing, but arose from the need to assimilate the rather wide range of publications that are relevant to the thesis I am presenting. The thesis, it turns out, is one that I have been gestating over many years, and hints of it occur in fragmentary form in a number of my publications. Many of these are reports of experimental work carried out with graduate students over the years, whose contribution is much appreciated. The thesis rests on a growing appreciation of an idea I first learned from David Marr, who refers to it as the principle of natural constraints. The mind has been tuned over its evolutionary history so that it carries out certain functions in a modular fashion, without regard for what an organism knows or believes or desires, but because it is in its nature, or as I more often put it, because of its architecture. So far this is an innocent enough idea that fits many different schools of psychology (and in fact is a familiar part of J. J. Gibson’s direct realism theory, though used there to very differ- ent ends). The particular constraint I am interested in here takes the form of a mechanism that allows the modular perceptual system to do things that many philosophers have said (correctly) can only be done by using the sophisticated machinery of concepts and the logical machinery of in- duction, deduction, and what Charles Sanders Peirce called abduction. The mechanism includes the capacity to select individual things in one’s field of view, to reidentify each of them under certain conditions as the same in- dividual thing that was seen before, and to keep track of their enduring individuality despite radical changes in their properties. I claim that so long as we are in the kind of world we live in there are mechanisms that allow the visual system to do these things without using the heavy equip- ment of concepts, identity, and tenses (which are needed for other tasks). x Preface and Acknowledgments

For example, this is a world in which most surfaces that we see are sur- faces of physical objects, so that most of the texture elements we see move coherently as the object moves; almost all elements nearby on the proxi- mal image are at the same distance from the viewer; and, when objects disappear, they frequently reappear nearby, and often with a particular pat- tern of occlusion and disocclusion at the edges of the occluding opaque surfaces, and so on. Identifying things as ones we have seen before and keeping track of them as being the same individual objects over time is at the heart of my research, which has shown that we are very good at doing this in a way that does not use encoded properties (or the conceptual cate- gory) of the things that are tracked and reidentified. This mechanism is im- portant to us because if it were not for the existence of such nonconceptual processes, our concepts would not be grounded in experience and thus would not have the meaning that they do. I have proposed that the capacity to individuate and track several inde- pendently moving things is accomplished by a mechanism in the early vision module that I have called FINSTs (I call them ‘‘Fingers of INSTantia- tion’’ because they were initially viewed as a mechanism for instantiating or binding the arguments of visual predicates to objects in the world). This primitive nonconceptual mechanism functions to identify, reidentify, and track distal objects. It is an ability that we exercise every waking minute, and it has also been understood to be fundamental to the way we see and understand the world. I came upon these ideas in quite a different context, initially when I (along with my colleague Edward Elcock) attempted to develop a computer system for reasoning about diagrams, and later when I was carrying out experimental research on vision, visual attention, and mental imagery. This may seem like a circuitous route, but it has turned out that all these endeavors involve the same puzzles, which I later discovered were also the puzzles that preoccupy many philosophers: how concepts are grounded in experience; how we manage to encode and represent properties of the world when there are so many of them; why we feel that we are conscious of seeing an enormous number of things but are unable to report most of them, while at the same time a great deal of information of which we are not conscious can affect our behavior. These puzzles appear in their most striking form in discussions of two related problems: What are the proper- ties of mental images that allows them to function in thought, and how do certain kinds of thoughts—thoughts about spatial layouts—manage to dis- play properties very similar to those of perceived space? Preface and Acknowledgments xi

In this book I examine a number of critical functions of early vision (the part of vision that is informationally encapsulated from the rest of the mind) in the light of the FINST mechanism. Chapter 1 looks at the nature of the problems that FINST are intended to solve, from our initial encounter with the problem of incrementally building a representation over time as various aspects are noticed, to the deep problem of grounding concepts in sensory information. This chapter also introduces FINST theory in terms of a number of experiments that illustrate their nature as pointers to things in the perceived world. It also offers the suggestion that FINSTs serve to provide what philosophers have called demonstrative reference or demonstrative identification. Chapter 2 focuses on a particular function that FINSTs serve, namely, the nonconceptual tracking of individual things that move and change their properties. Since tracking is one of the critical aspects of our commerce with the world, these experiments serve as con- crete examples of the role that FINSTs play in this process. It also provides a basis for a number of additional properties of this mind–world connec- tion: it shows that things can be tracked as unidentified things with an enduring numerical identity (where by ‘‘unidentified’’ I mean they are not represented in terms of any conceptual category or in terms of distinctive properties). The notion of tracking also links this work to some findings in cognitive development where it has been shown that very young infants (under six months of age) are able to track things that briefly disappear and are able to anticipate how many hidden things there are (as long as there are three or fewer). Chapter 3 looks more closely at another of the functions that FINSTs per- form, that of selecting things through something like attention (FINSTs are not exactly the same as focal attention and I discuss the differences). This chapter raises some of the controversial aspects of attentional selection. It argues that selecting is nonconceptual and does not depend on the prior encoding of any properties of the things selected—including their location. I argue that the reason this seems unintuitive is that we fail to distinguish between the various roles that the properties of things play in this process. Properties are involved in picking out things to which FINSTs are assigned, and they are involved in determining whether things can be tracked, but they need not be encoded and used in the process of maintaining the iden- tity of the things that are tracked. I spend time in chapter 3 distinguishing between causes and codes and suggest that we should be conservative in describing certain mind–world connections as representations. This brings us to an important function that FINST selection plays, solving what has xii Preface and Acknowledgments

been called the binding problem: the problem of encoding certain sets of properties as being conjoined, as being properties of the same thing, as opposed to being properties that merely occur simultaneously in the scene. Whereas much of the psychological and philosophical literature sees the binding problem as being solved in terms of the collocation of properties, my proposal is that properties are considered conjoined if they are proper- ties of the same FINSTed thing. This brings us to a point where we can say roughly what FINSTs attach to—what it is that I have been calling ‘‘things’’ (or even FINGs, to indicate that they are interdefined with FINSTs, as those things that FINSTs select and refer to)—they attach to what, in our sort of world, typically turn out to be individual visible physical objects. I discuss the frequently cited notion of nonconceptual representation and suggest how this idea is closely related to the story I am telling about FINSTs. In chapter 3 I propose that the only nonconceptual representational content we have is that secured by FINST indexes. The view that only properties of FINSTed things get represented puts me in conflict with those who appeal to the richness of conscious experience in defending nonconceptual representation. For this reason I devote chap- ter 4 to a discussion of the role of conscious contents in the process of con- necting mind and world. What I claim is that the contents of conscious experience are only one source of evidence for mental contents, and not even a very reliable one. I claim that there is no level of representation that corresponds specifically to the content of conscious experience and, therefore, that equating nonconceptual representation with the content of conscious experience is a mistake. The discussion of the content of con- scious experience brings the topic to the nature of the mental images that we experience when reasoning about concrete sensory appearances, about spatial layouts, or when we use spatial mental model models in reasoning. In chapter 4 I focus primarily on what we can make of the contents of conscious experience, and I use theories of mental imagery as the exam- ple of how we are misled when we view conscious content as a type of representation. It is not until chapter 5 that I focus directly on the problem of the repre- sentation of space in active working memory (as opposed to long-term memory). Here I review some of the proposals on how some mental repre- sentations manage to exhibit certain sorts of spatial properties. Most of these proposals hypothesize some internal constraints on representations of spatial layouts. The most widely accepted and intuitive proposal is that the spatial character of representations of space arises because the represen- tations themselves are instantiated by spatial layouts in the brain—what I Preface and Acknowledgments xiii

call neural layouts. I review this proposal as well as some related ones that attempt to retain the benefits of neural layouts without assuming a spatial arrangement in the brain. Of these, the one referred to as functional space fails to have any explanatory value, and the other, which I call the spatial architecture proposal, reduces essentially to the literal space alternative. In this discussion I distinguish representations involved in long-term memory from representations I refer to as active spatial representations (ASPARs), which are constructed both by vision and by reasoning that relies on imag- ined geometrical of spatial layouts. I then list what I take to be some of the conditions that an ASPAR must meet, which include its capacity to repre- sent magnitudes, its spatial configurational stability, its amodal nature, and its intimate connection with the motor system. The latter brings us to an overview of Poincare´’s proposal. I finish chapter 5 by presenting what might be seen as a fairly radical and speculative proposal for an externalist theory of spatial representation in ASPARs (i.e., in spatial reasoning). The hypothesis, which arises from the ideas about FINSTs that I discuss in the first part of the book, is what I call the projection hypothesis. This proposal claims that in constructing a spa- tial representation from approximate, partial, and qualitative information stored in long-term memory, we pick out things in the concurrently per- ceived world using FINSTs and associate imagined objects with them (to think of the imagined objects as being located at the sensed objects). This allows us to use the perceptual system to draw inferences by pattern recog- nition rather than logical reasoning. I then generalize the projection hy- pothesis to nonvisual modalities such as proprioception, which requires that I deal with the multiple frame of reference problem (as opposed to a unitary allocentric frame of reference). In this task, coordinate transforma- tion mechanisms, which are plentiful in the brain (especially in parietal cortex, as well as in superior colliculus and premotor cortical areas), play a central role. Throughout this essay I try to draw morals for a number of philosophical issues such as whether there are nonconceptual representations, how con- cepts are grounded in perception, and how the mind deals with spatial properties. Clearly this palette is more than can be dealt with adequately in one book. Yet there are some clear themes that run through these puzzles, especially when they are considered against the background of ex- perimental findings in psychophysics, cognitive science, cognitive develop- ment, and neuroscience. So this is my attempt to find a way through that forest by focusing on one or two sunny spots where I think progress has been made in the past two decades. xiv Preface and Acknowledgments

In this pursuit I must express my gratitude to the Institut Jean-Nicod, who generously invited me to give these lectures, the Centre Nationale de la Recherche Scientifique (CNRS), L’E´cole des Hautes E´tudes en Sciences Sociales (EHESS), and L’E´cole Normale Supe´rieure (ENS), who funded the Jean Nicod Prize and provided space for me the following year as I worked on this book. In particular I wish to thank Franc¸ois Re´canati, who chairs the Nicod Prize committee and looked after the logistics of my visit; Pierre Jacob, who directs the Institut Jean-Nicod; and the many people from the Institute who made my stay memorable, especially Roberto Casati, Je´roˆme Dokic, E´lizabeth Pacherie, Joelle Proust, and Dan Sperber. Michel Imbert and Sylvia Duchacek-Imbert were most welcoming and helped make our stay pleasant and memorable. I also acknowledge the help I received, intended or not, from discussions with my friend and with friends and colleagues Charles R. Gal- listel (who helped to educate me on the subject of navigation), Georges Rey (who carefully read an earlier version of the manuscript and tried his best to keep me from making some embarrassing philosophical slips and mis- using some philosophical terms of art), Alan Leslie and Lila Gleitman (with whom I have taught some of this material in joint courses), Susan Carey, Ned Block, and Luca Bonatti. I am also grateful to the participants of the conference on Spatial Frames of Reference held in Paris in November 2005, including organizers Je´roˆme Dokic and E´lizabeth Pacherie, and participants with whom I had a chance to try out the ideas in chapter 5: Jean-Rene´ Duhamel, Yves Rossetti, Charles Spence, Barbara Tversky, Yann Coello, Paolo Bartolomeo, and Sylvie Chokron. Finally, I wish to thank my editor, Judy Feldmann, who displayed an un- canny knack for finding words for what I meant instead of what I wrote.