What Is the Problem of Spatial Representation(S)

Spatial representation in the mind/brain: Do we need a global topographical map? 

Zenon Pylyshyn (Rutgers Center for Cognitive Science, IJN)

1. Introduction to the problem of spatial representation

There are few problems in cognitive science and philosophy of mind that are as ramified and as deeply connected to classical questions about the nature of mind as the question of how we represent space. It is but a small step from how we represent space to the nature of consciousness itself. In this talk I will focus on some reasons why the problem is so difficult and why we, as scientists, are so easily led astray in attempting to understand the psychology of spatial perception and reasoning and in theorizing about the mental and neural representation of space in general. I begin by trying to set some boundary conditions on this study, asking what an adequate theory of spatial representation might be expected to provide. I don’t expect to propose a definitive answer even to this preparatory (or propaedeutic) question because it is, in my view, a stage that is the locus of much of our stumbling and false starts in the theory of spatial representation. What I will do is approach the question of what an adequate theory ought to provide from several perspectives and examine what is wrong with some obvious proposals. In the interest of sharing the blame for some of the missteps I begin with a position that I articulated, or at least implied, some years ago in discussing the nature of mental imagery – a study that is very close to the heart of questions concerning spatial representation, as we shall see shortly. In that early paper (Pylyshyn, 1973) I asked what was special about mental images and I claimed, in effect, that what was special was not their form but their content – they were about sensory perception, about how things looked (or how they appeared in one sensory modality or other). I still think this is largely correct, but it is not the whole story. The reason it is not the whole story is itself highly instructive and I would like to dwell on this issue for a moment. 
When we think about space we are engaged in thinking of the usual sort – we are drawing inferences, both logical and heuristic, from what we believe; we are solving problems, perhaps using some form of means-ends analysis. Surely we reason about shapes and locations just the way we might reason about medicine or history or economics. So why does it seem to us that something special is going on when we think about, say, where things are located relative to one another? A major part of the answer surely lies in the phenomenology of spatial perception and spatial reasoning. It feels like we have access to spatial information in “bulk” or holistically. We feel we see and imagine not only objects and their relations, but also unoccupied places in a scene. I will not spend time on this view, which has been demonstrated to be false – both by psychophysical experiments showing that we have access to far less dense information than our impressions would suggest, and by a more general analysis of the conceptual and information-theoretic problems raised by such a view. But so tempting is the view that a great many pages have been spent arguing about it. I will not add to this discussion here, since I am doing so in other current writings. I will instead point to the very general error that underlies the inference from phenomenology to a theory of representation. The error is so general it has several names in different fields of scholarship. In early introspectionist psychology, when people attempted to give an objective description of their experiences, an error arose that Titchener called the “stimulus error” in which people mistakenly attributed certain properties of a stimulus to

 Previously advertised title: What is the Problem of Spatial Representation?

1 5／3／2018 5／3／2018 properties of their experience of the stimulus. The same sort of error of confusing the world with one’s experience of it appears and is discussed more frequently in philosophy under the term “intentional error”. It is almost universally committed by people working on theories of mental imagery (See the discussion in Dennett, 1991; Pylyshyn, 1984; Pylyshyn, 2002). Simply put, the intentional error is the mistake of attributing to a mental representation, properties of what is being represented. It is the mistake of attributing color or size of an object to the representation of the object. It’s a mistake we rarely make when the representation is linguistic (e.g., we are never tempted to assume that large things are represented by larger constructs in a language), yet we are always tempted – usually successfully – to assume that our representation of greater distances or larger size involves bigger things in the mind/brain. (We also almost universally assume that our mental representation of long-lasting events takes longer, but I will not raise this today). What I plan to do today is review some of the reasons why we cannot trust our phenomenology to provide valid cues as to the nature of our mental or neural representation of space. Then I will back off from this to criticize the view I myself have taken on the question of the form of representation of space – the view that spatial information is represented in pretty much the way any other information is represented – i.e., as expressions in the Language of Thought. In doing so I will outline a number of reasons why space is special. In talking about spatial representation I will be specifically addressing the role that spatial information plays in perception and in reasoning. The representation of spatial properties in one’s world knowledge – in what is often referred to as Long Term Memory – is a different issue and I will have nothing in particular to say about it. 
In fact I suspect that what I said in 1973 really does apply here: There is nothing special about our knowledge of spatial relations beyond what is special about any other topic about which we have some knowledge – it involves a different content and no doubt requires different concepts to capture that content. But whatever format we find we need to posit to account for how this knowledge is represented – say a Language of Thought (as argued forcibly by Fodor, 1975) – will likely also apply to knowledge of space. But when it comes to the space of our immediate perception and the space of our active reasoning (as for example when we use “mental images”, but not only then since spatial reasoning need not involve any conscious experiences at all – and certainly not visual ones as in the case of blind people) something special is going on that demands special principles of mental architecture. To anticipate, this discussion will center on the observation that our representations have to reflect certain properties that space, and not other subject matters, has. When I say “reflect” I mean that the representation must somehow carry information that allows certain properties to be captured and appropriate inferences to be drawn from them. This is not the same as the often-made claim that representations of spatial patterns must “preserve” spatial properties (although it is not very clear what exactly that means – despite the fact that it has been the source of some ideas about the spatial nature of representations that are clearly untenable). The properties of spatial layouts that we need to reflect in representations include: their metrical nature, their configurational or relative holistic nature, their continuity or connectedness, their amodal and three-dimensional character, and their close connection with the motor system. In the space available I will have more to say about some of these characteristics than others. 
But I will argue that these valid points have been widely misunderstood and have implicitly merged with the seduction of the intentional fallacy. Our grasp of space appears to us to be subtle, complex and extremely fine-grained. Our experience of space is all-pervasive; we experience ourselves as being totally immersed in the space around us which remains fixed as we move through it or as objects other than ourselves move through it. Our spatial abilities are remarkable. We can orient ourselves in space rapidly and effortlessly and can perceive spatial layouts based on extremely partial and ambiguous cues. We can recall spatial relations and recreate spatial properties in our imagination. Animals low on the phylogenetic scale, who may not even have any concepts, exhibit amazing powers of navigation that prove that they have quantitative representations of the space through which they

travel. Although vision science is arguably the most developed of the cognitive sciences there are many areas of vision science where it is far from clear that we have posed the problems correctly, and the problem of spatial cognition strikes me as an extremely likely candidate for one of those problems. Before I look at some of the scientific problems we face, I will take a slight detour to examine the role that conscious experience plays in our study of perception.

2. The role of conscious experience in the study of perception

Perception science, and particularly vision science, has had a deeply ambivalent relation with conscious experience. On the one hand, the way things appear or what they look like has always constituted the primary data of the science. When one thing looks bigger in one condition than in another or when something looks to be moving faster under one condition than another, or when colors appear different under one lighting condition than another, these are considered primary data to which theories are expected to respond. On the other hand, the content of experience has also proven to be one of the most misleading sources of evidence because it is not neutral with respect to what explanations appear most natural. The way we describe the experience often carries with it the implication that the experience itself explains the data – that the experience of X is explained by adverting to the experience of Y (we will see examples of this when we discuss mental imagery – e.g., the reason it takes longer to report details from a small image is that the details are harder to see). Phenomenological evidence has also tempted people into accepting that vision provides a dense manifold of panoramic information. It has suggested that mental images consist of pictures examined by an inner eye. It also has encouraged direct perception theories which claim that we pick up information about aspects of the world that are prominent in our experience, such as whether the things we see are suitable for certain purposes – from eating to sitting on. A theory that begins with these sorts of observations as the givens of perception, as Gibson’s direct realism theory does, fails to take even the first step towards a theory of the mechanisms that show how vision works and that might eventually make contact with neuroscience. 
There is also no room in phenomenology-based theories for the growing evidence of vision-without-awareness, including change blindness, visuomotor control without conscious awareness, blindsight and other sources of behavioral and neuroscience data. We have oscillated between these two extremes at different points in the history of perception research. Although perceptual experience cannot be discounted in the study of perception neither can one assume that the experience itself is to be taken at face value. How are we to reconcile these differences (which often coexist within individual researchers)? There is no general solution to this problem. The question of how to interpret a particular type of observation can only be resolved as we build more successful theories – in particular when we have at least a sketch of a theory that scales up from individual laboratory experiments to more general phenomena. The situation we are in is very similar to that which linguists have been in during the last 60 years. Intuitions of grammatical structure led early linguistics astray by focusing on surface phenomena. But as generative linguistics became better able to capture a wide range of generalizations, it found itself relying more, rather than less, on linguistic intuitions. What changed is that the use of the intuitions was now under the control of the evolving theories. Even such general questions as whether a particular intuitive judgment was relevant to linguistics became conditioned by the theory itself. Take Chomsky’s famous sentence “Colorless green ideas sleep furiously” which was introduced to show the distinction between grammaticality and acceptability judgments. This example engendered considerable debate because what constitutes grammaticality as opposed to acceptability is not given by intuition but came from the nascent theory itself. 
So my view is that as theories of vision formulate general principles, the theories will direct us in the interpretation of evidence from conscious experience. They will show us how to interpret such findings as those of Wittreich (1959). Wittreich confirmed that when people walked across the floor of the Ames distorting room they appeared to change in size. But he also found

that this did not happen when the people were well-known to the observer, e.g., their spouse, even if they were accompanied in the walk by a stranger (whose size did change!). Even now I think we are in a fairly good position to be skeptical of the theoretical significance of this finding, given what we know about how reports of conscious experience can be cognitively penetrable, hypnotism being an extreme example of this. Sometimes we can show this fairly directly by comparing measures from which the response bias has been mathematically factored out, as we do when we use the signal detection measure d′ rather than percent correct. But sometimes we make the decision on the grounds that a theory that takes certain observations at face value will simply miss the deeper underlying principles.
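
To make the contrast between d′ and percent correct concrete, here is a minimal sketch in Python. It assumes the standard equal-variance Gaussian signal detection model, in which d′ is the difference between the z-transformed hit rate and the z-transformed false-alarm rate. The two observers and their rates are hypothetical illustrations, not data from any experiment discussed here.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Sensitivity d' = z(hit rate) - z(false-alarm rate).

    Both rates must lie strictly between 0 and 1; inv_cdf raises
    StatisticsError otherwise.
    """
    z = _STD_NORMAL.inv_cdf
    return z(hit_rate) - z(false_alarm_rate)


def percent_correct(hit_rate: float, false_alarm_rate: float) -> float:
    """Proportion correct, assuming equal numbers of signal and noise trials."""
    return 0.5 * (hit_rate + (1.0 - false_alarm_rate))


# Two hypothetical observers with the same sensitivity but a different
# willingness to say "yes" (illustrative numbers only).
neutral = (_STD_NORMAL.cdf(0.5), _STD_NORMAL.cdf(-0.5))
liberal = (_STD_NORMAL.cdf(1.5), _STD_NORMAL.cdf(0.5))

for name, (hits, false_alarms) in (("neutral", neutral), ("liberal", liberal)):
    print(name,
          round(d_prime(hits, false_alarms), 3),
          round(percent_correct(hits, false_alarms), 3))
```

Under these assumptions the two observers come out with the same d′ but different percent correct, which is the sense in which d′ factors out response bias while percent correct does not.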

3. The perceptual experience of space

The experience of spatial layout is fundamental because it presents many constraints on a theory of early visual representation (some of which I will take up later). But it is also problematic because our experience reveals a stable panoramic layout of spatial locations, some of which are empty while others are filled with objects, surfaces and features that stand in some spatial relation to one another. This is the phenomenology that led people to postulate an inner replica of the perceived world that constitutes the experiential content of our perceived space – a panoramic display that fills the world around us (Attneave called it cycloramic since it appears to cover all 360 degrees; Attneave & Farrar, 1977). If we assume that the content of experience must somehow arise from a representation that has that content, and that the representation is constructed from the information we receive through vision, then there is a problem about how such a representation could possibly come about, given the poverty of the incoming information. The incoming information consists of a small peephole view from the fovea that jumps about several times a second, during which we are essentially blind, and so on (the information available to the brain has been described in detail and is a familiar story, see e.g., O'Regan, 1992). So the gap between our visual experience and the available visual information requires some explanation. While there are many ways to try to fill the gap (some of which appeal to visual indexes) the natural way, given the form of the experience, is to try to build an internal facsimile of the contents of the experience – the scenario – by postulating a process that takes account of the saccades by constructing an inner picture in synchrony with the eye movements, along the lines of Figure 1 below:

Figure 1. The intuitive (and for a long time, scientifically accepted) view of the experience of seeing (from Pylyshyn, 2003)

But as we now know, this theory is patently false – there is no inner picture of any kind in our head, neither literally nor in any other non-vacuous sense that could explain either our visual experience or how we represent spatial information in cognitive processing. What has gone wrong that has led so many otherwise intelligent people to succumb to that story? What’s gone wrong is that we are using a particularly natural description of phenomenological experience as the explanandum: we are trying to explain the content of the experience in terms of intrinsic properties of a representation. But we are not entitled to assume that the content of experience

reflects the structure or format of a representation – that’s the intentional fallacy. Yet so long as we take the content of the perceptual experience as our primary data this is where it will lead us. Should we, then, discount the experience and start afresh from psychophysical data? It is inevitable that we will eventually develop theories that are based on psychophysical evidence and retain only those aspects of our phenomenology that are consistent with such evidence. But it is also unlikely that we will have to shed the entire experiential infrastructure. In what follows I will try to show that some of our intuitions are in fact quite prescient – it’s just when we take them to transparently reveal properties of mechanisms that we go astray. I will examine our intuitive notion of space from two perspectives. First I will look closely at our experience of space and ask whether there may be some real constraints implied by the experience. I have already written a great deal about how it has misled us into internalizing the properties of space. Today I will ask whether there may not be something in our strong intuitions that might suggest what physicists call “boundary conditions” that have to be met by an adequate theory of spatial representation. To do that I will examine some functional properties of spatial representations. Secondly I will ask how our experience of space may have arisen and how it manifests itself in our behavior – especially our visual-motor behavior. Here I will draw some lessons from history, particularly from Henri Poincaré who thought deeply about the problem of space, including about the psychological question of how we could have acquired an amodal sense of three-dimensional space given the experiences we have. 
In this lecture I will try to take a middle road on the issue of the role of experience, although in the subsequent lecture I will take a harder line on the misdirection we are all under when it comes to understanding the nature of the spatial representations that occur in mental imagery. I will begin today by looking again at certain phenomena of perceptual experience that philosophers call sentience, or the experiential content of sensory stimulation. Such phenomena are of central interest to philosophers because they bear on the foundational questions about mind and its relation to the world. And they are of interest to us because the “level of sentience” is intended to be the level of nonconceptual content: in philosophy, sensory experience is typically equated with the nonconceptual representation.

3.1 How do we cognize space? And what does that mean?

One of the difficulties in understanding spatial experience is the fact that it is so extremely intuitive to us that it is unclear what we mean when we ask how we cognize space. We are like the proverbial fish trying to understand water. It seems obvious that space is the three-dimensional receptacle in which objects reside and that spatial relations are there for us to see and experience without the need for any concepts or inferences. It also seems to our modern sensibilities that space consists of a dense array of points which can be connected by straight lines. But these notions, which have been enshrined in our view of space since Euclid, may not be the right notions in terms of which we ought to describe how we perceive space and especially how we represent space in our mind when we think about it or imagine events taking place in it. But what does it mean to say that this is not the way we cognize space? How can it be that our conceptualization of space does not give a privileged place to points and lines? I begin by trying to outline the nature of the problem that faces us. What does it mean to see the world as being laid out in space? What must the architecture of a mind be like that can do this? Given the patterns of energy that impinge on our sense organs, what must the mind do with them to create the experience of space? Many of our greatest thinkers have sought to answer this question at a foundational level. Putting aside the classical Greeks, who had views about everything that matters, the problem fascinated thinkers like Kepler, who was one of the first to recognize (a) that the retinal image plays an important role in the causal chain and (b) that the gap between the retinal image and the apprehension of space would not succumb to the same style of geometrical analysis that worked so well in filling the gap between the light, the objects, and the image on the retina (Lindberg, 1976). René Descartes’ arithmetization of

geometry was one of the seminal accomplishments in understanding that the problem had a formal structure (not dependent on diagrams) that was amenable to rigorous study. Then in the 20th century several great French natural philosophers were stirred by the problem. Henri Poincaré (Poincaré, 1963/1913) was one of the most important of these and I will return to his views below. The problem of developing a sensory-based Euclidean geometry was raised again by Jean Nicod who, in the 1930s, wrote a dissertation entitled “Geometry and the Sensory World” which laid the groundwork for a very different way of looking at this question (Nicod, 1970) and which, by the way, had a profound effect on me when I began the work that led to the FINST theory. For Nicod the problem was that the basic building blocks of the Euclidean (and Cartesian) view are points and lines and a way of constructing figures from them, together with the relation of congruity, none of which seemed to him like the sorts of things that perceptual systems are equipped to detect – they are complex types that collapse collections of sensory experiences into categories that make the statement of geometrical principles simple at the cost of making their connection with sensory data opaque. Nicod suggested that since there are very many models of the Euclidean axioms (the Cartesian mapping of space onto n-tuples of numbers being the best known) we should seek instead a way to capture Euclidean spatial properties in terms of primitives more suited for creatures with sensory systems like ours. 
After considering a variety of such possible primitives, he developed several “sensible geometries” based on the geometry of volumes and of volume-inclusion (or what he called “spatio-temporal interiority”) and argued that this basis is closer to our sensory capacities than one based on points and lines (one reason being that volume inclusion is detectable and is invariant with viewpoint so it can be sensed as we move through space). With the addition of a few other novel ideas (such as succession and global resemblance) Nicod set out a new direction for understanding what space might consist in for a sentient organism. While in the end he did not succeed in developing a complete formalization of geometry based on these sensory primitives he did point the way to the possibility of understanding sense-based space radically different from the Euclidean, Kantian, and Cartesian approaches that seem so natural to us. If Nicod had been able to carry out his program he might have provided a set of tools for viewing space that would have been more useful to us than the view that is thoroughly embedded in our way of thinking. But he did show us that thinking in terms of points and lines may not be the only way and indeed it may not be the most perspicuous way for cognitive science to proceed in studying the experience of space. Another person who worked on the problem of characterizing the nonconceptual experience of space is Christopher Peacocke. In (Peacocke, 1992) he develops a way of characterizing the experience of space in terms of what he calls scenarios, which he defines as ways in which space can be filled. 
As I mentioned at the beginning of this lecture, while characterizing the nonconceptual experience of space is a deep and interesting philosophical problem, it is not clear how cognitive science can build on these ideas, since we have reason to believe that the experience does not capture the appropriate mental structures on which one can build an explanatory theory. However, Peacocke’s work helps us to understand what might constitute the content of nonconceptual experience of space and what one would have to capture in order to properly account for it – in what Chomsky referred to as a descriptively adequate (as opposed to an explanatorily adequate) theory. I will not have more to say about this approach here, although the notion of the nonconceptual content of spatial experience will be relevant when I discuss mental imagery in the next lecture. In this lecture I will, instead, offer some comments on several approaches to understanding nonconceptual spatial representation which postulate some form of internalizing of spatial properties.

4. The genesis of our “Sense of Space”

4.1 Internalizing by incorporating visuomotor experience: Poincaré’s insights

There is another way to understand our concept of space that approaches being an internalizing view, but only to the extent that it emphasizes the link with the motor system. In what follows I will make much of this general idea and it will lead to the notion that rather than internalizing space, the converse actually holds. The mind actually externalizes space by projecting spatial concepts onto a nonconceptual motor- and proprioception-based representation. The basic idea for this direction comes from Henri Poincaré, whose views left a lasting impression on me even before I realized that they were philosophical. In a series of essays written almost a century ago, Poincaré analyzed the concept of space. In one of these essays he describes how a three-dimensional impression of space might arise in a sentient organism confronted with information in various forms and modalities and in many dimensions. A central idea in Poincaré’s account is that the organism has to be able to distinguish between experiences that correspond to changes in position and those that do not. According to Poincaré the key to being able to recognize the difference depends on being able to distinguish between changes brought about by our own actions and movements that were externally caused. Here Poincaré makes use of the notion of the reversibility of certain sensations – what he calls a “correction” – whereby we can undo an externally caused change by a complementary voluntary change that brings back the sensory state that existed prior to the externally caused change. Suppose, for example, that you are touching some object with the index finger of your right hand and experience the tactile sensation T while at the same time you sense the object visually with sensation V. If the object moves from external causes, you will perceive the visual sensation change from V to V’ and the tactile sensation will fade. 
But you may be able to bring back the tactile sensation T by an action, represented by a series of muscular sensations S, S’, S’’. Moreover, this same “renewal” of the tactile sensation can be accomplished equally by any of an equivalence class of sequences {S1, S2, S3, …}. What the members of this set have in common is that they can be described as “moving your finger from a common starting position to a common final position”. According to Poincaré what you, or your ancestors, have learned is that if you are touching an object and your visual sensation changes from V to V’, you can once again touch the object by carrying out a motor sequence in which the muscular sensations follow the sequence of one of the S’s in the equivalence class. Thus the basis for your knowledge of spatial locations is this skill of moving in such a way as to bring back a tactile or visual sensation. Poincaré used the notion of an equivalence class of sequences of sensations that move a finger from a particular initial position to a particular final position as a way of defining a common location across the several fingers and hands. The classes of movements define “spaces” and the spaces marked out by each finger are then merged by the recognition that when two fingers touch they define the notion of “same place” and so lead to the convergence of the initially distinct spaces. Poincaré then goes on to argue that the reason that our representation of space has 3 rather than 2 or 4 dimensions is tied up with the way that the equivalence classes are established, together with the boundary condition that we should not count as distinct two sequences of sensations that take us to the same final position (where the tactile sensation is renewed), nor should we count as equivalent two sequences of sensations that take us to different final positions (where the tactile sensation is not renewed). It is these boundary conditions that force the tri-dimensionality of space. 
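
The grouping of motor sequences into equivalence classes can be illustrated with a small Python sketch. It rests on a loud simplifying assumption that is not Poincaré’s: each muscular sequence is modeled as a list of integer displacement steps of a fingertip, and two sequences count as equivalent exactly when they end at the same position (that is, when both would renew the tactile sensation). The names `end_position` and `equivalence_classes` are hypothetical, introduced only for this illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Position = Tuple[int, int, int]
Step = Tuple[int, int, int]  # one displacement step (assumed encoding)


def end_position(start: Position, steps: Sequence[Step]) -> Position:
    """Return where the fingertip ends up after carrying out the steps."""
    x, y, z = start
    for dx, dy, dz in steps:
        x, y, z = x + dx, y + dy, z + dz
    return (x, y, z)


def equivalence_classes(
    start: Position, sequences: Sequence[Sequence[Step]]
) -> Dict[Position, List[List[Step]]]:
    """Group sequences by final position: same endpoint, same class."""
    classes: Dict[Position, List[List[Step]]] = defaultdict(list)
    for steps in sequences:
        classes[end_position(start, steps)].append(list(steps))
    return dict(classes)


# Three different routes to the same place, and one route that falls short.
routes = [
    [(1, 0, 0), (0, 1, 0)],
    [(0, 1, 0), (1, 0, 0)],
    [(1, 1, 0)],
    [(1, 0, 0)],
]
classes = equivalence_classes((0, 0, 0), routes)
for endpoint, members in classes.items():
    print(endpoint, len(members))
```

In this toy encoding the first three routes fall into one class and the fourth into another, which mirrors the two boundary conditions: sequences reaching the same final position are not distinguished, and sequences reaching different final positions are never merged.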
The reason I have belabored this point is that apart from providing an elegant account of the basis for the dimensionality of space, Poincaré’s analysis touches on several issues that will be relevant to our present discussion, not the least of which is his appeal to fingers! 1 The details of this analysis don’t carry much conviction these days, and indeed the reversibility of sensation condition was criticized by Jean Nicod, but the main ideas remain

sound. For example, the first point of contact between Poincaré’s analysis and the ones I will propose concerns the recognition that there are two distinct kinds of changes in sensory states: those that signal a difference in location and those that signal a difference in some sensory quality, say the sensation of a property like color or texture. Whether or not you like his way of making the distinction, in terms of the capacity to “correct” or revert to an earlier location-marking sensory state, the distinction does play an important role in recent discussions of sentience, and is especially central in the work of Austen Clark, though for different reasons. The second point of contact concerns the emphasis placed on sequences of muscular actions and sensations and on equivalence classes of such sequences. This is a remarkably modern idea, although it is not expressed in this way in current writings. What Poincaré’s analysis shares with contemporary analysis of what I will be calling the “sense of space” is the idea that the nonvisual apprehension of space may be a construction based on mechanisms that compute the equivalences among otherwise very different sequences of muscular actions (I don’t use the term sensation since this implies that these sequences are conscious). Computing the relations among representations of positions of limbs, sensors, and other movable parts of the body is arguably one of the most ubiquitous and best understood functions of the brain – functions carried out primarily in the posterior parietal cortex, but also in the superior colliculus, in the motor and premotor cortical areas and elsewhere. Computing one position representation given a different position representation is commonly referred to as a coordinate transformation (CT). 
One way to view CTs is as a function from the representation of the orientation of an articulated part of the body (e.g., the eye in its orbit) to the representation of that part (or a different part) in a different orientation or relative to a different frame of reference. It also applies to computing a representation of a location within the reference frame of one modality to a corresponding representation in the reference frame of another modality. The relevant representations of limbs in these cases are typically expressed within a framework that is local to the parts in question – such as the states of the muscles that control the movements, or the joint angles that characterize their relative positions, or to endpoint locations relative to the body. The relevant representations of sensory inputs may similarly be in proximal coordinates (e.g., positions on the retina or on the basilar membrane) or other local coordinates. The importance of these ideas in the present context relates directly to the theme of nonconceptual contact between mind and the world. In particular, since I have been arguing that this contact does not begin with the selection of spatiotemporal regions I need to say how places in space are represented – and indeed whether they are represented as such. What I will do in the last part of this lecture is to consider another approach to the question of what it means for the nonconceptual visual system to index or pick out a place or region in space. We have already discussed the problems with the traditional view that the first, nonconceptual (or sentient) contact with the world occurs through the detection of features-at-locations (the idea developed most forcefully by Austen Clark). What I want to do now is suggest another way in which the apparent function of spatial selection might be achieved without any actual selection of places specified in a unitary frame of reference (whether or not it is allocentric). 
4.2 Our sense of space may derive from our indexical anchoring of objects of thought to concurrently perceived objects in space

Earlier I described a theory of visual indexing (FINST Theory – see Pylyshyn, 2000, 2001) according to which we have a mechanism that is able to individuate and point to some 4 or 5 ‘objects’ in the world. In the initial formulation these objects were assumed to be clusters of visual features, but it has become clear that they must be things in the distal environment that we perceive and that we are able to track as individuals, independent of their properties or locations (as long as we have perceptual contact with them – and in fact, as we have discovered, for short times after they disappear behind an occluding surface). We have demonstrated the functioning of these indexes in a variety of experimental paradigms, from their use in selecting subsets of search items (and retaining them despite an intervening saccade) to their use in detecting simple patterns that require picking out and marking parts (as in the “visual routines” examples discussed by Ullman, 1984) and, perhaps most clearly, in multiple-object tracking (MOT) studies (these and other examples are discussed in Pylyshyn, 2003, chapter 5). In MOT, subjects are able to select 4 or 5 briefly cued simple objects among an equal number of identical objects and then to keep track of them as they move with random and interpenetrating paths for up to 10 seconds or more. We have shown that these objects can be tracked without encoding their properties (indeed, changes in their color or shape are not noticed) and even when they disappear for up to one second. We have also argued that, given the parameters of the experiment, tracking could not be done by encoding and updating objects’ locations as they move, leading us to conclude that tracking is a primitive operation of a mechanism in the early visual system (Pylyshyn, 2001) which, for historical reasons, we call the FINST indexing system.

Given such a mechanism, which allows stable access to a few individual things in the world and allows attention to be switched to them, there is a natural explanation for a number of phenomena that have led people to postulate an internal spatial medium or display. If we know (from instructions or from memory) what the relative locations are for a number of objects of thought, we can then link them to real perceived objects in space. Once we are in such a relation to actual objects in space, we are in a position to use our perceptual system to detect previously unnoticed patterns among these objects or to scan our attention (or our gaze) from one to another or to judge relative distances between pairs whose relative locations we had not encoded, and so on. Of course in the case of nonvisual modalities we must assume the equivalent object-based perception in, say, audition or some somatosensory sense.
Since these do not always provide continuous information from individual objects, we must also assume that the indexing system can select and index from short-term sensory memory as well as from the current stimulus array.

Here is an example of an empirical phenomenon that can be accounted for in this way. One experiment (which has been frequently cited and taken to provide a “window on the mind”, Denis & Kosslyn, 1999) is “mental scanning” – a phenomenon whereby the time to switch attention between imagined locations increases with the distance between the items as they are imagined. Although this has been interpreted as showing that such mental images “have” metrical distance properties, it can also be seen in terms of scanning attention in real space between indexed objects. For example, if the map on the right is imagined and a subject is asked to scan attention from one object (say the tree) to another object (say the tower), the time taken is proportional to the relative distance between the two imagined places. But now suppose a subject is able to use FINST indexes to attach a few of the places whose locations have been memorized (in some unspecified, but non-imaginal way) to objects in the room (as shown in Figure 2). Then he or she would be able to scan attention (or even direction of gaze) through physical space between the two relevant objects. In that case the equation time = distance/speed literally applies and the relevant time (give or take some factor to account for the different psychophysical properties that might come into play when scanning between mentally and visually selected objects) would be proportional to the actual distance in space.

Figure 2. Binding indexes to objects in an office scene to associate these objects with the imagined mental objects (labels)
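The arithmetic of the indexing account of mental scanning can be written out in a few lines. The sketch below is only an illustration: the object positions, the place names and the constant scanning speed are all hypothetical values chosen for the example, and the dictionary `index_bindings` merely stands in for whatever mechanism binds a memorized place to a perceived object.

```python
import math

# Hypothetical bindings: each imagined place is attached (indexed) to a
# real object in the room, whose position is given here in metres.
index_bindings = {
    "tree": (0.0, 0.0),
    "hut": (1.0, 0.0),
    "tower": (3.0, 4.0),
}

def scan_time(place_a, place_b, speed_m_per_s, bindings=index_bindings):
    """Time to move attention between two indexed objects in real space.

    Applies time = distance / speed to the physical distance between the
    objects to which the two imagined places are bound.
    """
    ax, ay = bindings[place_a]
    bx, by = bindings[place_b]
    distance = math.hypot(bx - ax, by - ay)
    return distance / speed_m_per_s

short_scan = scan_time("tree", "hut", speed_m_per_s=2.0)
long_scan = scan_time("tree", "tower", speed_m_per_s=2.0)
```

Nothing in the sketch stores distances between imagined places. The linear increase of time with distance falls out of the layout of the real objects, which is the point of the account.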

5. Do we pick out spatial locations in a unitary frame of reference?

There are reasons to resist the idea that we have a unitary representation of location-in-space. To begin, there are a very large number of distinct frames of reference. Some are required because of the way in which the sensory information comes in. The visual system, for example, receives information in an eye-centered frame of reference, but that information may be required for controlling a hand in some motor frame of reference. There cannot be a direct mapping between these two since the orientation of the eye in its socket, of the head and neck in a body-centered frame of reference, and of the arm, wrist and hand in a body-centered frame of reference are all relevant to the final motor movement. Even for so simple a task as pointing at what one sees, the question arises of how the relevant conversion is performed. One possibility might be that the position in each frame of reference is mapped onto a single frame of reference – say a body-centered or allocentric frame of reference. But an alternative is that the frames of reference are mapped across pairs of frames only for the objects that are relevant and only as the information is needed. For example, Henriques, Klier, Smith, Lowry, and Crawford (1998) studied an open-loop pointing task under conditions in which subjects either kept their eyes fixated or performed a saccade to a peripheral location. They found that pointing errors were highly correlated with ocular fixations and not with head orientation, leading them to propose “…a ‘conversion-on-demand’ model of visuomotor control in which multiple visual targets are stored and rotated (noncommutatively) within the oculocentric frame, whereas only selected targets are transformed further into head- or body-centric frames of motor execution.” There is a great deal of evidence that we have a large number of different (often incompatible) representations of spatial locations within different frames of reference (Colby & Goldberg, 1999).
We know that the gaze-centered frame of reference plays an important role in visual-motor coordination (Crawford, Medendorp, & Marotta, 2004; Henriques et al., 1998; Medendorp, Goltz, Villis, & Crawford, 2003; Snyder, Grieve, Brotchie, & Andersen, 1998) but even within this frame of reference the actual coordinates are modified extensively on-line by what are called gain fields (Salinas & Thier, 2000), which reflect head and body position, and even by not-yet-executed intentions to move the eyes (Andersen, 1995; Duhamel, Colby, & Goldberg, 1992). There is also the famous distinction between ventral and dorsal visual systems, illustrated most famously by patients such as DF, studied by Milner and Goodale (1995). These findings show that even within one modality different functions (motor control vs object recognition) may involve different frames of reference – with the ventral system using a relatively local frame of reference representing qualitative rather than metric spatial relations, as opposed to the dorsal system, which uses a body-centered frame of reference and represents relatively precise spatial magnitudes (see also Bridgeman, Lewis, Heit, & Nagle, 1979). The use of multiple frames of reference is also illustrated by cases of visual neglect – a deficit in attention due to damage in parietal cortex – in which patients fail to notice or respond to objects in half of their visual field. Even so clearly spatial a deficit reveals the many different frames of reference that may be involved. “Neglect occurs in all sensory modalities and can be expressed relative to any of several spatial reference frames, including retinocentric, body-centered, and environment-centered (Abernethy, 1991) [and] can be specific for stimuli presented at particular distances. 
Some patients tend to ignore stimuli presented near the body, in peripersonal space, while responding normally to distant stimuli, or vice versa… Distance-specific neglect may be apparent only when the subject must produce a motor response to the stimulus, and not when spatial perception alone is tested” (Colby & Goldberg, 1999, pp. 320-321).

Properties of many of these frames of reference have been investigated, often with surprising results. For example, there appear to be integrated visual-tactile representations in peripersonal space surrounding the hand and face. Visual stimuli presented near the body tend to be processed together with tactile stimuli so that when one modality shows deficits, such as extinction (a deficit in processing two stimuli presented together bilaterally when neither is impaired when presented individually), the other tends to show similar deficits. The visual deficits in these cases are in a frame of reference relative to a body part (e.g., the hand or the face). The visual experience of the region around a body part appears to be tied to the somatosensory experience of the body part itself, so that it moves with the body part, appears with “phantom limb” experiences of amputees, and has even been shown to be extended with tool use (Làdavas, 2002). There is other evidence of hand-based visual representations as well as gaze-centered and head-centered representations in normals. For example, pointing performance without vision is poorer when the starting position of the hand is not visible (Chieffi, Allport, & Woodin, 1999) and auditory localization is poorer without visual stimulation by a textured surface. Analysis of errors in pointing to previously felt locations suggests that the representation of the initial kinesthetically-sensed targets retains information about the previous movement (Baud-Bovy & Viviani, 1998) even though performance transfers readily to the other hand with only a slight systematic bias in the direction of the midsagittal plane (i.e. towards the original hand). The visual and motor frames of reference are very closely linked. For example, the accuracy of pointing to a seen object after the eyes are closed remains high, but the accuracy of pointing from a different imagined location is very poor unless the subject actually moves to the new location, even without vision during the move – a phenomenon well-known to animal psychologists (Gallistel, 1990) and shown for humans by Farrell and Thomson (1998). It seems that the many coordinate systems are automatically updated when we move.

6. The coordinate transformation function and as-needed translation

One of the main motivations for the assumption that there is a uniform frame of reference available to cognition is the fact that we easily go back and forth between perceptual modalities and, more importantly, between perception and motor action. Since perception begins with various peripheral frames of reference (e.g., vision starts with a retinal image) and motor control would seem to require a world-centered frame of reference, an obvious solution is to convert everything into a common allocentric reference frame. Such a unitary coordinate system could then serve as the lingua franca for representing locations accessible to all modalities and all local frames of reference. This also comports well with our experience of seeing things in a stable allocentric frame of reference.

But there are problems with this view. Motor control is not in an allocentric frame of reference. Commands must be issued in a variety of different frames, including joint-angle frames and limb-centric frames (e.g., there is evidence for coding in hand-centered frames of reference). There are also very many intermediate frames involved. For example, in vision there are not only two retinal frames but also a cyclopean frame and a 3D frame. Thus the problem of conversion is computationally extremely complex. There is also evidence that encoding in all these various frames of reference is not erased when conversion to motor control occurs. Many of these intermediate representations leave their mark on ultimate motor performance (Baud-Bovy & Viviani, 1998). An interesting case occurs when we reach for an object after a brief exposure. While we are able to do so even when eye movements occur between perceiving the object and reaching, neither the retinal location nor the motor representations are irrelevant to the outcome, as can be seen from errors in reaching with intervening eye movements. 
Analysis of motor control reveals that “motor error commands cannot be treated independently of their frame of origin or the frame of their destined motor command” (Crawford et al., 2004, p. 10). In that case the retinal location affects the direction of reach (see also Batista, Buneo, Snyder, & Andersen, 1999). Many other studies confirm the residual effect of the multiple reference frames involved in the entire process, thus suggesting that a single conversion to a global frame of reference, if it occurs at all, cannot be the whole story.

But there is an alternative account that does not require a single global frame of reference: the option that frames of reference are translated “on the go” as needed. The plurality of reference frames, the speed with which we generally have to coordinate across such frames of reference, and the relative impermanence of the transitional coordinate representations make this alternative plausible. In addition, the large number of coordinate transformations that have been identified through neurophysiological studies (many of which are described by Gallistel, 1999) all point to coordinate transformation as being a basic operation in the central nervous system.

Such transformations occur not only between modalities, but also between many distinct and constantly-changing forms of representation within a modality. Thus in moving your arm to grasp a perceived object you not only have to coordinate between visual location-information and proprioceptive location-information, but also between a representation in terms of joint angles and a representation in terms of body-centered spatial coordinates, and then between body-centered coordinates and allocentric coordinates. Since in reaching for something you generally move your eye, head and body (thus dynamically changing the body-centered coordinates), the coordination must occur rapidly on line. Although one might in principle convert each of these frames of reference to one (e.g. allocentric) frame of reference, the evidence appears to support pairwise coordinate transformations among closely connected frameworks (e.g. eye-centered and head-centered frames of reference to a body-centered frame of reference, or joint-angle frame of reference to a body-centered frame of reference). There is evidence that the plethora of frames of reference is tied together by a web of coordinate transformation operations. It makes sense to bypass the single allocentric framework (with its 6 degrees of freedom, including rotations) since the origin and coordinates of that framework are not given directly by the senses and there are growing reasons to think that we do not need it as an intermediary for coordinating spatial representations across modalities and between perception and action.2 But an additional factor argues against the view that retinal information is converted to a global reference frame. 
A great deal of evidence, both behavioral and neurophysiological, suggests that only a small portion of the information is converted and, moreover, that it is retained in modified eye-centered coordinates (Batista et al., 1999; Klier, Wang, & Crawford, 2001; Snyder, 2000) (and this may even be the case for auditory inputs, Stricanne, Andersen, & Mazzoni, 1996). The richness of the visual input and the complexity of the transformations that would be involved if the entire contents of each reference frame were converted suggest that this would not only be computationally intractable, but that it is unnecessary given the selective nature of the properties that go into the sensory-motor control process. This conclusion was also reached in a recent review by Colby and Goldberg (1999), who argued that attention also plays an important role in determining which objects are represented and converted. For this reason Henriques et al. (1998) proposed a “conversion-on-demand” paradigm in which only objects involved in a particular planned motor action are converted from retinal coordinates. The same may well be true of other pairs of representation frames, such as cyclopean, joint-angle, and so on (perhaps even including maps of “intentional” movement goals; Andersen & Buneo, 2002), where in each case only selected objects are converted so that no global allocentric coordinate system is constructed, or if one is constructed, it is highly partial and contains little more than references to a few objects as postulated by FINST theory. If this assumption is correct, then we do not need a unitary representation of space in order to deal with objects that are in fact located at a fixed location in allocentric space, so long as we can coordinate among the many other frames of reference from which we do receive direct information through the senses. 
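The conversion-on-demand idea can be illustrated with a toy data structure. This is a sketch under stated assumptions rather than an implementation of the model of Henriques et al.: it is two-dimensional (so rotations commute, unlike the three-dimensional case), it ignores offsets between frame origins, and the class `OnDemandStore`, its methods and the stored object names are all hypothetical.

```python
import math

def rotate(point, angle_rad):
    """Rotate a 2-D point counter-clockwise about the origin."""
    x, y = point
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (c * x - s * y, s * x + c * y)

class OnDemandStore:
    """Keeps every target in eye-centered coordinates; converts only on request."""

    def __init__(self):
        self.eye_centered = {}   # name -> (x, y) in the eye frame
        self.conversions = 0     # how many targets were ever converted

    def store(self, name, position_in_eye):
        self.eye_centered[name] = position_in_eye

    def eye_moved(self, delta_rad):
        # When the eye rotates, stored targets are remapped within the
        # eye frame (they counter-rotate); nothing is sent to other frames.
        self.eye_centered = {
            name: rotate(pos, -delta_rad)
            for name, pos in self.eye_centered.items()
        }

    def convert_for_action(self, name, gaze_rad):
        # Only the target selected for the planned action is transformed
        # into body-centered coordinates.
        self.conversions += 1
        return rotate(self.eye_centered[name], gaze_rad)

store = OnDemandStore()
store.store("cup", (1.0, 0.0))    # seen while gaze is aligned with the body
store.store("lamp", (0.0, 1.0))
store.store("door", (2.0, 2.0))

gaze = math.radians(30)
store.eye_moved(gaze)             # a saccade: all three are remapped in place
cup_in_body = store.convert_for_action("cup", gaze)
```

Three targets are stored and remapped, but only the one needed for the action is ever expressed in body-centered coordinates, and no table of allocentric positions is built at any point.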
Tolman’s “cognitive map” need be nothing more than a compact description of a set of skills that are exercised in the course of navigating through an environment. These skills are not only remarkable, but can be shown to rest on certain kinds of information and computations. For example, as argued in Gallistel (1990), in the case of animal navigation it may rest on a representation of direction and distance to visible landmarks along a traveled path, together with the ability to compute distances by “dead reckoning” while traversing the space – i.e., to determine the distance and direction of the animal’s current position from the starting point (a remarkable feat since it entails integrating the component of instantaneous velocity along the line from the start to the current location). These are all remarkable capacities (especially for the desert ant, which is extremely adept at it; Wehner & Menzel, 1990) and are in some ways equivalent to having an allocentric map. But they do not actually entail such a map because they do not entail that the allocentric coordinates of each landmark are available. If they were, the animal would be in a position to compute the distance and bearing of every pair of landmarks directly. As far as I know there is no evidence that it can do so.
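What dead reckoning computes can be shown with a short numerical sketch. It is one conventional way of expressing path integration, summing velocity components in Cartesian form over discrete time steps, and is not a claim about the format the animal uses. The function name `dead_reckon` and the step format are assumptions made for the example.

```python
import math

def dead_reckon(steps):
    """Integrate a path and return (distance, bearing) from the start.

    steps: iterable of (heading_rad, speed, duration) tuples, with heading
    measured counter-clockwise from the positive x axis.
    """
    x = y = 0.0
    for heading, speed, duration in steps:
        # Accumulate the displacement contributed by each leg of the path.
        x += speed * math.cos(heading) * duration
        y += speed * math.sin(heading) * duration
    return math.hypot(x, y), math.atan2(y, x)

# An outbound path: 3 units along the x axis, then 4 units along the y axis.
path = [(0.0, 1.0, 3.0), (math.pi / 2, 1.0, 4.0)]
distance, bearing = dead_reckon(path)

# The direction back to the start is the opposite of the outbound bearing.
home_bearing = math.atan2(-math.sin(bearing), -math.cos(bearing))
```

The only quantities carried forward are the running totals for the animal's own displacement. No coordinates are stored for any landmark, which is why this capacity, however impressive, does not by itself amount to possessing an allocentric map.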

7. What’s in a map?

What’s even more surprising, the “map” does not seem to contain any identifying information for the individual landmarks other than their location in relation to other landmarks visible from it: It’s as if you had a map with no labels (other than “you are here”?)! This latter fact means that certain systematic ambiguities necessarily come out of the representation. For example, if a particular goal landmark is located as in the figure below, the animal would show no preference between the actual location of the goal G and the mirror-image location G’, which bears the same local geometric relation to landmarks A, B, C, and D. In other words the animal is unable to take account of the intrinsic properties (including color, texture, size, shape, odor) of the landmarks (or even the paths AB, BD, DC, CA). These properties don’t seem to have been entered into the object files corresponding to the landmarks. The properties are, however, available for other purposes, such as choosing between two targets located side-by-side.

Figure 3. Mirror-image indeterminacy in the navigation module (Cheng, 1986; Gallistel, 1990)
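The ambiguity can be reproduced with a label-free geometric description. Since the figure is not reproduced here, the layout below is an assumption: the four landmarks are placed at the corners of a rectangle and G’ is taken to be the reflection of G through the centre of that rectangle. The function `geometric_signature` is a hypothetical stand-in for a representation that records only geometry and no landmark identities.

```python
import math

# Hypothetical layout: landmarks at the corners of a 4 x 2 rectangle.
landmarks = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]

def geometric_signature(point, landmarks):
    """Distances from a point to the landmarks, with identities discarded.

    Sorting removes any information about which landmark is which, as in
    a map with no labels.
    """
    return sorted(round(math.dist(point, lm), 9) for lm in landmarks)

def reflect_through_centre(point, landmarks):
    """Reflect a point through the centroid of the landmark array."""
    cx = sum(x for x, _ in landmarks) / len(landmarks)
    cy = sum(y for _, y in landmarks) / len(landmarks)
    return (2 * cx - point[0], 2 * cy - point[1])

goal = (0.5, 0.5)
goal_prime = reflect_through_centre(goal, landmarks)

same_geometry = (
    geometric_signature(goal, landmarks)
    == geometric_signature(goal_prime, landmarks)
)
```

The two locations are distinct yet have identical signatures, so a system that consults only this description has no basis for preferring one over the other. Adding any intrinsic property of a single landmark (a color, say) would break the tie, which is exactly the information that appears not to be used.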

What is needed is a careful analysis of the alternative forms of representation compatible with what is generally referred to as a “map.” In fact there is a general problem with representations of space: Many different forms of representation are equivalent with respect to their spatial information content, even though they make different assumptions about the cognitive architecture. Talk of “maps” typically assumes either a literal topographical display (as in the figure) or an analog form of it, in which all the metrical properties are encoded by metrical properties in a representation in such a way that whatever can be read off the literal map could (with the appropriate read operations) be read off the analog. This sort of proposal is not excluded a priori by any principle, but nobody has been able to propose an analog representation of spatial layout that itself does not use spatial magnitudes to represent spatial magnitudes.

What some people have proposed is that the engineer’s (or roboticist’s) approach to this problem may be the correct (or “psychologically real”) one. Scientists and engineers invariably represent spatial locations in terms of Cartesian coordinates and operate on them with vector operations (using matrix multiplication) – the way that all CAD systems do. What would it mean for a mind to use this form of representation? This question is a bit tricky since there is a temptation here to confuse levels. There are two distinct questions implied in the claim that people use coordinates and vector operations. One is the question of what information is being used and what functions are being computed. This is Marr’s notion of the “level of computation”, as opposed to the level of algorithm. The other is an algorithm and format question. It asks what form of representation is used and what algorithms operate over representations in that form. The first question – concerning what information the process is sensitive to – is the least contentious. 
It is an empirical question and the answer seems to be that, from insects on up, the process is sensitive to locations of the sort that are encoded when we use a Cartesian coordinate system and it computes transformations of the sort that are described by the vector manipulations. There is ample evidence for this assumption from the navigational capacities of animals and insects, which show an uncanny ability to do dead reckoning that demonstrably requires not only representing magnitudes, but also being able to do sophisticated numerical operations over them. Answering the second question, concerning the form of representation as opposed to informational content, is more difficult. We don’t want to answer it by saying that people use numeral representations and perform sequences of numeral transformations the way that computers do. The reason we don’t want to say that is that this entails a certain sort of architecture that we have good reason to doubt characterizes the human processing of magnitudes (except when it is engaged in doing longhand arithmetic in school!). Yet magnitudes are undoubtedly represented and processed, by means that neither cognitive science nor neuroscience has yet laid bare, except for certain special cases (the coordinate transformation operations).

So where does that leave us on the matter of spatial representation (or, as Hume calls such nonconceptual representations, spatial impressions)? I have said very little about the conscious experience of space – the phenomenology of seeing a spatial layout. I believe that a description such as Peacocke’s scenario content is probably correct – our experience of space is very fine grained, richly textured and wholly fills our conscious vista. But this is not true of our conceptualization of it. This, as many experiments have shown, is surprisingly sparse and partial. Studies have suggested that only a few things are retained between visual fixations or between glimpses. But our results seem to show that among the information that is retained is information that allows us to move attention back to salient objects in the scene that have not been fully encoded; that’s what FINST indexes allow one to do. This is implied by the results of Ballard, Hayhoe, Pook, and Rao (1997), Burkell and Pylyshyn (1997), and Henderson and Hollingworth (1999). 
What happens to the rest of the details in the scenario content, what function they serve, and what underlies and supports their presence (e.g., what form does their representation take and why does it fail to make a discernible mark on observed behavior) are questions that we can only speculate on at present. Some years ago Sperling (1960) suggested that the information implied by the experiential content was actually represented in a usable form, but that it faded very quickly (in a fraction of a second). Several investigators have suggested rapid fading or overwriting by new information as a possible answer to why most of the scenario content fails to have an enduring effect on information processing. Be that as it may, we are for the moment left with only a very small fraction of what we experience being available to cognition. On the other hand, we know that a great deal of what is not available to cognition is available to motor control. That’s a very important (and fairly recent) finding. But it still leaves us with the puzzle of why our conscious content plays so small a role in the mental life that falls under cognitive science. Both the question of why so much of what we are conscious of is not available and the question of why so much of what we are not conscious of is available are puzzles. They raise questions about what unconscious content is, why it appears under some circumstances and not others, and whether to treat it as part of one’s psychology or as mere epiphenomena. I confess it would be a pity if something as intimate as one’s phenomenal experience of the world turned out to be epiphenomenal! On the other hand, as I will show next time, the evidence that our phenomenal conscious experience not only fails to tell us what is going on but may in fact have led us astray on the heartland of cognition – in thought, problem-solving, and imagination – is also impressive. 
It just reinforces what I tell my undergraduate cognitive science students: that cognitive science is a delicate balance between the prosaic and the incredible, between what your grandmother knew and which is true and what the scientific community thinks it knows and which is false!

References

Abernethy, B. (1991). Visual search strategies and decision-making in sport. Special issue: Information processing and decision making in sport. International Journal of Sport Psychology, 22(3-4), 189-210.
Andersen, R. A. (1995). Encoding of intention and spatial location in the posterior parietal cortex. Cerebral Cortex, 5(5), 457-469.
Andersen, R. A., & Buneo, C. A. (2002). Intentional maps in posterior parietal cortex. Annual Review of Neuroscience, 25, 189-220.

14 5／3／2018 5／3／2018 Attneave, F., & Farrar, P. (1977). The visual world behind the head. American Journal of Psychology, 90(4), 549-563. Ballard, D. H., Hayhoe, M. M., Pook, P. K., & Rao, R. P. N. (1997). Deictic codes for the embodiment of cognition. Behavioral and Brain Sciences, 20(4), 723-767. Batista, A. P., Buneo, C. A., Snyder, L. H., & Andersen, R. A. (1999). Reach plans in eye- centered coordinates. Science, 285(5425), 257-260. Baud-Bovy, G., & Viviani, P. (1998). Pointing to kinesthetic targets in space. Journal of Neuroscience, 18(4), 1528-1545. Bridgeman, B., Lewis, S., Heit, G., & Nagle, M. (1979). Relation between cognitive and motor- oriented systems of visual position perception. Journal of Eperimental Pyschology: Human Perception and Performance, 5, 692-700. Burkell, J., & Pylyshyn, Z. W. (1997). Searching through subsets: A test of the visual indexing hypothesis. Spatial Vision, 11(2), 225-258. Cheng, K. (1986). A purely geometric module in the rat's spatial representation. Cognition, 23, 149-178. Chieffi, S., Allport, D. A., & Woodin, M. (1999). Hand-centred coding of target location in visuao-spatial working memory. Neuropsychologia, 37, 495-502. Colby, C. L., & Goldberg, M. E. (1999). Space and attention in parietal cortex. Annual Review of Neuroscience, 22, 319-349. Crawford, J. D., Medendorp, W. P., & Marotta, J. J. (2004). Spatial Transformations for Eye– Hand Coordination. J Neurophysiol, 92, 10-19. Denis, M., & Kosslyn, S. M. (1999). Scanning visual mental images: A window on the mind. Cahiers de Psychologie Cognitive / Current Psychology of Cognition, 18(4), 409-465. Dennett, D. C. (1991). Consciousness Explained. Boston: Little, Brown & Company. Duhamel, J.-R., Colby, C. L., & Goldberg, M. E. (1992). The updating of the representation of visual space in parietal cortex by intended eye movements. Science, 255(5040), 90-92. Farrell, M. J., & Thomson, J. A. (1998). Automatic spatial updating during locomotion without vision. 
Quarterly J Experimental psychology A, 51(3), 637-654. Fodor, J. A. (1975). The Language of Thought. New York: Crowell. Gallistel, C. R. (1990). The Organization of Learning. Cambridge, MA: MIT Press (A Bradford Book). Gallistel, C. R. (1999). Coordinate transformations in the genesis of directed action. In Bly, Benjamin Martin (Ed); Rumelhart, David E. (Ed). (1999). Cognitive science. Handbook of perception and cognition (2nd ed.). (pp. 1-42). xvii, 391pp. Henderson, J. M., & Hollingworth, A. (1999). The role of fixation position in detecting scene changes across saccades. Psychological Science, 10(5), 438-443. Henriques, D. Y., Klier, E. M., Smith, M. A., Lowry, D., & Crawford, J. D. (1998). Gaze- centered remapping of remembered visual space in an open-loop pointing task. Journal of Neuroscience, 18(4), 1583-1594. Klier, E. M., Wang, H., & Crawford, J. D. (2001). The superior colliculus encodes gaze commands in retinal coordinates. Nature Neuroscience, 4(6), 627-632. Làdavas, E. (2002). Functional and dynamic properties of visual peripersonal space. Trends in Cognitive Sciences, 6(1), 17-22. Lindberg, D. C. (1976). Theories of vision from al-Kindi to Kepler. Chicago: University of Chicago Press. Medendorp, W. P., Goltz, H. C., Villis, T., & Crawford, J. D. (2003). Gaze-Centered Updating of Visual Space in Human Parietal Cortex. The Journal of Neuroscience, 23(15), 6209- 6214.

Milner, A. D., & Goodale, M. A. (1995). The Visual Brain in Action. New York: Oxford University Press.
Nicod, J. (1970). Geometry and Induction. Berkeley: Univ. of California Press.
O'Keefe, J., & Nadel, L. (1978). The Hippocampus as a cognitive map. Oxford: Oxford University Press.
O'Regan, J. K. (1992). Solving the "real" mysteries of visual perception: The world as an outside memory. Canadian Journal of Psychology, 46, 461-488.
Peacocke, C. (1992). A Study of Concepts. Cambridge, MA: MIT Press/Bradford Books.
Poincaré, H. (1963/1913). Why space has three dimensions (J. W. Bolduc, Trans.). In Mathematics and Science: Last Essays (pp. 25-44). New York: Dover.
Pylyshyn, Z. W. (1973). What the Mind's Eye Tells the Mind's Brain: A Critique of Mental Imagery. Psychological Bulletin, 80, 1-24.
Pylyshyn, Z. W. (1984). Computation and cognition: Toward a foundation for cognitive science. Cambridge, MA: MIT Press.
Pylyshyn, Z. W. (2000). Situating vision in the world. Trends in Cognitive Sciences, 4(5), 197-207.
Pylyshyn, Z. W. (2001). Visual indexes, preconceptual objects, and situated vision. Cognition, 80(1/2), 127-158.
Pylyshyn, Z. W. (2002). Mental Imagery: In search of a theory. Behavioral and Brain Sciences, 25(2), 157-237.
Pylyshyn, Z. W. (2003). Seeing and visualizing: It's not what you think. Cambridge, MA: MIT Press/Bradford Books.
Salinas, E., & Thier, P. (2000). Gain modulation: A major computational principle of the central nervous system. Neuron, 27, 15-21.
Snyder, L. H. (2000). Coordinate transformations for eye and arm movements in the brain. Current Opinion in Neurobiology, 10(6), 747-754.
Snyder, L. H., Grieve, K. L., Brotchie, P., & Andersen, R. A. (1998). Separate body- and world-referenced representations of visual space in parietal cortex. Nature, 394(6696), 887-891.
Sperling, G. (1960). The information available in brief visual presentations. Psychological Monographs, 74(11, whole No. 498), 1-29.
Stricanne, B., Andersen, R. A., & Mazzoni, P. (1996). Eye-centered, head-centered, and intermediate coding of remembered sound locations in area LIP. Journal of Neurophysiology, 76(3), 2071-2076.
Ullman, S. (1984). Visual routines. Cognition, 18, 97-159.
Wehner, R., & Menzel, R. (1990). Do insects have cognitive maps? Annual Review of Neuroscience, 13, 403-414.
Wittreich, W. J. (1959). Visual perception and personality. Scientific American, 200 (April), 56-75.

1 Poincaré’s examples use fingers and the capacity to sense the locations of fingers. I can now confess that his essay was very much on my mind at the time I was formulating the FINST Index theory and is the reason for the appearance of “finger” in FINST.

2 Note that the impressive work by O'Keefe and Nadel (1978), which argues for an allocentric representation of space on the grounds that “place cells” in the rat respond selectively to unique places in a room, need not conflict with the present view. Place cells do not fire when the animal thinks about a certain place, but only when it gets there. Getting there may be a matter of coordinate transformations from the various sensed inputs and motor actions. It is not known whether the animal can consider or plan in terms of the relative direction of A and B in a room when it is situated at some place C different from A or B. What the work on the hippocampus has shown is the remarkable capacity of the navigation module to compute the equivalence of different movements in reaching a particular allocentric place.