Brain science

[Cover figure: a system-level diagram of the brain model, showing interconnected modules: MI (muscle), SI (tactile), SA1 (tactile images), PA1 (detailed actions for self), PA3 (specific joint plans), PA4 (overall plans), DV1 and DV2 (spatial plans and maps), VI (visual store), VV1 (object features and relations), PM1, PM2 and PM3 (person features, person actions and identities, and social dispositions and affiliations), G (goals), an event map (positions and movements), episodic memory and context, with motor output and tactile and visual input.]

A theory of brain structure and function

[Cover figure: a ladder of levels of description, each paired with the discipline that studies it: interacting selves, groups (social psychology); mental states, consciousness, the self (psychotherapy); cognitive mechanisms, motivation theories (cognitive psychology); system level brain models, neurocognitive models (neuropsychology); cortical layers, associative memory models (neural nets); single neuron models, synapses, transmission (single neurons); cell dynamics, synapse dynamics, genetic transcription (cell dynamics).]

by Alan H. Bond

Brain Science

A theory of brain structure and function based on neuroanatomy, psychology and computer science

Alan H. Bond¹

September 20, 2004

¹Alan H. Bond, Ph.D., California Institute of Technology, Computer Science Department, Mailstop 256-80, Pasadena, California 91125. Email: [email protected]

Disclaimer

I started putting this book together on January 1st, 2003, and now have a very rough draft. I will rewrite it during the following months, guided by the criticism of a literary agent and potential publishers.

This current version has a lot of mistakes and omissions, but it does outline what material will be in the book and in what order.

There are currently many figures copied by scanning from other sources, but many of these will be redrawn by me.

It is thus with due hesitancy that I am making it available so that it can help in explaining my research. Any feedback and criticism is welcome; please email me at [email protected] or phone me at 310-828-8719.

To avoid a negative reaction to missing and incorrect details, it might help to think of it as a rough-hewn sculpture rather than a detailed sculpture with errors. This is how I think of it. In the next months, I will be able to give it the necessary detail, precision, polish and completion.

Quotation

Prologue from the 1975 PhD thesis of Paul D. Scott, Sussex University, England, entitled “The cerebral neocortex as a programmable machine”

THE PARADOX

‘It is conventional for a dissertation such as this to begin with a review of the current state of knowledge relevant to its subject. Unfortunately Sussex University possesses a library containing three hundred thousand volumes of which the vast majority are devoted to the subject matter of this thesis, the human cerebral cortex. Strangely only a small proportion of the authors concerned were actually aware that this was what they were writing about. Certainly many of those whose works are kept in the psychology section may have been but move a few shelves and we come to the astronomy section. Here we find expounded how the cerebral cortex has transformed the eye from an organ for finding the nearest meal into an instrument capable of perceiving the composition of a galaxy many millions of light years distant. Passing on a few feet we come to applied sciences. Here we can read how the hand-eye coordination which got our ancestors to the next branch has developed to enable men to get to the moon. Surrounding volumes describe innumerable ways in which the cortex by controlling our sensory and motor systems has developed ways of controlling the world. Wherever we go in the library we are confronted with testaments of cortical activity - in politics and economics, in mathematics and philosophy or in Marxism and theology. Take away the books and we still have the building as a consequence of activity in (among many others) Sir Basil Spence’s neocortex. Take away the building and we still have the concept of a library - an institution dedicated to fostering the most striking of all human cortical activities, verbal communication.

No doubt many of our readers, whom we have so far ignored on our ‘gedanken’ tour of the library (for fear of disrupting their cortical activities), would be horrified at such extreme reductionism. Such a reaction would not be possible without a cortex. It is impossible to avoid a feeling of awe upon the realisation that the whole of human achievement, so extensively documented in the history section is due to the organised firing of neurones in the neocortex. Nor should we make the mistake of attributing only the spectacular achievements of the human race to this region of the brain. We glance at the clock as we leave the library. It is five past one. Just time to skim through a chapter of a book before lunch. The cortex has translated a simple visual pattern into the concept of time and used this information to plan our future activities.

Returning to our desks we open the book, a medical student’s introduction to the nervous system. What does it have to say about the neocortex? “There are essentially only two types of nerve cell in the cerebral cortex: a PYRAMID cell and a GRANULAR cell concerned with the output and input functions respectively. The pyramid cells have a long corticofugal axon and an apical dendrite which goes up to the most superficial layer of the cortex. The granule cells are characterised by a profuse dendritic tree while their axons are usually intracortical and so may go off in any direction. The neocortex is constructed in six layers. The outermost or MOLECULAR layer contains mostly the apical dendrites of the pyramidal cells together with a small number of internuncial neurones. The innermost or FUSIFORM layer contains internuncial and callosal neurons whose axons end in the most superficial layers of their own or the opposite side. The intermediate layers are made up as a four-decker sandwich of granular and pyramidal cells. Layers II and IV are the external and internal granular layers respectively while layers III and V are the external and internal pyramidal layers. The layers vary considerably in relative thickness in different areas of the cortex....” Bowsher (1970)

This is of course an oversimplification. Nevertheless it is true that almost everywhere cerebral neocortex consists of a six-layered structure composed of a few basic cell types with definite, if not yet defineable, restrictions on their connectivities. So if we are to try and relate structure and function in the cortex we are at once confronted with a striking paradox. The cortex exhibits a uniquely diverse range of behaviour and yet it has a relatively uniform structure. Any attempt to explain how the brain works must sooner or later face up to this paradox’

Reference: David Bowsher, “Introduction to the anatomy and physiology of the nervous system”, 2nd ed. Oxford: Blackwell Scientific Publications, 1970.

Preface

Style policy

I and we. I have used I to mean myself alone, and we in two different senses, namely, we meaning the author and the reader, and we meaning the brain science community, the scientific community or people in general. Using I and my often sounds rather egotistical, but this is not my intention. The alternatives seem to be either to use we and our throughout, or to use reported speech, mainly passives. I noticed that other scientific books also use the mixed convention that I have adopted.

The index. By inspecting preliminary forms of the index, I determined some categories which act as subheadings. Some of these are adjectives, e.g., auditory, or even hierarchical; I hope this makes sense.

People’s names include their first name, unless it has been too difficult to find, and I have not included middle initials, except to remove ambiguity. This is a little unusual in scientific writing, where often just the surname, perhaps with some initials, is used. It also becomes a problem when we get down to less important figures, where it is better to just use surnames alone. Also, for teams of people, one can give the leader’s name and say “coworkers” or “their research colleagues”, or one can use the lead author’s name with et al., which can be misleading if the lead author is not one of the main scientists involved.

I use lower case index terms unless they are proper names. I assumed that old world monkeys should be all lower case.

I have made hyphenation in the index consistent to avoid ambiguity, since in some contexts in the text we might need to hyphenate, but not in other contexts, while in the index there should be one corresponding form, e.g., context 1: problem-solving method; context 2: problem solving; index entry: problem solving.

There are a few ambiguous cases, for example joint rotation means a joint of one person rotating, whereas joint action means the action of two or more persons.

The term region refers only to the regions that I have defined as sets of areas. Other people’s regions I index as areas. Nuclei I also index as areas for now.

I seem to have converged on orbital frontal cortex, and not orbitofrontal cortex or orbital-frontal cortex. In any case, its abbreviation is OFC.

Glossary. Since most of the terms used are defined and explained in the body of the text, it seemed that an explicit glossary would be wasteful. Instead, for every term used in the text I have made two index entries: first, under the term itself, I have put a subentry for its definition; and second, under an explicit index entry glossary, I have put the term as a subentry. Thus, computation occurs in the index as computation, definition and as glossary, computation.

Hyphenation. My strategy is to insert a hyphen if it is needed to resolve ambiguity. One partial exception is object-file which I have kept hyphenated following Treisman and Kahneman’s use.

Technical terms across culture boundaries. I have used all of the terms language, language processing and natural language processing, since natural language processing is a computer science term used always to distinguish it from programming languages or formal languages, whereas for psychologists and neuroscientists, language always means natural language. Protocol means an experimental procedure to psychologists, though to them it can also mean a record of the subject’s verbalization, while in computer science networking it means a sequence of exchanged messages.

Acknowledgements. I’ve put some acknowledgements in the body of the text where there was a specific debt to a person. I think this thanks them better.

Historical figures. I’ve put in dates of birth and death, so we can better appreciate the historical time course.

Diacritical marks, accents. There are occasional discrepancies due to the inability of various software tools to handle them correctly. A name may occur in the text, in a cited reference, in a figure caption, cited in a figure caption, as a table entry, as an index term, or in a footnote.

Typesetting. I am basically using default LaTeX typesetting for a document of type report, which results in various infelicities; these will eventually be removed, first manually by me, and then by using a publisher’s LaTeX macros.

Brevity of style. I have tried to present my theory of the brain, and to explain the scientific reasons and the postulated ideas and mechanisms, with a reasonable degree of precision, rigor and completeness. I did not want to gloss over anything; I want to explain all of the supporting experimental data and theoretical concepts to the point where the reader can evaluate and use them for him- or herself. The magnitude of this undertaking has led me to a rather terse or elliptical style of writing. Ideas are explained only briefly, without much illustration, repetition, redundancy of expression, or ado. In this way a book of manageable size has been produced, and the reader is led at a good speed, some would say a fair clip, others breakneck speed, through the material, the arguments, the explanation of mechanisms, and the conclusions. I hope that the reader will find this style habitable, even comfortable. A more usually elaborated and embroidered form would have taken 2000 pages.

Contents

I Foundation

1 Introduction, motivation and summary
1.1 Introduction
1.1.1 The desire for a framework
1.2 A basis in three disciplines
1.3 Science
1.3.1 Science and neuroscience
1.3.2 The Bohr model of the atom
1.3.3 Approximation in science
1.4 Description by the brain and by brain scientists
1.4.1 Description and computation
1.4.2 Description by brain modules
1.4.3 The scientific description of the brain
1.4.4 Description of and by the brain
1.4.5 Representation of episodes, plans and goals
1.4.6 The evolution of representations
1.4.7 Approximation in modeling the brain
1.4.8 Computer science and description
1.5 History of the research on my model
1.6 Overview of this book and my model of the brain


2 Primate behavior
2.1 From nonhuman to human primates
2.2 Biology and development
2.3 The evolution of primates
2.3.1 Present day primates
2.4 Primate behaviors
2.5 Societal dynamics

3 The primate brain
3.1 The evolution of the primate brain
3.1.1 Invertebrates
3.1.2 Vertebrates
3.1.3 Mammals
3.1.4 Primates
3.1.5 Modern evolution theory
3.2 The structure of the cortex
3.3 The uniform process of the neocortex
3.3.1 There is a uniform process
3.3.2 Theories of the uniform process
3.4 Overall components and architecture

4 The historical development of system-level approaches to the brain
4.1 Introduction
4.2 Neurology, neuroanatomy and neurophysiology
4.3 Psychology
4.3.1 System models in psychology

5 The history of formal description
5.1 Introduction
5.1.1 Using natural language for scientific and mathematical description

5.1.2 Logic in natural language
5.2 Formal logic
5.3 Theoretical computer science
5.4 Artificial intelligence
5.5 The choice of programming language
5.6 The intellectual revolution of computer science

6 Logic programming
6.1 Introduction

7 Describing information processing in computer science
7.1 What is a computer science?
7.2 Concepts in computer science
7.2.1 Data and data structures
7.2.2 Program, control and process
7.2.3 Interfaces
7.2.4 Communication
7.3 Symbols in computer science
7.4 The computer science experience

8 Computer science description and the brain
8.1 The computer and the brain
8.2 Computer science assumptions from von Neumann machines
8.3 Life and computer systems
8.4 Assumptions underlying computer science descriptions
8.5 Computer science concepts for the brain
8.6 Summary

9 Levels of description in computer science
9.1 Describing computers

9.2 Levels of description for computer systems
9.3 Descriptions used at each level
9.4 Properties of descriptions
9.5 Design, constraints and optimization principles

10 Brain science
10.1 Describing the brain
10.1.1 Levels of description of the brain
10.1.2 Natural science and computer science
10.2 A hierarchy of scientific cultures
10.3 Scientific culture
10.4 Information is generated at each level
10.5 Formal and computational models at each level
10.6 Interactions between levels
10.7 A role for logic programming
10.8 Neuroscience
10.9 Concepts for describing information processing in the brain
10.9.1 Goals
10.9.2 Plans
10.9.3 Sequencing of action
10.9.4 Representations of events
10.9.5 Social interaction
10.9.6 Contexts

II The cortex

11 Information-processing analysis
11.1 Introduction
11.2 Hierarchies

11.2.1 The concept of hierarchy
11.2.2 The elements of hierarchies
11.2.3 Sensory and motor hierarchies
11.2.4 Possible bases for hierarchy in the neocortex
11.3 Anatomical regions and connections
11.3.1 Neural areas
11.3.2 Anatomical connections and their analysis
11.4 Sensing as the construction of descriptions

12 An information-processing analysis of the primate neocortex
12.1 Introduction
12.2 Analysis of neocortical perception hierarchies
12.2.1 Olfactory areas
12.2.2 Gustatory areas
12.2.3 Somatosensory areas
12.2.4 Auditory areas
12.2.5 Ventral visual areas
12.2.6 Dorsal visual areas
12.2.7 Polymodal STS areas
12.3 Summary and conclusion

13 Frontal areas and the perception-action hierarchy
13.1 The neocortical planning and action hierarchy
13.1.1 Planning and action areas
13.1.2 Human cognition
13.1.3 Frontal regions
13.1.4 Planning and action hierarchy
13.2 The perception and action hierarchies
13.2.1 Cortical regions and their hierarchies

13.2.2 Connectivity between perception and action hierarchies
13.2.3 Perception-action hierarchical architecture
13.3 Summary and conclusion

14 Describing information processing in the neocortex
14.1 Introduction
14.2 The biological basis of our computational approach
14.3 Representing data and processes using logic
14.4 Dynamics of the model

15 My implemented model of the primate neocortex
15.1 Our implemented brain model
15.2 Behaviors and results obtained
15.3 Discussion
15.4 Summary and conclusion

III Mental dynamics

16 Problem-solving behavior
16.1 Introduction
16.2 Problem solving and the choice of the Tower of Hanoi problem
16.3 Problem solving and the frontal lobes
16.4 The Tower of Hanoi problem
16.5 Strategies for solving Tower of Hanoi problems
16.6 Human performance in problem solving
16.6.1 Performance on Tower of Hanoi problems
16.6.2 Performance on Tower of London problems
16.7 Learning problem-solving strategies
16.8 Extending our model to allow solution of the Tower of Hanoi problem

16.8.1 Tower of Hanoi strategies
16.8.2 Selective search
16.8.3 Working goals
16.8.4 Perceptual tests and mental imagery
16.9 Episodic memory and its use in goal stacking
16.10 Falsifiable predictions of brain area activation
16.11 Tower of Hanoi in BAD
16.12 Appendix - Talairach coordinates

17 BAD - a Brain Architecture Description language
17.1 Introduction and motivation
17.2 Specifying modules and channels
17.2.1 Descriptions and description transformation rules
17.2.2 Basic rule form
17.2.3 Prolog
17.2.4 Weights
17.2.5 Computation
17.2.6 Rule execution
17.2.7 Competition
17.3 Data
17.3.1 Storage - descriptions
17.3.2 Uniqueness and integrity
17.3.3 Updating
17.3.4 Attenuation
17.3.5 Confirmation
17.4 Specifying brain architecture
17.5 The form of an external world specification
17.6 Sensors and effectors
17.7 Executing a complete brain model

17.8 Specifying a complete system and experiment
17.9 Specifying the initial state of the brain models
17.10 Loading the complete model world
17.11 Trial files
17.12 Appendix - BAD Syntax
17.12.1 Syntax of BAD models
17.12.2 Syntax of the BAD rule
17.12.3 Syntax of BAD modules
17.12.4 General specification - The gmod file

18 Logical systems
18.1 Using logic
18.2 Inference and models

19 Symbols
19.1 Approaches to symbols
19.2 Programming issues

20 Cortical motivation
20.1 Integrity, continuity and identity
20.2 Integrity mechanisms in my model

21 The layer, neural and cell levels of description
21.1 Introduction
21.2 The structure of a brain module
21.3 The flow of data
21.4 Representing a module as a set of interacting layers
21.5 Neural nets
21.6 Cell types
21.7 Synaptic plasticity

21.8 Genetic involvement in memory

IV Memory

22 Episodic memory
22.1 The cognitive psychology of memory
22.2 The hippocampus
22.3 Episodic memory
22.3.1 The definition of episodic memory
22.3.2 The event structure of episodic memory
22.3.3 The concept of event in philosophy and linguistics
22.3.4 Rhythms and clocks
22.4 My overall approach to memory
22.5 My approach to episodic memory
22.5.1 Main principles of my theory
22.5.2 Instantaneous events
22.6 My representation of events and episodes
22.6.1 My overall idea
22.6.2 Segmentation and chunking within each module
22.6.3 Events
22.6.4 Event descriptions
22.6.5 Uniqueness of reference to events
22.6.6 Episodes
22.7 Episode formation
22.7.1 Recording events
22.7.2 The segmentation of event sequences into episodes
22.8 Long term memory
22.9 Using event and episode information

22.9.1 Querying the hippocampal formation
22.9.2 The role of short term episodic memory in ongoing behavior
22.9.3 The role of long-term episodic memory in ongoing behavior
22.9.4 Possible functioning of the hippocampus
22.10 The problem of representation
22.11 Episodes in thinking
22.12 My own concept of episode in thinking
22.12.1 Types of mental action
22.12.2 Event types
22.12.3 The dynamics of episodes

23 Contexts
23.1 Introduction
23.2 Artificial intelligence models of memory
23.3 My dynamic model
23.4 The context module
23.5 The representation of contexts
23.6 Activation and execution of contexts
23.7 Example of a context
23.7.1 The form of the ss context
23.8 Generating and updating contexts
23.9 Relation of my representation to Schank’s
23.10 Contexts and memory
23.11 Executing a context
23.12 Contexts required for the Tower of Hanoi protocol
23.12.1 Restarting
23.12.2 Quasilinear searching
23.13 The hierarchy property
23.14 The learnability property

23.14.1 The problem
23.14.2 Events
23.15 The cognitive map
23.16 Logical representation of contexts and their execution
23.17 The code
23.17.1 Brief outline
23.17.2 Normal modules
23.17.3 The episodic module
23.17.4 The context module
23.17.5 The plan module

24 Learning by doing
24.1 Learning by doing
24.2 Tower of Hanoi learning
24.3 Research by others on modeling Tower of Hanoi learning
24.4 My analysis of the Anzai-Simon protocol
24.5 Learning Tower of Hanoi strategies
24.6 Contexts in the Tower of Hanoi example
24.7 Lessons learned from attempting to extend the model to do learning
24.8 Appendix - The Anzai and Simon protocol

25 Procedural memory and routinization
25.1 Introduction
25.2 The basal ganglia
25.3 Learning by the basal ganglia
25.4 The interleaving of routine and creative action

V Extensions

26 Vision
26.1 Introduction
26.2 The neuroanatomy of the visual system
26.3 The psychology of vision
26.3.1 Representations of the percept
26.3.2 The total percept and consciousness
26.3.3 Models of the visual system
26.4 Phenomena to be modeled
26.4.1 Representations
26.4.2 Timing phenomena
26.4.3 Object tokens
26.4.4 Attention and awareness
26.4.5 Object types
26.5 Eye movement control
26.5.1 Transsaccadic vision
26.5.2 Models of vision including eye movement
26.6 Problem solving, perceptual queries, and topdown attention
26.7 Vision and mental imagery
26.8 Vision and episodic memory
26.9 The interface between the visual system and the core brain model
26.10 My approach to the visual system
26.10.1 My proposed contribution
26.10.2 An example of visual system behavior
26.11 A proposed model of the visual system
26.11.1 Areas and mechanisms to be modeled
26.11.2 The mechanism of the model for the Treisman process

27 Natural language processing
27.1 Introduction
27.2 Kempen’s model of grammar
27.3 Vosse and Kempen’s results
27.4 My grammatical model for the brain
27.5 The interface between the natural language system and the core brain model
27.6 Using general brain model mechanisms for strengths
27.7 Agrammatism
27.8 Conclusion

28 Analysis of subcortical systems
28.1 Introduction
28.2 The psychology of subcortically motivated behaviors
28.2.1 Agonism
28.2.2 Attachment
28.2.3 Sex
28.3 The neuroanatomy of subcortically motivated behavior
28.3.1 The hypothalamus
28.3.2 The amygdala
28.4 The action of subcortical systems
28.4.1 Subcortical effects
28.4.2 System functions and connections
28.4.3 Brain mechanisms of agonism
28.4.4 Brain mechanisms for attachment behavior
28.4.5 Brain mechanisms of sexual behavior
28.4.6 A unified picture
28.5 Stress

29 Modeling interacting cortical and subcortical systems

29.1 Introduction
29.2 Ontogenetic development
29.3 The interface between the subcortical systems and the core brain model
29.4 Modeling agonistic motivation
29.5 Modeling attachment motivation
29.6 Modeling sexual motivation
29.7 Logical models of motivational systems
29.7.1 Formal representation
29.7.2 Representing the hypothalamus
29.7.3 Representing the amygdala
29.8 Towards a logical model for sexual behavior

VI Conclusions

30 Consciousness
30.1 Some remarks on consciousness
30.2 Lived experience and creative imagination

31 Towards a computer science of the brain
31.1 Introduction
31.2 Concepts
31.3 Design, constraints and optimization principles
31.4 Programming language
31.5 Research issues
31.6 Summary and conclusion

32 Toward Brain Science
32.1 Describing the brain

33 Summary of the model

33.1 Overall
33.2 Some basic tenets of the theory
33.3 The basic representation
33.4 The core dynamic model
33.5 Extensions to vision, language and subcortical systems
33.5.1 Vision
33.5.2 Natural language
33.5.3 Subcortical systems
33.6 My disagreements

34 In conclusion

List of Figures

1.1 Dividing the brain into four main parts

2.1 The similarity of the brains of primates, from [Brodmann, 1909] via [Bullock, 1977], Fig. 10.92, p. 487

3.1 The telencephalon, from [Bullock, 1977]
3.2 The reptile brain, from [Romer and Parsons, 1986]
3.3 The sequence of cortical evolution of the cortex, from [Romer and Parsons, 1986]
3.4 The circuit of three layer cortex, from [Shepherd, 1998]
3.5 The emergence of six layer cortex in reptiles, from [Dart, 1934]
3.6 Type of cells in six layer cortex, from [Douglas and Martin, 1998]
3.7 The evolution from three to six cortical layers, from [Reiner, 1993]
3.8 Stephan and Andy’s data on the relative growth of different parts of the brain, from [Eccles, 1989]
3.9 Todd Preuss’s findings of new neocortical areas by comparing a prosimian (a Galago bushbaby) and a simian (a Macaque monkey)
3.10 Brodmann’s areas of the human neocortex
3.11 Cell types and internal connectivity in the neocortex, from Shepherd and Koch in [Shepherd, 1998]
3.12 External connectivity of the different cortical layers
3.13 Canonical neocortical circuit, due to Douglas and Martin [Douglas and Martin, 1998]
3.14 Overview of brain components


4.1 Events in the history of neurology and neuropsychology
4.2 Wernicke’s system diagram
4.3 Extended system diagram
4.4 Events in the history of psychology
4.5 Freud’s diagram of his transcription model, from [Freud, 1900]

5.1 Events in the history of formal description
5.2 Frege’s concept script
5.3 Example of a Turing machine
5.4 The idea of a universal Turing machine
5.5 Tarski’s concept of an interpretation of a theory
5.6 Geometric knowledge in children’s drawings
5.7 Minsky’s concept of frame

9.1 Levels of description in computer science
9.2 Levels of descriptions and their interactions
9.3 The process of chip design, from [Edwards, 1992]
9.4 Levels on a laptop running my brain model
9.5 Specification documents for levels and interlevel interfaces
9.6 NAND gate defined by circuit, logic diagram, layout and logic formula representations
9.7 Definitions of NOT, NAND and NOR gates as truth tables
9.8 A logic circuit
9.9 Pipelining diagram defines serialized register transfer language in terms of hardware RTL

10.1 Levels of description in brain science
10.2 My concept of Brain Science

11.1 Neural areas and notation used
11.2 Lesioning an area affects a small number of other areas

11.3 Pattern of connectivity discovered by Jones and Powell
11.4 Summary diagram showing the three sequences reported by Jones and Powell
11.5 Summary of hierarchies reported by Pandya and coworkers
11.6 Intrinsic connectivity of frontal areas, from (Barbas and Pandya, 1989)
11.7 Sensory features - first part
11.8 Sensory features - second part

12.1 Table of all hierarchical regions
12.2 The olfactory hierarchy
12.3 The gustatory hierarchy
12.4 The somatosensory hierarchy
12.5 The auditory hierarchy
12.6 The ventral visual hierarchy
12.7 The dorsal visual hierarchy
12.8 The polymodal hierarchy of the superior temporal sulcus

13.1 The planning and action hierarchy
13.2 Characterization of planning and action hierarchy
13.3 Views of the cortex showing regions
13.4 Summary of experimental findings for hierarchy of data abstraction
13.5 Table of all extrinsic connections among neural areas, part 1, AS - arcuate sulcus
13.6 Table of all extrinsic connections among neural areas, part 2
13.7 Diagram of connections to frontal areas
13.8 Neocortical perception-action hierarchy

14.1 Summary of experimental findings for hierarchy of data abstraction
14.2 Computational hierarchical levels used in my model
14.3 (a) Lateral view of the cortex showing neural regions and functional involvements, (b) Connectivity of regions showing perception-action hierarchy

14.4 Modules from neural areas of the primate neocortex, and my initial system model
14.5 Functioning of interacting perception and action hierarchies in behavior
14.6 How the model works
14.7 Response to variation in environment

15.1 Description types in each module
15.2 Outline of description transformations in each module
15.3 Grooming sequence
15.4 Visualization of a typical instantaneous state of the model
15.5 Instantaneous behavioral states of two interacting primates
15.6 Social conflict sequence
15.7 Avoidance sequence
15.8 Displacement sequence
15.9 Displacement sequence, with persons visualized as humans
15.10 Predicted brain area activation for different kinds of processing

16.1 Initial and general positions for five disk Tower of Hanoi problem
16.2 Typical initial and final positions for Tower of London problems
16.3 Cards used for the Wisconsin Sort Test
16.4 Roland’s diagram summarizing his findings for thinking tasks
16.5 Construction of state spaces for Tower of Hanoi problems
16.6 Tower of Hanoi state space for problems with three disks
16.7 Tower of Hanoi state space for problems with five disks
16.8 Image of Tower of London performance
16.9 Table of brain area activations for Tower of London
16.10 Image of Tower of London performance for harder problem
16.11 Table of brain area activations for harder problem
16.12 Distributing a strategy over several modules
16.13 Representation of the selective search strategy on the brain model

16.14 Representation of the perceptual strategy on the brain model
16.15 Representation of the goal recursion strategy on the brain model
16.16 The corpus callosum shown in coronal section
16.17 The corpus callosum showing anterior and posterior commissures
16.18 The sagittal plane showing AC and PC, and the x and z axes, drawn in orange
16.19 Talairach coordinate system

18.1 The concept of logical module

19.1 Newell’s concept of symbol
19.2 Difference between Conventional AI Program and our Distributed System

21.1 Structure of a brain module
21.2 Module as interacting layers
21.3 Module mechanism as interacting layers
21.4 Cell types for layer IV cortical neurons. Firing patterns: RS regular spiking, BS burst spiking and FS fast spiking
21.5 Mechanisms and sites for synaptic plasticity, from [Malenka and Siegelbaum, 2001]
21.6 Molecular events within the neuron leading to short and long term memory, based on Figure 1 of [Alberini, 1999]

22.1 Separate learning modules
22.2 The rat brain, shown flattened, adapted from (Swanson, 1992)
22.3 Hippocampal evolution
22.4 Hippocampal neuroanatomy
22.5 The hippocampal formation, block diagram
22.6 Cortical connections to hippocampal complex, for rhesus monkey, from Kobayashi and Amaral
22.7 The episodic memory module
22.8 An event as a set of hippocampal inputs

22.9 An episode as a sequence of events
22.10 An episode as a sequence of episodes
22.11 Multiple contexts and episodes
22.12 The possible action of the hippocampal formation in memory
22.13 The context representation problem
22.14 A chess position used by DeGroot and by Newell and Simon
22.15 A chess search tree, taken from HPS Fig 12.3, p. 714
22.16 A chess problem behavior graph, first half, taken from HPS Fig 12.4, pp. 715-6
22.17 Chess episodes, taken from HPS Table 2.1, p. 723
22.18 Progressive deepening in chess proof, three phases, taken from Chapter 7, Figure 7, pp. 268-9
22.19 Chess proof, taken from De Groot Chapter 1, Figure 3, pp. 28-30

23.1 The basic action of the system
23.2 Episodic memory and context mechanism
23.3 A context as part of a hierarchy of contexts
23.4 An example of a context, for the selective search strategy
23.5 Executing the ss context, and forming an episode
23.6 The environment of context execution
23.7 Episode creation during problem solving
23.8 Context for the selective search strategy, showing messages
23.9 Contexts which send messages
23.10 Context for obstacle on source peg
23.11 Context for obstacle on target peg
23.12 Context for evaluation
23.13 Nesting of contexts
23.14 Learning and nesting property
23.15 Outline of code

23.16 Executing a context

24.1 Strategy learning sequence of Anzai and Simon

25.1 Subgoals in routine action
25.2 Subgoals in perception hierarchy
25.3 Basal ganglia
25.4 The geometric influence of the third ventricle
25.5 Loops involving the basal ganglia
25.6 Basal ganglia loop shown in coronal section
25.7 The association loop mapped onto the perception-action hierarchy
25.8 The sensory-motor loop mapped onto the perception-action hierarchy
25.9 The association loop mapped onto the perception-action hierarchy
25.10 Possible arrangement for monitoring of routine action by planning module
25.11 Driver and modulator connections of the thalamus
25.12 Composite diagram of cortex, thalamus and basal ganglia

26.1 Visual Brodmann areas for the monkey brain (left) and the human brain (right)
26.2 Retinotopic mapping of retina onto V1
26.3 Early vision modules and functioning, from Gallant and Van Essen 1995
26.4 The Treisman psychological model of early vision, from [Treisman, 1988]
26.5 Eye movement control in the brain
26.6 Our concept of hierarchical eye movement control in the brain
26.7 Organization of Tower of Hanoi strategy showing perceptual goals
26.8 Stephen Kosslyn’s model for mental imagery
26.9 The main brain areas of the visual system
26.10 An image and its different components during processing
26.11 The different modules and their data during perception

27.1 Summary of imaging data for natural language processing, from [Deacon, 1997]
27.2 A suggestion for the natural language processing system, from [Deacon, 1988]
27.3 Lexical frames
27.4 Sharing of features, from Gerard Kempen’s book [Kempen, 2000], Figure 2.2
27.5 Normal parameter values determined by Vosse and Kempen
27.6 Step during construction of structure description
27.7 Vosse and Kempen’s sentence recognition results
27.8 Processes organized as concurrent modules
27.9 Brain areas corresponding to concurrent modules
27.10 Variation of strengths in the V-K model of sentence recognition
27.11 Variation of strengths in our brain model for sentence recognition
27.12 Parameter values used
27.13 Caplan’s stimulus sentence types and comprehension scores

28.1 General approach with two levels of control
28.2 Time course of male-female relationship
28.3 The human hypothalamus, taken from Carpenter, 9th edition, Figure 17.1, page 707
28.4 Hormone outputs from the hypothalamus via the pituitary
28.5 Intrinsic connections diagram for the amygdala, from [Aggleton and Saunders, 2000]. Legend: AAA: anterior amygdala area; AB: accessory basal nucleus; CE: central nucleus; COa,p: cortical nucleus, anterior and posterior parts; Bmc,pc: basal nucleus, magnocellular and parvocellular parts; L: lateral nucleus; PAC: periamygdaloid cortex; PL: paralamellar part of basal nucleus
28.6 Extrinsic connections diagram for the amygdala, from [Aggleton and Saunders, 2000]
28.7 Summary connections diagram for the amygdala, illustrated by agonism
28.8 Levels of threat processing, from Graeff 1994
28.9 Table of pup action effects on the dam, from [Hofer, 1987]

28.10 Table of dam action effects on the pup, from [Hofer, 1987]
28.11 Reaction of the different control systems of the pup on separation from the dam, from [Hofer, 1987]
28.12 Subcortical systems involved in sexual behavior
28.13 Female and male hormonal circulation
28.14 Levels of interacting motivational control

29.1 Postulated brain mechanisms for agonistic behaviors
29.2 Postulated brain mechanisms for attachment behaviors
29.3 Postulated brain mechanisms for sexual behaviors

33.1 Modules from neural areas of the primate neocortex corresponding to my initial system model
33.2 Functioning of interacting perception and action hierarchies in behavior
33.3 Separate learning modules
33.4 Summary diagram of learning module and core model
33.5 The episodic memory system
33.6 The formation and use of contexts
33.7 The association loop of the basal ganglia and cortex
33.8 The different modules and their data during visual perception
33.9 Summary diagram of extension for vision
33.10 Brain areas corresponding to language processing
33.11 Summary diagram of extension for language processing
33.12 Levels of interacting motivational control
33.13 Summary diagram of extension for subcortical systems
33.14 Summary diagram of brain model showing some detail
33.15 Simplified summary diagram of brain model

Part I

Foundation

Chapter 1

Introduction, motivation and summary

Abstract: In this first chapter, I explain some motivations for wanting to develop a general scientific theory of the brain, the chief one being a desire for a framework within which to define psychological concepts in terms of the overall functioning of the brain.

I explain that this book is an attempt to define not only a theory of the brain but also a field, which I call Brain Science, that intimately combines elements of neuroscience, psychology and theoretical computer science. I discuss my notion of natural science, and of approximation. I make some remarks about the problem of scientific description of the brain, as well as about the description process carried out by the brain. I indicate the kinds of representation that I will postulate are used by the brain.

I give a brief history of my own career and the events leading to the development of this theory. I also briefly outline what my theory consists of, by giving an account of what will be in each chapter as we read through the book.


1.1 Introduction

In this book, I will explain an approach to developing a working model of the brain. This is a scientific model of the brain, in that it is described by a precise theory. It has a causal dynamics and describes the action of the brain at a system level, rather than a neural level. I indicate how a neural-level model could be obtained from this system-level model. My model is also realized as a computer program. This model can generate falsifiable predictions that can be compared with experimental data.

The model results from using abstract computer science concepts to describe the brain as a computer. Most of computer science concerns techniques for designing and implementing present-day computer systems, and uses the notion of address throughout, for describing control as well as data, which is not biologically plausible. However, there is a more abstract theoretical computer science, and I was able to use these more abstract ideas.

The resulting research unfortunately falls well outside of neuroscience, psychology and computer science, as these are narrowly defined in university departments. However, my one redeeming feature of obtuse tenacity enabled this research to progress and to bear fruit.

It was opined several decades ago that we already had enough data about the brain for a reasonable model to be found. However, the predominant opinion among most neuroscientists and psychologists is that a correct theory and model of the brain will not arrive for many decades, perhaps a century, and further that it will emerge from painstaking empirical research by neuroscientists. A psychologist friend of mine is fond of saying it will not occur in her lifetime. To her I say, put this book down immediately and try to enjoy what is left of your short life. To everybody else, read on!

1.1.1 The desire for a framework

The impulse that propelled me along this path was my need for an intellectual framework within which to define concepts in psychology. I needed an overall view of the total action of the mind, and this led inevitably to the need to understand the brain.

The kinds of problems I would like to illuminate include: (i) what is the basic regime of processing in the brain? A lot of present-day scientific thinking basically assumes a “straight through” processing of input data to produce an output action. (ii) What happens in perception, how do we construct our percept of the world? (iii) What is the role of perception in overall action? (iv) How can we think of motivation - what causes different mental activities to happen? (v) Can we think of emotion in a scientific manner? (vi) What is happening during different kinds of thinking? (vii) What happens in the brain during social interaction? (viii) How does a child learn and develop into an adult? (ix) How do different personalities form? (x) How should we think about consciousness?

Can we find a set of information-processing concepts that will allow us to describe information processing in the brain?

Artificial intelligence has terms such as goal, plan, semantic net, and so on. What are the corresponding concepts for the brain?

What is the correct set of concepts for psychological thinking? Current ideas include episodic memory, phonological buffer, chunk, motor program, etc.; how do we define these precisely?

Neurophysiological explanations are unsatisfactory. In a pathway explanation, the action of the system is assumed to be “straight through”: it does not iterate or execute actions conditionally, and feedback “upstream” is difficult to handle. How does data flow within the brain?

This book presents a well-defined framework within which to define psychological and neuroscience concepts and to give answers to many of these kinds of questions.

There are psychologists who have put forward computer-science type theories to describe their experimental results, although rarely realised as computer programs, for example:
1. Daniel Kahneman and Anne Treisman’s object-file theory of object perception [Kahneman and Treisman, 1984].
2. Stephen Kosslyn’s mental imagery model [Kosslyn, 1994].
3. Vickie Bruce’s theory of face perception [Bruce, 1988].
4. Tim Shallice’s theory of frontal decision making [Shallice, 1988].

System models. In my research, I will model the primate neocortex as a system. A system model treats an object of study as a set of interacting subsystems, each of which is easier to understand and to describe than the complete system. It results in explanations of objects as due to the action of each subsystem and the interactions among subsystems.
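To make this concrete, here is a minimal sketch in logic-programming style, in the spirit of the BAD language described in Chapter 17. The module names, channels and rules are hypothetical, invented for illustration only; they are not the modules of my actual model.

    % A system model: named subsystems, channels connecting them, and
    % each module's behavior given as a transformation of descriptions.
    module(perception).
    module(planning).
    module(motor).

    channel(perception, planning).   % percepts flow to the planner
    channel(planning, motor).        % plans flow to the motor module

    % transform(Module, InputDescription, OutputDescription)
    transform(perception, sees(Object), percept(Object)).
    transform(planning, percept(food), goal(approach(food))).
    transform(motor, goal(approach(X)), action(walk_toward(X))).

An explanation of the whole system's behavior then factors into the behavior of each transform/3 relation together with the channel topology, which is exactly the decomposition a system model provides.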

The use of a systems level of thinking in neuroscience, where more than one neural area is conceived as working together, has a venerable history going back to Wernicke’s books [Wernicke, 1874][Wernicke, 1894]. There is, for neuroscientists, a natural systems-level of explanation of experimental data, see for example [Gazzaniga, 1989]. Current imaging evidence is showing coactivation of distributed areas in many tasks. From this type of evidence, McIntosh et al. [McIntosh et al., 1994] have developed influence graphs, which give a measure of dynamic influence among neural areas. Mesulam [Mesulam, 1990] has developed ideas of a distributed system mediating attention. Kosslyn [Kosslyn, 1994] has developed a system model of visual imagery and perception. Goldman-Rakic [Goldman-Rakic, 1988] has investigated distributed systems for working memory, which involve areas in the frontal lobes and in the parietal and temporal lobes. Modular explanations of language processing have progressed [Geschwind, 1965] [Deacon, 1989] and now have support from imaging experiments. Petersen et al. [Petersen et al., 1988], for example, have produced a modular description of language processing. There are modular explanations of the aphasias, and of the dyslexias [Karmiloff-Smith, 1992].

Boxology and control processes. In cognitive psychology, the “boxology” diagrams for short term memory have always been unsatisfactory to me for two main reasons:
1. They are not part of an overall complete system which controls behavior, or even of an overall memory system.
2. Control issues are completely finessed, as “control processes”, a term introduced by Atkinson and Shiffrin [Atkinson and Shiffrin, 1968] and which has never been defined to a workable degree of precision.

The interface. The subject of my research in many ways sits at an interface between biology and computer science. From a computer science perspective, I seek to describe computers based on the brain, which might have some of the legendary properties of the brain: its parallelism, intelligence, flexibility and resilience. From a biological perspective, I seek to bring computer-science concepts and methods to bear on the problem of the scientific understanding of the brain, which may provide a well-defined theory and a precise, tractable description of the brain.

Natural science and causal models. Experimental results demonstrate the involvement of some parts of the brain in some given behavior, but they do not provide a causal functioning model of the brain actually operating to produce the behavior. By causal I mean that the model has a dynamics of changing in time from one state to another, each next state being determined from its current state. Very little experimental information is available on how different parts of the brain work together, what information flows, what is computed, or how the activities of different parts are coordinated. Indeed, most of natural science to date has been concerned with matter and energy and transformations among their various forms. What we will need for the brain is a natural science of information processing.
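As a toy illustration of causality in this sense (again a hypothetical sketch, not my model of the brain), a discrete dynamics in which each next state is determined from the current state can be written down directly:

    % next_state(Current, Next): the causal law of a toy system.
    next_state(state(hungry, sees(food)), state(approaching, sees(food))).
    next_state(state(approaching, sees(food)), state(eating, sees(food))).
    next_state(state(eating, sees(food)), state(sated, sees(nothing))).

    % run(State, N, Trace): unfold N steps of the dynamics into a trace.
    run(S, 0, [S]).
    run(S, N, [S|Rest]) :-
        N > 0,
        next_state(S, S1),
        N1 is N - 1,
        run(S1, N1, Rest).

Querying run(state(hungry, sees(food)), 3, Trace) yields the whole trajectory; it is this step-by-step determination of state that makes a model's predictions falsifiable.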

Computer science. In order to produce a working causal model of information processing in the primate neocortex, I will turn to a computer science analysis, where I will draw on knowledge of information-processing systems from several different specializations within the field of computer science - parallel architectures, distributed systems, formal description languages, and artificial intelligence planning.

Neuroscientific information-processing requirements can be used to constrain the design of a scientific model. We know the brain does indeed work in a coordinated manner to produce behavior, we know it is stable under a number of disturbances, we think it is organized for real-time responsiveness, we believe it is distributed, and we have some idea of timings of components.

Computer science brings to my research notions of processing architecture, of data and process representation, and of control [Siewiorek et al., 1982]. Further, computer science brings techniques for describing, specifying and implementing models using programming languages. Description languages have been developed for the high-level description of complete computer systems [Davis, 1993] [Calvez, 1993]. In my case, computer science will be of particular value for understanding how control in the brain could be organized, and for understanding how to create a model by specifying and implementing one using logic programming.

1.2 A basis in three disciplines

I will propose a multidisciplinary approach to the study of the brain. The arguments in this book are based on three core disciplines, namely, neuroscience, psychology, and computer science: (i) Neurology, neuroanatomy and neurophysiology deal directly with the biology of the brain and with neurons, (ii) Psychology, which includes experimental psychology, psychiatry and psychoanalysis, deals with human behavior and its pathology, without neural explanations, and (iii) Computer science and logic, which deal with formal mechanisms and theories of information processing. In addition, however, there are other relevant areas which we will involve as needed: (iv) Linguistics, the study of language in its own terms, (v) Primatology, the study of behavior and its underlying mechanisms in (mainly nonhuman) primates, and (vi) Sociology, the study of groups, populations and the interactions of individuals.

In order to understand the brain, we need to have a good knowledge of all three core disciplines. It is one of my aims that this book can be read by a specialist in any one of these disciplines with the result that they are able to understand the contribution of the other two.

It is also my hope that this book will help define a new subject which we can call Brain Science, which intimately combines Neuroscience, Psychology and Computer Science into a unified field of study and research. It will do this by giving an intellectual framework incorporating all three component disciplines, and by suggesting a hierarchy of description levels that allows different types of knowledge, experiment and theory to be developed at each level and allows different levels to relate to each other. It will also show how one might select, from each discipline, material and concepts so that one does not need to attempt the impossible task of becoming an expert in all three subjects. I also suggest one possible brain model and one possible precise language in which to represent brain models.

I have tried to avoid unnecessary technicality without sacrificing an adult-strength treatment. In order to do this I have necessarily simplified my treatment. One can verify and correct my treatment by consulting more specialized material to which I will provide references. I also assume that the reader may skip the more demanding sections without causing a catastrophic break in their understanding.

1.3 Science

1.3.1 Science and neuroscience

As every student knows, the word “science” comes from the Latin word “scientia” meaning knowledge, from scientem, scire, to know. Now, although in common parlance one can be said to know facts, such as a phone number, it is clear that the intention of science is to know concepts and principles. I emphasize this point since most of present-day neuroscience and a lot of present-day psychology consist of discovering phenomena and describing them, without attempting to discover underlying concepts or scientific theories. When neuroscientists and psychologists go to conferences they discuss their discovery of their latest phenomena, and issues concerning how to find yet more new phenomena.

When I was doing theoretical physics, theoreticians did nothing but theory, no experiments; there were experimentalists who mainly did experiments and had only a general idea of current theoretical ideas; and then there were people in the middle whom we called phenomenologists. These were the people who were aware of all the experimental data and could summarize it into tables and trends, and might develop ad hoc curves that would fit the data. It was this processed data that was used by theoreticians to guide their search for a principled theory.

In the current state of neuroscience, there are only experimentalists, and a few of these may make some theoretical conjectures. It is impossible to publish a paper in a neuroscience journal unless it contains some new experimental data, and many journals will not allow any theoretical discussion. Thus there is no theoretical effort; it is apparently viewed as foolhardy, since, after all, the historical road of neuroscience is strewn with the carcasses of dead theories.

Some quotations from our forebears: Herbert Spencer, 1820-1903, “Science is organized knowledge” [Spencer, 1861]. Thomas H. Huxley, 1825-1895, “Science is nothing but trained and organized common sense”. Lord Rayleigh, 1842-1919, “Examples ... which might be multiplied ad libitum, show how difficult it often is for an experimenter to interpret his results without the aid of mathematics.” [Bell, 1937].

1.3.2 The Bohr model of the atom

A key image that I use is that of the Bohr atom. This was the first ever model of the atom. Before the Bohr model appeared in 1913, the atom was known to exist, but its structure was unknown. The best thing available was the Rutherford model, which was that the atom consisted of a nucleus with electrons orbiting around it; however, Rutherford could not understand why the electrons would not simply radiate electromagnetic energy and collapse to the nucleus, since this is what the standard theory of electromagnetism required. Niels Bohr then suggested his model of the atom, in which the energy states of electrons were quantized, so they could only occupy certain orbits, which could not collapse. Energy could only be emitted when an electron moved from one allowed orbit to another, so the spectrum of an atom consisted of discrete frequencies, and light was emitted as discrete quanta. So Bohr took the difficult idea of quanta, which had been proposed by Planck and Einstein, and applied it to another difficult problem, namely how the planetary model could work. His model was quite simple, with tractable mathematics, so that the average scientist could understand it, and it explained various properties that atoms were known to have, notably their different energy states and the particular wavelengths of light that each type of atom could emit. There were specific known data to explain, namely the observed spectral lines, whose frequencies occurred in series given by known formulae due to Balmer and others. The Pauli exclusion principle came later, and the more sophisticated mathematical theory of Schrödinger didn't appear until 1926. At the time, Bohr's ideas were not immediately accepted, probably because his advance made several large steps at once. Indeed, one of his main detractors was Schrödinger, who thought the idea of discrete states was patently ridiculous.
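For concreteness, the quantitative core of Bohr's model can be stated in two standard textbook formulas (general physics background, not material specific to this book). The energy of the nth electron orbit in hydrogen is quantized,

    E_n = -\frac{13.6\ \mathrm{eV}}{n^2}, \qquad n = 1, 2, 3, \ldots

and a photon emitted in a transition from orbit n_2 down to orbit n_1 has a wavelength given by the Rydberg formula,

    \frac{1}{\lambda} = R_H \left( \frac{1}{n_1^2} - \frac{1}{n_2^2} \right), \qquad R_H \approx 1.097 \times 10^7\ \mathrm{m}^{-1},

of which the Balmer series is the case n_1 = 2. These two lines reproduce the observed spectral series that the model was built to explain.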

It seems to me that there is an analogous situation with the brain. We have a lot of data, and various ideas such as areas, pathways, working memory, etc. However, how the brain works is a mystery to neuroscience. My model shows how the brain works, and makes its functioning understandable.

1.3.3 Approximation in science

Atomic physics also gives us an approach to approximation. The Schrödinger equation describing atomic structure is accepted as the correct scientific theory. There are some finer details of spectral wavelengths that require a more complex theory that takes relativity into account, and also the structure of the nucleus. Nevertheless, the vast majority of observed experimental data about atoms and molecules are well described by this equation. However, the solution of the equation for anything more complex than the simplest atoms of hydrogen and helium is not possible as a closed mathematical expression. Solutions are therefore found by approximation methods, and the accuracy can be improved by doing more computation. Collections of atoms in solid-state form can also be solved approximately, but only when further assumptions are made, derived from experimental data, such as the crystalline or other structure of the solid.

Thus a precise correct theory may not have practical application to complex systems, even though it is accepted as correct. The exact theory gives us a way of thinking and formulating, from which practical results are obtained by approximations of various types.

Neuroscientists and psychologists are currently necessarily involved in difficult experimental research. In this type of culture, every detail and nuance must be examined and included, since this is where new phenomena are first discovered. By contrast, in order to develop a theory, a theoretician has to use his or her judgement to simplify and approximate. This involves leaving out some phenomena and ignoring some problems.

1.4 Description by the brain and by brain scientists

1.4.1 Description and computation

I will argue that the main work of the brain is to describe its environment and itself. It does this repeatedly and continuously, so it continuously redescribes itself.

I will in subsequent chapters explain what I mean by a description and by the process of describing.

I will view computation as a process of description. From a newly given description, a computation will generate a more developed and complete description, and one better fitting and describing the external environment and itself.

There will be a limit to this process. At a given time and with a given starting description, computation will terminate with a more complete description that is the most complete without making any further assumptions. This is the most general and most complete description possible given the starting description.

When new data or new knowledge is added, removed, or changed, the process of redescription can continue and again reach a new most general complete description.
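To illustrate what reaching "the most complete description possible given the starting description" could mean computationally, here is a minimal sketch in the style of a forward-chaining rule system run to a fixed point. The rule format, function names and example facts are my own illustrative choices, not the formal language defined later in this book.

    def redescribe(description, rules):
        """Apply rules repeatedly until no rule adds anything new: the
        result is the most complete description derivable from the
        starting description without making further assumptions."""
        description = set(description)
        changed = True
        while changed:
            changed = False
            for rule in rules:
                for new_fact in rule(description):
                    if new_fact not in description:
                        description.add(new_fact)
                        changed = True
        return description

    # Two illustrative rules over facts represented as tuples.
    def rule_seen_is_present(facts):
        return [("present", x) for (p, x) in facts if p == "sees"]

    def rule_present_is_reachable(facts):
        return [("reachable", x) for (p, x) in facts if p == "present"]

    start = {("sees", "apple")}
    print(redescribe(start, [rule_seen_is_present, rule_present_is_reachable]))
    # -> {('sees', 'apple'), ('present', 'apple'), ('reachable', 'apple')}

Adding or removing facts in the starting set and rerunning redescribe corresponds to the continuation of redescription described above.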

The processing in the brain is the action of each brain module, each of which simply and repeatedly computes new data. In addition, brain modules coordinate by passing data and by acknowledging data as useful.

1.4.2 Description by brain modules

The brain consists of a set of interconnected modules, each of which is continuously redescribing itself, its inputs and its stored memory. My notion of description is more general than usual in that it includes all the different kinds of module that the brain has, each with its own special kinds of data and representation. Any given module thus maintains descriptions in its own terms. A module receives descriptions from other modules and combines these and any stored descriptions to produce new descriptions. Thus it bases its work on that of other modules and on the times at which their descriptions occur. Thus, the brain describes and continuously redescribes, action, time sequences, plans, goals and intentions. External action occurs by the brain sending a stream of descriptions of actions to its effectors, such as muscles and glands, which act upon the world. (For our purposes, the world is everything external to the nervous system and thus includes the subject's body.)
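A minimal sketch of this module architecture, assuming a message-passing reading of it; the class name, method names and division of labor here are hypothetical, chosen only to make the prose concrete:

    class Module:
        """A brain module: maintains descriptions in its own terms and
        redescribes by combining received descriptions with stored ones."""
        def __init__(self, name, combine):
            self.name = name
            self.combine = combine   # module-specific redescription function
            self.store = set()       # stored descriptions (short-term memory)
            self.inbox = []          # descriptions received from other modules
            self.targets = []        # connected downstream modules

        def receive(self, description):
            self.inbox.append(description)

        def step(self):
            """Combine inputs and store into new descriptions, and pass
            anything new onward. Returns True if new data was produced."""
            new = self.combine(self.inbox, self.store)
            self.inbox = []
            fresh = new - self.store
            self.store |= fresh
            for d in fresh:
                for target in self.targets:
                    target.receive(d)
            return bool(fresh)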

1.4.3 The scientific description of the brain

My model achieves clarity and precision by well-known methods, such as defining technical terms as precisely as possible, and making definitions and expressions very simple so that there is little chance of ambiguity.

The brain model works by using formal descriptions, in a formal language which I will define, and it has a precise process which continually constructs new descriptions, a process I am calling redescription.

The architecture of the brain model, within which this redescription process occurs, is defined informally here, but it is nonetheless precise. We will give more detail later, and ultimately any other scientist can examine and run for him- or herself the computer program which gives an exact definition and realization of the model.

1.4.4 Description of and by the brain

There are two different levels or roles for description. One is that, in order to describe the brain, we as scientists are going to need to describe data, communication, processes, planning, memory, and so on, so we will need a rich language in which to describe the brain. Such languages for describing complex information-processing systems have only recently been developed as part of computer science.

My argument is that the brain is a complex information-processing system, and therefore, in order to describe it scientifically, we will need a description language which can describe data structure, processes and their control relations, and abstraction.

The second role of description is that the brain itself is engaged in a descriptive process. It continuously describes its experience. Hence our description of the brain should be rich enough that it can describe the brain's descriptive processing. The scientific language that we need as scientists will therefore probably have to be richer than the descriptive techniques used by the brain.

Each module computes data of certain data types characteristic of the module, using data of other types it has received from other modules. Some modules represent perceptual information such as visual, auditory and somatosensory images, while others represent plans, frames, events and episodes.

Thus the brain’s functioning is to represent its environment, its actions and its own mental states. The language used by the brain scientist needs to be rich enough to describe this representational activity of the brain, as well as communication among modules, parallel processing among modules, and other computational properties of the brain. This language will allow the description of many different brains, including many that do not or cannot exist in nature for various reasons. Then additional statements stated in the language will describe one particular kind of brain such as the human brain. Of course, there will also be the ability to specify many different variants of the human brain corresponding to the individually different brains that different people have.

1.4.5 Representation of episodes, plans and goals

By abstraction from observed episodes, the brain develops stored plans for future use. These may involve sequencing of action steps, conditional actions and observation.

Some modules may generate representations whose effect when communicated to other modules is to evoke plans which change the state so that the original representations are removed. These representations will thus have a role as goals in the system.
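One minimal way to render this goal mechanism in code (all names hypothetical): a representation behaves as a goal when the plan it evokes transforms the state so that the representation disappears.

    def run_goals(state, plans):
        """state: a set of representations; plans: a map from a goal
        representation to a state-transforming plan function."""
        for goal in [g for g in state if g in plans]:
            state = plans[goal](state)   # the evoked plan changes the state
        return state

    # Illustrative example: a 'hungry' representation evokes an eating plan
    # whose effect removes 'hungry' from the state.
    plans = {"hungry": lambda s: (s - {"hungry"}) | {"fed"}}
    print(run_goals({"hungry", "at-tree"}, plans))   # {'fed', 'at-tree'}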

1.4.6 The evolution of representations

The different data types are innately endowed, although subject to some ontogenetic changes. That is, the basic modules, interconnections and the data types for each module, have resulted from evolution and are encoded into DNA which guides the construction of the brain.

We can think of the brain as developing representations of various kinds and using these to construct stored memories and to develop plans and actions. It is quite possible that the different representations used in the brain evolved at different times, and that only the most recently evolved ones, underlying language, are full-fledged symbolic descriptions in the sense of allowing wide use and applicability to many different situations and types of experience.

The brain’s ability for scientific description. The most advanced modules will have representations that allow natural-language semantics and mathematics to be represented. With appropriate education, the ability to use scientific language and to describe different possible brains could be achieved. Thus at the very top level of representation the brain becomes truly self-describing, in the sense of scientific description.

1.4.7 Approximation in modeling the brain

We also will need to be clear about how we approximate the brain. That is, we will develop brain models which will not perform as well or as completely as the brain, but which nevertheless are models which accurately capture the scientific principles and the mechanisms used by the brain. This is very much a matter of scientific judgment, since we do not want to simply leave out phenomena that we cannot handle, by saying that they are beyond our approximation.

One example is visual perception. This is highly developed in humans and performs at a spectacular level. Should we require that our model for visual perception also operate at a similar level? If not, what kinds of performance will we accept as establishing that the model is a good scientific model?

Another example is natural-language processing: the recognition, understanding and generation of sentences. Again, the subtlety of human performance is legendary; however, a more limited model which exhibits some of the key natural-language processing phenomena will be satisfactory. Of course, our model must not exhibit any unnatural effects. More generally, the judgement of psycholinguists should be used in evaluating whether the model captures the important properties of human behavior, and whether its conceptualizations give insights into how the brain processes language.

1.4.8 Computer science and description

Computer science is a very broad and variegated area. It includes people interested in engineering good computers, people interested in producing good and useful software, people interested in applications such as graphics, and people interested in underlying technologies such as VLSI.

I will be mainly concerned with theoretical computer science and artificial intelligence. This is the core abstract theory of computation and of computer science.

Computer science can be defined not as the study of computers but as the study of the description of computers. The basic activity is to develop clear descriptions and specifications of information-processing systems.

I will take a description to be (1) syntax, an expression in a precise language, together with (2) semantics, a precise process for interpreting and finding the meaning of the expression.
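A toy example may make this pairing concrete; the expression format and evaluator below are my own illustration, not the description language developed later in the book:

    # (1) syntax: expressions in a precise language, here nested tuples
    # (2) semantics: a precise procedure for finding an expression's meaning
    def interpret(expr, world):
        """Evaluate a propositional expression against a set of true atoms."""
        if isinstance(expr, str):                 # an atomic proposition
            return expr in world
        op, *args = expr
        if op == "and":
            return all(interpret(a, world) for a in args)
        if op == "or":
            return any(interpret(a, world) for a in args)
        if op == "not":
            return not interpret(args[0], world)
        raise ValueError("unknown operator: " + op)

    print(interpret(("and", "raining", ("not", "indoors")), {"raining"}))  # True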

I will combine my treatment of theoretical computer science with the development of formal logic, since these two areas are so closely intertwined. In fact, I will be using a computer system description approach which uses formal logic.

1.5 History of the research on my model

I was educated at Queens Road Primary School, Cheadle Hulme, Cheshire, at Middleton School, Wollaton Park, Nottingham, and at Mundella Grammar School, Nottingham, before getting a scholarship in 1957 to Magdalen College, Oxford, where I studied theoretical physics. My research career started with a PhD in theoretical physics at Imperial College, London University, and continued with research in the computer science department at Carnegie-Mellon University, Pittsburgh, where I became acquainted with the ideas of Newell and Simon. This led to an interest in developing a scientifically grounded artificial intelligence and psychology. My research history consists mainly of a long-range search for a scientifically grounded computational model of the brain, embellished with a sequence of studies that I came across along the way. In addition, I have worked on some applied AI projects, some only indirectly related to this main goal.

My computer science research started at CMU and then continued at Queen Mary College, University of London, where I was tenured faculty in computer science. I tried to follow Newell and Simon by doing my own protocol experiments for problem solving and chess. One student, Tunch Balman, reimplemented and extended GPS. Other students, Mark Witkowski, David Mott and Phil Marks, built autonomous robots. John Scott developed an artificial intelligence language. In 1978, I wrote a long position paper, “An approach to artificial intelligence”, attempting to ground AI in the notion of an autonomous surviving robot. This paper posited four bases: survival, real-time control, parallel architecture and learning. As an SERC Principal Investigator, I managed David Mott on a project to develop an intelligent rule-based learning robot, which was reported at the IJCAI81 conference. In 1981, I edited the first book on expert systems, “The Infotech Survey of Machine Intelligence”. From 1978, I worked with several parallel computers, the ICL Distributed Array Processor and also the CMU CM* machine, and in 1983 designed my own SIMD computer architecture without a central control unit, which I called a Bit Cube.

I emigrated to the United States at the end of 1984, and I collaborated with Les Gasser, while at USC and UCLA in Los Angeles, in editing the first book on multiagent systems, “Readings in Distributed Artificial Intelligence”, published by Morgan Kaufmann in 1988, which included the first in-depth review of multiagent concepts. I pursued two research ideas in multiagent systems, in order to develop a computational approach to social relationships. The first was to develop a notion of commitment among agents, published in 1990, and the second was to develop and implement a negotiation logic based on joint proof.

In 1990, I wrote another long position paper, “What I have in mind”, which discussed a comprehensive set of psychological ideas, such as sequential processing at the top level, emotions as mental states, and the representational needs of social interaction. This was an attempt to get closure on a creative synthesis. At that time, I used to meet with Robert Stoller for discussions on psychoanalytic theory. I had, however, gradually realized that I needed concepts and constraints from the hardware level, i.e., brain anatomy and physiology, in order to develop a good computational model.

In 1992, I went to work in Tokyo at the Sony Computer Science Laboratory as Sony Sabbatical Chair, where I wrote down a parallel architecture that was inspired by the blackboard model and also by the modular architecture of the brain. This was my proposed alternative to the subsumption architecture of Rodney Brooks at MIT. I implemented my model in 1993 while working at The Aerospace Corporation, El Segundo, California, and based it on joint action with other agents. After this, I extended the model to represent space, and developed a confirmation mechanism which allowed brain modules to coordinate their activities. When I started work at Caltech in the computer vision laboratory of Pietro Perona in 1996, I improved its efficiency to allow the development of applications. Its cycle time and response time were reduced to about 100 milliseconds. I also extended the model to do problem solving.

I submitted my first journal paper in 1993, but this was rejected, as were several other papers that I submitted, in the period 1994 to 1999, to neuroscience and to computer science journals. This rejection continues to this day, but I have found ways to get papers published. The objection of biologists seems to be that I am getting the computer science wrong, and of computer scientists that I am getting the biology wrong.

My first peer-reviewed paper was published in 1996 at a conference at NIST, the National Institute of Standards and Technology, and my first peer-reviewed archival journal paper was published in December 1999 in the American Journal of Primatology. I'm very grateful to Michael Raleigh and Debbie Pollack for their support during this time. So this was my decade of the brain.

This was the first version of the model. A second version and its application to problem-solving behavior was published in 2001 at the CNS*01 conference. This book also presents a third version, what we call the dynamic version or core model, providing for episodic memory.

1.6 Overview of this book and my model of the brain

The rest of the book is devoted to describing the model more explicitly and precisely, showing how it works, and explaining why I believe it is a good model of the brain.

The book is organized as six parts containing 33 chapters altogether.

Part I explains the essential foundations of this research. Chapters 2 and 3 introduce primate behavior and the primate brain. Chapter 4 gives a very brief history of the origins of system thinking about the brain by neurologists and psychologists, from the late nineteenth century up to the present day. Chapter 5 gives a brief historical introduction to formal description methods, which includes logic, computer science and artificial intelligence, and which underlies my scientific approach.

Then, Chapter 7 explains what computer science is and, in particular, description techniques and layering of descriptions. Chapter 10 defines my concept of brain science, as a layered set of different disciplines which use precise formal mathematical theories and have definition- and explanation-interfaces between them.

Finally, Chapter 8 examines computer science concepts and proposes a set of concepts for describing information processing in the brain.

Part II develops my theory and model of the primate neocortex. Chapter 11 explains concepts in information-processing analysis derived from computer science and artificial intelligence, including the concepts of plan, goal, frame, sequence and abstraction.

Dividing the brain into parts.

1. The brain will mean the human brain, although a lot of our knowledge of the anatomy and physiology of the brain comes from studying the rhesus monkey brain, which is very similar in structure and biological mechanism, although a lot smaller.

2. We will develop our model of the brain in parts, as in Figure 1.1:

(a) the cortex, without learning mechanisms and excluding language areas, which we will treat in Part II.

(b) the learning modules - the hippocampal formation and the basal ganglia, which we will treat in Part IV.

(c) the language areas of the cortex, treated in Part V.

(d) subcortical areas involved in motivation and control - the hypothalamus and amygdala, also treated in Part V.

(e) the rest of the central nervous system - the rest of the diencephalon, the brainstem, etc. - which processes incoming and outgoing data, controls arousal, etc.

Analyzing the primate neocortex. Chapter 12 applies these methods to analyzing the perceptual hierarchies of the primate neocortex, and Chapter 13 analyzes frontal areas of the neocortex as an action hierarchy. Connections between the perception and action hierarchies show that, architecturally, the neocortex forms a perception-action control hierarchy.

My abstract system description method. Chapter 14 discusses issues in the precise description of information processing in the brain.

My initial model. Chapter 15 describes my initial model of the primate neocortex, how I realized it as a computer program, and the behaviors I was able to model: (i) the cortex forms a perception-action hierarchical control system for perceiving and acting, (ii) it has short term memory in each module, (iii) it is motivated by cortical affiliative goals and produces social behaviors, (iv) it has built-in long term memory of social plans, and it executes these plans, and (v) perception is very simplified.

Figure 1.1: Dividing the brain into four main parts: perception and action systems; language systems; learning systems (basal ganglia and hippocampus); and subcortical survival systems.

The modeled behaviors were all social and involved more than one modeled primate which interacted in a spatial environment. They included social affiliation, social conflict and social spacing behaviors.

Part III concerns mental dynamics. Problem-solving behaviors. Chapter 16 describes how I extended the model to do problem solving, notably for the Tower of Hanoi problem. This involved adding a working memory as part of the planning module, and also running each module to quiescence within one time step, to obtain as much data coherence as possible in the messages being exchanged.
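The phrase "running each module to quiescence within one time step" suggests a control loop of roughly the following shape. This is a guess at the mechanism's outline, not the book's actual program, and it assumes modules expose a step() method that reports whether anything new was produced, as in the module sketch given earlier:

    def global_step(modules):
        """One time step: let every module redescribe repeatedly until no
        module produces anything new, so that the messages exchanged
        within the step are as coherent as possible."""
        quiescent = False
        while not quiescent:
            quiescent = True
            for module in modules:
                if module.step():      # True if new descriptions were produced
                    quiescent = False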

My programming language for brain models. Chapter 17 explains a programming language that I developed which allows brain models to be developed much more quickly.

Theoretical issues. Chapter 18 contains a discussion of mathematical properties of logical computation, Chapter 19 of the notion of symbol, and Chapter 20 of motivation mechanisms operating at the cortical level in my model.

The layer, neuron and cell dynamical levels of description. Chapter 21 describes how I can relate my abstract system level of description of the brain to more detailed levels of description. I take these levels to be (i) cortical layers as associative processors, then (ii) neurons and neural nets, and then (iii) the dynamics within the neuron.

Part IV concerns memory mechanisms. My dynamic model. Chapter 22 explains my analysis of episodic-memory mechanisms and how I extended my model to a dynamic model with a hippocampal complex, and Chapter 23 introduces and explains the notion of context, which incorporates descriptions of plans. Chapter 24 describes how episodes were learned and how the system could learn by doing. Chapter 25 explains my analysis of procedural memory and how I extended my model to include a model of the basal ganglia, where procedures are learned as associations of stimuli to actions.

1. There are two learning modules, the hippocampal formation which forms event memories, and the basal ganglia which form low-level procedural memories of routinely repeated action.

2. Event memories are aggregations over time of the input to the hippocampal formation. The current episode can involve events over a period from a few seconds up to an hour or more. The segmentation of the flow of event data is determined in part by the currently evoked context (a sketch of this segmentation follows the list).

3. Routine memory is formed by the basal ganglia from inputs received from the cortex and with output to the frontal lobe of the cortex, which has ultimate control over the use of routine action.
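To make point 2 concrete, here is a minimal segmentation sketch: a time-ordered stream of events is grouped into episodes, with a new episode begun whenever the evoked context changes. The function names and event format are illustrative assumptions only:

    def segment_episodes(events, context_of):
        """Group a time-ordered event stream into (context, events) episodes;
        a new episode begins whenever the evoked context changes."""
        episodes, current, ctx = [], [], None
        for event in events:
            c = context_of(event)
            if c != ctx and current:
                episodes.append((ctx, current))
                current = []
            ctx = c
            current.append(event)
        if current:
            episodes.append((ctx, current))
        return episodes

    events = [("see", "A"), ("groom", "A"), ("see", "B")]
    print(segment_episodes(events, lambda e: e[1]))
    # [('A', [('see', 'A'), ('groom', 'A')]), ('B', [('see', 'B')])]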

Part V describes possible extensions of the model. Chapters 26, 27, 28 and 29 describe work in progress on extending the model in some important directions, namely vision, language and subcortical systems. These promise that eventually the entire brain can be modeled using the methods I propose.

Vision
1. A more realistic vision system can be added as a hierarchy of vision modules.
2. This generates a representation of the visual percept which has a geometric 2.5-3D form and also an abstract object-file form.
3. The object-file form tends to persist across saccades.
4. Both forms are incorporated into the representation of the current event, and into the current episode.
5. The visual system can be directed by the problem-solving modules, giving top-down attention.
6. The problem-solving system can re-evoke stored episodes and images from long-term memory, to produce mental images, which may combine with incoming visual images from the environment.

Language
1. The language areas of the cortex consist of the highest-level modules of the perception-action hierarchy: (a) the phonological input module, (b) the lexicon, (c) the grammar module, (d) the text store, and (e) the phonological output module.
2. The text store forms the highest level of representation, corresponding to narrative.
3. Incoming words evoke corresponding lexical frames from the lexicon, and lexical frames competitively construct a grammatical structure and text (a toy sketch of this process follows the list).
4. To generate an output sentence, a text is constructed, and then from this lexical frames are selected to construct a grammatical structure which is output sequentially via the phonological output module.
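As a toy rendering of comprehension step 3 above (the module names come from the list; the frame structures and the trivial one-verb "grammar" are gross simplifications of my own):

    def comprehend(words, lexicon):
        """phonological input -> lexicon -> grammar: incoming words evoke
        lexical frames, which are assembled into a grammatical structure."""
        frames = [lexicon[w] for w in words]          # evoke lexical frames
        verb = next(f for f in frames if f["cat"] == "verb")
        args = [f for f in frames if f["cat"] == "noun"]
        return {"pred": verb["sense"], "args": [a["sense"] for a in args]}

    lexicon = {"monkeys": {"cat": "noun", "sense": "monkey"},
               "groom":   {"cat": "verb", "sense": "grooming"},
               "infants": {"cat": "noun", "sense": "infant"}}
    print(comprehend(["monkeys", "groom", "infants"], lexicon))
    # {'pred': 'grooming', 'args': ['monkey', 'infant']}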

Subcortical systems. My main interests in subcortical areas are the motivational systems. These are control systems which maintain water content (thirst), nutrition (hunger), agonism (aggression and submission mechanisms), attachment (mutual regulation of comfort), and sex (including maternal and paternal behaviors).

These systems are themselves hierarchical and complex, and they mutually interact. They are innate, but with some plasticity. In addition, mechanisms other than neurotransmitters are involved, including hormones and other factors.
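Each such system can be pictured, to first approximation, as a negative-feedback controller: a regulated variable, a set point, and a drive that recruits behavior in proportion to the deficit. This is a textbook control sketch, not the model developed in this book:

    def drive(level, set_point, gain=1.0):
        """Negative feedback: drive grows with the deficit in the regulated
        variable (e.g., body water content -> thirst)."""
        return max(0.0, gain * (set_point - level))

    water = 0.7                        # regulated variable; set point is 1.0
    thirst = drive(water, 1.0)         # 0.3: recruits drinking behavior
    if thirst > 0:
        water += 0.2                   # the behavior reduces the deficit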

Part VI contains material which attempts to summarize and to draw conclusions. Consciousness. In Chapter 30, I make a few brief remarks about how my model might provide insights for the scientific study of consciousness.

A computer science for the brain. Chapter 31 describes the concepts and methods I have developed which constitute a computer science for the brain.

Summary and conclusion. In Chapter 33, I collect together the conclusions of the different parts of the book to give in one place a brief summary of the research presented, the theory of the brain I have developed, and the model I have constructed.

Chapter 2

Primate behavior

Abstract. In this chapter, I discuss the characteristics of primate behavior, and their social basis. Most of primate behavior is in fact social, and primates have detailed knowl- edge of each other’s social behavior and social status.


2.1 From nonhuman to human primates

My strategy is to first model the nonhuman primate brain and then to extend it to the human case. The brains of all primates are very similar in structure and function. They have similar neural areas and similar interconnectivity among these areas. The cortical neuronal circuitry is very similar for all primates. It uses the same types of cell and the same neurotransmitter mechanisms.

Figure 2.1 diagrams the brains of a carnivore, a prosimian, a simian (i.e., monkey), and a human.

Figure 2.1: The similarity of the brains of primates, from [Brodmann, 1909] via [Bullock, 1977] Fig 10.92, p. 487

The behavior of nonhuman primates is simpler than that of humans; however, their subcortical motivational systems are quite similar, so it is reasonable to take the innate motivational systems of nonhuman primates as essentially the same as those of humans.

Basic social behaviors in nonhuman primates are similar to those in humans - interactions, dominance hierarchies, social support and affiliation. Thus, it is reasonable to take these basic behaviors in nonhuman primates as a basis for extension to human behaviors. Social groupings of 30 or so individuals are similar in human hunter-gatherer societies and nonhuman primate troupes.

Thus, I would like to argue for the simplified view that the human neonate may have similar behaviors to a nonhuman primate neonate, and then the difference in cortical learning causes them to diverge as they develop.

One can argue for greater differences. Humans have much larger brains and have language. There is a much longer period of immaturity. There is thus a much greater degree of cortical control of behavior.

Humans have a much greater range of facial expressions. Humans have better vision. Humans have detailed knowledge and memory of events. Humans may be using some additional underlying learning mechanisms.

Thus my research plan is to first build a model of a monkey brain and get it to show monkey behaviors. After this I will extend the model to have human problem-solving abilities, and then memory, language abilities and greater knowledge-representation abilities.

In the next sections, I will try to take stock of what nonhuman primate behaviors my model will need to demonstrate initially.

2.2 Biology and development

I will be describing a very complex biological system. There will be brain areas and anatomical connections, and there will be incoming data from senses as well as outgoing data to the body's effectors. Neural connectivity is not the only thing; there will also be other phenomena at the level of the cell, such as different neurotransmitters, hormones, opiates, etc.

The brain has many different components, which have all evolved to their current state, and probably constitute a best compromise under the current circumstances.

In biology, development is a major theme, if not the major theme. Development includes both phylogenetic development - how the brain evolved - and ontogenetic development - how an individual human's brain changes from conception on. An awareness of development will help us to appreciate the different brain components, and brain organization and function.

2.3 The evolution of primates

Primates evolved as arboreal mammals in tropical zones. The earlier forms are prosimians and the later ones simians. They also divide into New World and Old World species, since during their evolution the continents drifted apart. After simians, we get the hominoids - the apes - and then the hominids leading to modern man.

It is difficult to characterize exactly what a primate is; however, defining characteristics of primates seem to include: larger brains with a lot of six-layer cortex, eyes forward, facial expression - some have movable upper lips - sophisticated vision, including color vision, developed social dynamics (some prosimians are less social), and developed hands and feet for manipulation.

The defining characteristics of man seem to be: language, very large brains, bipedal locomotion, and, according to John Eccles [Eccles, 1989], altruism and a greater ability to learn.

Note that all primates continued to evolve until the present day, so present-day monkeys are different from earlier monkeys, etc. In fact there were at one time many other types of monkey and ape, quite different from present day monkeys and apes, which became extinct.

2.3.1 Present day primates

There are currently one hundred and twenty or so species of monkeys, the characteristics and behaviors of about thirty or so of which have now been studied in detail by field primatologists [Fedigan, 1992] [Fedigan and Strum, 1997] [Rodseth et al., 1991b] [Rodseth et al., 1991a].

Most is known about three Old World species, the rhesus macaque, the Japanese macaque and the vervet. The most studied New World species are the squirrel monkey and the owl monkey.

Baboons have been much studied. There are very few apes, and the great apes are all endangered - the chimpanzee and the gorilla in Africa and the orang-utan in South East Asia.

The chimpanzee has been much studied, being the smartest and genetically the closest to humans.

2.4 Primate behaviors

For vervet monkeys, the classic primatological studies are due to K. R. L. Hall and Stephen Gartlan [Hall and Gartlan, 1965] and Thomas Struhsaker [Struhsaker, 1967c] [Struhsaker, 1967b] [Struhsaker, 1967a], who described basic behaviors, social relations and vocalization. Michael McGuire described the vervets on the island of St. Kitts, see his book [McGuire, 1974] and film “The St. Kitts vervets”. The importance of affiliation in determining behavior was noted and modeled by Robert Seyfarth [Seyfarth, 1977]. The ability to represent social relations and to generalize them has been described by Dorothy Cheney and Robert Seyfarth [Cheney and Seyfarth, 1990b], who have also contributed a comprehensive book on the vervet [Cheney and Seyfarth, 1990a].

Struhsaker noted 60 different detailed action types in 12 different stimulus situations. He gave a table indicating which detailed actions occurred in which situations, and gave an interpretation in terms of a message and a response implied by the action. Situations were also differentiated according to the age and sex of vervets involved. He used 5 age groups, namely, adult, subadult, juvenile, young juvenile and infant. He also gave a list of 47 different vocalization types and gave a table relating them to situations in which they occurred and the meaning implied, expressed as a message and an accompanying action. Vervets form troupes of about 40 animals; vervet males migrate between troupes about every five years, and the system of social relations is based on grooming. The males and females are fairly evenly matched and the social system has a female hierarchy with a male hierarchy below this.

For rhesus monkeys, more precisely rhesus macaques, classic studies are by Southwick et al. [Southwick et al., 1965] and Lindburg [Lindburg, 1971]. They are found over a wide area in Asia, including countries from Afghanistan to Vietnam. Rhesus males also migrate between troupes every few years, and the system of social relations is based on grooming. The males and females are less evenly matched than vervets and the social system is multimale-multifemale. Also, troupes tend to be larger, up to 100 animals. They have a set of vocal calls [Hauser, 1997] and a set of displays including fear grimacing (appeasement), staring with open mouth (threat), tail erect (dominance challenge), lipsmacking (conciliatory) and female (sexual) presentation [Estes, 1991]. Boelkins and Wilson [Boelkins and Wilson, 1972] described intergroup social dynamics for the troupes on the island of Cayo Santiago, near Puerto Rico. There is also a very good documentary film on these monkeys called “Monkey Island”.

Available data. The only available field data on monkey behavior, apart from film, is frequency and occurrence data of behavior types. The basic method of studying primates in the field is to develop a taxonomy of behavior types, usually fewer than 100 in number, and then to record the behaviors of each monkey over time in terms of this taxonomy. I have participated as an experimental observer in experiments at the Veterans' Administration primate facility at Sepulveda, California. There was a set of observers, and each observer looked only at one particular monkey. On a signal from the lead experimentalist, every sixty seconds, you record the code for the behavior that your monkey is doing at that moment. We each had a handheld computer into which we entered the code. This data was then uploaded into a larger computer for analysis. From this raw event data, various statistics can be calculated on frequencies, correlations, probabilities of given sequences, etc.
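From records of this form, the frequency statistics mentioned above fall out directly. A minimal analysis sketch, with invented monkey identifiers and behavior codes:

    from collections import Counter

    # one record per sampling signal: (time_s, monkey_id, behavior_code)
    records = [(0, "F1", "groom"), (60, "F1", "groom"), (120, "F1", "forage"),
               (0, "M2", "rest"),  (60, "M2", "threat"), (120, "M2", "rest")]

    def behavior_frequencies(records, monkey):
        """Fraction of sampling instants spent in each behavior type."""
        codes = [code for (_, m, code) in records if m == monkey]
        return {code: n / len(codes) for code, n in Counter(codes).items()}

    print(behavior_frequencies(records, "F1"))
    # {'groom': 0.667, 'forage': 0.333} (approximately)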

Cognitive abilities of captured animals are studied in more standard psychological experimental situations. The animal is placed in a restraining chair and is fed juice or something similar as a reward. It is then trained to do an experimental task, and measures of task performance are recorded. Experiments can also be done under less restrained conditions, such as in a large cage.

There is also a genre of experiments where a single electrode, a fine wire, is implanted in some area of interest in the animal's brain and electrical readings are taken as the animal behaves. Experiments are beginning to be done in MRI imaging devices specially designed for nonhuman primates.

For chimpanzees, detailed videos are used, but usually for solitary cognitive activities and not for social behaviors. Usually these videos are not subjected to detailed second-by-second analysis.

For humans, detailed videos of social interaction are obtained, and subjected to detailed second-by-second analysis. Two key subareas are conversation analysis, and mother-infant interaction.

Range of behaviors. It seems that nonhuman primates exhibit a small number of different classes of behavior types: (i) feeding, including foraging, i.e., searching, hoarding, sharing, etc. (ii) agonism, including dominance struggles, avoidance and flight (iii) affiliation and attachment (iv) sex, including long-term sexual relations (v) infant caretaking, sometimes called infant handling, by all members of a troupe.

Knowledge of each other. Primates have detailed knowledge of each other: (i) their health, (ii) their reproductive status, (iii) their affiliative, agonistic and sexual relationships, and (iv) their family status - they will, for example, after a defeat in a fight, take revenge on a weaker family member, usually a sibling of the animal that defeated them; this is redirection. They also recognize each other's individuality, can recognize each other at a distance, recognize each other's particular vocal calls, and so on.

Vocalization. Nonhuman primates use vocal calls. Typically, depending on the species, there are about 50 types of calls, each with some social meaning. Within a given type, there will be variation of strength and duration, and, as already mentioned, each individual animal has its own recognizable voice.

Social behavior. Note that essentially all nonhuman primate behavior is social behavior. Even apparently solitary behavior such as feeding is modulated and governed by the social context in which it occurs.

Thus the behavior of an individual is in general strongly coordinated with the behavior of others, and motivation of behavior is derived from the perception and knowledge of social relationships.

2.5 Societal dynamics

Troupes. Primates tend to form troupes of about 30 animals (rhesus can be > 100 animals, prosimians almost solitary, human hunter-gatherers 30-50).

Social hierarchies. Within a troupe, they dynamically maintain social hierarchies depending on the species and to some extent the environment, and the individual animals involved. Dominance carries with it the responsibility of leadership and protection of the troupe.

Dimorphism. Dimorphic species have males much bigger than females (e.g., rhesus males are 2 to 3 times the weight of females) and form male dominance hierarchies. Less dimorphic species such as vervets typically form separate female and male hierarchies, with the female hierarchy often dominating the male hierarchy.

Social migration. A given geographical region will have several troupes. Each troupe has a territorial range that it defends and which may change slowly, or in some species the troupe may continually move to new areas. When juvenile males reach maturity, at about 4-5 years of age, most of them leave the troupe of their birth and attempt to enter another troupe. This involves winning a place in the social dominance hierarchy. After a further five years or so they move to another troupe, and so on.

Social conflict. Each individual has certain goals and needs, and tries to satisfy these goals. It enters into social conflict with others as a result. To the extent that the outcomes of struggles show regularities, animals can form and be guided by memories of affiliative and agonistic solutions. This memory and behavior forms the social system.

Understanding dominant and subordinate behavior. Robert Sapolsky's summary [Sapolsky, 1990] of the behavior of dominant baboon males has five categories. A dominant animal is more likely than a subordinate animal: (i) to differentiate between threatening and neutral interactions, (ii) to initiate a fight with a threatening rival, (iii) to initiate a fight he wins, (iv) to differentiate between winning and losing a fight, and (v) to successfully redirect aggression after losing a fight.

Dominance. Leader animals will protect the troupe against predators, and make difficult decisions concerning troupe movement, food, and other life problems. However, animals in subordinate positions tend to be more stressed and depressed. Work was done on the presence of the neurotransmitter serotonin in monkeys, and it was found that dominant monkeys had more of it, that adding serotonin tended to make monkeys more dominant, and also that it was disproportionately present in their orbital frontal cortices [McGuire et al., 1986] [Raleigh, 1987]. This led to the use in humans of fluoxetine, a serotonin reuptake inhibitor which enhances serotonin levels, sold under the brand name Prozac.

Female hierarchies are determined by birth order, the most recently born female being placed above her elder sisters. This birth-order hierarchy will often be modified by individual personalities, strengths and weaknesses.

Sexual competition. Nonhuman primate females have offspring into old age; however, the number of males is often smaller than the number of females, due to large casualty rates for migrating males, often 30-50% resulting in death. Only humans have menopause, and thus higher competition among males for fertile females, approaching a two-to-one ratio.

Affiliative behavior. Primates spend a great proportion of their time in affiliative behavior, which maintains their relationships, their position in their society, and the cohesion of their society.

Nonhuman primate troupes have been compared to the behavior of humans aged 10-12 (“junior high”) in the assiduousness and ubiquity of their socialization.

It is also clear that, in many species, animals achieve leadership through the strength of their affiliative relations [Kummer, 1975]. Males often only become dominant because of female support in addition to male support. Support stems from successful affiliative activity.

Troupe dynamics. A troupe will move around foraging for food and finding shelter in order to survive. It will need to defend itself against predators. There will be some inter-troupe conflict also.

As new animals are born and mature, and old animals grow weaker and eventually die, the pattern of affiliative, agonistic, sexual and caretaking relationships dynamically adjusts by a myriad of daily social encounters and tests of strength. As a result, the troupe society continuously restructures itself.

Survival. In order for a troupe, and the species, to survive, this process should sustain the life of its individuals. Survival includes getting adequate food and shelter, protection from predators, and replacement of animals by conception, birth, growth and development.

Chapter 3

The primate brain

Abstract. I first give a brief account of the evolution of the primate brain, and point out the evolutionary origins of the different parts of the brain. I start from simple creatures and show the emergence of the reptile brain and then the emergence of six-layer cortex and the primate brain.

I then briefly discuss the structure and functioning of the primate brain. I give an outline architecture for the brain, making generalizations which necessarily simplify but which capture mainstream thinking. I discuss the neocortex, its division into areas and its layered structure.


3.1 The evolution of the primate brain

3.1.1 Invertebrates

The evolutionary sequence is something like this: (i) distributed nets of neurons, as in jellyfish, (ii) then a concentration of neurons in the head to produce a very simple brain with some coordination of motor signals via a medulla, (iii) internal organs controlled by neurosecretion; however, there is not a true hypothalamus.

3.1.2 Vertebrates

With the evolution of vertebrates, it seems that most of the modern arrangement came into being immediately, although with simple components. The neurons were now arranged into a systematic layered structure in the form of a sheet, or cortex. This sheet actually develops as a tube. The brain developed as a sequence of extensions of this tube called, respectively, the myelencephalon, the metencephalon, the mesencephalon, the diencephalon and finally the telencephalon; see Figure 3.1.

Thus the latest stage was the telencephalon, which had a pallium (roof) which would develop into the cortex, and a subpallium (floor) which developed into the basal ganglia. The thalamus concentrated input from sensors and from the lower brain and projected to the pallium and subpallium. Thus the cortex, basal ganglia and thalamus existed from the beginning. Then these components evolved and increased in complexity by the addition of nuclei. This design reached a very successful form in the reptile brain, which is diagrammed in Figure 3.2, from [Romer and Parsons, 1986].

Figure 3.1: The telencephalon, from [Bullock, 1977]

The characteristics of the brains of early vertebrates include: (i) some coordination of sensory signals via a thalamus. The thalamus is present in very simple vertebrates such as early fish, including some anamniotes.^1 (ii) the control of internal organs and endocrine system by a hypothalamus, (iii) much better motor control, by a cerebellum, developed originally in early sharks, (iv) the development of an olfactory cortex which differentiated from the pallium and is called the paleocortex or piriform cortex. (v) the mesencephalon provided more vision processing, using the superior colliculus, and more auditory processing, using the inferior colliculus; thus vision and audition were not handled by the cortex at this stage. (vi) the differentiation of the pallium into neocortex and archicortex. The archicortex becomes the hippocampus and moves to the medial surface of the hemisphere.

^1 The embryo of an amniote develops within an amniotic sac, the containing membrane being the amnion. Anamniotes do not. Later animals, including humans, are amniotes.

Figure 3.2: The reptile brain, from [Romer and Parsons, 1986]

We show the development of the cortex in Figure 3.3, taken from Romer and Parsons’ book “The vertebrate body” [Romer and Parsons, 1986].

Figure 3.4, taken from Shepherd's book, “The synaptic organization of the brain” [Shepherd, 1998], shows the three-layer organization of the reptile cortex. The cortex is made up of millions of interconnected circuits of this type, arranged side by side to form a sheet.

As amphibia evolved into reptiles, the three-layer design with the two different types of three-layer cortex, i.e., paleocortex (the paleopallium) and archicortex (the archipallium), was very successful and lasted a long time, including the age of the dinosaurs. The two types of cortex have different connectivity patterns with the rest of the brain. The paleocortex connects directly to the amygdala, thalamus and hypothalamus, whereas the archicortex connects mainly to sensory inputs. At the same time, the basal ganglia differentiated from the paleopallium.

3.1.3 Mammals

The next stage was the emergence of six-layer cortex as reptiles evolved into advanced reptiles, monotremes, marsupials and early mammals.

Figure 3.5, taken from Dart's work [Dart, 1934], shows the cortex of an advanced reptile with two small areas of six-layer cortex extending between the olfactory cortex and the hippocampus.

Figure 3.6 shows the different types of cells in six-layer cortex, taken from Rodney Douglas and Kevan Martin's work [Douglas and Martin, 1998].

Figure 3.7, taken from Reiner's work [Reiner, 1993], shows one account of the correspondence of six-layer to three-layer cortex.

3.1.4 Primates

With the advent of primates, evolving from advanced mammals, we get the rapid growth of six-layer neocortex until it dominates the brain. Figure 3.8 shows Stephan and Andy's data on the relative growth of different parts of the brain, taken from Eccles' book [Eccles, 1989]. To give a baseline, they took what they hoped would be a neutral or vanilla mammal, which they called a “basal insectivore”, i.e., a mammal that had a simple structure and behavior and spent its life eating insects. A hedgehog is pretty close to this, I believe. Then they measured the sizes of different parts of the brain for different species and compared them with the brain of a basal insectivore.

Figure 3.3: The sequence of evolution of the cortex, from [Romer and Parsons, 1986]

Figure 3.4: The circuit of three-layer cortex, from [Shepherd, 1998]

Figure 3.5: The emergence of six-layer cortex in reptiles, from [Dart, 1934]

Figure 3.6: Types of cells in six-layer cortex, from [Douglas and Martin, 1998]

Figure 3.7: The evolution from three to six cortical layers, from [Reiner, 1993]

Figure 3.8: Stephan and Andy's data on the relative growth of different parts of the brain, from [Eccles, 1989]

Thus, referring again to Figure 3.3,
1. the original cortex is the paleopallium, which ends up as the piriform cortex,
2. the archipallium develops out of the paleopallium and becomes the hippocampus,
3. the basal ganglia develop out of the paleopallium and move to the interior, and
4. the neopallium develops from the archipallium and paleopallium and becomes the neocortex.

Todd Preuss has analyzed the neocortices of prosimians, simians and humans [Preuss and Goldman-Rakic, 1991a], and more recently the chimpanzee brain. Figure 3.9 shows newly evolved cortical areas of a simian species.

3.1.5 Modern evolution theory

The set of animals with a particular set of genes is called a genotype. The set of animals with the same physical expression of their genes is called a phenotype, so a phenotype may correspond to more than one genotype. The success in reproduction and survival of a given set of animals is called its fitness, and is determined by its phenotype.

The last few decades have seen a revolution in methods and in understanding of evolution [Ridley, 1993] [Stearns and Hoekstra, 2000] [Northcutt, 1991] [Butler and Hodos, 1996]: (i) the existence of evolutionary changes which had no impact on fitness, so-called neutral genetic drift [Kimura, 1983] [Gillespie, 1992]. (ii) the development of cladistic analysis, in which evolutionary development is represented as a binary tree with each step being the change in one dominant feature of the phenotype [Butler, 1994]. (iii) the use of DNA sequencing to discover genetic connections among species. (iv) the analysis of changes in terms of the evolution of the molecules involved in the development and functioning of the animal. (v) the analysis of evolution by types of neurons, which are determined by particular genes, and which make genetically determined topological connections, independently of geometry [Deacon, 1990].

These advances have also disproved several classical assumptions: (i) that present-day animals are arranged in a linear scale of complexity ending in man, the scala naturae assumption, also termed orthogenesis, whereas animals have actually evolved in parallel with man and with similar growth in complexity. Corresponding parallel evolution of similar complexity is termed homoplasy, as distinguished from the relationship between a simpler and a more complex version of a feature, which is called homology. (ii) that various brain organs emerged one at a time, whereas with vertebrates the entire layout of the telencephalon emerged at once. (iii) that the telencephalon of early animals was dominated by olfactory connections.

3.2 The structure of the cortex

Neocortical areas. The neocortex in all primates is arranged as specialized areas, about fifty in each hemisphere. Each area has a few million neurons, is a few millimeters across, and is 4 to 6 millimeters thick. Areas have distinct information-processing functions.

Figure 3.10 shows Brodmann’s numbering of the different areas that he identified in the human neocortex.

The structure of the neocortical areas. The primate neocortex comprises neurons from a small set of anatomical cell types, shown in Figure 3.11. Cells of a given type are generated by a particular set of genes.

In addition to pyramidal cells there are several other types of cross-connecting neurons, or interneurons, in numbers similar to the pyramidal cells. These form all possible circuits and connections between pyramidal cells in the same and different layers. The fan-in ratio of connections to any given pyramidal cell can be as high as 10,000.

Areas have specific fixed interconnectivity. I will discuss later, in Chapter 12, the experimental evidence for a pattern of connectivity among areas which appears to be the same, or similar, for all primates. A connection between two areas consists of about one million axons, from pyramidal cells in a cortical layer in the source area to another cortical layer in the target area. Each area is typically connected to a small number of other areas. Connections divide into long range and short range. At short range, an area is often connected to several neighboring areas that are contiguous with it. At long range, an area is usually connected to one, two or three areas that are further away, and not contiguous with it.

Connectivity among the different neocortical areas and also subcortical areas. Figure 3.12 diagrams how each layer has a different connectivity role. Layers 1 and 4 receive inputs from other cortical areas and the thalamus, while layers 2, 3, 5 and 6 contain pyramidal cells generating outputs. The targets of these outputs depend on the layer: layer 2 projects to cortical areas in the same hemisphere, layer 3 to cortical areas in the opposite hemisphere, layer 5 to the thalamus, and layer 6 to subcortical areas including the basal ganglia.
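
To make the layer roles concrete, here is a small Python lookup table; this is only my own restatement of the description above, not a model from the book:

# Layer roles, as just described: direction of traffic and its source or target.
LAYER_ROLES = {
    1: ("input",  "feedback from other cortical areas and thalamus"),
    2: ("output", "cortical areas in the same hemisphere"),
    3: ("output", "cortical areas in the opposite hemisphere"),
    4: ("input",  "feedforward from other cortical areas and thalamus"),
    5: ("output", "thalamus"),
    6: ("output", "subcortical areas including the basal ganglia"),
}

for layer, (role, target) in LAYER_ROLES.items():
    print(f"layer {layer}: {role} -- {target}")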

There seem to be anatomical arrangements for hierarchical processing. Forward, ascending connections from earlier areas enter at layer 4, and feedback, descending connections from later areas enter at layer 1 [Felleman and Essen, 1991].

3.3 The uniform process of the neocortex

3.3.1 There is a uniform process

The primate cortex has a uniform structure over all of its area [Creutzfeldt, 1978] [Ullman, 1991], having a six-layer organization comprising neurons from a small set of cell types. The numbers of these cells per unit volume are very uniform over the cortical surface, the main differences being in motor cortex, which has more and larger pyramidal cells, and in visual cortex, which has a significantly greater density of cells, about three times that of other areas. A canonical neocortical circuit can be described [Shepherd and Koch, 1998] [Douglas and Martin, 1998], see Figure 3.13, together with characterized regional variations from the canonical form. In the figure, the boxes indicate populations of neurons of a given cell type: P2+3 are pyramidal cells in layers 2 and 3, P5+6 are pyramidal cells in layers 5 and 6, 4 denotes layer 4 stellate cells, and GABA are GABAergic interneurons, i.e., neurons using the neurotransmitter GABA. The blue dashed lines indicate inhibitory connections and the red continuous lines excitatory connections.
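
One way to restate such a circuit is as a signed directed graph over the populations. The Python sketch below is my own illustrative rendering, and the particular edge list is an assumption made for illustration, not a quantitative reading of Figure 3.13:

# Populations as nodes; each directed edge is excitatory (+) or inhibitory (-).
# The edges are illustrative assumptions, not taken from the figure.
populations = ["thalamus", "L4_stellate", "P2+3", "P5+6", "GABA"]
edges = [
    ("thalamus",    "L4_stellate", "+"),   # thalamic afferents excite layer 4
    ("L4_stellate", "P2+3",        "+"),   # layer 4 relays to superficial pyramids
    ("P2+3",        "P5+6",        "+"),   # superficial pyramids drive deep pyramids
    ("P5+6",        "thalamus",    "+"),   # deep pyramids project back to thalamus
    ("GABA",        "P2+3",        "-"),   # interneurons inhibit both pyramidal pools
    ("GABA",        "P5+6",        "-"),
]
excitatory = [(s, t) for s, t, sign in edges if sign == "+"]
print(excitatory)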

Although long-range connectivity, as we have seen, tends to cluster around particular cortical regions, short- and medium-range (<3mm) connectivity within one area of the cortex is statistically quite uniform. It therefore appears that information processing within different cortical regions has a common basis or principle.

3.3.2 Theories of the uniform process

This subsection is taken mainly from Shimon Ullman's MIT AI Memo No 1311, December 1991 [Ullman, 1991]. He lists the different theories that have been proposed for the uniform cortical process:

1. The classifying cortex, David Marr [Marr, 1970]. The uniform process is the classification of incoming patterns.

2. The non-linear spatio-temporal filter, Otto Creutzfeldt [Creutzfeldt, 1978]. The uniform process is a filter which links the activity of the thalamus and other afferents to the effectors.

3. The model builder, Horace Barlow [Barlow, 1972] [Barlow, 1990]. The uniform process is the detection and signaling of suspicious coincidences, P(A ∧ B) ≫ P(A) · P(B), coincidences being used to form internal models of the environment (see the sketch after this list).

4. Multilevel relaxation, David Mumford [Mumford, 1994]. Multiple cortical areas interact to achieve a consistent interpretation of the incoming stimulus.

5. Large-scale associative memory, John Hopfield [Hopfield, 1982].

6. Interpolating memory, Tomaso Poggio [Poggio and Shelton, 1999], James Albus [Albus, 1981].

7. Neuronal group selection, Gerald Edelman [Edelman and Mountcastle, 1978].

8. Sequence-seeking using counter-stream structure, Shimon Ullman [Ullman, 1991] [Ullman, 1996]. The process is a search for a sequence of mappings linking sensory source and target model representations.
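
Of these proposals, Barlow's is the easiest to state operationally. The following Python sketch is my own toy version, with made-up event data: it flags a pair of events as a suspicious coincidence when their joint frequency much exceeds the product of their marginal frequencies:

# Toy "suspicious coincidence" detector: P(A and B) >> P(A) * P(B).
def suspicious(events, a, b, factor=2.0):
    """events: a list of sets, each holding the events observed at one moment."""
    n = len(events)
    pa = sum(a in e for e in events) / n
    pb = sum(b in e for e in events) / n
    pab = sum((a in e) and (b in e) for e in events) / n
    return pab > factor * pa * pb

# Made-up data: lightning and thunder always co-occur.
observations = [{"lightning", "thunder"}] * 3 + [{"rain"}] * 7
print(suspicious(observations, "lightning", "thunder"))   # True: 0.3 >> 0.3 * 0.3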

3.4 Overall components and architecture

Figure 3.14 is a very simplified diagram of the main components of the brain and their connections. An approximate idea of their functions, which will be discussed in much more detail later in the book, is as follows:

(i) The neocortex (or, simply, cortex) is a perception-action hierarchy with stored semantic and episodic memories, providing overall control. It is organized as a six-layer sheet (cortex) of neurons.

(ii) The hippocampus provides episodic memory formation with some long-term storage of episodic information. It is organized as a cascade of three subcomponents, each a three-layer cortex.

(iii) The thalamus provides routing of incoming data to the neocortex and from cortex to cortex. It is organized as nine nuclei with limited mutual interaction.

(iv) The basal ganglia are involved in procedural memory and routine motor control. They are situated in loops to and from the neocortex.

(v) The amygdala integrates low-level motivational processing and has connections to orbital frontal neocortex, to the hippocampus, and to the hypothalamus. It is organized as two groups of nuclei.

(vi) The cerebellum is apparently used for smooth motor control and for spatial representation in general. It is organized as a very regular cortex with a large number of cells.

(vii) The hypothalamus is the main work center for creating and maintaining subcortical motivation states. It generates hormones, via the pituitary gland, and sends signals to the autonomic system and skeletal musculature, readying the system for different kinds of action. Hormones are also generated by glands in the body and these affect the hypothalamus and other brain components, including the neocortex, via the blood stream and synaptic receptors.

(viii) The brain stem is concerned with basic functioning and arousal of the system, including sleep.

(ix) The spinal cord concerns communication of sensory and motor data to and from the body.
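
For later reference, the list above can be condensed into a small lookup table. This is just my own one-line-per-component restatement of (i)-(ix):

# One-line summaries of the components of Figure 3.14, as described above.
COMPONENTS = {
    "neocortex":     "perception-action hierarchy, stored memories, overall control",
    "hippocampus":   "episodic memory formation, some long-term episodic storage",
    "thalamus":      "routing of incoming data to the neocortex and cortex to cortex",
    "basal ganglia": "procedural memory and routine motor control",
    "amygdala":      "integration of low-level motivational processing",
    "cerebellum":    "smooth motor control and spatial representation",
    "hypothalamus":  "creating and maintaining subcortical motivation states",
    "brain stem":    "basic functioning and arousal, including sleep",
    "spinal cord":   "sensory and motor communication with the body",
}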

Figure 3.9: Todd Preuss's findings of new neocortical areas by comparing a prosimian (a Galago bushbaby) and a simian (a Macaque monkey)

Figure 3.10: Brodmann’s areas of the human neocortex 58 Chapter 3: The primate brain

[Figure content: the cell types in each layer - layer I: H1; layer II: G2; layer III: P3S, P3L; layer IV: M4, ST4, G4; layer V: P5S, P5L; layer VI: M6, SP6.]

Figure 3.11: Cell types and internal connectivity in the neocortex, from Shepherd and Koch in [Shepherd, 1998]

[Figure content: layer I - inputs of feedback from other cortical areas and thalamus; layer II - output to other cortical areas in the same hemisphere; layer III - output to other cortical areas in the other hemisphere; layer IV - inputs of feedforward from other cortical areas and thalamus; layer V - outputs to thalamus; layer VI - outputs to subcortical areas.]

Figure 3.12: External connectivity of the different cortical layers

[Figure content: the populations P2+3, layer 4 stellate cells, GABA cells, P5+6, and the thalamus, with their excitatory and inhibitory interconnections.]

Figure 3.13: Canonical neocortical circuit, due to Douglas and Martin [Douglas and Martin, 1998]

[Figure content: component labels - neocortex: perception-action hierarchy, willed action; basal ganglia: routine behavior, routine motor control; hippocampus: episodic memory formation; thalamus: routing, communication; cerebellum: external spatial representation, smooth motor control; amygdala: integration of motivation; hypothalamus: control of internal motivational states; brainstem: arousal; hormones, the autonomic system and skeletal musculature; spinal cord: information transmission; external sensing shown as input.]

Figure 3.14: Overview of brain components

Chapter 4

The historical development of system-level approaches to the brain

Abstract. I give a brief historical review of the several strands and disciplines that are relevant to my purpose.

The basis of the study of the brain was laid down in the nineteenth century, and led to a system-level approach to describing the brain, which was used by neurologists in characterizing the pathologies of their patients.

Psychology emerged from philosophy as an experimental discipline, and then divided into experimental psychology, psychiatry and psychotherapy.

More recently the anatomy and connectivity of the brain have been elucidated by the use of lesion, tracing and now imaging techniques.


4.1 Introduction

Although the scientific study of behavior, that is scientific psychology, only began in the nineteenth century, neurology, the medical understanding and treatment of disorders of the nervous system, has a long history going back before written records [Benton and Joynt, 1960]. In the beginning, there were not only art and philosophy, but also medicine and engineering.

Figure 4.1 attempts to give a diagrammatic overview of some of the main researchers and their contributions.

[Figure content: a timeline of researchers and contributions, arranged in columns for philosophy, neurology, psychiatry, neuropsychology and neurolinguistics, running from Hippocrates (400 BC, aphasia) through Paracelsus, Schmidt, Locke, Pinel, Gall, Esquirol, Herbart, Spencer, Broca, Meynert, Charcot, Wernicke, Hughlings Jackson, Freud, Kraepelin, Janet, Marie, Brodmann, Head, Jakobson, Luria, Milner, Geschwind, Warrington and Shallice (1970).]

Figure 4.1: Events in the history of neurology and neuropsychology

I will not attempt the impossible task of a clean separation among these different disciplines.

4.2 Neurology, neuroanatomy and neurophysiology

John Locke, 1632-1704, developed a theory of ideas which were connected together by excitatory associations. The basic processes were sensation and reflection: “our senses, conversant about particular sensible objects, do convey into the mind several distinct perceptions of things, according to those various ways wherein those objects do affect them” [Locke, 1690]. These “ideas” are, according to Locke, the fundamental building blocks of all human thought. The process of reflection could be more active and deliberate, including operations of composition and abstraction of ideas.

Johann Friedrich Herbart, 1776-1841, introduced the concept of inhibitory association [Herbart, 1824]. Ideas competed for energy: the dominant ones corresponded to conscious thought, whereas the subordinated ones corresponded to unconscious thoughts. This laid a framework for Freud's architectural ideas. Herbart's textbook on psychology was the standard one in Germany in the mid-nineteenth century, and Freud would certainly have read it, for example when working in Meynert's laboratory in the 1880s. It was Herbart who separated psychological ideas from philosophy; however, he did not propose an experimental methodology, which was left to others such as Wilhelm Max Wundt, 1832-1920 [Wundt, 1863] [Wundt, 1874].

Franz Joseph Gall, 1758-1828, referring to language, opined that “a special organ of the brain presides over this wonderful function” [Gall and Spurzheim, 1810] (vol 4 p. 65).

Paul Broca, 1824-1880, made the first observation of localization of function in the human brain when he examined a patient, named Leborgne, with a speech production aphasia (an aphasia is any problem with the recognition or generation of language), and was later able to examine his brain post mortem, finding damage to a left frontal area. This area was called Broca's area, and aphasia due to its malfunction was called Broca's aphasia.

In 1873, Camillo Golgi, 1843-1926, extended histological staining techniques for the first time to the very fine cells of the brain, revealing its neuronal structure [Golgi, 1873]. This was based on the 'black reaction' (reazione nera), in which nervous tissue is hardened in potassium bichromate and impregnated with silver nitrate, a method now called Golgi staining.

Theodor Meynert, 1833-1892, was a neuroanatomist who developed the first overall architectural scheme for the human brain, in which neurons connecting directly to and from the external sensors and effectors were called projection neurons, and the other neurons, connecting these projection neurons to each other, were called association neurons [Meynert, 1884] [Marx, 1970]. In the 1884 edition of his book, he describes three different kinds of cortical cells and a five-layer structure of the cortex.

Carl Wernicke, 1848-1905, was much influenced by Meynert and discovered a corresponding speech perception aphasia, due to damage to the left posterior superior temporal region, which was then called Wernicke's area, the corresponding disorder being Wernicke's aphasia [Wernicke, 1874]. Actually this finding had already been published by H. Charlton Bastian, 1837-1915. Wernicke also noted that there were aphasias due to the disconnection of the two main speech areas, Broca's and Wernicke's areas, caused by damage to the arcuate fasciculus, the bundle of nerves connecting them. The first system diagram was due to Ludwig Lichtheim, 1845-1928 [Lichtheim, 1885]; however, Wernicke developed a more comprehensive system architecture [Eggert, 1977]. This led him to develop a system approach to describing the functioning of the brain, and to classify neurological disorders as due to malfunction of different subsystem components and of connections among them [Wernicke, 1881] [Wernicke, 1894] [Wernicke, 1886]. This approach was made possible by its grounding in Meynert's neuroanatomical architectural framework.

Figure 4.2 is my version of Wernicke’s diagram, based on one of his overview papers [Wernicke, 1886].

[Figure content: concept representations linked to auditory images and motor images, with numbered connection points 1-7, a commissure, and pathways for hearing and speaking between the hearing organs and the speech organs.]

Figure 4.2: Wernicke’s system diagram

This diagram allows the explanation of seven different types of aphasia, due to disconnection at the seven indicated points, namely:
(1) cortical alexia - loss of the ability to read and write,
(2) subcortical alexia - loss of the ability to read, while writing is unimpaired apart from copying,
(3) transcortical alexia - loss of the ability to read and write, except for copying written material,
(4) cortical agraphia - fine control for writing and copying is greatly impaired, but reading is unimpaired,
(5) subcortical agraphia - similar to 4,
(6) transcortical agraphia - should prevent spontaneous writing and allow copying, however it probably does not exist as a separate condition, and
(7) conduction agraphia - reading is undisturbed and normal writing is lost, but “paragraphic” writing may be possible.
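
As a toy restatement of this style of explanation (my own, not Wernicke's notation), the numbered disconnection points can be tabulated and used as a lookup from lesion site to predicted disorder; the names below are exactly those in the list above:

# Disconnection point (1-7 in Figure 4.2) -> disorder predicted by the diagram.
SYNDROMES = {
    1: "cortical alexia",
    2: "subcortical alexia",
    3: "transcortical alexia",
    4: "cortical agraphia",
    5: "subcortical agraphia",
    6: "transcortical agraphia",
    7: "conduction agraphia",
}

def diagnose(lesioned_point):
    """Predicted disorder for a disconnection at the given numbered point."""
    return SYNDROMES[lesioned_point]

print(diagnose(7))   # conduction agraphia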

Figure 4.3 shows how this diagram would typically be extended to include visual perception and also writing output.

[Figure content: the extended diagram adds visual representations and seeing organs, and a copying output for writing via the writing organs, alongside the concept representations, auditory images, motor images, commissure and numbered connection points of Figure 4.2.]

Figure 4.3: Extended system diagram

At that time, it was hoped that all psychiatric problems could be dealt with as neurological problems. However, this proved impossible, and a separate discipline of psychiatry, not requiring neurological description, was developed by Emil Kraepelin, 1856-1926; psychoanalysis, of course, was developed by Sigmund Freud, 1856-1939.

In 1909, the neuroanatomist Korbinian Brodmann, 1868-1918, described the partitioning of the neocortex into distinct areas [Brodmann, 1909]. For primates, there were about 50 areas on each hemisphere, see section 3.2 and Figure 3.10. This brilliant work was based only on the anatomical appearance of the populations of neurons comprising the cortex.

There was another line of conceptualization, which concerned the integrated system functioning of the brain and its diseases. Basic concepts of system evolution and dissolution were described by Herbert Spencer, 1820-1903 [Spencer, 1863] (his chapter on dissolution). John Hughlings Jackson, 1834-1911, developed concepts of the hierarchical architecture of the brain and the idea of dissolution as a method of characterizing neurological disorders [Jackson, 1931] [Kennard and Swash, 1989]. Freud acknowledged that he was influenced by Hughlings Jackson.

It was about this time that the forces of the insecure, the overcritical, and the afraid came into action, with the inevitability of winter. Broca's work was trashed by Pierre Marie, who in 1905 examined the brain of Broca's patient, which had been preserved, and concluded that there was little localization. Thereafter none of the system theorists was allowed to work in the French medical system. In England, the leading neurologist Henry Head described the work of the “diagram makers” as chaos, and replaced it by a detailed empirical methodology with minimal theoretical expression. Wernicke had apparently offended an official while a junior researcher, and thereafter for most of his life he was denied any German federal chair. Tim Shallice [Shallice, 1988] has observed that the names of Broca and Wernicke have survived, whereas those of Marie and Head have fallen into obscurity. In 1984, using the newly available CT imaging, the still preserved (!) brain of Broca's patient was imaged and studied with modern techniques by Jean-Louis Signoret et al [Signoret et al., 1984]. Broca's description of it was upheld.

Neurological interest in how the brain processes language resurged in the 1950s, for example among Russian neurologists led by Alexander Romanovich Luria, 1902-1977 [Luria, 1970] [Luria, 1978] [Luria, 1980], treating the large numbers of brain-injured soldiers from the second world war. Roman Jakobson, 1896-1982, applied linguistic criteria to describing neurological disorders of language processing [Jakobson and Halle, 1956]. Norman Geschwind, 1926-1984, led the corresponding resurgence in America of system descriptions and the recognition of disconnection syndromes in language processing [Geschwind, 1965] [Geschwind, 1966] [Geschwind et al., 1968].

During the 1950s and 1960s, advances were made in describing the architectural layout of the brain using lesion methods, i.e., the selective destruction of different parts. This led to the idea of hierarchically connected sequences of brain areas, proposed by Edward Jones and Thomas Powell in 1970 [Jones and Powell, 1970]. This work confirmed Brodmann's partitioning of the brain into areas. In the 1970s and 1980s, tracing methods were developed which allowed more detailed tracing of the connectivity of the brain and the delineation of the different brain areas. This allowed Deepak Pandya and coworkers [Pandya and Yeterian, 1990] to confirm in greater detail the hierarchical structure of the cortex first indicated by Jones and Powell, see chapter 11. From the 1980s on, starting with Per Roland's PET studies [Roland, 1993], brain imaging has provided a new type of data in which activations of different brain areas are observed under different experimental conditions.

4.3 Psychology

Psychology was given an experimental foundation by Wundt in the 1860s, and then became a separate discipline from neurology and philosophy. In addition, psychiatry and clinical psychology separated at about the same time, so that by the year 1900 these three components were running concurrently. Psychology proper, henceforth referred to as experimental psychology, included human and animal behavior. By clinical psychology I mean the study and treatment of neurotic behaviors, and I mean to include Freud and Janet and their colleagues.

The main issues in experimental psychology were memory, thinking, perception, motor skills, and motivation.

The main issues in psychiatry were the objective testing and treatment of the severely mentally ill, including psychotics.

The main issues in clinical psychology were hysteria, dissociation and overall severe neurotic conditions. These arose in patients presenting with behavioral problems, without any known neurological problems.

Emil Kraepelin is justly called “the father of modern psychiatry”. He was the first to identify schizophrenia (originally called dementia praecox, meaning early, “precocious”, senility), manic-depression (bipolar disorder) and paranoia [Kraepelin, 1921], and he pioneered the use of drugs to treat mental illness. He was also joint discoverer of Alzheimer's disease (which he named after his collaborator, Dr Alois Alzheimer) [Kraepelin, 1922]. Kraepelin presented these and other discoveries in successive editions of his “Psychiatrie: Ein Lehrbuch” [Kraepelin, 1899]. He disagreed with Freud and did not use the notion of the unconscious. His legacy reaches us via his diagnostic manual, which became the standard DSM manual used by all psychiatrists in the USA, and which in some sense defines the field of psychiatry.

Figure 4.4 attempts to give a diagrammatic overview of some of the main researchers and their contributions.

4.3.1 System models in psychology

There have been many different theoretical bases suggested for psychology. Let me mention two main classes: first the clinical models of Freud and Janet, and then the information-processing models of short-term memory.

[Figure content: a timeline of researchers under the topics perception, association, memory, thinking/intelligence, emotion, dreams, personality/types, development, consciousness and social psychology, running from Plato and Aristotle through Hobbes, Wundt, Weber, Fechner, Ebbinghaus, Mach, James, Jung, Binet, Piaget and Bartlett.]

Figure 4.4: Events in the history of psychology

Freud. In his lifetime, Sigmund Freud described three different system models, namely: (i) his “Project for a scientific psychology” model [Freud, 1895], in which neurons with psychic energy sought release and formed unconscious, preconscious and conscious parts; (ii) his “transcription” model [Freud, 1900] [Olsen and Koppe, 1988], introduced in his “Interpretation of Dreams” book, in which the brain had a series of information-processing stages and information was rewritten in different forms from one stage to the next, see Figure 4.5; and (iii) his “ego, id and superego” model [Freud, 1923], which had a hierarchical structure with repressed memories being pushed down to an unconscious level.

Figure 4.5: Freud’s diagram of his transcription model, from [Freud, 1900]

Freud acknowledged, see for example [Freud, 1894], the ideas of Pierre Janet, who introduced the idea of dissociation in the early 1890s; however, Freud developed in a different direction, introducing repression instead of dissociation, infantile sexual fantasy instead of different kinds of trauma, and an aggregated unconscious based on psychic energy.

Janet. Janet’s [Janet, 1886] [Janet, 1891] [Janet, 1894] [Janet, 1898] [Janet, 1901] [Janet, 1904] [Janet, 1907] [Janet, 1919] [Janet, 1971] dissociation model has better stood the test of time, being embraced as a precursor of modern theories of dissociation which are used to understand post-traumatic stress disorder [der Hart and Friedman, 1989] [van der Kolk et al., 1996]. An important main phenomenon both scientists tried to un- derstand in the nineteenth century was hysteria in which patients could not perform certain normal actions or else felt compelled to perform certain abnormal actions. Janet did not use the notion of unconscious and had a central conscious subsystem with higher levels of processing which could be dissociated laterally from each other. Dissociation was caused by trauma which lead to memories that were not integrated into the subject’s narrative. Such traumatic memories had different dynamics from normal memories, they could be triggered by specific stimuli, they retained salience and did not fade with time, and they had a tendency to express themselves in behavior either overtly or covertly. Thus the main explanatory idea was that of failure to integrate memories due to trauma. This could lead to various symptoms and to selective amnesia.

Janet’s clinical treatment consisted of attempts to reintegrate memories, using hypnosis and other techniques. Freud’s main approach was to reveal and make conscious the unconscious memories by free association and by talking out in a therapeutic relationship.

More recent psychotherapeutic work. There has been a systematization of defense mechanisms, which eventually led to a more precise description by Kenneth Colby [Colby, 1963]. Later psychotherapeutic models include object relations, due to Fairbairn [Fairbairn, 1952], self models, due to Kohut [Kohut, 1971], and intersubjective models, due to Ferenczi [Sándor, 1955] and more recently Stolorow [Stolorow et al., 1987]. Of these, object relations ideas are the most clearly defined; however, they are still some distance from any kind of predictive mathematical model.

Kenneth Colby. Colby also developed a model of paranoia which was successfully implemented as a computer model [Colby et al., 1971]. This used the idea of a dynamic self-esteem variable which tended to be depressed by incoming messages, leading in turn to hostile responses. This very innovative work unfortunately was always dependent on being able to incorporate natural-language understanding, and this formed a barrier to further elaboration of the underlying psychic mechanisms.

Information-processing memory models. Following the development of the concepts of information and channel in the second world war, Donald Broadbent introduced these ideas into psychology [Broadbent, 1958], mainly in the context of understanding attention and auditory perception. This led to the idea of a short-term memory in which incoming data was held and rehearsed, the dominant model being due to Atkinson and Shiffrin [Atkinson and Shiffrin, 1968]. This was systematized further in Morton's Logogen model [Morton, 1970], which used a unified phonological code and storage for short-term memory items. This approach was elaborated further by Shallice [Shallice, 1988], and in general has had explanatory value, for example in understanding dyslexia. During the late 1960s and the 1970s, neuropsychological methods and phenomena were being introduced by Warrington and Shallice [Warrington and Shallice, 1969]. This clinical work discovered a range of phenomena caused by malfunction of various information-processing activities of the brain, in perception and memory. Another development has been working memory and the model of Alan Baddeley, which has auditorily and visually coded stores used for thinking as well as perception [Baddeley, 1986]. These models, although systemic, have been used descriptively, in natural language, rather than mathematically with precise computer models. They are not defined completely enough for computer implementation, and the use of more precise models has not been seen as scientifically useful for better understanding the phenomena of interest.

Chapter 5

The history of formal description

Abstract. In this chapter, I regard formal logic, theoretical computer science, and artificial intelligence as forming a unity which stems from the need for precise formal description in mathematics, computer science and natural science.

I trace the desire for precise description through Frege to modern predicate logic. I explain the main concepts of predicate logic, its strengths and weaknesses, and the historical struggle to develop it.

At the same time, I trace the development of theoretical computer science from Babbage through Gödel, Turing and Church to artificial intelligence, modern theoretical computer science and logic programming. This explains how predicate logic became an important method for describing computation.


5.1 Introduction

It is a fundamental property of the human mind to be constantly trying to describe its environment, the events that are occurring and ultimately to try to describe itself. Marvin Minsky [Minsky, 2003] reminds us that the real problem is being able to describe high-level human thought and consciousness.

Figure 5.1 diagrams some important events in the history of formal description. I have not included neural nets as, although mathematical and precise, I don’t think that in their present form they can be called formal description methods; they are based on real analysis.

5.1.1 Using natural language for scientific and mathematical description

Description consists of expression in some external communicable form. In science, this has resulted in the development of precise and technical language which has allowed unambiguous and precise communication of ideas and scientific findings and principles.

For several hundred years, science was also communicated in the common international language of Latin. This was replaced in the eighteenth century by German, French, English, Russian, etc., and scientists usually learned to read these languages. Proficiency in German was an official requirement for entering university to study science in England until the 1960s. The author had to attend “scientific German” classes in high school, since London University, for example, required a pass in this subject for all entering undergraduates. He went to Oxford, which required Latin instead!

[Figure content: a timeline in columns for philosophy and logic, theoretical computer science, and artificial intelligence, running from Aristotle, Ramon Llull and Leibniz's calculus of ideas, through Frege's concept script (1879), the paradoxes, Russell and Whitehead (1906), Hilbert's logicism (1900), Gödel's incompleteness theorem (1931), Tarski's model theory (1934), Turing machines (1935), Zermelo-Fraenkel set theory, Church's functional calculus (1941), neural nets (1946), Turing's chess program (1950), cellular automata (1953, von Neumann), finite automata (1954), artificial intelligence (1956, Newell and Simon, Minsky), AI programming languages and formal language theory (1956-57), Robinson's theory of the real numbers (1961), automata and languages (1962), functional programming and structure definitions, automated theorem proving and the resolution principle (1965), complexity theory (1967), knowledge representation and Hewitt's Planner (1967), database theory (1970), model theory of functions (1973), logic programming (1973), frames (Minsky, 1974), logic grammars (1976), the theory of processes (1978), SOAR (1981), deductive databases (1985), multiagent systems (1988), and statistical natural language learning (1995).]

Figure 5.1: Events in the history of formal description

Technical English involves speaking in a special style and using technical vocabulary and terms that are explicitly defined. This reaches its most developed form in mathematics. The aim is to ensure that there is no ambiguity, and this has been successful; mathematicians have been able to communicate the most subtle and complex ideas and to work together.

5.1.2 Logic in natural language

Before Frege, logic was conceived as using natural language, and Aristotelian syllogisms were used for logical inference. Thus, in reasoning, one matched a syllogism to a natural language sentence and derived a new natural language sentence. Sentences were structured as subject and predicate, which were quantified over separately. Incidentally, Aristotle also included some inductive reasoning and some reasoning by analogy, whereas later logics only include deduction, with induction being treated as a separate logical issue, and analogy being treated as artificial intelligence.

A description then consists of some externalized form, which we can take to be expressions in some language. In order for this to work, the receiver or beholder has to perceive these expressions, and also has to understand the language in which they are expressed, so there are these two aspects, syntax and semantics.

5.2 Formal logic

The yearning for a precise language for describing thought can be traced back to Ramon Llull, but in particular to Gottfried Wilhelm von Leibniz.

Ramon Llull, 1235-1316, was a Majorcan theologian whose major work “Ars Magna” was a set of treatises, including “The Tree of Knowledge” and “The Book of the Ascent and Descent of the Intellect”. This was at the height of the Islamic empires, and he read extensively in the Arabic tradition and the logic of Al-Ghazzali. He wrote in Latin, Arabic and Catalan. His books included a design for a reasoning machine: a set of concentric disks with words written on them, which could be rotated to display different statements formed from combinations of these words. His idea was that one statement could be set up on the disks and this would result in other statements also being displayed, thus giving a kind of mechanical inference. Llull tried to use logic and mechanical methods involving symbolic notation and combinatorial diagrams to relate all forms of knowledge. He attempted to reduce Christianity to rational discussion, to prove the dogmas of the Church by logical argument. “Without producing, no man can love, nor can he understand or remember, nor have the power of feeling and being.” Ramon Llull, “The Hundred Names of God”.

In the seventeenth century, Gottfried Leibniz, 1646-1716, born in Leipzig, envisioned a formal language, a calculus, that would capture and embody all truth and valid reasoning. He called this the characteristica universalis. “If we had it, we should be able to reason in metaphysics and morals in much the same way as in geometry and analysis ... If controversies were to arise, there would be no more need of disputation between two philosophers than between two accountants. For it would suffice to take their pencils in their hands, to sit down to their slates, and to say to each other ... Let us calculate.” G. W. Leibniz, from Gerhardt [Gerhardt, 1890]. “Leibniz's vision may have been absurdly ambitious, but the ideal was to influence many subsequent philosophers, most notably, Frege and Russell”, from the Stanford Encyclopedia of Philosophy, see also [Russell, 1900]. Leibniz was also interested in numerical calculation, and in 1673 he demonstrated his incomplete calculating machine, which could multiply, divide and extract roots, to the Royal Society.

Friedrich Ludwig Gottlob Frege, 1848-1925, developed the first ever formal language.

He called it a “concept script” (Begriffsschrift); his landmark paper, published in 1879 [Frege, 1879], was entitled “Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens”, that is, “Concept notation: a formula language of pure thought, modeled upon that of arithmetic”. It had most of the descriptive ideas that are used in predicate logic to this day: boolean operators, predicates and quantifiers as used in modern predicate logic. (A predicate is a function whose values are truth values, i.e., true or false, and a boolean operator is one which combines truth values, such as and, or and not.) Frege was the first major proponent of logicism, the view that mathematics is reducible to logic. Thus, all mathematics and science would eventually be expressed as formal logical expressions, with derivations being found by logical inference. Frege's “Grundgesetze der Arithmetik” [Frege, 1893] was an attempt to explicitly derive the laws of arithmetic from logic.

Figure 5.2 shows Frege’s concept script and its relation to modern predicate logic, thus it involved the formal or mathematical use of universal quantifiers, meaning “for all individuals such that” and of existential quantifiers, meaning “there exists an individual such that”, individuals being the notional things that the logic makes assertions about.

Frege’s was a second order logic. By first order we of course mean involving quantification over individuals only, and second order involving quantification over predicates as well.

Formal description, that is formal (or mathematical, or symbolic) logic, introduced formal languages for scientific communication. By formal we mean that transformations are based solely on the form of an expression and not on its meaning. The main aim originally was to make the understanding of the language extremely easy, in fact purely mechanical. Thus, all the recipient had to do was read a formal expression as a sequence of characters, which had a simple syntax. Then, in order to use the expression, all they had to do was apply logical schemata which involved only matching one bracketed expression to another. This process could be, and of course eventually was, performed by a machine.


[Figure content: a table relating logical notions (negation, the conditional, universal and existential quantification over individuals, and quantification over predicates) to Frege's two-dimensional concept-script notation and to modern notation such as ¬F(x), F(x) ⊃ G(y), (∀x)F(x), (∃x)F(x), (∀F)F(a) and (∃F)F(a).]

Figure 5.2: Frege’s concept script

In 1910, Bertrand Russell, 1872-1970, and Alfred North Whitehead, 1861-1947, developed a symbolic language for describing the mathematics of numbers, with mathematical proofs of their properties [Russell and Whitehead, 1910]. At that time there was a host of difficult paradoxes, most of which stemmed from circular definitions. These were in the main removed by using a typed theory, with a hierarchy of types of individuals and functions. The arguments of given predicates were restricted to certain types of data, and any quantification was over the values of some given type. This restriction prevents most circular definitions and statements.

The original logicism postulated that all of mathematics could be derived from a set of logical axioms using some basic rules of inference. One possible set is as follows, taken from [Mendelson, 1979].

Logical axioms. If A, B and C are any logical formulae, then the following logical statements are always true:
1. A ⊃ (B ⊃ A),
2. (A ⊃ (B ⊃ C)) ⊃ ((A ⊃ B) ⊃ (A ⊃ C)),
3. (¬B ⊃ ¬A) ⊃ ((¬B ⊃ A) ⊃ B),
4. (x)A(x) ⊃ A(t), provided t is free for x in A(x), i.e., no free occurrence of x in A(x) lies within the scope of any quantifier (y) where y is a variable of t (x occurs free in a formula A if it is not within the scope of any quantifier (x) occurring in A), and
5. (x)(A ⊃ B) ⊃ (A ⊃ (x)B), provided x does not occur free in A.
A possible set of rules of inference is:
(i) Modus ponens: from A and (A ⊃ B), infer B, and
(ii) Generalization: from A, infer (x)A.
The symbol “⊃” means implies, “¬” means not, “∨” means or, and “∧” means and. The symbols “⊃” and “¬” and statements 1-5 are in the object language; the rules (i) and (ii) are in the metalanguage, in which we express statements about the object language.

These days it is accepted that in order to do mathematics we need appropriate non-logical axioms, usually called proper axioms, in addition to any logical axioms. For example, for set theory we need about 20 proper axioms which define set theory. In the resolution formulation used in logic programming on computers, to be described in section 6.1, there are actually no logical axioms at all, just proper axioms.
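
To make rule (i) concrete, here is a minimal Python sketch (mine, not from [Mendelson, 1979]) that closes a set of propositional formulas under modus ponens; the tuple encoding of implications is an assumption made for illustration:

# A formula ("imp", A, B) stands for A ⊃ B; atoms are strings.
def forward_chain(facts):
    """Close a set of formulas under modus ponens."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for f in list(known):
            if isinstance(f, tuple) and f[0] == "imp" \
                    and f[1] in known and f[2] not in known:
                known.add(f[2])        # from A and A ⊃ B, infer B
                changed = True
    return known

theory = {"p", ("imp", "p", "q"), ("imp", "q", "r")}
print(forward_chain(theory) >= {"q", "r"})   # True: q, and then r, are derived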

In 1900, David Hilbert, 1862-1943, developed logical descriptions for plane geometry [Hilbert, 1902b] and an approach for the real numbers. Hilbert also articulated his overall philosophy for mathematics, which was that everything should be precisely definable in a formal logical language, and that formal proofs of all theorems using this language should be discoverable. This question of the discoverability of formal proofs was called the Entscheidungsproblem (from the German entscheiden, to decide; scheiden, to separate). A closely related question was the tenth in the famous set of problems that Hilbert presented to the world in 1900 [Hilbert, 1902a] [Browder, 1976]: “Given a diophantine equation with any number of unknown quantities and with rational integral numerical coefficients: To devise a process according to which it can be determined by a finite number of operations whether the equation is solvable in rational integers.” (A diophantine equation is a polynomial equation for which the solutions are required to be integers, for example, to find integers x, y and z such that x⁴ + x²y + y³ = z⁴.)

Other formal axiomatizations include classical mechanics by Hamel in 1903, thermodynamics by Caratheodory in 1909, special relativity by Robb in 1914, and probability theory by Kolmogorov in 1933 [Gray, 2000]. Rudolf Carnap also developed many formal axiomatizations, including of biological phenomena [Carnap, 1958]. The axiomatization of quantum mechanics was developed by von Neumann in the 1930s [von Neumann, 1932] [Birkhoff and von Neumann, 1936] [Lacki, 2000], and later by Mackey and others [Mackey, 1963], and that of quantum field theory by Wightman in the 1960s [Velo and Wightman, 1973]. However, there remained problems with formalizing the notion of set, and also the theory of the real numbers, in first-order predicate logic. Axiomatizations of the reals by Hilbert in 1900, and then also by Coolidge and by Tarski, were all in second-order logic. However, a workable axiomatization of set theory was developed by Zermelo and Fraenkel in the 1930s.

In 1930, Kurt Gödel, 1906-1978, showed the incompleteness of any consistent theory of the natural numbers [Kurt Gödel, 1930]. Thus there would always be true statements about the natural numbers that could be stated in the theory but never proved within it.

In 1936, Alan Turing, 1912-1954, set out to attack Hilbert's decision problem, and devised his Turing machine [Turing, 1936] [Turing, 1937]. This original idea of computation was a mechanical process using a finite alphabet of discrete symbols, and using a discrete time scale. To remind the reader of the Turing machine, referring to Figure 5.3, this is a device which can at any one time be in one of a fixed set of states, and it has a reading head which is always under one square of a tape of unbounded length. The device operates by reading the symbol under the reading head and looking up, in a fixed state transition diagram, what state to transition to next and what action to take. An action consists of writing a symbol on the tape and then moving right one square, or moving left one square, or not moving, or halting. The notation used is read symbol/write symbol/next state; thus, for example, b/#/R means that if the read/write head reads a b symbol, then it writes a # symbol over it and moves one place to the right. The rest of the tape is taken to be filled with # symbols. The example in the figure starts with a string of a symbols and b symbols, and decides, true or false, whether it is a string of a's followed by a string of b's of the same length. If true it writes a T symbol on the tape and halts, and if false an F symbol.

[Figure content: the machine's tape format (####aaabbb#####) and read/write head; a control diagram with a start state at the left end, scan right and scan left states, a test-if-finished state and a halt state H; transitions labeled read symbol/write symbol/next state, such as a/#/R, b/#/L and #//R, ending with #/T/H for acceptance and a/F/H or b/F/H for rejection.]

Figure 5.3: Example of a Turing machine

The way this particular Turing machine works is to move right until it finds the first, i.e., leftmost, a, to replace it with a # symbol, and then to move right until it finds the first #. It then steps one square to the left, which should hold a b, and replaces it with a #. It then moves left until it finds a # again; the square to the right of this should hold an a, in which case another round begins, or else a #, in which case it has succeeded in recognizing a string of the form aⁿbⁿ. Situations which are not as expected lead to termination with failure to recognize.
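
For readers who prefer running code to state diagrams, the following Python sketch (my own) mimics the machine just described; instead of writing T or F on the tape, it returns a boolean:

def run_anbn(s):
    """Decide whether s has the form a^n b^n, using Turing-machine moves."""
    tape = dict(enumerate(s))              # unwritten squares read as blank '#'
    pos, state = 0, "at_left_end"
    while True:
        sym = tape.get(pos, "#")
        if state == "at_left_end":         # erase the leftmost a
            if sym == "a":
                tape[pos] = "#"; pos += 1; state = "scan_right"
            else:
                return sym == "#"          # blank: all matched; a b here: reject
        elif state == "scan_right":        # run right to the first blank
            if sym == "#":
                pos -= 1; state = "erase_b"
            else:
                pos += 1
        elif state == "erase_b":           # rightmost remaining symbol must be b
            if sym != "b":
                return False               # an unmatched a (or nothing): reject
            tape[pos] = "#"; pos -= 1; state = "scan_left"
        elif state == "scan_left":         # run left back to the blank boundary
            if sym == "#":
                pos += 1; state = "at_left_end"
            else:
                pos -= 1

print(run_anbn("aaabbb"))   # True
print(run_anbn("aab"))      # False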

As a result of his construction, the Turing machine, and its use in representing attempted solutions of decidability problems, Turing realized that there would be questions which were impossible to decide by any computational process.

The same device but without the tape, where it just inputs and outputs one symbol each time step, is called an automaton.

Turing was also able to construct a universal Turing machine. As shown in Figure 5.4, this machine can simulate any given Turing machine. The way this is done is to put a description, or code, of the given Turing machine as a sequence of characters on the tape; the universal Turing machine then just keeps reading what to do next from this code. This encoding of a machine is an example of a program. There are by now many other constructions for universal Turing machines, some of which are very simple; I believe the current record for conciseness has only 6 states.

Kurt Gödel also made the connection between formal proof and symbolic computation. He showed that all computations could be expressed as logical proofs and vice versa, so formal reasoning and computation were the same thing.

These twin developments by Gödel and Turing were a major setback to the Hilbert program for the foundation of mathematics. However, the use of typed theories, and the consistent set theory due to Zermelo and Fraenkel, allow logical approaches to still be used. The Gödel result is also not fatal, because extensions of arithmetic theories can be developed that will allow the proof of those statements he showed undecidable in his original arithmetic theory; however, these extended theories will in turn have their own undecidable statements.

[Figure content: the universal machine's tape holds an encoded description of the control diagram of a machine M, the current state of M, and a tape region for the computation by M; the universal machine's control diagram repeatedly reads M's current state and the symbol being read by M, looks up the corresponding transition in the encoding, writes M's write symbol on the computation region, and records M's next state.]

Figure 5.4: The idea of a universal Turing machine

Four main ideas for describing a discrete information-processing machine were developed in the 1930s, and they remain the main ones today [Minsky, 1967] [Hermes, 1965] [Davis, 1958]. These are:
(i) The automaton approach [Turing, 1936] [Davis, 1958], which makes state transitions according to some table, and where there is some external storage medium like a tape or tapes, stacks, registers, etc.
(ii) The rule-system approach of Post and of Markov [Minsky, 1967] [Hermes, 1965] [Davis, 1958], where there is a set of rules which operate on a working string of symbols. If the set of rules is unordered, this is a Post system; if the rules are in a fixed order and are executed by always scanning from the top after each time step, until an applicable rule is found and executed, this is a Markov algorithm (see the sketch after this list).
(iii) The functional evaluation approach of Church and of Kleene [Church, 1941] [Kleene, 1971], where every expression denotes a function, and computation starts with an applicative expression, representing the application of a function to some arguments which themselves represent functions, and reduces this expression, using given reduction operations, until an expression is derived which cannot be reduced any more and is the representation of the value of the original applicative expression. This is the lambda calculus of Church. In the recursive approach of Kleene, a set of recursive definitions is given and evaluated by recursive calling.
(iv) The logical proof approach [Kurt Gödel, 1930] [Davis, 1958], where a formal theory is given as a set of expressions, and then either one works forward, inferring new expressions, or one is given a hypothesis and systematically works backward, developing a proof tree which proves the hypothesis is true.
As the reader knows, every one of these approaches for describing information-processing machines can simulate any of the others, and the set of all functions computable by each approach is the same.
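
As an illustration of approach (ii), here is a minimal Markov algorithm interpreter in Python (my own sketch); its one-rule example program cancels matched ab pairs, and so decides whether the working string has the form aⁿbⁿ:

# Rules are scanned in order; the first applicable rule rewrites the leftmost
# occurrence of its left-hand side, and the scan restarts from the top.
def markov(rules, s, max_steps=10000):
    for _ in range(max_steps):
        for lhs, rhs in rules:
            if lhs in s:
                s = s.replace(lhs, rhs, 1)   # rewrite the leftmost occurrence
                break                        # restart the scan from the top
        else:
            return s                         # no rule applies: the algorithm halts
    raise RuntimeError("step limit exceeded")

rules = [("ab", "")]                         # erase one matched pair per step
print(markov(rules, "aaabbb") == "")         # True: the whole string cancels
print(markov(rules, "aab") == "")            # False: a leftover 'a' remains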

During this period, the mathematical notion of a model was developed, a key advance being made by Alfred Tarski in 1936 [Tarski, 1936] (see [Tarski, 1956]). A formal theory is a set of logical expressions which are asserted to be true. Tarski developed a formal mathematical concept of an interpretation of any given theory, see Figure 5.5. This also gave a precise formal way of defining the truth of a logical statement, namely whether the statement holds in the interpretation. An interpretation for which a theory is true is called a model of the theory. A theory may be true in some interpretation, in which case it is called satisfiable, or it may be true in all possible interpretations, in which case it is called valid, or it may not be true in any interpretation, in which case it is called unsatisfiable.
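
Tarski's definition is directly executable for finite interpretations. The following Python sketch is mine (the tuple encoding of formulas is an assumption made for illustration); it checks whether a statement like the one in Figure 5.5, here simplified to (x)(Ey)P(x,y), holds in a given interpretation with a two-element universe:

# Formulas as tuples: ("P", "x", "y") is atomic, plus ("forall", v, f),
# ("exists", v, f) and ("and", f, g). The interpretation supplies a universe
# and a table (set of tuples) for each predicate letter.
def holds(formula, interp, env=None):
    env = env or {}
    op = formula[0]
    if op == "forall":
        _, var, body = formula
        return all(holds(body, interp, {**env, var: d}) for d in interp["universe"])
    if op == "exists":
        _, var, body = formula
        return any(holds(body, interp, {**env, var: d}) for d in interp["universe"])
    if op == "and":
        return holds(formula[1], interp, env) and holds(formula[2], interp, env)
    pred, *args = formula              # atomic: look the value tuple up in the table
    values = tuple(env.get(a, a) for a in args)
    return values in interp[pred]

interp = {"universe": {1, 2}, "P": {(1, 2), (2, 1)}}
statement = ("forall", "x", ("exists", "y", ("P", "x", "y")))
print(holds(statement, interp))   # True: this interpretation is a model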

[Figure content: a theory (an alphabet of symbols and a set of logical statements, such as (x)(Ey)(P(x,y) & R(a))) paired with an interpretation supplying a universe of individuals, a predicate for every predicate letter, a function for every function letter, and an individual for every constant; in a model, every statement of the theory is true in the interpretation.]

Figure 5.5: Tarski’s concept of an interpretation of a theory

Skolem and Löwenheim showed that every consistent theory always has a denumerable model [Mendelson, 1979], i.e., a discrete model which is only as large as the set of integers. Herbrand showed how, given a consistent theory, to construct a denumerable model for it, the so-called Herbrand model of the theory [Herbrand, 1930], see [Herbrand, 1971] and section 18.2. Henkin showed that we can construct a model while restricting ourselves to interpreting every function as a computable function [Henkin, 1950].

In the 1960s, Abraham Robinson showed how to axiomatize the real numbers in first-order logic [Robinson, 1974]. This allowed him to formalize limit arguments and many of the results of calculus.

5.3 Theoretical computer science

Although Charles Babbage had shown in the 1840s how to build a machine for carrying out mechanical computations, it was only when such machines were actually built and used in the 1940s, using electronics, that computer science began to develop as a discipline. Although mainstream computer science has necessarily been concerned with the practicalities of present-day machines, their design, construction, programming and use, there has always been accompanying theory development which has studied a much wider class of theoretical computational machines. This led in the 1950s to the study of neural net models and cellular automaton models.

At the same time, mainstream computer science was developing the concept of the automaton and its connection to formal language processing, programming language design, both syntax and semantics, and operating system design and the organization of computer systems. Notions of functional programming were developed, based on Church's functional approach. Theories of data structures and databases were developed in the 1960s.

Theoretical studies have elucidated abstract properties of automata, abstract properties of formal languages including syntax and semantics, also theories of coordination of sets of serial processes. There are general theoretical bases for the notions of computability and of complexity of computation.

5.4 Artificial intelligence

Goal trees, backtracking and symbol structures. In 1956, Allen Newell, J. Clifford Shaw and Herbert A. Simon developed their first artificial intelligence (AI) program, which they called the Logic Theorist [Newell and Shaw, 1957]; it solved problems in propositional logic, and at the same time they developed the programming language IPL [Newell and Shaw, 1957]. IPL-IV, i.e., version 4 of IPL, was used for the first versions of their General Problem Solver program, GPS, in 1958 [Newell et al., 1958]. A more standardized and usable version, IPL-V, was developed in 1959 and used for some early AI programs, including those developed by Feigenbaum and by Feldman. Thus Newell's first idea of AI was “cognitive programming” embodied in the IPL language. IPL is a bit like an assembly code, i.e., it is laid out on a line-by-line basis, each line corresponding to an element of a list of symbols, which denote data or operations [Newell et al., 1960]. A list is thus a sequence of symbols or lists, and was written, for example, as:

NAME  PQ  SYMB  LINK  HO  COMMENTS
L1    0   S1
          S2
          S3    0

A program is written, for example, as:

NAME  PQ  SYMB  LINK  HO  COMMENTS
      10  L1          0   input list L1
          R1    L1        find last symbol of L1
          R1    S4        find last symbol of sublist found previously, say B2
      20  W1          B2  output to W1
          0

where: NAME gives a name to the list, P indicates that the symbol is an input, Q gives the level of indirection of reference, SYMB is a symbol designating a data token or the name of an elementary process, LINK is the name of a sublist to be used at this point, and HO is the communication cell. Program steps are calls to primitive routines, which were written in machine code and of which there were a large number, about 120. Each IPL line was represented in one 40-bit word of the JOHNNIAC computer, which had 4K words, i.e., 20K bytes = 0.02 megabytes of total store.

In [Newell and Shaw, 1957], the authors essentially invented list processing, and attributed part of its power to the use of symbols to designate structures and processes. A symbol in IPL-V actually had two values: first, its ordinary value, which was (the address of) a list structure, and second, a list of associated pairs holding other properties of the symbol, such as its name.
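
This two-valued symbol is easy to mimic in a modern language. The Python sketch below is my own, not IPL: it gives a symbol both an ordinary value, a list structure, and a property list of associated pairs:

# A toy two-valued symbol: an ordinary value plus a property list.
class Symbol:
    def __init__(self, name, value=None):
        self.name = name      # the print name
        self.value = value    # the "address" of a list structure (here, a Python list)
        self.plist = {}       # associated attribute-value pairs

s1 = Symbol("L1", value=["S1", "S2", "S3"])
s1.plist["type"] = "data list"
print(s1.value[-1], s1.plist["type"])   # the last symbol of L1, and a property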

In developing GPS as a model of human problem solving, and applying it to the experimental data of Moore and Anderson [Moore and Anderson, 1954] on logic problem solving of the type taught in university courses, Newell and Simon developed the AI programming techniques of goals, goal trees, methods, recursion and backtracking. Their accounts also talk of a cognitive model as an information-processing system, IPS, based on symbol structures and sequential programs whose steps evoked elementary information processes [Newell and Simon, 1961] [Newell, 1962] [Newell, 1963] [Ruimschotel, 1989]. So they saw their work as both artificial intelligence and cognitive psychology.

After this, in 1967, Newell courageously turned to a completely new way of programming, using rule systems, as a way of breaking out of the constrained control structure of GPS, which was based on recursion and backtracking. He went to a much more primitive control structure, a simple repetitive linear scan of the program with rule matching, out of which many other types of control structure could be synthesized. Newell's rule systems are Markov algorithm rule systems, and work on a working string of symbols which he identified with short-term memory.

The rule systems built by Newell and colleagues had symbols built into them. In 1972, Newell and Simon published their first book on this work, entitled “Human problem solving”. This consisted mainly of very long, detailed studies and models of human problem-solving behavior for three problems, namely cryptarithmetic, propositional logic and chess. However, the book also gave definitive statements of the fundamental ideas and postulates of their theoretical approach. They define an information-processing system, IPS, which is very similar to their original 1957 definition. The main differences were that: (i) the store is now divided into a short-term memory, STM, which is limited to 5-7 symbols, and a long-term memory, LTM; and (ii) programs are represented as rule systems which operate on STM, and store to and retrieve from LTM.
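
A toy version of such an IPS can be written in a few lines. The following Python sketch is mine, not Newell and Simon's code: it scans an ordered rule set against a capacity-limited STM, and lets a rule retrieve an item from LTM:

# Toy production system: ordered rules fire on STM; STM is capped at 7 symbols.
STM_LIMIT = 7

def step(stm, ltm, rules):
    for condition, action in rules:        # scan rules in order
        if condition(stm):
            action(stm, ltm)
            del stm[STM_LIMIT:]            # items beyond the capacity limit are lost
            return True
    return False                           # no rule fired: halt

ltm = {"cat": "animal"}                    # long-term memory: stored associations
stm = ["cat"]                              # short-term memory: the working symbols
rules = [
    (lambda s: "cat" in s and "animal" not in s,
     lambda s, l: s.insert(0, l["cat"])),  # retrieve the category from LTM
]
while step(stm, ltm, rules):
    pass
print(stm)   # ['animal', 'cat']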

In 1990, Newell published his “Unified Theories of Cognition” book, which gave an up-to-date statement of his approach to cognitive modeling. A symbol system is similar to his 1980 definition, but the architecture of the notional machine is now the SOAR machine, and knowledge is organized into problem spaces. A problem space is simply a set of rules; however, the point is that (i) the SOAR machine only works in one problem space at a time, which must have a name, and (ii) the set of problem spaces is complete, in the sense that problems that are encountered in any problem space can always be formulated and described during problem solving, and their solution attempted in another problem space in the set. This is called universal subgoaling, and is achieved by having standardized representations for all data types involved in the general SOAR problem-solving process itself. A clear description of SOAR can be found in [Laird, Rosenbloom and Newell, 1986].

I draw several conclusions from Newell's research: (i) the notion of a symbol as a token which represented a symbol structure was introduced from the beginning and stayed the same; (ii) the processing is always serial, with stacking of tasks, and often backtracking to previous goals on failure, built in; (iii) ergo, this architecture is not easily mappable onto brain architecture, which may not always use symbols, does use a lot of parallelism, and doesn't use strict stacking or backtracking; (iv) it does not use data types as a programming construct in problem spaces, whereas the brain probably handles different data types differently; (v) nevertheless, SOAR demonstrates for the first time a universal problem-solving process which is able to identify and formulate its own goals and to attempt their solution, and this is something that will ultimately need to be provided in any theory of human cognition.

Lisp. The other main language was LISP, developed in 1959 at MIT, which was a functional language with list processing operations and program representation as lists. Lisp uses bracketed expressions to denote list structures and programs, and it encourages a functional style of programming with recursive calls instead of iteration. Lisp made it easier to develop special-purpose AI languages for particular tasks.

Knowledge representation. In the 1960s, the idea of knowledge representation and metalevel description was developed by Marvin Minsky [Marvin Minsky, 1965] and Carl Hewitt [Hewitt, 1967] and others. Although there had been programming languages since 1956, their semantics was not clear and thus programs could not be used as descriptions.

Hewitt’s Planner language [Hewitt, 1967] was the first language in which knowledge could be represented directly and the semantics was clear: it was the formal deduction of new assertions from existing ones. A Planner program consists of a set of assertions and rules, where a rule has a pattern which matches any goal expressions that it applies to, and a body which specifies what goals have to be satisfied before the entire rule is satisfied and its main goal satisfied. Consider the following example:

(def-theorem tc-broke1
  (conse (x y) (broken ?x)
    (thgoal (fragile ?x))
    (thgoal (heavy ?y))
    (thgoal (on ?y ?x))))

This is a rule (called a theorem in Planner), whose name is tc-broke1, taking two arguments and having goal pattern (broken ?x). This theorem would be evoked if there was a goal which matched this goal pattern, for example (broken a). When the theorem is evoked it tries to satisfy each subgoal in its body in turn, by either finding existing facts or by evoking further theorems which match the subgoal. If and when the theorem can be executed to the end, then the goal (broken a) is satisfied. Thus a rule is the same as a theorem in predicate logic: it represents a known true statement. So a Planner program is a set of statements the programmer is asserting to be true. These statements can be either facts or theorems.
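
For comparison with the logic programming of chapter 6, the same knowledge can be written as a Prolog clause plus facts. This rendering is mine, not Hewitt's, and the facts are hypothetical illustrations; only the predicate names are carried over from the example.

% The Planner theorem as a Prolog clause.
broken(X) :- fragile(X), heavy(Y), on(Y, X).

% Hypothetical facts, for illustration only.
fragile(vase).
heavy(book).
on(book, vase).

% ?- broken(vase).   succeeds, just as the goal (broken a) evokes the
% Planner theorem and satisfies its subgoals in turn.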

Understanding a description in Planner basically required a computer to elaborate it and to answer questions about it, since descriptions were long and interacted with each other. A human could understand the descriptions but could not do much elaboration by hand.

So the concept of formal description, by which humans described, communicated and elaborated using the matching of expressions, was now taken to another level, in which a computer program interpreted the language, but the human could easily verify that they understood the description and could decide whether they agreed with it or not. This cannot be said of programs in normal programming languages, since those programs are much more complex and difficult to verify by hand, even if one uses a computer to elaborate them.

Frames. In 1974, Marvin Minsky introduced the concept of frame [Marvin Minsky, 1974], which was a larger kind of structure that held representations of many different aspects of a given situation or idea. This allowed one to describe all the different visual aspects of a given object and their interrelations. It also showed the way to the description of contexts, that is, outer situations which provide knowledge and impose constraints on more local descriptions.

Marvin Minsky and Seymour Papert’s MIT memo AIM-252 [Marvin Minsky and Seymour Papert, 1972] observed that children often draw unrealistically but in a way which demonstrates their knowledge of the depicted scene. An example is given in Figure 5.6, from [Marvin Minsky and Seymour Papert, 1972].

Figure 5.6: Geometric knowledge in children’s drawings

The example communicates that the box has four sides even though some are not visible.

Figure 5.7 gives an example of a visual frame from [Marvin Minsky, 1974].

A frame is a representation of all the knowledge that the system has about a given type of situation, and it can be matched to a given specific situation, identifying the components and giving their relationships, etc. Frames have slots which can be filled by matching the situation; slots can hold default values, and frames can be linked together. Frames can contain procedures, i.e., programs, for making transformations or calculating values.
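
As a minimal sketch of slots and defaults, here is an illustration of my own in Prolog (not Minsky's notation; the predicate names and facts are hypothetical):

% Frame types carry default slot values; instances may override them.
default_value(room, shape, rectangular).
default_value(room, number_of_walls, 4).

instance_of(kitchen1, room).
filler(kitchen1, number_of_walls, 3).    % an observed, non-default filler

% Slot lookup: prefer an observed filler, otherwise use the default.
slot_value(Instance, Slot, Value) :-
    filler(Instance, Slot, Value), !.
slot_value(Instance, Slot, Value) :-
    instance_of(Instance, Frame),
    default_value(Frame, Slot, Value).

% ?- slot_value(kitchen1, shape, V).            V = rectangular
% ?- slot_value(kitchen1, number_of_walls, V).  V = 3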

Frames can also be used in understanding natural language. The following example is from [Marvin Minsky, 1974]: “There was once a wolf who saw a lamb drinking at a river and wanted an excuse to eat it. For that purpose, even though he himself was upstream, he accused the lamb of stirring up the water and keeping him from drinking.”

Figure 5.7: Minsky’s concept of frame

There is a frame for this situation:
A upstream from B
B muddies water
A accuses B

There may be a mental image used to understand the verbally expressed situation.

There is other general knowledge active in this context, such as:
wolves eat lambs
stirring up makes water undrinkable
stirring up is temporary

Frames showed that knowledge structures could be large and complicated, to the point where they could provide a complete environment for processing. Frames were used in computer vision systems to represent all the knowledge required for visual perception of a known situation, and also in natural language processing for representing known situations being referenced in utterances. Patrick Winston’s AI textbook contains a comprehensive treatment of frames and their uses [Winston, 1993], and Minsky’s “Society of Mind” book develops the concepts further [Minsky, 1986].

Learning methods. Learning has always been of great interest in AI, and in the 1980s became a major subarea. Several new types of learning method were developed [Michalski et al., 1990] [Winston, 1993].

Multiagent systems. The topic of multiagent systems started in the mid-1980s [Bond and Gasser, 1988b] [Bond and Gasser, 1988a], and has developed into an international research area [Huhns and Singh, 1998] [Ferber, 1999]. This has generalized AI to involve cooperation among a set of intelligent agents, and this leads to a consideration of distributed knowledge, distributed planning, high-level communication methods, and distributed learning. We have only scratched the surface of the potential of this fertile area.

Statistical natural language processing. During the 1990s, statistical grammars were introduced for learning natural language syntax. These turned out to be wildly successful, allowing languages to be learned quickly and used in automatic natural language translation. For example, the parliamentary proceedings of Hong Kong, which are bilingual in English and Chinese, were used to learn a very accurate and complete statistical grammar for Chinese in only 3 days [Wu, 1994].

The failure of AI. It is often said, and indeed taught by the ignorant to the innocent, that AI “failed”. Their knowledge of the subject seems to be limited to answering just one Jeopardy 7 question: “It failed” - “ ‘What is AI?’, Alex”. The argument given is that some AI researchers predicted the development of super intelligent machines by now, and this hasn’t happened. However, as I have recounted above, the reality is that AI has developed new important and lasting concepts in computer science and continues to do so. Incidentally, this has been achieved with a very small number of researchers worldwide, probably a few hundred.

7Jeopardy is an American TV game show in which contestants are given an answer and have to generate the corresponding question. The original host of the show is Alex Trebek.

5.5 The choice of programming language

In the United States, functional programming has continued to be used for most artificial intelligence research, partly because of advantages of functional programming such as composability and scaling, and partly in an attempt to standardize the Lisp language for practical projects and thereby ensure financial support. Logic programming has however been used instead of Lisp in Europe, and in Japan, where it was the basis of the Fifth Generation computer initiative. Due to the need to save and to share programs and the cost of learning programming, the choice of programming language rapidly becomes a cultural one. Once most people are using a particular language, it is very advantageous to use that language, in spite of any faults and unsuitability it may have.

5.6 The intellectual revolution of computer science

Computer science has developed precise models and realizations for a number of key concepts in Western thought, notably those of process, representation and abstraction. By process, we mean some kind of ongoing activity in time; by data representation we mean a structuring of items corresponding to some entity of interest; and by abstraction we mean a relation whereby a representation represents classes of other representations and their general properties and behavior. These ideas had been used for centuries in imprecise ways and as a result could not be developed beyond a certain level. However, computer science has allowed us to construct and use processes, representations and abstractions as a precise science. This ability is also affecting research into philosophy and the foundations of mathematics. These theoretical developments in computer science are quite distinct from the practical use of computers and their great facilitation of most areas of human endeavor.

Chapter 6

Logic programming

Abstract. In this chapter, I explain how logic programming developed out of the logic approach to artificial intelligence.


6.1 Introduction

In 1965, the resolution method for theorem proving was developed by Alan Robinson [Robinson, 1965] and gave an impetus to the use of logical methods, and to the use of logical theorem proving as a basic underlying process for artificial intelligence.

The resolution method uses one basic operation, called resolution, to derive a new logical statement from a given pair of logical statements. Statements are represented, without loss of generality, as sets of logical literals. In resolution, a pair of literals, one from each of the two statements, is unified by finding a substitution of terms for variables which makes them identical. Then a new statement, the resolvent, is formed by the union of the two statements with the unified literals removed. If a sequence of resolutions can be found which starts from a set of statements and produces the empty statement, then the original set of statements is unsatisfiable.
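
To give a standard textbook illustration (not taken from Robinson's paper): from the statements {¬P(x), Q(x)}, which says that P(x) implies Q(x) for all x, and {P(a)}, the literals ¬P(x) and P(a) are unified by the substitution of a for x, and the resolvent is {Q(a)}. Resolving {Q(a)} against a further statement {¬Q(a)} then produces the empty statement, showing that the three statements together are unsatisfiable.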

In the 1970s, inspired by Planner, by recently discovered methods for generating linear proofs, and by new ideas in using logic to describe natural language, logic programming became a practical reality with the development of the Prolog language by Alain Colmerauer and Robert Kowalski [Colmerauer, 1973]. This was followed by its efficient implementation by David H. D. Warren [Warren, 1977] and the development of its precise semantics by Keith Clark [Clark, 1977].

The main difference from Planner is the use of resolution instead of modus ponens 1 for inferencing, and the use of a linear strategy for proving theorems which was complete, i.e., guaranteed to find a proof if one exists. Resolution involves a specific kind of two-way match of logical expressions called unification. Logic programming also handles variables better, allowing variables as components of data structures. Prolog was developed into a more generally usable language; it has a full range of programming language features.

1defined above in section 5.2.
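
A small Prolog example of my own illustrates the two-way nature of unification and the use of variables inside data structures:

last_of([X], X).
last_of([_ | T], X) :- last_of(T, X).

% ?- last_of([a, b, c], X).     X = c   (a variable in the "output")
% ?- last_of([a, b, Y], c).     Y = c   (a variable inside the input)
% ?- P = point(1, Y), Y = 2.    P = point(1, 2): unification binds a
%                               variable that is a component of a term.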

It is also available for all the usual types of computer, i.e., platforms, and these days it is very robust and very fast, thanks to the work at SICS, the Swedish Institute of Computer Science, supported by the Swedish government and Swedish corporations.

Robert Kowalski and Maarten van Emden also showed that top-down goal-directed search and bottom-up data-driven query processing converge on the same minimal models, thereby tying logic programming to a fixed point semantics analogous to that developed by Dana Scott for the computation of functions.
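
A small program of my own illustrates the point:

edge(a, b).
edge(b, c).
path(X, Y) :- edge(X, Y).
path(X, Z) :- edge(X, Y), path(Y, Z).

% Top-down: the query ?- path(a, c). is solved by goal-directed search.
% Bottom-up: applying the rules repeatedly to the known facts generates
% path(a,b), path(b,c) and then path(a,c); both procedures agree on the
% same minimal model of the program.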

Logic programming has allowed large AI systems to be constructed.

Chapter 7

Describing information processing in computer science

Abstract. I explain description methods used in computer science for describing information processors and information processing.

Concepts include data structures, processes, abstractions, interfaces and protocols.


7.1 What is computer science?

As I never tire of saying, computer science can be defined, not as the study of computers, but as the study of the description of computers. In other words, description methods are crucial and central to the whole computer science enterprise. What a description is, is itself part of this research; approximately, however, a description is an expression, in a precise language, which denotes or describes the entity under consideration, in the sense of allowing questions about the entity to be posed and answered.

The present-day discipline of computer science in the main describes the design, implementation and use of computer systems which are based on present-day technology and methods.

Thus, programming uses programming languages based on addressable stores, instructions and sequential control. Hardware is serial; indeed most machines are based on a single bus. Operating systems concern themselves with managing a set of serial programs. Database techniques similarly are concerned with the management of the seriality of disk seeks, the seriality of channels connecting processor and disk, and the seriality of multiple accesses by multiple users.

Of course there are methods rising above plain seriality: Ethernets, functional program structures whose parts can be executed concurrently, and of course experimental parallel computers.

7.2 Concepts in computer science

In this section, I want to introduce and define the key concepts in computer science which I will use in defining my model of the brain.

These concepts are (1) data and data structures, (2) program, control and process, (3) interfaces, and (4) communication.

There are no such concepts in theories of neural nets, which are based on real analysis and integration of equations of real variables.

In chapter 8, I will discuss the problem of defining a class of machines which are derived from plausible models of the primate brain at the system level of analysis, and of then developing a computer science for them, which would be the study of the description of computation, data representation and control structures for this class of machines, as well as their general theoretical properties.

7.2.1 Data and data structures

The origin of the word “data” is something given, from outside, beyond our control; however, when computers were used to construct or derive information, this new information created by the system was also called data, to distinguish it from a program.

An etymologically more felicitous word would be “fact”, since this means a conclusion which has been constructed, from the Latin verb facere, to make.

How can data have structure? One approach is to use addresses and to store data in storage locations with certain addresses. Thus a linear arrangement of data such as a queue or shopping list could be represented by putting the successive members of the list in successive addresses. Then, as in IPL, we can use lists containing addresses which refer to other lists, giving list structures. In symbol structures, elements of lists are symbols (tokens) which can be primitives or can be the names of other lists. The table giving the correspondence between a symbol and the entity that it refers to is called a symbol table; it may only be required by a compiler, but for some languages it may be required at run time.

However, we would like a concept of data structure which is not tied to or dependent on any particular type of computer or processing device. An answer to this problem was given by Peter Landin in 1964 [Landin, 1964], who proposed a functional approach to the idea of a structure definition. He said that we could define a particular structure type by defining three kinds of function: (i) constructors, which construct an instance of the structure type, given as inputs the components to be structured, (ii) selectors, which are given a structure and a description of which component, and return the value of the selected component, and (iii) predicates, which, given any structure, will tell us true or false whether the structure is of the given type or not.
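
In Prolog, for example, one could realize Landin's three kinds of function for a simple "pair" type as follows (a sketch of mine, with hypothetical predicate names):

% (i) constructor
make_pair(X, Y, pair(X, Y)).

% (ii) selectors
pair_first(pair(X, _), X).
pair_second(pair(_, Y), Y).

% (iii) type predicate
is_pair(pair(_, _)).

% ?- make_pair(1, 2, P), pair_first(P, F), is_pair(P).
%    P = pair(1, 2), F = 1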

7.2.2 Program, control and process

Whereas a basic level of description of an information-processing machine is as an automaton, explained in chapter 5, for complex systems it is much clearer to use the concept of a program which is run by an underlying automaton. A program is a description of a sequence of elaborations using machine actions, so a program is basically text.

The word “control” is used in several ways in different disciplines, most notably control engineering, where it concerns systems which seek and maintain desired states. However, in computer science there is a different concept of control. Control concerns which program and machine elements are currently active and which are executed in the next time interval. This usually assumes a single serial processor, so only one action can be executed at any one time; however, the same considerations apply to systems with more than one processor. Thus control structures include conditional looping, where a program segment is executed repeatedly while a given condition is satisfied, and recursion, where during execution a program evokes an additional copy of itself and executes it. Control relations among programs include one program transferring control to another, or calling another and expecting it to return control when it terminates.

A process is usually taken to be a program which can be executed but which can also be suspended, with processing continued at a later time. Thus a process has a variable reactivation point which specifies the point in the program at which execution should be resumed.

Processes can develop different control relations with each other, such as independent parallelism, parallelism with a shared data store, or coroutining, a streamed form of control in which each process operates in parallel and sends requests for more data to other processes when it has finished processing its current data.
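
Prolog systems offer a simple realization of this idea in the coroutining primitive freeze/2, which suspends a goal until a variable is bound; the example is mine:

producer(X) :- X = 5.

demo :-
    freeze(X, format("consumed ~w~n", [X])),  % consumer waits on X
    producer(X).                              % binding X resumes it

% ?- demo.    prints: consumed 5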

Of course, in a typical computer, even where only one person is using it, there will be many processes which are active at a given time, and then a scheduler which is part of the operating system will act as policeman in regulating which processes get to run. Some systems use a timesliced method in which each process gets a fixed amount of time, usually 10-100 milliseconds, before another one takes over the processor. This gives the effect of parallelism, using a serial processor.

Control can be taken to a new level with dataflow architectures in which processes are quiescent until they receive enough data to resume computation. Thus a dataflow network is a reactive network which will process whatever data is available without requiring any management by another process.

7.2.3 Interfaces

Computer science uses a highly developed and precise notion of interface between systems which specifies what information is exchanged across the boundary between systems.

Definitions, or parts of definitions, can also be transported across interfaces, or modified in crossing boundaries.

Interfaces allow large systems to be described more easily. They also facilitate the division and distribution of design and programming activity.

7.2.4 Communication

There is also a well-developed field of computer communication which involves among other things the design of protocols by which two computers can manage their interactions. Protocols are typically layered into several levels of abstraction, where each level has a different protocol, which is implemented in terms of the protocol for the level below.

7.3 Symbols in computer science

Most of computer programming uses variables which can have numerical values; thus a variable x1 might have the value 2. Any variable is referred to by a symbol, e.g., x1. Most languages require the programmer to assign a data type to each variable, so it can only take values from a given set; so x1 above might be of type integer. Some types are atomic, i.e., with no analyzable structure at the given level of description, like integers, reals, booleans and characters, but other types can be structures like arrays, strings, queues and trees. However, the components of structures are ultimately particular atomic values, i.e., of particular atomic types. A variable can only have one value at one time, but can change its value during the execution of the program, although in some languages assignment is write-once: a variable, once assigned a value, cannot change it. There can sometimes be untyped variables which can take any type of value. A variable when first created may have no value, i.e., be unbound.

In some programming languages, however, we can have values which are symbols. A symbol then is a token which has a name. Thus a variable y2, whose name is “y2”, might have a value which is the symbol whose name is “alpha”. Values can be symbols or symbol structures, a symbol structure being a list of symbols or lists. A given symbol may occur more than once in different symbol structures, and the different occurrences can be tested for equality with each other. This defines the concept of symbol in the context of present-day programming languages.
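
In Prolog, for instance, atoms play exactly this role (my illustration):

symbol_example(Y2, L) :-
    Y2 = alpha,                    % Y2's value is the symbol named "alpha"
    L = [alpha, [beta, alpha]],    % the same symbol occurs inside a structure
    Y2 == alpha.                   % occurrences test equal with ==

% ?- symbol_example(Y2, L).
%    Y2 = alpha, L = [alpha, [beta, alpha]]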

7.4 The computer science experience

There is a basic experience that computer scientists have that is difficult to convey to anyone else, and this is the confrontation of one’s intuition, rationality, and imagination by the totally objective response of the computer.

Typically a computer scientist is designing and implementing a program. He or she starts with a conception of what the program is supposed to do. This is then thought through into a design. The usual way is top down, from an outline design to gradually more detailed designs. These are usually written down in great detail, and ideally some examples are worked out by hand on paper to make sure the design will work. This process takes several weeks, during which the computer scientist subjects these ideas to every possible test and imagines all possible data cases. Then, when there is great clarity, they start on the last detailed level, which is to express the design in a programming language and to run it on a computer. Any grammatical errors are usually removed on the first day.

What then happens is that the program does not work.

On examining what happens when the program is run on the computer, the computer scientist realizes that their conception of the problem is incorrect, and/or they have forgotten an essential process that is needed to make it all work, and/or the data description has components missing.

This experience is basic to computer science.

It is also why a computer scientist is never completely convinced about an idea or design or specification until the program has been implemented and successfully demonstrated to run correctly on the computer. One can propose very convincing models and mechanisms, write them down in papers, etc., but unless a model has actually run on a computer it is not real.

Conversely, psychologists and neuroscientists tend not to care so much whether or not a computational theory has been programmed and its correctness demonstrated.

It is I think quite analogous to a psychologist designing an experiment and finding that it doesn’t yield viable results. Expecting a computer scientist to take an unimplemented model seriously is like expecting a psychologist to believe in an experiment that has not been run.

Thus the many psychological models which are discussed by psychologists, and have their useful place in conceptualizing psychological mechanisms, will certainly be found to have essential conceptual flaws and missing mechanisms and data when someone eventually attempts to program them.

Chapter 8

Computer science description and the brain

Abstract. I discuss the problem of defining a class of machines which are plausible models of the primate brain at a system level of analysis, and of then developing a computer science, which would be the description of computation, data representation, control structures, and general theoretical properties, for this class of machines.

I analyze description methods used in computer science for describing information processors and information processing. I also elucidate the underlying assumptions about information processing made in computer science description methods.

I then reconstruct and generalize some of these descriptive concepts, to provide a set of descriptive concepts for information-processing by the brain. I suggest that these descriptive concepts can provide a basis for a system level of description of the brain.


8.1 The computer and the brain

The question of the relationship of the brain to the computer has been asked by many thinkers, for example Edmund Berkeley [Berkeley, 1949], John von Neumann [von Neumann, 1958] and Hartwig Kuhlenbeck [Kuhlenbeck, 1966]. Terry Sejnowski has given a discussion of von Neumann’s book [Sejnowski, 1989], and quotes its essential message: “Thus, logics and mathematics in the central nervous system, when viewed as languages, must structurally be essentially different from those languages to which our common experience refers”, and “However, the above remarks about reliability and logical and arithmetic depth prove that whatever the system is, it cannot fail to differ considerably from what we consciously and explicitly consider mathematics”. In spite of occasional discussions of the problem [Schade and Smith, 1970] [Conrad, 1973] [Spiegel, 1983], occasional speculative schemes [Mitra and Mishra, 1993] [Mitra and Mishra, 1990], and philosophical discussions [Searle, 1990], the problem has not received any treatment that has even scratched its surface. However, unlike in von Neumann’s day, present-day computer science has by now developed a host of very general concepts and powerful methods for describing information-processing systems of wide generality.

In this chapter, I will try to contribute to the solution of the problem of describing the brain, and brain-like computers, by: (i) analyzing the assumptions about information processing that underlie concepts and methods used for describing complex information-processing systems in present day computer science, and (ii) reconstructing and generalizing these descriptive concepts from computer science, to provide a set of descriptive concepts for information processing in the brain.

8.2 Computer science assumptions from von Neumann machines

The majority of concepts in present-day computer science are derived from von Neumann machine architecture.

The notion of data is that it is addressable and therefore directly and precisely retrievable. The notion of data structure is based on sets of pointers, that is, addresses. Accessing data is a matter of indexing and following pointer chains. Data are passive.

The basic notion of control is that there is a single point of control which is an address. The point of control usually steps serially to the next address, unless there is a branch in the machine code.

There is a concept of instruction, instructions are treated as data, there is built-in implicit seriality of sets of instructions from the seriality of RAM addresses, and only one instruction is executed at any one time.

There is a notion of program, which is a sequence of instructions.

Of course there are many higher-level concepts which capture notions of data and data structure, and also higher-level notions of control structure. However, these higher-level concepts always incorporate the assumptions of precisely addressable data and a single precise point of control.

Present-day computer science concepts are thus in the main derived from von Neumann architecture. It is not much of an exaggeration to say that present day computer science is the study of von Neumann machines.

8.3 Life and computer systems

The isolated computer did not evolve, and computers today are only the central focus of a much wider information-processing activity carried out by humans. The analogy of computers to living systems should be to this wider process. We can perhaps define a living system as one which continually renews itself, and we can also observe the characteristic of the human mind that it is continually redescribing itself.

A system which does not need human intervention, help and creativity, and which sur- vives, adapts and reinvents itself, would have to provide: (i) ontogenic development - development of software, addition of new software, addition of new hardware - additional disk drives, (ii) phylogenic development - design of new types of system, evaluation of performance, new technologies and devices, (iii) storage needs - media, and (iv) physical needs - provision of energy, physical protection - air conditioning, shock and vibration, fault detection and diagnosis, physical repair and replacement of hardware.

8.4 Assumptions underlying computer science descriptions

Since my goal is to consider the applicability of computer description methods to the brain, I will list here some of the underlying assumptions operating in such methods.

Separation of data and control. Data are passive, and there is a program which determines operations which are executed on various data items.

Seriality of control. Most models assume a single point of control, where computation activity is occurring. Everything else is passive. The point of control moves rapidly over the possible locations. It is passed around as a datum. It is also located at a precise location at any one moment.

The concept of program. A program is a data description which provides detailed control to a hardware processor which “runs” (executes) it.

A program can be treated as data. It can be stored, moved and executed.

Location independence. The same data can be located in any location, and can be moved from one location to another.

Separation of resource management from other functionality. Programs usually specify computation and not resource management, such as computation time, real time, priority of running, disk space to be used, etc. (although some programs handle RAM allocation). Such matters are dealt with in separate descriptions, in job control languages, and in operating system policies.

Erasability. Any program or data can always be removed by erasing it.

Copying. Programs and data can be copied, and at very little cost.

Exactness. Programs and data can be stored, retrieved and copied without changing them in any way. All quantities are held to a high degree of precision, which is reliable and reproducible.

Permanence. Programs and data do not degrade with time. This also applies to suspended processes, which can be resumed at any later time. Over a very long timeframe, when we try to move programs and data to new types of machine, then we start to get degradation due to obsolescence.

The hardware-software distinction. This is a relatively black and white distinction. Most mechanisms are implemented in software and run on hardware. There are also of course varieties of firmware, FPGAs (field programmable gate arrays) and so on, giving mechanisms intermediate between hardware and software.

Referencing and indexing. References are precise. However, a reference is a simple descriptor. Hence, data indexes have to be used to locate items.

Context. Precise contexts are kept and used in data access and process control.

Detailed precise description of processes. One process can hold and act upon a precise description of another.

Metalevel descriptions. Descriptions of programs and data can be used which describe the properties and performance of those programs and data.

Simulation. One process can simulate another. A process can be represented in many different ways within one system.

Error correction at each level. Control and data are exact, and any residual errors or inconsistency are usually handled as close to their source as possible. It is assumed that data passed from one process to another have been checked, and any errors removed.

Energy. The use of physical energy is not much of an issue, and the results of computations hardly affect the amounts of energy used.

Limitation of processing. The number of hardware operations per unit time is a definite bound on performance, as is the amount of RAM, disk swap space, and disk space. These limit the computations that can be carried out.

8.5 Computer science concepts for the brain

Let us now consider the properties of present-day computing systems and ask how the brain might differ: whether any concepts can be carried over into brain science, whether some can be modified, and whether similar issues exist and require analogous description methods.

Separation of data and control, and programs. (i) instructions are treated as data, (ii) there is built-in implicit seriality of control from the seriality of RAM addresses, (iii) data are passive, (iv) there is a concept of instruction, (v) only one instruction is executed at any one time, and (vi) there is a notion of program.

An instruction and a program are similar ideas: they are data which control a process (the CPU). But all data have influence on computational activity, so what, if any, is the difference? Try this: a program is anything which tends to focus and direct computational resources in an organized manner, and to maintain and update over time its control over those resources.

A program would be some data set whose effect upon a process is to focus and maintain control over computational resources and to produce data transformations characteristic of the program.

It may well not be necessary to use the property of location independence for the brain. Different types of data may well be localized in certain areas, and processed only by certain processors.

Values, variables and data types. It is quite reasonable to assume that the brain uses stored values which can change with time.

However, how these values are described in brain processing, such as how they are stored and retrieved, may or may not require a notion of variable, i.e., a descriptor which explicitly refers to the stored value.

A variable is usually regarded as a fixed location which can have, at any one time, one of a set of possible values. The location is usually compiled into the process so that the original name of the variable is lost, i.e., is not used for computation, but for specification or description of the computation. There are systems with a more general concept of variable, and even systems which can generate new variables, including their names, during computation, but these are exceptions.

The nervous system probably uses implicit referencing routinely, with certain neural subareas having certain possible states or values, and processes using these values as inputs or controlling information.

We will need mechanisms for describing values and their storage and access. We will need a notion of data type. Possible values could be of different types from single numerical values, to images, to patterns, masks, programs, etc.

If a value is always stored or processed in a given specified neural area, it is likely that its possible values are all of one type and of similar size and complexity.

Channels. Neuroscientists are comfortable with the concept of communication channel. The pathways connecting neural areas can be described as channels, through which only a limited amount of data can pass. The validity and usefulness of information-theoretic measures in biological systems, including the brain, have been recognized since the 1950s.

Seriality, point of control, control in general. I imagine that data and processes and programs in the brain operate concurrently. I imagine that each of the fifty Brodmann areas in each hemisphere is processing simultaneously, and that within each area all the millions of neurons are firing simultaneously, perhaps many of them relatively slowly, at 1 spike per second, but nonetheless firing.

The extreme seriality of operation of computers is not found here, but the notion of control will still arise, in issues of how resources are used. Does a process access one particular memory or set of memories and run one program or object, or does it run another one? The idea is that there are many possible courses of processing activity and that not all objects can be processed simultaneously. Thus there are questions of selection of which objects to process, of suspension and termination, of one process dominating and inhibiting another, of one process waiting for another, and of one process having a control relationship with another: perhaps they work on related data, perhaps they provide data to each other, or one to the other, perhaps they compete for resources, or for access to data.

A process can be imagined as localized within some area, or as sparsely distributed over a large area, if not the entire brain.

Where an object is large and distributed, its control over resources, viewed as being processed by processes, is no longer determined by one point but depends on a large subset of the object, if not the entire object.

Other control issues include coherence - maintaining active data and programs which agree with and are relevant to each other, rather than a set of unrelated data and programs.

The seriality of verbal input/output, of the stream of visual percepts from the external environment, and of high-level logical thought and problem-solving search tends to imply that the brain may be able to, or may need to, function serially at the top level, particularly for very demanding tasks.

Seriality may be necessary if, for example, (i) certain resources of limited capacity are needed and are used maximally, or (ii) results of computation are needed before further computation can occur.

Referencing and context. I imagine that narrow-band referencing by addresses and names is not used in the brain, but rather a reference is a large object, with data and even some programs, which acts upon or is processed by a memory to retrieve stored memory items.

I imagine this retrieval process is not unique, but that many items could be retrieved; however, varying amounts of information can be used, and an optimal retrieved item can be obtained.

We must also allow for multiple concurrent retrieved items.

Items may also be active objects with data and programs.

Contexts may also be approximate and overdetermined in a similar way.

Simulation, metalevel objects and copying. We may not have these in the brain.

Precision and exactness. There can be no doubt that the brain can achieve precision of calculation, of memory retrieval and of inference. I imagine however this being achieved by a multiplicity of overdetermining constraints and cues, which establish, maintain and ensure the accuracy and precision of information retrieved and transformed.

8.6 Summary

My analysis has shown that many inbuilt assumptions of present day computer science ubiquitously infiltrate computer science description methods. I have argued that it is nevertheless possible to crystallize out, from present day computer science, a set of computational concepts and principles which can form the basis of a system level of description for the brain.

Chapter 9

Levels of description in computer science

Abstract. I explain description methods used in computer science for describing information processors and information processing.

The description of a computer system is organized as a set of self-contained description levels.

I describe a typical set of levels giving formal descriptions at each level and descriptions of elements at each level in terms of the level below.


9.1 Describing computers

During the 1970s and 1980s, with the development of VLSI technology, the precise description of large complex computer systems using multilevel approaches was developed into a powerful and practical methodology. A good standard treatment of system level abstraction and levels of description can be found in [Siewiorek et al., 1982].

Description languages have been developed for the high-level description of complete computer systems. Standard treatments can be found for example for requirements in [Davis, 1993] and for real-time embedded system methods in [Calvez, 1993].

At the highest levels, there are description languages which are not completely formal or mathematical, but are used for humans to communicate with other humans about computer system specifications, for example for systems that are being contracted and have not yet been built. This level of language tends to merge into languages used by the lawyers and accountants involved in the contracts, as well as the concepts and terminology of the culture of the organizations involved.

9.2 Levels of description for computer systems

Gordon Bell and Allen Newell published a landmark book in 1971, entitled “Computer Structures: Readings and Examples”, which reviewed all existing computers at that time and gave a hierarchical description scheme for describing computer systems. It also developed some general concepts for a unified, more detailed, description approach. This approach had two descriptive systems called PMS and ISP. In PMS, one specifies the overall architecture of the computer, its components, which could be memories, links, controls, switches, transducers, data operations or processors, and how they are interconnected. Bell and Newell give a complete example of the DEC PDP-8 computer expressed in PMS. In ISP, one specifies the basic instruction set of each processor, and Bell and Newell show how to specify the DEC PDP-8 instruction set using ISP. A predecessor of ISP is APL, which was originally developed by Kenneth Iverson at IBM in the early sixties for specifying the action of processors [Iverson, 1962]. In 1982, the second edition of the “Computer Structures” book appeared with an additional co-author, Daniel Siewiorek [Siewiorek et al., 1982]. It had a more general hierarchical description scheme with an additional top level based on PMS description. This allowed the components of specified computer architectures to be complete computers, including software. Figure 9.1 gives a set of levels based on their description scheme.

Level no  Level name            Description method used   Concepts described

1         PMS                   PMS descriptions          processors, memories, networks
2         applications          programming languages     application mechanisms
3         programming language  implementation language   programming constructions
4         operating system      implementation language   memory allocation, scheduling of tasks, file system management
5         instruction code                                instruction set design
6         register level                                  microprogramming, datapaths
7         switching circuit                               sequential circuits (counters, registers), combinatory circuits (encoders, decoders, data operations)
8         gates                                           technologies for gates and memory

Figure 9.1: Levels of description in computer science

The relationships between levels are not always well defined. They depend on the specific machine, languages and methods being used, and these are in a constant state of change and development. Each level will have a formal description language, allowing exact specifications of what it is describing and how, and the descriptions and description systems exist as computer programs which have been designed and implemented to provide the ability to process system descriptions correctly and to give a faithful semantics of the intended meaning of the descriptions.

In general, as diagrammed in Figure 9.2, each level will “deliver” certain functions and data to the level above, meaning that it will explain those functions and data in terms of expressions involving data and functions within its level. It will “implement” those functions and data which are primitive and unanalyzable at the level above. In its turn, it works with a certain set of primitives which are delivered from the level below. What is being described is, of course, always the behavior of the system under consideration.

Figure 9.2: Levels of description and their interactions. The diagram shows a description language at level n and one at level n−1 (each with its sets, elements, functions, predicates, time and space scales, hypotheses, questions, experimental data and models), joined by an interlevel interface which defines entities at level n in terms of entities at level n−1.

A related diagram from [Edwards, 1992] illustrating VLSI chip design is given in Figure 9.3. This shows that spatial layout considerations can also be incorporated into the description process.

Figure 9.3: The process of chip design, from [Edwards, 1992]. Silicon compilation comprises design synthesis and layout synthesis, with behavioral, structural and physical descriptions running from system specifications, algorithms, register transfers and logic functions, through processors, memories, ALUs, registers, gates and transistors, to physical partitions, floorplans, cell layouts and the VLSI chip.

9.3 Descriptions used at each level

Relations between levels. Let us consider how one might give an account of the execution of a Prolog program and how this would propagate, or transcribe, through the different levels.

I am spending time on this example because the levels of description of computer systems are one of the few clear examples of precise definition of a system of description levels, and later I will be arguing for an analogous description scheme for the brain.

This is thus an example of giving a precise description of a laptop computer running a certain Prolog program. In order to do this we will need eight levels of description, and each level will involve very complex descriptions. We diagram this in Figure 9.4.

Description level        Interlevel interfaces

laptop                   defined as processes, memories and networks
brain model process      Prolog statements defined by Prolog mechanism in C
C                        C statements defined by machine instructions and calls to the operating system
Linux operating system   C statements defined by machine instructions
Pentium instructions     codes defined in serialized register transfer language; serialized register transfer language defined in hardware RTL
register level           registers defined as gates
switching circuit        gates defined as circuits
circuit elements         circuit elements defined as physical devices

Figure 9.4: Levels of description for a laptop running the brain model

1. The laptop specification describes how one can run and interact with a Prolog program.

2. The Prolog program is described by how it runs on the Prolog runtime system.

3. The Prolog runtime system is described by a C program.

4. The C program is described by a machine code program that it compiles into.

5. The machine code program is described in terms of the machine instruction set and machine architecture.

6. The machine instructions are described in terms of a serial register transfer language.

7. The machine processor and architecture is described in terms of a hardware language such as RTL, the register transfer language.

8. The logic design level. Registers and register-level operations are described by designs in terms of logic gates.

9. The circuit design level. Logic gates are described in terms of circuits, i.e., arrange- ments of circuit components or devices.

Thus each level in principle has a different description of the same thing, namely the given Prolog program.

The specifications of each level and each interlevel interface exist as documents. We diagram this in Figure 9.5. Each level has at least two complex specification documents, often many more, each about 300-500 pages, and to understand these fully, one needs to specialize to work at one particular level or interface.

Description level        Specification of language                    Specification of interlevel interfaces

laptop                   laptop user manual                           implementation of laptop commands
brain model process      Prolog manual and Prolog ISO specs           listing plus manual for Prolog interpreter in C
C                        C user manual and C language ISO specs       listing plus manual for C compiler, including operating system calls
Linux operating system   manual for Linux commands and C ISO specs    listing of Linux system in C
Pentium instructions     Pentium machine code manual                  spec of machine instructions in terms of Pentium machine architecture, which is in terms of hardware RTL
register level           RTL manual                                   specifications for registers, ALUs etc. as gates
switching circuit        logic diagrams and tables                    specifications of gates defined as circuits
circuit elements         circuit diagrams                             circuit elements defined as physical devices

Figure 9.5: Specification documents for levels and interlevel interfaces

The circuit design level. A circuit can be written as a circuit diagram involving resistors, transistors etc. In this way for example a NAND gate can be described as a circuit, see Figure 9.6. It is of course possible to write a circuit diagram as a set of assertions, one for each component and then one for each connection among components.

Figure 9.6: NAND gate defined by circuit, logic diagram, layout and logic formula representations

The logic design level. We might describe a combinatory logic circuit by an expression in boolean, i.e., propositional, logic:

O1 = A ∨ ¬(A ∨ B) = A ∨ ¬B

(by De Morgan's law, ¬(A ∨ B) = ¬A ∧ ¬B, and then A ∨ (¬A ∧ ¬B) = (A ∨ ¬A) ∧ (A ∨ ¬B) = A ∨ ¬B). This can also be written as a logic diagram, see Figure 9.8.

The register transfer level. RTL is a formal language. In it you specify a set of input registers, an output register, transformations such as AND, OR, NOT, etc., which are executed in one clock cycle, and register transfer statements of the form:

L: Z = F(X1, X2, X3, ..., XN)

where L is a label for the statement, Z is the output register and the Xi are the contents of the input registers. Control is expressed by statements such as:

goto L2
if condition then goto L2

Thus this language has well-defined semantics.

A B   ¬A ¬B   (A∧B) ¬(A∧B)   (A∨B) ¬(A∨B)
1 1    0  0     1      0       1      0
1 0    0  1     0      1       1      0
0 1    1  0     0      1       1      0
0 0    1  1     0      1       0      1

Figure 9.7: Definitions of NOT, NAND and NOR gates as truth tables

The instruction set level. This comprises a set of about 100 different types of machine instruction, whose meaning is given by specifications in the manual for the machine in question. These meanings can also be described in terms of notional register transfers at the level below, for example:

ADD X1X2        is described as   C,R[0] ← R[0] + R[X1X2]
LOAD X1X2X3X4   is described as   R[0] ← MEM[X1X2X3X4]

However, these specifications are not in terms of real hardware registers in the machine but instead concern a notional serial machine. This difference occurs because the machine uses a lot of additional mechanisms, broadly termed instruction-level parallelism, to enhance its performance. For example, a basic arrangement is pipelining, in which each instruction is processed in a sequence of operations, and then the processing of the sequence of instructions is overlapped in time using a series of different hardware processors, see Figure 9.9, taken from a Tübingen course by Gerald Heim, based on the work of Patterson and Hennessy [Patterson and Hennessy, 1996] [Hennessy and Patterson, 1998] on their DLX machine.

Figure 9.8: A logic circuit

Figure 9.9: Pipelining diagram defines serialized register transfer language in terms of hardware RTL

Other types of instruction-level parallelism are block transfer and caching, of data or program, in which a block of instructions or data is transferred together into a fast working store in the processor from which execution can be done faster. The description of this complex mechanism however is arranged to be a serial register-transfer language and the machine is designed to make this serial model be a correct description of its parallel operation. This illustrates, then, that an interface between levels of description can be quite complex and can involve complex evaluation and complex parallel mechanisms.

The C program level. Here is a simple C program, which sums the integers from 0 to n:

int s, i;
s = 0;
for (i = 0; i <= n; i = i + 1)
    s = s + i;

This has a meaning specified in terms of the semantics of the C language. It can be explained by an equivalent machine code program for a PDP-8 machine, taken from Siewiorek, Bell and Newell's book:

start   cla          clear AC
        dca s        s = 0, deposit AC in memory and clear AC
        dca i        i = 0
loop    tad s        two's complement add
        tad i        s = s + i
        dca s
        tad n
        cia          negate AC (in two's complement)
        tad i
        sma cla      skip if AC minus, clear AC
stop    hlt          halt
        isz i        i = i + 1, index by 1, skip if 0
        jmp loop     jump

This defines the C program as a sequence of machine instructions, giving a description of the program at the instruction set level.

The Prolog runtime level. This description is the C code that implements the mechanism of the Prolog language: it takes the form of the code listing of a Prolog interpreter in C.

The Prolog level. The description takes the form of a Prolog program, which has semantics as specified in the ISO standard for Prolog.

9.4 Properties of descriptions

The equivalent descriptions at each level, even though they describe the same thing, each in its own terms, are not simply related. For example a description at one level does not map onto an adjacent level by a simple mapping, or homomorphism. The mappings between levels are complex specifications of one set of concepts in terms of others.

For example, the C program is written in a symbolic form consisting of nested expressions. By contrast, the machine instruction level has no nesting of expressions but is a strictly linear arrangement of machine operations. Control flow is determined by skipping the next instruction depending upon a test on data, or by an unconditional jump to another instruction.

Likewise the register transfer level is quite different from the logic gate level. They are conceptually different even though they can in the end be mapped onto each other.

Note also that for a given expression at one level there will be many equivalent expressions at the level below.

In principle we can generate a complete description of the Prolog program in terms of logic gates; however, it would be of astronomical complexity. By using several levels of description languages, we can nevertheless understand exactly how the Prolog program can be implemented in terms of logic gates. The levels break the complexity of the specification of the computer system down into manageable steps.

9.5 Design, constraints and optimization principles

In addition to the problem of simply describing the system, computer science is based on a set of principles for optimizing performance. We need to be aware of these principles in order to see how they may need to be different for brain-like computers.

Computer system design should optimize the use of hardware resources, by minimizing processor time, RAM accesses, RAM storage, disk accesses, and context switching, and it should meet real-time requirements.

There is an even greater need to optimize programming cost: the time for writing programs, and for changing and updating them. This is usually achieved by (i) decomposing programs and minimizing interactions among subprograms - functional programming, defining data structure types, and object-oriented programming; (ii) making programs understandable by others, using high-level languages, storage mechanisms, control mechanisms, scope of variables, data typing and strong type-checking, and declarative descriptions; (iii) managing software teams - documentation, negotiation, and program development environments; and (iv) arranging for the coexistence of multiple programs - distributed programs, interface standards, and open systems. All of these processes may have analogs in computing by the brain.

Chapter 10

Brain science

Abstract. In this chapter, I define and explain my notion of Brain Science. It comprises a multilevel hierarchy of descriptions.

A possible set of levels is cell dynamics, single neurons, neural nets/associative memories, system level/neuropsychology, cognitive psychology, self theory/psychotherapy and social psychology.

Each level has its own self-contained vocabulary, methods and theories, and forms a scientific culture.

Levels interact with each other, lower levels providing explanations and definitions for higher level concepts and dynamics, and higher levels providing constraints and boundary conditions for lower-level concepts and dynamics.

I also adapt concepts of goal, plan, sequence, event and context for the description of information processing in the brain.


10.1 Describing the brain

10.1.1 Levels of description of the brain

Neuroscientists recognize and work at the levels shown in Figure 10.1, which we have modified from Gordon Shepherd’s standard textbook [Shepherd, 1994].

Thus, different levels of description of neural activity have been developed for small-scale activity, from the cell molecular level through the level of neural firing to the activity of small sets of neurons, at level 4 of this scheme. There tends to be a neglected gap between this highest neural level and the lowest level of cognitive psychology, at level 2. This gap is the system level, level 3.

Present-day modeling of cognitive processes uses computational concepts based on list processing. These models, although able to capture some aspects of cognitive phenomena, are not easily or directly relatable to neural processes or brain architecture.

Thus, on the one hand we have neuroscientists describing system-level phenomena using circuit-level concepts, and on the other hand we have cognitive psychologists using models with no correspondence to brain structure.

Level 1: behavior. Terms used: sensing, motivation, response, motor action. Description: animal and human behaviors. Experimental methods: characterization of behaviors. Data used: behavior frequencies.

Level 2: cognition. Terms used: perception, memory, attention. Description: cognitive psychology. Experimental methods: psychological experiments, imaging studies. Data used: response times, reaction times.

Level 3: systems. Terms used: system, area, nucleus, pathways. Description: interaction among brain subsystems. Experimental methods: PET, MRI, MEG imaging, EEG, single electrodes, lesions.

Level 4: abstract neurons and neural nets. Terms used: centers. Description: organization of a few hundred interacting neurons. Experimental methods: MRI, MEG, EEG. Also at this level, the neuron. Terms used: action potential, firing, spikes, transmission. Description: integration of input and generation of output. Experimental methods: single electrodes, chemical manipulation.

Level 5: detailed modeling of neurons and synapses. Terms used: synapses, dendrites; synapse, uptake, inhibition. Description: patterns of synaptic connection determine the integrative action of a neuron; action of the complete synapse. Experimental methods: single electrodes, chemical manipulation.

Level 6: cell dynamics. Terms used: genes, proteins, sequences, transcription. Description: synthesis of proteins in development and in response to intercellular signals and in activity. Experimental methods: in vitro observation, molecular manipulation and measurement of intracellular changes.

Figure 10.1: Levels of description in brain science

10.1.2 Natural science and computer science

Natural science, including the biology of the brain, has developed powerful mathematical languages for the description of transformations of energy from one form to another. However, a fully scientific theory of the brain will involve the study of information processing. The validity of information measures for describing biological systems was first explicated by Broadbent [Broadbent, 1958]. Information, unlike energy, is not conserved, and we do not have scientific laws for the dynamics of information. It is however reasonable to assume that concepts such as information, data description, coding, computation, inference and memory will be needed in the scientific description of the brain. The disciplines in which these concepts have been developed are computer science and electronic engineering. The use of formal logical methods to describe biological systems has been advocated, for example by Rudolf Carnap [Carnap, 1958].

10.2 A hierarchy of scientific cultures

I diagram in Figure 10.2 my idea of the field of Brain Science. It encompasses the multiple disciplines of all the different scientists who study the brain. Each level is a scientific culture with its own vocabulary, concepts, experimental methods, theories and models. Each level can operate in a self-contained way, making hypotheses, doing experiments, making models and validating theories, all within its own culture. Each culture interacts with other cultures: the culture below provides more detailed explanations and mechanisms for concepts on the level above, and the culture above provides constraints on systems in the culture below. New information enters the system at each level, information that cannot be discovered at other levels. Thus no one culture is primary in the scientific process; Brain Science is multicultural.

These proposed levels are not cast in stone: there could be different levels, and there could be additional levels. Also, cultures need not form a linear ordering; there can be branching, where for example different aspects of one level have explanations from different cultures. For example, to explain the electrical and fuel components of a car one might use a more detailed electrical analysis and a fuel analysis. There can also be upward branching, for example where cognitive psychology explains and receives constraints from both linguistics and psychotherapy.

interacting selves, groups - social psychology
mental states, consciousness, the self - psychotherapy
cognitive mechanisms, motivation theories - cognitive psychology
system level brain models, neurocognitive models - neuropsychology
cortical layers, associative memory models - neural nets
single neuron models, synapses, transmission - single neurons
cell dynamics, synapse dynamics, genetic transcription - cell dynamics

Figure 10.2: My concept of Brain Science

10.3 Scientific culture

Within one level of description, there is a culture of beliefs, knowledge, and methods. A culture has a language and a universe of discourse. The language has a vocabulary of terms and their definitions, explanations and allusions. A universe of discourse is the range of things that are discussed.

A scientific culture is self-contained, in the sense that it can for its central purposes operate without interacting with other cultures. Questions can be asked, hypotheses made, experiments designed and carried out, and conclusions drawn, all within the culture.

A scientific culture is usually so complex that any one person can understand only one culture - or not even that: one person may only be able to master one part of the culture, with a less complete, general knowledge of the entire culture.

A culture also has what I call a penumbra or folklore of ideas and evaluations that are understood and not expressed in papers. These are all the ideas that nobody knows how to get to work, the dreams of future achievements, the jokes and exaggerations that people exchange at coffee and on the bus. In my artificial intelligence laboratory, I would say that about 50% of all conversation consisted of jokes, some fraction of which were not about personalities but about computer science. These were not joke stories but witty remarks, plays on words, etc.

10.4 Information is generated at each level

Information can be obtained by observation, and further information can be obtained by inference and calculation. The evolution of the brain is a fact of history; it cannot be discovered from cellular models of DNA. Information about this historical fact is obtained by the science of paleontology, using its own experimental methods and vocabulary.

Thus each level produces information and insights which cannot be deduced from theories at other levels. This comes mainly from the input of experimental findings at each level.

10.5 Formal and computational models at each level

1. social psychology - statistical models of population

2. personality and psychotherapy

3. psychology - modular models without correspondence to brain areas

4. system level - modules and connecting channels

5. cortical layers - neural equations, back propagation

6. single neurons - detailed compartmental models

7. cell dynamics - mass action equations, diffusion equations

10.6 Interactions between levels

Explanations. A concept or word may have a definition at one level, or it may be primitive at that level. This concept may have an explanation or definition at the level below. The regularities of the concepts used in these explanations at the level below will also constrain the possible behaviors of the objects above.

Constraints. The objects at one level will have certain regularities of behavior, and these will act as constraints on what is possible for corresponding behaviors at the level below.

These are interactions between cultures, and there will be some interfacing problems, as each culture will have different vocabularies and different criteria for the truth and reliability of any given statement. In general, this interaction is a form of negotiation [Bond, 1989] [Bond, 1990].

10.7 A role for logic programming

My initial proposal is that the different precise languages at each level could each be defined using a single underlying general language, namely logic programming. To define a more specialized language, the general language would be specialized and extended using additional definitions in the general language. Definitions would take the form of logical assertions.

Thus for example, we would define, in Prolog, a language for creating neural models, another language for cell dynamics, and another for cognitive modeling, etc.

One advantage of having a single underlying language is that definitions of the interface between two levels would also be expressed in the language. In particular, variables from both levels could be used in such definitions since they would all be Prolog variables.
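As a concrete illustration of the proposal, here is a minimal sketch of a fragment of a neural-modeling language defined in Prolog. All the predicate names (unit/2, connection/3, activation/2, net_input/2, fires/1) and the numbers are invented for this sketch; they stand in for whatever vocabulary a real modeling culture would choose.

    % Object language: declare units and weighted connections as facts.
    unit(v1_cell, threshold(0.5)).
    unit(v2_cell, threshold(0.5)).
    connection(v1_cell, v2_cell, 0.8).
    activation(v1_cell, 1.0).        % input clause for the lowest level

    % The semantics of the object language, written in the same
    % underlying language.
    net_input(Unit, In) :-
        findall(W * A,
                (connection(From, Unit, W), activation(From, A)),
                Terms),
        sum_terms(Terms, In).

    sum_terms([], 0).
    sum_terms([W * A | Rest], Sum) :-
        sum_terms(Rest, S0),
        Sum is S0 + W * A.

    % A unit fires when its net input exceeds its threshold.
    fires(Unit) :-
        unit(Unit, threshold(T)),
        net_input(Unit, In),
        In > T.

    % ?- fires(v2_cell).
    % true.

A cell-dynamics language or a cognitive-modeling language would be defined by a different set of such clauses, and an interface between two levels would be a further set of clauses mentioning predicates from both levels.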

The usual response to this suggestion is that some other logic or language would be better. However, logic programming is a very stable method: systems are available for every type of computer, and tutorial material and examples are easily available. Its mathematical theory has been explored in some detail. Further, most problems it has as a language are shared with all other proposed languages, which in any case lack tutorials, implementations and mathematical theory.

10.8 Neuroscience

There has been a tendency over the last decade to assume that only one level is the real scientific level for describing the brain and that this level is the neural level. In this neuroscience view, all higher, aggregated, information must be and eventually will be inferred from neural models. I believe this view has had some adverse consequences and I am suggesting that it should be replaced by my multicultural concept of Brain Science.

10.9 Concepts for describing information processing in the brain

In this section, I will discuss key concepts for describing information processing in the brain. These concepts of goal, plan, action sequence, event and context have been introduced and developed by neuroscience researchers over the past decades, and are used routinely in present day neuroscientific papers for explaining neural activity in the brain.

Information-processing terms are used in different ways in the research literature. Sometimes a term is used in a broad, descriptive, almost metaphorical manner; sometimes an operational definition is given for the behaviors, measurements and situations being described; and sometimes a brain mechanism is postulated. I will try to clarify these different uses and claims. My purpose in this chapter is to characterize the action of different brain areas by their experimentally observed involvement in certain kinds of data, processing and mechanism.

10.9.1 Goals

The notion of an action being goal-directed goes back at least to Edward Tolman [Tolman, 1932], who demonstrated such behaviors in rats and developed concepts such as goal-objects and means-ends action. Our working definition of goal is information which specifies a desired state that the organism tries to attain. Usually a desired state is not specified completely, but some parts or aspects of the state are specified or constrained.
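In the spirit of section 10.7, this working definition can be written down directly. The following is a minimal Prolog sketch of a goal as a partial specification of a desired state; the state and constraint vocabulary (has/2, at_least/2, the feature names) is invented for illustration.

    % A state is a list of Feature-Value pairs; a goal is a list of
    % constraints, so it specifies only some aspects of the state.
    attained(Goal, State) :-
        forall(member(Constraint, Goal), satisfies(State, Constraint)).

    satisfies(State, has(F, V))      :- member(F-V, State).
    satisfies(State, at_least(F, N)) :- member(F-V, State), V >= N.

    % The goal below constrains hand position and grip force but says
    % nothing about the rest of the state (e.g., gaze direction).
    % ?- attained([has(hand_at, target), at_least(grip, 2)],
    %             [hand_at-target, grip-3, gaze-left]).
    % true.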

Goal-directed behavioral responses have been explicated; see for example Gordon Mogenson's treatment [Mogenson et al., 1980] of goal-directed locomotion and of thirst and the control of water intake. Edmund Rolls [Rolls, 1999] has given a general treatment of goal-directed mechanisms.

At the neocortical level, the motor cortex may use explicit representations of target positions, which thus constitute neural representations of goals [Alexander and Crutcher, 1990] [Shen and Alexander, 1997].

Goal-directed systems with continuous properties are the subject of control theory, and discrete logical goals are used in artificial intelligence. An action at one level of control may act as a goal at lower levels.

Other aspects of goals include (i) competition among goals, prioritization and selection - usually only one goal is selected at a time, although the selection of some set of compatible goals and actions may be possible - and (ii) goals as processes: monitoring and evaluating progress, and determining satisfaction and failure. Usually, the initiation or termination of activity is not conceived as part of a goal.

10.9.2 Plans

Karl Lashley, in his classic 1951 paper [Lashley, 1951] [Bruce, 1994], argued that sequences of human motor actions are guided not by chains of associations but by plans. Feedback does not determine the next action, because (i) there is too short a time for the signal indicating the completion of an action to reach the brain, (ii) errors in sequences reveal their long-range organization, and (iii) the same action can be part of two different sequences, hence the action cannot be the sole determiner of the choice of the next action. A review by Steven Keele [Keele et al., 1990] asserts that the large body of experimental evidence supports Lashley's position, and also that plans are hierarchical in nature, as shown by the latencies and grouping of different parts of observed sequences.

I take a plan to be information which guides the sequencing of action. One specific type of plan is a specification of a sequence of actions selected from a small number of elementary movements. The hierarchical decomposition into plan and elementary movements is not always perfect: an elementary movement may itself involve control, and the control of sequencing may depend on the control of individual actions. More generally, a plan could branch and include actions which are conditional on the current state. Plans can also be sets of schemas [Shallice, 1982], i.e. conditional actions, and, yet more generally, goal schemas [Schwartz, 1995], which are conditional actions that contain subgoals and continue to function until their subgoal is attained.
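These distinctions can be made operational. The following Prolog sketch gives a plan as a fixed action sequence and a plan as a goal schema, together with a small interpreter; the representations, predicate names and toy world model are invented for illustration, not taken from the cited literature.

    % (1) A plan as a fixed sequence of elementary movements.
    plan(reach_and_grasp, seq([extend_arm, open_hand, close_hand])).

    % (2) A goal schema: a conditional action that continues to run
    %     until its subgoal is attained.
    plan(approach, until(at_target, step_forward)).

    % An interpreter giving each plan type an operational meaning:
    % execute(+Plan, +State0, -Actions) collects the emitted actions.
    execute(seq([]), _, []).
    execute(seq([A|As]), S, [A|Rest]) :-
        execute(seq(As), S, Rest).
    execute(until(Goal, _), S, []) :-
        holds(Goal, S), !.
    execute(until(Goal, A), S0, [A|Rest]) :-
        effect(A, S0, S1),
        execute(until(Goal, A), S1, Rest).

    % A toy world model, just to make the example runnable: the state
    % is the remaining distance to the target.
    holds(at_target, dist(0)).
    effect(step_forward, dist(N), dist(M)) :- N > 0, M is N - 1.

    % ?- plan(approach, P), execute(P, dist(3), Actions).
    % Actions = [step_forward, step_forward, step_forward].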

10.9.3 Sequencing of action

By a sequence I mean that a process enters a state and/or generates an output at one time, and then at a later time enters another state and/or generates another, different output, and so on, so as to produce a sequence in time of different states and/or outputs. Sequencing could be determined by a clock of some kind; it could be "reactive", being triggered by perceived stimuli; or it could be triggered by some kind of internal process, for example a cognitive problem-solving process, so that whenever such a process reaches the next state in a sequence of states, the next element in the sequence is generated. Sequencing of action in subcortical areas is well known. Simple sequencing, for self-grooming, can even be mediated by brain stem nuclei [Klemm and Vertes, 1990].

Whereas premotor areas are associated with the control of action sequencing, the motor cortex proper is not, and is more purely reactive, generating codes for specific movements.

Sequencing of motor plans has been shown to be associated with specific medial frontal cortical areas, whereas stimulus-guided or conditional sequencing seems to be associated with dorsal frontal areas [Tanji et al., 1996] [Luppino et al., 1991].

The premotor areas thus also involve some sensory perception, and indeed Gallese, Rizzolatti and coworkers have explored the notion of "mirror" neurons [Rizzolatti et al., 1996] [Rizzolatti et al., 1998].

The only published experiments seem to involve either self-paced, internally generated sequencing or else sequencing guided by perception of external events. More general than this would be a combination in which, at each step, perceived external events are combined with remembered sequencing information in determining the next action.

Higher-level sequencing at the level of a plan can be inferred from the lesion work of Michael Petrides [Petrides, 1994]. In the case of delayed non-match-to-sample experiments, prefrontal areas are involved in the control of sequencing. Unfortunately these behaviors are very simple, and thus a simple processing strategy of maintaining a single activation of a remembered stimulus will allow the monkey to carry out the experimental task [Fuster, 1997]. Thus, very little experimental evidence for higher-level sequencing has been obtained for non-human primates.

However, problems with plan sequencing and other executive errors are very well-known in human frontal patients [Shallice, 1982] [Stuss and Benson, 1986] [Shallice and Burgess, 1991a] [Shallice and Burgess, 1991c].

10.9.4 Representations of events

More generally, we will be able to characterize hierarchy better if we can characterize the perception, representation, storage and behavioral influence of events. Let us try to define what this might mean, and how it might manifest itself in experimental results.

One general level of representation would be the representation of perceptions of complete external events, meaning a spatial context, external objects and their movement. However, a number of distinctions and gradations can also be made. One kind of event is an instantaneous movement; another is an entire episode over a time period. An episode typically involves a selection of stimuli and dimensions, a choice of time granularity, and a selection of intervals within the time period.

There is ample psychological evidence for the use of an episodic memory by human and non-human primates [Tulving, 1983], and evidence for episodic memory involvement in various brain areas such as orbital frontal and retrosplenial areas [Shallice et al., 1994] and the temporal pole [Markowitsch, 1995]. However, the types of episodes that have been used in such experiments are quite limited, usually to a single point in time and to a small number of stimuli or stimulus dimensions.

Representations of types of event or episode, if used by the brain, would be of great value, since they could provide representations of context for detailed memory indexing and for action selection. One could perhaps think of the context as providing the value of an index specifying possible subsets of memory and action, from which a final selection is made using more local and immediate information.

A second yet more general level of representation would be of complete mental events which would include not only the perceptual representations of external events but also some aspects of the current mental state, such as attentional tuning, and goals and plans currently active. Such a representation could exist in the brain at a given time. If so, it would no doubt be distributed over many neural areas which would mutually activate each other.

10.9.5 Social interaction

Possibly the most general classes of event or situation are those involving other primates in social action and interaction. A social event might include the perceptions of conspecifics, their dispositions and intentions, joint actions with them, joint goals and joint plans. There are results showing that certain types of social features and events are processed separately in the neocortex [Perrett et al., 1989a] [Harries and Perrett, 1991] [Desimone, 1991]. For reviews of social function in the brain see [Brothers, 1990] [Brothers, 1996] and [Adolphs, 1999].

10.9.6 Contexts

The notion of context has a lively history in psychology, linguistics and philosophy. We can perhaps define a context as a “framework or background of information with respect to which more specific items of information can be identified or manipulated” [Miller, 1991].

Context is perhaps information concerning the current situation which is relatively constant while we are in that situation. Knowing the context allows us to retrieve and use information that is specific, appropriate or tuned to that situation.

My idea of context is that of information which is quite general concerning the current situation and its characterization. Context is activated and maintained by the brain and facilitates the selection and discrimination of objects and actions by providing a global level of indexing or association. Contextual information is provided to various brain areas which use it to function in a more focused and unambiguous manner by constraining possible choices. Nicholas Humphrey [Humphrey, 1984] has argued that the active use of context information such as beliefs, stereotypes, expectations, and contextual knowledge is an essential requirement if an individual is to comprehend and react rapidly and effectively to his or her social world.

The cognitive psychologists Sutherland and MacKintosh [Sutherland and MacKintosh, 1971] reviewed the evidence for selective attention and concluded that "dimensions" of the context were selected independently of the specific items having those dimensions. The psychologist Tolman, in an associationist learning tradition, introduced the notion of cognitive maps, which were frameworks learned by animals and then used to guide problem solving. Neuroanatomists John O'Keefe and Lynn Nadel [O'Keefe and Nadel, 1978] argued for the creation of spatial maps in rats by the hippocampus. For humans, they suggested, these maps might be more generally cognitive.

Episodic memory in humans is thought to be learned in the hippocampus and stored in the neocortex. Memory for a specific episode type is then evoked by the current situation and acts as a context in the retrieval of more specific memories and in the selection and execution of plans.
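The indexing role that context plays in this account can be shown with a small Prolog sketch; the facts and predicate names are invented for illustration.

    % Memories are stored together with the context in which they
    % were learned.
    memory(kitchen, knife, in(drawer)).
    memory(kitchen, coffee, in(cupboard)).
    memory(office,  knife, in(desk_tray)).

    % Retrieval is constrained first by context and then by a local
    % cue, so the same cue yields different items in different contexts.
    retrieve(Context, Cue, Item) :-
        memory(Context, Cue, Item).

    % ?- retrieve(kitchen, knife, Where).   % Where = in(drawer)
    % ?- retrieve(office,  knife, Where).   % Where = in(desk_tray)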

Part II

The cortex

Chapter 11

Information-processing analysis

Abstract: In this chapter, I develop a set of concepts for analyzing the brain in terms of its functional architecture, by which I will mean what processing components exist, how they are interconnected, and what information-processing functions each is involved in. I characterize the information-processing function for each neural area in terms of the types of information it is associated with, and conceive of its activity as processing, storage and transmission of data of the corresponding types for that area.


11.1 Introduction

The primate brain as a whole has a hierarchical structure, related to its developmental order [Bullock, 1977] [Romer and Parsons, 1986], and some hierarchical functioning [Kandel and Schwartz, 1999], pointed out for example by Paul MacLean [Maclean, 1970]. Within the primate neocortex, some hierarchical structure is widely accepted, an early trailblazing paper being that of Jones and Powell [Jones and Powell, 1970]. Hierarchies of perception and motor action are well known; however, a hierarchical structure encompassing most of the neocortex is not established. The mathematical analysis of cortical connectivity by Malcolm Young [Young, 1993] did not yield a perception-action hierarchy. The psycholinguistic analysis of Jerry Fodor [Fodor, 1983] suggested that higher-level functioning is nonmodular and probably nonhierarchical. Fuster [Fuster, 1997], in his study of the prefrontal cortex, has suggested a hierarchical structure, which is diffusely distributed and nonmodular [Fuster, 1995]. Some time ago, the control theorist Jim Albus [Albus, 1981] suggested a hierarchical control system concept for the cortex, Ulric Neisser [Neisser, 1976] has discussed what he called the action-perception cycle, and the brain theorist Michael Arbib [Arbib, 1981], for example, has described perceptual-action interaction for motor control.

The neuroanatomist Deepak Pandya and coworkers [Pandya and Yeterian, 1990] have described some hierarchical structure of the neocortex; however, their study did not use an action hierarchy. They examined lateral connections between perception hierarchies and corresponding areas in the frontal lobes and showed that these had similar architectonic structure, which they related to the theory of Friedrich Sanides [Sanides, 1970] of the phylogenetic development of the neocortex. More recently the anatomical connectivity of the frontal lobes has been clarified by Helen Barbas and coworkers [Barbas and Rempel-Clower, 1997], who showed that there is a spatial sequence of areas whose order is predicted from their architectonic properties. The hierarchy of functioning in the frontal lobes has been investigated by Michael Petrides [Petrides, 1994] using a lesioning technique. Observed memory characteristics also support the hierarchical ordering, in showing a corresponding increasing memory ability and increasing characteristic memory time.

My concept of the information-processing function of a given brain area will be that it computes data of given types. A given area receives data of certain types and computes data of other types, which it may then store and/or transmit to other areas. I will take experimental evidence for information-processing function to be based on experimental evidence for the types of data observed to be processed by a given area. This concept of function differs from characterizations in terms of the contribution to functioning in the external environment of the organism.

This chapter. In section 11.2, I examine the concept of hierarchy and its application to the neocortex. In section 11.3, I discuss the concepts of neural areas and neural connections in the neocortex.

11.2 Hierarchies

11.2.1 The concept of hierarchy

Hierarchy is a basic organizational principle in biology. The usual concept of hierarchy is where we have a set of elements and a relation which specifies whether one element is "above" another. This relation holds only between some pairs of elements; in other words, it is a partial ordering. A hierarchy can occur where there is a sequence of different anatomical structures with gradually increasing measures of some biological property, such as size, density, or complexity. However, a set of identical or similar structures can also form a hierarchy by their mutual relative arrangement and connectivity, in which case the position of an element in the hierarchy is determined by its position within the total anatomical arrangement. A notion of cortical hierarchy has also been defined by David Felleman and David Van Essen, with feedforward and feedback connections defined by which cortical layers are involved. What is more cogent is a hierarchy of function. In the case of nervous systems, the main functions of interest concern the transmission, processing and storage of information. I will take processing to include computation, with the generation of new information forms from other input forms, and also learning, which includes alteration of function and the creation of new information. Elements higher in a hierarchy could perform more general or more abstract processing, and store more general or more abstract data.
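The partial-ordering concept can be stated directly; here is a minimal Prolog sketch with placeholder element names. higher/2 is the transitive closure of the direct "above" links, and it holds only between some pairs of elements, which is what makes the ordering partial rather than total.

    above(e3, e2).          % direct "above" links
    above(e2, e1).
    above(e3, e2b).         % branching: e3 is above two distinct elements

    higher(X, Y) :- above(X, Y).
    higher(X, Y) :- above(X, Z), higher(Z, Y).

    % ?- higher(e3, e1).                      % true
    % ?- higher(e1, e2b) ; higher(e2b, e1).   % false: incomparable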

11.2.2 The elements of hierarchies

My first basic question is: what are the components of the neocortex that we should consider as elements of a possible hierarchy? The kinds of elements that have been postulated as the functional components of the neocortex include:

(i) the gene and cell [Shepherd, 1994];

(ii) the single neuron as an integrative system, using dendrites and dendritic microcircuits [Barlow, 1972] [Shepherd, 1990];

(iii) Hebbian assemblies of neurons in a fine-grained distribution over the cortical surface [Hebb, 1949] [Fuster, 1995];

(iv) cortical columns [Mountcastle, 1957] [Szentagothai, 1972] [Szentagothai, 1983] [Mountcastle, 1995a] [Mountcastle, 1997] [Malach, 1994] (Purves [Purves et al., 1994] argues for other structure of similar scale, but which is dynamical);

(v) cortical areas [Brodmann, 1909] [Felleman and Essen, 1991] - this is the mainstream concept; areas are defined by (a) cytoarchitectonic distinguishability, (b) clustering of interconnectivity, and (c) subcortical connectivity, notably thalamic; and

(vi) regions, made up of contiguous sets of related cortical areas [Pandya and Yeterian, 1990].

In this review, I will mainly consider structure based on neural areas; however, I will also identify aggregations of neural areas, thereby defining regions.

11.2.3 Sensory and motor hierarchies

Perhaps the best known hierarchical concept in the neocortex is the visual perceptual hierarchy. It seems that neurons in areas V1, V2, V4, and so on, respond to increasingly higher-order visual features [Essen et al., 1990]. Thus, we have a hierarchy of function based on the processing of visual features. The information-processing functions involve taking input information from lower levels of the hierarchy and computing new information representing the presence of higher-order visual features of the perceived scene.

Higher-order features can mean simply "any information derived from lower-order features", but in biological visual systems the features are usually more general in the sense of responding to or describing broader classes of stimuli. In other words, the features are derived from events occurring in a greater spatial region, over a greater temporal interval, involving more stimulus dimensions, and so on.

Similarly, there is experimental evidence for a hierarchy of motor function in the entire brain [Shepherd, 1994] and in the neocortex in particular [Porter, 1990] [Picard and Strick, 1996] [Riehle, 1991]. Higher levels process general descriptions of the action to be performed, and lower levels process very specific motor patterns. A good description of issues in hierarchical control in the nervous system has been given by Peter Greene [Greene, 1972].

In such a case, it is not necessary for the processing higher in the hierarchy to be any more complex; it could be of similar complexity to other levels but simply operate on more general data. The uniformity of cortical circuitry suggests a uniformity of complexity.

There is evidence for the representation of goals, in the sense of target positions, at several different levels in the motor hierarchy. The anterior cingulate region is typically found to be involved in different aspects of goal-directed behavior [Devinsky et al., 1995], such as monitoring of progress, and connections between stimulus and reward [Elliott and Dolan, 1998] [Carter et al., 1998].

To the extent that there is a temporal sequencing of action, we expect that at higher levels there will be longer time intervals between elements of represented sequences [Tanji et al., 1996] [Rizzolatti et al., 1998].

11.2.4 Possible bases for hierarchy in the neocortex

My second basic question is: what is the basis for the hierarchical ordering relation specifying that one element is "above" another element in the hierarchy? We can list a priori several different aspects of information processing which may characterize hierarchy in the neocortex:

1. hierarchy of data type - the generality of the data being processed by an element.

2. hierarchy of complexity of processing - the amount of processing and data occurring in an element

3. hierarchy of motor function - the generality of action described by an element

4. hierarchy of temporal scale - temporal sequencing of action, temporal scale of percept, temporal scale of memory associated with the element.

5. hierarchy of goal description - the generality of the goal being processed.

6. hierarchy of memory - short to long duration, more specific to more general data, smaller to larger capacity.

7. hierarchy of control - the element may exercise control or influence over a larger number of other elements. This can be control over elements involved in the construction of an action, or control over what different elements attend to or are tuned for.

Any one or any combination of these aspects, and others not listed here, could characterize a given biological information-processing hierarchy. Further, a given set of elements could have more than one hierarchical characterization based on different aspects.

11.3 Anatomical regions and connections

11.3.1 Neural areas

The anatomical parcellation and connectivity used here is taken mainly from studies of macaque monkeys. Supporting analyses for other species of primate have been described: for prosimians by, for example, Todd Preuss and Patricia Goldman-Rakic [Preuss and Goldman-Rakic, 1991b]; for chimpanzees by Percival Bailey et al. [Bailey et al., 1950]; and for humans, classically, by Korbinian Brodmann [Brodmann, 1909], reviewed more recently for example by Karl Zilles [Zilles, 1990]. I will call the parcellated cortical divisions neural areas.

Parcellation starts from an architectonic analysis of the cortex: areas are partitioned at relatively sharp changes in architectonic measures such as the densities of cells in each layer, and neurotransmitter activity. It can then be strengthened and confirmed from connectivity data: neurons in a given area tend to have the same types of connections, to the same other areas.

A detailed parcellation of the occipital and parietal lobe visual areas has been given by David Felleman and David Van Essen [Felleman and Essen, 1991]. Detailed parcellation of the frontal lobe has been given by Helen Barbas [Barbas and Rempel-Clower, 1997].

The main neural areas of the primate neocortex are quite well established, even though their identities are under constant investigation and refinement.

There is no agreement on notation: Brodmann used numbers, but then, for example, Constantin von Economo and Georg Koskinas [Economo, 1925] introduced an alphabetic notation. More recently, finer subdivisions of Brodmann areas have been given subscripts, see for example [Carmichael and Price, 1994]. We will use a mixed notation which attempts to follow, for each part of the cortex, the notation used by the main neuroscientists studying that part. Figure 11.1 diagrams the neural areas and notation that we will be using. Zilles [Zilles, 1990] gives a conversion table between several different labeling schemes.

Figure 11.1: Neural areas and notation used

11.3.2 Anatomical connections and their analysis

Jones and Powell. Let me briefly describe the classic findings of Jones and Powell. The stated main aim of their paper was to find convergence areas of sensory data, and the main finding was a sequence of association connections for each sensory modality, ending in the frontal lobes: "... in each of the three systems studied [i.e., somatosensory, auditory and visual] there is a stepwise, outward projection from the main sensory areas within both the parieto-temporal and frontal lobes with an interlocking of each new parieto-temporal and frontal step.... each primary area projects to a local area in the same lobe and to a portion of the premotor cortex in the frontal lobe." In their experimental method, gray matter from a given area was removed, attempting to avoid damaging the white matter. This was done in the intact animal; rhesus monkeys were used. After a few days the animal was perfused and the degeneration of neurons observed, giving the connections from that area. For each area lesioned, they found that only a small set of other areas was affected. For example, on lesioning area 7, the areas affected are shown in Figure 11.2.


Figure 11.2: Lesioning an area affects a small number of other areas

The pattern of connectivity they discovered is diagrammed in Figure 11.3.

Figure 11.3: Pattern of connectivity discovered by Jones and Powell

Figure 11.4 shows the three sequences they reported. We can see evidence for a perception-action hierarchy; however, the frontal connections were unclear at that time.

Figure 11.4: Summary diagram showing the three sequences reported by Jones and Powell

Jones and Powell assert that "The significance of the double projection pattern, the one local and the other to the frontal lobe ... is obscure". They concluded that sensory data seemed to stay separate for several steps of the sequence. Convergence of sensory data seemed to occur in STS and OFC. They speculated that frontal areas are concerned with sensory data, sensorimotor integration, and learning and discrimination.

Evidence for neural areas. The original evidence was architectonic, exemplified by the work of Brodmann. Fifty years later, connectivity findings, such as those shown in Figure 11.2, provided basic evidence supporting the existence and significance of these neural areas. The connections from one area do not go to neurons all over the cortex; rather, they are strongly clustered, going only to a small set of other areas.

Further, the partitioning into areas from connectivity derived from one "source" area will be similar to that derived from other source areas. These areas are also consistent with partitionings by connectivity from subcortical areas, notably the thalamus. Moreover, these neural areas are relatively constant from one individual to another, and from one species of primate to another.

Per Roland, after two decades of pioneering brain-imaging experiments reported in his book [Roland, 1993], postulated his cortical field activation hypothesis [Roland, 1985], which states that "neurons in the cerebral cortex always change their biochemical activity, not in a scattered or singular fashion, but in large distinct ensembles, each covering some 800 mm³ to 3000 mm³ of the cortex. Within the field, the active synaptic regions form columns and bands of raised metabolic activity" [Roland, 1993].

What has not been established is how constant these activation areas are over different tasks, and what their correspondence is to the areas found by connectivity.

The work of Pandya and coworkers. Over a twenty-year period, Pandya and various coworkers have systematically investigated corticocortical connectivity. Summaries of their work can be found in [Pandya and Yeterian, 1985] [Pandya and Yeterian, 1990]. They used various techniques: the older papers used lesioning, and later work used anterograde and retrograde tracing. They investigated the same three hierarchies as Jones and Powell. Figure 11.5 summarizes their findings, taken from Figures 17, 19 and 21 of [Pandya and Yeterian, 1990]. Clear hierarchies are seen.

Figure 11.5: Summary of hierarchies reported by Pandya and coworkers

The work of Helen Barbas on the connectivity of frontal areas. Helen Barbas [Barbas, 1988] [Barbas, 1992] [Barbas and Rempel-Clower, 1997], using tracing techniques, has described the intrinsic connectivity of the frontal areas, showing its hierarchical structure. Her results are shown in Figure 11.6.

Figure 11.6: Intrinsic connectivity of frontal areas, from (Barbas and Pandya, 1989)

Determining hierarchical structure using regions and function. It is not easy to determine the hierarchical structure of the cortex, and this is probably why it is not generally used. I will use anatomical connectivity data which I have researched myself, but which is close to and includes that already published by Young [Young, 1993], and I give itemized references to the connectivity findings. My approach differs from Young's in two important respects. First, I use functional information as well as connectivity information in defining our hierarchies. Young's work showed that anatomical connections alone do not give enough information to derive a definite architecture. Second, I do not treat the entire cortex as having a single connectivity matrix. Instead, I first divide the cortex into six main parts, thus breaking the problem of describing connectivity into two levels, namely connections within parts, which we will call intrinsic, and connections between parts, which we will call extrinsic. This, I believe, allows more neuroscientific intuition to be brought to bear.

Conclusion. The overall structure, both anatomical and functional, of the primate brain is that of a set of cortical regions with spatial localization and clustered interconnectivity. Each region is specialized in its data processing and data storage to a limited number of data types. Cortical regions are connected into a perception hierarchy and a planning and action hierarchy, with connections between corresponding levels of these two hierarchies. The perception hierarchy is based on data representing increasingly general situations. The planning and action hierarchy is based on increasingly general situations, plans and control.

11.4 Sensing as the construction of descriptions

Sensors do not deliver raw information but rather a feature or set of features of the stimulus. Figures 11.7 and 11.8, from Blum [Blum, 1990], list a classification of sensory receptors together with the features they detect and the conduction velocity.

The diameter of a fiber in micrometers is approximately its conduction velocity in meters/sec; thus 1 micrometer corresponds to 1 meter/sec. Myelination gives 4-6 times the velocity of an unmyelinated fiber of the same diameter; fiber groups I to IV run from fastest to slowest. "At the level of the primary afferent fiber, a complex sensory stimulus is broken down into a number of very specific features" [Blum, 1990]. A feature may involve temporal integration, temporal adaptation and sensitization, or temporal differentiation. Each sensor:

1. has a defined adequate stimulus,

2. is responsive to the adequate stimulus within its receptive field, and

3. has a rate of adaptation.
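As a small worked version of this rule of thumb (taking the quoted 4-6x myelination factor at its midpoint of 5, so the numbers are illustrative only):

    % conduction_velocity(+DiameterInMicrometers, +FiberType, -MetersPerSec)
    conduction_velocity(D, unmyelinated, V) :- V is D.
    conduction_velocity(D, myelinated, V)   :- V is 5 * D.

    % ?- conduction_velocity(1, unmyelinated, V).   % V = 1 (m/s)
    % ?- conduction_velocity(10, myelinated, V).    % V = 50 (m/s)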

Figure 11.7: Sensory features - first part

Figure 11.8: Sensory features - second part

Chapter 12

An information-processing analysis of the primate neocortex

Abstract: In this chapter, I analyze experimental evidence on the perceptual areas of the primate neocortex for conclusions concerning the existence of neural areas, corticocortical connectivity among neural areas, and the involvement of each cortical neural area in the functioning of the brain.

I analyze the neocortical perception hierarchies one by one: olfactory, gustatory, somatosensory, auditory, visual - ventral then dorsal - and finally the polymodal areas of the superior temporal sulcus (STS).

This analysis shows that these areas consist in the main of an interconnected polymodal perception hierarchy.

In the next chapter, I will extend this analysis to the frontal areas.


12.1 Introduction

Even if we accept that the primate neocortex contains hierarchies of perception and action, the nature of these hierarchies is far from agreed. My purpose in this chapter is to analyze, in terms of information processing, the connectivity and functional properties of the primate neocortex, to determine just what hierarchical structure, if any, is present. As will be seen, it is by no means obvious what hierarchical structure exists in the neocortex, partly from lack of data but also because existing data do not always suggest a straightforward hierarchy. However, when all the experimental data are gathered together and analyzed, we have clear evidence for a hierarchical structure of information processing in the neocortex. This structure is based on anatomical regions, and has hierarchical anatomical connectivity and hierarchical functionality. I will show (i) a parallel set of perceptual hierarchies based on increasing generality of data processed, temporal scale and memory, (ii) an action hierarchy based on action generality, temporal scale and memory, and (iii) lateral connections between corresponding levels of these perceptual and action hierarchies. In addition, there is no negative evidence; i.e., we know of no experimental evidence that contradicts our hypothesized hierarchical architecture.

I have not included the polymodal areas in the posterior cingulate or the intraparietal sulcus, since the data on these areas were not clear enough to us. Neither have I taken into account the bihemispheric structure of the brain, postponing its consideration until a more extensive study.

The establishment of an architectural design, such as the laterally-connected perception-action hierarchy to be described here, would provide an important simplifying principle for the interpretation of experimental findings and for the generation of experimental questions and hypotheses. Furthermore, as I describe in detail in chapter 15, from the description of the hierarchy - giving its elements and their corresponding processing and data types, with temporal and memory characteristics, and its connectivity - we will be able to derive a dynamic causal model of information-processing activity in the primate neocortex.

Demonstration of the existence of a functional hierarchical scheme will require a review of the experimental literature, which we will now undertake. Limitations of space preclude us from describing experimental paradigms and results in detail, or from listing all supporting references.

I list all the regions as sets of areas in Figure 12.1.

Hierarchy                           Region   Areas
olfactory hierarchy, OA             OI       olfactory cortex
                                    O1       14c, 25, 13a, 13m, IO
gustatory hierarchy, GA             GI       operculum, IG, SI taste and tongue areas
                                    G1       polymodal taste area in 12o
                                    G2       polymodal taste area in 13l
auditory hierarchy, AA              AI       auditory cortex
                                    AA1      Tpt, paAlt and caudal TS3
                                    AA2      rostral TS3 and TS2
                                    AA3      TS1, TPro
somatosensory hierarchy, SA         SI       1, 2, 3
                                    SA1      PE, PEa, PF
                                    SA2      PEc, rostral POa, PFG
                                    SA3      PGm, rostral PG
ventral visual hierarchy, VV        VI       V1, V2, V3, VP, V3A, V4t
                                    VV1      OAa, TE3, V4
                                    VV2      TE2
                                    VV3      TE1, TPro
dorsal visual hierarchy, DV         DV1      PIP, PO
                                    DV2      MIP, VIP, LIP, DP, MDP
                                    DV3      caudal 7a
polymodal hierarchy in STS, PM      PM1      TPO-4, TPO-3
                                    PM2      TPO-2
                                    PM3      TPO-1, TPro
planning and action hierarchy, PA   MI       4
                                    PA1      6
                                    PA2      8
                                    PA3      46
                                    PA4      9, 10
                                    PA5      11, 12
                                    G        24, 32

Figure 12.1: Table of all hierarchical regions
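In the spirit of section 10.7, the table of Figure 12.1 can be recast directly as Prolog facts and queried; the predicate names region/3 and area_in/3 are mine, and only two of the hierarchies are transcribed here, to keep the sketch short (numbered Brodmann areas are written with an "a" prefix).

    % region(Hierarchy, Region, Areas), transcribed from Figure 12.1.
    region(somatosensory, si,  [a1, a2, a3]).
    region(somatosensory, sa1, [pe, pea, pf]).
    region(somatosensory, sa2, [pec, rostral_poa, pfg]).
    region(somatosensory, sa3, [pgm, rostral_pg]).
    region(planning_action, mi,  [a4]).
    region(planning_action, pa1, [a6]).
    region(planning_action, pa2, [a8]).
    region(planning_action, pa3, [a46]).

    % To which hierarchy and region does a given area belong?
    area_in(Area, Hierarchy, Region) :-
        region(Hierarchy, Region, Areas),
        member(Area, Areas).

    % ?- area_in(pfg, H, R).
    % H = somatosensory, R = sa2.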

12.2 Analysis of neocortical perception hierarchies

12.2.1 Olfactory areas

Anatomy. I diagram the olfactory areas in Figure 12.2, which is based on a figure by Thomas Carmichael and Joseph Price [Carmichael and Price, 1994]. Panel (a) gives an orbital view of the olfactory cortex; panel (b) gives the principal connections and the hierarchy of functional involvements.

Figure 12.2: The olfactory hierarchy

These areas and their connectivity have been extensively discussed by Price [Price, 1990] [Price, 1991] and by Rolls [Rolls, 1995]; see also [Davis and Eichenbaum, 1991]. The primate olfactory system differs from that of lower mammals in having direct connectivity from olfactory nuclei to olfactory cortical areas, as well as via thalamic connections. From the described anatomy and functionality, we can identify the following regions:

Region OI. This is the olfactory cortex, described for example by Gordon Shepherd [Shepherd, 1994] and Lewis Haberly [Haberly, 1990]; it consists of several different interconnected regions, including the anterior olfactory nucleus, the piriform cortex and the olfactory tubercle. According to Thomas Carmichael and coworkers [Carmichael et al., 1994], it provides detection and discrimination of odors, as well as simple odor memories.

Region OL1. This is the neocortical olfactory area described by Carmichael et al. [Carmichael et al., 1994], who give a detailed architectonic division of the orbital frontal and insular cortices [Carmichael and Price, 1994]. OL1 is their areas 14c, 25, 13a, 13m and IO. According to them, it provides odor-guided behaviors, forced-choice olfactory discrimination, and mating and sexual behaviors, although most of the experimental evidence comes from rodent work.

12.2.2 Gustatory areas

Anatomy. I diagram the gustatory areas in Figure 12.3, derived mainly from the work of Edmund Rolls and coworkers [Rolls, 1995].

Figure 12.3: The gustatory hierarchy

From the described anatomy and functionality, we can identify the following regions:

Region GI. The primary gustatory cortical area is located in the operculum and insula, and is described by Rolls [Rolls, 1995]; see also [Shepherd, 1994] and [Finger, 1991]. It is concerned with the timing and detection of taste and the characterization of types of taste. It is unnecessary for reflex responses to gustatory stimuli, but necessary for normal retention of learned taste aversions.

Region GU1. This secondary gustatory cortical area, approximating 12o, integrates gustatory, satiety and olfactory information, according to Rolls [Rolls, 1995].

Region GU2. This is a polymodal area, approximating 13l, which integrates visual, gustatory and olfactory information, again according to Rolls [Rolls and Baylis, 1994] [Rolls, 1995].

12.2.3 Somatosensory areas

Anatomy. I diagram the somatosensory areas in Figure 12.4. Panel (a) gives lateral and medial views of the somatosensory areas; panel (b) gives the principal connections and the hierarchy of functional involvements, from tactile detection and tactile images through the guidance of reaching and grasping.

Figure 12.4: The somatosensory hierarchy

I use the notation of Deepak Pandya and coworkers for the connections and architectonics of the somatosensory areas. Area PG of Pandya and coworkers is approximately the same area as Brodmann 7a. I will use and refer to rostral PG as the somatosensory part of PG, and caudal 7a as the dorsal-visual part of 7a. From the described anatomy and functionality, we can identify the following regions:

Region SI (areas 1, 2 and 3). A comprehensive treatment of the somatosensory systems of primates has been given by Jon Kaas and Timothy Pons [Kaas and Pons, 1988]. The primary somatosensory area SI has topographic and nontopographic somatic representations [Phillips et al., 1988], and is involved in basic processing of somatic sensation, e.g., texture and angularity [Pandya and Yeterian, 1990].

Region SS1 (areas PE, PEa and PF). There are re-representations of the somatic body regions in the parietal lobe [Kaas et al., 1981] [Merzenich et al., 1981]. These areas are thought to be involved in more complex and integrative functions [Lynch, 1980]. There are representations of spatial form derived from earlier topographic representations, for example tactual form and texture [Johnson et al., 1995]. A review by Juhani Hyvärinen [Juhani Hyvärinen, 1982] concludes that the somatosensory association cortex provides a "somatosensory coordinate system for goal-directed voluntary movements". Mortimer Mishkin [Mishkin, 1979] concluded that there is a hierarchy of somatosensory perception, and has drawn an analogy to the ventral visual hierarchy.

Regions SS2 and SS3 (areas {PEc, rostral POa and PFG} and {PGm and rostral PG}). Vernon Mountcastle [Mountcastle, 1995b] has reviewed work on parietal lobe areas. Area 5 is involved in somatosensory guidance of voluntary reaching, grasping and joint rotation. Reaching involves the projection of the arm and hand, and in grasping the hand adapts to the spatial contours of the target. "Grasping" neurons have been found in rostral POa and, for visual control, in LIP.

12.2.4 Auditory areas

Anatomy. The main auditory areas of the temporal lobe and insula are shown in Figure 12.5, taken from a review by Jon Kaas et al. [Kaas et al., 1999].

Their diagram is based on data from [Hackett et al., 1998a] [Hackett et al., 1998b] [Hackett et al., 1999] and [Romanski et al., 1999]. Areas RT, R and AI form the core, and areas RTM, RM, CM, CL, AL and RTL form the belt. The parabelt is shown as RPB and CPB, its rostral and caudal parts. LS is the lateral sulcus and STS the superior temporal sulcus. When these are in their normal closed position, only CPB, RPB and STG, the superior temporal gyrus, are visible. Note that Tpt and 22 form the planum temporale, a planar region on the upper surface of the posterior temporal cortex. The connectivity of the areas comprising the core, belt and parabelt shows a division of these areas into rostral and caudal regions, and indeed the summary diagram in [Hackett et al., 1999] shows the connections between rostral and caudal parts as weaker than the connections between different rostral parts or different caudal parts.

[Figure 12.5 (diagram, not reproduced): (a) medial and lateral views of the auditory areas (core AI, R, RT; belt CM, CL, CML, RM, RTM, RTL, AL; parabelt CPB, RPB; STG, STS, Tpt, 23a, 23b, insula); (b) principal connections, hierarchy and functional involvements of AI, AU1, AU2 and AU3 (perception of pure tones; pattern recognition; space perception, conspecific vocalizations and auditory memory; complex stimuli, semantics of words and auditory long term memory).]

Figure 12.5: The auditory hierarchy

The intrinsic connectivity incorporating Kaas et al.'s findings is given in Figure 12.5(b). I have moved the rostral parts higher in the diagram because of their connectivity to frontal areas, to be described later. This frontal connectivity is based on the findings of Kaas et al. and also on the findings of Pandya and coworkers [Pandya and Sanides, 1973] [Pandya, 1995], who described the basic architecture several years ago. In addition, we have included the auditory medial areas CML and 23b. These were originally described by Goldman-Rakic et al. [Goldman-Rakic et al., 1984] in their study of connections between the principal sulcus and the hippocampal complex. Masao Yukie [Yukie, 1995] has described their connectivity to other auditory areas. Unfortunately, to our knowledge, their full connectivity to other frontal areas has not yet been described. The main frontal connectivity from all auditory areas forms the arcuate fasciculus, which runs near the medial surface.

From the described anatomy and functionality, we can identify the following regions: Region AI. To maintain consistency in my own notation, I will name the entire core region AI. Kaas et al. use the name AI for just one of the areas of the core. According to them, the core region is cochleotopically mapped and is involved in the perception of pure tones. AI (in our notation) has three, four or five subareas in most species of monkey [Aitkin, 1990] [Brugge and Reale, 1985]. Auditory cortical areas in man have been described by Gastone Celesia [Celesia, 1976]. Processing in AI uses audiofrequencies projected in a regular serially ordered way (tonotopic representation). The cochlea is also re-represented point by point, and indeed each cochlea is represented bilaterally [Buser and Imbert, 1992]. Primary areas are involved in elementary auditory processing such as frequency and amplitude [Pandya and Yeterian, 1990] [Brugge and Reale, 1985]. Auditory cortex may also be involved in the localization of sound [Heffner and Heffner, 1990].

Nobuo Suga and coworkers have described in detail how an upstream flow of information in the auditory system of the mustached bat results in tuning of the midbrain frequency map [Gao and Suga, 1998] [Yan and Suga, 1998]; however, analogous phenomena have not yet been investigated in primates.

Region AU1 (areas {caudal and rostral belt, and caudal parabelt}, i.e., TS3). Auditory association areas are involved in more integrative functions such as auditory pattern recognition and sound localization [Juhani Hyvärinen, 1982] [Pandya and Yeterian, 1990]. Auditory image formation has not yet been clearly shown in primates. Kaas et al. remark that this region is less precisely cochleotopic.

Region AU2 (areas {caudal STG, caudal STS and rostral parabelt}, i.e., TS2). According to Kaas et al., the parabelt area is concerned with space perception and auditory memory. However, it also seems there is special provision for recognizing sounds from the vocal repertoire of the animal's own species. This has been studied in detail in the squirrel monkey [Newman, 1978], where specialist cells for nearly all major call types have been found. James Newman concluded that "the auditory cortex is a highly efficient processor of the acoustic structure of vocalizations and other complex acoustic signals, but that the determination of the biological significance of vocalizations - their interpretation, their meaning - most likely takes place elsewhere" [Newman, 1978], p. 104. More recently, Josef Rauschecker et al. [Rauschecker et al., 1995] [Rauschecker et al., 1997] report involvement in the perception of conspecific vocalizations.

Region AU3 (areas {rostral STG, rostral STS}, i.e., TS1, plus CML and 23b on the medial surface). According to Pandya et al., the rostral parts of STS and STG form a separate region, both architectonically and in terms of intrinsic and extrinsic connectivity. Michael Colombo et al., working with lesioned monkeys, concluded that "the superior temporal cortex plays a role in auditory processing and retention similar to the role the inferior temporal cortex plays in visual processing and retention" [Colombo et al., 1990], p. 336.

Humans have special abilities for recognizing formants produced by the human pharynx and for calibrating the heard speaker's vocal tract [Lieberman, 1991] [Blumstein, 1995]. Functional MRI [Binder et al., 1994] has shown the involvement of STG bilaterally, with more meaningful auditory stimuli activating more of STG, simple stimuli being confined to the auditory cortex.

There have been several imaging studies in humans using CT [Baum et al., 1990], PET [Morris et al., 1998] [Fiez et al., 1996] [Zatorre et al., 1996] [Smith et al., 1996] [Petersen and Fiez, 1993], and FMRI [Hickok et al., 1997] [Dhankhar et al., 1997] [Millen et al., 1995] [Bilecen et al., 1998] [Huckins et al., 1998] [Binder et al., 1997]. These have limited spatial resolution, but have shown that the semantic processing of words tends to use more outer and more rostral auditory areas than the perception of tones. The other main finding is the well-known activation of posterior areas in speech perception and the processing of nouns, and of frontal areas in speech production and the processing of verbs.

I include CML and 23b in AU3 because they seem connected to other areas at this level. The role of CML and 23b in auditory long term memory has been recognized clinically by Edward Valenstein et al. [Valenstein et al., 1987] and by Rudge and Warrington [Rudge and Warrington, 1991]. Paul Grasby et al. have shown this involvement using imaging [Grasby et al., 1993], and, in their FMRI study, Jeffrey Binder et al. [Binder et al., 1997] have shown clear involvement of these areas in semantic decision tasks. As shown in the diagram, one can argue for a branch in the hierarchy: a rostral branch consisting of rostral STG and STS and connected to orbital frontal cortex, and a caudal branch connecting via CML and 23b to dorsal prefrontal cortex. Some researchers [Romanski et al., 1999] have speculatively associated the rostral branch with complex phonetic and language processing and the caudal branch with auditory spatial processing.

12.2.5 Ventral visual areas

Anatomy. I diagram the ventral visual areas in Figure 12.6.

[Figure 12.6 (diagram, not reproduced): (a) medial, lateral and ventral views of the ventral visual areas (V1, V2, VP, V4, MT, TE3, TE2, TE1, TPro); (b) principal connections and hierarchy of functional involvements: preattentive vision and motion, higher level visual features, recognition and storage of complex visual forms, socially significant objects; (c) principal connections to the frontal action hierarchy (areas 4, 6, 8, 12, 46, 9, 10).]

Figure 12.6: The ventral visual hierarchy

Our parcellation, notation and intrinsic connectivity of the occipital lobe are taken from the work of David Van Essen and coworkers [Essen et al., 1990]. They established the hierarchy of visual processing and established criteria for connections in such a hierarchy. Most of their work used macaque monkeys. For inferotemporal (IT) areas, our parcellation, notation and intrinsic connectivity are based on the work of Seltzer and Pandya [Seltzer and Pandya, 1994].

Nikos Logothetis and David Sheinberg [Logothetis and Sheinberg, 1996] have reviewed visual processing and its neural basis. Area TE is not visuotopically organized, and there is a systematic increase in receptive field size along the posterior-anterior length of IT, from 1.5° to 50°. Neurons in IT are selective for stimulus attributes such as color, orientation, texture, direction of movement and shape. There is invariance to size or position for a given shape, and some scale or translation invariance. More than 85% of IT neurons respond to simple or complex visual stimuli [Desimone et al., 1984].

Logothetis and Sheinberg suggest that there may be several different types of perception with different neural sites, perhaps (1) basic category level objects, (2) individual identities of particular objects, (3) animate objects, and (4) visually guided movements.

The ventral visual hierarchy has been described by Keiji Tanaka and coworkers [Tanaka, 1996], Charles Gross and coworkers [Gross, 1994], and Yasushi Miyashita and coworkers [Miyashita, 1993] as recognizing complex objects, providing long term and short term memory of complex objects, and being involved in visual imagery [Sakai and Miyashita, 1993]. Unfortunately, the experiments and their analysis do not relate this complex object recognition to action. Furthermore, according to Keiji Tanaka [Tanaka, 1996] p. 135, "the accumulated findings favor the idea that no cognitive units represent the concept of objects; instead the concept of object is found only in the activities distributed over various regions of the brain".

Robert Desimone and coworkers have investigated attentional effects in IT directed from the frontal lobes. "The top-down selection templates for both locations and objects are probably derived from neural circuits mediating working memory, perhaps especially in prefrontal cortex" [Desimone and Duncan, 1995]. From the described anatomy and functionality, we can identify the following regions: Region VV1 (areas TE3 and V4). Keiji Tanaka distinguishes only between posterior and anterior IT. The extreme posterior inferior temporal areas are concerned with higher-order visual features. VV1 lesions give simple visual pattern deficits [Logothetis and Sheinberg, 1996].

Regions VV2 and VV3 (area TE2 and area TE1). Anterior inferotemporal cortex both recognizes and stores complex visual forms [Miyashita, 1993]. Recency and familiarity are also detected in anterior IT [Fahy et al., 1993].

Charles Gross et al. [Gross et al., 1972] discovered the visual recognition of socially significant objects such as hands in IT. Facial identity perception has also been shown in IT, in the inferomedial occipito-temporal region near the fusiform and lingual gyri. Viewer-centered detection of bodies and body parts has been shown in IT by Wachsmuth et al. [Wachsmuth et al., 1994].

12.2.6 Dorsal visual areas

Anatomy. I diagram the dorsal visual areas in Figure 12.7.

[Figure 12.7 (diagram, not reproduced): (a) lateral and medial views of the dorsal visual areas (V1, V2, VP, V4, MT, MST, DP, PIP, PO, MDP, MIP, VIP, LIP, 7a); (b) principal connections and hierarchy of functional involvements: DV1 spatial and motion features, DV2 eye saccades and guidance of reaching and grasping, DV3 spatial maps and perception.]

Figure 12.7: The dorsal visual hierarchy

Connections and architectonics of the posterior parietal lobe in rhesus monkeys have been described by Pandya and Seltzer [Pandya and Seltzer, 1982]. Andersen et al. [Andersen et al., 1990] have mapped, using cynomolgus and rhesus macaque monkeys, the inferior part of the posterior parietal lobe. We will use the intrinsic connectivity and notation of Daniel Felleman and David Van Essen [Felleman and Essen, 1991] for the dorsal visual areas. To accommodate data from Pandya and coworkers, I use the name OAa to denote an area consisting of MT, MST and FST. From the described anatomy and functionality, we can identify the following regions: Region DV1 (areas PIP, PO and MT). Spatial layout features are derived in DV1. PO, for example, mainly uses information from the periphery of the visual field. Area MT is involved in binocular disparity, speed and direction of stimulus motion [Logothetis and Sheinberg, 1996]. I have included area MT in both the dorsal visual and the ventral visual regions, following the detailed studies of John Maunsell [Maunsell, 1995] showing that it has a role in both. Region DV2 (areas MIP, VIP, LIP, DP, MDP, MST and FST). Andersen [Andersen, 1995] has described the encoding of intention and spatial location in the posterior parietal cortex, in areas LIP and MDP (which is medial). These intentions specify saccades that the animal intends to make. Smooth pursuit and motion perception are done in OAa.

Region DV3 (caudal 7a). Vernon Mountcastle [Mountcastle, 1995b] has reviewed work on visually-guided reaching neurons in dorsal visual areas, which also connect in PM3 with somatosensory guiding neurons in area 5. "Grasping" neurons are found in dorsal visual areas as well as area 5 [Wise and Desimone, 1988]. "Reaching" neurons have been found in 7a, particularly for reaches with either arm. Mountcastle summarized that these areas are concerned with (a) spatial perception, maps and coordinate transformations, (b) generation of intentions to move, and (c) commands for visuomotor and somatomotor operations. Sereno and Maunsell [Sereno and Maunsell, 1995] have suggested that LIP has memory for shape features.

Area 7a subserves spatial maps and spatial perception [Andersen et al., 1990], based on a distributed planar gain field representation of the spatial map, which is head-centered. Perception of the subject's own gaze, i.e., the position of the eyes in the orbits, is performed in areas 7a, LIP and DP. According to Andersen et al. (1990, p. 105), "area 7a appears to be very different from the other visual areas in the IPL in that it is the only area that connects to some of the highest centers in the brain".

Various kinds of spatial map are constructed in the parietal lobe from somatosensory, visual and perhaps even auditory information. These maps concern the body and the larger environment of the animal. This information could be propagated to the frontal lobes in elaborating the spatial aspects of action. Specific maps will support action descriptions that are relatively definite and detailed, and for which spatial aspects have been determined.

12.2.7 Polymodal STS areas

Anatomy. I diagram the polymodal STS (superior temporal sulcus) areas in Figure 12.8.

Our parcellation, notation and intrinsic connectivity are based on the work of Barnes and Pandya [Barnes and Pandya, 1992]. Here, there is a wealth of work from David Perrett's group at St. Andrews University. Working with macaque monkeys, they have found cell responses to many different social stimuli. From the described anatomy and functionality, we can identify the following regions:

[Figure 12.8 (diagram, not reproduced): views of the polymodal STS areas (TPO-1 to TPO-4, TAa, TEa, TEm, Ipa, paAlt, TPro, OAa) with their principal connections and hierarchy of functional involvements: PM1 social feature recognition, PM2 social configuration recognition, PM3 social goal recognition; connections to temporal, parietal and frontal areas.]

Figure 12.8: The polymodal hierarchy of the superior temporal sulcus

Regions PM1, PM2 and PM3 (areas {TPO-4 and TPO-3}, {TPO-2, Ipa and TAa}, and {TPO-1 and TPro}). Classes of stimuli detected include: (1) faces: facial expression and facial identity [Perrett et al., 1979] [Perrett et al., 1992], face characteristics [Perrett and Mistlin, 1990], (2) direction of the head in horizontal and vertical planes [Perrett et al., 1991], face view [Perrett et al., 1985], (3) eye gaze direction, eye contact with the subject [Perrett et al., 1985], (4) hand actions [Perrett et al., 1989b], (5) limb position and body posture, patterns of walking, jerky motion [Bruce et al., 1981] [Perrett et al., 1985], (6) appearance and disappearance from the visual field [Bruce et al., 1981] [Perrett et al., 1985], (7) limb and body movements of social significance, such as turning towards or away from the viewer, standing up, crouching down [Perrett et al., 1990a], and (8) tactile stimulation in and out of sight, unexpected tactile stimulation, social actions, touching [Perrett et al., 1990b]. Cells also distinguish between the subject's own movements and those of others, and detect unexpected events. Michael Oram et al. [Oram et al., 1993] have also pointed out that perceived socially significant motions can be independent of visual form.

Perrett has developed a conceptual framework for these findings [Perrett et al., 1989a]. He suggests a processing hierarchy for recognizing social stimuli based on faces, hands, eye gaze, and body and limb position and movement. Percepts can be in viewer-centered, object-centered and goal-centered frames. Viewer-centered representations are used preferentially in social situations, for they quickly constrain social response options.

All of this information is very useful for coordinating social action, a key requirement of life in primate groups. Descriptions are derived at a level appropriate to support the execution of a specified plan, described in terms of spatial relations and action types but without specific positional information.

Goal-centered descriptions would be useful at a yet higher level, since they are independent of the choice of specific action; neurons that encode such descriptions are found mainly in PM3 [Perrett et al., 1989a].

It is difficult to find a strong hierarchy in the sequence PM1, PM2, PM3. However, PM1 mainly recognizes simple features, and goal-centered descriptions are only found in PM3. We include the temporal pole TPro in PM3; it receives connections from AU3 and VV3.

Polymodal regions in the somatosensory and visual hierarchies. PM3 contains cells responding to tactile stimulation, but conditioned by visual information. The hierarchy of processing of somatosensory information thus extends beyond SS3 to PM3. Likewise, the dorsal visual hierarchy extends from DV3 to PM2 and PM3. The hierarchy of auditory processing probably extends into the frontal lobes in a more active perception process.

The temporal pole and episodic memory. Imaging studies show that long term episodic memory is usually associated with the temporal pole [Markowitsch, 1995] and orbital frontal cortex [Shallice et al., 1994].

12.3 Summary and conclusion

Summary. In this chapter, I have reviewed the evidence for a hierarchical architecture in the primate brain. By examining neuroanatomical evidence for connections among neural areas, I was able to establish anatomical regions and connections. I then examined evidence for specific functional involvements of the different neural areas and found some support for hierarchical functioning of the perception hierarchies.

The essential technique I am using is to characterize each brain area in terms of the data types that it creates. I assume that there is a uniform cortical process which creates data and that this is a main activity of the brain.
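Stated computationally, this technique amounts to treating each region as a typed data source. A minimal sketch in Python, with illustrative region names and data types drawn from this chapter (the encoding itself is my own and purely notational):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Region:
    """A cortical region characterized by the data types it creates."""
    name: str
    areas: tuple       # component neural areas
    data_types: tuple  # types of data this region creates and stores

# A few of the regions identified in this chapter (illustrative subset).
REGIONS = [
    Region("VV1", ("TE3", "V4"), ("object identities", "object motion")),
    Region("DV1", ("PIP", "PO", "MT"), ("spatial features",)),
    Region("AU1", ("CB", "RB", "CPB"), ("auditory images", "auditory maps")),
]

def creates(region: Region, data_type: str) -> bool:
    """The uniform question asked of every area: does it create this data type?"""
    return data_type in region.data_types
```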

Conclusion. The overall structure, both anatomical and functional, of the primate brain is that of a set of cortical regions with spatial localization and clustered interconnectivity. Each region is specialized in data processing and data storage to a limited number of data types. Cortical regions are connected as a perception hierarchy, based on data representing increasingly general situations.

Chapter 13

Frontal areas and the perception-action hierarchy

Abstract: In this chapter, experimental evidence for the frontal areas of the primate neocortex is analyzed for conclusions concerning the existence of neural areas, for corticocortical connectivity among neural areas, and for the involvement of each frontal area in the functioning of the brain. This analysis shows that the primate neocortex consists in the main of a perception hierarchy, an action hierarchy and connections between them. In other words, from an information-processing point of view, the primate neocortex has a hierarchical perception-action architecture.


13.1 The neocortical planning and action hierarchy

13.1.1 Planning and action areas

Anatomy of the frontal lobe. I diagram the planning and action areas of the frontal lobe in Figure 13.1.

[Figure 13.1 (diagram, not reproduced): (a) lateral and medial views of the planning and action areas (4, 6, 8, 46, 9, 10, 11, 12, 24, 25, 32); (b) principal connections and hierarchy of functional involvements: MI muscle combinations, PA1 representations of explicit detailed action sequences for realtime execution, PA2 eye saccades, PA3 specific plans with data in working memory, PA4 general plans, self-paced, PA5 context and episode representation, G goals.]

Figure 13.1: The planning and action hierarchy

Intrinsic connectivity is taken mainly from Helen Barbas [Barbas, 1988], using the Walker notation and parcellation. Parcellations and notation consistent between rhesus macaques and humans have been given by Petrides and Pandya [Petrides and Pandya, 1994]. We will call the proisocortex in the frontal lobe FPro, and that in the temporal lobe TPro.

Fortunately, the frontal areas have been perspicaciously reviewed by Michael Petrides [Petrides, 1994]; Richard Passingham's [Passingham, 1993] book gives a comprehensive treatment; and of course Joaquin Fuster's [Fuster, 1997] classic monograph provides the primary basis for our understanding of the frontal lobes. The human prefrontal cortex is treated in Karl Zilles' chapter [Zilles, 1990] in George Paxinos's book [Paxinos, 1990] on the human nervous system, and also in André Parent's edition of Carpenter's textbook [Parent, 1996].

13.1.2 Human cognition

To determine the role of neural areas in cognition involving planning and executive management of planning, we need to look at human data.

Christopher Frith [Frith et al., 1991] [Frith, 1995] has reviewed work on problem solving and has concluded that dorsolateral prefrontal areas are activated for stimulus-driven cognition, whereas corresponding medial prefrontal areas are activated for cognition which is self- or internally generated. He designed experiments to distinguish willed versus non-willed action, where willed action involved choices, i.e., more than one correct response existed. It seems to us that the two distinctions, one involving external stimuli versus internal generation, and the other involving choice, should be treated as independent. That is, an action involving choice may or may not involve the use of external input. This also meant in his case that the non-willed actions were essentially easier and used lower areas such as SMA.

The generation of verbal sequences principally activated area 46 as well as anterior cingulate. The Wisconsin Card Sort Test, the most reliable test for prefrontal damage, involves choice and has been shown in several imaging studies, e.g., [Mentzel et al., 1998], to activate dorsolateral areas such as 46.

Per Roland and coworkers [Roland, 1993] investigated several cognitive tasks and showed progressively more anterior activation with the abstraction and difficulty of the task. Thus, route-finding problems were imaged by Roland [Roland and Friberg, 1985] and found to principally activate areas 9 and 10. Tower of London problems have been imaged by Richard Frackowiak et al. and found to activate areas 9 and 10. Roland also found that in arithmetic problems, areas 11 and 12 were involved, which he attributed to retrieval of memory of arithmetic skills for subtraction and the integers. Roland's conclusion was that dorsolateral prefrontal areas are used in "all tasks in which a primary instruction is given which contains directives for future processing .... if no processing of sensory information or if the performance is preempted or obvious then there is no activation".

The work on problem solving also typically shows activation of anterior cingulate. This area seems to be involved in the selection of goals, with its more rostral part involved in more cognitive goals and its more caudal part involved in more directly motor goals.

The nature of problem solving activity is being researched in clinical practice by Tim Shallice and coworkers [Shallice and Burgess, 1991a], and by Myrna Schwartz and coworkers [Schwartz, 1995], for example. They distinguish between more routine action and higher level action involving "contention scheduling", or choice. They conceive of higher level behavior as the activation of goal-directed schemas, and of lower level, but still cortical, behavior as involving simpler schemas. It seems to us that this is quite compatible with the selection of overall goals being made in the anterior cingulate. This clinical work uses imaging to determine areas of damage, but since these observed lesions are typically large and idiosyncratic, it is difficult to assign finer localization to the categories of higher- and lower-level cognitive activation.

I have chosen not to fully consider human verbal behaviors, leaving a more complete treatment of this large and fascinating subject for a future paper.

13.1.3 Frontal regions

From the described anatomy and functionality, from nonhuman primates and from humans, we can identify the following regions: Region MI (areas MI and 24). MI, the motor cortex, is well known. It has a body mapping and sends motor execution commands to muscle groups in the body. Areas of motor cortex, when stimulated, seem to produce groups of muscle contractions corresponding to common actions of the animal. Richard Passingham's conclusion [Passingham, 1993] (p. 37) is that the motor cortex is specialized for the execution of manipulative movements of the limbs and face, and of fine behavioral variants that are learned and which are selected in voluntary action.

Region PA1 (area 6). According to Passingham, lateral premotor cortex plays a role in the selection of manipulative movements, but not in the repetition of the same movement. It is also active in preparing to move. Selection often consists of using external cues to direct the movements. Medial premotor cortex plays a role in the selection of movement when no such cues are available, in repetitive movements that are self-paced, and in the performance of motor sequences from memory.

Region PA2 (area 8). Selection of eye movements requires a nonegocentric geometric frame and therefore information from external senses (in contrast to area 6, which uses proprioceptive information and therefore an egocentric frame). The dorsomedial eye field's selection of eye movements is not determined by visual targets, whereas the lateral eye fields select when a target has been presented.

Region PA3 (area 46). After a complete analysis of experimental data, Michael Petrides concluded that PA3 is involved in the generation of actions. Actions are generated on the basis of information held in working memory or retrieved from memory. The principal sulcus (PS) is involved in spatial working memory and monitoring (Petrides, 1994, p. 74). The term "monitoring" implies an expectation of what must or will occur and verification of what has occurred.

Region PA4 (areas 9 and 10). Petrides considered a region he called mid-dorsolateral frontal cortex, consisting of (i) dorsal 46 above PS (9/46d in his notation) and (ii) 9, to be a separate region. Our region PA4 is similar, but we also include 10 on connectivity grounds. Lesions in this region result in impairments to nonspatial working memory tasks with self-generated and externally generated responses.

This area is involved in self-ordered tasks in interaction with the medial temporal lobe (Petrides, 1994, p. 76). This region has "more [than PA5] specialized executive processing in working memory that is critical for the planning and organization of behavior" (Petrides, 1994, p. 79).

According to Petrides, lesions in this region do not markedly impair working memory tasks, but they do markedly impair certain nonspatial tasks such as those involving the monitoring of self-generated and externally generated responses. This applies, for example, to keeping track of which of a set of stimuli have already been selected.

Region PA5 (areas 11 and 12). Petrides considered a region he called ventrolateral frontal cortex, consisting of (i) 12 (47/12 in his parcellation), (ii) ventral 46 (below PS), and (iii) ventral 8 (45 in his parcellation), to be a separate region. This region, which is very similar to our PA5, is involved in executive processes concerning plans and intended actions, judgments of saliency and novelty, and active voluntary retrieval of information in long term memory in posterior association cortex. Lesions result in severe impairments in all types of problem solving.

PA5 is clearly a separate region, as connectivity with the perception hierarchies described here shows. Petrides [Petrides, 1994], Figure 18, p. 77, shows strong connections to ventrolateral frontal cortex from all unimodal and polymodal association areas including PM3 and AU3, whereas PA4 is only connected to PM2 and AU2 and other lower areas. Lesion work shows its separate functionality. It also has much stronger connectivity with the amygdala [Barbas, 1995].

I speculate that region PA5 may be best understood for its role in generating context. This is supported by Roland's work on arithmetic tasks already mentioned [Roland, 1993].

Emad Eskandar et al. [Eskandar et al., 1992] have shown that IT neurons code for visual images and for behavioral context in a separable way. We have argued above that episodes provide context. Nancy Andreasen et al. [Andreasen et al., 1995], in a PET study, found that both focused and spontaneous episodic memory retrieval activated anterior medial frontal regions (area 11) and precuneus/retrosplenial cingulate cortex. Tim Shallice et al. [Shallice et al., 1994], using PET, isolated acquisition from retrieval of verbal episodic memories. Retrieval was associated with activity in right areas 10, 46 and 12 (47 in their parcellation) of prefrontal cortex (12 being strongest) and the bilateral precuneus (31). Left anterior cingulate (32) was active in both acquisition and retrieval. Endel Tulving et al. [Tulving et al., 1994], in PET studies, have shown neuroanatomical correlates of retrieval in episodic memory to be right dorsolateral cortex areas 10, 46, 9 and anterior 6, using auditory sentence recognition. Passingham has argued that the ventral prefrontal cortex, which he defines as areas 11, 12, 13, and 14, "selects the goal given the current context" [Passingham, 1993], p. 171.

Region G (area 32). This region may subserve the representation of goals. G is the anterior cingulate cortex and has a similar connectivity to PA5, except it does not connect with VV3 or DV2. José Pardo et al. [Pardo et al., 1990], in an imaging study, found data to support the idea of a role for the anterior cingulate in "selection and recruitment of processing centers for task execution". Tomas Paus et al. [Paus et al., 1993], also in an imaging study, tentatively proposed the anterior cingulate as facilitating the execution of appropriate responses and/or suppressing the execution of inappropriate ones. This, they observed, is particularly useful when behavior has to be modified in new and challenging situations. Michael Posner [Posner and Rothbart, 1998] has concluded that anterior cingulate is involved in executive attention. In their review, Orrin Devinsky et al. [Devinsky et al., 1995] conclude that the anterior cingulate is involved in executive functions and the posterior cingulate is involved in visuospatial and memory functions [Vogt et al., 1992]. According to them, the executive functions of anterior cingulate include initiation, motivation and goal-directed behaviors, response selection including the decision not to move, and the expression of specific movement sequences that require little or no autonomic activity. Lesions of anterior cingulate cortex tend to impair willed actions.

See also [Elliott and Dolan, 1998] [Carter et al., 1998] [Lane et al., 1998] [Paus et al., 1998] [Derbyshire et al., 1998] [Bussey et al., 1997] [Meunier et al., 1997] [Bussey et al., 1996] [Muir et al., 1996] [Seamans et al., 1995].

Region | Time scale | Memory | Complexity and scope of action | Monitoring
G | until goal satisfied | memory for goal | goal expressions, desired states | monitoring of goal progress and satisfaction
PA5 | episode | long term memory for context | context | monitoring of context
PA4 | length of current plan | working memory | self-ordered tasks | monitoring of total plan
PA3 | length of current action | working memory of spatial position | action involving person, objects | monitoring of action details
PA2 | saccade | use of external target or self-generated | eye saccade | no monitoring
PA1 | real time | use of visual movement cues | self-paced limb and face sequencing | no monitoring
MI | real time | none | muscle contractions in useful combinations | no monitoring

Figure 13.2: Characterization of planning and action hierarchy

13.1.4 Planning and action hierarchy

We can therefore define a planning and action hierarchy based on a partial ordering determined by (1) immediacy and time scale, (2) memory, whether used and its temporal extent, (3) complexity and scope of action, and (4) the extent of monitoring. We display these dimensions in Figure 13.2. This table shows that we can define such a hierarchy based on these measures.
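As a sketch of this partial ordering, the regions of Figure 13.2 can be encoded as tuples of numeric ranks, one per dimension, with one region below another exactly when it is dominated on every dimension. The rank values here are hypothetical illustrations, not measurements:

```python
# Hypothetical numeric encodings of the dimensions in Figure 13.2
# (0 = most immediate / least extensive).
RANKS = {
    "MI":  (0, 0, 0, 0),
    "PA1": (0, 1, 1, 0),
    "PA2": (0, 1, 1, 0),
    "PA3": (1, 2, 2, 1),
    "PA4": (2, 2, 3, 2),
    "PA5": (3, 3, 4, 3),
    "G":   (4, 3, 4, 3),
}

def below(a: str, b: str) -> bool:
    """a <= b in the partial order iff a is dominated on every dimension."""
    return all(x <= y for x, y in zip(RANKS[a], RANKS[b]))

assert below("MI", "PA3") and not below("PA5", "PA3")
```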

13.2 The perception and action hierarchies

13.2.1 Cortical regions and their hierarchies

I summarize here the perception and action hierarchies we developed in the previous sections. I defined regions which consist of several neural areas and which form components of the architecture. The use of regions breaks down the cortical connectivity analysis into three levels: (i) aggregation of neural areas into regions, (ii) (intrinsic) connectivity of regions within individual hierarchies, and (iii) (extrinsic) connectivity of regions between individual hierarchies. These choices are determined by an analysis of function as well as anatomy.
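The three levels can be made concrete with a small sketch: assuming, as an approximation, that two regions are connected whenever some pair of their component areas is connected, region-level connectivity follows mechanically from area-level data. The area pairs below are an illustrative subset of the connections tabulated in Figures 13.5 and 13.6:

```python
# Level (i): aggregation of areas into regions (illustrative subset).
REGION_AREAS = {
    "DV2": {"MIP", "VIP", "LIP", "DP"},
    "PA3": {"46"},
    "PM2": {"TPO-2"},
}
# Area-level connections (illustrative subset of Figures 13.5/13.6).
AREA_LINKS = {("LIP", "46"), ("DP", "46"), ("TPO-2", "46")}

# Levels (ii)/(iii): region-level connectivity derived from area data.
def regions_connected(r1: str, r2: str) -> bool:
    return any((a, b) in AREA_LINKS or (b, a) in AREA_LINKS
               for a in REGION_AREAS[r1] for b in REGION_AREAS[r2])

assert regions_connected("DV2", "PA3")  # via LIP -> 46 and DP -> 46
```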

The component neural areas of the regions we have identified in the previous sections are summarized in Figure 12.1. I draw the regions on the cortex in Figure 13.3 and I give a summary table in Figure 13.4.

[Figure 13.3 (diagram, not reproduced): lateral, medial and ventral views of the cortex showing the regions VI, VV1, VV2, VV3, DV1, DV2, DV3, SI, SS1, SS2, SS3, AI, AU1, AU2, AU3, GU1, OL1, PM1, PM2, PM3, MI, PA1, PA2, PA3, PA4, PA5 and G.]

Figure 13.3: Views of the cortex showing regions

hierarchy | region | functional involvements
olfactory | OI | odor detection and discrimination, simple odor memory
 | OL1 | odor-guided behavior, mating and sexual behavior
gustatory | GI | timing, detection and characterization of taste
 | GU1 | integration of taste with satiety information
somatosensory | SI | tactile detection
 | SS1 | tactile images
 | SS2, SS3 | guidance of reaching and grasping
 | PM3 | socially significant tactile recognition
auditory | AI | auditory detection, images
 | AU1 | auditory images, maps
 | AU2, AU3 | socially significant auditory recognition
ventral visual | VI | visual features and images
 | VV1 | object identity and motion
 | VV2, VV3 | complex objects and long term memory
dorsal visual | VI | visual features and images
 | DV1 | spatial features
 | DV2 | eye saccades
 | DV2, DV3 | guidance of reaching and grasping
 | DV2, DV3 | spatial maps and spatial perception
 | PM3 | socially significant perception and guidance
polymodal STS | PM1, PM2, PM3 | socially significant viewer-centered perception
 | PM3 | socially significant goal-centered perception
 | PM3 | episodic memory
planning and action | G | goals and action selection
 | PA5 | context and episode representation
 | PA4 | complex plans, self-paced
 | PA3 | specific plans with data in working memory
 | PA2, PA1 | explicit detailed action sequences for realtime execution
 | MI | muscle combinations

Figure 13.4: Summary of experimental findings for hierarchy of data abstraction

We can now turn to our third task, that of reviewing the extrinsic connectivity among the different individual hierarchies.

13.2.2 Connectivity between perception and action hierarchies

Figures 13.5 and 13.6 give the table of all extrinsic connections between areas in different lobes. I will assume, as an approximation, that all connections, intrinsic and extrinsic, are reciprocal (some exceptions to reciprocity are listed by Felleman and Van Essen [Felleman and Essen, 1991]).
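The reciprocity approximation has a simple computational reading: close the set of reported directed connections under reversal. A sketch, with an illustrative edge set:

```python
# Sketch of the reciprocity approximation: treat every reported connection
# as bidirectional by closing the directed edge set under reversal.
def reciprocal_closure(edges):
    """Return the edge set with every (a, b) accompanied by (b, a)."""
    return set(edges) | {(b, a) for (a, b) in edges}

reported = {("DV3", "46"), ("TPO-1", "46"), ("TPO-1", "9")}  # illustrative
assert ("46", "DV3") in reciprocal_closure(reported)
```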

Region | Area | Area | References

olfactory areas → planning and action areas
OI | OI | 14c, 25, 13a, 13m | [Rolls and Baylis, 1994]

gustatory areas → planning and action areas
GI | operculum, IG | 12o | [Carmichael et al., 1994]

somatosensory areas → polymodal STS areas
SA1 | PF (7b) | TPO-1 | [Neal et al., 1990]

somatosensory areas → planning and action areas
SI | postcentral gyrus SI (1, 2, 3) | MI (4) | [Pandya and Yeterian, 1990]
SA1 | PE, PEa | rostral 4, 6 (dorsal (MI) and SMA (MII)) | [Pandya and Yeterian, 1990]
 | PF (7b) | ventral 6, 8, 45, 46, 24 | [Pandya and Yeterian, 1990] [Cavada and Goldman-Rakic, 1989]
SA2 | PEc | rostral 6 (MI and MII) | [Pandya and Yeterian, 1990]
 | rostral POa | ventral 46 | [Pandya and Yeterian, 1990]
 | PFG | 8, rostral 46, 24 | [Pandya and Yeterian, 1990] [Cavada and Goldman-Rakic, 1989]
SA3 | PGm | rostral 6 above AS, 8, dorsal 46 and 9, 24 | [Pandya and Yeterian, 1990]
 | 7m (PGm) | 45, 23, 24 | [Cavada and Goldman-Rakic, 1989]
 | rostral PG | 8, rostral 46, 24 | [Pandya and Yeterian, 1990] [Cavada and Goldman-Rakic, 1989]

auditory areas → polymodal STS areas
AA1 | TS3 | TPO-3, TAa | [Seltzer and Pandya, 1989]
AA2 | TS2 | TAa | [Seltzer and Pandya, 1989]
AA3 | TS1 | TPO-1, TPro | [Seltzer and Pandya, 1989]

auditory areas → planning and action areas
AA1 | CB, RB, CPB (caudal TS3) | dorsal 8 in concavity of AS, caudal 46 | [Pandya and Yeterian, 1990] [Hackett et al., 1999]
AA2 | CSTG, CSTS (rostral TS3, TS2) | prearcuate 46 below principal sulcus, rostral 46, dorsal prefrontal 9 and 10 | [Pandya and Yeterian, 1990] [Romanski et al., 1999]
 | | 11, 12, 13 | [Romanski et al., 1999]
 | Ipa | 46, 10, 11, 12, 14 | [Seltzer and Pandya, 1989]
AA3 | 23b, CML | 9, 46 | [Yukie, 1995] [Goldman-Rakic et al., 1984]
 | RPB, RSTS, RSTG (TS1) | 12 and 13 (OFC), 25 and 32 (medial PFC) | [Pandya and Yeterian, 1990]
 | | 10, 11, 12, 13, 24, 32 | [Romanski et al., 1999]

Figure 13.5: Table of all extrinsic connections among neural areas, part 1. AS: arcuate sulcus

Region | Area | Area | References

ventral visual areas → dorsal visual areas
VV1 | V4 | MT, FST, DP, LIP, PIP, caudal 7a | [Felleman and Essen, 1991] [Neal et al., 1990] [Young, 1992]
VV1 | TE3 (PIT) | LIP, MST, FST | [Maunsell, 1995]
VV3 | TE1 (AIT) | 7a | [Maunsell, 1995]

ventral visual areas → polymodal STS areas
VV1 | OAa | Ipa | [Seltzer and Pandya, 1989]
VV2 | CIT | TPO-4 | [Hilgetag et al., 2000]

ventral visual areas → planning and action areas
VI | V2, V3, VP, V4, V4t | FEF (8) | [Felleman and Essen, 1991]
VV1 | V4 | 46 | [Felleman and Essen, 1991]
 | TE3, OAa (lateral prestriate) | premotor prearcuate cortex (8) | [Pandya and Yeterian, 1990]
VV2 | TE2 | premotor rostral 8, prearcuate 46 below PS | [Pandya and Yeterian, 1990]
 | TEa, TEm | 8, 46, 11, 12 | [Seltzer and Pandya, 1989]
VV3 | TE1, TE2 | premotor rostral 8, prearcuate 46 below PS, 11 and 12 (orbitofrontal) | [Pandya and Yeterian, 1990]

dorsal visual areas → polymodal STS areas
DV1 | PO, PIP, MIP, FST, MST | TPO-4 | [Seltzer and Pandya, 1994] [Hilgetag et al., 2000]
DV2 | MIP, PIP | TPO-2, TPO-3 | [Seltzer and Pandya, 1994]
 | VIP, LIP | TPO-4 | [Seltzer and Pandya, 1994]

dorsal visual areas → planning and action areas
DV1 | PO | 8 | [Colby et al., 1988] [Felleman and Essen, 1991]
DV2 | DP | 8, 46 | [Felleman and Essen, 1991]
DV2 | VIP | 8 | [Felleman and Essen, 1991]
DV2 | LIP | 6 (ventral premotor), 8, 46, 12, 24 | [Felleman and Essen, 1991] [Cavada and Goldman-Rakic, 1989]
DV3 | caudal 7a | 6, 8 (weakly), 46, 9, 12, 24 | [Pandya and Yeterian, 1990] [Cavada and Goldman-Rakic, 1989]

polymodal STS areas → planning and action areas
PM1 | TPO-4 | 6, 8, caudal 46 | [Seltzer and Pandya, 1989]
 | TPO-3 | dorsal 46, 9, 10 | [Seltzer and Pandya, 1989]
PM2 | TPO-2 | dorsal 46, 9, 10 | [Seltzer and Pandya, 1989]
PM3 | TPO-1 | 46, 9, 10, 11, 12, 13, 14, 24, 32 | [Seltzer and Pandya, 1989]
 | TPro | FPro | [Seltzer and Pandya, 1989]

Figure 13.6: Table of all extrinsic connections among neural areas, part 2

I diagram the connectivities from the perceptual regions to the frontal regions in Figure 13.7.

[Figure 13.7 (diagram, not reproduced): five panels showing the connectivity of the VV, DV, SA, PM and AA regions to the frontal areas 4, 6, 8, 46, 9, 10, 11, 12, 13, 14, 24, 25 and 32.]

Figure 13.7: Diagram of connections to frontal areas

This regular clustering of lateral connections from perceptual to frontal regions provides further support for our identified perceptual regions and their intrinsic hierarchy. It also supports our partitioning of the set of frontal areas into a hierarchy of frontal regions, based on the clustering of their connectivities with the perceptual regions.

Figure 13.8 diagrams how regions are ordered within perception hierarchies, and are connected to frontal regions which form the planning and action hierarchy.

Due to the large number of connections, even between regions, I use a diagrammatic form which principally shows connections from perception regions to frontal regions; in addition, the main ascending and descending connections within hierarchies are indicated. A module is positioned at a level determined by its main connections; it is often also connected to adjacent frontal regions. DV3's main connection is to area 46 (Andersen et al., 1990), hence it is placed at that level. To avoid clutter on the diagram, I have not shown DV2's connections to PA5 and G. I have not shown in this diagram the fine structure of the planning and action hierarchy, i.e., all the connections intrinsic to the frontal regions. I have not analyzed lateral connections between different perception hierarchies.
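The placement rule can be sketched as follows: a region is assigned the level of the frontal area to which it connects most strongly. The connection strengths and the area-to-level assignment below are hypothetical, chosen only to reproduce the DV3 example:

```python
# Hypothetical assignment of frontal areas to levels of the hierarchy
# (1 = MI ... 5 = PA5/G), following the five-level grouping defined below.
FRONTAL_LEVEL = {"4": 1, "6": 2, "8": 2, "46": 3, "9": 4, "10": 4, "12": 5, "32": 5}

def place(region_connections: dict) -> int:
    """Level of the frontal area this region connects to most strongly."""
    main_target = max(region_connections, key=region_connections.get)
    return FRONTAL_LEVEL[main_target]

dv3 = {"46": 0.9, "8": 0.2, "9": 0.3}  # illustrative strengths
assert place(dv3) == 3                  # DV3 lands at the level of area 46
```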

I will regard the entire occipital lobe as a single region, VI, except for V4, which is part of region VV1. The regions at the bottom of the hierarchy, VI, AI, SI, MI, are concerned with interfacing to sensors and effectors. They all have complex structure, involving many subareas and often hierarchical structure within the region. For our purposes, however, we will treat each as a single architectural region.

13.2.3 Perception-action hierarchical architecture

Combining this evidence and these concepts, we can now define a five-level functional hierarchy. I will put G and PA5 on the same level, although they are distinct regions, since they do not have a strong relative hierarchical ordering, but act more in parallel as goal and context respectively at the same level. Similarly, regions PA1 and PA2 will be put on the same level since they act in parallel, PA2 for eye movement control and PA1 for body movement control. Also, regions SS2 and SS1 seem to belong on the same level, mainly due to connectivity.

To speculatively characterize the hierarchy, level 5 concerns goals, context, episodes, social goals, overall spatial awareness, and social messages; level 4 concerns complex plans, and long term memory for complex objects; level 3 concerns specified plans with details in working memory, social objects, social action features, spatial descriptions, tactile guidance, and auditory social messages; level 2 concerns detailed action sequences for the self, eye saccades, action features, spatial features, tactile guidance, and auditory features; level 1 concerns activations of muscle groups and body parts, and tactile feedback.

Perception at each level concerns attending to, acquiring and maintaining information at this level of description. Planning and action at each level concerns acquiring, selecting and elaborating action descriptions.
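As a compact restatement, the five levels can be written down as a data structure pairing what is perceived with what is planned at each level. The wording of the entries paraphrases the characterization above:

```python
# The five-level characterization as a data structure; entries paraphrase
# the speculative characterization in the text.
LEVELS = {
    5: ("goals, context, episodes, social goals, overall spatial awareness",
        "prioritization and selection of goals, maintenance of context"),
    4: ("long term memory for complex objects",
        "complex plans"),
    3: ("social objects, social action features, spatial descriptions",
        "specified plans with details in working memory"),
    2: ("action features, spatial features, auditory features",
        "detailed action sequences for the self, eye saccades"),
    1: ("tactile feedback",
        "activations of muscle groups and body parts"),
}

def describe(level: int) -> str:
    perception, action = LEVELS[level]
    return f"level {level}: perceives {perception}; acts by {action}"
```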

13.3 Summary and conclusion

Summary. In this chapter, I have reviewed the evidence for a hierarchical architecture in the frontal areas of the primate brain. By examining neuroanatomical evidence for connections among neural areas, I was able to establish anatomical regions and connections. I then examined evidence for specific functional involvements of the different neural areas and found some support for hierarchical functioning for the planning and action hierarchy in the frontal lobes.

The essential technique we are using is to characterize each brain area in terms of the data types that it creates. I assume that there is a uniform cortical process which creates data and that this is a main activity of the brain.

I have managed to push the analysis of the neocortex to include plans and goals as types of data. If we have this set of brain areas and a set of data types which includes different levels of percept, different levels of action description, and plans and goals, then this system constitutes a parallel computer which generates the complex primate behavior we are seeking to describe.

Thus this chapter’s main aim is to establish that this complete set of brain areas is a reasonable description of the neocortex, and that goals and different levels of planning and action can be included. Once this empirical description has been obtained, then chapters 14 and 15 can go ahead and generate a precise description of the parallel computer which is my model of the neocortex.

Conclusion. The overall structure, both anatomical and functional, of the primate brain is that of a set of cortical regions with spatial localization and clustered interconnectivity. Each region is specialized in data processing and data storage to a limited number of data types. Cortical regions are connected as a perception hierarchy, a planning and action hierarchy, and connections between corresponding levels of these two hierarchies. The perception hierarchy is based on data representing increasingly general situations. The planning and action hierarchy is based on increasingly general situations, plans and control.

[Figure 13.8 (diagram, not reproduced): the perception hierarchies (somatosensory SI, SS1, SS2, SS3; dorsal visual DV1, DV2, DV3; auditory AI, AU1, AU2, AU3; ventral visual VI, VV1, VV2, VV3; polymodal PM1, PM2, PM3; olfactory OI, OL1; gustatory GI, GU1) connected level by level to the planning and action hierarchy MI, PA1, PA2, PA3, PA4, PA5, G.]

Figure 13.8: Neocortical perception-action hierarchy

Chapter 14

Describing information processing in the neocortex

Abstract: In this chapter, I describe the use of general description methods for characterizing information processing in the brain.

I develop basic computational principles that are observed to hold for the brain.


14.1 Introduction

Motivation. In chapter 12, I reviewed experimental data on neuroanatomical connectivity and neurophysiological activity of the neurons comprising the primate neocortex. There was sound evidence for the widely held belief that the neocortex is made up of discrete cortical regions with specialized functional involvements. My information-processing analysis of these findings concluded that each region processes certain types of data specific to that region. I also introduced information-processing concepts of goal, plan, sequence, event, and context as data types processed by certain regions. Furthermore, from an analysis of connectivity, I concluded that these regions are connected together in a particular architectural scheme, namely a perception-action hierarchy. The chapter described these cortical regions, the types of data processed by each region, and the connections among regions.

What it did not do was explain how such a set of cortical regions provides the neural basis for complex organized primate behavior. The next two chapters provide this explanation, by presenting a system-level theory of brain function, using as a basis the conclusions of the previous two chapters.

We present here a system model of the primate neocortex which shows how the set of specialized cortical functions can be put together, using the connectivity of the neocortex, to produce real behavior.

System models. To reiterate, a system model treats an object of study as a set of interacting subsystems, each of which is easier to understand and to describe than the complete system. It results in explanations of objects as due to the action of each subsystem and the interactions among subsystems.

Natural science, computer science and causal models. From the hierarchy of functional involvement alone, we cannot construct a model of brain functioning. The experimental results demonstrate the involvement of some parts of the brain in some given behavior, but they do not demonstrate a causal functioning model of the brain actually operating to produce the behavior. Again, to reiterate, by causal we mean that the model has a dynamics of changing in time from one state to another, each next state being determined from the current state. I will determine computational principles by which such a hierarchical system of information-processing regions can function to produce behavior.
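This notion of causality can be written directly as a state-transition system: a next-state function applied repeatedly to a global state. The transition body here is a placeholder for the computational principles developed in this chapter:

```python
# Sketch of what "causal" means here: a dynamics in which each next state
# is computed from the current state alone.
def transition(state: dict) -> dict:
    """Next global state from current state (placeholder dynamics)."""
    return {"t": state["t"] + 1, "data": state["data"]}

def run(initial: dict, steps: int) -> dict:
    state = initial
    for _ in range(steps):
        state = transition(state)  # next state depends only on current state
    return state

assert run({"t": 0, "data": {}}, 3)["t"] == 3
```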

The perception and action hierarchies of the primate neocortex. As a basis for our system model, I now summarize the findings of the previous chapter, showing a hierarchy of function and data types in the cortex. I work with regions made up of several neural areas. I list the neural areas comprising each region and summarize their functional involvements in Figure 14.1.

Figure 14.2 summarizes the hierarchy of behavior and functionality.

I show the regions on a lateral view of the cortex in Figure 14.3(a), together with an indication of their functional involvements.

hierarchy | region | corresponding areas | functional involvements | reference
somatosensory | SI | 1, 2, 3 | tactile detection | [Kaas and Huertas, 1988]
 | SS1 | PE, PEa, PF | tactile images | [Merzenich et al., 1981]
 | SS2 | PEc, rostral POa, PFG | guidance of reaching and grasping | [Mountcastle, 1995b]
 | SS3 | PGm, rostral PG | guidance of reaching and grasping | [Mountcastle, 1995b]
 | PM3 | TPO-1, TPro | socially significant tactile recognition | [Perrett et al., 1989a]
auditory | AI | auditory cortex | auditory detection, images | [Brugge and Reale, 1985]
 | AU1 | CB, RB, CPB (TS3) | auditory images, maps | [Juhani Hyvärinen, 1982]
 | AU2 | CSTG, CSTS, RPB (TS2) | socially significant auditory recognition | [Newman, 1978]
 | AU3 | RSTG and RSTS (TS1), CML, 23b | socially significant auditory recognition | [Newman, 1978]
ventral visual | VI | V1, V2, V3, VP, V3A, V4t | visual features and images | [Essen et al., 1990]
 | VV1 | TE3, V4 | object identity and motion | [Logothetis and Sheinberg, 1996]
 | VV2 | TE2 | complex objects and long term memory | [Miyashita, 1993]
 | VV3 | TE1 | complex objects and long term memory | [Miyashita, 1993]
dorsal visual | VI | V1, V2, V3, VP, V3A, V4t | visual features and images | [Essen et al., 1990]
 | DV1 | PIP, PO, MT | spatial features | [Essen et al., 1990]
 | DV2 | MIP, VIP, LIP | eye saccades | [Andersen, 1995]
 | DV2 | MIP, VIP, LIP, DP, MDP, MST, FST | guidance of reaching and grasping | [Mountcastle, 1995b]
 | DV3 | caudal 7a | spatial maps and spatial perception | [Andersen, 1995]
 | PM3 | TPO-1, TPro | socially significant perception and guidance | [Perrett et al., 1989a]
polymodal STS | PM1, PM2, PM3 | TPO-4, TPO-3, TPO-2, TPO-1, TPro | socially significant viewer-centered perception | [Perrett et al., 1990a]
 | PM3 | TPO-1, TPro | socially significant goal-centered perception | [Perrett et al., 1989a]
 | PM3 | TPO-1, TPro | episodic memory | [Perrett et al., 1989a]
planning and action | G | 24, 25, 32 | goals and action selection | [Devinsky et al., 1995]
 | PA5 | 11, 12, 13, 14, FPro | context and episode representation | [Petrides, 1994]
 | PA4 | 9, 10 | complex plans, self-paced | [Petrides, 1994]
 | PA3 | 46 | specific plans with data in working memory | [Petrides, 1994]
 | PA2, PA1 | 8, 6 | explicit detailed realtime action sequences | [Passingham, 1993]
 | MI | 4 | muscle combinations | [Passingham, 1993]

Figure 14.1: Summary of experimental findings for hierarchy of data abstraction

Level 5: goals and context
  Perception: perception of goals; perception of context.
  Action: prioritization and selection of goals; maintenance of current context.
  Types of information: desired states described abstractly, with priorities and urgencies of such states; contexts - classes of events and episodes, themes, general plans applying to classes of episode; objects, actions and relations described generally.
  Example: goal - affiliate with X; context - current resting situation, family foraging, summer afternoon, X is aunt.

Level 4: joint plans in social relationship
  Perception: perception of social features, situations, social actions and intentions.
  Action: generation of social plans.
  Types of information: plans for a well-defined situation class - objects, actions and relations corresponding, involving others.
  Example: groom with X.

Level 3: joint plan in relational form
  Perception: perception of features that indicate spatial relations, actions and intentions.
  Action: construction, execution and monitoring of explicit joint plan in relational form.
  Types of information: joint plans with assigned roles, including defined actions specified in terms of relations.
  Example: groomee is X, groomer is self; actions (approach, prelude, groom).

Level 2: self action in detail
  Perception: perception of position, orientation, movement and velocity.
  Action: construction and execution of detailed plan for self.
  Types of information: concrete actions for the self, including detailed spatial and temporal characteristics; detailed motor programs, to allow realtime performance of the actions without immediate feedback.
  Example: approach to X at position (300,360,0); get up, turn and walk.

Level 1: motor actions
  Perception: perception of somatosensory and tactile features for muscle selection.
  Action: activation of individual muscle groups.
  Types of information: individual actions by sets of muscle groups.
  Example: front right leg(), front left leg().

Figure 14.2: Computational hierarchical levels used in my model

[Figure 14.3 diagram: (a) cortical regions labeled with their functional involvements (VI visual features, DV1 spatial features, DV3 spatial maps, VV1 object identities, SS1 tactile images, SI tactile detection, AI auditory detection, AU1 auditory features, PM1-PM3 social perception and episodic memory, VV2-VV3 complex objects, G goals, MI muscle combinations, PA1 detailed actions, PA2 eye saccades, PA3 specific plans, PA4 complex plans, PA5 contexts and episodes); (b) the same regions arranged as a connected perception-action hierarchy.]

Figure 14.3: (a) Lateral view of the cortex showing neural regions and functional involvements, (b) Connectivity of regions showing perception-action hierarchy

The region G is shown with a dotted line boundary to indicate that it is interior, being on the medial surface. In Figure 14.3(b), I give the connectivity of the set of cortical regions. The hierarchy is diagrammed in (b) on its side, with its top at the left, in order to make it correspond to the usual lateral view of the cortex in (a). The positioning of a perception region on a vertical line indicates connection to the corresponding action region.

My model. My system model is diagrammed in Figure 14.4, showing the set of implemented modules with approximately corresponding cortical locations.

[Figure 14.4 diagram: cortical regions (MI muscle combinations, SI tactile detection, PA1 detailed actions for self, SS1 tactile images, DV2 spatial maps, PA3 specific joint plans, DV1 spatial features, PA4 overall plans, plan persons, G goals, PM1 person positions and movements, VI visual features, PM2 person actions and relations, VV1 object identities, PM3 social dispositions and affiliations, with motor output and tactile and visual input) paired with the implemented system modules: goals, overall plans, specific joint plans, detailed plans for self, plan primates, social relations, perceived dispositions, primate actions and relations, primate positions and movements, sensor system, motor system, and the environment.]

Figure 14.4: Modules from neural areas of the primate neocortex, and my initial system model

This approximate correspondence locates the perception hierarchy along the superior temporal sulcus (STS) following David Perrett’s findings, and with episodic memory for social relations in the anterior temporal lobe. Goals are in anterior cingulate. Specific joint plans and detailed plans for self are in dorsal prefrontal. Tactile sensing in somatosensory regions and spatial maps in dorsal-visual regions were used in our extension of the model for social-spacing behaviors. This also used a simple low-level spatial navigation module which could be tentatively identified with PA1.

The model functions by continuously generating and selecting a goal, and elaborating and executing a corresponding plan via its action hierarchy, while perceiving the world using its perception hierarchy, with continuous interactions between these hierarchies.

In summary, we can create a causal model of the brain at the system level if we model each cortical region by a continuously acting process which constructs, stores, and transmits data of the types specific to that region. Processes are connected by channels whose connectivity corresponds to cortico-cortical connectivity. This results in a system model whose dynamics include feedback, goal-direction, conditional plan elaboration, attention and situated action.

Predictions. From the model, we can obtain detailed predictions of temporal sequences of spatial distributions of cortical activation, for behaviors represented using the model. These predictions have a detailed time granularity of about 20 msec, and could be compared with fMRI or ERP data. In order to obtain fMRI data for social behaviors, one could perhaps use a visual display showing video-clips of social interactions, or an interactive video game where the subject makes moves in a social interaction game.

Social interaction. The other main new advance is that the model shows how a perception-action model can result in a model of social interaction. This occurs because, in a situation with more than one animal, each animal continuously perceives

the other, and continuously acts toward the other conditionally upon what it perceives. Further, the hierarchical organization of the perception-action system allows a hierarchical description of the social interaction with different levels of control and protocol. The model provides correspondences between measurements of social interaction and the set of cortical regions and their activation patterns.

This chapter. In section 14.2, I derive computational principles from the biology of the cortex, and in section 14.3 I describe how, guided by these principles, I can represent cortical data and processes using predicate logic. In section 14.4, I describe the dynamics of our model. The next chapter gives a detailed description of a specific brain model which I have implemented on a computer, and I report behaviors and results obtained with our implemented model. This model, and its implementation, therefore establish the correctness and feasibility of the approach. They exhibit an actual functioning brain model based on the available empirical evidence.

14.2 The biological basis of our computational approach

In this section, I examine what we know about the primate cortex, and I develop the basic computational elements upon which to design a system-level brain model.

Areas. The primate cortex is partitioned into distinct areas. I therefore structure my computational model into a set of corresponding modules.

Areas have specific interconnectivity. The connectivity among areas is the same, or similar, for all primates. I will connect our modules in the same way. Areas are typically connected to a small number of other areas. Connections divide into long range and short range. At short range, an area is often connected to several neighboring areas that are contiguous with it. At long range, an area is usually connected to one, two or three areas that are further away, and not contiguous with it.
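As a concrete (and much simplified) illustration, here is a minimal Python sketch of modules connected by channels; the module names are taken from the system model of Figure 14.4, but the connectivity listed is only an illustrative fragment, not the full model:

# A minimal sketch of modules connected by channels. Module names follow
# Figure 14.4; the connectivity listed is an illustrative fragment only.
MODULES = ["goals", "overall_plans", "specific_joint_plans",
           "detailed_actions_for_self", "primate_positions_and_movements",
           "primate_actions_and_relations"]

# Channels correspond to cortico-cortical connections: (sender, receiver).
CHANNELS = [
    ("goals", "overall_plans"),
    ("overall_plans", "specific_joint_plans"),
    ("specific_joint_plans", "detailed_actions_for_self"),
    ("primate_positions_and_movements", "detailed_actions_for_self"),
    ("primate_positions_and_movements", "primate_actions_and_relations"),
    ("primate_actions_and_relations", "specific_joint_plans"),
]

def receivers(sender):
    """Return the modules directly downstream of the given module."""
    return [r for s, r in CHANNELS if s == sender]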

Each area is involved in specific kinds of processing. I will assume that each module processes only certain kinds of data, specific to that module.

Processing is distributed. Areas process data received and/or stored locally by them. There is no central manager or controller. This is a debatable issue. In our view, areas influence each other by data sent between modules. The set of modules works together in an integrated way, but by means of local processing and the exchange of data.

There is a uniform process. As discussed in section 3.3, the cortex seems to have the same operational or computational process over its entire area.

Cortical processing proceeds at a uniform rate. All modules do similar amounts of processing and run at about the same speed.

Data parallelism in communication, storage and processing. I assume that data is coded in parallel codes, such as population codes, so that a large set of parallel fibers carries a code for one message or one meaning. I assume that processing within a module is also highly parallel, operating on a large set of parallel fibers concurrently. Parallel coded data is transmitted, stored, and triggers processing. Processing acts on parallel data to produce parallel data.

The cortex works in real-time. The cortex's fastest reaction time to a stimulus is about 100 milliseconds. The time to process information in one area and to pass it on to the next area is about 20 milliseconds (Edmund Rolls, personal communication). Further, the path from incoming sensory stimulus to outgoing motor command runs through about five areas; five areas at about 20 milliseconds each accounts for the 100 millisecond reaction time. Language is processed in real-time, both generation and recognition, and, as Charles Goodwin has shown [Goodwin, 1981], even the co-construction of a sentence usually occurs in real-time, including nonverbal signaling between participants during the generation of the sentence. Hence the action of an area is, at least some of the time, an immediate reaction to its incoming data. Cognition occurs by modules exchanging data and by repeatedly reacting to incoming data and newly computed data. The incoming data arrives at an area, a process occurs in about 20 milliseconds, and output data is transmitted to other modules.

Data is “wide”. The data items being transmitted, stored and processed can involve a lot of information; they can be complex. Thus, if we have a parallel set of one million neurons, then the code for one choice or component of a data item might involve 10,000 neurons, and then the set of neurons might transmit 100 such components or choices simultaneously as one data item. Scientists describing this situation to each other use natural language which is less “wide”, so we tend to unconsciously assume that a data item in the brain is of similar complexity to a natural language word or phrase. However, a single data item, we suggest, may convey as much information as a whole paragraph.

14.3 Representing data and processes using logic

Logical modeling. My intention is to develop a very general and abstract kind of model that can be changed and specialized in the light of results obtained with it. For this purpose, we use predicate logic expressions to represent data and predicate logic inference rules to represent transformations of data. Given an abstract model, we will then later be able to consider more specific implementations of it, in particular, how it can be implemented as a neural net. However, the abstract model is self-contained: it can be run on a computer and its behavior found, and it can generate falsifiable predictions that can be tested against experiment.

Data items, and their storage and transmission. I will assume that we can view transmission and storage of data in the brain as codes which represent some information with a specific meaning. I therefore assume that, for the purposes of modeling at this level of analysis and abstraction, we can view all data streams and storage as made up of discrete data items. I will represent each data item by a logical literal which indicates the meaning of the information contained in the data item. In order to allow for ramping up and attenuation effects, I give every literal an associated weight, or strength, which is a real number. An example data item is position(adam,300,200,0) which might mean that the perceived position of a given other animal, identified by the name “adam”, is given by (x,y,z) coordinates (300,200,0). This might be a data item that is transmitted from one brain module to another. In the brain this would actually be implemented by a set of parallel neurons firing in a spatial pattern at certain firing rates. Its effect however is that the receiving module now has the information about the position of adam.
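To make this concrete, a weighted data item could be represented as follows; this is a minimal Python sketch, and the class name and field layout are my assumptions rather than the implementation's:

# A minimal sketch of a data item: a logical literal with an associated weight.
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Literal:
    predicate: str        # the meaning of the data item, e.g. "position"
    args: Tuple           # its arguments
    weight: float = 1.0   # strength, allowing ramping up and attenuation

# The example from the text: the perceived position of the animal "adam".
adam_position = Literal("position", ("adam", 300, 200, 0))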

Memory. Each cortical region will be represented by a continuously acting module which is a process with storage. The main determiner of processing will be the type of data being processed (rather than the function being computed), different regions being specialized for different data types. Every module may in general have stored data items. Depending on the time characteristics of the module, these stored items may constitute volatile, short term or long term memory. Items that are activated as a result of computation will have their activation sustained and will correspond to working memory. Thus, potentially, both long term memory and working memory are distributed over the set of modules; compare [Petrides, 1994].

Processing within a module. I represent the processing within a module by a set of rules. A rule matches to incoming transmitted data items and to locally stored data items. All the processing by a module is described by a set of left-to-right rules which are executed in parallel.

The patterns on the left-hand side of rules also have weights, and the strength of a rule instance is the product of the weights of the matching data items and of the patterns they match, multiplied by an overall rule weight.

A rule may do some computation which we represent by arithmetic. This should not be more complex than can be expected of a neural net. This arithmetic is represented in the body part of the rule, written as a “provided” expression, for example:

if position(W1,[M1,X1,Y1,Z1]), position(W2,[M2,X2,Y2,Z2])
then too_near(W,[M1,M2,D]),
provided(distance(X1,Y1,Z1,X2,Y2,Z2,D), D < 10.0)

This rule is intended to determine whether one animal is too near another.
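For illustration, a direct Python rendering of this rule might look as follows; the data layout and the way the instance strength is computed follow the description above, but the details are my assumptions:

# A runnable sketch of the too_near rule. positions is a list of
# (weight, name, x, y, z) data items; the 10.0 threshold is from the rule.
import math
from itertools import permutations

def distance(x1, y1, z1, x2, y2, z2):
    return math.sqrt((x1 - x2)**2 + (y1 - y2)**2 + (z1 - z2)**2)

def too_near_rule(positions, rule_weight=1.0):
    results = []
    for (w1, m1, x1, y1, z1), (w2, m2, x2, y2, z2) in permutations(positions, 2):
        d = distance(x1, y1, z1, x2, y2, z2)
        if d < 10.0:
            # instance strength: product of data item weights and rule weight
            results.append((w1 * w2 * rule_weight, ("too_near", m1, m2, d)))
    return results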

The results are then filtered competitively. Typically, only the single strongest rule instance is allowed to “express itself” by sending its constructed data items to other modules and/or storing them locally. In some cases, however, all the computed data is allowed through.

Uniform process. The uniform process is then the mechanism for storage and transmission of data and the mechanism for execution of rules.

Uniform rate. I achieve uniformity of rate by describing time on a discrete time scale; the model runs in discrete time cycles. In one processing cycle, all the rules in all the modules are executed once, that is, all rule instances, and then all selected data are communicated between modules and/or stored locally. The events of a processing cycle represent and abstract all the changes occurring during that time interval.
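A minimal sketch of one such cycle, under the assumptions of this section, is given below; the dictionary-based module representation is mine, not the implementation's:

# One discrete processing cycle: every module runs its rules on its inbox and
# store, the strongest instance per module expresses itself, then all selected
# items are delivered. Each rule maps a set of items to a list of
# (strength, item, target_modules) instances.
def run_cycle(modules):
    produced = {}
    for name, m in modules.items():
        data = m["inbox"] | m["store"]
        instances = [inst for rule in m["rules"] for inst in rule(data)]
        best = max(instances, key=lambda inst: inst[0]) if instances else None
        produced[name] = [best] if best is not None else []
    for m in modules.values():
        m["inbox"] = set()   # last cycle's transmissions have been consumed
    for name, insts in produced.items():
        for strength, item, targets in insts:
            for target in targets:
                if target == name:
                    modules[name]["store"].add(item)    # store locally
                else:
                    modules[target]["inbox"].add(item)  # transmit on a channel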

Perception-action hierarchy. Modules are organized as a perception-action hierarchy. I diagram my concept in Figure 14.5.

This is an abstraction hierarchy, so that modules higher in the hierarchy process data of more abstract data types. I use a fixed number of levels of abstraction.

The perception hierarchy receives sensory data items at the bottom and derives higher level descriptions to form a percept. The action hierarchy generates more and more detailed descriptions of action, that is, it elaborates the plan to the point where motor actions are generated at the bottom of the action hierarchy.

An example of a perceptual rule is:

if position(M,X,Y,Z) and orientation(M,A) and self_position(X1,Y1,Z1)
then oriented_towards(M),
provided(angle_towards(X,Y,Z,X1,Y1,Z1,A1) and app_equal(A,A1)).

That is, from data giving another primate's position and orientation, and from the subject's own position, calculate the angle from the primate to the subject and test whether that angle is the same as the primate's orientation; if so, create a new datum representing the fact that the other primate is oriented towards the subject.

An example of an action elaboration rule is:

if plan_self_action(walk_towards(M)) and position(M,X,Y,Z)
then plan_self_act(walk_towards(X,Y,Z)).

That is, if the planned action for the self, in terms of relations, is to walk towards some primate, and if this primate's position is X,Y,Z, then generate a new datum, representing the planned action for the self in terms of detailed position, to walk towards the position X,Y,Z.

The goal module has rules causing it to prioritize the set of goals that it has received, and to select the strongest one, which is then sent to the highest level plan module.
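As a minimal sketch (the data layout and goal strings are my assumptions), the goal module's selection step could be written:

# Prioritize received goals and select the strongest, which is then sent to
# the highest level plan module. Goals are (weight, goal_description) pairs.
def select_goal(goals):
    return max(goals, key=lambda g: g[0])[1] if goals else None

goals = [(0.9, "affiliate(alice1,adam1)"), (0.6, "affiliate(adam2,adam1)")]
assert select_goal(goals) == "affiliate(alice1,adam1)"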

I also use a long-term memory module, which perceives social action and maintains the memory of affiliative relations. It generates goals to affiliate and sends them to the goal module.

The external world. Primates operate in an external environment which is a 3D spatial world. The environment is everything external to the brain, so it includes the body. A primate has sensors which interrogate the environment and generate sensed feature descriptions which are represented as literals. These input data items are sent to specified modules each cycle. Some modules act as effectors in that they send motor commands, represented as literals, to the environment. The environment receives motor commands from all the primates and computes what changes to make. Clearly, primates can only communicate with each other via the environment, since they are not telepathic.

14.4 Dynamics of the model

Perception-action hierarchy. Figure 14.5 shows how a perception-action computational architecture could support the functioning of the brain in behavior. A plan is selected and elaborated, receiving input from the perception hierarchy to allow it to elaborate appropriately.

[Figure 14.5 diagram: at the top, the memory of social relations and goals received from the system feed the perception of goals and dispositions and the prioritization and selection of a goal, which is sent down for joint plan selection; at each level, perception (with attention dependent on requested information from the action side) supplies first relational and then spatially detailed information to plan elaboration, which is conditional on the information received, while evaluations are passed back up; at the bottom, perception of features from sensors and generation of detailed motor commands to effectors.]

Figure 14.5: Functioning of interacting perception and action hierarchies in behavior

Conditional elaboration - situation. Within a given level, the component of the action hierarchy at that level is elaborated down to the next lower level, and evaluations are assessed and transmitted back up to the next higher level. By elaboration I mean taking data which describe action at one level and generating data which describe that action in more detail. More detail includes (1) exactly how to act (which detailed action components), (2) in what order, (3) exactly at what times, (4) exactly where in space, and (5) who will do which actions. We diagram an example of this in Figure 14.6(a). By an evaluation I mean, for example, a value indicating progress, success or failure; such a value can also be associated with a particular datum, for example, one representing an action or goal.

Conditional perception - attention. The perception hierarchy and action hierarchy cooperate closely. The action hierarchy must elaborate the currently selected plan conditionally upon the perceived environment. The modules of the perception hierarchy at a given level derive information required for successful action elaboration at that level. The perception hierarchy receives descriptions representing tuning information and direct requests, attention information, and prediction information, from the action hierarchy. I diagram an example of this in Figure 14.6(b). This information provides a context for perception, and enables the optimal use of processing and communication resources by the perception hierarchy in supporting the realtime action. Thus, my perception-action architecture provides a framework for attention mechanisms.

[Figure 14.6 diagram, in three panels:
(a) conditional elaboration - the goal groom(alice1), with the perceived relation far(alice1) and rules such as groom(M1),near(M1) -> prelude(M1) and groom(M1),far(M1),oriented(M,M1) -> approach(M1), generates the relational action description approach(alice1); this, with the perceived detailed position position(alice1,50,40,0) and the rule approach(M),position(M,X,Y,Z) -> move_to(X,Y,Z), is elaborated to the action description move_to(50,40,0) in terms of coordinates;
(b) attention - because the current action groom(alice) concerns alice, the action module sends an attend(alice) data item to the perceptual module; its rule attend(M), check_near(M) then returns detailed perceptual information about alice, such as too_near(alice), to the action module;
(c) confirmation - the rule friend(M1),should_affiliate(M1) -> affiliate(M1) fires for both alice1 and alice2, but only the rule instance for alice2 is strengthened by the confirmation message confirm(affiliate(alice2)); affiliate(alice1) is not confirmed, and with near(alice2) the rule near(M1),affiliate(M1) -> groom(M1) yields groom(alice2).]

Figure 14.6: How the model works

Continuous action. Action is continuous with a small time granularity. The primate brain's cycle is about 20 milliseconds; our implementation's cycle is about 100 milliseconds on a 300MHz processor. Thus, stored data are updated every cycle, the selection of rule instances is updated every cycle, and updated motor commands are output to the environment every cycle. The process of goal generation, goal selection, plan selection, plan elaboration, action specification and motion specification proceeds continuously, renewing the information every cycle.

The stability of distributed activation. Each module selects a dominant rule which outputs data to other modules. However, this can lead to incoherence: modules can get into states with crossed purposes, and attempts to elaborate plans tend to collapse under challenge. I developed a simple, biologically plausible mechanism which stabilizes distributed activity. If a module receives a data item that causes successful activity, it sends a confirmation message back to the sender, evaluating that data item. Successful activity is defined as any rule firing, not necessarily a selected one. I diagram an example of this in Figure 14.6(c). The confirmation message is specific to and contains the particular data item sent. When the sending module receives a confirmation message, it boosts the level of the rules generating that data item. This consolidates the strength of the selected rules. Further, if a selected rule does not receive confirmation messages, its strength will attenuate, thereby allowing competing choices to be tried.
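A minimal sketch of this mechanism, with boost and decay factors of my own choosing, is:

# Strengthen rules whose output was confirmed by the receiving module;
# attenuate selected rules that were not confirmed, so that competing
# choices can be tried. The factors 1.2 and 0.8 are illustrative only.
def update_rule_strengths(rule_strengths, selected, confirmed,
                          boost=1.2, decay=0.8):
    for rule_id in selected:
        if rule_id in confirmed:
            rule_strengths[rule_id] *= boost   # consolidate the selected rule
        else:
            rule_strengths[rule_id] *= decay   # let competitors displace it
    return rule_strengths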

Viable behavioral states. The basic action of the brain model is to try to establish a plan consistent with the response of the environment and with its own motivations. It does this by trying different alternatives at each level on a competitive basis, and subject to confirmation of successful elaboration. A state of the model in which such a plan is established, and is executing consistently with perception and confirmed elaboration, can be called a viable state.

Different levels of control. Provided the internally generated goals and the external environment do not change radically, the continuous process of plan elaboration, perception and action will continue. A change in spatial positions will simply result in different positions being perceived, this positional information being passed to the self action module and a different and more appropriate detailed action being generated using the updated position. The other levels will continue as before. Thus, the system will track changes in position.

Greater changes in position, posture, and action may result in different spatial and action relations being perceived at level 3. The relational information passed to the action module at this level may cause a different type of self action to be generated, but one that is still consistent with, and an elaboration of, the more generally specified plan received from level 4.

Thus, the levels of the hierarchy of perception and action correspond to a hierarchy of control concerning variations of (1) new positions and/or orientations, (2) new spatial relations, action types or action phases, (3) new plans, (4) new goals, and (5) new social situations, respectively. This is depicted, using a cortical correspondence diagram, in Figure 14.7.

Joint action. I developed a notion of plan suitable for social action. A joint plan is a set of joint steps, with temporal and causal ordering constraints, each step specifying an action for every primate collaborating in the joint plan, including the subject primate. The way a plan is executed is to attempt each step in turn, and during a step to verify that every collaborating primate is performing its corresponding action and to attempt to execute the corresponding individual action for the subject primate. I made most of the levels of the planning hierarchy work with joint plans; the next to lowest works with a “selfplan”, which specifies action only for the subject primate, and the lowest with concrete motor actions. However, the action of these two lowest levels still depended on information received from the perception hierarchy.
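A minimal sketch of a joint plan and its step-by-step verification follows; the step representation and action names are my assumptions, while the four grooming phases are those reported in chapter 15:

# A joint plan as a sequence of joint steps, each specifying an action for
# every collaborating primate. The phases follow the grooming plan of
# chapter 15; the action names are illustrative.
JOINT_GROOMING_PLAN = [
    {"adam1": "orient",           "alice1": "wait"},
    {"adam1": "approach",         "alice1": "orient"},
    {"adam1": "grooming_prelude", "alice1": "grooming_prelude_response"},
    {"adam1": "groom",            "alice1": "grooming_response"},
]

def execute_step(step, self_name, perceived_actions):
    """Verify every collaborator is performing its action for this step;
    return the subject's own action, or None if verification fails."""
    for primate, action in step.items():
        if primate != self_name and perceived_actions.get(primate) != action:
            return None   # disconfirmation: the joint step is not being followed
    return step[self_name]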

[Figure 14.7 diagram: the cortical regions of Figure 14.4, annotated with their roles in responding to variation: basic movement sequencing (MI), tracking of position and orientation and perceptual analysis (SI, PA1, SS1), tracking of action phase, sequencing of plan steps (PA3), plan elaboration maintained by confirmation (PA4, G), attention changing accordingly (PM1, PM2, VV1), and subcortical systems and long term memory providing a stream of goal messages.]

Figure 14.7: Response to variation in environment

Imagining brain activity. The modules operate concurrently; that is, they all operate at the same time, in parallel. We can perhaps imagine a cortical surface covered with an array of cortical regions like a patchwork quilt, each region lighting up by a different amount during each cycle. Each region also stores and processes different information in each time interval, so there are different sets of expressions in each module, which are constantly changing, and expressions flowing between modules.

The usual image of information activity in the brain is that of information flowing through a set of cortical areas, forming a pathway. We conceive of data flowing through the brain as passing sequentially through a series of brain regions. During each cycle, incoming information may be used to generate new information and/or to store information. In general, the same information is not passed on; instead, new information, derived from all the inputs received at that moment and any information already in storage at that moment, is transmitted. Because of the high processing speed of modules, a high rate of data transmission is maintained through the brain.

To understand how the brain produces behavior, we need a more general concept of computation that allows information transformation activity, combination of information, storage and retrieval of information, and activity conditional upon properties of the information. Our array of regions image allows us to think of each neural region as responding conditionally to information, as having time to compute new information and to store information, and as sending different information in different directions to different other areas, including sending information back to areas “upstream”.

Chapter 15: My implemented model of the primate neocortex

Abstract: I demonstrate my approach by proposing a specific causal functioning model of the brain based on its functional architecture.

By using an abstract logical analysis, I develop a computational architecture for the brain in which each cortical region is represented by a computational module with processing and storage abilities. Modules are interconnected according to the connectivity of the corresponding cortical regions.

I report on results obtained with an implementation of this model. I conclude with a brief discussion of some consequences and predictions of this work.


15.1 Our implemented brain model

The choice of behaviors and external environment. In order to precisely define a brain model, I needed to decide what behaviors to consider and what external environment the brain would have. I chose to consider the case of social affiliation. I used a “minisociety”, in which a group of primates (monkeys) interact socially in a naturalistic 3D environment, with each model primate controlled by a brain model. Thus the instantaneous state of the environment is mainly the positions, orientations and configurations of these primates. I motivated the system by having long term memory, which stores knowledge of affiliative relations, generate affiliation goals, among other things, since affiliative behavior is a known driving force in primate groups [Kling and Steklis, 1976].

My impulse was to build social interaction into my brain design from its inception. In the event, this has proved to be a fruitful decision: social interaction is arguably the most general type of behavior, and leads us to construct a general model. Social behavior involves perceiving dynamically-changing environments of primates who have complex dynamics. It involves generating social behavior which is joint and requires real-time coordination of action.

The initial model. I developed an initial brain model consisting of data and process representations, with eight memory modules, shown in Figure 14.4. An outline of each of these memories, the descriptions they store and the processes they include, is as follows:
(i) the affiliations module contains all affiliative information, including kinship and dominance relationship information. It generates affiliation goals and sends them to the goals module.
(ii) the goals module contains all goals currently held. It activates the most important goals and sends this information to the overall plans module.
(iii) the overall plans module receives goals and instantiates suitable joint-plans, sending them to the specific joint plans module.
(iv) the specific joint plans module receives a joint-plan, and generates a detailed action based on descriptions received from the perceptual hierarchy. For the others involved in the joint plan, the detailed action or state is verified, and for the self, its detailed action is sent to the detailed actions for self module.
(v) the detailed actions for self module receives the detailed self action from the specific joint plans module, receives object and location information from lower levels of the perceptual hierarchy, mainly from the primate positions and movements module, and outputs a detailed motor action to the motor system.
(vi) the primate positions and movements module receives sensory descriptions of the state of the external world and provides information on requested primates to the primate actions and relations and detailed actions for self modules.
(vii) the primate actions and relations module computes higher-level descriptions of the action of each primate involved in the current joint action. It requests information on particular primates from the primate positions and movements module.
(viii) the plan primates module receives information from the overall plans module as to which other primates are involved in the joint action, and passes this on to the primate actions and relations module.
(ix) the motor system does some processing to generate the external action given the direct action received from the detailed actions for self module.

Note that we have very much simplified the perceptual and motor hierarchies in this initial model. The perceptual hierarchy is simply the primate positions and movements and primate actions and relations modules, and the action hierarchy is the overall plans, specific joint plans and detailed actions for self modules.

Figure 15.1 indicates the data types processed in each module. 250 Chapter 15: My implemented model of the primate neocortex

[Figure 15.1 diagram: each module labeled with the description types it stores and transmits; for example, affiliation: affiliation(M1,M2), dominance_level(M,L), grooming(M1,M2); goal: goal(G), primate_disposition(M,G,S), currently_selected_goal(G); plan: plan_primate(PP), request_primate(PP); plan_primate_action: plan_primate_action(PPA), requested_action(M,A), requested_position(M,X,Y,Z), requested_orientation(M,[B,H]), head_oriented_towards(M1,M2), body_oriented_towards(M1,M2), walk_towards(M1,M2), move_to(M1,M2); primate_motion: action(M,A), position(M,X,Y,Z), orientation(M,[B,H]); plan_self_action: plan_self_action(PSA), plan_self_act(SA), self_action(M,A), self_position(M,X,Y,Z), self_orientation(M,[B,H]); motor_system: self_act(SA), sent to the environment.]

Figure 15.1: Description types in each module

Figure 15.2 outlines the rules operating in each module, for the case of two-primate grooming.

[Figure 15.2 diagram: outline of the rules operating in each module for two-primate grooming. The affiliation module computes affiliate-with-subordinate, affiliate-with-dominant and dominate goals, and increases the affiliation value if grooming occurs; the goal module computes dispositions and goal weights; the plan module generates plan components from the currently_selected_goal. Groomer rules: g1 if other is sitting or reorienting and far then orient_head_towards; g2 if other is sitting, reorienting_head or head_oriented_towards and far then walk_towards; g3 if near and head_oriented_towards then move closer; g4 if no grooming_prelude_response then try grooming_prelude, else continue it; g5 if grooming_prelude_response then try grooming, and if grooming_response then continue grooming. Groomee rules: bg1 if other is far, reorienting towards or head_oriented_towards then sit; bg2 if other walking_towards and far then orient_towards; bg3 if walking_towards and near then orient_towards, then sit when oriented; bg4 if grooming_prelude then grooming_prelude_response; bg5 if grooming then grooming_response. The primate_motion module computes requested and self actions, positions, orientations and distance estimates; the plan_self_action module constructs a spatially specified action from the plan_self_act description and requested information.]

Figure 15.2: Outline of description transformations in each module

Results were obtained with grooming, social conflict and social spacing behaviors. These simple social behaviors were obtained using about fifteen rules per module.

15.2 Behaviors and results obtained

Two primate grooming behavior and joint action. I experimented with a prototypical situation in which two primates groom. I developed a four phase plan for a groomer (orientation, approach, grooming-prelude, then grooming) and a groomee (waiting, orientation, grooming-prelude-response, then grooming-response), and I developed suitable rules for activity in each module in each phase. I ran my computer implementation and the primates did indeed carry out the four phases described, leading to a primate named adam1 grooming a primate named alice1. I show in Figure 15.3 images from a visualization generated by my system, showing a frame from each of the four phases of grooming.

Figure 15.3: Grooming sequence

I show in Figure 15.4 an instantaneous state of adam1’s brain at a point in time when he is walking toward alice1, as a result of selecting a goal to affiliate with her, and to do this by grooming her. He is perceiving that alice1 is in the process of orienting toward him and takes this into account in generating his own action of walking directly toward her. We show the left hand sides of dominant rules in each box, and the transmitted right hand sides on the channels.

[Figure 15.4 diagram: the module layout of Figure 14.4 with the left hand sides of dominant rules shown in each box and the transmitted right hand sides on the channels; for example, G goals holds goal(affiliation,affiliate(alice1,adam1)) and goal(affiliation,affiliate(adam2,adam1)); PA4 overall plans transmits currently_selected_goal(affiliate(alice1,adam1)) and plan_person(alice1); PA3 specific joint plans transmits plan_person_action(be_groomed(alice1,adam1)) and plan_self_act(walk_towards(alice1)), with confirm messages returned; PM1 person positions and movements holds position and action items for alice1, adam1 and adam2 and answers position_request(alice1); the detailed actions for self module transmits self_act(walk_towards(alice1)), yielding the motor output “take a step towards (360,300,0)”.]

Figure 15.4: Visualization of a typical instantaneous state of the model

Figure 15.5 shows the states of adam1 and alice1, with complementary perceptions of each other and complementary plans being elaborated.

[Figure 15.5 diagram: the brains of adam1 and alice1 shown side by side. adam1's currently selected goal is affiliate(alice1,adam1) and alice1's is affiliate(adam1,alice1); both elaborate the joint plan plan_person_action(be_groomed(alice1,adam1)); adam1 perceives alice1 reorienting_head and generates plan_self_act(walk_towards(alice1)) and the motor command “take a step towards (360,300,0)”, while alice1 perceives adam1 walking_towards and generates plan_self_act(orient_head_towards(adam1)) and the motor command “turn head two degrees”; both exchange actions and positions with the shared environment.]

Figure 15.5: Instantaneous behavioral states of two interacting primates

Social conflict, change and termination of action. I wanted to investigate failure of action, and dynamic change of selected goals and selected plans, so I developed a more complex scenario involving four primates and social conflict. This involved a situation in which a primate would set up an initial goal to affiliate with another but then would find that it could not, since it would not be receiving cooperative feedback, and so it then would turn to another goal to affiliate with a different other primate. Behavior was achieved in which conflict occurred and a change of cortical process was needed.

The strongest goals of alice1 and adam1 were to groom adam2, but adam2's strongest goal was to groom alice2. Figure 15.6 shows the sequence obtained. Alice1 and adam1, on the left and top of the pictures, first oriented to adam2, on the right. When they perceived him orienting and moving toward alice2, at the bottom of the picture, their plans failed. The “moment of truth” is captured in Figure 15.6 (top right), where adam1 and alice1 realize from adam2's walking toward alice2 that adam2 did not wish to enter into joint activity with them. This caused disconfirmation of adam1's elaborated plan to groom with adam2, and eventually disconfirmation of the corresponding goal. A new goal, to groom with alice1, was competitively selected, and this joint action was able to be completed.

Figure 15.6: Social conflict sequence

The direct perception of disposition. I also implemented this social conflict scenario using an additional module for the perception of the dispositions of others. Dispositions were represented as positive or negative evaluations of certain goal types. A disposition represented the subject primate's perception of the attitude of another primate toward a given goal. Perceived dispositions were computed in the new module and from there transmitted to the goals module. It was relatively straightforward to develop rules for perceiving others' dispositions in a limited context. For example, if an animal is moving away, it is probably negatively disposed to grooming. We represented dispositions as evaluations relative to a given goal. So, a disposition data item in alice1's brain might be person_disposition(adam1,0.9,negative,goal(groom(alice1,adam2))). The weights of goals generated in the goal module were made conditional upon this primate disposition feedback. The change of plan was then accomplished more smoothly and quickly. As soon as the negative disposition of adam2 was perceived by adam1, his goal to groom him was rapidly reduced in weight, allowing the alternative goal to become selected. This was accomplished without prior dismantling of the existing process; the new process simply displaced the old one competitively as it elaborated to each level.
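A minimal sketch of making goal weights conditional on perceived dispositions is given below; the suppression rule is an illustrative assumption, while the disposition item follows the example above:

# Scale a goal's weight by perceived dispositions toward that goal.
# Dispositions are (other, strength, sign, goal) items.
def goal_weight(base_weight, dispositions, goal):
    w = base_weight
    for other, strength, sign, g in dispositions:
        if g == goal and sign == "negative":
            w *= (1.0 - strength)   # a strong negative disposition suppresses the goal
    return w

dispositions = [("adam2", 0.9, "negative", "groom(alice1,adam2)")]
print(goal_weight(1.0, dispositions, "groom(alice1,adam2)"))  # about 0.1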

Social spacing. I also implemented some social spacing behaviors, so that primates tend to sit near each other, plan paths that avoid dominant others and paths that displace subordinate others, and interrupt grooming when being displaced by a dominant other. These behaviors use new modules which include not only a spatial map but also a social map, which represents space and its social significance. Thus, each primate has a different social map, perceives situations differently, and behaves in space differently. We show in Figure 15.7 an image sequence for avoidance by a subordinate primate, and in Figure 15.8 an image sequence for displacement by a dominant primate.

Figure 15.7: Avoidance sequence

Figure 15.8: Displacement sequence

Visualization as humans. I also developed a visualization of these scenarios using human figures, with human walking, and with grooming rendered as sitting together. The initial figures I used were not articulated properly; however, their movements looked natural. The main problem was the facial expressions, which did not look right. There was not at that time a usable animated face that could be interfaced to another program. I am currently developing new computer graphics with new human figures which are more detailed and which will have a set of reasonable facial expressions.

Figure 15.9 shows a displacement sequence: adam3 displaces the group consisting of adam2, alice1 and alice2, in order to approach alice3; alice2 sits down, adam2 rejoins her, and alice1 at first sits and then returns to this group; adam3 continues past adam1 to sit with alice3, and alice1 rejoins alice2 and adam2.

Figure 15.9: Displacement sequence, with persons visualized as humans

15.3 Discussion

Modularity and our model. Following the analysis of chapter 12, I took a strong modularity position and assumed that the entire cortex comprises a set of specialized modules. In an influential monograph, Fodor [Fodor, 1983] argued that only the input/output periphery of the cortex is modularized and that major cognitive tasks, including long-term memory and problem-solving, are carried out by non-modular “central processes”. He described a concept of module not only with domain specificity, informational encapsulation and neural localization, but also with lack of influence from central processes, both as regards data - no access to higher level data - and as regards control - processing was “mandatory”. These ideas have been debated and refined [Shallice, 1988], and arguments for the variable ontogeny of modules have also been made [Karmiloff-Smith, 1993]. Fodor's main examples were taken from language comprehension and visual perception, which are very complex and sophisticated phenomena for which no adequate model or modeling framework is available. Fodor's argument for central processes is that at some stage all of the different input information has to be integrated to produce behavior and that this cannot, for unstated reasons, occur in a modular system. My model shows how integration can occur through the use of abstracted types of data and that integration can occur in a modular fashion. Further, the use of higher-level data types allows the use of modules with uniform processing power. This is not only in consonance with the uniformity of the neocortex, it also allows the cortical model to use energy resources optimally by allowing behavior to be generated by concurrent activity of the set of modules.

States, causal dynamics and scientific theory. The image of an array of lit up processing modules visualizes the instantaneous state of the system. The state comprises the current set of stored descriptions in each module, the set of rule instances firing and the set of descriptions being transmitted along the channels. The dynamics of the system are given by the action of the rules in each module, together with the action of storage management functions such as store updating and attenuation. The description of states and causal dynamics constitutes a scientific theory of the action of the brain.

Computational semantics. Although we can use names for descriptions and modules, which suggest their significance by association to English words, their precise meaning is given by the action of the model. For example, the semantics or meaning of a goal description in the goal module is neither more nor less than that, on being selected, it is passed to the plans module and causes a plan to be selected and elaborated. The precise meaning of a module is a continuous process given by its description types and description transformation rules. Thus, planning and action modules are defined as continuously generating elaborations of plan elements they receive; perception modules are defined as continuously analyzing incoming data and deriving more abstract data; goal modules are defined as storing, prioritizing and selecting from goal descriptions they receive.

Motivation. Gary Kraemer [Kraemer, 1992] has reviewed mechanisms of attachment. Innate affiliation schemas may be subcortical and part of a social attachment feedback control system. During development, these schemas are probably developed into working models, and particular instances of affiliation relations will probably be stored cortically. Thus, a better account of the generation of affiliative goals would be that subcortical and cortical representations of specific affiliation relations would generate signals which would be propagated to the anterior cingulate gyrus. The cortical component would indicate the specific affiliation involved and the subcortical component its intensity and other qualities. I discuss motivation in more depth in chapters 20 and 28.

Properties of our model. My work shows how to construct a working model of the brain at an architectural level. By using logical abstraction of data types, and processes as sets of parallel rules, a tractable computational model can be created, demonstrated and experimented with.

By representing control explicitly and in a distributed manner, we can see how a hierarchy of modules with storage and processing abilities can operate, in a flexible way depending on the sensed environment and current stored contents. I have shown that (a) control loops can form for controlling the environment at different levels of disturbance, (b) plans can be generated and elaborated into explicit action, (c) action can focus perception, (d) perceived information can influence action directly at the appropriate level, and (e) distributed processes can form and cohere, and can restructure depending upon eventualities.

I have modeled control in the brain as well as data, and have shown how distributed control can work. Initiative and intention are distributed. In a set of interdependent modules, many modules are essential for overall action and the contribution of any one of them can be controlling.

My model is relatively “vanilla”¹, i.e., non-idiosyncratic. It is designed as a ground level representation which can be further developed in the light of future knowledge.

My model is of course an approximation. I anticipate corrections, but my scheme and principles will probably not be strongly affected by the particular assignments of areas to regions or regions to levels, for example.

For modules with memory, the action of the module is typically to update the memory according to the incoming descriptions. Thus memories track the stream of descriptions they receive. Each module operates in a processing and communication environment of a set of other modules. There is a current state of the system whose characteristics usually change slowly compared to the reaction time of one module; compare [Arieli et al., 1996].

Modules can be characterized by their ability to combine information of more than one given type from more than one input, and to create and to store data of given types. The development of modules with the abilities to store certain types of data would probably be aided by an existing flow of data from sensors and from subcortical areas to each module. Learning to process data by particular data transformations similarly would be aided by the co-occurrence of particular data on input and output channels of each module.

¹My apologies to Vanilla planifolia.

Consequences for experimental design and the interpretation of results. Characterizing the type of data processed by a given area is seen as key. From this a process model can be developed, as we have shown. A given area, however, receives data from more than one input, and from downstream as well as upstream. If measurements cannot be easily made, my model can give some idea of the kinds of data that will be received from downstream.

A complete computer. My model is a complete specification of brain activity, including control as well as data. In order to use this approach to describe brain activity to compare with the results of a particular experiment, it is therefore necessary to hypothesize the entire computational mechanism, both as regards data being processed and plans being executed by the brain. In a conventional boxes-and-arrows model, only the data and data paths are specified, the rest being described imprecisely using natural language.

The binding problem. My model embodies a solution to the binding problem. Data coherence and control synchronization occur because specific pattern matchings and associations occur between modules. In addition, specific confirmatory messages, positive and negative, control the coherent activity of distributed processes. The demonstration of this approach to the binding problem is only now possible because we have an explicit brain model and environment, so that explicit storage contents and associations can be modeled.

Predictions. My main contribution is to have produced a computational approach and explicit working architecture for the primate brain. The change of thinking thereby involved produces many opportunities for falsification and verification:

Information roles of cortical areas. (i) Attention requirements and requests for information are generated by the action hierarchy and cause focusing and changes in processing in the perception hierarchy. (ii) During initiation and change of action, we should get more processing in higher levels of the hierarchy. (iii) PA5 should be involved in the contextual control of action.

Falsifiable predictions of brain area activation. For the two strategies, we can now generate detailed predictions of brain area activation sequences that should be observed during different types of processing situation. Using my computer realization, we can generate detailed predictions of activation levels for each time step. Since there are many adjustable parameters and detailed assumptions in the model, it is difficult to find clearly falsifiable predictions. However, we can also make a simplified and more practical form of prediction by classifying brain states into four types, shown in Figure 15.10.

Let's call these types of states G, E, P and M respectively. During grooming, we expect orientation would involve sequences such as G,P,E,M; during approach G,E,M,E,M,..; grooming prelude P,E,M,E,M,..; and grooming P,E,M,E,M,M,M,... During social conflict, orientation would involve G,P,E,M,E,M, and during plan failure G,G,G,..,P,E,M,E,M.

[Diagram: four panels, each showing the module layout (MI, SI, PA1, SA1, PA3, DV1, DV2, PA4, G, PM1, VI, PM2, VV1, PM3, with motor, tactile and visual input/output), with different modules active during goal creation and plan elaboration, goal appraisal and goal reactivation, perceptual analysis, and movement.]

Figure 15.10: Predicted brain area activation for different kinds of processing

It should be noted that there is some redundancy in the model, so that, if a mismatch to experiment is found, it would be possible to make some changes to the model to bring it into better correspondence with the data. For example, the assignment of modules to particular brain areas is tentative and may need to be changed. However, there is a limit to the changes that can be made, and mismatches with data could falsify the model in its present form.

Correspondence to lesion studies. As regards lesioning, the model has very few modules at the moment and therefore little redundancy. Completely knocking out a module would produce major disruption; however, one could get some phenomena, such as utilisation behavior [Lhermitte, 1983], by lesioning the planning module, for example. If lesioning simply weakens a module then we would get other phenomena. There is some correspondence to a broad classification of frontal lesioning effects due to Michael Mega and Jeffrey Cummings [Mega and Cummings, 1994], where medial frontal lesions lead to apathy, dorsal frontal lesions lead to executive dysfunction, and orbital frontal lesions lead to impulsivity. These would correspond respectively in our model to lesioning the goal module, lesioning the planning module, and lesioning the interface to subcortical perception-action systems.

Individual differences. As regards individual differences in response tendencies or dispositions, the model would, in the simplest approach, postulate that such individual differences result from individual differences in the performance of brain modules. Thus, one would try, from a set of observations, to derive a set of brain module characteristics and a set of individual parametric values, perhaps something similar to [Daigneault et al., 1992].

15.4 Summary and conclusion

From the functional architecture of the primate brain, I have been able to design a computational architecture using a computer science analysis. I defined a computational approach representing neural regions as distributed modules whose data contents were represented by logical expressions and whose processing was represented by the action of sets of rules. A particular explicit brain model was defined and implemented. Its demonstration of social behaviors in a minisociety showed the correctness and completeness of the model and the strength and feasibility of the approach.

I have developed a method of brain modeling where we do not need to make postulates concerning the encoding of information by neurons or the specific neural mechanisms used for processing information. My method of modeling takes into account processing and storage constraints of the brain and yet does not require modeling the behavior of individual neurons. The model has causal dynamics and makes falsifiable predictions concerning information processing in the brain. Conversely of course the model makes no statement concerning detailed neural activity or neural encoding of information.

Instead, we can use abstract descriptions of information and of the processing of information. This level of modeling allows one to make scientific theories concerning the types of data being transmitted, stored, and processed, transformations of data, distribution of data in the brain, and specializations of processing at different locations. We can also represent issues of processing resources and timing of processes.

Part III

Mental dynamics

Chapter 16

Problem-solving behavior

Abstract. In this chapter, I show how my model of the primate neocortex can be extended to allow the modeling of problem-solving behaviors. Specifically, I model different cognitive strategies that have been observed for human subjects solving the Tower of Hanoi problem. These strategies can be given a naturally distributed form on the primate neocortex. Further, the goal stacking used in some strategies can be achieved using an episodic memory module corresponding to the hippocampus. We can derive explicit falsifiable predictions for the time sequence of activations of different brain areas for each strategy.


16.1 Introduction

The action of the system is to continuously create goals, prioritize goals, and elaborate the highest priority goals into plans, then detailed actions, by propagating descriptions down the action hierarchy, resulting in a stream of motor commands. At the same time, perception of the environment occurs in a flow of descriptions up the perception hierarchy. Perceived descriptions condition plan elaboration, and action descriptions condition perception. This simple elaboration of stored plans was sufficient to allow me to demonstrate simple socially interactive behaviors using a computer realization of my model.

16.2 Problem solving and the choice of the Tower of Hanoi problem

Many problems used in studying the cognitive psychology of human problem-solving have been spatial problems involving rearrangement of spatial configurations, or spatial transportation, usually with capacity constraints:

(i) Transportation problems are exemplified by the Missionaries and Cannibals problem, also called Hobbits and Orcs.

(ii) Discrete capacity constraints are exemplified by water jug problems.

(iii) Spatial rearrangement problems are exemplified by the Tower of Hanoi, which is depicted in the five disk case in Figure 16.1.

Figure 16.1: Initial and general positions for five disk Tower of Hanoi problem

The problem is to move all the disks from a starting peg to a designated finishing peg, by moving one disk at a time and at no time placing a disk on one smaller than itself. It is usually used with three pegs and different numbers of disks. The 3 disk case is very easy, the 4 disk case is straightforward, and for 5 or more disks it can present difficulties for the average subject. This is its main attraction as a psychological test problem, since it has a relatively small search space of 3^n states, where n is the number of disks, but nevertheless is challenging for subjects.

Here is a description of the importance of the Tower of Hanoi problem in cognitive research, from Herbert Simon's 1991 memoir, Models of My Life: "If chess plays the role in cognitive research that Drosophila does in genetics, the Tower of Hanoi is the analogue of E. coli, providing another standardized setting around which knowledge can accumulate."

The Tower of London problems were devised by Tim Shallice in order to give a whole set of problems that included easy and hard problems, to use as a clinical test for frontal patients. An example problem is depicted in Figure 16.2.

Figure 16.2: Typical initial and final positions for Tower of London problems

(iv) Some common clinical tests such as the Wisconsin card sort test are not often modeled cognitively. This test involves the learning of a “concept” for classifying a sequence of cards, and the ability of the subject to flexibly change this concept as the sequence changes. The test uses cards of the type depicted in Figure 16.3.

Figure 16.3: Cards used for the Wisconsin Card Sort test

The subject is asked to sort a number of cards according to a rule. Each card typically contains three aspects, such as color, number, and shape. The subject must sort the cards according to one of those aspects, but he is not told which one; he must discover it through trial and error. Eventually, the subject might figure out that the rule is to sort by color, so he would place all the red cards in one pile, all the yellow cards in another pile, and all the blue cards in a third. After the subject has sorted several cards correctly, the experimenter changes the rule, without telling the subject. Now, the rule might be to sort by shape, so all the circles, squares, and stars must be separated into three piles. This requires a shift in the subject's understanding of the task; he must inhibit the old "color" rule and switch to the new "shape" rule. Subjects with damage to their frontal lobes have difficulty figuring out the sorting rule, because they are unable to use their memories of previous right or wrong guesses to guide their present behavior. They also show difficulty in switching rules, since they cannot inhibit the previously correct "color" rule. Instead, they perseverate on the old rule, continuing to sort by color regardless of the experimenter's feedback. This failure of normal inhibition is the primary finding of this task. Children, schizophrenics, patients with organic frontal lobe damage, and monkeys with ablations of their frontal lobes all perform poorly on the Wisconsin Card Sort test. Specifically, prefrontal areas seem to be crucial to inhibition.

16.3 Problem solving and the frontal lobes

It has been established that the frontal lobes are important for the solution of problems [Fuster, 1997]. Other treatments of frontal neurology include [Stuss and Benson, 1986] [Stuss, 1992] [Benson, 1994] [Cummings, 1995]. Frontal areas seem to be involved in organization of problem solving thought or what one might call executive functions. Other simpler components of problem solving thought may be distributed in other parts of the brain.

The main evidence comes from imaging, as in the work of Per Roland and others, and also from neurological cases of people with lesions in their frontal lobes. Problem solving is not the only thing the frontal lobes are involved in; for example, they are also used for retrieval from long term memory and for storing episodic memories, and so on.

Figure 16.4 shows Roland's diagram summarizing his findings for imaging of people solving three different types of problem: (i) the 50-3 task, i.e., counting backwards from fifty in threes, which is a standard clinical exam, (ii) the jingle task, i.e., jumping every second word of a well-known (Danish) nine-word jingle, first generated silently and then out loud, and (iii) a route-finding task, in which subjects had to imagine going for a walk starting from their front door, and alternately turning left and right as they came to junctions. The diagram summarizes PET activations for the three tasks on successive rows, showing the left and right hemispheres.

Figure 16.4: Roland's diagram summarizing his findings for thinking tasks (rows (i)-(iii) correspond to the three tasks above)

Executive dysfunction has been characterized by Shallice and coworkers [Shallice and Burgess, 1991a]. It is the inability to keep track of plan execution. For example, some frontal patients were given a shopping list and taken to a mall and asked to buy everything, taking into account price, etc. They became quite confused and were only able to partially complete what would be a straightforward task for ordinary people. However, frontal patients can still do straightforward tasks and can often continue to work in their old job. Their IQs remain about the same as before the lesion occurred.

Utilisation behavior has been described by Lhermitte [Lhermitte, 1983] [Shallice et al., 1989]. In this case, the external situation triggers involuntary routine responses. For example, the patient may see a comb on the doctor's desk and start combing their hair. Their attention is easily captured by routine situations. Lhermitte also described how some frontal patients would copy his actions, picking up a stethoscope and examining him, for example.

Mega and Cummings surveyed the symptoms of frontal patients [Mega and Cummings, 1994] and concluded that they fell into three categories depending on the locus of their lesions: (i) dorsal frontal damage resulted in executive dysfunction, (ii) medial frontal damage resulted in apathy - lack of all motivation, and (iii) orbital frontal damage resulted in impulsivity. In many cases the damage to the cortex is accompanied by damage to the basal ganglia.

Perseverance. A problem which is found in normal people but exaggerated in frontal patients is perseverance. This occurs when the subject solves a sequence of similar problems and tends to stick to the same problem solving strategy, even though it is not the best for all the problems. One can even gradually change the problems presented so that the strategy being perseverated is a very bad strategy. The classic experiment for normal people was due to Abraham Luchins, who used water jug problems [Luchins, 1942].

The standard clinical test is the Wisconsin Card Sort test and it is a quite good indicator of frontal problems. It basically tests concept formation and also perseverance.

Action slips. An action slip is the substitution of one action for another. For example, you go upstairs to change to go out for the evening but instead put on your pyjamas and get into bed. This happens to normal people but is more frequent for frontal patients [Reason, 1990] [Schwartz, 1995] [Norman, 1981] [Shallice and Burgess, 1991b].

Myrna Schwartz [Schwartz, 1995] has pointed out that the breaks in the action occur as meaningful sequences: putting on your pyjamas and getting into bed is a well-formed meaningful action sequence; you do not put on your pyjama jacket only and then go out.

16.4 The Tower of Hanoi problem

In any particular position, since there are only three pegs, there can only be a limited number of moves. Since the top disks on the three pegs must be in some order of size a > b > c, there will be no moves for a unless there are empty pegs, in which case there is one move for each empty peg. For c there will be two moves, and in fact c will always be the very smallest disk. For b there will be one move, to cover a, or another if there is an empty peg. In the general case, then, where there are no empty pegs, there will be just three moves from any general position: two for the smallest disk and one for the second smallest disk.
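This move structure is easy to check mechanically. The following short Prolog sketch (my own formulation; the state representation is an assumption made for illustration, not taken from the model) enumerates the legal moves from any position, with a state written as a list of three pegs, each a list of disk sizes with the top disk first:

% legal_move(+State, -Move, -NewState): State is a list of three pegs,
% each a list of disk sizes, top disk first.
legal_move(State, move(Disk, From, To), NewState) :-
    nth1(From, State, [Disk|Rest]),          % Disk is on top of peg From
    nth1(To, State, Target),
    To \= From,
    ( Target = [] ; Target = [Top|_], Top > Disk ),   % smaller onto larger
    set_peg(From, State, Rest, S1),
    set_peg(To, S1, [Disk|Target], NewState).

set_peg(1, [_|T], New, [New|T]).
set_peg(N, [H|T], New, [H|T1]) :- N > 1, N1 is N - 1, set_peg(N1, T, New, T1).

% ?- findall(M, legal_move([[1,2],[3],[4,5]], M, _), Ms).
% gives exactly the three moves discussed above: two for disk 1
% (the smallest) and one for disk 3 (onto disk 4).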

It is useful to examine the state space of the problem, that is, the set of possible states the disks and pegs can get into, and what transitions can occur from one state to another. We diagram the state space by representing each state by a point and possible transitions by lines.

If we draw out the state space for the Tower of Hanoi, we get a very regular structure. I am using a notation of l m n to denote a state with disk l on peg 1, disk m on peg 2 and disk n on peg 3, so 3 2 1 means disk 3 on peg 1, disk 2 on peg 2 and disk 1 on peg 3. For more disks we stack them vertically, so 2/3 _ 1 means disks 2 and 3 on peg 1, nothing on peg 2 and disk 1 on peg 3. Figure 16.5(a) shows the space for one disk only, and Figure 16.5(b) shows the space for two disks and how it can be constructed using spaces for one disk. I use the notation _ 2 _, for example, to mean the set of all states in which disk 2 is on peg 2, i.e., in the 2 disk case, the set of states {1 2 _, _ 1/2 _, _ 2 1}.

In general the construction of the state space for n+1 disks from state spaces for n disks is quite regular, and is shown in Figure 16.5(c).

Figure 16.6 shows how we construct the space for three disks.

Finally, Figure 16.7 shows the space for five disks, which is the usual problem used in experiments. I have drawn it with the start state of all disks on peg 1 at the top; the position with all disks on peg 3 is in the lower left corner, and all disks on peg 2 in the right hand corner. To figure out the position corresponding to each state in the diagram, each of the different subspaces a state lies in is labeled with the positions of the disks for that subspace, including each vertex being labeled with the position of disk 1. Also, each arc, which always connects two states, is labeled with the number of the disk that is moved. The optimal solution is the outer edge of the outer triangle. There are 3^n = 3^5 = 243 states altogether, and the optimal solution passes through only 2^n - 1 = 2^5 - 1 = 31 of them.
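The two counts follow from a short calculation (added here for clarity, using standard reasoning): on any one peg the disks can only lie in size order, so a state is fixed by assigning each of the n disks to one of the 3 pegs; and moving an n-disk pyramid means moving the (n-1)-disk pyramid aside, moving the largest disk, and moving the (n-1)-pyramid back:

\[
|S_n| = 3^n, \qquad T(n) = 2\,T(n-1) + 1,\ T(1) = 1 \ \Longrightarrow\ T(n) = 2^n - 1,
\]

so for n = 5 there are 243 states and 31 moves on the optimal path.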

[Diagram: (a) the one disk state space; (b) the two disk state space and its construction from three copies of the one disk space; (c) the general construction, in which three copies of the n disk space are joined by moves of disk n' = n+1, which does not move within each n disk subspace.]

Figure 16.5: Construction of state spaces for Tower of Hanoi problems

[Diagram: the three disk state space, constructed from three copies of the two disk space joined by moves of disk 3.]

Figure 16.6: Tower of Hanoi state space for problems with three disks

[Diagram: the five disk state space, a triangular diagram whose corners are the states with all disks on peg 1 (top), all disks on peg 3 (lower left) and all disks on peg 2 (lower right).]

Figure 16.7: Tower of Hanoi state space for problems with five disks

16.5 Strategies for solving Tower of Hanoi problems

The basic paper on this is due to Herbert Simon [Simon, 1975], who described several types of strategy: (a) the goal-recursion strategies, (i) pure recursion keeping a complete goal stack, and (ii) goal recursion using memory of the top goal only plus perception of the state; (b) perceptual strategies, where the next move is determined by perceptual tests on the state and not goal memory; and (c) move pattern strategies, where the next move is determined by the pattern of previous moves and the perceived state.

A relatively complete list of twenty-two possible strategies is as follows:
(1) random searching,
(2) systematic searching, e.g., depth first, breadth first,
(3) selective search,
(4) perceptual strategies: (a) current goal plus perceived state, (b) generating from largest remaining disk, (c) goal ordering, (d) goal stacking, and (e) goal stacking, clearing source before target,
(5) goal recursion disk strategies: (a) perceptual goal recursion disk strategy, and (b) full goal recursion disk strategy,
(6) goal recursion pyramid strategies: (a) perceptual goal recursion pyramid strategy, and (b) full goal recursion pyramid strategy,
(7) pyramid goal recursion with goal counting,
(8) move pattern strategies: (a) smallest and next smallest disk selection, and (b) not last disk moved,
(9) rote strategies: (a) moves from rote list, and (b) moves as specific rules.

I will define and discuss the main ones in turn:

Selective search. This is the initial strategy used by Anzai and Simon's subject [Anzai and Simon, 1979], discussed in chapter 24. It is a sort of naive strategy, in the sense of being a priori. Selection consists of avoiding repetition, i.e., avoiding: (i) moving the same disk twice in succession, (ii) reversing the move just made with the same disk, and (iii) returning a disk to the peg it was last on. It seems to me that these constraints might be useful in many spatial movement situations. No goals are used. A production system for selective search is given in Figure 16.13.

Perceptual strategy saving only current goal. This is described in Simon’s 1975 paper. Basically, goals are used but not in any systematic or organized way. A goal to move some disk is generated, and then if there is an obstacle, a new goal is generated to move the obstacle. This causes the previous goal to be forgotten. If a goal is solved, a new goal is then generated de novo. In the production system given in Figure 16.14, adapted from Simon’s paper, the new goal generated is to move the biggest remaining disk, but one could have a simpler strategy in which just any disk was chosen for the new goal. Simon observes that this strategy will succeed for any number of disks and will not get into a loop. It thus uses goals, is successful and does not require keeping track of multiple goals.
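To make the control flow concrete, here is a small runnable Prolog sketch of this single-goal strategy (my reconstruction of its logic, not Simon's production system, which is given in the figure; all predicate names are mine). It uses the same state representation as the earlier sketch, a list of three pegs with the top disk first, and takes the target to be peg 3:

% perc_solve(+State, -Moves): single-goal perceptual strategy for the
% task "move everything to peg 3".  Each cycle generates a fresh goal
% to move the biggest misplaced disk; if the move is blocked, the goal
% is *replaced* (the old one is forgotten) by a goal to move the
% obstructing disk to the remaining peg.
perc_solve([[], [], _], []) :- !.                 % everything on peg 3
perc_solve(S, [M|Ms]) :-
    biggest_misplaced(S, D, From),
    chase(S, goal(D, From, 3), M, S1),
    perc_solve(S1, Ms).

biggest_misplaced(S, D, From) :-
    member(From, [1, 2]), nth1(From, S, Ds), last(Ds, D),
    \+ ( member(P, [1, 2]), P \= From,
         nth1(P, S, Ds1), last(Ds1, D1), D1 > D ).

chase(S, goal(D, From, To), M, S1) :-
    other_peg(From, To, Other),
    (   nth1(From, S, [Obst|_]), Obst \= D        % a smaller disk covers D
    ->  chase(S, goal(Obst, From, Other), M, S1)
    ;   nth1(To, S, [Obst|_]), Obst < D           % target's top is smaller
    ->  chase(S, goal(Obst, To, Other), M, S1)
    ;   M = move(D, From, To), do_move(S, M, S1)
    ).

other_peg(A, B, C) :- member(C, [1, 2, 3]), C \= A, C \= B, !.

do_move(S, move(D, From, To), S1) :-
    nth1(From, S, [D|R]), nth1(To, S, T),
    set_peg(From, S, R, S0), set_peg(To, S0, [D|T], S1).
set_peg(1, [_|T], X, [X|T]).
set_peg(N, [H|T], X, [H|T1]) :- N > 1, N1 is N - 1, set_peg(N1, T, X, T1).

% ?- perc_solve([[1,2,3],[],[]], Ms), length(Ms, N).  gives N = 8,
% one more than the optimal 7: the strategy succeeds without being optimal.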

Perceptual strategy with goal stacking. For a systematic strategy, one needs to keep track of goals. One way, described in Simon's 1975 paper, and which can be efficient, is to maintain a stack of goals, so that only the top member of the stack is worked on. Goals are generated as before, and if a goal cannot be solved immediately, this goal is put on the stack and a new goal is generated to move the disk blocking this move. When that goal is solved, the stack is popped and the previously stored goal is attempted again. This strategy is still called perceptual because it makes its main decisions by testing the feasibility of moves by perceiving the state of the problem, checking whether the proposed disk is the top disk and whether the disk on the top of the target peg is larger. The production system shown in Figure 16.15 is the variant mentioned by Simon in which we distinguish between the two different blocking cases, namely the source peg being blocked and the target peg being blocked.

Perceptual goal recursion disk strategy. In the recursive method, it is not actually necessary to stack goals. Instead, when a disk has been moved or when it is blocked, one can simply always try to move the next larger disk. This is from Simon’s 1975 paper.

Full goal recursion disk strategy. One can however do goal recursion by maintaining a goal stack. In this case, subgoals are not generated by perception of the top disk. The entire strategy can thus be conducted without perception.

Perceptual goal recursion pyramid strategy. This is completely analogous to the perceptual goal recursion disk strategy. The goal generated on blocking is to move the next smallest pyramid. Perception is used to test for blocking of moves.

Full goal recursion pyramid strategy. This is Simon’s goal recursion strategy and is analogous to the full goal recursion disk strategy. No perception is needed.
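For comparison, the full goal recursion pyramid strategy is just the textbook recursive solution, which can be written directly in Prolog (the standard algorithm, my transcription, rather than the production-rule form used in the figures):

% hanoi(+N, +From, +To, +Via, -Moves): move the N-disk pyramid from
% peg From to peg To, using Via as workspace.
hanoi(0, _, _, _, []) :- !.
hanoi(N, From, To, Via, Moves) :-
    N1 is N - 1,
    hanoi(N1, From, Via, To, Ms1),        % clear the N-1 pyramid aside
    hanoi(N1, Via, To, From, Ms2),        % re-stack it on the target
    append(Ms1, [move(N, From, To)|Ms2], Moves).

% ?- hanoi(3, 1, 3, 2, Ms), length(Ms, L).   gives L = 7 = 2^3 - 1.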

Goal recursive pyramid strategy with goal counting. For large numbers of disks, say n > 10, instead of stacking n goals, we can simply count, so pyramid m goes to peg (n - m) mod 3 (numbering the pegs 0 to 2). For example, with n = 11, pyramid 7 goes to peg (11 - 7) mod 3 = 1.

Move pattern strategy using parity with alternation. Each disk is assigned parity odd or even, counting back from the largest disk which is odd. Direction of movement between pegs is also assigned a parity, the direction from the largest disk to the Goal peg being odd. The strategy moves odd disks always to the adjacent peg in the direction with odd parity and even disks to the adjacent peg in the direction with even parity. Disks are always alternated, i.e., we don’t move the last disk that was moved.

Completely rote strategy. This would consist of 2^n - 1 totally specific rules, one for each move to be made. Simon gives a variant where the explicitly learned moves are stored as a list and read one at a time to execute each move.

16.6 Human performance in problem solving

For discussions of the use of resources in human problem solving, see [Egan and Greeno, 1974] [Norman and Bobrow, 1975] [Simon and Hayes, 1976].

16.6.1 Performance on Tower of Hanoi problems

I list here some references to published papers that have measured performance on the Tower of Hanoi problem for various types of human subject:

Normal adults [Egan and Greeno, 1974] four-, five- and six-disk versions, analysis of difficulty in terms of depth of stack, [Karat, 1982] move choices and latencies, production system model, [Kotovsky et al., 1985] protocol study, [Ruiz, 1987] protocol analyses, latencies, tower noticing, [Ruiz and Newell, 1989] SOAR model of tower noticing, [Chapman, 1997] tower noticing model, [Anderson et al., 1993] goal structures, recursive strategy, [Just et al., 1996] recursive strategy

Learning of problem solving strategies [Anzai, 1978] [Anzai and Simon, 1979] [Anzai, 1987] learning different strategies, [VanLehn et al., 1989] non-LIFO execution, [VanLehn, 1991] analysis of rule acquisition events in Anzai and Simon protocol, [VanLehn, 1988] impasse driven learning, [VanLehn, 1989a] learning events in acquisition of skills including TH, [VanLehn, 1989b] survey of problem solving and cognitive skill acquisition

Children [Piaget, 1976] [Klahr, 1978] [Klahr and Robinson, 1981] [Welsh, 1991] [Byrnes and Spitz, 1979a]

Children with learning disabilities or retardation [Spitz et al., 1984] [Spitz et al., 1985] [Borys et al., 1982] [Byrnes and Spitz, 1979b] [Wansart, 1990] [Gioia, ]

Frontal subjects [Goel and Grafman, 1995] [Glosser and Goodglass, 1990] aphasic frontal [Parks and Cardoso, 1997] [Kimberg and Farah, 1993]

Amnesic subjects [Cohen et al., 1985] [Schmidtke et al., 1996]

Turner’s syndrome [Romans et al., 1997]

Cerebellar patients [Daum et al., 1993]

Schizophrenic patients [Goldberg et al., 1990]

More recent published research includes: [Handley et al., 2002] relation to working memory, [Schuepbach et al., 2002] Cerebral hemodynamic response using Transcranial Doppler sonography, [Roennlund et al., 2001] individual differences in adults, [Anderson and Douglass, 2001] cost of goal retrieval, [Numminen et al., 2001] working memory in patients with intellectual disability, [Cavedini et al., 2001] cortical and basal ganglia functions in OCD patients, [Davis and Klebe, 2000] elderly people, [Bishop et al., 2001] children of different ages, [Goela et al., 2001] symbolic model and frontal patients, [Winter et al., 2001] procedural learning in amnesics, [Yaoda Xu and Suzanne Corkin, 2001] study of H.M. solving TH, [Schoettke, 2000] working memory and context effects in brain damaged patients, [Cardoso and Parks, 1998] neural network model for frontal patients, and [Parks and Cardoso, 1997] neural network model for left frontal patients.

16.6.2 Performance on Tower of London problems

Shallice concluded that the Tower of Hanoi was too difficult to use for studying frontal patients, and therefore he devised the Tower of London task [Shallice, 1982] [Shallice, 1988]. The task has been characterized psychologically by Geoffrey Ward and Alan Allport [Ward and Allport, 1997], but only for the five disk case and without verbal protocols.

Recent published research includes: [Gilhooly et al., 2002] the role of working memory, individual differences, [Rainville et al., 2002] Alzheimer patients, [Sikora et al., 2002] children with poor arithmetic skills, [Raizner et al., 2002] extension to minimise ceiling effects, [de Oliveira Souza et al., 2001] individual differences, males better on difficult moves, [Philips et al., 2001] effect of mental preplanning, [Mitchell and Poston, 2001] effect of inhibition in improving performance, [Hodgson et al., 2000] study of eye gaze direction, [Changeux and Dehaene, 2000] neural model, [Gilhooly et al., 1999] planning and age, [Dagher et al., 1999] PET study, [Phillips et al., 1999] role of memory, [Murji and DeLuca, 1998] comparison with Cognitive Function Checklist (CFC) measure, [Baker et al., 1996] PET study, and [Morris et al., 1993] CT study.

An example of an imaging study is an MRI BOLD study by Odile A. van den Heuvel et al. [van den Heuvel et al., 2003], for which the brain activations were as shown in Figures 16.8 and 16.9. They refer to brain areas using Talairach coordinates, which are explained in an appendix to this chapter. Then the subject was given a problem of greater complexity and the differences were observed, which we give in Figures 16.10 and 16.11.

Figure 16.8: Image of Tower of London performance

Figure 16.9: Table of brain area activations for Tower of London

Figure 16.10: Image of Tower of London performance for harder problem

Figure 16.11: Table of brain area activations for harder problem

Andreasen et al. [Andreasen et al., 1992] used PET on schizophrenics and normals. Their main finding was that the left mesial frontal cortex was activated in normals and not in schizophrenics, i.e., hypofrontality.

Robin Morris et al. [Morris et al., 1993] used SPECT on normal adults, looking at frontal lobe activation. They found a significant increase only in left prefrontal cortex during Tower of London, and subjects who took more time planning their moves and fewer moves to complete the problem had significantly higher activations of left prefrontal cortex.

S. C. Baker et al. [Baker et al., 1996] [Frackowiak et al., 1997] used PET to articulate the functional anatomy of planning, finding a set of cortical areas that were activated. They call this a "distributed network"; however, they did not produce any evidence for interconnection, just coactivation, and further their set, { right (area 10) rostrolateral prefrontal, left (area 9) and right (area 9) dorsolateral prefrontal, left (area 8) and right (area 8) superior frontal (decrease), left (area 45) inferior frontal (decrease), left (areas 6, 8, 9 and 10) dorsomedial frontal (decrease), left (area 6) and right (area 6) premotor, left (areas 24 (decrease) and 32) and right (area 32) anterior cingulate, left (areas 23 and 31) and right (areas 31 and 24) posterior cingulate (decrease), left (areas 7 and 5) superior parietal, left (area 40) inferior parietal, left (area 7) and right (areas 7 and 19) medial parietal (precuneus), left (area 22) and right (area 41) superior temporal gyrus (decrease), left (area 21) middle temporal gyrus (decrease), left (areas 18 and 19) and right (areas 18 and 19) occipital cortices }, includes most of the neocortex. There was one area which correlated with task difficulty, and that was area 10 in rostrolateral prefrontal cortex.

The Tower of London problem has also been used for patients with various pathologies. Robin Morris et al. [Morris et al., 1988] studied Parkinson patients. They found these patients took the same number of moves to solve the problem as normals and were also unimpaired as regards executing given plans, generating low level strategies and spatial working memory. However, they found a deficit in thinking time, possibly due to attention-switching difficulties.

Adrian Owen et al. [Owen et al., 1990] used the Tower of London to study patients with frontal lobe lesions and found they needed more moves to complete problems and had longer thinking times.

Claire Hughes et al. [Hughes et al., 1994] used it to study autism. They found it revealed evidence for executive dysfunction, since impairment in performance was specific to those stages in the task which placed greatest demands on executive control.

16.7 Learning problem-solving strategies

A number of learning mechanisms have been proposed for learning Tower of Hanoi strategies, see also chapter 24. Some are intended as psychological explanations and others are general AI mechanisms. The first work was due to Yuichiro Anzai [Anzai, 1978] [Anzai and Simon, 1979] [Anzai, 1987] and used an adaptive production system [Waterman, 1975] to learn new rules during the solution of the Tower of Hanoi task. Since then, other quite different learning mechanisms have been shown to be capable of learning these strategies [Langley, 1985] [Anderson, 1989] [Ruiz and Newell, 1989], and yet others are probably capable of learning them also [Minto et al., 1989]. The method of Patrick Langley [Langley, 1985] uses analysis of search trees. Dirk Ruiz [Ruiz and Newell, 1989] used a SOAR model which chunked, giving tower noticing. Tower noticing in SOAR has recently been reported also by Tim Chapman [Chapman, 1997]. John R. Anderson has described a sequential connectionist network [Anderson, 1989]. A neural network model has also been described by Randolph Parks and John Cardoso [Parks and Cardoso, 1997].

16.8 Extending our model to allow solution of the Tower of Hanoi problem

16.8.1 Tower of Hanoi strategies

The Tower of Hanoi problem is the most studied, and strategies used by human subjects have been captured as production rule systems, see the work of John R. Anderson [Anderson et al., 1993] and of Kurt Van Lehn [VanLehn, 1991]. I will consider the two most frequently observed strategies - the perceptual strategy and the goal recursion strategy. In the general case, reported by Anzai and Simon [Anzai and Simon, 1979], naive subjects start with an initial strategy and learn a sequence of strategies which improve their performance. The two strategies were observed by Anzai and Simon as part of this learning sequence. Figure 16.12 shows how a problem-solving strategy might be distributed over modules, including directing attention and eye movement.

[Diagram: a strategy distributed over the plan, plan_self_action, object_action, object_motion, eye_gaze and macroaction modules and the environment, with descriptions such as test_legal(move(d,p)), legal(move), move(d,p), fixate(p) and lookat(d) passing between them.]

Figure 16.12: Distributing a strategy over several modules

Starting from Simon’s formulation [Simon, 1975], I was able to represent these two strategies in our model, as follows:

16.8.2 Selective search

Selective search consists of making apparent progress while avoiding repetition, that is, avoiding (i) moving the same disk twice in succession, (ii) reversing the move just made with the same disk, and (iii) returning a disk to the peg it was last on, even if it was not the last one moved. These requirements concern the history of the problem solving episode, and I will deal with this issue in chapter 23. Figure 16.13 indicates how this strategy might be distributed over the perception-action hierarchy.
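The three constraints amount to a simple filter over the move history. A Prolog sketch of this filter (my formulation; representing the history as a list of past moves, most recent first, is an assumption):

% acceptable(+Move, +History): Move passes the selective-search filter.
% History is a list of move(Disk, From, To) terms, most recent first.
% Constraint (ii) is already implied by constraint (i), since reversing
% the last move would mean moving the same disk twice in succession.
acceptable(move(D, _, _), [move(D, _, _)|_]) :- !, fail.   % (i) and (ii)
acceptable(move(D, _, To), History) :-
    (  most_recent(D, History, move(D, LastFrom, _))
    -> To \= LastFrom                                      % (iii) not back
    ;  true                                                %  to its last peg
    ).

most_recent(D, [move(D, F, T)|_], move(D, F, T)) :- !.
most_recent(D, [_|Hs], M) :- most_recent(D, Hs, M).

A move generator would then combine this with legality, e.g. findall(M, (legal_move(S, M, _), acceptable(M, History)), Candidates), using the legal_move sketch given earlier.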

[Diagram: the production system for selective search distributed over the brain model: a top_goal module holding goal(on(1,3)) ... goal(on(5,3)) and target_peg(3); a plan module with rules PL1-PL5 operating on descriptions such as top(K,Peg), state(can(K,Peg1,Peg2)), not_just_moved(K) and not_last_on(K,Peg2); a perception module with rules PE1-PE3; and an action module with rules AC1-AC3, issuing move(K,Peg1,Peg2) commands.]

Figure 16.13: Representation of the selective search strategy on the brain model

16.8.3 Working goals

Since goals are created dynamically by the planning activity, I needed to extend the plan module to allow working goals as a description type. This mechanism was much better than trying to use the main goal module. We can limit the number of working goals. This would correspond to using a fixed size store, corresponding to working memory. The module can thus create working goals and use the current working goals as input to rules. Working goals would be held in dorsal prefrontal areas, either as part of or close to the plan module. Main motivating topgoals are held in the main goal module corresponding to anterior cingulate.
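A minimal sketch of such a bounded working-goal store (my own illustration of the idea; the capacity value and predicate names are assumptions):

:- dynamic working_goal/1.

capacity(4).        % assumed small fixed working-memory limit

% add_working_goal(+G): store a new working goal; once the capacity
% is reached, the oldest working goal is evicted to make room.
add_working_goal(G) :-
    findall(X, working_goal(X), Gs),      % assertz keeps insertion order
    length(Gs, N), capacity(C),
    (  N < C -> true
    ;  Gs = [Oldest|_], retract(working_goal(Oldest))
    ),
    assertz(working_goal(G)).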

16.8.4 Perceptual tests and mental imagery

The perceptual tests on the external state, i.e., the state of the Tower of Hanoi apparatus, were naturally placed in a separate perception module. This corresponds to Stephen Kosslyn’s [Kosslyn, 1994] image store. The main perceptual test needed is to determine whether a proposed move is legal. This involves (a) making a change to a stored perceived representation corresponding to making the proposed move, and (b) making a spatial comparison in this image store to determine whether the disk has been placed on a smaller or a larger one. With these two extensions, I was able to develop a representation of the perceptual strategy, depicted in Figure 16.14. I programmed and ran it and verified its correctness. 302 Chapter 16: Problem-solving behavior

[Diagram: the production system for the perceptual strategy distributed over the brain model: a top_goal module holding goal(on(1,3)) ... goal(on(5,3)), target_peg(3) and number_disks(5); a plan module with rules PL1-PL6 operating on state(S) and goal(move(Disk,Peg1,Peg2)) descriptions; an object_relations module with perceptual rules PE1-PE5 producing state(ps), state(done), state(can) and state(cant(Disk1)); a plan_self_action module with rule AC1; and object_motion and macroaction modules connecting to sensors and effectors via object_position and move_position descriptions.]

Figure 16.14: Representation of the perceptual strategy on the brain model

16.9 Episodic memory and its use in goal stacking

In order to represent the goal recursion strategy, we need to deal with goal stacking, which is represented by push and pop operations in existing production rule representations.

Since I did not believe that a stack with push and pop operations within a module is biologically plausible, I found an equivalent approach using an episodic memory module.

This module creates associations among whatever inputs it receives at any given time, and it sends these associations as descriptions to be stored in contributing modules. In general, it will create episodic representations from events occurring in extended temporal intervals; however, in the current case we only needed simple association.

In the Tower of Hanoi case, the episode was simply taken to be an association between the current working goal and the previous, parent, working goal. I assume that these two working goals are always stored in working memory and are available to the plan module. The parent forms a context for the working goal. The episode description is formed in the episodic memory module and transmitted to the plan module where it is stored. The creation of episodic representations can proceed in parallel with the problem solving process, and it can occur automatically or be requested by the plan module. Rules in the plan module can retrieve episodic descriptions using the current parent working goal, and can replace the current goal with the current parent, and the current parent with its retrieved parent. Thus the working goal context can be popped.
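The push and pop of the conventional production-rule formulation then correspond to storing and retrieving these associations. A runnable Prolog sketch (my paraphrase of the mechanism just described; the predicate names are mine, and goal_now/1 and parent_now/1 are assumed initialized to the top goal and a root marker):

:- dynamic goal_now/1, parent_now/1, context/2.

% "Push": adopt subgoal Sub.  The association between the current goal
% and its parent is first stored as an episode.
adopt_subgoal(Sub) :-
    retract(goal_now(G)), retract(parent_now(P)),
    assertz(context(goal(G), goal_context(P))),   % episodic association
    assertz(goal_now(Sub)), assertz(parent_now(G)).

% "Pop": on completing the current goal, replace it by its parent, and
% replace the parent by the parent retrieved from episodic memory.
resume_parent :-
    parent_now(P),
    context(goal(P), goal_context(GP)),           % retrieve the episode
    retract(goal_now(_)), retract(parent_now(P)),
    assertz(goal_now(P)), assertz(parent_now(GP)).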

This representation is more general than a stack, since any stored episode could be retrieved, including working goals from episodes further in the past. Such effects have in fact been reported by Van Lehn et al. [VanLehn et al., 1989] for human subjects.

With this additional extension, I was able to develop a representation of the goal recursion strategy, depicted in Figure 16.15. I programmed and ran it and verified its correctness.

[Diagram: the production system for the goal recursion strategy distributed over the brain model: a goal module holding goal(on(1,3)) ... goal(on(5,3)), target_peg(3) and number_disks(5); a plan module (rules P1-P3, P41, P42, P5-P7) holding goal(G), goal_context(C), state(S) and context(goal(G),goal_context(C)) descriptions; an episodic module (rules C1, C2) storing and returning context(goal(G),goal_context(C)) associations in response to goal_context_request(G); an object_relations module with test rules T2, T3, T41, T42, S1 and O1 producing state(done), state(can), state(cants(J)), state(cantt(J)), state(ps) and state(biggest(J)); and plan_self_action, object_motion and macroaction modules connecting to sensors and effectors.]

Figure 16.15: Representation of the goal recursion strategy on the brain model

Descriptions of episodes are of the form context(goal(G),goal_context(C)), goal(G) being the current working goal and goal_context(C) the current parent working goal. The figure shows a slightly more general version, where episodes are stored both in the episodic memory module and the plan module. This allows episodes that have not yet been transferred to the cortex to be used.

16.10 Falsifiable predictions of brain area activation

For the two strategies, we can now generate detailed predictions of brain area activation sequences that should be observed during the solution of the Tower of Hanoi problem. Using the computer realization, we can generate detailed predictions of activation levels for each time step. Since there are many adjustable parameters and detailed assumptions in the model, it is difficult to find clearly falsifiable predictions. However, we can also make a simplified and more practical form of prediction using the classification of brain states into four types, previously discussed in chapter 13 and shown in Figure 15.10.

If we call these types of states G, E, P and M respectively, then, for example, the predicted temporal sequences of brain state types for 3 disks are as follows. For the perceptual strategy: G0,G,E,P,G,E,P,G,E,P,E,M,P,G,E,P,G,E,P,E,M,P,G,E,P,G,E,P,E,M,P, G,E,P,E,M,P,G,E,P,G,E,P,E,M,P,G,E,P,E,M,P,G,E,P,E,M,P,G0. And for the goal recursion strategy: G0,G,E,P,G+,E,P,G+,E,P,E,M,P,G*,E,P,E,M,P,G*,E,P,G+,E,P,E,M,P, G*,E,P,E,M,P,G,E,P,G+,E,P,E,M,P,G*,E,P,E,M,E,G,E,P,E,M,P,G0. We can generate similar sequences for different numbers of disks and different strategies. The physical moves of disks occur during the M steps. The timing is usually about 3.5 seconds per physical move, but the physical move steps probably take longer than the average cognitive step. If a physical move takes 1.5 seconds, that leaves about 2 seconds for the roughly seven cognitive steps accompanying each move, i.e., about 300 milliseconds per cognitive step.

The perceptual strategy used is an expert strategy where the largest disk is always selected. I am assuming perfect performance; to handle wrong moves, I will also need a theory of how mistakes are made, and then will be able to generate predictions for those as well. In the goal recursion strategy, I assume the subject is using perceptual tests for proposed moves, and is not working totally from memory. G indicates the creation of a goal, G+ a goal creation and storing an existing goal (push), and G* the retrieval of a goal (pop) [Anderson et al., 1993] [Anderson and Douglass, 2001]. Anderson et al. have shown that pushing a goal takes about 2 seconds, although we have taken creation of a goal to not necessarily involve pushing. For me, pushing only occurs when a new goal is created and an existing goal has to be stored. G0 is activity relating to the top goal.

16.11 Tower of Hanoi in BAD

In the next chapter, I'll describe the BAD language, which I devised to allow easier programming of these types of brain model. I tested it by programming the different Tower of Hanoi strategies in it as well.

16.12 Appendix - Talairach coordinates

Since there is considerable variation in the size and shape of people’s brains, in order to compare imaging findings among different brains, there is a method, due to Talairach and Tournoux [Talairach and Tournoux, 1988], of scaling the measurements of any brain into approximately a standard form, i.e., normalizing. The idea is to find coordinate axes, x horizontally across the brain, y longitudinally along the length of the brain and z vertically, and then to rescale the brain data along each of these axes separately.

To do this we need to find two key points in the brain, namely, AC, which is the most posterior point of the anterior commissure, and PC, which is the most anterior point of the posterior commissure. What on earth are these? Figure 16.16 shows the commissure, or corpus callosum, which is the connection between the hemispheres, in a coronal view, so the cortical axons that leave layer 3 to connect to the corresponding area in the other hemisphere are routed to pass through this. Then Figure 16.17 shows a lateral view of the commissure with the different parts labeled. It has four main parts, from the front: R - rostrum, G - genu, B - body, and S - splenium; the anterior and posterior commissures are more extreme parts.

Figure 16.16: The corpus callosum shown in coronal section

We have to locate AC and PC by looking at the MRI image. We put the origin of coordinates at AC and the forwards y-axis along the line from PC to AC. This is shown in Figure 16.18, taken from [Talairach and Tournoux, 1988].

The rest is straightforward: the extent of the x-axis is to the most lateral point of parietotemporal cortex, of the y-axis from the most posterior point of occipital cortex to the most anterior point of frontal cortex, and of the z-axis from the highest point of parietal cortex to the lowest point of temporal cortex. You need to linearly scale the positive and negative parts of each axis, and in addition the AC-PC interval, all independently.

Figure 16.17: The corpus callosum showing anterior and posterior commissures

The brain is scaled to the size of a particular brain which has the following dimensions - (i) along the y axis: 70mm from the most anterior point to AC, 23mm from AC to PC, and 79 mm from PC to the most posterior point, a total of 172 mm, (ii) along the z axis: 42mm from most inferior point to AC, and 74mm from AC to the most superior point, a total of 116 mm, and (iii) along the x axis: 68mm from the leftmost point to the center point and 68mm from the rightmost point to the center point, a total of 136 mm.

It is a further convention to use an integer numbering system based on 1mm increments, giving volume elements, or voxels, of size 1 mm x 1 mm x 1 mm. So the volume of size 136 x 172 x 116 is divided along the x-axis from -68 to +68, the y-axis from -102 through -23 (PC) to +70, and the z-axis from -42 to +74. These integer coordinates are then used for comparing a given brain with standard data about brains.
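As a worked sketch of the y-axis case (my own arithmetic following the recipe above; the predicate name is assumed), given a subject's measured distances Ant (most anterior point to AC), Acpc (AC to PC) and Post (PC to most posterior point), in mm, with the origin at the subject's AC:

% talairach_y(+Y0, +Ant, +Acpc, +Post, -Y): piecewise linear rescaling
% of a y coordinate (mm, origin at AC, anterior positive) onto the
% standard 70 / 23 / 79 mm dimensions.
talairach_y(Y0, Ant, Acpc, Post, Y) :-
    (  Y0 >= 0      -> Y is Y0 * 70 / Ant               % anterior of AC
    ;  Y0 >= -Acpc  -> Y is Y0 * 23 / Acpc              % AC-PC interval
    ;  Y is -23 + (Y0 + Acpc) * 79 / Post               % posterior of PC
    ).

% e.g. talairach_y(-30, 72, 25, 80, Y): a point 5 mm behind this
% subject's PC maps to Y = -23 - 5*79/80, about -27.9.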

Figure 16.18: The sagittal plane showing AC and PC, and the x and z axes, drawn in orange

The resulting coordinate system is diagrammed in Figure 16.19.

[Diagram: a box drawing of the Talairach volume with corner coordinates such as (68,70,74), (68,-102,74), (-68,70,-42) and (-68,-102,-42), AC and PC marked on the y-axis, and the axis lengths 70, 23, 79, 42, 74 and 68 indicated.]

Figure 16.19: Talairach coordinate system

Chapter 17

BAD - a Brain Architecture Description language

Abstract: I describe a formal description language for specifying the anatomical structure and the information processing of brains at the architectural level. This is a logical language in which concepts and mechanisms of brain architecture can be represented. The descriptions can form a complete model of the brain, which can be executed using the BAD interpreter.

BAD allows users to define a set of intelligent memory modules by the types of description they can store and the set of description transformation rules that constitute their intelligence. It allows users to construct a complete brain by specifying channels connecting the modules into a parallel computation system. Having specified an external world, the user defines the sensors which transduce aspects of the external world into descriptions which are input into specified modules, and effectors which take descriptions of motor actions and cause corresponding changes in the external world. The BAD system is embedded in the Prolog language, which is used in places for specifying standard computations.
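To give a flavor of such a specification, a purely hypothetical fragment follows. These declaration forms and names are invented here for illustration only and are not the actual BAD syntax:

% Hypothetical flavor only -- not BAD's real syntax.
module(plan,                                  % an intelligent memory module
       [goal(_), state(_), working_goal(_)]). % description types it stores
channel(object_relations, plan).              % a connection between modules
rule(plan,                                    % a description transformation
     [state(can), goal(move(D, P1, P2))],     %   conditions
     [move_disk(D, P1, P2)]).                 %   descriptions produced
sensor(vision,                                % transduces world aspects
       object_position(_Obj, _X, _Y, _Z),     %   into descriptions
       object_relations).                     %   input to this module
effector(arm, move_position(_Obj, _X, _Y, _Z)).  % motor description executed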


17.1 Introduction and motivation

I have described my approach to modeling the brain at an architectural level, as well as an example brain model I have implemented. The current chapter concerns another aspect of my agenda, namely that brain modeling should involve a usable language in which concepts, mechanisms, assumptions, etc., about the brain can be expressed.

This would enable brain scientists to express their ideas, insights and theories in a well-defined scientific/mathematical language which would be open to testing and to argumentation and co-construction. Theories would be tested by describing neural systems of interest in the language and then determining their consequences by executing the system using the language's interpreter program. Cooperative debate among brain scientists would be facilitated and enhanced by having a precisely defined means of expression and representation of ideas.

Such a language would be a precisely defined formal language with declarative as well as imperative constructions. It should be possible to completely describe the system of interest in the language. The description of the system of interest, written in the language, would be a precise formal specification of the system, realized as a formal document or text. In my own work, I have been mainly concerned with information-processing aspects of the brain, although some treatment of external physical processes has been unavoidable.

This specification would then be executable on a computer using an interpreter for the language. Such an execution would construct a model of the system of interest and then allow an execution of its dynamics, from a specified initial state, propagating it forward in time, according to its given specification.

In order to specify a complete experiment, it is necessary to specify not only the brain(s) but also their environment and initial conditions. Typically, in my own work, I have defined a brain model and then specified a world with several animals, each with its own brain. The environment of the brain includes the bodies of the animals as well as other objects and terrain around them. In addition, we need to specify sensors and effectors which connect the brain(s) with the external world. Clearly, a specified world will include many parameters which are defined and set by the user. For a particular experiment, an initial state is specified, which includes initial brain states.

This chapter describes such a language, which I call Brain Architecture Description Language, or BAD. It is embedded in the logic programming language Prolog, but includes its own special rule interpreter and other mechanisms. I have extensively demonstrated and tested all the mechanisms described here.

Systems other than complete brains can of course be specified, perhaps some subsystem of the brain. In this case, the user will need to specify the behavior of the rest of the brain, to the extent this is required for their investigation.

I also provide for instrumentation of the modeled system and for its visualization.

In a logical approach to computation such as the one adopted here, models can be used in a more general way than for explicitly specified experiments. For example, the experimenter has the ability to interrogate any aspect of the modeled world at any stage. He or she does this by writing queries in the language. It is also possible to write complex procedures for answering questions of interest about the constructed world.

We can also use the language to write command files which run complex experiments with several different phases and settings. The state of the world can be saved at any stage, and further experiments can start at a saved state.

17.2 Specifying modules and channels

17.2.1 Descriptions and description transformation rules

A brain, or brain subsystem, is described by an interconnected set of processing modules. Each module contains data of only given types. All data items are represented by descriptions.

A description is an expression of the form

predicate_name(weight,context,list_of_terms)

where weight is a term which is a list of weights, explained later, context is a term, and a term in general is an expression of the form constant_name or function_name(list_of_terms). More exactly, these are ground terms, because no variables occur in them.

An example of a description is

position([1.0,1.0,1.0,1,[]],any,[adam,300.0,200.0,0.000])

which might be a stored item representing the position X,Y,Z of a perceived person adam, with certain weights. The context term is intended to be used for grouping descriptions into sets, i.e., all those descriptions with the same value of context form a set, but I have not used this feature yet.

All these names, predicate, function and constant, should start with a lowercase letter. For example, in the description:

p(a,f(b,c),g(h(d,e),f))

p is the predicate name, f, g and h are function names and a, b, c, d, e, and f are constant names. These names may be any identifiers and they do not have to be declared.

This form of description in logic is called a ground literal, and it usually represents an assertion that the predicate is true of some objects designated by the terms that are

its arguments. In our case, a description is simply a representation of information held in the module. It is often however useful to use, and to think of, a description as an assertion. We can also use descriptions as constraints, that is, conditions that must be satisfied.

The meanings of descriptions in any brain model are determined by whatever is specified in the model, for example how they relate to sensory data or to some property of the dynamics of the brain model. Thus, they do not mean that an assertion is being made or is held by the brain. If there is any connection to assertions it is that the theorist is using a description for making assertions about what information is present in the module.

To indicate what data types can be stored in a given module, we give a description pattern which is of the same form as a description except that it can have variable names in place of constant names; variable names start with a capital letter. A description pattern defines a set of descriptions which is all those descriptions that can be obtained from it by substituting ground terms for its variables.

For example, the description pattern action(W,C,[adam1,Action]) matches the literal action([1.0],any,[adam1,sitting]), where the variable W is bound to [1.0], the variable C is bound to any, and the variable Action is bound to the constant term sitting. It also matches the literal action([2.0],any,[adam1,walking(5.0)]), where Action is now bound to walking(5.0), and so on.
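This matching is ordinary Prolog unification, so it can be tried interactively at the Prolog prompt; the following query is illustrative only and is not a BAD construct:

?- action(W,C,[adam1,Action]) = action([1.0],any,[adam1,sitting]).
W = [1.0], C = any, Action = sitting.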

Computation in modules is represented by a set of description transformation rules.

17.2.2 Basic rule form

The basic and simplest form for a rule is

rule(agent, rule_name, context,
     if((sequence of description patterns)),
     then([list of description patterns]),
     provided((clause_body)),
     weights(weights)).

It is executed by matching the left-hand side, the “if” expression, to bind variables, and then constructing and storing the descriptions given by the right-hand side, the “then” expression. For example, consider the following rule in some module:

rule(M, macroaction_2, context(all),
     if((predator(W_pr,C,[MP,Species]),
         position(W,C,[M,X,Y,Z]),
         position(W_p,C,[MP,XP,YP,ZP]))),
     then([move_towards(W_1,any,[XA,YA,ZA])],Wa),
     provided((MP \== M,
               X1 is (X - XP), XA is X + X1,
               Y1 is (Y - YP), YA is Y + Y1,
               Z1 is (Z - ZP), ZA is Z + Z1)),
     weights(1.0,[1.0,1.0,1.0],[1.0,1.0,1.0])
    ).

The meaning of this rule is as follows. If we have, in the module's store, the items

predator([1.0,0.7,0.8,1,[]],any,[koko,chimpanzee]),
position([1.0,1.0,0.8,1,[]],any,[brad,300.0,200.0,0.0]) and
position([1.0,0.9,0.8,1,[]],any,[koko,500.0,100.0,0.0]),

then the if part of the rule will match to the store, giving bindings MP bound to koko, XP, YP, ZP to koko's position, M to brad, X, Y, Z to brad's position, etc., and the effect of the rule will be to store a new description like

move_towards([1.0,0.85,0.8,1,[]],any,[100.0,300.0,0.0])

that is, a target displaced away from the predator by the predator-to-self vector.

In general, a rule is executed by matching all the if description-patterns, thereby binding variables to values. Of course, if the same variable occurs more than once, either in the same pattern or in two different patterns, it must have the same binding for all its occurrences. There may be more than one way that the if part of a given rule can match to the store, since there can be many descriptions in the store. Each way of matching produces an instance of activation of the rule, with an associated set of bindings of variables to terms. Every instance is executed.

The then part is executed, for a particular instance, by constructing, from each description pattern in the then expression, a description obtained by substituting terms for variables. These constructed descriptions may then be stored in the module and/or output to other modules.

17.2.3 Prolog

BAD is based on the standard logic programming language Prolog, and has some of the same characteristics.
(i) It is a conversational language, meaning that the user programs and interacts with his or her program always by a two-way conversation with the system. At any one time, there is a prompt and the user may type any expression. This is taken as a query by the system, which answers the query, types the reply back to the user, and gives the prompt again.
(ii) Identifiers are not declared and the language is type-free, that is, any variable can take any type of value: real, integer, symbol, term, etc.
(iii) All expressions in queries are executed by matching them against the current database. This means that any specification which has a variable in a certain position is automatically specified for all values that match the variable. So, if I assert speed(Person,5), then since Person is a variable, I am asserting that the speed is 5 for all persons. If instead I assert speed(adam1,6), then this asserts the speed for adam1 only.

17.2.4 Weights

The first argument of each description is a list of weights [ws_i]. These weights vary in time by various mechanisms to be described below. The general form of the final part of a rule is:

weights(real,[list of reals],[list of reals])

where the first argument is the overall weight of the rule, w_o, the second is the list of weights for the if patterns, [wl_i], and the third is the list of weights for the then patterns, [wr_j]. When the rule is evoked, the weight given to the j-th constructed then description is the bilinear combination

w_j = w_o * wr_j * Σ_i (wl_i * ws_i)

where ws_i is the weight of the stored description matched by the i-th if pattern; that is, the overall weight of the rule instance, w = w_o * Σ_i (wl_i * ws_i), is multiplied by the [wr_j]. It is these weights that are used in competition, the strongest description and the strongest rule activation being preferred. Example:

rule(M, person_motion_1, context(all),
     if((position(W_p,C,[M,X,Y,Z]),
         position(W_p1,C,[M1,X1,Y1,Z1]))),
     then([position(W_1,C,[M1,X1,Y1,Z1])],Wa),
     provided((M1 \== M,
               distance_squared_between(X,Y,Z,X1,Y1,Z1,D),
               D < 1000.0)),
     weights(0.8,[1.0,0.5],[1.0])
    ).

At the moment these weights usually stay fixed; however, the language allows their values to be variable and computed at run time. For example, we could have the term Wpos is W_p * 5.0 in the body, and weights(0.8,[1.0,0.5],[Wpos]) as the weights expression.
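As a minimal sketch, the weight combination above could be written as the following Prolog predicate; the names rule_instance_weight and dot_product are mine, for illustration only:

% Compute W = Wo * Wrj * sum_i(Wl_i * Ws_i), where Wls are the if
% pattern weights and Wss the weights of the matched stored items.
rule_instance_weight(Wo, Wls, Wss, Wrj, W) :-
    dot_product(Wls, Wss, Sum),
    W is Wo * Wrj * Sum.

dot_product([], [], 0.0).
dot_product([A|As], [B|Bs], S) :-
    dot_product(As, Bs, S0),
    S is S0 + A*B.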

17.2.5 Computation

We can have computations within a rule. This is specified by the provided part of the rule. Its argument is an arbitrary Prolog expression, which can refer to any of the variables in the if and then parts of the rule. Computations are usually simple and used as filtering tests.

However, for some purposes, it may be necessary to define complex predicates in Prolog and to use them in rules; for example, a predicate to compute the 3D distance squared between two points might be defined as a Prolog predicate dist(X1,Y1,Z1,X2,Y2,Z2,D2) and then called within the body of a rule.
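A possible definition of such a helper, using the dist/7 name mentioned above; the body is a straightforward sketch:

% 3D squared distance between (X1,Y1,Z1) and (X2,Y2,Z2).
dist(X1, Y1, Z1, X2, Y2, Z2, D2) :-
    D2 is (X2-X1)*(X2-X1) + (Y2-Y1)*(Y2-Y1) + (Z2-Z1)*(Z2-Z1).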

The basic idea is that the computations should be ones that could conceivably be done by a neural area, and of course they can only use information present in the module. Ideally, all variable values should be obtained from the if pattern match, but sometimes this ideal may need to be violated and a match made during the course of the computation, as in the following.

Notexists tests. One type of test that is useful in rules is the nonexistence of any item in the store which matches a given literal. This is written:

((Literal,!,fail) | true)

Any unbound variables that occur in Literal and do not occur in the if part of the rule are existentially quantified within the test, so the test asserts that no matching item exists for any values of those variables. Example:

((position(W,C,[Agent,X,Y,Z]), Z > 100.0, !, fail) | true)

which, if the variable Agent does not occur in the if part of the rule, means that there does not exist any agent with Z > 100.0. If Agent does occur in the if part of the rule, then this is equivalent to having a pattern position(W,C,[Agent,X,Y,Z]) in the if expression and a test Z > 100.0 in the body of the rule.
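In standard Prolog, the same test can be written with the negation-as-failure operator, which the cut-fail idiom above implements explicitly:

% equivalent body fragment using negation as failure
\+ (position(W,C,[Agent,X,Y,Z]), Z > 100.0)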

17.2.6 Rule execution

Discrete time. The system works on a discrete time scale, in cycles. There is a variable called cycle which starts at 1 and is incremented by 1 each cycle. Its value can be found by using the query cycle_no(Cycle).

All rule instances are executed. In a cycle, all possible instances of all rules are executed and then the results compete to determine which of them will update the store of the module and which will be output to other modules.

Updates and outputs. The execution of a rule results in computed descriptions. These are specified as updates to the store of the same module, or outputs to other modules, or both. These are specified by the user giving output_data_item and update_data_item declarations, see below.

Execution until stability. In one cycle, all rules are repeatedly executed until there is no change in the store. During this process, only the updates are stored; when stability (or quiescence) is reached, the outputs are made. This repetition, or iteration, is done to ensure integrity of the store before any communication with other modules.
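A minimal sketch of this iteration, with hypothetical predicate names for the store operations:

% Fire all rule instances repeatedly until the module's store stops
% changing; only then are the accumulated outputs released.
run_to_quiescence(Module) :-
    store_snapshot(Module, Before),
    fire_all_rule_instances(Module),
    store_snapshot(Module, After),
    (   Before == After
    ->  true
    ;   run_to_quiescence(Module)
    ).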

17.2.7 Competition

All descriptions and all rule activations have a weight which is a number representing their strength.

All the updates and outputs generated by the set of rule executions in a given time cycle are subjected to competition, as specified by the user in list_of_rule_number_sets, update pattern and output pattern declarations.

Here is how rule competition is specified:

:-assert((list_of_rule_number_sets(M,Mod,[list of lists of rule numbers]))).

The results from all the rule instances whose rule numbers are in the same set compete against each other. The result of a rule is the instantiated then expression plus the Wo value, which is the overall rule instance strength. Rule competition uses just the Wo values.

For example:

:-assert((list_of_rule_number_sets(M,Mod,[[R1,R2],[R3,R5]]))).

defines two rule number sets, [R1,R2] and [R3,R5].

Then all the results from rules R1 and R2 have their Wo values compared and only the one with the largest Wo is used. The same for R3 and R5. Results from any other rules all go through and are used.

After rule competition, all the surviving updates and outputs from the rule instances are made into two overall lists. These lists, of updates and outputs respectively, are then, if desired, subjected to individual competition among descriptions. Here is how update and output competition are specified:

Update_patterns_declaration::= list_of_update_patterns(Module_name,Threshold,[list of literals]). e.g., list_of_update_patterns(plan,0.1, [pstate(S,P,A),episode(E),working_goal(Goal)]).

Output_patterns_declaration ::= list_of_output_patterns(Module_name,Threshold,[list of literals]). e.g., list_of_output_patterns(plan,0.1, [episode(E),plan_self_action(PSA)]).

The way competition currently works is as follows: for each pattern in the list of update

patterns, all the generated descriptions which match to that pattern are compared and only the one with the largest W1 weight is allowed to actually update the store. In addition, the winning W1 weight must be greater than the specified threshold. Any generated updates which do not match any update pattern are allowed to update the store. The analogous method is used for outputs also.
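A minimal sketch of this per-pattern selection, assuming the five-element weight list [C,W1,W2,W3,W4] described in the next section; all predicate names here are mine, for illustration:

% Of the candidates matching one update pattern, keep the single
% description with the largest W1, and require it to beat Threshold.
compete(Candidates, Threshold, Winner) :-
    strongest(Candidates, Winner),
    weight_w1(Winner, W1),
    W1 > Threshold.

strongest([D], D).
strongest([D|Ds], Best) :-
    strongest(Ds, B0),
    weight_w1(D, W1), weight_w1(B0, W0),
    ( W1 >= W0 -> Best = D ; Best = B0 ).

% Extract W1 from a description predicate_name([C,W1|_],Context,Args).
weight_w1(Description, W1) :-
    Description =.. [_Pred, [_C,W1|_], _Context, _Args].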

17.3 Data

17.3.1 Storage - descriptions

A storage item must have an item type declaration, see below. The form of storage items is any ground literal with first argument a weight, second a context and third a list of terms. A weight is a list of five elements, [C,W1,W2,W3,W4], where C, W1 and W2 are reals which the user can use as they like; I use C for certainty, W1 for short-term weight and W2 for long-term weight. These values are attenuated using three attenuation values provided by the user. W3 is an integer and is the time at which the item was last updated or used; this is used by the system to avoid doing the attenuation calculation more often than it needs to. W4 is a pair [Status,Time] where Status::= transferred | refracted and Time is an integer indicating the time at which the Status lapses. transferred means that the item stays unattenuated until the given time, to allow the receiving module to confirm the item; refracted means that the item stays down below noise level until the given time.

17.3.2 Uniqueness and integrity

Memory item update characteristic patterns: items matching these patterns replace existing items:

item_type(Person,Module,Type,[Input_pattern,Stored_pattern])

An incoming description is matched to the Input_pattern; if it matches, then the Stored_pattern is matched to the store. If this matches, then the incoming description updates the matching stored description. If no Input_pattern matches, or if the Stored_pattern does not match to the store, then the description is simply stored. E.g.,

item_type(Person,Module,data,[goal(G),goal(G)]).

Any incoming goal description updates only an identical one; thus there can be an indefinite number of goal descriptions, with different goals G.

item_type(Person,Module,data,[action(M1,A1),action(M1,A2)]).

An incoming action description for a given person M1 updates any other action description for the same person. Thus, only one action description can exist for a given person; however, there can be an indefinite number of different action descriptions, each for a different person.

17.3.3 Updating

Updating of the weights of memory items from the incoming weights of matching items uses a multiplicative increment to the excitation of the item:

(((W1_input >= W1_old), W1_new is (W1_old*(1.0 + Incw1)))
 |
 ((W1_input < W1_old), W1_new is (W1_old*(1.0 - Incw2))))

The amount of update to use is specified by statements of the form:

update_weights_increment(Person,Module,Incw1,Incw2).

For example, to increase by 20% if the input is above the existing weight and decrease by 10% if the input is below the existing weight, on each update:

update_weights_increment(Person,Module,0.2,0.1).
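The same update written as a single predicate, with the hypothetical name update_w1:

% Multiplicative weight update: strengthen if the incoming weight is
% at least the stored one, weaken otherwise.
update_w1(W1_input, W1_old, Incw1, Incw2, W1_new) :-
    (   W1_input >= W1_old
    ->  W1_new is W1_old * (1.0 + Incw1)
    ;   W1_new is W1_old * (1.0 - Incw2)
    ).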

17.3.4 Attenuation

Attenuation factors used in memory attenuation:

attenuation_factors(Person,Module,Type,DW,Description_pattern).

e.g., for data to attenuate to noise level in about 20 cycles:

attenuation_factors(Person,Module,data,[0.1,0.1,0.1],C).

Noise level can be set, and is the value at which a description is removed from the store:

noise_level(Person,Module,Type,Noise_level).
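Assuming the attenuation is multiplicative (my reading of the factors above), a per-cycle decay of DW = 0.1 leaves W * (1 - 0.1)^20, about 12% of the initial weight, after 20 cycles, i.e., roughly noise level:

% One cycle of attenuation of a single weight (illustrative).
attenuate(W_old, DW, W_new) :-
    W_new is W_old * (1.0 - DW).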

17.3.5 Confirmation

Confirmation data. Factors controlling the impact of confirmation values on weights are specified by:

confirm_factors(Person,Module,Type,[CFNEG,CFPOS],[Confirm_threshold,CSNEG,CSPOS])

where (i) Confirm_threshold determines whether the computed confirmation value counts as a positive or a negative confirmation, (ii) CFNEG is subtracted from W1 for negative confirmation, i.e., disconfirmation, (iii) CFPOS multiplies the confirmation value that is added to W1 for positive confirmation, and (iv) CSNEG and CSPOS are used to stretch the incoming weight value to produce a stretched confirmation value. Typical values are:

confirm_factors(M,goal,data,[0.4,0.05],[0.2,1.0,1.0]).

Since a confirmation expression is a description, it must also have a specified type with a memory item update characteristic pattern, e.g.:

item_type(Person,Module,confirmation,
    [confirm(goal,affiliation,goal(G1),PGS1),confirm(goal,affiliation,goal(G1),PGS2)]).

17.4 Specifying brain architecture

A brain will consist of a set of modules connected by a set of channels.

% list of names of memory modules
list_of_module_types(List_of_module_names).

e.g.,

list_of_module_types([affiliation,goal,plan,plan_person,
    plan_person_action,plan_self_action,person,
    person_action,person_motion]).

Each module will be given a name and will be specified by the description types it can store and the description transformation rules that determine its computational activity.

As explained above, description types are specified by declarations of the form:

item_type(M,Mod,Pattern_pair)

e.g.,

item_type(M,person_motion,[action(M1,A1),action(M1,A2)])

Connections are given, for each module, by a set of statements, each of which gives a description pattern and the name of a destination module:

output_data_item(Person,Module_name,Description_pattern,Destination_module_name).

e.g.,

% output data list for person_motion module
output_data_item(M,person_motion,action(Name,Action),person_action).

What a connection statement means is that, at the appropriate moment in the processing cycle, all descriptions matching the given description pattern will be transmitted to the given destination module. These descriptions are actually located in the channel until stored by the module.

17.5 The form of an external world specification

This specifies everything external to the brain, so “external world” means the body and the rest of the world. What is specified is usually the initial state of the world at the start of a run, or a state of interest to be investigated.

The user can specify any kind of world. Here are examples of external worlds that I have used. For a set of persons, each is given by a set of assertions giving their position, orientation, etc.:

person(name(Person)).
age(Person,years(Years)).
sex(Person,Sex).
action(Person,Action).
config(Person,[Body_config,Face_config]).
position(Person,X,Y,Z).
orientation(Person,[Body_orientation,Head_orientation]).
% orientation is the angle of the axis of the person to the x-axis
% (counter-clockwise in the x-y plane)

e.g.,

person(name(adam1)).
age(adam1,years(6.0)).
sex(adam1,male).
action(adam1,sitting).
config(adam1,[sitting,normal]).
position(adam1,320.0,320.0,0.0).
orientation(adam1,[0.0,0.0]).

For objects in the Tower of Hanoi, I used expressions like the following. For each disk:

object_name(thdisk1,thdisk1).
object_position(thdisk1,[0.0,0.0,50.0]).
object_orientation(thdisk1,[0.0,0.0]).
object_velocity(thdisk1,[0,0.0]).
object_angular_velocity(thdisk1,[0,0]).
object_mass(thdisk1,1.0).
object_support(thdisk1,[],[]).
object_color(thdisk1,red).
object_shape(thdisk1,annulus).
object_size(thdisk1,[50.0,30.0,10.0]).

For each peg:

object_name(thpeg1,thpeg1).
object_position(thpeg1,[0.0,0.0,0.0]).
object_orientation(thpeg1,[0.0,0.0]).
object_velocity(thpeg1,[0,0.0]).
object_angular_velocity(thpeg1,[0,0]).
object_mass(thpeg1,1.0).
object_support(thpeg1,[],[]).
object_color(thpeg1,black).
object_shape(thpeg1,cylinder).
object_size(thpeg1,[30.0,100.0,0.0]).

Model parameters. Speeds are expressed in increments per memory cycle, where the memory cycle corresponds to 20ms, i.e., 50 cycles/second. The spatial scale is a body length of 10.0, i.e., the unit is about 1 inch; thus a speed of 0.4 means 0.4 inches per 20ms, which is 20 inches/second, about 1.7 ft/second or 1.1 miles/hour. Angular speed is in degrees per memory cycle, so 2.0 degrees/20ms is 100 degrees/second.

speed(Agent,Speed). e.g., speed(X,0.4).
angular_speed(Agent,Angular_speed). e.g., angular_speed(X,2.0).

17.6 Sensors and effectors

The external world or environment. The external world is defined by a set of descriptions. Effectors cause changes in this set, and sensors examine the set and compute sensory descriptions to input to the brain.

Sensing the environment. A set of sensors is defined by the user. Each sensor is a Prolog predicate, and the system is given a list of names of the sensors. Each cycle it calls each member of this list once.

Each sensor predicate places descriptions in its output channels, which are input channels to specified modules corresponding to sensory cortical areas, etc.

sensor_list(List_of_sensor_predicates). e.g., sensor_list([vision,tactile]).

Each sensor predicate must be written in the standard form

sense_for_agent_and_sensor(Person,Sensor_name).

so that the BAD system can call it, e.g., sense_for_agent_and_sensor(Person,vision). There are some basic sensor predicates available that I have written. A sensor predicate actually sends a Prolog query, i.e., a goal, to the environment, which solves it and sends the answer back.
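A minimal skeleton of such a sensor predicate; the body, and the helper send_description/3, are hypothetical:

% Vision sensor: query the environment for all positions and deliver
% them as descriptions to the agent's person_motion module.
sense_for_agent_and_sensor(Person, vision) :-
    forall(position(Other, X, Y, Z),
           send_description(Person, person_motion,
               position([1.0,1.0,1.0,1,[]], any, [Other,X,Y,Z]))).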

Acting upon the environment. Similarly, a set of effectors is defined by the user, each as a Prolog predicate, and the system calls each member of this list during the processing cycle. The system calls each effector predicate once for each description on its input channel, passing this description as an argument. An effector predicate makes changes in the representation of the external world.

effector_list(List_of_effector_predicates). e.g., effector_list([macroaction]).

Each effector predicate must be written in the standard form

execute_motor_command(Effector_name,Person,Description_pattern).

so that the BAD system can call it, e.g., execute_motor_command(movement,adam,move(5.0,2.0,0.0)). An effector sends a description to the environment which specifies a desired change. The environment determines what changes actually occur.
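A minimal skeleton of such an effector predicate; the body is hypothetical and, for simplicity, applies the requested displacement directly rather than letting the environment arbitrate:

% Movement effector: apply a requested move to the world state.
execute_motor_command(movement, Person, move(DX,DY,DZ)) :-
    retract(position(Person, X, Y, Z)),
    X1 is X + DX, Y1 is Y + DY, Z1 is Z + DZ,
    assert(position(Person, X1, Y1, Z1)).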

17.7 Executing a complete brain model

The way a complete brain model is executed is in discrete processing cycles, usually corresponding to about 20 milliseconds in simulated time. During each such processing cycle, each module is processed once:
1. sensors perceive the environment and input their descriptions to any modules they are connected to;
2. all input descriptions on input channels from other modules are stored, by updating, into the module;
3. all instances of all active rules, found by matching rules to stored data items, are executed to quiescence;
4. all generated updates are subjected to competition;
5. all winning updates are used to update the store of the module;
6. all generated outputs are subjected to competition;
7. after quiescence, all winning output descriptions are transmitted on output channels to other modules;
8. some modules send output descriptions to effectors, which make changes to the environment, or more exactly send change requests to the environment.
Since the rules are independent, it does not matter in what order they are processed. This is equivalent to complete parallelism of the system, at the level of the modeling abstraction being used here. A sketch of one such cycle in Prolog follows.
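All predicate names in this sketch are hypothetical, standing in for the steps enumerated above (run_to_quiescence is the earlier sketch):

% One processing cycle for one agent's complete brain model.
process_cycle(Agent) :-
    run_sensors(Agent),                                % step 1
    forall(module_of(Agent, M), deliver_inputs(M)),    % step 2
    forall(module_of(Agent, M),
           ( run_to_quiescence(M),                     % step 3
             compete_and_commit_updates(M),            % steps 4-5
             compete_and_send_outputs(M) )),           % steps 6-7
    run_effectors(Agent).                              % step 8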

An example of a brain. We previously diagrammed in Figures 15.1 and 15.2 the architecture of an example brain model, and we show in Figure 15.3 a visualization of the various components of a complete world model and the relationships of these components.

Example of behavioral states of a complete world model. Figure 15.4 shows a visualization of a typical instantaneous state of the model. Figure 15.5 shows two interacting agents.

17.8 Specifying a complete system and experiment

At the very top level, we need to create a set of agents, and a (possibly time-dependent) environment, an initial state, and possibly some instrumentation regime. In addition, we may well want to have a user interface with certain visualizations available of the external environment and of the states and activities of modules.

An agent will have a brain model and a set of sensors and effectors which connect it with the environment. In general, each agent will run on a different computer and so will have its own brain model specification. We can of course make all the brains identical initially if we choose. Alternatively, we can define agents to have different brain models, for example one agent with a normal brain and another with a pathological brain. We can also have different species, which we might want in a predator-prey experiment.

The details of the distribution of agents over different machines are specified in a distribution file. A multiagent model, including all agents and the world environment, can run on a single machine, provided there is enough memory available. Exactly the same code is used; it is only necessary to change the distribution file.

17.9 Specifying the initial state of the brain models

We need to create initial individual memories of persons, to initialize memory contents for all persons. For most of the immediately perceptible data, this will be filled in

automatically once the model starts, by the perceptual process in the model. There is some data however that the experimenter will need to put in by hand.

For example, we might need to store the initial memory of dominates relations, which is a set of descriptions of the form dominates(Weights,Context,[Person1,Person2]), e.g.,

dominates([1.0,1.0,1.0,1,[]],any,[alice1,adam1]).

Similarly we need an initial memory of affiliation relations, affiliation(Weights,C,[Person1,Person2]), e.g.,

affiliation([1.0,0.67,0.8,1,[]],any,[adam1,alice1])

and perhaps a memory of physiological parameters:

sexual_satiety([1.0,1.0,1.0,1,[]],any,[0.8])
hunger_satiety([1.0,1.0,1.0,1,[]],any,[0.8])
energy_level([1.0,1.0,1.0,1,[]],any,[0.8])
arousal_level([1.0,1.0,1.0,1,[]],any,[0.8])

This is straightforward to arrange.

17.10 Loading the complete model world

A complete multiagent model will be loaded onto a set of machines on a network. This is currently done by making up a load file for each machine. A load file is basically a Prolog file consisting of a sequence of consult calls which load all the different Prolog files that are needed.

There will also be a load file for the world server, which will run on another machine. Visualization is usually done by connecting from some machine on the internet at runtime.

% load distributed memory machine
:-consult(dmm5).
% load own utility predicates
:-consult(utils).
% load motor interface for selected world
% motor1 is for vervet society
:-consult(motor1).
% load sensor interface for selected world
% sens1 is for vervet society
:-consult(sens1).
% load initial model of 9 memory types and dynamics
:-consult(affiliation).
:-consult(goal).
:-consult(plan).
:-consult(plan_person).
:-consult(plan_person_action).
:-consult(plan_self_action).
:-consult(person).
:-consult(person_action).
:-consult(person_motion).
:-consult(person_intention).
% global data concerning the model
:-consult(gmod).
% background world
% this is a file with objects that never change
% during a run but which may affect the action
% like rocks, terrain etc.
:-consult(bgw).
% decide which agents will exist and which are to be active
% the list of all agents that exist in the world
:-assert((list_of_agents(List_of_agents))).
e.g. :-assert((list_of_agents([adam1,adam2,alice1,alice2]))).
% the list of all agents that are active
% i.e. whose brains are active
% agents whose brains are not active will still have
% an effect on the action i.e. active agents will
% perceive and react to inactive agents
:-assert((list_of_active_agents(List_of_active_agents))).
e.g. :-assert((list_of_active_agents([adam1,adam2,alice1,alice2]))).

17.11 Trial files

In order to actually run an agent, a run predicate is used. I have developed a method of detailed control of the brain model which I find useful. This consists of a file which contains the run predicate, and its definition, which explicitly runs each process of each brain module in turn. In other words it micromanages the execution of the brain.

17.12 Appendix - BAD Syntax

17.12.1 Syntax of BAD models

A BAD file specifies a model. A model consists of specifications of a set of modules, plus a specification of the connections between modules. To more precisely specify the syntax of the BAD language, I will use Backus-Naur form in a standard notation.

Model::= List_of_modules Connection
List_of_modules::= a set of files, each containing a Module
Connection::= a file called gmod.pl containing the General_specification, and a file called output_data_items containing the Communication_specifications
Module::= Module_header Declarations List_of_rules

17.12.2 Syntax of the BAD rule

:-dynamic rule/7.
Rule::= rule(Agent_variable,Rule_name,Context,If,Then,Provided,Weights)
Agent_variable::= variable
Rule_name::= constant symbol identifier
Context::= context(Term)
If::= if((sequence of Literals separated by commas))
Then::= then([list of Literals separated by commas],Weight_name)
Literal::= Prolog literal with an explicit Weight_name as first argument
% for the if part, a literal should match to stored items of the module;
% for the then part, also to output items for the module
Weight_name::= constant symbol identifier
Provided::= provided((arbitrary Prolog code block)); if none, then provided((true))
Weights::= weights(Overall_rule_weight,Input_weights,Output_weights)
Overall_rule_weight::= rule_weight(real), default 1.0
Input_weights::= Weights_list
Output_weights::= Weights_list
Weights_list::= [list of reals separated by commas], default a list of 1.0s of arbitrary length
There should be at least as many input weights as if literals; if there are fewer, the last weight is used for weighting subsequent literals, so the default can be written [1.0].
e.g.
rule(M, macroaction_1, context(all),
     if((prey(W_pr,C,[MP,Species]),
         position(W_p,C,[MP,X,Y,Z]))),
     then([move_towards(W_1,context(any),[X,Y,Z])],Wa),
     provided((MP \== M)),
     weights(1.0,[1.0,1.0,1.0],[1.0,1.0,1.0])
    ).

17.12.3 Syntax of BAD modules

Module::= Module_header Declarations List_of_rules

The BAD module header
Module_header::=
:-module(Module_name,[]).
:-consult(bad).
where bad.pl is a Prolog file containing consult(utils). consult(exec). consult(update). consult(confirm). consult(combine_weights). etc.

The BAD module declarations
Declarations::= list of Declarations
Declaration::= Item_type_declaration | Corresponding_type_declaration |
    Update_weights_increment_declaration | Attenuation_factors_declaration |
    Noise_level_declaration | Update_patterns_declaration | Output_patterns_declaration
:-dynamic item_type/3.

Item type declarations
Item_type_declaration::= item_type(Agent_name,Module_name,Pattern_pair).
Agent_name::= symbol name
Module_name::= symbol name
Pattern_pair::= [Description,Description]
e.g.
item_type(M,macroaction,[plan_act(W,C,[PA]),plan_act(W,C,[PA])]).
This means that a plan_act can only update another identical plan_act, so there can be many different plan_act items in the store simultaneously.
item_type(M,macroaction,
    [object_position(W,C,[Name,X1,Y1,Z1]),object_position(W1,C1,[Name,X2,Y2,Z2])]).
This means that an object_position updates any existing object_position with the same Name, hence there is only ever at most one object_position item for a given perceived agent.

Corresponding type declarations
:-dynamic corresponding_type/3.
Corresponding_type_declaration::= corresponding_type(Agent_name,Module_name,
    [Description,Description,Compare_list]).
Compare_list::= compare_list([list of Variable_pairs])
Variable_pair::= [Variable,Variable]
Variable::= Prolog logic variable
e.g.
corresponding_type(M,macroaction,
    [plan_act(W,C,[orient_parallel_to(B,H)]),
     plan_act(W1,C1,[orient_parallel_to(B1,H1)]),
     compare_list([[B,B1],[H,H1]])]).
This means that an item of the form plan_act(orient_parallel_to(B,H)) will update a stored item of the form plan_act(orient_parallel_to(B1,H1)) if (B-B1)*(B-B1) + (H-H1)*(H-H1) < threshold.

Update weights increment declaration
:-dynamic update_weights_increment/5.
Update_weights_increment_declaration::=
    update_weights_increment(Agent_name,Module_name,Name,Real,Real).
e.g. update_weights_increment(M,macroaction,Name,1.0,1.0).
:-dynamic attenuation_factors/5.
:-dynamic attenuation_factors_input/5.

Attenuation factors declaration
Attenuation_factors_declaration::=
    attenuation_factors(Agent_name,Module_name,Name,[Real,Real,Real],Term).
  | attenuation_factors_input(Agent_name,Module_name,Name,[Real,Real,Real],Term).
e.g.
attenuation_factors(M,macroaction,Name,[1.0,1.0,1.0],X).
attenuation_factors_input(M,macroaction,Name,[1.0,1.0,1.0],X).

Noise level declaration
:-dynamic noise_level/4.
Noise_level_declaration::= noise_level(Agent_name,Module_name,Name,Real).
e.g. noise_level(M,macroaction,Name,0.1).

Update patterns declaration
:-dynamic list_of_update_patterns/3.
Update_patterns_declaration::=
    list_of_update_patterns(Module_name,Threshold,[list of literals]).
e.g.
list_of_update_patterns(plan,0.1,
    [pstate(W1,C1,[S,P,A]),episode(W2,C2,[E]),working_goal(W3,C3,[Goal])]).

Output patterns declaration
:-dynamic list_of_output_patterns/3.
Output_patterns_declaration::=
    list_of_output_patterns(Module_name,Threshold,[list of literals]).
e.g.
list_of_output_patterns(plan,0.1,
    [episode(W1,C1,[E]),plan_self_action(W2,C2,[PSA])]).

17.12.4 General specification - The gmod file

The whole model is put together using a general specification file, which should be named gmod.pl. Its form is as follows:
General_specification::= Gmod_header List_of_module_types List_of_in_module_types List_of_out_module_types Sensor_list Effector_list Defaults
Gmod_header::=
:-module(gmod,[
    list_of_module_types/1,
    list_of_in_types/1,
    list_of_out_types/1,
    atten_list/1,
    sensor_list/1,
    effector_list/1,
    update_weights_increment/5,
    confirm_factors/6,
    confirm_update_parameters/9,
    refraction_factors/3,
    noise_level/4,
    compare_default/1,
    output_data_item/4]).
:-use_module(library(lists)).
:-consult(output_data_items).
:-dynamic list_of_module_types/1.
List_of_module_types::= list_of_module_types([list of Module_names]).
:-dynamic sensor_list/1.
Sensor_list::= sensor_list([list of Sensor_names]).
e.g. sensor_list([vision,tactile]).
:-dynamic effector_list/1.
Effector_list::= effector_list([list of Effector_names]).
e.g. effector_list([world]).
Defaults::= system file containing default values for update increments, attenuation factors, noise level, etc.

Communication_specification::= Output_data_item | Update_data_item
:-dynamic output_data_item/4.
Output_data_item::= output_data_item(Agent_name,Module_name,Literal,Module_name).
:-dynamic update_data_item/4.
Update_data_item::= update_data_item(Agent_name,Module_name,Literal,Module_name).
e.g.
output_data_item(M,plan,test(W,C,[T,P1,P2,P3]),object_action).
update_data_item(M,plan,working_goal(W,C,[WG])).

Chapter 18

Logical systems

Abstract: In this chapter, I view the types of system that I am using to describe information processing in the brain as abstract mathematical entities, and consider their general properties.

BAD rules are inference rules and all the declarations in the BAD language are logical assertions.

I define the concept of a logical system, which provides modules and communication channels between them, using a very general representational approach.

I discuss logic programming concepts as they relate to my research, the different possible ways of using logic, the use of representations of constraints and issues of representing uncertainty in logic.


18.1 Using logic

In order to take the most general approach, one which makes the fewest theoretical commitments, I came up with the idea of what I am now calling a logical system.

Modules and channels. In this representation, a system consists of a set of modules with communication channels between them. A module is a general computational device which is described by a logical theory, and computation within the module is represented as logical deduction.

The data passed around consists of logical literals, i.e., information represented as atomic assertions without any logical connectives, which are true or false. In general I have mainly used ground literals, that is, no variables occur in the literal, just constants. This could all be changed, one could use general literals, or general clauses, or sets of clauses.

A module receives data and stores it. It carries out computations which make logical deductions from its current store of data, thereby generating derived data. This data may be stored and/or communicated to other modules; see Figure 18.1.

Figure 18.1: The concept of logical module

How the computation is actually done within a module can, at this level, be left unspecified. However, it should derive enough consequences to be useful.

The representation of data as ground literals is also intended to leave open issues of representation as a more specific or detailed code. A literal is intended to indicate only what information is included in the data item, and our abstract representation of computation is intended to leave uncommitted any more detailed postulates about processing in a module.

The computation starts with existing data in the module's store and derives new data. This could be something as simple as a multiplication, d3 is d1*d2, with d1 = 2 and d2 = 3 giving d3 = 6, but in general it will be something more complicated involving the functional endowment of the module plus any knowledge the module has accumulated.

From a logical point of view, any data that is derived will be logically implied by the module’s original data and knowledge, provided the computation process in the module is sound and the data is not inconsistent.

Since I am assuming there is a store of data and therefore a persistent state of the module, a functional approach would have difficulties. In any case, I preferred an approach based on logic programming because it promises to be easier to specify.

Clauses. In a logic programming approach, the steps of computation are logical inferences. The knowledge in the module is represented as logical statements, which typically will be conditional statements of the form

if a1 and a2 and a3 and ... and an then b

or

if a1 and a2 and a3 and ... and an then b1 and b2 and b3 and ... and bm

which are said to be in clause form. The expression on the left of the “then” is the antecedent and on the right is the consequent.

It is known that for any first-order predicate-logic statement it is possible to find a set of simpler statements in clause form, which is logically equivalent to the original given statement.

Also, in most cases one can find a logically equivalent formula made up of clauses of the first type above, called Horn clauses, in which the consequent comprises only one literal.

Every set of statements, provided it is not logically inconsistent, constitutes a theory.

Models and computation. We can define computation as: “an inference process operating on data which is articulated into a series of steps, and which terminates with a well-defined result which is derived data.”

The computation derives new knowledge, in particular new ground literals. Since there is always a model of any consistent theory, called a Herbrand model, which consists of ground literals, one can say that computation generates, or explores, a model of the theory.

There are different strategies for such computations, with two basic forms:

1. A hypothesis or query statement is used and the system finds a proof for it. This is a top-down process generating a tree of derived statements with the leaves usually being ground literals. This is the strategy used by the Prolog language.

2. A bottom-up process is used, as in the Datalog language for deductive databases. It starts from the set of stored ground literals and applies all the rules, i.e., the Horn clauses, from right to left, thereby generating further ground literals.
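As an illustration only (the representation rule(Head,Body) and fact/1 is mine, not part of BAD or Datalog), a naive bottom-up evaluator fits in a few lines of Prolog:

% Repeatedly add any derivable fact until none is new: this computes
% the minimal model of a set of Horn clauses over the stored facts.
bottom_up :-
    (   rule(Head, Body),
        all_hold(Body),
        \+ fact(Head)
    ->  assert(fact(Head)),
        bottom_up
    ;   true
    ).

all_hold([]).
all_hold([G|Gs]) :- fact(G), all_hold(Gs).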

It was shown by Van Emden and Kowalski in 1976 [VanEmden and Kowalski, 1976] that these two different strategies always lead to the construction of the same model of the theory. This model is called the minimal model of the given theory, meaning that it makes the fewest assumptions. This minimal model is in fact contained in, i.e., is a subset of, every possible Herbrand model of the theory.

Viewing the set of rules as an operator on the space of all possible models, the minimal model has the property that Rules ∗ Model = Model: applying the rules again to the model results in the same model. This is called the fixed point of the operator.

I also started with a bottom-up computational strategy because that way I could obtain results in a short time to give a realtime response.

Thus, initially I just allowed one cycle of inference in each time unit, so every instance of every rule was executed once. Thus, if the rule were

if p(X) and q(Y) then r(X)

and the data were p(a), p(b), q(c) and q(d), then there are four instances of the rule, namely

if p(a) and q(c) then r(a)
if p(a) and q(d) then r(a)
if p(b) and q(c) then r(b)
if p(b) and q(d) then r(b).

As we saw in chapter 16, I changed to a different computational strategy which executed all rule instances repeatedly until there was no further change, i.e., quiescence.

Arithmetic expressions. The logical framework allows, as special cases, arithmetic expressions:

if p(X) and q(Y) then Z is X**2, r(Z).

and standard arithmetic functions:

if p(X) and q(Y) then Z is sin(X), W is log(Y), Z >= W, r(W).

Constraints. We can represent boolean constraints: if p(X) and q(Y) then X < Y.

A further refinement I am thinking of using is to take a rule to be based on numerical

constraints, like X > Y, X == Y, etc. So a rule might be: if X > Y and Y == Z and Z =< 3 then X == Z. The advantages of such a representation are that the constraints can be varied continuously as the system learns, and it also might map naturally onto a neural representation.

The use of weights. The use of logical formulae with weights is still being researched [Zaniolo et al., 1997]. It is not clear how to use probabilities, since their rules of combination must be consistent with the logical combination of the formulae they are associated with. One method that has good properties, such as a fixed point property, is to use fuzzy sets [VanEmden, 1986].

We can, however, always define our own way of weighting formulae by making the weight into an extra argument of each predicate. Then we can define how the weights combine as part of the logic of the rule. For example: if p(X,C1) and q(Y,C2) then C3 is C1*C2, r(Y,C3); see [Sterling and Shapiro, 1994]. This is basically what I did in my own research.

The logical representation of data types. We can represent data structures and data types by sets of rules:

constructors: e.g., if p(X) and q(Y) and r(Z) then t(f(X,Y),Z)
selectors: e.g., if t(f(X,Y),Z) then r(Z).
predicates: e.g., if p(W) and W = t(f(X,Y),Z) then t_type(W).

It is conventional in logic programming to define lists as terms involving a constructor functor cons and then to use a square bracket notation for such terms. Thus the list (a,b,c) is represented as the constructor term cons(a,cons(b,cons(c,nil))) which will be written as [a,b,c].

If we want to view a plan as a data structure:

if step(S) then plan([S]).
if step(S) and plan(P) and append([S],P,P1) then plan(P1).
if plan(Q) and plan(P) and append([Q],P,P1) then plan(P1).
if plan(Q) and plan(P) and append(Q,P,P1) then plan(P1).
if execute_plan(P1) and append([S],P,P1) then execute(S) and execute_plan(P).
if execute(S) and step(S) then step_execute(S).
if execute(S) and plan(S) then execute_plan(S).

Similarly, we can define goals, visual images, object-files, etc., using logical rules.

18.2 Inference and models

A theory is a set of logical statements, and we will usually be concerned with a set of statements in the form of Horn clauses. This can also be viewed as a program which can be run, in which case this form is called a Definite Program.

There are a number of useful constructions and ideas on the formation of models for such theories, which we briefly review in this section.

First is the idea of a Herbrand model, which is the model we can always explicitly construct provided the theory is not inconsistent. The definition of inconsistency of a theory is that some statement can be proved and its negation can also be proved. To form the Herbrand model of a given theory, we construct the Herbrand domain, the Herbrand base, and then Herbrand interpretations of the theory. The Herbrand domain is the set of all the ground terms that can be constructed from the constants and function letters occurring in the theory. For example, consider the theory, which we will call L, consisting of just two clauses:

C1. if likes(X,logic) then likes(chris,X).
C2. likes(bob,logic).

Its Herbrand domain is {chris, bob, logic}, since by inspection of the theory there are these three constants and no functions. The Herbrand base is the set of all ground literals obtained by substituting all possible different combinations of the Herbrand domain into the literals of the theory. So for theory L the Herbrand base is:

{likes(chris,chris), likes(chris,logic), likes(chris,bob),
 likes(logic,chris), likes(logic,logic), likes(logic,bob),
 likes(bob,chris), likes(bob,logic), likes(bob,bob)}

A Herbrand interpretation is formed by assigning true or false to each member of the Herbrand base. For example:

{likes(chris,chris) - true, likes(chris,logic) - true, likes(chris,bob) - false,
 likes(logic,chris) - true, likes(logic,logic) - false, likes(logic,bob) - false,
 likes(bob,chris) - true, likes(bob,logic) - true, likes(bob,bob) - false}

To make things simpler, we use the notation of giving the set of only the elements that are true, it being understood that an element not in this set is false. So the above Herbrand interpretation is written:

{likes(chris,chris), likes(chris,logic), likes(logic,chris), likes(bob,chris), likes(bob,logic)}

A Herbrand model is a Herbrand interpretation that makes all the statements of the theory true, and a minimal one can be constructed bottom-up. For example, for theory L, the only fact in the theory is likes(bob,logic), so we form the set {likes(bob,logic)}. We then instantiate all the rules, in this case the one rule in the theory, to use the facts in the set so far:

if likes(bob,logic) then likes(chris,bob).

so we have a new fact, and the set is now {likes(bob,logic), likes(chris,bob)}. Further steps do not add any more facts, so this set is a model of L.

For theory I, consisting of the fact int(0) and the rule if int(X) then int(s(X)), we have one fact, so we start with {int(0)}, and get if int(0) then int(s(0)), so the set is now {int(0),int(s(0))}. Repeating this gives a model of theory I as the limit of this infinite series.

For theories made up of Horn clauses, there will be a unique minimal model, and the above method will always generate this model. One can see this if we consider interpretations as sets of ground literals and form the lattice of all such sets, with the partial ordering relation < of set inclusion relating different members. This lattice will be complete, meaning that there are greatest and least elements, such that every element in the lattice is greater than the least element and smaller than the greatest element; this is not necessarily true for lattices in general. The method starts with one set and generates another, and this is a monotonic operation, since if X < Y then the set generated from X is contained in the set generated from Y; and a monotonic operator on a complete lattice always has a least fixed point, which here is the minimal model.

Organizations and microtheories. Our abstract system concept bears some relation to Carl Hewitt's model of organizations [Hewitt, 1986], although he did not formulate or implement it; rather, it was a conceptual model. He suggested that each department in an organization would be a module and would have a microtheory comprising all the knowledge and worldview of that department. The work of an organization is then for departments to negotiate with each other. If there was no problematic issue with a certain message, it would simply be processed; however, some messages would lead to problems of understanding or procedure, and policy would then be revised by negotiation among departments.

Communications standards. There is some activity in standards committees such as W3C, which sets internet standards, to define some standard types of messages, such as assertions, queries and declarations, so that different computers by different manufacturers and with different operating systems would have a standard allowing them to intercommunicate unambiguously.

Chapter 19

Symbols

Abstract: In this chapter, I discuss the use of symbols in theories of the brain, including a review of current thinking on the physical symbol system hypothesis.

I explain the difference between symbols in theories and symbols in models of those theories. I then differentiate different properties that symbols may have, and develop a logical approach to describing those properties.


19.1 Approaches to symbols

The use of a discrete representation for specifying data and processing does not imply that I am assuming that the brain processes symbols. First, any processing mechanism will be expressed in some language, which is necessarily a discrete representation, and will use mathematical symbols. For example, a neural net is usually expressed in mathematical symbols or in a programming language such as C. Another example is neural transmission mechanisms, such as the Hodgkin-Huxley equations, which are represented as mathematical symbols. Second, a symbol used in cognition is a particular type of data whose representation would have to be defined in the model and whose behavior and properties would be given by sets of rules. There is quite a lot of psychological work on clarifying if and when the brain uses symbols, and if so how to model it, see [Keith J. Holyoak and John E. Hummel, 2000] and [Zenon Pylyshyn, 1985].

One can explicitly postulate that the brain uses symbols [Newell, 1990]. In 1980, Newell published a paper entitled “Physical Symbol Systems” [Newell, 1980]. He defined a symbol system, SS, as very similar to his 1972 definition [Newell and Simon, 1972] and indeed his 1958 paper [Newell et al., 1958], but more precisely articulated as a notional machine with input, output and control units. He gave precise definitions of all the ten basic operations that his machine could perform.

In 1990, Newell published his “Unified Theories of Cognition” book, which gave an up-to-date statement of his approach to cognitive modeling. In this he reiterated that symbol tokens are parts of structures to be processed. His essential argument was that knowledge must consist of a set of finite structures, and that, to retrieve further knowledge, you need the occurrence of a symbol in one structure to allow you to refer to and retrieve another structure. His diagram is reproduced here in Figure 19.1.

His definition of a symbol system was similar to his 1980 definition, but the architecture


Figure 19.1: Newell’s concept of symbol

of the notional machine was now the SOAR machine.

My own view is that the brain only uses symbols for natural language syntax and semantics. For other processing, the theory of the brain will use symbols to designate the structure of data and not for referencing other data items. Thus the brain works by associative operations between rules and data items, both of which are structures.

A structure will be taken to be information components with subset and ordering relations among them, and is representable as a bracketed expression. The brain may actually use a two-dimensional structuring, with two dimensions of ordering as well as subsets.

Variables in expressions are merely slots for matching, and in principle they could specify data types that can match to the slot. For this approach to work, we ideally need expressions without repeated occurrences of any variable. Also, reference between structures will need to use descriptor terms that can be computed and associatively matched into a store.

19.2 Programming issues

Relation to conventional AI programming. The distribution of memory and control in my model differs greatly from that of a conventional artificial intelligence (AI) computer program, as diagrammed in Figure 19.2.

In a conventional AI program, Figure 19.2(a), there is a program which is serially executed one instruction at a time by a processor. During this execution, it reads and writes data from and to an addressable memory. The program can examine any part of the memory, one item at a time, and has a global data perspective. Also, there is a single point of control, that is, at a given moment in time there is one place in the program instructions where activity occurs.

My model uses distributed data and control, Figure 19.2(b). The data is not all in one memory, but is in separate modules depending on data type. There is a process in each module which can access and modify only the data in that module. There is therefore no process which has access to all data, no global data perspective, and there are many distributed processes rather than a single control focus.
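
The contrast can be sketched as follows; this is an illustrative Python toy, with invented module names, rules and data, showing that each process touches only its own module’s store and that modules interact only by tokens sent along fixed channels.

    # Toy sketch of distributed data and control: each module runs its own
    # process over its own local store; there is no global memory and no
    # single point of control.

    class Module:
        def __init__(self, name, rules):
            self.name = name
            self.store = set()    # local data; only this module's process sees it
            self.inbox = []       # tokens arriving from other modules
            self.rules = rules    # each rule maps the local store to new tokens

        def step(self, channels):
            self.store |= set(self.inbox)
            self.inbox.clear()
            for rule in self.rules:
                for token in rule(self.store):
                    self.store.add(token)                 # local consequence
                    for target in channels[self.name]:    # fixed output channels
                        target.inbox.append(token)

    goals = Module("goals",
                   [lambda s: {("want", "drink")} if ("thirsty",) in s else set()])
    plans = Module("plans",
                   [lambda s: {("plan", "get-water")} if ("want", "drink") in s else set()])
    channels = {"goals": [plans], "plans": []}

    goals.store.add(("thirsty",))
    for _ in range(2):              # the two processes run concurrently in spirit
        goals.step(channels)
        plans.step(channels)
    print(plans.store)              # contains ('want','drink') and ('plan','get-water')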

Figure 19.2: Difference between Conventional AI Program and our Distributed System

Chapter 20

Cortical motivation

Abstract: In this chapter, I introduce the concept of motivation at the cortical level, which results from the system’s tendency to maintain the integrity of its state, consistency with the environment, and consistency of purpose and social interaction. The tendency to integrity and consistency may be related to the minimization of computational resources, energy and time, and as such constitutes a motivational mechanism deriving from the cortex.


20.1 Integrity, continuity and identity

I have been led to postulate that the cortex has its own motivational dynamic, based on: (i) consistency [Festinger, 1957] or coherence [Klein, 1976], (ii) integrity [Klein, 1976], (iii) temporal continuity [Klein, 1976], (iv) the need for expression [Freud, 1900] [Freud, 1895], and (v) the desire for novelty, all of which have strong social dimensions.

These are information processing measures but are related to the survival ability to understand the world, to cooperate with others, and to minimize energy consumption [Allman, 1999].

I have realized that there are certain properties of the cortical process in my model that correspond to motivation at the cortical level. These mainly concern the tendency of the system to maintain integrity, temporal continuity and identity. George S. Klein, in his cognitive analysis of psychoanalytic ideas [Klein, 1976], defines the existence of identity and self as deriving from “a sense of the continuity, coherence and integrity of one’s actions and thoughts in respect to autonomy and we-identity” [Klein, 1976], p. 180. By continuity he meant temporal continuity; by coherence of actions, encounters and relationships he meant their compatibility with the autonomous and affiliative aspects of selfhood; and by integrity he meant a sense of moral truth concerning what one does and feels.

I need to define identity more simply, but more operationally, for the abstract systems I am working with. His concept of continuity seems fairly easy to define, as a consistency between the knowledge and actions of the system at time t and time t+1. His concept of coherence presumably relates to the knowledge the system has, and the actions of the system, being compatible with the system’s goals. His concept of integrity seems to concern the compatibility among the different items of knowledge the system has and with the actions it does.

I’ll discuss integrity here, and other aspects as they come up in later chapters, such as temporal continuity in chapters 22 and 23.

20.2 Integrity mechanisms in my model

The system is constantly perceiving new information, deriving consequences from it, and integrating it into its knowledge and plans. The mechanism of the model tends to maintain integrity in several ways: in the updating and attenuation of module stores, in running inferences to quiescence, and in finding viable states in which the different modules agree with each other, with the goals of the system, and with the perceived environment.

Updating and attenuation. As the external world changes, the system will perceive changed information, which in many modules will be inconsistent with previous information and will require updating of data items. This occurs (i) by overwriting similar data items with new ones, (ii) by computing new consequences which overwrite old ones, or recomputing derived information which replaces old similar information, and (iii) by ignoring and failing to update data that is less relevant, less cogent (less apt to fire rules), or less consistent, so that attenuation in the store eventually removes it. In this way the knowledge in stores will tend to track the changes in the external environment. This leads to consistency with the external world.
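
A sketch of this updating-and-attenuation dynamic follows; it is illustrative Python, and the similarity key, decay constant and threshold are invented parameters.

    # Sketch of a module store tracking a changing world: a new item overwrites
    # a similar old one (here, one with the same predicate and first argument),
    # and items that are never refreshed attenuate until they drop out.

    DECAY, THRESHOLD = 0.8, 0.1

    class Store:
        def __init__(self):
            self.items = {}                        # similarity key -> (datum, activation)

        def assert_datum(self, datum):
            self.items[datum[:2]] = (datum, 1.0)   # overwrite the similar item

        def attenuate(self):
            self.items = {k: (d, a * DECAY) for k, (d, a) in self.items.items()
                          if a * DECAY >= THRESHOLD}   # stale items disappear

    s = Store()
    s.assert_datum(("position", "adam", 300, 200, 0))
    s.assert_datum(("position", "adam", 305, 215, 0))  # replaces the old position
    print(len(s.items))                                # 1: only the newest survives
    for _ in range(12):
        s.attenuate()                                  # nothing refreshes the item...
    print(s.items)                                     # {} ...so the store forgets it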

Inference to quiescence. As already indicated, running all matching rules to quiescence causes a series of related updates which recompute all the logical consequences of any newly received information. Thus the derived model, whose construction is the main activity within a module, is repeatedly reconstructed. This leads to integrity of instantaneous action, derived from consistency of conclusions with the knowledge in the module.

Finding viable states. The system searches for, finds, and puts itself into states which are viable. This gives consistency between its goals, perceptions and actions.

Intermodule consistency. The confirmation mechanism tends to boost rules and data in one module that are relevant to other modules. This gives viability and consistency of the overall system state and of module-module interaction.

Social viability. By interacting with other persons, the system tends to promote states in which its viable states are compatible with the viable states of the others. This leads to social consistency and integrity of action with the actions of others.

Thus, all these different mechanisms cause the system to move into states of greater integrity and unity. If the environment stops changing, the system will also eventually stop changing. It is probably possible to show that states of greater integrity consume fewer computational resources, or less energy. Certainly inconsistent knowledge will cause the system to make unnecessary inferences, which will consume resources and time.

Cognitive dissonance. In the late fifties, the cognitive psychologist Leon Festinger introduced the idea of cognitive dissonance [Festinger, 1957] [Harmon-Jones and Mills, 1999]. From experiments, he concluded that people tended to avoid inconsistency.

It seems to me that the concept of consistency arises naturally in a logical approach to system representation, whereas in other approaches it might be difficult to define.

Chapter 21

The layer, neural and cell levels of description

Abstract: This chapter discusses how we can relate our abstract system level of description of the brain to more detailed levels. A module corresponding to a cortical area can probably be represented as a set of interacting associative memories corresponding to cortical layers, where the outputs from each such memory compete, where the degree of agreement between outputs controls termination, and where there is also updating of identical or similar items in the store, and attenuation. Such associative memories can probably be realized as neural nets using threshold models of neurons. I will also discuss more realistic models of single neurons, and the point neuron.

Finally, I will discuss explaining the dynamics of a single neuron using a formulation of cell dynamics. This can include cell summation, thresholding, firing and propagation of spikes to presynaptic regions, release of vesicles into the synaptic cleft and uptake by the postsynaptic channels. There can also be cellular memory mechanisms, including the use of RNA and transcription factors created from received neural impulses.


21.1 Introduction

This chapter discusses how I may be able to relate my abstract system level of description of the brain to more detailed levels.

1. I propose to define the next lower descriptive level below the system level as using the concepts of a single cortical layer, which will be described as an associative memory with competitive outputs. Then one module is described as a set of layers which are interconnected and which have the same behavior as an abstractly described module describing one neural area of the brain.

2. I could then define the next level below this as the single neuron, so that the associative memory of one cortical layer is explained or represented as a network of neurons. I plan to introduce my own model of one neuron which has a short term store, a long term store and dynamic connectivity to other neurons.

3. I then briefly outline how our single neuron can be explained or modeled by cell dynamics. The cell has many different types of complex molecules which interact metabolically and one can write down equations for describing the changes of the cell over time. In addition, the cell has DNA and transcription mechanisms which can be modeled as part of the dynamics of the cell.

21.2 The structure of a brain module

Figure 21.1: Structure of a brain module (input channels from other brain modules, an updating associative store of data values and rules, rule filters, update filters and output filters, and output channels to other brain modules)

21.3 The flow of data

My analysis of information processing concepts overlaps with the motivations which inspired the dataflow concept [Dennis, 1980] [Arvind and Gostelow, 1982] [Arvind and Culler, 1986] [Najjar et al., 1999]. My model has some correspondence to a special-purpose dataflow architecture with a set of dataflow machines, each with a cache and a set of processors. This set of machines would be connected in a fixed pattern by communication channels along which data tokens flow.

My model differs from and generalizes the standard dataflow machine in a number of ways:
1. Tokens are ground literals, so they have a standard form.
2. Actors are logical rules, again a standard form.
3. There is a store of tokens, which is associative. Some data may be stored only until actors are triggered by them, and other data may persist longer.
4. Data may be updated by combining with other tokens.
5. Any datum is usually targeted to more than one actor.
6. Activation of each actor is determined by a pattern match. Currently this has multiple data items with shared variables; however, I am planning if possible to eliminate multiple occurrences of variables.
7. There are multiple instances of actors; however, it is not necessary to specify different targets for the different instances.
8. Since a pattern match may fail, any datum is only potentially targeted to a given actor.

In general, my model is at a much higher level of granularity, of data and processing, than standard dataflow machines, and gives a way of increasing granularity to take advantage of contemporary technology. Since there is a notion of coherence or integrity of the store, integrity would have to be achieved in one phase of processing before a second phase, in which processes generate output from the store, could be triggered. Thus, there is a granularity of data organization which concerns the entire store. The store must reach a state of integrity before processes fire, so this dictates a granularity of processing as well. Finally, my machine consists of a set of such large-granularity dataflow machines. This distributed machine will have a coordination dynamics at the level of data and control communication among modules.

21.4 Representing a module as a set of interacting layers

Figure 21.2 gives an impression of how a cortical module can be realized as a set of cortical layers. Each layer is taken to be a particular, different kind of associative memory which contains rules and data and which associates the data and rules to cause execution of the rules. Typically, only the strongest output from an associative memory will be generated. The input layer is a transient store with fixed filtering rules; the data-plus-rules store provides short term memory with dynamic rules, and long term memory, also using dynamic rules.

Some of these cortical layers, particularly layer 4, may be structured as a set of several intimately interacting layers with different cell types; see the next section.

Figure 21.2: Module as interacting layers (cortical feedforward input, rules and stored data, feedback and confirm outputs to ipsilateral and contralateral cortex, and cortico-cortical and cortico-basal ganglia control loops through the thalamus and basal ganglia for routine action)

Figure 21.3 indicates how a set of layers might work to produce activity corresponding to rule application.

Figure 21.3: Module mechanism as interacting layers. An input datum position(adam,300,200,0) enters the store; the rule “if position(M,X,Y,Z), X1 is X+Vx, Y1 is Y+Vy, Z1 is Z+Vz then move(M,X1,Y1,Z1)” is applied repeatedly to quiescence; on quiescence the output move(adam,305,215,0) is transferred.

21.5 Neural nets

The level below this is the explanation of the action of cortical layers by neural nets. There are already at least two standard theoretical neural net models which act as associative memories, namely Hopfield nets and Kohonen nets [Hopfield, 1982] [Kohonen, 1989] [Haykin, 1994]. We will, however, have to develop additional mechanisms, namely deriving a signal representing quiescence of the store, and competitive retrieval of data.
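
As a reminder of how such an associative memory operates, here is a minimal Hopfield-style sketch (the standard textbook construction, not a claim about any specific cortical layer); the patterns are arbitrary examples.

    # Minimal Hopfield net sketch: store bipolar patterns by Hebbian outer
    # products, then recall by iterating a threshold update from a noisy cue.
    import numpy as np

    def train(patterns):
        w = sum(np.outer(p, p) for p in patterns).astype(float)
        np.fill_diagonal(w, 0.0)             # no self-connections
        return w / len(patterns)

    def recall(w, cue, steps=10):
        s = cue.copy()
        for _ in range(steps):
            s = np.where(w @ s >= 0, 1, -1)  # synchronous threshold update
        return s

    patterns = np.array([[1, -1, 1, -1, 1, -1, 1, -1],
                         [1, 1, 1, 1, -1, -1, -1, -1]])
    w = train(patterns)
    noisy = patterns[0].copy()
    noisy[0] = -noisy[0]                     # corrupt one bit of the cue
    print(recall(w, noisy))                  # recovers patterns[0]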

21.6 Cell types

Areas are composed of neurons of different types. Cells of a given type are associated with the expression of certain genes. This determines what other cell types can make inputs to the cell and what other cell types it can output to, and also what neurotransmitters it uses and what learning mechanisms it has. Cells of the same type occurring in different animals have the same properties. To give an idea, the following table, which is not exhaustive, taken from [Barth, 2002], lists different cell types in layer IV of the primary sensory cortex:

In addition, some synapse types have LTP using NMDA receptors and others using non-NMDA receptors; for example, the smooth stellate cells are NMDA independent, whereas some cells in superficial layer IV are NMDA dependent. The table should also have included the target and source synapse types for each cell type. The first cell types to be studied were the afferents from the thalamus.

morphology | firing pattern | gene expression | type | references
spiny stellate | RS | CAMKII | excitatory | [Benson et al., 1992]
star pyramid | RS | CAMKII, emx1 | excitatory | [Lubke et al., 2000] [Benson et al., 1992] [Chan et al., 2001]
large basket (smooth stellate) | RS, BS | cholecystokinin | inhibitory | [Kawaguchi and Kubota, 1998]
nest basket (smooth stellate) | RS, LTS | somatostatin | inhibitory | [Kawaguchi and Kubota, 1998] [Wang et al., 2002] [Kubota and Yamaguchi, 2000]
large basket or chandelier (smooth stellate) | FS | parvalbumin | inhibitory | [Kawaguchi and Kubota, 1998] [Defelipe et al., 1999]
small basket (smooth stellate) | RS | vasoactive intestinal peptide (VIP) | inhibitory | [Wang et al., 2002]
double bouquet (smooth stellate) | RS | calbindin | inhibitory | [Defelipe et al., 1999]
double bouquet (smooth stellate) | RS | calretinin | inhibitory | [Defelipe et al., 1999]

Figure 21.4: Cell types for layer IV cortical neurons. Firing patterns: RS regular spiking, BS burst spiking, FS fast spiking, LTS low-threshold spiking

21.7 Synaptic plasticity

Drawing on a review by Malenka and Siegelbaum [Malenka and Siegelbaum, 2001], there are diverse targets and mechanisms for regulating synaptic efficiency. Any change in synaptic efficiency can be used to implement some kind of memory. Figure 21.5 summarizes the diversity. Mechanisms can be categorized by:
(1) source of induction - from the same synapse or a different one,
(2) site of expression - presynaptic increase or decrease in neurotransmitter release, or postsynaptic increase or decrease in the response to neurotransmitter,
(3) molecular basis -
(a) very short term - direct action of a residual elevation of calcium, Ca2+, in the presynaptic terminal,
(b) medium term, seconds to many minutes - activation of G-protein-coupled receptors or protein kinases that may target pre- or postsynaptic proteins,
(c) long term, hours, days, weeks, a lifetime - recruitment of gene transcription and new protein synthesis.

Many neuroscientists subscribe to the synaptic plasticity and memory (SPM) hypothesis [Kelsey C. Martin and Mark Barad and Eric R. Kandel, 2000]: activity-dependent synaptic plasticity is induced at appropriate synapses during memory formation, and is both necessary and sufficient for the information storage underlying the type of memory mediated by the brain area in which that plasticity is observed. This gives us a basis for thinking; however, other mechanisms are certainly conceivable. For example, a neurotransmitter may trigger a transcription factor which causes the synthesis of a particular protein, and later the presence of this protein could be read out as a certain spike sequence, which seems to me different from synaptic efficiency. Speculating even further, memory structures could be coded as protein structures resident in cells, giving the cell a complex state which could be used to store data; components of these structures could then be read out to generate spikes.

Figure 21.5: Mechanisms and sites for synaptic plasticity, from [Malenka and Siegelbaum, 2001]

21.8 Genetic involvement in memory

Figure 21.6, taken from [Alberini, 1999], shows memory mechanisms within a typical neuron, some of which involve external stimuli, such as neural spikes, creating and activating genetic transcription factors, and thereby readout from the nuclear DNA causing the synthesis of specific proteins; see also [Kaczmarek, 2000]. The presence of these proteins would then modify, or even initiate, future neural spiking. As indicated above, I don’t see why this should necessarily be equivalent to a change of synaptic plasticity.

Figure 21.6: Molecular events within the neuron leading to short and long term memory, based on Figure 1 of [Alberini, 1999]. The cascade runs from neurotransmitter reception (short term memory) through adenylyl cyclase activation, increased Ca2+ and cAMP levels, the kinases CaMKII, PKA, MAPK and CaMKIV, and CREB activation, to the transcription of immediate early genes and target genes (long term memory).

Part IV

Memory

Chapter 22

Episodic memory

Abstract. In this chapter, I develop an approach to computational modeling of memory and learning in the primate brain, based on my previously described model of the processing architecture.

I first briefly summarize current thinking in cognitive psychology concerning memory, including working memory, episodic memory, semantic memory, and procedural memory. I then review the neuroanatomy of memory. This mainly concerns the hippocampus, its structure, connectivity and functions. I then review episodic memory in more detail.

I then outline my own overall approach to memory mechanisms, and then develop a detailed model for episodic memory. I extend my neocortical model by adding two new modules, one providing an episodic memory, corresponding to the hippocampal complex, and the other providing the long-term memory for plans, corresponding to ventral prefrontal areas.

In the following three chapters, I will develop the ideas further: in the next chapter, long term memory for plans; in the chapter after that, how plans are learned; and in the third, how to model procedural memory.


22.1 The cognitive psychology of memory

Different kinds of memory. Psychological thinking is in the main independent of considerations of how memory is achieved or implemented in the brain. The standard picture is that there are four types of memory:
(i) sensory buffers - for different sensory modalities - and short term memory - mainly phonetic,
(ii) working memory,
(iii) long term memory, of two forms, (a) episodic and (b) semantic; these function by the activity of control processes which cause rehearsal and transfer of data between stores, and
(iv) procedural memory, which is separate.

The idea of multiple memory systems was developed by Lawrence Weiskrantz and by Lawrence Squire [Weiskrantz, 1987]. The dissociation of procedural and declarative memory has been demonstrated [Knowlton et al., 1996]. Working memory in prefrontal areas has been demonstrated in imaging experiments [Smith and Jonides, 1999].

I will break memory into:
(i) short term memory, of limited capacity,
(ii) working memory, which is limited in capacity and time,
(iii) short-term episodic memory, of large capacity although limited indexability, for events in the last 15 minutes or so,
(iv) long-term episodic memory,
(v) semantic memory, and
(vi) procedural memory.

A mainstream view, and my idea of how this maps onto the brain, is as follows:
(i) sensory buffers - areas in unimodal hierarchies,
(ii) working memory - areas in prefrontal cortex, mainly in area 46 or the principal sulcus,
(iii) short-term episodic memory - in the hippocampal complex,
(iv) long-term memory - originates in the hippocampal complex and is generated by consolidation in cortex; long-term episodic memory is mainly in ventral prefrontal areas, and semantic memory mainly in temporal areas,
(v) procedural memory - in the basal ganglia and in various cortical areas.

Memory experiments. Experimental data is obtained from experimental paradigms in which lists of items are learned, and also pairings between items (paired associates). There are three main kinds of test, namely free recall - recall the list; cued recall - given a cue, recall the corresponding element of the list; and recognition - given an item, recall whether it was ever presented. In addition, there is implicit memory, i.e., learning outside of awareness, which includes priming, in which memory elements become more likely to be retrieved.

These various paradigms possibly correspond to the following, where control of retrieval and storage is mainly in prefrontal areas:
(i) sensory buffer tests - data in sensory areas, control in prefrontal areas.
(ii) working memory tests - a short term small amount of data in prefrontal areas, control in prefrontal areas.
(iii) learning lists - constructed in hippocampal areas, processing in prefrontal areas for learning and retrieval; uses short-term episodic memory and short-term memory.
(iv) long-term episodic memory - (a) constructed in hippocampal areas, partially involuntary and partially controlled or focused by prefrontal areas, (b) in the short term, retrieval from hippocampal areas by queries from prefrontal areas, (c) consolidated by hippocampal areas and cortical areas, and (d) in the longer term, located in ventral prefrontal cortical areas, with retrieval by queries from prefrontal areas, with the involvement of hippocampal areas.
(v) semantic memory - similar, but over a longer time and more general; in the longer term, located in cortical areas and retrieved by the action of prefrontal areas.

22.2 The hippocampus

O’Keefe and Nadel’s pace-setting 1978 book [O’Keefe and Nadel, 1978] included a major review of the literature, as well as a treatment of their findings of place cells and spatial, or more generally cognitive, maps in rats. Cohen and Eichenbaum’s 1993 book [Cohen and Eichenbaum, 1993] gives a very clear and comprehensive treatment of the hippocampus and its function. Detailed neuroanatomy and connectivity for rhesus monkeys can be found in [Kobayashi and Amaral, 1999]. Treatments of the human neuroanatomy can be found in [Paxinos, 1990] and [Parent, 1996].

Following McClelland’s review [McClelland et al., 1995], I assume that learning in neocortical areas is limited to priming, and possibly to learning of structure and categories within the datatypes of that area. The learning of associations between data from different areas, and the learning of events, episodes and semantic facts, all need the specialized learning system of the hippocampal formation.

Figure 22.1 indicates my view that the main learning in the brain occurs in specialized learning modules, notably the hippocampus and the basal ganglia. Learning occurs in the cortex, but mainly for storage and priming.

The hippocampus in rats is relatively large, see Figure 22.2. Figure 22.3 shows the geometry of how the hippocampus evolved, Figure 22.4 shows what the hippocampus looks like, and Figure 22.5 shows the connectivity of the components of the hippocampal complex as a block diagram. I will often refer to the hippocampal complex, or hippocampal formation, simply as the hippocampus.

The rat explores its environment, and its hippocampus makes associations of distal stimuli which uniquely characterize small spatial areas (distal means distant, corresponding to stimuli originating from distant points, as opposed to proximal). These associations can then be read out from place cells, which are individual neurons, in its hippocampus.

Figure 22.1: Separate learning modules

Figure 22.2: The rat brain, shown flattened, adapted from [Swanson, 1992]

Figure 22.3: Hippocampal evolution

Figure 22.4: Hippocampal neuroanatomy

Figure 22.5: The hippocampal formation, block diagram (neocortex, EC, PERI, PARA, SUB, CA1, CA3 and DG)


The hippocampus has been investigated in nonhuman primates, i.e., the rhesus monkey. The evolution of the hippocampus is complicated by (i) a change in morphology, and (ii) a change in functionality, from spatial maps first to verbal associations, then semantics and general episodic memories. It has also been investigated in humans by direct single electrode recording in neurosurgery patients, and also in imaging studies.

The standard idea of the hippocampal complex is that two things occur. First, it creates associations between a stimulus and a context. This is conceived as concerning a particular moment in time rather than an extended interval of time. A context is the larger percept, including perceived distal objects, features in peripheral vision, etc. I assume this also includes all other modalities: the gustatory, olfactory, visceral, somatosensory and auditory backgrounds. Second, subsequently over a period of time, this episodic memory is copied, or re-represented, to reside in the neocortex. This is consolidation of long-term memory. After consolidation, the memory no longer resides in the hippocampus, but before consolidation is complete it resides in the hippocampus and can be retrieved from there as required. The experimental basis for this idea is from amnesics with hippocampal damage, and from temporary retrograde amnesia caused by trauma (retrograde means concerning memory for events that occurred before the trauma, as opposed to anterograde). The long term memory for general episodes which extend over longer intervals of time is referred to as autobiographical memory, and is also created in the hippocampus. Semantic memory for facts is treated the same as episodic memory. Procedural memory, the memory of how to perform action sequences and skills, tends to be independent of hippocampal involvement.

Nadel, Samsonovich, Ryan and Moscovitch [Nadel et al., 2000] have challenged this standard idea. In their multiple trace theory (MTT), episodic and semantic memory are treated somewhat differently. The hippocampus is always involved in the storage and retrieval of episodic memories, independent of their age, even if they are consolidated. They reached this conclusion from a detailed study of autobiographical memory in amnesics. Even with amnesia that started later in life, recollection of memories for events earlier in life is impaired. Semantic memory, however, through consolidation becomes more independent of the hippocampus with age.

There is some preliminary evidence, but it remains unclear what roles the different components of the hippocampal formation play. There may be different roles in encoding, storage and retrieval. Lavenex and Amaral [Lavenex and Amaral, 2000] have argued for a hierarchy of associativity.

22.3 Episodic memory

22.3.1 The definition of episodic memory

To quote the originator of the concept, Endel Tulving [Tulving, 1983], pp. 134-6: “Episodic memory is that aspect of mind, or the brain, that makes the successful completion of individual acts of remembering possible”.

“For our present purposes we accept the dictionary definition of an event as something that occurs at a certain place at a particular time. Thus, one characteristic of an event is that it has a beginning and an end in time, although sometimes the beginning and end are so close to one another that we think of the event as instantaneous. Second, an event always occurs in a particular location, or setting. The relation between the setting and the event is of some importance to the analysis of episodic memory; we will take it up shortly. Events can vary greatly in complexity: there is a huge difference between perceiving an event consisting of a small inhomogeneity in an otherwise completely homogeneous field and an event such as visiting Rio during the carnival. Events are temporally related to one another; one event precedes or follows another, is simultaneous with it, or overlaps it partially. Events are also embedded with other events, in an extensive arrangement in which an individual’s life represents the highest-order event. Events can always be described in terms of some action, frequently, but not always, exhibited by one or more actors. The rememberer may be a witness to an event, or a participant in it, or both.

The term ’episode’ as used in this volume may be regarded as a close synonym of ’event’, although ’episode’ usually carries with it the connotation of an event that occurs in an ongoing series of events. But since we hardly ever deal with events that are not part of some ongoing series, almost all events in which we are interested are also episodes.”

The representation of events in autobiographical memory has been discussed by Lawrence Barsalou [Barsalou, 1988].

22.3.2 The event structure of episodic memory

I am interested first of all in short-term episodic memory, which I define to be the memory of events that occurred in the last 15 minutes or so. After, or concurrently with, this there will also be long-term episodic memory, which is autobiographical memory and which is mediated by a different neural mechanism than short-term episodic memory. The characteristic time for LTP and SCP mechanisms is about 20 minutes, and modification due to RNA transcription using induced transcription factors also takes a similar amount of time []. The process of copying or re-remembering from short term to long term corresponds to consolidation.

The short term memory mechanism is very fast and is “elastic” in the sense that the energy involved is quickly reusable. The short-term episodic-memory traces decay rapidly over a period of a few minutes, and the energy is recovered and used to create new episodic memory traces.

Friedman [Friedman, 1993] has reviewed episodic memory, including long term and short term. I will assume that his analysis applies to both short and long term, and that long-term episodic memory has a similar form to short-term episodic memory.

The main finding is that episodic memory does not consist of a continuous trace, like a video tape recorder, but rather consists of a sequence of discrete records representing discrete events. Further, the relations among events are not necessarily those of temporal adjacency but of other general semantic relations. It seems that there are some temporal ordering relations, however. To quote Friedman:
“Memory for time is not built on special temporal codes or a chronologically organized memory store. Instead, our chronological sense of the past is the product of an ongoing constructive process in which we draw on, interpret, and integrate information from: 1. our stored knowledge of time patterns, 2. and general knowledge about time, 3. the contextual associations of particular memories, 4. order codes linking related events, 5. occasional direct associations between event and time names, and 6. rudimentary clues to the ages of memories.” [Friedman, 1993] (my numbering).

Friedman divides the nine classes of theory about episodic memory into three groups:
1. Distance-based theories use processes that are correlated with the passage of time.
(a) In strength theories [Hinrichs, 1970] [Annisfeld and Knapp, 1968] [Guyau, 1890] [Michon et al., 1988] [Morton, 1968], the strength of the memory trace decays with the passage of time.
(b) In chronological organization theories [Koffka, 1936] [B. B. Murdock, 1974], memory traces are organized in the memory store by their order of occurrence.
(c) In contextual overlap theory [etal, 1980] [etal, 1983], memory traces are associated with stimuli when they occur, and this context of stimuli changes with the passage of time.
2. Location-based theories use information that is laid down at the time of encoding.
(a) In time-tagging theories [Flexser and Bower, 1974] [Glenberg, 1987] [Glenberg and Swanson, 1986] [Hasher and Zacks, 1979] [Yntema and Trask, 1963], an explicit time code is added to the memory trace. This could be derived from biological clocks in the brain, see [Treisman, 1963].
(b) In encoding-perturbation theories [Estes, 1972] [Estes, 1985] [Lee and Estes, 1977] [Lee and Estes, 1981], an event is associated with control elements at different levels in a hierarchy.
(c) In reconstructive theories [Anderson and Bower, 1972] [Guenther and Linton, 1975] [Hintzman et al., 1973], retrieval draws on a rich knowledge of social, natural, and personal time assumptions, and on a small minority of salient events for which exact dates have been learned.
3. In theories based on relative times of occurrence, temporal information is stored in memory in the connections that exist between events.
(a) In association chaining theories [Lewandowsky and Jr, 1989], events are associated with the events directly succeeding them in time.
(b) In order code theories [Hintzman et al., 1975] [Tzeng and Cotton, 1980] [Tzeng et al., 1979], temporal information is added to stored items even after they occur, as a result of the subsequent retrieval and use of the two events.

22.3.3 The concept of event in philosophy and linguistics

In his Metaphysics, Aristotle described a typology of events based on their internal temporal structure. Later philosophers include Kenny [Kenny, 1963] and Ryle [Ryle, 1949]. The philosopher Donald Davidson pointed out how events are used as logical individuals, in a landmark paper in 1970 [Davidson, 1970] [Davidson, 1980].

Linguists also have a notion of event which is the natural semantics of a simple sentence, but this is somewhat different. Typically they want an event to comprise an action as a verb and then different case relations connecting the verb to various objects and modifiers [Parsons, 1990] [Tenny and Pustejovsky, 2000] [Pustejovsky, 1995].

Most schemes (for a review see [Tenny and Pustejovsky, 2000], chapter 1) build upon the classic work of Vendler [Vendler, 1967] (the chapter entitled “Verbs and times”), who introduced a four-part typology of aspectual verb classes based on temporal properties such as duration, termination and internal temporal structure. In his scheme, verbs denote states, activities, achievements or accomplishments. States do not change over the time interval under consideration, e.g., Jack loves Jill. Activities are ongoing events with no necessary temporal endpoint, e.g., Jack walked along the road. Accomplishments have duration and an obligatory temporal endpoint, e.g., Jack ate breakfast. Achievements have no duration and have an instantaneous endpoint, e.g., the stone hit the bridge. Thus, according to Vendler, these are the four kinds of event that we tend to describe in natural languages.

22.3.4 Rhythms and clocks

There could well be variables with diurnal and other rhythms [Treisman, 1963] which would be input to the hippocampal formation. This can be thought of as a set of values of clock phases for a set of clocks. These would be folded into the stored episodes. The current phases could then be used in reminding, and would be available on recall of event components and event reinstatement.

There are several rhythms that seem to be inherent in the nervous systems of animals, and some in human nervous systems, namely:
(i) heart rate - once per second,
(ii) breathing rate - once per ten seconds,
(iii) ninety-minute rhythms of sleep states, which may also occur during waking states,
(iv) diurnal - daily rhythms of sleep states, arousal, hormone levels, blood coagulants, etc.,
(v) lunar rhythms - menstrual periods,
(vi) tidal rhythms - slightly different from lunar, and
(vii) annual rhythms - the best known are the mating seasons of animals, triggered by the perceived lengthening of daylight hours.

In most of these cases, the body has an internal clock independent of external stimulation, but this clock gets synchronized, or entrained, to the rhythm of the external stimulation. For example, the body has a diurnal rhythm of about 25 hours, but this gets synchronized to the 24 hour rhythm of daylight. In the absence of daylight, it will free run with a period of about 25 hours.

It would be possible to make quite a detailed internal clock by using the current levels of a set of variables which are linked to some or all of the body’s rhythms. I think of this as a set of dials, one for each rhythm, where the needle in each dial is the level of a variable associated with that rhythm. This is like a clock which uses separate dials for hours, minutes and seconds.
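
Such a multi-dial clock can be sketched as follows; this is illustrative Python, with rounded, invented periods.

    # Sketch of an internal clock read out as a set of dials, one per bodily
    # rhythm; each dial's needle is the current phase of that rhythm.

    PERIODS = {"heartbeat": 1.0,                # seconds
               "breath": 10.0,
               "ultradian": 90 * 60.0,          # the ninety-minute rhythm
               "diurnal": 25 * 3600.0}          # free-running at about 25 hours

    def clock_phases(t_seconds):
        """Phase of each rhythm in [0, 1); the set of dials dates an event."""
        return {name: (t_seconds % p) / p for name, p in PERIODS.items()}

    def entrain(period, external_period, gain=0.1):
        """Nudge a free-running period toward an external zeitgeber."""
        return period + gain * (external_period - period)

    print(clock_phases(5000.0))                 # dial readings folded into an episode
    print(entrain(25 * 3600.0, 24 * 3600.0))    # diurnal period pulled toward 24 h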

A study of conditioning using internal clocks has been reported by King and Gallistel [King and Gallistel, 1996].

22.4 My overall approach to memory

I plan to eventually model several different mechanisms of memory and learning which serve complementary roles. Let me briefly indicate a possible overall scheme for this:
(i) memory for objects and persons is stored in cortex and is subject to facilitation/priming from successful use; objects and persons are represented by distributed representations made up of components in different cortical areas, and each cortical area learns component data of the data types processed and stored by that area,
(ii) memory for associations among different data types from different areas is formed in parahippocampal and perirhinal areas, where it continues to function for some time before being consolidated into cortical areas as records, which contain associations to the other areas involved,
(iii) memory for objects in context is formed in the hippocampus, and is also consolidated to cortical areas,
(iv) memory for temporal sequences of events is also formed in the hippocampus,
(v) these association and event memories are available to the rest of the brain as working memory, in the short run,
(vi) the influence of reward on memories is received by the hippocampus from the amygdala, and associations between events and reward are sent to the amygdala from the hippocampus,
(vii) consolidation results in long-term memories which are distributed over the cortex, different components being stored as data of different types in the corresponding cortical areas; these components can cross-evoke each other, and there is an index or map, storing inter-item relations, in the hippocampus,
(viii) access to long-term memories is via prefrontal retrieval actions,
(ix) episodes will be stored autobiographically, but in addition, generalized episodes called contexts will be formed; contexts contain sets of generalized events which are plans,
(x) particular contexts are evoked from the context store and are used by planning modules to create action, and
(xi) procedural memory is laid down separately in the basal ganglia, but its long-term use is controlled by cortical areas, notably prefrontal.

22.5 My approach to episodic memory

I will start from my existing methodology and existing brain model. I seek to extend my brain model to have episodic memory. This will involve adding two new modules, one for the hippocampal system and one for the context system. The context store also includes long-term episodes. The extended model should form episodic memories, use them in short-term problem-solving, and then consolidate them to the context module, where they will influence and provide episodic and semantic knowledge for the future perception and action activity of the brain model. I have previously argued, in subsection 13.1.3, for episodic/context memories being stored in ventral prefrontal areas.

22.5.1 Main principles of my theory

1. There are events: (i) they correspond to neuroanatomy, i.e., connections from cortex and amygdala to the hippocampal formation (ii) the components of events represent changes as well as the current state (iii) the components of events are chunked within modules before being sent to the hippocampus.

2. The stream of events forms episodes: (i) an episode is a set of events plus structuring information, such as causal relations, temporal ordering, and perception-action structure (ii) episode beginnings and endings are created by various situations as well as changes in context. (iii) episodes form sequences and hierarchies (iv) the number of events or episodes in one episode is limited, to about 4 or 5, so episodes form hierarchies with a maximal branching factor of 5.

3. The short-term episode store plays various roles in brain functioning: (i) answering questions about recent episodes, (ii) reinstating events/parts of events by merging with the current state, (iii) the form of access to this short-term episodic memory is the same as for long-term episodic memory.

(iv) it checks for novelty, familiarity and repeated events. This is involuntary and may reinstate previous events.

4. Episodes consolidate into long term memory: (i) long-term autobiographical memory is distributed over cortical modules, with a cognitive map in the hippocampal complex. (ii) contexts are formed, generalized, updated and stored in the context module. (iii) semantic memory also emerges, and is less distributed, being mainly stored in temporal areas.

22.5.2 Instantaneous events

The first question I ask is what exactly is an instantaneous event, i.e., what is input to the hippocampus at each instant? From the connectivity of the hippocampus [Swanson et al., 1987] [Insausti et al., 1987] [Kobayashi and Amaral, 1999], most of the cortex has inputs to the hippocampus, including from the planning and action hierarchy; however, the data from visual areas is the largest. This is not surprising, since images carry a lot of data. The anatomical connections to and from the hippocampal complex, for the rhesus monkey, have been described by Kobayashi and Amaral, and are summarized in Figure 22.6, taken from their paper.

Thus what is input is basically the perception by the person of their own plans and actions, as well as input from the anterior cingulate. There is also subcortical input, mainly from the amygdala.

Thus what is remembered in an event, input to the learning module, is that part of the behavioral, or mental, state which includes: (i) percepts of external events, objects and people (ii) percepts of the self as an external person

Figure 22.6: Cortical connections to hippocampal complex, for rhesus monkey, from Kobayashi and Amaral

(iii) percepts of the internal activities of the self (iv) motivational states and goals of the self, and (v) percepts of others’ intentions.

My idea of the episodic module is diagrammed in Figure 22.7.

In terms of my model, the information from each module could include rule outputs but also activated rule left hand sides. There could conceivably be special outputs from cortical modules that are sent only to the hippocampal region and nowhere else, and are used in event learning and not for the main dynamics of the brain. In addition, there is input to the hippocampus from orbitofrontal areas, hence I will assume that the currently evoked contexts/episodes are also input and form part of the event.

Figure 22.7: The episodic memory module (local module events from the goal, plan, somatosensory, auditory and visual modules are formed into an overall event; the event map holds the current episode, with individual events related by temporal adjacency and context, and answers memory queries)

There will almost certainly be special descriptions sent from prefrontal areas to the hippocampal region for retrieval of event memories, and for focusing and attention during learning of event memories.

22.6 My representation of events and episodes

22.6.1 My overall idea

My concept is to use the form of episodes reported by psychologists, as explained above, together with the information available for events, from the neuroanatomical connections to the hippocampal complex, to develop a representation for an event and an episode, and from this a context. At the same time, there will be constraints on the size of descriptions; they should not grow to arbitrarily large size. All accessing and retrieval has to use an associative-type memory.

This scheme is the simplest I have been able to construct given all the existing evidence and processing constraints, and which is able to support all the information-processing phenomena that we know occur.

My idea is that a context has a representation similar to an episode, only generalized, e.g., constants replaced by variables. The set of contexts will form a dynamic memory, in Schank’s sense [Schank, 1982a].

The basic construct is to allow sequences of events to form episodes, and also episodes to be embedded in other episodes. This sequential and hierarchical structure extends also to contexts. Then contexts can be executed using their sequential and hierarchical structure. The components of these structures are, however, each of limited size, corresponding to nodes in the structure.

22.6.2 Segmentation and chunking within each module

It is reasonable to assume that each module has its own temporal scale and performs segmentation of its stream of data to produce chunks, and it is these chunks that are sent to the hippocampus. One example is the chunking of phonemes into words by a phonological buffer.

22.6.3 Events

One event I find useful to contemplate is knocking a glass vase of flowers off a bedside table onto the floor so that it smashes. In this case, it seems that one experiences this as a single event even though it has some temporal structure. The mental representation of this event probably would include the starting state - the vase on the bedside table, an intermediate state - the vase being knocked, or falling, and the final state - the vase broken on the floor with water all around. Also there would be a sound of the vase breaking and an emotional reaction of surprise and dismay. Thus one event, according to me, can have all of these components. Interestingly it seems that this can include a visual image in a relatively direct raw form, a “snapshot”, or better “videoclip”, in addition to more processed and abstracted visual perceptions of the spatial situation.

I concluded that the information represented in a primitive event has to be a change of state or perception. This is rather like one rule firing; however, it corresponds to the action of many rules firing, possibly in many different modules. It can contain something about the state prior to the change, the state after the change, and something about how the change occurred. We will indicate this by the notation:

event = initial state -> final state

for example, e1 = s1 -> s2.

A state will consist of the information reaching the hippocampal complex, which will include:
1. percepts, such as visual, auditory, etc., percepts from areas which contain processed information; for example, the largest connections from the visual hierarchy to the hippocampal complex are from V4, which contains the 2.5D sketch and other 3D information,
2. information from frontal areas such as 9 and 46, giving the state of execution of the current plan and the working memories in the planning modules,
3. information from the anterior cingulate, giving current goals,
4. information from the posterior cingulate and parahippocampal areas, giving the external spatial framework of the event, and
5. information from the amygdala, giving the state of subcortical motivation, including subcortical goals, actions and evaluations.

Since my concept is that the current plan is determined by the activation of a context, item 2 will include the currently-evoked context, which I will call the cec. So the event will include the currently-evoked context as well as the current goal and working goal.

A context will be part of a hierarchy of contexts with a parent context, being part of a sequence of contexts, and having a child sequence of contexts.

Further, as we will see, if there is a parent context of the currently active context, then it too will be available for inclusion in representations of events, but not the grandparent. This is the minimal way of allowing an event or plan to return to its context.

The representation of an event is determined by the different types of information in- volved, but it is rather like a sentence with a spatial scene, goals, actions and context.

According to my methodology outlined previously, I need to be able to take perceptions of events, to form episodes, and then to generalize these to produce contexts. The observed changes or states in events need to become transformed into actions that are components of plans. Thus, if we use Vendler’s typology:
(i) states transform into ongoing actions without state change and without termination,
(ii) activities transform into ongoing actions with state change but without termination,
(iii) accomplishments transform into actions with an endpoint and with a temporal duration, and
(iv) achievements transform into an instantaneous action.
Thus, to execute these plan steps: for (i) and (ii) we execute the action continuously until something causes its termination; for (iii) we execute the action until the endpoint occurs or until an estimated temporal interval has passed; and for (iv) we execute the action once only. A sketch of this execution policy is given below.
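
The following Python sketch renders this execution policy; the action, termination test and duration estimate are placeholder hooks, invented for the illustration.

    # Sketch of executing plan steps derived from Vendler's four event types.

    def execute_step(kind, act, terminated=lambda: False,
                     endpoint=lambda: False, est_duration=0):
        t = 0
        if kind in ("state", "activity"):       # (i), (ii): run until terminated
            while not terminated():
                act(); t += 1
        elif kind == "accomplishment":          # (iii): until the endpoint, or until
            while not endpoint() and t < est_duration:   # the estimated time passes
                act(); t += 1
        elif kind == "achievement":             # (iv): execute once only
            act(); t = 1
        return t

    # An accomplishment whose endpoint never occurs stops at the estimate:
    print(execute_step("accomplishment", act=lambda: None, est_duration=20))  # 20
    print(execute_step("achievement", act=lambda: None))                      # 1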

22.6.4 Event descriptions

Although Figure 22.8 seems to have pointers, this is not the case; arrows represent membership in an expression which is a description. An event is a description like event(evdes,[sp1,obj1,obj2,obj3,goal1,wgoal1,cec1,pcec1]), which has a more or less fixed structure and which can be stored in the hippocampal store. The square brackets indicate a fixed number of slots, rather than a list of variable length. There is a relatively unique descriptor of the event, which I denote by evdes, which is a summary expression that characterizes the event, and which will serve to allow it to be accessed in associative memory. evdes is typically an expression, but it is of limited size and certainly much smaller than the full description of the event. cec and pcec are also summary descriptors of these evoked contexts. I will call this summary expression the characterization of the event.

Figure 22.8: An event as a set of hippocampal inputs (spatial frame, objects, goal, working goal, plan step, cec and parent context)

I have found it basically impossible to get any design to satisfy all the requirements unless there is a smaller key structure like evdes, given the limited size of any event description. The only alternative approach I can think of would be to use a memory access path which is complete and which depends on the current state of memory. Then, if memory is sparse, this access path will identify the item uniquely but will not require the same bandwidth as the representation of the item itself.
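
The role of the characterization can be sketched as follows; this is illustrative Python, and the slot values and the feature-subset matching rule are invented.

    # Sketch of the event store: each event is a fixed-slot description, and
    # only the small characterization evdes is used for associative access.

    def make_event(evdes, sp, obj1, obj2, obj3, goal, wgoal, cec, pcec):
        return ("event", evdes, (sp, obj1, obj2, obj3, goal, wgoal, cec, pcec))

    store = {}                                  # associative store: evdes -> event

    def remember(event):
        store[event[1]] = event                 # keyed by the characterization only

    def retrieve(cue):
        """Associative access: any event whose evdes contains the cue's features."""
        return [e for evdes, e in store.items() if set(cue) <= set(evdes)]

    e1 = make_event(evdes=("t17", "vase", "falls"),   # includes a time code
                    sp="bedroom", obj1="vase", obj2="table", obj3="floor",
                    goal="tidy-room", wgoal="move-book",
                    cec="cleaning", pcec="evening-routine")
    remember(e1)
    print(retrieve(("vase",)))                  # found from a fragment of its evdes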

22.6.5 Uniqueness of reference to events

(i) With a simple model, it is too difficult to obtain a unique episode characterization, whereas with a complex model, and therefore a larger coding space, it is much easier.

(ii) According to Friedman, time coding definitely seems to be present in human episodic memory; it is just not totally dominant. According to Barsalou and coworkers [Barsalou, 1988] [Barsalou, 1992] [Lancaster and Barsalou, 1997], indexing by time is the top-level organizing feature of autobiographical memory.

(iii) Hence I will use a time code as part of the episode characterization, and this will provide uniqueness of reference and access.

If we were then to scale the model to larger and richer descriptions, time coding would become less important, but the characterization would continue to be unique.

To give an abstract example, an event is a description which is a fixed structure of components corresponding to the different chunks received from modules. The characterization would then include the important features that change, maybe one or two very important features, and maybe any novel features or changes.

The use of characterizations is a key issue since if we can use them, then we do not need to use any symbols for reference purposes. We do not have to subscribe to the physical symbol hypothesis [Newell, 1980]. On the other hand, we can if we wish use symbols for reference purposes in some modules. This is what I have tried to achieve since I think it correctly captures the representations used in the brain.

According to the physical symbol hypothesis, symbols are used to reference symbol structures, and symbols can have associated values. Instead of this approach, I am using descriptions, and the only process is the matching of descriptions. You may say surely descriptions contain symbols, and I would agree with you. However, in this case the symbols are only used to represent structuring of data. Then a lower-level neural model can be developed which represents the same structuring.

22.6.6 Episodes

For a sequence of events within one governing context, the system forms an episode representation as a sequence of events.

My idea is that the size of this sequence is limited to a small number of events, such as 3, 4 or 5.

Figure 22.9: An episode as a sequence of events (event1 to event4, linked by "next" relations).

In this case, it doesn't matter much whether we think of events as having "next" or "prior" relations, or whether we think of the episode as having a small number of slots containing event representations, as in Figure 22.9.

This again is an expression, of the form episode(epdes1,[evdes1,evdes2,evdes3,evdes4]), where epdes is a characterization. When a further event occurs which would overload this representation, or if a change occurs that is not understood, or if a standard type of event occurs which involves a change of spatial context, a signal from an interacting partner, an emergency state, and so on, then a new episode is started.

If driven by rules from a currently evoked context then the chunking and structure of this context determines the formation of episodes.

Hence, in general, episodes will form into shallow hierarchies, as in Figure 22.10.

In order to recall a given event from a cue, the current top episode will have to be queried to determine the relevant component episode, and so on iteratively down to the event that matches the cue. Thus events are not immediately available, but a higher-level description, in terms of subepisodes, is.
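As a sketch of this iterative recall, suppose episodes store only the descriptors of their children, and a cue is matched top-down until an event matches; the descriptors and the dictionary representation are hypothetical:

# Episodes hold child descriptors only; events hold full descriptions.
episodes = {   # epdes -> list of child descriptors (epdes or evdes)
    "ep_top": ["ep_a", "ep_b"],
    "ep_a":   ["ev1", "ev2"],
    "ep_b":   ["ev3", "ev4"],
}
events = {"ev1": {"obj": "cup"}, "ev2": {"obj": "key"},
          "ev3": {"obj": "cup"}, "ev4": {"obj": "door"}}

def recall_event(desc, cue):
    """Descend the episode hierarchy to the first event matching the cue."""
    if desc in events:                       # leaf: an actual event
        ev = events[desc]
        return ev if all(ev.get(k) == v for k, v in cue.items()) else None
    for child in episodes[desc]:             # interior node: query children
        found = recall_event(child, cue)
        if found is not None:
            return found
    return None

print(recall_event("ep_top", {"obj": "door"}))   # -> {'obj': 'door'}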

Figure 22.10: An episode as a sequence of episodes (episode1 and episode2, each itself a sequence of events linked by "next" relations).

Hence long-term episodic memory has a hierarchical structure, due to the episode nesting required by bandwidth constraints.

Episodes and events will initially be stored in a short-term episodic-memory module, corresponding to the hippocampal complex. It will have connections from the cortex, and also the amygdala if one is included in the model.

22.7 Episode formation

22.7.1 Recording events

As depicted in Figure 22.11, the event stream is recorded as a temporal array. The event stream is segmented into episodes, using information from the stream and from the currently evoked contexts. The figure also shows that there can be more than one current context and therefore more than one segmentation of the event stream. Further, the current episode will have predictions into the future based on the controlling context.

Figure 22.11: Multiple contexts and episodes (the perception-action system with current evoked contexts c1 and c2; the time-indexed event stream with states state(t), state(t-1), state(t-2); current and previous episodes for each context; and predicted future parts of episodes).

22.7.2 The segmentation of event sequences into episodes

The rapid sequence of events received by the hippocampus will no doubt be recorded, but also some organization will be required to start the process of abstraction and learning. As we saw, Schank’s scene instance is determined by spatial and temporal contiguity. We will approach the problem de novo.

Based on the current context. The usual case will be to segment the stream of events based on an existing evoked context. The currently evoked context will influence the segmentation of new event sequences. If this is successful it will in the long term update the evoked context. If there is not an exact match, it may in the long term generalize or alter the context by updating it.

Novel episodes. There could be the formation of new episodes not based on any existing context: (i) obvious changes of context like spatial location, (ii) successful completion of goals, and (iii) failure of goals. However, some context always exists although sometimes Our approach to episodic memory 403

very generalized, and all newly created contexts are always outgrowths of some context.

More than one segmentation. There could be more than one context active with different time scales. For example, an interruption of a conversation to answer the phone could have two episodes being formed concurrently, one for the conversation and one for the phone call.

Different kinds of failure. A completion or failure of a goal at one level may trigger an episode completion involving another level. There is thus local failure to continue applying a context, in which case the system should probably continue within the same episode and start a new subepisode. Then there is more global failure, which will typically terminate the current episode and parent episodes back to some original initiation point.

22.8 Long term memory

Consolidation. I assume that the episodic store can only remember episodes for a certain limited time, such as 15 minutes or so. During this time, another mechanism is able to make another representation which is longer lived. Most of this representation will be in the cortex and not in the episodic memory.

I imagine that in this long-term memory representation each event will be decomposed and redistributed back to the modules from which it originally came. This allows the different data types to be handled by the appropriate modules. In addition, however, each event component will carry an event descriptor and episode descriptors. These descriptors will allow the future reconstruction of the event. By sending a single event descriptor to all modules, all the components of that past event will be evoked simultaneously, recreating and reliving the past event. By attention control, only some part of a past event may be evoked, and it may also be modified or combined with other evoked event memories. In addition, following Nadel and Moscovitch, there is a residual map of episodes and their relations, but without any concrete event information. This episode map is stored in the episodic memory by another, longer-term memory mechanism.
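A minimal sketch of this consolidation scheme, under the assumption that each module stores its own component of an event indexed by the event descriptor, so that broadcasting the descriptor re-evokes all components at once (module names hypothetical):

from collections import defaultdict

# Each cortical module keeps only its own data type, indexed by evdes.
modules = {"visual": defaultdict(dict), "spatial": defaultdict(dict),
           "plan": defaultdict(dict)}

def consolidate(evdes, event_components):
    """Decompose an event and redistribute its components to their modules."""
    for module_name, component in event_components.items():
        modules[module_name][evdes] = component

def relive(evdes):
    """Broadcast one event descriptor to all modules, evoking all components."""
    return {name: store.get(evdes) for name, store in modules.items()}

consolidate("ev7", {"visual": "red disk", "spatial": "peg A", "plan": "move"})
print(relive("ev7"))  # all components of the past event evoked together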

This approach allows the same query to access episode information from the episodic module in the shorter term and from the cortex in the longer term.

In general, we assume that queries are generated from plans in the planning module. In addition, there may be more routine plans not in the planning module, which are evoked as routine steps by the planning module but executed in other areas such as parietal areas. We will discuss this issue later. These routine steps may also generate queries to be sent to, and responded to by, episodic memory.

It is our belief that short-term episodic memory is where the main working memory is to be found. There is other working memory in cortical modules but this is very much limited to a few items only in each module, and of the data types handled by that module. This frontal working memory corresponds to registers on a computer, which are a small number of implicit stores used for immediate working data.

22.9 Using event and episode information

22.9.1 Querying the hippocampal formation

The episode has a sequencing/nesting structure which is usually different from the sequencing/nesting structure of its generating context. The episode contains the history of events in the search for a solution, including the final solution in a preferred status.

What are the different possible queries?

(i) Paired associates: what is the paired associate for a given stimulus?

(ii) List learning: free recall - generate all members of a list, i.e., in this learning context; ordered recall - generate the list in order.

(iii) Episodic: reinstate an event from the past, given cues; find the next event after/before a given event in the current episode. This could be a reinstantiated episode.

Reinstatement presumably corresponds to merging with the current state.
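These query types could be exposed as a small interface over stored episodes; the following sketch assumes an episode is simply an ordered list of (stimulus, response) events, which is a simplification of the representation above:

class EpisodicStore:
    def __init__(self):
        self.episodes = []            # each episode: ordered list of events

    def add_episode(self, events):
        self.episodes.append(list(events))

    # (i) paired associates: the response paired with a stimulus
    def paired_associate(self, stimulus):
        for ep in self.episodes:
            for (s, r) in ep:
                if s == stimulus:
                    return r

    # (ii) list learning: free recall (any order) and ordered recall
    def free_recall(self, episode_index):
        return set(self.episodes[episode_index])

    def ordered_recall(self, episode_index):
        return list(self.episodes[episode_index])

    # (iii) episodic: the event following a given event in an episode
    def next_event(self, episode_index, event):
        ep = self.episodes[episode_index]
        i = ep.index(event)
        return ep[i + 1] if i + 1 < len(ep) else None

em = EpisodicStore()
em.add_episode([("bell", "food"), ("light", "shock")])
print(em.paired_associate("bell"))              # food
print(em.next_event(0, ("bell", "food")))       # ('light', 'shock')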

22.9.2 The role of short-term episodic memory in ongoing behavior

The current episode. Episodic memory provides a current episode which is used in ongoing behavior:

(i) for a record of where I have been, so I can make a decision, or can compute summary information.

(ii) for a record of where to go back to - backtracking, or reinstating limited components such as popping a previous goal.

(iii) reminding - involuntary association with and evocation of previous event.

(iv) specifically willed/controlled recall of events or parts of events: (a) as a memory (when was I there? how long ago was I here?), and (b) computing summary information (e.g., is this the third doorway? there were four sets of traffic lights; we have gone round in a loop).

Returning to a previous event. (i) Focus on part of the current episode. (ii) Reinstate an event - is this possible in the short term for complete contexts? (iii) Popping of a previous goal.

Comparing current episode with evoked context. When returning to the same place, or experiencing a similar sequence of events.

Comparing current episode with a recent episode. The recent episode is still in working memory; this includes comparing one part of the current episode with another.

22.9.3 The role of long-term episodic memory in ongoing behavior

Reminding. A current event reminds us of a previous event, evoking some part of it or creating an image.

Use of plans and other knowledge from analogous contexts. Current episode evokes current context or parts of it.

Explicit voluntary use. Querying and searching the store for episodes satisfying descriptions, cues or queries. This is an interactive process.

Consciously trying to remember how to do something.

22.9.4 Possible functioning of the hippocampus

Figure 22.12 shows a very speculative idea of how the hippocampal formation might be working. The hierarchy is similar to that of Lavenex and Amaral [Lavenex and Amaral, 2000].

Figure 22.12: The possible action of the hippocampal formation in memory (regions PERI, PARA, EC, SUB, CA1, CA3 and DG exchange queries and answers over successively more structured representations: associations among elements of the same event, initial event structuring, segmented episodes, generalized episodes with causal structure, and, in DG, a time-coded and structured event stream).

22.10 The problem of representation

The problem we are addressing is that of designing representations of events, episodes and contexts, such that contexts can be executed by a straightforward mechanism which is neurally plausible. This means that it deals with failure to activate a module, failure to complete that action, and success on completion of the action. It should provide nesting and sequencing of contexts. It should deal with results, how to specify which results are necessary, how to terminate on achieving such results, whether in the calling or called module, and how to communicate results to other modules. It should allow time duration to be used in terminating activity.

The representation should be learnable from an episodic memory, again by straightforward mechanisms. Chunking of contexts should still allow them to be executed correctly, including all the control issues listed above.

I found myself in a "fixed-point" loop, depicted in Figure 22.13. Given a proposed design for a context, I had to design a context application mechanism to ensure it would be executed and lead to all the right phenomena; then I had to design a representation and mechanism for episodic memory, and then a mechanism for learning contexts from this memory. This would have been much easier if the mechanisms were fixed, but since they had to be redesigned on each pass round the loop it was quite challenging, and it was difficult to get the design process to terminate.

Figure 22.13: The context representation problem (plan, episodic memory and context in a design loop).

22.11 Episodes in thinking

Episodes are to some extent constructed from observations, and this is how they are discussed in the psychological memory literature; however, they can also be constructed from thinking, that is, from sequences of mental operations, or mental actions.

It was observed by Adriaan De Groot [de Groot, 1946], and developed by Newell and Simon [Newell and Simon, 1965] [Newell and Simon, 1972] (HPS), that, in solving chess problems, people think by generating and examining possible situations in a special pattern which one can call episodic. Possible situations, starting from a given position, form a tree. Instead of searching this tree in, for example, a breadth-first or depth-first manner, people search a series of narrow paths, and after each they return to the beginning to consider what their next path will be. This is shown for the chess position in Figure 22.14, with Figure 22.15 showing the tree, Figure 22.16 the paths explored by the subject, and Figure 22.17 the episodic structure.

Figure 22.14: A chess position used by De Groot and by Newell and Simon

Figure 22.15: A chess search tree, taken from HPS Fig 12.3, p. 714

Figure 22.16: A chess problem behavior graph, first half, taken from HPS Fig 12.4, pp. 715-6

Figure 22.17: Chess episodes, taken from HPS Table 2.1, p. 723

The sequence of moves in a path stops when a clear evaluation occurs: either it is positive, giving a value to the path and end state, or it is negative, indicating a poor position or a problem where no further move can be made. Newell called the latter an impasse, since he characterized an episode as always being driven from a goal. As can be seen, at the end of an episode the subject may also generate a goal to be solved. Note that there can be a slight branching in a path; however, it seems that the subject can make only one move, and not several, before pulling back and pressing on in a different direction. I'll call this a quasilinear path.

There is then a series of episodes which examine possible variations. Newell and Simon (NS) have rules for which episode to look at next: for example, a failed episode would tend to cause examination of a rival episode, whereas a successful one would tend to result in further exploration in the same direction. There is also a trend from more standard variations to more unlikely ones. As originally observed by De Groot, this leads to progressive deepening of the search tree, see Figure 22.18.

In addition, De Groot suggested that the progressive search process was composed of a succession of five types of mental process, corresponding to different levels of abstraction of representation, [de Groot, 1946], pp. 131-2, which were observed as different types of statement in a problem-solving protocol, and which formed a cycle which was continually repeated, namely: (i) establishing the detail problem in question; (ii) setting, in terms of mental operations, the goal to solve this particular problem (operations-goal); (iii) carrying out the operations of investigation (calculations); (iv) determining the result and evaluating the outcome;[1]

[1] De Groot used the word "outcome" to mean a numerical value, and "result" to mean something more general that could be qualitative or quantitative.

(v) restructuring and integrating results and outcomes into the formulation of a new problem. Type v tends to merge into type i in verbalizations.

NS seem to stop short of De Groot's conclusion that the subject is generating a proof tree. This is a set of trees, with one tree showing that a particular variation is very good and has no mistakes, and one tree for every other variation showing that it is not as good, or fails in some way; see Figure 22.19.

Figure 22.18: Progressive deepening in chess proof, three phases, taken from Chapter 7, Figure 7, pp. 268-9

Figure 22.19: Chess proof, taken from De Groot Chapter 1, Figure 3, pp. 28-30

HPS examined human behavior on three types of problem in some detail. (i) Cryptarithmetic: following the original work on protocols in Bartlett's "Thinking" book [Bartlett, 1958], NS started in 1960 using a problem from Bartlett, namely DONALD + GERALD = ROBERT. They remark that there seemed to be no psychological studies of cryptarithmetic at that time other than Bartlett's. (ii) Problems in propositional logic, using the experimental findings of Moore and Anderson [Moore and Anderson, 1954]. (iii) Chess problem solving, using the experimental findings of De Groot. For each of these, NS ran many more experiments of their own with many subjects. In each case they found that subjects' thinking had an episodic structure of narrow paths in a search space. They also described computer programs having similar behaviors: GPS for logic problems, rule systems for cryptarithmetic, and chess programs for chess. A GPS search is depth-first and can be seen as episodic in many cases, and the cryptarithmetic rule systems gave a close model of human thinking. Later, a SOAR model of cryptarithmetic problem solving was developed [Newell, 1990] [Newell, 1992].

NS use a construct called a problem behavior graph (PBG), see HPS, p. 173. They define it thus: each node represents a state of the subject's knowledge of the problem; a move or operation is a right arrow; a return to the same state of knowledge is a downwards arrow; a repeated application of the same move gives a double arrow. Time runs right and down, and states form a linear temporal sequence. Figure 22.16 above shows a PBG for chess. Actually, for NS, one episode can correspond not to one but to a sequence of PBG explorations; however, in most cases the correspondence is 1:1, see Figures 22.16 and 22.17.
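The PBG lends itself to a small data structure; here is a sketch in which each node records a knowledge state, right arrows are moves and a downward arrow is a return to an earlier state (the chess moves are placeholders):

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class PBGNode:
    state: str                      # the subject's knowledge of the problem
    moves: List["PBGNode"] = field(default_factory=list)   # right arrows
    returned_to: Optional["PBGNode"] = None                # downward arrow

root = PBGNode("initial position")
a = PBGNode("after Bxh7"); root.moves.append(a)
b = PBGNode("after ...Kxh7"); a.moves.append(b)
# The subject abandons the line and returns to the initial position:
revisit = PBGNode("initial position", returned_to=root)
# Time runs right (moves) and down (returns), so the node sequence
# [root, a, b, revisit] is the linear temporal order of knowledge states.
print([n.state for n in (root, a, b, revisit)])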

In propositional logic problem solving, comparing a PBG, HPS Fig 9.13, pp. 482-3, with the corresponding episodes, HPS Fig 9.12, p. 481, we see that an episode corresponds to a small search tree which is not quasilinear. In the corresponding GPS model, every path is driven by a goal, and every episode by a transform goal.

In cryptarithmetic, there are episodes, compare the PBG in HPS Figure 6.3, p. 174, and the episode graph, HPS Figure 7.13, p. 291. The corresponding rule system model is given in HPS Fig 6.12, p. 214.

A knowledge state is knowledge of the current problem state, see HPS p. 180 and p. 585. However NS also say that there can be other knowledge, in particular: (i) memory of the states in the current path and (ii) a goal stack. Thus the PBG does not completely capture the sequence of states. In their production system for cryptarithmetic, they use all of this extra information.

NS define episode, HPS p. 84, by saying that behavior is a sequence of episodes and that an episode is "a succinctly describable segment of behavior associated with attempting a goal", and hence they will "pay attention primarily to which episodes are initiated and how they are terminated".

In SOAR, a state is selected in only one of two ways: (i) the new current state is either a descendant or an ancestor of the current state in the problem space, or (ii) when a subgoal is terminated, the current state becomes the state that was current when the subgoal was created, i.e., it is popped.

Since SOAR's memory is limited to the current path and not to the entire search tree so far, a number of standard search methods cannot be implemented easily in SOAR, [John Laird and Paul Rosenbloom and Allen Newell, 1986] (US) p. 76, including breadth-first and best-first. Actually there was an earlier version of SOAR called SOAR1, or MRS, or the universal weak method [Laird and Newell, 1983a] [Laird and Newell, 1983b], in which these can be implemented. It seems that episodes became less important in SOAR.

22.12 My own concept of episode in thinking

I need to reinterpret these ideas for the different architecture of my model since it is distributed and parallel and uses an episodic memory module.

(i) My concept involves execution and observation.

(ii) Hence it will need to explicitly signal the completion of a task.

(iii) Mental actions can occur in parallel.

(iv) We need to handle the distributed nature of my system: (a) sending data from one module to another, i.e., sending data and messages and receiving results, and (b) sending control information such as commands, triggers and queries, and receiving messages of success/failure and confirmation.

(v) Episodes are chunked automatically.

(vi) Episodes and contexts are interrelated, so events and actions are intimately linked.

The state of knowledge should include short term memories in each module, working memory in the planning module and episodic memory. Thus it includes the history of all episodes so far, although there is a focus on the currently considered situation (= NS’s knowledge state).

Control of these memories is by rules which query EM and WM.

However there will be some knowledge in WM in the planning module which can be used for rapid decision making. By comparison, access to EM is somewhat slower and more expensive, and is used less frequently. In addition, EM computes commonly used information such as the most recent move, the last state visited, whether a state has been visited before, the best state visited so far, and so on, and some of this can be passed to and stored in WM.
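A sketch of this division of labor, assuming EM holds the full history while WM caches a few summaries that EM computes; the scoring function and names are hypothetical:

class EpisodicMemory:
    def __init__(self):
        self.history = []            # full ordered record of (move, state)

    def record(self, move, state):
        self.history.append((move, state))

    # Commonly used summaries, computed in EM (slower, infrequent access)...
    def summaries(self, score):
        states = [s for _, s in self.history]
        return {
            "most_recent_move": self.history[-1][0] if self.history else None,
            "last_state": states[-1] if states else None,
            "visited": set(states),
            "best_state": max(states, key=score) if states else None,
        }

# ...some of which are passed on and held in the planning module's WM,
# a register-like store of a few items used for rapid decisions.
em = EpisodicMemory()
em.record("move1", "s1"); em.record("move2", "s2")
wm = em.summaries(score=lambda s: int(s[1:]))   # cache in working memory
print(wm["best_state"], "s1" in wm["visited"])  # s2 True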

Goals are included in the episodic memory and working memory. There is no explicit store acting as a goal stack, instead previous goals are retrieved from episodic memory.

The state is distributed, with more detail to be found by further querying.

Problem solving action is included in the state, and hierarchical and sequential boundaries are recorded in the state and are used in decomposing the stream of events into episodes.

Given this, episodes will terminate with an evaluation, or an operation such as goal creation. The search can continue with a restart operation in which the situation is assessed globally.

On restart, since the history is available, we can start searching from a deeper node. This is easier because of information computed by EM and made accessible in WM.

Evaluation can proceed in parallel to searching and can take effect at any time.

I will look at mechanisms for actually generating episodes in the next chapter.

22.12.1 Types of mental action

There are two problem-solving cases: (i) where the state is changed from episode to episode, as in TH (the Tower of Hanoi), and (ii) where we keep resetting the state, as in chess problem solving.

There can be externally generated events which mix in with internally generated events to form thinking. In the case where actions generated by thinking take some time we need to recognize when they are completed. This is because the system is distributed, because it takes time to complete a mental act and because we need to regain integrity after each act.

In the case of Newell's work, including GPS, rule systems and SOAR, this issue does not arise, as their systems are strictly serial, so that one action is always completed before the next one begins.

There seem to me to be several subcases:
(i) mental acts limited to one module alone.
(ii) mental acts involving two modules and requiring little completion handling if they take only one cycle to complete, or at least to initiate, where completion does not affect any other process; for example, perhaps, making a note of some mental state, i.e., a simple storage operation.
(iii) mental acts involving two modules where termination involves a local result in the target module only.
(iv) mental acts involving more than two, perhaps several, modules, that need to settle for completion.
(v) mental acts involving only evaluation of the current mental state.
(vi) mental acts involving a saccade, which typically takes only 20 milliseconds to complete, i.e., one processing cycle.
(vii) mental acts involving motor actions, which usually take at least 200 milliseconds, up to several seconds.
(viii) mental acts involving repeated iteration of arbitrary length, for example recalling an event from clues.

It is interesting to note that the different basal ganglia loops, to be described in Chapter 25, also partition into purely mental acts (association loop), purely evaluative acts (limbic loop), purely saccadic acts (eye movement loop) and purely motor acts (motor loop). These could be decoupled for the same reason, namely because they have different time and control properties.

22.12.2 Event types

Vendler's event types are all defined from observations, not from mental acts performed by the subject. We can develop different definitions of these event types which describe how to cause each event type:
(i) states - cause or initiate a state and maintain it, indefinitely if necessary.
(ii) activities - repeat the mental act and maintain it.
(iii) accomplishments - repeat, or initiate and wait, until a given end state is constructed or a time interval is detected.
(iv) achievements - perform once, or for a short time period, or until a completion message is generated and received.
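These causal definitions can be read as execution policies; a rough sketch, where act performs one cycle of a mental act and done or completed test the end condition (all names and the timing scheme are hypothetical):

import time

def cause_state(act):                 # (i) initiate and then maintain
    act()                             # re-issued on later cycles as needed

def cause_activity(act, cycles):      # (ii) repeat the mental act
    for _ in range(cycles):
        act()

def cause_accomplishment(act, done, timeout=1.0):   # (iii) repeat until end state
    start = time.monotonic()
    while not done() and time.monotonic() - start < timeout:
        act()

def cause_achievement(act, completed, max_cycles=3):  # (iv) once, or until a
    for _ in range(max_cycles):                       # completion message
        act()
        if completed():
            return

counter = {"n": 0}
cause_accomplishment(lambda: counter.update(n=counter["n"] + 1),
                     done=lambda: counter["n"] >= 5)
print(counter["n"])    # 5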

22.12.3 The dynamics of episodes

Episodes are triggered from the state, which may or may not include a goal.

Episodes are terminated by evaluations, which can be positive or negative (success or failure) or a real-number value, or by a goal which is generated and communicated, or by an observation, i.e., a datum which is generated and communicated. Episodes can also be terminated by the controlling context reaching an episode boundary.

Evaluations run as parallel processes. In some sense the goal of an episode is to draw a conclusion, which may be an evaluation.

Verbalization also runs as a parallel process: it can interleave with thinking, it includes monitoring, and it can proceed in parallel with thinking and action if this is not too demanding.

Chapter 23

Contexts

Abstract. I develop a neuroanatomically and psychologically plausible approach to the representation and use of problem-solving knowledge. My idea is that structures called contexts are derived from episodes. A context is a problem-solving frame containing descriptions of assumptions, conditions and plans that apply if that context is evoked.

I extend my neocortical model by adding a context memory, which is tentatively assigned to ventral prefrontal areas.

I then extend the functionality of my plan module to receive currently selected contexts which provide the current plans to be elaborated by it.

I call the new system my dynamic model, since this change is quite fundamental. It means that in modules there are now only a few general rules which interpret and recreate stored knowledge. Plans are no longer rule sets but descriptions held in contexts.

A context is formed from the representations of episodes resulting from problem solving; it records the solution of, and the strategy for solving, the problem being attempted. Contexts combine, i.e., merge, in a dynamic memory to produce more general contexts.


23.1 Introduction

I’ll start by reviewing the ideas of Roger Schank and coworkers on dynamic memory for events, scripts, MOPs, TOPs etc.

I’ll then present my own interpretation of the issues, giving an approach to modeling planning knowledge in the primate brain.

23.2 Artificial intelligence models of memory

The most relevant is the work of Schank and coworkers [Schank, 1982a], [Schank and Abelson, 1977] [Schank, 1999] [Schank, 1982b] [Schank, 1975] [Dyer, 1981a] [Dyer, 1981b] [Wilensky, 1978b] [Wilensky, 1978a] [Roseman, 1982] [Kolodner, 1984] [Kolodner, 1993] who developed a system of knowledge representation at the conceptual level. Their motivation was to develop the knowledge that is needed for natural-language understanding by computers. The main use of these representations was to deduce the detailed action and logic underlying text being processed by a computer natural-language system.

It is difficult to read Schank's informal writings, which is probably why no artificial intelligence textbooks or handbooks carry treatments, with the exception of a brief treatment of the earlier work in Rich's textbook [Rich, 1983]. There seems to be no summary or precise definition in any of this work. This is a pity since, in my opinion, Schank's work is very significant research of protean originality. Here is my own stab at summarizing the ideas; it will inevitably have some inexactness and inconsistency, but this will not matter for the subsequent development of this research. For most of this list, Schank has both specific instances of the use of a concept and abstractions defining the concept itself.

1. Concepts are symbols of different types which denote objects and actions. They are called concepts because they do not necessarily correspond to words, they are the things that words refer to.

2. These have relations among them representing their conceptual dependencies, which are like case relationships - agent, object, instrument, place, and so on.

3. Some concepts are states which carry values.

4. Events have event components which are concepts - participants, location, topic.

5. There are causal relations among events and states such as event causes state or event causes state change, or state enables or disables future event.

6. Event contexts relate events to other events by time, causality or containment.

7. Goals are desired events.

8. Plans are action sequences, which are concepts, leading to the achievement of a specified goal.

9. Scene instances are sets of specific events with a shared goal occurring at the same time.

10. A scene groups together or abstracts a set of similar scene instances, to produce a memory structure consisting of actions that have a shared goal and that occurred at the same time. These actions also all occur in the same physical setting, which is usually a spatial location.

11. A scriptlet (1999) [Schank, 1999] is an ordered set of actions within a given scene.

12. A script is an ordered set of scenes, where, since 1982 [Schank, 1982a], scenes are shared among different scripts.

13. A MOP is a memory organization packet which specifies a set of scenes oriented toward the achievement of a given goal. There can be MOPs at different levels of generality, the lowest, i.e., most specific level, connecting to records of specific scene instances.

14. A TOP is a thematic organization packet, which is more general than a MOP and can connect different MOPs by analogy. TOPs use goals and plans to connect different MOPs.

15. A dynamic memory is a set of MOPs and TOPs which is continually updated as new information is received by the system. It is used to interpret current events and to make predictions of future events.

16. It seems that a dynamic memory is basically organized as a discrimination net.

17. The main mechanism for evoking parts of the dynamic memory relevant to incoming events is reminding.

18. The main mechanisms for updating dynamic memory is the occurrence of failures in interpretation or prediction, which cause the system to update and generalize the memory.

In addition to natural-language understanding, this theoretical framework has been applied by Schank to computer-assisted learning in education, where reminding and failure are key concepts. The other main application is in case-based reasoning [Kolodner, 1993]. Indexing, i.e., selecting relevant items from memory, is, following Rich and Kolodner, based on: (i) the goal being the same, (ii) the largest number of important features, (iii) matching a subset of features exactly, (iv) most-frequently matched cases, (v) most-recently matched cases, and (vi) cases most easily adapted to the current situation.
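These indexing criteria suggest a simple scoring scheme for selecting cases; the following sketch combines several of them, with entirely hypothetical weights and feature sets:

def index_score(case, situation, now):
    """Rank a stored case against the current situation (weights hypothetical)."""
    score = 0.0
    if case["goal"] == situation["goal"]:                  # (i) same goal
        score += 3.0
    shared = case["features"] & situation["features"]      # (ii) important features
    score += sum(situation["weights"].get(f, 1.0) for f in shared)
    score += 0.5 * case["match_count"]                     # (iv) frequently matched
    score += 1.0 / (1.0 + now - case["last_matched"])      # (v) recently matched
    return score

cases = [
    {"goal": "open", "features": {"door", "locked"}, "match_count": 4, "last_matched": 9},
    {"goal": "open", "features": {"jar"}, "match_count": 1, "last_matched": 2},
]
situation = {"goal": "open", "features": {"door", "locked", "dark"},
             "weights": {"locked": 2.0}}
best = max(cases, key=lambda c: index_score(c, situation, now=10))
print(best["features"])   # {'door', 'locked'}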

Another conceptual scheme called preference semantics was devised independently in 1973 by Yorick Wilks [Wilks, 1973], without action primitives, i.e., using an open-ended set of action concepts, and in which relations of preference are used to create conceptualizations. Yorick Wilks's representation has been used as the basis of European automatic translation projects.

Of course there is a vast amount of work outside AI on semantics, by Gilles Fauconnier [Fauconnier, 1985] [Fauconnier, 1997], Ronald Langacker [Langacker, 1983] [Langacker, 1999] [Langacker, 2002], Ray Jackendoff [Jackendoff, 1983] [Jackendoff, 1990] [Jackendoff, 197], and Anna Wierzbicka [Wierzbicka, 1980] [Wierzbicka, 1992], for example; however this work has not yet been specified sufficiently, or tested for consistency and usability, to allow computer implementation.

23.3 My dynamic model

I will define a new dynamic version of my model, diagrammed in Figure 23.1, in which:
(i) The only rules are general application rules, which select and apply contexts.
(ii) All information that used to be put in manually now has to be derived by learning from experience.
(iii) The knowledge that was previously represented as rules is now represented as descriptions, i.e., contexts, which are based on previously experienced events.
(iv) The basic action of the system is thus: (a) to apply a context to produce specific context instances; (b) these are included in the event record together with other particularities of the current state; (c) these observed events are eventually accumulated and generalized to form new contexts.

Figure 23.1: The basic action of the system (the plan module, with state, working goals, occurring actions and the currently evoked context (CEC); the context store, which evokes contexts and generalizes episodes; and the event module, which forms overall events and the current episode from local module events, event boundaries and temporal adjacency).

23.4 The context module

Evoked contexts affect system dynamics. The context memory module participates in the dynamics of brain activity. It generates a set of evoked contexts in response to current activity. There will usually be a dominant currently evoked context, which acts as the context for current brain activity.

I will also refer to the current episode, which is the episodic structure currently being formed in the hippocampus. (i) The current context is evoked from (a) the current mental state, (b) the current episodes (temporal sequences of events in the episodic learning module). (ii) The currently evoked context conditions the action of modules in both perception and action. (iii) The currently evoked context can inject elements into modules, e.g., goals and plans. (iv) And, as I already said, currently evoked contexts are important for the segmentation of the event stream into currently perceived episodes.

The location of the context module. I will have a separate module for the context store and this has connections to the episodic memory module and also the planning module. I tentatively assign the context store to a ventral prefrontal area; what little experimental evidence we have of connectivity and imaged activity supports this choice.

Contexts and the episodic memory mechanism. I can now give a more general picture of episode formation and use. Figure 23.2 diagrams respectively (i) creation of episodic memory, (ii) consolidation of episodic memory into cortex, (iii) (unplanned) use of episodic memory as context, and (iv) (planned) iterative retrieval.

Figure 23.2: Episodic memory and context mechanism (four panels showing information flow among dorsal frontal, ventral frontal, hippocampal and other areas: sending information while the hippocampus extends its map and forms an associative index; receiving and storing information with the associative index; evocation of memory from incoming data and associative links, with hippocampal assistance in forming a context; and planned retrieval, in which a description is formed, an associative index selected, the index sent to distributed areas, and the retrieved memories evaluated).

Figure 23.3: A context as part of a hierarchy of contexts (a context C with subcontexts C1 to C4, and C2 with subcontexts C21 to C23).

23.5 The representation of contexts

The hierarchy of contexts. Contexts are formed by generalization from observed episodes. There is a hierarchy of contexts which is updated from experience, see Figure 23.3.

Concrete contexts. It seems reasonable, and has been argued by Barsalou [Barsalou, 1992], that a context can sometimes be derived from a single concrete episode. In other words, it is not necessary for an episode to occur many times before a context can be formed and used for guiding planning activity.

23.6 Activation and execution of contexts

Contexts are evoked competitively from the current state, which includes the current event, episode, context and plan step.

The context will include plan steps as leaves of its tree. When a plan step has been executed, the plan module selects the next plan step within the currently active context. At the end of the last plan step within a context, it will select the parent context. Thus, we assume some sequencing ability by the plan module. The sequencing is determined by the context, which was derived from the sequencing of events from which it was created.

A plan step is derived from, or is, an action observed as an event. It is applied by matching it to the current state in the planning module. If it is a concrete event this match will be an exact identity. The generalization process in obtaining a context from an episode will reduce the specificity of the change and allow a greater number of states to match to the plan step.

The components of a context have a sequential relationship, but this may again be generalized to allow some or any of them to try to match the current state. A context may also contain several loosely related contexts which will compete in matching the current state. Thus a context is similar to a problem space in SOAR.

Means-ends mechanism. Can this representation provide us with means-ends reason- ing?

Evocation of action by a goal. Suppose we have an event with goal g and state s1. This evokes c, which contains g; from this we get c1, taking s1 -> s2, and c2 is made the next to be executed, with g still in place. This again evokes c, and from this c2, taking s2 -> s3, and c3 is made the next to be executed. Similarly c3 takes s3 -> s4.

Events caused by executing plans. Thus an event which is caused by the execution of a planned rule would look like this: the initial state s1: p(), q(), cec(C) evokes context cec(C'), so C becomes the parent context; this leads to a change in some of the state, q() becoming q'(), giving the final state s2: p(), q'(), cec(C').

The structure of contexts. The evoked contexts may have some relation to each other, for example one could be an instance of the other, one could be contained in the other, and so on. This may have an impact on the segmentation of the current situation.

23.7 Example of a context

23.7.1 The form of the ss context

We can now show in Figure 23.4 the context for the selective search strategy.

This has some nonobvious properties. The main property is that each leaf is something that could be learned from an experienced episode, and hence is of the form I have already indicated, i.e., actions in modules. However these also correspond to rule instances with left-hand-side descriptions that caused the event to occur. Then, when such descriptions are part of a context, we need to be able to execute them by sending descriptions to different modules, where they would then have the correct effect.

Thus, referring to Figure 23.4: (i) look_for(Disk) is observed as an occurrence of a saccade, but it can also be sent to a parietal area, where it would cause the same rule to fire, which would send a saccade command to FEF. Thus, this was originally experienced as a saccade command, and then, when generalized and used in a context, it causes a saccade command to be executed.

Figure 23.4: An example of a context, for the selective search strategy (a tree whose leaves are the module actions look_for, recognize, top_disk, not_just_moved, not_last_on and move).

(ii) recognize(Disk,disk) would be sent to TE0, to pay attention to the foveated images of the disk, to create an object-file representation of the object, and to recognize that it is of type disk. recognize(Disk,disk) would be a member of the left-hand side of a rule which was executed in the original experience. Some components of rules are like this; they can be changed by the system itself. I call them the action parts of the rule. Other parts will be descriptions over which the system has less control, and if they are not present when the context sends the recognize(Disk,disk) message, then the rule may not fire, so a fail message would be returned, or a time-out. (iii) top_disk(Disk,Peg) is a message to check whether the given Disk is the top disk on the given Peg. This is a perceptual action, which might occur in TE0 or later. It might require some actions within this module; however, we will assume that this has been routinized and can be treated as atomic. (iv) not_just_moved(Disk) will send a message to the hippocampus to interrogate the current episode. I assume that the hippocampus provides the ability to find the last moved object in the current episode.

Figure 23.5: Executing the ss context, and forming an episode (the plan module sends look, saccade, recognize, check_top, check_moved, check_last and move messages to parietal visual areas, TE0, TE, the current episode and premotor cortex, under the contexts conss, get(Disk,disk,Peg,peg) and get(Peg2,peg)).

(v) look_for(Peg) is similar to look_for(Disk). (vi) not_last_on(Disk,Peg2) is also sent to the hippocampus, and we assume it is possible for the hippocampus to retrieve this information from the current episode. (vii) move(Disk,Peg2) is sent by the plan module, executing the context, to the next lower module in the planning hierarchy, possibly premotor cortex.

Executing the ss context. Figure 23.5 shows how the ss context is executed by the planning module and how the hippocampus forms an episode description from this execution.

Formulation of the ss context. Context descriptions have the form context(cdes,[con1,con2,con3,con4]).

context(conss,[get(Disk,disk,Peg,peg),get(Peg2,peg)])
context(get(Disk,disk,Peg,peg),[look(Disk,disk),check_top(Disk,Peg),check_moved(Disk)])
context(look(Disk,disk),[saccade(Disk,disk),recognize(Disk,disk)])
context(get(Peg2,peg),[look(Peg2,peg),check_last(Disk,Peg2)])
context(look(Peg2,peg),[saccade(Peg2,peg),recognize(Peg2,peg)])

An expression which does not have a context definition is taken as a primitive message to be sent to the appropriate module.

Rules for executing a context. Here is an outline of a context execution mechanism in the planning module:

exec(Cdes,[Primitive]) -> send(Parent_context), send(Primitive)
exec(Cdes,[Context]) -> send(Parent_context), exec(Context), satisfied(Cdes)
exec(Cdes,[Context|Rest]) -> exec(Context), exec(Rest)
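Read as an interpreter, these rules might look as follows in Python: a name with a context definition is elaborated recursively, and an undefined name is sent as a primitive message. Argument binding and the satisfied test are omitted, and send merely prints, standing in for the messaging mechanism.

# Context definitions as in the text: cdes -> ordered list of components.
contexts = {
    "conss":     ["get_disk", "get_peg2"],
    "get_disk":  ["look_disk", "check_top", "check_moved"],
    "look_disk": ["saccade_disk", "recognize_disk"],
    "get_peg2":  ["look_peg2", "check_last"],
    "look_peg2": ["saccade_peg2", "recognize_peg2"],
}

def send(message):
    print("send:", message)          # stands in for a message to a module

def exec_context(cdes):
    """Execute a context by elaborating its components in sequence."""
    for component in contexts.get(cdes, []):
        if component in contexts:    # a subcontext: elaborate it
            exec_context(component)
        else:                        # no definition: a primitive message
            send(component)

exec_context("conss")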

Issues in the evocation of contexts. (i) How do we find a good relevant context? (ii) How do we avoid evoking many contexts that are not exactly right? (iii) Can we work successfully with more than one concurrent context?

23.8 Generating and updating contexts

A new context can be created as a result of problem solving. The problem state evokes a context, and this may cause some steps to be taken; however, this may not completely succeed, and another context may then match better and get evoked. Thus the actual sequence of events is determined by the action of various different contexts in response to the current situation. The sequence of events thus created, as the observed episode, will consist of concrete instances of plan steps. This new episode can then be stored in the context store, and be used directly, generalized by subsequent similar episodes, or contribute to the generalization of some existing contexts.

Rules for chunking events and episodes into episodes. Here is an outline of rules for chunking in the episodic memory module:

if context start then start new episode in same sequence
if context push, i.e., to subcontext, then push episode, i.e., start subepisode
if context finish then pop episode and continue at upper level
if context pop then pop episode
if repeat(Event,Episode) then note repeated event
if novel(Event) then note novel event
if episode sequence long then form subepisode and continue at the same (upper) level
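These rules can be sketched as a small state machine over a stack of open episodes; the signal names, the bound on episode length and the flat event representation are all illustrative assumptions:

class Chunker:
    MAX_LEN = 5                                  # episodes stay small

    def __init__(self):
        self.stack = [[]]                        # open episodes, innermost last
        self.closed = []

    def on_event(self, event, signal=None):
        if signal == "context_start":            # start new episode, same level
            self._close()
        elif signal == "context_push":           # subcontext: start subepisode
            self.stack.append([])
        elif signal in ("context_finish", "context_pop"):
            self._close()                        # pop episode, continue above
        self.stack[-1].append(event)
        if len(self.stack[-1]) >= self.MAX_LEN:  # long sequence: chunk it off
            self._close()

    def _close(self):
        if self.stack[-1]:
            self.closed.append(self.stack.pop())
        if not self.stack:
            self.stack = [[]]

c = Chunker()
for i in range(7):
    c.on_event(f"ev{i}")
c.on_event("ev7", signal="context_finish")
print(c.closed)     # chunked episodes of bounded length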

Characterizing episodes.
Key - get the one or two most important changes.
Important - get the main important changes during the episode.
Novel - get any novel events.
Then form Char = [Key, Important, Novel].

Generalizing episodes. (i) Extract the solution subepisode structure. This might happen automatically when merging; in any case it may be good to include failed executions.

(ii) Compare with all stored contexts and find one similar, then merge.

Merging episodes. If an incoming episode differs from a stored context by only one argument, then generalize this argument to the union of the two values, or, if the values are numerical, to the interval between them.

If it differs by one subepisode but has the same characterization, then delete both subepisodes to generate an abstracted context.
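A sketch of this one-argument merge, assuming episodes and contexts are flat tuples of arguments, with numerical values generalizing to intervals and other values to unions:

def merge(stored, incoming):
    """Generalize a stored context by an incoming episode differing in one slot."""
    diffs = [i for i, (a, b) in enumerate(zip(stored, incoming)) if a != b]
    if len(diffs) != 1:
        return None                       # only the one-argument case handled
    i = diffs[0]
    a, b = stored[i], incoming[i]
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        slot = (min(a, b), max(a, b))     # numerical: the interval between them
    else:
        slot = a | {b} if isinstance(a, set) else {a, b}   # otherwise: union
    return stored[:i] + (slot,) + stored[i + 1:]

print(merge(("move", "disk1", 3), ("move", "disk1", 7)))
# ('move', 'disk1', (3, 7))
print(merge(("move", "disk1", 3), ("move", "disk2", 3)))
# ('move', {...}, 3) with the set containing disk1 and disk2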

Issues with learning abstractions. (i) How do we manage to form good generalizations? (ii) How do we avoid generating too many generalizations, or ones that are difficult to distinguish by indexing?

23.9 Relation of my representation to Schank’s

To return briefly to Schank:

1. Similar episodes would generate their “union”.

2. Schank’s scenes are only generalized from episodes with similar spatial locations and goals.

3. The process of matching is probably the same both for evoking context and for learning.

4. Schank’s MOPs are second-level generalizations from his scenes.

5. The process of generalization is mainly driven from failed expectations, i.e., differences in the two episodes.

6. Goals are integral to episodes and (therefore) to elements of the hierarchy.

7. Plans can be learned as generalizations from episodes.

8. Plans and goals will also be automatically associated together.

9. Schank’s TOPs are probably learned higher-level abstractions from MOPs, plans, goals, etc.

10. Schank's scripts and scriptlets are routine temporal chains associated with goals.

Figure 23.6: The environment of context execution (in the plan module a rule instance fires, sending messages and coordinating with a target module; working memory acts as registers for data from the context; the episodic module holds all episodes and the current episode, computes useful information, and answers queries from the plan module).

11. Thus, we have a mixture of rational goal-directed events and routine sequentialized events.

23.10 Contexts and memory

The environment for executing a context includes the state as I have defined it in the previous chapter; see Figure 23.6.

Thus the state contains the state of execution of the context. Suppose the search tree is as shown in Figure 23.7(a), then after one episode we might have episodic memory developing in a sequence as in Figure 23.7(b)(i)-(vi), leading to a chunked memory shown in Figure 23.7(c).

Thus in this approach, the planning module executes the context, continuously accessing the memory of the episode to choose the next context element to execute.

Figure 23.7: Episode creation during problem solving ((a) a search tree over states s0 to s7 with moves 1 to 7; (b) episodes E1 to E6 formed in sequence and merged into chunks, e.g., E12 = (E1 - E2), ending with E123456 = (E12345 - E6); (c) the resulting chunked memory).

When a state can be evaluated, it is. After evaluation, the context is rematched and a new subcontext selected and applied.

By the use of working memory in the planning module this can be made more plausible. Since there is a limited set of choices at each point, we can keep a single cdes in working memory. So from cdes we choose the next element.

There could also be threads, i.e., associative links from leaves to the next elements to execute. This information is available at learning time.

There can be more than one parallel context; however, I think only one can be articulated at a time. They can all be active and match, however.

It is conceivable that the different levels of one context are executed concurrently, and in many cases this will happen automatically by default.

We would like the execution of contexts to be simple reconstruction of stored generalized events.

23.11 Executing a context

Let us first write this down abstractly. The different issues seem to be:
1. whether the mental act is to be repeated or not.
2. whether we need a completion message.
3. whether we need a result datum to be generated and communicated.
4. whether a characteristic time is involved.
5. whether the mental act occurs entirely within the planning module.
6. whether we need to pass a result to another context.

Thus the general context item is:

if evoking condition
then (repeat) until (result obtained, or completion message, or time)
do (send message or activate subcontext)

I will assume that the context is being executed in the planning module and that it may result in sending a message (an arbitrary description) to a target module. I also assume that there is the usual confirmation mechanism. Thus, if the message fails to trigger a rule in the target module, then there is no confirmation, and this indicates a failure of the context item. We have not so far looked at the case where the message is initially confirmed, producing rule executions in the target module, but confirmation then ceases before there is any completion or result. It is not clear if there is a general way of dealing with this in all cases.

The repetition of activity is built into my model and so all acts will be repeated at least for a short time so repetition does not need to be explicitly represented.

It seems that we do need information on completion of an act. This can however arise in more than one way. For example, the planning module could receive information from some third module, perhaps from perception of a change in the environment. If the target module is to determine completion, this will depend on the type of action it has had to take. If it in turn has sent a message to a further module, it may need to wait for completion information from elsewhere before it can send a completion message itself. My general idea is that I should generalize the model to have completion messages. These could be similar to confirmation messages, and contain a copy of the original evoking message.

Similar considerations apply to the generation of results. These occur in acts which execute until a given effect has been achieved. There are two cases: in the first, a result is generated in the module executing the context itself; in the second, a result is generated in the target module and then communicated to the module executing the context. For results, I think we need not provide a new mechanism: results that somehow find their way to the planning module can be used to terminate repeated acts. For results generated in the target module, the target will have to have its own communication arrangement to send either the result or a completion message to the planning module.
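Here is a rough sketch of this messaging discipline, in which a sent message is either confirmed by the target (a rule fired) and later completed, or times out, signalling failure of the context item; the queue, the thread and the timing values are all illustrative assumptions:

import queue, threading, time

def execute_item(message, target, replies, timeout=0.5):
    """Send a message; fail without confirmation, succeed on completion."""
    target(message, replies)
    confirmed = False
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            kind, msg = replies.get(timeout=max(0.0, deadline - time.monotonic()))
        except queue.Empty:
            break
        if msg != message:
            continue
        if kind == "confirm":          # a rule fired in the target module
            confirmed = True
        elif kind == "complete":       # completion echoes the evoking message
            return "success"
    return "no_completion" if confirmed else "no_confirmation"

def target_module(message, replies):
    """A stand-in target module that confirms, works, then completes."""
    def run():
        replies.put(("confirm", message))
        time.sleep(0.05)               # the act takes some time
        replies.put(("complete", message))
    threading.Thread(target=run).start()

q = queue.Queue()
print(execute_item("recognize(Disk,disk)", target_module, q))   # success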

In the case where a context executes entirely within the planning module, for example in elaborating a subcontext, things are more flexible. Any required results generated by one context and needed by another are available. Also, coactivation should occur fairly naturally without confirmation or completion being explicitly generated.

23.12 Contexts required for the Tower of Hanoi protocol

We need, at least, the following contexts.

To be executed in plan:
conss, which gives one cycle of the ss strategy, and which is repeated, see Figure 23.8; it calls:
get_disk(Disk,Peg), get a disk to move, see Figure 23.8
get_target_peg(Peg), get a peg to move the disk to, see Figure 23.8
look_for(Disk,disk), look for a disk, see Figure 23.8
look_for(Peg,peg), look for a peg, see Figure 23.8
top_disk(Disk,Peg), check if Disk is the top one for peg Peg, see Figure 23.8
not_just_moved(Disk), check whether Disk was just moved in the last move, see Figure 23.9(a)

Figure 23.8: Context for the selective search strategy, showing messages (the leaves annotated with their target modules: IP, TE0, episodic and plan_self_action).

not_last_on(Disk,Peg), check whether Disk was previously on Peg before its last move, see Figure 23.9(b)
move(Disk,Peg), move Disk to Peg, see Figure 23.9(c).

Then goal creation: obstacle_source - create a working goal for an obstacle on the source peg, see Figure 23.10.

(a) not_just_moved(Disk): send(notjustmoved(Disk),episodic)

(b) not_last_on(Disk,Peg): send(not_last_on(Disk,Peg),episodic)

(c) move(Disk,Peg2): repeat until on(Disk,Peg2): send(move(Disk,Peg2),premotor)

Figure 23.9: Contexts which send messages

Figure 23.10: Context for obstacle on source peg (the obstacle_source context matches topgoal(on(Disk,Peg)), finds Disk on another peg Peg2 with a disk Disk2 on top, and creates the working goal wgoal(-on(Disk2,Peg2))).

obstacle_target - create a working goal for an obstacle on the target peg, see Figure 23.11.

Then evaluation: eval(V), evaluate the current position, see Figure 23.12.

Then verbalize: ? - verbally describe the current mental activity.

Then messages sent to other modules:
lookfor(Disk,disk) - may result in saccade(disk) - to IP; causes the visual system to look for a disk and foveate it, making a saccade if necessary
lookfor(Peg,peg) - may result in saccade(peg) - to IP; causes the visual system to look for a peg and foveate it, making a saccade if necessary
recognize(Disk,disk) - to TE0; recognize the foveated object and confirm that it is a disk
recognize(Peg,peg) - to TE0; recognize the foveated object and confirm that it is a peg
top_disk(Disk,Peg) - to TE0; compute whether Disk is on the top of Peg
on(Disk,Peg) - to TE0; compute whether Disk is on Peg
notjustmoved(Disk) - to episodic; read out whether Disk was the last moved
notlaston(Disk,Peg) - to episodic; read out whether Disk was last on Peg
move(Disk,Peg) - to plan_self_act; physically move Disk to Peg

Figure 23.11: Context for obstacle on target peg. [In outline: given topgoal(on(Disk,Peg)), and an obstructing smaller Disk2 on top of the target Peg (size(Disk,S), size(Disk2,S2), S2 < S), the obstacle_target context creates the working goal wgoal(-on(Disk2,Peg)).]

Figure 23.12: Context for evaluation. [In outline: when the top goal on(Disk,Peg) holds, eval_progress adds 1.0 to the evaluation, add_to_eval(1.0).]

23.12.1 Restarting

The control regime leads to a depth-first search. Leaves occur due to evaluations. Sometimes search will restart at the top of the search tree. This is caused by major changes, evaluations, generation of goals, and realizing important new facts.

Thus, a search is terminated by a successful evaluation. The search then restarts. It can, however, start at a non-root node, because the episodic memory indicates the nodes that have not been searched.

23.12.2 Quasilinear searching

The slight branching presumably occurs due to the ability of the system to step back one move only. Thus, after one move, the previous state is still available. This could be provided as useful information by the episodic memory. It also seems that more than one step back never occurs.

23.13 The hierarchy property

We would like the representation to have the property that contexts and their control and data relations can be nested, as shown in Figure 23.13.

There are two cases, when all the contexts are elaborated in the same module, and

Figure 23.13: Nesting of contexts

when some are in other modules. Within the same module, (a) results are automatically available, (b) a wider range of results can be used, and (c) synchronization is handled differently.

There should also be a learning property among modules, as shown in Figure 23.14.

Hierarchy also makes it easier to have different levels of analysis of the system. For example, look_for(Disk,disk) can either be a leaf, or can be elaborated as: look_for(disk), recognize(Disk,disk).

Figure 23.14: Learning and nesting property

23.14 The learnability property

23.14.1 The problem

We now turn to the important issue of how these contexts could ever be learned, by biologically plausible mechanisms.

The problem is to develop a context representation which will allow the reconstruction of the event.

1. What information is in an event, and how is it learned?
(i) the context within which the event takes place
(ii) the input and state that caused the event
(iii) possibly which module this input came from
(iv) whether a result occurred
(v) whether a characteristic time was involved

2. How are episodes/chunks learned or formed?
(i) a sequence of events occurs
(ii) if there is a common parent context, then use it, otherwise form a cdes
(iii) what is the input causing the episode
(iv) what are the overall results of the episode
(v) whether a characteristic time was involved

3. How are contexts formed? By abstraction of several similar episodes, by:
(i) ignoring constants
(ii) making parameters from corresponding values
(iii) combining tests into conjunctions.
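As an illustration of steps (i) and (ii), here is a minimal sketch, in Prolog (the notation the BAD rules use), of abstracting two ground episode descriptions by anti-unification; the predicate names and term encoding are my own assumptions, not part of the model's code.

% sketch: form a context by anti-unifying two ground episode descriptions;
% identical parts are kept, differing constants become shared parameters
abstract_episodes(E1, E2, Context) :-
    anti_unify(E1, E2, [], _, Context).

anti_unify(T1, T2, S, S, T1) :-
    T1 == T2, !.                              % identical subterms are kept
anti_unify(T1, T2, S0, S, G) :-
    compound(T1), compound(T2),
    T1 =.. [F|As1], T2 =.. [F|As2],
    length(As1, N), length(As2, N), !,
    anti_unify_args(As1, As2, S0, S, GAs),
    G =.. [F|GAs].
anti_unify(T1, T2, S, S, V) :-
    memberchk(subst(T1,T2,V), S), !.          % reuse the parameter for a repeated pair
anti_unify(T1, T2, S, [subst(T1,T2,V)|S], V). % a differing pair becomes a new parameter

anti_unify_args([], [], S, S, []).
anti_unify_args([A|As], [B|Bs], S0, S, [G|Gs]) :-
    anti_unify(A, B, S0, S1, G),
    anti_unify_args(As, Bs, S1, S, Gs).

For example, abstract_episodes(move(d1,a,b), move(d2,a,c), C) gives C = move(_,a,_), with the constants d1/d2 and b/c turned into parameters.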

23.14.2 Events

An atomic module event is entirely within one module, and is the activation of one rule instantiation.

A (not necessarily atomic) module event is an aggregation of several atomic module events which result in a chunk and possibly a result occurring: if c ∧ mai then moi

Module event boundaries are presumably things like:
(i) the generation of a data item which was not generated in the previous cycle
(ii) the opposite of (i), i.e., stopping generating an item
This will provide for the recognition of phonemes, for example: boundaries would occur at the onset and termination of acoustic signals, and also on the recognition of each phoneme.
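As an illustration only (the predicate and the item representation are my own assumptions), onsets and offsets of data items between two successive cycles can be computed by set difference:

% sketch: module event boundaries as onsets and offsets of data items
boundaries(PrevItems, CurrItems, Onsets, Offsets) :-
    subtract(CurrItems, PrevItems, Onsets),   % items that have just appeared
    subtract(PrevItems, CurrItems, Offsets).  % items that have just stopped

For example, boundaries([a,b], [b,c], On, Off) gives On = [c] and Off = [a].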

To a certain extent as a special case of a module event, the plan module may send a set of messages {pi} to modules. A plan event: if c ∧ g then p ∧ send({pi})

A module event influenced by a plan message, for module i: if c ∧ pi ∧ mai then moi.

With a small change from our previous modeling ideas, it is clear that each module has the information on the origin of its data items, i.e., which module originally constructed the data item. This arises once you realise that the same data item could not have been sent from more than one module. In the original approach this possibility was allowed for, but it has not been used in the applications so far. If we look at neuroanatomy, it is also clear that inputs do not directly merge as they enter a module.

This also means that we can assume, or make a default, that all messages are automatically confirmed, without confirmation having to be explicitly specified.

An episode event is c ∧ g ∧ {pi} ∧ {mai} then {moi}.

The episode also includes the information of the origin of all of its components. This also means that it automatically contains information on the plan module’s influence on each module via the {pi} messages.

However, we need more complex events, which we will call causal events: sequences of episode events for which an effect is produced.

The main case is to do a mental act until a result is produced.

For the plan module, if the repetition of: if c ∧ g then p ∧ send({pi}) eventually produces a result data item in the plan module, then this is a causal event, from which we can deduce a context: repeat until result: if c ∧ g then p ∧ send({pi}). If this occurs regularly, it strengthens and abstracts this ground context into a more generally applicable context.
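A minimal operational reading of such a causal event, with result_present/1 and execute_act/1 as assumed interfaces to the module's working memory and rule engine:

% sketch: repeat a plan act until the result data item appears
repeat_until(Result, Act) :-
    (   result_present(Result)
    ->  true                       % the effect has been achieved
    ;   execute_act(Act),          % otherwise do the act again
        repeat_until(Result, Act)
    ).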

A similar mechanism can be used in any module, leading to a context for that module.

More generally, a sequence of episode events constitutes a regularity which can be abstracted. It contains the plan event.

23.15 The cognitive map

In general, the episodes are stored in a cognitive map. This contains expressions of the form (epdes,[epdes1,epdes2,...]). At the leaves of this hierarchy we have (epdes,[evdes1,evdes2,...]), and then there are event components stored in each module, in the form (epdes,mevdesi,evi), where mevdesi is the original chunked module event.

This arrangement allows the details of events, which will be of different data types, to be stored in modules that can process them.

This also allows an episode to be re-evoked by sending epdes and mevdesi to each module as required. It also seems reasonable to allow modules to propagate the epdes key among the different modules and to increase its weight if it matches well.

So this is recollection of a previous event. The role of the cognitive map is to store the structure of episodes. This also includes the episode keys which allow various relations among episodes to be computed, including temporal relations.

I am arguing that an abstraction of this is what leads to the evocation of plan contexts and the use of plan knowledge and activity in generating mental acts.
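As a sketch of the storage and re-evocation just described (the fact and message names are illustrative assumptions, with send/2 as the usual message primitive):

% sketch: cognitive map entries and re-evocation of a stored episode
episode(ep1, [ep2, ep3]).            % interior node: (epdes,[epdes1,epdes2,...])
episode(ep2, [ev1, ev2]).            % leaf: (epdes,[evdes1,evdes2,...])
module_event(te0, ep2, mev7, ev1).   % stored in module te0: (epdes,mevdesi,evi)

reevoke(EpDes) :-
    forall(module_event(Module, EpDes, MevDes, _EvDes),
           send(recall(EpDes, MevDes), Module)).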

23.16 Logical representation of contexts and their execution

A context has a similar representation to an episode, i.e., context ::= context(cparent,cdes,[cdes1,cdes2,...]), where a component cdesi can be another context, an episode epdes, or a single event evdes. In addition, a context may have cparent, the descriptor of its parent context. We gave an example earlier for the ss context. The main control operations are to execute the next context, which is the next in the list, and to execute a subcontext, where a component of a context is in turn a list of contexts. At the end of a list, the execution returns to the parent and continues.
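For orientation, here is a compact recursive rendering of this control regime (exec_leaf/1 is an assumed primitive; the BAD rules in section 23.17.5 implement the same traversal with explicit cc/pcc pointers):

% sketch: depth-first execution of a nested context
exec_context(C) :-
    (   context(_Parent, C, Components)       % composite: execute the components
    ->  forall(member(Ci, Components),        % in order, then return to the parent
               exec_context(Ci))
    ;   exec_leaf(C)                          % leaf: a send/2 or primitive act
    ).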

23.17 The code

23.17.1 Brief outline

Figure 23.15 indicates the code needed to implement our theory of memory.

Figure 23.15: Outline of code. [Normal modules: execute the currently evoked contexts; form and send the module event meventi to episodic; store epdes, evdes, meventi; reinstate meventi from epdes and propagate epdes to neighbors. Episodic: form events and episodes, evdes, epdes; store epdes in the cognitive map; on cue epdes, activate evdes, evdesi. Context: generalize episodes to episodic contexts in dynamic memory; select the currently evoked contexts. Plan: execute the currently evoked contexts; send the plan event to episodic.]

Normal modules

1. They have to construct the module event, mevdesi, and send it to episodic.

2. On consolidation, they receive epdes,evdes,mevdesi and store them.

3. On recall they may receive a cue which is matched to mevdesi components of items and/or epdes components of items. This may cause the reinstatement of mevdesi, as well as epdes and evdes, and the propagation of these values to connected modules.

The episodic module. This is in two parts, the formation of episodes and the management of the cognitive map.

1. The formation of episodes.

Forms events and episodes, including computing their characterizations.

An event has the form: event(evdes,[sp1,obj1,obj2,obj3,goal1,wgoal1,cec1,pcec1])

I will assume the system has access to a time variable which is an integer equal to the number of cycles so far.

In addition, evdes has the most important components, such as the rule instance and any important inputs and outputs to it.

An episode has the form: episode(epdes1,[evdes1,evdes2,evdes3,evdes4]), where evdesi can be an event descriptor or an episode descriptor.

2. The management of the cognitive map.

(i) The items are the episodes episode(epdes1,[evdes1,evdes2,evdes3,evdes4])

(ii) On receipt of a cue, its closeness to an epdes is computed and the episode activated.

(iii) This causes each evdesi to be sent to its corresponding module.

The context module

1. This receives episode(epdes1,[evdes1,evdes2,evdes3,evdes4]) and integrates them into a dynamic memory of contexts of the form: context(cdes,[con1,con2,con3,con4]).

2. In the simplest case this can be a simple episode, however if there are already stored similar episodes the module will construct generalizations from them.

3. The selection and instantiation of the currently evoked contexts. This occurs by matching cdes to the current state and episode. The currently evoked contexts are sent to the plan module.

The plan module

1. This has to execute the currently evoked context.

2. It also sends the plan event to the episodic module.

23.17.2 Normal modules

The module event will initially be the dominant rule instantiation. In addition it should have any changes of state since these may be results caused by other modules.

The dominant rule instantiation consists of the lhs, the rhs, and the rule number.

Later we can introduce some chunking at the modular level, notably the set of microevents which causes one change of an item, where by a microevent I mean an action occurring during just one cycle.

23.17.3 The episodic module

(i) The event is formed: event(evdes,[sp1,obj1,obj2,obj3,goal1,wgoal1,cec1,pcec1]) where the components are descriptions received from different modules. The main problem is computing evdes, but as discussed earlier, this will be made unique by using the time as part of the characterization. The other parts will be limited in size but should involve whatever is unique or important about this event. The entire event could be used provided the sizes of cec1 and pcec1 are bounded. We could use this full descriptor initially.

The components should really be the rule instantiations which generated those data items.

(ii) The episode is formed: episode(epdes1,[evdes1,evdes2,evdes3,evdes4]). The idea is that this list is limited to a maximum of 4 or 5 items. An episode is completed when an evaluation occurs, or on a change of context cec1, or when the number of items exceeds 4 or 5. Then a new episode is started: episode(epdes2,[evdes5]). This episode may be either a next subepisode or a next episode at any level of the episode hierarchy. The previous episode epdes1 is now a completed episode. epdes is again a time plus key features, and again could in principle be the entire data.

If a next episode at this level then episode(epdes,[epdes1,epdes2])

If a next episode at the next higher level, then episode(epdes0,[epdes11,epdes2]) with episode(epdes11,[epdes1,...]). This could occur when a subtree exceeds 4 or 5 children.

Note that the accretion of episodes is really a mutation, although it could be made into a nesting, i.e., episode(epdes2,[evdes5]), then episode(epdes2,[evdes5,evdes6]) and so on, or else episode(epdes2,[episode(epdes2,[evdes5]),evdes6]).

If the context stays the same then it can be attached at the aggregated levels.

Rules for chunking events and episodes into episodes. Here is an outline of rules for chunking in the episodic memory module:
if context start then start new episode in same sequence
if context push, i.e., to subcontext, then push episode, i.e., start subepisode
if context finish then pop episode and continue at upper level
if context pop then pop episode
if repeat(Event,Episode) then note repeated event
if novel(Event) then note novel
if episode sequence long then form subepisode and continue at the same (upper) level
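A sketch of the episode-closing test implied by these rules (the predicate names are illustrative assumptions):

% sketch: decide whether the current episode should be closed
max_episode_length(5).

episode_complete(Events, LastEvent) :-
    (   evaluation_event(LastEvent)           % an evaluation occurred
    ;   context_change(LastEvent)             % the context cec1 changed
    ;   max_episode_length(N),
        length(Events, L),
        L >= N                                % the episode sequence is long
    ).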

Characterizing episodes:
Key - get the one or two most important changes
Important - get the main important changes during the episode
Novel - get any novel events
Then form Char = [Key, Important, Novel].

(iii) The cognitive map: reactivated events. A cue is received. It is matched to the set of episodes and one is chosen. The features in the characterization are sent to their corresponding modules.

This could be achieved by finding every context and then looking at the components of its epdes, and matching them to the given cue.
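A sketch of this cue-driven retrieval, with closeness/3 as a hypothetical similarity measure over descriptors:

% sketch: retrieve the stored episode whose descriptor best matches a cue
best_episode(Cue, Best) :-
    findall(Score-EpDes,
            ( episode(EpDes, _Components),
              closeness(Cue, EpDes, Score) ),
            Scored),
    max_member(_-Best, Scored).    % the pair with the highest score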

This is the currently retrieved episode, so another cue will retrieve further, relative to this episode. It can descend to children or transition to the next episode or to the parent.

It can also use association of features within the currently retrieved episode to find some episode within this bit of the tree.

If the retrieved episode is very concrete, then when sent to modules it may evoke a vivid reenactment of the stored episode.

What is actually sent to modules is epdes, evdes, mevdesi. This allows the episode to be reactivated and propagated among the modules.

23.17.4 The context module

(i) Incoming episode descriptions are stored and can be used as contexts. In addition, the context module builds up a dynamic memory of abstractions of episodes.

(ii) The currently most relevant contexts are activated by the current state. This is achieved by matching to their cdes descriptors.

Contexts that can run in parallel can be activated, but those in competition have to be filtered.

(iii) The initial set of contexts is given as a set which could plausibly have been learned from early experience, sensory-motor and otherwise.

23.17.5 The plan module

(i) This executes all the currently active contexts. This can be achieved with a small set of rules. Also the episodic memory provides information on progress in the search tree.

Form: context(parent,cdes,[cdes1,cdes2,...]).

Referring to Figure 23.16, we are using the notation:

Currently evoked context - Cec
Current parent context - Pcec
Current context being executed - Cc
Parent of current context being executed - Pcc
Working memory: done(C)

There are just three cases:

% if first then execute and then push
if cc(Cc), first(Cc,C1), notdone(C1) then exec(C1), cc(C1), pcc(Cc), done(C1)

% next in sequence
if cc(Ci), done(Ci), next(Ci,Ci1) then exec(Ci1), done(Ci1)

% last then execute and pop
if pcc(Cc), cc(Ci), next(Ci,Cl), c(Pcc,Cc,_), done(Ci), last(Cl) then exec(Cl), cc(Cc), pcc(Pcc), done(Cl)

(ii) in common with all the other modules, plan also sends a module event. The plan event contains the cec, pcec, wgoals, as well as the current rule instance.

We could use context and send cec and pcec from these as a context module event.

We can now write down descriptions for the various contexts involved in the Tower of Hanoi problem:

Figure 23.16: Executing a context

% ss strategy
context(move(Disk,Peg,Peg2),

[get_disk(Disk,Peg),get_target_peg(Peg2),move(Disk,Peg2)]).

% select a disk to move context(get_disk(Disk,Peg), [look_for(Disk,disk),top_disk(Disk,Peg),not_just_moved(Disk)]).

% get a peg to move the selected disk to context(get_target_peg(Peg2), [look_for(Peg2,peg),not_last_on(Disk,Peg2)]).

% look for a disk context(look_for(Disk,disk), [send(look_for(Disk),ip),send(recognize(Disk,disk),te0)]).

% check if a given disk is the top one for a given peg context(top_disk(Disk,Peg), [send(top_disk(Disk,Peg),te0)]).

% check whether Disk was just moved in the last move context(not_just_moved(Disk), [send(not_just_moved(Disk),episodic)]).

% look for a peg context(look_for(Peg,peg), [send(look_for(Peg),ip),send(recognize(Peg,peg),te0)]).

% check whether Disk was previously on Peg before its last move context(not_last_on(Disk,Peg), [send(not_last_on(Disk,Peg),episodic)]).

% move Disk to Peg context(move(Disk,Peg), [send(move(Disk,Peg),plan_self_action)]).

We can now write the BAD rules for executing a context:

% if first then push
rule(M,plan_1,context(all),
   if((cc(W_cc,C,[Cc]),
       first(W_fi,C,[Cc,C1]),
       notdone(W_nd,C,[C1]))),
   then([cc(W_1,any,[C1]),pcc(W_2,any,[Cc])],Wa),
   provided(()),
   weights(1.0,[1.0,1.0,1.0],[1.0,1.0,1.0,1.0]) )

% if notexists first then execute
rule(M,plan_2,context(all),
   if((cc(W_cc,C,[Cc]),
       notexistsfirst(W_fi,C,[Cc,C1]),
       notdone(W_nd,C,[C1]))),
   then(Lista,Wa),
   provided(( Cc = context(_,List),
              append(List,[done(W_1,any,[C1])],Lista) )),
   weights(1.0,[1.0,1.0,1.0],[1.0,1.0,1.0,1.0]) )

% next in sequence
% if cc(Ci),done(Ci),next(Ci,Ci1) then exec(Ci1),done(Ci1)
rule(M,plan_3,context(all),
   if((cc(W_cc,C,[Ci]),
       done(W_d,C,[Ci]),
       next(W_ne,C,[Ci,Ci1]))),
   then(Lista,Wa),
   provided(( Ci1 = context(_,List),
              append(List,[done(W_1,any,[Ci1])],Lista) )),
   weights(1.0,[1.0,1.0,1.0],[1.0,1.0,1.0,1.0]) )

% last is done then pop
% if pcc(Cc),cc(Cl),done(Cl),last(Cl) then cc(Cc),pcc(Pcc)
rule(M,plan_4,context(all),
   if((cc(W_cc,C,[Cl]),
       done(W_d,C,[Cl]),
       last(W_la,C,[Cl]),
       pcc(W_pcc,C,[Cc]))),
   then([cc(W_1,any,[Cc]),pcc(W_2,any,[Pcc])],Wa),
   provided((find_pcc(Cc,Pcc))),
   weights(1.0,[1.0,1.0,1.0],[1.0,1.0,1.0,1.0,1.0,1.0,1.0]) )

Chapter 24

Learning by doing

Abstract. I demonstrate my system by modeling learning by doing for the Tower of Hanoi problem. This is based on the published experimental results of Anzai and Simon. Thus, a naive subject makes a series of attempts at solving the Tower of Hanoi problem, and their performance improves as they learn by doing.


24.1 Learning by doing

The way I am conceiving it, this is a very simple kind of learning, which is really just problem solving.

The activity results in an episode which is a record of:
(a) all events involved in solving the problem, i.e., the search space as experienced
(b) a summary of the key events, which is the solution of the problem

So solving the problem and forming this episodic memory is the simplest form of learning by doing.

In this case, what is learned is the episode and it can be remembered and probably communicated.

The more interesting form of learning by doing occurs when, from this episode, we then develop generalizations in a dynamic memory.

24.2 Tower of Hanoi learning

Anzai and Simon [Anzai and Simon, 1979] obtained a verbalization protocol, which I give in full in the Appendix of this chapter. From this, Anzai and Simon concluded that the subject transitioned through a sequence of three strategies, namely, the selective search strategy, the perceptual strategy, saving only the current goal, and the goal-stacking strategy. These three strategies are diagrammed in Figure 24.1. I have already described these strategies and given rule systems for them in chapter 16, which are diagrammed in Figures 16.13, 16.14 and 16.15.

Figure 24.1: Strategy learning sequence of Anzai and Simon. [Strategy 1 (selective search): do not move the disk just moved, and do not move a disk back to the peg it was last on. Strategy 2 (perceptual): move the largest out-of-place disk to its target; if obstructed at the source or at the target, create a new goal to move the obstructing disk to the other peg. Strategy 3 (goal stacking): the same as strategy 2, but remember the previous goal, and when a goal is solved, reinstate the previous goal.]

24.3 Research by others on modeling Tower of Hanoi learning

1. Pat Langley - adaptive production system [Langley, 1996]
2. Dirk Ruiz - SOAR, Tower noticing mechanism [Ruiz, 1987] [Ruiz and Newell, 1989]
3. John Anderson - CMU, goal stacking [Anderson, 1993].

24.4 My analysis of the Anzai-Simon protocol

In trying to implement learning by doing using contexts, I was led to a somewhat different analysis of what is happening in the subject's mind:

(i) The selective search strategy will in fact solve the TH problem alone, if pursued to the bitter end. Actually you need one more rule to get round the far corner of the search space if you happen to go there. However, the subject stops after a few moves. I attribute this to an evaluation context which is running in parallel to the ss context.

(ii) The main problem the subject has is not to invent selective search, or the concept of obstacle, or goal, or the ordering of goals, or the stacking of goals. These will already be part of the subject's knowledge, having been learned by learning by doing in childhood. Rather, the problem is to discover which contexts to use, and how to use them, and use them together, in the new problem situation.

24.5 Learning Tower of Hanoi strategies

Outline. To further develop my approach to learning and memory, I now develop a treatment of the learning of strategies for the Tower of Hanoi problem. My initial idea is that a sequence of stages occurs as follows:
1. The initial problem is perceived
2. This evokes a context, which is ss - selective search
3. The context sends plan and moves etc. to the planning module
4. The system follows this ss strategy, generating a sequence of moves
5. An event sequence is formed in the hippocampal system
6. This evokes the obstacle context
7. An episode is recognized as an obstacle episode
8. The evoked obstacle context sends descriptions of working goals, plan and moves to planning
9. Thus planning now has a composite of both kinds of plans and moves; this constitutes ps - the perceptual strategy
10. The new plan is executed giving a new event sequence, which matches the evoked contexts to form episodes
11. The new episode has a sequence of working goals
12. This goal sequence evokes the subgoal context
13. The sequence is perceived as a subgoal context
14. New plan, goals and moves are added to the planning module, giving stack - the stacking strategy.

The initial state. I assume that the initial problem state is already encoded and in place. It is a context which is constructed from verbal and visual perception:
goal - on(disk1,peg3), on(disk2,peg3), on(disk3,peg3)
constraint - minimize number of moves
actions - top(Disk,Peg1), move(Disk,Peg1,Peg2)
constraint - disk size less than top disk on target: top(Disk2,Peg2), less_than(Disk1,Disk2) | peg_free(Peg2)

Note that the perception-action hierarchy already computes on(Disk,Peg), less_than(Disk1,Disk2) and top(Disk,Peg), and we already have a move action in the planning module, move(Disk,Peg1,Peg2).

The initial evoked context is ss, selective search, which adds two constraints:

1. don’t move disk such that last disk moved(Disk)

2. don’t move back, i.e., don’t move disk to peg such that last on(Disk,Peg) Note that we already have last disk moved(Disk) and last on(Disk,Peg) from querying the event sequence.

Available contexts.

(i) perceived situation: on(Disk1,Peg1), on(Disk2,Disk1)
working goal: on(Disk1,Peg2)
plan: goal(not_on(Disk2,[Peg1,Peg2])), move(Disk1,Peg2)

(ii) perceived situation: on(Disk2,Disk1), on(Disk2,Peg1)
working goal: not_on(Disk2,[Peg1,Peg2])
plan: choose_different_peg(Peg3,[Peg1,Peg2]), move(Disk2,Peg3)

(iii) perceived situation: on(Disk2,Peg1)
working goal: not_on(Disk2,[Peg1,Peg2])
plan: choose_different_peg(Peg3,[Peg1,Peg2]), move(Disk2,Peg3)

(iv) perceived situation: -
working goal: goal(G1)
plan: goal(G2), goal(G1)

Comparison of contexts with SOAR problem spaces. What I am calling a context is a bit like a problem space in SOAR. There are various differences:
(i) Contexts are evoked by matching, not by name - by matching to the current event stream or a current episode.
(ii) Contexts contain descriptions of sequences of events at different times.
(iii) Contexts contain situations, goals and plans, not states and operators.
(iv) There can be, and usually are, more than one context evoked at the same time.
(v) Context instances can be combined to create problem conceptions, and new contexts by learning.
(vi) Learning occurs by observation, including impasses.

24.6 Contexts in the Tower of Hanoi example

24.7 Lessons learned from attempting to extend the model to do learning

The episodic learning module (hippocampus) must have greater functionality than usually discussed. It must for example store the memory of very recent events, which would be associated together, allowing some deduction of ordering in time.

It is this recent event memory that is accessed in many recall tasks. It could also be used for goal stacking, since pushing a goal would be simply marking it in this recent-event memory and popping would be recalling from recent-event memory.

The hippocampus is structured a certain way. It has two sets of buffers which are regular

6-layer cortex, and then two loops of sequences of 3-layer cortex. Ideally, a model should have this same structure, and rule sets for 3-layer cortex should be simpler than for 6-layer cortex.

The representation of rules should be altered to be explicit data items which are based on representations of events formed by learning.

The system should evoke multiple event-contexts which can combine to form new learned events.

The underlying machine would now become a universal process for executing events.

24.8 Appendix - The Anzai and Simon protocol

From Yuichiro Anzai and Herbert A. Simon, "The Theory of Learning by Doing", Psychological Review, Volume 86, pages 124-140, 1979.

1. I’m not sure, but first I’ll take 1 from A and place it on B. 2. And I’ll take 2 from A and place it on C. 3. And then, I take 1 from B and place it on C. (If you can, tell me why you placed it there) 4. Because there was no place else to go, I had to place 1 from B to C. 5. Then, next, I placed 3 from A to B. 6. Well . . . , first I had to place 1 to B, because I had to move all disks to C. I wasn’t too sure though. 7. I thought that it would be a problem if I placed 1 on C rather than B. 8. Now I want to place 2 on top of 3, so I’ll place 1 on A. 9. Then I’ll take 2 from C, and place it on B. 10. And I’ll take 1 and . . . place it from A to B. 11. So then, 4 will go from A to C. 12. And then . ., um . . ., oh . . ., um . . ., 13. I should have placed 5 on C. But that will take time. I’ll take 1 . . (If you want to, you can start over again. If you are going to do that, tell me why.) 14. But I’ll stay with this a little more . . . 15. I’ll take 1 from B and place it on A. 16. Then I’ll take 2 from B to C. Appendix - The Anzai and Simon protocol 475

17. Oh, this won’t do . . . 18. I’ll take 2 and place it from C to B again. 19. And then, I’ll take 1, from A . . . 20. Oh no! If I do it this way, it won’t work! 21. I’ll return it. 22. OK? 23. I’ll start over. (Go ahead) 24. If I go on like this, I won’t be able to do it, so I’ll start over again. 25. Let’s see . . . I don’t think 5 will move. 26. Therefore, since 1 is the only disk I can move, and last I moved it to B, I’ll put it on C this time . . . from A to C. 27. So naturally, 2 will have to go from A to B. 28. And this time too, I’ll place 1 from C to B. 29. I’ll place 3 from A to C. 30. And so I’ll place 1 from B . . . to C. 31. Oh yeah! I have to place it on C. 32. Disk 2 . . . no, not 2, but I placed 1 from B to C . . . Right? 33. Oh, I’ll place 1 from B to A. 34. Because . . . I want 4 on B, and if I had placed 1 on C from B, it wouldn’t have been able to move. 35. 2 will go from B to C. 36. 1 will go from A to C. 37. And so, B will be open, and 4 will go from A to B. 38. So then, this time . . . It’s coming out pretty well . . . 39. 1 will . . . 1 will go from C . . . to B. 40. So then 2, fom C, will go to . . . A . . . 41. And then, 1 will go from B to A. 42. And then, 3 will go from C to B. 43. 1 will go from A to C. 44. What? 45. And then, 2 will go from A to B. 46. And then, oh, it’s getting there. 47. 1 will go from C to B. 48. So then, 5 will finally go from A to C. 49. And then, 1 will go from B to A. 50. Oh, I’ll put 1 from B to C. (Why?) 51. Because if 1 goes from B to A, 2 will go from B to C . . . 52. Let’s try it again, ok? 53. Um . . . it’s hard, isn’t it? 54. I didn’t know it would be so hard . . . It’s hard for me to remember . . . 55. And so I guess I have to do it logically and systematically. 476 Chapter 24: Learning by doing

56. And 1 will go from A to C.
57. 3 will go from B to A.
58. 1 will go from C to B.
59. Because I want to move 4 to C, and to do that I have to move 2, don't I.
60. And to do that, 2 will go from C to A.
61. And then, 1 will go from B to A.
62. And then, 4 will go from B to C.
63. This time, if I think of 3 on C, that will be good, so 1 will go from A to C.
64. 2 will go from A to B.
65. 1 will go from C to B.
66. And then, I'll bring 3 from A to C.
67. This time, it's easy, and 1 will go from B to A.
68. 2 will go from B to C.
69. And then 1 will go from A to C.
70. All right, I've made it.
71. I wonder if I've found something new.
72. I don't know for sure, and little ones will have to go on top of big ones ... big ones can't go on top of little ones, so first, bit by bit, C will be used more often before 5 gets there.
73. And then, if 5 went to C, next I have to think of it as 4 to go to C ...
74. This is my way of doing it ...
75. Can I move it like this?
76. First, if I think of it as only one disk, 1 could go from A to C, right?
77. But, if you think of it as two disks, this will certainly go as 1 from A to B and 2 from A to C, then 1 from B to C.
78. That ... that anyway 2 will have to go to the bottom of C, naturally I thought of 1 going to B.
79. So, if there were three ... yes, yes, now it gets difficult.
80. Yes, it's not that easy ...
81. ... this time, I will ...
82. Oh, yes, 3 will have to go to C first.
83. For that, 2 will have to go to B.
84. For that, um ..., 1 will go to C.
85. So, 1 will go from A to C, 2 will go from A to B, 1 will go back from C to B, I'll move 3 ... That's the way it is!
86. So, if there were four disks, this time, 3 will have to go to B, right?
87. For that, 2 will have to stay at C, and then, for that, 1 will be at B.
88. So 1 will go to B.

89. And then, 2 will go from A to C.
90. And then, 1 will go back from C to B.
91. And then, 3 will move from A to B.
92. And then, I will move 1 from C to A.
93. And then, first, I will move 1 from C to A.
94. And then, I will move 1 from A to B.
95. And then, 4 from A to C.
96. And then, again this will go from A ... 1 will
97. Wrong ..., this is the problem and ...
98. 1 will go from B to C.
99. For that, um ..., this time 3 from B, um ... has to go to C, so ...
100. For that, 2 has to go to A.
101. For that, 1 has to go back to C, of course.
102. And then, 2 will go from B to A.
103. And then, 1 will go from C to A.
104. And then, 3 will go from B to C.
105. So then, 1 will go from A to B.
106. 2 will go from A to C.
107. and then, 1 will go from B to C. (All right)
108. I think I can do five now.
109. ... Ah, it's interesting ...
110. If it were five, of course, 5 will have to go to C, right?
111. So, 4 will be at B.
112. 3 will be at C.
113. 2 will be at B.
114. So 1 will go from A to C. (Fantastic.)
115. This is the way I think!!
116. And then, 2 will go from A to B.
117. 1 will go back from B to C.
118. 3 will go from A to C.
119. For that, um ..., this time, again ..., as this time 4 will have to go to B ...
120. Let's move back 1 from B to A ...
121. If 4 has to go from A to B, it means ...
122. 2 will have to go to 3.
123. Because 1 will ...
124. So, 1 will go back from B to A.
125. And then, 2 will go from B to C.
126. And then, 1 from A to C.
127. And then, 4 from A to C.
128. And then, this time ..., it's the same as before, I think ..., um ...

129. Of course, 5 will go to C, right?
130. For that, 3 will have to go to B, so
131. 2 will go back to A,
132. 1 from C to B.
133. 2 from C to A.
134. 1 from B to A.
135. 3 from C to B.
136. 1 from A to C.
137. 2 from A to B.
138. 1 from C to B.
139. And then, finally 5 will go from A to C.
140. and then, this time, um ..., um, 4 will go to C, so ...
141. 3 goes to A.
142. 2 goes to B.
143. and then, 1 will go to A.
144. So, anyway, I will move 1 from B to A.
145. 2 from B to C.
146. And then, 1 from A to C.
147. 3 from B to A.
148. 1 from C to B.
149. And then, 2 from C to A.
150. And then, 3 from B ...
151. 1 from B to A.
152. And then, finally, I have succeeded in moving 4 from B to C.
153. So, this time, um ..., oh, this time, 3 naturally has to go there, so,
154. for that, 2 has to go to B.
155. So 1 will go from A to C,
156. place 2 from A to B,
157. place 1 from C to B,
158. and then, 3 from A to C.
159. Place 1 from B to A,
160. place 2 from B to C,
161. and then, move 1 from A to C.
162. Oh, yeah ..., In this way, think bit by bit ..., think back ... (Ok, why don't you try it again?)
163. After all, it's the same thing, isn't it?
164. First 1 will go from A to C.
165. Because, 5, at the end, will go to C, so,
166. So, 4 will go to B.
167. And then 3 will go to C.

168. And then, 2 will go to B.
169. So, 1 will go from A to C.
170. 2 will go from A to B.
171. Move 1 from C to B,
172. move 3 from A to C.
173. Next, 4 will go to B. So ...
174. move 1 from B to A,
175. move 2 from B to C.
176. Move 1 from A to C,
177. and then, move 4 to B.
178. Next, 5 has to go to C, so ...
179. I only need move three blocking disks to ... B.
180. So, first ... 1 will go from C to B,
181. move 2 from C to A,
182. and then, move 1 from B to A.
183. Move 3 from C to B,
184. 1 from A to C.
185. Move 2 from A to B,
186. and then, 1 from C to B.
187. And then, 5 can go to C ...
188. It's easy, isn't it?
189. 5 has already gone to C.
190. Next ..., 5 was able to move, because ...
191. A and C were open, right?
192. 5 is already at C, so ...
193. I will move the remaining four from B to C ...
194. It's just like moving four, isn't it?
195. So ... I will have to move 4 from B to C ...
196. For that, the three that are on top have to go from B to A ...
197. Oh, yeah, 3 goes from B to A!
198. For that, 2 has to go from B to C,
199. for that, 1 had to go from B to A.
200. So, 1 will go from B to A.
201. 2 goes from B to C.
202. 1 will go from A to C.
203. And then, 3 can go from B to A.
204. Then, it'd be good if 1 and 2 got to A, so
205. ... first 1 goes from C to B,
206. 2 moves from C to A.
207. And then, 1 moves from B to A.
208. Um, with this, the three at B have moved to A, so ...

209. move 4 from B to C.
210. Next, if the three at A go to C, I will be done.
211. So first, the top two disks will be moved to B.
212. For that, 1 goes from A to C.
213. 2 goes from A to B.
214. And then, 1 goes from C to B,
215. and then, 3 goes from A to C.
216. Oh! This time, the two on B will be moved to C.
217. Right ...
218. 1 moves from B to A,
219. 2 from B to C.
220. And then, 1 will move from A to C.
221. I did it!
222. I think I finally got it ...
223. This time, if 5 goes to C, it'll be just like moving four ..., but ...
224. I still don't quite yet ...

Chapter 25

Procedural memory and routinization

Abstract. I will define creative action as that originating in the basic knowledge elaboration activity of the cortex. At lower levels of the perception-action hierarchy, this activity may be fairly straightforward, and at higher levels it may be more creative.

In this chapter, I will argue that cortical areas, particularly frontal areas, will be involved in the selection and control of routine action which originates in the basal ganglia. I explain the anatomy of the basal ganglia and their connections, which place them in loops from various areas of the cortex and back to frontal areas. I then develop a model of routine action via the basal ganglia, which learn association connections among source areas feeding them.

Execution of routine action is controlled by planning areas in the cortex. They initiate routine action, and monitor and terminate it. Normally, there will be an interleaving of creative cortical action and routine action. The phenomena of action slips show how action breaks at the boundaries between these two modes of action.


25.1 Introduction

Novel action and prefrontal cortex. In the case where a person is placed in a novel environment with a novel problem, he or she engages in problem solving using general planning methods in prefrontal areas. This involves making mistakes, trial and error, experimentation and other activities which we conceive of as being controlled from general problem-solving methods fairly high in the action hierarchy, probably in areas 9 and 10. Brain activity as measured by implanted electrode recording or by imaging should show changes in these areas.

Contexts and the hippocampus. I expect that, once a more useful plan has been discovered, it will be learned as a known context by the hippocampus and stored in ventral prefrontal cortex, perhaps in areas 11, 12, 46 or thereabouts. Action will then be controlled from this area. It may not take so many trials to learn this, usually less than 50 for monkeys, and less than 5 for humans.

Routinization and the basal ganglia. After this, over a longer period of a few hundred trials for monkeys and ten or more for humans, there will develop a routinized form of action. In this, the specific sequence of states of perception and action will be overlearned by rote. They will be represented in the basal ganglia.

I expect that activity will now shift to the basal ganglia. However, I postulate that there will be residual control from prefrontal cortex. This will involve initiation of the routinized activity, monitoring and possible interruption of ongoing routinized activity and termination of the routinized activity, which I will call control decisions. I thus expect to see activity in prefrontal cortex, particularly while performing control decisions. This is consistent with the findings of Joaquin Fuster [Fuster, 1997] for example.

Routine and creative action. In any given situation, the system will probably have a routine response or routine responses available.

In addition, it can treat the situation as novel, or more on its individual merits, and analyze it and create a new solution, i.e., a representation of a new course of action. After all, every situation must be different from any previous situation in some way.

We cannot leave it up to simple competition between individual creative and routine actions; in general, the creative thought process will need to perform control decisions.

In order to make control decisions, the planning system will need a stream of information, originating in the routine system. It seems that the creative system will require more resources, more energy, and more realtime, than the routine system, to generate an action. However, once initiated, the routine system should be able to send and receive information rapidly, leaving the planning system to monitor its progress and effects in parallel.

Variability of routine action. By routine action, I mean where a stimulus is received as input and the routine system directly generates an action as output. Of course things are not that simple: (i) the input is actually not raw stimuli but information derived from stimuli. The early visual system V1, V2 and V3 is not connected to the basal nuclei, but TE0 is. (ii) the output action does not go directly to the muscles but to cortical areas such as motor cortex, prefrontal cortex, frontal eye fields and anterior cingulate cortex. Thus the action leads to further processing which eventually generates actual muscle contractions. Here, I am using the word “action” to mean an output from a cortical module, and not necessarily information sent to effectors.

This makes sense since there will be variability in the situation. For example, suppose you routinely put on your hat on leaving the house; then each day your hat may be in a different position, and there may be different hats. Also, these variations may form a continuous sequence; for example, putting on different jackets may involve continuous sensing of the sleeve and control of the arms by the routine action system.

Goals and subgoals. The clearest way I have found to think of this is in terms of goals. The plan generates subgoals which can be dealt with by routine action or by creative action. Routine action may also create simpler subgoals. This is diagrammed in Figure 25.1.

Figure 25.1: Subgoals in routine action

In addition, it seems to me that there can be responses to subgoals at lower levels of the perception-action hierarchy, as diagrammed in Figure 25.2.

These actions are knowledge/rule based, i.e., learned but can be quite limited in their activity.

An example of this is a perceptual subgoal, such as "which is the largest disk on the source peg?", which could cause actions to be generated by parietal regions which in

Figure 25.2: Subgoals in perception hierarchy

turn would evoke routine eye movement sequences (scanpaths) in the basal ganglia. This would generate saccade commands in the frontal eye fields which would elaborate through the lower level subcortical eye control systems to control the eyes.

The basal ganglia can be used without goals, but usually goals would be better, since otherwise actions inappropriate to the current intention might be generated. The phenomenon of utilisation behavior [Lhermitte, 1983] [Shallice et al., 1989] suggests that, on weakening or ablation of the controlling area, the lower area defaults to automatic control driven by perceptual input.

Two types of routine action. Thus, in some sense there are two different types of routine action, the first being cortical, learned, very specific actions, and the second being actions learned by the basal ganglia. I will, however, to avoid confusion, only refer to the second type as routine actions.

25.2 The basal ganglia

Figure 25.3, from [Noback et al., 1991] pp. 10-11, indicates the anatomy of the basal ganglia. They are usually called the basal ganglia but they should more correctly be called the basal nuclei as they are nuclei and not ganglia.

Figure 25.3: Basal ganglia

Figure 25.4, taken from the textbook edited by Haines [Haines, 1997], Figure 15-15, pp. 214, shows how this strange geometry of the basal ganglia has resulted from influence by the presence of the third ventricle of the brain. This also applies to the hippocampus and the amygdala.

Figure 25.4: The geometric influence of the third ventricle

Recent research by Garrett Alexander [Alexander et al., 1986] has shown that the basal ganglia are not an intermediate level of motor control between cerebrum and thalamus, but instead participate in four loops starting in the cortex and ending in the cortex:
(i) the sensory-motor loop, Figure 25.5(a), corresponding to routine motor action
(ii) the oculomotor loop, Figure 25.5(b), corresponding to routine eye movement
(iii) the association loop, Figure 25.5(c), corresponding to routine planning
(iv) the limbic loop, Figure 25.5(d), corresponding to routine goal setting.
These figures are from [Noback et al., 1991], pp. 385-388. These types of routine actions can run concurrently with each other, since they do not interact, other than by competition for energy.

Figure 25.6 shows the anatomical geometry of these loops, which is best seen in coronal section.

We can define the source areas, as in Figure 25.5, to be:
1. for SMA - arcuate premotor and premotor cortex
2. for FEF - dorsolateral prefrontal and posterior parietal
3. for DLC - posterior parietal and arcuate premotor
4. for LOF - STG, ITG and ACA
5. for ACA - HC, EC, STG and ITG
These areas can arguably be seen as being involved in preparation of higher-level information to be sent to the main output area of the loop.

Figure 25.5: Loops involving the basal ganglia. [Each loop runs from cortical source areas through the striatum (putamen, caudate nucleus or ventral striatum), the globus pallidus or ventral pallidum and the substantia nigra pars reticularis, and thalamic nuclei (ventral lateral, ventral anterior, dorsomedial), back to a frontal target area: the supplementary motor cortex (area 6) for the sensory-motor loop, the frontal eye field (area 8) for the oculomotor loop, prefrontal cortex (areas 9 and 10) for the association loop, and the anterior cingulate gyrus (area 24) for the limbic loop.]

Figure 25.6: Basal ganglia loop shown in coronal section

25.3 Learning by the basal ganglia

In learning by doing, the basal ganglia "record" the activity of their source modules during the search for a solution.

Figure 25.7 shows the association loop mapped onto the perception-action hierarchy.

Figure 25.7: The association loop mapped onto the perception-action hierarchy

So what are the members of the association loop typically doing?
(i) PPC, the posterior parietal cortex, is concerned with visual perception, creating nonegocentric maps.
(ii) PreMC, the premotor cortex, generates descriptions of motor actions at the intention and coordinate level.
(iii) PFC, the prefrontal cortex, generates plans and actions at the relation level.

One could write the association as: if PPC and PreMC and PFC then PFC, so this assertion is upwards, from more concrete towards more abstract information.

Figure 25.8 similarly shows the sensory-motor loop. This again creates an upwards association.

Figure 25.8: The sensory-motor loop mapped onto the perception-action hierarchy

Thus I conclude that the basal ganglia build a plan based on lower-level inputs and outputs. The end result is rules in the basal ganglia, and also for example in PFC, of the general form:
in the basal ganglia: if PPC and PreMC and PFC then PreMCR
and in PFC: PreMCR is a possible action.

What I intend by this notation is that the basal nuclei generate the suggested routine action PreMCR for PreMC, but they do not send it to PreMC; instead they send it to PFC for permission. PFC may have an alternative creative action PreMCC to send to PreMC; however, it can perform logic to decide between sending PreMCC or PreMCR to module PreMC. Figure 25.9 shows how the association loop might work.

Figure 25.9: The association loop mapped onto the perception-action hierarchy, showing the choice in PFC between the creative action PreMCC and the routine action PreMCR
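A sketch of the arbitration this implies in PFC (the predicates, and the routine_mode fast path, are my own illustrative assumptions):

% sketch: PFC chooses between a creative action and the routine suggestion
select_action(routine(PreMCR), _Creative, PreMCR) :-
    routine_mode, !.                 % fast path: pass the routine action through
select_action(routine(PreMCR), creative(PreMCC), Action) :-
    (   prefer_creative(PreMCC, PreMCR)
    ->  Action = PreMCC
    ;   Action = PreMCR
    ).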

Thus according to this theory the upper module is the regulator or controller of the routine action.

One main consideration is that in situations warranting routine action, this can proceed quickly. Thus PFC would simply select PreMCR without doing much or any computation. This is to allow a rapid stream of actions to be generated by the basal ganglia and sent to module PreMC.

25.4 The interleaving of routine and creative action

A proposed arrangement for interleaving. The plan has to retain supervisory control [Sheridan, 1992] over routinized action. Thus, one would expect some arrangement such as that shown in Figure 25.10.

Figure 25.10: Possible arrangement for monitoring of routine action by the planning module

The work of Sherman and Guillery on the thalamus. The thalamus is also the main route by which the basal nuclei send information to the cortex. Recent research by Murray Sherman and Ray Guillery [Sherman and Guillery, 2001] has shown the arrangement in Figure 25.11. Connections to the thalamus fall into two types, drivers and modulators. Drivers feed information, for example from ascending sensory tracts, and modulators modulate and control the flow of information. The majority of inputs to the thalamus are modulators. The thalamus receives a lot of inputs from the cortex and sends most of its outputs to the cortex. It thus can provide an important corticocortical communication function.

Figure 25.11: Driver and modulator connections of the thalamus

The thalamus and interleaving. This suggests a possible role for the thalamus. If we put our diagram of the basal ganglia loops, Figure 25.5, together with our diagram of the thalamus connections, Figure 25.11, we get Figure 25.12. This includes a variant of our desired Figure 25.10, our idea of creative control of routine action.

Most of the thalamus is concerned with receiving a wide range of sensory, subcortical and cortical inputs and sending outputs to the cortex. It is only the limited parts, mainly the ventral group of thalamic nuclei that are involved in the basal ganglia loops and receive inputs from the globus pallidus and substantia nigra.

The evolution of the thalamus. One can speculate that the thalamus was originally a one-way router of data, and the data was not so important since it did not mediate vision or olfaction or motor control or motivation. As the pallium developed, the data became more important, and reciprocal connections were advantageous.

It is not obvious what kind of advantage or use the thalamus had. The channeling of all the different streams of data seems useful, since otherwise each part of the cortex would have to receive many different data streams. The regulation of data streams, perhaps being able to reduce the amplitude of one stream, could be useful in saving energy and also in avoiding disruption of cortical activity.

Figure 25.12: Composite diagram of cortex, thalamus and basal ganglia

The basal ganglia provided a learning mechanism for stimulus-action sequences, and as cortical mechanisms and hippocampal learning mechanisms were added, the basal ganglia loops developed, using the thalamus as a conduit and control mechanism.

The ability of the thalamus to provide a place where cortical modulation could be brought to bear on data streams seems useful, and it could provide a fast channel for routine action to be communicated to parts of the cortex where it would be useful. This idea assumes that the cortex could be slow, with knowledge evocation and iteration to quiescence, whereas routine action from the basal ganglia could be a simple association and therefore faster. Then the thalamus would also be fast, simply combining the driver and modulatory streams using a combinatory (i.e., non-looping) neural network.

In situations requiring full evaluation of the proposed course of action, the system might have to reach a viable state, which could take 300 milliseconds. Routine action, on the other hand, can be used for faster actions such as steering a bike.

Proposed mechanisms. Thus our idea is as follows:

1. The thalamus combines driver inputs, modified by and conditional upon modulator inputs, and generates outputs to the cortex.
2. This is relatively fast, not causing much delay.
3. In the case of basal ganglia inputs, it provides a fast throughput of action data from the basal ganglia to the output effectors to produce continuous routine action in real time.
4. The routine action generated is a combination of (i) basal ganglia driver information, (ii) cortical driver information from the action area, and (iii) cortical modulator information from the action area.
5. This allows basal ganglia routines to be adapted to and used in a wider range of situations.
6. The arrangement also allows continuous monitoring of the routine action stream by the action and precursor areas. We will call these the monitoring areas.
7. It also allows routine action to be suspended or terminated easily by the monitoring areas, since they can simply send appropriate modulatory information to block the routine data stream.
8. The monitoring areas operate concurrently with the routine action and can generate alternative actions as required.
9. Thus, the monitoring and planning areas: (i) generate possible actions, (ii) receive information on possible alternative routine actions, (iii) make the choice of which action to use, and (iv) monitor the chosen action.
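To make point 1 concrete, here is a minimal sketch in Python of a combinatory, non-looping relay; the function names, the vector representation of streams, and the multiplicative gating rule are all my own illustrative assumptions, not a claim about thalamic coding.

# Minimal sketch of a combinatory (non-looping) relay: the output is a
# feedforward combination of a driver stream gated by modulator streams.
# The multiplicative gating rule is an assumption made for illustration.

def relay(driver, modulators):
    # driver: list of activations, e.g. routine action data from the
    # basal ganglia; modulators: list of gain vectors, e.g. from cortical
    # layer 6, where a gain near 0 blocks a channel and 1 passes it.
    output = list(driver)
    for gains in modulators:
        output = [x * g for x, g in zip(output, gains)]
    return output

# A monitoring area can suspend routine action on one channel by sending
# a zero gain for it (point 7 above).
routine_action = [0.9, 0.1, 0.4]
cortical_gate = [1.0, 1.0, 0.0]
print(relay(routine_action, [cortical_gate]))    # [0.9, 0.1, 0.0]

Because the relay contains no loop, its delay is a single feedforward pass, which matches point 2.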

Cortical mechanisms for interleaving. One issue is how learned knowledge and plans in the planning areas can accommodate routine action. Since it is unlikely that the system would learn this separately for each plan, there is probably a uniform mechanism.

If a plan operates by generating goals, then generating actions and appraising them, then routine actions could simply be mixed in and appraised.

If we do not use predictions, then the system would simply appraise suggested actions:

rule 1: if goal and pattern1 and body1 then action1
rule 2: if goal and pattern2 and body2 then action2
routine: if goal and pattern3 then action3

It would then be up to action filtering to select which action to send on. Action filtering could use predictions. The goal could be generated by the precursor areas.
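A sketch of this arrangement follows, with the rules above reduced to pattern tests and action filtering reduced to picking the best-appraised proposal; the appraisal scores and pattern names are invented for illustration.

# Illustrative sketch: planned rules and a routine action all propose
# actions, and an action filter selects one. Patterns are reduced to set
# membership and appraisals to fixed scores, both invented.

def propose_actions(goal, percept):
    proposals = []
    if goal and "pattern1" in percept:
        proposals.append(("action1", 0.6))    # rule 1
    if goal and "pattern2" in percept:
        proposals.append(("action2", 0.5))    # rule 2
    if goal and "pattern3" in percept:
        proposals.append(("action3", 0.8))    # routine suggestion
    return proposals

def action_filter(proposals):
    # Here the filter just picks the best appraisal; it could instead
    # use predictions of each action's outcome.
    return max(proposals, key=lambda p: p[1])[0] if proposals else None

print(action_filter(propose_actions(True, {"pattern1", "pattern3"})))  # action3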

Part V

Extensions

Chapter 26

Vision

Abstract. I list some psychological phenomena that we would like any model of the human visual system to be able to exhibit. This list is based on a review by Ann Treisman and Nancy Kanwisher [Treisman and Kanwisher, 1998], and a review of results on consciousness by Kanwisher [Kanwisher, 2001].

In addition, I describe how vision subserves problem solving, and how it works with episodic memory. I suggest incorporating Stephen Kosslyn’s ideas on modeling mental imagery.

I then outline my proposed design for a logical system model of the human visual system. I present arguments as to why it is a reasonable representation of the known neuroanatomy of the visual system, and why it should exhibit the listed psychological phenomena.

I explain how all the different mechanisms, i.e., vision, problem solving and imagery, are in fact needed to explain the use of vision in the Tower of Hanoi problem. Saccadic vision is used, as is the computation of object representations and spatial relations among objects, and mental imagery is needed to determine the legality of proposed moves.


26.1 Introduction

I’ll first briefly review current thinking on the human visual system, using ideas obtained from neuroanatomy of the rhesus monkey, psychological experiments using tachistoscopic presentation, transsaccadic observation, observation of visual episodic memory, imaging studies, and reports on consciousness correlated with imaging.

26.2 The neuroanatomy of the visual system

Figure 26.1 shows the visual areas of the human brain alongside those of the monkey brain, since a lot of neuroanatomical information is taken from the monkey brain. Figure 26.2 shows the retinotopic mapping onto area 17, emphasizing foveal vision and also placing peripheral vision more medially.

Figure 26.3 is a summary figure from a review by David Van Essen and Jack Gallant [VanEssen and Gallant, 1994] which shows the different areas and the different streams of visual information. There are basically three sensory-information streams, P (parvocellular), M (magnocellular) and K (other), which originate in the retina and maintain their identity as they are processed by the thalamus (LGN), then 17 (V1) and 18 (V2).

26.3 The psychology of vision

The initial concept of an information channel as a model of perception was due to Broadbent. Ann Treisman then developed her feature theory of visual perception and attention [Treisman and Gelade, 1980] [Treisman, 1988].

Her idea is that there are a set of parallel filters which extract orientation, color and motion feature images, and then one object is perceived by attention being directed to it. This causes an object representation to be constructed and the corresponding features bound to it. Other objects in the field of view but not at the fovea cannot be perceived properly; subjects make binding errors. Figure 26.4 diagrams the idea.

Figure 26.1: Visual Brodmann areas for the monkey brain (left) and the human brain (right)

Figure 26.2: Retinotopic mapping of retina onto V1

Figure 26.3: Early vision modules and functioning - from Gallant and Van Essen 1995

Figure 26.4: The Treisman psychological model of early vision - from [Treisman, 1988]

Ann Treisman and Daniel Kahneman introduced the concept of object-file, which is a representation of objects when attended to [Kahneman and Treisman, 1984] [Kahneman et al., 1992]. Treisman has also been investigating the connections between object perception and spatial perception [Treisman, 1993] [Friedman-Hill et al., 1995].

Nancy Kanwisher’s work on repetition blindness [Kanwisher, 1987] pointed out the distinction between the visual perception of object types and their possibly multiple tokens, or instances.

Imaging evidence for a parahippocampal place area for the background spatial environment was reported by Epstein and Kanwisher [Epstein and Kanwisher, 1998], although it seems to be mainly involved in the learning of events [Epstein et al., 1999]. Other visual regions have been reported, for example for faces [Kanwisher, 2000] and objects [Kourtzi and Kanwisher, 2000]. Kanwisher has recently reviewed the imaging evidence for area activations accompanying consciousness [Kanwisher, 2001].

26.3.1 Representations of the percept

The first stages of vision are retinotopic and egocentric and occur in areas V1, V2 and V4. After this, the object representations are formed; these are more conceptual and are nonegocentric. The idea is that there is a spatial context which is represented independently of the objects in it, something similar to Epstein and Kanwisher’s place area. So the percept has at least two parts: the representation of objects in one store, the spatial context in another store, and bindings between them. There are actually several areas in the posterior cingulate and in the precuneus, and other medial areas such as the retrosplenial gyrus, as well as the parahippocampal area, which all seem to be involved in representations of space, in spatial memory and in visual imagery. These areas have connections to the hippocampal complex and also to the early visual areas.

V4 in humans is mainly on the inferior temporal lobe on the undersurface. It certainly seems to be the main area where shading and orientation maps are generated and stored.

There is some debate as to exactly where color is processed. The standard position is that it is in V4, and that this includes computing the perceived color and compensating for the color of incident lighting. On the other hand, some authors believe there is an adjacent area, called V8, where color processing mainly takes place. It has been known for some time, since J. C. Meadows’s clinical work [Meadows, 1974], that color processing is probably adjacent to the processing of faces, since deficits in both tend to occur in the same patient.

It also seems that finding shape from edges may be a different mechanism than finding shape from shading.

It seems well established that motion is represented in MT. I assume this means motion edges in an image-like representation, but it could also involve moving visual regions. There is also some subcortical visual processing, studied under the phenomenon of blindsight.

Since human vision may well differ in various ways from that of the rhesus monkey, it is simply not yet known exactly what features are computed and exactly where. The Treisman feature model established that shape, orientation and color are processed separately. More recently, there is evidence for other “pop-out” features, such as regions with certain kinds of variable shading.

26.3.2 The total percept and consciousness

Kanwisher has recently reviewed the experimental findings for activation of different modules and their correspondence to different conscious experiences. The evidence suggests that different modules are activated as the corresponding different aspects of the percept are attended to.

Kanwisher suggested, following Desimone and others, that the conscious percept is distributed over several visual modules, the exact strengths in each being determined by current attention.

Thus the percept has a set of object-files, a set of object types, a spatial layout representation, and a set of feature maps, all in different modules and all bound together.
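As a data-structure sketch, such a distributed percept might be written as below; the field names and the binding-by-identifier scheme are mine, purely illustrative.

# Sketch of the multi-module percept: separate stores for object-files,
# object types, spatial layout and feature maps, bound together here by
# shared object identifiers. All names are illustrative assumptions.

percept = {
    "object_files": {1: {"type": "A", "features": {"red"}},
                     2: {"type": "B", "features": {"blue"}}},
    "object_types": {"A", "B"},                      # stored categories
    "spatial_layout": {1: (10, 20), 2: (40, 20)},    # nonegocentric positions
    "feature_maps": {"color": {"red": {1}, "blue": {2}}},
}

# Attending to one object retrieves its bound entries from every module.
attended = 1
print(percept["object_files"][attended], percept["spatial_layout"][attended])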

When the eyes fixate a new part of the visual world, this changes the early retinotopic part of the percept, but the later part of the percept is just updated.

26.3.3 Models of the visual system

There seem to be two complementary classes of model: those, such as the models developed by Tomaso Poggio and coworkers, which emphasize neural representation and the processes of extraction of various kinds of information, such as shading, texture and motion, and those, exemplified by Ilya Rybak’s model, which model eye movement and the overall process of vision within behavior.

Models of feature extraction. The work of Poggio’s lab at MIT includes reviews [Riesenhuber and Poggio, 2002] [Riesenhuber and Poggio, 2000] [Riesenhuber and Poggio, 1999], Max Riesenhuber’s thesis [Riesenhuber, 2000], Tomaso Poggio’s work with Shimon Edelman [Poggio and Edelman, 1990], and Thomas Vetter’s work on view-based models [Vetter et al., 1995].

Vision models include Kunihiko Fukushima’s Neocognitron [Fukushima, 1980], Bartlett Mel’s SEEMORE model [Mel, 1997], David Van Essen and coworkers’ models using cortical routing [Anderson and Essen, 1987] [Olshausen et al., 1993], Dana Ballard and coworkers’ model [Rao and Ballard, 1997], and David Mumford’s ideas on modeling the cortex [Mumford, 1992].

26.4 Phenomena to be modeled

Using the review by Treisman and Kanwisher [Treisman and Kanwisher, 1998] as an initial framework and guide, let me list a set of known phenomena. My goal will be to design and implement a vision system which exhibits the main phenomena in this list. It is hoped that modeling the vision system will provide new insights, allowing existing phenomena to be understood more clearly and precisely, making connections among the phenomena, and even suggesting new scientific questions.

26.4.1 Representations

Six representation types. Treisman and Kanwisher suggest a set of six different representations of the visual percept that could exist simultaneously in the brain, probably in different modules.

1. object token - viewpoint dependent, conscious, representation of the object as currently seen.

2. structural description - object centered, non-visually conscious, allows the object’s appearance from other positions to be predicted.

3. object type - the object’s identity or membership of a stored category.

4. knowledge representation - other knowledge about the object.

5. emotional and motivational representation - significance to the observer.

6. action representation - the role of an object in action, including location, size and shape relative to our hands.

Treisman and Kanwisher suggest that 1-4 depend on the ventral stream, 5 on the amygdala and 6 on the dorsal stream.

The Titchener size illusion. This does not affect dorsal representations [Milner and Goodale, 1995].

Emotional responses without conscious recognition. See [Whalen et al., 1998].

26.4.2 Timing phenomena

Discrimination of objects takes about 100-200 milliseconds, i.e., up to about 8 objects per second at the fovea [Potter, 1976].

Minimal time to onset of response. Inferotemporal activation can occur within 50 milliseconds from initiation [Wallis and Rolls, 1997].

Minimal duration of neural response for discrimination. This can be as low as 25 milliseconds [Wallis and Rolls, 1997].

Time to reach awareness. For a discriminated object to reach awareness may take a further 100 milliseconds.

26.4.3 Object tokens

Interactive segmentation generates tokens [Vecera and Farah, 1997].

Object specific priming [Kahneman et al., 1992]
(i) Objects prime for the same objects at different locations and in motion.
(ii) Objects prime for objects with the same name.
(iii) Objects do not prime for other objects whose names are synonyms or whose types are semantic associates.

Repetition blindness
(i) A changed object repeated very soon after the original is not seen [Kanwisher, 1987].
(ii) Changed objects in an apparent motion sequence are not seen [Chun and Cavanagh, 1997].
(iii) Novel objects are not seen [Arnell and Jolicoeur, 1997].

Attentional blink
(i) See [Raymond et al., 1992] [Chun, 1997].
(ii) The blinked item primes.
(iii) Shows that objects may activate types before completing object-files for awareness.

Repetition blindness for location. See [Epstein and Kanwisher, 1999].

Attentional masking.
(i) See [Enns and Lollo, 1997].
(ii) With a simultaneous mask with delayed offset [Lollo et al., 2000] [Lollo et al., 2002].

26.4.4 Attention and awareness

Object identification
(i) occurs without attention or awareness [Luck et al., 1996]
(ii) causes activation
(iii) causes priming
(iv) the object’s representation is viewpoint dependent [Stankiewicz et al., 1998], two-dimensional, with no interpretation of occlusion and no amodal completion [Treisman and DeSchepper, 1996]
(v) there can also be extreme inattention
(vi) there can be inattentional blindness [Mack and Rock, 1998] (however the object still primes words and emotional responses).

Clinical neglect
(i) Representations are formed,
(ii) including illusory contours [Mattingley et al., 1997] and filled-in surfaces;
(iii) therefore there is normal perception before the stage where the pathology causing neglect occurs.

Inaccurate binding of features
This occurs if the object is not attended to [Treisman and Gelade, 1980].
(i) One needs to pay attention to an object to bind its features accurately.
(ii) Spatial attention may be needed if more than one object is present [Robertson et al., 1997].
(iii) Balint’s syndrome includes simultanagnosia and inaccurate feature binding.
(iv) Extinction follows unilateral parietal lesions [Driver, 1996] [Baylis et al., 1993].
(v) There may be implicit knowledge by access from features to types.
(vi) The lower-half visual field is more difficult for crowded multiple objects, due to greater parietal projection to the lower cortical field [He et al., 1996].

Global attention. Amodal completion for homogeneous displays [Rensink and Enns, 1998].

Preattentive object-files
(i) These seem to act as holders for collections of attributes together with part-whole relationships [Wolfe and Bennett, 1997].
(ii) There is reversion to a preattentive object-file from object-file form after removal of attention [Wolfe, 2000].

Changes need focused attention
(i) Transsaccadic changes need attention to detect them [McConkie and Currie, 1996].
(ii) Attention is also needed to detect changes in alternating scenes [Rensink et al., 1997] [Simons, 1996] [Simons and Levin, 1997].

The dorsal route has representations of orientation, size and motion only [Carey et al., 1996] [Faillenot et al., 1997].

The role of the amygdala may be important.

26.4.5 Object types

View dependence or view-independent structural representations
(i) gradients of generalization around 2D views [Hayward and Tarr, 1997]
(ii) perception of rigidity [Sinha and Poggio, 1996]
(iii) interpolation process to determine rotation [Kourtzi and Shiffrar, 1997]
(iv) view-independent neurons [Logothetis and Sheinberg, 1996]
(v) object component neurons [Fujita et al., 1992]
(vi) errors support structural representation [Fiser et al., 1996] [O’Kane et al., 1995] [Martin Lades et al., 1993]
(vii) Hummel and Stankiewicz’s model [Hummel and Stankiewicz, 1996] has a role for both.

Orientation is dissociated from identity

Priming of both structural and view-dependent representations

Implicit priming is invariant across location, color, orientation and size.

Recognition is more specific

26.5 Eye movement control

Eye gaze, smooth pursuit, vergence and blinking are each controlled by different systems, and also there are separate lower level subcortical controls for horizontal and vertical eye movement [Haines, 1997] [Kandel and Schwartz, 2000]. Figure 26.5 shows the main subcortical areas and connections involved in eye movement control.

Figure 26.5: Eye movement control in the brain

Starting from a standard medical book treatment [Haines, 1997] of eye movement, looking at the main cortical areas and connections used by the brain in eye movement control, Figure 26.6 shows my idea of a hierarchical system for the control of saccades, which I show embedded in the usual perception-action hierarchy. The lowest level, “innate reflex”, is the superior colliculus (SC) and brain stem, which have basic innate eye movement behaviors, such as orienting to a moving object. The top level, “creative voluntary”, is control by explicit planning, where the subject deliberately looks toward some point. The middle level, “eye movement planning”, is controlled by the parietal eye field (PEF), which is basically the lateral inferior parietal area, or LIP, and provides a set of eye-movement plans which are evoked by a query or goal. Thus planning would send a query to look for an object which is red, and LIP would generate an eye movement plan for finding such an object in the scene. LIP can have learned routine eye movements and also more creative planning of eye movements. Possibly the level of learned “routine eye movements” also involves the basal ganglia, and these correspond to David Noton and Lawrence Stark’s concept of the learned scanpath [Noton and Stark, 1971]. The middle and top levels share a common output via the frontal eye fields (FEF).
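The middle level can be sketched as a generator of fixations driven by a query; the scene representation and names below are invented, and a real scanpath would of course be richer.

# Sketch of query-driven eye movement planning: a goal such as "find a
# red object" evokes a plan in a LIP-like module, which emits fixation
# commands via a FEF-like output until the query is satisfied.

scene = {"left": "blue", "middle": "red", "right": "green"}  # location -> color

def eye_movement_plan(wanted_color):
    for location, color in scene.items():
        yield ("fixate", location)        # command sent on via the FEF
        if color == wanted_color:
            yield ("found", location)     # answer returned to planning
            return

for step in eye_movement_plan("red"):
    print(step)   # ('fixate', 'left'), ('fixate', 'middle'), ('found', 'middle')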

26.5.1 Transsaccadic vision

There has been intensive work on determining what is remembered between saccades. Irwin [Irwin and Zelinsky, 2002] has put forward the theory that what is remembered are object files, which have fairly conceptual information, and not detailed visual information. He further suggested that subjects could only remember about four object files between saccades.

26.5.2 Models of vision including eye movement

The other class of models attempts to incorporate vision into a model of behavior [Rybak et al., 1998] [Carpenter et al., 1992] [Califano et al., 1990] [Deco and Schurmann, 2000]. Rybak’s model uses difference-of-Gaussians edge detectors for the foveal area and also for edges on the periphery of the image. From these it determines the next saccade to make, using a scanpath representation, which leads to perception of an object.

Figure 26.6: Our concept of hierarchical eye movement control in the brain

26.6 Problem solving, perceptual queries, and top-down attention

Vision plays an important role in problem solving. Indeed, in some sense, vision subserves the overall activity of the person, which can be seen mainly as solving problems. My model provides a basis for incorporating vision into problem-solving activity. Figure 26.7 shows how eye movement fits into problem-solving activity for the Tower of Hanoi problem, discussed in chapter 16. This figure also shows communications to and from perceptual modules such as the object relations and object action modules.

Figure 26.7: Organization of Tower of Hanoi strategy showing perceptual goals
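In the same spirit as Figure 26.7, here is a toy sketch of a legality query flowing from the plan to perception. The peg contents and the direct reading of the top disk stand in for saccades and module traffic, and are my own simplifications; only the predicate name test_legal(move(d,p)) comes from the figure.

# Toy sketch of the test_legal(move(d,p)) query from Figure 26.7.
# pegs maps each peg to its disks, bottom to top; looking up the last
# element stands in for fixating the peg and reading off the top disk.

pegs = {"peg1": [3, 1], "peg2": [], "peg3": [2]}   # invented position

def test_legal(disk, peg):
    top = pegs[peg][-1] if pegs[peg] else None     # fixate(p), read top disk
    return top is None or disk < top               # smaller disks go on larger

print(test_legal(1, "peg3"))    # True: disk 1 fits on disk 2
print(test_legal(3, "peg3"))    # False: disk 3 is larger than disk 2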

26.7 Vision and mental imagery

Stephen Kosslyn, in two books [Kosslyn, 1980] [Kosslyn, 1994] and numerous papers, has explicated mental imagery to the point of defining a model quite precisely, from which some limited implementations have been produced. Figure 26.8 diagrams his model, taken from [Kosslyn, 1994].

Figure 26.8: Stephen Kosslyn’s model for mental imagery

In addition, he has argued cogently for sharing of subsystems between visual perception and mental imagery, and has supported his argument with PET and MRI data [Kosslyn et al., 1993].

26.8 Vision and episodic memory

As mentioned in chapter 22, the anatomical connections to and from the hippocampal complex, for the rhesus monkey, have been described by Kobayashi and Amaral, and are summarized in Figure 22.6.

The psychological connections between short-term visual phenomena, short-term visual memory, and episodic memory have been investigated by Andrew Hollingworth and John Henderson [Hollingworth and Henderson, 2002]. They concluded that episodic memory for scenes included greater visual detail than object files do, and hence they postulated a further store which they called LTM. Thus they have VSTM, CSTM and LTM. VSTM is the well-known short-term visual buffer, CSTM is the short-term conceptual store which contains three or four object files transsaccadically, and LTM is the longer-term memory.

I would identify their LTM with an episodic memory, and also a perceptual module higher but still within V4, since it concerns visual memories from the same session but longer than those retained in CSTM. I would further identify this episodic memory with the hippocampal complex.

My idea of the action of the hippocampal complex, discussed in chapter 24 on learning by doing, is that it continuously forms and maintains a memory of the current episode, which is the last 15 minutes or so of events. The episode consists of a set of key events with relations among them. The content of an event is related to the connections in the Kobayashi and Amaral diagram. This is based on the brain’s representation of its perception and not its action. Also it is dominated by vision. We also see that there are two distinct visual inputs to the hippocampal complex, one from V4 to the parahippocampal area and one from TE0, etc., to the perirhinal area. I would tentatively identify these with a shape, color and motion representation from V4, and an object-file representation from TE0, etc. These would be combined in entorhinal cortex to produce the episodic representation of visual information. This would be separate from the more immediate visual representation which is investigated tachistoscopically in most experiments. This episodic visual memory could be retrieved from the hippocampus, and/or re-evoked in V4 and TE0, and used to answer queries.

Incidentally, Kobayashi and Amaral also summarize subcortical connections to the hippocampal complex, showing the source of affective and other information for episodic memory, and visual episodic memory in particular. Of course there are also subcortical inputs directly into temporal areas.

26.9 The interface between the visual system and the core brain model

It seems useful, for extensions to the core model, to try to define, as clearly and precisely as possible, the interface between the extension subsystems and the core model, as described in Parts II and III. This interface will have two main parts, the interface to the perception-action hierarchy, and the interface to the learning modules. Let me try to do this for a proposed vision extension.

1. Interface to the perception-action hierarchy.
(i) The visual system inputs information to core visual modules. This is much more differentiated information than is used in the more abstract core model.
(ii) The visual system receives semantic memory information from temporal areas.
(iii) The dorsal visual modules of the vision system input information to the lower levels of the planning and action hierarchy.
(iv) The visual system generates information for eye saccade generation.

2. Interface to the learning modules.
(i) Episodic memory. Information is provided by the visual system to the hippocampal complex. This is in two parts: 3D sketch information from TE0 to the parahippocampal area, and object-file information from TE to the perirhinal area.
(ii) Procedural memory. The visual system provides visual information regarding the current situation to the procedural memory system: from medial and lateral temporal cortex and from entorhinal cortex to the limbic loop, and from posterior parietal cortex to the oculomotor and association loops; see Figure 25.5 and chapter 25.

26.10 My approach to the visual system

26.10.1 My proposed contribution

What my approach to modeling brings to the research effort in vision is its ability to provide a framework for the entire system of which vision is a part. This allows us to understand the role of the visual system in the behavior of the brain as a whole and of the person. It gives us:

1. The generation of topdown attention, from the planning module and the current plan step being executed.
2. It generalizes attention to include not only what object or spatial region the visual system should focus on but also the request to obtain certain specified information from the scene, for example, whether one object is next to another.
3. It allows us to integrate the ability for mental imagery into the visual system. The work of Stephen Kosslyn has already developed a detailed description of the data structures and operations involved in mental imagery. This provides for the construction of mental images, possibly combining them with the current visual percept, and the answering of queries about the synthesized image, by examining it.
4. It allows us to connect the visual system with episodic memory. Although the contribution from the visual system to the representation of the current event is mainly as object files and their spatial relations, there is also the ability to contribute the current 3D sketch as required, as discussed previously and as diagrammed in Figure 22.6.

In addition, my abstract logical system approach leads to a different type of description of the visual system. The system is described as a set of modules, and the data communicated between and stored in modules is described in terms of the information it contains, without committing to any particular coding of that information.

26.10.2 An example of visual system behavior

To make things clearer, let us consider the role of vision in the solution of the Tower of Hanoi problem.

First, the visual system is used to find out what objects are there and how they relate to each other spatially. This is done by scanning the scene by saccades, and generating object files for each object. Since the system can only maintain a strong representation of four to five object-files, the system probably chunks the objects into sets, such as all the disks on one particular peg except the top one, and this would form one object-file. Thus, during the solution of the problem, the visual system has to repeatedly revisit the scene in order to revive and re-represent particular disks and their positions. Most subjects would not memorize the scene and then solve it with their eyes closed; however, they may remember certain properties and positions even with their eyes open, so returning to and re-perceiving a part of the scene may not require the same processing as the first time.

Then the planning system starts trying to solve the problem. As it does so it may require certain information about the current state of the scene, such as which is currently the top disk on a given peg, or whether another peg is clear and, if not, which disk is the top one on it. In order to answer these queries, the visual system will saccade to certain places, and TE and other areas will process the information to answer the particular current query.

When a move is decided upon, of course the visual system will be used to guide the subject’s hand to the disk to be moved, then to monitor its movement and placing at its new position. One can argue that this is handled by the dorsal “where” visual system.

The other thing that happens, however, is that mental imagery is used. I have actually studied this in a little detail together with David DeVault. We postulated that in order to decide whether a given proposed move was legal, the subject formed a mental image of the end result of moving the disk, and then observed this mental image and read off whether the moved disk was smaller or larger than the top one on the target peg. Thus we postulated that the subject constructed a mental image by making a mental image of the disk to be moved, and then combined this image with the visual percept. In order to test this hypothesis, we asked subjects a series of legal-move questions, and we observed their eye movements. We asked questions “Can you move disk 1 to peg 3?” mixed in equal numbers with questions asked in the reverse order, “Can you move to peg 3 disk 1?”, in order to avoid a bias, due to the order of verbal presentation, to attend to the disk first and then the peg afterwards. What we found was that the subjects always looked first at the disk for a short time, one fixation or so, and then looked at the target peg for a longer time. This data thus supported our theory of mental image formation during solution of the Tower of Hanoi problem.

Another thing emerged from this study. We had first tried asking all the questions about the same Tower of Hanoi position, and we found that only the first and possibly second questions showed the effect described. After this, the effect went away. In the above experiment, we changed the position for each question. Our conclusion from the disappearance of the effect was that the subject was accumulating knowledge about the position with each question about it and therefore could then answer the query without forming a mental image. After all, once you know the two or three possible legal moves in a given position, you can also answer all other move-legality questions.

26.11 A proposed model of the visual system

26.11.1 Areas and mechanisms to be modeled

Putting everything together, we end up with an extended system as diagrammed in Figure 26.9. Control theorists may avert their eyes.

My model will then represent brain areas in this diagram by processing modules, and anatomical connections by communication channels. However, I will not initially model the dorsal visual subsystem, the spatial layout memory in the posterior cingulate, or the detailed structure of the hippocampal formation.

26.11.2 The mechanism of the model for the Treisman process

Figure 26.9: The main brain areas of the visual system

Figure 26.10 is my attempt to depict the different kinds of information involved during perception. The initial scene is perceived as an image which has a foveal area which is much smaller than the total image and yet has a lot of information, including most of the color. This is the information in V1. What I am calling the peripheral image is a representation of the total image but with less information. To indicate the different amounts of information, I took the size of the total image to be 600x600 pixels if we ignore the foveal part, so the peripheral representation will also be 600x600 pixels. However, the foveal part alone is also 600x600 pixels, so I made the different images in the diagram have these numbers of pixels. I then tried to show different aspects of the foveal image as they might be processed by visual areas. The first is a sharpened image, the second is an intensity edge image and the third is a spatially blurred color image.

Figure 26.10: An image and its different components during processing

Figure 26.11 shows how the perceptual mechanism might work. The key is to (1) assign coordinate frames to modules, and (2) make attention be represented as part of a nonegocentric frame.

Figure 26.11: The different modules and their data during perception

The figure shows what happens in a Treisman experiment where a red A and a blue B are presented. It is assumed these are in the foveal area or an area surrounding and close to the fovea, which we will call the area of focus. So both letters are within this area of focus.

They are processed by two modules. The color module loses some spatial determination, which I have diagrammed by using large images of the letters. The intensity edge module I take to ignore color but to provide good spatial resolution information. This perception process happens during one saccadic fixation, a period of about 300 milliseconds only, before being replaced by an image from the next fixation.

Note that these two modules both represent information in the retinal frame. This means that the contents can be accessed by giving information specifying spatial locations on the retina. I represent this by attaching explicit retinal coordinates to each data item in their stores, although other representations are possible.

Meanwhile, from the peripheral image, which is at first in retinal coordinates, the third-stage module generates or updates a representation in the nonegocentric frame, which includes some 3D information. This nonegocentric representation is updated as the different saccadic fixations occur; it is updated by the incoming peripheral images.

The color and edge modules combine information into the next module, which H. Branch Coslett [Coslett and Saffran, 1991] calls the visual-analog buffer, but which I will call the visual-focus buffer. The representation is a mixture of spatially resolved, color-unresolved edge information and spatially unresolved, color-resolved information. It is also still in the retinal frame.

The system now uses this information to generate or update a representation of one attended-to object, which it puts in the next module, which I will call the visual-object buffer but Coslett calls the visual buffer. Thus we are paying attention to the red A, and this information from the visual-focus buffer is used, along with spatial information from the 3D representation, to form the object-file data which is part of the contents of the visual-object buffer. The information concerning the blue B is ignored and lost. The visual-object buffer uses a nonegocentric frame, and it accumulates object files as different saccades occur. The representation however decays rapidly and there cannot be more than 4 or 5 object files in it at any one time.
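A sketch of such a buffer follows; the decay rate, threshold and capacity are invented values chosen only to show the behavior.

# Sketch of a visual-object buffer that decays each cycle and holds at
# most four object-files at a time. DECAY, THRESHOLD and CAPACITY are
# assumed values for illustration.

CAPACITY = 4
DECAY = 0.8        # multiplicative decay per cycle
THRESHOLD = 0.1    # items weaker than this are forgotten

def update(buffer, newly_attended=None):
    for obj in list(buffer):
        buffer[obj] *= DECAY
        if buffer[obj] < THRESHOLD:
            del buffer[obj]                           # rapid decay of old files
    if newly_attended is not None:
        buffer[newly_attended] = 1.0                  # refreshed to full strength
        if len(buffer) > CAPACITY:
            del buffer[min(buffer, key=buffer.get)]   # weakest is displaced
    return buffer

buffer = {}                                  # object id -> activation
for obj in ["A", "B", "C", "D", "E"]:        # five successive fixations
    update(buffer, obj)
print(sorted(buffer))                        # ['B', 'C', 'D', 'E']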

The way attention works is as shown. The 3D representation which uses a nonegocentric frame continuously maintains a spatial representation of where the area of focus is relative to the total image and relative to the nonegocentric frame. Thus, from an attention signal in the nonegocentric frame it can generate an attention signal in retinal coordinates which can be used by the visual-focus buffer to pick out the attended-to object, in this case the red A. The red A has retinal coordinates xa,ya.
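A sketch of this routing follows; the translation-only mapping and the coordinates are invented, and a real mapping would also take eye, head and body position into account.

# Sketch of attention routed through frames: an attention signal given
# in the nonegocentric frame is converted to retinal coordinates using
# the maintained position of the area of focus. The translation-only
# mapping is an assumption made for illustration.

focus_origin = (100, 50)    # nonegocentric position of the area of focus

def to_retinal(nonegocentric_xy):
    x, y = nonegocentric_xy
    ox, oy = focus_origin
    return (x - ox, y - oy)          # retinal coords relative to fixation

# The red A sits at (110, 55) in the nonegocentric frame; the
# visual-focus buffer can now be indexed with its retinal coordinates.
xa, ya = to_retinal((110, 55))
print(xa, ya)                        # 10 5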

It is also quite possible that the visual-object buffer contains something more general than nonegocentric coordinates, involving spatial relations among objects. This would allow more rapid judgements where information about relations of objects relative to each other is needed. However, a default assumption would be that spatial relations are represented in a further, adjacent, visual module.

Chapter 27

Natural language processing

Abstract. I describe an initial study for a psycholinguistically and neurolinguistically plausible model of natural-language processing by the human brain. This model is based on the work of Gerard Kempen and coworkers at Leiden and Nijmegen, who have developed computational models of language generation and of language recognition. My model is implemented as a set of intercommunicating brain modules that run in parallel. These brain modules have the same structure and control regime as other nonlinguistic brain modules. They approximately correspond to Broca’s area and temporal lobe areas including Wernicke’s area.


27.1 Introduction

We can roughly divide natural language processing into syntax, semantics and pragmatics. In addition, there is the treatment of conversational interaction. I do not have much to say yet about any of these, certainly not semantics; however, I have started looking at syntax. My initial aim is to try to understand the modeling of syntactic processing in agreement with psycholinguistics, that is, with how humans process sentences, including their performance, errors and pathologies.

Chomsky’s theory. A lot of work on computational models of language processing has been based on the work of Chomsky, notably his Government-Binding (GB) approach to his Principles and Parameters theory [Chomsky, 1981] [Chomsky, 1986] [Cook and Newsom, 1996] [Sells, 1985] [Akmajian et al., 1995] [Riemsdijk and Williams, 1986] [Kimball, 1973b] [Kimball, 1973a].

Lexicalist approaches. Most other models are based on lexicalist approaches, notably head-driven phrase structure grammar (HPSG) [Pollard and Sag, 1994] and generalized phrase structure grammar (GPSG) [Gazdar and Pullum, 1982]. Another variant is lexical-functional grammar [Kaplan, 1995] [Kaplan and Bresnan, 1995]. For a comparative study of GB, HPSG and GPSG see [Borsley, 1991].

The work of Gerard Kempen. Gerard Kempen [Kempen, 1976] [Kempen, 1977] [Kempen, 1978] [Kempen and Hoenkamp, 1982] [Kempen and Huijbers, 1983] [Kempen and Hoenkamp, 1987] [Kempen, 1987] [Kempen, 1989] has developed an approach to language processing which is psycholinguistically plausible. He and his coworkers have specified and implemented both language generation systems [Carel Van Wijk and Gerard Kempen, 1987] [Smedt and Kempen, 1991] and language recognition systems [Kempen and Vosse, 1989] [Vosse and Kempen, 2000] on computers. This work is also influenced by Willem Levelt [Levelt, 1989] and other psycholinguists in the Netherlands.

Other models. Other psycholinguistic models of natural language processing include those of Lynn Frazier [Frazier and Fodor, 1978] [Frazier, 1990] [Frazier, 1998], Richard Lewis [Lewis, 1993], Matthew Crocker [Crocker, 1996], and Edward Gibson [Gibson, 1998].

Artificial intelligence. There is some influence from artificial intelligence, notably the work of Roger Schank and coworkers [Schank, 1975] [Schank and Abelson, 1977] [Schank, 1982a] [Dyer, 1983a] [Dyer, 1983b] [Hovy and Schank, 1984]. A collection of AI work on natural language generation is edited by Paris et al. [Paris et al., 1991].

Unification. Many contemporary approaches to natural language processing and grammar use unification as a basic operation [Kay, 1985] [Kay, 1992]. This also applies to work which is not psycholinguistic but simply linguistic.

Logic grammars. There is also influential work in logic programming approaches to natural language processing, notably the work of Veronica Dahl [Dahl, 1988] [Dahl, 1989] [Dahl, 1990] [Dahl, 1994] [Dahl, 1999], Fernando Pereira [Pereira and Shieber, 1987] and Patrick Saint-Dizier [Dahl and Saint-Dizier, 1985] [Dahl and Saint-Dizier, 1988].

Human linguistic performance. Descriptions of human linguistic performance are given by Andrew Ellis and colleagues [Ellis and Young, 1988] [Ellis and Beattie, 1986]. Merrill Garrett produced some important theoretical ideas and a first model [Garrett, 1980] [Garrett, 1988] [Garrett, 1995]. A main source of data and interpretation of linguistic disorders are the books of David Caplan [Caplan, 1992] [Caplan, 1987].

Kempen’s approach is psycholinguistic, in that it seeks to model human performance. The system should find sentences difficult to process that humans find difficult to process, it should tend to make the same mistakes that humans do, and, if we can model pathologies, it should exhibit the same degradations in performance that humans do. Theo Vosse and Gerard Kempen have applied their model to aphasic patients [Vosse and Kempen, 2000].

Modularity. Modularity in language processing is a subject of much debate, see [Fodor, 1983] [Karmiloff-Smith, 1992] [Smith and Tsimpli, 1995] [Fodor, 2000].

Brain architectural issues. In addition, Kempen has discussed cognitive and brain architectural issues in language processing [Kempen, 2000]. His approach differs somewhat from other views such as those of Lynn Frazier [Frazier, 1998].

Neuroanatomy of language processing. The neuroanatomy of auditory processing in primates has been reviewed by Jon Kaas et al. [Kaas et al., 1999], see also [Pandya, 1995] [Webster, 1992].

Imaging studies. In recent years, new insights into brain processing of natural language have been obtained with imaging [Fiez et al., 1996] [Zatorre et al., 1996] [Baum et al., 1990] [Petersen and Fiez, 1993] [Huckins et al., 1998] [Bilecen et al., 1998] [Binder et al., 1997] [Millen et al., 1995] [Dhankhar et al., 1997] [Hickok et al., 1997].

Deacon’s ideas. Terence Deacon has extensively reviewed the neuroanatomy of language processing and its evolution [Deacon, 1988] [Deacon, 1989] [Deacon, 1992] [Deacon, 1997].

Figure 27.1: Summary of imaging data for natural language processing, from [Deacon, 1997]

Figure 27.2: A suggestion for the natural language processing system, from [Deacon, 1988]

27.2 Kempen’s model of grammar

As an initial exploration, I defined and implemented a modular brain model based on Kempen’s psycholinguistic theory [Kempen, 1978] [Kempen, 2000], using my own logical system approach to brain modeling. I focused on Theo Vosse and Gerard Kempen (V-K)’s sentence recognition system [Vosse and Kempen, 2000].

Overview of the V-K sentence recognition system. The sentence is read in incrementally, word by word, and a structure description is constructed incrementally and dynamically, being updated after each word.

The lexicon. The approach is lexicalist, in that there is very little grammar that is not derived from the lexicon. Words are held as lexical frames, which are four-tiered unordered trees which are “mobiles”, i.e., there is no ordering among branches. Figure 27.3 depicts some frames. They have variables, such as np, dp, pp and s, at certain places which can be linked to form structure descriptions for sentences.

Grammatical features. Each variable has an associated set of features. For example, a noun phrase, type np, can have a set of feature values composed of subsets of the following feature sets: person - {first, second, third}, number - {singular, plural}, case - {nominative, accusative}. The set of features is called a matrix. A particular instance of a noun phrase might have, for example, [case = {nominative,accusative},number = {singular}, person = {third,first}]. There can also be features shared between the frame’s root and the frame components at its leaves.

Figure 27.3: Lexical frames

Shared grammatical features. For the root, however, we need to indicate which features are shared and with which frame component: [number([singular,plural], shared_with([fcname1,fcname3]))]. In the example given in Figure 2.2 of [Kempen, 2000], our Figure 27.4, s shares number and person with the subject frame component and gender with the dobj frame component; also, the hd frame component shares both.

Thus the complete lexicon entry for his example could be written something like this:

% Lexical frame for "saw": root features with sharing declarations,
% then one frame_component(Name, [Link,Strength], [Type,Features]) per leaf.
lexicon("saw", 1, 1.0,
    [s, [number([singular,plural], shared_with([subj,hd])),
         person([first,second,third], shared_with([subj,hd])),
         gender([masculine,feminine], shared_with([hd,dobj]))]],
    [frame_component(subj, [null,0.0], [np, [case([nominative])]]),
     frame_component(hd,   [null,0.0], [v,  [tense([past])]]),
     frame_component(dobj, [null,0.0], [np, [case([accusative])]])
    ]
).

Figure 27.4: Sharing of features, from Gerard Kempen’s book [Kempen, 2000], Figure 2.2


Multiple grammatical features. If there is more than one frame component of the same type, such as multiple adjectival modifiers, we can use [Name,Occurrence], with the convention that Name means [Name,1]. Using a list instead of a unique name would allow one to specify sharing of features with all occurrences of, for example, modifiers in a given frame.

Unification. The unification process used in this theory differs from head-driven phrase structure grammar and lexical-functional grammar in being nonrecursive and involving only feature unification. Two lexical frames are combined by unifying a variable from one frame with a variable from the other frame. Unification is an agreement check between two nodes that become unified. For example, to unify [case = {nominative, accusative}, number = {singular}, person = {third, first}] with [person = {first, second, third}, case = {nominative}], we proceed as follows [Kempen, 2000]:

1. For each shared feature type, find the intersection of possible values; thus, in the example, person and case are shared and number is not: [person = {first, third}, case = {nominative}].
2. If some shared feature type has an empty intersection, unification fails.
3. For nonshared feature types, keep the values the same: [number = {singular}].
4. Form the new matrix from the union of the results of 1 and 3: [case = {nominative}, number = {singular}, person = {third, first}].
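This procedure is direct to express in code. The sketch below represents a matrix as a mapping from feature types to sets of possible values; the function name and the representation are mine, but the steps follow the four rules above.

# Sketch of V-K feature-matrix unification. A matrix maps each feature
# type to a set of possible values; shared types are intersected, and
# an empty intersection makes unification fail.

def unify(m1, m2):
    result = {}
    for feature in set(m1) | set(m2):
        if feature in m1 and feature in m2:       # shared feature type
            values = m1[feature] & m2[feature]    # step 1: intersect
            if not values:
                return None                       # step 2: failure
            result[feature] = values
        else:                                     # step 3: keep nonshared
            result[feature] = m1.get(feature, m2.get(feature))
    return result                                 # step 4: the new matrix

a = {"case": {"nominative", "accusative"},
     "number": {"singular"},
     "person": {"third", "first"}}
b = {"person": {"first", "second", "third"}, "case": {"nominative"}}
print(unify(a, b))
# e.g. {'case': {'nominative'}, 'number': {'singular'},
#       'person': {'first', 'third'}}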

Word-order check. There is a word ordering process, and there must be consistency between the input order of words and their position in the structure description.

U-space. Unification space, or u-space, is where the structure description is formed. At each moment, u-space consists of a set of lexical frames. Lexical frames are linked by u-links, which represent unifications of variables present in these frames. Each u-link has an instantaneous strength and each lexical frame has an activation value, which are real numbers.

Weights.

1. Each node in a given lexical frame has the same activation value.
(a) Its initial value is taken from the lexicon, corresponding to usage, and
(b) it decays to zero: v_k(t) = v_k(t_0k) · d^(t − t_0k).

2. Each unification link (u-link) has a strength.
(a) Its initial value is 0; it increases spontaneously and may also be inhibited, and there is also noise added.
(b) Spontaneous increase. Each cycle, the increase is proportional to the activation values of the root and foot nodes the u-link connects:
u_i(t) = u_i(t−1) + p_r · v_root(i)(t) + p_f · v_foot(i)(t) + p_noise · random(t),
where p_r, p_f and p_noise are constants, independent of i and t.
(c) Competitive inhibition. Two u-links unifying with the same node inhibit each other; let I_ij = 1 if u-links i and j inhibit each other, and I_ij = 0 otherwise. The contribution subtracted from u_i(t) is weighted by v_foot(i)(t) and by the strengths u_j(t) of the competing u-links, with a constant term to diminish the sensitivity of a u-link to inhibition with time:
(p_constinhib + p_footinhib · v_foot(i)(t)) · sum_j u_j(t) · I_ij,
where p_constinhib and p_footinhib are constants.
(d) Co-inhibition. Two u-links from the same lexical frame enhance each other’s strength: the contribution to u_1(t) is p_co-inhib · u_2(t), and the contribution to u_2(t) is p_co-inhib · u_1(t).
(e) The apex node. v_apex is independent of time, and inhib_apex inhibits u-links that descend from the apex.
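As a numerical sketch of these dynamics: I read pcs and pfs in Figure 27.5 as the constant and foot-weighted inhibition parameters; that mapping, and the toy scenario of two u-links competing for one node, are my own assumptions.

# Numerical sketch of the u-link dynamics, using parameter values from
# Figure 27.5. The reading of pcs/pfs as p_constinhib/p_footinhib, and
# the scenario of two u-links competing at one node, are assumptions.

import random

d, p_r, p_f, p_noise = 0.98, 0.84, 0.97, 0.18
p_constinhib, p_footinhib = 0.65, 0.46            # read as pcs and pfs

def step(u, v_root, v_foot, competitor_strengths):
    u += p_r * v_root + p_f * v_foot + p_noise * random.random()
    u -= (p_constinhib + p_footinhib * v_foot) * sum(competitor_strengths)
    return max(u, 0.0)

u1, u2 = 0.0, 0.0          # two u-links competing at the same node
v = 1.0                    # node activation, shared here for simplicity
for t in range(20):
    u1, u2 = step(u1, v, v, [u2]), step(u2, v, 0.5 * v, [u1])
    v *= d                 # activations decay toward zero
print(u1 > u2)             # the better-supported u-link wins: True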

Vosse and Kempen determined appropriate values for the various parameters, given in Figure 27.5, by fitting the performance of the model to empirical data for normal subjects.

parameter     value    parameter   value    parameter    value
word tmin     7        d           0.98     napex        0.09
word tmax     35       pnoise      0.18     co-inhib     0.08
fin tmin      9        pr          0.84     pfs          0.46
fin tmax      55       pf          0.97     inhibapex    0.05
pthreshold    1.83     pcs         0.65

Figure 27.5: Normal parameter values determined by Vosse and Kempen

Global integrity conditions. In addition, there are overall “integrity conditions” that must hold of any successful parse tree. Two u-links are in violation of integrity if:
(i) both try to attach to either the same root or the same foot node;
(ii) they unify two different lexical entries associated with the same input word;
(iii) two root nodes try to u-link to two different foot nodes of the same lexical frame and violate each other’s word order rules;
(iv) the foot and root node of a u-link unify with the root and foot nodes of another, thereby creating a loop;
(v) they lead to crossing branches.

Dynamic creation of structure descriptions. In a language recognition regime:
(i) a new input word generates a lexical frame, or frames, which is (are) added to the u-space, with initial activation values taken from the lexicon;
(ii) u-links are created from the root node(s) of the new lexical frame to all existing matching foot nodes, and from all existing and matching root nodes to the foot nodes of the newly entered lexical frame, with initial strength 0;
(iii) then the activation values of all frames are decremented; and
(iv) then the strength values of all u-links are updated according to a competitive inhibition process, and also as a result of the application of certain global integrity conditions.
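A loop-level sketch of this regime follows; lexical lookup, root/foot matching and the strength update are all stubbed out (every earlier frame is treated as a candidate), so only the flow of steps (i)-(iv) is shown.

# Sketch of the recognition regime as a loop over input words. The
# activation values and the flat frame representation are invented;
# the comments mark which step of (i)-(iv) each part stands in for.

lexicon_activation = {"the": 0.5, "woman": 0.9, "sees": 1.0}   # invented
DECAY = 0.98

def recognize(sentence):
    u_space = []    # list of [word, activation] standing in for frames
    u_links = []    # list of [frame_a, frame_b, strength]
    for word in sentence.split():
        u_space.append([word, lexicon_activation.get(word, 0.5)])  # (i)
        for frame in u_space[:-1]:                                 # (ii)
            u_links.append([frame[0], word, 0.0])
        for frame in u_space:                                      # (iii)
            frame[1] *= DECAY
        for link in u_links:                                       # (iv)
            link[2] += 0.1   # stands in for the competitive dynamics
    return u_space, u_links

print(recognize("the woman sees"))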

Competitive inhibition. U-links overlapping at the same node will inhibit each other by an amount linearly related to their current strengths and to the activation values of the nodes involved.

An example of sentence recognition. An example is given in the V-K paper [Vosse and Kempen, 2000] of recognizing the sentence “The woman sees the man with the binoculars”. Figure 27.6 shows one step during construction of the structure description, at the point where “The woman sees the man with” has been received.


Figure 27.6: Step during construction of structure description

Competing u-links can be seen. By inhibition, the process will eventually settle to have unique u-links, which gives the structure description of the input sentence.

27.3 Vosse and Kempen’s results

The performance of their model on thirty sample sentences, taken together, exhibits a large portion of reported psycholinguistic phenomena, including garden path sentences, nesting of clauses, word ambiguities and so on; see Figure 27.7.

no sentence t % type % description

1 The rat the cat chased escaped S 100 self-embedded clause, single nesting, small number of ulinks 2 The rat that chased the cat escaped S 100 single embedding 3 The rat that the cat chased escaped S 100 single embedding 4 The rat the cat with the binoculars chased escaped S 100 double embedding, easy, “with” attaches to left 5 The rat the cat the dog bit chased escaped F 100 double embedding, difficult, attachment to right 6 The rat the cat you bit chased escaped F 100 7 The cat chased the rat that escaped S 100 right branching counterpart of 1, final NP easily dominates 8 The dog bit the cat that chased the rat that escaped S 100 right branching counterpart of 5, final NP easily dominates 9 The executive who the fact that the employee stole sentential complement clause embedded within a relative clause, office-supplies worried hired the manager F 100 difficult, see [Gibson, 1998]. 10 The fact that the employee who the manager hired relative clause embedded within a sentential complement clause, stole office-supplies worried the executive S 100 easy, see [Gibson, 1998]. 11 John said he came yesterday H 1.2 L 98.8 global ambiguity resolved by recency effect 12 The woman watches the man with the binoculars H 100 L 0 global ambiguity, but recency combated by “watch” having higher strength for instrumental modifiers, e.g. PP headed by “with” 13 The horse raced past the barn fell F 100 local ambiguity, classic garden path 14 The horse raced past the barn yesterday S 100 if “raced” activation too high, can produce NP not S. 15 The stablehand groomed the horse raced past S 100 unproblematic, however Stevenson and Merlo (97) the barn observe it is difficult, due to special properties of motion verbs 16 The man knew the woman slept H 100 easy, since the NP is first attached as direct object of “knew” and then reattached as by the second, main, verb as its subject 17 The man who knew the woman slept H 100 This has two readings, but the high attachment choice putting “slept” as main verb, is preferred by humans [Sturt et al., 1999], and by uspace dynamics 18 Since Jay always jogs a mile seems easy, well-known garden path [Frazier and Rayner, 1982], a short distance to him “a mile” gets attached to “jogs” as adverbial modifier and then reattached as subject of the main clause with “seems” 19 Since the woman slept the dog bit her S 100 easy, since “slept” is intransitive. 20 Since the horse kicked the dog bit her S 99 F 1 “the dog” is attached as object of “kicked” and then reattached as subject of the main clause with “bit” 21 When the boys strike the dog kills Easy, from [Warner and Glass, 1987] 22 Before the boy kills the man the dog bites strikes Difficult, from [Warner and Glass, 1987], will not succeed unless inhibitory weight for direct object ulink is reduced below 0.85. 23 They can fish VV 54.4 VN 45.6 “can” is auxiliary or main verb, “fish” is noun or verb 24 Without her contributions the funds are inadequate S 100 “her” is determiner or personal pronoun, from [Pritchett, 1992] 25 Without her contributions are inadequate F 100 from [Pritchett, 1992] 26 I hate that S 100 from [Lewis, 1993], “that” can be determiner, demonstrative pronoun, relative pronoun or complementizer (subordinating conjunction) 27 I hate that boy S 100 from [Lewis, 1993] 28 I believe that John smokes annoys Mary S 100 garden path ([Gibson, 1991]), always analyzed correctly 29 Before she knew that she went to the store Pro 51 CMPR 49 30 That coffee tastes terrible surprised John

Figure 27.7: Vosse and Kempen's sentence recognition results

Agrammatism. Vosse and Kempen went on to apply their system to the clinical condition of agrammatism. They obtained a good fit to patient performance using a set of nine sentence types developed by David Caplan [Caplan et al., 1985] for evaluating agrammatic patients. See section 27.7.

27.4 My grammatical model for the brain

I can now discuss how I formulated the Vosse-Kempen psycholinguistic model in terms of my brain model.

Basically, the decrementation of activation weight and the incrementation of u-link strength can be modeled by mechanisms already provided in my brain model architecture. I then extended the brain model to provide inhibition of competing overlapping structures.

1. Activation decrements will occur by attenuation of the word data item, and by the continuous reconstruction of the evoked lexicon item.

2. The spontaneous incrementation of u-links will follow from the ramping up of strength of the lexical frame data items in the u-space module. For this to work, I needed to make the effect of u-link strength depend on the strengths of the lexical frames it unifies.
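As a rough illustration of these two mechanisms, here is a minimal sketch in Python; the parameter values and function names are illustrative assumptions of mine, not the model's actual code.

    # Illustrative sketch only: exponential attenuation of the sensed word,
    # and a lexical frame updated toward the strength of its source word,
    # so that the frame ramps up while the word is current and decays after.

    ATTENUATION = 0.95   # per-cycle decay factor (assumed value)
    UPDATE_RATE = 0.8    # fraction moved toward the incoming strength (assumed)

    def cycle(word_strength, frame_strength, word_present):
        """One brain-model cycle for a single word/frame pair."""
        # the sensed word is renewed while present, and attenuates otherwise
        word_strength = 1.0 if word_present else word_strength * ATTENUATION
        # the evoked frame is incremented toward the strength of its word
        frame_strength += UPDATE_RATE * (word_strength - frame_strength)
        return word_strength, frame_strength

    w, f = 0.0, 0.0
    for t in range(40):
        w, f = cycle(w, f, word_present=(t < 20))  # word current for 20 cycles
        print(t, round(w, 2), round(f, 2))         # f ramps up, then decays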

As regards inhibition of overlapping u-links, this could be made a general property of the brain model and the data items used. At the moment, each type of data item has a specification of how it is to be updated. It is possible to specify uniqueness of data items, so that only one of a given type can occur in the store at any one time; if a new item enters, it simply replaces the previous value. It is also possible to specify that there can be multiple items of the same data type. The kind of competition needed here would be a generalization of the model: there could be multiple copies, but they would compete over a number of cycles to eventually leave one item only.

Thus, I will now look at how to extend my data item concept to provide this kind of uniqueness and competition as a general property of the working of the brain model. This matches a general property of perception: each perceived object has one and only one interpretation.

My approach. I first developed straightforward representations as descriptions. I concluded that u-links are not neurally plausible as data items. I instead decided to represent the same information in a new kind of description which I call a unified lexical frame. Thus the store where the sentence structure description is constructed will contain only lexical frames and unified lexical frames. I used the property of my brain model that it continuously reconstructs data items. I also did not think that a global integrity enforcement process was neurally plausible. To achieve the same computation as V-K, a brain model rule continually, every brain model cycle, reconstructs all the different possible unified lexical frames. Because of this, the data items representing unified lexical frames need to carry more information, to allow the different integrity requirements to be implemented as reconstruction actions. I then used two main rules:
(1) create lexical frame from word: word → lexical frame, and
(2) unify lexical frames to create a u-link: lexical frame 1 and lexical frame 2 → unified lexical frame 12.
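To make the two rules concrete, here is a hypothetical sketch of them as data-item constructors; the data structures and the toy lexicon entries are assumptions for illustration, not the model's actual representations.

    # Hypothetical data structures for the two rules; illustrative only.
    from dataclasses import dataclass

    TOY_LEXICON = {"rat": "NP", "chased": "S"}  # word -> frame category (assumed)

    @dataclass
    class Frame:
        word: str
        category: str        # root node of the lexical frame
        strength: float

    @dataclass
    class UnifiedFrame:
        foot: Frame          # frame whose foot node is unified
        root: Frame          # frame whose root node is unified
        strength: float

    def rule_word_to_frame(word: str, strength: float) -> Frame:
        """Rule 1: word -> lexical frame."""
        return Frame(word, TOY_LEXICON[word], strength)

    def rule_unify(f1: Frame, f2: Frame) -> UnifiedFrame:
        """Rule 2: lexical frame 1 + lexical frame 2 -> unified lexical frame 12.
        In the model this rule fires every cycle, reconstructing all
        grammatically possible unified frames from the frames in the store."""
        return UnifiedFrame(foot=f1, root=f2, strength=0.0)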

As regards the issue of learnability of grammar from positive, i.e., grammatical, instances of sentences, we will need to make the inhibition/exclusion principle a general principle independent of the particular grammar. This principle corresponds to the idea of uniqueness of a match or association.

Thus, to summarize, the store corresponding to u-space will contain only lexical frames and unified lexical frames, and this brain module will contain a set of parallel rules which act every cycle and which elaborate and reconstruct the unified lexical frames each cycle.

This is all the mechanism there will be: no separate u-links, and no separate global integrity enforcement process. All of these mechanisms are achieved by rules and unified lexical frames.

Note that my brain model already does attenuation of activation values. Every lexical frame will decay with time because the sensed word will decay with time.

The nominal rate is 20 milliseconds per cycle. Lexical frames will be renewed every 20 milliseconds, and therefore unification links also, since they are part of unified lexical frames. At a normal speaking rate of 140 words per minute and 7 words per sentence, this is 20 sentences per minute, giving 3 seconds per sentence and roughly 430 milliseconds per word. Thus, there will be about 20 brain cycles for each new word, and therefore about 20 cycles of reconstruction of the structure description in each increment.
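The timing arithmetic can be checked in a few lines:

    # Checking the timing arithmetic in the text.
    words_per_minute = 140
    words_per_sentence = 7
    cycle_ms = 20

    sentences_per_minute = words_per_minute / words_per_sentence  # 20.0
    ms_per_sentence = 60_000 / sentences_per_minute               # 3000.0, i.e. 3 s
    ms_per_word = ms_per_sentence / words_per_sentence            # ~429 ms
    cycles_per_word = ms_per_word / cycle_ms                      # ~21 cycles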

In Figure 27.8, I show the processes organized as modules which are interconnected and run in parallel, and in Figure 27.9, how they would correspond to areas of the brain. These show a module for c-space, which is Kempen's conceptual-semantic process, which he did not specify.

[Figure content: (a) sentence recognition: auditory input feeds the lexicon, which sends lexical items and lexical frames to u-space, where the parse tree is built; u-space exchanges conceptual graphs with c-space, which also receives goals. (b) sentence generation: c-space and planning produce conceptual graphs; u-space builds the parse tree from lexical frames supplied by the lexicon and drives phonology, motor for speech, and auditory output.]

Figure 27.8: Processes organized as concurrent modules

[Figure content: brain areas corresponding to the concurrent modules: auditory input, lexicon, u-space, c-space, goals, planning, and motor speech output.]

Figure 27.9: Brain areas corresponding to concurrent modules

27.5 The interface between the natural language system and the core brain model

1. Interface to the perception-action hierarchy.
(i) Auditory information is input to the natural language system.
(ii) Context information is exchanged between the perception-action hierarchy and the natural language system.
(iii) Goal and plan information of current activities is provided by the perception-action hierarchy to the natural language system. The natural language system also sends goals and information concerning verbalization to the planning and action hierarchy.
(iv) Semantic information is exchanged between temporal areas and the natural language system.
2. Interface to the memory systems.
(i) Information is exchanged with the episodic memory areas concerning the events involved in perceiving the current input sentence.
(ii) Procedural memory is used in generating output sentences.
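Purely as an illustration, these interfaces could be written down as typed messages between modules; every module and field name below is a hypothetical label of mine, not part of the model.

    # Hypothetical rendering of the interfaces as typed messages.
    from dataclasses import dataclass
    from typing import Any

    @dataclass
    class Message:
        source: str    # sending module
        target: str    # receiving module
        kind: str      # "auditory" | "context" | "goal" | "semantic" | "episodic"
        payload: Any

    inbound = [
        Message("perception-action hierarchy", "nl system", "auditory", None),
        Message("perception-action hierarchy", "nl system", "context", None),
        Message("perception-action hierarchy", "nl system", "goal", None),
        Message("temporal areas", "nl system", "semantic", None),
        Message("episodic memory", "nl system", "episodic", None),
    ]
    outbound = [
        Message("nl system", "planning hierarchy", "goal", None),
    ]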

27.6 Using general brain model mechanisms for strengths

General mechanisms of strength variation in our model. In the normal updating behavior of data items in my brain model, incoming items update corresponding stored items by incrementing the stored value towards the strength of the incoming value. This also applies to decrementation, when the incoming value is less than the stored value, and the decrementation rate can be different from the incrementation rate. Also, data items in our brain model are attenuated exponentially each cycle. I defined a new mechanism, namely inhibition behavior of data items, in which an item is decremented by an amount proportional to the sum of the strengths of all those items it is currently competing with.
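A minimal sketch of this inhibition behavior, with assumed parameter values, might look as follows.

    # Sketch of inhibition behavior: each item attenuates exponentially and
    # is decremented in proportion to the summed strengths of its
    # competitors. Parameter values are illustrative assumptions.

    ATTENUATION = 0.95
    INHIBITION = 0.2

    def inhibit_cycle(strengths, competitors):
        """strengths: {item: strength}; competitors: {item: set of rivals}."""
        new = {}
        for item, s in strengths.items():
            rivalry = sum(strengths[r] for r in competitors[item])
            new[item] = max(0.0, s * ATTENUATION - INHIBITION * rivalry)
        return new

    # two unified frames competing for the same attachment site
    s = {"uf_strong": 0.6, "uf_weak": 0.4}
    c = {"uf_strong": {"uf_weak"}, "uf_weak": {"uf_strong"}}
    for _ in range(10):
        s = inhibit_cycle(s, c)
    # after a few cycles the weaker item has been driven to zero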

Variation in strengths during sentence recognition. Figure 27.10 shows the strength variational functions used by V-K.

[Figure content: frame activation decays exponentially, v(t) = v(t0) * d^(t-t0); a u-link with no inhibition grows as u(t+1) = u(t) + pf*vf(t) + ph*vh(t), where vf and vh are the activation values of the foot and head frames and pf and ph are constants; with inhibition, stronger u-links grow while weaker ones are suppressed.]
Figure 27.10: Variation of strengths in the V-K model of sentence recognition

Figure 27.11 puts together all the time variation graphs for the different types of data in our model. I achieve something very similar to the variation used by V-K, but using only the variational forms available in my basic brain model.

The incoming word ramps up quickly and so does the lexical frame generated in the lexicon module. This is transmitted to the u-space module, where it ramps up the value of the stored frame there. While the stimulating word is current, it continuously regenerates the frame, causing it to ramp up. As soon as the next word is received, the previous word starts to attenuate, causing the frame value to reduce by updating, in addition to its attenuation. The frame in the u-space module also reduces, but more slowly, and it continues to regenerate the unified frames containing the u-link information. Thus the frame stays around for a while in the u-space module, long after it has decayed in the lexicon module. While frames are in existence, the effect of competitive inhibition is offset by regeneration from frames. Eventually, however, the frames decay; inhibition then takes hold and eliminates the weaker u-links. At the end, only unified frames are left, and these are just the correct ones, specifying the grammatical structure of the sentence.
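The course of events just described can be illustrated with a toy simulation; the parameter values are assumed for illustration and are not the ones used in my runs.

    # Toy simulation: while the source frame persists, regeneration offsets
    # the competitive inhibition between two candidate unified frames; once
    # the frame decays, inhibition eliminates the weaker candidate.

    ATTEN, UPDATE, INHIB = 0.95, 0.1, 0.2   # assumed values

    frame = 1.0
    uf = {"preferred": 0.30, "dispreferred": 0.25}

    for t in range(60):
        if t > 10:                  # the frame is renewed for ten cycles only
            frame *= ATTEN
        new = {}
        for name, v in uf.items():
            rival = sum(w for n, w in uf.items() if n != name)
            regen = UPDATE * frame  # continuous reconstruction from the frame
            new[name] = max(0.0, v * ATTEN + regen - INHIB * rival)
        uf = new
    # "dispreferred" is driven to zero once the frame has decayed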

[Figure content: time courses for the input signal; the sensed word in the lexicon module, above a noise level; the auditory trace (phonological buffer); the frame in the lexicon module; the frame in the u-space module; a unified frame (corresponding to a u-link) with no inhibition; and competing unified frames under inhibition, where the stronger survives and the weaker is suppressed.]
Figure 27.11: Variation of strengths in our brain model for sentence recognition

module   data type      update increment   update decrement   attenuation

lexicon  lexicon        0.2                0.2                0.05
lexicon  word           1.0                1.0                0.05
lexicon  current word   1.0                1.0                0.05
lexicon  current order  1.0                1.0                0.05
lexicon  frame          0.8                0.8                0.05

uspace   frame          0.4                0.4                0.05
uspace   unified frame  0.1                0.01               0.05

Figure 27.12: Parameter values used

Setting realistic parameters. I set up the model to nominally use a 20 millisecond brain cycle time. The environment then had a sentence in the form of a sequential list of character strings representing the words of the sentence. These were set to last about 3 seconds for the complete sentence. This meant that each word was current in the environment for 300 milliseconds, with gaps of 100 milliseconds between words. The number of brain cycles in 3 seconds, i.e., 3000 milliseconds, is 150. I ended up needing to leave the model running for up to 700 cycles before it settled on a unique solution, which corresponds to about 14 seconds.

Successful model parameter settings were obtained after only five experimental runs. Figure 27.12 details the typical parameter values I used.

The noise level was set at 0.05. I did not find it necessary to use two different updating times for frames and unified frames. The value for inhibition had to be set quite low, at 0.2, unlike the V-K value of 0.65; however, they divided everything by 5 to account for the different updating times, giving an effective value of 0.13, which is similar to ours.

sentence type             abbrev  example                                              % correct

active                    A       The elephant hit the monkey                          81.9
cleft-subject             CS      It was the elephant that hit the monkey              80.2
passive                   P       The elephant was hit by the monkey                   59.0
dative                    D       The elephant gave the monkey to the rabbit           58.7
cleft-object              CO      It was the elephant that the monkey hit              51.1
conjoined                 C       The elephant hit the monkey and hugged the rabbit    45.0
object-subject relative   OS      The elephant hit the monkey that hugged the rabbit   41.4
dative passive            DP      The elephant was given to the monkey by the rabbit   37.7
subject-object relative   SO      The elephant that the monkey hit hugged the rabbit   25.9

Figure 27.13: Caplan’s stimulus sentence types and comprehension scores

27.7 Agrammatism

V-K went on to apply their system to the clinical condition of agrammatism. They addressed the data published by Caplan [Caplan et al., 1985]. Caplan developed a set of nine sentence types, shown in Figure 27.13. The average success rate for patients with agrammatism is shown for each sentence type.

By hypothesizing that the set of patients' abilities lay on a line in the parameter space of the model, V-K were able to find values of these parameters that gave a good fit to performance. Recent imaging studies show a neural area deficit common to all agrammatic subjects. This area is adjacent to, and just posterior to, Broca's area.

27.8 Conclusion

I was able to develop a system-level brain model of sentence recognition based on the Vosse-Kempen psycholinguistic model. Several existing brain model mechanisms could be used to represent mechanisms occurring in their model. My continuous-construction execution regime provided a natural mechanism for sentence recognition. I added one new mechanism, namely the competitive inhibition of data items with specified overlap in their descriptions. I see this as a general brain mechanism. My system is currently able to recognize some of V-K's example sentences, and I am continuing to develop the model. My model uses just two modules, corresponding to the lexicon and to u-space, probably corresponding to Wernicke's and temporal lobe areas and to Broca's area, respectively. A fuller model would also have a semantic module, and input and output phonological modules.

Chapter 28

Analysis of subcortical systems

Abstract: I consider how to extend my cortical model by adding models of subcortical systems involved in motivation and survival of the animal.

From a discussion of current experimental results, both behavioral and anatomical, and theoretical ideas, I propose that there are three motivational systems, namely agonism (meaning dominance and submission, including flight), attachment, and sex. Each motivational system has positive and negative subsystems, and has a hierarchical structure involving the amygdala, hypothalamus and lower-level effector systems.

I discuss agonism in rats as described by Jeffrey Gray, Joseph Ledoux, James McGaugh, and others, attachment in rats as described by Myron Hofer and associates, and sexual behavior in rhesus monkeys as described by Richard Michael and Doris Zumpe.

In the next chapter, I will outline how my model can be extended to have such a motivational system.


28.1 Introduction

Motivation. The field of motivation research has not yet really settled down to a single accepted framework or indeed to a single accepted swamp. At the moment there are many frameworks/swamps, e.g., [Plutchik, 1980] [Hinde, 1970]. In addition there are many approaches to the psychology of emotion [Ortony et al., 1988] [Frijda, 1986] [Arnold, 1970] [Rapaport, 1942]. New motivational mechanisms are still being discovered and in general the subcortical motivational system now seems extremely complicated [Weiner, 1992].

Cortical systems develop and function by interacting with subcortical systems in behavior. In addition, as we discussed in chapter 20, the cortex has its own specifically cortical motivational dynamics.

Nevertheless, there is some agreement on the large-scale organization of motivational functionality and on the anatomical structure and function of the main subcortical systems implementing it. There are attempts to develop unifying theoretical frameworks for behavior and motivation; one I have found useful is due to William Mason [Mason, 1993].

For the purposes of this analysis, I will group motivational mechanisms into three main systems, namely agonism, attachment and sex. I will classify these as on a different level from what I will term basic survival systems such as hunger, thirst and body temperature.

Subcortical behaviors. It seems that, phylogenetically and ontogenetically, there is a set of interacting subcortical subsystems which provide basic behaviors and functions, which include:
1. agonism, which comprises aggression and avoidance:
(a) aggression, which has many different forms with somewhat different mechanisms, see below;
(b) avoidance of dangerous situations, i.e., fear, which can coactivate freezing, flight and submission subsystems;
2. attachment of infants to caretakers, including maternal behaviors. Mechanisms seem to differentiate into pleasure due to proximity and distress due to separation;
3. sexual behavior, including maternal, paternal and courtship behaviors.

What can be a little confusing is that:
1. subsystems are at different levels of abstraction; for example, the control of breathing, the control of eating, and the control of sexual behavior are at different levels.
2. subsystems greatly overlap neuroanatomically; for example, the control of eating and the control of predatory aggression both use only the lateral hypothalamus and then midbrain central gray areas. The two subsystems use different sets of neurons, and probably different neurotransmitters, to activate these similar areas.
3. subsystems are designed to coactivate; thus sexual excitement may coactivate a sexual aggression subsystem. It may therefore sometimes be clearer to talk of just one large system which has many different modes of activation, rather than of a set of separate subsystems.
4. there are subcortical integration centers, notably the amygdala, which organize and integrate motivational systems, and which may include pleasure and pain centers.

Underlying control systems. Below the level of these motivational systems are the underlying systems that they utilize and control. Underlying mechanisms for controlling the body include: subcortical motor control systems, subcortical sensory processing, the peripheral nervous system, hormonal systems, and the immune system. Subcortical motor control systems control the skeletal musculature as well as internal smooth muscle. Subcortical sensory processing delivers sensory information to the subcortical motivational systems. The peripheral nervous system comprises the sympathetic nervous system and the parasympathetic nervous system. The hormonal system comprises the pituitary, pineal, thyroid and parathyroid, thymus, adrenal, pancreas, and reproductive glands (testes and ovaries).

Underlying basic survival systems include: cardiovascular, thermoregulatory, respiratory, gastrointestinal, and urinary systems. Some elementary mechanisms will be implemented as brainstem nuclei, which are about 150 in number [Klemm, 1990].

Levels of control. Figure 28.1 shows our general approach to scientific explanation of brain action using a two-level hierarchical architecture with just two main subsystems, a cortical subsystem and a subcortical subsystem. In addition there are input perception subsystems which may be partially shared by the two main subsystems, and output effector interfacing subsystems, which may also be shared. This approach is arguably the simplest possible architecture if we want to model cortical and subcortical action.

My initial concept is that:
(i) the subcortical control systems will be modeled by feedback control systems;
(ii) these feedback systems have limited plasticity;
(iii) the relationship between these feedback systems and the neocortical model is that of a two-level control hierarchy;
(iv) ontogenetically, the cortical system initially makes little contribution to behavior, but due to its greater plasticity it develops corresponding cortical responses which eventually come to dominate the control of the animal. A minimal sketch of this arrangement follows Figure 28.1 below.

[Figure content: a cortical subsystem and a subcortical subsystem, both connected to shared perception and effector interfacing; sensors include somatosensory and internal sensing; effectors include motor subsystems, peripheral nervous subsystems and hormonal subsystems.]

Figure 28.1: General approach with two levels of control
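A minimal sketch of this two-level arrangement, under assumed interfaces (the stimulus categories and responses are invented for illustration):

    # Two-level control sketch: a fast, relatively fixed subcortical mapping,
    # and a plastic cortical layer that can override it once it has learned
    # a more discriminating response. Entirely illustrative.

    def subcortical_response(stimulus: str) -> str:
        """Innate feedback mapping with crude perceptual categories."""
        return "fight" if stimulus == "threat-like" else "neutral"

    class CorticalLayer:
        def __init__(self):
            self.learned = {}                 # plastic stimulus -> response map

        def learn(self, stimulus: str, response: str):
            self.learned[stimulus] = response

        def respond(self, stimulus: str, subcortical: str) -> str:
            # early in development nothing is learned, and the subcortical
            # response passes through; later, learned responses dominate
            return self.learned.get(stimulus, subcortical)

    cortex = CorticalLayer()
    low = subcortical_response("threat-like")   # "fight"
    print(cortex.respond("threat-like", low))   # "fight" (nothing learned yet)
    cortex.learn("threat-like", "appease")      # acquired through experience
    print(cortex.respond("threat-like", low))   # "appease" (cortical override)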

28.2 The psychology of subcortically motivated be-

haviors

28.2.1 Agonism

Aggression. Moyer [Moyer, 1976] has differentiated several different kinds of aggression, namely:
predatory: an animal stalks, catches and kills its natural prey
intermale: a male attacks a strange male conspecific
maternal: a mother assaults a perceived threat to her young
sexual: a male becomes aggressive when encountering sex-related stimuli
fear-induced: an animal cornered and unable to escape from danger becomes aggressive
irritable: an annoyed person attacks another person or object
instrumental: an individual employs aggressive behavior to obtain a desired goal
territorial: an animal defends its territory against intrusion

Wilson's categories are territorial, dominance, sexual, parental discipline, weaning, moralistic, predatory, and antipredatory aggression. According to Scott and Fredericson [Scott, 1975], agonistic behavior is anything relating to fighting, including reconciliation. Aggression among conspecifics is a set of behaviors that serve as competitive techniques. Competition can be social status competition, sexual competition or resource competition.

It seems that aggression can be used in the service of most behaviors, as the need or circumstance arises. It is a general-purpose subsystem which readies the body for fighting or other forceful acts.

Doris Zumpe and Richard Michael managed to show decoupling, i.e., desynchronization, of aggression from sex, thereby supporting the idea of aggression as a separate system. Normally, the sexual and aggressive behaviors of males are coactivated on a yearly cycle, triggered by the shortening of the day in the fall, which causes a rise in testosterone levels. Zumpe and Michael kept some rhesus monkeys in an environment with a constant photoperiod for about four years, i.e., the length of the "day" was kept constant throughout the year. Testosterone levels in males went into a free-running annual rhythm of about 13 months, and sexual behavior decoupled from testosterone levels; however, aggression remained tightly coupled to testosterone levels.

Avoidance. Avoidance can be approximately equated with fear. Joseph Ledoux has explicated the avoidance response in rats as a model emotion [Ledoux, 1996], and Jeffrey Gray has developed a systematic framework for fear, stress, and anxiety [Gray and McNaughton, 2000].

Dominant and subordinate behavior. As already mentioned in section 2.5, Robert Sapolsky's summary [Sapolsky, 1990] of the behavior of dominant baboon males has five categories. A dominant animal is more likely than a subordinate animal: (i) to differentiate between threatening and neutral interactions,

(ii) to initiate a fight with a threatening rival, (iii) to initiate a fight he wins, (iv) to differentiate between winning and losing a fight, and (v) to successfully redirect aggression after losing a fight.

To describe these behavioral effects, I made the obvious postulate that there are two interacting behavioral subsystems, a cortical system and a subcortical system. Each receives sensory input and perceives its environment, and each generates motor output. The subsystems interact and can excite or inhibit each other.

The subcortical system is inherently simpler than the cortical one. It has simple perceptual categories and simple output behaviors, and is less adaptive or plastic than the cortical system.

By contrast, the cortical system has complex functionality, particularly in social areas. It has complex perceptual abilities, generates complex behaviors and is adaptable.

Let us examine each of Sapolsky’s social scenarios in turn and look at different cases:

1. A dominant animal shows most differentiation between threatening and neutral interactions. There are two cases:

(a) There is a socially neutral interaction. The subcortical system sometimes perceives this as a threat and makes a simple fight response. The cortical system correctly perceives that this is a neutral interaction and generates a neutral response.

(b) There is a socially threatening interaction. The subcortical system sometimes perceives this as non-threatening and generates a neutral response, and sometimes as a threatening interaction and generates a fight response. The cortical system makes a correct perception of the situation as threatening and always

generates a fight response.

2. A dominant animal is more likely to initiate a fight that he or she wins.

(a) There is a threatening rival that he or she is likely to beat. The subcortical system sometimes incorrectly perceives the rival as one the animal is likely to lose to, and so does not generate fighting behavior. The cortical system makes a more accurate assessment and initiates fights.

(b) There is a threatening rival that he or she is likely to lose to. The subcortical system sometimes perceives the situation as one he or she is likely to win, and initiates a fight. The cortical system initiates a neutral response.

3. A dominant animal shows the most differentiation between winning and losing a fight.

The cortical system gives the correct assessment, but the subcortical system gives an incorrect assessment some of the time.

4. A dominant animal is most likely to successfully redirect aggression after losing a fight. The cortical system is better able to identify a good displacement target. This target has to be attacked within a fairly short time frame in order to avoid the deleterious biochemical effects of the previously lost fight.

For healthy behavior, the cortical system has to be able to perceive situations correctly, to inhibit any incorrect behaviors generated by the subcortical system, to use the subcortical system to attain integrated behavior involving cortical and subcortical signals, and to attain reward for the subcortical system as well as for the cortical system.

28.2.2 Attachment

Attachment in humans [Cassidy and Shaver, 1999] has a vast literature, starting with John Bowlby's classic work [Bowlby, 1973] [Bowlby, 1980] [Bowlby, 1982]. This arose from studies of children separated from their parents in the Second World War, either in a planned way, as in the organized evacuations of children from cities like London, or unplanned, due to war damage. There is also work on adult attachment [Hazan and Zeifman, 1999].

The phenomenon of attachment in humans is well known. During the first year of life, the infant seeks and forms an attachment relationship with its caretaker, in which separation is distressing and proximity is pleasurable. This relationship acts as a secure base for the infant in exploring its environment. On prolonged separation of several days, the infant will progress through stages of distress, protest and despair. As a result of different temperaments and mothering styles, attachment can take different forms, which are usually put into four types [Main, 1995]:
Type A: Insecure-avoidant. The caretaker is sometimes separated and exhibits some lack of response, and may withdraw from the infant when the infant seems sad. The infant tends to explore less and to avoid contact with people.
Type B: Secure. Caretakers are sensitive to the signals and communications of their infants. The infant tends to explore freely and to be comfortable with other people.
Type C: Insecure-ambivalent/preoccupied. Caretakers are unpredictable, discouraging of autonomy, and insensitive to infant signals and communications. The infant tends to explore less and to be ambivalent toward people.
Type D: Disorganized/disoriented. Often the caretaker maltreats the infant; this type also results from psychiatrically distressed caretakers. In this type, infant behaviors include:
(i) sequential display of contradictory behavior patterns
(ii) simultaneous display of contradictory behavior patterns
(iii) undirected, misdirected, incomplete and interrupted movements and expressions

(iv) stereotypies, asymmetric movements, mistimed movements and anomalous posture. (v) freezing, stilling, slowed movements and expression (vi) direct indicators of apprehension regarding caretaker (vii) direct indicators of disorganization or disorientation.

Attachment is construed as involving the infant’s formation of a working model of its caretaker, which is a representation of the caretaker, of the caretaker’s behavior and of interactions between the infant and caretaker. Failure to form a stable attachment relationship can impair the child’s development. Adult relationships also have similar properties to infant attachment. It seems to me that the caretaker also forms a strong attachment to the infant, although this is perhaps not quite so crucial as for the infant.

The attachment categories used for 5-7 year old children are slightly different, namely A: avoidant, B: secure, C: dependent, and D: controlling and insecure-other [Goldberg, 1995]. There is also a classification for adult attachment, which is based on the adult's behavior when asked about their childhood attachment experience: F: secure/autonomous, coherent and collaborative in dialogue; D: dismissing, with lack of memory of childhood; E: confused and preoccupied with past experience; and U: unresolved/disorganized in discussing attachment [Goldberg, 1995].

Since it was introduced twenty years ago, the concept of a working model has resisted more precise characterization, and it remains rather imponderable. The development of computational theories of the attachment process should facilitate progress in understanding it.

Attachment in rhesus monkeys was described in the classic experiments by Harry Harlow [Harlow, 1971] [Harlow, 1986]. More recent work is reviewed in [Carter et al., 1999].

In order to work with detailed physiological data, I'll briefly describe work by Myron Hofer and coworkers on attachment in the rat [Hofer, 1984] [Hofer, 1987]. I'll describe

mutual regulation between infant and caretaker, which is thought to be the basis of attachment. Some work on the physiological basis of mutual regulation in monkeys has been reported by Gary Kraemer [Kraemer et al., 1991].

Maternal behaviors. Both dam and pup have innate behaviors. The maternal behaviors of the dam are initiated by hormonal priming in late gestation and then parturition. This set of behaviors is maintained by continued interaction with the pups. If the pups are removed for 3-4 days, maternal behavior disappears and does not reappear when the dam is reunited with the pups.

Maternal behavior progresses in time for a few weeks and changes its form during this time. Maternal behavior immediately after parturition consists of (i) crouching over the pups, (ii) vigorously licking them, (iii) nursing them, (iv) retrieval if a pup strays, and (v) taking a 5-10 minute break every hour or two to drink and eat.

The results reported by Hofer correspond to behavior two weeks after parturition. Nursing continues until 25-30 days, when active juvenile behaviors of the pups, such as exploration and rough-and-tumble play, become dominant. Weaning occurs between 2.5 and 4 weeks. At 2 weeks, the mother interacts with the pups for 15-40 minutes every one or two hours. The rest of the time she stays at some distance from her litter and feeds, rests and grooms herself. Thus, rat pups experience periods of separation every day.

Pup behaviors. Pup behavior immediately after parturition consists of (i) writhing along tactile and olfactory spatial gradients, (ii) rooting and nutritive sucking, (iii) rooting and nonnutritive sucking, (iv) ultrasonic cries on separation from the mother, and (v) sleep-wake cycles.

Development of various systems occurs, resulting in changes in behavior. The pup becomes progressively more active and independent by four weeks after parturition.

Other motivational systems involved in attachment.

Other motivational systems, or modes of response, may interact with attachment responses, notably the aggression and avoidance responses toward and by unfamiliar animals.

28.2.3 Sex

I’ll discuss sex in rhesus monkeys; similar behavior patterns occur in other primate species. I’m grateful to Doris Zumpe for explaining all of this to me. Obviously, the full range of sexual behavior using the cortex in humans is large. The position I take is that the sexual behaviors and mechanisms observed in monkeys are a good approximation to the subcortical component of behaviors and mechanisms in humans. There is some debate on this point. I take these behaviors to be based on innate mechanisms with some modification due to plasticity in the amygdala. Full sexual behavior in humans is a result of cortical development, and has several components, including sexual identity and sexual preference. During development the cortex can take over all control of the subcortical systems, leading to quite different behaviors. For example, women vary in how cyclical their sexual desires are, and both sexes vary in how interested in sex they are, including some priests committed to lifelong celibacy.

In rhesus monkeys, when young males become mature, which occurs after about five years, they leave the troop into which they were born. Then, males gradually immigrate into another troop, initially by sexual invitation from ovulating females. A male may then develop a companion affiliative relation with a particular female.

Zumpe and Michael [Zumpe and Michael, 1996] have characterized sexual behavior in rhesus monkeys as a composition of two separate behaviors. The first type of behavior occurs during the preovulatory (follicular) phase and maximally during the ovulatory phase of the menstrual cycle in the female, and I will refer to it as midcycle sex. In midcycle sex, the female engages in sex with several males, but at a frequency very strongly linked to midcycle. During the rest of the menstrual cycle, no midcycle sex occurs. The second type of behavior is linked to affiliation, and I will refer to it as affiliative sex. In affiliative sex, the female engages in sex with one particular male, her companion, during a much larger fraction, in fact most, of her oestral cycle. Thus the female has a cyclical pattern of sexual arousal, having sex with many males at midcycle. Conversely, the male has a constant pattern of arousal, having sex with any females which are at their midcycles.

Companionship has a time development over a period of about five years, diagrammed in Figure 28.2. It starts with a new male entering the troop, develops through an early phase with sex throughout the menstrual cycle including midcycle, then in its later stages continues outside midcycle, with less sex during midcycle, then only outside midcycle, and then not at all, until eventually the relationship ends and usually the male leaves the troop.

[Figure content: over about four years the relationship progresses from non-friend (male immigrates; some or no midcycle sex), through early friend (affiliatively enhanced midcycle sex plus non-midcycle affiliative sex and affiliation), to old friend (non-midcycle affiliative sex and affiliation only), until the male emigrates.]

Figure 28.2: Time course of male-female relationship

Familiarity and novelty. The sexual response is modulated by novelty and familiarity. Initially, the sexual response is enhanced by novelty; later, it is reduced, to the point of extinction, by familiarity. Although most observations necessarily confound the effects of familiarity in males and females, there is reason to believe that the main effect is in females.

First, evolutionary theory would suggest that females seek novel males. Evolutionary theory predicts that the negative effects of incest are potentially greater for the sex (the female, in primates) which is capable of producing the fewer offspring per lifetime. Typically 12-13 offspring are born to each female, and 7 survive to reproductive age. The reproductive period is from 5 or 6 years of age until 15 years of age. Reproductive losses are also large, 17% or more before one year of age. Females would therefore be expected to seek novel males to avoid incest with group members. A study of 4000 people who grew up on kibbutzim found that they had no sex with each other later in life. This applied to males and females, so there may be an incest-inhibitory mechanism in both sexes.

Second is the work of Chapais and Mignault on Japanese macaques, where animals have both homosexual and heterosexual relations. In this case, one can distinguish the effects of familiarity on the two different sexual behaviors. These effects in humans have been known for some time; see [Jasso, 1985] [Liu, 2000] [Masters et al., 1992].

Other motivational systems involved in sex. Other motivational systems, or modes of response, may interact with sexual responses: (i) the avoidance response to unfamiliar animals, (ii) the male aggression aspect of an agonistic system, probably linked to testosterone level, (iii) attachment relations involving some mutual regulation, and (iv) pain, which can inhibit sexual motivation.

28.3 The neuroanatomy of subcortically motivated behavior

There are three main levels of anatomical mechanism, namely the amygdala, the hypothalamus, and what I am calling the underlying control systems.

28.3.1 The hypothalamus

Referring to Figure 28.3, the hypothalamus is a collection of ganglion cells grouped into nuclei. We can group these nuclei into three groups, namely:
anterior nuclei: preoptic, supraoptic and paraventricular nuclei
central nuclei: ventromedial, dorsomedial, and arcuate nuclei
posterior nuclei: dorsal, posterior hypothalamic, and mamillary nuclei

Figure 28.3: The human hypothalamus, taken from Carpenter, 9th edition, Figure 17.1, page 707

The connections of the hypothalamus. The hypothalamus has inputs from: auditory, gustatory and olfactory sensing, visceral sensing, blood temperature, blood salinity, and blood hormone levels.

Most hormone output from the hypothalamus is via the pituitary. The hypothalamus sends peptides called regulatory factors to the pituitary, which secretes selected hormones at selected levels into the bloodstream. The pituitary creates the hormones, except for two, ADH (antidiuretic hormone) and oxytocin, which are created by the hypothalamus and then stored in the pituitary. Figure 28.4 shows the different hormone outputs and their effects.

[Figure content: the hypothalamus sends regulatory factors (peptides) to the anterior and intermediate lobes of the pituitary, and sends ADH and oxytocin to the posterior lobe for storage. Pituitary outputs and targets: vasopressin (ADH): kidney; oxytocin: uterus contractions, breast; MSH: skin; endorphin: CNS; GH (growth hormone): bone; TSH: thyroid, producing thyroxine and triiodothyronine; ACTH: adrenal cortex, producing corticosteroids; FSH and LH: ovary, producing estrogen, progesterone and ova, and testis, producing testosterone and sperm; prolactin: breast.]

Figure 28.4: Hormone outputs from the hypothalamus via the pituitary

28.3.2 The amygdala

Structure of the amygdala. The amygdala consists of at least 13 nuclei [Emery and Amaral, 2000] [Aggleton and Saunders, 2000].

Intrinsic connections of the amygdala. The intrinsic connectivity between the different nuclei is the subject of current research. Figure 28.5 shows recent data reported by Aggleton and Saunders [Aggleton and Saunders, 2000].

Figure 28.5: Intrinsic connections diagram for the amygdala, from [Aggleton and Saunders, 2000]. Legend: AAA: anterior amygdala area; AB: accessory basal nucleus; CE: central nucleus; COa,p: cortical nucleus, anterior and posterior parts; Bmc,pc: basal nucleus, magnocellular and parvocellular parts; L: lateral nucleus; PAC: periamygdaloid cortex; PL: paralamellar part of basal nucleus

Extrinsic connections of the amygdala. Our explanation of the connections of the amygdala is taken from the work of Joseph Ledoux [Ledoux, 1996], Jeffrey Gray and Bruce McNaughton [Gray and McNaughton, 2000], Joseph Price and David Amaral [Price et al., 1987] [Amarel et al., 1992], and John Aggleton and Richard Saunders [Aggleton and Saunders, 2000].

Figure 28.6 diagrams the amygdala and its extrinsic connections, taken from [Aggleton and Saunders, 2000]. This includes the inputs to the amygdala from sensory inputs, subcortical areas and cortical areas; the outputs from the amygdala to subcortical areas; the outputs from the amygdala to cortical areas; and the outputs from the amygdala to the skeletomotor system for emotional expression.

Figure 28.6: Extrinsic connections diagram for the amygdala, from [Aggleton and Saunders, 2000]

The connectivity of the amygdala. A simplified view [Emery and Amaral, 2000] of the connectivity of the amygdala is that: (i) most input from cortical perception areas, notably visual areas, goes to the central nucleus, (ii) information from the central nucleus goes mainly to the basolateral nucleus and the lateral nucleus, and (iii) most of the information from the amygdala to the cortex goes from the basolateral nucleus. Nathan Emery and David Amaral [Emery and Amaral, 2000] have conjectured that the central nucleus is the main recognizer of input stimuli into affectively and socially significant classes, and the basolateral nucleus is the main action generator. According to a recent survey by John Aggleton and Richard Saunders [Aggleton and Saunders, 2000], connections to premotor cortex are weak, so actions are not executed directly by premotor cortex; instead they go to medial frontal and orbitofrontal cortex, which are planning and context areas. There is some connectivity to dorsolateral prefrontal cortex as well.

Exactly which nucleus does what may not be important if we are modeling the amygdala as one module, and likewise for the hippocampus. Certainly the different nuclei of the amygdala are strongly connected together, so different motivational mechanisms can influence each other.

The basolateral nucleus is comparatively large in primates, indicating an evolutionary survival advantage, possibly due to its providing improved social discrimination and action.

Strong connections between the amygdala and the cortex have been discovered only in the last decade or so and many treatments still assume the amygdala is mainly an olfactory motivation device. However, cortical connectivity has altered our picture of the role of the amygdala. It is now seen as important in recognizing social percepts and actions, in being plastic and in storing memories of socially significant situations.

28.4 The action of subcortical systems

28.4.1 Subcortical effects

The subcortical motivation system interacts with the rest of the brain in various ways: (1) It generates a component of behavior, both directly through the skeletal musculature and, mainly, indirectly through the endocrine and autonomic systems. It mainly readies the person for different classes of motivated survival behavior. (2) The amygdala integrates information on the state and intentions of the subcortical system. It communicates with the hippocampus to affect and contribute to episodic memories, and to be in turn affected by them. It also communicates with orbitofrontal areas, probably adding motivational influences to cortical intentions. (3) Hormones and opiates in the bloodstream, secreted by the hypothalamus, pituitary, and other areas, affect receptors in synapses in the cortex and other brain areas.

Subcortical areas will have some plasticity, particularly the amygdala.

28.4.2 System functions and connections

I can use the functionalities I have assigned to the modules of the perception-action hierarchy, and to the hippocampus, to understand their interaction with the amygdala.

The main interactions I suggest are:
(i) with anterior temporal areas, giving the amygdala information on sophisticated high-level visual percepts, where the scene is interpreted using the cortex's store of semantic concepts;
(ii) with orbitofrontal areas, giving a connection with the context store and the currently active context(s), (a) allowing the context to be taken into account in the action of the amygdala, and (b) allowing the amygdala to influence the choice and description of the currently active context(s);
(iii) with the anterior cingulate, giving a connection with the system's current goals, both as context for the amygdala and also for the amygdala to influence or even create goals;
(iv) with the hippocampus, giving

(a) the current event and episode as input to the amygdala, and (b) output from the amygdala as a component of the current event, giving an evaluative, affective and social component. Here the input of moderate levels of excitation tends to enhance the saliency of the event and to make it more strongly remembered, whereas high levels of stress incapacitate the hippocampus and cause the formation of dissociated memories [Nadel and Jacobs, 1996].

I can therefore now draw a summary diagram, Figure 28.7, which shows the data and processing that could be occurring in each module, using the case of agonistic behavior as an example. In general, corresponding mechanisms would apply for attachment and for sexual behaviors. These different motivational mechanisms would also mutually interact, certainly in modules at the cortical level and also in the amygdala.

The three motivational systems also have an ordering property: agonism can be studied independently of the other two, the study of attachment requires some treatment of agonism but not of sex, and the study of sex requires some treatment of both agonism and attachment.

The blood can be conceived of as part of the environment, and the hypothalamus's sensing of hormone and salinity levels, etc., as sensing of the environment.

28.4.3 Brain mechanisms of agonism

Figure 28.7 is my idea of a complete system diagram for agonistic behavior.

Mechanisms of aggression. More specific data on aggression have been analyzed by Stephen Klein [Klein, 1982]. Flynn [Flynn, 1976] also describes distinctions between affective and predatory aggression. Affective aggression involves the amygdala, hypothalamus, central gray and brainstem. Predatory aggression involves the hypothalamus, medial forebrain bundle, central gray and brainstem.

[Figure content: modules and their contents for agonistic behavior: perception hierarchy (perception of agonistically significant features); temporal lobes (models of agonistic behavior in general); prefrontal cortex (agonistic plan execution); orbital frontal cortex (model of agonistic behavior with a specific partner); hippocampus; anterior cingulate (goal storage and prioritization, cortical agonistic goals); amygdala (agonistic responses and goals; adaptive and learned goals at this level); hypothalamus (agonistic responses and goals; relatively nonplastic innate responses); peripheral nervous system, motor system and hormonal systems (states of agonistic readiness, basic agonistic actions, phases of agonistic activities).]

Figure 28.7: Summary connections diagram for the amygdala, illustrated by agonism

Other work includes that of Albert and Walsh [Albert and Walsh, 1984] [Pinel, 1993] on the different roles of the anatomical components for different types of aggression.

Mechanisms of avoidance. Detailed treatment of mechanisms of avoidance can be found in the publications of Ledoux and coworkers [Ledoux, 1996]. Figure 28.8 gives Gray's idea of levels of threat processing derived from Graeff's work [Graeff, 1994] [Gray and McNaughton, 2000]. Recent work differentiates the roles of the lateral and basolateral nuclei in learning [McGaugh, 2002] [Amorapanth et al., 2000] [Killcross et al., 1997]. The output from the lateral nucleus has been associated with the innate (unconditioned) response of freezing.

[Figure content: levels of threat processing: potential danger, to approach: risk assessment and behavioral inhibition (posterior cingulate, septo-hippocampal system); potential danger, to avoid: avoidance (anterior cingulate, amygdala); distal danger: escape, inhibition of aggression (medial hypothalamus); proximal danger: freezing, flight, fight (periaqueductal gray).]

Figure 28.8: Levels of threat processing, from Graeff 1994

28.4.4 Brain mechanisms for attachment behavior

As far as I know, there is currently no agreed idea of the subcortical or cortical mechanisms of attachment. However, we do have Hofer's analysis of the lower brain mechanisms involved, which include the hypothalamus and the brainstem.

Maternal behaviors. Maternal behavior is modified by the action of the pups, see the table in Figure 28.9.

pup action: effect on dam

nuzzling, licking, huddling: nursing
sucking: slow wave sleep → oxytocin → milk letdown
activity, heat: brain temperature → termination of nursing
ultrasonic calling: increases retrieval, nest relocation, milk letdown and licking of pups; decreases biting and stepping on pups

Figure 28.9: Table of pup action effects on the dam, from [Hofer, 1987]

Pup behaviors. The behavior of the pup is modified by interaction with the mother; see the table in Figure 28.10. On separation of the pup from the dam, the different control systems react differently, as diagrammed in Figure 28.11. A complete system diagram for attachment behavior can be obtained from the one for agonism, Figure 28.7, by altering the labeling accordingly.

dam action: effect on pup

body warmth: + activity level
tactile, olfactory: - activity level
milk (distention): - nutritive sucking
tactile (perioral): - nonnutritive sucking
body warmth and tactile (dorsal): + NE (norepinephrine), + DA (dopamine), + ODC (ornithine decarboxylase, a growth regulator)
milk (sugar): + oxygen consumption
periodicity of milk and tactile: + REM sleep, - arousal
milk (interoreceptors): + heart rate (beta-adrenergic), - arterial resistance (alpha-adrenergic)
tactile (dorsal): + growth hormone

Figure 28.10: Table of dam action effects on the pup, from [Hofer, 1987]

Figure 28.11: Reaction of the different control systems of the pup on separation from the dam, from [Hofer, 1987]

28.4.5 Brain mechanisms of sexual behavior

Neuroanatomy underlying sexual behavior. Figure 28.12 outlines the subcortical systems involved in sexual behavior.

[Figure content: the amygdala and hypothalamus each receive vision, taste, smell, visceral and sound inputs, together with testosterone and estrogen levels; the hypothalamus drives the pituitary (FSH, LH), the autonomic system and the motor system; the gonads return testosterone and estrogen, and the motor output initiates sexual behavior.]

Figure 28.12: Subcortical systems involved in sexual behavior

The hormonal systems are (i) the ovaries and adrenal glands producing estradiol, (ii) progesterone receptors facilitated by estradiol, (iii) the testes and adrenal glands producing testosterone, and (iv) receptors induced by each hormone.

The male and female use different hypothalamic nuclei, males using the medial preoptic area (MPO) and females the ventromedial nucleus (VM). In male hamsters, Ruth Wood [Wood and Coolen, 1997] found that both olfactory input and testosterone in MPO were needed to evoke sexual behavior: if either was missing, there was no sexual behavior; if both were present, sexual behavior occurred. Similarly, both sensory input and androgen input are needed in the medial amygdaloid nucleus for the amygdala to stimulate sexual behavior.
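Wood's finding amounts to a conjunction requirement, which can be stated in a line of code; this is a toy rendering of the logic, not a claim about the circuitry:

    # MPO as an AND gate over hormonal and sensory inputs (toy rendering).
    def mpo_evokes_behavior(olfactory_input: bool, testosterone_present: bool) -> bool:
        return olfactory_input and testosterone_present

    assert mpo_evokes_behavior(True, True)          # both present: behavior
    assert not mpo_evokes_behavior(True, False)     # either missing: none
    assert not mpo_evokes_behavior(False, True)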

A system diagram for sexual behavior can be obtained from the one for agonism, Figure 28.7, by altering the labeling accordingly.

In addition, we will have to take into account the hormonal circulation in the bloodstream, which is diagrammed in Figure 28.13. Hormone receptors in cortical and subcortical areas are induced by behavior. The female has low levels of testosterone, and has progesterone only during the luteal phase (after ovulation). In rats, sex differentiation occurs in gene expression in the foetus before any hormonal influence from the mother. The probability of homosexuality increases, up to double the rate, with the number of older brothers; this is thought to be due to an interaction between the mother and the foetus, with the mother becoming immunized against males.

The metabolic chain starts with cholesterol, which produces progesterone, which in turn produces testosterone. Testosterone is metabolized to estradiol and to dihydrotestosterone.

The hypothalamus and amygdala contain aromatase, which metabolizes testosterone to estradiol, which then affects neural receptors.

Testosterone is metabolized by the enzyme 5-alpha-reductase to dihydrotestosterone, but not in the brain.

The hormone systems of the male and female are thought to be identical, the difference being due to the response of the system to different hormone regimes.

[Figure content: female and male hormonal circulation: in both sexes, the cingulate, thalamus, hippocampus, amygdala (where aromatase converts testosterone to estradiol) and hypothalamus see circulating hormones; the hypothalamus releases GnRH, and the pituitary releases gonadotropins (follicle stimulating hormone and luteinizing hormone) to the gonads; in the female, the ovaries (graafian follicles, ovulation, corpus luteum) produce estradiol and progesterone, acting also on the uterus; in the male, the testes produce testosterone, converted by 5-alpha-reductase to dihydrotestosterone, acting on the seminal vesicles, prostate and glans penis.]

Figure 28.13: Female and male hormonal circulation

The male's brain sees more oestrogen with testosterone, but the female's brain sees progesterone. If the hormone regimes are artificially changed, then the male immediately adopts female behavior and the female adopts male behavior.

28.4.6 A unified picture

It has been observed by Newman [Newman, 1999] that all the different motivations use the same brain centers. As she observes, a number of researchers have already noted that more than one function is subserved within the amygdala-hypothalamus system [Joppa et al., 1995] [Luiten et al., 1985] [Simerly, 1995] [Nyby et al., 1992] [Barfield, 1984] [Jonge and de Poll, 1984].

Thus, it seems that there may be no separate pure motivations, such as agonism, attachment, and sex, but that instead, if these components intercommunicate, there is a single motivation which has many different dimensions or variations. Apparently it is true that "There is but one urge".

Figure 28.14 is my attempt to summarize and depict the three main motivational systems, with several levels of control and interactions between the systems at all levels. The division of the boxes is intended to indicate that a given system may itself have several components, such as agonism having aggression and avoidance subsystems, and attachment having proximity and separation subsystems.

[Figure content: three columns (agonism, attachment, sex) at three levels: the cortical level, with full perception and complex learning and memory (mainly learned); the amygdala level, with restricted perception (vision), some learning and memory, connections to the hippocampus, and a role as integration center; and the hypothalamic level, with limited perception (olfactory, light levels) and limited plasticity. Information flows between levels: cortical descriptions downward and affect upward. Below these lie the autonomic system, endocrine system, skeletal motor system and periaqueductal gray.]

Figure 28.14: Levels of interacting motivational control

28.5 Stress

Understanding the mechanism and effects of stress is of great medical importance. According to Lynn Nadel and Jake Jacobs [Nadel and Jacobs, 1996], stress originates in the sympathetic nervous system, which triggers a related set of neurobiological processes, namely (i) the release of epinephrine from the adrenal medulla and (ii) of norepinephrine from the locus coeruleus, both rapidly, and (iii) of corticosteroids from the adrenal cortex, more slowly. These three substances have different effects on different areas. Most notable is the effect of corticosterone on the hippocampus, which has a very high concentration of glucocorticoid receptors, interfering with its normal functioning. The response to stress first enhances and then disrupts hippocampal memory activity. According to Nadel and Jacobs, this results in memories without spatiotemporal reference, i.e., dissociation.

Application of modeling to stress. The amygdala plays an important role in stress. However, the main phenomenon, the formation and nature of traumatic memories, concerns the integration process of memories for events.

Modeling interacting cortical and subcortical systems

Abstract: I first briefly examine ontogenetic development of subcortical systems, involving innate systems, amygdala plasticity and cortical learning.

I then outline how my model might be extended to have such subcortical systems.


29.1 Introduction

In order to model the motivational subsystems described in the previous chapter, I will first briefly examine ontogenetic development, since this is probably the key to understanding the corresponding motivational mechanisms in the cortex, as well as the relationships between cortical and subcortical processes.

After this, I will consider the formulation of subcortical motivational mechanisms using an abstract logical system approach.

29.2 Ontogenetic development

The development of basic behaviors. From an initial neonatal state, in which the animal has protobehaviors, the animal learns more general responses, by hippocampal learning and amygdala adaptation. These are general behaviors learned during play and exploration.

The development of agonism. In the development of aggression, rat pups hone their fighting skills in rough and tumble play, determining exactly how to hold another down, exactly where on the neck of the opponent to bite, etc. [Pinel, 1993] [Pellis, 1989].

In the development of avoidance, although vervet monkeys have an innate tendency to fear snakes, eagles and leopards, their main natural predators in the wild, it has been shown that infant monkeys learn the proper avoidance response from their mothers. Different human cultures treat fear differently; see for example Catherine Lutz [Lutz, 1988].

The development of attachment. From an initial neonatal state, in which the animal has protobehaviors and also innate mutual regulatory needs, the animal develops attachment behavior with its parents. This involves creating and using a working model.

The formation of a secure base, meaning a secure attachment relationship between child and parent, allows the child to explore the world and to develop further.

We should be able to model the development of the different types of attachment relation resulting from the different temperaments of child and parent, and from parent-child interactions with different characteristics.

The development of sexual relations. The development of sexual relations involves maturation, adaptation and learning. My overall system diagram, Figure 29.3, indicates the central role of the hippocampus in generating cortical behaviors, being influenced by the amygdala, and influencing adaptation in the amygdala.

In the development of basic sexual behavior, early development produces a model of social behavior in general which is used in the development of more specialized relationships. Sexual behavior is also learned, based on innate responses, and other learned behaviors. This will involve the development of contexts (often called scripts in the sex literature) and working models, and will involve mainly cognitive learning in cortex, but also learning in the amygdala.

I am here ignoring other important aspects of sexual development, notably the formation of gender identity and gender preference as described by Robert Stoller [Stoller, 1968] [Stoller, 1974] [Stoller, 1985].

In the development of adult sexual behavior, during adulthood, the animal has basic sexual encounters and learns sexual contexts with partners. In the case of prolonged sexual activity with a given individual, these contexts become more developed. Over longer periods, sexual contexts become generalized to a sexual companion relationship which is based not only on sexual activity but on other experiences. This involves a representation of the companion by a working model and contexts. This sexual companion relationship stabilizes fear and social status issues.

Mechanisms of development of motivational behaviors. Let's assume:
1. Neonatally, there is no cortical involvement. The amygdala and hypothalamus control the organism and generate protobehaviors for fear, attachment, etc.
2. Then learning/development occurs:
(a) amygdala and other subcortical information is communicated to the cortex: (i) what subcortical percepts occur, (ii) what subcortical actions occur, and (iii) what evaluative rewards/punishments occur;
(b) the cortex learns (i) states, (ii) percepts, and (iii) actions, which increase and decrease control of fear, attachment, etc.;
(c) the cortex acts, generating information used by the amygdala;
(d) this information is communicated to the amygdala;
(e) the amygdala (i) uses this information for action, and (ii) uses this information for its own learning/plasticity.

Information on reward and punishment and amygdala action is communicated to the hippocampus from the amygdala, and information on learned episodes is communicated to the amygdala from the hippocampus.

Control theoretic work. There has been some work on modeling motivation using control theory, which has focused on hunger, thirst and sexual appetitive behaviors [Toates, 1986] [McFarland, 1993]. However, this modeling limits itself to the basic survival level, just the lowest levels of control, which are below any of the other levels we are interested in.

29.3 The interface between the subcortical systems and the core brain model

1. Interface to the perception-action hierarchy:
(i) Perception. Perception information from anterior temporal areas such as TE is sent to the subcortical systems, and perceptual tuning information is sent back to several areas of the visual ventral hierarchy.
(ii) Context. Context information is sent from orbitofrontal areas to subcortical systems, and affective components of the current situation are sent to the orbitofrontal areas.
(iii) Goals. Information on goals is interchanged between the anterior cingulate areas and subcortical areas.
(iv) Planning. Information on the current plan, and also inhibitory, and possibly excitatory, signals are sent from prefrontal planning areas to subcortical areas. Information concerning affective state, and possibly descriptions of suggestions for plans, are sent from subcortical areas to prefrontal cortex.
(v) Action. Information is sent from subcortical areas to, mainly medial, premotor areas.
2. Interface to memory systems:
(i) Episodic memory. Event memory and control information is exchanged between the amygdala and the hippocampus. This provides affective information to event representations, and also informs the amygdala of the current event and episode.
(ii) Procedural memory. Subcortical areas send information to the procedural memory system, which could alter the selection and execution of procedures.

29.4 Modeling agonistic motivation

I can now propose outline mechanisms for agonistic behaviors, referring to Figure 29.1.

[Figure 29.1: Postulated brain mechanisms for agonistic behaviors. Perception hierarchy and temporal lobes: perception of agonistically significant features; models of agonistic behavior in general. Orbitofrontal cortex: model of agonistic behavior with a specific partner. Anterior cingulate: goal storage and prioritization of cortical agonistic goals. Prefrontal cortex: agonistic plan execution. Amygdala: agonistic responses and goals, adaptive and learned at this level. Hypothalamus: agonistic responses and goals, relatively nonplastic innate responses. Peripheral nervous system, motor system and hormonal systems: states of agonistic readiness, basic agonistic actions and phases of agonistic activities.]

Modeling the fear of snakes. Monkeys have an innate predisposition and then develop the full fear response to snakes by observation of the fear responses of adults [Mineka and Cook, 1988] [Mineka and Cook, 1993] [Mineka and Ohman, 2002].

We do not have enough information to attribute learning to specific areas of the amygdala and/or the cortex. One plausible idea is that (i) the visual areas of the cortex learn to recognize snakes in more detail, (ii) the amygdala learns the survival significance of snakes - how important, i.e., dangerous, certain snakes in certain positions with certain behaviors are, and also (iii) the amygdala learns to generate appropriate responses, which would be to make snake alarm calls and to generate actions for the avoidance of snakes, which (iv) could be elaborated by the frontal cortex into jumping onto trees/rocks or whatever is available.

29.5 Modeling attachment motivation

I can propose outline mechanisms for attachment behaviors, referring to Figure 29.2.

[Figure 29.2: Postulated brain mechanisms for attachment behaviors. Perception hierarchy and temporal lobes: perception of affiliatively significant features; models of affiliation in general. Orbitofrontal cortex: model of affiliation with a specific partner. Anterior cingulate: goal storage and prioritization of cortical affiliation goals. Prefrontal cortex: affiliative plan execution. Amygdala: affiliative responses and goals, adaptive and learned at this level. Hypothalamus: affiliative responses and goals, relatively nonplastic innate responses. Peripheral nervous system, motor system and hormonal systems: states of affiliative readiness, basic affiliative actions and phases of affiliative activities.]

Modeling the development of attachment. We know that proximity leads to reinforcement via opiates, and that separation leads to avoidance due to its distressing nature. Both the infant and the caretaker learn to represent each other, although these are different tasks, since the caretaker already has many working models of friends and acquaintances whereas the infant has none. It is more than simple prediction, however; it seems to me that the infant forms a shared state with the caretaker which allows it to reduce its avoidance response and thereby to explore. So the working model leads to predictability and therefore security.

Using an analysis analogous to that above for fear, one plausible idea is that (i) the visual areas of the cortex learn to recognize the caretaker, (ii) the amygdala learns the survival significance of the caretaker - how important, i.e., the effect of the caretaker’s actions on the subcortical systems, and also (iii) the amygdala learns to generate appropriate responses which would be smiling, crying etc., which (iv) could be elaborated by the frontal cortex into more complex and discriminated actions. At the same time, there will be some adaptation and learning by the caretaker concerning the behavior, intentions and personality of this particular infant.

29.6 Modeling sexual motivation

From the typical time course of a sexual relationship, diagrammed in Figure 28.2, I can now characterize four phases of behavior:

1. Basic sex is mainly subcortical.

2. Affiliatively enhanced sex

(a) Stronger subcortically, since fear and social status inhibition are reduced

(b) Cortical sexual contexts rewarded by achieving sexual and affiliative goals

(c) Cortical affiliative contexts rewarded by achieving sexual and affiliative goals

3. Affiliative sex

(a) Less strong subcortically due to familiarization

(b) Less strong cortically due to familiarization

(c) Cortical sexual contexts rewarded by achieving sexual and affiliative goals

(d) Cortical affiliative contexts rewarded by achieving sexual and affiliative goals

4. Affiliation incorporating some sex

(a) Cortical sexual contexts rewarded by achieving sexual and affiliative goals

(b) Cortical affiliative contexts rewarded by achieving sexual and affiliative goals

Brain mechanisms for sexual behaviors. In Figure 29.3, I postulate what data each module receives, stores and acts upon, and what actions it takes.

To summarize the roles of the different modules involved:
1. Perceptual hierarchy of modules. Learns to recognize relevant sexual features.
2. Temporal lobe. Stores sexual and affiliative models which guide behavior. Generates cortical goals for sexual and affiliative behaviors and sends them to the anterior cingulate.
3. Anterior cingulate. Receives goals and prioritizes them, sending the most important ones to the prefrontal planning modules.
4. Prefrontal planning and action hierarchy. Executes sexual plans and behaviors, sending commands to motor cortex and thence to motor systems. It also has two-way communication with the amygdala and hypothalamus.
5. The amygdala. Has innate sexual responses and is plastic. Learns sexual responses from perception of stimuli, reward information and information from cortical areas. Sends outputs to motor systems and the peripheral nervous system to evoke its innate and learned sexual behaviors.
6. The hypothalamus. Has innate sexual responses and is not plastic. Sends outputs to motor systems and the peripheral nervous system to evoke basic sexual behaviors.

[Figure 29.3: Postulated brain mechanisms for sexual behaviors. Perception hierarchy and temporal lobes: perception of sexually significant features; models of sexual behavior and of affiliation in general. Orbitofrontal cortex: models of sexual behavior and of affiliation with a specific partner. Anterior cingulate: goal storage and prioritization of cortical sexual and affiliation goals. Prefrontal cortex: sexual plan execution. Amygdala: sexual responses and goals, adaptive and learned at this level. Hypothalamus: sexual responses and goals, relatively nonplastic innate responses. Peripheral nervous system, motor system and hormonal systems: basic sexual actions, states of sexual readiness and phases of sexual activities.]

29.7 Logical models of motivational systems

29.7.1 Formal representation

I'll now try to outline the precise representation of the action of these systems using logical rules. This representation would be a precise theory of these systems, and it can also be run as a logic program which constructs the specified action.

29.7.2 Representing the hypothalamus

Data for describing the hypothalamus.
Inputs to the hypothalamus: (i) sensory: olfactory, tactile, visceral; (ii) blood hormones; (iii) neural inputs from the amygdala.

Outputs from the hypothalamus: (i) descending neural outputs: sympathetic, parasympathetic, substantia nigra; (ii) ascending neural outputs: amygdala, cortex (OFC); (iii) regulators to the pituitary; (iv) blood hormones.

State of the hypothalamus: (i) circadian state; (ii) storage of hypothalamically generated hormones in the pituitary.

Input data types. The types of input data include:
blood: blood hormone levels testosterone(X), estrogen(X); blood_temperature(T), blood_glucose(G), blood_salinity(S).
sensory afferents: auditory(A), gustatory(G), olfactory(O), visceral(V).
afferents from other areas: amygdala(State).

Output data types. regulator(R), hormone(adh(X)), hormone(oxytocin(X)), pituitary(Pit), symp(symp(X)), parasymp(para(X)), motor(freeze(X)).

Stored data types. circadian(State).

Logically describing data

We need to give all the types of data to be used in the system. I represent each type by a different name, and I use this name as a functor with arguments for any values associated with the data. For example, if we want to specify that the hormone testosterone is data that is used, either as input, output or stored, then we might use the logical term testosterone(X), where X will be set to a real number giving the current level of testosterone at that point in the system. Alternatively, we might prefer to group all the different hormones together, and use a logical term such as hormone(testosterone,X).

We can use an underline character within a name to give greater mnemonic value, for example blood_glucose(G). The names are not important; they just have to be used consistently.

The variables like X, etc., do not have to be unique; we can have sugar(X) and spice(X), etc., but when they occur in the same expression they denote the same variable. We don't need to limit the values of a variable to be numbers; we can also use symbols. For example, motor(freeze) might represent the data sent from the hypothalamus to the central gray, which is a complex neural code that has the effect, or means, to freeze. We could also parametrize this, giving the degree of freezing: motor(freeze,0.4).

Names which are functors or other constants must begin with a lower case letter, and any name beginning with an upper case letter will denote a variable.

All that matters is that the data term should be usable by action rules in the system, that is, it should be able to match to the left hand side of any rule which uses it.
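To make this concrete, the input data types listed above might appear during one time interval as the following set of terms; the particular values and symbols here are invented for illustration:

testosterone(3.2).
blood_temperature(37.1).
blood_glucose(90.0).
blood_salinity(0.9).
olfactory(partner_scent).
visceral(stomach_full).
amygdala(sex).

Each of these terms can then match a condition such as blood_salinity(S) on the left hand side of a rule, binding S to 0.9.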

Logically describing action

We can now write the action of the hypothalamus as a set of rules which act concurrently.

A rule describes a change that will occur during the next short time interval. This time interval could be 20 milliseconds for example. Thus the rule is a bit like a time derivative.

Time is divided into discrete intervals, of say 20 milliseconds. So, the data description describes whatever is received by the module during a unit time interval. Rules describe what change happens during the next time interval. A rule specifies what outputs are generated and sent to other modules, and what data is generated and stored back into the same module.

A rule matches to the incoming data for that time instant, and it also matches to any stored data in the module.

At the end of each unit time interval, all the output data that has been generated by rule action is then sent to the appropriate modules which receive it and act on it in the next time period.

The current time constant of 20 milliseconds may not be appropriate for dealing with hormone transfer, etc. If, however, hormones are slower than neural signals then it should be alright to use the time constant appropriate for neural signal processing, hormone processing will just take several cycles.

In the case where a continuously varying level is input, a stored value may be needed to capture the behavior of a module:

if(input_level(X), stored_level(Y)),
provided(Z1 is function1(X,Y), Z2 is function2(X,Y)),
then(output_level(Z1), stored_level(Z2)).

e.g., Z1 is Y + X**2. So the stored value is also updated. In this way one should be able to represent continuously time-dependent variations of different levels.
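To pin down these execution semantics, here is a minimal sketch, in plain Prolog, of one synchronous cycle of a single module. This is illustrative only and is not my BAD package: the rule/3 encoding of if/provided/then rules, the is_stored/1 classification of actions, and the naive prepend-to-store update are all assumptions of the sketch.

% A rule stored as a fact, in the if/provided/then form used here.
rule(if([blood_salinity(S)]),
     provided([S < 0.7]),
     then([regulator(adh, 1.0)])).

% Actions classified as stored data; everything else is output.
is_stored(stored_level(_)).
is_stored(circadian(_)).

match_all([], _).
match_all([C|Cs], Data) :- member(C, Data), match_all(Cs, Data).

guards_hold([]).
guards_hold([G|Gs]) :- call(G), guards_hold(Gs).

% One cycle: fire every rule whose conditions match this tick's inputs
% plus the store, collect all generated actions, and split them into
% store updates and outputs to be routed through channels.
cycle(Inputs, Store0, Outputs, Store) :-
    append(Inputs, Store0, Data),
    findall(Acts,
            ( rule(if(Conds), provided(Guards), then(Acts)),
              match_all(Conds, Data),
              guards_hold(Guards) ),
            ActsPerRule),
    append(ActsPerRule, AllActs),
    partition(is_stored, AllActs, Updates, Outputs),
    append(Updates, Store0, Store).   % naive update: new items shadow old

For example, cycle([blood_salinity(0.5)], [], Outputs, Store) gives Outputs = [regulator(adh,1.0)] and an unchanged store. All rule firings within a tick see the same data, and their effects only become visible at the next tick.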

Here then are some outline rules for the action of the hypothalamus. By outline rule, I mean that a lot of details have not yet been put in. I will put each condition, internal computation expression and output expression on its own separate line.

I use a % sign to indicate a comment.

% fluid balance
if(blood_salinity(S)),
provided(S < threshold),
then(regulator(adh,X)).        % to pituitary

% eating
if(gustatory(G), olfactory(O), visceral(V), blood_glucose(BG)),
then([regulator(gh,X),         % to pituitary
      saliva_control,          % to autonomic system
      digestion_control,       % to autonomic system
      eating_goal]).           % to motor system

% body temperature
if(blood_temperature(T)),
provided(T < t_lower_threshold),
then(constrict_blood_vessels, shiver, inhibit_sweating).

% sexual behavior
if(testosterone(T), male(Self), and/or amyg_state(sex)),
then([regulator(FSH),          % to pituitary
      regulator(LH),           % to pituitary
      sexual_goal]).           % to motor system

% aggressive behavior
if(cortical_state(C_aggression), amyg_state(A_aggression)),
provided(C_aggression > C_agg_thresh, A_aggression > A_agg_thresh),
then([aggression]).            % to autonomic

if(cortical_state(C_aggression), amyg_state(A_aggression)),
provided(C_aggression < C_agg_thresh, A_aggression < A_agg_thresh),
then([reduce_aggression]).     % to autonomic

% arousal
if(cortical(C_arousal), amyg_state(A_arousal)),
provided(C_arousal < C_arousal_thresh, A_arousal < A_arousal_thresh),
then([increase_arousal]).      % to cortex etc.

if(cortical(C_arousal), amyg_state(A_arousal)),
provided(C_arousal > C_arousal_thresh, A_arousal > A_arousal_thresh),
then([decrease_arousal]).      % to cortex etc.

% sleep-wake cycle
if(time(T)),
provided(cycle(SW,T)),
then(sleep_wake(SW)).

Thus, a set of rules can give an exact theory of the action of the hypothalamus, and it can also be run on a computer, giving a computer model of the theory which can compute the detailed behavior that the theory specifies.

Describing connectivity and data transfer. It is assumed there is a specification saying that certain types of data generated in a certain brain module are always transferred to certain other brain modules at the end of each processing cycle. For example, we might have specified that expressions matching regulator(R,Z) generated by the hypothalamus rules will always be transferred to the pituitary.
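Such a specification could itself be a small table of Prolog facts; the templates and module names below are illustrative (the regulator entry follows the example above):

% channel(FromModule, ToModule, Template): only terms unifying with the
% template are carried to the destination at the end of a cycle.
channel(hypothalamus, pituitary,   regulator(_, _)).
channel(hypothalamus, amygdala,    arousal(_)).
channel(amygdala,     hippocampus, reward(_)).

% A generated term is delivered to every module whose channel template
% it matches; \+ Term \= Template tests unifiability without binding.
deliver(From, Term, To) :-
    channel(From, To, Template),
    \+ Term \= Template.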

29.7.3 Representing the amygdala

Inputs: (i) sensory: olfactory, tactile, visual, visceral; (ii) neural: from the hypothalamus; from the hippocampus, septum and subiculum; from the thalamus; from the temporal lobe; from prefrontal cortex, particularly OFC.

Outputs: (i) descending: hypothalamus, PAG (periaqueductal gray); (ii) ascending: hippocampus etc., thalamus, temporal lobe, prefrontal cortex, especially OFC.

29.8 Towards a logical model for sexual behavior

The different modules

I write down an outline logical model with eight modules: temporal lobe, amygdala, hypothalamus, substantia nigra, ovaries, adrenal gland, olfaction and vision. This corresponds to Figure 29.3.

% Temporal lobe module.
if(working_model(X)),
then(goal(affiliate(X)),       % to anterior cingulate
     goal(sex(X)),             % to anterior cingulate
     affiliative_sex(X),       % to amygdala
     affiliation(X)).          % to amygdala

% Amygdala module.
if(affiliative_sex(X)),
then(affiliative_sex_hyp(X),        % to VMH of hypothalamus
     affiliative_sex_action(X)).    % to SN

% Hypothalamus module.
if(estradiol(X,E), olfaction(X,O), visual(X,V)),
then(sex_action(X)).           % to SN

if(affiliative_sex_hyp(X), olfaction(X,O), visual(X,V)),
then(sex_action(X)).           % to SN

% Substantia nigra module.
if(sex_action(X)),
then(sexual_behavior(X)).      % to environment (body)

if(affiliative_sex_action(X)),
then(affiliative_sexual_behavior(X)).   % to environment (body)

% Ovaries module.
if(cycle(T)),
provided(levele(L,T)),
then(estradiol(L)).            % to environment (blood)

% Adrenal module.
if(cycle(T)),
provided(levele(L,T)),
then(estradiol(L)).            % to environment (blood)

if(cycle(T)),
provided(levelp(L,T)),
then(progesterone(L)).         % to environment (blood)

Sensors

In addition, we need olfactory and visual sensors which detect features in the environment and send descriptions to the olfaction and visual modules.

% Olfaction module.
if(olfactory_feature(F)),
provided(detect(F,X,O)),
then(olfaction(X,O)).          % to hypothalamus and amygdala

% Visual module.
if(visual_feature(F)),
provided(detect(F,X,O)),
then(visual(X,O)).             % to hypothalamus and amygdala

The environment

Representing the environment. This consists of a formal representation of everything relevant that is outside of the brain. It would be a set of expressions of the forms:
blood(hormone(H,L)), blood(salinity(L)),
position(Animal,X,Y,Z), orientation(Animal,[B,H]),
smell(Animal,Feature), visual(Animal,Feature).

Sensing the environment. There are sensors which are simple functions which look at the current state of the environment and generate description expressions for sensed features of the environment.

Acting upon the environment. There are effectors which at each time interval may cause a change in the environment.
Skeletal: move(Animal,X,Y,Z), rotate(Animal,[B,H]).
Blood: secrete(hormone(H,L)).

Time dependence of the environment. In addition to changes made by actions of the animal, there may be changes due to physics. For example, an outside temperature may fall, sweat may dry.
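One environment time step can then be sketched as applying the effector actions followed by simple physics. The list-of-facts representation and the particular helper clauses below are assumptions of this sketch:

env_step(Env0, Actions, Env) :-
    foldl(apply_action, Actions, Env0, Env1),
    physics(Env1, Env).

% Effectors: secretion adds a blood fact; movement replaces the
% animal's position fact.
apply_action(secrete(hormone(H, L)), Env0, [blood(hormone(H, L))|Env0]).
apply_action(move(Animal, X, Y, Z), Env0, [position(Animal, X, Y, Z)|Env1]) :-
    ( select(position(Animal, _, _, _), Env0, Env1) -> true ; Env1 = Env0 ).

% Physics: for example, the outside temperature drifts slowly downward.
physics(Env0, [temperature(T)|Rest]) :-
    select(temperature(T0), Env0, Rest), !,
    T is T0 - 0.1.
physics(Env, Env).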

Deriving rules from experimental data

My main idea is that the rules of my model will be derived from, and justified by, reported experimental data. We need to look at enough data to cover all the different actions that take place in these modules, and further, the complete set of rules should still correctly describe each individual finding. That is, there should not be any residual conflict between rules derived from different findings.

Part VI

Conclusions

Chapter 30

Consciousness

Abstract. I briefly summarize the various properties of the model and point out that they have some correspondence to properties of consciousness.

This involves discussing the issue of describing consciousness, as regards its distributed although bound form, its relation to coherence of mental states, and its relation to creative vs. routine action.

I also point out connections to the work of Wilhelm Dilthey.


30.1 Some remarks on consciousness

Does my model shed any light on the perplexing subject of the scientific study of consciousness? I have always regarded this subject as something I should leave well alone. However, I will venture a few thoughts.

1. The instantaneous active state of the brain according to my model is distributed over many modules. The distributed nature of consciousness has been studied using functional MRI and has been reviewed by Nancy Kanwisher [Kanwisher, 2001]. Her conclusion was that the experimental data support the idea of consciousness being distributed over many neural areas. When the subject pays attention to some aspect of the percept and makes it conscious, the corresponding brain area is found to light up.

2. In my model, this distributed activity is bound together by the exchange of data. This may result in some temporal synchronization but this would be an effect and not a cause of binding.

3. Modules continuously respond to new data that they receive, by creating a maximally coherent and consistent state of their stored data and active rules.

4. The set of modules which constitutes the brain, by the exchange of data and by responding to its own goals, tends to put itself into a viable state where the states of the different modules cohere into an activity which attempts to solve its goals and which coheres with perceived behavior of other persons in social interaction with them.

5. The characteristic time taken by my model to attain such a coherent state of the entire system is about 300 milliseconds. This seems to agree with the experimental work of Benjamin Libet, who found a 350 millisecond delay between the time of directly stimulating the brain of a patient and the time at which the patient reported becoming aware of the stimulation [Libet, 1993].

6. Visual perception and mental imagery should coexist and be able to combine flexibly. This is Stephen Kosslyn’s observation [Kosslyn, 1994], and my modeling approach should be able to incorporate it.

7. My theory of routine and creative action, and of the interleaving of their activities, gives a plausible model of the experience of action and of the choice of action. It is also supported by experimental observations of action slips. Note that in my logical formulation it is possible to distinguish between creative and routine action. Creative action is the activity of the cortex consisting of knowledge/rule activations. This activity recreates the learned knowledge in the current situation. The idea that consciousness results from the re-creation of descriptions was put forward some time ago. Unfortunately it cannot be quite as simple as this, since visual percepts are conscious, and it would seem that they may not be a result of re-creation of descriptions.

8. I will characterize emotions in terms of cortical states, but these have to be combined with corresponding co-occurring subcortical states of activation of the amygdala and hypothalamus, to produce complete brain states corresponding to emotional experiences. In any case this could be entirely the wrong approach; Marvin Minsky has been developing a totally different approach where emotions are properties of system dynamics [Minsky, 2003].

9. Thus the activity of my model seems to be consistent with observed brain activity corresponding to consciousness.

30.2 Lived experience and creative imagination

Wilhelm Dilthey, 1833-1911, was a German philosopher and a founder of hermeneutics. He concluded that our main activity, what he called the life sciences, was to live our experience, to express this lived experience to ourselves and others, and to understand the expression of ourselves and others. Thus our main motivation and activity is to understand each other's lived experience, or consciousness. Lived experience we can also call Erlebnis, from the German.

Dilthey defines "life" as all of human experience, i.e., consciousness; it is not about biology at all. He sees each individual as taking part in the historical progress of humanity. In this way he factors into this process, from its inception, language, cultural activities, and social convention.

The main human activity is the understanding of human experience, that of others and of ourselves. This understanding process involves rational analysis, but ultimately rests on a primitive empathic process in our minds.

Dilthey says that all of historical analysis, and of art, concerns trying to understand the basic human experience behind activities in the world.

Understanding depends upon everyone’s expression of their lived experience. Expression involves the outer sign, consisting of an event in the physical world, of something inner, i.e., a thought or a feeling.

Thus we live a cycle of lived experience, expression and understanding.

Dilthey pointed out that this fundamental human activity is something like poetry. Interestingly, the word "poetry" comes from the Greek "poiesis", meaning "making" or "doing". It's the thing that we do.

More recently, Humberto R. Maturana and Francisco J. Varela introduced the word "autopoiesis", applied to biological systems, meaning self-making or self-production.

Incidentally, according to Dilthey, experience cannot be examined logically. One can understand why he would believe this, particularly since, in his time, logic was still based on Aristotelian ideas, formal logic having not yet been developed.

My own rapprochement is to take knowledge and memory to be grounded in "raw" experience. Thus there will be unanalyzable primitives of perception, memory and knowledge. However, we can surely elaborate and describe our experience, and this elaboration will be a kind of logic.

Chapter 31

Towards a computer science of the brain

Abstract. I outline a computer science of the brain, based on my analysis and my modeling research.

I can now write down a set of concepts and principles for a computer science of a class of brain-like computers, based on the model I have developed.

This includes architectural, data and processing concepts, representational principles, and optimization measures.


31.1 Introduction

I previously defined a computer science, for a particular class of machines, as a set of concepts and mechanisms which describe data representation, control structures, languages and communication, programming and specification languages, and general theoretical properties of that class of machines.

I will try to outline a computer science for the brain, using a class of machines based on my model.

31.2 Concepts

1. Data values are structured and have specifications of how they are to be updated, which are a form of integrity condition. They may also have attenuation properties.

2. Communication channels. Data values are communicated between modules through channels. Only certain data types are output to certain modules. Output is competitive and thresholded.

3. Program. A program is a data type which specifies a rule. It has a set of patterns which match to data values, a body specifying computation, and a set of patterns specifying update and output data values that it produces.

4. Module. A module has a set of specifications of data types handled by it, a store of data values of these types, a set of programs, and a set of specifications of filters for rules, updates and outputs.

5. There is a uniform process which executes modules and channels in parallel, by (a) executing programs, (b) filtering data values produced by programs, (c) updating stores, (d) transferring data values through channels, and (e) managing stored data values.

6. Execution of a program: (a) depends on the set of stored data values, (b) generates a set of update values which may change the store, and (c) generates a set of output data values which may be transmitted to other modules.

7. Filtering of data values produced by programs involves: (a) competitive filtering of rule activations, (b) competitive filtering of sets of update data values, and (c) competitive filtering of sets of output data values.

8. Updating of a store involves (a) recognition of incoming data values, (b) updating according to data type update specifications, and (c) ramping up or down the weights of stored data values.

9. Outputting values to channels involves: (a) only certain data types being output to certain modules, and (b) possible channel capacity constraints.

10. Storage management of stored data values: (a) time-dependent mechanisms associated with individual data items, and (b) management of persistence and strength of data values.

11. Parallelism. Modules and programs all execute in parallel.

12. Perception-action hierarchy. Modules and their data types may form a perception-action hierarchy.
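These concepts can be written down directly as declarative structures. The following shapes, in the Prolog notation used earlier in the book, are purely illustrative; every functor, name and number below is invented:

% a data type, with an update specification and an attenuation property
datatype(goal(_, _),
         update(replace_by_stronger),
         attenuation(0.95)).

% a channel carrying only one data type, with thresholded output
channel(anterior_cingulate, prefrontal, goal(_, _), threshold(0.4)).

% a program: match patterns, a body computation, update and output patterns
program(prioritize_goals,
        match([goal(G, S)]),
        body(S > 0.5),
        updates([]),
        outputs([goal(G, S)])).

% a module: data types handled, a store, programs, and filters
module(anterior_cingulate,
       types([goal(_, _)]),
       store([goal(approach_partner, 0.7)]),
       programs([prioritize_goals]),
       filters([top_k(outputs, 3)])).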

31.3 Design, constraints and optimization principles

In addition to concepts, there are certain design principles for brain descriptions which are intended to capture biological requirements and realities.

Energy consumption. Biologically, the use of physical energy is a strong limitation [Allman, 1999]. Energy is derived from metabolism, and basically all ingredients are brought in via the blood stream. Energy is used for different processes with different priorities and consequences. For realtime information-processing, energy is used for synthesizing neurotransmitter molecules and storing and moving them through the system, and also for modulating synaptic receptors. Energy is also used for learning and modification of the system over a longer time scale. This involves strengthening synapses, creating receptors and growing axons and dendrites.

Economics of rule firing. I take rule firing to correspond to the consumption of energy, with greater energy consumption for more complex data items. I could take the measure that is to be minimized to be the number of matches involved, which is actually the number of unifications in my formulation.

Economics of storage use. I assume storage is a separate energy consumption consideration, and is not as energy consuming as rule firing. In general I assume that there is plenty of storage available in each module and that storage size is not a limitation.

Bandwidth is bounded, both for communication of data between modules and for matching of rules to stored items. Thus coding of information must fit within this. I identify information with chunks. A chunk is a packet of related information containing a number of components. Large-scale information needs to be represented as a set of chunks of bounded size, which are communicated serially and matched serially.
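For instance, a large event description could be coded as several bounded chunks that reference one another by identifier; the functors and the chunk size here are invented for illustration:

% one event split into chunks of at most four components each
chunk(ev42, [actor(john), action(give), object(book), time(t17)]).
chunk(ev43, [part_of(ev42), recipient(mary), place(library)]).

Chunks of this bounded size can then be communicated through a channel and matched against rules one at a time, respecting the bandwidth limits.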

Complexity of processing. Atomic processes, such as the computation by a single neuron, will have limited complexity, perhaps corresponding to a monotonic piecewise smooth function.

Handling large amounts of data. A major problem is the filtering, selection and routing of data and the rapid updating of storage. This includes the removal or reduction of data that has become out of date or irrelevant.

Fixed architecture. A brain system will have a fixed number of modules, with fixed connectivity. The set of modules and connections needs to be specified.

Fixed data types. The set of data types needs to be specified. I assume very slow changes here. Initial learning facilitates genetically controlled construction of the system.

Fixed sensors and effectors. Similarly, sensors and effectors cannot be changed easily.

Stability of distributed processing. The system needs to be able to search a range of possibilities and then to focus on a single distributed process.

Seriality at the highest level. It is generally believed that the top level of control corresponds to consciousness and is serial (with a given bandwidth).

Learnability. Most logic and data need to be learned, by the system from its own behavior and from communication with others.

Event coding and retrieval is key to learning, since all knowledge must be derived from mental experience.

Routinization is the key to efficiency, adaptation and realtime performance.

31.4 Programming language

I have developed and am using a programming language which supports the construction of brain models. BAD (Brain architecture description language), discussed in chapter 17, is a Prolog package which provides for the specification of datatypes, parallel modules, parallel rule sets and module connectivity, and all of the mechanisms described here. Information can be found at URL: http://www.cs.caltech.edu/~bond/bad.html

31.5 Research issues

I list here some research issues of importance which need to be addressed.

Data representation issues:
Updating. How can updating of the store from incoming data items be accomplished efficiently? How do we deal with the replacement of an item by a similar, but possibly nonmatching, equivalent item?
Coherence within a data store. How can we maintain integrity and consistency within the store, as conditions and items change with time?
The use of wide data items. How much of the large width of data items should we use for the representation of significant structure, and how much for redundancy?
Representation of visual images. How do we extend the logical term representation to handle visual images and other types of visual information?

Control issues:
Coherence of action among different modules. How is the action of a module kept relevant to, and consistent with, that of other modules?
Stability of distributed processes. How does the set of activations of different modules maintain a coherent and useful relationship?
Control properties of perception-action hierarchies. How can we characterize the various modes of activity of a perception-action hierarchy?

Programming issues:
Description languages for describing the module and channel level. What is the best kind of language for describing overall system structure and dynamics?
Programming languages for describing the data and rules. What is the best kind of language for describing computation and storage within a module?

Theoretical issues:
Properties of these machines. Can we prove interesting and useful properties, such as stable control strategies that are guaranteed to converge and terminate, and processing of classes of language?
Completeness of inference by a distributed system. Can we show logical completeness for a system, e.g., if there exists a successful distributed strategy, will the system always find it?

Implementation issues:
VLSI. How do we implement the uniform process in VLSI hardware?
Neural nets. How do we implement the uniform process as a neural net?

Multiperson cooperation:
Social action. How do two, or a set of, persons (i) initiate, continue and terminate action, (ii) find optimal social action, and (iii) maintain shared memories?
Communication. What are the best methods of communication during social action, and what protocols?

31.6 Summary and conclusion

In this chapter I have done what I set out to do, namely to outline a computer science for the brain.

Chapter 32

Toward Brain Science

Abstract. In this chapter, I make a few observations concerning my proposed discipline of brain science that arise from the research discussed in this book.


32.1 Describing the brain

Chapter 33

Summary of the model

Abstract. In this chapter, I gather together everything that I have developed in the rest of the book and give a very brief account of the whole model. This will include both the core dynamic model and also the proposed extensions.

I also make a few statements concerning (i) basic tenets and assumptions, (ii) the limited role of symbols, (iii) the abstract representation proposed, and (iv) the relationship to neurons and neural nets.


33.1 Overall

The model is set at a system level of analysis and is composed of logical expressions, modules and channels, and not neurons. The level below this is a set of interacting associative memories corresponding to the layers of the cortex. These associative memories are then explainable by neural nets at the next level below this.

What then, in summary, is our theory of brain structure and function? In the next sections I'll summarize the tenets, representation and core dynamic model, which have been completed all the way to a computer specification and realization. After this, in the following sections, I'll summarize work in progress on the extension of the model to include vision, language and subcortical systems.

33.2 Some basic tenets of the theory

(i) My theory of the brain's action uses a concept of data, separate from processing; data consists of discrete data items which I call descriptions. Descriptions can contain real numbers, and they have strengths and certainties. They are chunks of information.
(ii) The brain does not use symbols, in the sense of arbitrary tokens which can be associated with values, except in certain kinds of language processing.
(iii) The different kinds of processing occurring in the brain can be represented by the inferencing of descriptions from sets of other descriptions.
(iv) Inferencing occurs by the matching of rules to descriptions. Rules are themselves composed of more abstract descriptions.
(v) The meaning of a description is defined by its effect on processing, and not by its origins in sensory perception or its connection with motor output.

33.3 The basic representation

I developed a theory of description of data items and of processing. Descriptions are represented by logical literals corresponding to assertions of facts. Descriptions are generated by sensors by sensing the environment, they are transmitted among modules and stored in modules, and effectors are driven by output descriptions and cause changes to the environment.

Processing occurs within each module and is represented by an inference process consisting of general rules for re-construction by the re-combination of learned descriptions. Plans are represented as descriptions, and are grouped together in contexts. Processing corresponds to inferences of sets of descriptions from other given sets of descriptions. Processing is repeated until quiescence, when all the useful descriptions have been inferred from the module's current store of descriptions.

Descriptions have a limited size corresponding to the bandwidth of channels between modules and to the width of the associative memories in which they are stored.

33.4 The core dynamic model

The brain is structured as modules which run concurrently, transmitting, receiving and storing descriptions. The cortex is structured as a perception-action hierarchy of modules, where each module is specialized to process descriptions of certain types characteristic of that module, see Figure 33.1.

The perception-action hierarchy can perceive and act. It forms goals, evokes plans and elaborates them into a stream of output actions. It behaves as a hierarchical real-time control system for producing the behavior of the human, see Figure 33.2.

[Figure 33.1: Modules from neural areas of the primate neocortex corresponding to my initial system model. The perception-action hierarchy of modules MI, SI, SA1-SA3, DV1-DV3, PA1, PA3-PA5, VI, VV1, PM1-PM3 and G (goals) is mapped onto numbered Brodmann areas, with auditory, visual, olfactory, somatosensory and gustatory input and motor output.]

I found that the perception-action system is also naturally suited to cooperative interaction between two or more humans.

The cortex learns by storing descriptions that are generated by separate learning modules. Each module provides a short term store of limited capacity and a long term store of arbitrarily large capacity. Kinds of learning intrinsic to the cortex include priming of stored data and also possibly some generalization and optimization of access.

The main learning process, the creation of data to be stored, occurs in three specialized modules - the hippocampus, the basal ganglia and the cerebellum - which are not part of the cortex, and which provide respectively episodic memory formation, perception-action association, and classical conditioning, see Figure 33.3.

[Figure 33.2: Functioning of interacting perception and action hierarchies in behavior. At each level, perception (of social relations and dispositions, of actions in relational terms, of plans in spatial detail, and of features), with attention dependent on requested information, feeds evaluation against goals, and elaboration carries the selected joint plan downward from prioritized goals through relational and spatial descriptions to the generation of detailed motor commands, from sensors to effectors.]

I also have come up with a simpler way of diagramming the system, which I call summary diagrams, starting with Figure 33.4.

I developed representations for events, episodes, and contexts, and of the processing in the hippocampus which forms representations of events and episodes. I also developed processing in ventral prefrontal areas which generalizes episodes into contexts which are then stored.

Episodes are stored, giving a short-term episodic memory, Figure 33.5. This memory is consolidated into descriptions which are stored more permanently in cortical modules.

[Figure 33.3: Separate learning modules. The perception-action hierarchy (social relations and goals; perceived dispositions and plan construction; person actions and relations and execution of specific joint plans; person positions and movements and detailed plans for self) connects through an interface to a survival system, a routinization module and an episodic memory module, and through the sensor and motor systems to the body and environment.]

These constitute long-term episodic memory, or autobiographical memory, which is stored in the cortex, with an event map in the hippocampus. These memories can be evoked and merged with the current state of the modules.

Contexts constitute the store of learned plans, Figure 33.6. Both episodes and contexts have a sequential and nested structure. This provides control information for the execution of contexts in the planning module.

Episodes are the source of both autobiographical memories and also contexts containing learned plans. They also can be generalized into time-independent facts, forming semantic memory.

I developed a representation for the basal ganglia and defined perception-action association as an inference process, Figure 33.7. This allowed the learning of procedures to form procedural memories.

I have not yet looked at the cerebellum in the context of this model. One mainstream idea is that it represents detailed spatial relations which it forms by classical conditioning.

I pointed out the need to interleave description-based thought and routine thought, and that there is probably cortical routine action which is learned via episodic memory, as well as basal-ganglia routine activity. I suggested that the interleaving of basal-ganglia routine action could use the control function of the thalamus.

[Figure 33.4: Summary diagram of learning module and core model. The perception-action hierarchy is served by learning modules for stimulus-response conditioning and routine action, and for event and episode formation.]

So this set of structures and mechanisms forms my core dynamic model. It provides a scientific explanation of the main functions of the cortex and how learning occurs.

[Figure 33.5: The episodic memory system. The event module forms the overall event and an event map, with the current episode's events related by temporal adjacency and context; working state, goals and occurring actions from the somatosensory, auditory, visual and plan modules are consolidated into events, and events are evoked and merged back into those modules via queries and replies.]

[Figure 33.6: The formation and use of contexts. The event module forms the overall event, event boundaries and an event map, with the current episode's events related by temporal adjacency and context; episodes are generalized into stored contexts, and a context is evoked from the current event and episode and sent to the plan module in response to CEC queries.]

[Figure 33.7: The association loop of the basal ganglia and cortex, linking motor cortex (M), premotor cortex (PreMC), posterior parietal cortex (PPC), prefrontal cortex (PFC) and visual areas (V) through the basal ganglia.]

33.5 Extensions to vision, language and subcortical systems

33.5.1 Vision

The usual vision hierarchy of the occipital lobe, Figure 33.8, delivers two types of data item. The first is the set of foveal features which are used to form a succession of representations of single objects from foveal data from a succession of fixations connected by saccades. The second is a nonegocentric representation of the entire scene, obtained from peripheral and foveal information. The first is stored in V4 and TE0, drawing upon storage of object types in the temporal lobe. The second is stored in lower parietal areas, and its data can associate into, reference and locate data in V4 and TE0.

[Figure 33.8: The different modules and their data during visual perception. Retinal-frame intensity, edge and color areas and the total visual image feed a visual focus buffer and a visual object buffer holding objects with their locations, a 2.5-3D representation, a nonegocentric peripheral image frame, and a nonegocentric long-term memory of object types, under the control of attention.]

Figure 33.9 gives a summary diagram.

[Figure 33.9: Summary diagram of extension for vision. Visual areas 17, 18, 19, V4, TE0, IP and PO attach to the perception-action hierarchy and to the learning modules.]

These interacting representations are used by the brain in three ways. First, for obtaining visual information for use in problem solving; this includes the computation of spatial relations and also visual search of the external environment by generating saccades. Secondly, visual percepts are incorporated into event representations, both in the image form from V4 and the object-file form from TE0. Thirdly, images can be generated from long term memory and reconstructed and merged into the short-term visual stores, giving mental imagery.

33.5.2 Natural language

I have only scratched the surface of this most human and sophisticated area in the context of this brain model. I've done nothing about semantics. As regards syntax, I have outlined how an existing psycholinguistically-based lexicalist account of sentence recognition, due to Gerard Kempen and coworkers, can be represented as brain modules.

My method assigns a store of representations of lexical frames to a posterior cortical area, possibly Wernicke’s area, Figure 33.10. Incoming phonemes enter the system through the primary auditory area AI where a phonological buffer identifies them and queues them as they activate corresponding lexical frames.

These lexical frames are transmitted to a frontal area, just posterior to Broca's area according to current imaging data on patients with agrammatism, where they are competitively merged to form a representation of the grammatical structure of the sentence.

[Figure 33.10: Brain areas corresponding to language processing. Auditory input enters near a posterior lexicon with u-space and c-space stores; planning areas in frontal cortex assemble grammatical structure and drive motor speech output.]

Figure 33.11 gives a summary diagram.

Kempen's approach is designed to capture psycholinguistic phenomena such as agrammatism in sentence recognition and generation, as well as observed performance phenomena in normal sentence recognition and generation.

[Figure 33.11: Summary diagram of extension for language processing. Auditory processing feeds a lexicon; grammatical construction interacts with semantic structures and the perception-action hierarchy, producing phonetic output, with the learning modules below.]

33.5.3 Subcortical systems

From an examination of the voluminous and rather chaotic literature on motivation, including reviews and conclusions of many scientists, I came up with a framework, Figure 33.12, that allows most mechanisms to be included, which deals with the role of aggression, and which provides a hierarchical structure corresponding to the observed neuroanatomical levels of (i) the amygdala, (ii) the hypothalamus and (iii) lower level effector systems such as the endocrine system and the autonomic nervous system.

My framework has three parallel but interacting hierarchies for (i) agonism - fear and aggression, (ii) attachment and (iii) sex.

Within this 3x3 framework, we can represent motivational mechanisms as rules combining inputs and using arithmetic expressions intended to capture the responses of the amygdala and the hypothalamus to incoming neural and hormonal information.

Figure 33.13 gives a summary diagram.

The main interactions between cortical and subcortical areas are:
(i) with anterior temporal areas, giving the amygdala information on sophisticated high-level visual percepts, where the scene is interpreted using the cortex's store of semantic concepts;
(ii) with orbitofrontal areas, giving a connection with the context store and the currently active context(s), (a) allowing the context to be taken into account in the action of the amygdala, and (b) allowing the amygdala to influence the choice and description of the currently active context(s);
(iii) with the anterior cingulate, giving a connection with the system's current goals, both as context for the amygdala and also for the amygdala to influence or even create goals;
(iv) with the hippocampus, giving (a) the current event and episode as input to the amygdala, and (b) output from the amygdala as a component of the current event, giving an evaluative, affective and social component. Here the input of moderate levels of excitation tends to enhance the saliency of the event and to make it more strongly remembered, whereas high levels of stress cause hippocampal malfunctioning and dissociation of memories [Nadel and Jacobs, 1996].

Figure 33.12: Levels of interacting motivational control

Figure 33.13: Summary diagram of extension for subcortical systems

Finally, we give summary diagrams of the whole brain, Figures 33.14 and 33.15.

Figure 33.14: Summary diagram of brain model showing some detail

Figure 33.15: Simplified summary diagram of brain model

33.6 My disagreements

1. Newell's physical symbol system hypothesis. I proposed my associative structure hypothesis, in which the only role of symbols in the model is to structure data, not to refer to symbolic structures.

2. Fodor’s modularity hypothesis. I proposed a completely distributed architecture.

3. Cortical learning. I proposed that most learning occurs in the specialized learning modules, namely, the hippocampus, basal ganglia and cerebellum. Cortical learning is limited to longer-range changes to data types and to efficiency; the cortex also provides priming. The cortex learns mainly by receiving, storing and generalizing data generated by the learning modules.

4. Neural learning is limited to associative memory. The stabilization of data types in different modules is needed to maintain communication among modules; this process corresponds to neural learning. This property results in standard associative interfaces.

5. Binding does not use temporal synchronization but instead uses explicit confirmation messages for intermodule cooperation; a sketch follows this list.

6. Synaptic facilitation is not the only effect of neural learning mechanisms; there are also genetic memory mechanisms within the cell.
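As an illustration of point 5, the sketch below shows binding by explicit confirmation messages rather than temporal synchrony: one module proposes a binding over a shared channel, and the partner module replies with a confirmation carrying the combined description. The module names, message format and single shared queue are inventions for the example.

    # Hedged sketch: intermodule binding via explicit confirmation messages.
    import queue

    channel = queue.Queue()    # stands in for an intermodule connection

    def color_module():
        # Propose that object 1 is bound to the color red.
        channel.put(("propose", {"object": 1, "color": "red"}))

    def shape_module():
        kind, payload = channel.get()
        if kind == "propose" and payload["object"] == 1:
            # This module has a consistent entry for object 1, so it confirms,
            # returning the now-bound combined description.
            channel.put(("confirm", {**payload, "shape": "square"}))

    color_module()
    shape_module()
    print(channel.get())   # ('confirm', {'object': 1, 'color': 'red', 'shape': 'square'})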

Chapter 34

In conclusion

Abstract. A concluding remark.


Well, it's been a wild ride! I started this book with an ambition to clarify psychological thinking by constructing a precise, well-defined theory and model of the brain and its action. My theory had to agree with neuroanatomy, psychological findings and all other experimental data on the brain and mind.

It is my opinion that I have achieved the basis and core of these objectives. I have developed a mathematical approach based on computer science, in particular logic programming, and I've applied it to the cortex, hippocampus, basal ganglia, thalamus, amygdala and hypothalamus, which comprise most of the brain.

The method is reflected in the form of each chapter, where I examined the neuroanatomy and the psychology of the brain area and function under discussion. I then developed a theory and model of the area which agreed with its neuroanatomy and psychology. This model could then be run on a computer to verify its technical correctness and to compute predicted behaviors for comparison with experiment.

Bibliography

[Adolphs, 1999] Adolphs, R. (1999). Social cognition and the human brain. Trends in Cognitive Sciences, 3:469–479.

[Aggleton, 1992] Aggleton, J. P. (1992). The Amygdala: neurobiological aspects of emotion, memory, and mental dysfunction. John Wiley, New York.

[Aggleton, 2000] Aggleton, J. P. (2000). The amygdala: a functional analysis. Oxford University Press.

[Aggleton and Saunders, 2000] Aggleton, J. P. and Saunders, R. C. (2000). The amygdala - what's happened in the last decade? pages 1–30. in [Aggleton, 2000].

[Aitkin, 1990] Aitkin, L. (1990). The Auditory Cortex. Chapman and Hall, London.

[Akmajian et al., 1995] Akmajian, A., Demers, R. A., Farmer, A. K., and Harnish, R. M. (1995). Linguistics: An introduction to language and communication. MIT Press, Cambridge, Massachusetts. Fourth edition.

[Alberini, 1999] Alberini, C. M. (1999). Genes to remember. Journal of Experimental Biology, 202:2887–2891.

[Albert and Walsh, 1984] Albert, D. J. and Walsh, M. L. (1984). Neural systems and the inhibitory modulation of agonistic behavior: a comparison of mammalian species. Neuroscience and Biobehavioral Reviews, 8:5–24.

[Albus, 1981] Albus, J. S. (1981). Brains, behavior, and robotics. Byte Books, Peterborough, New Hampshire.

[Alexander et al., 1986] Alexander, G., DeLong, M., and Strick, P. L. (1986). Parallel organization of segregated circuits linking basal ganglia and cortex. Annual Review of Neuroscience, 9:357–381.

[Alexander and Crutcher, 1990] Alexander, G. E. and Crutcher, M. D. (1990). Neural Representations of the Target (Goal) of Visually Guided Arm Movements in Three Motor Areas of the Monkey. Journal of Neurophysiology, 64:164–178.


[Allman, 1999] Allman, J. M. (1999). Evolving brains. W. H. Freeman, San Francisco. Scientific American Library.

[Altmann, 1990] Altmann, G. T. M. (1990). Cognitive models of speech processing: psycholinguistic and computational perspectives. MIT Press, Cambridge, Massachusetts.

[Altmann, 1967] Altmann, S. A., editor (1967). Social Communication among Primates. University of Chicago Press, Chicago.

[Amaral et al., 1992] Amaral, D. G., Price, J. L., Pitkanen, A., and Carmichael, S. T. (1992). Anatomical organization of the primate amygdaloid complex. pages 1–66. in [Aggleton, 1992].

[Amorapanth et al., 2000] Amorapanth, P., LeDoux, J. E., and Nader, K. (2000). Different lateral amygdala outputs mediate reactions and actions elicited by a fear-arousing stimulus. Nature Neuroscience, 3:74–79.

[Andersen, 1995] Andersen, R. A. (1995). Encoding of intention and spatial location in the posterior parietal cortex. Cerebral Cortex, 5:457–469.

[Andersen et al., 1990] Andersen, R. A., Asanuma, C., Essick, G., and Siegel, R. M. (1990). Corticocortical connections of anatomically and physiologically defined subdivisions within the inferior parietal lobule. Journal of Comparative Neurology, 296:65–113.

[Anderson and Essen, 1987] Anderson, C. H. and Essen, D. C. V. (1987). Shifter circuits: a computational strategy for dynamic aspects of visual processing. Proceedings of the National Academy of Sciences of the USA, 84:6297–6301.

[Anderson, 1989] Anderson, J. R. (1989). A Theory of the Origins of Human Knowledge. Artificial Intelligence Journal, 40:313–351.

[Anderson, 1993] Anderson, J. R. (1993). Rules of the mind. Lawrence Erlbaum Associates, Hillsdale, New Jersey.

[Anderson and Bower, 1972] Anderson, J. R. and Bower, G. H. (1972). Recognition and retrieval processes in free recall. Psychological Review, 79:97–123.

[Anderson and Douglass, 2001] Anderson, J. R. and Douglass, S. (2001). Tower of Hanoi: Evidence for the cost of goal retrieval. Journal of Experimental Psychology: Learning, Memory and Cognition, 27:1331–1346.

[Anderson et al., 1993] Anderson, J. R., Kushmerick, N., and Lebiere, C. (1993). The Tower of Hanoi and Goal Structures. pages 121–142. in [Anderson, 1993].

[Andreasen et al., 1995] Andreasen, N. C., O’Leary, D. S., Cizadlo, T., Arndt, S., Rezai, K., Watkins, G. L., Ponto, L. L. B., and Hichwa, R. D. (1995). Remembering the past: Two facets of episodic memory explored with positron emission tomography. American Journal of Psychiatry, 152:1576–1585.

[Andreasen et al., 1992] Andreasen, N. C., Rezai, K., Alliger, R., Swayze, V. W., II, Flaum, M., Kirchner, P., Cohen, G., and O'Leary, D. S. (1992). Hypofrontality in Neuroleptic-Naive Patients and in Patients with Chronic Schizophrenia. Archives of General Psychiatry, 49:943–958.

[Annisfeld and Knapp, 1968] Annisfeld, M. and Knapp, M. (1968). Association, synonymity, and directionality in false recognition. Journal of Experimental Psychology, 77:171–179.

[Anzai, 1978] Anzai, Y. (1978). Learning strategies by computer. In Proceedings Second Canadian Society for Computational Studies of Intelligence Conference, pages 181–190.

[Anzai, 1987] Anzai, Y. (1987). Doing, understanding, and learning in problem solving. pages 55–97. in [Klahr et al., 1987].

[Anzai and Simon, 1979] Anzai, Y. and Simon, H. A. (1979). The Theory of Learning by Doing. Psychological Review, 86:124–140.

[Apt et al., 1999] Apt, K. R., Marek, V. W., Truszczynski, M., and Warren, D. S. (1999). The logic programming paradigm. Springer-Verlag, New York.

[Arbib, 1981] Arbib, M. (1981). Perceptual Structures and Distributed Motor Control. pages 1449–1480. in [Brooks, 1981].

[Arieli et al., 1996] Arieli, A., Sterkin, A., Grinvald, A., and Aertsen, A. (1996). Dynamics of ongoing activity: explanation of the large variability in evoked cortical responses. Science, 273(5283):1868–1871.

[Arnell and Jolicoeur, 1997] Arnell, K. M. and Jolicoeur, P. (1997). Repetition blindness for pseudoobject pictures. Journal of Experimental Psychology: Human Perception and Performance, 23:999–1013.

[Arnold, 1970] Arnold, M. B., editor (1970). Feelings and emotions. Academic Press, New York.

[Arvind and Culler, 1986] Arvind and Culler, D. E. (1986). Dataflow architectures. Annual Reviews in Computer Science, 1:225–253.

[Arvind and Gostelow, 1982] Arvind and Gostelow, K. (1982). The U-interpreter. IEEE Computer, 15:42–49.

[Atkinson and Shiffrin, 1968] Atkinson, R. C. and Shiffrin, R. M. (1968). Human memory: A proposed system and its control processes. In Spence, K. and Spence, J., editors, The psychology of learning and motivation: Advances in research and theory, Vol. 2. Academic Press, New York and London.

[Murdock, 1974] Murdock, B. B., Jr. (1974). Human memory: Theory and data. Erlbaum, Potomac, MD.

[Baddeley, 1986] Baddeley, A. D. (1986). Working memory. Oxford University Press, Oxford, England.

[Bailey et al., 1950] Bailey, P., Bonin, G. V., and McCulloch, W. S. (1950). The Isocortex of the Chimpanzee. The University of Illinois Press, Urbana.

[Baker et al., 1996] Baker, S. C., Rogers, R. D., Owen, A. M., Frith, C. D., Dolan, R. J., Frackowiak, R. S. J., and Robbins, T. W. (1996). Neural systems engaged by planning: a PET study of the Tower of London task. Neuropsychologia, 34:515–526.

[Bara and Guida, 1984] Bara, B. G. and Guida, G. (1984). Computational models of natural language processing. North Holland, Amsterdam.

[Barbas, 1988] Barbas, H. (1988). Anatomic organization of basoventral and mediodorsal visual recipient prefrontal regions in the rhesus monkey. Journal of Comparative Neurology, 276:313–342.

[Barbas, 1992] Barbas, H. (1992). Architecture and cortical connections of the prefrontal cortex in the rhesus monkey. Advances in Neurology, 57:91–115.

[Barbas, 1995] Barbas, H. (1995). Anatomic Basis of Cognitive-Emotional Interactions in the Primate Prefrontal Cortex. Neuroscience and Biobehavioral Reviews, 19:499–510.

[Barbas and Rempel-Clower, 1997] Barbas, H. and Rempel-Clower, N. (1997). Cortical structure predicts the pattern of corticocortical connections. Cerebral Cortex, 7:1–12.

[Barfield, 1984] Barfield, R. J. (1984). Reproductive hormones and aggressive behavior. pages 105–134. in [Flannelly et al., 1984].

[Barlow, 1972] Barlow, H. B. (1972). Single units and sensation: a neuron doctrine for perceptual psychology? Perception, 1:371–394.

[Barlow, 1990] Barlow, H. B. (1990). Conditions for versatile learning, Helmholtz's unconscious inference, and the task of perception. Vision Research, 30:1561–1571.

[Barnes and Pandya, 1992] Barnes, C. L. and Pandya, D. N. (1992). Efferent cortical connections of multimodal cortex of the superior temporal sulcus in the rhesus monkey. Journal of Comparative Neurology, 318:222–244.

[Barsalou, 1988] Barsalou, L. W. (1988). The content and organization of autobiographical memories. pages 193–243. in [Neisser and Winograd, 1988].

[Barsalou, 1992] Barsalou, L. W. (1992). Frames, concepts, and conceptual fields. In Kittay, E. and Lehrer, A., editors, Frames, fields, and contrasts: New essays in semantic and lexical organization, pages 21–74. Lawrence Erlbaum Associates.

[Barth, 2002] Barth, A. L. (2002). Differential plasticity in neocortical networks. Physiology and Behavior, 77:545–550.

[Bartlett, 1958] Bartlett, S. F. (1958). Thinking: an experimental and social study. Basic Books, New York.

[Batori et al., 1989] Batori, I. S., Lenders, W., and Putschke, W. (1989). Computational Linguistics. Walter de Gruyter, Berlin.

[Baum et al., 1990] Baum, S. R., Blumstein, S. B., Naeser, M. A., and Palumbo, C. L. (1990). Temporal Dimensions of Consonant and Vowel Production: An Acoustic and CT Scan Analysis of Aphasic Speech. Brain and Language, 39:33–56.

[Baylis et al., 1993] Baylis, G., Driver, J., and Rafal, R. D. (1993). Visual extinction and stimulus repetition. Journal of Cognitive Neuroscience, 5:453–466.

[Bell, 1937] Bell, E. T. (1937). Men of mathematics. Simon and Schuster, New York.

[Benson, 1994] Benson, D. F. (1994). The neurology of thinking. Oxford University Press, New York.

[Benson et al., 1992] Benson, D. L., Isackson, P. J., Gall, C. M., and Jones, E. G. (1992). Contrasting patterns in the localization of glutamic acid decarboxylase and Ca2+/calmodulin protein kinase gene expression in the rat central nervous system. Neuroscience, 46:825–849.

[Benton and Joynt, 1960] Benton, A. L. and Joynt, R. J. (1960). Early descriptions of aphasia. Archives of Neurology, 3:205–222.

[Berkeley, 1949] Berkeley, E. C. (1949). Giant brains or machines that think. John Wiley, New York.

[Bilecen et al., 1998] Bilecen, D., Scheffler, K., Schmidt, N., Tschopp, K., and Seelig, J. (1998). Tonotopic organization of the human auditory cortex as detected by BOLD-fMRI. Hearing Research, 126:19–27.

[Binder et al., 1997] Binder, J. R., Frost, J. A., Hammeke, T. A., Cox, R. W., Rao, S. M., and Prieto, T. (1997). Human Brain Language Areas Identified by Functional Magnetic Resonance Imaging. The Journal of Neuroscience, 17:353–362.

[Binder et al., 1994] Binder, J. R., Rao, S. M., Hammeke, T. A., Yetkin, F. Z., Jesmanowicz, A., Bandettini, P. A., Wong, E. C., Estkowski, L. D., Goldstein, M. D., Haughton, V. M., and Hyde, J. S. (1994). Functional magnetic resonance imaging of human auditory cortex. Annals of Neurology, 35:662–672.

[Birkhoff and von Neumann, 1936] Birkhoff, G. and von Neumann, J. (1936). The logic of quantum mechanics. The Annals of Mathematics, Second series, 37:823–843.

[Bishop et al., 2001] Bishop, D., Aamodt-Leeper, G., Creswell, C., McGurk, R., and Skuse, D. H. (2001). Individual differences in cognitive planning on the Tower of Hanoi task: Neuropsychological maturity or measurement error? Journal of Child Psychology and Psychiatry and Allied Disciplines, 42:551–556.

[Bjorklund et al., 1987] Bjorklund, A., Hokfelt, T., and Swanson, L. W. (1987). Handbook of Chemical Neuroanatomy, Volume 5: Integrated Systems of the CNS, Part I Hypothalamus, Hippocampus, Amygdala, Retina. Elsevier Science Publishers B.V.

[Bloom et al., 1999] Bloom, F. E., Bjorklund, A., and Hokfelt, T. (1999). Handbook of Chemical Neuroanatomy, Volume 15 The Primate Nervous System, Part III. Elsevier Science Publishers B.V.

[Blum, 1990] Blum, P. S. (1990). Sensory input and integration. pages 147–170. in [Klemm and Vertes, 1990].

[Blumstein, 1995] Blumstein, S. E. (1995). The neurobiology of the sound structure of language. pages 915–929. in [Gazzaniga, 1995].

[Boelkins and Wilson, 1972] Boelkins, R. C. and Wilson, A. P. (1972). Intergroup Social Dynamics of the Cayo Santiago Rhesus (Macaca mulatta) with Special Reference to Changes in Group Membership by Males. Primates, 13:125–140.

[Boller and Grafman, 1994] Boller, F. and Grafman, J. (1994). Handbook of Neuropsychology, Volume 9. Elsevier Science Publishers B.V.

[Bond, 1989] Bond, A. H. (1989). The Cooperation of Experts in Engineering Design. In Gasser, L. and Huhns, M., editors, Distributed Artificial Intelligence, Volume II, pages 462–486. Pitman/Morgan Kaufmann, London.

[Bond, 1990] Bond, A. H. (1990). Commitment: A Computational Model for Organizations of Cooperating Intelligent Agents. In Proceedings of the 1990 Conference on Office Information Systems, pages 21–30, Cambridge, MA.

[Bond and Gasser, 1988a] Bond, A. H. and Gasser, L. (1988a). An Analysis of Problems and Research in Distributed Artificial Intelligence. In Readings in Distributed Artificial Intelligence. Morgan Kaufmann Publishers, San Mateo, CA.

[Bond and Gasser, 1988b] Bond, A. H. and Gasser, L. (1988b). Readings in Distributed Artificial Intelligence. Morgan Kaufmann Publishers, San Mateo, CA.

[Borsley, 1991] Borsley, R. D. (1991). Syntactic Theory: A Unified Approach. Edward Arnold, London.

[Borys et al., 1982] Borys, S. V., Spitz, H. H., and Dorans, B. A. (1982). Tower of Hanoi performance of retarded young adults and nonretarded children as a function of solution length and goal state. Journal of Experimental Child Psychology, 33:87–110.

[Bowlby, 1973] Bowlby, J. (1973). Separation, anxiety and loss. Pelican Books, London. Attachment and Loss, Volume II.

[Bowlby, 1980] Bowlby, J. (1980). Loss, sadness and depression. Basic Books, New York. Attachment and Loss, Volume III.

[Bowlby, 1982] Bowlby, J. (1982). Attachment. Basic Books, New York, second edition. Attachment and Loss, Volume I.

[Brazier and Petsche, 1978] Brazier, M. A. B. and Petsche, H. (1978). Architectonics of the Cerebral Cortex. Raven Press, New York.

[Broadbent, 1958] Broadbent, D. E. (1958). Perception and communication. Pergamon Press, New York.

[Brodmann, 1909] Brodmann, K. (1909). Vergleichende Lokalisationslehre der Grosshirnrinde. Barth, Leipzig.

[Brooks, 1981] Brooks, V. B. (1981). Handbook of Physiology, Section 2: The Nervous System, Vol. II, Motor Control, Part 1. American Physiological Society.

[Brothers, 1990] Brothers, L. (1990). The social brain: a project for integrating primate behavior and neurophysiology in a new domain. Concepts in Neuroscience, 1:27–51.

[Brothers, 1996] Brothers, L. A. (1996). Friday’s Footprint: How society shapes the human mind. Oxford University Press.

[Browder, 1976] Browder, F. E., editor (1976). Mathematical developments arising from Hilbert problems: Proceedings of the Symposium in Pure Mathematics of the American Mathematical Society, held at Northern Illinois University, DeKalb, Illinois, May 1974. American Mathematical Society, Providence.

[Bruce et al., 1981] Bruce, C., Desimone, R., and Gross, C. G. (1981). Visual properties of neurons in a polysensory area in superior temporal sulcus of the macaque. J. Neurophysiol., 46:369–384.

[Bruce, 1994] Bruce, D. (1994). Lashley and the Problem of Serial Order. American Psychologist, 49:93–103.

[Bruce, 1988] Bruce, V. (1988). Recognising Faces. Lawrence Erlbaum Associates, Hillsdale, New Jersey.

[Brugge and Reale, 1985] Brugge, J. F. and Reale, R. A. (1985). Auditory Cortex. pages 229–272. Plenum Press, New York. in [Peters and Jones, 1985], volume 4.

[Bullock, 1977] Bullock, T. H. (1977). Introduction to nervous systems. W. H. Freeman, San Francisco. with the collaboration of Richard Orkand, Alan Grinnell.

[Buser and Imbert, 1992] Buser, P. and Imbert, M. (1992). Audition. M.I.T. Press, Cambridge, Massachusetts. (French edition, Hermann, 1987).

[Bussey et al., 1996] Bussey, T. J., Muir, J. L., Everitt, B. J., and Robbins, T. W. (1996). Dissociable effects of anterior and posterior cingulate cortex lesions on the acquisition of a conditional visual discrimination: facilitation of early learning vs. impairment of late learning. Behavioural Brain Research, 82:45–56.

[Bussey et al., 1997] Bussey, T. J., Muir, J. L., Everitt, B. J., and Robbins, T. W. (1997). Triple dissociation of anterior cingulate, posterior cingulate, and medial frontal cortices on visual discrimination tasks using a touchscreen testing procedure for the rat. Behavioral Neuroscience, 111:920–936.

[Butler, 1994] Butler, A. B. (1994). The evolution of the dorsal pallium in the telencephalon of amniotes, cladistic analysis and a new hypothesis. Brain Research Reviews, 19:66–101.

[Butler and Hodos, 1996] Butler, A. B. and Hodos, W. (1996). Comparative vertebrate neuroanatomy : evolution and adaptation. John Wiley, New York.

[Butterworth, 1980] Butterworth, B. (1980). Language production, volume 1, Speech and talk. Academic Press, New York.

[Byrnes and Spitz, 1979a] Byrnes, M. M. and Spitz, H. H. (1979a). Developmental progression of performance on the Tower of Hanoi problem. Bulletin of the Psychonomic Society, 14:379–381.

[Byrnes and Spitz, 1979b] Byrnes, M. M. and Spitz, H. H. (1979b). Performance of Retarded Adolescents and Non-retarded Children on the Tower of Hanoi problem. 81:561–569.

[Califano et al., 1990] Califano, A., Kjeldsen, R., and Bolle, R. M. (1990). Data and Model Driven Foveation. In IEEE International Conference on Pattern Recognition, pages 1–7.

[Calvez, 1993] Calvez, J. P. (1993). Embedded Real-Time Systems. John Wiley, New York.

[Campbell and Smith, 1978] Campbell, R. N. and Smith, P. T. (1978). Recent advances in the psychology of language: formal and experimental approaches. Plenum Press, New York.

[Caplan, 1987] Caplan, D. (1987). Neurolinguistics and linguistic aphasiology. Cambridge University Press, Cambridge, England.

[Caplan, 1992] Caplan, D. (1992). Language: structure, processing and disorders. MIT Press, Cambridge, Massachusetts.

[Caplan et al., 1985] Caplan, D., Baker, C., and Dehaut, F. (1985). Syntactic determinants of sentence comprehension in aphasia. Cognition, 21:117–175.

[Cardoso and Parks, 1998] Cardoso, J. and Parks, R. W. (1998). Neural network modeling of executive functioning with the Tower of Hanoi test in frontal lobe-lesioned patients. In Parks, R. W. and Levine, D. S., editors, Fundamentals of neural network modeling: Neuropsychology and cognitive neuroscience, pages 209–231. M.I.T. Press, Cambridge, Massachusetts.

[Van Wijk and Kempen, 1987] Van Wijk, C. and Kempen, G. (1987). A dual system for producing self-repairs in spontaneous speech: evidence from experimentally elicited corrections. Cognitive Psychology, 19:403–440.

[Carey et al., 1996] Carey, D. P., Harvey, M., and Milner, A. D. (1996). Visuomotor sensitivity for shape and orientation in a patient with visual form agnosia. Neuropsy- chologia, 34:329–337.

[Carmichael et al., 1994] Carmichael, S. T., Clugnet, M.-C., and Price, J. L. (1994). Central Olfactory Connections in the Macaque Monkey. Journal of Comparative Neurology, 346:403–434.

[Carmichael and Price, 1994] Carmichael, S. T. and Price, J. L. (1994). Architectonic Subdivision of the Orbital and Medial Prefrontal Cortex in the Macaque Monkey. Journal of Comparative Neurology, 346:366–402.

[Carnap, 1958] Carnap, R. (1958). Introduction to symbolic logic and its applications. Dover Publication, New York. Translated by William H. Meyer and John Wilkinson.

[Carpenter et al., 1992] Carpenter, G. A., Grossberg, S., and Lesher, G. W. (1992). A What-and-Where Neural Network for Invariant Image Preprocessing. In IEEE International Joint Conference on Neural Networks, pages 303–308.

[Carter et al., 1998] Carter, C. S., Braver, T. S., Barch, D. M., Botvinick, M. M., Noll, D., and Cohen, J. D. (1998). Anterior cingulate cortex, error detection, and the online monitoring of performance. Science, 280(5364):747–749.

[Carter et al., 1999] Carter, C. S., Lederhendler, I. I., and Kirkpatrick, B. (1999). The integrative neurobiology of affiliation. M.I.T. Press, Cambridge, Massachusetts.

[Carterette, 1966] Carterette, E. C., editor (1966). Brain Function, Volume 3, Proceedings of the Third Conference, November 1963, Speech, Language, and Communication. University of California Press.

[Cassidy and Shaver, 1999] Cassidy, J. and Shaver, P. R. (1999). Handbook of attachment : theory, research, and clinical applications. Guilford Press, New York.

[Cavada and Goldman-Rakic, 1989] Cavada, C. and Goldman-Rakic, P. S. (1989). Posterior parietal cortex in rhesus monkey: II. Evidence for segregated corticocortical networks linking sensory and limbic areas with the frontal lobe. Journal of Comparative Neurology, 287:422–445.

[Cavedini et al., 2001] Cavedini, P., Cisima, M., Riboldi, G., D'Annucci, A., and Bellodi, L. (2001). A neuropsychological study of dissociation in cortical and subcortical functioning in Obsessive-Compulsive Disorder by Tower of Hanoi Task. Brain and Cognition, 46:357–363.

[Celesia, 1976] Celesia, G. G. (1976). Organization of Auditory Cortical Areas in Man. Brain, 99:403–414.

[Chan et al., 2001] Chan, C. H., Godinho, L. N., Thomaidou, D., Tan, S. S., Gulisano, M., and Parvelas, J. G. (2001). Emx1 is a marker for pyramidal neurons of the cerebral cortex. Cerebral Cortex, 11:1191–1198.

[Changeux and Dehaene, 2000] Changeux, J.-P. and Dehaene, S. (2000). Hierarchical neuronal modeling of cognitive functions: From synaptic transmission to the Tower of London. International Journal of Psychophysiology. Special Issue: Proceedings of the 9th World Congress of the International Organization of Psychophysiology (IOP), 35:179–187.

[Chapman, 1997] Chapman, T. (1997). Tower Noticing in the Tower of Hanoi. Cognitive Modeling unit paper WP/017, Department of Psychology, University of Nottingham, Nottingham, England.

[Cheney and Seyfarth, 1990a] Cheney, D. L. and Seyfarth, R. M. (1990a). How Monkeys See the World. University of Chicago Press, Chicago.

[Cheney and Seyfarth, 1990b] Cheney, D. L. and Seyfarth, R. M. (1990b). The representation of social relations by monkeys. Cognition, 37:167–196.

[Chomsky, 1981] Chomsky, N. (1981). Lectures on government and binding. Dordrecht, Holland.

[Chomsky, 1986] Chomsky, N. (1986). Barriers. MIT Press, Cambridge, Massachusetts.

[Chun, 1997] Chun, M. M. (1997). Types and tokens in visual processing: A double dissociation between the attentional blink and repetition blindness. Journal of Experimental Psychology: Human Perception and Performance, 23:738–755.

[Chun and Cavanagh, 1997] Chun, M. M. and Cavanagh, P. (1997). Seeing two as one: Linking apparent motion and repetition blindness. Psychological Science, 8:74–79.

[Church, 1941] Church, A. (1941). The calculi of lambda-conversion. Princeton University Press, Princeton, New Jersey.

[Clark, 1977] Clark, K. L. (1977). Verification and synthesis of logic programs. Research report, Imperial College of Science and Technology, London.

[Cohen and Eichenbaum, 1993] Cohen, N. J. and Eichenbaum, H. (1993). Memory, amnesia, and the hippocampal system. M.I.T. Press, Cambridge, Massachusetts.

[Cohen et al., 1985] Cohen, N. J., Eichenbaum, H., Deacedo, B. S., and Corkin, S. (1985). Different memory systems underlying acquisition of procedural and declarative knowledge. pages 54–71. in [Olton et al., 1985].

[Colby et al., 1988] Colby, C. L., Gattass, R., Olson, C. R., and Gross, C. G. (1988). Topographical organization of cortical afferents to extrastriate visual area PO in the macaque: A dual tracer study. Journal of Comparative Neurology, 269:392–413.

[Colby, 1963] Colby, K. M. (1963). Computer simulation of a neurotic process. In Tomkins, S. S. and Messick, S., editors, Computer simulation of personality, pages 165–179. John Wiley, New York.

[Colby et al., 1971] Colby, K. M., Weber, S., and Hilf, F. D. (1971). Artificial Paranoia. Artificial Intelligence, 2:1–25.

[Colmerauer, 1973] Colmerauer, A. (1973). Les systemes-Q ou un Formalisme pour Analyser et Synthetiser des Phrases sur Ordinateur. Publication Interne No. 43, Dept. Informatique, Universite de Montreal, Canada.

[Colombo et al., 1990] Colombo, M., D’Amato, M. R., Rodman, H., and Gross, C. G. (1990). Auditory association cortex lesions impair auditory short-term memory in monkeys. Science, 247:336–338.

[Conrad, 1973] Conrad, M. (1973). Is the brain an effective computer? International Journal of Neuroscience, 5:167–170.

[Cook and Newsom, 1996] Cook, V. J. and Newsom, M. (1996). Chomsky’s Universal Grammar. Basil Blackwell, Oxford.

[Coslett and Saffran, 1991] Coslett, H. B. and Saffran, E. M. (1991). Simultanagnosia: To see but not two see. Brain, 114:1082–1107.

[Cowan et al., 2001] Cowan, W. M., Sudhof, T., and Stevens, C. F. (2001). Synapses. The Johns Hopkins University Press, Baltimore and London.

[Creutzfeldt, 1978] Creutzfeldt, O. D. (1978). The neocortical link: Thoughts in the generality of structure and function of the neocortex. pages 357–383. in [Brazier and Petsche, 1978].

[Crocker, 1996] Crocker, M. W. (1996). Computational psycholinguistics. Kluwer, Dordrecht.

[Cummings, 1995] Cummings, J. L. (1995). Anatomic and behavioral aspects of frontal-subcortical circuits. Annals of the New York Academy of Sciences, 769:1–13.

[Dagher et al., 1999] Dagher, A., Owen, A. M., Boecker, H., and Brooks, D. J. (1999). Mapping the network for planning: A correlational PET activation study with the Tower of London task. Brain, 122:1973–1987.

[Dahl, 1988] Dahl, V. (1988). Representing linguistic knowledge through logic programming. pages 249–262. in [Kowalski and Bowen, 1988].

[Dahl, 1989] Dahl, V. (1989). Discontinuous grammars. Computational Intelligence, 5:161–179.

[Dahl, 1990] Dahl, V. (1990). Parsing and generation with static discontinuity grammars. New Generation Computing, 8:245–274.

[Dahl, 1994] Dahl, V. (1994). Natural language processing and logic programming. Journal of Logic Programming, 19,20:681–714.

[Dahl, 1999] Dahl, V. (1999). The logic of language. pages 429–456. in [Apt et al., 1999].

[Dahl and Saint-Dizier, 1985] Dahl, V. and Saint-Dizier, P. (1985). Natural language understanding and logic programming. Elsevier Science Publishers B.V.

[Dahl and Saint-Dizier, 1988] Dahl, V. and Saint-Dizier, P. (1988). Natural language understanding and logic programming II. Elsevier Science Publishers B.V.

[Daigneault et al., 1992] Daigneault, S., Braun, C. M. J., and Whitaker, H. A. (1992). An empirical test of two opposing theoretical models of prefrontal function. Brain and Cognition, 19:48–71.

[Dalrymple et al., 1995] Dalrymple, M., Kaplan, R. M., and Maxwell, J. T., III (1995). Formal issues in lexical-functional grammar. Center for the Study of Language and Information, Stanford University, Stanford, California. CSLI Lecture Notes No. 47.

[Dart, 1934] Dart, R. A. (1934). The dual structure of the neopallium: its history and significance. Journal of Anatomy, 69:33–79.

[Daum et al., 1993] Daum, I., Ackermann, H., Schugens, M. M., Reimold, C., Dichgans, J., and Birbaumer, N. (1993). The Cerebellar and Cognitive Functions in Humans. Behavioral Neuroscience, 107:411–419.

[Davidson, 1970] Davidson, D. (1970). Events as particulars. Nous, 5.

[Davidson, 1980] Davidson, D. (1980). Essays on actions and events. Oxford University Press, Oxford.

[Davis, 1993] Davis, A. M. (1993). Software requirements. Prentice Hall, Englewood Cliffs, New Jersey.

[Davis and Klebe, 2000] Davis, H. P. and Klebe, K. J. (2000). A longitudinal study of the performance of the elderly and young on the Tower of Hanoi puzzle and Rey recall. Brain and Cognition, 46:95–99. Special Issue: TENNET XI: Theoretical and Experimental Neuropsychology, June 15-17, 2000.

[Davis and Eichenbaum, 1991] Davis, J. L. and Eichenbaum, H. (1991). Olfaction: A model system for computational neuroscience. Massachusetts Institute of Technology.

[Davis, 1958] Davis, M. (1958). Computability and unsolvability. McGraw-Hill, New York.

[de Groot, 1946] de Groot, A. D. (1946). Het Denken van den Schaker. North Holland, Amsterdam. Translation into English published as Thought and Choice in Chess, Mouton, 1965.

[de Oliveira Souza et al., 2001] de Oliveira Souza, R., de Azevedo Ignacio, F., Cunha, F. C. R., de Oliveira, D. L. G., and Moll, J. (2001). The neuropsychology of executive behavior: Performance of normal individuals on the Tower of London and the Wisconsin Card Sorting tests/Contribuicao a neuropsicologia do comportamento executivo: Torre de Londres e teste de Wisconsin em individuos normais. Arquivos de Neuro-Psiquiatria, 59:526–531.

[Deacon, 1988] Deacon, T. W. (1988). Human Brain Evolution, 1. Evolution of Language Circuits. pages 363–381. in [Jerison and Jerison, 1988].

[Deacon, 1989] Deacon, T. W. (1989). The neural circuitry underlying primate calls and human language. Human Evolution, 4:367–401.

[Deacon, 1990] Deacon, T. W. (1990). Rethinking Mammalian Brain Evolution. American Zoologist, 30:629–705.

[Deacon, 1992] Deacon, T. W. (1992). The human brain. pages 115–123. in [Jones et al., 1992].

[Deacon, 1997] Deacon, T. W. (1997). The symbolic species : the co-evolution of language and the brain. W. W. Norton, New York.

[Deco and Schurmann, 2000] Deco, G. and Schurmann, B. (2000). A neuro-cognitive visual system for object recognition based on testing of interactive attentional top-down hypotheses. Perception, 29:1249–1264.

[Defelipe et al., 1999] Defelipe, J., Gonzalez-Albo, M. C., Rio, M. R. D., and Elston, G. N. (1999). Distribution and patterns of connectivity of interneurons containing calbindin, calretinin, and parvalbumin in visual areas of the occipital and temporal lobes of the macaque monkey. Journal of Comparative Neurology, 412:515–526.

[Dennis, 1980] Dennis, J. (1980). Dataflow supercomputers. Computer, 13:48–56.

[Van der Hart and Friedman, 1989] Van der Hart, O. and Friedman, B. (1989). A reader's guide to Pierre Janet on dissociation: A neglected intellectual heritage. Dissociation, 2:3–16.

[Derbyshire et al., 1998] Derbyshire, S. W., Vogt, B. A., and Jones, A. K. (1998). Pain and Stroop interference tasks activate separate processing modules in anterior cingulate cortex. Experimental Brain Research, 118:52–60.

[Desimone, 1991] Desimone, R. (1991). Face-selective Cells in the Temporal Cortex of Monkeys. Journal of Cognitive Neuroscience, 3:1–8.

[Desimone et al., 1984] Desimone, R., Albright, T. D., Gross, G. C., and Bruce, C. (1984). Stimulus-selective properties of inferior temporal neurons in the macaque. J. Neurosci., 4:2051–2062.

[Desimone and Duncan, 1995] Desimone, R. and Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18:193–222.

[Devinsky et al., 1995] Devinsky, O., Morrell, M. J., and Vogt, B. A. (1995). Contributions of anterior cingulate cortex to behaviour. Brain, 118:279–306.

[Dhankhar et al., 1997] Dhankhar, A., Wexler, B. E., Fulbright, R. K., Halwes, T., Blamire, A. M., and Shulman, R. G. (1997). Functional Magnetic Resonance Imaging Assessment of the Human Brain Auditory Cortex Response to Increasing Word Presentation Rates. Journal of Neurophysiology, 77:476–483.

[Doty, 1995] Doty, R. L. (1995). Handbook of Olfaction and Gustation. Mark Dekker, Inc., New York.

[Douglas and Martin, 1998] Douglas, R. and Martin, K. (1998). Neocortex. pages 459–510. in [Shepherd, 1998].

[Dowty et al., 1985] Dowty, D. R., Karttunen, L., and Zwicky, A. M. (1985). Natural language parsing: psychological, computational, and theoretical perspectives. Cambridge University Press, Cambridge.

[Driver, 1996] Driver, J. (1996). What can visual neglect and extinction reveal about the extent of preattentive processing? In Kramer, A. F., Coles, M., and Logan, G. D., editors, Converging operations in the study of visual selective attention, pages 193–223. American Psychological Association, Washington, D.C.

[Dyer, 1981a] Dyer, M. G. (1981a). $RESTAURANT revisited or lunch with BORIS. In IJCAI81.

[Dyer, 1981b] Dyer, M. G. (1981b). The role of TAUs in narratives. In Proceedings of the Third Conference of the Cognitive Science Society.

[Dyer, 1983a] Dyer, M. G. (1983a). In-Depth Understanding. M.I.T. Press, Cambridge, Massachusetts.

[Dyer, 1983b] Dyer, M. G. (1983b). The role of affect in narratives. Cognitive Science, 7:211–242.

[Eccles, 1989] Eccles, J. C. (1989). Evolution of the Brain: Creation of the Self. Routledge and Kegan Paul, London.

[Economo, 1925] Economo, C. F. V. (1925). Die Cytoarchitektonik der Hirnrinde des erwachsenen Menschen. Springer Verlag, Berlin. Translated into English and published in 1929 by Oxford University Press as The Cytoarchitectonics of the human cerebral cortex, now with coauthor G. N. Koskinas.

[Edelman and Mountcastle, 1978] Edelman, G. M. and Mountcastle, V. B. (1978). The mindful brain : cortical organization and the group-selective theory of higher brain function. M.I.T. Press, Cambridge, Massachusetts.

[Edwards, 1992] Edwards, M. D. (1992). Automatic logic synthesis techniques for digital systems. McGraw-Hill, New York.

[Egan and Greeno, 1974] Egan, D. E. and Greeno, J. G. (1974). Theory of rule induction: knowledge acquired in concept learning, serial pattern learning and problem solving. pages 43–103. in [Gregg, 1974].

[Eggert, 1977] Eggert, G. H. (1977). Wernicke’s works on aphasia: a sourcebook and review, early sources on aphasia and related disorders, volume 1. Mouton, The Hague, Paris and New York.

[Elliott and Dolan, 1998] Elliott, R. and Dolan, R. J. (1998). Activation of different anterior cingulate foci in association with hypothesis testing and response selection. Neuroimage, 8:17–29.

[Ellis and Beattie, 1986] Ellis, A. and Beattie, G. (1986). The psychology of language and communication. The Guilford Press, New York.

[Ellis and Young, 1988] Ellis, A. and Young, A. (1988). Human Cognitive Neuropsychology: A textbook with readings. Psychology Press, East Sussex, UK.

[Emery and Amaral, 2000] Emery, N. J. and Amaral, D. G. (2000). The role of the amygdala in primate social cognition. pages 156–191. in [Lane and Nadel, 2000].

[Enns and Lollo, 1997] Enns, J. T. and Lollo, V. D. (1997). Object substitution: A new form of masking in unattended visual locations. Psychological Science, 8:135–139. Alternative title: Attentional masking: visual interference by object substitution.

[Epstein et al., 1999] Epstein, R., Harris, A., Stanley, D., and Kanwisher, N. (1999). The parahippocampal place area: recognition, navigation, or encoding? Neuron, 23:115–125.

[Epstein and Kanwisher, 1998] Epstein, R. and Kanwisher, N. (1998). A cortical representation of the local visual environment. Nature, 392:598–601.

[Epstein and Kanwisher, 1999] Epstein, R. and Kanwisher, N. (1999). Repetition blindness for locations: Evidence for automatic spatial coding in an RSVP task. Journal of Experimental Psychology: Human Perception and Performance, 25:1855–1866.

[Eskandar et al., 1992] Eskandar, E. N., Richmond, B. J., and Optican, L. M. (1992). Role of Inferior Temporal Neurons in Visual Memory: I. Temporal Encoding of Information about Visual Images, Recalled Images, and Behavioral Context. Journal of Neurophysiology, 68:1277–1295.

[Essen et al., 1990] Essen, D. C. V., Felleman, D. J., Yoe, E. A. D., Olavarria, J., and Knierim, J. (1990). Modular and hierarchical organization of extrastriate visual cortex in the macaque monkey. Cold Spring Harbor Symposia on Quantitative Biology, 55:679–696.

[Estes, 1991] Estes, R. D. (1991). The Behavior Guide to African Mammals. University of California Press.

[Estes, 1972] Estes, W. K. (1972). An associative basis for coding and organization of memory. In Melton, A. W. and Martin, E., editors, Coding processes in human memory, pages 161–190. V. H. Winston, Washington, DC.

[Estes, 1985] Estes, W. K. (1985). Memory for temporal information. In Michon, J. A. and Jackson, J., editors, Time, mind and behavior, pages 151–168. Springer-Verlag, New York.

[Glenberg et al., 1980] Glenberg, A. M., et al. (1980). A two-process account of long-term serial position effects. Journal of Experimental Psychology: Learning, Memory and Cognition, 6:355–369.

[Glenberg et al., 1983] Glenberg, A. M., et al. (1983). Studies of the long-term recency effect: Support for the contextually guided retrieval hypothesis. Journal of Experimental Psychology: Learning, Memory and Cognition, 9:231–255.

[Fahy et al., 1993] Fahy, F. L., Riches, I. P., and Brown, M. W. (1993). Neuronal activity related to visual recognition memory: long-term memory and the encoding of recency and familiarity information in the primate anterior and medial inferior temporal and rhinal cortex. Experimental Brain Research, 96:457–472.

[Faillenot et al., 1997] Faillenot, I., Toni, I., Decety, J., Gregoire, M., and Jeannerod, M. (1997). Visual pathways for object-oriented action and object recognition: functional anatomy with PET. Cerebral Cortex, 7:77–85.

[Fairbairn, 1952] Fairbairn, W. R. D. (1952). Psychoanalytic studies of the personality. Tavistock Publications, London.

[Fauconnier, 1985] Fauconnier, G. (1985). Mental Spaces: aspects of meaning construction in natural language. MIT Press, Cambridge, Massachusetts.

[Fauconnier, 1997] Fauconnier, G. (1997). Mappings in Thought and Language. Cambridge University Press, Cambridge, England.

[Fedigan, 1992] Fedigan, L. M. (1992). Primate Paradigms. University of Chicago Press, Chicago.

[Fedigan and Strum, 1997] Fedigan, L. M. and Strum, S. C. (1997). Changing images of primate societies. Current Anthropology, 38:677–681.

[Felleman and Essen, 1991] Felleman, D. J. and Essen, D. C. V. (1991). Distributed hierarchical processing in the primate visual cortex. Cerebral Cortex, 1:1–47.

[Ferber, 1999] Ferber, J. (1999). Multi-agent systems: an introduction to distributed artificial intelligence. Addison Wesley.

[Festinger, 1957] Festinger, L. (1957). A theory of cognitive dissonance. Row, Peterson, Evanston, Ill.

[Fiez et al., 1996] Fiez, J. A., Raichle, M. E., Balota, D. A., Tallal, P., and Petersen, S. E. (1996). PET Activation of Posterior Temporal Regions during Auditory Word Presentation and Verb Generation. Cerebral Cortex, 6:1–10.

[Finger, 1991] Finger, T. E. (1991). Gustatory Nuclei and Pathways in the Central Nervous System. pages 331–353. in [Finger and Silver, 1991].

[Finger and Silver, 1991] Finger, T. E. and Silver, W. L. (1991). Neurobiology of Taste and Smell. Krieger, Malabar, Florida.

[Fiser et al., 1996] Fiser, J., Biederman, I., and Cooper, E. E. (1996). To what extent can matching algorithms based on direct outputs of spatial filters account for human shape recognition? Spatial Vision, 10:237–271.

[Flannelly et al., 1984] Flannelly, K. J., Blanchard, R. J., and Blanchard, D. C. (1984). Biological perspectives on aggression. Alan R. Liss, New York.

[Flexser and Bower, 1974] Flexser, A. J. and Bower, G. H. (1974). How frequency affects recency judgements: a model of recency discrimination. Journal of Experimental Psychology, 103:706–716.

[Flynn, 1976] Flynn, J. P. (1976). Neural basis of threat and attack. In Grenell, R. G. and Gabay, S., editors, Biological foundations of psychiatry, pages 273–295. Raven Press, New York.

[Fodor, 1983] Fodor, J. A. (1983). The modularity of mind: an essay on faculty psychology. MIT Press, Cambridge, Massachusetts.

[Fodor, 2000] Fodor, J. A. (2000). The Mind Doesn’t Work That Way : The Scope and Limits of Computational Psychology (Representation and Mind). MIT Press.

[Frackowiak et al., 1997] Frackowiak, R. S. J., Friston, K. J., Frith, C. D., Dolan, R. J., and Mazziotta, J. C. (1997). Human Brain Function. Academic Press, New York and London.

[Frazier, 1990] Frazier, L. (1990). Exploring the architecture of the language-processing system. pages 409–433. in [Altmann, 1990].

[Frazier, 1998] Frazier, L. (1998). Getting there (slowly). Journal of Psycholinguistic Research, 27:123–146.

[Frazier and Fodor, 1978] Frazier, L. and Fodor, J. D. (1978). The sausage machine: a new two-stage parsing model. Cognition, 6:291–325.

[Frazier and Rayner, 1982] Frazier, L. and Rayner, K. (1982). Making and correcting errors during sentence comprehension: eye movements in the analysis of structurally ambiguous sentences. Cognitive Psychology, 14:178–221.

[Frege, 1879] Frege, G. (1879). Begriffsschrift, a formula language, modeled upon that of arithmetic, for pure thought. in [VanHeijenoort, 1967].

[Frege, 1893] Frege, G. (1893). Grundgesetze der Arithmetik (Basic Laws of Arithmetic). Verlag Hermann Pohle, Jena. Band I (1893), Band II (1903).

[Freud, 1894] Freud, S. (1894). The neuropsychoses of defence. In Standard Edition [Freud, 1978], volume 3, pages 43–70.

[Freud, 1895] Freud, S. (1895). Project for a Scientific Psychology. In Standard Edition [Freud, 1978], volume 1, pages 283–.

[Freud, 1900] Freud, S. (1900). The Interpretation of Dreams. In Standard Edition [Freud, 1978], volumes 4 and 5, pages 1–630.

[Freud, 1923] Freud, S. (1923). The ego and the id. In Standard Edition [Freud, 1978], volume 19, pages 3–68. [Ich und das Es].

[Freud, 1978] Freud, S. (1978). The standard edition of the complete psychological works of Sigmund Freud. Hogarth Press, London. translated under the general editorship of James Strachey.

[Friedman, 1993] Friedman, W. J. (1993). Memory for the Time of Past Events. Psychological Review, 113:44–66.

[Friedman-Hill et al., 1995] Friedman-Hill, S. R., Robertson, L. C., and Treisman, A. (1995). Parietal contributions to visual feature binding: Evidence from a patient with bilateral lesions. Science, 269:853–855.

[Frijda, 1986] Frijda, N. (1986). The emotions. Cambridge University Press, Cambridge, England.

[Frith, 1995] Frith, C. (1995). Functional imaging and cognitive abnormalities. Lancet, 346:615–620.

[Frith et al., 1991] Frith, C. D., Friston, K., Liddle, P. F., and Frackowiak, R. S. J. (1991). Willed action and the prefrontal cortex in man: a study with PET. Proceedings of the Royal Society of London, Series B, 244:241–246.

[Fujita et al., 1992] Fujita, I. K., Tanaka, K., Ito, M., and Cheng, K. (1992). Columns for visual features of objects in inferotemporal cortex. Nature, 360:343–346.

[Fukushima, 1980] Fukushima, K. (1980). Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics, 36:193–202.

[Fuster, 1995] Fuster, J. M. (1995). Memory in the cerebral cortex: an empirical approach to neural networks in the human and nonhuman primate. Massachusetts Institute of Technology.

[Fuster, 1997] Fuster, J. M. (1997). The Prefrontal Cortex: Anatomy, Physiology, and Neuropsychology of the Frontal Lobe. Raven Press, New York.

[Gall and Spurzheim, 1810] Gall, F.-J. and Spurzheim, G. (1810). Anatomie et physiologie du system nerveux en general, et du cerveau en particulier. Schoell, Paris.

[Gao and Suga, 1998] Gao, E. and Suga, N. (1998). Experience-dependent corticofugal adjustment of midbrain frequency map in bat auditory system. Proceedings of the National Academy of Sciences of the USA, 95:12663–12670.

[Garrett, 1980] Garrett, M. (1980). Levels of processing in sentence production. pages 177–220. in [Butterworth, 1980].

[Garrett, 1988] Garrett, M. (1988). Processes in language production. pages 69–96. in [Newmeyer, 1988].

[Garrett, 1995] Garrett, M. (1995). The structure of language processing: neuropsychological evidence. pages 881–899. in [Gazzaniga, 1995].

[Gazdar and Pullum, 1982] Gazdar, G. and Pullum, G. K. (1982). Generalized Phrase Structure Grammar: A Theoretical Synopsis. Indiana University Linguistics Club. reproduced by IULC.

[Gazzaniga, 1989] Gazzaniga, M. S. (1989). Organization of the Human Brain. Science, 245:947–952.

[Gazzaniga, 1995] Gazzaniga, M. S. (1995). The Cognitive Neurosciences. M.I.T. Press, Cambridge, Massachusetts.

[Gerhardt, 1890] Gerhardt, C. I., editor (1875-1890). Die philosophischen schriften von Gottfried Wilhelm Leibniz, volumes i-vii. Berlin.

[Geschwind, 1965] Geschwind, N. (1965). Disconnexion syndromes in animals and man. I. and II. Brain, 88:237–294 and 585–644.

[Geschwind, 1966] Geschwind, N. (1966). Carl Wernicke, the Breslau school, and the history of aphasia. pages 1–16. in [Carterette, 1966].

[Geschwind et al., 1968] Geschwind, N., Quadfasel, F. A., and Segarra, J. M. (1968). Isolation of the speech area. Neuropsychologia, 6:327–340.

[Gibson, 1991] Gibson, E. (1991). A computational theory of human linguistic processing: memory limitations and processing breakdown. PhD thesis, CMU.

[Gibson, 1998] Gibson, E. (1998). Linguistic complexity: locality of syntactic dependencies. Cognition, 68:1–76.

[Gilhooly et al., 1999] Gilhooly, K. J., Phillips, L. H., Wynn, V., Logie, R. H., and Sala, S. D. (1999). Planning processes and age in the five-disc Tower of London task. Thinking and Reasoning, 5:339–361.

[Gilhooly et al., 2002] Gilhooly, K. J., Wynn, V., Phillips, L. H., Logie, R. H., and Sala, S. D. (2002). Visuo-spatial and verbal working memory in the five-disc Tower of London task: An individual differences approach. Thinking and Reasoning, 8:165–178.

[Gillespie, 1992] Gillespie, J. H. (1992). The causes of molecular evolution. Oxford University Press.

[Gioia, ] Gioia, G. A. The Tower of Hanoi Task and Developmental Executive Dysfunc- tion. published abstract.

[Glenberg, 1987] Glenberg, A. M. (1987). Temporal context and recency. In Gorfein, D. F. and Hoffman, R. R., editors, Memory and learning: The Ebbinghaus Centennial Conference, pages 173–190. Lawrence Erlbaum Associates.

[Glenberg and Swanson, 1986] Glenberg, A. M. and Swanson, N. G. (1986). A temporal distinctiveness theory of recency and modality effects. Journal of Experimental Psychology: Learning, Memory and Cognition, 12:3–15.

[Glosser and Goodglass, 1990] Glosser, G. and Goodglass, H. (1990). Disorders in Executive Control Functions Among Aphasic and Other Brain-Damaged Patients. Journal of Clinical and Experimental Neuropsychology, 12:485–501.

[Goel and Grafman, 1995] Goel, V. and Grafman, J. (1995). Are the frontal lobes implicated in planning functions? Interpreting data from the Tower of Hanoi. Neuropsychologia, 33:623–642.

[Goel et al., 2001] Goel, V., Pullara, S. D., and Grafman, J. (2001). A computational model of frontal lobe dysfunction: Working memory and the Tower of Hanoi task. Cognitive Science, 25:287–313.

[Goldberg, 1995] Goldberg, S. (1995). Introduction. pages 1–15. in [Goldberg et al., 1995].

[Goldberg et al., 1995] Goldberg, S., Muir, R., and Kerr, J. (1995). Attachment theory: social developmental, and clinical perspectives. Analytic Press, Hillsdale, New Jersey.

[Goldberg et al., 1990] Goldberg, T. E., Saint-Cyr, J. A., and Weinberger, D. R. (1990). Assessment of Procedural Learning and Problem Solving in Schizophrenic Patients by Tower of Hanoi Type Tasks. Journal of Neuropsychiatry, 2:165–173.

[Goldman-Rakic, 1988] Goldman-Rakic, P. S. (1988). Topography of Cognition: Parallel Distributed Networks in Primate Association Cortex. Annual Review of Neuroscience, 11:137–156.

[Goldman-Rakic et al., 1984] Goldman-Rakic, P. S., Selemon, L. D., and Schwartz, M. L. (1984). Dual Pathways Connecting the Dorsolateral Prefrontal Cortex with the Hippocampal Formation and Parahippocampal Cortex in the Rhesus Monkey. Neuroscience, 12:719–743.

[Golgi, 1873] Golgi, C. (1873). On the structure of the brain grey matter. Gazzetta Medica Italiana.

[Goodwin, 1981] Goodwin, C. (1981). Conversational organization : interaction between speakers and hearers. Academic Press, New York and London.

[Graeff, 1994] Graeff, F. G. (1994). Neuroanatomy and neurotransmitter regulation of defensive behaviors and related emotions in mammals. Brazilian Journal of Medical and Biological Research, 27:811–829.

[Grafman et al., 1995] Grafman, J., Holyoak, K. J., and Boller, F. (1995). Structure and Functions of the Human Prefrontal Cortex. New York Academy of Sciences. Annals of the New York Academy of Sciences, Volume 769.

[Grasby et al., 1993] Grasby, P. M., Frith, C. D., Friston, K. J., Bench, C., Frackowiak, R. S. J., and Dolan, R. J. (1993). Functional mapping of brain areas implicated in auditory-verbal memory function. Brain, 116:1–20.

[Gray and McNaughton, 2000] Gray, J. A. and McNaughton, N. (2000). The neuropsychology of anxiety: an enquiry into the function of the septo-hippocampal system. Oxford University Press. Second edition.

[Gray, 2000] Gray, J. J. (2000). The Hilbert Challenge. Oxford University Press.

[Greene, 1972] Greene, P. H. (1972). Problems of organisation of motor systems. pages 303–338. in [Rosen and Snell, 1972].

[Gregg, 1974] Gregg, L. W. (1974). Knowledge and cognition. Lawrence Erlbaum Associates, Hillsdale, New Jersey.

[Gross, 1994] Gross, C. G. (1994). How inferior temporal cortex became a visual area. Cerebral Cortex, 4:455–469.

[Gross et al., 1972] Gross, C. G., Rocha-Miranda, C. E., and Bender, D. B. (1972). Visual properties of neurons in inferotemporal cortex of the Macaque. Journal of Neurophysiology, 35:96–111.

[Guenther and Linton, 1975] Guenther, R. K. and Linton, M. (1975). Mechanisms of temporal coding. Journal of Experimental Psychology: Human Learning and Memory, 97:220–229.

[Guyau, 1890] Guyau, J.-M. (1890). La genese de l'idee de temps. Alcan, Paris. (The origin of the idea of time).

[Haberly, 1990] Haberly, L. B. (1990). Olfactory Cortex. pages 317–345. in [Shepherd, 1990].

[Hackett et al., 1998a] Hackett, T. A., Stepneiwska, I., and Kaas, J. H. (1998a). Subdivisions of auditory cortex and ipsilateral cortical connections of the parabelt auditory cortex in macaque monkeys. Journal of Comparative Neurology, 394:475–495.

[Hackett et al., 1998b] Hackett, T. A., Stepneiwska, I., and Kaas, J. H. (1998b). Thalamocortical connections of the parabelt auditory cortex in macaque monkeys. Journal of Comparative Neurology, 400:271–286.

[Hackett et al., 1999] Hackett, T. A., Stepneiwska, I., and Kaas, J. H. (1999). Prefrontal connections of the parabelt auditory cortex in macaque monkeys. Brain Research, 817:45–58.

[Haines, 1997] Haines, D. E., editor (1997). Fundamental Neuroscience. Churchill Livingstone, New York.

[Hall and Gartlan, 1965] Hall, K. R. L. and Gartlan, J. S. (1965). Ecology and Behaviour of the Vervet Monkey, Cercopithecus Aethiops, Lolui Island, Lake Victoria. Proceedings of the Zoological Society of London, 145:37–57.

[Handley et al., 2002] Handley, S. J., Capon, A., Copp, C., and Harper, C. (2002). Conditional reasoning and the Tower of Hanoi: The role of spatial and verbal working memory. British Journal of Psychology, 93:501–518.

[Harlow, 1971] Harlow, H. F. (1971). Learning to love. Albion, San Francisco.

[Harlow, 1986] Harlow, H. F. (1986). From learning to love: the selected papers of H.F. Harlow, edited by Clara Mears Harlow. Praeger, New York.

[Harmon-Jones and Mills, 1999] Harmon-Jones, E. and Mills, J. (1999). Cognitive dissonance: progress on a pivotal theory in social psychology. American Psychological Association, Washington, D.C.

[Harries and Perrett, 1991] Harries, M. H. and Perrett, D. I. (1991). Visual processing of faces in temporal cortex: physiological evidence for a modular organization and possible anatomical correlates. Journal of Cognitive Neuroscience.

[Hasher and Zacks, 1979] Hasher, L. and Zacks, R. T. (1979). Automatic and effortful processes in memory. Journal of Experimental Psychology: General, 108:356–388.

[Hauser, 1997] Hauser, M. (1997). The evolution of communication. M.I.T. Press, Cambridge, Massachusetts.

[Haykin, 1994] Haykin, S. S. (1994). Neural Networks: A Comprehensive Foundation. Macmillan.

[Hayward and Tarr, 1997] Hayward, W. G. and Tarr, M. J. (1997). Testing conditions for viewpoint invariance in object recognition. Journal of Experimental Psychology: Human Perception and Performance, 23:1511–1521.

[Hazan and Zeifman, 1999] Hazan, C. and Zeifman (1999). Pair Bonds as Attachments: Evaluating the Evidence. In Handbook of attachment: theory, research, and clinical applications. Guilford Press, New York.

[He et al., 1996] He, S., Cavanagh, P., and Intriligator, J. (1996). Attentional resolution and the locus of visual awareness. Nature, 383:334–337.

[Hebb, 1949] Hebb, D. O. (1949). The organization of behavior; a neuropsychological theory. John Wiley, New York.

[Heffner and Heffner, 1990] Heffner, H. E. and Heffner, R. S. (1990). Effect of bilateral auditory cortex lesions on sound localization in japanese macaques. Journal of Neurophysiology, 64:915–931.

[Henkin, 1950] Henkin, L. (1950). Completeness in the theory of types. The Journal of Symbolic Logic, 15:81–91.

[Hennessy and Patterson, 1998] Hennessy, J. L. and Patterson, D. A. (1998). Computer organization and design: the hardware/software interface, 2nd edition. Morgan Kaufman, San Mateo, California.

[Herbart, 1824] Herbart, J. F. (1824). Psychology as a science, newly founded on experience, metaphysics and mathematics. In Shipley, T., editor, Classics in Psychology, pages 22–50. Philosophical Library, New York. Translated and published in 1961 from Psychologie als Wissenschaft, neu gegründet auf Erfahrung, Metaphysik und Mathematik, A. M. Unzer, Konigsberg, 1824.

[Herbrand, 1930] Herbrand, J. (1930). Recherches sur la théorie de la démonstration. Travaux de la Société des Sciences et des Lettres de Varsovie, III, 33:33–160.

[Herbrand, 1971] Herbrand, J. (1971). Logical Writings. Harvard and Reidel.

[Hermes, 1965] Hermes, H. (1965). Enumerability, decidability, computability. Academic Press, New York and London.

[Hewitt, 1967] Hewitt, C. E. (1967). PLANNER: A language for proving theorems. Technical Report AIM-137, MIT Artificial Intelligence Laboratory, April 1972.

[Hewitt, 1986] Hewitt, C. E. (1986). Offices are Open Systems. ACM Transactions on Office Information Systems, 4(3):271–287. Also in Huberman, B.A. The Ecology of Computation, Elsevier Science Publishers/North Holland, Amsterdam, 1988.

[Hickok et al., 1997] Hickok, G., Love, T., Swinney, D., Wong, E. C., and Buxton, R. B. (1997). Functional MR Imaging during Auditory Word Perception: A Single-Trial Presentation Paradigm. Brain and Language, 58:197–201.

[Hilbert, 1902a] Hilbert, D. (1902a). Mathematical problems. Bulletin of the American Mathematical Society, 8:437–479. The original appeared in the Göttinger Nachrichten, 1900, pp. 253-297, and in the Archiv der Mathematik und Physik, 3rd ser., vol. 1, 1901, pp. 4-63 and 113-237.

[Hilbert, 1902b] Hilbert, D. (1902b). The foundations of geometry. The original formed the first part of Festschrift zur Feier der Enthüllung des Gauss-Weber-Denkmals in Göttingen, Grundlagen der Geometrie, with the additions made by the author in the French translation, Paris, 1901, incorporated. Translated by Leo Unger from the German ed., 10th ed. rev. and enl. by Paul Bernays, Open Court Pub. Co., La Salle, Illinois, 1971.

[Hilgetag et al., 2000] Hilgetag, C.-C., O'Neill, M. A., and Young, M. P. (2000). Hierarchical organization of macaque and cat cortical sensory systems explored with a novel network processor. Philosophical Transactions of the Royal Society, Series B: Biological Sciences, 355:71–89.

[Hinde, 1970] Hinde, R. A. (1970). Animal behaviour; a synthesis of ethology and comparative psychology, 2nd edition. McGraw-Hill, New York.

[Hinrichs, 1970] Hinrichs, J. V. (1970). A two-process memory-strength theory of judgment of recency. Psychological Review, 77:223–233.

[Hintzman et al., 1973] Hintzman, D. L., Block, R. A., and Summers, J. J. (1973). Contextual associations and memory for serial position. Journal of Experimental Psychology, 97:220–229.

[Hintzman et al., 1975] Hintzman, D. L., Summers, J. J., and Block, R. A. (1975). Spacing judgments as an index of study-phase retrieval. Journal of Experimental Psychology: Human Learning and Memory, 1:31–40.

[Hodgson et al., 2000] Hodgson, T. L., Bajwa, A., Owen, A. M., and Kennard, C. (2000). The strategic control of gaze direction in the Tower of London task. Journal of Cognitive Neuroscience, 12:894–907.

[Hofer, 1984] Hofer, M. A. (1984). Relationships as regulators: A psychobiologic perspective on bereavement. Psychosomatic Medicine, 46:183–197.

[Hofer, 1987] Hofer, M. A. (1987). Early social relationships: a psychobiologist's view. Child Development, 58:633–647.

[Hollingworth and Henderson, 2002] Hollingworth, A. and Henderson, J. M. (2002). Accurate visual memory for previously attended objects in natural scenes. Journal of Experimental Psychology: Human Perception and Performance, 28:113–136.

[Hopfield, 1982] Hopfield, J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences of the USA, 79:2554–2558.

[Hovy and Schank, 1984] Hovy, E. H. and Schank, R. C. (1984). Language generation by computer. pages 165–195. in [Bara and Guida, 1984].

[Huckins et al., 1998] Huckins, S. C., Turner, C. W., Doherty, K. A., Fonte, M. M., and Szeverenyi, N. M. (1998). Functional Magnetic Resonance Imaging Measures of Blood Flow Patterns in Human Auditory Cortex in Response to Sound. Journal of Speech, Language, and Hearing Research, 41:538–548.

[Hughes et al., 1994] Hughes, C., Russell, J., and Robbins, T. W. (1994). Evidence for executive dysfunction in autism. Neuropsychologia, 32:477–492.

[Huhns and Singh, 1998] Huhns, M. and Singh, M. (1998). Readings in Agents. Morgan Kaufman, San Mateo, California.

[Hummel and Stankiewicz, 1996] Hummel, J. E. and Stankiewicz, B. J. (1996). An architecture for rapid, hierarchical structural description. In Inui, T. and McClelland, J., editors, Attention and Performance XVI: Information Integration in Perception and Communication, pages 93–121. M.I.T. Press, Cambridge, Massachusetts.

[Humphrey, 1984] Humphrey, N. (1984). Consciousness regained. Oxford University Press, Oxford.

[Insausti et al., 1987] Insausti, R., Amaral, D. G., and Cowan, W. M. (1987). The entorhinal cortex of the monkey: II. Cortical afferents. Journal of Comparative Neurology, 264:356–395.

[Irwin and Zelinsky, 2002] Irwin, D. E. and Zelinsky, G. J. (2002). Eye movements and scene perception: Memory for things observed. Perception and Psychophysics, 64:882–895.

[Iverson, 1962] Iverson, K. E. (1962). A programming language. John Wiley, New York.

[Jackendoff, 1997] Jackendoff, R. S. (1997). The architecture of the language faculty. M.I.T. Press, Cambridge, Massachusetts.

[Jackendoff, 1983] Jackendoff, R. S. (1983). Semantics and cognition. M.I.T. Press, Cambridge, Massachusetts.

[Jackendoff, 1990] Jackendoff, R. S. (1990). Semantic structures. M.I.T. Press, Cambridge, Massachusetts.

[Jackson, 1931] Jackson, J. H. (1931). Selected writings of John Hughlings Jackson. Staples Press, London. Two volumes, edited by James Taylor.

[Jakobson and Halle, 1956] Jakobson, R. and Halle, M. (1956). Fundamentals of Language. Mouton, The Hague, Paris and New York.

[Janet, 1886] Janet, P. (1886). L'automatisme psychologique. Felix Alcan, Paris. Reprint: Société Pierre Janet, Paris, 1973.

[Janet, 1891] Janet, P. (1891). Étude sur un cas d'aboulie et d'idées fixes. Revue Philosophique, 33:258–287.

[Janet, 1894] Janet, P. (1894). Histoire d'une idée fixe. Revue Philosophique, 37:121–163. Also in P. Janet (1898). Névroses et idées fixes, Vol. I (pp. 156-212). Paris: F. Alcan.

[Janet, 1898] Janet, P. (1898). Névroses et idées fixes. Vol. I. Felix Alcan, Paris.

[Janet, 1901] Janet, P. (1901). The mental state of hystericals. Putnam and Sons, New York. Reprint: University Publications of America, Washington, DC, 1977.

[Janet, 1904] Janet, P. (1904). L'amnésie et la dissociation des souvenirs par l'émotion. Journal de Psychologie, 1:417–453.

[Janet, 1907] Janet, P. (1907). The major symptoms of hysteria. Macmillan, New York.

[Janet, 1919] Janet, P. (1919). Les médications psychologiques (3 vols.). Felix Alcan, Paris. (Reprint: Société Pierre Janet, Paris, 1984). English edition: Psychological healing (2 vols.). New York: Macmillan, 1925. (Reprint: Arno Press, New York, 1976.).

[Janet, 1971] Janet, P. (1971). La médecine psychologique. English edition: Principles of psychotherapy, translated by H. M. and E. R. Guthrie.

[Jasso, 1985] Jasso, G. (1985). Marital Coital Frequency and the Passage of Time: Estimating the Separate Effects of Spouses' Ages and Marital Duration, Birth and Marriage Cohorts, and Period Influences. American Sociological Review, 50:224–241.

[Jeannerod, 1990] Jeannerod, M. (1990). Attention and Performance XIII. Lawrence Erlbaum Associates, Hillsdale, New Jersey.

[Jerison and Jerison, 1988] Jerison, H. J. and Jerison, I. (1988). Intelligence and Evolutionary Biology. Springer-Verlag, Berlin.

[John Laird and Paul Rosenbloom and Allen Newell, 1986] John Laird and Paul Rosen- bloom and Allen Newell (1986). Universal Subgoaling and Chunking : The Automatic Generation and Learning of Goal Hierarchies. Kluwer Academic Publishers, Boston, Dordrecht, London.

[Johnson et al., 1995] Johnson, K. O., Hsiao, S. S., and Twombly, I. A. (1995). Neural mechanisms of tactile form recognition. pages 253–268. in [Gazzaniga, 1995].

[Jones and Powell, 1970] Jones, E. G. and Powell, T. P. S. (1970). An anatomical study of converging sensory pathways within the cerebral cortex of the monkey. Brain, 93:793–820.

[Jones et al., 1992] Jones, S., Martin, R., and Pilbeam, D. (1992). The Cambridge Encyclopedia of Human Evolution. Cambridge University Press, Cambridge, England.

[Jonge and de Poll, 1984] de Jonge, F. H. and van de Poll, N. E. (1984). Relationships between sexual behavior in male and female rats: effects of gonadal hormones. Progress in Brain Research, 61:283–302.

[Joppa et al., 1995] Joppa, M. A., Meisel, R. L., and Garber, M. A. (1995). c-Fos expression in female hamster brain following sexual and aggressive behaviors. Neuroscience, 68:783–792.

[Juhani Hyvärinen, 1982] Hyvärinen, J. (1982). The Parietal Cortex of Monkey and Man. Springer-Verlag, Berlin.

[Just et al., 1996] Just, M. A., Carpenter, P. A., and Hemphill, D. D. (1996). Constraints on Processing Capacity: Architectural or Implementational. pages 141–178. in [Steier and Mitchell, 1996].

[Kaas et al., 1999] Kaas, J. H., Hackett, T. A., and Tramo, M. J. (1999). Auditory processing in primate cerebral cortex. Current Opinion in Neurobiology, 9:164–170.

[Kaas and Huertas, 1988] Kaas, J. H. and Huertas, M. F. (1988). The Somatosensory System of Primates. pages 421–468. in [Steklis and Erwin, 1988], Volume 4: Neurosciences.

[Kaas and Pons, 1988] Kaas, J. H. and Pons, T. P. (1988). The somatosensory system of primates. In Steklis, H. D., editor, Comparative primate biology, Vol 4, Neuroscience, pages 421–468. Liss, New York.

[Kaas et al., 1981] Kaas, J. H., Sur, M., Nelson, R. I., and Merzenich, M. M. (1981). The postcentral somatosensory cortex: multiple representations of the body in primates. pages 29–45. in [Woolsey, 1981], Volume 1, Multiple Somatic Areas.

[Kaczmarek, 2000] Kaczmarek, L. (2000). Gene expression in learning processes. Acta Neurobiologiae Experimentalis, 60:419–424.

[Kahneman and Treisman, 1984] Kahneman, D. and Treisman, A. (1984). Changing views of attention and automaticity. In Parasuraman, R. and Davies, D. R., editors, Varieties of attention, pages 29–61. Academic Press, New York and London.

[Kahneman et al., 1992] Kahneman, D., Treisman, A., and Gibbs, B. (1992). The reviewing of object files: Object-specific integration of information. Cognitive Psychology, 24:175–219.

[Kandel and Schwartz, 1999] Kandel, E. R. and Schwartz, J. H. (1999). Principles of Neural Science.

[Kandel and Schwartz, 2000] Kandel, E. R. and Schwartz, J. H. (2000). Principles of Neural Science, 4th edition. McGraw-Hill, New York.

[Kanwisher, 2000] Kanwisher, N. (2000). Domain Specificity in Face Perception. Nature Neuroscience, 3:759–763.

[Kanwisher, 2001] Kanwisher, N. (2001). Neural Events and Perceptual Awareness. Cognition, 79:89–113.

[Kanwisher, 1987] Kanwisher, N. G. (1987). Repetition blindness: type recognition without token individuation. Cognition, 27:117–143.

[Kaplan, 1995] Kaplan, R. M. (1995). The formal architecture of lexical-functional grammar. pages 7–27. in [Dalrymple et al., 1995].

[Kaplan and Bresnan, 1995] Kaplan, R. M. and Bresnan, J. (1995). Lexical-functional grammar: a formal system for grammatical representation. pages 29–130. in [Dalrymple et al., 1995].

[Karat, 1982] Karat, J. (1982). A model of problem solving with incomplete constraint knowledge. Cognitive Psychology, 14:538–559.

[Karmiloff-Smith, 1992] Karmiloff-Smith, A. (1992). Beyond modularity: a developmental perspective on cognitive science. MIT Press, Cambridge, Massachusetts.

[Karmiloff-Smith, 1993] Karmiloff-Smith, A. (1993). Beyond Modularity. Oxford University Press, Oxford.

[Kawaguchi and Kubota, 1998] Kawaguchi, Y. and Kubota, Y. (1998). Neurochemical features and synaptic connections of large physiologically-identified GABAergic cells in the rat frontal cortex. Neuroscience, 85:677–701.

[Kay, 1985] Kay, M. (1985). Parsing in functional unification grammar. pages 251–278. in [Dowty et al., 1985].

[Kay, 1992] Kay, M. (1992). Unification. pages 1–29. in [Rosner and Johnson, 1992].

[Keele et al., 1990] Keele, S. W., Cohen, A., and Ivry, R. (1990). Motor programs: concepts and issues. pages 77–110. in [Jeannerod, 1990].

[Keith J. Holyoak and John E. Hummel, 2000] Holyoak, K. J. and Hummel, J. E. (2000). The proper treatment of symbols in a connectionist architecture. In Dietrich, E. and Markman, A. B., editors, Cognitive dynamics: Conceptual and representational change in humans and machines, pages 229–263. Lawrence Erlbaum Associates.

[Kelsey C. Martin and Mark Barad and Eric R. Kandel, 2000] Martin, K. C., Barad, M., and Kandel, E. R. (2000). Local protein synthesis and its role in synapse-specific plasticity. Current Opinion in Neurobiology, 10:587–592.

[Kempen, 1976] Kempen, G. (1976). Syntactic constructions as retrieval plans. British Journal of Psychology, 67:149–160.

[Kempen, 1977] Kempen, G. (1977). Conceptualizing and formulating in sentence production. pages 259–274. in [Rosenberg, 1977].

[Kempen, 1978] Kempen, G. (1978). Sentence construction by a psychologically plausible formulator. pages 103–123. in [Campbell and Smith, 1978].

[Kempen, 1987] Kempen, G. (1987). A framework for incremental syntactic tree formation. In Proceedings of the 1987 International Joint Conference on Artificial Intelligence, pages 655–660.

[Kempen, 1989] Kempen, G. (1989). Language generation systems. pages 471–480. in [Batori et al., 1989].

[Kempen, 2000] Kempen, G. (2000). Performance grammars. Unpublished book manuscript.

[Kempen and Hoenkamp, 1982] Kempen, G. and Hoenkamp, E. (1982). Incremental sentence generation: implications for the structure of a syntactic processor. In COLING82: Proceedings of the Ninth International Conference on Computational Linguistics, Prague, July 5-10, edited by Jan Horecky, pages 151–156.

[Kempen and Hoenkamp, 1987] Kempen, G. and Hoenkamp, E. (1987). An incremental procedural grammar for sentence formulation. Cognitive Science, 11:201–258.

[Kempen and Huijbers, 1983] Kempen, G. and Huijbers, P. (1983). The lexicalization process in sentence production and naming: indirect election of words. Cognition, 14:185–209.

[Kempen and Vosse, 1989] Kempen, G. and Vosse, T. (1989). Incremental syntactic tree formation in human sentence processing: a cognitive architecture based on activation decay and simulated annealing. Connection Science, 1:273–290.

[Kennard and Swash, 1989] Kennard, C. and Swash, M. (1989). Hierarchies in neurology : a reappraisal of a Jacksonian concept. Springer-Verlag, New York.

[Kenny, 1963] Kenny, A. (1963). Action, emotion and will. Routledge and Kegan Paul, London.

[Killcross et al., 1997] Killcross, S., Robbins, T. W., and Everitt, B. J. (1997). Different types of fear-conditioned behaviour mediated by separate nuclei within amygdala. Nature, 388:377–380.

[Kimball, 1973a] Kimball, J. (1973a). Seven principles of surface structure parsing in natural language. Cognition, 2:15–47.

[Kimball, 1973b] Kimball, J. P. (1973b). The Formal Theory of Grammar. Prentice-Hall, New Jersey.

[Kimberg and Farah, 1993] Kimberg, D. Y. and Farah, M. J. (1993). A unified account of cognitive impairments following frontal lobe damage: the role of working memory in complex, organized behavior. Journal of Experimental Psychology: General, 122:411–428.

[Kimura, 1983] Kimura, M. (1983). The neutral theory of molecular evolution. Oxford University Press.

[King and Gallistel, 1996] King, A. P. and Gallistel, C. R. (1996). Multiphasic neuronal transfer function for representing temporal structure. Behavior Research Methods, Instruments and Computers, 28:217–223.

[Klahr, 1978] Klahr, D. (1978). Goal formation, planning, and learning by pre-school problem solvers or: ’my socks are in the dryer’. pages 181–212. in [Siegler, 1978].

[Klahr et al., 1987] Klahr, D., Langley, P., and Neches, R. (1987). Production System Models of Learning and Development. M.I.T. Press, Cambridge, Massachusetts.

[Klahr and Robinson, 1981] Klahr, D. and Robinson, M. (1981). Formal assessment of problem-solving and planning processes in preschool children. Cognitive Psychology, 13:113–148.

[Kleene, 1971] Kleene, S. C. (1971). Introduction to metamathematics. Wolters-Noordhoff Pub., Groningen and American Elsevier Pub. Co., New York.

[Klein, 1976] Klein, G. S. (1976). Psychoanalytic theory : an exploration of essentials. International Universities Press, New York.

[Klein, 1982] Klein, S. B. (1982). Motivation: Biosocial approaches. McGraw-Hill, New York.

[Klemm, 1990] Klemm, W. R. (1990). Historical and introductory perspectives on brainstem-mediated behaviors. pages 3–32. in [Klemm and Vertes, 1990].

[Klemm and Vertes, 1990] Klemm, W. R. and Vertes, R. P. (1990). Brainstem mechanisms of behavior. John Wiley, New York.

[Kling and Steklis, 1976] Kling, A. and Steklis, H. D. (1976). A neural substrate for affiliative behavior in nonhuman primates. Brain, Behavior and Evolution, 13:216–238.

[Knowlton et al., 1996] Knowlton, B. J., Mangels, J. A., and Squire, L. R. (1996). A Neostriatal Habit Learning System in Humans. Science, 273:1399–1402.

[Kobayashi and Amaral, 1999] Kobayashi, Y. and Amaral, D. G. (1999). Chemical neuroanatomy of the hippocampal formation and the perirhinal and parahippocampal cortices. pages 285–401. in [Bloom et al., 1999].

[Koch and Davis, 1994] Koch, C. and Davis, J. L. (1994). Large-scale neuronal theories of the brain. M.I.T. Press, Cambridge, Massachusetts.

[Koffka, 1936] Koffka, K. (1936). Principles of Gestalt Psychology. Harcourt, Brace, New York.

[Kohonen, 1989] Kohonen, T. (1989). Self-Organization and Associative Memory. Springer-Verlag, New York.

[Kohut, 1971] Kohut, H. (1971). The Analysis of the Self; a Systematic Approach to the Psychoanalytic Treatment of Narcissistic Personality Disorders. International Universities Press, New York.

[Kolodner, 1984] Kolodner, J. L. (1984). Towards an understanding of the role of experience in the evolution from novice to expert. In Developments in Expert Systems, pages 95–116. Academic Press, London.

[Kolodner, 1993] Kolodner, J. L. (1993). Case-based reasoning. Morgan Kaufman, San Mateo, California.

[Kosslyn, 1980] Kosslyn, S. (1980). Image and Mind. Harvard University Press, Cambridge.

[Kosslyn, 1994] Kosslyn, S. (1994). Image and Brain. M.I.T. Press, Cambridge, Massachusetts.

[Kosslyn et al., 1993] Kosslyn, S. M., Alpert, N. M., Thompson, W. L., Maljkovic, V., Weise, S. B., Chabris, C. F., Hamilton, S. E., Rauch, S. L., and Buonanno, F. S. (1993). Visual mental imagery activates topographically organized visual cortex: PET investigations. Journal of Cognitive Neuroscience, 5:263–287.

[Kotovsky et al., 1985] Kotovsky, K., Hayes, J. R., and Simon, H. A. (1985). Why are some problems hard? Evidence from Tower of Hanoi. Cognitive Psychology, 17:248–294.

[Kourtzi and Kanwisher, 2000] Kourtzi, Z. and Kanwisher, N. (2000). Cortical Regions involved in Processing Object Shape. The Journal of Neuroscience, 20(9):3310–3318.

[Kourtzi and Shiffrar, 1997] Kourtzi, Z. and Shiffrar, M. (1997). One-shot view invariance in a moving world. Psychological Science, 8:461–466.

[Kowalski and Bowen, 1988] Kowalski, R. A. and Bowen, K. A. (1988). Logic Programming: Proceedings of the Fifth International Conference and Symposium. MIT Press, Cambridge, Massachusetts.

[Kraemer, 1992] Kraemer, G. W. (1992). A psychobiological theory of attachment. Behavioral and Brain Sciences, 15:493–541.

[Kraemer et al., 1991] Kraemer, G. W., Ebert, M. H., Schmidt, D. E., and McKinney, W. T. (1991). Strangers in a strange land: a psychobiological study of infant monkeys before and after separation from real or inanimate mothers. Child Development, 62:548–566.

[Kraepelin, 1899] Kraepelin, E. (1899). Psychiatrie. Ein Lehrbuch für Studirende und Aerzte. Sechste, vollständig umgearbeitete Auflage. I. Band. Allgemeine Psychiatrie. II. Band. Klinische Psychiatrie. Barth Verlag, Leipzig. English translation - Clinical psychiatry; a text-book for students and physicians, abstracted and adapted from the 6th German ed. by A. Ross Diefendorf, Macmillan, New York, 1902.

[Kraepelin, 1921] Kraepelin, E. (1921). Manic-depressive insanity and paranoia. Chicago Medical Book Company, Chicago. Translated by R. Mary Barclay from the Eighth German Edition of the "Text-Book of Psychiatry", vol. iii, part ii, section on the Endogenous Dementias. Edited by George M. Robertson. E. and S. Livingstone, Edinburgh, 1921, and Chicago Medical Book Company, Chicago, 1921 [English translation of Chapter XI, "Das manisch-depressive Irresein", from 1913a, and of Chapter XIV, "Die Verrücktheit (Paranoia)", from 1915a].

[Kraepelin, 1922] Kraepelin, E. (1922). Demonstration eines Falles von Alzheimerscher Krankheit. Zentralblatt für die gesamte Neurologie und Psychiatrie, 20:431.

[Kubota and Yamaguchi, 2000] Kubota, Y. and Yamaguchi, Y. (2000). Dependence of GABAergic synaptic areas on the interneuron type and target size. The Journal of Neuroscience, 20:375–386.

[Kuhlenbeck, 1966] Kuhlenbeck, H. (1966). Weitere Bemerkungen zur Maschinentheorie des Gehirns. Confinia Neurol., 27:295–328.

[Kummer, 1975] Kummer, H. (1975). Rules of dyad and group formation among captive gelada baboons: Theropithecus gelada. In Symposium of the 5th Congress of the International Primatology Society, Tokyo. Japan Science Press, Tokyo.

[Kurt Gödel, 1930] Gödel, K. (1930). Die Vollständigkeit der Axiome des logischen Funktionenkalküls. Monatsh. Math. Phys., 37:349–360.

[Lacki, 2000] Lacki, J. (2000). The early axiomatizations of quantum mechanics: Jordan, von Neumann and the continuation of Hilbert’s program. Archive for History of Exact Sciences, 54:279–318.

[Laird and Newell, 1983a] Laird, J. E. and Newell, A. (1983a). A universal weak method. Technical Report, CMU.

[Laird and Newell, 1983b] Laird, J. E. and Newell, A. (1983b). A universal weak method: summary of results. In Proceedings of the 1983 International Joint Conference on Artificial Intelligence. Morgan Kaufman, San Mateo, California.

[Lancaster and Barsalou, 1997] Lancaster, J. S. and Barsalou, L. W. (1997). Multiple organisations of events in memory. Memory, 5:569–599.

[Landin, 1964] Landin, P. J. (1964). The mechanical evaluation of expressions. Computer Journal, 6:308–320.

[Lane and Nadel, 2000] Lane, R. D. and Nadel, L. (2000). Cognitive neuroscience of emotion. Oxford University Press.

[Lane et al., 1998] Lane, R. D., Reiman, E. M., Axelrod, B., Yun, L. S., Holmes, A., and Schwartz, G. E. (1998). Neural correlates of levels of emotional awareness. Evidence of an interaction between emotion and attention in the anterior cingulate cortex. Journal of Cognitive Neuroscience, 10:525–535.

[Langacker, 1983] Langacker, R. W. (1983). Foundations of cognitive grammar. Indiana University Linguistics Club, Bloomington, Indiana.

[Langacker, 1999] Langacker, R. W. (1999). Grammar and conceptualization. Mouton, The Hague, Paris and New York.

[Langacker, 2002] Langacker, R. W. (2002). Concept, image, and symbol : the cognitive basis of grammar. Mouton, The Hague, Paris and New York.

[Langley, 1985] Langley, P. (1985). Learning to Search: From Weak Methods to Domain-Specific Heuristics. Cognitive Science, 9:217–260.

[Langley, 1996] Langley, P. (1996). Elements of machine learning. Morgan Kaufman, San Mateo, California.

[Lashley, 1951] Lashley, K. S. (1951). The problem of serial order in behavior. In Jeffress, L. A., editor, Cerebral mechanisms in behavior: The Hixon Symposium, pages 112–146. John Wiley, New York.

[Lavenex and Amaral, 2000] Lavenex, P. and Amaral, D. G. (2000). Hippocampal-Neocortical Interaction: A Hierarchy of Associativity. Hippocampus, 10:420–430.

[Ledoux, 1996] Ledoux, J. (1996). The Emotional Brain: The Mysterious Underpinnings of Emotional Life. Simon and Schuster.

[Lee and Estes, 1977] Lee, C. L. and Estes, W. K. (1977). Order and position in primary memory for letter strings. Journal of Verbal Learning and Verbal Behavior, 16:395–418.

[Lee and Estes, 1981] Lee, C. L. and Estes, W. K. (1981). Item and order information in short-term memory: Evidence for multilevel perturbation processes. Journal of Experimental Psychology: Human Learning and Memory, 7:149–169.

[Levelt, 1989] Levelt, W. J. M. (1989). Speaking: From intention to articulation. MIT Press, Cambridge, Massachusetts.

[Lewandowsky and Murdock, 1989] Lewandowsky, S. and Murdock, B. B., Jr. (1989). Memory for serial order. Psychological Review, 96:25–57.

[Lewis, 1993] Lewis, R. L. (1993). An architecturally-based theory of human sentence comprehension. PhD thesis, Carnegie Mellon University. CMU-CS-93-226.

[Lhermitte, 1983] Lhermitte, F. (1983). Utilization behaviour and its relation to lesions of the frontal lobe. Brain, 106:237–255.

[Libet, 1993] Libet, B. (1993). Neurophysiology of Consciousness: Selected Papers and New Essays. Birkhauser, Boston.

[Lichtheim, 1885] Lichtheim, L. (1885). On aphasia. Brain, 7:433–484.

[Lieberman, 1991] Lieberman, P. (1991). Uniquely Human: The evolution of speech, thought and selfless behavior. Harvard University Press, Cambridge.

[Lindburg, 1971] Lindburg, D. G. (1971). The rhesus monkey in North India: an ecological and behavioral study. In Rosenblum, L. A., editor, Primate Behavior: Developments in Field and Laboratory Research, Volume 2, pages 1–106. Academic Press, New York and London.

[Liu, 2000] Liu, C. (2000). A Theory of Marital Sexual Life. Journal of Marriage and the Family, 62:363–374.

[Locke, 1690] Locke, J. (1690). An essay concerning human understanding. Oxford University Press. Also published in 1979, edited with a foreword by Peter H. Nidditch.

[Logothetis and Sheinberg, 1996] Logothetis, N. K. and Sheinberg, D. L. (1996). Visual object recognition. Annual Review of Neuroscience, 19:577–621.

[Lollo et al., 2000] Di Lollo, V., Enns, J. T., and Rensink, R. A. (2000). Competition for consciousness among visual events: The psychophysics of reentrant visual processes. Journal of Experimental Psychology: General, 129:481–507.

[Lollo et al., 2002] Di Lollo, V., Enns, J. T., and Rensink, R. A. (2002). Object substitution without reentry? Journal of Experimental Psychology: General, 131:594–596.

[Lubke et al., 2000] Lubke, J., Egger, V., Sakmann, B., and Feldmeyer, D. (2000). Columnar organization of dendrites and axons of single and synaptically coupled excitatory spiny neurons in layer 4 of the rat barrel cortex. The Journal of Neuroscience, 20:5300–5311.

[Luchins, 1942] Luchins, A. S. (1942). Mechanization of problem solving. Psychological Monographs, 54. Whole number 248.

[Luck et al., 1996] Luck, S. J., Vogel, E. K., and Shapiro, K. L. (1996). Word meanings can be accessed but not reported during the attentional blink. Nature, 383:616–618.

[Luiten et al., 1985] Luiten, P. G. M., Koolhaas, J. M., de Boer, S., and Koopmans, S. J. (1985). The cortico-medial amygdala in the central nervous system organization of agonistic behavior. Brain Research, 332:283–297.

[Luppino et al., 1991] Luppino, G., Matelli, M., Camarda, R. M., Gallese, V., and Rizzolatti, G. (1991). Multiple representations of body movements in mesial area 6 and the adjacent cingulate cortex: an intracortical microstimulation study in the macaque monkey. Journal of Comparative Neurology, 311:463–482.

[Luria, 1970] Luria, A. R. (1970). The Functional Organization of the Brain. Scientific American, 222(3):66–78.

[Luria, 1978] Luria, A. R. (1978). The Human Brain and Conscious Activity. In Schwartz, G. E. and Shapiro, D., editors, Consciousness and Self-Regulation, Advances in Research, volume 2, pages 1–35. Plenum Press, New York.

[Luria, 1980] Luria, A. R. (1980). Higher Cortical Functions in Man. Basic Books, New York.

[Lutz, 1988] Lutz, C. A. (1988). Unnatural emotions : everyday sentiments on a Micronesian atoll and their challenge to western theory. University of Chicago Press.

[Lynch, 1980] Lynch, J. C. (1980). The functional organization of the posterior parietal association cortex. Behavioral and Brain Sciences, 3:485–534.

[Mack and Rock, 1998] Mack, A. and Rock, I. (1998). Inattentional blindness: Perception without attention. M.I.T. Press, Cambridge, Massachusetts.

[Mackey, 1963] Mackey, G. W. (1963). The mathematical foundation of quantum mechanics. W. A. Benjamin, Reading, Massachusetts.

[Maclean, 1970] Maclean, P. D. (1970). The triune brain, emotion, and scientific bias. In Schmitt, F. O., editor, The Neurosciences Second Study Program, pages 336–349. Rockefeller University Press.

[Main, 1995] Main, M. (1995). Recent studies in attachment: overview, with selected implications for clinical work. in [Goldberg et al., 1995].

[Malach, 1994] Malach, R. (1994). Cortical columns as devices for maximizing neuronal diversity. Trends in Neurosciences, 17:101–104.

[Malenka and Siegelbaum, 2001] Malenka, R. C. and Siegelbaum, S. A. (2001). Synaptic plasticity: Diverse targets and mechanisms for regulating synaptic efficacy. pages 393–453. in [Cowan et al., 2001].

[Mandl and Lesgold, 1988] Mandl, H. and Lesgold, A. (1988). Learning Issues for Intelligent Tutoring Systems. Springer-Verlag, Berlin.

[Markowitsch, 1995] Markowitsch, H. J. (1995). Anatomical basis of memory disorders. pages 765–779. in [Gazzaniga, 1995].

[Marr, 1970] Marr, D. (1970). A Theory for Cerebral Neocortex. Proceedings of the Royal Society of London, Series B, 176:161–234.

[Martin Lades et al., 1993] Lades, M., et al. (1993). Distortion Invariant Object Recognition in the Dynamic Link Architecture. IEEE Transactions on Computers, 42:300–311.

[Marvin Minsky, 1965] Minsky, M. (1965). Matter, Mind and Models. In Proceedings of IFIP Congress 1965, I, pages 45–49. Spartan Books, Washington, D.C., May. Published in Semantic Information Processing, MIT Press, 1968.

[Marvin Minsky, 1974] Minsky, M. (1974). A Framework for Representing Knowledge. AI Memo 306, MIT, June. Reprinted in The Psychology of Computer Vision, P. Winston, Ed., McGraw-Hill, 1975. Shorter versions in J. Haugeland, Ed., Mind Design, MIT Press, 1981, and in Cognitive Science, Collins, Allan and Edward E. Smith, Eds., Morgan-Kaufmann, 1992, ISBN 55860-013-2.

[Marvin Minsky and Seymour Papert, 1972] Minsky, M. and Papert, S. (1972). Artificial Intelligence Progress Report. AI Memo 252, MIT, January.

[Marx, 1970] Marx, O. M. (1970). Nineteenth-century medical psychology: Theoretical problems in the work of Griesinger, Meynert, and Wernicke. Isis, 61:355–370.

[Mason, 1993] Mason, W. A. (1993). The nature of social conflict: a psycho-ethological perspective. pages 13–47. in [Mason and Mendoza, 1993].

[Mason and Mendoza, 1993] Mason, W. A. and Mendoza, S. P. (1993). Primate social conflict. State University of New York Press, Albany, N.Y.

[Masters et al., 1992] Masters, W., Johnson, V., and Kolodny, P. (1992). Human sexuality. Harper Collins, New York.

[Mattingley et al., 1997] Mattingley, J. B., Davis, G., and Driver, J. (1997). Preattentive filling-in of visual surfaces in parietal extinction. Science, 275:671–674.

[Maunsell, 1995] Maunsell, J. H. R. (1995). The Brain’s Visual World: Representation of Visual Targets in Cerebral Cortex. Science, 270:764–769.

[McClelland et al., 1995] McClelland, J. L., McNaughton, B. L., and O'Reilly, R. C. (1995). Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychological Review, 102:419–457.

[McConkie and Currie, 1996] McConkie, G. and Currie, C. (1996). Visual stability across saccades while viewing complex pictures. Journal of Experimental Psychology: Human Perception and Performance, 22:563–581.

[McFarland, 1993] McFarland, D. (1993). Animal behaviour : psychobiology, ethology, and evolution. John Wiley, New York. 2nd Edition.

[McGaugh, 2002] McGaugh, J. L. (2002). Memory consolidation and the amygdala: a systems perspective. Trends in Neurosciences, 25:456–461.

[McGuire, 1974] McGuire, M. T. (1974). The St. Kitts Vervet. S. Karger, Basel.

[McGuire et al., 1986] McGuire, M. T., Brammer, G. L., and Raleigh, M. J. (1986). Resting cortisol levels and the emergence of dominant status among male vervet monkeys. Hormones and Behavior, 20:106–117.

[McIntosh et al., 1994] McIntosh, A. R., Grady, C. L., Ungerleider, L. G., Haxby, J. V., Rapoport, S. I., and Horwitz, B. (1994). Network analysis of cortical visual pathways mapped by PET. The Journal of Neuroscience, 14:655–666.

[Meadows, 1974] Meadows, J. C. (1974). Disturbed perception of colours associated with localized cerebral lesions. Brain, 97:615–632.

[Mega and Cummings, 1994] Mega, M. S. and Cummings, J. L. (1994). Frontal-subcortical circuits and neuropsychiatric disorders. Journal of Neuropsychiatry and Clinical Neurosciences, 6:358–370.

[Mel, 1997] Mel, B. W. (1997). SEEMORE: Combining color, shape, and texture histogramming in a neurally inspired approach to visual object recognition. Neural Computation, 9:777–804.

[Mendelson, 1979] Mendelson, E. (1979). Introduction to Mathematical Logic, 2nd edition. Van Nostrand Reinhold Co., New York.

[Mentzel et al., 1998] Mentzel, H.-J., Gaser, C., Volz, H.-P., Rzanny, R., Hager, F., Sauer, H., and Kaiser, W. A. (1998). Cognitive stimulation with the Wisconsin Card Sorting Test: Functional MR Imaging at 1.5 T. Radiology, 207:399–404.

[Merzenich et al., 1981] Merzenich, M. M., Sur, M., Nelson, R. I., and Kaas, J. H. (1981). Multiple cutaneous representations in areas 3b and 1 of the owl monkey. pages 67–119. in [Woolsey, 1981], Volume 1, Multiple Somatic Areas.

[Mesulam, 1990] Mesulam, M.-M. (1990). Large-scale neurocognitive networks and distributed processing for attention, language and memory. Annals of Neurology, 28:597–613.

[Meunier et al., 1997] Meunier, M., Bachevalier, J., and Mishkin, M. (1997). Effects of orbital frontal and anterior cingulate lesions on object and spatial memory in rhesus monkeys. Neuropsychologia, 35:999–1015.

[Meynert, 1884] Meynert, T. (1884). Psychiatrie : Klinik der Erkrankungen des Vorderhirns begründet auf dessen Bau, Leistungen und Ernährung, 1. Hälfte / von Theodor Meynert. Wilhelm Braumueller, Wien.

[Micevych and Hammer, 1995] Micevych, P. E. and Hammer, R. P. (1995). Neurobiological effects of sex steroid hormones. Cambridge University Press, Cambridge, England.

[Michalski et al., 1990] Michalski, R. S., Carbonell, J. G., and Mitchell, T. M. (1983–1990). Machine learning : an artificial intelligence approach. Morgan Kaufman, San Mateo, California.

[Michon and Akyurek, 1992] Michon, J. A. and Akyurek, A. (1992). Soar: a cognitive architecture in perspective. Kluwer Academic Publishers, Boston, Dordrecht, London.

[Michon et al., 1988] Michon, J. A., Pouthas, V., and Jackson, J. L. (1988). Guyau and the idea of time. North Holland, Amsterdam.

[Millen et al., 1995] Millen, S. J., Haughton, V. M., and Yetkin, Z. (1995). Functional Magnetic Resonance Imaging of the Central Auditory Pathway Following Speech and Pure-Tone Stimuli. Laryngoscope, 105:1305–1310.

[Miller, 1991] Miller, R. (1991). Cortico-Hippocampal Interplay and the representation of contexts in the brain. Springer-Verlag, Berlin.

[Milner and Goodale, 1995] Milner, A. D. and Goodale, M. A. (1995). The visual brain in action. Oxford University Press, Oxford.

[Mineka and Cook, 1988] Mineka, S. and Cook, M. (1988). Social learning and the acquisition of snake fear in monkeys. pages 51–73. in [Tho and Bennett G. Galef, 1988].

[Mineka and Cook, 1993] Mineka, S. and Cook, M. (1993). Mechanisms involved in the observational conditioning of fear. Journal of Experimental Psychology: General, 122:23–38.

[Mineka and Ohman, 2002] Mineka, S. and Ohman, A. (2002). Phobias and preparedness: the selective, automatic and encapsulated nature of fear. Biological Psychiatry, 52:927–937.

[Minsky, 1967] Minsky, M. (1967). Computation: Finite and infinite machines. M.I.T. Press, Cambridge, Massachusetts.

[Minsky, 1986] Minsky, M. (1986). The Society of Mind. Simon and Schuster, New York.

[Minsky, 2003] Minsky, M. (2003). The emotion machine. manuscript in preparation, viewable on the web.

[Minton et al., 1989] Minton, S., Carbonell, J. G., Knoblock, C. A., Kuokka, D. R., Etzioni, O., and Gil, Y. (1989). Explanation-based Learning: A Problem Solving Perspective. Artificial Intelligence, 40:63–118.

[Mishkin, 1979] Mishkin, M. (1979). Analogous neural models for tactual and visual learning. Neuropsychologia, 17:139–151.

[Mitchell and Poston, 2001] Mitchell, C. L. and Poston, C. S. L. (2001). Effects of inhibiting of response on Tower of London performance. Current Psychology: Developmental, Learning, Personality, Social, 20:164–168.

[Mitra and Mishra, 1990] Mitra, N. R. and Mishra, A. K. (1990). Mechanism of perception and brain-computer model. Psychologia, 33:63–72.

[Mitra and Mishra, 1993] Mitra, N. R. and Mishra, A. K. (1993). A comparative study of the functional attributes of computer, brain and the brain-computer model with reference to artificial intelligence. Psychologia, 36:84–91.

[Miyashita, 1993] Miyashita, Y. (1993). Inferior Temporal Cortex: Where visual perception meets memory. Annual Review of Neuroscience, 16:245–263.

[Mogenson et al., 1980] Mogenson, G. J., Jones, D. L., and Yim, C. Y. (1980). From motivation to action: Functional interface between the limbic system and the motor system. Progress in Neurobiology, 14:69–97.

[Moore and Anderson, 1954] Moore, O. K. and Anderson, S. B. (1954). Modern logic and tasks for experiments on problem solving behavior. Journal of Psychology, 38:151–160.

[Morris et al., 1998] Morris, J. S., Friston, K. J., and Dolan, R. J. (1998). Experience- dependent modulation of tonotopic neural responses in human auditory cortex. Pro- ceedings of the Royal Society of London, 265:649–657.

[Morris et al., 1993] Morris, R. G., Ahmed, S., Syed, G. M., and Toone, B. K. (1993). Neural correlates of planning ability: frontal lobe activation during the Tower of London test. Neuropsychologia, 31:1367–1378.

[Morris et al., 1988] Morris, R. G., Downes, J. J., Sahakian, B. J., Evenden, J. L., Heald, A., and Robbins, T. W. (1988). Planning and spatial working memory in Parkinson’s disease. Journal of Neurology, Neurosurgery and Psychiatry, 51:757–766.

[Morton, 1968] Morton, J. (1968). Repeated items and decay in memory. Psychonomic Science, 10:219–220.

[Morton, 1970] Morton, J. (1970). A functional model of memory. pages 203–254. in [Norman, 1970].

[Mountcastle, 1957] Mountcastle, V. B. (1957). Modality and topographic properties of single neurons of cat's somatic sensory cortex. Journal of Neurophysiology, 20:408–434.

[Mountcastle, 1995a] Mountcastle, V. B. (1995a). The evolution of ideas concerning the function of the neocortex. Cerebral Cortex, 5:289–295.

[Mountcastle, 1995b] Mountcastle, V. B. (1995b). The parietal system and some higher brain functions. Cerebral Cortex, 5:377–390.

[Mountcastle, 1997] Mountcastle, V. B. (1997). The columnar organization of the neocortex. Brain, 120:701–722.

[Moyer, 1976] Moyer, K. E. (1976). The psychobiology of aggression. Harper and Row.

[Muir et al., 1996] Muir, J. L., Everitt, B. J., and Robbins, T. W. (1996). The cerebral cortex of the rat and visual attentional function: dissociable effects of mediofrontal, cingulate, anterior dorsolateral, and parietal cortex lesions on a five-choice serial reaction time task. Cerebral Cortex, 6:470–481.

[Mumford, 1992] Mumford, D. (1992). On the computational architecture of the neocortex. II. The role of cortico-cortical loops. Biological Cybernetics, 66:241–251.

[Mumford, 1994] Mumford, D. (1994). Neuronal Architectures for Pattern-theoretic Problems. pages 125–152. in [Koch and Davis, 1994].

[Murji and DeLuca, 1998] Murji, S. and DeLuca, J. W. (1998). Preliminary validity of the Cognitive Function Checklist: Prediction of Tower of London performance. Clinical Neuropsychologist, 12:358–364.

[Nadel and Jacobs, 1996] Nadel, L. and Jacobs, W. J. (1996). The role of the hippocampus in PTSD, panic and phobia. In Kato, N., editor, The hippocampus: functions and clinical relevance, pages 455–463. Elsevier Science Publishers B.V.

[Nadel et al., 2000] Nadel, L., Samsonovich, A., Ryan, L., and Moscovitch, M. (2000). Multiple Trace Theory of Human Memory: Computational, Neuroimaging, and Neuropsychological Results. Hippocampus, 10:352–368.

[Najjar et al., 1999] Najjar, W. A., Lee, E. A., and Gao, G. R. (1999). Advances in the dataflow computational model. Parallel Computing, 25:1907–1929.

[Neal et al., 1990] Neal, J. W., Pearson, R. C. A., and Powell, T. P. S. (1990). The connections of area PG, 7a, with cortex in the parietal, occipital and temporal lobes of the monkey. Brain Research, 532:249–264.

[Neisser, 1976] Neisser, U. (1976). Cognition and reality : principles and implications of cognitive psychology. W. H. Freeman, San Francisco.

[Neisser and Winograd, 1988] Neisser, U. and Winograd, E. (1988). Remembering reconsidered: Ecological and traditional approaches to the study of memory. Cambridge University Press, Cambridge, England.

[Newell, 1962] Newell, A. (1962). Some problems of basic organization in problem-solving programs. In Yovits, M. C., Jacobi, G. T., and Goldstein, G. D., editors, Self-Organizing Systems, pages 393–423. Spartan Books, Washington, D.C.

[Newell, 1963] Newell, A. (1963). A guide to the General Problem Solver Program 2-2. Technical Report RM-3337, Rand Corporation, Santa Monica, California.

[Newell, 1980] Newell, A. (1980). Physical symbol systems. Cognitive Science, 4:135–183.

[Newell, 1990] Newell, A. (1990). Unified theories of cognition. Harvard University Press, Cambridge.

[Newell, 1992] Newell, A. (1992). Unified theories of cognition and the role of Soar. pages 25–79. in [Michon and Akyurek, 1992].

[Newell and Shaw, 1957] Newell, A. and Shaw, J. C. (1957). Programming the logic theory machine. Proceedings of the Western Joint Computer Conference, pages 230–240.

[Newell et al., 1958] Newell, A., Shaw, J. C., and Simon, H. A. (1958). Elements of a theory of problem solving. Psychological Review, 65:151–166.

[Newell and Simon, 1965] Newell, A. and Simon, H. (1965). An example of human chess play in the light of chess playing programs. In Wiener, N. and Schade, J. P., editors, Progress in Biocybernetics, Volume 2, pages 19–75. Elsevier Science Publishers B.V.

[Newell and Simon, 1972] Newell, A. and Simon, H. (1972). Human problem solving. Prentice Hall, Englewood Cliffs, New Jersey.

[Newell and Simon, 1961] Newell, A. and Simon, H. A. (1961). Computer simulation of human thinking. Science, 134:2011–2017.

[Newell et al., 1960] Newell, A., Tonge, F. M., Feigenbaum, E. A., Nealy, G. H., Saber, N., Green, B. F., and Wolf, A. K. (1960). Information processing language V manual. Technical Report P-1897, The Rand Corporation.

[Newman, 1978] Newman, J. D. (1978). Perception of sounds used in species-specific communication: the auditory cortex and beyond. Journal of Medical Primatology, 7:98–105.

[Newman, 1999] Newman, S. W. (1999). The medial extended amygdala in male reproductive behavior. Annals of the New York Academy of Sciences, 877:242–257.

[Newmeyer, 1988] Newmeyer, F. J. (1988). Linguistics: The Cambridge Survey, Volume 3, Language: psychological and biological aspects. Cambridge University Press, Cambridge.

[Noback and Montagna, 1970] Noback, C. R. and Montagna, W. (1970). The primate brain: advances in primatology. Appleton-Century-Crofts, New York.

[Noback et al., 1991] Noback, C. R., Strominger, N. L., and Demarest, R. J. (1991). The human nervous system: introduction and review. Lea and Febiger, Philadelphia.

[Norman, 1970] Norman, D. A. (1970). Models of human memory. Academic Press, London.

[Norman, 1981] Norman, D. A. (1981). A psychologist views human processing: Human errors and other phenomena suggest processing mechanisms. In Proceedings of the Seventh International Joint Conference on Artificial Intelligence (IJCAI-81), volume 2, pages 1097–1101, University of British Columbia, Vancouver, BC.

[Norman and Bobrow, 1975] Norman, D. A. and Bobrow, D. G. (1975). On data-limited and resource-limited processes. Cognitive Psychology, 7:44–64.

[Northcutt, 1991] Northcutt, R. G. (1991). Evolution of the telencephalon in nonmammals. Annual Review of Neuroscience, 4:301–350.

[Noton and Stark, 1971] Noton, D. and Stark, L. W. (1971). Eye Movements and Visual Perception. Scientific American, 224:34–43.

[Numminen et al., 2001] Numminen, H., Lehto, J. E., and Ruoppila, I. (2001). Tower of Hanoi and working memory in adult persons with intellectual disability. Research in Developmental Disabilities, 22:373–387.

[Nyby et al., 1992] Nyby, J., Matochik, J. A., and Barfield, R. J. (1992). Intracranial androgenic and estrogenic stimulation of male-typical behaviors in house mice. Hormones and Behavior, 26:24–45.

[O’Kane et al., 1995] O’Kane, B., Biederman, I., Cooper, E. E., and Nystrom, B. (1995). An account of object identification confusions. Journal of Experimental Psychology: Applied, 3:21–41.

[O’Keefe and Nadel, 1978] O’Keefe, J. and Nadel, L. (1978). The hippocampus as a cognitive map. Oxford University Press, Oxford.

[Olsen and Koppe, 1988] Olsen, O. A. and Koppe, S. (1988). Freud’s Theory of Psycho- analysis. New York University Press, New York.

[Olshausen et al., 1993] Olshausen, B. A., Anderson, C. H., and Van Essen, D. C. (1993). A neurobiological model of visual attention and invariant pattern recognition based on dynamic routing of information. The Journal of Neuroscience, 13:4700–4719.

[Olton et al., 1985] Olton, D. S., Gamzu, E., and Corkin, S. (1985). Memory dysfunctions: an integration of animal and human research from preclinical and clinical perspectives. New York Academy of Sciences, New York. Annals of the New York Academy of Sciences, Volume 444.

[Oram et al., 1993] Oram, M. W., Perrett, D. I., and Hietanen, J. K. (1993). Directional tuning of motion-sensitive cells in the anterior superior temporal polysensory area of the macaque. Experimental Brain Research, 97:274–294.

[Ortony et al., 1988] Ortony, A., Clore, G., and Collins, A. (1988). The cognitive structure of emotions. Cambridge University Press, Cambridge, England.

[Owen et al., 1990] Owen, A. M., Downes, J. J., Sahakian, B. J., Polkey, C. E., and Robbins, T. W. (1990). Planning and spatial working memory following frontal lobe lesions in man. Neuropsychologia, 28:1021–1034.

[Pandya, 1995] Pandya, D. N. (1995). Anatomy of the Auditory Cortex. Revue Neurologique (Paris), 151:486–494.

[Pandya and Sanides, 1973] Pandya, D. N. and Sanides, F. (1973). Architectonic parcellation of the temporal operculum in the rhesus monkey, and its projection pattern. Z. Anat. Entwicklungsgesch., 139:127–161.

[Pandya and Seltzer, 1982] Pandya, D. N. and Seltzer, B. (1982). Intrinsic connections and architectonics of posterior parietal cortex in the rhesus monkey. Journal of Comparative Neurology, 204:196–210.

[Pandya and Yeterian, 1985] Pandya, D. N. and Yeterian, E. H. (1985). Architecture and Connections of Cortical Association Areas. pages 3–61. Plenum Press, New York. in [Peters and Jones, 1985], volume 4.

[Pandya and Yeterian, 1990] Pandya, D. N. and Yeterian, E. H. (1990). Prefrontal cortex in relation to other cortical areas in rhesus monkey: Architecture and connections. Progress in Brain Research, 85:63–94. H.B.M. Uylings et al, editors.

[Pardo et al., 1990] Pardo, J. V., Pardo, P. J., Janer, K. W., and Raichle, M. E. (1990). The anterior cingulate cortex mediates processing selection in the Stroop attentional conflict paradigm. Proceedings of the National Academy of Sciences of the USA, 87:256–259.

[Parent, 1996] Parent, A. (1996). Carpenter’s Human Neuroanatomy, Ninth Edition. Williams and Wilkins, Baltimore.

[Paris et al., 1991] Paris, C. L., Swartout, W. R., and Mann, W. C. (1991). Natural Language Generation in artificial intelligence and computational linguistics. Kluwer, Dordrecht.

[Parks and Cardoso, 1997] Parks, R. W. and Cardoso, J. (1997). Parallel distributed processing and executive functioning: Tower of Hanoi neural network model in healthy controls and left frontal lobe patients. International Journal of Neuroscience, 89:217–240.

[Parsons, 1990] Parsons, T. (1990). Events in the semantics of English: a study in subatomic semantics. M.I.T. Press, Cambridge, Massachusetts.

[Passingham, 1993] Passingham, R. E. (1993). The Frontal Lobes and Voluntary Action. Oxford University Press, Oxford.

[Patterson and Hennessy, 1996] Patterson, D. A. and Hennessy, J. L. (1996). Computer architecture : a quantitative approach. Morgan Kaufman, San Mateo, California.

[Paus et al., 1998] Paus, T., Koski, L., Caramanos, Z., and Westbury, C. (1998). Regional differences in the effects of task difficulty and motor output on blood flow response in the human anterior cingulate cortex: a review of 107 PET activation studies. Neuroreport, 9:R37–47.

[Paus et al., 1993] Paus, T., Petrides, M., Evans, A. C., and Meyer, E. (1993). Role of the human anterior cingulate cortex in the control of oculomotor, manual, and speech responses: a positron emission tomography study. Journal of Neurophysiology, 70:453–469.

[Paxinos, 1990] Paxinos, G. (1990). The Human Nervous System. Academic Press, London.

[Pellis, 1989] Pellis, S. M. (1989). Fighting: the problem of selecting appropriate behavior patterns. In Blanchard, R. J., Brain, P. F., Blanchard, D. C., and Parmigiani, S., editors, Ethoexperimental approaches to the study of behavior. Kluwer, Dordrecht.

[Pereira and Shieber, 1987] Pereira, F. C. and Shieber, S. M. (1987). Prolog and natural-language analysis. Center for the Study of Language and Information, Stanford, California.

[Perrett et al., 1989a] Perrett, D. I., Harries, M. H., Bevan, R., Thomas, S., Benson, P. J., Mistlin, A. J., Chitty, A. J., Hietanen, J. K., and Ortega, J. E. (1989a). Frameworks of analysis for the neural representation of animate objects and actions. Journal of Experimental Biology, 146:87–113.

[Perrett et al., 1990a] Perrett, D. I., Harries, M. H., Chitty, A. J., and Mistlin, A. J. (1990a). Three stages in the classification of body movements by visual neurones. In Barlow, H. B., Blakemore, C., and Weston-Smith, M., editors, Images and Understanding, pages 94–107. Cambridge University Press, Cambridge, England.

[Perrett et al., 1990b] Perrett, D. I., Harries, M. H., Mistlin, A. J., Hietanen, J. K., Benson, P. J., Bevan, R., Thomas, S., Oram, M. W., Ortega, J., and Brierley, K. (1990b). Social signals analyzed at the single cell level: Someone is looking at me, something touched me, something moved! International Journal of Comparative Psychology, 4:25–55.

[Perrett et al., 1992] Perrett, D. I., Hietanen, J. K., Oram, M. W., and Benson, P. (1992). Organization and functions of cells responsive to faces in the temporal cortex. Philosophical Transactions of the Royal Society, in press.

[Perrett and Mistlin, 1990] Perrett, D. I. and Mistlin, A. J. (1990). Perception of facial characteristics by monkeys. In Berkley, M. A. and Stebbins, W., editors, Comparative Perception, vol. 2. John Wiley, New York.

[Perrett et al., 1989b] Perrett, D. I., Mistlin, A. J., Harries, M. H., and Chitty, A. J. (1989b). Understanding the visual appearance and consequence of hand actions. In Goodale, M. A., editor, Vision and Action: The Control of Grasping, pages 163–180. Ablex Publishing Corporation.

[Perrett et al., 1991] Perrett, D. I., Oram, M. W., Harries, M. H., Bevan, R., Hietanen, J. K., Benson, P. J., and Thomas, S. (1991). Viewer-centred and object-centred coding of heads in the macaque temporal cortex. Experimental Brain Research, pages 159–173.

[Perrett et al., 1979] Perrett, D. I., Rolls, E. T., and Caan, W. (1979). Temporal lobe cells of the monkey with visual responses selective for faces. Neurosci. Lett., Suppl. 3:S358.

[Perrett et al., 1985] Perrett, D. I., Smith, P. A. J., Potter, D. D., Mistlin, A. J., Head, A. S., Milner, A. D., and Jeeves, M. A. (1985). Visual cells in the temporal cortex sensitive to face view and gaze direction. Proc. R. Soc. London B, 223:293–317.

[Peters and Jones, 1985] Peters, A. and Jones, E. G. (1985). Cerebral cortex. Plenum Press, New York.

[Petersen and Fiez, 1993] Petersen, S. E. and Fiez, J. A. (1993). The Processing of Single Words Studied with Positron Emission Tomography. Annual Review of Neuroscience, 16:509–530.

[Petersen et al., 1988] Petersen, S. E., Fox, P. T., Posner, M. I., Mintun, M., and Raichle, M. E. (1988). Positron emission tomographic studies of the cortical anatomy of single word processing. Nature, 331:585–589.

[Petrides, 1994] Petrides, M. (1994). Frontal lobes and working memory: evidence from investigations of the effects of cortical excisions in nonhuman primates. pages 59–82. in [Boller and Grafman, 1994].

[Petrides and Pandya, 1994] Petrides, M. and Pandya, D. N. (1994). Comparative architectonic analysis of the human and macaque frontal cortex. pages 17–58. in [Boller and Grafman, 1994].

[Petsche and Brazier, 1972] Petsche, H. and Brazier, M. A. B. (1972). Synchronisation of EEG activity in epilepsies. Springer-Verlag, Berlin.

[Phillips et al., 2001] Phillips, L. H., Wynn, V. E., McPherson, S., and Gilhooly, K. J. (2001). Mental planning and the Tower of London task. Quarterly Journal of Experimental Psychology, 54A:579–597.

[Phillips et al., 1988] Phillips, J. R., Johnson, K. O., and Hsiao, S. S. (1988). Spatial pattern representation and transformation in monkey somatosensory cortex. Proceedings of the National Academy of Sciences of the USA, 85:1317–1321.

[Phillips et al., 1999] Phillips, L. H., Wynn, V., Gilhooly, K. J., Sala, S. D., and Logie, R. H. (1999). The role of memory in the Tower of London task. Memory, 7:209–231.

[Piaget, 1976] Piaget, J. (1976). The grasp of consciousness: action and concept in the young child. Harvard University Press, Cambridge.

[Picard and Strick, 1996] Picard, N. and Strick, P. L. (1996). Motor areas of the medial wall: a review of their location and functional activation. Cerebral Cortex, 6:342–353.

[Pinel, 1993] Pinel, J. P. J. (1993). Biopsychology. Allyn and Bacon: Boston.

[Plutchik, 1980] Plutchik, R. (1980). Emotion, a psychoevolutionary synthesis. Harper and Row.

[Poggio and Edelman, 1990] Poggio, T. and Edelman, S. (1990). A network that learns to recognize three-dimensional objects. Nature, 343:263–266.

[Poggio and Shelton, 1999] Poggio, T. and Shelton, C. R. (1999). Machine learning, machine vision, and the brain. AI Magazine, 20(3):37–55.

[Pollard and Sag, 1994] Pollard, C. and Sag, I. A. (1994). Head-driven Phrase Structure Grammar. University of Chicago Press, Chicago, Illinois.

[Porter, 1990] Porter, R. (1990). The Kugelberg lecture. Brain mechanisms of volun- tary motor commands–a review. Electroencephalography and Clinical Neurophysiology, 76:282–293.

[Posner and Rothbart, 1998] Posner, M. I. and Rothbart, M. K. (1998). Attention, self-regulation and consciousness. Philosophical Transactions of the Royal Society, B353(1377):1915–1927.

[Potter, 1976] Potter, M. C. (1976). Short term conceptual memory for pictures. Journal of Experimental Psychology: Human Learning and Memory, 2:509–522.

[Preuss and Goldman-Rakic, 1991a] Preuss, T. M. and Goldman-Rakic, P. S. (1991a). Myelo- and Cytoarchitecture of the Granular Frontal Cortex and Surrounding Regions in the Strepsirhine Primate Galago and the Anthropoid Primate Macaca. Journal of Comparative Neurology, 310:429–474.

[Preuss and Goldman-Rakic, 1991b] Preuss, T. M. and Goldman-Rakic, P. S. (1991b). Myelo- and cytoarchitecture of the granular frontal cortex and surrounding regions in the strepsirhine primate Galago and the anthropoid primate Macaca. Journal of Neurophysiology, 310:429–474.

[Price, 1990] Price, J. L. (1990). Olfactory System. pages 979–998. in [Paxinos, 1990].

[Price, 1991] Price, J. L. (1991). The Central Olfactory and Accessory Olfactory Systems. pages 179–203. in [Finger and Silver, 1991].

[Price et al., 1987] Price, J. L., Russchen, F. T., and Amaral, D. G. (1987). Chapter III. The limbic region. II: The amygdaloid complex. pages 279–388. in [Bjorklund et al., 1987].

[Pritchett, 1992] Pritchett, B. L. (1992). Grammatical competence and parsing performance. University of Chicago Press, Chicago, Illinois.

[Purves et al., 1994] Purves, D., Riddle, D. R., White, L. E., Gutierrez-Ospina, G., and LaMantia, A. S. (1994). Categories of cortical structure. Progress in Brain Research, 102:343–355.

[Pustejovsky, 1995] Pustejovsky, J. (1995). The generative lexicon. M.I.T. Press, Cambridge, Massachusetts.

[Rainville et al., 2002] Rainville, C., Amieva, H., Lafont, S., Dartigues, J.-F., Orgogozo, J.-M., and Fabrigoule, C. (2002). Executive function deficits in patients with dementia of the Alzheimer’s type: A study with a Tower of London task. Archives of Clinical Neuropsychology, 17:513–530.

[Raizner et al., 2002] Raizner, R. D., Song, J., and Levin, H. S. (2002). Raising the ceiling: The Tower of London-Extended Version. Developmental Neuropsychology, 21:1–14.

[Raleigh, 1987] Raleigh, M. J. (1987). Differential behavioral effects of tryptophan and 5-hydroxytryptophan in vervet monkeys: influence of catecholaminergic systems. Psychopharmacology, 93:44–50.

[Rao and Ballard, 1997] Rao, R. P. N. and Ballard, D. H. (1997). Dynamic model of visual recognition predicts neural response properties in the visual cortex. Neural Computation, 9:721–763.

[Rapaport, 1942] Rapaport, D. (1942). Emotions and memory. International Universities Press, New York.

[Rauschecker et al., 1995] Rauschecker, J. P., Tian, B., and Hauser, M. (1995). Processing of Complex Sounds in the Macaque Nonprimary Auditory Cortex. Science, 268:111–114.

[Rauschecker et al., 1997] Rauschecker, J. P., Tian, B., Pons, T., and Mishkin, M. (1997). Serial and Parallel Processing in Rhesus Monkey Auditory Cortex. Journal of Comparative Neurology, 382:89–103.

[Raymond et al., 1992] Raymond, J. E., Shapiro, K. L., and Arnell, K. M. (1992). Temporary suppression of visual processing in an RSVP task: An attentional blink? Journal of Experimental Psychology: Human Perception and Performance, 18:849–860.

[Reason, 1990] Reason, J. (1990). Human Error. Cambridge University Press, Cambridge, England.

[Reiner, 1993] Reiner, A. (1993). Neurotransmitter organization and connections of turtle cortex: implications for the evolution of mammalian cortex. Comparative Biochemistry and Physiology, 104A:735–748.

[Rensink and Enns, 1998] Rensink, R. A. and Enns, J. T. (1998). Early Completion of Occluded Objects. Vision Research, 38:2489–2505.

[Rensink et al., 1997] Rensink, R. A., O’Regan, J. K., and Clark, J. J. (1997). To See or Not to See: The Need for Attention to Perceive Changes in Scenes. Psychological Science, 8:368–373.

[Rich, 1983] Rich, E. (1983). Artificial intelligence. McGraw-Hill, New York.

[Ridley, 1993] Ridley, M. (1993). Evolution. Basil Blackwell, Oxford.

[Riehle, 1991] Riehle, A. (1991). Visually induced signal-locked neuronal activity changes in precentral motor areas of the monkey: hierarchical progression of signal processing. Brain Research, 540:131–137.

[Riemsdijk and Williams, 1986] Riemsdijk, H. V. and Williams, E. (1986). Introduction to the Theory of Grammar. MIT Press, Cambridge, Massachusetts.

[Riesenhuber, 2000] Riesenhuber, M. (2000). How a Part of the Brain Might or Might Not Work: A New Hierarchical Model of Object Recognition. PhD thesis, BCS, MIT.

[Riesenhuber and Poggio, 1999] Riesenhuber, M. and Poggio, T. (1999). Hierarchical Models of Object Recognition in Cortex. Nature Neuroscience, 2:1019–1025.

[Riesenhuber and Poggio, 2000] Riesenhuber, M. and Poggio, T. (2000). Models of Object Recognition. Nature Neuroscience, 3 Supp:1199–1204.

[Riesenhuber and Poggio, 2002] Riesenhuber, M. and Poggio, T. (2002). Neural mechanisms of object recognition. Current Opinion in Neurobiology, 12:162–168.

[Rizzolatti et al., 1996] Rizzolatti, G., Fadiga, L., Gallese, V., and Fogassi, L. (1996). Premotor cortex and the recognition of motor actions. Brain Research. Cognitive Brain Research, 3:131–141.

[Rizzolatti et al., 1998] Rizzolatti, G., Luppino, G., and Matelli, M. (1998). The organization of the cortical motor system: new concepts. Electroencephalography and Clinical Neurophysiology, 106:283–296.

[Robertson et al., 1997] Robertson, L., Treisman, A., Friedman-Hill, S., and Grabowecky, M. (1997). The interaction of spatial and object pathways: Evidence from Balint’s syndrome. Journal of Cognitive Neuroscience, 9:254–276.

[Robinson, 1974] Robinson, A. (1974). Nonstandard Analysis. North Holland, Amsterdam.

[Robinson, 1965] Robinson, J. A. (1965). A Machine-Oriented Logic Based on the Resolution Principle. Journal of the Association for Computing Machinery, 12:23–41.

[Rodseth et al., 1991a] Rodseth, L., Smuts, B. B., Harrigan, A. M., and Wrangham, R. W. (1991a). On the human community as a primate society - reply. Current Anthropology, 32:429–433.

[Rodseth et al., 1991b] Rodseth, L., Wrangham, R. W., Harrigan, A. M., and Smuts, B. B. (1991b). The human community as a primate society. Current Anthropology, 32:221–254.

[Roennlund et al., 2001] Roennlund, M., Loevden, M., and Nilsson, L.-G. (2001). Adult age differences in Tower of Hanoi performance: Influence from demographic and cognitive variables. Aging, Neuropsychology, and Cognition, 8:269–283.

[Roland, 1985] Roland, P. E. (1985). Application of imaging of brain blood flow to behavioral neurophysiology: The cortical field activation hypothesis. pages 87–106. in [Sokoloff, 1985].

[Roland, 1993] Roland, P. E. (1993). Brain Activation. John Wiley, New York.

[Roland and Friberg, 1985] Roland, P. E. and Friberg, L. (1985). Localization of cortical areas activated by thinking. Journal of Neurophysiology, 53:1219–1243.

[Rolls, 1995] Rolls, E. T. (1995). Central Taste Anatomy and Neurophysiology. pages 549–573. in [Doty, 1995].

[Rolls, 1999] Rolls, E. T. (1999). The Brain and Emotion. Oxford University Press, Oxford.

[Rolls and Baylis, 1994] Rolls, E. T. and Baylis, L. L. (1994). Gustatory, Olfactory and Visual Convergence within the Primate Orbitofrontal Cortex. The Journal of Neuroscience, 14:5437–5452.

[Romans et al., 1997] Romans, S. M., Roeltgen, D. P., Kushman, H., and Ross, J. L. (1997). Executive Function in Girls with Turner’s Syndrome. Developmental Neuropsychology, 13:23–40.

[Romanski et al., 1999] Romanski, L. M., Bates, J. F., and Goldman-Rakic, P. S. (1999). Auditory Belt and Parabelt Projections to the Prefrontal Cortex in the Rhesus Monkey. Journal of Comparative Neurology, 403:141–157.

[Romer and Parsons, 1986] Romer, A. S. and Parsons, T. S. (1986). The vertebrate body. Saunders College Publishing, Philadelphia, 6th edition.

[Roseman, 1982] Roseman, I. (1982). Cognitive aspects of discrete emotions. PhD thesis, Yale University.

[Rosen and Snell, 1972] Rosen, R. and Snell, F. M. (1972). Progress in theoretical biology, Volume 2. Academic Press, New York.

[Rosenberg, 1977] Rosenberg, S. (1977). Sentence production: developments in research and theory. Lawrence Erlbaum Associates, Hillsdale, New Jersey.

[Rosner and Johnson, 1992] Rosner, M. and Johnson, R. (1992). Computational linguistics and formal semantics. Cambridge University Press, Cambridge, England.

[Rudge and Warrington, 1991] Rudge, P. and Warrington, E. K. (1991). Selective impairment of memory and visual perception in splenial tumours. Brain, 114:349–360.

[Ruimschotel, 1989] Ruimschotel, D. (1989). Explanation, causation and psychological theories: a methodological study illustrated by an analysis of Festinger’s theory of cognitive dissonance and Newell and Simon’s theory of human problem solving. Swets and Zeitlinger B.V., Amsterdam/Lisse.

[Ruiz, 1987] Ruiz, D. (1987). Learning and problem solving: what is learned while solving the Tower of Hanoi? PhD thesis, Stanford University.

[Ruiz and Newell, 1989] Ruiz, D. and Newell, A. (1989). Strategy change in the Tower of Hanoi: a SOAR model. In Proceedings of the 11th Annual Conference of the Cognitive Science Society, pages 521–529. Lawrence Erlbaum Associates, Hillsdale, New Jersey.

[Russell, 1900] Russell, B. (1900). A critical exposition of the philosophy of Leibniz. Cambridge University Press, Cambridge, England.

[Russell and Whitehead, 1910] Russell, B. and Whitehead, A. N. (1910). Principia Mathematica. Cambridge University Press, Cambridge, England.

[Rybak et al., 1998] Rybak, I. A., Gusakova, V. I., Golovan, A. V., Podladchikova, L. N., and Shevtsova, N. A. (1998). A model of attention-guided visual perception and recognition. Vision Research, 38:2387–2400.

[Ryle, 1949] Ryle, G. (1949). The concept of mind. Hutchinson’s University Library, London.

[Sakai and Miyashita, 1993] Sakai, K. and Miyashita, Y. (1993). Memory and imagery in the temporal lobe. Current Biology, 3:166–170.

[Sándor, 1955] Sándor, F. (1955). Final contributions to the problems and methods of psycho-analysis. Basic Books, New York. Edited by Michael Balint; translated by Eric Mosbacher and others; introduction by Clara Thompson. 1st edition.

[Sanides, 1970] Sanides, F. (1970). Functional architecture of motor and sensory cortices in primates in the light of a new concept of neocortex evolution. pages 137–208. in [Noback and Montagna, 1970].

[Sapolsky, 1990] Sapolsky, R. M. (1990). Stress in the wild. Scientific American, 262:116–123.

[Schade and Smith, 1970] Schade, J. P. and Smith, J. (1970). Computer of the brain and brainmade computers. Progress in Brain Research, 33:9–21.

[Schank, 1982a] Schank, R. (1982a). Dynamic memory: a theory of reminding in com- puters and people. Cambridge University Press.

[Schank, 1999] Schank, R. (1999). Dynamic memory revisited. Cambridge University Press.

[Schank and Abelson, 1977] Schank, R. and Abelson, R. (1977). Scripts Plans Goals and Understanding: An Inquiry into Human Knowledge Structures. Lawrence Erlbaum Associates, Hillsdale, New Jersey.

[Schank, 1975] Schank, R. C., editor (1975). Conceptual information processing. Elsevier.

[Schank, 1982b] Schank, R. C. (1982b). Reminding and memory organization: an introduction to MOPs. In Lehnert, W. and Ringle, M., editors, Strategies for natural language processing. Lawrence Erlbaum, New Jersey.

[Schank and Colby, 1973] Schank, R. C. and Colby, K. M. (1973). Computer models of thought and language. W. H. Freeman, San Francisco.

[Schmidtke et al., 1996] Schmidtke, K., Handschu, R., and Vollmer, H. (1996). Cognitive Procedural Learning in Amnesia. Brain and Cognition, 32:441–467.

[Schoettke, 2000] Schoettke, H. (2000). Working memory and context information with the Tower of Hanoi/Arbeitsgedaechtnis und Kontextinformationen mit dem Turm von Hanoi. Zeitschrift fuer Differentielle und Diagnostische Psychologie, 21:304–318.

[Schuepbach et al., 2002] Schuepbach, D., Merlo, M. C. G., Goenner, F., Staikov, I., Mattle, H. P., Dierks, T., and Brenner, H. D. (2002). Cerebral hemodynamic response induced by the Tower of Hanoi puzzle and the Wisconsin Card Sorting test. Neuropsychologia, 40:39–53.

[Schwartz, 1995] Schwartz, M. (1995). Re-examining the role of Executive Functions in Routine Action Production. pages 321–335. in [Grafman et al., 1995].

[Scott, 1975] Scott, J. P. (1975). Aggression. University of Chicago Press.

[Seamans et al., 1995] Seamans, J. K., Floresco, S. B., and Phillips, A. G. (1995). Functional differences between the prelimbic and anterior cingulate regions of the rat prefrontal cortex. Behavioral Neuroscience, 109:1063–1073.

[Searle, 1990] Searle, J. R. (1990). Is the brain’s mind a computer program? Scientific American, 262(1):26–31.

[Sejnowski, 1989] Sejnowski, T. J. (1989). The computer and the brain revisited. Annals of the History of Computing, 11:197–201.

[Sells, 1985] Sells, P. (1985). Lectures on contemporary syntactic theories. Center for the Study of Language and Information, Stanford University, Stanford, California. CSLI Lecture Notes No. 3.

[Seltzer and Pandya, 1989] Seltzer, B. and Pandya, D. N. (1989). Intrinsic connections and architectonics of the superior temporal sulcus in the rhesus monkey. Journal of Comparative Neurology, 290:451–471.

[Seltzer and Pandya, 1994] Seltzer, B. and Pandya, D. N. (1994). Parietal, Temporal, and Occipital Projections to Cortex of the Superior Temporal Sulcus in the Rhesus Monkey: A Retrograde Tracer Study. Journal of Comparative Neurology, 343:445–463.

[Sereno and Maunsell, 1995] Sereno, A. and Maunsell, J. H. R. (1995). Spatial and shape selective sensory and attentional effects in neurons in the macaque lateral intraparietal cortex (LIP). Investigative Ophthalmology and Visual Science, Supplement, 36:S692.

[Seyfarth, 1977] Seyfarth, R. M. (1977). A model of social grooming among adult female monkeys. Journal of Theoretical Biology, 65:671–698.

[Shallice, 1982] Shallice, T. (1982). Specific impairments in planning. Philosophical Transactions of the Royal Society, B298:199–209.

[Shallice, 1988] Shallice, T. (1988). From neuropsychology to mental structure. Cam- bridge University Press, Cambridge, England.

[Shallice and Burgess, 1991a] Shallice, T. and Burgess, P. W. (1991a). Deficits in strat- egy application following frontal lobe damage in man. Brain, 114:727–741.

[Shallice and Burgess, 1991b] Shallice, T. and Burgess, P. W. (1991b). Deficits in strategy application following frontal lobe damage in man. Brain, 114:727–741.

[Shallice and Burgess, 1991c] Shallice, T. and Burgess, P. W. (1991c). Higher-order Cognitive Impairments and Frontal Lobe Lesions in Man. In Levin, H. S., Eisenberg, H. M., and Benton, A. L., editors, Frontal Lobe Function and Dysfunction, pages 125–138. Oxford University Press, Oxford, England.

[Shallice et al., 1989] Shallice, T., Burgess, P. W., Schon, F., and Baxter, D. (1989). The Origins of Utilization Behaviour. Brain, 112:1587–1598.

[Shallice et al., 1994] Shallice, T., Fletcher, P., Frith, C. D., Grasby, P., Frackowiak, R. S., and Dolan, R. J. (1994). Brain regions associated with acquisition and retrieval of verbal episodic memory. Nature, 368:633–635.

[Shen and Alexander, 1997] Shen, L. and Alexander, G. E. (1997). Preferential representation of instructed target location versus limb trajectory in dorsal premotor areas. Journal of Neurophysiology, 77:1195–1212.

[Shepherd, 1990] Shepherd, G. M. (1990). The Synaptic Organization of the Brain, Third Edition. Oxford University Press, Oxford.

[Shepherd, 1994] Shepherd, G. M. (1994). Neurobiology, Third Edition. Oxford Univer- sity Press, Oxford.

[Shepherd, 1998] Shepherd, G. M. (1998). The Synaptic Organization of the Brain, Fourth Edition. Oxford University Press, Oxford.

[Shepherd and Koch, 1998] Shepherd, G. M. and Koch, C. (1998). Introduction to Synaptic Circuits. pages 1–36. in [Shepherd, 1998].

[Sheridan, 1992] Sheridan, T. B. (1992). Telerobotics, automation, and human supervisory control. M.I.T. Press, Cambridge, Massachusetts.

[Sherman and Guillery, 2001] Sherman, S. M. and Guillery, R. W. (2001). Exploring the thalamus. Academic Press, London.

[Siegler, 1978] Siegler, R. S. (1978). Children’s Thinking: What Develops? Lawrence Erlbaum Associates, Hillsdale, New Jersey.

[Siewiorek et al., 1982] Siewiorek, D. P., Bell, C. G., and Newell, A. (1982). Computer structures: principles and examples. McGraw-Hill, New York.

[Signoret et al., 1984] Signoret, J.-L., Castaigne, P., Lhermitte, F., Abelanet, R., and Lavoral, P. (1984). Rediscovery of Leborgne’s Brain: Anatomical Description with CT Scan. Brain and Language, 22:303–319.

[Sikora et al., 2002] Sikora, D. M., Haley, P., Edwards, J., and Butler, R. W. (2002). Tower of London test performance in children with poor arithmetic skills. Developmental Neuropsychology, 21:243–254.

[Simerly, 1995] Simerly, R. B. (1995). Hormonal regulation of limbic and hypothalamic pathways. pages 85–114. in [Micevych and Hammer, 1995].

[Simon, 1975] Simon, H. A. (1975). The functional equivalence of problem solving skills. Cognitive Psychology, 7:268–288.

[Simon and Hayes, 1976] Simon, H. A. and Hayes, J. R. (1976). The understanding process: problem isomorphs. Cognitive Psychology, 8:165–190.

[Simons, 1996] Simons, D. J. (1996). In sight, out of mind: When object representations fail. Psychological Science, 7:301–305.

[Simons and Levin, 1997] Simons, D. J. and Levin, D. T. (1997). Change blindness. Trends in Cognitive Sciences, 1:261–267.

[Sinha and Poggio, 1996] Sinha, P. and Poggio, T. (1996). I think I Know that Face... Nature, 384:404.

[Smedt and Kempen, 1991] Smedt, K. D. and Kempen, G. (1991). Segment grammar: a formalism for incremental sentence generation. pages 329–349. in [Paris et al., 1991].

[Smith and Jonides, 1999] Smith, E. E. and Jonides, J. (1999). Storage and Executive Processes in the Frontal Lobes. Science, 283:1657–1661.

[Smith et al., 1996] Smith, E. E., Jonides, J., and Koeppe, R. A. (1996). Dissociating Verbal and Spatial Working Memory using PET. Cerebral Cortex, 6:11–20.

[Smith and Tsimpli, 1995] Smith, N. and Tsimpli, I.-M. (1995). The mind of a savant: language learning and modularity. Basil Blackwell, Oxford.

[Sokoloff, 1985] Sokoloff, L. (1985). Brain Imaging and Brain Function. Raven Press, New York.

[Southwick et al., 1965] Southwick, C. H., Beg, M. A., and Siddiqi, M. R. (1965). Rhesus monkeys in North India. In DeVore, I., editor, Primate Behavior - Field Studies of Monkeys and Apes, pages 111–159. Holt, Rinehart and Winston, New York.

[Spencer, 1861] Spencer, H. (1861). Education: intellectual, moral, and physical. G. Manwaring, London, and D. Appleton, New York. Originally appeared as four review-articles: the first in the Westminster review for July, 1859; the second in the North British review for May, 1854; and the remaining two in the British quarterly review, for April, 1858, and for April, 1859.

[Spencer, 1863] Spencer, H. (1863). First principles. Williams and Norgate, London.

[Spiegel, 1983] Spiegel, E. A. (1983). Should the brain be regarded as a computer? Applied Neurophysiology. Proceedings of the American Society for Stereotactic and Functional Neurosurgery, Durham, North Carolina, 46:7–10.

[Spitz et al., 1984] Spitz, H. H., Minsky, S. K., and Bessellieu, C. L. (1984). Subgoal length versus full solution length in predicting Tower of Hanoi problem-solving performance. Bulletin of the Psychonomic Society, 22:301–304.

[Spitz et al., 1985] Spitz, H. H., Minsky, S. K., and Bessellieu, C. L. (1985). Influence of planning time and first-move strategy on Tower of Hanoi problem-solving performance of mentally retarded young adults and nonretarded children. American Journal of Mental Deficiency, 90:46–56.

[Stankiewicz et al., 1998] Stankiewicz, B. J., Hummel, J. E., and Cooper, E. E. (1998). The role of attention in priming for left-right reflections of object image: evidence for a dual representation of object shape. Journal of Experimental Psychology: Human Perception and Performance, 24:732–744.

[Stearns and Hoekstra, 2000] Stearns, S. C. and Hoekstra, R. F. (2000). Evolution: an introduction. Oxford University Press.

[Steier and Mitchell, 1996] Steier, D. and Mitchell, T. M. (1996). Mind Matters: A Tribute to Allen Newell. Lawrence Erlbaum Associates, Hillsdale, New Jersey.

[Steklis and Erwin, 1988] Steklis, H. D. and Erwin, J. (1988). Comparative Primate Biology. Alan R. Liss, Inc., New York.

[Sterling and Shapiro, 1994] Sterling, L. and Shapiro, E. (1994). The Art of Prolog: advanced programming techniques. M.I.T. Press, Cambridge, Massachusetts.

[Stoller, 1968] Stoller, R. J. (1968). Sex and gender; on the development of masculinity and femininity. Science House, New York.

[Stoller, 1974] Stoller, R. J. (1974). Sex and Gender II. Aronson, Northvale, New Jersey.

[Stoller, 1985] Stoller, R. J. (1985). Presentations of gender. Yale University Press, New Haven, Connecticut.

[Stolorow et al., 1987] Stolorow, R. D., Brandchaft, B., and Atwood, G. E. (1987). Psychoanalytic Treatment: an Intersubjective Approach. Analytic Press, Hillsdale, New Jersey.

[Struhsaker, 1967a] Struhsaker, T. T. (1967a). Auditory communication among vervet monkeys: Cercopithecus aethiops. pages 281–324. in [Altmann, 1967].

[Struhsaker, 1967b] Struhsaker, T. T. (1967b). Behavior of Vervet Monkeys: Cercopithecus Aethiops. University of California Publications in Zoology, 82:1–72.

[Struhsaker, 1967c] Struhsaker, T. T. (1967c). Social Structures among Vervet Monkeys: Cercopithecus Aethiops. Behaviour, 29:83–121.

[Sturt et al., 1999] Sturt, P., Pickering, M. J., and Crocker, M. W. (1999). Structural change and reanalysis difficulty in language comprehension. Journal of Memory and Language, 40:136–150.

[Stuss, 1992] Stuss, D. T. (1992). Biological and Psychological Development of Executive Functions. Brain and Cognition, 20:8–23.

[Stuss and Benson, 1986] Stuss, D. T. and Benson, D. F. (1986). The Frontal Lobes. Raven Press, New York.

[Sutherland and MacKintosh, 1971] Sutherland, N. S. and MacKintosh, N. J. (1971). Mechanisms of animal discrimination learning. Academic Press, New York and London.

[Swanson et al., 1987] Swanson, L. W., Kohler, C., and Bjorklund, A. (1987). The limbic region. I: The septohippocampal system. pages 125–. in [Bjorklund et al., 1987].

[Szentagothai, 1972] Szentagothai, J. (1972). The basic neuronal circuit of the neocortex. pages 1–45. in [Petsche and Brazier, 1972].

[Szentagothai, 1983] Szentagothai, J. (1983). The modular architectonic principle of neural centers. Reviews of Physiology, Biochemistry and Pharmacology, 98:11–61.

[Talairach and Tournoux, 1988] Talairach, J. and Tournoux, P. (1988). Co-planar Stereotaxic Atlas of the Human Brain. Thieme Medical Publishers, Inc., New York.

[Tanaka, 1996] Tanaka, K. (1996). Inferotemporal cortex and object vision. Annual Review of Neuroscience, 19:109–139.

[Tanji et al., 1996] Tanji, J., Shima, K., and Mushiake, H. (1996). Multiple cortical motor areas and temporal sequencing of movements. Brain Research. Cognitive Brain Research, 5:117–122.

[Tarski, 1936] Tarski, A. (1936). Der Wahrheitsbegriff in den formalisierten Sprachen. Studia Philosophica, 1:261–405.

[Tarski, 1956] Tarski, A. (1956). Logic, semantics, metamathematics; papers from 1923 to 1938. Oxford University Press, Oxford, England.

[Tenny and Pustejovsky, 2000] Tenny, C. and Pustejovsky, J. (2000). Events as grammatical objects: the converging perspectives on lexical semantics and syntax. Center for the Study of Language and Information, Stanford University, Stanford, California.

[Tho and Bennett G. Galef, 1988] Zentall, T. R. and Galef, B. G., Jr., editors (1988). Social learning: psychological and biological perspectives. Lawrence Erlbaum Associates, Hillsdale, New Jersey.

[Toates, 1986] Toates, F. M. (1986). Motivational systems. Cambridge University Press, Cambridge, England.

[Tolman, 1932] Tolman, E. C. (1932). Purposive behavior in animals and men. Appleton-Century-Crofts.

[Treisman, 1988] Treisman, A. (1988). Features and objects: The Fourteenth Bartlett Memorial Lecture. Quarterly Journal of Experimental Psychology: A series, 40(2):201–237.

[Treisman, 1993] Treisman, A. (1993). The perception of features and objects. In Baddeley, A. and Weiskrantz, L., editors, Attention: Selection, awareness and control. A tribute to Donald Broadbent, pages 5–35. Oxford University Press, Oxford, England.

[Treisman and DeSchepper, 1996] Treisman, A. and DeSchepper, B. (1996). Object tokens, attention, and visual memory. In Inui, T. and McClelland, J., editors, Attention and Performance XVI: Information Integration in Perception and Communication, pages 15–46. M.I.T. Press, Cambridge, Massachusetts.

[Treisman and Gelade, 1980] Treisman, A. and Gelade, G. (1980). A feature-integration theory of attention. Cognitive Psychology, 12:97–136.

[Treisman and Kanwisher, 1998] Treisman, A. and Kanwisher, N. (1998). Perceiving Visually-Presented Objects: Recognition, Awareness, and Modularity. Current Opinion in Neurobiology, 8:218–226.

[Treisman, 1963] Treisman, M. (1963). Temporal discrimination and the indifference interval: Implications for a model of the internal clock. Psychological Monographs, 77(13, Whole no. 576).

[Tulving, 1983] Tulving, E. (1983). Elements of episodic memory. Oxford University Press, Oxford.

[Tulving et al., 1994] Tulving, E., Kapur, S., Markowitsch, H. J., Craik, F. I. M., Habib, R., and Houle, S. (1994). Neuroanatomical correlates of retrieval in episodic memory: Auditory sentence recognition. Proceedings of the National Academy of Sciences of the USA, 91:2012–2015.

[Turing, 1936] Turing, A. (1936). On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 42:230–265.

[Turing, 1937] Turing, A. (1937). On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 43:544–546.

[Tzeng and Cotton, 1980] Tzeng, O. J. L. and Cotton, B. (1980). A study-phase retrieval model of temporal coding. Journal of Experimental Psychology: Human Learning and Memory, 6:705–716.

[Tzeng et al., 1979] Tzeng, O. J. L., Lee, A. T., and Wetzel, C. D. (1979). Temporal coding in verbal information processing. Journal of Experimental Psychology: Human Learning and Memory, 5:52–64.

[Ullman, 1991] Ullman, S. (1991). Sequence-seeking and Counter Streams: A Model for Information Processing in the Cortex. AI Memo 1311, Massachusetts Institute of Technology, Artificial Intelligence Laboratory.

[Ullman, 1996] Ullman, S. (1996). High-level vision: object recognition and visual cognition. M.I.T. Press, Cambridge, Massachusetts.

[Valenstein et al., 1987] Valenstein, E., Bowers, D., Verfaellie, M., Heilman, K. M., Day, A., and Watson, R. T. (1987). Retrosplenial amnesia. Brain, 110:1631–1646.

[van den Heuvel et al., 2003] van den Heuvel, O. A., Groenewegen, H. J., Barkhof, F., Lazeron, R. H. C., van Dyck, R., and Veltman, D. J. (2003). Frontostriatal system in planning complexity: a parametric functional magnetic resonance version of Tower of London task. NeuroImage, 18:367–374.

[van der Kolk et al., 1996] van der Kolk, B. A., McFarlane, A. C., and Weisaeth, L. (1996). Traumatic stress: the effects of overwhelming experience on mind, body, and society. The Guilford Press, New York.

[VanEmden, 1986] VanEmden, M. H. (1986). Quantitative deduction and its fixpoint theory. Journal of Logic Programming, 4:37–53.

[VanEmden and Kowalski, 1976] VanEmden, M. H. and Kowalski, R. A. (1976). The Semantics of Predicate Logic as a Programming Language. Journal of the Association for Computing Machinery, 23:733–742.

[VanEssen and Gallant, 1994] VanEssen, D. C. and Gallant, J. L. (1994). Neural mechanisms of form and motion processing in the primate visual system. Neuron, 13:1–10.

[VanHeijenoort, 1967] VanHeijenoort, J. (1967). From Frege to Goedel: A Sourcebook in Mathematical Logic. Harvard University Press, Cambridge, Massachusetts.

[VanLehn, 1988] VanLehn, K. (1988). Toward a Theory of Impasse-Driven Learning. pages 19–41. in [Mandl and Lesgold, 1988].

[VanLehn, 1989a] VanLehn, K. (1989a). Learning events in the acquisition of three skills. In Program of the Eleventh Annual Conference of the Cognitive Science Society, pages 434–441. Lawrence Erlbaum Associates, Hillsdale, New Jersey.

[VanLehn, 1989b] VanLehn, K. (1989b). Problem Solving and Cognitive Skill Acquisition. In Posner, M. I., editor, Foundations of Cognitive Science, pages 527–579. M.I.T. Press, Cambridge, Massachusetts.

[VanLehn, 1991] VanLehn, K. (1991). Rule Acquisition Events in the Discovery of Problem-Solving Strategies. Cognitive Science, 15:1–47.

[VanLehn et al., 1989] VanLehn, K., Ball, W., and Kowalski, B. (1989). Non-LIFO Execution of Cognitive Procedures. Cognitive Science, 13:415–465.

[Vecera and Farah, 1997] Vecera, S. P. and Farah, M. J. (1997). Is visual image segmentation a bottom-up or an interactive process? Perception and Psychophysics, 59:1280–1296.

[Velo and Wightman, 1973] Velo, G. and Wightman, A. S., editors (1973). Constructive quantum field theory. The 1973 Ettore Majorana international school of mathematical physics. Springer-Verlag, Berlin.

[Vendler, 1967] Vendler, Z. (1967). Linguistics in philosophy. Cornell University Press, Ithaca, New York.

[Vetter et al., 1995] Vetter, T. A., Hurlbert, A., and Poggio, T. (1995). View-based Models of 3D Object Recognition: Invariance to Imaging Transformations. Cerebral Cortex, 5:261–269.

[Vogt et al., 1992] Vogt, B. A., Finch, D. M., and Olson, C. R. (1992). Functional heterogeneity in cingulate cortex: the anterior executive and posterior evaluative regions. Cerebral Cortex, 2:435–443.

[von Neumann, 1932] von Neumann, J. (1932). Mathematische Grundlagen der Quantenmechanik. Springer-Verlag, Berlin. English translation, Mathematical foundations of quantum mechanics, Princeton University Press, 1955.

[von Neumann, 1958] von Neumann, J. (1958). The computer and the brain. Yale University Press, New Haven.

[Vosse and Kempen, 2000] Vosse, T. and Kempen, G. (2000). Syntactic structure assembly in human parsing: a computational model based on competitive inhibition and a lexicalist grammar. Cognition, 75:105–143.

[Wachsmuth et al., 1994] Wachsmuth, E., Oram, M. W., and Perrett, D. I. (1994). Recognition of objects and their component parts: responses of single units in the temporal cortex of the macaque. Cerebral Cortex, 5:509–522.

[Wallis and Rolls, 1997] Wallis, G. and Rolls, E. T. (1997). Invariant face and object recognition in the visual system. Progress in Neurobiology, 51:167–194.

[Wang et al., 2002] Wang, Y., Gupta, A., Toledo-Rodriguez, M., Wu, C. Z., and Markram, H. (2002). Anatomical, physiological, molecular and circuit properties of nest basket cells in the developing somatosensory cortex. Cerebral Cortex, 12:395–410.

[Wansart, 1990] Wansart, W. L. (1990). Learning to solve a problem: a microanalysis of the solution strategies of children with learning disabilities. Journal of Learning Disabilities, 23:164–184.

[Ward and Allport, 1997] Ward, G. and Allport, A. (1997). Planning and problem- solving using the five-disk Tower of London task. Quarterly Journal of Experimental Psychology: A series, 50:49–78.

[Warner and Glass, 1987] Warner, J. and Glass, A. L. (1987). Context and distance-to-disambiguation effects in ambiguity resolution: evidence from grammaticality judgements of garden path sentences. Journal of Memory and Language, 26:714–738.

[Warren, 1977] Warren, D. H. D. (1977). Implementing Prolog - Compiling Programs 1 and 2. DAI Research Reports 39 and 40, University of Edinburgh, Scotland.

[Warrington and Shallice, 1969] Warrington, E. K. and Shallice, T. (1969). The selective impairment of auditory verbal short-term memory. Brain, 92:885–896.

[Waterman, 1975] Waterman, D. (1975). Adaptive production systems. In Proceedings of the 1975 International Joint Conference on Artificial Intelligence. MIT AI Lab, Cambridge, Mass.

[Webster, 1992] Webster, D. B. (1992). An Overview of Mammalian Auditory Pathways with an Emphasis on Humans. pages 1–22. in [Webster et al., 1992].

[Webster et al., 1992] Webster, D. B., Fay, R. R., and Popper, A. N. (1992). The Evolutionary Biology of Hearing. Springer-Verlag, New York.

[Weiner, 1992] Weiner, H. (1992). Perturbing the organism: the biology of stressful experience. University of Chicago Press.

[Weiskrantz, 1987] Weiskrantz, L. (1987). Neuroanatomy of memory and amnesia: A case for multiple memory systems. Human Neurobiology, 6:93–105.

[Welsh, 1991] Welsh, M. C. (1991). Rule-guided behavior and self-monitoring on the Tower of Hanoi disk-transfer task. Cognitive Development, 6:59–76.

[Wernicke, 1874] Wernicke, C. (1874). Der aphasische Symptomencomplex. Eine psychologische Studie auf anatomische Basis. Cohn and Weigert, Breslau. English translation in [Eggert, 1977].

[Wernicke, 1881] Wernicke, C. (1881). Lehrbuch der Gehirnkrankheiten für Aerzte und Studirende. Fischer, Kassel und Berlin. In three volumes, 1881–3.

[Wernicke, 1886] Wernicke, C. (1886). Nervenheilkunde. Die neueren Arbeiten über Aphasie. Fortschritte der Medizin, 4:463–482. Translated into English and published as Neurology: Recent contributions on Aphasia, in Cognitive Neuropsychology, vol 6, pp. 547–569, 1989; it was a review of a paper by H. Grashey, Über Aphasie und ihre Beziehungen zur Wahrnehmung, Archiv für Psychiatrie und Nervenkrankheiten.

[Wernicke, 1894] Wernicke, C. (1894). Grundriss der Psychiatrie in klinischen Vorlesungen. Thieme, Leipzig. Also revised 2nd edition, 1906.

[Whalen et al., 1998] Whalen, P. J., Rauch, S. L., Etcoff, N., McInerney, S. C., Lee, M. B., and Jenike, M. A. (1998). Masked presentations of emotional facial expressions modulate amygdala activity without explicit knowledge. The Journal of Neuroscience, 18:411–418.

[Wierzbicka, 1980] Wierzbicka, A. (1980). Lingua mentalis: the semantics of natural language. Academic Press, London.

[Wierzbicka, 1992] Wierzbicka, A. (1992). Semantics, culture, and cognition: universal human concepts in culture-specific configurations. Oxford University Press, Oxford.

[Wilensky, 1978a] Wilensky, R. (1978a). Understanding goal-based stories. PhD thesis, Yale University.

[Wilensky, 1978b] Wilensky, R. (1978b). Why John married Mary; understanding stories involving recurring goals. Cognitive Science, 2.

[Wilks, 1973] Wilks, Y. (1973). An artificial intelligence approach to machine translation. pages 114–151. in [Schank and Colby, 1973].

[Winston, 1993] Winston, P. H. (1993). Artificial Intelligence. M.I.T. Press, Cambridge, Massachusetts.

[Winter et al., 2001] Winter, W. E., Broman, M., L, A. L. R., and Reber, A. S. (2001). The assessment of cognitive procedural learning in amnesia: Why the Tower of Hanoi has fallen down. Brain and Cognition, 45:79–96.

[Wise and Desimone, 1988] Wise, S. P. and Desimone, R. (1988). Behavioral Neurophysiology: Insights into Seeing and Grasping. Science, 242:736–741.

[Wolfe, 2000] Wolfe, J. M. (2000). Inattentional Amnesia. In Coltheart, V., editor, Fleeting Memories. M.I.T. Press, Cambridge, Massachusetts.

[Wolfe and Bennett, 1997] Wolfe, J. M. and Bennett, S. C. (1997). Preattentive object files: shapeless bundles of basic features. Vision Research, 37:25–44.

[Wood and Coolen, 1997] Wood, R. I. and Coolen, L. M. (1997). Integration of chemosensory and hormonal cues is essential for sexual behaviour in the male Syrian Hamster: role of the medial amygdaloid nucleus. Neuroscience, 78:1027–1035.

[Woolsey, 1981] Woolsey, C. N. (1981). Cortical sensory organization. Humana Press, Clifton, New Jersey.

[Wu, 1994] Wu, D. (1994). Aligning a parallel English-Chinese corpus statistically with lexical criteria. In ACL-94: 32nd Annual Meeting of the Association for Computational Linguistics, pages 80–87, Las Cruces, New Mexico.

[Wundt, 1863] Wundt, W. M. (1863). Vorlesungen über die Menschen und Tier-Seele. L. Voss, Leipzig. English translation, Lectures on Human and Animal Psychology, 1896.

[Wundt, 1874] Wundt, W. M. (1874). Grundzüge der physiologischen Psychologie. W. Engelman, Leipzig. English translation, Principles of Physiological Psychology, 1904.

[Yan and Suga, 1998] Yan, W. and Suga, N. (1998). Corticofugal modulation of the midbrain frequency map in the bat auditory system. Nature Neuroscience, 1:54–58.

[Yaoda Xu and Suzanne Corkin, 2001] Xu, Y. and Corkin, S. (2001). H. M. Revisits the Tower of Hanoi Puzzle. Neuropsychology, 15:69–79.

[Yntema and Trask, 1963] Yntema, D. B. and Trask, F. P. (1963). Recall as a search process. Journal of Verbal Learning and Verbal Behavior, 2:65–74.

[Young, 1992] Young, M. P. (1992). Objective analysis of the topological organization of the primate cortical visual system. Nature, 358:152–155.

[Young, 1993] Young, M. P. (1993). The organization of neural systems in the primate cerebral cortex. Proceedings of the Royal Society of London, B252:13–18.

[Yukie, 1995] Yukie, M. (1995). Neural connections of the auditory association cortex with the posterior cingulate cortex in the monkey. Neuroscience Research, 22:179–187.

[Zaniolo et al., 1997] Zaniolo, C., Ceri, S., Faloutsos, C., Snodgrass, R., Subrahmanian, V. S., and Zicari, R. (1997). Advanced database systems. Morgan Kaufmann, San Mateo, California.

[Zatorre et al., 1996] Zatorre, R. J., Meyer, E., Gjedde, A., and Evans, A. C. (1996). PET Studies of Phonetic Processing of Speech: Review, Replication, and Reanalysis. Cerebral Cortex, 6:21–30.

[Zenon Pylshyn, 1985] Pylyshyn, Z. W. (1985). Computation and Cognition. M.I.T. Press, Cambridge, Massachusetts.

[Zilles, 1990] Zilles, K. (1990). Cortex. pages 757–802. in [Paxinos, 1990].

[Zumpe and Michael, 1996] Zumpe, D. and Michael, R. P. (1996). Social Factors Modulate the Effects of Hormones on the Sexual and Aggressive Behavior of Macaques. American Journal of Primatology, 38:233–261.


Lewis, Richard, 532 long term memory for complex objects, 218 lexical frame loop activation value, 539 association, 486 unified, 545 limbic, 486 lexical frames, 536 oculomotor, 486 lexical-functional grammar, 531 sensory-motor, 486 lexicalist approaches, 531 LS, 183 lexicon, 536 LTP, 382 LGN, 500 Luchins, Abraham, 279 Lhermitte, F., 483 Lull, Ramon, 76 Libet, Benjamin, 608 Luria, Alexander Romanovich, 67 Lichtheim, Ludwig, 64 Lutz, Catherine, 587 life sciences Dilthey’s concept, 609 macaque monkeys, 156, 194 limb and body movements of social signifi- MacKintosh, Nicholas, 155 cance, visual recognition of, 196 Maclean, Paul, 144 limb position, visual recognition of, 196 magnocellular stream, 500 limbic loop, 486 manager, no central, 231 lingual gyrus, 191 mandatory control, 260 linguistic disorders, 532 manic-depression, 69 linguistic performance Marie, Pierre, 67 human, 532 Marks, Phil, 20 linguistics, 9 Marr, David, 57 LIP, 193, 194, 515 Martin, Kevan, 45 Lisp language, 97 Mason, William, 558 lived experience matching, 234 Dilthey’s concept, 609 matching data items, 234 loading a brain model, 329 matching, in BAD rules, 314 localization of sound, 185 maternal behavior, 559 Locke, John, 63 brain mechanisms, 577 logic grammars, 532 in the rat, 565 logic programming, 96, 339 maternal behaviors, 559 Logic theorist program, 87 mating and sexual behaviors, 178 logical Maturana, Humberto, 610 deduction, 338 Maunsell, John, 193 formulae with probabilities, 342 McClelland, James, 376 formulae with weights, 342 McFarland, David, 589 literals, 338 McGaugh, James, 557 modeling, 232 McGuire, Michael, 33 statements, 339 McIntosh, Randy, 6 theory, 338 McNaughton, Bruce, 571 logical system, 337 MDP, 193 logicism, 78 Meadows, J. C., 507 Logothetis, Nikos, 190 meaning of a description, 622 long term memory, 245, 375 means-ends action, 150 726 Index mechanism, confirmation, 240 model medial premotor cortex, 204 of attachment, 593 medial temporal lobe, 205 concept of data, 622 medulla, 42 core dynamic, 623 Mega, Michael, 266, 279 inferencing in, 622 Mel, Barlett, 508 minimal, 340 memory my grammatical model for the brain, artificial intelligence models of, 423 544 associative, 357 of agonistic motivation, 591 autobiographical, 380, 382 of fear of snakes, 591 consolidation, 387 of the amygdala, 601 consolidation of long-term, 380 of the hypothalamus, 596 context, 422 SEEMORE, 508 dissociation, 585 sexual motivation, 593 episodic, 373–421, 466 subcortical systems, 586 episodic, definition of, 381 summary of, 621–640 event, 373 model of organizations, 345 for events, 382 model parameters, in BAD, 325 for shape features, 193 model, computational semantics of my, 261 for time, 382 model, distributed representation of control long term, 375 in our, 262 neuroanatomy of, 373 model, explicit representation of control in of cortical regions, 233 our, 262 procedural, 375 model, properties of, 262 psychology of, 373 models working, 375, 387 cortex, 509 menopause, 39 eye movement, 508, 515 mental events, 154 visual system, 508 mental imagery, 6, 518 modularity, 260 mental states, 16 Fodor’s ideas, 260 Mesulam, Marsel, 6 in language processing, 533 Meynert, Theodor, 63, 64 modularity hypothesis, 640 MI, 204, 217, 223 modulators, 492 Michael, Richard, 557, 562 module microtheory, 345 brain, structure of, 357 mid-dorsal lateral frontal cortex , 205 concept of, 613 midbrain frequency map, 186 module, episodic memory, 271 midcycle sex, 566 module, plan, 422 minimal model, 340 modules, 337 minisociety, 245 concurrent, in 
language processing, 546 Minsky, Marvin, 74, 91, 94, 609 Mogenson, Gordon, 150 mirror neurons, 152 monitoring, 205 Mishkin, Mortimer, 182 monitoring of progress, 148 missionary and cannibals problem, 272 monkeys Miyashita, Yasushi, 190 brain, 30 Index 727

cynomolgus macaque, 193 negotiation, 20, 345 Japanese macaque, 32 Neisser, Ulrich, 144 macaque, 156, 194 neocortex, 43, 58, 172, 199 owl, 32 neocortical areas, 53 rhesus macaque, 32, 193 neocortical regions, definitions of, 174 species, 32 neocortical regions, list of, 174 squirrel, 32 nesting of clauses, 542 vervet, 32 neural area MOP, memory organization packet, 425 definition of, 156 Morris, Robin, 294 neural areas, 172, 199 Moscovitch, Morrris, 380 neural learning, 640 mother-infant interaction, 35 neural level, 4 mothering styles, 563 neural localization, 260 motion perception, 193 neural network model, 295 motion specification, 240 neural network models motion, perception of, 507 Tower of Hanoi, 295 motivation, 261, 557, 558 neural representation, 342 control theory, 589 neuroanatomy, 9 levels of control, 583 amygdala, 570 logical models, 596 hypothalamus, 569 unified picture, 582 of subcortically motivated behavior, motivation of behaviors, in anterior cingu- 568–573 late, 207 visual system, 500 motor goals, 203 Neurocognitron, 508 motor module, 246 neurology, 9, 62 motor sequences, 204 neuron Mott, David, 20 level of description, 357 Mountcastle, Vernon, 182, 193 neurons, 128 movement sequences, 207 neurophysiology, 9 MPO, 581 neuroscience, 130 MST, 193 Newell’s physical symbol hypothesis, 640 MT, 193, 507 Newell, Allen, 19, 87, 105 MTT, multiple trace theory, 380 Newman, James, 186 multiagent systems, 20 no negative evidence, 173 multiperson cooperation issues, 618 noise level, in BAD, 322 multiple trace theory (MTT), 380 non-visuotopic organization, 190 Mumford, David, 57, 509 non-willed action, 202 mustached bat, 186 nonegocentric maps, 489 mutual regulation, 564, 587 nonegocentric representation, 506, 528 my brain model, dynamic version of, 426 nonhuman primate neonate, 30 nonspatial working memory, 205 Nadel, Lynn, 155, 376, 380, 585 nontopographic somatic representation, 182 natural language, 632 Norman, Donald, 280 natural language processing, 18, 530–556 notexists tests, in BAD, 318 neglect, 512 Noton, David, 515 728 Index novel action, 480 Owen, Adrian, 295 novel environment, 480 owl monkeys, 32 novelty, 206, 567 oxytocin, 570 numerical constraints, 342 PA1, 204, 218, 224 O’Keefe, John, 155, 376 PA2, 204, 218, 224 OAa, 193 PA3, 205, 224 object identification, 511 PA4, 205, 224 object identity, visual perception, 190 PA5, 205, 206, 217 object token, 509 pallium, 43 object type, 509 Pandya, Deepak, 68, 144, 164, 182, 185, 186, object-centered frames, 196 190, 193, 194 object-centered representations, 196 parabelt, 183 object-file, 6, 343, 506 parabelt area, 186 object-file representation, 520 parahippocampal area, 387, 519 occipital lobe, 190 parahippocampal place area, 506 oculomotor loop, 486 parallelism odor-guided behaviors, 178 in natural language processing, 546 OI, 178 parallelism, of modules operation, 242 OL1, 178 parameters old world species of monkey, 32 in Vosse-Kempen model, 540 olfactory anterior nucleus, 178 language processing, 553 areas, 176 paranoia, 69 cortex, 43, 178 parcellation, 156 hierarchy, 177 Pardo, Jos´e, 207 nuclei, 178 Parent, Andr´e, 202, 376 system, 178 parietal cortex tubercle, 178 posterior, 489 ontogenetic development, 31, 586 parietal eye field (PEF), 515 operculum, 180 parietal lobe optimization principles, 118 areas, 182 Oram, Michael, 196 inferior posterior, 193 orang-utans, 32 posterior, 193 orbital frontal cortex, 39, 153, 178, 187, 197 Parkinson patients, 294 areas, 392 Parks, Randolph, 295 organization partial ordering, 146 Carl 
Hewitt’s model of, 345 parvocellular stream, 500 orientation phase of grooming, 249 Passingham, Richard, 202, 204, 207 orientation, visual recognition, 190 paternal behavior, 559 Ortony, Andrew, 558 patients output competition, 319 amnesic, 289 ovaries, 580 frontal, 289, 295 overall plans module, 245 Parkinson, 294 overall spatial awareness, 218 schizophrenic, 289, 294 Index 729 patterns of walking, visual recognition of, Petersen, Steven, 7 196 Petrides, Michael, 145, 152, 202, 205 Paus, Tomas, 207 PG, 182 Paxinos, George, 202, 376 phases of grooming, 249 PEF, 515 phenomena, 10 penumbra, 127 phenomenologists, 10 perception phonetic processing, complex, 187 color, 507 phylogenetic development, 31, 144 faces, 506 physical symbol hypothesis, 348, 640 motion, 507 PIP, 193 shape from shading, 507 piriform cortex, 178 perception and action hierarchies, 209, 223 pituitary, 570 perception hierarchy, 168, 235 pituitary gland, 59 perception of conspecific vocalizations, 186 place area, 506 perception of pure tones, 185 place cells, 376 perception, conditional, 238 plan, 145 perception-action architecture, 199, 236 plan as data structure, 342 perception-action hierarchy, 144, 162, 174, plan elaboration, 236, 240 222, 234, 489, 623 plan module, 299, 422 perception-action model, 229 plan primates module, 246 perceptual goal recursion disk strategy, 287 plan selection, 240 perceptual goal recursion pyramid strategy, plan, in conceptual net theory, 425 287 Planner language, 92 perceptual rule, 235 planning and action areas, 200 perceptual strategy, 286, 296 planning and action hierarchy, 168, 200–219 perceptual strategy with goal stacking, 286 planning and action hierarchy, determining perceptual strategy, Tower of Hanoi, 466 features, 208 perceptual subgoal, 482 plans, 151, 218 perceptual test for move legality, 299 plans are hierarchical in nature, 151 perceptual tests, 299 plans as sets of schema, 151 Pereira, Fernando, 532 planum temporale, 183 performance Plutchik, Robert, 558 human PM1, 196, 197, 224 Tower of Hanoi problem, 288 PM2, 196, 197, 224 peripheral image, 524 PM3, 193, 196, 197, 206, 224 peripheral nervous system, 559 PO, 193 peripheral representation, 524 Poggio, Tomaso, 57, 508 peripheral vision, 500 pointers, 133 perirhinal area, 387 Pollack, Debbie, 21 Perona, Pietro, 21 polymodal areas, 173 Perrett, David, 194, 196, 229 polymodal association areas, 206 perseverance, 279, 280 polymodal STS areas, 194 persistent state of the module, 339 Pons, Timothy, 182 personality, 128 pop-out features, 507 PET, 187, 206, 207, 294 position, tracking changes in, 241 730 Index

Posner, Michael, 207 hobbits and orcs, 272 posterior cingulate, 173, 207 missionary and cannibals, 272 Powell, Thomas, 68, 144, 159, 164, 165 spatial, 272 preattention, 512 spatial rearrangement, 272 precuneus, 206, 506 transportation, 272 predators, 39 water jug, 272, 279 predicates, 342 procedural memory, 375, 480–496 predicted sequences of brain activation process, 97 states, 264 processing cycle, 240 predictions, 229, 244 processing hierarchy, Perrett’s for recogniz- predictions made by our model, 264 ing social stimuli, 196 predictions, falsifiable, 271 processing of nouns, 187 preference semantics, 426 processing of verbs, 187 prefrontal cortex, 200–219, 375, 480, 489 processing, distributed, 231 dorsal, 229 production system, adaptive, 468 prefrontal cortex, left, 294 progesterone receptors, 580 premotor areas, 152 program, 135 premotor cortex, 489 concept of, 613 medial, 204 programming issues, 617 preparing to move, 204 programming language Preuss, Todd, 156 for brain description, 616 Price, Joseph, 176, 571 proisocortex, 202 primate projection neurons, 64 actions and relations module, 246 Prolog language, 96, 311, 340 behavior, 28 advantages, 315 brain, 41 prosimians, 31, 36, 156 groups, 196 protocol, 466 knowledge, 35 protocol, Anzai and Simon, 472 positions and movements module, 246 provided expression, 234 troupes, 567 proximity, pleasure, 559 entering, 567 Prozac, 39 primatology, 9 psycholinguistic models, 532 priming, 376, 386 psycholinguistics, 531 priming, visual, 511 psychology, 9, 68, 128 principal sulcus, 185, 205, 375 memory, 373–421 Principles and Parameters theory, 531 of sex, 566 problem solving, 202, 272–307 subcortically motivated behaviors, 561 activity, 203 vision, 500 and the frontal lobes, 276 pyramidal cells, 53 behavior, 271 quiescence, 341 by children, 288 organization of, 276 R, 183 vision, 516 Raleigh, Michael, 21 problem space, 471 RAM, 133 problems Rapaport, David, 558 Index 731 rats PA4, 205, 224 attachment behavior, 564 PA5, 205, 206, 217 maternal behavior, 565 PM1, 196, 197, 224 pup behaviors, 565 PM2, 196, 197, 224 Rauschecker, Josef, 186 PM3, 193, 196, 197, 206, 224 raw experience, 611 SI, 182, 217, 224 Rayleigh, Lord, 11 SS1, 182, 218, 224 re-representation of cochlea, 185 SS2, 182, 218, 224 reaching, 182 SS3, 182, 197, 224 real-time coordination of action, 245 VI, 217, 224 real-time, cortex works in, 231 VV1, 191, 224 real-time, language processing, 231 VV2, 191, 224 realtime response, 341 VV3, 191, 197, 224 Reason, John, 280 regions, 147, 223 recall regions, list the neural areas comprising cued, 375 each region, 223 free, 375 regions, system properties, 209 recency, 191 regulatory factors, hormonal, 570 receptors reminding, 425 hormones, 580 Repetition blindness, 511 recognition memory, 375 repetition blindness, 506 recognition of complex visual forms, 191 repetition of the same movement, 204 recognizing formants, 187 repetitive movements, 204 redescription, 15 representation, 97 redirection, 35 conceptual nets, 423 redundancy in the model, 266 egocentric, 506 referencing, 139 nonegocentric, 506 region of events, 153 AI, 185, 217, 224 of goals, 207 AU1, 186, 224 of the amygdala, 601 AU2, 224 of the hypothalamus, 596 AU3, 186, 187, 197, 206, 224 retinotopic, 506 DV1, 193, 224 visual types, 509 DV2, 217, 224 representations DV3, 193, 197, 217, 224 evolution of, 17 G, 207, 217 reptile brain, 42 GI, 180 reptile cortex, 44 GU1, 180 research issues, 617 GU2, 180 resolution method, 96 MI, 204, 217, 223 results OI, 178 Vosse and Kempen’s model, 542 
OL1, 178 retinotopic mapping, 500 PA1, 204, 218, 224 retinotopic representation, 506 PA2, 204, 218, 224 retrieval of verbal episodic memories, 206 PA3, 205, 224 retrosplenial, 153 732 Index

cingulate cortex, 206 selected, 240 gyrus, 506 weights, 234 reward, 387 rules operating in each module, 248 rhesus, 36 Russell, Bertrand, 79 rhesus macaque monkeys, 32, 193 Ryan, Lee, 380 rhesus monkeys, 36 Rybak, Ilya, 508, 515 attachment behavior, 564 sex, 566 saccades, 193, 483 Rich, Elaine, 423 saccadic fixations, 528 Riesenhuber, Max, 508 Saint-Dizier, Patrick, 532 Rizzolatti, Gallese, 152 saliency, 206 RM, 183 Samsonovich, Alexei, 380 RNA Sanides, Friedrich, 144 and long term memory, 382 Sapolsky, Robert, 37 Robinson, Abraham, 86 satiety, 180 Robinson, Alan, 96 Saunders, Richard, 571, 573 Roland, Per, 68, 164, 203, 206, 276 scale invariance, in vision, 190 Rolls, Edmund, 150, 178, 180, 231 scanpath, 515 Romanski, Lizabeth, 183, 187 scenario Romer, Alfred Sherwood, 42, 44 grooming, 249 rote strategy, 287 social conflict, 253 route-finding problems, 203 social spacing, 255 routine action, 479 scene, 425 control, 479 scene instances, 425 interleaving, 479 Schank, Roger, 423, 532 routine and creative action, 609 schemas, 203 routine eye movements, 515 schizophrenia, 69 routinization, 480–496 schizophrenic patients, 289, 294 RPB, 183 Schroedinger, Erwin, 13 RT, 183 Schwartz, Myrna, 203, 280 RTL, 183 Science, 11 RTM, 183 science, 10 Rudge, P., 187 scientific culture, 124 Ruiz, Dirk, 295, 468 scientific theory, 3 rule, 233 scientific theory, our model constitutes one, action elaboration, 235 261 body part of, 234 Scott, Dana, 96 computation, limits of, 234 SCP, 382 dominant, 240 script, 425 example of, 234 scriptlet, 425 execution, in BAD, 318 secure attachment type, 563 in BAD, form of, 314 secure base, 563 instance, 234 SEEMORE model, 508 instances, selection, 240 Sejnowski, Terence, 132 perceptual, 235 selected rule, 240 Index 733 selection of manipulative movements, 204 novelty, 567 selective search strategy, 285, 297 other motivational systems, 568 selective search strategy, Tower of Hanoi, psychology of, 566 466 sexual appetitive behaviors, 589 selectors, 342 sexual behavior, 559 self-generated responses, 205 sexual competition, 39, 562 self-ordered tasks, 205 Seyfarth, Robert, 33 Seltzer, Bernard, 190, 193 Shallice, Tim, 6, 67, 203, 206, 274, 279, 280, semantic decision tasks, 187 290, 483 semantics, computational, 261 shape from shading, 507 semantics, computational of my model, 261 shape, visual recognition, 190 sensing, in BAD, 326 Shaw, Clifford, 87 sensors, 236 Sheinberg, David, 190 sensors, in BAD, 326 Shepherd, Gordon, 44, 122, 178 sensory buffers, 375 Sherman, Murray, 492 sensory-motor loop, 486 Shiffrin, Richard, 7 sentence comprehension scores, 555 SI, 182, 217, 224 sentence recognition Siewiorek, Daniel, 105 example, 541 Signoret, Jean-Louis, 67 sentence types, 555 simians, 31 sentences Simon, Herbert, 19, 87, 285, 286, 296, 465, garden path, 542 472 separation Simon, Tower of Hanoi formulation, 297 distress, 559 simple odor memories, 178 physiological mechanisms, 578 simulation, 140 sequence, 151 single neuron, 146 sequencing by brain stem nuclei, 152 situated action, 229 sequencing of action, 151 six layer cortex, 45, 472 sequencing, conditional in dorsal frontal ar- SMA, 203 eas, 152 Smith, Edward, 374 sequencing, in medial frontal areas, 152 smooth pursuit, 193 Sereno, Martin, 193 snakes seriality, 134, 138 model of fear of, 591 seriality at the highest level, 616 SOAR, 288, 295, 468, 471 serotonin, 39 social action, 196 set theory, 84 social action features, 218 sex, 35, 557, 566 social affiliation, 245 in rhesus monkeys, 566 social behavior, 28, 245 affiliative type, 
566 social conflict, 37, 249, 253 brain mechanisms, 580, 594 social conflict behavior, 249 development, 588 social event, 154 familiarity, 567 social features, 154 logical model, 601 social goals, 218 midcycle type, 566 social groupings, 30 model of, 593 social hierarchies, 36 734 Index social interaction, 154 spatial localization, 168 social interaction, model of, 229 spatial location, encoding of in posterior social map, 255 parietal cortex, 193 social messages, 218 spatial map, 255 social migration, 36 spatial maps, 194, 229 social objects, 218 spatial navigation module, 229 social psychology, 128 spatial perception, 193, 194 social relations, 229 spatial problems, 272 social spacing, 249, 255 spatial rearrangement problems, 272 scenario, 255 spatial working memory, 205 social spacing behavior, 249 species of monkeys, 32 social spacing behaviors, 255 specific joint plans module, 246 social status SPECT, 294 competition, 562 speed, visual recognition of, 193 social stimuli, 196 Spencer, Herbert, 11, 67 social stimuli, visual perception of, 194 spinal cord, 59 social support, 40 squirrel monkeys, 32, 186 socially significant motions, perception in- SS1, 182, 218, 224 dependent of visual form, 196 SS2, 182, 218, 224 socially significant objects, visual recogni- SS3, 182, 197, 224 tion of, 191 St. Kitts, 33 sociology, 9 stability, of distributed activation, 240 software, 135 stacking strategy, Tower of Hanoi, 469 solitary behavior, 36 stacking, not biologically plausible, 301 somatic angularity, 182 standard arithmetic functions, 341 somatic body regions, parietal, 182 standards committees, 345 somatic re-representations, 182 Stark, Lawrence, 515 somatic texture, 182 states, in conceptual net theory, 425 somatomotor operations, 193 stereotypes, 155 somatosensory STG, 187 areas, 180 stimulus-driven cognition, 202 association cortex, 182 Stoller, Robert, 20, 588 coordinate system, 182 storage of complex visual forms, 191 guidance, 182 storage process, in BAD, 320 guiding neurons, 193 strategies for computation, 340 hierarchy, 197 strategies, distributed, 271 regions, 229 strategies, for solving Tower of Hanoi prob- Sony Computer Science Laboratory, 20 lems, 285 sound localization, 186 strategies, for Tower of Hanoi problems, list soundness, 339 of, 285 source areas for basal ganglia, 486 strengths during sentence recognition, 550 space perception, 186 stress, 562, 585 spatial context, 506 dissociation, 585 spatial descriptions, 218 structural description, 509 spatial features, 218 structure of the cortex, 53 Index 735

Struhsaker, Thomas, 33 system STS, 183 model, 6 polymodal areas, 194 system level, 128 subcortical connections system model, 222 hippocampus, 520 system model, correspondence to cortex, subcortical connectivity, 147 227 subcortical motor control systems, 559 system model, diagram of our, 227 subcortical sensory processing, 559 system, action of, 272 subcortical systems, 557–586, 634 system-level brain model, 230 action, 573 control theory, 589 tactile feedback, 218 interface to memory systems, 590 tactile guidance, 218 interface to the perception-action hier- tactile sensing, 229 archy, 590 tactile stimulation of social significance, vi- levels of control, 583 sual recognition of, 196 modeling, 586 tactual form and texture, 182 psychology of motivated behaviors, 561 Tanaka, Keiji, 190, 191 unified picture, 582 target positions, 150 subcortically motivated behaviors Tarski, Alfred, 85 neuroanatomy, 568–573 task execution, 207 subgoal taste, 180 perceptual, 482 taste aversions, 180 subject’s own gaze, perception of, 194 TE, 190 subject’s own movements, visual recognition TE0, 519 of, 196 temperament, 563 submission, 559 temperature, body, 558 subsumption architecture, 21 temporal lobe, 183 Suga, Nobuo, 185 temporal pole, 153, 197 sulcus temporal sequences of events, 387 principal, 375 temporal sequencing, 148 summary of the model, 621–640 temporal synchronization, 608 superior temporal cortex termination of action, 253 auditory role of, 186 territorial range, 36 superior temporal sulcus, 183 testes, 580 superior temporal sulcus (STS), 229 testosterone, 580 survival, 557 temporal variation in levels, 562 basic systems, 560 texture, visual recognition, 190 survival systems, basic, 558 thalamus, 43, 54, 58, 164, 492–496, 500 Sutherland, Stuart, 155 as controller, 494 syllogisms, 76 evolution of, 493 symbols, 348 proposed theory, 495 role of in model, 621 theoretical computer science, 18, 73 synaptic facilitation, 640 theoretical issues, 618 synchronization, temporal, 608 theory syntax, 531 definition of, 340 736 Index

of episodic memory, 383 typed theories, 84 thirst, 558, 589 three layer cortex, 44, 472 u-link, 539 time u-space, 539 characteristic, 608 Ullman, Shimon, 58 memory for, 382 underlying control systems, 559 time granularity, 240 unification, 532, 538 Toates, Frederick, 589 unified lexical frame, 545 unified picture of motivation, 582 tokens, 359 uniform process, 55, 231, 234 Tolman, Edward, 150, 155 uniform rate, 234 tonotopic representation, 185 uniform rate of cortical processing, 231 TOP, thematic organization packet, 425 uniformity of complexity, 148 top-down process, 340 uniformity of cortical circuitry, 148 topographic somatic representation, 182 unimodal association areas, 206 tower noticing, 288, 295 univeral Turing machine, 83 mechanism, 468 universe of discourse, 127 Tower of Hanoi problem, 271, 272, 465, 466 University of London, 20 definition, 274 update competition, 319 diagram, 272 updating store, every cycle, 240 eye movement, 517, 523 updating stores, in BAD, 321 human performance, 288 upstream, 243 learning, 295 utilisation behavior, 279, 483 neural network model, 295 protocol, Anzai and Simon, 472 V1, 147, 506 search space, 274 V2, 147, 506 Simon’s formulation, 297 V4, 147, 506 strategies for solving, 285 V8, 507 use of episodic memory, 301 Valenstein, Edward, 187 visual system, 522 Van Emden, Maarten, 96, 340 Tower of London problem, 203, 274, 290 Van Essen, David, 146, 156, 190, 193, 212, TPro, 197, 202 500, 508 tracing techniques, 166 Van Lehn, Kurt, 296, 301 tracking, changes in position, 241 Varela, Francisco, 610 transcription factors, 382 various kinds of spatial map, 194 translation invariance, in vision, 190 ventral prefrontal areas, 422 transportation problems, 272 ventral prefrontal cortex, 207, 480 transsaccadic vision, 515 ventral visual areas, 188 Treisman, Ann, 6, 500, 506 ventral visual hierarchy, 190 trial files, in BAD, 331 ventricle troupes, 36 third, 484 Tulving, Endel, 207, 381 ventrolateral frontal cortex, 206 Turing machine verbal sequences universal, 83 generation of, 203 Turing, Alan, 83 vervet monkeys, 32, 33, 36 Turner’s syndrome, 289 Vetter, Thomas, 508 Index 737

VI, 217, 224 timing, 510 viable behavioral states, 240 transsaccadic, 515 viable state, 608 visual-analog buffer, 528 viewer-centered detection, 191 visual-focus buffer, 528 viewer-centered frames, 196 visualization, 311 viewer-centered representations, 196 visually guided movements, visual percep- vision, 499–529, 631 tion, 190 visual areas visually-guided reaching neurons, 193 dorsal, 193 visuomotor operations, 193 visual features visuospatial functions, 207 higher order, 147 VM, 581 visual imagery, 190 vocabulary, 127 visual images, 343 vocal calls, 35 visual long term memory of complex ob- vocal repertoire, 186 jects, 190 vocal tract visual perception, 18 calibrating heard speaker’s, 187 visual perceptual hierarchy, 147 vocalization, 35 visual recognition of bodies, 191 vocalization types, 33 visual recognition of body parts, 191 voluntary visual recognition of facial identity, 191 grasping, 182 visual recognition of hands, 191 joint rotation, 182 visual short term memory of complex ob- reaching, 182 jects, 190 voluntary action, 204 visual system, 499–529 von Economo, Constantin, 156 attention, 529 von Neumann machines, 133 binding of features, 512 von Neumann, John, 132 coordinate frames, 527 Vosse and Kempen’s results, 542 dorsal route, 513 Vosse, Theo, 532, 536 episodic memory, 519 VV1, 191, 224 example of behavior, 522 VV2, 191, 224 in Tower of Hanoi problem, 522 VV3, 191, 197, 224 interface to rest of brain, 520 interface to the core brain model, 520 W3C, 345 models, 508, 515 Wachsmuth, E., 191 my approach, 521 waiting phase of grooming, 249 my proposed model, 524 Walker notation and parcellation, 202 neuroanatomy of, 500 Ward, Geoffrey, 290 object tokens, 510 Warren, David H. D., 96 object types, 513 Warrington, Elizabeth, 187 peripheral, 516 water jug problems, 272, 279 phenomena, 509 weight, rule, overall, 234 priming, 511 weights problem-solving, 516 in BAD rules, 316 psychology, 500 weights, data item, 234 representation types, 509 weights, rule, 234 738 Index

Weiner, Herbert, 558 Weiskrantz, Lawrence, 374 Wernicke’s area, 530, 556 Wernicke, Carl, 6, 64 Whitehead, Alfred North, 79 Wierzbicka, Anna, 426 Wilks, Yorick, 426 willed action, 202 Winston, Patrick, 94 Wisconsin Card Sort test, 203, 275, 280 Witkowski, Mark, 20 Wood, Ruth, 581 word ambiguities, 542 word ordering, 539 working goals, 299 working memory, 191, 205, 218, 375, 387 working memory, distributed, 233 working model in attachment behavior, 564 working models, 588 development, 588 world, external, 236 Wundt, Wilhelm Max, 63, 68

Young, Malcolm, 144, 168 Yukie, Masao, 185

Warning

This book contains:

Neuroanatomical terminology

Psychological experimental designs

Programming languages

Explicit logic

Partial mathematization

Reader discretion is advised.