Brain Science
A theory of brain structure and function
by Alan H. Bond

[Cover diagram: schematic of the brain model, showing interconnected modules MI (muscle), SI and SA1 (tactile), PA1, PA3 and PA4 (action planning), DV1 and DV2 (spatial), PM1, PM2 and PM3 (persons), VI and VV1 (visual), G (goals), context, event map and episodic memory, with motor output and tactile and visual inputs.]

[Cover diagram: levels of description and their corresponding disciplines:
interacting selves, groups: social psychology
mental states, consciousness, the self: psychotherapy
cognitive mechanisms, motivation theories: cognitive psychology
system-level brain models, neurocognitive models: neuropsychology
cortical layers, associative memory models: neural nets
single neuron models, synapses, transmission: single neurons
cell dynamics, synapse dynamics, genetic transcription: cell dynamics]
Brain Science
A theory of brain structure and function based on neuroanatomy, psychology and computer science
Alan H. Bond [1]
September 20, 2004

[1] Alan H. Bond, Ph.D., California Institute of Technology, Computer Science Department, Mailstop 256-80, Pasadena, California 91125. Email: [email protected]

Disclaimer
I started putting this book together on January 1st, 2003, and now have a very rough draft. I will rewrite it during the following months, guided by the criticism of a literary agent and potential publishers.
This current version has a lot of mistakes and omissions, but it does outline what material will be in the book and in what order.
There are currently many figures copied by scanning from other sources, but many of these will be redrawn by me.
It is thus with due hesitancy that I am making it available so that it can help in explaining my research. Any feedback and criticism is welcomed; please email me at [email protected] or phone me at 310-828-8719.
To avoid a negative reaction to missing and incorrect details, it might help to think of it as a rough-hewn sculpture rather than a detailed sculpture with errors. This is how I think of it. In the next months, I will be able to give it the necessary detail, precision, polish and completion.

Quotation
Prologue, from the 1975 Ph.D. thesis of Paul D. Scott, Sussex University, England, entitled “The cerebral neocortex as a programmable machine”
THE PARADOX
‘It is conventional for a dissertation such as this to begin with a review of the current state of knowledge relevant to its subject. Unfortunately Sussex University possesses a library containing three hundred thousand volumes of which the vast majority are devoted to the subject matter of this thesis, the human cerebral cortex. Strangely only a small proportion of the authors concerned were actually aware that this was what they were writing about. Certainly many of those whose works are kept in the psychology section may have been but move a few shelves and we come to the astronomy section. Here we find expounded how the cerebral cortex has transformed the eye from an organ for finding the nearest meal into an instrument capable of perceiving the composition of a galaxy many millions of light years distant. Passing on a few feet we come to applied sciences. Here we can read how the hand-eye coordination which got our ancestors to the next branch has developed to enable men to get to the moon. Surrounding volumes describe innumerable ways in which the cortex by controlling our sensory and motor systems has developed ways of controlling the world. Wherever we go in the library we are confronted with testaments of cortical activity - in politics and economics, in mathematics and philosophy or in Marxism and theology. Take away the books and we still have the building as a consequence of activity in (among many others) Sir Basil Spence’s neocortex. Take away the building and we still have the concept of a library - an institution dedicated to fostering the most striking of all human cortical activities, verbal communication.
No doubt many of our readers, whom we have so far ignored on our ’gedanken’ tour of the library (for fear of disrupting their cortical activities), would be horrified at such extreme reductionism. Such a reaction would not be possible without a cortex. It is impossible to avoid a feeling of awe upon the realisation that the whole of human achievement, so extensively documented in the history section is due to the organised firing of neurones in the neocortex. Nor should we make the mistake of attributing only the spectacular achievements of the human race to this region of the brain. We glance at the clock as we leave the library. It is five past one. Just time to skim through a chapter of a book before lunch. The cortex has translated a simple visual pattern into the concept of time and used this information to plan our future activities.
Returning to our desks we open the book, a medical student’s introduction to the nervous system. What does it have to say about the neocortex? “There are essentially only two types of nerve cell in the cerebral cortex: a PYRAMID cell and a GRANULAR cell concerned with the output and input functions respectively. The pyramid cells have a long corticofugal axon and an apical dendrite which goes up to the most superficial layer of the cortex. The granule cells are characterised by a profuse dendritic tree while their axons are usually intracortical and so may go off in any direction. The neocortex is constructed in six layers. The outermost or MOLECULAR layer contains mostly the apical dendrites of the pyramidal cells together with a small number of internuncial neurones. The innermost or FUSIFORM layer contains internuncial and callosal neurons whose axons end in the most superficial layers of their own or the opposite side. The intermediate layers are made up as a four-decker sandwich of granular and pyramidal cells. Layers II and IV are the external and internal [granular] layers respectively while layers III and V are the external and internal pyramidal layers. The layers vary considerably in relative thickness in different areas of the cortex....” Bowsher (1970)
This is of course an oversimplification. Nevertheless it is true that almost everywhere cerebral neocortex consists of a six-layered structure composed of a few basic cell types with definite, if not yet defineable, restrictions on their connectivities. So if we are to try and relate structure and function in the cortex we are at once confronted with a striking paradox. The cortex exhibits a uniquely diverse range of behaviour and yet it has a relatively uniform structure. Any attempt to explain how the brain works must sooner or later face up to this paradox’
Reference. David Bowsher, “Introduction to the anatomy and physiology of the nervous system”, 2nd ed., Oxford: Blackwell Scientific Publications, 1970.

Preface
Style policy
I and we. I have used I to mean Alan Bond alone, and we in two different senses, namely, we meaning the author and the reader, and we meaning the brain science community, the scientific community or people in general. Using I and my often sounds rather egotistical, but this is not my intention. The alternatives seem to be to either use we and our throughout, or to use reported speech, mainly passives. I noticed that other scientific books also use the mixed convention that I have adopted.
The index. By inspection of preliminary forms of the index, I determined some categories which act as subheadings. Some of these are adjectives, e.g., auditory, or even hierarchical; I hope this makes sense.
People’s names include their first name, unless it has been too difficult to find it, and I have not included middle initials, except to remove ambiguity. This is a little unusual in scientific writing, where often just the surname, or perhaps some initials as well, are used. It also becomes a problem when we get down to less important figures, where it is better to just use surnames alone. Also, for teams of people, one can give the leader’s name and say “coworkers” or “their research colleagues”, or one can use the lead author’s name with et al., which can be misleading if the lead author is not one of the main scientists involved.
I use lower case index terms unless they are proper names. I assumed that old world monkeys was all lower case.
I have made hyphenation in the index consistent to avoid ambiguity, since in some contexts in the text we might need to hyphenate, but not in other contexts, whereas in the index there should be one corresponding form, e.g., context 1: problem-solving method; context 2: problem solving; index entry: problem solving.
There are a few ambiguous cases, for example joint rotation means a joint of one person rotating, whereas joint action means the action of two or more persons.
A region means only the regions that I have defined as sets of areas. Other people’s regions I index as areas. Nuclei I also index as areas for now.
I seem to have converged on orbital frontal cortex and not orbitofrontal cortex or orbital-frontal cortex. In any case, its abbreviation is OFC.
Glossary. Since most of the terms used are defined and explained in the body of the text, it seemed that an explicit glossary would be wasteful. Instead, for every term used in the text I have made two index entries: one, under the term itself, with a subentry for its definition; and two, under an explicit index entry glossary, with the term as a subentry. Thus, computation occurs in the index as computation, definition and as glossary, computation.
Hyphenation. My strategy is to insert a hyphen if it is needed to resolve ambiguity. One partial exception is object-file which I have kept hyphenated following Treisman and Kahneman’s use.
Technical terms across culture boundaries. I have used all of the terms language, language processing and natural language processing, since natural language processing is a computer science term used always to distinguish it from programming languages or formal languages, whereas for psychologists and neuroscientists, language always means natural language. Protocol means an experimental procedure to psychologists, but to them it can also mean a record of the subject’s verbalization, and it means a sequence of exchanged messages in computer science networking.
Acknowledgements. I’ve put some acknowledgements in the body of the text where there was a specific debt to a person. I think this thanks them better.
Historical figures. I’ve put in dates of birth and death, so we can better appreciate the historical time course.
Diacritical marks, accents. There are occasional discrepancies due to the inability of various software tools to handle them correctly. A name may occur in the text, in a cited reference, in a figure caption, cited in a figure caption, as a table entry, as an index term, or in a footnote.
Typesetting. I am basically using default LaTeX typesetting for a document of type report, which results in various infelicities; these will eventually be removed, first manually by me, and then by using a publisher’s LaTeX macros.
Brevity of style. I have tried to present my theory of the brain, and to explain the scientific reasons and the postulated ideas and mechanisms, with a reasonable degree of precision, rigor and completeness. I did not want to gloss over this; I wanted to explain all of the supporting experimental data and theoretical concepts to the point where the reader can evaluate and use them for him- or herself. The magnitude of this undertaking has led me to a rather terse or elliptical style of writing. Ideas are explained only briefly, without much illustration, repetition, redundancy of expression, or ado. In this way a book of manageable size has been produced, and the reader is led at a good speed, some would say a fair clip, others breakneck speed, through the material, the arguments, explanations of mechanisms and conclusions. I hope that the reader will find this style habitable, even comfortable. A more usually elaborated and embroidered form would have taken 2000 pages.

Contents
I Foundation 1
1 Introduction, motivation and summary 3
1.1 Introduction ...... 4
1.1.1 The desire for a framework ...... 5
1.2 A basis in three disciplines ...... 9
1.3 Science ...... 10
1.3.1 Science and neuroscience ...... 10
1.3.2 The Bohr model of the atom ...... 11
1.3.3 Approximation in science ...... 12
1.4 Description by the brain and by brain scientists ...... 13
1.4.1 Description and computation ...... 13
1.4.2 Description by brain modules ...... 14
1.4.3 The scientific description of the brain ...... 15
1.4.4 Description of and by the brain ...... 15
1.4.5 Representation of episodes, plans and goals ...... 16
1.4.6 The evolution of representations ...... 17
1.4.7 Approximation in modeling the brain ...... 17
1.4.8 Computer science and description ...... 18
1.5 History of the research on my model ...... 19
1.6 Overview of this book and my model of the brain ...... 22
2 Primate behavior 28
2.1 From nonhuman to human primates ...... 29
2.2 Biology and development ...... 31
2.3 The evolution of primates ...... 31
2.3.1 Present day primates ...... 32
2.4 Primate behaviors ...... 33
2.5 Societal dynamics ...... 36
3 The primate brain 40
3.1 The evolution of the primate brain ...... 41
3.1.1 Invertebrates ...... 41
3.1.2 Vertebrates ...... 41
3.1.3 Mammals ...... 44
3.1.4 Primates ...... 44
3.1.5 Modern evolution theory ...... 49
3.2 The structure of the cortex ...... 50
3.3 The uniform process of the neocortex ...... 52
3.3.1 There is a uniform process ...... 52
3.3.2 Theories of the uniform process ...... 52
3.4 Overall components and architecture ...... 53
4 The historical development of system-level approaches to the brain 61
4.1 Introduction ...... 62
4.2 Neurology, neuroanatomy and neurophysiology ...... 63
4.3 Psychology ...... 68
4.3.1 System models in psychology ...... 69
5 The history of formal description 73
5.1 Introduction ...... 74
5.1.1 Using natural language for scientific and mathematical description ...... 74
5.1.2 Logic in natural language ...... 76
5.2 Formal logic ...... 76
5.3 Theoretical computer science ...... 87
5.4 Artificial intelligence ...... 87
5.5 The choice of programming language ...... 96
5.6 The intellectual revolution of computer science ...... 96
6 Logic programming 98
6.1 Introduction ...... 99
7 Describing information processing in computer science 101
7.1 What is a computer science? ...... 102
7.2 Concepts in computer science ...... 102
7.2.1 Data and data structures ...... 103
7.2.2 Program, control and process ...... 104
7.2.3 Interfaces ...... 105
7.2.4 Communication ...... 106
7.3 Symbols in computer science ...... 106
7.4 The computer science experience ...... 107
8 Computer science description and the brain 109
8.1 The computer and the brain ...... 110
8.2 Computer science assumptions from von Neumann machines ...... 111
8.3 Life and computer systems ...... 112
8.4 Assumptions underlying computer science descriptions ...... 112
8.5 Computer science concepts for the brain ...... 114
8.6 Summary ...... 118
9 Levels of description in computer science 119
9.1 Describing computers ...... 120
9.2 Levels of description for computer systems ...... 120
9.3 Descriptions used at each level ...... 124
9.4 Properties of descriptions ...... 133
9.5 Design, constraints and optimization principles ...... 134
10 Brain science 135
10.1 Describing the brain ...... 136
10.1.1 Levels of description of the brain ...... 136
10.1.2 Natural science and computer science ...... 138
10.2 A hierarchy of scientific cultures ...... 138
10.3 Scientific culture ...... 140
10.4 Information is generated at each level ...... 140
10.5 Formal and computational models at each level ...... 141
10.6 Interactions between levels ...... 141
10.7 A role for logic programming ...... 142
10.8 Neuroscience ...... 143
10.9 Concepts for describing information processing in the brain ...... 143
10.9.1 Goals ...... 144
10.9.2 Plans ...... 145
10.9.3 Sequencing of action ...... 145
10.9.4 Representations of events ...... 147
10.9.5 Social interaction ...... 148
10.9.6 Contexts ...... 148
II The cortex 151
11 Information-processing analysis 153
11.1 Introduction ...... 154
11.2 Hierarchies ...... 155
11.2.1 The concept of hierarchy ...... 155
11.2.2 The elements of hierarchies ...... 156
11.2.3 Sensory and motor hierarchies ...... 157
11.2.4 Possible bases for hierarchy in the neocortex ...... 158
11.3 Anatomical regions and connections ...... 159
11.3.1 Neural areas ...... 159
11.3.2 Anatomical connections and their analysis ...... 162
11.4 Sensing as the construction of descriptions ...... 171
12 An information-processing analysis of the primate neocortex 175
12.1 Introduction ...... 176
12.2 Analysis of neocortical perception hierarchies ...... 179
12.2.1 Olfactory areas ...... 179
12.2.2 Gustatory areas ...... 181
12.2.3 Somatosensory areas ...... 183
12.2.4 Auditory areas ...... 186
12.2.5 Ventral visual areas ...... 191
12.2.6 Dorsal visual areas ...... 194
12.2.7 Polymodal STS areas ...... 197
12.3 Summary and conclusion ...... 200
13 Frontal areas and the perception-action hierarchy 202
13.1 The neocortical planning and action hierarchy ...... 203
13.1.1 Planning and action areas ...... 203
13.1.2 Human cognition ...... 205
13.1.3 Frontal regions ...... 207
13.1.4 Planning and action hierarchy ...... 211
13.2 The perception and action hierarchies ...... 212
13.2.1 Cortical regions and their hierarchies ...... 212
13.2.2 Connectivity between perception and action hierarchies ...... 215
13.2.3 Perception-action hierarchical architecture ...... 220
13.3 Summary and conclusion ...... 221
14 Describing information processing in the neocortex 224
14.1 Introduction ...... 225
14.2 The biological basis of our computational approach ...... 233
14.3 Representing data and processes using logic ...... 235
14.4 Dynamics of the model ...... 239
15 My implemented model of the primate neocortex 247
15.1 Our implemented brain model ...... 248
15.2 Behaviors and results obtained ...... 252
15.3 Discussion ...... 263
15.4 Summary and conclusion ...... 270
III Mental dynamics 271
16 Problem-solving behavior 273
16.1 Introduction ...... 274
16.2 Problem solving and the choice of the Tower of Hanoi problem ...... 274
16.3 Problem solving and the frontal lobes ...... 278
16.4 The Tower of Hanoi problem ...... 282
16.5 Strategies for solving Tower of Hanoi problems ...... 287
16.6 Human performance in problem solving ...... 290
16.6.1 Performance on Tower of Hanoi problems ...... 290
16.6.2 Performance on Tower of London problems ...... 292
16.7 Learning problem-solving strategies ...... 297
16.8 Extending our model to allow solution of the Tower of Hanoi problem ...... 298
16.8.1 Tower of Hanoi strategies ...... 298
16.8.2 Selective search ...... 299
16.8.3 Working goals ...... 301
16.8.4 Perceptual tests and mental imagery ...... 301
16.9 Episodic memory and its use in goal stacking ...... 303
16.10 Falsifiable predictions of brain area activation ...... 305
16.11 Tower of Hanoi in BAD ...... 306
16.12 Appendix - Talairach coordinates ...... 306
17 BAD - a Brain Architecture Description language 311
17.1 Introduction and motivation ...... 312
17.2 Specifying modules and channels ...... 314
17.2.1 Descriptions and description transformation rules ...... 314
17.2.2 Basic rule form ...... 315
17.2.3 Prolog ...... 317
17.2.4 Weights ...... 318
17.2.5 Computation ...... 319
17.2.6 Rule execution ...... 320
17.2.7 Competition ...... 320
17.3 Data ...... 322
17.3.1 Storage - descriptions ...... 322
17.3.2 Uniqueness and integrity ...... 323
17.3.3 Updating ...... 323
17.3.4 Attenuation ...... 324
17.3.5 Confirmation ...... 324
17.4 Specifying brain architecture ...... 325
17.5 The form of an external world specification ...... 326
17.6 Sensors and effectors ...... 327
17.7 Executing a complete brain model ...... 329
17.8 Specifying a complete system and experiment ...... 330
17.9 Specifying the initial state of the brain models ...... 330
17.10 Loading the complete model world ...... 331
17.11 Trial files ...... 333
17.12 Appendix - BAD Syntax ...... 333
17.12.1 Syntax of BAD models ...... 333
17.12.2 Syntax of the BAD rule ...... 333
17.12.3 Syntax of BAD modules ...... 334
17.12.4 General specification - The gmod file ...... 337
18 Logical systems 339
18.1 Using logic ...... 340
18.2 Inference and models ...... 345
19 Symbols 349
19.1 Approaches to symbols ...... 350
19.2 Programming issues ...... 352
20 Cortical motivation 354
20.1 Integrity, continuity and identity ...... 355
20.2 Integrity mechanisms in my model ...... 356
21 The layer, neural and cell levels of description 358
21.1 Introduction ...... 359
21.2 The structure of a brain module ...... 359
21.3 The flow of data ...... 361
21.4 Representing a module as a set of interacting layers ...... 362
21.5 Neural nets ...... 365
21.6 Cell types ...... 365
21.7 Synaptic plasticity ...... 367
21.8 Genetic involvement in memory ...... 370
IV Memory 373
22 Episodic memory 375
22.1 The cognitive psychology of memory ...... 376
22.2 The hippocampus ...... 378
22.3 Episodic memory ...... 383
22.3.1 The definition of episodic memory ...... 383
22.3.2 The event structure of episodic memory ...... 384
22.3.3 The concept of event in philosophy and linguistics ...... 386
22.3.4 Rhythms and clocks ...... 387
22.4 My overall approach to memory ...... 388
22.5 My approach to episodic memory ...... 389
22.5.1 Main principles of my theory ...... 390
22.5.2 Instantaneous events ...... 391
22.6 My representation of events and episodes ...... 394
22.6.1 My overall idea ...... 394
22.6.2 Segmentation and chunking within each module ...... 395
22.6.3 Events ...... 395
22.6.4 Event descriptions ...... 397
22.6.5 Uniqueness of reference to events ...... 398
22.6.6 Episodes ...... 399
22.7 Episode formation ...... 401
22.7.1 Recording events ...... 401
22.7.2 The segmentation of event sequences into episodes ...... 402
22.8 Long term memory ...... 403
22.9 Using event and episode information ...... 404
22.9.1 Querying the hippocampal formation ...... 404
22.9.2 The role of short term episodic memory in ongoing behavior ...... 405
22.9.3 The role of long-term episodic memory in ongoing behavior ...... 406
22.9.4 Possible functioning of the hippocampus ...... 406
22.10 The problem of representation ...... 408
22.11 Episodes in thinking ...... 409
22.12 My own concept of episode in thinking ...... 420
22.12.1 Types of mental action ...... 421
22.12.2 Event types ...... 423
22.12.3 The dynamics of episodes ...... 423
23 Contexts 424
23.1 Introduction ...... 425
23.2 Artificial intelligence models of memory ...... 425
23.3 My dynamic model ...... 428
23.4 The context module ...... 430
23.5 The representation of contexts ...... 432
23.6 Activation and execution of contexts ...... 433
23.7 Example of a context ...... 434
23.7.1 The form of the ss context ...... 434
23.8 Generating and updating contexts ...... 437
23.9 Relation of my representation to Schank’s ...... 439
23.10 Contexts and memory ...... 440
23.11 Executing a context ...... 442
23.12 Contexts required for the Tower of Hanoi protocol ...... 444
23.12.1 Restarting ...... 448
23.12.2 Quasilinear searching ...... 448
23.13 The hierarchy property ...... 448
23.14 The learnability property ...... 450
23.14.1 The problem ...... 450
23.14.2 Events ...... 451
23.15 The cognitive map ...... 453
23.16 Logical representation of contexts and their execution ...... 454
23.17 The code ...... 454
23.17.1 Brief outline ...... 454
23.17.2 Normal modules ...... 457
23.17.3 The episodic module ...... 457
23.17.4 The context module ...... 460
23.17.5 The plan module ...... 460
24 Learning by doing 467
24.1 Learning by doing ...... 468
24.2 Tower of Hanoi learning ...... 468
24.3 Research by others on modeling Tower of Hanoi learning ...... 470
24.4 My analysis of the Anzai-Simon protocol ...... 470
24.5 Learning Tower of Hanoi strategies ...... 470
24.6 Contexts in the Tower of Hanoi example ...... 473
24.7 Lessons learned from attempting to extend the model to do learning . . . 473
24.8 Appendix - The Anzai and Simon protocol ...... 474
25 Procedural memory and routinization 481
25.1 Introduction ...... 482
25.2 The basal ganglia ...... 486
25.3 Learning by the basal ganglia ...... 491
25.4 The interleaving of routine and creative action ...... 494
V Extensions 499
26 Vision 501
26.1 Introduction ...... 502
26.2 The neuroanatomy of the visual system ...... 502
26.3 The psychology of vision ...... 502
26.3.1 Representations of the percept ...... 508
26.3.2 The total percept and consciousness ...... 509
26.3.3 Models of the visual system ...... 510
26.4 Phenomena to be modeled ...... 511
26.4.1 Representations ...... 511
26.4.2 Timing phenomena ...... 512
26.4.3 Object tokens ...... 512
26.4.4 Attention and awareness ...... 513
26.4.5 Object types ...... 515
26.5 Eye movement control ...... 516
26.5.1 Transsaccadic vision ...... 517
26.5.2 Models of vision including eye movement ...... 517
26.6 Problem solving, perceptual queries, and topdown attention ...... 518
26.7 Vision and mental imagery ...... 520
26.8 Vision and episodic memory ...... 521
26.9 The interface between the visual system and the core brain model ...... 522
26.10 My approach to the visual system ...... 523
26.10.1 My proposed contribution ...... 523
26.10.2 An example of visual system behavior ...... 524
26.11 A proposed model of the visual system ...... 526
26.11.1 Areas and mechanisms to be modeled ...... 526
26.11.2 The mechanism of the model for the Treisman process ...... 526
27 Natural language processing 532
27.1 Introduction ...... 533
27.2 Kempen’s model of grammar ...... 538
27.3 Vosse and Kempen’s results ...... 544
27.4 My grammatical model for the brain ...... 546
27.5 The interface between the natural language system and the core brain model ...... 551
27.6 Using general brain model mechanisms for strengths ...... 551
27.7 Agrammatism ...... 557
27.8 Conclusion ...... 558
28 Analysis of subcortical systems 559
28.1 Introduction ...... 560
28.2 The psychology of subcortically motivated behaviors ...... 563
28.2.1 Agonism ...... 563
28.2.2 Attachment ...... 567
28.2.3 Sex ...... 570
28.3 The neuroanatomy of subcortically motivated behavior ...... 572
28.3.1 The hypothalamus ...... 573
28.3.2 The amygdala ...... 574
28.4 The action of subcortical systems ...... 577
28.4.1 Subcortical effects ...... 577
28.4.2 System functions and connections ...... 578
28.4.3 Brain mechanisms of agonism ...... 579
28.4.4 Brain mechanisms for attachment behavior ...... 581
28.4.5 Brain mechanisms of sexual behavior ...... 584
28.4.6 A unified picture ...... 586
28.5 Stress ...... 589
29 Modeling interacting cortical and subcortical systems 590
29.1 Introduction ...... 591
29.2 Ontogenetic development ...... 591
29.3 The interface between the subcortical systems and the core brain model ...... 594
29.4 Modeling agonistic motivation ...... 595
29.5 Modeling attachment motivation ...... 596
29.6 Modeling sexual motivation ...... 597
29.7 Logical models of motivational systems ...... 600
29.7.1 Formal representation ...... 600
29.7.2 Representing the hypothalamus ...... 600
29.7.3 Representing the amygdala ...... 605
29.8 Towards a logical model for sexual behavior ...... 605
VI Conclusions 609
30 Consciousness 611
30.1 Some remarks on consciousness ...... 612
30.2 Lived experience and creative imagination ...... 613
31 Towards a computer science of the brain 616
31.1 Introduction ...... 617
31.2 Concepts ...... 617
31.3 Design, constraints and optimization principles ...... 618
31.4 Programming language ...... 620
31.5 Research issues ...... 621
31.6 Summary and conclusion ...... 622
32 Toward Brain Science 623
32.1 Describing the brain ...... 624
33 Summary of the model 625
33.1 Overall ...... 626
33.2 Some basic tenets of the theory ...... 626
33.3 The basic representation ...... 627
33.4 The core dynamic model ...... 627
33.5 Extensions to vision, language and subcortical systems ...... 635
33.5.1 Vision ...... 635
33.5.2 Natural language ...... 636
33.5.3 Subcortical systems ...... 638
33.6 My disagreements ...... 644
34 In conclusion 645

List of Figures
1.1 Dividing the brain into four main parts ...... 24
2.1 The similarity of the brains of primates, from [Brodmann, 1909] via [Bullock, 1977] Fig 10.92, p. 487 ...... 29
3.1 The telencephalon, from [Bullock, 1977] ...... 42
3.2 The reptile brain, from [Romer and Parsons, 1986] ...... 43
3.3 The sequence of evolution of the cortex, from [Romer and Parsons, 1986] ...... 45
3.4 The circuit of three layer cortex, from [Shepherd, 1998] ...... 46
3.5 The emergence of six layer cortex in reptiles, from [Dart, 1934] ...... 46
3.6 Types of cells in six layer cortex, from [Douglas and Martin, 1998] ...... 47
3.7 The evolution from three to six cortical layers, from [Reiner, 1993] ...... 47
3.8 Stephan and Andy’s data on the relative growth of different parts of the brain, from [Eccles, 1989] ...... 48
3.9 Todd Preuss’s findings of new neocortical areas by comparing a prosimian (a Galago bushbaby) and a simian (a Macaque monkey) ...... 56
3.10 Brodmann’s areas of the human neocortex ...... 57
3.11 Cell types and internal connectivity in the neocortex, from Shepherd and Koch in [Shepherd, 1998] ...... 58
3.12 External connectivity of the different cortical layers ...... 58
3.13 Canonical neocortical circuit, due to Douglas and Martin [Douglas and Martin, 1998] ...... 59
3.14 Overview of brain components ...... 60
4.1 Events in the history of neurology and neuropsychology ...... 62
4.2 Wernicke’s system diagram ...... 65
4.3 Extended system diagram ...... 66
4.4 Events in the history of psychology ...... 70
4.5 Freud’s diagram of his transcription model, from [Freud, 1900] ...... 70
5.1 Events in the history of formal description ...... 75
5.2 Frege’s concept script ...... 79
5.3 Example of a Turing machine ...... 82
5.4 The idea of a universal Turing machine ...... 84
5.5 Tarski’s concept of an interpretation of a theory ...... 86
5.6 Geometric knowledge in children’s drawings ...... 93
5.7 Minsky’s concept of frame ...... 94
9.1 Levels of description in computer science ...... 122
9.2 Levels of description and their interactions ...... 123
9.3 The process of chip design, from [Edwards, 1992] ...... 124
9.4 Levels on a laptop running my brain model ...... 125
9.5 Specification documents for levels and interlevel interfaces ...... 127
9.6 NAND gate defined by circuit, logic diagram, layout and logic formula representations ...... 128
9.7 Definitions of NOT, NAND and NOR gates as truth tables ...... 129
9.8 A logic circuit ...... 130
9.9 Pipelining diagram defines serialized register transfer language in terms of hardware RTL ...... 131
10.1 Levels of description in brain science ...... 137
10.2 My concept of Brain Science ...... 139
11.1 Neural areas and notation used ...... 161
11.2 Lesioning an area affects a small number of other areas ...... 163
11.3 Pattern of connectivity discovered by Jones and Powell ...... 164
11.4 Summary diagram showing the three sequences reported by Jones and Powell ...... 166
11.5 Summary of hierarchies reported by Pandya and coworkers ...... 168
11.6 Intrinsic connectivity of frontal areas, from (Barbas and Pandya, 1989) ...... 170
11.7 Sensory features - first part ...... 173
11.8 Sensory features - second part ...... 174
12.1 Table of all hierarchical regions ...... 178
12.2 The olfactory hierarchy ...... 180
12.3 The gustatory hierarchy ...... 182
12.4 The somatosensory hierarchy ...... 184
12.5 The auditory hierarchy ...... 187
12.6 The ventral visual hierarchy ...... 192
12.7 The dorsal visual hierarchy ...... 195
12.8 The polymodal hierarchy of the superior temporal sulcus ...... 198
13.1 The planning and action hierarchy ...... 204 13.2 Characterization of planning and action hierarchy ...... 211 13.3 Views of the cortex showing regions ...... 213 13.4 Summary of experimental findings for hierarchy of data abstraction . . . 214 13.5 Table of all extrinsic connections among neural areas, part 1, AS - arcuate sulcus ...... 216 13.6 Table of all extrinsic connections among neural areas, part 2 ...... 217 13.7 Diagram of connections to frontal areas ...... 219 13.8 Neocortical perception-action hierarchy ...... 223
14.1 Summary of experimental findings for hierarchy of data abstraction . . . 227 14.2 Computational hierarchical levels used in my model ...... 228 14.3 (a) Lateral view of the cortex showing neural regions and functional involvements, (b) Connectivity of regions showing perception-action hierarchy ...... 229
14.4 Modules from neural areas of the primate neocortex, and my initial system model ...... 231 14.5 Functioning of interacting perception and action hierarchies in behavior . 240 14.6 How the model works ...... 242 14.7 Response to variation in environment ...... 245
15.1 Description types in each module ...... 250 15.2 Outline of description transformations in each module ...... 251 15.3 Grooming sequence ...... 253 15.4 Visualization of a typical instantaneous state of the model ...... 254 15.5 Instantaneous behavioral states of two interacting primates ...... 255 15.6 Social conflict sequence ...... 257 15.7 Avoidance sequence ...... 259 15.8 Displacement sequence ...... 260 15.9 Displacement sequence, with persons visualized as humans ...... 262 15.10 Predicted brain area activation for different kinds of processing ...... 268
16.1 Initial and general positions for five disk Tower of Hanoi problem . . . . 275 16.2 Typical initial and final positions for Tower of London problems . . . . . 276 16.3 Cards used for the Wisconsin Sort Test ...... 277 16.4 Roland’s diagram summarizing his findings for thinking tasks ...... 280 16.5 Construction of state spaces for Tower of Hanoi problems ...... 284 16.6 Tower of Hanoi state space for problems with three disks ...... 285 16.7 Tower of Hanoi state space for problems with five disks ...... 286 16.8 Image of Tower of London performance ...... 293 16.9 Table of brain area activations for Tower of London ...... 294 16.10 Image of Tower of London performance for harder problem ...... 294 16.11 Table of brain area activations for harder problem ...... 295 16.12 Distributing a strategy over several modules ...... 298 16.13 Representation of the selective search strategy on the brain model . . . . 300
16.14 Representation of the perceptual strategy on the brain model ...... 302 16.15 Representation of the goal recursion strategy on the brain model . . . . . 304 16.16 The corpus callosum shown in coronal section ...... 307 16.17 The corpus callosum showing anterior and posterior commissures . . . . . 308 16.18 The sagittal plane showing AC and PC, and the x and z axes, drawn in orange ...... 309 16.19 Talairach coordinate system ...... 310
18.1 The concept of logical module ...... 340
19.1 Newell’s concept of symbol ...... 351 19.2 Difference between Conventional AI Program and our Distributed System 353
21.1 Structure of a brain module ...... 360 21.2 Module as interacting layers ...... 363 21.3 Module mechanism as interacting layers ...... 364 21.4 Cell types for layer IV cortical neurons, Firing patterns: RS regular spiking, BS burst spiking and FS fast spiking ...... 366 21.5 Mechanisms and sites for synaptic plasticity, from [Malenka and Siegelbaum, 2001] ...... 369 21.6 Molecular events within the neuron leading to short and long term memory, based on Figure 1 of [Alberini, 1999] ...... 371
22.1 Separate learning modules ...... 379 22.2 The rat brain, shown flattened, adapted from (Swanson, 1992) ...... 379 22.3 Hippocampal evolution ...... 380 22.4 Hippocampal neuroanatomy ...... 380 22.5 The hippocampal formation, block diagram ...... 381 22.6 Cortical connections to hippocampal complex, for rhesus monkey, from Kobayashi and Amaral ...... 392 22.7 The episodic memory module ...... 393 22.8 An event as a set of hippocampal inputs ...... 398
22.9 An episode as a sequence of events ...... 400 22.10 An episode as a sequence of episodes ...... 401 22.11 Multiple contexts and episodes ...... 402 22.12 The possible action of the hippocampal formation in memory ...... 407 22.13 The context representation problem ...... 408 22.14 A chess position used by DeGroot and by Newell and Simon ...... 410 22.15 A chess search tree, taken from HPS Fig 12.3, p. 714 ...... 411 22.16 A chess problem behavior graph, first half, taken from HPS Fig 12.4, p. 715-6 ...... 412 22.17 Chess episodes, taken from HPS Table 2.1, p. 723 ...... 413 22.18 Progressive deepening in chess proof, three phases, taken from Chapter 7, Figure 7, pp. 268-9 ...... 416 22.19 Chess proof, taken from De Groot Chapter 1, Figure 3, pp. 28-30 . . . . 417
23.1 The basic action of the system ...... 429 23.2 Episodic memory and context mechanism ...... 431 23.3 A context as part of a hierarchy of contexts ...... 432 23.4 An example of a context, for the selective search strategy ...... 435 23.5 Executing the ss context, and forming an episode ...... 436 23.6 The environment of context execution ...... 440 23.7 Episode creation during problem solving ...... 441 23.8 Context for the selective search strategy, showing messages ...... 445 23.9 Contexts which send messages ...... 445 23.10 Context for obstacle on source peg ...... 446 23.11 Context for obstacle on target peg ...... 447 23.12 Context for evaluation ...... 447 23.13 Nesting of contexts ...... 449 23.14 Learning and nesting property ...... 450 23.15 Outline of code ...... 455
23.16 Executing a context ...... 462
24.1 Strategy learning sequence of Anzai and Simon ...... 469
25.1 Subgoals in routine action ...... 484 25.2 Subgoals in perception hierarchy ...... 485 25.3 Basal ganglia ...... 486 25.4 The geometric influence of the third ventricle ...... 487 25.5 Loops involving the basal ganglia ...... 489 25.6 Basal ganglia loop shown in coronal section ...... 490 25.7 The association loop mapped onto the perception-action hierarchy . . . . 491 25.8 The sensory-motor loop mapped onto the perception-action hierarchy . . 492 25.9 The association loop mapped onto the perception-action hierarchy . . . . 493 25.10 Possible arrangement for monitoring of routine action by planning module 494 25.11 Driver and modulator connections of the thalamus ...... 495 25.12 Composite diagram of cortex, thalamus and basal ganglia ...... 496
26.1 Visual Brodmann areas for the monkey brain (left) and the human brain (right) ...... 504 26.2 Retinotopic mapping of retina onto V1 ...... 505 26.3 Early vision modules and functioning - from Gallant and Van Essen 1995 506 26.4 The Treisman psychological model of early vision - from [Treisman, 1988] 507 26.5 Eye movement control in the brain ...... 516 26.6 Our concept of hierarchical eye movement control in the brain ...... 518 26.7 Organization of Tower of Hanoi strategy showing perceptual goals . . . . 519 26.8 Stephen Kosslyn’s model for mental imagery ...... 520 26.9 The main brain areas of the visual system ...... 527 26.10 An image and its different components during processing ...... 528 26.11 The different modules and their data during perception ...... 529
27.1 Summary of imaging data for natural language processing, from [Deacon, 1997] ...... 536 27.2 A suggestion for the natural language processing system, from [Deacon, 1988] ...... 537 27.3 Lexical frames ...... 539 27.4 Sharing of features, from Gerard Kempen’s book [Kempen, 2000] Figure 2.2 ...... 540 27.5 Normal parameter values determined by Vosse and Kempen ...... 543 27.6 Step during construction of structure description ...... 544 27.7 Vosse and Kempen’s sentence recognition results ...... 545 27.8 Processes organized as concurrent modules ...... 549 27.9 Brain areas corresponding to concurrent modules ...... 550 27.10 Variation of strengths in the V-K model of sentence recognition . . . . . 552 27.11 Variation of strengths in our brain model for sentence recognition . . . . 554 27.12 Parameter values used ...... 555 27.13 Caplan’s stimulus sentence types and comprehension scores ...... 557
28.1 General approach with two levels of control ...... 563 28.2 Time course of male-female relationship ...... 571 28.3 The human hypothalamus, taken from Carpenter, 9th edition, Figure 17.1, page 707 ...... 573 28.4 Hormone outputs from the hypothalamus via the pituitary ...... 574 28.5 Intrinsic connections diagram for the amygdala, from [Aggleton and Saunders, 2000], Legend: AAA: anterior amygdala area, AB: accessory basal nucleus, CE: central nucleus, COa,p: cortical nucleus, anterior and posterior parts, Bmc,pc: basal nucleus, magnocellular and parvocellular parts, L: lateral nucleus, PAC: periamygdaloid cortex, PL: paralamellar part of basal nucleus ...... 575 28.6 Extrinsic connections diagram for the amygdala, from [Aggleton and Saunders, 2000] ...... 576 28.7 Summary connections diagram for the amygdala, illustrated by agonism . 580 28.8 Levels of threat processing, from Graeff 1994 ...... 581 28.9 Table of pup action effects on the dam, from [Hofer, 1987] ...... 581
28.10 Table of dam action effects on the pup, from [Hofer, 1987] ...... 582 28.11 Reaction of the different control systems of the pup on separation from the dam, from [Hofer, 1987] ...... 583 28.12 Subcortical systems involved in sexual behavior ...... 584 28.13 Female and male hormonal circulation ...... 586 28.14 Levels of interacting motivational control ...... 588
29.1 Postulated brain mechanisms for agonistic behaviors ...... 595 29.2 Postulated brain mechanisms for attachment behaviors ...... 596 29.3 Postulated brain mechanisms for sexual behaviors ...... 599
33.1 Modules from neural areas of the primate neocortex corresponding to my initial system model ...... 628 33.2 Functioning of interacting perception and action hierarchies in behavior . 629 33.3 Separate learning modules ...... 630 33.4 Summary diagram of learning module and core model ...... 631 33.5 The episodic memory system ...... 632 33.6 The formation and use of contexts ...... 633 33.7 The association loop of the basal ganglia and cortex ...... 634 33.8 The different modules and their data during visual perception ...... 635 33.9 Summary diagram of extension for vision ...... 636 33.10 Brain areas corresponding to language processing ...... 637 33.11 Summary diagram of extension for language processing ...... 638 33.12 Levels of interacting motivational control ...... 640 33.13 Summary diagram of extension for subcortical systems ...... 641 33.14 Summary diagram of brain model showing some detail ...... 642 33.15 Simplified summary diagram of brain model ...... 643
Part I
Foundation
Chapter 1
Introduction, motivation and summary
Abstract: In this first chapter, I explain some motivations for wanting to develop a general scientific theory of the brain, the chief one being a desire for a framework within which to define psychological concepts in terms of the overall functioning of the brain.
I explain that this book is an attempt to define not only a theory of the brain but also a field which I call Brain Science which intimately combines elements of neuroscience, psychology and theoretical computer science. I discuss my notion of natural science, and approximation. I make some remarks about the problem of scientific description of the brain, as well as the description process carried out by the brain. I indicate the kinds of representation that I will postulate are used by the brain.
I give a brief history of my own career and the events leading to the development of this theory. I also briefly outline what my theory consists of, by giving an account of what will be in each chapter as we read through the book.
1.1 Introduction
In this book, I will explain an approach to developing a working model of the brain. This is a scientific model of the brain, in that it is described by a precise theory. It has a causal dynamics and describes the action of the brain at a system level, rather than a neural level. I indicate how a neural-level model could be obtained from this system-level model. My model is also realized as a computer program. This model can generate falsifiable predictions that can be compared with experimental data.
The model results from the use of abstract computer science concepts to describe the brain as a computer. Most of computer science concerns techniques for designing and implementing present-day computer systems, and uses the notion of address throughout, for describing control as well as data, which is not biologically plausible. However, there is also a more abstract, theoretical computer science, and it was these more abstract ideas that I was able to use.
The resulting research unfortunately falls well outside of neuroscience, psychology and computer science, as these are narrowly defined in university departments. However, my one redeeming feature of obtuse tenacity enabled this research to progress and to bear fruit.
It was opined several decades ago that we already had enough data about the brain for a reasonable model to be found. However, the predominant opinion of most neuroscientists and psychologists is that a correct theory and model of the brain will not arrive for many decades, perhaps a century, and further that it will emerge from painstaking empirical research by neuroscientists. A psychologist friend of mine is fond of saying it will not occur in her lifetime. To her I say, put this book down immediately and try to enjoy what is left of your short life. To everybody else, read on!
1.1.1 The desire for a framework
The impulse that propelled me along this path was my need for an intellectual framework within which to define concepts in psychology. I needed an overall view of the total action of the mind, and this led inevitably to the need to understand the brain.
The kinds of problems I would like to illuminate include: (i) what is the basic regime of processing in the brain? A lot of present-day scientific thinking basically assumes a “straight through” processing of input data to produce an output action. (ii) What happens in perception, how do we construct our percept of the world? (iii) What is the role of perception in overall action? (iv) How can we think of motivation - what causes different mental activities to happen? (v) Can we think of emotion in a scientific manner? (vi) What is happening during different kinds of thinking? (vii) What happens in the brain during social interaction? (viii) How does a child learn and develop into an adult? (ix) How do different personalities form? (x) How should we think about consciousness?
Can we find a set of information-processing concepts that will allow us to describe information processing in the brain?
Artificial intelligence has terms such as goal, plan, semantic net, and so on. What are the corresponding concepts for the brain?
What is the correct set of concepts for psychological thinking? Current ideas include episodic memory, phonological buffer, chunk, motor program, and so on. How do we define these precisely?
Neurophysiological explanations are unsatisfactory. The notion of a pathway assumes that the action of the system is “straight through”: it does not iterate or execute actions conditionally, and “upstream” feedback is difficult for such accounts to handle. How does data flow within the brain?
This book presents a well-defined framework within which to define psychological and neuroscience concepts and to give answers to many of these kinds of questions.
There are psychologists who have put forward computer-science type theories to describe their experimental results, although rarely realized as computer programs, for example:
1. Daniel Kahneman and Anne Treisman’s object-file theory of object perception [Kahneman and Treisman, 1984].
2. Stephen Kosslyn’s mental imagery model [Kosslyn, 1994].
3. Vickie Bruce’s theory of face perception [Bruce, 1988].
4. Tim Shallice’s theory of frontal decision making [Shallice, 1988].
System models. In my research, I will model the primate neocortex as a system. A system model treats an object of study as a set of interacting subsystems, each of which is easier to understand and to describe than the complete system. It explains the behavior of the object as due to the action of each subsystem and the interactions among subsystems.
The use of a systems level of thinking in neuroscience, in which more than one neural area is conceived as working together, has a venerable history going back to Wernicke’s books [Wernicke, 1874][Wernicke, 1894]. There is, for neuroscientists, a natural systems level of explanation of experimental data, see for example [Gazzaniga, 1989]. Current imaging evidence is showing coactivation of distributed areas in many tasks. From this type of evidence, McIntosh et al. [McIntosh et al., 1994] have developed influence graphs, which give a measure of dynamic influence among neural areas. Mesulam [Mesulam, 1990] has developed ideas of a distributed system mediating attention. Kosslyn [Kosslyn, 1994] has developed a system model of visual imagery and perception. Goldman-Rakic [Goldman-Rakic, 1988] has investigated distributed systems for working memory, which involve areas in the frontal lobes and in the parietal and temporal lobes. Modular explanations of language processing have progressed [Geschwind, 1965] [Deacon, 1989] and now have support from imaging experiments. Petersen et al. [Petersen et al., 1988], for example, have produced a modular description of language processing. There are modular explanations of the aphasias, and of the dyslexias [Karmiloff-Smith, 1992].
Boxology and control processes. In cognitive psychology, the “boxology” diagrams for short term memory have always been unsatisfactory to me for two main reasons:
1. They are not part of an overall complete system which controls behavior, or even of an overall memory system.
2. Control issues are completely finessed, as “control processes”, a term introduced by Atkinson and Shiffrin [Atkinson and Shiffrin, 1968] and which has never been defined to a workable degree of precision.
The interface. The subject of my research in many ways sits at an interface between biology and computer science. From a computer science perspective, I seek to describe computers based on the brain, which might have some of the legendary properties of the brain: its parallelism, intelligence, flexibility and resilience. From a biological perspective, I seek to bring computer-science concepts and methods to bear on the problem of the scientific understanding of the brain, which may provide a well-defined theory and a precise, tractable description of the brain.
Natural science and causal models. Experimental results demonstrate the involvement of some parts of the brain in some given behavior, but they do not provide a causal functioning model of the brain actually operating to produce the behavior. By causal I mean that the model has a dynamics of changing in time from one state to another, each next state being determined from its current state. Very little experimental information is available on how different parts of the brain work together, what information flows, what is computed, or how the activities of different parts are coordinated. Indeed, most of natural science to date has been concerned with matter and energy and transformations among their various forms. What we will need for the brain is a natural science of information processing.
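To make the notion of a causal dynamics concrete, here is a minimal sketch of my own; it is an illustration, not the author's model, and all names in it are invented. A fixed transition function computes each next state purely from the current state, so the entire trajectory follows from the initial state.

```python
# Toy causal dynamics: three hypothetical "modules" whose next values
# depend only on the current state. This illustrates determinism, not
# any claim about real brain circuitry.

def step(state):
    """Transition function: next state is a function of current state only."""
    perception, plan, action = state
    return (perception + 1,   # a new sensory sample arrives
            perception,       # the plan is built from the last percept
            plan)             # the action executes the previous plan

def run(state, n_steps):
    """Unroll the dynamics: the whole trajectory follows from the start state."""
    trajectory = [state]
    for _ in range(n_steps):
        state = step(state)
        trajectory.append(state)
    return trajectory

print(run((0, 0, 0), 3))
```

Running the same initial state always yields the same trajectory, which is exactly the sense of "causal" used above: each next state is determined from its current state.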
Computer science. In order to produce a working causal model of information processing in the primate neocortex, I will turn to a computer science analysis, where I will draw on knowledge of information-processing systems from several different specializations within the field of computer science - parallel architectures, distributed systems, formal description languages, and artificial intelligence planning.
Neuroscientific information-processing requirements can be used to constrain the design of a scientific model. We know the brain does indeed work in a coordinated manner to produce behavior, we know it is stable under a number of disturbances, we think it is organized for real-time responsiveness, we believe it is distributed, and we have some idea of timings of components.
Computer science brings to my research notions of processing architecture, of data and process representation, and of control [Siewiorek et al., 1982]. Further, computer science brings techniques for describing, specifying and implementing models using programming languages. Description languages have been developed for the high-level description of complete computer systems [Davis, 1993] [Calvez, 1993]. In my case, computer science will be of particular value for understanding how control in the brain could be organized, and for understanding how to create a model by specifying and implementing one using logic programming.
1.2 A basis in three disciplines
I will propose a multidisciplinary approach to the study of the brain. The arguments in this book are based on three core disciplines, namely, neuroscience, psychology, and computer science: (i) Neurology, neuroanatomy and neurophysiology deal directly with the biology of the brain and with neurons, (ii) Psychology, which includes experimental psychology, psychiatry and psychoanalysis, deals with human behavior and its pathology, without neural explanations, and (iii) Computer science and logic, which deal with formal mechanisms and theories of information processing. In addition, however, there are other relevant areas which we will involve as needed: (iv) Linguistics, the study of language in its own terms, (v) Primatology, the study of behavior and its underlying mechanisms in, mainly non-human, primates, and (vi) Sociology, the study of groups, populations and the interactions of individuals.
In order to understand the brain, we need to have a good knowledge of all three core disciplines. It is one of my aims that this book can be read by a specialist in any one of these disciplines with the result that they are able to understand the contribution of the other two.
It is also my hope that this book will help define a new subject which we can call Brain Science, which intimately combines Neuroscience, Psychology and Computer Science into a unified field of study and research. It will do this by giving an intellectual framework incorporating all three component disciplines, and by suggesting a hierarchy of description levels allowing different types of knowledge, experiment and theory to be developed at each level and allowing different levels to relate to each other. It will also show how one might select, from each discipline, material and concepts so that one does not need to attempt the impossible task of becoming an expert in all three subjects. I also suggest one possible brain model and one possible precise language in which to represent brain models.
I have tried to avoid unnecessary technicality without sacrificing an adult-strength treatment. In order to do this I have necessarily simplified my treatment. One can verify and correct my treatment by consulting more specialized material to which I will provide references. I also assume that the reader may skip the more demanding sections without causing a catastrophic break in their understanding.
1.3 Science
1.3.1 Science and neuroscience
As every student knows, the word “science” comes from the Latin word “scientia”, meaning knowledge, from scientem, scire, to know. Now, although in common parlance one can be said to know facts, such as a phone number, it is clear that the intention of science is to know concepts and principles. I emphasize this point since most of present-day neuroscience and a lot of present-day psychology consists of discovering phenomena and describing them, without attempting to discover underlying concepts or scientific theories. When neuroscientists and psychologists go to conferences, they discuss their discovery of their latest phenomena, and issues concerning how to find yet more new phenomena.
When I was doing theoretical physics, theoreticians did nothing but theory, no experiments; there were experimentalists who mainly did experiments and had only a general idea of current theoretical ideas; and then there were people in the middle whom we called phenomenologists. These were the people who were aware of all the experimental data and could summarize it into tables and trends, and might develop ad hoc curves that would fit the data. It was this processed data that was used by theoreticians to guide their search for a principled theory.
In the current state of neuroscience, there are only experimentalists, and a few of these may make some theoretical conjectures. It is impossible to publish a paper in a neuroscience journal unless it contains some new experimental data, and many journals will not allow any theoretical discussion. Thus there is no theoretical effort; it is apparently viewed as foolhardy since, after all, the historical road of neuroscience is strewn with the carcasses of dead theories.
Some quotations from our forebears: Herbert Spencer, 1820-1903, “Science is organized knowledge” [Spencer, 1861]. Thomas H. Huxley, 1825-1895, “Science is nothing but trained and organized common sense”. Lord Rayleigh, 1842-1919, “Examples ... which might be multiplied ad libitum, show how difficult it often is for an experimenter to interpret his results without the aid of mathematics.” [Bell, 1937].
1.3.2 The Bohr model of the atom
A key image that I use is that of the Bohr atom. This was the first successful model of the atom’s internal structure. Before Bohr’s model of 1913, the atom was known to exist, but its structure was unknown. The best thing available was the Rutherford model, in which the atom consisted of a nucleus with electrons orbiting around it; however, Rutherford could not understand why the electrons would not simply radiate electromagnetic energy and collapse to the nucleus, since this is what the standard theory of electromagnetism required. Niels Bohr then suggested his model of the atom in which the energy states of electrons were quantized, so they could only occupy certain orbits which could not collapse. Energy could only be emitted when an electron moved from one allowed orbit to another, so the spectrum of an atom consisted of discrete frequencies, and light was emitted as discrete quanta. So Bohr took the difficult idea of quanta, which had been proposed by Planck and Einstein, and applied it to another difficult problem, namely how the planetary model could work. His model was quite simple, with tractable mathematics, so that the average scientist could understand it, and it explained various properties that atoms were known to have, notably their different energy states and the particular wavelengths of light that each type of atom could emit. There were specific known data to explain, namely the observed spectral lines whose frequencies occurred in series given by known formulae due to Balmer and others. The Pauli exclusion principle came later, and the more sophisticated mathematical theory of Schrödinger did not appear until 1926. At the time, Bohr’s ideas were not immediately accepted, probably because his advance made several large steps at once. Indeed, one of his main detractors was Schrödinger, who thought the idea of discrete states was patently ridiculous.
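The “known formulae due to Balmer” can be stated in a few lines. The sketch below, my addition for illustration, computes the wavelengths of the Balmer series of hydrogen from the Rydberg constant; these are the visible spectral lines that Bohr’s quantized orbits explained.

```python
# Balmer formula for hydrogen: 1/lambda = R * (1/2^2 - 1/n^2), n = 3, 4, 5, ...
# R is the Rydberg constant, about 1.097e7 per metre.

RYDBERG = 1.0973731e7  # m^-1

def balmer_wavelength_nm(n):
    """Wavelength in nm of the line emitted when an electron drops
    from orbit n to orbit 2 (the Balmer series)."""
    inverse_wavelength = RYDBERG * (1.0 / 2**2 - 1.0 / n**2)  # m^-1
    return 1e9 / inverse_wavelength                            # convert m to nm

for n in range(3, 7):
    print(f"n={n}: {balmer_wavelength_nm(n):.1f} nm")
# n=3 gives roughly 656 nm, the familiar red H-alpha line.
```

Each allowed orbit pair yields one discrete wavelength, which is precisely why atomic spectra consist of sharp lines rather than a continuum.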
It seems to me that there is an analogous situation with the brain. We have a lot of data, and various ideas such as areas, pathways, working memory, etc. However, how the brain works is a mystery to neuroscience. My model shows how the brain works, and makes its functioning understandable.
1.3.3 Approximation in science
Atomic physics also gives us an approach to approximation. The Schrödinger equation describing atomic structure is accepted as the correct scientific theory. There are some finer details of spectral wavelengths that require a more complex theory that takes relativity into account, and also the structure of the nucleus. However, the vast majority of observed experimental data about atoms and molecules are well described by this equation. Yet the solution of the equation for anything more complex than the simplest atoms of hydrogen and helium is not possible as a closed mathematical expression. Solutions are therefore found by approximation methods, and the accuracy can be improved by doing more computation. Collections of atoms in solid state forms can also be solved approximately, but only when further assumptions are made, derived from experimental data, such as the crystalline or other structure of the solid.
Thus a precise correct theory may not have practical application to complex systems, even though it is accepted as correct. The exact theory gives us a way of thinking and formulating, from which practical results are obtained by approximations of various types.
Neuroscientists and psychologists are currently necessarily involved in difficult experi- mental research. In this type of culture, every detail and nuance must be examined and included, since this is where new phenomena are first discovered. By contrast, in order to develop a theory, a theoretician has to use his or her judgement to simplify and approximate. This involves leaving out some phenomena and ignoring some problems.
1.4 Description by the brain and by brain scientists
1.4.1 Description and computation
I will argue that the main work of the brain is to describe its environment and itself. It does this repeatedly and continuously, so it continuously redescribes itself.
I will in subsequent chapters explain what I mean by a description and by the process of describing.
I will view computation as a process of description. From a newly given description, a computation will generate a more developed and complete description, and one better fitting and describing the external environment and itself.
There will be a limit to this process. At a given time and with a given starting description, computation will terminate with a more complete description that is the most complete without making any further assumptions. This is the most general and most complete description possible given the starting description.
When new data or new knowledge is added, or removed, or changed, the process of redescription can continue and again reach a new most general complete description.
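One way to picture this limit process, with a toy of my own devising rather than the book's formalism, is as forward-chaining: rules add new facts to a description until no rule can add anything more, at which point the most complete description given the starting facts has been reached. Adding a new starting fact simply lets the process run again to a new fixed point.

```python
# Toy redescription-to-a-limit. The rule and fact names are invented
# for illustration; they stand for whatever data a module holds.

def redescribe(facts, rules):
    """Apply rules until a fixed point: the most complete description
    derivable from the starting facts without further assumptions."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for condition, conclusion in rules:
            # fire a rule only if all its conditions hold and it adds something new
            if condition <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

rules = [
    ({"edge-detected", "moving"}, "object-present"),
    ({"object-present", "face-features"}, "person-present"),
]
start = {"edge-detected", "moving", "face-features"}
print(sorted(redescribe(start, rules)))
```

Note that the second rule can only fire after the first has enriched the description, which is the sense in which computation generates a "more developed and complete" description from the starting one.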
The processing in the brain is the action of each brain module which simply repeatedly computes new data. In addition, brain modules coordinate by passing data and by acknowledging data as useful.
1.4.2 Description by brain modules
The brain consists of a set of interconnected modules, each of which is continuously redescribing itself, its inputs and its stored memory. My notion of description is more general than usual in that it includes all the different kinds of module that the brain has, each with its own special kinds of data and representation. Any given module thus maintains descriptions in its own terms. A module receives descriptions from other modules and combines these and any stored descriptions to produce new descriptions. Thus it bases its work on that of others and their time of occurrence. In this way, the brain describes and continuously redescribes action, time sequences, plans, goals and intentions. External action occurs by the brain sending a stream of descriptions of actions to its effectors, such as muscles and glands, which act upon the world. (For our purposes, the world is everything external to the nervous system and thus includes the subject’s body.)
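The module cycle just described can be sketched schematically. This is my own illustration; the module names and data format are invented, not taken from the model. Each module combines its incoming descriptions with its stored ones, produces a new description in its own terms, and passes the result on to other modules.

```python
# Schematic module cycle: receive descriptions, combine with store,
# emit a new description tagged with the module's own terms.

class Module:
    def __init__(self, name):
        self.name = name
        self.store = []   # descriptions this module has kept
        self.inbox = []   # descriptions received from other modules

    def cycle(self):
        """One redescription step: combine inbox and store into a new
        description, keep it, and return it for sending onward."""
        combined = self.inbox + self.store
        new_description = (self.name, tuple(combined))
        self.store = [new_description]
        self.inbox = []
        return new_description

perception = Module("perception")
planning = Module("planning")

perception.inbox.append("red-object-left")
percept = perception.cycle()      # perception redescribes its raw input
planning.inbox.append(percept)    # and passes the result to planning
plan = planning.cycle()           # planning redescribes the percept in its terms
print(plan)
```

The key point the sketch captures is that each module's output is a description built from other modules' descriptions, so the system as a whole continuously redescribes itself.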
1.4.3 The scientific description of the brain
My model achieves clarity and precision by well-known methods, such as defining technical terms as precisely as possible, and making definitions and expressions very simple so that there is little chance of ambiguity.
The brain model works by using formal descriptions, in a formal language which I will define, and it has a precise process which continually constructs new descriptions, a process I am calling redescription.
The architecture of the brain model, within which this redescription process occurs, is defined here informally, but it is nevertheless precise. We will give more detail later, and ultimately any other scientist can examine and run for him- or herself the computer program which gives an exact definition and realization of the model.
1.4.4 Description of and by the brain
There are two different levels or roles for description. One is that in order to describe the brain, we, as scientists, are going to need to describe data, communication, processes, planning, memory, and so on; we will therefore need a rich language in which to describe the brain. Such languages for describing complex information-processing systems have only recently been developed as part of computer science.
My argument is that the brain is a complex information-processing system, and therefore, in order to describe it scientifically, we will need a description language which can describe data structure, processes and their control relations, and abstraction.
The second role of description is that the brain itself is engaged in a descriptive process. It continuously describes its experience. Hence our description of the brain should be rich enough that it can describe the brain's descriptive processing. The scientific language that we need as scientists will therefore probably have to be richer than the descriptive techniques used by the brain.
Each module computes data of certain data types characteristic of the module, using data of other types it has received from other modules. Some modules represent perceptual information such as visual, auditory and somatosensory images, while other modules represent plans, frames, events and episodes.
Thus the brain's functioning is to represent its environment, its actions and its own mental states. The language used by the brain scientist needs to be rich enough to describe this representational activity of the brain, as well as communication among modules, parallel processing among modules, and other computational properties of the brain. This language will allow the description of many different brains, including many that do not or cannot exist in nature for various reasons. Then additional statements expressed in the language will describe one particular kind of brain such as the human brain. Of course, there will also be the ability to specify many different variants of the human brain corresponding to the individually different brains that different people have.
1.4.5 Representation of episodes, plans and goals
By abstraction from observed episodes, the brain develops stored plans for future use. These may involve sequencing of action steps, conditional actions and observation.
Some modules may generate representations whose effect when communicated to other modules is to evoke plans which change the state so that the original representations are removed. These representations will thus have a role as goals in the system.
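To make this goal mechanism concrete, here is a minimal sketch in which a "goal" is just a tagged representation whose presence evokes a plan, and executing the plan changes the state so that the goal representation disappears. The state encoding, the tag convention and the plan table are my own illustrative assumptions, not the book's formalism.

```python
# Goals as representations: a "goal:" item evokes a plan; running the plan
# removes the item, so the system settles when no goal items remain.

def run_until_no_goals(state, plans):
    """Repeatedly let goal representations evoke plans until none remain."""
    while True:
        goals = [item for item in state if item.startswith("goal:")]
        if not goals:
            return state
        plan = plans[goals[0]]   # the plan evoked by this representation
        state = plan(state)      # plan execution removes the goal item

def drink_plan(state):
    # Changes the state so that the evoking representation is removed.
    return (state - {"goal:thirst"}) | {"did:drink"}

plans = {"goal:thirst": drink_plan}
final = run_until_no_goals({"goal:thirst", "location:stream"}, plans)
```

The point of the sketch is that nothing in the machinery is intrinsically a "goal"; the role arises from the representation's effect of evoking state-changing plans.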
1.4.6 The evolution of representations
The different data types are innately endowed, although subject to some ontogenetic changes. That is, the basic modules, interconnections and the data types for each module, have resulted from evolution and are encoded into DNA which guides the construction of the brain.
We can think of the brain as developing representations of various kinds and using these to construct stored memories and to develop plans and actions. It is quite possible that the different representations used in the brain evolved at different times, and that only the most recently evolved ones underlying language are full-fledged symbolic descriptions in the sense of allowing wide use and applicability to many different situations and types of experience.
The brain's ability for scientific description. The most advanced modules will have representations that allow natural-language semantics and mathematics to be represented. With appropriate education, the ability to use scientific language and to describe different possible brains could be achieved. Thus at the very top level of representation the brain becomes truly self-describing, in the sense of scientific description.
1.4.7 Approximation in modeling the brain
We also need to be clear about how we approximate the brain. That is, we will develop brain models which will not perform as well or as completely as the brain, but which nevertheless accurately capture the scientific principles and the mechanisms used by the brain. This is very much a matter of scientific judgment, since we do not want simply to leave out phenomena that we cannot handle by saying that they are beyond our approximation.
One example is visual perception. This is highly developed in humans and performs at a spectacular level. Should we require that our model for visual perception also operate at a similar level? If not, what kinds of performance will we accept as establishing that the model is a good scientific model?
Another example is natural-language processing: the recognition, understanding and generation of sentences. Again the subtlety of human performance is legendary; however, a more limited model which exhibits some of the key natural-language processing phenomena will be satisfying. Of course, our model must not exhibit any unnatural effects. More generally, the judgement of psycholinguists should be used in evaluating whether the model captures the important properties of human behavior, and whether its conceptualizations give insights into how the brain processes language.
1.4.8 Computer science and description
Computer science is a very broad and variegated area. It includes people interested in engineering good computers, people interested in producing good and useful software, people interested in applications such as graphics, and people interested in the underlying technologies such as VLSI.
I will mainly be concerned with theoretical computer science and artificial intelligence, which form the core of the abstract theory of computation.
Computer science can be defined not as the study of computers but as the study of the description of computers. The basic activity is to develop clear descriptions and specifications of information-processing systems.
I will take a description to be (1) syntax, an expression in a precise language, together with (2) semantics, a precise process for interpreting and finding the meaning of the expression.
I will combine my treatment of theoretical computer science with the development of formal logic, since these two areas are so closely intertwined. In fact, I will be using a computer system description approach which uses formal logic.
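The two components can be illustrated with a toy description language, entirely my own invention for illustration and not the formal language used later in the book: the syntax is a nested-tuple expression, and the semantics is a precise interpretation process that computes the expression's meaning.

```python
# Description = syntax + semantics. The syntax of this toy language is:
# an integer literal, or a tuple (operator, expr, expr). The semantics is
# the interpretation process below, which yields a number.

def meaning(expr):
    """Interpret an expression of the toy language."""
    if isinstance(expr, int):        # syntax: a literal
        return expr
    op, left, right = expr           # syntax: (operator, expr, expr)
    if op == "plus":
        return meaning(left) + meaning(right)
    if op == "times":
        return meaning(left) * meaning(right)
    raise ValueError(f"unknown operator: {op}")

# ("plus", 2, ("times", 3, 4)) is a syntactic expression; its semantics,
# the interpretation process, finds its meaning: 14.
result = meaning(("plus", 2, ("times", 3, 4)))
```

The same division carries over to richer description languages: the syntax fixes which expressions are well-formed, and the semantics fixes a definite procedure for finding what each one means.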
1.5 History of the research on my model
I was educated at Queens Road Primary School, Cheadle Hulme, Cheshire, at Middleton School, Wollaton Park, Nottingham, and at Mundella Grammar School, Nottingham, before getting a scholarship in 1957 to Magdalen College, Oxford, where I studied theoretical physics. My research career started with a PhD in theoretical physics at Imperial College, London University and continued with research in the computer science department at Carnegie-Mellon University, Pittsburgh, where I became acquainted with the ideas of Newell and Simon. This led to an interest in developing a scientifically grounded artificial intelligence and psychology. My research history consists mainly of a long-range search for a scientifically grounded computational model of the brain, embellished with a sequence of studies that I came across along the way. In addition, I have worked on some applied AI projects, some only indirectly related to this main goal.
My computer science research started at CMU and then continued at Queen Mary College, University of London, where I was tenured faculty in computer science. I tried to follow Newell and Simon by doing my own protocol experiments for problem solving and chess. One student, Tunch Balman, reimplemented and extended GPS. Other students, Mark Witkowski, David Mott and Phil Marks, built autonomous robots. John Scott developed an artificial intelligence language. In 1978, I wrote a long position paper "An approach to artificial intelligence" attempting to ground AI in the notion of an autonomous surviving robot. This paper posited four bases - survival, real-time control, parallel architecture and learning. As an SERC Principal Investigator, I managed David Mott on a project to develop an intelligent rule-based learning robot, which was reported at the IJCAI-81 conference. In 1981, I edited the first book on expert systems, "The Infotech Survey of Machine Intelligence". From 1978, I worked with several parallel computers, the ICL Distributed Array Processor and also the CMU CM* machine, and in 1983 designed my own SIMD computer architecture without a central control unit, which I called a Bit Cube.
I emigrated to the United States at the end of 1984, and, while at USC and UCLA in Los Angeles, I collaborated with Les Gasser in editing the first book on multiagent systems, "Readings in Distributed Artificial Intelligence", published by Morgan Kaufmann in 1988, which included the first in-depth review of multiagent concepts. I pursued two research ideas in multiagent systems, in order to develop a computational approach to social relationships. The first was to develop a notion of commitment among agents, published in 1990, and the second was to develop and implement a negotiation logic based on joint proof.
In 1990, I wrote another long position paper "What I have in mind" which discussed a comprehensive set of psychological ideas such as sequential processing at the top level, emotions as mental states, and the representational needs of social interaction. This was an attempt to get closure on a creative synthesis. At that time, I used to meet with Robert Stoller for discussions on psychoanalytic theory. I had however gradually realized that I needed concepts and constraints from the hardware level, i.e., brain anatomy and physiology, in order to develop a good computational model.
In 1992, I went to work in Tokyo at the Sony Computer Science Laboratory as Sony Sabbatical Chair, where I wrote down a parallel architecture that was inspired by the blackboard model and also the modular architecture of the brain. This was my proposed alternative to the subsumption architecture of Rodney Brooks at MIT. I implemented my model in 1993 while working at The Aerospace Corporation, El Segundo, California, and based it on joint action with other agents. After this, I extended the model to represent space, and developed a confirmation mechanism which allowed brain modules to coordinate their activities. When I started work at Caltech in the computer vision laboratory of Pietro Perona in 1996, I improved its efficiency to allow the development of applications. Its cycle time and response time were reduced to about 100 milliseconds. I also extended the model to do problem solving.
I submitted my first journal paper in 1993, but this was rejected, as were several other papers that I submitted, in the period 1994 to 1999, to neuroscience and to computer science journals. This rejection continues to this day, but I have found ways to get papers published. The objection of biologists seems to be that I am getting the computer science wrong, and of computer scientists that I am getting the biology wrong.
My first peer-reviewed paper was published in 1996 at a conference at NIST, the National Institute of Standards and Technology, and my first peer-reviewed archival journal paper was published in December 1999 in the American Journal of Primatology. I am very grateful to Michael Raleigh and Debbie Pollack for their support during this time. So this was my decade of the brain.
This was the first version of the model. A second version and its application to problem-solving behavior were published in 2001 at the CNS*01 conference. This book also presents a third version, what we call the dynamic version or core model, providing for episodic memory.
1.6 Overview of this book and my model of the brain
The rest of the book is devoted to describing the model more explicitly and precisely, showing how it works, and explaining why I believe it is a good model of the brain.
The book is organized as six parts containing 33 chapters altogether.
Part I explains the essential foundation of this research. Chapters 2 and 3 introduce primate behavior and the primate brain. Chapter 4 gives a very brief history of the origins of system thinking about the brain by neurologists and psychologists, from the late nineteenth century up to the present day. Chapter 5 gives a brief historical introduction to formal description methods, which include logic, computer science and artificial intelligence, and which underlie my scientific approach.
Then, chapter 7 explains what computer science is and, in particular, description techniques and layering of descriptions. Chapter 10 defines my concept of brain science, as a layered set of different disciplines which use precise formal mathematical theories and have definition- and explanation-interfaces between them.
Finally, chapter 8 examines computer science concepts and proposes a set of concepts for describing information processing in the brain.
Part II develops my theory and model of the primate neocortex. Chapter 11 explains concepts in information-processing analysis derived from computer science and artificial intelligence, including the concepts of plan, goal, frame, sequence and abstraction.
Dividing the brain into parts.
1. The brain will mean the human brain, although a lot of our knowledge of the anatomy and physiology of the brain comes from studying the rhesus monkey brain, which is very similar in structure and biological mechanism, although a lot smaller.
2. We will develop our model of the brain in parts, as in Figure 1.1:
(a) the cortex, without learning mechanisms and excluding language areas, which we will treat in Part II.
(b) the learning modules - the hippocampal formation and the basal ganglia, which we will treat in Part IV.
(c) the language areas of the cortex, treated in Part V.
(d) subcortical areas involved in motivation and control - the hypothalamus and amygdala, also treated in Part V.
(e) the rest of the central nervous system - the rest of the diencephalon, the brainstem, etc. - which processes incoming and outgoing data, controls arousal, and so on.
Analyzing the primate neocortex. Chapter 12 applies these methods to analyzing the perceptual hierarchies of the primate neocortex, and Chapter 13 analyzes frontal areas of the neocortex as an action hierarchy. Connections between the perception and action hierarchies show that architecturally the neocortex forms a perception-action control hierarchy.
My abstract system description method. Chapter 14 discusses issues in the precise description of information processing in the brain.
My initial model. Chapter 15 describes my initial model of the primate neocortex, how I realized it as a computer program, and the behaviors I was able to model: (i) the cortex forms a perception-action hierarchical control system for perceiving and acting, (ii) it has short term memory in each module, (iii) it is motivated by cortical affiliative goals and produces social behaviors, (iv) it has built-in long term memory of social plans, and it executes these plans, and (v) perception is very simplified.
Figure 1.1: Dividing the brain into four main parts: perception and action systems; language systems; learning systems (basal ganglia, hippocampus); and subcortical survival systems
The modeled behaviors were all social and involved more than one modeled primate which interacted in a spatial environment. They included social affiliation, social conflict and social spacing behaviors.
Part III concerns mental dynamics. Problem-solving behaviors. Chapter 16 describes how I extended the model to do problem solving, notably for the Tower of Hanoi problem. This involved adding a working memory as part of the planning module, and also running each module to quiescence within one time step, to obtain as much data coherence as possible in the messages being exchanged.
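The run-to-quiescence scheme can be sketched as follows: within each global time step, every module fires its rules until it has nothing new to add, and only then are its results passed on. The module contents, the rule format and the Tower-of-Hanoi-flavored data are illustrative assumptions, not the actual model.

```python
# Sketch: each module is run to quiescence within a time step, so the data
# it passes to other modules is as internally coherent as possible.

class Module:
    def __init__(self, name, rules):
        self.name, self.rules, self.store = name, rules, set()

    def run_to_quiescence(self):
        changed = True
        while changed:
            changed = False
            for rule in self.rules:
                new = rule(self.store)      # a rule returns derived items
                if not new <= self.store:
                    self.store |= new
                    changed = True

def step(module, inputs):
    """One time step for one module: deliver inputs, run to quiescence."""
    module.store |= inputs
    module.run_to_quiescence()

perception = Module("perception",
    [lambda s: {"percept:disk_on_peg"} if "input:image" in s else set()])
planning = Module("planning",
    [lambda s: {"plan:move_disk"} if "percept:disk_on_peg" in s else set()])

step(perception, {"input:image"})
# Only now, with perception quiescent, is its percept passed to planning.
step(planning, perception.store & {"percept:disk_on_peg"})
```

Because each module settles before its output is used, downstream modules never act on a half-computed description from upstream.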
My programming language for brain models. Chapter 17 explains a programming language that I developed which allows brain models to be developed much more quickly.
Theoretical issues. Chapter 18 contains a discussion of mathematical properties of logical computation, Chapter 19 of the notion of symbol, and Chapter 20 of motivation mechanisms operating at the cortical level in my model.
The layer, neuron and cell dynamical levels of description. Chapter 21 describes how I can relate my abstract system level of description of the brain to more detailed levels of description. I take these levels to be (i) cortical layers as associative processors, then (ii) neurons and neural nets, and then (iii) the dynamics within the neuron.
Part IV concerns memory mechanisms. My dynamic model. Chapter 22 explains my analysis of episodic-memory mechanisms and how I extended my model to a dynamic model with a hippocampal complex, and Chapter 23 introduces and explains the notion of context, which incorporates descriptions of plans. Chapter 24 describes how episodes are learned and how the system can learn by doing. Chapter 25 explains my analysis of procedural memory and how I extended my model to include a model of the basal ganglia, where procedures are learned as associations of stimuli to actions.
1. There are two learning modules, the hippocampal formation, which forms event memories, and the basal ganglia, which form low-level procedural memories of routinely repeated action.
2. Event memories are aggregations over time of the input to the hippocampal formation. The current episode can involve events over a period ranging from a few seconds up to an hour or more. The segmentation of the flow of event data is determined in part by the currently evoked context.
3. Routine memory is formed by the basal ganglia from inputs received from the cortex and with output to the frontal lobe of the cortex, which has ultimate control over the use of routine action.
Part V describes possible extensions of the model. Chapters 26, 27, 28 and 29 describe work in progress on extending the model in some important directions, namely vision, language and subcortical systems. These promise that eventually the entire brain can be modeled using the methods I propose.
Vision
1. A more realistic vision system can be added as a hierarchy of vision modules.
2. This generates a representation of the visual percept which has a geometric 2.5-3D form and also an abstract object-file form.
3. The object-file form tends to persist across saccades.
4. Both forms are incorporated into the representation of the current event, and into the current episode.
5. The visual system can be directed by the problem-solving modules, giving top-down attention.
6. The problem-solving system can re-evoke stored episodes and images from long-term memory, to produce mental images, which may combine with incoming visual images from the environment.
Language
1. The language areas of the cortex consist of the highest level modules of the perception-action hierarchy: (a) the phonological input module, (b) the lexicon, (c) the grammar module, (d) the text store, and (e) the phonological output module.
2. The text store forms the highest level of representation, corresponding to narrative.
3. Incoming words evoke corresponding lexical frames from the lexicon, and lexical frames competitively construct a grammatical structure and text.
4. To generate an output sentence, a text is constructed and then from this lexical frames are selected to construct a grammatical structure, which is output sequentially via the phonological output module.
Subcortical systems. My main interests in subcortical areas are the motivational systems. These are control systems which maintain water content (thirst), nutrition (hunger), agonism (aggression and submission mechanisms), attachment (mutual regulation of comfort), and sex (including maternal and paternal behaviors).
These systems are themselves hierarchical and complex, and they mutually interact. They are innate, but with some plasticity. In addition, mechanisms other than neurotransmitters are involved, including hormones and other factors.
Part VI contains material which attempts to summarize and to draw conclusions. Consciousness. In chapter 30, I make a few brief remarks about how my model might provide insights for the scientific study of consciousness.
A computer science for the brain. Chapter 31 describes the concepts and methods I have developed which constitute a computer science for the brain.
Summary and conclusion. In chapter 33, I collect together the conclusions of different parts of the book to give in one place a brief summary of the research presented, the theory of the brain I have developed, and the model I have constructed.

Chapter 2

Primate behavior
Abstract. In this chapter, I discuss the characteristics of primate behavior and their social basis. Most primate behavior is in fact social, and primates have detailed knowledge of each other's social behavior and social status.
2.1 From nonhuman to human primates
My strategy is to first model the nonhuman primate brain and then to extend it to the human case. The brains of all primates are very similar in structure and function. They have similar neural areas and similar interconnectivity among these areas. The cortical neuronal circuitry is very similar for all primates. It uses the same types of cell and the same neurotransmitter mechanisms.
Figure 2.1 diagrams the brains of a carnivore, a prosimian, a simian (i.e., monkey), and a human.
Figure 2.1: The similarity of the brains of primates, from [Brodmann, 1909] via [Bullock, 1977], Fig 10.92, p. 487
The behavior of nonhuman primates is simpler than that of humans; however, their subcortical motivational systems are quite similar, so it is reasonable to take the innate motivational systems of nonhuman primates as essentially the same as those of humans.
Basic social behaviors in nonhuman primates are similar to those in humans - interactions, dominance hierarchies, social support and affiliation. Thus, it is reasonable to take these basic behaviors in nonhuman primates as a basis for extension to human behaviors. Social groupings of 30 or so individuals are similar in human hunter-gatherer societies and nonhuman primate troupes.
Thus, I would like to argue for the simplified view that the human neonate has behaviors similar to those of a nonhuman primate neonate, and that differences in cortical learning then cause them to diverge as they develop.
One can argue for greater differences. Humans have much larger brains and have language. There is a much longer period of immaturity. There is thus a much greater degree of cortical control of behavior.
Humans have a much greater range of facial expressions. Humans have better vision. Humans have detailed knowledge and memory of events. Humans may be using some additional underlying learning mechanisms.
Thus my research plan is first to build a model of a monkey brain and get it to show monkey behaviors. After this, I will extend the model to have human problem-solving abilities, and then memory, language abilities and greater knowledge-representation abilities.
In the next sections, I will try to take stock of what nonhuman primate behaviors my model will need to demonstrate initially.
2.2 Biology and development
I will be describing a very complex biological system. There will be brain areas and anatomical connections, and there will be incoming data from senses as well as outgoing data to the body's effectors. Neural connectivity is not the only consideration; there will also be other phenomena at the level of the cell, such as different neurotransmitters, hormones, opiates, etc.
The brain has many different components, which have all evolved to their current state, and probably constitute a best compromise under the current circumstances.
In biology, development is a major theme, if not the major theme. Development includes both phylogenetic development - how the brain evolved - and ontogenetic development - how an individual human's brain changes from conception on. An awareness of development will help us to appreciate the different brain components, and brain organization and function.
2.3 The evolution of primates
Primates evolved as arboreal mammals in tropical zones. The earlier forms are prosimians and the later ones simians. They also divide into New World and Old World species, since during their evolution the continents drifted apart. After simians, we get the hominoids - the apes - and then the hominids leading to modern man.
It is difficult to characterize exactly what a primate is; however, defining characteristics of primates seem to include: larger brains with a lot of six-layer cortex, eyes forward, facial expression (some have movable upper lips), sophisticated vision, including color vision, developed social dynamics (some prosimians are less social), and developed hands and feet for manipulation.
The defining characteristics of man seem to be: language, very large brains, bipedal locomotion, and, according to John Eccles [Eccles, 1989], altruism and a greater ability to learn.
Note that all primates continued to evolve until the present day, so present-day monkeys are different from earlier monkeys, etc. In fact there were at one time many other types of monkey and ape, quite different from present day monkeys and apes, which became extinct.
2.3.1 Present day primates
There are currently one hundred and twenty or so species of monkeys, the characteristics and behaviors of about thirty or so of which have now been studied in detail by field primatologists [Fedigan, 1992] [Fedigan and Strum, 1997] [Rodseth et al., 1991b] [Rodseth et al., 1991a].
Most is known about three Old World species, the rhesus macaque, the Japanese macaque and the vervet. The most studied New World species are the squirrel monkey and the owl monkey.
There are very few ape species. Baboons, though they are Old World monkeys rather than apes, have also been much studied. The great apes are all endangered - the chimpanzee and the gorilla in Africa and the orang-utan in South East Asia.
The chimpanzee has been much studied, being the smartest and genetically much the closest to humans.
2.4 Primate behaviors
For vervet monkeys, the classic primatological studies are due to K. R. L. Hall and Stephen Gartlan [Hall and Gartlan, 1965] and Thomas Struhsaker [Struhsaker, 1967c] [Struhsaker, 1967b] [Struhsaker, 1967a], who described basic behaviors, social relations and vocalization. Michael McGuire described the vervets on the island of St. Kitts, see his book [McGuire, 1974] and film “The St. Kitts vervets”. The importance of affiliation in determining behavior was noted and modeled by Robert Seyfarth [Seyfarth, 1977]. The ability to represent social relations and to generalize them has been described by Dorothy Cheney and Robert Seyfarth [Cheney and Seyfarth, 1990b], who have also contributed a comprehensive book on the vervet [Cheney and Seyfarth, 1990a].
Struhsaker noted 60 different detailed action types in 12 different stimulus situations. He gave a table indicating which detailed actions occurred in which situations, and gave an interpretation in terms of a message and a response implied by the action. Situations were also differentiated according to the age and sex of the vervets involved. He used 5 age groups, namely, adult, subadult, juvenile, young juvenile and infant. He also gave a list of 47 different vocalization types and gave a table relating them to the situations in which they occurred and the meaning implied, expressed as a message and an accompanying action. Vervets form troupes of about 40 animals; vervet males migrate between troupes about every five years, and the system of social relations is based on grooming. The males and females are fairly evenly matched and the social system has a female hierarchy with a male hierarchy below this.
For rhesus monkeys, more precisely rhesus macaques, classic studies are by Southwick et al. [Southwick et al., 1965] and Lindburg [Lindburg, 1971]. They are found over a wide area in Asia, including countries from Afghanistan to Vietnam. Rhesus males also migrate between troupes every few years, and the system of social relations is based on grooming. The males and females are less evenly matched than in vervets and the social system is multimale-multifemale. Also, troupes tend to be larger, up to 100 animals. They have a set of vocal calls [Hauser, 1997] and a set of displays including fear grimacing (appeasement), staring with open mouth (threat), tail erect (dominance challenge), lipsmacking (conciliatory) and female (sexual) presentation [Estes, 1991]. Boelkins and Wilson [Boelkins and Wilson, 1972] described intergroup social dynamics for the troupes on the island of Cayo Santiago, near Puerto Rico. There is also a very good documentary film on these monkeys called "Monkey Island".
Available data. The only available field data on monkey behavior, apart from film, is frequency and occurrence data of behavior types. The basic method of studying primates in the field is to develop a taxonomy of behavior types, usually fewer than 100 in number, and then to record the behaviors of each monkey over time in terms of this taxonomy. I have participated as an experimental observer in experiments at the Veterans' Administration primate facility at Sepulveda, California. There was a set of observers, and each observer looked only at one particular monkey. Every sixty seconds, on a signal from the lead experimentalist, you record the code for the behavior that your monkey is doing at that moment. We each had a handheld computer into which we entered the code. This data was then uploaded into a larger computer for analysis. From this raw event data, various statistics can be calculated on frequencies, correlations, probabilities of given sequences, etc.
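From such coded sampling records, the frequency statistics mentioned above can be computed directly. A minimal sketch, with made-up behavior codes and data (the actual taxonomies used in the field are much larger):

```python
# Computing behavior frequencies from instantaneous-sampling data:
# one behavior code per monkey per 60-second sample point.
from collections import Counter

# One observer's record for one monkey: the code at each sample point.
samples = ["groom", "forage", "groom", "rest", "groom", "forage"]

counts = Counter(samples)
total = len(samples)
frequencies = {code: n / total for code, n in counts.items()}
# e.g. frequencies["groom"] is the proportion of sample points at which
# this monkey was observed grooming.
```

Correlations and sequence probabilities are computed from the same raw records, for example by counting pairs of codes at successive sample points.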
Cognitive abilities of captured animals are studied in more standard psychological experimental situations. The animal is placed in a restraining chair and is fed juice or something similar as a reward. It is then trained to do an experimental task and measures of task performance are recorded. Experiments can also be done under less restrained conditions, such as in a large cage.
There is also a genre of experiments where a single electrode, a fine wire, is implanted in some area of interest in the animal's brain and electrical readings are taken as the animal behaves. Experiments are beginning to be done in MRI imaging devices specially designed for nonhuman primates.
For chimpanzees, detailed videos are used, but usually for solitary cognitive activities and not for social behaviors. Usually these videos are not subjected to detailed second-by-second analysis.
For humans, detailed videos of social interaction are obtained, and subjected to detailed second-by-second analysis. Two key subareas are conversation analysis, and mother-infant interaction.
Range of behaviors. It seems that nonhuman primates exhibit a small number of different classes of behavior types: (i) feeding, including foraging, i.e., searching, hoarding, sharing, etc. (ii) agonism, including dominance struggles, avoidance and flight (iii) affiliation and attachment (iv) sex, including long-term sexual relations (v) infant caretaking, sometimes called infant handling, by all members of a troupe.
Knowledge of each other. Primates have detailed knowledge of each other: they recognize each other’s individuality, can recognize each other at a distance, recognize each other’s particular vocal calls, and so on. This knowledge includes: (i) their health (ii) reproductive status (iii) affiliative, agonistic and sexual relationships (iv) family status; for example, after a defeat in a fight an animal will take revenge on a weaker family member, usually a sibling of the animal that defeated it. This is called redirection.
Vocalization. Nonhuman primates use vocal calls. Typically, depending on the species,
there are about 50 types of calls, each with some social meaning. Within a given type, there will be variation of strength and duration, and as already mentioned, each individual animal has its own recognizable voice.
Social behavior. Note that essentially all nonhuman primate behavior is social behav- ior. Even apparently solitary behavior such as feeding is modulated and governed by the social context in which it occurs.
Thus the behavior of an individual is in general strongly coordinated with the behavior of others, and motivation of behavior is derived from the perception and knowledge of social relationships.
2.5 Societal dynamics
Troupes. Primates tend to form troupes of about 30 animals (rhesus can be > 100 animals, prosimians almost solitary, human hunter-gatherers 30-50).
Social hierarchies. Within a troupe, they dynamically maintain social hierarchies depending on the species and to some extent the environment, and the individual animals involved. Dominance carries with it the responsibility of leadership and protection of the troupe.
Dimorphism. In dimorphic species the males are much bigger than the females, e.g., rhesus males are 2 to 3 times the weight of females, and such species form male dominance hierarchies. Less dimorphic species such as vervets typically form separate female and male hierarchies, with the female hierarchy often dominating the male hierarchy.
Social migration. A given geographical region will have several troupes. Each troupe has a territorial range that it defends and which may change slowly, or in some species the troupe may continually move to new areas. When juvenile males reach maturity, at about 4-5 years of age, most of them leave the troupe of their birth and attempt to enter another troupe. This involves winning a place in the social dominance hierarchy. After a further five years or so they move to another troupe, and so on.
Social conflict. Each individual has certain goals and needs, and tries to satisfy these goals. It enters into social conflict with others as a result. To the extent that the outcomes of struggles show regularities, animals can form and be guided by memories of affiliative and agonistic solutions. This memory and behavior forms the social system.
Understanding dominant and subordinate behavior. Robert Sapolsky’s summary [Sapolsky, 1990] of the behavior of dominant baboon males has five categories. A domi- nant animal is more likely than a subordinate animal: (i) to differentiate between threatening and neutral interactions, (ii) to initiate a fight with a threatening rival, (iii) to initiate a fight he wins, (iv) to differentiate between winning and losing a fight, and (v) to successfully redirect aggression after losing a fight.
Dominance. Leader animals will protect the troupe against predators, and make difficult decisions concerning troupe movement, food, and other life problems. However, animals in subordinate positions tend to be more stressed and depressed. Work was done on the presence of the neurotransmitter serotonin in monkeys: it was found that dominant monkeys had more of it, that adding serotonin tended to make monkeys more dominant, and that it was disproportionately present in their orbital frontal cortices [McGuire et al., 1986] [Raleigh, 1987]. This led to the use in humans of fluoxetine, a serotonin reuptake inhibitor which enhances synaptic serotonin levels, sold under the brand name Prozac.
Female hierarchies are determined by birth order, the most recently born female being
placed above her elder sisters. This birth-order hierarchy will often be modified by individual personalities, strengths and weaknesses.
Sexual competition. Nonhuman primate females continue to have offspring into old age; only humans have menopause. The number of males is often smaller than the number of females, owing to large casualty rates for migrating males, 30-50% of whom often die, so that the ratio of females to males can approach two to one. This shapes the competition among males for fertile females.
Affiliative behavior. Primates spend a great proportion of their time in affiliative behavior, which maintains their relationships, their position in their society, and the cohesion of their society.
Nonhuman primate troupes have been compared to groups of humans aged 10-12 (“junior high”) in the assiduousness and ubiquity of their socialization.
It is also clear that, in many species, animals achieve leadership through the strength of their affiliative relations [Kummer, 1975]. Males often only become dominant because of female support in addition to male support. Support stems from successful affiliative activity.
Troupe dynamics. A troupe will move around foraging for food and finding shelter in order to survive. It will need to defend itself against predators. There will be some inter-troupe conflict also.
As new animals are born and mature, and old animals grow weaker and eventually die, the pattern of affiliative, agonistic, sexual and caretaking relationships dynamically adjusts by a myriad of daily social encounters and tests of strength. As a result, the troupe society continuously restructures itself.
Survival. In order for a troupe, and the species, to survive, this process should sustain the life of its individuals. Survival includes getting adequate food and shelter, protection from predators, and replacement of animals by conception, birth, growth and development.

Chapter 3

The primate brain
Abstract. I first give a brief account of the evolution of the primate brain, and point out the evolutionary origins of the different parts of the brain. I start from simple creatures and show the emergence of the reptile brain and then the emergence of six-layer cortex and the primate brain.
I then briefly discuss the structure and functioning of the primate brain. I give an outline architecture for the brain, making generalizations which necessarily simplify but which capture mainstream thinking. I discuss the neocortex, its division into areas and its layered structure.
3.1 The evolution of the primate brain
3.1.1 Invertebrates
The evolutionary sequence is something like this: (i) distributed nets of neurons, as in jellyfish, (ii) then a concentration of neurons in the head to produce a very simple brain with some coordination of motor signals via a medulla, (iii) internal organs controlled by neurosecretion; however, there is not a true hypothalamus.
3.1.2 Vertebrates
With the evolution of vertebrates, it seems that most of the modern arrangement came into being immediately, although with simple components. The neurons were now arranged into a systematic layered structure in the form of a sheet, or cortex. This sheet actually develops as a tube. The brain developed as a sequence of extensions of this tube called, respectively, the myelencephalon, the metencephalon, the mesencephalon, the diencephalon and finally the telencephalon, see Figure 3.1.
Thus the latest stage was the telencephalon, which had a pallium (roof) which would develop into the cortex, and a subpallium (floor) which developed into the basal ganglia. The thalamus concentrated input from sensors and from the lower brain and projected to the pallium and subpallium. Thus the cortex, basal ganglia and thalamus existed from the beginning. These components then evolved and increased in complexity by the addition of nuclei. This design reached a very successful form in the reptile brain, which is diagrammed in Figure 3.2, from [Romer and Parsons, 1986].
Figure 3.1: The telencephalon, from [Bullock, 1977]
The characteristics of the brains of early vertebrates include: (i) some coordination of sensory signals via a thalamus. The thalamus is present in very simple vertebrates such as early fish, including some anamniotes 1. (ii) the control of internal organs and endocrine system by a hypothalamus, (iii) much better motor control, by a cerebellum, developed originally in early sharks, (iv) the development of an olfactory cortex which differentiated from the pallium and is called the paleocortex or piriform cortex. (v) the diencephalon provided more vision processing using the superior colliculus and more auditory processing using the inferior colliculus, thus vision and audition were not handled by the cortex at this stage. (vi) the differentiation of the pallium into neocortex and archicortex. The archicortex becomes the hippocampus and moves to the medial surface of the hemisphere.
1The embryo of an amniote develops within an amniotic sac, the containing membrane being the amnion; anamniotes do not. Later animals, including humans, are amniotes.
Figure 3.2: The reptile brain, from [Romer and Parsons, 1986]
We show the development of the cortex in Figure 3.3, taken from Romer and Parsons’ book “The vertebrate body” [Romer and Parsons, 1986].
Figure 3.4, taken from Shepherd’s book, “The synaptic organization of the brain” [Shepherd, 1998], shows the three-layer organization of the reptile cortex. The cortex is made up of millions of interconnected circuits of this type, arranged side by side to form a sheet.
As amphibia evolved into reptiles, the three-layer design with the two different types of three-layer cortex, i.e., paleocortex (the paleopallium) and archicortex (the archipallium), was very successful and lasted a long time, including the age of the dinosaurs. The two types of cortex have different connectivity patterns with the rest of the brain. The paleocortex connects directly to the amygdala, thalamus and hypothalamus, whereas the archicortex connects mainly to sensory inputs. At the same time, the basal ganglia differentiated from the paleopallium.
3.1.3 Mammals
The next stage was the emergence of six-layer cortex as reptiles evolved into advanced reptiles, monotremes, marsupials and early mammals.
Figure 3.5, taken from Dart’s work [Dart, 1934], shows the cortex of an advanced reptile with two small areas of six-layer cortex extending between the olfactory cortex and the hippocampus.
Figure 3.6 shows the different types of cells in six layer cortex, taken from Rodney Douglas and Kevan Martin’s work [Douglas and Martin, 1998].
Figure 3.7, taken from Reiner’s work [Reiner, 1993] shows one account of the correspon- dence of six to three layer cortex.
3.1.4 Primates
With the advent of primates, evolving from advanced mammals, we get the rapid growth of six-layer neocortex until it dominates the brain. Figure 3.8 shows Stephan and Andy’s data on the relative growth of different parts of the brain, taken from Eccles’ book [Eccles, 1989]. To give a baseline, they took what they hoped would be a neutral or vanilla mammal, which they called a “basal insectivore”, i.e., a mammal that had a simple structure and behavior and spent its life eating insects. A hedgehog is pretty close to this, I believe. Then they measured the sizes of different parts of the brain for different species and compared them with the brain of a basal insectivore.
Figure 3.3: The sequence of cortical evolution, from [Romer and Parsons, 1986]
Figure 3.4: The circuit of three layer cortex, from [Shepherd, 1998]
Figure 3.5: The emergence of six layer cortex in reptiles, from [Dart, 1934]
Figure 3.6: Types of cells in six layer cortex, from [Douglas and Martin, 1998]
Figure 3.7: The evolution from three to six cortical layers, from [Reiner, 1993]
Figure 3.8: Stephan and Andy’s data on the relative growth of different parts of the brain, from [Eccles, 1989]
Thus, referring again to Figure 3.3, 1. the original cortex is the paleopallium which ends up as the piriform cortex, 2. the archipallium develops out of the paleopallium and becomes the hippocampus, 3. the basal ganglia develop out of the paleopallium and move to the interior, and 4. the neopallium develops from the archipallium and paleopallium and becomes the neocortex.
Todd Preuss has analyzed the neocortices of prosimians, simians and humans [Preuss and Goldman-Rakic, 1991a], and more recently the chimpanzee brain. Figure 3.9 shows newly evolved cortical areas of a simian species.
3.1.5 Modern evolution theory
The set of animals with a particular set of genes is called a genotype. The set of animals with the same physical expression of their genes is called a phenotype, so a phenotype may correspond to more than one genotype. The success in reproduction and survival of a given set of animals is called its fitness, and is determined by its phenotype.
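A toy sketch may make this many-to-one relation concrete. The genotypes, trait values and fitness numbers below are invented; the point is only that distinct genotypes can share one phenotype, and that fitness is determined at the phenotype level.

```python
# Hypothetical one-locus example: two genotypes express the same phenotype.
genotype_to_phenotype = {
    "AA": "tall",
    "Aa": "tall",    # a distinct genotype with the same physical expression
    "aa": "short",
}

# Fitness is assigned to phenotypes, not genotypes (values are invented).
phenotype_fitness = {"tall": 1.2, "short": 1.0}

def fitness_of(genotype):
    """Fitness of a genotype, mediated entirely by its phenotype."""
    return phenotype_fitness[genotype_to_phenotype[genotype]]
```

Since selection acts only through the phenotype, the genotypes "AA" and "Aa" here are indistinguishable to it.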
The last few decades have seen a revolution in methods and in understanding of evolution [Ridley, 1993] [Stearns and Hoekstra, 2000] [Northcutt, 1991] [Butler and Hodos, 1996]: (i) the recognition of evolutionary changes which had no impact on fitness, so-called neutral genetic drift [Kimura, 1983] [Gillespie, 1992], (ii) the development of cladistic analysis, in which evolutionary development is represented as a binary tree with each step being the change in one dominant feature of the phenotype [Butler, 1994], (iii) the use of DNA sequencing to discover genetic connections among species, (iv) the analysis of changes in terms of the evolution of the molecules involved in the development and functioning of the animal, and (v) the analysis of evolution by types of neurons, which are determined by particular genes, and which make genetically determined topological connections, independently of geometry [Deacon, 1990].
These advances have also disproved several classical assumptions: (i) that present-day animals are arranged in a linear scale of complexity ending in man, the scala naturae assumption, also termed orthogenesis; animals have actually evolved in parallel with man and with similar growth in complexity. Corresponding parallel evolution of similar complexity is termed homoplasy, as distinguished from the relationship between a simpler and a more complex version of a feature, which is called homology. (ii) that various brain organs emerged one at a time, whereas with vertebrates the entire layout of the telencephalon emerged at once. (iii) that the telencephalon of early animals was dominated by olfactory connections.
3.2 The structure of the cortex
Neocortical areas. The neocortex in all primates is arranged as specialized areas, about fifty in each hemisphere. Each area has a few million neurons, is a few millimeters across, and 4 to 6 millimeters thick. Areas have distinct information-processing functions.
Figure 3.10 shows Brodmann’s numbering of the different areas that he identified in the human neocortex.
The structure of the neocortical areas. The primate neocortex comprises neurons from a small set of anatomical cell types, shown in Figure 3.11. Cells of a given type are generated by a particular set of genes.
In addition to pyramidal cells there are several other types of cross-connecting neurons, or interneurons, in numbers similar to the pyramidal cells. These form all possible circuits and connections between pyramidal cells in the same and different layers. The fan-in ratio of connections to any given pyramidal cell can be as high as 10,000.
Areas have specific fixed interconnectivity. I will discuss the experimental evidence later in chapter 12, for a pattern of connectivity among areas, which appears to be the same, or similar, for all primates. A connection between two areas consists of about one million axons from pyramidal cells in a cortical layer in the source area to another cortical layer in the target area. Each area is typically connected to a small number of other areas. Connections divide into long range and short range. At short range, an area is often connected to several neighboring areas that are contiguous with it. At long range, an area is usually connected to one, two or three areas that are further away, and not contiguous with it.
Connectivity among the different neocortical areas and also subcortical areas. Figure 3.12 diagrams how each layer has a different connectivity role. Layers 1 and 4 receive inputs, from other cortical areas and the thalamus, and layers 2, 3, 5 and 6 contain pyramidal cells generating outputs. The targets of these outputs depend on the layer: layer 2 to cortical areas in the same hemisphere, layer 3 to cortical areas in the opposite hemisphere, layer 5 to the thalamus, and layer 6 to subcortical areas including the basal ganglia.
There seem to be anatomical arrangements for hierarchical processing. The forward ascending connections from prior areas enter at layer 4 and the feedback descending connections from later areas enter at layer 1 [Felleman and Essen, 1991].
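The layer-specific rule just stated can be written down as a small classifier. The three visual areas named below are used only as a stock example of a cortical hierarchy; the rule itself (ascending input enters layer 4, descending feedback enters layer 1) is the one from the text.

```python
# An illustrative cortical hierarchy, earliest area first.
HIERARCHY = ["V1", "V2", "V4"]

def connection_type(source, target):
    """Classify a cortico-cortical connection by the layer it terminates
    in, following the ascending/descending rule described in the text."""
    if HIERARCHY.index(source) < HIERARCHY.index(target):
        return {"direction": "feedforward", "enters at": "layer 4"}
    return {"direction": "feedback", "enters at": "layer 1"}
```

For example, V1 → V2 is ascending and so terminates in layer 4 of V2, while V4 → V2 is descending and terminates in layer 1.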
3.3 The uniform process of the neocortex
3.3.1 There is a uniform process
The primate cortex has a uniform structure over all of its area [Creutzfeldt, 1978] [Ullman, 1991], having a six-layer organization comprising neurons from a small set of cell types. The numbers of these cells per unit volume are very uniform over the cortical surface, the main differences being in motor cortex, which has more and larger pyramidal cells, and in visual cortex, which has a significantly greater density of cells, about three times the norm. A canonical neocortical circuit can be described [Shepherd and Koch, 1998] [Douglas and Martin, 1998], see Figure 3.13, together with characterized regional variations from the canonical form. The boxes indicate populations of neurons of a given cell type: P2+3 are pyramidal cells in layers 2 and 3, P5+6 are pyramidal cells in layers 5 and 6, 4 are layer 4 stellate cells, and GABA are GABAergic interneurons, i.e., neurons using the neurotransmitter GABA. The blue dashed lines indicate inhibitory connections and the red continuous lines excitatory connections.
Although long-range connectivity, as we have seen, tends to be clustered around cortical regions, short- and medium-range (<3mm) connectivity within one area of the cortex is statistically quite uniform. It therefore appears that information processing within different cortical regions has a common basis or principle.
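The canonical circuit of Figure 3.13 can be summarized as a signed graph over the four populations plus the thalamus. The particular set of edges below is one plausible reading of the Douglas-Martin diagram, not a definitive reconstruction: +1 marks an excitatory connection, -1 an inhibitory one.

```python
# Signed adjacency list for a sketch of the canonical neocortical circuit.
CIRCUIT = {
    ("thalamus", "L4"): +1,    # thalamic afferents excite layer 4 cells
    ("L4", "P2+3"): +1,        # layer 4 relays to superficial pyramidals
    ("P2+3", "P5+6"): +1,      # superficial pyramidals drive deep pyramidals
    ("P5+6", "thalamus"): +1,  # deep pyramidals project back to the thalamus
    ("P2+3", "GABA"): +1,      # pyramidal cells excite the interneuron pool
    ("GABA", "P2+3"): -1,      # GABAergic interneurons inhibit superficial...
    ("GABA", "P5+6"): -1,      # ...and deep pyramidal populations
}

def inputs_to(population):
    """All signed inputs converging on one population."""
    return {src: w for (src, dst), w in CIRCUIT.items() if dst == population}
```

Expressing the circuit this way makes the regional variations mentioned above easy to state as additions or deletions of edges.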
3.3.2 Theories of the uniform process
This subsection is taken mainly from Shimon Ullman’s MIT AI Memo No 1311, December 1991 [Ullman, 1991]. He lists the different theories that have been proposed for the uniform cortical process:
1. The classifying cortex, David Marr [Marr, 1970]. The uniform process is the classification of incoming patterns.
2. The non-linear spatio-temporal filter, Otto Creutzfeldt [Creutzfeldt, 1978]. The uniform process is a filter which links the activity of the thalamus and other afferents to the effectors.
3. The model builder, Horace Barlow [Barlow, 1972] [Barlow, 1990]. The uniform process is the detection and signaling of suspicious coincidences, P(A and B) >> P(A) · P(B), coincidences being used to form internal models of the environment.
4. Multilevel relaxation, David Mumford [Mumford, 1994]. Multiple cortical areas interact to achieve a consistent interpretation of the incoming stimulus.
5. Large-scale associative memory, John Hopfield [Hopfield, 1982].
6. Interpolating memory, Tomaso Poggio [Poggio and Shelton, 1999], James Albus [Albus, 1981].
7. Neuronal group selection, Gerald Edelman [Edelman and Mountcastle, 1978].
8. Sequence-seeking using counter-stream structure, Shimon Ullman [Ullman, 1991] [Ullman, 1996]. The process is a search for a sequence of mappings linking sensory source and target model representations.
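Barlow’s criterion in item 3 is easy to state operationally. The threshold factor below is an arbitrary choice for illustration; the principle is only that a joint event occurring far more often than independence predicts is worth signaling.

```python
def suspicious_coincidence(p_a, p_b, p_ab, factor=10.0):
    """True when A and B co-occur much more often than chance,
    i.e. P(A and B) >> P(A) * P(B); `factor` sets 'much more'."""
    return p_ab > factor * p_a * p_b

# Two features each present 10% of the time co-occur at chance 1% ...
assert not suspicious_coincidence(0.1, 0.1, 0.01)
# ... but co-occurring 20% of the time is a suspicious coincidence.
assert suspicious_coincidence(0.1, 0.1, 0.2)
```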
3.4 Overall components and architecture
Figure 3.14 is a very simplified diagram of the main components of the brain and their connections. An approximate idea of their functions, which will be discussed in a lot more detail later in the book, is as follows:
(i) The neocortex (or, simply, cortex) is a perception-action hierarchy with stored semantic and episodic memories, providing overall control. It is organized as a six-layer sheet (cortex) of neurons.
(ii) The hippocampus provides episodic memory formation with some long-term storage of episodic information. It is organized as a cascade of three subcomponents, each a three-layer cortex.
(iii) The thalamus provides routing of incoming data to the neocortex and from cortex to cortex. It is organized as nine nuclei with limited mutual interaction.
(iv) The basal ganglia are involved in procedural memory and routine motor control. They are situated in loops to and from the neocortex.
(v) The amygdala integrates low-level motivational processing and has connections to orbital frontal neocortex, to the hippocampus, and to the hypothalamus. It is organized as two groups of nuclei.
(vi) The cerebellum is apparently used for smooth motor control and for spatial representation in general. It is organized as a very regular cortex with a large number of cells.
(vii) The hypothalamus is the main work center for creating and maintaining subcortical motivation states. It generates hormones, via the pituitary gland, and sends signals to the autonomic system and skeletal musculature, readying the system for different kinds of action. Hormones are also generated by glands in the body and these affect the hypothalamus and other brain components, including the neocortex, via the blood stream and synaptic receptors.
(viii) The brain stem is concerned with basic functioning and arousal of the system, including sleep.
(ix) The spinal cord concerns communication of sensory and motor data to and from the body.
Figure 3.9: Todd Preuss’s findings of new neocortical areas by comparing a prosimian (a Galago bushbaby) and a simian (a Macaque monkey)
Figure 3.10: Brodmann’s areas of the human neocortex
Figure 3.11: Cell types and internal connectivity in the neocortex, from Shepherd and Koch in [Shepherd, 1998]
layer I: inputs of feedback from other cortical areas and thalamus
layer II: output to other cortical areas in the same hemisphere
layer III: output to other cortical areas in the other hemisphere
layer IV: inputs of feedforward from other cortical areas and thalamus
layer V: outputs to thalamus
layer VI: outputs to subcortical areas
Figure 3.12: External connectivity of the different cortical layers
Figure 3.13: Canonical neocortical circuit, due to Douglas and Martin [Douglas and Martin, 1998]
Figure 3.14: Overview of brain components

Chapter 4
The historical development of system-level approaches to the brain
Abstract. I give a brief historical review of the several strands and disciplines that are relevant to my purpose.
The basis of the study of the brain was laid down in the nineteenth century, and led to a system-level approach to describing the brain, which was used by neurologists in characterizing the pathologies of their patients.
Psychology emerged from philosophy as an experimental discipline, and then divided into experimental psychology, psychiatry and psychotherapy.
More recently the anatomy and connectivity of the brain have been elucidated by the use of lesion, tracing and now imaging techniques.
4.1 Introduction
Although the scientific study of behavior, that is, scientific psychology, only began in the nineteenth century, neurology, the medical understanding and treatment of disorders of the nervous system, has a long history going back before written records [Benton and Joynt, 1960]. In the beginning, there were not only art and philosophy, but also medicine and engineering.
Figure 4.1 attempts to diagram an overview of some of the main researchers and their contributions.
Figure 4.1: Events in the history of neurology and neuropsychology
I will not attempt the impossible task of a clean separation among these different disciplines.
4.2 Neurology, neuroanatomy and neurophysiology
John Locke, 1632-1704, developed a theory of ideas which were connected together by excitatory associations. The basic processes were sensation and reflection: “our senses, conversant about particular sensible objects, do convey into the mind several distinct perceptions of things, according to those various ways wherein those objects do affect them” [Locke, 1690]. These “ideas” are, according to Locke, the fundamental building blocks of all human thought. The process of reflection could be more active and deliberate, including operations of composition and abstraction of ideas.
Johann Friedrich Herbart, 1776-1841, introduced the concept of inhibitory association [Herbart, 1824]. Ideas competed for energy, and the dominant ones corresponded to conscious thought whereas the subordinated ones corresponded to unconscious thoughts. This laid a framework for Freud’s architectural ideas. Herbart’s textbook on psychology was the standard one in Germany in the mid-nineteenth century, and certainly Freud would have read it, for example when working in Meynert’s laboratory in the 1880s. It was Herbart who separated psychological ideas from philosophy; however, he did not propose an experimental methodology. That was left to others such as Wilhelm Max Wundt, 1832-1920 [Wundt, 1863] [Wundt, 1874].
Franz Joseph Gall, 1758-1828, referring to language, opined that “a special organ of the brain presides over this wonderful function” [Gall and Spurzheim, 1810] (vol 4 p. 65).
Paul Broca, 1824-1880, made the first observation of localization of function in the human brain when he examined a patient, named Leborgne, with a speech production aphasia 1, and was then able to examine his brain post mortem, finding damage to a left frontal area. This area was called Broca’s area, and aphasia due to its malfunction was called Broca’s aphasia.

1An aphasia is any problem with the recognition or generation of language.
In 1873, Camillo Golgi, 1843-1926, had been able to extend histological staining techniques for the first time to the very fine cells in the brain, to show its neuronal structure [Golgi, 1873]. The technique was based on the “black reaction” (reazione nera): hardening nervous tissue in potassium bichromate and then impregnating it with silver nitrate, a method now called Golgi staining.
Theodor Meynert, 1833-1892, was a neuroanatomist and developed the first overall architectural scheme for the human brain, in which connections directly to and from the external sensors and effectors were called projection neurons and the other neurons connecting these projection neurons to each other were called association neurons [Meynert, 1884] [Marx, 1970]. In the 1884 edition of his book, he describes three different kinds of cortical cells and a five-layer structure of the cortex.
Carl Wernicke, 1848-1905, was much influenced by Meynert and discovered a corresponding speech perception aphasia due to damage to left lower parietal regions, which were then called Wernicke’s area, the corresponding syndrome being Wernicke’s aphasia [Wernicke, 1874]. Actually this finding had already been published by H. Charlton Bastian, 1837-1915. However, Wernicke also noted that there were aphasias due to the disconnection of the two main speech areas, Broca’s and Wernicke’s areas, caused by damage to the arcuate fasciculus (the bundle of nerves connecting them). The first system diagram was due to Ludwig Lichtheim, 1845-1928 [Lichtheim, 1885]; however, Wernicke developed a more comprehensive system architecture [Eggert, 1977]. This led him to develop a system approach to describing the functioning of the brain and to classify neurological disorders due to malfunction of different subsystem components and the connections among them [Wernicke, 1881] [Wernicke, 1894] [Wernicke, 1886]. This approach was made possible by its grounding in
Meynert’s neuroanatomical architectural framework.
Figure 4.2 is my version of Wernicke’s diagram, based on one of his overview papers [Wernicke, 1886].
Figure 4.2: Wernicke’s system diagram
This diagram allows the explanation of seven different types of aphasia, due to disconnection at the seven indicated points, namely: (1) cortical alexia - loss of ability to read and write, (2) subcortical alexia - loss of ability to read, while writing is unimpaired apart from copying, (3) transcortical alexia - loss of ability to read and write except for copying written material, (4) cortical agraphia - fine control for writing and copying is greatly impaired, but reading is unimpaired, (5) subcortical agraphia - similar to 4,
(6) transcortical agraphia - should prevent spontaneous writing and allow copying, how- ever probably doesn’t exist as a separate condition, and (7) conduction agraphia - reading is undisturbed, normal writing is lost, but “para- graphic” writing may be possible.
Figure 4.3 shows how this diagram would typically be extended to include visual perception and also writing output.
[Diagram: as Figure 4.2, extended with visual representations and seeing organs on the input side, and with copying output for writing and writing organs on the output side.]
Figure 4.3: Extended system diagram
At that time, it was hoped that all psychiatric problems could be dealt with as neurological problems; however, this was not possible, and a separate discipline of psychiatry, not requiring neurological description, was developed by Emil Kraepelin, 1856-1926, while psychoanalysis was of course developed by Sigmund Freud, 1856-1939.
In 1909, the neuroanatomist Korbinian Brodmann, 1868-1918, described the partitioning of the neocortex into distinct areas [Brodmann, 1909]. For primates, there were about 50 areas in each hemisphere, see section 3.2 and Figure 3.10. This brilliant work was based only on the anatomical appearance of the populations of neurons comprising the cortex.
There was another line of conceptualization which concerned the integrated system functioning of the brain and its diseases. Basic concepts of system evolution and dissolution were described by Herbert Spencer, 1820-1903, [Spencer, 1863] (his chapter on dissolution). John Hughlings Jackson, 1834-1911, developed concepts of the hierarchical architecture of the brain and the idea of dissolution as a method of characterizing neurological disorders [Jackson, 1931] [Kennard and Swash, 1989]. Freud acknowledged that he was influenced by Hughlings Jackson.
It was about this time that the forces of the insecure, the overcritical, and the afraid came into action, with the inevitability of winter. Broca's work was trashed by Pierre Marie. In 1905, he examined the brain of Broca's patient, which had been preserved, and concluded that there was little localization. Thereafter, none of the system theorists was allowed to work in the French medical system. In England, the leading neurologist Henry Head described the work of the "diagram makers" as chaos, and replaced it with a detailed empirical methodology with minimal theoretical expression. Wernicke had apparently offended an official while a junior researcher, and thereafter for most of his life he was denied any German federal chair. Tim Shallice [Shallice, 1988] has observed that the names of Broca and Wernicke have survived, whereas those of Marie and Head have fallen into obscurity. In 1984, when CT imaging became available, the still preserved (!) brain of Broca's patient was imaged and studied by Jean-Louis Signoret et al [Signoret et al., 1984] using modern techniques. Broca's description of it was upheld.
Neurological interest in how the brain processes language resurged in the 1950s, for example among Russian neurologists led by Alexander Romanovich Luria, 1902-1977, [Luria, 1970] [Luria, 1978] [Luria, 1980], treating the large numbers of brain-injured soldiers from the second world war. Roman Jakobson, 1896-1982, applied linguistic criteria to describing neurological disorders of language processing [Jakobson and Halle, 1956]. Norman Geschwind, 1926-1984, in America led the resurgence there of system descriptions and the recognition of disconnection syndromes in language processing [Geschwind, 1965] [Geschwind, 1966] [Geschwind et al., 1968].
During the 1950s to 1970s, advances were made in describing the architectural layout of the brain, using lesion methods, i.e., the selective destruction of different parts. This led to the idea of hierarchically connected sequences of brain areas, due to Edward Jones and Thomas Powell in 1970 [Jones and Powell, 1970]. This work confirmed Brodmann's partitioning and brain areas. In the 1970s, 1980s and 1990s, tracing methods were developed which allowed more detailed tracing of the connectivity of the brain and the delineation of the different brain areas. This allowed Deepak Pandya and coworkers [Pandya and Yeterian, 1990] to confirm in greater detail the hierarchical structure of the cortex first indicated by Jones and Powell, see chapter 11. From the 1980s on, starting with Per Roland's PET studies [Roland, 1993], brain imaging has provided a new type of data in which activations of different brain areas are observed under different experimental conditions.
4.3 Psychology
Psychology was given an experimental foundation by Wundt in the 1860s, and then became a separate discipline from neurology and philosophy. In addition, psychiatry and clinical psychology separated at about the same time, so that by the year 1900, there were these three components running concurrently. Psychology proper, henceforth referred to as experimental psychology, included human and animal behavior. By clinical psychology, I mean the study and treatment of neurotic behaviors, and I mean to include Freud and Janet and their colleagues.
The main issues in experimental psychology were memory, thinking, perception, motor skills, and motivation.
The main issues in psychiatry were objective testing and treatment of the severely mentally ill, including psychotics.
The main issues in clinical psychology were hysteria, dissociation and overall severe neurotic conditions. These arose in patients presenting with behavioral problems, and without any known neurological problems.
Emil Kraepelin is justly called "the father of modern psychiatry". He was the first to identify schizophrenia (originally called dementia praecox, meaning early, "precocious", senility), manic-depression (bipolar disorder) and paranoia [Kraepelin, 1921], and he pioneered the use of drugs to treat mental illness. He was also joint discoverer of Alzheimer's disease (which he named after his collaborator, Dr Alois Alzheimer) [Kraepelin, 1922]. Kraepelin presented these and other discoveries in successive editions of his "Psychiatrie: Ein Lehrbuch" [Kraepelin, 1899]. He disagreed with Freud and did not use the notion of the unconscious. His legacy reaches us via his diagnostic manual, which became the standard DSM manual used by all psychiatrists in the USA, and which in some sense defines the field of psychiatry.
Figure 4.4 attempts to diagram and overview some of the main researchers and their contributions.
4.3.1 System models in psychology
There have been many different theoretical bases suggested for psychology. Let me mention two main classes, first the clinical models of Freud and Janet, and then the information-processing models of short-term memory.
[Timeline: contributions from Plato and Aristotle through Hobbes, Wundt, Weber, Fechner, Ebbinghaus, Mach, James, Jung, Binet, Piaget and Bartlett, arranged under the themes of perception, association, memory, thinking/intelligence, emotion, dreams, personality/types, development, consciousness and social psychology.]
Figure 4.4: Events in the history of psychology
Freud. In his lifetime, Sigmund Freud described three different system models, namely, (i) his "Project for a scientific psychology" model [Freud, 1895], in which neurons with psychic energy sought release and formed unconscious, preconscious and conscious parts, (ii) his "transcription" model [Freud, 1900] [Olsen and Koppe, 1988], introduced in his "Interpretation of Dreams" book, in which the brain had a series of information-processing stages and information was rewritten in different forms from one stage to the next, see Figure 4.5, and (iii) his "ego, id and superego" model [Freud, 1923], which had a hierarchical structure with repressed memories being pushed down to an unconscious level.
Figure 4.5: Freud’s diagram of his transcription model, from [Freud, 1900]
Freud acknowledged, see for example [Freud, 1894], the ideas of Pierre Janet, who introduced the idea of dissociation in the early 1890s; however, Freud developed in a different direction, introducing repression instead of dissociation, infantile sexual fantasy instead of different kinds of trauma, and an aggregated unconscious based on psychic energy.
Janet. Janet's dissociation model [Janet, 1886] [Janet, 1891] [Janet, 1894] [Janet, 1898] [Janet, 1901] [Janet, 1904] [Janet, 1907] [Janet, 1919] [Janet, 1971] has better stood the test of time, being embraced as a precursor of modern theories of dissociation, which are used to understand post-traumatic stress disorder [der Hart and Friedman, 1989] [van der Kolk et al., 1996]. An important phenomenon both scientists tried to understand in the nineteenth century was hysteria, in which patients could not perform certain normal actions or else felt compelled to perform certain abnormal actions. Janet did not use the notion of the unconscious; he had a central conscious subsystem with higher levels of processing which could be dissociated laterally from each other. Dissociation was caused by trauma which led to memories that were not integrated into the subject's narrative. Such traumatic memories had different dynamics from normal memories: they could be triggered by specific stimuli, they retained salience and did not fade with time, and they had a tendency to express themselves in behavior either overtly or covertly. Thus the main explanatory idea was that of a failure to integrate memories due to trauma. This could lead to various symptoms and to selective amnesia.
Janet’s clinical treatment consisted of attempts to reintegrate memories, using hypnosis and other techniques. Freud’s main approach was to reveal and make conscious the unconscious memories by free association and by talking out in a therapeutic relationship.
More recent psychotherapeutic work. There has been systematization of defense mechanisms, which eventually led to a more precise description by Kenneth Colby [Colby, 1963]. Later psychotherapeutic models include object relations due to Fairbairn [Fairbairn, 1952], self models due to Kohut [Kohut, 1971], and intersubjective models due to Ferenczi [Sándor, 1955] and, more recently, Stolorow [Stolorow et al., 1987]. Of these, object relations ideas are the most clearly defined; however, they are still some distance from any kind of predictive mathematical model.
Kenneth Colby. Colby also developed a model of paranoia which was successfully implemented as a computer model [Colby et al., 1971]. This used the idea of a dynamic self-esteem variable which tended to be depressed by incoming messages, leading to hostile responses. This very innovative work unfortunately was always dependent on being able to incorporate natural-language understanding, and this formed a barrier to further elaboration of the underlying psychic mechanisms.
Information-processing memory models. Following the development of the concepts of information and channel in the second world war, Donald Broadbent introduced these ideas into psychology [Broadbent, 1958], mainly in the context of understanding attention and auditory perception. This led to the idea of a short-term memory in which incoming data was held and rehearsed, the dominant model being due to Atkinson and Shiffrin [Atkinson and Shiffrin, 1968]. This was systematized further in Morton's Logogen model [Morton, 1970], which used a unified phonological code and storage for short-term memory items. This approach was elaborated further by Shallice [Shallice, 1988], and in general has had explanatory value in understanding dyslexia, for example. During the late 1960s and the 1970s, neuropsychological methods and phenomena were being introduced by Warrington and Shallice [Warrington and Shallice, 1969]. This clinical work discovered a range of phenomena caused by malfunction of various information-processing activities of the brain, in perception and memory. Another development has been working memory and the model of Alan Baddeley, which has auditorily and visually coded stores used for thinking as well as perception [Baddeley, 1986]. These models, although systemic, have been used descriptively using natural language, rather than mathematically with precise computer models. They are not defined completely enough for computer implementation, and the use of more precise models has not been seen as scientifically useful for better understanding the phenomena of interest.

Chapter 5
The history of formal description
Abstract. In this chapter, I regard formal logic, theoretical computer science, and artificial intelligence as forming a unity which stems from the need for precise formal description in mathematics, computer science and natural science.
I trace the desire for precise description through Frege to modern predicate logic. I explain the main concepts of predicate logic, its strengths and weaknesses, and the historical struggle to develop it.
At the same time, I trace the development of theoretical computer science from Babbage through Gödel, Turing and Church to artificial intelligence, modern theoretical computer science and logic programming. This explains how predicate logic became an important method for describing computation.
5.1 Introduction
It is a fundamental property of the human mind to be constantly trying to describe its environment and the events that are occurring, and ultimately to try to describe itself. Marvin Minsky [Minsky, 2003] reminds us that the real problem is being able to describe high-level human thought and consciousness.
Figure 5.1 diagrams some important events in the history of formal description. I have not included neural nets: although mathematical and precise, I do not think that in their present form they can be called formal description methods, since they are based on real analysis.
5.1.1 Using natural language for scientific and mathematical description
Description consists of expression in some external communicable form. In science, this has resulted in the development of precise and technical language which has allowed unambiguous and precise communication of ideas and scientific findings and principles.
For several hundred years, science was also communicated in the common international language of Latin. This was replaced in the eighteenth century by German, French, English, Russian, etc., and scientists usually learned to read these languages. Proficiency in German was an official requirement for entering university to study science in England until the 1960s. The author had to attend "scientific German" classes in high school, since London University, for example, required a pass in this subject for all entering undergraduates. He went to Oxford, which required Latin instead!
[Timeline, in two columns headed "Philosophy and logic" and "Computer science": from Aristotle, Ramon Lull and Leibniz's calculus of ideas, through Frege's 1879 concept script, the paradoxes, Russell and Whitehead (1906), Hilbert's logicism (1900), Gödel's incompleteness theorem (1931), Tarski's model theory (1934), Turing machines (1935), Zermelo-Fraenkel set theory, Church's functional calculus (1941), neural nets (1946), Turing's chess program (1950), cellular automata (1953), von Neumann, finite automata (1954), and the founding of artificial intelligence by Newell, Simon and Minsky (1956), to formal language theory, AI programming languages (1957), Robinson's theory of the real numbers (1961), automata and languages (1962), functional programming, structure definitions, automated theorem proving and the resolution principle (1965), complexity theory (1967), knowledge representation (1967), database theory (1970), Hewitt's Planner, model theory of functions (1973), logic programming (1973), frames (Minsky, 1974), logic grammars (1976), the theory of processes (1978), SOAR (1981), deductive databases (1985), multiagent systems (1988) and statistical natural language learning (1995).]
Figure 5.1: Events in the history of formal description
Technical English involves speaking in a special style and using technical vocabulary and terms that are explicitly defined. This reaches its most developed form in mathematics. The aim is to ensure that there is no ambiguity, and this has been successful; mathematicians have been able to communicate the most subtle and complex ideas and to work together.
5.1.2 Logic in natural language
Before Frege, logic was conceived as using natural language, and Aristotelian syllogisms were used for logical inference. Thus, in reasoning, one matched a syllogism to a natural language sentence and derived a new natural language sentence. Sentences were structured as subject and predicate, which were quantified over separately. Incidentally, Aristotle also included some inductive reasoning and some reasoning by analogy, whereas later logics only include deduction, with induction being treated as a separate logical issue, and analogy being treated as artificial intelligence.
A description then consists of some externalized form, which we can take to be expressions in some language. In order for this to work, the receiver or beholder has to perceive these expressions, and also has to understand the language in which they are expressed, so there are these two aspects, syntax and semantics.
5.2 Formal logic
The yearning for a precise language for describing thought can be traced back to Ramon Llull, but in particular to Gottfried Wilhelm von Leibniz.
Ramon Llull, 1235-1316, was a Majorcan theologian whose major work "Ars Magna" was a set of treatises, including "The Tree of Knowledge" and "The Book of the Ascent and Descent of the Intellect". This was at the height of the Islamic empires, and he read extensively in the Arabic tradition and the logic of Al-Ghazzali. He wrote in Latin, Arabic and Catalan. His books included a design for a reasoning machine: a set of concentric disks with words written on them, which could be rotated to display different statements formed from combinations of these words. His idea was that one statement could be set up on the disks and this would result in other statements also being displayed, thus giving a kind of mechanical inference. Llull tried to use logic and mechanical methods involving symbolic notation and combinatorial diagrams to relate all forms of knowledge. He attempted to reduce Christianity to rational discussion, to prove the dogmas of the Church by logical argument. "Without producing, no man can love, nor can he understand or remember, nor have the power of feeling and being." Ramon Llull, "The Hundred Names of God".
In the seventeenth century, Gottfried Leibniz, 1646-1716, born in Leipzig, envisioned a formal language, a calculus, that would capture and embody all truth and valid reasoning. He called this the characteristica universalis. "If we had it, we should be able to reason in metaphysics and morals in much the same way as in geometry and analysis ... If controversies were to arise, there would be no more need of disputation between two philosophers than between two accountants. For it would suffice to take their pencils in their hands, to sit down to their slates, and to say to each other ... Let us calculate." G. W. Leibniz, from Gerhardt [Gerhardt, 1890]. "Leibniz's vision may have been absurdly ambitious, but the ideal was to influence many subsequent philosophers, most notably, Frege and Russell", from the Stanford Encyclopedia of Philosophy, see also [Russell, 1900]. Leibniz was also interested in numerical calculation, and in 1673 he demonstrated his incomplete calculating machine, which could multiply, divide and extract roots, to the Royal Society.
Friedrich Ludwig Gottlob Frege, 1848-1925, developed the first ever formal language. He called it a "concept script" (Begriffsschrift); his landmark paper, published in 1879 [Frege, 1879], was entitled "Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens", that is, "Concept Notation: A formula language of pure thought, modeled upon that of arithmetic". It had most of the descriptive ideas that are used in predicate logic to this day: boolean operators, predicates and quantifiers as used in modern predicate logic.¹ Frege was the first major proponent of logicism - the view that mathematics is reducible to logic. Thus, all mathematics and science would eventually be expressed as formal logical expressions, with derivations being found by logical inference. Frege's "Grundgesetze der Arithmetik" [Frege, 1893] was an attempt to explicitly derive the laws of arithmetic from logic.
Figure 5.2 shows Frege's concept script and its relation to modern predicate logic. It involved the formal or mathematical use of universal quantifiers, meaning "for all individuals such that", and of existential quantifiers, meaning "there exists an individual such that", individuals being the notional things that the logic makes assertions about.
Frege's was a second order logic. By first order we of course mean involving quantification over individuals only, and by second order involving quantification over predicates as well.
Formal description, that is formal, or mathematical or symbolic, logic, introduced formal languages for scientific communication. By formal we mean a transformation based solely on the form of an expression and not on its meaning. The main aim originally was to make the understanding of the language extremely easy, in fact purely mechanical. Thus, all the recipient had to do was read a formal expression as a sequence of characters, which had a simple syntax. Then in order to use the expression, all they had to do was apply logical schemata which involved only matching one bracketed expression to another. This process could be, and of course eventually was, performed by a machine.
¹ A predicate is a function whose values are truth values, i.e. true or false, and a boolean operator is one which combines truth values, such as and, or and not.
[Table comparing Frege's two-dimensional concept script with modern notation. In modern notation: "it is not the case that F(x)" is ¬F(x); "if F(x) then G(y)" is F(x) ⊃ G(y); "every x is such that F(x)" is (∀x)F(x); "some x is such that F(x)" is (∃x)F(x); "every F is such that F(a)" is (∀F)F(a); and "some F is such that F(a)" is (∃F)F(a).]
Figure 5.2: Frege’s concept script
In 1910, Bertrand Russell, 1872-1970, and Alfred North Whitehead, 1861-1947, developed a symbolic language for describing the mathematics of numbers, with mathematical proofs of their properties [Russell and Whitehead, 1910]. At that time there were a host of difficult paradoxes, most of which stemmed from circular definitions. These were in the main removed by using a typed theory, with a hierarchy of types of individuals and functions. Arguments of given predicates were restricted to certain types of data, and any quantification was over values of some given type. This restriction prevents most circular definitions and statements.
The original logicism postulated that all of mathematics could be derived from a set of logical axioms using some basic rules of inference. One possible set is as follows, taken from [Mendelson, 1979]:
Logical axioms. If A, B and C are any logical formulae, then the following logical statements are always true:
1. A⊃(B⊃A),
2. (A⊃(B⊃C))⊃((A⊃B)⊃(A⊃C)),
3. (¬B⊃¬A)⊃((¬B⊃A)⊃B),
4. (x)A(x)⊃A(t) ² ³, and
5. (x)(A⊃B)⊃(A⊃(x)B) ⁴.
A possible set of rules of inference is:
(i) Modus Ponens: A, (A⊃B) |= B, and
(ii) Generalization: A |= (x)A.
The symbol "⊃" means implies, "¬" means not, "∨" means or, "∧" means and, and "|=" means logically proves. "⊃" and "¬" and statements 1-5 are in the object language, and the symbol "|=" and statements (i) and (ii) are in the metalanguage, in which we express statements about the object language.
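The purely formal character of such inference can be made concrete in a short program. The sketch below is my own illustration, not from the sources cited: formulas are represented as nested tuples, and a set of formulas is closed under Modus Ponens by matching their form alone, never their meaning. The names `implies` and `modus_ponens` are invented for this example.

```python
# Formal inference as pure symbol manipulation: formulas are nested
# tuples, and modus ponens fires whenever both A and (A ⊃ B) are present.

def implies(a, b):
    return ("⊃", a, b)

def modus_ponens(facts):
    """Close a set of formulas under modus ponens: from A and (A ⊃ B), add B."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for f in list(derived):
            # An implication whose antecedent is already derived yields its consequent.
            if isinstance(f, tuple) and f[0] == "⊃" and f[1] in derived:
                if f[2] not in derived:
                    derived.add(f[2])
                    changed = True
    return derived

# Example: from A, B and A ⊃ (B ⊃ C), derive C in two steps.
facts = {"A", "B", implies("A", implies("B", "C"))}
print("C" in modus_ponens(facts))  # True
```

Note that the program never interprets "A" or "⊃"; it only matches bracketed shapes, which is exactly the sense in which such inference could be, and was, mechanized.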
These days it is accepted that in order to do mathematics we need appropriate non-logical axioms, usually called proper axioms, in addition to any logical axioms. For example, for set theory we need about 20 proper axioms which define set theory. In the resolution formulation used in logic programming on computers, to be described in section 6.1, there are actually no logical axioms at all, just proper axioms.
In 1900, David Hilbert, 1862-1943, developed logical descriptions for plane geometry [Hilbert, 1902b] and an approach for the real numbers. Hilbert also articulated his overall philosophy for mathematics, which was that everything should be precisely definable in a formal logical language, and that formal proofs of all theorems using this language should be discoverable. This question of the discoverability of formal proofs was called the Entscheidungsproblem.⁵ A particular instance of such a decision problem was the tenth in the famous set of problems that Hilbert presented to the world in 1900 [Hilbert, 1902a] [Browder, 1976]: "Given a diophantine equation with any number of unknown quantities and with rational integral numerical coefficients: To devise a process according to which it can be determined by a finite number of operations whether the equation is solvable in rational integers." ⁶
² Provided t is free for x in A(x), i.e., if and only if no free occurrence of x in A lies within the scope of any quantifier (y) where y is a variable in t.
³ x occurs free in a formula A if it is not within the scope of any quantifier occurring in A.
⁴ Provided x does not occur free in A.
Other formal axiomatizations include classical mechanics by Hamel in 1903, thermodynamics by Caratheodory in 1909, special relativity by Robb in 1914, and probability theory by Kolmogorov in 1930 [Gray, 2000]. Rudolf Carnap also developed many formal axiomatizations, including ones of biological phenomena [Carnap, 1958]. The axiomatization of quantum mechanics was developed by von Neumann in the 1930s [von Neumann, 1932] [Birkhoff and von Neumann, 1936] [Lacki, 2000], and later by Mackey and others [Mackey, 1963], and quantum field theory by Wightman in the 1960s [Velo and Wightman, 1973]. However, there remained problems with formalizing the notion of set, and also the theory of the real numbers, in first order predicate logic. Axiomatizations of the reals by Hilbert in 1900, and then also by Coolidge and by Tarski, were all in second order logic. However, a workable axiomatization of set theory was developed by Zermelo and Fraenkel in the 1930s.
In 1930, Kurt Gödel, 1906-1978, showed the incompleteness of any such theory of the natural numbers [Kurt Gödel, 1930]. Thus there would always be theorems about the natural numbers that could be stated but could never be proved in the theory.
In 1936, Alan Turing, 1912-1954, set out to develop a discovery process to address Hilbert's Entscheidungsproblem, and devised his Turing machine [Turing, 1936] [Turing, 1937]. This original idea of computation was a mechanical process using a finite alphabet of discrete symbols, and using a discrete time scale. To remind the reader of the Turing machine, referring to Figure 5.3, this is a device which can at any one time be in one of a fixed set of states, and it has a reading head which is always under one square of a tape, of unbounded length. The device operates by reading the symbol under the reading head and looking up, in a fixed state transition diagram, what next state to transition to and what action to take. An action consists of writing a symbol on the tape and then moving right one square, or moving left one square, or not moving, or halting. The notation used is read symbol/write symbol/next state; thus, for example, b/#/R means that if the read/write head reads a b symbol, it writes a # symbol over it and moves one place to the right. The rest of the tape is taken to be filled with # symbols. The example in the figure starts with a string of a symbols and b symbols and decides, true or false, whether it is a string of a's followed by a string of b's of the same length. If true, it writes a T symbol on the tape and halts, and if false, an F symbol.
⁵ eine Scheide - a boundary, scheiden - to separate, entscheiden - to decide
⁶ A diophantine equation is a polynomial equation for which the solutions are required to be integers, for example, to find integers x, y and z such that x⁴ + x²y + y³ = z⁴.
[Diagram: a Turing machine with tape format ####aaabbb#####, a read/write head, and a control diagram with a start state, states for scanning right, testing whether finished at the right end, and scanning left, and a halt state H which writes T (true) or F (false); transitions are labeled in the notation read symbol/write symbol/next state.]
Figure 5.3: Example of a Turing machine
The way this particular Turing machine works is to move right until it finds the first, i.e., leftmost, a, to replace it with a # symbol, and then to move to the right until it finds the first #. It then steps one square to the left, which should hold a b, and replaces it with a #. It then moves left until it finds a # again, which should have an a to its right, or else a #, in which case it has succeeded in recognizing a string of the form aⁿbⁿ. Situations which are not as expected lead to termination with failure to recognize.
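The machine just described can be simulated in a few lines of code. The following sketch is my own reconstruction: the transition table mirrors the prose description above rather than the exact states of the figure, and the names (`run_tm`, `find_a`, and so on) are invented for illustration.

```python
# A table-driven Turing machine recognizing strings of the form a^n b^n.
# Each entry maps (state, read symbol) to (write symbol, move, next state),
# where move is -1 (left), 0 (stay) or +1 (right).

def run_tm(s):
    tape = ["#"] + list(s) + ["#"]   # blank squares are '#'
    pos, state = 1, "find_a"
    delta = {
        ("find_a", "a"): ("#", +1, "scan_right"),
        ("find_a", "#"): ("T", 0, "halt"),      # everything matched: accept
        ("find_a", "b"): ("F", 0, "halt"),      # a b with no matching a: reject
        ("scan_right", "a"): ("a", +1, "scan_right"),
        ("scan_right", "b"): ("b", +1, "scan_right"),
        ("scan_right", "#"): ("#", -1, "expect_b"),
        ("expect_b", "b"): ("#", -1, "scan_left"),
        ("expect_b", "a"): ("F", 0, "halt"),
        ("expect_b", "#"): ("F", 0, "halt"),
        ("scan_left", "a"): ("a", -1, "scan_left"),
        ("scan_left", "b"): ("b", -1, "scan_left"),
        ("scan_left", "#"): ("#", +1, "find_a"),
    }
    while state != "halt":
        write, move, state = delta[(state, tape[pos])]
        tape[pos] = write
        pos += move
    return tape[pos] == "T"          # T means accepted, F means rejected

print([run_tm(s) for s in ["", "ab", "aabb", "aab", "ba"]])
# [True, True, True, False, False]
```

The dictionary `delta` plays the role of the state transition diagram; the while loop is the machine's fixed mechanical cycle of read, look up, write, and move.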
As a result of his construction, the Turing machine, and its use in representing the attempted solution of decidability problems, Turing realized that there would be questions which were impossible to decide by any computational process.
The same device but without the tape, where it just inputs and outputs one symbol each time step, is called an automaton.
Turing was also able to construct a universal Turing machine. As shown in Figure 5.4, this machine could simulate any given Turing machine. The way this is done is to put a description, or code, of the given Turing machine as a sequence of characters on the tape, and then the universal Turing machine can just keep reading what to do next from this code. This encoding of a machine is an example of a program. There are by now many other constructions for universal Turing machines, some of which are very simple; I believe the world record for conciseness has only 6 states.
Kurt Gödel also made the connection between formal proof and symbolic computation. He showed that all computations could be expressed as logical proofs and vice versa, so formal reasoning and computation were the same thing.
These twin developments by Gödel and Turing were a major setback to the Hilbert program for the foundation of mathematics. However, the use of typed theories, and the consistent set theory due to Zermelo and Fraenkel, allow logical approaches to still be used. The Gödel result is also not fatal, because extensions of arithmetic theories can
[Diagram: the tape of the universal Turing machine holds an encoded description of the control diagram of the simulated machine M, the current state of M, and the tape for the computation by M; the universal machine's control diagram repeatedly reads the current state of M and the symbol being read by M, looks up the corresponding transition in the encoded description, writes the next state of M, and writes the corresponding symbol on M's computation tape.]
Figure 5.4: The idea of a universal Turing machine
be developed that will allow the proof of those statements he showed undecidable in his original arithmetic theory; however, these extended theories will in turn have their own undecidable statements.
Four main ideas for describing a discrete information-processing machine were developed in the 1930s, and it remains true today that these are the main ones [Minsky, 1967] [Hermes, 1965] [Davis, 1958]. These are:
(i) The automaton approach [Turing, 1936] [Davis, 1958], which makes state transitions according to some table, and where there is some external storage medium like a tape or tapes, stacks, registers, etc.
(ii) The rule system approach of Post and of Markov [Minsky, 1967] [Hermes, 1965] [Davis, 1958], where there is a set of rules which operate on a working string of symbols. If the set of rules is unordered, this is a Post system; if the rules are in a fixed order and are executed by always scanning from the top after each time step, until an applicable rule is found and executed, this is a Markov algorithm.
(iii) The functional evaluation approach of Church and of Kleene [Church, 1941] [Kleene, 1971], where every expression denotes a function and computation starts with an applicative expression representing the application of a function to some arguments, which themselves represent functions, and reduces this expression, using given reduction operations, until an expression is derived which cannot be reduced any more and is the representation of the value of the original applicative expression. This is the lambda calculus of Church. In the recursive approach of Kleene, a set of recursive definitions is given and evaluated by recursive calling. (iv) The logical proof approach [Kurt Gödel, 1930] [Davis, 1958], where a formal theory is given as a set of expressions and then either one works forward inferring new expressions, or one is given a hypothesis and systematically works backward, developing a proof tree which proves the hypothesis is true. As the reader knows, every one of these approaches for describing information-processing machines can simulate any of the others, and the set of all functions computable by each approach is the same.
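As a concrete illustration of approach (ii), here is a minimal sketch of a Markov algorithm in Python; the rule set at the end is invented purely for illustration.

```python
# A minimal Markov algorithm interpreter: rules are ordered (pattern,
# replacement) pairs.  At each step we scan from the top, apply the
# first rule whose pattern occurs in the working string, and then
# rescan from the top, halting when no rule applies.
def run_markov(rules, string, max_steps=1000):
    for _ in range(max_steps):
        for pattern, replacement in rules:   # always rescan from the top
            if pattern in string:
                string = string.replace(pattern, replacement, 1)
                break
        else:                                # no rule applied: halt
            return string
    raise RuntimeError("step limit exceeded")

# Illustrative rule set: erase adjacent pairs "ab" and "ba", leaving
# only the surplus of a's or b's in the input string.
rules = [("ab", ""), ("ba", "")]
```

For example, `run_markov(rules, "aabbb")` reduces "aabbb" to "abb" and then to "b".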
During this period, the mathematical notion of model was developed, a key advance being made by Alfred Tarski in 1936 [Tarski, 1936] (see [Tarski, 1956]). A formal theory is a set of logical expressions which are asserted to be true. Tarski developed a formal mathematical concept for an interpretation of any given theory, see Figure 5.5. This also gave a precise formal way of defining the truth of a logical statement, namely whether the statement holds in the interpretation. An interpretation for which a theory is true is called a model of the theory. A theory may be true in some interpretations, in which case it is called satisfiable, or it may be true in all possible interpretations, in which case it is called valid, or it may not be true in any interpretation, in which case it is called unsatisfiable.
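Tarski's idea can be illustrated with a small finite interpretation in Python; the universe, the relation P, and the formula (x)(Ey)P(x,y) below are chosen purely for illustration.

```python
# A sketch of a finite Tarskian interpretation: a universe of
# individuals plus a relation for each predicate letter.  The formula
# (x)(Ey) P(x,y) is true in the interpretation iff for every
# individual x there is some individual y with P(x,y).
universe = {1, 2, 3}
P = {(1, 2), (2, 3), (3, 1)}   # interpretation of predicate letter P

def holds_forall_exists(universe, relation):
    """Truth of (x)(Ey) relation(x, y) in the given interpretation."""
    return all(any((x, y) in relation for y in universe) for x in universe)
```

Here the interpretation above is a model of the formula, while the interpretation with universe {1, 2} and relation {(1, 2)} is not, since no y satisfies P(2, y).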
Figure 5.5: Tarski’s concept of an interpretation of a theory. (The figure pairs a theory, consisting of an alphabet of symbols and a set of logical statements such as (x)(Ey)(P(x,y) & R(a)), with a model: a universe of individuals, a predicate for every predicate letter in the theory, a function for every function letter, and an individual for every constant, such that every statement in the theory is true in the interpretation.)

Skolem and Löwenheim showed that every consistent theory always has a denumerable model [Mendelson, 1979], i.e., a discrete model which is only as large as the set of integers. Herbrand showed how, given a consistent theory, to construct a denumerable model for it. This constructs the so-called Herbrand model of the theory [Herbrand, 1930] (see [Herbrand, 1971] and section 18.2). Henkin showed we can construct a model restricting ourselves to interpreting every function as a computable function [Henkin, 1950].
In the 1960s, Abraham Robinson showed how to axiomatize the real numbers in first-order logic [Robinson, 1974]. This allowed him to formalize limit arguments and many of the results of calculus.
5.3 Theoretical computer science
Although Charles Babbage had shown in the 1840s how to build a machine for carrying out mechanical computations, it was only when such machines were actually built and used in the 1940s, using electronics, that computer science began to develop as a discipline. Although mainstream computer science has been necessarily concerned with the practicalities of present-day machines, their design, construction, programming and use, there has always been accompanying theory development which has studied a much wider class of theoretical computational machines. This led in the 1950s to the study of neural net models and cellular automaton models.
At the same time, mainstream computer science was developing the concept of the automaton and its connection to formal language processing, programming language design (both syntax and semantics), operating system design and the organization of computer systems. Notions of functional programming were developed, based on Church’s functional approach. Theories of data structure and databases were developed in the 1960s.
Theoretical studies have elucidated abstract properties of automata, abstract properties of formal languages including syntax and semantics, also theories of coordination of sets of serial processes. There are general theoretical bases for the notions of computability and of complexity of computation.
5.4 Artificial intelligence
Goal trees, backtracking and symbol structures. In 1956, Allen Newell, J. Clifford Shaw and Herbert A. Simon developed their first artificial intelligence (AI) program which they called the Logic Theorist [Newell and Shaw, 1957], which solved problems in
propositional logic, and at the same time they developed the programming language IPL [Newell and Shaw, 1957]. IPL-IV, i.e., version 4 of IPL, was used for the first versions of their General Problem Solver program, GPS, in 1958 [Newell et al., 1958]. A more standardized and usable version, IPL-V, was developed in 1959 and used for some early AI programs including those developed by Feigenbaum and by Feldman. Thus Newell’s first idea of AI was “cognitive programming” embodied in the IPL language. IPL is a bit like an assembly code, i.e., it is laid out on a line by line basis, each line corresponding to an element of a list of symbols, which denote data or operations [Newell et al., 1960]. A list is thus a sequence of symbols or lists, and was written, for example, as:
NAME  PQ  SYMB  LINK  HO  COMMENTS
L1        0
          S1
          S2
          S3
          0
A program is written, for example, as:
NAME  PQ  SYMB  LINK  HO  COMMENTS
10        L1    0         input list L1
          R1    L1        find last symbol of L1
          R1    S4        find last symbol of sublist found previously, say B2
20        W1    B2        output to W1
          0

where: NAME gives a name to the list, P indicates that the symbol is an input, Q gives the level of indirection of reference, SYMB is a symbol designating a data token or the name of an elementary process, LINK is the name of a sublist to be used at this point, and HO is the communication cell. Program steps are calls to primitive routines which were written in machine code and of which there were a large number, about 120. Each IPL line was represented in one 40-bit word of the JOHNNIAC computer, which had 4K words, i.e., 20K bytes = 0.02 megabytes of total store.
In [Newell and Shaw, 1957], the authors basically invented list processing and attributed part of its power to the use of symbols to designate structures and processes. A symbol in IPL-V actually had two values, first its ordinary value which was (the address of) a list structure, and second a list of associated pairs holding other properties of the symbol, such as its name.
In developing GPS as a model of human problem solving and applying it to the experimental data of Moore and Anderson [Moore and Anderson, 1954] on logic problem solving of the type taught in university courses, Newell and Simon developed AI programming techniques of goals, goal trees, methods, recursion and backtracking. Also their accounts talk of a cognitive model as an information processing system, IPS, based on symbol structures and sequential programs whose steps evoked elementary information processes [Newell and Simon, 1961] [Newell, 1962] [Newell, 1963] [Ruimschotel, 1989]. So they saw their work as both artificial intelligence and cognitive psychology.
After this, in 1967, Newell courageously turned to a completely new way of programming using rule systems, as a way of breaking out of the constrained control structure of GPS, which was based on recursion and backtracking, by going to a much more primitive control structure, which was a simple repetitive linear scan of the program with rule matching, but out of which many other types of control structure could be synthesized. Newell’s rule systems are Markov algorithm rule systems and work on a working string of symbols which he identified with short term memory.
The rule systems built by Newell and colleagues had symbols built into them. In 1972 Newell and Simon published their first book on their work entitled “Human problem solving”. This consisted mainly of very long detailed studies and models of human problem-solving behavior for three problems, namely, cryptarithmetic, propositional logic and chess. However the book also gave definitive statements of their fundamental ideas and postulates of their theoretical approach. They define an information-processing system, IPS, and it is very similar to their original 1957 definition. The main differences were that: (i) the store is now divided into a short-term memory, STM, which is limited to 5-7 symbols, and a long-term memory, LTM, and (ii) programs are represented as rule systems which operate on STM, and store and retrieve to and from LTM.
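The STM-plus-rules architecture can be sketched as follows. This is a toy production system in Python, not Newell's actual code, and the rule symbols are invented for illustration.

```python
# A toy production system in the spirit of the 1972 IPS: a short-term
# memory (STM) holding symbols, and an ordered set of rules scanned
# linearly from the top.  A rule fires when all its condition symbols
# are in STM and it would add something new; the scan then restarts.
def run_productions(rules, stm, max_cycles=100):
    stm = list(stm)
    for _ in range(max_cycles):
        for condition, additions in rules:       # linear scan from the top
            if (all(c in stm for c in condition)
                    and not all(a in stm for a in additions)):
                stm.extend(a for a in additions if a not in stm)
                break                            # restart the scan
        else:
            return stm                           # quiescent: no rule fired
    return stm

# Hypothetical rules: condition symbols -> symbols added to STM.
rules = [
    (["goal-sort", "unsorted"], ["compare"]),
    (["compare"], ["swap"]),
]
```

Starting from an STM of ["goal-sort", "unsorted"], the first rule adds "compare", which then enables the second rule to add "swap", after which no rule can fire and the system is quiescent.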
In 1990, Newell published his “Unified Theories of Cognition” book which gave an up-to-date statement of his approach to cognitive modeling. A symbol system is similar to his 1980 definition, but the architecture of the notional machine is now the SOAR machine, and knowledge is organized into problem spaces. A problem space is simply a set of rules; however, the point is that (i) the SOAR machine only works in one problem space at a time, which must have a name, and (ii) the set of problem spaces is complete in the sense that problems that are encountered in any problem space can always be formulated and described during problem solving, and their solution attempted in another problem space in the set. This is called universal subgoaling and is achieved by having standardized representations for all data types involved in the general SOAR problem-solving process itself. A clear description of SOAR can be found in [John Laird and Paul Rosenbloom and Allen Newell, 1986].
I draw several conclusions from Newell’s research: (i) the notion of a symbol as a token which represented a symbol structure was introduced from the beginning and stayed the same. (ii) the processing is always serial and has stacking of tasks, and often backtracking to previous goals on failure, built in. (iii) Hence this architecture is not easily mappable onto brain architecture, which may not always use symbols, does use a lot of parallelism, and doesn’t use strict stacking or backtracking. (iv) It does not use data types as a programming construct in problem spaces, whereas the brain probably handles different data types differently. (v) Nevertheless, SOAR demonstrates for the first time a universal problem-solving process which is able to identify and formulate its own goals and to attempt their solution, and this is something that will ultimately need to be provided in any theory of human cognition.
Knowledge representation. The other main early AI language was LISP, developed in 1959 at MIT, which was a functional language with list processing operations and program representation as lists. Lisp uses bracketed expressions to denote list structures and programs, and it encourages a functional style of programming with recursive calls instead of iteration. Lisp made it easier to develop special-purpose AI languages for particular tasks.
In the 1960s, the idea of knowledge representation and metalevel description was developed by Marvin Minsky [Marvin Minsky, 1965] and Carl Hewitt [Hewitt, 1967] and others. Although there had been programming languages since 1956, their semantics was not clear and thus programs could not be used as descriptions.
Hewitt’s Planner language [Hewitt, 1967] was the first language in which knowledge could be represented directly and the semantics was clear: it was the formal deduction of new assertions from existing ones. A Planner program consists of a set of assertions and rules, where a rule has a pattern which matches any goal expressions that it applies to, and a body which specifies what goals have to be satisfied before the entire rule is satisfied and its main goal satisfied. Thus in the following example
(def-theorem tc-broke1
  (conse (x y) (broken ?x)
    (thgoal (fragile ?x))
    (thgoal (heavy ?y))
    (thgoal (on ?y ?x))))

this is a rule (called a theorem in Planner), whose name is tc-broke1, taking two arguments and having goal pattern (broken ?x). This theorem would be evoked if there was a goal which matched this goal pattern, for example (broken a). When the theorem is evoked it tries to satisfy each subgoal in its body in turn, by either finding existing facts or by evoking further theorems which match the subgoal. If and when the theorem can be executed to the end, then the goal (broken a) is satisfied. Thus a rule is the same as a theorem in predicate logic: it represents a known true statement. So a Planner program is a set of statements the programmer is asserting to be true. These statements can be either facts or theorems.
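The way Planner satisfies a goal by backward chaining can be sketched in plain Python. This is not Planner itself, just a minimal backward chainer; the facts are chosen to match the broken/fragile/heavy/on example above, and variables are strings beginning with "?".

```python
# A minimal backward chainer: a goal is satisfied if it matches a known
# fact, or if a rule whose head matches it has all its subgoals
# satisfied in turn.  Facts are ground tuples of strings.
facts = {("fragile", "a"), ("heavy", "b"), ("on", "b", "a")}

rules = [  # (head, body) pairs; this one corresponds to tc-broke1
    (("broken", "?x"),
     [("fragile", "?x"), ("heavy", "?y"), ("on", "?y", "?x")]),
]

def match(pattern, fact, env):
    """Bind pattern variables to the fact's constants, or return None."""
    if len(pattern) != len(fact):
        return None
    env = dict(env)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            if env.get(p, f) != f:
                return None
            env[p] = f
        elif p != f:
            return None
    return env

def substitute(term, env):
    return tuple(env.get(t, t) for t in term)

def prove(goal, env=None):
    """Yield an environment for each way the goal can be satisfied."""
    env = dict(env or {})
    g = substitute(goal, env)
    for fact in facts:                      # satisfy by an existing fact
        e = match(g, fact, env)
        if e is not None:
            yield e
    for head, body in rules:                # or evoke a matching rule
        e = match(head, g, {})
        if e is not None:
            envs = [e]
            for subgoal in body:            # satisfy each subgoal in turn
                envs = [e2 for en in envs for e2 in prove(subgoal, en)]
            for en in envs:
                yield en
```

The goal ("broken", "a") succeeds because a is fragile and the heavy b is on it, while ("broken", "b") fails, mirroring how the Planner theorem would behave.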
Understanding a description in Planner basically needed a computer to elaborate it and to answer questions about it, since descriptions were long, and interacted with each other. A human could understand the descriptions but could not do much elaboration by hand.
So the concept of formal description by which humans described, communicated and elaborated, using the matching of expressions, was now taken to another level, in which a computer program interpreted the language, but the human could easily verify that they understood the description and could decide if they agreed with it or not. This cannot be said of programs in normal programming languages, since the programs are much more complex and difficult to verify by hand, even if one uses a computer to elaborate them.
Frames. In 1974, Marvin Minsky introduced the concept of frame
[Marvin Minsky, 1974], which was a larger kind of structure that held representations of many different aspects of a given situation or idea. This allowed one to describe all the different visual aspects of a given object and their interrelations. It also showed the way to the description of contexts, that is, outer situations which provide knowledge and impose constraints on more local descriptions.
Marvin Minsky and Seymour Papert’s MIT memo AIM-252 [Marvin Minsky and Seymour Papert, 1972] observed that children often draw unrealistically but in a way which demonstrates their knowledge of the depicted scene. An example is given in Figure 5.6, from [Marvin Minsky and Seymour Papert, 1972]. The example communicates that the box has four sides even though some are not visible.

Figure 5.6: Geometric knowledge in children’s drawings
Figure 5.7 gives an example of a visual frame from [Marvin Minsky, 1974].
A frame is a representation of all the knowledge that the system has about a given type of situation, and it can be matched to a given specific situation, identifying the components and giving their relationships, etc. Frames have slots which can be filled by matching the situation, slots can hold default values, and frames can be linked together. Frames can contain procedures, i.e., programs, for making transformations or calculating values.
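A frame with slots, default values and an attached procedure might be sketched as follows; the Frame class and the "room" example are invented for illustration.

```python
# A sketch of a frame: named slots with default fillers, matching
# against a specific situation, and attached procedures for
# calculating values.
class Frame:
    def __init__(self, name, defaults=None, procedures=None):
        self.name = name
        self.slots = dict(defaults or {})       # slot -> default filler
        self.procedures = dict(procedures or {})

    def match(self, situation):
        """Fill slots from a situation; unfilled slots keep defaults."""
        filled = dict(self.slots)
        filled.update({s: v for s, v in situation.items() if s in self.slots})
        return filled

    def run(self, proc_name, *args):
        """Evoke an attached procedure."""
        return self.procedures[proc_name](*args)

# An illustrative frame for a room, with defaults and a procedure.
room = Frame("room",
             defaults={"walls": 4, "door": "closed", "window": None},
             procedures={"area": lambda width, depth: width * depth})
```

Matching the frame against a situation where the door is open fills that slot from the situation while the walls slot keeps its default of 4.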
Figure 5.7: Minsky’s concept of frame

Frames can also be used in understanding natural language. The following example is from [Marvin Minsky, 1974]. “There was once a wolf who saw a lamb drinking at a river and wanted an excuse to eat it. For that purpose, even though he himself was upstream, he accused the lamb of stirring up the water and keeping him from drinking.”
There is a frame for this situation:
A upstream from B
B muddies water
A accuses B
There may be a mental image used to understand the verbally expressed situation.
There is other general knowledge active in this context, such as:
wolves eat lambs
stirring up makes water undrinkable
stirring up is temporary
Frames showed that knowledge structures could be large and complicated, to the point where they could provide a complete environment for processing. Frames were used in computer vision systems to represent all the knowledge required for visual perception of a known situation, and also in natural language processing for representing known situations being referenced in utterances. Patrick Winston’s AI textbook contains a comprehensive treatment of frames and their uses [Winston, 1993], and Minsky’s “Society
of Mind” book develops the concepts further [Minsky, 1986].
Learning methods. Learning has always been of great interest in AI, and in the 1980s became a major subarea. Several new types of learning method were developed [Michalski et al., 1990] [Winston, 1993].
Multiagent systems. The topic of multiagent systems started in the mid 80s [Bond and Gasser, 1988b] [Bond and Gasser, 1988a], and has developed into an international research area [Huhns and Singh, 1998] [Ferber, 1999]. This has generalized AI to involve cooperation among a set of intelligent agents, and this leads to a consideration of distributed knowledge, distributed planning, high-level communication methods, and distributed learning. We have only scratched the surface of the potential of this fertile area.
Statistical natural language processing. During the 1990s, statistical grammars were introduced for learning natural language syntax. These turned out to be wildly successful, allowing languages to be learned quickly and used in automatic natural language translation. For example, the Hong Kong parliamentary proceedings, which are bilingual in English and Chinese, were used to learn a very accurate and complete statistical grammar for Chinese in only 3 days [Wu, 1994].
The failure of AI. It is often said, and indeed taught by the ignorant to the innocent, that AI “failed”. Their knowledge of the subject seems to be limited to answering just one Jeopardy7 question - “It failed” - “ ‘What is AI?’, Alex”. The argument given is that some AI researchers predicted the development of super intelligent machines by now, and this hasn’t happened. However, as I have recounted above, the reality is that AI has developed new important and lasting concepts in computer science and continues to do so. Incidentally, this has been achieved with a very small number of researchers worldwide, probably a few hundred.

7Jeopardy is an American TV gameshow in which contestants are given an answer and have to generate the corresponding question. The original host of the show is Alex Trebek.
5.5 The choice of programming language
In the United States, functional programming has continued to be used for most artificial intelligence research, partly because of advantages of functional programming such as composability and scaling, and partly in an attempt to standardize the Lisp language for practical projects and thereby ensure financial support. Logic programming has however been used instead of Lisp in Europe, and in Japan, where it was the basis of the Fifth Generation computer initiative. Due to the need to save and to share programs and the cost of learning programming, the choice of programming language rapidly becomes a cultural one. Once most people are using a particular language, it is very advantageous to use that language, in spite of any faults and unsuitability it may have.
5.6 The intellectual revolution of computer science
Computer science has developed precise models and realizations for a number of key concepts in Western thought, notably those of process, representation and abstraction. By process, we mean some kind of ongoing activity in time; by data representation we mean a structuring of items corresponding to some entity of interest; and by abstraction we mean a relation whereby a representation represents classes of other representations and their general properties and behavior. These ideas had been used for centuries in imprecise ways and as a result could not be developed beyond a certain level. However, computer science has allowed us to construct and use processes, representations and abstractions as a precise science. This ability is also affecting research into philosophy and the foundations of mathematics. These theoretical developments in computer science are quite distinct from the practical use of computers and their great facilitation of most areas of human endeavor.

Chapter 6
Logic programming
Abstract. In this chapter, I explain how logic programming developed out of the logic approach to artificial intelligence.
6.1 Introduction
In 1965, the resolution method for theorem proving was developed by Alan Robinson [Robinson, 1965] and gave an impetus to the use of logical methods, and to the use of logical theorem proving as a basic underlying process for artificial intelligence.
The resolution method uses one basic operation, called resolution, to derive a new logical statement from a given pair of logical statements. Statements are represented, without loss of generality, as sets of logical literals. In resolution, a pair of literals, one from each of the two statements, is unified by finding a substitution, of terms for variables, which makes them identical. Then a new statement, the resolvent, is formed by the union of the two statements with the unified literals removed. If a sequence of resolutions can be found which starts from a set of statements and produces the empty statement, then the original set of statements is unsatisfiable.
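The resolution operation can be sketched for the propositional case as follows; full first-order resolution additionally requires unification of terms, which is omitted here. Clauses are sets of literals, with negation written as a "-" prefix.

```python
# One propositional resolution step: resolving two clauses on a
# complementary pair of literals yields their union with that pair
# removed.  Deriving the empty clause shows unsatisfiability.
def negate(literal):
    return literal[1:] if literal.startswith("-") else "-" + literal

def resolve(clause1, clause2):
    """Return all resolvents of the two clauses."""
    resolvents = []
    for lit in clause1:
        if negate(lit) in clause2:
            resolvents.append((clause1 - {lit}) | (clause2 - {negate(lit)}))
    return resolvents

# Example: from {P, Q} and {-Q, R} resolution derives {P, R}.
c1 = frozenset({"P", "Q"})
c2 = frozenset({"-Q", "R"})
```

Resolving {P} against {-P} yields the empty clause, so the set containing both statements is unsatisfiable.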
In the 1970s, inspired by Planner, by recently discovered methods for generating linear proofs, and by new ideas in using logic to describe natural language, logic programming became a practical reality with the development of the Prolog language by Alain Colmerauer and Robert Kowalski [Colmerauer, 1973]. This was followed by its efficient implementation by David H. D. Warren [Warren, 1977] and the development of its precise semantics by Keith Clark [Clark, 1977].
The main difference from Planner is the use of resolution instead of modus ponens1 for inferencing, and the use of a linear strategy for proving theorems which was complete, i.e., guaranteed to find a proof if one exists. Resolution involves a specific kind of two-way match of logical expressions called unification. Logic programming also handles variables better, allowing variables as components of data structures. Prolog was developed into a more generally usable language; it has a full range of programming language features.
1defined above in section 5.2.
It is also available for all the usual types of computer, i.e. platforms, and these days it is very robust and very fast, thanks to the work at SICS, The Swedish Institute for Computer Science, supported by the Swedish government and Swedish corporations.
Robert Kowalski and Maarten van Emden also showed that top-down goal-directed search and bottom-up data-driven query processing converge on the same minimal models, thereby tying logic programming to a fixed-point semantics analogous to that developed by Dana Scott for the computation of functions.
Logic programming has allowed large AI systems to be constructed. Chapter 7
Describing information processing in computer science
Abstract. I explain description methods used in computer science for describing infor- mation processors and information processing.
Concepts include data structures, processes, abstractions, interfaces and protocols.
7.1 What is computer science?
As I never tire of saying, computer science can be defined, not as the study of computers, but as the study of the description of computers. In other words, description methods are crucial and central to the whole computer science enterprise. What a description is is part of this research, however, approximately, a description is an expression, in a precise language, which denotes or describes the entity under consideration, in the sense of allowing questions about the entity to be posed and answered.
The present-day discipline of computer science in the main describes the design, implementation and use of computer systems which are based on present-day technology and methods.
Thus, programming uses programming languages based on addressable stores, instructions and sequential control. Hardware is serial; indeed, most machines are based on a single bus. Operating systems concern themselves with managing a set of serial programs. Database techniques similarly are concerned with the management of the seriality of disk seeks, the seriality of channels connecting processor and disk, and the seriality of multiple accesses by multiple users.
Of course there are methods rising above plain seriality: Ethernets, functional program structures whose parts can be executed concurrently, and of course experimental parallel computers.
7.2 Concepts in computer science
In this section, I want to introduce and define the key concepts in computer science which I will use in defining my model of the brain.
These concepts are (1) data and data structures, (2) program, control and process, (3) interfaces, and (4) communication.
There are no such concepts in theories of neural nets, which are based on real analysis and integration of equations of real variables.
In chapter 8, I will discuss the problem of defining a class of machines which are derived from plausible models of the primate brain at the system level of analysis, and of then developing a computer science for them, which would be the study of the description of computation, data representation and control structures for this class of machines, as well as their general theoretical properties.
7.2.1 Data and data structures
The origin of the word “data” is something given, from outside, beyond our control; however, when computers were used to construct or derive information, this new information created by the system was also called data, to distinguish it from a program.
An etymologically more felicitous word would be “fact”, since this means a conclusion which has been constructed, from the Latin verb facere, to make.
How can data have structure? One approach is to use addresses and to store data in storage locations with certain addresses. Thus a linear arrangement of data such as a queue or shopping list could be represented by putting the successive members of the list in successive addresses. Then, as in IPL, we can use lists containing addresses which refer to other lists, giving list structures. In symbol structures, elements of lists are symbols (tokens) which can be primitives or can be the names of other lists. The table giving the correspondence between a symbol and the entity that it refers to is called a symbol table and may only be required for a compiler but for some languages may be required at run time.
However we would like to use a concept of data structure which is not tied to or dependent on any particular type of computer or processing device. An answer to this problem was given by Peter Landin in 1964 [Landin, 1964], who proposed a functional approach to the idea of a structure definition. He said that we could define a particular structure type by defining three kinds of function: (i) constructors, which construct an instance of the structure type, if given as inputs the components to be structured, (ii) selectors, which are given a structure and a description of which component, and return the value of the selected component, and (iii) predicates, which, given any structure, will tell us true or false whether the structure is of the given type or not.
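Landin's three kinds of function can be illustrated with a pair (cons) type in Python; the function names below are chosen for illustration and carry no commitment to addresses or any particular machine.

```python
# A structure type in Landin's functional style, defined only by its
# constructor, selectors, and predicate.
def cons(head, tail):
    """Constructor: build a pair from its two components."""
    return ("cons", head, tail)

def first(structure):
    """Selector: return the first component of a pair."""
    return structure[1]

def rest(structure):
    """Selector: return the second component of a pair."""
    return structure[2]

def is_cons(structure):
    """Predicate: is this structure of the pair type?"""
    return isinstance(structure, tuple) and len(structure) == 3 \
        and structure[0] == "cons"

# A small list built from pairs, as in Lisp.
lst = cons(1, cons(2, None))
```

Any representation satisfying these three definitions would serve equally well; the tuple used here is just one possible realization.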
7.2.2 Program, control and process
Whereas a basic level of description of an information-processing machine is as an automaton, explained in chapter 5, for complex systems it is much clearer to use the concept of a program which is run by an underlying automaton. A program is a description of a sequence of elaborations using machine actions, so a program is basically text.
The word “control” is used in several ways in different disciplines, most notably control engineering, where it concerns systems which seek and maintain desired states. However, in computer science there is a different concept of control. Control concerns which program and machine elements are currently active and which are executed in the next time interval. This usually assumes a single serial processor, so only one action can be executed at any one time; however, the same considerations apply to systems with more than one processor. Thus control structures include conditional looping, where a program segment is executed repeatedly while a given condition is satisfied, and recursion, where during execution of a program it evokes an additional copy of itself and executes it. Control relations among programs include one program transferring control to another, or calling another and expecting it to return control when it terminates.
A process is usually taken to be a program which can be executed but which can also be suspended and then processing continued at a later time. Thus a process has a variable reactivation point which specifies at which point in the program execution should be resumed.
Processes can develop different control relations to each other, such as independent par- allelism, or parallelism with a shared data store, or coroutining, which is a streamed control where each process operates in parallel and sends requests for more data to other processes when it has finished processing its current data.
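Coroutining can be sketched with Python generators, where the producer is suspended at a reactivation point and resumed only when the consumer asks for more data; the function names here are illustrative.

```python
# A sketch of coroutining: the producer suspends at each yield (its
# reactivation point) and is resumed only when the consumer has
# finished with its current item and asks for more.
def producer(items):
    for item in items:
        yield item            # suspend here until the consumer asks again

def consumer(source):
    results = []
    for item in source:       # each iteration resumes the producer
        results.append(item * 2)
    return results
```

Running `consumer(producer([1, 2, 3]))` interleaves the two processes: the producer hands over one item at a time, on demand.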
Of course, in a typical computer, even where only one person is using it, there will be many processes which are active at a given time, and then a scheduler which is part of the operating system will act as policeman in regulating which processes get to run. Some systems use a timesliced method in which each process gets a fixed amount of time, usually 10-100 milliseconds, before another one takes over the processor. This gives the effect of parallelism, using a serial processor.
Control can be taken to a new level with dataflow architectures in which processes are quiescent until they receive enough data to resume computation. Thus a dataflow network is a reactive network which will process whatever data is available without requiring any management by another process.
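A minimal dataflow node might be sketched as follows; the AddNode class is invented for illustration. It stays quiescent until a token is present on every input, then fires without any management by another process.

```python
from collections import deque

# A tiny dataflow node: tokens arrive on input queues, and the node
# fires (computes and emits a result) only when every input has a
# token -- no external scheduler is needed.
class AddNode:
    def __init__(self):
        self.in_a, self.in_b = deque(), deque()
        self.out = deque()

    def send_a(self, value):
        self.in_a.append(value)
        self.try_fire()

    def send_b(self, value):
        self.in_b.append(value)
        self.try_fire()

    def try_fire(self):
        # Fire only while a token is present on every input.
        while self.in_a and self.in_b:
            self.out.append(self.in_a.popleft() + self.in_b.popleft())

node = AddNode()
node.send_a(3)    # quiescent: only one input has arrived
node.send_b(4)    # now the node fires and emits 7
```

Larger dataflow networks connect the output queue of one node to the input queue of another, so data availability alone drives the computation.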
7.2.3 Interfaces
Computer science uses a highly developed and precise notion of interface between systems which specifies what information is exchanged across the boundary between systems.
Definitions, or parts of definitions, can also be transported across interfaces, and definitions can be modified in crossing boundaries.
Interfaces allow large systems to be described more easily. They also facilitate the division and distribution of design and programming activity.
7.2.4 Communication
There is also a well-developed field of computer communication which involves, among other things, the design of protocols by which two computers can manage their interactions. Protocols are typically layered into several levels of abstraction, where each level has a different protocol, which is implemented in terms of the protocol for the level below.
7.3 Symbols in computer science
Most of computer programming uses variables which can have numerical values, thus a variable x1 might have the value 2. Any variable is referred to by a symbol, e.g., x1. Most languages require the programmer to assign a data type to each variable, so it can only take values from a given set, so x1 above might be of type integer. Some types are atomic, i.e., with no analyzable structure at the given level of description, like integers, reals, booleans and characters, but other types can be structures like arrays, strings, queues and trees. However the components of structures are ultimately particular atomic values, i.e., of particular atomic types. A variable can only have one value at one time, but can change its value during the execution of the program, although in some languages assignment is write-once, a variable once assigned a value cannot change it. There can sometimes be untyped variables which can take any type of value. A variable when first created may have no value, i.e., be unbound.
In some programming languages however we can have values which are symbols. A symbol then is a token which has a name. Thus a variable y2, whose name is “y2”, might have a value which is the symbol whose name is “alpha”. Values can be symbols or symbol structures, a symbol structure being a list of symbols or lists. A given symbol may occur more than once in different symbol structures and the different occurrences can be tested for equality with each other. This defines the concept of symbol in the context of present-day programming languages.
7.4 The computer science experience
There is a basic experience that computer scientists have that is difficult to convey to anyone else, and this is the confrontation of one’s intuition, rationality, and imagination by the totally objective response of the computer.
Typically a computer scientist is designing and implementing a program. He or she starts with a conception of what the program is supposed to do. This is then thought through into a design, usually top down, from an outline design to gradually more detailed designs. These are usually written down in great detail, and ideally some examples are worked out by hand on paper to make sure the design will work. This process takes several weeks, by which time the computer scientist will have subjected these ideas to every possible test and imagined all possible data cases. Then, when there is great clarity, they specify the last, most detailed level, which is to express the design in a programming language and to run it on a computer. Any grammatical errors are usually removed on the first day.
What then happens is that the program does not work.
On examining what happens when the program is run on the computer, the computer scientist realizes how their conception of the problem is incorrect, and/or they have forgotten an essential process that is needed to make it all work, and/or the data description has components missing.
This experience is basic to computer science.
It is also why a computer scientist is never completely convinced about an idea or design or specification until the program has been implemented and the program has been successfully demonstrated to run correctly on the computer. One can propose very convincing models and mechanisms, write them down in papers, etc., but unless it has actually run on a computer it is not real.
Conversely, psychologists and neuroscientists tend not to care so much whether or not a computational theory has been programmed and its correctness demonstrated.
It is I think quite analogous to a psychologist designing an experiment and finding that it doesn’t yield viable results. Expecting a computer scientist to take an unimplemented model seriously is like expecting a psychologist to believe in an experiment that has not been run.
Thus the many psychological models which are discussed by psychologists, and have their useful place in conceptualizing psychological mechanisms, will certainly be found to have essential conceptual flaws and missing mechanisms and data when someone eventually attempts to program them.
Chapter 8
Computer science description and the brain
Abstract. I discuss the problem of defining a class of machines which are plausible models of the primate brain at a system level of analysis, and of then developing a computer science, which would be the description of computation, data representation, control structures, and general theoretical properties, for this class of machines.
I analyze description methods used in computer science for describing information processors and information processing. I also elucidate the underlying assumptions about information processing made in computer science description methods.
I then reconstruct and generalize some of these descriptive concepts, to provide a set of descriptive concepts for information-processing by the brain. I suggest that these descriptive concepts can provide a basis for a system level of description of the brain.
8.1 The computer and the brain
The question of the relationship of the brain to the computer has been asked by many thinkers, for example Edmund Berkeley [Berkeley, 1949], John von Neumann [von Neumann, 1958] and Hartwig Kuhlenbeck [Kuhlenbeck, 1966]. Terry Sejnowski has given a discussion of von Neumann’s book [Sejnowski, 1989], and quotes its essential message: “Thus, logics and mathematics in the central nervous system, when viewed as languages, must structurally be essentially different from those languages to which our common experience refers”, and “However, the above remarks about reliability and logical and arithmetic depth prove that whatever the system is, it cannot fail to differ considerably from what we consciously and explicitly consider mathematics”. In spite of occasional discussions of the problem [Schade and Smith, 1970] [Conrad, 1973] [Spiegel, 1983], occasional speculative schemes [Mitra and Mishra, 1993] [Mitra and Mishra, 1990], and philosophical discussions [Searle, 1990], the problem has not received any treatment that has even scratched its surface. However, unlike in von Neumann’s day, present-day computer science has by now developed a host of very general concepts and powerful methods for describing information-processing systems of wide generality.
In this chapter, I will try to contribute to the solution of the problem of describing the brain, and brain-like computers, by: (i) analyzing the assumptions about information processing that underlie concepts and methods used for describing complex information-processing systems in present-day computer science, and (ii) reconstructing and generalizing these descriptive concepts from computer science, to provide a set of descriptive concepts for information processing in the brain.
8.2 Computer science assumptions from von Neumann machines
The majority of concepts in present-day computer science are derived from von Neumann machine architecture.
The notion of data is that it is addressable and therefore directly and precisely retrievable. The notion of data structure is based on sets of pointers, that is, addresses. Accessing data is a matter of indexing and following pointer chains. Data are passive.
The basic notion of control is that there is a single point of control which is an address. The point of control usually steps serially to the next address, unless there is a branch in the machine code.
There is a concept of instruction, instructions are treated as data, there is built-in implicit seriality of sets of instructions from the seriality of RAM addresses, and only one instruction is executed at any one time.
There is a notion of program, which is a sequence of instructions.
Of course there are many higher-level concepts which capture notions of data and data structure, and also higher-level notions of control structure. However, these higher-level concepts always incorporate the assumptions of precisely addressable data and a single precise point of control.
Present-day computer science concepts are thus in the main derived from von Neumann architecture. It is not much of an exaggeration to say that present day computer science is the study of von Neumann machines.
8.3 Life and computer systems
The isolated computer did not evolve, and computers today are only the central focus of a much wider information-processing activity carried out by humans. The analogy of computers to living systems should be to this wider process. We can perhaps define a living system as one which continually renews itself, and we can also observe the characteristic of the human mind that it is continually redescribing itself.
A system which does not need human intervention, help and creativity, and which survives, adapts and reinvents itself, would have to provide: (i) ontogenic development - development of software, addition of new software, addition of new hardware - additional disk drives, (ii) phylogenic development - design of new types of system, evaluation of performance, new technologies and devices, (iii) storage needs - media, and (iv) physical needs - provision of energy, physical protection - air conditioning, shock and vibration, fault detection and diagnosis, physical repair and replacement of hardware.
8.4 Assumptions underlying computer science descriptions
Since my goal is to consider the applicability of computer description methods to the brain, I will list here some of the underlying assumptions operating in such methods.
Separation of data and control. Data are passive, and there is a program which determines operations which are executed on various data items.
Seriality of control. Most models assume a single point of control, where computation activity is occurring. Everything else is passive. The point of control moves rapidly over the possible locations. It is passed around as a datum. It is also located at a precise location at any one moment.
The concept of program. A program is a data description which provides detailed control to a hardware processor which “runs” (executes) it.
A program can be treated as data. It can be stored, moved and executed.
Location independence. The same data can be located in any location, and can be moved from one location to another.
Separation of resource management from other functionality. Programs usually specify computation and not resource management, such as computation time, real time, priority of running, disk space to be used, etc., however some programs handle RAM allocation. Such matters are dealt with in separate descriptions, in job control languages, and in operating system policies.
Erasability. Any program or data can always be removed by erasing it.
Copying. Programs and data can be copied, and at very little cost.
Exactness. Programs and data can be stored, retrieved and copied without changing them in any way. All quantities are held to a high degree of precision, which is reliable and reproducible.
Permanence. Programs and data do not degrade with time. This also applies to suspended processes which can be resumed at any later time. Over a very long time- frame, when we try to move programs and data to new types of machine, then we start to get degradation due to obsolescence.
The hardware-software distinction. This is a relatively black and white distinction. Most mechanisms are implemented in software and run on hardware. There are also of course varieties of firmware, FPGAs (field programmable gate arrays) and so on, giving mechanisms intermediate between hardware and software.
Referencing and indexing. References are precise. However, a reference is a simple descriptor. Hence, data indexes have to be used to locate items.
Context. Precise contexts are kept and used in data access and process control.
Detailed precise description of processes. One process can hold and act upon a precise description of another.
Metalevel descriptions. Descriptions of programs and data can be used which describe the properties and performance of those programs and data.
Simulation. One process can simulate another. A process can be represented in many different ways within one system.
Error correction at each level. Control and data are exact, and any residual errors or inconsistency are usually handled as close to their source as possible. It is assumed that data passed from one process to another have been checked, and any errors removed.
Energy. The use of physical energy is not much of an issue, and the results of compu- tations hardly affect the amounts of energy used.
Limitation of processing. The number of hardware operations per unit time is a definite bound on performance, as is the amount of RAM, disk swap space, and disk space. These limit the computations that can be carried out.
8.5 Computer science concepts for the brain
Let us now consider the properties of present-day computing systems and ask how the brain might differ: whether any concepts can be carried over into brain science, whether some can be modified, and whether similar issues exist and require analogous description methods.
Separation of data and control, and programs. (i) instructions are treated as data, (ii) there is built-in implicit seriality of control from the seriality of RAM addresses, (iii) data are passive, (iv) there is a concept of instruction, (v) only one instruction is executed at any one time, and (vi) there is a notion of program.
An instruction and a program are similar ideas: they are data which control a process (the CPU). But all data have influence on computational activity, so what, if any, is the difference? Try this: a program is anything which tends to focus and direct computational resources in an organized manner, and to maintain and update over time its control over those resources.
A program would be some data set whose effect upon a process is to focus and maintain control over computational resources and to produce data transformations characteristic of the program.
It may well not be necessary to use the property of location independence for the brain. Different types of data may well be localized in certain areas, and processed only by certain processors.
Values, variables and data types. It is quite reasonable to assume that the brain uses stored values which can change with time.
However, how these values are described in brain processing, such as how they are stored and retrieved, may or may not require a notion of variable, i.e., a descriptor which explicitly refers to the stored value.
A variable is usually regarded as a fixed location which can have at any one time one of a set of possible values. The location is usually compiled into the process so that the original name of the variable is lost, i.e., is not used for computation, but for specification or description of the computation. There are systems with a more general concept of variable and even systems which can generate new variables, including their names, during computation, but these are exceptions.
The nervous system probably uses implicit referencing routinely, with certain neural subareas having certain possible states or values, and processes using these values as inputs or controlling information.
We will need mechanisms for describing values and their storage and access. We will need a notion of data type. Possible values could be of different types from single numerical values, to images, to patterns, masks, programs, etc.
If a value is always stored or processed in a given specified neural area, it is likely that its possible values are all of one type and of similar size and complexity.
Channels. Neuroscientists are comfortable with the concept of communication channel. The pathways connecting neural areas can be described as channels, with the notion that only a limited amount of data can pass through them. The validity and usefulness of information-theoretic measures in biological systems, including the brain, has been known since the fifties.
Seriality, point of control, control in general. I imagine that data and processes and programs in the brain operate concurrently. I imagine that each of the fifty Brodmann areas on each hemisphere is processing simultaneously, and that within each area all the millions of neurons are firing simultaneously, perhaps many of them relatively slowly at 1 spike per second, but nonetheless firing.
The extreme seriality of operation of computers is not found here, but the notion of control will still arise, in issues of how resources are used. Does a process access one particular memory or set of memories and run one program or object, or does it run another one? The idea is that there are many possible courses of processing activity and that not all objects can be processed simultaneously. Thus there are questions of selection of which objects to process, of suspension and termination, of one process dominating and inhibiting another, of one process waiting for another, of one process having a control relationship with another: perhaps they work on related data, perhaps they provide data to each other, or one to the other, perhaps they compete for resources, or for access to data.
A process can be imagined as localized within some area, or as sparsely distributed over a large area, if not the entire brain.
Where an object is large and distributed, its control over resources, viewed as being processed by processes, is no longer determined by one point but depends on a large subset of the object, if not the entire object.
Other control issues include coherence - maintaining active data and programs which agree with and are relevant to each other rather than a set of unrelated data and pro- grams.
The seriality of verbal input/output, of the stream of visual percepts from the external environment, and the seriality of high-level logical thought and problem-solving search tends to imply that the brain may be able to, or may need to, function serially at the top level, particularly for very demanding tasks.
Seriality may be necessary if, for example, (i) certain resources of limited capacity are needed and are used maximally, or (ii) results of computation are needed before further computation can occur.
Referencing and context. I imagine that narrow-band referencing by addresses and names is not used in the brain, but rather a reference is a large object, with data and even some programs, which acts upon or is processed by a memory to retrieve stored memory items.
I imagine this retrieval process is not unique but that many items could be retrieved; however, varying amounts of information can be used, and a retrieved optimal item can also be obtained.
We must also allow for multiple concurrent retrieved items.
Items may also be active objects with data and programs.
Contexts may also be approximate and overdetermined in a similar way.
Simulation, metalevel objects and copying. We may not have these in the brain.
Precision and exactness. There can be no doubt that the brain can achieve precision of calculation, of memory retrieval and of inference. I imagine however this being achieved by a multiplicity of overdetermining constraints and cues, which establish, maintain and ensure the accuracy and precision of information retrieved and transformed.
8.6 Summary
My analysis has shown that many inbuilt assumptions of present-day computer science ubiquitously infiltrate computer science description methods. I have argued that it is nevertheless possible to crystallize out, from present-day computer science, a set of computational concepts and principles which can form the basis of a system level of description for the brain.
Chapter 9
Levels of description in computer science
Abstract. I explain description methods used in computer science for describing information processors and information processing.
The description of a computer system is organized as a set of self-contained description levels.
I describe a typical set of levels giving formal descriptions at each level and descriptions of elements at each level in terms of the level below.
9.1 Describing computers
During the 1970s and 1980s, with the development of VLSI technology, the precise description of large complex computer systems using multilevel approaches was developed into a powerful and practical methodology. A good standard treatment of system level abstraction and levels of description can be found in [Siewiorek et al., 1982].
Description languages have been developed for the high-level description of complete computer systems. Standard treatments can be found for example for requirements in [Davis, 1993] and for real-time embedded system methods in [Calvez, 1993].
At the highest levels, there are description languages which are not completely formal or mathematical, but are used for humans to communicate with other humans about computer system specifications, for example for systems that are being contracted and have not yet been built. This level of language tends to merge into languages used by the lawyers and accountants involved in the contracts, as well as the concepts and terminology of the culture of the organizations involved.
9.2 Levels of description for computer systems
Gordon Bell and Allen Newell published a landmark book in 1971, entitled “Computer Structures: Readings and Examples”, which reviewed all existing computers at that time and gave a hierarchical description scheme for describing computer systems. It also developed some general concepts for a unified, more detailed, description approach. This approach had two descriptive systems called PMS and ISP. In PMS, one specifies the overall architecture of the computer, its components, which could be memories, links, controls, switches, transducers, data operations or processors, and how they are interconnected. Bell and Newell give a complete example of the DEC PDP-8 computer expressed in PMS. In ISP, one specifies the basic instruction set of each processor, and Bell and Newell show how to specify the DEC PDP-8 instruction set using ISP. A predecessor of ISP is APL, which was originally developed by Kenneth Iverson at IBM in the early sixties for specifying the action of processors [Iverson, 1962]. In 1982, the second edition of the “Computer Structures” book appeared with an additional co-author, Daniel Siewiorek [Siewiorek et al., 1982]. It had a more general hierarchical description scheme with an additional top level based on PMS description. This allowed the components of specified computer architectures to be complete computers, including software. Figure 9.1 gives a set of levels based on their description scheme.
Level no  Level name            Description method used   Concepts described
1         PMS                   PMS descriptions          processors, memories, networks
2         applications          programming languages     application mechanisms
3         programming language  implementation language   programming constructions
4         operating system      implementation language   memory allocation, scheduling of tasks,
                                                          file system management
5         instruction code                                instruction set design
6         register level                                  microprogramming, datapaths
7         switching circuit                               sequential circuits (counters, registers);
                                                          combinatory circuits (encoders, decoders,
                                                          data operations)
8         gates                                           technologies for gates and memory

Figure 9.1: Levels of description in computer science
The relationships between levels are not always well defined. They depend on the specific machine, languages and methods being used, and these are in a constant state of change and redescription. Each level will have a formal description language, allowing exact specifications of what it is describing and how; the descriptions and description systems exist as computer programs which have been designed and implemented to process system descriptions correctly and to give a faithful semantics of the intended meaning of the descriptions.
In general, as diagrammed in Figure 9.2, each level will “deliver” certain functions and data to the level above, meaning that it will explain those functions and data in terms of expressions involving data and functions within its level. It will “implement” those functions and data which are primitive and unanalyzable at the level above. In its turn, it works with a certain set of primitives which are delivered from the level below. What is being described is, of course, always the behavior of the system under consideration.
Figure 9.2: Levels of description and their interactions. Each level n has a description language (sets, elements, functions, predicates, time and space scales) together with hypotheses, questions, experimental data and models; an interlevel interface between the description languages at levels n and n−1 defines entities at level n in terms of entities at level n−1.
A related diagram from [Edwards, 1992] illustrating VLSI chip design is given in Figure 9.3. This shows that spatial layout considerations can also be incorporated into the description process.
Figure 9.3: The process of chip design, from [Edwards, 1992]. Silicon compilation comprises design synthesis and layout synthesis, with behavioral, structural and physical descriptions at each level: system (performance specifications; processors, memories, etc.; physical partitions), algorithmic (algorithms; hardware subsystems; clusters), microarchitecture (register transfers; ALUs, registers, etc.; floorplans), logic (logic functions; gates, flip-flops, etc.; cell estimates) and circuit (transfer functions; transistors; cell layouts), down to the VLSI chip.
9.3 Descriptions used at each level
Relations between levels. Let us consider how one might give an account of the execution of a Prolog program and how this would propagate, or transcribe, through the different levels.
I am spending time on this example because the levels of description of computer systems are one of the few clear examples of precise definition of a system of description levels, and later I will be arguing for an analogous description scheme for the brain.
This is thus an example of giving a precise description of a laptop computer running a certain Prolog program. In order to do this we will need eight levels of description, and each level will involve very complex descriptions. We diagram this in Figure 9.4.
Description level        Interlevel interface
laptop                   defined as processes, memories and networks
brain model process      Prolog statements defined by Prolog mechanism in C
C                        C statements defined by machine instructions and calls to the operating system
Linux operating system   C statements defined by machine instructions
Pentium instructions     codes defined in serialized register transfer language;
                         serialized register transfer language defined in hardware RTL
register level           registers defined as gates
switching circuit        gates defined as circuits
circuit elements         circuit elements defined as physical devices

Figure 9.4: Levels of description for a laptop running the brain model
1. The laptop specification describes how one can run and interact with a Prolog program.
2. The Prolog program is described by how it runs on the Prolog runtime system.
3. The Prolog runtime system is described by a C program.
4. The C program is described by a machine code program that it compiles into.
5. The machine code program is described in terms of the machine instruction set and machine architecture.
6. The machine instructions are described in terms of a serial register transfer language.
7. The machine processor and architecture is described in terms of a hardware language such as RTL, the register transfer language.
8. The logic design level. Registers and register-level operations are described by designs in terms of logic gates.
9. The circuit design level. Logic gates are described in terms of circuits, i.e., arrange- ments of circuit components or devices.
Thus each level in principle has a different description of the same thing, namely the given Prolog program.
The specifications of each level and each interlevel interface exist as documents. We diagram this in Figure 9.5. Each level has at least two complex specification documents, often many more, each about 300-500 pages, and to understand these fully, one needs to specialize to work at one particular level or interface.
Description level        Specification of language                   Specification of interlevel interface
laptop                   laptop user manual                          implementation of laptop commands
brain model process      Prolog manual and Prolog ISO specs          listing plus manual for Prolog interpreter in C
C                        C user manual and C language ISO specs      listing plus manual for C compiler,
                                                                     including operating system calls
Linux operating system   manual for Linux commands and C ISO specs   listing of Linux system in C
Pentium instructions     Pentium machine code manual                 specification of machine instructions in terms
                                                                     of the Pentium machine architecture, which is
                                                                     in terms of hardware RTL
register level           RTL manual                                  specifications for registers, ALUs etc. as gates
switching circuit        logic diagrams and tables                   specifications of gates defined as circuits
circuit elements         circuit diagrams                            circuit elements defined as physical devices

Figure 9.5: Specification documents for levels and interlevel interfaces
The circuit design level. A circuit can be written as a circuit diagram involving resistors, transistors etc. In this way for example a NAND gate can be described as a circuit, see Figure 9.6. It is of course possible to write a circuit diagram as a set of assertions, one for each component and then one for each connection among components.
Figure 9.6: NAND gate defined by circuit, logic diagram, layout and logic formula representations
The logic design level. We might describe a combinatory logic circuit by an expression in boolean, i.e., propositional, logic:

O1 = A ∨ ¬(A ∨ B) = A ∨ ¬B

This can also be written as a logic diagram, see Figure 9.8.
The register transfer level. RTL is a formal language. In it you specify a set of input registers, an output register, transformations such as AND, OR, NOT, etc., which are executed in one clock cycle, and register transfer statements of the form:

L: Z = F(X1, X2, X3, ..., XN)

where L is a label for the statement, Z is the output register and the Xi are the contents of the input registers. Control is expressed by statements such as:

L: goto L2
L: if condition then goto L2

Thus this language has well-defined semantics.

A B   ¬A ¬B   (A∧B) ¬(A∧B)   (A∨B) ¬(A∨B)
1 1    0  0     1      0       1      0
1 0    0  1     0      1       1      0
0 1    1  0     0      1       1      0
0 0    1  1     0      1       0      1

Figure 9.7: Definitions of NOT, NAND and NOR gates as truth tables
The instruction set level. This comprises a set of about 100 different types of machine instructions, whose meaning is given by specifications in the manual for the machine in question. These meanings can also be described in terms of notional register transfers at the level below, for example:

ADD X1 X2         is described as   C,R[0] ← R[0] + R[X1X2]
LOAD X1X2X3X4 etc is described as   R[0] ← MEM[X1X2X3 etc]

However, these specifications are not in terms of real hardware registers in the machine but instead concern a notional serial machine. This difference occurs because the machine uses a lot of additional mechanisms, broadly termed instruction-level parallelism, to enhance its performance. For example, a basic arrangement is pipelining, in which each instruction is processed in a sequence of operations, and the processing of the sequence of instructions is overlapped in time using a series of different hardware processors, see Figure 9.9, taken from a Tübingen course by Gerald Heim, based on the work of Patterson and Hennessy [Patterson and Hennessy, 1996] [Hennessy and Patterson, 1998] on their DLX machine.
Figure 9.8: A logic circuit
Figure 9.9: Pipelining diagram defines serialized register transfer language in terms of hardware RTL
Other types of instruction-level parallelism are block transfer and caching, of data or program, in which a block of instructions or data is transferred together into a fast working store in the processor from which execution can be done faster. The description of this complex mechanism however is arranged to be a serial register-transfer language and the machine is designed to make this serial model be a correct description of its parallel operation. This illustrates, then, that an interface between levels of description can be quite complex and can involve complex evaluation and complex parallel mechanisms.
The C program level. Here is a simple C program:

    s = 0;
    for (i = 0; i <= n; i++)
        s = s + i;

This has a meaning specified by the semantics of the C language. It can be explained by an equivalent machine code program for a PDP-8 machine, taken from Siewiorek, Bell and Newell's book:
    start  cla          / clear AC
           dca s        / s = 0 (deposit AC in memory, clear AC)
           dca i        / i = 0
    loop   tad s        / two's complement add: AC = s
           tad i        / AC = s + i
           dca s        / store: s = s + i
           tad n        / AC = n
           cia          / negate AC (two's complement): AC = -n
           tad i        / AC = i - n
           sma          / skip next instruction if AC is negative
    stop   hlt          / halt
           isz i        / i = i + 1 (index by 1, skip if 0)
           jmp loop     / jump to loop

This defines the C program as a sequence of machine instructions, giving a description of the program at the instruction set level.
The Prolog runtime level. This description would be the C code that implements the mechanism of the Prolog language; it takes the form of the code listing of a Prolog interpreter written in C.
The Prolog level. The description takes the form of a Prolog program, which has semantics as specified in the ISO standard for Prolog.
9.4 Properties of descriptions
The equivalent descriptions at each level, even though they describe the same thing, each in its own terms, are not simply related. For example a description at one level does not map onto an adjacent level by a simple mapping, or homomorphism. The mappings between levels are complex specifications of one set of concepts in terms of others.
For example, the C program is written in a symbolic form, namely a C program consisting of nested expressions. By contrast, the machine instruction level has no nesting of expressions but is a strict linear arrangement of expressions for machine operations. Control flow is determined by skipping the next instruction depending on a test on data, or by an unconditional jump to another instruction.
Likewise, the register transfer level is very different from the logic gate level. The two are conceptually different even though they can in the end be mapped onto each other.
Note also that for a given expression at one level there will be many equivalent expressions at the level below.
In principle we can generate a complete description of the Prolog program in terms of logic gates; however, it would be of astronomical complexity. By using several levels of description languages, we can nevertheless understand exactly how the Prolog program can be implemented in terms of logic gates. The levels break down the complexity of the specification of the computer system into manageable steps.
9.5 Design, constraints and optimization principles
In addition to the problem of simply describing the system, computer science is based on a set of principles for optimizing performance. We need to be aware of these principles in order to see how they may need to be different for brain-like computers.
Computer system design should optimize the use of hardware resources, by minimizing processor time, RAM accesses, RAM storage, disk accesses, and context switching, and it should meet real-time requirements.
There is an even greater need to optimize programming cost: the time for writing programs, and for changing and updating them. This is usually achieved by (i) decomposing programs and minimizing interactions among subprograms (functional programming, defining data structure types, and object-oriented programming), (ii) making programs understandable by others, using high-level languages, storage mechanisms, control mechanisms, scope of variables, data typing and strong type-checking, and declarative descriptions, (iii) managing software teams, documentation, negotiation, and program development environments, and (iv) arranging for the coexistence of multiple programs, distributed programs, interface standards, and open systems. All of these processes may have analogs in computing by the brain.

Chapter 10
Brain science
Abstract. In this chapter, I define and explain my notion of Brain Science. It comprises a multilevel hierarchy of descriptions.
A possible set of levels is cell dynamics, single neurons, neural nets/associative memories, system level/neuropsychology, cognitive psychology, self theory/psychotherapy and social psychology.
Each level has its own self-contained vocabulary, methods and theories, and forms a scientific culture.
Levels interact with each other, lower levels providing explanations and definitions for higher level concepts and dynamics, and higher levels providing constraints and boundary conditions for lower-level concepts and dynamics.
I also adapt concepts of goal, plan, sequence, event and context for the description of information processing in the brain.
10.1 Describing the brain
10.1.1 Levels of description of the brain
Neuroscientists recognize and work at the levels shown in Figure 10.1, which we have modified from Gordon Shepherd’s standard textbook [Shepherd, 1994].
Thus, different levels of description of neural activity have been developed for small-scale activity, from the cell molecular level through the level of neural firing to the activity of small sets of neurons, at level 4 of this scheme. There tends to be a neglected gap between this highest neural level and the lowest cognitive level, cognitive psychology, at level 2. This is the system level, level 3.
Present-day modeling of cognitive processes uses computational concepts based on list processing. These models, although able to capture some aspects of cognitive phenomena, are not easily or directly relatable to neural processes or brain architecture.
Thus, on the one hand we have neuroscientists describing system-level phenomena using circuit-level concepts, and on the other hand we have cognitive psychologists using models with no correspondence to brain structure.
    level  level name          terms used                   description                       experimental data used

    1      behavior            sensing, motivation,         animal and human behaviors        characterization of behaviors,
                               response, motor action                                         behavior frequencies

    2      cognition           perception, memory,          cognitive psychology              psychological experiments,
                               attention                                                      imaging studies,
                                                                                              response and reaction times

    3      systems             system, area, nucleus,       interaction among                 PET, MRI, MEG imaging, EEG,
                               pathways                     brain subsystems                  single electrodes, lesions

    4      abstract neurons    centers                      organization of a few hundred     MRI, MEG, EEG
           and neural nets                                  interacting neurons
           neuron              action potential, firing,    integration of input and          single electrodes,
                               spikes, transmission         generation of output              chemical manipulation

    5      detailed modeling   synapses, dendrites          patterns of synaptic connection   single electrodes,
           of neurons                                       determine the integrative         chemical manipulation
           and synapses                                     action of a neuron
           synapse             uptake, inhibition           action of a complete synapse

    6      cell dynamics       genes, proteins,             synthesis of proteins in          in vitro observation,
                               development sequences,       development and in response       molecular manipulation
                               transcription                to intercellular signals          and intracellular measurement
                                                            and changes in activity

Figure 10.1: Levels of description in brain science
10.1.2 Natural science and computer science
Natural science, including the biology of the brain, has developed powerful mathematical languages for the description of transformations of energy from one form to another. However a fully scientific theory of the brain will involve the study of information pro- cessing. The validity of information measures for describing biological systems was first explicated by Broadbent [Broadbent, 1958]. Information, unlike energy, is not conserved, and we do not have scientific laws for the dynamics of information. It is however reason- able to assume that concepts such as information, data description, coding, computation, inference and memory will be needed in the scientific description of the brain. The disci- plines in which these concepts have been developed are computer science and electronic engineering. The use of formal logical methods to describe biological systems has been advocated, for example by Rudolf Carnap [Carnap, 1958].
10.2 A hierarchy of scientific cultures
I diagram in Figure 10.2 my idea of the field of Brain Science. It comprises the multiple disciplines of the different scientists who study the brain. Each level is a scientific culture with its own vocabulary, concepts, experimental methods, theories and models. Each level can operate in a self-contained way, making hypotheses, doing experiments, making models and validating theories, all within its own culture. Each culture interacts with others: the culture below provides more detailed explanations and mechanisms for concepts on the level above, and the culture above provides constraints on systems in the culture below. New information enters the system at each level, information that cannot be discovered at other levels. Thus no one culture is primary in the scientific process; Brain Science is multicultural.
These proposed levels are not cast in stone: there could be different levels, and there could be additional levels. Also, cultures need not form a linear ordering; there can be branching, where different aspects of one level have explanations from different cultures. For example, to explain a car one might use a more detailed electrical analysis for its electrical components and a fuel analysis for its fuel components. There can also be upward branching, for example where cognitive psychology explains, and receives constraints from, both linguistics and psychotherapy.
interacting selves, groups social psychology
mental states, consciousness, the self psychotherapy
cognitive mechanisms, motivation theories cognitive psychology
system level brain models, neurocognitive models neuropsychology
cortical layers, associative memory models neural nets
single neuron models, synapses, transmission single neurons
cell dynamics, synapse dynamics, genetic transcription cell dynamics
Figure 10.2: My concept of Brain Science
10.3 Scientific culture
Within one level of description, there is a culture of beliefs, knowledge, and methods. A culture has a language and a universe of discourse. The language has a vocabulary of terms and their definitions, explanations and allusions. A universe of discourse is the range of things that are discussed.
A scientific culture is self-contained, in the sense that it can for its central purposes op- erate without interacting with other cultures. Questions can be asked, hypotheses made, experiments designed and carried out, and conclusions drawn, all within the culture.
A scientific culture is usually so complex that any one person can only understand one culture, or not even that: one person may only be able to master one part of the culture, with a less complete, general knowledge of the entire culture.
A culture also has what I call a penumbra or folklore of ideas and evaluations that are understood and not expressed in papers. These are all the ideas that nobody knows how to get to work, the dreams of future achievements, the jokes and exaggerations that people exchange at coffee and on the bus. In my artificial intelligence laboratory, I would say that about 50% of all conversation consisted of jokes, some fraction of which were not about personalities but about computer science. These were not joke stories but witty remarks, plays on words, etc.
10.4 Information is generated at each level
Information can be obtained by observation, and further information can be obtained by inference and calculation. The evolution of the brain is a fact of history; it cannot be discovered from cellular models of DNA. Information about this historical fact is obtained by the science of paleontology, using its experimental methods and vocabulary.
Thus each level produces information and insights which cannot be deduced from theories at other levels. This is mainly from input into the system of experimental findings at each level.
10.5 Formal and computational models at each level
1. social psychology - statistical models of population
2. personality and psychotherapy
3. psychology - modular models without correspondence to brain areas
4. system level - modules and connecting channels
5. cortical layers - neural equations, back propagation
6. single neurons - detailed compartmental models
7. cell dynamics - mass action equations, diffusion equations
10.6 Interactions between levels
Explanations. A concept or word may have a definition at one level or it may be primitive at that level. This concept may have an explanation or definition at the level below. The regularities of the concepts for these explanations at the level below will also constrain the possible behaviors of the objects above.
Constraints. The objects at one level will have certain regularities of behavior, and these will act as constraints on what is possible for corresponding behaviors at the level below.
These are interactions between cultures, and there will be some interfacing problems, as each culture will have different vocabularies and different criteria for the truth and reliability of any given statement. In general, this interaction is a form of negotiation [Bond, 1989] [Bond, 1990].
10.7 A role for logic programming
My initial proposal is that the different precise languages at each level could each be defined using a single underlying general language, namely logic programming. To define a more specialized language, the general language would be specialized and extended using additional definitions in the general language. Definitions would take the form of logical assertions.
Thus for example, we would define, in Prolog, a language for creating neural models, another language for cell dynamics, and another for cognitive modeling, etc.
One advantage of having a single underlying language is that definitions of the interface between two levels would also be expressed in the language. In particular, variables from both levels could be used in such definitions since they would all be Prolog variables.
The usual response to this suggestion is that some other logic or language would be better. However, logic programming is a very stable method; systems are available for every type of computer, and tutorial material and examples are easily available. Its mathematical theory has also been explored in some detail. Further, most problems it has as a language are shared with all other proposed languages, which in any case lack tutorials, implementations and mathematical theory.
10.8 Neuroscience
There has been a tendency over the last decade to assume that only one level is the real scientific level for describing the brain and that this level is the neural level. In this neuroscience view, all higher, aggregated, information must be and eventually will be inferred from neural models. I believe this view has had some adverse consequences and I am suggesting that it should be replaced by my multicultural concept of Brain Science.
10.9 Concepts for describing information processing in the brain
In this section, I will discuss key concepts for describing information processing in the brain. These concepts of goal, plan, action sequence, event and context have been introduced and developed by neuroscience researchers over the past decades, and are used routinely in present day neuroscientific papers for explaining neural activity in the brain.
Information-processing terms are used in different ways in the research literature. Sometimes a term is used in a broad, descriptive, almost metaphorical manner; sometimes an operational definition is given for the behaviors, measurements and situations being described; and sometimes a brain mechanism is postulated. I will try to clarify these different uses and claims. My purpose in this chapter is to characterize the action of different brain areas by their experimentally observed involvement in certain kinds of data, processing and mechanism.
10.9.1 Goals
The notion of an action being goal-directed goes back at least to Edward Tolman [Tolman, 1932], who demonstrated such behaviors in rats and developed concepts such as goal-objects and means-ends action. Our working definition of goal is information which specifies a desired state that the organism tries to attain. Usually a desired state is not specified completely, but some parts or aspects of the state are specified or constrained.
Goal-directed behavioral responses have been explicated; see for example Gordon Mogenson's treatment [Mogenson et al., 1980] of goal-directed locomotion and of thirst and the control of water intake. Edmund Rolls [Rolls, 1999] has given a general treatment of goal-directed mechanisms.
At the neocortical level, the motor cortex may use explicit representations of target positions, which thus constitute neural representations of goals [Alexander and Crutcher, 1990] [Shen and Alexander, 1997].
Goal-directed systems with continuous properties are the subject of control theory, and discrete logical goals are used in artificial intelligence. An action at one level of control may have properties as a goal at lower levels.
Other aspects of goals include (i) competition among goals, with prioritization and selection (usually only one goal is selected at a time, although the selection of some set of compatible goals and actions may be possible), and (ii) goals as processes: monitoring and evaluating progress, and determining satisfaction and failure. Usually, the initiation, or termination, of activity is not conceived as part of a goal.
10.9.2 Plans
Karl Lashley, in his classic 1951 paper [Lashley, 1951] [Bruce, 1994], argued that sequences of human motor actions are guided not by chains of associations but by plans. Feedback does not determine the next action, because (i) there is too short a time for the signal indicating the completion of an action to reach the brain, (ii) errors in sequences reveal their long-range organization, and (iii) the same action can be part of two different sequences, hence the action cannot be the sole determiner of the choice of next action. A review by Steven Keele [Keele et al., 1990] asserts that a large body of experimental evidence supports Lashley's position, and also that plans are hierarchical in nature, as shown by latencies and grouping of different parts of observed sequences.
I take a plan to be information which guides the sequencing of action. One specific type of plan is a specification of a sequence of actions selected from a small number of elementary movements. The hierarchical decomposition into plan and elementary movements is not always perfect; an elementary movement may itself involve control, and the control of sequencing may depend on the control of individual actions. More generally, a plan could branch, and include actions which are conditional on the current state. Plans can also be sets of schemas [Shallice, 1982], i.e. conditional actions, and, yet more generally, goal schemas [Schwartz, 1995], which are conditional actions that contain subgoals and continue to function until their subgoal is attained.
10.9.3 Sequencing of action
By a sequence I mean that a process enters a state and/or generates an output at one time, and then at a later time enters another state and/or generates another, different output, and so on, so as to produce a sequence in time of different states and/or outputs. Sequencing could be determined by a clock of some kind; it could be "reactive", triggered by perceived stimuli; or it could be triggered by some kind of internal process, for example a cognitive problem-solving process, so that whenever such a process reaches the next state in a sequence of states, the next element in the sequence is generated. Sequencing of action in subcortical areas is well known. Simple sequencing, for self-grooming, can even be mediated by brain stem nuclei [Klemm and Vertes, 1990].
Whereas premotor areas are associated with the control of action sequencing, the motor cortex proper is not, and is more purely reactive, generating codes for specific movements.
Sequencing of motor plans has been shown to be associated with specific medial frontal cortical areas, whereas stimulus-guided or conditional sequencing seems to be associated with dorsal frontal areas [Tanji et al., 1996] [Luppino et al., 1991].
The premotor areas thus also involve some sensory perception, and indeed Gallese, Rizzolatti and coworkers have explored the notion of "mirror" neurons [Rizzolatti et al., 1996] [Rizzolatti et al., 1998].
The only published experiments seem to involve either self-paced internally generated sequencing or else sequencing guided by perception of external events. More general than this would be a combination in which, at each step, perceived external events are combined with remembered sequencing information in determining the next action.
Higher-level sequencing at the level of a plan can be inferred from the lesion work of Michael Petrides [Petrides, 1994]. In the case of delayed non-match to sample experiments, prefrontal areas are involved in the control of sequencing. Unfortunately these behaviors are very simple, and thus a simple processing strategy of maintaining a single activation of a remembered stimulus will allow the monkey to carry out the experimental task [Fuster, 1997]. Thus, very little experimental evidence for higher-level sequencing has been obtained for non-human primates.
However, problems with plan sequencing and other executive errors are very well-known in human frontal patients [Shallice, 1982] [Stuss and Benson, 1986] [Shallice and Burgess, 1991a] [Shallice and Burgess, 1991c].
10.9.4 Representations of events
More generally, we will be able to characterize hierarchy better if we can characterize the perception, representation, storage and behavioral influence of events. Let us try to define what this might mean, and how it might manifest itself in experimental results.
One general level of representation would be the representation of perceptions of complete external events, meaning a spatial context, and external objects and their movement. However, a number of distinctions and gradations can also be made. One kind of event is an instantaneous movement; another is an entire episode over a time period. An episode typically involves a selection of stimuli and dimensions, a choice of time granularity, and a selection of intervals within the time period.
There is ample psychological evidence for the use of an episodic memory by human and non-human primates [Tulving, 1983], and evidence for episodic memory involvement in various brain areas, such as orbital frontal and retrosplenial areas [Shallice et al., 1994] and the temporal pole [Markowitsch, 1995]. However, the types of episodes that have been used in such experiments are quite limited, usually to a single point in time and to a small number of stimuli or stimulus dimensions.
Representations of types of event or episode, if used by the brain, would be of great value, since they could provide representations of context for detailed memory indexing and for action selection. One could perhaps think of the context as providing the value of an index specifying possible subsets of memory and action, from which a final selection is made using more local and immediate information.
A second yet more general level of representation would be of complete mental events which would include not only the perceptual representations of external events but also some aspects of the current mental state, such as attentional tuning, and goals and plans currently active. Such a representation could exist in the brain at a given time. If so, it would no doubt be distributed over many neural areas which would mutually activate each other.
10.9.5 Social interaction
Possibly the most general classes of event or situation are those involving other primates in social action and interaction. A social event might include the perceptions of conspecifics, their dispositions and intentions, joint actions with them, joint goals and joint plans. There are results showing that certain types of social features and events are processed separately in the neocortex [Perrett et al., 1989a] [Harries and Perrett, 1991] [Desimone, 1991]. For reviews of social function in the brain see [Brothers, 1990] [Brothers, 1996] and [Adolphs, 1999].
10.9.6 Contexts
The notion of context has a lively history in psychology, linguistics and philosophy. We can perhaps define a context as a “framework or background of information with respect to which more specific items of information can be identified or manipulated” [Miller, 1991].
Context is perhaps information concerning the current situation which is relatively constant while we are in that situation. Knowing the context allows us to retrieve and use information that is specific, appropriate or tuned to that situation.
My idea of context is that of information which is quite general concerning the current situation and its characterization. Context is activated and maintained by the brain and facilitates the selection and discrimination of objects and actions by providing a global level of indexing or association. Contextual information is provided to various brain areas which use it to function in a more focused and unambiguous manner by constraining possible choices. Nicholas Humphrey [Humphrey, 1984] has argued that the active use of context information such as beliefs, stereotypes, expectations, and contextual knowledge is an essential requirement if an individual is to comprehend and react rapidly and effectively to his or her social world.
The cognitive psychologists Sutherland and MacKintosh [Sutherland and MacKintosh, 1971] reviewed the evidence for selective attention and concluded that “dimensions” of the context were selected independently of the specific items having those dimensions. The psychologist Tolman in an associationist learning tradition introduced the notion of cognitive maps which were frameworks learned by animals and then used to guide problem solving. Neuroanatomists John O’Keefe and Lynn Nadel [O’Keefe and Nadel, 1978] argued for the creation of spatial maps in rats by the hippocampus. For humans they suggested these maps might be more generally cognitive.
Episodic memory in humans is thought to be learned in the hippocampus and stored in the neocortex. Memory for a specific episode type is then evoked by the current situation and acts as a context in the retrieval of more specific memories and in the selection and execution of plans.

Part II
The cortex
Chapter 11
Information-processing analysis
Abstract: In this chapter, I develop a set of concepts for analyzing the brain in terms of its functional architecture, by which I will mean what processing components exist, how they are interconnected, and what information-processing functions each is involved in. I characterize the information-processing function for each neural area in terms of the types of information it is associated with, and conceive of its activity as processing, storage and transmission of data of the corresponding types for that area.
11.1 Introduction
The primate brain as a whole has a hierarchical structure, related to its developmental order [Bullock, 1977] [Romer and Parsons, 1986], and some hierarchical functioning [Kandel and Schwartz, 1999], pointed out for example by Paul Maclean [Maclean, 1970]. Within the primate neocortex, some hierarchical structure is widely accepted, an early trailblazing paper being that of Jones and Powell [Jones and Powell, 1970]. Hierarchies of perception and motor action are well known; however, a hierarchical structure encompassing most of the neocortex is not established. The mathematical analysis of cortical connectivity by Malcolm Young [Young, 1993] did not yield a perception-action hierarchy. The psycholinguistic analysis of Jerry Fodor [Fodor, 1983] suggested that higher-level functioning is nonmodular and probably nonhierarchical. Fuster [Fuster, 1997], in his study of the prefrontal cortex, has suggested a hierarchical structure which is diffusely distributed and nonmodular [Fuster, 1995]. Some time ago, the control theorist Jim Albus [Albus, 1981] suggested a hierarchical control system concept for the cortex, Ulric Neisser [Neisser, 1976] discussed what he called the action-perception cycle, and the brain theorist Michael Arbib [Arbib, 1981], for example, has described perceptual-action interaction for motor control.
The neuroanatomist Deepak Pandya and coworkers [Pandya and Yeterian, 1990] have described some hierarchical structure of the neocortex; however, their study did not use an action hierarchy. They examined lateral connections between perception hierarchies and corresponding areas in the frontal lobes and showed that these had similar architectonic structure, which they related to the theory of Friedrich Sanides [Sanides, 1970] of the phylogenetic development of the neocortex. More recently, the anatomical connectivity of the frontal lobes has been clarified by Helen Barbas and coworkers [Barbas and Rempel-Clower, 1997], who showed that there is a spatial sequence of areas whose order is predicted from their architectonic properties. The hierarchy of functioning in the frontal lobes has been investigated by Michael Petrides [Petrides, 1994] using a lesioning technique. Observed memory characteristics also support the hierarchical ordering, in showing a corresponding increasing memory ability and increasing characteristic memory time.
My concept of the information-processing function of a given brain area will be that it computes data of given types. A given area receives data of certain types and computes data of other types, which it may then store and/or transmit to other areas. I will take experimental evidence for information-processing function to be based on experimental evidence for the types of data that are observed being processed by a given area. This concept of function differs from characterizations in terms of contribution to functioning in the external environment of the organism.
This chapter. In section 11.2, I examine the concept of hierarchy and its application to the neocortex. In section 11.3, I discuss the concepts of neural areas and neural connections in the neocortex.
11.2 Hierarchies
11.2.1 The concept of hierarchy
Hierarchy is a basic organizational principle in biology. The usual concept of hierarchy is a set of elements and a relation which specifies whether one element is "above" another. This relation holds only between some pairs of elements; in other words, it is a partial ordering. A hierarchy can occur where there is a sequence of different anatomical structures with gradually increasing measures of some biological property, such as size, density or complexity. However, a set of identical or similar structures can also form a hierarchy by their mutual relative arrangement and connectivity, in which case the position of an element in the hierarchy is determined by its position within the total anatomical arrangement. A notion of cortical hierarchy has also been defined by David Felleman and David Van Essen, with feedforward and feedback connections defined by which cortical layers are involved. What is more cogent is a hierarchy of function. In the case of nervous systems, the main functions of interest concern the transmission, processing and storage of information. I will take processing to include computation, with the generation of new information forms from other input forms, and also learning, which includes alteration of function and the creation of new information. Elements higher in a hierarchy could perform more general or more abstract processing, and store more general or more abstract data.
11.2.2 The elements of hierarchies
My first basic question is: what are the components of the neocortex that we should consider as elements of a possible hierarchy? The kinds of elements that have been postulated as the functional components of the neocortex include: (i) the gene and cell [Shepherd, 1994], (ii) the single neuron as an integrative system, using dendrites and dendritic microcircuits [Barlow, 1972] [Shepherd, 1990], (iii) Hebbian assemblies of neurons in a fine-grained distribution over the cortical surface [Hebb, 1949] [Fuster, 1995], (iv) cortical columns [Mountcastle, 1957] [Szentagothai, 1972] [Szentagothai, 1983] [Mountcastle, 1995a] [Mountcastle, 1997] [Malach, 1994] (Purves [Purves et al., 1994] argues for other structures of similar scale but which are dynamical), (v) cortical areas [Brodmann, 1909] [Felleman and Essen, 1991]. This is the mainstream concept. Areas are defined by (a) cytoarchitectonic distinguishability, (b) clustering of interconnectivity, and (c) subcortical connectivity, notably thalamic, and (vi) regions, made up of contiguous sets of related cortical areas [Pandya and Yeterian, 1990]. In this review, I will mainly consider structure based on neural areas; however, I will also identify aggregations of neural areas, thereby defining regions.
11.2.3 Sensory and motor hierarchies
Perhaps the best known hierarchical concept in the neocortex is the visual perceptual hierarchy. It seems that neurons in areas V1, V2, V4, and so on, respond to increasingly higher-order visual features [Essen et al., 1990]. Thus, we have a hierarchy of function based on the processing of visual features. The information-processing functions involve taking input information from lower levels of the hierarchy and computing new information representing the presence of higher-order visual features of the perceived scene.
Higher-order features can mean simply “any information derived from lower-order features”, but in biological visual systems the features are usually more general in the sense of responding to or describing broader classes of stimuli. In other words, the features are derived from events occurring in a greater spatial region, over a greater temporal interval, involving more stimulus dimensions, and so on.
Similarly, there is experimental evidence for a hierarchy of motor function in the entire brain [Shepherd, 1994] and in the neocortex in particular [Porter, 1990] [Picard and Strick, 1996] [Riehle, 1991]. Higher levels process general descriptions of action to be performed, and lower levels process very specific motor patterns. A good description of issues in hierarchical control in the nervous system has been given by Peter Greene [Greene, 1972].
In such a case, it is not necessary for the processing higher in the hierarchy to be any more complex; it could be of a similar complexity to that at other levels, but simply operating on more general data. The uniformity of cortical circuitry suggests a uniformity of complexity.
There is evidence for the representation of goals, in the sense of target positions, at several different levels in the motor hierarchy. The anterior cingulate region is typically found to be involved in different aspects of goal-directed behavior [Devinsky et al., 1995], such as monitoring of progress, and connections between stimulus and reward [Elliott and Dolan, 1998] [Carter et al., 1998].
To the extent that there is a temporal sequencing of action, we expect that at higher levels there will be longer time intervals between elements of represented sequences [Tanji et al., 1996] [Rizzolatti et al., 1998].
11.2.4 Possible bases for hierarchy in the neocortex
My second basic question is what is the basis for the hierarchical ordering relation speci- fying that one element is “above” another element in the hierarchy? We can list a priori several different aspects of information processing which may characterize hierarchy in the neocortex:
1. hierarchy of data type - the generality of the data being processed by an element.
2. hierarchy of complexity of processing - the amount of processing and data occurring in an element.
3. hierarchy of motor function - the generality of action described by an element
4. hierarchy of temporal scale - temporal sequencing of action, temporal scale of percept, temporal scale of memory associated with the element.
5. hierarchy of goal description - the generality of the goal being processed.
6. hierarchy of memory - short to long duration, more specific to more general data, smaller to larger capacity.
7. hierarchy of control - the element may exercise control or influence over a larger number of other elements. This can be control over elements involved in the construction of an action, or control over what different elements attend to or are tuned for.
Any one or any combination of these aspects, and others not listed here, could characterize a given biological information-processing hierarchy. Further, a given set of elements could have more than one hierarchical characterization based on different aspects.
11.3 Anatomical regions and connections
11.3.1 Neural areas
The anatomical parcellation and connectivity used here is taken mainly from studies of macaque monkeys. Supporting analyses for other species of primate have been described: for prosimians by, for example, Todd Preuss and Patricia Goldman-Rakic [Preuss and Goldman-Rakic, 1991b], for chimpanzees by Percival Bailey et al. [Bailey et al., 1950], and for humans by, classically, Korbinian Brodmann [Brodmann, 1909], and more recently reviewed, for example, by Karl Zilles [Zilles, 1990]. I will call the parcellated cortical divisions neural areas.
Parcellation starts from an architectonic analysis of the cortex: areas are delimited by relatively sharp changes in architectonic measures such as the densities of cells in each layer and neurotransmitter activity. The parcellation can then be strengthened and confirmed from connectivity data: neurons in a given area tend to have the same types of connections, and to the same other areas.
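The confirmation step just described amounts to checking that sites grouped into one area have similar connection profiles. A minimal sketch, with invented site names and invented connection profiles (no actual anatomical data), using Jaccard similarity of target sets:

```python
# Sketch: confirming a parcellation from connectivity data.
# Sites placed in the same area should share connection targets.
# The sites and their target sets below are invented, for illustration only.
profiles = {
    "site1": {"A", "B", "C"},
    "site2": {"A", "B"},       # overlaps site1 heavily -> same putative area
    "site3": {"X", "Y", "Z"},  # disjoint targets -> a different area
}

def jaccard(p, q):
    """Similarity of two connection-target sets: |intersection| / |union|."""
    return len(p & q) / len(p | q)

print(jaccard(profiles["site1"], profiles["site2"]))  # high similarity
print(jaccard(profiles["site1"], profiles["site3"]))  # no similarity
```

Grouping sites whose pairwise similarity exceeds a threshold then yields candidate areas, which can be compared against the architectonic boundaries.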
A detailed parcellation of the occipital and parietal lobe visual areas has been given by David Felleman and David Van Essen [Felleman and Essen, 1991]. Detailed parcellation of the frontal lobe has been given by Helen Barbas [Barbas and Rempel-Clower, 1997].
The main neural areas of the primate neocortex are quite well established, even though their identities are under constant investigation and refinement.
There is no agreement on notation: Brodmann used numbers, but then, for example, Constantin von Economo and Georg Koskinas [Economo, 1925] introduced an alphabetic notation. More recently, finer subdivisions of Brodmann areas have been given subscripts, see for example [Carmichael and Price, 1994]. We will use a mixed notation which attempts to use, for each part of the cortex, the notation used by the main neuroscientists studying that part. Figure 11.1 diagrams the neural areas and notation that we will be using. Zilles [Zilles, 1990] gives a conversion table between several different labeling schemes.
Figure 11.1: Neural areas and notation used
11.3.2 Anatomical connections and their analysis
Jones and Powell. Let me briefly describe the classic findings of Jones and Powell. The stated main aim of their paper was to find convergence areas of sensory data, and the main finding was a sequence of association connections for each sensory modality, ending in the frontal lobes: “... in each of the three systems studied 1 there is a stepwise, outward projection from the main sensory areas within both the parieto-temporal and frontal lobes with an interlocking of each new parieto-temporal and frontal step.... each primary area projects to a local area in the same lobe and to a portion of the premotor cortex in the frontal lobe.” In their experimental method, gray matter from a given area was removed, attempting to avoid damaging the white matter. This was done in the intact animal; rhesus monkeys were used. After a few days the animal was perfused and the degeneration of neurons observed, giving the connections from that area. For each area lesioned, they found that only a small set of other areas was affected. For example, on lesioning area 7, the areas affected are shown in Figure 11.2.
1 i.e., somatosensory, auditory and visual
Figure 11.2: Lesioning an area affects a small number of other areas
The pattern of connectivity they discovered is diagrammed in Figure 11.3.
Figure 11.3: Pattern of connectivity discovered by Jones and Powell
Figure 11.4 shows the three sequences reported by them. We can see evidence for a perception-action hierarchy; however, the frontal connections were unclear at that time.
Figure 11.4: Summary diagram showing the three sequences reported by Jones and Powell
Jones and Powell assert that “The significance of the double projection pattern, the one local and the other to the frontal lobe ... is obscure”. They concluded that sensory data seemed to stay separate for several steps of the sequence. Convergence of sensory data seemed to occur in STS and OFC. They speculated that frontal areas are concerned with sensory data, sensorimotor integration, and with learning and discrimination.
Evidence for neural areas. The original evidence was architectonic, exemplified by the work of Brodmann. Fifty years later, connectivity findings, such as those shown in Figure 11.2, provided basic evidence supporting the existence and significance of these neural areas. The connections from a given area do not go to neurons all over the cortex; they are tightly clustered, going only to a small set of other areas.
Further, the partitioning into areas from connectivity derived from one “source” area will be similar to that derived from other source areas. These areas also are consistent with partitionings by connectivity from subcortical areas, notably the thalamus. Further, these neural areas are relatively constant from one individual to another, and from one species of primate to another.
Per Roland, after two decades of pioneering brain imaging experiments, reported in his book [Roland, 1993], postulated his cortical field activation hypothesis [Roland, 1985], which states that “neurons in the cerebral cortex always change their biochemical activity, not in a scattered or singular fashion, but in large distinct ensembles, each covering some 800 mm3 to 3000 mm3 of the cortex. Within the field, the active synaptic regions form columns and bands of raised metabolic activity” [Roland, 1993].
What has not been established is how constant these activation areas are over different tasks, and what is their correspondence to areas found by connectivity.
The work of Pandya and coworkers. Over a twenty-year period, Pandya and various coworkers have systematically investigated corticocortical connectivity. Summaries of their work can be found in [Pandya and Yeterian, 1985] [Pandya and Yeterian, 1990]. They used various techniques: the older papers used lesioning, and later work used anterograde and retrograde tracing. They investigated the same three hierarchies as Jones and Powell. Figure 11.5 summarizes their findings, taken from Figures 17, 19 and 21 of [Pandya and Yeterian, 1990]. Clear hierarchies are seen.
Figure 11.5: Summary of hierarchies reported by Pandya and coworkers
The work of Helen Barbas on the connectivity of frontal areas. Helen Barbas [Barbas, 1988] [Barbas, 1992] [Barbas and Rempel-Clower, 1997], using tracing techniques, has described the intrinsic connectivity of the frontal areas, showing its hierarchical structure. Her results are shown in Figure 11.6.
Figure 11.6: Intrinsic connectivity of frontal areas, from (Barbas and Pandya, 1989)
Determining hierarchical structure using regions and function. It is not easy to determine the hierarchical structure of the cortex, and this is probably why such a determination is not generally attempted. I will use anatomical connectivity data which I have researched myself, but which is close to and includes that already published by Young [Young, 1993], and I give itemized references to the connectivity findings. My approach differs from Young’s in two important respects. First, I use functional information as well as connectivity information in defining our hierarchies. Young’s work showed that anatomical connections alone do not give enough information to derive a definite architecture. Second, I do not treat the entire cortex as having a single connectivity matrix. Instead, I first divide the cortex into six main parts, thus breaking up the problem of describing connectivity into two levels, namely connections within parts, which we will call intrinsic, and connections between parts, which we will call extrinsic. This, I believe, allows more neuroscientific intuition to be brought to bear.
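The two-level scheme just described, dividing the cortex into parts and then classifying each connection as intrinsic (within a part) or extrinsic (between parts), can be sketched directly. The part assignments and the connection list below are invented examples, not the actual six-part division:

```python
# Two-level description of connectivity: intrinsic vs. extrinsic connections.
# The part assignment and connection list are invented, for illustration only.
part_of = {
    "V1": "visual", "V2": "visual", "V4": "visual",
    "PE": "somatosensory", "PF": "somatosensory",
    "46": "frontal",
}
connections = [("V1", "V2"), ("V2", "V4"), ("PE", "PF"), ("V4", "46")]

def classify(conns, part_of):
    """Split a flat connection list into intrinsic and extrinsic sublists."""
    intrinsic, extrinsic = [], []
    for a, b in conns:
        (intrinsic if part_of[a] == part_of[b] else extrinsic).append((a, b))
    return intrinsic, extrinsic

intrinsic, extrinsic = classify(connections, part_of)
print(intrinsic)  # connections staying within one part
print(extrinsic)  # connections crossing between parts
```

The single connectivity matrix is thereby replaced by one small intrinsic matrix per part plus a coarse extrinsic matrix between parts, which is the simplification the text argues for.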
Conclusion. The overall structure, both anatomical and functional, of the primate brain is that of a set of cortical regions with spatial localization and clustered interconnectivity. Each region is specialized in data processing and data storage for a limited number of data types. Cortical regions are connected into a perception hierarchy and a planning and action hierarchy, with connections between corresponding levels of these two hierarchies. The perception hierarchy is based on data representing increasingly general situations. The planning and action hierarchy is based on increasingly general situations, plans and control.
11.4 Sensing as the construction of descriptions
Sensors do not deliver raw information but rather a feature or set of features of the stimulus. Figures 11.7 and 11.8, from Blum [Blum, 1990], list a classification of sensory receptors together with the features they detect and the conduction velocity.
The diameter of the fiber in micrometers is approximately the conduction velocity in meters/sec; thus 1 micrometer corresponds to 1 meter/sec. Myelination gives 4-6 times the unmyelinated velocity, and fiber groups I to IV run from fastest to slowest. “At the level of the primary afferent fiber, a complex sensory stimulus is broken down into a number of very specific features” [Blum, 1990]. A feature may involve temporal integration, temporal adaptation and sensitization, or temporal differentiation. Each sensor: 1. has a defined adequate stimulus, 2. is responsive to the adequate stimulus within its receptive field, and 3. has a rate of adaptation.
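The diameter rule of thumb can be written out as a one-line calculation. The 4-6x myelination range is from the text; the midpoint factor of 5 and the function name are my own choices, for illustration only:

```python
def conduction_velocity(diameter_um, myelinated=False, myelin_factor=5.0):
    """Rule of thumb from the text: an unmyelinated fiber of d micrometers
    conducts at roughly d meters/sec; myelination multiplies this by 4-6x
    (a midpoint of 5 is assumed here, purely for illustration)."""
    v = float(diameter_um)  # 1 micrometer ~ 1 meter/sec, unmyelinated
    return v * myelin_factor if myelinated else v

print(conduction_velocity(1.0))                   # 1 um unmyelinated -> ~1 m/s
print(conduction_velocity(4.0, myelinated=True))  # 4 um myelinated  -> ~20 m/s
```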
Figure 11.7: Sensory features - first part
Figure 11.8: Sensory features - second part
Chapter 12
An information-processing analysis of the primate neocortex
Abstract: In this chapter, I analyze experimental evidence concerning the perceptual areas of the primate neocortex, drawing conclusions about the existence of neural areas, the corticocortical connectivity among neural areas, and the involvement of each cortical neural area in the functioning of the brain.
I analyze the neocortical perception hierarchies one by one: olfactory, gustatory, somatosensory, auditory, visual (ventral then dorsal), and finally the polymodal areas of the superior temporal sulcus (STS).
This analysis shows that these areas consist in the main of an interconnected polymodal perception hierarchy.
In the next chapter, I will extend this analysis to the frontal areas.
12.1 Introduction
Even if we accept that the primate neocortex contains hierarchies of perception and action, the nature of these hierarchies is far from agreed. My purpose in this chapter is to analyze, in terms of information-processing, the connectivity and functional properties of the primate neocortex to determine just what hierarchical structure, if any, is present. As will be seen, it is by no means obvious what hierarchical structure exists in the neocortex, partly from lack of data but also because existing data do not always suggest a straightforward hierarchy. However, when all the experimental data are gathered together and analyzed, we have clear evidence for a hierarchical structure of information processing in the neocortex. This structure is based on anatomical regions, and has hierarchical anatomical connectivity and hierarchical functionality. I will show (i) a parallel set of perceptual hierarchies based on increasing generality of data processed, temporal scale and memory, (ii) an action hierarchy based on action generality, temporal scale and memory, and (iii) lateral connections between corresponding levels of these perceptual and action hierarchies. In addition, there is no negative evidence, i.e., we know of no experimental evidence that contradicts our hypothesized hierarchical architecture.
I have not included the polymodal areas in the posterior cingulate or the intraparietal sulcus, since the data on these areas were not clear enough to us. Neither have I taken into account the bihemispheric structure of the brain, postponing its consideration until a more extensive study.
The establishment of an architectural design, such as the laterally-connected perception-action hierarchy to be described here, would provide an important simplifying principle for the interpretation of experimental findings and for the generation of experimental questions and hypotheses. Furthermore, as I describe in detail in chapter 15, from the description of the hierarchy, giving its elements and their corresponding processing and data types, with temporal and memory characteristics, and its connectivity, we will be able to derive a dynamic causal model of information-processing activity in the primate neocortex.
Demonstration of the existence of a functional hierarchical scheme will require a review of the experimental literature, which we will now undertake. Limitations of space preclude us from describing experimental paradigms and results in detail, or from listing all supporting references.
I list all the regions as sets of areas in Figure 12.1.
Hierarchy                          Region  Areas
olfactory hierarchy, OA            OI      olfactory cortex
                                   O1      14c, 25, 13a, 13m, IO
gustatory hierarchy, GA            GI      operculum, IG, SI taste and tongue areas
                                   G1      polymodal taste area in 12o
                                   G2      polymodal taste area in 13l
auditory hierarchy, AA             AI      auditory cortex
                                   AA1     Tpt, paAlt and caudal TS3
                                   AA2     rostral TS3 and TS2
                                   AA3     TS1, TPro
somatosensory hierarchy, SA        SI      1, 2, 3
                                   SA1     PE, PEa, PF
                                   SA2     PEc, rostral POa, PFG
                                   SA3     PGm, rostral PG
ventral visual hierarchy, VV       VI      V1, V2, V3, VP, V3A, V4t
                                   VV1     OAa, TE3, V4
                                   VV2     TE2
                                   VV3     TE1, TPro
dorsal visual hierarchy, DV        DV1     PIP, PO
                                   DV2     MIP, VIP, LIP, DP, MDP
                                   DV3     caudal 7a
polymodal hierarchy in STS, PM     PM1     TPO-4, TPO-3
                                   PM2     TPO-2
                                   PM3     TPO-1, TPro
planning and action hierarchy, PA  MI      4
                                   PA1     6
                                   PA2     8
                                   PA3     46
                                   PA4     9, 10
                                   PA5     11, 12
                                   G       24, 32
Figure 12.1: Table of all hierarchical regions
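The region table lends itself naturally to a nested mapping from hierarchy to an ordered list of (region, areas) levels. A fragment, transcribed from the table of Figure 12.1 (the dict layout itself is just one convenient encoding, not part of the theory):

```python
# Fragment of the region table of Figure 12.1, as nested mappings.
# Levels are listed bottom-up; region and area names are as in the table.
hierarchies = {
    "ventral visual hierarchy, VV": [
        ("VI",  ["V1", "V2", "V3", "VP", "V3A", "V4t"]),
        ("VV1", ["OAa", "TE3", "V4"]),
        ("VV2", ["TE2"]),
        ("VV3", ["TE1", "TPro"]),
    ],
    "auditory hierarchy, AA": [
        ("AI",  ["auditory cortex"]),
        ("AA1", ["Tpt", "paAlt", "caudal TS3"]),
        ("AA2", ["rostral TS3", "TS2"]),
        ("AA3", ["TS1", "TPro"]),
    ],
}

# A region's hierarchical level is simply its index in the list,
# so, e.g., VV2 sits two levels above the primary region VI.
levels = {name: i for i, (name, _) in
          enumerate(hierarchies["ventral visual hierarchy, VV"])}
print(levels["VV2"] - levels["VI"])  # 2
```

Such an encoding makes the analysis of the following sections mechanical: queries like “which areas are at the same level in two different hierarchies” become simple index lookups.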
12.2 Analysis of neocortical perception hierarchies
12.2.1 Olfactory areas
Anatomy. I diagram the olfactory areas in Figure 12.2, which is based on a figure by Thomas Carmichael and Joseph Price [Carmichael and Price, 1994].
Figure 12.2: The olfactory hierarchy
These areas and their connectivity have been extensively discussed by Price [Price, 1990] [Price, 1991] and by Rolls [Rolls, 1995]; see also [Davis and Eichenbaum, 1991]. The primate olfactory system differs from that of lower mammals in having direct connectivity from the olfactory nuclei to olfactory cortical areas, as well as via thalamic connections. From the described anatomy and functionality, we can identify the following regions: Region OI. This is the olfactory cortex, described for example by Gordon Shepherd [Shepherd, 1994] and Lewis Haberly [Haberly, 1990], and consists of several different interconnected regions, including the anterior olfactory nucleus, the piriform cortex and the olfactory tubercle. According to Thomas Carmichael and coworkers [Carmichael et al., 1994], it provides detection and discrimination of odors, as well as simple odor memories. Region OL1. This is the neocortical olfactory area described by Carmichael et al. [Carmichael et al., 1994], who give a detailed architectonic division of the orbital frontal and insular cortices [Carmichael and Price, 1994]. OL1 is their areas 14c, 25, 13a, 13m and IO. According to them, it provides odor-guided behaviors, forced-choice olfactory discrimination, and mating and sexual behaviors, although most of the experimental evidence comes from rodent work.
12.2.2 Gustatory areas
Anatomy. I diagram the gustatory areas in Figure 12.3, derived mainly from the work of Edmund Rolls and coworkers [Rolls, 1995].
Figure 12.3: The gustatory hierarchy
From the described anatomy and functionality, we can identify the following regions: Region GI. The primary gustatory cortical area is located in the operculum and insula, and is described by Rolls [Rolls, 1995]; see also [Shepherd, 1994] and [Finger, 1991]. It is concerned with the timing and detection of taste and the characterization of types of taste. It is unnecessary for reflex responses to gustatory stimuli, but necessary for normal retention of learned taste aversions. Region GU1. This secondary gustatory cortical area, approximating 12o, integrates gustatory, satiety and olfactory information, according to Rolls [Rolls, 1995]. Region GU2. This is a polymodal area, approximating 13l, which integrates visual, gustatory and olfactory information, again according to Rolls [Rolls and Baylis, 1994] [Rolls, 1995].
12.2.3 Somatosensory areas
Anatomy. I diagram the somatosensory areas in Figure 12.4.
Figure 12.4: The somatosensory hierarchy
I use the notation of Deepak Pandya and coworkers for the connections and architectonics of the somatosensory areas. Area PG of Pandya and coworkers is approximately the same area as Brodmann 7a. I will use rostral PG to refer to the somatosensory part of PG, and caudal 7a to refer to the dorsal-visual part of 7a. From the described anatomy and functionality, we can identify the following regions: Region SI (areas 1, 2 and 3). A comprehensive treatment of the somatosensory systems of primates has been given by Jon Kaas and Timothy Pons [Kaas and Pons, 1988]. The primary somatosensory area SI has topographic and nontopographic somatic representations [Phillips et al., 1988], and is involved in basic processing of somatic sensation, e.g., texture and angularity [Pandya and Yeterian, 1990]. Region SS1 (areas PE, PEa and PF). There are re-representations of the somatic body regions in the parietal lobe [Kaas et al., 1981] [Merzenich et al., 1981]. These areas are thought to be involved in more complex and integrative functions [Lynch, 1980]. There are representations of spatial form derived from earlier topographic representations, for example tactual form and texture [Johnson et al., 1995]. A review by Juhani Hyvärinen [Hyvärinen, 1982] concludes that the somatosensory association cortex provides a “somatosensory coordinate system for goal-directed voluntary movements”. Mortimer Mishkin [Mishkin, 1979] concluded that there is a hierarchy of somatosensory perception and drew an analogy to the ventral visual hierarchy.
Regions SS2 and SS3 (areas {PEc, rostral POa and PFG} and {PGm and rostral PG}, respectively). Vernon Mountcastle [Mountcastle, 1995b] has reviewed work on the parietal lobe areas. Area 5 is involved in somatosensory guidance of voluntary reaching, grasping and joint rotation. Reaching involves the projection of the arm and hand, and in grasping the hand adapts to the spatial contours of the target. “Grasping” neurons have been found in rostral POa and, for visual control, in LIP.
12.2.4 Auditory areas
Anatomy. The main auditory areas of the temporal lobe and insula are shown in Figure 12.5, taken from a review by Jon Kaas et al. [Kaas et al., 1999].
Their diagram is based on data from [Hackett et al., 1998a] [Hackett et al., 1998b] [Hackett et al., 1999] and [Romanski et al., 1999]. Areas RT, R and AI form the core, and areas RTM, RM, CM, CL, AL and RTL form the belt. The parabelt is shown as RPB and CPB, its rostral and caudal parts. LS is the lateral sulcus and STS the superior temporal sulcus. When these are in their normal closed position, only CPB, RPB and STG, the superior temporal gyrus, are visible. Note that Tpt and 22 form the planum temporale, a planar region on the upper surface of the posterior temporal cortex. The connectivity of the areas comprising the core, belt and parabelt shows a division of these areas into rostral and caudal regions, and indeed the summary diagram in [Hackett et al., 1999] shows the connections between rostral and caudal parts as weaker than the connections between different rostral parts or different caudal parts.
Figure 12.5: The auditory hierarchy
The intrinsic connectivity incorporating Kaas et al.’s findings is given in Figure 12.5(b). I have moved the rostral parts higher in the diagram because of their connectivity to frontal areas, to be described later. This frontal connectivity is based on the findings of Kaas et al. and also on the findings of Pandya and coworkers [Pandya and Sanides, 1973] [Pandya, 1995], who described the basic architecture several years ago. In addition, we have included the auditory medial areas CML and 23b. These were originally described by Goldman-Rakic et al. [Goldman-Rakic et al., 1984] in their study of connections between the principal sulcus and the hippocampal complex. Masao Yukie [Yukie, 1995] has described their connectivity to other auditory areas. Unfortunately, to our knowledge, their full connectivity to other frontal areas has not yet been described. The main frontal connectivity from all auditory areas forms the arcuate fasciculus, which runs near the medial surface.
From the described anatomy and functionality, we can identify the following regions: Region AI. To maintain consistency in my own notation, I will name the entire core region AI. Kaas et al. use the name AI for just one of the areas of the core. According to them, the core region is cochleotopically mapped and is involved in the perception of pure tones. AI (in our notation) has three, four or five subareas in most species of monkey [Aitkin, 1990] [Brugge and Reale, 1985]. Auditory cortical areas in man have been described by Gastone Celesia [Celesia, 1976]. Processing in AI uses audiofrequencies projected in a regular serially ordered way (tonotopic representation). The cochlea is also re-represented point by point, and indeed each cochlea is represented bilaterally [Buser and Imbert, 1992]. Primary areas are involved in elementary auditory processing such as frequency and amplitude [Pandya and Yeterian, 1990] [Brugge and Reale, 1985]. Auditory cortex may also be involved in the localization of sound [Heffner and Heffner, 1990].
Nobuo Suga and coworkers have described in detail how an upstream flow of information in the auditory system of the mustached bat results in tuning of the midbrain frequency map [Gao and Suga, 1998] [Yan and Suga, 1998]; however, analogous phenomena have not yet been investigated in primates.
Region AU1 (areas {caudal and rostral belt, and caudal parabelt}, i.e., TS3). Auditory association areas are involved in more integrative functions such as auditory pattern recognition and sound localization [Hyvärinen, 1982] [Pandya and Yeterian, 1990]. Auditory image formation has not yet been clearly shown in primates. Kaas et al. remark that this region is less precisely cochleotopic.
Region AU2 (areas {caudal STG, caudal STS and rostral parabelt}, i.e., TS2). According to Kaas et al., the parabelt area is concerned with space perception and auditory memory. However, it also seems there is special provision for recognizing sounds from the vocal repertoire of the animal's own species. This has been studied in detail in the squirrel monkey [Newman, 1978], where specialist cells for nearly all major call types have been found. James Newman concluded that "the auditory cortex is a highly efficient processor of the acoustic structure of vocalizations and other complex acoustic signals, but that the determination of the biological significance of vocalizations - their interpretation, their meaning - most likely takes place elsewhere" [Newman, 1978], p. 104. More recently, Josef Rauschecker et al. [Rauschecker et al., 1995] [Rauschecker et al., 1997] report involvement in the perception of conspecific vocalizations.
Region AU3 (areas {rostral STG, rostral STS}, i.e., TS1, plus CML and 23b on the medial surface). According to Pandya et al., the rostral parts of STS and STG form a separate region, both architectonically and in terms of intrinsic and extrinsic connectivity. Michael Colombo et al., working with lesioned monkeys, concluded that "the superior temporal cortex plays a role in auditory processing and retention similar to the role the inferior temporal cortex plays in visual processing and retention" [Colombo et al., 1990], p. 336.
Humans have special abilities for recognizing formants produced by the human pharynx and for calibrating the heard speaker's vocal tract [Lieberman, 1991] [Blumstein, 1995]. Functional MRI [Binder et al., 1994] has shown the involvement of STG bilaterally, with more meaningful auditory stimuli activating more of STG, simple stimuli being confined to the auditory cortex.
There have been several imaging studies in humans using CT [Baum et al., 1990], PET [Morris et al., 1998] [Fiez et al., 1996] [Zatorre et al., 1996] [Smith et al., 1996] [Petersen and Fiez, 1993], and FMRI [Hickok et al., 1997] [Dhankhar et al., 1997] [Millen et al., 1995] [Bilecen et al., 1998] [Huckins et al., 1998] [Binder et al., 1997]. These have limited spatial resolution, but have shown that the semantic processing of words tends to use more outer and more rostral auditory areas than the perception of tones. The other main finding is the well-known activation of posterior areas in speech perception and the processing of nouns, and of frontal areas in speech production and the processing of verbs.
I include CML and 23b in AU3 because they seem connected to other areas at this level. The role of CML and 23b in auditory long term memory has been recognized clinically by Edward Valenstein et al. [Valenstein et al., 1987] and by Rudge and Warrington [Rudge and Warrington, 1991]. Paul Grasby et al. have shown this involvement using imaging [Grasby et al., 1993], and, in their FMRI study, Jeffrey Binder et al. [Binder et al., 1997] have shown clear involvement of these areas in semantic decision tasks. As shown in the diagram, one can argue for a branch in the hierarchy: a rostral branch consisting of rostral STG and STS and connected to orbital frontal cortex, and a caudal branch connecting via CML and 23b to dorsal prefrontal cortex. Some researchers [Romanski et al., 1999] have speculatively associated the rostral branch with complex phonetic and language processing and the caudal branch with auditory spatial processing.
12.2.5 Ventral visual areas
Anatomy. I diagram the ventral visual areas in Figure 12.6, which shows (a) medial, lateral and ventral views of the ventral visual areas, (b) the principal connections and hierarchy of functional involvements, and (c) the principal connections to the frontal action hierarchy.
Figure 12.6: The ventral visual hierarchy
Our parcellation, notation and intrinsic connectivity of the occipital lobe are taken from the work of David Van Essen and coworkers [Essen et al., 1990], who established the hierarchy of visual processing and criteria for connections in such a hierarchy. Most of their work used macaque monkeys. For the inferotemporal (IT) areas, our parcellation, notation and intrinsic connectivity are based on the work of Seltzer and Pandya [Seltzer and Pandya, 1994].
Nikos Logothetis and David Sheinberg [Logothetis and Sheinberg, 1996] have reviewed visual processing and its neural basis. Area TE is not visuotopically organized, and there is a systematic increase in receptive field size along the posterior-anterior length of IT, from 1.5° to 50°. Neurons in IT are selective for stimulus attributes such as color, orientation, texture, direction of movement and shape. There is invariance to size or position for a given shape, and some scale or translation invariance. More than 85% of IT neurons respond to simple or complex visual stimuli [Desimone et al., 1984].
Logothetis and Sheinberg suggest that there may be several different types of perception with different neural sites, perhaps (1) basic category level objects, (2) individual identities of particular objects, (3) animate objects, and (4) visually guided movements.
The ventral visual hierarchy has been described by Keiji Tanaka and coworkers [Tanaka, 1996], Charles Gross and coworkers [Gross, 1994], and Yasushi Miyashita and coworkers [Miyashita, 1993], as recognizing complex objects, as providing long term and short term memory of complex objects, and as involved in visual imagery [Sakai and Miyashita, 1993]. Unfortunately, the experiments and their analysis do not relate this complex object recognition to action. Furthermore, according to Keiji Tanaka [Tanaka, 1996] p. 135, "the accumulated findings favor the idea that no cognitive units represent the concept of objects; instead the concept of object is found only in the activities distributed over various regions of the brain".
Robert Desimone and coworkers have investigated attentional effects in IT directed from the frontal lobes. "The top-down selection templates for both locations and objects are probably derived from neural circuits mediating working memory, perhaps especially in prefrontal cortex" [Desimone and Duncan, 1995]. From the described anatomy and functionality, we can identify the following regions: Region VV1 (areas TE3 and V4). Keiji Tanaka distinguishes only between posterior and anterior IT. The extreme posterior inferior temporal areas are concerned with higher-order visual features. VV1 lesions give simple visual pattern deficits [Logothetis and Sheinberg, 1996].
Regions VV2 and VV3 (area TE2 and area TE1). Anterior inferotemporal cortex both recognizes and stores complex visual forms [Miyashita, 1993]. Recency and familiarity are also detected in anterior IT [Fahy et al., 1993].
Charles Gross et al. [Gross et al., 1972] discovered the visual recognition of socially significant objects, such as hands, in IT. Facial identity perception has also been shown in IT, in the inferomedial occipito-temporal region near the fusiform and lingual gyri. Viewer-centered detection of bodies and body parts has been shown in IT by Wachsmuth et al. [Wachsmuth et al., 1994].
12.2.6 Dorsal visual areas
Anatomy. I diagram the dorsal visual areas in Figure 12.7, which shows (a) lateral and medial views of the dorsal visual areas and (b) the principal connections and hierarchy of functional involvements.
Figure 12.7: The dorsal visual hierarchy
Connections and architectonics of the posterior parietal lobe in rhesus monkeys have been described by Pandya and Seltzer [Pandya and Seltzer, 1982]. Andersen et al. [Andersen et al., 1990] have mapped, using cynomolgus and rhesus macaque monkeys, the inferior part of the posterior parietal lobe. We will use the intrinsic connectivity and notation of Daniel Felleman and David Van Essen [Felleman and Essen, 1991] for the dorsal visual areas. To accommodate data from Pandya and coworkers, I use the name OAa to denote an area consisting of MT, MST and FST. From the described anatomy and functionality, we can identify the following regions: Region DV1 (areas PIP, PO and MT). Spatial layout features are derived in DV1. PO, for example, mainly uses information from the periphery of the visual field. Area MT is involved in binocular disparity and in the speed and direction of stimulus motion [Logothetis and Sheinberg, 1996]. I have included area MT in both the dorsal visual and the ventral visual regions, following the detailed studies of John Maunsell [Maunsell, 1995] showing that it has a role in both. Andersen [Andersen, 1995] has described the encoding of intention and spatial location in the posterior parietal cortex, in areas LIP and MDP (which is medial). These intentions specify saccades that the animal intends to make. Smooth pursuit and motion perception are done in OAa.
Region DV3 (caudal 7a). Vernon Mountcastle [Mountcastle, 1995b] has reviewed work on visually-guided reaching neurons in dorsal visual areas, which also connect in PM3 with somatosensory guiding neurons in area 5. "Grasping" neurons are found in dorsal visual areas as well as area 5 [Wise and Desimone, 1988]. "Reaching" neurons have been found in 7a, particularly for reaches with either arm. Mountcastle summarized that these areas are concerned with (a) spatial perception, maps and coordinate transformations, (b) generation of intentions to move, and (c) commands for visuomotor and somatomotor operations. Sereno and Maunsell [Sereno and Maunsell, 1995] have suggested that LIP has memory for shape features.
Area 7a subserves spatial maps and spatial perception [Andersen et al., 1990], based on a distributed planar gain field representation of the spatial map, which is head-centered. Perception of the subject's own gaze, i.e., the position of the eyes in the orbits, is performed in areas 7a, LIP and DP. According to Andersen et al. (1990, p. 105): "area 7a appears to be very different from the other visual areas in the IPL in that it is the only area that connects to some of the highest centers in the brain".
Various kinds of spatial map are constructed in the parietal lobe from somatosensory, visual and perhaps even auditory information. These maps concern the body and the larger environment of the animal. This information could be propagated to the frontal lobes in elaborating the spatial aspects of action. Specific maps will support action descriptions that are relatively definite and detailed, and for which spatial aspects have been determined.
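The planar gain field idea mentioned above can be illustrated with a small sketch. This is my own illustration with made-up parameter values, not a model from the experimental literature: a parietal neuron's output is taken to be its retinotopic receptive field response scaled by a function that is linear (planar) in eye position, so that a population of such cells implicitly carries head-centered position.

```python
import math

def gain_field_response(retinal_x, retinal_y, eye_x, eye_y,
                        pref_x=0.0, pref_y=0.0, sigma=5.0,
                        a=0.02, b=0.01, c=1.0):
    # Gaussian retinotopic receptive field centered on (pref_x, pref_y)
    d2 = (retinal_x - pref_x) ** 2 + (retinal_y - pref_y) ** 2
    rf = math.exp(-d2 / (2 * sigma ** 2))
    # Planar gain: linear in horizontal and vertical eye position,
    # clipped at zero since firing rates are nonnegative
    gain = max(0.0, a * eye_x + b * eye_y + c)
    return rf * gain

# The same retinal stimulus under two gaze directions yields different
# firing rates, which is what lets downstream areas recover the
# head-centered location from retinal position plus gaze.
r1 = gain_field_response(0, 0, eye_x=-20, eye_y=0)
r2 = gain_field_response(0, 0, eye_x=+20, eye_y=0)
assert r1 != r2
```

No single cell here encodes head-centered position explicitly; the hierarchy's spatial maps would be read out from the whole population, consistent with the "distributed" representation described above.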
12.2.7 Polymodal STS areas
Anatomy. I diagram the polymodal STS (superior temporal sulcus) areas in Figure 12.8.
Our parcellation, notation and intrinsic connectivity are based on the work of Barnes and Pandya [Barnes and Pandya, 1992]. Here, there is a wealth of work from David Perrett's group at St. Andrews University. Working with macaque monkeys, they have found cell responses to many different social stimuli. From the described anatomy and functionality, we can identify the following regions:
Figure 12.8: The polymodal hierarchy of the superior temporal sulcus
Regions PM1, PM2 and PM3 (areas {TPO-4 and TPO-3}, {TPO-2, Ipa and TAa}, and {TPO-1 and TPro}). Classes of stimuli detected include: (1) faces - facial expression and facial identity [Perrett et al., 1979] [Perrett et al., 1992], and face characteristics [Perrett and Mistlin, 1990]; (2) direction of the head in the horizontal and vertical planes [Perrett et al., 1991], and face view [Perrett et al., 1985]; (3) eye gaze direction and eye contact with the subject [Perrett et al., 1985]; (4) hand actions [Perrett et al., 1989b]; (5) limb position and body posture, patterns of walking, and jerky motion [Bruce et al., 1981] [Perrett et al., 1985]; (6) appearance and disappearance from the visual field [Bruce et al., 1981] [Perrett et al., 1985]; (7) limb and body movements of social significance, such as turning towards or away from the viewer, standing up, and crouching down [Perrett et al., 1990a]; and (8) tactile stimulation in and out of sight, unexpected tactile stimulation, and social actions such as touching [Perrett et al., 1990b]. Cells also distinguish between the subject's own movements and those of others, and detect unexpected events. Michael Oram et al. [Oram et al., 1993] have also pointed out that perceived socially significant motions can be independent of visual form.
Perrett has developed a conceptual framework for these findings [Perrett et al., 1989a]. He suggests a processing hierarchy for recognizing social stimuli based on faces, hands, eye gaze, and body and limb position and movement. Percepts can be in viewer-centered, object-centered and goal-centered frames. Viewer-centered representations are used preferentially in social situations, for they quickly constrain social response options.
All of this information is very useful for coordinating social action, a key requirement of life in primate groups. Descriptions are derived at a level which is appropriate to support the execution of a specified plan which is described in terms of spatial relations and action types, but which does not use specific positional information.
Goal-centered descriptions would be useful at a yet higher level, since they are independent of the choice of specific action; neurons that encode such descriptions are found mainly in PM3 [Perrett et al., 1989a].
It is difficult to find a strong hierarchy in the sequence PM1, PM2, PM3. However, PM1 mainly recognizes simple features, and goal-centered descriptions are only found in PM3. We include the temporal pole TPro in PM3; it receives connections from AU3 and VV3.
Polymodal regions in the somatosensory and visual hierarchies. PM3 contains cells responding to tactile stimulation, but conditioned by visual information. The hierarchy of processing of somatosensory information thus extends beyond SS3 to PM3. Likewise, the dorsal visual hierarchy extends from DV3 to PM2 and PM3. The hierarchy of auditory processing probably extends into the frontal lobes in a more active perception process.
The temporal pole and episodic memory. Imaging studies show that long term episodic memory is usually associated with the temporal pole [Markowitsch, 1995] and orbital frontal cortex [Shallice et al., 1994].
12.3 Summary and conclusion
Summary. In this chapter, I have reviewed the evidence for a hierarchical architecture in the primate brain. By examining neuroanatomical evidence for connections among neural areas, I was able to establish anatomical regions and connections. I then examined evidence for specific functional involvements of the different neural areas and found some support for hierarchical functioning in the perception hierarchies.
The essential technique I am using is to characterize each brain area in terms of the data types that it creates. I assume that there is a uniform cortical process which creates data and that this is a main activity of the brain.
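As a sketch of this technique (my own illustration: the region names follow this chapter, while the data-type strings merely paraphrase the functional involvements summarized in the tables), each region can be modeled simply as the set of data types its cortical process creates.

```python
# Hypothetical sketch: each cortical region is characterized by the
# data types that its uniform cortical process creates.
regions = {
    "VI":  ["visual features", "visual images"],
    "VV1": ["object identity", "object motion"],
    "DV1": ["spatial features"],
    "PM3": ["socially significant percepts", "episodic memory"],
    "G":   ["goals"],
    "PA4": ["complex plans"],
    "MI":  ["muscle combinations"],
}

def creators_of(data_type):
    """Regions whose cortical process creates the given data type."""
    return [name for name, types in regions.items() if data_type in types]

print(creators_of("goals"))            # ['G']
print(creators_of("episodic memory"))  # ['PM3']
```

On this view, localizing a function amounts to asking which region's characteristic data types the function requires, rather than asking which region "performs" the function as a whole.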
Conclusion. The overall structure, both anatomical and functional, of the primate brain is that of a set of cortical regions with spatial localization and clustered interconnectivity. Each region is specialized in data processing and data storage for a limited number of data types. Cortical regions are connected as a perception hierarchy and an action hierarchy, with connections between corresponding levels of the two hierarchies. The perception hierarchy is based on data representing increasingly general situations.

Chapter 13

Frontal areas and the perception-action hierarchy
Abstract: In this chapter, experimental evidence concerning the frontal areas of the primate neocortex is analyzed for conclusions concerning the existence of neural areas, corticocortical connectivity among neural areas, and the involvement of each frontal area in the functioning of the brain. This analysis shows that the primate neocortex consists in the main of a perception hierarchy, an action hierarchy and connections between them. In other words, from an information-processing point of view, the primate neocortex has a hierarchical perception-action architecture.
13.1 The neocortical planning and action hierarchy
13.1.1 Planning and action areas
Anatomy of the frontal lobe. I diagram the planning and action areas of the frontal lobe in Figure 13.1, which shows (a) medial and lateral views of the planning and action areas and (b) the principal connections and hierarchy of functional involvements.
Figure 13.1: The planning and action hierarchy
Intrinsic connectivity is taken mainly from Helen Barbas [Barbas, 1988], using the Walker notation and parcellation. Parcellations and notation consistent between rhesus macaques and humans have been given by Petrides and Pandya [Petrides and Pandya, 1994]. We will call the proisocortex in the frontal lobe FPro, and that in the temporal lobe TPro.
Fortunately, the frontal areas have been perspicaciously reviewed by Michael Petrides [Petrides, 1994]; Richard Passingham's [Passingham, 1993] book gives a comprehensive treatment; and of course Joaquin Fuster's [Fuster, 1997] classic monograph provides the primary basis for our understanding of the frontal lobes. The human prefrontal cortex is treated in Karl Zilles' chapter [Zilles, 1990] in George Paxinos's book [Paxinos, 1990] on the human nervous system, and also in André Parent's edition of Carpenter's textbook [Parent, 1996].
13.1.2 Human cognition
To determine the role of neural areas in cognition involving planning and executive management of planning, we need to look at human data.
Christopher Frith [Frith et al., 1991] [Frith, 1995] has reviewed work on problem solving and has concluded that dorsolateral prefrontal areas are activated for stimulus-driven cognition, whereas corresponding medial prefrontal areas are activated for cognition which is self- or internally generated. He designed experiments to distinguish willed versus non-willed action, where willed action involved choices, i.e., more than one correct response existed. It seems to us that the two distinctions, one involving external stimuli versus internal generation, and the other involving choice, should be treated as independent. That is, an action involving choice may or may not involve the use of external input. This also meant in his case that the non-willed actions were essentially easier and used lower areas such as SMA.
The generation of verbal sequences principally activated area 46 as well as anterior cingulate. That most reliable test for prefrontal damage, the Wisconsin Card Sort Test, involves choice and has been shown in several imaging studies, e.g., [Mentzel et al., 1998], to activate dorsolateral areas such as 46.
Per Roland and coworkers [Roland, 1993] investigated several cognitive tasks and showed progressively more anterior activation with the abstraction and difficulty of the task. Thus, route-finding problems were imaged by Roland [Roland and Friberg, 1985] and found to principally activate areas 9 and 10. Tower of London problems have been imaged by Richard Frackowiak et al. and found to activate areas 9 and 10. Roland also found that in arithmetic problems, areas 11 and 12 were involved, which he attributed to retrieval of memory of arithmetic skills for subtraction and the integers. Roland's conclusion was that dorsolateral prefrontal areas are used in "all tasks in which a primary instruction is given which contains directives for future processing .... if no processing of sensory information or if the performance is preempted or obvious then there is no activation".
The work on problem solving also typically shows activation of anterior cingulate. This area seems to be involved in the selection of goals, with its more rostral part involved in more cognitive goals and its more caudal part involved in more directly motor goals.
The nature of problem solving activity is being researched in clinical practice by Tim Shallice and coworkers [Shallice and Burgess, 1991a], and by Myrna Schwartz and coworkers [Schwartz, 1995], for example. They distinguish between more routine action and higher level action involving "contention scheduling", or choice. They conceive of higher level behavior as the activation of schemas which are goal-directed, and of lower level, but still cortical, behavior as involving simpler schemas. It seems to us that this is quite compatible with the selection of overall goals being made in the anterior cingulate. This clinical work uses imaging to determine areas of damage, but since these observed lesions are typically large, and idiosyncratic, it is difficult to assign finer localization to the categories of higher- and lower-level cognitive activation.
I have chosen not to fully consider human verbal behaviors, leaving a more complete treatment of this large and fascinating subject for a future paper.
13.1.3 Frontal regions
From the described anatomy and functionality, from nonhuman primates and from humans, we can identify the following regions: Region MI (areas MI and 24). MI, the motor cortex, is well known. It has a body mapping and sends motor execution information/commands to muscle groups in the body. Areas of motor cortex, when stimulated, seem to produce groups of muscle contractions corresponding to common actions of the animal. Richard Passingham's conclusion [Passingham, 1993] (p. 37) is that the motor cortex is specialized for the execution of manipulative movements of the limbs and face, and of fine behavioral variants that are learned and which are selected in voluntary action.
Region PA1 (area 6). According to Passingham, lateral premotor cortex plays a role in the selection of manipulative movements, but not in the repetition of the same movement. It is also active in preparing to move. Selection often consists of using external cues to direct the movements. Medial premotor cortex plays a role in the selection of movement when no such cues are available, in repetitive movements that are self-paced, and in the performance of motor sequences from memory.
Region PA2 (area 8). Selection of eye movements requires a nonegocentric geometric frame and therefore information from external senses (in contrast to area 6, which uses proprioceptive information and therefore an egocentric frame). The dorsomedial eyefield's selection of eye movements is not determined by visual targets, whereas the lateral eyefields select when a target has been presented.
Region PA3 (area 46). After a complete analysis of experimental data, Michael Petrides concluded that PA3 is involved in the generation of actions. Actions are generated on the basis of information in working memory or generated from memory. The principal sulcus (PS) is involved in spatial working memory and monitoring (Petrides, 1994, p. 74). The term "monitoring" implies an expectation of what must or will occur and verification of what has occurred.
Region PA4 (areas 9 and 10). Petrides considered a region he called mid-dorsal lateral frontal cortex, consisting of (i) dorsal 46 above PS (9/46d in his notation) and (ii) 9, to be a separate region. Our region PA4 is similar, but we also include 10 on connectivity grounds. Lesions in this region result in impairments to nonspatial working memory tasks with self-generated and externally generated responses.
This area is involved in self-ordered tasks in interaction with the medial temporal lobe (Petrides, 1994, p. 76). This region has "more [than PA5] specialized executive processing in working memory that is critical for the planning and organization of behavior" (Petrides, 1994, p. 79).
According to Petrides, lesions in this region do not markedly impair working memory tasks but they do markedly impair certain nonspatial tasks such as those involving the monitoring of self-generated and externally generated responses. This applies to keeping track of which ones of a set of stimuli have already been selected, for example.
Region PA5 (areas 11 and 12). Petrides considered a region he called ventrolateral frontal cortex, consisting of (i) 12 (47/12 in his parcellation), (ii) ventral 46 (below PS), and (iii) ventral 8 (45 in his parcellation), to be a separate region. This region, which is very similar to our PA5, is involved in executive processes concerning plans and intended actions, judgments of saliency and novelty, and the active voluntary retrieval of information in long term memory in posterior association cortex. Lesions result in severe impairments in all types of problem solving.
PA5 is clearly a separate region, as connectivity with the perception hierarchies described here shows. Petrides [Petrides, 1994], Figure 18, p. 77, shows strong connections to ventrolateral frontal cortex from all unimodal and polymodal association areas, including PM3 and AU3, whereas PA4 is only connected to PM2 and AU2 and other lower areas. Lesion work shows its separate functionality. It also has much stronger connectivity with the amygdala [Barbas, 1995].
I speculate that region PA5 may be best understood for its role in generating context. This is supported by Roland's work on arithmetic tasks already mentioned [Roland, 1993].
Emad Eskandar et al. [Eskandar et al., 1992] have shown that IT neurons code for visual images and for behavioral context in a separable way. We have argued above that episodes provide context. Nancy Andreasen et al. [Andreasen et al., 1995], in a PET study, found that both focused and spontaneous episodic memory retrieval activated anterior medial frontal regions (area 11) and precuneus/retrosplenial cingulate cortex. Tim Shallice et al. [Shallice et al., 1994], using PET, isolated acquisition from retrieval of verbal episodic memories. Retrieval was associated with activity in right areas 10, 46 and 12 (47 in their parcellation) of prefrontal cortex (12 being strongest) and the bilateral precuneus (31). Left anterior cingulate (32) was active in both acquisition and retrieval. Endel Tulving et al. [Tulving et al., 1994], in PET studies, have shown the neuroanatomical correlates of retrieval in episodic memory to be right dorsolateral cortex areas 10, 46, 9 and anterior 6, using auditory sentence recognition. Passingham has argued that the ventral prefrontal cortex, which he defines as areas 11, 12, 13, and 14, "selects the goal given the current context" [Passingham, 1993], p. 171.
Region G (area 32). This region may subserve the representation of goals. G is the anterior cingulate cortex and has a similar connectivity to PA5, except it does not connect with VV3 or DV2. José Pardo et al. [Pardo et al., 1990], in an imaging study, found data to support the idea of a role for the anterior cingulate in "selection and recruitment of processing centers for task execution". Tomas Paus et al. [Paus et al., 1993], also in an imaging study, tentatively proposed the anterior cingulate as facilitating the execution of appropriate responses and/or suppressing the execution of inappropriate ones. This, they observed, is particularly useful when behavior has to be modified in new and challenging situations. Michael Posner [Posner and Rothbart, 1998] has concluded that anterior cingulate is involved in executive attention. In their review, Orrin Devinsky et al. [Devinsky et al., 1995] conclude that the anterior cingulate is involved in executive functions and the posterior cingulate in visuospatial and memory functions [Vogt et al., 1992]. According to them, the executive functions of anterior cingulate include initiation, motivation and goal-directed behaviors, response selection including the decision not to move, and the expression of specific movement sequences that require little or no autonomic activity. Lesions of anterior cingulate cortex tend to impair willed actions.
See also [Elliott and Dolan, 1998] [Carter et al., 1998] [Lane et al., 1998] [Paus et al., 1998] [Derbyshire et al., 1998] [Bussey et al., 1997] [Meunier et al., 1997] [Bussey et al., 1996] [Muir et al., 1996] [Seamans et al., 1995].
Region | Time scale | Memory | Complexity and Scope of Action | Monitoring
G | until goal satisfied | memory for goal | goal expressions, desired states | monitoring of goal progress and satisfaction
PA5 | long term | memory for context | context, episode | monitoring of context
PA4 | length of working memory | self-ordered tasks | current plan | monitoring of total plan
PA3 | length of working memory | memory of spatial position | action involving person, objects, action details | monitoring of current action
PA2 | saccade | use of external target or self-generated | eye saccade | no monitoring
PA1 | real time sequencing | use of visual movement cues | self-paced limb and face sequencing | no monitoring
MI | real time muscle contractions | none | muscle groups in useful combinations | no monitoring
Figure 13.2: Characterization of planning and action hierarchy
13.1.4 Planning and action hierarchy
We can therefore define a planning and action hierarchy based on a partial ordering determined by (1) immediacy and time scale, (2) memory, whether used and its temporal extent, (3) complexity and scope of action, and (4) the extent of monitoring. We display these dimensions in Figure 13.2; the table shows that such a hierarchy can indeed be defined on these measures.
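The partial-ordering claim can be made concrete with a minimal sketch. The numeric scores below are my own rough ordinal encoding of the four dimensions of Figure 13.2, not measured quantities; the point is only that such scores induce a partial, not total, order on the regions.

```python
# Score each region on the four dimensions of Figure 13.2:
# (immediacy/time scale, memory extent, complexity/scope, monitoring).
# The numbers are illustrative ordinal ranks chosen for this sketch.
scores = {
    "MI":  (0, 0, 0, 0),
    "PA1": (1, 1, 1, 0),
    "PA2": (1, 1, 1, 0),
    "PA3": (2, 2, 2, 1),
    "PA4": (2, 3, 3, 2),
    "PA5": (3, 4, 4, 3),
    "G":   (4, 4, 5, 4),
}

def above(a, b):
    """a is above b in the hierarchy: >= on every dimension, > on some."""
    sa, sb = scores[a], scores[b]
    return all(x >= y for x, y in zip(sa, sb)) and sa != sb

# The order is partial: G dominates MI, but PA1 and PA2 (both realtime,
# unmonitored execution) are incomparable with each other.
assert above("G", "MI") and above("PA4", "PA3")
assert not above("PA1", "PA2") and not above("PA2", "PA1")
```

This matches the text's usage: PA1 and PA2 sit at the same level of the action hierarchy, while G sits above everything.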
13.2 The perception and action hierarchies
13.2.1 Cortical regions and their hierarchies
I summarize here the perception and action hierarchies we developed in the previous sections. I defined regions which consist of several neural areas and which form components of the architecture. The use of regions breaks down the cortical connectivity analysis into three levels: (i) aggregation of neural areas into regions; (ii) (intrinsic) connectivity of regions within individual hierarchies; and (iii) (extrinsic) connectivity of regions between individual hierarchies. These choices are determined by an analysis of function as well as anatomy.
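The three levels can be sketched computationally. This is a hypothetical illustration: the area-to-region assignments and the connection list below are a tiny subset of the full parcellation, and the variable names are my own.

```python
# Level (i): assign neural areas to regions, and regions to hierarchies.
area_to_region = {"V1": "VI", "V2": "VI", "TE3": "VV1", "TE2": "VV2",
                  "46": "PA3", "TPO-1": "PM3"}
region_to_hierarchy = {"VI": "ventral visual", "VV1": "ventral visual",
                       "VV2": "ventral visual", "PA3": "planning/action",
                       "PM3": "polymodal STS"}

# Area-level corticocortical connections (illustrative subset).
area_connections = [("V1", "V2"), ("V2", "TE3"), ("TE3", "TE2"),
                    ("TE2", "46"), ("TPO-1", "46")]

# Levels (ii) and (iii): derive region-level connectivity, split into
# intrinsic (within one hierarchy) and extrinsic (between hierarchies).
intrinsic, extrinsic = set(), set()
for a, b in area_connections:
    ra, rb = area_to_region[a], area_to_region[b]
    if ra == rb:
        continue  # connection inside a single region
    if region_to_hierarchy[ra] == region_to_hierarchy[rb]:
        intrinsic.add((ra, rb))
    else:
        extrinsic.add((ra, rb))

print(sorted(intrinsic))  # [('VI', 'VV1'), ('VV1', 'VV2')]
print(sorted(extrinsic))  # [('PM3', 'PA3'), ('VV2', 'PA3')]
```

Aggregating area-level anatomy in this way is what turns a large connection matrix into the compact region diagrams of the figures.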
The component neural areas of the regions we have identified in the previous sections are summarized in Figure 12.1. I draw the regions on the cortex in Figure 13.3 and I give a summary table in Figure 13.4.
Figure 13.3: Views of the cortex showing regions
hierarchy | region | functional involvements
olfactory | OI | odor detection and discrimination, simple odor memory
olfactory | OL1 | odor-guided behavior, mating and sexual behavior
gustatory | GI | timing, detection and characterization of taste
gustatory | GU1 | integration of taste with satiety information
somatosensory | SI | tactile detection
somatosensory | SS1 | tactile images
somatosensory | SS2, SS3 | guidance of reaching and grasping
somatosensory | PM3 | socially significant tactile recognition
auditory | AI | auditory detection images
auditory | AU1 | auditory images, maps
auditory | AU2, AU3 | socially significant auditory recognition
ventral visual | VI | visual features and images
ventral visual | VV1 | object identity and motion
ventral visual | VV2, VV3 | complex objects and long term memory
dorsal visual | VI | visual features and images
dorsal visual | DV1 | spatial features
dorsal visual | DV2 | eye saccades
dorsal visual | DV2, DV3 | guidance of reaching and grasping
dorsal visual | DV2, DV3 | spatial maps and spatial perception
dorsal visual | PM3 | socially significant perception and guidance
polymodal STS | PM1, PM2, PM3 | socially significant viewer-centered perception
polymodal STS | PM3 | socially significant goal-centered perception
polymodal STS | PM3 | episodic memory
planning and action | G | goals and action selection
planning and action | PA5 | context and episode representation
planning and action | PA4 | complex plans, self-paced
planning and action | PA3 | specific plans with data in working memory
planning and action | PA2, PA1 | explicit detailed action sequences for realtime execution
planning and action | MI | muscle combinations
Figure 13.4: Summary of experimental findings for hierarchy of data abstraction
We can now turn to our third task, that of reviewing the extrinsic connectivity among the different individual hierarchies.
13.2.2 Connectivity between perception and action hierarchies
Figures 13.5 and 13.6 give the table of all extrinsic connections between areas in different lobes. I will assume, as an approximation, that all connections, intrinsic and extrinsic, are reciprocal (some exceptions to reciprocity are listed by Felleman and Van Essen (1991)).
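The reciprocity approximation amounts to symmetrizing the reported directed connections into an undirected graph. A minimal sketch, with an illustrative handful of pairs standing in for the full tables of Figures 13.5 and 13.6:

```python
from collections import defaultdict

# A few reported region-to-region connections (illustrative subset only).
reported = [("SA1", "PM3"), ("VV1", "PA2"), ("DV3", "PA3")]

# Symmetrize: each reported connection is assumed to be reciprocal.
graph = defaultdict(set)
for src, dst in reported:
    graph[src].add(dst)   # forward connection
    graph[dst].add(src)   # assumed reciprocal connection
```

Under this assumption, any later connectivity analysis can treat the cortico-cortical graph as undirected.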
Region | Area | Area | References

olfactory areas to planning and action areas:
OI | OI | 14c, 25, 13a, 13m | [Rolls and Baylis, 1994]

gustatory areas to planning and action areas:
GI | operculum, IG | 12o | [Carmichael et al., 1994]

somatosensory areas to polymodal STS areas:
SA1 | PF (7b) | TPO-1 | [Neal et al., 1990]

somatosensory areas to planning and action areas:
SI | postcentral gyrus SI (1, 2, 3) | MI (4) | [Pandya and Yeterian, 1990]
SA1 | PE, PEa | rostral 4, 6 (dorsal (MI) and SMA (MII)) | [Pandya and Yeterian, 1990]
SA1 | PF (7b) | ventral 6, 8, 45, 46, 24 | [Pandya and Yeterian, 1990] [Cavada and Goldman-Rakic, 1989]
SA2 | PEc | rostral 6 (MI and MII) | [Pandya and Yeterian, 1990]
SA2 | rostral POa | ventral 46 | [Pandya and Yeterian, 1990]
SA2 | PFG | 8, rostral 46, 24 | [Pandya and Yeterian, 1990] [Cavada and Goldman-Rakic, 1989]
SA3 | PGm | rostral 6 above AS, 8, dorsal 46 and 9, 24 | [Pandya and Yeterian, 1990]
SA3 | 7m (PGm) | 45, 23, 24 | [Cavada and Goldman-Rakic, 1989]
SA3 | rostral PG | 8, rostral 46, 24 | [Pandya and Yeterian, 1990] [Cavada and Goldman-Rakic, 1989]

auditory areas to polymodal STS areas:
AA1 | TS3 | TPO-3, TAa | [Seltzer and Pandya, 1989]
AA2 | TS2 | TAa | [Seltzer and Pandya, 1989]
AA3 | TS1 | TPO-1, TPro | [Seltzer and Pandya, 1989]

auditory areas to planning and action areas:
AA1 | CB, RB, CPB (caudal TS3) | dorsal 8 in concavity of AS, caudal 46 | [Pandya and Yeterian, 1990] [Hackett et al., 1999]
AA2 | CSTG, CSTS (rostral TS3, TS2) | prearcuate 46 below principal sulcus, rostral 46, dorsal prefrontal 9 and 10; 11, 12, 13 | [Pandya and Yeterian, 1990] [Romanski et al., 1999]
AA2 | Ipa | 46, 10, 11, 12, 14 | [Seltzer and Pandya, 1989]
AA3 | 23b, CML | 9, 46 | [Yukie, 1995] [Goldman-Rakic et al., 1984]
AA3 | RPB, RSTS, RSTG (TS1) | 12 and 13 (OFC), 25 and 32 (medial PFC); 10, 11, 12, 13, 24, 32 | [Pandya and Yeterian, 1990] [Romanski et al., 1999]
Figure 13.5: Table of all extrinsic connections among neural areas, part 1. AS - arcuate sulcus
Region | Area | Area | References

ventral visual areas to dorsal visual areas:
VV1 | V4 | MT, FST, DP, LIP, PIP, caudal 7a | [Felleman and Essen, 1991] [Neal et al., 1990] [Young, 1992]
VV1 | TE3 (PIT) | LIP, MST, FST | [Maunsell, 1995]
VV3 | TE1 (AIT) | 7a | [Maunsell, 1995]

ventral visual areas to polymodal STS areas:
VV1 | OAa | Ipa | [Seltzer and Pandya, 1989]
VV2 | CIT | TPO-4 | [Hilgetag et al., 2000]

ventral visual areas to planning and action areas:
VI | V2, V3, VP, V4, V4t | FEF (8) | [Felleman and Essen, 1991]
VV1 | V4 | 46 | [Felleman and Essen, 1991]
VV1 | TE3, OAa (lateral prestriate) | premotor prearcuate cortex (8) | [Pandya and Yeterian, 1990]
VV2 | TE2 | premotor rostral 8, prearcuate 46 below PS | [Pandya and Yeterian, 1990]
VV2 | TEa, TEm | 8, 46, 11, 12 | [Seltzer and Pandya, 1989]
VV3 | TE1, TE2 | premotor rostral 8, prearcuate 46 below PS, 11 and 12 (orbitofrontal) | [Pandya and Yeterian, 1990]

dorsal visual areas to polymodal STS areas:
DV1 | PO, PIP, MIP, FST, MST | TPO-4 | [Seltzer and Pandya, 1994] [Hilgetag et al., 2000]
DV2 | MIP, PIP | TPO-2, TPO-3 | [Seltzer and Pandya, 1994]
DV2 | VIP, LIP | TPO-4 | [Seltzer and Pandya, 1994]

dorsal visual areas to planning and action areas:
DV1 | PO | 8 | [Colby et al., 1988] [Felleman and Essen, 1991]
DV2 | DP | 8, 46 | [Felleman and Essen, 1991]
DV2 | VIP | 8 | [Felleman and Essen, 1991]
DV2 | LIP | 6 (ventral premotor), 8, 46, 12, 24 | [Felleman and Essen, 1991] [Cavada and Goldman-Rakic, 1989]
DV3 | caudal 7a | 6, 8 (weakly), 46, 9, 12, 24 | [Pandya and Yeterian, 1990] [Cavada and Goldman-Rakic, 1989]

polymodal STS areas to planning and action areas:
PM1 | TPO-4 | 6, 8, caudal 46 | [Seltzer and Pandya, 1989]
PM1 | TPO-3 | dorsal 46, 9, 10 | [Seltzer and Pandya, 1989]
PM2 | TPO-2 | dorsal 46, 9, 10 | [Seltzer and Pandya, 1989]
PM3 | TPO-1 | 46, 9, 10, 11, 12, 13, 14, 24, 32 | [Seltzer and Pandya, 1989]
PM3 | TPro | FPro | [Seltzer and Pandya, 1989]
Figure 13.6: Table of all extrinsic connections among neural areas, part 2
I diagram the connectivities from the perceptual regions to the frontal regions in Figure 13.7. (The figure contains five panels, showing the connectivity of the VV, DV, SA, PM and AA regions to the frontal areas 4, 6, 8, 9, 10, 11, 12, 13, 14, 24, 25, 32 and 46.)
Figure 13.7: Diagram of connections to frontal areas
This regular clustering of lateral connections from perceptual to frontal regions provides further support for our identified perceptual regions and their intrinsic hierarchy. It also supports our partitioning of the set of frontal areas into a hierarchy of frontal regions, based on the clustering of their connectivities with the perceptual regions.
Figure 13.8 diagrams how regions are ordered within perception hierarchies, and are connected to frontal regions which form the planning and action hierarchy.
Due to the large number of connections, even between regions, I use a diagrammatic form which principally shows connections from perception regions to frontal regions; in addition, the main ascending and descending connections within hierarchies are indicated. A module is positioned at a level determined by its main connections; it is often also connected to adjacent frontal regions. DV3's main connection is to area 46 (Andersen et al., 1990), hence it is placed at that level. To avoid clutter on the diagram, I have not shown DV2's connections to PA5 and G. I have not shown in this diagram the fine structure of the planning and action hierarchy, i.e., all the connections intrinsic to the frontal regions. I have not analyzed lateral connections between different perception hierarchies.
I will regard the entire occipital lobe as a single region, VI, except for V4, which is part of region VV1. The regions at the bottom of the hierarchy, VI, AI, SI, MI, are concerned with interfacing to sensors and effectors. They all have complex structure, involving many subareas and often hierarchical structure within the region. For our purposes, however, we will treat each as a single architectural region.
13.2.3 Perception-action hierarchical architecture
Combining this evidence and these concepts, we can now define a five-level functional hierarchy. I will put G and PA5 on the same level, although they are distinct regions, since they do not have a strong relative hierarchical ordering, but act more in parallel as goal and context respectively. Similarly, regions PA1 and PA2 will be put on the same level since they act in parallel, PA2 for eye movement control and PA1 for body movement control. Also, regions SS2 and SS1 seem to belong on the same level, mainly due to connectivity.
To speculatively characterize the hierarchy, level 5 concerns goals, context, episodes, social goals, overall spatial awareness, and social messages; level 4 concerns complex plans, and long term memory for complex objects; level 3 concerns specified plans with details in working memory, social objects, social action features, spatial descriptions, tactile guidance, and auditory social messages; level 2 concerns detailed action sequences for the self, eye saccades, action features, spatial features, tactile guidance, and auditory features; level 1 concerns activations of muscle groups and body parts, and tactile feedback.
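The frontal side of this level assignment can be written down directly as data. A minimal sketch, following the pairings discussed above (G with PA5, PA1 with PA2); the mapping itself is a restatement of the text, not new anatomy:

```python
# Five-level assignment of the frontal (planning and action) regions;
# regions sharing a level act in parallel rather than hierarchically.
FRONTAL_LEVELS = {
    "G": 5, "PA5": 5,      # goal and context, in parallel
    "PA4": 4,              # complex plans
    "PA3": 3,              # specific plans with working memory
    "PA2": 2, "PA1": 2,    # eye vs. body movement control, in parallel
    "MI": 1,               # muscle combinations
}

def same_level(a: str, b: str) -> bool:
    return FRONTAL_LEVELS[a] == FRONTAL_LEVELS[b]
```

Such a table makes level-based queries (e.g. which regions a given region should coordinate with in parallel) trivial in a simulation.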
Perception at each level concerns attending to, acquiring and maintaining information at this level of description. Planning and action at each level concerns acquiring, selecting and elaborating action descriptions.
13.3 Summary and conclusion
Summary. In this chapter, I have reviewed the evidence for a hierarchical architecture in the frontal areas of the primate brain. By examining neuroanatomical evidence for connections among neural areas, I was able to establish anatomical regions and connections. I then examined evidence for specific functional involvements of the different neural areas and found some support for hierarchical functioning for the planning and action hierarchy in the frontal lobes.
The essential technique we are using is to characterize each brain area in terms of the data types that it creates. I assume that there is a uniform cortical process which creates data and that this is a main activity of the brain.
I have managed to push the analysis of the neocortex to include plans and goals as types of data. If we have this set of brain areas and a set of data types which includes different levels of percept, different levels of action description, and plans and goals, then this system constitutes a parallel computer which generates the complex primate behavior we are seeking to describe.
Thus this chapter’s main aim is to establish that this complete set of brain areas is a reasonable description of the neocortex, and that goals and different levels of planning and action can be included. Once this empirical description has been obtained, then chapters 14 and 15 can go ahead and generate a precise description of the parallel computer which is my model of the neocortex.
Conclusion. The overall structure, both anatomical and functional, of the primate brain is that of a set of cortical regions with spatial localization and clustered interconnectivity. Each region is specialized in data processing and data storage to a limited number of data types. Cortical regions are connected as a perception hierarchy, a planning and action hierarchy, and connections between corresponding levels of these two hierarchies. The perception hierarchy is based on data representing increasingly general situations. The planning and action hierarchy is based on increasingly general situations, plans and control.
The hierarchies shown in Figure 13.8, each listed from the top of the hierarchy downwards, are:
planning and action: G, PA5, PA4, PA3, PA2, PA1, MI
somatosensory: SS3, SS2, SS1, SI
dorsal visual: DV3, DV2, DV1
auditory: AU3, AU2, AU1, AI
polymodal STS: PM3, PM2, PM1 (with visual input region VI)
ventral visual: VV3, VV2, VV1
olfactory: OL1, OI
gustatory: GU1, GI
Figure 13.8: Neocortical perception-action hierarchy

Chapter 14

Describing information processing in the neocortex
Abstract: In this chapter, I describe the use of general description methods to describe information processing in the brain.
I develop basic computational principles that are observed to hold for the brain.
14.1 Introduction
Motivation. In chapter 12, I reviewed experimental data on neuroanatomical connectivity and neurophysiological activity of the neurons comprising the primate neocortex. There was sound evidence for the widely held belief that the neocortex is made up of discrete cortical regions with specialized functional involvements. My information-processing analysis of these findings concluded that each region processes certain types of data specific to that region. I also introduced information-processing concepts of goal, plan, sequence, event, and context as data types processed by certain regions. Furthermore, from an analysis of connectivity, I concluded that these regions are connected together in a particular architectural scheme, namely a perception-action hierarchy. The chapter described these cortical regions, the types of data processed by each region, and the connections among regions.
What it did not do was explain how such a set of cortical regions provides the neural basis for complex organized primate behavior. The next two chapters provide this explanation, by presenting a system-level theory of brain function, using as a basis the conclusions of the previous two chapters.
We present here a system model of the primate neocortex which shows how the set of specialized cortical functions can be put together using the connectivity of the neocortex, to produce real behavior.
System models. To reiterate, a system model treats an object of study as a set of interacting subsystems, each of which is easier to understand and to describe than the complete system. It results in explanations of objects as due to the action of each subsystem and the interactions among subsystems.
Natural science, computer science and causal models. From the hierarchy of functional involvement alone, we cannot construct a model of brain functioning. The experimental results demonstrate the involvement of some parts of the brain in some given behavior, but they do not demonstrate a causal functioning model of the brain actually operating to produce the behavior. Again, to reiterate, by causal we mean that the model has a dynamics of changing in time from one state to another, each next state being determined from the current state. I will determine computational principles by which such a hierarchical system of information processing regions can function to produce behavior.
The perception and action hierarchies of the primate neocortex. As a basis for our system model, I now summarize the findings of the previous chapter, showing a hierarchy of function and data types in the cortex. I work with regions made up of several neural areas. I list the neural areas comprising each region and summarize their functional involvements in Figure 14.1.
Figure 14.2 summarizes the hierarchy of behavior and functionality.
I show the regions on a lateral view of the cortex in Figure 14.3(a).
hierarchy | region | corresponding areas | functional involvements | reference
somatosensory | SI | 1, 2, 3 | tactile detection | [Kaas and Huertas, 1988]
somatosensory | SS1 | PE, PEa, PF | tactile images | [Merzenich et al., 1981]
somatosensory | SS2 | PEc, rostral POa, PFG | guidance of reaching and grasping | [Mountcastle, 1995b]
somatosensory | SS3 | PGm, rostral PG | guidance of reaching and grasping | [Mountcastle, 1995b]
somatosensory | PM3 | TPO-1, TPro | socially significant tactile recognition | [Perrett et al., 1989a]
auditory | AI | auditory cortex | auditory detection images | [Brugge and Reale, 1985]
auditory | AU1 | CB, RB, CPB (TS3) | auditory images, maps | [Hyvärinen, 1982]
auditory | AU2 | CSTG, CSTS, RPB (TS2) | socially significant auditory recognition | [Newman, 1978]
auditory | AU3 | RSTG and RSTS (TS1), CML, 23b | socially significant auditory recognition | [Newman, 1978]
ventral visual | VI | V1, V2, V3, VP, V3A, V4t | visual features and images | [Essen et al., 1990]
ventral visual | VV1 | TE3, V4 | object identity and motion | [Logothetis and Sheinberg, 1996]
ventral visual | VV2 | TE2 | complex objects and long term memory | [Miyashita, 1993]
ventral visual | VV3 | TE1 | complex objects and long term memory | [Miyashita, 1993]
dorsal visual | VI | V1, V2, V3, VP, V3A, V4t | visual features and images | [Essen et al., 1990]
dorsal visual | DV1 | PIP, PO, MT | spatial features | [Essen et al., 1990]
dorsal visual | DV2 | MIP, VIP, LIP | eye saccades | [Andersen, 1995]
dorsal visual | DV2 | MIP, VIP, LIP, DP, MDP, MST, FST | guidance of reaching and grasping | [Mountcastle, 1995b]
dorsal visual | DV3 | caudal 7a | spatial maps and spatial perception | [Andersen, 1995]
dorsal visual | PM3 | TPO-1, TPro | socially significant perception and guidance | [Perrett et al., 1989a]
polymodal STS | PM1, PM2, PM3 | TPO-4, TPO-3, TPO-2, TPO-1, TPro | socially significant viewer-centered perception | [Perrett et al., 1990a]
polymodal STS | PM3 | TPO-1, TPro | socially significant goal-centered perception | [Perrett et al., 1989a]
polymodal STS | PM3 | TPO-1, TPro | episodic memory | [Perrett et al., 1989a]
planning and action | G | 24, 25, 32 | goals and action selection | [Devinsky et al., 1995]
planning and action | PA5 | 11, 12, 13, 14, FPro | context and episode representation | [Petrides, 1994]
planning and action | PA4 | 9, 10 | complex plans, self-paced | [Petrides, 1994]
planning and action | PA3 | 46 | specific plans with data in working memory | [Petrides, 1994]
planning and action | PA2, PA1 | 8, 6 | explicit detailed realtime action sequences | [Passingham, 1993]
planning and action | MI | 4 | muscle combinations | [Passingham, 1993]
Figure 14.1: Summary of experimental findings for hierarchy of data abstraction
Level 5 (goals and context). Perception: perception of goals and of context; maintenance of current context. Action: prioritization and selection of goals. Types of information: desired states described abstractly; priorities and urgencies of such states; contexts, i.e., classes of events and episodes, themes, general plans applying to classes of episode; objects, actions and relations described generally. Example: goal: affiliate with X; current resting situation, family foraging, summer afternoon, X is aunt.

Level 4 (joint plans in social relationship). Perception: perception of social features, situations, social actions and intentions. Action: generation of corresponding social plans. Types of information: plans for a well-defined situation class, with objects, actions, relations, involving others. Example: groom with X.

Level 3 (joint plan in relational form). Perception: perception of features that indicate spatial relations, actions and intentions. Action: construction, execution and monitoring of explicit joint plan in relational form. Types of information: joint plans with assigned roles and including defined actions, specified in terms of relations. Example: groomee is X, groomer is self, (approach, prelude, groom).

Level 2 (self action in detail). Perception: perception of position, orientation of movement, velocity for self. Action: construction and execution of detailed plan for self. Types of information: concrete actions for the self, including detailed spatial and temporal characteristics; detailed motor programs, to allow realtime performance of the actions without immediate feedback. Example: detailed approach to X at position (300,360,0); get up, turn and walk.

Level 1 (motor actions). Perception: perception of somatosensory, tactile features for muscle selection. Action: activation of muscle groups. Types of information: individual actions by sets of muscle groups. Example: front right leg(), front left leg().
Figure 14.2: Computational hierarchical levels used in my model
Figure 14.3: (a) Lateral view of the cortex showing neural regions and functional involvements, (b) Connectivity of regions showing perception-action hierarchy

The figure also indicates the functional involvements of each region. The region G is shown with a dotted line boundary to indicate that it is interior, being on the medial surface. In Figure 14.3(b), I give the connectivity of the set of cortical regions. The hierarchy is diagrammed in (b) on its side with its top at the left, in order to make it correspond to the usual lateral view of the cortex in (a). The positioning of a perception region on a vertical line indicates connection to the corresponding action region.
Our model. My system model is diagrammed in Figure 14.4 showing the set of implemented modules with approximately corresponding cortical locations.
(Figure 14.4 shows the implemented modules, MI muscle combinations, SI tactile detection, PA1 detailed actions for self, SS1 tactile images, DV2 spatial maps, PA3 specific joint plans, DV1 spatial features, PA4 overall plans, PM1 person positions and movements, VI visual features, G goals, PM2 person actions and relations, VV1 object identities, and PM3 social dispositions and affiliations, with motor output and tactile and visual input. Alongside, the system model comprises a motor system, a sensor system, the environment, and modules for goals, overall plans, specific joint plans, detailed plans for self, perceived dispositions, primate actions and relations, social relations, and primate positions and movements.)
Figure 14.4: Modules from neural areas of the primate neocortex, and my initial system model
This approximate correspondence locates the perception hierarchy along the superior temporal sulcus (STS) following David Perrett’s findings, and with episodic memory for social relations in the anterior temporal lobe. Goals are in anterior cingulate. Specific joint plans and detailed plans for self are in dorsal prefrontal. Tactile sensing in somatosensory regions and spatial maps in dorsal-visual regions were used in our extension of the model for social-spacing behaviors. This also used a simple low-level spatial navigation module which could be tentatively identified with PA1.
The model functions by continuously generating and selecting a goal, and elaborating and executing a corresponding plan via its action hierarchy, while perceiving the world using its perception hierarchy, with continuous interactions between these hierarchies.
In summary, we can create a causal model of the brain at the system level if we model each cortical region by a continuously acting process which constructs, stores, and transmits data of the types specific to that region. Processes are connected by channels whose connectivity corresponds to cortico-cortical connectivity. This results in a system model whose dynamics include feedback, goal-direction, conditional plan elaboration, attention and situated action.
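As an illustrative skeleton (not the author's actual implementation), each region can be modeled as a process object with local storage, a set of rules, and reciprocal channels to other modules; the module names and the echo rule below are invented for the sketch:

```python
class Module:
    """A continuously acting process with storage, standing for one
    cortical region; channels correspond to cortico-cortical connections."""

    def __init__(self, name, rules):
        self.name = name
        self.rules = rules        # functions: (inbox, store) -> list of data items
        self.store = set()        # local storage (long term / working memory)
        self.inbox = []           # data items received this cycle
        self.channels = []        # connected modules

    def connect(self, other):
        self.channels.append(other)
        other.channels.append(self)   # connections assumed reciprocal

    def step(self):
        """One processing cycle: run all rules on incoming and stored data."""
        produced = []
        for rule in self.rules:
            produced.extend(rule(self.inbox, self.store))
        self.inbox = []
        return produced

    def send(self, items):
        for ch in self.channels:
            ch.inbox.extend(items)

# Tiny demonstration: SI turns a tactile stimulus into a tactile-image item
# and transmits it to SS1 over their shared channel.
si = Module("SI", [lambda inbox, store: [("tactile_image", tuple(inbox))] if inbox else []])
ss1 = Module("SS1", [])
si.connect(ss1)
si.inbox = ["touch"]
si.send(si.step())
```

After one step, SS1's inbox holds the tactile-image item derived by SI, mirroring the "process data, transmit to connected modules" dynamics described above.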
Predictions. From the model, we can obtain detailed predictions of temporal sequences of spatial distributions of cortical activation, for behaviors represented using the model. These predictions have a detailed time granularity of about 20 msec, and could be compared with fMRI or ERP data. In order to obtain fMRI data for social behaviors, one could perhaps use a visual display showing video-clips of social interactions, or an interactive video game where the subject makes moves in a social interaction game.
Social interaction. The other main new advance is that the model shows how a perception-action model can result in a model of social interaction. This occurs because, in a situation with more than one animal, each animal continuously perceives
the other, and continuously acts toward the other conditionally upon what it perceives. Further, the hierarchical organization of the perception-action system allows a hierarchi- cal description of the social interaction with different levels of control and protocol. The model provides correspondences between measurements of social interaction and the set of cortical regions and their activation patterns.
This chapter. In section 14.2, I derive computational principles from the biology of the cortex, and in section 14.3 I describe how, guided by these principles, I can represent cortical data and processes using predicate logic. In section 14.4, I describe the dynamics of our model. The next chapter gives a detailed description of a specific brain model which I have implemented on a computer, and I report behaviors and results obtained with our implemented model. This model, and its implementation, therefore establish the correctness and feasibility of the approach. They exhibit an actual functioning brain model based on the available empirical evidence.
14.2 The biological basis of our computational approach
In this section, I examine what we know about the primate cortex, and I develop the basic computational elements upon which to design a system-level brain model.
Areas. The primate cortex is partitioned into distinct areas. I therefore structure my computational model into a set of corresponding modules.
Areas have specific interconnectivity. The connectivity among areas is the same, or similar, for all primates. I will connect our modules in the same way. Areas are typically connected to a small number of other areas. Connections divide into long range and short range. At short range, an area is often connected to several neighboring areas that are contiguous with it. At long range, an area is usually connected to one, two or three areas that are further away, and not contiguous with it.
Each area is involved in specific kinds of processing. I will assume that each module processes only certain kinds of data, specific to that module.
Processing is distributed. Areas process data received and/or stored locally by them. There is no central manager or controller. This is a debatable issue. In our view, areas influence each other by data sent between modules. The set of modules works together in an integrated way, but by means of local processing and the exchange of data.
There is a uniform process. As discussed in section 3.3, the cortex seems to have the same operational or computational process over its entire area.
Cortical processing proceeds at a uniform rate. All modules do similar amounts of processing and run at about the same speed.
Data parallelism in communication, storage and processing. I assume that data is coded in parallel codes, such as population codes, so that a large set of parallel fibers carries a code for one message or one meaning. I assume that processing within a module is also highly parallel, operating on a large set of parallel fibers concurrently. Parallel coded data is transmitted, stored, and triggers processing. Processing acts on parallel data to produce parallel data.
The cortex works in real-time. The cortex's fastest reaction time to a stimulus is about 100 milliseconds. The time to process information in one area and to pass it on to the next area is about 20 milliseconds (Edmund Rolls, personal communication). Further, the path from incoming sensory stimulus to outgoing motor command runs through about five areas. Language is processed in real-time, both generation and recognition, and, as Charles Goodwin has shown [Goodwin, 1981], even the co-construction of a sentence usually occurs in real-time, including nonverbal signaling between participants during the generation of the sentence. Hence the action of an area is, at least some of the time, an immediate reaction to its incoming data. Cognition occurs by modules exchanging data and by repeatedly reacting to incoming data and newly computed data. The incoming data arrives at an area, a process occurs in about 20 milliseconds, and output data is transmitted to other modules.
Data is “wide”. The data items being transmitted, stored and processed can involve a lot of information; they can be complex. Thus, if we have a parallel set of one million neurons, then the code for one choice or component of a data item might involve 10,000 neurons, and then the set of neurons might transmit 100 such components or choices simultaneously as one data item. Scientists describing this situation to each other use natural language which is less “wide”, so we tend to unconsciously assume that a data item in the brain is of similar complexity to a natural language word or phrase. However, a single data item, we suggest, may convey as much information as a whole paragraph.
14.3 Representing data and processes using logic
Logical modeling. My intention is to develop a very general and abstract kind of model that can be changed and specialized in the light of results obtained with it. For this purpose, we use predicate logic expressions to represent data and predicate logic inference rules to represent transformations of data. Given an abstract model, we will then later be able to consider more specific implementations of it, in particular, how it can be implemented as a neural net. However, the abstract model is self-contained: it can be run on a computer and its behavior found, and it can generate falsifiable predictions that can be tested against experiment.
Data items, and their storage and transmission. I will assume that we can view transmission and storage of data in the brain as codes which represent some information with a specific meaning. I therefore assume that, for the purposes of modeling at this level of analysis and abstraction, we can view all data streams and storage as made up of discrete data items. I will represent each data item by a logical literal which indicates the meaning of the information contained in the data item. In order to allow for ramping up and attenuation effects, I give every literal an associated weight, or strength, which is a real number. An example data item is position(adam,300,200,0) which might mean that the perceived position of a given other animal, identified by the name “adam”, is given by (x,y,z) coordinates (300,200,0). This might be a data item that is transmitted from one brain module to another. In the brain this would actually be implemented by a set of parallel neurons firing in a spatial pattern at certain firing rates. Its effect however is that the receiving module now has the information about the position of adam.
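A data item of this kind, a logical literal with an associated real-valued weight, can be sketched directly; the class name and field names below are invented for the illustration, and the example mirrors position(adam,300,200,0):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Literal:
    """A data item: a logical literal plus an activation strength."""
    predicate: str
    args: tuple
    weight: float = 1.0   # associated weight, a real number

# The position example from the text, with an illustrative weight.
item = Literal("position", ("adam", 300, 200, 0), weight=0.8)
```

In the brain this item would be carried by a population of parallel neurons; at this level of abstraction only its meaning and strength are retained.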
Memory. Each cortical region will be represented by a continuously acting module which is a process with storage. The main determiner of processing will be the type of data being processed (rather than the function being computed), different regions being specialized for different data types. Every module may in general have stored data items. Depending on the time characteristics of the module, these stored items may constitute volatile, short term or long term memory. Items that are activated as a result of computation will have their activation sustained and will correspond to working memory. Thus, potentially, both long term memory and working memory are distributed over the set of modules; compare [Petrides, 1994].
Processing within a module. I represent the processing within a module by a set of rules. A rule matches to incoming transmitted data items and to locally stored data items. All the processing by a module is described by a set of left-to-right rules which are executed in parallel.
The patterns on the left-hand side of a rule also have weights, and the strength of a rule instance is the product of the weights of the matching data items and of the patterns they match, multiplied by an overall rule weight.
A rule may do some computation which we represent by arithmetic. This should not be more complex than can be expected of a neural net. This arithmetic is represented in the body part of the rule, written as a “provided” expression, for example:
if position(W1,[M1,X1,Y1,Z1]), position(W2,[M2,X2,Y2,Z2])
then too_near(W,[M1,M2,D]),
provided(distance(X1,Y1,Z1,X2,Y2,Z2,D), D < 10.0)
This rule is intended to determine whether one animal is too near another.
The results are then filtered competitively. Typically, only the single strongest rule instance is allowed to “express itself”, by sending its constructed data items to other modules and/or storing them locally. In some cases, however, all the computed data items are allowed through.
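A minimal sketch of this competitive filtering, under the stated assumption that an instance's strength is the product of its matched item weights and a rule weight (the instance data here are hypothetical):

```python
from math import prod

# Hypothetical rule instances: (rule weight, weights of matched items, output datum).
instances = [
    (0.9, [0.8, 0.7], "too_near(adam,alice,5.0)"),
    (0.6, [0.9, 0.9], "oriented_towards(adam)"),
]

def strength(rule_weight, item_weights):
    # Strength of a rule instance: product of the matching item weights
    # multiplied by the overall rule weight.
    return rule_weight * prod(item_weights)

# Competitive filtering: only the strongest instance "expresses itself".
winner = max(instances, key=lambda inst: strength(inst[0], inst[1]))
print(winner[2])
```

With these illustrative weights, the first instance (strength 0.504) beats the second (0.486), so only `too_near(adam,alice,5.0)` is transmitted.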
Uniform process. The uniform process is then the mechanism for storage and transmission of data and the mechanism for execution of rules.
Uniform rate. I achieve uniformity of rate by describing time by a discrete time scale; the model runs in discrete time cycles. In one processing cycle, all the rules in all the modules are executed once, that is, all rule instances, and then all selected data are communicated between modules and/or stored locally. The events of a processing cycle represent and abstract all the changes occurring during that time interval.
Perception-action hierarchy. Modules are organized as a perception-action hierarchy. I diagram my concept in Figure 14.5.
This is an abstraction hierarchy, so that modules higher in the hierarchy process data of more abstract data types. I use a fixed number of levels of abstraction.
The perception hierarchy receives sensory data items at the bottom and derives higher level descriptions to form a percept. The action hierarchy generates more and more detailed descriptions of action, that is, it elaborates the plan to the point where motor actions are generated at the bottom of the action hierarchy.
An example of a perceptual rule is:

if position(M,X,Y,Z) and orientation(M,A) and self_position(X1,Y1,Z1)
then oriented_towards(M),
provided(angle_towards(X,Y,Z,X1,Y1,Z1,A1) and app_equal(A,A1)).

That is, from data giving another primate’s position and orientation, and from the subject’s own position, calculate the angle from the primate to the subject and test whether that angle is approximately equal to the primate’s orientation; if so, create a new datum representing the fact that the other primate is oriented towards the subject.
An example of an action elaboration rule is:

if plan_self_action(walk_towards(M)) and position(M,X,Y,Z)
then plan_self_act(walk_towards(X,Y,Z)).

That is, if the planned action for the self, expressed in relational terms, is to walk towards some primate M, and this primate’s position is (X,Y,Z), then generate a new datum, representing the planned action for the self in detailed positional terms: walk towards the position (X,Y,Z).
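The two example rules could be sketched as ordinary functions; the implementations of `angle_towards` and `app_equal`, and the angular tolerance, are my assumptions, not the book's:

```python
import math

def app_equal(a, b, tol=0.2):
    # Approximate angular equality; the tolerance is an assumed parameter.
    return abs((a - b + math.pi) % (2 * math.pi) - math.pi) < tol

def angle_towards(x, y, x1, y1):
    # Angle of the direction from (x, y) towards (x1, y1), in radians.
    return math.atan2(y1 - y, x1 - x)

def oriented_towards(pos, orientation, self_pos):
    # Perceptual rule: is the other primate's orientation approximately
    # the angle from its position towards the subject's position?
    a1 = angle_towards(pos[0], pos[1], self_pos[0], self_pos[1])
    return app_equal(orientation, a1)

def elaborate_walk_towards(target, positions):
    # Action elaboration rule: plan_self_action(walk_towards(M)) together with
    # position(M,X,Y,Z) yields plan_self_act(walk_towards(X,Y,Z)).
    return ("walk_towards", positions[target])

positions = {"alice": (50, 40, 0)}
print(elaborate_walk_towards("alice", positions))    # ('walk_towards', (50, 40, 0))
print(oriented_towards((0, 0, 0), 0.0, (10, 0, 0)))  # True: facing along +x towards the subject
```

The perceptual rule combines data from several sources (the other's position and orientation, the self position) into a new, more abstract datum, while the elaboration rule moves in the opposite direction, from relational to positional detail.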
The goal module has rules causing it to prioritize the set of goals that it has received, and to select the strongest one, which is then sent to the highest level plan module.
I also use a long-term memory module, which perceives social action and maintains the memory of affiliative relations. It generates goals to affiliate and sends them to the goal module.
The external world. Primates operate in an external environment which is a 3D spatial world. The environment is everything external to the brain, so it includes the body. A primate has sensors which interrogate the environment and generate sensed feature descriptions which are represented as literals. These input data items are sent to specified modules each cycle. Some modules act as effectors in that they send motor commands, represented as literals, to the environment. The environment receives motor commands from all the primates and computes what changes to make. Clearly, primates can only communicate with each other via the environment, since they are not telepathic.
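One way to picture this sense/act loop (all names and the step size are illustrative, not the book's implementation):

```python
# A shared 3D world holding each primate's position; sensors read it and
# motor commands change it. Each brain communicates only via this world.
world = {"adam": [0.0, 0.0, 0.0], "alice": [50.0, 40.0, 0.0]}

def sense(world, observer):
    # Sensors interrogate the environment and return feature literals.
    return [("position", name, tuple(pos))
            for name, pos in world.items() if name != observer]

def apply_motor(world, actor, command):
    # The environment receives a motor command and computes the change;
    # here a move_to command steps the actor a fraction of the way to the goal.
    kind, (x, y, z) = command
    if kind == "move_to":
        px, py, pz = world[actor]
        world[actor] = [px + (x - px) * 0.1,
                        py + (y - py) * 0.1,
                        pz + (z - pz) * 0.1]

# One cycle: adam senses alice, then moves towards her.
percepts = sense(world, "adam")
apply_motor(world, "adam", ("move_to", tuple(world["alice"])))
print(percepts, world["adam"])
```

Each model cycle, every primate's sensors are run against the shared world and every primate's motor commands are folded back into it, so all interaction is mediated by the environment.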
14.4 Dynamics of the model
Perception-action hierarchy. Figure 14.5 shows how a perception-action computational architecture could support the functioning of the brain in behavior. A plan is selected and elaborated, receiving input from the perception hierarchy to allow it to elaborate appropriately.
[Figure 14.5: Functioning of interacting perception and action hierarchies in behavior. Levels, from top to bottom: memory of social relations generating goals, which are prioritized and the selected goal sent on; perception of goals and dispositions paired with joint plan selection; perception in relational terms paired with relational plan elaboration; perception in spatial detail paired with the spatially detailed plan for the self; perception of features paired with generation of detailed motor commands. At each level, attention requests flow from action to perception and evaluations flow back; input arrives from sensors and output goes to effectors.]
Conditional elaboration - situation. Within a given level, the component of the action hierarchy at that level is elaborated down to the next lower level, and evaluations are assessed and transmitted back up to the next higher level. By elaboration I mean taking data which describe action at one level and generating data which describe that action in more detail. More detail includes (1) exactly how to act (which detailed action components), (2) in what order, (3) exactly at what times, (4) exactly where in space, and (5) who will do which actions. I diagram an example of this in Figure 14.6(a). By an evaluation I mean, for example, a value indicating progress, success or failure; such a value can also be associated with a particular datum, for example, one representing an action or goal.
Conditional perception - attention. The perception hierarchy and action hierarchy cooperate closely. The action hierarchy must elaborate the currently selected plan conditionally upon the perceived environment. The modules of the perception hierarchy at a given level derive information required for successful action elaboration at that level. The perception hierarchy receives descriptions representing tuning information and direct requests, attention information, and prediction information, from the action hierarchy. I diagram an example of this in Figure 14.6(b). This information provides a context for perception, and enables the optimal use of processing and communication resources by the perception hierarchy in supporting the realtime action. Thus, my perception-action architecture provides a framework for attention mechanisms.
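A toy sketch of this attention filtering, with hypothetical names: the action hierarchy posts attend(M) requests, and the perception hierarchy only passes up information about attended primates.

```python
# Set of primates the action hierarchy has asked the perception hierarchy
# to attend to (names and data layout are illustrative).
attended = set()

def attend(name):
    # An action module sends an attend(name) data item to a perceptual module.
    attended.add(name)

def perceive(sensed_items):
    # The perceptual module filters raw sensory literals down to those
    # concerning currently attended primates, saving resources.
    return [item for item in sensed_items if item[1] in attended]

sensed = [("position", "alice", (50, 40, 0)), ("position", "bob", (5, 5, 0))]
attend("alice")
print(perceive(sensed))  # only alice's position is passed up
```

This is the resource-allocation point made above: without an attend request, bob's position is sensed but never transmitted up the hierarchy.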
[Figure 14.6: How the model works. (a) Conditional elaboration: a goal such as groom(alice1) is elaborated by relational rules (e.g. groom(M1), far(M1), oriented(M,M1) -> approach(M1)) into an action description in relational terms, then combined with perceived positions such as position(alice,50,40,0) to generate a detailed action, move_to(50,40,0). (b) Attention: an action module concerned with alice sends an attend(alice) data item to a perceptual module and in return receives detailed perceptual information about alice, such as too_near(alice). (c) Confirmation: a rule fires for both alice1 and alice2, producing affiliate(alice1) and affiliate(alice2); only affiliate(alice2) leads to further successful activity (near(alice2), groom(alice2)), so a confirmation message confirm(affiliate(alice2)) strengthens that rule instance, while affiliate(alice1) is not confirmed.]
Continuous action. Action is continuous, with a small time granularity. The primate brain cycle is about 20 milliseconds, and my implementation runs at about 100 milliseconds per cycle on a 300MHz processor. Thus, stored data are updated every cycle, the selection of rule instances is updated every cycle, and updated motor commands are output to the environment every cycle. The process of goal generation, goal selection, plan selection, plan elaboration, action specification and motion specification proceeds continuously, renewing the information every cycle.
The stability of distributed activation. Each module selects a dominant rule which outputs data to other modules. However, this can lead to incoherence: modules can get into states with crossed purposes, and attempts to elaborate plans tend to collapse under challenge. I developed a simple, biologically plausible mechanism which stabilizes distributed activity. If a module receives a data item that causes successful activity, it sends a confirmation message back to the sender, evaluating that data item. Successful activity is defined as any rule firing, not necessarily a selected one. I diagram an example of this in Figure 14.6(c). The confirmation message is specific to, and contains, the particular data item sent. When the sending module receives a confirmation message, it boosts the strength of the rules generating that data item. This consolidates the strength of the selected rules. Further, if a selected rule does not receive confirmation messages, its strength will attenuate, thereby allowing competing choices to be tried.
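A minimal sketch of this boost/attenuate dynamic; the boost factor, decay factor and ceiling are assumed values, not taken from the book:

```python
# Rule strengths in a sending module; only affiliate(alice2) leads to
# successful downstream activity and so receives confirmation messages.
strengths = {"affiliate(alice1)": 1.0, "affiliate(alice2)": 1.0}
confirmed = {"affiliate(alice2)"}

for cycle in range(5):
    for rule in strengths:
        if rule in confirmed:
            # Confirmation boosts the generating rule (ceiling is assumed).
            strengths[rule] = min(strengths[rule] * 1.2, 2.0)
        else:
            # Unconfirmed rules attenuate, letting competitors be tried.
            strengths[rule] *= 0.8

print(strengths)
```

After a few cycles the confirmed choice dominates and the unconfirmed one has attenuated enough that a competing rule instance could win the competitive selection.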
Viable behavioral states. The basic action of the brain model is to try to establish a plan consistent with the response of the environment and with its own motivations. It does this by trying different alternatives at each level on a competitive basis, subject to confirmation of successful elaboration. A state of the model in which such a plan is established, and is executing consistently with perception and confirmed elaboration, can be called a viable state.
Different levels of control. Provided the internally generated goals and the external environment do not change radically, the continuous process of plan elaboration, percep- tion and action will continue. A change in spatial positions will simply result in different positions being perceived, this positional information being passed to the self action module and a different and more appropriate detailed action being generated using the updated position. The other levels will continue as before. Thus, the system will track changes in position.
Greater changes in position, posture, and action may result in different spatial and action relations being perceived at level 3. The relational information passed to the action module at this level may cause a different type of self action to be generated, but one that is still consistent with, and an elaboration of, the more generally specified plan received from level 4.
Thus, the levels of the hierarchy of perception and action correspond to a hierarchy of control concerning variations of (1) new positions and/or orientations, (2) new spatial relations, action types or action phases, (3) new plans, (4) new goals, and (5) new social situations, respectively. This is depicted, using a cortical correspondence diagram, in Figure 14.7.
Joint action. I developed a notion of plan suitable for social action. A joint plan is a set of joint steps, with temporal and causal ordering constraints, each step specifying an action for every primate collaborating in the joint plan, including the subject primate. The way a plan is executed is to attempt each step in turn, and during a step to verify that every collaborating primate is performing its corresponding action and to attempt to execute the corresponding individual action for the subject primate. I made most of the levels of the planning hierarchy work with joint plans; the next to lowest works with a “selfplan” which specifies action only for the subject primate, and the lowest with concrete motor actions. However, the action of these two lowest levels still depended on information received from the perception hierarchy.

[Figure 14.7: Response to variation in environment. A cortical correspondence diagram showing, at each level, the kind of variation handled: basic movement sequencing, and tracking of position and orientation, at the lowest levels (MI, SI, PA1, SA1, DV2); tracking of action phase; sequencing of plan steps (PA3, DV1); plan elaboration maintained by confirmation, with attention changing accordingly (PA4, G, PM1, PM2, PM3); and subcortical systems and long term memory providing the stream of goal messages.]
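The joint plan structure described above might be sketched as follows; the step names are illustrative and the data layout is my own:

```python
# A joint plan: an ordered list of joint steps, each assigning an action
# to every collaborating primate, including the subject.
joint_plan = [
    {"adam": "orient", "alice": "wait"},
    {"adam": "approach", "alice": "orient"},
    {"adam": "grooming_prelude", "alice": "grooming_prelude_response"},
    {"adam": "groom", "alice": "grooming_response"},
]

def execute_step(step, subject, observed_actions):
    # Verify that every collaborating primate is performing its assigned
    # action; if so, return the subject's own action for elaboration.
    for primate, action in step.items():
        if primate != subject and observed_actions.get(primate) != action:
            return None  # a collaborator is not cooperating: step cannot proceed
    return step[subject]

print(execute_step(joint_plan[1], "adam", {"alice": "orient"}))  # 'approach'
print(execute_step(joint_plan[1], "adam", {"alice": "wait"}))    # None
```

Temporal ordering here is just list order; the book's fuller notion also allows causal ordering constraints between steps.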
Imagining brain activity. The modules operate concurrently; that is, they all operate at the same time, in parallel. We can perhaps imagine a cortical surface covered with an array of cortical regions like a patchwork quilt, each region lighting up by a different amount during each cycle. Each region also stores and processes different information in each time interval, so there are different sets of expressions in each module which are constantly changing, and expressions flowing between modules.
The usual image of information activity in the brain is that of information flowing through a set of cortical areas, forming a pathway. We conceive of data flowing through the brain as passing sequentially through a series of brain regions. During each cycle, incoming information may be used to generate new information and/or to store information. In general, the same information is not passed on; instead, new information, derived from all the inputs received at that moment and any information already in storage at that moment, is transmitted. Because of the high processing speed of modules, a high rate of data transmission is maintained through the brain.
To understand how the brain produces behavior, we need a more general concept of computation that allows information transformation activity, combination of information, storage and retrieval of information, and activity conditional upon properties of the information. Our array-of-regions image allows us to think of each neural region as responding conditionally to information, as having time to compute new information and to store information, and as sending different information in different directions to different other areas, including sending information back to areas “upstream”.

Chapter 15

My implemented model of the primate neocortex
Abstract: I demonstrate my approach by proposing a specific causal functioning model of the brain based on its functional architecture.
By using an abstract logical analysis, I develop a computational architecture for the brain in which each cortical region is represented by a computational module with processing and storage abilities. Modules are interconnected according to the connectivity of the corresponding cortical regions.
I report on results obtained with an implementation of this model. I conclude with a brief discussion of some consequences and predictions of this work.
15.1 Our implemented brain model
The choice of behaviors and external environment. In order to define a brain model precisely, I needed to decide what behaviors to consider and what external environment the brain would have. I chose to consider the case of social affiliation. I used a “minisociety”, in which a group of primates (monkeys) interact socially, in a naturalistic 3D environment, with each model primate controlled by a brain model. Thus the instantaneous state of the environment is mainly the positions, orientations and configurations of these primates. I motivated the system by having long term memory, which stores knowledge of affiliative relations, generate, among other things, affiliation goals, since affiliative behavior is a known driving force in primate groups [Kling and Steklis, 1976].
My impulse was to build social interaction into my brain design from its inception. In the event, this has proved to be a fruitful decision; social interaction is arguably the most general type of behavior, and it leads us to construct a general model. Social behavior involves perceiving dynamically-changing environments of primates who have complex dynamics. It involves generating social behavior which is joint and requires real-time coordination of action.
The initial model. I developed an initial brain model consisting of data and process representations, with eight memory modules, shown in Figure 14.4. An outline of each of these memories, the descriptions they store and the processes they include, is as follows:
(i) the affiliations module contains all affiliative information, including kinship and dominance relationship information. It generates affiliation goals and sends them to the goals module.
(ii) the goals module contains all goals currently held. It activates the most important goals and sends this information to the overall plans module.
(iii) the overall plans module receives goals and instantiates suitable joint-plans, sending them to the specific joint plans module.
(iv) the specific joint plans module receives a joint-plan, and generates a detailed action based on descriptions received from the perceptual hierarchy. For the others involved in the joint plan, the detailed action or state is verified, and for the self, its detailed action is sent to the detailed actions for self module.
(v) the detailed actions for self module receives the detailed self action from the specific joint plans module, receives object and location information from lower levels of the perceptual hierarchy, mainly from the primate positions and movements module, and outputs a detailed motor action to the motor system.
(vi) the primate positions and movements module receives sensory descriptions of the state of the external world and provides information on requested primates to the primate actions and relations and detailed actions for self modules.
(vii) the primate actions and relations module computes higher-level descriptions of the action of each primate involved in the current joint action. It requests information on particular primates from the primate positions and movements module.
(viii) the plan primates module receives information from the overall plans module as to which other primates are involved in the joint action, and passes this on to the primate actions and relations module.
(ix) the motor system does some processing to generate the external action given the direct action received from the detailed actions for self module.
Note that we have very much simplified the perceptual and motor hierarchies in this initial model. The perceptual hierarchy is simply the primate positions and movements and primate actions and relations modules, and the action hierarchy is the overall plans, specific joint plans and detailed actions for self modules.
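The module interconnections in this outline can be summarized as a directed graph; the edges below are read off from the text above, and the abbreviated module names are mine:

```python
# Directed connectivity of the initial model: sender -> list of receivers.
connections = {
    "affiliations": ["goals"],
    "goals": ["overall_plans"],
    "overall_plans": ["specific_joint_plans", "plan_primates"],
    "specific_joint_plans": ["detailed_actions_for_self"],
    "detailed_actions_for_self": ["motor_system"],
    "primate_positions_and_movements": ["primate_actions_and_relations",
                                        "detailed_actions_for_self"],
    "plan_primates": ["primate_actions_and_relations"],
    "primate_actions_and_relations": ["specific_joint_plans"],
}

# The goal-to-motor pathway follows the action hierarchy top-down.
path = ["goals", "overall_plans", "specific_joint_plans",
        "detailed_actions_for_self", "motor_system"]
for src, dst in zip(path, path[1:]):
    assert dst in connections.get(src, []), (src, dst)
print("action pathway:", " -> ".join(path))
```

Laid out this way, the simplification noted above is visible: the perceptual hierarchy is the two primate-description modules on the left-hand chain, and the action hierarchy is the three-plan chain down to the motor system.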
Figure 15.1 indicates the data types processed in each module.
[Figure 15.1: Description types in each module. For each module, the figure lists the literal forms it processes, e.g. affiliation(M1,M2) and dominance_level(M,L) in the affiliations module; goal(G) and currently_selected_goal(G) in the goals module; position(M,X,Y,Z), orientation(M,[B,H]) and action(M,A) in the primate positions and movements module; and plan_self_act(SA) and self_act(SA) at the lowest action levels, passing out to the motor system and the environment.]
Figure 15.2 outlines the rules operating in each module, for the case of two-primate grooming.
[Figure 15.2: Outline of description transformations in each module, for the case of two-primate grooming. For example, the goal module computes affiliate-with-subordinate, affiliate-with-dominant and dominate goals and their weights; the plan module generates plan components from the currently selected goal; and the plan_primate_action module holds phase rules for groomer and groomee, such as “if other is sitting and far then walk_towards”, “if near and head_oriented_towards then move closer”, and “if grooming_prelude then grooming_prelude_response”.]
Results were obtained with grooming, social conflict and social spacing behaviors. These simple social behaviors were obtained using about fifteen rules per module.
15.2 Behaviors and results obtained
Two primate grooming behavior and joint action. I experimented with a prototypical situation in which two primates groom. I developed a four phase plan for a groomer (orientation, approach, grooming-prelude, then grooming) and a groomee (waiting, orientation, grooming-prelude-response, then grooming-response), and I developed suitable rules for activity in each module in each phase. I ran my computer implementation and the primates did indeed carry out the four phases described, leading to a primate named adam1 grooming a primate named alice1. I show in Figure 15.3 images from a visualization generated by my system, showing a frame from each of the four phases of grooming.