
A Standard Model of the Mind: AAAI Technical Report FS-17-05

Towards A Unified Theory of Mind and Brain

Stephen Grossberg Center for Adaptive Systems, 677 Beacon Street, Boston University, Boston, MA 02215 [email protected]

Abstract

A unified theory of how brains give rise to minds has been steadily developed for 60 years. This theory has explained data from thousands of interdisciplinary experiments, and scores of its predictions have been confirmed. The theory includes new computational paradigms called Complementary Computing and Laminar Computing that clarify the global organization of brain dynamics and behavior. It uses a small number of basic equations and modules to form modal architectures that model different modalities of intelligence. One of its models, Adaptive Resonance Theory, or ART, is currently the most advanced cognitive and neural theory of how advanced brains incrementally and stably learn to attend, recognize, and predict objects and events in a changing world. All the basic ART predictions have been confirmed by psychological and neural data. These results provide a firm foundation for further development of a Standard Theory of the Mind.

Unified Theory of Mind and Brain

Major design principles, mechanisms, and architectures have been discovered and developed during the past 60 years as part of an emerging unified theory of biological intelligence, notably how brain mechanisms give rise to mental functions as emergent properties. As of this writing, thousands of psychological and neurobiological experiments have been explained and predicted in a unified way, including data about perception, cognition, cognitive-emotional dynamics, and action in both normal individuals and clinical patients. Key articles may be downloaded at http://cns.bu.edu/~steve. These results include a systematic analysis of what is happening in individual brains when they consciously see, hear, feel, or know something, and how we can experience integrated moments of seeing, hearing, feeling, and knowing (Grossberg, 2017).

Brain paradigms: Complementary Computing and Laminar Computing

The possibility of this synthesis is predicated upon the discovery that advanced brains embody novel computational paradigms in order to achieve biological intelligence: Complementary Computing and Laminar Computing.

Complementary Computing. Grossberg (2000) describes how the brain is organized into complementary parallel processing streams whose interactions generate biologically intelligent behaviors. A single cortical processing stream can individually compute some properties well, but cannot, by itself, process other computationally complementary properties. Pairs of complementary cortical processing streams interact to generate emergent properties that overcome their complementary deficiencies to compute complete information with which to represent or control some faculty of intelligent behavior. Thus, although brain anatomy embodies a great deal of functional specialization, there are no "independent modules" in advanced brains. Complementary Computing clarifies how different brain regions can achieve a great deal of specialization without independent modules.

The WHAT and WHERE Cortical Streams are Complementary

For example, the category learning, attention, recognition, and prediction circuits of the ventral, or What, cortical processing stream for perception and cognition are computationally complementary to those of the dorsal, or Where and How, cortical processing stream for spatial representation and action (Grossberg, 2000, 2013, 2017). One reason for What-Where complementarity is that the What stream learns recognition categories that are substantially invariant under changes in an object's view, size, and position. These invariant object categories enable our brains to recognize valued objects without experiencing a combinatorial explosion. They cannot, however, locate and act upon a desired object in space. Where stream spatial and motor representations can locate objects and trigger actions towards them, but cannot recognize them. What stream dynamics embody many aspects of declarative learning and memory, whereas Where stream dynamics realize procedural learning and memory. By interacting together, the
What and Where streams can recognize valued objects and direct appropriate goal-oriented actions towards them.

Adaptive Resonance Theory

Abundant psychological and neurobiological data have confirmed all of my foundational predictions concerning how perceptual/cognitive processes in the What stream use excitatory matching and match-based learning to create self-stabilizing categorical representations of objects and events, notably recognition categories that can be learned quickly without experiencing catastrophic forgetting during subsequent learning. In other words, this recognition learning process solves the so-called stability-plasticity dilemma. These processes enable increasing expertise, and an ever-expanding sense of self, to emerge throughout life. Excitatory matching by object attention is embodied by the ART Matching Rule. This type of attentional circuit enables us to prime our expectations to anticipate objects and events before they occur, and to focus attention upon expected objects and events when they do occur. Good enough matches between expected and actual events trigger resonant states that can support learning of new recognition categories and refinement of old ones, while also triggering conscious recognition of the critical feature patterns that are attended as part of these percepts. Excitatory matching also controls reset of the attentional focus when bottom-up inputs significantly mismatch currently active top-down expectations. Cycles of resonance and reset underlie much of the brain's perceptual and cognitive dynamics.

These matching and learning laws have been articulated as part of Adaptive Resonance Theory, or ART, which has been progressively developed since it was first reported in 1976. ART is a cognitive and neural theory of how the brain autonomously learns to attend, recognize, and predict objects and events in a changing world. ART is currently the most highly developed cognitive and neural theory available, with the broadest explanatory and predictive range. Central to ART's predictive power is its ability to carry out fast, incremental, and stable unsupervised and supervised learning in response to a changing world. ART specifies mechanistic links between processes of consciousness, learning, expectation, attention, resonance, and synchrony (the CLEARS processes) during both unsupervised and supervised learning. I have predicted that all brains that can solve the stability-plasticity dilemma do so using these predicted links between CLEARS processes. Indeed, my 40-year-old prediction that "all conscious states are resonant states" is consistent with all the data that I know, and has helped to explain many data about consciousness, as will be briefly noted below.

ART hereby contributes to functional and mechanistic explanations of such diverse topics as 3D vision and figure-ground perception in natural scenes; optic-flow based navigation in natural scenes towards goals around obstacles and spatial navigation in the dark; invariant object and scenic gist learning, recognition, and search; prototype, surface, and boundary attention; gamma and beta oscillations during cognitive dynamics; learning of entorhinal grid cells and hippocampal place cells, including the use of homologous spatial and temporal mechanisms in the medial entorhinal-hippocampal system for spatial navigation and the lateral stream for adaptively timed cognitive-emotional learning; breakdowns in attentive vigilance during autism, medial temporal amnesia, and Alzheimer's disease; social cognitive abilities such as the learning of joint attention and the use of tools from a teacher, despite the different coordinate systems of the teacher and learner; a unified circuit design for all item-order-rank working memories that enable stable learning of recognition categories, plans, and expectations for the representation and control of sequences of linguistic, spatial, and motor information; conscious speech percepts that are influenced by future context; auditory streaming in noise during source segregation; speaker normalization that enables language learning from after a critical period of babbled sounds by a child; cognitive-emotional dynamics that direct motivated attention towards valued goals; and adaptive sensory-motor control circuits, such as those that coordinate predictive smooth pursuit and saccadic eye movements, and coordinate looking and reaching movements. Brain regions that are functionally described include visual and auditory neocortex; specific and nonspecific thalamic nuclei; inferotemporal, parietal, prefrontal, entorhinal, hippocampal, parahippocampal, perirhinal, and motor cortices; frontal eye fields; supplementary eye fields; amygdala; basal ganglia; cerebellum; superior colliculus; and reticular formation.

These results include many important particulars. For example, all working memories obey an LTM Invariance Principle that enables them to temporarily store sequences of events that can be stably chunked and remembered. This fact implies that all linguistic, spatial, and motor working memories are variants of the same network design. Properties like bounded readily follow from this shared working memory design.

ART also clarifies how so-called attentional bottlenecks may arise from basic constraints on how object learning is regulated and dynamically stabilized by interactions that use spatial and object attention.

ART does not, however, describe many spatial and motor processes and their behaviors, which employ different matching and learning laws. ART is thus not "a theory of everything".

Vector Associative Maps for Spatial Representation and Action

Complementary spatial/motor processes in the Where stream often use inhibitory matching and mismatch-based learning to continually update spatial maps and sensory-motor gains in the trajectory formation processes that control our changing bodies throughout life. Inhibitory matching via a Difference Vector occurs between a representation of where we want to move (a Target Position Vector) and where we are now (a Present Position Vector). When we arrive at where we want to be, the match (viz., the Difference Vector) equals zero. Because a perfect match thus yields zero activity rather than a resonant state, inhibitory matching by this Vector Associative Map, or VAM, Matching Rule cannot solve the stability-plasticity dilemma. That is why spatial and motor representations, notably procedural memories, cannot support conscious qualia. Instead, they experience catastrophic forgetting as they learn how to accurately control our changing bodies throughout life.

Together these complementary processes create a self-stabilizing perceptual/cognitive front end in the What stream for learning about the world, gaining increasing expertise along the way, and becoming conscious of it, while it intelligently commands more labile spatial/motor processes in the Where stream that control our changing bodies. Such a complementary synthesis offers a view towards developing a Standard Model of the Mind.

Homologous Laminar Cortical Circuits for All Biological Intelligence

The second computational paradigm is called Laminar Computing. Laminar Computing further contributes to a Standard Model of the Mind since it describes how the cerebral cortex is organized into layered circuits whose specializations support all higher-order biological intelligence, including but not restricted to cognition. Indeed, the laminar circuits of cerebral cortex seem to realize a revolutionary computational synthesis of the best properties of feedforward and feedback processing, digital and analog processing, and data-driven bottom-up processing and hypothesis-driven top-down processing (Grossberg, 2007, 2013).

For example, ART mechanisms have, to the present, been naturally embodied in laminar cortical models of vision, speech, and cognition, specifically the 3D LAMINART model of 3D vision and figure-ground separation, the cARTWORD model of speech perception, and the LIST PARSE model of cognitive working memory and chunking. Each model uses variations of the same canonical laminar cortical circuitry, thereby clarifying how specialized resonances use similar types of circuits to support different conscious experiences, and providing an existence proof that other kinds of biological intelligence can also be modeled using variations of this canonical laminar circuit.

Why a Unified Theory is Possible: Shared Equations, Modules, and Architectures

There is another fundamental reason why it is possible for scientists to discover a unified mind-brain theory that links brain mechanisms and psychological functions, and to demonstrate how similar organization principles and mechanisms, suitably specialized, can support conscious across modalities.

This reason is that a small number of equations suffice to model all modalities. These include equations for short-term memory, or STM; medium-term memory, or MTM; and long-term memory, or LTM, that I first published in PNAS in 1968. See Grossberg (2013) for a review. These equations are used to define a somewhat larger number of modules, or microcircuits, that are used in multiple modalities where they can carry out different functions within each modality. These modules include shunting on-center off-surround networks, gated dipole opponent processing networks, associative learning networks, spectral adaptively timed learning networks, trajectory formation networks, and the like. Each module and its specializations exhibit a rich, but not universal, set of useful computational properties.

For example, shunting on-center off-surround networks can carry out properties like contrast normalization, including discounting the illuminant; contrast enhancement, noise suppression, and winner-take-all choice from multiple parallel alternatives; short-term memory and working memory storage of event sequences; attentive matching of bottom-up input and top-down learned expectations; and synchronous oscillations and traveling waves.

These equations and modules are specialized and assembled into modal architectures, where "modal" stands for different modalities of biological intelligence, including architectures for vision, audition, cognition, cognitive-emotional interactions, and sensory-motor control.

An integrated self is possible because it builds on a shared set of equations and modules within modal architectures that can interact seamlessly together.

Modal architectures are general-purpose, in that they can process any kind of inputs to that modality, whether from the external world or from other modal architectures. They are also self-organizing, in that they can autonomously develop and learn in response to these inputs. Modal architectures are thus less general than the von Neumann architecture that provides the mathematical foundation of modern computers, but much more general than a traditional AI algorithm. ART networks form part of several different modal architectures, including modal architectures that enable seeing, hearing, feeling, and knowing.

All Conscious States are Resonant States

ART has predicted that "all conscious states are resonant states", but the converse is not true. I know no data that disconfirm this 40-year-old prediction. ART resonances clarify questions such as the following, which have been raised by distinguished philosophers (cf. Grossberg, 2017):

What kind of "event" occurs in the brain during a conscious experience that is anything more than just a "whir of information-processing"? What happens when conscious mental states "light up" and directly appear to the subject? ART explains that, over and above "just" information processing, our brains sometimes go into a context-sensitive resonant state that can involve multiple brain regions. Abundant experimental evidence supports the ART prediction that "all conscious states are resonant states". Not all brain dynamics are "resonant", and thus consciousness is not just a "whir of information-processing".

Second, when does a resonant state embody a conscious experience? And how do different resonant states support different kinds of conscious qualia? The other side of the coin is equally important: When does a resonant state fail to embody a conscious experience? ART explains (Grossberg, 2017) how various evolutionary challenges that advanced brains face in order to adapt to changing environments in real time have been met with particular conscious states, which form part of larger adaptive behavioral capabilities. ART explains that humans are not conscious just to Platonically contemplate the beauty of the world. Humans are conscious in order to enable them to better adapt to the world's changing demands. To illustrate these claims, ART has proposed how the brain generates resonances that support particular conscious experiences of seeing, hearing, feeling, and knowing. In so doing, it suggests how resonances for conscious seeing help to ensure effective reaching, resonances for conscious hearing help to ensure effective speaking, and resonances for conscious feeling help to ensure effective goal-directed action.

The Varieties of Brain Resonances and the Conscious Experiences That They Support

Towards this end, ART has explained six different types of neural representations of conscious qualia, and has provided enough theoretical background and data explanations based on these representations to illustrate their explanatory and predictive power. The theory's predictions also suggest multiple kinds of interdisciplinary experiments to deepen our mechanistic understanding of the brain mechanisms for generating conscious resonances.

For example, surface-shroud resonances are predicted to support conscious percepts of visual qualia. Feature-category resonances are predicted to support conscious recognition of visual objects and scenes. Both kinds of resonances may synchronize during conscious seeing and recognition. Stream-shroud resonances are predicted to support conscious percepts of auditory qualia. Spectral-pitch-and-timbre resonances are predicted to support conscious recognition of sources in auditory streams. Stream-shroud and spectral-pitch-and-timbre resonances may synchronize during conscious hearing and recognition of auditory streams. Item-list resonances are predicted to support recognition of speech and language. They may synchronize with stream-shroud and spectral-pitch-and-timbre resonances during conscious hearing of speech and language, and build upon the selection of auditory sources by spectral-pitch-and-timbre resonances in order to recognize the acoustical signals that are grouped together within these streams. Cognitive-emotional resonances are predicted to support conscious percepts of feelings, as well as recognition of the source of these feelings. Cognitive-emotional resonances can also synchronize with resonances that support conscious qualia and knowledge about them. All of these resonances have distinct anatomical substrates.

Why Does Resonance Trigger Consciousness?

Detailed analyses of psychological and neurobiological data by ART clarify why resonance is necessary for consciousness. For example, in order to fully compute visual boundaries and surfaces whereby to see the world, the brain computes three pairs of complementary computational properties of boundaries and surfaces, along with the three hierarchical resolutions of uncertainty that require multiple processing stages to overcome. There is thus a great deal of uncertainty in the early stages of visual processing by the brain. Only after all three hierarchical resolutions of uncertainty are complete, and after boundaries are completed and surfaces filled-in, has the brain constructed a contextually informative and temporally stable enough representation of scenic objects on which to base adaptive behaviors.

If this is indeed the case, then why do not the earlier stages undermine behavior? The proposed answer is that brain resonance, and with it conscious awareness, is triggered at the processing stage that represents boundary and surface representations, after they are complete and stable enough to control visually-based behaviors like attentive looking and reaching. Consciousness supplies an "extra degree of freedom" to mark these behaviorally predictive representations. ART also explains that object attention obeys what is called the ART Matching Rule, which can operate both top-down and bottom-up to select the information at other processing stages that is consistent with the triggering resonance and to suppress all inconsistent information.

Towards a Standard Model of Mind

The above summary suggests some of the fundamental design principles, mechanisms, and architectures that advanced brains use to create a mind. The summary also notes that these principles, mechanisms, and architectures have been embodied in self-organizing neural network models, which have been used to explain and predict an unrivalled range of mind and brain data. These discoveries thus provide a good foundation for developing a Standard Model of Mind.

All of the models described above have also been applied to large-scale problems in engineering and technology where self-organizing, adaptive, autonomous agents are desired, including mobile agents. A partial list of tech transfers can be found at the CNS Technology Lab website: http://techlab.bu.edu/resources/articles/C5.
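Among the modules named earlier, the shunting on-center off-surround network is credited with contrast normalization. The text does not state its equation, so the sketch below assumes a commonly used feedforward form, dx_i/dt = -A x_i + (B - x_i) I_i - x_i * (sum of I_k over k != i), whose steady state is x_i = B I_i / (A + sum of all I_k). The function names and the values of A, B, dt, and steps are hypothetical and chosen only for illustration.

```python
def shunting_steady_state(inputs, A=1.0, B=1.0):
    """Closed-form equilibrium of the assumed feedforward shunting network."""
    total = sum(inputs)
    return [B * i / (A + total) for i in inputs]


def simulate_shunting(inputs, A=1.0, B=1.0, dt=0.001, steps=20000):
    """Euler integration of the same network, starting from zero activity."""
    x = [0.0] * len(inputs)
    total = sum(inputs)
    for _ in range(steps):
        x = [
            xi + dt * (-A * xi + (B - xi) * ii - xi * (total - ii))
            for xi, ii in zip(x, inputs)
        ]
    return x


dim = shunting_steady_state([1.0, 2.0, 3.0])
bright = shunting_steady_state([10.0, 20.0, 30.0])

# Relative activities are the same for both illumination levels,
# and total activity stays below the upper bound B.
print([round(v / sum(dim), 3) for v in dim])
print([round(v / sum(bright), 3) for v in bright])
print(sum(dim) < 1.0, sum(bright) < 1.0)
```

Under these assumptions, multiplying every input by ten leaves the relative activity pattern unchanged while total activity remains bounded by B, which is one simple reading of "discounting the illuminant". The numerical integration in simulate_shunting converges to the same values as the closed form.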

References

Grossberg, S. (2000). The complementary brain: Unifying brain dynamics and modularity. Trends in Cognitive Sciences, 4, 233-246.
Grossberg, S. (2007). Consciousness CLEARS the mind. Neural Networks, 20, 1040-1053.
Grossberg, S. (2013). Adaptive Resonance Theory: How a brain learns to consciously attend, learn, and recognize a changing world. Neural Networks, 37, 1-47.
Grossberg, S. (2017). Towards solving the Hard Problem of Consciousness: The varieties of brain resonances and the conscious experiences that they support. Neural Networks, 87, 38-95.
