CBMM: the Science and Engineering of Intelligence
The Center for Brains, Minds and Machines (CBMM) is a multi-institutional NSF Science and Technology Center dedicated to the study of intelligence - how the brain produces intelligent behavior and how we may be able to replicate intelligence in machines.
Cognitive Science | Machine Learning, Computer Science | Neuroscience, Computational
Science + Engineering
Funding 2013-2023: ~$50M
Research Institutions: ~4
Educational Institutions: 12
Faculty (CS+BCS+…): ~23
Researchers: ~100
Publications: ~500
CBMM’s focus is the Science and the Engineering of Intelligence
We aim to make progress in understanding the greatest of all problems in science: the problem of intelligence. This means understanding how the brain makes the mind, how the brain works and how to build intelligent machines. We believe that the science of intelligence will enable better engineering of intelligence in the long term.
Key recent advances in the engineering of intelligence have their roots in basic research on the brain
The problem of intelligence is the greatest problem in science
The CBMM bet (different from DeepMind): understand how the brain works, (then) make intelligent machines
CBMM Organizational Chart (future)
[Organizational chart. Director: Tomaso Poggio. EAC. Leadership roles shown include Managing, Deputy and Associate Director and directors/coordinators for Research, Education, Diversity, KT, Evaluation and Trainees. People shown: Kathleen Sullivan, Lizanne DeStefano, Mandana Sassanfar, Gabriel Kreiman, Kenneth Blum, Ellen Hildreth, Boris Katz, Matt Wilson (affiliations shown: WC, MIT, GT, HU). Also shown: Administrative Assistant, Technology Director.]
Module 1: VISUAL STREAM
Module 2: BRAIN OS
Module 3: COGNITIVE CORE
Module 4: TOWARDS SYMBOLS
Module leaders shown: Tomaso Poggio, Jim DiCarlo, Gabriel Kreiman, Nancy Kanwisher, Joshua Tenenbaum, Boris Katz, Shimon Ullman (affiliations shown: HU, MIT)
EAC- May, 2020
CBMM Participants
[Figure: bar chart of CBMM participants per year (Year 1 through Year 6, and Year 7), grouped by Total, Faculty, Postdocs, Staff/Other, Undergrads, Grad Students and Research Scientist; vertical axis from 10 to 150.]
EAC
Demis Hassabis, DeepMind
Charles Isbell, Jr., Georgia Tech
Christof Koch, Allen Institute
Fei-Fei Li, Stanford
Lore McGovern, MIBR, MIT
Joel Oppenheim, NYU
Pietro Perona, Caltech
Marc Raibert, Boston Dynamics
Judith Richter, Medinol
Kobi Richter, Medinol
Amnon Shashua, Mobileye
David Siegel, Two Sigma
Susan Whitehead, MIT Corporation
Jim Pallotta, The Raptor group
Research, Education & Diversity Partners
MIT: Boyden, Desimone, DiCarlo, Kaelbling, Kanwisher, Katz, McDermott, Oliva, Poggio, Roy, Sassanfar, Saxe, Schulz, Tegmark, Tenenbaum, Ullman, Wilson, Torralba
Harvard: Blum, Gershman, Kreiman, Livingstone, Sompolinsky, Spelke
Boston Children’s Hospital: Kreiman
Harvard Medical School: Kreiman, Livingstone
Florida International U.: Finlayson
Howard U.: Chouika, Manaye, Rwebangira, Salmani
Hunter College: Chodorow, Epstein, Sakas, Zeigler
Johns Hopkins U.: Isik
Queens College: Brumberg
Rockefeller U.: Freiwald
Stanford U.: Goodman
Universidad Central Del Caribe (UCC): Jorquera
University of Central Florida: McNair Program
Museum of Science, Boston
UMass Boston: Blaser, Ciaramitaro, Pomplun, Shukla
UPR - Mayagüez: Santiago, Vega-Riveros
UPR - Río Piedras: Garcia-Arraras, Maldonado-Vlaar, Megret, Ordóñez, Ortiz-Zuazaga
Wellesley College: Hildreth, Wiest, Wilmer
International and Corporate Partners
A*STAR: Chuan Poh Lim
Genoa U.: Verri, Rosasco
Hebrew U.: Weiss
KAIST: Sangwan Lee
IIT: Cingolani
MPI: Bülthoff
Weizmann: Ullman
Google, IBM, Microsoft, Orcam, Siemens, Honda, Fujitsu, NVIDIA, Boston Dynamics, DeepMind, GE, Schlumberger, Mobileye, Intel
Videos - ~950 (May 2014 - April 2020)
(of YouTube subscribers only - 18% of viewers)
Ellen Hildreth
Diversity Program
Mandana Sassanfar
Code, Software and Datasets
ObjectNet: A new benchmark for object recognition (in prep.)
Andrei Barbu, David Mayo, Josh Tenenbaum, Boris Katz
Existing object detection benchmarks overstate the performance of machines, and understate the performance of humans. We are creating a dataset that removes biases and shows that machines are far inferior to humans when detecting objects.
There’s Waldo! A Normalization Model of Visual Search Predicts Single-Trial Human Fixations in an Object Search Task
Thomas Miconi, Laura Groomes and Gabriel Kreiman
Cerebral Cortex 2016
http://klab.tch.harvard.edu/resources/miconietal_visualsearch_2016.html
Partially Occluded Hands
B. Myanganbayar, C. Mata, G. Dekel, B. Katz, G. Ben-Yosef, A. Barbu
A dataset of RGB images of hands holding objects and interacting with objects. Measured human accuracy on reconstructing occluded portions of hands. People are extremely good at this task while networks are at near chance-level performance.
Summer Course at Woods Hole: Our flagship initiative
Brains, Minds & Machines Summer Course
An intensive three-week course gives advanced students a “deep” introduction to the problem of intelligence
Directors
Ellen Hildreth
Kathleen Sullivan
Gabriel Kreiman
Lizanne DeStefano
Kris Brewer
Boris Katz
Kenny Blum
A self-reproducing community of scholars is being formed
More than 300 applicants, ~30 accepted
Fellowships sponsored by GoogleX, Hidary Foundation + Fujitsu
CBMM Summer School
• Signature CBMM (Education/Knowledge Transfer) activity aimed at creating an intergenerational community around the science and engineering of intelligence.
• Students reported that lectures, project work, and interactions with faculty, TAs, and peers strongly influenced their own thinking and research development.
Our vision and mission
understand how the brain works, (then) make intelligent machines
WHY?
Recent Success Stories in AI are based on RL and DL
DL and RL come from neuroscience
[Figure labels: RL, DL, Minsky’s SNARC]
Vision for the BMM Summer School
We focus on the combination of neuroscience and engineering to make progress on the problem of intelligence because, as in the recent past, several of the next breakthroughs in ML and AI are likely to come from neuroscience AND engineering
A quick recap of 40 of the last ~50 years of neuroscience and ML, through my eyes
1972-2013
Tuebingen, MPI fuer BK (1972-1981)
Werner Reichardt’s PhD
Werner with Dr. Ruska (center). Photo dated Nov. 17, 1952 (courtesy B. Reichardt)
The four directors of the MPI fuer Biologische Kybernetik
The beautiful eyes of flies
Work at 3 levels
• Fixation and tracking behavior of the fly
• Motion algorithms and circuits: the beetle (and the fly); relative motion algorithms and circuits
• Biophysics of computation
Fixation and tracking behavior: Reichardt’s closed loop flight simulator
Fixation and tracking behavior
Poggio, T. and W. Reichardt. A Theory of Pattern Induced Flight Orientation of the Fly, Musca domestica, Kybernetik, 12, 185-203, 1972.
Cognition in flies: probabilistic theories then (coming only now to humans)
The beginning of untethered flight analysis
Bülthoff, Poggio & Wehrhahn, Z. Naturforsch. 35c, 811-815 (1980)
▪ most behavioral fly research was done with the Götz torque meter
▪ in 1976, based on this recording technology, Reichardt & Poggio developed their theory: Visual control of orientation behaviour in the fly, Parts I + II. Quart. Rev. Biophysics 9(3), 311-375
▪ open question: how well does this theory describe fly behavior in natural flight?
▪ in 1980 Wehrhahn started high-speed film recording of flies chasing each other
▪ single frame analysis
▪ 3D stereo reconstruction
Cognitive theory of basic fly instincts predicts trajectory of chasing fly …
Wehrhahn, C., T. Poggio and H. Bülthoff, Biological Cybernetics, 45, 123-130, 1982.
Cognition in flies
Geiger, G. and T. Poggio. The Müller-Lyer Figure and the Fly, Science, 190, 479-480, 1975.
Work at 3 levels
• Fixation and tracking behavior of the fly (cognition in the fly…similar to Bayesian approach to cognition in humans…no neurons!)
• Motion algorithms and circuits: the beetle (and the fly); relative motion algorithms and circuits
• Biophysics of computation
Motion algorithm: the beetle Chlorophanus and Reichardt’s motion detector
Motion algorithm: the beetle and the fly
• The beetle follows the motion
• Each photoreceptor sees only an alternation of dark and light: how is motion computed?
• Reichardt and Hassenstein (and Peter Kunze) found the rules used by neural circuits. The algorithm (refined by D. Varju) explained many data: the Reichardt detector.
• The same model describes motion perception in flies: beautiful papers on anatomy, optics and organization of motion perception by Braitenberg, Kirschfeld, Götz.
• An equivalent (“energy”) model (Adelson) describes motion cells in primate cortex.
• A form of it has been used by Matsushita in the first chips to stabilize video cameras (see also Bülthoff, Little and Poggio, Nature, 1989)
Relative motion and figure-ground discrimination: the fly
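As a rough illustration of the Reichardt detector named in the bullets above, the following Python sketch correlates the signals of two neighbouring photoreceptors. It is a minimal sketch under stated assumptions, not the model as published: the first-order low-pass filter standing in for the delay, the time constant `tau`, the sinusoidal stimulus, and all function names are hypothetical choices made for this example.

```python
import numpy as np


def low_pass(signal, tau, dt):
    """First-order low-pass filter, used here as the 'delay' stage (assumption)."""
    alpha = dt / (tau + dt)
    out = np.zeros(len(signal))
    for i in range(1, len(signal)):
        out[i] = out[i - 1] + alpha * (signal[i] - out[i - 1])
    return out


def correlation_detector(left, right, tau=0.05, dt=0.001):
    """Time-averaged output of an opponent correlation-type motion detector.

    Each half-detector multiplies the delayed signal of one receptor with the
    undelayed signal of its neighbour; the two mirror-symmetric halves are
    then subtracted.
    """
    one_way = low_pass(left, tau, dt) * right
    other_way = low_pass(right, tau, dt) * left
    return float(np.mean(one_way - other_way))


def receptor_signals(direction, dt=0.001, duration=2.0, freq=2.0, phase=np.pi / 4):
    """Two receptors viewing a drifting sinusoid (hypothetical stimulus).

    direction=+1: the right receptor sees the pattern later than the left one.
    direction=-1: the right receptor sees it earlier.
    """
    t = np.arange(0.0, duration, dt)
    left = np.sin(2 * np.pi * freq * t)
    right = np.sin(2 * np.pi * freq * t - direction * phase)
    return left, right


if __name__ == "__main__":
    for direction in (+1, -1):
        left, right = receptor_signals(direction)
        print(direction, correlation_detector(left, right))
```

Each receptor alone sees only an alternation of light and dark, yet the sign of the averaged output follows the direction of motion: it is positive for `direction=+1`, negative for `direction=-1`, and zero when both receptors see the identical signal.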
Work by Werner Reichardt (with Poggio and Hausen and later with M. Egelhaaf and A. Borst)
Motion discontinuities and figure-ground discrimination: neural circuitry
Towards the neural circuitry, Reichardt, Poggio, Hausen, 1983
Relative motion
Two of the neurons…
Hermann Cuntz, Jürgen Haag, and Alexander Borst, 2003
Work at 3 levels
• Fixation and tracking behavior of the fly
• Motion algorithms and circuits: the beetle (and the fly); relative motion algorithms and circuits (similar in spirit to HMAX and DiCarlo)
• Biophysics of computation
Biophysics of computation (motion detection)
Biophysics of Computation
Computational vision and regularization theory
Tomaso Poggio, Vincent Torre* & Christof Koch
Artificial Intelligence Laboratory and Center for Biological Information Processing, Massachusetts Institute of Technology, 545 Technology Square, Cambridge, Massachusetts 02193, USA
* Istituto di Fisica, Università di Genova, Genova, Italy
Descriptions of physical properties of visible surfaces, such as their distance and the presence of edges, must be recovered from the primary image data. Computational vision aims to understand how such descriptions can be obtained from inherently ambiguous and noisy data. A recent development in this field sees early vision as a set of ill-posed problems, which can be solved by the use of regularization methods. These lead to algorithms and parallel analog circuits that can solve 'ill-posed problems' and which are suggestive of neural equivalents in the brain.
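To make the idea of regularizing an ill-posed problem concrete, here is a minimal Python sketch of standard Tikhonov regularization applied to a toy 1-D surface reconstruction: a few depth samples do not determine the surface uniquely, and a smoothness stabilizer selects one solution. The sketch minimizes ||A z - y||^2 + lam * ||P z||^2. The sampling operator `A`, the second-difference stabilizer `P`, the weight `lam`, and the function name are assumptions made for this example, not the paper's implementation.

```python
import numpy as np


def regularized_reconstruction(samples, positions, n, lam):
    """Reconstruct n surface values from sparse samples.

    Minimises ||A z - y||^2 + lam * ||P z||^2, where A picks out the sampled
    positions and P takes second differences (a smoothness stabiliser).
    Requires at least two distinct sample positions.
    """
    samples = np.asarray(samples, dtype=float)
    positions = np.asarray(positions, dtype=int)
    A = np.zeros((len(positions), n))
    A[np.arange(len(positions)), positions] = 1.0
    P = np.diff(np.eye(n), n=2, axis=0)  # rows of the form [1, -2, 1]
    # Normal equations of the regularised least-squares problem.
    lhs = A.T @ A + lam * (P.T @ P)
    rhs = A.T @ samples
    return np.linalg.solve(lhs, rhs)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    positions = np.array([0, 7, 15, 22, 29])
    true_surface = np.sin(np.linspace(0.0, np.pi, 30))
    noisy = true_surface[positions] + 0.05 * rng.standard_normal(len(positions))
    estimate = regularized_reconstruction(noisy, positions, n=30, lam=1.0)
    print(np.round(estimate, 2))
```

Without the stabilizer term the linear system is singular, because unsampled positions are unconstrained; adding `lam * P.T @ P` makes the solution unique. Larger values of `lam` trade fidelity to the samples for smoothness.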
COMPUTATIONAL vision denotes a new field in artificial intelligence, centred on theoretical studies of visual information processing. Its two main goals are to develop image understanding systems, which automatically construct scene descriptions from image input data, and to understand human vision.
Early vision is the set of visual modules that aim to extract the physical properties of the surfaces around the viewer, that is, distance, surface orientation and material properties (reflectance, colour, texture). Much current research has analysed processes in early vision because the inputs and the goals of the computation can be well characterized at this stage (see refs 1-4 for reviews). Several problems have been solved and several specific algorithms have been successfully developed. Examples are stereomatching, the computation of the optical flow, structure from motion, shape from shading and surface reconstruction.
A new theoretical development has now emerged that unifies much of these results within a single framework. The approach has its roots in the recognition of a common structure of early vision problems. Problems in early vision are 'ill-posed', requiring specific algorithms and parallel hardware. Here we introduce a specific regularization approach, and discuss its implications for computer vision and parallel computer architectures, including parallel hardware that could be used by biological visual systems.
Early vision processes
Early vision consists of a set of processes that recover physical properties of the visible three-dimensional surfaces from the two-dimensional intensity arrays. Their combined output roughly corresponds to Marr's 2-1/2D sketch and to Barrow and Tennenbaum's intrinsic images (ref. 5). Recently, it has been customary to assume that these early vision processes are general and do not require domain-dependent knowledge, but only generic constraints about the physical world and the imaging stage (see box). They represent conceptually independent modules that can be studied, to a first approximation, in isolation. Information from the different processes, however, has to be combined. Furthermore, different modules may interact early on. Finally, the processing cannot be purely 'bottom-up': specific knowledge may trickle down to the point of influencing some of the very first steps in visual information processing.
Computational theories of early vision modules typically deal with the dual issues of representation and process. They must specify the form of the input and the desired output (the representation) and provide the algorithms that transform one into the other (the process). Here we focus on the issue of processes and algorithms for which we describe the unifying theoretical framework of regularization theories. We do not consider the equally important problem of the primitive tokens that represent the input of each specific process.
A good definition of early vision is that it is inverse optics. In classical optics or in computer graphics the basic problem is to determine the images of three-dimensional objects, whereas vision is confronted with the inverse problem of recovering surfaces from images. As so much information is lost during the imaging process that projects the three-dimensional world into the two-dimensional images, vision must often rely on natural constraints, that is, assumptions about the physical world, to derive unambiguous output. The identification and use of such constraints is a recurring theme in the analysis of specific vision problems.
Two important problems in early vision are the computation of motion and the detection of sharp changes in image intensity (for detecting physical edges). They illustrate well the difficulty of the problems of early vision. The computation of the two-dimensional field of velocities in the image is a critical step in several schemes for recovering the motion and the three-dimensional structure of objects. Consider the problem of determining the velocity vector V at each point along a smooth contour in the image. Following Marr and Ullman (ref. 6), one can assume that the contour corresponds to locations of significant intensity change. Figure 1 shows how the local velocity vector is decomposed into a normal and a tangential component to the curve. Local motion measurements provide only the normal component of velocity. The tangential component remains 'invisible' to purely local measurements (unless they refer to some discontinuous features of the contour such as a corner). The problem of estimating the full velocity field is thus, in general, underdetermined by the measurements that are directly available from the image. The measurement of the optical flow is inherently ambiguous. It can be made unique only by adding information or assumptions.
The difficulties of the problem of edge detection are somewhat different. Edge detection denotes the process of identifying the …
Examples of early vision processes
• Edge detection
• Spatio-temporal interpolation and approximation
• Computation of optical flow
• Computation of lightness and albedo
• Shape from contours
• Shape from texture
• Shape from shading
• Binocular stereo matching
• Structure from motion
• Structure from stereo
• Surface reconstruction
• Computation of surface colour
© 1985 Nature Publishing Group
Proc. R. Soc. Lond. B. 202, 409-416 (1978)
Printed in Great Britain
A synaptic mechanism possibly underlying directional selectivity to motion
By V. Torre† and T. Poggio‡
† Università di Genova, Istituto di Fisica, Genoa, Italy
‡ Max-Planck-Institut für biologische Kybernetik, Tübingen, Germany
(Communicated by B. B. Boycott, F.R.S. - Received 1 February 1978)
A specific synaptic interaction is proposed as the mechanism underlying the directional selectivity to motion of several nervous cells. It is shown that the hypothesis is consistent with previous behavioural and physiological studies of the motion detection process.
Detection of movement is one of the most basic and elementary computations performed by visual systems. Hence it is not surprising that the mechanisms and principles underlying movement detection have been approached in various species with a variety of techniques from behavioural analysis and psychophysics to physiology. Although several investigators have provided a wealth of information in the last years, the early analyses of Hassenstein & Reichardt (1956), Reichardt (1957, 1961), Barlow & Hill (1963), and Barlow & Levick (1965) still represent the extent of our understanding of this function. These studies are in many respects complementary. Those of Reichardt & Hassenstein are centred on the functional principles of movement detection as inferred from the average optomotor behaviour of a whole insect, whereas Barlow & Levick attack the problem of the neural circuitry underlying directional selectivity in the ganglion cells of a vertebrate retina. Figure 1 a and b summarize the main conclusions of the two approaches. Both models postulate the existence of two types of channels (1 and 2, from two adjacent receptor regions) with different conduction properties. In figure 1 a, channel 1 and channel 2 are low pass filters with a short and a long time constant, respectively, while in figure 1 b, channel 2 simply contains a delay. Perhaps the most significant contribution of Barlow & Levick consists of the experimental recognition that movement detection, at the level of direction selectivity of the ganglion cells, results primarily from an inhibitory mechanism that ‘vetoes’ the response to simultaneous signals from the receptors (after appropriate asymmetric delay) rather than from the detection of the conjunction of excitation from two regions (see figure 1).
On the other hand, the main thrust of Hassenstein & Reichardt’s analysis is the demonstration that the interaction underlying movement detection must be nonlinear and, in particular, of a multiplicative type. Many experimental data suggest that this is indeed the functional scheme underlying movement detection in insects (Poggio & Reichardt 1976).
Cooperative Computation of Stereo Disparity
Cooperative neural network for stereo
D. Marr; T. Poggio
Science, New Series, Vol. 194, No. 4262. (Oct. 15, 1976), pp. 283-287.
Stable URL: http://links.jstor.org/sici?sici=0036-8075%2819761015%293%3A194%3A4262%3C283%3ACCOSD%3E2.0.CO%3B2-1