Backpropagation and the Brain
Timothy P. Lillicrap, Adam Santoro, Luke Marris, Colin J. Akerman and Geoffrey Hinton

Abstract | During learning, the brain modifies synapses to improve behaviour. In the cortex, synapses are embedded within multilayered networks, making it difficult to determine the effect of an individual synaptic modification on the behaviour of the system. The backpropagation algorithm solves this problem in deep artificial neural networks, but historically it has been viewed as biologically problematic. Nonetheless, recent developments in neuroscience and the successes of artificial neural networks have reinvigorated interest in whether backpropagation offers insights for understanding learning in the cortex. The backpropagation algorithm learns quickly by computing synaptic updates using feedback connections to deliver error signals. Although feedback connections are ubiquitous in the cortex, it is difficult to see how they could deliver the error signals required by strict formulations of backpropagation. Here we build on past and recent developments to argue that feedback connections may instead induce neural activities whose differences can be used to locally approximate these signals and hence drive effective learning in deep networks in the brain.

The brain learns by modifying the synaptic connections between neurons1–5. Although synaptic physiology helps explain the rules and processes behind individual modifications, it does not explain how individual modifications coordinate to achieve a network's goal. Since learning cannot be just a blind accumulation of myopic, synapse-specific events that do not consider downstream behavioural consequences, we need to uncover the principles orchestrating plasticity across whole networks if we are to understand learning in the brain.

Within machine learning, researchers study ways of coordinating synaptic updates to improve performance in artificial neural networks, without being constrained by biological reality. They start by defining the architecture of the neural network, which comprises the number of neurons and how they are connected. For example, investigators often use deep networks with many layers of neurons, since these architectures have proved to be very effective for many tasks. Next, researchers define an error function6 that quantifies how poorly the network is currently achieving its goals, and then they search for learning algorithms that compute synaptic changes that reduce the error (Fig. 1).
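Concretely, for outputs [y1,…,yM] and targets [t1,…,tM], one standard error function (the squared error that the article itself invokes later) compares outputs with targets, and a gradient-based learning algorithm then changes each synaptic weight against the gradient of that error; the weight symbol w_ij and learning rate η below are generic textbook notation rather than notation taken from this article:

E = \frac{1}{2}\sum_{m=1}^{M} (y_m - t_m)^2, \qquad \Delta w_{ij} = -\eta\,\frac{\partial E}{\partial w_{ij}}

Backprop is a procedure for computing \partial E / \partial w_{ij} efficiently for every synapse in a deep network; the algorithms compared in Fig. 1 differ in how precisely they estimate this quantity.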
In machine learning, backpropagation of error ('backprop')7–10 is the algorithm most often used to train deep neural networks (Box 1) and is the most successful learning procedure for these networks. Networks trained with backprop are at the heart of recent successes of machine learning, including state-of-the-art speech11 and image recognition12,13, as well as language translation14. Backprop also underpins recent progress in unsupervised learning problems such as image and speech generation15,16, language modelling17 and other next-step prediction tasks18. In addition, combining backprop with reinforcement learning has given rise to significant advances in solving control problems, such as mastering Atari games19 and beating top human professionals in the games of Go20,21 and poker22.

Backprop uses error signals that are sent through feedback connections to adjust synapses and has classically been described in the supervised learning setting (that is, with explicit, externally provided targets). However, the brain appears to use its feedback connections for different purposes23,24 and is thought to learn in a predominantly unsupervised fashion1,25–27, building representations that make explicit the structure that is only implicit in the raw sensory input. It is natural to wonder, then, whether backprop has anything to tell us about learning in the brain25,28–30.

Here we argue that in spite of these apparent differences, the brain has the capacity to implement the core principles underlying backprop. The main idea is that the brain could compute effective synaptic updates by using feedback connections to induce neuron activities whose locally computed differences encode backpropagation-like error signals. We link together a seemingly disparate set of learning algorithms into this framework, which we call 'neural gradient representation by activity differences' (NGRAD)9,27,31–41. The NGRAD framework demonstrates that it is possible to embrace the core principles of backpropagation while sidestepping many of its problematic implementation requirements. These considerations may be relevant to any brain circuit that incorporates both feedforward and feedback connectivity. We nevertheless focus on the cortex, which is defined by its multilaminar structure and hierarchical organization, and so has long been viewed as exhibiting many of the architectural features associated with deep networks.
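As a concrete picture of the 'error signals sent through feedback connections' in the strict formulation, the sketch below implements backprop for a two-layer network in NumPy. It is a generic textbook implementation, not the NGRAD proposal: the layer sizes, learning rate and variable names are illustrative assumptions, and the transposed-weight feedback it relies on is exactly the kind of implementation requirement the article argues the brain could sidestep.

```python
# Minimal sketch of strict backprop in a tiny two-layer network
# (illustrative sizes and learning rate; not code from the article).
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.1, size=(20, 5))   # input -> hidden weights
W2 = rng.normal(scale=0.1, size=(3, 20))   # hidden -> output weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_step(x, t, lr=0.5):
    # Forward pass: feedforward activities.
    h = sigmoid(W1 @ x)                         # hidden activities
    y = sigmoid(W2 @ h)                         # output activities
    E = 0.5 * np.sum((y - t) ** 2)              # squared error

    # Backward pass: vector error signals ('deltas') are carried back along
    # the forward path; strict backprop needs feedback weights that mirror
    # (transpose) the feedforward weights.
    delta_y = (y - t) * y * (1.0 - y)           # error at the output layer
    delta_h = (W2.T @ delta_y) * h * (1.0 - h)  # error delivered to the hidden layer

    # Each synaptic update is local once the delta arrives: presynaptic
    # activity times the postsynaptic error signal.
    W2[:] -= lr * np.outer(delta_y, h)
    W1[:] -= lr * np.outer(delta_h, x)
    return E

x = rng.normal(size=5)                          # toy input
t = np.array([1.0, 0.0, 0.0])                   # toy target
errors = [backprop_step(x, t) for _ in range(200)]  # error decreases over trials
```

An NGRAD-style scheme would aim to recover the same per-synapse information without explicitly transporting delta_y and delta_h, for example by letting feedback drive the network into a second activity state and using locally computed differences between the two states as the error signal.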
Credit assignment in networks
This article emphasizes the role of learning in the generation of adaptive behaviour. It should be acknowledged that brains undoubtedly have prior knowledge that has been optimized by evolution (that is, in the form of neural architectures and default connectivity strengths). Priors may ensure that only a limited amount of learning based on a relatively small amount of task error or feedback information is needed throughout an animal's lifetime to acquire all the skills the animal will exhibit. Nonetheless, although animals often display impressive behaviours from birth, they are also capable of extraordinary feats that could not have been tuned by evolution but instead require long bouts of learning. Some examples of such feats in humans are playing Go and chess; programming a computer or designing a video game; writing and playing a piano concerto; learning the vocabularies and grammars of multiple languages; recognizing thousands of objects; and diagnosing a medical problem and performing vascular microsurgery. Recent work in machine learning suggests that these behaviours depend on powerful and general learning algorithms12,20. Our interest here, then, is in […]

[…] influence — its projective field — rapidly expands, so the effect of changing the synapse strength depends on the strengths of many subsequent synapses in the network (for example, the blue connections spreading from the input layer in […]

[…] of a network is captured by an error function that computes the degree to which the network's outputs [y1,…,yM] deviate from their target values [t1,…,tM] — for example, via the squared error, […]

[Fig. 1 image. Panel a: feedforward network, Hebbian learning, perturbation learning, backpropagation and backprop-like learning with a feedback network, ordered from no feedback through diffuse scalar reinforcement to vector feedback; the legend marks the synapse undergoing learning, the feedback signal (e.g. gradient) and the feedforward and feedback neurons required for learning. Panel b: algorithms ranked by precision of synaptic change in reducing error, from Hebbian learning through weight and node perturbation to backpropagation approximations and backpropagation. Panel c: trajectories of Hebbian learning, perturbation learning and backpropagation on an error landscape over two parameters.]

Fig. 1 | A spectrum of learning algorithms. a | Left to right: a neural network computes an output through a series of simple computational units. To improve its outputs for a task, it adjusts the synapses between these units. Simple Hebbian learning — which dictates that a synaptic connection should strengthen if a presynaptic neuron reliably contributes to a postsynaptic neuron's firing — cannot make meaningful changes to the blue synapse, because it does not consider this synapse's downstream effect on the network output. Perturbation methods measure the change in error caused by random perturbations to neural activities (node perturbation) or synapse strengths44 (weight perturbation) and use this measured change as a global scalar reinforcement signal that controls whether a proposed perturbation is accepted or rejected. The backprop algorithm instead computes the synapse update required in order to most quickly reduce the error. In backprop, vector error signals are delivered backward along the original path of influence for a neuron. In the brain, vector feedback might be delivered in a variety of ways, including via a separate network. b | Backpropagation and perturbation algorithms fall along a spectrum with respect to the specificity of the synaptic change they prescribe. c | Algorithms on this spectrum learn at different speeds. Without feedback, synaptic parameters wander randomly on the error surface. Scalar feedback does not require detailed feedback circuits, but it learns slowly. Since the same signal is used to inform learning at all synapses, the difficulty of deciding whether to strengthen or weaken a synapse scales with the number of synapses in the network: if millions of synapses are changed simultaneously, the effect of one synapse change is swamped by the noise created by all the other changes, and it takes millions of trials to average away this noise43–46. The inverse scaling of learning speed with network size makes global reinforcement methods extremely slow, even for moderately sized neural networks. Precise vector feedback via backprop learns quickly. In real networks, it is not possible to make perfect use of the internal structure of the network to compute per-synapse changes, but the brain may have discovered ways to approximate the speed of backprop.
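To make the speed argument in panel c tangible, the sketch below implements weight perturbation, the simplest scalar-feedback scheme on the spectrum; the layer size, noise scale and accept-or-reject rule are illustrative assumptions rather than details taken from the figure.

```python
# Illustrative weight-perturbation learner (scalar feedback): perturb all
# synapses at random and keep the perturbation only if the global error drops.
import numpy as np

rng = np.random.default_rng(1)

def error(W, x, t):
    # Squared error of a single linear layer standing in for a whole network.
    return 0.5 * np.sum((W @ x - t) ** 2)

W = rng.normal(scale=0.1, size=(3, 5))        # synaptic weights
x = rng.normal(size=5)                        # toy input
t = np.array([1.0, 0.0, 0.0])                 # toy target

for _ in range(2000):
    dW = 0.01 * rng.normal(size=W.shape)      # random change to every synapse
    # The only feedback is one scalar: did the overall error go down?
    if error(W + dW, x, t) < error(W, x, t):
        W = W + dW                            # accept the proposed perturbation
    # Every synapse shares this single number, so the contribution of any one
    # synapse is buried in the noise from all the others, and the number of
    # trials needed grows with the size of the network.
```

A single scalar tells all synapses jointly whether the last random change helped; backprop instead returns a per-synapse gradient after one forward and backward pass, which is why it sits at the fast end of the spectrum in Fig. 1c.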