Brain networks for confidence weighting and hierarchical inference during probabilistic learning

Florent Meyniel(a,1) and Stanislas Dehaene(a,b,1)

(a) Cognitive Neuroimaging Unit, NeuroSpin Center, Institute of Life Sciences Frédéric Joliot, Fundamental Research Division, Commissariat à l'Énergie Atomique et aux Énergies Alternatives, INSERM, Université Paris–Sud, Université Paris–Saclay, 91191 Gif/Yvette, France; and (b) Chair of Experimental Cognitive Psychology, Collège de France, 75005 Paris, France

Contributed by Stanislas Dehaene, March 20, 2017 (sent for review September 23, 2016; reviewed by Stephen M. Fleming and Charles R. Gallistel)

Learning is difficult when the world fluctuates randomly and ceaselessly. Classical learning algorithms, such as the delta rule with constant learning rate, are not optimal. Mathematically, the optimal learning rule requires weighting prior knowledge and incoming evidence according to their respective reliabilities. This "confidence weighting" implies the maintenance of an accurate estimate of the reliability of what has been learned. Here, using fMRI and an ideal-observer analysis, we demonstrate that the brain's learning algorithm relies on confidence weighting. While in the fMRI scanner, human adults attempted to learn the transition probabilities underlying an auditory or visual sequence, and reported their confidence in those estimates. They knew that these transition probabilities could change simultaneously at unpredicted moments, and therefore that the learning problem was inherently hierarchical. Subjective confidence reports tightly followed the predictions derived from the ideal observer. In particular, subjects managed to attach distinct levels of confidence to each learned transition probability, as required by Bayes-optimal inference. Distinct brain areas tracked the likelihood of new observations given current predictions, and the confidence in those predictions. Both signals were combined in the right inferior frontal gyrus, where they operated in agreement with the confidence-weighting model. This brain region also presented signatures of a hierarchical process that disentangles distinct sources of uncertainty. Together, our results provide evidence that the sense of confidence is an essential ingredient of probabilistic learning in the human brain, and that the right inferior frontal gyrus hosts a confidence-based statistical learning algorithm for auditory and visual sequences.

confidence | functional MRI | learning | probabilistic inference | Bayesian

Significance

What has been learned must sometimes be unlearned in a changing world. Yet knowledge updating is difficult since our world is also inherently uncertain. For instance, a heatwave in winter is surprising and ambiguous: does it denote an infrequent fluctuation in normal weather or a profound change? Should I trust my current knowledge, or revise it? We propose that humans possess an accurate sense of confidence that allows them to evaluate the reliability of their knowledge, and use this information to strike the balance between prior knowledge and current evidence. Our functional MRI data suggest that a frontoparietal network implements this confidence-weighted learning algorithm, acting as a statistician that uses probabilistic information to estimate a hierarchical model of the world.

Author contributions: F.M. and S.D. designed research; F.M. performed research; F.M. analyzed data; and F.M. and S.D. wrote the paper.
Reviewers: S.M.F., University College London; and C.R.G., Rutgers University.
The authors declare no conflict of interest.
Data deposition: Nonthresholded whole-brain maps are available at neurovault.org/collections/1181/.
(1) To whom correspondence may be addressed. Email: [email protected] or stanislas.[email protected].
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1615773114/-/DCSupplemental.

The sensory data that we receive from our environment are often captured by temporal regularities: for instance, the colors of traffic lights change according to a predictable green–yellow–red pattern; thunder is often followed by rain, etc. Knowledge of those hidden regularities is often acquired through learning, by aggregating successive observations into summary estimates (e.g., the probability of the light turning red when it is currently yellow). When sensory data are received sequentially, learning can be described as an iterative process that updates the internal estimates each time a new observation is received. Learners must therefore constantly balance two sources of information: their current estimates and the new incoming observations.

Any learning algorithm must find a solution to this balancing act. Finding the correct balance is especially critical in a world that is both stochastic and changing (1), i.e., where observations are governed by probabilities that can change over time (a situation called volatility). An excessive reliance on incoming observations will make the learned estimates dominated by fluctuations instead of converging to the true underlying probabilities. Conversely, an excessive reliance on the previously acquired knowledge will slow down the learning process and impede a quick reset of the internal estimates, which is useful when the environment changes.
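To make this trade-off concrete, here is a minimal Python simulation; it is illustrative only and not taken from the study, and the sequence length, change point, and the two learning rates are arbitrary choices of ours. A delta rule with a constant learning rate must pick one point on the trade-off: a high rate tracks changes quickly but is dominated by random fluctuations, whereas a low rate averages out noise but resets slowly after a change.

```python
import numpy as np

rng = np.random.default_rng(0)

# A stochastic and changing world: a hidden Bernoulli probability
# that steps from 0.2 to 0.8 halfway through the sequence.
# (Illustrative values, not the task parameters used in the paper.)
p_true = np.concatenate([np.full(150, 0.2), np.full(150, 0.8)])
observations = (rng.random(300) < p_true).astype(float)

def delta_rule(obs, alpha):
    """Delta rule with constant learning rate alpha: each observation
    moves the estimate by a fixed fraction of the prediction error."""
    estimate, path = 0.5, []
    for x in obs:
        estimate += alpha * (x - estimate)
        path.append(estimate)
    return np.array(path)

fast = delta_rule(observations, alpha=0.4)   # reacts to the change, but noisy
slow = delta_rule(observations, alpha=0.02)  # smooth, but slow to reset
```

No single constant learning rate resolves the dilemma; the normative solution discussed next instead adapts the effective learning rate from trial to trial.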
In the 1950s, learning algorithms were proposed to solve this weighting problem optimally under specific conditions (e.g., the Kalman filter and subsequent developments; ref. 2). A general and normative solution to this problem requires weighting each source of information according to its reliability (3–12). According to this Bayes-optimal solution, any discrepancy between a new observation and a learned estimate should lead to an update of this internal estimate, but the size of this update should decrease as the prior confidence in this internal estimate increases. Furthermore, this prior confidence should depend on two factors: the precision of the current internal estimate and a discounting factor that takes into account the possibility that a change occurred. Indeed, a change in environmental parameters would render the current estimate useless for predicting future observations. An optimal algorithm should therefore maintain distinct types of uncertainty, organized in a hierarchical manner: future events are uncertain because they are governed by probabilities (level 1); these probabilities themselves are known only with a certain precision (level 2); and this precision is limited because a change may have occurred (level 3).
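The sketch below shows what such a hierarchical observer could look like for a single Bernoulli probability. It is a simplification of ours, not the paper's ideal observer (which tracks two transition probabilities jointly); the grid resolution and the assumed change probability P_CHANGE are illustrative. The three levels of uncertainty appear as three distinct steps of the update.

```python
import numpy as np

GRID = np.linspace(0.001, 0.999, 200)        # candidate values of the hidden probability
FLAT = np.full(GRID.size, 1.0 / GRID.size)   # flat prior, reinstated after a change
P_CHANGE = 1.0 / 75.0                        # assumed per-observation change probability

def update(posterior, x):
    """One confidence-weighted update of the belief about a Bernoulli parameter."""
    # Level 3: a change may have occurred, which discounts prior confidence.
    predictive = (1.0 - P_CHANGE) * posterior + P_CHANGE * FLAT
    # Level 1: score each candidate probability by how well it predicts x in {0, 1}.
    likelihood = GRID if x == 1 else 1.0 - GRID
    # Level 2: the posterior carries its own precision, so the impact of a new
    # observation automatically shrinks as confidence in the estimate grows.
    posterior = predictive * likelihood
    return posterior / posterior.sum()

def estimate_and_confidence(posterior):
    """Posterior mean, and confidence summarized as the posterior's precision."""
    m = (GRID * posterior).sum()
    sd = np.sqrt(((GRID - m) ** 2 * posterior).sum())
    return m, 1.0 / sd**2
```

Because the discounting step keeps the posterior from ever becoming arbitrarily sharp, this observer retains a capacity for rapid revision after a change point, exactly the property that a constant learning rate lacks.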
In summary, efficient learning requires the learner to maintain and constantly update an accurate representation of the confidence in what has been learned. In a previous behavioral study (13), by engaging human adults in a probability-learning task, we showed that they possess a sense of confidence in what has been learned that is remarkably similar to the optimal algorithm. Here, we propose that learning approaches optimality in humans because it shares two features of the optimal algorithm: (i) it relies on a sense of confidence that serves as a weighting factor to balance prior estimates and new observations; and (ii) confidence is organized hierarchically, taking into account higher-order factors such as volatility. We aim to provide behavioral and functional magnetic resonance imaging (fMRI) evidence on the brain mechanisms that implement such hierarchical, confidence-weighted learning.

We submitted human subjects to a learning task that requires the tracking of transition probabilities (i.e., the conditional probability of observing the current item, given the identity of the immediately previous item), knowing that those probabilities fluctuate randomly over time (e.g., Fig. 1A). This task is similar to previous studies (1, 14–21), but with several additional features. First, although many studies resort to reinforcement-learning tasks that evaluate the learned probabilities indirectly and implicitly (1, 17, 19), we opted

Fig. 1. A normative role for confidence during probabilistic inference. (A) Subjects were asked to continuously estimate the probabilities p(A|B) and p(A|A) governing the successions between two stimuli, while these probabilities underwent occasional stepwise changes. The probability of a change point was held constant, but transition probabilities and change times varied randomly between sessions and participants. Subjective confidence levels were probed occasionally, at trials marked in red. (B) Example sequence. Actual A and B stimuli corresponded to clearly distinguishable pairs of sounds or visual symbols, presented in distinct auditory and visual sessions. (C) The likelihood distributions of transition probabilities were estimated by a Bayesian observer, based on the observations received so far and the generative model. Likelihood distributions are presented as slices as a function of time. Each slice corresponds to a marginal distribution from the actual 2D space defined by combinations of p(A|B) and p(A|A). Confidence in the estimated transition probabilities corresponds to the (negative log) SD of those marginal distributions. (D) Bayesian theory predicts that confidence should serve as a weighting factor that modulates the incoming evidence during optimal inference: the update
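As a companion to Fig. 1C, the sketch below attaches a distinct confidence level to each transition probability, as Bayes-optimal inference requires. It is a deliberately simplified stand-in of ours, not the paper's model: instead of the full change-point machinery, old evidence is exponentially leaked (the leak parameter is our own device), and confidence is read out as the negative log SD of a Beta posterior, the same summary statistic as in Fig. 1C.

```python
import numpy as np
from scipy.stats import beta

def transition_confidences(sequence, leak=0.02):
    """Estimate p(A|A) and p(A|B) from a sequence of 'A'/'B' items and attach
    a distinct confidence to each, reflecting how much (recent) evidence
    bears on each transition."""
    counts = {"A": [0.0, 0.0], "B": [0.0, 0.0]}  # per context: [n of A, n of B]
    for prev, cur in zip(sequence, sequence[1:]):
        for c in counts.values():   # leak old evidence: crude change-point discount
            c[0] *= 1.0 - leak
            c[1] *= 1.0 - leak
        counts[prev][0 if cur == "A" else 1] += 1.0
    report = {}
    for ctx, (n_a, n_b) in counts.items():
        post = beta(1.0 + n_a, 1.0 + n_b)        # Beta(1,1) prior + leaked counts
        report[f"p(A|{ctx})"] = {"estimate": post.mean(),
                                 "confidence": -np.log(post.std())}
    return report

# The two confidence levels dissociate when one context is rarer than the other.
print(transition_confidences("ABABBBABBABBBABB"))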