Modeling Uncertainty-Seeking Behavior Mediated by Cholinergic Influence on Dopamine
Total Page:16
File Type:pdf, Size:1020Kb
bioRxiv preprint doi: https://doi.org/10.1101/699595; this version posted July 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. Modeling Uncertainty-Seeking Behavior Mediated by Cholinergic Influence on Dopamine Marwen Belkaid1,2,* and Jeffrey L. Krichmar3,4 Abstract 1 Introduction Recent findings suggest that acetylcholine me- Animals constantly face uncertainty due to diates uncertainty-seeking behaviors through noisy and incomplete information about the its projection to dopamine neurons – another environment. From the information-processing neuromodulatory system known for its major perspective, uncertainty is typically considered implication in reinforcement learning and a burden, an issue that has to be resolved decision-making. In this paper, we propose a for the animal to behave correctly [Cohen leaky-integrate-and-fire model of this mecha- et al., 2007; Rao, 2010]. In the framework of nism. It implements a softmax-like selection reinforcement learning, for example, to allow with an uncertainty bonus by a cholinergic optimal exploitation and outcome maximiza- drive to dopaminergic neurons, which in turn tion, agents must explore the environment influence synaptic currents of downstream and gather information about action–outcome neurons. The model is able to reproduce contingencies [Sutton and Barto, 1998; Rao, experimental data in two decision-making 2010]. tasks. It also predicts that i) in the absence The neural mechanisms driving the decision of cholinergic input, dopaminergic activity to perform actions with uncertain outcomes would not correlate with uncertainty, and that are still poorly understood. In contrast, the ii) the adaptive advantage brought by the processes by which individuals learn to per- implemented uncertainty-seeking mechanism form successful actions have been extensively is most useful when sources of reward are not studied. Notably, the dopaminergic system is highly uncertain. Moreover, this modeling thought to play a key role in these processes, work allows us to propose novel experiments both in the learning- and in the motivation- which might shed new light on the role of related aspects [Schultz, 2002; Berridge, 2012; acetylcholine in both random and directed ex- Berke, 2018]. Moreover, studies have reported ploration. Overall, this study thus contributes dopaminergic activities that are correlated to a more comprehensive understanding of with the uncertainty of reward [Fiorillo et al., the roles of the cholinergic system and its 2003; Linnet et al., 2012]. involvement in decision-making in particular. Another neuromodulatory system which has been largely implicated in the processing 1Sorbonne Université, CNRS UMR 7222, Institut des Systèmes Intelligents et de Robotique, ISIR, F- of novelty and uncertainty is the cholonergic 75005 Paris, France. system. For instance, Yu and Dayan [2005] 2ETIS Laboratory, UMR 8051, Université Paris Seine, suggested that acetylcholine (ACh) suppresses ENSEA, CNRS, Université de Cergy-Pontoise, F-95000 top-down, expectation-driven information Cergy-Pontoise, France. relative to bottom-up, sensory-induced signals 3Department of Cognitive Sciences,University of Cali- fornia, Irvine, Irvine, CA 92697, USA. in situations of expected uncertainty, i.e. 4Department of ComputerScience, University of Cali- when expectations are known to be unreliable. fornia, Irvine, Irvine, CA 92697, USA Additionally, Hasselmo [1999, 2006] proposed *Corresponding author: MB, [email protected] that the level of ACh in the hippocampus 1 bioRxiv preprint doi: https://doi.org/10.1101/699595; this version posted July 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. determines whether it is encoding new infor- source of motivation thus driving exploratory mation or consolidating old memories. The behaviors. cholinergic system also interacts with the dopaminergic system. In particular, there are cholinergic projections onto neurons in the 2 Background ventral tegmental area (VTA), one of the two major sources of dopamine (DA) in the brain 2.1 Dopamine [Avery and Krichmar, 2017; Scatton et al., Dopamine (DA) is involved in decision-making 1980]. In a recent study, Naudé et al. [2016] through its role in reward processing and mo- provided evidence that these projections might tivation [Schultz, 2002; Berridge, 2012]. The mediate the motivation to select uncertain largest group of dopaminergic neurons is found actions. in the ventral tegmental area (VTA) [Scatton The softmax rule, where the probability et al., 1980]. It projects to the basal ganglia of choosing an action is a function of its (BG), in particular to the striatum, but also to estimated value, is generally thought to be the frontal cortex. The substantia nigra is also a good model of human [Daw et al., 2006] an important source of dopamine in the BG. and animal [Cinotti et al., 2019] decision- There is strong evidence of the role of making. But Naudé et al. [2016] showed that dopamine in the learning of the value of the decisions made by wild-type (WT) mice actions, stimuli and states of the environment. exhibited an uncertainty-seeking bias and In this context, Schultz and colleagues hypoth- followed a softmax function which included esized that the activity of DA neurons encodes an uncertainty bonus. In contrast, mice a reward prediction error [Schultz, 2002]. lacking the nicotinic acetylcholine receptors Indeed, phasic dopaminergic activities show on the dopaminergic neurons in VTA showed strong correlations with an error in the pre- less uncertainty-seeking behaviors and their diction of conditioned stimuli after Pavlovian decisions rather followed the standard softmax learning. Moreover, Berridge and colleagues rule. suggest that DA is essential for “incentive In neural networks, decision-making pro- salience” and “wanting”, i.e. for motivation cesses are generally modeled using competition [Berridge and Kringelbach, 2008; Berridge, mechanisms [Rumelhart and Zipser, 1985; Car- 2012]. For instance, DA deprived rats are penter and Grossberg, 1988]. Such mechanisms unable to generate the motivation arousal can constitute a neural implementation of the necessary for ingestive behavior and can starve softmax rule. In particular, Krichmar [2008] to death although they are able to move and proposed a model where neurotransmitters act eat [Ungerstedt, 1971]. However, dopamine upon different synaptic currents to modulate has also been suggested to signify novelty, the network’s sensitivity to differences in input which may be related to an uncertainty signal values, much like the temperature parameter [Redgrave and Gurney, 2006]. In summary, in the softmax model [Sutton and Barto, 1998]. the dopaminergic system seems to implement In this paper, we propose a new version of this a series of mechanisms that reinforce and favor model using leaky-integrate-and-fire neurons stimuli and actions that have been rewarding and integrating an uncertainty bonus. We use in the past, or that may be of interest in the this model, in comparison with three alterna- future. tive models, to verify a set of hypotheses about how cholinergic projections to dopaminergic VTA neurons in mediate uncertainty-seeking. 2.2 Acetylcholine We then perform additional experiments to Acetylcholine (ACh) originates from various assess the interest of such a mechanism for structures in the brain: the laterodorsal animals foraging in volatile environments. tegmental (LDT) and the pedunculopontine These simulations suggest that ACh effects tegmental (PPT) mesopontine nuclei pro- behavior by translating uncertainty into a jecting to the VTA and other nuclei in the 2 bioRxiv preprint doi: https://doi.org/10.1101/699595; this version posted July 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. brainstem, basal forebrain and basal gan- validity effect (VE). The model proposed by glia [Mena-Segovia, 2016]; the medial septal Yu and Dayan [2005] reproduced the results nucleus mainly targeting the hippocampus; obtained by Phillips et al. [2000] which showed and the nucleus basalis in the basal forebrain in rats experiments that VE varies inversely mainly acting on the neocortex [Baxter and with the level of ACh which was manipulated Chiba, 1999]. In addition, striatal interneurons pharmacologically. Additionally, ACh has provide an internal source of ACh in the BG. been hypothesized to set the threshold for noa- ACh has been largely implicated in the pro- drenergic signaling of unexpected uncertainty cessing of novelty and uncertainty. Significant [Yu and Dayan, 2005] which calls for more research highlighted this role in the septo- exploration by counterbalancing DA-driven hippocampal cholinergic system for instance. exploitation [Cohen et al., 2007]. In this case, novelty detection increases the level of septal ACh: novel patterns elicit little recall which reduces hippocampal inhibition 3 Methods of the septum and allows ACh neurons to discharge [Meeter et al., 2004]. In addition, 3.1 Bandit task Hasselmo