Reward Signalling in Brainstem Nuclei Under Glycemic Flux
Total Page:16
File Type:pdf, Size:1020Kb
bioRxiv preprint doi: https://doi.org/10.1101/243006; this version posted January 4, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 1 Reward signalling in brainstem nuclei under glycemic 2 flux 3 Tobias Morville1*, Kristoffer Madsen3, Hartwig R. Siebner1,2, Oliver J. Hulme 4 1Danish Research Centre for Magnetic Resonance, Centre for Functional and Diagnostic Imaging and Research, 5 Copenhagen University Hospital Hvidovre, Kettegard Allé 30, Hvidovre, 2650, Denmark 2DTU Compute, 6 Department of Informatics and Mathematical Modeling, Technical University of Denmark, Copenhagen, Denmark 7 3Department of Neurology, Copenhagen University Hospital Bispebjerg, Copenhagen, 2400, Denmark..* 8 Corresponding author: [email protected] 9 10 Phasic dopamine release from mid-brain dopaminergic neurons signals errors of reward prediction 11 (RPE). If reward maximisation is to maintain homeostasis, then the value of primary rewards should 12 be coupled to the homeostatic errors they remediate. This leads to the prediction that RPE signals 13 should be configured as a function of homeostatic state and thus, diminish with the attenuation of 14 homeostatic error. To test this hypothesis, we collected a large volume of functional MRI data from 15 five human volunteers on four separate days. After fasting for 12 hours, subjects consumed preloads 16 that differed in glucose concentration. Participants then underwent a Pavlovian cue-conditioning 17 paradigm in which the colour of a fixation-cross was stochastically associated with the delivery of 18 water or glucose via a gustometer. This design afforded computation of RPE separately for better- and 19 worse-than expected outcomes during ascending and descending trajectories of physiological serum 20 glucose fluctuations. In the parabrachial nuclei, variations in regional activity coding positive RPEs 21 scaled positively with serum glucose for ascending and descending glucose levels. The ventral 22 tegmental area and substantia nigra became more sensitive to negative RPEs when glucose levels 23 were ascending. Together, the results show that RPE signals in key brainstem structures are 24 modulated by homeostatic trajectories of naturally occurring glycemic flux, revealing a tight interplay 25 between homeostatic state and the neural encoding of primary reward in the human brain. 26 27 Keywords. Homeostasis, Reward prediction errors, Dopamine, Midbrain, Parabrachial nucleus. 1 bioRxiv preprint doi: https://doi.org/10.1101/243006; this version posted January 4, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 28 29 30 Introduction 31 A basic assumption of many models of adaptive behavior, is that the value of primary rewards are 32 modulated by their capacity to rectify future homeostatic deficits (Pompilio et al. 2006; Cabanac 1971). 33 Compatible with this notion, deprivation-induced hypoglycaemia increases willingness to work for food 34 in rats (Sclafani et al. 1970), in humans (Pelchat 2009), as well as the subjectively reported pleasure 35 (Cabanac 1971). Catecholamine dopamine is a neurotransmitter that plays a key role in signalling 36 reward (Haber & Knutson 2010) and is involved in behavioural reinforcement, learning and motivation 37 (Berridge 2006; Schultz et al. 1997). Via meso-cortical and mesolimbic dopaminergic projections, 38 synaptic dopamine release modulates the plasticity of cortico-striatal networks and hereby sculpts 39 behavioural policies according to their reward contingencies (Haber & Knutson 2010; Schultz 2015). 40 Patterns of phasic dopaminergic firing have been demonstrated to follow closely the principles of 41 reinforcement learning, encoding the errors in the prediction of reward (O'Doherty et al. 2004; Schultz 42 et al. 1997; Rangel et al. 2008; Tobler et al. 2005). Reward prediction error (RPE) signals are 43 commensurate with the economic construct of marginal utility, defined as the additional utility obtained 44 through additional units of consumption, where utility is a subjective value inferred from choice 45 (Stauffer et al. 2014; Schultz 2005; Schultz 2015). 46 Although animals are motivated by a homeostatic deficit of thirst or hunger, homeostatic states are 47 rarely considered as relevant modulators of dopaminergic signalling of reward prediction errors. In 48 typical paradigms involving cumulative consumption, the homeostatic deficit gradually diminishes as the 49 animal plays for consumption of water or sugar-containing juice. Eventually, the animal rejects further 50 play, presumably because the marginal utility of consumption diminished to a point of indifference. 51 Interestingly, a recent electrophysiology study in rats, demonstrated that oral consumption of sodium 52 solution causes phasic dopaminergic signals in the nucleus accumbens, that are modulated by sodium 53 depletion (J. J. Cone et al. 2016). 2 bioRxiv preprint doi: https://doi.org/10.1101/243006; this version posted January 4, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 54 There is now growing evidence for a multifaceted interface between dopamine mediated reward- 55 signalling and the systems underpinning energy homeostasis. Firstly, dopamine neurons in the ventral 56 tegmental area (VTA) express a suite of receptors targeted by energy-reporting hormones ghrelin, 57 insulin, amylin, leptin and Glucagon Like Peptide 1 (GLP-1, Ferrario et al. 2016; Palmiter 2007). This 58 provides numerous degrees of freedom for flexibly interfacing between homeostatic state and reward 59 signalling. Although hormonal modulations of phasic dopamine are yet to be fully scrutinised, there is 60 emerging evidence that circulating factors do indeed modulate its magnitude. For instance, amylin, a 61 hormone co-released with insulin, acts on the VTA to reduce phasic dopamine release in its mesolimbic 62 projection sites (Mietlicki-Baase et al. 2015). In terms of neuronal input, there are many such 63 opportunities for the appetitive control of dopamine mediated signalling. 64 Appetitive control can be delineated into three interacting valuation systems (Sternson & Eiselt 2017). 65 The first system generates a negative valence signal which involves activity of the Agouti-related peptide 66 (AgRP) neurons of the arcuate nucleus of the hypothalamus (ARC). Activity of ARCAgRP neurons reports 67 on energy deficits, inhibits energy expenditure, and regulates glucose metabolism (e.g. Aponte et al. 68 2011; Dietrich et al. 2015; Luquet et al. 2005; Cansell et al. 2012). ARC neurons that contain peptide 69 products of pro-opiomelanocortin (POMC) form an opponent code compared with ARCAgRP neurons. The 70 balance between the two neuronal ARC sub-populations putatively encodes the value of near-term 71 energetic states, becoming rapidly modulated just prior to food consumption (Mandelblat-Cerf et al. 72 2015). The second system codes positive valence signals and consists of circuits involving the lateral 73 hypothalamus (LH). It is linked to positively reinforcing consummatory behaviours via its dopaminergic 74 projections, assumed to trigger positive feedback to keep consumption going during feeding bouts. The 75 third valuation system involves calcitonin gene-related protein (CGRP)-expressing neurons in the (PBN) 76 that potently suppress eating when activated, but do not increase food intake when inhibited. PBNCGRP 77 neurons are activated by signals associated with food intake, and they provide a signal of satiety that 78 has negative valence when strongly activated (Campos et al. 2016). The PBN has been characterised as a 79 hedonic hotspot, the modulation of which by either GABA or Benzodiazepines potently modulates 3 bioRxiv preprint doi: https://doi.org/10.1101/243006; this version posted January 4, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 80 experienced reward (Söderpalm & Berridge 2000); ARCAgRP neurons GABA-ergically inhibit PBN neurons, 81 thus stimuli predicting glucose consumption should inhibit ARCAgRP, releasing the PBN from inhibition 82 (Qunli Wu et al. 2014). Further, hormones related to hunger and feeding (GLP-1 & leptin) modulate PBN 83 activity and subsequent behaviour (e.g. Alhadeff, Baird, et al. 2014; Alhadeff, Hayes, et al. 2015). Of 84 note, these three valuation systems all project to and modulate the dopaminergic neurons in the ventral 85 tegmental area(VTADA). The interface between these hypothalamic-brainstem networks and the VTADA, 86 is arguably the most important interface for mediating the dialogue between energy homeostasis and 87 value computation. 88 While most evidence for encoding of RPEs is obtained under homeostatic deprivation, the modulation of 89 RPE signalling triggered by physiological fluctuations in glucose availability (glycemic flux) remains yet to 90 be characterised in