PRECLINICAL MODELING OF HABITUAL NICOTINE SEEKING

by

Andrew Loughlin

A thesis submitted in conformity with the requirements for the degree of Master of Science

Graduate Department of Pharmacology

University of Toronto

© Copyright by Andrew Loughlin (2015)

Preclinical Modeling of Habitual Nicotine Seeking

Andrew Loughlin

Master of Science

Graduate Department of Pharmacology

University of Toronto

2015

Abstract

Current theories of substance dependence suggest that contributions to compulsive drug use may include the aberrant recruitment of cognitive processes normally responsible for automatizing repeated behaviors, or “forming habits”. While evidence has shown that both ethanol and can abnormally accelerate the formation of drug-seeking habits, whether this could be the case for nicotine has received little attention. In the current work, subjects acquired self-administration of both intravenous nicotine and orally consumed saccharin in two separate daily operant sessions, receiving 10 sessions of FR1 training for each reinforcer. Behavioral assays of “aversion-pairing” and “contingency degradation” were respectively utilized in Experiments 1 and 2 to determine if responding for each reward was goal-directed or habitual. Both tests were consistent in suggesting the training parameters employed produced goal-directed saccharin responding, but habitual nicotine responding. Particularly noteworthy is the exceptionally rapid rate at which nicotine-seeking habits appear to have developed.

Acknowledgements

I am the product of all of my relations, thus this work is not my own. Only through the generous support of all of the following people was this thesis made possible, and as such, I would like to offer my sincere gratitude to:

Dr. AD Lê, for the wonderful mentorship which has expanded my thought, developed my character, and finally revealed to me the importance of discipline;

Kathy Coen, for sharing with me her stunning brilliance in translating thought to action;

Dr. Doug Funk, for frequent uplifts of spirit and assistance in thesis writing;

Sahar Tamadon and Zhaoxia Li, for the support and company through long days of work;

Dr. Paul Fletcher and my graduate education committee, for all the time and effort expended for the benefit of students like myself;

James Ennis and Casey Suchit, for the helping hands and kind words;

Jack Bilodeau, for reminding me to rock and roll;

Abhiram Pushparaj, for being a friend both witty and wise;

Jillian Burston, for the constant camaraderie I couldn’t have done without;

Everyone else from CAMH and the University of Toronto I’ve had the privilege of knowing, for making this period of my life possible;

Every scientist before me, for the shoulders upon which I hope to stand;

And of course, my mother and father, for giving me life and more.

Finally, I would like to express my profound appreciation for every subject who has contributed to this research. To you, I owe a debt which cannot be repaid.

Table of Contents

Abstract ...... ii

Acknowledgements ...... iii

Table of Contents ...... iv

List of Figures ...... ix

List of Tables ...... x

CHAPTER 1: Introduction ...... 1

1.1 Tobacco Use and Nicotine Addiction ...... 1

1.1.1 Statement of Problem ...... 1

1.1.2 Cigarettes and Nicotine ...... 1

1.1.3 Nicotine Pharmacokinetics ...... 2

1.1.4 Nicotine Pharmacodynamics ...... 4

1.2 Animal Models of Drug Reward and Reinforcement ...... 6

1.2.1 Animal Models of Drug Reward ...... 6

1.2.2 Animal Models of Drug Reinforcement ...... 9

1.3 Mechanisms of Nicotine Reward ...... 12

1.3.1 The Mesolimbic Dopamine System ...... 12

1.3.2 Nicotine and the Mesolimbic Reward System ...... 14

1.4 Theoretical Frameworks of Substance Dependence ...... 15

1.4.1 Motivational Accounts of Substance Dependence ...... 15

1.4.2 Compulsive Accounts of Substance Dependence ...... 17

1.5 Habit ...... 20

1.5.1 Two-System Control of Instrumental Behavior ...... 20

1.5.2 Goal-Directed Behaviors ...... 23

1.5.3 Determinants of Habit Formation ...... 25

1.6 Distinguishing Goal-Directed and Habitual Behavior ...... 29

1.6.1 Outcomes and Contingencies ...... 29

1.6.2 Manipulations of Outcome Value ...... 29

1.6.3 Manipulations of Instrumental Contingency ...... 34

1.7 The Effects of Drugs on Habits in Animal Models ...... 37

1.7.1 Specific Acceleration of Habit Formation ...... 37

1.7.2 Generalized Acceleration of Habit Formation ...... 38

1.7.3 Drugs and the Neural Substrates of Habit ...... 39

1.8 Habits, Addiction, and Nicotine ...... 42

1.8.1 The Hijack of Habit ...... 42

1.8.2 Habits in Compulsive Drug Use ...... 44

1.8.3 A Role for Habit in Nicotine Dependence? ...... 45

CHAPTER 2: Purpose of Investigation ...... 46

CHAPTER 3: General Materials and Methods ...... 48

3.1 Subjects ...... 48

3.1.1 Animals & Housing ...... 48

3.2 Intravenous Jugular Vein Catheterization ...... 49

3.2.1 Catheter Construction ...... 49

3.2.2 Surgical Procedures ...... 50

3.3 Apparatus ...... 55

3.3.1 General Procedures ...... 55

3.3.2 Orally Consumed Saccharin Self-Administration ...... 58

3.3.3 Intravenous Nicotine Self-Administration ...... 59

3.3.4 Statistical Analysis ...... 61

CHAPTER 4: Experiment 1 - LiCl Devaluation of Intravenous Nicotine ...... 62

4.1 Experiment 1: Introduction ...... 62

4.1.1 Aversion-Pairing to Nicotine ...... 62

4.1.2 Experimental Objectives and Hypotheses ...... 64

4.2 Experiment 1: Materials and Methods ...... 65

4.2.1 General Procedures ...... 65

4.2.2 Devaluation: LiCl-Reinforcer Pairing ...... 67

4.2.3 Experiment 1A: Pre-Acquisition Devaluation ...... 70

4.2.4 Experiment 1B: Post-Acquisition Devaluation ...... 71

4.3 Experiment 1: Results ...... 73

4.3.1 Experiment 1A: Pre-Acquisition Devaluation ...... 73

4.3.2 Experiment 1B: Post-Acquisition Devaluation Results ...... 78

4.4 Experiment 1: Discussion ...... 96

4.4.1 LiCl and Saccharin ...... 96

4.4.2 LiCl and Nicotine ...... 98

4.4.3 Conclusions ...... 100

CHAPTER 5: Experiment 2 - Contingency Degradation ...... 101

5.1 Experiment 2: Introduction ...... 101

5.1.1 Assessing Sensitivity to Instrumental Contingency ...... 101

5.2 Experiment 2: Materials and Methods ...... 104

5.2.1 General Procedures ...... 104

5.2.2 Contingency Degradation Procedure ...... 106

5.3 Experiment 2: Results ...... 109

5.3.1 Data Analysis for Experiment 2 ...... 109

5.3.2 Experiment 2A: Nicotine-Saccharin Group Results ...... 110

5.3.3 Experiment 2B: Nicotine-Alone Group Results ...... 125

5.4 Experiment 2: Discussion ...... 132

5.4.1 Data Summary ...... 132

5.4.2 Degradation and Extinction ...... 132

5.4.3 Conclusions ...... 136

CHAPTER 6: General Discussion ...... 137

6.1 Experimental Conclusions and Alternative Interpretations ...... 137

6.1.1 Overview ...... 137

6.1.2 Optimal Conditions for Goal-Directed Behavior ...... 139

6.1.3 LiCl Devaluation ...... 140

6.1.4 Contingency Degradation ...... 147

6.1.5 Other Potential Issues ...... 150

6.1.6 The Rapid Formation of Nicotine-Seeking Habits ...... 151

6.2 Possible Mechanistic Interpretations ...... 152

6.2.1 Phasic VTA Dopamine Signals and the DLS ...... 152

6.2.2 Influences on Habit Formation ...... 154

6.2.3 Habits and the Insular Cortex ...... 156

6.3 Relevance to Human Tobacco Use ...... 157

6.3.1 Smoking as a Maladaptive Incentive Habit ...... 157

6.3.2 Implications for Treatment ...... 160

CHAPTER 7: Future Directions ...... 161

7.1 Validation of Outcome-Insensitive Nicotine Seeking ...... 161

7.1.1 Nicotine and Outcome Value ...... 161

7.1.2 Nicotine and Contingency Learning ...... 164

7.2 Acceleration of Habit Formation by Nicotine ...... 166

7.2.1 Generalized Habit Formation and Nicotine ...... 166

7.2.2 Unexplored Interactions ...... 167

REFERENCES ...... 168

List of Figures

Figure 1.1 Instrumental Contingency During Overtraining 26

Figure 1.2 Instrumental Contingency in Ratio vs. Interval Schedules 28

Figure 4.1 Illustration of Key Events in Experiment 1 66

Figure 4.2 Group PRE: Saccharin Acquisition 75

Figure 4.3 Group PRE: Nicotine Acquisition 77

Figure 4.4 Group POST: Saccharin Acquisition 81

Figure 4.5 Group POST: Saccharin Extinction 84

Figure 4.6 Group POST: Saccharin Reacquisition – Reinforcements 86

Figure 4.7 Group POST: Saccharin Reacquisition – Lever Responses 87

Figure 4.8 Group POST: Nicotine Acquisition 89

Figure 4.9 Group POST: Nicotine Extinction 92

Figure 4.10 Group POST: Nicotine Reacquisition – Reinforcements 94

Figure 4.11 Group POST: Nicotine Reacquisition – Lever Responses 95

Figure 5.1 Illustration of Key Events in Experiment 2 105

Figure 5.2 NicSac Group: Saccharin Acquisition 112

Figure 5.3 NicSac Group: Saccharin Degradation 115

Figure 5.4 NicSac Group: Saccharin Extinction 117

Figure 5.5 NicSac Group: Nicotine Acquisition 119

Figure 5.6 NicSac Group: Nicotine Degradation 122

Figure 5.7 NicSac Group: Nicotine Extinction 124

Figure 5.8 NicAlone Group: Acquisition 126

Figure 5.9 NicAlone Group: Degradation 129

Figure 5.10 NicAlone Group: Extinction 130

List of Tables

Table 1.1 Interpretation of Post-Devaluation Testing 32

Table 1.2 Interpretation of Natural Reward Degradation 36

Table 4.2 Group POST: Saccharin Baseline Data 82

Table 4.3 Group POST: Nicotine Baseline Data 90

Table 5.1 Interpretation of Intravenous Reward Degradation 109

Table 5.2 NicSac Group: Saccharin Baseline Data 113

Table 5.3 NicSac Group: Nicotine Baseline Data 120

Table 5.4 NicAlone Group: Baseline Data 127

CHAPTER 1: Introduction

“Habit is either the best of servants, or the worst of masters.” Nathaniel Emmons

1.1 Tobacco Use and Nicotine Addiction

1.1.1 Statement of Problem

That tobacco smoking is detrimental to public health is by now a topic thoroughly investigated and publicized. One-half of smokers may die early from a preventable tobacco-related illness (Patel et al., 2008), and annual tobacco-attributable death tolls are estimated to be as high as 5 million people worldwide, of which 600 thousand are non-smokers affected by second-hand smoke (Ezzati & Lopez, 2003; Mathers & Loncar, 2006; Öberg et al., 2011). Despite widespread knowledge of the risks, upwards of 1 billion people around the world are daily smokers (Shafey et al., 2003). It is estimated that this includes 16% of Canadian adults (Reid et al., 2014) and 18% of adults in the U.S. (Centers for Disease Control and Prevention, 2014). Nicotine, the principal addiction-driving ingredient in cigarettes (Polosa & Benowitz, 2011; Stolerman, 1991), is therefore one of the world’s most popular recreational substances; more so than even and perhaps second only to (Fagerström, 2005).

1.1.2 Cigarettes and Nicotine

Nicotine seems to be remarkably effective at inducing persistent use; although studies suggest up to 50% of current smokers could meet DSM-IV criteria for nicotine dependence (Hughes et al., 2006), this is not necessarily a prerequisite for heavy and long-term daily smoking (Donny & Dierker, 2007). While low-nicotine and de-nicotinized cigarettes are relatively unpopular, several alternative means of high-dose nicotine delivery (such as chewing tobacco and electronic cigarettes) are in contrast widely consumed (Benowitz, 2008; Loukas et al., 2015). That being said, tobacco cigarettes overwhelmingly remain the most common form of nicotine delivery; only 20% of U.S. smokers report use of alternative tobacco products, and the majority of those users (>60%) do not use such products more than once per week (Kasza et al., 2014).

Nicotine users can escape the mortality risks associated with cigarette use through successful and abstinence, with earlier cessation providing better outcomes (Peto et al., 2000). However, nicotine has one of the poorest quit rates for drugs of abuse (O’Brien & McLellan, 1996). While more than 80% of smokers will at some point make an attempt to quit, approximately 60% of those will relapse within the first week, 80% within the first month, and only around 3% will remain abstinent after 6 months, if unaided (Hughes et al., 2004; Benowitz, 2008). Although treatment can improve this final statistic to 6.75% with nicotine-replacement therapy (Moore et al., 2009) or perhaps up to a maximum of 29.7% with the pharmacotherapeutic (Jorenby et al., 2006), these quit rates still remain relatively low. Ongoing investigation into the causes, mechanisms, and potential treatments for nicotine dependence therefore remains essential (Cobb et al., 2015; Goodwin et al., 2014).

1.1.3 Nicotine Pharmacokinetics

1.1.3.1 Absorption

Nicotine is a naturally occurring tertiary amine comprising approximately 1.5% of the weight of commercial cigarette tobacco (Hukkanen et al., 2005), and 25% of this content may survive combustion to be present in tobacco smoke (Benowitz, 2008). Roughly 80-90% of this remaining nicotine found in tobacco smoke may in turn be absorbable through inhalation (Armitage et al., 1975). While upwards of 99% of this nicotine initially exists as the pharmacologically active levorotatory (S)-isomer (Hukkanen et al., 2005), racemization during combustion may convert up to 10% of the nicotine found in tobacco smoke to the (R)-isomer, which is relatively pharmacologically inactive due to stereoselectivity at nicotinic receptors (Benowitz, 1996; Matta et al., 2007).
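As a rough illustration of how the percentages above compound, the following Python sketch multiplies them through for a single cigarette. The tobacco mass per cigarette (0.7 g) is a hypothetical value chosen only for illustration and is not taken from the sources cited; the function name and the midpoint of 85% for the absorbable fraction are likewise assumptions.

```python
def absorbable_nicotine_mg(tobacco_mass_g,
                           nicotine_fraction=0.015,   # ~1.5% of tobacco weight
                           combustion_survival=0.25,  # ~25% survives combustion
                           absorbed_fraction=0.85):   # assumed midpoint of 80-90%
    """Estimate the nicotine (mg) absorbable from the smoke of one cigarette."""
    nicotine_content_mg = tobacco_mass_g * 1000.0 * nicotine_fraction
    nicotine_in_smoke_mg = nicotine_content_mg * combustion_survival
    return nicotine_in_smoke_mg * absorbed_fraction


# Hypothetical cigarette containing 0.7 g of tobacco (assumption).
estimate_mg = absorbable_nicotine_mg(0.7)
print(f"Estimated absorbable nicotine: {estimate_mg:.2f} mg")
```

Under these assumptions the cigarette would contain about 10.5 mg of nicotine, of which a little over 2 mg would be absorbable; the estimate is only as good as the assumed tobacco mass.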

A typical cigarette smoker may systemically absorb around 20mg of nicotine per day, and has an average plasma nicotine concentration ranging from 4-5ng/mL after waking to peak levels of 10-50ng/mL in the afternoon (Benowitz & Jacob, 1984). A single cigarette typically contains 6-20mg of nicotine, of which 1-3mg is likely bioavailable to elevate blood nicotine concentration by approximately 10-11ng/mL per cigarette (Henningfield, 1995; Patterson et al., 2003). Each cigarette puff produces a spike in arterial nicotine capable of reaching the brain in approximately 10-20 seconds (Henningfield & Keenan, 1993). As a cigarette is finished within the next 5 minutes, peak arterial nicotine concentrations of up to 100ng/mL may be achieved; this concentration can be 10 times greater than that of venous nicotine, and roughly 3-4 times the average plasma nicotine concentration of smokers (Henningfield & Keenan, 1993; Henningfield, 1995; Hukkanen et al., 2005; Matta et al., 2007).

This rapid increase in brain and blood nicotine described may not occur if nicotine is administered through other routes. Arterial concentrations can take up to 30 minutes to reach peak levels in the case of sublingual nicotine (e.g. chewing tobacco or gum), or even up to 5 hours for transdermal nicotine (Henningfield & Keenan, 1993; Hukkanen et al., 2005). The inhalational route therefore uniquely allows smokers to tailor their intake through puff-by-puff adjustments, potentially contributing to the prevalence of cigarettes as a nicotine delivery vehicle (Benowitz, 1986).

1.1.3.2 Metabolism

Following the termination of smoking, blood nicotine levels will decline at an initially rapid rate; as nicotine is distributed throughout body tissues over the next 20 minutes, the rate of this decline slows greatly in turn (Hukkanen et al., 2005). In humans, nicotine has an average elimination half-life of 2 hours (45 minutes in rats), and is metabolized extensively by the liver into 6 primary metabolites and several minor metabolites (Matta et al., 2007). Metabolism occurs primarily through the hepatic enzyme CYP2A6, which converts 70-80% of plasma nicotine to the primary metabolite cotinine before further metabolizing cotinine to 3’-hydroxycotinine (Mwenifumbo & Tyndale, 2009; Nakajima et al., 1996). While allelic variants of CYP2A6 have been implicated in variable rates of nicotine metabolism, wide variability still occurs in the rates of nicotine metabolism of individuals without polymorphisms identified as relevant (Raunio et al., 2001; Swan et al., 2005). Minor routes of nicotine metabolism include N-oxidation mediated by flavoprotein FMO3 (Cashman et al., 1992), as well as glucuronidation mediated by UGT1A9, 1A4, and 2B7 (Yamanaka et al., 2005).
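To make the difference between the human and rat half-lives concrete, the sketch below computes the fraction of nicotine remaining over time. It assumes simple first-order (single-compartment) elimination, which is a simplification: it ignores the initial rapid distribution phase described above, and the function name is illustrative only.

```python
def fraction_remaining(elapsed_min, half_life_min):
    """Fraction of nicotine remaining, assuming first-order elimination."""
    if half_life_min <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (elapsed_min / half_life_min)


# Average elimination half-lives given in the text.
HUMAN_HALF_LIFE_MIN = 120.0
RAT_HALF_LIFE_MIN = 45.0

for species, half_life in (("human", HUMAN_HALF_LIFE_MIN), ("rat", RAT_HALF_LIFE_MIN)):
    remaining = fraction_remaining(120.0, half_life)
    print(f"{species}: {remaining:.1%} remaining after 2 hours")
```

Under this simplified model, a rat would clear most of a given nicotine load in the time a human clears half of it, which may be worth bearing in mind when comparing session lengths and dosing intervals across species.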

1.1.4 Nicotine Pharmacodynamics

1.1.4.1 Nicotinic Receptors

The pharmacological effects of nicotine occur through its actions on the nicotinic acetylcholine receptor (nAChR), upon which exogenous nicotine functions as an agonist (Dani & De Biasi, 2001). In brain tissue, direct nicotine binding has been shown to activate nAChRs both in vivo (Marks et al., 1985) and in cell cultures (Peng et al., 1994). The nAChR is the founding member of the cys-loop superfamily of pentameric ligand-gated ion channels, and upon activation is non-selectively permeable to cations (i.e. circumstantially permeable to Na+, K+, and Ca2+) (Chen, 2010; Lindstrom, 1996). In vertebrates, each nAChR consists of a central ion channel, around which 5 homologous (or identical) subunits are symmetrically arranged (Cooper et al., 1991; Karlin, 2002).

A diversity of subunit variants exist, with specialized localizations to discrete tissues (Karlin, 2002). The numerous pentameric combinations of these subunits yield distinct nAChR subtypes, with variable pharmacodynamic and pharmacokinetic properties (Taly et al., 2009). A major distinction is often drawn between those nAChRs expressed within the mammalian central nervous system (CNS) and those expressed peripherally (Matta et al., 2007). Peripheral nAChRs occur especially at neuromuscular junctions, and are composed from five subunit types (α1, β1, γ, δ, and ε). In contrast, CNS nAChRs can be composed out of twelve subunits found in two subfamilies (α2-α10, and β2-β4), encoded across 17 genes (Karlin, 2002). While the expression of nAChR subtypes in brain tissue is highly regionally specific, the α4β2 and α7 subtypes both stand out as being generally prevalent (Davis & de Fiebre, 2006; Taly et al., 2009). One key piece of evidence to suggest that nicotine may act on CNS nAChRs to produce its behavioral and reinforcing effects is that blockade of these receptors by (a non-selective nAChR antagonist) can prevent rats from discriminating nicotine from saline, but (selective for peripheral nAChRs) does not (Stolerman et al., 1984). However, which precise nAChR subtypes mediate the various properties of nicotine is not yet completely clear.

Generally, binding of appropriate ligands triggers conformational changes in the receptor that briefly allow cation transport before receptors return to either an agonist-sensitive “resting” conformation or an agonist-insensitive “desensitized” conformation (Dani & De Biasi, 2001; Pitchford et al., 1992). Not only can various nAChR subtypes exhibit distinct kinetics in these conformational changes, but subtypes can furthermore possess variable binding affinities for nicotine (Gotti & Clementi, 2004; Wu & Schulz, 2012). Different nAChR subtypes can therefore exhibit many distinct functional properties, including the consequences of receptor activation; as examples, the “low-affinity” α7 receptor is fast to activate and has high Ca2+ permeability, while the “high-affinity” α4β2 receptor is characterized by slow desensitization (Wu & Schulz, 2012).

These pharmacokinetic and pharmacodynamic properties of nicotine, however, reveal little about why or how nicotine acts as a drug of abuse. In order to better address such questions, consideration of nicotine’s consequences upon behavior is necessary.

1.2 Animal Models of Drug Reward and Reinforcement

1.2.1 Animal Models of Drug Reward

1.2.1.1 Conditioned Place Preference

Laboratory studies typically examine nicotine in the absence of smoke and other extraneous factors (Dani & De Biasi, 2001). In order to study just how nicotine can elicit behaviors associated with substance dependence, animal models are commonly employed; in the case of nicotine, the laboratory rat stands out as a prevalent and reliable model organism (Caille et al., 2012).

In animal models of drug “reward”, the term has historically been operationalized as “approach behavior”, while the opposite term “aversion” has been operationalized as “withdrawal behavior” (White, 1989). One such animal model of drug reward is the “place conditioning” (PC) procedure, in which animals receive pairings of experimenter-delivered drugs (or vehicle) with an environment made distinct through contextual cues. Rewarding or aversive properties of the drug are thought to be associated with the specific contextual cues of the paired environment through classical Pavlovian conditioning, which can be tested later when animals are drug-free (Brabant et al., 2005; Mucha et al., 1982). When both the paired and an unpaired environment are made freely accessible, the proportion of time animals spend in the drug-paired environment serves as one typical measure of “conditioned place preference” (CPP) or, inversely, of “conditioned place avoidance” (CPA) (Bardo & Bevins, 2000; Tzschentke, 1998). Both of these effects are observable in PC paradigms, depending on drug and dose (O’Dell & Khroyan, 2009).
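The time-proportion measure described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reconstruction of any cited study's analysis; the function name and the session times are hypothetical.

```python
def preference_score(paired_s, unpaired_s):
    """Proportion of test time spent in the drug-paired environment."""
    total = paired_s + unpaired_s
    if total <= 0:
        raise ValueError("total test time must be positive")
    return paired_s / total


# Hypothetical drug-free test session: seconds spent in each environment.
score = preference_score(paired_s=540, unpaired_s=360)
print(f"Preference score: {score:.2f}")
```

A score above 0.5 means more time was spent in the drug-paired environment (in the direction of CPP), and a score below 0.5 means less (in the direction of CPA). In practice such scores would be compared statistically against vehicle controls or pre-conditioning baselines rather than read off a single animal.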

While opiates and psychostimulants can induce CPP over wide dose ranges and experimental parameters, the induction of CPP by drugs such as ethanol, cannabinoids, or nicotine is more complex (Tzschentke, 2007). Nicotine-induced CPP tends to yield small effect sizes, and both CPA and CPP can be induced depending on the nicotine dose (Laviolette & Van der Kooy, 2003a; Le Foll & Goldberg, 2009). In rats, nicotine induces CPP at a relatively narrow dose range of 0.2-0.6mg/kg. The higher dose range of 0.8-1.2mg/kg typically induces CPA, although conflicting reports of CPP at this range also exist (Le Foll & Goldberg, 2005; Grabus et al., 2006; O’Dell & Khroyan, 2009).

Other factors which can influence the expression of nicotine CPP and complicate interpretation include prior handling by experimenters, or any innate preference subjects may have for one environmental context or another (Cunningham et al., 2003; Grabus et al., 2006). While experimental controls for the second issue can be an “unbiased” counterbalancing of subjects between contexts, some “biased” methodologies actually incorporate initial place preferences in experimental design and group assignment (Roma & Riley, 2005). The latter procedural variant has typified PC procedures in mouse studies, while the former has typified PC reports in rats (O’Dell & Khroyan, 2009).

PC procedures are useful for investigation and manipulations of genetic and mechanistic aspects of nicotine reward. For example, mutant mice with artificially overexpressed α4 subunits demonstrated a 50-fold increase in sensitivity to nicotine reward in a PC paradigm, suggesting an important role for α4 in mediating nicotine reward (Tapper et al., 2004). Similarly, selective antagonism and genetic knock-out have shown that β2 (but not α7) subunits are necessary for inducing CPP with nicotine (Laviolette & Van der Kooy, 2003b; Walters et al., 2006). But as with any other paradigm, the utility of the PC paradigm is subject to limitations. Since PC paradigms do not involve any voluntary consumption of drugs by subjects, CPP is restricted to being an index of drug reward (or aversion) rather than reinforcement (Bardo & Bevins, 2000). To model the behavioral problems of substance use in humans, the PC procedure is as such limited in scope (Sanchis-Segura & Spanagel, 2006).

1.2.1.2 Intracranial Self-Stimulation

In another model of drug reward, the intracranial self-stimulation (ICSS) paradigm allows subjects to respond for rewarding stimulations of certain brain regions via implanted electrodes. Rats (Olds & Milner, 1954), mice (Stoker & Markou, 2011), nonhuman primates (Rolls et al., 1980) and even humans (Heath, 1963) will universally and readily self-stimulate. This operant response behavior, though, is not the index of interest in ICSS paradigms; rather, most critical is the magnitude of stimulation which initiates responding (termed the “threshold”), beneath which the stimulation is too weak for responses to be worthwhile (Carlezon & Chartoff, 2007). Under controlled conditions, ICSS thresholds are typically stable, and represent normal levels of signaling within the neural substrates of reward (Vlachou & Markou, 2011). Drug-induced changes in threshold are thought to occur from modulations of this basal activity, indicative of various aspects of drug reward or aversion (Negus & Miller, 2014; Stoker & Markou, 2011).

Both reductions and elevations in ICSS threshold have been demonstrated with nicotine. With respect to the former, almost all drugs of abuse (when administered acutely) decrease brain stimulation reward thresholds, corresponding to an elevated level of basal reward signaling which thereby decreases the stimulation necessary for achieving a hedonic state (Koob & Volkow, 2010; Vlachou & Markou, 2011). Consistent with this, voluntarily administered nicotine has been shown to lower ICSS thresholds in rats, suggesting that acute nicotine can act on reward circuitry (Kenny & Markou, 2006). In contrast, increases in ICSS thresholds to elevated levels are thought to reflect depressions of normal hedonic signaling which necessitate stronger stimulation to be overcome; such negative affective states are commonly observed in ICSS procedures during drug withdrawal (Koob & Le Moal, 1997; Vlachou & Markou, 2011). Indeed, such increases in ICSS threshold have been observed following discontinuation of chronic nicotine exposure in both mice (Johnson et al., 2008; Stoker et al., 2008) and rats (Epping-Jordan et al., 1998; Semenova & Markou, 2003), perhaps analogous to the aversive withdrawal syndrome evoked by smoking cessation in human smokers (Epping-Jordan et al., 1998; Hughes et al., 1994; Shiffman & Jarvik, 1976).
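One common way to express such shifts is as a percent change from a stable baseline threshold. The sketch below assumes that convention; the function name, the units, and the example values are hypothetical and are not drawn from the studies cited.

```python
def threshold_change_percent(baseline, observed):
    """Percent change in ICSS threshold relative to a stable baseline."""
    if baseline <= 0:
        raise ValueError("baseline threshold must be positive")
    return 100.0 * (observed - baseline) / baseline


# Hypothetical thresholds in arbitrary stimulation units.
acute_drug = threshold_change_percent(baseline=100.0, observed=80.0)
withdrawal = threshold_change_percent(baseline=100.0, observed=125.0)
print(f"Acute drug: {acute_drug:+.0f}%")   # negative: threshold lowered
print(f"Withdrawal: {withdrawal:+.0f}%")   # positive: threshold elevated
```

Following the interpretation in the text, a negative change would be read as enhanced reward signaling and a positive change as a depressed hedonic state.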

Despite the above, ICSS procedures are generally considered to have weak face validity (for review of types of validity, see: Spear, 2000), as they do not “overtly mimic drug taking in human tobacco users” (O’Dell & Khroyan, 2009). Furthermore, as a model of drug reward (as with CPP), certain aspects of nicotine reinforcement (e.g. how tobacco cues elicit cigarette craving) remain difficult to model in ICSS paradigms alone (O’Dell & Khroyan, 2009; Shiffman, 2009). Nevertheless, ICSS studies contribute to the evidence that nicotine and other drugs act through common pathways to produce their rewarding effects (Negus & Miller, 2014).

1.2.2 Animal Models of Drug Reinforcement

1.2.2.1 Operant Self-Administration

In contrast to reward, the term “reinforcement” refers to how within a certain context (such as an operant chamber) the likelihood of a given behavior (such as an operant response) can be increased by the contiguous delivery of “reinforcing” stimuli (such as drugs of abuse) (Mackintosh, 1974; White, 1989). The voluntary “self-administration” (SA) of drugs as an extension of operant conditioning is considered the most reliable and predictive animal model of drug reinforcement (Planeta, 2013; Panlilio & Goldberg, 2007). It has been noted that “…experience of drug effect as a consequence of the user’s own behavior is [apparently] related to the production of addiction” (Kalant, 2015), and distinct neuroadaptations within reward-related brain regions have been observed as a consequence of passive vs. voluntary drug administration in rats (Stefanski et al., 1999). The operant responding of animals to voluntarily consume drugs in SA paradigms therefore has exceptionally high face validity with human drug use (Caille et al., 2012; Panlilio & Goldberg, 2007).

Nonspecific activity during SA experiments is controlled with an additional “inactive” operandum, with preferential responding on “active” as opposed to “inactive” operanda revealing the operant outcome as genuinely reinforcing (Gardner, 2000). In addition, reinforcements may be associated with visual or auditory cues, which can come to possess conditioned properties for use in further experimental manipulation (Stewart et al., 1984). The potential routes for drug delivery in SA paradigms generally fall into only two categories, both of which are discussed within the context of nicotine below.
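The active versus inactive comparison described above is often summarized as the share of all responses made on the active operandum. The following sketch assumes that summary; the function name and the response counts are hypothetical.

```python
def active_lever_proportion(active_responses, inactive_responses):
    """Share of all responses made on the active operandum.

    Returns None when no responses were made, since the proportion
    is undefined in that case.
    """
    total = active_responses + inactive_responses
    if total == 0:
        return None
    return active_responses / total


# Hypothetical session: 45 active and 5 inactive responses.
proportion = active_lever_proportion(45, 5)
print(f"Active-lever proportion: {proportion:.2f}")
```

A proportion near 0.5 would suggest nonspecific activity, while values well above 0.5 are in the direction of genuine reinforcement; any formal criterion would be set by the experimenter.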

1.2.2.2 Oral Nicotine Self-Administration

In short, the operant SA of oral nicotine can be difficult to initiate and interpret (Collins et al., 2012). While Glick et al. (1996) were able to manipulate preference for two concentrations of unsweetened nicotine through mecamylamine, the aversive taste of nicotine solution required 23-hour Monday-Friday H2O deprivation to facilitate this operant responding. Smith & Roberts (1995) attempted sweetening nicotine solutions with sucrose, but although rats responded slightly more for nicotine-sucrose than sucrose solution alone, removal of this additional sucrose content immediately and directly depressed operant responding. Hauser et al. (2012) showed that selectively bred alcohol-preferring (P) rats readily responded for saccharin-sweetened nicotine solutions, with blood draws revealing pharmacologically relevant levels of blood nicotine. While this oral nicotine intake by P rats is consistent with observations of their increased preference for i.v. nicotine (Lê et al., 2006), a potential complication is that P rats may also have had an elevated preference for saccharin over other rat strains (Sinclair et al., 1992).

1.2.2.3 Intravenous Nicotine Self-Administration

In rats, intravenous (i.v.) nicotine self-administration (NSA) was established by Corrigall & Coen (1989), who determined that a 0.03mg/kg/infusion dose was optimal for initiation and maintenance of responding. Consistent with these findings, other studies of i.v. NSA in rats have shown that nicotine reinforcement produces an inverted-U-shaped dose-response curve, peaking at infusion doses between 0.015–0.03mg/kg depending on rat strain (Donny et al., 1995; Samaha & Robinson, 2005; Shoaib et al., 1997; Valentine et al., 1997).

Intravenous NSA is held as the gold-standard for modeling nicotine reinforcement in human smoking for a number of reasons, such as a great degree of predictive validity for pharmacotherapeutic efficacy in human tobacco abuse (Lerman et al., 2007; O’Dell & Khroyan, 2009), as well as strong pharmacokinetic similarities between intravenous nicotine deposition and the rapid arterial nicotine elevations which occur in human smokers (Caille et al., 2012; Le Foll & Goldberg, 2009). Consistent with the latter, animals have been shown to respond more robustly for equivalent nicotine amounts infused rapidly rather than slowly (Samaha & Robinson, 2005; Shoaib, 1996; Yanagita et al., 1995). The face validity of the i.v. NSA paradigm is supplemented by demonstrations that intravenous nicotine also maintains self-administration in human smokers (for review, see Goodwin et al., 2015). For example, smokers can not only reliably choose 0.4mg and 0.7mg infusion doses over placebo and 0.1mg infusion doses, but can furthermore demonstrate this significant effect of dose on infusion choice even when blinded to infusion dose (Sofuoglu et al., 2008).

Despite these advantages, the i.v. NSA procedure also has limitations. Experiments using i.v. NSA must be of relatively short duration, as surgically implanted catheters remain viable for limited amounts of time (Panlilio & Goldberg, 2007). Other limitations include the large potential influence non-nicotine factors may hold upon i.v. NSA performance; as one example, the presentation of reward-associated cues may be necessary to evoke robust response rates (Caggiula et al., 2002), or as another, i.v. NSA has been shown sensitive to both when and how much animals are fed (Donny et al., 1998). Finally, the face validity of i.v. NSA may be challenged by the fact that intravenous routes of nicotine delivery are quite uncommon as a choice of human users (Henningfield & Keenan, 1993).

1.3 Mechanisms of Nicotine Reward

1.3.1 The Mesolimbic Dopamine System

To produce the effects of reward and reinforcement observed in the models above, converging lines of evidence suggest nicotine may exert mechanistic influence on the mesolimbic (or mesocorticolimbic) dopamine (DA) system (Doyon et al., 2013; D’Souza & Markou, 2011; Grace et al., 2007; Koob & Volkow, 2010). This system originates in the ventral tegmental area (VTA) of the midbrain (“meso-” deriving from ancient Greek for “middle”), which sends dopaminergic, glutamatergic, and γ-aminobutyric acid (GABA)ergic projections to both cortical and limbic structures (Dahlstrom & Fuxe 1964; Koob Volkow, 2010). In turn, diverse brain regions provide glutamatergic, GABAergic, and neuromodulatory afferent inputs to the VTA (Lammel et al., 2012); although DA is the primary neurotransmitter implicated in the mediation of reward, interactions

among many of these other transmitter systems are likely essential (Bjorklund & Dunnett, 2007;

Cannon & Bseikri, 2004).

Elevations of DA neurotransmission in terminal regions of the mesolimbic system, such as the nucleus accumbens (NAc), have been regularly observed in animals consuming natural rewards, such as eating food, drinking water, and engaging in sex (Berridge & Robinson, 1998). Disrupting this signaling with DA receptor antagonists can cause natural food rewards to lose their normal capability to maintain operant responding (Wise, 2006), and also impairs the efficiency of ICSS reward (Stellar & Corbett, 1989). The NAc is part of the ventral striatum, and constitutes the other major component of the mesolimbic pathway alongside the VTA (from which it receives dopaminergic projections) (Salgado & Kaplitt, 2015). Cumulative evidence has revealed that virtually all drugs of abuse, including nicotine, converge in action on the VTA-NAc pathway, although they may do so through distinct mechanisms (for review, see Nestler, 2005).

Like other drugs of abuse, nicotine promotes dopaminergic signaling from the VTA to the

NAc (Benowitz, 2008; Koob & Le Moal, 2006; Jasinska et al., 2014). Several subtypes of nAChR are expressed by mesolimbic DA neurons, especially in the VTA (Larson & Engel, 2004; Taly et al.,

2009); neuroanatomical studies have shown that rat VTA tissue possesses nAChRs at both cell body and dendritic levels, as well as on terminal fields in the NAc (Clarke, 1993; Deutch et al., 1987;

Schilström et al., 1998; Wada et al., 1989). Evidence that DA neurotransmission in the NAc is involved in nicotine’s reinforcing effects includes findings that both DA antagonists (Corrigall & Coen, 1991) and neurotoxic lesions of the NAc (Singer et al., 1982) can reduce NSA. This suggests that facilitation of NAc activity is crucial to nicotine’s mechanisms of action (Corrigall et al., 1992).

1.3.2 Nicotine and the Mesolimbic Reward System

Nicotine is therefore capable of inducing increases in accumbal dopamine within the mesolimbic system while acting as a nAChR agonist, although the precise biochemical mechanisms through which it does so are not yet completely clear. However, nicotine has been shown not only to directly activate DA neurons within the VTA (Grenhoff et al., 1986; Mereu et al., 1987; Calabresi et al., 1989), but also to induce them to fire in phasic “bursts” (Zhang & Sulzer, 2004); both of these effects increase extracellular DA in the VTA and NAc (De Biasi & Dani, 2011; Rice & Cragg, 2004). Evidence also suggests that these effects may be mediated by nAChRs of the CNS; while pre-treatment with the central nAChR antagonist mecamylamine was shown to prevent nicotine-induced DA release in the NAc, the peripheral nAChR antagonist hexamethonium was ineffective (Imperato et al., 1986).

Similarly, nAChR antagonism has also been found to suppress the NSA of rats when microinfused into the VTA, but not when administered into the NAc (Corrigall et al., 1994); this suggests that cholinergic signaling at the VTA may be critical. As such, another possible mechanism for nicotine reward is the indirect modulation of the VTA’s afferent inputs. Glutamatergic terminals of VTA afferents also express presynaptic nAChRs considered to prominently include the α7 subtype (Jones & Wonnacott, 2004). Although the sources of those afferents may be subject to further influences of nicotine locally, the stimulation of presynaptic glutamatergic afferent terminals is another means through which nicotine can potentiate excitatory transmission to the VTA

(Mansvelder & McGehee, 2002).

There may also be dissociable roles for individual subunits in producing nicotine’s effects.

While genetic knockout of α5 subunits has been found to increase NSA in mice, the overexpression of β4 subunits does the same (Changeux, 2010; Wu & Schulz, 2012); this suggests that while some nAChR subunits (such as α5) could generally act to inhibit nicotine reward or produce aversion,

others (such as β4) perform the opposite role (Tuesta et al., 2011; Wu & Schulz, 2012). Roles for specific subtypes of nAChR in nicotine reward are supported further by a number of gene deletion studies; genetic knockout of certain nAChR subunits prevalently expressed by the VTA (including

β2, α4, and α6) can abolish NSA in mice, whereas re-expressing these subunits via viral vectors can re-establish NSA (Picciotto et al., 1998; Pons et al., 2008).

However, since a single nAChR subunit can be expressed by multiple receptor subtypes, each individual subunit may perform a multiplicity of functional roles; the behavioral and rewarding properties of nicotine may potentially be mediated by the complex interplay of these effects upon more than one receptor subtype (Dani & De Biasi, 2001; Tuesta et al., 2011).

1.4 Theoretical Frameworks of Substance Dependence

1.4.1 Motivational Accounts of Substance Dependence

1.4.1.1 Characteristics of Addiction

It is frequently proposed that substance addiction is a chronically relapsing disorder involving a progressive loss of control over drug seeking and consumption, persisting despite adverse consequences. This state is characterized by: 1) compulsion to seek and take the drug, 2) loss of control over drug intake, and 3) emergence of negative affective states reflecting withdrawal syndromes in the drug’s absence (Everitt & Robbins, 2005; Koob & Le Moal, 1997; Koob & Volkow, 2010; Piazza & Deroche-Gamonet, 2013). These three characteristics of addiction have frequently been considered to interact cyclically, with compulsive and escalating drug intake both arising from and perpetuating aversive craving (Koob & Volkow, 2010; Robinson & Berridge, 2008; Skinner & Aubin, 2010). In elucidating these interactions, the use of animal models such as

those above has contributed tremendously to the study of addiction through measurement of distinct aspects of drug seeking and taking behaviors at a level not possible in human subjects (Rose

& Corrigall, 1997). In the use of such animal models, the hypotheses made by conceptual accounts may be put to the test.

1.4.1.2 Hedonic Homeostatic Dysregulation

One well-known conceptual framework for addiction has origins in Solomon & Corbit’s “opponent process model of motivation” (1974), which noted that as subjects grow accustomed to either good or bad stimuli, an affective state of the opposite valence is left behind should those stimuli be abruptly removed. The authors proposed that “opponent processes”, naturally recruited by changes in affect to maintain a neutral hedonic baseline, could be responsible. Koob & Le Moal (1997) would go on to apply such concepts of “hedonic homeostasis” to the affective aspects of drug use, suggesting that repeated drug administration could cause abnormal, long-lasting disruptions of this system not typically achievable by natural rewards.

In this account, an addiction cycle of “spiraling distress” would cause residual decreases in hedonic homeostatic set point, gradually shifting a baseline of neutral affect to one of negative affect in a process of hedonic “allostasis” (Koob & Le Moal, 2001). For an addict to avoid the negative affective states hedonic allostasis has rendered normative, no other choice exists but for substance use to resume; the feed-forward cycle is thus perpetuated and recreational use transitions to compulsive abuse (Koob et al., 2004; Koob & Le Moal, 2008). One prediction arising from the hedonic allostasis theory is that negative affective states should arise following the cessation of chronic drug administration, an effect reliably demonstrated in models such as ICSS paradigms

(Koob & Le Moal, 1997; Koob & Volkow, 2010).

1.4.1.3 Incentive Sensitization

An account of addiction which takes an alternative perspective to that of hedonic allostasis is the “incentive sensitization theory” proposed by Robinson & Berridge (1993). Noting that prior exposure to certain drugs can enhance their behavioral effects in future encounters (e.g.

“locomotor” sensitization), they proposed that repeated drug exposure may also produce hypersensitivity in the mesocorticolimbic circuits responsible for assigning value to salient stimuli

(i.e. “incentive” sensitization) (Robinson & Berridge, 2008). Consequently, aberrant reward signals could bias attentional processing toward drugs and drug-associated stimuli, pathologically elevating motivation for drug use. In this scenario, the incentive-sensitized addict would be perceptually confronted with motivational desire elevated to the point of compulsion (Robinson & Berridge, 1993). Specific evidence for theories such as incentive sensitization can be drawn from studies investigating drug-induced changes in the incentive salience attributed to drug-related stimuli, a primary example being the induction of CPP (Robinson & Berridge, 2008).

1.4.2 Compulsive Accounts of Substance Dependence

1.4.2.1 The Limits of Motivation

The two conceptual models discussed above are not mutually exclusive, and different aspects of addiction captured by each framework likely reflect several underlying processes and their interactions (Belin et al., 2013; Everitt et al., 2001; Vanderschuren & Everitt, 2005). That being said, certain features of addiction remain difficult to explain entirely through such motivational accounts (Sjoerds et al., 2014). For example, self-reports from drug users have suggested that even as the pleasure of drug use decreases over the long term, use can nevertheless persist (Kennett et al., 2013). Although one interpretation put forth by the incentive

sensitization theory is that drugs can be subjectively “wanted” without necessarily being “liked”

(Robinson & Berridge, 2008), Vanderschuren & Everitt (2004) observed that i.v. cocaine-seeking in rats given extended training was resistant to normal suppression by a fear-conditioned tone; the authors noted that this was neither due to impaired fear conditioning nor to any change in the incentive value of cocaine, perhaps reflecting genuine “compulsivity” in drug-seeking

(Vanderschuren & Everitt, 2004). In similar examples, rats with high levels of cue-reinstatement were found to be undeterred in i.v. cocaine-SA despite response-contingent footshock (Deroche-Gamonet et al., 2004), and rats trained to consume alcohol over long periods (6 months or more) persisted in drinking even after quinine adulteration of their ethanol solution rendered it bitter enough to deter rats that had not been drinking for as long (Holter & Spanagel, 1999; Wolffgramm & Heyne, 1995). A similar example has been found in the case of nicotine, where rats that freely chose to drink nicotine solution after 48 weeks of training became excessive consumers that also inflexibly drank despite quinine adulteration (Galli & Wolffgramm, 2011).

A prevalent concept in both theories above has been that of “craving”, referring to any desire experienced by abstinent drug users to return to substance use (Skinner & Aubin, 2010).

However, this conscious desire for drug use may not fully encapsulate the compulsion to seek and take drugs (Tiffany & Carter, 1998). For example, alcohol and cocaine users may frequently report

“impulsive use with no known cause” as responsible for relapse episodes (Miller & Gold, 1994).

Two recently published meta-analyses have systematically reviewed nicotine craving in relation to tobacco use and smoking cessation. Gass et al. (2014) considered 50 studies on nicotine craving and tobacco-consumption, to conclude that craving “may play a role in, but does not fully account for, tobacco-use behaviors”. Similarly, Wray et al. (2013) synthesized findings from 62 smoking cessation studies to conclude that craving is not a necessary condition for relapse to cigarette use, and noted that “inconsistent relationships between craving and treatment outcome call

into question the value of craving as a target of treatment”. If drug use cannot be motivationally accounted for through craving, then why might it occur?

1.4.2.2 The Development of Automaticity

One potential explanation was put forth by Tiffany (1990), who noted that as cognitive and motor skills are practiced, they become easier and more routine; perhaps the highly ritualized and repeated behaviors of drug-seeking and taking could also recruit such cognitive processes to thereby become automatic. In this case, “automatic” refers to behaviors that are elicited by antecedent stimuli, initiated without intention, and completed without conscious awareness (Tiffany, 1990).

While drug use may be initiated due to motivational factors, it may later transition to an automatic and inflexible response persistently elicited by environmental cues, thereby mediating tenacious use and relapse (Everitt & Robbins, 2005; Tiffany, 1990). Relapse episodes to drug use likewise seem rather stimulus-bound, most acute when addicts are confronted with drug-related cues or opportunities to consume their drug (Tiffany & Carter, 1998).

Should a smoker reaching for a cigarette find their pack is empty, their automatized drug use would be disrupted; having a cigarette now involves more deliberate and demanding “non-automatic” processing (such as planning a trip to the store). Some authors have suggested that explicit feelings of drug craving may actually be the post-hoc rationalization of such interruptions to automatized behavioral impulses (Belin et al., 2013; Tiffany & Carter, 1998). While a lack of cigarettes is one such interruption, another is a commitment to abstinence; the effortful, non-automatic processing necessary to suppress what would otherwise be an automatic behavior may be what manifests as explicit drug desire (Drummond, 2001). In their meta-analysis of craving and tobacco use, Gass et al. (2014) distinguished “automatic” nicotine-seeking measures (i.e. quantified aspects of normal smoking) from “nonautomatic” seeking measures (e.g. where nicotine had to be

chosen over other rewards), to find that craving is more predictive of tobacco use when assessed by nonautomatic nicotine-seeking metrics.

The potential for automatic processes to contribute to compulsive drug use has seen strong support from animal models of automaticity, in which the term “habit” is typically used instead

(Ashby et al., 2010). Investigations of habit have not only shown that repeated behaviors can become automatically elicited by the environmental context, but also that habit formation shares intimate relationships with many aspects of drug addiction. Theories proposing a role for habit in substance dependence will be revisited in section 1.8, after key concepts necessary for understanding habit research have been dealt with.

1.5 Habit

1.5.1 Two-System Control of Instrumental Behavior

1.5.1.1 Actions and Outcomes

Behaviors that are not “habits” are referred to as “goal-directed” (Dickinson, 1985). The distinction between the two is exemplified by a series of investigations in the 1980s into what kind of associative mechanisms guide animal learning throughout operant training. One possibility was that animals acquired operant responding through stimulus-response (S-R) associations, the idea that reinforcement directly associates the environmental context (the “stimulus”) with the reinforced operant behavior (the “response”), resulting in the environment more strongly evoking that response in the future (Hull, 1943). The opposing view was that animals could make response-outcome (R-O) associations, with the operant behavior (the “response”) associated instead with the current desirability of the operant reward (the “outcome”) (Tolman, 1949). This implies animals

have explicit knowledge (or “internal representation”) of the operant outcome and its value, and only perform responses if the outcome has appeal (Adams & Dickinson, 1981; Dickinson & Balleine, 1994).

To determine which associative framework was the more accurate descriptor of animal behavior, Adams & Dickinson (1981) directly tested the predictions of each. Rats were trained to lever press for food pellet rewards, after which half of the animals were given free access to the reward pellets followed immediately by intraperitoneal (i.p.) injections of lithium chloride (LiCl).

LiCl is a nauseating agent, and aversion-pairing procedures in which LiCl injections immediately follow the consumption of a certain food (a “pairing”) had earlier been shown to condition a taste aversion to that food (Domjan & Wilson, 1972; Garcia & Koelling, 1967; Morrison & Collyer,

1974). Making the operant reward unappealing, or “devaluation”, should have distinct consequences for behavior mediated by S-R vs. R-O associations in a post-devaluation test, carried out under extinction conditions.

Since these animals had only previously experienced lever-pressing in the context of reinforcement, no differences in the test responding between “devalued” and “non-devalued” conditions would be predicted if animals were S-R associating. But if animals were R-O associating, their internal representation of the operant reward’s value should have been reduced by devaluation, and devalued animals alone would be predicted to reduce their extinction test responding. Adams &

Dickinson (1981) found that the devalued group indeed made fewer responses during the extinction test, suggesting that R-O associations drove the purposeful responding of these subjects.
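The contrasting predictions of the two accounts can be written down as a toy calculation. The sketch below is illustrative only: the function name, the trained response rate, and the assumption that R-O responding scales linearly with an outcome value between 0 and 1 are hypothetical simplifications, not parameters taken from Adams & Dickinson (1981).

```python
def predicted_test_responding(account, trained_rate, outcome_value):
    """Predict extinction-test responding under one associative account.

    account       -- "S-R" or "R-O"
    trained_rate  -- response rate established in training (hypothetical units)
    outcome_value -- current reward value: 1.0 = intact, 0.0 = fully devalued
                     (linear scaling is an assumption made for illustration)
    """
    if account == "S-R":
        # The context evokes the response directly; outcome value plays no role.
        return trained_rate
    if account == "R-O":
        # Responding depends on the represented value of the outcome.
        return trained_rate * outcome_value
    raise ValueError("account must be 'S-R' or 'R-O'")


for account in ("S-R", "R-O"):
    non_devalued = predicted_test_responding(account, 20.0, 1.0)
    devalued = predicted_test_responding(account, 20.0, 0.2)
    print(account, "non-devalued:", non_devalued, "devalued:", devalued)
```

Under these assumptions only the R-O account predicts a difference between devalued and non-devalued groups, which is the pattern a devaluation test is designed to detect.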

A similar experiment recently showed that humans too are capable of purposeful actions.

Tricomi et al. (2009) trained participants to play a computer-based free-operant task for chocolate and potato chip rewards. Following the final session of training, participants were presented with an unlimited amount of one of the two foods, and instructed to eat until it was “no longer pleasant” for them. Like aversion-pairing, this “devaluation” by satiety is another means of reducing a reward’s

appeal. When participants returned to the task in a test session, they reduced their performance for the consumed (devalued) reward, but not for the unconsumed (valued) reward, consistent with the results reported by Adams & Dickinson (1981).

1.5.1.2 Stimuli and Responses

To complicate the above, Adams (1982) reported that if animals had received limited training (earning 100 reinforcements across 2 sessions), devaluation reduced their extinction test responding as before to signify an R-O association; but, if animals had been given extended training

(earning 500 reinforcements across 10 sessions), devaluation no longer reduced their extinction responding as a consequence. This suggested that it was S-R associations which were now mediating their behavior, and Dickinson (1985) characterized this gradual S-R predomination of behavior as the formation of a “habit”.

Habitual responses are made by humans as well as animals. In Tricomi et al.’s (2009) study, the human participants who stopped responding for satiety-devalued foods were trained with two sessions in a single day. But another experimental condition was run in which participants had six times the amount of training (four sessions per day over three days). Even after eating one food to devalued satiety, these “overtrained” participants did not change how they responded for either of the two foods during testing, suggesting that they too were now responding by habit.

1.5.1.3 Goals and Habits

These findings demonstrated that while both R-O and S-R associations can contribute to instrumental action, which of the two is revealed by a behavioral assay such as devaluation can depend on conditions such as training length (Adams, 1982; Dickinson, 1985). When S-R associations predominate, the behavior is considered “habitual”, and when R-O associations

predominate, the behavior is in contrast “goal-directed”. While habitual behaviors are typically considered to form through the repetition of goal-directed actions (Colwill & Rescorla, 1986;

Boakes, 1993; Dickinson, 1985; Dickinson et al., 1995), an action must be driven by two major factors to be considered goal-directed: the incentive value of the outcome (e.g. I’ll work harder for a bigger paycheck), and the causality between responses and that outcome (e.g. I’ll only work if I get paid) (Dickinson & Balleine, 1994; Balleine & Dickinson, 1998a; Balleine & Dickinson, 1998b). To acquire goal-directed behavior, animals must directly experience each of these through two separate learning processes, respectively termed “incentive learning” and “contingency learning” (Balleine &

Dickinson, 1998a).

1.5.2 Goal-Directed Behaviors

1.5.2.1 Incentive Learning

Incentive learning is the process by which the value of an outcome is determined (Balleine &

Dickinson, 1998a). Although food is more appealing when one is hungry as opposed to when one is full (Rolls, 1986), Dickinson et al. (1995) demonstrated that animals made identical numbers of responses for a distinct operant food reward in an extinction test session, regardless of whether they were hungry or fed just prior to the test. However, if on a previous occasion animals had been exposed to the distinct operant reward while already full, feeding them just prior to testing did reduce test responding. Apparently, animals first had to learn that the operant reward was less satisfying when full before they could integrate this knowledge into their test responding. This represents the basic tenet of incentive learning: animals must learn, through exposure, about changes in outcome value (Dickinson & Dawson, 1989).

The experimental relevance of this is that even R-O behavior will not reflect changes in outcome value until animals discover the outcome is devalued through re-exposure (Balleine &

Dickinson, 1998b; Dickinson & Balleine, 1994). In the case of the LiCl aversion-pairing utilized by

Adams & Dickinson (1981), the animals critically underwent multiple LiCl-reward pairings; re-exposure to the devalued reward occurred as each pairing progressed. For satiety devaluations, re-exposure occurs continuously as the reward is consumed (Gottfried & Balleine, 2011).

Through incentive learning, goal-directed behaviors are characterized by being disrupted if their consequence is discovered to be unappealing.

1.5.2.2 Contingency Learning

Contingency learning is the process by which the causal relationship between responses and outcomes is determined through direct experience (Balleine & Dickinson, 1998a). The causal relationship is referred to as the “instrumental contingency”, operationalized as the correlation between changes in response rate and changes in reinforcement rate (Dickinson, 1985). For example, assuming an operant reward has incentive value, what actually forms the R-O association is the specific correlation the animal experiences between increases in active responding and increased reward deliveries (Balleine & Dickinson, 1998a; Dickinson et al., 1983; Dickinson, 1985).

A prediction that follows is that if responses are made inconsequential by freely presenting the reward in-session (“non-contingent” upon operant responses), the experienced instrumental contingency will be “degraded”, as reward rates would increase without a correlated increase in response rates (Dickinson & Mulatero, 1989). This should reduce the strength of R-O associations, and thus the rate of responding (Balleine & Dickinson, 1998a). This was found to be the case experimentally, with non-contingent reward presentations selectively reducing only those responses which could earn the same reward (Dickinson & Mulatero, 1989; Hammond, 1980;

Williams, 1989). Without changing outcome value, these experiments demonstrated that R-O

associations still depend on experience of the instrumental contingency (Balleine & Dickinson,

1998a; Dickinson, 1985; Dickinson & Balleine, 1994).

Through contingency learning, goal-directed behaviors are further characterized by being disrupted if discovered to be inconsequential.
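The operationalization of the instrumental contingency as a correlation between changes in response rate and changes in reward rate can be sketched numerically. The example below is a minimal illustration under stated assumptions: the response counts are randomly generated rather than recorded data, every function and variable name is hypothetical, and non-contingent delivery is modeled simply as extra rewards that are independent of responding.

```python
import random


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series does not vary."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def changes(series):
    """Bin-to-bin changes in a rate series."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def experienced_contingency(responses, rewards):
    """Correlation between changes in response rate and in reward rate."""
    return pearson(changes(responses), changes(rewards))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
responses = [rng.randint(0, 20) for _ in range(200)]     # responses per time bin
free_rewards = [rng.randint(0, 40) for _ in range(200)]  # response-independent

contingent_rewards = list(responses)  # FR1: every response earns one reward
degraded_rewards = [r + f for r, f in zip(responses, free_rewards)]

intact = experienced_contingency(responses, contingent_rewards)
degraded = experienced_contingency(responses, degraded_rewards)
print("intact:", round(intact, 2), "degraded:", round(degraded, 2))
```

In this sketch the response-independent rewards add variation in reward rate that responding does not explain, so the computed correlation falls below its FR1 value, mirroring the "degraded" contingency described above.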

1.5.3 Determinants of Habit Formation

1.5.3.1 Training Length

Empirically, there is a consensus that instrumental actions are goal-directed when acquired, and transition to habitual control with repetition (Balleine & O’Doherty, 2010; Smith & Graybiel,

2013; Yin & Knowlton, 2006). In the context of operant responding, this may be due to how the instrumental contingency is experienced over the course of training. Early in training, animals learn that they can obtain rewards only if operant responses are made, and the large correlations between response and reward rates form a strong R-O association (Dickinson, 1985). However, animals will eventually maintain the stable response rate that earns them the optimal number of rewards

(Dickinson, 1985; Yin & Knowlton, 2006); during stable responding, changes in both response and reward rates diminish (Figure 1.1), such that animals no longer experience any variation with respect to how the response co-occurs with the outcome (Dickinson, 1985).

At this point, animals continue to make responses despite no longer having any strong experience of the instrumental contingency to maintain R-O associations, suggestive of a transition to S-R control (Dickinson, 1985). For behaviors mediated by S-R association, changes in outcome value have no means through which to influence the rate of operant responses.

Responding at this stage is now driven by neither the instrumental contingency nor the outcome, and a habit can be said to have been developed (Dickinson, 1985). That the length of training gradually reduces the goal-directedness of behavior has been demonstrated time and again in the

animal literature for a variety of both natural and drug reinforcers, including food pellets (Adams,

1982), sucrose solution (Mangieri et al., 2012), ethanol solution (Corbit et al., 2012; Mangieri et al.,

2012), and possibly cocaine (Olmstead et al., 2001; Zapata et al., 2010).

Figure 1.1. Diagram of general changes in response rate across an arbitrary length of training. As responding stabilizes across training, animals consistently perform the number of responses that earns the optimal number of rewards, resulting in a reduced experience of correlation between changes in response rate and changes in reward rate (the “instrumental contingency”). There cannot be a correlation between changes in these rates if they are unchanging; extended training therefore leads to a diminished experience of the instrumental contingency. Adapted from Dickinson (1985) and Yin & Knowlton (2006).
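The argument that stable responding leaves little contingency to be experienced can be illustrated with a short calculation. The sketch below assumes a hypothetical acquisition curve (the asymptote, time constant, and session count are invented for illustration) and an FR1 schedule. It uses covariance rather than correlation, since under FR1 the correlation remains perfect as long as any variation exists; what shrinks is the amount of covariation between the two rates.

```python
import math


def covariance(xs, ys):
    """Population covariance of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def changes(series):
    """Session-to-session changes in a rate series."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Hypothetical acquisition curve: responding rises toward an asymptote.
ASYMPTOTE = 100.0     # stable responses per session (assumed)
TIME_CONSTANT = 3.0   # sessions needed to approach the asymptote (assumed)
SESSIONS = 20

responses = [ASYMPTOTE * (1 - math.exp(-s / TIME_CONSTANT))
             for s in range(SESSIONS)]
rewards = list(responses)  # FR1: rewards earned equal responses made

response_changes = changes(responses)
reward_changes = changes(rewards)

# Compare the first five and the last five session-to-session changes.
early_cov = covariance(response_changes[:5], reward_changes[:5])
late_cov = covariance(response_changes[-5:], reward_changes[-5:])
print("early:", round(early_cov, 3), "late:", round(late_cov, 6))
```

Under these assumptions the covariation between changes in response rate and changes in reward rate is large early in training and close to zero once responding has stabilized.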

1.5.3.2 Reinforcement Schedule

However, extended training is not the only way behavior can be shifted to S-R habitual control. The specific rules that determine how operant responses relate to the delivery of reinforcements are referred to as the “schedule” of reinforcement, and various response requirements can produce distinct patterns of operant response behavior (Spealman & Goldberg,

1978). In what are called “ratio” reinforcement schedules, reinforcements are delivered following a specific number of operant responses, and whether this number is fixed or variable within a given

operant session respectively distinguishes “fixed-ratio” (FR) from “variable-ratio” (VR) schedules

(Skinner & Ferster, 1957).

In contrast to ratio schedules, an “interval” schedule refers to reinforcement additionally contingent on the elapse of a time-interval following each reinforcement delivery; only after this time-interval elapses is the first response made reinforced. Whether the duration of this time-interval is fixed or variable within an operant session respectively distinguishes “fixed-interval” (FI) from

“variable-interval” (VI) schedules (Skinner & Ferster, 1957). Occasionally, VI schedules may be referred to as “random interval” (RI) schedules in the literature (e.g. Miles et al., 2003), but it should be noted the term “RI” has also been used to refer to distinct variants of schedule (e.g. Millenson,

1963).

Dickinson et al. (1983) noted that previous studies generating habitual behavior had all employed VI schedules to train their subjects, while studies in which animals appeared goal-directed had all used ratio-schedules during training. It was hypothesized that animals’ experience of a strong instrumental contingency (behavior-reward correlation) in ratio schedules could promote R-O associations; reward rates are perfectly correlated with response rates in ratio-schedules, but the elapse of time-intervals in VI schedules restricts reward rate, weakening experience of the instrumental contingency (Figure 1.2) (Dickinson et al., 1983; Dickinson, 1985; Yin & Knowlton,

2006). This should undermine the formation of R-O associations, such that S-R associations would be predicted to predominantly control behavior following VI schedule training (Balleine &

Dickinson, 1998a; Dickinson, 1985). To that end, Dickinson et al. (1983) directly compared the outcome-sensitivities of rats receiving equivalent training on ratio and VI schedules, and found that each type of training respectively produced goal-directed and habitual responding, as predicted.

Figure 1.2. Diagram of general changes in reward rate vs. response rate for ratio and interval schedules. Reward rates can be directly increased under ratio-schedules by increased rates of responding, but the elapse of time-intervals precludes such correlations in interval-schedules. Adapted from Dickinson (1985) and Yin & Knowlton (2006).
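The difference between the two schedule types can also be sketched as a small simulation. The example below is a simplified illustration, not a reconstruction of any cited procedure: the session length, the FR5 and VI 30 s parameters, the evenly spaced responding, and the use of exponentially distributed intervals are all assumptions, and the function names are hypothetical.

```python
import random


def ratio_rewards(response_times, ratio):
    """Fixed-ratio schedule: one reward per `ratio` responses."""
    return len(response_times) // ratio


def interval_rewards(response_times, mean_interval, rng):
    """Variable-interval schedule: after each reward a random interval must
    elapse, and the first response made after it elapses is rewarded."""
    rewards = 0
    available_at = rng.expovariate(1.0 / mean_interval)
    for t in response_times:
        if t >= available_at:
            rewards += 1
            available_at = t + rng.expovariate(1.0 / mean_interval)
    return rewards


def evenly_spaced_responses(rate_per_min, session_s):
    """Response times (s) for an animal responding at a steady rate."""
    gap = 60.0 / rate_per_min
    count = int(session_s / gap)
    return [gap * (i + 1) for i in range(count)]


SESSION_S = 1800.0  # 30-minute session (assumed)
slow = evenly_spaced_responses(10, SESSION_S)  # 10 responses per minute
fast = evenly_spaced_responses(20, SESSION_S)  # 20 responses per minute

fr_slow, fr_fast = ratio_rewards(slow, 5), ratio_rewards(fast, 5)
# The same seed gives both response rates the same sequence of intervals.
vi_slow = interval_rewards(slow, 30.0, random.Random(1))
vi_fast = interval_rewards(fast, 30.0, random.Random(1))
print("FR5:", fr_slow, fr_fast, "| VI 30 s:", vi_slow, vi_fast)
```

In this sketch, doubling the response rate doubles the rewards earned on the ratio schedule, whereas rewards on the interval schedule stay near the ceiling set by the mean interval.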

The use of VI schedules in training has subsequently been shown to effectively accelerate habit formation for a variety of reinforcers, including food pellets (Dickinson et al., 1983), sucrose solution (Lingawi & Balleine, 2012), and ethanol (Mangieri et al., 2012). Furthermore, VI schedules promote the formation of habits interactively with training length; Mangieri et al. (2012) demonstrated that while responding for sucrose solution was goal-directed following limited training on both VI and VR schedules, only VI schedule training produced habitual responding when training was extended.

The reduction in instrumental contingency by VI schedules is distinct from that of non-contingent reward delivery, as responses in VI schedules are still necessary for the delivery of any rewards (Dickinson, 1985; Yin & Knowlton, 2006). As a final aside, FI schedules do not seem to promote the formation of habits, potentially due to the predictable nature of reward availability;

DeRusso et al. (2010) found that by the time a VI schedule was able to generate habitual responding, an FI schedule did not. As such, FI schedules do not commonly appear in habit research literature.

1.6 Distinguishing Goal-Directed and Habitual Behavior

1.6.1 Outcomes and Contingencies

To demonstrate a behavior as either “goal-directed” or “habitual”, a behavioral assay is employed to assess whether R-O or S-R associations predominate over that behavior’s execution

(Balleine & Dickinson, 1998a; Yin & Knowlton, 2006). Through incentive learning and contingency learning, goal-directed actions are respectively sensitive to changes in the incentive value of their outcome as well as sensitive to reductions in their necessity for reward; in contrast, habitual responses are insensitive to both these factors (Balleine & O’Doherty, 2010; Dickinson & Balleine,

1994; Balleine & Dickinson, 1998a). Each of these two characteristics therefore provides a means through which goal-directed actions and habitual responses may be distinguished experimentally, and two common behavioral assays are considered for each of these two strategies, below.

1.6.2 Manipulations of Outcome Value (i.e. Devaluation Procedures)

1.6.2.1 Reward Devaluation and Extinction Testing

As a general rule, devaluation procedures are conducted in three steps: instrumental training, reward devaluation, and extinction testing (Rossi & Yin, 2012). The extinction test determines whether the incentive learning in devalued animals was capable of influencing behavior; as S-R associations are mediated by contextual experience alone, knowledge of outcome value will not affect behavior under S-R control (Adams & Dickinson, 1982; Dickinson & Balleine, 1994).

Therefore, whether devalued animals are reduced or identical in extinction test responding (relative to unpaired controls) will reveal if the behavior in question is goal-directed or habitual (Balleine & Dickinson, 1998a; Rossi & Yin, 2012).


This post-devaluation test must always be conducted under extinction conditions, with no rewards or associated cues presented (Rossi & Yin, 2012; Yin & Knowlton, 2006). Why this is so critical can be illustrated by considering the case of habitually responding animals: to identify them as such following devaluation, they would need to subsequently respond at a rate consistent with that of animals in a non-devalued control condition (Adams, 1982). In a normal operant session, non-devalued animals will have previous S-R associations reinforced as usual, but devalued animals will immediately have any previous S-R learning perturbed by presentations of the now aversive reinforcer (Balleine & Dickinson, 1998a; Dickinson & Balleine, 1994). As such, direct behavioral suppression by the devalued reinforcer will cause even a habitually responding animal to reduce responses made during normal operant sessions, and exactly that effect is experimentally employed in “reacquisition” tests to confirm a devaluation was effective (Adams & Dickinson, 1981; Adams, 1982).

Therefore, testing animals in extinction allows a distinct operant outcome of null-consequence to be equally experienced by both devalued and non-devalued conditions, ruling out exposure to the aversive reward as an alternative interpretation of response differences (Adams & Dickinson, 1981; Rescorla, 1994; Rozeboom, 1958). However, even in extinction, learning of the new operant outcome (null consequence) will still interfere with previously learned associations; this causes response rates to diminish over the course of the extinction session (Boakes, 1993; Spear & Miller, 1981). Response rates and effects of devaluation are therefore most pronounced within roughly the first 10 minutes of extinction testing (Adams & Dickinson, 1981; Rozeboom, 1958). As such, the session length in extinction tests is typically shortened to better capture these effects, with a duration of around 10 minutes most commonly used (e.g. Bradfield et al., 2013; Dickinson et al., 2002; Fanelli et al., 2013; Miles et al., 2003; Mangieri et al., 2012; Serlin & Torregrossa, 2015; Shillinglaw et al., 2014).


1.6.2.2 Sensory-Specific Satiety Devaluation

Exemplified by the human participants mentioned above (Tricomi et al., 2009), “sensory-specific satiety” is a phenomenon in which pre-feeding of a specific food reduces its subsequent consumption relative to non-pre-fed foods (Hetherington & Rolls, 1996; Rolls, 1986). Like aversion-pairing, satiety can be used as a method to specifically reduce the value of a certain reward as an operant outcome (Colwill & Rescorla, 1985; Balleine & Dickinson, 1998a).

“Sensory-specific” refers to the necessity of controlling for primary motivational states such as hunger and thirst, performed by pre-feeding an alternative non-operant reward to the non-devalued control group. Importantly, this control consumption should differ in only the sensory (and motivationally arbitrary) dimension of taste. Hunger (i.e. food controlled with distinct food), thirst (i.e. solutions controlled with distinct solutions), caloric load, and consumption volumes are optimally kept consistent during testing (Balleine & Dickinson, 1998b). These necessities in experimental control groups have resulted in satiety devaluations seeing limited use outside of the study of natural rewards, especially for reinforcers which are not orally consumed.

1.6.2.3 LiCl Aversion-Pairing Devaluation

As mentioned in the preceding section, if an i.p. injection of LiCl is delivered to an animal immediately following consumption of a certain reward (a “pairing”), a specific aversion to that reward is subsequently conditioned and the reward will be devalued (Adams & Dickinson, 1981; Domjan & Wilson, 1972). Aversion-pairing with LiCl has been demonstrated as an effective means of devaluation for various orally consumed reinforcers, including natural rewards such as food pellets or sucrose solution (Adams, 1982; Lingawi & Balleine, 2012), as well as drug rewards such as alcohol (Barker et al., 2010; Dickinson et al., 2002; Mangieri et al., 2012), cocaine solution (Miles et al., 2003), and even nicotine (Clemens et al., 2014).


1.6.2.4 Aversion Pairing and Reacquisition Testing

The above being said, suppose an aversion-pairing devaluation fails to alter extinction responding. Is this because the behavior is under S-R control, or simply that the pairing procedure was ineffective? To help confirm the effectiveness of aversion-pairing devaluations, the extinction test is typically followed by a reacquisition test, conducted under normal operant SA conditions (Adams & Dickinson, 1982). As LiCl will induce conditioned aversion to a given reward following an efficacious pairing (Morrison & Collyer, 1974), a successfully devalued group will voluntarily self-administer much less of a LiCl-paired outcome than an unpaired control group. As both R-O and S-R associations would be perturbed by delivery of a devalued reinforcer, the reacquisition test can demonstrate an aversion-pairing procedure as effective by selectively reducing the reacquisition responding of devalued animals, regardless of whether they appeared goal-directed or habitual in extinction (Adams, 1982; Dickinson et al., 1983). Interpretation of all possible outcomes from post-devaluation extinction and reacquisition testing is summarized in Table 1.1.

Table 1.1. All possible interpretations of post-devaluation test session responding in devaluation procedures (relative to unpaired controls).

DEVALUATION    Extinction Test            Reacquisition Sessions     Outcome Sensitivity?
Condition A    Reduced responses          Reduced responses          Sensitive
Condition B    No response differences    Reduced responses          Insensitive
Condition C    -                          No response differences    Uninterpretable
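The decision logic summarized in Table 1.1 can be sketched as a small function. This is only an illustration: the function name, its boolean arguments, and the returned labels are hypothetical, and the judgment of whether responding was "reduced" relative to unpaired controls is assumed to have been made beforehand by an appropriate statistical comparison.

```python
def interpret_devaluation(extinction_reduced: bool, reacquisition_reduced: bool) -> str:
    """Classify post-devaluation results following the logic of Table 1.1.

    Both arguments are hypothetical summaries of whether paired (devalued)
    animals responded less than unpaired controls in each test.
    """
    # Without reduced reacquisition responding there is no evidence that the
    # aversion-pairing worked, so extinction data cannot be interpreted.
    if not reacquisition_reduced:
        return "uninterpretable"
    # Devaluation was effective: extinction responding reveals the controller.
    if extinction_reduced:
        return "sensitive (goal-directed)"
    return "insensitive (habitual)"


# Condition B of Table 1.1: no extinction difference, reduced reacquisition.
print(interpret_devaluation(extinction_reduced=False, reacquisition_reduced=True))
```

Checking reacquisition first mirrors the reasoning in the text: the reacquisition result validates the devaluation itself, and only then does the extinction result speak to R-O versus S-R control.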


1.6.2.5 Intravenous Rewards and Devaluation

A particular challenge is faced when considering devaluation paradigms in the study of habitual responding for drug rewards, especially for drugs administered intravenously. In satiety paradigms, both “satiated” (devalued) and “non-satiated” (non-devalued) conditions would be needed for comparison (Dickinson, 1985), and as such the “satiated” condition would need to be “pre-loaded” with drug prior to the extinction session. However, potential behaviorally-activating drug effects are known to exist, presenting obstacles to extinction test interpretation (Balleine & O’Doherty, 2010). Drugs of abuse can alter both general locomotor activity (potentially causing a non-specific increase in responding) and drug-seeking behavior (potentially elevating drug-seeking responses). Cocaine provides a simple example of each: acute cocaine injections have been shown to dose-dependently increase general activity in rats (Antoniou et al., 1998), and have been demonstrated in “reinstatement” paradigms to elevate previously extinguished cocaine-seeking responses (De Vries et al., 1998).

Like cocaine, nicotine injections can also induce the reinstatement of nicotine-seeking (Shram et al., 2008), and also alter general locomotor activity (albeit in a complex relationship involving both depressant and stimulatory effects) (Stolerman et al., 1995). As such, the same concerns with i.v. cocaine devaluation apply to i.v. nicotine. Due to these potentially confounding issues, satiety devaluations with drug pre-loading do not typically appear in literature studies of habitual vs. goal-directed responding for either i.v. cocaine or i.v. nicotine. To devalue an intravenous reinforcer with potential behaviorally-activating effects, satiety devaluation is unsuitable.

Unlike satiety devaluations, the method of aversion-pairing has seen a few instances of attempted use in the study of habit formation for intravenous drug rewards, including both cocaine (Root et al., 2009) and nicotine (Clemens et al., 2014). Although these studies reported effects of LiCl pairing on cue-induced cocaine seeking and NSA reacquisition following extended training, respectively, both studies also report a lack of devaluation consequences in other experimental conditions. This suggests stringent pairing parameters may be necessary for efficacious aversion-pairing devaluation of intravenous rewards, and this concern will be revisited in Chapter 4.

1.6.3 Manipulations of Instrumental Contingency

1.6.3.1 Omission Schedules

The other major methodological approach to reveal habitual behavior is to examine how long animals take to stop performing a behavior no longer necessary for reward, occurring at a rate inversely proportional to the relative contribution R-O associations make in maintaining that behavior (Balleine & Dickinson, 1998a). For operant responding animals, the most effective way to decrease goal-directed responding is to not only make responses unnecessary for reward, but additionally to make responses counterproductive (Yin & Knowlton, 2006). This is the basis of “omission schedules”, in which the instrumental contingency is completely reversed and operant responses delay non-contingent reward presentations (DeRusso et al., 2010; Mangieri et al., 2014). In omission paradigms, the rate of response decrease in animals of an “omission” condition is compared to that of an experimental control. This control can be a similarly trained omission group with either lesions or responding for a distinct reinforcer (e.g. DeRusso et al., 2010; Mangieri et al., 2014), or else can consist of a condition receiving reinforcements at the same rate as that of the omission condition, but with operant responses inconsequential instead of punished (Yin & Knowlton, 2006).
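The reversed contingency of an omission schedule can be illustrated with a short simulation. The sketch below is a simplified assumption of how such a schedule might be timed, not the procedure of any cited study: a reward is delivered whenever a full omission interval elapses without a response, and every response restarts the interval. The function name and its parameters are hypothetical.

```python
def omission_rewards(response_times, interval, session_length):
    """Return the times at which rewards would be delivered under a
    simplified omission schedule (all times in seconds).

    Assumption: each response resets the omission timer, and a reward is
    delivered (and the timer restarted) each time `interval` seconds pass
    with no response.
    """
    responses = sorted(t for t in response_times if 0 <= t < session_length)
    rewards = []
    timer_start = 0.0
    i = 0
    while True:
        due = timer_start + interval
        if due > session_length:
            break
        if i < len(responses) and responses[i] < due:
            # A response occurred before the reward was due: delay the reward.
            timer_start = responses[i]
            i += 1
        else:
            # The interval elapsed without a response: deliver the reward.
            rewards.append(due)
            timer_start = due
    return rewards


# An animal that never responds earns a reward every 20 s in a 60 s session,
# whereas responses at 10 s and 25 s postpone the first reward to 45 s.
print(omission_rewards([], interval=20, session_length=60))
print(omission_rewards([10, 25], interval=20, session_length=60))
```

Under these assumptions, responding strictly lowers the number of rewards earned, which is what makes responses counterproductive rather than merely unnecessary.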


1.6.3.2 Contingency Degradation

If a salesman previously paid only on commission is now paid a salary regardless of how much they sell, would they continue making sales? If not, their sales would qualify as a goal-directed action, performed only for the pay (Yin & Knowlton, 2006). As non-contingent outcome deliveries have been found to reduce rates of operant responding in goal-directed animals (Balleine & Dickinson, 1998a; Hammond, 1980; Dickinson & Mulatero, 1989; Williams, 1989), whether or not this effect occurs can be employed to diagnose a behavior as being S-R or R-O mediated in “contingency-degradation” paradigms (Balleine & Dickinson, 1998a). Contingency degradation has been used to study the development of habitual responding for food (Bradfield et al., 2013) as well as alcohol (Fanelli et al., 2013; Serlin & Torregrossa, 2015; Shillinglaw et al., 2014).

The major dependent variable of sensitivity to degradation (predicted to be higher in goal-directed animals) is measured through comparing the rate of operant response reductions during “degradation-sessions” (in which non-contingent rewards are presented) to that of a control condition. Typically, this control condition performs similar degradation sessions, but with a distinct variable such as training (Fanelli et al., 2013), reinforcer (Shillinglaw et al., 2014), or lesion (Balleine & Dickinson, 1998a; Bradfield et al., 2013).
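One way such a comparison could be computed is sketched below. The normalization to a pre-degradation baseline and the summary score are illustrative assumptions rather than the measure used by the cited studies, and all function names and response counts are hypothetical.

```python
def percent_of_baseline(baseline_responses, session_responses):
    """Express each degradation session's response count as a percentage
    of the pre-degradation baseline (an assumed normalization)."""
    if baseline_responses <= 0:
        raise ValueError("baseline_responses must be positive")
    return [100.0 * r / baseline_responses for r in session_responses]


def degradation_sensitivity(degraded_pct, control_pct):
    """Mean difference (control minus degraded) in percent of baseline
    across sessions; larger values indicate greater sensitivity."""
    diffs = [c - d for c, d in zip(control_pct, degraded_pct)]
    return sum(diffs) / len(diffs)


# Hypothetical response counts over three degradation sessions.
degraded = percent_of_baseline(200, [150, 100, 50])
control = percent_of_baseline(200, [190, 180, 170])
print(degradation_sensitivity(degraded, control))
```

A score near zero would indicate that responding for the degraded reinforcer declined no faster than in the control condition, as predicted for habitual responding.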

1.6.3.3 Extinction Testing in Contingency Degradation

Animals in contingency degradation paradigms frequently respond for a distinct, non-degraded reinforcer throughout training and degradation sessions as a control (e.g. Bradfield et al., 2013; Dickinson & Mulatero, 1989), and contingency degradation experiments are conventionally concluded by a 10-minute test in extinction conditions as described above (e.g. Bradfield et al., 2013; Fanelli et al., 2013; Serlin & Torregrossa, 2015; Shillinglaw et al., 2014). In contingency degradation paradigms, the purpose of extinction testing is instead to rule out the possibility that specific satiety induced by the non-contingent rewards could have been responsible for any reductions in response rate observed during degradation sessions (Dickinson & Mulatero, 1989; Serlin & Torregrossa, 2015). If animals reduced responding during degradation sessions due to satiety (instead of genuine learning of the new instrumental contingency), it would be expected that their extinction responding (in which satiety processes cannot operate) should be indistinct from that of extinction responding for the control (i.e. non-degraded) reinforcer (Dickinson & Mulatero, 1989). Interpretation of these extinction test results is summarized in Table 1.2. Since the outcome value in the case of contingency degradation paradigms remains unchanged, a reacquisition test following extinction is unnecessary (Balleine & Dickinson, 1998a).

Table 1.2. Interpretation of extinction testing in contingency degradation paradigms for natural rewards (relative to responding for a distinct, non-degraded control reinforcer).

               Degradation Performance    Extinction Performance     Contingency Sensitivity?
Condition A    Reduced responses          No response differences    Uninterpretable (satiety)
Condition B    Reduced responses          Reduced responses          Sensitive

1.6.3.4 Contingency Manipulations and Intravenous Rewards

However, like the literature on satiety devaluations, all previous instances of omission manipulations in the literature have used orally consumed reinforcers (DeRusso et al., 2010; Dickinson et al., 1998; Mangieri et al., 2014; Yin & Knowlton, 2006). With respect to the suitability of omission paradigms to the study of i.v. nicotine, consider that nicotine can produce high rates of responding during extinction (Shram et al., 2008). As such, predicting the likelihood of animals successfully obtaining nicotine reinforcements under an omission schedule is difficult. Furthermore, previous reports using omission schedules use quite rapid omission intervals (i.e. those rewarded lengths of time in which no responses occur), ranging from 20 seconds (DeRusso et al., 2010; Yin & Knowlton, 2006) to 2 minutes (Mangieri et al., 2014). While orally consumed rewards can be non-contingently delivered this frequently to facilitate omission learning with little chance of being consumed to dangerous excess (as satiety processes will limit intake), intravenous infusions do not have this luxury. If nicotine were delivered following such short omission lengths, excessive and aversive drug administration to the animal could ensue. For these reasons, the contingency degradation paradigm is preferable for the study of intravenous rewards.

1.7 The Effects of Drugs on Habits in Animal Models

1.7.1 Specific Acceleration of Habit Formation

A critical development in the study of habits has been the finding that habitual operant responding develops faster for drugs of abuse, such as ethanol, than for naturally rewarding outcomes like food pellets or sucrose solution (Dickinson et al., 2002). This is typically assessed by equivalent operant training for both a drug and a natural reward, before selective devaluation of one or the other. With ethanol, this has been seen following satiety devaluation (Corbit et al., 2012), LiCl devaluation (Dickinson et al., 2002; Mangieri et al., 2012), and under omission schedules (Mangieri et al., 2014). A similar story exists for cocaine as the instrumental reinforcer; in rats concurrently and equivalently trained to respond for cocaine-sucrose solution and lemon-sucrose solution, only performance of the lemon-sucrose response was affected in extinction testing by its specific devaluation (Miles et al., 2003).

Just as training length and reinforcement schedule interact to influence the rate of habit formation, so too does drug-induced acceleration of habit formation interact with these two factors. Corbit et al. (2012) measured the time-course of habit formation for alcohol and sucrose responding under ratio-schedules following several weeks of operant training. While responding for sucrose never became insensitive to satiety devaluation even after 8 weeks, responding for ethanol became outcome-insensitive after 4 weeks of training.

Similarly, drug-accelerated habit formation interacts with training schedule. Mangieri et al. (2012) demonstrated that following limited training on VI schedules, responding for sucrose solution was outcome-sensitive, but responding for ethanol/sucrose solution was not (however, ethanol/sucrose remained outcome-sensitive if trained under a VR schedule instead). Following extended training, VI-trained sucrose responding had transitioned to outcome insensitivity, and so had VR-trained ethanol/sucrose responding. When Mangieri et al. (2014) assessed the ethanol/sucrose-seeking responses of VI and VR trained animals in an omission schedule paradigm, they likewise reported a greater reduction in responding for those animals that had been VR trained.

1.7.2 Generalized Acceleration of Habit Formation

In addition to accelerating habit formation when acting as the operant reinforcer, evidence exists suggesting that exposure to certain drugs of abuse outside of operant sessions can surprisingly accelerate the formation of habitual operant responding for a different, natural reinforcer. A representative example can be drawn from the work of Corbit et al. (2012), in which daily operant sessions for sucrose solution were, in one condition, accompanied four hours later by 30 minutes of non-contingent ethanol access in the animals’ home cage. Following 8 weeks of training, control animals that did not receive ethanol were goal-directed in their sucrose-seeking behavior, but animals that did have ethanol access responded for sucrose habitually. Furthermore, the same study reported a significant negative correlation between the amounts of ethanol individual animals consumed and their sensitivity to devaluation of sucrose, further implicating ethanol exposure in general actions upon the habit system.

This generalized acceleration of habit formation holds true for multiple drugs; Nelson & Killcross (2006) as well as Nordquist et al. (2007) found that experimenter-delivered amphetamine injections (sufficient to produce sensitization) facilitated insensitivity to aversion-pairing of food rewards if given prior to instrumental training. Furthermore, repeated, experimenter-delivered cocaine injections (sufficient to produce behavioral sensitization) prior to instrumental training for food rewards have been shown to render the cocaine-treated rats satiety-devaluation insensitive in their food responding, relative to control (Corbit et al., 2014; LeBlanc et al., 2013).

Although this specific effect has not yet been examined for nicotine, a similar finding detailed how subcutaneous nicotine injections prior to sucrose solution self-administration sessions not only increased the apparent reinforcing effects of sucrose, but furthermore increased the sucrose cue-reactivity of animals that had been previously trained in that fashion (Grimm et al., 2012).

1.7.3 Drugs and the Neural Substrates of Goal-Directed and Habitual Behaviors

1.7.3.1 Common Substrates for Natural and Drug Rewards

The validity of behavioral distinctions between goal-directed and habitual behavior has been greatly augmented by the findings that two functionally and anatomically distinct neural circuits specifically mediate and are necessary for each of the two behavioral control types (Balleine et al., 2009; Yin & Knowlton, 2006). Furthermore, comparable research of goal-directed actions and habitual behaviors in human subjects has found corresponding neural activity in regions of cortex and striatum homologous to those implicated in rats in the animal literature (Balleine & O’Doherty, 2010).


As an aside, within each network responsible for goal-directed or habitual behaviors, the acquisition and expression of either response style may be distinctly mediated. For example, if manipulation of a brain region prevents the acquisition, but not expression, of goal-directed responding, that is not to say that general operant responding will be abolished; rather, it indicates that goal-directed behavior can still be expressed if the manipulation is done post-training, but prior to testing (Ostlund & Balleine, 2005).

1.7.3.2 Neural Substrates of Goal Directed Behavior

The medial subdivision of the dorsal striatum is referred to as the dorsomedial striatum (DMS) in rats, homologously as the caudate nucleus in humans, and is occasionally termed the “associative striatum” in reference to its connectivity (Balleine & O’Doherty, 2010; Yin et al., 2005a).

The DMS is one of two regions essential for goal-directed actions, as lesions of this region prevent the acquisition as well as expression of goal-directed learning; any subsequent testing will always indicate S-R associated behavior. Both pre-training as well as post-training excitotoxic lesions of the posterior DMS (pDMS) abolish outcome sensitivity in both satiety-devaluation and contingency degradation procedures for natural food and sucrose rewards, and this effect is replicated with transient muscimol-inactivation instead of lesions (Balleine & O’Doherty, 2010; Yin et al., 2005b).

Importantly, the DMS not only mediates goal-directed responding for natural rewards, but for drug rewards as well. Murray et al. (2012) demonstrated that DA transmission blockade by α-flupenthixol in the pDMS, but not regions related to habit (discussed below), was able to dose-dependently reduce cue-controlled cocaine seeking following limited training of experimental animals, but not in those extended-trained. Fanelli et al. (2013) used extracellular electrophysiology to assess DMS activity during alcohol self-administration, finding that the firing patterns of neurons within the DMS were typically time-locked to the delivery of alcohol reinforcements. Finally, Corbit et al. (2012) demonstrated that transient inactivation of the DMS with GABA agonists prior to extinction testing greatly attenuates the alcohol-seeking responses made, but only if performed at a timepoint in training when responding for alcohol is still outcome-sensitive.

Finally, evidence also exists for a role of the dorsal striatum in human habit learning. Tricomi et al. (2009), who demonstrated habits in human participants, conducted their experiment while participants were scanned with functional magnetic resonance imaging (fMRI), observing significant increases in task-related cue sensitivity in the right posterior putamen, a human homolog of the rodent DLS (Balleine & O’Doherty, 2010).

1.7.3.3 Neural Substrates of Habitual Behavior

As the DMS is crucial for goal-directed behaviors, so is the dorsolateral striatum (DLS) for habitual behavior, with the primate homolog of the rodent DLS commonly known as the putamen (Balleine & O’Doherty, 2010). Yin et al. (2004) demonstrated that pre-training lesions to the DLS resulted in animals that were sensitive to LiCl devaluation despite extensive training, whereas sham-lesion controls performed habitually. The same study reported that the same manipulation done to the DMS did not alter habitual responding following the extended training paradigm. It was subsequently demonstrated that transient DLS inactivation prior to omission training also rendered omission performance sensitive to the outcome contingency (Yin et al., 2006). Taken together, these demonstrate that animals respond in a goal-directed fashion when the activity of the DLS is disrupted, and also that the DLS specifically mediates habitual (but not goal-directed) behaviors.

This illustrates how goal and habit systems exist in parallel; both R-O and S-R behaviors can be maintained by either respective system alone if the other system is disrupted (Smith & Graybiel, 2014).


To parallel findings with the DMS above, evidence exists to suggest that the DLS is also responsible for habitual responding for drug rewards. Extracellular electrophysiology conducted by Fanelli et al. (2013) found that firing patterns of neurons in the DLS were typically time-locked to the lever responses made by animals responding for alcohol rewards (as opposed to DMS neuron firings, time-locked to reinforcements). Murray et al. (2012) found that following extended training of experimental animals, DA transmission blockade by α-flupenthixol dose-dependently blocked cue-controlled cocaine seeking if administered into the DLS, but was no longer successful in doing so if administered in the DMS. In more explicit demonstrations, Zapata et al. (2010) found that when the DLS was transiently inactivated with lidocaine, habitual i.v. cocaine seeking behavior under a chained-schedule paradigm was reverted to being goal-directed; more traditionally, Corbit et al. (2012) demonstrated that transient inactivation of the DLS with GABA agonists prior to testing was capable of restoring outcome-sensitivity in animals habitually responding for alcohol. Notably, Clemens et al. (2014) report that extended NSA training increased c-Fos expression in the DLS, suggesting that nicotine too may act on these substrates.

1.8 Habits, Addiction, and Nicotine

1.8.1 The Hijack of Habit

As illustrated by the previous section, the development of habits with repetition seems to be mediated by a general shift in behavioral control from those neural networks responsible for goal-directed actions to those for habitual responses, especially within subregions of the striatum (Belin-Rauscent et al., 2012). Importantly, this intra-striatal shift has been demonstrated to occur regardless of whether the relevant reward is natural or artificial, suggesting the same networks responsible for the formation of natural reward-seeking habits may also mediate drug-seeking habits (Belin et al., 2009; Belin et al., 2013; Everitt & Robbins, 2013; Smith & Graybiel, 2013; Yin & Knowlton, 2006).

Yet when considering the proposition that the formation of drug-seeking habits contributes to compulsive drug use, the obvious question that arises is whether or not S-R habits really become compulsive merely by virtue of being automatic. As Robinson & Berridge (2008) point out, strong S-R habits such as tying shoes or brushing teeth are not performed compulsively by most people, even after being performed several thousands of times. However, current conceptual accounts of drug addiction have taken note of the unique effects drugs of abuse have on S-R mediated behavior outlined in the preceding section, to propose that the chronic self-administration of drugs could distinctly potentiate habitual S-R drug-seeking (Belin et al., 2013; Everitt et al., 2008; Everitt & Robbins, 2005; Everitt & Robbins, 2013). More specifically, these authors have hypothesized a potential mechanistic basis for drugs to aberrantly engage the striatal circuitry responsible for habit formation in the dopamine-dependent “spiraling” striato-nigro-striatal serial connectivity linking the ventral and dorsal striatum. Regions of the ventral striatum (including the NAc) not only make reciprocal connections with the regions from which they receive dopaminergic innervation, but also send dopaminergic projections to more dorsal striatal regions, which do the same in turn (Belin et al., 2013; Belin-Rauscent et al., 2012; Everitt & Belin, 2008; Everitt & Robbins, 2013). The notion is supported by the finding that disconnection of ventral and dorsal striatum (i.e. the NAc & anterior DLS, respectively) through contralateral lesions was able to selectively decrease cocaine-seeking behaviors in extensively trained rats, without compromising normal operant responding (Belin & Everitt, 2008).

Drug-induced sensitization of this circuitry could allow drug-associated stimuli to trigger abnormal “DA overflow” in the DLS, potentially causing failures of control uncharacteristic of the typical S-R habits that constitute everyday routine (Belin et al., 2009; Everitt & Belin, 2008; Everitt & Robbins, 2005; Schneck & Vezina, 2012). Belin et al. (2013) have argued that this drug-induced sensitization of striatal circuitry could reflect the pathological formation of “maladaptive incentive habits”, standing in contrast to the normal habit formation which can occur even for natural rewards.

1.8.2 Habits in Compulsive Drug Use

It has been speculated that “maladaptive incentive” drug-seeking habits could contribute to addiction even after the subjective effects (e.g. ‘rush’ and euphoria) that contributed to their formation have abated (Belin et al., 2013; Robbins & Everitt, 1999). Habits have the defining characteristic of being insensitive to the outcome of their execution, and drug-seeking despite negative consequences (akin to devaluation) is a hallmark of drug addiction (Vanderschuren & Everitt, 2005). Similarly, the DLS has been demonstrated in rats to mediate continued cocaine-seeking in the face of seeking-contingent foot shock (Jonkman et al., 2012). The transition from recreational to compulsive use in human drug addiction could be similarly accompanied by such abnormal shifts in the neural loci of behavioral control (Yin & Knowlton, 2006; Belin et al., 2009; Belin et al., 2013).

The most insidious implication carried by the maladaptive incentive habit hypothesis is that users trying to remain abstinent may fall prey to compulsive use even when drugs are not consciously desired; with enhanced capability to bypass inhibitory executive control systems, non-declarative drug-seeking impulses triggered by drug-associated cues may be capable of initiating drug use even in the absence of explicit perceptions of drug craving (Belin et al., 2013).


1.8.3 A Role for Habit in Nicotine Dependence?

While the consumption of tobacco cigarettes colloquially enjoys the repute of being a “bad habit”, many aspects of the act of smoking are quite consonant with the determinants of habit formation listed above. To illustrate, Djordjevic et al. (2000) characterized the smoking habits of 133 participants, and reported that these smokers consumed on average roughly 15 cigarettes per day, taking approximately 12 puffs per cigarette; in one day, then, a smoker may repeat the behavior associated with nicotine delivery around 180 times. Furthermore, evidence from studies of human subjects alludes to a role for automaticity in smoking; for example, experienced smokers expend less cognitive load on smoking compared to novice smokers, demonstrated by faster reaction times on tasks concurrently performed while smoking (Baxter & Hinson, 2001; Field et al., 2006). More convincingly, fMRI results from exposing smokers to smoking cues have revealed not only activity in brain regions known to mediate motivational reward, but also in brain regions related to tool use knowledge and action-representation, with activity in this second set of regions correlating with the severity of smokers’ nicotine dependence (Yalachkov et al., 2009; Yalachkov & Naumer, 2011).

Despite not only the above but also the widespread folk belief that nicotine is “habit forming”, preclinical assessment of nicotine’s capability to form habits using the traditional behavioral assays for distinguishing goal-directed vs. habitual behavior is sparse. A transition of nicotine-seeking responses from goal-directed to habitual behavioral control has not yet been conclusively demonstrated, despite carrying important implications. Should nicotine be shown to facilitate the transition of behavior to S-R control (as do ethanol and cocaine), not only would this open to consideration potential avenues for the treatment of nicotine dependence through the corticostriatal circuitry responsible for habit formation, but it would also further implicate S-R learning processes (and their neural substrates) as contributors to the compulsive patterns of drug use which characterize substance dependence.

CHAPTER 2: Purpose of Investigation

Habit is the development of behavioral automaticity, likely to facilitate the efficient performance of repetitive behaviors conducive to survival and reproduction. Actions performed for valued outcomes gradually become habits with repetition. Operationally, habits are defined as instrumental actions that are insensitive to reductions in the value of the operant reward, as well as insensitive to reductions in the instrumental contingency (i.e. the necessity of responses to obtain the operant reward). If both of these criteria are met, control of instrumental action qualifies as “habitual”; otherwise, it is considered “goal-directed”.

Drugs of abuse may hijack neurocircuitry relevant for natural habit formation, turning drug seeking and taking into uncontrollable maladaptive impulses that contribute to compulsive drug use.

Animal models of operant self-administration have demonstrated that drug-seeking behaviors for orally consumed alcohol and cocaine not only proceed from goal-directed to habitual control, but also that these drugs themselves facilitate the rapidity of habit formation.

Like alcohol and cocaine, nicotine may also promote habit formation. Oral self-administration of nicotine, though, has limited utility as an animal model. At the time of writing, a single literature report (Clemens et al., 2014) has demonstrated outcome-insensitive intravenous nicotine seeking following extended training, but was unsuccessful in devaluation (and outcome-sensitivity assessment) following limited training. Although various methods exist to manipulate outcome value and instrumental contingency, the two most appropriate for use in the context of intravenous drug self-administration are, respectively, aversion-pairing and contingency degradation.

It was hypothesized that intravenous nicotine self-administration would proceed from goal-directed to habitual control, and that nicotine-seeking habits would form rapidly compared to the time-course seen for natural reinforcers. As a first investigative step, we therefore sought to examine the outcome- and instrumental contingency sensitivity of nicotine-seeking responses using training conditions that optimized the likelihood of retaining goal-directed behavioral control; i.e. the bare minimal training requirements necessary for stable operant responding.

The purpose of the two experiments presented in this investigation can be expressed as the following questions:

1) Is intravenous nicotine self-administration susceptible to devaluation through LiCl aversion-pairing, and can nicotine seeking behavior following limited training be demonstrated as outcome-value sensitive?

2) Is intravenous nicotine self-administration behavior following limited training sensitive to reductions in the instrumental contingency, as assessed by a contingency degradation procedure?

The two experiments conducted each address one of these questions. Their results additionally evaluate the appropriateness of the methodological approaches chosen for assessment of goal-directed behavior for intravenous rewards, alongside the primary demonstration of whether intravenous nicotine seeking, produced by minimal training, is under goal-directed or habitual control.

CHAPTER 3: General Materials and Methods

3.1 Subjects

3.1.1 Animals & Housing

All experimental procedures described henceforth were conducted in full compliance with Canadian Council on Animal Care guidelines, and approved by the local Animal Care Committee of the Centre for Addiction and Mental Health. Adult male Sprague-Dawley rats weighing between 200-250g (~50 days of age) were purchased from Charles River Laboratories (QC, CAN). This strain was chosen due to ready acquisition of i.v. NSA under limited access conditions (e.g. Donny et al., 1995), as well as maintenance of the behavior across a wide range of doses relative to other strains (Shoaib et al., 1997). Throughout all experiments, animals were housed in a climate-controlled (21±1ºC) facility and maintained on a reversed 12-hour light/dark cycle (lights on from 7pm to 7am). Upon arrival, animals were eartagged and double-housed in plastic shoebox cages (45 x 24 x 20 cm; this and all subsequent dimensions reported as length x width x height) lined with ¼” corncob bedding and paper roll enrichment (Enrich-o’Cobs®; The Andersons, Inc., OH, USA) for a 1-2 week acclimatization period with ad lib food (5001; LabDiet®, St. Louis, MO, USA) and water access. Animals were then single-housed and food-restricted to a maximum of 5 pellets (~20-25g) of chow each day for the remainder of each experiment. In all experiments, animals received their daily feed at least 2 hours following the final nicotine or saccharin operant session that day to minimize interference with operant performance (Donny et al., 1998).

3.2 Intravenous Jugular Vein Catheterization

3.2.1 Catheter Construction

Catheters were assembled in-house prior to surgery. For pliancy, silicone tubing (Silastic® Medical-Grade Tubing, 0.3mm internal-diameter (ID) x 0.64mm outer-diameter (OD); Dow-Corning, Midland, MI, USA) was trimmed to a length of 30mm for insertion and direct catheterization of the jugular vein. For durability across the subcutaneous passage between the ventral catheterization site and the dorsal location of the catheter port, Silastic tubing was attached to 65mm of thicker and relatively inelastic polyethylene tubing (PE10; 0.28mm ID x 0.61mm OD; BD-Canada, Mississauga, ON, CAN), in turn attached to 170mm of PE20 (0.38mm ID x 1.09mm OD; BD-Canada).
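The tubing dimensions above allow a rough estimate of the catheter's internal volume, which is the "dead space" that a priming infusion must fill before drug reaches the bloodstream. The following is a minimal sketch only: it treats each segment as a plain cylinder at its full stated length and ignores overlap at the joints, the flange, and any trimming during surgery, so the result is an approximation rather than a measured value. The function names are illustrative.

```python
import math

# Segment dimensions as reported in the text:
# (name, internal diameter in mm, length in mm).
SEGMENTS = [
    ("Silastic", 0.30, 30.0),
    ("PE10", 0.28, 65.0),
    ("PE20", 0.38, 170.0),
]


def lumen_volume_ul(inner_diameter_mm, length_mm):
    """Volume of a cylindrical lumen in microlitres (1 mm^3 equals 1 uL)."""
    radius_mm = inner_diameter_mm / 2.0
    return math.pi * radius_mm ** 2 * length_mm


def total_dead_space_ul(segments=SEGMENTS):
    """Sum of segment lumen volumes; joint overlap and trimming are ignored."""
    return sum(lumen_volume_ul(d, length) for _, d, length in segments)


if __name__ == "__main__":
    for name, d, length in SEGMENTS:
        print(f"{name}: {lumen_volume_ul(d, length):.1f} uL")
    print(f"Total: {total_dead_space_ul():.1f} uL")
```

Under these simplifying assumptions the total comes to roughly 25 uL, most of it contributed by the PE20 segment.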

Silastic-PE10 joints were created by insertion of roughly 7mm of PE10 tubing into xylene-dilated Silastic tubing, and then secured by the heating of a 6mm length of heat-shrink (1.19mm ID) centered directly upon this overlap. The lumen of the assembly was kept open during heating by threading the Silastic-PE lumen with 32G suture wire (World Precision Instruments, Sarasota, FL, USA). Similarly, a 30G wire (World Precision Instruments) was threaded through the PE20 and into the PE10, such that the adjacent ends of the Silastic-PE10 assembly and the PE20 could be heat-welded into a bead joint, to form a continuous length of tubing.

The catheter access port was fashioned by feeding the free end of the PE20 segment through a rigid, machine-drilled nylon bolt (19.05mm length, 6-32 thread size), which gave catheters structure and stability during handling. This end of the PE20 was next obturated with a 30G wire extending from a 23G blunted needle (BD-Canada), and heated into a flange, which centered the position of a final 16mm piece of heat-shrink. Heating formed a fluid-tight junction between the PE20 and heat-shrink, into which blunted 23G needles could be later inserted to deliver solutions through the catheter.

Cyanoacrylate “super” glue (Loctite®; Henkel Canada Corp., Brampton, ON, CAN) was applied to the heat-shrink segment secured overtop the PE20 to permanently affix the bolt and tubing assembly. Two permanent bends were induced in the catheter tubing by immersion into 90-100ºC water. The first bend, a 180º turn in the PE10 just prior to the PE10-Silastic joint, allowed insertion of the catheter into the jugular vein with flow directed toward the heart. The second, a 90º bend made in the PE20 adjacent to the nylon bolt, ensured that PE20 would rest flat when subcutaneous at the dorsal incision.

A small oval (roughly 30mm across the long axis) of surgical mesh (Marlex Mesh, BARD Cardio-surgery Division, Billerica, MA, USA) was affixed perpendicular to the nylon bolt at its head with polymethyl methacrylate (Teets Cold Curing Denture Cement; A-M Systems Inc., Sequim, WA, USA) as the final step of catheter construction. This mesh anchored the port of the catheter to dorsal connective tissue during surgical recovery. Catheter patency and integrity were verified through pressure testing both during construction and immediately prior to surgical implantation.

3.2.2 Surgical Procedures

3.2.2.1 Surgical Equipment

Prior to surgery, consumables such as surgical drapes, gauze, swabs, and sutures (4-0 & 5-0 silk sutures, Surgical Specialities, Reading, PA, USA) were sterilized via autoclave. Non-autoclavable materials, such as catheters and delicate metallic instruments, were sterilized via 20-minute soak (as well as flush, in the case of catheters) with 1.5% (w/v) benzalkonium chloride (Sigma-Aldrich, Oakville, ON, CAN) dissolved in distilled water. Instruments sterilized in this way were rinsed in sterile saline and placed on a sterile surface prior to surgical use, and catheters were similarly flushed with sterile saline and allowed to soak in a beaker of the same.

Sterile surgical fields were prepared by sanitizing stainless steel surfaces of the surgical suite with the benzalkonium chloride solution, followed by placement of an absorbent sheet (Dri-Sorb® Underpads; Domtar Personal Care, Raleigh, NC, USA) upon which sterile surgical drapes were laid.

3.2.2.2 Surgical Anesthesia

For the induction and maintenance of an appropriate depth of anesthesia prior to surgical preparation or catheter implantation, animals were sedated with a combination of 3-5% inhalational Isoflurane (99%USP; Halocarbon Products Corp., River Edge, NJ, USA) and oxygen, with gas ratios adjustable through a vaporizer (53-T3ISO; Benson Medical Industries Inc., Markham, ON, CAN) and delivered in a closed-air system. Onset of anesthesia was prompted through immersion of animals in a steady flow of anesthetic (4.5%) inside of a sealed induction chamber (internal dimensions 23cm x 10cm x 10cm). Following anesthetic onset and preliminary surgical preparation, animals were maintained at roughly 3% delivered through nose-cone. Throughout surgery, subjects’ respiration was monitored for anomaly and the anesthetic gas ratio titrated as appropriate.

3.2.2.3 Surgical Preparation

Intravenous jugular catheterization required that two incisions be made: a dorsal incision for positioning of the catheter port, and a ventral incision to access the jugular vein for catheter insertion. Following the onset of anesthesia, fur around the incision sites was shaved using an electric razor armed with #40 clipper blades. For long-lasting local analgesia through nerve blockade, 1-2 mg/kg (0.125%) bupivacaine (Marcaine™) was injected as a s.c. infiltrate at each of the two incision sites (0.1mL per site). Sterile gauze was used to sequentially scrub the incision sites with Betadine surgical scrub (Purdue Fredrick, Pickering, ON), 70% (v/v) ethanol, and 10% USP povidone-iodine solution (Proviodine®; Teva Canada - OTC, Mirabel, QC, CAN). Prior to incisions being made, animals were additionally administered the following subcutaneous (s.c.) injections: 5mg/kg ketoprofen (Anafen®) was delivered as a general nonsteroidal anti-inflammatory analgesic; as an antibiotic, animals received 0.1mL Derapen SQ/LA®; and a single 3mL injection of 0.9% saline was delivered in order to replace any fluid lost during surgery. As a final consideration, ophthalmic ointment (Lacri-Lube®; Allergan Inc., Markham, ON, CAN) was applied to the eyes of animals to prevent corneal drying.

3.2.2.4 Catheter Implantation

Following surgical preparation, depth of anesthesia was verified via absence of withdrawal reflex. With the anesthetized animal positioned in ventral recumbency, a roughly 1cm transverse incision was made between the scapulae. A subcutaneous pocket was cleared through blunt dissection, which would later accommodate the catheter’s mesh assembly as well as excess PE tubing. The dorsal incision was then protected with a layer of sterile gauze and the animal transferred to dorsal recumbency.

An oblique incision made rostral to the clavicle on the animal’s ventral side allowed blunt dissection of the superficial muscle layer to expose the jugular vein. A section of vein was stripped of fascia and isolated via a locating suture. Needle drivers were used to tunnel a subcutaneous passage from dorsal to ventral incisions and grasp a large-diameter polyethylene tube, retracted from the ventral through the dorsal to create a passage for the Silastic end of the catheter to be guided through. The trocar was removed following proper positioning of the catheter tip such that the ventral catheter protrusion lay flat.

Ligatures of 5-0 surgical thread were used to manipulate and isolate the vein as an incision spanning two-thirds of the vein’s diameter was made, through which the Silastic catheter tip could be fed. The Silastic portion of the catheter was completely inserted such that the heat-shrink joint with PE10 contacted the vein, and patency was verified by extraction of venous blood backward through the catheter before infusion of a small amount of sterile saline. Heat-shrink at the Silastic-PE10 joint was sutured at either end to the jugular vein with 5-0 suture silk. Cyanoacrylate secured that same joint to underlying tissue prior to closure of the superficial muscle layer with interrupted 5-0 silk sutures, and closure of the dermal layer with interrupted 4-0 silk sutures.

The animal was returned to ventral recumbency, and excess PE20 was fed subcutaneously such that it lay flat and centered between the scapulae. The catheter mesh assembly was fed in likewise, such that the mesh lay flat and superficial to all tubing. The dorsal incision was closed with interrupted 4-0 silk sutures, with tissue securely fitted around the protruding catheter port so as to leave no gaps through which foreign debris could irritate subcutaneous tissue.

A topical antibiotic powder (Cicatrin®; Glaxo Wellcome, Picking, ON, CAN) was applied to closed incision sites to prevent infection and speed recovery before animals were placed on a thermoregulated heating pad (VL-20F; Fintronics Inc., Orange, CT, USA) to be monitored while recovering from anesthesia. Once fully ambulatory, animals could be returned to the transport cart or home cage where they were monitored daily to assess recovery.

3.2.2.5 Maintenance and Verification of Catheter Patency

Immediately following surgery, the catheter port was capped to prevent contamination, using a 15mm long Silastic tubing plug that had one end sealed with cured liquid silicone adhesive (Dow-Corning). Catheters remained continually capped whenever animals were not receiving infusions or attached to drug lines. In order to maintain catheter patency, catheterized animals received intravenous (i.v.) infusions of heparin sodium salt (Sigma-Aldrich), an anticoagulant, dissolved in 0.9% saline and filtered to maintain sterility (see section 3.3.3). Dosage was 25 USP units/0.05 mL at least once daily, beginning two days following surgery until the end of the experiment. Fresh heparin solution was prepared every 3 days, and stored at 2-8ºC when not in use.

On days in which animals performed IVNSA sessions, heparin was administered both prior to and immediately following the operant session. For cases in which animals would go longer than 24h without experimenter contact (i.e. weekends), the final heparin infusion on the preceding day was substituted by a heparin/dextrose lock solution, previously demonstrated to maintain catheter patency for extended periods. This solution contained heparin at a concentration of 500 USP units/mL in a 25% (w/v) dextrose solution (Hospira Healthcare Corporation, Saint-Laurent, QC, CAN). Like daily heparin solutions, lock solution was stored at 2-8ºC when not in use.
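As a quick arithmetic check on the heparin figures above, the sketch below converts the daily flush dose (25 USP units in 0.05 mL) to a concentration and estimates the total flush volume needed per day. The function names, the group size, and the number of infusions per day used in the example are illustrative assumptions, not values taken from the protocol.

```python
def concentration_units_per_ml(units, volume_ml):
    """Concentration (USP units/mL) of a dose delivered in a given volume."""
    return units / volume_ml


def daily_flush_volume_ml(n_animals, infusions_per_day, volume_per_infusion_ml=0.05):
    """Total heparin solution (mL) drawn per day across a group of animals."""
    return n_animals * infusions_per_day * volume_per_infusion_ml


if __name__ == "__main__":
    # Daily flush dose stated in the text: 25 USP units per 0.05 mL.
    print(concentration_units_per_ml(25, 0.05))
    # Hypothetical example: 16 animals, flushed before and after a session.
    print(daily_flush_volume_ml(16, 2))
```

By this arithmetic the daily flush works out to 500 USP units/mL, the same heparin concentration stated for the lock solution, so the two solutions would differ in vehicle (saline vs. 25% dextrose) rather than in heparin concentration.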

To assess catheter patency following completion of any experimental protocol, animals were intravenously infused with 2-4mg thiopental (Thiotal; Vétoquinol, Lavaltrie, QC, CAN) at a concentration of 20mg/mL. Thiopental is a rapid-onset, short-acting barbiturate; successful venous delivery, indicative of catheter patency, induced a transient loss of muscle tone and righting reflex that subsided in 3-4 minutes. Animals that did not show any pronounced response to thiopental challenge were considered to have compromised drug delivery, and had their data excluded from analysis.

3.3 Apparatus

3.3.1 General Procedures

3.3.1.1 Operant Chambers

All operant sessions conducted in this thesis work took place across forty-two modular operant test chambers (ENV-008CT, Med Associates Inc., St. Albans, VT, USA) with interior dimensions of 30.5cm x 24.1cm x 21.0cm. Twelve chambers were used in food training, fourteen chambers used for saccharin SA, and sixteen chambers were used for i.v. NSA. Each chamber was individually housed in a fan-ventilated and sound-attenuating enclosure. Inside the chamber, animals stood on stainless steel grid floors suspended above a stainless steel waste pan lined with corncob bedding. Widthwise walls were divided into three vertical aluminum channels each, for the insertion of modular components.

The specific components outfitting the modular walls of each operant chamber varied depending on experiment, and will be discussed subsequently as deviations from the following schematic. Typically, an operant chamber would contain two retractable levers in the left and right channels of a single modular wall, mounted 6cm above the floor grid. Above each retractable lever was mounted an LED cue-light at an elevation of 10cm above the floor grid, capable of circumstantial illumination (e.g. upon certain reinforcements). Additionally, a tone generator (2900Hz) was located at the top of the leftmost channel, above the cue light. The centre channel of the opposite wall contained an incandescent house light 18cm above the floor grid, near the top of the chamber. The house light was consistently illuminated during operant sessions except during “time-out” (TO) periods, during which additional reinforcement was unavailable. Unless otherwise stated, featureless aluminum panels occupied the remaining positions available in modular channels such that both widthwise walls were solid and without gaps. At the beginning of any given operant session, retractable levers were extended, and retracted upon session completion. Operant boxes were controlled using experimenter-written programs coded in MedState Notation, running on MED-PC® software (SOF-735, Med Associates). Software and hardware components were connected using a PCI Interface Package (MED-SYST-16, Med Associates).

3.3.1.2 Operant Food Training

Following the acclimatization period and transfer of the animals to single housing, there was an additional 1 week period in which subjects received 23 hours of operant training for 45mg sucrose pellet rewards (F0042 Dustless Precision Pellets; Bio-Serv, Flemington, NJ, USA). This was conducted in order to facilitate the association between operant lever pressing and the availability of reinforcements (see Clemens et al., 2010; Garcia, Lê & Tyndale, 2014). The twelve operant chambers utilized for food training were distinct in that a food trough was present in the center channel between the retractable levers, inside of which sucrose pellets dispensed by a hopper under MED-PC control would be collected and available for the animal’s consumption.

Training was distributed across two sessions, one of 7 hours (“day” session) and the other 16 hours (“night” session) in duration. Animals pressed on one of two available levers to obtain a single 45mg sucrose pellet per reinforcement, up to a maximum of 400 or 600 per day or night session, respectively, with attainment of maximum reinforcements terminating a session prematurely. The reinforced (or “active”) lever delivered reinforcements on an FR1 schedule, with no timeout period and no accompanying stimuli (i.e. cues). Responses made on the other lever (the “inactive” lever) were recorded but inconsequential for the animal. Prior to food training, animals were evenly divided between having the left or right retractable lever within all operant chambers assigned as the “active” lever. Animals completed one “day” and one “night” session over the course of a food training week, but would never run sessions consecutively. During food training sessions, water was continuously available through a 100mL bottle (#9010; Bio-Serv) mounted to the operant chamber door. Water was replaced between sessions if soiled or nearing empty, but otherwise routinely each day.

Following intravenous catheterization as well as a one-week period of surgical recovery, animals were returned to the food training boxes for a “consolidation” session. This session lasted for either 1 hour or until 100 reinforcements were earned, whichever came first. Successful performance in consolidation sessions verified that normal operant responding in post-surgical animals was intact. Following completion of this consolidation session, animals continued on to begin specific experimental procedures.

3.3.1.3 Limited Access Procedures

For situations in which animals required exposure to saccharin solution outside of the operant context, a limited access procedure (LAP) was employed; this involved placing each animal individually into wire cages (30 x 20 x 15 cm) equipped with a Richter tube containing up to 16mL of saccharin solution. Animals were permitted to drink freely before removal after an experimentally-contingent period of time, and the fill-volume differences of the Richter tubes from before and after the session were used to calculate the amount of solution consumed by each animal.

LAP drinking could not occur in sound-attenuated chambers, as in operant sessions. To minimize environmental impacts on drinking behavior, LAP sessions consistently occurred with the room lights off, but with dim illumination provided by a lamp with a red bulb (40W). White noise was generated to mask any potential impact of variable outside sounds on the animals.

3.3.2 Orally Consumed Saccharin Self-Administration

For experiments in which animals consumed saccharin solution, saccharin sodium salt hydrate (Sigma-Aldrich) was prepared fresh daily at a concentration of 0.1% weight/volume (i.e. 1g/1000mL) by dissolution in distilled water. Prepared solutions were loaded into 20mL (BD-Canada) syringes to arm single-speed syringe pumps (PHM-100; Med Associates). Syringes were attached with Luer-Lok connections to 22-gauge (22G) blunted needles, which connected Tygon™ tubing (0.020” ID, 0.060” OD; Saint-Gobain PPL Corp., Akron, OH, USA) fluid lines to the operant chamber.

Operant chambers utilized for saccharin self-administration were not radically distinct from those previously described for operant food training. Retractable levers, one active and one inactive, extended upon session start and retracted upon session end. In place of a pellet trough (as in food training), the center channel between the retractable levers was instead occupied by a dipper cup, into which solution dispensed from syringe pumps would collect and be available for consumption by the responding animal. A reinforcement earned in saccharin operant sessions dispensed 0.19mL of saccharin solution over 2 seconds, additionally triggering a 2-second tone, a 2-second illumination of the cue light above the active lever, and disabling the houselight for a 30-second “time-out” (TO) period. Active lever responses during TO periods were recorded but inconsequential, and reinforcements were temporarily unavailable for the TO duration.
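The reinforcement rule described above (FR1 with a 30-second time-out) can be sketched as a small scoring function over active-lever press times. This is an illustrative Python reconstruction of the schedule logic only, not the MedState Notation program actually used; the function name is hypothetical, and treating a press that lands exactly at the end of a time-out as reinforced is an assumption.

```python
def fr1_with_timeout(press_times_s, timeout_s=30.0, session_s=1800.0):
    """Score active-lever presses under FR1 with a post-reinforcement time-out.

    press_times_s: press times in seconds from session start.
    Returns (reinforcements, timeout_presses).
    """
    reinforcements = 0
    timeout_presses = 0
    timeout_ends = float("-inf")  # no time-out is running at session start
    for t in sorted(press_times_s):
        if t < 0 or t >= session_s:
            continue  # press falls outside the session window
        if t < timeout_ends:
            timeout_presses += 1  # recorded but inconsequential
        else:
            reinforcements += 1  # FR1: a single press earns the reinforcer
            timeout_ends = t + timeout_s
    return reinforcements, timeout_presses


if __name__ == "__main__":
    # 30-minute saccharin session (1800 s); the press times are made up.
    print(fr1_with_timeout([0, 5, 29.9, 30, 31, 60, 1800]))
```

One consequence of the time-out is a ceiling on reinforcements per session: with a 30-second TO, at most 60 reinforcements can be earned in a 30-minute session.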

As animals were capable of consuming a far greater number of saccharin reinforcements than nicotine infusions, operant sessions for saccharin reinforcers were restricted to 30 minutes in duration to prevent excessive divergence in response rates between reinforcer conditions. After any squad completed a saccharin SA session, fluid left unconsumed in each dipper cup was collected and measured with a syringe. This remainder volume was converted into the equivalent number of reinforcements, and subtracted from the “raw reinforcements earned” by each animal to obtain an “adjusted reinforcements consumed” for purposes of analysis.
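The adjustment described above amounts to a single subtraction, sketched below. The 0.19mL per-reinforcement volume comes from the text; the function name, the choice to keep fractional values rather than round, and the floor at zero are assumptions made for illustration.

```python
SACCHARIN_ML_PER_REINFORCEMENT = 0.19  # volume dispensed per reinforcement (from the text)


def adjusted_reinforcements(raw_earned, leftover_ml,
                            ml_per_reinforcement=SACCHARIN_ML_PER_REINFORCEMENT):
    """Reinforcements actually consumed, given fluid left in the dipper cup.

    The leftover volume is converted to an equivalent number of reinforcements
    and subtracted from the raw count. Fractional results are kept, and the
    result is floored at zero (both are assumptions, not stated in the text).
    """
    unconsumed = leftover_ml / ml_per_reinforcement
    return max(0.0, raw_earned - unconsumed)


if __name__ == "__main__":
    # Hypothetical animal: 40 reinforcements earned, 1.9 mL left in the cup.
    print(round(adjusted_reinforcements(40, 1.9), 2))
```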

3.3.3 Intravenous Nicotine Self-Administration

Nicotine solutions for i.v. NSA were prepared fresh daily. Nicotine bitartrate (Sigma-Aldrich) was dissolved in 0.9% saline solution to achieve a concentration of 0.3mg/mL, expressed as free base. Nicotine solutions were pH adjusted to 7.0 +/- 0.2 using NaOH and/or HCl as titrants, and sterilized by filtration through an Acrodisc® syringe filter (Pall Life Sciences, Washington, NY, USA) with a 0.2μm micropore diameter Supor® membrane into a sterile container (as an aside, all solutions delivered intravenously, such as heparin, were sterilized in this fashion). Sterile nicotine solution was loaded into sterile 10mL syringes (BD-Canada), capable of Luer-Lok connections to 22G blunted needles that gave access to the drug lines of a given operant chamber as above. Each blunted needle was in this case first affixed to Tygon™ tubing that connected to a freely-rotating fluid swivel (PHM-115IP, Med Associates) inside an operant chamber’s enclosure. Swivels were in turn connected to a second length of Tygon™ tubing which terminated in a blunt 22G cannula, designed to fit the access ports of catheterized animals for intravenous infusion.

Fluid swivels allowed subjects rotational locomotor freedom during nicotine operant sessions, by keeping drug lines from becoming tangled. In addition, swivels were attached to a Drug Delivery Arm assembly (PHM-110-SAI, Med Associates), allowing variable tension in the drug line to be maintained by a counterbalancing weight; this increased animals’ flexibility of movement and reduced discomfort. The fluid swivel also served as an attachment point for a metal spring, which encased and shielded the drug line during the course of operant IVNSA sessions. A female-threaded nut was affixed to the end of each spring, and springs were secured to the catheter of each animal by fastening this nut to the male-threaded ridges of a given catheter’s nylon bolt.

Syringes loaded with nicotine solution, after being attached to drug lines, went on to arm variable-speed motor pumps (PHM-100VS, Med Associates). These pumps are capable of dispensing precise solution volumes under control of the MED-PC interface. Animals were weighed daily prior to operant sessions, and animal weights were used to individually set the infusion rate of their respective pump such that nicotine was delivered consistently across animals at 0.03mg/kg/infusion (free base); this dose has been previously demonstrated optimal for establishing and maintaining IVNSA in rats (Corrigall and Coen, 1989). Prior to session starts, a priming infusion was non-contingently experimenter-delivered to the animals; this infusion was always delivered to fill the “dead space” of the catheter (i.e. the length of tubing bridging the access port and bloodstream). Neglecting this priming infusion would result in the first one or two reinforcements earned by subjects being effectively inconsequential; the priming infusion allows for immediate nicotine deposition as soon as the first reinforcement is obtained.
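The weight-based dosing described above can be illustrated with a short calculation. The dose (0.03mg/kg/infusion) and solution concentration (0.3mg/mL) are taken from the text; the function names are hypothetical, and the infusion duration is left as a required argument because it is not specified in this section.

```python
NICOTINE_DOSE_MG_PER_KG = 0.03  # per infusion, free base (from the text)
NICOTINE_CONC_MG_PER_ML = 0.3   # solution concentration, free base (from the text)


def infusion_volume_ml(body_weight_g,
                       dose_mg_per_kg=NICOTINE_DOSE_MG_PER_KG,
                       conc_mg_per_ml=NICOTINE_CONC_MG_PER_ML):
    """Volume (mL) that delivers the target dose to an animal of the given weight."""
    weight_kg = body_weight_g / 1000.0
    return weight_kg * dose_mg_per_kg / conc_mg_per_ml


def pump_rate_ml_per_s(body_weight_g, infusion_duration_s):
    """Pump rate needed to deliver one infusion within a chosen duration.

    The duration is a caller-supplied assumption; this section does not state it.
    """
    return infusion_volume_ml(body_weight_g) / infusion_duration_s


if __name__ == "__main__":
    # Hypothetical 300 g rat.
    print(f"{infusion_volume_ml(300) * 1000:.0f} uL per infusion")
```

At these values the volume scales as 0.1mL per kg of body weight, so a 300g animal would receive about 30 uL per infusion. If the infusion duration is held fixed, that per-animal volume is what the individually set pump rate has to deliver.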

In order to better distinguish the contexts of IVNSA and saccharin self-administration, the configuration of boxes for nicotine self-administration differed slightly from those of food training and saccharin operant sessions. No modular equipment was present in between retractable levers, with that channel instead occupied by featureless aluminum panels. Upon session start, only one of the retractable levers, the “active” lever, was extended. The inactive lever was non-retractable, and located on the central channel of the modular chamber wall opposite the retractable levers. Aside from their position within the operant chamber, however, active and inactive levers for operant IVNSA functioned identically to those for operant saccharin SA. As in operant saccharin SA sessions, a reinforcement earned in IVNSA sessions triggered a compound stimulus: a 0.03mg/kg nicotine infusion, a 2s illumination of the “active” cue light, a 2s tone presentation, and a 30s TO period in which the houselight was extinguished and active presses were recorded but inconsequential. Unlike operant saccharin sessions, operant IVNSA sessions lasted a full hour each.

Prior to the commencement of each of the two major experiments discussed in this thesis, intravenous drug lines were chemosterilized by six-hour occupancy of lines with 2.0%(w/w) hydrogen peroxide solution (Accel HLD5; Virox Technologies Inc., Oakville, ON, CAN). To maintain the sterility of intravenous drug lines throughout experiments, nightly removal of nicotine syringes was followed first by a flush of drug lines with air (to expel any standing solution), followed by 70% EtOH solution. Ethanol syringes remained attached to the drug lines until preparation for use the next day, the intent being for full occupancy of drug lines by EtOH to deter any potential bacterial growth. Lines were purged with air prior to loading of fresh nicotine solution.

3.3.4 Statistical Analysis

All statistical analysis was performed using IBM SPSS Statistics version 21. Unless reported otherwise, those dependent variables measured repeatedly across multiple operant sessions (i.e. days) were analyzed using repeated-measures analysis of variance (ANOVA) with a within-subjects factor of Day. Between-subjects factors were experiment-dependent, and will be addressed individually within each section. Conversely, those dependent variables assessed in single-day comparisons were analyzed using univariate ANOVA. For any analysis performed throughout this thesis work, where Mauchly’s test of sphericity was applicable and returned a significant result, the F values, p values, and degrees of freedom reported are those obtained with the Greenhouse-Geisser correction applied. In all analyses performed, the criterion for significance in any statistical result was defined as a p value of < 0.05.
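For readers unfamiliar with the Greenhouse-Geisser correction, the sketch below computes the epsilon estimate from its standard textbook definition for a single within-subjects factor, and shows how it scales the degrees of freedom. It is a pure-Python illustration with made-up data and hypothetical function names; it is not a reproduction of the SPSS procedure or of any analysis in this thesis.

```python
def greenhouse_geisser_epsilon(data):
    """Greenhouse-Geisser epsilon for a subjects x conditions table.

    data: list of rows, one per subject, each holding k repeated measures.
    """
    n = len(data)
    k = len(data[0])
    # Sample covariance matrix (k x k) of the repeated measures.
    means = [sum(row[j] for row in data) / n for j in range(k)]
    cov = [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in data) / (n - 1)
            for j in range(k)] for i in range(k)]
    # Double-centre the covariance matrix (subtract row and column means).
    row_means = [sum(cov[i]) / k for i in range(k)]
    grand_mean = sum(row_means) / k
    centred = [[cov[i][j] - row_means[i] - row_means[j] + grand_mean
                for j in range(k)] for i in range(k)]
    trace = sum(centred[i][i] for i in range(k))
    # For a symmetric matrix, the sum of squared entries equals trace(M @ M).
    sum_squares = sum(centred[i][j] ** 2 for i in range(k) for j in range(k))
    return trace ** 2 / ((k - 1) * sum_squares)


def corrected_df(epsilon, n_subjects, k_levels):
    """Epsilon-scaled numerator and denominator df for the within-subjects effect."""
    return epsilon * (k_levels - 1), epsilon * (k_levels - 1) * (n_subjects - 1)


if __name__ == "__main__":
    # Made-up scores for 5 subjects across 3 sessions.
    scores = [[1, 2, 4], [2, 3, 9], [3, 5, 6], [4, 4, 12], [5, 7, 8]]
    eps = greenhouse_geisser_epsilon(scores)
    print(eps, corrected_df(eps, n_subjects=5, k_levels=3))
```

Epsilon ranges from 1/(k-1), the maximal violation of sphericity, to 1, where no correction is needed. Because the degrees of freedom are multiplied by epsilon, corrected values are typically non-integer.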

CHAPTER 4: Experiment 1 LiCl Devaluation of Intravenous Nicotine

4.1 Experiment 1: Introduction

4.1.1 Aversion-Pairing to Nicotine

As mentioned earlier, habitual behavior possesses a defining characteristic of insensitivity to changes in outcome value, and the experiment detailed in this chapter pertains to the outcome-sensitivity of nicotine seeking behavior. Outcome-sensitivity is typically assessed through devaluation paradigms, which involve making a formerly appetitive reward unappealing through an experimental manipulation such as sensory-specific satiety, or conditioned aversion through LiCl-reward pairing. The latter of these two methods is the more suitable for the devaluation of drug rewards, as unconditioned drug effects such as behavioral activation may confound extinction testing (Ostlund & Balleine, 2009). As such, aversion-pairing has been used previously in the literature to devalue alcohol (e.g. Dickinson et al., 2002), cocaine (e.g. Miles et al., 2003), and nicotine (Clemens et al., 2014) reinforcers.

However, devaluations by LiCl pairing in the literature have been predominantly associated with orally consumed substances; convergence between gustatory, olfactory, and visceral pathways has been suggested to be requisite for the taste-illness learning that mediates conditioned aversions induced by LiCl (Garcia et al., 1985). As intravenous rewards lack gustatory and olfactory cues, their amenability to LiCl aversion-pairing is more uncertain; pairing of LiCl-induced nausea to an intravenous drug reward must do so through classical conditioning with the interoceptive cues elicited by the drug (Clemens et al., 2014).

Nicotine may be capable of generating such interoceptive cues. Rats can readily discriminate nicotine from saline (Charntikov et al., 2014), and can do so to predict consequent sucrose rewards (Pittenger & Bevins, 2013a). Furthermore, if those sucrose rewards are next devalued with LiCl-induced nausea, nicotine-evoked conditioned responding for sucrose can be attenuated (Pittenger & Bevins, 2013b). Because interoceptive nicotine cues have been shown to be capable of interacting with LiCl-induced nausea in this way, it may be possible for nicotine interoception itself to be associated with LiCl-induced nausea to thus devalue nicotine (Clemens et al., 2014).

At least two studies have previously attempted to devalue intravenous rewards with LiCl.

Root et al. (2009) attempted devaluation of i.v. cocaine via i.v. LiCl pairings. Although i.v. LiCl has been documented to produce conditioned place aversion (Mucha et al., 1982), corroborating evidence demonstrating that i.v. delivered LiCl can produce conditioned taste aversion or otherwise devalue any sort of reinforcer is sparse. Root and colleagues did not report a devaluation effect upon cocaine reacquisition, and unfortunately did not perform any controls to confirm that the i.v. route of LiCl administration can produce efficacious devaluation.

More relevantly, Clemens et al. (2014) used an aversion-pairing paradigm which was previously used to devalue a natural food reward (Nelson & Killcross, 2006) in an attempt to devalue intravenous nicotine. They paired two experimenter-delivered i.v. nicotine infusions with an i.p. injection of 63.6mg/kg LiCl over two minutes, and repeated this procedure in three pairings. They were able to successfully devalue nicotine in one experimental condition, suggesting that aversion-pairing devaluation of nicotine may indeed be possible.

As the results of studies attempting to pair LiCl with an i.v. drug reinforcer for the study of habit formation are therefore not clear cut, the following investigation was conducted to systematically assess the utility of i.p. administered LiCl in the devaluation of i.v. administered nicotine.

4.1.2 Experimental Objectives and Hypotheses

The purpose of this chapter’s investigation may be summarized in two questions:

1. Prior to any experience with nicotine, is it possible for the pairing of i.p. LiCl to i.v. nicotine to diminish nicotine’s value as a reinforcing outcome (Experiment 1A)?

2. Can the same pairing procedure, conducted after training conditions thought to minimize habit formation, devalue nicotine and reveal outcome-sensitive nicotine seeking behavior during extinction and reacquisition testing (Experiment 1B)?

Two experiments were conducted to address these questions. In the first, we examined whether or not pairing LiCl with non-contingently delivered nicotine would affect subsequent acquisition of NSA. If such pairings were effective in devaluing nicotine, we anticipated that acquisition of NSA would be attenuated. In the second experiment, the same LiCl-nicotine pairing procedure took place after NSA had already been acquired and stabilized. Extinction test responding was examined to assess whether nicotine-seeking established by NSA (10 sessions at FR1) was goal-directed, followed by reacquisition testing of NSA to assess whether the devaluation was effective.

For comparative and control purposes, we also examined the effect of LiCl devaluation of saccharin solution on both the acquisition of saccharin self-administration (SSA) (Experiment 1A), as well as on the extinction and reacquisition testing of SSA (Experiment 1B). Saccharin-LiCl pairings were conducted in the same experimental animals who served as the nicotine-unpaired control group.

4.2 Experiment 1: Materials and Methods

4.2.1 General Procedures

4.2.1.1 Experimental Design

A total of 48 animals proceeded through a standard pre-training phase including acclimatization, food training, catheterization, and consolidation as described in Chapter 3. The average number of active lever responses made across all food training sessions was used to assign animals to major experimental groups: Experiment 1A: Pre-Acquisition Devaluation (n=24; group PRE) and Experiment 1B: Post-Acquisition Devaluation (n=24; group POST), such that animals of both experiments had equilibrated active press histories as well as assignments to left and right active levers.

Animals of group PRE underwent the devaluation procedure (described below) prior to an operant acquisition phase, while animals of group POST underwent the same devaluation procedure only after the acquisition phase was completed. The acquisition phase consisted of 10 days of operant training, in which all animals performed 2 separate operant sessions daily. One of these sessions utilized saccharin solution as the reinforcer, while the other utilized intravenous nicotine infusions. To promote goal-directed (i.e. outcome-sensitive) responding, animals were maintained on FR1 schedules of reinforcement for both rewards throughout the entirety of the experiment.

Sessional parameters for nicotine and saccharin operant sessions were consistent with those outlined in Chapter 3 (see sections 3.3.2 and 3.3.3). Animals of group POST additionally performed extinction and reacquisition sessions following devaluation, in a testing phase that terminated the experiment upon completion. Key events of the experimental design are summarized in Figure 4.1.

Figure 4.1. Illustration of key events in Experiment 1. Prior to the acquisition phase, animals in the Pre-Acquisition group (group PRE) were equally divided into nicotine-paired and saccharin-paired conditions and underwent the devaluation procedure. Animals in the Post-Acquisition group (group POST) were similarly divided and devalued, but only after the acquisition phase was complete. Only animals of group POST proceeded to the testing phase. Each devaluation event consisted of a 6-day pairing procedure, described in the text.

4.2.1.2 Daily Schedule

During the 10-day acquisition phase, each animal performed one operant session in the morning, and another session (with the alternate reinforcer) in the afternoon, separated by roughly 2-3 hours. Each day was divided into four time-slots, occurring at the same time each day. Animals of both group PRE and group POST were evenly divided into four squads, two of which ran nicotine and saccharin sessions simultaneously during a given time-slot. The daily breakdown of acquisition sessions is represented below.

During time-slot one:     squad 1 = nicotine     squad 2 = saccharin
During time-slot two:     squad 3 = nicotine     squad 4 = saccharin
During time-slot three:   squad 1 = saccharin    squad 2 = nicotine
During time-slot four:    squad 3 = saccharin    squad 4 = nicotine

The next day, the order of reinforcers presented to each squad was reversed, such that:

During time-slot one:     squad 1 = saccharin    squad 2 = nicotine
During time-slot two:     squad 3 = saccharin    squad 4 = nicotine
During time-slot three:   squad 1 = nicotine     squad 2 = saccharin
During time-slot four:    squad 3 = nicotine     squad 4 = saccharin

This two-step pattern was repeated throughout the experiment (excepting days allotted to devaluation procedures) until its conclusion. This alternation ensured that time-of-day effects were counterbalanced across nicotine and saccharin sessions for all four squads.
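The alternating schedule above can be sketched in a few lines of code. This is only an illustration of the counterbalancing logic: the function and variable names are hypothetical, the day index counts operant session days only (devaluation days are skipped), and treating time-slots one and two as the "morning" slots is an assumption made for the check at the end.

```python
# Illustrative sketch of the two-day alternating squad schedule.
# Names are hypothetical; the slot/squad layout is taken from the text.

# Layout on odd-numbered session days: {time_slot: {squad: reinforcer}}
ODD_DAY = {
    1: {1: "nicotine", 2: "saccharin"},
    2: {3: "nicotine", 4: "saccharin"},
    3: {1: "saccharin", 2: "nicotine"},
    4: {3: "saccharin", 4: "nicotine"},
}


def swap(reinforcer):
    """Return the alternate reinforcer."""
    return "saccharin" if reinforcer == "nicotine" else "nicotine"


def schedule(day):
    """Return {time_slot: {squad: reinforcer}} for a 1-indexed session day."""
    if day % 2 == 1:
        return {slot: dict(squads) for slot, squads in ODD_DAY.items()}
    # On even days every squad receives the other reinforcer in each slot.
    return {
        slot: {squad: swap(r) for squad, r in squads.items()}
        for slot, squads in ODD_DAY.items()
    }


def early_slot_counts(n_days=10):
    """Count how often each squad ran each reinforcer in slots 1 and 2
    (assumed here to be the morning slots)."""
    counts = {sq: {"nicotine": 0, "saccharin": 0} for sq in (1, 2, 3, 4)}
    for day in range(1, n_days + 1):
        for slot in (1, 2):
            for squad, reinforcer in schedule(day)[slot].items():
                counts[squad][reinforcer] += 1
    return counts
```

Under these assumptions, each squad runs each reinforcer in the early slots on exactly half of the 10 acquisition days, which is the sense in which time of day is balanced across reinforcers.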

4.2.2 Devaluation: LiCl-Reinforcer Pairing

4.2.2.1 General Pairing Procedures

To avoid inadvertent association between LiCl-induced nausea and the animals’ daily feed, pairing procedures always took place prior to the animals being fed, and animals were not fed on these days for a minimum of 4 hours after the last LiCl injection had been delivered. When fed at this time, the animals did not show any reluctance or hesitation in approaching their home cage food hopper or initiating food consumption, nor did animals abnormally fail to finish feeds by the next day.

The entirety of the LiCl aversion-pairing procedure consisted of 3 repetitions of a 2-day pairing cycle, yielding a total of 6 days. Each day of the pairing cycle exposed animals to one of the two operant reinforcers; when this was the reinforcer for which they were paired, they received i.p. LiCl. On days in which animals received the reinforcer for which they were unpaired, they were given i.p. saline as a substitute.

To be clear, this meant that animals could only be either “nicotine-paired” or “saccharin-paired”. Nicotine-paired animals would not receive LiCl with saccharin (after saccharin exposure, nicotine-paired animals received saline). Likewise, saccharin-paired animals would not receive LiCl with nicotine (after nicotine exposure, saccharin-paired animals received saline). This design ensured that every animal, over the course of pairing, received exactly equal exposure to both nicotine and LiCl, with the only difference being whether saccharin or nicotine exposure was proximal to LiCl injection.
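The pairing rule can be written out explicitly as a minimal sketch. The names below are hypothetical, and the within-cycle order (saccharin on cycle day 1, nicotine on cycle day 2) follows the headings of sections 4.2.2.3 and 4.2.2.4.

```python
# Illustrative sketch of the 6-day LiCl pairing plan (3 x 2-day cycle).
# Names are hypothetical; the rule itself is taken from the text.

CYCLE = ("saccharin", "nicotine")  # cycle day 1, cycle day 2


def injection(condition, reinforcer):
    """LiCl follows the paired reinforcer; saline follows the other one."""
    return "LiCl" if condition == f"{reinforcer}-paired" else "saline"


def pairing_plan(condition, repetitions=3):
    """Return a list of (day, reinforcer, i.p. injection) tuples."""
    plan = []
    for rep in range(repetitions):
        for offset, reinforcer in enumerate(CYCLE):
            day = rep * len(CYCLE) + offset + 1
            plan.append((day, reinforcer, injection(condition, reinforcer)))
    return plan
```

Calling `pairing_plan("nicotine-paired")` and `pairing_plan("saccharin-paired")` yields the same six reinforcer exposures and the same number of LiCl injections (three each); only the reinforcer that precedes LiCl differs.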

4.2.2.2 Intraperitoneal LiCl Preparation

To maximize the efficacy of LiCl-nicotine pairings, we used an adaptation of the procedure described by Clemens et al. (2014). LiCl was dissolved into sterile water at a concentration of 6.36mg/mL, yielding an isotonic solution of 0.15M. The pairing dose of 89.7mg/kg LiCl was delivered via an i.p. injection of 14.1mL/kg. An identical volume of sterile saline, also delivered i.p., served as the control injection during those days of the pairing cycle in which animals received the non-paired reinforcer (i.e. when saccharin-paired animals received nicotine, and when nicotine-paired animals received saccharin).
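The arithmetic linking these three figures can be checked with a short sketch. The molar mass of LiCl (approximately 42.39 g/mol) is a standard value not given in the text, and the helper name and the 300 g example weight are purely illustrative.

```python
# Worked check of the LiCl concentration, molarity, and dose figures.
# LICL_MOLAR_MASS is a standard value (Li 6.94 + Cl 35.45), not from the text.

LICL_MOLAR_MASS_G_PER_MOL = 42.39
CONCENTRATION_MG_PER_ML = 6.36      # from the text
INJECTION_VOLUME_ML_PER_KG = 14.1   # from the text

# mg/mL is numerically equal to g/L, so dividing by g/mol gives mol/L.
molarity = CONCENTRATION_MG_PER_ML / LICL_MOLAR_MASS_G_PER_MOL

# Dose delivered per kg of body weight.
dose_mg_per_kg = CONCENTRATION_MG_PER_ML * INJECTION_VOLUME_ML_PER_KG


def injection_volume_ml(body_weight_g):
    """Volume of LiCl (or saline control) for an animal of the given weight."""
    return INJECTION_VOLUME_ML_PER_KG * body_weight_g / 1000.0
```

Here `molarity` evaluates to about 0.15 mol/L and `dose_mg_per_kg` rounds to 89.7, matching the stated values; a hypothetical 300 g animal would receive about 4.2 mL.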

4.2.2.3 Saccharin (Pairing Cycle Day 1)

Conditioned aversion to saccharin solution was induced by allowing animals to freely consume saccharin solution in an LAP context (distinct from operant and housing environments; see section 3.3.1.4) for 30 minutes. After this period, they were individually removed from LAP drinking cages and given the i.p. LiCl injection described above if “saccharin-paired”. If “nicotine-paired”, animals received a saline injection in place of LiCl. Animals were immediately returned to the transport cart following injection delivery, where they were monitored for 15 minutes before being returned to the home cage.

4.2.2.4 Nicotine (Pairing Cycle Day 2)

LiCl-nicotine pairings were conducted by wheeling a transport cart of animals to a novel experimental room, in which a separate transport cart (or “pairing cart”) had been equipped with nicotine-armed variable speed motor pumps (PHM-100VS, MedAssociates) identical to the ones which delivered nicotine in operant sessions. Pairings did not take place in any operant apparatus, but rather in the transport cages of the pairing cart itself (henceforth referred to as “pairing cages”).

Animals were first weighed and given a standard heparin flush, but held in the familiar transport cart until the procedure. Animals set to undergo pairing had drug lines threaded directly from armed syringes into catheter ports, without any intervening swivel or spring (“pairing lines”). Animals were next placed into a pairing cage, where they were allowed to habituate for 1-2 minutes and monitored continually in case of incident with pairing lines.

Allowing animals to remain in pairing carts during nicotine infusions permitted them to experience drug effects in the absence of the major contextual cues of operant sessions (e.g. operant chamber & drug line springs), and undistracted by experimenter handling. Furthermore, the use of the operant motor pumps allowed the exact replication of the relevant i.v. nicotine reinforcer delivered during self-administration. Nicotine pairings began with the delivery of a priming infusion to fill the catheter “dead volume”, followed by calibration of motor pumps appropriate to the animal’s weight. Delivery of the prime initiated a count-up (0:00); across 9 minutes, animals received 5 nicotine infusions (using the standard operant dose of 0.03mg/kg/infusion) delivered at 1:00, 2:00, 4:00, 6:00, & 9:00 minutes subsequent to the delivery of the prime. Even nicotine-naive animals (i.e. those in Experiment 1A) tolerated these nicotine infusions well and without incident. Following the final nicotine infusion, animals were immediately removed from the pairing cart, flushed with heparin (clearing remaining nicotine from the catheter), and received the appropriate i.p. injection (LiCl for nicotine-paired animals, saline for saccharin-paired animals) before being returned to the transport cart.
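For reference, the cumulative nicotine exposure implied by this infusion schedule can be sketched as follows. The function name is hypothetical; the infusion times and unit dose are those given above, and pharmacokinetics (e.g. clearance between infusions) are deliberately ignored.

```python
# Cumulative non-contingent nicotine dose across one pairing session.
# Times and unit dose are from the text; the function name is illustrative.

INFUSION_TIMES_MIN = (1, 2, 4, 6, 9)   # minutes after the priming infusion
UNIT_DOSE_MG_PER_KG = 0.03             # standard operant dose per infusion


def cumulative_dose_mg_per_kg(elapsed_min):
    """Total nicotine delivered (mg/kg) by a given time after the prime."""
    delivered = sum(1 for t in INFUSION_TIMES_MIN if t <= elapsed_min)
    return delivered * UNIT_DOSE_MG_PER_KG
```

By the final infusion at 9:00, each animal has received 5 x 0.03 = 0.15mg/kg nicotine per pairing session, regardless of pairing condition.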

The devaluation procedures above were identical in both Experiment 1A and Experiment 1B. To reiterate, nicotine-paired animals (receiving nicotine with LiCl) received saline with saccharin, and saccharin-paired animals (receiving saccharin with LiCl) received saline with nicotine.

4.2.3 Experiment 1A: Pre-Acquisition Devaluation

4.2.3.1 Saccharin Pre-Exposure

Following animals’ assignment into Pre-Acquisition and Post-Acquisition devalued groups, animals in group PRE were pre-exposed to saccharin solution in an LAP paradigm. This pre-exposure was necessary to prevent neophobic avoidance of saccharin when aversion-pairing began, which could potentially compromise pairing efficacy. Animals were water-deprived for 16 hours, and subsequently permitted to consume saccharin undisturbed for one hour in the LAP context, as outlined in Chapter 3. This exact procedure was repeated on the following day, for a total of 2 hours of pre-exposure.

4.2.3.2 LiCl Pairing and Acquisition

The volume of saccharin voluntarily consumed across these pre-exposure LAP sessions was used to divide the animals of Experiment 1A into Nicotine-Paired and Saccharin-Paired conditions, with the average food training active responses, active saccharin drinking, and left/right active lever assignments equilibrated between them. Following the six-day LiCl pairing procedure described above, subjects began operant self-administration for nicotine and saccharin concurrently with the animals from Experiment 1B.

As previously mentioned, the aim of Experiment 1A was not any assessment of outcome-sensitivity, but rather an assessment of LiCl pairing’s ability to devalue each reinforcer, with performance in the subsequent acquisition phase providing relevant dependent variables. As such, extinction and reacquisition testing was not performed by animals assigned to the Pre-Acquisition Devaluation group.

4.2.4 Experiment 1B: Post-Acquisition Devaluation

4.2.4.1 Acquisition and LiCl Pairing

Animals assigned to Experiment 1B (group POST) remained undisturbed in home cages while the animals of Experiment 1A (group PRE) underwent saccharin pre-exposure (2 days) and LiCl pairing (6 days), so that animals of both experiments began operant training sessions simultaneously. Baseline data from the final three days of acquisition (days 8-10) was used to counterbalance animals from Experiment 1B into nicotine-paired and saccharin-paired groups, with equilibrated averaged metrics of active lever responses for nicotine, nicotine infusions earned, active lever responses for saccharin, and adjusted saccharin reinforcements consumed. Animals in Experiment 1B next underwent the same 6-day LiCl pairing procedure as was performed in Experiment 1A, before outcome sensitivity and devaluation efficacy were respectively assessed in the experimental testing phase.

4.2.4.2 Extinction and Reacquisition Testing

The testing phase that followed the devaluation of group POST consisted of one extinction session followed by two reacquisition sessions for each reinforcer.

Extinction tests were 10 minutes in duration, and a total of two extinction sessions (one in each reward’s self-administration context) were undertaken by each individual animal in group POST. Extinction testing occurred the day following completion of the 6-day LiCl pairing phase. The order of testing was counterbalanced across squads, such that half the animals (evenly divided across pairing condition) tested for saccharin first, and the other half tested for nicotine first. In extinction testing sessions, levers were non-functional; responses were recorded but otherwise inconsequential. No rewards or reward-associated cues were presented at any time, and the houselight was continually illuminated until session termination.

In the two days that followed extinction testing, animals of group POST again performed operant sessions identical in all aspects to those of operant acquisition. Performance during reacquisition would indicate whether the LiCl pairing effectively devalued each reward, as revealed by response differences between the animals of distinct pairing conditions. Readers are referred back to Table 1.1 for a summary of interpreting extinction and reacquisition test data.

4.3 Experiment 1: Results

4.3.1 Experiment 1A: Pre-Acquisition Devaluation

4.3.1.1 Attrition

Of the 24 animals assigned to Experiment 1A (group PRE), 2 animals were excluded from final analysis due to complications during the LiCl pairing, and 5 animals were excluded due to loss of catheter patency. A total of 17 animals were used in the final data analysis: 8 animals completed the study as nicotine-paired, and 9 animals completed the study as saccharin-paired. Exclusion invalidated an animal’s NSA as well as SSA data.

4.3.1.2 Data Analysis for Experiment 1A

For animals in Experiment 1A (group PRE), the impact of aversion-pairing to saccharin and nicotine on operant acquisition was assessed by comparing operant performance between pairing conditions over the 10-day acquisition phase. Comparisons were conducted through repeated-measures ANOVA using the within-factor of “Day” (acquisition sessions 1 through 10) and the between-factor of “Pairing” (nicotine-paired vs. saccharin-paired). Separate ANOVAs were performed for each of three metrics, namely reinforcements earned, active lever responses, and inactive lever responses across acquisition. This analysis was conducted separately on data from saccharin and nicotine operant sessions.
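The Day effects in this chapter are reported with fractional degrees of freedom, e.g. F(2.81, 42.14) for a 10-level Day factor with 17 animals in 2 groups. Such values are consistent with a sphericity correction (for example Greenhouse-Geisser), in which both degrees of freedom are multiplied by an estimate epsilon; the specific correction is not named in this section, so the following sketch should be read as an assumption about how those values arise, with hypothetical function names.

```python
# Sketch: how epsilon-corrected degrees of freedom arise in a mixed design
# with one within-subject factor (Day) and one between-subject factor
# (Pairing). The use of a sphericity correction is an assumption here.


def corrected_df(n_levels, n_subjects, n_groups, epsilon=1.0):
    """Return (df_effect, df_error) for the within-subject effect."""
    df_effect = n_levels - 1
    df_error = df_effect * (n_subjects - n_groups)
    return epsilon * df_effect, epsilon * df_error


def implied_epsilon(reported_df_effect, n_levels):
    """Back out epsilon from a reported, corrected effect df."""
    return reported_df_effect / (n_levels - 1)
```

With `n_levels=10`, `n_subjects=17`, and `n_groups=2`, the uncorrected values are (9, 135); an epsilon of roughly 0.31 (2.81/9) reproduces an error df of about 42.1.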

4.3.1.3 Pre-Acquisition Pairing on Saccharin Self-Administration

Adjusted saccharin reinforcements consumed across the acquisition phase in saccharin-paired and saccharin-unpaired (i.e. nicotine-paired) rats are shown in Figure 4.2A. An overall ANOVA revealed a significant effect of LiCl Pairing [F(1, 15)=11.93, p=0.004], a significant effect of Day [F(2.81, 42.14)=7.18, p=0.001], and a significant Day*Pairing interaction [F(2.81, 42.14)=6.89, p=0.001]. These results indicate that not only did saccharin-unpaired animals earn and consume more saccharin reinforcements than saccharin-paired animals, but the extent of this difference between the two pairing groups increased with the progression of acquisition training sessions; saccharin-unpaired animals increased their number of reinforcements consumed across acquisition, whereas saccharin-paired animals did not.

Similar analysis for active lever responses (Figure 4.2B) revealed a significant effect of Pairing [F(1, 15)=9.58, p=0.007] but no significant effect of Day [F(2.94, 44.13)=1.15, p=0.34] or Day*Pairing interaction [F(2.94, 44.13)=1.69, p=0.18], indicating that the responses on the active lever made by the saccharin-unpaired rats were significantly higher than those made by the saccharin-paired rats, and such differences remained consistent throughout the 10 days of saccharin self-administration.

With respect to inactive lever responses made (Figure 4.2C), there was no significant effect of Pairing [F(1, 15)=0.53, p=0.48], but there was a significant effect of Day [F(2.28, 34.18)=9.49, p<0.001], without a significant Day*Pairing interaction [F(2.28, 34.18)=0.32, p=0.76]. This likely reflects that across both pairing groups, relatively high inactive lever pressing in early sessions declined across acquisition.

[Figure 4.2 panel annotations: (A) **, ††, ‡‡; (B) **; (C) ††]

Figure 4.2. Daily sessional averages (±SEM) across the 10-day acquisition phase of adjusted reinforcements consumed (A), active lever responses (B), and inactive lever responses (C) by group PRE (n=17) during operant saccharin self-administration sessions (0.19ml of 0.1% w/v per reinforcement). For individual metrics, any significant repeated-measures ANOVA effects of Pairing are indicated by (**) for p<0.01; significant effects of Day are indicated by (††) for p<0.01; significant Day*Pairing interactions are indicated by (‡‡) for p<0.01.

4.3.1.4 Pre-Acquisition Pairing on Nicotine Self-Administration

For NSA sessions performed by group PRE during acquisition, averaged sessional metrics of nicotine infusions earned, active lever responses made, and inactive lever responses made are shown respectively in Figures 4.3A, 4.3B, and 4.3C for rats that received nicotine-LiCl pairings (nicotine-paired) and rats that received saccharin-LiCl pairings (nicotine-unpaired).

Regarding nicotine reinforcements earned (Figure 4.3A), ANOVA showed no significant effects of Pairing [F(1, 15)=0.81, p=0.38], Day [F(2.87, 42.98)=2.32, p=0.09], or Day*Pairing interaction [F(2.87, 42.98)=0.67, p=0.57], indicating that animals earned a relatively stable number of nicotine infusions across acquisition, and the amount of nicotine voluntarily self-administered by the animals was unaffected by whether the prior aversion-pairing occurred with nicotine or with saccharin.

Similar analysis for active lever responses made (Figure 4.3B) revealed no significant effect of Pairing [F(1, 15)=0.64, p=0.44], a significant effect of Day [F(1.91, 28.70)=3.68, p=0.039], but no significant Day*Pairing interaction [F(1.91, 28.70)=1.53, p=0.23], indicating that although active responding changed across acquisition, changes occurred similarly for each group irrespective of whether animals were nicotine-paired or saccharin-paired.

With respect to inactive lever responses made (Figure 4.3C), no significant effects of Pairing [F(1, 15)=1.38, p=0.26], Day [F(2.11, 31.62)=2.34, p=0.11], or Day*Pairing interaction [F(2.11, 31.62)=0.61, p=0.56] were observed, indicating that responses on the inactive lever occurred irrespective of pairing group and did not change across the acquisition period.

Figure 4.3. Daily sessional averages (±SEM) across the 10-day acquisition phase of nicotine reinforcements earned (A), active lever responses (B), and inactive lever responses (C) by group PRE (n=17) during operant intravenous nicotine self-administration sessions (0.03mg/kg/infusion). For individual metrics, a significant repeated-measures ANOVA effect of Day is indicated by (†) for p<0.05.

4.3.2 Experiment 1B: Post-Acquisition Devaluation Results

4.3.2.1 Attrition

Of the 24 animals assigned to Experiment 1B (group POST), 7 animals were excluded due to loss of catheter patency. A total of 17 animals were used in the final data analysis; 9 animals completed the study as nicotine-paired and 8 animals completed the study as saccharin-paired. Exclusion invalidated an animal’s NSA as well as SSA data.

4.3.2.2 Data Analysis for Experiment 1B

For animals in Experiment 1B (group POST), relevant analyses were conducted upon baseline data from the final two days of acquisition, data from extinction test performance, and data from reacquisition sessions.

In order to verify that no pre-existing differences in the groups assigned to nicotine-pairing or saccharin-pairing conditions could confound results (as non-patent animals were eliminated at the end of the study, after counterbalancing had already occurred), repeated-measures ANOVAs were conducted on data from the final two days of the acquisition phase. These used the within-factor of Day (acquisition days 9 & 10) and the between-factor of Pairing (saccharin-paired vs. nicotine-paired) to retroactively assess terminal acquisition data for potential confounds. Three separate ANOVAs were performed for each reinforcer, using averaged sessional metrics of reinforcements obtained, active lever responses, and inactive lever responses.

To analyze extinction test data (captured in a single session), two separate univariate ANOVAs were conducted. Each used a fixed-factor of “Pairing” (nicotine-paired vs. saccharin-paired), and the distinct dependent measures of active and inactive lever responding.
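As a minimal sketch of the computation behind such a univariate ANOVA, the F ratio for a single between-subjects factor can be obtained from the between-group and within-group sums of squares. The function name is hypothetical, and this is a textbook one-way ANOVA rather than a reproduction of the statistical software used for the analyses reported here; any input values used with it are illustrative only.

```python
# Textbook one-way (univariate) ANOVA F ratio for a fixed factor such as
# Pairing. Illustrative sketch only, not the software used in this study.


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variability of group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variability of individual scores around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within
```

With two pairing conditions of 9 and 8 animals, the degrees of freedom returned are (1, 15), matching the form F(1, 15) in which the extinction results are reported.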

The two days of reacquisition responding were, like baseline acquisition data, analyzed through repeated-measures ANOVA using the between-factor of “Pairing” (nicotine-paired vs. saccharin-paired) and the within-factor of “Day” (reacquisition sessions 1 & 2). Separate ANOVAs were conducted for each sessional metric of reinforcements earned, active lever responses, and inactive lever responses.

In order to minimize the impact of individual subjects’ within-group baseline differences during the analysis of extinction and reacquisition testing, critical metrics of sessional active lever responses and reinforcements earned were re-expressed as a ratio of test data over each subject’s baseline performance. Baselines consisted of the averaged data from an individual subject’s final two days of acquisition. This proportional transformation is quite commonly applied to extinction and reacquisition test results throughout the habit literature for the reason listed above (e.g. Adams, 1982; DeRusso et al., 2010; Mangieri et al., 2012; Mangieri et al., 2014; Serlin & Torregrossa, 2015). The resultant set of “proportional” data was then subjected to the exact same analysis as conducted for the raw data.
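The proportional transformation described above amounts to the following per-animal calculation. The function names and example numbers are hypothetical; the definition of baseline (mean of acquisition days 9 and 10) is taken from the text.

```python
# Proportion-of-baseline transformation for one animal and one metric
# (e.g. active lever responses). Names and example values are illustrative.


def baseline(day9_value, day10_value):
    """Baseline = mean of the final two acquisition days."""
    return (day9_value + day10_value) / 2.0


def proportion_of_baseline(test_value, day9_value, day10_value):
    """Test-session value expressed as a ratio of the animal's own baseline."""
    base = baseline(day9_value, day10_value)
    if base == 0:
        raise ValueError("baseline is zero; proportion is undefined")
    return test_value / base
```

For instance, a hypothetical animal pressing 100 and 80 times on days 9 and 10 and 45 times in a test session would score 45 / 90 = 0.5, i.e. half of its own baseline responding.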

4.3.2.3 Group POST: Acquisition of Saccharin Responding

The average sessional performance of group POST during the 10 acquisition days for the self-administration of saccharin is expressed in adjusted saccharin reinforcements consumed (Figure 4.4A), active lever responses made (Figure 4.4B), and inactive lever responses made (Figure 4.4C).

Importantly, animals were not counterbalanced until the acquisition phase was completed, and during counterbalancing priority was given to equilibration of nicotine performance between conditions. As such, the paired and unpaired conditions given in Figure 4.4 reflect the performance histories of animals that were only later assigned to those conditions. Accordingly, statistical analysis of the data presented in Figure 4.4 was not conducted, and the data are presented here for reference purposes only. The critical assessment of pre-existing baseline differences between the two experimental conditions utilized only data that contributed to counterbalancing, and is presented in Table 4.2.

Figure 4.4. Daily sessional averages (±SEM) in reinforcements earned (A), active lever responses (B), and inactive lever responses (C) for group POST (n=17) in operant saccharin self-administration sessions (0.19ml of 0.1% w/v per reinforcement) during the 10-day acquisition phase. Assignment of animals to paired and unpaired conditions was performed only after the acquisition phase was complete; data presented in this figure thus reflect the performance history of animals only later assigned to those conditions and are shown for reference purposes, and therefore statistical analysis was not performed.

4.3.2.4 Baseline Differences in Saccharin Responding

For the animals of group POST, data from the final 2 days of operant saccharin SA acquisition are presented in Table 4.2 for animals later assigned to saccharin-paired or nicotine-paired (i.e. saccharin-unpaired) conditions. Repeated-measures ANOVA revealed no significant effects on adjusted reinforcements consumed of Pairing [F(1, 15)=0.001, p=0.98], Day [F(1, 15)=0.004, p=0.95], or Day*Pairing interaction [F(1, 15)=0.75, p=0.40]. Similar analysis of active lever responding likewise revealed no significant effects of Pairing [F(1, 15)=0.17, p=0.69], Day [F(1, 15)=1.13, p=0.31], or Day*Pairing interaction [F(1, 15)=0.10, p=0.75], and the same pattern was seen for inactive lever responding, with no significant effects of Pairing [F(1, 15)=0.18, p=0.68], Day [F(1, 15)=0.62, p=0.45], or Day*Pairing interaction [F(1, 15)=1.62, p=0.22].

Together, these results indicate that there were no pre-existing group differences between the group POST animals later assigned to saccharin-paired and nicotine-paired conditions, for the metrics of either reinforcer. Any subsequent response differences observed between the animals of each pairing condition may therefore be attributed to the devaluation procedure.

Table 4.2 Baseline data for group POST during saccharin self-administration.

METRIC                       FUTURE CONDITION    ACQUISITION DAY 9        ACQUISITION DAY 10
                                                 PERFORMANCE (±SEM4)      PERFORMANCE (±SEM4)
Reinforcements Earned1       Saccharin-paired    42.05 (±2.77)            39.85 (±3.84)
                             Nicotine-paired     41.11 (±3.96)            43.00 (±2.49)
Active Lever Responding2     Saccharin-paired    93.13 (±9.38)            87.75 (±11.73)
                             Nicotine-paired     90.56 (±10.27)           80.56 (±7.98)
Inactive Lever Responding3   Saccharin-paired    4.25 (±1.56)             2.38 (±0.56)
                             Nicotine-paired     3.67 (±0.87)             4.11 (±1.37)

1) Average of [raw reinforcements earned – (unconsumed saccharin volume/reinforcement volume)] for individual animals
2) Mean sessional active lever responses
3) Mean sessional inactive lever responses
4) Standard error of the mean
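Footnote 1 of Table 4.2 defines the adjusted reinforcement metric; a minimal sketch of that calculation is given below. The function names and example values are hypothetical, and the reinforcement volume of 0.19ml is taken from the figure captions.

```python
# Adjusted saccharin reinforcements, following footnote 1 of Table 4.2:
# raw reinforcements earned minus the number of reinforcements' worth of
# saccharin left unconsumed. Names and example values are illustrative.

REINFORCEMENT_VOLUME_ML = 0.19  # volume per saccharin reinforcement


def adjusted_reinforcements(raw_earned, unconsumed_ml,
                            reinforcement_volume_ml=REINFORCEMENT_VOLUME_ML):
    """Reinforcements actually consumed by one animal in one session."""
    return raw_earned - unconsumed_ml / reinforcement_volume_ml


def group_mean_adjusted(sessions):
    """Mean adjusted reinforcements over (raw_earned, unconsumed_ml) pairs."""
    values = [adjusted_reinforcements(raw, left) for raw, left in sessions]
    return sum(values) / len(values)
```

For example, a hypothetical animal that earned 40 reinforcements but left 0.38ml in the receptacle would be credited with 40 - 2 = 38 reinforcements consumed.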

4.3.2.5 Extinction Saccharin Responding

During extinction testing in the operant environment for saccharin self-administration, analysis of active lever responding (Figure 4.5A) showed a significant effect of Pairing [F(1, 15)=5.68, p=0.031], indicating that animals who had been saccharin-paired made fewer active lever responses than those who had been nicotine-paired. This effect was also present when extinction test data were expressed as a proportion of baseline active lever responses (Figure 4.5B) [F(1, 15)=9.34, p=0.008]. These results suggest that the experience of LiCl-saccharin pairing reduced active lever responding for saccharin in extinction, relative to the LiCl-nicotine paired group.

No significant effect of Pairing was observed for inactive lever responding (Figure 4.5C) [F(1, 15)=0.46, p=0.51], suggesting that the LiCl-reinforcer pairings did not induce any context-specific suppression of activity.

[Figure 4.5 panel annotations: (A) *; (B) **]

Figure 4.5. Averaged sessional lever responding (±SEM) of group POST (n=17) during the 10-minute extinction test conducted in the saccharin self-administration context. Active lever responses are expressed both as mean sessional lever presses made (A), as well as a proportion of baseline responding (B), using active lever presses averaged from days 9 & 10 of acquisition. Inactive lever responding is expressed as mean sessional lever presses made (C). For individual metrics, any significant univariate ANOVA effects of Pairing are indicated by (*) for p<0.05, and (**) for p<0.01.

4.3.2.6 Reacquisition Saccharin Responding

ANOVA for adjusted reinforcements consumed during saccharin reacquisition sessions (Figure 4.6A) revealed a significant effect of Pairing [F(1, 15)=17.81, p=0.001], as well as a significant effect of Day [F(1, 15)=9.92, p=0.007] but no Day*Pairing interaction [F(1, 15)=0.09, p=0.77]. This pattern persisted when reinforcement data were expressed as baseline proportion (Figure 4.6B), with significant effects of Pairing [F(1, 15)=15.93, p=0.001] and Day [F(1, 15)=5.10, p=0.039], but not of Day*Pairing interaction [F(1, 15)=0.09, p=0.77]. These results indicate that saccharin-paired animals consumed fewer saccharin reinforcements than nicotine-paired animals, and that both conditions consumed more saccharin reinforcements on the second day of reacquisition.

ANOVA of active lever responses made across saccharin reacquisition sessions (Figure 4.7A) revealed a significant effect of Pairing [F(1, 15)=12.78, p=0.003], but no significant effects of Day [F(1, 15)=1.58, p=0.23] or Day*Pairing interaction [F(1, 15)=0.62, p=0.44]. The same pattern of results was seen when reacquisition of active lever responses for saccharin was expressed as a proportion of baseline responding (Figure 4.7B): a significant effect of Pairing [F(1, 15)=11.68, p=0.004], but no significant effects of Day [F(1, 15)=1.90, p=0.19] or Day*Pairing interaction [F(1, 15)=0.68, p=0.42]. These results suggest that saccharin-paired animals performed fewer active lever responses relative to nicotine-paired controls, and SSA responding was stable across reacquisition days for both conditions.

ANOVA of inactive lever responding in saccharin reacquisition (Figure 4.7C) showed no significant effects of Pairing [F(1, 15)=0.98, p=0.34], Day [F(1, 15)=0.01, p=0.92], or Day*Pairing interaction [F(1, 15)=1.03, p=0.33], suggesting that the suppression of responding in the saccharin-paired animals was specific to the active lever, and not a consequence of general behavioral suppression.

Figure 4.6. Sessional averages of saccharin reinforcements consumed (±SEM) by group POST (n=17) during operant saccharin self-administration sessions (0.19ml of 0.1% w/v per reinforcement) across the 2-day reacquisition phase. Reinforcements are expressed both as mean sessional reinforcements consumed (A), as well as a proportion of baseline consumption (B), using averaged data from days 9 & 10 of acquisition from individual animals. For individual metrics, any significant repeated-measures ANOVA effects of Pairing are indicated by (**) for p<0.01; significant effects of Day are indicated by (†) for p<0.05 and (††) for p<0.01.

Figure 4.7. Sessional average lever responses (±SEM) across the 2-day reacquisition phase performed by group POST (n=17) during operant saccharin self-administration sessions (0.19ml of 0.1% w/v per reinforcement). Active lever responses are expressed both as mean sessional lever presses made (A), as well as a proportion of baseline responding (B), using active lever presses averaged from days 9 & 10 of acquisition. Inactive lever responding is expressed as mean sessional lever presses made (C). For individual metrics, any significant repeated-measures ANOVA effect of Pairing is indicated by (**) for p<0.01.

4.3.2.7 Group POST: Acquisition of Nicotine Responding

The average sessional performance of group POST during the 10 acquisition days of nicotine self-administration is expressed in nicotine infusions earned (Figure 4.8A), active lever responses made (Figure 4.8B), and inactive lever responses made (Figure 4.8C).

Importantly, animals were not counterbalanced until the acquisition phase was completed. As such, the paired and unpaired conditions given in Figure 4.8 reflect the performance histories of animals that were only later assigned to those conditions. Statistical analysis of the data presented in Figure 4.8 was therefore not conducted, and the data are presented here for reference purposes only. The critical assessment of pre-existing baseline differences between the two experimental conditions utilized only data that contributed to counterbalancing, and is presented in Table 4.3.

Figure 4.8. Daily sessional averages (±SEM) in reinforcements earned (A), active lever responses (B), and inactive lever responses (C) for group POST (n=17) in operant intravenous nicotine self-administration sessions (0.03mg/kg/infusion delivered upon reinforcement) during the 10-day acquisition phase. Assignment of animals to paired and unpaired conditions was performed only after the acquisition phase was complete; the data presented in this figure thus reflect the performance history of animals only later assigned to those conditions and are shown for reference purposes, and therefore statistical analysis was not performed.

4.3.2.8 Baseline Differences in Nicotine Responding

As with saccharin, nicotine self-administration sessions in the final two days of acquisition (Table 4.3) were subjected to ANOVAs for each metric. For nicotine reinforcements earned, no significant effects were found for Pairing [F(1, 15)=0.001, p=0.98], Day [F(1, 15)=1.20, p=0.29], or Day*Pairing interaction [F(1, 15)=0.70, p=0.42]. The complementary ANOVA for active lever responses revealed the same pattern of absent effects for Pairing [F(1, 15)=0.39, p=0.85], Day [F(1, 15)=1.88, p=0.19], and Day*Pairing interaction [F(1, 15)=2.72, p=0.12], as did the ANOVA for inactive lever responding for Pairing [F(1, 15)=1.31, p=0.27], Day [F(1, 15)=1.93, p=0.19], and Day*Pairing interaction [F(1, 15)=0.05, p=0.83].

Together, these results suggest that there were no pre-existing differences between the group POST animals later assigned to saccharin-paired and nicotine-paired conditions, for the metrics of either reinforcer. Any subsequent response differences observed between the animals of each pairing condition may therefore be validly attributed to the devaluation procedure.

Table 4.3. Baseline data for group POST during nicotine self-administration.

METRIC                          FUTURE CONDITION    ACQUISITION DAY 9         ACQUISITION DAY 10
                                                    PERFORMANCE (±SEM)(4)     PERFORMANCE (±SEM)(4)
Reinforcements Earned (1)       Saccharin-paired    14.00 (±2.01)             13.63 (±1.99)
                                Nicotine-paired     15.11 (±3.16)             12.33 (±1.15)
Active Lever Responding (2)     Saccharin-paired    21.50 (±3.61)             22.25 (±3.01)
                                Nicotine-paired     25.11 (±5.44)             16.89 (±1.02)
Inactive Lever Responding (3)   Saccharin-paired     9.63 (±2.44)              7.63 (±2.83)
                                Nicotine-paired      5.78 (±2.66)              4.33 (±1.40)

1) Mean sessional nicotine infusions earned
2) Mean sessional active lever responses made
3) Mean sessional inactive lever responses made
4) Standard error of the mean

4.3.2.9 Extinction Nicotine Responding

As with saccharin responding, a similar analysis was conducted with data from NSA sessions. ANOVA for active lever responding in nicotine extinction testing revealed no significant effect of Pairing when expressed as either raw data (Figure 4.9A) [F(1, 15)=0.52, p=0.48] or baseline-proportion data (Figure 4.9B) [F(1, 15)=0.01, p=0.91]. This suggests that nicotine-seeking responses in extinction were performed equally by group POST animals regardless of whether animals underwent LiCl-saccharin pairing or LiCl-nicotine pairing.

Similarly, no significant effect of Pairing was found in univariate ANOVA for inactive lever responding (Figure 4.9C) [F(1, 15)=0.87, p=0.37]. This suggests that pairing conditions did not produce specific changes in general activity during nicotine extinction testing.

Figure 4.9. Averaged sessional lever responding (±SEM) of group POST (n=17) during the 10-minute extinction test conducted in the nicotine self-administration context. Active lever responses are expressed both as mean sessional lever presses made (A), as well as a proportion of baseline responding (B), using averages from days 9 & 10 of acquisition for individual animals. Inactive lever responding is expressed as mean sessional lever presses made (C). Repeated-measures ANOVA revealed no statistical significance for individual metrics in this figure.

4.3.2.10 Reacquisition Nicotine Responding

During nicotine reacquisition tests, repeated-measures ANOVA using raw nicotine infusions earned (Figure 4.10A) showed no effects of Pairing [F(1, 15)=3.37, p=0.09], Day [F(1, 15)=2.69, p=0.12], or Day*Pairing interaction [F(1, 15)=0.56, p=0.47]. However, when infusions earned in reacquisition were expressed as proportions of the average baseline infusions earned by each animal (Figure 4.10B), significant effects of Pairing [F(1, 15)=8.88, p=0.009] and Day [F(1, 15)=5.70, p=0.031] were observed, but not of Day*Pairing interaction [F(1, 15)=0.154, p=0.70]. These results suggest that animals of both groups earned more nicotine infusions on the second day of reacquisition, but those animals who were LiCl-nicotine paired earned fewer average infusions than those who were LiCl-saccharin paired.
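The baseline-proportion transformation used throughout these analyses can be illustrated with a minimal sketch. It assumes, as stated in the figure captions, that each animal's baseline is the mean of its own acquisition days 9 and 10; the function name and the example values are hypothetical and are not data from this experiment.

```python
def baseline_proportion(test_value, day9, day10):
    """Express a test-session value as a proportion of an animal's own baseline.

    The baseline is taken as the mean of acquisition days 9 and 10
    (assumption drawn from the figure captions; names are illustrative).
    """
    baseline = (day9 + day10) / 2.0
    if baseline == 0:
        raise ValueError("baseline is zero; proportion is undefined")
    return test_value / baseline


# Hypothetical animal: 14 and 12 infusions on days 9 and 10, 11 in reacquisition.
example = baseline_proportion(11, 14, 12)
print(round(example, 3))
```

Because each animal is scaled by its own baseline, between-animal variability in absolute intake is reduced, which may explain why an effect of Pairing can reach significance in the proportional data but not in the raw data.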

For raw active lever responses made in nicotine reacquisition (Figure 4.11A), ANOVA revealed a significant effect of Pairing [F(1, 15)=5.51, p=0.033], but no significant effect of Day [F(1, 15)=4.06, p=0.06] or Day*Pairing interaction [F(1, 15)=0.08, p=0.79]. However, when active lever data were expressed as a proportion of baseline responding (Figure 4.11B), ANOVA revealed significant effects not only of Pairing [F(1, 15)=9.67, p=0.007], but also of Day [F(1, 15)=5.86, p=0.029], yet still without a Day*Pairing interaction [F(1, 15)=0.03, p=0.87]. These results indicate that active lever responses in nicotine reacquisition were selectively reduced by LiCl-nicotine pairing, and that animals of both pairing groups made more active lever responses on the second reacquisition day.

For inactive lever responding (Figure 4.11C), no effects of Pairing [F(1, 15)=0.80, p=0.38], Day [F(1, 15)=0.70, p=0.42], or Day*Pairing interaction [F(1, 15)=0.26, p=0.62] were found, suggesting that general activity in nicotine reacquisition was unaffected by LiCl-pairing to either reinforcer.

Figure 4.10. Sessional averages of nicotine reinforcements earned (±SEM) by group POST (n=17) during operant intravenous nicotine self-administration sessions (0.03mg/kg/infusion delivered upon reinforcement) across the 2-day reacquisition phase. Reinforcements are expressed both as mean sessional nicotine infusions (A), as well as a proportion of baseline infusions (B), using averaged infusions for each individual animal from days 9 & 10 of acquisition. For individual graphs, any significant repeated-measures ANOVA effects of Pairing are indicated by (**) for p<0.01; significant effects of Day are indicated by (†) for p<0.05.

Figure 4.11. Sessional average lever responses made (±SEM) by group POST (n=17) during operant intravenous nicotine self-administration sessions (0.03mg/kg/infusion) across the 2-day reacquisition phase. Active lever responses are expressed both as mean sessional lever presses made (A), as well as a proportion of baseline responding (B), using active lever presses averaged from days 9 & 10 of acquisition for individual animals. Inactive lever responding is expressed as mean sessional lever presses made (C). For individual graphs, any significant repeated-measures ANOVA effects of Pairing are indicated by (*) for p<0.05 and (**) for p<0.01; significant effects of Day are indicated by (†) for p<0.05.

4.4 Experiment 1: Discussion

4.4.1 LiCl and Saccharin

4.4.1.1 Successful Devaluation of Saccharin Rewards

The animals of group POST and nicotine-paired animals of group PRE readily acquired saccharin self-administration, selectively responding on the active lever to achieve relative response stability around days 6-7 of SSA, and consumed an average of approximately 35 (~6.7mL) saccharin reinforcements on the final day of self-administration. This amount is comparable with previous studies examining saccharin self-administration at the 0.1% (w/v) concentration (e.g. Shram et al., 2008). The acquisition of saccharin responding in group PRE and group POST is not directly comparable, as group PRE had a distinct nicotine and LiCl drug history prior to commencing SSA sessions.

Animals that underwent LiCl-saccharin pairing prior to operant acquisition performed fewer active lever responses as well as consumed less saccharin than animals who were LiCl-nicotine paired across the 10-day acquisition period, consuming an average of fewer than 2 reinforcements on the final acquisition day. Furthermore, while nicotine-paired animals increased their adjusted saccharin reinforcements consumed across the acquisition phase, animals that were saccharin-paired did not.

Similarly, in the saccharin reacquisition tests performed by group POST, animals of the saccharin-paired condition consumed significantly less saccharin (dropping to roughly 20-25 reinforcements) than those of the saccharin-unpaired (i.e. nicotine-paired) condition, who maintained a consumption average of around 40-45 reinforcements.

The significance of these results is twofold. First, the attenuation of saccharin consumption in both acquisition (in group PRE) and reacquisition (in group POST) shows that the parameters of the LiCl pairing procedure used with saccharin produced effective conditioned taste aversion, successfully reducing saccharin’s efficacy as a reinforcer in the saccharin-paired animals. The relatively high saccharin responding performed by the nicotine-paired group, given equal LiCl exposure, demonstrates that this reduction in saccharin value was not a general consequence of LiCl administration (a “carryover effect”) but rather a specific consequence of LiCl-saccharin pairing.

Secondly, to our knowledge, these results are the first demonstration that reward devaluation prior to the acquisition of operant responding can prevent subsequent self-administration, a phenomenon which will be critical for interpreting the effects of LiCl-nicotine pairing on the acquisition of nicotine self-administration.

4.4.1.2 Outcome-Sensitive Saccharin Seeking

As animals in Experiment 1B were counterbalanced following acquisition, animals of both saccharin-paired and nicotine-paired conditions comparably earned roughly 40 saccharin reinforcements in roughly 80 active lever responses on the final acquisition day, and no significant response or reinforcement condition differences were found prior to devaluation initiation.

Saccharin-paired animals of group POST demonstrated a selective reduction in saccharin responding in both extinction and reacquisition testing. This indicates that not only did incentive learning of the newly reduced outcome-value occur (evidenced by reduced reacquisition performance), but also that this incentive learning was capable of reducing responses made during extinction. This strongly suggests that saccharin-seeking during extinction was mediated by an R-O association.

This pattern of reduction in both extinction and reacquisition responses following LiCl devaluation is characteristic of goal-directed behaviors, and has previously been observed in goal-directed responding for food pellets (Adams & Dickinson, 1981), goal-directed responding for sucrose solution (Lingawi & Balleine, 2012), and goal-directed responding for ethanol solution (Mangieri et al., 2012). It can be concluded that the operant responding for saccharin reinforcement in group POST following 10 training sessions of FR1 reinforcement was an outcome-sensitive behavior, and hence can be considered goal-directed (readers are referred to Table 1.1 for a review of post-devaluation testing interpretation).

Considering these animals were trained for a limited amount of time on an FR1 schedule of reinforcement, these results are congruent with the literature consensus that responding for natural rewards following limited training is a behavior under goal-directed control. Importantly, these results also suggest that the methodology of aversion-pairing followed by extinction and reacquisition testing employed in Experiment 1B was effective in determining if a behavior is outcome-sensitive.

4.4.2 LiCl and Nicotine

4.4.2.1 Complex Devaluation of Nicotine Rewards

Consistent with the ranges of nicotine self-administration reported in many previous studies for the 1h session duration and the 30μg/kg infusion dose employed (e.g. Coen et al., 2009; Corrigall & Coen, 1989; Donny et al., 1995; Watkins et al., 1999), animals of Experiment 1 readily acquired NSA. In both group PRE and group POST, animals earned an average of roughly 13 infusions per session by the final day of acquisition, selectively responding on the active lever and achieving relative response stability around acquisition days 6-7 (Figure 4.8).

Unlike the effects of LiCl-pairing on saccharin responding, however, a complex picture was observed in the effects of LiCl-pairing upon nicotine responding in the same animals. First, the acquisition of NSA in Experiment 1A was not differentially affected by whether animals of group PRE were saccharin-paired or nicotine-paired. Since the same LiCl pairing parameters were capable of selectively suppressing the acquisition of SSA in the saccharin-paired animals, the possibility that LiCl-nicotine pairings failed to influence NSA due to an inability of pairing parameters to condition an aversion may be ruled out. These findings indicate that when LiCl-nicotine pairing occurred prior to operant acquisition, it did not influence the subsequent acquisition of NSA and as such failed to effectively devalue nicotine.

However, after the animals of group POST underwent an identical devaluation procedure, their nicotine intake in nicotine reacquisition sessions significantly differed depending on pairing condition. Nicotine-unpaired animals increased their intake to around 17 infusions per session, while paired animals fell to roughly 11 infusions per session (Figure 4.10). This unanticipated result suggests that the devaluation was effective in group POST, but not group PRE. There are several potential explanations for why this may have been the case, with potential mediating factors including some aspect of operant acquisition, such as nicotine exposure. These possibilities will be considered in further detail in Chapter 6.

4.4.2.2 Outcome-Insensitive Nicotine Seeking

When group POST animals performed extinction and reacquisition testing for nicotine to assess goal-directedness of behavior, the pattern of results was quite different from their performance of these tests in the saccharin SA context. In the nicotine extinction test, no response differences existed between animals in saccharin-paired or nicotine-paired conditions. Yet during reacquisition sessions, the active lever responding and proportional reinforcements earned by animals that were nicotine-paired were selectively reduced. In previous studies investigating habit formation with LiCl devaluations, the pattern of LiCl-paired animals maintaining control-level responding in extinction but reduced responses in reacquisition is the defining characteristic of habitual behavior; this pattern has been previously observed in habitual responding for food pellets (Adams, 1982), sucrose solution (Lingawi & Balleine, 2012; Mangieri et al., 2012), ethanol solution (Dickinson et al., 2002; Mangieri et al., 2012), and cocaine solution (Miles et al., 2003). As such, the results of Experiment 1B best fit the interpretation that not only was the devaluation of nicotine effective, but also that operant responding for nicotine reinforcement in group POST was a behavior under habitual control.

4.4.3 Conclusions

The rather surprising implication of Experiment 1 is that even when using training conditions designed to minimize the possibility of habitual responding (i.e. maximize the likelihood of outcome-sensitive responding), nicotine-seeking behavior had already become outcome-insensitive. This interpretation will be revisited in Chapter 6. In the next chapter, the distinct approach of contingency degradation is employed to further evaluate this phenomenon.

CHAPTER 5: Experiment 2 Contingency Degradation of Nicotine Responding

5.1 Experiment 2: Introduction

5.1.1 Assessing Sensitivity to Instrumental Contingency

5.1.1.1 An Alternative Approach

Nicotine-seeking behavior observed in the previous experiment was apparently outcome-insensitive despite minimal training, potentially indicating a rapid rate of habit formation. However, outcome-insensitivity is only one of the two criteria for identifying habitual behavior.

Other experimental approaches for dissociating goal-directed from habitual behaviors, namely omission learning and contingency degradation, both assess the relative sensitivity of contingency learning in each experimental condition to reductions of the instrumental contingency. As discussed in the general introduction, the contingency degradation procedure is the more suitable of these two methods for the study of drug rewards, as it is easier to control drug intake using that methodology.

In addition to assessment of habitual responding for natural rewards (e.g. Bradfield et al., 2013; Dickinson & Mulatero, 1989; Yin et al., 2006), contingency degradation has been previously used in several studies examining habitual responding for alcohol (Fanelli et al., 2013; Mangieri et al., 2014; Serlin & Torregrossa, 2015; Shillinglaw et al., 2014).

5.1.1.2 Typical Contingency Degradation Procedures

In the contingency degradation paradigm, animals are first trained to make operant responses (i.e. lever presses) for a given reinforcer. The primary manipulation in contingency degradation paradigms is then carried out through the non-contingent presentation of that operant reinforcer during operant sessions; active responding for that same reinforcer will thereby be decreased at a rate proportional to the contingency-sensitivity (i.e. goal-directedness) of the response behavior (Dickinson & Mulatero, 1989). Contingency degradation can be conducted by rendering active lever responses inconsequential and delivering non-contingent rewards at scheduled time intervals (e.g. Fanelli et al., 2013; Serlin & Torregrossa, 2015), or alternatively, animals can be trained to respond on a lever that produces reinforcements at a given probability-per-second, after which the probability of non-contingent reward delivery is elevated to the same level, making the operant response irrelevant without being ineffectual (Bradfield et al., 2013; Dickinson & Mulatero, 1989).
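The logic of the probability-matching variant can be made concrete with a small sketch. It assumes the common ΔP formalization of instrumental contingency (the probability of reward given a response minus the probability of reward given no response); this index is not stated in the text above, and the probability value used is purely hypothetical.

```python
def delta_p(p_reward_given_response, p_reward_given_no_response):
    """Contingency index (assumed formalization): P(O | R) - P(O | no R).

    A value of 0 means responding does not change the likelihood of reward,
    even though responses may still be followed by reward.
    """
    for p in (p_reward_given_response, p_reward_given_no_response):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_reward_given_response - p_reward_given_no_response


# Hypothetical per-second reward probability of 0.05 for responding.
training = delta_p(0.05, 0.0)      # rewards only follow responses
degradation = delta_p(0.05, 0.05)  # free rewards delivered at the same rate
print(training, degradation)
```

Under this reading, degradation leaves the response rewarded as often as before but removes its causal advantage, which is the sense in which the response becomes irrelevant without being ineffectual.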

Sessions in which reinforcers are non-contingently presented to animals of the degraded condition are referred to as “degradation sessions”. The rate of decline in operant responses across degradation sessions is assessed through comparison to a complementary rate of decline obtained from a distinct degradation condition, such as for an alternative reinforcer or following distinct training (e.g. Mangieri et al., 2014; Shillinglaw et al., 2014). To ensure that a given contingency degradation procedure induces response-specific behavioral suppression, degraded animals frequently respond for a distinct, non-degraded reinforcer as a control throughout training and degradation sessions (Bradfield et al., 2013; Dickinson & Mulatero, 1989).

That being said, at the time of writing no literature appears to exist on the use of contingency degradation procedures with intravenous reinforcers. This is likely due in part to the fact that intravenous drug delivery can preclude direct application of standard assays of outcome and contingency sensitivity. Drugs can have unconditioned, behaviorally-activating effects that can potentially confound response measures, a risk that is particularly relevant for psychostimulants such as cocaine (Ostlund & Balleine, 2009; Zapata, Minney & Shippenberg, 2010). As such, various modifications of the typical contingency degradation procedure were employed in the current experiment in order to adapt the method for the study of intravenous nicotine, as discussed below.

5.1.1.3 Adapting Contingency Degradation to Intravenous Nicotine

The present experiment was designed to evaluate whether NSA and SSA are under goal-directed or habitual control following 10 acquisition sessions of each reinforcer, by means of contingency degradation. For comparative purposes, the parameters of the acquisition phase of Experiment 1 were replicated as closely as possible. Animals were trained for 10 days with two daily operant sessions, one for saccharin and one for nicotine, on FR1 schedules of reinforcement. With nicotine-degradation serving as the major condition of interest, saccharin was the alternative reward to be degraded in comparison. In addition, by having the same animals respond for both reinforcers, nicotine and saccharin also served as each other’s secondary non-degraded reinforcer to control for and assess specificity of the degradation procedure. To eliminate the possibility that the concurrent performance of saccharin sessions could disrupt contingency degradation performance for the nicotine reinforcer, an additional experimental group was added in which animals responded for nicotine alone.

Another pitfall to be avoided was that animals in nicotine-degraded and nicotine-control conditions could obtain divergent numbers of nicotine infusions across the degradation phase. To exactly equilibrate nicotine intake between the degraded and non-degraded conditions of each experiment, a “yoking” procedure was employed. Yoking refers to the assignment of a non-degraded partner to each animal of the degraded condition; when one of the non-degraded animals earned a reward as normal, a “yoked” reward was simultaneously and non-contingently delivered to that animal’s degraded counterpart. As such, animals of the non-degraded control conditions simultaneously earned reinforcements both for themselves and (unknowingly) for the yoked animals of the degraded conditions.
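The yoking logic described above can be sketched as a simple simulation. This is an illustrative simplification, not the control software used in the experiment: it assumes a plain FR1 schedule with no post-reward timeout or session cap, and the function name, data layout, and press times are hypothetical.

```python
def run_yoked_session(control_press_times, degraded_press_times):
    """Simulate one degradation session for a single yoking pair.

    control_press_times:  active lever press times (s) of the non-degraded animal
    degraded_press_times: active lever press times (s) of the degraded animal

    Simplifying assumptions (not from the source): FR1 with no timeout,
    so every control press earns one reward.
    """
    # FR1: each press by the non-degraded animal earns a reward for itself.
    control_reward_times = list(control_press_times)
    # The same reward is delivered simultaneously, and non-contingently,
    # to the degraded partner. The partner's own presses are only recorded.
    degraded_reward_times = list(control_reward_times)
    return {
        "control": {"presses": len(control_press_times),
                    "rewards": len(control_reward_times)},
        "degraded": {"presses": len(degraded_press_times),
                     "rewards": len(degraded_reward_times)},
        "degraded_reward_times": degraded_reward_times,
    }


# Hypothetical press times in seconds.
session = run_yoked_session([5, 40, 90], [10, 20])
print(session["control"]["rewards"], session["degraded"]["rewards"])
```

Whatever the degraded animal does on its own lever, its reward count and timing match those of its partner, which is how intake is held equal across conditions.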

Finally, a 10-minute extinction test was carried out the day following the 6 sessions of degradation, in which responding on both active and inactive levers was measured but otherwise inconsequential (Bradfield et al., 2013; Dickinson & Mulatero, 1989; Fanelli et al., 2013; Shillinglaw et al., 2014; Serlin & Torregrossa, 2015).

5.2 Experiment 2: Materials and Methods

5.2.1 General Procedures

5.2.1.1 Experimental Design

A total of 50 animals proceeded through a standard pre-training phase including acclimatization, food training, catheterization, and consolidation as described in Chapter 3.

Following consolidation, animals were assigned to either Experiment 2A: Nicotine-Saccharin (n=26; the “NicSac” group) or Experiment 2B: Nicotine-Alone (n=24; the “NicAlone” group), such that animals of each group were evenly divided between left and right active lever assignments. Even numbers of animals were assigned to each group so all animals could have a partner with which to be yoked in the forthcoming degradation phase. As in the previous experiment, animals completed an acquisition phase consisting of 10 days of operant training under an FR1 reinforcement schedule.

Sessional parameters for nicotine and saccharin self-administration were consistent with the previous experiment, outlined in Chapter 3 (sections 3.3.2 & 3.3.3).

Following acquisition, animals within each experiment group were counterbalanced into nicotine-degraded and control conditions using data from the final 3 days of acquisition. In the case of the NicSac group, the animals in the non-degraded control condition during NSA sessions served as the degraded condition during SSA sessions, and likewise the non-degraded condition for SSA sessions served as the degraded condition during NSA sessions. For the NicAlone group, the non-degraded control condition merely experienced self-administration sessions as normal.

These final 4 groups completed a 6-day degradation phase, which was followed the next day by a 10-minute extinction test (specific details of the degradation procedure are given below). The major group divisions and experimental events are summarized in Figure 5.1.

Figure 5.1. Illustration of key events in Experiment 2. Animals were assigned to Nicotine-Saccharin and Nicotine-Alone experimental groups following pre-training. After operant acquisition, animals in the Nicotine-Saccharin group were equally divided into nicotine-degraded and saccharin-degraded conditions, while animals in the Nicotine-Alone group were similarly assigned to nicotine-degraded and non-degraded control conditions. Following the degradation phase, animals in all conditions completed a test session in extinction conditions.

5.2.1.2 Daily Schedule

During the 10-day acquisition period, animals in the NicSac group each performed two daily operant sessions, one for nicotine, and one for saccharin. One session occurred in the morning and the other in the afternoon, separated by roughly 4 hours. Animals in the NicAlone group performed a single daily session for nicotine. Each acquisition day was divided into four time-slots, with a given time-slot occurring at the same time each day. Animals in the NicSac group were assigned to squads 1 & 2, while animals in the NicAlone group were assigned to squads 3 & 4. The daily breakdown of operant sessions is laid out below.

During time-slot one: squad 1 = nicotine; squad 2 = saccharin
During time-slot two: squad 3 = nicotine
During time-slot three: squad 4 = nicotine
During time-slot four: squad 1 = saccharin; squad 2 = nicotine

The next day, the order of reinforcers presented to each squad was reversed, such that:

During time-slot one: squad 1 = saccharin; squad 2 = nicotine
During time-slot two: squad 3 = nicotine
During time-slot three: squad 4 = nicotine
During time-slot four: squad 1 = nicotine; squad 2 = saccharin

This two-step pattern was repeated throughout the experiment until its conclusion.
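The alternating schedule can be expressed compactly, as in the sketch below. It assumes that the first listed ordering applies to odd-numbered days (day 1, 3, ...) and the reversed ordering to even-numbered days; the text does not state which ordering fell on day 1, and the function name and data layout are hypothetical.

```python
def daily_schedule(day):
    """Return {time_slot: {squad: reinforcer}} for a 1-indexed experimental day.

    Assumption: odd days use the first ordering given in the text,
    even days use the reversed ordering.
    """
    if day < 1:
        raise ValueError("day must be 1 or greater")
    if day % 2 == 1:
        first, second = "nicotine", "saccharin"
    else:
        first, second = "saccharin", "nicotine"
    return {
        1: {1: first, 2: second},   # NicSac squads, morning session
        2: {3: "nicotine"},         # NicAlone squad 3
        3: {4: "nicotine"},         # NicAlone squad 4
        4: {1: second, 2: first},   # NicSac squads, afternoon session
    }


for day in (1, 2):
    print(day, daily_schedule(day))
```

Each NicSac squad therefore receives both reinforcers every day, with the morning and afternoon reinforcer swapping from one day to the next.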

5.2.2 Contingency Degradation Procedure

5.2.2.1 Yoking and Counterbalancing

For a specific reinforcer, the degradation phase consisted of two conditions: the “degraded” condition (in which active lever responding was recorded but otherwise inconsequential), and the “non-degraded” condition (in which animals experienced operant sessions identical to normal acquisition).

Upon completion of the acquisition phase, a second round of counterbalancing divided NicSac and NicAlone groups into those final conditions for degradation just mentioned, utilizing averaged data from the last three days of acquisition. Descending priority was given to averaged nicotine infusions earned, active lever responses for nicotine, adjusted saccharin reinforcements consumed, and active lever responses for saccharin.

To perform this counterbalancing, two animals at a time from a given experimental group were selected to form a yoking pair; as much as possible, animals comprising a given yoking pair were matched according to similarity in nicotine infusions earned as well as nicotine active lever responses. After all the animals within an experimental group had been assigned to yoking pairs, one animal from each pair was designated as degraded (the other as non-degraded). To equilibrate baselines for degraded and non-degraded conditions, condition assignments within yoking pairs were swapped until degraded and non-degraded conditions were comparable in average nicotine reinforcements earned, average active lever presses for nicotine, and in left vs. right active lever assignments.
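The pairing and swapping steps can be approximated algorithmically, as in the sketch below. This is not the procedure actually followed, which was carried out by inspection across several metrics; the sketch balances only baseline nicotine infusions, ignores lever side, and uses hypothetical field names and values.

```python
def form_yoking_pairs(animals):
    """Pair animals with adjacent baseline infusions (ties broken by presses)."""
    if len(animals) % 2 != 0:
        raise ValueError("an even number of animals is required for yoking")
    ranked = sorted(animals, key=lambda a: (a["infusions"], a["presses"]))
    return [(ranked[i], ranked[i + 1]) for i in range(0, len(ranked), 2)]


def assign_conditions(pairs):
    """Greedy stand-in for manual swapping within yoking pairs.

    For each pair, pick the orientation that keeps the running difference in
    summed baseline infusions (degraded minus non-degraded) closest to zero.
    """
    degraded, control = [], []
    diff = 0.0
    for a, b in pairs:
        gap = a["infusions"] - b["infusions"]
        if abs(diff + gap) <= abs(diff - gap):
            degraded.append(a)
            control.append(b)
            diff += gap
        else:
            degraded.append(b)
            control.append(a)
            diff -= gap
    return degraded, control


# Hypothetical baseline averages from the last three acquisition days.
animals = [
    {"id": "r1", "infusions": 10, "presses": 15},
    {"id": "r2", "infusions": 16, "presses": 30},
    {"id": "r3", "infusions": 12, "presses": 20},
    {"id": "r4", "infusions": 15, "presses": 22},
]
degraded, control = assign_conditions(form_yoking_pairs(animals))
print([a["id"] for a in degraded], [a["id"] for a in control])
```

A fuller version would add active lever presses and left vs. right lever assignment to the balancing criterion, in the descending priority described above.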

In the case of the NicSac group, an additional step was next performed. Assignment into nicotine-degraded and nicotine-control conditions indicated which animals were to be respectively saccharin-control or saccharin-degraded, but the SSA baseline data of yoking pairs was no longer necessarily comparable. Saccharin sessions thus featured distinct yoking pairs from nicotine sessions, with new yoking pairs created by matching animals in the pre-determined control and degradation conditions as closely as possible for averaged baseline metrics of saccharin reinforcements consumed, saccharin active lever responses, and left vs. right active lever assignments.

5.2.2.2 Degradation Sessions

Degradation sessions occurred in the 6 days following the completion of the 10-day acquisition period, and proceeded with the same daily structure as described for acquisition (section 5.2.1.2). Animals in degraded conditions had active lever responding recorded but otherwise inconsequential, instead receiving rewards (and associated cues) contingent on responses made by their yoked partner.

5.2.2.3 Extinction Testing

Extinction testing occurred the day following completion of the 6th degradation day. Animals of the NicSac group performed two 10-minute extinction tests, one in each reward’s self-administration context, each test taking place during what would have otherwise been the appropriate time-slot. Animals in the NicAlone group similarly performed one 10-minute extinction test in the nicotine self-administration context. Active and inactive lever responding was universally recorded but otherwise inconsequential. Rewards and reward-associated cues were not presented at any time during extinction sessions.

As mentioned in the general introduction, the typical purpose of extinction testing in contingency degradation paradigms is to rule out specific satiety as an alternative interpretation of response reductions by degraded conditions during degradation sessions (Dickinson & Mulatero, 1989).

In the interpretation of nicotine responding during degradation, the extinction test took on an additional role of importance in ruling out an effect of nicotine-mediated behavioral activation. If a high level of operant responding was maintained by nicotine-degraded animals during degradation, interpretation would be ambiguous between either a legitimate absence of contingency learning (reflecting habitual behavior), or contingency learning masked by nicotine-sustained responding (reflecting goal-directed behavior obscured by behavioral activation). However, because no nicotine or cues were delivered during extinction testing, any contingency learning which occurred during degradation (i.e. response reductions) but was masked by behaviorally activating effects of nicotine (i.e. response reductions elevated to control levels via a nicotine-specific effect) would be made apparent as reduced extinction responding relative to non-degraded controls. All possible interpretations of degradation and extinction test performance for Experiment 2 are summarized in Table 5.1.

Table 5.1. All possible interpretations of degradation and extinction results (relative to responding for a distinct, non-degraded control reinforcer)

              Degradation               Extinction                Contingency
              Performance               Performance               Sensitivity?
Condition A   Reduced responses         No response differences   Uninterpretable (satiety)
Condition B   Reduced responses         Reduced responses         Sensitive
Condition C   No response differences   No response differences   Insensitive
Condition D   No response differences   Reduced responses         Uninterpretable (drug effect)
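The decision logic summarized in Table 5.1 can also be written as a small lookup. The sketch below is illustrative only: the function name `interpret_contingency_test` is hypothetical, and the assignment of outcomes to Conditions A to D follows the reasoning given in the preceding paragraphs (satiety predicts a reduction confined to degradation sessions; a masking drug effect predicts a reduction confined to extinction).

```python
def interpret_contingency_test(degradation_reduced: bool, extinction_reduced: bool) -> str:
    """Map the two test outcomes onto the interpretations of Table 5.1.

    Each argument states whether degraded animals responded less than
    non-degraded controls in that phase (an assumed yes/no simplification).
    """
    if degradation_reduced and not extinction_reduced:
        return "Condition A: uninterpretable (satiety)"
    if degradation_reduced and extinction_reduced:
        return "Condition B: sensitive"
    if not degradation_reduced and not extinction_reduced:
        return "Condition C: insensitive"
    return "Condition D: uninterpretable (drug effect)"


# Reductions in both phases correspond to contingency-sensitive responding.
print(interpret_contingency_test(True, True))
# No reduction in either phase corresponds to contingency-insensitive responding.
print(interpret_contingency_test(False, False))
```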

As a final note, because neither saccharin nor nicotine rewards were at any point devalued, testing under reacquisition conditions is irrelevant for contingency degradation paradigms (Dickinson & Mulatero, 1989).

5.3 Experiment 2: Results

5.3.1 Data Analysis for Experiment 2

Just as in Chapter 4, baseline analysis was performed in order to verify that no pre-existing differences in the groups assigned to nicotine-degradation or saccharin-degradation could confound results (considering non-patent animals were eliminated after counterbalancing). Separate sets of repeated-measures ANOVAs were conducted across the entire acquisition phase, as well as the final two days of the acquisition phase for both saccharin and nicotine sessions, using the within-factor of “Day” (acquisition session) and the between-factor of “Degradation” (future assignment to nicotine-degraded or saccharin-degraded conditions). For each reinforcer, an analysis was performed on each of the sessional metrics of reinforcements obtained, active lever responses, and inactive lever responses.

Performance in degradation sessions was assessed using repeated-measures ANOVA, using the within-factor of “Day” (degradation session) and the between-factor of “Degradation” (nicotine-degraded vs. saccharin-degraded), for sessional metrics of reinforcements earned, active lever responses, and inactive lever responses. Performance in extinction conditions was assessed using two separate univariate ANOVAs, each with a fixed-factor of “Degradation”, and distinct dependent measures of active and inactive lever responding.
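For readers less familiar with the univariate case, the F statistic for a single fixed factor can be computed by hand as below. This is a minimal sketch rather than the statistical software used for the thesis: the function name `one_way_anova` and the response counts are hypothetical, chosen only so that group sizes of 10 and 12 reproduce the degrees of freedom reported for the NicSac group, F(1, 20).

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way fixed-factor ANOVA."""
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variability of group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variability of individual animals around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical extinction active-lever counts, NOT data from this thesis
degraded = [4, 6, 5, 7, 3, 5, 6, 4, 5, 5]              # n = 10
control = [12, 15, 11, 14, 13, 16, 12, 14, 15, 13, 12, 14]  # n = 12
f_stat, df1, df2 = one_way_anova([degraded, control])
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```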

As in the previous experiment, analysis of active lever responding during extinction testing was repeated after expressing each animal’s responses as a proportion of its own baseline responding (i.e. averaged sessional active lever responses from days 9 & 10 of acquisition).
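The baseline-proportion transformation amounts to a single division per animal. The following sketch uses a hypothetical function name and made-up response counts purely to illustrate the arithmetic.

```python
def proportion_of_baseline(extinction_presses, day9_presses, day10_presses):
    """Express extinction responding relative to an animal's own baseline.

    Baseline is the mean of active lever presses on acquisition days 9 and 10.
    """
    baseline = (day9_presses + day10_presses) / 2
    if baseline == 0:
        raise ValueError("baseline responding is zero; proportion is undefined")
    return extinction_presses / baseline


# Hypothetical animal: 20 and 40 presses at baseline, 15 presses in extinction
print(proportion_of_baseline(15, 20, 40))  # 15 / 30
```

Expressing responding this way normalizes for individual differences in baseline response rate before the conditions are compared.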

5.3.2 Experiment 2A: Nicotine-Saccharin Group Results

5.3.2.1 Attrition

Of the 26 animals assigned to the Nicotine-Saccharin group, 3 were excluded from analysis due to loss of catheter patency, and 1 was excluded due to a failure to respond in operant sessions. A total of 22 animals were used in the final data analysis: 10 animals completed the study as saccharin-degraded and 12 animals completed the study as nicotine-degraded. Exclusion invalidated an animal’s NSA as well as SSA data.

The non-patency of an animal, though, does not invalidate its yoked partner’s data. Reinforcements obtained by the yoking partners of excluded animals were thus retained for analysis, while the matching reinforcements delivered to the excluded animals were not (which is why all subsequent graphs of reinforcements earned by degraded and non-degraded conditions during degradation sessions do not perfectly align). This remains true in the cases of nicotine reinforcement during degradation sessions, for both the NicSac and NicAlone groups.

5.3.2.2 Acquisition of Saccharin Responding

The average sessional performance of the NicSac group during the 10 acquisition days for self-administration of saccharin is expressed in adjusted saccharin reinforcements consumed (Figure 5.2A), active lever responses made (Figure 5.2B), and inactive lever responses made (Figure 5.2C).

Importantly, animals were not counterbalanced until the acquisition phase was completed, and during counterbalancing priority was given to equilibration of nicotine performance between conditions. As such, the degraded and non-degraded conditions given in Figure 5.2 reflect the performance histories of animals that were only later assigned to those conditions.

Accordingly, statistical analysis of the data presented in Figure 5.2 was not conducted, and the data are presented here for reference purposes only. The critical assessment of pre-existing baseline differences between the two experimental conditions utilized only data that contributed to counterbalancing, and is presented in Table 5.2.

Figure 5.2. Daily sessional averages (±SEM) in reinforcements earned (A), active lever responses (B), and inactive lever responses (C) for the NicSac group (n=22) in operant saccharin self-administration sessions (0.19ml of 0.1% w/v per reinforcement) during the 10-day acquisition phase. Assignment of animals to degraded and non-degraded conditions was performed only after the acquisition phase was complete; data presented in this figure thus reflects the performance history of animals only later assigned to those conditions for reference purposes, and therefore statistical analysis was not performed.

5.3.2.3 Baseline Differences in Saccharin Responding

For the final two operant sessions of acquisition in which NicSac group animals responded for saccharin (Table 5.2), repeated-measures ANOVA of adjusted reinforcements consumed revealed no significant effect of Degradation group [F(1, 20)=2.45, p=0.13], Day [F(1, 20)=2.63, p=0.12], or Day*Degradation interaction [F(1, 20)=0.65, p=0.43]. Similar analysis of active lever responses likewise revealed no significant effects of Degradation group [F(1, 20)=1.81, p=0.19], Day [F(1, 20)=0.82, p=0.38], or Day*Degradation interaction [F(1, 20)=0.63, p=0.44]. This was also true for inactive lever responses, with no significant effect of Degradation group [F(1, 20)=0.15, p=0.71], Day [F(1, 20)=0.63, p=0.44], or Day*Degradation interaction [F(1, 20)=0.87, p=0.36].

These results suggest that the differences between degradation conditions during saccharin acquisition diminished as the acquisition phase ended (and approached the sessions used for counterbalancing).

Table 5.2. Baseline data for group NicSac during saccharin self-administration.

METRIC                          FUTURE CONDITION     ACQUISITION DAY 9          ACQUISITION DAY 10
                                                     PERFORMANCE (±SEM [4])     PERFORMANCE (±SEM [4])
Reinforcements Earned [1]       Saccharin-degraded   24.87 (±4.06)              32.73 (±4.06)
                                Nicotine-degraded    18.61 (±4.57)              21.25 (±5.07)
Active Lever Responding [2]     Saccharin-degraded   60.30 (±10.68)             74.20 (±9.85)
                                Nicotine-degraded    47.08 (±11.84)             48.00 (±13.51)
Inactive Lever Responding [3]   Saccharin-degraded   2.90 (±0.71)               3.00 (±0.58)
                                Nicotine-degraded    3.25 (±1.12)               2.00 (±0.51)

[1] Average of [raw reinforcements earned – (unconsumed saccharin volume/reinforcement volume)] for individual animals
[2] Mean sessional active lever responses
[3] Mean sessional inactive lever responses
[4] Standard error of the mean
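The adjustment defined in the first footnote of Table 5.2 can be illustrated as follows. The function name and the example volumes are hypothetical; only the formula and the 0.19 mL reinforcement volume are taken from the text.

```python
REINFORCEMENT_VOLUME_ML = 0.19  # saccharin volume per reinforcement, as stated in the figure captions


def adjusted_reinforcements(raw_earned, unconsumed_ml, reinforcement_ml=REINFORCEMENT_VOLUME_ML):
    """Reinforcements actually consumed: raw earned minus the unconsumed equivalent."""
    return raw_earned - unconsumed_ml / reinforcement_ml


# Hypothetical session: 30 reinforcements earned, 0.95 mL left in the receptacle
consumed = adjusted_reinforcements(30, 0.95)
print(round(consumed, 2))                              # 30 - (0.95 / 0.19)
print(round(consumed * REINFORCEMENT_VOLUME_ML, 2))    # volume consumed, in mL
```

The same conversion applied in reverse gives the consumed volume: roughly 25 reinforcements at 0.19 mL each corresponds to about 4.75 mL.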

5.3.2.4 Saccharin Degradation Performance

Repeated-measures ANOVA of saccharin reinforcements consumed by degraded and non-degraded groups during the degradation phase (Figure 5.3A) did not reveal any significant effect of Degradation [F(1, 20)=0.004, p=0.95], but did reveal a significant effect of Day [F(3.35, 66.96)=5.44, p=0.001], with no significant Day*Degradation interaction [F(3.35, 66.96)=0.01, p=0.99]. This indicates that animals in the saccharin non-degraded (i.e. nicotine-degraded) condition progressively earned more reinforcements during saccharin degradation, and that reinforcements between conditions were equilibrated through yoking (data points in Figure 5.3A do not perfectly align because, when animals were tested for patency at the end of the study, single animals from yoking pairs were eliminated).

Analysis of active lever responses made during saccharin degradation sessions (Figure 5.3B) revealed a significant effect of Degradation [F(1, 20)=16.20, p=0.001], no significant effect of Day [F(3.43, 68.51)=1.39, p=0.24], but a significant Day*Degradation interaction [F(3.43, 68.51)=5.88, p=0.001], indicating that although there may not have been a change in responding across both groups combined, there was a simultaneous increase in responding of the nicotine-degraded group and a decrease in responding of the saccharin-degraded group. Responses made by the two groups separately were indeed different, as evidenced by the significant effect of Degradation.

Analysis of inactive lever responding (Figure 5.3C) revealed no effects of Degradation [F(1, 20)=0.67, p=0.42], Day [F(2.36, 47.28)=2.24, p=0.11], or Day*Degradation [F(2.36, 47.28)=0.85, p=0.45]. This suggests that the manipulation of contingency was relevant for active lever responding in saccharin sessions alone.

Figure 5.3. Daily sessional averages (±SEM) in reinforcements consumed (A), active lever responses (B), and inactive lever responses (C) for the NicSac group (n=22) in operant saccharin self-administration sessions (0.19ml of 0.1% w/v per reinforcement) during the 6-day degradation phase. For individual graphs, any significant repeated-measures ANOVA effects of Degradation are indicated by (**) for p<0.01; significant effects of Day are indicated by (††) for p<0.01; significant Day*Degradation interaction is indicated by (‡‡) for p<0.01.

5.3.2.5 Saccharin Extinction Performance

During testing of saccharin-seeking responses made under extinction conditions, ANOVA revealed a significant effect of Degradation on active lever responses made, both when expressed as raw data [F(1, 20)=13.89, p=0.001] (Figure 5.4A) and as baseline-proportion [F(1, 20)=17.41, p<0.001] (Figure 5.4B). This indicates that animals who were saccharin-degraded maintained a low level of responding even with rewards withheld, suggesting that response reductions during degradation sessions were due to genuine contingency learning and not a by-product of saccharin satiety. In contrast, there was no significant effect of Degradation on inactive lever responding [F(1, 20)=0.005, p=0.94] during extinction (Figure 5.4C), suggesting the contingency manipulation was relevant only for responses made on the active lever.

5.3.2.6 Effects of Squad on Saccharin Extinction

Because the NicSac group performed two extinction tests for the two different reinforcers in the same day, it may have been possible that performing an extinction test in the morning altered how a squad performed in the afternoon. To explore this possibility, a univariate ANOVA with between-factors of Degradation and Squad was performed on active lever responses made during saccharin extinction testing. Analysis revealed a significant effect of Degradation [F(1, 18)=14.08, p=0.001], but no effect of Squad [F(1, 18)=0.25, p=0.62] or Squad*Degradation interaction [F(1, 18)=2.14, p=0.16]. These results confirm that the saccharin-degradation procedure reduced the responding of degraded animals, and also suggest that the sessional order in which animals were tested was statistically inconsequential.

Figure 5.4. Averaged sessional lever responding (±SEM) of the NicSac group (n=22) during the 10-minute extinction test conducted in the saccharin self-administration context. Active lever responses are expressed both as mean sessional lever presses made (A), as well as a proportion of baseline responding (B), using active lever presses averaged from days 9 & 10 of acquisition. Inactive lever responding is expressed as mean sessional lever presses made (C). For individual metrics, any significant ANOVA effect of Degradation is indicated by (**) for p<0.01.

5.3.2.7 Nicotine-Saccharin Group: Acquisition of Nicotine Responding

The average sessional performance of the NicSac group during the 10 acquisition days for self-administration of nicotine is expressed in nicotine infusions earned (Figure 5.5A), active lever responses made (Figure 5.5B), and inactive lever responses made (Figure 5.5C).

Importantly, animals were not counterbalanced until the acquisition phase was completed. As such, the degraded and non-degraded conditions given in Figure 5.5 reflect the performance histories of animals that were only later assigned to those conditions.

Accordingly, statistical analysis of the data presented in Figure 5.5 was not conducted, and the data are presented here for reference purposes only. The critical assessment of pre-existing baseline differences between the two experimental conditions utilized only data that contributed to counterbalancing, and is presented in Table 5.3.

Figure 5.5. Daily sessional averages (±SEM) in reinforcements earned (A), active lever responses (B), and inactive lever responses (C) for the NicSac group (n=22) in operant intravenous nicotine self-administration sessions (0.03mg/kg/infusion delivered upon reinforcement) during the 10-day acquisition phase. Assignment of animals to degraded and non-degraded conditions was performed only after the acquisition phase was complete; data presented in this figure thus reflects the performance history of animals only later assigned to those conditions for reference purposes, and therefore statistical analysis was not performed.

5.3.2.8 Nicotine-Saccharin Group: Baseline Differences in Nicotine Responding

When operant sessions during the final two days of acquisition in which animals of the NicSac group responded for nicotine were analyzed for baseline differences (Table 5.3), ANOVA of nicotine infusions earned revealed no significant effect of future Degradation group assignment [F(1, 20)=0.60, p=0.45], Day [F(1, 20)=0.13, p=0.72], or Day*Degradation interaction [F(1, 20)=0.57, p=0.46]. Similar analysis for active lever responses revealed no main effect of Degradation [F(1, 20)=0.05, p=0.83], Day [F(1, 20)=0.39, p=0.54], or Day*Degradation interaction [F(1, 20)<0.01, p=0.99], and consistently, analysis of inactive lever responses revealed the same pattern of no significant effect of Degradation [F(1, 20)=1.19, p=0.28], Day [F(1, 20)=0.26, p=0.62], or Day*Degradation interaction [F(1, 20)=1.95, p=0.18].

These results indicate conditional baselines did not significantly differ, and differences in degradation and extinction data between nicotine-degraded and nicotine non-degraded conditions could be attributed to the degradation procedure.

Table 5.3. Baseline data for group NicSac during nicotine self-administration.

METRIC                          FUTURE CONDITION     ACQUISITION DAY 9          ACQUISITION DAY 10
                                                     PERFORMANCE (±SEM [4])     PERFORMANCE (±SEM [4])
Reinforcements Earned [1]       Saccharin-degraded   15.80 (±2.93)              15.10 (±2.47)
                                Nicotine-degraded    13.08 (±1.63)              13.33 (±1.36)
Active Lever Responding [2]     Saccharin-degraded   26.60 (±5.02)              25.10 (±4.41)
                                Nicotine-degraded    25.42 (±4.20)              24.00 (±2.47)
Inactive Lever Responding [3]   Saccharin-degraded   3.62 (±1.76)               3.85 (±1.51)
                                Nicotine-degraded    5.08 (±1.78)               3.58 (±1.83)

[1] Mean sessional nicotine infusions earned
[2] Mean sessional active lever responses made
[3] Mean sessional inactive lever responses made
[4] Standard error of the mean

5.3.2.9 Nicotine-Saccharin Group: Nicotine Degradation Performance

For the NicSac group’s nicotine infusions earned during degradation (Figure 5.6A), repeated-measures ANOVA revealed no significant effect of Degradation [F(1, 20)=0.45, p=0.51], Day [F(2.73, 54.62)=1.43, p=0.25], or Day*Degradation [F(2.73, 54.62)=0.35, p=0.77]. This indicates that the nicotine non-degraded (i.e. saccharin-degraded) condition earned a stable number of nicotine infusions across the six-day degradation phase, and reflects that the nicotine-degraded condition earned a comparable number of infusions through the yoking procedure.

Analysis of active lever responses made for nicotine during degradation (Figure 5.6B) similarly revealed no significant effect of Degradation [F(1, 20)=0.002, p=0.96], Day [F(2.54, 50.74)=2.20, p=0.11], or Day*Degradation [F(2.54, 50.74)=0.89, p=0.44]. These results suggest that, unlike active lever responding for saccharin, the nicotine-degradation procedure was inconsequential for the active lever responses made by the nicotine-degraded animals.

Analysis of inactive lever responding in nicotine degradation sessions (Figure 5.6C) revealed no significant effect of Degradation [F(1, 20)=1.82, p=0.19], a significant effect of Day [F(2.70, 54.04)=3.31, p=0.031], but no significant Day*Degradation interaction [F(2.70, 54.04)=1.13, p=0.341]. This indicates that animals progressively increased responding on the inactive lever across degradation, occurring in both nicotine-degraded and nicotine non-degraded groups.

Figure 5.6. Daily sessional averages (±SEM) in reinforcements earned (A), active lever responses (B), and inactive lever responses (C) for the NicSac group (n=22) in operant intravenous nicotine self-administration sessions (0.03mg/kg/infusion delivered upon reinforcement) during the 6-day degradation phase. For individual graphs, any significant repeated-measures ANOVA effect of Day is indicated by (†) for p<0.05.

5.3.2.10 Nicotine-Saccharin Group: Nicotine Extinction Performance

For active lever responding (Figure 5.7A) during extinction testing in the nicotine self-administration context, no effect of Degradation was observed [F(1, 20)=1.72, p=0.21], and this lack of effect was maintained when data was analyzed as proportions of baseline responding [F(1, 20)=1.13, p=0.30] (Figure 5.7B). This indicates that the comparable response rate of the nicotine-degraded group relative to the control group seen during degradation persisted into extinction. If contingency learning had in fact occurred but was masked by a behaviorally activating property of nicotine, responding would have been reduced in extinction (i.e. in the absence of nicotine); this suggests that animals of the NicSac group did not learn about changes in the instrumental contingency while responding for nicotine.

Similarly, there was no effect of Degradation on inactive lever responding (Figure 5.7C) during extinction testing [F(1, 20)=0.29, p=0.60], suggesting the contingency manipulation conditions had no effect on general activity.

5.3.2.11 Effects of Squad on Nicotine Extinction

Just as with the NicSac group’s extinction performance for saccharin responding, a univariate ANOVA with between-factors of Degradation and Squad was performed on active lever responses in extinction testing of nicotine responding. Analysis revealed no significant effect of Degradation [F(1, 18)=1.90, p=0.19], Squad [F(1, 18)=2.29, p=0.15], or Squad*Degradation interaction [F(1, 18)=1.47, p=0.24]. Like the squad analysis for saccharin responding, these results suggest that the sessional order in which animals of the Nicotine-Saccharin group were tested was statistically inconsequential.

Figure 5.7. Averaged sessional lever responding (±SEM) of the NicSac group (n=22) during the 10-minute extinction test conducted in the nicotine self-administration context. Active lever responses are expressed both as mean sessional lever presses made (A), as well as a proportion of baseline responding (B), using active lever presses averaged from days 9 & 10 of acquisition. Inactive lever responding is expressed as mean sessional lever presses made (C). Univariate ANOVA revealed no statistical significance for individual graphs in this figure.

5.3.3 Experiment 2B: Nicotine-Alone Group Results

5.3.3.1 Attrition

Of the 24 animals assigned to the Nicotine-Alone group, 3 were excluded from analysis due to loss of catheter patency, 1 from the control condition and 2 from the degraded condition. A total of 21 animals were used in the final data analysis; 11 animals completed the study as non-degraded controls, and 10 animals completed the study as nicotine-degraded.

5.3.3.2 Nicotine-Alone Group: Acquisition of Nicotine Responding

The average sessional performance of the NicAlone group during the 10 acquisition days for self-administration of nicotine is expressed in nicotine infusions earned (Figure 5.8A), active lever responses made (Figure 5.8B), and inactive lever responses made (Figure 5.8C).

Importantly, animals were not counterbalanced until the acquisition phase was completed. As such, the degraded and non-degraded conditions given in Figure 5.8 reflect the performance histories of animals that were only later assigned to those conditions.

Accordingly, statistical analysis of the data presented in Figure 5.8 was not conducted, and the data are presented here for reference purposes only. The critical assessment of pre-existing baseline differences between the two experimental conditions utilized only data that contributed to counterbalancing, and is presented in Table 5.4.

Figure 5.8. Daily sessional averages (±SEM) in reinforcements earned (A), active lever responses (B), and inactive lever responses (C) for the NicAlone group (n=21) in operant intravenous nicotine self-administration sessions (0.03mg/kg/infusion delivered upon reinforcement) during the 10-day acquisition phase. Assignment of animals to degraded and non-degraded conditions was performed only after the acquisition phase was complete; data presented in this figure thus reflects the performance history of animals only later assigned to those conditions for reference purposes, and therefore statistical analysis was not performed.

5.3.3.3 Nicotine-Alone Group: Differences in Baseline Responding

ANOVAs were conducted upon baseline data (Table 5.4) to uncover pre-existing condition differences. With respect to nicotine reinforcements earned there was no significant effect of Degradation condition [F(1, 18)=0.41, p=0.53], but there was a significant effect of Day [F(1, 18)=5.45, p=0.031] without a significant Day*Degradation interaction [F(1, 18)=1.79, p=0.20]. This indicates that both conditions earned more reinforcements on acquisition day 10 than acquisition day 9.

Analysis of active lever responses revealed no significant effect of Degradation condition [F(1, 18)=0.15, p=0.70], Day [F(1, 18)=1.96, p=0.18], or Day*Degradation interaction [F(1, 18)=1.51, p=0.24], and this same pattern was found with respect to inactive lever responses, with no effect of Degradation condition [F(1, 18)=0.81, p=0.38], Day [F(1, 18)=1.45, p=0.24], or Day*Degradation interaction [F(1, 18)=2.32, p=0.15]. These results indicate no pre-existing condition differences existed for active and inactive lever response metrics.

Table 5.4. Baseline data for the NicAlone group.

METRIC                          FUTURE CONDITION         ACQUISITION DAY 9          ACQUISITION DAY 10
                                                         PERFORMANCE (±SEM [4])     PERFORMANCE (±SEM [4])
Reinforcements Earned [1] (†)   Non-degraded (control)   11.27 (±1.20)              12.91 (±1.46)
                                Nicotine-degraded        13.00 (±1.24)              13.44 (±1.12)
Active Lever Responding [2]     Non-degraded (control)   25.55 (±5.02)              26.55 (±3.90)
                                Nicotine-degraded        22.33 (±3.85)              37.56 (±15.91)
Inactive Lever Responding [3]   Non-degraded (control)   6.45 (±1.96)               5.73 (±2.60)
                                Nicotine-degraded        7.78 (±3.81)               14.00 (±7.22)

[1] Mean sessional nicotine infusions earned
[2] Mean sessional active lever responses made
[3] Mean sessional inactive lever responses made
[4] Standard error of the mean
(†) indicates a significant effect of Day (p<0.05)

5.3.3.4 Nicotine-Alone Group: Degradation Performance

During the NicAlone group’s degradation sessions, ANOVA for nicotine infusions earned (Figure 5.9A) revealed no significant effects of Degradation [F(1, 18)=0.002, p=0.96], Day [F(3.38, 60.76)=1.80, p=0.15], or Day*Degradation [F(3.38, 60.76)=0.40, p=0.78].

For active lever responding (Figure 5.9B), analysis revealed no significant effects of Degradation [F(1, 18)=0.03, p=0.87], Day [F(1.94, 34.95)=1.19, p=0.32], or Day*Degradation interaction [F(1.94, 34.95)=1.21, p=0.31], indicating that animals of the Nicotine-Alone group responded on the active lever equivalently, regardless of being in the nicotine-degraded or non-degraded conditions.

The same pattern held true for inactive lever responding (Figure 5.9C), with no significant effects of Degradation [F(1, 18)=3.53, p=0.08], Day [F(1.84, 33.06)=0.51, p=0.59], or Day*Degradation interaction [F(1.84, 33.06)=0.75, p=0.47].

5.3.3.5 Nicotine-Alone Group: Extinction Performance

Just as with extinction responding for nicotine in the NicSac group, extinction active lever responding by the NicAlone group using raw data showed no effect of Degradation [F(1, 18)=0.01, p=0.91] (Figure 5.10A), nor did data expressed as proportion of baseline responding [F(1, 18)=0.15, p=0.70] (Figure 5.10B). Likewise, there was no effect of Degradation on inactive lever responding [F(1, 18)=0.02, p=0.88] (Figure 5.10C). As NicAlone animals performed only a single extinction test, no assessment of squad order effects was necessary.

Figure 5.9. Daily sessional averages (±SEM) in reinforcements earned (A), active lever responses (B), and inactive lever responses (C) for the NicAlone group (n=21) in operant intravenous nicotine self-administration sessions (0.03mg/kg/infusion delivered upon reinforcement) during the 6-day degradation phase. Repeated-measures ANOVA revealed no statistical significance for individual graphs in this figure.

Figure 5.10. Averaged sessional lever responding (±SEM) of the NicAlone group (n=21) during the 10-minute extinction test. Active lever responses are expressed both as mean sessional lever presses made (A), as well as a proportion of baseline responding (B), using active lever presses averaged from days 9 & 10 of acquisition. Inactive lever responding is expressed as mean sessional lever presses made (C). Univariate ANOVA revealed no statistical significance for individual graphs in this figure.

5.3.3.6 Cross-Group Nicotine Comparisons

In order to verify that performance of additional saccharin operant sessions by the NicSac group did not disrupt their performance in nicotine degradation or extinction sessions, a series of comparisons was conducted between the nicotine responding of the Experiment 2A: Nicotine-Saccharin group and that of the Experiment 2B: Nicotine-Alone group.

For degradation sessions, repeated-measures ANOVAs were conducted using the between-factor of Experiment and the within-factor of Day; one was for animals in the control conditions of each experiment, and the other for animals in the degraded conditions of each experiment. For animals in the control conditions, analysis exposed no significant effects of Experiment [F(1, 19)=1.21, p=0.29], Day [F(5, 95)=1.73, p=0.14], or Day*Experiment interaction [F(5, 95)=2.01, p=0.09]. Similarly for animals in the nicotine-degraded conditions, no significant effects of Experiment [F(1, 19)=0.26, p=0.62], Day [F(1.95, 37.00)=1.96, p=0.16], or Day*Experiment interaction [F(1.95, 37.00)=0.57, p=0.57] were found.

For extinction sessions, two univariate ANOVAs were conducted using the between-factor of Experiment, one for each of control and degraded conditions. A significant effect of Experiment was found for neither the control conditions [F(1, 19)=0.20, p=0.66] nor the degraded conditions [F(1, 19)=0.52, p=0.48].

Taken together, these results indicate that nicotine responding in Experiment 2A (the NicSac group) was comparable to that in Experiment 2B (the NicAlone group), and nicotine responding in the NicSac group was unaffected by the concurrent performance of saccharin sessions.

5.4 Experiment 2: Discussion

5.4.1 Data Summary

During both SSA and NSA degradation sessions, the yoking procedure employed ensured equal numbers of saccharin and nicotine reinforcements were delivered to both degraded and non-degraded conditions (Figures 5.3A & 5.6A). The stratified counterbalancing used to assign animals to degraded and non-degraded conditions was conducted in such a way that the rats of each condition would have similar means as well as distributions in terms of the number of reinforcements historically obtained.

In the NicSac group, non-contingent deliveries of saccharin over the 6 degradation sessions resulted in a reinforcer-specific reduction in active lever responses, whereas in both the NicSac and NicAlone groups, such non-contingent deliveries of nicotine did not lead to any significant changes in active lever responding. A similar pattern was observed during extinction testing; animals that were saccharin-degraded retained a low level of active lever responding relative to control during saccharin extinction, while nicotine-degraded animals responded in a fashion not significantly different from non-degraded animals during nicotine extinction.

5.4.2 Degradation and Extinction

5.4.2.1 Saccharin

In SSA sessions, animals of the NicSac group acquired operant responding to eventually perform an average of 60 active responses by the final acquisition day, to earn and consume around 25 saccharin reinforcements (~4.8mL). This amount is comparable with previous studies examining saccharin self-administration at the 0.1% (w/v) concentration (e.g. Shram et al., 2008). Although this SSA performance is slightly less than that observed in Experiment 1, distinct aspects of the pre-acquisition phase in Experiment 1 (i.e. saccharin pre-exposure in group PRE, and 6 days of being undisturbed in group POST) are potentially responsible and preclude direct comparison.

Over the 6 degradation sessions, active lever responses for saccharin increased in the non-degraded animals to approximately 70 lever presses, yet responses made by the degraded animals fell to approximately 10 lever presses by the third degradation session, remaining at this level throughout the remainder of the degradation phase (Figure 5.3B). Inactive lever responding during this period remained unchanged (Figure 5.3C). The reduction in active lever responding seen in the saccharin-degraded animals was specific for the saccharin reinforcer, as these same animals were non-degraded during NSA sessions and their active lever responding remained constant (Figure 5.6B). Furthermore, these differences in active lever responding persisted into extinction testing (Figure 5.4), suggesting that the response reductions observed during degradation were a genuine reflection of contingency learning, and not mediated by effects such as specific satiety (Dickinson & Mulatero, 1989).

This exact pattern of reductions in reinforcer-specific active lever responses during degradation persisting into extinction is characteristic of contingency-sensitive, goal-directed behavior; as such, it has previously been reported in literature accounts of goal-directed responding for natural rewards such as food pellets and sucrose solution (Bradfield & Balleine, 2013; Dickinson & Mulatero, 1989; Shillinglaw et al., 2014), as well as goal-directed responding for ethanol solution (Fanelli et al., 2013; Serlin & Torregrossa, 2015; Shillinglaw et al., 2014).

5.4.2.2 Nicotine

During NSA sessions, animals of groups NicSac and NicAlone comparably acquired operant responding such that both groups performed roughly 25 active lever responses to earn around 13 infusions by the final acquisition day. Not only is this consistent with the range of nicotine self-administration reported in many previous studies for the 1h session duration and the 30μg/kg infusion dose employed (e.g. Coen et al., 2009; Corrigall & Coen, 1989; Donny et al., 1995; Watkins et al., 1999), but it is also consistent with the NSA performance of animals in Experiment 1.

Unlike what was observed in saccharin responding during degradation, the contingency manipulation did not affect active lever responding for nicotine at any point during the 6 days of degradation in either Experiment 2A or Experiment 2B (Figures 5.6B and 5.8B, respectively). It is therefore clear that the 6 sessions of non-contingent nicotine delivery were unable to uncouple the association between active lever responding and nicotine delivery, reflecting an inability of the animals in Experiment 2 to learn the novel instrumental contingency for nicotine.

That the degradation procedure was seemingly of no consequence implies that the nicotine-degraded groups in both Experiments 2A and 2B were either incapable of learning about the changes in contingency brought on by the degradation procedure, or incapable of adjusting their responses accordingly. One could argue that some aspect of the training procedure, such as nicotine exposure, impaired contingency learning in general. However, because the animals of the NicSac group were trained equivalently for both rewards and the saccharin-degraded animals had their saccharin responding effectively degraded, this possibility may be ruled out.

Another possible explanation for the failure to observe any changes in active lever responses during nicotine degradation may be related to the potential behaviorally activating effects of nicotine induced by its non-contingent delivery, which may have masked any contingency learning that did occur. However, this possibility was ruled out through the nicotine extinction tests; if animals had successfully learned about the changes in contingency but were incapable of adjusting their responses due to any nicotine-evoked behavioral activation, the active responses of nicotine-degraded animals would have been reduced during extinction.

In Experiment 2A, there was indeed a trend for lower active responding for nicotine during the extinction test for nicotine-degraded animals (Figure 5.7). One potential confounding contribution to this trend may have been the prior extinction testing for SSA experienced by half the animals in each condition of the NicSac group earlier that day. However, a subgroup analysis within nicotine extinction performance for prior saccharin extinction (i.e. examination of nicotine extinction for significant effects of Squad order) did not yield any statistically significant effects (section 5.3.2.10). That is not to say that procedural artifacts in nicotine extinction testing could not still have contributed to the trend seen in the NicSac group. However, the NicAlone group of Experiment 2B performed neither SSA sessions nor saccharin extinction testing, and active lever responses made by the NicAlone group during nicotine extinction testing were essentially alike in the degraded and non-degraded conditions (Figure 5.10). This suggests that the non-significant differences in the NicSac group’s nicotine extinction performance were valid, and that nicotine-degraded animals of Experiment 2 did not reduce their active responding for nicotine either during degradation or in extinction testing.

Taken together, the lack of condition differences in the results of extinction testing in Experiment 2A and Experiment 2B implies that the irrelevance of nicotine-degradation to active lever responses made across the degradation sessions was not mediated by behaviorally activating effects of nicotine. Rather, when animals in Experiment 2 responded for nicotine during degradation, they were insensitive to the reduction in instrumental contingency.

One final interesting but non-significant trend was observed, in that inactive lever responding appeared to be elevated in the nicotine-degraded conditions of both NicSac and NicAlone groups during nicotine degradation sessions (Figures 5.6C & 5.8C). This effect was not observed during saccharin-degradation in either condition, suggesting specificity of the effect to the nicotine-degraded animals during nicotine degradation. It could be argued that this trend supports the notion of acute nicotine potentiating S-R learning processes. The S-R learning process is sensitive only to the contiguous pairing of action and reinforcer rather than the causal relationship between these events; Balleine & Dickinson (1998a) remarked that “superstitious” responding may develop in the case of a response (in this case, on the inactive lever) being coincidentally followed by reinforcement. As such, the trend observed may reflect erroneous S-R learning by nicotine-degraded animals facilitated by acute nicotine, as the trend was absent during saccharin degradation.

5.4.3 Conclusions

Control-level responding during degradation and extinction is characteristic of habitual behaviors that are contingency-insensitive. This exact pattern of results was previously reported by Bradfield et al. (2013) in the responding of animals with parafascicular nucleus lesions for food pellet and sucrose rewards. Bradfield et al. (2013) interpreted this result as a demonstration that these animals were responding habitually, and that the parafascicular nucleus was one brain region necessary for both contingency learning and goal-directed responding.

In the current study, however, the contingency-sensitive response reductions seen in experimental animals during saccharin degradation and extinction testing rule out the possibility that contingency learning in these animals was impaired. As such, the conclusions to be drawn from Experiment 2 are that following 10 daily training sessions under FR1 reinforcement schedules, saccharin-seeking is a contingency-sensitive behavior, while nicotine-seeking is a contingency-insensitive behavior.

The validity and implications of this interpretation will be explored in depth in the upcoming chapter.

CHAPTER 6: General Discussion

6.1 Experimental Conclusions and Alternative Interpretations

6.1.1 Overview

Operant responding in self-administration paradigms is simultaneously mediated by “response-outcome” and “stimulus-response” associations, but whether the former or latter is predominant respectively distinguishes the operant response as “goal-directed” or “habitual”.

Habitual behavior is identified by insensitivity both to reductions in outcome value and to reductions in the instrumental contingency. Two experimental approaches have been employed quite extensively to examine goal-directed vs. habitual behavior. In the first, the outcome-value of the operant reward is manipulated (typically through devaluation). The critical test of sensitivity is conducted under extinction conditions, as reinforcement with the devalued reward would distinctly perturb the associative learning (established during training) of devalued animals; in extinction, both devalued and non-devalued animals equally experience the outcome of “null consequence”. If responding for the reward is reduced in the extinction test, the behavior is regarded as goal-directed (Adams & Dickinson, 1981). If responding for the reward is unchanged during this extinction test, the behavior is controlled by antecedent stimuli and is considered habitual (Adams, 1982). In the second approach, the instrumental contingency is reduced by freely presenting the operant reward (degradation) without the need for appropriate responding. If the reward-specific responding is reduced during degradation, the behavior is thought to be goal-directed; if degrading the instrumental contingency has no impact on responding, then the behavior is regarded as habitual (Dickinson & Mulatero, 1989; Dickinson & Balleine, 1994). As pointed out by Dickinson & Balleine (1994) and Yin & Knowlton (2006), it is necessary to use both of these experimental approaches to establish whether any given behavior is genuinely goal-directed.
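
The notion of a "reduction in instrumental contingency" can be stated compactly using a formalization that is common in the contingency literature. It is offered here only as an illustrative sketch; the quantity below was not computed in the present work, and the symbols $R$ (a response in a given time bin) and $O$ (delivery of the outcome) are notation introduced for this illustration.

```latex
\Delta P \;=\; P(O \mid R) \;-\; P(O \mid \neg R)
```

Under FR1 training with no free deliveries, every response is followed by the outcome and the outcome never occurs otherwise, so $P(O \mid R) = 1$, $P(O \mid \neg R) = 0$, and $\Delta P = 1$. Freely presenting the reward raises $P(O \mid \neg R)$ and so lowers $\Delta P$; how far it falls depends on the details of the degradation procedure. On this view, an animal whose responding tracks $\Delta P$ would be described as contingency-sensitive.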

In the present work, both of these experimental approaches were employed to evaluate the goal-directedness of nicotine self-administration behavior, under conditions that have been historically shown as optimal for retaining goal-directed behavioral control. Each manipulation was additionally conducted on saccharin self-administration in the same experimental animals, for purposes of control and comparison. In both of these tests, we found that:

1. Following 10 sessions of SSA under an FR1 schedule, saccharin-seeking responses were sensitive to reduction in outcome-value as well as sensitive to reduction of the instrumental contingency, indicating that saccharin SA is a goal-directed behavior.

2. Following 10 sessions of NSA under an FR1 schedule, nicotine-seeking responses were insensitive to reduction in outcome-value as well as insensitive to reduction of the instrumental contingency, suggesting that nicotine SA may be a habitual behavior.

In this section, potential issues with these interpretations will be discussed in detail with respect to Experiment 1 (devaluation with LiCl) as well as Experiment 2 (contingency degradation). Subsequently, the apparent rapidity of the formation of nicotine-seeking habits will be considered alongside the possible factors that might influence it. But before proceeding with such discussions, the general issue of what experimental conditions have been demonstrated to facilitate goal-directed or habitual behavior will be revisited, and a summary of the relevant literature will be presented.

6.1.2 Optimal Conditions for Goal-Directed Behavior

6.1.2.1 Schedule and Training Duration

As discussed in the general introduction, the two major factors that ordinarily influence how quickly habits develop are the amount of instrumental training (Adams, 1982) and the reinforcement schedule employed during operant training (Dickinson et al., 1983).

Training animals on ratio schedules of reinforcement has traditionally been thought to produce behavior relatively resilient to habit formation (Dickinson et al., 1983). This was exemplified when Corbit et al. (2012) assessed the satiety devaluation sensitivity of animals responding for sucrose or ethanol solutions using daily training sessions under random-ratio schedules. For sucrose solution, a natural reward, ratio-schedule training did not produce outcome-insensitivity even after 8 weeks of daily training. Responding for ethanol, a drug of abuse which accelerates habit formation, still took 4 weeks of daily training for outcome-insensitivity to be observed. Similarly, Mangieri et al. (2012) report that responding for a combined sucrose/ethanol solution remained sensitive to LiCl devaluation following 8-9 days of VR training, and only after 17 days of training was outcome-insensitivity capable of being observed. When Mangieri et al. (2014) further assessed both VR and VI trained animals in an omission paradigm following 9 days of training, they reported that VR trained animals reduced their responding at a faster rate during degradation, indicating relative outcome-sensitivity in the VR training condition.

The amount of training necessary for habits to develop can be dramatically reduced should animals be trained on VI schedules. For example, rats trained on such interval schedules are capable of habitual responding for alcohol solution following 11 days of training (Dickinson et al., 2002), and capable of habitual responding for cocaine solution following 13 days of training (Miles et al., 2003). Yet even when using VI schedules, habit formation does not necessarily occur. Olmstead et al. (2001) reported goal-directed intravenous cocaine-seeking following 8 training sessions using interval reinforcement for the initial “drug-seeking lever” in their heterogeneous chained schedule paradigm (i.e. a paradigm in which reinforcement requires responding on sequentially presented levers with distinct reinforcement schedule requirements). Furthermore, Zapata et al. (2010) report that i.v. cocaine self-administration remained outcome sensitive despite 20-25 training sessions in a similar paradigm; only after having their animals perform an additional 36 training sessions was habitual i.v. cocaine-seeking observed.

6.1.2.2 Training Conditions in the Current Work

Given that relatively little has been done with respect to habit formation for intravenous reinforcers, the primary interest of the current work was to see whether the devaluation and degradation paradigms employed could detect goal-directed behavior, before attempting to look at habit formation over the long-term (or produced by interval-schedule training). Based on what has been discovered and reported in the literature mentioned above, training on an FR1 reinforcement schedule for only 10 sessions was chosen for the acquisition phase of Experiments 1 & 2, in order to maximize the likelihood that both nicotine and saccharin seeking behavior would remain goal-directed. The training duration of 10 sessions was chosen, as this amount of training is generally required for stability in NSA responding (e.g. Corrigall & Coen, 1989; Donny et al., 1995; Shoaib et al., 1997).

6.1.3 LiCl Devaluation

6.1.3.1 Saccharin Interpretations

As previously mentioned, aversion-pairing with LiCl has commonly been demonstrated to be an effective means of devaluation for many orally consumed reinforcers (Adams, 1982; Dickinson et al., 2002; Miles et al., 2003), and has historically been demonstrated as effective in conditioning taste aversions to saccharin solution (Kulkosky et al., 1980). In Experiment 1, aversion-pairing to saccharin solution was likewise found to be very effective, attenuating both the acquisition of SSA in Experiment 1A and the reacquisition of SSA in Experiment 1B with quite robust effects in both experimental groups. Consistent with studies examining natural reward-seeking habits following limited ratio-schedule training, reductions in both extinction and reacquisition testing revealed SSA as a goal-directed behavior. As an experimental control, effective LiCl-saccharin pairings in the same animals who responded for nicotine established not only that the pairing parameters employed (dose, concentration, etc.) were capable of producing conditioned aversions, but also that capabilities of incentive learning were intact.

6.1.3.2 Outcome-Sensitive Saccharin Responding and LiCl Concentration

One potential issue which may confound interpretation of aversion-pairing procedures is that the use of an inappropriate LiCl concentration could cause genuinely habitual behavior to misleadingly appear goal-directed. Balleine & Dickinson (1991) used an isotonic (0.15M) LiCl concentration to show a certain training procedure produced S-R mediated responding; Rescorla (1992) attempted replication using a hypertonic LiCl concentration (0.6M), but conflictingly found that the same training regimen was apparently producing R-O mediated behavior. To resolve this discrepancy, Balleine & Dickinson (1992) showed not only that anesthetization of animals during hypertonic LiCl pairing restored the expected S-R pattern of responding, but also that re-exposing those animals to the devalued reward revealed the anesthetized pairing was still an effective devaluation. They interpreted this result as suggesting that visceral discomfort from hypertonic LiCl injections could work independently of nausea-conditioned aversion to confound Rescorla’s (1992) extinction test.

In Experiment 1, the issue of LiCl osmolarity was addressed by using the appropriate isotonic LiCl concentration (0.15M) throughout all pairing procedures. As such, the possibility that genuinely habitual saccharin-seeking responses were erroneously observed to be goal-directed may be ruled out.

6.1.3.3 Nicotine Interpretations

In contrast to the behavioral consequences of saccharin-pairings, similar LiCl pairing procedures applied to intravenous nicotine in the same experimental animals produced a distinct pattern of effects on nicotine responding. Nicotine-paired animals responded much like unpaired controls in extinction tests, but reduced NSA in reacquisition (Figures 4.9, 4.10, & 4.11). This pattern of test results is consistent with outcome-insensitive behavior, much like literature reports of habitual responding for ethanol (Dickinson et al., 2002) as well as cocaine (Miles et al., 2003).

However, these observations also stand in contrast to the solitary previous investigation of i.v. nicotine and habit formation conducted by Clemens et al. (2014). While these authors also reported nicotine-seeking as an outcome-insensitive behavior, this was true only for animals given extended training (47 days of FR1). Clemens et al. (2014) also attempted devaluation of NSA following limited training (10 days under an FR1 schedule, comparable to that of the current work), but did not observe significant devaluation effects in reacquisition. This suggests that nicotine devaluation in this limited-training group was unsuccessful, preventing clear interpretation of whether NSA is outcome-sensitive at the 10-day timepoint.

Some procedural differences exist between Experiment 1 of the present study and Clemens et al. (2014). First, Clemens et al. (2014) employed a nose-poke operandum as the operant response, as opposed to the lever press used in the current work. These two distinct operanda can impact a number of behaviors associated with nicotine; direct comparisons have shown that while spontaneous acquisition of i.v. NSA is higher in animals trained to make nose-pokes, only animals trained to lever-press demonstrate cue-induced nicotine reinstatement (Clemens et al., 2010). These effects may furthermore be particular to nicotine; although nose-poke training increased apparent reward preferences throughout training, this effect of operandum was not observed for the case of intravenous cocaine (Clemens et al., 2010).

Similarly, nicotine drug effects can also be influenced by factors such as voluntary vs. passive delivery, prior familiarity with nicotine, and stress from experimenter handling (Donny et al., 2000; O’Dell & Khroyan, 2009). Nicotine infusions during pairing procedures were “experimenter-delivered” by Clemens et al. (2014), whereas the nicotine paired in Experiment 1 was infused non-contingently. It is possible that the experience of handling or restraint during the pairings of Clemens et al. (2014) could have influenced LiCl-nicotine associations. Similarly, the mere two nicotine infusions delivered in the pairings of Clemens et al. (2014) may have yielded a relatively weak interoceptive nicotine cue, again contributing to the failure of devaluation.

In the present study, steps were taken to maximize the likelihood of a strong nausea-nicotine association. As the magnitude of the devaluation effect induced by LiCl has been found to depend on the absolute quantity of LiCl injected (Paredes-Olay & López, 2002), the LiCl dosage was increased from 63.6mg/kg used by Clemens et al. (2014) to 89.7mg/kg in Experiment 1. The number of non-contingent nicotine infusions associated with each pairing was increased from the two infusions administered by Clemens et al. (2014) to five infusions in the current work; furthermore, these infusions were delivered using infusion pumps identical to those which delivered nicotine during i.v. NSA, allowing animals to experience nicotine effects undisturbed.

The pairings employed in Experiment 1B were clearly effective in impairing the reacquisition of NSA in nicotine-paired animals, although this effect was not as robust as in the case of saccharin. However, as seen in both raw and ratio data, it is quite evident that pairing with LiCl reduced NSA in nicotine-paired animals during both days of reacquisition (Figures 4.10 & 4.11). Despite such impairments of NSA reacquisition, no differences were observed between nicotine-paired and unpaired animals in extinction test responding, the cardinal assay for goal-directed vs. habitual behavior. Extinction responding for nicotine in Experiment 1B was essentially similar between pairing conditions, and on the basis of both extinction and reacquisition tests, the results of Experiment 1B indicate that nicotine-seeking behavior developed after 10 training sessions of FR1 reinforcement was insensitive to reductions in nicotine’s incentive value.

6.1.3.4 Outcome-Insensitive Nicotine Responding and Re-Exposure

Another potential issue with aversion-pairing paradigms is that insufficient pairing repetitions can cause genuinely goal-directed behavior to deceptively appear habitual. As a theory, “incentive learning” predicts that should there be no experience with a devalued reward prior to extinction testing, incentive learning will not have had the chance to occur; the extinction test would therefore reveal no devaluation effects, even if animals are goal-directed (Dickinson & Balleine, 1994). Balleine & Dickinson (1991) tested exactly this by utilizing a single injection of LiCl to devalue a sucrose solution; if sucrose-paired animals were not given additional sucrose contact between the LiCl injection and the test, no change in extinction responding occurred. In the absence of incentive learning, even a successfully devalued outcome for goal-directed animals does not translate into reduced extinction test responding.

In Experiment 1, insufficient re-exposure to nicotine could have meant that the apparently habitual nicotine-seeking observed was in truth goal-directed. To prevent this issue, LiCl-reward pairings were repeated in 3 cycles, with each cycle involving outcome exposure followed by LiCl injection; during the second and third pairing cycles, animals were therefore twice re-exposed to the devalued reward. Not only did multiple pairing cycles help ensure that LiCl-nicotine associations had ample potential to be established, but they also pre-empted “missing re-exposure” from confounding nicotine extinction tests.

6.1.3.5 Failure of Nicotine Devaluation Prior to Acquisition

A surprising finding regarding the nicotine-paired animals of group PRE was the absence of any devaluation effect on the acquisition of NSA (Figure 4.3), as the same devaluation procedure effectively attenuated the acquisition of SSA in the same experiment (Figure 4.2). The successful devaluation of saccharin implies that the pairing procedure was capable of conditioning aversions, and that incentive learning by animals of Experiment 1A was not compromised. It must be concluded that i.v. nicotine was unsuccessfully devalued in group PRE due to some failure of association between nicotine and the effects of LiCl. In contrast, however, the devaluation of nicotine in group POST was successful.

This is quite the opposite of what one would expect to see if this were a consequence of mere reinforcer exposure; more experience with a reinforcer before devaluation has been shown to necessitate greater re-exposure to induce the same devaluation effects (Dickinson et al., 1993; Dickinson et al., 1995). Yet in Experiment 1, the nicotine-paired animals with greater unpaired-reinforcer experience (group POST) manifested a devaluation effect, while the animals with no unpaired-reinforcer experience (group PRE) did not. This occurred despite both groups undergoing identical pairing procedures, including identical re-exposure to devalued nicotine (occurring during each pairing cycle).

It is possible that i.v. nicotine was unsuccessfully devalued due to a failure of animals to generalize between the nicotine non-contingently delivered during pairings and the nicotine voluntarily delivered (response-contingently) during NSA sessions, particularly in the drug-naïve animals of group PRE. A number of differences in the neurochemical and behavioral effects of drugs have been reported for voluntary self-administration vs. non-contingent delivery. As examples, drug-seeking behavior as well as DA transmission in the NAc can both be differentially affected by contingent vs. non-contingent delivery of cocaine; furthermore, behavioral and biochemical cocaine-sensitization may be preferentially induced by non-contingent cocaine delivery (Markou et al., 1999; Miguéns et al., 2008; Lecca et al., 2007). Similarly, while both response-contingent and non-contingent nicotine can elevate the blood corticosterone levels of rats within 15 minutes, these levels return to normal within an hour following response-contingent nicotine delivery alone (Donny et al., 2000). If non-contingent delivery during pairings contributed to the failure of nicotine devaluation in group PRE, why was saccharin devaluation in the same animals successful? One readily available explanation is that during LiCl-saccharin pairings, saccharin was voluntarily consumed by animals in LAP sessions, like their voluntary consumption during SSA. Only nicotine was non-contingently delivered during pairings.

However, nicotine was not only non-contingently delivered during the pairings of group PRE, but also in those of group POST. Why, then, was the nicotine devaluation in group POST effective? Another possibility is that some form of experience with nicotine is necessary before its interoceptive effects are capable of associative conditioning with other stimuli; unlike group PRE, animals of group POST had ten sessions of NSA experience before undergoing pairings. Recall that Clemens et al. (2014) similarly reported effective nicotine devaluation in extended-trained (47 days) animals, but not in those briefly trained (10 days). Clemens et al. (2014) themselves suggested that greater experience with nicotine in their extended-trained group may have allowed “richer representation of internal drug effects” to offer a more salient interoceptive cue for association with LiCl-induced nausea.

6.1.4 Contingency Degradation

6.1.4.1 Major Interpretations

The parameters for operant acquisition in Experiment 1 were similarly employed in Experiment 2 (i.e. 10 acquisition sessions of FR1 reinforcement, concurrent training for both saccharin and nicotine in the same animals). This training regimen produced SSA responding which was readily reduced in saccharin-degraded animals, who received “yoked” non-contingent saccharin deliveries (Figure 5.3). Saccharin-degraded animals also retained this response suppression during saccharin extinction testing (Figure 5.4), clearly indicating that the degradation of SSA did not reduce active responding due to saccharin-specific satiety, but rather through novel contingency learning.

These experiments together demonstrate that training parameters in the acquisition phases of these experiments produced SSA responding sensitive to reductions in both outcome-value (Experiment 1B) and the instrumental contingency (Experiment 2A). As such, SSA behavior observed in the current work may be concluded to be goal-directed (Balleine & Dickinson, 1994; Yin & Knowlton, 2006).

Unlike the degradation of saccharin, degradation of NSA did not reveal statistically significant differences in the active lever responding of the two degradation conditions. In both Experiments 2A and 2B, such differences did not occur following 6 degradation sessions (Figures 5.6 & 5.8), and were not apparent in extinction testing (Figures 5.7 & 5.9). Since nicotine-degraded animals received yoked nicotine infusions consistent with the non-degraded conditions, effects of nicotine satiety may be ruled out as influential upon these results; behaviorally-activating effects of nicotine are considered below.

6.1.4.2 Potential Issues in Nicotine Degradation

A number of other important factors could possibly confound interpretation of NSA degradation. To our knowledge, Experiment 2 constitutes the first attempt to use a contingency degradation paradigm to evaluate habit formation for an intravenously self-administered drug. As discussed before, one major concern was the potential induction of behaviorally activating effects by the non-contingently delivered drug (Ostlund & Balleine, 2009). Nicotine is capable of complex influences on general locomotor activity (Stolerman et al., 1995). While nicotine can dose-dependently induce locomotor depression in drug-naïve animals (Clarke & Kumar, 1983a), repeated nicotine exposure not only can attenuate these effects (Miller et al., 2001), but can also produce activation of general locomotion in nicotine-tolerant animals (Clarke & Kumar, 1983b).

If the nicotine non-contingently delivered during degradation either elevated locomotor activity or potentiated nicotine-seeking behaviors, any reductions in the responding of nicotine-degraded animals could potentially have been masked. While the 0.03mg/kg/infusion nicotine dose employed in the current work is relatively small compared to the dosages employed in relevant studies of general activity (e.g. a range of 0.1-0.4mg/kg dose-dependently increases locomotor activity in nicotine-tolerant rats; Clarke & Kumar, 1983b), repeated infusions across each degradation session mean that relevant cumulative dosages of nicotine were indeed achieved (for example, 0.39mg/kg of nicotine would have been administered if 13 infusions were delivered). Furthermore, priming injections of nicotine at doses in this range have also been reported capable of reinstating nicotine-seeking (e.g. 0.03mg/kg observed by Chiamulera et al., 1996; 0.3mg/kg observed by Shram et al., 2008). Although these potential influences on nicotine degradation cannot be completely ruled out, the absence of any apparent contingency learning in nicotine extinction sessions (i.e. the absence of reduced responding by nicotine-degraded animals relative to controls) suggests that NSA responding was indeed insensitive to the degradation procedure, and that contingency learning in these animals did not occur.
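
The cumulative-dose arithmetic in the preceding paragraph can be made explicit with a short sketch. It assumes simple summation of unit doses with no allowance for nicotine clearance within the session, so it gives an upper bound on exposure and is not equivalent to a single bolus dose. The function and constant names are illustrative and are not part of the original analysis.

```python
# Illustrative arithmetic only: sums unit doses and ignores nicotine
# clearance during the session (an assumption, not a pharmacokinetic model).

UNIT_DOSE_MG_PER_KG = 0.03               # per-infusion dose stated in the text
LOCOMOTOR_RANGE_MG_PER_KG = (0.1, 0.4)   # range cited from Clarke & Kumar (1983b)


def cumulative_dose(n_infusions: int, unit_dose: float = UNIT_DOSE_MG_PER_KG) -> float:
    """Return the summed nicotine dose (mg/kg) after n_infusions."""
    if n_infusions < 0:
        raise ValueError("n_infusions must be non-negative")
    return n_infusions * unit_dose


def in_locomotor_range(total_dose: float) -> bool:
    """True if the summed dose lies within the cited locomotor-activating range."""
    low, high = LOCOMOTOR_RANGE_MG_PER_KG
    return low <= total_dose <= high


if __name__ == "__main__":
    total = cumulative_dose(13)  # 13 infusions, as in the worked example
    print(f"{total:.2f} mg/kg, within cited range: {in_locomotor_range(total)}")
```

With 13 infusions the summed dose falls near the top of the cited 0.1-0.4mg/kg range, whereas a handful of infusions (three or fewer) would remain below it.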

As a second potentially confounding issue, nicotine has a curious influence upon concurrently presented non-nicotine stimuli, encapsulated in a “dual-reinforcement” model of nicotine effects (Caggiula et al., 2009). Briefly, this model states that while nicotine alone possesses modest primary reinforcing effects, it can significantly potentiate the rewarding properties of concurrent neutral stimuli. As examples, rats making operant responses to illuminate a visual cue can have their responding enhanced by passive nicotine injections (Donny et al., 2003). Similarly, such visual stimuli alone are capable of producing higher operant response rates than i.v. nicotine infusions alone, but the combined delivery of the two can synergistically elevate response rates to greater levels than for either individually (Caggiula et al., 2002; Chaudhri et al., 2006; Palmatier et al., 2006). It is possible that the maintained responding of nicotine-degraded animals during nicotine-degradation could reflect a nicotine-enhanced valuation of non-nicotine elements of the compound reinforcement, in this case the concurrently presented visual and auditory reward cues. Similarly, these nicotine-associated cues may have themselves potentiated nicotine-seeking, as is commonly seen in studies of cue-induced nicotine reinstatement (e.g. LeSage et al., 2004). However, it can be argued that this scenario is unlikely to be the case. Since associated cues were never presented in the absence of reinforcements, the correlation between responses and cues was degraded to the exact same extent as that of responses and rewards. Furthermore, nicotine-associated cues were identical to cues associated with saccharin delivery; during saccharin-degradation, the repeated presentation of these cues did not prevent saccharin-degraded animals from reducing their responding. That being said, it remains possible that given the dual-reinforcing properties of nicotine, it may take longer for degradation of NSA to occur. In future studies, this possibility may be addressed by examining NSA and nicotine-degradation in the absence of reward-associated cues, and across lengthier degradation phases.


6.1.5 Other Potential Issues

6.1.5.1 Differences in Session Duration

Following identical training conditions, the use of two cardinal assays for distinguishing goal-directed vs. habitual behavior (namely reward devaluation and contingency degradation) revealed goal-directed saccharin-seeking behavior, but habitual nicotine-seeking behavior. One potential difference which might confound the comparison between nicotine and saccharin performance in these assays is the distinct duration of operant sessions employed for these rewards during operant training, with SSA sessions being 30 minutes in length, and NSA sessions lasting 60 minutes.

Response repetition is one major factor (alongside instrumental contingency) that influences the rate of habit formation (Adams, 1982; Dickinson, 1985; Dickinson & Balleine, 1994). One could argue that longer session durations allow more responses to be made, and could therefore facilitate the formation of habitual behavior. This is, however, unlikely to be the case; an examination of responding in SSA and NSA sessions reveals that far more active-lever responses were made for saccharin than for nicotine.

6.1.5.2 Generalized Habit Formation by Nicotine

In the general introduction, it was mentioned that the rate of instrumental habit formation for natural rewards is accelerated in a general and non-specific sense should animals be exposed to certain drugs of abuse even outside of the operant context. Although this specific effect has not yet been examined for nicotine, a similar finding detailed how subcutaneous nicotine injections prior to sucrose solution self-administration sessions not only increased the apparent reinforcing effects of sucrose, but furthermore increased the sucrose cue-reactivity of animals that had been previously trained in that fashion (Grimm et al., 2012). A similar risk to that described in the minor subsection above, then, may have been that nicotine could facilitate habitual responding for saccharin. It appears evident that this was not the case, but this does not rule out the possibility that nicotine may generally accelerate habit formation. Why such an effect was not observed could be due to various reasons, but a definitive conclusion is difficult to reach. The time-course of generalized acceleration of habit formation is not yet as well characterized as that of specific habit formation, with Corbit et al. (2012) reporting that the generalized acceleration of sucrose-seeking habit by ethanol required 56 days of ethanol exposure. It may be possible that any generalized effect nicotine had upon the formation of saccharin-seeking habits was not yet detectable in the present work, but to state so conclusively, an additional experiment comparing SSA habit formation in nicotine-exposed and non-exposed animals would be required over multiple time-points of comparison.

6.1.6 The Rapid Formation of Nicotine-Seeking Habits

Taken together, Experiments 1 and 2 suggest that nicotine-seeking responding following 10 sessions of FR1 training is habitual. In the same experimental animals, saccharin-seeking was simultaneously demonstrated by the same behavioral assays to be goal-directed. As such, alternative interpretations which can be immediately ruled out include the inability of experimental procedures to detect goal-directed actions, as well as any general impairment in the expression of goal-directed behaviors caused by nicotine exposure.

Particularly noteworthy is the apparently rapid rate of nicotine-seeking habit formation suggested by the two experiments conducted, with nicotine-seeking behavior apparently transitioning to a habit following a mere 10 sessions of training on FR1 schedules. This would indicate an even faster rate of accelerated habit formation than for alcohol and cocaine, the other two drugs which have been explicitly shown to accelerate habit formation; although habitual responding for these two substances was observed to occur at comparable lengths of training to that of the current work (Dickinson et al., 2002; Miles et al., 2003), the experiments demonstrating such utilized VI reinforcement during training. The current observation of insensitivity toward both outcome value and instrumental-contingency reductions was rather unexpected, following so little ratio-schedule training. Studies of habit formation for natural rewards have consistently shown that ratio-schedule training leads to goal-directed behavior that persists through several weeks, such as at least 8 weeks of daily training in outcome devaluation (Corbit et al., 2012) as well as 6 weeks of training in contingency degradation paradigms (Shillinglaw et al., 2014). When responding for ethanol was trained in animals using both comparable training lengths (8-9 days) and upon ratio schedules as in the present work, it was found that they remained goal-directed as evaluated by both aversion-pairing (Mangieri et al., 2012) as well as omission (Mangieri et al., 2014) paradigms. Why might nicotine seem to form habits so quickly? The unexpectedly rapid rate of habit formation suggested by the experiments of this thesis work opens the possibility that nicotine’s mechanisms of action may have peculiar interactions with the neural substrates of habit formation. A consideration of three potential sites of mechanistic interaction is given below.

6.2 Possible Mechanistic Interpretations

6.2.1 Phasic VTA Dopamine Signals and the DLS

Dopaminergic cells of the VTA are typically held under strict control by inhibitory GABAergic afferent input, keeping a sizable portion of them in an inactive, hyperpolarized state (Grace et al., 2007). When released from inhibition, DA neurons of the VTA display a capacity for distinct firing modes. Spontaneous “tonic” firing generally consisting of individual action potentials is mediated by pacemaker-like membrane conductance, occurring continuously but irregularly at a low frequency (2-10 Hz), while non-spontaneous “phasic” firing occurs in low-latency “bursts” of action potentials at a high frequency (15-28 Hz) (De Biasi & Dani, 2011; Schultz, 2007a). Phasic firing of VTA DA neurons results in massive synaptic dopamine release (Floresco et al., 2003), and is thought to correspond with perception of salient, behaviorally relevant events for the organism, such as unexpected rewards or related cues (De Biasi & Dani, 2011; Redgrave et al., 2008).

Although noxious and aversive stimuli may sometimes silence or depress phasic firing (Danjo et al., 2014; Redgrave et al., 2008; Schultz, 2007b), other kinds of negative stimuli such as social-defeat stress can also increase phasic DA signaling (Anstrom et al., 2009). This highlights a critical concept for understanding the role of DA signaling in the brain: although feelings of reward necessitate and are mediated by burst firing in the VTA, this is not to say that phasic signaling is in itself rewarding. Rather, it is more likely that phasic DA signals encode aspects of salience, predicted salience, and saliency prediction error, with these including aspects of both reward and aversion (Pignatelli & Bonci, 2015; Wenzel et al., 2014). Just how these distinct behavioral relevancies are encoded by the VTA’s activity likely involves heterogeneity of DA neuron subtypes within the structure, as well as distinct afferent inputs to and axonal projections from each VTA neural subpopulation (Cohen et al., 2012; Lammel et al., 2012; Lammel et al., 2014). With respect to nicotine reward, it is likely that phasic DA signaling from the VTA to the NAc is the most critical substrate for hedonic affect and behavioral reinforcement; in order for DA release within the NAc to be enhanced, burst firing in VTA inputs is likely critical (Mansvelder & McGehee, 2000).

Mutant mice with genetic knock-out of a certain N-methyl-D-aspartate receptor (NMDAR) subtype regionally restricted to the VTA and SNc (DA-NR1-KO mice) exhibit an interesting phenotype, in that they display normal goal-directed responding following satiety devaluations, but remain goal-directed following interval-schedule training that produces habitual behavior in wild-type controls (Wang et al., 2011). Electrophysiology conducted by these authors showed that the DA neurons in the VTA of DA-NR1-KO mice exhibited normal tonic firing, but reduced phasic firing. The capability of the DA-NR1-KO mice to perform the operant task and exhibit goal-directed responding implies a unique role for phasic DA firing in the VTA and NAc in the process of habit learning. Aggarwal & Wickens (2011) provide additional comment, noting that the DLS possesses both an extremely robust dopaminergic innervation by the midbrain as well as a dense population of DA transporters. They speculate that regional differences in DA dynamics may imbue the DLS with a unique sensitivity to phasic DA signals, which could potentially contribute to habit formation.

Should this be the case, it is possible that nicotine may influence habit formation through modulation of phasic signaling from the VTA to the DLS. As nicotine is capable of inducing burst firing within the VTA (Grenhoff et al., 1986; Schilstrom et al., 2003) as well as increasing the ratio of phasic bursts relative to tonic firing (Zhang et al., 2009), enhanced phasic DA signaling elicited by nicotine may act upon the DLS to facilitate habit formation.

6.2.2 Cholinergic Influences on Habit Formation

6.2.2.1 Cholinergic Innervation of Mesolimbic Structures

To consider another possibility, phasic burst-firing in the VTA may depend on afferent synaptic input (Doyon et al., 2013). The excitatory neurotransmitter glutamate is not alone sufficient to produce burst-firing in vitro (Grace & Onn, 1989; Lodge & Grace, 2006), suggesting at least some necessity for neuromodulatory regulation or gating mechanisms for phasic DA release (Doyon et al., 2013; Floresco et al., 2003; Grace et al., 2007).

This gating mechanism may be mediated by two cholinergic nuclei, collectively referred to as the pedunculopontine mesencephalic tegmentum (PMT) (Maskos, 2008). One of these nuclei, the pedunculopontine tegmentum (PPTg), directly regulates burst firing of VTA neurons via glutamatergic and cholinergic signaling (Grace et al., 2007). However, Lodge & Grace (2006) found that tonic input from the other of these nuclei, the laterodorsal tegmental nucleus (LDTg), is required for glutamate-elicited burst firing in vivo. The LDTg sends glutamatergic, cholinergic, and GABAergic inputs to the VTA, and inactivation of the LDTg prevents burst firing within the VTA.

The LDTg may be the permissive ‘gate’ that allows DA neurons to respond to glutamatergic input from other afferents, with cholinergic transmission via nAChRs suggested to play a key role (Grace et al., 2007; Lodge & Grace, 2006). The prefrontal cortex (PFC) is one such glutamatergic afferent, but the receptive DA neurons typically project back to the cortex, as opposed to the more relevant NAc (Carr & Sesack, 2000; De Biasi & Dani, 2011). In contrast, the VTA DA neurons which receive glutamatergic and cholinergic PPTg signals project predominantly to the NAc (De Biasi & Dani, 2011; Omelchenko & Sesack, 2005).

Together, the PPTg and LDTg respectively drive and gate the phasic firing of VTA DA neurons, and likely do so through cholinergic mechanisms (Grace et al., 2007). The PPTg itself may be regulated through cholinergic control; injection of the α4β2 nAChR-selective antagonist dihydro-beta-erythroidine (DHβE) directly into the PPTg has been found to decrease nicotine self-administration (Lanca et al., 2000), even though it is cholinergic signaling from these cells to the VTA which is known to be critical. The PPTg, though, also contains glutamatergic and GABAergic neurons interspersed among cholinergic cell clusters (Wang & Morales, 2009), and Lanca et al. (2000) found that Fos-positive nuclei were observed almost exclusively in these non-cholinergic neurons following nAChR antagonism. The glutamatergic and GABAergic neurons of the PPTg may therefore represent an upstream mediator of nicotine’s action on the VTA and DLS.

6.2.2.2 The PPTg and Habit Formation

Two findings make the PPTg exceptionally interesting within the context of nicotine and habits. The first is that a complex role for the PPTg has been implicated in nicotine self-administration: while lesions of the PPTg’s posterior portion have been shown to elevate i.v. NSA (Alderson et al., 2006), inactivation of the PPTg with GABA agonists reduces i.v. NSA without altering progressive-ratio performance (Corrigall et al., 2001). Furthermore, muscarinic agonism or mu-opioid agonism by intracranial microinfusion into the PPTg has also been shown to reduce nicotine self-administration on fixed-ratio schedules (Corrigall et al., 2002).

The second and more intriguing finding is that the posterior PPTg (pPPTg) has itself been implicated in goal-directed action control. Inactivation of the pPPTg by microinfusion of the GABAA agonist muscimol rendered rats insensitive to a contingency-degradation paradigm, with the authors concluding that the pPPTg is critical in action-outcome learning (MacLaren et al., 2013).

The PPTg, then, could potentially act as an interface through which nicotine could simultaneously influence both the mesolimbic circuitry as well as the neurobiological determinants of action control, and stands out with special interest as a brain region worth future study.

6.2.3 Habits and the Insular Cortex

A final potential site of interaction is the insular cortex (IC). While the role of the IC in goal-directed actions is less clear, Balleine & Dickinson (1998a, 2000) report that bilateral quinolinic acid lesions of the IC prior to training seem to interfere with the capacity for incentive learning; while IC-lesioned animals were unimpaired in operant acquisition for food pellet rewards and appropriately reduced responding under contingency degradation conditions, they did not show any sensitivity to satiety devaluation. Parkes and Balleine (2013) used functional blockade of the IC to disrupt outcome-sensitivity in satiety testing, elaborating on the IC’s role with the proposition that it may retrieve memories relevant to incentive learning. A functionally opposite manipulation of the IC, electrical stimulation, was found by Pushparaj and colleagues (2013) to attenuate both nicotine self-administration and the reinstatement of nicotine-seeking from presentation of associated cues or nicotine-priming injections.

While the potential interaction between nicotine and the IC’s retrieval of incentive learning has yet to be uncovered, the second of the above findings alludes to a possibility that manipulations of the IC, such as deep brain stimulation (Pushparaj et al., 2013), could potentially be developed to reduce the influence of nicotine-seeking habits.

6.3 Relevance to Human Tobacco Use

6.3.1 Smoking as a Maladaptive Incentive Habit

6.3.1.1 Cue-Induced Relapse

The opening pages of this thesis outlined how tobacco use results in serious detriments to personal and societal health, which could be avoided if only smoking cessation could be maintained.

Two major events have been identified as triggering relapse in abstinent drug users: drug-associated cues, and the induction of a state of stress (Robbins & Everitt, 1999).

As habits are responses automatically triggered by antecedent stimuli, the role of “stimulus” with respect to nicotine-seeking habits could refer to the nicotine-associated cues which trigger relapse (Robbins & Everitt, 1999). With respect to human smokers, what constitutes a nicotine-associated cue has been illuminated by a series of studies examining “cue reactivity”, a paradigm in which the presentation of drug-related stimuli to drug users elicits a stable profile of subjectively reported and physiologic measurements (Carter & Tiffany, 1999). It has been found that exposing human smokers to “proximal” cues inextricably linked with smoking (e.g. photographs of a lit cigarette) increases their desire to smoke (Conklin & Tiffany, 2001; Sayette et al., 2001). Proximal cues include not just visual but also haptic aspects of smoking; having smokers hold a cigarette in their fingers provokes changes in neural cue-reactivity (captured by fMRI) exceeding those produced by the presentation of images alone (Yalachkov et al., 2013). Nicotine-associated stimuli that do not necessarily co-occur with the behavior are termed “distal” cues, for example the place where a smoker routinely smokes. Although proximal cues more potently elicit cue-reactivity, the presentation of distal cues in the form of photographs depicting typical smoking environments (such as bus stops, bars, and restaurants) is also capable of doing so, even when those photographs exclude any depiction of cigarettes, smoking, or other proximal cues (Conklin et al., 2008; Conklin et al., 2010).

While it has not yet been directly established to what extent cue-reactivity relates to actual smoking behavior (Perkins, 2009; Perkins, 2011), it regardless remains evident that environments associated with nicotine trigger relapse, and that smoking cues are identified by smokers as a consistent contributor to a majority of relapse episodes (Ferguson & Shiffman, 2009; Shiffman et al., 1996; Shiffman, 2009). Not only does the presence of other smokers increase the likelihood of relapse (Shiffman et al., 1996), but even seeing pictures of people around whom smokers usually smoke elicits cue-reactivity similar to that observed for proximal and distal smoking cues (Conklin et al., 2013). Two factors identified as particularly influential for the acquisition of smoking behavior are parental smoking (Hill et al., 2005) and peer smoking (Hoffman et al., 2006), and therefore relationships to family and friends may represent a chronic channel through which cue-elicited relapse may be triggered.

If such nicotine cues constitute the “stimuli” of the nicotine-seeking S-R association, what would constitute the habitual “response”? An abstaining smoker can experience relevant cues that would trigger relapse, but not have any cigarettes available; smoking behavior itself may be engaged only secondary to what is likely to be the “true” response of the nicotine-seeking habit. Smoking cues have been shown to engage neural activity related to tool use and the execution of motor actions (Yalachkov et al., 2009; Yalachkov & Naumer, 2011), and this activity may represent the unconscious drug seeking impulse which characterizes the “maladaptive incentive habit” (Belin et al., 2013), capable of triggering relapse without necessarily eliciting explicit craving.

6.3.1.2 Stress-Induced Relapse

“Stress” involves the perception of and response to threatening and/or aversive stimuli (Cohen et al., 1995), mediated by chemical cascades through a stress system known as the Hypothalamic-Pituitary-Adrenal (HPA) axis (Koob, 2010). Stress not only alters general metabolism, but can also affect several nicotine-responsive neurotransmitter systems as well as nicotine-related behaviors (Mantsch et al., 2015; Matta et al., 2007). Evidence exists to suggest that the CRF system can mediate withdrawal-induced and increases in nicotine self-administration in dependent rats (George et al., 2007), and CRF antagonists can reverse ICSS threshold changes associated with nicotine withdrawal (Bruijnzeel & Gold, 2005). As another example, the pharmacologic stressor has been shown to both significantly reinstate nicotine-seeking and significantly potentiate cue-induced reinstatement (Feltenstein et al., 2012). In humans, not only does stress increase cigarette smoking behavior (Pomerleau & Pomerleau, 1991), but stress can also promote relapse to cigarette smoking (Colamussi et al., 2007; Kassel et al., 2003).

Stress has also been shown to promote habitual behavior. During operant responding for natural rewards, rats exposed to chronic unpredictable stress were skewed toward insensitivity to both a satiety test, as well as to contingency degradation (Dias-Ferreira et al., 2009). Furthermore, the same study showed increased dendritic arborisation in the DLS alongside a decline in the DMS, the two major regions known to respectively mediate habitual and goal-directed behavior.

In humans, Schwabe & Wolf (2009) used a paradigm similar to that of Tricomi et al. (2009), involving satiety devaluation of food outcomes, to demonstrate that if participants had been stressed with a “socially evaluated cold-pressor test” (a method previously demonstrated to activate the HPA axis; Schwabe et al., 2008), they were rendered insensitive to devaluation. It has therefore been suggested that stress shifts behaviors to favor habitual control, mediated by neuroendocrine mechanisms (Schwabe & Wolf, 2011; Schwabe et al., 2011a); as such, stress actions on habit circuitry could be one mechanism to explain the sensitivity of abstinent smokers to stress-induced relapse.

6.3.2 Implications for Treatment

As the current work has demonstrated that nicotine-seeking behavior is a habit performed inflexibly in the face of changes in outcome-value or reduced instrumental contingency, one potential avenue of research is that of pharmacotherapies designed to shift nicotine-seeking behavior to goal-directed behavioral control. This is already being pursued to a certain extent, with a recent paper by Hay et al. (2013) suggesting that naltrexone is less effective in influencing rats habitually responding for alcohol, as opposed to those who are goal-directed. More examples are found in studies such as that of Corbit et al. (2014), who found that N-acetylcysteine treatment in rats reversed the acceleration of habit formation generated by cocaine, as well as that of Schwabe et al. (2011b), who found that the β-adrenergic antagonist propranolol was capable of preventing the stress-induced shift from goal-directed to habitual behavioral control in humans, demonstrating this effect with respect to a computerized instrumental learning task, for a food outcome devalued by satiety.

CHAPTER 7: Future Directions

“Each year one vicious habit rooted out, in time might make the worst man good throughout.” Benjamin Franklin

7.1 Validation of Outcome-Insensitive Nicotine Seeking

7.1.1 Nicotine and Outcome Value

7.1.1.1 Post-Acquisition Devaluation Efficacy

Although the average nicotine infusions conditionally earned in NSA reacquisition during Experiment 1B significantly differed when expressed as a proportion of baseline data, the raw data of the two pairing conditions did not. It may be argued, then, that the efficacy of LiCl-nicotine pairing could be questionable. To better probe the matter in future investigations, active lever responding on a progressive-ratio (PR) schedule (in which the response requirement increases within the session with each reinforcement delivered) could potentially supplement assessments of devaluation magnitude (Hodos, 1961; Stafford et al., 1998). However, few direct comparisons of reacquisition vs. PR responding appear to exist in the literature on devaluation paradigms. Before PR schedules are implemented, it would be necessary to demonstrate the extent to which post-devaluation PR performance corresponds to typical reacquisition performance.
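As a rough illustration of how a PR schedule yields a single summary measure, the sketch below computes a "breakpoint" (the last response requirement completed) from a total count of active-lever responses. The linear progression, the function names, and the example numbers are hypothetical and are not parameters of the present experiments or of the cited studies; real PR sessions also typically end on a time-out criterion, which is omitted here.

```python
def pr_requirements(n_steps, start=1, step=2):
    """Response requirement for each successive reinforcer.

    Hypothetical linear progression (start, start + step, ...); published
    PR procedures often use other progressions, so this is a placeholder.
    """
    return [start + step * i for i in range(n_steps)]


def breakpoint_reached(total_responses, requirements):
    """Return (reinforcers_earned, last_completed_requirement).

    Responses are spent on each requirement in order. The breakpoint is the
    last requirement fully completed, or 0 if none was completed.
    """
    earned, last = 0, 0
    remaining = total_responses
    for req in requirements:
        if remaining < req:
            break
        remaining -= req
        earned += 1
        last = req
    return earned, last
```

Under these assumptions, a devalued group would be expected to reach a lower breakpoint than a non-devalued group given the same schedule, which is the comparison the paragraph above proposes.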

7.1.1.2 The Neural Substrates of Habitual Nicotine Seeking

In Experiment 1A, successful nicotine devaluation in group POST suggests aversion-pairing may be useful in future studies of habit formation following NSA acquisition. However, the failure of devaluation in group PRE suggests a limitation of aversion-pairing in revealing goal-directed nicotine-seeking. First, attempting to devalue nicotine following fewer than 10 acquisition sessions could potentially expose extinction and reacquisition tests to confounding by the response instability typical of early i.v. NSA acquisition (e.g. Donny et al., 1998). Second, the effectiveness of LiCl-nicotine devaluation in group POST but not PRE suggests a currently unknown relationship may exist between the efficacy of such devaluations and some aspect(s) of operant training. As such, an alternative means of producing goal-directed NSA is desirable.

However, it may be possible that potential actions of nicotine on the substrates of habit formation could cause disproportionate recruitment of habit networks to NSA, making goal-directed nicotine seeking difficult to detect through a purely behavioral paradigm. When the neural substrates of goal-directed and habitual behaviors were first discussed in the general introduction, it was mentioned that behavior following extended training reverts to goal-directed control in the absence of the DLS (e.g. Yin et al., 2004). Goal-directed and habit substrates are systems which exist in parallel, and although behavioral control is a result of their integrated action, control can be maintained by one of these systems alone should the other be disrupted (Corbit et al., 2012; Smith & Graybiel, 2014; Yin et al., 2014). There is therefore the potential to study the individual contributions of goal-directed and habit systems in the acquisition and expression of habitual nicotine-seeking.

Clemens et al. (2014) have already demonstrated that extended NSA training can increase c-Fos expression in the DLS, and because of the critical role of the DLS in habit formation, this region would be a sensible starting point for probing the neural substrates of nicotine-seeking habits.

General operant responding can still be initiated despite pre-training lesions to the substrates of either goal-directed or habitual behavior, likely because both systems contribute simultaneously to general behavioral acquisition (Yin et al., 2004). A slowed rate of operant acquisition for natural rewards following DMS lesions has been observed (Yin et al., 2005a), perhaps reflecting that S-R associations (maintained by the DLS) are weaker in early training (Corbit et al., 2012). To reveal the contributions of the DLS to NSA, the effect of DLS ablation could be assessed by pre-training lesions. Any impairment in NSA acquisition could be revealed by comparison to a sham-lesion control (Ostlund & Balleine, 2005; Yin et al., 2004; Yin et al., 2005a). Furthermore, differences in the relative ability of DLS-ablated animals to acquire responding in distinct operant sessions for a natural reward (such as saccharin) could provide important information. In such an experiment, differences in the rates of acquisition between DLS-lesioned and sham-lesioned animals could be compared for each reinforcer. Should the DLS disproportionately contribute to the acquisition of NSA, the rate at which DLS-lesion animals acquire responding for nicotine (relative to sham-nicotine) should be significantly slower than the rate at which they acquire responding for saccharin (relative to sham-saccharin).

It was mentioned that should the DLS be inactivated or lesioned, habitual behavior reverts to goal-directed control (Yin et al., 2004; Smith & Graybiel, 2014). If NSA is still acquired following pre-training DLS lesions, responding would be expected to be mediated by the DMS and therefore goal-directed. Whether this is the case could be ascertained by conducting the behavioral assays of Experiments 1 and 2 in DLS-lesion animals.

While the scenarios above would demonstrate the necessity of the DLS in habitual NSA, a related experiment could assess the extent to which the DMS contributes to NSA. Such an experiment could be conducted exactly as described above, but with the DMS in place of the DLS.

Should it be observed that DLS ablation completely attenuates NSA acquisition, this additional experiment could also demonstrate the sufficiency of the habit system in mediating NSA. The relevant prediction in this case is that should the DLS disproportionately contribute to the acquisition of NSA, the rate at which DMS-lesion animals acquire responding for saccharin (relative to sham-saccharin) would be significantly slower than the rate at which they acquire responding for nicotine (relative to sham-nicotine).


7.1.2 Nicotine and Contingency Learning

7.1.2.1 Disruption of Contingency Learning

While the responding of saccharin-degraded animals in Experiment 2A was successfully degraded, they did not have nicotine delivered during saccharin sessions. Therefore, although it can be ruled out that chronic NSA overtly impairs contingency learning, the effect of acute nicotine on contingency learning remains unaddressed. With this in mind, it could be speculated that rather than habitual insensitivity to degradation, the contingency learning of nicotine-degraded animals was merely disrupted by acute nicotine.

Unfortunately, this possibility cannot be ruled out by the design of Experiment 2. To address it, contingency learning for any reinforcer would need to be demonstrated in the presence of acute nicotine. The most relevant parameters for comparison would involve i.v. nicotine infusions delivered non-contingently during degradation sessions, with a comparison condition of animals similarly trained but degraded without nicotine delivery. Should animals of these conditions additionally respond for a non-degraded reinforcer, nicotine delivery could be reversed for each individual animal both to equilibrate nicotine exposure between conditions and to identify nicotine-induced response differences specific to degradation sessions.

7.1.2.2 Prevention of Contingency Learning

Another unaddressed alternative interpretation of results from Experiment 2 pertains to exactly why contingency learning in nicotine-degraded animals did not occur. Supposing behaviorally activating nicotine effects sustained responding in degradation sessions, the degraded animals may never have had an actual opportunity to learn the novel instrumental contingency. In other words, because nicotine itself kept nicotine-degraded animals responding, they may never have sufficiently experienced that they could be rewarded without having to respond. This drug-induced prevention of contingency learning can be contrasted with the scenario of normal contingency insensitivity brought about by habit formation: animals responding habitually fail to learn about reductions in the instrumental contingency because their responses are automatically elicited by the contextual stimuli of the S-R habit association, and not due to any property of the reward itself.

To address this issue, contingency learning must be somehow uncoupled from behaviorally activated nicotine seeking responses. In a variant of omission-schedule paradigms, Dickinson et al. (1998) had rats equivalently respond on two active levers for an identical food pellet operant reward, while non-contingent sucrose was also intermittently dispensed for subjects to consume at random time intervals averaging 30 seconds. During testing, one of the active levers was designated an “omission” lever, and responses on it delayed the usual non-contingent sucrose presentation. Animals who had received limited training learned to withhold responding on the omission lever, but extended-trained animals were not able to discriminate between the two levers.
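The omission contingency described above can be sketched as a small scheduling routine, in which each response on the omission lever postpones the pending non-contingent delivery. This is only an illustrative sketch: the exponential spacing of intervals, the size of the postponement (`delay_s`), and all function names are assumptions, since the text specifies only that deliveries occurred at random intervals averaging 30 seconds.

```python
import random


def sample_intervals(n, mean_s=30.0, seed=0):
    """Draw n inter-delivery intervals (in seconds) averaging mean_s.

    Exponential spacing is an assumption; only the mean is given in the text.
    """
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / mean_s) for _ in range(n)]


def deliveries_with_omission(intervals, omission_presses, delay_s=10.0):
    """Delivery times when omission-lever presses postpone the pending delivery.

    intervals: scheduled gaps between non-contingent deliveries (s)
    omission_presses: sorted press times on the omission lever (s)
    delay_s: hypothetical postponement; a press at time t pushes the pending
             delivery to no earlier than t + delay_s
    """
    times, now, i = [], 0.0, 0
    for gap in intervals:
        due = now + gap
        # Every press made before the (possibly postponed) delivery counts.
        while i < len(omission_presses) and omission_presses[i] < due:
            due = max(due, omission_presses[i] + delay_s)
            i += 1
        times.append(due)
        now = due
    return times
```

With two scheduled gaps of 30 s and no omission-lever presses, deliveries fall at 30 s and 60 s; a single press at 25 s moves them to 35 s and 65 s. An animal sensitive to the contingency should therefore come to withhold responding on that lever.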

The procedure above may conceivably be adapted to address the alternate possibility that nicotine-mediated behavioral activation prevented contingency learning in Experiment 2. Nicotine could be made equivalently available through two levers, while an alternative natural reinforcer is non-contingently presented. Preference could be equilibrated between levers through assigning each one an independent VI schedule of reinforcement (Dickinson et al., 1998). Following training and the conversion of one of the levers to trigger omission of the natural reward, a demonstration of outcome-insensitivity to the natural reward’s omission by subject animals would rule out the possibility that the animals did not experience any consequences of the contingency manipulation.


7.2 Acceleration of Habit Formation by Nicotine

7.2.1 Generalized Habit Formation and Nicotine

In the general introduction, it was mentioned that the rate of instrumental habit formation for natural rewards is accelerated in a non-specific way if animals are exposed to certain drugs of abuse, even outside of the operant context. Although this has not yet been examined for nicotine, Grimm et al. (2012) reported that subcutaneous nicotine injections prior to sucrose solution self-administration sessions not only increased both sucrose SA response rates and sucrose intake, but also increased the sucrose cue-reactivity of animals trained in that way. It may therefore be possible that nicotine too could accelerate the rate of habit formation for natural rewards.

Experimentally, such an investigation would involve animals receiving nicotine while responding operantly for a natural reward. This could be either a co-administration paradigm (in which distinct operant responses earn each reinforcer) or a non-contingent paradigm (in which nicotine is passively injected before, during, or following operant sessions for the natural reward). The non-contingent method may be preferable at first to control nicotine exposure, with operant sessions insulated from nicotine’s acute effects by delivering nicotine injections after daily operant sessions are completed (e.g. Corbit et al., 2012). Yet nicotine could potentially accelerate habit formation in a distinct way if delivered non-contingently during the operant sessions themselves; to isolate the consequences of nicotine on habit formation from nicotine-induced changes in response rate (e.g. Grimm et al., 2012), training sessions would need to be terminated once an individual subject has earned a pre-determined number of reinforcements. Distinct operant sessions for an alternative reinforcer could allow nicotine exposure to be equilibrated between experimental conditions and the specificity of nicotine’s effect to be demonstrated. Assessment of the extent of habit formation could proceed with devaluation of these natural rewards using satiety or aversion-pairing, followed by extinction testing.

7.2.2 Unexplored Interactions

One potential avenue of investigation with significant translational relevance arises from consideration that drugs can have both generalized and specific effects on habit formation. These two effects have always been observed individually in separate studies; never before have they been considered together in interaction.

Consider that epidemiological studies have shown that around 80% of alcoholics smoke (DiFranza & Guerrera, 1990; Falk et al., 2006; Miller & Gold, 1998), and that nicotine and alcohol are frequently used in combination (Hertling et al., 2005). Furthermore, alcohol consumption has been demonstrated to increase the urge to smoke and cigarette smoking behavior (Barrett et al., 2006; Burton & Tiffany, 1997; Griffiths et al., 1976; Epstein et al., 2007), and animal studies have shown that repeated treatment with nicotine can escalate ethanol intake (Clark et al., 2001; Lê et al., 2000; Lê et al., 2003).

Paradigms for the study of alcohol and nicotine co-use have been previously described for orally consumed ethanol and nicotine solutions (Hauser et al., 2012) as well as oral ethanol and i.v. nicotine (Lê et al., 2010; Lê et al., 2014). Integrating these paradigms with those considered above (i.e. substituting ethanol for the natural reward) could potentially determine whether nicotine exposure can augment the already accelerated habit formation in ethanol SA responding. If such an interaction exists, a time-course investigation (e.g. Corbit et al., 2012) could reveal it as additive, synergistic, or complex in nature.

REFERENCES

Adams, C. D. (1982). Variations in the sensitivity of instrumental responding to reinforcer devaluation. The Quarterly Journal of Experimental Psychology, 34(2), 77-98. Adams, C. D., & Dickinson, A. (1981). Instrumental responding following reinforcer devaluation. The Quarterly journal of experimental psychology, 33(2), 109-121. Aggarwal, M., & Wickens, J. R. (2011). A role for phasic dopamine neuron firing in habit learning. Neuron, 72(6), 892-894. Alderson, H. L., Latimer, M. P., & Winn, P. (2006). Intravenous self‐administration of nicotine is altered by lesions of the posterior, but not anterior, pedunculopontine tegmental nucleus. European Journal of Neuroscience, 23(8), 2169-2175. Anstrom, K. K., Miczek, K. A., & Budygin, E. A. (2009). Increased phasic dopamine signaling in the mesolimbic pathway during social defeat in rats.Neuroscience, 161(1), 3-12. Antoniou, K., Kafetzopoulos, E., Papadopoulou-Daifoti, Z., Hyphantis, T., & Marselos, M. (1998). D- amphetamine, cocaine and caffeine: a comparative study of acute effects on locomotor activity and behavioural patterns in rats.Neuroscience & Biobehavioral Reviews, 23(2), 189-196. Armitage, A. K., Dollery, C. T., George, C. F., Houseman, T. H., Lewis, P. J., & Turner, D. M. (1975). Absorption and metabolism of nicotine from cigarettes.BMJ, 4(5992), 313-316. Ashby, F. G., Turner, B. O., & Horvitz, J. C. (2010). Cortical and basal ganglia contributions to habit learning and automaticity. Trends in cognitive sciences,14(5), 208-215. Balleine B, Dickinson A (1998a) Goal-directed instrumental action: contingency and incentive learning and their cortical substrates. Neuropharmacology 37:407-419 Balleine, B. W., & Dickinson, A. (1998b). The role of incentive learning in instrumental outcome revaluation by sensory-specific satiety. Animal Learning & Behavior, 26(1), 46-59. Balleine, B. W., & Dickinson, A. (2000). 
The effect of lesions of the insular cortex on instrumental conditioning: evidence for a role in incentive memory.The Journal of Neuroscience, 20(23), 8954-8964. Balleine, B. W., & O'Doherty, J. P. (2010). Human and rodent homologies in action control: corticostriatal determinants of goal-directed and habitual action.Neuropsychopharmacology, 35(1), 48-69. Balleine, B. W., Liljeholm, M., & Ostlund, S. B. (2009). The integrative function of the basal ganglia in instrumental conditioning. Behavioural brain research,199(1), 43-52. Balleine, B., & Dickinson, A. (1991). Instrumental performance following reinforcer devaluation depends upon incentive learning. The Quarterly Journal of Experimental Psychology, 43(3), 279-296. Balleine, B., & Dickinson, A. (1992). Signalling and incentive processes in instrumental reinforcer devaluation. The Quarterly Journal of Experimental Psychology, 45(4), 285-301. Bardo, M. T., & Bevins, R. A. (2000). Conditioned place preference: what does it add to our preclinical understanding of drug reward?. Psychopharmacology, 153(1), 31-43. Barker, J. M., Torregrossa, M. M., Arnold, A. P., & Taylor, J. R. (2010). Dissociation of genetic and hormonal influences on sex differences in alcoholism-related behaviors. The Journal of Neuroscience, 30(27), 9140-9144. Barrett, S. P., Tichauer, M., Leyton, M., & Pihl, R. O. (2006). Nicotine increases alcohol self- administration in non-dependent male smokers. Drug and alcohol dependence, 81(2), 197-204. Baxter, B. W., & Hinson, R. E. (2001). Is smoking automatic? Demands of smoking behavior on attentional resources. Journal of Abnormal Psychology,110(1), 59. Belin, D., & Everitt, B. J. (2008). Cocaine seeking habits depend upon dopamine-dependent serial connectivity linking the ventral with the dorsal striatum. Neuron, 57(3), 432-441. Belin, D., Belin-Rauscent, A., Murray, J. E., & Everitt, B. J. (2013). Addiction: failure of control over maladaptive incentive habits. 
Current opinion in neurobiology, 23(4), 564-572.


Belin, D., Jonkman, S., Dickinson, A., Robbins, T. W., & Everitt, B. J. (2009). Parallel and interactive learning processes within the basal ganglia: relevance for the understanding of addiction. Behavioural brain research, 199(1), 89-102. Belin-Rauscent, A., Everitt, B. J., & Belin, D. (2012). Intrastriatal shifts mediate the transition from drug-seeking actions to habits. Biological psychiatry, 72(5), 343-345. Benowitz NL (1986) Clinical pharmacology of nicotine. Annual Review of Medicine 37:21-32 Benowitz NL (1996) Pharmacology of nicotine: addiction and therapeutics. Annu Rev. Pharmacol. Toxicol. 36:597-613 Benowitz NL, Jacob P 3rd (1984) Daily intake of nicotine during cigarette smoking. Clinical Pharmacology & Therapeutics 35(4)499-504 Benowitz, NL (2008) Neurobiology of Nicotine Addiction: Implications for Smoking Cessation Treatment. The American Journal of Medicine 121(4A):S3-S10 Berridge, K. C., & Robinson, T. E. (1998). What is the role of dopamine in reward: hedonic impact, reward learning, or incentive salience?. Brain Research Reviews, 28(3), 309-369. Björklund, A., & Dunnett, S. B. (2007). Dopamine neuron systems in the brain: an update. Trends in neurosciences, 30(5), 194-202. Boakes, R. A. (1993). The role of repetition in transforming actions into habits: The contribution of John Watson and contemporary research to a persistent theme. Revista mexicana de análisis de la conducta= Mexican journal of behavior analysis, (3), 67-90. Brabant, C., Quertemont, E., & Tirelli, E. (2005). Influence of the dose and the number of drug- context pairings on the magnitude and the long-lasting retention of cocaine-induced conditioned place preference in C57BL/6J mice. Psychopharmacology, 180(1), 33-40. Bradfield, L. A., Bertran-Gonzalez, J., Chieng, B., & Balleine, B. W. (2013). The thalamostriatal pathway and cholinergic control of goal-directed action: interlacing new with existing learning in the striatum. Neuron, 79(1), 153-166. Bruijnzeel, A. W., & Gold, M. 
S. (2005). The role of corticotropin-releasing factor-like peptides in cannabis, nicotine, and alcohol dependence. Brain research reviews, 49(3), 505-528. Burton, S. M., & Tiffany, S. T. (1997). The effect of alcohol consumption on craving to smoke. Addiction, 92(1), 15-26. Caggiula, A. R., Donny, E. C., Chaudhri, N., Perkins, K. A., Evans-Martin, F. F., & Sved, A. F. (2002). Importance of nonpharmacological factors in nicotine self-administration. Physiology & behavior, 77(4), 683-687. Caggiula, A. R., Donny, E. C., Palmatier, M. I., Liu, X., Chaudhri, N., & Sved, A. F. (2009). The role of nicotine in smoking: a dual-reinforcement model. In The motivational impact of nicotine and its role in tobacco use (pp. 91-109). Springer US. Caggiula, A. R., Donny, E. C., White, A. R., Chaudhri, N., Booth, S., Gharib, M. A., ... & Sved, A. F. (2002). Environmental stimuli promote the acquisition of nicotine self-administration in rats. Psychopharmacology, 163(2), 230-237. Caille S, Clemens K, Stinus L, Cador M (2012) Modeling nicotine addiction in rats. In “Psychiatric Disorders” (243-256). Humana Press Calabresi, P., Lacey, M. G., & North, R. A. (1989). Nicotinic excitation of rat ventral tegmental neurones in vitro studied by intracellular recording. British journal of pharmacology, 98(1), 135-140. Cannon, C. M., & Bseikri, M. R. (2004). Is dopamine required for natural reward?. Physiology & behavior, 81(5), 741-748. Carlezon, W. A., & Chartoff, E. H. (2007). Intracranial self-stimulation (ICSS) in rodents to study the neurobiology of motivation. Nature protocols, 2(11), 2987-2995. Carr, D. B., & Sesack, S. R. (2000). Projections from the rat prefrontal cortex to the ventral tegmental area: target specificity in the synaptic associations with mesoaccumbens and mesocortical neurons. The Journal of neuroscience,20(10), 3864-3873. Carter, B. L., & Tiffany, S. T. (1999). Meta-analysis of cue-reactivity in addiction research. Addiction, 94(3), 327-340.


Cashman JR, Park SB, Yang ZC, Wrighton SA, Jacob P 3rd, Benowitz NL (1992) Metabolism of nicotine by human liver microsomes: stereoselective formation of trans-nicotine N'-oxide. Chem. Res. Toxicol. 5(5):639-646 Centers for Disease Control and Prevention (2014) Current Cigarette Smoking Among Adults— United States, 2005–2013. Morbidity and Mortality Weekly Report 2014 63(47):1108–1112 Champtiaux, N., Gotti, C., Cordero-Erausquin, M., David, D. J., Przybylski, C., Léna, C., ... & Changeux, J. P. (2003). Subunit composition of functional nicotinic receptors in dopaminergic neurons investigated with knock-out mice.The Journal of neuroscience, 23(21), 7820-7829. Changeux JP (2010) Nicotine addiction and nicotinic receptors: lessons from genetically modified mice. Nature Reviews Neuroscience 11(6):389-401 Charntikov, S., & Bevins, R. A. (2014). Interoceptive conditioning with nicotine using extinction and re-extinction to assess stimulus similarity with .Neuropharmacology, 86, 181-191. Chaudhri, N., Caggiula, A. R., Donny, E. C., Palmatier, M. I., Liu, X., & Sved, A. F. (2006). Complex interactions between nicotine and nonpharmacological stimuli reveal multiple roles for nicotine in reinforcement. Psychopharmacology,184(3-4), 353-366. Chen L (2010) In pursuit of the high-resolution structure of nicotinic acetylcholine receptors. J. Physiol. 588(4):557-564 Chiamulera, C., Borgo, C., Falchetto, S., Valerio, E., & Tessari, M. (1996). Nicotine reinstatement of nicotine self-administration after long-term extinction.Psychopharmacology, 127(1-2), 102-107. Childress, A. R., Hole, A. V., Ehrman, R. N., Robbins, S. J., McLellan, A. T., & O’Brien, C. P. (1993). Cue reactivity and cue reactivity interventions in drug dependence. NIDA research monograph, 137, 73-73. Clark, A., Lindgren, S., Brooks, S. P., Watson, W. P., & Little, H. J. (2001). Chronic infusion of nicotine can increase operant self-administration of alcohol.Neuropharmacology, 41(1), 108-117. Clarke, P. B. (1993). 
Nicotinic receptors in mammalian brain: localization and relation to cholinergic innervation. Progress in brain research, 98, 77-77. Clarke, P. B. S., & Kumar, R. (1983a). The effects of nicotine on locomotor activity in non‐tolerant and tolerant rats. British journal of pharmacology, 78(2), 329-337. Clarke, P. B. S., & Kumar, R. (1983b). Characterization of the locomotor action of nicotine in tolerant rats. British journal of pharmacology,80(3), 587-594. Clemens, K. J., Caillé, S., & Cador, M. (2010). The effects of response operandum and prior food training on intravenous nicotine self-administration in rats. Psychopharmacology, 211(1), 43-54. Clemens, K. J., Castino, M. R., Cornish, J. L., Goodchild, A. K., & Holmes, N. M. (2014). Behavioral and neural substrates of habit formation in rats intravenously self-administering nicotine. Neuropsychopharmacology. Cobb CO, Hendricks PS, Eissenberg T (2015) Electronic cigarettes and nicotine dependence: evolving products, evolving problems. BMC Medicine 13:119 DOI 10.1186/s12916-015- 0355-y Coen, K. M., Adamson, K. L., & Corrigall, W. A. (2009). Medication-related pharmacological manipulations of nicotine self-administration in the rat maintained on fixed-and progressive-ratio schedules of reinforcement.Psychopharmacology, 201(4), 557-568. Cohen, J. Y., Haesler, S., Vong, L., Lowell, B. B., & Uchida, N. (2012). Neuron-type-specific signals for reward and punishment in the ventral tegmental area.Nature, 482(7383), 85-88. Colamussi, L., Bovbjerg, D. H., & Erblich, J. (2007). Stress-and cue-induced cigarette craving: effects of a family history of smoking. Drug and alcohol dependence, 88(2), 251-258. Collins, A. C., Pogun, S., Nesil, T., & Kanit, L. (2012). Oral nicotine self-administration in rodents. Journal of addiction research & therapy. Colwill, R. M., & Rescorla, R. A. (1985). Postconditioning devaluation of a reinforcer affects instrumental responding. 
Journal of experimental psychology: animal behavior processes, 11(1), 120.


Colwill, R. M., & Rescorla, R. A. (1986). Associative structures in instrumental learning. The psychology of learning and motivation, 20, 55-104. Conklin, C. A., & Tiffany, S. T. (2001). The impact of imagining personalized versus standardized urge scenarios on cigarette craving and autonomic reactivity. Experimental and Clinical Psychopharmacology, 9(4), 399. Conklin, C. A., Perkins, K. A., Robin, N., McClernon, F. J., & Salkeld, R. P. (2010). Bringing the real world into the laboratory: personal smoking and nonsmoking environments. Drug and alcohol dependence, 111(1), 58-63. Conklin, C. A., Robin, N., Perkins, K. A., Salkeld, R. P., & McClernon, F. J. (2008). Proximal versus distal cues to smoke: the effects of environments on smokers' cue-reactivity. Experimental and clinical psychopharmacology, 16(3), 207. Conklin, C. A., Salkeld, R. P., Perkins, K. A., & Robin, N. (2013). Do people serve as cues to smoke?. nicotine & tobacco research, ntt104. Cooper E, Couturier S, Ballivet M (1991) Pentameric structure and subunit stoichiometry of a neuronal nicotinic acetylcholine receptor. Nature 350(6315):235-8 Corbit, L. H., Chieng, B. C., & Balleine, B. W. (2014). Effects of repeated cocaine exposure on habit learning and reversal by N-acetylcysteine.Neuropsychopharmacology, 39(8), 1893- 1901. Corbit, L. H., Nie, H., & Janak, P. H. (2012). Habitual alcohol seeking: time course and the contribution of subregions of the dorsal striatum. Biological psychiatry, 72(5), 389-395. Corrigall, W. A., & Coen, K. M. (1989). Nicotine maintains robust self-administration in rats on a limited-access schedule. Psychopharmacology,99(4), 473-478. Corrigall, W. A., & Coen, K. M. (1991). Selective dopamine antagonists reduce nicotine self- administration. Psychopharmacology, 104(2), 171-176. Corrigall, W. A., Coen, K. M., & Adamson, K. L. (1994). Self-administered nicotine activates the mesolimbic dopamine system through the ventral tegmental area. Brain research, 653(1), 278-284. 
Corrigall, W. A., Coen, K. M., Zhang, J., & Adamson, L. K. (2001). GABA mechanisms in the pedunculopontine tegmental nucleus influence particular aspects of nicotine self- administration selectively in the rat.Psychopharmacology, 158(2), 190-197. Corrigall, W. A., Coen, K. M., Zhang, J., & Adamson, L. K. (2002). Pharmacological manipulations of the pedunculopontine tegmental nucleus in the rat reduce self-administration of both nicotine and cocaine.Psychopharmacology, 160(2), 198-205. Corrigall, W. A., Franklin, K. B., Coen, K. M., & Clarke, P. B. (1992). The mesolimbic dopaminergic system is implicated in the reinforcing effects of nicotine. Psychopharmacology, 107(2-3), 285-289. D’souza, M. S., & Markou, A. (2011). Neuronal mechanisms underlying development of nicotine dependence: implications for novel smoking-cessation treatments. Addiction science & clinical practice, 6(1), 4. Dahlström, A., & Fuxe, K. (1964). Evidence for the existence of monoamine-containing neurons in the central nervous system. I. Demonstration of monoamines in the cell bodies of brain stem neurons. Acta Physiologica Scandinavica. Supplementum, SUPPL-232. Dani JA, De Biasi MD (2001) Cellular mechanisms of nicotine addiction. Pharmacology, Biochemistry, and Behavior 70:439-446 Danjo, T., Yoshimi, K., Funabiki, K., Yawata, S., & Nakanishi, S. (2014). Aversive behavior induced by optogenetic inactivation of ventral tegmental area dopamine neurons is mediated by dopamine D2 receptors in the nucleus accumbens. Proceedings of the National Academy of Sciences, 111(17), 6455-6460. Davis TJ, de Fiebre CM (2006) Alcohol’s actions on neuronal nicotinic acetylcholine receptors. Biological Mechanisms 29(3):179-185 De Vries, T. J., Schoffelmeer, A. N., Binnekade, R., Mulder, A. H., & Vanderschuren, L. J. (1998). Drug‐induced reinstatement of heroin‐and cocaine‐seeking behaviour following long‐term extinction is associated with expression of behavioural sensitization. 
European Journal of Neuroscience, 10(11), 3565-3571.


Deroche-Gamonet, V., Belin, D., & Piazza, P. V. (2004). Evidence for addiction-like behavior in the rat. Science, 305(5686), 1014-1017. DeRusso, A. L., Fan, D., Gupta, J., Shelest, O., Costa, R. M., & Yin, H. H. (2010). Instrumental uncertainty as a determinant of behavior under interval schedules of reinforcement. Frontiers in integrative neuroscience, 4. Deutch, A. Y., Holliday, J., Roth, R. H., Chun, L. L., & Hawrot, E. (1987). Immunohistochemical localization of a neuronal nicotinic acetylcholine receptor in mammalian brain. Proceedings of the National Academy of Sciences, 84(23), 8697-8701. Dias-Ferreira, E., Sousa, J. C., Melo, I., Morgado, P., Mesquita, A. R., Cerqueira, J. J., ... & Sousa, N. (2009). Chronic stress causes frontostriatal reorganization and affects decision-making. Science, 325(5940), 621-625. Dickinson, A. (1985). Actions and habits: the development of behavioural autonomy. Philosophical Transactions of the Royal Society B: Biological Sciences, 308(1135), 67-78. Dickinson, A., & Balleine, B. (1994). Motivational control of goal-directed action. Animal Learning & Behavior, 22(1), 1-18. Dickinson, A., & Dawson, G. R. (1989). Incentive learning and the motivational control of instrumental performance. The Quarterly Journal of Experimental Psychology, 41(1), 99-112. Dickinson, A., & Mulatero, C. W. (1989). Reinforcer specificity of the suppression of instrumental performance on a non-contingent schedule. Behavioural processes, 19(1), 167-180. Dickinson, A., Balleine, B., Watt, A., Gonzalez, F., & Boakes, R. A. (1995). Motivational control after extended instrumental training. Animal Learning & Behavior, 23(2), 197-206. Dickinson, A., Nicholas, D. J., & Adams, C. D. (1983).
The effect of the instrumental training contingency on susceptibility to reinforcer devaluation. The Quarterly Journal of Experimental Psychology, 35(1), 35-51. Dickinson, A., Wood, N., & Smith, J. W. (2002). Alcohol seeking by rats: action or habit?. The Quarterly Journal of Experimental Psychology: Section B,55(4), 331-348. DiFranza, J. R., & Guerrera, M. P. (1990). Alcoholism and smoking. Journal of studies on alcohol, 51(2), 130-135. Djordjevic, M. V., Stellman, S. D., & Zang, E. (2000). Doses of nicotine and lung carcinogens delivered to cigarette smokers. Journal of the National Cancer Institute, 92(2), 106-111. Domjan, M., & Wilson, N. E. (1972). Contribution of ingestive behaviors to taste-aversion learning in the rat. Journal of Comparative and Physiological Psychology, 80(3), 403. Donny EC, Dierker LC (2007) The absence of DSM-IV nicotine dependence in monderate-to- heavy daily smokers. Drug and Alcohol Dependence 89:93-96 Donny, E. C., Caggiula, A. R., Knopf, S., & Brown, C. (1995). Nicotine self-administration in rats. Psychopharmacology, 122(4), 390-394. Donny, E. C., Caggiula, A. R., Mielke, M. M., Jacobs, K. S., Rose, C., & Sved, A. F. (1998). Acquisition of nicotine self-administration in rats: the effects of dose, feeding schedule, and drug contingency. Psychopharmacology, 136(1), 83-90. Donny, E. C., Caggiula, A. R., Rose, C., Jacobs, K. S., Mielke, M. M., & Sved, A. F. (2000). Differential effects of response-contingent and response-independent nicotine in rats. European journal of pharmacology, 402(3), 231-240. Doyon, W. M., Thomas, A. M., Ostroumov, A., Dong, Y., & Dani, J. A. (2013). Potential substrates for nicotine and alcohol interactions: a focus on the mesocorticolimbic dopamine system. Biochemical pharmacology, 86(8), 1181-1193. Drummond, D. C. (2001). Theories of drug craving, ancient and modern.Addiction, 96(1), 33-46. Epping-Jordan, M. P., Watkins, S. S., Koob, G. F., & Markou, A. (1998). 
Dramatic decreases in brain reward function during nicotine withdrawal. Nature, 393(6680), 76-79.


Epstein, A. M., Sher, T. G., Young, M. A., & King, A. C. (2007). Tobacco chippers show robust increases in smoking urge after alcohol consumption.Psychopharmacology, 190(3), 321- 329. Everitt, B. J., & Robbins, T. W. (2005). Neural systems of reinforcement for drug addiction: from actions to habits to compulsion. Nature neuroscience, 8(11), 1481-1489. Everitt, B. J., & Robbins, T. W. (2013). From the ventral to the dorsal striatum: devolving views of their roles in drug addiction. Neuroscience & Biobehavioral Reviews, 37(9), 1946-1954. Everitt, B. J., Belin, D., Economidou, D., Pelloux, Y., Dalley, J. W., & Robbins, T. W. (2008). Neural mechanisms underlying the vulnerability to develop compulsive drug-seeking habits and addiction. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 363(1507), 3125-3135. Everitt, B. J., Dickinson, A., & Robbins, T. W. (2001). The neuropsychological basis of addictive behaviour. Brain Research Reviews, 36(2), 129-138. Ezzati M, Lopez AD (2003) Estimates of global mortality attributable to smoking in 2000. Lancet 362:847-52 Fagerström K (2005) The nicotine market: An attempt to estimate the nicotine intake from various sources and the total nicotine consumption in some countries. Nicotine Tob. Res. 7(3):343-350 Falk, D. E., Yi, H., & Hiller-Sturmhofel, S. (2006). An epidemiologic analysis of co-occurring alcohol and tobacco use and disorders. Alcohol Res Health, 29(3), 162-171. Fanelli, R. R., Klein, J. T., Reese, R. M., & Robinson, D. L. (2013). Dorsomedial and dorsolateral striatum exhibit distinct phasic neuronal activity during alcohol self‐administration in rats. European Journal of Neuroscience,38(4), 2637-2648. Feltenstein, M. W., Ghee, S. M., & See, R. E. (2012). Nicotine self-administration and reinstatement of nicotine-seeking in male and female rats.Drug and alcohol dependence, 121(3), 240- 246. Ferguson, S. G., & Shiffman, S. (2009). 
The relevance and treatment of cue-induced cravings in tobacco dependence. Journal of substance abuse treatment, 36(3), 235-243. Ferster, C. B., and B. F. Skinner. "Schedules of reinforcement." (1957). East Norwalk, CT, US: Appleton-Century-Crofts. Field, M., Mogg, K., & Bradley, B. P. (2006). Automaticity of smoking behaviour: the relationship between dual-task performance, daily cigarette intake and subjective nicotine effects. Journal of Psychopharmacology, 20(6), 799-805. Floresco, S. B., West, A. R., Ash, B., Moore, H., & Grace, A. A. (2003). Afferent modulation of dopamine neuron firing differentially regulates tonic and phasic dopamine transmission. Nature neuroscience, 6(9), 968-973. Galli, G., & Wolffgramm, J. (2011). Long-term development of excessive and inflexible nicotine taking by rats, effects of a novel treatment approach.Behavioural brain research, 217(2), 261-270. Garcia, J., & Koelling, R. A. (1967). A comparison of aversions induced by X rays, toxins, and drugs in the rat. Radiation Research Supplement, 439-450. Garcia, J., Lasiter, P. S., Bermudez-Rattoni, F., & Deems, D. A. (1985). A General Theory of Aversion Learning. Annals of the New York Academy of Sciences, 443, 8-21. Gardner, E. L. (2000). What We Have Learned about Addiction from Animal Models of Drug Self‐Administration. The American Journal on Addictions, 9(4), 285-313. Gass, J. C., Motschman, C. A., & Tiffany, S. T. (2014). The relationship between craving and tobacco use behavior in laboratory studies: A meta-analysis. Psychol Addict Behav 28(4):1162-76 George, O., Ghozland, S., Azar, M. R., Cottone, P., Zorrilla, E. P., Parsons, L. H., ... & Koob, G. F. (2007). CRF–CRF1 system activation mediates withdrawal-induced increases in nicotine self-administration in nicotine-dependent rats. Proceedings of the National Academy of Sciences, 104(43), 17198-17203. Glick, S. D., Visker, K. E., & Maisonneuve, I. M. (1996). 
An oral self-administration model of nicotine preference in rats: effects of mecamylamine. Psychopharmacology, 128(4), 426-431.


Goodwin RD, Sheffer CE, Chartrand H, Bhaskaran J, Hart CL, Sareen J, Bolton J (2014) Drug Use, Abuse, and Dependence and the Persistence of Nicotine Dependence. Nicotine & Tobacco Research 16(12):1606-1612 Goodwin, A. K., Hiranita, T., & Paule, M. G. (2015). The Reinforcing Effects of Nicotine in Humans and Nonhuman Primates: A Review of Intravenous Self-Administration Evidence and Future Directions for Research. Nicotine & Tobacco Research, ntv002. Gottfried, J. A., & Balleine, B. W. (2011). Sensation, Incentive Learning, and the Motivational Control of Goal-Directed Action. Gotti C, Clementi F (2004) Neuronal nicotinic receptors: from structure to pathology. Progress in Neurobiology 74:363-396 Grabus, S. D., Martin, B. R., Brown, S. E., & Damaj, M. I. (2006). Nicotine place preference in the mouse: influences of prior handling, dose and strain and attenuation by nicotinic receptor antagonists. Psychopharmacology, 184(3-4), 456-463. Grace, A. A., & Onn, S. P. (1989). Morphology and electrophysiological properties of immunocytochemically identified rat dopamine neurons recorded in vitro. J Neurosci, 9(10), 3463-3481. Grace, A. A., Floresco, S. B., Goto, Y., & Lodge, D. J. (2007). Regulation of firing of dopaminergic neurons and control of goal-directed behaviors. Trends in neurosciences, 30(5), 220-227. Grenhoff, J., Aston‐Jones, G., & Svensson, T. H. (1986). Nicotinic effects on the firing pattern of midbrain dopamine neurons. Acta Physiologica Scandinavica,128(3), 351-358. Griffiths, R. R., Bigelow, G. E., & Liebson, I. (1976). Facilitation of human tobacco self- administration by ethanol: a behavioral analysis. Journal of the experimental analysis of behavior, 25(3), 279-292. Grimm, J. W., Ratliff, C., North, K., Barnes, J., & Collins, S. (2012). Nicotine increases sucrose self‐administration and seeking in rats. Addiction biology,17(3), 623-633. Hammond, L. J. (1980). The effect of contingency upon the appetitive conditioning of free- operant behavior. 
Journal of the experimental analysis of behavior, 34(3), 297. Hauser, S. R., Katner, S. N., Deehan, G. A., Ding, Z. M., Toalston, J. E., Scott, B. J., ... & Rodd, Z. A. (2012). Development of an Oral Operant Nicotine/Ethanol Co‐Use Model in Alcohol‐Preferring (P) Rats. Alcoholism: Clinical and Experimental Research, 36(11), 1963- 1972. Hay, R. A., Jennings, J. H., Zitzman, D. L., Hodge, C. W., & Robinson, D. L. (2013). Specific and Nonspecific Effects of Naltrexone on Goal‐Directed and Habitual Models of Alcohol Seeking and Drinking. Alcoholism: Clinical and Experimental Research, 37(7), 1100-1110. Henningfield JE (1995) Nicotine medications for smoking cessation. N Engl J Med 333(18):1196- 203 Henningfield JE, Keenan RM (1993) Nicotine delivery kinetics and abuse liability. Journal of Consulting and Clinical Psychology 61(5):743-750 Hertling, I., Ramskogler, K., Dvorak, A., Klingler, A., Saletu-Zyhlarz, G., Schoberberger, R., ... & Lesch, O. M. (2005). Craving and other characteristics of the comorbidity of alcohol and nicotine dependence. European Psychiatry,20(5), 442-450. Hetherington, M. M., & Rolls, B. J. (1996). Sensory-specific satiety: Theoretical frameworks and central characteristics. In “Why we eat what we eat: The psychology of eating.” (pp. 267-290). Washington, DC, US Hill, K. G., Hawkins, J. D., Catalano, R. F., Abbott, R. D., & Guo, J. (2005). Family influences on the risk of daily smoking initiation. Journal of Adolescent Health, 37(3), 202-210. Hodos, W. (1961). Progressive ratio as a measure of reward strength. Science,134(3483), 943-944. Hoffman, B. R., Sussman, S., Unger, J. B., & Valente, T. W. (2006). Peer influences on adolescent cigarette smoking: A theoretical review of the literature. Substance use & misuse, 41(1), 103-155. Huges JR, Helzer JE, Lindberg SA (2006) Prevalence of DSM/ICD-defined nicotine dependence. 
Drug and Alcohol Dependence 85:91-102 Hughes JR, Keely J, Naud S (2004) Shape of the relapse curve and long-term abstinence among untreated smokers. Addiction 99:29-38


Hughes, J. R., Higgins, S. T., & Bickel, W. K. (1994). Nicotine withdrawal versus other drug withdrawal syndromes: similarities and dissimilarities. Addiction, 89(11), 1461-1470.
Hukkanen J, Jacob P 3rd, Benowitz NL (2005) Metabolism and disposition kinetics of nicotine. Pharmacol Rev 57:79-115
Hull, C. (1943). “Principles of behavior.” Oxford, England.
Imperato, A., Mulas, A., & Di Chiara, G. (1986). Nicotine preferentially stimulates dopamine release in the limbic system of freely moving rats. European journal of pharmacology, 132(2), 337-338.
Jasinska, A. J., Zorick, T., Brody, A. L., & Stein, E. A. (2014). Dual role of nicotine in addiction and cognition: a review of neuroimaging studies in humans. Neuropharmacology, 84, 111-122.
Johnson, P. M., Hollander, J. A., & Kenny, P. J. (2008). Decreased brain reward function during nicotine withdrawal in C57BL6 mice: evidence from intracranial self-stimulation (ICSS) studies. Pharmacology Biochemistry and Behavior, 90(3), 409-415.
Jones, I. W., & Wonnacott, S. (2004). Precise localization of α7 nicotinic acetylcholine receptors on glutamatergic axon terminals in the rat ventral tegmental area. The Journal of neuroscience, 24(50), 11244-11252.
Jonkman, S., Pelloux, Y., & Everitt, B. J. (2012). Differential roles of the dorsolateral and midlateral striatum in punished cocaine seeking. The Journal of Neuroscience, 32(13), 4645-4650.
Jorenby DE, Hays JT, Rigotti NA, Azoulay S, Watsky EJ, Williams KE, Billing CB, Gong J, Reeves KR (2006) Efficacy of Varenicline, an α4β2 Nicotinic Acetylcholine Receptor Partial Agonist, vs Placebo or Sustained-Release Bupropion for Smoking Cessation. JAMA 296(1):56-63
Kalant, H. (2015). Neurobiological research on addiction: What value has it added to the concept? The International Journal of Alcohol and Drug Research 4(1):53-59
Karlin A (2002) Emerging structure of the nicotinic acetylcholine receptors. Nat Rev Neurosci 3(2):102-14
Kassel, J. D., Stroud, L. R., & Paronis, C. A. (2003). Smoking, stress, and negative affect: correlation, causation, and context across stages of smoking. Psychological bulletin, 129(2), 270.
Kasza KA, Bansal-Travers M, O’Connor RJ, Compton WM, Kettermann A, Borek N, Fong GT, Cummings KM, Hyland AJ (2014) Cigarette Smokers’ Use of Unconventional Tobacco Products and Associations With Quitting Activity: Findings From the ITC-4 U.S. Cohort. Nicotine & Tobacco Research 16(6):672-681
Kennett, J., Matthews, S., & Snoek, A. (2013). Pleasure and addiction. Frontiers in psychiatry, 4.
Kenny, P. J., & Markou, A. (2006). Nicotine self-administration acutely activates brain reward systems and induces a long-lasting increase in reward sensitivity. Neuropsychopharmacology, 31(6), 1203-1211.
Koob, G. F. (2006). The neurobiology of addiction: a neuroadaptational view relevant for diagnosis. Addiction, 101(s1), 23-30.
Koob, G. F. (2010). The role of CRF and CRF-related peptides in the dark side of addiction. Brain research, 1314, 3-14.
Koob, G. F., & Le Moal, M. (1997). Drug abuse: hedonic homeostatic dysregulation. Science, 278(5335), 52-58.
Koob, G. F., & Le Moal, M. (2001). Drug addiction, dysregulation of reward, and allostasis. Neuropsychopharmacology, 24(2), 97-129.
Koob, G. F., & Le Moal, M. (2008). Neurobiological mechanisms for opponent motivational processes in addiction. Philosophical Transactions of the Royal Society B: Biological Sciences, 363(1507), 3113-3123.
Koob, G. F., & Volkow, N. D. (2010). Neurocircuitry of addiction. Neuropsychopharmacology, 35(1), 217-238.
Koob, G. F., Ahmed, S. H., Boutrel, B., Chen, S. A., Kenny, P. J., Markou, A., ... & Sanna, P. P. (2004). Neurobiological mechanisms in the transition from drug use to drug dependence. Neuroscience & Biobehavioral Reviews, 27(8), 739-749.
Lammel, S., Lim, B. K., & Malenka, R. C. (2014). Reward and aversion in a heterogeneous midbrain dopamine system. Neuropharmacology, 76, 351-359.

Lammel, S., Lim, B. K., Ran, C., Huang, K. W., Betley, M. J., Tye, K. M., ... & Malenka, R. C. (2012). Input-specific control of reward and aversion in the ventral tegmental area. Nature, 491(7423), 212-217.
Lanca, A. J., Adamson, K. L., Coen, K. M., Chow, B. L. C., & Corrigall, W. A. (2000). The pedunculopontine tegmental nucleus and the role of cholinergic neurons in nicotine self-administration in the rat: a correlative neuroanatomical and behavioral study. Neuroscience, 96(4), 735-742.
Larsson, A., & Engel, J. A. (2004). Neurochemical and behavioral studies on ethanol and nicotine interactions. Neuroscience & Biobehavioral Reviews, 27(8), 713-720.
Laviolette, S. R., & van der Kooy, D. (2003). The motivational valence of nicotine in the rat ventral tegmental area is switched from rewarding to aversive following blockade of the α7-subunit-containing nicotinic acetylcholine receptor. Psychopharmacology, 166(3), 306-313.
Laviolette, S. R., & van der Kooy, D. (2003a). Blockade of mesolimbic dopamine transmission dramatically increases sensitivity to the rewarding effects of nicotine in the ventral tegmental area. Molecular psychiatry, 8(1), 50-59.
Le Foll, B., & Goldberg, S. R. (2005). Nicotine induces conditioned place preferences over a large range of doses in rats. Psychopharmacology, 178(4), 481-492.
Le Foll, B., & Goldberg, S. R. (2009). Effects of nicotine in experimental animals and humans: an update on addictive properties. In Nicotine Psychopharmacology (pp. 335-367). Springer Berlin Heidelberg.
Lê, A. D., Corrigall, W. A., Watchus, J., Harding, S., Juzytsch, W., & Li, T. K. (2000). Involvement of Nicotinic Receptors in Alcohol Self‐Administration. Alcoholism: Clinical and Experimental Research, 24(2), 155-163.
Lê, A. D., Funk, D., Lo, S., & Coen, K. (2014). Operant self-administration of alcohol and nicotine in a preclinical model of co-abuse. Psychopharmacology, 231(20), 4019-4029.
Le, A. D., Li, Z., Funk, D., Shram, M., Li, T. K., & Shaham, Y. (2006). Increased vulnerability to nicotine self-administration and relapse in alcohol-naive offspring of rats selectively bred for high alcohol intake. The Journal of neuroscience, 26(6), 1872-1879.
Lê, A. D., Lo, S., Harding, S., Juzytsch, W., Marinelli, P. W., & Funk, D. (2010). Coadministration of intravenous nicotine and oral alcohol in rats. Psychopharmacology, 208(3), 475-486.
Le, A. D., Wang, A., Harding, S., Juzytsch, W., & Shaham, Y. (2003). Nicotine increases alcohol self-administration and reinstates alcohol seeking in rats. Psychopharmacology, 168(1-2), 216-221.
LeBlanc, K. H., Maidment, N. T., & Ostlund, S. B. (2013). Repeated cocaine exposure facilitates the expression of incentive motivation and induces habitual control in rats.
Lecca, D., Cacciapaglia, F., Valentini, V., Acquas, E., & Di Chiara, G. (2007). Differential neurochemical and behavioral adaptation to cocaine after response contingent and noncontingent exposure in the rat. Psychopharmacology, 191(3), 653-667.
Lerman, C., LeSage, M. G., Perkins, K. A., O'Malley, S. S., Siegel, S. J., Benowitz, N. L., & Corrigall, W. A. (2007). Translational research in medication development for nicotine dependence. Nature reviews Drug discovery, 6(9), 746-762.
LeSage, M. G., Burroughs, D., Dufek, M., Keyler, D. E., & Pentel, P. R. (2004). Reinstatement of nicotine self-administration in rats by presentation of nicotine-paired stimuli, but not nicotine priming. Pharmacology Biochemistry and Behavior, 79(3), 507-513.
Lindstrom J (1996) Neuronal nicotinic acetylcholine receptors. In “Ion Channels” (337-450) Springer US.
Lingawi, N. W., & Balleine, B. W. (2012). Amygdala central nucleus interacts with dorsolateral striatum to regulate the acquisition of habits. The Journal of Neuroscience, 32(3), 1073-1081.

Lodge, D. J., & Grace, A. A. (2006). The laterodorsal tegmentum is essential for burst firing of ventral tegmental area dopamine neurons. Proceedings of the National Academy of Sciences of the United States of America, 103(13), 5167-5172.
Loukas A, Batanova M, Fernandez A, Agarwal D (2015) Changes in use of cigarettes and non-cigarette alternative products among college students. Addictive Behaviors 49:46-51
Mackintosh, N. J. (1974). The psychology of animal learning. Academic Press.
MacLaren, D. A., Wilson, D. I., & Winn, P. (2013). Updating of action–outcome associations is prevented by inactivation of the posterior pedunculopontine tegmental nucleus. Neurobiology of learning and memory, 102, 28-33.
Mangieri, R. A., Cofresí, R. U., & Gonzales, R. A. (2012). Ethanol seeking by Long Evans rats is not always a goal-directed behavior.
Mangieri, R. A., Cofresí, R. U., & Gonzales, R. A. (2014). Ethanol exposure interacts with training conditions to influence behavioral adaptation to a negative instrumental contingency. Frontiers in behavioral neuroscience, 8.
Mansvelder, H. D., & McGehee, D. S. (2000). Long-term potentiation of excitatory inputs to brain reward areas by nicotine. Neuron, 27(2), 349-357.
Mansvelder, H. D., & McGehee, D. S. (2002). Cellular and Synaptic Mechanisms of Nicotine Addiction.
Mantsch, J. R., Baker, D. A., Funk, D., Lê, A. D., & Shaham, Y. (2015). Stress-Induced Reinstatement of Drug Seeking: 20 Years of Progress. Neuropsychopharmacology.
Markou, A., Arroyo, M., & Everitt, B. J. (1999). Effects of contingent and non-contingent cocaine on drug-seeking behavior measured using a second-order schedule of cocaine reinforcement in rats. Neuropsychopharmacology, 20(6), 542-555.
Marks MJ, Stitzel JA, Collins AC (1985) Time course study of the effects of chronic nicotine infusion on drug response and brain receptors. The Journal Of Pharmacology and Experimental Therapeutics 235(3) 619-628
Maskos, U. (2008). The cholinergic mesopontine tegmentum is a relatively neglected nicotinic master modulator of the dopaminergic system: relevance to drugs of abuse and pathology. British journal of pharmacology, 153(S1), S438-S445.
Mathers CD, Loncar D (2006) Projections of Global Mortality and Burden of Disease from 2002 to 2030. PLOS Med. 3(11):e422
Matta, S. G., Balfour, D. J., Benowitz, N. L., Boyd, R. T., Buccafusco, J. J., Caggiula, A. R., ... & Zirger, J. M. (2007). Guidelines on nicotine dose selection for in vivo research. Psychopharmacology, 190(3), 269-319.
Mereu, G., Yoon, K. W. P., Boi, V., Gessa, G. L., Naes, L., & Westfall, T. C. (1987). Preferential stimulation of ventral tegmental area dopaminergic neurons by nicotine. European journal of pharmacology, 141(3), 395-399.
Miguéns, M., Crespo, J. A., Del Olmo, N., Higuera-Matas, A., Montoya, G. L., García-Lecumberri, C., & Ambrosio, E. (2008). Differential cocaine-induced modulation of glutamate and dopamine transporters after contingent and non-contingent administration. Neuropharmacology, 55(5), 771-779.
Miles, F. J., Everitt, B. J., & Dickinson, A. (2003). Oral cocaine seeking by rats: action or habit? Behavioral neuroscience, 117(5), 927.
Millenson, J. R. (1963). RANDOM INTERVAL SCHEDULES OF REINFORCEMENT. Journal of the Experimental Analysis of Behavior, 6(3), 437-443.
Miller, D. K., Wilkins, L. H., Bardo, M. T., Crooks, P. A., & Dwoskin, L. P. (2001). Once weekly administration of nicotine produces long-lasting locomotor sensitization in rats via a nicotinic receptor-mediated mechanism. Psychopharmacology, 156(4), 469-476.
Miller, N. S., & Gold, M. S. (1994). Dissociation of “conscious desire” (craving) from and relapse in alcohol and cocaine dependence. Annals of Clinical Psychiatry, 6(2), 99-106.
Miller, N. S., & Gold, M. S. (1998). Comorbid cigarette and alcohol addiction: epidemiology and treatment. Journal of addictive diseases, 17(1), 55-66.

Moore D, Aveyard P, Connock M, Wang D, Fry-Smith A, Barton P (2009) Effectiveness and safety of nicotine replacement therapy assisted reduction to stop smoking: systematic review and meta-analysis. BMJ 338:b1024
Morrison, G. R., & Collyer, R. (1974). Taste-mediated conditioned aversion to an exteroceptive stimulus following LiCl poisoning. Journal of Comparative and Physiological Psychology, 86(1), 51.
Mucha, R. F., Van Der Kooy, D., O'Shaughnessy, M., & Bucenieks, P. (1982). Drug reinforcement studied by the use of place conditioning in rat. Brain research, 243(1), 91-105.
Murray, J. E., Belin, D., & Everitt, B. J. (2012). Double dissociation of the dorsomedial and dorsolateral striatal control over the acquisition and performance of cocaine seeking. Neuropsychopharmacology, 37(11), 2456-2466.
Mwenifumbo, J. C., & Tyndale, R. F. (2009). Molecular genetics of nicotine metabolism. In Nicotine Psychopharmacology (pp. 235-259). Springer Berlin Heidelberg.
Nakajima, M., Yamamoto, T., Nunoya, K. I., Yokoi, T., Nagashima, K., Inoue, K., ... & Kuroiwa, Y. (1996). Role of human cytochrome P4502A6 in C-oxidation of nicotine. and Disposition, 24(11), 1212-1217.
Negus, S. S., & Miller, L. L. (2014). Intracranial self-stimulation to evaluate abuse potential of drugs. Pharmacological reviews, 66(3), 869-917.
Nelson, A., & Killcross, S. (2006). Amphetamine exposure enhances habit formation. The Journal of neuroscience, 26(14), 3805-3812.
Nestler, E. J. (2005). Is there a common molecular pathway for addiction? Nature neuroscience, 8(11), 1445-1449.
Nordquist, R. E., Voorn, P., De Mooij-van Malsen, J. G., Joosten, R. N. J. M. A., Pennartz, C. M. A., & Vanderschuren, L. J. M. J. (2007). Augmented reinforcer value and accelerated habit formation after repeated amphetamine treatment. European neuropsychopharmacology, 17(8), 532-540.
Öberg M, Jaakkola MS, Woodward A, Peruga A, Pruss-Ustun A (2011) Worldwide burden of disease from exposure to second-hand smoke: a retrospective analysis of data from 192 countries. Lancet 377:139-46
O'Brien, C., & McLellan, A. T. (1996). Myths about the treatment of addiction. The Lancet, 347(8996), 237-240.
O'Dell, L. E., & Khroyan, T. V. (2009). Rodent models of nicotine reward: what do they tell us about tobacco abuse in humans? Pharmacology Biochemistry and Behavior, 91(4), 481-488.
Olds, J., & Milner, P. (1954). Positive reinforcement produced by electrical stimulation of septal area and other regions of rat brain. Journal of comparative and physiological psychology, 47(6), 419.
Olmstead, M. C., Lafond, M. V., Everitt, B. J., & Dickinson, A. (2001). Cocaine seeking by rats is a goal-directed action. Behavioral neuroscience, 115(2), 394.
Omelchenko, N., & Sesack, S. R. (2005). Laterodorsal tegmental projections to identified cell populations in the rat ventral tegmental area. Journal of Comparative Neurology, 483(2), 217-235.
Ostlund, S. B., & Balleine, B. W. (2005). Lesions of medial prefrontal cortex disrupt the acquisition but not the expression of goal-directed learning. The Journal of neuroscience, 25(34), 7763-7770.
Ostlund, S. B., & Balleine, B. W. (2009). On habits and addiction: An associative analysis of compulsive drug seeking. Drug discovery today: Disease models, 5(4), 235-245.
Palmatier, M. I., Evans-Martin, F. F., Hoffman, A., Caggiula, A. R., Chaudhri, N., Donny, E. C., ... & Sved, A. F. (2006). Dissociating the primary reinforcing and reinforcement-enhancing effects of nicotine using a rat self-administration paradigm with concurrently available drug and environmental reinforcers. Psychopharmacology, 184(3-4), 391-400.
Panlilio, L. V., & Goldberg, S. R. (2007). Self‐administration of drugs in animals and humans as a model and an investigative tool. Addiction, 102(12), 1863-1870.

Paredes-Olay, C., & López, M. (2002). Lithium-induced outcome devaluation in instrumental conditioning: Dose–effect analysis. Physiology & behavior, 75(5), 603-609.
Parkes, S. L., & Balleine, B. W. (2013). Incentive memory: evidence the basolateral amygdala encodes and the insular cortex retrieves outcome values to guide choice between goal-directed actions. The Journal of Neuroscience, 33(20), 8753-8763.
Patel, R. R., Ryu, J. H., & Vassallo, R. (2008). Cigarette smoking and diffuse lung disease. Drugs, 68(11), 1511-1527.
Patterson, F., Benowitz, N., Shields, P., Kaufmann, V., Jepson, C., Wileyto, P., ... & Lerman, C. (2003). Individual differences in nicotine intake per cigarette. Cancer Epidemiology Biomarkers & Prevention, 12(5), 468-471.
Peng X, Gerzanich V, Anand R, Whiting PJ, Lindstrom J (1994) Nicotine-induced increase in neuronal nicotinic receptors results from a decrease in the rate of receptor turnover. Molecular Pharmacology 46:523-530
Perkins, K. A. (2009). Does smoking cue‐induced craving tell us anything important about nicotine dependence? Addiction, 104(10), 1610-1616.
Perkins, K. A. (2011). Subjective reactivity to smoking cues as a predictor of quitting success. Nicotine & Tobacco Research, ntr229.
Peto R, Darby S, Deo H, Silcocks P, Whitley E, Doll R (2000) Smoking, smoking cessation, and lung cancer in the UK since 1950: combination of national statistics with two case-control studies. BMJ 321:323-329
Piazza, P. V., & Deroche-Gamonet, V. (2013). A multistep general theory of transition to addiction. Psychopharmacology, 229(3), 387-413.
Picciotto, M. R., Zoli, M., Rimondini, R., Léna, C., Marubio, L. M., Pich, E. M., ... & Changeux, J. P. (1998). Acetylcholine receptors containing the β2 subunit are involved in the reinforcing properties of nicotine. Nature, 391(6663), 173-177.
Pignatelli, M., & Bonci, A. (2015). Role of Dopamine Neurons in Reward and Aversion: A Synaptic Plasticity Perspective. Neuron, 86(5), 1145-1157.
Pitchford S, Day JW, Gordon A, Mochly-Rosen D (1992) Nicotinic acetylcholine receptor desensitization is regulated by activation-induced extracellular adenosine accumulation. J Neurosci 12(11):4540-4544
Pittenger, S. T., & Bevins, R. A. (2013a). Interoceptive conditioning in rats: Effects of using a single training dose or a set of 5 different doses of nicotine. Pharmacology Biochemistry and Behavior, 114, 82-89.
Pittenger, S. T., & Bevins, R. A. (2013b). Interoceptive conditioning with a nicotine stimulus is susceptible to reinforcer devaluation. Behavioral neuroscience, 127(3), 465.
Planeta, C. S. (2013). Animal models of alcohol and drug dependence. Revista Brasileira de Psiquiatria, 35, S140-S146.
Polosa, R., & Benowitz, N. L. (2011). Treatment of nicotine addiction: present therapeutic options and pipeline developments. Trends in pharmacological sciences, 32(5), 281-289.
Pomerleau, O. F., & Pomerleau, C. S. (1991). Research on stress and smoking: progress and problems. British journal of addiction, 86(5), 599-603.
Pons, S., Fattore, L., Cossu, G., Tolu, S., Porcu, E., McIntosh, J. M., ... & Fratta, W. (2008). Crucial role of α4 and α6 nicotinic acetylcholine receptor subunits from ventral tegmental area in systemic nicotine self-administration. The Journal of Neuroscience, 28(47), 12318-12327.
Pushparaj, A., Hamani, C., Yu, W., Shin, D. S., Kang, B., Nobrega, J. N., & Le Foll, B. (2013). Electrical stimulation of the insular region attenuates nicotine-taking and nicotine-seeking behaviors. Neuropsychopharmacology, 38(4), 690-698.
Raunio, H., Rautio, A., Gullstén, H., & Pelkonen, O. (2001). Polymorphisms of CYP2A6 and its practical consequences. British journal of clinical pharmacology, 52(4), 357-363.
Redgrave, P., Gurney, K., & Reynolds, J. (2008). What is reinforced by phasic dopamine signals? Brain research reviews, 58(2), 322-339.
Reid JL, Hammond D, Rynard VL, Burkhalter R. (2014) Tobacco Use in Canada: Patterns and Trends, 2014 Edition. Waterloo, ON: Propel Centre for Population Health Impact, University of Waterloo.

Rescorla, R. A. (1992). Depression of an instrumental response by a single devaluation of its outcome. Quarterly Journal of Experimental Psychology: Section B, 44(2), 123-136.
Rescorla, R. A. (1994). Transfer of instrumental control mediated by a devalued outcome. Animal Learning & Behavior, 22(1), 27-33.
Rice, M. E., & Cragg, S. J. (2004). Nicotine amplifies reward-related dopamine signals in striatum. Nature neuroscience, 7(6), 583-584.
Robbins, T. W., & Everitt, B. J. (1999). Drug addiction: bad habits add up. Nature, 398(6728), 567-570.
Robinson, T. E., & Berridge, K. C. (1993). The neural basis of drug craving: an incentive-sensitization theory of addiction. Brain research reviews, 18(3), 247-291.
Robinson, T. E., & Berridge, K. C. (2008). Review. The incentive sensitization theory of addiction: some current issues. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 363(1507), 3137-3146.
Rolls, B. J. (1986). Sensory‐specific satiety. Nutrition Reviews, 44(3), 93-101.
Roma, P. G., & Riley, A. L. (2005). Apparatus bias and the use of light and texture in place conditioning. Pharmacology Biochemistry and Behavior, 82(1), 163-169.
Root, D. H., Fabbricatore, A. T., Barker, D. J., Ma, S., Pawlak, A. P., & West, M. O. (2009). Evidence for habitual and goal-directed behavior following devaluation of cocaine: a multifaceted interpretation of relapse. PLoS One, 4(9), e7170.
Rossi, M. A., & Yin, H. H. (2012). Methods for studying habitual behavior in mice. Current Protocols in Neuroscience, 8-29.
Rozeboom, W. W. (1958). "What is Learned?"—An empirical enigma. Psychological review, 65(1), 22.
Salgado, S., & Kaplitt, M. G. (2015). The Nucleus Accumbens: A Comprehensive Review. Stereotactic and functional neurosurgery, 93(2), 75-93.
Samaha, A. N., & Robinson, T. E. (2005). Why does the rapid delivery of drugs to the brain promote addiction? Trends in pharmacological sciences, 26(2), 82-87.
Sanchis‐Segura, C., & Spanagel, R. (2006). REVIEW: behavioural assessment of drug reinforcement and addictive features in rodents: an overview. Addiction biology, 11(1), 2-38.
Sayette, M. A., Martin, C. S., Wertz, J. M., Shiffman, S., & Perrott, M. A. (2001). A multi‐dimensional analysis of cue‐elicited craving in heavy smokers and tobacco chippers. Addiction, 96(10), 1419-1432.
Schilström, B., Rawal, N., Mameli-Engvall, M., Nomikos, G. G., & Svensson, T. H. (2003). Dual effects of nicotine on dopamine neurons mediated by different nicotinic receptor subtypes. The International Journal of Neuropsychopharmacology, 6(01), 1-11.
Schilström, B., Svensson, H. M., Svensson, T. H., & Nomikos, G. G. (1998). Nicotine and food induced dopamine release in the nucleus accumbens of the rat: putative role of α7 nicotinic receptors in the ventral tegmental area. Neuroscience, 85(4), 1005-1009.
Schneck, N., & Vezina, P. (2012). Enhanced dorsolateral striatal activity in drug use: the role of outcome in stimulus–response associations. Behavioural brain research, 235(2), 136-142.
Schultz, W. (2007a). Multiple dopamine functions at different time courses. Annu. Rev. Neurosci., 30, 259-288.
Schultz, W. (2007b). Behavioral dopamine signals. Trends in neurosciences, 30(5), 203-210.
Schwabe, L., & Wolf, O. T. (2009). Stress prompts habit behavior in humans. The Journal of Neuroscience, 29(22), 7191-7198.
Schwabe, L., & Wolf, O. T. (2011a). Stress-induced modulation of instrumental behavior: from goal-directed to habitual control of action. Behavioural brain research, 219(2), 321-328.
Schwabe, L., Dickinson, A., & Wolf, O. T. (2011a). Stress, habits, and drug addiction: a psychoneuroendocrinological perspective. Experimental and clinical psychopharmacology, 19(1), 53.
Schwabe, L., Haddad, L., & Schachinger, H. (2008). HPA axis activation by a socially evaluated cold-pressor test. Psychoneuroendocrinology, 33(6), 890-895.

Schwabe, L., Höffken, O., Tegenthoff, M., & Wolf, O. T. (2011b). Preventing the stress-induced shift from goal-directed to habit action with a β-adrenergic antagonist. The Journal of Neuroscience, 31(47), 17317-17325.
Semenova, S., & Markou, A. (2003). Clozapine treatment attenuated somatic and affective signs of nicotine and amphetamine withdrawal in subsets of rats exhibiting hyposensitivity to the initial effects of clozapine. Biological psychiatry, 54(11), 1249-1264.
Serlin, H., & Torregrossa, M. M. (2014). Adolescent rats are resistant to forming ethanol seeking habits. Developmental cognitive neuroscience.
Shafey, O., Dolwick, S., & Guindon, G. E. (2003). Tobacco control country profiles. Atlanta: American Cancer Society, 356.
Shiffman, S. (2009). Responses to smoking cues are relevant to smoking and relapse. Addiction, 104(10), 1617-1618.
Shiffman, S. M., & Jarvik, M. E. (1976). Smoking withdrawal symptoms in two weeks of abstinence. Psychopharmacology, 50(1), 35-39.
Shiffman, S., Paty, J. A., Gnys, M., Kassel, J. A., & Hickcox, M. (1996). First lapses to smoking: within-subjects analysis of real-time reports. Journal of consulting and clinical psychology, 64(2), 366.
Shillinglaw, J. E., Everitt, I. K., & Robinson, D. L. (2014). Assessing behavioral control across reinforcer solutions on a fixed-ratio schedule of reinforcement in rats. Alcohol, 48(4), 337-344.
Shoaib, M. (1996). Determinants of nicotine self‐administration. Drug development research, 38(3‐4), 212-221.
Shoaib, M., Schindler, C. W., & Goldberg, S. R. (1997). Nicotine self-administration in rats: strain and nicotine pre-exposure effects on acquisition. Psychopharmacology, 129(1), 35-43.
Shram, M. J., Funk, D., Li, Z., & Lê, A. D. (2008). Nicotine self-administration, extinction responding and reinstatement in adolescent and adult male rats: evidence against a biological vulnerability to nicotine addiction during adolescence. Neuropsychopharmacology, 33(4), 739-748.
Sinclair, J. D., Kampov-Polevoy, A., Stewart, R., & Li, T. K. (1992). Taste preferences in rat lines selected for low and high alcohol consumption. Alcohol, 9(2), 155-160.
Singer, G., Wallace, M., & Hall, R. (1982). Effects of dopaminergic nucleus accumbens lesions on the acquisition of schedule induced self-injection of nicotine in the rat. Pharmacology Biochemistry and Behavior, 17(3), 579-581.
Sjoerds, Z., Luigjes, J., Van Den Brink, W., Denys, D., & Yücel, M. (2014). The role of habits and motivation in human drug addiction: a reflection. Frontiers in psychiatry, 5.
Skinner, M. D., & Aubin, H. J. (2010). Craving's place in addiction theory: contributions of the major models. Neuroscience & Biobehavioral Reviews, 34(4), 606-623.
Smith, A., & Roberts, D. C. S. (1995). Oral self-administration of sweetened nicotine solutions by rats. Psychopharmacology, 120(3), 341-346.
Smith, K. S., & Graybiel, A. M. (2013). A dual operator view of habitual behavior reflecting cortical and striatal dynamics. Neuron, 79(2), 361-374.
Smith, K. S., & Graybiel, A. M. (2014). Investigating habits: strategies, technologies and models. Frontiers in behavioral neuroscience, 8:39
Sofuoglu, M., Yoo, S., Hill, K. P., & Mooney, M. (2008). Self-administration of intravenous nicotine in male and female cigarette smokers. Neuropsychopharmacology, 33(4), 715-720.
Solomon, R. L., & Corbit, J. D. (1974). An opponent-process theory of motivation: I. Temporal dynamics of affect. Psychological review, 81(2), 119.
Spanagel, R., & Hölter, S. M. (1999). Long-term alcohol self-administration with repeated alcohol deprivation phases: an animal model of alcoholism? Alcohol and Alcoholism, 34(2), 231-243.

Spealman, R. D., & Goldberg, S. R. (1978). Drug self-administration by laboratory animals: control by schedules of reinforcement. Annual Review of Pharmacology and Toxicology, 18(1), 313-339.
Spear, L. (2000). Modeling Adolescent Development and Alcohol Use in Animals.
Spear, N. E., & Miller, R. R. (1981). Information processing in animals. In Binghamton Symposium on Memory Mechanisms in Animal Behavior (1980). L. Erlbaum Associates.
Stafford, D., LeSage, M. G., & Glowa, J. R. (1998). Progressive-ratio schedules of drug delivery in the analysis of drug self-administration: a review. Psychopharmacology, 139(3), 169-184.
Stefanski, R., Ladenheim, B., Lee, S. H., Cadet, J. L., & Goldberg, S. R. (1999). Neuroadaptations in the dopaminergic system after active self-administration but not after passive administration of . European journal of pharmacology, 371(2), 123-135.
Steiner, R. C., & Picciotto, M. R. (2006). 3 Animal Models of Nicotine Addiction: Implications for Medications Development. Medication Treatments for Nicotine Dependence, 39.
Stellar, J. R., & Corbett, D. (1989). Regional neuroleptic microinjections indicate a role for nucleus accumbens in lateral hypothalamic self-stimulation reward. Brain research, 477(1), 126-143.
Stewart, J., De Wit, H., & Eikelboom, R. (1984). Role of unconditioned and conditioned drug effects in the self-administration of opiates and . Psychological review, 91(2), 251.
Stoker, A. K., & Markou, A. (2011). The intracranial self-stimulation procedure provides quantitative measures of brain reward function. In Mood and Anxiety Related Phenotypes in Mice (pp. 307-331). Humana Press.
Stoker, A. K., Semenova, S., & Markou, A. (2008). Affective and somatic aspects of spontaneous and precipitated nicotine withdrawal in C57BL/6J and BALB/cByJ mice. Neuropharmacology, 54(8), 1223-1232.
Stolerman IP, Shoaib M (1991) The neurobiology of tobacco addiction. Trends Pharmacol. Sci. 12(12):467-73
Stolerman, I. P., Garcha, H. S., & Mirza, N. R. (1995). Dissociations between the locomotor stimulant and depressant effects of nicotinic agonists in rats. Psychopharmacology, 117(4), 430-437.
Stolerman, I. P., Garcha, H. S., Pratt, J. A., & Kumar, R. (1984). Role of training dose in discrimination of nicotine and related compounds by rats. Psychopharmacology, 84(3), 413-419.
Taly, A., Corringer, P. J., Guedin, D., Lestage, P., & Changeux, J. P. (2009). Nicotinic receptors: allosteric transitions and therapeutic targets in the nervous system. Nature reviews Drug discovery, 8(9), 733-750.
Tapper, A. R., McKinney, S. L., Nashmi, R., Schwarz, J., Deshpande, P., Labarca, C., ... & Lester, H. A. (2004). Nicotine activation of α4* receptors: sufficient for reward, tolerance, and sensitization. Science, 306(5698), 1029-1032.
Tiffany, S. T. (1990). A cognitive model of drug urges and drug-use behavior: role of automatic and nonautomatic processes. Psychological review, 97(2), 147.
Tiffany, S. T., & Carter, B. L. (1998). Is craving the source of compulsive drug use? Journal of Psychopharmacology, 12(1), 23-30.
Tolman, E. C. (1949). There is more than one kind of learning. Psychological review, 56(3), 144.
Tricomi, E., Balleine, B. W., & O’Doherty, J. P. (2009). A specific role for posterior dorsolateral striatum in human habit learning. European Journal of Neuroscience, 29(11), 2225-2232.
Tuesta L, Fowler CD, Kenny PJ (2011) Recent advances in understanding nicotinic receptor signaling mechanisms that regulate drug self-administration behavior. Biochem Pharmacol 82(8):984-995

Tzschentke, T. M. (1998). Measuring reward with the conditioned place preference paradigm: a comprehensive review of drug effects, recent progress and new issues. Progress in neurobiology, 56(6), 613-672.
Tzschentke, T. M. (2007). Review on CPP: Measuring reward with the conditioned place preference (CPP) paradigm: update of the last decade. Addiction biology, 12(3‐4), 227-462.
Vanderschuren, L. J., & Everitt, B. J. (2004). Drug seeking becomes compulsive after prolonged cocaine self-administration. Science, 305(5686), 1017-1019.
Vanderschuren, L. J., & Everitt, B. J. (2005). Behavioral and neural mechanisms of compulsive drug seeking. European journal of pharmacology, 526(1), 77-88.
Vlachou, S., & Markou, A. (2011). Intracranial self-stimulation. In Animal models of drug addiction (pp. 3-56). Humana Press.
Wada, E., Wada, K., Boulter, J. I. M., Deneris, E., Heinemann, S., Patrick, J. I. M., & Swanson, L. W. (1989). Distribution of alpha2, alpha3, alpha4, and beta2 neuronal nicotinic receptor subunit mRNAs in the central nervous system: a hybridization histochemical study in the rat. Journal of Comparative Neurology, 284(2), 314-335.
Walters, C. L., Brown, S., Changeux, J. P., Martin, B., & Damaj, M. I. (2006). The β2 but not α7 subunit of the nicotinic acetylcholine receptor is required for nicotine-conditioned place preference in mice. Psychopharmacology, 184(3-4), 339-344.
Wang, H. L., & Morales, M. (2009). Pedunculopontine and laterodorsal tegmental nuclei contain distinct populations of cholinergic, glutamatergic and GABAergic neurons in the rat. European Journal of Neuroscience, 29(2), 340-358.
Wang, L. P., Li, F., Wang, D., Xie, K., Wang, D., Shen, X., & Tsien, J. Z. (2011). NMDA receptors in dopaminergic neurons are crucial for habit learning. Neuron, 72(6), 1055-1066.
Watkins, S. S., Epping-Jordan, M. P., Koob, G. F., & Markou, A. (1999). Blockade of nicotine self-administration with nicotinic antagonists in rats. Pharmacology Biochemistry and Behavior, 62(4), 743-751.
Wenzel, J. M., Rauscher, N. A., Cheer, J. F., & Oleson, E. B. (2014). A role for phasic dopamine release within the nucleus accumbens in encoding aversion: a review of the neurochemical literature. ACS chemical neuroscience, 6(1), 16-26.
White, N. M. (1989). Reward or reinforcement: what's the difference? Neuroscience & Biobehavioral Reviews, 13(2), 181-186.
Williams, B. A. (1989). The effects of response contingency and reinforcement identity on response suppression by alternative reinforcement. Learning and Motivation, 20(2), 204-224.
Wise, R. A. (2006). Role of brain dopamine in food reward and reinforcement. Philosophical Transactions of the Royal Society B: Biological Sciences, 361(1471), 1149-1158.
Wolffgramm, J., & Heyne, A. (1995). From controlled drug intake to loss of control: the irreversible development of drug addiction in the rat. Behavioural brain research, 70(1), 77-94.
Wray, J. M., Gass, J. C., & Tiffany, S. T. (2013). A systematic review of the relationships between craving and smoking cessation. Nicotine & Tobacco Research, nts268.
Wu PH, Schulz KM (2012) Advancing addiction treatment: what can we learn from animal studies? ILAR Journal, 53(1):4-13.
Yalachkov, Y., & Naumer, M. J. (2011). Involvement of action-related brain regions in nicotine addiction. Journal of neurophysiology, 106(1), 1-3.
Yalachkov, Y., Kaiser, J., & Naumer, M. J. (2009). Brain regions related to tool use and action knowledge reflect nicotine dependence. The Journal of Neuroscience, 29(15), 4922-4929.
Yalachkov, Y., Kaiser, J., Görres, A., Seehaus, A., & Naumer, M. J. (2013). Sensory modality of smoking cues modulates neural cue reactivity. Psychopharmacology, 225(2), 461-471.
Yamanaka H, Nakajima M, Katoh M, Kanoh A, Tamura O, Ishibashi H, Yokoi T (2005) Trans-3’-hydroxycotinine O- and N-glucuronidations in human liver microsomes. Drug Metabolism and Disposition 33:23-30

Yanagita, T., Ando, K., Wakasa, Y., & Shimada, A. (1995). Behavioral and biochemical analysis of the dependence properties of nicotine. In Effects of Nicotine on Biological Systems II (pp. 225-232). Birkhäuser Basel.
Yin, H. H., & Knowlton, B. J. (2006). The role of the basal ganglia in habit formation. Nature Reviews Neuroscience, 7(6), 464-476.
Yin, H. H., Knowlton, B. J., & Balleine, B. W. (2004). Lesions of dorsolateral striatum preserve outcome expectancy but disrupt habit formation in instrumental learning. European Journal of Neuroscience, 19(1), 181-189.
Yin, H. H., Knowlton, B. J., & Balleine, B. W. (2005a). Blockade of NMDA receptors in the dorsomedial striatum prevents action–outcome learning in instrumental conditioning. European Journal of Neuroscience, 22(2), 505-512.
Yin, H. H., Knowlton, B. J., & Balleine, B. W. (2006). Inactivation of dorsolateral striatum enhances sensitivity to changes in the action–outcome contingency in instrumental conditioning. Behavioural brain research, 166(2), 189-196.
Yin, H. H., Ostlund, S. B., Knowlton, B. J., & Balleine, B. W. (2005b). The role of the dorsomedial striatum in instrumental conditioning. European Journal of Neuroscience, 22(2), 513-523.
Zapata, A., Minney, V. L., & Shippenberg, T. S. (2010). Shift from goal-directed to habitual cocaine seeking after prolonged experience in rats. The Journal of Neuroscience, 30(46), 15457-15463.
Zhang, H., & Sulzer, D. (2004). Frequency-dependent modulation of dopamine release by nicotine. Nature neuroscience, 7(6), 581-582.