1 2 3 A role for descending auditory cortical projections in songbird 4 5 6

7

8 Yael Mandelblat-Cerf*, Liora Las*, Natalia Denissenko*, and Michale S. Fee†

9

10 McGovern Institute for Research 11 Department of Brain and Cognitive Sciences 12 Massachusetts Institute of Technology 13 Cambridge, MA 14 15 16 * These authors contributed equally

17 18 †To whom correspondence should be addressed: 19 [email protected] 20 MIT 46-5133 21 77 Massachusetts Ave 22 Cambridge, MA 02139 23 (617) 324-0173 (phone) 24 (617) 324-0256 (fax) 25 26 27

28 Abstract (149 words):

29 Many learned motor behaviors are acquired by comparing ongoing behavior with

30 an internal representation of correct performance, rather than using an explicit external

31 reward. For example, juvenile songbirds learn to sing by comparing their song with the

32 memory of a tutor song. At present, the brain regions subserving song evaluation are not

33 known. Here we report several findings suggesting that song evaluation involves an

34 avian ‘cortical’ area previously shown to project to the dopaminergic midbrain and other

35 downstream targets. We find that this ventral portion of the intermediate arcopallium

36 (AIV) receives inputs from auditory cortical areas, and that lesions of AIV result in

37 significant deficits in vocal learning. Additionally, AIV neurons exhibit fast responses to

38 disruptive auditory feedback presented during singing, but not during nonsinging periods.

39 Our findings suggest that auditory cortical areas may guide learning by transmitting song

40 evaluation signals to the dopaminergic midbrain and/or other subcortical targets.

41 Introduction

42 Most human behaviors, such as speech, music, or athletic performance, are learned

43 through a gradual process of trial and error. In all of these behaviors, the motor actions

44 are shaped during learning by an internal model of good performance (Wolpert et al.,

45 1995). While a great deal has been learned about the neural mechanisms by which

46 external rewards, such as food or juice drops, are represented in the brain and might

47 shape future behavior (Schultz et al., 1997, Hikosaka, 2007), little is known about how

48 the brain evaluates its own performance, where such self-evaluation is computed and how

49 these signals might shape future behavior.

50 Here we use the songbird as a model system to understand how complex

51 behaviors can be learned by self-evaluation of motor performance, rather than relying on

52 explicit external rewards such as food or social reinforcement. Songbirds learn their

53 vocalizations by storing a memory, or ‘template’, of their tutor’s song (Doupe and Kuhl,

54 1999). By listening to themselves sing and comparing their own song to this template

55 (Konishi, 1965), they gradually converge to what can be a nearly exact copy of the tutor’s

56 song. Vocal learning requires a basal-ganglia forebrain circuit called the anterior

57 forebrain pathway (AFP), the output of which projects to cortical premotor vocal areas

58 (Scharff and Nottebohm, 1991, Nottebohm et al., 1976, Bottjer et al., 1984). The AFP is

59 crucial for driving learned changes in song and shaping plasticity in the song motor

60 pathway (Brainard and Doupe, 2000, Andalman and Fee, 2009, Warren et al., 2011). A

61 key component of this vocal learning circuit is Area X, a basal ganglia homologue

62 containing both striatal and pallidal components (Person et al., 2008).

63 Of critical importance in understanding the mechanisms of vocal sensorimotor

64 learning in the songbird is to determine where song is evaluated and how the resulting

65 evaluation signals are transmitted to the AFP to shape learning. Some hypotheses have

66 emphasized the possible role of HVC (used as a proper name), a premotor cortical circuit

67 that receives auditory inputs and also projects to Area X (Prather et al., 2008, Troyer and

68 Doupe, 2000, Roberts et al., 2012, Mooney, 2006). While the involvement of HVC in

69 evaluating ongoing song remains an open question, several lines of evidence suggest that

70 HVC does not transmit error-related signals to the AFP (Kozhevnikov and Fee, 2007,

71 Hessler and Doupe, 1999), or receive auditory inputs (Hamaguchi et al., 2014) during

72 singing.

73 Another possibility is suggested by the large projection to Area X from

74 dopaminergic nuclei in the midbrain—the ventral tegmental area (VTA) and substantia

75 nigra pars compacta (SNc) (Person et al., 2008) (Figure 1A). Inspired by the recently

76 hypothesized role of dopaminergic signaling in reinforcement learning (Schultz et al.,

77 1997, Tsai et al., 2009, Bayer and Glimcher, 2005, Houk et al., 1994), we and others have

78 proposed that this dopaminergic input to Area X may play a role in song learning by

79 signaling vocal errors (Gale et al., 2008, Fee and Goldberg, 2011, Hara et al., 2007, Gale

80 and Perkel, 2010). Notably, a recent anatomical study in songbirds has described a

81 projection to the dopaminergic midbrain from the arcopallium (Gale et al., 2008), an

82 avian cortical region containing subtelencephalically-projecting neurons analogous to

83 those in deep layer-5 of mammalian cortex (Dugas-Ford et al., 2012, Karten, 2013).

84 Here we set out to examine the role of the cortical area projecting to VTA/SNc in

85 song learning. First, we characterized the spatial extent of the arcopallial neurons

86 projecting to VTA or SNc, and characterized the inputs to these neurons from auditory

87 cortical fields. We then carried out lesions targeted to this area to examine the effect on

88 tutor imitation and on vocal plasticity that occurs after deafening. Finally, we recorded

89 from VTA/SNc-projecting neurons in the arcopallium of singing birds and observed fast

90 transient responses to disruptive auditory feedback during singing. Altogether, our

91 findings suggest that descending auditory cortical projections may play a role in song

92 evaluation via projections to the dopaminergic midbrain and/or other subcortical targets.

93 Results

94 An auditory cortical pathway to the dopaminergic midbrain

95 To elucidate the spatial pattern of cortical neurons projecting to the dopaminergic

96 midbrain in songbirds, we made small injections of a retrograde tracer (cholera toxin

97 subunit β, CTB) into VTA and SNc (n=12 zebra finches, Figure 1B, see Methods).

98 Consistent with an earlier report (Gale et al., 2008), we observed a distinct mass of

99 retrogradely-labeled cell bodies within the ventral-most extent of the intermediate

100 arcopallium. Labeled neurons were distributed in a complex pattern that was highly

101 consistent across birds (Figure 1C and Figure 1-figure supplement 1). The main cluster of

102 labeled neurons was arranged in a stem-like column ventral to RA (robust nucleus of the

103 arcopallium), while a distinct ‘stripe’ of labeled neurons was observed 300-400µm

104 anterior to the main cluster. A smaller number of labeled neurons formed a thin shell

105 around the dorsal surface of RA, as previously described (Gale et al., 2008). We will

106 refer specifically and collectively to the arcopallial areas containing neurons retrogradely-

107 labeled from the dopaminergic midbrain, described above, as the ventral intermediate

108 arcopallium (AIV).

109 Injections of anterograde tracer (HSV viral vector expressing green fluorescent

110 protein, GFP) within the anterior and posterior clusters of AIV neurons each revealed

111 axonal arborization intermingled with TH-positive neurons in both VTA and SNc (Figure

112 1D,E, n=6 birds).

113 It is important to elucidate the anatomical relation between the area we identify as

114 AIV and the dorsal arcopallium (Ad), which has previously been implicated in vocal

115 learning (Bottjer and Altenau, 2010), and also in locomotory control (Feenders et al.,

116 2008, Jarvis et al., 2013) (called lateral intermediate arcopallium, LAI, in the latter

117 studies). Ad was labeled by injections of anterograde tracer (Alexa 488 conjugated

118 dextran 10MW) into the nidopallium lateral to LMAN (LMAN-shell), and AIV was

119 retrogradely-labeled in the same birds by injection of CTB into VTA and SNc (n=8

120 hemispheres). In coronally-sectioned tissue, Ad was seen as a band of labeled fibers

121 extending lateral to RA (Figure 1F), as previously described. Neurons retrogradely-

122 labeled from VTA/SNc were visible as a wedge of cell bodies ventrolateral to RA and

123 ventromedial to Ad. Few labeled cell bodies were observed within the region of labeled

124 fibers in Ad, although some retrogradely-labeled neurons were seen in the thin ‘neck’ of

125 labeled fibers connecting RA and Ad. Injections of retrograde tracer (CTB) into the

126 anterior or posterior parts of AIV (n=5) revealed no retrogradely-labeled neurons in the

127 nidopallium adjacent to LMAN (LMAN-shell). Thus, AIV and Ad appeared to be largely

128 distinct non-overlapping regions.

129 To further elucidate the possible cortical afferents to AIV, we made injections of

130 retrograde tracer (CTB) into different locations anterior and ventral to RA (n=10 birds).

131 Following injections into the anterior parts of AIV (n=5 birds), retrogradely-labeled

132 neurons were reliably observed in auditory cortical areas HVC-shelf, caudal mesopallium

133 (CM) and L1 (Figure 2A-F and Figure 2-figure supplement 1A-C) (Vates et al., 1996,

134 Mello et al., 1998, Kelley and Nottebohm, 1979). We note that, while these previous

135 studies have identified inputs to ventral arcopallium (RA-cup) from L1 and HVC-shelf,

136 the projection described here from CM is novel. Following injections into the more

137 posterior parts of AIV (n=5 birds), retrogradely-labeled neurons were observed in the

138 caudal nidopallium (NC) posterior to HVC-shelf (Figure 2G,H).

139 To confirm that neurons in these four areas terminate within AIV, we carried out

140 anterograde tracing using a GFP-expressing (HSV-GFP). Injections of anterograde

141 tracer into the caudal nidopallium (NC) posterior to HVC-shelf revealed axonal

142 arborization primarily in the posterior ‘stem’ region of AIV (Figure 2- figure supplement

143 1G-H, n=8 birds). Injections of anterograde tracer into auditory cortical areas L1, CM and

144 HVC-shelf all revealed axonal arborization overlapping with neurons retrogradely-

145 labeled from VTA/SNc, particularly within the anterior ‘stripe’ region of AIV (Figure 2I-

146 L, and Figure 2-figure supplement 1D-F; L1 and CM, n=4 birds each; HVC-shelf, n=10

147 birds). In these experiments, axonal arbors were not restricted to the region of

148 retrogradely-labeled neurons in AIV, but were also seen anterior to the AIV ‘stripe’ and

149 in the area directly anterior to RA.

150 The functional connectivity of projections from the auditory areas L1, CM and

151 HVC-shelf was further confirmed by electrically stimulating these areas and

152 demonstrating short latency activation of multi-unit recording sites in AIV and of

153 antidromically-identified VTA/SNc-projecting AIV neurons (Figure 3). Together these

154 findings suggest the existence of multiple descending pathways by which auditory

155 information could reach the dopaminergic midbrain and other downstream auditory

156 targets of the arcopallium (Vates et al., 1996, Mello et al., 1998, Kelley and Nottebohm,

157 1979).

158 Lesions of AIV disrupt song learning

159 To test the role of AIV in vocal learning and imitation, we carried out excitotoxic

160 lesions targeted to AIV in young zebra finches, and examined the subsequent effect on

161 song imitation. Birds were lesioned after being tutored in their home cage but prior to

162 evidence of vocal imitation (Figure 4A,B, 48-51 days post hatch, dph). After the lesion

163 surgery (see Methods), birds were maintained in isolation and their songs were recorded

164 until adulthood (90 dph). Control siblings underwent the same tutoring protocol but did

165 not receive AIV lesions. A quantification of song imitation (by analysis of pupil-tutor

166 similarity, see Methods) revealed that the AIV-lesioned birds exhibited deficits in vocal

167 learning (Figure 4C-F). Lesioned birds had a significantly lower similarity to their tutor

168 (imitation score) than their unlesioned control siblings (Figure 4-figure supplement 1A,

169 paired t-test, p=0.0048, rank-sum test between all controls and all AIV-lesioned birds

170 p=0.0024). The overall distribution of acoustic features (Tchernichovski et al., 2000) was

171 not significantly different between AIV-lesioned and control groups, suggesting that

172 these lesions did not create gross abnormalities in song production (Figure 4-figure

173 supplement 1B-D).

174 We investigated the possibility that the observed learning deficits after AIV

175 lesions might be caused by loss of neurons in dorsal arcopallium (Ad), which is

176 dorsolaterally adjacent to AIV. In a subset of juvenile birds (n=7), lesions were targeted

177 to the portion of Ad most likely affected by our AIV lesion procedure (Figure 5A, see

178 Methods). These birds underwent the same tutor exposure and learning protocol as did

179 the AIV-lesioned birds described above. The Ad-lesioned birds exhibited significantly

180 higher song imitation scores compared to AIV-lesioned birds (Wilcoxon rank-sum test

181 p=0.012), and no significant deficit in song imitation compared to non-lesioned controls

182 (Figure 5B, Wilcoxon rank-sum test p=0.74). These birds exhibited no overt motor or

183 locomotor deficits.

184 The lack of effect of Ad lesions, on either vocal learning or other aspects of motor

185 behavior, led us to wonder whether complete lesions of Ad might have an observable

186 effect. Larger lesions of Ad were carried out in three additional birds (the lesions were

187 extended 0.2 mm more laterally, but within Ad). All three birds exhibited severe akinesia

188 and immobility requiring the termination of the experiment. Because these birds did not

189 sing, we were not able to assess the effects of the larger Ad lesions on vocal learning.

190 We wondered if the song imitation deficits produced by the AIV lesions described

191 above were correlated with the size of the lesion. Several different lesion protocols were

192 tested in this study, resulting in different extents of AIV lesion across the data set (n=17

193 birds total). In all lesioned birds, the extent to which AIV was lesioned was quantified

194 histologically at age 90 dph (see Methods). Because Ad lesions minimally impacted

195 AIV, Ad-lesioned birds (n=7) were included in this analysis as sham lesions (0% AIV

196 lesion). Song imitation scores were negatively correlated with the fraction of AIV that

197 was lesioned (Figure 5B, linear regression using least square, R=0.63, F-statistic

198 p=0.0008, Figure 5-figure supplement 1A,B).

199 Birds for which we estimated that more than half of AIV was lesioned (n=8 birds,

200 referred to as a large AIV lesion) produced poor imitations of the tutor song compared to

201 controls (Figure 5C, rank-sum test, p=0.00016; Ad-lesioned and unlesioned control

202 groups were pooled in this comparison). On average, birds with large AIV lesion

203 exhibited an 879% loss of imitation capacity compared to the controls (see Methods).

204 The distribution of tutor imitation scores for birds with large lesions was not significantly

205 different from the distribution of pairwise similarity scores between unrelated adult birds

206 in our colony (Figure 5C, Wilcoxon rank-sum test, p=0.31). Altogether, our findings

207 suggest that the songs of birds with large AIV lesions were nearly as dissimilar to their

208 own tutors as unrelated adult birds in our colony are to each other.

209 Importantly, the impaired vocal imitation in AIV-lesioned birds cannot be

210 attributed to the amount of vocal practice. We compared the amount of singing in the

211 period from surgery up to 90dph for AIV-lesioned birds versus Ad-lesioned controls. No

212 difference was observed between these groups (Figure 5-figure supplement 1D,E, t-test,

213 p=0.56). Not surprisingly, both Ad- and AIV-lesioned birds sang less than the unlesioned

214 control group (ranksum p=0.003 and 0.016 respectively). However, within any

215 experimental or control group, there was no correlation between the amount of singing

216 and the degree of song imitation (F-statistics all comparisons, p>0.4).

217 In addition to impaired vocal imitation, AIV-lesioned birds developed adult songs

218 that exhibited a significantly reduced overall song stereotypy (p=0.0045, rank-sum test,

219 Figure 5-figure supplement 1C) (Aronov et al., 2008) and syllable stereotypy (p=0.01,

220 rank-sum test, Figure 5D), compared to the control groups (Ad-lesioned and unlesioned

221 control groups were individually significant, but were pooled in the comparisons stated

222 above). Another common feature of the songs of AIV-lesioned birds was the presence of

223 a large number of ‘atypical’ syllables that were uncharacteristically long in duration or

224 contained patterns of acoustic modulation not usually observed in zebra finch song.

225 AIV lesions do not have the same effect as deafening

226 Auditory feedback is necessary for vocal learning in juvenile birds, as revealed by

227 experiments showing that birds deafened as juveniles are unable to learn normal songs

228 (Konishi, 1965). Because AIV is a downstream target of auditory brain areas, we

229 wondered if the effects of AIV lesions might be similar to deafening. We took advantage

230 of the fact that deafening results in a dramatic degradation of song in young adults that

231 have largely finished learning their songs (Lombardino and Nottebohm, 2000, Nordeen

232 and Nordeen, 1992). For example, deafening at the age of 75-80dph resulted in a near

233 complete loss of song structure, typically within one week (n=14 birds, Figure 6A,D, see

234 Methods for quantification). In contrast to the effect of deafening, bilateral lesions

235 targeted to AIV (n=6 birds, 75-80 dph) produced no significant song degradation within a

236 two-week period following the lesion (t-test p=0.28, Figure 6B,D).

237 Notably, AIV lesions had no immediate effect on song structure or song

238 variability (paired t-test p=0.52, Figure 6-figure supplement 1A,B), suggesting that AIV

239 plays no direct role in song production or in the generation of vocal variability. These

240 results also suggest that our AIV lesions did not significantly disrupt fibers of passage,

241 either afferent fibers entering RA laterally from LMAN (lateral magnocellular nucleus of

242 the anterior nidopallium, an area which conveys variability into the motor output during

243 singing (Ölveczky et al., 2005, Kao et al., 2005), or efferent fibers exiting RA rostrally

244 (see Figure 6-figure supplement 1C-F).

245 AIV lesions inhibit deafening-induced song degradation

246 It has been shown that lesions of the basal ganglia-forebrain learning circuit, AFP,

247 largely prevent the degradation of song that occurs after deafening, leading to the view

248 that such degradation is an active process requiring vocal learning circuitry (Brainard and

249 Doupe, 2000, Nordeen and Nordeen, 2010). Since we have hypothesized that AIV is

250 involved in vocal learning, we wondered whether lesions targeted to this area might slow

251 the degradation of song after deafening. To test this idea, we carried out bilateral cochlear

252 removal in combination with bilateral AIV lesions (n=15, 75-80 dph). Songs were

253 recorded for 2 weeks following this procedure and were analyzed for similarity to the

254 pre-surgery song. Song degradation after deafening was significantly slower in AIV-

255 lesioned birds than in birds with intact AIV (Figure 6C,D, comparison at 7 days and 14

256 days post-lesion, t-test p=0.0074, 0.0096 respectively). The reduced rate of deafening-

257 induced song degradation after AIV lesion suggests a role for descending auditory

258 cortical pathways, including possibly those projecting to the dopaminergic midbrain, in

259 mediating the plastic changes in the AFP and motor pathways that occurs after deafening.

260 AIV neurons respond to disruptive auditory feedback during singing

261 We next wanted to test if neurons in AIV—specifically those projecting to

262 VTA/SNc—exhibit auditory responses during singing consistent with their hypothesized

263 role in vocal learning. To address this question we used a motorized microdrive to record

264 neural activity in AIV of singing zebra finches (n=7 birds, 90-120 dph). Recordings were

265 targeted based on electrophysiological mapping of RA borders (see Methods).

266 Recordings were made from 37 single units, of which 17 were antidromically-identified

267 and collision-tested VTA/SNc projectors (Figure 7A,B). We also recorded 13 multiunit

268 sites exhibiting robust antidromic response to VTA/SNc stimulation (see Methods,

269 antidromic responses to VTA/SNc stimulation were restricted to the borders of AIV as

270 defined by retrograde labeling, Figure 7-figure supplement 1). AIV single-units

271 discharged at low rates during singing (1-10Hz, Figure 7C), and exhibited only small

272 changes in average firing rate compared to non-singing (Figure 7D, significant firing rate

273 change in 15/37 neurons, paired t-test p<0.05). Only a small fraction of AIV neurons

274 exhibited significant firing rate modulations locked to the song motif (n=4/37). These

275 modulations did not appear to be premotor in nature, and were substantially weaker than

276 those reported in premotor song control nuclei (Leonardo and Fee, 2005), or auditory

277 cortical areas (Sen et al., 2001).

278 While natural song learning proceeds slowly (Tchernichovski et al., 2001),

279 playback of disruptive auditory feedback, such as brief bursts of broadband noise, can

280 induce rapid learned changes in song structure over the course of a few hours (Andalman

281 and Fee, 2009, Tumer and Brainard, 2007). Such learning has been interpreted as

282 evidence that distorted auditory feedback can be used to introduce experimentally

283 controlled ‘errors’ in song performance (Andalman and Fee, 2009, Fee and Goldberg,

284 2011, Tumer and Brainard, 2007). More than 40% of the VTA/SNc-projecting neurons

285 exhibited a significant neuronal response to distorted auditory feedback (50ms noise

286 bursts) presented during singing (Figure 7E-K, p<0.02 for n=7/17 neurons, comparison of

287 spike count in a 150ms window before and after noise burst onset for each neuron; paired

288 t-test, p<10-5 for average response, bootstrap analysis, see Methods). Significant

289 responses were also observed at 54% of the multiunit recording sites (n=7/13), and in

290 30% of AIV neurons that did not meet the criteria for antidromic identification (n=6/20

291 neurons). The response of AIV neurons to noise bursts was brief, occurring with an

292 average latency of 23±12ms from noise onset and having an average duration of

293 90±43ms (see Methods, Figure 7L). There was no significant difference between the

294 responses of antidromically-identified neurons and those that were not antidromically

295 identified. Thus, these data are pooled in the statistics on latency and duration.

296 Are AIV neurons responsive to noise bursts only during singing, or can similar

297 auditory responses be evoked during non-singing? We recorded the responses of AIV

298 neurons to noise bursts presented during playback of the bird’s own song (BOS) in 25

299 identified VTA/SNc-projecting AIV neurons under anesthesia (n=3 birds), and another

300 28 such neurons in freely-behaving birds (n=4 birds). Under these conditions, noise

301 bursts elicited no significant response in any of these neurons (Figure 8A,C,E see

302 Methods). We also examined whether AIV neurons respond to noise bursts presented in a

303 silent background (non-singing, no BOS). In this case, auditory responses were observed

304 in a small fraction of neurons in both awake and anesthetized birds: Altogether, 12/53

305 neurons exhibited a significant response within a one second window after noise onset

306 (bootstrap, p<0.02, see Methods). However, these responses were significantly slower

307 and exhibited longer latency than during singing (p<0.001, Figure 8B,D,E, only one

308 neuron responded significantly within 150ms of noise onset). In comparison, all the

309 neurons that were found to be noise-responsive during singing (significant activity within

310 a one second window), exhibited this response within 150ms of the noise burst onset.

311 Altogether, we have found that the auditory responsiveness of AIV neurons to disruptive

312 auditory stimuli is highly state-dependent, exhibiting fast and robust firing-rate changes

313 only during singing.

314

315 Discussion

316 Motivated by the hypothesized role of dopaminergic signaling in reinforcement learning

317 (Schultz et al., 1997, Tsai et al., 2009, Bayer and Glimcher, 2005, Houk et al., 1994), we

318 set out to examine the role in vocal learning of a recently-discovered songbird cortical

319 area (Gale et al., 2008) that projects to VTA and SNc. We have characterized the spatial

320 extent of neurons within the intermediate arcopallium that project to VTA and SNc, and

321 refer to the region of arcopallium retrogradely-labeled from these midbrain dopaminergic

322 areas as AIV. Using a combination of anatomical and electrophysiological techniques, we

323 have elucidated the afferent inputs to AIV neurons from other cortical areas. We

324 examined the effect on vocal learning of lesions targeted to AIV, and carried out

325 electrophysiological recordings of AIV neurons. Our findings are broadly consistent with

326 the hypothesis that an arcopallial region with descending auditory cortical projections,

327 both to the dopaminergic midbrain and to midbrain and brainstem auditory centers, plays

328 a role in vocal learning.

329 The area we have identified as AIV is adjacent to other structures that have been

330 hypothesized to play a role in vocal learning (Bottjer and Altenau, 2010). The dorsal

331 arcopallium (also referred to as the lateral intermediate arcopallium, LAI, by Jarvis et al.,

332 2013), receives inputs from the nidopallium adjacent to LMAN (LMAN-shell), and

333 projects to the basal ganglia, motor thalamus, tectum, and the reticular formation (Bottjer

334 et al., 2000). Our anatomical findings suggest that AIV, while adjacent to Ad, is part of a

335 distinct, largely non-overlapping, anatomical circuit. In contrast to Ad, AIV does not

336 receive a projection from LMAN-shell; rather it receives inputs from auditory cortical

337 areas (CM, L1 and HVC-shelf) and caudal nidopallium. Furthermore, our findings show

338 that AIV, but not Ad, projects to VTA/SNc. Given its anatomical overlap with RAcup,

339 parts of AIV also likely project to MLD or to the shell of the thalamic nucleus ovoidalis

340 (Mello et al., 1998).

341 Earlier studies have presented conflicting views on the function of Ad/LAI,

342 particularly in regard to its role in vocal learning (Bottjer et al., 2000, Achiro and Bottjer,

343 2013). Bottjer and Altenau (2010) describe evidence that lesions of Ad produce deficits

344 in vocal learning, not unlike those we report here for lesions of AIV. In contrast,

345 Feenders et al showed immediate early gene activation in Ad/LAI and in LMAN-shell

346 (which projects to Ad/LAI) during locomotory activity (i.e. hopping), but find no

347 evidence for activation in these areas during singing. This led them to suggest that

348 Ad/LAI and RA form the output of a general avian cortical motor circuit, of which RA is

349 a component specialized for vocal production (Feenders et al., 2008).

350 Our findings are consistent with the view that AIV, but not Ad, play a role in

351 vocal motor learning. We found that lesions of AIV caused deficits in vocal imitation,

352 while lesions of Ad did not cause any such deficit, suggesting that the loss of song

353 imitation following lesions targeted to AIV was not a secondary consequence of an

354 unintended lesion in Ad. Our finding that larger lesions of Ad produced severe akinesia

355 and immobility is consistent with the hypothesis that Ad plays a role in locomotion and

356 other motor behaviors, rather than vocal learning (Feenders et al., 2008).

357 There are several potential explanations for the discrepancy between our findings

358 on the effects of Ad lesions and those of Bottjer and Altenau (2010). Clearly, the Ad

359 lesion protocol of Bottjer and Altenau impacted a subset of neurons important for vocal

360 learning, while our Ad lesion protocol did not. This subset of neurons, in principle, could

361 be within Ad, within AIV, or could be in another, as yet undescribed, adjacent region

362 important for vocal learning. In our view, the most parsimonious explanation for the

363 detrimental effects on vocal learning is that the Ad lesions of Bottjer and Altenau had an

364 unintended impact on neurons in AIV. More detailed electrophysiological studies of the

365 pathways into and out of Ad will be needed to completely resolve these different views of

366 its function.

367 The effects of lesions directed to AIV in juvenile birds bear some resemblance to

368 the effects of lesions in the AFP, a basal ganglia-thalamocortical circuit necessary for

369 vocal learning. Lesions of Area X, the basal ganglia component of the AFP, in young

370 birds have a substantial detrimental effect on song imitation and result in persistent song

371 variability into adulthood (Scharff and Nottebohm, 1991). In contrast, lesions of Area X

372 have relatively little immediate effect on song performance in adult and juvenile birds

373 (Goldberg and Fee, 2011, Kojima et al., 2013). Similarly, we found that birds in which

374 AIV was lesioned at an early age produced a poor imitation of the tutor song and

375 developed a less stereotyped song than did control birds, and AIV lesions in adult or

376 juvenile birds have little immediate effect on vocal performance. Lesions of Area X also

377 largely block the degradation of song following deafening (Kojima et al., 2013), and

378 similarly, but to a lesser extent, we found that lesions of AIV slow song degradation in

379 deafened birds. The similar patterns of findings in AIV-lesioned birds and Area X-

380 lesioned birds is consistent with the possibility that the AIV, or some components of it,

381 may interact with the AFP, perhaps by transmitting to it a signal important for its

382 function.

383 In principle, there are several ways the loss of a descending auditory cortical

384 projections might affect vocal learning. We first address the possible role of the

385 projection from AIV to the dopaminergic midbrain in vocal learning. In the songbird,

386 dopaminergic inputs to Area X have been hypothesized to carry signals related to

387 motivational aspects of singing and to regulate vocal variability, in particular while

388 switching between less variable song directed to a female bird to more variable song

389 produced in social isolation (Hara et al., 2007, Leblois et al., 2010, Yanagihara and

390 Hessler, 2006). Notably, we find that juvenile and young adult birds exhibit normal song

391 variability after AIV lesions, suggesting that AIV may not be involved in the regulation

392 of vocal variability, and that the deficits in vocal learning do not result from a loss of

393 exploratory song variability. Alternatively, reduced glutamatergic drive after AIV lesion

394 could result in reduced tonic dopaminergic input to Area X. In mammals, reduced

395 dopaminergic input to the can have deleterious effects on BG function, including

396 reduced spontaneous activity and the loss of spines in the medium spiny neurons

397 (Kreitzer and Malenka, 2008). Songbirds exhibit patterns of receptor

398 expression in cortex and Area X (Kubikova et al., 2010), and responses to dopamine

399 stimulation (Ding and Perkel, 2002) that parallel those found in mammals. Finally,

400 another possibility is that impoverished vocal imitation after lesions of AIV results from

401 the loss of a reinforcement signal that carries information about recent song performance.

402 The short-latency response of AIV neurons to disrupted auditory feedback is consistent

403 with the possibility that AIV may play a role in computing or transmitting a fast online

404 signal to VTA/SNc.

405 Previous studies have shown that neurons in the vicinity of RA, and potentially

406 within AIV, have several sub-telencephalic targets other than VTA/SNc, including a

407 higher-order auditory thalamic nucleus Ov-shell and parts of the auditory midbrain

408 (Vates et al., 1996). Thus, deficits in vocal learning following lesions targeted to AIV

409 could potentially arise from loss of signaling in these other pathways. Further studies will

410 be required to elucidate the relation between neurons projecting to these different

411 downstream targets, and to determine the role of these different pathways in vocal

412 learning.

413 One key to understanding the function of AIV is to characterize its cortical

414 afferents, and several candidates have been identified. Güntürkün and colleagues have

415 described a projection in pigeons from the lateral caudal nidopallium (NCL) to the

416 ventral arcopallium (Kroner and Gunturkun, 1999). The projection we describe here in

417 the zebra finch—from the caudal nidopallium (NC) posterior to HVC—terminates

418 primarily in the posterior ‘stem’ part of AIV. This projection may be related to the

419 previously-described nidopallium caudolaterale (NCL) projection, although NCL appears

420 to be more lateral than NC. NCL was recently found to be involved in error and reward

421 processing (Starosta et al., 2013), and based on several lines of evidence, it has been

422 suggested that NCL may be analogous to mammalian prefrontal cortex (Kroner and

423 Gunturkun, 1999). Notably, in mammals the major cortical projection to VTA arises from

424 prefrontal cortex, a key structure for cognitive control and goal-directed actions (Murase,

425 1993).

426 Previous studies in the zebra finch have also identified inputs to the intermediate

427 arcopallium from auditory cortical regions, including primary auditory cortical field L1,

428 and HVC-shelf in the nidopallium (Vates et al., 1996, Mello et al., 1998, Kelley and

429 Nottebohm, 1979). These projections were found to terminate near RA, in a region

430 termed RA-cup, and are likely homologous to projections in the pigeon from the dorsal

431 nidopallium (Nd) to the ventromedial intermediate arcopallium (AIvm) (Wild et al.,

432 1993). We have confirmed the existence of these projections, and using anatomical

433 tracing and electrophysiological techniques, have demonstrated that these inputs directly

434 or indirectly innervate AIV neurons projecting to the dopaminergic midbrain. Inputs from

435 L1 and anterior parts of HVC shelf appear localized to the parts of AIV anteroventral to

436 RA. Inputs from posterior HVC shelf appear to terminate in both the anterior and

437 posterior parts of AIV, and show significant overlap with inputs from the more posterior

438 caudal nidopallium described above.

439 Our anatomical studies have also revealed a novel projection to AIV from the

440 caudal mesopallium (CM), an avian auditory area previously identified as homologous to

441 higher-order auditory cortex (Bauer et al., 2008) or to upper layers of primary auditory

442 cortex (Karten, 2013). The projection from CM terminates within AIV, and electrical

443 stimulation in CM causes spiking activity in VTA/SNc-projecting AIV neurons, likely

444 through orthodromic activation. The existence of a projection to AIV from CM and L1

445 may be important for two reasons. The cluster of neurons in CM retrogradely-labeled

446 from AIV was overlapped with the region of CM, also known as Avalanche (Av), that

447 forms bidirectional synaptic interactions with both HVC and NIf (Bauer et al., 2008,

448 Akutagawa and Konishi, 2010), two song control nuclei also involved in vocal learning

449 (Roberts et al., 2012). Furthermore, a small subpopulation of neurons in CM and L1 have

450 been shown to respond to noise bursts presented during singing, but not to noise bursts

451 presented during playback of the birds own song (Keller and Hahnloser, 2009). These

452 responses, highly reminiscent of responses in AIV, have led to the suggestion that one

453 computational function of CM and L1 may be to compare actual auditory feedback with

454 the song template (Keller and Hahnloser, 2009). Further studies are required to determine

455 if the hypothesized error-related responses of AIV neurons derive from activity in this

456 population of CM and L1 neurons.

457 One can speculate about the potential role for AIV in computing or transmitting a

458 fast online signal to VTA/SNc that potentially carries information about recent song

459 performance. Most VTA/SNc-projecting AIV neurons responded to disruptive auditory

460 feedback with a latency of less than 25ms and response duration of less than 100ms. This

461 response may have a sufficiently fast temporal resolution to mediate vocal learning,

462 assuming that the synaptic learning rules in Area X and the motor pathway employ a

463 short-term synaptic memory such as an eligibility trace (Fee and Goldberg, 2011, Sutton

464 and Barto, 1998, Redondo and Morris, 2011). Such a fast dopaminergic reinforcement

465 signal could, in principle, be used in Area X to correlate vocal performance with an

466 efference copy of vocal motor commands (Fee and Goldberg, 2011) and to drive

467 corticostriatal plasticity at HVC inputs to medium spiny neurons (MSNs). The MSNs

468 could then act through their downstream pallidothalamic pathway to 1) bias the motor

469 system in favor of vocal commands that previously led to better song performance and 2)

470 drive plasticity in the motor pathway such that the juvenile song gradually converges to

471 the desired vocal output (Andalman and Fee, 2009, Warren et al., 2011, Fee and

472 Goldberg, 2011, Charlesworth et al., 2012).

473 Additional studies would be required to establish the nature of auditory

474 reinforcement signals in the pathway we have described here, or possibly through other

475 downstream targets, and to establish a causal role for these circuits in vocal learning.

476

477 Materials and Methods

478 Subjects: Animal subjects were male zebra finches (n=110) (45-120 days post hatch,

479 dph). Birds were obtained from the Massachusetts Institute of Technology zebra finch

480 breeding facility (Cambridge, Massachusetts). The care and experimental manipulation of

481 the animals were carried out in accordance with guidelines of the National Institutes of

482 Health and were reviewed and approved by the Massachusetts Institute of Technology

483 Committee on Animal Care.

484 Identification of AIV. Retrograde labeling of neurons in AIV was obtained by injecting

485 fluorescently-labeled cholera toxin  subunit (Molecular Probes) into the ventral

486 tegmental area (VTA) and substantia nigra pars compacta (SNc). VTA/SNc were

487 localized stereotaxically with reference to the principal thalamic auditory relay nucleus

488 Ovoidalis, as described below.

489 Reliable targeting of VTA/SNc. We have developed a technique that allows us to target

490 VTA and SNc with high reliability, based on the mapped location of the thalamic

491 auditory relay nucleus Ovoidalis (Ov). Specifically, the head was oriented with the flat

492 anterior portion of the skull rotated forward at a pitch of 50 degrees. Using an

493 extracellular electrode (Carbostar-1, Kation Scientific), tilted in the LM direction 2

494 degrees towards the midline, Ov was located and mapped based on its robust auditory

495 responses. Injections were made into VTA relative to Ov as follows: 300µm anterior

496 from the center of Ov, 200µm medial from the medial edge of Ov, and 1800µm ventral to

497 the middle of Ov in the DV direction. Injections into SNc were targeted 350µm posterior

498 and 350µm lateral to the VTA coordinates. For electrophysiology experiments requiring

499 antidromic stimulation in VTA/SNc, bipolar stimulating electrodes were placed such that,

500 to the extent possible, one wire was in VTA and the other wire was in SNc.

501 Excitotoxic lesions of AIV. Bilateral lesions of AIV were carried out by injecting 2% N-

502 Methyl-DL aspartic acid (NMA, Sigma, St Louis, MO) at multiple injection sites in

503 ventral arcopallium around RA chosen to maximally lesion VTA/SNc-projecting

504 neurons. All AIV lesions were carried out stereotaxically. The head was oriented with the

505 flat anterior portion of the skull rotated forward at a pitch of 70 degrees. Because a large

506 portion of AIV is located ventral to RA, access to AIV was achieved by a laterally-

507 rotated penetration. The head was rotated 45 degrees (roll) to the left for lesions of AIV

508 in the right hemisphere, and rotated to the right for lesions of AIV in the left hemisphere.

509 Using a carbon fiber electrode (Carbostar, Inc. Mod #E1101-20) RA was then completely

510 mapped based on its characteristic electrophysiological signature (regular spiking

511 interspersed with bursts). Once all of the edges of RA were localized in the rotated frame,

512 these were used to calculate the coordinates of the injection sites. In the injection

513 coordinates given below (ML, AP, DV), dimensions are in µm; positive numbers

514 represent more lateral, more anterior, and more ventral displacements, respectively. The

515 coordinates are specified relative to the lateral edge of RA in the medial-lateral direction,

516 relative to the center of RA in the anterior-posterior direction, and relative to the most

517 ventral edge of RA in the dorsal-ventral direction. Injections were made through a glass

518 pipette using a digitally-controlled injection system (Nanoject, Drummond Scientific).

519 Injections at each site were made in multiple boluses of 13.8nL, which were spaced at 5-

520 minute intervals to avoid backflow along the pipette. AIV lesions (n=17 birds) were

521 created by up to 7 different injection sites (in 5 different penetrations) at the coordinates

522 specified below. The last number in each coordinate indicates the number of injection

523 boluses made at that site: (+200,-400,+200; 4); (+300,+0,+200; 4); (+200,+300,+0; 4);

524 (+0,+700,+0; 2); (+0,+700,+300; 2); (-300,+900,+0; 2) ; (-300,+900,+300; 2).

525 Note that, given the complex spatial structure of AIV, it was difficult to achieve a

526 complete lesion of VTA/SNc projecting neurons. For example, we did not attempt to

527 lesion the thin shell of AIV neurons dorsal to RA. Furthermore, our lesions may have

528 affected areas of the intermediate arcopallium adjacent to AIV.

529 Excitotoxic lesions of Ad.

530 We targeted the portion of Ad most likely affected by our AIV lesion procedure

531 (n=7 birds) as follows: Bilateral lesions of Ad were carried out stereotaxically by

532 injecting 1% NMA at multiple injection sites. The stereotaxic procedure was similar to

533 that described above for AIV lesions. RA was mapped electrophysiologically to

534 determine the lateral border and center (in the AP direction) of RA. Ad was also mapped

535 electophysiologically by its characteristic bursting patterns, which distinguished it from

536 the overlying arcopallium. Injection sites are listed below: the (ML, AP) coordinates are

537 given in the same notation given above for the AIV lesion. However, the injections were

538 targeted to the center of Ad in the dorso-ventral dimension as determined from

539 electrophysiological mapping of Ad. Injections at each site were made in 3 boluses of

540 13.8nL, which were spaced at 5-minute intervals to avoid backflow along the pipette.

541 Injections were made at the following locations: (+250, -100); (+250,+300); (+650,-100);

542 (+650,+300); (+1000,-100); (+1000,+300). This lesion protocol typically spared the

543 lateral-most ~200um of Ad (see Figure 5A).

544 To determine the effects of complete lesions of Ad on vocal learning, we also

545 carried out (n=3 birds) an Ad lesion protocol with two additional lateral injection sites.

546 Injections were made at the following locations: (+250, -100); (+250,+300); (+650,-100);

547 (+650,+300); (+950,-100); (+950,+300); (+1200,-100); (+1200,+300). These birds

548 exhibited severe immobility after surgery, and failed to eat or drink, requiring early

549 termination of the experiment.

550 Developmental study of song imitation (Figure 4). Male juvenile birds were maintained

551 in the aviary in their home cages with both parents until age 44-45 dph, at which point

552 they were transferred to sound isolation/recording chambers until they began to sing and

553 baseline song was acquired. For experimental birds, bilateral AIV or Ad lesions were

554 carried out as described above. After surgery birds typically began to sing within 5 days

555 (range 2-5 days). Birds were maintained in isolation and recorded continuously until they

556 reached the age of 90dph. Once it was confirmed that post-90dph songs had been

557 recorded, we histologically verified the extent of AIV lesions, as described below.

558 Unmanipulated control birds were isolated at 44-45dph (as for experimental birds), and

559 were maintained in sound isolation/recording cages until they reached the age of 90dph,

560 or until post-90dph song had been acquired. Assignment of birds to unmanipulated

561 control or experimental groups was initially made randomly, but subsequent juveniles

562 from a given breeding cage (with a given tutor) were placed in control or experimental

563 groups to provide sibling matches between these conditions.

564 Histological verification of AIV lesions. AIV lesions were examined histologically at the

565 end of each experiment. Retrograde tracer (fluorescently-labeled cholera toxin  subunit,

566 Molecular Probes) was injected into VTA/SNc and four days later the bird was sacrificed

567 and perfused transcardially with saline and then 4% paraformaldehyde. The brain was

568 removed and post-fixed in 4% paraformaldehyde overnight. The brain was sectioned,

569 stained for NeuN, and imaged under a fluorescence microscope (Zeiss Axioplan). The

570 extent of AIV lesions was analyzed on the basis of loss of NeuN staining and the density

571 of retrogradely labeled neurons in these regions. Quantification was based on the size of

572 the lesion in each parasagittal section extending from the medial edge of RA to 200 um

573 past the lateral edge, and yielded two numbers indicating the fraction of AIV lesioned on

574 the left and right side. The extent of any damage to RA was also monitored and birds

575 with more than 5% RA lesion were excluded from the analysis (n=2). The histological

576 assessment of lesion size was conducted blind to the behavioral effects on song, and the

577 quantification of the behavioral effects was done blind to the histological analysis of

578 lesion size.

579 Testing the functionality of anatomical projections:

580 To test the functional connectivity of projections from L1, CM and HVC-shelf to

581 VTA/SNc-projecting neurons in AIV, we implanted bipolar stimulating electrodes in

582 each of these auditory cortical areas (L1, n=2 birds; CM, n=3 birds; HVC-shelf, n=3

583 birds). A bipolar stimulating electrode was also placed in VTA/SNc to antidromically

584 identify neurons in AIV. Guided by the results of the anterograde tracing experiments

585 (Figure 2 and Figure 2- figure supplement 1), recordings were made in the regions of

586 AIV ventral and anterior to RA using a Carbostar electrode (1MOhm, Kation Scientific).

587 Within this region, single-unit and multi-unit activity was broadly responsive to

588 stimulation in both VTA/SNc and in L1, CM or HVC-shelf. Threshold currents required

589 to elicit spiking responses were between 50 and 200uA (single 0.2-ms unipolar pulses).

590 Location of the stimulating electrodes was verified histologically.

591

592 Analysis of song imitation. We modified the previously published Song Analysis Pro

593 (SAP) algorithm(Tchernichovski et al., 2000) and developed an automated procedure for

594 comparison of pupil songs with the tutor motif (Software is available at Mandelblat-Cerf

595 and Fee, 2014). Following SAP, songs were represented by acoustic features. Similarity

596 between time points in the two songs is based on the Euclidean distance in the feature

597 space between these points. The procedure builds a similarity matrix for all possible pairs

598 of time points between the two song samples.

599 To quantify tutor-pupil similarity the tutor motif was segmented into separate

600 syllables. While syllable segmentation is highly reliable in tutor song, it is less reliable in

601 the more variable pupil song, which was thus segmented into equally sized sections, each

602 twice the duration of the tutor motif. For each tutor syllable our procedure uses the

603 similarity matrix described above to find the sections of the pupil song bout that have the

604 highest similarity and assigns an overall similarity score. For each match between a

605 tutor’s syllable and a pupil’s song section, a sequencing score is evaluated by computing

606 the match of the next syllable in the tutor song with the next section of the pupil song.

607 The algorithm then computes a composite ‘tutor imitation score’ as the product of song

608 similarity and sequence similarity.

609 For each bird we examined song spectrograms and computed the imitation scores

610 for the first day of singing after isolation, and before any further manipulation. Three

611 birds exhibited clear evidence of tutor imitation and were excluded from further

612 experiments.

613 Imitation capacity – To determine the fractional extent to which AIV-lesions reduced the

614 capacity of birds to imitate their tutor song, imitation scores of each bird were placed on a

615 linear scale ranging from the average imitation score of control birds 0.197±0.011 to the

616 average similarity score from all pairwise comparisons of 20 unrelated adult birds in our

617 colony 0.104±0.0027, yielding a total dynamic range of 0.093[=0.197-0.104] ±0.00636.

618 For birds in which we estimate that more than half of AIV was lesioned, the average

619 imitation score was 0.116±0.008. Therefore, we can compute the similarity of AIV-

620 lesioned birds to their tutor songs relative to the song similarity of unrelated birds, as a

621 fraction of the total dynamic range, as 12.9% [=(0.116 - 0.104)/0.093]±9.2%. Thus, AIV-

622 lesioned birds suffered a loss of imitation capacity compared to controls of 87% [=1.0 -

623 0.129]±9%.

624 Quantification of song degradation following deafening and/or AIV lesion (Figure 6).

625 Male juvenile birds were obtained from the aviary at age 65-70 dph and were maintained

626 in sound isolation cages until they began to sing. Upon reaching age 75-80dph, pre-

627 surgery songs were recorded, and birds were randomly placed in three experimental

628 groups: deafening, AIV-lesion, and combined deafening and AIV-lesion. Deafening was

629 achieved by bilateral cochlear extirpation, which was carried out by accessing the cochlea

630 through the posterio-lateral cranium. AIV lesions were carried out as described above.

631 Upon recovery from surgery, birds were maintained in sound isolation cages. Undirected

632 song was recorded continuously for two weeks.

633 To quantify song degradation, song was recorded every day post-surgery and

634 compared to song recorded prior to surgery. Self-similarity scores were computed using

635 the same algorithm used to evaluate tutor imitation.

636 Chronic neural recordings in AIV. Single-unit recordings of VTA/SNc-projecting AIV

637 neurons were obtained from 7 young adult birds (90-120dph) during undirected singing.

638 For antidromic identification of AIV neurons, bipolar stimulating electrodes were placed

639 in VTA/SNc, according to the methodology described above. AIV was localized with

640 reference to the boundaries of RA, as electrophysiologically identified by the

641 characteristic tonic activity of RA neurons. Placement of the chronically-implanted

642 electrode array was further refined by finding the regions in AIV that exhibited strong

643 antidromic responses to brief unipolar square current pulses (0.2ms duration) applied to

644 the stimulating electrodes in VTA/SNc. Peak currents up to 350uA were used to search

645 for antidromically activated neurons: threshold currents for antidromic activation of

646 neurons in AIV ranged from 90uA to 250uA. Recordings in AIV were made using an

647 array of three recording electrodes (200um spacing, Microprobe, #PI20033.0A3, 3M),

648 mounted in a motorized microdrive (n=20 birds implanted). Data were acquired and

649 analyzed using custom Matlab software.

650 Disruptive auditory stimuli consisted of 50ms duration pulses of white noise (95

651 dBSPL) similar to that previously used to drive song plasticity (Andalman and Fee, 2009,

652 Tumer and Brainard, 2007) which were presented with random timing during undirected

653 singing.

654 Recordings during playback of Birds Own Song (BOS)

655 Auditory stimuli: A week before surgery, birds were isolated and undirected song was

656 recorded. The song motif was extracted and a BOS stimulus was constructed from

657 concatenation of 2 motifs. To mimic the auditory input during disruptive auditory

658 feedback during singing, we constructed ‘BOS with noise’ stimulus in which noise bursts

659 were added to a single syllable within each repetition of the motif. For each bird, the

660 noise burst was added at a fixed time near the middle of the motif. The peak loudness of

661 the BOS playback was set to be equal to the average peak loudness of song measured on

662 the head of the singing birds (90 dBSPL) (Andalman and Fee, 2009). Noise burst loudness

663 was the same as for the presentation of disruptive auditory feedback during singing (95

664 dBSPL).

665 Recordings in anesthetized birds: Birds were anesthetized with three intramuscular

666 injections of 25ul each of 20% urethane administered at 20 min intervals. If needed,

667 another injection was given during the recording session. Playback of bird’s own song

668 (BOS) was delivered to the birds using miniature custom-built headphones (E-A-RTONE

669 Mod-3A Audiometric Insert Earphone). Sound was played at 70dBSPL and 90dBSPL,

670 calibrated using a probe microphone (Model ER-7C, Etymotic, Inc). VTA/SNc-

671 projecting AIV neurons were antidromically identified using the protocol described

672 above for chronic recordings.

673 Recordings in awake (non-singing) birds: A motorized microdrive was implanted for

674 recordings in AIV as described above. After the surgery, the bird was maintained in a

675 sound attenuation chamber. VTA/SNc-projecting AIV neurons were recorded during

676 playback of BOS and BOS with noise on alternating presentations. Sound level for BOS

677 playback was (90 dBSPL) and was calibrated using a Bruel&Kajer Model 2236 sound

678 level meter placed at approximately the center of the bird cage. Histological verification

679 of electrode positions was carried out after the recordings. Data were collected from 4

680 birds.

681 Analysis of neuronal responses. The significance of single neuron and multiunit

682 responses to noise bursts were analyzed in two different ways: The first was a

683 comparison of the spike counts in a window 150ms before versus 150ms after noise onset

684 (paired t-test). The second was a test for a significant peak in the peri-stimulus time

685 histogram (PSTH) using a bootstrap analysis: over a 2 sec window (1 sec before stimulus

686 onset to 1 sec after stimulus onset) spikes were shifted circularly, where the extent of the

687 shifts were drawn from a uniform distribution of ±1 second. Given the resulting PSTH,

688 the peak within 1 second after stimulus was noted. This random shuffling was repeated

689 1000 times to obtain the distribution of peaks in the shuffled data sets. Neurons that

690 exhibited a real PSTH peak that exceeded the 95th percentile of the shuffled data were

691 considered significant. The significance of song-locked firing rate modulations of AIV

692 neurons during singing was analyzed similarly. The highest observed peak in the motif-

693 aligned PSTH was compared to the distribution of PSTH peaks obtained from surrogate

694 data sets in which spike times were circularly within the motif.

695 The significance of the averaged response to noise bursts of all projecting neurons: Over

696 a 2 sec window (1 sec before noise onset to 1 sec after noise onset) the averaged response

697 of each neuron was shifted circularly, where the extent of the shifts were drawn from a

698 uniform distribution of ±1 second. Given the resulting averaged response across all

699 neurons, two values were noted: The peak in the firing rate and the maximal spike count

700 over 150ms. This random shuffling was repeated 1000 times to obtain the distributions in

701 the shuffled data sets. The peak of the measured averaged response within 150ms from

702 noise onset, and the spike count over this period, were tested against the distribution of

703 peaks and spike counts. The indicated p-value is the higher of these two tests.

704 Response latency and duration: Activity around stimulus onset was binned in 2ms bins.

705 Each bin after stimulus onset was tested against the distribution of the activity prior to the

706 stimulus, using a z-test. Response onset (latency) was detected by the first bin for which

707 the following 5 consecutive bins (10ms) were significantly different than the activity

708 prior to the stimulus (z-test, all with p<0.05). To assess response duration we searched for

709 the first bin after response onset for which, the following 10 consecutive bins (20ms)

710 were insignificantly different than the pre-stimulus activity (z-test, all with p>0.05).

711 Stereotypy of identified syllables. To evaluate syllable stereotypy, independent of

712 sequence stereotypy, we used the peak of the spectral cross-correlation (Aronov et al.,

713 2008), on a syllable-by syllable basis. A database of song syllables was constructed for

714 each bird (n=500-600 syllables). The spectrogram of each syllable was computed using

715 the multi-taper method (k=2 tapers, bandwidth parameter p=1.5, 10ms window, 1ms step

716 size). 100 syllables were randomly selected (without replacement) from the dataset. For

717 each selected syllable, a sample of 100 syllables was drawn randomly from the dataset.

718 The similarity of the selected syllable to each of the sampled syllables was evaluated

719 following the spectral correlation method: First, the spectrogram correlation matrix was

720 computed from the correlation of power spectra (860 Hz and 8.6 kHz) between each 1ms

721 time slice in the selected syllable and each 1ms time slice in the sampled syllable.

722 Second, a lag correlation function was computed as the sum along the diagonals of the

723 correlation matrix. Third, the maximum of the lag correlation function was determined.

724 The values resulting from all comparisons to the selected syllable formed a distribution.

725 The value of the 95 percentile of this distribution represents the stereotypy of the selected

726 syllables. For each bird, the syllable stereotypy metric (Figure 5D) was computed as the

727 average over all 100 selected syllables.

728

729 Acknowledgements

730 We thank Jesse Goldberg for helpful suggestions. We also thank Daniel Rubin,

731 Anton Konovchenko, and Tim Currier for technical assistance. Funding for this work was

732 provided by the National Institutes of Health (R01 MH067105) and McGovern Institute

733 internal funding.

734

735 Competing interests: None

736 737 Author Contribution

738 All authors contributed to all phases of this work: YMC, ND, LL, and MSF conceived

739 and performed the experiments, analyzed the data and wrote the manuscript.

740

741 References

742 743 1 Wolpert, D. M., Ghahramani, Z. & Jordan, M. I. 1995. An Internal Model for 744 Sensorimotor Integration. Science 269, 1880-1882, doi:10.1126/science.7569931 745 2 Schultz, W., Dayan, P. & Montague, P. R. 1997. A neural substrate of prediction 746 and reward. Science 275, 1593-1599, doi:10.1126/science.275.5306.1593 747 3 Hikosaka, O. 2007. Basal ganglia mechanisms of reward-oriented eye movement. 748 Ann N Y Acad Sci 1104, 229-249, doi:10.1196/annals.1390.012 749 4 Doupe, A. J. & Kuhl, P. K. 1999. Birdsong and human speech: Common themes 750 and mechanisms. Annu Rev Neurosci 22, 567-631, 751 doi:10.1146/annurev.neuro.22.1.567 752 5 Konishi, M. 1965. The role of auditory feedback in the control of vocalization in 753 the white-crowned sparrow. Z. Tierpsychol 22, 770-793, doi:10.1111/j.1439- 754 0310.1965.tb01688.x 755 6 Scharff, C. & Nottebohm, F. 1991. A comparative study of the behavioral deficits 756 following lesions of various parts of the zebra finch song system: implications for 757 vocal learning. J Neurosci 11, 2896-2913, 758 7 Nottebohm, F., Stokes, T. M. & Leonard, C. M. 1976. Central control of song in 759 the canary, Serinus canarius. J. Comp. Neurol. 165, 457-486, 760 8 Bottjer, S. W., Miesner, E. A. & Arnold, A. P. 1984. Forebrain lesions disrupt 761 development but not maintenance of song in passerine birds. Science 224, 901- 762 903, 763 9 Brainard, M. S. & Doupe, A. J. 2000. Interruption of a basal ganglia-forebrain 764 circuit prevents plasticity of learned vocalizations. Nature 404, 762-766, 765 doi:10.1038/35008083 766 10 Andalman, A. S. & Fee, M. S. 2009. A basal ganglia-forebrain circuit in the 767 songbird biases motor output to avoid vocal errors. Proc Natl Acad Sci U S A 106, 768 12518-12523, doi:10.1073/pnas.0903214106 769 11 Warren, T., Tumer, E., Charlseworth, J. & Brainard, M. 2011. Mechanisms and 770 time course of vocal learning and consolidation in the adult songbird. J 771 neurophysiol 106, 1806-1821, doi:10.1152/jn.00311.2011 772 12 Person, A., Gale, S., Farries, M. & Perkel, D. 2008. Organization of the songbird 773 basal ganglia, including area X. J Comp Neurol. 508, 840-866, 774 doi:10.1002/cne.21699 775 13 Prather, J. F., Peters, S., Nowicki, S. & Mooney, R. 2008. Precise auditory-vocal 776 mirroring in neurons for learned vocal communication. Nature 451, 305-310, 777 doi:10.1038/nature06492 778 14 Troyer, T. W. & Doupe, A. J. 2000. An associational model of birdsong 779 sensorimotor learning I. Efference copy and the learning of song syllables. J. 780 Neurophysiol. 84, 1204-1223, 781 15 Roberts, T., Gobes, S., Murugan, M., Olveczky, B. & Mooney, R. 2012. Motor 782 circuits are required to encode a sensory model for imitative learning. Nat 783 Neurosci 15, 1454-1459, doi:10.1038/nn.3206

784 16 Mooney, R. 2006. Synaptic Mechanisms for Auditory-Vocal Integration and the 785 Correction of Vocal Errors. Ann N Y Acad Sci 1016, 476-494, 786 doi:10.1196/annals.1298.011 787 17 Kozhevnikov, A. A. & Fee, M. S. 2007. Singing-related activity of identified 788 HVC neurons in the zebra finch. J. Neurophysiol. 97, 4271-4283, 789 18 Hessler, N. & Doupe, A. 1999. Singing-Related Neural Activity in a Dorsal 790 Forebrain–Basal Ganglia Circuit of Adult Zebra Finches. J Neurosci 19, 10461- 791 10481, 792 19 Hamaguchi, K., Tschida, K. A., Yoon, I., Donald, B. R. & Mooney, R. 2014. 793 Auditory synapses to song premotor neurons are gated off during vocalization in 794 zebra finches. Elife 18, 01833, 795 20 Tsai, H. C., Zhang, F., Adamantidis, A., Stuber, G. D., Bonci, A., de Lecea, L. et 796 al. 2009. Phasic firing in dopaminergic neurons is sufficient for behavioral 797 conditioning. Science 324, 1080-1084, doi:10.1126/science.1168878 798 21 Bayer, H. M. & Glimcher, P. W. 2005. Midbrain dopamine neurons encode a 799 quantitative reward prediction error signal. Neuron 47, 129-141, 800 doi:10.1016/j.neuron.2005.05.020 801 22 Houk, J. C., Adams, J. L. & Barto, A. G. in Models of Information Processing in 802 the Basal Ganglia 249-270 (MIT press, 1994). 803 23 Gale, S. D., Person, A. L. & Perkel, D. J. 2008. A novel basal ganglia pathway 804 forms a loop linking a vocal learning circuit with its dopaminergic input. J Comp 805 Neurol. 508, 824-839, doi:10.1002/cne.21700 806 24 Fee, M. S. & Goldberg, J. H. 2011. A hypothesis for basal ganglia-dependent 807 reinforcement learning in the songbird. 198, 152-170, 808 doi:10.1016/j.neuroscience.2011.09.069 809 25 Hara, E., Kubikova, L., Hessler, N. A. & Jarvis, E. D. 2007. Role of the midbrain 810 dopaminergic system in modulation of vocal brain activation by social context. 811 Eur J Neuroscience 25, 3406-3416, doi:10.1111/j.1460-9568.2007.05600.x 812 26 Gale, S. D. & Perkel, D. J. 2010. A basal ganglia pathway drives selective 813 auditory responses in songbird dopaminergic neurons via disinhibition. J Neurosci 814 30, 1027-1037, doi:10.1523/JNEUROSCI.3585-09.2010 815 27 Dugas-Ford, J., Rowell, J. J. & Ragsdale, C. W. 2012. Cell-type homologies and 816 the origins of the neocortex. Proc Natl Acad Sci U S A, 817 doi:10.1073/pnas.1204773109 818 28 Karten, H. J. 2013. Neocortical Evolution: Neuronal Circuits Arise Independently 819 of Lamination. Current Biology 23, R12-R15, doi:10.1016/j.cub.2012.11.013 820 29 Bottjer, S. W. & Altenau, B. 2010. Parallel pathways for vocal learning in basal 821 ganglia of songbirds. Nature Neuroscience 13, 153-155, doi:10.1038/nn.2472 822 30 Feenders, G., Liedvogel, M., Rivas, M., Zapka, M., Horita, H., Hara, E. et al. 823 2008. Molecular Mapping of Movement-Associated Areas in the Avian Brain: A 824 Motor Theory for Vocal Learning Origin. Plos One 3, e1768, 825 doi:10.1371/journal.pone.0001768 826 31 Jarvis, E. D., Yu, J., Rivas, M. V., Horita, H., Feenders, G., Whitney, O. et al. 827 2013. Global view of the functional molecular organization of the avian 828 : mirror images and functional columns. J comp Neurol 521, 23462, 829 doi:10.1002/cne.23404

830 32 Vates, G. E., Broome, B. M., Mello, C. V. & Nottebohm, F. 1996. Auditory 831 pathways of caudal telencephalon and their relation to the song system of adult 832 male zebra finches. J Comp Neurol 366, 613-642, doi:10.1002/(SICI)1096- 833 9861(19960318)366:4<613::AID-CNE5>3.0.CO;2-7 834 33 Mello, C. V., Vates, G. E., Okuhata, S. & Nottebohm, F. 1998. Descending 835 auditory pathways in the adult male zebra finch (Taeniopygia guttata). J Comp 836 Neurol 395, 137-160, doi:10.1002/(SICI)1096-9861(19980601)395:2<137::AID- 837 CNE1>3.0.CO;2-3 838 34 Kelley, D. & Nottebohm, F. 1979. Projections of a Telencephalic Auditory 839 Nucleus-Field L-in the Canary. J Comp Neurol 183, 455-469, 840 doi:10.1002/cne.901830302 841 35 Tchernichovski, O., Nottebohm, F., Ho, C. E., Pesaran, B. & Mitra, P. P. 2000. A 842 procedure for an automated measurement of song similarity. Anim Bahav 59, 843 1167-1176, doi:10.1006/anbe.1999.1416 844 36 Aronov, D., Andalman, A. S. & Fee, M. S. 2008. A specialized forebrain circuit 845 for vocal babbling in the juvenile songbird. Science 320, 630-634, 846 doi:10.1126/science.1155140 847 37 Lombardino, A. J. & Nottebohm, F. 2000. Age at deafening affects the stability of 848 learned song in adult male zebra finches. J Neurosci 20, 5054-5064, 849 38 Nordeen, K. W. & Nordeen, E. J. 1992. Auditory feedback is necessary for the 850 maintenance of stereotyped song in adult zebra finches. Behav Neural Biol 57, 58- 851 66, doi:10.1016/0163-1047(92)90757-U 852 39 Ölveczky, B. P., Andalman, A. S. & Fee, M. S. 2005. Vocal Experimentation in 853 the Juvenile Songbird Requires a Basal Ganglia Circuit. PLoS Biol 3, e153, 854 doi:10.1371/journal.pbio.0030153 855 40 Kao, M. H., Doupe, A. J. & Brainard, M. S. 2005. Contributions of an avian basal 856 ganglia-forebrain circuit to real-time modulation of song. Nature 433, 638-643, 857 41 Nordeen, K. & Nordeen, E. 2010. Deafening-Induced Vocal Deterioration in 858 Adult Songbirds Is Reversed by Disrupting a Basal Ganglia-Forebrain Circuit. J 859 Neurosci 30, 7392-7400, doi:10.1523/JNEUROSCI.6181-09.2010 860 42 Leonardo, A. & Fee, M. S. 2005. Ensemble coding of vocal control in birdsong. J 861 Neurosci 25, 652-661, doi:10.1523/JNEUROSCI.3036-04.2005 862 43 Sen, K., Theunissen, F. E. & Doupe, A. J. 2001. Feature Analysis of Natural 863 Sounds in the Songbird Auditory Forebrain. J Neurophysiol 86, 1445-1458, 864 44 Tchernichovski, O., Mitra, P. P., Lints, T. & Nottebohm, F. 2001. Dynamics of 865 the vocal imitation process: how a zebra finch learns its song. Science 291, 2564- 866 2569, doi:10.1126/science.1058522 867 45 Tumer, E. C. & Brainard, M. S. 2007. Performance variability enables adaptive 868 plasticity of 'crystallized' adult birdsong. Nature 450, 1240-1244, 869 doi:10.1038/nature06390 870 46 Bottjer, S., Brady, J. & Cribbs, B. 2000. Connections of a motor cortical region in 871 zebra finches: relation to pathways for vocal learning. J Comp Neurol 420, 244- 872 260, 873 47 Achiro, J. M. & Bottjer, S. W. 2013. Neural representation of a target auditory 874 memory in a cortico-basal ganglia pathway. J Neurosci 33, 14475-14488,

875 48 Goldberg, J. H. & Fee, M. S. 2011. Vocal babbling in songbirds requires the basal 876 ganglia-recipient motor thalamus but not the basal ganglia. J Neurophys 105, 877 2729-2739, doi:10.1152/jn.00823.2010 878 49 Kojima, S., Kao, M. H. & Doupe, A. J. 2013. Task-related “cortical” bursting 879 depends critically on basal ganglia input and is linked to vocal plasticity. Proc 880 Natl Acad Sci U S A 110, 4756-4761, doi:10.1073/pnas.1216308110 881 50 Leblois, A., Wendel, B. J. & Perkel, D. J. 2010. Striatal dopamine modulates 882 basal ganglia output and regulates social context-dependent behavioral variability 883 through D1 receptors. J Neurosci 30, 5730-5743, doi:10.1523/JNEUROSCI.5974- 884 09.2010 885 51 Yanagihara, S. & Hessler, N. A. 2006. Modulation of singing-related activity in 886 the songbird ventral tegmental area by social context. Eur J Neurosci 24, 3619- 887 3627, doi:10.1111/j.1460-9568.2006.05228.x 888 52 Kreitzer, A. C. & Malenka, R. C. 2008. Striatal Plasticity and Basal Ganglia 889 Circuit Function. Neuron 60, 543-554, doi:10.1016/j.neuron.2008.11.005 890 53 Kubikova, L., Wada, K. & Jarvis, E. D. 2010. Dopamine receptors in a songbird 891 brain. J comp Neurol 518, 741-769, doi:10.1002/cne.22255 892 54 Ding, L. & Perkel, D. J. 2002. Dopamine modulates excitability of spiny neurons 893 in the avian basal ganglia. J Neurosci 22, 5210-5218, 894 55 Kroner, S. & Gunturkun, O. 1999. Afferent and Efferent Connections of the 895 Caudolateral Neostriatum in the Pigeon (Columba livia): A Retro- and 896 Anterograde Pathway Tracing Study. J Comp Neurol 407, 22-260, 897 doi:10.1002/(SICI)1096-9861(19990503)407:2<228::AID-CNE6>3.0.CO;2-2 898 56 Starosta, S., Gunturkun, O. & Stutttgen, M. 2013. Stimulus-response-outcome 899 coding in the pigeon nidopallium caudolaterale. Plos One 8, e57407, 900 doi:10.1371/journal.pone.0057407 901 57 Murase, S., Grenhoff, J, Chouvet, G, Gonon, FG, Svensson, TH. 1993. Prefrontal 902 cortex regulates burst firing and transmitter release in rat mesolimbic dopamine 903 neurons studied in vivo. Neurosci letters 157, 53-56, doi:10.1016/0304- 904 3940(93)90641-W 905 58 Wild, J. M., Karten, H. J. & Frost, B. J. 1993. Connections of the auditory 906 forebrain in the pigeon (Columba livia). J Comp Neurol 337, 32-62, 907 doi:10.1002/cne.903370103 908 59 Bauer, E., Coleman, M., Roberts, T., Roy, A., Prather, J. & Mooney, R. 2008. A 909 Synaptic Basis for Auditory–Vocal Integration in the Songbird. J Neurosci 28, 910 1509-1522, doi:10.1523/JNEUROSCI.3838-07.2008 911 60 Akutagawa, E. & Konishi, M. 2010. New brain pathways found in the vocal 912 control system of a songbird. J comp Neurol 518, 3086-3100, 913 doi:10.1002/cne.22383 914 61 Keller, G. B. & Hahnloser, R. H. 2009. Neural processing of auditory feedback 915 during vocal practice in a songbird. Nature 457, 187-190, 916 doi:10.1038/nature07467 917 62 Sutton, R. S. & Barto, A. G. 1998. Reinforcement learning: an introduction. IEEE 918 trans Neural Netw 9, 1054, 919 63 Redondo, R. & Morris, R. 2011. Making memories last: the synaptic tagging and 920 capture hypothesis. Nat Rev Neurosci 12, 17-30, doi:10.1038/nrn2963

921 64 Charlesworth, J. D., Warren, T. & Brainard, M. 2012. Covert skill learning in a 922 cortical-basal ganglia circuit. Nature 486, 251-255, doi:10.1038/nature11078 923 65 Mandelblat-Cerf, Y. & Fee, M. S. 2014. An Automated Procedure for Evaluating 924 Song Imitation. Plos One, doi:10.1371/journal.pone.0096484 925 66 Aronov, D., Andalman, A. & MS, F. 2008. A specialized forebrain circuit for 926 vocal babbling in the juvenile songbird. . Science 320, 630-634, 927 928

929

930

931 Figure 1 – Characterization of avian ‘cortical’ areas projecting to the dopaminergic 932 midbrain. A) Left panel: Schematic of the songbird brain (in sagittal view) showing 933 classical song control brain areas. Nuclei HVC and RA form the descending song motor 934 pathway, necessary for song production. Nucleus RA is in the arcopallium, a cortical-like 935 region homologous to layer-5 neurons of the mammalian cortex. Also shown is the 936 anterior forebrain pathway (AFP), a circuit necessary for vocal learning but not vocal 937 production. The AFP consists of a basal ganglia homologue Area X, thalamic nucleus 938 DLM, and cortical-like nucleus LMAN, which projects to RA. Right panel: Schematic 939 showing a set of pathways that are the focus of this paper, including a previously- 940 described projection to Area X from neurons in the dopaminergic midbrain nuclei VTA 941 and SNc (Gale et al., 2008). Also shown is AIV, a part of the intermediate arcopallium 942 found to project to VTA and SNc. Projections to AIV from auditory cortical areas L1, 943 CM and HVC-shelf are elucidated here. B) Sagittal section showing the injection site of a 944 retrograde tracer (CTB, white color, arrow) within VTA/SNc (TH-stained neurons, red). 945 C) Three sagittal sections through the arcopallium showing retrogradely–labeled neurons 946 in AIV (fluorescence, white) and relation to RA (dark field image, purple). Panel at left is 947 most lateral; sections 200um apart. D) Upper panel: image of injection site of a GFP- 948 expressing virus (HSV, fluorescence white) in the anterior part of AIV, showing relation 949 to RA (dark field image). Bottom panel: sagittal section of VTA/SNc showing 950 anterogradely-labeled fibers from AIV (green) and TH-stained neurons (red). E) Same as 951 D, with an injection site in AIV ventral to RA. All scale bars 200µm unless indicated 952 otherwise. In panels A-E, anterior is left; dorsal is up. F) Coronal section showing 953 in RA and Ad (green), anterogradely-labeled from LMAN and LMAN-shell, respectively 954 (Dextran 10MW Alexa 488). AIV neurons (white) labeled by injection of a retrograde 955 tracer (CTB) in VTA and SNc (medial, left; dorsal, up). 956 957

958 Figure 2 - Cortical inputs to ventral intermediate arcopallium (AIV): Retrograde 959 and anterograde tracing. A) Injection of a retrograde tracer (CTB) anterior to RA in the 960 vicinity of AIV. B) Retrograde label in cortical auditory areas L1 and CM resulting from 961 the injection shown above. C) Labeled neurons in HVC-shelf of the same bird. D-F) 962 Same as panels A-C. Injection of CTB was targeted to the anterior ‘stripe’ part of AIV. 963 G) Injection of CTB into the posterior part of AIV, directly ventral to RA. H) 964 Retrogradely-labeled neurons in caudal nidopallium (NC) resulting from the injection 965 shown above. I-L) Anterograde tracing from auditory cortical areas L1, CM and HVC- 966 shelf. I) Upper panel – injection of a GFP-expressing virus (HSV, green) into auditory 967 field L1. Middle panel shows labeled axons in the vicinity of AIV neurons retrogradely- 968 labeled (purple) from VTA/SNc. Color in RA is due to auto fluorescence, not label. 969 Bottom panel shows higher magnification view from image above. J) Same as (I), but the 970 injection of HSV-GFP was made into the auditory caudal mesopallium (CM). K) Same as 971 (I), but the injection of HSV-GFP was made into HVC-shelf. Scale bars: 100µm for 972 lower panels of I-L; 200µm for all other panels. 973 974 Figure 3 - Electrophysiological verification of functional connectivity. Electrical 975 stimulation in auditory cortical areas L1, CM and HVC-shelf drives spiking in 976 VTA/SNC-projecting neurons in AIV. A) Schematic at left illustrates the location of the 977 stimulating electrodes (red) in L1 and the location of the recording electrode (blue) in 978 AIV. Middle traces: responses of 3 antidromically-identified AIV neurons to L1 979 stimulation (overlayed responses from ten trials). Right traces: antidromic response of the 980 same neurons to electrical stimulation in VTA/SNc. B-C) The panels are analogous to 981 those shown in (A), except the stimulation electrode is placed in caudal mesopallium 982 (CM) or in the posterior part of HVC-shelf. 983 984 Figure 4 – Bilateral lesions of AIV impair tutor imitation. A) Excitotoxic lesion 985 targeted to AIV by injection of NMA. Lesion borders indicated by arrowheads (sagittal 986 section; anterior left, dorsal up). B) Schematic timeline of experimental protocol. C) 987 Song spectrograms of an adult bird (90dph) that underwent bilateral lesion of AIV as a 988 juvenile, compared to an unlesioned control sibling. Top: spectrogram of tutor song.

989 Middle: two example song spectrograms of the lesioned bird (45% lesion, 0.205 imitation 990 score). Bottom: two example song spectrograms of the control sibling (0.235 imitation 991 score). D) Same as panel C, but for a different pair of birds (lesioned bird, 66% lesion, 992 0.1566 imitation score; control sibling, 0.27 imitation score). E-F) Song spectrograms of 993 two additional adult birds that underwent lesions of AIV as juveniles. These birds did not 994 have control siblings. (Bird in panel E, 61% lesion, 0.097 imitation score; bird in panel F, 995 77% lesion, 0.114 imitation score.) 996 997 998 Figure 5 –AIV lesions, but not Ad lesions, impair tutor imitation. A) Left: coronal 999 section showing the relation between AIV neurons (retrogradely labeled from VTA/SNc, 1000 blue) and nucleus Ad (anterograde labeling from LMAN-shell, green; medial, left; dorsal, 1001 up) Right: coronal section showing excitotoxic lesion of Ad, revealed by loss of NeuN 1002 staining (white, note that the lesion affected some of the overlying nidopallium). B) 1003 Tutor imitation score plotted as a function of the extent of AIV lesion for each bird 1004 (hollow circles, n=17 AIV-lesioned birds). Ad lesioned birds (n=7) are shown as 0% 1005 lesion (hollow grey diamonds) since Ad lesions had minimal impact on AIV. No 1006 significant impairment of song-imitation was observed in Ad lesioned birds as compared 1007 to unlesioned controls (blue diamonds, n=19 birds). Solid line denotes least square fit to 1008 Ad-lesioned and AIV lesioned data points. Red dashed horizontal line indicates mean of 1009 all similarity comparisons between 20 unrelated adult birds (red shaded area indicates 1010 s.e.m.) C) Boxplot of the distribution of imitation scores of all control birds (cyan, 1011 unlesioned and Ad-lesioned controls) and birds with large AIV-lesions (black, >50% 1012 lesion). Also shown is a boxplot of the distribution of similarity scores of all pairwise 1013 comparisons between 20 unrelated adult birds (red). Whiskers denote 10-90 percentile. 1014 Asterisk in each boxplot denotes the mean, heavy line denotes median. D) Distribution of 1015 syllable self-similarity in AIV-lesioned birds (top) and in control birds (bottom, 1016 unlesioned and Ad-lesioned controls combined). Dashed lines denote mean of 1017 distributions. 1018 1019

1020 Figure 6 – Effect of AIV lesion on adult song and on song degradation after 1021 deafening. A) Examples of song spectrograms from a bird deafened at age 80 dph. 1022 Shown from top to bottom; song before surgery (deafening), first song post-surgery, song 1023 one week post-surgery, and song two weeks post-surgery. Note rapid degradation of song 1024 structure within one week after deafening. B) Examples of song spectrograms from a 1025 bird that underwent complete bilateral lesion of AIV at age 80 dph. Note the lack of song 1026 degradation. C) Song spectrograms from a bird that underwent both bilateral lesion of 1027 AIV and deafening at age 80 dph. D) Plot of song self-similarity, normalized to the 1028 average self-similarity of pre-surgery song. Note that lesioned birds exhibited a reduced 1029 rate of song degradation after deafening, compared to deafening alone, one week and two 1030 weeks post-surgery (t-test, p<0.01). 1031 1032

1033 Figure 7 – AIV neurons projecting to VTA/SNc exhibit error-related auditory 1034 responses during singing. A) Schematic diagram of recording setup showing 1035 stimulating electrode placed in VTA/SNc for antidromic identification of AIV neurons. 1036 B) Voltage traces showing an antidromically-evoked spike of a VTA/SNc-projecting AIV 1037 neuron (red spikes). Collision test shown in black traces. C) Distribution of firing rates of 1038 AIV neurons during singing. D) Firing rates of AIV neurons during singing vs. non- 1039 singing (antidromically-identified neurons, hollow circles; non-identified AIV neurons,

1040 filled circles). E) Recording of an antidromically-identified AIV neuron during singing 1041 (song spectrogram, top; extracellular voltage trace, bottom) with presentation of noise 1042 bursts (red arrows). F) Motif-aligned spike raster plot of the AIV neuron from panel E 1043 during singing with no noise bursts presented. Each row in the raster plot corresponds to 1044 a rendition of the song motif. Each dot corresponds to a spike. Along each row, grey 1045 areas denote syllables and white areas denote silent gaps. Two vertical red lines denote 1046 motif onset and offset. Shaded area along the histogram denotes SEM. Note lack of 1047 singing-related firing rate modulations. G) Spike raster plot and spike histogram aligned 1048 to noise bursts during singing. Red vertical line denotes noise onset. Other notations are 1049 as in F. Note the brief,response to the noise burst during singing. H-J) Another example 1050 of an antidromically-identified AIV neuron (larger spike waveform) recorded during 1051 singing. I) Motif-aligned raster plot of the AIV neuron from panel H. J) Spike raster plot 1052 and histogram aligned to noise bursts during singing. Rasters are sorted by the duration 1053 of the syllable in which the noise burst occurred. Note that the response to noise burst 1054 occurs during different syllable types. K) Average peri-stimulus histogram for all 1055 antidromically-identified AIV neurons, aligned to noise burst onset during singing. L) 1056 Distribution of response latency after noise burst (left) and response duration (right), for 1057 AIV neurons that showed a significant response to noise bursts. 1058

1059 Figure 8 – Response of AIV neurons to noise bursts during non-singing. A) Activity

1060 of an antidromically-identified AIV neuron recorded during presentation of noise bursts 1061 during playback of birds own song (BOS). Top to bottom panels: song spectrogram and 1062 simultaneous recording of neuronal activity; raster plot and spike histogram aligned to 1063 BOS onset (black line). Time of noise bursts indicated by vertical red lines. Yellow band 1064 denotes period of BOS playback. Note lack of response to noise bursts. B) Response of 1065 the same neuron in panel A to presentation of isolated noise bursts (no BOS, no singing). 1066 Note slow timecourse of the response. C,D) Recording of another antidromically-

1067 identified AIV neuron (notation same as in panels A,B). E) Averaged PSTH for all

1068 VTA/SNc-projecting AIV neurons that exhibited a significant response to isolated noise 1069 bursts within one second after noise onset. Average response of these same neurons to 1070 noise bursts presented during BOS playback (blue). In comparison, averaged response is

1071 shown for all VTA/SNc-projecting AIV neurons that responded significantly to noise 1072 burst during singing (black trace).

1073 Figure 1-figure supplement 1 – Retrograde labeling of AIV neurons from VTA and 1074 SNc. A) Site of CTB injection (white, 13.8 nL total injected volume) into VTA (tyrosine 1075 hydroxylase immunostain, red). B) Serial sagittal sections through the arcopallium 1076 showing retrogradely labeled neurons (white) in the vicinity of RA. (Most lateral image is 1077 #1, 100um between slices). C-D) Site of CTB injection into SNc and retrograde label 1078 (white) in the arcopallium. E-H) In each pair, the image at left shows the injection site in 1079 VTA/SNc. The image at right shows a single sagittal section through the arcopallium 1080 near the lateral third of RA (corresponding to sections 2 or 3 in panels B/D). We refer 1081 collectively to the regions containing neurons retrogradely-labeled from the 1082 dopaminergic midbrain (VTA/SNc) as the ventral intermediate arcopallium (AIV). 1083 Scale bars for all images are 200µm. 1084 1085

1086 Figure 2-figure supplement 1 - Cortical inputs to ventral intermediate arcopallium 1087 (AIV): A) Injection site of CTB in anterior AIV (see Figure 1 – figure supplement 1B,D 1088 panels 2-4). B) Retrogradely-labeled cell bodies in L1 and CM resulting from the 1089 injection shown in panel A. Retrograde label was also observed in L3 following 1090 injections of tracer anterior to RA. But the presence of labeled neurons in L3 was less 1091 reliable than in L1 and CM. Further studies will be required to resolve the connectivity 1092 between L3 and AIV. C) Retrogradely-labeled cell bodies in posterior HVC-shelf 1093 resulting from the injection shown in panel A. D-H) Upper panels – Injection of a GFP- 1094 expressing virus (HSV, green) into L1(D), CM (E), HVC-shelf (F) and NC (G-H). 1095 Middle panels show labeled fibers arborizing in AIV in the vicinity of neurons 1096 retrogradely-labeled from VTA/SNc (purple). Bottom panels show a higher- 1097 magnification view of the sections shown in the middle panels. Scale bars: 100µm for the 1098 bottom images in panels D-I; 200µm for all other images. 1099 1100

1101 Figure 4-figure supplement 1 – Effect of AIV lesions on song motor production and 1102 imitation. A) AIV-lesioned birds showed significantly reduced tutor imitation score 1103 compared to unlesioned controls. Lines connect data from siblings. B) Average 1104 distribution of song spectral features (FM, Weiner entropy, pitch goodness and pitch) of 1105 AIV lesioned birds (black) and unlesioned controls (blue) at 90dph. C-D) AIV lesions 1106 produce no immediate effect on juvenile songs. C) Song example of a juvenile bird just 1107 prior to AIV-lesion (50 dph; top two spectrograms) and in the first day of singing after 1108 AIV lesion (53 dph; bottom two spectrograms). This bird had a 55% AIV-lesion and 1109 ultimately exhibited a severe deficit in imitation (tutor imitation score 0.094 at 90 dph). 1110 D) Distributions of several acoustic features prior the AIV lesion (blue) and after AIV 1111 lesion (red), for the bird shown in panel C. Numbers in each subplot show the cross 1112 correlation between the feature distributions before and after lesion. Across all AIV- 1113 lesioned birds (n=17 birds) the correlation coefficients between pre- and post-lesion 1114 distributions were 0.9837±0.03, 0.944±0.06, 0.988±0.13 and 0.96±0.03 for FM, Weiner 1115 entropy, pitch goodness and pitch, respectively. 1116 1117 Figure 5-figure supplement 1 – Effect of AIV lesions on song imitation, development 1118 of song stereotypy, and rate of singing. The tutor imitation scores shown in main Figure 1119 5B are based on a product of the acoustic similarity score the sequence similarity score 1120 (Mandelblat-Cerf and Fee, 2014). Here the contribution of acoustic and sequence 1121 similarity are shown separately in panels A and B respectively. C) Plot of maturity index 1122 (a measure of song stereotypy; see methods note below) as a function of AIV lesion size. 1123 Larger AIV lesions in juvenile birds resulted in lower stereotypy in the final adult song. 1124 In panels A-C, dashed line denotes averaged scores of each metric for comparisons 1125 between unrelated tutors. Solid lines denote linear regression using least square. R-values 1126 are 0.69, 0.46 and 0.50 respectively. F-statistic p-values are noted. The song maturity 1127 index of AIV-lesioned birds is significantly smaller than that of control birds (unlesioned 1128 and Ad-lesioned birds;Wilcoxon rank-sum test, p=0.01 and p=0.025, respectively). 1129 1130 D-E) To test whether the deficits in learning (tutor imitation scores) after AIV lesion are 1131 due to a reduced rate of singing (i.e. less practice), we quantified the amount of singing in

1132 AIV-lesioned birds and Ad-lesioned controls. D) Total seconds of singing per day for 1133 Ad- lesioned birds (average of 7 birds, red) and AIV-lesioned birds (average of 17 birds, 1134 black). There was no significant difference in the accumulated amount of singing (t-test, 1135 p=0.52), nor in the daily singing rate, between these populations, as tested every 5 days 1136 (t-tests, p-values of all comparisons, 0.8>p>0.22). E) Tutor imitation scores are not 1137 correlated with the amount of singing both for AIV-lesioned (upper panel, black), Ad- 1138 lesioned (upper panel, red) and unlesioned control birds (bottom panel, blue) (F-statistics 1139 all comparisons, p>0.4). 1140 1141 Note on Methods: Our song similarity algorithm compares many renditions of the pupil 1142 song with the tutor motif. This comparison results in an acoustic similarity score and a 1143 sequence similarity score. Acoustic similarity estimates the similarity of each of the 1144 tutor’s syllables to the best match in the pupil’s song. The sequence similarity score 1145 represents the acoustic similarity of song segments preceding and following these closest- 1146 match syllables. The overall tutor imitation score is the product of the acoustic and 1147 sequence similarity scores. Song stereotypy (panel C) was quantified by measuring the 1148 peak of the spectrogram cross-correlation between different bouts of the birds song with 1149 itself (Aronov et al., 2008). This measure was shown to gradually increase through song 1150 development, and therefore we refer to it as the maturity index. 1151 1152

1153 Figure 6-figure supplement 1 – Immediate effects on song of excitotoxic and 1154 electrolytic lesions in AIV. A) Excitotoxic lesions in AIV do not cause a decrease in song 1155 variability, as would be expected if the lesions impacted LMAN axons entering RA. 1156 Example song spectrograms prior to AIV lesion surgery (top) and the first day of singing 1157 after the surgery (bottom). Data are from a bird that received bilateral lesion of AIV at 1158 age 77dph. B) Quantification of song variability before and after AIV lesion in older 1159 juvenile birds (75-80dph). Data are from the same birds shown in the main Figure 6. 1160 Song variability was assessed by computing the variance of song self-similarity scores. 1161 1162 C-F) Partial electrolytic lesion anterior to RA produces an immediate effect on song 1163 acoustic structure, presumably due to the destruction of axons in the RA output tract. 1164 Such immediate effects on song structure were never observed after excitotoxic lesions in 1165 AIV, which also extended anterior to RA. C) Sagittal section (NeuN, white) showing RA 1166 and the electrolytic lesion anterior to it (red arrow). D) Song spectograms prior to the 1167 electrolytic lesion (upper panels) and the first day of singing post lesion (lower panels), 1168 for the same bird in panel C. E) Plot of song self-similarity post-surgery, normalized to 1169 the average self-similarity of pre-surgery song (75-80dph). Post-electrolytic-lesion 1170 similarity scores (n=5 birds, red) were significantly smaller than pre-lesion similarity 1171 scores (ranksum test, p=0.008), and significantly smaller than the normalized self- 1172 similarity scores of AIV-lesioned birds (n=6 birds, grey). (ranksum test, p=0.0043). F) 1173 Plot of song self-similarity of AIV-lesioned birds (grey) and electrolytically-lesioned 1174 birds for two weeks after surgery. Self-similarity of electrolytically-lesioned birds 1175 showed an immediate drop after lesion that remained significantly lower than pre- 1176 surgery, and lower than AIV-lesioned birds, for at least two weeks (all comparisons, 1177 p<0.01). 1178 Note on Methods: Bilateral electrolytic lesions were carried out by passing cathodal 1179 current of 70uA for 45 seconds at multiple sites in the arcopallium anteroventral to RA. 1180 The head was oriented stereotaxically with the flat anterior portion of the skull rotated 1181 forward at a pitch of 45 degrees. RA was completely mapped based on its characteristic 1182 electrophysiological signature (regular spiking interspersed with bursts). Once all of the 1183 borders of RA were localized, these were used to calculate the coordinates of the lesion

1184 sites. Two penetrations were made 200µm anterior to the anterior edge of RA, spaced 1185 300µm from each other on the medial-lateral axis. For each penetration, two lesions were 1186 made in two depths: 200µm above and below the bottom edge of RA. Songs were 1187 acquired for two weeks (n=3 birds) or for two days (n=2 birds), after which birds were 1188 sacrificed and the brain analyzed histologically. 1189

1190 Figure 7-figure supplement 1 – Simultaneous mapping of VTA/SNc-projecting 1191 neurons in AIV by antidromic activation and . Does electrical 1192 stimulation in VTA/SNc antidromically activate only AIV neurons, or does it also 1193 activate other descending arcopallial pathways? To assess this question, we carried out 1194 mapping of antidromically-stimulated activity in ventral arcopallium and compared this, 1195 in the same animal, to the pattern of AIV neurons retrogradely-labeled from VTA/SNc. 1196 A) Sagittal section showing neurons in AIV retrogradely-labeled by an injection of a 1197 CTB into VTA/SNc. Antidromic responses were measured along the electrode tracks 1198 indicated (diagonal grey lines). Yellow circles indicate sites where statistically significant 1199 antidromic activation from electrical stimulation in VTA/SNc was observed. White 1200 circles indicate where no significant response was observed. Dashed line indicates the 1201 penetration along which reference electrolytic burn marks were made. B-D) Example 1202 responses to VTA/SNc stimulation are shown at selected recording sites along each of 1203 three electrode tracks. Significant antidromic responses were observed only at locations 1204 where retrogradely-labeled neurons were subsequently found. Specifically, there was no 1205 antidromic response seen in the gap between the main posterior mass of retrogradely 1206 neurons and the more anterior stripe. Nor were antidromic responses observed antero- 1207 ventral to the anterior stripe, within the borders of RA, nor dorsal to the thin rim of 1208 retrogradely-labeled neurons along the dorsal border of RA. 1209 1210 Note on Methods: Retrograde tracer (CTB) was injected into VTA and SNc, and bipolar 1211 stimulating electrodes were implanted into the same region. The head was then oriented 1212 with the flat anterior portion of the skull rotated forward at a pitch of 70 degrees. RA was 1213 completely mapped based on its characteristic electrophysiological signature (regular 1214 spiking interspersed with bursts). Electrode penetrations (Carbostar, Kation Scientific) 1215 were made in a plane 200µm medial from the lateral edge of RA and penetrations were 1216 made at several locations along the AP axis in this plane. For each penetration, recordings 1217 were made along a 1-1.5 mm range of DV coordinates spanning through the arcopallium. 1218 For each recording site the existence of antidromic responses was tested with single 0.2 1219 ms monopolar pulses, current up to 350µA and with both polarities of stimulation 1220 current. If no antidromic response was observed visually, responses were recorded using

1221 the maximum current (350 µA), otherwise, responses were recorded at 130% of threshold 1222 current. In each recording penetration, we noted the most ventral coordinate at which the 1223 characteristic spontaneous activity of RA was found. After recordings were complete, 1224 another electrode was used to make an electrolytic burn at known locations relative to the 1225 recording sites. 1226 Alignment of antidromic and retrograde maps. The were sectioned and stained for 1227 NeuN, and retrogradely-labelled neurons were visualized. The AP coordinate of every 1228 recording site was determined with respect to the electrolytic burns. The DV coordinate 1229 of every recording site was determined by aligning the ventral border of RA as 1230 determined histologically (NeuN stain) and electrophysiologically (spontaneous activity). 1231 Significance of an antidromic response: We analyzed the significance of antidromic 1232 responses as follows. For each recording site we computed the frequency spectrum of the 1233 averaged response to the stimulus (15 ms window, spanning 2-17 ms after stimulus 1234 onset). We then computed the power between 200Hz-2KHz. A t-test was used to test the 1235 hypothesis that the power in this frequency band was significantly higher than the power 1236 obtained from recordings with no antidromic response (recorded at 1-2 mm dorsal to RA 1237 and inspected visually to verify they did not show antidromic response). Note that results 1238 were consistent for different frequency bands. 1239