<<

Syringeal EMGs and synthetic stimuli reveal a switch- like activation of the songbird’s vocal motor program

Alan Busha,b,1, Juan F. Döpplera,b, Franz Gollerc, and Gabriel B. Mindlina,b

aDepartamento de Física, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires CP 1428, Argentina; bInstituto de Física de Buenos Aires, Consejo Nacional de Investigaciones Científicas y Técnicas, Buenos Aires CP 1428, Argentina; and cDepartment of , University of Utah, Salt Lake City, UT 84112

Edited by Harvey J. Karten, University of California, San Diego, La Jolla, CA, and approved July 10, 2018 (received for review January 23, 2018)

The coordination of complex vocal behaviors like human speech BOS (11). This suggests that these synthetic songs are ideally and oscine birdsong requires fine interactions between sensory suited to study the activation of the song system by and motor programs, the details of which are not completely auditory stimuli: good enough to evoke the highly specific re- understood. Here, we show that in sleeping male zebra finches sponse, yet sufficiently different as to produce measurable (Taeniopygia guttata), the activity of the song system selectively changes in that response. Moreover, the parameters in the model evoked by playbacks of their own song can be detected in the used to generate the synthetic songs can be continuously varied syrinx. Electromyograms (EMGs) of a syringeal muscle show to explore changes in the activation dynamics under a smooth playback-evoked patterns strikingly similar to those recorded dur- degradation of the stimulus. ing song execution, with preferred activation instants within the Measuring the sparse firing pattern of few individual neurons song. Using this global and continuous readout, we studied the gives limited information on the overall activity of the system. We activation dynamics of the song system elicited by different audi- can overcome this difficulty by measuring efferent motor com- tory stimuli. We found that synthetic versions of the bird’s song, mands that provide a global, continuous, and graded readout of ’ rendered by a physical model of the avian phonation apparatus, the song system s activity. In this regard, the electrical activity of A evoked very similar responses, albeit with lower efficiency. Mod- syringealis ventralis (vS), the largest muscle in the syrinx (Fig. 1 ), ifications of autogenous or synthetic songs reduce the response has been measured during song execution (12, 13). Remarkably, probability, but when present, the elicited activity patterns match this muscle spontaneously activates during sleep, showing variable execution patterns in shape and timing, indicating an all-or-nothing patterns which sometimes resemble the execution pattern (14). activation of the vocal motor program. In this work, we measure electromyograms (EMGs) of the vS muscle during song execution and nocturnal auditory playback . This enables the study of the overall activation zebra finch | song system | sensory–motor integration | syrinx | dynamics of the song system elicited by autogenous and synthetic electromyogram stimuli with high temporal resolution. he cooperation between different individuals, or the ability to Results Tperform coordinated and even synchronic actions, is at the We implanted electrodes in the syrinx of male zebra finches foundation of many species’ success. This requires a delicate obtaining EMG signals with amplitudes of [0.4–1.7] mV before interaction between sensory and motor programs, at specific amplification and dynamic ranges around 30 dB compared with regions of the nervous system. In this work, we explore some baseline fluctuations. Implants lasted from a few days up to 3 wk, aspects of this link, in the framework of oscine song production. showing a mild reduction of EMG amplitude over time in some Our model will be the elicitation of motor-like patterns in the cases. We confirmed electrode placement in vS by postmortem oscine song system when exposed, during sleep, to their own examinations. In some animals we observed an electrocardiogram- song. The avian auditory pathway indirectly connects to the song like (ECG) component in the signal (SI Appendix,Fig.S1A and B), system, a set of neural nuclei necessary to generate the motor gestures required for phonation (Fig. 1A; reviewed in ref. 1). Significance Neurons of HVC and RA (two of these nuclei) show a stereo- typed sparse spiking pattern elicited by playback of autogenous song in sleeping or anesthetized animals, firing at few specific The study of the integration between sensory inputs and mo- instants of the song (2–6). In zebra finches and white-crowned tor commands has greatly benefited from the finding that in sparrows these responses are highly specific to the bird’s own sleeping oscine birds, playback of their own song evokes highly song (BOS) because they are not evoked by a reversed version of specific firing patterns in neurons also involved in the pro- BOS, and similar conspecific songs, tone bursts, or combinations duction of that song. Nevertheless, the sparse spiking patterns of simple stimuli only evoke weak responses, if any (3, 7, 8). that can be recorded from few single neurons gives limited Furthermore, the spiking times are almost identical to those of information of the overall activity of the song system. Here, we the same neurons measured during song execution (4, 6). Se- show that this response is not restricted to the central nervous lective responses have also been observed in motor neurons of system but reaches vocal muscles. Combining this integrated the hypoglossal motor nucleus (nXIIts) that innervate the syrinx, A measure of the activity of the system with surrogate synthetic the avian vocal (9, 10) (Fig. 1 ). The highly selective re- songs, we found an all-or-nothing or switch-like activation of sponse of the song system to playbacks provides an excellent the song system. experimental opportunity to study its activation by auditory

stimuli and will be our tool to explore the integration between Author contributions: A.B., F.G., and G.B.M. designed research; A.B. and J.F.D. performed the auditory and motor programs. research; A.B. analyzed data; and A.B. and G.B.M. wrote the paper. One way to understand how the song system is activated by The authors declare no conflict of interest. this stimulus is to study the response to slight variations of BOS that maintain the capacity to activate the system but in a slightly This article is a PNAS Direct Submission. deteriorated way. In this regard, synthetic songs generated with a Published under the PNAS license. mathematical model of the avian vocal organ (SYN) have been 1To whom correspondence should be addressed. Email: [email protected]. shown to evoke firing patterns in HVC similar to those evoked by This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. BOS (11). Nevertheless, the probability of any given neuron of 1073/pnas.1801251115/-/DCSupplemental. responding to SYN was at most 60% of that of responding to Published online August 1, 2018.

8436–8441 | PNAS | August 14, 2018 | vol. 115 | no. 33 www.pnas.org/cgi/doi/10.1073/pnas.1801251115 Downloaded by guest on October 2, 2021 AC also capable of eliciting selective vS activity, consistent with the re- HVC Nif LMAN vS motif overlays sponse observed in HVC (11). L EMG 10 RA X 8 6 Correlation Analysis Reveals Short Delays of the Evoked Syringeal DLM 4

Uva (kHz) Freq. 2 Response. For each stimulus, we approximated the response pat- Ov DM 40 terns as the 95th percentile of vS activity at each time point of the MLd 30 A nXIIts LL 20 motif (referred to as BOS95,SYN95,CON95, and REV95;Fig.3 ). 10 CN We then cross-correlated these patterns to the median vS activity

vS act. (dB) 0 -0.1 0.0 0.1 0.2 0.3 0.4 0.50.6 0.7 during motif execution (EXE50;Fig.3B). BOS95 and SYN95 had Syrinx Hair Time (s) Pearson’s correlation coefficient ranging from 0.63 to 0.83 (Fig. 3C cells and SI Appendix,TableS1), with no significant difference among B motif 1 motif 2 motif 3 motif 4 D them (P = 0.20), but both were significantly higher than those of 10 0.2 −5 8 CON95 and SYN95 (P < 10 , ANOVA with orthogonal contrasts). 6 baseline 4 0.1 ThetimedelaysofBOS95 and SYN95, with respect to the execution Density Freq. (kHz) Freq. 2 pattern EXE , were less than 30 ms in all instances, with some 0.0 50 40 -10 0 10 20 30 40 50 C 30 cases showing no detectable delay (Fig. 3 ). There was no signif- vS activity (dB) 20 icant difference in delay times between SYN95 and BOS95 (P = 10 t

vS act. (dB) 0 0.43, paired test). 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 Upon closer examination, we noticed that for some animals the Time (s) time delay of individual responses was not constant (Fig. 3 D and E Fig. 1. vS EMGs during song execution. (A, Top Left) Schematic represen- ). To visualize how the delay varies along the motif we con- structed correlation spectrograms, in which local correlation co- tation of the nuclei of the auditory pathway (violet), song motor pathway D (blue), and anterior forebrain pathways (red). (Bottom Right) Schematic efficients to EXE50 are shown for all times and delays (Fig. 3 , representation of the syrinx and associated muscles with approximate im- Bottom). In this particular example, the delay of maximal corre- plantation point of the bipolar electrodes in vS (red). (B, Top) Spectrogram lation increases linearly with time, reaching values close to 40 ms, of a typical song of a male zebra finch. Solid lines above the plot indicate after which the correlation is briefly lost and later reappears with a song motifs. (Bottom) vS activity measured during song execution (bird near-zero delay. To analyze the variability in the playback-elicited ab09, first surgery). (C, Top) Spectrogram of a single motif. (Bottom)Overlay vS activity, we time warped these patterns to a reference execution of all vS activity traces measured during motif execution. (D) Distribution of trace (SI Appendix, Methods and Fig. S3C). In this way, we could execution vS activity at the indicated times of the motif (blue histograms) assess the similarity among responses independently of small phase and baseline vS activity (gray histogram). differences between them due to variable time delays.

Syringeal Responses Evoked by Autogenous or Synthetic Songs Have consistent with the heart rate of small birds (15). The syrinx is a Similar Distributions. We performed multidimensional scaling deep structure located close to the heart; therefore, an ECG (MDS) on all of the vS activity traces measured for the different component is not surprising. To extract the EMG component of stimuli and song executions (Fig. 4A and SI Appendix, Methods). the signal we applied the Teager operator (16), a measure of MDS is a nonlinear visualization technique that represents the local energy of the signal, because this operator greatly attenuates multidimensional distance between traces in a 2D plot; that is, the ECG component (SI Appendix,Fig.S1A). We then estimated proximate points in the MDS plot correspond to similar vS activity the vS activity as the log-ratio of the Teager energy to its baseline traces. Points representing execution traces (EXE) form a com- value (Methods). pact cluster toward the right of the plot, whereas REV and CON Traces of vS activity show a repetitive pattern during song points form a disperse cloud toward the left (Fig. 4A). Some of the execution, consistent with the reiterations of the song’s motif BOS points are in the region of the REV–CON cloud and cor- (Fig. 1B). Overlaying these traces for all executions of the motif respond to traces with baseline vS activity (Fig. 4 A, i;EXE50 shows a highly stereotyped pattern, with relatively small varia- tions across renditions (Fig. 1C and SI Appendix, Fig. S1 C and D ). Furthermore, this measure of vS activity has an approxi- 10 mately normal distribution at each time in the motif (Fig. 1D). In 8 AB three birds, we performed a second surgery in which we implanted 6 the electrodes in a different position of the vS muscle. The EMG 4 Freq. (kHz) Freq. 2 patterns observed in both surgeries were similar (Pearson’scor- – P < −5 SI Appendix 40 BOS SYN relation in [0.53 0.82], 10 in all cases; ,Fig. 30 S1E), suggesting a homogeneous activation of the muscle. 20 10 Nocturnal Playbacks Selectively Evoke Activation of the Syrinx. Au- vS act. (dB) 0 ditory stimuli included a recording of the BOS and its reversed 10 CD version (REV), a conspecific song(CON),andaSYN,presentedin 8 6 random order (Methods). Fig. 2 shows all of the vS activity traces 4

evoked by nocturnal playbacks for a bird, time-locked to each motif (kHz) Freq. 2 SI Appendix (additional birds in ,Fig.S2). For REV and CON 40 REV CON motifs, the traces fluctuate around baseline, with occasional and 30 20 inconsistent increases in activity. In contrast, a significant fraction of 10

traces in response to BOS show a consistent activation, resulting in vS act. (dB) 0 an overall pattern reminiscent of the execution activity (Fig. 2A). 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 Therefore, playback of the bird’s own song selectively elicits a ste- Time (s) Time (s) reotyped activation in the syrinx, consistent with the response ob- Fig. 2. Syringeal activity is selectively evoked by auditory playbacks. served in the central nervous system (3). Similarly, SYN also Nocturnal playback responses to (A) the BOS, (B)SYN,(C)REV,and(D)a stimulated consistent activation of vS. Furthermore, this activation CON. In all cases, Top shows the spectrogram of the stimulus motif, and

pattern is strikingly similar to that of BOS, although it was evoked in Bottom shows an overlay of all recorded vS responses to those motifs (bird a smaller fraction of the trials (Fig. 2B). Hence, our song synthesis is ab09, second surgery).

Bush et al. PNAS | August 14, 2018 | vol. 115 | no. 33 | 8437 Downloaded by guest on October 2, 2021 ABC showing that BOS scores are significantly higher than REV and CON in all cases, responses to SYN are significantly higher than those to REV and CON (except in ab10), and score values for SYN are significantly different from BOS (except in ch12). To gain further insight into the manner in which responses to SYN differ from those to BOS, we defined high responses as those with score greater than 0.5 ðρ > 0.5Þ. Note that close to 95% of the responses to CON and REV are below this threshold. BOS elicited a significantly larger fraction of high responses than SYN in eight out of nine experiments (Fisher’s exact test). The D distribution of score values for high responses was not signifi- cantly different between SYN and BOS (KS test; Fig. 4B and SI Appendix, Table S1).

Onsets of Syringeal Activity Are Clustered at Hot Spots. The previous analysis considered the vS activity elicited by a playback motif as a whole. However, in many cases the elicited response followed the median execution pattern during only a portion of the motif (Fig. 4 A, ii and iii). To explore the temporal pattern of activation, we identified activity segments as intervals of time where the mea- sured vS activity trace closely followed the expected execution E pattern of the motif (Methods and SI Appendix,Fig.S5A). For BOS

A iiiiii 30 Fig. 3. Evoked vS activity correlates with the execution pattern. (A) Median 20 vS activity pattern during motif execution (EXE50) and 95th percentiles of 10 responses to different auditory stimuli (BOS95,SYN95, REV95, and CON95 for 0

bird ab09). (B) Cross-correlation analyses of responses to auditory stimuli vS activity (dB) 0 0.1 0.2 0.3 0.4 0.5 0.6 0 0.1 0.2 0.3 0.4 0.5 0.6 0 0.1 0.2 0.3 0.4 0.5 0.6 with respect to median EXE50. Arrows indicate times of maximal correlation Time (s) Time (s) Time (s) for BOS95 (red) and SYN95 (blue). (C) Maximal correlation values and corre- spondent delays for BOS95 (circles) and SYN95 (triangles) of all animals (SI iv 30 Appendix, Table S1). Error bars represent SEs as calculated by bootstrapping. (D, Top) Spectrogram of an entire song of bird ab09. (Middle) vS activity 100 20 profile during execution of that song (gray) and the evoked profile during a 10 nocturnal playback (red). (Bottom) Correlation spectrogram showing in gray 0 scale the local (250-ms) correlation between the above traces for every time 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0 Time (s) and delay (Methods). (E) vS activity for execution (gray) and evoked re- MDS sponse to BOS (red) for the indicated time windows. v 30 −100 20 EXE CON shown in gray for reference). Other BOS points form a cluster BOS 10 toward the top right of the plot, which marginally overlaps with SYN REV 0 the execution cluster (Fig. 4 A, iv). Remarkably, SYN points have −200 −100 0 100 200 0 0.1 0.2 0.3 0.4 0.5 0.6 MDS Time (s) a similar distribution to that of BOS, although with a smaller 0 fraction in the cluster adjacent to the execution points. Points in B ab05 ab08 ab09 ab10 ab11 ab12 ch12 the BOS–SYN cluster correspond to traces that closely follow the Sx1 Sx2 Sx1 Sx2 Sx1 Sx2 Sx1 Sx1 Sx1 Sx1 execution profile but for only a portion of the motif; as the frac- 1.0 tion of time the trace follows the execution pattern increases, the points move toward the execution cluster (Fig. 4 A, ii and iii). We show this transition in more detail in a principal component 0.5 SI Appendix analysis (PCA) plot of the same data points ( ,Fig. Score Projection S4A; note how the nonlinear MDS spreads the EXE cluster compared with PCA). MDS plots for some animals show an 0.0 overlap of the playback-evoked and execution clusters, whereas others do not (SI Appendix,Fig.S4B–E). In the latter cases, subtle Normalized a b c c a b c d d b c d e b c d e a b c d d a b c d d a b c c c a b c d d a b c d d a b b c c rev rev rev rev rev rev rev rev rev rev syn syn syn syn syn syn syn syn syn syn exe exe exe exe exe exe exe exe differences during brief time intervalsofthemotifallowthesepa- exe exe bos bos bos bos bos bos bos bos bos bos con con con con con con con con con con ration of both clusters, even though those differences are hard to identify visually (SI Appendix,Fig.S4A; compare panels iv and v). Fig. 4. BOS and SYN have similar elicited response distributions. (A) MDS To perform statistical analyses, we defined the normalized plot for all execution (EXE, violet) and stimuli-evoked (BOS, red; SYN, blue; – projection score (ρ), a measure of similarity between vS activity REV, brown; and CON, green) vS activity traces (bird ab09). i v show example traces and median execution profile (Methods). A value of zero vS activity traces of the indicated regions of the MDS plot. Gray curve in all panels corresponds to the median execution activity (EXE50). (B) Normalized indicates baseline vS activity, whereas unity corresponds to a projection score (Methods) of all vS activity traces of all animals (ab05, ab08, trace exactly matching EXE50 in shape and amplitude. The dis- B etc.), surgeries (Sx1 or Sx2), and stimuli (exe, bos, syn, con, and rev). Shared tribution of this score is shown in Fig. 4 for all birds and stimuli. letters (a–d) indicate nonsignificant differences between conditions As expected, EXE traces have scores close to 1, whereas REV (pairwise K–S tests corrected by FDR; Methods). Horizontal dashed line and CON traces have lower and more disperse values. We per- indicates the high response threshold. (For ab08, BOS was used as ref- – 95 formed pairwise Kolmogorov Smirnov (KS) tests correcting for erence pattern for score calculation instead of EXE50 because no execution the familywise false discovery rate (17) (SI Appendix, Table S1), was recorded.)

8438 | www.pnas.org/cgi/doi/10.1073/pnas.1801251115 Bush et al. Downloaded by guest on October 2, 2021 and SYN these activity segments show a structure consistent with The 22coverage for the synthetic versions of the song is roughly the repetitive motif structure of the song (Fig. 5A). The probability similar to that of BOS, although significantly lower in some of evoking a response depends on the position of the motif within portions of the motif. Notably, responses to SYN show a subset the song, with a reduced response in the first and fifth motifs (SI of the hot spots observed for BOS (Fig. 5B and SI Appendix, Fig. Appendix,Fig.S5B). S5 C–F). To assess if the delay of the evoked vS activity varied along the In some occasions, we observed vS activity consistent with the response, we performed a cross-correlation analysis against the execution pattern, after the last motif of the playback. Furthermore, execution pattern for each half of the activity segments. We found the timing of these virtual motifs was consistent with what would be larger delays for the latter part of the responses in two animals expected based on the intermotif intervals (Fig. 5 C and D). (e.g., Fig. 3D, Wilcoxon sign-rank test) and no significant dif- ferences in others. (In some animals, the analysis could not be Modified Stimuli Evoke the Same Motor Gestures with Lower performed: ab08 had no execution pattern, and the activity seg- Probability. Next, we measured the manner in which the re- ments of ab10 were too short to give informative correlations.) In sponse elicited by the synthetic song changed as we continuously no case did we find a decreasing delay along the response. varied a parameter of the synthesis. We chose the level of noise We then calculated the coverage at each time of the motif, in the tension of the labia (σβ; SI Appendix, Methods) because that is, the percentage of playback motifs with detected vS ac- previous work has shown that the response at the level of HVC tivity (Fig. 5 B, ii). Remarkably, the onset times of the activity in sleeping animals depends on this parameter (11). As can be segments tend to cluster at defined instants of the motif (Fig. 5 observed in Fig. 6 A and B, the probability of evoking the specific B, iii and iv), producing sudden increases in the coverage per- response decreases at high levels of noise. Nevertheless, when a centage (Fig. 5 B, ii). We refer to these instants as “hot spots.” response is evoked, the same complete motor gestures are ob- served as in the normal responses (individual traces are high- lighted in SI Appendix, Fig. S6). Finally, we studied the changes in the response obtained by A motif 1 motif 2 motif 3 motif 4 motif 5 muting individual elements of the motif, as has been previously 10 done while measuring single unit firing patterns in RA (4). 8 Muting an element of the motif strongly reduces the response 6 following that element. Significant reductions in the elicitation 4 Freq. (kHz) Freq. 2 probability of playback-evoked motor gestures can be observed from 50 ms after the start of the muted element to 200 ms after the end of the muted element (Fig. 6 C and D and SI Appendix, Fig. S7). Conversely, the elicitation of each playback-evoked

BOS motor gesture depends on one or more of the preceding ele- ments. When a motor gesture is evoked, it appears in its complete form. Stated differently, eliminating motif elements reduces the probability of evoking posterior motor gestures but does not strongly affect the timing or shape of the evoked gestures. SYN Discussion 0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 ’ Time (s) In this work, we found that nocturnal playbacks of the bird s own B song, or synthetic versions of that song, evoke vS activity patterns i z) 10 10 H strikingly similar to those recorded during song execution. This 8 8 C 6 6 result is consistent with the activity of telencephalic nuclei eli-

eq. (k 4 r

(kHz) 4 F requency 2 cited by similar stimuli (2–4, 6, 11), yet it is surprising that the F 2 40 40 ii BOS 30 motor command should reach the syrinx in sleeping animals. In 30 20

SYN ct. (dB) 20 10 line with our finding, nXIIts motor neurons that project to the (%) 0

10 vS a syrinx are activated by BOS playbacks in anesthetized animals (9, Coverage 0 iii 2.4 2.6 2.8 3.0 3.2 3.4 10), and spontaneous song-like activity can be detected in vS of 20 BOS z) 10 15 8 D sleeping birds (14). Despite the activation of the syrinx, birds do 10 6 4 not phonate while sleeping due to the lack of the corresponding (onset) Counts 5

0 (kH Freq. 2 iv respiratory gesture (14). Sturdy et al. (10) reported an entrain- 20 SYN 40 15 30 ment of respiration by auditory playbacks, which suggests a re- 10 t. (dB) 20 10 sidual coupling between the song and respiratory systems in (onset) Counts 5 0 0 vS ac anesthetized birds. 0.1 0.2 0.3 0.4 0.5 3.4 3.6 3.8 4.0 4.2 4.4 The spontaneous nocturnal activity of the song system has Time (s) Time (s) been suggested to have a functional role in memory consolida- Fig. 5. Evoked activity onsets cluster at hot spots. (A, Top) Spectrogram of tion (4, 18). On the other hand, the high variability observed full song used as auditory stimulus (bird ab05). Activity segments detected during nocturnal activity suggests it could be involved in the for (Middle) BOS (red) and (Bottom) SYN (blue) for all playbacks, arranged generation of internal error signals that contribute to the stability chronologically per line from top to bottom. (B, i) Spectrogram of BOS motif. of the motor program (14). In any case, the functional implica- (B, ii) Activity segment coverage (i.e., percentage of presented motifs that tions of this spontaneous activity reaching syringeal muscles evoked vS activity) for every time of the motif, for BOS (red) and SYN (blue). during nocturnal motor replays remain unclear. It could be in- Arrows indicate hot spots. Shaded regions correspond to the SE as calculated volved in the maintenance of superfast syringeal muscles (19, 20) by bootstrapping. (B, iii and iv) Histograms for the onset times of activity and/or the elasticity of the syringeal membranes. On the other segments for BOS and SYN (15-ms bins). The dashed horizontal line indicates hand, the auditory elicitation of song-like syringeal activity in the threshold for significant difference from a uniform distribution (using ’ α = sleeping animals is not something likely to happen in due Bonferroni s correction for familywise error rate at 0.05). (C) In-phase ’ evoked activity can be detected after the end of the stimulus. Spectrogram to high selectivity to the bird s own song. Therefore, this be- of the final part of an auditory stimulus (Top; ab09) and a selected example havior may be a byproduct of the of the song system of evoked vS activity pattern (Bottom; black). The expected vS activity based with no adaptive value per se (21). Be that as it may, it provides a

on the execution pattern and typical intermotif intervals is shown in gray. global readout of the activation of the system well suited to ex- NEUROSCIENCE (D) Same as in C for bird ab11. plore the link between sensory and motor systems.

Bush et al. PNAS | August 14, 2018 | vol. 115 | no. 33 | 8439 Downloaded by guest on October 2, 2021 A C pattern (Fig. 6). This effect is reminiscent of categorical per- ABCDception in humans (22, 23). In swamp sparrows, similar cate- 10 SYN BOS 8 gorical responses to continuous variation of auditory stimuli have 6 been reported (23, 24). 4 2 An observation is that the moments at which the responses 30 start are not uniformly distributed along the motif. Instead, they SYN σβ=0.0 BOS (complete) B 20 cluster at few well-defined hot spots (Fig. 5 ). We could not 10 correlate these hot spots to any consistent acoustic features. Our

vS act. (dB) (kHz) Freq. 0 song syntheses usually produced onset of activity in a subset of

30 SYN σβ=0.003 BOS -A the hot spots produced by BOS, which may partially explain the 20 lower probability of evoking response by SYN. 10 The overall delay of the vS responses (relative to the execution

vS act. (dB) 0 pattern) was less than 30 ms in all cases. This is consistent with 30 SYN σβ=0.03 BOS -B what has been previously reported in the telencephalic nuclei 20 RA and HVC (4, 6, 11). Furthermore, our measurements bound 10 the delay below 10 ms for several birds (not ruling out zero de- 0 vS act. (dB) lay), whereas for others we found a near-zero delay for some 30 SYN σβ=0.1 BOS -C segments of the motif (Fig. 3 C and D). These delays are shorter 20 than the 50 ms latency observed from auditory stimulus to acti- 10 vation of the hypoglossal motor neurons that innervate the syrinx vS act. (dB) 0 (9). Therefore, our data confirm that a specific auditory event 30 SYN σβ=0.3 BOS -D can only evoke activity patterns that come later in the motif, not 20 10 the gesture causally related to the production of that particular event, as already suggested by Dave and Margoliash (4). Con-

vS act. (dB) 0 sistent with this notion, we found a 50-ms delay from the be- 0.0 0.1 0.2 0.3 0.0 0.1 0.2 0.3 ginning of a muted element to the first significant reduction in Time (s) Time (s) B D the response (Fig. 6 C and D). Furthermore, in some infrequent 0.6 Playback-evoked motor gestures cases, vS activity continued after the end of the last motif, in a ) vi syn iiiiii iv pattern consistent with what would be expected for an additional 0.5 v vii 90 rev-con virtual motif (Fig. 5 C and D). 0.4 In swamp sparrows, some area X-projecting HVC neurons bos BOS95 (HVCX), which fire at precise times during a song execution, also 0.3 Percentual change from BOS fire at the corresponding times during playback of that same i −9±5 −25±6* −57±7* −67±8* 0.2 song, even in awake and freely behaving animals (25). Further- con ii −51±14* −12±12 −48±14* −61±16* iii −89±29* −5±11 7±10 −8±11 more, these HVC units also fire during playback of similar con- 0.1 X rev iv −95±26* 3±8 16±7 1±8 specific songs, suggesting that these auditory–vocal mirror neurons Response prob. (ρ≥ρ v −91±18* −5±7 15±7 4±7 0.0 – vi −79±12* −48±9* −11±8 4±7 could be involved in song recognition and sender receiver inter- l Motor gesture 0 vii −37±8* −90±19* −72±12* −72±12* actions (25). However, RA-projecting HVC neurons (involved in ctr 0.1 0.3 0.01 0.03 BOS -A BOS -B BOS -C BOS -D A 0.003 0.001 the song motor pathway; Fig. 1 ) are not driven by auditory stimuli Noise in labia tension for SYN () Auditory stimulus in awake swamp sparrows (23, 25) or zebra finches (26, 27). Therefore, syringeal activity evoked by auditory stimuli is unlikely Fig. 6. Degraded stimuli evoke normal motor gestures with lower proba- to be triggered during wakefulness in either species. bility. (A) Overlays of vS activity traces elicited by motifs synthesized with the σ If the song system has some sort of preprogrammed internal indicated amount of noise ( β) added to the parameter representing the dynamics, the correct sequence of auditory stimuli could then tension of the labia, β(t). (B) Response probability, calculated as the fraction of playbacks with normalized projection score (ρ) higher than the 90th entrain it, activating the system and evoking the response. This ðρREV-CONÞ conceptual model is consistent with the differences observed in the percentile of REV and CON 90 .(C) Overlays of vS activity trances evoked by BOS motifs in which the indicated elements were muted. (D, Top) responses; a suboptimal sequence of auditory stimuli, such as the Labels assigned to playback-evoked motor gestures. (Bottom) Quantification ones produced by different versions of our synthetic songs, is less of the change in motor gesture coverage for muted stimuli compared with efficient at inducing the response, but once induced, the response BOS. SEs are calculated by bootstrapping. Asterisks indicate significant dif- follows its preprogrammed dynamic, independently of the stimulus ferences from BOS (controlling for familywise error rate at α = 0.05 using that evoked it (Figs. 2 and 6). Notably, continuous auditory stim- Bonferroni’s method). Bird ab18. ulation is required to maintain the response in sleeping zebra finches because muting an element of the motif greatly reduces the subsequent response (Fig. 6 C and D and ref. 4). This conceptual Notably, when activated by auditory stimuli the vS signal closely model is also consistent with the variable time delays observed in follows the execution pattern in shape and amplitude; that is, it is some birds (Fig. 3D) because coupling of nonlinear dynamical an all-or-nothing or switch-like response (Fig. 2). Note that in systems ordinarily results in time-varying phase differences (28). contrast to measurements of the activity of individual neurons, EMGs reflect the summed activity of tens of thousands of central Methods neurons and, in consequence, could potentially show partial ac- Subjects. Adult male zebra finches (Taeniopygia guttata) were acquired from tivation in individual trials. Therefore, the observed switch-like local breeders. After a 10-d quarantine protocol, birds were housed in large activation is a property of the song system and not of the cho- cages (40 × 40 × 120 cm) with conspecific males under a 14-h/10-h light/dark sen readout. The difference observed between autogenous and cycle, had ad libitum access to food and water, and could interact visually and synthetic songs is the probability of eliciting a response, not the vocally with a conspecific female. Experimental protocols were approved by the Animal Care and Use Committee of the University of Buenos Aires. profile of the evoked response (Fig. 4). Fragments of the response elicited by BOS could be missing from the EMG elicited by ma- Song Recording. Birds’ songs were recorded 3–5 d before the surgery in a nipulated stimuli. This was observed both using continuous ma- custom-built acoustic isolation chambers with a microphone (SGC568; Takstar) nipulations of the stimuli (through continuous changes in the connected to an audio amplifier and a data acquisition device (DAQ, USB- model used to generate synthetic songs) and using discrete ma- 6212; National Instruments) connected to a PC. Custom MATLAB (The nipulations (deletion of complete elements of BOS). However, Mathworks) scripts were used to detect and record sounds, including a 1-s whenever elicited, the fragments preserved the shape of the execution pretrigger window.

8440 | www.pnas.org/cgi/doi/10.1073/pnas.1801251115 Bush et al. Downloaded by guest on October 2, 2021 Surgery. Custom-made bipolar electrodes were implanted in the vS muscle as To compare the elicited activation patterns independently of the variable previously described (12) and connected to an analog differential amplifier delays observed (Fig. 3D), we implemented a smooth time warping (STW) (225×) mounted on a backpack previously fitted to the bird (SI Appendix, algorithm that allows small nonlinearities in the time stretching (SI Appen- Methods). dix, Methods and Fig. S3C). To quantify the response elicited by the different stimuli, we defined the * * * * * EMG Recording. After a 4-h recovery time, birds were placed in the acoustic normalized projection score ρ = ðy · xÞ=ðx · xÞ,wherex is the median of the isolation chamber, and the amplifier mounted to the backpack was con- * nected to the DAQ through a rotator to allow free movement. Two external execution patterns (EXE50)andy is the vS activity trace for a particular play- 12-V batteries powered the amplifier to avoid line noise. EMG and audio back trial or execution pattern. Before calculating the score, we time-aligned signals were recorded by a PC MATLAB, at a sampling rate of 44,150 Hz. all of the vS activities to the average execution pattern of a selected song using During day/night, recordings were triggered by sound/EMG with a 1-s pretrigger our STW algorithm. window. We adjusted trigger thresholds for each bird to levels slightly higher than the basal range, minimizing the probability of missing events, even when this led Statistical Analysis. Using ρ, we performed all pairwise comparisons between to a high percentage of spurious records. stimuli for each bird using a KS test and controlled for false discovery rate at α = 0.05 (17). Results are informed with letter code in Fig. 4B. We compared Playback Stimuli. The biomechanics of birdsong production constrains the the proportion of high responses (ρ > 0.5Þ between BOS and SYN using plausible relationships between different acoustic features of birdsong Fisher’s exact test for count data. Finally, we compared the distribution of the sounds, as fundamental frequency and spectral content in the case of zebra high responses between BOS and SYN using a KS test. P values for these tests finches (11, 29). Therefore, generating stimuli through computational are informed in SI Appendix, Table S1. All statistical tests were done in R. implementations of these physical models simplifies the number of manip- Periods of evoked vS activity consistent with the execution pattern were ulations that can be independently performed in a way that is consistent detected in the following manner (SI Appendix, Fig. S5A). The baseline dis- with the biomechanics. SYNs of the BOS were produced as previously tribution of vS activity P0ðvSÞ was estimated from −3to−1 s relative to reported (30) (SI Appendix). Playback protocols were passed at night (11– playback onset. The time-dependent distribution of vS activity during motif – 12 PM to 5 6 AM) from a speaker inside the acoustic isolation chamber. Each executions Px ðvS, tÞ was estimated from data as that of Fig. 1C, time-locked protocol consisted of several stimuli of similar duration including BOS, SYN, to each playback motif. These were then used to calculate the log-probability ratio Λð Þ = ð ð ð Þ + τÞ= ð ð ÞÞÞ ð Þ REV, and CON, presented in random order. Fifteen seconds were recorded defined as t log10 Px vS t , t P0 vS t ,wherevS t is the measured per playback, starting 3 s before the stimulus onset. Protocols were separated by vS activity at time t and τ is the average time delay between the evoked a random 10.0 ± 0.4-min interval of silence. Playback volume was set at 55 ± response and the execution pattern as calculated in Fig. 3C.Segmentsof 5 dB (calibrated with 4-kHz beeps, YF-20 Sound Level Meter, Yu Fung). For Fig. 6 evoked activity between times ti and tf were identified according to the A and B we produced syntheses with different values for the labial tension noise following conditions: (i) ΛðtÞ ≥ 0∀t ∈ ½ti , tf ,(ii) Λðt0Þ ≥ 1.3 for some t0 ∈ ½ti , tf , σ parameter β. For the element muting (Fig. 6 B and C)wereplaced Rtf ð Λð Þ Þ=ð − Þ > selected motif elements by background noise. In both experiments we included and (iii) t dt tf ti 0.5. Segments of less than 15 ms of duration were BOS, CON, and REV as controls. ti removed, and consecutive segments separated by small gaps were consolidated if the gap accounted for less than 10% of the resulting segment. This algorithm Data Analysis. Recordings were analyzed in Python using custom scripts was benchmarked against manual classification of activity segments giving (available at https://github.com/Dynamical-Systems-Lab). The envelope of qualitatively similar results. the EMG signal was calculated in the following manner. First, the Teager 2 energy operator was applied (16), defined as ϕðxiÞ = x − xi−1xi+1. Then the i ACKNOWLEDGMENTS. We thank M. A. Suarez for veterinarian support; 95th percentile for each 5-ms bin (220 samples) was determined, resulting in ’ S. Boari, A. Sanchez, R. G. Alonso, J. Lassa Ortiz, and C. T. Herbert for help a 200-Hz vector with a robust estimation of the EMG s Teager energy en- with animal care; and A. Amador and D. Margoliash for useful suggestions ϕ velope (r ). We found that this method greatly attenuated the ECG com- and discussion. This work was supported by the National Council of Scientific ponent of the signal observed in some birds (SI Appendix, Fig. S1A). We then and Technical Research (Argentina), the National Agency of Science and = × ð ϕ= ϕ Þ ϕ ϕ defined vS activity 10 log10 r r ref , where r ref is the mode of r during Technology (Argentina), the University of Buenos Aires, and the National periods of no activity. Note that vS activity is measured in dB. Institutes of Health through Grants R01-DC-012859 and R01-DC-006876.

1. Mooney R (2009) Neural mechanisms for learned birdsong. Learn Mem 16:655–669. 17. Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: A practical and 2. McCasland JS, Konishi M (1981) Interaction between auditory and motor activities in powerful approach to multiple testing. J R Stat Soc B 57:289–300. an avian song control nucleus. Proc Natl Acad Sci USA 78:7815–7819. 18. Margoliash D, Schmidt MF (2010) Sleep, off-line processing, and vocal learning. Brain 3. Margoliash D (1986) Preference for autogenous song by auditory neurons in a song Lang 115:45–58. system nucleus of the white-crowned sparrow. J Neurosci 6:1643–1661. 19. Elemans CP, Mead AF, Rome LC, Goller F (2008) Superfast vocal muscles control song 4. Dave AS, Margoliash D (2000) Song replay during sleep and computational rules for production in songbirds. PLoS One 3:e2581. sensorimotor vocal learning. Science 290:812–816. 20. Uchida AM, Meyers RA, Cooper BG, Goller F (2010) Fibre architecture and song ac- 5. Hahnloser RH, Kozhevnikov AA, Fee MS (2002) An ultra-sparse code underlies the tivation rates of syringeal muscles are not lateralized in the European starling. J Exp – generation of neural sequences in a songbird. Nature 419:65 70. Biol 213:1069–1078. 6. Hamaguchi K, Tschida KA, Yoon I, Donald BR, Mooney R (2014) Auditory synapses to 21. Gould SJ, Lewontin RC (1979) The spandrels of San Marco and the Panglossian par- song premotor neurons are gated off during vocalization in zebra finches. eLife 3: adigm: A critique of the adaptationist programme. Proc R Soc Lond B Biol Sci 205: e01833. 581–598. 7. Margoliash D, Fortune ES (1992) Temporal and harmonic combination-sensitive 22. Liberman AM, Harris KS, Hoffman HS, Griffith BC (1957) The discrimination of speech neurons in the zebra finch’s HVc. J Neurosci 12:4309–4326. sounds within and across phoneme boundaries. J Exp Psychol 54:358–368. 8. Margoliash D (1983) Acoustic parameters underlying the responses of song-specific 23. Prather JF, Nowicki S, Anderson RC, Peters S, Mooney R (2009) Neural correlates of neurons in the white-crowned sparrow. J Neurosci 3:1039–1057. categorical perception in learned vocal communication. Nat Neurosci 12:221–228. 9. Williams H, Nottebohm F (1985) Auditory responses in avian vocal motor neurons: A 24. Nelson DA, Marler P (1989) Categorical perception of a natural stimulus continuum: motor theory for song perception in birds. Science 229:279–282. Birdsong. Science 244:976–978. 10. Sturdy CB, Wild JM, Mooney R (2003) Respiratory and telencephalic modulation of 25. Prather JF, Peters S, Nowicki S, Mooney R (2008) Precise auditory-vocal mirroring in vocal motor neurons in the zebra finch. J Neurosci 23:1072–1086. – 11. Amador A, Perl YS, Mindlin GB, Margoliash D (2013) Elemental gesture dynamics are neurons for learned vocal communication. Nature 451:305 310. encoded by song premotor cortical neurons. Nature 495:59–64. 26. Rauske PL, Shea SD, Margoliash D (2003) State and neuronal class-dependent re- – 12. Goller F, Suthers RA (1996) Role of syringeal muscles in gating airflow and sound configuration in the avian song system. J Neurophysiol 89:1688 1701. production in singing brown thrashers. J Neurophysiol 75:867–876. 27. Cardin JA, Schmidt MF (2003) Song system auditory responses are stable and highly 13. Vicario DS (1991) Contributions of syringeal muscles to respiration and vocalization in tuned during sedation, rapidly modulated and unselective during wakefulness, and the zebra finch. J Neurobiol 22:63–73. suppressed by arousal. J Neurophysiol 90:2884–2899. 14. Young BK, Mindlin GB, Arneodo E, Goller F (2017) Adult zebra finches rehearse highly 28. Winfree AT (2000) The Geometry of Biological Time (Springer, New York). variable song patterns during sleep. PeerJ 5:e4052. 29. Sitt JD, Amador A, Goller F, Mindlin GB (2008) Dynamical origin of spectrally rich 15. Odum EP (1945) The heart rate of small birds. Science 101:153–154. vocalizations in birdsong. Phys Rev E Stat Nonlin Soft Matter Phys 78:011905. 16. Kvedalen E (2003) Signal processing using the Teager energy operator and other 30. Perl YS, Arneodo EM, Amador A, Mindlin GB (2012) Nonlinear dynamics and the nonlinear operators. PhD dissertation (University of Oslo, Oslo, Norway). synthesis of Zebra finch song. Int J Bifurcat Chaos 22:1250235. NEUROSCIENCE

Bush et al. PNAS | August 14, 2018 | vol. 115 | no. 33 | 8441 Downloaded by guest on October 2, 2021