uOttawa, L'Université canadienne / Canada's university
Faculté des études supérieures et postdoctorales / Faculty of Graduate and Postdoctoral Studies

Brian Heffernan (Author of Thesis)
M.Sc. (Systems Science)
Department of Systems Science

Neuronal Models of Consonance and Dissonance (Title of Thesis)

André Longtin (Thesis Supervisor)
Christian Giguère, John Lewis

Gary W. Slater, Dean of the Faculty of Graduate and Postdoctoral Studies

NEURONAL MODELS OF CONSONANCE AND DISSONANCE

by

Brian Heffernan

Thesis submitted to the Faculty of Graduate and Postdoctoral Studies in partial fulfillment of the requirements for the M.Sc. degree in Systems Science

Interdisciplinary Studies Faculty of Graduate and Postdoctoral Studies University of Ottawa

© Brian Heffernan, Ottawa, Canada, 2010

Library and Archives Canada, Published Heritage Branch
395 Wellington Street, Ottawa ON K1A 0N4, Canada

ISBN: 978-0-494-69075-8

NOTICE:

The author has granted a non-exclusive license allowing Library and Archives Canada to reproduce, publish, archive, preserve, conserve, communicate to the public by telecommunication or on the Internet, loan, distribute and sell theses worldwide, for commercial or non-commercial purposes, in microform, paper, electronic and/or any other formats. The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

In compliance with the Canadian Privacy Act some supporting forms may have been removed from this thesis. While these forms may be included in the document page count, their removal does not represent any loss of content from the thesis.

Abstract

One of the fundamental questions in music is why certain combinations of tones sound pleasant (or consonant) and others unpleasant (or dissonant). This question's importance is highlighted by the cross-cultural overlap in preferred musical tone combinations found across the globe and throughout human history, as well as the seemingly innate predisposition towards the processing of these same tone combinations by infants; the phenomenon indeed appears to be a universal one. In order to gain insight into the possible neurophysiological mechanism(s) underlying this phenomenon, plausible neuronal models are constructed and investigated in order to assess their explanatory power. This thesis investigates two different models: one which explores a correspondence between synchrony in nonlinear dynamical systems and consonance assessments, and another which explores the relationship between stochastic resonance - already a viable mechanism for pitch extraction - and consonance assessments. The results obtained using the first model indicate that the previously reported correspondence was the result of an error of analysis, and that any such correspondence is parameter dependent. A modified version of the model reflecting a higher degree of biophysical realism was constructed; initial results show an interesting modal structure in the relationship between intervals and their synchronized states. The second model further establishes the temporal coding of consonance via interspike interval statistics and shows a good correspondence with psychoacoustic consonance assessments, and a nearly perfect correspondence with musical consonance assessments. The latter result is of particular interest as it not only provides a biologically plausible account of the musical consonance of intervals, but may also provide insight into the actual neural machinery behind the development of auditory perception more generally.

Acknowledgements

I would like to thank André for providing me with the opportunity to study under his kind, enthusiastic and intellectually humbling advisement (as well as for purchasing a shiny new iMac upon which to run an endless sea of sims). Of all the areas of research I could have stumbled into, I don't believe that any of them could have been nearly so suiting to my passions and abilities. I would like to thank all of the longtinlab group for stimulating parts of my brain that I had presumed to be either dead or dormant and for being cool in general. In particular, one future Dr. Jason Boulet who, aside from sharing a passion for good beer, shared with me his exceptional knowledge of MATLAB and all things technical, as well as his insight into the perplexing conceptual models I so often wrestled with. Without his help, this would have been an even more trying process. To my good friends who spared me from a complete immersion into geekdom and nerdery - I thank thee. To Mark and to the MSC: I am thankful for all the good people I met, and for learning so much about what I do and do not want to do with my life. To my family, for always supporting me, even if they don't so much know exactly what it is I'm studying. I truly could not be more blessed. Lastly, to mononucleosis, for showing me to the woman I love. If ever anything good could have come out of being so ill, this is it. Thank you for every moment of every day that you are in my life, my dear. Kisses.

Dedication

For my mom and dad.

Contents

1 Introduction
  1.1 References

2 Background
  2.1 Music: A Brief Introduction
    2.1.1 What is music?
    2.1.2 On Origins: Why is Music?
  2.2 Systems Relevant to Music
    2.2.1 Physical: The Basics of Sound & Musical Acoustics
    2.2.2 Biological: Music and the Brain
    2.2.3 Cognitive and Perceptive: Music and the Mind
  2.3 The Basics of Auditory Neuroscience
    2.3.1 The Leaky Integrate-and-Fire Neuron Model
    2.3.2 Ordinary Measures of Neural Firing
    2.3.3 Auditory Coding Schemes
  2.4 Relevant Non-Linear Dynamics
    2.4.1 Stochastic Resonance
    2.4.2 The Sine-Circle Map & Mode-Locking
  2.5 Summary
  2.6 References

3 Article I

4 Article II
  4.1 Background
  4.2 Model
  4.3 Simulation Methods
  4.4 Results
  4.5 Analysis
  4.6 Discussion
  4.7 Conclusion
  4.8 Figures
  4.9 References

5 Conclusions
  5.1 Summary
  5.2 Final Thoughts
  5.3 References

A Code
  A.1 Shapira Lots & Stone Model
    A.1.1 diadsweepForPublication.m
    A.1.2 modelockingForPublication.m
  A.2 Cariani's Model
    A.2.1 chialvoGSR2.m
    A.2.2 chialvoGSRCaller.m

List of Tables

2.1 The Systems Relevant to Music

4.1 The Western Dyads: Consonance and Adjusted Amplitude Values

List of Figures

2.1 The Celestial Spheres of Pythagoras
2.2 Pure Tones
2.3 The Harmonic Series: Natural Modes of a String
2.4 The Human Ear
2.5 The Cochlea
2.6 The Auditory Pathway
2.7 Equal-loudness Contours
2.8 Virtual Pitch
2.9 The 'Missing Fundamental'
2.10 Spectrogram of a Violin Playing
2.11 The Western Dyads: Equal Temperament vs. Just Intonation
2.12 The Other Pythagorean Theorem
2.13 Overlapping Partials of a Perfect 5th
2.14 Beats & Roughness: The Basilar Membrane as a Series of Overlapping Band-pass Filters
2.15 Stumpf's Theory of Tonal Fusion
2.16 A Neuron
2.17 Schematic: Stimulus to Neural Response to Percept
2.18 Sensory Coding Schemes
2.19 Temporal Coding of the Auditory Nerve
2.20 Summary of Pitch Percepts Described by Temporal Coding
2.21 Stochastic Resonance
2.22 Ghost Stochastic Resonance
2.23 The Devil's Staircase
2.24 Arnold Tongues

4.1 Cariani's all-order ISIH, Harmonic Sieve, & Pitch Salience
4.2 Cariani's Consonance
4.3 Schematic of Conceptual Model
4.4 The Stochastic Resonance of an LIF - Replicating Barbi et al.
4.5 First-order ISIH's for the Consonances Under Identical Parameterization
4.6 First-order ISIH's for the Dissonances Under Identical Parameterization
4.7 All-order ISIH's for the Consonances Under Identical Parameterization
4.8 All-order ISIH's for the Consonances Under Identical Parameterization (Continued)
4.9 All-order ISIH's for the Dissonances Under Identical Parameterization
4.10 All-order ISIH's for the Dissonances Under Identical Parameterization (Continued)
4.11 First-order ISIH's for the Consonances with Adjusted Amplitudes
4.12 First-order ISIH's for the Dissonances with Adjusted Amplitudes
4.13 All-order ISIH's for the Consonances with Adjusted Amplitudes
4.14 All-order ISIH's for the Consonances with Adjusted Amplitudes (Continued)
4.15 All-order ISIH's for the Dissonances with Adjusted Amplitudes
4.16 All-order ISIH's for the Dissonances with Adjusted Amplitudes (Continued)
4.17 Signal-to-Noise Ratio: Identical Parameterization Regime
4.18 Signal-to-Noise Ratio: Identical Parameterization Regime Using Savitzky-Golay Filter
4.19 Signal-to-Noise Ratio: Adjusted Amplitudes
4.20 Signal-to-Noise Ratio: Adjusted Amplitude Regime Using Savitzky-Golay Filter

Chapter 1

Introduction

Perhaps the most fundamental question in music - and arguably the common denominator of all musical tonality - is why certain combinations of tones are perceived as relatively consonant or harmonious and others relatively dissonant or inharmonious.
-p. 287, Purves et al., Neuroscience, 3rd Ed.

Music can move us to tears of joy or sorrow, fill us with hope or antagonize us into a furious rage. It is known to calm the cries of infants and to soothe the hearts and minds of children, even to foster social cohesion among large groups of people (think battle march) or societies as a whole (think national anthem during the Olympic games). But what is music exactly? Why does music exist, and why does it evoke such strong feelings? These are by no means simple questions to answer, and many a text attempting to provide answers to them has come and gone, often adding new perspectives, tools, or techniques, but none providing a sufficiently complete explanation of all of the mechanisms at play. As is the rule according to the reductionist perspective on science: an understanding of the parts of a system and their interactions often leads to greater insight and understanding of the whole. It is thus that I endeavor to address the 'fundamental question in music' as presented by Purves et al. [8] in the opening quote. The smaller (yet noble) quest to understand consonance and dissonance is indeed a

valid place to begin an inquiry into the roots of musical appreciation, especially given the innate and universal nature of certain primary music processing capabilities and a cross-cultural predisposition towards certain tonal relationships or combinations (typically referred to as intervals or dyads in music theory). In the overview of Pitch: Neural Coding and Perception by Plack et al. [7], the following five fundamental questions are identified as those which have yet to be conclusively answered owing to the current 'huge gaps' in our knowledge of the underlying neural representations and mechanisms. They are, using concepts that will be clarified in Chapter 2:

1. How is phase-locked neural activity transformed into a rate-place representation of pitch?

2. Where does this transformation take place, and what types of neurons perform the analysis?

3. Are there separate pitch mechanisms for resolved and unresolved harmonics?

4. How does the pitch mechanism(s) interact with the grouping mechanism(s) so that the output of one influences the processing of the other and vice versa?

5. How and where is the information about pitch used in object and pattern identifi- cation?

Although answering these questions is not the focus of my work, they are inseparable from it by way of the fact that any well-grounded explanation of the assessment of consonance requires an understanding of the answer to each of these. Chapter 2 is structured in order to provide what is hopefully a sufficient introduction to the interdisciplinary study of consonance and dissonance. It reviews the basic physiology of hearing, as well as models of neural firing, especially in the presence of harmonic forcing.

Chapter 3 consists of a reprinted paper published by myself and A. Longtin in the Journal of Neuroscience Methods that makes use of the experimental evidence for phase-locked activity in the auditory brain mentioned in the first question, and which partly motivated the work of Shapira Lots and Stone [6] that attempts to explain the perception of the consonance and dissonance of intervals in terms of the nonlinear neural synchronization that they induce. In this paper we show that the results of Shapira Lots and Stone are improperly analyzed, and that upon re-analysis they hold neither in their specific case nor in general, though they are nonetheless intriguing and worthy of deeper study. In light of this, I further extend their model in an attempt to make it more biophysically realistic, presenting a novel result of a modal structure (now known to be called the Multiple-Devil's Staircase) not previously witnessed in (model) neurons, nor considered in relation to sensory perception insofar as I am aware.

In Chapter 4, I present a manuscript of a paper (soon to be submitted for review) in which A. Longtin and I show that a temporal coding scheme can be applied to the output of a single stochastic dynamical system, the resulting analysis indicating a possible explanation of so-called musical consonance. It is this very same temporal coding scheme - which is due to the large body of work by Cariani and his colleagues [2, 1, 4, 5, 11, 3, 10] - that best answers each of the questions above to my best determination. It is my sincere belief that this result, when taken together with Cariani's result on the so-called psychoacoustic consonance [3], provides the strongest neurophysiologically plausible account of consonance to date, adding to the already robust capabilities of the coding scheme. This is not to be aggrandizing, as I acknowledge with great pleasure that these two results effectively amount to the grounding of the seminal work by Terhardt [9] that had previously produced an algorithm for virtual pitch extraction (see Chapter 2) capable of providing a similar accounting, and upon which his many insightful conclusions regarding consonance in general were drawn. In addition to the results on musical consonance, I submit that the stochastic dynamical model employed throughout the paper effectively captures the general behavior of the auditory nerve (at least insofar as the processing of intervals is concerned), and is thus worthy of future study regarding its ability to capture other observed behaviors, as it presents what may very well be the least computationally expensive viable dynamical model of the said nerve. Finally, I conclude in Chapter 5.

References

[1] P. Cariani. Temporal coding of sensory information. Proceedings of the annual conference on computational neuroscience, 1997.

[2] P. Cariani. Temporal coding of periodicity pitch in the auditory system: An overview. Neural Plast, 6(4):147-172, 1999.

[3] P. Cariani. A temporal model for pitch multiplicity and tonal consonance. Proceedings of the Eighth International Conference on Music Perception and Cognition (ICMPC), 2004.

[4] P. Cariani and B. Delgutte. Neural correlates of the pitch of complex tones, i. pitch and pitch salience. Journal of Neurophysiology, 76(3):1698, 1996.

[5] P. Cariani and B. Delgutte. Neural correlates of the pitch of complex tones, ii. pitch shift, pitch ambiguity, phase invariance, pitch circularity, rate pitch, and the dominance region for pitch. Journal of Neurophysiology, 76(3):1717, 1996.

[6] I. S. Lots and L. Stone. Perception of musical consonance and dissonance: an outcome of neural synchronization. J R Soc Interface, 5(29):1429-1434, 2008.

[7] C. J. Plack, A. J. Oxenham, R. R. Fay, and A. N. Popper. Pitch: neural coding and perception. Springer: New York, 2005.

[8] D. Purves. Neuroscience. Sinauer Associates, Inc., 4th edition, 2007.


[9] E. Terhardt. Pitch, consonance, and harmony. J Acoust Soc Am, 55(5):1061-1069, 1974.

[10] M. Tramo, P. Cariani, B. Delgutte, and L. Braida. Neurobiological foundations for the theory of harmony in western tonal music. Annals of the New York Academy of Sciences, 2001.

[11] M. Tramo, P. Cariani, C. Koh, and N. Makris. Neurophysiology and neuroanatomy of pitch perception: auditory cortex. Annals of the New York Academy of Sciences, 1060:148, Jan 2005.

Chapter 2

Background

'Music expresses that which cannot be said and on which it is impossible to be silent.' -Victor Hugo

In the preface to Hermann von Helmholtz's foundational work On the sensations of tone as a physiological basis for the theory of music, he makes the following cautionary claim for the benefit of the reader with regards to his revised section on the History of Music:

I must, however, request the reader to regard this section as a mere compilation from secondary sources; I have neither time nor preliminary knowledge sufficient for original studies in this difficult field. The older history of music to the commencement of Discant, is scarcely more than a confused heap of secondary subjects, while we can only make hypotheses concerning the principal matters in question. Of course, however, every theory of music must endeavour to bring some order into this chaos, and it cannot be denied that it contains many important facts. [61]

I offer up the very same cautionary claim, not only as it pertains to the history of music or music in general, but also as it pertains to gross anatomy, neuroanatomy, neurobiology, neurophysiology, etc. The tools that are at my disposal are mainly those of mathematics,

physical modeling, and computer simulation. The tool-bag of the nonlinear dynamicist does nonetheless possess a vast range and power of use and is certainly a requisite in the field of computational neuroscience, within which my work takes its roots. It is thus that I refer the reader to one or more of the authoritative texts referenced throughout the background for a more thorough explanation or list of reading materials in order to satisfy any further curiosities not addressed in my functionally minimalist contextual grounding of my own original studies in this 'chaos'.

In order to achieve such a grounding I will provide both the conceptual and mathematical tools required to objectively model the major determinants of the subjective perceptions of intervallic consonance.

2.1 Music: A Brief Introduction

An intense interest in music shows readily in humans from the early days of life. Such an interest has been shown to be unique to humans and is known to be pervasive in occurrence throughout history and across nearly all cultures. Presently, the ease of access to music provided by the Internet has led to an explosion of music distribution and listening all over the world. The massive success of services such as MySpace, iTunes, XFM Radio, etc. all point quite clearly to our passion for the consumption of music (the iTunes Store has sold more songs than there are people on the earth since its inception in April of 2003 [26]). But what exactly is music and why are we so interested in it? Although these two questions remain subjects of a debate whose finer points are often disputed among musicians, musicologists, philosophers, developmental psychologists, auditory neuroscientists and the like, I will attempt to provide a basic working understanding of what music is (reliant upon the assumption of at least a lay knowledge on the part of the reader) as well as an overview of the current theory of the origins of music before delving further into specifics.

2.1.1 What is music?

At its most basic, music is the art of arranging sounds (tones) in time. Typical of such arrangements in nearly all cultures is the production of a continuous, often emotionally evocative composition, created by structured rhythmic successions of tones (melody) and the superposition of tones (harmony). Such tones are produced either vocally or by an instrument. The timbre of the voice or instrument refers to the combination of sound qualities that distinguishes it from other tones of the same pitch and loudness. Pitch refers to the subjective sensation of the 'height' of a sound, whereas loudness refers to the sensation of 'volume'. These three aspects of musical sound - pitch, timbre, and loudness - are considered to be the three primary sensations of all musical sounds across all cultures without exception. In almost all cases, a discrete set of tones is selected from the continuous set of audible frequencies, forming a scale whose elements are then combined melodically and harmonically. Often, distinct styles or 'genres' of music make use of a particular scale, the difference arising from either the number of elements, or the frequency ratios of the tones therein, or both.

To be sure, music is often self-transcending, as is typical of all arts. Movements such as 'free jazz' of the 1950's and 1960's saw musicians play in a temporally unstructured manner; accelerating or decelerating and even playing in unrelated tempi with respect to one another. Atonal music has not been uncommon throughout history either; that is, music that has no tonal center or 'key'. However, the vast majority of musical experience both east and west is tonal and rhythmically structured. Furthermore, and of central interest to the author, nearly all scales that have emerged throughout history possess tones whose frequencies are related by simple ratios [43]. Take for instance the octave - defined as the interval between one musical pitch and another where one note's frequency is twice that of the other (further octaves being 2^n for n in Z, and the reciprocals). The octave is common to nearly all musically advanced cultures [52] and has been referred to as the 'basic miracle of music'. All octaves are heard to be essentially identical even though one is of greater pitch height than the other (a phenomenon known as octave equivalency). Several other intervals are quite common, such as the Perfect Fifth (P5) and the Perfect Fourth (P4). It has been hypothesized that these intervals, as with the octave, are naturally derived due to the processing of the harmonic series of normal pitched sounds encountered in the natural world [57]. For example, the Perfect 5th is typically defined as an interval composed of two tones in a ratio of 3:2, and can thus be identically defined as the interval between the second and third harmonics of a pitch. The Perfect Fourth can be similarly defined as the distance between the third and fourth harmonics. In fact, this is the case for all intervals common to western music, the difference being that intervals defined by larger ratios (such as 45:32 for the tritone) are formed by elements of the harmonic series that either do not typically get voiced, or if they do, they possess relatively little power compared to the lower harmonics.

Although there are clear differences that do occur between musical systems past and present, the mere fact that "the available evidence indicates that infants are sensitive to a number of sound features that are fundamental to music across cultures... [and] their discrimination of pitch and timing differences and their perception of equivalence classes are similar, in many respects, to those of listeners who have had many years of exposure to music" [60] allows one to conclude that much of our music processing abilities are innate and that we are predisposed towards certain tonal relationships [60, 54, 43]. Much in the same way that scientists and mathematicians search for invariances, perhaps it is the brain's categorization of pitch invariance based on simple rational frequency relationships (or simple counting) that results in our perceptual organization of sounds. Whatever the reason, it comes as little surprise that the assessment of an interval's consonance and/or dissonance is largely invariant across all cultures [58]. It is difficult to discern the cause of the inconsistencies [55] in the results presented across various studies due to the variations in methodologies, subjects, tone type, metrics, etc. Nonetheless, clear trends emerge in the data. These are the trends that I seek to understand by way of neural models.
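The correspondence between the common dyads and adjacent members of the harmonic series can be made concrete with a short numerical sketch. The MATLAB/Octave fragment below is an illustration of my own (it is not taken from the thesis appendix, and the 100 Hz fundamental is an arbitrary choice); it lists the ratios formed by successive harmonics, recovering 2:1 (octave), 3:2 (Perfect 5th), 4:3 (Perfect 4th), and so on.

    % Adjacent members of the harmonic series form the 'natural' intervals.
    f0 = 100;                      % arbitrary fundamental frequency (Hz)
    harmonics = f0 * (1:6);        % first six members of the harmonic series
    for k = 2:numel(harmonics)
        fprintf('harmonics %d and %d: ratio %d:%d (%g Hz against %g Hz)\n', ...
                k, k-1, k, k-1, harmonics(k), harmonics(k-1));
    end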

Finally, although music mainly relies on the sense of hearing (or audition), whereby sound waves are detected by the ear and transduced into nerve impulses that are perceived by the brain, it cannot be said that the experience of music is exclusive to those with a sense of hearing. In fact, it is well known that deaf people - those who cannot hear or have very limited hearing - are able to experience music by feeling vibrations in their body. Furthermore, there have been many famous deaf musicians throughout history - including Ludwig van Beethoven, who wrote one of the most acclaimed symphonies of all time after having completely lost his hearing. This is relevant in that it goes to show that the phenomenon of music occurs at a deeper cognitive level and in a more subtle way than merely being 'pleasing to the ear'.

2.1.2 On Origins: Why is Music?

Music has certainly had and continues to have an enormous impact on human civilization - we have even sent it out into space on Voyager 1


Figure 2.1: Pythagoras' Celestial Spheres and the Western Scale. Reprinted from [56].

2.2 Systems Relevant to Music

Any reasonable approach towards the understanding of music requires a 'systems approach'. Table 2.1 shows the various physical and biological systems involved in the emission, transmission and reception of musical sounds [51]. Missing however is any hint of the mental system; that is, the cognitive and affective functions that are performed when a subject listens to and interacts with music.

2.2.1 Physical: The Basics of Sound & Musical Acoustics

Sound is a travelling wave created by a series of alternating longitudinal compressions and expansions of air known as oscillations that are transmitted through a medium (either solid, liquid, gas, or plasma, but typically air). It is thus that a sound wave is a form of elastic energy. Such an oscillation is typically referred to as a sound if it is composed of frequencies within a normal range of audition (or hearing), and is of a sufficient intensity to be heard. Note that the range of audition changes both across as well as within species, and thus a 'sound' to a cat for instance may be imperceptible to a human, as we typically possess a range of 20Hz-20kHz, although normal adult hearing usually degrades over time, resulting in an upper range of 15-17kHz. As one might expect, this range encompasses the range of normal human vocalizations as well as many meaningful environmental sounds relevant to decision making. Sound waves are characterized by the properties of frequency, wavelength, period, amplitude, intensity, speed, and direction, as is any generic wave (see Eq. 2.1). The scientific study of the manner in which sound waves are propagated, absorbed, and reflected is called acoustics. The branch of acoustics involved with the research and description of the physics of music specifically is called musical acoustics.

Table 2.1: Physical and biological systems relevant to music, and their overall functions. Reproduced from [51].

              System                 | Function
    Source:   Excitation mechanism   | Energy supply
              Vibrating element      | Determination of fundamental tone characteristics
              Resonator              | Conversion into air pressure oscillations (sound waves), final determination of tone characteristics
    Medium:   Medium proper          | Sound propagation
              Boundaries             | Reflection, absorption, reverberation
    Receptor: Eardrum                | Conversion into mechanical oscillations
              Inner ear              | Primary frequency sorting, conversion into nerve impulses
              Nervous system         | Processing, identification, storage, and transfer to other brain centers

Propagating Sound Waves

The following sinusoidal function is very often used to describe sound waves as they occur in time t:

y(t) = A sin(kx - ωt + φ) + D    (2.1)

• A is the amplitude (maximum deviation from the center), usually expressed in log units known as decibels (dB)

• ω is the angular frequency (in radians per second)

• x is the position of the wave in space, with associated wavenumber k

• φ is the phase (a negative value indicates a delay or lag; a positive value a 'head-start' or lead)

• D is the center amplitude, corresponding to the mean sound pressure

The commonly referenced quantities of frequency f, expressed in cycles per second, or Hertz (Hz), wavelength λ, and the speed of propagation c are derived from the following relation between wavenumber and angular frequency: k = 2π/λ = 2πf/c = ω/c. Note that for all of the author's purposes the spatial distribution of the sound wave can be neglected, thus reducing the equation to its one-dimensional form as seen in Eq. 2.2. This is reasonable since in no way do we investigate binaural hearing effects on consonance, and we therefore do not need to model sound wave differences as received by each ear. Nor do we consider the spatial distribution across one ear; rather we consider the ear as a point receptor in space receiving the propagating wave; consequently, kx is fixed and henceforth deemed a constant phase shift that we neglect.
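As a quick numerical check of the relation above, the following sketch (my own illustration; it assumes a speed of sound of roughly 343 m/s in air at room temperature) computes the wavelength, wavenumber, and angular frequency of a 440 Hz tone and confirms that k = ω/c:

    c      = 343;               % assumed speed of sound in air (m/s)
    f      = 440;               % frequency of the tone (Hz), e.g. A 440
    lambda = c / f;             % wavelength (m)
    k      = 2*pi / lambda;     % wavenumber (rad/m)
    omega  = 2*pi * f;          % angular frequency (rad/s)
    fprintf('lambda = %.3f m, k = %.2f rad/m, omega/c = %.2f rad/m\n', ...
            lambda, k, omega/c);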

Pure Tones and Harmonics

The simplest form of a sound wave has a sinusoidal waveshape and is known as a pure tone. It is thus that pure tones are characterized by a simple sinusoidal function, as shown in Eq. 2.2 with parameters as indicated above. Pure tones do not possess any harmonics (or overtones, described below). Since natural oscillatory phenomena are typically nonlinear and involve harmonics, pure tones are not naturally occurring [51, 47].

y(t) = A_0 sin(f_0 t + φ_0)    (2.2)
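A sampled version of Eq. 2.2 is straightforward to generate. The short sketch below is my own minimal illustration (not part of the thesis code); it writes the argument as 2π f_0 t so that f_0 can be specified in Hz:

    fs   = 44100;                    % sampling rate (Hz)
    dur  = 1.0;                      % duration (s)
    t    = 0 : 1/fs : dur - 1/fs;    % time vector
    A0   = 0.5;  f0 = 440;  phi0 = 0;
    y    = A0 * sin(2*pi*f0*t + phi0);   % pure tone at f0
    % sound(y, fs);                  % uncomment to listen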

The upper harmonics of a frequency f_0 (referred to as the fundamental frequency) are its integer multiples 2f_0, 3f_0, 4f_0, ..., etc., the set of which are often referred to as the harmonic series. It is worth pointing out that the terminology regarding harmonics is at times inconsistent regarding whether or not the 'first harmonic' refers to 1f_0 or 2f_0, though it is typically used to refer to the fundamental frequency f_0 (which will be its usage throughout the remainder of this thesis). The issue is further confused by the employment of the term overtone, which is identical to a harmonic, but which is used consistently with respect to its numbering (the first overtone is the second harmonic 2f_0). The subharmonic frequencies or subharmonics of a fundamental frequency are the frequencies that lie below f_0 and oscillate with a frequency of f_0/n, where n ∈ Z+. The subharmonics are in a sense the mirror image or reciprocals of the upper harmonics, and the term undertone is therefore used in an analogous fashion to the employment of overtone. The discovery of a numerical relationship between a fundamental frequency and its overtones dates back to Pythagoras, whose experiments with simple rational divisions of a string led to the characterization of the modes of vibration of a string and the so-called natural intervals in music theory, as well as the numerical definition of the octave, all of which bear a tight association to the harmonic series (see Fig. 2.3) [23].

Figure 2.2: A variety of pure tones of varying frequency and intensity shown progressing through time from left to right. Reprinted from [71].


Figure 2.3: The fundamental frequency of a string and its first six overtones (2f_0 - 7f_0) are displayed as formed by the simple rational division of a string. Thus, the natural modes of vibration of the string correspond to its harmonics, as is the case for the majority of western musical instruments (though some instruments show a stronger resonance at either the odd or the even harmonics only as a result of their shape). Reprinted from [69].

Complex Tones

A complex tone is defined as "any sound with more than one frequency component that evokes a sensation of pitch" [47]. Complex tones can be either periodic (or harmonic, as they are often called), or aperiodic (inharmonic). Harmonic complex tones comprise the vast majority of all tonal sounds encountered in the physical environment (speech, music, etc.), and they are thus the stimuli of choice by researchers, as easily witnessed in the literature. All complex tones can be described as the combination of sine waves or partials (a harmonic complex tone's partials are thus actual harmonics) [23]. It is common to measure inharmonicity as the deviation of a partial from its nearest harmonic neighbor (the departure being measured relative to the fundamental frequency of the complex tone) [27]. Harmonic complex tones can be described as follows (from hereon out, complex tone will be used to refer to a harmonic complex tone unless otherwise stated):

y(t) = A_0 sin(f_0 t + φ_0) + Σ_{k=1}^{n} A_k sin((k+1) f_0 t + φ_k)    (2.3)

Here the complex tone is comprised of n+1 partials. The parameters in this equation are identical to those of Eq. (2.1).
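Eq. 2.3 translates directly into a synthesis loop. The sketch below is my own illustration (the 1/k amplitude roll-off and the 220 Hz fundamental are arbitrary choices, and the argument is again written as 2π(k+1)f_0 t so that f_0 is in Hz):

    fs  = 44100;  dur = 1.0;
    t   = 0 : 1/fs : dur - 1/fs;
    f0  = 220;                        % fundamental frequency (Hz)
    n   = 5;                          % number of upper partials (n+1 partials in total)
    A   = 1 ./ (1:n+1);               % arbitrary 1/k amplitude roll-off
    phi = zeros(1, n+1);              % zero phases for simplicity
    y   = zeros(size(t));
    for k = 0:n
        y = y + A(k+1) * sin(2*pi*(k+1)*f0*t + phi(k+1));   % (k+1)-th harmonic
    end
    y = y / max(abs(y));              % normalize to avoid clipping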

2.2.2 Biological: Music and the Brain

The auditory system acts to transduce oscillations in air pressure (sound waves) into meaningful electrical or neural activity in the brain, which is itself typically integrated with other meaningful information from the various other sensory systems, all in order to positively guide or influence the behavior or state of the organism. For a more comprehensive overview of the auditory system see Neuroscience by Purves et al. [48], an authoritative text of broad scope from which the following information has been derived and synthesized.

The Ear

The ear is typically divided into three parts: the outer ear, the middle ear, and the inner ear, as seen in Fig. 2.4 [48]. The outer ear, which consists of the pinna, the concha, and the auditory meatus, is responsible for gathering sound energy in order to focus it on the eardrum or tympanic membrane. The pinna and the concha also provide additional functionality in aiding to establish the elevation of a sound source due to their selective filtering of frequencies; an ability resulting from the asymmetrical shape of the ear. It is thus that when two identical sounds are emitted, one from above and one from below, the outer ear acts to transmit more of the higher-frequency components of the elevated sound even though the two sounds are physically identical. Another crucial function attributable to the outer ear is the direction-dependent amplification of sound pressure by a factor of between 30-100 times for sounds centered around 3kHz. It is this amplification that makes human hearing particularly prone to ear damage between 2-5kHz, which happens to be the frequency range about which the most meaningful aspects of human speech sounds are concentrated (e.g. the phonemes - the smallest units of sound employed to form meaningful distinctions in speech), even though the normal range of human vocal pitch corresponds to a range between roughly 100-400Hz [48].

Once the sound wave enters the middle ear, the eardrum acts to convert its energy into mechanical oscillations, thus completing the first major step in the transduction pathway. The major function of the middle ear is to further amplify the pressure of the oscillations - here by a factor of 200 - in order to ensure the proper transmission of sound energy across the air-fluid membrane as the energy enters into the cochlea of the inner ear. This is accomplished by the activation of three ossicles by the eardrum, known as the malleus (hammer), the incus (anvil), and the stapes (stirrup), which act as a system of levers, the footplate of the stapes ultimately striking the oval window - a membrane of significantly smaller scale relative to the tympanic membrane that lies at the intersection of the middle ear and the inner ear.

The inner ear is comprised of two main functional parts: the cochlea, which is dedicated to hearing, and the vestibular system, which is dedicated to balance. The cochlea (Fig. 2.5) is responsible for separating the sound wave into its frequency components, thus effectively performing a Fourier transform on the sound signal. This transformation is due to a complex process that ultimately results in the bending of tiny processes known as stereocilia, which protrude from the apical end of hair cells and sit atop the basilar membrane at distances from the oval window that correspond to specific frequencies. As the traveling wave initiated at the oval window propagates from the base of the basilar membrane, it causes local maximal displacements of the membrane at the sinusoidal frequency components of the sound, thus displacing the stereocilia sufficiently far from their resting state, resulting in the depolarization of the hair cell, and ultimately causing the release of neurotransmitter onto the nerve ending of the auditory nerve. This is the final step in the transduction pathway from sound wave to neural impulse. The representation of frequency along the length of the cochlea is known as tonotopy - a place-frequency representation that is maintained throughout most of the auditory circuit.


Figure 2.4: The human ear. Reprinted from [48].


Figure 2.5: The human inner ear. The cochlea - Latin for snail - is seen here as though it were unfolded, in order to emphasize its tonotopic structure. Reprinted from [48].

The Brain

The first stage of central auditory processing in the brain occurs at the cochlear nucleus. From here, the information from the peripheral (or sensory) auditory system diverges, splitting into a number of parallel central pathways [48]. One stream of information is sent to the superior olivary complex, which is the first place that information from both ears interacts, processing the cues that allow for three-dimensional sound object localization. The cochlear nucleus also projects to the inferior colliculus, which acts as a major integrative center and is thought to be responsible for pitch extraction [8, 36]. The output of the inferior colliculus is sent upstream to the thalamus and cortex, where additional integrative aspects (both spectral and temporal) of sound significant to speech and music are processed.

A detailed rendering of the neuroanatomical pathway associated with the auditory system reveals a disproportionately large number of relay stations, indicating that the perception of sound is an especially intensive neural process as compared to other sensory systems [48]. Of particular interest, recent studies show a distinct hemispheric specialization for the processing of speech (left brain) and music (right brain). In fact, it has been shown that trained musicians show a significant left-ear advantage for the discrimination of dissonant tone combinations [33]!


Figure 2.6: The auditory pathway in the human brain. Reprinted from [48].

2.2.3 Cognitive and Perceptive: Music and the Mind

In order to make Roederer's table of the physical and biological systems more complete from a holistic perspective (as regards one single subject), a description of the components involved in the mental system of the conscious observer (or the subject) is required. Although psychology and cognitive science are outside of my domain of specialization, I think it is safe to posit that the mind is (at least partly) responsible for the functions of perception, thought, memory, affect, willful action, imagination, etc. as they pertain to the experience of music. The field of 'Musical Cognition and Perception' is complicated by its holarchic order (its order of emergence); that is, a deep knowledge of the cognitive and perceptual aspects of music usually requires a knowledge of the underlying physical and biological components involved, whereas the converse is not generally true. All this to draw attention to the highly interdisciplinary and often complex (and complicated!) nature of the terrain.

Psychoacoustics & The Perception of Sound

Psychoacoustics is the study of the human perception of sounds. It is a subfield of psychophysics (listed above). The aim of the field is to understand and describe the psychological correlates of the physical parameters of acoustics and music. There are several subdisciplines in psychoacoustics that address the noetic end of the spectrum, such as and , as well as similar or partially overlapping fields such as psycholinguistics. A dearth of psychoacoustic research exists however that takes as its sole aim the assessment of the human anatomical and perceptual limitations of hearing. This work typically involves assessments of one or more of the following: the audible frequency range, the absolute threshold of hearing, sound masking, perfect pitch, , and assessments of consonance & dissonance.

Loudness

Loudness is a subjective measure or a psychological attribute most strongly associated with the physical quantity known as the sound pressure level (SPL) of a sound. Sound pressure refers to the pressure deviation caused by a sound relative to the average pressure of the environment within which it is sounded, and is measured in pascals (Pa). For normal conditions of temperature and pressure, the SPL of a sound is the logarithmic measure of the average pressure variation given in decibels (dB) and is determined by the following equation:

SPL = 20 · log10(Δp_rms / Δp_0)    (2.4)

where Δp_rms is the root-mean-square of the pressure variation, and Δp_0 = 20 µPa is the minimum pressure variation required to evoke a percept at 1000Hz by convention. The equal-loudness contours shown in Fig. 2.7 show clearly how the percept of loudness relies upon both the SPL and the frequency of the sound wave. The absolute threshold of hearing is defined as the minimum SPL of a pure tone that elicits a sound percept by a subject with normal hearing in the absence of noise and is represented by the lowermost contour. The uppermost contour corresponds to the threshold of pain. This threshold is frequency, age, and sex dependent [50].
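A worked example of Eq. 2.4 (my own illustration; the 0.1 Pa figure is an assumed value, not a measurement from the thesis):

    dp_ref = 20e-6;                  % reference pressure, 20 micropascals
    dp_rms = 0.1;                    % assumed RMS pressure variation (Pa)
    SPL    = 20 * log10(dp_rms / dp_ref);
    fprintf('%.2f Pa RMS corresponds to %.1f dB SPL\n', dp_rms, SPL);
    % 20*log10(0.1 / 20e-6) = 20*log10(5000), roughly 74 dB SPL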

Pitch

Although the majority of people have some sort of intuitive understanding of what is referred to by the word pitch, defining the word unambiguously and without objection from all scrutinizing minds is no simple task. Indeed, many different definitions have been proposed at various times, with the common theme being that all such sensible definitions can be divided into two camps: those that connect pitch and the musical scale, and those that do not. I have chosen to use the 'operational' definition cited by Plack and Oxenham [47]. That is: "a sound can be said to have a certain pitch if it can be reliably matched by adjusting the frequency of a pure tone of arbitrary amplitude" [30]. The assumption that the listener is matching along the dimension of pitch rather than on loudness or timbre is reasonable given that the pure tone is of 'arbitrary amplitude', and that a pure tone possesses a minimal spectral overlap with any sound capable of eliciting an identical pitch percept. I will warn however that this should not necessarily lead the reader to assume the existence of a one-to-one correspondence between stimulus frequency and pitch percept, as even the best trained ears are unable to discriminate compressions or stretches in complex tones to better than within 1% of the initial frequency [58]. When speaking of pitch, one often makes the distinction between a spectral pitch and a virtual pitch. The single pitch evoked by a complex tone corresponding to the pitch evoked by the frequency of the greatest common divisor (gcd) of the complex tone (i.e. the fundamental frequency) is called a virtual pitch. This contrasts with the pitch evoked by a pure tone, which is called a spectral pitch.

Figure 2.7: Equal-loudness contours (red, from the ISO 226:2003 revision; the original ISO standard is shown in blue for 40 phons). Reprinted from [65].


Figure 2.8: The image is a visual analogue to virtual pitch and the missing fundamental (described below) as the brain interprets contours that are not actually present, producing an illusory percept. Reprinted from [57].

The 'Missing Fundamental'

Any two relatively prime tones (i.e. two tones with a greatest common divisor of 1, or gcd(f_1, f_2) = 1) in the upper harmonic series, when sounded together, generate a sound waveform with a repetition rate of the fundamental frequency f_0, and so induce a pitch percept corresponding to f_0 even though this frequency is absent in the stimulus itself. This phenomenon, which is an instance of virtual pitch, is known as the missing fundamental and can be seen in Fig. 2.9.


Figure 2.9: The superposition of two periodic tones with frequencies 200Hz (red) and 300Hz (green), which are components of the harmonic series of 100Hz, reveals a complex waveform that repeats at a rate of 100Hz (blue) - and would thus elicit a pitch percept at the so-called missing fundamental. Reprinted from [69].
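The repetition rate in Fig. 2.9 follows directly from the greatest common divisor of the component frequencies. The sketch below is my own illustration of that arithmetic, and of the fact that no energy is present at 100 Hz in the summed stimulus:

    f1 = 200;  f2 = 300;             % the two components of Fig. 2.9 (Hz)
    f_fund = gcd(f1, f2);            % greatest common divisor: 100 Hz
    fprintf('gcd(%d, %d) = %d Hz, period %.0f ms\n', f1, f2, f_fund, 1000/f_fund);

    % The summed waveform repeats at f_fund even though that frequency is absent:
    fs = 44100;  t = 0 : 1/fs : 0.02;
    y  = sin(2*pi*f1*t) + sin(2*pi*f2*t);   % no 100 Hz sinusoid in the stimulus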

Timbre

Timbre is the most difficult of the three primary sensations of a musical sound to define in a satisfactory fashion. Amusingly, timbre has been referred to as "the psychoacoustician's multidimensional wastebasket for everything that cannot be qualified as pitch or loudness" [42]. Heavily subjective terms such as tone colour, tone quality, and 'the shape of the sound' are common lay terms that are often used synonymously. Regardless of its more difficult definition, the average music listener is certainly able to differentiate quite easily between a piano and a violin, a guitar and a trumpet, etc. It is this characteristic sound quality of an instrument that the term timbre refers to. More recently, the modern tools of spectral analysis - specifically the spectrogram - have allowed for the quantification of musical sounds in terms of their spectral content, as well as the temporal qualities referred to as attack, sustain, decay and release [4]. Since any sound can be broken down into its constituent frequencies by way of a Fourier transform, a plot of a sound's frequency content as a function of time can be made. This plot is known as a spectrogram, and its characteristic form for the normally sounded notes of a given instrument amounts to the quantification of the physical attributes that drive the percept of timbre. Fig. 2.10 shows the spectrogram of a violin. From an auditory neuroscience perspective, the static sensation of timbre "emerges as the perceptual correlate of the activity distribution evoked along the basilar membrane - provided that the correct distance relationship among resonance peaks is present to bind everything into a 'single-tone' sensation" [51]. Although the complex task of timbre identification is likely to be taxing on the brain, its behavioral relevance is quite clearly paramount - as distinguishing between the sounds of a wolf and a deer can quickly determine which side of dinner a hunter ends up on!

Figure 2.10: The spectrogram of a violin playing. Here the ordinate is linear in frequency (ranging from 190Hz-10kHz), the abscissa is time, and the intensity coloring is logarithmic (with black being -120dBFS). Notice the visible harmonic structure indicated by the appearance of stacked bands. Reprinted from [69].
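A spectrogram qualitatively similar to Fig. 2.10 can be produced from a synthetic harmonic tone. The sketch below is my own illustration and assumes MATLAB's Signal Processing Toolbox (for the spectrogram and hann functions); the stacked horizontal bands appear at the harmonics k·f_0:

    fs = 44100;  t = 0 : 1/fs : 2 - 1/fs;
    f0 = 440;
    y  = zeros(size(t));
    for k = 1:8
        y = y + (1/k) * sin(2*pi*k*f0*t);       % simple harmonic complex tone
    end
    spectrogram(y, hann(2048), 1024, 2048, fs, 'yaxis');   % time-frequency plot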

The Octave & Pitch Class

As previously mentioned, the octave is the interval between one musical pitch and another at double the frequency (further octaves being 2^n for n in Z). Thus, the nearest octaves of the pitch resulting from a tone of 440Hz (named A 440 or A4) are the pitches perceived by tones occurring at 880Hz (A5) and 220Hz (A3). Implicit in the notation employed is the concept of pitch class, which results from the human perception of identical "tone color" of octaves (subjectively, the tones sound the same - just higher or lower relative to one another). This quality of pitch is known as chroma, and thus a pitch class is defined as the set of all pitches with identical chroma.
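A tiny sketch (my own illustration) of the 2^n structure of a pitch class, using the A's surrounding A 440:

    A4 = 440;                          % reference pitch A 440 (Hz)
    n  = -2:2;                         % two octaves below to two octaves above
    octaves = A4 * 2.^n;               % 110, 220, 440, 880, 1760 Hz
    fprintf('A%d = %6.0f Hz\n', [n + 4; octaves]);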

Musical Scales, Keys, Tuning Systems, & Intervals

A musical scale is simply a discrete set of pitches. Although there are innumerably many scales, the traditional scales of western music consist of seven notes selected purposefully from the twelve-note chromatic scale - the familiar A, A#/Bb, B, C, C#/Db, etc. The key feature of the chromatic scale is that each element of the scale lies a half-step or a semitone apart. What constitutes a step or a whole tone in the chromatic scale depends on the tuning system employed (described below). Western scales are therefore formed by selecting a subset from among the tones of the chromatic scale. Typically, scales are formed according to rules of selection. For instance, the Major Scale is formed by selecting a reference pitch from the chromatic scale (which determines the key of the scale) and by then further selecting pitches according to the following rule of tonal stepping: 'Whole:Whole:Half:Whole:Whole:Whole:Half'. Beginning with middle-C or C4 for instance, this yields the C-Major scale consisting of the notes C4-D4-E4-F4-G4-A4-B4-C5, and all of the notes that lie in the same pitch-class as any of these notes. Thus, the scale is referred to as a 7-note scale, and music created by the scale is said to be in the key of C. The concept of a musical key and its implications is not as simple a subject as it may appear by definition, but each key is typically recognized as having different tonal qualities (C-Major is often referred to as 'happy').

Although the octave is a key feature of all western tuning systems, it is the division of the intermediate frequency range based on various criteria that differentiates one system from another. The two most commonly used of these western tuning systems are Just Intonation and Equal Temperament [23]. Equal Temperament (commonly abbreviated as 12Tet [32]) is the most common tuning system used today. It is formed by dividing the octave into twelve logarithmically equal parts (of 100 cents each) and relative to the standard pitch elicited by a 440Hz tone or A 440. This method of division yields a constant frequency ratio of 2^(1/12) (the twelfth root of 2) between successive notes in the chromatic scale [51]. The advantage of equal temperament is that all intervals possess the same 'character' in any key (an interval refers to the simultaneous sounding of any two tones selected from within the chromatic scale). This can be contrasted with Just Intonation, which is any musical tuning in which the frequencies of notes are related by ratios of natural numbers [23]. As an example, the seven notes in the C-major scale are related sequentially by the following ratios: 1:1, 9:8, 5:4, 4:3, 3:2, 5:3, 15:8, 2:1. Practical difficulties with such a just tuning exist however (which largely motivated the construction of 12Tet), such as the existence of so-called wolf intervals (anomalous intervals producing highly dissonant beating in lieu of their intended consonant quality), as well as requiring the complete retuning of many instruments in order to change from one key to another. The resulting difference between what are considered musically to be the 'same intervals' produced from the 'same notes' of the chromatic scale [23] is shown in Fig. 2.11.
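The rule of tonal stepping quoted above can be made concrete with a short sketch (my own illustration; the chromatic spellings use sharps only):

    % Build the C-Major scale from the chromatic scale using the step pattern
    % Whole:Whole:Half:Whole:Whole:Whole:Half (whole = 2 semitones, half = 1).
    chromatic = {'C','C#','D','D#','E','F','F#','G','G#','A','A#','B'};
    steps = [2 2 1 2 2 2 1];              % semitone steps of the Major Scale
    idx   = 1 + [0, cumsum(steps)];       % positions within the chromatic scale
    scale = chromatic(mod(idx - 1, 12) + 1);
    fprintf('%s ', scale{:});  fprintf('\n');   % prints: C D E F G A B C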

The differences evident in the figure are typically within a few cents of one another, the majority of which lie within or very near to the so-called 'just-noticeable-difference' (JND) - a concept frequently employed by psychophysicists which happens to have a value of between 8-20 cents for pitch assessments [23]. Some reasons for the emergence and large-scale acceptance of these two tuning systems will be presented in the following section, and the theoretical neurophysiological reasons for their appeal are further addressed in the analysis and discussion of the manuscript presented in Chapter 4.


Figure 2.11: The western musical intervals (dyads) as defined according to the Equal Temperament tuning system (black) and the Just Intonation tuning system (blue). Although they are nearly identical, both systems have advantages and disadvantages as they relate to the balance between ideal psychoacoustic consonance and ideal musical consonance. Reprinted from [66].
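The few-cent differences between the two tuning systems in Fig. 2.11 are easy to tabulate. The sketch below is my own illustration; the thesis lists only the seven major-scale ratios explicitly, so the full set of just ratios used here for the twelve chromatic steps (e.g. 16:15, 6:5, 8:5, 16:9) is an assumed, common 5-limit choice:

    just = [1/1 16/15 9/8 6/5 5/4 4/3 45/32 3/2 8/5 5/3 16/9 15/8 2/1];  % assumed just ratios
    tet  = 2 .^ ((0:12)/12);                  % equal-temperament ratios (100 cents per step)
    dev  = 1200 * log2(tet ./ just);          % deviation of 12Tet from just intonation, in cents
    for k = 0:12
        fprintf('%2d semitones: 12Tet %.4f  just %.4f  (%+.1f cents)\n', ...
                k, tet(k+1), just(k+1), dev(k+1));
    end

For the Perfect 5th, for example, the deviation is only about 2 cents, well inside the 8-20 cent JND quoted above.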

Consonance and Dissonance

Although consonance is considered to be one of the most salient and basic features of sound related to the appreciation of music, it is by no means a simple feature to capture. In attempting to study intervallic consonance, i.e. consonance between two tones, which is the focus of this thesis, we once again run into problems of definition, as the literature is rife with confused attempts at a resolute determination of what is referred to by the terms consonance and dissonance, thereby resulting in the added difficulty of confused usage, quantification, and interpretation to be sure. Nonetheless, there is fairly widespread agreement that a consonant interval is one that is harmonious, pleasant, agreeable, beautiful, and stable, whereas a dissonant interval is one that is disagreeable, unpleasant, and in need of resolution. I will attempt to elucidate the issues underlying the difficulties that arise in the precise definition of these terms by presenting a brief history of such usage.

Pythagoras and the Harmony of the Spheres

Pythagoras is generally credited as being the first to hypothesize that the reason why certain musical intervals sound more pleasant or consonant than others is that they are related by simpler whole number ratios [23, 58]. He also regarded these ratios as indicating the harmonious relationship between the celestial spheres (the sun, the moon, and the planets) [23]. As is well known, the latter theory revealed itself to be baseless over time; however, Pythagoras' theory of consonance has remained prominent, as the relationship between consonance and simple ratios is undeniable, albeit for reasons that are still not entirely clear. As noted above, small deviations from perfect integer relationships are tolerated and do not meaningfully alter the assessment of the consonance of an interval. Although Pythagoras contributed significantly to the theory of consonance with his fairly straightforward observation, some empirical observations show significant deviations from the consonance ordering predicted by his theory, and it is thus not the final note [35, 34, 55]!


Figure 2.12: The spectrogram of two complex tones as one is held fixed and the other swept from unison to the octave. The spectrogram reveals the emergence of more orderly overtones at the points where the frequency ratio is simple, corresponding to tone combinations that are deemed more 'harmonious'. Most of these 'orderly' points are intervals formed by tone pairs selected from the Just Intonation tuning, as indicated with their defining ratios above the arrows. The points in between the simple ratios are visibly more 'chaotic' and produce sensations of beating, roughness, and general dissonance. Reprinted from [67].

Helmholtz: Beating and Critical Band

Hermann von Helmholtz is largely credited as having provided a full accounting of why Pythagoras' theory is valid from a physiological perspective [41, 57]. According to the theory, the dissonance of a tone combination is positively related to the number and strength of the interferences that occur between all of the partials of the two complex tones composing an interval. Helmholtz therefore argued that the high degree of consonance attributed to the unison (1:1) and the octave (1:2) results from their perfectly overlapping or coincident harmonics. It is easy to see how more complex ratios have fewer overlapping harmonics since, all else being equal, the overlap is simply determined by the interval's defining ratio (where all ratios are taken to be in their simplest form). Take for example the Major 2nd, a dissonant interval with a defining ratio of 8:9: every 9th partial of the lower frequency tone overlaps with every 8th partial of the higher frequency tone. Compare this to the octave, where overlap occurs for every 2nd partial of the lower frequency tone and every partial of the higher frequency tone (see Fig. 2.13). Helmholtz describes his theory in the following passage:

When two musical tones are sounded at the same time, their united sound is generally disturbed by the beats of the upper partials, so that a greater or less part of the whole mass of sound is broken up into pulses of tone, and the joint effect is rough. This relation is called Dissonance. But there are certain determinate ratios between pitch numbers, for which this rule suffers an exception, and either no beats at all are formed, or at least only such as have so little intensity that they produce no unpleasant disturbance of the united sound. These exceptional cases are called Consonances. [61]

Helmholtz proceeds to colorfully convey the sensation of beating associated with a dissonant interval:

In the first place the mass of tone becomes confused... But besides this... the sensible impression is also unpleasant. Such rapidly beating tones are jarring and rough. The distinctive property of jarring is the intermittent character of the sound... [and again]... A jarring intermittent tone is for the nerves of hearing what a flickering light is to the nerves of sight, and scratching is to the nerves of touch. A much more intense and unpleasant excitement of the organs is thus produced than would be occasioned by a continuous uniform tone [61].
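Before turning to the mathematics of beating, Helmholtz's coincidence argument above can be made concrete with a short sketch (the helper name and the choice of 20 ideal harmonic partials per tone are illustrative assumptions, not part of the original analysis); it counts how many partials of two harmonic complex tones coincide when their fundamentals are in a given ratio:

```python
from fractions import Fraction

def coincident_partials(p, q, n_partials=20):
    """Count coinciding partials of two harmonic complex tones whose
    fundamentals are in the ratio p:q (f_high = (q/p) * f_low).
    Partials are expressed in units of f_low; ideal harmonic spectra assumed."""
    low  = {Fraction(k) for k in range(1, n_partials + 1)}
    high = {Fraction(k * q, p) for k in range(1, n_partials + 1)}
    return len(low & high)

for name, (p, q) in {"Octave": (1, 2), "Perfect 5th": (2, 3),
                     "Major 2nd": (8, 9)}.items():
    print(f"{name}: {coincident_partials(p, q)} of the first 20 partials coincide")
```

With 20 partials per tone, the octave yields 10 coincidences, the perfect fifth 6, and the major second only 2, mirroring the ordering described above.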

The beating of two partials in a complex tone, and beats in general, can be understood by analyzing the following sum of two pure tones, given by:

A \sin(2\pi f_1 t) + A \sin(2\pi f_2 t) = 2A \cos\left(2\pi \frac{f_1 - f_2}{2} t\right) \sin\left(2\pi \frac{f_1 + f_2}{2} t\right)    (2.5)

When the frequencies of the pure tones are 'sufficiently close', a single pitch percept (i.e. an object of perception) results at the average of the two frequencies, (f1 + f2)/2, called the center frequency, accompanied by a slowly varying intensity modulation that produces a beating sensation with an envelope that oscillates at a frequency of f1 - f2.

Figure 2.13: The spectrogram of a violin playing a perfect 5th reveals many overlapping partials (indicated by the dashed white lines); the perfect 5th is a highly consonant interval. This is well aligned with Helmholtz's theory of consonance. Reprinted from [69].

The problem with Helmholtz's theory lies in the limited range over which beats and any associated sense of roughness occur, which is known as the critical bandwidth. When the partials are sufficiently far apart (typically a deviation of 10-20% from the center frequency) and no longer produce beats, Helmholtz's theory implies that assessments of consonance should no longer vary [41]. This is certainly not the case, as witnessed by much experimental evidence, and thus Helmholtz's theory becomes invalid as intervals formed by higher frequency tones are considered. As with Pythagoras, Helmholtz's theory of beats may not provide a flawless theory of consonance; however, it does add a necessary consideration when assessing tones with sufficiently close partials, which occur quite commonly both in nature and in music. His theory also begs the question - why do we hear beats? It is now known that the physiological mechanism that results in the perception of beats lies at the sensory periphery, specifically in the inner ear, as a result of the fact that the basilar membrane behaves as a series of overlapping band-pass filters (see Fig. 2.14). These band-pass filters act to boost a certain range of frequencies while attenuating those frequencies that lie outside that range, and so two tones whose frequencies are within the bandwidth of a single filter tend to behave as described in Eq. (2.5) above. The discretization of the audible spectrum along the basilar membrane has been attributed to the need for an "information and collection unit" - and hence the critical band [51].
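As a quick numerical check of Eq. (2.5), the following sketch (the frequencies and sampling rate are illustrative choices) verifies the product form and shows that the perceived beat rate is |f1 - f2|, since the ear responds to the magnitude of the envelope:

```python
import numpy as np

fs = 8000.0                     # sampling rate (Hz), arbitrary choice
t = np.arange(0, 1.0, 1 / fs)   # one second of signal
f1, f2 = 440.0, 444.0           # two pure tones a few Hz apart
A = 1.0

# Sum of two pure tones (left-hand side of Eq. 2.5)
s = A * np.sin(2 * np.pi * f1 * t) + A * np.sin(2 * np.pi * f2 * t)

# Equivalent product form: a carrier at the centre frequency modulated by
# an envelope term at (f1 - f2)/2; the audible beats occur at |f1 - f2|.
carrier = np.sin(2 * np.pi * (f1 + f2) / 2 * t)
envelope = 2 * A * np.cos(2 * np.pi * (f1 - f2) / 2 * t)

assert np.allclose(s, envelope * carrier, atol=1e-9)
print("perceived beat rate:", abs(f1 - f2), "Hz")
```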


Figure 2.14: A band-pass filter showing the bandwidth (the frequencies 'passed' by the filter), defined by the lower and upper cutoff frequencies F1 and F2, respectively. Here the cutoff frequencies have been defined as the frequencies where the amplitude falls more than 3 dB below the peak value. Fc denotes the center frequency. Reprinted from [68].

Before moving on, it is worth mentioning Stumpf's theory of Tonal Fusion, as it is closely related to Helmholtz's theory and indeed offers a unique perspective on consonance that has ties to the following section. Briefly, the theory of tonal fusion is a Gestalt theory which states that the degree of consonance of an interval is directly related to the degree to which the two notes composing the interval tend to fuse into a single tonal object [24]. The experimental setup used to test this hypothesis was simply to play each interval to a group of subjects (each in isolation) and ask them to identify the presence of either a single tone or a combination of tones. Stumpf's results are shown in Fig. 2.15; they show a good correspondence with Pythagoras' theory, and thus the degree of fusion also corresponds with the number of overlapping partials as per Helmholtz's theory. As we will see in the following section, these results are well aligned with what is called musical consonance. However, more recent investigations of tonal fusion have shown that it is not by itself a theory of sufficient robustness [24].


Figure 2.15: Stumpf's original results indicate that the degree to which the intervals shown along the abscissa are mistakenly perceived to be a single-tone stimulus agrees well with the simple-ratio theory of Pythagoras, and is thus in accord with Helmholtz's theory regarding overlapping partials. (Here the ordinate represents '% error'.) Reprinted from [24].

On the Sensations of Psychoacoustic Consonance & Musical Consonance

The beating and roughness of an interval resulting from processing at the sensory periphery is called sensory consonance or psychoacoustic consonance. The subsequent processing by the central auditory system is thought to underlie the phenomenon of musical consonance, commonly referred to as harmony. The degree of musical consonance of an interval refers primarily to its capacity to evoke a sensation of resolution or relaxation, and the degree of musical dissonance to a sensation of tension. As can be seen, these are two distinctly different concepts, though they do overlap to a certain extent. The work of Terhardt grounds these concepts with a certain deftness (in [57]). He first concludes that the experimental results on consonance and roughness are "significant and consistent, and thus provide a solid basis of a certain kind of consonance, i.e. psychoacoustic consonance... To this extent, Helmholtz's (1863) consonance theory is strongly supported" [57]. He then notes, however, that "the universal importance of harmonic intervals in music cannot be explained satisfactorily by the concept of psychoacoustic consonance" [57]. Terhardt proceeds to develop his virtual pitch model and to then show clearly that it is able to provide a sufficient explanation for numerous previously unexplained psychoacoustic phenomena, including octave periodicity, the missing fundamental, and the natural derivation of musical intervals from the harmonic series. As described previously, the virtual pitch of a complex tone almost always corresponds to the pitch class of the greatest subharmonic common to all of the partials within the tone. Terhardt realized that the virtual pitch of consonant intervals implies a bass note (a low frequency note) that is itself in the chromatic scale, whereas dissonant intervals imply virtual bass notes that do not belong to the chromatic scale. As Terhardt states, and anyone with basic experience in music composition knows, it is a "common experience that certain notes fit with each other and with a given bass note, [and] others do not" [57]. These bass notes imply a certain 'tonal meaning' - a term coined by Rameau in 1722 in his theory of musical consonance known as the fundamental bass, in which this very relationship to consonance was first posited [49]. Terhardt therefore succeeds in grounding the musical observations of Rameau in a model that fits well with many significant psychoacoustic observations. Although Terhardt stresses the importance of the virtual pitch model's prediction of the tonal meaning of an interval, thus providing a potential psychoacoustic basis for the grouping of musical consonances and dissonances, he does not address the general ordering of the musical consonances in relation to his model. In Chapter 4 I present what I believe to be a plausible accounting for this ordering.
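The subharmonic idea behind the virtual pitch of a dyad can be illustrated with a minimal sketch (the helper name, the A440 lower tone, and the particular dyads are illustrative assumptions): for a just interval whose tones are in the ratio p:q, both tones are harmonics of a common 'implied bass' at f_low/p, and the simpler the ratio, the lower the harmonic numbers involved.

```python
from fractions import Fraction

def implied_bass(f_low, p, q):
    """Greatest common subharmonic of a dyad with f_low : f_high = p : q.
    Both tones are then the p-th and q-th harmonics of the returned frequency."""
    r = Fraction(p, q)                       # reduce the ratio to lowest terms
    return f_low / r.numerator, r.numerator, r.denominator

for name, (p, q) in {"Octave": (1, 2), "Perfect 5th": (2, 3),
                     "Major 3rd": (4, 5), "Minor 2nd": (15, 16)}.items():
    bass, m, n = implied_bass(440.0, p, q)
    print(f"{name:12s}: implied bass {bass:7.2f} Hz "
          f"(the two tones are harmonics {m} and {n} of the bass)")
```

Simple ratios therefore imply a nearby, strongly supported bass note, while complex ratios push the implied bass far below the sounding tones and onto frequencies only weakly related to them.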

2.3 The Basics of Auditory Neuroscience

Neurons are the electrically excitable cells that constitute the basic units of the brain. They come in a large variety of shapes, sizes, and characteristic behaviors, and they are organized (and self-organize) to bring about cognitive function. Although they are highly variegated, most neurons share three common structural features, namely an axon, a soma (or cell body), and dendrites [48]. The axon is a long, thin, cable-like projection that acts to conduct electrical impulses called action potentials away from the cell body of the neuron. Axons typically connect with other neurons (usually on their dendrites, though sometimes on the axon or the soma) and are said to synapse onto other neurons. The dendrites themselves typically form branch-like projections known as dendritic trees that act to receive electrochemical stimulation from upstream neurons, the structure of the tree influencing how the neuron integrates input from other neurons. The aforementioned action potentials are the result of a depolarization of the membrane potential of a neuron. This depolarization occurs when the intracellular-versus-extracellular concentration differences of ions change sufficiently that the voltage across the membrane of the neuron moves from its resting potential (typically -40 to -90 mV) to a level above its critical threshold (typically about 15 mV above the resting potential). When this threshold is surpassed, a voltage pulse (the action potential) is initiated and proceeds to propagate along the axon. The membrane potential then hyperpolarizes, often going through a refractory period during which the neuron is unable to emit another action potential. These basic properties of neurons are what allow for the transmission and processing of information by the nervous system [48].

2.3.1 The Leaky Integrate-and-Fire Neuron Model

Neuron models seek to replicate the behavior of real neurons to a degree sufficient for their intended purpose while remaining as computationally inexpensive as possible. Among the simpler neuron models, the leaky integrate-and-fire (LIF) model is often used, as it retains the fundamental behaviors of a neuron as well as the minimal ingredients of its membrane dynamics [22]. The dynamics of the LIF model are described as follows:

C_m \frac{dV_m(t)}{dt} = -\frac{V_m(t)}{R_m} + I(t)    (2.6)

Here I(t) is the time-dependent current of the physical input (i.e. the sum of any applied or synaptic currents) to the neuron through the cell membrane, C_m is the capacitance of the cell membrane and R_m its resistance, and V_m(t) is the time-dependent voltage of the cell membrane. When the voltage of the cell exceeds a chosen threshold, V_m > V_th, the cell fires an action potential, which is typically modeled as a Dirac delta function (Eq. (2.7)) and is commonly referred to as a spike. Once a spike has occurred, the voltage is reset to a potential V_r < V_th, where it is often forced to stay for a period of time, thereby modeling the refractory period of a neuron. The leak term -V_m(t)/R_m always drives the voltage of the cell toward a state of equilibrium V_m = R_m I, the fixed point of Eq. (2.6).

\delta(t) = \begin{cases} +\infty & \text{if } t = t_{firing} \\ 0 & \text{otherwise} \end{cases}    (2.7)

Figure 2.16: A neuron with its dendritic and axonal functional subdivisions indicated. The soma is not indicated; it comprises the bulbous bulk of the cell. Reprinted from [70].
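Returning to Eqs. (2.6) and (2.7), a minimal numerical sketch of the LIF model is given below (Euler integration; the parameter values, which give a 10 ms membrane time constant and a 2 ms refractory period, are illustrative assumptions rather than fitted values):

```python
import numpy as np

def lif_spike_times(I, dt=1e-4, C_m=1e-3, R_m=10.0,
                    V_th=1.0, V_reset=0.0, t_ref=2e-3):
    """Euler integration of C_m dV_m/dt = -V_m/R_m + I(t), Eq. (2.6),
    with threshold crossing, reset, and an absolute refractory period.
    `I` is an array of input-current samples spaced `dt` apart."""
    V, last_spike, spikes = V_reset, -np.inf, []
    for n, I_n in enumerate(I):
        t = n * dt
        if t - last_spike < t_ref:            # hold at reset during refractoriness
            V = V_reset
            continue
        V += dt * (-V / (R_m * C_m) + I_n / C_m)
        if V >= V_th:                          # threshold crossing -> emit a spike
            spikes.append(t)
            V, last_spike = V_reset, t
    return np.array(spikes)

# Constant suprathreshold drive: the equilibrium R_m * I = 2.0 exceeds V_th = 1.0
spikes = lif_spike_times(np.full(20000, 0.2))   # 2 s of input at dt = 0.1 ms
print(f"{len(spikes)} spikes, mean rate = {len(spikes) / 2.0:.1f} Hz")
```

With constant suprathreshold drive the model fires regularly; driving I(t) with periodic or noisy inputs is the basis of the mode-locking and stochastic-resonance studies discussed later in this chapter.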

2.3.2 Ordinary Measures of Neural Firing

The output of a model neuron is typically a sequence of threshold-crossing times (or spike times) defined as the set {t_i} for i = 1, ..., N. From the {t_i} it is easy to define the spike train x(t) = \sum_i \delta(t - t_i), as well as the interspike interval (ISI), given by T_i = t_i - t_{i-1} with t_0 = 0. The nth-order ISI is similarly defined as T_i^{(n)} = t_{i+n-1} - t_{i-1}, which reduces to the ordinary (first-order) ISI for n = 1. The all-order ISI distribution is then taken to be the pooled collection of the nth-order ISIs for n = 1, ..., N, where N is typically chosen to reflect the known or hypothesized biophysical limitations of a neuron's or neural network's processing capabilities [58]. From the (first-order) ISI sequence, the instantaneous firing rate r of a spike train can be defined as:

r_i = \frac{1}{T_i}    (2.8)

The mean firing rate is similarly defined as:

\bar{r} = \frac{1}{\langle T \rangle}    (2.9)

where \langle T \rangle is the mean of the ISI. The autocorrelation function of the spike train, R(\tau), is defined as:

R(\tau) = \frac{\langle x(t)\, x(t+\tau) \rangle - \langle x(t) \rangle^2}{\langle x(t)^2 \rangle - \langle x(t) \rangle^2}    (2.10)

The autocorrelation function is the cross-correlation of a signal (here the spike train) with itself; it is often used as a tool for finding repeating patterns such as periodicities in time-domain signals and is therefore of great use in auditory neuroscience. Here R(\tau) represents the mean-corrected probability of observing a spike at \pm\tau time units from another spike. The mean correction and variance normalization yield the attractive property R(\tau) \in [-1, 1], where R(\tau) = 1 indicates perfect linear correlation and R(\tau) = -1 indicates perfect linear anti-correlation. Finally, the power spectral density (PSD) describes how the power of a signal (or a time series) is distributed over frequency (essentially giving the amount of power per Hz). This measure, often referred to simply as the spectrum of a signal, is related to the Fourier transform of the autocorrelation function R(\tau) by the Wiener-Khinchin theorem. This gives:

S(f) = \int_{-\infty}^{\infty} e^{-2\pi i f \tau} R(\tau)\, d\tau = \mathcal{F}\{R(\tau)\}    (2.11)

where \mathcal{F} is the Fourier transform, which converts the time-domain signal into a frequency-domain representation in terms of the amplitudes of the sinusoidal frequencies present therein and their associated phases (see [22] for a thorough introduction to theoretical neuroscience and its tools).
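The following sketch (illustrative helper names; a perfectly periodic toy spike train stands in for real auditory nerve data) computes the pooled all-order ISIs and the normalized autocorrelation of a binned spike train, the two quantities used repeatedly in the coding-scheme discussion below:

```python
import numpy as np

def all_order_isis(spike_times, max_order=10):
    """Pool the 1st- through max_order-th order interspike intervals."""
    t = np.asarray(spike_times)
    return np.concatenate([t[n:] - t[:-n]
                           for n in range(1, min(max_order, len(t) - 1) + 1)])

def spike_autocorrelation(spike_times, bin_width=1e-3, duration=None):
    """Normalized autocorrelation (in the spirit of Eq. 2.10) of a binned spike train."""
    if duration is None:
        duration = spike_times[-1] + bin_width
    bins = np.arange(0, duration + bin_width, bin_width)
    x, _ = np.histogram(spike_times, bins)
    x = x - x.mean()                               # mean correction
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    return ac / ac[0]                              # R(0) = 1 after normalization

# Toy example: a perfectly periodic spike train with a 10 ms period
spikes = np.arange(0, 1.0, 0.01)
isis = all_order_isis(spikes, max_order=5)
R = spike_autocorrelation(spikes, bin_width=1e-3, duration=1.0)
most_common = np.bincount(np.round(isis / 1e-3).astype(int)).argmax() * 1e-3
print("most common all-order interval:", round(most_common, 3), "s")
```

For this periodic train the most common all-order interval is the 10 ms period itself, which is exactly the property the temporal pitch models of the next section exploit.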

2.3.3 Auditory Coding Schemes

How populations of neurons represent and convey information through spike trains is a primary focus of the field of computational neuroscience, and is fundamental to our understanding of auditory brain function and auditory perception (see Fig. 2.17 for a schematic depicting this relationship) [22]. The three complementary neural coding schemes shown in Fig. 2.18 correspond to different, independent, and general aspects of signals [11]. Neural coding of a signal in the auditory system is therefore achieved either by coding according to the physical channel through which the signal is transmitted (channel-activation or spatial coding), by the internal temporal form of the signal, such as its envelope or spectral components (temporal pattern coding), or by the time of arrival of the signal (time-of-arrival coding). The amplitude of the signal is a fourth aspect, which can be used in conjunction with any of the three schemes mentioned [11].


Figure 2.17: The schematic shows the relationship between a stimulus, its evoked neural response, and the resulting percept. Reprinted from [11].


Figure 2.18: The space of possible neural pulse codes. Each vertex represents a coding scheme that is complementary to the other two vertices, and the schemes indicated along the edges represent 'hybrid' combinations of their respective vertices. Reprinted from [11].

Since the discharge patterns of the auditory nerve fibers are known to be phase-locked (see the 'Relevant Nonlinear Dynamics' section below) to the stimulus, and thus reflect the time structure of the acoustic waveform [58, 11], temporal coding schemes have been shown to be capable of accounting for a large variety of psychoacoustic phenomena, and are thus a very likely candidate for the actual coding scheme employed within the auditory system [58, 17, 16, 11]. The strength of the temporal coding scheme lies in the representation of the various periodicities present in an acoustic waveform via interspike intervals. It has been shown in the body of work due to Cariani and colleagues [59, 58, 17, 16, 11] that in the population-wide all-order ISI distribution (constructed by summing together the all-order ISIs of many single auditory nerve fibers) "the most common... interval present almost invariably corresponds to the pitch frequency, whereas the relative fraction of pitch related intervals amongst all others qualitatively corresponds to the strength of the pitch" [17]. This pitch strength is more often referred to as 'pitch salience' [16, 11], and it is through the measure of the pitch salience of the western intervals that Cariani has succeeded in showing a strong correspondence with psychoacoustic consonance [14]. The strength of the coding scheme is undeniable, as it can account for a large number of pitch-related perceptions as well as timbre perceptions. A good example of the temporal coding of the auditory nerve can be seen in Fig. 2.19. A schematic summary of the major correspondences between the population-wide all-order ISI distribution of the auditory nerve discharges and pitch percepts is shown in Fig. 2.20 [17]. These correspondences are as follows:

• Panel A shows - from left to right - the stimulus waveform, its PSD showing no power at the fundamental, the stimulus autocorrelation showing perfect correlation at the period of the missing fundamental, and the resulting all-order ISI distribution of the auditory nerve, which shows a strong correspondence with the positive part of the autocorrelation. Thus, the pitch predicted by Cariani's coding scheme corresponds with the pitch percept that is heard - that of the missing fundamental.

• Panel B shows that the coding scheme is capable of explaining the pitch equivalence of various stimuli with the same frequency. Here a pure tone elicits the same pitch as an AM tone, a click train, and AM noise, all at the same frequency of 160 Hz.

• Panel C shows that the coding scheme is capable of accounting for observed variations in pitch salience, which is strong when the peaks corresponding to the observed pitch are prominent relative to the background, and low otherwise.

• Panel D demonstrates the coding scheme's ability to account for the observation that pitch percepts are level invariant, as an increase from 40 dB SPL to 60 dB SPL and finally 80 dB SPL reveals changes only in the height or salience of the maximum peak, but not its location.

• Panel E shows that the coding scheme is capable of accounting for the observed pitch shift of inharmonic AM tones. Here, a harmonic AM tone comprising three successive harmonics evokes a clear pitch at the fundamental frequency (n=6). When all three harmonics are shifted either upwards or downwards by an equal amount (keeping the frequency spacings constant), the observed shift in pitch is much smaller than the distance shifted (n=5.56). When the pitches of the harmonics are further shifted (here downward to n=5.5), two pitches are heard with roughly equal probability - a phenomenon which can be easily accounted for by the emergence of two peaks in the population-wide all-order ISI distribution (as indicated by the two arrows).

• Panel F shows that the coding scheme can account for the observed perceptual phenomenon of phase invariance, as two stimuli that vary in phase yield population-wide all-order ISI distributions that are nearly identical.

• Panel G shows the coding scheme's ability to account for the existence of the so-called dominance region - whereby the lower harmonics of a complex tone are dominant over the upper harmonics in determining the pitch percept evoked by a stimulus. Cariani shows this here by using two competing complex harmonic stimuli with only slightly separated fundamental frequencies. The stimulus of harmonics 3-5 clearly evokes a stronger response than the stimulus composed of harmonics 6-12, and does indeed correspond with the pitch percept elicited in such an experiment.

• Panel H shows that the coding scheme is also capable of accounting for vowel quality or discrimination, which is obviously of great importance for any candidate coding scheme for human sound processing. According to the results of Cariani, the appearance and disappearance of minor peaks in the population-interval distribution closely resemble the psychophysically observed vowel-class boundaries, further strengthening the coding scheme's viability.


Figure 2.19: The temporal coding of the auditory nerve is shown for a periodic stimulus (A). (B) Single nerve fibers of the cat are phase-locked to specific features of the waveform and are arranged by their characteristic frequency (CF), the pure-tone frequency to which they respond maximally. The ordinates correspond, for each fiber, to the number of firings obtained from multiple presentations of the stimulus in (A). The power spectrum of the waveform in (A) is shown in (C), and (D) shows the mean discharge rates as a function of CF. The autocorrelation function or autocorrelogram of the stimulus is shown in (E), with the maximum spectral component frequency F1 indicated, as well as the frequency of repetition of the waveform, the fundamental frequency F0. The population-wide all-order ISI formed by summing all-order intervals from all fibres reveals itself to be qualitatively identical to the positive part of the autocorrelogram of the stimulus. Notice the relative 'salience' of the F0 frequency as represented by the dominant peak at a period of 1/F0, accurately representing the perceived pitch evoked by the stimulus. Reprinted from [11].


Figure 2.20: Schematic summary of major correspondences between pitch percepts and population-interval distributions at the level of the auditory nerve. The population-interval histograms plot relative numbers of all-order intervals (ordinates) of different durations (abscissas). Interval ranges for the histograms: 0-5 ms (H); 0-10 ms (A, F); 0-15 ms (B, C, E, G); 0-25 ms (D). Waveform segments are 20 ms long. See text for discussion. Reprinted from [11].

2.4 Relevant Non-Linear Dynamics

Although the temporal coding scheme described above appears to be quite robust, Cariani employs a Markov-modulated Poisson process in order to replicate the spike trains of auditory nerve fibers subjected to dyad stimuli, and therefore does not provide any dynamical mechanism capable of producing such a response. It is thus that we now turn to dynamical models of impulse generation at the cochlea's output as a means of exploring the interplay of neuron properties and synaptic noise in producing spike trains in response to dyadic input. Finally, the sine-circle map and some of its associated properties and concepts are presented in order to provide the background concepts requisite for Chapter 3.

2.4.1 Stochastic Resonance

Stochastic resonance (SR) is said to occur when noise - a universal feature of all physical systems - enhances the detection of some characteristic of an input signal to a system. SR has been investigated thoroughly over the last few decades [29, 72], in particular to assess its relevance to the behavior of biological systems [28]. The name resonance implies that, for an optimal noise strength, the forcing frequency (input signal) and the noise-induced switching rate (e.g. the firing of a neuron) of the system being forced are matched.

Figure 2.21: Stochastic resonance is exhibited as the noise (shown in grey) pushes the subthreshold signal (solid black) past the threshold (dashed line). The noise-enhanced information transfer can be seen in the spike train of the model neuron (shown in the upper panel). Reprinted from [44].

It has recently been shown that SR can occur in the human brain [39], although an SR-based mechanism that is employed for information processing has yet to be firmly established [44]. Regardless, SR is of interest to the study of the auditory system, as it is a mechanism capable of explaining the auditory illusion known as the missing fundamental, whereby the subject perceives a pitch to be present at the fundamental frequency of a harmonic series even though the fundamental frequency itself possesses no power in the input signal. This resonance at the missing fundamental frequency is known as the ghost stochastic resonance (GSR), and it has been shown to be the preferred resonance of an excitable system (such as a neuron) with periodic inputs whose greatest common divisor occurs at the missing fundamental (see Fig. 2.22) [9, 40, 19, 18]. Furthermore, SR (and GSR) behavior has been studied explicitly for a variety of neuron models [37, 38, 2], strengthening the theoretical case that neurons possess the processing capabilities required for the illusion of the missing fundamental. Note that our investigation of these simple SR-based dynamical models of neural firing does not preclude the existence of an array of neurons with different thresholds and noise levels in the auditory periphery, which is responsible for the full transduction of stimuli into spike trains. Rather, we focus our attention for now on single neuron models driven by dyads in the context of consonance.


Figure 2.22: The upper plot shows the trace of a signal constructed by adding two sinusoids of frequencies f1 = 2 Hz and f2 = 3 Hz. Notice that the constructive peaks of the resulting waveform are maximal every 1 s, corresponding to a repetition at the fundamental frequency f0 = 1 Hz. It is not hard to see how the addition of a small amount of noise creates a preferential resonance at the missing fundamental. Indeed, the probability of observing the GSR (f0) is much greater than that of observing either one of the forcing frequencies (f1 or f2) over the range of noise values that induces a resonance (lower plot). Figures reprinted from [20].
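The essence of this figure can be reproduced with a crude sketch (not a neuron model; the amplitudes, noise level, and 0.2 s dead time are illustrative assumptions): Gaussian noise added to the subthreshold two-tone signal produces threshold crossings preferentially at the constructive peaks, which recur at the missing fundamental f0 = 1 Hz.

```python
import numpy as np

rng = np.random.default_rng(1)
fs, T = 1000, 300                       # sample rate (Hz), duration (s)
t = np.arange(0, T, 1 / fs)
f1, f2 = 2.0, 3.0                       # inputs; missing fundamental f0 = 1 Hz

signal = 0.45 * (np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t))
x = signal + 0.15 * rng.standard_normal(t.size)
threshold = 1.0                         # the noise-free signal stays below 0.9

# Upward threshold crossings with a 0.2 s dead time (a crude refractory period)
spikes, last = [], -np.inf
for i in range(1, t.size):
    if x[i - 1] < threshold <= x[i] and t[i] - last > 0.2:
        spikes.append(t[i])
        last = t[i]

isi = np.diff(spikes)
frac = lambda period: np.mean(np.abs(isi - period) < 0.1)
print(f"ISIs near 1/f0 = 1.0 s: {frac(1.0):.2f}, "
      f"near 1/f1 = 0.5 s: {frac(0.5):.2f}, near 1/f2 = 0.33 s: {frac(1/3):.2f}")
```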

2.4.2 The Sine-Circle Map & Mode-Locking

Large ensembles of coupled systems (such as neural networks) whose components interact to influence the behavior of one another often settle into collective coherent regimes; this phenomenon is called synchronization, and it is a prominent feature of many nonlinear systems. Since synchronization is robust in the presence of noise, it is considered to be a viable candidate coding scheme by computational neuroscientists [11]. A simple system well known for its exhibition of nonlinear synchronization is a forced oscillator as described by the sine-circle map (often simply called a circle map). The circle map is a one-dimensional iterated map of the circle onto itself, as follows [62]:

\theta_{n+1} = \theta_n + \Omega - \frac{k}{2\pi} \sin(2\pi \theta_n)    (2.12)

Here k is interpreted as the coupling strength and \Omega can be interpreted as the forcing frequency being applied to the oscillator. For any fixed value of k \in (0, 1], sweeping values of \Omega produces a distinct modal structure known as the Devil's Staircase (Fig. 2.23) [63, 41]. This phenomenon is often interchangeably referred to as either mode-locking or phase-locking in the literature. Each mode-locked region is defined by the following limiting behavior, known as the winding number:

W = \frac{p}{q} = \lim_{n \to \infty} \frac{\theta_n}{n}    (2.13)

The width of the mode-locked region at each winding number is determined by the width of the Arnold Tongue at that winding number at height k (see Fig. 2.24). For k \in (0, 1], the widths of the Arnold Tongues rank-order from greatest to smallest according to the simplicity of the denominator q of the winding number (here we take 0 to be 0/1 for consistency). From a practical perspective, the width of the Arnold Tongue for any rotation number determines how 'far' an oscillating system can be perturbed from its steady state without changing the rate (or mode) of its oscillation - hence mode-'locking'. The width of the Arnold Tongue is also referred to as a synchronization region when using sine-circle maps to measure synchrony in neural networks [41, 45, 3, 21]. It was the realization that the rank-ordering of mode-locked regions or Arnold Tongue widths corresponded perfectly with the consonance rankings (according to Pythagoras) that motivated the modeling and analysis of Shapira Lots & Stone [41]; work which is critically assessed in the reprinted paper by myself & A. Longtin presented in Chapter 3.
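A minimal sketch of Eqs. (2.12)-(2.13) is given below (the sweep range and iteration counts are arbitrary choices); values of \Omega lying inside an Arnold tongue return the same rational winding number, which is what produces the flat steps of the Devil's Staircase:

```python
import numpy as np

def winding_number(omega, k=1.0, n_transient=1000, n_iter=4000):
    """Estimate W = lim theta_n / n for the sine-circle map
    theta_{n+1} = theta_n + omega - (k / (2 pi)) * sin(2 pi theta_n)."""
    theta, lifted = 0.0, 0.0
    for n in range(n_transient + n_iter):
        step = omega - k / (2 * np.pi) * np.sin(2 * np.pi * theta)
        theta = (theta + step) % 1.0        # map back onto the circle
        if n >= n_transient:
            lifted += step                  # accumulate the unwrapped advance
    return lifted / n_iter

# Sweep the forcing frequency: omega values inside the 1:2 Arnold tongue lock
# to W = 1/2 (one flat 'stair'), while values outside it drift to other values.
for omega in np.arange(0.40, 0.61, 0.02):
    print(f"omega = {omega:.2f}  ->  W = {winding_number(omega):.4f}")
```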


Figure 2.23: In this figure k is held constant at k = 1, and the winding number for the circle map is plotted as a function of \Omega. Each flat region or 'stair' corresponds to a mode-locked region. Note the obvious symmetry of the plot about the mode 1:2, and that the largest modes are clearly 0 and 1, then 1/2, then 1/3 and 2/3, etc. Reprinted from [64].


Figure 2.24: The Arnold Tongues of the circle map are shown in black. \Omega varies from 0 to 1 along the abscissa, and k varies from 0 to 4\pi along the ordinate. Notice the relative strength of the Arnold Tongue originating at the half-line along the abscissa - a mode of 1:2. Reprinted from [64].

2.5 Summary

This chapter attempts to provide a solid grounding and context for the study of consonance from a computational neuroscience perspective. A systems approach to the problem is taken in order to accomplish this, whereby music and some current theories regarding music are discussed very generally and very briefly, and mention is made of all of the major systems involved in the production of music and their components. By describing the concepts and tools required to understand consonance - from the physical production of a complex tone in an instrument, to the reception of the tone by the ear, and the path it follows to its eventual transduction into electrical impulses in the brain - I aim to establish an understanding in the reader that is sufficient for further exploration of neuronal theories that may bear fruit in our quest to better understand consonance. Before embarking, current theories of prominence are discussed in detail - particularly those of Helmholtz regarding the effects of beating and its influence on the perception of consonance, and the theory of Pythagoras, which equates the simplicity of the ratios of tones with their consonance. From here, the basic tools of computational neuroscience are introduced, as is a particular temporal auditory coding scheme of known robustness, along with some specific relevant nonlinear dynamics topics - all introduced in order to better prepare the reader for the following chapters, which assume some knowledge of them. If this seems disparate, rest assured it is not so in the context of what follows.

References

[1] A. Angrum. Voyager - music from earth. http://voyager.jpl.nasa.gov/spacecraft/music.html, January 2010.

[2] M. Barbi, S. Chiliemi, and A. D. Garbo. The leaky integrate-and-fire with noise: a useful tool to investigate sr. Chaos, Solitons and Fractals, 11(12): 1849-1853, 2000.

[3] M. Bauer and W. Martienssen. Coupled circle maps as a tool to model synchroni- sation in neural networks. Network: Computation in Neural . . . , 1991.

[4] K. Berger. Some factors in the recognition of timbre. The Journal of the Acoustical Society of America, 1964.

[5] G. M. Bidelman and A. Krishnan. Neural correlates of consonance, dissonance, and the hierarchy of musical pitch in the human brainstem. Journal of Neuroscience, 29(42):13165-13171, Jan 2009.

[6] A. J. Blood and R. J. Zatorre. Intensely pleasurable responses to music correlate with activity in brain regions implicated in reward and emotion. Proceedings of the National Academy of Sciences of the United States of America, 98(20):11818-11823, 2001.

[7] M. Braun. Auditory midbrain laminar structure appears adapted to fO extraction: further evidence and implications of the double critical bandwidth. Hearing research, 129(l-2):71-82, Mar 1999.


[8] M. Braun. Inferior colliculus as candidate for pitch extraction: multiple support from statistics of bilateral spontaneous otoacoustic emissions. Hearing research, 145(1-2):130-40, Jul 2000.

[9] Ó. Calvo and D. Chialvo. Ghost stochastic resonance in an electronic circuit. International journal of bifurcation and chaos in applied sciences and engineering, 16(3):731, 2006.

[10] P. Cariani. Temporal coding of sensory information. Proceedings of the annual conference on Computational . . . , 1997.

[11] P. Cariani. Temporal coding of periodicity pitch in the auditory system: An overview. Neural Plast, 6(4):147-172, 1999.

[12] P. Cariani. Neural timing nets. Neural Networks, 14:737, Jan 2001. [13] P. Cariani. Temporal codes, timing nets, and music perception. J New Music Res, 30(2): 107-135, 2001.

[14] P. Cariani. A temporal model for pitch multiplicity and tonal consonance. Proceedings of the Eighth International Conference on Music Perception and Cognition (ICMPC), 2004.

[15] P. Cariani. A temporal model for pitch multiplicity and tonal consonance. Proceedings of the Eighth International Conference on Music Perception and Cognition (ICMPC), 2004.

[16] P. Cariani and B. Delgutte. Neural correlates of the pitch of complex tones. I. Pitch and pitch salience. Journal of Neurophysiology, 76(3):1698, 1996.

[17] P. Cariani and B. Delgutte. Neural correlates of the pitch of complex tones. II. Pitch shift, pitch ambiguity, phase invariance, pitch circularity, rate pitch, and the dominance region for pitch. Journal of Neurophysiology, 76(3):1717, 1996.

[18] D. Chialvo. and ghost resonances: How we could see what isn't there. Unsolved Problems of Noise and Fluctuations: UPoN 2002: Third International Conference of Unsolved Problems of Noise and Fluctuations in Physics, Biology, and High Technology, Washington, DC, 3-6 September 2002, page 43, 2002.

[19] D. Chialvo. How we hear what is not there: A neural mechanism for the missing fundamental illusion. Chaos, 13:1226, 2003.

[20] D. Chialvo, O. Calvo, D. Gonzalez, O. Piro, and G. Savino. Subharmonic stochastic synchronization and resonance in neuronal systems. Phys Rev E, 65(5):50902, 2002. [21] S. Coombes and P. Bressloff. Mode locking and arnold tongues in integrate-and-fire neural oscillators. Phys Rev E, 60:2086-2096, 1999.

[22] P. Dayan and L. F. Abbott. Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems. The MIT Press, 1st edition, 2001.

[23] D. Deutsch. The Psychology of Music (Academic Press Series in Cognition and Perception). Academic Press, 1998.

[24] L. DeWitt and R. Crowder. Tonal fusion of consonant musical intervals - the oomph in Stumpf. Percept Psychophys, 41(1):73-84, 1987.

[25] M. Giraudo, L. Sacerdote, and A. Sicco. Ghost stochastic resonance for a neuron with a pair of periodic inputs. Lecture Notes in Computer Science, 4729:398, 2007.

[26] D. Goldman. Steve Jobs unveils new iPods. http://www.cnnmoney.com/2009/09/09/technology/apple_event_ipod/index.htm, September 2009.

[27] M. Grube, D. Cramon, and R. Rübsamen. Inharmonicity detection. Experimental Brain Research, 153, 2003.

[28] P. Hänggi. Stochastic resonance in biology. ChemPhysChem, 2002.

[29] G. Harmer, B. Davis, and D. Abbott. A review of stochastic resonance: Circuits and measurement. IEEE Transactions . . . , 2002.

[30] W. M. Hartmann. Signals, Sound, and Sensation. Springer, illustrated edition, 1997.

[31] B. Heffernan and A. Longtin. Pulse-coupled neuron models as investigative tools for musical consonance. Journal of Neuroscience Methods, 183(1):95-106, Jan 2009.

[32] L. Hulen. A musical scale in simple ratios of the harmonic series converted to cents of ... . WSEAS Transactions on Computers, 2006.

[33] K. Itoh, K. Miyazaki, and T. Nakada. Ear advantage and consonance of dichotic pitch intervals in absolute-pitch possessors. Brain and Cognition, 53(3):464-471, 2003.

[34] A. Kameoka and M. Kuriyagawa. Consonance theory part i: consonance of dyads. The Journal of the Acoustical Society of America, 45(6):1451-9, Jun 1969.

[35] A. Kameoka and M. Kuriyagawa. Consonance theory part ii: consonance of complex tones and its calculation method. The Journal of the Acoustical Society of America, 45(6): 1460-9, Jun 1969.

[36] G. Langner and C. E. Schreiner. Periodicity coding in the inferior colliculus of the cat. i. neuronal mechanisms. Journal of Neurophysiology, 60(6): 1799-822, Dec 1988.

[37] S. Lee and S. Kim. Parameter dependence of stochastic resonance in the stochastic hodgkin-huxley Phys Rev E, 1999.

[38] A. Longtin. Stochastic resonance in neuron models. Journal of statistical physics, 1993.

[39] A. Longtin. Neural coherence and stochastic resonance. In: Stochastic Methods in Neuroscience, edited by G. Lord and C. R. Laing. Oxford University Press, 2009.

[40] A. Lopera, J. M. Buldú, M. C. Torrent, D. R. Chialvo, and J. Garcia-Ojalvo. Ghost stochastic resonance with distributed inputs in pulse-coupled electronic neurons. Physical review E, Statistical, nonlinear, and soft matter physics, 73(2 Pt 1):021101, Feb 2006.

[41] I. S. Lots and L. Stone. Perception of musical consonance and dissonance: an outcome of neural synchronization. J R Soc Interface, 5(29): 1429-1434, 2008.

[42] S. McAdams and A. Bregman. Hearing musical streams. Computer Music Journal, 1979.

[43] J. Mcdermott and M. Hauser. The origins of music: Innateness, uniqueness, and evolution. Music Perception, 2005.

[44] F. Moss, L. Ward, and W. Sannita. Stochastic resonance and sensory information processing: a tutorial and review of Clinical Neurophysiology, 2004.

[45] A. Pikovsky and M. Rosenblum. Scholarpedia - synchronization. http://www.scholarpedia.org/article/Synchronization, December 2007.

[46] S. Pinker. How the Mind Works. W. W. Norton & Company, 1999.

[47] C. J. Plack, A. J. Oxenham, R. R. Fay, and A. N. Popper. Pitch: neural coding and perception. Springer: New York, 2005.

[48] D. Purves. Neuroscience, Fourth Edition. Sinauer Associates, Inc., 4th edition, 2007.

[49] J.-P. Rameau. Treatise on harmony; translated, with an introduction and notes, by Philip Gossett. Dover Publications, first english edition, 1971.

[50] D. Robinson. Threshold of hearing as a function of age and sex for the typical unscreened British Journal of Audiology, 22:5-20, Jan 1988.

[51] J. G. Roederer. The physics and psychophysics of music: an introduction. Springer- Verlag, 3rd edition, 1995.

[52] C. Sachs and J. Kunst. The wellsprings of music / Curt Sachs ; edited by Jaap Kunst Da Capo Press: New York, 1977.

[53] A. Savitzky and M. J. E. Golay. Smoothing and differentiation of data by simplified least squares procedures. Analytical Chemistry, Vol. 36(No. 8): 1627-1639, Jan 1964.

[54] E. Schellenberg and S. Trehub. Natural musical intervals: Evidence from infant listeners. Psychological Science, 7(5):272-277, Sep 1996.

[55] D. Schwartz, C. Howe, and D. Purves. The statistical structure of human speech sounds predicts musical universals. Journal of Neuroscience, 23(18):7160-7168, Jan 2003.

[56] T. Stanley. The history of philosophy (1701). Apocryphile Press, 2006. [57] E. Terhardt. Pitch, consonance, and harmony. J Acoust Soc Am, 55(5):1061-1069, 1974.

[58] M. Tramo, P. Cariani, B. Delgutte, and L. Braida. Neurobiological foundations for the theory of harmony in western tonal music. Annals of the New York Academy of Sciences, 2001.

[59] M. Tramo, P. Cariani, C. Koh, and N. Makris. Neurophysiology and neuroanatomy of pitch perception: auditory cortex. Annals New York Academy of Sciences, 1060:148, Jan 2005.

[60] S. Trehub. The developmental origins of musicality. Nat Neurosci, 2003.

[61] H. von Helmholtz and A. J. Ellis. On the sensations of tone as a physiological basis for the theory of music. Longmans, Green, 2nd English edition, 1885.

[62] E. W. Weisstein. Circle map - from Wolfram MathWorld. http://mathworld.wolfram.com/CircleMap.html, January 2010.

[63] E. W. Weisstein. Devil's staircase - from Wolfram MathWorld. http://mathworld.wolfram.com/DevilsStaircase.html, January 2010.

[64] Wikipedia. Circle map — wikipedia, the free encyclopedia, 2009. [Online; accessed 17-January-2010]. [65] Wikipedia. Equal-loudness contour — wikipedia, the free encyclopedia, 2009. [On- line; accessed 17-January-2010]. [66] Wikipedia. List of musical intervals — wikipedia, the free encyclopedia, 2009. [On- line; accessed 17-January-2010].

[67] Wikipedia. Consonance and dissonance — wikipedia, the free encyclopedia, 2010. [Online; accessed 17-January-2010]. [68] Wikipedia. Critical band — wikipedia, the free encyclopedia, 2010. [Online; accessed 17-January-2010]. [69] Wikipedia. Musical acoustics — wikipedia, the free encyclopedia, 2010. [Online; accessed 17-January-2010]. [70] Wikipedia. Neuron — wikipedia, the free encyclopedia, 2010. [Online; accessed 17-January-2010]. [71] Wikipedia. Sound — wikipedia, the free encyclopedia, 2010. [Online; accessed 17-January-2010].

[72] G. Winterer, M. Ziller, H. Dorn, K. Frick, C. Mulert, N. Dahhan, W. M. Herrmann, and R. Coppola. Cortical activation, signal-to-noise ratio and stochastic resonance during information processing in man. Clin Neurophysiol, 110:1193-203, 1999.

Chapter 3

Article I

This chapter is a direct reprint of the article:

B. Heffernan, A. Longtin. Pulse-coupled neuron models as investigative tools for musical consonance. Journal of Neuroscience Methods. 183: 95-106, 2009.

All simulations, data collection and subsequent analysis were performed by myself. Writing duties were shared between myself and A. Longtin.




Pulse-coupled neuron models as investigative tools for musical consonance

B. Heffernan (a, c), A. Longtin (a, b)

(a) Centre for Neural Dynamics, University of Ottawa, Ottawa, Canada
(b) Department of Physics, University of Ottawa, 150 Louis Pasteur, Ottawa, Ontario K1N 6N5, Canada
(c) Systems Science Program, University of Ottawa, Canada

ARTICLE INFO

Article history: Received 25 May 2009; Received in revised form 29 June 2009; Accepted 30 June 2009.

Keywords: Leaky integrate-and-fire; Mode locking; Mathematical models; Phase locking; Consonance; Dissonance; Noise; Coupled oscillators

ABSTRACT

We investigate the mode locking properties of simple dynamical models of pulse-coupled neurons to two tones, i.e., simple musical intervals. A recently proposed nonlinear synchronization theory of musical consonance links the subjective ranking from consonant to dissonant intervals to the universal ordering of robustness of mode locking ratios in forced nonlinear oscillators. The theory was illustrated using two leaky integrate-and-fire neuron models with mutual excitatory coupling, with each neuron firing at one of the two frequencies in the musical interval. We show that the ordering of mode locked states in such models is not universal, but depends on coupling strength. Further, unless the coupling is weak, the observed ratio of firing frequencies is higher than that of the input tones. We finally explore generic aspects of a possible synchronization theory by driving the model neurons with sinusoidal forcing, leading to down-converted, more realistic firing rates. This model exhibits one-to-one entrainment when the input frequencies are in simple ratios. We also consider the robustness to the presence of noise that is present in the neural firing activity. We briefly discuss agreements and discrepancies between predictions from this theory and physiological/psychophysical data, and suggest directions in which to develop this theory further. © 2009 Elsevier B.V. All rights reserved.

1. Introduction

1.1. Theories of consonance

Despite centuries of theories and experiments, the precise neural basis for our perception of consonance and dissonance is still largely unknown. This is so for both the mechanisms involved and their location. Studies of consonance have focused almost exclusively on the perception of two simultaneous tones. This superposition can include only the two fundamental frequencies, or these frequencies plus their harmonics with a specified amplitude distribution for these harmonics. Each such sound is termed a complex tone, and the presence of two tones, pure or complex, constitutes a dyad or a musical interval.

The problem of the mechanism underlying subjective ranking of an interval along the consonance to dissonance axis has received most attention at the psychophysical level, since experimental work involving the simultaneous processing of two periodic stimuli is challenging and limited by available recording technologies and suitable experimental preparations at the neuronal level. Yet the solution of this problem offers the exciting possibility of explaining a connection between simple subjective percepts and simple stimulus combinations in biophysical terms, and accordingly there are ongoing efforts to expose the neural basis of consonance evaluations.

In this work, we explore nonlinear dynamical models of neurons driven by periodic stimuli making up a musical interval, but also driven by each other through mutual excitatory coupling. The hope is to improve the biophysical realism of these models, explore the issues involved in mapping their modeled activity onto experimentally measurable activity, and reveal what aspects of synchronization, if any, may be at play.

The simplest and probably the oldest theory of consonance is that of Pythagoras. He observed that consonant mixtures of two tones occurred when the frequencies were in simple integer ratios. Helmholtz (1877) discussed consonance in the more general context of complex tones, which differ from pure tones in that they have power at harmonics of the fundamental. He proposed that dissonance is proportional to the number of frequency components present in the two complex tones that produce beats, i.e., whose frequency difference is within the so-called critical bandwidth (Kameoka and Kuriyagawa, 1969; Plomp and Levelt, 1965;


Roderei; 1995). For example, the sum of two such pure tone com- The work presented here offers one direction towards this goal, in ponents of frequencies /i and /2 and identical amplitude A can be the context of synchronization theory. written as: 1.2. Possible contributions ofnonlinearíty

A\sin(2\pi f_1 t) + A\sin(2\pi f_2 t) = 2A\cos\!\left(2\pi\,\frac{f_1 - f_2}{2}\,t\right)\sin\!\left(2\pi\,\frac{f_1 + f_2}{2}\,t\right)    (1)

When the frequencies are sufficiently close, this superposition produces a single perceived pitch at the average frequency (f1 + f2)/2 with a slowly varying intensity modulation known as the beat.

Another recent approach, based on timing nets, involves the analysis of population-level distributions of all-order interspike intervals between firings (Cariani, 2001, 2004; Tramo et al., 2005). It relies on putative computations in the time domain with filters and coincidence detectors; the locus is not defined, but is thought to lie somewhere beyond the cochlear nucleus. It is based on the notion that harmonically related pitches share firing intervals at their common sub-harmonics. For example, in the case of a Perfect 5th, we have 2f1 = 3f2, and thus f1/3 = f2/2 are common sub-harmonics of the two tones. The presence of these common sub-harmonics causes neural firings to be more correlated in time than for the case of non-harmonically related pitches. This in turn produces maximal pitch salience that can plausibly account for consonance of pairs of pure and complex tones.

Indeed, the simplicity of frequency ratios has played a central role in theories of intervallic consonance and dissonance. Sensory consonance is often distinguished from musical consonance. The former refers to consonance based on physical (i.e., acoustic) factors, and is, therefore, independent of musical conventions. Sensory consonance, which is considered to be a function of the aforementioned critical bandwidth, refers to the absence of amplitude fluctuations in two simultaneously sounded tones (because of their non-overlapping critical bands). Sensory dissonance refers to the "roughness" (very rapid amplitude fluctuations) that can result from simultaneously sounded tones with overlapping critical bands. By contrast, musical consonance is considered to result from tone compatibility, which is dependent on culture, convention, and context. Moreover, musical consonance is applicable to sequential as well as simultaneous tones. From a psychoacoustic perspective, consonant intervals occur between 'compatible' tones and produce a 'feeling of stability', whereas dissonant intervals occur between 'incompatible' tones and cause instability (e.g., Aldwell and Schachter, 1989). Although the concepts of sensory and musical consonance differ, they are not completely independent (Bregman, 1990). For example, octaves have never been considered musically dissonant, and tones related by simple ratios, such as 2:3 and 1:2, are considered to be "stable" intervals across several musical cultures (Meyer, 1956).

Apart from issues of definition, there are outstanding problems in terms of the class of mechanisms that may underpin consonance ranking. As beating phenomena essentially arise from linear superposition of two sinusoidal waveforms, this concept of consonance and dissonance is a linear one. It has its limitations, which are nicely summarized in Shapira Lots and Stone (2008): consonance ratings can change beyond the critical bandwidth, can occur without the presence of harmonics, and cortical lesions reveal that there are specialized neuronal pathways dedicated to dissonance/consonance assessments (Peretz et al., 2001; Tramo et al., 2001). Sequential processing of tones also suggests that consonance does not rely as much on beats as on simple frequency ratios (Schellenberg and Trehub, 1994a,b), and EEG responses seem to imply that consonance ratings are formed by processing of pitch relationships in the auditory cortex (Itoh et al., 2003). So it is clear, given the range of subtly nuanced psychophysical phenomena and of outstanding problems, that much work is needed to link electrophysiological recordings, biophysical models and psychophysics.

Shapira Lots and Stone (2008) recently proposed a synchronization theory of consonance that goes beyond the linear beating theory of Helmholtz. It is based on a striking observation made on numerical simulations of excitatorily pulse-coupled neuron models (see below): the progression from consonant to dissonant intervals is similar to the progression of step sizes on a 'Devil's Staircase', the step sizes themselves being proportional to the width of Arnold tongues in nonlinear coupled oscillators. These technical concepts refer to the range of parameters over which a given mode locking of firing patterns is seen. A mode is defined as the frequency of one oscillator (e.g., a periodically firing neuron). Mode locking describes the phenomenon where the frequencies of two oscillators remain in a given ratio for some finite range of parameters. The fact that the oscillators adjust their frequency to maintain the same ratio is a sign of nonlinear synchronization.

For example, imagine neuron A firing at a fixed frequency. If it becomes excited periodically by neuron B, it may tend to synchronize its firings with those of neuron B. In the simplest case, there is a one-to-one correspondence between the firings of neurons A and B, i.e., one-to-one (1:1) mode locking. Alternately, for another parameter setting such as a lower amplitude of coupling, neuron A may fire only once for every two firings of neuron B, i.e., there is a 1:2 mode locking.

The general theory of nonlinear oscillators states that a ratio of n + n':m + m' can be found at parameters between those for which n:m and n':m' occur (the so-called Farey sequence; see, e.g., Hilborn (1994) for a general theory, and Glass and Mackey (1988) for specific applications to neuron models). There is in fact a universal sequence of mode locking ratios that appears as the ratio of the driving frequency (neuron B's frequency f_B) to the natural frequency (that of neuron A, i.e., f_A, in the absence of input from B) is increased, independently of the details of the models for the oscillators. Note here that f_B is not influenced by neuron A, i.e., the coupling is one-directional. One can then make a plot where the abscissa is the ratio of natural frequencies f_A/f_B, and the ordinate is the ratio of actual firing frequencies of the two coupled oscillators. Such a plot (examples are shown below) is known as a Devil's Staircase. It is called a staircase because it exhibits flat "steps" (actually, an infinite number of them), each of which corresponds to a mode locking. In other words, each step corresponds to a parameter range (e.g., a range of forcing frequencies) over which the same ratio of firing frequencies is seen at the output of the coupled oscillators. Strictly speaking, the standard Devil's Staircase is defined for the so-called sine circle map (Glass and Mackey, 1988). In decreasing order of width, the steps correspond to 1:1, 1:2, 1:3 and 2:3 (same width), 2:5 and 3:5, etc., i.e., the steps decrease in width as higher integers occur in the fractional representation of the mode locking. Conversely, the width of each step is also a measure of how robust a mode locking ratio is, i.e., of how easy it is to observe given variations in system parameters. The 1:1 step (unison in musical terms) is larger than the 2:1 step (octave), which is larger than the 3:2 step (Perfect 5th), larger than the 4:3 step (Perfect 4th), and so on. Shapira Lots and Stone (2008) observed that this sequence had notable similarity with the subjective ranking of consonance for musical dyads (see Table 1).

There have been recent dynamical approaches to perception, in which for example pitch perception relies on stable neural activity patterns known as dynamical attractors (Cartwright et al., 2001). The emphasis of such approaches is the nonlinearity of the complex neuronal systems at work.
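As a point of reference for the staircase structure just described, the standard Devil's Staircase can be generated from the sine circle map itself. The following MATLAB sketch is illustrative only (it is not taken from the article; the coupling value and iteration counts are arbitrary choices): it plots the winding number, i.e., the mode-locking ratio, as a function of the normalized drive frequency Omega at the critical coupling K = 1.

% Sketch: winding number of the sine circle map,
% theta_{n+1} = theta_n + Omega + (K/(2*pi))*sin(2*pi*theta_n) (mod 1).
% The plot of winding number versus Omega is the standard Devil's Staircase.
K = 1.0;                          % critical coupling: the staircase is "complete"
Omegas = linspace(0, 1, 2001);
W = zeros(size(Omegas));
for j = 1:numel(Omegas)
    th = 0;  lift = 0;            % 'lift' accumulates the un-wrapped rotation
    for n = 1:2000
        dth  = Omegas(j) + (K/(2*pi))*sin(2*pi*th);
        lift = lift + dth;
        th   = mod(th + dth, 1);
    end
    W(j) = lift/2000;             % winding (mode-locking) number
end
plot(Omegas, W);  xlabel('\Omega');  ylabel('winding number');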

Table 1
Mode stability as measured according to the authors' simulation results (column 4), and as measured by Shapira Lots and Stone (2008), for similar parameter values (α = 100, ε = 0.5). Neither presents a strong correspondence with global subjective consonance rankings.

Interval       Ratio    Consonance ranking (Schwartz et al., 2003)    Ranking of mode stability (ε = 0.5)    Ranking of mode stability (Shapira Lots and Stone, 2008) (ε = 5)
Unison         1:1      1         1         1
Octave         1:2      2         3         2
Perfect 5th    2:3      3         2         3
Perfect 4th    3:4      4         4         4
Major 6th      3:5      5/6       5         5/6/7
Major 3rd      4:5      6/5       6         5/6/7
Minor 3rd      5:6      8         7         5/6/7
Minor 6th      5:8      7         10        8
Major 2nd      8:9      10/11     9         9
Major 7th      8:15     12        13        10
Minor 7th      9:16     11/10     12        11
Minor 2nd      15:16    13        11        (12)
Tritone        32:45    9         8 (a)     (13)

(a) Here, we have measured the stability of the mode corresponding to 5:7, as the resulting difference in the second tone is imperceptible (less than 8 cents!).

The perception of a beat frequency requires nonlinearity in the auditory periphery (such as at the cochlea), since the beat frequency is not present in the linear superposition of two pure tones. The combination of sub-threshold activity with sufficient neuronal noise is thought to enable the detection of the missing fundamental through the so-called 'ghost stochastic resonance effect' (Chialvo et al., 2002; Chialvo, 2003). The detection by neurons of slow envelopes associated with narrowband signals relies on nonlinear thresholding with low noise (Middleton et al., 2006). The proposal by Shapira Lots and Stone (2008) follows this nonlinear trend, and is interesting given its fresh approach to the consonance problem.

In fact, we notice in Shapira Lots and Stone (2008) that the stability measures are based on mode locking ratios that differ from the actual ratio of natural frequencies driving the neurons (see Section 3.2). For example, the step corresponding to the octave ratio (the neurons are mode locked 1:2) is seen when the input frequencies are in a ratio of around 1:3. Associating mode locking at one interval to the consonance of another interval undermines the theory. Further, little is known about the synchronization properties of coupled oscillators when each one is driven at substantially different frequencies (Pikovsky et al., 2003). Here we explore these properties further numerically in the context of consonance.

Finally we will consider an elaboration of the synchronization theory in which the neurons do not individually (i.e., without coupling) fire at the tone frequency. The higher processing pathways, in auditory and other similar pathways such as the electrosensory system (Berman and Maler, 1999), implement a kind of down-conversion of the frequency representation to bring it into line with the dynamical capabilities of the neurons. In fact much may be gained by looking at other senses, such as the electric sense and mechano-reception, that also deal with multiple inputs of a harmonic nature (Eggermont, 1990). While a full modeling of the neurons in the auditory pathway is beyond the scope of our work, it is possible to ask what the study of simple generic neuron models driven by sinusoids can bring to our understanding of consonance, on the road to more realistic models. Resulting observations and predictions could help refine experiments to validate this or other theories.

Section 2 exposes the methods used to explore how simple neuron models can serve as investigative tools for the study of consonance phenomena. Results on mode locking as a function of coupling strength and noise, along with more realistic sinusoidally forced models with lower rates, are presented in Section 3. This section also includes an analysis of the correspondence between consonance rankings and mode locking ratios. A discussion of our results and an outlook onto future investigations are the subjects of Section 4.

Other questions arise. How closely does the consonance ranking match psychophysical data for different coupling strengths and fundamental frequencies? How does it behave when neuronal noise is present? From the point of view of nonlinear dynamics, Devil's Staircases with strict universal properties are limited to weak coupling scenarios where one oscillator is driven by another (Glass and Mackey, 1988; Coombes and Bressloff, 1999). Little is known about the general properties of such "staircases" in the context of mutually coupled leaky integrate-and-fire neurons. Hereafter we nevertheless refer to the ensuing staircases as "Devil's Staircases" for simplicity.

1.3. Further explorations of a synchronization theory

Our study focuses on simple neuron models with excitatory pulse-coupling, as was used in the work of Shapira Lots and Stone (2008). In the discussion we comment on possible future extensions of our work to include inhibitory connections. The mode locking paradigm of Shapira Lots and Stone (2008) has not been tested against the recorded activity of nerve cells in the presence of dyads. Their scheme has only been illustrated numerically, assuming that each of two neurons fires, in the absence of coupling, at one of the frequencies present in the dyad. For example, for a Perfect 5th with a 256 Hz fundamental, the bias current of one neuron is adjusted until it fires periodically 256 times per second, and the other is adjusted to fire periodically 384 times per second (2:3). However, to our knowledge, neurons that fire periodically at the frequency of a tone stimulus have not been found in the cortex nor elsewhere (Tramo et al., 2005). In fact, cortical firing rates are typically low except when responding transiently to inputs. This model is thus seen more as a caricature of how nonlinearity and synchronization may arise in, e.g., auditory cortex, serving as a basis for exploring this theory of consonance.

Also, while it is clear that most neurons do exhibit a stochastic component to their firing, it is not clear what balance of determinism and noise is needed to replicate the activity of cells involved in the ranking of consonance. There are numerous cell populations and intricate circuitry from the cochlear nucleus up to the auditory cortex (Tramo et al., 2005; Joris et al., 2003). The auditory afferents impinging on the cochlear nucleus already exhibit significant randomness in response to a pure tone. The firings are phase locked to the tone, but are separated by a random integer number of periods of the tone. This down-conversion of the input frequency to the output frequency is reproducible in terms of mathematical models that mix determinism and noise (see, e.g., Longtin, 1993 and references therein, and Cariani (2001) in the context of musical perception).

2. Methods

The leaky integrate-and-fire (LIF) model is a simple neuron model that retains the minimal ingredients of membrane dynamics, but whose behaviors nonetheless map onto many known properties of real neurons. They are sufficient to mimic basic sub-threshold properties of neurons. They incorporate supra-threshold spiking artificially: when the voltage reaches a fixed threshold, chosen equal to 1 in our work, a firing (or spike or action potential) is said to have occurred, and is represented graphically either by a point or by a vertical arrow on top of voltage time series plots (see below). At the next numerical integration time step, the voltage is reset to a value chosen here as 0. The threshold and reset voltages can be rescaled to realistic values for a given cell without qualitatively affecting results. For example the threshold can be set at -55 mV as the Na+ activation voltage, and the reset can equal a resting potential of, e.g., -70 mV. The dynamics of the coupled LIF model can be written as:

\frac{dV_1}{dt} = -\frac{V_1}{\tau_1} + I_1 + \varepsilon E_{2\to 1}(t), \qquad \frac{dV_2}{dt} = -\frac{V_2}{\tau_2} + I_2 + \varepsilon E_{1\to 2}(t)    (2)

Here τ1 and τ2 are membrane time constants, chosen equal to 1 below. E_{2→1}(t) represents the effect of neuron 2 (LIF2) on neuron 1 (LIF1), and vice versa for E_{1→2}(t). The parameter ε represents the strength of coupling between the neurons. I1 and I2 represent the bias to the cells (in units of current divided by capacitance C, where C is set to 1). I1 is chosen so that the neuron fires (reaches threshold) periodically at frequency f1 = 256 Hz, the fundamental tone chosen for our study, and I2 is chosen so that it fires at the other frequency f2 in the interval. In general, the frequency of firing of an LIF model is related to the bias current I by the formula:

f = \left[\tau \ln\!\left(\frac{I\tau}{I\tau - 1}\right)\right]^{-1}    (3)

for Iτ > 1, which is the neural oscillator regime. This can be found simply by looking at the solution of a single LIF model (with ε = 0), V(t) = Iτ(1 − e^{−t/τ}), and equating it to the threshold value after one period T = f^{−1}. When a neuron fires, an action potential is assumed to propagate to the other neuron, where it causes a synaptic current in the form of an alpha function with the time course:

E(t) = \alpha^2 t\, e^{-\alpha t}\, \Theta(t)    (4)

Here this formula represents what a spike at time zero contributes to the post-synaptic cell. Θ(t) is the Heaviside function, which is 0 for t < 0 and 1 otherwise. The strength of this pulse-coupling between the oscillators is thus determined by εE(t). For numerical work involving many spikes, one has to sum many such alpha functions appropriately shifted in time to compute the ongoing effect of one cell on another. This involves keeping track of every firing time and numerically evaluating a continually growing list of alpha functions as the simulation proceeds. Further, the exponential evaluations are computationally costly, and amount to insignificant contributions after a few time constants. So instead, an equivalent procedure consists in modeling each synapse by two state variables:

\frac{dE_i}{dt} = y_i, \qquad \frac{dy_i}{dt} = -2\alpha y_i - \alpha^2 E_i + \alpha^2 \sum_{k} \delta(t - t_{jk})    (5)

Here the index i = 1, 2, and j is the opposite of i, such that t_{jk} is the kth firing time of neuron j, and the sum is over all such firing times. The "delta" functions in the sum are Dirac delta functions, commonly used in computational neuroscience to mimic a spike arriving at a presynaptic terminal by focusing solely on its time of arrival. Numerical integration of the differential equations for each voltage and for the respective alpha functions of each neuron was done with an Euler scheme when no noise was present. The input frequency ratio, referred to below as the intrinsic or natural frequency ratio, was determined as the ratio of firing frequencies when the cells are uncoupled. The output mode locked ratio is the ratio of firing frequencies actually achieved in the coupled situation after transients have died out.

The above model, studied by Shapira Lots and Stone (2008), and also analytically earlier by Coombes and Lord (1997), will be studied below for different values of coupling. It will also be extended in two more realistic directions. To take noise into account, we will consider the effect of noise on the current-balance equation:

\frac{dV_1}{dt} = -V_1 + I_1 + \varepsilon E_{2\to 1}(t) + \xi_1(t), \qquad \frac{dV_2}{dt} = -V_2 + I_2 + \varepsilon E_{1\to 2}(t) + \xi_2(t)    (6)

The noises ξi are independent Gaussian white noises with zero mean. For simplicity both noises were given the same intensity D, defining the autocorrelation function of the noise ⟨ξ(t)ξ(s)⟩ = 2Dδ(t − s). These equations were integrated using a standard Euler-Maruyama algorithm: the deterministic part of the dynamics is integrated with an Euler method, while the noise term at each time step contributes the value N√(2DΔt), where N is a random Gaussian number of mean zero and variance one, and Δt is the integration time step. Output mode locking ratios were determined by averaging the firing activity over long stretches of the numerical solutions; they are thus mean output mode locking ratios when noise is present.

The last model we consider is the sinusoidally forced, but noise-free, pulse-coupled LIF system:

\frac{dV_1}{dt} = -V_1 + I_1 + A_1\sin(2\pi f_1 t) + \varepsilon E_{2\to 1}(t), \qquad \frac{dV_2}{dt} = -V_2 + I_2 + A_2\sin(2\pi f_2 t) + \varepsilon E_{1\to 2}(t)    (7)

Without coupling (ε = 0) each neuron is known to exhibit mode locking to the periodic input (Keener et al., 1981). The frequencies of the input pure tones here are simply f1 and f2; there is no need for a calibration using Eq. (2). The threshold and reset are 1 and 0, respectively, for each cell as above. The chosen biases in this case are smaller than the values for Eq. (1) (see below). In fact, sub-threshold dynamics are used, such that firings cannot occur when coupling strength and forcing amplitudes A1 and A2 are set to zero. The amplitudes of the pure tones are set by A1 and A2. This formulation thus further allows an investigation of stimulus intensity by varying the amplitudes. It can also be used to mimic sensitivity of different neurons to different tone frequencies, i.e., to incorporate tonotopic receptive field properties. This is not explored here, as the amplitudes are set equal to one another, but our work sets the stage for these further explorations.

For these simulations, we also show the relative firing phases for each cell. Every time a neuron fires, its "phase" is reset to zero; this phase is then assumed to increase linearly in time until it reaches 2π at the time of its next firing (Pikovsky et al., 2003). Having assigned each neuron a phase, it then becomes possible to compute a relative phase as the phase at which neuron 1 (or 2) is when neuron 2 (or 1) fires. This representation simply illustrates the mode lockings.

3. Results

3.1. Coupling strength and mode locking

We first look at the effect of coupling strength in Eq. (1) on mode locking as well as on the ordering of the mode locking ratios.
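The following MATLAB sketch shows how Eqs. (2), (5) and (6) can be integrated in practice. It is not the authors' original code: the time step, run length and unit conventions are assumptions made only to keep the listing self-contained, and the bias currents are obtained by inverting Eq. (3) with τ = 1.

% Sketch: two pulse-coupled LIF neurons (Eq. (2)) with alpha-function synapses
% in the two-state-variable form of Eq. (5), Euler-Maruyama step as in Eq. (6).
a   = 100;           % alpha-function rate constant
ec  = 0.5;           % coupling strength epsilon
D   = 0;             % noise intensity (set > 0 for the noisy case of Eq. (6))
tau = [1 1];         % membrane time constants
f   = [256 384];     % target uncoupled firing rates: a Perfect 5th
I   = 1 ./ (tau .* (1 - exp(-1 ./ (f .* tau))));   % invert Eq. (3) for the biases
dt  = 1e-5;  nsteps = 2e5;
V = [0 0];  E = [0 0];  Y = [0 0];  spikes = {[], []};
for n = 1:nsteps
    t = n*dt;
    % membrane update: each neuron receives the OTHER neuron's synaptic variable
    V = V + (-V./tau + I + ec*E([2 1]))*dt + sqrt(2*D*dt)*randn(1,2);
    % synaptic state variables, Eq. (5)
    E = E + Y*dt;
    Y = Y + (-2*a*Y - a^2*E)*dt;
    fired = V >= 1;                 % threshold = 1, reset = 0
    V(fired) = 0;
    Y(fired) = Y(fired) + a^2;      % each spike kicks the synapse that neuron drives
    if fired(1), spikes{1}(end+1) = t; end
    if fired(2), spikes{2}(end+1) = t; end
end
ratio = numel(spikes{1}) / numel(spikes{2})   % output mode-locking ratio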


Fig. 1. The membrane potential of LIF1 (top panel, blue) and LIF2 (bottom panel, green), and their corresponding spike trains (middle panel). The bias currents in Eq. (1) were selected such that LIF1 fires at 256 Hz, and LIF2 fires at 384 Hz, constituting a Perfect 5th. Here, the LIFs are uncoupled (ε = 0) and thus, as expected, the resulting mode is just that of the natural firing ratio of 2:3, which is quite clear from the summed spike trains. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)

Fig. 1 shows the situation corresponding to a Perfect 5th interval (pure tone dyad) with a natural frequency ratio of 2:3. As the coupling strength is zero, the output mode locking ratio is the same as the natural ratio, as expected. Fig. 2 shows the mode locking effect resulting from an increase in coupling strength to ε = 0.2, resulting in a firing mode of 3:4. A coupling strength of ε = 0.5 results in a mode of 15:17 (not shown). In fact, the output mode locking ratio approaches 1:1 as ε increases.

It is also clear, by comparing the middle panels of Figs. 1 and 2, that the firing rates of both LIFs increase with increases in coupling strength. This is expected from Eq. (2) since the firing frequency is a monotonic increasing function of the bias current. The purely excitatory coupling causes a net positive contribution to the bias in each cell, which leads to a higher firing rate in each cell. This underlies the increase in the output mode locking ratio in comparison to the natural frequency ratio (Coombes and Lord, 1997).

3.2. Devil's Staircase

In order to reproduce the results of Shapira Lots and Stone (2008), a plot of the output mode locking ratio versus the natural frequency ratio was generated and is shown in Fig. 3. Here the natural frequency of LIF1 was set to 256 Hz, and the natural (intrinsic) frequency of LIF2 was swept from 230 to 1230 Hz in increments of 0.5 Hz by increasing the bias parameter I2. A coupling strength of ε = 0.8 was used, and the resulting staircase structure is nearly indistinguishable from that published by those authors, who used ε = 8. As they have confirmed this parameter choice (Shapira Lots and Stone, personal communication), it is not clear at this point why we must scale down their values of ε by a factor of 10 to produce a match to their results.

One can clearly see the largest step corresponding to the output ratio of 1 (1:1), followed by 0.5 (1:2), 0.66 (2:3), etc. The step size determines the robustness of a ratio, i.e., the range of system parameters, including natural frequencies, over which the ratio will be seen. The ordering of the step sizes and the correspondence with consonance rankings are discussed below using Table 1.

It is important to note here that all of the output mode locking ratios below 1 lie above the diagonal, and thus the mode ratios acquired with this coupling do not match the natural frequency ratios. This implies, for instance, that the octave, whose intrinsic frequency ratio is 1:2, produces a mode locking of roughly 16:25 (0.64), which is actually a fairly 'unstable' mode. For the parameter values used here, the coupled systems that actually synchronize to a mode locked ratio of 1:2 lie in the range of natural frequency ratios from 0.31 to 0.36 (roughly 1:3), which corresponds to a devi-


Fig. 2. A Perfect 5th with coupling strength ε = 0.2, α = 0.9. Here, the coupled LIFs, whose natural frequency ratio is 2:3, lock to a mode of 3:4. The frequency of discharge of each neuron has increased, as seen by comparing the middle panel of this figure with that of Fig. 1, since each neuron causes a net excitation in the other due to the excitatory coupling.

Due to the randomness of the solutions, the mode locking ratio is computed as a time average of the firing frequencies of both the oscillators. Clearly, for both coupling strengths shown, the noise has the effect of washing out the stair-like structure seen in Fig. 3. Only the larger steps remain visible for a given noise level. Noise does not change the slope of the curve, nor does it appear to change the relative sizes (and thus the size ordering) of the steps. A similar smearing of a Devil's Staircase by noise has also been reported in the context of electrosensory receptors in Chacron et al. (2000), which are very similar to mammalian primary auditory afferents (Carr, 2004). Interestingly, the staircase moves towards the diagonal as the coupling strength is reduced, i.e., the mode locking ratio acquired through coupling is closer to the natural ratio. Nevertheless, these results raise the question of how robust the synchronization for different ratios remains in the face of noise, an issue further studies will have to contend with. Perhaps other mechanisms are at work to enhance the mode lockings, such as coupling between multiple oscillators.
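A staircase like those of Figs. 3 and 4 can be assembled by wrapping the integration loop shown earlier in a sweep over the natural frequency of LIF2. The sketch below is illustrative only: run_coupled_lif is a hypothetical wrapper (returning the two spike counts for given natural frequencies, coupling strength, α and noise intensity), and the frequency step is far coarser than the 0.5 Hz used in the article.

% Sketch: output mode-locking ratio versus intrinsic frequency ratio
f1 = 256;  f2list = 230:5:1230;
D  = 0;                             % set D = 0.1 for the noise-smeared staircase of Fig. 4
rin = zeros(size(f2list));  rout = zeros(size(f2list));
for k = 1:numel(f2list)
    [n1, n2] = run_coupled_lif(f1, f2list(k), 0.8, 100, D);  % hypothetical wrapper
    rin(k)  = f1 / f2list(k);       % intrinsic (natural) frequency ratio
    rout(k) = n1 / n2;              % mean output mode-locking ratio after transients
end
plot(rin, rout, '.');  hold on;  plot([0 1], [0 1], 'r');     % diagonal for reference
xlabel('Intrinsic Frequency Ratio');  ylabel('Output mode-locking ratio');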

3.4. An analysis of correspondence of rankings

0.5 0.6 0.7 0.8 0.9 Again, under the assumption that Shapira Lots and Stone indi- Intrinsic Frequency Ratio cate e values corresponding to a ten-fold increase to ours, we Fig. 3. 'Mutually Coupled-Excitatory Devil's Staircase' structure generated by hold- generated a staircase plot in an identical fashion to that mentioned ing LIFl at 256Hz and sweeping LIF2 from 230 to 1230Hz. The coupling strength above, now using e = 0.5. Our assumption was further validated by a is e = 0.8, a = 100. The steps appear as lines due to the density of points occurring similarity in stability values, as measured by the width ofeach stair, at the given modes. The red line indicates the diagonal. (For interpretation of the i.e., the range of intrinsic frequency ratios over which a given mode references to color in this figure legend, the reader is referred to the web version of locked ratio is maintained (see Appendix A for numerical values). the article.) Table 1 shows the ranking of mode stability resulting from this ation from the tonic (256Hz) by an additional half-octave. This frequency sweep at lower coupling placed beside those indicated phenomenon is also observable in Shapira Lots and Stone (2008). It by Shapira Lots and Stone. Bold values indicate a match in rank- raises a concern regarding the strict association of step sizes with ing between mode stability, and the mean value of consonance consonance ranking, as the ratio produced is shifted from the ratio assessments due to Schwartz et al. (2003) derived by normalizing actually desired. One must thus further explore the meaning of this and averaging the rankings of several such studies (here '/' denotes association in this context. equivalence). From the table, it may seem that our reproduction of Shapira 3.3. Effect of noise on mode locking Lots and Stones' results is not such a faithful one. However, it is worth noting that the rank ordering of mode stability is parameter Fig. 4 shows the effect of neuronal noise on the mode stability dependent, and thus the simple model Eq. ( 1 ) does not actually yield of the staircase, based on numerical simulations of Eq. (6). Here, an invariant relation to consonance orderings as implied in Shapira Lots and Stone (2008). This indicates that this is not a true Devil's Staircase, since the width of the stairs do not map in a one-to-one correspondence with the simplicity of ratios regardless of model parameters. Indeed, the notion of putting into correspondence the stability of mode locked ratios with dyadic tone ratios is further compromised by the fact that the 'staircase' has an average slope greater than one for all positive couplings. The slope tends to one as the coupling strength decreases (see Fig. 5 for a small coupling strength e = 0.08). - E=O 8 It can only be made to lie perfectly on the diagonal for e = 0 (see - E=OS also Fig. 5). In this latter case all modes reduce to points, yielding a - Diagonal ^ 0.6 stability measure of zero, and the staircase becomes a slide. This in itself does not exclude the possibility of 'Western' dyads locking to their identical mode locked ratios—even though the steps are now very small in comparison to previous results with stronger coupling. Nor does it exclude the possibility of the acquired mode's stability measures being ordered according to the simplicity of dyad ratios. 
But it does seem to imply that this perhaps overly simplistic model 04 0.5 0.6 0.7 0.8 0.9 will never yield a perfect correspondence between the simplicity of Intrinsic Frequency Ratio frequency ratios and the stability of mode locked ratios in general. Given the similarity of some orderings however, it is nevertheless Fig. 4. Devil's Staircase generated in the same manner as in Fig. 3, here with noise an interesting model from which to launch further explorations. added (D=0.1), and for coupling strengths of e = 0.8 (blue), and e = 0.5 (red). Notice Table 2 shows the stability ranking of the output mode locking that the stairs (modes) of the staircase are almost entirely washed out by the noise. Both lie above the diagonal line shown here in green for all mode locked ratios less ratio acquired by the coupled LIF neurons driven at the dyadic ratio than l, as expected. ( For interpretation ofthe references to color in this figure legend, indicated, alongside the ranking of the stability of the mode ofthat the reader is referred to the web version of the article.) same ratio. The stability of the actual mode acquired does not cor- R Heffernan, A. Longtin /Journal of Neuroscience Methods 183 (2009) 95-106 101

Table 2
The fourth column shows the rank ordering that results from measuring the stability of the actual mode acquired by the coupled LIFs, compared to Shapira Lots and Stone's measurement of the stability of the modes corresponding to the input ratio of the intrinsic frequencies, regardless of the fact that the LIFs do not necessarily lock on to this ratio. Clearly, correspondence between mode stability and consonance ranking is weak for both simulation results, as indicated by bold.

Interval           Consonance ranking (Schwartz et al., 2003)    Stability ranking of actual mode acquired (measured from Fig. 3) (ε = 0.8)    Stability ranking of the mode corresponding to input ratio (measured from Fig. 3) (ε = 0.8)
Unison 1:1         1         1      1
Octave 1:2         2         9      2
Perfect 5th 2:3    3         11     3
Perfect 4th 3:4    4         3      5
Major 6th 3:5      5/6       7      7
Major 3rd 4:5      6/5       4      4
Minor 3rd 5:6      8         8      6
Minor 6th 5:8      7         5      8
Major 2nd 8:9      10/11     6      12
Major 7th 8:15     12        2      10
Minor 7th 9:16     11/10     10     11
Minor 2nd 15:16    13        13     13
Tritone 32:45      9         12     9
This results in a clear the primary auditory fibers, as discussed in the motivation lead- ing to this model. Other conversions along the auditory pathway, which have not been fully elucidated, also occur to finally deter- mine how primary auditory neurons fire according to a tonotopic representation (Joris et al., 2003; Tramo et al., 2005). The results here reveal that nonlinear mode locking effects may play a part in the down-conversion of frequency. This may be significant, as humans are capable of detecting pitches of stimuli whose frequen- cies are significantly greater than the upper limit of neuronal firing frequencies. Fig. 10 shows the average firing frequency as a function of bias current for selected stimuli. It can be easily seen that for fixed ampli- tude of the stimulus, the LIFs will never generate spikes if the bias is too small. The low-pass filtering of the LIF is again shown clearly here, since as the stimulus frequency increases, the minimal bias required in order to make the LIF fire also increases. It is intrigu- ing to see that the locking of the octave tone of 512 Hz (cyan) to the same average firing rate resulting from the tonic tone of 256 Hz (royal blue) occurs for a bias current of roughly 0.97—which may relate to pitch invariance at octaves. Finally, Fig. 11 presents an example of Devil's Staircase for this 0.6 0.7 0.8 system Eq. (7) with coupling. Mode lockings are clearly seen in the Intrinsic Frequency Ratio numerous plateaus present. However the staircase is not monotonie as we have seen up to now. The structure of the mode lockings is Fig. 5. Devil's Staircase with very low coupling (e=0.08). Although nicely aligned with the diagonal, the modes are of nearly uniform width. All dyads are 'perceptually quite complex, and its full analysis will be left for future work. So stable', in the sense that they all lock to modes of significant width. there is no apparent clear association between the size of steps 102 B. Heffernan, A. Longtin /Journal of Neuroscience Methods 183 (2009) 95- !06


Fig. 6. (a) The sinusoidal current 'injected' into LlFl (blue) and LIF2 (green), here representing a pure tone dyad stimulus of a Perfect 5th (256 and 384 Hz) with an amplitude OfA= 0.5. (b and c)The membrane potential of UFl and LIF2 respectively, here uncoupled (e =0), each with a bias current of 0.93 and a =0.9. Notice the effect of the sinusoidal forcing on the shape of the membrane potential in comparison to the internally driven LIFs of Section 3.1. (d) The relative phase of the neurons shows clearly a mode of 2:3, as expected. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.) and the simplicity of the ratio of the natural frequencies Z1 If2. One the consonance to dissonance axis. We first explored their very sim- feature stands out however: near the integer multiples of the tonic ple model in which each noise-free LIF neuron on its own fires at (so ratios of 1, 0.5, 0.33) the output mode locking ratio is around 1. one of the two frequencies in the dyad. These neurons had mutual This may again help to explain the pitch invariance experienced at excitation via alpha functions triggered by each others firings. We octaves by humans and some primates. obtained a plot of the ratio of firing frequencies with coupling to thatwithout coupling, an extension ofthe standard Devil's Staircase 4. Discussion used when one oscillator is driven uni-directionally by an exter- nal periodic rhythm. This plot agreed with that found in Shapira 4.1. Effects of coupling and noise on ordering of consonance ratios Lots and Stone (2008), although we had to use a coupling value ten times smaller then theirs. However other interesting features stood We have investigated simple models of nonlinear neural syn- out. We noted that the position of the steps deviated significantly chronization as a basis for ranking ofmusical intervals (dyads) along from the equivalent input ratio, especially for low to mid-range


Fig. 7. Voltage time series as in Fig. 6 (Perfect 5th), but now with a coupling strength of ε = 0.3. Here, it is quite easy to see a mode of 2:3 occurring in the lower panel.
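The relative firing phase plotted in the lower panels of Figs. 6-8 can be computed directly from the two spike-time vectors. The MATLAB sketch below is illustrative (spk1 and spk2 are assumed to hold the spike times of LIF1 and LIF2 from a simulation of Eq. (7)): the phase of LIF1 is taken to grow linearly from 0 to 2π between its own firings, and is read off at each firing of LIF2.

% Sketch: relative phase of LIF1 at the firing times of LIF2
relphase = nan(size(spk2));
for k = 1:numel(spk2)
    i = find(spk1 <= spk2(k), 1, 'last');            % last LIF1 firing before spk2(k)
    if ~isempty(i) && i < numel(spk1)
        relphase(k) = 2*pi*(spk2(k) - spk1(i)) / (spk1(i+1) - spk1(i));
    end
end
plot(spk2, relphase, '.');
xlabel('Time');  ylabel('Phase of LIF1 at firings of LIF2');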


Fig. 8. Voltage time series as in Fig. 7, but here with the coupling strength increased to e = 0.5. Notice the emergence of a new firing pattern for LIFl (and for LIF2 as well, although more visually subtle). The resulting mode is significantly different than that seen in Fig. 7 for smaller coupling. input ratios. This deviation pretty much disappeared as this ratio faster than without coupling. This nevertheless means that a con- reached 1, but continues on above 1 (in fact, the value of 1 seems sonance theory based on these simple neuron models must be to be a pivot point around which the Devil's Staircases rotate as re-interpreted to address this mismatch. This is especially so for coupling strength varies). In other words, the plot deviated signifi- the precise significance of the match between the orderings of the cantly from the diagonal. This means for example that input tones step sizes (robustness ofmode locking ratios) and ofthe consonance that are, e.g., in a 2:3 pattern (0.66 frequency ratio) produce a mode rankings, since a given dyad (apart from unison) does not generally locking pattern that is not 2:3, but rather corresponds to a higher cause locking at the same ratio. ratio. Our work further revealed that staircases are also seen for lower Such deviations are intuitively expected. The excitatory cou- coupling strengths, with similar but not identical ordering of phase pling produces a time-varying input, the mean of which acts locking ratios to that seen for larger coupling (Tables 1 and 2). It also as an increased bias. Therefore, with coupling each neuron fires becomes more difficult to precisely estimate the size of the steps as the coupling strength is reduced, as the staircase acquires many steps of similar sizes, with all steps tending to lie on the diagonal.


350 400 450 Stimulus Frequency (Hz) Fig. 9. The average firing frequency of a single LlF in Eq. (7) (uncoupled to the other LlF) as a function of stimulus frequency is plotted for a fixed bias current value Bias Current ( black - 0.9, red = 0.92, green = 0.94, blue - 0.96). for a constant stimulus amplitude of A = 0.2. Notice that only the highest bias current of 0.96 (blue) is capable of inducing Fig. 10. Average firing frequency of a single uncoupled neuron in Eq. (7) as a function spikes in the LIF neuron model across the entire octave spanned from 256 to 512 Hz. of bias current for stimulus tones of amplitude A1= A2 =0.2 presented at 256 (royal Here, the LIF being externally forced by a stimulus frequency of 512 Hz generates blue), 288 (green), 384 (red), and 512 Hz (turquoise). For the given amplitude, none spikes at 256 Hz, down-converting the signal by a factor of 2. (For interpretation of of the stimulus tones bring the LlF neuron model to threshold for bias current values the references to color in this figure legend, the reader is referred to the web version below 0.89. (For interpretation of the references to color in this figure legend, the of the article.) reader is referred to the web version of the article.) 104 B. Heffernan, A. Longtin /Journal ofNeuroscience Methods 183 (2009) 95- 106 The effect of the noise was to jitter the spike times and mode locked firing patterns, with the result that the overall shape of the firing mode lockings, as well as phase paths (not shown) was preserved. However, the staircase had less visible fine struc- ture and more "rounded" steps. The orderings based on the steps that had measurable widths remained as in the noiseless case, as observed in externally forced neural models (Chacron et al., 2000). It would appear however that, unless, e.g., yet-to-be-modeled net- S 0.85 work effects come to the rescue, or noise-induced firings from sub-threshold dynamics play an important role, the theory could e 0.8 quickly lose its firm footing when the noise becomes moderate and &0.75 the steps are washed away. It remains to be seen what noise level is actually at work in the cells responsible for consonance perception. Perhaps the averaged mode locking ratios play a significant role in such perception as they do in determining, e.g., the shape of tuning curves (Longtin, 2001 ). Finally we have considered another novel modeling step that 06 0.7 0.8 relaxes the rather unphysical requirement that each model neuron Ratio of Frequencies fire at one of the frequencies in the interval. The model LIFs are now Fig. 11. Devil's Staircase for sinusoidally forced LIFs in Eq. (7) (?? = A2 = OA, e = 0.4, driven by sinusoidal signals of adjustable frequency and amplitude. I1 =/2 = 0.98). Notice that the first and second octaves clearly mode lock to 1:1. By adjusting the mean bias to the model cells, it is possible to choose the mean output firing rate ofan LIF for a given forcing. This can then be calibrated against neural data in a structure that is putatively Again this is intuitively correct, since in the limit of no coupling, involved in consonance or dissonance ranking. This frequency can the ratio in equals the ratio out. The fact that the ordering varies be made lower than the driving frequency, thus implementing a with coupling strength suggests that it is not universal. 
It may be simple down-conversion offiring rates in this system (which would that the coupling present in the nervous system is such that it does add to earlier effects such as stochastic phase locked firing in the have some degree of similarity with the subjective scale. This raises cochlear nerve). the issue of why such a coupling would be chosen. Such a model LIF neuron with sinusoidal fording is also known Fig. 5 showed that, for very small e values (here e = 0.08), the to exhibit phase locking to one sinusoidal input, with the usual uni- mode locked ratios lie very close to the diagonal. This result should versal orderings that characterize nonlinear systems (Keener et al., not be immediately dismissed due to a lack ofa meaningful ordering 1981)—i.e., there is already a Devil's Staircase at the single neuron of mode stability with regard to consonance rankings. Throughout level, prior to the coupling. Consequently, plots of firing frequency human history, the experience of dyads has almost exclusively been versus bias current are non-monotonic and exhibit much struc- of those in their complex form (until very recently, through the ture. Plots of firing frequency versus tone frequency nevertheless advent of sound cards, synthesizers, etc.). If we had never heard decay monotonically, a consequence of the fact that the LIF acts as complex tones, we might not have any preferential regard for any a low-pass filter. dyad whatsoever, since, within a certain range, they may all be of Pulse-coupling two such LIFs with excitatory synapses produces nearly identical stability (as per Fig. 5). However, since this is clearly dynamical effects that will require much effort to analyze in detail. not the case, and we always experience a roughness proportional to Our more limited goal here is to see whether there is any semblance the 'complexity' of the dyad (or its associated ratio), this assessment of mode locking as in the Stone and Shapira Lots model and/or to of 'roughness' may have become engrained or hard-wired. That is, consonance rankings. The simple answer is that there isn't any sim- from infancy, we begin to associate this roughness with the dyads, ple semblance. Several difficulties arise due to the mode locking and thus our perceptual (and cultural) preference develops as a that follow multiple complex staircases. However, one order seems result. And pure toned dyadic experiments using left-ear right-ear apparent: for simple integer ratios of stimulus frequencies, the two separation oftones may still result in the same preferential rankings LIFs are close to 1 :1, i.e., the oscillators entrain one another (Fig. 11 ). as a result of hard-wired responses. Such entrainment may play a role in the perceptual effect of pitch Some of the ratios found in the model are perceptually invariance at octaves by signaling the presence of a simple input equivalent, in that the acoustic stimuli they generate are not distin- ratio. It may be a neural state of synchrony that is associated with guishable to the average listener. This emphasizes that numerous the subjective experience ofconsonance. It remains to be seen what ratios of varying complexity can be used to describe the same tonal feature of this entrainment, such as the width of the peaks in Fig. ? , phenomena, suggesting that there is a priori no special part played might be associated with the ranking of consonance. by simple ratios. 
It may also be that a relabeling according to the simplest ratio within the range generating a single percept will still 4.3. Model extensions result in a one-to-one correspondence with the global subjective rankings of consonance. For instance, relabeling the Tritone as 5:7 We have chosen equal amplitudes for the pure tones in Eq. (7). results in a better placement in terms of its actual consonance rank- This formulation allows an investigation of stimulus intensity by ing. This results in a pitch difference of roughly 8 cents from the varying the amplitudes in a manner that relates, e.g., nonlinearly, commonly employed ratio of 32:45, which is the limit of percep- to that of the actual acoustic intensities. These elaborations will tion of the best professional piano tuners, and well inside the range be explored elsewhere, building on the equi-amplitude picture dis- of discrimination of the average ear. cussed here. The effect of noise on the pulse-coupled LIF system with sinusoidal forcing will also be of interest. The program of fully 4.2. Enhanced models analyzing Eq. (7), to look at effects of frequency, amplitude, mem- brane and coupling time constants, etc., is a hefty one, and should We then considered, for added biophysical plausibility, the effect begin with a proper non-dimensionalization to reduce the number of additive Gaussian white noise on this synchronization theory. of parameters. B. Heffernan. A. Longtin /Journal of Neuroscience Methods 183 (2009) 95- 106 105

Table A1
The actual mode stability measurements used to produce the rankings shown in Tables 1 and 2, computed from our simulation results. The final column shows the measurements indicated in Shapira Lots and Stone (2008), which were used to produce the associated ranking shown in Table 1.

Interval        Interval ratio as indicated in Shapira Lots and Stone (2008)    Consonance ranking (Schwartz et al., 2003)    Stability of actual mode acquired (ε = 0.8)    Stability of mode corresponding to ratio of the interval (ε = 0.8)    Stability of mode corresponding to ratio of interval (ε = 0.5)    Stability of mode corresponding to ratio of interval indicated in Shapira Lots and Stone (2008) (ε = 5)
Unison          1:1      1        0.11150    0.11150    0.08400    0.075
Octave          1:2      2        0.00388    0.03691    0.02505    0.023
Perfect 5th     2:3      3        0.00261    0.02917    0.02648    0.022
Perfect 4th     3:4      4        0.02126    0.02201    0.01488    0.012
Major 6th       3:5      5/6      0.00768    0.01687    0.01480    0.010
Major 3rd       4:5      6/5      0.01231    0.02391    0.01306    0.010
Minor 3rd       5:6      8        0.00408    0.02126    0.00970    0.010
Minor 6th       5:8      7        0.00916    0.01236    0.00398    0.007
Major 2nd       8:9      10/11    0.00773    0.00408    0.00573    0.006
Major 7th       8:15     12       0.02917    0.00724    0.00074    0.005
Minor 7th       9:16     11/10    0.00308    0.00657    0.00086    -
Minor 2nd       15:16    13       0.00172    0.00159    0.00164    -
Tritone         32:45    9        0.00239    0.00768    0.00906    -

It will also be interesting to extend the models presented here Acknowledgements to even more realistic neural populations which somehow share the information about the complex tones. A specific possibility is to This research was supported by the Natural Sciences and Engi- assume that one neuron (or neural sub-population) is being driven neering Research Council of Canada. The authors acknowledge by one tone (or complex tone) at one amplitude, but also by the useful conversations with Andreas Daffertshofer and Len Maler and other tone (or complex tone) at a different amplitude. The issue technical help from Jason Boulet. here is one of spectral receptive field, in which the proximity of tones in frequency space will reflect the proximity of the neurons they respectively excite, according to the tonotopic map. Accord- Appendix A. ingly, if the tones are close (as in a Minor 2nd) the neurons being excited may have very similar input made up of both tones with See Table Al. approximately equal weight. On the other hand, for tones much fur- ther apart in frequency (such as for a Perfect 5th with a 3:2 ratio) References one neuron will be driven more by one tone than the other, and vice versa for the other. It will thus be interesting to investigate Aldwell E. Schachter C. Harmony and Voice Leading. 2nd ed. San Diego, CA: Harcourt the synchronization properties of coupled populations of such neu- Brace Jovanovich: 1989. Berman N, Maler L Neural architecture of the electrosensory lateral line lobe: rons with dyad-dependent mixing ratios of their inputs amplitudes. adaptations for coincidence detection, a sensory searchlight and frequency- One can only speculate at this point on the resulting phase locking dependent adaptive filtering. J Exp Biol 1999:202:1243-53. structure and its robustness to noise. Bregman AS. Auditory Scene Analysis: The Perceptual Organization of Sounds. Cam- bridge, MA: MIT Press; 1990. Note that, although we have not investigated this point here, the Cariani P. Temporal codes, timing nets, and music perception. J New Music Res sinusoidally forced model formulation enables the explorations of 2001:30(2): 107-35. loudness effects on consonance evaluation, by adjusting the ampli- Cariani P. A temporal model for pitch multiplicity and tonal consonance. In: Pro- ceedings, int. conf. music perception 8i cognition (ICMPC); 2004. tude of the sine waves in some proportion to the stimulus intensity. Carr CE. Timing is everything: organization of timing circuits in auditory and elec- It may also be of interest to study how fast mode locking is estab- trical sensory systems. J Comp Neurol 2004:26:131-3. lished under different modeling scenarios and see if there is any Cartwright JH, Gonzales DL, Piro O. Pitch perception: a dynamical-systems perspec- tive. PNAS 2001 ;98:4855-9. qualitative agreement with psychophysical studies on the speed of Chacron M, Longtin A, St-Hilaire M, Maler L. Suprathreshold stochastic firing dynam- consonance ranking. Preliminary results indicate that rapid locking ics with memory in P-type electroreceptors. Phys Rev Lett 2000;85:1576-9. occurs for Eq. (7). It will also be interesting to see what inhibitory Chialvo DR, Calvo O, Gonzalez DL, Piro O, Savino GV. Subharmonic stochastic syn- connections add to the dynamical behaviors discussed here, as it chronization and resonance in neuronal systems. Phys Rev E 2002:65:050902. Chialvo DR. How we hear what is not there: a neural mechanism for the missing is also known to cause mode locking. 
It may allow special forms fundamental illusion. Chaos 2003;13:1226-30. of mode locking all the while keeping the firing frequencies close Coombes S, Lord GJ. Intrinsic modulation of pulse-coupled integrate-and-fire neu- to their values in the absence of coupling. In other words, out- rons. Phys Rev E 1997;56:5809-18. Coombes S, Bressloff P. Mode lockingand Arnold tongues in integrate-and-fire neural put ratios may be closer to the input ratio. Another avenue is to oscillators. Phys Rev E 1999;60:2086-96. investigate the connection between mode locking and the pitch EggermontJJ. The correlative brain. Berlin: Springer; 1990. salience seen in timing net approaches that involve population Fishman Y, Volkov I, Noh M, Garell P, Bakken H, Arezzo J, et al. Consonance and dissonance of musical chords: neural correlates in auditory cortex of monkeys level interval estimation and coincidence detectors (Cariani, 2001, and humans. J Neurophysiol 2001 ;86(6):2761 -88. 2004). Glass L, Mackey MC. From clocks to chaos. Princeton: Princeton University Press; Finally, data on humans and primates need to be reconciled with 1988. Helmholtz H. On the sensations of tone as a physiological basis for the theory of the synchronization picture, as for any other theory of consonance. music. New York, NY: Dover Publications; 1877. Firing rates are much lower than tone frequencies; there are many Hilborn RC. Chaos and nonlinear dynamics. Oxford: Oxford University Press; 1994. random firings; and dissonance has been claimed to be associated Itoh K, Suwazono S, Nakada T. Cortical processing of musical consonance: an evoked with cortical firing rates modulated at slow beat frequencies, rather potential study. NeuroReport 2003;l4(18):2303-6. Joris PX, Schreiner CE, Rees A. Neural processing of amplitude-modulated sounds. than complex locking ratios (see, e.g., Fishman et al. (2001 ) for data Physiol Rev 2003:84:541-77. on macaque monkeys and event-related potentials in humans). It Kameoka A, Kuriyagawa M. Consonance theory I. Consonance of dyads. J Acoust Soc will thus be important to investigate how the activity from neu- Am 1969:45:1451-9. Keener JP, Hoppenstadt FC, Rinzel J. Integrate and fire models of nerve membrane ron models can be brought into correspondence with multi-unit response to oscillatory input. SIAMJ Appi Math 1981,41:503-17. activity in cortex. Longtin A. Stochastic resonance in neuron models. J Stat Phys 1993;70:309-27. 106 B. Heffernan. A. Longtin /Journal o¡ Neuroscience Methods 183 (2009) 95-106

Longtin A. Effect of noise on the tuning properties of excitable cells. Chaos Solitons Fractals 2001;11:1835-48.
Meyer LB. Emotion and Meaning in Music. Chicago, IL: University of Chicago Press; 1956.
Middleton JW, Longtin A, Benda J, Maler L. The cellular basis for parallel neural transmission of a high-frequency stimulus and its low-frequency envelope. Proc Natl Acad Sci USA 2006;103:14596-601.
Peretz I, Blood AJ, Penhune V, Zatorre R. Cortical deafness to dissonance. Brain 2001;124:928-40.
Pikovsky A, Rosenblum M, Kurths J. Synchronization: a universal concept in nonlinear sciences. Cambridge, UK: Cambridge University Press; 2003.
Plomp R, Levelt WJM. Tonal consonance and critical bandwidth. J Acoust Soc Am 1965;38:548-60.
Roederer JG. The physics and psychophysics of music: an introduction. Berlin: Springer Verlag; 1995.
Schellenberg E, Trehub S. Frequency ratios and the discrimination of pure tone sequences. Percept Psychophys 1994a;56:472-8.
Schellenberg E, Trehub S. Frequency ratios and the perception of tone patterns. Psychonom Bull Rev 1994b;1:191-201.
Schwartz D, Howe C, Purves D. The statistical structure of human speech sounds predicts musical universals. J Neurosci 2003;23(18):7160-8.
Shapira Lots I, Stone L. Perception of musical consonance and dissonance: an outcome of neural synchronization. J R Soc Interface 2008;5(29):1429-34.
Tramo MJ, Cariani PA, Delgutte B, Braida LD. Neurobiological foundations for the theory of harmony in Western tonal music. Ann N Y Acad Sci 2001;930:92-116.
Tramo MJ, Cariani PA, Koh CK, Makris N, Braida LD. Neurophysiology and neuroanatomy of pitch perception: auditory cortex. Ann N Y Acad Sci 2005;1060(1):148-74.

Chapter 4

Article II

This chapter is an unpublished manuscript for the article:

B. Heffernan. A. Longtin. The ghost stochastic resonance as an effective coincidence detector for the measurement of musical consonance assessments in humans. November, 2009.

A. Longtin's contributions were supervisory in nature.


4.1 Background

Perhaps the most fundamental question in music - and arguably the common denominator of all musical tonality - is why certain combinations of tones are perceived as relatively consonant or 'harmonious' and others relatively dissonant or 'inharmonious'. -Neuroscience 2009, Purves et al.

Indeed, this fundamental question in music has been grappled with since the days of Pythagoras, and modelers attempting to reconcile their abstractions with the psychoacoustic data have been and continue to be plentiful. Of all the models with which the authors are familiar, however, none appears to offer nearly the robustness of the 'Temporal model for pitch multiplicity and tonal consonance' due to Cariani, as described in his paper of the same title and conceptually developed throughout a series of related papers [10, 11, 7, 27, 8].

Cariani's work is partly motivated by Rameau's theory of the 'basse fondamentale', which essentially states that the greatest common subharmonic of all tones composing a sounded dyad or a chord will be the perceived pitch of the sound image, as well as by Stumpf's theory of tonal fusion, which states that the greater the tendency for sounds to cohere into a single sound image, the greater their consonance. As has been previously posited, and as Cariani shows, a feasible realization of these theories can take place if the neural substrate is capable of acting as an autocorrelator. If so, the consonance of any musical interval can be measured as the signal-to-noise ratio (SNR) of the most prominent peak in the auto-correlogram of the auditory nerve's global population-wide interval statistics to its surrounds (the background) - referred to as the pitch salience by Cariani.

Briefly, Cariani observed that the auditory nerve's response to a periodic stimulus (of frequency f0) results in phase-locked firing patterns in the individual nerve fibers. The population-wide all-order interspike interval distributions (PIDs) formed by adding together the all-order interspike intervals of each nerve fiber result in distinct intervals that occur at the pitch period (T0 = 1/f0) and its multiples (Ti = iT0 = i/f0 - its undertones or subharmonics). This can be seen quite clearly in Fig. 4.1(a) below, where the perfect fourth dyad stimulus - here composed of two pure tones of 450 Hz and 600 Hz (a ratio of 3:4) - elicits a PID with a periodicity of 1/gcd(450, 600) = 1/(150 Hz) ≈ 6.66 ms, where gcd refers to the greatest common divisor of the two numbers.

By simulating the response of the auditory nerve using a statistical model, and by then analyzing the PID that results from any given western dyad using subharmonic sieves - thereby effectively excluding any component whose frequency lies more than some fixed percentage away from a subharmonic - Cariani obtains a measure of pitch salience for all of the pitches present therein. The resulting plot of maximum saliences observed is remarkably similar to the psychoacoustic consonance data obtained experimentally by Kameoka and Kuriyagawa, as can be seen in Fig. 4.2. As Cariani notes:

Harmonically-related pitches share intervals at their common subharmonics, so that their respective interval patterns interfere the least vis-a-vis the salience measure, while unrelated pitches reduce the salience of each other (because interval peaks in one raise the mean density for the other).

Effectively, the nearer the greatest common subharmonic is to the tones of a given interval, the greater the measure of pitch salience. This appears to overlap with Terhardt's virtual pitch model, which establishes a correspondence between the 'tonal meaning' of an interval (i.e. its implied bass note determined by the fundamental frequency) and musical consonance, where the tonal meaning of harmonically-related pitches is itself a pitch in the chromatic scale.
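To make the salience measure described above concrete, the following MATLAB sketch applies a subharmonic sieve to a population-wide all-order interval histogram. It is schematic rather than a reproduction of Cariani's exact procedure: pid and lags are assumed inputs (the histogram counts and their bin centres in seconds), and the sieve tolerance, number of subharmonic windows and candidate period range are arbitrary choices.

% Sketch: pitch salience via a subharmonic sieve applied to a PID
tol = 0.04;                              % assumed fractional width of each sieve window
Ts  = linspace(1/1000, 1/60, 400);       % candidate pitch periods (60-1000 Hz)
salience = zeros(size(Ts));
for j = 1:numel(Ts)
    insieve = false(size(lags));
    for m = 1:8                          % first eight subharmonic windows of the candidate
        insieve = insieve | abs(lags - m*Ts(j)) < tol*m*Ts(j);
    end
    % salience of this candidate period: sieve density relative to background density
    salience(j) = mean(pid(insieve)) / mean(pid(~insieve));
end
[maxsal, jmax] = max(salience);          % maximum pitch salience; Ts(jmax) is its period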

The robustness of this temporal coding scheme can be seen in a more global analysis of pitch and its neural correlates by Cariani & Delgutte as they conclude, among other things, that: "Interval distributions in populations of neurons constitute a general, distributed means of encoding, transmitting, and representing information. Existence of a central processor capable of analyzing these interval patterns could provide a unified explanation for many different aspects of pitch perception." We will show that a basic neuron model driven by a pure-tone dyad stimulus in the subthreshold regime with noise is both a viable and computationally inexpensive dynamical model of the auditory nerve (at least insofar as the processing of musical intervals is concerned), and that it is also capable of extracting (virtual) pitches due to the ghost stochastic resonance effect (as has already been established more generally for generic subthreshold oscillators by Chialvo), thus providing a candidate 'central processor'. By measuring the maximum pitch salience (similarly to Cariani, but not identically) we are able to show that the same model is capable of providing an explanation of why certain dyads are more consonant than others - both in the sense of psychoacoustic consonance as well as musical consonance. Since the very same neuron model is already known to exhibit the stochastic resonance, the model can clearly also account for the pitch extraction of a single pitch-evoking stimulus, and, as we know due to Chialvo, the so-called ghost stochastic resonance (GSR) can account for the perceptual auditory illusion known as the 'missing fundamental'. Thus, the results presented herein strengthen the viability of such a computational scheme being present in (and across) the auditory pathway - especially in light of recent neuroanatomical findings (see the discussion for more details).

The paper proceeds by first presenting the model, the methods of simulation and the results. An analysis of the results is then given, followed by a discussion and concluding remarks.

4.2 Model

The conceptual model is shown in Fig. 4.3. A single subthreshold leaky integrate-and-fire (LIF) neuron model was driven by a dyad composed of two pure tones (sinusoids). The LIF is described by the following differential equation:

dV/dt = -V/τ + I(t) + ξ(t),    (4.1)

where

I(t) = I0(1 + (A1 sin ω1t + A2 sin ω2t + ... + An sin ωnt)),    (4.2)

and

ω1 = kω0, ω2 = (k + 1)ω0, ..., ωn = (k + n - 1)ω0, with k ≥ 1 ∈ ℤ and all Ai ≥ 0 ∈ ℝ.

Here ωi = 2π/Ti, where Ti is the period of the i-th signal component, and T0 = 2π/ω0 is the period of the 'missing fundamental' ω0. ξ(t) is Gaussian white noise with zero mean and intensity D, with autocorrelation ⟨ξ(t)ξ(s)⟩ = 2Dδ(t - s). In all instances the time constant of the leak current was taken to be τ = 1. The threshold for spiking was similarly set to 1; whenever V reaches this threshold a spike is recorded and V is reset to zero. Thus any value of I0 smaller than 1 does not induce spikes in the absence of noise and an external signal (i.e. all Ai = 0); this is the subthreshold regime. Note that since the stimulus is a dyad, only two of the Ai above are nonzero.
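To make the integration scheme concrete, the following is a minimal MATLAB sketch of eqs. (4.1)-(4.2) for a perfect-fifth dyad (2ω0 : 3ω0). The Euler-Maruyama step and the parameter values anticipate Section 4.3, while the variable names and spike bookkeeping are illustrative rather than taken from the thesis code of Appendix A.

% Sketch: Euler-Maruyama integration of the subthreshold LIF, eqs. (4.1)-(4.2),
% driven by a perfect-fifth dyad (2*w0 : 3*w0). Parameter values as in Section 4.3.
w0 = 0.256;  I0 = 0.9;  A = 0.05;  D = 7e-4;  tau = 1;
dt = 3e-3;  tfin = 3e5;  vth = 1;  vreset = 0;
v = 0;  spikes = [];
for t = 0:dt:tfin
    I = I0*(1 + A*sin(2*w0*t) + A*sin(3*w0*t));      % dyad stimulus, eq. (4.2) with k = 2, n = 2
    v = v + dt*(-v/tau + I) + sqrt(2*D*dt)*randn;    % eq. (4.1): leak, drive, and white noise
    if v >= vth                                      % threshold crossing
        spikes(end+1) = t;                           % record the spike time
        v = vreset;                                  % reset the membrane potential
    end
end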

4.3 Simulation Methods

The LIF was initially tuned with noise intensity D = 7 × 10^-4, with I0 = 0.9, A = 0.05, and ω0 = 0.256 rad/s in order to replicate the results of Barbi et al. [1], as shown in Fig. 4.4. In this case, the stimulus current was taken to be I(t) = I0(1 + m sin ω0t) in order to ensure that the LIF was properly exhibiting the stochastic resonance effect. Notice that the timescale is indicated in periods of ω0, and is thus identical for both plots. However, the scaling of the ordinate gives a clear indication that we used a longer simulation end-time as well as shorter bin widths along the abscissa, which accounts for the smoother appearance and the noticeably more detailed fine structure of the ISI histogram (ISIH) in Fig. 4.4(a) when compared to Fig. 4.4(b). Of utmost importance here is the clearly pronounced maximum resonance of the LIF at the first period of the signal - i.e. the ω0 stochastic resonance - as well as the generally identical qualitative structure of the figure.

With the model tuned to exhibit the stochastic resonance of the LIF, with particular sensitivity to periods of the fundamental frequency ω0, a new stimulus composed of two frequency components (in the ratio of a western dyad) was presented. The stimulus tones were set as ratios of ω0 (e.g. a perfect 5th, having a ratio of 2:3, was presented as 2ω0 : 3ω0 - see Table 4.1 for the full list of western dyads and their respective component ratios). This mimics Chialvo's work regarding the ghost stochastic resonance in neuronal systems [14], with the notable difference being that only the two harmonics of ω0 corresponding to the designated ratio of the dyad are presented, as opposed to a whole range of harmonics of ω0 (often k = 1-7).

The model was implemented by numerical simulation using MathWorks' MATLAB. All simulations were run with a fixed step size of Δt = 3 × 10^-3 and a simulation end time t_final = 3 × 10^5.

4.4 Results

Identical Parameterization

The first results show the first-order ISIs for each dyad (Fig. 4.5, Fig. 4.6) and subsequently the all-order ISIs alongside the positive part of the autocorrelation plots (Fig. 4.7, Fig. 4.8, Fig. 4.9, Fig. 4.10) of their respective dyads under identical parameterizations. Notice the clear degradation of the strength of the resonance at ω0 and its subsequent subharmonics as the intervals become dissonant (according to the median assessment value calculated by Schwartz et al. (2003) [24]). Although it is appealing to conclude that this seeming inability to track the fundamental frequency ω0 is what distinguishes a consonant interval from a dissonant one, the weakness of such tracking is largely owing to the low-pass filtering effects of the LIF, which are well known and were expected by the authors given the dramatically higher frequency components of the more dissonant dyads (the tritone, for instance, is composed of 32ω0 : 45ω0, compared to 2ω0 : 3ω0 for the perfect 5th!).
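For reference, the quantities plotted in these figures can be computed from the spike times as in the following sketch (illustrative only, not the thesis code; 'spikes' is the vector of spike times produced by the sketch in Section 4.2, and the window and binning choices here are examples):

w0 = 0.256;  T0 = 2*pi/w0;  dt = 3e-3;       % fundamental, its period, and the time step
isi1 = diff(spikes);                          % first-order interspike intervals
allISI = [];                                  % all-order ISIs: spike-time differences of every order
for lag = 1:numel(spikes)-1
    d = spikes(1+lag:end) - spikes(1:end-lag);
    d = d(d <= 10*T0);                        % keep only intervals within a 10*T0 window
    if isempty(d), break, end                 % higher-order intervals only get longer
    allISI = [allISI, d];
end
figure; hist(allISI/T0, 200);                 % all-order ISIH, abscissa in periods of T0
t = 0:dt:200*T0;                              % stimulus waveform for comparison
s = sin(2*w0*t) + sin(3*w0*t);                % the perfect-fifth dyad
c = xcorr(s, round(10*T0/dt), 'unbiased');    % autocorrelation up to 10*T0
c = c(ceil(numel(c)/2):end);                  % keep the non-negative lags
figure; plot((0:numel(c)-1)*dt/T0, c);        % auto-correlogram in periods of T0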

Discounting the Low-pass Filtering Effect

In order to discount the aforementioned low-pass filtering effect of the LIF, the maximum height of the membrane potential V was determined for the ω0 stimulus used in Fig. 4.4, absent all noise. The amplitudes A of the dyad frequency components were then selected and set (identically) such that the membrane potential V attained the same maximum height, again absent all noise (see Table 4.1). This effectively discounts the low-pass filtering by creating what is essentially an equal representation of each dyad across the membrane potential of the LIF, thus rebalancing the previous bias resulting from decreasing membrane potentials in light of a fixed noise intensity. The results show a clear preferred resonance at ω0 as well as a much stronger resemblance to the autocorrelation plots for all dyads (both consonant and dissonant!) and are thus better aligned with the extant physiological data (Fig. 4.11, Fig. 4.12, Fig. 4.13, Fig. 4.14, Fig. 4.15, Fig. 4.16).
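This calibration step can be sketched as follows (a sketch only, not the thesis code; the tritone ratio, the scan range and step, and the twenty-period horizon are illustrative assumptions, with the noise-free LIF integrated by forward Euler):

targetV = 0.9436;  w0 = 0.256;  I0 = 0.9;  dt = 3e-3;   % reference peak V from Table 4.1
r = [32 45];                                  % example dyad ratio (the tritone)
t = 0:dt:20*(2*pi/w0);                        % twenty periods of the fundamental
best = NaN;
for A = 0:2e-3:0.3                            % coarse scan over candidate amplitudes
    I = I0*(1 + A*sin(r(1)*w0*t) + A*sin(r(2)*w0*t));
    v = 0;  Vmax = 0;
    for kk = 1:numel(t)                       % Euler step of dV/dt = -V + I(t), no noise, no reset
        v = v + dt*(-v + I(kk));
        Vmax = max(Vmax, v);
    end
    if Vmax >= targetV, best = A; break, end  % first amplitude whose noise-free peak reaches the target
end
fprintf('A = %.4f gives a noise-free peak V of %.4f\n', best, Vmax);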

4.5 Analysis

In order to analyze the data from both the 'identical parameterization' and the 'low-pass discounting' regimes, two different signal-to-noise ratios were used. In each case, the maximum peak of the all-order ISI histogram within 0.5T0 < t < 2.5T0 was selected as representing the signal, thus producing an interval window (in periods of T0) similar to the biophysically realistic one employed by Cariani [9].

Dyad                Ratio    Median Con. Val.    Max V     Amplitude (A)
Unison              1:1      1                   0.9436    0.025
Perfect 5th (P5)    2:3                          0.9436    0.0288
Perfect 4th (P4)    3:4                          0.9436    0.0325
Major 6th (M6)      3:5      5/6                 0.9436    0.03455
Major 3rd (M3)      4:5      6/5                 0.9436    0.03685
Minor 3rd (m3)      5:6                          0.9436    0.0417
Minor 6th (m6)      5:8                          0.9436    0.04625
=========================================================================
Major 2nd (M2)      8:9      10/11               0.9436    0.0579
Major 7th (M7)      8:15     12                  0.9436    0.07
Minor 7th (m7)      9:16     11/10               0.9436    0.0765
Minor 2nd (m2)      15:16    13                  0.9436    0.099
Tritone (TT)        32:45                        0.9436    0.233

Table 4.1: The full name and abbreviation, defining ratio, median consonance assessment value [24][16], and the amplitude values (A) used herein to discount the low-pass filtering of the LIF for each of the western dyads. The horizontal double line separates the consonances from the dissonances.

The first signal-to-noise ratio was then computed as the height of this maximum peak versus the noise floor, which was taken to be the mean number of elements per histogram bin within one T0 period of the signal, centered at the signal (i.e. 0.5T0 to either side). The second signal-to-noise ratio was similarly computed by taking the integral of the maximum peak within a ±5% range of the maximum peak of the all-order ISI, again using the same noise floor described above. This mirrors Chialvo's work [14], as it essentially takes the signal to be the entire peak centered about T0, resulting in what the authors believe to be a better representation of the strength of the signal, as implied by a wider and thus 'stronger' pulsing about the peak. The signal-to-noise ratios using the max-peak and integral methods are plotted against the dyads in order of decreasing simplicity of their ratios (i.e. consonant to dissonant), as shown in Fig. 4.17 under the 'identical parameterization' regime, and in Fig. 4.18 for the 'low-pass discounting' regime. Note that all values were collected and averaged over twenty simulations for each of the dyads.
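The two measures can be sketched as follows (building on the all-order ISIH sketch above, with 'allISI' and 'T0' as defined there; the bin width and the exact treatment of the bins around the peak are illustrative approximations of the description in the text, not the thesis implementation):

[counts, centers] = hist(allISI/T0, 0:0.02:10);     % all-order ISIH, abscissa in periods of T0
win = centers > 0.5 & centers < 2.5;                % search window 0.5*T0 < t < 2.5*T0
[pk, imax] = max(counts.*win);                      % maximum peak within the window
loc = centers(imax);                                % its location, in periods of T0
nearPk = abs(centers - loc) <= 0.5;                 % one T0 period centred on the peak
noiseFloor = mean(counts(nearPk));                  % mean bin count near the signal (the 'floor')
snrMaxPeak = pk/noiseFloor;                         % measure 1: peak height vs. noise floor
inPeak = abs(centers - loc) <= 0.05*loc;            % bins within +/- 5% of the peak location
snrIntegral = sum(counts(inPeak))/noiseFloor;       % measure 2: peak integral vs. noise floor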

The same SNR measures are then identically applied to the data after it has been smoothed using a Savitzky-Golay smoothing filter (degree 7) [22], which essentially performs a local polynomial regression on the data. This was done in order to remove the jaggedness incurred by the discretized binning of the histogram while preserving features of the histogram such as relative maxima, minima and width; features which tend to be flattened out by other averaging techniques such as moving averages. The signal-to-noise ratios using the max-peak and integral methods are plotted as above in Fig. 4.19 and Fig. 4.20.

In all cases, the signal-to-noise ratio is used to measure 'pitch salience', the hypothesis being that the rank ordering of the dyads by the pitch of maximum salience in the all-order ISI corresponds with the rank ordering of the global subjective assessments of their consonance.
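A sketch of the smoothing step described above (it requires MATLAB's Signal Processing Toolbox; the polynomial degree follows the text, while the 21-bin frame length is an illustrative choice, not a value taken from the thesis):

smoothCounts = sgolayfilt(double(counts), 7, 21);   % degree-7 local polynomial regression over 21-bin frames
smoothCounts = max(smoothCounts, 0);                % clip the small negative ripples the fit can produce
% the max-peak and peak-integral SNR measures are then re-applied to smoothCounts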

4.6 Discussion

The plots of all-order ISIs for the amplitude-adjusted pure-tone dyad stimuli driving our LIF model show a very strong qualitative correspondence to the non-negative portion of the same pure-tone dyad stimuli's auto-correlograms, as well as to the all-order ISIs of the physiological data of the cat auditory nerve presented with similar dyad stimuli. This suggests that the subthreshold LIF with noise, tuned to exhibit the GSR, has a spike-train extremely similar to that of the auditory nerve as a whole, and that this simple model may in fact be a good (and computationally inexpensive) model of the auditory nerve in general. The exhibition of the GSR by each of the western dyads is itself novel to the best knowledge of the authors, although it was not unexpected.

Although the pitch salience measures used here varied in their ability to properly rank-order the dyads according to the perceptual quality of consonance, there is an undeniable consistency across them in properly ranking the consonances versus the dissonances as groups (i.e. all the consonances rank higher than the dissonances). Furthermore, by dismissing the 'identical parameterization' results due to their low-pass bias, as reason permits, a greater consistency can be witnessed in the remaining results. Further selecting the 'peak integral' SNR measure as the better of the two at capturing the relative strength of the peaks, and thus the pitch salience of the dyads, a greater distinction between the consonances and the dissonances emerges, and the rank-ordering of the dyads largely follows the median values for psychoacoustic consonance calculated by Schwartz et al. [24]. Our results thus successfully predict the grouping of consonances and dissonances. More strikingly, the results perfectly predict the rank-ordering of the dyads along the dimension of musical consonance [25], as witnessed by the monotonically descending plots in Fig. 4.19(b) and Fig. 4.20(b) (from the simplest to the most complex ratio).

Therefore, in the spirit of Rameau, the authors contend that the 'ability' of a musical interval to elicit a stochastic resonance at its greatest common subharmonic (or ω0 as used throughout this paper) - all else being equal - appears to rank order perfectly with musical consonance, which itself can be rank ordered by the simplicity of the ratio of the intervals. In a sense, then, the pitch salience measure for the pure-tone dyads corresponds perfectly with the tendency of the interval to invoke resolution in a musical piece, as resolution is accomplished by moving from a dissonant or unstable interval or chord to a consonant or stable-sounding one. Thus, the results shown in Fig. 4.18(b) and Fig. 4.20(b) show the model to be - to the best of our knowledge - the first biophysically plausible mechanism to properly predict the musical consonance of the western music intervals.

The reason(s) for the discrepancy between the authors' results and those of Cariani, insofar as psychoacoustic consonance determinations are concerned, are likely a result of one or more of the following factors:

• although the model was tuned to exhibit a strong resonance at ω0, a thorough sweep of the parameter space (bias, stimulus amplitude, and noise intensity) was not performed in order to find the optimal parameterization that maximizes said resonance.

• the variable exponential-weighting scheme employed by Cariani differs from that used by the authors, who chose to use more common SNR measures.

• the authors employed a Savitzky-Golay filter, whereas Cariani made use of a moving average.

• the fixed width of the peak integral (±5%) weighting employed here may have produced a bias (in some cases not being wide enough to account for the entire width of the peak, thus diminishing the SNR, and in others being too wide, thus exaggerating the SNR).

• the LIF is too simplistic a model to capture all the subtle behavior of more biophysically realistic neuron models (such as the Hodgkin-Huxley model).

• the statistical model used by Cariani to reproduce the auditory nerve's response to a dyad stimulus is by design statistically identical to the actual auditory nerve, and is thus unlikely to be perfectly matched by such a simple dynamical model.

• there is much discrepancy between the results of studies of psychoacoustic consonance of dyads, and thus agreement or disagreement with any one such study, or with the median values of several studies, does not necessarily imply a stronger result (see Purves et al. for evidence of the variance across psychoacoustic studies). The general grouping of consonances and dissonances, however, is largely agreed upon.

The results are exciting in that there is biological evidence that the cat's inferior colliculus is "tonotopically organized into laminas exhibiting constant frequency ratios between corresponding locations in adjacent layers" [24][19], that the architecture of the cat's inferior colliculus suggests it is adapted for the extraction of the fundamental frequency of sounds (i.e. via GSR), and that "perceptions of consonance and dissonance might be a consequence of this functional organization" [24][3]. This agrees with the conclusion that a deeper processing layer of the auditory system likely exhibits the GSR, owing to the observation that dichotically presented dyads (where one tone stimulates the right ear, and one the left) still elicit the perception of the missing fundamental [5].

The very recent results of Bidelman and Krishnan indicate that the human brainstem's responses to the western intervals (whose pitch saliences were calculated as per Cariani) are well correlated with the ordering of consonance [2]. Since the experiment records responses from the pre-attentive brainstem, Bidelman and Krishnan infer "that the choice of intervals used in compositional practice may have originated based on the fundamental processing and constraints of the auditory system". This is extremely interesting, especially in light of the research by Schellenberg and Trehub [23] regarding the processing predispositions of infants towards simple-ratio dyads. Indeed, infants only detect changes to pairs of pure tones if the tones are related by simple frequency ratios. Bidelman and Krishnan, citing similar developmental observations, conclude that since the effects observed in infants arise in the absence of long-term enculturation and exposure, it is conceivable that the mechanism(s) responsible for pitch perception develop from 'domain-general processing', governed by the fundamental (or innate) capabilities of the auditory system. Therefore, there may exist a neurobiological predisposition towards the processing of simpler, more consonant intervals that results from the structural (or architectural) design of the nervous system itself. It thus appears that this architecture may very well be nothing more than a number of subthreshold oscillators in a noisy environment, the 'fundamental capabilities' being the extraction of the fundamental frequency ω0, the short-term autocorrelation function of its resulting spike-train, and the computation of a signal-to-noise ratio that corresponds to perceptual stability.

Finally, the results of Cariani are almost certainly complementary to the results described herein insofar as the further elucidation of the subtleties of the perception of consonance and dissonance is concerned. Whereas Cariani's results appear to explain psychoacoustic consonance quite accurately, our results here appear to similarly explain musical consonance. As thoroughly explained by Terhardt, "the experimental results on consonance and roughness appear to be significant and consistent, and thus provide a solid basis for psychoacoustic consonance...However, psychoacoustic consonance is distinctly different from another kind of consonance which plays a basic role in tonal music. The universal importance of harmonic intervals in music cannot be explained satisfactorily by the concept of psychoacoustic consonance" [25].
It is thus that Terhardt concludes that his abstract virtual pitch model may provide the theory of consonance and harmony with a psychoacoustic basis [25]. We have further grounded the theory by linking the psychoacoustics with a low-level physiological representation, further bridging the gap with "high-level musical notions of tension and relaxation" [9] - as is Cariani's stated goal for the field.

4.7 Conclusion

By examining the ghost stochastic resonance of a leaky integrate-and-fire neuron driven by a pure-tone dyad stimulus whose frequency components were determined by the ratios of the western music intervals, we found that the all-order interspike intervals corresponded well with the positive part of the auto-correlogram of the stimulus. This feature, which has been observed in data obtained from physiological experiments, is established here for the first time using a dynamical neuron model.

Our results are also novel in that they reproduce the spike-train of the auditory nerve when presented with a musical interval stimulus using a dynamical model, as opposed to using a stochastic model that pre-processes the waveform with a variety of filters. Finally, the ghost stochastic resonance has never been explicitly established for the western musical intervals, as far as the authors are aware.

4.8 Figures

Figure 4.1: (a) The pitch multiplicity and salience estimation of Cariani [reprinted from ]. A. PID in response to two pure tones a fourth apart, 400 & 587 Hz. B. Representative subharmonic interval sieves for estimating pattern saliences. C. Pitch map: observed distribution of pattern salience values. (b) Our replica of panel A in (a). Here the plot displays the all-order ISIH of the LIF's spike-train subject to a perfect 4th dyad forcing (with D = 7 × 10^-4, I0 = 0.9, A = 0.05, and ω0 = 0.256 rad/s).

Figure 4.2: Comparisons between Cariani's simulated results and the psychoacoustic data of Kameoka & Kuriyagawa [17]. (A) and (C) show the subjective ratings of consonance for pure and complex tone intervals, respectively. (B) and (D) display the respective simulated results obtained by Cariani using the maximum pitch salience measure. Reprinted from [9].

Figure 4.3: (a) A schematic representation of our model [modified from [20]]. The two inputs (i.e. tones) force the model neuron, which is in the subthreshold regime, and produce a resonance at the fundamental frequency (f_r). (b) This schematic shows two model neurons being forced by one input (i.e. tone) each, their action potentials further forcing a third neuron (again in the subthreshold regime with noise) [reprinted from [20]]. It has been shown that the systems behave equivalently for identical stimuli and appropriate parameterizations [15].

Figure 4.4: (a) The interspike interval histogram (ISIH), which represents the number of intervals versus the interval duration and is not normalized, is shown here for the subthreshold LIF driven by a single sinusoidal input. Stochastic phase-locking is evidenced by the decay of the local maxima at consecutive periods of T0 = 2π/ω0. Here D = 7 × 10^-4, with I0 = 0.9, A = 0.05, and ω0 = 0.256 rad/s. (b) Reprinted results of the identical simulation as originally presented by Barbi et al. [1].

Figure 4.5: First-order ISIH's of the consonances (panels: (a) P5, (b) P4, (c) M6, (d) M3, (e) m3, (f) m6) under identical parameterization (D = 7 × 10^-4, with I0 = 0.9, A = 0.05, and ω0 = 0.256 rad/s). Notice that each panel clearly shows its strongest resonance at the 'fundamental bass' ω0. Also notice that the absolute number of spikes, as denoted on the y-axis, decreases monotonically moving from the top left to the bottom right, and that the number of prevalent intervals in each plot increases similarly, thus increasing the background noise with respect to the most salient pitch. This parallels a progression from consonance to dissonance. [See Table 4.1 in the text for a legend of dyad name abbreviations.]

Figure 4.6: First-order ISIH's of the dissonances (panels: (a) M2, (b) M7, (c) m7, (d) m2, (e) TT) under identical parameterization (D = 7 × 10^-4, with I0 = 0.9, A = 0.05, and ω0 = 0.256 rad/s). Notice that the tritone's ISIH has very few data points as a result of the low-pass filtering effect of the LIF.

Figure 4.7: All-order ISIH's for the consonances (P5, P4 and M6) under identical parameterization, and their respective auto-correlograms (denoted by 'x-'). Notice each all-order ISIH's similarity to the auto-correlogram of its dyad. Here, D = 7 × 10^-4, with I0 = 0.9, A = 0.05, and ω0 = 0.256 rad/s.

Figure 4.8: All-order ISIH's for the consonances (M3, m3, and m6) under identical parameterization, and their respective auto-correlograms. Again, notice the similarity between each dyad's all-order ISIH and its respective auto-correlogram. Here, D = 7 × 10^-4, with I0 = 0.9, A = 0.05, and ω0 = 0.256 rad/s.

Figure 4.9: All-order ISIH's for the dissonances (M2, M7, and m7) under identical parameterization, and their respective auto-correlograms. Notice how the similarity between each dyad's all-order ISIH and its auto-correlogram diminishes. Here, D = 7 × 10^-4, with I0 = 0.9, A = 0.05, and ω0 = 0.256 rad/s.

Figure 4.10: All-order ISIH's for the dissonances (m2 and TT) and their respective auto-correlograms. Notice the complete lack of similarity between the all-order ISIH of the tritone and its respective auto-correlogram. Here, D = 7 × 10^-4, with I0 = 0.9, A = 0.05, and ω0 = 0.256 rad/s.

Figure 4.11: First-order ISIH's of the consonances (panels: (a) P5, (b) P4, (c) M6, (d) M3, (e) m3, (f) m6), with adjusted stimulus amplitudes as shown in Table 4.1. Notice that each panel clearly shows its strongest resonance at the 'fundamental bass' ω0. Also notice that the absolute number of spikes, as denoted on the y-axis, decreases monotonically moving from the top left to the bottom right, and that the number of prevalent intervals in each plot increases similarly, thus increasing the background noise with respect to the most salient pitch. This parallels a progression from consonance to dissonance. Here, D = 7 × 10^-4, with I0 = 0.9, and ω0 = 0.256 rad/s.

Figure 4.12: First-order ISIH's of the dissonances (panels: (a) M2, (b) M7, (c) m7, (d) m2, (e) TT), with adjusted stimulus amplitudes (see Table 4.1). Notice that the tritone plot now displays a resonance at ω0, as well as a more meaningful number of data points in general. Here, D = 7 × 10^-4, with I0 = 0.9, and ω0 = 0.256 rad/s.

Figure 4.13: All-order ISIH's for the consonances P5, P4, and M6 with adjusted stimulus amplitude values, and their respective auto-correlograms. Notice the strong similarity. Here, D = 7 × 10^-4, with I0 = 0.9, and ω0 = 0.256 rad/s.

Figure 4.14: All-order ISIH's for the consonances M3, m3 and m6 with adjusted stimulus amplitude values, and their respective auto-correlograms. Notice the strong similarity. Here, D = 7 × 10^-4, with I0 = 0.9, and ω0 = 0.256 rad/s.

Figure 4.15: All-order ISIH's for the dissonances M2, M7 and m7 with adjusted stimulus amplitude values, and their respective auto-correlograms. Notice that the all-order ISIH's now possess a much stronger resemblance to their respective auto-correlograms. Here, D = 7 × 10^-4, with I0 = 0.9, and ω0 = 0.256 rad/s.

Figure 4.16: All-order ISIH's for the dissonances m2 and TT with adjusted stimulus amplitude values, and their respective auto-correlograms. Notice that the all-order ISIH's now possess a much stronger resemblance to their respective auto-correlograms. Here, D = 7 × 10^-4, with I0 = 0.9, and ω0 = 0.256 rad/s.

Figure 4.17: Signal-to-Noise Ratio for the 'identical parameterization' regime plotted as a function of the simplicity of the dyad ratios (beginning with the Perfect 5th (1) and ending with the Tritone (11) - see Table 4.1). (a) SNR measured by 'Max Peak'; (b) SNR measured by 'Peak Integral'.

Figure 4.18: Signal-to-Noise Ratio for the 'identical parameterization' regime and using a Savitzky-Golay filter, plotted as a function of the simplicity of the dyad ratios (beginning with the Perfect 5th (1) and ending with the Tritone (11) - see Table 4.1). (a) SNR measured by 'Max Peak'; (b) SNR measured by 'Peak Integral'.

Figure 4.19: Signal-to-Noise Ratio for the 'adjusted amplitude' regime plotted as a function of the simplicity of the dyad ratios (beginning with the Perfect 5th (1) and ending with the Tritone (11) - see Table 4.1). (a) SNR measured by 'Max Peak'; (b) SNR measured by 'Peak Integral'.

Figure 4.20: Signal-to-Noise Ratio for the 'adjusted amplitude' regime and using a Savitzky-Golay filter, plotted as a function of the simplicity of the dyad ratios (beginning with the Perfect 5th (1) and ending with the Tritone (11) - see Table 4.1). (a) SNR measured by 'Max Peak'; (b) SNR measured by 'Peak Integral'.

References

[1] M. Barbi, S. Chillemi, and A. Di Garbo. The leaky integrate-and-fire with noise: a useful tool to investigate SR. Chaos, Solitons and Fractals, 11(12):1849-1853, 2000.

[2] G. M. Bidelman and A. Krishnan. Neural correlates of consonance, dissonance, and the hierarchy of musical pitch in the human brainstem. Journal of Neuroscience, 29(42):13165-13171, Jan 2009.

[3] M. Braun. Auditory midbrain laminar structure appears adapted to f0 extraction: further evidence and implications of the double critical bandwidth. Hearing research, 129(1-2):71-82, Mar 1999.

[4] M. Braun. Inferior colliculus as candidate for pitch extraction: multiple support from statistics of bilateral spontaneous otoacoustic emissions. Hearing research, 145(1-2):130-140, Jul 2000.

[5] O. Calvo and D. Chialvo. Ghost stochastic resonance in an electronic circuit. International journal of bifurcation and chaos in applied sciences and engineering, 16(3):731, 2006.

[6] P. Cariani. Temporal coding of sensory information. Proceedings of the annual conference on Computational . . . , Jan 1997.

[7] P. Cariani. Neural timing nets. Neural Networks, 14:737, Jan 2001.

[8] P. Cariani. Temporal codes, timing nets, and music perception. J New Music Res, 30(2):107-135, 2001.

[9] P. Cariani. A temporal model for pitch multiplicity and tonal consonance. Proceedings of the Eighth International Conference on Music Perception and Cognition (ICMPC), 2004.

[10] P. Cariani and B. Delgutte. Neural correlates of the pitch of complex tones, i. pitch and pitch salience. Journal of Neurophysiology, 76(3):1698, 1996.

[11] P. Cariani and B. Delgutte. Neural correlates of the pitch of complex tones, ii. pitch shift, pitch ambiguity, phase invariance, pitch circularity, rate pitch, and the dominance region for pitch. Journal of Neurophysiology, 76(3):1717, 1996.

[12] D. Chialvo. Illusions and ghost resonances: How we could see what isn't there. Unsolved Problems of Noise and Fluctuations: UPoN 2002: Third International Conference of Unsolved Problems of Noise and Fluctuations in Physics, Biology, and High Technology, Washington, DC, 3-6 September 2002, page 43, 2002.

[13] D. Chialvo. How we hear what is not there: A neural mechanism for the missing fundamental illusion. Chaos, 13:1226, 2003.

[14] D. Chialvo, O. Calvo, D. Gonzalez, O. Piro, and G. Savino. Subharmonic stochastic synchronization and resonance in neuronal systems. Phys Rev E, 65(5):50902, 2002.

[15] M. Giraudo, L. Sacerdote, and A. Sicco. Ghost stochastic resonance for a neuron with a pair of periodic inputs. Lecture Notes in Computer Science, 4729:398, 2007.

[16] B. Heffernan and A. Longtin. Pulse-coupled neuron models as investigative tools for musical consonance. Journal of Neuroscience Methods, 183(1):95-106, Jan 2009.

[17] A. Kameoka and M. Kuriyagawa. Consonance theory part i: consonance of dyads. The Journal of the Acoustical Society of America, 45(6):1451-9, Jun 1969.

[18] A. Kameoka and M. Kuriyagawa. Consonance theory part ii: consonance of complex tones and its calculation method. The Journal of the Acoustical Society of America, 45(6):1460-9, Jun 1969.

[19] G. Langner and C. E. Schreiner. Periodicity coding in the inferior colliculus of the cat. i. neuronal mechanisms. Journal of Neurophysiology, 60(6):1799-822, Dec 1988.

[20] A. Lopera, J. M. Buldú, M. C. Torrent, D. R. Chialvo, and J. Garcia-Ojalvo. Ghost stochastic resonance with distributed inputs in pulse-coupled electronic neurons. Physical review E, Statistical, nonlinear, and soft matter physics, 73(2 Pt 1):021101, Feb 2006.

[21] I. S. Lots and L. Stone. Perception of musical consonance and dissonance: an outcome of neural synchronization. J R Soc Interface, 5(29):1429-1434, Jan 2008.

[22] A. Savitzky and M. J. E. Golay. Smoothing and differentiation of data by simplified least squares procedures. Analytical Chemistry, 36(8):1627-1639, 1964.

[23] E. Schellenberg and S. Trehub. Natural musical intervals: Evidence from infant listeners. Psychological Science, 7(5):272-277, Sep 1996.

[24] D. Schwartz, C. Howe, and D. Purves. The statistical structure of human speech sounds predicts musical universals. Journal of Neuroscience, 23(18):7160-7168, Jan 2003.

[25] E. Terhardt. Pitch, consonance, and harmony. J Acoust Soc Am, 55(5):1061-1069, Jan 1974.

[26] M. Tramo, P. Cariani, B. Delgutte, and L. Braida. Neurobiological foundations for the theory of harmony in western tonal music. Annals of the New York Academy of Sciences, Jan 2001.

[27] M. Tramo, P. Cariani, C. Koh, and N. Makris. Neurophysiology and neuroanatomy of pitch perception: auditory cortex. Annals of the New York Academy of Sciences, 1060:148, Jan 2005.

Chapter 5

Conclusions

'It occurred to me by intuition, and music was the driving force behind that intuition. My discovery was the result of musical perception.' -Albert Einstein, when asked about his theory of relativity

'Music gives a soul to the universe, wings to the mind, flight to the imagination, and life to everything.' -Plato

5.1 Summary

The two papers presented in Chapters 3 & 4 of this thesis make use of simple neuron models in order to assess the viability of different biophysically plausible coding schemes that could help to explain the seemingly universal human assessments of consonance and dissonance.

The first paper assessed the viability of a nonlinear synchronization coding scheme proposed by Shapira Lots and Stone (2008) [13]. This model, consisting of two pulse-coupled model neurons with each neuron firing at one of the two frequencies in a musical interval, showed promise as it established a strong link between the synchronous or mode-locked states of forced oscillators and the ratios of these musical intervals.

A critical analysis of their results revealed a flaw in their reasoning that manifested itself in the false identification of such a meaningful relationship. Briefly, although the widths of the mode-locked states rank-ordered from largest to smallest in good accordance with consonance assessments, the mode-locked states obtained by the actual intervals in question - which are of sole importance - did not. This was revealed by our reproduction and subsequent analysis of Shapira Lots and Stone's experiment. Further experiments indicated that the rank-ordering of the mode-locked states was dependent upon coupling strength, and that the effects of noise washed out the majority of the modal structure.

The model was then modified by adding sine-forcings to the model neurons in the subthreshold regime to reflect a greater biological realism. The results presented an interesting modal structure that could account for the psychoacoustic phenomenon of octave equivalence if synchronous coding is employed in the coding of pitch percepts by the auditory system. This structure, although unknown at the time of publication, appears to be what is known as a Multiple-Devil's Staircase. It is this result (regardless of its name) that prompts further investigation as regards the plausibility of a neural synchrony coding. Furthermore, it may prove a fruitful mathematical exercise to prove (or disprove) that the modal structure resulting from two excitatory pulse-coupled oscillators is indeed a Multiple-Devil's Staircase, as this would help to characterize the behavior of such a system rigorously.

The second paper explores the ability of stochastic phase-locking in the leaky integrate-and-fire neuron model [10, 2, 12] to account for consonance assessments. Cariani observed that the psychoacoustic consonance data could be reliably matched by measuring the maximum pitch salience of the pitches present in the population-wide all-order interspike interval distribution and rank-ordering each interval according to this measure [5].

We first showed that our simple dynamical neuron model is in good qualitative agreement with a statistical model of the auditory nerve as a whole when presented with a dyad stimulus; a result not previously observed, and one which is certainly worthy of further exploration. This can be seen in the obvious similarity between our results and the positive part of the auto-correlograms of the intervals; a similarity which has been observed experimentally [16, 4]. Secondly, we showed that the maximum pitch salience of each interval, as measured by the signal-to-noise ratio of the maximum peak in the all-order ISIH, agreed with psychoacoustic consonance insofar as the grouping of consonant and dissonant intervals is concerned. Strikingly, our results revealed an extremely strong correspondence with musical consonance [15] (in one case this correspondence was shown to be perfect). This result is exciting as it presents what we believe to be the first biologically plausible grounding of the phenomenon of musical consonance - a phenomenon that has been studied since the time of Pythagoras.

The viability of our model is strengthened in light of its ability to account for the auditory illusion known as the missing fundamental [9], as well as virtual pitch extraction [15] in general. Furthermore, it has been hypothesized that similar neural processing mechanisms are present in the auditory system [2, 1, 11].
Finally, our results can also account for the primary emergence of stable consonant percepts in infants - a phenomenon which indicates a universal intervallic processing predisposition [14]. Further experimentation aimed at assessing the model's ability to account for psychoacoustic consonance through the use of harmonic complex-tone stimuli, and using a signal-to-noise ratio identical to that used by Cariani [5], is warranted. Exploration of the model's explanatory power as regards its ability (or inability) to account for various other psychoacoustic phenomena by way of an all-order ISI temporal coding [3, 4] would shed light on the model's robustness.

5.2 Final Thoughts

This thesis attempts to bridge the gap between our subjective appreciation of the most basic of musical objects and the neural processing and coding mechanisms of the auditory system. Although this gap certainly remains open, I believe that the results presented herein, when added to the existing research in the field, provide a solid contribution and a potential path forward. By gaining a better scientific understanding of the world of audition and music, it is my sincerest hope that we may make positive progress towards the restoration and preservation of our beautiful sense, and of the beautiful world to which it is a gateway: the transcendent world of music.

References

[1] M. Braun. Auditory midbrain laminar structure appears adapted to f0 extraction: further evidence and implications of the double critical bandwidth. Hearing research, 129(1-2):71-82, Mar 1999.

[2] O. Calvo and D. Chialvo. Ghost stochastic resonance in an electronic circuit. International journal of bifurcation and chaos in applied sciences and engineering, 16(3):731, 2006.

[3] P. Cariani. Temporal coding of sensory information. Proceedings of the annual conference on Computational . . . , 1997.

[4] P. Cariani. Temporal coding of periodicity pitch in the auditory system: An overview. Neural Plast, 6(4):147-172, 1999.

[5] P. Cariani. A temporal model for pitch multiplicity and tonal consonance. Proceedings of the Eighth International Conference on Music Perception and Cognition (ICMPC), 2004.

[6] P. Cariani and B. Delgutte. Neural correlates of the pitch of complex tones, i. pitch and pitch salience. Journal of Neurophysiology, 76(3):1698, 1996.

[7] P. Cariani and B. Delgutte. Neural correlates of the pitch of complex tones, ii. pitch shift, pitch ambiguity, phase invariance, pitch circularity, rate pitch, and the dominance region for pitch. Journal of Neurophysiology, 76(3):1717, 1996.

[8] D. Chialvo. Illusions and ghost resonances: How we could see what isn't there. Unsolved Problems of Noise and Fluctuations: UPoN 2002: Third International Conference of Unsolved Problems of Noise and Fluctuations in Physics, Biology, and High Technology, Washington, DC, 3-6 September 2002, page 43, 2002.

[9] D. Chialvo. How we hear what is not there: A neural mechanism for the missing fundamental illusion. Chaos, 13:1226, 2003.

[10] D. Chialvo, O. Calvo, D. Gonzalez, O. Piro, and G. Savino. Subharmonic stochastic synchronization and resonance in neuronal systems. Phys Rev E, 65(5):50902, 2002.

[11] G. Langner and C. E. Schreiner. Periodicity coding in the inferior colliculus of the cat. i. neuronal mechanisms. Journal of Neurophysiology, 60(6):1799-822, Dec 1988.

[12] A. Lopera, J. M. Buldú, M. C. Torrent, D. R. Chialvo, and J. Garcia-Ojalvo. Ghost stochastic resonance with distributed inputs in pulse-coupled electronic neurons. Physical review E, Statistical, nonlinear, and soft matter physics, 73(2 Pt 1):021101, Feb 2006.

[13] I. S. Lots and L. Stone. Perception of musical consonance and dissonance: an outcome of neural synchronization. J R Soc Interface, 5(29):1429-1434, 2008.

[14] E. Schellenberg and S. Trehub. Natural musical intervals: Evidence from infant listeners. Psychological Science, 7(5):272-277, Sep 1996.

[15] E. Terhardt. Pitch, consonance, and harmony. J Acoust Soc Am, 55(5):1061-1069, 1974.

[16] M. Tramo, P. Cariani, B. Delgutte, and L. Braida. Neurobiological foundations for the theory of harmony in western tonal music. Annals of the New York Academy of Sciences, 2001.

Appendix A

Code

A.1 Shapira Lots & Stone Model

A.1.1 diadsweepForPublication.m

clear all;
clc;

% Figure initialization
scrsz = get(0, 'ScreenSize');
set(0, 'DefaultFigurePosition', [scrsz(1) scrsz(2) scrsz(3) scrsz(4)]);

% Dyad array (frequencies are as presented in the Fishman study)
frequency = [256, 269.6;
             256, 288;
             256, 256*4/3;
             256, 364.5;
             256, 384;
             256, 256*16/9;
             256, 486;
             256, 512]

% Dyad names for figures
dyad = {'Minor 2nd' 'Major 2nd' 'Perfect 4th' 'Tritone' 'Perfect 5th' 'Minor 7th' 'Major 7th' 'Octave'};

% Binary switches for the four figures below
figs = [1, 1, 1, 1];

% Start times for autocorrelation sampling
tstart1 = 75;
tstart2 = 75;

% k is the coupling strength between LIFs
for k = 1:1,
    disp(k);
    tic

    for j = 0:0, % j is the eta (noise) coefficient
        disp(j);
        tic

        for i = 1:8, % i is the index for the dyad (frequency pairs)
            disp(i);
            tic
            frequency(i, :)

            % calls the coupled-LIF model
            [C, Y, V, Phase1, Phase2, t1, t2, tstart3, dt, tfin] = ...
                modelockingForPublication(k/10, j/10, squeeze(frequency(i, :)));

            toc

            % F = amplitudeSpectra(Y, tstart2, dt, tfin); % this produces a figure,
            % but it doesn't look right

            % all figures are produced below, and pdf's are saved to the desktop;
            % this requires save2pdf.m to be placed on your desktop
            t = 0:dt:tfin;

            if figs(1) == 1, % Membrane potentials and spike trains
                figure(1)
                subplot(411)
                plot(t, V(:,1), 'b', 'LineWidth', 1.5);
                title(['Membrane Potential of LIF 1 f = 256 Hz and LIF 2 f = ', num2str(frequency(i,2)), ' Hz'])
                ylabel('V_1');
                axis([tstart1 tfin 0 1]); set(gca, 'XTickLabel', []);
                subplot(412)
                bar(t1, ones(1, length(t1)), 0.2, 'b', 'EdgeColor', 'none');
                axis([tstart1 tfin 0 1]); set(gca, 'XTickLabel', []); set(gca, 'YTickLabel', []);
                subplot(413)
                bar(t2, ones(1, length(t2)), 0.2, 'r', 'EdgeColor', 'none');
                axis([tstart1 tfin 0 1]); set(gca, 'XTickLabel', []); set(gca, 'YTickLabel', []);
                subplot(414)
                plot(t, V(:,2), 'r', 'LineWidth', 1.5);
                ylabel('V_2');
                axis([tstart1 tfin 0 1]);

                set(1, 'Color', [1, 1, 1]); drawnow;

                save2pdf(['MemPotRelPhase', num2str(round(frequency(i,2))), 'A', num2str(10*k), 'eta', num2str(100*j), ...
                    'dt', num2str(1000*dt), 'alpha', num2str(10*0.9), '.pdf'], 1, 1200);
            end

            if figs(2) == 1, % MUA (summed EPSPs), individual EPSPs, and phase path
                figure(2)
                subplot(411)
                plot(t, Y(:,1) + Y(:,2));
                title('Simulated Microelectrode (Multi-unit Activity (MUA)) [sum of EPSPs]');
                ylabel('Volts');
                axis([tstart2 tfin -1 1]); set(gca, 'XTickLabel', []);
                subplot(412)
                plot(t, Y(:,1), 'b');
                title('EPSP Train From LIF 1 to LIF 2');
                ylabel('Volts');
                axis([tstart2 tfin -1 1]); set(gca, 'XTickLabel', []);
                subplot(413)
                plot(t, Y(:,2), 'r');
                title('EPSP Train From LIF 2 to LIF 1');
                ylabel('Volts');
                axis([tstart2 tfin -1 1]); set(gca, 'XTickLabel', []);
                subplot(414)
                plot(t1, Phase1, 'b', t2, Phase2, 'r');
                title('Phase path');
                axis([tstart2 tfin 0 max(max(Phase1), max(Phase2))]);
                ylabel('time');
                xlabel(['time [A = ', num2str(k), ', \eta = ', num2str(j), ...
                    ', dt = ', num2str(dt), ', \alpha = ', num2str(10*0.9), ']']);

                set(2, 'Color', [1, 1, 1]); drawnow;

                save2pdf(['AlphaTotMemPotPhasePath', num2str(round(frequency(i,2))), 'A', num2str(10*k), 'eta', num2str(100*j), ...
                    'dt', num2str(1000*dt), 'alpha', num2str(10*0.9), '.pdf'], 2, 1200);
            end

            if figs(3) == 1, % autocorrelation - may need to change tstart3 and xcorr params in modelockingForPublication.m
                figure(3)
                subplot(1,1,1)
                plot(C);

                set(3, 'Color', [1, 1, 1]); drawnow;

                save2pdf(['Correlation', num2str(round(frequency(i,2))), 'A', num2str(10*k), 'eta', num2str(100*j), ...
                    'dt', num2str(1000*dt), 'alpha', num2str(10*0.9), '.pdf'], 3, 1200);
            end

            if figs(4) == 1, % Fishman figure of MUA emulation
                figure(4)
                subplot(1, 8, i)
                plot(t, Y(:,1) + Y(:,2));
                title(['Sum of EPSPs ', dyad{i}]);
                ylabel('Volts');
                axis([tstart2 tfin -1 1]); set(gca, 'XTickLabel', []);
                hold on;

                set(4, 'Color', [1, 1, 1]); drawnow;
            end

        end

        hold off;

        if figs(4) == 1,
            save2pdf(['MultiMUA', 'A', num2str(10*k), 'eta', num2str(100*j), ...
                'dt', num2str(1000*dt), 'alpha', num2str(10*0.9), '.pdf'], 4, 1200);
        end
        toc
    end

    toc
end

A.1.2 modelockingForPublication.m

function [C, Y, V, Phase1, Phase2, t1, t2, tstart3, dt, tfin] = modelockingForPublication(G, eta, frequency)
global A I tau alpha;

filePathOut = '/Users/brianheffernan/Desktop/PlotsNData/'; % folder where .csv and pdf's get published

% eta is the noise intensity
% G is the coupling strength (0-1)
% frequency is the vector representing the dyad component frequencies

tstart3 = 40;
tfin = 80;          % finish time
dt = 0.001;         % timestep
vth = 1;            % spike threshold
vr = 0;             % reset voltage
Phase1 = [];
Phase2 = [];
A = G;              % bad programming - but it works...
fConvert2Current = exp(1./(frequency/100)) ./ (exp(1./(frequency/100)) - 1); % converts to appropriate bias currents
t1 = [];            % arrays for spike times
t2 = [];
tau = 1;            % membrane time constant
I = [fConvert2Current(1) fConvert2Current(2)]; % bias current values for neuron 1 and 2, respectively
alpha = 0.9;        % inverse time constant of the synaptic current
vinit = [0.0 0.0];  % voltage initial condition
xinit = [0.0 0.0];  % synaptic current initial condition
tlast = [0 0];

f1 = 100/log(I(1)/abs(I(1) - 1)); % recovers the intrinsic firing frequency implied by each bias current
f2 = 100/log(I(2)/abs(I(2) - 1));

v = vinit;
x = xinit;
y = [0 0];

FunFcn1 = 'F1';
FunFcn2 = 'F2';

i = 0;
N = length(1:round(tfin/dt));
V(1:N, 1:2) = 0;
Y(1:N, 1:2) = 0;

% The main loop
for t = 0:dt:tfin

    i = i + 1;

    x0 = x; y0 = y; v0 = v;

    % 2nd order
    noise = [A*0.025*randn(), eta*0.0025*randn()]; % noise
    dv = feval(FunFcn1, v, x, t);
    v = v + dt*dv + noise;
    dy = feval(FunFcn2, x, y);
    y = y + dt*dy;
    x = x + dt*y;

    if (v(1) > vth) % LIF 1 spikes
        ph1 = t - tlast(2); % current time minus time of last spike by LIF 2
        tlast(1) = t;
        v(1) = vr;
        y(2) = y(2) + alpha^2;
        Phase1 = [Phase1; ph1];
        t1 = [t1; t];
    end;

    if (v(2) > vth) % LIF 2 spikes
        ph2 = t - tlast(1); % current time minus time of last spike by LIF 1
        tlast(2) = t;
        v(2) = vr;
        y(1) = y(1) + alpha^2;
        Phase2 = [Phase2; ph2];
        t2 = [t2; t];
    end;

    % indexed arrays of voltages and alpha functions (EPSPs)
    V(i,1) = v(1);
    V(i,2) = v(2);
    Y(i,1) = y(1);
    Y(i,2) = y(2);
end;

t = 0:dt:tfin;

out1 = [t1, Phase1];
out2 = [t2, Phase2];
out3 = [t', Y(:,1), Y(:,2), V(:,1), V(:,2)];

% writes to .csv (all data visible via Excel)
csvwrite(['Phase1', num2str(frequency(2)), 'A', num2str(10*A), 'eta', num2str(100*eta), ...
    'dt', num2str(1000*dt), 'alpha', num2str(10*alpha), '.csv'], out1);
csvwrite(['Phase2', num2str(frequency(2)), 'A', num2str(10*A), 'eta', num2str(100*eta), ...
    'dt', num2str(1000*dt), 'alpha', num2str(10*alpha), '.csv'], out2);
csvwrite(['Frequency', num2str(frequency(2)), 'A', num2str(10*A), 'eta', num2str(100*eta), ...
    'dt', num2str(1000*dt), 'alpha', num2str(10*alpha), '.csv'], out3);

% autocorrelation
C = xcorr(Y(tstart3/dt:end, 1) + Y(tstart3/dt:end, 2), 10000, 'unbiased');
C = C(floor(end/2):end); % i know i should use xcov and divide by variance to normalize

function dv = F1(v, x, t)
global A I tau;

Iholder = [I(1), I(2)]; % use [I(1)*(1 + 0*cos(0.511*t)), I(2)*(1 + 0*cos(20*0.511*t))] for sinusoidal forcing
dv = -v/tau + Iholder + A*x;

function dy = F2(x, y)
global alpha;

dy = -2*alpha*y - alpha^2*x;

A.2 Cariani's Model

A.2.1 chialvoGSR2.m

% chialvoGSR2.m
%
% See chialvoGSRCaller.m for description
%
% Code written by Brian Heffernan @ uOttawa, September 2009

function [isi] = chialvoGSR2(frequency, dtPass, alphaPass, DPass, Amplitude)

tfin = 10000;       % final time value
dt = dtPass;        % timestep
vth = 1;            % spike threshold
vr = 0;             % reset voltage
lif1data = [];      % array of spike times and relative spiketime differences (lif1 to lif2)
tau = 1;            % membrane time constant
A = Amplitude;      % sinusoidal stimulus amplitude
alpha = alphaPass;  % inverse time constant of the synaptic current
vinit = 0;          % voltage initial condition
xinit = 0.0;        % synaptic current initial condition
tlast = 0;          % initialize last firing times
rads = frequency;   % fundamental frequency w0, in rad/s
v = vinit;
x = xinit;
D = DPass;          % noise intensity
I = 0.90;           % subthreshold bias current

% The main loop
for t = 0:dt:tfin

    % the white-noise term
    noise = sqrt(2*D*dt)*randn();

    % LIF voltage change evaluated, then added with noise to the previous
    % voltage
    dv = -v/tau + I*(1 + A*cos(2*rads*t) + A*cos(3*rads*t));

    v = v + dt*dv + noise;

    if (v > vth) % spike check for the LIF
        tlast = t;
        v = vr;
        lif1data = [lif1data; t];
    end;

end;

% calculate the differences between successive spike times stored in lif1data
isi = diff(lif1data);

A.2.2 chialvoGSRCaller.m

% chialvoGSRCaller.m
%
% A script written to call a neuron model exhibiting the so-called
% 'ghost stochastic resonance'. Here, all dyads occurring in the
% western scale were passed as the 'stimulus' waveform of the LIF neuron,
% simulated in 'chialvoGSR2.m', which returns the inter-spike interval
% (ISI) data which is then plotted.
%
% Code written by Brian Heffernan @ uOttawa, September 2009.

% initialize the simulation environment
clear all;
close all;
clc;

filepathout = '/Volumes/Hydra/Users/brianheffernan5/Desktop/Plots/ghostSR/';

alphaPass = 1.0;    % the 'inverse time-constant' for the alpha function
dtPass = 0.003;     % the timestep
D = 0.0007;         % the noise intensity
Amplitude = 0.05;   % the amplitude of each stimulus component

frequency = 0.256;  % the fundamental w0, in rad/s

[isi] = chialvoGSR2(frequency, dtPass, alphaPass, D, Amplitude); % function call, returns the ISIs

figure(1)
hist((isi ./ ((2*pi)/frequency)), 50*ceil(max(isi ./ ((2*pi)/frequency))));
grid;
xlabel('Firing Rate (in Periods of F0)');
set(gca, 'XTick', 0:1:ceil(max(isi ./ ((2*pi)/frequency))), 'fontsize', 8)
ylabel('Number of Spikes');
drawnow;

% print_pdf(strcat(filepathout, 'OneLIFghostSRhistMajor3rd', 'D', num2str(D), ...
%     'RadS', num2str(frequency), 'DT', num2str(dtPass), 'A', num2str(Amplitude), '.pdf'), 1);