
Convergent evolution in large cross-cultural database of musical scales

John M. McBride1,* and Tsvi Tlusty1,2,*

1Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, South Korea
2Departments of Physics and Chemistry, Ulsan National Institute of Science and Technology, Ulsan 44919, South Korea
*[email protected], [email protected]

July 26, 2021

Abstract

Scales, sets of discrete pitches used to generate melodies, are thought to be one of the most universal features of music. Despite this, we know relatively little about their cross-cultural diversity, or about how scales have evolved. To remedy this, in part, we assemble a cross-cultural database of empirical scale data, collected over the past century by various ethnomusicologists. We provide statistical analyses to highlight that certain intervals (e.g., the octave) are used frequently across cultures. Despite some diversity among scales, it is the similarities across societies which are most striking. Most scales are found close to equidistant 5- and 7-note scales; for 7-note scales this accounts for less than 1% of all possible scales. In addition to providing these data and statistical analyses, we review how they may be used to explore the causes for convergent evolution in scales.

Introduction

Music, like language, is a generative grammar consisting of basic units, and rules on how to combine them [1]. In melodies, the basic units are described by two qualities: pitch (frequency) and duration (time). We generally refer to this basic pitch unit as a note, and a set of notes as a scale. Thus, as far as pitch is concerned, a scale is to a melody what an alphabet is to writing. Despite their centrality to music, and apparent ubiquity, we know surprisingly little about scales. Most studies focus on scales from a limited number of musical traditions [2, 3], and the only finding from a broad statistical approach is that scales have 7 or fewer notes [4, 5]; there are anecdotal reports that certain notes are widespread, but this has not been examined statistically [2, 3, 6–9]. We lack concrete understanding of why we use scales, how diverse they are, or how they came to be that way. We suspect that the reason for this is simply a lack of suitable resources. Here we address this issue by presenting a data set of musical scales from many societies, extant and extinct, built upon a century of ethnomusicological enterprise.

We begin by clarifying some key terms and ideas. We first define a scale as a sequence of notes (Figure 1A). Notes are pitch categories described by a single pitch, although in practice pitch is variable so a better description is that notes are regions of semi-stable pitch centered around a representative (e.g., mean, median) frequency [10]. Thus, a scale can also be thought of as a sequence of mean frequencies of pitch categories. However, humans process relative frequency much better than absolute frequency, such that a scale is better described by the frequency of notes relative to some standard; this is typically taken to be the first note of the scale, which is called the tonic. We refer to intervals in two ways: we talk of intervals between notes in units of cents, which can be obtained by a logarithmic transform of two frequencies f1 and f2, cents = 1200 × log2(f1/f2); or simply as a frequency ratio f1 : f2. Also, rather than referring to each note in a scale relative to the tonic, we can represent a scale in terms of its set of adjacent intervals, the frequency ratio (or cents) between adjacent notes in the sequence.

In the example given in Figure 1A, one can see that the frequency ratios with respect to the tonic are amongst the simplest integer ratios. Early scholars believed that these intervals are innately important, not necessarily for valid scientific reasons [11]. Nonetheless, the concept of tonal fusion (where some complex dyads may appear difficult to distinguish as separate) lends credibility to this old idea [12, 13]. One interval stands out amongst the crowd: the octave, an interval of frequency ratio 2 : 1 (1200 cents). In many cultures two notes related by an octave are considered perceptually similar (octave equivalence), and scales are considered to repeat when they reach the octave [14]. As a result, in Figure 1A the first and last note of the scale are represented by the same letter. Despite many claims of universality, experiments have shown that octave equivalence may be a weak perceptual phenomenon which is culture-specific. In this work, we aim to provide some statistical evidence to assess how widely the octave is used.

In what follows, we explain how we create a database of scales and measured tunings. We analyse these tunings to see what intervals repeatedly occur, verifying the common belief that the octave is used among many cultures; however we also note the limited nature of this study on
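As a minimal sketch of the cents conversion defined in the introduction, the snippet below converts a 5-limit major scale (the example of Figure 1A) from frequency ratios into cents relative to the tonic and into adjacent intervals. The function names and the list of ratios are illustrative assumptions, not taken from the authors' code.

```python
import math

def ratio_to_cents(f1, f2):
    """Interval between two frequencies in cents: 1200 * log2(f1 / f2)."""
    return 1200.0 * math.log2(f1 / f2)

# 5-limit just-intonation major scale as (numerator, denominator) ratios
# relative to the tonic; assumed here for illustration (cf. Figure 1A).
MAJOR_5_LIMIT = [(1, 1), (9, 8), (5, 4), (4, 3), (3, 2), (5, 3), (15, 8), (2, 1)]

def scale_in_cents(ratios):
    """Scale notes in cents relative to the tonic."""
    return [ratio_to_cents(num, den) for num, den in ratios]

def adjacent_intervals(notes_in_cents):
    """Differences between consecutive notes of a scale."""
    return [b - a for a, b in zip(notes_in_cents, notes_in_cents[1:])]

notes = scale_in_cents(MAJOR_5_LIMIT)
steps = adjacent_intervals(notes)
print([round(n) for n in notes])   # notes relative to the tonic
print([round(s) for s in steps])   # adjacent intervals
```

The rounded adjacent intervals reproduce the 204, 182 and 112 cent steps shown in Figure 1A, and they sum to one octave (1200 cents).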

[Figure 1. Panel A columns: Piano, Absolute Freq., Relative Freq., Freq. Ratio, Adjacent Intervals (ascending from C: 204, 182, 112, 204, 182, 204, 112 cents). Panel B labels: Prescriptive, Descriptive, Theory, Measured, Song Recordings, Instrument Tunings.]

Figure 1: A: Illustration of relevant terms: as an example we show the Western major scale in 5-limit tuning, starting from middle C and spanning one octave. Scales can be represented symbolically (e.g. letters to represent notes); as a set of absolute frequencies; as a set of relative frequencies in cents or as frequency ratios, relative to the first note in the scale (tonic); as a sequence of adjacent intervals. B: Venn diagram indicating how scale and tuning data can be classified.

evaluating universality. We then provide a comprehensive, statistical view of musical scales across cultures. By considering the vast number of possible scales that are not used anywhere, we show that scales are more similar than different: scales are overwhelmingly clustered around 5- and 7-note equidistant scales (scales where adjacent intervals are similar in size). Finally, we discuss the potential mechanisms of how scales change over time, discuss the challenges in understanding how scales have evolved, and propose credible future directions.

Database

Database curation

A total of 55 books, journals, and other ethnomusicological sources were found to have relevant data on scales [15–69] (SI Table 1). We acknowledge that a previous attempt has been made to document scales using ethnomusicological records [70]. However, this database lacks a defined methodology and does not link scales to references, so it cannot be independently verified.

We can define scales either prescriptively ("these are the notes you can use in a melody") or descriptively ("these are the notes that were used in the melody"), and we define ways of categorizing empirical scale data (Figure 1B). Theory scales consist of intervals with exact, mathematical frequency ratios. These are mainly found in a limited set of cultures that exist along the old Silk Road route, and they are not necessarily played as specified. These are by definition prescriptive scales, although descriptive scales can be found which closely match these. Measured scales are obtained where measurements have been made of the notes on an instrument (instrument tuning), or a recording of a song has been analysed with computational tools to extract a scale. Instrument tunings are by default prescriptive, but can be descriptive in the case where all of the notes are used in a melody. Despite the fact that there is some error in these measurements (tools range from tuning forks, to the Stroboconn, to modern computational approaches; in all cases the error is reported as less than 10 cents), this type of data offers the most objective insight into the scales used in melodies. Measured scales taken from song recordings are exclusively descriptive, and they make up the smallest part of the database. This is because it is still quite a challenge to reliably infer scales from a recording of a performance using algorithms [56], and thus it requires a lot of manual effort and time. Despite this, we believe that the future of studies on musical scales lies in tackling these challenges, due to the potential of extracting descriptive scales from archives of ethnographic recordings [71, 72]. However, it is clear that we currently lack the appropriate computational tools to perform such large-scale analyses [73, 74].

To enable a range of potential analyses we collected information pertaining to, where applicable, the society the scales / tunings came from (country, language or ethnic group), geography (country, region), instrument type, tonic note, and modes (for lack of a better word, we define modes as the new scale that one gets if you pick a new note as the tonic so that the scale is circularly permuted). Some of these (society / geography) we considered so important that we declined to include data if these were not identifiable in some capacity. Others were found infrequently; tonic was only identified in 126 out of 413
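The definition of modes as circular permutations of a scale can be illustrated with a short sketch. It assumes a scale is stored as a list of adjacent intervals in cents; the helper names and the 12-TET major scale used as input are illustrative choices, not the database's own representation.

```python
def modes(adjacent_intervals):
    """All circular permutations of a list of adjacent intervals."""
    n = len(adjacent_intervals)
    return [adjacent_intervals[i:] + adjacent_intervals[:i] for i in range(n)]

def notes_from_intervals(adjacent_intervals):
    """Cumulative sum: scale notes in cents relative to the tonic (0)."""
    notes = [0]
    for step in adjacent_intervals:
        notes.append(notes[-1] + step)
    return notes

# Major scale in 12-tone equal temperament (illustrative input).
major_12tet = [200, 200, 100, 200, 200, 200, 100]
all_modes = modes(major_12tet)

for mode in all_modes:
    print(notes_from_intervals(mode))
```

Each mode keeps the same multiset of adjacent intervals, so every permutation still spans exactly one octave; only the notes relative to the new tonic change.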

[Figure 2. Bar chart of the number of scales (Theory, Measured) by region: Africa, Western, East Asia, Oceania, Middle East, South Asia, Latin America, South East Asia; and a map of geographic origin.]

Figure 2: Scales in the database either come from a mathematical theory (theory) or from measurements from instruments or recordings (measured). The map shows the geographic origin of the scales (theory: regions; measured: dark shaded countries), with sample size, S, indicated by the marker size.

instrument tunings; only two sources precisely identified sets of modes that could be drawn from an instrument tuning [58, 68].

Extracting scales from raw data

The database exists in two forms: the raw data, and a set of scales that is generated automatically from the raw data according to a set of choices. (i) Theory scales are typically given in symbolic notation such as letters / syllables (e.g., solfège, swara), which can then be converted to a numerical sequence by matching the symbols to a tuning scheme such as just intonation, or 12-tone equal temperament (12-TET). For each theory scale in the database, we provide a set of tuning schemes that were plausibly used, so a single theory scale may have several versions in different tunings (it is possible to easily change the assumptions used to create a different set of scales). (ii) The tonic is the first note in all theory scales, and in many cases modes are also found in musical traditions. However, if one wants to completely ignore tonality, one can also create all possible modes from all theory scales. (iii) Tonality is rarely noted in studies of instrument tunings, which means it is impossible to uniquely identify the order of notes as intended by the performer. For these scales, one can: include only those that denote the tonic; include all plausible scales (all modes); or include more than one scale if there are multiple without including modes. We define a plausible scale on an instrument tuning as a sequence of notes that span an octave. (iv) In practice, intervals are not perfect so we accept scales if the intervals sum to 1200 ± O cents, where O is the error tolerance. (v) In some cases, sources indicated that the octave was used in performances, but reported the scales without the octave. For these cases, we can choose to include the missing interval that is needed to complete the octave (affects 41 out of 413 samples). Measured scales inferred from recordings typically do not span more than an octave; the only relevant choice for these scales is (v). A complete workflow from source to database, including examples, is given in SI Fig. 1.

While it is possible to create different versions of the scale database (and details are given on the GitHub repository), we report statistics for one created according to the following choices: (i) We match theory scales from each musical tradition to a set of tuning systems given in SI Table 2. (ii) We do not include all possible modes of theory scales. (iii) We do not exclude scales inferred from instrument tunings if we do not know the tonic, but we do not include all possible modes. (iv) We include scales that require an extra interval to complete the octave. (v) We use a tolerance of O = 50, finding that few of the extracted scales have such a large octave deviation (SI Fig 2). This results in a total of 896 scales (462 theory scales, 383 scales from instrument tunings, 51 scales from song recordings). The theory scales span 5 regions, while the measured scales span 8 regions, and 39 countries (Figure 2). However, we must note that an assumption central to the above methodology is that scales add up to an octave. Since we are unsure about how valid this assumption is, we first study the statistics of the raw instrument tunings before studying the extracted scales.

Statistics of Instrument Tunings

Studying the statistics of instrument tunings allows the least biased view of scales / intervals, since we do not alter the raw data. This does not mean they are completely free from bias, and we refer the reader to the discussion section for more detail. With this data, we can ask what notes are found more or less than others, and estimate how significant these results are according to different assumptions about how scales are generated. There is, of course, no innately correct null distribution for instrument tunings; hence, our approach is to try several statistical models.

Outstanding tonic intervals: 200, 700, 1200 cents

In this section we limit our analysis to tonic intervals, assuming that the lowest note is the tonic. Our first model assumes that intervals, I, (as measured from the first note) are drawn independently from a lognormal distribution, P(I) = lnN(µ, σ²). Lognormal is an appropriate choice on two accounts: (i) very small intervals should be unlikely due to limits in pitch perception, so P(I) → 0 as I → 0; (ii) for clear physical reasons (human anatomy constraints; instrument material cost, etc.) large intervals are less likely, so P(I) → 0 as I → ∞. This null distribution states that intervals are entirely independent of each other, and the probability only depends on the size of the interval. While this is inherently wrong (notes are unlikely to be repeated within a given instrument tuning, which contradicts the idea of independence), it allows us to compare different I to say which are especially frequent / rare. We compare the real distribution with the null
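To make the lognormal null model concrete, here is a minimal sketch of fitting lnN(µ, σ²) to a set of tonic intervals and evaluating its density. It assumes a maximum-likelihood fit (mean and standard deviation of the log-intervals), which is one common choice; the paper does not state which fitting procedure was used. The interval values and function names are made up for illustration.

```python
import math

def fit_lognormal(intervals):
    """Maximum-likelihood estimates (mu, sigma) of a lognormal distribution."""
    logs = [math.log(x) for x in intervals if x > 0]
    mu = sum(logs) / len(logs)
    variance = sum((v - mu) ** 2 for v in logs) / len(logs)
    return mu, math.sqrt(variance)

def lognormal_pdf(x, mu, sigma):
    """Density of lnN(mu, sigma^2) at x; zero for non-positive x."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

# Hypothetical tonic intervals in cents (not data from the database).
tonic_intervals = [200, 350, 500, 700, 900, 1200, 1400, 1900, 2400]
mu, sigma = fit_lognormal(tonic_intervals)

for cents in (50, 700, 1200, 6000):
    print(cents, lognormal_pdf(cents, mu, sigma))
```

The density vanishes for both very small and very large intervals, which is the property that motivates the lognormal choice in points (i) and (ii).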

[Figure 3. Panel A: All, Reg-Samp, Cult-Samp, Idiophone, Aerophone, Chordophone. Panel B: All, Cult-Samp. x-axis: Interval from the lowest note / cents (0 to 3000); y-axis: Density. Legend: Data, Lognormal fit, 99% CI.]

Figure 3: A: Distribution of intervals relative to the tonic for instrument tunings. We show distributions for 6 different samples: all samples (All); a subset balanced by region (Reg-Samp); a subset balanced by culture (Cult-Samp); subsets that only include data taken from idiophones, aerophones, and chordophones. For Reg-Samp and Cult-Samp, we show distributions from three independent samples. We show a lognormal fit to each distribution (dotted line), and indicate the 99% bootstrapped confidence interval (shaded region). We truncate the x-axis at 3000 cents for clarity. B: We repeated the same analyses on an artificial set of tunings resampled using the posterior distribution of adjacent interval sizes and the exact same numbers of notes in a tuning.

distribution for the following samples (Figure 3A): all of the instrument tunings ('All', 404 tunings), a region sub-sample ('Reg-Samp', 8 continent groups, at most 10 tunings chosen per region, 78 tunings), a culture sub-sample ('Cult-Samp', 72 cultures, at most 5 tunings per culture, 200 tunings), and three sub-samples based on the type of instrument the tuning is observed on (Idiophone, 195; Aerophone, 76; Chordophone, 55).

We find the most prominent deviations from the lognormal distribution at 1200, 700 and 200 cents (more likely), and 600 cents (less likely); these are consistently outside of the 99% confidence intervals across almost all samples. Of the rest, we see 2400 cents three times (All, Cult-Samp, Aerophone), and ∼500 cents twice (All, Idiophone).

Our second model assumes that scales are constructed by combination of adjacent intervals, IA. We create a test data set (TEST1) by matching the number of tunings, and the number of notes per tuning, and then sampling adjacent intervals with replacement from all adjacent intervals observed in instrument tunings. We then compare this test set against a lognormal distribution (Figure 3B), finding that the only significant interval is ∼200 cents, which is also the mean size of an adjacent interval (198 cents). Thus the reason that 200 cents appears to be significant in Figure 3A is simply due to its prevalence as an adjacent interval, while the other significant results cannot be said to be simply due to the combination of the available adjacent intervals. The adjacent intervals appear to be arranged in some specific order which results in some intervals occurring more frequently than expected by chance.

We then check whether we get the same result if we preserve the original sets of adjacent intervals for each tuning: we create a second test data set (TEST2) by matching the number of tunings, and the number of notes per tuning; however, instead of sampling from all observed adjacent intervals, we simply shuffle the adjacent intervals in each tuning (SI Fig 3). In this case we find that the octave is much more frequent than expected from a lognormal distribution. However, this is because there are many instrument tunings where the highest interval is 1200 ± 15; when we control for this, we find that only 200 cents is significant again (SI Fig 4). Thus we can say that the peak at 200 cents is due to this being the mean size of adjacent intervals, but the peaks at 500, 700 and 1200 cannot be explained by any of the suggested null distributions.

Further tests support outstanding intervals

The previous method was able to establish whether intervals are found more than expected by chance across a collection of instrument tunings. We can extend this approach by using the full range of intervals that can be
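The two resampling schemes described for TEST1 and TEST2 can be sketched as follows. This is a simplified illustration under stated assumptions: tunings are lists of notes in cents relative to the lowest note, the example tunings are invented, and the function names are hypothetical rather than those of the authors' repository.

```python
import random

def adjacent(tuning):
    """Adjacent intervals of a tuning given as ascending notes in cents."""
    return [b - a for a, b in zip(tuning, tuning[1:])]

def from_adjacent(steps, start=0):
    """Rebuild a tuning from its adjacent intervals."""
    notes = [start]
    for step in steps:
        notes.append(notes[-1] + step)
    return notes

def make_test1(tunings, rng):
    """TEST1-style set: same number of tunings and notes per tuning, with
    steps drawn with replacement from the pooled adjacent intervals."""
    pool = [s for t in tunings for s in adjacent(t)]
    return [from_adjacent(rng.choices(pool, k=len(t) - 1)) for t in tunings]

def make_test2(tunings, rng):
    """TEST2-style set: each tuning keeps its own adjacent intervals,
    but their order is shuffled."""
    shuffled = []
    for t in tunings:
        steps = adjacent(t)
        rng.shuffle(steps)
        shuffled.append(from_adjacent(steps))
    return shuffled

rng = random.Random(0)  # fixed seed so the sketch is reproducible
tunings = [[0, 200, 350, 700, 900, 1200], [0, 240, 480, 720, 960, 1200]]
test1 = make_test1(tunings, rng)
test2 = make_test2(tunings, rng)
```

TEST1 destroys both the per-tuning interval sets and their order, while TEST2 preserves each tuning's interval set (and therefore its total range) and randomises only the order.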

[Figure 4. Panel A: Deviation of shuffled intervals from the octave against Deviation of original intervals from the octave (markers: *, **, ***, NS). Panel B: Fraction of significant results against Interval size (cents) for All, Reg-Samp, Cult-Samp and Null; legend: Greater freq than chance, Less freq than chance. Panel C: Fraction of significant results against Tuning deviation (cents).]

Figure 4: A: Mean deviation of original intervals from the octave compared to shuffled intervals for 147 instrument tunings; * p < 0.05, ** p < 0.005, *** p < 0.0005, or not significant (NS). Black line: x = y. B: Fraction of results that are significant, and show that an interval size is found either more or less frequently than expected by chance. We show results for the whole dataset (All), region-balanced set (Reg-Samp), a culture-balanced set (Cult-Samp), and for a sampling scheme (TEST1) that only maintains the overall distributions of adjacent interval sizes and notes per tuning (Null). C: Fraction of significant results indicating that the octave is found more or less frequently than chance against tuning deviation in a test set which maximises the number of octave relationships. Shaded region shows 95% bootstrapped CI.

made on an instrument, not just the tonic intervals. For an instrument with N notes, this results in N × (N − 1)/2 intervals. Since this samples many more intervals, this test is powerful enough to sometimes detect significant results within single instrument tunings, not just across the entire collection.

This test compares each instrument tuning with alternative tunings that can be made by shuffling the order of the adjacent intervals (TEST2). This allows us to ask, for a given interval I, does this interval occur more often in the instrument tuning than one would expect if they know the adjacent intervals but not the order. To answer this, for each tuning we find all intervals within some distance w of I and calculate how far they are from I, and then repeat the process for 50 shuffled versions of the tuning. We then use a Mann-Whitney U test to test whether either set of intervals (original or shuffled) is significantly closer to the octave than the other. We then repeat this 100 times to get a converged average.

To demonstrate this analysis we show the mean deviation from the octave for the sets of intervals taken from the original tuning and the shuffled sets (Figure 4A). For most tunings the results are not significant since the sample size of intervals that are within w = 100 of 1200 cents tends to be small. Another reason for non-significant results is that there are many scales where the adjacent intervals are of similar size, and can add up easily to a particular value (in this case the octave); in these cases, the shuffled scales are just as likely to have many octave intervals as the original scales.

We then extend this analysis to intervals over the range 200 ≤ I ≤ 2600 cents, for the whole set of tunings, and for two sub-samples (Figure 4B). We see that, out of all intervals, the first and second octaves are those most often found in tunings at a rate that is significantly greater than chance (40% of the time). They are followed by intervals of 500, 700, 1000, and 1400 (and corresponding intervals an octave above), in agreement with the data in Figure 3. These results are weaker for 'Reg-Samp', but nonetheless there is good agreement across sub-samples. Similarly, we find that 600 cents, 1150 and 1250 cents are often missing from tunings at a rate that is significantly greater than chance. To put these results in perspective, we once again create test data as in TEST1, and re-run the analyses (Figure 4B, Null), which shows the range of significant results expected by chance. This clearly demonstrates that the previous high fractions of significant results were not an artefact due to testing multiple hypotheses. Also, we show that the results do not depend on our choice of w (SI Fig 5). These results reinforce the conclusion that
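The interval-enumeration step of this test can be sketched as below: list all N × (N − 1)/2 pairwise intervals of a tuning and keep the deviations of those lying within w cents of a target interval I. The example tuning is invented, the function names are hypothetical, and treating the window as inclusive is an assumption. The subsequent comparison against shuffled tunings with a Mann-Whitney U test is not shown here.

```python
from itertools import combinations

def all_intervals(tuning):
    """All N*(N-1)/2 pairwise intervals (in cents) between notes of a tuning."""
    return [abs(b - a) for a, b in combinations(tuning, 2)]

def deviations_from(tuning, target, w):
    """Absolute deviations from `target` for intervals within w cents of it
    (window assumed inclusive)."""
    return [abs(i - target) for i in all_intervals(tuning) if abs(i - target) <= w]

# Hypothetical 7-note tuning in cents relative to the lowest note.
tuning = [0, 204, 386, 702, 884, 1195, 1410]
intervals = all_intervals(tuning)
octave_devs = deviations_from(tuning, target=1200, w=100)

print(len(intervals))   # number of pairwise intervals
print(octave_devs)      # deviations of near-octave intervals
```

In the procedure described in the text, the same deviations would be collected for shuffled versions of the tuning and the two sets compared statistically.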

[Figure 5. x-axis: Scale degree (4 to 9); y-axis: Normalised frequency. Series: All, Theory, Measured, Region, Culture.]

Figure 5: Scale degree distributions for different samples of the scale database: all scales, theory scales, measured scales, sample balanced by region, sample balanced by culture.

these intervals are not simply found due to a preference for certain sets of adjacent intervals; rather, the adjacent intervals are clearly arranged to achieve some preferred intervals with respect to the lowest note.

Tuning variability hinders search for universal intervals

What fraction of significant results should we expect to find if those tuning instruments do indeed aim to tune octaves (or any interval)? To answer this we create a test set of scales (TEST3) with maximal possible octave intervals. We match the number of tunings, and the number of notes per tuning, but we alter the intervals so that for every interval where I ≥ 1200, there is a corresponding interval an octave below. In reality humans (as any measurement device) have imperfect pitch resolution, so we alter this test set by adding normally distributed noise, N(µ = 0, σ²) to see how much this reduces the chance of detecting significant results. We find that even without noise (σ = 0), this test can only find significant results 90% of the time (Figure 4C). A realistic estimate of σ could start by considering that tunings in the database are usually measured to within 10 cents, but this is not the main source of variation. In humans, interval discrimination drops below about 100 cents [75–78], and humans typically sing with a standard deviation of at least about 10 - 20 cents [10, 69, 79–84]. Thus we can expect a reasonable upper bound to the fraction of significant results that can be detected of about 0.3 to 0.7. For comparison, we find the octave to occur significantly more than chance about 40% of the time. Thus, due to and empirical measurement (human or otherwise) error, there is a limit to how much one can infer the significance of the prevalence or lack of a particular interval in a single scale, melody or tuning. Despite this limitation, we find that our analysis finds a fraction of significant results supporting use of the octave which is within the range of this hypothetical limit.

Ultimately, we cannot say how scales are chosen. Are there some important intervals that are fixed first (e.g. octave), and then the rest are chosen to fill the gaps? Or are adjacent intervals of a certain size chosen, and then arranged in some preferred order? All this analysis has shown is that whichever way scales are created, they show a significant, consistent bias towards including some notes, and avoiding others.

Assessing evidence for the use of the octave by source

While the octave is often reported as one of the most universal features of music [2, 3, 7–9, 85], this status has been disputed [69, 86–89]. To understand why, we review the relevant physical, statistical and perceptual ways in which the octave is considered special. Physically, the octave is produced naturally as part of the harmonic spectra. It is the most common and salient interval to appear in the harmonic spectra (occurs most frequently, occurs in relation to most salient harmonics). Mathematically it is the simplest frequency ratio (2:1). In terms of statistics, it is widely described as a universal feature of music and scales [2, 3, 7–9, 85]. Some state that it is better described as a statistical universal: present in many, but not all musical traditions [87, 88]. It is difficult to judge which view is correct, since most studies refer to only a handful of specific examples or to anecdotal evidence from experts. As of yet, however, there is no study that systematically examines the prevalence of octave usage across cultures. The evidence provided in this work has so far lent support to the notion that it is widespread, but it is still unclear to what extent. In terms of perception, octaves are often perceived as similar, and scale structure is often seen as circular (or helical), with a periodicity of one octave [14]. This can be seen most clearly in cultures where notes are named, and notes that are related by an octave have the same name. Or when instruments or singers that have different frequency ranges play / sing together, they often play / sing in parallel octaves. Despite these cultural phenomena, perceptual experiments on the octave tend to demonstrate rather weak effects [90–93], which are often conditioned on musical training [94–96]. Nonetheless, recent work has shown strong effects for harmonic sounds in general, with regard to tonal fusion [13, 93], the ability to hear in noise [97], and memory of complex tones [98]. Altogether, the evidence points to a scenario where humans may be able to take advantage of the mathematical regularities, and statistical abundance of harmonic sounds, in such a way that this faculty can be strengthened through cultural traditions and experience. It suggests that humans can learn to appreciate the octave in a special way, but they need to be taught to first. To further supplement our earlier statistical analysis of instrument tunings, we next consider what descriptive evidence exists in the studies cited in the scale database.

After a first general reading of the sources, it seemed appropriate to classify sources in terms of evidence for: use of the octave when tuning instruments (8 sources),

6 otteve htotvsaewdsra,btte are they supposed. but once widespread, as are universal sup- octaves as strongly not that com- findings view more these the a Nonetheless, port for recommended picture. is plete the use into octave [99]) in- of as that statistics (such texts those ethnographic independent only An of this are investigation scales. of used on limitation information sources quantitative A the clude 3. that Table SI is in are analysis detail details These more in 43]). obtained covered 37, instruments [28, ancient sites on archeological reporting through (out are is evidence 3 any octave show which the not of do that 11 suggest 69], to [65, avoided evidence actively of 36 show appreciation of 2 specific total octave, for the a evidence section, some previous exhibit sources the Over- sta- from the 65]. and evidence 61, evidence tistical descriptive [22, both performers including appreciated original when with- is the all, by octave detail way In the in any whether [21] in octave of melodies description the of any discuss oc- out range authors an intervallic cases of octave, of several the range terms of the absence in tunes the outside except discussed playing notes rarely found by Authors were for or instances tave. names No [32], [24]. unique scalar fifths fits of in tuning in the or if instead: seeing 31], and but 29, tuned octave, [18, sometimes the order were to the Instruments tradi- of reference use musical without indicate sources). the studies in (7 separate used octave but is discus- question, octave any in lack the spe- tion studies a whether Some as of octave (8 sources. sion the in 24 of apart in evidence awareness interval of descriptive octave cial sort found some an we of Overall, support notes octaves sources). parallel for in (11 harmony name melodic scale performing same the sources), of the samples by using different balanced sample for region, (right) indicates by scales Shading cents. balanced (right) in type. 
sample 30 instrument notes (middle) each and and scales; to (left) according measured 20 (left) scales scales, intervals measured theory adjacent (bottom) scales, culture; of all Distributions (top) database: 6: Figure

Density Density Density 0 100 Adjacent Interval/cents 200 300 400 Chordophone Aerophone Idiophone Measured Theory All 500 Culture Region 7 0 r anycnetae ewe 0 n 0 et,with cents, 300 and they 100 [100]); between than intervals concentrated Carnatic mainly larger less (in are cents forbid rarely 400 explicitly are than rules greater they music, rarely and groups: cents, however, all 100 interals, across adjacent general of are features Some at dophones). or peak Culture), (see subsamples (Region, differences Taking for distributions exaggerate the distributions theory smooth broader. in comparison, either much can cents) In are 300 scales 200, sharp. measured (100, very peaks are the scales such, omit- are As is natural scales intonation the vocal ted. theory such of the tuning as instrument as and in On values, variation expected, theoretical be 6). exact to as (Figure given is scales this measured level and one theory for butions from. type come the by they to and instrument mea- according region of scales and by measured scales of sampled different sub-sets scales theory culture; three of scales, show sub-sets We all scales; inter- sured the distributions: tonic. are the of notes to sets Scale respect with melodies. vals in relevant motion particularly scalar are represen- scales for important Adjacent two scales. of of distributions tations at look next We distributions note scale and interval Adjacent sampling Culture. for across cu- controlling persists a when that minimized is scales, is There note but 5). schemes, 6 (Figure of most degrees lack [5], fewer work rious or previous 7 with have agreement scales in that, find We degree Scale Scales of Statistics h daetitrasso akdydffrn distri- different markedly show intervals adjacent The 200 400 Scale note/cents 600 95% 800 I itga isare bins Histogram CI. ∼ 350 1000 et o Chor- for cents 1200 A B a C a Asena 27 50 0.50

[Figure 7 graphic. Panel A: dendrogram of 7-note scales; axis: distance between scale clusters. Panel B (clusters a to f): scale note distributions; axes: scale note (0 to 1200) against density. Panel C (clusters a to f): frequency and relative frequency by region (Africa, Oceania, Western, East Asia, South Asia, Middle East, Latin America, South East Asia). Scale labels visible in panel A include Asena 27, Gamelan, Guinea Malinke 3, Harmonic minor, Maqam Mahur A, Marimba 8, Ranat T'hong, Matape 2, Yanyue, Ranad thum lek, Maqam Athar Kurd, Kwaiker, Dastgah-e Chahargah, Hicaz Makam 1, Major, Asena 5, Mela Shulini, So-na, Locrian, Mbira 3, Bhairavi, Khong Mon, Swastigitha, and Mela Salagam.]

Figure 7: A: Dendrogram showing the relations between scales, grouped into 6 clusters. Scale names are shown as they are given in the database. For each cluster we show: (B) the scale note distributions, with dotted lines indicating the average scale; (C) the frequency and relative frequency of occurrence within each region.

a salient peak at about 170-200 cents.

The scale note distributions show remarkable convergence across the different samples (Figure 6). There is a clear peak in all cases at 200, 500 and 700 cents. In all distributions the peak at 700 cents is sharp. The next most common peak is at 200 cents; although 500 cents is very prominent in some distributions, it is not as prominent in aerophones and idiophones. There are then broad regions that tend to be well-populated: 300-400 cents, 800-900 cents, 1000-1100 cents. We also see regions that are not populated. Particularly notable are the regions bordering 0, 700 and 1200 cents (also 280 cents, 450 cents, 950 cents). The tritone (600 cents) is rarely found in measured scales, but is found in theory scales.

The shape of these distributions depends heavily on the number of notes in a scale (SI Fig 7). This means that the graphs shown here depend on the over-representation of 7-note scales. Despite this, we still find salient peaks at 200, 500 and 700 cents, regardless of the scale degree.

Scale clusters

To get a sense of the variation between scales we cluster them according to their scale notes (hierarchical clustering, Ward’s method, euclidean distance [101]) to create 6 clusters of scales. We divide the scales into only 6 clusters, simply to facilitate discussion of the results. Using the euclidean distance restricts us to studying scales of the same length, so we present results for 7-note scales as an example since they are most numerous. We show the clusters in Fig 7, indicating some of the names of scales or the instruments / cultures they were taken from (full set is given in Supplementary Data).

The largest cluster (a, purple, 220 scales) contains the scales that are closest to an equiheptatonic scale (e.g. “Asena 27”), and it is very well represented across geographical regions (except Oceania, which only has 3 7-note scales in total). The second largest cluster (e, blue, 100 scales) is distinguished by being the most evenly spread across geography in terms of relative frequency, and also by the lack of fifths compared to the tritone. This is followed by a cluster (d, orange, 90 scales) which contains the Major scale, which is also known as Bilawal (North India), Maqam Ajam Ushayran (Arabic), and Qing Yue (Chinese). The smallest three clusters show progressively less global geographical distributions (b, brown, 57 scales; c, green, 42 scales; f, red, 35 scales). It is possible that these clusters are found in fewer places simply due to the smaller size of these clusters. Another interpretation is that the use of certain scales was spread via migration and trade: for cluster f, the scales are almost entirely from regions in Asia; for cluster c, most of the scales are found between Europe, Middle East and South Asia.

Some regions show much greater diversity than others. South Asia shows the greatest diversity (as measured by the entropy of cluster distributions; SI Fig 8), which is primarily due to the 72 melakarta of the Carnatic tradition. These are 72 scales, each with their own name, which were originally derived via combinatorial enumeration of intervals within constraints. The diversity from South East Asian scales, on the other hand, stems almost exclusively from two named scales: the Thai 7-note scale, and the Pelog scale from the Gamelan tradition. Despite being named the same, there are countless versions of the same scale, to the point that there are versions of the Pelog scale in 4 out of the 6 clusters (the majority, however, being in cluster f).

[Figure 8 graphic. Panel A: density against distance to nearest scale / cents. Panel B: density against adjacent interval / cents. Panel C: density against mean note distance from equiheptatonic scale / cents. Panel D: density against position / cents for notes 2 to 7. Legend entries: Real-Real, Real-Grid, Real, Grid-Close, Grid-Far, Grid-All.]

Figure 8: A: Distribution of shortest euclidean distance between each real scale and its nearest neighbour (Real-Real), and between each scale enumerated on a grid (Real-Grid) and its closest real scale. Dashed lines indicate boundaries used to establish subsets of Grid-scales as Close to (within 50 cents) or Far from (greater than 100 cents) real scales. B: Distribution of adjacent intervals for real scales, Close, and Far. C: Distribution of distance from the equiheptatonic scale for real scales, Close, and Far. D: Distributions of scale notes for notes 2-7 (missing tonic and octave) for real scales, Close and Far. Dotted lines show corresponding position of notes in the equiheptatonic scale.

Comparison with all possible scales

To put the diversity of scales in a broader context, we can consider the hypothetical world of possible scales by enumerating them on a grid. For brevity, we only present results for 7-note scales, since they are most numerous. We take 20 cents as a basic grid size, for several reasons (limits on singing accuracy and pitch perception; historical smallest intervals: Greek comma, sruti). By limiting the scales to those that include adjacent interval sizes between 100 and 350 cents we already remove over 97% of possible scales (using a grid size of 50 cents results in a reduction of 94%). We enumerate all remaining possible scales on this grid (grid scales), and compare real and grid scales by calculating the distances between them.

We find that the real scales are much more similar to each other than to the grid scales (Fig 8A). We then define subsets of grid scales either far from real scales (Far; distance greater than 100 cents) or close to real scales (Close; distance less than 50 cents). We find again that real scales, and Close scales, tend to have adjacent intervals of about 200 cents, and rarely have large or small intervals; compared to Far scales, which exhibit a more uniform distribution (Fig 8B). By comparing the distributions of each note separately, we can see that real scales tend to have notes near to 7-TET tuning systems (Fig 8D); this is especially salient for notes near 500 and 700

cents. We then look at the average distance between notes in real / grid scales and the equidistant heptatonic scale (Fig 8C). We find that in 80% of real scales, notes are within on average 50 cents of 7-TET. In contrast, only 20% of the sampled grid scales are within this region. If we take into account the entire set of grid scales (not only those with adjacent intervals 100 ≤ I_A ≤ 350 cents), the probability of randomly selecting a scale within 50 cents of 7-TET is less than 1%. Repeating the analysis for 5-note scales leads to the same qualitative conclusion (SI Fig X).

Discussion

How different are scales?

A key finding in this work is that despite some differences, scales across cultures are remarkably similar. Out of all the scales we found, most can be thought of as variations on the 5- and 7-note equidistant scales. This is at odds with some recent literature that claims that equidistant scales are rare [5, 9, 102–104], a claim which seems to be aided by some hypothesised benefits of unequal step sizes [105, 106]. It seems that equidistant scales can only be considered rare if you are extremely conservative about what level of precision must be reached to be called equidistant. Out of all possible scales that one can make, equidistant scales are exceedingly rare (Figure 8); yet 80% of heptatonic scales found in this work are closer to equiheptatonic than 99% of possible heptatonic scales. Thus, the results unequivocally show that scales tend to be approximately equidistant, with 5 or 7 notes (Figure 5); and variation in scales can be described in terms of how they deviate from equidistance.

It is important to consider that some differences may be artefacts arising from comparing theory and measured scales. Theory scales are ideal representations, but when performed they must exhibit some tuning variability. When this is studied, it typically raises the question of which ideal tuning system is intended, e.g., 12-TET or just intonation (JI) [83, 84, 107], despite the fact that differences in intervals from different tuning systems are undetectable due to limits on singing stable pitches and pitch perception [11]. In contrast, when discussing intonation in societies without theoretical scales, tuning variation is often highlighted as an intrinsic part of that culture [44, 48, 108–110]. There is indeed evidence that variability in tuning is intentional in some societies [107, 109, 111], and in other societies musicians may tolerate large deviations in their tunings [112], but it feels at times like there is an underlying assumption in the literature of differences between societies with established music theory, and those without. Unfortunately, we lack sufficient examples of instrument tunings from societies with theory scales for a thorough comparison, and microtonal deviations may have been airbrushed out of historical records due to the reliance on Western notation in transcription [113]. Despite this, we briefly compare the empirical tuning variability in cultures. We first calculated standard deviations between notes in the slendro scale for two cases [26]: comparing repeated measurements of the same Gamelan orchestra at different times (8 cents), and comparing notes across orchestras (16 cents). Pelog scales showed more variability (13 cents, and 37 cents). Analysis of all Thai scales in the database gives a standard deviation of 22 cents. To compare with Western music we calculate the deviation between notes in different tunings of Greek modes, including 12-TET and just intonation (JI) (7 cents), and the deviation between similar-sized intervals in a Belgian carillon (15 cents) [114]. While these results lack statistical power, they suggest that the differences in tuning variability between societies are not necessarily that great; i.e., what scholars perceive as difference may not be perceived as such by others.

How did far-away societies come to use such similar scales? The explanation undoubtedly entails some combination of cultural diffusion and convergent evolution, but discerning between these two is fiendishly difficult. The finding that most scales are almost equidistant seems to result from convergent evolution (Figure 8), yet we also find evidence that geographical regions can specialize in certain types of scales (Figure 7). Several attempts have been made to link musical traditions between South East Asia and Africa [60, 115, 116] (and Central America and Africa [65]). However, lacking any rigorous analyses of scale content, they only focused on the apparent equiheptatonic nature of the scales. This appears to be further evidence of bias in the perception of differences between scales in the literature cited here. Many equiheptatonic scales from across the world are similar to theory scales, but this is rarely stated. For example, there are Thai scales that differ from the JI Lydian mode by less than 43 cents (euclidean distance), which is the same as the distance between the JI and the 12-TET version of the Lydian mode. Upon actually comparing the statistics of how similar scales are from Africa, South East Asia, and Western Europe, we find substantial overlap between all regions (SI Fig 9). While there is undoubtedly some transmission of information and scales across cultures, the level of similarity seen across such great geographical divides points mainly to convergent evolution, rather than cultural diffusion.

How do scales change over time?

Scales can change, or persist, over time in a variety of ways. On a short time-scale, vocal (descriptive) scales are inherently statistical, due to a lack of precision in motor control and pitch perception [117]. Performances are ephemeral, so in between performances scales reside in memory; this introduces another mechanism of change. Unlike vocal scales, instrument tunings physically persist through time, and the extent depends on how the instrument is constructed. Environmental changes (temperature, humidity) will quickly affect the tuning of stringed instruments, but effects will be slower for a metal xylophone. Through physical force, the metal reeds of an mbira may be knocked out of place [29], or a string may be broken. To illustrate this point, we checked sources for

examples of repeat tunings of the same instruments across a specific timeframe: Gamelan orchestras were found to have a standard deviation of 8 (slendro) and 13 (pelog) cents over about 25 years [26]; a likembe (plucked metal lamellophone) was found to vary by 18 cents over a few weeks [29]; a kora was found to vary by 27 cents over one week [32]. Alternatively, scales can exist in collective human memory through music theory; e.g., one can trace the history of the major scale from the Ionian mode given in Pythagorean tuning, up to the present-day common use of 12-TET. To counteract this change, musical theory (and other technological innovations) is perhaps the most robust method of preserving a tuning standard over time. Implicit awareness of tonal fusion would have been sufficient to establish an early method of maintaining a tuning standard over time, by tuning simultaneous tones according to octaves and fifths. In more recent times this has been reinforced through global tuning standards (concert pitch) and inventions such as fixed pitch instruments and electric tuners. Thus, we can see that scales change; the timescale of change can depend on technology (quality of instrument construction / materials, invention of digital music technology).

One factor that greatly complicates the study of evolution of scales is that it is possible to invent an entirely new scale; this is completely at odds with typical ideas of evolution as incremental change. There are many examples of the use and invention of microtonal tunings in Western music [16, 44, 118]. In fact, many theory scales bear the hallmarks of mathematical generation. Greek modes are all circular permutants, based on simple integer ratios [119]. The current system of Carnatic melakarta is a result of combinatorial enumeration of a set of intervals, constrained by a set of rules [100]. Similarly, other cultures that have a long history of mathematical scholarship [120] all use overlapping sets of scales based on theoretical divisions of the octave [121–126].

Scales naturally change over time, and they can be invented completely from scratch. Despite this, scales are surprisingly similar across cultures, which suggests there is some selection for certain types of scales.

How are scales selected?

What are the possible selection pressures acting on scales? We propose three categories of selection pressure. Cultural evolution biases are based on how many people, or which people in your group are using the scale [127]. Cognitive biases depend on how pitch is perceived by humans. Production biases depend on what is easy or difficult to sing, and physical constraints on instruments.

There are some notable examples of cultural selection in scales. For some reason, humans tend to synchronize (multiple reasons have been suggested on the basis of evolutionary psychology, e.g., in-group signalling, social bonding, signalling of coalition strength [128, 129]), which may be what leads to the apparent conformity bias (selection for popularity). In Western Europe, 12-tone equal temperament (12-TET) was selected for its ability to allow musicians to synchronize tunings with less effort. Similarly, Hindustani music, which traditionally uses just intonation (JI) tuning, has shown a shift towards 12-TET since the introduction of the harmonium, an imported instrument with pitches fixed to 12-TET [83]. One study found that scales inferred from African recordings shifted towards 12-TET over time [130]. Novelty biases can be found in the numerous 20th-century experiments with microtonal tunings [118]; Gamelan tuners also indicate the use of tuning variability as a form of expression [109]. There is also some indication of a prestige / success bias, where tunings are chosen because they are associated with success, or good ability [67].

It is more difficult to talk of cognitive biases; they are less direct so it is difficult to verify their impact. Some have suggested a bias towards harmonicity; while the suggested mechanism of vocal similarity leaves a lot to be desired [131], the prevalence of octaves and fifths does hint that harmonicity is important. Others have highlighted the need to efficiently, and reliably, communicate and remember melodies [10, 132]. There is a trade off between the complexity of scales and the cost / error rate of these two processes, which may explain why adjacent intervals are rarely smaller than a semitone (∼100 cents) [77]. Some have also suggested that sensory dissonance, or beats, contributes to the evolution of scales [133]. It is true that many cultures show an aversion to beats, but the opposite trend is found in some Balkan and Lithuanian singing traditions [81]. Despite the relevance of some established psychophysical phenomena, which of them, if any, affects the evolution of scales probably differs by society.

Production constraints can apply to both singing and instruments [134]. When singing, large intervals are more difficult to produce than small intervals, although due to limits on motor control, it is impossible to reliably produce intervals below a certain size [135]. Vocal range is typically constrained to about two octaves for both men and women, with women singing typically about an octave higher than men. For instruments, there are clear limits to how many notes one can fit (how many strings, reeds, blocks, etc.), and to what size of interval ranges you can achieve. Aerophones may naturally result in a tendency to use harmonic intervals, since they are often easily achieved by overblowing.

Due consideration should be given to the stories of people who tune instruments. Tuning instruments by comparing harmonic intervals has a rich history. Using fifths to match the sound of two strings goes back as far as ancient Greek [119], and perhaps ancient Babylonian times [136], while in modern times tuning a guitar mainly requires matching of fourths. Similar accounts are found throughout the world on using fifths and octaves to tune instruments [30, 32, 39, 47, 68]. Musicians will also tune their instruments according to the sound of adjacent intervals [18, 29, 31]. An underappreciated form of tuning is to tune an instrument visually [67, 68]; e.g., are the holes on a flute evenly spaced? Do the xylophone bars increase in size according to a smooth gradient? One can also tune according to some standard (outstanding Gamelan orchestra [26]; the best-sounding player in a group; using an electric tuner [67]). The spectral quality can also be tuned, as a secondary consideration [109]. While it may be difficult to reconcile the individual experiences of tuning an instrument with some population-level cognitive bias, it would be imprudent to ignore it.

In addition to considering the possible selection pressures, we ought to consider the effect of group size on selection. Population genetics theory has shown that the effect of selection is modulated by group size [137]. If there is weak selection in a small population, the effects may not be noticeable. However, if the group size is large, even weak selection effects will be apparent. This may help explain why in some of the regions where theory scales are used, the same (or very similar) scales have been used for over a thousand years. This may be due to their influential status in court and art music, which would have led to large group sizes.

How can we study the evolution of scales?

The challenge of studying the evolution of scales appears quite formidable. Scales change over time; the rate at which they change may vary wildly depending on the instrument / technology; moreover, they can be invented from scratch. There are numerous possible selection pressures that are not mutually exclusive, and their effects may be modulated by group size. Nonetheless we suggest that the following two approaches are possible.

The first approach is to study the evolutionary dynamics in cases where it is appropriate to make simplifying assumptions, and where one has sufficient data. We can think of one such case. Gamelan orchestras are typically tuned in reference to another orchestra [109]; the instruments go out of tune at a slow, steady rate compared to other instruments. Also, each Gamelan orchestra typically has been tuned to the same orchestra for a while, such that there is a network of dependencies between Gamelan orchestras. Gamelan tunings are also extensively documented; if these are supplemented by recordings, there ought to be a considerable amount of data that may extend back by several decades. Nonetheless, it is not clear that such a study on evolution of Gamelan orchestra tunings would tell us much about evolution of scales in other settings.

Another approach is to disregard the mechanisms by which scales change, and to focus instead on the selection pressures. Indeed, we have shown that there are likely some strong selection pressures at play, given the substantial convergence in scales across cultures. This approach requires tractable, relevant mathematical models; multiple selection pressures must be considered in tandem; models ought to be able to not only predict the scales in use, but predict which scales are not used. Finally, multiple models may have convergent predictions, which means it may be impossible to disentangle the different effects. A major barrier to this approach is that any results will depend on the sample of scales studied, yet it is impossible to know a priori what is a suitable representative sample of scales. A representative sample could mean one that equally weights different societies. Does it matter if societies have different population sizes? Is this appropriate if one society tends to use a single scale, while another has a richer repertoire? There are methodological choices in drawing boundaries between societies, and one ought to control for transmission between societies; more abstractions are not necessarily good. Is it appropriate to balance by frequency of use? Indeed, in some cultures there are many scales and some are not necessarily used much [138]; however, there is little data on this. Ultimately, this is a difficult problem, and it seems like the only appropriate approach is to try multiple approaches.

Bias in the scale database

We are aware that the database itself has several potential sources of bias. First, there is the separation between theory and measured scales; a naive user may wrongly assume that in cultures which use theory scales intervals always have exact values. Then there are problems associated with relying on data collected by only a few ethnomusicologists. Being limited in number, they are limited in the types of locations they visited; one can see in Figure 2 that some regions are totally absent. Some have suggested that ethnomusicologists have a bias towards reporting findings that are considered ‘interesting’, and thus inflate diversity in the database [88]. Some musical traditions (Gamelan, Thai) were very popular research topics, so they are over-represented. Unfortunately, in some rare instances it seems that there are statistical irregularities in the reporting of tunings: Surjodiningrat et al. [26] note that Jaap Kunst [139] (not included in this database) reported gamelan tunings where all the higher notes were exactly an octave above the lower ones, and that this is extremely unlikely; in a study of prehistoric bone flutes where the tunings are given to an accuracy of 1 cent, one flute is recorded as having a series of equal tempered intervals: [200, 200, 200, 300] [43]. In all, these biases do not invalidate the data, but one must consider how biases may affect any analyses and subsequent conclusions.

Limitations to studying scale evolution

It may be the case that, as with languages, we are witnessing the death of diversity of scales due to the combined forces of globalization and technological change [140]. There is evidence of homogenization, resulting in widespread adoption of 12-TET [83, 130]. Perhaps this is due to the proliferation of mass-produced, fixed-pitch instruments. Perhaps it is due to a full-strength conformity bias, as the internet effectively increases group size. Ultimately, this suggests that if we want to understand how scales evolve, we must look to the past. However, therein lies a different problem: the older the instrument, the less certain we can be about how it was played. For prehistoric artefacts, we cannot be sure whether they (or their reconstructions) faithfully resemble the instrument in its

original condition, and the only instruments that remain sufficiently intact to play are aerophones. Depending on the type of aerophone, it is possible to get a range of about a semitone (100 cents) by varying air flow, flow direction, and embouchure [141, 142]. Some instruments, like ocarinas, have many viable cross-fingerings that can result in a dazzling range of scales [141–143]. Thus, we believe that the best source of scales is in ethnographic recordings spanning the past century [71], so it is imperative that methods be developed that can faithfully infer scales from large samples of songs. Algorithms must be developed that can handle low-quality recordings [144], background noise, instrument / singing segmentation [145], polyphonic stream segmentation [146], note segmentation [147], and tonal drift [148].

In addition to improved computational methods, it is strictly necessary to work with large samples. Due to (i) relatively small differences between tunings, (ii) error in measurements of tunings, and (iii) inherent variability in tunings due to cognitive limitations, it will take a lot of data to make definitive claims. In addition, the challenge is further compounded by the wide range of potential selection pressures that we have identified. The strength of selection may vary widely across cultures, as may the selection pressures themselves. Some societies may focus on rhythmic complexity, to the point that weak / few selection pressures appear to act on scales. Cultures that lack simultaneous singing may be unmoved by the octave [89, 92]. It is possible to untangle these more subtle effects, but only with multiple recordings from each society, and also detailed ethnographic texts that indicate how music was performed. Despite the limitations on studying the evolution of scales, we believe that with sufficient large-scale efforts, we may one day gain a deep understanding of why music sounds the way it does.

Conclusion

Scales are a cornerstone of music across the world, upon which endless combinations of melodies can be generated. Surprisingly, despite a wealth of ethnomusicological research on the subject, we lacked a comprehensive, diverse synthesis of scales of the world. Here we remedy this issue, with a focus on quantitative data that will enable detailed statistical analyses about how scales evolve. Our own preliminary analyses have lent quantitative and qualitative support for the widespread (but not necessarily universal) use of the octave in some special capacity. We have shown that despite the rich diversity of scales, when put in context of how many scales are possible, what stands out is how remarkably similar they are across the globe. We compose a treatise on the evolution of scales, and propose promising avenues for future research.

Author Contributions

J.M. and T.T. designed research; J.M. performed research; J.M. analyzed data; J.M. and T.T. wrote the paper.

References

[1] F. Lerdahl and R. S. Jackendoff. A Generative Theory of Tonal Music, Reissue, With a New Preface. MIT press, 1996.

[2] Edward M. Burns. Intervals, scales, and tuning. In Diana Deutsch, editor, The Psychology of Music (Second Edition), pages 215–264. Academic Press, San Diego, second edition, 1999. doi: https://doi.org/10.1016/B978-012213564-4/50008-1.

[3] E. C. Carterette and R. A. Kendall. Comparative music perception and cognition. In Diana Deutsch, editor, The Psychology of Music (Second Edition), pages 725–791. Academic Press, San Diego, second edition, 1999. doi: https://doi.org/10.1016/B978-012213564-4/50008-1.

[4] Tran Van Khe. Is the pentatonic universal? a few reflections on pentatonism. The World of Music, 19(1/2):76–84, 1977.

[5] P. E. Savage, S. Brown, E. Sakai, and T. E. Currie. Statistical universals reveal the structures and functions of human music. P. Natl. Acad. Sci. Usa., 112(29):8987–8992, 2015. doi: 10.1073/pnas.1414495112.

[6] M. Kolinski. Recent trends in ethnomusicology. Ethnomusicology, 11(1):1–24, 1967. doi: 10.2307/850496.

[7] Dane L. Harwood. Universals in music: A perspective from cognitive psychology. Ethnomusicology, 20(3):521–533, 1976.

[8] S. E. Trehub. Human processing predispositions and musical universals. In Nils L. Wallin, Björn Merker, and Steven Brown, editors, The origins of music, pages 427–448, 2000.

[9] Steven Brown and Joseph Jordania. Universals in the world’s musics. Psychol. Music, 41(2):229–248, 2013. doi: 10.1177/0305735611425896.

[10] Peter Q. Pfordresher and Steven Brown. Vocal mistuning reveals the origin of musical scales. Eur. J. Cogn. Psychol., 29(1):35–52, 2017. doi: 10.1080/20445911.2015.1132024.

[11] R. Parncutt and G. Hair. A psychocultural theory of musical interval: Bye bye pythagoras. Springer. Handb. Audit., 35(4):475–501, 2018. doi: 10.1525/mp.2018.35.4.475.

[12] C. Stumpf. Tonpsychologie. Leipzig, 1890.

[13] M. J. McPherson, S. E. Dolan, A. Durango, T. Ossandon, J. Valdés, E. A. Undurraga, N. Jacoby, R. A. Godoy, and J. H. McDermott. Perceptual fusion of musical notes by native amazonians suggests universal representations of musical

intervals. Nat. Commun., 11(1):2786, 2020. doi: 10.1038/s41467-020-16448-6.

[14] R. N. Shepard. Geometrical approximations to the structure of musical pitch. Psychol. Rev., 89(4):305–333, 1982. doi: 10.1037/0033-295X.89.4.305.

[15] M. J. Hewitt. Musical Scales of the World. Note Tree, 2013.

[16] H. Rechberger. Scales and Modes Around the World: The Complete Guide to the Scales and Modes of the World. Fennica Gehrman Ltd., 2018.

[17] A. J. Ellis. On the Musical Scales of Various Nations. Journal of the Society of arts, 1885.

[18] K. P. Wachsmann. An equal-stepped tuning in a ganda harp. Nature, 165(4184):40–41, 1950. doi: 10.1038/165040a0.

[19] Gerhard Kubik. Harp music of the azande and related peoples in the central african republic: (part i: horizontal harp playing). Afr. Music, 3(3):37–76, 1964.

[20] R. Brandel. The Music of Central Africa: An Ethnomusicological Study: Former French Equatorial Africa the Former Belgian Congo, Ruanda-Urundi Uganda, Tanganyika. Springer Science & Business Media, 1967.

[21] J. Kunst. Music in New Guinea. Brill, 1967.

[22] Gilbert Rouget and J. Schwarz. Sur les xylophones équiheptaphoniques des malinké. Rev. Musicol., 55(1):47–77, 1969. doi: 10.2307/927751.

[23] A. Tracey. The matepe mbira music of rhodesia. Afr. Music, 4(4):37–61, 1970. doi: 10.21504/amj.v4i4.1681.

[24] A. Tracey. The nyanga panpipe dance. Afr. Music, 5(1):73–89, 1971. doi: 10.21504/amj.v5i1.1152.

[25] R. Knight. Kora music from the gambia, played by foday musa suso, 1976.

[26] W. Surjodiningrat, A. Susanto, and P. J. Sudarjana. Tone Measurements of Outstanding Javanese Gamelans in Jogjakarta and Surakarta. Gadjah Mada University Press, 1972.

[27] D. Morton and C. Duriyanga. The Traditional Music of Thailand, volume 8. Univ of California Press, 1976.

[28] Joerg Haeberli. Twelve nasca panpipes: A study. Ethnomusicology, 23(1):57–74, 1979. doi: 10.2307/851338.

[29] Gerhard Kubik. Likembe tunings of kufuna kandonga (angola). Afr. Music, 6(1):70–88, 1980.

[30] W. Van Zanten. The equidistant heptatonic scale of the asena in malawi. Afr. Music, 6(1):107–125, 1980. doi: 10.21504/amj.v6i1.1099.

[31] Hugo Zemp. Melanesian solo polyphonic panpipe music. Ethnomusicology, 25(3):383–418, 1981. doi: 10.2307/851551.

[32] B. A. Aning. Tuning the kora: A case study of the norms of a gambian musician. J. Afr. Stud., 9(3):164, 1982.

[33] Ho Lu-Ting and Han Kuo-huang. On chinese scales and national modes. Asian Music, 14(1):132–154, 1982. doi: 10.2307/834047.

[34] G. Kubik. A structural examination of homophonic multi-part singing in east and central africa. Anuario Musical, 39:27, 1984.

[35] Gerhard Kubik. African tone-systems: A reassessment. Yearb. Tradit. Music, 17:31–63, 1985. doi: 10.2307/768436.

[36] Robert Gottlieb. Sudan ii: Music of the blue nile province; the ingessana and berta tribes, 1986.

[37] R. Yu-An, E. C. Carterette, and W. Yu-Kui. A comparison of the musical scales of the ancient chinese bronze bell ensemble and the modern bamboo flute. Percept. Psychophys., 41(6):547–562, 1987. doi: 10.3758/BF03210489.

[38] D. H. Keefe, E. M. Burns, and P. Nguyen. Vietnamese modal scales of the dan tranh. Music Percept., 8(4):449–468, 1991. doi: 10.2307/40285522.

[39] A. Tracey. Kambazithe makolekole and his valimba group: A glimpse of the technique of the sena xylophone. Afr. Music, 7(1):82–104, 1991. doi: 10.21504/amj.v7i1.1932.

[40] Edward C. Carterette, Roger A. Kendall, and Sue Carole De Vale. Comparative acoustical and psychoacoustical analyses of gamelan instrument tones. Journal of the Acoustical Society of Japan (E), 14(6):383–396, 1993. doi: 10.1250/ast.14.383.

[41] Albrecht Schneider. Sound, pitch, and scale: From "tone measurements" to sonological analysis in ethnomusicology. Ethnomusicology, 45(3):489–519, 2001. doi: 10.2307/852868.

[42] K. Attakitmongcol, R. Chinvejkitvanich, and S. Sujitjorn. Characterization of traditional thai musical scale. In Proceedings of the 5th WSEAS International Conference on Acoustics and Music: Theory & Applications (AMTA’04), 2004.

[43] J. Zhang, X. Xiao, and Y. K. Lee. The early development of music. analysis of the jiahu bone flutes. Antiquity, 78(302):769–778, 2004. doi: 10.1017/S0003598X00113432.

[44] W. A. Sethares. Tuning, Timbre, Spectrum, Scale. Springer Science & Business Media, 2005.
[45] L. E. McNeil and S. Mitran. Vibrational frequencies and tuning of the african mbira. J. Acoust. Soc. Am., 123(2):1169–1178, 2008. doi: 10.1121/1.2828063.
[46] J. L. Strand. The Sambla Xylophone: Tradition and Identity in Burkina Faso. PhD thesis, Wesleyan University, Connecticut, 2009.
[47] M. Kuss. Music in Latin America and the Caribbean: An Encyclopedic History: Volume 1: Performing Beliefs: Indigenous Peoples of South America, Central America, and Mexico. University of Texas Press, 2010.
[48] J. Garzoli. The myth of equidistance in thai tuning. Anal Approaches Music, 4(2):1–29, 2015.
[49] N. Wisuttipat. Relative nature of thai traditional music through its tuning system. International Journal of Creative and Arts Studies, 2(1):86–97, 2015. doi: 10.24821/ijcas.v2i1.1441.
[50] A. Morkonr, S. Punkubutra, et al. The collecting process of xylophone’s sound d (ran¯adxek) from art to numerical data. In 2018 International Conference on Engineering, Applied Sciences, and Technology (ICEAST), pages 1–4, 2018. doi: 10.1109/ICEAST.2018.8434434.
[51] R. Bader. Temperament in tuning systems of southeast asia and ancient india. In Computational Phonogram Archiving, pages 75–107. Springer, 2019. doi: 10.1007/978-3-030-02695-0_3.
[52] C. M. L. Kimberlin. Masinqo and the Nature of Qanat. PhD thesis, The University of California, Los Angeles, 1976.
[53] Unjung Nam. Pitch distributions in korean court music: Evidence consistent with tonal hierarchies. Music Percept., 16(2):243–247, 1998. doi: 10.2307/40285789.
[54] S. Weisser and F. Falceto. Investigating qanat in amhara secular music: An acoustic and historical study. Annales d’Éthiopie, 28(1):299–322, 2013. doi: 10.3406/ethio.2013.1539.
[55] Rytis Ambrazevičius. The perception and transcription of the scale reconsidered: Several lithuanian cases. The World of Music, 47(2):31–53, 2005.
[56] Rytis Ambrazevičius. Modelling of scales in traditional solo singing. Music. Sci., 10(1_suppl):65–87, 2006. doi: 10.1177/1029864906010001041.
[57] P. R. Cooke. Ludaya – a transverse flute from eastern uganda. Yearb. Int. Folk Music Council, 3:79–90, 1971. doi: 10.2307/767457.
[58] Robert Garfias. Preliminary thoughts on burmese modes. Asian Music, 7(1):39–49, 1975.
[59] A. M. Jones. A kwaikèr indian xylophone. Ethnomusicology, 10(1):43–47, 1966.
[60] A. M. Jones. Indonesia and africa: The xylophone as a culture-indicator. African Music: Journal of the International Library of African Music, 2(3):36–47, 1960. doi: 10.21504/amj.v2i3.608.
[61] G. Kubik. The endara xylophone of bukonjo. African Music: Journal of the International Library of African Music, 3(1):43–48, 1962. doi: 10.21504/amj.v3i1.736.
[62] G. Kubik. Discovery of a trough xylophone in northern mozambique. African Music: Journal of the International Library of African Music, 3(2):11–14, 1963. doi: 10.21504/amj.v3i2.826.
[63] Gerhard Kubik. Embaire xylophone music of samusiri babalanda (uganda 1968). The World of Music, 34(1):57–84, 1992.
[64] Alan Thrasher. The transverse flute in traditional chinese music. Asian Music, 10(1):92–114, 1978.
[65] C. Miñana Blasco. Afinación de las marimbas en la costa pacífica colombiana: Un ejemplo de la memoria interválica africana en colombia. 1990.
[66] J. Sundberg and P. Tjernlund. Computer measurements of the tone scale in performed music by means of frequency histograms. STL-QPS, 10(2-3):33–35, 1969.
[67] Lyndsey Copeland. Pitch and tuning in beninese brass bands. Ethnomusicology Forum, 27(2):213–240, 2018. doi: 10.1080/17411912.2018.1518151.
[68] T. E. Miller. Traditional Music of the Lao: Kaen Playing and Mawlum Singing in Northeast Thailand. Number 13. Praeger, 1985.
[69] Frank Scherbaum, Nana Mzhavanadze, Simha Arom, Sebastian Rosenzweig, and Meinard Müller. Tonal Organization of the Erkomaishvili Dataset: Pitches, Scales, Melodies and Harmonies. Number 1. 2020. doi: 10.25932/publishup-47614.
[70] Aline Honingh and Rens Bod. In search of universal properties of musical scales. J. New Music Res., 40(1):81–89, 2011. doi: 10.1080/09298215.2010.543281.
[71] A. Wood, K. R. Kirby, C. Ember, S. Silbert, H. Daikoku, J. Mcbride, S. Passmore, F. Paulay, M. Flory, J. Szinger, et al. The global jukebox: A public database of performing arts and culture. PsyArXiv, 2021.

[72] Joren Six, Olmo Cornelis, and Marc Leman. Tarsos, a modular platform for precise pitch analysis of western and non-western music. J. New Music Res., 42(2):113–129, 2013. doi: 10.1080/09298215.2013.797999.
[73] Olmo Cornelis, Joren Six, Andre Holzapfel, and Marc Leman. Evaluation and recommendation of pulse and tempo annotation in ethnic music. J. New Music Res., 42(2):131–149, 2013. doi: 10.1080/09298215.2013.812123.
[74] Y. Ozaki, J. McBride, E. Benetos, P. Pfordresher, J. Six, A. Tierney, P. Proutskova, E. Sakai, H. Kondo, H. Fukatsu, et al. Agreement among human and automated transcriptions of global songs. PsyArXiv, 2021.
[75] J. A. Siegel and W. Siegel. Categorical perception of tonal intervals: Musicians can’t tell sharp from flat. Percept. Psychophys., 21(5):399–407, 1977. doi: 10.1037/h0094008.
[76] J. H. McDermott, M. V. Keebler, C. Micheyl, and A. J. Oxenham. Musical intervals and relative pitch: Frequency resolution, not interval resolution, is special. J. Acoust. Soc. Am., 128(4):1943–1951, 2010. doi: 10.1121/1.3478785.
[77] J. M. Zarate, C. R. Ritson, and D. Poeppel. Pitch-interval discrimination and musical expertise: Is the semitone a perceptual boundary? J. Acoust. Soc. Am., 132(2):984–993, 2012. doi: 10.1121/1.4733535.
[78] P. Larrouy-Maestri, P. M. C. Harrison, and D. Müllensiefen. The mistuning perception test: A new measurement instrument. Behav. Res. Methods, 51(2):663–675, 2019. doi: 10.3758/s13428-019-01225-1.
[79] B. Hagerman and J. Sundberg. Fundamental frequency adjustment in barbershop singing. Speech, Music and Hearing Quarterly Progress and Status Report, 21(1):28–42, 1980.
[80] H. Jers and S. Ternström. Intonation analysis of a multi-channel choir recording. TMHQPSR Speech, Music and Hearing: Quarterly Progress and Status Report, 47(1):1–6, 2005.
[81] R. Ambrazevičius and I. Wiśniewska. Tonal hierarchies in sutartinės. Journal of interdisciplinary music studies, 3(1/2):45–55, 2009.
[82] J. Devaney, M. I. Mandel, D. P. W. Ellis, and I. Fujinaga. Automatically extracting performance data from recordings of trained singers. PsychoMusicology, 21(1-2):108–136, 2011. doi: 10.1037/h0094008.
[83] J. Serra, G. K. Koduri, M. Miron, and X. Serra. Assessing the tuning of sung indian classical music. In ISMIR, pages 157–162, 2011.
[84] Sara D’Amario, David M. Howard, Helena Daffern, and Nicola Pennill. A longitudinal study of intonation in an a cappella singing quintet. J. Voice, 34(1):159.e13–159.e27, 2020. doi: 10.1016/j.jvoice.2018.07.015.
[85] Mieczyslaw Kolinski. Recent trends in ethnomusicology. Ethnomusicology, 11(1):1–24, 1967.
[86] W. Udo. Two types of octave relationships in central australian vocal music? Musicology Australia, 20(1):6–14, 1997. doi: 10.1080/08145857.1997.10415970.
[87] B. Nettl. An ethnomusicologist contemplates universals in musical sound and musical culture. In Nils L. Wallin, Björn Merker, and Steven Brown, editors, The origins of music, volume 3, pages 463–472, 2000.
[88] B. Nettl. The Study of Ethnomusicology: Thirty-One Issues and Concepts. University of Illinois Press, 2010.
[89] Nori Jacoby, Eduardo A. Undurraga, Malinda J. McPherson, Joaquín Valdés, Tomás Ossandón, and Josh H. McDermott. Universal and non-universal features of musical pitch perception revealed by singing. Curr. Biol., 2019. doi: 10.1016/j.cub.2019.08.020.
[90] W. J. Dowling and A. W. Hollombe. The perception of melodies distorted by splitting into several octaves: Effects of increasing proximity and melodic contour. Perception & Psychophysics, 21(1):60–64, 1977. doi: 10.3758/BF03199469.
[91] Carol L. Krumhansl, Pekka Toivanen, Tuomas Eerola, Petri Toiviainen, Topi Järvinen, and Jukka Louhivuori. Cross-cultural music cognition: Cognitive methodology applied to north sami yoiks. Cognition, 76(1):13–58, 2000. doi: 10.1016/S0010-0277(00)00068-8.
[92] D. Bonnard, C. Micheyl, C. Semal, R. Dauman, and L. Demany. Auditory discrimination of frequency ratios: The octave singularity. J. Exp. Psychol. Human., 39(3):788–801, 2013. doi: 10.1037/a0030095.
[93] Laurent Demany, Guilherme Monteiro, Catherine Semal, Shihab Shamma, and Robert P. Carlyon. The perception of octave pitch affinity and harmonic fusion have a common origin. Hearing Res., 404:108213, 2021. doi: 10.1016/j.heares.2021.108213.
[94] D. Allen. Octave discriminability of musical and non-musical subjects. Psychon. Sci., 7(12):421–422, 1967. doi: 10.3758/BF03331154.
[95] H. J. Kallman. Octave equivalence as measured by similarity ratings. Perception & Psychophysics, 32(1):37–49, 1982. doi: 10.3758/BF03204867.

[96] M. Hoeschele, R. G. Weisman, and C. B. Sturdy. Pitch chroma discrimination, generalization, and transfer tests of octave equivalence in humans. Attention, Perception, & Psychophysics, 74(8):1742–1760, 2012. doi: 10.3758/s13414-012-0364-2.
[97] M. J. McPherson, R. C. Grace, and J. H. McDermott. Harmonicity aids hearing in noise. bioRxiv, 2020. doi: 10.1101/2020.09.30.321000.
[98] M. J. McPherson and J. H. McDermott. Time-dependent discrimination advantages for harmonic sounds suggest efficient coding for memory. P. Natl. Acad. Sci. Usa., 117(50):32169–32180, 2020. doi: 10.1073/pnas.2008956117.
[99] S. A. Mehr, M. Singh, D. Knox, D. M. Ketter, D. Pickens-Jones, S. Atwood, C. Lucas, N. Jacoby, A. A. Egner, E. J. Hopkins, R. M. Howard, J. K. Hartshorne, M. V. Jennings, J. Simson, C. M. Bainbridge, S. Pinker, T. J. O’Donnell, M. M. Krasnow, and L. Glowacki. Universality and diversity in human song. Science, 366(6468), 2019. doi: 10.1126/science.aax0868.
[100] K. G. Vijayakrishnan. The Grammar of Carnatic Music, volume 8. Walter de Gruyter, 2007.
[101] Eric Jones, Travis Oliphant, Pearu Peterson, et al. Scipy: Open source scientific tools for Python, 2001–.
[102] J. McDermott and M. Hauser. The origins of music: Innateness, uniqueness, and evolution. Music Percept., 23(1):29–59, 2005. doi: 10.1525/mp.2005.23.1.29.
[103] P. Ball. The Music Instinct: How Music Works and Why We Can’t Do Without It. Random House, 2010.
[104] Barry Ross and Sarah Knight. Reports of equitonic scale systems in african musical traditions and their implications for cognitive models of pitch organization. Music. Sci., page 1029864917736105, 2017. doi: 10.1177/1029864917736105.
[105] Gerald J. Balzano. The group-theoretic description of 12-fold and microtonal pitch systems. Comput. Music J., 4(4):66–84, 1980. doi: 10.2307/3679467.
[106] S. E. Trehub, E. G. Schellenberg, and S. B. Kamenetsky. Infants’ and adults’ perception of scale structure. J. Exp. Psychol. Human., 25(4):965, 1999. doi: 10.1037/0096-1523.25.4.965.
[107] S. Rosenzweig, F. Scherbaum, D. Shugliashvili, V. Arifi-Müller, and M. Müller. Erkomaishvili dataset: A curated corpus of traditional georgian vocal music for computational musicology. Transactions of the International Society for Music Information Retrieval, 3(1), 2020.
[108] Jay Rahn. Javanese pélog tunings reconsidered. Yearb. Int. Folk Music Council, 10:69–82, 1978. doi: 10.2307/767348.
[109] Roger Vetter. A retrospect on a century of gamelan tone measurements. Ethnomusicology, 33(2):217–227, 1989. doi: 10.2307/924396.
[110] Marc Perlman. American gamelan in the garden of eden: Intonation in a cross-cultural encounter. The Musical Quarterly, 78(3):510–555, 1994.
[111] Michael Theodore Coolen. The fodet: A senegambian origin for the blues? The Black Perspective in Music, 10(1):69–84, 1982.
[112] Simha Arom and Susanne Fürniss. An interactive experimental method for the determination of musical scales in oral cultures. Contemp. Music Rev., 9(1-2):7–12, 1993. doi: 10.1080/07494469300640301.
[113] D. J. Chadwin. Applying Microtonality to Pop Songwriting: A Study of Microtones in Pop Music. PhD thesis, University of Huddersfield, 2019.
[114] A. Schneider and M. Leman. Sound, Pitches and Tuning of a Historic Carillon, pages 247–298. Springer International Publishing, Cham, 2017. doi: 10.1007/978-3-319-47292-8_9.
[115] Percival R. Kirby. The indonesian origin of certain african musical instruments. Afr. Stud-uk., 25(1):3–22, 1966. doi: 10.1080/00020186608707224.
[116] R. Blench. Using diverse sources of evidence for reconstructing the past history of musical exchanges in the indian ocean. Afr. Archaeol. Rev., 31(4):675–703, 2014. doi: 10.1007/s10437-014-9178-z.
[117] S. M. Hutchins and I. Peretz. A frog in your throat or in your ear? searching for the causes of poor singing. Journal of Experimental Psychology: General, 141(1):76–97, 2012. doi: 10.1037/a0025064.
[118] H. Partch. Genesis of a Music. University of Wisconsin Press, 1950.
[119] Richard L. Crocker. Pythagorean mathematics and music. The Journal of Aesthetics and Art Criticism, 22(2):189–198, 1963. doi: 10.2307/427754.
[120] M. K. J. Goodman. An Introduction to the Early Development of Mathematics. John Wiley & Sons, 2016.
[121] S. Marcus. The interface between theory and practice: Intonation in arab music. Asian Music, 24(2):39–58, 1993. doi: 10.2307/834466.
[122] I. Katz. Henry George Farmer and the First International Congress of Arab Music (Cairo 1932). Brill, 2015.

[123] H. Farhat. The Dastgah Concept in Persian Music. Cambridge University Press, 2004.
[124] K. L. Signell. Makam: Modal Practice in Turkish Art Music, volume 4. Da Capo Pr, 1977.
[125] E. te Nijenhuis. Dattilam: A Compendium of Ancient Indian Music, volume 11. Brill Archive, 1970.
[126] Lothar von Falkenhausen. On the early development of chinese musical theory: The rise of pitch-standards. J. Am. Oriental Soc., 112(3):433–439, 1992.
[127] N. Creanza, O. Kolodny, and M. W. Feldman. Cultural evolutionary theory: How culture evolves and why it matters. P. Natl. Acad. Sci. Usa., 114(30):7782–7789, 2017. doi: 10.1073/pnas.1620732114.
[128] P. E. Savage, P. Loui, B. Tarr, A. Schachner, L. Glowacki, S. Mithen, and W. T. Fitch. Music as a coevolved system for social bonding. Behav. Brain Sci., page 1–36, 2020. doi: 10.1017/S0140525X20000333.
[129] S. A. Mehr, M. M. Krasnow, G. A. Bryant, and E. H. Hagen. Origins of music in credible signaling. Behav. Brain Sci., page 1–41, 2020. doi: 10.1017/S0140525X20000345.
[130] Dirk Moelants, Olmo Cornelis, and Marc Leman. Exploring african tone scales. In Proceedings of the 10th International Society for Music Information Retrieval Conference, pages 489–494, Kobe, Japan, 2009. ISMIR. doi: 10.5281/zenodo.1416338.
[131] K. Z. Gill and D. Purves. A biological rationale for musical scales. Plos One, 4(12):1–9, 2009. doi: 10.1371/journal.pone.0008144.
[132] W. J. Dowling. The cognitive framework for musical scales in various cultures. J. Acoust. Soc. Am., 84(S1):S204–S204, 1988. doi: 10.1121/1.2026117.
[133] J. J. Aucouturier. The hypothesis of self-organization for musical tuning systems. Leonardo Music J., 18:63–69, 2008. doi: 10.1162/lmj.2008.18.63.
[134] A. T. Tierney, F. A. Russo, and A. D. Patel. The motor origins of human and avian song structure. P. Natl. Acad. Sci. Usa., 108(37):15510–15515, 2011. doi: 10.1073/pnas.1103882108.
[135] J. Sundberg, I. Titze, and R. Scherer. Phonatory control in male singing: A study of the effects of subglottal pressure, fundamental frequency, and mode of phonation on the voice source. J. Voice, 7(1):15–29, 1993. doi: 10.1016/S0892-1997(05)80108-0.
[136] M. L. West. The babylonian musical notation and the hurrian melodic texts. Music Lett., 75(2):161–179, 1994. doi: 10.2307/737674.
[137] J. H. Gillespie. Population Genetics: A Concise Guide. JHU Press, 2004.
[138] M. C. Can. Geleneksel türk sanat müziğinde arel ezgi uzdilek ses sistemi ve uygulamada kullanılmayan bazı perdeler. Gazi Üniversitesi Gazi Eğitim Fakültesi Dergisi, 22(1), 2002.
[139] J. Kunst. Music in Java: Its History, Its Theory and Its Technique. Springer, 1949.
[140] T. Cowen. Creative Destruction: How Globalization Is Changing the World’s Cultures. Princeton University Press, 2009.
[141] G. A. S. Santiago. Los Artefactos Sonoros Del Oaxaca Prehispánico, volume 3. Secretaría de Cultura del Estado de Oaxaca, 2005.
[142] Susan Rawcliffe. Eight west mexican flutes in the fowler museum. The World of Music, 49(2):45–65, 2007.
[143] Shigeru Yoshikawa and Kazue Kajiwara. Cross fingerings and associated intonation anomaly in the shakuhachi. Acoustical Science and Technology, 36(4):314–325, 2015. doi: 10.1250/ast.36.314.
[144] Gfeller, Dominik Roblek, Marco Tagliasacchi, and Pen Li. Learning to denoise historical music. In ISMIR 2020 - 21st International Society for Music Information Retrieval Conference, 2020.
[145] M. Marolt, C. Bohak, A. Kavčič, and M. Pesek. Automatic segmentation of ethnomusicological field recordings. Applied Sciences, 9(3), 2019. doi: 10.3390/app9030439.
[146] E. Benetos, S. Dixon, Z. Duan, and S. Ewert. Automatic music transcription: An overview. Ieee Signal Proc. Mag., 36(1):20–30, 2019. doi: 10.1109/MSP.2018.2869928.
[147] M. Mauch, C. Cannam, R. Bittner, G. Fazekas, J. Salamon, J. Dai, J. Bello, and S. Dixon. Computer-aided melody note transcription using the tony software: Accuracy and efficiency. In Proceedings of the First International Conference on Technologies for Music Notation and Representation, 2015.
[148] M. Mauch, K. Frieler, and S. Dixon. Intonation in unaccompanied singing: Accuracy, drift, and a model of reference pitch memory. J. Acoust. Soc. Am., 136(1):401–411, 2014. doi: 10.1121/1.4881915.
