Mol Genet Genomics DOI 10.1007/s00438-012-0683-y

ORIGINAL PAPER

A comparative phylogenetic study of genetics and folk music

Horolma Pamjav • Zolta´n Juha´sz • Andrea Zala´n • Endre Ne´meth • Bayarlkhagva Damdin

Received: 7 December 2011 / Accepted: 21 February 2012 Ó Springer-Verlag 2012

Abstract Computer-aided comparison of folk music corresponding musical cultures were also compared using from different nations is one of the newest research areas. an automatic overlap analysis of parallel melody styles for We were intrigued to have identified some important 31 Eurasian nations. We found that close musical relations similarities between phylogenetic studies and modern folk of populations indicate close genetic distances (\0.05) with music. First of all, both of them use similar concepts and a probability of 82%. It was observed that there is a sig- representation tools such as multidimensional scaling for nificant correlation between population genetics and folk modelling relationship between populations. This gave us music; maternal lineages have a more important role in the idea to investigate whether these connections are folk music traditions than paternal lineages. Furthermore, merely accidental or if they mirror population migrations the combination of these disciplines establishing a new from the past. We raised the question; does the complex interdisciplinary research field of ‘‘music-genetics’’ can be structure of musical connections display a clear picture and an efficient tool to get a more comprehensive picture on the can this system be interpreted by the genetic analysis? This complex behaviour of populations in prehistoric time. study is the first to systematically investigate the incidental genetic background of the folk music context between Keywords Y haplogroups mtDNA haplogroups Folk different populations. Paternal (42 populations) and music Music genetics Human demographic history maternal lineages (56 populations) were compared based on Fst genetic distances of the Y chromosomal and mtDNA haplogroup frequencies. To test this hypothesis, the Introduction

Computer-aided comparison of traditional music features Communicated by S. Hohmann. from different cultures is one of the newest research areas. Electronic supplementary material The online version of this While a considerable amount of genetic studies has been article (doi:10.1007/s00438-012-0683-y) contains supplementary carried out in connection with human evolution history and material, which is available to authorized users. migrations, relations between genetic research and music cultures have never been investigated. Thus, we turned our H. Pamjav (&) A. Zala´n E. Ne´meth DNA Laboratory, Institute of Forensic Medicine, attention to correlation patterns between genetic data and Network of Forensic Science Institutes, , folk music melodies. PO Box 216, 1536, Human history is tightly related with a history of pop- e-mail: [email protected] ulation migration. Patterns of genetic diversity provide Z. Juha´sz information about population history because each major Department of Complex Systems, Research Institute for demographic event left an imprint on genomic diversity of Technical Physics and Materials Science, Budapest, Hungary populations. These demographic signatures are passed from generation to generation, thus the genomes of modern B. Damdin Department of Molecular Biology, Faculty of Biology, National individuals reflect their demographic history. Studies for University of Mongolia, Ulaanbaatar, Mongolia evolutionary history have benefited from analyses of 123 Mol Genet Genomics mitochondrial DNA (mtDNA) and the non-recombining variation is an essential feature of most oral cultures. Cross- region of the Y chromosome. Because these regions of cultural analysis of different nations in Eurasia drew a well- human genome are not shuffled by recombination, they are interpretable network of ‘‘genetic’’ relations between transmitted intact from one generation to the next, reveal- musical traditions (Juha´sz 2000; Juha´sz and Sipos 2009). ing the maternal and paternal lineages of a population. This led us to the question whether these connections are Populations share mtDNA or Y chromosome lineages as a merely accidental or can they possibly relate to the result of common origins or gene flow (admixture). Use of migration of populations in any way? Does the complex mtDNA and the Y chromosome has been extensively structure of musical connections display a clear picture and studied and used in migration studies and in the analysis of can we interpret this connection system on a genetic basis population history and origins (Wells et al. 2001; Cinnioglu as well? et al. 2004; Nasidze et al. 2005; Pakendorf et al. 2006). Y chromosome markers tend to show restricted regional distribution, or population specificity, making them ideal in Materials and methods marking unique migration events (Hammer et al. 1997). Similarly, the variation of mtDNA between different pop- Materials ulation groups can be used to estimate the time back to common ancestors. In the last few years, there has been In the present work, we describe the results of a computer- significant progress in reconstructing the detailed genea- aided cross-cultural analysis of 31 representative Eurasian logical branching order of the tree topologies for both and North-American folksong collections, and a parallel mtDNA and the non-recombining region of the Y chro- study of genetic relations between the corresponding pop- mosome (Jobling and Tyler-Smith 2003; Underhill and ulation groups. Our melody database contains digital Kivisild 2007; Karafet et al. 2008). notations of 31 cultures, each of them consisting of Affinities between populations may result from their 1,000–2,500 melodies. The studied cultures are as follows: common origin or from recent admixture due to geographic Chinese, Mongolian, Kyrgyz, Volga Region (Mari–Chuv- proximity. In particular, genetic distances between popu- ash–Tatar–Votiac), Sicilian, Bulgarian, Azeri, Anatolian, lations can generally be related to geographic distances, Karachay, Hungarian, Slovakian, Moravian, Romanian, according to a model of isolation by distance (Cavalli- Finnish, Norwegian, German, Luxembourgish, French, Sforza and Bodmer 1977). But this is not relevant in all Dutch, Irish–Scottish–English (one group), Spanish, cases. Dakota, Komi, , Croatian–Serbian (Balkan group), According to earlier reports, correlation between genetic Kurdish, Russian from the Pskov Area, Navajo and three distance and language as a cultural marker has been regions of Poland: Great Poland, Warmia and Cassubia. debated (Nettle and Harriss 2003). We, however, set out to Simultaneously with the musical database investigation, examine relationships between genetic distance and folk we also calculated corresponding genetic distance matrices music cultures. using widely used software tools (Schneider et al. 2000) The comparative study of different folk music cultures from published data for those populations compared above, goes back to the early 20th century. Barto´k and Koda´ly with the following differences: (1) no genetic data were identified certain musical styles of Hungarian folk music found for the Luxembourgish, Karachay and Dakota Indi- being common with Mari, Chuvash, as well as Anatolian ans, (2) Y chromosomal and mtDNA genetic data were melodies (Barto´k 1949; Koda´ly 1971). These results initi- used for the Croatian; Tatar, Mari, Udmurt/Votiac and ated the expansion of a digitized international folksong Chuvash (instead of Volga Region); Irish and Scottish database (the European Folksong Catalogue), and a soft- populations separately, (3) genetic data of three Hungarian ware system for mathematical analysis of the data in the speaking groups (Hungarian, Szekler and Csango) were 1960s (Cse´bfalvy et al. 1965). Although this project, orga- used (Szekler and Csango Y chromosomal data is unpub- nized by the Hungarian Academy of Sciences was inter- lished). The Szeklers and Csangos are Hungarian speaking rupted, development of computation tools led to the later ethnic groups living in , (4) one Polish population resurrection of the idea (Toiviainen 1996, 2000; Leman data were used for the Y and mtDNA genetic comparisons; 2000; Toiviainen and Eerola 2002; Kranenburg et al. 2009; whereas three Polish folksong databases arising from dif- Garbens et al. 2007; Huron 1996; Juha´sz (2009). ferent areas of Poland were analysed. In recent years, we have developed and enhanced a software system which characterizes the overall similarity Methods for genetic analysis of different folk song databases using a scalar measure. The basic idea of this system is that musical information prop- We calculated genetic distances (Fst) with AMOVA based agates and evolves by endless variation of melodies, since on Y chromosomal and mtDNA haplogroup frequencies 123 Mol Genet Genomics with Arlequin 2.0 software (Schneider et al. 2000). Mul- contour vectors of D dimensions (D element pitch tidimensional scaling (MDS) plots were constructed with sequences) from the digital codes of music notations ViSta 7.9.2.4 software. (Juha´sz 2006). Since D was the same for each melody, the Y chromosomal haplogroups were combined into groups contours could be compared to each other using a common C, D, E, F, I, J2, K, L, N1c, PxR1a and R1a, so that Euclidean distance function defined in the same published sources could be used for comparisons (Bosch D-dimensional melody space, irrespective of their individual et al. 2006; Cinnioglu et al. 2004; Di Gaetano et al. 2009; length. The contour vectors are sampled time functions of the Hammer et al. 2006; Kayser et al. 2005; Maca-Meyer et al. pitch variation, the main rhythmic characteristics of the 2003; Malhi et al. 2008; Nasidze et al. 2002, 2005; Pericic´ melodies are also mapped into the vector space determined et al. 2005a, b; Petrejcı´kova´ et al. 2010; Pimenoff et al. by the contour vectors. At the same time, the tempo is 2008; Rosser et al. 2000; Semino et al. 2000; Tambets et al. completely normalised by this technique, but this is a useful 2004;Vo¨lgyi et al. 2009; Wells et al. 2001; Xue et al. 2006; and general tool of feature extraction in ethnomusicology. Zastera et al. 2010). To avoid problems arising from different notation principles, To compare maternal lineages, we used published the melodies of all collections were transposed to common mtDNA haplogroup frequencies (Bermisheva et al. 2004; final tone G. In most unison folk song cultures, the melodies Bosch et al. 2006; Brandsta¨tter et al. 2007; Comas et al. have a well-defined tonic, and this tonic is identical to the 2004; Derbeneva et al. 2002; Helgason et al. 2001; Irwin final tone; therefore, this technique is musically relevant in et al. 2007; Maca-Meyer et al. 2003; Malyarchuk et al. the most studied cases. However, since this is not generally 2002; Merriwether et al. 1996; Pakendorf et al. 2007; true for Western folksongs, the automatic transposition of Passarino et al. 2002; Pericic´ et al. 2005a; Pimenoff et al. these databases was corrected partly by experts, partly by a 2008; Quintana-Murci et al. 2004; Simoni et al. 2000; further algorithm. Starikovskaya et al. 2005; Yao et al. 2004; http://www. After training, the SOM with the contour vectors of a eupedia.com). Haplogroups were combined into groups A, given culture, the learned contour type vectors belonging to B, C, D, E, F, G, HV, H, I, J, K, L, M (9Q, G, E, D, C, Z), different grid locations of the map represented the most N, P, Q, R, S, T, U 9 K, V, W, X, Y and Z. important contours characterizing the given culture (Juha´sz 2006). In other words, a well-trained SOM represents a Methods for musical analyses musical language which is optimal for ‘‘understanding’’ melodies of its own culture. Using the terms of neural We developed a system for automatic classification of networks, the contour type vectors function as ‘‘receptors’’, melody contours based on a generally applied kind of ‘‘firing’’ or ‘‘being activated’’ when a melody contour is artificial intelligence, the so-called self-organizing map found to be similar enough to them. To compare two dif- (SOM) (Kohonen 1995). The operation of our system is ferent ‘‘musical languages’’, one of them will be ‘‘excited’’ demonstrated in Fig. 1. The first step is to construct melody by the contour types of the other language. Common

Fig. 1 The generation of contour vectors and the learning of averaged contour type vectors using a SOM

123 Mol Genet Genomics musical features can be identified by contour types being This technique is well known in genetics. Therefore, we activated by type vectors of the foreign language, and are able to establish the relationships between musical particular characteristics are also described by those having cultures using the same method that is commonly used in no relations in the other culture. the field of phylogenetic studies. The degree of general relation between cultures A and B can be characterized by the overlap of their contour type sets. More exactly, we defined the degree of relation of the Results ni;j ith and jth cultures as di;j ¼ 1 , where ni;j is the number Ni of contour types of the ith culture activated by those of the The genetic relationships jth one, and Ni are the total number of the contour types of Genetic distances (Fst) of Y chromosomal haplogroup ÀÁthe ith culture. Since the relation is not symmetric frequencies between 42 populations compared (Online di;j 6¼ dj;i ; our definition will not satisfy the criteria of a distance function in the mathematical sense, but we can Resource 1) are displayed as an MDS plot (Fig. 2). Most state that the closer two cultures are to each other, the populations compared tend to group into four clusters in the middle of the plot with two smaller clusters on the smaller are the corresponding di;j and dj;i values. This rate, the relation of any musical culture to the edges, whereas some population groups appear as outliers. others, can be characterized by a 31 dimensional vector Populations were determined to be members of a cluster on the basis that the genetic distances between them were less containing the above defined degrees di;j. To get a clear than 0.05, and also that as many populations as possible picture of the very complicated system of relations should be grouped. Three Hungarian speaking populations described by these non-symmetric data, we applied a spe- (Hungarian, Szekler and Csango) clustered together with cial version of the multidimensional scaling (MDS) method two eastern European (Bulgarian and Serbian), two Tatar (Borg and Groenen 2005).

Fig. 2 Multidimensional scaling (MDS) plot constructed on Fst (Kayser et al. 2005) 22 German (Kayser et al. 2005) 23 Spanish genetic distances of Y haplogroups of the populations compared 1 (Maca-Meyer et al. 2003) 24 Khanty (Pimenoff et al. 2008) 25 Mansi Mari (Semino et al. 2000) 2 Scottish (Rosser et al. 2000) 3 Irish (Pimenoff et al. 2008) 26 Navajo (Malhi et al. 2008) 27 Croatian (Rosser et al. 2000) 4 Bulgarian (Rosser et al. 2000) 5 Udmurt pooled (Pericic´ et al. 2005a, b) 28 Serbian (Pericic´ et al. 2005b) 29 (Semino et al. 2000) 6 Chinese (Xue et al. 2006) 7 Ewenki pooled Ainu (Hammer et al. 2006) 30 Kurdish (Nasidze et al. 2005) 31 (Xue et al. 2006; Tambets et al. 2004) 8 Inner Mongolian (Xue et al. Chuvash (Tambets et al. 2004) 32 Finnish (Tambets et al. 2004) 33 2006) 9 Kyrgyz (Wells et al. 2001) 10 Kazan Tatar (Wells et al. 2001) Norwegian (Tambets et al. 2004) 34 Slovakian (Petrejcı´kova´ et al. 11 Crimean Tatar (Wells et al. 2001) 12 Azeri (Wells et al. 2001) 13 2010) 35 Czech (Zastera et al. 2010) 36 French (Semino et al. 2000) Kurdish (Wells et al. 2001) 14 Russian (Wells et al. 2001) 15 Sicilian 37 Dutch (Semino et al. 2000) 38 Italian (Semino et al. 2000) 39 (Di Gaetano et al. 2008) 16 Azeri (Nasidze et al. 2002) 17 Chechen Hungarian (Vo¨lgyi et al. 2009) 40 Szekler (unpublished data) 41 (Nasidze et al. 2002) 18 Turkish (Cinnioglu et al. 2004) 19 Greek Csango (unpublished data) 42 Outer Mongolian (unpublished data) (Bosch et al. 2006) 20 Romanian (Bosch et al. 2006) 21 Polish 123 Mol Genet Genomics

(Crimean and Kazan), two Kurdish, a Sicilian and a Greek case the genetic distances between populations were less population group (Cluster I). Most western European than 0.05, as well as in the case of the Y chromosomal (German, Dutch, Spanish, Norwegian and Italian) and three MDS plot. central European (Czech, Croatian and Slovakian) popu- Three Hungarian speaking population groups (Hungar- lations grouped into cluster II. Cluster III included three ian, Szekler and Csango) grouped together in cluster I. populations from the Volga Region (Udmurt, Chuvash and Most European populations formed a fairly compact cluster Mansi) and one Russian population. Three Finno-Ugric in Fig. 3 (cluster II). Out of Finn-Ugric speaking people, populations (Mari, Khanty and Finnish) clustered together two Finnish and Mari population groups included in cluster in cluster IV. Two Azerian, Turkish and Chechen popula- II and the rest of them belonged to other Europeans (14 tion groups clustered together in cluster V. Three Asian population groups). Four other European populations populations (Ewenki, Inner and Outer Mongolian) were (Sicilian, Norwegian, French and Spanish) formed cluster included in Cluster VI. III. Cluster IV included three populations from the Volga To place these findings in context of female lineages, Region (Chuvash, Udmurt and Tatar), two Komi groups genetic distances of mtDNA haplogroups between the (Ural), Azeri (Caucasus), Turkish (Anatolia) and two populations from published sources were calculated European populations (Italian and Bulgarian). Three Mansi (Online Resource 2), and were again presented as an MDS (Ural), two Kyrgyz (Central Asia) and one Evenki (Siberia) plot (Fig. 3). Clusters were distinguished as a cluster in population groups belonged to cluster V, whereas six Asian

Fig. 3 Multidimensional scaling (MDS) plot constructed on Fst Finnish (Simoni et al. 2000) 28 Italian, Tuscany (Simoni et al. 2000) genetic distances of mtDNA haplogroups of the populations com- 29 Sicilian (Simoni et al. 2000) 30 Norwegian (Simoni et al. 2000) 31 pared 1 Outer Mongolian (Yao et al. 2004) 2 Chinese (Yao et al. Spanish, Central (Simoni et al. 2000) 32 Turkish, Anatolian 2004) 3 Kyrgyz (Yao et al. 2004) 4 Inner Mongolian (Yao et al. 2004) (Quintana-Murci et al. 2004) 33 Kurdish, Iran (Quintana-Murci 5 Evenks, Stony Tunguska (Pakendorf et al. 2007) 6 Evenks, Iengra et al. 2004) 34 Kurdish, Turkmenistan (Quintana Murci et al. 2004) (Pakendorf et al. 2007) 7 Evenks, Nyukhaza (Pakendorf et al. 2007) 8 35 Slovakian (Malyarchuk et al. 2002) 36 Czech (Malyarchuk et al. Evenks, Yakut speaking (Pakendorf et al. 2007) 9 Crimean Tatar 2002) 37 Greek (Bosch et al. 2006) 38 Romanian (Bosch et al. 2006) (Comas et al. 2004) 10 Kyrgyz (Comas et al. 2004) 11 Evenki 39 Polish (Malyarchuk et al. 2002) 41 Finnish (Derbeneva et al. 2002) (Merriwether et al. 1996) 12 Navajo (Merriwether et al. 1996) 13 42 Mansi, Konda (Derbeneva et al. 2002) 43 Mansi, Lyamin Evenki (Starikovskaya et al. 2005) 14 Mansi (Starikovskaya et al. (Derbeneva et al. 2002) 44 Norwegian (Passarino et al. 2002) 45 2005) 15 Kyrgyz (Bermisheva et al. 2004) 16 Outer Mongolian Russian (Helgason et al. 2001) 46 German (Helgason et al. 2001) 47 (Bermisheva et al. 2004) 17 Azeri (Bermisheva et al. 2004) 18 Tatar Irish (Helgason et al. 2001) 48 Scotish (Helgason et al. 2001) 49 (Bermisheva et al. 2004) 19 Chuvash (Bermisheva et al. 2004) 20 Spanish, Cantabrian (Maca-Meyer et al. 2003) 50 Khanty (Pimenoff Komi, Permyak (Bermisheva et al. 2004) 21 Komi, Zyryan et al. 2008) 51 Mansi (Pimenoff et al. 2008) 52 Croatian (Pericic et al. (Bermisheva et al. 2004) 22 Mari (Bermisheva et al. 2004) 23 2005a) 53 Hungarian (Irwin et al. 2007) 54 Szekler (Brandsta¨tter et al. Udmurt (Bermisheva et al. 2004) 24 Bulgarian (Simoni et al. 2000) 25 2007) 55 Csango (Brandsta¨tter et al. 2007) 56 Dutch (http://www. French (Simoni et al. 2000) 26 German (Simoni et al. 2000) 27 eupedia.com) 123 Mol Genet Genomics population groups (3 Mongolian, Evenki, Kyrgyz and music, containing diatonic melodies of low ambitus has Chinese populations) were grouped into cluster VI (Fig. 3). close Romanian and Spanish contacts. On the other hand, a Cluster VII included two Evenki (Siberia) and Mansi (Ural) further significant layer in the Spanish corpus containing population groups. Out of 56 population groups compared, plagal melodies shows close French and Finnish relations. the other ones were outliers as shown in Fig. 3. The Central European group (III: Hungarian–Slovakian– Great Polish corpora) has close contacts with almost all of The system of musical relations the mentioned families, due to the particularly extended relations of Hungarian folk music. The close Chinese, The musical contacts revealed by the above described Volga, Karachay, Sicilian, Turkish, Romanian, Spanish, SOM-based analysis are represented in Fig. 4. The loca- Polish, Norwegian, Finnish, Irish–Scottish–English and tions of the nodes representing the 31 cultures were Slovakian connections determine a central location for the determined according to the MDS principle. The lines Hungarian node, but this optimal position yields still rather indicate a very close musical contact between the con- long edges to the related cultures, due to the limitations of nected nodes (both di;j and dj;i are less than a threshold the two-dimensional approximation. Musical and historical distance of 0.44), and all close musical contacts are indi- analysis of the Hungarian melody layers being in contact cated in Fig. 4. Although the aim of the algorithm is to with other cultures exceeds the aim of this paper, but some minimize the average length of the edges, Fig. 4 shows that aspects have already been published in previous publica- some of them remained rather long after learning. This fact tions (Juha´sz and Sipos 2009). shows that the algorithm has found an optimal, but not Up to this point, we aimed to the identify groups of exact two-dimensional mapping of the complicated system cultures containing a significant amount of common mel- of connections. However, the structure of the graph really ody types. The complementary approach is to search for reflects the following main properties: a very simplified large groups of melody types being simultaneously found description of the Chinese–Mongolian–Volga (IV) group in a common subset of the 31 cultures. To accomplish this can refer to the dominance of the melodies of high ambitus task, we determined the totality of the important contour and pentatonic scale. The descending contours in this group types appearing in our melody database by training a large have close contacts in Dakota, Anatolian Turkish, Karac- SOM with all of the contour types previously learned by hay and Sicilian (V) folk music, although these latter the 31 national/areal SOMs. Being in possession of this variants are mainly diatonic. Another layer of Turkish large ‘‘unified’’ contour type collection, the above question can be formulated as an algorithmic search as follows: we search for the largest subsets of the 31 melody corpora having the greatest amount of common elements in the unified contour type collection. Results are demonstrated in Fig. 5, where the common contour type vectors are assigned to points of the plane, and the resulting point system is arranged by an MDS algorithm. Thus, similar contour types are assigned to near locations of the plane, and the whole arrangement of the point system mirrors the overall similarity of relations of the common contour type collection. A search for four element subsets of cultures with the largest overlap found the Volga–Turkish–Karac- hay–Hungarian and the Chinese–Turkish–Karachay–Hun- garian groups with respectively 46 and 45 totally common types. Black dots denoting these common types are located in a well-defined area of the melody map, consequently they may belong to a well-defined musical style with a continuous system of variants. Fig. 4 Musical language groups determined automatically from the As a reference, we also show common melody types of overlaps of 31 folk song cultures. Aze Azeri; Bal Balkan; Bul Bulgarian; Cass Cassubian (Poland); Chin Chinese; Dak Dakota; Fin four Western cultures in Fig. 5c. These types are located in Finnish; Fr French; Ger German; GPL Great (Central and Southern) a completely different area of the unified cloud, thus, they Poland; Hol Holland; Hun Hungarian; ISE Irish–Scottish–English; stand for completely different musical forms. The number Kar Karachay; Khan: Khanty; Kom: Komi; Kur Kurdish; Kyr of the totally common Western types (17) shows a much Kyrgyz; Lux Luxembourgish; Mor Moravian; Nav Navajo; Nor Norwegian; Rom Romanian; Rus Russian; Sic Sicilian; Slo Slovakian; smaller musical core in the music of Western Europe than Spa Spanish; Tur Turkish; Vol Volga Region; War Warmia (Poland) in the former groups. The search for groups of 7 cultures 123 Mol Genet Genomics containing the most totally common melody types found explained by geographic proximity, but we found con- the same subset completed by Sicilian, Finnish and Dakota, vincing musical correlations between many members of as well as Mongolian, Volga and Sicilian cultures. Fig- this group [for instance: Hungarian–Tatar (Volga), Kurd– ure 5d and e show that the totally common melody types Serbian–Croatian (Balkan)–Azeri—see Fig. 4 and below]. form subsets of the larger black clouds shown in Fig. 5a, b. The close genetic relationships that were detected among The genetic distance data of the above cultures are sum- populations included in clusters II, III, IV, V and VI can be marized in Table 1, where the close relations are high- interpreted by admixture due to geographical proximity. lighted in bold. Ainu, Chinese, Romanian, Mansi and Navajo Indian pop- ulation groups were outliers as shown in Fig. 2. Scottish– Irish (2 and 3) and Polish–Kyrgyz (9 and 21) population Discussion groups showed close affinities to each other (Fig. 2). The MDS plot of mtDNA represented seven distinct Genetic relationships based on MDS plots clusters; generally more populations are grouped into a cluster based on threshold value comparing with Y chro- To evaluate genetic relationships between populations mosomal MDS plot. Close genetic affinity between the investigated, Fst genetic distances were calculated and the three Hungarian speaking population groups is evident general structure of the distance matrix was depicted by (Cluster I). Almost all European population groups com- MDS plots. The genetic clustering of the populations can pared (17 populations) based on the Fst genetic distances of be interpreted as the degree of similarity of population mtDNA haplogroups belong to a more compact cluster structures that often reflect common ancestry or gene flow. than that of Y chromosomal comparison (Fig. 3, Cluster Both Y chromosomal and mtDNA MDS plots were con- II). This observation could also be explained by admixture. structed from nearly the same set of populations depending Population affiliations between French, Spanish, Sicilian on genetic data found. MDS plots of Y chromosome and and Norwegian peoples are probably due to geographic and mtDNA showed six and seven distinct clusters suggesting historical backgrounds (cluster III). Population groups genetic affiliations between populations based on geo- included in cluster IV reflect an interesting genetic rela- graphical and historical background. Most central and tionship: Genetic affinities of three groups of the Volga south-eastern European population groups clustered toge- Region (Chuvash, Udmurt, and Tatar) and two Komi ther indicating close genetic relationships (Fig. 2, cluster groups in Ural are probably associated with geographic I). Genetic affinity of two Kurdish (Turkmenistan, Turkey– proximity. The same can be said for the Azeri, Turkish and Georgia) and two Tatar population groups included in Bulgarian populations, but the Italian population is Cluster I together with other Europeans could not be exceptional in this cluster. Interestingly, the Mansi, Kyrgyz

Fig. 5 The largest overlaps of musical cultures on the great common melody map. Chin Chinese; Fr: French; Ger German; Hol Holland; Hun: Hungarian; Kar Karachay; Lul Luxembourgish; Spa Spanish; Tur Turkish; Vol Volga Region

123 Mol Genet Genomics

and Evenki groups mirrored a close genetic affinity to each other, despite greater geographical distances (Fig. 3, 0.003 0.047 0.004 0.012 0.013 0.022 Cluster V). Close genetic distances between the popula- tions included in clusters VI and VII are acceptable due to historical backgrounds. The largest genetic distances were detected at Ainu, – 0.064 0.028 0.049 0.060 0.065 Chinese and Navajo Indian populations in the MDS of Y chromosomal comparison as well as at two Evenki and Navajo in mtDNA comparison. These outlier populations are labelled as empty diamonds in Figs. 2 and 3. – 0.007 0.043 0.045 0.049 Comparisons of Y chromosomal and mtDNA studies in the investigated populations have indicated greater levels of mtDNA lineage sharing among populations, suggesting that females may have accomplished more mobility during 0.005 0.010 0.0540.086 0.064 0.081 0.095 0.110 0.051 0.052 0.068 0.099 0.094 0.099 human evolution history than males. This observation is er and Csango), Sicilian, Turkish and Finnish populations are consistent with results previously reported (Bamshad et al. 2001). 0.055 0.017 0.025 0.002 0.005 0.054 0.023 0.040 – 0.060

2 Musical and genetic correlations

The results of the comparison of folk music corpora of 31 populations are summarized in Online Resource 3. The 0.053 0.022 0.031 0.011 0.024 0.109 0.004 tones of the edges in Fig. 4 refer to correspondence of musical and genetic similarities. Thick black edges indicate a simultaneously close musical and genetic relation between the corresponding populations; while long dash lines connect populations with a close musical contact 0.096 0.053 0.003 0.040 0.071 0.074 0.080 – – 0.079 0.126 without a close genetic relation. Short dash lines indicate the cases when we could not find genetic data belonging to the related population. No data were available for Dakota, 0.012 0.001 0.007 0.012 0.025 0.036 0.011 0.001 0.049 Karachay and Luxembourgish populations in either the 2 mitochondrial or male lineages. In order to draw as detailed a picture of musical contacts as possible, we did not eliminate these cultures from the musical analysis, 0.174 0.147 0.123 0.255 0.247 0.226 0.200 0.235 – 0.015 although we confined ourselves to the cultures character- 2 ised by simultaneously existing musical and genetic data when analysing the correlation between music and genet- ics. Therefore, the incomplete data do not influence our 0.048 0.195 0.137 0.150 0.169 0.214 0.139accuracy. 0.189 0.158 0.153 0.169 In some other cases, multiple genetic data correspond to the same musical collection. This can be illustrated by the Mari, Chuvash, Tatar and Udmurt genetic distance data, being mapped to the same musical database named ‘‘Vol- 0.012 ga’’, containing a unified folksong collection of these nations. Utilizing this ‘‘Volga’’ musical culture as an example, the genetic distances from the Turkish population are 0.3,

Fst values (females) Chinese Mongolian Mari Udmurt Tatar Chuvash Sicilian Turkish0.2, Hungarian 0.049 Szekler and Csango 0.13, Finnish respectively, for Mari, Udmurt, Tatar and Chuvash Y chromosomal data. Notwithstanding that three of these data are much higher than 0.05, the Turkish– Genetic distances (Fst) of male and females lineages compared Tatar distance fulfils our requirement; therefore, we can say that close contact of the Turkish musical tradition in the Finnish 0.324 0.407 Csango 0.272 0.318 0.325 0.144 Szekler 0.201 0.279 0.260 0.091 Chinese – MongolianMari 0.322 0.346 – 0.458 0.147 – 0.098 0.053 0.104 0.117 0.175 0.090 0.159 0.127 0.127 0.124 UdmurtTatar 0.272 0.187 0.394 0.303 0.165 0.223 – 0.059 – Chuvash 0.197 0.372 0.133 Hungarian 0.222 0.314 0.291 0.105 Sicilian 0.191 0.246 0.309 0.189 0.056 0.136 – Turkish 0.164 0.247 0.299 0.198 Genetic distances between Chinese, Mongolian, Volga Region (Mari, Udmurt, Tatar, Chuvash), Turkish, Hungarian speaking people (Hungarian, Szekl summarized. Bold characters indicate thatFst Fst genetic genetic distances distances of are Y less chromosomal than and 0.05 mtDNA haplogroup frequencies were chosen from the Tables in Online Resource 1 (ESM_1) and 2 (ESM_2) Fst values (males) Table 1 Volga-region can be traced back to the close genetic 123 Mol Genet Genomics relation between the Turkish and Tatar people. Moreover, the mitochondrial genetic distance between Turkish and Tatar populations becomes merely 0.002, therefore, this latter data can be accepted as the characteristic distance of the given populations (see Table 1). As a consequence of this consideration, the simultaneously close musical and genetic relations are indicated by the thick black edges connecting the Turkish and Volga nodes in Fig. 4. We can say in general that a genetic relation between certain characteristic parts of two populations indicates an earlier physical and biological contact; therefore, the musical relation can be traced back to a probable cultural interaction or cooperation of the ancestors of the given Fig. 6 The dependence of some characteristic frequencies on the groups. Therefore, we state the concept of close genetic genetic threshold value. 1 The frequency of the close genetic contacts within the whole set of the 31 cultures. 2 The frequency of the close relation for our case as follows: a close genetic relation genetic contacts within the group of the musically related cultures belonging to a pair of musical cultures is established on ðÞpðgjmÞ¼X=M . 3 The frequency of the close musical contacts condition that at least one of the genetic distance data within the group of genetically related cultures ðÞpðmjgÞ¼X=G determined for any subsets of the corresponding popula- tions is less than the critical threshold value. Since the genetic distance is in the range of 0–1 by contacts within the group of closely related musical cul- definition, our choice of the threshold of 0.05 is a rather tures. In the case of total independence of genetic and rigorous requirement for establishing a close genetic con- musical relations, this curve would be essentially identical tact. This requirement is satisfied by 21% of the Y chro- to that of showing the ratio of the close genetic contacts mosomal and 37% of the mitochondrial data. within the whole set of the studied 31 cultures. However, Counting up the number of cases when a close relation curve 1, representing this latter frequency, is significantly was detected between and consequently lower than curve 2. This observation verifies that the correlation being found between the 1. genetic (G) musical and genetic relations for the genetic threshold of 2. musical (M) and 0.05 is standing independently of the choice of the 3. both musical and genetic (X) data, the conditional threshold value. Curve 3 shows the pðmjgÞ¼X=G proba- probabilities of the events that bility as a function of the genetic distance threshold (the (a) a close genetic relation implies a close musical frequency of the musical relations within the genetically contact, as well as related cultures). In contrast to the above cases, this curve (b) a close musical relation implies a close genetic shows a decreasing tendency, therefore, our statement that contact becomes no musical contacts can be predicted from genetic relations pðmjgÞ¼X=G ¼ 0:28 and pðgjmÞ¼X=M also proved to be independent of the genetic threshold. ¼ 0:82 respectively Comparing Figs. 2 and 3 with Fig. 4, one can see that merely partial correspondences can be found between the According to the low probability of the first case, close genetic and musical clusters. For instance, the close Bul- genetic relations of populations are definitely not garian–Azeri, Serbian–Kurdish, Hungarian–Tatar, Irish– appropriate indicators of similar musical cultures. Dutch, Chuvash–Udmurt and Finnish–Mari musical However, the high value of the second conditional relations really connect cultures belonging to the same probability verifies that a significant similarity of folk genetic cluster in Fig. 2, but the Turkish–Hungarian, music cultures detects significant genetic connections in Mongolian–Chuvash, Finnish–Scottish, etc. musical rela- 82% of the studied cases. In other words, genetic relations tions do not fit into this system at all. This fact is a natural can be predicted from the musical relations with a hit consequence of the asymmetric conditions indicated by the probability of 82%. above discussed asymmetric conditional probability values. The dependence of the correlation of the musical and At the same time, the high predictability of the genetic genetic relations on the genetic threshold value is repre- contacts from musical relations is expressed in the domi- sented in Fig. 6. In this figure, curve 2 shows the variation nance of the thick black edges in Fig. 4. of the pðgjmÞ¼X=M conditional probability as a function As a musical illustration of the weight of this latter of the genetic threshold value. In other words, curve 2 result, we show some examples of similar Hungarian, shows the variation of the frequency of close genetic Norwegian and Appalachian (Appalachian people are

123 Mol Genet Genomics mainly of Scottish and Irish origin) songs in Note 1 (Online common melody types exist in the quartets of the Volga– Resource 4). The musical analysis showed mutually Turkish–Karachay–Hungarian and the Chinese–Turkish– intensive contacts among these cultures, but the high Karachay–Hungarian groups with 46 and 45 totally geographical distances made these results questionable common types, respectively. These types are situated in a before the comparison to the genetic data. However, the continuous part of the common music map, therefore, we genetic distances of mtDNA data show a consequent cor- can state that they form a cohesive musical style with many relation with the musical results: 0.0017, 0.024 and 0.024 melody types of certain common characteristics (for for Norwegian–Appalachian (Scottish), Norwegian–Hun- instance, most of them have a descending contour with a garian and Hungarian–Appalachian relations, respectively range of an octave.). Continuing the systematic search In the light of these data, the melodies in Note 1 (Online among larger subsets of the 31 cultures, we found that the Resource 4) can no longer be considered to be accidentally ensembles having the largest overlaps are the right extensions similar ‘‘compositions’’ of three completely independently of the above quartets: The Volga–Sicilian–Turkish– evolving cultures. It is worth mentioning here that the Karachay–Hungarian–Finnish–Dakota and Chinese–Mon- musical analysis itself excludes this assumption, because gol–Volga–Sicilian–Turkish–Karachay–Hungarian cultures the connecting lines in Fig. 4 refer to an extremely high have the largest sets of common melody types among the number of overlapping melody types in the three cultures. seven element groups (12 and 10, respectively). Apart from Moreover, our examples show that these overlaps cover a the lack of Dakota and Karachay data, the close musical significant amount of common types simultaneously relations of the former group totally correlate with the appearing in all of the three cultures. The first example known genetic relations (See Table 1). It is regrettable that (Online Resource 4a) contains variants of a melody type of the Karachay data is not available, however, most of the domed contour with an ascending fifth transposition genetic distances calculated for the neighbouring Chechen between the first and second sections, while the melodies of people show close genetic connections to Hungarian, example Online Resource 4b represent a descending con- Turkish, Sicilian and Tatar (Volga) populations. According tour type. These significantly different forms illustrate the to these data and some recently published results, a close existence of a rather varied group of melody forms being genetic relationship of the Karachays to the mentioned simultaneously present in the three cultures. populations can be expected (Sen 2010). Accomplishing the above probability analysis for Turning our attention to the second septet, we found mitochondrial as well as male genetic distances separately, large genetic distances of the Chinese people from the we found pðmjgÞ¼0:27 and pðgjmÞ¼0:77 for mitochon- members of the very compact Volga–Sicilian–Turkish– drial and pðmjgÞ¼0:36 and pðgjmÞ¼0:35 for Y chro- Hungarian group. Thus, groups b and e in Fig. 5 are surely mosomal data. These results show that musical relations not completely coherent in a genetic sense; hopefully, this indicate dominantly female genetic contacts. Can this anomaly may also be explained by some future analyses. observation be attributed to brides transferring mitochon- The common area of the above-discussed musical cul- drial genetic information to far populations of their fiance´s, tures in the melody map is totally separate from that of the or to mothers transmitting their musical mother tongue to German–{Luxembourgian–Lorrain}–French–Holland their children even in foreign surroundings? The answer is quartet, where the genetic distances—being less than 0.03 open. All in all, women seem to be much more important for any pair—also indicate very intensive genetic contacts actors in the cross-cultural transfer of musical and genetic (see Fig. 5c). The fact that we found a totally disjunctive information than men, indicating more mobility during our area for the common ‘‘Western’’ and ‘‘Eastern’’ melody evolutionary history as described in the section of genetic forms can be interpreted by the existence of two essentially relationship of the populations. different musical language groups. Musical examples The above-mentioned examples of simultaneously illustrating the typical common melody forms in Western existing Appalachian–Norwegian–Hungarian variants of and Eastern language groups are found in Notes 2–4 common contour types pose some further questions: Are (Online Resource 5–7). The relatively low number of there major or minor groups of common musical styles totally common types in the Western cultures (17) shows which exist simultaneously in many different cultures, and that their pair-wise large overlaps contain mainly different can these data be correlated with parallel genetic relations? types; therefore, they form a much less extended core-style A positive answer to these questions would reveal common than the Eastern group. crystallization points of musical cultures and verify that What melody forms construct the totally common these connections can be traced back to historic or pre- musical styles simultaneously existing in so many cultures? historic reasons. An extended analysis of this core-style exceeds the limi- The answer to these exciting questions is given in Fig. 5, tations of this study; therefore, we illustrate it merely by where we have shown that the largest subsets of totally some examples in Notes 3 and 4 (Online Resource 6–7). 123 Mol Genet Genomics

These melodies, being variants of two common contour genetic sense establishing a new interdisciplinary research types of high ambitus, gradually descending from the field as ‘‘music-genetics’’. This result led to the conclusion octave, indicate a well-defined common style residing in all that musical styles arise from a common musical parent cultures classified above as members of one of the two language of historic and prehistoric origin. ‘‘Eastern’’ groups (Fig. 5). One can hardly imagine a different explanation for such Acknowledgments This work was supported by the Hungarian common musical style than the existence of a common National Research Found (grant no. K81954), and the Network of Forensic Science Institutes (NFSI), Ministry of Public Administration ‘‘parent language’’ as an initial crystallization point of and Justice. We would like to say special thanks to Dr. Eva Susa more or less independent musical evolutions. Of course, (General Director of the Network of Forensic Science Institutes) for this statement cannot be considered as a real conclusion of her support. We thank both reviewers for their constructive comments the present work. It is an assumption that needs further and suggestions and Ati Rosselet for the English editing. historical, archaeological, etc. investigation, but we can state that it is supported by the genetic correlation dis- cussed here. Earlier studies of the contacts of Hungarian References folk music have drawn the conclusion that descending, sometimes fifth transposing melodies of high ambit con- Bamshad M, Kivisild T, Watkins WS, Dixon ME, Ricker CE, Rao stitute the most important part of the contact melodies BB, Naidu JM, Prasad BV, Reddy PG, Rasanayagam A, Papiha between Hungarian as well as Mari, Chuvash (Koda´ly SS, Villems R, Redd AJ, Hammer MF, Nguyen SV, Carroll ML, Batzer MA, Jorde LB (2001) genetic evidence on the origins of 1971), Anatolian (Barto´k 1949; Sipos 2000, 2001, 2006) Indian caste populations. Genome Res 11(6):994–1004 and Mongolian (Sipos 1997) folk music. Our results con- Barto´k B (1949) On collecting folk songs in Turkey. Tempo, New firm that the correspondences found by these independent Ser, No. 13, Bartok Number (Autumn, 1949), pp 15–19 studies were not coincidental, since they can be deduced Bermisheva MA, Kutuev IA, Korshunova TY, Dubova NA, Villems R, Khusnutdinova EK (2004) Phylogeographic analysis of from the existence of the common core of the ‘‘Eastern’’ mitochondrial dna in the nogays: a strong mixture of maternal cultures. According to the above results, most of the pop- lineages from Eastern and Western Eurasia. Mol Biol ulations conserving this musical style in their current cul- 38:516–523 ture have common genetic roots; consequently, the Borg I, Groenen P (2005) Modern multidimensional scaling: theory and applications, 2nd edn. Springer, New York development of this musical culture in very early times can Bosch E, Calafell F, Gonza´lez-Neira A, Flaiz C, Mateu E, Scheil HG, be attributed to their common ancestors. Huckenbeck W, Efremovska L, Mikerezi I, Xirotiris N, Grasa C, In conclusion, using the current methods of genetics, Schmidt H, Comas D (2006) Paternal and maternal lineages in data mining, artificial intelligence research and musicol- the Balkans show a homogeneous landscape over linguistic barriers, except for the isolated Aromuns. Ann Hum Genet ogy, we have drawn the system of musical connections of 70:459–487 31 cultures in Eurasia and verified that these contacts can Brandsta¨tter A, Egyed B, Zimmermann B, Duftner N, Padar Z, Parson be traced back to genetic basis in 82% of the studied cul- W (2007) Migration rates and genetic structure of two Hungarian tures, while close genetic relations of populations are ethnic groups in , Romania. Ann Hum Genet 71:791–803 definitely not appropriate indicators of similar musical Cavalli-Sforza LL, Bodmer WF (1977) The genetics of human cultures (28%). Such a result can be easily accepted in the populations. W.H. Freeman and Co., San Francisco case of people living recently in the same area, like the Cinniog˘lu C, King R, Kivisild T, Kalfog˘lu E, Atasoy S, Cavalleri GL, Western countries, Slovakians and , and Lillie AS, Roseman CC, Lin AA, Prince K, Oefner PJ, Shen P, Semino O, Cavalli-Sforza LL, Underhill PA (2004) Excavating Norwegians, etc. However, a significant part of our results Y-chromosome haplotype strata in Anatolia. Hum Genet indicate simultaneous musical and genetic similarity of 114:127–148 peoples who live very far from each other over a very long Comas D, Plaza S, Wells RS, Yuldaseva N, Lao O, Calafell F, period. For instance, the simultaneous musical and genetic Bertranpetit J (2004) Admixture, migrations, and dispersals in Central Asia: evidence from maternal DNA lineages. Eur J Hum contacts of Hungarians and Norwegians, Tatars and Sicil- Genet 12:495–504 ians, Sicilians and Turks, etc., emphasize that musical Cse´bfalvy K, Havass M, Ja´rda´nyi P, Vargyas L (1965) Systematiza- contacts may arise from very early times, thus, oral musical tion of tunes by computers. Studia Musicologica 7:253–257 traditions may conserve prehistoric musical forms and Derbeneva OA, Starikovskaya EB, Wallace DC, Sukernik RI (2002) Traces of early Eurasians in the Mansi of northwest Siberia styles. The collaboration of genetics and computational revealed by mitochondrial DNA analysis. Am J Hum Genet musicology was a crucial requirement for uncovering this 70:1009–1014 ‘‘hidden’’ connection. We identified a set of melody forms Di Gaetano C, Cerutti N, Crobu F, Robino C, Inturri S, Gino S, constructing the most extended common (ancient) musical Guarrera S, Underhill PA, King RJ, Romano V, Cali F, Gasparini M, Matullo G, Salerno A, Torre C, Piazza A (2009) Differential style appearing in many cultures simultaneously. Further- Greek and northern African migrations to Sicily are supported by more, we showed that most of the cultures having this genetic evidence from the Y chromosome. Eur J Hum Genet common musical heritage are also closely related in a 17:91–99 123 Mol Genet Genomics

Garbens J, Kranenburg P, Volk A, Wiering F, Veltcamp R, Grijp L among native North Americans: a study of Athapaskan popu- (2007) Using Pitch Stability Among a Group of Aligned Query lation history. Am J Phys Anthropol 137:412–424 Melodies to Retrieve Unidentified Variant Melodies. In: Pro- Malyarchuk BA, Grzybowski T, Derenko MV, Czarny J, Woz´niak M, ceedings of the 8th ISMIR conference Vienna 2007 Mis´cicka-Sliwka D (2002) Mitochondrial DNA variability in Hammer MF, Spurdle AB, Karafet T, Bonner MR, Wood ET, Poles and Russians. Ann Hum Genet 66:261–283 Novelletto A, Malaspina P, Mitchell RJ, Horai S, Jenkins T, Merriwether DA, Hall WW, Vahlne A, Ferrell RE (1996) mtDNA Zegura SL (1997) The geographic distribution of human Y variation indicates Mongolia may have been the source for the chromosome variation. Genetics 145(3):787–805 founding population for the New World. Am J Hum Genet Hammer MF, Karafet TM, Park H, Omoto K, Harihara S, Stoneking 59:204–212 M, Horai S (2006) Dual origins of the Japanese: common ground Nasidze I, Sarkisian T, Kerimov A, Stoneking M (2002) Testing for hunter-gatherer and farmer Y chromosomes. J Hum Genet hypotheses of language replacement in the Caucasus: evidence 51:47–58 from the Y-chromosome. Hum Genet 112:255–261 Helgason A, Hickey E, Goodacre S, Bosnes V, Stefa´nsson K, Ward R, Nasidze I, Quinque D, Ozturk M, Bendukidze N, Stoneking M (2005) Sykes B (2001) mtDNA and the islands of the North Atlantic: MtDNA and Y-chromosome variation in Kurdish groups. Ann estimating the proportions of Norse and Gaelic ancestry. Am J Hum Genet 69:401–412 Hum Genet 68:723–737 Nettle D, Harriss L (2003) Genetic and linguistic affinities between Huron D (1996) The melodic arch in Western folksongs. Comput human populations in Eurasia and West Africa. Hum Biol Musicol 10:3–23 75(3):331–344 Irwin J, Egyed B, Saunier J, Szamosi G, O’Callaghan J, Padar Z, Pakendorf B, Novgorodov IN, Osakovskij VL, Danilova AP, Parsons TJ (2007) Hungarian mtDNA population databases from Protod’jakonov AP, Stoneking M (2006) Investigating the Budapest and the Baranya county Roma. Int J Legal Med effects of prehistoric migrations in Siberia: genetic variation 121:377–383 and the origins of Yakuts. Hum Genet 120:334–353 Jobling MA, Tyler-Smith C (2003) The human Y chromosome: an Pakendorf B, Novgorodov IN, Osakovskij VL, Stoneking M (2007) evolutionary marker comes of age. Nat Rev Genet 4:598–612 Mating patterns amongst Siberian reindeer herders: inferences Juha´sz Z (2000) A model of variation in the music of a Hungarian from mtDNA and Y-chromosomal analyses. Am J Phys ethnic group. J New Music Res 29:159–172 Anthropol 133:1013–1027 Juha´sz Z (2006) A systematic comparison of different European folk Passarino G, Cavalleri GL, Lin AA, Cavalli-Sforza LL, Børresen- music traditions using self-organizing maps. J New Music Res Dale AL, Underhill PA (2002) Different genetic components in 35:95–112 the Norwegian population revealed by the analysis of mtDNA Juha´sz Z (2009) Motive identification in 22 folksong corpora using and Y chromosome polymorphisms. Eur J Hum Genet dynamic time warping and self organising maps. In: Hirata K, 10:521–529 Tzanetakis G (eds) Proceedings of the 10th Internapional Society Pericic´ M, Barac´ Lauc L, Martinovic´ Klaric´ I, Janic´ijevic´ B, Rudan P for Music Information retrieval Conference. International Soci- (2005a) Review of Croatian genetic heritage as revealed by ety for Music Information Retrieval, Kobe, pp 171–176 mitochondrial DNA and Y chromosomal lineages. Croat Med J Juha´sz Z, Sipos J (2009) A comparative analysis of Eurasian folksong 46:502–513 corpora, using self organising maps. J Interdiscip Music Stud Pericic´ M, Lauc LB, Klaric´ IM, Rootsi S, Janic´ijevic B, Rudan I, 4:1–16 Terzic´ R, Colak I, Kvesic´ A, Popovic´ D, Sijacki A, Behluli I, Karafet TM, Mendez FL, Meilerman MB, Underhill PA, Zegura SL, Dordevic D, Efremovska L, Bajec DD, Stefanovic´ BD, Villems Hammer MF (2008) New binary polymorphisms reshape and R, Rudan P (2005b) High-resolution phylogenetic analysis of increase resolution of the human Y chromosomal haplogroup southeastern Europe traces major episodes of paternal gene flow tree. Genome Res 18:830–838 among Slavic populations. Mol Biol Evol 22:1964–1975 Kayser M, Lao O, Anslinger K, Augustin C, Bargel G, Edelmann J, Petrejcˇ´ıkova´ E, Sota´k M, Bernasovska J, Bernasovsky´ I, Sovicova´ A, Elias S, Heinrich M, Henke J, Henke L, Hohoff C, Illing A, Boˆzˇikova´ A, Boronˇova´ I, Gabrikova´ D, Sˇvı´cˇkova´ P, Macˇekova´ Jonkisz A, Kuzniar P, Lebioda A, Lessig R, Lewicki S, S, Cˇ herhova´ V (2010) The genetic structure of the Slovak Maciejewska A, Monies DM, Pawłowski R, Poetsch M, Schmid population revealed by Y-chromosome polymorphisms. Ant- D, Schmidt U, Schneider PM, Stradmann-Bellinghausen B, horopol Sci 118:23–30 Szibor R, Wegener R, Wozniak M, Zoledziewska M, Roewer L, Pimenoff VN, Comas D, Palo JU, Vershubsky G, Kozlov A, Sajantila Dobosz T, Ploski R (2005) Significant genetic differentiation A (2008) Northwest Siberian Khanty and Mansi in the junction between Poland and Germany follows present-day political of West and East Eurasian gene pools as revealed by uniparental borders, as revealed by Y-chromosome analysis. Hum Genet markers. Eur J Hum Genet 16:1254–1264 117:428–443 Quintana-Murci L, Chaix R, Wells RS, Behar DM, Sayar H, Scozzari Koda´ly Z (1971) Folk . Corvina, Budapest R, Rengo C, Al-Zahery N, Semino O, Santachiara-Benerecetti Kohonen T (1995) Self-organising maps. Springer, Berlin AS, Coppa A, Ayub Q, Mohyuddin A, Tyler-Smith C, Qasim Kranenburg P, Volk A, Wiering F, Veltkamp RC (2009) Musical Mehdi S, Torroni A, McElreavey K (2004) Where west meets models for folk-song melody alignment. In: Hirata K, Tzanetakis east: the complex mtDNA landscape of the southwest and G, Yosh K (eds) 10th International Society for Music Informa- Central Asian corridor. Am J Hum Genet 74:827–845 tion Retrieval Conference (ISMIR 2009), pp 507–512 Rosser ZH, Zerjal T, Hurles ME, Adojaan M, Alavantic D, Amorim Leman M (2000) An auditory model of the role of short-term memory A, Amos W, Armenteros M, Arroyo E, Barbujani G, Beckman in probe-tone ratings. Music Perception 17:481–509 G, Beckman L, Bertranpetit J, Bosch E, Bradley DG, Brede G, Maca-Meyer N, Sa´nchez-Velasco P, Flores C, Larruga JM, Gonza´lez Cooper G, Coˆrte-Real HB, de Knijff P, Decorte R, Dubrova YE, AM, Oterino A, Leyva-Cobia´n F (2003) Y chromosome and Evgrafov O, Gilissen A, Glisic S, Go¨lge M, Hill EW, mitochondrial DNA characterization of Pasiegos, a human Jeziorowska A, Kalaydjieva L, Kayser M, Kivisild T, Kra- isolate from Cantabria (Spain). Ann Hum Genet 67:329–339 vchenko SA, Krumina A, Kucinskas V, Lavinha J, Livshits LA, Malhi RS, Gonzalez-Oliver A, Schroeder KB, Kemp BM, Greenberg Malaspina P, Maria S, McElreavey K, Meitinger TA, Mikelsaar JA, Dobrowski SZ, Smith DG, Resendez A, Karafet T, Hammer AV, Mitchell RJ, Nafa K, Nicholson J, Nørby S, Pandya A, Parik M, Zegura S, Brovko T (2008) Distribution of Y chromosomes J, Patsalis PC, Pereira L, Peterlin B, Pielberg G, Prata MJ,

123 Mol Genet Genomics

Previdere´ C, Roewer L, Rootsi S, Rubinsztein DC, Saillard J, Krumina A, Baumanis V, Koziel S, Rickards O, De Stefano GF, Santos FR, Stefanescu G, Sykes BC, Tolun A, Villems R, Tyler- Anagnou N, Pappa KI, Michalodimitrakis E, Fera´kV,Fu¨redi S, Smith C, Jobling MA (2000) Y-chromosomal diversity in Europe Komel R, Beckman L, Villems R (2004) The western and is clinal and influenced primarily by geography, rather than by eastern roots of the Saami–the story of genetic ‘‘outliers’’ told by language. Am J Hum Genet 67:1526–1543 mitochondrial DNA and Y chromosomes. Am J Hum Genet Schneider S, Roessli D, Excoffier L (2000) Arlequin: a software for 74:661–682 population genetics data analysis. Ver 2.000, Genetics and Toiviainen P (1996) Optimizing auditory images and distance metrics Biometry Lab, Department of Anthropology, University of for self-organizing timbre maps. J New Music Res 25:1–30 Geneva Toiviainen P (2000) Symbolic AI versus connectionism in music Semino O, Passarino G, Oefner PJ, Lin AA, Arbuzova S, Beckman LE, research. In: Mirinda E (ed) Readings in music and artificial De Benedictis G, Francalacci P, Kouvatsi A, Limborska S, intelligence. Harwood Academic Publishers, Amsterdam, Marcikiae M, Mika A, Mika B, Primorac D, Santachiara- pp 47–68 Benerecetti AS, Cavalli-Sforza LL, Underhill PA (2000) The Toiviainen P, Eerola T (2002) A computational model of melodic genetic legacy of Paleolithic Homo sapiens sapiens in extant similarity based on multiple representations and self-organizing Europeans: a Y chromosome perspective. Science 290:1155–1159 maps. In: Stevens C, Burham D, McPherson G, Schubert E, Sen A (2010) The genetic history of the Karachays: insights from Rewick J (eds) Proceedings of the 7th International Conference mtDNA and Y-chromosome evidence. https://www.sas.upenn. on Music Perception and Cognition, Causal Productions, Sidney, edu/anthropology/undergraduate/theses/Sen pp 236–239 Simoni L, Calafell F, Pettener D, Bertranpetit J, Barbujani G (2000) Underhill PA, Kivisild T (2007) Use of Y chromosome and Geographic patterns of mtDNA diversity in Europe. Am J Hum mitochondrial DNA population structure in tracing human Genet 66:262–278 migrations. Annu Rev Genet 41:539–564 Sipos J (1997) Similar musical structure in Turkish, Mongolian, Vo¨lgyi A, Zala´n A, Szvetnik E, Pamjav H (2009) Hungarian Tungus and Hungarian folk music. In: Berta A´ (ed) Historical population data for 11 Y-STR and 49 Y-SNP markers. Forensic and linguistic interaction between Inner-Asia and Europe. Sci Int Genet 3:e27–e28 Szeged, pp 305–317 Wells RS, Yuldasheva N, Ruzibakiev R, Underhill PA, Evseeva I, Sipos J (2000) In the wake of Barto´k in Anatolia. European Folklore Blue-Smith J, Jin L, Su B, Pitchappan R, Shanmugalakshmi S, Institute, Budapest (Bibliotheca Traditionis Europeae 2) Balakrishnan K, Read M, Pearson NM, Zerjal T, Webster MT, Sipos J (2001) Egy most felfedezett belso-mongo} ´liai kvintva´lto´ stı´lus Zholoshvili I, Jamarjashvili E, Gambarov S, Nikbin B, Dostiev e´s magyar vonatkoza´sai. Ethnographia 112, pp 1–80 A, Aknazarov O, Zalloua P, Tsoy I, Kitaev M, Mirrakhimov M, Sipos J (2006) Comparative analysis of Hungarian and Turkic folk Chariev A, Bodmer WF (2001) The Eurasian heartland: a music. Edition of TIKA (Tu¨rk I˙s¸birlig˘i ve Kalkınma I˙daresi continental perspective on Y-chromosome diversity. Proc Natl Bas¸kanlıg˘ı) and the Hungarian Embassy, Ankara Acad Sci USA 98:10244–10249 Starikovskaya EB, Sukernik RI, Derbeneva OA, Volodko NV, Ruiz- Xue Y, Zerjal T, Bao W, Zhu S, Shu Q, Xu J, Du R, Fu S, Li P, Hurles Pesini E, Torroni A, Brown MD, Lott MT, Hosseini SH, ME, Yang H, Tyler-Smith C (2006) Male demography in East Huoponen K, Wallace DC (2005) Mitochondrial DNA diversity Asia: a north-south contrast in human population expansion in indigenous populations of the southern extent of Siberia, and times. Genetics 172:2431–2439 the origins of Native American haplogroups. Ann Hum Genet Yao YG, Kong QP, Wang CY, Zhu CL, Zhang YP (2004) Different 69:67–89 matrilineal contributions to genetic structure of ethnic groups in Tambets K, Rootsi S, Kivisild T, Help H, Serk P, Loogva¨li EL, Tolk the silk road region in china. Mol Biol Evol 21:2265–2280 HV, Reidla M, Metspalu E, Pliss L, Balanovsky O, Pshenichnov Zastera J, Roewer L, Willuweit S, Sekerka P, Benesova L, Minarik M A, Balanovska E, Gubina M, Zhadanov S, Osipova L, Damba L, (2010) Assembly of a large Y-STR haplotype database for the Voevoda M, Kutuev I, Bermisheva M, Khusnutdinova E, Gusar Czech population and investigation of its substructure. Forensic V, Grechanina E, Parik J, Pennarun E, Richard C, Chaventre A, Sci Int Genet 4:e75–e78 Moisan JP, Bara´c L, Pericic´ M, Rudan P, Terzic´ R, Mikerezi I,

123