Comparative genetic analysis of grayling (Thymallus sp.) from Central Asia (Kazakhstan, Russia and Mongolia)

Master’s Thesis

In Partial Fulfillment of the Requirements for the Degree Master of Science at the Karl-Franzens University of Graz

Jacqueline Grimm

Institute of Zoology

Supervisor: Assoc. Prof. Dr. Steven Weiss

September 2015

TABLE OF CONTENTS

ABSTRACT
ZUSAMMENFASSUNG
ACKNOWLEDGEMENT
1 INTRODUCTION
  1.1 Phylogeography and its importance
  1.2 Perspectives of Siberian paleo-history
  1.3 Importance of Salmonidae and the genus Thymallus
  1.4 Thymallus species in Eurasia
  1.5 The Central Asian basin and its Thymallus inhabitants
  1.6 Research questions
2 MATERIAL & METHODS
  2.1 Sampling and DNA extraction
  2.2 Genetic analyses - mtDNA
    2.2.1 Amplification and sequencing
    2.2.2 Sequence alignment and phylogenetic analyses
    2.2.3 Haplotype network analysis
  2.3 Genetic analyses – microsatellites (SSR)
    2.3.1 Amplification and screening
    2.3.2 Basic population genetic statistics
  2.4 Genetic analysis – mixed data set
3 RESULTS
  3.1 Inhabitants of the Khovd River basin
    3.1.1 Genetic diversity of Thymallus sp., classified in studied water bodies
    3.1.2 Population structure of Thymallus sp., classified in studied water bodies
    3.1.3 Genetic diversity of samples, classified in different taxa of Thymallus
    3.1.4 Population structure of samples, classified in different taxa of Thymallus
  3.2 Lineage relationship of Thymallus sp. from Kazakhstan
    3.2.1 Microsatellite analysis of genetic diversity
    3.2.2 Microsatellite analysis of population structure
    3.2.3 Phylogenetic relationships using mtDNA
    3.2.4 Divergence times and population history using mixed data set
4 DISCUSSION
  4.1 General genetic differentiation
  4.2 Thymallus inhabiting the Khovd river basin
  4.3 Taxonomic assignment of populations from Kazakhstan
REFERENCES
LIST OF FIGURES
APPENDIX


ABSTRACT

Siberia is a vast region containing four of the world’s ten largest freshwater rivers. The region’s fish diversity was affected by large-scale hydrological dynamics during the glacial and postglacial periods. The widespread populations of the genus Thymallus have attracted the interest of phylogeographers aiming to improve knowledge of the history of these paleo-hydrological events. The first phylogeographic analyses indicated that distinct mitochondrial DNA (mtDNA) lineages correspond to major drainage systems. Recent molecular research, coupled with systematic morphological analysis, exposed the within-basin diversity and extreme population substructure of grayling. Despite numerous studies, the number of species and their systematic scheme are still unclear. In this study, seven microsatellite loci were used to increase the genetic resolution of previously published data on grayling of the Khovd River basin in Mongolia. The central question was whether the basin’s phenotypic diversity comprises one (i.e. T. brevirostris) or more species. Additionally, data on the mtDNA control region and flanking transfer RNA genes were combined with microsatellite data to assess the population genetic structure and lineage relationship of Thymallus from the upper reaches of the Irtysh River in Kazakhstan, including Lake Markakol. Allelic variation of microsatellites revealed no evidence of multiple species in the Central Asian basin and suggests that phenotypic variation in Mongolian grayling is the result of rapid adaptation to different habitats. Results of both mtDNA and microsatellite analyses support a sister relationship between Mongolian grayling (T. brevirostris) and grayling of the upper reaches of the Irtysh (T. sp.). These grayling, together with four other described taxa of the region (T. nikolskyi, T. svetovidovi, T. nigrescens, T. baicalensis), are spread across three major river basins, are relatively closely related and are presumed to have a common ancestor.


ZUSAMMENFASSUNG (SUMMARY)

Siberia is a vast region that contains four of the ten largest freshwater rivers. The diversity of the region's fishes was influenced by large-scale hydrological dynamics during glacial and postglacial periods. The widespread populations of Thymallus aroused the interest of phylogeographers who wanted to expand knowledge of the course of these paleo-hydrological events. According to the first phylogeographic analyses, specific mitochondrial lineages belonged to major drainage systems. Systematic morphological analyses and recent molecular research revealed the diversity of grayling within a basin and their pronounced population structure. However, the number of species and their systematics are still unclear. In this study, seven microsatellites were used to improve the genetic resolution of previously published data on grayling from the Khovd River basin in Mongolia. The question was whether the phenotypic diversity in the basin comprises one (T. brevirostris) or several species. Using the mtDNA control region, flanking transfer RNA genes and microsatellites, the population genetic structure and lineage relationships of Thymallus from the upper reaches of the Irtysh in Kazakhstan, as well as Lake Markakol, were assessed. The allelic variability of the microsatellites gave no indication of multiple species in the Central Asian basin, which suggests that phenotypic differences in Mongolian grayling are the result of rapid adaptation to different habitats. Analyses of mtDNA and microsatellites point to a sister relationship between Mongolian grayling (T. brevirostris) and the grayling from Kazakhstan (T. sp.). Together with four other taxa of this region (T. nikolskyi, T. svetovidovi, T. nigrescens, T. baicalensis), they are distributed across three major drainage systems, are relatively closely related, and are presumed to share a common ancestor.


ACKNOWLEDGEMENT

I would like to take this opportunity to thank the people who gave their time and shared their knowledge to help me complete my thesis with the best possible result.

First, I would like to give my special thanks to my supervisor, Assoc. Prof. Dr. Steven Weiss, for the patient guidance, encouragement and advice he has provided throughout my time as his student. I am very grateful to have a supervisor who cared so much about my work, and who responded to my questions and queries so promptly.

I want to thank Dr. Igor Knizhin from the Irkutsk State University (ISU) for providing some samples, creating the very comprehensive schematic maps of sample locations, and giving additional support in understanding the region. I would also like to thank Sergey Alekseev of the N. K. Koltzov Institute of Developmental Biology of the Russian Academy of Sciences in Moscow and Mirgaliy Baimukanov of the Institute of Hydrobiology and Ecology, Almaty, Kazakhstan, for providing the valuable sample material from Lake Markakol and its surrounding region. Without this support it would not have been possible to carry out the research for my master’s thesis.

I also owe a great deal to Mag. Tamara Schenekar and Dr. Karin Mattersdorfer, who graciously and patiently taught me all the skills I needed in the laboratory and in the field, and who gave me competent advice and support whenever needed. I thank MSc Christine Börger and MSc Laurène Lecaudey for reading the manuscript of this thesis and for their recommendations to improve it. I am very grateful to all my colleagues at the laboratory; the pleasant working atmosphere has always been a perfect source of motivation.

I owe my deepest gratitude to my friends Barbara Martintschitsch and Julia Pockrandt, for their constant help, for their support in many different ways and for being just wonderful. You always had faith in me and I am very thankful for that.

Most importantly, none of this would have been possible without the love and patience of my family. My mother Gisela, to whom this thesis is dedicated, has been a constant source of love, concern and support all these years. I would like to express my heartfelt gratitude to her.


1 INTRODUCTION

1.1 Phylogeography and its importance

Phylogeography is a field of study and an applied discipline dealing with the principles and processes governing the geographic distributions of genealogical lineages (Avise, 2000). Over time, a variety of molecular markers and analytical techniques have been applied to improve the genetic resolution with which phylogeographic structure is evaluated within species or among closely related species. However, phylogeography is not only used for systematic or zoogeographic questions. Increasingly, phylogeographic analysis plays a role in conservation planning and the definition of conservation units (Moritz, 1994; Weiss, Kopun, & Sušnik, 2013). Phylogeography helps us to review the evolutionary history of a species, provides insight into paleo-hydrological events and helps in understanding the impact of ice-age refugia throughout the world.

1.2 Perspectives of Siberian paleo-history

One of the most interesting regions for such investigations in aquatic environments is Siberia, a large region containing four of the ten largest freshwater rivers in the world (Weiss et al., 2007). The paleo-hydrological dynamics of Siberia are far from clear, and there are at least two contrasting perspectives on Siberian paleo-history. Initially it was thought that precipitation was insufficient to build glaciers as extensive as those found in North America and Europe (Froufe, Knizhin, & Weiss, 2005). According to Grosswald (1998), however, Pleistocene glaciation occurred along the polar continental shelves and coastal lowlands and left informative imprints on Siberian drainage systems. North-flowing rivers were blocked by ice sheets, which favored the formation of pro-glacial lakes (Fig. 1). While some of the more extravagant visions of Grosswald have not been supported, there is increasing hard evidence of more extensive glaciation or other forms of ice phenomena both in the Siberian interior (Mangerud et al., 2004; Spielhagen, Erlenkeuser, & Siegert, 2005) and along sea margins (Niessen et al., 2013). Freshwater fishes like those of the family Salmonidae are ideal organisms to aid in reconstructing hydrological history. The distribution of their intraspecific lineages is often linked to the modification of river courses and the isolation of lakes through several glacial epochs.


FIGURE 1: Changing concept of the proglacial drainage systems in northern Eurasia (taken from Grosswald, 1998). Directions of the past meltwater flow are shown by the arrows. (a) paleo-hydrology includes lakes and spillways of Europe and West Siberia; (b) larger ice-sheet system, the western catchment also becomes larger, and a second, east-bound drainage system appeared; (c) ice-sheets were too large, and the ice-free terrains too small for the development of the second Siberian drainage system.

1.3 Importance of Salmonidae and the genus Thymallus

Fishes of the family Salmonidae are native to the Northern Hemisphere and widely distributed over Siberia. They are grouped into three subfamilies, Salmoninae, Coregoninae and Thymallinae (listed as a distinct family Thymallidae in: Osinov & Lebedev, 2000; Skurikhina, Mednikov, & Tugarina, 1985), and comprise 11 genera (Nelson, 2006). Salmonid fishes are valued as food fish: they are caught by commercial fishermen, bred in aquaculture and are a sought-after target of sport fishing. However, they are not only of economic importance, but also serve as an important ecological component of temperate freshwater systems (Weiss et al., 2006). The genus Thymallus in particular arouses the interest of phylogeographers because of its fine-scale population structure and its within- and between-basin diversity. Paleo-hydrological events originating from ice-age perturbations have left an imprint on the genetic architecture of these fishes (Koskinen, Piironen, & Primmer, 2001; Weiss et al., 2002; Froufe et al., 2003a; Knizhin et al., 2004; Stamford & Taylor, 2004). Particularly in Siberia, an increasing number of Thymallus species have been described, but a complete overview of the actual number of species is still lacking.


1.4 Thymallus species in Eurasia

The distribution of Thymallus thymallus is quite widespread in Europe. Its range stretches from the Loire basin, southern France (Persat, Pattee, & Roux, 1978), east to the Balkans and as far south as the Luča River in Montenegro (Jankovic, 1964). The northern boundary of its distribution extends from Great Britain across Scandinavia east to the bay of the river Kara in the Urals (Dujmic, 1997; Weiss et al., 2002). Additionally, European grayling display levels of within-basin fragmentation, although the fragments are not recognized as full species (Sušnik, Snoj, & Dovč, 1999; Gross et al., 2001; Weiss et al., 2002). In the Ural mountain region, its range coincides with that of Thymallus arcticus. Furthermore, evidence has been provided for a degree of hybridization between these two species (Zinoviev, 1980; Shubin & Zakharov, 1984).

T. arcticus has previously been described as occurring throughout northern Eurasia and most of Canadian North America (Scott & Crossman, 1998; Redenbach & Taylor, 1999). Despite significant diversity of this species in North America (Stamford & Taylor, 2004), biologists group all populations under one taxon, T. arcticus (Redenbach & Taylor, 1999; Froufe, Knizhin, & Weiss, 2005). In contrast, Asian ichthyologists and field biologists have a long history of classifying Thymallus on a sub-specific and even intra-sub-specific level. According to genetic investigations in Weiss et al. (2006), T. arcticus is primarily found along the coastal region of Eurasia and in the Lena system, where it is limited to the delta or lowermost courses of the drainage. A second genetically and phenotypically distinct lineage, referred to therein as Thymallus arcticus baicalolenensis (Matveyev et al., 2005), was shown to inhabit the rest of the Lena catchment. Additionally, it was found in northeastern tributaries of Lake Baikal and in three areas of the Amur basin. It is scientifically a new taxon, and in Weiss et al. (2006) a species description as Thymallus lenensis was proposed. Attempts to publish this name, however, have failed due to the precedence of T. arcticus baicalolenensis, and thus if this taxon is to be considered a species (a view shared by both S. Weiss and I. Knizhin), it would have the name Thymallus baicalolenensis. Therefore, to avoid confusion, we will refer to this taxon as T. baicalolenensis throughout this thesis.

Grayling living in Lake Baikal, in the Selenga River basin of Mongolia and in most of the Enisey basin were classified as a subspecies, Thymallus arcticus baicalensis (Dorofeeva, 2002; Knizhin et al., 2006; Knizhin, Bogdanov, & Vasil’eva, 2006), but both initial genetic investigations (Koskinen et al., 2002) and subsequent studies demonstrate that a single taxon inhabits Lake Baikal. Additional research (Weiss et al., 2007) resulted in the recognition of Thymallus baicalensis as a full species.

There is a long history of studying the genus Thymallus in the Amur River basin, beginning in 1869, when Dybowski first described the species Thymallus grubii, which was sampled from the upper reaches of the river. Later, a new species, Thymallus burejensis, was described from the Bureya River (Antonov, 2004). The Bureya is a left tributary of the middle Amur, where this species lives in sympatry with T. grubii. Two years later, T. grubii was split into Thymallus grubii grubii and Thymallus grubii flavomaculatus (Knizhin, Antonov, & Weiss, 2006). The latter taxon is found in the upper reaches of large tributaries of the lower Amur basin and in some rivers flowing to the Tatar Strait, the Sea of Okhotsk, and the Sea of Japan. Finally, fishes from tributaries of the lower and middle Amur River, previously assigned to T. grubii, were redescribed as Thymallus tugarinae (Knizhin et al., 2007).

The upper reaches of the Ob River (Biya, Katun, and Chuya Rivers) adjacent to the Khovd River are populated by Thymallus nikolskyi, first described by Kashchenko (1899). Individuals from the drainage of the Irtysh River, the chief tributary of the Ob River, have been referred to as T. arcticus (Kottelat, 2006), although this designation is not compatible with the views presented in Weiss et al. (2006). According to Mitrofanov & Petr (1999) there is an isolated population in Lake Markakol, which inhabits the large left-hand tributaries of the Irtysh and other streams.

Mongolia’s largest lake, Lake Hovsgol, is inhabited by Thymallus nigrescens. The taxon is “endemic” to the lake and considered a species in Scott and Crossman (1998), while other researchers classified it as a subspecies, Thymallus arcticus nigrescens (Reshetnikov et al., 2002). Clearly, based on the genetic data in Koskinen et al. (2002), grayling from Lake Hovsgol are very closely related to those in Lake Baikal. This leads to the contrasting views of considering Lake Hovsgol grayling either as T. nigrescens, a relatively widely accepted name, or as a close relative (i.e. subspecies) of Baikal grayling (T. baicalensis); however, current data would support it as a subspecies of T. arcticus.

Furthermore, Thymallus svetovidovi was found in the Shishkhed River, Darkhad depression (Knizhin & Weiss, 2009), of the upper Enisey River. Besides this species, T. baicalensis and Arctic grayling T. arcticus also inhabit portions of the Enisey River basin (Knizhin & Weiss, 2009), with the latter occurring only in the lower reaches or delta region of the drainage (similar to T. arcticus in the Lena drainage).


Finally, there is the Mongolian grayling Thymallus brevirostris. Initially, fish of this species were thought to represent the primitive member of its genus, but Koskinen et al. (2002) presented them as a relatively recent development within the Thymallus complex. Mongolian grayling live in the lakes and rivers of the closed Central Asian basin in Western Mongolia and border regions of Kazakhstan and the Tuva Republic (Froufe, Knizhin, & Weiss, 2005).

1.5 The Central Asian basin and its Thymallus inhabitants

The Central Asian basin is a closed drainage basin; that is, it contains rivers and lakes that do not drain to an ocean. Its surface water balance is determined by inputs such as precipitation and surface inflow and by outputs such as evaporation and seepage. For a long time it was thought that the Central Asian basin was also populated by the Arctic grayling T. arcticus. Distinct differences in morphological characters were taken to indicate not only two closely related species, but also a degree of hybridization (Ioganzen, 1945; Pivnicka & Hensel, 1978; Zinov’ev, 2005). Knizhin et al. (2008), however, conducted an investigation that shed light on the distribution of T. arcticus in the Khovd River basin. The authors examined morphological characters, the pattern on the dorsal fin, some biological parameters, and variation in the mtDNA control region. According to this study, the water bodies of western Mongolia are inhabited only by T. brevirostris. This species comprises a small benthophagous form, which was thought to be the Arctic grayling, and a larger predatory form (Fig. 2). The mtDNA control region, which was used in that study, evolves more slowly than other regions of mtDNA in the genus Thymallus (Froufe, Knizhin, & Weiss, 2005). Microsatellites, in contrast, are often highly polymorphic and have a relatively high mutation rate due to replication slippage. Thus, they can provide a greater understanding of population structure.


1.6 Research questions

Therefore, the first aim of this study was to evaluate, using microsatellites as genetic markers, whether the Khovd River is inhabited by only one species, T. brevirostris. The second aim was to evaluate the population genetic structure and lineage relationship of Thymallus in the upper reaches of the Irtysh River, adjacent to the Khovd River basin, including Lake Markakol. We wanted to know whether Thymallus in this region is more closely related to, or conspecific with, described species found in Mongolia, the Enisey, or the upper Ob river basin.

FIGURE 2: External outlines of the head of Thymallus brevirostris sampled from water bodies of Altai (taken from Knizhin et al., 2008). (a, b) representing large predatory forms and (c-f) representing small benthos-eating forms. (a) Lake Khoton; (b) Lake Khurgan; (c) Lake Tolbo; (d) Bogdoin Gol River; (e) Lake Kyndykty-Kol; (f) Lake Khoton.


2 MATERIAL & METHODS

2.1 Sampling and DNA extraction

In 2012, a total of 89 grayling were collected with gill nets from the Kara-Kaba River and two tributaries of Lake Markakol (Kaldzhir River and River Urunkhaika), all part of the Irtysh drainage system. A fin clip was taken from each specimen and preserved in 96% ethanol. Additionally, preserved fin clips from Telezkoye Lake (N = 40) and the basin of the upper Khovd River (N = 54), collected for a previous study (Knizhin et al., 2008), were used (Fig. 3). Whole genomic DNA was isolated using a high salt (ammonium acetate) extraction protocol, modified from Sambrook, Fritsch, & Maniatis (1989).

2.2 Genetic analyses - mtDNA

2.2.1 Amplification and sequencing

The complete mtDNA control region (CR) and partial segments of both flanking tRNA genes were amplified in six individuals (Biya: Telezkoye Lake) using the forward primer LRBT-25 (5’-AGA GCG CCG GTG TTG TAA TC-3’) and the reverse primer LRBT-1195 (5’-GCT AGC GGG ACT TTC TAG GGT C-3’), first reported in Uiblein et al. (2001). Individuals of the Irtysh drainage system (N = 32) were amplified using the primers CRII_Int2F (5’-GGA ATC CCC CGG CTT CTA C-3’), CRI_Int1R (5’-ACT TCC TGG TTT AGG GGT TTG AC-3’) and the internal primer Int5R (5’-ATA TAA GAG AAC GCC CGG CT-3’). Polymerase chain reactions (PCR) were carried out in 25 µl volumes. Each reaction contained 13.87 µl H2O, 5 µl of Phusion GC Buffer, 0.5 µl of 10 mM dNTP, 1.25 µl of each primer (10 mM), 2.5 µl Phusion Polymerase and 1 µl of 100 ng/µl DNA template. Cycle parameters were as follows: initial denaturation at 98°C for 30 sec, followed by 35 cycles of 98°C for 10 sec, annealing at 57°C (LRBT-25 and LRBT-1195) or 55°C (CRII_Int2F, CRI_Int1R and Int5R) for 30 sec and 72°C for 30 sec, and a final extension at 72°C for 10 min. In some cases it was necessary to cut PCR products out of an agarose gel and clean them using the Wizard® SV Gel and PCR Clean-Up System (Promega). The amplified PCR products were purified by ExoSAP-IT (Amersham Biosciences) treatment in a total volume of 9.1 µl using 7 µl of DNA, 0.7 µl ExoSAP-IT enzyme and 1.4 µl sterile dH2O. The purification mix was incubated at 37°C for 15 min, and in a second step the enzyme was inactivated for 15 min at 80°C. Final products were sequenced using a BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems). A total volume of 0.25 µl BigDye Terminator premix, 2 µl 5x buffer and either 0.125 µl forward or 0.125 µl reverse primer was added to each ExoSAP-IT treated sample. Sequencing reactions were run at 94°C for 5 min, followed by 32 cycles of 94°C for 10 sec, 50°C for 5 sec and 60°C for 4 min, and a final extension at 60°C for 7 min. Afterwards, Sephadex G-50 (Amersham Biosciences) was used to purify the sequencing reactions, and the resulting DNA was visualized in both directions on an ABI 3130xl Genetic Analyzer (Applied Biosystems).
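The per-reaction volumes given above for the ExoSAP-IT clean-up and the sequencing reaction can be scaled to a batch of samples. The following is only a minimal sketch of that arithmetic: the helper names `master_mix` and `reaction_total`, the 10% pipetting reserve and the batch size of 32 are assumptions for illustration and are not taken from the thesis.

```python
# Per-reaction volumes in microlitres, copied from the text above.
EXOSAP_TEMPLATE_UL = 7.0
EXOSAP_REAGENTS_UL = {"ExoSAP-IT enzyme": 0.7, "sterile dH2O": 1.4}
SEQUENCING_REAGENTS_UL = {
    "BigDye Terminator premix": 0.25,
    "5x buffer": 2.0,
    "primer (forward or reverse)": 0.125,
}


def master_mix(reagents_ul, n_reactions, overage=0.10):
    """Scale per-reaction reagent volumes to a batch.

    `overage` is a hypothetical pipetting reserve (10% by default);
    the thesis does not state one.
    """
    if n_reactions < 1:
        raise ValueError("n_reactions must be at least 1")
    factor = n_reactions * (1.0 + overage)
    return {name: round(volume * factor, 3) for name, volume in reagents_ul.items()}


def reaction_total(template_ul, reagents_ul):
    """Total volume of one reaction: template plus all reagents."""
    return round(template_ul + sum(reagents_ul.values()), 3)


if __name__ == "__main__":
    # Total ExoSAP-IT reaction volume per sample.
    print(reaction_total(EXOSAP_TEMPLATE_UL, EXOSAP_REAGENTS_UL))
    # Sequencing master mix for an example batch of 32 reactions.
    for name, volume in master_mix(SEQUENCING_REAGENTS_UL, 32).items():
        print(f"{name}: {volume} µl")
```

With the values above, `reaction_total` reproduces the stated ExoSAP-IT total of 9.1 µl, which gives a quick check that the listed components are complete.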

2.2.2 Sequence alignment and phylogenetic analyses

The remaining CR sequences of grayling from different water bodies of Siberia were taken from previously published research (Koskinen et al., 2002; Weiss et al., 2002; Froufe et al., 2003b; Froufe, Knizhin, & Weiss, 2005; Weiss et al., 2006; Weiss et al., 2007; Knizhin et al., 2008; Knizhin & Weiss, 2009; GenBank Accession Nos. in Table 1). Sequences were aligned by eye using the program MEGA version 6.06 (Tamura et al., 2013). The program jModelTest 2 (Darriba et al., 2012) was used to find the most likely among 88 nucleotide substitution models for this dataset, in order to estimate trees using Bayesian inference (BI) and maximum likelihood (ML) approaches. The Hasegawa-Kishino-Yano model (Hasegawa, Kishino, & Yano, 1985) with a proportion of invariable sites and gamma-distributed rate variation across sites (HKY+I+G) was selected to construct a Bayesian 50% majority-rule consensus tree using MrBayes v3.2.2 (Ronquist & Huelsenbeck, 2003). Markov chains (MC3) were run for 1 × 10^6 generations, and trees were sampled every 100 generations. Two independent runs were performed. Tracer version 1.5 (Rambaut & Drummond, 2009) was used to discard all trees within the burn-in phase (25%). ML trees were calculated in RAxML v7.0.4 (Stamatakis, 2006) using the substitution model GTR+I+G (Tavaré, 1986) with 1000 bootstrap replicates to statistically support topologies. Afterwards, the ML trees were used to construct a 50% majority-rule consensus tree in PHYLIP v3.6 (Felsenstein, 2005). We calculated nucleotide diversity (Nei & Li, 1979) using the program DnaSP version 5.10.01 (Librado & Rozas, 2009) to further characterize the differences between Biya, the Khovd basin and the upper reaches of the Irtysh River. The between-group variation in haplotypes for the populations of these three drainage systems, and for all species of Thymallus included in this study, was calculated in MEGA using the net nucleotide divergence (Da) between groups (p-distances) and the maximum pair-wise divergence. The three populations of the upper reaches of the Irtysh are designated as Thymallus sp.
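The distance measures named above can be illustrated with a small sketch. It assumes aligned sequences of equal length, skips sites with a gap or ambiguous base in either sequence, and computes the net divergence as Da = dXY − (dX + dY)/2, where dXY is the mean between-group p-distance and dX, dY are the mean within-group p-distances. The function names and the toy sequences are hypothetical, and the code is not meant to reproduce the exact options used in MEGA or DnaSP.

```python
from itertools import combinations, product


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def mean_within(seqs):
    """Mean pairwise p-distance within a group.

    When every sampled individual is included, this average corresponds
    to the per-site nucleotide diversity of the sample.
    """
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def mean_between(group_x, group_y):
    """Mean p-distance over all pairs drawn from two different groups."""
    pairs = list(product(group_x, group_y))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def net_divergence(group_x, group_y):
    """Net divergence: Da = dXY - (dX + dY) / 2."""
    within = (mean_within(group_x) + mean_within(group_y)) / 2
    return mean_between(group_x, group_y) - within


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    group_a = ["ACGTACGTAC", "ACGTACGTAT"]
    group_b = ["ACGAACGTAC", "ACGAACGTAC"]
    print(mean_within(group_a), mean_between(group_a, group_b))
    print(net_divergence(group_a, group_b))
```

Subtracting the mean within-group distance removes the share of the between-group distance that is already present as polymorphism inside each group, which is why Da is smaller than the raw between-group distance whenever the groups are internally variable.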


FIGURE 3: Schematic map of a large part of the drainage systems of Siberia. The left enlargement gives a closer view of the upper reaches of the Irtysh including Lake Markakol, while the right enlargement shows the second studied region, Western Mongolia. The dotted line outlines the region of the sample locations of populations collected for Knizhin et al. (2008). (For coordinates of the sampled populations see Table 1.)


TABLE 1: Sample locations including major river basins/regions, the three-letter population code, the number of individuals screened for both mtDNA and microsatellite (msats) variation, geographical coordinates and GenBank accession numbers for Thymallus used in this study.

Population              Pop. Code   Drainage/Basin
Telezkoye Lake          Biy         Biya River → Ob` River → Arctic Ocean
Kaldzhir River          Kal         Irtysh → Ob` River → Arctic Ocean
Kara-Kaba River         Kka         Irtysh → Ob` River → Arctic Ocean
River Urunkhaika        Rur         Markakol Lake → Kaldzhir River → Irtysh → Ob` River → Arctic Ocean
Lake Khurgan            Khg         Khovd River → Central Asian basin
Lake Khoton             Kht         Khovd River → Central Asian basin
Khovd River             Khv         Central Asian basin
Lake Tolbo              Tol         Khovd River → Central Asian basin
Pilka River             Pil         Lena → Arctic Ocean
Moloda River            Mol         Lena → Arctic Ocean
Yakichy Creek           Yak         V. Angara River → Baikal → Enisey → Arctic Ocean
Edyngde Lake            Edy         Khantaiskoye Lake → Enisey
Gogochenda River        Gog         Khantaiskoye Lake → Enisey
Kolyma River            Kol         Arctic Ocean
Konui River             Kon         Abakan River → Enisey → Arctic Ocean
Shishkid Gol            Shi         Kyzyl Khem → Enisey → Arctic Ocean
Lake Hovsgul            Chv         Egin Gol → Selenga River → Lake Baikal

Column values that could not be matched to individual rows:
Number of individuals (mtDNA): 2, 2, 1, 1, 1, 1, 1, 1, 1, 3, 6, 8, 6, 8, 10, 12, 14
Number of individuals (msats): 3, 9, 29, 13, 29, 34, 26, 40; all other populations: -
Lat. (N): 51°27', 51°28', 66°10', 68°17', 68°16', 56°05', 69°43', 60°00', 47°48', 48°25', 51°47', 48°35'01, 48°55'02, 48°37'09, 48°33'37, 48°46'07
Long. (E): 99°01', 91°07', 91°16', 86°31', 85°11', 87°14', 100°39', 151°03', 110°46', 124°50', 113°00', 90°00'16, 89°06'29, 88°20'22, 88°42'22, 86°07'29
GenBank accession numbers: EU676264; EU676265; EU676266; EU676271; EU676274; EU676275; EU676276; EU676277; EU676280; EU676281; EU676284; EU676285; EU676287; EU676290; EU676292; EU676293; EU676294; EU676295; EU168913; EU168922; DQ683695; DQ683704; DQ683710; DQ683720; AY779013; AY168348; AY168349; truncated range endings: -70; -73; -79; -83; -86; -89

TABLE 1: Continued

Population              Pop. Code   Drainage/Basin
Frolikha Bay            Fbb         Baikal → Angara → Enisey
Dagary Bay              Dbb         Baikal → Angara → Enisey
Ushkaniy Islands        Uib         Baikal → Angara → Enisey
Bureya River            Bur         Amur River → Tatar Strait, Pacific Ocean
Onon River                          Shilka → Amur River → Tatar Strait
Botchi River            Bot         Tatar Strait, Pacific Ocean
Merek River             Mer         Amgun River → Amur → Tatar Strait, Pacific Ocean
Anui River              Anu         Anui River → Amur → Tatar Strait, Pacific Ocean
Saalach                 Da          Danube → Black Sea
Enknach                 Da          Danube → Black Sea
Rhine (Schaffhausen)    At          Atlantic
Total

Column values that could not be matched to individual rows:
Pop. Code: Ogb, Amu (one of these belongs to Onon River; the other to a second population with the drainage Shilka → Amur River → Tatar Strait)
Number of individuals (mtDNA): 1, 1, 1, 3, 1, 1, 2, 1, 2, 1, 1, 1; Total: 94
Number of individuals (msats): - for all populations; Total: 183
Lat. (N): 47°40', 48°11', 47°47', 49°17', 51°17', 48°05', 48°39', 48°39', 51°55', 53°40', 55°41', 55°31'
Long. (E): 08°37', 13°02', 12°56', 137°55', 134°47', 139°21', 110°25', 110°25', 134°53', 108°37', 109°53', 109°51'
GenBank accession numbers: AF522442; AF522414; AF522403; AY246409; AY779012; AY168390; AY168374; AY168353; AY168355; AY168395; AY168396; AY779007; AY779008; AY246389; AY246396; AY246404


2.2.3 Haplotype network analysis

The haplotype genealogy of sequences drawn from the Biya drainage, the basin of the upper Khovd River and the upper reaches of the Irtysh River was evaluated with two unrooted networks. A median-joining network (Bandelt, Forster, & Röhl, 1999) was constructed using the program PopART. A second network was constructed in the program TCS 1.13 (Clement et al., 2000) using a 95% criterion (Templeton, Crandall, & Sing, 1992), whereby gaps (or indels) were counted as events (i.e. treated as a fifth state).
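The effect of treating gaps as a fifth state can be shown with a short sketch that counts pairwise differences between two aligned haplotypes. The function name and the toy sequences are hypothetical. Each gapped site is counted as one difference here, which is a simplification and not necessarily how TCS handles longer indels.

```python
def count_differences(a, b, gaps_as_fifth_state=True):
    """Count differing sites between two aligned haplotypes.

    With gaps_as_fifth_state=True a gap ('-') is compared like any base,
    so a gap opposite a nucleotide counts as one difference.
    With gaps_as_fifth_state=False gapped sites are ignored as missing data.
    """
    if len(a) != len(b):
        raise ValueError("haplotypes must be aligned to the same length")
    differences = 0
    for x, y in zip(a, b):
        if not gaps_as_fifth_state and "-" in (x, y):
            continue  # site skipped: gap treated as missing data
        if x != y:
            differences += 1
    return differences


if __name__ == "__main__":
    # Toy haplotypes, invented for illustration only.
    hap1, hap2 = "ACG-TA", "ACGTTA"
    print(count_differences(hap1, hap2))
    print(count_differences(hap1, hap2, gaps_as_fifth_state=False))
```

Two haplotypes that differ only by an indel collapse into one node when gaps are ignored but remain separate nodes when gaps are counted, so the choice can change the number of haplotypes and connections in the network.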

2.3 Genetic analyses – microsatellites (SSR)

The microsatellites used in this study were selected from loci isolated from European grayling (Sušnik et al., 2000; Snoj et al., 1999) and Coregonus lavaretus (Winkler & Weiss, 2008), and from loci previously demonstrated to cross-amplify in Thymallus sp. (Diggs & Ardren, 2008; Junge et al., 2010).

2.3.1 Amplification and screening

All forward primers were labelled with fluorescent dyes (HEX, NED, FAM), and the microsatellite loci were screened in one single-locus and two multiplex PCR reactions. The loci combinations for the three populations from the upper reaches of the Irtysh River were as follows: BFRO004, BFRO010, Tth445 and Tar103 in the first multiplex; Tar100, Tar101, Tar110, Tar112 and Tth313 in the second multiplex; the remaining locus, ClaTet1, was screened individually (Table 2). The loci combinations for populations from Biya and the basin of the upper Khovd River were the same but without the loci Tar110, Tar112 and Tth445. PCR was carried out in 10 µl reactions, each containing 0.25 µl of each primer and 5 µl Qiagen Reaction Mix, filled up to volume with double-distilled H2O. The cycle parameters were as follows: initial denaturation at 95°C for 5 min, then 35 cycles of denaturation at 95°C (30 sec), annealing at 58°C for both multiplexes or at 60°C for the single locus (45 sec) and extension at 72°C (30 sec), and a final extension at 72°C for 30 min. PCR products were dried by heating at 60°C for 20 min; a loading solution of 10 µl Hi-Di formamide and 0.125 µl GeneScan-500 (ROX) size standard was then added to each sample, and samples were denatured at 95°C for 5 min. The resulting products were run on an ABI 3130xl Genetic Analyzer (Applied Biosystems), and electropherograms were generated for each locus using the GeneMapper software (Version 3.7; Applied Biosystems, Foster City, CA, U.S.A.) to view allelic variation.
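The locus panels and the cycling program described above can be written down as a small data structure, which makes it easy to check how many loci were screened per region and how long the programmed steps take. This is only a sketch: the names `PANELS`, `loci_screened`, `count_loci` and `programmed_minutes` are hypothetical, and the time estimate ignores temperature ramping.

```python
# Loci per PCR reaction, as listed in the text.
PANELS = {
    "multiplex 1": ["BFRO004", "BFRO010", "Tth445", "Tar103"],
    "multiplex 2": ["Tar100", "Tar101", "Tar110", "Tar112", "Tth313"],
    "single": ["ClaTet1"],
}

# Loci not screened in the Biya and upper Khovd populations.
OMITTED_FOR_BIYA_AND_KHOVD = frozenset({"Tar110", "Tar112", "Tth445"})


def loci_screened(omitted=frozenset()):
    """Return the loci per PCR reaction after removing any omitted loci."""
    return {
        plex: [locus for locus in loci if locus not in omitted]
        for plex, loci in PANELS.items()
    }


def count_loci(panels):
    """Total number of loci across all reactions."""
    return sum(len(loci) for loci in panels.values())


def programmed_minutes(initial_s=300, cycles=35, denature_s=30,
                       anneal_s=45, extend_s=30, final_s=1800):
    """Programmed hold time in minutes, ignoring ramping between steps."""
    total_s = initial_s + cycles * (denature_s + anneal_s + extend_s) + final_s
    return total_s / 60


if __name__ == "__main__":
    print(count_loci(loci_screened()))                            # Irtysh populations
    print(count_loci(loci_screened(OMITTED_FOR_BIYA_AND_KHOVD)))  # Biya and Khovd
    print(programmed_minutes())
```

Removing the three omitted loci from the ten-locus panel leaves seven loci for the Biya and Khovd populations, which agrees with the seven microsatellite loci mentioned in the abstract for the Khovd River basin.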


TABLE 2: Microsatellite combinations of PCR reactions for Thymallus sp., including repeat motifs, primer sequences (5´-3´), fluorescent dyes, annealing temperatures (Ta) and source of focal species. F = forward, R = reverse.

Locus     PCR reaction   Ta (°C)   Focal species         Source
BFRO004   4Plex          58        Thymallus thymallus   Snoj et al. (1999)
BFRO010   4Plex          58        Thymallus thymallus   Sušnik et al. (2000)
Tth445    4Plex          58        Thymallus thymallus   Junge et al. (2010)
Tar103    4Plex          58        Thymallus arcticus    Diggs & Ardren (2008)
Tar100    5Plex          58        Thymallus arcticus    Diggs & Ardren (2008)
Tar101    5Plex          58        Thymallus arcticus    Diggs & Ardren (2008)
Tar110    5Plex          58        Thymallus arcticus    Diggs & Ardren (2008)
Tar112    5Plex          58        Thymallus arcticus    Diggs & Ardren (2008)
Tth313    5Plex          58        Thymallus thymallus   Junge et al. (2010)
ClaTet1   single         60        Coregonus lavaretus   Winkler & Weiss (2008)


2.3.2 Basic population genetic statistics

The number of alleles per locus, deviations from Hardy-Weinberg equilibrium (HWE), linkage disequilibria and FIS values were calculated with the program FSTAT 2.9.3.2 (Goudet, 2002). Observed and expected heterozygosities were inferred using ARLEQUIN v3.5.1.2 (Excoffier & Lischer, 2010), and the pairwise fixation index (FST) was calculated to evaluate population differentiation. All calculations were carried out for the populations of the upper reaches of the Irtysh River, Biya and individuals of the Khovd River basin, the latter classified first by studied water body and then by taxon (Arc = T. arcticus, Bre = T. brevirostris, Hyb = T. brevirostris x T. arcticus).

General genetic relationships among Thymallus sp. individuals were assessed in two ways. First, a factorial correspondence analysis (FCA) was computed in GENETIX v4.05.2 (Belkhir et al., 1996-2004) to graphically display individual relationships based on the presence or absence of alleles. Second, the dataset was analyzed with the Bayesian clustering method in STRUCTURE v2.3.4 (Pritchard, Stephens, & Donnelly, 2000). The posterior probabilities of K (number of populations) were estimated assuming uniform priors on K between 1 and 6 for individuals from Mongolia classified by water body, and between 1 and 5 for individuals from Mongolia classified by taxon. For the whole dataset, including populations of the Khovd River basin, the upper reaches of the Irtysh and Biya, K varied from 1 to 11. STRUCTURE was run for 100,000 steps, of which the first 50,000 were discarded as burn-in, and five independent replicates of the MCMC were conducted for each value of K. An admixture model with correlated allele frequencies was assumed. The output of STRUCTURE was interpreted using a combination of two approaches: the Delta K method of Evanno, Regnaut, & Goudet (2005), which predicts the most likely value of K in a given data set from the second-order rate of change of the likelihood function with respect to K, and the standard prediction of K based on a plot of the estimated mean ln probability of K [Mean Ln Prob (K)] with standard deviation. In addition, STRUCTURE creates a plot of individual Q-values. Q-values are the estimated membership coefficients of an individual in each inferred population (cluster); the higher a Q-value, the more likely it is that the individual belongs to that population.
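The Delta K statistic can be sketched as follows. This is a minimal illustration rather than the implementation used for the analyses: it follows the common variant that works on replicate means, ∆K = |L(K+1) - 2 L(K) + L(K-1)| / sd[L(K)], where L(K) is the mean Ln Prob over replicates. The function name `delta_k` and the input values are hypothetical.

```python
from statistics import mean, stdev


def delta_k(lnp):
    """lnp maps K -> list of Ln Prob values, one per replicate run.

    Returns {K: Delta K} for every K that has both neighbours K-1 and K+1,
    so Delta K is undefined for the smallest and largest K tested.
    """
    result = {}
    for k in sorted(lnp):
        if k - 1 not in lnp or k + 1 not in lnp:
            continue
        # Absolute second-order difference of the mean Ln Prob.
        second = abs(mean(lnp[k + 1]) - 2 * mean(lnp[k]) + mean(lnp[k - 1]))
        sd = stdev(lnp[k])
        result[k] = second / sd if sd > 0 else float("inf")
    return result


# Hypothetical Ln Prob values (three replicates per K), not results of this study.
runs = {
    1: [-1500, -1502, -1498],
    2: [-1400, -1401, -1399],
    3: [-1390, -1395, -1385],
    4: [-1385, -1400, -1370],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Because the statistic needs both neighbouring values of K, it cannot support K = 1; this is why the plot of mean Ln Prob (K) is inspected alongside it.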


2.4 Genetic analysis – mixed data set

A coalescent-based Approximate Bayesian Computation (ABC) algorithm in the program package DIY-ABC v2.0.4 (Cornuet, Ravigné, & Estoup, 2010; Cornuet et al., 2014) was used to estimate divergence times and the population history of the three populations of the Irtysh drainage system. This algorithm simulates datasets for each of a specified set of scenarios of historic and/or demographic events, and the generated data are reduced to summary statistics. These simulated statistics are compared with the statistics of the observed data. Posterior probability distributions and a credibility interval for the parameters of interest are estimated, and alternative scenarios can be compared (Csilléry et al., 2010). For the present study, three different scenarios (Fig. 15) were tested, and prior distributions of demographic parameters were as follows: uniform [30; 40,000] for effective population size (similar for all populations), uniform [1; 10,000] for t1, uniform [30; 20,000] for t12 (with t12 ≥ t1), uniform [50; 40,000] for t2 (with t2 > t1, t2 > t12), and uniform [0.001; 0.999] for ra. Each competing scenario was given equal prior probability. Microsatellites had a possible range of 40 contiguous allelic states and were categorized as di- and tetranucleotides, representing the first two groups of loci. The ten loci were assumed to follow a Generalized Stepwise Mutation model (Estoup, Jarne, & Cornuet, 2002) with two parameters, the mean mutation rate (mean µ) and the mean parameter of the geometric distribution (mean P): uniform [10⁻⁶; 9 × 10⁻⁴] for the mean mutation rate of dinucleotides, uniform [10⁻⁴; 9 × 10⁻⁴] for the mean mutation rate of tetranucleotides, and uniform [0.1; 0.3] for the mean parameter of the geometric distribution of both di- and tetranucleotides. In addition, each microsatellite was characterized by individual µloc and Ploc values drawn from Gamma (mean = mean µ and shape = 2) and Gamma (mean = mean P and shape = 2) distributions, respectively.
The mtDNA control region represented the third group of loci. Sequences were assumed to follow the Hasegawa-Kishino-Yano model (Hasegawa, Kishino, & Yano, 1985), with 61% constant sites and a shape parameter of the Gamma distribution of mutations among sites equal to 0.596. The mean mutation rate per site per generation for the mtDNA locus was drawn from a uniform distribution [10⁻⁹; 10⁻⁷]. We simulated 10⁶ data sets for each explored scenario. Posterior probabilities of each scenario were compared by using logistic regression on the 1% of simulated data sets closest to the observed data.
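The ordering conditions on the divergence times can be illustrated with a small sketch that draws one parameter set from the priors listed above and redraws until the conditions hold. This only mimics the idea of conditioned priors; it is not the DIY-ABC code, the function name `draw_priors` is hypothetical, and continuous rather than integer draws are an assumption.

```python
import random


def draw_priors(rng):
    """Draw one demographic parameter set subject to t2 > t12 >= t1."""
    while True:
        t1 = rng.uniform(1, 10_000)
        t12 = rng.uniform(30, 20_000)
        t2 = rng.uniform(50, 40_000)
        # t2 > t12 together with t12 >= t1 also guarantees t2 > t1.
        if t12 >= t1 and t2 > t12:
            break
    return {
        "Ne": rng.uniform(30, 40_000),    # effective population size
        "t1": t1,
        "t12": t12,
        "t2": t2,
        "ra": rng.uniform(0.001, 0.999),  # ra, with the prior range given in the text
    }


rng = random.Random(1)  # fixed seed so the sketch is reproducible
draws = [draw_priors(rng) for _ in range(1000)]
print(draws[0])
```

In a full ABC run, each such draw would parameterize one simulated data set, and only the simulations whose summary statistics lie closest to the observed ones would be retained.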


3 RESULTS

3.1 Inhabitants of the Khovd River basin

All seven microsatellites could be scored unambiguously and amplified polymorphic products across all populations of the Khovd River drainage system. Results are shown first with individuals classified by sample site and then by their presumed taxonomic classification.

3.1.1 Genetic diversity of Thymallus sp., classified in studied water bodies

The mean number of alleles ranged from 3.4 (Tol) to 10.6 (Kht); note, however, that the low value for Tolbo Lake is based on only three individuals and should be interpreted with caution. Mean allelic richness, a measure of genetic variation corrected for sample size, varied from 3.4 for individuals from Tolbo Lake to 4.2 for individuals from Khovd River; the low value should again be viewed with caution, as it is strongly influenced by the small sample size from Tolbo Lake. One locus (BFRO004) exhibited a significant deviation from Hardy-Weinberg equilibrium (HWE) within one population (Kht). After correcting for multiple tests, all of the loci (across populations) and all of the populations (across loci) displayed genotypes in HWE proportions. Eight instances of a significant (P < 0.05) deviation from linkage equilibrium (LE) were observed in a total of 84 between-locus comparisons across populations. However, after correcting for multiple tests, none of these comparisons remained significant. Thus, there is no evidence for linkage between the microsatellites. The mean expected heterozygosity was high across all populations and ranged from 0.67 (Kht) to 0.81 (Khv). FIS-values were distributed around zero with the exception of the sample from Tolbo Lake (FIS = -0.153), which indicates an excess of heterozygotes for this population. The general genetic diversity measures of each population are summarized in Table 3.
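The number of between-locus comparisons follows directly from the number of loci and populations: seven loci give 21 locus pairs, each tested in four populations. The sketch below reproduces this count and the corresponding adjusted significance threshold. The correction method is not named in this paragraph, so a Bonferroni adjustment is assumed here; the function names are hypothetical.

```python
from math import comb


def n_locus_pair_tests(n_loci, n_groups):
    """Number of pairwise linkage tests: one per locus pair and group."""
    return comb(n_loci, 2) * n_groups


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


n_tests = n_locus_pair_tests(n_loci=7, n_groups=4)
alpha_adj = bonferroni_alpha(0.05, n_tests)
print(n_tests, alpha_adj)
```

With 84 tests at a nominal level of 0.05, about four significant results would be expected by chance alone, which is why the eight uncorrected deviations are not interpreted as evidence of linkage.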

Genetic differentiation between pairs of populations, as measured by FST, was low, ranging from 0.010 to 0.028, with the exception of one moderate value (FST = 0.075) between individuals of Lake Khoton and Tolbo Lake (Table 4).


3.1.2 Population structure of Thymallus sp., classified in studied water bodies

The factorial CA provides an overview of the genotypic relationships among all individuals of the four studied water systems of Mongolia (Fig. 4). Individuals of Lake Khurgan were broadly distributed along axis 2, which reflects 4.69% of the variation in the data, while individuals of Khovd River and Lake Khoton spread along the first and most explanatory axis (4.88%). Despite their distribution along different axes, individuals did not cluster by population.

TABLE 3: Summary genetic statistics: Shown are the mean number of alleles per population over all seven loci (NA), the mean allelic richness per population (AR), the mean observed (HO) and expected (HE) heterozygosity and within-population coefficient of inbreeding (FIS).

Pop. Code   NA       AR      HO      HE      FIS
Biy         13.429   4.178   0.835   0.83    -0.006
Kal         8.286    3.524   0.665   0.687   0.033
Khg         8.286    3.742   0.714   0.723   0.012
Kht         10.571   3.547   0.631   0.669   0.057
Khv         8.286    4.192   0.81    0.807   -0.003
Kka         11.714   4.184   0.79    0.834   0.054
Rur         8.143    3.581   0.767   0.738   -0.040
Tol/Hyb     3.429    3.429   0.81    0.724   -0.153
Arc         12.286   3.997   0.734   0.769   0.045
Bre         8.286    3.196   0.607   0.577   -0.052

TABLE 4: Genetic differentiation among Thymallus populations of studied water bodies based on variance in allele frequencies (FST).

       Biy        Kal        Khg        Kht        Khv        Kka        Rur
Kal    0.220***
Khg    0.175***   0.284***
Kht    0.193***   0.317***   0.026*
Khv    0.131***   0.248***   0.014      0.023*
Kka    0.156***   0.165***   0.200***   0.241***   0.164***
Rur    0.193***   0.037***   0.254***   0.291***   0.220***   0.129***
Tol    0.174***   0.288***   0.028      0.075*     0.010      0.189***   0.258***

*P<0.05, ***P<0.001.


STRUCTURE yielded the highest mean Ln Prob (K) for K = 1 and K = 2 (Ln Prob = -1431). Beyond this maximum, Ln Prob (K) decreased linearly with increasing K (Fig. 5a). ∆K showed a clear peak at K = 2 (∆K = 74; Fig. 5b); note, however, that ∆K can only be calculated for K > 1.

FIGURE 4: Bi-variate plot of the first two factors of a factorial CA of microsatellite allele variation of 54 individuals from the four studied water bodies of Mongolia. Each population is marked with a unique symbol.


FIGURE 5: Results of STRUCTURE analysis for the dataset of the four populations of Mongolia (summarized across five replicates) including a) the estimated mean of Ln probability for each K value [mean Ln Pr (K)] with standard deviation and b) Delta K (Evanno, Regnaut, & Goudet, 2005), which shows the rate of change between successive values of K.


The plot of mean Q-values for K = 2 showed approximately symmetrical membership proportions between the two defined groups, and individuals could not be clearly assigned to either group. Thus, all individuals from Mongolia showed extensive sharing of allelic diversity and an absence of genetic substructure between the studied water bodies of the Khovd River system.

3.1.3 Genetic diversity of samples, classified in different taxa of Thymallus

Summaries of the genetic and allelic diversities of all individuals classified into different taxa are also presented in Table 3. The mean number of alleles ranged from 3.4 (Hyb) to 12.3 (Arc); note again that the low value for hybrids is based on three individuals, all of which were sampled from Tolbo Lake. Mean allelic richness ranged from 3.2 (Bre) to 4.0 (Arc). One of the three taxa (Arc) displayed a deviation from HWE for one locus (BFRO010), which was not significant after correcting the p-value threshold for multiple tests. Out of 63 between-locus comparisons across populations, five deviations from LE were observed. No significant genotypic disequilibrium was detected following Bonferroni correction for multiple tests. Mean expected heterozygosity was high for Hyb and Arc, ranging from 0.72 to 0.77, while HE of Bre had a moderate value of 0.58. There was almost no inbreeding within the classified taxa Arc and Bre (FIS = 0.05 to -0.05), while Hyb displayed an excess of heterozygotes (FIS = -0.153). Unlike the minor changes in allele frequencies (low FST) in the analysis of the dataset classified by water body, FST values indicated higher levels of genetic differentiation between pairs of taxa (Table 5). The pairwise comparison between Arc and Bre showed a significant, slightly moderate value (FST = 0.07), and the comparison between Bre and Hyb a highly significant moderate value (FST = 0.10).

TABLE 5: Genetic differentiation (FST) among different taxa of Thymallus.

       Arc        Bre
Bre    0.071***
Hyb    0.027      0.104***

*P<0.05, ***P<0.001.

3.1.4 Population structure of samples, classified in different taxa of Thymallus

This divergence is also seen in the factorial CA diagram, in which the three individuals of Hyb clustered together and all individuals of Bre except one formed a distinct group of genotypes along the x-axis (4.88%), while Arc was distributed relatively evenly over both axes (Fig. 6).


The results of STRUCTURE indicate that the mean Ln Prob (K) decreased with increasing K, ranging from Ln Prob = -1431 at K = 1 to Ln Prob = -1729 at K = 6 (Fig. 7a). The highest value of ∆K was reached at K = 2, with a peak of ∆K = 20 (Fig. 7b). In the plot of Q-values at K = 2, all individuals showed a membership coefficient of ~ 0.50 for the two defined groups. Thus, cluster analyses of grayling samples from Mongolia, representing different taxa of the genus Thymallus, also revealed low levels of divergence.

FIGURE 6: Bi-variate plot of the first two factors of a factorial correspondence analysis of microsatellite allele variation of 54 individuals from Mongolia, classified in different taxa of the genus Thymallus. Each taxon is marked with a unique symbol.


FIGURE 7: Results of STRUCTURE analysis for the dataset of samples from Mongolia, classified in different taxa (summarized across five replicates) including a) the estimated mean of Ln probability for each K value [mean Ln Pr (K)] with standard deviation and b) Delta K (Evanno, Regnaut, & Goudet, 2005), which shows the rate of change between successive values of K.

3.2 Lineage relationship of Thymallus sp. from Kazakhstan

3.2.1 Microsatellite analysis of genetic diversity

A combined data set was used to evaluate the lineage relationships and population genetic structure of samples from the upper reaches of the Irtysh River. It included individuals of three populations of this drainage system (Kal, Kka and Rur), individuals of the four populations from the Khovd River system (Khg, Kht, Khv and Tol) and individuals sampled from Telezkoye Lake (Biya), which belongs to the upper reaches of the Ob River. Results of the general genetic diversity measures of each population are listed in Table 3. Among sampling sites of the upper reaches of the Irtysh River, the mean number of alleles varied from 8.1 (Rur) to 11.7 (Kka). Values of mean allelic richness for Kaldzhir River and River Urunkhaika (AR = 3.5 to 3.6) were comparable to those of three populations of the Khovd River system (Khg, Kht and Tol), while the value for Kara-Kaba River (AR = 4.2) equaled those of Biya and Khovd River. Thus, mean allelic richness had a small range and appears to be quite evenly distributed over all eight populations. Deviation from HWE was noted for one locus (BFRO004) within two populations (Kht and Kka). In a total of 168 between-locus comparisons across populations, 16 deviations from LE were observed. After correcting for multiple tests, no significant linkage disequilibrium or departures from HWE were observed. Mean expected heterozygosity varied from 0.69 (Kal) to 0.83 (Kka) for the three populations of the upper reaches of the Irtysh River. Based on FIS-values, which were distributed around zero, there is no deficiency or excess of heterozygotes in any population.

Almost all pairwise FST-values were highly significant, except for comparisons among the four populations sampled from the Khovd River basin (Khg, Kht, Khv and Tol) (Table 4). The mean FST between these populations and the three populations of the upper reaches of the Irtysh River (Kal, Kka and Rur) was 0.25, with strong variation among the individual values (0.164-0.317). The mean FST between Kaldzhir River, Kara-Kaba River and River Urunkhaika on the one hand and Biya, which is inhabited by Thymallus nikolskyi, on the other was 0.190 (0.156-0.220).

Additionally, among the three Irtysh populations, Kara-Kaba River showed the lowest FST-values in comparisons with populations of the Khovd River basin and with Biya. Furthermore, FST detected a substructure within the three populations of the upper reaches of the Irtysh River. Kaldzhir and Urunkhaika rivers displayed low genetic differentiation (FST = 0.037), while divergence between these two tributaries of Lake Markakol and the Kara-Kaba River was moderate (FST = 0.129-0.165).
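The group means quoted above can be recomputed from the pairwise values in Table 4. The following sketch transcribes the lower triangle of that table and averages all pairs between two groups of populations; the helper names `fst` and `mean_fst` are hypothetical.

```python
from statistics import mean

# Pairwise FST values transcribed from Table 4 (lower triangle).
FST = {
    ("Kal", "Biy"): 0.220,
    ("Khg", "Biy"): 0.175, ("Khg", "Kal"): 0.284,
    ("Kht", "Biy"): 0.193, ("Kht", "Kal"): 0.317, ("Kht", "Khg"): 0.026,
    ("Khv", "Biy"): 0.131, ("Khv", "Kal"): 0.248, ("Khv", "Khg"): 0.014,
    ("Khv", "Kht"): 0.023,
    ("Kka", "Biy"): 0.156, ("Kka", "Kal"): 0.165, ("Kka", "Khg"): 0.200,
    ("Kka", "Kht"): 0.241, ("Kka", "Khv"): 0.164,
    ("Rur", "Biy"): 0.193, ("Rur", "Kal"): 0.037, ("Rur", "Khg"): 0.254,
    ("Rur", "Kht"): 0.291, ("Rur", "Khv"): 0.220, ("Rur", "Kka"): 0.129,
    ("Tol", "Biy"): 0.174, ("Tol", "Kal"): 0.288, ("Tol", "Khg"): 0.028,
    ("Tol", "Kht"): 0.075, ("Tol", "Khv"): 0.010, ("Tol", "Kka"): 0.189,
    ("Tol", "Rur"): 0.258,
}


def fst(a, b):
    """Look up a pairwise FST value regardless of the order of the pair."""
    return FST[(a, b)] if (a, b) in FST else FST[(b, a)]


def mean_fst(group_a, group_b):
    """Unweighted mean of all pairwise FST values between two groups."""
    return mean(fst(a, b) for a in group_a for b in group_b)


KHOVD = ["Khg", "Kht", "Khv", "Tol"]
IRTYSH = ["Kal", "Kka", "Rur"]

khovd_vs_irtysh = mean_fst(KHOVD, IRTYSH)
biya_vs_irtysh = mean_fst(["Biy"], IRTYSH)
print(round(khovd_vs_irtysh, 2), round(biya_vs_irtysh, 3))
```

The unweighted means reproduce the values reported in the text (0.25 and 0.190); a mean weighted by sample size would give slightly different figures.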


3.2.2 Microsatellite analysis of population structure

Divergence between samples of the three different drainage systems is also seen in the factorial CA diagram. Individuals of the upper reaches of the Irtysh River clustered together, distinct from all others, along the first axis (4.94% of variance), while individuals of Biya and the Khovd River basin formed their own clusters along the second axis (4.37%) (Fig. 8). Comparing the second and third axes supports a strong separation of the different drainage systems (x-axis) with non-overlapping clusters. Additionally, the two tributaries of Lake Markakol diverged from individuals of Kara-Kaba River along the y-axis, which reflects 3.13% of the variation in the data (Fig. 9). The robustness of this differentiation is also supported by the Bayesian clustering analysis. Runs of STRUCTURE indicated a large change in Ln Prob (K) between K = 3 and K = 4 (Ln Prob = -5625 to -5208). Inspection of the ∆K plot (Evanno, Regnaut, & Goudet, 2005) for models with K ranging from 1 to 11 revealed a distinct peak at K = 4 (∆K = 763). Hence, we identified an optimal solution of K = 4 clusters using the Bayesian algorithm. In the four-cluster model (Fig. 10), the first group (blue) included individuals sampled from Telezkoye Lake, and the second group (orange) consisted of individuals from Kaldzhir River and River Urunkhaika, representing the two tributaries of Lake Markakol. All individuals from the four water systems of Mongolia were grouped in a third cluster (purple), while the fourth group was represented by individuals of the Kara-Kaba River (Kka) (green). Assignment probabilities (Q-values) at K = 4 remained above 78% for all individuals, with little evidence of admixture between the four groups.

FIGURE 8: Results of the first two factors of the factorial CA based on microsatellite allele frequencies of individuals of Kazakhstan, Mongolia and Russia. The clusters represent the genetic differentiation between the different drainage systems.

FIGURE 9: Results of the second and third factor of the factorial CA based on microsatellite allele frequencies of individuals of Kazakhstan, Mongolia and Russia. Clusters represent genetic differentiation between different drainage systems and an additional population structure for individuals of Kazakhstan.

FIGURE 10: Assignment (membership coefficient (Q)) of individuals of Biya, the Khovd River basin and the upper reaches of the Irtysh to genetic clusters using the STRUCTURE algorithm for K = 4. The y-axis shows the mean Q-value (0 to 1).

Three additional loci could be scored across the three populations of the Irtysh drainage system, allowing a more precise assessment of the population structure of samples from Kazakhstan (Kal, Kka and Rur). Allele frequencies and size distributions of all loci are presented in Figure 11. Visualization of the allele-size distributions suggested that the Kal and Rur populations diverged from the other population of the upper reaches of the Irtysh. For instance, at the loci BFRO004, BFRO010, ClaTet1, Tth313 and Tth445, the Kal and Rur specimens showed a similar distribution, while Kka exhibited alleles that were confined to this population. Moreover, the allele-size distribution of Kka showed more gaps than those of Kal and Rur (six of ten loci).


FIGURE 11: Allele frequencies and size distributions of ten microsatellite loci in Thymallus sp. sampled from the upper reaches of the Irtysh drainage systems. Areas of the bubbles correspond to frequencies of the respective alleles in given populations. Light grey represents dinucleotide loci and dark grey represents tetranucleotide loci. Panels: BFRO004, BFRO010, Tar100, Tar101, Tar103, Tar110, Tar112, ClaTet1, Tth313, Tth445; x-axis: populations (Kal, Kka, Rur); y-axis: allele size in base pairs.


The Bayesian population structure analysis of the three populations of the Irtysh River using all ten microsatellites yielded the same clusters as the analysis using seven microsatellites (Fig. 12). Individuals of Kal and Rur were grouped in one cluster and individuals of Kka formed another cluster, all with a membership coefficient of > 90%.

FIGURE 12: Sampling locations of the three populations of the upper reaches of the Irtysh River and bar plots of estimates of membership coefficient (Q) for each individual of the region for the inferred clusters (K = 2) with maximum log-likelihood probability. Map labels: Kaldzhir River, Kara-Kaba River, River Urunkhaika.

3.2.3 Phylogenetic relationships using mtDNA

The final alignment of 94 sequences included the complete control region (CR, 1009 base pairs) and adjacent fragments of the genes for tRNA proline (68 bp) and tRNA phenylalanine (10 bp). There were 182 variable sites, 141 of which were parsimony informative (including indels), defining 50 haplotypes (43 excluding indels). Both BI and ML methods resulted in the same topology; thus only the BI tree is depicted, with ML bootstrap values added to the tree nodes (Fig. 13). Net mean divergence between species used in this study ranged from a minimum of 0.4% to a maximum of 5.6% (Table 6). After Thymallus arcticus, Thymallus baicalolenensis and the species living in the Amur River basin split off, three monophyletic clades remained. Clade A corresponds to populations from Biya, representing Thymallus nikolskyi, and populations from the Shishkhed River, representing Thymallus svetovidovi. Net divergence between these species and Thymallus sp. was 1.7% and 1.4%, respectively. Samples of Thymallus nigrescens and Thymallus baicalensis are assigned to Clade B and differ from Thymallus sp. by 1.8% and 1.6%, respectively.
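Net mean divergence between two groups is the mean between-group distance minus the average of the two mean within-group distances, Dɑ = Dxy - (Dx + Dy)/2. A minimal sketch with uncorrected p-distances is given below. It is not the software routine used for Table 6; the function names and toy sequences are hypothetical, and sites with a gap in either sequence are assumed to be skipped.

```python
from itertools import combinations, product
from statistics import mean


def p_distance(a, b):
    """Uncorrected p-distance: proportion of differing sites.

    Assumption: sites with a gap in either sequence are ignored.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs)


def mean_within(seqs):
    """Mean pairwise distance within one group (0 for a single sequence)."""
    ds = [p_distance(a, b) for a, b in combinations(seqs, 2)]
    return mean(ds) if ds else 0.0


def net_divergence(group_x, group_y):
    """Net mean divergence: Da = Dxy - (Dx + Dy) / 2."""
    d_xy = mean(p_distance(a, b) for a, b in product(group_x, group_y))
    return d_xy - (mean_within(group_x) + mean_within(group_y)) / 2


# Hypothetical toy sequences, not data from this study.
x = ["AAAAAAAAAA", "AAAAAAAAAT"]
y = ["AAAAAAAACC", "AAAAAAAACC"]
da = net_divergence(x, y)
print(round(da, 3))  # 0.15
```

Subtracting the within-group term removes the polymorphism that already existed within each lineage, which is why net divergence is smaller than the raw between-group distance.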


FIGURE 13: Bayesian 50% majority-rule consensus tree of haplotypes based on the control region of mtDNA and fragments of tRNA genes from Thymallus sp. using the HKY + G + I model. Three-letter sample codes as in Table 1. Node support is shown by Bayesian probabilities (above) and bootstrap values (over 50%) for ML (below). The tree is rooted with three haplotypes from European grayling Thymallus thymallus.


TABLE 6: Net mean divergence (Dɑ) between species (uncorrected p-distance) used in this study.

                               1      2      3      4      5      6      7      8      9      10     11     12     13
1  Thymallus sp.               -
2  T. nikolskyi                0.017  -
3  T. brevirostris             0.004  0.015  -
4  T. grubii grubii            0.045  0.043  0.043  -
5  T. grubii flavomaculatus    0.042  0.041  0.040  0.010  -
6  T. svetovidovi              0.014  0.007  0.013  0.045  0.043  -
7  T. nigrescens               0.018  0.022  0.018  0.047  0.044  0.022  -
8  T. baicalolenensis          0.027  0.029  0.026  0.053  0.051  0.027  0.029  -
9  T. arcticus                 0.031  0.036  0.030  0.047  0.041  0.035  0.032  0.031  -
10 T. thymallus                0.030  0.035  0.030  0.034  0.034  0.033  0.035  0.035  0.036  -
11 T. burejensis               0.030  0.030  0.028  0.037  0.034  0.030  0.033  0.037  0.032  0.036  -
12 T. baicalensis              0.016  0.020  0.017  0.047  0.044  0.020  0.002  0.026  0.029  0.035  0.033  -
13 T. tugarinae                0.044  0.049  0.045  0.056  0.048  0.047  0.053  0.049  0.048  0.042  0.052  0.049  -


Within Clade C, two monophyletic subclades were identified, corresponding to the upper reaches of the Irtysh River and the Khovd River basin. These two subclades are supported by a moderate ML bootstrap value (86%) and a high Bayesian posterior probability (100%). There were 12 haplotypes among 27 individuals from Mongolia and 10 haplotypes among 32 individuals from Kazakhstan (Table 7). Samples from these two regions did not share any haplotype; the net mean divergence between their haplotypes was 0.4%, and the maximum pairwise divergence reached 0.9%. Biya exhibited a net divergence of 1.5% from samples of the Khovd River basin and 1.7% from samples of the upper reaches of the Irtysh River, with a maximum pairwise divergence of 2.1%. Biya showed higher nucleotide diversity (π = 0.0032) than samples from Mongolia (π = 0.0020) and Kazakhstan (π = 0.0011). No divergence could be detected between the populations of the upper reaches of the Irtysh. TCS could not connect all haplotypes in a single network. Haplotypes from the Khovd River basin and the upper reaches of the Irtysh River are joined in one network, while the four haplotypes from Biya are too many steps away and are presented in their own network (yellow; Fig. 14a). This reflects not only isolation between Biya and the Khovd River basin but also isolation between Biya and the three populations from Kazakhstan. The 95% parsimony network revealed a most frequent haplotype, which is shared by individuals of Rur and Kal. Both populations also revealed private haplotypes. Five haplotypes are found exclusively in Kka, and one haplotype is shared by individuals of Kka and Kal. These six haplotypes are located between the haplotypes of Rur and Kal. Additionally, the network is closed (it contains a loop): two different haplotypes from the upper reaches of the Irtysh River are connected along two paths with haplotypes from the Khovd River basin, each path comprising a minimum of 5 steps.
For variable nucleotide positions of all haplotypes see S1 Appendix. The network constructed in PopART spanned a maximum of 30 mutations and also included the haplotypes of Biya (Fig. 14b). Haplotypes of Biya were a minimum of 18 steps from haplotypes of the Khovd River basin and three additional mutations away from haplotypes of the upper reaches of the Irtysh River (minimum of 21 steps). However, that network revealed one haplotype shared by individuals of Rur, Kal and Kka and one haplotype shared by individuals of Rur and Kal. These shared haplotypes are connected via one of five private haplotypes of Kka.


FIGURE 14: a) Parsimony network of individuals from Biya, the Khovd River basin and the upper reaches of the Irtysh, whereby gaps (or indels) are counted as events. b) Median-joining network neglecting indels. Circle size is proportional to the observed haplotype frequencies and small black circles represent missing or theoretical haplotypes.

3.2.4 Divergence times and population history using mixed data set

We compared posterior probabilities of our ABC analysis for the three competing scenarios using local linear regression (Fig. 15). Scenario 3 showed the lowest support, with probabilities lower than 0.1, while Scenario 1 stood out for its best fit to the observed data. The first split (t2) in this scenario was estimated to have occurred 2 × 10⁴ generations ago. The present population of the Kara-Kaba River was estimated to have split off from the ancestral population (NA) 1 × 10⁴ generations ago (t12), and the ancestral population turned into a precursor population. Finally, the precursor population divided into the two populations of the Kaldzhir River and River Urunkhaika, with a divergence time of 5 × 10³ generations (t1). These results are consistent with the analyses of population structure using microsatellite loci, which showed a difference between Kara-Kaba River and the two populations inhabiting the tributaries of Lake Markakol. Effective population size was estimated at 2 × 10⁴ for all populations. An overview of the results for all parameters is given in the Appendix (S2).


TABLE 7: Distribution of haplotypes (including indels) of Thymallus species from the studied water bodies of Kazakhstan, Mongolia and Russia used in the Parsimony network of TCS.

No.   Haplotype   Region               n
1     Biy02       Biya                 1
2     Biy06       Biya                 1
3     Biy22       Biya                 5
4     Biy24       Biya                 1
5     Rur27       Upper Irtysh         13
6     Kka70       Upper Irtysh         1
7     Kka64       Upper Irtysh         3
8     Kka63       Upper Irtysh         4
9     Kka72       Upper Irtysh         2
10    Kka59       Upper Irtysh         4
11    Kka80       Upper Irtysh         1
12    Kal17       Upper Irtysh         2
13    Rur40       Upper Irtysh         1
14    Kal04       Upper Irtysh         1
15    Kht50       Khovd River basin    2
16    Khv6        Khovd River basin    1
17    Kht58       Khovd River basin    3
18    Khg35       Khovd River basin    1
19    Khv4        Khovd River basin    1
20    Cas2        Khovd River basin    10
21    Kht51       Khovd River basin    1
22    Khg34       Khovd River basin    1
23    Khg29       Khovd River basin    1
24    Khv7        Khovd River basin    2
25    Kht56       Khovd River basin    3
26    Kht70       Khovd River basin    1
Sum total                              67

Note: Haplotypes are designated by the individuals in which they were first found.


FIGURE 15: Graphic representation of the three scenarios analyzed both with microsatellite and mtDNA data in DIY-ABC. Pop 1 = Kal, Pop 2 = Kka, Pop 3 = Rur. Graph of linear regression shows posterior probabilities of each scenario, having the best support for scenario 1.


4 DISCUSSION

In this study we used genetic analyses to evaluate populations of grayling (Thymallus sp.) inhabiting the Khovd River basin and to assess the phylogenetic relationships among lineages from western Mongolia, the Ob River basin and the upper reaches of the Irtysh River.

4.1 General genetic differentiation

Values of mean allelic richness of the investigated populations were low (3.2 to 4.2) but congruent with those of previously published Thymallus studies (Weiss et al., 2007; Weiss, Kopun, & Sušnik, 2013). Observed and expected heterozygosity were high for all populations (0.58–0.84). No indication of a global trend in the geographical distribution of genetic diversity could be found. Microsatellite analyses identified strong genetic differentiation between populations of the Khovd River and the upper reaches of the Irtysh (mean FST = 0.25), which is graphically represented in the bar plot of the STRUCTURE analysis (Fig. 10) and in the factorial CA (Fig. 8 & 9). In contrast, results from the analyses of mtDNA showed a very small genetic difference (Da = 0.004), although it is still visible in the first split of Clade C (Fig. 13). While the microsatellite data set suggests that populations of Mongolia are more closely related to Biya (mean FST = 0.17), results of the mtDNA analyses propose a closer relationship between populations of Mongolia and the upper reaches of the Irtysh. Both mitochondrial and nuclear DNA based estimates resulted in an almost equal genetic difference between Biya and populations sampled from Mongolia and between Biya and populations sampled from Kazakhstan. Using the calibration of 1% sequence divergence per Myr for the complete mitochondrial control region in Thymallus (Koskinen et al., 2002), haplotypes of Biya and Markakol diverged approximately 1,700,000 years ago. A divergence time of 1,500,000 years was estimated between Biya and Khovd River haplotypes. Furthermore, sequence results implied 400,000 years since haplotype divergence between the Khovd River and the upper reaches of the Irtysh.
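The divergence times given here follow from simple molecular-clock arithmetic. As a minimal sketch, assuming that the 1% per Myr calibration is applied directly to the net pairwise divergence (an assumption that reproduces the 400,000-year figure from a net divergence of 0.004), the conversion could be written as follows. The function names are illustrative only and are not part of any software used in this study.

```python
def divergence_time_years(net_divergence, rate_per_myr=0.01):
    """Convert net sequence divergence (as a proportion) into years.

    rate_per_myr is the pairwise divergence accumulated per million years;
    0.01 corresponds to the 1% per Myr calibration (assumed to apply
    directly to net divergence).
    """
    if rate_per_myr <= 0:
        raise ValueError("rate_per_myr must be positive")
    return net_divergence / rate_per_myr * 1_000_000


def implied_divergence(time_years, rate_per_myr=0.01):
    """Inverse conversion: net divergence implied by a divergence time."""
    return time_years / 1_000_000 * rate_per_myr


# Khovd River vs. upper Irtysh: net divergence of 0.004 from the text.
print(round(divergence_time_years(0.004)))  # 400000

# Divergences implied by the other two dates (back-calculated here,
# not reported in the text).
for label, years in [("Biya vs. Markakol", 1_700_000),
                     ("Biya vs. Khovd River", 1_500_000)]:
    print(label, round(implied_divergence(years), 4))
```

Under this assumption, the 1,700,000 and 1,500,000 year estimates correspond to net divergences of roughly 1.7% and 1.5%.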


4.2 Thymallus inhabiting the Khovd river basin

Microsatellite analyses of populations from the Khovd River support the presence of only one species, T. brevirostris, in this basin. Thus, the hypothesis of T. arcticus and T. brevirostris living in sympatry in water bodies of western Mongolia is not supported by our selection of genetic markers, which agrees with the results of a previous investigation based on mtDNA (Knizhin et al., 2008). In this investigation no structuring of haplotypes among individuals of the Khovd river basin could be observed (mean p-distance = 0.000). When individuals were classified by sample site, only the pairwise comparison between individuals of Lake Khoton and Tolbo Lake showed a moderate level of genetic differentiation (FST = 0.075), which is again influenced by the extremely small sample size from Tolbo Lake. When classifying the individuals by taxon, the highest genetic differentiation was found between T. brevirostris and putative hybrids T. brevirostris x T. arcticus (originally assigned to this status based on morphology by Igor Knizhin, prior to the collection of genetic data). Hence, individuals classified as hybrids did not represent the genetic link between the two taxa. Individuals could not be assigned to a distinct cluster regardless of the classification used. An explanation for different Thymallus phenotypes living in the Khovd river basin could be the appearance of new environmental conditions following the isolation of the Central Asian basin. The total area of river systems decreased while species-poor lake systems became the main habitat of Thymallus brevirostris in western Mongolia. Under conditions of isolation, food shortage, and the presence of a single prey fish with an elongated dorsal spine (Oreoleuciscus dsapchynensi; Kottelat, 2006), it is presumed that a predatory phenotype with a big head and long jaws evolved relatively rapidly, in parallel to a benthic-feeding phenotype that is predominantly present in rivers.

The three individuals of Tolbo Lake did not represent the genetic link between the two described forms, with their external morphology presumably reflecting adaptation to a different nutritional niche than T. brevirostris (Knizhin et al., 2008). Thus, Mongolian grayling is clearly a monophyletic lineage with polymorphic phenotypes adapted to different habitats, with no signal of genetic correlation to these differences at our level of genetic resolution.


4.3 Taxonomic assignment of populations from Kazakhstan

Mitochondrial DNA could not reveal population structure within the samples of the upper reaches of the Irtysh. In contrast, the microsatellite data set identified genetic differentiation of grayling populations between the two tributaries of Lake Markakol and the Kara-Kaba River, illustrated by a high number of private microsatellite alleles, high FST values (mean FST = 0.15) and high accuracy of individual assignments in the STRUCTURE analysis (Fig. 12). Within the Markakol system, microsatellite analyses were congruent with the mtDNA-based estimates, which could not reveal any population structure. Both tributaries are connected via Lake Markakol and are geographically close. Individuals of River Urunkhaika and Kara-Kaba River are geographically even closer, but genetic differentiation strongly suggests reproductive isolation of these two grayling lineages. The Approximate Bayesian Computation analysis estimated a divergence time of 100,000 years (generation time for Thymallus is approx. 5 years; Koskinen et al., 2002) between populations of the two tributaries of Lake Markakol and the Kara-Kaba River. Markakol grayling are clearly a sister lineage to Mongolian grayling, yet are distinct at least from the large, toothed piscivorous form. We refrain at this time from categorizing them taxonomically, and call them simply Thymallus sp. While Markakol grayling and Mongolian grayling reveal a sister relationship, another Ob basin taxon (T. nikolski) reveals a sister relationship to T. svetovidovi of the upper Enisey basin. Thus, at least five grayling taxa with a common ancestry occur across the intersection of three major basins (Ob, Enisey, Central Asian Basin), with a combination of allopatry and distinct nutritional adaptation driving diversification.
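The conversion between generations and calendar years behind the divergence time above can be sketched as follows. This is only an illustration under the stated generation time of approximately 5 years; the helper names are hypothetical, and treating the ABC time parameters as counts of generations is an assumption about their units.

```python
GENERATION_TIME_YEARS = 5  # approximate generation time for Thymallus quoted in the text


def generations_to_years(generations, generation_time=GENERATION_TIME_YEARS):
    """Convert a time expressed in generations into calendar years."""
    if generation_time <= 0:
        raise ValueError("generation_time must be positive")
    return generations * generation_time


def years_to_generations(years, generation_time=GENERATION_TIME_YEARS):
    """Convert calendar years into the equivalent number of generations."""
    if generation_time <= 0:
        raise ValueError("generation_time must be positive")
    return years / generation_time


# A divergence time of 100,000 years corresponds to 20,000 generations.
print(years_to_generations(100_000))  # 20000.0
print(generations_to_years(20_000))   # 100000
```

Because the result scales linearly with generation time, any uncertainty in the 5-year value carries over proportionally to the estimate in years.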


REFERENCES

Antonov, A. (2004). A New Species of Grayling Thymallus burejensis sp. nov. (Thymallidae) from the Amur Basin. Journal of Ichthyology, 44, 401–411.

Avise, J. (2000). Phylogeography: The History and Formation of Species. Cambridge, MA: Harvard University Press.

Bandelt, H., Forster, P., & Röhl, A. (1999). Median-Joining Networks for Inferring Intraspecific Phylogenies. Molecular Biology and Evolution, 16(1), 37–48. (http://popart.otago.ac.nz)

Belkhir, K., Borsa, P., Chikhi, L., Raufaste, N., & Bonhomme, F. (1996–2004). GENETIX 4.05, logiciel sous Windows TM pour la génétique des populations. Laboratoire Génome, Populations, Interactions, CNRS UMR 5000, Université de Montpellier II, Montpellier, France.

Clement, M., Posada, D., & Crandall, K. (2000). TCS: a computer program to estimate gene genealogies. Molecular Ecology, 9, 1657–1659.

Cornuet, J., Pudlo, P., Veyssier, J., Dehne-Garcia, A., Gautier, M., Leblois, R., Marin, J., & Estoup, A. (2014). DIYABC v2.0: A software to make approximate Bayesian computation inferences about population history using single nucleotide polymorphism, DNA sequence and microsatellite data. Bioinformatics, 30, 1187–1189. doi:10.1093/bioinformatics/btt763

Cornuet, J., Ravigné, V., & Estoup, A. (2010). Inference on population history and model checking using DNA sequence and microsatellite data with the software DIYABC (v1.0). BMC Bioinformatics, 11, 401. doi:10.1186/1471-2105-11-401

Csilléry, K., Blum, M., Gaggiotti, O., & François, O. (2010). Approximate Bayesian Computation (ABC) in practice. Trends in Ecology and Evolution, 25, 410-418. doi:10.1016/j.tree.2010.04.001

Darriba, D., Taboada, G., Doallo, R., & Posada, D. (2012). jModelTest 2: more models, new heuristics and parallel computing. Nature Methods, 9, 772.

Diggs, M., & Ardren, W. (2008). Characterization of 12 highly variable tetranucleotide microsatellite loci for Arctic grayling (Thymallus arcticus) and cross amplification in other Thymallus species. Molecular Ecology Resources, 8(4), 828–830. doi:10.1111/j.1755-0998.2007.02081.x

Dorofeeva, E. A. (2002). The genus Thymallus. In Y. S. Reshetnikov (Ed.), Atlas of freshwater fish in Russia (pp. 163–169). Moscow: Nauka.


Dujmic, A. (1997). Der vernachlässigte Edelfisch: die Äsche. Status, Verbreitung, Biologie, Ökologie und Fang. Vienna: Facultas Universitätsverlag.

Dybowski, B. (1869). Vorläufige Mitteilungen über die Fischfauna des Ononflusses und des Ingoda in Transbaikalien. Verhandlungen der Zoologisch-Botanischen Gesellschaft in Wien, 19, 945–958.

Estoup, A., Jarne, P., & Cornuet, J. (2002). Homoplasy and mutation model at microsatellite loci and their consequences for population genetics analysis. Molecular Ecology, 11(9), 1591–1604. doi:10.1046/j.1365-294X.2002.01576.x

Evanno, G., Regnaut, S., & Goudet, J. (2005). Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Molecular Ecology, 14(8), 2611–2620. doi:10.1111/j.1365-294X.2005.02553.x

Excoffier, L., & Lischer, H. (2010). Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Molecular Ecology Resources, 10(3), 564–567. doi:10.1111/j.1755-0998.2010.02847.x

Felsenstein, J. (2005). PHYLIP (Phylogeny Inference Package) version 3.6. Distributed by the author. Department of Genome Sciences, University of Washington, Seattle. (http://evolution.genetics.washington.edu/phylip/getme.html)

Froufe, E., Alekseyev, S., Knizhin, I., Alexandrino, P., & Weiss, S. (2003a). Comparative phylogeography of salmonid fishes (Salmonidae) reveals late to post-Pleistocene exchange between three now-disjunct river basins in Siberia. Diversity and Distributions, 9, 269–282. doi:10.1046/j.1472-4642.2003.00024.x

Froufe, E., Knizhin, I., Koskinen, M. T., Primmer, C. R., & Weiss, S. (2003b). Identification of reproductively isolated lineages of Amur grayling (Thymallus grubii Dybowski 1869): concordance between phenotypic and genetic variation. Molecular Ecology, 12(9), 2345–2355. doi:10.1046/j.1365-294X.2003.01901.x

Froufe, E., Knizhin, I., & Weiss, S. (2005). Phylogenetic analysis of the genus Thymallus (grayling) based on mtDNA control region and ATPase 6 genes, with inferences on control region constraints and broad-scale Eurasian phylogeography. Molecular Phylogenetics and Evolution, 34, 106–117. doi:10.1016/j.ympev.2004.09.009

Goudet, J. (2002). FSTAT, a program to estimate and test gene diversities and fixation indices. Version 2.9.3.2. (http://www2.unil.ch/popgen/softwares/fstat.htm)

Gross, R., Kühn, R., Baars, M., Schröder, W., Stein, H., & Rottmann, O. (2001). Genetic differentiation of European grayling populations across the Main, Danube and Elbe drainages in Bavaria. Journal of Fish Biology, 58(1), 264–280. doi:10.1006/jfbi.2000.1444

Grosswald, M. (1998). New approach to the ice age paleohydrology of Northern Eurasia. In: Benito, G., Baker, V., Gregory, K. (Eds.), Paleohydrology and Environmental Change (pp. 199–214). Chichester, England: John Wiley and Sons.


Hasegawa, M., Kishino, H., & Yano, T. (1985). Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. Journal of Molecular Evolution, 22, 160–174.

Ioganzen, B. (1945). New fish forms from western Siberia. In Notes on Fauna and Flora of Siberia (pp. 1–16). Tomsk, Russia: Tomsk. Gos. Univ.

Jankovic, D. (1964). Synopsis of biological data on European grayling, Thymallus thymallus (L. 1758). FAO, Rome: FAO Fisheries Biology Synopses, vol. 24.

Junge, C., Primmer, C., Vøllestad, L., & Leder, E. (2010). Isolation and characterization of 19 new microsatellites for European grayling, Thymallus thymallus (Linnaeus, 1758), and their cross-amplification in four other salmonid species. Conservation Genetics Resources, 2(S1), 219–223. doi:10.1007/s12686-009-9147-z

Kashchenko, N. (1899). Pisces. In Kononov, M. & Skulimovsky, I., eds. Results of Altai Zoological Expedition of 1898 (pp. 131–141). Tomsk, Russia: Tipo-Litografiya Imperatorskago University.

Knizhin, I., Antonov, A., Safronov, S., & Weiss, S. (2007). New species of grayling Thymallus tugarinae sp. nova (Thymallidae) from the Amur River Basin. Journal of Ichthyology, 47(2), 123–139. doi:10.1134/S0032945207020014

Knizhin, I., Antonov, A., & Weiss, S. (2006). A new subspecies of the amur grayling Thymallus grubii flavomaculatus ssp. nova (Thymallidae). Journal of Ichthyology, 46(8), 555–562. doi:10.1134/S0032945206080017

Knizhin, I., Bogdanov, B., & Vasil’eva, E. (2006). Biological and morphological characteristic of the Arctic grayling Thymallus arcticus (Thymallidae) from Alpine Lakes of the basin of the upper reaches of the Angara River. Journal of Ichthyology, 46(9), 709–721. doi:10.1134/S0032945206090037

Knizhin, I., & Weiss, S. (2009). A new species of grayling Thymallus svetovidovi sp. nova (Thymallidae) from the Yenisei basin and its position in the genus Thymallus. Journal of Ichthyology, 49(1), 1–9. doi:10.1134/S0032945209010019

Knizhin, I., Weiss, S., Antonov, A., & Froufe, E. (2004). Morphological and Genetic Diversity of Amur Graylings (Thymallus, Thymallidae). Journal of Ichthyology, 44(1), 52–69.

Knizhin, I., Weiss, S., Bogdanov, B., Kopun, T., & Muzalevskaya, O. (2008). Graylings (Thymallidae) of water bodies in western Mongolia: Morphological and genetic diversity. Journal of Ichthyology, 48(9), 714–735. doi:10.1134/S0032945208090038

Knizhin, I., Weiss, S., Bogdanov, B., Samarina, S., & Froufe, E. (2006). Finding a new form of the grayling Thymallus arcticus (Thymallidae) in the basin of lake baikal. Journal of Ichthyology, 46(1), 34–43. doi:10.1134/S003294520601005X


Koskinen, M., Knizhin, I., Primmer, C., Schlötterer, C., & Weiss, S. (2002). Mitochondrial and nuclear DNA phylogeography of Thymallus sp. (grayling) provides evidence of ice-age mediated environmental perturbations in the world's oldest body of fresh water, Lake Baikal. Molecular Ecology, 11, 2599–2611. doi:10.1046/j.1365-294X.2002.01642.x

Koskinen, M., Nilsson, J., Veselov, A., Potutkin, A., Ranta, E., & Primmer, C. (2002). Microsatellite data resolve phylogeographic patterns in European grayling, Thymallus thymallus, Salmonidae. Heredity, 88(5), 391–401. doi:10.1038/sj.hdy.6800072

Koskinen, M., Piironen, J., & Primmer, C. (2001). Interpopulation genetic divergence in European grayling (Thymallus thymallus, Salmonidae) at a microgeographic scale: implications for conservation. Conservation Genetics, 2(2), 133–143. doi:10.1023/A:1011814528664

Kottelat, M. (2006). Fishes of Mongolia. A check-list of the fishes known to occur in Mongolia with comments on systematics and nomenclature. Washington, DC: The World Bank.

Librado, P., & Rozas, J. (2009). DnaSP v5: A software for comprehensive analysis of DNA polymorphism data. Bioinformatics, 25(11), 1451–1452. doi:10.1093/bioinformatics/btp187

Mangerud, J., Jakobsson, M., Alexanderson, H., Astakhov, V., Clarke, G., Henriksen, M., Hjort, C., Krinner, G., Lunkka, J.-P., Möller, P., Murray, A., Nikolskaya, O., Saarnisto, M., & Svendsen, J. (2004). Ice-dammed lakes and rerouting of the drainage of northern Eurasia during the Last Glaciation. Quaternary Science Reviews, 23(11-13), 1313–1332. doi:10.1016/j.quascirev.2003.12.009

Matveyev, A., Samusenok, V., Tel’pukhovskiy, A., Pronin, N., Vokin, A., Prosekin, K., & Anoshko, P. (2005). New subspecies of Arctic grayling, Thymallus arcticus baicalolenensis ssp. nova (Salmoniformes, Thymallidae). Vestnik Buryatskogo Universiteta, S.2(Biologiya 7), 69–82.

Mitrofanov, V., & Petr, T. (1999). Fish and fisheries in the Altai, northern Tien Shan and Lake Balkhash (Kazakhstan). In Fish and Fisheries at Higher Altitudes: Asia (p. 304). Rome: FAO Fisheries Technical Paper 385. Food and Agricultural Organization of the United Nations.

Moritz, C. (1994). Defining “evolutionary significant units” for conservation. Trends in Ecology and Evolution, 9, 373–375. doi:10.1016/0169-5347(94)90057-4

Nei, M., & Li, W. (1979). Mathematical model for studying genetic variation in terms of restriction endonucleases. Proceedings of the National Academy of Sciences of the United States of America, 76(10), 5269–5273. doi:10.1073/pnas.76.10.5269

Nelson, J. S. (2006). Fishes of the world (4th ed.). Hoboken, New Jersey: John Wiley & Sons.


Niessen, F., Hong, J., Hegewald, A., Matthiessen, J., Stein, R., Kim, H., Kim, S., Jensen, L., Jokat, W., Nam, S., & Kang, S. (2013). Repeated Pleistocene glaciation of the East Siberian continental margin. Nature Geoscience, 6(10), 842–846. doi:10.1038/ngeo1904

Osinov, A., & Lebedev, V. (2000). Genetic divergence and phylogeny of the Salmoninae based on allozyme data. Journal of Fish Biology, 57(2), 354–381. doi:10.1006/jfbi.2000.1307

Persat, H., Pattee, E., & Roux, A. (1978). Origine et caractéristiques de la distribution de l’ombre commun, Thymallus thymallus (L., 1758) en Europe et en France. Verh. Internat. Verein. Limnol., 20, 2117–2121.

Pivnicka, K., & Hensel, K. (1978). Morphological variation in the genus Thymallus, Cuvier, 1829 and recognition of the species and subspecies. Acta Univ. Carolinae-Biologica 1975-1976, 4, 37–67.

Pritchard, J., Stephens, M., & Donnelly, P. (2000). Inference of population structure using multilocus genotype data. Genetics, 155, 945–959. doi:10.1111/j.1471-8286.2007.01758.x

Rambaut, A., & Drummond, A. (2009). Tracer version 1.5. (http://tree.bio.ed.ac.uk/software/tracer/)

Redenbach, Z., & Taylor, E. (1999). Zoogeographical implications of variation in mitochondrial DNA of Arctic grayling (Thymallus arcticus). Molecular Ecology, 8(1), 23–35. doi:10.1046/j.1365-294X.1999.00516.x

Reshetnikov, Y., Popova, O., Sokolov, L., Tsepkin, E., Sideleva, V., Dorofeeva, E., Chereshnev, I., Moskal’kova, K., Dgebuadze, Y., Ruban, G., & Korolev, V. (2002). Atlas presnovodnykh ryb Rossii v dvukh tomakh (Atlas of Russian Freshwater Fishes in two Volumes), V.1. Moscow: Nauka.

Ronquist, F., & Huelsenbeck, J. (2003). MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics, 19(12), 1572–1574. doi:10.1093/bioinformatics/btg180

Sambrook, J., Fritsch, E., & Maniatis, T. (1989). Molecular Cloning: a Laboratory Manual (2nd ed.). New York: Cold Spring Harbor Laboratory Press.

Scott, W., & Crossman, E. (1998). Freshwater Fishes of Canada (5th ed.). Oakville, Ont., Canada: Galt House Publications Ltd.

Shubin, P., & Zakharov, A. (1984). Hybridization between European grayling, Thymallus thymallus, and Arctic grayling, Thymallus arcticus, in the contact zone of the species. Journal of Ichthyology, 24(4), 159–162.

Skurikhina, L., Mednikov, B. & Tugarina, P. (1985). Geneticheskaya divergentziya khariusov (Thymallus Cuvier) Evrazii i seti vidov. Zool. Zh., 64(2), 245 (in Russian).


Snoj, A., Sušnik, S., Pohar, J., & Dovč, P. (1999). The first microsatellite marker (BFRO 004) for grayling, informative for its Adriatic population. Animal Genetics, 30(1), 74–75.

Spielhagen, R., Erlenkeuser, H., & Siegert, C. (2005). History of freshwater runoff across the Laptev Sea (Arctic) during the last deglaciation. Global and Planetary Change, 48(1-3), 187–207. doi:10.1016/j.gloplacha.2004.12.013

Stamatakis, A. (2006). RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics, 22, 2688–2690. doi:10.1093/bioinformatics/btl446

Stamford, M., & Taylor, E. (2004). Phylogeographical lineages of Arctic grayling (Thymallus arcticus) in North America: divergence, origins and affinities with Eurasian Thymallus. Molecular Ecology, 13(6), 1533–1549. doi:10.1111/j.1365-294X.2004.02174.x

Sušnik, S., Snoj, A., Jesenšek, D., & Dovč, P. (2000). Microsatellite DNA markers (BFRO010 and BFRO011) for grayling. Journal of Animal Science, 78, 488–489.

Sušnik, S., Snoj, A., & Dovč, P. (1999). Microsatellites in grayling (Thymallus thymallus): comparison of two geographically remote populations from the Danubian and Adriatic river basin in Slovenia. Molecular Ecology, 8(10), 1756–1758.

Tamura, K., Stecher, G., Peterson, D., Filipski, A., & Kumar, S. (2013). MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Molecular Biology and Evolution, 30, 2725–2729. doi:10.1093/molbev/mst197

Tavaré, S. (1986). Some probabilistic and statistical problems in the analysis of DNA sequences. In R. Miura (Ed.), Some mathematical questions in biology—DNA sequence analysis (pp. 57–86). Providence, USA: American Mathematical Society.

Templeton, A., Crandall, K., & Sing, C. (1992). A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping and DNA sequence data. III. Cladogram Estimation. Genetics, 132, 619–633.

Uiblein, F., Jagsch, A., Honsig-Erlenburg, W., & Weiss, S. (2001). Status, habitat use, and vulnerability of the European grayling in Austrian waters. Journal of Fish Biology, 59(A), 223–247. doi:10.1006/jfbi.2001.1762

Weiss, S., Knizhin, I., Kirillov, A., & Froufe, E. (2006). Phenotypic and genetic differentiation of two major phylogeographical lineages of arctic grayling Thymallus arcticus in the Lena River, and surrounding Arctic drainages. Biological Journal of the Linnean Society, 88(4), 511–525. doi:10.1111/j.1095-8312.2006.00621.x

Weiss, S., Knizhin, I., Romanov, V., & Kopun, T. (2007). Secondary contact between two divergent lineages of grayling Thymallus in the lower Enisey basin and its taxonomic implications. Journal of Fish Biology, 71(Suppl. C), 371–386. doi:10.1111/j.1095-8649.2007.01662.x

Weiss, S., Kopun, T., & Sušnik, S. (2013). Assessing natural and disturbed population structure in European grayling Thymallus thymallus: melding phylogeographic, population genetic and jurisdictional perspectives for conservation planning. Journal of Fish Biology, 82(2), 505–521. doi:10.1111/jfb.12007

Weiss, S., Persat, H., Eppe, R., Schlötterer, C., & Uiblein, F. (2002). Complex patterns of colonization and refugia revealed for European grayling Thymallus thymallus, based on complete sequencing of the mitochondrial DNA control region. Molecular Ecology, 11(8), 1393–1407. (http://www.ncbi.nlm.nih.gov/pubmed/12144660)

Winkler, K., & Weiss, S. (2008). Eighteen new tetranucleotide microsatellite DNA markers for Coregonus lavaretus cloned from an alpine lake population. Molecular Ecology Resources, 8(5), 1055–1058. doi:10.1111/j.1755-0998.2008.02153.x

Zinov’ev, E. (2005). Doctoral Dissertation in Biology. Perm, Russia: Permsk. Gos. Un-t.

Zinoviev, E. (1980). Parallelizm Izmenchivosti U Evropeiskogo I Sibirskogo Khariusov // Lososevidniye Rybi. Leningrad: ZIN AN SSSR.


LIST OF FIGURES

FIGURE 1: Changing concept of the proglacial drainage systems in northern Eurasia (taken from Grosswald, 1998). Directions of the past meltwater flow are shown by the arrows. (a) paleo-hydrology includes lakes and spillways of Europe and West Siberia; (b) larger ice-sheet system, the western catchment also becomes larger, and a second, east-bound drainage system appeared; (c) ice-sheets were too large, and the ice-free terrains too small for the development of the second Siberian drainage system.

FIGURE 2: External outlines of the head of Thymallus brevirostris sampled from water bodies of Altai (taken from Knizhin et al., 2008). (a, b) representing large predatory forms and (c-f) representing small benthos-eating forms.

FIGURE 3: Schematic map of a large part of the drainage systems of Siberia. Left enlargement gives a closer perspective of the upper reaches of the Irtysh including Lake Markakol while the right enlargement represents the second studied region, Western Mongolia. The dotted line outlines the region of sample locations of populations collected for Knizhin et al. (2008). (For coordinates of the sampled populations see Table 1.)

FIGURE 4: Bi-variate plot of the first two factors of a factorial CA of microsatellite allele variation of 54 individuals from the four studied water bodies of Mongolia. Each population is marked with a unique symbol.

FIGURE 5: Results of STRUCTURE analysis for the dataset of the four populations of Mongolia (summarized across five replicates) including a) the estimated mean of Ln probability for each K value [mean Ln Pr (K)] with standard deviation and b) Delta K (Evanno, Regnaut, & Goudet, 2005), which shows the rate of change between successive values of K.

FIGURE 6: Bi-variate plot of the first two factors of a factorial correspondence analysis of microsatellite allele variation of 54 individuals of Mongolia, classified into different taxa of the genus Thymallus. Each taxon is marked with a unique symbol.

FIGURE 7: Results of STRUCTURE analysis for the dataset of samples from Mongolia, classified into different taxa (summarized across five replicates) including a) the estimated mean of Ln probability for each K value [mean Ln Pr (K)] with standard deviation and b) Delta K (Evanno, Regnaut, & Goudet, 2005), which shows the rate of change between successive values of K.

FIGURE 8: Results of the first two factors of the factorial CA based on microsatellite allele frequencies of individuals of Kazakhstan, Mongolia and Russia. The clusters represent the genetic differentiation between the different drainage systems.


FIGURE 9: Results of the second and third factor of the factorial CA based on microsatellite allele frequencies of individuals of Kazakhstan, Mongolia and Russia. Clusters represent genetic differentiation between different drainage systems and an additional population structure for individuals of Kazakhstan.

FIGURE 10: Assignment (membership coefficient (Q)) of individuals of Biya, the Khovd River basin and the upper reaches of the Irtysh to genetic clusters using the STRUCTURE algorithm for K = 4.

FIGURE 11: Allele frequencies and size distributions of ten microsatellite loci in Thymallus sp. sampled from the upper reaches of the Irtysh drainage systems. Areas of the bubbles correspond to frequencies of the respective alleles in given populations. Light grey represents dinucleotide loci and dark grey represents tetranucleotide loci.

FIGURE 12: Sampling location of the three populations of the upper reaches of the Irtysh River and bar plots of estimates of membership coefficient (Q) for each individual of that region for the inferred cluster (K = 2) with maximum log-likelihood probability.

FIGURE 13: Bayesian 50% majority-rule consensus tree of haplotypes based on the control region of mtDNA and fragments of genes of tRNA from Thymallus sp. using the HKY + G + I model. Three-letter sample codes as in Table 1. Node support is shown by Bayesian probabilities (above) and bootstrap values (over 50%) for ML (below). Tree is rooted with three haplotypes from European grayling Thymallus thymallus.

FIGURE 14: a) Parsimony network of individuals from Biya, the Khovd River basin and the upper reaches of the Irtysh, whereby gaps (or indels) are counted as events. b) Median-joining network neglecting indels. Circle size is proportional to the observed haplotype frequencies and small black circles represent missing or theoretical haplotypes.

FIGURE 15: Graphic representation of the three scenarios analyzed with both microsatellite and mtDNA data in DIY-ABC. Pop 1 = Kal, Pop 2 = Kka, Pop 3 = Rur. The linear regression graph shows the posterior probabilities of each scenario, with scenario 1 receiving the best support.


APPENDIX

Appendix S1: Variable nucleotide positions for all haplotypes from Biya, the Khovd River basin and the upper reaches of the Irtysh River defined in this study. Positions correspond to the tRNA proline gene (1-68), the control region (69-1077), and the tRNA phenylalanine gene (1078-1084). Hash characters refer to base pair deletions or insertions and dots represent concordance with the Biy02 haplotype.


Appendix S2: Set of prior distributions. Posterior probabilities of the selected scenario (scenario 1) estimated in DIY-ABC: mean, median and mode values, quantiles of the posterior distribution and true values. UF = Uniform; µmic = mean mutation rate (per site per generation).

Note: Parameters of the scenario are N1, N2, N3, NA, N12, t1, t2, t12, µmic seq, µmic dinuc. and µmic tetranuc.; rows are identified here by their prior distribution.

Prior distribution | mean | median | mode | q025 | q050 | q250 | q750 | q950 | q975 | true value
UF [50–40,000] | 1.61E+04 | 1.45E+04 | 1.25E+04 | 4.89E+03 | 5.97E+03 | 1.05E+04 | 2.04E+04 | 3.19E+04 | 3.53E+04 | 2.00E+04
UF [30–20,000] | 7.03E+03 | 5.73E+03 | 3.24E+03 | 1.10E+03 | 1.50E+03 | 3.36E+03 | 9.65E+03 | 1.69E+04 | 1.83E+04 | 1.00E+04
UF [10–10,000] | 2.50E+03 | 1.96E+03 | 1.46E+03 | 2.80E+02 | 4.26E+02 | 1.10E+03 | 3.36E+03 | 6.62E+03 | 7.86E+03 | 5.01E+03
UF [30–40,000] | 1.98E+04 | 1.90E+04 | 1.80E+04 | 5.85E+03 | 7.71E+03 | 1.37E+04 | 2.53E+04 | 3.46E+04 | 3.69E+04 | 2.00E+04
UF [30–40,000] | 3.34E+04 | 3.41E+04 | 3.50E+04 | 2.37E+04 | 2.55E+04 | 3.09E+04 | 3.65E+04 | 3.91E+04 | 3.95E+04 | 2.00E+04
UF [30–40,000] | 2.66E+04 | 2.72E+04 | 2.80E+04 | 1.09E+04 | 1.33E+04 | 2.16E+04 | 3.23E+04 | 3.80E+04 | 3.90E+04 | 2.00E+04
UF [30–40,000] | 2.12E+04 | 2.12E+04 | 2.26E+04 | 2.71E+03 | 4.11E+03 | 1.24E+04 | 3.04E+04 | 3.82E+04 | 3.91E+04 | 2.00E+04
UF [30–40,000] | 1.33E+04 | 1.12E+04 | 3.63E+03 | 8.61E+02 | 1.53E+03 | 5.63E+03 | 1.92E+04 | 3.24E+04 | 3.56E+04 | 2.00E+04
UF [1.00E-09; 1.00E-07] | 6.42E-08 | 6.43E-08 | 6.48E-08 | 3.01E-08 | 3.42E-08 | 5.08E-08 | 7.86E-08 | 9.31E-08 | 9.54E-08 | 5.05E-08
UF [1.00E-04; 9.00E-04] | 5.44E-04 | 5.47E-04 | 5.56E-04 | 2.01E-04 | 2.39E-04 | 4.02E-04 | 6.88E-04 | 8.42E-04 | 8.72E-04 | 5.00E-04
UF [1.00E-06; 9.00E-04] | 7.01E-05 | 4.35E-05 | 2.51E-05 | 9.14E-06 | 1.16E-05 | 2.48E-05 | 8.02E-05 | 2.22E-04 | 3.08E-04 | 4.51E-04