
EFFECTS OF LANDSCAPE ON GENETIC VARIATION OF STONE LAPPING MINNOW (Garra cambodgiensis (TIRANT, 1884)) POPULATIONS IN THE UPPER NAN RIVER

CHAOWALEE JAISUK

A DISSERTATION SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DOCTOR DEGREE OF SCIENCE IN AQUATIC SCIENCE FACULTY OF SCIENCE BURAPHA UNIVERSITY JANUARY 2018 COPYRIGHT OF BURAPHA UNIVERSITY


ACKNOWLEDGEMENT

First of all, I would like to express my deepest gratitude and sincere appreciation to my advisor, Dr. Wansuk Senanan, for her excellent advice, discussion, and stimulating suggestions throughout the research process and the dissertation writing. Without her patient instruction, insightful criticism and expert guidance, the completion of this dissertation would not have been possible.

Secondly, I am grateful for the support from the dissertation committee members and colleagues from the Faculty of Sciences, Burapha University, the Faculty of Sciences and Agricultural Technology, Rajamangala University of Technology Lanna Nan Campus, and the Nan Provincial Administrative Organization (PAO). The dissertation committee members, Dr. Wongpathom Kamolrat, Assoc. Prof. Dr. Vipoosit Manthachitra, Dr. Narinratana Kongjandtre and Dr. Salinee Khachonpisitsak, provided valuable comments and suggestions to improve my dissertation research and writing. Dr. Prasarn Intacharoen from the Department of Aquatic Science and Mr. Suchart Chayhard from the Environmental Science graduate program, Faculty of Sciences, guided me through the use of GIS for my research. The Nan PAO provided the GIS map data.

This research was funded by the 2017 graduate research program of the National Research Council of Thailand and by the Department of Aquatic Science, Faculty of Science, Burapha University. Laboratory support was provided by the Department of Aquatic Science, Faculty of Science, Burapha University and the Biotechnology Center at Rajamangala University of Technology Lanna. I am very thankful for these opportunities.

Finally, I greatly appreciate the loving support of my family and friends. Their support has sustained me through this intense intellectual journey.

Chaowalee Jaisuk


54810252: MAJOR: AQUATIC SCIENCE; Ph.D. (AQUATIC SCIENCE)
KEYWORDS: SPATIAL GENETIC VARIATION/ MICROSATELLITE VARIATION/ LANDSCAPE GENETICS/ PHYSICAL BARRIERS/ POPULATION GENETICS/ GARRA CAMBODGIENSIS/ UPPER NAN RIVER/ THAILAND/ TROPICAL STREAM FISH
CHAOWALEE JAISUK: EFFECTS OF LANDSCAPE ON GENETIC VARIATION OF STONE LAPPING MINNOW (Garra cambodgiensis (TIRANT, 1884)) POPULATIONS IN THE UPPER NAN RIVER. ADVISORY COMMITTEE: WANSUK SENANAN, Ph.D. 125 P. 2018.

Spatial genetic variation of stream-dwelling freshwater fish is typically affected by the historical and contemporary river landscape as well as by life-history traits. Tropical river and stream landscapes have endured extended geological change and were less affected by the latest glaciation period. As a consequence, these systems tend to be extremely complex and may have shaped the genetic diversity of fish populations in a unique manner. Information on population structure in tropical aquatic systems, especially freshwater ecosystems, is lacking. Such data are becoming important for designing appropriate management and conservation plans, as these aquatic systems are undergoing intense development and exploitation. Therefore, this dissertation research evaluated the effects of landscape features on the population genetic diversity of Garra cambodgiensis, a tropical stream cyprinid, in the upper Nan River drainage basin, Thailand, using 5-11 hypervariable microsatellite loci. The research consisted of three components focusing on two geographic scales, the basin and sub-basin levels.

The first component described the overall spatial genetic variation of populations from eight tributaries (six sub-basins) representing the entire upper Nan River drainage (Meed, Kon, Pua, Yao, Yang, Sa, Wa and Haeng rivers). Based on 11 microsatellite loci, I detected moderate genetic diversity within the eight population samples (average number of alleles per locus across loci = 10.99±3.00; allelic richness = 10.12±2.44). Allelic richness within samples was negatively correlated with stream order (P < 0.05). There was no evidence for recent bottleneck events in these populations. The populations in the upper Nan River drainage basin were genetically heterogeneous (global FST = 0.022, P < 0.01). The Bayesian clustering algorithms (TESS and STRUCTURE) suggested four to five genetic clusters that roughly coincide with sub-basins: (1) the headwater streams/ the main stem of the Nan River, (2) a middle tributary, (3) a southeastern tributary and (4) a southwestern tributary. I observed a positive correlation between geographic distance and linearized FST (P < 0.05), and the genetic differentiation pattern can be moderately explained by the contemporary stream network (STREAMTREE analysis, R2 = 0.75). The MEMGENE analysis suggested a genetic division between the northern (genetic clusters 1 and 2) and southern (clusters 3 and 4) sub-basins.

The second component examined the impacts of landscape features on the genetic variation of G. cambodgiensis at a sub-basin level, in the Nam Wa sub-basin. This sub-basin, with the Wa River as its major river, represents a complex landscape that allows for in-depth evaluation. Samples came from five localities (SP, Pha, NW, HR and NS) along the Wa River, representing different land use types, elevations, stream orders and positions relative to a man-made dam, the Nam Wa Dam. Based on 10 microsatellite loci, the genetic diversity of samples at the sub-basin level was lower than that observed at the basin level. The impacts of the hierarchical structure of the stream on genetic variation were less noticeable. However, pairwise differences in elevation and pairwise geographic distance among sites were important explanatory variables contributing to the existing spatial genetic variation of G. cambodgiensis. Genetic impacts of the recently built large concrete Nam Wa Dam on G. cambodgiensis populations in the Wa River were not apparent. However, genetic monitoring would be needed to assess the long-term genetic impacts of this dam.

The third component examined the effects of physical barriers on the genetic variation of G. cambodgiensis from six locations above and below three physical barriers in the Wa River system, namely Sapun Waterfall (SPU, SPL, Pun Stream), Nakham Dam (NKU, NKL, Mang Stream) and Suwanua Dam (SWU, SWL, Wa River). Based on 5 microsatellite loci, the sample located above the Sapun Waterfall (SPU) was the most genetically distinct from the remaining samples, including SPL (FST = 0.097-0.307), and had the lowest genetic diversity. For the remaining samples, those above and below the weirs were more genetically similar, although the genetic distance values were significant for all sample pairs (FST = 0.051 for SWU-SWL; 0.024 for NKU-NKL). STRUCTURE analysis revealed unequal admixture from the NK samples in SWU and SWL, suggesting some restriction to movement downstream. The results suggested a large genetic impact of a large barrier (> 10 m high) and more subtle genetic impacts of smaller concrete weirs (< 5 m high). Barrier size should be an important consideration in the design of fish-friendly structures.

In summary, the contemporary structure of the river network, pairwise differences in elevation and stream order greatly shaped the population genetic structure of G. cambodgiensis in the upper Nan River system. The high degree of genetic admixture in each location in the upper Nan River basin highlights the importance of natural flooding patterns and the possible genetic impacts of supplementary stocking. At the sub-basin level in the Nam Wa sub-basin, isolated headwater populations may have undergone recent bottlenecks. Any habitat change that disrupts the connectivity of the river should be avoided. Insights obtained from this research advance our knowledge of the interactions between the complexity of a tropical stream system and the ecology of stream-dwelling fish, and provide guidance for current conservation and restoration efforts for this species.


CONTENTS

ABSTRACT
CONTENTS
LIST OF TABLES
LIST OF FIGURES
CHAPTER
1 INTRODUCTION
    Statements and Significance of the Problems
    Objectives
    Hypothesis
    Contribution to knowledge
    Scope of the study
2 LITERATURE REVIEWS
    Genetics and Population
    The formation of population structure
    Microsatellite loci
    Upper Nan River,
    The model species Garra cambodgiensis
    Influence of the gene flow barrier on genetic structure lotic fish populations
    Genetic diversity within isolated populations
    Landscape genetics
    Genetic management of fisheries resources
3 RESEARCH METHODOLOGY
    Study design and sampling locations
    Laboratory methods
    Genetic data analysis
4 RESULTS
    Tier I: Population genetic structure of G. cambodgiensis in the upper Nan River drainage basin
    Tier II(1): Population genetic structure of G. cambodgiensis in the Nam Wa sub-basin
    Tier II(2): Effect of physical barrier on genetic structure of G. cambodgiensis in the Wa River
    A Combined data set for the entire upper Nan River basin
5 DISCUSSION AND CONCLUSIONS
REFERENCES
BIOGRAPHY

LIST OF TABLES

Table
1-1 Summary of the scope of study, sampling locations, number of loci examined and genotyping techniques
2-1 Effects of geographic distance on genetic population structuring of lotic fish species
2-2 Effects of physical barriers on genetic population structuring of lotic fish species
2-3 Genetic diversity within populations as a consequence of population isolation
3-1 Landscape characteristics of sampling locations of G. cambodgiensis in the upper Nan River, Thailand
3-2 Landscape characteristics of sampling locations of G. cambodgiensis in the Wa River, Nam Wa sub-basin Thailand
3-3 Sampling locations of G. cambodgiensis populations above and below three barriers in the Wa River, Thailand
3-4 Descriptions of primer sequences, fluorescent labelling, and annealing temperature (TA) (°C) of microsatellite loci analyzed in this study
3-5 Temperature profile of the polymerase chain reaction (PCR) used in this study
3-6 Analysis of genetic variation and population genetic structure
4-1 Average allelic variability at 11 microsatellite loci of G. cambodgiensis in the upper Nan River, Thailand
4-2 Estimates and 95% confidence intervals of contemporary effective population size (Ne), NeS and the detection of bottlenecks based on Wilcoxon's test for eight population samples at 11 microsatellite loci
4-3 Pairwise FST values (lower diagonal) and geographic distance (km) (upper diagonal) among G. cambodgiensis population samples in the upper Nan River, Thailand
4-4 Historical gene flow among G. cambodgiensis population in the upper Nan River at 11 microsatellite loci
4-5 Pearson correlations between landscape characteristics in the upper Nan River and allelic richness within G. cambodgiensis population samples
4-6 Multiple regression on distance matrices (MRM) for explaining linearized pairwise FST among populations of G. cambodgiensis population samples
4-7 Average allelic variability at 10 microsatellite loci of five G. cambodgiensis samples in the Wa River, Nam Wa sub-basin
4-8 Estimates and 95% confidence intervals of contemporary effective population size (Ne) based on linkage disequilibrium and the detection of bottlenecks based on Wilcoxon's test for five population samples at 10 microsatellite loci
4-9 Pairwise FST values (lower diagonal) and geographic distance (km) (upper diagonal) among five G. cambodgiensis population samples in the Wa River, Nam Wa sub-basin, Thailand
4-10 Multiple regression on distance matrices (MRM) for explaining linearized pairwise FST among populations of G. cambodgiensis
4-11 Average allelic variability at 5 microsatellite loci of six G. cambodgiensis samples in three barrier, Wa River
4-12 Estimates and 95% confidence intervals of contemporary effective population size (Ne) based on linkage disequilibrium and sibship approaches and the detection of bottlenecks based on Wilcoxon's test for six population samples at 5 microsatellite loci
4-13 Genetic variation within and among G. cambodgiensis population samples from three streams in the Wa River based on analysis of molecular variance (AMOVA)
4-14 Pairwise FST values among six G. cambodgiensis population samples and among three streams
4-15 Average allelic variability of G. cambodgiensis in upper Nan River drainage basin
4-16 Estimates and 95% confidence intervals of contemporary effective population size (Ne) based on linkage disequilibrium and the detection of bottlenecks based on Wilcoxon's test for twelve population samples at 10 microsatellite loci
4-17 Pairwise FST values (lower diagonal) and geographic distance (km) (upper diagonal) among twelve G. cambodgiensis population samples in the upper Nan River drainage basin
4-18 Multiple regression on distance matrices (MRM) for explaining linearized pairwise FST among populations of G. cambodgiensis
5-1 Records of supplementary stocking activities within the upper Nan River drainage basin between 2009 and 2017


LIST OF FIGURES

Figure
2-1 Five different fragmented population structure
2-2 River network of the upper Nan River, Nan Province
2-3 Morphology of adult Garra cambodgiensis
2-4 Map of the Marys River
2-5 Four analysis levels at which landscape genetic data can be analyzed
3-1 Study design of this dissertation research
3-2 Study area of this dissertation research
3-3 Locations of population samples of G. cambodgiensis in the upper Nan River
3-4 Sampling locations of G. cambodgiensis populations in the Wa River. The map also shows stream size and physical appearance of barriers examined in this study.
4-1 UPGMA dendrogram of eight population samples of G. cambodgiensis in the upper Nan River based on Nei's genetic distance (Nei, 1978) with 1000 bootstrap replicates at 11 microsatellite loci
4-2 Bar plot of membership coefficients of individuals assigned to genetic clusters (K = 4 and 5) generated by a Bayesian clustering algorithm, TESS software. The individual coefficients were grouped by population samples.
4-3 Historical gene flow (migration rate, m) among G. cambodgiensis population the upper Nan River at 11 microsatellite loci
4-4 MEMGENE analysis for eight population samples of G. cambodgiensis in the upper Nan River basin
4-5 UPGMA dendrogram of five population samples of G. cambodgiensis in the Wa River
4-6 Bar plot of membership coefficients of G. cambodgiensis in Wa River, Nam Wa sub-basin
4-7 UPGMA dendrogram of six population samples of G. cambodgiensis among samples obtained from areas above and below physical barriers in the Pun, Mang and Wa Streams in the Wa River
4-8 Bar plot of membership coefficients of G. cambodgiensis in the Wa River
4-9 UPGMA dendrogram of twelve population samples of G. cambodgiensis in the upper Nan River drainage basin
4-10 Bar plot of membership coefficients of G. cambodgiensis in the upper Nan River drainage basin


CHAPTER 1 INTRODUCTION

Statements and Significance of the Problems

Genetic variation in natural populations reflects population history and the evolutionary potential of a species (Frankham, Ballou, & Briscoe, 2010), an important consideration for aquatic conservation (Allendorf & Luikart, 2007). Population subdivision within a fish species results from an interplay between restricted gene flow and independent genetic changes within isolated populations (Hedrick, 2011). In the absence of gene flow, conspecific populations will generally diverge from one another as a result of genetic drift, natural selection and mutation (Freeland, 2005). In addition, fish dispersal strategies and life-history traits determine the magnitude of landscape effects in shaping genetic variation (Pilger, Gido, Propst, Whitney, & Turner, 2017).

Although we recognize a general pattern of landscape effects on population genetic diversity, the boundaries for genetic divergence specific to a river system vary depending upon the scale and complexity of local landscapes, fish life history and fish population dynamics. Genetic differentiation is often detected at a drainage basin level (Neville, Dunham, & Peacock, 2006; Beneteau, Mandrak, & Heath, 2009) or between upstream and downstream locations within a drainage (Barson, Cable, & Van Oosterhout, 2009). Geographic factors encouraging population division include geographic distance between locations (Lamphere & Blum, 2012; Crookes & Shaw, 2016; Beneteau et al., 2009; Hopken, Douglas, & Douglas, 2012), the presence of barriers (Neville et al., 2006; Yamamoto, Morita, Koizumi, & Maekawa, 2004), the complexity of a stream network (Pilger et al., 2017) and habitat fragmentation (Sterling, Reed, Noonan, & Warren, 2012). Moreover, the magnitude of population divergence also depends on life-history traits such as body size (Hughes, Real, Marshall, & Schmidt, 2012; Pilger et al., 2017), habitat preference (Lamphere & Blum, 2012), and migratory behavior within the life cycle (Barson et al., 2009).

An explicit integration of spatial and genetic analysis, i.e., landscape genetics, has been applied to aquatic environments and can reveal unexpected determinants of population divergence (Selkoe, Scribner, & Galindo, 2016). For instance, Leclerc, Mailhot, Mingelbier, and Bernatchez (2008) reported that the population genetic structure of yellow perch (Perca flavescens) in the Saint Lawrence River, Québec, Canada, was explained by more than five landscape variables (e.g., the presence of physical barriers, turbidity, water temperature, water masses and size of habitat) and their interactions. Although the genetic variation could be explained by the isolation by distance model, the presence of physical barriers (dams) was the most important factor affecting gene flow and the formation of genetic structure. In addition, Cook, Kennard, Real, Pusey, and Hughes (2011) examined the interactions between landscape variables, including water body connectivity (i.e., distance, elevation), stream gradient and flow, and the population genetic variation of a widely distributed tropical freshwater fish, Mogurnda mogurnda. The null hypothesis was that the connectivity-related variables were the most important landscape factors driving the population structure of the species. Unexpectedly, the landscape genetic analysis revealed that stream gradient was the major influence on genetic variation in M. mogurnda in the Daly River, Australia. Due to the complexity of tropical river systems, data derived from temperate or sub-tropical species may not provide adequate guidance for local conservation and management of fish populations.

A tropical river landscape is the outcome of a long geomorphological process and was less affected by the latest glaciation period at the end of the Pleistocene epoch (approximately 100,000 to 10,000 years ago) than river landscapes in temperate regions. The Nan River is the largest tributary (740 km) of the Chao Phraya River. The entire Nan River drainage basin covers 34,682.04 km2 and is divided by a large dam and reservoir, Sirikit Dam, into upper and lower portions. The upper Nan River drainage basin captures all highland headwater streams draining into the main stem of the Nan River and occupies approximately one third of the entire Nan River basin. Due to its heterogeneous landscape, which contains some pristine highland headwater streams, the upper Nan River basin harbors high fish species diversity, with at least 108 (Lothongkham, 2008) of the 600 species found in Thailand. The drainage is also highly complex, with more than 10,000 first and second order streams. As in other tropical aquatic systems, rapid habitat alterations and heavy fishing pressure threaten the sustainability of fish populations.

The stone lapping minnow or false Siamese algae eater (Garra cambodgiensis) is a small-bodied (4-10 cm) cyprinid that inhabits rocky bottoms of fast-flowing sections of small and medium-sized streams. This species is widely distributed in large river basins in Thailand, including those of the Chao Phraya, and Meklong rivers (Lothongkham, 2008) as well as in southern China, India and the Middle East to northern and central Africa (Kullander & Fang, 2004). This species occurs in almost all tributaries of the Nan River. Life-history characteristics related to the dispersal ability of this species have not been well documented. However, local fishermen often harvest the fish during the spawning season (May-August), when breeders congregate in flooded lowland areas adjacent to the stream or river. As the water recedes, the breeders and their fry disperse. It is likely that their semi-buoyant eggs and fry drift with the stream flow throughout the stream network.

This dissertation research took a two-tier approach, with each tier focusing on a different geographic scale and employing an integrative landscape genetic framework. Tier I research assessed the population genetic structure of G. cambodgiensis populations at a large geographic scale, the entire upper Nan River basin, using highly polymorphic microsatellite loci. Tier II research examined, in detail, the effects of landscape features (i.e., distance, the presence of physical barriers, land use types) on the genetic diversity of G. cambodgiensis populations in a headwater tributary of the upper Nan River basin, the Wa River. Microsatellite markers are well suited to examining gene flow and are the most commonly used markers in landscape genetics (e.g., Hall & Beissinger, 2014). Data at both geographic levels allowed for inference on the scale of gene flow among populations of this lotic fish species. This understanding of the landscape genetics of G. cambodgiensis will serve as a baseline for genetic monitoring and provide a tool for the conservation and restoration of wild populations of stream fish.


Objectives
1. Examine genetic variation within and among G. cambodgiensis populations inhabiting major tributaries of the upper Nan River, Nan Province, using microsatellite loci.
2. Evaluate the impacts of landscape features on genetic variation within and among G. cambodgiensis populations in the upper Nan River at the drainage basin and sub-basin levels.
3. Identify the scale of gene flow among G. cambodgiensis populations in the upper Nan River basin.

Hypothesis
1. A stream network structure facilitates the formation of population genetic structure in a widely dispersed fish species. (Objective 1)
2. Genetic divergence between sampling locations increases as the pairwise geographic distance increases. (Objectives 1-3)
3. Populations separated by a physical barrier are likely to be genetically divergent. (Objective 2)
4. Heterogeneous landscape features (land use types) affect genetic diversity within and among populations differently. (Objectives 2-3)
5. Populations in poor quality habitats, represented by those in agricultural landscapes (presumably small populations), have lower genetic variation than those in high quality habitats, represented by those in forested areas. (Objective 2)
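
The distance hypothesis (Hypothesis 2) is commonly examined by relating linearized FST, FST/(1 - FST), to pairwise geographic distance. The following is a minimal sketch of that calculation only; the distance and FST values are hypothetical illustration data, not results from this study, and the helper names are assumptions. Because pairwise values are not independent, significance is usually assessed with permutation procedures on distance matrices (for example a Mantel test) rather than with a standard correlation test.

```python
import math


def linearize_fst(fst):
    """Return linearized FST, FST / (1 - FST), for a value in [0, 1)."""
    if not 0 <= fst < 1:
        raise ValueError("FST must lie in [0, 1)")
    return fst / (1 - fst)


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical pairwise values for five pairs of sampling locations.
distance_km = [12.0, 35.0, 60.0, 88.0, 120.0]
pairwise_fst = [0.005, 0.012, 0.020, 0.031, 0.045]

linearized = [linearize_fst(f) for f in pairwise_fst]
r = pearson_r(distance_km, linearized)
print(f"Correlation between distance and linearized FST: r = {r:.3f}")
```

A positive r in such a comparison is the pattern expected under isolation by distance.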

Contribution to knowledge
1. Understand the roles of landscapes in shaping the population genetic structure of lotic fish populations.
2. Provide genetic data useful for the conservation and genetic management of G. cambodgiensis populations.
3. Provide a baseline for existing genetic diversity of G. cambodgiensis.
4. Contribute to a landscape genetics model that can predict the formation of population structure in fish species with similar life history.
5. Elucidate possible migratory patterns of G. cambodgiensis.

Scope of study
1. This research focused on two geographic scales in the upper Nan River basin. At the basin level (Tier I), I analyzed G. cambodgiensis samples from eight major tributaries of the upper Nan River drainage basin. The results revealed overall genetic variation and basin-level landscape impacts. At the sub-basin scale (Tier II), I analyzed five or six populations from the Nam Wa sub-basin, one of the largest sub-basins of the upper Nan River. I tested for the effects of various landscape characteristics as well as physical barriers on the genetic variation of G. cambodgiensis.
2. The five environmental variables are (1) distance between sampling locations, (2) elevation, (3) stream order, (4) the presence of physical barriers (i.e., waterfalls or dams, where present), and (5) land use types (i.e., agricultural and forested areas, as a surrogate for habitat fragmentation).
3. The selected species for this study was G. cambodgiensis (Tirant, 1884). This species is widely distributed throughout the Nan River basin and can be collected in the proposed sampling locations. The samples were collected during the non-spawning season to maximize the opportunity to detect existing genetic differentiation.
4. I analyzed 5-11 microsatellite loci in 35-100 individuals per location (Table 1-1).
5. The study explicitly integrated genetic data (i.e., descriptive genetic variation, gene flow and genetic distance measures) and environmental variables (i.e., geographic distance, the presence of barriers, and different types of land use) using an integrated framework.


Table 1-1 Summary of the scope of study, sampling locations, number of loci examined and genotyping techniques.

Geographic scale             Sampling locations                                  Number of loci   Genotyping technique
Tier I (a basin level)       8 tributaries of upper Nan River basin: Meed, Kon,  11               Fluorescent fragment analysis
                             Pua, Yao, Yang, Sa, Wa, Haeng
Tier II (a sub-basin level)  5 locations along the Wa River, Nam Wa sub-basin:   10               Fluorescent fragment analysis
                             SP, Pha, NW, HR, NS
                             3 barriers in the Wa River: Sapun Waterfall,        5                Silver staining
                             Nakham Dam, Sawanua Dam


CHAPTER 2 LITERATURE REVIEWS

This chapter reviews existing knowledge on how geographic and biological factors affect genetic variation in natural populations of stream fish species as well as some background information of the study area and the fish species (G. cambodgiensis).

Genetics and Population

1. Genetic diversity

Genetic diversity can be described in terms of within- and among-population genetic diversity. Within-population genetic diversity is the variety of alleles and genotypes present in the group under study (population, species or group of species). Genetic diversity in a population is generated by mutation or introduced by gene flow, and it is the raw material required for adaptive evolutionary change (Frankham et al., 2010). Among-population genetic diversity is the degree to which allele frequencies vary among populations; it can be determined by genetic divergence measures, such as genetic distances. Processes that shape genetic diversity within populations include genetic drift, natural selection, gene flow and mutation. Gene flow is a key process affecting among-population genetic diversity.

2. Genetic processes

2.1 Genetic drift

Genetic drift is the change in allele frequency that results from the random sampling of gametes from generation to generation in a finite population (Hedrick, 2011). Genetic drift reduces within-population genetic variation in two ways: heterozygosity is reduced and alleles are lost. Furthermore, genetic drift occurring independently in each population leads to genetic divergence between populations (Halliburton, 2004).

2.2 Gene flow

Gene flow is the transfer of genetic material from one population to another following the dispersal and subsequent reproduction of individuals, propagules or gametes (Freeland, 2005). Gene flow can profoundly influence population size, genetic diversity, local adaptation and ultimately speciation (Freeland, 2005). Two immediate outcomes of gene flow are an increase in genetic diversity in the receiving population and a reduction of genetic divergence among populations. In the absence of gene flow, conspecific populations will generally diverge from one another as a result of genetic drift, natural selection and mutation (Freeland, 2005). On the other hand, when the amount of gene flow between groups is high, gene flow may homogenize genetic variation among groups. Five common gene flow models are (A) mainland-island (or source-sink), (B) island structure, (C) linear stepping-stone structure, (D) two-dimensional stepping-stone structure and (E) metapopulation (Frankham et al., 2010, Figure 2-1).

Figure 2-1 Five different population genetic structures

Note: Five different population genetic structure models: (A) Mainland-island (or source-sink), (B) Island structure, (C) Linear stepping-stone structure, (D) Two-dimensional stepping-stone structure and (E) Metapopulation (Frankham et al., 2010)

1. The mainland-island (or source-sink) model refers to a situation where the "mainland" (source) provides all the input to the islands (sinks).
2. The island structure occurs when migration/ gene flow is equal among equal-sized islands (demes).
3. The linear stepping-stone model refers to exchange between adjacent neighbouring demes.
4. The two-dimensional stepping-stone model adds another direction to the linear stepping-stone model.
5. The metapopulation refers to a structure where gene flow is more dynamic: small demes are frequently founded or go extinct (symbolized by solid and empty circles, respectively). Gene flow occurs between adjacent demes. To maintain this structure, the connectivity between demes is important (Frankham et al., 2010).

2.3 Natural selection

Natural selection is the differential contribution of genotypes to the next generation due to differences in survival and reproduction. Natural selection acts on allele and genotype frequencies and interacts with genetic drift; consequently, actions taken to reduce genetic drift will also reduce the potential for natural selection (Frankham et al., 2010). Natural selection is an important mechanism that generates adaptation of organisms to their environments.
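
The opposing effects of genetic drift and gene flow described in this section can be illustrated with a small simulation. The sketch below is a simplified Wright-Fisher model of one biallelic locus in several equal-sized demes exchanging migrants under an island structure; it is an illustration under stated assumptions, not an analysis used in this study. All parameter values (number of demes, deme size, migration rate) and function names are hypothetical, and divergence is summarized with FST computed as (HT - HS)/HT from expected heterozygosities.

```python
import random


def simulate_demes(n_demes=4, pop_size=50, generations=200,
                   migration=0.0, p0=0.5, seed=1):
    """Simulate allele frequencies at one biallelic locus in several demes.

    Each generation has two steps:
    1. gene flow: every deme receives a fraction `migration` of its gametes
       from a common migrant pool (island structure);
    2. genetic drift: 2N gametes are sampled at random to form the next
       generation of each deme.
    """
    rng = random.Random(seed)
    freqs = [p0] * n_demes
    for _ in range(generations):
        migrant_pool = sum(freqs) / n_demes
        next_freqs = []
        for p in freqs:
            p_after_migration = (1 - migration) * p + migration * migrant_pool
            copies = sum(1 for _ in range(2 * pop_size)
                         if rng.random() < p_after_migration)
            next_freqs.append(copies / (2 * pop_size))
        freqs = next_freqs
    return freqs


def fst(freqs):
    """FST = (HT - HS) / HT for a biallelic locus across demes."""
    mean_p = sum(freqs) / len(freqs)
    h_total = 2 * mean_p * (1 - mean_p)
    h_within = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return 0.0 if h_total == 0 else (h_total - h_within) / h_total


if __name__ == "__main__":
    for m in (0.0, 0.01, 0.1):
        final = simulate_demes(migration=m)
        print(f"m = {m}: frequencies = {final}, FST = {fst(final):.3f}")
```

In runs of this kind, isolated demes typically drift toward fixation of different alleles, whereas higher migration rates tend to keep allele frequencies similar among demes.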

The formation of population structure Population genetic structure refers to genetic differences among populations or subpopulations (demes). Population genetic structure may exist for several different evolutionary reasons. The effect of gene flow is to keep the allele frequencies in different subpopulations similar. However, if the subpopulation are finite in size, then genetic drift may result in random differences among them, even with gene flow (Hedrick, 2011). The formation of population genetic structure may take place on several different spatial scales. Within a drainage, there may be separate groups that have a substantial amount of genetic exchange between them. On a larger scale, there may be genetic exchange between adjacent drainages, but in lesser amounts than between the groups within the drainage. On an even larger scale, there may be population in quite separated drainages that presumably have little direct exchange but may share some genetic history (Hedrick, 2011). Habitat fragmentation includes two processes, a reduction in total habitat area and creation of separate ‘island’ patches from a larger continuous distribution. 10

These processes lead to overall reductions in population size and reduce migration (gene flow) among patches. The deleterious consequences of reduced population size on genetic diversity are inbreeding depression and increased extinction risk (Frankham et al., 2010). As a result of restricted gene flow among patches, fragmented populations may become genetically different. For example, artificial dams led to genetic differences between white-spotted charr (Salvelinus leucomaenis) populations above and below the dams in the Haraki, Hitozuminai and Ken-ichi rivers in the southern region of Hokkaido, Japan (Yamamoto et al., 2004).
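Genetic differences among subpopulations of the kind described above are commonly summarised with the fixation index FST, which is reported throughout the studies reviewed in this chapter. The following is a minimal sketch, assuming a single locus, two equally sized demes and known allele frequencies; the function names and all frequencies are hypothetical and serve only to illustrate Wright's definition FST = (HT - HS) / HT.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity at one locus: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def fst_two_demes(freqs_a, freqs_b):
    """Wright's F_ST = (H_T - H_S) / H_T for two demes of equal size (an assumption)."""
    # H_S: mean expected heterozygosity within demes
    h_s = (expected_heterozygosity(freqs_a) + expected_heterozygosity(freqs_b)) / 2.0
    # H_T: expected heterozygosity of the pooled population
    pooled = [(pa + pb) / 2.0 for pa, pb in zip(freqs_a, freqs_b)]
    h_t = expected_heterozygosity(pooled)
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


# Hypothetical allele frequencies, invented for illustration only
connected = fst_two_demes([0.5, 0.5], [0.6, 0.4])   # demes kept similar by gene flow
isolated = fst_two_demes([0.9, 0.1], [0.1, 0.9])    # demes that have drifted apart

print(f"F_ST, connected demes: {connected:.3f}")
print(f"F_ST, isolated demes:  {isolated:.3f}")
```

In this sketch, demes with similar allele frequencies give an FST close to zero, whereas demes with strongly diverged frequencies give a high value. Published studies usually rely on sample-size-corrected estimators across many loci rather than this textbook form.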

Microsatellite loci
Microsatellites, or simple sequence repeats (SSRs), are short tandemly repeated sequence motifs consisting of repeated units of 1-6 base pairs (bp). They are highly abundant in the genome. Microsatellite polymorphism results from changes in repeat number caused by an intramolecular mutation mechanism called DNA slippage. The most common mutation mode is the change of a single repeat unit (Schlotterer, 1998). The microsatellite mutation rate is estimated at 10^-2 to 10^-6 per locus per generation (Chistiakov, Hellemans, & Volckaert, 2006). Microsatellites are highly polymorphic DNA markers with discrete loci and codominant alleles. Microsatellite fragments are often small (< 500 bp) and can be easily amplified with the polymerase chain reaction. This type of marker is selectively neutral, allowing for the accumulation of genetic diversity. All of these desirable features provide a foundation for their successful application in a wide range of fundamental and applied fields of biology, such as population and conservation genetics (Chistiakov et al., 2006). In fish population genetics (Castric, Bonny, & Bernatchez, 2001; Neville, Dunham, Rosenberger, Umek, & Nelson, 2009; Yamamoto et al., 2004), microsatellites are useful for characterizing genetic diversity within and among populations (e.g., Neville et al., 2006; Waits, Bagley, Blum, Mccormick, & Lazorchak, 2008). Moreover, microsatellites are the most commonly used markers in landscape genetics, especially for describing gene flow, where selectively neutral markers are informative (Cook et al., 2011).

Upper Nan River, Nan Province
In this study, I examined fish populations in the upper Nan River basin, located in Nan Province, Thailand. The climate of the upper Nan River is tropical monsoon. The Nan River is the longest and largest tributary of the four tributary systems within the Chao Phraya River (740 km and 34,682.04 km2). The four major tributaries are the Ping, Wang, Yom and Nan rivers. The Nan River drainage basin is divided by a large reservoir, Sirikit Dam, into upper and lower portions. The Nan River headwater is 1,837 m above mean sea level (MSL) and is located on the Phu Wae mountain, Range, Chaloem Phra Kiat district. Most of the area in Nan Province is mountainous, with elevation ranging from 400 to 1,000 m above MSL, especially along the border of the Nan Province, Thailand and , . At the Thai-Lao border, the Luang Prabang Range divides the Mekong and the Nan River. The upper Nan River consists of nine sub-basins, namely the upper part of Mae Nam Nan, Nam Yao (1), the second part of Mae Nam Nan, Nam Yao (2), Nam Samun, the third part of Mae Nam Nan, Nam Sa, Nam Wa and Nam Haeng. Each sub-basin contains one to two major tributary streams (Figure 2-2, Lothongkham, 2008). This study focused on eight major tributaries (stream order greater than 3): the Meed, Kon, Pua, Yao1, Yang, Sa, Wa, and Haeng rivers.


Figure 2-2 River network of the upper Nan River, Nan Province.

Note Black lines indicate the boundary of the sub-basin for each major tributary stream (stream order > 3). The sub-basins include the upper part of Mae Nam Nan, Nam Yao (1), the second part of Mae Nam Nan, Nam Yao (2), Nam Samun, the third part of Mae Nam Nan, Nam Sa, Nam Wa, and Nam Haeng.

Fish habitats in this river are highly heterogeneous and consequently allow for high biological diversity (Kottelat & Whiten, 1996). Lothongkham (2008) reported more than 108 species in the upper Nan River. Moreover, the fish species diversity already described in smaller tributaries ranges from 40-60 (43 species in the Wa River, 45 species in the Haeng River, 59 species in the Yao River, 52 species in the Kon River and 60 species in the Sa River and lower Wa River) (Lothongkham, 2015). Recently, anthropogenic activities have converted suitable habitat patches in the Nan River, i.e., forested areas, into agricultural areas or a hydroelectric dam. Land use change from forest to agriculture reduces pristine habitats and increases chemical contamination due to the use of algaecides and pesticides in agriculture. The recently constructed hydroelectric dam on the Wa River, the biggest tributary of the Nan River, may act as a barrier to gene flow between fish populations upstream and downstream of the dam. The Nam Wa dam was constructed in 2012. These habitat changes in the upper Nan River present a unique opportunity to examine the complex interaction between fish populations and genetic processes imposed by various landscape features.

The model species Garra cambodgiensis (Tirant, 1884)
The cyprinid genus Garra includes bottom-dwelling fishes usually found in fast-flowing streams. The genus is widely distributed from southern China, across South East Asia, India and the Middle East to northern and central Africa. The genus comprises more than 60 species, of which about 40 are from Asia (Kullander & Fang, 2004). Most species of the genus Garra occur in the swift-flowing waters of hilly rivers and streams, where they commonly adhere to the surface of underwater gravel and rocky substrates. There is limited information on their migration patterns. However, some species show habitat preferences at different life stages and possible movements to preferred locations. For example, in the sucker head (Garra gotyla gotyla) from rivers in Nepal, Ranjan, Herwig, Subodh, and Michael (2005) reported only certain size classes in some rivers, namely the Arungkhola and Karrakhola rivers, each of which contained either adults or juveniles only. On the other hand, some other rivers, namely the Aandhikhola and Tinau, contained all size classes including breeders.

The stone sucker or stone lapping minnow (Garra cambodgiensis) is a small-bodied fish (4-10 cm). G. cambodgiensis is widely distributed in the Chao Phraya River, Mekong River and Meklong basins (Lothongkham, 2008). It can be found in almost all tributaries of the Nan River. This species occurs in riffle areas less than 1 m deep and requires highly oxygenated, cool (water temperature 23 °C) and clear water. The life history related to the dispersal ability of this species has not been well documented, but I have observed adult fish moving with floods in the rainy season to mate. My personal observations suggest that G. cambodgiensis reproduces during the rainy season (May-August), when mature males and females usually congregate in the more stagnant portions of the stream (e.g., rice fields or small tributaries). The semibuoyant eggs and fry then drift with the stream flow throughout the stream network. Currently, the abundance of this species has been declining in almost all tributaries as a result of anthropogenic activities, such as fishing and habitat alteration. It is an important commodity for local communities in the upper Nan River. Its relatively high price (500 Baht/kg) leads to high fishing pressure and habitat changes (Jaisuk, Lothongkham, Keereelang, & Sriyam, 2014). In addition, the expansion of agricultural areas has increased water pollution, especially from algaecides or pesticides applied to agricultural fields. These activities negatively affect the abundance of local fish species, including G. cambodgiensis (Lothongkham, 2015).

Figure 2-3 Morphology of adult Garra cambodgiensis

In summary, the Nan River has been changed by many human activities, which are affecting local fish diversity. Because of its wide distribution, G. cambodgiensis is a good model species for understanding the effects of altered geographical structure, reduced population size and varying habitat quality on fish populations in the Nan River. Moreover, knowledge of this species will be useful for fishery resource management, for example, breeding programs and broodstock management for the restoration of wild and hatchery populations.

Influence of gene flow barriers on the genetic structure of lotic fish populations
Geographic and biological barriers to gene flow can lead to genetic divergence of fish populations. Geographic barriers include pairwise geographic distance, connectivity between water bodies (e.g., stream network structure) and physical barriers (e.g., waterfalls, dams). Biological barriers include life history, dispersal abilities and differential spawning behaviors of populations.

1. Geographic barriers
1.1 Geographic distance
For several lotic fish species, pairwise geographic distance is one of the most likely explanations for genetic divergence among populations, especially at a large geographic scale. Geographic distance appears to correlate positively with genetic distance (Table 2-1).

Genetic differentiation is often detected at the drainage basin level. For example, Neville et al. (2006) examined Lahontan cutthroat trout (Oncorhynchus clarkii henshawi) populations from the western and eastern drainages of the Marys River, in the western Great Basin desert of western North America, covering an area of about 500 km2. Sites represented locations in the headwater and confluence portions of most tributaries and throughout the main-stem river, with 10 sites from the western and 6 sites from the eastern drainage basin. The FST values indicated significant differentiation between drainage basins (0.07 to 0.32). Similarly, Barson et al. (2009) examined 11 populations of guppy (Poecilia reticulata) in the Northern Range of Trinidad, from 4 tributaries of the Caroni drainage basin and 2 tributaries of the Northern drainage basin. They found that the FST values were significant for all pairwise comparisons between drainage basins (0.169 to 0.806). Beneteau et al. (2009) examined 26 samples of greenside darter (Etheostoma blennioides) from 4 drainage basins in southwestern Ontario, collected in 2005 and 2006. For both years, the genetic differentiation among drainage basins was significant (FST = 0.079 and 0.055 for the 2005 and 2006 samples, respectively). Moreover, pairwise genetic divergence was positively correlated with geographic distance, as indicated by the Mantel test of isolation by distance (P = 0.010, R2 = 0.62 for the 2005 samples and P = 0.02, R2 = 0.41 for the 2006 samples).

Within a drainage, researchers have often detected relationships between pairwise population genetic difference and geographic distance, especially in a somewhat linear river/stream system. For example, Castric et al. (2001) examined the population genetic structure of brook charr (Salvelinus fontinalis) among 10 samples within the Penobscot drainage basin, Maine, USA. The genetic divergence (expressed as FST/(1-FST)) among populations was marginally correlated with the sum of waterway distances (Mantel test, Z = 5458, P = 0.0556, slope of log FST/(1-FST) against log distance = 0.33). The smallest geographic distance at which population divergence was detected was 100 km. Waits et al. (2008) examined central stoneroller (Campostoma anomalum) populations in the Mill Creek drainage basin, covering 274 km2 in southwestern Ohio, USA. They detected a weak, but significant, positive correlation between genetic distance (measured by pairwise ST estimates) and geographical distance (river-kilometres) (R = 0.4016, one-sided P = 0.0378, slope = 0.0008). The genetic distances appeared bimodal, suggesting that the significance of the isolation by distance test may be due to slight substructure between the upper and lower catchments, located 45 km apart. Leclerc et al. (2008) examined yellow perch (Perca flavescens) from 16 locations in a large fluvial ecosystem, distributed throughout a 310 km portion of the Saint Lawrence River. Pairwise geographic distance between samples ranged from 50 to 310 km. The P. flavescens populations were genetically different across the whole study area (global FST = 0.039), and the genetic distances were correlated with the geographic distances (Mantel test, R = 0.527, P < 0.001). Similarly, Barson et al. (2009) found genetic differences among populations of P. reticulata within two drainages, the Caroni and Northern drainage basins, in the Northern Range of Trinidad (within-drainage FST = 0.013 to 0.927).

Meeuwig, Guy, Kalinowski, and Fredenberg (2010) examined 16 samples of bull trout (Salvelinus confluentus) from lakes in Glacier National Park, Montana. They found an effect of waterway distance on genetic differentiation within a tributary: for every 1 km increase in pairwise geographic distance, FST values increased by 0.002 to 0.003. Lamphere and Blum (2012) examined 8 samples of mottled sculpin (Cottus bairdi) from the Nantahala River (North Carolina, USA). Genetic differentiation increased with geographic distance, with FST values ranging from 0.003 to 0.071 (Mantel test, R2 = 0.844, P < 0.0001). The FST values were significant between sites at least 2.01 km apart (FST = 0.018) and between the two most distant sites, 5.37 km apart (FST = 0.059). Sterling et al. (2012) examined Yazoo darter (Etheostoma raneyi) (Percidae) samples from 17 sites representing 5 tributaries in the north central portion of the Mississippi River. For two of the five tributaries, the pairwise FST values were significant between sites at least 5 km apart, and the differentiation patterns can be explained by isolation by distance (Mantel test, R = 0.89, P = 0.0001 for the Yocona River drainage; R = 0.5, P = 0.015 for the Tallahatchie River drainage). For samples within the Yocona River drainage (1,014 km2), FST ranged from 0.17 to 0.21, and for those within the Tallahatchie River drainage (2,755 km2), FST values ranged from 0.034 to 0.123.
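Most of the studies above test isolation by distance with a Mantel test, often on the linearised differentiation FST/(1-FST). The following is a minimal sketch of a one-sided permutation Mantel test, assuming five hypothetical sites along a linear river; the site positions, the FST values and the function names are invented for illustration and are not taken from any of the cited studies.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def upper_triangle(matrix, order):
    """Upper-triangle entries of a square matrix after relabelling sites by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(gen, geo, permutations=999, seed=1):
    """One-sided Mantel test: returns (observed r, permutation P-value)."""
    rng = random.Random(seed)
    identity = list(range(len(gen)))
    geo_vec = upper_triangle(geo, identity)
    r_obs = pearson(upper_triangle(gen, identity), geo_vec)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute site labels of the genetic matrix only
        if pearson(upper_triangle(gen, order), geo_vec) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)


# Hypothetical sites (river km) and F_ST values rising with waterway distance
positions_km = [0, 10, 25, 45, 80]
geo = [[abs(a - b) for b in positions_km] for a in positions_km]
fst = [[0.002 * d for d in row] for row in geo]
gen = [[f / (1.0 - f) for f in row] for row in fst]  # linearised F_ST / (1 - F_ST)

r_obs, p_value = mantel(gen, geo)
print(f"Mantel r = {r_obs:.3f}, one-sided P = {p_value:.3f}")
```

Permuting site labels, rather than individual matrix entries, keeps the dependence among pairwise distances intact. With only five sites there are 120 possible relabellings, so the smallest attainable P-value is limited, which is one reason studies with few sampling sites report marginal results.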

Table 2-1 Effects of geographic distance on genetic population structuring of lotic fish species

Species               Study site                      Geographic scale   Genetic differentiation                  Ref.
                                                      or distance        FST           Mantel test

Among drainage basins
O. clarkii henshawi   2 drainage basins:              500 km2            0.07-0.32     -                          (Neville et al., 2006)
                      10 sites, western basin;
                      6 sites, eastern basin
P. reticulata         2 drainage basins, 11 samples   -                  0.169-0.806   -                          (Barson et al., 2009)
E. blennioides        4 drainage basins, 26 samples   -                  -             -                          (Beneteau et al., 2009)
                        2005                          -                  0.079         R2 = 0.62, P = 0.010
                        2006                          -                  0.055         R2 = 0.41, P = 0.02

Within a drainage basin
S. fontinalis         10 samples                      100-400 km         -             Z = 5458, P = 0.0556       (Castric et al., 2001)
P. flavescens         16 samples                      50-310 km          0.039         R = 0.527, P < 0.001       (Leclerc et al., 2008)
P. reticulata         2 drainage basins, 11 samples   -                  0.013-0.927   -                          (Barson et al., 2009)
C. bairdi             8 samples                       2.01-5.37 km       0.018-0.059   R2 = 0.844, P = 0.0001     (Lamphere & Blum, 2012)
E. raneyi             2 drainage basins, 17 samples                                                               (Sterling et al., 2012)
                        Yocona R. drainage            1,014 km2          0.17-0.21     R = 0.89, P = 0.0001
                        Tallahatchie R. drainage      2,755 km2          0.034-0.123   R = 0.5, P = 0.015

1.2 Connectivity of water bodies
In some instances, geographic distance alone cannot explain genetic differentiation among populations, especially those within a more complex stream/river network or at a smaller geographic scale. Other geographic factors influencing genetic divergence may include the degree of water body connectivity (through a stream network), altitude and habitat fragmentation. For example, Neville et al. (2006) found an effect of habitat connectivity, even at a very small geographic scale, on the population genetic structure of O. clarkii henshawi in the eastern and western drainage basins of the Marys River, USA. Patterns of differentiation among drainages within this river system were not uniform. These differences were partially due to differences in configuration among the stream networks and to the life history characteristics of the fish species. Even though most sites were 3 to 5 km apart, most sample pairs in the eastern and western drainage basins of the Marys River were significantly different. The two exceptions were two pairs of upstream and downstream samples (about 5 km apart) in two short tributary streams, Wildcat (WC) and Draw Creeks (DC), in the eastern drainage basin. Interestingly, in the western drainage basin, the downstream sites of adjacent tributary streams, East Marys River (EMR2) and Marys River Basin Creek (MRBC2), were not genetically different from each other, but these sites were different from their respective upstream sites (EMR1, MRBC1) (about 3 km apart) (Figure 2-4).


Figure 2-4 Map of the Marys River

Note Map of the Marys River and sampling areas of O. clarkii henshawi in Neville et al. (2006). Sample sites are indicated by bold colored lines. Pairs of samples with similar colors had FST values that were not statistically different from each other, whereas samples with different colors were significantly differentiated from all other samples.

Barson et al. (2009) reported different degrees of genetic differentiation among P. reticulata populations at the upstream and downstream sites in the Caroni tributaries. The FST values between upland populations of the Caroni drainage basin were particularly high (0.443-0.927), reflecting a high level of genetic differentiation and population isolation. On the other hand, the level of genetic differentiation between lowland populations was considerably lower (0.028-0.288); these populations appeared to have ongoing migration within and among tributaries.

Genetic analyses indicated that lowland populations in the Caroni drainage basin received immigrants from the lowland populations of other tributaries in the drainage basin, particularly in a downstream direction.

2. Physical barriers
Barriers imposed by the landscape or by man-made structures can restrict the dispersal of individuals and, subsequently, restrict gene flow. The extent of genetic divergence among populations depends on the barrier size, the separation period and the size of the separated populations. A small barrier or a short separation time may not lead to genetic divergence (Table 2-2).

2.1 Waterfall
Natural waterfalls can greatly restrict genetic exchange between populations upstream and downstream of the barrier. In O. clarkii henshawi in the western drainage basin of the Marys River, Neville et al. (2006) revealed low levels of gene flow in two samples above waterfalls compared to other sites downstream of the waterfalls or in tributaries without a waterfall (Figure 2-4). Rates of gene flow, estimated by MIGRATE, were low at two headwater sites, the western Marys River (WMR) (immigration rate = 49.87, emigration rate = 88.12) and Upper Marys River Basin Creek (MRBC1) (immigration rate = 56.00, emigration rate = 49.03). In contrast, gene flow rates were much higher at other sites without a waterfall, such as Upper East Marys River (EMR1) (immigration rate = 123.69, emigration rate = 88.68). As a consequence, the two headwater sites were genetically different from the downstream sites (pairwise FST = 0.01-0.32).

2.2 Dam
Yamamoto et al. (2004) examined 11 samples of S. leucomaenis from the Haraki, Hitozuminai and Ken-ichi rivers in the southern region of Hokkaido, Japan. All of these rivers have small dams (2-23 m high), which were constructed between 1963 and 1991. In all rivers, the populations above the dam were genetically different from those below the dam. The FST values between the populations below and above the dam were 0.023, 0.160 and 0.219 in the Hitozuminai, Haraki and Ken-ichi rivers, respectively. Likewise, dams affected the P. flavescens collected from 16 locations along the Saint Lawrence River. At the entrance of Lake Saint-Pierre (the first portion of the Saint Lawrence River), there are five small dams, which were constructed during the 1930s. The presence of dams was important for restricting gene flow and for the formation of genetic structure in these populations; the geographical barriers were significant and separated four genetically distinct populations of P. flavescens (Leclerc et al., 2008). In some instances, a smaller dam may not effectively isolate populations. Lamphere and Blum (2012) examined C. bairdi populations above and below a 1 m high dam on the Nantahala River (North Carolina, USA). The C. bairdi populations above and below the dam were not genetically different.

Table 2-2 Effects of physical barriers on genetic population structuring of lotic fish species

Species               Study site                  Result                                     Ref.

Waterfall
O. clarkii henshawi   Western Marys drainage      Above waterfall:                           (Neville et al., 2006)
                      basin of the Marys River      WMR: immigration rate 49.87,
                                                         emigration rate 88.12
                                                    MRBC1: immigration rate 56.00,
                                                           emigration rate 49.03
                                                  Without the waterfall:
                                                    EMR1: immigration rate 123.69,
                                                          emigration rate 88.68

Dam
S. leucomaenis        3 rivers of Hokkaido,       In all rivers, FST values were highly      (Yamamoto et al., 2004)
                      Japan                       significant for populations above and
                                                  below the dam.
                      - Hitozuminai River         FST = 0.023
                      - Haraki River              FST = 0.160
                      - Ken-ichi River            FST = 0.219

3. Biological barriers
Biological characteristics of a species can greatly affect population dynamics in a landscape and gene flow among populations. Biological factors that may affect gene flow include (1) life history characteristics, especially those relating to dispersal ability, and (2) homing, spawning behavior and time of spawning.

3.1 Life history characteristics
The migration pattern, dispersal ability and spawning behavior of a fish species can facilitate or restrict gene flow. Species with limited dispersal or with homing behavior tend to have stronger population structure than widely dispersed species. For example, in a non-migratory trout species, O. clarkii henshawi, Neville et al. (2006) detected genetic differences between populations located in close proximity (3-5 km) in some streams in the Great Basin desert of the western US. Likewise, C. bairdi populations located only 2.01 km apart were genetically different (Lamphere & Blum, 2012). Sterling et al. (2012) examined 17 samples of E. raneyi, a benthic headwater fish restricted to tributaries in the north central portion of the Mississippi River, USA. They found a significant isolation by distance pattern reflecting the dispersal ability of this species. Sites located within less than 5 km of each other were not genetically different. In contrast, in a widely dispersed species, M. mogurnda, Cook et al. (2011) detected genetic differentiation only between samples from sites located more than 50 km apart (at low elevation): 17 samples throughout the Daly River, Australia (about 300 km long) were genetically homogeneous. However, landscape characteristics (i.e., steep slope and high water velocity) can alter the pattern of genetic differentiation in this species. Samples separated by a high stream gradient can be genetically different even when they are less than 20 km apart.
3.2 Homing, spawning behavior and time of spawning
Some freshwater and anadromous fish species show homing behavior, in which adults return to their natal stream for reproduction. For instance, the sutchi catfish (Pangasianodon hypophthalmus) in Cambodia migrates up to 1,230 km along the Lower Mekong River to feed and reproduce. Adults migrate upstream to reproduce at specific spawning grounds on the Lower Mekong River. So, Maes and Volckaert (2006) analyzed samples of P. hypophthalmus collected from 10 geographic locations (3 and 7 locations representing spawning and feeding grounds, respectively). The results showed that P. hypophthalmus populations were genetically diverse. Bayesian analyses suggested three genetic clusters for all foraging (juvenile) and spawning (adult) individuals collected. Moreover, the samples collected at the spawning sites deviated from Hardy-Weinberg expectations, suggesting population admixture (with various proportions of the three genetic clusters at each spawning ground). Similarly, in walleye (Sander vitreus), genetic data suggested that this species may return to specific spawning sites within a lake. Stepien and Faber (1998) detected significant differences in genotypic frequencies among all samples collected from 6 spawning sites in Lake Erie and 4 spawning sites in St. Clair Lake, USA. This pattern was consistent across a larger scale. Stepien, Murphy, Lohner, Villet, and Haponski (2009) detected significant differences among walleye from 26 spawning sites in drainage basins across the North American Great Lakes, Lake Winnipeg, the upper Mississippi River, the Ohio River and Mobile Bay of the Gulf Coast. Furthermore, this genetic differentiation pattern in walleye is consistent with predictions based on the species' natal homing pattern; this behavior reduces gene flow among sites. The timing of spawning of a population usually correlates with environmental conditions and is partially controlled by genetics.
Rainbow trout (Oncorhynchus mykiss) from North Taupo and South Taupo have different spawning periods: the North Taupo rainbow trout spawn throughout the year, whereas the South Taupo group spawns for a period of 6 months in a year (June-November) (Hamer, 2007). Populations with different spawning times usually have limited gene exchange between them. For example, McGregor et al. (1998) tested the influence of return timing (early or late), spawning site (intertidal or upstream) and sample year (samples collected in 1979, 1981, and 1983) on the genetic structure of pink salmon (Oncorhynchus gorbuscha) in Auke Creek, Alaska. The analysis revealed that both timing (P ≤ 0.001) and year (P ≤ 0.01) significantly affected allele frequencies, whereas spawning site did not.

Genetic diversity within isolated populations
Short-term genetic changes in isolated populations are influenced by population size and gene flow. A small isolated population often loses genetic variation faster than a large population, via genetic drift. However, the genetic outcome is the net effect of two opposing forces, genetic drift and gene flow. In fish, several studies have shown low genetic diversity in small, isolated populations with restricted gene flow (Table 2-3).

In S. fontinalis from Maine, USA, Castric et al. (2001) reported that the mean expected heterozygosity (He) was negatively correlated with the altitude of the location (P = 0.0381, R = -0.3804). Samples at the lowest altitude, 68 m above sea level, had He values ranging from 0.5 to 0.75, whereas those at the highest altitude, 599 m, had He values ranging from 0.4 to 0.6. The lower He in populations at higher elevation reflects their geographic isolation. Neville et al. (2006) reported that O. clarkii henshawi populations with limited connectivity or with poor-quality or smaller habitat had low allelic richness (Ar) and He. The Ar within samples ranged from 3.38 to 6.09, while average He ranged from 0.42 to 0.56. In another study, upland P. reticulata populations of the Caroni drainage showed very low levels of genetic diversity (alleles per locus (A) = 1.5-2.53; He = 0.023-0.252) compared to the lowland populations (A = 8.24-11.26; He = 0.6105-0.719). Effective population size (Ne) estimates for these populations were 244 and 910 for the upland and lowland populations, respectively (Barson et al., 2009). Lamphere and Blum (2012) examined eight C. bairdi populations in the Nantahala River (North Carolina, USA). The downstream populations had greater He and number of alleles (He = 0.556, total number of alleles = 116) than the upstream populations (He = 0.647, total number of alleles = 96). Moreover, Sterling et al. (2012) examined E. raneyi populations from four tributary drainages (Tippah River, Cypress Creek, Yocona River and Otoucalofa Creek) of the north central portion of the Mississippi River. Populations in a smaller drainage, the Yocona River drainage (1,014 km2), had lower genetic diversity (mean Ar = 4.66 and He = 0.608) than those in a larger drainage, the Tallahatchie River drainage, covering 2,755 km2 (Ar = 6.89 and He = 0.766). The differences in the amount of genetic diversity can be explained by the amount of suitable habitat available to darters in the two drainage basins.

In addition, fish populations upstream of dams, waterfalls or culverts can also have low genetic diversity due to small population size and restricted gene flow. For example, Yamamoto et al. (2004) examined S. leucomaenis in three rivers in the southern region of Hokkaido, Japan, and found that the populations upstream of artificial dams had lower genetic diversity than their downstream counterparts in all rivers examined. Neville et al. (2006) reported that the O. clarkii henshawi populations above the waterfall in the western Marys River had lower gene flow than the sites below.
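The faster loss of variation in small isolated populations noted at the start of this section follows from the standard drift expectation H_t = H_0 (1 - 1/(2Ne))^t for a closed population. The following is a minimal sketch of that relationship, using the two Ne estimates reported by Barson et al. (2009) as inputs; the starting heterozygosity, the number of generations and the assumption of no gene flow or mutation are illustrative choices, not values from that study.

```python
def heterozygosity_after(h0, ne, generations):
    """Expected heterozygosity after drift in a closed population of effective size ne."""
    return h0 * (1.0 - 1.0 / (2.0 * ne)) ** generations


H0 = 0.70          # hypothetical starting heterozygosity
GENERATIONS = 100  # hypothetical period of isolation

# Effective sizes taken from the upland and lowland estimates cited in the text
small = heterozygosity_after(H0, ne=244, generations=GENERATIONS)
large = heterozygosity_after(H0, ne=910, generations=GENERATIONS)

print(f"Ne = 244: He after {GENERATIONS} generations = {small:.3f}")
print(f"Ne = 910: He after {GENERATIONS} generations = {large:.3f}")
```

Under these assumptions the smaller population retains a visibly smaller share of its starting heterozygosity. Any gene flow into the population would slow the decline, which is the net effect of drift and gene flow described above.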

Table 2-3 Genetic diversity within populations as a consequence of population isolation

Species               Study site           Type of isolation        Results                          Ref.

Geographic barriers
S. fontinalis         Maine, USA           Altitude                 68 m: He = 0.5-0.75              (Castric et al., 2001)
                                                                    599 m: He = 0.4-0.6
O. clarkii henshawi   Marys River          - Low connectivity       Ar = 3.38-6.09                   (Neville et al., 2006)
                                           - Resident               He = 0.42 to 0.56
                                           - Poor quality/small
                                             sized habitat
P. reticulata         Caroni Drainage      Connectivity             Upland populations:              (Barson et al., 2009)
                                                                      A = 1.5-2.53
                                                                      He = 0.023-0.252
                                                                    Lowland populations:
                                                                      A = 8.24-11.26
                                                                      He = 0.6105-0.719
E. raneyi             North central        Connectivity             Yocona River:                    (Sterling et al., 2012)
                      Mississippi                                     Ar = 4.66, He = 0.608
                                                                    Tallahatchie River:
                                                                      Ar = 6.89, He = 0.766

Physical barriers
S. leucomaenis        3 rivers in          Dam                      Hitozuminai River:               (Yamamoto et al., 2004)
                      Hokkaido, Japan                                 Above: A = 6.0, He = 0.63
                                                                      Below: A = 7.2, He = 0.68
                                                                    Haraki River:
                                                                      Above: A = 3.0, He = 0.35
                                                                      Below: A = 8.2, He = 0.71
                                                                    Ken-ichi River:
                                                                      Above: A = 2.6, He = 0.37
                                                                      Below: A = 7.2, He = 0.64
O. clarkii henshawi   Western Marys        Waterfall                Above waterfall (MRBC1):         (Neville et al., 2006)
                      River                                           Immigration rate 56.00
                                                                      Emigration rate 49.03
                                                                    Below waterfall (MRBC2):
                                                                      Immigration rate 102.25
                                                                      Emigration rate 126.16

Note A = Number of alleles, He = Expected heterozygosity, Ar = Allelic richness

Different measures of genetic diversity, such as the number of alleles (A), He and Ar, can vary in their sensitivity to genetic change. Compared to He, allelic diversity is usually more sensitive to drastic genetic change via genetic drift, e.g., bottleneck events (Allendorf & Luikart, 2007). Neville et al. (2009) detected this variation in measurement sensitivity in rainbow trout populations from the Boise and Payette rivers. Allelic diversity measures (i.e., Ar) revealed relationships with both habitat size (Spearman rank correlation, Rs = 0.49, P = 0.0001) and water body connectivity (Kruskal-Wallis test, P = 0.015), whereas He detected only the effect of habitat size (Rs = 0.36, P = 0.007), and the Migration ratio (M-ratio) did not detect these effects.
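The diversity measures compared above can be computed directly from genotype data. The following is a minimal sketch for a single microsatellite locus, assuming diploid genotypes recorded as pairs of allele sizes in bp; the genotypes and function names are hypothetical. He uses the usual small-sample correction, and Ar is obtained by rarefaction to a common number of gene copies so that samples of different sizes can be compared.

```python
from collections import Counter
from math import comb


def allele_counts(genotypes):
    """Count gene copies of each allele across diploid genotypes."""
    return Counter(allele for genotype in genotypes for allele in genotype)


def number_of_alleles(counts):
    """A: number of distinct alleles observed in the sample."""
    return len(counts)


def expected_heterozygosity(counts):
    """He with small-sample correction: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = sum(counts.values())
    homozygosity = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - homozygosity)


def allelic_richness(counts, g):
    """Ar: expected number of alleles in a random subsample of g gene copies."""
    n = sum(counts.values())
    return sum(1.0 - comb(n - c, g) / comb(n, g) for c in counts.values())


# Hypothetical genotypes (allele sizes in bp) at one locus
large_sample = [(150, 154), (150, 158), (154, 162), (158, 158), (150, 166), (162, 154)]
small_sample = [(150, 150), (150, 154), (154, 150)]

large_counts = allele_counts(large_sample)
small_counts = allele_counts(small_sample)
g = min(sum(large_counts.values()), sum(small_counts.values()))  # rarefy to smaller sample

for name, counts in (("large", large_counts), ("small", small_counts)):
    print(name, number_of_alleles(counts),
          round(expected_heterozygosity(counts), 3),
          round(allelic_richness(counts, g), 3))
```

Rarefaction matters because the raw number of alleles rises with sample size, whereas Ar is comparable among samples. Rare alleles contribute little to He but count fully toward A and Ar, which is consistent with allelic measures responding more strongly to bottlenecks.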

Landscape genetics

Landscape genetics explicitly addresses how geographical and environmental features shape genetic variation at both the population and individual levels (Manel, Schwartz, Luikart, & Taberlet, 2003). This discipline differs from conventional conservation genetics in that it explicitly integrates genetic and spatial data. Genetic data include genotypes and allele frequencies. Spatial data include multivariate landscape or environmental data, landscape or habitat maps, and geographic coordinates or Euclidean distances (Hall & Beissinger, 2014). Statistical tools are necessary to integrate genetic data, spatial data and population structure information. Wagner and Fortin (2013) proposed an analytical framework for analyzing relationships between landscape features and genetic differentiation patterns at four levels: node, link, neighborhood and boundary.

1. Node-based methods relate the presence of adaptive genes or the genetic diversity of local populations to environmental site conditions at sampling locations or to patch attributes. They thus address the question of what determines the presence and abundance of alleles, or the genetic diversity, at a spatial location. At the node level, the matrix of alleles (Y) or an aggregate measure of genetic diversity (e.g., Ar) at each sampling location, a, b or c, is related to environmental conditions or landscape features (X) observed at the same location (Figure 2-5A). Nodes of different sizes refer either to different genetic values or habitat patch sizes (Wagner & Fortin, 2013). Multivariate analyses (e.g., PCA) are useful for this kind of analysis.

2. Link-based methods relate pairwise genetic distance between individuals or demes to their landscape distance (e.g., geographic distance, cost distance, presence or number of barriers), hypothesized to be related to the probability of dispersal and migration. They thus address the question of the magnitude of gene flow between two habitats. This analysis level may be particularly relevant for assessing genetic connectivity, testing hypotheses on landscape resistance, and identifying corridors for conservation applications. Figure 2-5B illustrates the analysis at the link level: the genetic distance between pairs of sampling locations ab, ac and bc (DY) is related to distance-based landscape data describing the intervening matrix along each link (DX) (Wagner & Fortin, 2013). Mantel or partial Mantel tests are the most common tools for this kind of analysis (Hall & Beissinger, 2014).

3. Neighborhood-based methods relate genetic diversity of demes, or their genetic differentiation, to attributes of the local landscape context, either within a certain radius around the sampling location or as a function of the neighboring patches, with weights inversely proportional to their landscape distance scaled by the dispersal ability of the organism. At the neighborhood level, the alleles Y (or diversity measure, see above) at sampling location a are related to a patch-level connectivity measure |X| that quantifies local landscape context (Figure 2-5C; Wagner & Fortin, 2013). Neighborhood-based analyses are useful for understanding demographic and metapopulation processes or measuring spatial autocorrelation because they integrate genetic data with characteristics of the landscape surrounding sampling locations (Hall & Beissinger, 2014).

4. Boundary-based methods aim to detect the extent of gene flow and delineate discrete or admixed populations in space (Figure 2-5D). At the boundary level, spatially contiguous, discrete populations a, b and c are inferred from genetic data and overlaid with landscape features to identify barriers to gene flow. This may be achieved by relating spatial rates of change in genetic data, βY, to spatial rates of change in landscape predictors, βX. Alternatively, β may denote between-cluster components of variance in X or Y (Wagner & Fortin, 2013). Boundary-based analyses identify dispersal barriers by measuring the spatial overlap of genetic and landscape or environmental discontinuities on the landscape. In a genetic context, boundaries act as barriers to gene flow, separating panmictic populations (Hall & Beissinger, 2014). Useful analytical tools include spatial Bayesian clustering analyses (Hall & Beissinger, 2014). These methods detect a population boundary by determining the likely number of genetic populations explained by individual genotypes, under Hardy-Weinberg and linkage equilibrium. They require individual genotypes combined with geographic coordinates or Euclidean distances.
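To make the link-based level more concrete, the following is a minimal sketch of a permutation-based Mantel test, which correlates a genetic distance matrix with a landscape distance matrix and assesses significance by jointly permuting the rows and columns of one matrix. The function name `mantel_test`, the five site positions and the simulated genetic distances are hypothetical and serve only as an illustration; they are not data or software from the studies cited above.

```python
import numpy as np


def mantel_test(dist_a, dist_b, n_permutations=999, seed=0):
    """One-sided Mantel test (positive association) for two square distance matrices."""
    a = np.asarray(dist_a, dtype=float)
    b = np.asarray(dist_b, dtype=float)
    n = a.shape[0]
    upper = np.triu_indices(n, k=1)  # each pair of sites counted once
    r_observed = np.corrcoef(a[upper], b[upper])[0, 1]

    rng = np.random.default_rng(seed)
    n_as_extreme = 1  # the observed arrangement counts as one permutation
    for _ in range(n_permutations):
        order = rng.permutation(n)
        a_perm = a[np.ix_(order, order)]  # permute rows and columns together
        r_perm = np.corrcoef(a_perm[upper], b[upper])[0, 1]
        if r_perm >= r_observed:
            n_as_extreme += 1
    return r_observed, n_as_extreme / (n_permutations + 1)


# Hypothetical example: five sites placed along a single stream (positions in km).
positions_km = np.array([0.0, 12.0, 30.0, 55.0, 90.0])
landscape_dist = np.abs(positions_km[:, None] - positions_km[None, :])

# Hypothetical genetic distances that increase with landscape distance, plus small noise.
noise_rng = np.random.default_rng(1)
noise = noise_rng.normal(0.0, 0.005, size=landscape_dist.shape)
noise = (noise + noise.T) / 2.0
np.fill_diagonal(noise, 0.0)
genetic_dist = 0.002 * landscape_dist + noise

r_obs, p_val = mantel_test(genetic_dist, landscape_dist)
print(f"Mantel r = {r_obs:.3f}, one-sided P = {p_val:.3f}")
```

With only five sites there are just 120 distinct orderings, so the attainable P-values are coarse; the sketch is meant to show the mechanics rather than a realistic level of statistical power.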


Figure 2-5 Four analysis levels at which landscape genetic data can be analyzed (Wagner & Fortin, 2013).

The Geographic Information System (GIS) has provided a useful platform for geo-referencing environmental data. The databases implemented in GIS have been growing rapidly in recent years and have made the analysis of multiple landscape variables possible. GIS provides tools for visualization and analysis of genetic-landscape surfaces in combination with other geo-referenced environmental data (Vandergast, Perry, Lugo, & Hathaway, 2011).

Genetic management of fisheries resources

Management of fisheries resources should recognize the importance of genetic diversity at two levels: within- and among-population genetic diversity (Hallerman, 2003; Frankham et al., 2010). Genetic diversity is the raw material for natural selection to act upon. Loss of genetic diversity at either level may reduce the capacity of populations to evolve in response to environmental changes.

The major concerns over the loss of within-population genetic variation are the negative effects of inbreeding depression (reduction of fitness), the increase of harmful alleles and the loss of beneficial alleles (Frankham et al., 2010). Due to their small population sizes, populations of endangered species or fragmented populations are particularly prone to the loss of genetic variation and subsequent detrimental outcomes. Management strategies may include (1) increasing population size by controlling harvest, designating reserves, reducing pollutants, improving habitat quality and connectivity (if applicable) and eradicating unnatural predators and competitors, and (2) genetic rescue of a small inbred population by outcrossing, that is, the introduction of individuals from other populations to improve reproductive fitness and restore genetic diversity (Frankham et al., 2010). Genetic tools can also be used to diagnose genetic problems. For fisheries management, managers may also be interested in knowing stock contributions to fishery catches, in assigning individuals to their most likely stock of origin, and in monitoring and evaluating genetic changes imposed by hatchery programs. In a fishing ground, a group of fish under a harvest scheme may consist of members belonging to different genetic stocks (e.g., Pacific, Olsen et al., 2011). These stocks may have different population dynamics and status (e.g., endangered vs. abundant) and can therefore respond differently to a fishing scheme. Successful management of mixed-stock fisheries relies upon obtaining adequate escapements of all contributing individual stocks.

For among-population genetic differentiation, managers require knowledge of population identity and relationships among populations. Genetic data can be used to describe and delineate genetically distinct populations. Population structure data can provide a basis for the identification of management and conservation units. Management actions, such as hatchery operations and supplementary stocking, can inadvertently affect the genetic diversity of a managed population. Implementing measures to reduce genetic risks or mitigate genetic impacts can reinforce the effectiveness of fishery management (Hallerman, 2003; Shaklee & Curren, 2003).

Genetic hazards, events that are likely to lead to losses of genetic diversity or fitness, may be classified into four general kinds: (1) extinction, (2) loss of genetic diversity within populations, (3) loss of genetic diversity among populations and (4) domestication, or loss of fitness in the wild from artificial propagation. Depending on the management context, each activity may impose a different kind of genetic hazard, at varying magnitude, which leads to varying severity of genetic impacts.

In summary, understanding the influence of barriers on the genetic structure of lotic fish populations can help in designing a management plan appropriate for natural populations within a given landscape context. Landscape genetics, an explicit integration of genetic and spatial data, offers a comprehensive way to better understand the complexity of the interactions between the environment and the variation and behavior of fish species.


CHAPTER 3 RESEARCH METHODOLOGY

Study design and sampling locations

This study consisted of two steps, Tiers I and II (Figure 3-1, Figure 3-2). All fish samples were collected during November 2016-February 2017, and 23-100 individuals were analyzed for each location. These individuals were identified to the species level based on Nelson (2006). For each individual collected, a small piece of the caudal fin was removed and preserved in 95% alcohol. Individuals that were still alive after the tissue removal were returned to the stream.

Tier I research described overall genetic variation within and among populations of G. cambodgiensis in the major tributaries of the upper Nan River. These tributaries included the Meed, Kon, Pua, Yao, Yang, Sa, Wa and Haeng rivers (Figure 3-3A, 3-3B, Table 3-1).

Tier II research evaluated the effects of landscape heterogeneity on G. cambodgiensis population structure at a sub-basin level, the Nam Wa sub-basin (the main river is the Wa River). It is the largest sub-basin of the upper Nan River basin. The Nam Wa drainage covers approximately 3,375.80 km². Tier II research consisted of two studies: the first, Tier II (1), focused on the entire length of the Wa River to test the effects of several landscape variables, and the second, Tier II (2), focused on the effects of physical barriers.

Tier II (1): The first study examined genetic variation among samples from five locations along the Wa River: Sapun (SP), Phasuk (Pha), Namwa (NW), Hadrai (HR) and Nasa (NS), representing one or two environmental variables (i.e., isolation by distance, different land use types and the presence of a physical barrier). A human-made dam, the Nam Wa dam (209 x 310 meters), was constructed in 2012. These samples allowed for the analysis of impacts by zone (upper and lower Wa River) and by position relative to the dam (above and below the Nam Wa dam) (Figure 3-3A, 3-3B, Table 3-2).

Tier II (2): To specifically test for the effects of a physical barrier, the second study focused on G. cambodgiensis individuals collected from six locations representing sections above and below three physical barriers within the Wa River, Nam Wa sub-basin. These barriers are Nakham Dam (a concrete weir, 5 meters high; NKU and NKL) in Mang Stream, Sapun Waterfall (a natural waterfall, 10 meters high; SPU and SPL) in Pun Stream and Sawanua Dam (a concrete weir, 3 meters high; SWU and SWL) in the Wa River main-stem (Figure 3-4, Table 3-3). The geographic coordinates of sampling locations were collected as input for a geographic information system (GIS) to compute stream distance, distance between locations, elevation, and types of land use.
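As a rough illustration of how distances between locations can be derived from geographic coordinates, the sketch below computes pairwise straight-line (Euclidean) distances from the UTM coordinates of the five Wa River locations listed in Table 3-2. This is only an approximation for illustration: stream distance follows the river network and is generally longer than the straight-line distance. The function name `euclidean_km` and the assumption that all coordinates are in meters within a single UTM zone are assumptions made here, not part of the study protocol.

```python
import math
from itertools import combinations

# UTM coordinates (X, Y) of the Wa River sampling locations, from Table 3-2.
# Assumption: all coordinates are in meters and belong to the same UTM zone.
sites = {
    "SP": (731294, 2123534),
    "Pha": (733065, 2102653),
    "NW": (712063, 2062370),
    "HR": (703442, 2053759),
    "NS": (694415, 2056310),
}


def euclidean_km(point_a, point_b):
    """Straight-line distance in kilometers between two UTM points given in meters."""
    return math.hypot(point_a[0] - point_b[0], point_a[1] - point_b[1]) / 1000.0


# One entry per unordered pair of sampling locations.
pairwise_km = {
    (site_1, site_2): euclidean_km(sites[site_1], sites[site_2])
    for site_1, site_2 in combinations(sites, 2)
}

for (site_1, site_2), distance in pairwise_km.items():
    print(f"{site_1}-{site_2}: {distance:.2f} km")
```

The resulting values could serve as a lower bound when checking stream distances obtained from a GIS, since a stream path between two points cannot be shorter than the straight line connecting them.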

| Tier | Geographic scale | Hypothesis | Sampling |
|---|---|---|---|
| Tier I | Basin level | Isolation by distance/ the stream network on population genetic structure | One sample from major tributaries of upper Nan River basin |
| Tier II | Sub basin level | Isolation by distance (IBD); Isolation by barrier (IBB); Habitat fragmentation; Habitat quality | A few samples from each landscape type in the Nam Wa sub-basin |

Figure 3-1 Study design of this dissertation research

Note The study design consists of two steps, each of which focuses on a different geographic scale. Tier I and Tier II research focus on the basin-level and sub-basin-level geographic scales, respectively.


Figure 3-2 Study area of this dissertation research


(A)

(B)

Figure 3-3 Locations of population samples of G. cambodgiensis in the upper Nan River

Note Locations of population samples of G. cambodgiensis in the upper Nan River ( ) and Nam Wa sub-basin ( ). The maps also illustrate (A) stream orders, flooded areas and elevation and (B) land use types in the drainage basin. The Nam Wa dam. (Wa and NW were the same location, identified differently in the basin and sub-basin studies.)

Table 3-1 Landscape characteristics of sampling locations of G. cambodgiensis in the upper Nan River, Thailand

| Location (sampling code) | UTM X | UTM Y | Sub-basin* | Sub-basin area (km²) | Elevation (MSL) | Major land use types (%) within 4 km radius of the sampling location (Forest: Agriculture: Paddy field) | Stream order at the sampling location | Distance to the main channel of Nan River (km) | Sample size (N) |
|---|---|---|---|---|---|---|---|---|---|
| Meed River (Meed) | 690744 | 2138971 | Upper part of Mae Nam Nan | 2,222.34 | 500 | 45.19: 52.58: 0.78 | 4 | 4.6 | 46 |
| Kon River (Kon) | 701057 | 2133505 | Upper part of Mae Nam Nan | 2,222.34 | 300 | 63.37: 21.58: 8.93 | 5 | 8.97 | 46 |
| Pua River (Pua) | 704968 | 2125543 | Upper part of Mae Nam Nan | 2,222.34 | 292 | 28.57: 31.89: 26.76 | 4 | 17.89 | 46 |
| Yao River (Yao) | 676400 | 2150534 | Nam Yao-1 | 787.73 | 329 | 26.24: 73.34: 0 | 5 | 58.96 | 46 |
| Yang River (Yang) | 705554 | 2112381 | Second part of Mae Nam Nan | 2,200.39 | 365 | 52.87: 35.36: 7.38 | 3 | 22.99 | 100 |
| Sa River (Sa) | 659914 | 2055068 | Nam Sa | 778.40 | 347 | 49.07: 49.96: 0.44 | 7 | 45.68 | 41 |
| Wa River (Wa) | 712063 | 2062370 | Nam Wa | 3,375.80 | 252 | 61.88: 34.99: 0.52 | 7 | 64 | 30 |
| Haeng River (Haeng) | 667162 | 2027495 | Nam Haeng | 1,043.80 | 404 | 37.29: 54.69: 3.92 | 5 | 65.64 | 42 |

* Pakoksung and Koontanakulvong (2015)


Table 3-2 Landscape characteristics of sampling locations of G. cambodgiensis in the Wa River, Nam Wa sub-basin Thailand.

| Location (sampling code) | Stream | Zone | Barrier (Nam Wa dam) | UTM X | UTM Y | Elevation (MSL) | Major land use types (%) within 4 km radius of the sampling location (Forest: Agriculture: Paddy field) | Stream order at the sampling location | Sample size (N) |
|---|---|---|---|---|---|---|---|---|---|
| Sapun (SP) | Pun Stream | Upper Wa River | Above | 731294 | 2123534 | 727 | 82.33: 14.03: 3.64 | 3 | 30 |
| Phasuk (Pha) | Wa River | Upper Wa River | Above | 733065 | 2102653 | 485 | 90.78: 3.67: 4.46 | 6 | 30 |
| Namwa (NW) | Wa River | Lower Wa River | Above | 712063 | 2062370 | 252 | 63.85: 33.18: 0.49 | 7 | 30 |
| Hadrai (HR) | Wa River | Lower Wa River | Below | 703442 | 2053759 | 214 | 54.42: 42.71: 0.46 | 7 | 23 |
| Nasa (NS) | Wa River | Lower Wa River | Below | 694415 | 2056310 | 204 | 25.05: 69.16: 2.30 | 7 | 26 |


Figure 3-4 Sampling locations of G. cambodgiensis populations in the Wa River. The map also shows stream size and physical appearance of barriers examined in this study.

Table 3-3 Sampling locations of G. cambodgiensis populations above and below three barriers in the Wa River, Thailand

| Stream | Name and type of barrier | UTM X | UTM Y | Position relative to the barrier | Location code | Sample size (n) |
|---|---|---|---|---|---|---|
| Mang Stream | Nakham Dam (concrete weir, 5 m high, ~ 40-50 years old) | 726355 | 2112478 | Above | NKU | 35 |
| | | | | Below | NKL | 31 |
| Pun Stream | Sapun Waterfall (natural waterfall, 10 m high, > 100 years old) | 731079 | 2123372 | Above | SPU | 34 |
| | | | | Below | SPL | 29 |
| Main-stem Wa River (Wa Stream) | Sawanua Dam (concrete weir, 3 m high, ~ 40-50 years old) | 729096 | 2127669 | Above | SWU | 31 |
| | | | | Below | SWL | 31 |

Laboratory methods

Total genomic DNA was extracted from caudal fin tissues using a salt extraction protocol modified from Aljanabi and Martinez (1997). The extracted genomic DNA was visualized by 1% agarose gel electrophoresis (70 volts for 40 minutes) after staining with ethidium bromide and exposure to UV light. To evaluate DNA quantity, the intensity of DNA smears under UV light was compared with known amounts of DNA.

Microsatellite loci were amplified by the polymerase chain reaction (PCR). The number of microsatellite loci varied among phases of the research (11, 10 and 5 loci for Tier I, II (1) and II (2), respectively; Table 3-4). Tier I research analyzed 11 polymorphic microsatellite loci: GC203 and GC187 were developed for G. cambodgiensis; Sa197 was developed for Scaphiodonichthys acanthopterus (Jaisuk et al., 2014); Gar3, Gar6, Gar8, Gar9 and Gar13 were developed for G. orientalis (Su et al., 2013); and PH8A, JQSO and HOLN were developed for G. barreimiae (Kirchner, Weinmaier, Rattei, Sattmann, & Kruckenhauser, 2014) (Table 3-4). Tier II research analyzed a subset of the 11 loci examined in Tier I research.

For most of the research, I analyzed fluorescently labeled microsatellite loci. For this type of analysis, the forward primer of each primer pair was fluorescently labeled at the 5′ end (FAM, HEX, VIC or ROX). The total volume of each PCR was 10 µl, consisting of 10 ng of DNA template, 0.1 mM of each primer in a primer pair, and 5 µl of i-Taq mastermix solution (iNtRON BIOTECHNOLOGY, Korea). The PCRs were performed in a thermal cycler (BioRad, MJ Mini Cycler, Italy) with the following temperature profile: one cycle of 94 °C for 2 minutes; 40 cycles of denaturation at 94 °C for 30 seconds, annealing at a temperature specific to each primer pair for 30 seconds (48 °C for PH8A and Gar8; 54 °C for Gar9 and JQSO; 58 °C for GC203, GC187, Gar3, Gar6 and Gar13; and 60 °C for Sa197 and HOLN) and elongation at 72 °C for 30 seconds; and a final cycle of 72 °C for 5 minutes (Table 3-5). The PCR products were submitted to a commercial genetic analysis service (First BASE Laboratories Sdn Bhd, Malaysia) for electrophoresis and genotyping on an ABI3730XL DNA analyser. Scores were determined relative to an internal size standard (LIZ 500) using the GeneMapper software v3.0 (Applied Biosystems, Foster City, CA, USA).

For Tier II (2) research, genotyping was performed using a standard non-automated technique, silver staining (Promega, USA). The sizes of DNA bands in the gels were estimated by comparison to a reference sequence of pGEM-3Zf (+) Vector, with a 1 base-pair band increment (Promega, USA). An individual with a known genotype was used as a positive control for each gel.

Table 3-4 Descriptions of primer sequences, fluorescent labelling, and annealing temperature (TA) (°C) of microsatellite loci analyzed in this study.

| Locus name | Primer sequences 5′→3′ | Fluorescent labelling | TA (°C) | Studies | Ref. |
|---|---|---|---|---|---|
| GC203 | F: GTTCTCCAGGTGTGGATTTCTC; R: AACATACACTCACAGTTTGGCCT | VIC | 58 | I, II1, II2 | Jaisuk et al. (2014) |
| GC187 | F: GTGGACTACCTGCTGAGAAACC; R: GCGTGGACTAACTTTGCTTTTAG | HEX | 58 | I, II1 | Jaisuk et al. (2014) |
| Sa197 | F: TGCACATTTCTCCTCTAGCTCA; R: CAGTGGCCTCCTGTAAGTGTCT | ROX | 60 | I, II1 | Jaisuk et al. (2014) |
| Gar3 | F: ATTACTGATGCTCCCG; R: GTTGCTGCTCTTGTCC | FAM | 58 | I, II1, II2 | Su et al. (2013) |
| Gar6 | F: GCTTTACCTCCATCGC; R: GTCACTCCACCAACCC | HEX | 58 | I, II1, II2 | Su et al. (2013) |
| Gar8 | F: GAAGCATTTACCGTCAC; R: TTAGCATTGGCAGAAG | HEX | 48 | I, II1 | Su et al. (2013) |
| Gar9 | F: GCTCCATTCACATTCCAT; R: ATCACCTGCTTTCCCACT | VIC | 54 | I, II1 | Su et al. (2013) |
| Gar13 | F: ACTCACGCAGACTCGC; R: GACTACAGAAATAGGGTT | FAM | 58 | I, II1, II2 | Su et al. (2013) |
| PH8A | F: ACACTTCCAAAACGGTCGC; R: CGGAGCAAACCCAGTGACTA | VIC | 48 | I | Kirchner et al. (2014) |
| JQSO | F: TTGGCTGGAGCGATGGCTG; R: AGGGCTACATCACAGACTCAC | HEX | 54 | I, II1 | Kirchner et al. (2014) |
| HOLN | F: ACTGCGCTCGTACCCTATG; R: CAGCAGCCGGTAAATAGCTG | FAM | 60 | I, II1, II2 | Kirchner et al. (2014) |

Table 3-5 Temperature profile of the polymerase chain reaction (PCR) used in this study.

| Step | Temperature (°C) | Time | Cycle |
|---|---|---|---|
| 1 | 94 | 2 min | 1 |
| 2 | 94 | 30 s | |
| 3 | 48-60 | 30 s | |
| 4 | 72 | 30 s | back to step 2 for 40 cycles |
| 5 | 72 | 5 min | 1 |

Genetic data analysis

1. Genetic diversity within populations

The following standard genetic diversity indices were estimated: average number of alleles (A), effective number of alleles (Ae), observed and expected heterozygosity (Ho and He), and inbreeding coefficient (Fis), all using GenAlEx v.6.5 (Peakall & Smouse, 2006). Unequal sample sizes among population samples can make allelic diversity incomparable across samples. To account for this, I estimated allelic richness (Ar) based on the smallest sample size across all samples using the rarefaction method implemented in Fstat v.2.9.3 (Goudet, 2001). Mann-Whitney U-tests were used to test for statistically significant differences in genetic diversity measures among population samples (Crooks & Shaw, 2016).

Observed genotypes within each sample were tested for deviation from those expected under Hardy-Weinberg equilibrium using the Markov Chain Monte Carlo exact probability test implemented in the software Genepop v.4.0 (Rousset, 2008). P-values were estimated from 10,000 dememorization steps, in 100 batches with 5,000 iterations per batch. For statistical inference, the P-value was adjusted using the Bonferroni correction for multiple tests (Rice, 1989).

Genotyping errors due to non-amplified alleles (null alleles), short allele dominance (large allele dropout) and the scoring of stutter peaks were determined based on the Chakraborty et al. (1992) and Dempster, Laird, and Rubin (1977) methods implemented in the programs Micro-Checker v.2.2.3 (Van Oosterhout, Hutchinson, Wills, & Shipley, 2004) and FreeNA (Chapuis & Estoup, 2007), respectively. Moreover, to account for the effects of null alleles on the detection of population genetic structure, pairwise FST values (Weir, 1996) were estimated based on allele frequencies corrected for null alleles (i.e., ENA, excluding null alleles; Chapuis & Estoup, 2007).
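The within-population indices listed above follow from simple functions of allele frequencies and genotype counts. The sketch below illustrates the standard textbook definitions for a single locus: He = 1 − Σp², Ae = 1/Σp², Ho as the proportion of heterozygous individuals, and Fis = (He − Ho)/He. The function name `locus_diversity` and the eight example genotypes are hypothetical, and the sketch is not intended to reproduce the exact output of GenAlEx, which also reports sample-size-corrected variants.

```python
from collections import Counter


def locus_diversity(genotypes):
    """Basic diversity indices for one locus from a list of diploid genotypes."""
    n_individuals = len(genotypes)
    allele_counts = Counter(allele for genotype in genotypes for allele in genotype)
    n_gene_copies = 2 * n_individuals
    frequencies = [count / n_gene_copies for count in allele_counts.values()]

    sum_p_squared = sum(p * p for p in frequencies)
    expected_het = 1.0 - sum_p_squared  # He (not corrected for sample size)
    observed_het = sum(1 for a, b in genotypes if a != b) / n_individuals  # Ho
    return {
        "A": len(allele_counts),  # number of alleles
        "Ae": 1.0 / sum_p_squared,  # effective number of alleles
        "Ho": observed_het,
        "He": expected_het,
        "Fis": (expected_het - observed_het) / expected_het,
    }


# Hypothetical genotypes (allele sizes in base pairs) for eight individuals.
genotypes = [
    (100, 100), (100, 104), (104, 104), (100, 100),
    (104, 108), (108, 108), (100, 104), (100, 100),
]

stats = locus_diversity(genotypes)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

The same relation can be checked against reported values: for locus Gar3 in the Meed sample (Table 4-1), Ho = 0.61 and He = 0.84 give (0.84 − 0.61)/0.84 ≈ 0.27, which matches the tabulated Fis.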

2. Estimation of Ne and the presence of recent bottlenecks

The contemporary effective population size (Ne) of each sample or genetic cluster was calculated using two methods, the linkage disequilibrium (LD) method (Do et al., 2014) and the sibship method (Wang, 2016), implemented in NeEstimator v.2 and COLONY v.2.05.1, respectively. For the LD method, the lowest allele frequency used was 0.01, and putative 95% confidence intervals were calculated by a parametric method (Do et al., 2014). For the sibship approach, COLONY uses maximum likelihood to estimate the probabilities of full- and half-sibling relationships among individuals sampled from a population of interest.

An excess of heterozygosity within a population, relative to that expected across loci under mutation-drift equilibrium, can indicate a recent bottleneck event. For the G. cambodgiensis samples, the analysis assumed a two-phase model (TPM) of microsatellite evolution, which is most appropriate for empirical microsatellite data (Di Rienzo et al., 1994), with 90% single-step mutations and 10% multiple-step mutations, implemented in BOTTLENECK v.1.2.02 (Piry, Luikart, & Cornuet, 1999). The statistical significance of heterozygosity excess was determined using Wilcoxon's signed-rank test based on 1,000 iterations.

3. Genetic differentiation among populations

Genetic divergence among samples was analyzed using conventional and model-based approaches. Analysis of molecular variance (AMOVA; Excoffier, Laval, & Schneider, 2005), an analogue of ANOVA, allowed for the partitioning of overall variance into within- and among-sample variation. The levels of variation examined depended on the geographic scale of the study. At a large geographic scale, the sources of variation included among sub-basins, among samples within sub-basins and among populations. At a finer scale, the Nam Wa sub-basin level, the sources of variation included among sections of the stream (upper and lower river segments, upstream and downstream of a dam) and within sections. Based on the AMOVA framework, I then estimated pairwise FST values, with P-values generated from random permutation procedures (1,000 permutations) using the software ARLEQUIN v.3. (Excoffier & Lischer, 2010). The level of significance was adjusted for multiple simultaneous tests using the sequential Bonferroni procedure (Rice, 1989).

Cluster analysis was performed based on the Nei's genetic distance (Nei, 1978) matrix, and a dendrogram was constructed with the unweighted pair group method with arithmetic averages (UPGMA) algorithm using the Poppr R package (Kamvar, Tabima, & Grunwald, 2014). The consensus dendrogram showed bootstrap support values for each node (percentage based on 1,000 bootstrap replicates).

In addition, I analyzed genetic variation using two Bayesian clustering models implemented in the software STRUCTURE v.2.3.4 (Pritchard, Stephens, & Donnelly, 2000; Hubisz, Falush, Stephens, & Pritchard, 2009) and TESS v.2.3 (Chen, Durand, & Forbes, 2007; Francois & Durand, 2010). The two models differ in their use of spatial information, with TESS incorporating the geographic coordinates of individuals in the analysis. Both approaches analyzed multilocus genotypes of individuals to determine a likely number of genetic clusters (K) and estimated membership coefficients (for a given K value) for each individual. For the STRUCTURE analysis, the most likely K value for the dataset was determined by the method proposed by Evanno, Regnaut, and Goudet (2005) based on the difference in log probability of data between successive K values (i.e., the ΔK statistic). The K value with the highest rate of change would be the probable K value for the data set. To obtain these probability values, I simulated a range of K values between the minimum and maximum number of populations, with 20 replicated runs for each value of K and 100,000 Markov Chain Monte Carlo (MCMC) iterations, following a burn-in period of 25,000.
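The ΔK criterion described above can be sketched in a few lines. The version below uses the simplified form based on run means, ΔK = |L(K+1) − 2L(K) + L(K−1)| / s[L(K)], where L(K) is the mean log probability of the data over replicate runs and s[L(K)] is its standard deviation. The function name `evanno_delta_k` and all log-probability values are hypothetical, chosen only to show the arithmetic; they are not results from this study.

```python
import numpy as np


def evanno_delta_k(log_probs_by_k):
    """Delta K for interior K values, from replicate Ln P(D) values keyed by K."""
    ks = sorted(log_probs_by_k)
    mean_l = {k: float(np.mean(log_probs_by_k[k])) for k in ks}
    sd_l = {k: float(np.std(log_probs_by_k[k], ddof=1)) for k in ks}

    delta_k = {}
    for k in ks:
        # Delta K needs both neighbours, so the smallest and largest K are skipped.
        if (k - 1) in mean_l and (k + 1) in mean_l:
            second_difference = abs(mean_l[k + 1] - 2.0 * mean_l[k] + mean_l[k - 1])
            delta_k[k] = second_difference / sd_l[k]
    return delta_k


# Hypothetical Ln P(D) values from three replicate runs per K.
log_probs = {
    1: [-5200.0, -5202.0, -5198.0],
    2: [-4900.0, -4903.0, -4897.0],
    3: [-4850.0, -4860.0, -4840.0],
    4: [-4840.0, -4870.0, -4810.0],
    5: [-4845.0, -4885.0, -4805.0],
}

delta_k = evanno_delta_k(log_probs)
best_k = max(delta_k, key=delta_k.get)
print({k: round(v, 2) for k, v in delta_k.items()}, "most likely K:", best_k)
```

Because ΔK is undefined for the smallest and largest K tested, the simulated range should extend at least one value beyond the K values of interest.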
The program settings included the admixture model with correlated allele frequencies and otherwise default parameter settings. The ΔK plot was generated by STRUCTURE HARVESTER v.0.6.94 (Earl & vonHoldt, 2012).

For the TESS analysis, the most likely K value was selected based on the rapid decline of the deviance information criterion (DIC) values, averaged over 20 simulations, between subsequent K values. To obtain the DIC values, the analysis was performed using the CAR admixture model, which assumes spatial autocorrelation, with the genomes of individuals in closer geographical proximity being more similar than those further apart. The spatial interaction parameter (ψ) was set to the default value of 0.6 for the analysis. TESS was run with a burn-in of 30,000 sweeps followed by 50,000 sweeps, with 20 independent runs conducted for each value of K, from 2 to 9.

For each K, both STRUCTURE and TESS estimated a membership coefficient, accounting for sampling location, for each individual. These coefficients reflected the level of genetic admixture within individuals (if present). The display of membership coefficients of individuals for each K value was generated using the Pophelper R web app v.1.0.10 (Francis, 2016).

At a large geographic scale (Tier I), I determined the historic gene flow pattern by jointly estimating long-term migration rates among populations and historical effective population sizes using the MCMC maximum-likelihood method implemented in the software Migrate-n v.3.2.1 (Beerli, 2012). Search criteria in Migrate-n were set to 10 short chains of 10,000 steps with 500 trees recorded, 3 long chains of 100,000 steps with 5,000 trees recorded, and a static heating scheme of 1.0, 1.20, 1.50 and 3.00. Microsatellite mutation was modeled as a continuous Brownian and stepwise process.
For the estimation of the migration rate, the program settings included an MCMC search of 10,000 burn-in steps followed by 250,000 steps with parameters recorded every 50 steps; a static heating scheme consisting of four chains with temperatures (1.0, 1.3, 3.0, 10,000); a uniform prior on theta (min: 0.00, max: 0.100, delta: 0.01); and a uniform prior on migration (min: 0.00, max: 1000.0, delta: 100.0). Migrate-n was run six times with parameter values starting from FST-based estimates, and the distribution of parameter values was compared across runs to ensure overlap of the 95% C.I. The effective sample size was 7000 for all runs. Values of long-term, historical estimates of gene flow (M) were converted to the proportion of migrants (m) using the formula m = Mµ (Apodaca, Rissler, & Godwin, 2012), where µ = 5.56 x 10⁻⁴ (Yue, David, & Orban, 2007). At a finer geographic scale, the analysis did not provide relevant insights.

4. Spatial genetic analysis

This analysis applied only to the basin-wide studies (i.e., the upper Nan River and the Wa River). To determine the spatial pattern of genetic variation, I examined correlations between landscape features and genetic diversity (i.e., allelic richness) as well as genetic differentiation (i.e., linearized FST). The correlation between allelic richness and landscape characteristics, such as sampling site elevation, stream order, and percentages of land use types within a 4 km radius of sampling locations (i.e., forest, agriculture and paddy field) (Table 3-1 and Table 3-2), was assessed using Pearson correlation analysis (IBM SPSS Statistics v.20). To evaluate the extent to which landscape variables contribute to genetic differences among samples (linearized FST), I followed the approach of Pilger et al. (2017), using an information theoretic approach in combination with multiple-regression-on-distance matrices (MRM).
The best candidate multiple regression models were selected based on Akaike's information criterion (AIC) scores, adjusted for small sample size (AICc), and Akaike weights (wi). These values were calculated using the AICcmodavg package in R (Mazerolle, 2017). Candidate models with the lowest AICc scores (ΔAICc < 2.0) and highest weights (wi > 0.10) were retained (Burnham & Anderson, 2002) and evaluated for the contribution of explanatory variables to the overall fit of the model (MRM R²) using the function MRM implemented in the ecodist package in R (Goslee & Urban, 2007). The significance of MRM models was assessed by a permutation test (1,000 permutations).

I also tested for an isolation by distance (IBD) pattern by performing Mantel tests. The IBD analysis related genetic distance statistics between all pairs of populations (FST, Slatkin's M, Rousset's distance) to pairwise stream distances among sampling locations (log transformed), calculated using Google Earth. The Mantel tests (P-values were generated from 1,000 permutations) were performed in the ecodist package in R (Goslee & Urban, 2007).

In addition, I determined whether genetic diversity reflected contemporary patterns of stream connectivity by analyzing the model fit between the pairwise FST values and the number of stream sections connecting sampling locations, based on the statistical methods in the software STREAMTREE (Kalinowski, Meeuwig, Narum, & Taper, 2008). This approach evaluates the model fit, indicated by a coefficient of determination (R²), between two neighbor-joining dendrograms, one from genetic distance data and another from stream network section data.

To determine a boundary of gene flow at a basin level, I also performed multivariate analysis based on genotypes and spatial coordinates.
The spatial genetic variation was visualized using an approach proposed by Galpern, Peres-Neto, Polfus, and Manseau (2014) based on a combination of a multivariate analysis (Moran's eigenvector maps) of multilocus genotypes and a regression between a genetic distance matrix (i.e., proportion of shared alleles) and landscape predictors. The analyses were implemented in the MEMGENE package in the R language (Galpern et al., 2014). The results from the MEMGENE analysis were superimposed on the upper Nan River map.
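The model-selection step used with the MRM analysis can be illustrated with a short calculation of AICc, AICc differences and Akaike weights, following the general formulas in Burnham and Anderson (2002): AICc = 2k − 2 ln L + 2k(k + 1)/(n − k − 1) and wi = exp(−Δi/2) / Σ exp(−Δj/2). This is a sketch in Python rather than a call to the AICcmodavg package; the candidate model names, log-likelihoods, parameter counts and sample size are hypothetical, and the retention rule assumes that the 2.0 threshold refers to the AICc difference from the best model.

```python
import math


def aicc(log_likelihood, n_params, n_obs):
    """Small-sample corrected AIC for a model with n_params parameters."""
    aic = 2.0 * n_params - 2.0 * log_likelihood
    return aic + (2.0 * n_params * (n_params + 1)) / (n_obs - n_params - 1)


def akaike_weights(aicc_scores):
    """Return AICc differences from the best model and the Akaike weights."""
    best = min(aicc_scores.values())
    deltas = {model: score - best for model, score in aicc_scores.items()}
    relative = {model: math.exp(-0.5 * delta) for model, delta in deltas.items()}
    total = sum(relative.values())
    weights = {model: value / total for model, value in relative.items()}
    return deltas, weights


# Hypothetical candidate models: (log-likelihood, number of parameters).
candidates = {
    "stream distance": (-10.0, 2),
    "stream distance + elevation": (-9.5, 3),
    "stream distance + forest cover": (-6.0, 3),
}
n_obs = 10  # hypothetical number of observations

scores = {name: aicc(ll, k, n_obs) for name, (ll, k) in candidates.items()}
deltas, weights = akaike_weights(scores)

# Assumed retention rule: AICc difference below 2.0 and Akaike weight above 0.10.
retained = [m for m in scores if deltas[m] < 2.0 and weights[m] > 0.10]
for model in scores:
    print(f"{model}: AICc={scores[model]:.2f}, delta={deltas[model]:.2f}, w={weights[model]:.3f}")
```

In this hypothetical set, a model can carry a weight above 0.10 and still be dropped because its AICc difference exceeds 2.0, which shows why the two criteria are applied together.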

Table 3-6 Analysis of genetic variation and population genetic structure

| Analysis | Program | Reference | Studies |
|---|---|---|---|
| Indexes for genetic diversity within populations: | | | |
| Number of alleles (A) | GenAlEx v.6.5 | Peakall & Smouse (2006) | I, II1, II2 |
| Effective number of alleles (Ae) | GenAlEx v.6.5 | Peakall & Smouse (2006) | I, II1, II2 |
| Fis | GenAlEx v.6.5 | Peakall & Smouse (2006) | I, II1, II2 |
| Heterozygosity (He and Ho) | GenAlEx v.6.5 | Peakall & Smouse (2006) | I, II1, II2 |
| Hardy-Weinberg Equilibrium | Genepop v.4 | Rousset (2008) | I, II1, II2 |
| Allelic richness (Ar) | Fstat v.2.9.3 | Goudet (2001) | I, II1, II2 |
| Null allele | Micro-Checker v.2.2.3 | Van Oosterhout et al. (2004) | I, II1, II2 |
| Null allele frequencies | FreeNa | Chapuis & Estoup (2007) | I, II1, II2 |
| Allelic diversity differences among samples | Mann-Whitney U-tests | | |
| Estimation of Ne and the presence of recent bottlenecks: | | | |
| Effective population size (Ne), based on linkage disequilibrium method | NeEstimator v.2 | Do et al. (2014) | I, II1, II2 |
| Effective population size (Ne), based on sibship method | COLONY v.2.05.1 | Wang (2016) | I, II2 |
| Bottleneck | BOTTLENECK v.1.2.02 | Piry et al. (1999) | I, II1, II2 |
| Indexes for genetic differentiation among populations: | | | |
| Analysis of Molecular variance (AMOVA) | ARLEQUIN v.3.11 | Excoffier et al. (2006) | I, II1, II2 |
| Cluster analysis (UPGMA Dendrogram, Nei's distance) | Poppr, R language | Kamvar et al. (2014) | I, II1, II2 |
| Model-based cluster analysis | STRUCTURE v.2.3.4 | Pritchard et al. (2000) | I, II1, II2 |
| | TESS v2.3 | Francois & Durand (2010) | I, II1, II2 |
| Gene flow: | | | |
| Migration rate (M) | Migrate-n v.3.2.1 | Beerli (2012) | I |
| No. of migrants per generation (m) | Migrate-n v.3.2.1 | Beerli (2012) | I |
| Spatial genetic analysis: | | | |
| Isolation by distance | IBD v.1.53 | Bohonak (2002) | I, II1 |
| Stream network analysis | STREAMTREE | Kalinowski et al. (2008) | I, II1 |
| Multiple-regression-on-distance matrices (MRM) | AICcmodavg package in R | Mazerolle (2017) | I, II1 |
| Multivariate analysis (Moran's eigenvector maps) | MEMGENE | Galpern et al. (2014) | I |


CHAPTER 4 RESULTS

Tier I: Population genetic structure of G. cambodgiensis in the upper Nan River drainage basin

1. Genetic variation within samples

All microsatellite loci were polymorphic in all G. cambodgiensis population samples from the Nan River (Table 4-1). Gar6 had the highest allelic diversity and Gar8 the lowest. The average number of alleles per locus across samples ranged from 6.25±1.56 (Gar8) to 14.63±4.27 (Gar6), the average effective number of alleles per locus ranged from 3.02±0.83 (Gar9) to 8.35±1.28 (HOLN) and the average allelic richness ranged from 5.62±0.99 (Gar8) to 12.72±2.70 (Gar6). The expected heterozygosities averaged across samples ranged from 0.64±0.10 (Gar9) to 0.88±0.02 (HOLN). There was no evidence of stutter products or allelic dropout. Nevertheless, Micro-Checker analysis suggested the presence of null alleles at some loci, with average frequencies ranging from 0.05±0.04 (JQSO) to 0.24±0.07 (Sa197). For a given locus, however, the null allele frequencies varied among population samples. The FreeNA analysis suggested only a slight change in pairwise FST values when including or excluding the estimated frequencies of null alleles (data not shown). The original dataset was therefore retained for population divergence analyses.

Based on within-sample genetic diversity measures, the Wa sample had the lowest allelic richness (7.64±1.92, P < 0.05, Mann-Whitney U-test) but was comparable to the remaining samples for the other measures. For allelic diversity, the average number of alleles per locus (A) ranged from 8.00±1.92 (Wa) to 12.09±1.88 (Pua), the effective number of alleles per locus (Ae) ranged from 4.31±1.70 (Wa) to 6.39±2.41 (Haeng), and allelic richness ranged from 7.64±1.92 (Wa) to 11.05±1.87 (Pua). Heterozygosity values were comparable across all samples, with observed heterozygosity (Ho) ranging from 0.51±0.14 (Meed) to 0.65±0.12 (Yao) and expected heterozygosity (He) ranging from 0.73±0.11 (Wa) to 0.83±0.06 (Meed). Of the 88 sample-locus cases (8 samples x 11 loci), 40 showed significant deviations from HWE (P < 0.00057, after Bonferroni correction = 0.05/88). All of the deviations were heterozygote deficiencies (Ho < He) (Table 4-1).
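The Bonferroni threshold quoted above is the nominal significance level divided by the number of tests. The following is a minimal Python sketch of that arithmetic, using only the figures stated in the text; the function name `bonferroni_threshold` is illustrative and is not part of any software used in this study.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Tier I: 8 samples x 11 loci = 88 Hardy-Weinberg tests.
n_tests = 8 * 11
threshold = bonferroni_threshold(0.05, n_tests)
print(f"{n_tests} tests, per-test threshold = {threshold:.5f}")
```

The same function applies to the other corrections used in this chapter, for example 28 pairwise FST comparisons among eight samples.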


Table 4-1 Average allelic variability at 11 microsatellite loci of G. cambodgiensis in the upper Nan River, Thailand.

Samples  Locus  N            A           Ae         Ar          Ho         He         Fis         Null allele frequency
Meed     Gar3   41           10          6.18       9.65        0.61       0.84       0.27        0.13
         Gar6   43           13          6.64       12.10       0.70       0.85       0.18        0.09
         Gar8   46           5           3.89       4.88        0.46       0.74       0.39        0.17
         Gar9   42           10          4.24       9.31        0.36       0.76       0.53        0.23
         Gar13  42           8           4.27       7.83        0.57       0.77       0.25        0.10
         GC187  44           11          7.55       10.62       0.55       0.87       0.37        0.18
         GC203  44           12          8.59       11.35       0.73       0.88       0.18        0.10
         HOLN   45           13          8.22       12.07       0.44       0.88       0.49        0.23
         JQSO   46           10          6.98       9.85        0.61       0.86       0.29        0.15
         Sa197  42           15          9.31       14.14       0.33       0.89       0.63        0.29
         PH8A   43           13          3.94       12.00       0.30       0.75       0.60        0.26
         Ave.   43.46±1.62   10.91±2.64  6.35±1.90  10.35±2.38  0.51±0.14  0.83±0.06  0.38±0.15   0.18±0.07


Table 4-1 (Continued)

Samples  Locus  N            A           Ae         Ar          Ho         He         Fis         Null allele frequency
Kon      Gar3   45           12          6.23       10.49       0.78       0.84       0.07        0.04
         Gar6   46           17          7.50       14.65       0.65       0.87       0.25        0.12
         Gar8   46           5           3.94       4.88        0.59       0.75       0.21        0.09
         Gar9   38           13          3.06       11.60       0.37       0.67       0.45        0.19
         Gar13  41           11          5.20       9.85        0.56       0.81       0.31        0.13
         GC187  45           9           6.05       8.95        0.60       0.84       0.28        0.12
         GC203  46           11          6.38       10.47       0.67       0.84       0.20        0.10
         HOLN   43           12          8.25       11.55       0.44       0.88       0.50        0.23
         JQSO   46           12          6.22       11.03       0.74       0.84       0.12        0.07
         Sa197  43           11          5.12       10.08       0.26       0.80       0.68        0.30
         PH8A   45           15          3.55       13.13       0.51       0.72       0.29        0.13
         Ave.   44±2.45      11.64±2.93  5.59±1.54  10.61±2.36  0.56±0.15  0.81±0.06  0.31±0.17   0.14±0.07


Table 4-1 (Continued)

Samples  Locus  N            A           Ae         Ar          Ho         He         Fis         Null allele frequency
Pua      Gar3   46           11          7.60       10.17       0.74       0.87       0.15        0.06
         Gar6   46           16          7.50       14.48       0.76       0.87       0.12        0.06
         Gar8   46           9           4.02       7.49        0.61       0.75       0.19        0.09
         Gar9   43           10          2.25       8.79        0.33       0.56       0.41        0.16
         Gar13  46           13          6.74       12.11       0.72       0.85       0.16        0.07
         GC187  45           11          7.17       10.82       0.56       0.86       0.35        0.16
         GC203  46           12          6.11       11.47       0.83       0.84       0.01        0.02
         HOLN   46           13          9.14       11.84       0.74       0.89       0.17        0.09
         JQSO   46           11          7.43       10.15       0.87       0.87       -0.01       0.02
         Sa197  43           14          8.83       13.35       0.51       0.89       0.42        0.20
         PH8A   45           13          1.83       10.89       0.27       0.46       0.41        0.14
         Ave.   45.27±1.14   12.09±1.88  6.24±2.36  11.05±1.87  0.63±0.19  0.79±0.14  0.22±0.15   0.10±0.06


Table 4-1 (Continued)

Samples  Locus  N            A           Ae         Ar          Ho         He         Fis         Null allele frequency
Yao      Gar3   46           10          6.61       9.53        0.72       0.85       0.16        0.06
         Gar6   46           15          5.28       12.71       0.67       0.81       0.17        0.07
         Gar8   46           6           4.02       5.64        0.61       0.75       0.19        0.07
         Gar9   43           11          3.96       10.18       0.49       0.75       0.35        0.15
         Gar13  46           10          4.53       9.51        0.61       0.78       0.22        0.11
         GC187  46           9           5.85       8.52        0.67       0.83       0.19        0.09
         GC203  46           13          7.78       12.67       0.83       0.87       0.05        0.03
         HOLN   44           12          6.96       11.58       0.64       0.86       0.26        0.13
         JQSO   46           11          8.46       10.64       0.85       0.88       0.04        0.02
         Sa197  45           14          7.60       12.75       0.44       0.87       0.49        0.23
         PH8A   44           14          3.83       12.45       0.59       0.74       0.20        0.10
         Ave.   45.27±1.05   11.36±2.50  5.90±1.61  10.56±2.11  0.65±0.12  0.82±0.05  0.21±0.12   0.10±0.06


Table 4-1 (Continued)

Samples  Locus  N            A           Ae         Ar          Ho         He         Fis         Null allele frequency
Yang     Gar3   99           14          6.49       11.39       0.71       0.85       0.16        0.07
         Gar6   98           23          8.00       16.88       0.62       0.88       0.29        0.14
         Gar8   100          8           3.94       5.83        0.57       0.75       0.24        0.11
         Gar9   81           14          3.74       10.89       0.31       0.73       0.58        0.25
         Gar13  99           13          5.92       9.88        0.58       0.83       0.31        0.14
         GC187  93           11          6.61       10.56       0.54       0.85       0.37        0.17
         GC203  99           12          9.71       11.44       0.66       0.90       0.27        0.13
         HOLN   96           15          10.05      13.18       0.64       0.90       0.29        0.14
         JQSO   100          12          6.31       10.10       0.75       0.84       0.11        0.05
         Sa197  91           13          3.32       10.91       0.24       0.70       0.65        0.28
         PH8A   96           12          2.36       8.37        0.48       0.58       0.17        0.10
         Ave.   95.64±5.4    13.36±3.52  6.04±2.42  10.86±2.62  0.55±0.15  0.80±0.10  0.31±0.16   0.14±0.07


Table 4-1 (Continued)

Samples  Locus  N            A           Ae         Ar          Ho         He         Fis         Null allele frequency
Sa       Gar3   40           10          5.16       9.87        0.73       0.81       0.10        0.03
         Gar6   41           10          4.82       9.42        0.81       0.79       -0.02       0.00
         Gar8   41           6           3.28       5.90        0.59       0.70       0.16        0.04
         Gar9   36           8           2.00       7.62        0.25       0.50       0.50        0.19
         Gar13  38           10          5.80       9.86        0.55       0.83       0.33        0.15
         GC187  41           9           5.81       8.66        0.54       0.83       0.35        0.16
         GC203  41           12          8.36       11.65       0.76       0.88       0.14        0.08
         HOLN   39           12          7.47       11.60       0.69       0.87       0.20        0.09
         JQSO   41           11          8.53       10.66       0.81       0.88       0.09        0.04
         Sa197  35           10          6.46       9.71        0.57       0.85       0.32        0.15
         PH8A   40           10          3.67       9.36        0.65       0.73       0.11        0.06
         Ave.   39.36±2.06   9.82±1.64   5.58±1.99  9.48±1.59   0.63±0.15  0.79±0.11  0.21±0.14   0.09±0.06


Table 4-1 (Continued)

Samples  Locus  N            A           Ae         Ar          Ho         He         Fis         Null allele frequency
Wa       Gar3   30           8           4.95       8.00        0.63       0.80       0.21        0.09
         Gar6   30           8           3.30       8.00        0.63       0.70       0.09        0.05
         Gar8   30           4           2.58       4.00        0.73       0.61       -0.20       0.00
         Gar9   30           8           2.81       8.00        0.63       0.64       0.02        0.03
         Gar13  30           6           2.54       6.00        0.57       0.61       0.07        0.04
         GC187  30           8           4.60       8.00        0.77       0.78       0.02        0.01
         GC203  30           11          7.35       11.00       0.77       0.86       0.11        0.06
         HOLN   30           10          6.52       10.00       0.57       0.85       0.33        0.15
         JQSO   30           8           5.94       8.00        0.70       0.83       0.16        0.07
         Sa197  30           8           4.80       8.00        0.53       0.79       0.33        0.13
         PH8A   30           5           2.07       5.00        0.30       0.52       0.42        0.16
         Ave.   30±0.00      8.00±1.92   4.31±1.70  7.64±1.92   0.62±0.13  0.73±0.11  0.14±0.17   0.07±0.05


Table 4-1 (Continued)

Samples  Locus  N            A           Ae         Ar          Ho         He         Fis         Null allele frequency
Haeng    Gar3   40           12          8.51       11.47       0.70       0.88       0.21        0.10
         Gar6   42           15          6.49       13.51       0.50       0.85       0.41        0.19
         Gar8   42           7           3.37       6.35        0.48       0.70       0.32        0.13
         Gar9   38           6           2.07       5.75        0.42       0.52       0.19        0.08
         Gar13  34           10          6.35       9.75        0.62       0.84       0.27        0.12
         GC187  42           10          7.38       9.91        0.57       0.87       0.34        0.17
         GC203  40           11          6.03       10.48       0.48       0.83       0.43        0.20
         HOLN   41           15          10.22      14.29       0.63       0.90       0.30        0.14
         JQSO   42           11          7.84       10.40       0.83       0.87       0.05        0.02
         Sa197  38           14          8.62       13.34       0.21       0.88       0.76        0.36
         PH8A   41           11          3.41       9.65        0.46       0.71       0.34        0.14
         Ave.   40±2.37      11.09±2.78  6.39±2.41  10.45±2.59  0.54±0.16  0.81±0.11  0.33±0.17   0.15±0.08
All      Ave.   47.88±18.82  10.99±3.00  5.80±2.12  10.12±2.44  0.59±0.16  0.80±0.10  0.26±0.17   0.12±0.07
samples


Table 4-1 (Continued)

Samples  Locus  N            A           Ae         Ar          Ho         He         Fis         Null allele frequency
Each     Gar3   48.38±19.74  10.88±1.69  6.47±1.10  10.07±1.04  0.70±0.05  0.84±0.03  0.17±0.06   0.07±0.03
locus    Gar6   49.00±19.16  14.63±4.27  6.19±1.50  12.72±2.70  0.67±0.09  0.83±0.06  0.19±0.12   0.09±0.05
         Gar8   49.63±19.71  6.25±1.56   3.63±0.48  5.62±0.99   0.58±0.08  0.72±0.05  0.24±0.07   0.09±0.05
         Gar9   43.88±14.61  10.00±2.50  3.02±0.83  9.02±1.78   0.40±0.11  0.64±0.10  0.38±0.18   0.16±0.07
         Gar13  47.00±20.33  10.13±2.20  5.17±1.27  9.35±1.66   0.60±0.05  0.79±0.07  0.24±0.08   0.11±0.03
         GC187  48.25±17.58  9.75±1.09   6.38±0.93  9.51±1.03   0.60±0.08  0.84±0.03  0.28±0.11   0.13±0.05
         GC203  49.00±19.55  11.75±0.66  7.54±1.24  11.32±0.66  0.72±0.11  0.86±0.02  0.17±0.12   0.09±0.05
         HOLN   48.00±18.75  12.75±1.56  8.35±1.28  12.01±1.18  0.60±0.10  0.88±0.02  0.32±0.11   0.15±0.05
         JQSO   49.63±19.71  10.75±1.20  7.21±0.95  10.10±0.87  0.77±0.08  0.86±0.02  0.11±0.09   0.05±0.04
         Sa197  45.88±18.90  12.38±2.45  6.76±2.18  11.54±2.18  0.39±0.14  0.83±0.07  0.54±0.17   0.24±0.07
         PH8A   48.00±18.71  11.63±2.91  3.08±0.80  10.11±2.47  0.45±0.13  0.65±0.11  0.32±0.15   0.14±0.06

Note The indices included the sample size (N), number of alleles per locus (A), effective number of alleles (Ae), allelic richness (Ar), observed heterozygosity (Ho), expected heterozygosity (He), fixation index (Fis) and estimated null allele frequencies. Fis values and probability of significant deviation from Hardy-Weinberg equilibrium (P) are given for each population and locus. Values underlined indicate statistical significance (P < 0.00057, after Bonferroni correction = 0.05/88).
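The relationship between the heterozygosity columns and the fixation index in Table 4-1 can be illustrated with a short Python sketch. It assumes the textbook definition Fis = 1 - Ho/He; the genetics software used for the table may apply a different estimator (for example, one corrected for sample size), so small discrepancies from the tabulated values are possible. The function name is illustrative.

```python
def fixation_index(ho: float, he: float) -> float:
    """Fis = 1 - Ho/He (assumed textbook definition, not the software's estimator)."""
    if he <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - ho / he


# Two Meed rows from Table 4-1: (locus, Ho, He)
rows = [("Gar3", 0.61, 0.84), ("Sa197", 0.33, 0.89)]
for locus, ho, he in rows:
    print(f"{locus}: Fis = {fixation_index(ho, he):.2f}")
```

Positive values indicate a heterozygote deficiency (Ho < He), which is the direction of every significant deviation reported above.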


2. Effective population size and evidence of recent bottlenecks

All population samples had similar effective population size (Ne) estimates, with Pua having the lowest estimate of 260.4 (CI = 153.2-761.6). I did not detect significant heterozygote excess in any of the samples under the two-phase mutation model (TPM). Additionally, the mode-shift test showed a normal L-shaped distribution pattern of the allele frequencies in all samples. The results implied the lack of bottleneck events in the recent history of these populations (Table 4-2).

Table 4-2 Estimates and 95% confidence intervals of contemporary effective population size (Ne), NeS and the detection of bottlenecks based on Wilcoxon’s test for eight population samples at 11 microsatellite loci.

         Effective population size                                                          Bottleneck test
         Based on linkage disequilibrium      Based on sibship
Samples  Ne        Lower bound  Upper bound   NeS  Lower estimate  Upper estimate           TPM (P-value)
Meed     408.3     175.9        infinite      64   43              100                      0.831
Kon      Infinite  407.1        infinite      63   40              99                       0.175
Pua      260.4     153.2        761.6         58   38              95                       0.206
Yao      422.1     192.4        infinite      54   36              85                       0.465
Yang     1,554.5   556.6        infinite      97   71              137                      0.102
Sa       Infinite  273.0        infinite      44   27              73                       0.700
Wa       406.6     101.7        infinite      40   25              47                       0.577
Haeng    Infinite  339.6        infinite      70   48              107                      0.765

3. Genetic differentiation among samples and population genetic structure

G. cambodgiensis populations in the upper Nan River were genetically heterogeneous. Both conventional and Bayesian genetic analyses suggested genetic divergence among the eight samples. Global FST estimated by the AMOVA framework was 0.022 (P < 0.01), with among-sample and within-sample variation contributing 2.25% and 97.75%, respectively. Moreover, the AMOVA for three hierarchical levels attributed 96%, 2% and 2% of the genetic variation to the within-stream, among-sub-basin and among-basin levels, respectively.

The pairwise FST values ranged from 0.003 (Yao and Sa) to 0.053 (Wa and Sa) (Table 4-3). All samples were genetically different from each other except for two sample pairs, Kon-Meed and Yao-Sa (P > 0.0018; P-value after adjusting for multiple comparisons = 0.05/28).

Table 4-3 Pairwise FST values (lower diagonal) and geographic distance (km) (upper diagonal) among G. cambodgiensis population samples in the upper Nan River, Thailand.

        Meed    Kon     Pua     Yao     Yang    Sa      Wa      Haeng
Meed    -       22.48   49.14   100.52  70.81   181.50  203.18  210.96
Kon     0.003   -       44.60   95.98   66.27   176.96  198.64  206.42
Pua     0.020   0.013   -       87.16   57.45   168.14  189.82  197.60
Yao     0.018   0.016   0.017   -       88.32   199.01  220.69  228.47
Yang    0.019   0.016   0.0156  0.022   -       156.67  178.35  186.13
Sa      0.034   0.029   0.026   0.003   0.029   -       113.04  120.82
Wa      0.036   0.028   0.026   0.044   0.041   0.053   -       135.64
Haeng   0.021   0.021   0.020   0.021   0.015   0.029   0.051   -

Values underlined indicate statistical significance (P < 0.0018, P-values adjusted for multiple comparisons using Bonferroni correction = 0.05/28)

The UPGMA dendrogram based on Nei’s genetic distance also formed robust clusters of Yao and Sa (bootstrap value = 99.3) and of Kon and Meed (bootstrap value = 91.3) (Figure 4-1). The samples from sites upstream of the Nan River, including Kon, Meed, Yang and Pua, were in the same cluster. Interestingly, the most downstream sample, Haeng, was also included in this cluster.

The Bayesian clustering algorithms implemented in STRUCTURE and TESS suggested five and four possible genetic clusters, respectively (Figure 4-2). Because TESS provided a clearer picture of population subdivision, I present only the TESS results. The distribution of membership coefficients arranged by population samples suggested that Wa was genetically distinct from the other samples. The bar plots also suggested genetic similarity among the three adjacent sites, Meed, Kon and Pua. Even though a large number of individuals in the Yang sample comprised a unique genetic cluster, substantial numbers shared a genetic composition with those in Kon, Meed, Pua and Haeng. The Sa and Yao samples had a similar genetic composition, but Yao contained individuals with admixed ancestry (Figure 4-2). Moreover, most of the population samples contained more than one genetic cluster as well as admixed individuals.

Although there was no location-specific cluster, some spatial patterns existed. These divisions included (1) the headwater tributaries (Meed/Kon/Pua) and the main stem of the Nan River (including Haeng), (2) a middle tributary (Yang), (3) an eastern tributary (Wa) and (4) a western tributary (Sa). The most downstream site, Haeng, comprised admixed individuals with coancestry deriving from all four genetic clusters.

Figure 4-1 UPGMA dendrogram of eight population samples of G. cambodgiensis in the upper Nan River based on Nei’s genetic distance (Nei, 1978) with 1000 bootstrap replicates at 11 microsatellite loci.

Figure 4-2 Bar plot of membership coefficients of individuals assigned to genetic clusters (K = 4 and 5) generated by a Bayesian clustering algorithm, TESS software. The individual coefficients were grouped by population samples.

4. Migration

Migration estimates suggested some dispersal among these populations, although the migration rates were not substantial (Table 4-4). The results from MIGRATE showed some gene flow among sub-basins, with the migration rates ranging from 0.00046 (Pua received from Wa) to 0.00125 (Yang received from Pua). Migration rates from Yang to some northern streams (Meed, Kon and Pua) were similar (m = 0.00102-0.00123). Adjacent rivers also had slightly higher migration rates than those more distantly located. For example, Yang received immigrants from Pua (m = 0.00125, Ne = 21.37) and Wa received immigrants from Sa (m = 0.00077) at relatively higher rates than from other, more distant sites. The most downstream site, Haeng, received immigrants from all samples. Most samples had comparable immigration and emigration rates. Historical effective population size estimates suggested that Yang was a donor population within the upper Nan River drainage basin, with Ne estimates ranging from 77.70 (to Pua) to 98.73 (to Yao) (Table 4-4, Figure 4-3). However, the number of migrants per generation (mNe) was much lower than one, which translates into very low gene flow among populations.
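The statement that the number of migrants per generation is far below one can be checked by multiplying each migration rate by the corresponding Ne estimate. The Python sketch below does this for three receiving-donor pairs taken from Table 4-4; pairing each m with the Ne value in the same table cell is an assumption made for illustration, and the names used are hypothetical.

```python
def migrants_per_generation(m: float, ne: float) -> float:
    """Number of migrants per generation, taken here as the product m * Ne."""
    return m * ne


# (receiving population, donor population, m, Ne) as listed in Table 4-4
EXAMPLES = [
    ("Yang", "Pua", 0.00125, 21.37),
    ("Meed", "Yang", 0.00123, 88.50),
    ("Pua", "Wa", 0.00046, 14.69),
]

results = {}
for receiver, donor, m, ne in EXAMPLES:
    results[(receiver, donor)] = migrants_per_generation(m, ne)
    print(f"{donor} -> {receiver}: mNe = {results[(receiver, donor)]:.4f}")
```

Even the largest of these products is roughly a tenth of a migrant per generation, consistent with the low gene flow described above.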


Table 4-4 Historical gene flow among G. cambodgiensis populations in the upper Nan River at 11 microsatellite loci.

                        Donor population
Receiving population    Meed                    Kon                     Pua                     Yao
                        M     m         Ne      M     m         Ne      M     m         Ne      M     m         Ne
Meed                    -     -         -       1.30  0.00072   59.61   1.55  0.00086   59.51   1.51  0.00084   21.71
Kon                     1.09  0.00061   34.15   -     -         -       1.58  0.00088   41.05   1.54  0.00086   21.65
Pua                     1.57  0.00087   32.33   0.98  0.00054   36.33   -     -         -       -     -
Yao                     1.34  0.00075   43.98   1.49  0.00083   24.65   0.85  0.00047   51.20   1.50  0.00083   38.17
Yang                    1.66  0.00092   17.50   1.77  0.00098   18.72   2.25  0.00125   21.37   2.15  0.00120   23.12
Sa                      1.55  0.00086   30.36   0.97  0.00054   25.94   1.26  0.00070   38.26   1.64  0.00091   38.57
Wa                      1.14  0.00063   35.97   0.93  0.00052   34.95   0.66  0.00037   62.76   1.17  0.00065   33.13
Haeng                   1.53  0.00085   33.35   1.38  0.00077   23.91   1.33  0.00074   32.40   1.35  0.00075   37.23
Average immigration     1.30  0.00072   -       1.30  0.00072   -       0.88  0.00049   -       1.30  0.00072   -
Average emigration      1.30  0.00072   -       0.88  0.00049   -       1.30  0.00072   -       1.30  0.00072   -
Immigration-emigration  0.00  0.00000   -       0.42  0.00023   -       0.42  -0.00023  -       0.00  0.00000   -

Note M = Mij is the immigration rate from population i to j, scaled by the mutation rate, and m = mij is the immigration rate from i to j per generation. Numbers in bold are the highest migration rate and Ne in a receiving population.


Table 4-4 (Continued)

                        Donor population
Receiving population    Yang                    Sa                      Wa                      Haeng
                        M     m         Ne      M     m         Ne      M     m         Ne      M     m         Ne
Meed                    2.22  0.00123   88.50   0.87  0.00048   36.47   1.30  0.00072   16.73   1.65  0.00092   21.76
Kon                     1.84  0.00102   94.89   1.17  0.00065   36.45   1.30  0.00072   16.18   1.39  0.00077   30.56
Pua                     1.95  0.00108   77.70   1.00  0.00056   24.84   0.83  0.00046   14.69   1.43  0.00080   31.94
Yao                     1.61  0.00090   98.73   1.54  0.00086   21.53   1.42  0.00079   20.52   1.40  0.00078   34.04
Yang                    -     -         -       1.47  0.00082   18.21   1.43  0.00080   12.47   1.58  0.00088   17.47
Sa                      1.37  0.00076   86.88   -     -         -       1.38  0.00077   32.14   0.95  0.00053   21.97
Wa                      1.17  0.00065   80.91   1.39  0.00077   32.78   -     -         -       0.84  0.00047   18.77
Haeng                   1.63  0.00091   93.00   1.15  0.00064   24.19   1.18  0.00066   28.62   -     -         -
Average immigration     1.30  0.00072   -       1.30  0.00072   -       1.30  0.00072   -       1.30  0.00072   -
Average emigration      1.93  0.00107   -       1.30  0.00072   -       1.30  0.00072   -       1.30  0.00072   -
Immigration-emigration  -0.63 0.00035   -       0.00  0.00000   -       0.00  0.0000    -       0.00  0.00000   -

Note M = Mij is the immigration rate from population i to j, scaled by the mutation rate, and m = mij is the immigration rate from i to j per generation. Numbers in bold are the highest migration rate and Ne in a receiving population.


Figure 4-3 Historical gene flow (migration rate, m) among G. cambodgiensis populations in the upper Nan River at 11 microsatellite loci.

Note The arrows, which differ in size and color, indicate the direction and magnitude of gene flow.


5. Spatial pattern of genetic variation

Based on the Pearson correlation coefficients, allelic richness within samples was significantly negatively correlated with the stream order of the sampling locations (P < 0.05). However, it did not correlate with other landscape attributes, including elevation of sampling locations, distance to the Nan River main stem, % paddy field, % forest and % agricultural land (P > 0.05) (Table 4-5). The correlation between geographic distance (log (geographic distance)) and linearized FST among samples was significant (R = 0.42, one-sided P = 0.022, Mantel test). Genetic differentiation between populations was moderately explained by contemporary patterns of stream connectivity (R2 = 0.75 for the STREAMTREE model). However, the difference in elevation between sampling locations did not correlate with genetic distance. Based on the AICc and wi criteria, two regression models identified log stream distance and pairwise differences in stream order as important explanatory variables in spatial genetic variation. However, neither of these variables explained much of the spatial genetic variation (R2 ≤ 0.033) (Table 4-6).

I detected evidence for at least two distinct spatial genetic neighborhoods separated by a north-south division. The first two MEMGENE axes described almost all of the variability, with the first MEMGENE axis explaining 57.8% (Figure 4-4A) and the second axis explaining 42.2% (Figure 4-4B). In total, only a small proportion of genetic variation could be explained by spatial patterns (R2adj = 0.02), but this was sufficient to identify neighborhoods that correspond to a landscape pattern.
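The isolation-by-distance test described above can be sketched as a simple Mantel permutation test in Python. This is an illustrative reimplementation, not the IBD program used in this study: it takes the pairwise FST values and distances from Table 4-3, linearizes FST as FST/(1 - FST), log-transforms distance, and permutes sample labels. The permutation count, random seed and function names are assumptions, so the output is not expected to match the reported R and P exactly.

```python
import math
import random

SAMPLES = ["Meed", "Kon", "Pua", "Yao", "Yang", "Sa", "Wa", "Haeng"]

# Lower triangle of Table 4-3: pairwise FST, row i against columns 0..i-1.
FST_LOWER = [
    [],
    [0.003],
    [0.020, 0.013],
    [0.018, 0.016, 0.017],
    [0.019, 0.016, 0.0156, 0.022],
    [0.034, 0.029, 0.026, 0.003, 0.029],
    [0.036, 0.028, 0.026, 0.044, 0.041, 0.053],
    [0.021, 0.021, 0.020, 0.021, 0.015, 0.029, 0.051],
]
# Upper triangle of Table 4-3: distance (km), row i against columns i+1..7.
DIST_UPPER = [
    [22.48, 49.14, 100.52, 70.81, 181.50, 203.18, 210.96],
    [44.60, 95.98, 66.27, 176.96, 198.64, 206.42],
    [87.16, 57.45, 168.14, 189.82, 197.60],
    [88.32, 199.01, 220.69, 228.47],
    [156.67, 178.35, 186.13],
    [113.04, 120.82],
    [135.64],
    [],
]


def build_matrices():
    """Return symmetric matrices of linearized FST and log10 distance."""
    n = len(SAMPLES)
    gen = [[0.0] * n for _ in range(n)]
    geo = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):
            fst = FST_LOWER[i][j]
            gen[i][j] = gen[j][i] = fst / (1.0 - fst)
            geo[i][j] = geo[j][i] = math.log10(DIST_UPPER[j][i - j - 1])
    return gen, geo


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper(mat, order):
    """Upper-triangle values of mat after relabelling samples by `order`."""
    n = len(order)
    return [mat[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(gen, geo, n_perm=999, seed=1):
    """One-sided Mantel test: permute the labels of the genetic matrix."""
    rng = random.Random(seed)
    identity = list(range(len(gen)))
    geo_vec = upper(geo, identity)
    r_obs = pearson(upper(gen, identity), geo_vec)
    hits = 0
    for _ in range(n_perm):
        order = identity[:]
        rng.shuffle(order)
        if pearson(upper(gen, order), geo_vec) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)


gen, geo = build_matrices()
r, p = mantel(gen, geo)
print(f"Mantel r = {r:.2f}, one-sided P = {p:.3f}")
```

Permuting whole sample labels, rather than individual pairwise values, preserves the dependence among distances that share a sample, which is the reason a Mantel test is used instead of an ordinary correlation test.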


Table 4-5 Pearson correlations between landscape characteristics in the upper Nan River and allelic richness within G. cambodgiensis population samples.

Landscape characteristics               Pearson correlation coefficient   P-value (2-tailed)
Elevation                               0.467                             0.243
Stream order                            -0.811                            0.015
Distance from the Nan River main-stem   -0.523                            0.183
Number of barriers within sub-basin     0.408                             0.315
% Forest                                -0.512                            0.194
% Agriculture                           0.065                             0.878
% Paddy field                           0.492                             0.216
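The significance pattern in Table 4-5 can be cross-checked by converting each correlation coefficient to a t statistic, t = r * sqrt((n - 2) / (1 - r^2)). The Python sketch below assumes n = 8 (one allelic richness value per Tier I sample) and compares |t| with 2.447, the two-tailed 5% critical value of Student's t with 6 degrees of freedom; the sample size is an assumption inferred from the study design rather than stated in the table.

```python
import math

# Pearson coefficients from Table 4-5 (allelic richness vs. landscape attributes).
CORRELATIONS = {
    "Elevation": 0.467,
    "Stream order": -0.811,
    "Distance from the Nan River main-stem": -0.523,
    "Number of barriers within sub-basin": 0.408,
    "% Forest": -0.512,
    "% Agriculture": 0.065,
    "% Paddy field": 0.492,
}

N_SAMPLES = 8       # assumption: eight Tier I population samples
T_CRITICAL = 2.447  # two-tailed 5% critical value, Student's t, 6 d.f.


def t_statistic(r: float, n: int) -> float:
    """t statistic for testing a Pearson correlation against zero."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


significant = []
for name, r in CORRELATIONS.items():
    t = t_statistic(r, N_SAMPLES)
    if abs(t) > T_CRITICAL:
        significant.append(name)
    print(f"{name}: r = {r:+.3f}, t = {t:+.2f}")

print("Significant at the 5% level:", significant)
```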

Table 4-6 Multiple regression on distance matrices (MRM) for explaining linearized pairwise FST among G. cambodgiensis population samples.

Models        K  ∆AICc  AICcwi  Coefficients       MRM
                                LogDIST  STO       R2      P
LogDIST       3  0.0    0.44    0.0068   -         0.0198  0.15
LogDIST+STO   4  1.8    0.18    0.0068   0.0013    0.0331  0.19

Note The models reported are those with the ∆AICc values <2 and AICcwi >0.1; K is the number of parameters in each model. Coefficients are for each explanatory variable included in the model (log DIST = log transformed stream distance; STO = differences in stream order between sites). MRM R2 is the amount of variation explained by the model using MRM analysis with P-values for each model based on 1000 permutations.
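The Akaike weights in Table 4-6 follow from the ∆AICc values: each model's relative likelihood is exp(-∆AICc/2), and the weights are these likelihoods normalized over the full candidate set. Because only the two best models are listed, the Python sketch below checks the ratio of the two reported weights rather than their absolute values. It assumes the third column of the table reports ∆AICc; the function name is illustrative.

```python
import math


def akaike_weights(delta_aicc):
    """Normalize relative likelihoods exp(-delta/2) over the models supplied."""
    rel = [math.exp(-0.5 * d) for d in delta_aicc]
    total = sum(rel)
    return [value / total for value in rel]


# Delta AICc of the two models listed in Table 4-6 (LogDIST, LogDIST+STO).
deltas = [0.0, 1.8]
weights = akaike_weights(deltas)

# The weight ratio depends only on the difference in delta AICc.
ratio = weights[1] / weights[0]
implied_second_weight = 0.44 * ratio
print(f"weight ratio = {ratio:.3f}")
print(f"implied weight of LogDIST+STO = {implied_second_weight:.2f}")
```

The implied weight agrees with the tabulated 0.18, which suggests the two reported weights are internally consistent; the remaining weight belongs to candidate models not shown in the table.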


Figure 4-4 MEMGENE analysis for eight population samples of G. cambodgiensis in the upper Nan River basin.

Note Circles of a similar size and color suggest individuals with similar scores (large black and large white circles describe opposite extremes on the MEMGENE axes). (A) MEMGENE axis 1 explains 57.8% of the variability and (B) MEMGENE axis 2 explains 42.2%.

Tier II(1): Population genetic structure of G. cambodgiensis in the Nam Wa sub-basin

1. Genetic variation within samples

All microsatellite loci were polymorphic in all population samples (Table 4-7). HOLN had the highest allelic diversity and Gar8 the lowest. The average number of alleles per locus across samples ranged from 3.80±0.40 (Gar8) to 10.60±1.74 (HOLN), the average effective number of alleles per locus ranged from 2.16±0.33 (Gar9) to 7.12±1.32 (HOLN) and the average allelic richness ranged from 3.68±0.35 (Gar8) to 10.09±1.58 (HOLN). The expected heterozygosities averaged across samples ranged from 0.53±0.07 (Gar9) to 0.85±0.03 (HOLN). As in Tier I, Micro-Checker analysis suggested the presence of null alleles at some loci, with average frequencies ranging from 0.01±0.03 (Gar8) to 0.21±0.03 (HOLN). For a given locus, however, the null allele frequencies were not consistent across population samples. After adjusting the allele frequencies of each sample for null alleles (where present), there was only a slight change in pairwise FST values. All loci were therefore used for further analysis.

Within-sample genetic diversity was comparable across samples, although SP had slightly lower allelic diversity than the other samples (Mann-Whitney U test, P > 0.05). For allelic diversity, the average number of alleles per locus (A) ranged from 7.00±2.10 (SP) to 8.40±2.69 (Pha), the effective number of alleles per locus (Ae) ranged from 3.97±1.45 (SP) to 5.05±1.74 (Pha), and allelic richness ranged from 6.56±1.84 (SP) to 7.80±2.38 (Pha). Heterozygosity values were comparable across all samples, with observed heterozygosity (Ho) ranging from 0.59±0.11 (Pha) to 0.65±0.18 (NS) and expected heterozygosity (He) ranging from 0.71±0.12 (SP) to 0.77±0.09 (Pha). Based on stream hierarchy, genetic diversity did not differ significantly between the upper and lower Wa River or between sites above and below the Nam Wa dam (Mann-Whitney U test, P > 0.05). Of the 50 sample-locus cases (5 samples x 10 loci), 10 showed significant deviations from HWE (P < 0.001, after Bonferroni correction = 0.05/50). All of the deviations were heterozygote deficiencies (Ho < He) (Table 4-7). Based on the Pearson correlation, allelic richness within samples was not correlated with stream order, elevation or land use (P > 0.05).

Table 4-7 Average allelic variability at 10 microsatellite loci of five G. cambodgiensis samples in the Wa River, Nam Wa sub-basin.

Samples  Locus  N            A           Ae         Ar          Ho         He         Fis         Null allele frequency
SP       Gar3   30           6           3.75       5.40        0.60       0.73       0.18        0.08
         Gar6   29           9           3.10       7.79        0.55       0.68       0.19        0.04
         Gar8   30           4           3.01       3.98        0.90       0.67       -0.35       0.00
         Gar9   29           3           2.18       2.98        0.52       0.54       0.04        0.00
         Gar13  30           6           1.81       5.84        0.40       0.45       0.10        0.00
         GC187  30           8           6.79       7.97        0.70       0.85       0.18        0.08
         GC203  30           8           4.31       7.37        0.80       0.77       -0.04       0.00
         HOLN   30           8           4.79       7.65        0.40       0.79       0.49        0.22
         JQSO   30           8           5.52       7.59        0.80       0.82       0.02        0.02
         Sa197  29           10          4.40       9.02        0.48       0.77       0.38        0.17
         Ave.   29.70±0.46   7.00±2.10   3.97±1.45  6.56±1.84   0.62±0.17  0.71±0.12  0.12±0.22   0.06±0.07


Table 4-7 (Continued)

Samples  Locus  N            A           Ae         Ar          Ho         He         Fis         Null allele frequency
Pha      Gar3   27           8           4.15       7.32        0.56       0.76       0.27        0.10
         Gar6   30           11          4.07       9.58        0.63       0.75       0.16        0.10
         Gar8   30           3           2.82       3.00        0.53       0.65       0.17        0.07
         Gar9   28           6           2.38       5.19        0.50       0.58       0.14        0.03
         Gar13  28           7           4.96       6.93        0.54       0.80       0.33        0.14
         GC187  30           9           5.68       8.29        0.50       0.82       0.39        0.19
         GC203  30           11          7.29       10.20       0.83       0.86       0.03        0.00
         HOLN   30           13          7.96       11.89       0.43       0.87       0.50        0.23
         JQSO   30           8           6.55       7.91        0.73       0.85       0.13        0.06
         Sa197  30           8           4.69       7.67        0.60       0.79       0.24        0.12
         Ave.   29.30±1.10   8.40±2.69   5.05±1.74  7.80±2.38   0.59±0.11  0.77±0.09  0.24±0.13   0.10±0.07


Table 4-7 (Continued)

Samples  Locus  N            A           Ae         Ar          Ho         He         Fis         Null allele frequency
NW       Gar3   30           8           4.95       7.58        0.63       0.80       0.21        0.09
         Gar6   30           8           3.27       6.99        0.60       0.69       0.14        0.08
         Gar8   30           4           2.43       3.70        0.67       0.59       -0.13       0.00
         Gar9   26           8           2.66       7.04        0.69       0.62       -0.11       0.00
         Gar13  29           6           2.29       5.67        0.52       0.56       0.08        0.05
         GC187  30           8           4.59       7.58        0.73       0.78       0.06        0.03
         GC203  30           11          7.35       10.07       0.77       0.86       0.11        0.06
         HOLN   30           10          6.52       9.30        0.57       0.85       0.33        0.15
         JQSO   30           8           6.45       7.91        0.60       0.85       0.29        0.14
         Sa197  30           10          5.49       9.28        0.57       0.82       0.31        0.13
         Ave.   29.50±1.20   8.10±1.92   4.60±1.76  7.51±1.77   0.63±0.08  0.74±0.11  0.13±0.15   0.07±0.05


Table 4-7 (Continued)

Samples  Locus  N            A           Ae         Ar          Ho         He         Fis         Null allele frequency
HR       Gar3   23           8           4.92       7.74        0.78       0.80       0.02        0.00
         Gar6   23           9           3.71       8.56        0.61       0.73       0.17        0.11
         Gar8   23           4           3.12       3.91        0.83       0.68       -0.22       0.00
         Gar9   23           2           1.83       2.00        0.52       0.45       -0.15       0.00
         Gar13  23           6           3.49       5.99        0.52       0.71       0.27        0.09
         GC187  23           8           5.37       7.98        0.70       0.81       0.15        0.08
         GC203  22           11          4.17       10.95       0.41       0.76       0.46        0.18
         HOLN   23           10          8.08       9.90        0.48       0.88       0.45        0.22
         JQSO   23           9           5.21       8.82        0.74       0.81       0.09        0.04
         Sa197  21           8           5.04       8.00        0.57       0.80       0.29        0.13
         Ave.   22.70±0.64   7.50±2.62   4.49±1.59  7.39±2.58   0.62±0.13  0.74±0.11  0.15±0.22   0.08±0.07


Table 4-7 (Continued)

Samples  Locus  N            A           Ae         Ar          Ho         He         Fis         Null allele frequency
NS       Gar3   26           6           3.95       5.81        0.54       0.75       0.28        0.12
         Gar6   26           11          3.54       10.15       0.81       0.72       -0.13       0.00
         Gar8   26           4           2.78       3.81        0.85       0.64       -0.32       0.00
         Gar9   26           3           1.78       2.81        0.31       0.44       0.30        0.10
         Gar13  26           6           4.23       5.93        0.77       0.76       -0.01       0.00
         GC187  26           7           4.28       6.77        0.50       0.77       0.35        0.15
         GC203  26           8           5.32       7.62        0.85       0.81       -0.04       0.00
         HOLN   26           12          8.24       11.72       0.42       0.88       0.52        0.25
         JQSO   26           9           5.34       8.58        0.73       0.81       0.10        0.00
         Sa197  24           11          4.30       10.35       0.71       0.77       0.08        0.03
         Ave.   25.80±0.60   7.70±2.90   4.38±1.64  7.35±2.75   0.65±0.18  0.73±0.12  0.11±0.24   0.06±0.08
All      Ave.   27.40±2.88   7.74±2.52   4.50±1.68  7.32±2.33   0.62±0.14  0.74±0.11  0.15±0.20   0.08±0.07
samples


Table 4-7 (Continued)

Samples  Locus  N            A           Ae         Ar          Ho         He         Fis         Null allele frequency
Each     Gar3   27.20±2.64   7.20±0.98   4.34±0.50  6.77±0.97   0.62±0.09  0.77±0.03  0.19±0.09   0.08±0.04
locus    Gar6   27.60±2.73   9.60±1.20   3.54±0.34  8.61±1.15   0.64±0.09  0.71±0.03  0.10±0.12   0.06±0.04
         Gar8   27.80±2.86   3.80±0.40   2.83±0.24  3.68±0.35   0.75±0.14  0.64±0.03  -0.17±0.19  0.01±0.03
         Gar9   26.40±2.06   4.40±2.24   2.16±0.33  4.00±1.85   0.51±0.12  0.53±0.07  0.04±0.16   0.03±0.04
         Gar13  27.20±2.48   6.20±0.40   3.36±1.17  6.07±0.44   0.55±0.12  0.66±0.13  0.16±0.12   0.06±0.06
         GC187  27.80±2.86   8.00±0.63   5.34±0.88  7.72±0.52   0.63±0.10  0.81±0.03  0.23±0.13   0.10±0.06
         GC203  27.60±3.20   9.80±1.47   5.69±1.39  9.24±1.46   0.73±0.16  0.81±0.04  0.10±0.19   0.05±0.07
         HOLN   27.80±2.86   10.60±1.74  7.12±1.32  10.09±1.58  0.46±0.06  0.85±0.03  0.46±0.07   0.21±0.03
         JQSO   27.80±2.86   8.40±0.49   5.81±0.57  8.16±0.46   0.72±0.07  0.83±0.02  0.13±0.09   0.05±0.05
         Sa197  26.80±3.66   9.40±1.20   4.78±0.44  8.86±0.96   0.59±0.07  0.79±0.02  0.26±0.10   0.12±0.05

Note The indices included the sample size (N), number of alleles per locus (A), effective number of alleles (Ae), allelic richness (Ar), observed heterozygosity (Ho), expected heterozygosity (He), fixation index (Fis) and estimated null allele frequencies. Fis values and probability of significant deviation from Hardy-Weinberg equilibrium (P) are given for each population and locus. Values underlined indicate statistical significance (P < 0.001, after Bonferroni correction = 0.05/50).


2. Effective population size and evidence of recent bottlenecks

All population samples had large effective population size estimates (Ne) (Table 4-8). There was no detectable evidence for bottleneck events in the recent history of these populations (Table 4-8). No significant heterozygote excess was detected in any of the samples under the two-phase mutation model (TPM). Additionally, the mode-shift test showed a normal L-shaped distribution pattern of the allele frequencies in all samples.

Table 4-8 Estimates and 95% confidence intervals of contemporary effective population size (Ne) based on linkage disequilibrium and the detection of bottlenecks based on Wilcoxon’s test for five population samples at 10 microsatellite loci.

         Effective population size              Bottleneck test
Samples  Ne        Lower bound  Upper bound     TPM (P-value)
SP       Infinite  113.9        Infinite        0.922
Pha      Infinite  151.0        Infinite        0.695
NW       Infinite  198.3        Infinite        0.846
HR       Infinite  95.3         Infinite        0.625
NS       Infinite  198.7        Infinite        0.557

3. Genetic differentiation among samples and population genetic structure

Both conventional and Bayesian genetic analyses suggested genetic divergence among the five samples. Global FST estimated by the AMOVA framework was 0.029 (P < 0.01). In the hierarchical analysis, variation between river segments and among samples within segments contributed 1% and 4% of the total variation, respectively. The variation between locations above and below the dam was not significant. The pairwise FST values ranged from 0.016 (NW and NS) to 0.050 (SP and HR) (Table 4-9). SP was genetically different from all other samples. Only three of the 10 pairwise comparisons were not significant: Pha-NS, NW-HR and NW-NS (P > 0.005; P-value after adjusting for multiple comparisons = 0.05/10).

The UPGMA dendrogram based on Nei’s genetic distance also suggested that SP was the most divergent from the other samples (bootstrap value = 100). The remaining samples were genetically similar (bootstrap value = 36.6 to 53.3) (Figure 4-5). The Bayesian clustering algorithm implemented in STRUCTURE suggested that the number of clusters was K = 3 (Figure 4-6), but the bar plot of membership coefficients did not reveal any spatial pattern. Similarly, TESS analysis did not detect genetic divergence among samples.

Spatial genetic variation among the five samples along the Wa River could be partially explained by pairwise geographic distance (log geographic distance) and elevation (Table 4-10). The isolation by distance (IBD) model could partially explain the genetic divergence among all samples except SP. For the four remaining samples, a significant correlation was detected between log (M) (Slatkin's (1993) measure of similarity, M) and log (geographic distance) (R = 0.59, P < 0.043, Mantel test). In contrast to the results at the basin level, genetic differentiation between populations from the Wa River could not be explained by the contemporary pattern of stream connectivity (STREAMTREE analysis, R2 = 0.19). Furthermore, based on the AICc and wi criteria, a regression model identified pairwise elevation differences and log (geographic distance) as important explanatory variables in spatial genetic variation (R2 = 0.42, P-value < 0.01) (Table 4-10).
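The transformation behind the isolation-by-distance test can be illustrated with a short Python sketch. It assumes the usual form of Slatkin's (1993) similarity measure, M = (1/FST - 1)/4, and applies it to the six pairs among Pha, NW, HR and NS from Table 4-9; it only prepares the log-transformed values and does not reproduce the Mantel test itself. The function and variable names are illustrative.

```python
import math


def slatkin_m(fst: float) -> float:
    """Slatkin's similarity measure, assumed here as M = (1/FST - 1) / 4."""
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 4.0


# (pairwise FST, distance in km) for the four samples other than SP (Table 4-9).
PAIRS = {
    ("Pha", "NW"): (0.021, 66.87),
    ("Pha", "HR"): (0.022, 91.32),
    ("Pha", "NS"): (0.016, 107.82),
    ("NW", "HR"): (0.022, 24.45),
    ("NW", "NS"): (0.016, 40.95),
    ("HR", "NS"): (0.028, 16.50),
}

rows = []
for (a, b), (fst, km) in PAIRS.items():
    log_m = math.log10(slatkin_m(fst))
    log_d = math.log10(km)
    rows.append((a, b, log_m, log_d))
    print(f"{a}-{b}: log10(M) = {log_m:.3f}, log10(distance) = {log_d:.3f}")
```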


Table 4-9 Pairwise FST values (lower diagonal) and geographic distance (km) (upper diagonal) among five G. cambodgiensis population samples in the Wa River, Nam Wa sub-basin, Thailand.

       SP      Pha     NW      HR      NS
SP     -       30      99.87   121.32  137.82
Pha    0.039   -       66.87   91.32   107.82
NW     0.029   0.021   -       24.45   40.95
HR     0.050   0.022   0.022   -       16.50
NS     0.047   0.016   0.016   0.028   -

Note Values underlined indicate statistical significance (P < 0.005, P-values adjusted for multiple comparisons using Bonferroni correction = 0.05/10)

Figure 4-5 UPGMA dendrogram of five population samples of G. cambodgiensis in the Wa River.

Note The dendrogram is based on Nei’s genetic distance (Nei, 1978) with 1000 bootstrap replicates at 10 microsatellite loci.

Figure 4-6 Bar plot of membership coefficients of G. cambodgiensis in Wa River, Nam Wa sub-basin

Note Individuals were assigned to three genetic clusters identified by the STRUCTURE software. The individual coefficients were grouped by population samples.

Table 4-10 Multiple regression on distance matrices (MRM) for explaining linearized pairwise FST among populations of G. cambodgiensis.

| Models | K | ∆AICc | AICcwi | Coefficient: ELEV | Coefficient: DIST | MRM R2 | P |
|---|---|---|---|---|---|---|---|
| ELEV + logDIST | 4 | 0.35 | 0.35 | 5.12 x 10-5 | -1.84 x 10-2 | 0.42 | 0.01 |

Note Multiple regression on distance matrices (MRM) for explaining linearized pairwise FST among populations of G. cambodgiensis. The models reported are those with ∆AICc values < 2 and AICcwi > 0.1; K is the number of parameters in each model. Coefficients are for each explanatory variable included in the model (ELEV = elevation differences between sites; log DIST = log transformed stream distance). MRM R2 is the amount of variation explained by the model using MRM analysis, with P-values for each model based on 1000 permutations.
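The model selection criteria named in the note (∆AICc < 2 and Akaike weight > 0.1) can be illustrated with a short sketch. The AIC values and candidate models below are hypothetical placeholders, not results from this study, and the choice of n (here the number of pairwise comparisons) is likewise an assumption; only the formulas are standard.

```python
import math


def aicc(aic, k, n):
    """Small-sample corrected AIC: AIC + 2K(K + 1) / (n - K - 1)."""
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


def akaike_weights(aicc_values):
    """Return (delta AICc, Akaike weight) lists for a set of candidate models."""
    best = min(aicc_values)
    deltas = [v - best for v in aicc_values]
    rel = [math.exp(-0.5 * d) for d in deltas]
    total = sum(rel)
    return deltas, [r / total for r in rel]


# Hypothetical candidate models: (name, AIC, K). Values are for illustration only.
CANDIDATES = [("ELEV", -52.0, 3), ("logDIST", -51.5, 3), ("ELEV+logDIST", -53.0, 4)]
N_PAIRS = 10  # assumed sample size: 10 pairwise comparisons among five samples

scores = [aicc(aic, k, N_PAIRS) for _, aic, k in CANDIDATES]
deltas, weights = akaike_weights(scores)

# Keep models that satisfy the thresholds stated in the table note.
retained = [name for (name, _, _), d, w in zip(CANDIDATES, deltas, weights)
            if d < 2 and w > 0.1]
for (name, _, _), d, w in zip(CANDIDATES, deltas, weights):
    print(f"{name}: dAICc = {d:.2f}, weight = {w:.2f}")
print("retained:", retained)
```

The correction term grows quickly as K approaches n, which is why models with more parameters are penalized heavily when only a few pairwise comparisons are available.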

Tier II(2): Effect of physical barrier on genetic structure of G. cambodgiensis in the Wa River

1. Genetic variation within samples

All five microsatellite loci were polymorphic in all population samples. Allelic diversity was comparable among all samples. The average number of alleles per locus ranged from 6.00±1.41 (SWL) to 7.00±0.63 (NKU); the effective number of alleles ranged from 2.70±0.86 (SPU) to 4.26±0.50 (NKU), and allelic richness ranged from 5.96±1.39 (SWL) to 6.91±0.68 (NKU). Average observed heterozygosity (Ho) across the five loci was between 0.36±0.19 (SPU) and 0.63±0.16 (NKL), and expected heterozygosity (He) ranged from 0.59±0.13 (SPU) to 0.76±0.03 (NKU) (Table 4-11).

SPU had significantly lower Ae and He values than NKU, NKL and SPL (P < 0.05). Observed genotype frequencies departed significantly from Hardy-Weinberg equilibrium (HWE) in 19 of the 30 tests (six samples x five loci); all of the deviations were heterozygote deficiencies (Ho < He). The Micro-Checker analysis detected the presence of null alleles in most samples, but there was no evidence of stutter products or allelic dropout. The average null allele frequencies ranged from 0.05±0.04 (GC203) to 0.19±0.07 (HOLN) (Table 4-11). However, the frequencies of null alleles at each locus were inconsistent across samples. The apparent presence of null alleles may be a consequence of sampling errors or population properties. In addition, after adjusting the allele frequencies within each sample to account for the presence of null alleles, the FreeNA analysis showed only a slight change in pairwise FST values from the original data set (data not shown).


Table 4-11 Average allelic variability at 5 microsatellite loci of six G. cambodgiensis samples from above and below three barriers in the Wa River.

| Samples | Locus | N | A | Ae | Ar | Ho | He | Fis | Null allele frequency |
|---|---|---|---|---|---|---|---|---|---|
| NKU | GC203 | 35 | 7 | 4.82 | 7.00 | 0.80 | 0.79 | 0.01 | 0.00 |
| | HOLN | 35 | 8 | 4.45 | 7.96 | 0.37 | 0.78 | 0.52 | 0.23 |
| | Gar3 | 35 | 7 | 4.70 | 6.97 | 0.46 | 0.79 | 0.42 | 0.19 |
| | Gar6 | 35 | 7 | 3.71 | 6.82 | 0.69 | 0.73 | 0.06 | 0.06 |
| | Gar13 | 35 | 6 | 3.64 | 5.82 | 0.57 | 0.72 | 0.21 | 0.07 |
| | Ave. | 35.00±0.00 | 7.00±0.63 | 4.26±0.50 | 6.91±0.68 | 0.58±0.15 | 0.76±0.03 | 0.24±0.20 | 0.11±0.09 |
| NKL | GC203 | 31 | 8 | 3.75 | 7.93 | 0.87 | 0.73 | 0.19 | 0.00 |
| | HOLN | 31 | 6 | 4.19 | 6.00 | 0.39 | 0.76 | 0.49 | 0.22 |
| | Gar3 | 31 | 6 | 3.59 | 6.00 | 0.58 | 0.72 | 0.20 | 0.07 |
| | Gar6 | 31 | 7 | 4.04 | 6.87 | 0.61 | 0.75 | 0.19 | 0.08 |
| | Gar13 | 31 | 7 | 4.55 | 7.00 | 0.68 | 0.78 | 0.13 | 0.04 |
| | Ave. | 31±0.00 | 6.80±0.75 | 4.02±0.34 | 6.76±0.72 | 0.63±0.16 | 0.75±0.02 | 0.24±0.13 | 0.08±0.07 |
| Mang Stream | Ave. | 33.00±2.00 | 6.90±0.70 | 4.14±0.44 | 6.84±0.70 | 0.60±0.16 | 0.76±0.03 | 0.24±0.17 | 0.10±0.08 |


Table 4-11 (Continued)

| Samples | Locus | N | A | Ae | Ar | Ho | He | Fis | Null allele frequency |
|---|---|---|---|---|---|---|---|---|---|
| SPU | GC203 | 34 | 8 | 3.22 | 7.98 | 0.50 | 0.69 | 0.28 | 0.11 |
| | HOLN | 34 | 5 | 1.61 | 4.98 | 0.06 | 0.38 | 0.85 | 0.26 |
| | Gar3 | 34 | 5 | 2.17 | 4.98 | 0.26 | 0.54 | 0.51 | 0.19 |
| | Gar6 | 34 | 7 | 2.42 | 6.84 | 0.41 | 0.59 | 0.30 | 0.11 |
| | Gar13 | 34 | 6 | 4.07 | 5.84 | 0.59 | 0.75 | 0.22 | 0.10 |
| | Ave. | 34±0.00 | 6.20±1.17 | 2.70±0.86 | 6.12±1.15 | 0.36±0.19 | 0.59±0.13 | 0.43±0.23 | 0.15±0.06 |
| SPL | GC203 | 29 | 10 | 5.07 | 10.00 | 0.72 | 0.80 | 0.10 | 0.05 |
| | HOLN | 29 | 5 | 2.93 | 5.00 | 0.48 | 0.66 | 0.27 | 0.07 |
| | Gar3 | 29 | 5 | 4.13 | 5.00 | 0.31 | 0.76 | 0.60 | 0.25 |
| | Gar6 | 29 | 6 | 4.22 | 6.00 | 0.72 | 0.76 | 0.05 | 0.04 |
| | Gar13 | 29 | 7 | 3.94 | 7.00 | 0.34 | 0.75 | 0.54 | 0.23 |
| | Ave. | 29±0.00 | 6.60±1.85 | 4.06±0.68 | 6.60±1.85 | 0.52±0.18 | 0.75±0.05 | 0.31±0.23 | 0.13±0.09 |
| Pun Stream | Ave. | 31.50±2.50 | 6.40±1.56 | 3.38±1.03 | 6.36±1.56 | 0.44±0.20 | 0.67±0.12 | 0.37±0.24 | 0.14±0.08 |


Table 4-11 (Continued)

| Samples | Locus | N | A | Ae | Ar | Ho | He | Fis | Null allele frequency |
|---|---|---|---|---|---|---|---|---|---|
| SWU | GC203 | 31 | 7 | 4.78 | 6.93 | 0.61 | 0.79 | 0.23 | 0.10 |
| | HOLN | 31 | 5 | 4.39 | 5.00 | 0.35 | 0.77 | 0.54 | 0.24 |
| | Gar3 | 31 | 5 | 3.45 | 5.00 | 0.55 | 0.71 | 0.23 | 0.05 |
| | Gar6 | 31 | 7 | 2.77 | 6.87 | 0.58 | 0.64 | 0.09 | 0.07 |
| | Gar13 | 31 | 8 | 3.18 | 7.94 | 0.68 | 0.69 | 0.01 | 0.00 |
| | Ave. | 31±0.00 | 6.40±1.20 | 3.71±0.75 | 6.35±1.16 | 0.55±0.11 | 0.72±0.05 | 0.22±0.18 | 0.09±0.08 |
| SWL | GC203 | 31 | 8 | 4.79 | 7.93 | 0.77 | 0.79 | 0.02 | 0.02 |
| | HOLN | 31 | 7 | 4.64 | 6.97 | 0.55 | 0.78 | 0.30 | 0.14 |
| | Gar3 | 31 | 4 | 3.07 | 4.00 | 0.68 | 0.67 | 0.00 | 0.00 |
| | Gar6 | 31 | 5 | 1.41 | 5.00 | 0.29 | 0.29 | 0.00 | 0.00 |
| | Gar13 | 31 | 6 | 2.95 | 5.90 | 0.61 | 0.66 | 0.07 | 0.03 |
| | Ave. | 31±0.00 | 6.00±1.41 | 3.37±1.24 | 5.96±1.39 | 0.58±0.16 | 0.64±0.18 | 0.08±0.11 | 0.04±0.05 |
| Wa Stream | Ave. | 31.00±0.00 | 6.20±1.33 | 3.54±1.04 | 6.15±1.30 | 0.57±0.14 | 0.68±0.14 | 0.15±0.17 | 0.07±0.07 |
| All pop | Ave. | 31.83±2.03 | 6.50±1.28 | 3.69±0.94 | 6.45±1.27 | 0.54±0.18 | 0.70±0.12 | 0.26±0.21 | 0.10±0.09 |


Table 4-11 (Continued)

| Samples | Locus | N | A | Ae | Ar | Ho | He | Fis | Null allele frequency |
|---|---|---|---|---|---|---|---|---|---|
| Each locus | GC203 | 31.83±2.03 | 8.00±1.00 | 4.40±0.67 | 7.96±1.01 | 0.71±0.12 | 0.77±0.04 | 0.14±0.10 | 0.05±0.04 |
| | HOLN | 31.83±2.03 | 6.00±1.15 | 3.70±1.09 | 5.99±1.14 | 0.37±0.15 | 0.69±0.14 | 0.50±0.19 | 0.19±0.07 |
| | Gar3 | 31.83±2.03 | 5.33±0.94 | 3.52±0.80 | 5.33±0.94 | 0.47±0.15 | 0.70±0.08 | 0.33±0.20 | 0.13±0.09 |
| | Gar6 | 31.83±2.03 | 6.50±0.76 | 3.09±0.99 | 6.40±0.70 | 0.55±0.15 | 0.63±0.16 | 0.10±0.10 | 0.06±0.03 |
| | Gar13 | 32.40±1.74 | 6.67±0.75 | 3.72±0.54 | 6.58±0.79 | 0.58±0.11 | 0.73±0.04 | 0.20±0.17 | 0.08±0.07 |

Note The indices included the number of alleles per locus (A), effective number of alleles (Ae), allelic richness (Ar), observed heterozygosity (Ho), expected heterozygosity (He), fixation index (Fis) and estimated null allele frequencies. Fis values and probability of significant deviation from Hardy-Weinberg equilibrium (P) are given for each population and locus. Values underlined indicate statistical significance, P < 0.0017, after Bonferroni correction = 0.05/30
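The relationship between the diversity indices in Table 4-11 can be sketched with their textbook definitions. This is an illustrative sketch only: the function names and the allele frequencies are hypothetical, and the software used in the study may apply sample-size corrections, so tabulated values need not match these simple formulas exactly.

```python
def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for allele frequencies p_i at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def effective_alleles(freqs):
    """Ae = 1 / sum(p_i^2): number of equally frequent alleles giving the same He."""
    return 1.0 / sum(p * p for p in freqs)


def fis(ho, he):
    """Fixation index Fis = 1 - Ho/He; positive values mean heterozygote deficiency."""
    return 1.0 - ho / he


# Ho and He for locus HOLN in sample NKU, as printed in Table 4-11.
fis_holn_nku = fis(0.37, 0.78)
print(f"Fis (HOLN, NKU) from rounded Ho and He: {fis_holn_nku:.2f}")

# Hypothetical allele frequencies, used only to show He and Ae side by side.
example_freqs = [0.4, 0.3, 0.2, 0.1]
print(f"He = {expected_heterozygosity(example_freqs):.2f}, "
      f"Ae = {effective_alleles(example_freqs):.2f}")
```

For HOLN in NKU the rounded inputs give a value close to the tabulated Fis of 0.52; a small difference is expected because Ho and He are printed to two decimals.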


I therefore used all loci for further analyses.

2. Effective population size and evidence of recent bottlenecks

The SPU and SPL samples had low effective population sizes of 3.4 (95% CI = 2.6-5.8) and 8.7 (95% CI = 5.1-14.2), respectively, based on the linkage disequilibrium approach (Table 4-12). Other samples had comparable estimates of a very large population size. Ne estimates based on sibships suggested comparable values across all samples. The BOTTLENECK analysis suggested significant heterozygote excess in one sample, NKU, under the two-phase mutation model (TPM) (P-value = 0.031). For the remaining samples, the mode-shift test showed a normal L-shaped distribution of allele frequencies. The results implied the lack of bottleneck events in the recent history of most populations (Table 4-12).

Table 4-12 Estimates and 95% confidence intervals of contemporary effective population size (Ne) based on linkage disequilibrium and sibship approaches and the detection of bottlenecks based on Wilcoxon’s test for six population samples at 5 microsatellite loci.

| Samples | Ne (linkage disequilibrium) | 95% CI lower bound | 95% CI upper bound | NeS (sibships) | 95% CI lower estimate | 95% CI upper estimate | Bottleneck test, TPM (P-value) |
|---|---|---|---|---|---|---|---|
| NKU | 135.8 | 39.9 | Infinite | 25 | 14 | 24 | 0.031 |
| NKL | Infinite | 62.9 | Infinite | 24 | 14 | 47 | 0.063 |
| SPU | 3.4 | 2.6 | 5.8 | 17 | 9 | 34 | 0.156 |
| SPL | 8.7 | 5.1 | 14.2 | 21 | 12 | 41 | 0.063 |
| SWU | 433.3 | 42.2 | Infinite | 22 | 12 | 42 | 0.625 |
| SWL | Infinite | 106.8 | Infinite | 20 | 11 | 38 | 0.813 |

3. Genetic differentiation among samples and population genetic structure

AMOVA revealed significant genetic structure among streams and among groups within streams (upstream and downstream of the barriers). The genetic variation partitioned among streams and among groups within streams was 9% and 5% of the total variation, respectively. Moreover, the AMOVA for three hierarchical levels revealed the genetic variation among streams, among sampling sites within streams, and within sites to be 13%, 7% and 80%, respectively (Table 4-13).

Pairwise FST values were statistically significant for all sample pairs, with the values ranging from 0.02 (NKU-NKL in Mang Stream) to 0.31 (SPU-SWL in Pun and Wa streams, respectively). SPU appeared to be the most genetically distinct from samples from other streams (FST = 0.16-0.31). Pairwise FST values among the three streams suggested the greatest difference between Pun and Wa Streams (0.18) (Table 4-14).

Table 4-13 Genetic variation within and among G. cambodgiensis population samples from three streams in the Wa River based on analysis of molecular variance (AMOVA).

| Source of variation | d.f. | Sum of Squares | Mean Square | Estimated Variance | % |
|---|---|---|---|---|---|
| Among streams | 2 | 128.952 | 64.476 | 0.743 | 13 (P < 0.01) |
| Among sampling sites within streams | 3 | 51.261 | 17.087 | 0.398 | 7 (P < 0.01) |
| Within sites | 185 | 826.510 | 4.468 | 4.468 | 80 (P < 0.01) |
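The percentage column of Table 4-13 follows directly from the estimated variance components, and the mean squares follow from the sums of squares and degrees of freedom. The short check below uses only the values printed in the table; the variable names are illustrative.

```python
# Rows of Table 4-13: (source, d.f., sum of squares, estimated variance component).
ROWS = [
    ("Among streams", 2, 128.952, 0.743),
    ("Among sampling sites within streams", 3, 51.261, 0.398),
    ("Within sites", 185, 826.510, 4.468),
]

# Each level's share is its variance component divided by the total variance.
total_variance = sum(var for _, _, _, var in ROWS)
percent = [100.0 * var / total_variance for _, _, _, var in ROWS]
rounded_percent = [round(p) for p in percent]

# Mean square = sum of squares / degrees of freedom.
mean_squares = [round(ss / df, 3) for _, df, ss, _ in ROWS]

for (source, _, _, _), pct, ms in zip(ROWS, percent, mean_squares):
    print(f"{source}: {pct:.1f}% of total variance, mean square = {ms}")
```

Rounded to whole numbers, the shares are 13%, 7% and 80%, matching the table.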


Table 4-14 Pairwise FST values among six G. cambodgiensis population samples and among three streams.

| | NKU | NKL | SPU | SPL | SWU | SWL |
|---|---|---|---|---|---|---|
| NKU | - | | | | | |
| NKL | 0.024 | - | | | | |
| SPU | 0.201 | 0.159 | - | | | |
| SPL | 0.046 | 0.055 | 0.097 | - | | |
| SWU | 0.059 | 0.074 | 0.254 | 0.093 | - | |
| SWL | 0.106 | 0.133 | 0.307 | 0.139 | 0.051 | - |

| | Mang Stream | Pun Stream | Wa Stream |
|---|---|---|---|
| Mang Stream | 0 | | |
| Pun Stream | 0.094 | 0 | |
| Wa Stream | 0.075 | 0.176 | 0 |

Note Values underlined indicate statistical significance (P < 0.0033, P-values adjusted for multiple comparisons using Bonferroni correction = 0.05/15, 6 population samples; adjusted P < 0.0167 for 3 stream pair comparisons)

Similarly, a UPGMA dendrogram based on Nei’s genetic distance suggested that SPU was the most genetically distant from the remaining samples. The two samples from the same stream were closely related. Samples from different streams were genetically different, with the exception of SPL, which clustered together with the NK samples (Figure 4-7).

Figure 4-7 UPGMA dendrogram of six population samples of G. cambodgiensis among samples obtained from areas above and below physical barriers in the Pun, Mang and Wa Streams in the Wa River.

Note UPGMA dendrogram of six population samples of G. cambodgiensis among samples obtained from areas above and below physical barriers in the Pun, Mang and Wa Streams in the Wa River, Nam Wa sub-basin based on Nei’s genetic distance (Nei, 1978) with 1000 bootstrap replicates at 5 microsatellite loci.

The Bayesian clustering algorithm implemented in STRUCTURE suggested at least two possible genetic clusters: the sample above Sapun Waterfall (SPU) in Pun Stream, and the remaining samples (Figure 4-8). At K = 2, the results did not reveal a genetic difference between Pun and Mang Streams. For K = 3, however, the STRUCTURE results agreed with the previously described analyses and suggested some differentiation among streams. The distribution of membership coefficients arranged by population samples suggested a shared ancestry between the samples above and below Sapun Waterfall (SPU and SPL). Also, there were some admixed individuals in the NK, SW and SPL samples. SPL contained genetic material from all of the two or three genetic clusters. SWU and SWL also contained genetic contributions from NKU and NKL, and vice versa.

Figure 4-8 Bar plot of membership coefficients of G. cambodgiensis in Wa River.

Note Bar plot of membership coefficients of G. cambodgiensis in Wa River, individuals generated from a Bayesian clustering algorithm in the software STRUCTURE. The individuals are grouped by sampling locations.

A combined data set for the entire upper Nan River basin

The analysis of the combined data set (basin and sub-basin levels), excluding PH8A, suggested a consistent pattern of genetic diversity. The average allelic diversity of the upper Nan River populations was higher than that of the Nam Wa sub-basin populations (Table 4-15). There was no evidence of recent bottlenecks in any population (Table 4-16). Similar to the findings in sections Tier I and Tier II, I detected significant deviations from Hardy-Weinberg equilibrium in 46 out of 120 tests (P < 0.00042, after Bonferroni correction = 0.05/120), most of which were heterozygote deficits. Based on the Pearson correlation, allelic richness within samples was not correlated with stream order, elevation or land use (P > 0.05). Differentiation among populations was consistent with the findings for each separate data set (Table 4-17, Figure 4-9). The model-based approach also suggested a spatial pattern of genetic variation, although the STRUCTURE analysis could only detect the difference between the upper Nan River and the Nam Wa sub-basin (K = 2), whereas the TESS analysis yielded a spatial pattern similar to the results previously described in sections Tier I and Tier II (Figure 4-10). The combined data set provided new insight into landscape impacts on population subdivision. The landscape variable that was consistently important across all geographic scales was geographic distance (R = 0.44, P = 0.003 for the combined data set). Stream connectivity and stream order were important variables for the larger scale analyses (R2 = 0.78, the STREAMTREE analysis, for the combined data set; R2 < 0.097, P = 0.001, multiple regression, Table 4-18) and the sub-basin data set (Table 4-10). Some inconsistencies detected in the study reflected the inadequate representation of some key landscape characteristics in each data set.


Table 4-15 Average allelic variability of G. cambodgiensis at 10 microsatellite loci in upper Nan River drainage basin.

| Samples | N | A | Ae | Ar | Ho | He | Fis | HWD | Null allele frequency |
|---|---|---|---|---|---|---|---|---|---|
| Meed | 43.50±1.69 | 10.70±2.69 | 6.59±1.83 | 9.51±2.17 | 0.54±0.13 | 0.83±0.05 | 0.36±0.14 | 7 | 0.17±0.06 |
| Kon | 43.90±2.55 | 11.30±2.87 | 5.79±1.47 | 9.39±1.92 | 0.57±0.16 | 0.81±0.06 | 0.31±0.18 | 5 | 0.14±0.08 |
| Pua | 45.30±1.19 | 12.00±1.95 | 6.68±2.00 | 10.12±1.87 | 0.67±0.16 | 0.82±0.10 | 0.20±0.14 | 3 | 0.09±0.06 |
| Yao | 45.40±1.02 | 11.10±2.47 | 6.10±1.54 | 9.62±1.85 | 0.65±0.12 | 0.82±0.05 | 0.21±0.13 | 2 | 0.10±0.06 |
| Yang | 95.60±5.66 | 13.50±3.67 | 6.41±2.23 | 10.11±2.22 | 0.56±0.16 | 0.82±0.07 | 0.33±0.16 | 11 | 0.15±0.07 |
| Sa | 39.30±2.15 | 9.80±1.72 | 5.77±1.99 | 8.91±1.63 | 0.63±0.16 | 0.79±0.11 | 0.22±0.15 | 3 | 0.09±0.06 |
| Wa | 29.20±0.98 | 9.90±2.43 | 6.07±1.85 | 7.27±1.66 | 0.52±0.17 | 0.81±0.10 | 0.35±0.21 | 1 | 0.06±0.05 |
| Haeng | 40.70±2.05 | 10.10±2.02 | 5.38±1.89 | 9.75±2.46 | 0.64±0.09 | 0.78±0.09 | 0.17±0.12 | 6 | 0.15±0.09 |
| Ave. upper Nan River | 47.86±18.88 | 11.05±2.80 | 6.10±1.91 | 9.34±2.17 | 0.60±0.15 | 0.81±0.08 | 0.27±0.17 | 4.75±3.03 | 0.12±0.07 |


Table 4-15 (Continued)

| Samples | N | A | Ae | Ar | Ho | He | Fis | HWD | Null allele frequency |
|---|---|---|---|---|---|---|---|---|---|
| SP | 29.70±0.46 | 7.00±2.10 | 3.97±1.45 | 6.56±1.84 | 0.62±0.17 | 0.71±0.12 | 0.12±0.22 | 2 | 0.06±0.07 |
| Pha | 29.30±1.10 | 8.40±2.69 | 5.05±1.74 | 7.80±2.38 | 0.59±0.11 | 0.77±0.09 | 0.24±0.13 | 2 | 0.10±0.07 |
| NW | 29.20±0.98 | 9.90±2.43 | 6.07±1.85 | 7.27±1.66 | 0.52±0.17 | 0.81±0.10 | 0.35±0.21 | 1 | 0.06±0.05 |
| HR | 22.70±0.64 | 7.50±2.62 | 4.49±1.59 | 7.39±2.58 | 0.62±0.13 | 0.74±0.11 | 0.15±0.22 | 3 | 0.08±0.07 |
| NS | 25.80±0.60 | 7.70±2.90 | 4.38±1.64 | 7.35±2.75 | 0.65±0.18 | 0.73±0.12 | 0.11±0.24 | 1 | 0.06±0.08 |
| Ave. Nam Wa sub-basin | 27.34±2.83 | 8.10±2.75 | 4.79±1.81 | 7.27±2.31 | 0.60±0.16 | 0.75±0.11 | 0.19±0.22 | 1.80±0.75 | 0.08±0.07 |
| Barrier | | | | | | | | | |
| NKU | 35.00±0.00 | 7.00±0.63 | 4.26±0.50 | 6.91±0.68 | 0.58±0.15 | 0.76±0.03 | 0.24±0.20 | 2 | 0.11±0.09 |
| NKL | 31.00±0.00 | 6.80±0.75 | 4.02±0.34 | 6.76±0.72 | 0.63±0.16 | 0.75±0.02 | 0.24±0.13 | 1 | 0.08±0.07 |
| Ave. Mang Stream | 33.00±2.00 | 6.90±0.70 | 4.14±0.44 | 6.84±0.70 | 0.60±0.16 | 0.76±0.03 | 0.24±0.17 | 1.5±0.5 | 0.10±0.08 |


Table 4-15 (Continued)

| Samples | N | A | Ae | Ar | Ho | He | Fis | HWD | Null allele frequency |
|---|---|---|---|---|---|---|---|---|---|
| SPU | 34.00±0.00 | 6.20±1.17 | 2.70±0.86 | 6.12±1.15 | 0.36±0.19 | 0.59±0.13 | 0.43±0.23 | 5 | 0.15±0.06 |
| SPL | 29.00±0.00 | 6.60±1.85 | 4.06±0.68 | 6.60±1.85 | 0.52±0.18 | 0.75±0.05 | 0.31±0.23 | 5 | 0.13±0.09 |
| Ave. Pun Stream | 31.50±2.50 | 6.40±1.56 | 3.38±1.03 | 6.36±1.56 | 0.44±0.20 | 0.67±0.12 | 0.37±0.24 | 5.0±0.0 | 0.14±0.08 |
| SWU | 31.00±0.00 | 6.40±1.20 | 3.71±0.75 | 6.35±1.16 | 0.55±0.11 | 0.72±0.05 | 0.22±0.18 | 4 | 0.09±0.08 |
| SWL | 31.00±0.00 | 6.00±1.41 | 3.37±1.24 | 5.96±1.39 | 0.58±0.16 | 0.64±0.18 | 0.08±0.11 | 2 | 0.04±0.05 |
| Ave. Wa Stream | 31.00±0.00 | 6.20±1.33 | 3.54±1.04 | 6.15±1.30 | 0.57±0.14 | 0.68±0.14 | 0.15±0.17 | 3.0±1.0 | 0.07±0.07 |

Note Average allelic variability of G. cambodgiensis at 10 microsatellite loci in the upper Nan River drainage basin, and at 5 microsatellite loci for the samples used to assess the impacts of barriers. The indices included the number of alleles per locus (A), effective number of alleles (Ae), allelic richness (Ar), observed heterozygosity (Ho), expected heterozygosity (He), fixation index (Fis), deviations from Hardy-Weinberg equilibrium (HWD) and estimated null allele frequencies.


Table 4-16 Estimates and 95% confidence intervals of contemporary effective population size (Ne) based on linkage disequilibrium and the detection of bottlenecks based on Wilcoxon’s test for twelve population samples at 10 microsatellite loci.

| Samples | Ne | Lower bound | Upper bound | Bottleneck test, TPM (P-value) |
|---|---|---|---|---|
| Meed | 266 | 133.8 | 3,200.1 | 0.193 |
| Kon | 1,269.5 | 242.8 | infinite | 0.432 |
| Pua | 424.6 | 189.3 | infinite | 0.769 |
| Yao | 300 | 151.8 | 3,178.7 | 0.922 |
| Yang | 1,455.9 | 515.7 | infinite | 0.275 |
| Sa | 2,979.0 | 218.5 | infinite | 0.625 |
| Wa | 389.2 | 95.8 | infinite | 1.000 |
| Haeng | infinite | 346.5 | infinite | 0.695 |
| SP | infinite | 113.9 | infinite | 0.846 |
| Pha | infinite | 151.0 | infinite | 0.625 |
| HR | infinite | 95.3 | infinite | 0.557 |
| NS | infinite | 198.7 | infinite | 0.557 |


Table 4-17 Pairwise FST values (lower diagonal) and geographic distance (km) (upper diagonal) among twelve G. cambodgiensis population samples in the upper Nan River drainage basin

| | Meed | Kon | Pua | Yao | Yang | Sa | Wa | Haeng | SP | Pha | HR | NS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Meed | | 22.48 | 49.14 | 100.52 | 70.81 | 181.50 | 203.18 | 210.96 | 303.05 | 270.05 | 227.63 | 244.13 |
| Kon | 0.004 | | 44.60 | 95.98 | 66.27 | 176.96 | 198.64 | 206.42 | 298.51 | 265.51 | 175.19 | 158.69 |
| Pua | 0.013 | 0.008 | | 87.16 | 57.45 | 168.14 | 189.82 | 197.60 | 289.69 | 256.69 | 166.37 | 149.87 |
| Yao | 0.018 | 0.018 | 0.012 | | 88.32 | 199.01 | 220.69 | 228.47 | 320.56 | 287.56 | 197.24 | 180.74 |
| Yang | 0.015 | 0.013 | 0.014 | 0.020 | | 156.67 | 178.35 | 186.13 | 278.22 | 245.22 | 154.90 | 138.40 |
| Sa | 0.034 | 0.029 | 0.018 | 0.003 | 0.026 | | 113.04 | 120.82 | 212.91 | 179.91 | 89.59 | 73.09 |
| Wa | 0.043 | 0.040 | 0.040 | 0.047 | 0.047 | 0.055 | | 135.64 | 99.87 | 66.87 | 24.45 | 40.95 |
| Haeng | 0.021 | 0.018 | 0.008 | 0.020 | 0.009 | 0.029 | 0.057 | | 235.51 | 202.51 | 112.19 | 95.69 |
| SP | 0.062 | 0.058 | 0.059 | 0.066 | 0.061 | 0.075 | 0.025 | 0.087 | | 30.00 | 121.32 | 137.82 |
| Pha | 0.029 | 0.018 | 0.019 | 0.034 | 0.029 | 0.041 | 0.018 | 0.037 | 0.041 | | 91.32 | 107.82 |
| HR | 0.035 | 0.028 | 0.032 | 0.030 | 0.032 | 0.037 | 0.020 | 0.039 | 0.042 | 0.016 | | 16.50 |
| NS | 0.042 | 0.034 | 0.034 | 0.032 | 0.044 | 0.039 | 0.013 | 0.048 | 0.046 | 0.017 | 0.024 | |

Note Values underlined indicate statistical significance (P < 0.00076, P-values adjusted for multiple comparisons using Bonferroni correction = 0.05/66)
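The Bonferroni-adjusted significance thresholds quoted in the table notes of this chapter can be checked with a few lines of arithmetic. For pairwise FST tests among n samples the number of comparisons is n(n − 1)/2; for Hardy-Weinberg tests it is samples x loci. The helper names below are illustrative.

```python
from math import comb


def n_pairs(n_samples):
    """Number of pairwise comparisons among n samples: n(n - 1)/2."""
    return comb(n_samples, 2)


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


ALPHA = 0.05

# (description, number of tests) for the corrections quoted in the table notes.
CASES = [
    ("Pairwise FST, 5 Wa River samples", n_pairs(5)),
    ("Pairwise FST, 6 barrier samples", n_pairs(6)),
    ("Pairwise FST, 3 streams", n_pairs(3)),
    ("Pairwise FST, 12 basin samples", n_pairs(12)),
    ("HWE tests, 6 samples x 5 loci", 6 * 5),
    ("HWE tests, combined data set", 120),
]

for label, m in CASES:
    print(f"{label}: {m} tests, threshold = {bonferroni_alpha(ALPHA, m):.5f}")
```

The thresholds become stricter as the number of tests grows, which is why the 12-sample matrix uses 0.00076 while the five-sample matrix uses 0.005.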


Figure 4-9 UPGMA dendrogram of twelve population samples of G. cambodgiensis in the upper Nan River drainage basin based on Nei’s genetic distance (Nei, 1978) with 1000 bootstrap replicates at 10 microsatellite loci.

Figure 4-10 Bar plot of membership coefficients of G. cambodgiensis in upper Nan River drainage basin (panels A to E).

Note Bar plot of membership coefficients of G. cambodgiensis individuals in the upper Nan River drainage basin, generated from Bayesian clustering algorithms in the software STRUCTURE (A) and TESS (B-E). The individuals are grouped by sampling locations.

Table 4-18 Multiple regression on distance matrices (MRM) for explaining linearized pairwise FST among populations of G. cambodgiensis.

| Models | K | ∆AICc | AICcwi | Coefficient: ELEV | Coefficient: STO | Coefficient: DIST | MRM R2 | P |
|---|---|---|---|---|---|---|---|---|
| ELEV + STO + LogDIST | 5 | 0 | 0.6 | 4.01 x 10-5 | -2.96 x 10-4 | 7.18 x 10-3 | 0.097 | 0.001 |

Note Multiple regression on distance matrices (MRM) for explaining linearized pairwise FST among populations of G. cambodgiensis. The models reported are those with ∆AICc values < 2 and AICcwi > 0.1; K is the number of parameters in each model. Coefficients are for each explanatory variable included in the model (ELEV = elevation differences between sites; STO = differences in stream order between sites; log DIST = log transformed stream distance). MRM R2 is the amount of variation explained by the model using MRM analysis, with P-values for each model based on 1000 permutations.


CHAPTER 5 DISCUSSION AND CONCLUSIONS

Genetic diversity within populations of a tropical stream species, Garra cambodgiensis, in a stream system in northern Thailand was moderate. The landscape characteristics affecting genetic variation depended on the geographic scale under investigation. At a large geographic scale (upper Nan River drainage basin), stream order negatively affected within-population genetic variation. However, this relationship was not detected at a finer scale (Wa River in the Nam Wa sub-basin). At the basin level, the contemporary hierarchical structure of the stream network and the isolation by distance model explained much of the existing G. cambodgiensis population genetic structure. However, the stream network could not explain the pattern of genetic differentiation at the sub-basin level. Some landscape characteristics were important explanatory variables for population divergence. Pairwise stream distance (DIST) and differences in stream order between sites (STO) contributed to genetic divergence among samples in the basin-wide population genetic structure. In addition, elevation differences between sites (ELEV) contributed to spatial genetic variation within the Nam Wa sub-basin and in the combined data set (the entire upper Nan River basin including the sub-basin samples). Although population genetic structure exists among these populations, high levels of admixture within most samples, especially those in the upper Nan River drainage basin, may reflect stream dynamics under the influence of tropical monsoons and the likely impacts of hatchery-assisted supplementary stocking. Samples from the Nam Wa sub-basin were genetically distinct from the main stem population. Lastly, I detected some effects of physical barriers in the streams on the genetic diversity of G. cambodgiensis, although barrier size and age determined the magnitude of the effects.


Genetic diversity within populations of G. cambodgiensis in the upper Nan River drainage basin

Overall genetic diversity found in G. cambodgiensis populations from the upper Nan River was slightly lower than that of other Garra spp., such as Garra orientalis (Oriental sucking barb) in Hainan Island, China (Su et al., 2013), and Garra barreimiae in the southeastern Arabian Peninsula (Kirchner et al., 2014). A possible reason is that most markers (9 of 11 loci) used in this study were developed for other congeneric species and may have lost some of their polymorphism in G. cambodgiensis. At a set of identical loci also used in this study, a Garra orientalis population collected from the Wanquan River in Hainan Island, China (n = 23), had much higher microsatellite variation, with allelic diversity ranging from 8 (Gar8) to 25 (Gar9) alleles and expected heterozygosities ranging from 0.72 (Gar8) to 1.0 (Gar6) (Su et al., 2013). At another three loci, PH8A, JQSO, and HOLN, two Garra barreimiae populations in the southeastern Arabian Peninsula (n = 44 from each location) had 7 to 15 alleles per locus (Kirchner et al., 2014). However, the diversity within populations of the widely dispersed G. cambodgiensis was higher than that of other tropical freshwater fish species, especially those with more restricted distributions. Based on eight microsatellite loci, guppy (Poecilia reticulata) populations in rivers in the Northern Range of Trinidad had an average number of alleles per locus ranging from 1.57 (n = 59 in upper Lopinot) to 13.2 (n = 131 samples in Lower Caura); extremely low genetic diversity was observed in the most upstream, isolated site (Barson et al., 2009). A poorly dispersing stream fish species in Australia, Mogurnda mogurnda (n = 10 in 17 sites throughout the Daly River catchment), had only 2 to 9 alleles per locus (allelic richness ranged from 1.6 to 2.7) at nine microsatellite loci (Cook et al., 2011).
Similarly, in another poorly dispersing species, Gadopsis marmoratus, populations from Australia’s Murray-Darling Basin had an average allelic richness of 2.2 alleles, with a range of 1.92 (37 samples) to 3.67 (24 samples) alleles at 12 microsatellite loci (Lean, Hammer, Unmack, Adams, & Beheregaray, 2016).

At the upper Nan River basin level, stream order was inversely correlated with allelic diversity, although these G. cambodgiensis populations did not experience recent bottleneck events. This observation reflected interesting dynamics between species-specific requirements for suitable habitat and other ecological factors facilitating the longer term maintenance of a stable effective population size and high genetic diversity. On the one hand, this species inhabits rocky bottoms with fast-moving water, which are somewhat restricted to small and mid-size streams (lower stream orders). This habitat requirement explained the lower genetic diversity in the Wa sample (stream order 7). On the other hand, having a reproductive peak during the rainy season with semi-buoyant eggs (Froese & Pauly, 2017) may have facilitated extensive admixture between the Nan River main stem and populations in other sub-basins. During the rainy season, breeders often congregate in flooded flat areas in the watershed (rice paddies) downstream of the tributaries (Figure 3-3B), where adults are typically absent during the dry season (personal observations). At the sub-basin level, however, the correlation between stream order and allelic richness was not significant. This is probably due to the limited variation in stream order present in the Nam Wa sub-basin. Among the upper Nan River samples, the Wa sample had the lowest allelic richness (P < 0.05, rank test) and a slightly lower effective population size based on the sibship method (NeS). This level of diversity was also consistent across samples from the Nam Wa sub-basin, with slightly lower allelic richness in SP (Figure 3-3A). Populations of G.
cambodgiensis in the sub-basin may have been isolated from the remaining sub-basins due to its mountainous terrain (Figure 3-3A). Population isolation can lead to low genetic diversity in freshwater fish (e.g., Neville et al., 2006; Yang, Qian, Wu, Fan, & Wang, 2012). For example, Lahontan cutthroat trout (Oncorhynchus clarkii henshawi) populations in isolated headwater streams in the Mary River basin, western US, had lower genetic diversity than other downstream sites (Neville et al., 2006). Restricted gene flow between the main stem and the Wa River populations could limit incoming allelic diversity.

Population genetic structure of G. cambodgiensis in the upper Nan River drainage basin

The G. cambodgiensis populations in the upper Nan River basin are genetically heterogeneous. The landscape contribution to population genetic structure varied with geographic scale. At both geographic scales, the existing population genetic structure can be explained mainly by geographic distance. At the basin level, pairwise differences in stream order between sampling locations may also contribute to this genetic differentiation (not statistically significant based on the MRM analyses). At the sub-basin level, the contribution of geographic distance was less obvious, but pairwise differences in elevation between sampling locations became important. Moreover, with the combined data set, pairwise elevation differences and stream order differences, as well as pairwise stream distance, were important factors explaining the genetic structure of G. cambodgiensis. At the basin level, four genetic clusters corresponding to the river topology were detected: (1) the headwater tributaries (Meed/Kon/Pua) and the main stem of the Nan River (including Haeng), (2) a middle tributary (Yang), (3) an eastern tributary (Wa) and (4) a western tributary (Sa). Historical migration rates were relatively low among these sub-basins (mNe < 0.1).
The low genetic differentiation between the two headwater streams, Kon and Meed, can be explained in particular by their connecting waterways (only 22.48 km apart). The Yang sample from the second part of the Mae Nam Nan sub-basin was more closely related to the main stem cluster than to the remaining clusters. Wa and Sa represented the clusters most distantly related to the headwater streams/main stem cluster. Isolation by distance and a stream hierarchy structure can explain population genetic structure in several stream fish species, especially those with a dispersal mode mediated by water flow (e.g., Garra rufa, Miandare, Askari, Shabany, & Rezaei, 2016; Catostomus discobolus, Hopken et al., 2012). As in this study, Hopken et al. (2012) found an effect of stream hierarchy on the spatial genetic divergence pattern of Bluehead sucker (Catostomus discobolus) populations in three large river drainage basins of western North America (i.e., the Colorado River, Snake River, and Bonneville River basins, covering an area of 640,000 km2), with a significant correlation between river network and genetic distance (STREAMTREE, R2 = 0.987). For the entire Colorado River basin, there were three evolutionarily significant units (ESUs) of C. discobolus populations divided by segments of the river. Although the isolation-by-distance pattern of genetic differentiation can be observed in several fish species, the geographic distance at which populations diverge varies with the species’ dispersal ability and the degree of restriction of river flow. For G. cambodgiensis, flooding can facilitate genetic homogeneity among locations along the main stem (more than 100 km apart), but the landscape disconnectivity among sub-basins restricts some gene flow between sites in adjacent sub-basins (upper part of Mae Nam Nan basin vs. second part of Mae Nam Nan). In brook charr (Salvelinus fontinalis), Castric et al.
(2001) found different patterns of spatial genetic variation in population samples from two neighboring rivers in eastern North America. The population divergence pattern in the Penobscot River could be explained by IBD, whereas the pattern in the Saint John River could not. Similarly, Crookes and Shaw (2016) could only detect a positive correlation between genetic structuring (FST) and distance (20-25 km) in populations of Rutilus rutilus in one of the two rivers studied, the Stour River, southeast England. A surprising finding that conformed neither to the isolation-by-distance pattern nor to the upper Nan River hierarchy structure was the high genetic similarity between the Sa and Yao samples (FST = 0.00258), located more than 190 km apart in different sub-basins. The model-based clustering analyses suggested that an upstream site, Yao River (in the eastern Nam Yao sub-basin), contained genetic material from at least two sources, the main stem cluster and the western tributary cluster (Nam Sa). A similar pattern has been reported by Beneteau et al. (2009), who found that population samples of Etheostoma blennioides from two separate drainages, the Sydenham and Thames rivers, Canada, were genetically similar (mean pairwise FST = 0.016). A possible explanation for this weak divergence was ongoing gene flow through Lake St. Clair, an adjacent water body connecting the two rivers. Similarly, Davis, Wieman, and Berendzen (2015) reported the lack of genetic differentiation among Rainbow darter samples across all four tributaries of the upper Mississippi River (only 1.05% of the genetic variation was attributed to among-drainage variation, even at sites located 60-120 km apart). The authors hypothesized that the northward population expansion of this species occurred recently, after the retreat of the last glaciation event (15,000 years ago). This historical process overwhelmed more recent genetic changes due to life histories (e.g., strict habitat requirements) and the contemporary river network structure (STREAMTREE analysis, R2 = 0.578). For G. cambodgiensis in the Nam Yao-1 and Nam Sa sub-basins, the two sub-basins lie on the same side of the ridge (east Phi Pun Nam range) and the headwater streams may have been connected in the past. In addition, the two sub-basins are adjacent to the sub-basin of another large tributary of the Chao Phraya River, the Yom River; it is therefore possible that these headwater streams are physically connected.

At the sub-basin level, along the Wa River, most samples were genetically different, especially the headwater sites, but the STRUCTURE analysis could not reveal a spatial pattern. The sites sampled in the Nam Wa sub-basin were along the Wa River; there could be some extensive mixing, similar to the results observed in the upper Nan River main stem. Furthermore, the effects of a recently built dam on population genetic structure were not apparent. A possible explanation is that the Nam Wa dam has been in operation for less than five years, which may not be sufficient time to observe the impacts. The genetic effects of physical barriers vary with the characteristics of the barriers and the period of population isolation (i.e., the age of the barrier).
For example, in populations of white-spotted charr (Salvelinus leucomaenis) in southern Hokkaido, Japan, Yamamoto et al. (2004) detected different magnitudes of genetic effects between old and new dams. Population divergence between the above- and below-dam populations at an eight-year-old dam (FST = 0.023-0.273) was much smaller than that observed at a 30-year-old dam (FST = 0.20-0.639).

1. Admixture and gene flow among populations

Although most genetic clusters defined by the Bayesian clustering analysis correspond to the river topology and the sub-basin division, the TESS bar plots based on membership coefficients of individuals (Figure 4-2) suggested extensive admixture in most locations. This admixture can be a result of occasional mixing of breeders during the flooding season and of supplementary stocking of fry, which started in 2009 (Table 5-1).

The natural admixture could happen during the rainy season, which typically coincides with the reproductive season of several fish species, including G. cambodgiensis (a peak in June-July, Paugy, 2002). During this period, most of the downstream segments of rivers in sub-basins would be occasionally flooded, allowing breeders to congregate in suitable breeding habitats (paddy fields) in neighboring sub-basins (Figure 3-3B). As the floods recede, fish fry usually drift to suitable nursery/feeding habitats. The consequence of genetic exchange via flooding is evident from the presence of admixed individuals containing genetic material from the main-stem cluster in almost every site in the upper Nan River sub-basins. Moreover, the most downstream site, the Haeng sample, contained genetic material from all genetic clusters, including those of adjacent sub-basins.

This pattern of admixture is somewhat common in tropical monsoonal rivers and streams, especially for fish species with a small body size (e.g., Garra orientalis, Yang et al., 2016; Poecilia reticulata, Barson et al., 2009; Mogurnda adspersa, Hughes et al., 2012; Gasterosteus aculeatus, Hanson, Moore, Taylor, Barrett, & Hendry, 2016). Yang et al. (2016) reported that two lowland populations of Garra orientalis, a widely dispersed species in the Pearl River, China, contained highly admixed individuals (revealed by STRUCTURE) from at least four genetic clusters compared to other sampling sites. In guppy (Poecilia reticulata) populations from Trinidad and Tobago, Barson et al. (2009) hypothesized that the fast-flowing, turbid waters during the wet season forced a strong downstream migration and resulted in admixed populations. Perez-Figueroa, Fernandez, Amaro, Hermida, and Miguel (2015) found that a main-channel population of three-spine stickleback (Gasterosteus aculeatus) in the Mino River basin, Spain, was more admixed than the tributary populations. Hughes et al. (2012) revealed that Mogurnda adspersa, a small-bodied freshwater fish widely distributed in eastern Australia, had asymmetrical, long-term migration among the four streams in drainage basins of central Queensland. The most upstream site was the dominant source of immigrants into the downstream sections.

Although historical migration rates among sites in this study are relatively low (migrants per generation, mNe, < 0.1), there is some evidence of genetic impacts of supplementary stocking of fry in some locations. The supplementary stocking program was initiated in 2009 and has released approximately 200,000 individuals per site per year (personal observations, Table 5-1). Breeders have been collected from the wild and used for only one generation. In most cases, the breeder source and the release site were the same (e.g., Wa River). However, among the cross-sub-basin movements, the majority were from the Yang River in the second part of the Mae Nam Nan sub-basin to the Kon/Meed rivers in the upper part of the Mae Nam Nan sub-basin (in 2013, 2014 and 2016). It is therefore not surprising to observe some genetic contribution from the Yang genetic cluster in the headwater streams and the main-stem locations.
Compared to other sites, relatively higher migration rates (mNe ~ 0.08-0.09, Tables 4-4 and 5-1) between Yang River and other sites within the main-stem cluster suggested such movements. Fortunately, the recent supplementary stocking has not altered the existing population genetic structure (i.e., Yang is still genetically different from the main-stem populations).

Table 5-1 Records of supplementary stocking activities within the upper Nan River drainage basin between 2009 and 2017. Underlined records indicate transfer across sub-basins.

Year   Source of brooders     Released site   Number of fry released
2009   Wa River               Wa River        200,000
2011   Wa River               Sa River        200,000
2013   Yang River             Kon River       200,000
       Yang River             Yang River      100,000
2014   Kon and Yang rivers    Kon River       200,000
       Wa River               Wa River        200,000
2015   Kon and Meed rivers    Kon River       200,000
                              Meed River      10,000
2016   Kon and Meed rivers    Kon River       150,000
       Kon and Yang rivers    Kon River       50,000
       Wa River               Wa River        200,000
2017   Kon River              Kon River       200,000
       Wa River               Wa River        200,000
       Yang River             Yang River      200,000
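The stocking records in Table 5-1 can also be tabulated programmatically to identify which releases moved fish across sub-basins. The following is a minimal sketch only: the river-to-sub-basin mapping follows the text (Yang River in the second part of the Mae Nam Nan sub-basin, Kon and Meed rivers in the upper part, Wa River in Nam Wa, Sa River in Nam Sa), the variable and function names are hypothetical, and the 2015 Meed River release is assumed to share the "Kon and Meed rivers" brooder source listed for that year.

```python
# River -> sub-basin, following the text; the labels are shorthand for this sketch.
SUB_BASIN = {
    "Wa": "Nam Wa",
    "Sa": "Nam Sa",
    "Yang": "Mae Nam Nan (second part)",
    "Kon": "Mae Nam Nan (upper part)",
    "Meed": "Mae Nam Nan (upper part)",
}

# (year, brooder source rivers, release river, fry released), transcribed from Table 5-1.
# Assumption: the 2015 Meed River release used the same brooder source as the Kon release.
RECORDS = [
    (2009, ("Wa",), "Wa", 200_000),
    (2011, ("Wa",), "Sa", 200_000),
    (2013, ("Yang",), "Kon", 200_000),
    (2013, ("Yang",), "Yang", 100_000),
    (2014, ("Kon", "Yang"), "Kon", 200_000),
    (2014, ("Wa",), "Wa", 200_000),
    (2015, ("Kon", "Meed"), "Kon", 200_000),
    (2015, ("Kon", "Meed"), "Meed", 10_000),
    (2016, ("Kon", "Meed"), "Kon", 150_000),
    (2016, ("Kon", "Yang"), "Kon", 50_000),
    (2016, ("Wa",), "Wa", 200_000),
    (2017, ("Kon",), "Kon", 200_000),
    (2017, ("Wa",), "Wa", 200_000),
    (2017, ("Yang",), "Yang", 200_000),
]


def is_cross_sub_basin(sources, release):
    """Return True if any brooder source lies in a different sub-basin than the release site."""
    return any(SUB_BASIN[s] != SUB_BASIN[release] for s in sources)


cross = [rec for rec in RECORDS if is_cross_sub_basin(rec[1], rec[2])]
cross_years = sorted({year for year, _, _, _ in cross})
total_fry = sum(n for _, _, _, n in RECORDS)
cross_fry = sum(n for _, _, _, n in cross)

for year, sources, release, n in cross:
    print(year, " + ".join(sources), "->", release, f"{n:,}")
print("share of fry released across sub-basins:", round(cross_fry / total_fry, 3))
```

Under these assumptions, a release counts as a cross-sub-basin transfer whenever at least one brooder source river belongs to a different sub-basin than the release site, so mixed-source releases (e.g., Kon and Yang brooders released into the Kon River) are flagged as well.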

Effects of physical barriers on genetic variation of populations of G. cambodgiensis in the Wa River, Nam Wa sub-basin

1. Genetic diversity within populations

Small, isolated populations are prone to the loss of genetic diversity due to genetic drift and the lack of incoming gene flow (Frankham et al., 2010). The results suggested that Sapun Waterfall (> 10 m high, age > 100 years) had been an effective barrier to upstream fish migration and gene flow between G. cambodgiensis populations above and below the waterfall. SPU, the sample from above Sapun Waterfall, had the lowest effective number of alleles, heterozygosity, and allelic diversity. Similar observations were reported for other stream-dwelling and riverine fish species, such as the white-spotted charr (Salvelinus leucomaenis, Yamamoto et al., 2006) and Sinibrama macrops (Zhao, Chenoweth, Liu, & Liu, 2016). Yamamoto et al. (2006) detected lower genetic diversity in above-dam populations than in below-dam populations in two rivers, the Kame and Hitozuminai rivers, Japan (Kame River overall He = 0.35/0.71, Ho = 0.29/0.69 and mean number of alleles 3.0/8.2 for the above- and below-dam samples, respectively; Hitozuminai River overall He = 0.63/0.68, Ho = 0.65/0.67 and mean number of alleles 6.0/7.2). Similarly, Zhao et al. (2016) reported the effects of dam structures on the freshwater fish Sinibrama macrops in the Min River basin, China. The lowest genetic diversity was found in a population above a high dam with a large drop, which created a reservoir.

On the other hand, I did not detect differences in genetic diversity between samples obtained from above and below two concrete weirs, namely Nakham Dam and Sawanua Dam. Given the weir height (3-5 meters), it is likely that both weirs are flooded during the rainy season, allowing for upstream and downstream migration.
The subtle genetic effect of these concrete weirs was similar to that observed for populations of Sinibrama macrops above and below the ‘low dams’ in the Min River basin (Zhao et al., 2016).

Despite the presence of high-quality habitats (pristine, forested streams) and high population abundance of G. cambodgiensis in Pun Stream (personal observations), where Sapun Waterfall is located, SP samples had stable but low estimates of contemporary effective population size compared to other samples. The results did not suggest a recent bottleneck, although the P-value for the Wilcoxon’s test was 0.062 in the population below the waterfall (SPL). It is likely that the small Ne estimates in these populations are due to isolation imposed by the natural waterfall. This type of isolation leading to a small Ne has also been observed in other fish species. For example, in brown trout (Salmo trutta), two lake populations upstream of 400-500 year-old dams in the Gudena River system, Denmark, had lower Ne than those downstream of the dams (Hansen et al., 2014).

There was evidence for recent population bottlenecks in the Mang populations, especially the one above the weir (NKU, Wilcoxon’s test P-value = 0.03). This recent reduction in population size may be due to anthropogenic activities. The dominant land use in the Mang watershed is agriculture (swidden cultivation) and the stream has been consistently dredged. These habitat disturbances can have a severe impact on a stream-dwelling fish such as G. cambodgiensis.

2. Genetic differentiation among populations

Results from both conventional and model-based cluster analyses suggested genetic divergence among populations from tributary streams of the Wa River. The population above Sapun Waterfall was the most genetically distinct. Except for the SPU-SPL sample pair, samples from above and below a weir within the same stream were more genetically similar. Also, the STRUCTURE analysis revealed genetic distinctiveness among streams, with some admixture among populations from Mang Stream, the Wa River and below Sapun Waterfall in Pun Stream. The observed genetic divergence may be a consequence of the presence of a barrier isolating populations for an extended time period (e.g., Hansen et al., 2014) and/or a historical and contemporary hierarchical structure of a stream network (e.g., Pilger et al., 2017).

Compared to the man-made weir structures in the Wa River system, the natural Sapun Waterfall is the largest and oldest barrier in the system. A combination of restricted gene flow for more than 100 years and low Ne of the populations above and below the waterfall may have facilitated the high genetic distinctiveness between these two populations. The size of a physical barrier affected the magnitude of genetic isolation. While I detected high genetic distinctiveness between two populations isolated by a large natural waterfall, the genetic effects of the smaller concrete weirs were more subtle.

On one hand, pairwise FST suggested genetic differences among all pairs of populations above and below the concrete weirs. On the other hand, STRUCTURE did not reveal differences between NKU and NKL and only slight differences between SWU and SWL. These observations concurred with those in Sinibrama macrops populations isolated by dams in the Min River, China (Zhao et al., 2016).

The genetic differentiation pattern observed in this part of the study was confounded by the ages of the barriers. Being a natural barrier, Sapun Waterfall is much older than the concrete weirs, which were built in the 1960s and 1970s. Barrier age for a given size can also have a significant genetic impact (Yamamoto et al., 2004). Yamamoto et al. (2004) found that the degree of genetic differentiation after habitat fragmentation by dams was correlated with the period of isolation. They found differences among populations isolated by large dams constructed 36 years ago (Ken-ichi River, FST = 0.639), 27 years ago (the Haraki Dam, FST = 0.160) and a more recent eight-year-old dam (the Hitozuminai River dam, FST = 0.023).

The genetic divergence among streams observed in this study may be a consequence of the stream connectivity within the Wa River system. The STRUCTURE analysis suggested admixture between populations in Mang Stream (NKs) and the main stem Wa River, especially the one above Sawanua Dam (SWU). The population below Sapun Waterfall (SPL) also consists of admixed individuals from the Mang and Wa populations, indicating some gene flow mediated by floods. The upstream sections of Mang Stream and the Wa River (only about 6 km apart) consist of numerous tributary headwater creeks (Figure 3-4). It is possible that these headwater connections are mediated by floods. These connections could also explain small genetic contributions from the Mang population to that above Sapun Waterfall. Examples of similar situations include Yazoo darters (Etheostoma raneyi) in North America (Sterling et al., 2012), where populations above barriers were genetically similar, possibly due to headwater connections, and G. cambodgiensis populations in the upper Nan River drainage basin, where headwater connections led to genetic similarity between rivers that are more than 150 km apart (Tier I).
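The joint role of barrier age and effective population size discussed above can be illustrated with the textbook expectation for divergence under genetic drift alone, FST = 1 - (1 - 1/(2Ne))^t, where t is the number of generations since complete isolation. The sketch below is an illustration only and is not an analysis performed in this study: it ignores mutation and any residual gene flow, and the generation time and Ne values are hypothetical placeholders, not estimates for G. cambodgiensis or white-spotted charr.

```python
def expected_fst(generations, ne):
    """Expected FST after `generations` of complete isolation under drift alone.

    Textbook approximation: FST = 1 - (1 - 1/(2*Ne))**t.
    Mutation and migration are ignored.
    """
    if generations < 0 or ne < 1:
        raise ValueError("generations must be >= 0 and ne must be >= 1")
    return 1.0 - (1.0 - 1.0 / (2.0 * ne)) ** generations


GENERATION_TIME_YEARS = 1.0            # hypothetical, for illustration only
BARRIER_AGES_YEARS = (8, 27, 36, 100)  # barrier ages mentioned in the text
NE_VALUES = (50, 500)                  # hypothetical effective population sizes

for ne in NE_VALUES:
    for age in BARRIER_AGES_YEARS:
        t = age / GENERATION_TIME_YEARS
        print(f"Ne={ne:>4}  barrier age={age:>3} yr  expected FST={expected_fst(t, ne):.3f}")
```

Under this simplified model, expected divergence increases with the time since isolation and decreases with Ne, which is qualitatively consistent with the argument that an old barrier combined with a small Ne can produce high genetic distinctiveness.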

Management implications

One of the long-term conservation goals is to protect the evolutionary potential of populations (Allendorf & Luikart, 2007). The results obtained from this study provide some insights potentially useful for achieving this goal. For within-population genetic diversity, G. cambodgiensis populations in the upper Nan River have been stable with a large effective population size. However, extreme fishing pressure, especially in the spawning seasons, should be closely monitored.

For population divergence, based on 10-11 polymorphic microsatellite loci, the genetic analyses revealed at least four distinct genetic clusters of G. cambodgiensis populations that corresponded roughly to distinct sub-basins of the upper Nan River drainage system, including the second part of the Nan River (Yang), Nam Sa, Nam Wa (Wa) and the remaining portion of the upper Nan River/main stem. This spatial-genetic divergence pattern can be used as guidance for identifying conservation and fishery management units in the upper Nan River drainage basin. Any supplementary stocking using broodstock from a different sub-basin should be avoided.

The high degree of admixture among genetically distinct populations highlights the importance of natural flooding patterns, which allow for admixture, and reflects possible genetic impacts of supplementary stocking. It is possible that this fish species has metapopulation dynamics, with the main stem of the upper Nan River and annual flooding facilitating connectivity among demes. Due to the clear sub-division among some sub-basins, future supplementary stocking programs should be planned more carefully (Miller & Kapuscinski, 2003).

Within a complex sub-basin, such as Nam Wa, most populations have not undergone recent bottlenecks and have maintained a high effective population size, except for some isolated headwater sites (e.g., the populations above Sapun Waterfall and in Mang Stream).
However, some ongoing anthropogenic activities, such as agricultural area expansion, man-made dam construction and heavy exploitation, can be a threat to the populations. An old, large physical barrier (a natural waterfall) has noticeable genetic impacts on a G. cambodgiensis population. Concrete weirs examined in this study also hinted at some genetic effects. To minimize genetic impacts on a fish population, the size of a barrier should be an important consideration in the design of fish-friendly, man-made structures in a river.

A contemporary hierarchical stream network is the most important landscape factor shaping genetic variation of a small-bodied, stream-dwelling fish species, G. cambodgiensis in the upper Nan River and its tributaries. Depending on the geographical scale, pairwise elevation differences between sites and stream orders can also be important. An alteration of the stream network or connectivity can profoundly impact the sustainability of these fish populations.
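One common way to examine whether a pairwise landscape variable, such as the elevation difference between sites, is associated with pairwise genetic distance is a Mantel permutation test. The sketch below shows only the general idea: the site elevations and genetic distances are hypothetical (the genetic distances are deliberately constructed to be exactly proportional to the elevation differences), the function names are illustrative, and it is not claimed to reproduce the analyses used in this dissertation.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """One-tailed Mantel test: correlation of two distance matrices,
    with a p-value from permuting the site labels of the second matrix."""
    n = len(dist_a)
    pairs = list(combinations(range(n), 2))
    flat_a = [dist_a[i][j] for i, j in pairs]

    def flat_b(order):
        return [dist_b[order[i]][order[j]] for i, j in pairs]

    order = list(range(n))
    r_obs = pearson(flat_a, flat_b(order))
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(order)
        if pearson(flat_a, flat_b(order)) >= r_obs - 1e-12:
            hits += 1
    return r_obs, hits / (permutations + 1)


# Hypothetical site elevations (m) and genetic distances, for illustration only.
ELEVATION_M = [200, 250, 400, 700, 1100]
elev_diff = [[abs(a - b) for b in ELEVATION_M] for a in ELEVATION_M]
gen_dist = [[0.0002 * d for d in row] for row in elev_diff]

r, p = mantel(gen_dist, elev_diff)
print(f"Mantel r = {r:.3f}, one-tailed p = {p:.3f}")
```

Permuting site labels, rather than individual matrix cells, preserves the dependence among pairwise distances that share a site, which is why the permutation is applied to rows and columns jointly.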

REFERENCES

Aljanabi, S. M., & Martinez, I. (1997). Universal and rapid salt-extraction of high quality genomic DNA for PCR-based techniques. Nucleic Acids Research, 25, 4692-4693.
Allendorf, F. W., & Luikart, G. (2007). Conservation and the genetics of populations. USA: Blackwell.
Apodaca, J. J., Rissler, L. J., & Godwin, J. C. (2012). Population structure and gene flow in a heavily disturbed habitat: implications for the management of the imperilled Red Hills salamander (Phaeognathus hubrichti). Conservation Genetics Resource, 13, 913-923.
Barson, N. J., Cable, J., & Van Oosterhout, C. (2009). Population genetic analysis of microsatellite variation of guppies (Poecilia reticulata) in Trinidad and Tobago: evidence for a dynamic source-sink metapopulation structure, founder events and population bottlenecks. Journal of Evolutionary Biology, 22, 485-497.
Beerli, P. (2012). Migrate Documentation Version 3.2.1. USA: Florida State University, Tallahassee.
Beneteau, C. L., Mandrak, N. E., & Heath, D. D. (2009). The effects of river barriers and range expansion of the population genetic structure and stability in Greenside Darter (Etheostoma blennioides) populations. Conservation Genetics, 10, 477-487.
Bohonak, A. J. (2002). IBD (Isolation By Distance): a program for analyses of isolation by distance. Journal of Heredity, 93(2), 153-154.
Burnham, K. P., & Anderson, D. R. (2002). Model selection and multimodel inference: A practical information-theoretic approach (2nd ed.). New York: Springer-Verlag.
Castric, V., Bonny, F., & Bernatchez, L. (2001). Landscape structure and hierarchical genetic diversity in the Brook charr, Salvelinus fontinalis. Evolution, 55(5), 1016-1028.

Chakraborty, R., Srinivasan, M. R., & Daiger, P. S. (1992). Evaluation of Standard Error and Confidence Interval of Estimated Multilocus Genotype Probabilities and Their Implications in DNA Forensics. The American Journal of Human Genetics, 52, 60-70.
Chapuis, M. P., & Estoup, A. (2007). Microsatellite Null Alleles and Estimation of Population Differentiation. Molecular Biology and Evolution, 24, 621-631.
Chen, C., Durand, E., & Forbes, F. (2007). Bayesian clustering algorithms ascertaining spatial population structure: a new computer program and a comparison study. Molecular Ecology Notes, 7, 747-756.
Chistiakov, D. A., Hellemans, B., & Volckaert, F. A. M. (2006). Microsatellites and their genomic distribution, evolution, function and applications: A review with special reference to fish genetics. Aquaculture, 255, 1-29.
Cook, B. D., Kennard, M. J., Real, K., Pusey, B. J., & Hughes, J. M. (2011). Landscape genetic analysis of the tropical freshwater fish Mogurnda mogurnda (Eleotridae) in a monsoonal river basin: importance of hydrographic factors and population history. Freshwater Biology, 56, 812-827.
Crookes, S., & Shaw, W. P. (2016). Isolation by distance and non-identical patterns of gene flow within two river populations of the freshwater fish Rutilus rutilus (L. 1758). Conservation Genetics, 17, 861-874.
Davis, D. J., Wieman, A. C., & Berendzen, P. B. (2015). The influence of historical and contemporary landscape variables on the spatial genetic structure of the rainbow darter (Etheostoma caeruleum) in tributaries of the upper Mississippi River. Conservation Genetics, 16, 167-179.
Dempster, P. A., Laird, M. N., & Rubin, B. D. (1977). Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society, 39, 1-38.
Di Rienzo, A., Peterson, A. C., Garza, J. C., Valdes, A. M., Slatkin, M., & Freimer, N. B. (1994). Mutational processes of simple-sequence repeat loci in human populations. Proceedings of the National Academy of Sciences of The United States Of America (PNAS), 91, 3166-3170.

Do, C., Waples, R. S., Peel, D., Macbeth, G. M., Tillett, B. J., & Ovenden, J. R. (2014). NeEstimator v2: reimplementation of software for the estimation of contemporary effective population size (Ne) from genetic data. Molecular Ecology Resources, 14, 209-214.
Earl, D. A., & vonHoldt, B. M. (2012). STRUCTURE HARVESTER: a website and program for visualizing STRUCTURE output and implementing the Evanno method. Conservation Genetics Resources, 4, 359-361.
Evanno, G., Regnaut, S., & Goudet, J. (2005). Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Molecular Ecology, 14, 2611-2620.
Excoffier, L., & Lischer, H. E. L. (2010). Arlequin suite ver 3.5: A new series of programs to perform population genetics analyses under Linux and Windows. Molecular Ecology Resources, 10, 564-567.
Excoffier, L., Laval, G., & Schneider, S. (2005). Arlequin (Version 3.0): An integrated software package for population genetics data analysis. Evolutionary Bioinformatics Online, 47-50.
Ferreira, D. G., Galindo, B. A., Frantine-Silva, W., Almeida, F. S., & Sofia, S. H. (2015). Genetic structure of a Neotropical sedentary fish revealed by AFLP, microsatellite and mtDNA markers: a case study. Conservation Genetics, 16, 151-166.
Francis, R. M. (2016). Pophelper: An R package and web app to analyse and visualise population structure. Molecular Ecology Resources, 17(1), 27-32.
Francois, O., & Durand, E. (2010). Spatially explicit Bayesian clustering models in population genetics. Molecular Ecology Resources, 10(5), 773-784.
Frankham, R., Ballou, J. D., & Briscoe, D. A. (2010). Population Fragmentation. Introduction to conservation genetics. Cambridge: United Kingdom.
Freeland, J. R. (2005). Molecular Ecology. England: John Wiley & Sons.
Froese, R., & Pauly, D. (2017). FishBase. Retrieved from https://www.fishbase.org.
Galpern, P., Peres-Neto, P. R., Jean, P., & Micheline, M. (2014). Memgene: Spatial pattern detection in genetic distance data. Retrieved from https://cran.r-project.org/web/packages/memgene/vignettes/memgeneTutorial.pdf

Goslee, S. C., & Urban, D. L. (2007). The ecodist package for dissimilarity-based analysis of ecological data. Journal of Statistical Software, 22, 1-19.
Goudet, J. (2001). FSTAT, a program to estimate and test gene diversities and fixation indices (Version 2.9.3). (Computer software). Switzerland: University of Lausanne.
Hall, A. L., & Beissinger, R. S. (2014). A practical toolbox for design and analysis of landscape genetics studies. Landscape Ecology. Retrieved from https://nature.berkeley.edu/beislab/BeissingerLab/publications/Hall&Beis_2014_LandscapeEcol.pdf
Hallerman, M. E. (2003). Population genetics: principles and applications for fisheries scientists. Bethesda, Maryland: American Fisheries Society.
Halliburton, R. (2004). Introduction to population genetics. USA: Pearson Education.
Hamer, M. (2007). The freshwater fish spawning and migration calendar report. Retrieved from https://www.waikatoregion.govt.nz/assets/PageFiles/5021/tr0711.pdf
Hansen, M. M., Limborg, M. T., Ferchaud, A., & Pujolar, J. (2014). The effects of Medieval dams on genetic divergence and demographic history in brown trout populations. BMC Evolutionary Biology, 14, 1-14.
Hanson, D., Moore, J. S., Taylor, E. B., Barrett, R. D. H., & Hendry, A. P. (2016). Assessing reproductive isolation using a contact zone between parapatric lake-stream stickleback ecotypes. Journal of Evolutionary Biology, 29, 2491-2501.
Hedrick, W. P. (2011). Genetics of populations. USA: Jones and Bartlett.
Hopken, M. W., Douglas, M. R., & Douglas, M. E. (2012). Stream hierarchy defines riverscape genetics of a North American desert fish. Retrieved from https://digitalcommons.unl.edu/cgi/viewcontent.cgi?referer=https://www.google.co.th/&httpsredir=1&article=2237&context=icwdm_usdanwrc
Hughes, J. M., Real, K. M., Marshall, J. C., & Schmidt, D. J. (2012). Extreme genetic structure in a small-bodied freshwater fish, the purple spotted Gudgeon, Mogurnda adspersa (Eleotridae). PLoS ONE, 7, e40546.

Hubisz, M. J., Falush, D., Stephens, M., & Pritchard, J. K. (2009). Inferring weak population structure with the assistance of sample group information. Molecular Ecology Resources, 9, 1322-1332.
Jaisuk, C., Lothongkham, A., Keereelang, J., & Sriyam, S. (2014). Development of Microsatellite Primers for Native Fish and Applications to Population Genetic Studies of Native Fish in Nan River. Rajabhat Journal of Science, Humanities & Social Sciences, 14, 23-33.
Jaisuk, C., Lothongkham, A., & Sriyam, S. (2015). Genetic diversity of wild population Stone lapping minnow (Garra cambodgiensis) in the Nan River watershed using microsatellite DNA markers. Nan: Rajamangala University of Technology Lanna Nan.
Kalinowski, S. T., Meeuwig, M. H., Narum, S. R., & Taper, M. L. (2008). Stream trees: a statistical method for mapping genetic differences between populations of freshwater organisms to the sections of streams that connect them. Canadian Journal of Fisheries and Aquatic Sciences, 65, 2752-2760.
Kamvar, Z. N., Tabima, J. F., & Grunwald, N. J. (2014). Poppr: an R package for genetic analysis of populations with clonal, partially clonal, and/or sexual reproduction. PeerJ, 2, 281.
Kanno, Y., Vokoun, J. C., & Letcher, B. H. (2011). Fine-scale population structure and riverscape genetics of brook trout (Salvelinus fontinalis) distributed continuously along headwater channel networks. Molecular Ecology, 20, 3711-3729.
Kirchner, S., Weinmaier, T., Rattei, T., Sattmann, H., & Kruckenhauser, L. (2014). Characterization of 19 new microsatellite loci for the Omani barb Garra barreimiae from 454 sequences. BMC Research Notes, 7, 522.
Kottelat, M., & Whiten, K. (1996). Freshwater Biodiversity in Asia With Special Reference to Fish. U.S.A.: Manufactured in the United States of America.
Kullander, S. O., & Fang, F. (2004). Seven new species of Garra (Cyprinidae: Cyprininae) from the Rakhine Yoma, southern Myanmar. Ichthyological Exploration of Freshwaters, 15, 257-278.
Lamphere, B. A., & Blum, M. J. (2012). Genetic estimates of population structure and dispersal in a benthic stream fish. Ecology of Freshwater Fish, 21, 75-86.

Lean, J., Hammer, M. P., Unmack, P. J., Adams, M., & Beheregaray, L. B. (2016). Landscape genetics informs mesohabitat preference and conservation priorities for a surrogate indicator species in a highly fragmented river system. Heredity, 118, 1-11.
Leclerc, E., Mailhot, Y., Mingelbier, M., & Bernatchez, L. (2008). The landscape genetics of yellow perch (Perca flavescens) in a large fluvial ecosystem. Molecular Ecology, 17, 1702-1717.
Lothongkham, A. (2008). Species Diversity of Fishes in the Nan River Basin (The Chao Phraya River System) in Nan Province, Northern Thailand. Master’s thesis, Department of Fishery Biology, Graduate School, Kasetsart University.
Lothongkham, A. (2015). Fish fauna in lower Wa River, Nan Province, northern Thailand. Nan: Rajamangala University of Technology Lanna Nan.
Manel, S., Schwartz, M. K., Luikart, G., & Taberlet, P. (2003). Landscape genetics: combining landscape ecology and population genetics. TRENDS in Ecology and Evolution, 18, 189-197.
Mazerolle, M. J. (2017). AICcmodavg: Model selection and multimodel inference based on (Q)AIC(C). R package version 2.1-1. Retrieved from https://cran.r-project.org/package=AICcmodavg.
McGregor, A. J., Lane, S., Thomason, M. A., Zhivotovsky, L. A., Smoker, W. W., & Gharrett, A. J. (1998). Migration timing, a life history trait important in the genetic structure of pink salmon. N. Pac. Anadr. Fish Commission Bulletin, 1, 262-273.
Meeuwig, H., Guy, C., Kalinowski, T. S., & Fredenberg, A. W. (2010). Landscape influences on genetic differentiation among bull trout populations in a stream-lake network. Retrieved from http://onlinelibrary.wiley.com/doi/10.1111/j.1365-294X.2010.04655.x/full
Miandare, H. K., Askari, G., Shabany, A., & Rezaei, H. R. (2016). Genetic characterization of Garra rufa (Heckel, 1843) populations in Tigris Basin, Iran using microsatellite markers. International Journal of Aquatic Biology, 4, 57-68.

Miller, L. M., & Kapuscinski, A. R. (2003). Genetic guidelines for hatchery supplementation programs. In E. M. Hallerman (Ed.), Population genetics: principles and applications for fisheries scientists. American Fisheries Society, 1, 329-356.
Nei, M. (1978). Estimation of average heterozygosity and genetic distance for small number of individuals. Genetics, 89, 583-590.
Nelson, J. S. (2006). Fishes of the world. New York: John Wiley & Sons.
Netrathip, D. (1997). Pesticide uses of the small farm holding farmers in tambon Pua, changwat Nan. Master’s thesis, Department Man and Environment Management, Graduate School, Chiangmai University.
Neville, H. M., Dunham, J. B., & Peacock, M. M. (2006). Landscape attributes and life history variability shape genetic structure of trout populations in a stream network. Landscape Ecology, 21, 901-916.
Neville, H., Dunham, J., Rosenberger, A., Umek, J., & Nelson, B. (2009). Influences of Wildfire, Habitat Size, and Connectivity on Trout in Headwater Streams Revealed by Patterns of Genetic Diversity. Transactions of the American Fisheries Society, 138, 1314-1327.
Olsen, B. J., Crane, A. P., Flannery, G. B., Dunmall, K., Templin, D. W., & Wenburg, K. J. (2011). Comparative landscape genetic analysis of three Pacific salmon species from subarctic North America. Conservation Genetics, 12, 223-241.
Pakoksung, K., & Koontanakulvong, S. (2015). The effect of land use change on run off in the Nan River Basin. Retrieved from https://www.researchgate.net/publication/266482365
Paugy, D. (2002). Reproductive strategies of fishes in a tropical temporary stream of the upper Senegal Basin: Baoule River in Mali. Aquatic Living Resource, 15, 25-35.
Peakall, R., & Smouse, P. E. (2006). GENALEX 6: genetic analysis in Excel. Population genetic software for teaching and research. Molecular Ecology Notes, 6, 288-295.

Perez-Figueroa, A., Fernandez, C., Amaro, R., Hermida, M., & Miguel, S. E. (2015). Population structure and effective/census population size ratio in threatened three-spined stickleback populations from an isolated river basin in northwest Spain. Genetica, 143(4), 403-411.
Pilger, T. J., Gido, K. B., Propst, D. L., Whitney, J. E., & Turner, T. F. (2017). River network architecture, genetic effective size, and distributional patterns predict differences in genetic structure across species in a dryland stream fish community. DOI: 10.1111/mec.14079
Piry, S., Luikart, G., & Cornuet, J. M. (1999). BOTTLENECK: a computer program for detecting recent reductions in the effective population size using allele frequency data. Journal of Heredity, 90, 502-503.
Pritchard, J. K., Stephens, M., & Donnelly, P. (2000). Inference of Population Structure Using Multilocus Genotype Data. Genetics, 155, 945-959.
Proches, S., & Ramdhani, S. (2012). The World’s Zoogeographical Regions Confirmed by Cross-Taxon Analyses. BioScience, 62, 260-270.
Pukk, L., Kuparinen, A., Jarv, L., Gross, R., & Vasemagi, A. (2013). Genetic and life-history changes associated with fisheries-induced population collapse. Evolutionary Applications. DOI: 10.1111/eva.12060
Ranjan, J. B., Herwig, W., Subodh, S., & Michael, S. (2005). Study of the length frequency distribution of sucker head, Garra gotyla gotyla (Gray, 1830) in different river and season in Nepal and it applications. Kathmandu University Journal of Science, Engineering and Technology, 1, 1-12.
Rice, W. R. (1989). Analyzing tables of statistical test. Evolution, 43, 223-225.
Rousset, F. (2008). GENEPOP’007: a complete re-implementation of the GENEPOP software for Windows and Linux. Molecular Ecology Resources, 8, 103-106.
Schlotterer, C. (1998). Microsatellites. In A. R. Hoelzel (Ed.), Molecular Genetic Analysis of Populations (pp. 237-261). U.K.: Oxford University Press.

Selkoe, A. K., Scribner, T. K., & Galindo, M. H. (2016). Waterscape genetic-applications of landscape genetics to river, lakes, and seas. In N. Balkenhol, A. S. Cushman, T. A. Storfer, & P. L. Waits (Eds.), Landscape genetics concepts, methods, applications (pp. 220-246). India: PhotinaMTStd by Thomson Digital, Noida.
Shaklee, B. J., & Currens, P. K. (2003). Genetic stock identification and risk assessment. In E. M. Hallerman (Ed.), Population genetics: principles and applications for fisheries scientists (pp. 291-328). American Fisheries Society, Bethesda, Maryland.
Stepien, C. A., Murphy, J. D., Lohner, N. R., Villet, S., & Haponski, E. A. (2009). Signatures of vicariance, postglacial dispersal and spawning philopatry: population genetics of the walleye Sander vitreus. Molecular Ecology, 18, 3411-3428.
Stepien, C. A., & Faber, E. J. (1998). Population genetic structure, phylogeography and spawning philopatry in walleye (Stizostedion vitreum) from mitochondrial DNA control region sequences. Molecular Ecology, 7, 1757-1769.
Sterling, K. A., Reed, D. H., Noonan, B. P., & Warren, M. L. Jr. (2012). Genetic effects of habitat fragmentation and population isolation on Etheostoma raneyi (Percidae). Conservation Genetics, 13, 859-872.
So, N., Maes, G., & Volckaert, F. (2006). High genetic diversity in cryptic populations of the migratory sutchi catfish Pangasianodon hypophthalmus in the Mekong River. Heredity, 96, 166-174.
Su, W. L., Liu, Z. Z., Wang, T. C., Zhen, Z., Liu, Y. A., Tang, Q. W., & Yang, Q. J. (2013). Isolation and characterization of polymorphic microsatellite markers in the fish Garra orientalis (Oriental sucking barb). Conservation Genetics Resource, 5, 231-233.
Vandergast, G. A., Perry, M. W., Lugo, V. R., & Hathaway, A. S. (2011). Genetic landscapes GIS Toolbox: tools to map patterns of genetic divergence and diversity. Molecular Ecology Resources, 11, 158-161.

Van Oosterhout, C., Hutchinson, W. F., Will, D. P. M., & Shipley, P. (2004). MICRO CHECKER: software for identifying and correcting genotyping errors in microsatellite data. Molecular Ecology Notes, 4, 535-538.

Waits, R. E., Bagley, J. M., Blum, J. M., McCormick, H. F., & Lazorchak, M. J. (2008). Source-sink dynamics sustain central stonerollers (Campostoma anomalum) in a heavily urbanized catchment. Freshwater Biology, 53, 2061-2075.

Wang, J. (2016). A comparison of single-sample estimators of effective population sizes from genetic marker data. Molecular Ecology, 25, 4692-4711.

Wang, J. I., Savage, K. W., & Shaffer, B. H. (2009). Landscape genetics and least-cost path analysis reveal unexpected dispersal routes in the California tiger salamander (Ambystoma californiense). Molecular Ecology, 18, 1365-1374.

Wagner, H. H., & Fortin, M. J. (2012). A conceptual framework for the spatial analysis of landscape genetic data. Conservation Genetics, 14(2), 253-261.

Weir, B. S. (1996). Genetic data analysis II. Sunderland, MA: Sinauer Associates.

Yamamoto, S., Morita, K., Koizumi, I., & Maekawa, K. (2004). Genetic differentiation of white-spotted charr (Salvelinus leucomaenis) populations after habitat fragmentation: Spatial-temporal changes in gene frequencies. Conservation Genetics, 5, 529-538.

Yamamoto, S., Maekawa, K., Tamate, T., Koizumi, I., Hasegawa, K., & Kubota, H. (2006). Genetic evaluation of translocation in artificially isolated populations of white-spotted charr (Salvelinus leucomaenis). Fisheries Research, 78, 352-358.

Yang, X., Qian, L., Wu, H., Fan, Z., & Wang, C. (2012). Population differentiation, bottleneck and selection of Eurasian perch (Perca fluviatilis L.) at the Asian edge of its natural range. Biochemical Systematics and Ecology, 40, 6-12.

Yang, J., Hsu, K., Liu, Z., Su, L., Kuo, P., Tang, W., Zhou, Z., Liu, D., Bao, B., & Lin, H. (2016). The population history of Garra orientalis (Teleostei: Cyprinidae) using mitochondrial DNA and microsatellite data with approximate Bayesian computation. Retrieved from https://bmcevolbiol.biomedcentral.com/articles/10.1186/s12862-016-0645-9

Yue, G. H., David, L., & Orban, L. (2007). Mutation rate and pattern of microsatellites in common carp (Cyprinus carpio L.). Genetica, 129, 329-331.

Zhao, L., Chenoweth, E. L., Liu, J., & Liu, Q. (2016). Effects of dam structures on genetic diversity of freshwater fish Sinibrama macrops in Min River, China. Biochemical Systematics and Ecology, 68, 216-222.