Influence of historical landscapes, drainage evolution and ecological traits on patterns of genetic diversity in Southeast Asian freshwater snakehead fishes

Eleanor A S Adamson BAppSc (Hons)

Biogeoscience Faculty of Science and Technology Queensland University of Technology Brisbane, Australia

This dissertation is submitted in fulfilment of requirements for the degree of Doctor of Philosophy 2010

KEYWORDS

Admixture analysis, Air-breathing, Clades, Channidae, Chronogram, Cytochrome b, Bayesian clustering, Drainage evolution, Ecology, Freshwater fish, Fossil calibration, Genetic diversity, Historical biogeography, Khorat Plateau, Median-joining network, Mekong River, Microsatellite DNA, Molecular evolution, Phylogeny, Phylogeography, Population genetics, RAG1, Ribosomal RNA 16S, RP1, Siam River, Snakehead fishes, Sunda Shelf, S7 intron, Tonle Sap Great Lake.

ABSTRACT

Snakehead fishes in the family Channidae are obligate freshwater fishes represented by two extant genera, the African Parachanna and the Asian Channa. These species prefer still or slow flowing water bodies, where they are top predators that exercise high levels of parental care, have the ability to breathe air, can tolerate poor water quality, and interestingly, can aestivate or traverse terrestrial habitat in response to seasonal changes in freshwater habitat availability. These attributes suggest that snakehead fishes may possess high dispersal potential, irrespective of the terrestrial barriers that would otherwise constrain the distribution of most freshwater fishes.

A number of biogeographical hypotheses have been developed to account for the modern distributions of snakehead fishes across two continents, including ancient vicariance during Gondwanan break-up, or recent colonisation tracking the formation of suitable climatic conditions. Taxonomic uncertainty also surrounds some members of the Channa genus, as geographical distributions for some taxa across southern and Southeast (SE) Asia are very large, and in one case is highly disjunct. The current study adopted a molecular genetics approach to gain an understanding of the evolution of this group of fishes, and in particular how the phylogeography of two Asian species may have been influenced by contemporary versus historical levels of dispersal and vicariance.

First, a molecular phylogeny was constructed based on multiple DNA loci and calibrated with fossil evidence to provide a dated chronology of divergence events among extant species, and also within species with widespread geographical distributions. The data provide strong evidence that trans-continental distribution of the Channidae arose as a result of dispersal out of Asia and into Africa in the mid–Eocene. Among Asian Channa, deep divergence among lineages indicates that the Oligocene-Miocene boundary was a time of significant species radiation, potentially associated with historical changes in climate and drainage geomorphology. Mid-Miocene divergence among lineages suggests that a taxonomic revision is warranted for two taxa. Deep intra-specific divergence (~8Mya) was also detected between C. striata lineages that occur sympatrically in the Mekong River Basin.

The study then examined the phylogeography and population structure of two major taxa, Channa striata (the chevron snakehead) and C. micropeltes (the giant snakehead), across SE Asia. Species-specific microsatellite loci were developed and used in addition to a mitochondrial DNA marker (Cyt b) to screen neutral genetic variation within and among wild populations.

C. striata individuals were sampled across SE Asia (n=988), with the major focus being the Mekong Basin, which is the largest drainage basin in the region. The distributions of two divergent lineages were identified and admixture analysis showed that where they co-occur they are interbreeding, indicating that after long periods of evolution in isolation, divergence has not resulted in reproductive isolation. One lineage is predominantly confined to upland areas of northern Lao PDR to the north of the Khorat Plateau, while the other, which is more closely related to individuals from southern , has a widespread distribution across mainland SE Asia and Sumatra. The phylogeographical pattern recovered is associated with past river networks, and high diversity and divergence among all populations sampled reveal that contemporary dispersal is very low for this taxon, even where populations occur in contiguous freshwater habitats.

C. micropeltes (n=280) were also sampled from across the Mekong River Basin, focusing on the lower basin where it constitutes an important wild fishery resource. In comparison with C. striata, allelic diversity and genetic divergence among populations were extremely low, suggesting very recent colonisation of the greater Mekong region. Populations were significantly structured into at least three discrete populations in the lower Mekong.

Results of this study have implications for establishing effective conservation plans for managing both species, which represent economically important wild fishery resources for the region. For C. micropeltes, it is likely that a single fisheries stock in the Tonle Sap Great Lake is being exploited by multiple fisheries operations, and future management initiatives for this species in this region will need to account for this. For C. striata, conservation of natural levels of genetic variation will require management initiatives designed to promote population persistence at very localised spatial scales, as the high level of population structuring uncovered for this species indicates that significant unique diversity is present at this fine spatial scale.

Table of Contents

Keywords
Abstract
List of tables
List of figures
List of plates
Statement of original authorship
Acknowledgements

Chapter 1. General Introduction
DISTRIBUTION PATTERNS OF WILD ORGANISMS
BIOGEOGRAPHY
FRESHWATER DRAINAGE EVOLUTION IN SE ASIA
PHYLOGEOGRAPHY
SIGNIFICANCE FOR MANAGEMENT AND CONSERVATION
SNAKEHEAD FISHES
CURRENT STUDY

Chapter 2. Systematic investigation of the Asian Snakeheads: Channa (Scopoli)
INTRODUCTION
SYSTEMATICS OF ASIAN SNAKEHEADS, A MOLECULAR PERSPECTIVE
HISTORY OF THE CHANNIDAE
CONTEMPORARY CHANNID GEOGRAPHICAL DISTRIBUTION
AIMS OF THIS CHAPTER
METHODS
SAMPLE COLLECTION
DNA MARKER SELECTION
DNA EXTRACTION, AMPLIFICATION, CLONING AND SEQUENCING OF PCR PRODUCTS
ADDITIONAL DATA
SEQUENCE ALIGNMENT AND GAP CODING
GENETIC DIVERSITY AND GENE TREE RECONSTRUCTION
SPECIES TREE INFERENCE AND ESTIMATION OF DIVERGENCE TIMES
RESULTS
CHANNA GENETIC VARIATION AND PHYLOGENETIC ANALYSES BASED ON MITOCHONDRIAL DNA
CHANNA GENETIC VARIATION AND PHYLOGENETIC ANALYSIS BASED ON NUCLEAR DNA
MULTI-LOCUS PHYLOGENY RECONSTRUCTION AND CHRONOGRAM ESTIMATION
DISCUSSION
PATTERNS OF INTRA-SPECIFIC DIVERGENCE
RELATIONSHIPS AMONG TAXA
CHANNID DIVERGENCE TIMES
CONCLUSION

Chapter 3. Patterns of genetic diversity and phylogeography of Channa striata in SE Asia
INTRODUCTION
ECOLOGY
UNDERSTANDING C. STRIATA DIVERSITY IN A REGIONAL CONTEXT
AIMS OF THIS CHAPTER
METHODS
SAMPLE COLLECTION
DNA MARKER SELECTION
MOLECULAR TECHNIQUES - MITOCHONDRIAL DNA
MOLECULAR TECHNIQUES - MICROSATELLITE DNA
STATISTICAL ANALYSES
RESULTS
MTDNA DIVERSITY AND PHYLOGEOGRAPHY
NUCLEAR DNA RESULTS
DISCUSSION
TWO C. STRIATA FORMS IN MAINLAND SE ASIA
PHYLOGEOGRAPHY OF THE WIDESPREAD FORM
FINE SCALE DIFFERENTIATION
CONCLUSION

Chapter 4. Patterns of genetic diversity and phylogeography of Channa micropeltes in the Mekong River Basin
INTRODUCTION
ECOLOGY
UNDERSTANDING C. MICROPELTES DIVERSITY IN A REGIONAL CONTEXT
AIMS OF THIS CHAPTER
METHODS
SAMPLE COLLECTION
DNA MARKER SELECTION
MOLECULAR TECHNIQUES – MITOCHONDRIAL DNA
MOLECULAR TECHNIQUES – NUCLEAR DNA
STATISTICAL ANALYSES
RESULTS
MTDNA DIVERSITY AND PHYLOGEOGRAPHY
NUCLEAR DNA RESULTS
DISCUSSION
DIVERSITY AND PHYLOGEOGRAPHY
STOCK STRUCTURE IN THE LOWER MEKONG BASIN
DIFFERENCES IN CHANNA MEKONG PHYLOGEOGRAPHY
CONCLUSION

Chapter 5. General Discussion
HISTORICAL BIOGEOGRAPHY OF TROPICAL ASIAN FRESHWATER FISHES
PHYLOGEOGRAPHY OF MEKONG FISHES
MANAGING SNAKEHEADS IN THE MEKONG: A GENETIC PERSPECTIVE
CONCLUSION

References

Appendices
APPENDIX 1. DNA EXTRACTION
APPENDIX 2. PCR CLEAN-UP AND SEQUENCING PROTOCOL
APPENDIX 3. CLONING PCR PRODUCT
APPENDIX 4. MULTILOCUS PHYLOGENIES
APPENDIX 5. PUBLISHED PAPER
APPENDIX 6. C. STRIATA TGGE
APPENDIX 7. MICROSATELLITE ISOLATION
APPENDIX 8. ADDITIONAL C. STRIATA MICROSATELLITE PRIMERS
APPENDIX 9. GELSCAN PROTOCOL
APPENDIX 10. C. STRIATA MICROSATELLITE FREQUENCIES
APPENDIX 11. C. MICROPELTES MICROSATELLITE FREQUENCIES

List of Tables

TABLE 2.1. SAMPLING DETAILS
TABLE 2.2. PRIMERS USED TO AMPLIFY DNA REGIONS USED FOR CHANNA SPP. PHYLOGENETIC ANALYSIS
TABLE 2.3. PCR CONDITIONS USED TO AMPLIFY TARGET DNA FRAGMENTS
TABLE 2.4. DETAILS OF GENBANK™ SEQUENCES USED IN THE CHANNIDAE PHYLOGENETIC ANALYSIS
TABLE 2.5. DIVERGENCE TIMES ESTIMATED BY BEAST FROM A TWO GENE DATASET
TABLE 3.1. GEOGRAPHICAL CO-ORDINATES FOR C. STRIATA SAMPLE SITES
TABLE 3.2. C. STRIATA COLLECTION SITES AND SAMPLE SIZES
TABLE 3.3. MICROSATELLITE PRIMERS FOR C. STRIATA
TABLE 3.4. VARIABLE SITES FOR 70 HAPLOTYPES OF 570BPS OF CYT B FOR C. STRIATA
TABLE 3.5. C. STRIATA MITOCHONDRIAL CYT B HAPLOTYPE FREQUENCIES
TABLE 3.6. SUMMARY STATISTICS FOR C. STRIATA MTDNA
TABLE 3.7. POPULATION PAIR-WISE ΦSTS FOR C. STRIATA MTDNA DATA
TABLE 3.8. SUMMARY STATISTICS FOR C. STRIATA MICROSATELLITE LOCI AT ALL SITES
TABLE 3.9. RESULTS OF PAIR-WISE FSTS OF C. STRIATA MICROSATELLITE DATA
TABLE 3.10. RESULTS OF PAIR-WISE RSTS OF C. STRIATA MICROSATELLITE DATA
TABLE 4.1. SAMPLING DETAILS FOR C. MICROPELTES
TABLE 4.2. MICROSATELLITE PRIMERS FOR C. MICROPELTES
TABLE 4.3. VARIABLE SITES IN C. STRIATA MTDNA CYT B HAPLOTYPES
TABLE 4.4. MITOCHONDRIAL CYT B HAPLOTYPE FREQUENCIES FOR C. MICROPELTES
TABLE 4.5. SUMMARY STATISTICS FOR C. MICROPELTES MTDNA
TABLE 4.6. POPULATION PAIR-WISE ΦST ANALYSIS OF C. MICROPELTES MTDNA
TABLE 4.7. SUMMARY STATISTICS FOR C. MICROPELTES MICROSATELLITE DIVERSITY
TABLE 4.8. RESULTS OF PAIR-WISE FST ANALYSIS OF C. MICROPELTES MICROSATELLITE DATA
TABLE 4.9. RESULTS OF PAIR-WISE RST ANALYSIS OF C. MICROPELTES MICROSATELLITE DATA

LIST OF FIGURES

FIGURE 1.1. MAP OF MAINLAND SE ASIA AND WESTERN INDONESIAN ARCHIPELAGO
FIGURE 1.2. MAINLAND SE ASIA
FIGURE 1.3. FRESHWATER DRAINAGES OF MAINLAND SE ASIA
FIGURE 2.1. NATURAL GEOGRAPHICAL DISTRIBUTION OF CHANNIDAE
FIGURE 2.2. SATURATION PLOTS FOR DNA FRAGMENTS
FIGURE 2.3. 16SRNA BAYESIAN CONSENSUS TREE
FIGURE 2.4. CYT B BAYESIAN CONSENSUS TREE
FIGURE 2.5. RAG1 BAYESIAN CONSENSUS TREE
FIGURE 2.6. BAYESIAN CONSENSUS TREE OF RP1 ALLELES
FIGURE 2.7. BAYESIAN CHANNA PHYLOGENY CONSENSUS TREE ESTIMATED FROM FOUR LOCI
FIGURE 2.8. BAYESIAN INFERENCE CHRONOGRAM
FIGURE 3.1. GLOBAL FISHERIES PRODUCTION FOR C. STRIATA 1950-2007
FIGURE 3.2. MAP OF SOUTHERN ASIA SHOWING BROAD SCALE SAMPLING SITES FOR C. STRIATA
FIGURE 3.3. MAP OF MAINLAND SE ASIA SHOWING FINE SCALE SAMPLING SITES FOR C. STRIATA
FIGURE 3.4. AGAROSE CHECK GEL SHOWING AMPLIFICATION OF CYT B MTDNA GENE FRAGMENT
FIGURE 3.5. BANDING PATTERNS OBSERVED FOR FOUR CYT B TGGE-OHA GELS
FIGURE 3.6. MICROSATELLITE GEL IMAGES
FIGURE 3.7. NEIGHBOUR JOINING TREE FOR C. STRIATA MTDNA CYT B HAPLOTYPES
FIGURE 3.8. MEDIAN-JOINING NETWORK OF C. STRIATA CYT B HAPLOTYPES
FIGURE 3.9. MAP AND MEDIAN-JOINING NETWORK SHOWING THREE CYTB MTDNA CLADES
FIGURE 3.10. MAP AND MEDIAN-JOINING NETWORK COLOURED BY SECTION OF DRAINAGE BASIN
FIGURE 3.11. MAP AND MEDIAN-JOINING NETWORK OF EA CLADE SHOWING STAR CLUSTER
FIGURE 3.12. ΦST PLOTTED AGAINST THE LOG OF GEOGRAPHIC DISTANCE FOR C. STRIATA
FIGURE 3.13. DISTRIBUTION OF MICROSATELLITE ALLELE FREQUENCIES FOR C. STRIATA
FIGURE 3.14. GRAPH OF MEASURES OF DIFFERENTIATION RANKED BY DEST
FIGURE 3.15. NJ TREE OF DEST MICROSATELLITE DIFFERENTIATION AMONG C. STRIATA SAMPLES
FIGURE 3.16. RESULTS OF FACTORIAL CORRESPONDENCE ANALYSIS
FIGURE 3.17. RESULTS OF BAYESIAN CLUSTER ANALYSIS
FIGURE 3.18. GRAPHS OF MEMBERSHIP COEFFICIENTS ESTIMATED FROM BAYESIAN CLUSTER ANALYSIS
FIGURE 3.19. ADMIXTURE LEVELS BETWEEN C. STRIATA FORMS ACROSS SE ASIA
FIGURE 3.20. ADMIXTURE ANALYSIS FOR C. STRIATA BASED ON BOTH MICROSATELLITES AND MTDNA

FIGURE 3.21. THE DISTRIBUTION OF GENETIC DIVERSITY FOR C. STRIATA IN SE ASIA
FIGURE 4.1. NATURAL GEOGRAPHICAL RANGE OF C. MICROPELTES IN SE ASIA
FIGURE 4.2. MAP OF THE MEKONG RIVER BASIN SHOWING SAMPLE SITES FOR C. MICROPELTES
FIGURE 4.3. MICROSATELLITE GEL IMAGE OF LOCUS CS-3
FIGURE 4.4. MAP AND MEDIAN-JOINING NETWORK SHOWING C. MICROPELTES CYT B HAPLOTYPES
FIGURE 4.5. RESULTS OF THE POWER ANALYSIS FOR THE C. MICROPELTES CYT B DATA SET
FIGURE 4.6. RAW ΦST PLOTTED AGAINST LOG-DISTANCE FOR C. MICROPELTES MTDNA DATA
FIGURE 4.7. GRAPH OF ΦCT VALUES OBTAINED IN SAMOVA ANALYSIS
FIGURE 4.8. ILLUSTRATION OF GEOGRAPHICAL EXTENT OF GROUPS DEFINED BY SAMOVA ANALYSIS
FIGURE 4.9. ALLELE FREQUENCIES OBSERVED AT THE THREE MICROSATELLITE LOCI
FIGURE 4.10. MEKONG BASIN MAP SHOWING C. MICROPELTES MICROSATELLITE ALLELE FREQUENCIES
FIGURE 4.11. AVERAGE ALLELIC RICHNESS FOR C. MICROPELTES MICROSATELLITE DATA
FIGURE 4.12. RESULTS OF THE POWER ANALYSIS FOR THE C. MICROPELTES MICROSATELLITE DATA SET
FIGURE 4.13. GRAPH OF PAIR-WISE MEASURES OF DIFFERENTIATION RANKED BY DEST
FIGURE 4.14. NJ TREE OF DEST MICROSATELLITE DIFFERENTIATION AMONG C. MICROPELTES SAMPLES
FIGURE 4.15. GRAPH OF FCT VALUES OBTAINED IN SAMOVA ANALYSIS
FIGURE 4.16. RAW FST PLOTTED AGAINST LOG-DISTANCE FOR C. MICROPELTES MICROSATELLITE DATA
FIGURE 4.17. RESULTS OF FACTORIAL CORRESPONDENCE ANALYSIS
FIGURE 4.18. RESULTS OF BAYESIAN CLUSTER ANALYSIS
FIGURE 4.19. GRAPHS OF CLUSTER MEMBERSHIP COEFFICIENTS

LIST OF PLATES

PLATE 2.1. C. GACHUA – THE DWARF SNAKEHEAD
PLATE 2.2. C. LUCIUS – THE SPLENDID SNAKEHEAD
PLATE 2.3. C. MICROPELTES – THE GIANT SNAKEHEAD
PLATE 2.4. C. STRIATA – THE STRIPED OR CHEVRON SNAKEHEAD
PLATE 2.5. C. SP “X” (UNIDENTIFIED)
PLATE 2.6. ADULT C. MARULIA GUARDING YOUNG
PLATE 4.1. PHENOTYPIC VARIATION IN CHANNA MICROPELTES ACROSS THE MEKONG RIVER BASIN

STATEMENT OF ORIGINAL AUTHORSHIP

The work contained in this thesis has not been previously submitted to meet requirements for an award at this or any other higher education institution. To the best of my knowledge and belief, the thesis contains no material previously published or written by another person except where due reference is made.

Eleanor A S Adamson

September 2010

ACKNOWLEDGEMENTS

Thank you to my supervisors, Professor Peter Mather and Dr David Hurwood. Thanks for your fantastic open door policy, faith, positivity, interest, tolerance, encouragement and patience. Thanks for introducing me to the Mekong and involving me in your research. Most of all, thanks for believing in me and offering me the chance to tackle such a great project, it’s been a wonderful experience.

Thank you to all my international collaborators, who not only assisted with collection but also made me welcome in their home countries and revealed to me a side of SE Asia that I could never have experienced on my own. I will treasure those wonderful field trip memories forever. Special thanks to Chris Barlow, Boonsong, Nguyen Nguyen Du, Dr Nguyen Van Hao, Suchart Ingthamjitr, Estu Nugroho, Phuc Dinh Phan, Pom, Lieng Sopha, Naruepon Sukmasavin, Ubolratana Suntornata, Nao Thouk and the Mekong River Commission.

Thank you to Vincent Chand, for technical assistance, laboratory support and friendship. Without your help I would never have isolated those microsats! Your expertise has been essential to my research, and I’ll be lost without you to nag!

Thank you to the great bunch of NRS/Biogeoscience academics for fostering a great academic environment and sense of community; it has been a pleasure to be part of the ecology group. Special thanks to Andrew Baker, Susan Fuller, Tanya Scharashkin and Ian Williamson for ongoing support, friendship, advice and occasional wisdom. Thanks also to Dr David Strayer, whose insightful question at NABS2009 encouraged me to look more closely at the genetics of introduced populations.

To my fellow postgrads, thanks for sharing my journey, for insightful discussions, silliness, help, and companionship over so many lunches, coffees (and beers) especially Bita Archangi, Terrence Dammannagoda, Litticia Bryant, Little Susan harveyii and Matt Krosch (Matt - sorry for the countless interruptions, my thesis is all the better from utilising your vast talent as a personal dictionary/thesaurus).

Thank you to my Mum Peggy. Thanks for teaching me about the world, and fostering my interest and understanding of all things scientific. Thanks for your infinite patience in answering my multitude of childhood “how?” and “why?” questions. Thanks for providing me with a great education and the skill set to achieve my goals. Thanks for always supporting me in the life choices I’ve made. And, thanks for all those cash injections when I’ve needed them, you’re the best!

Lastly, to my partner Andrew: without your love and support this journey would have been a very different and much harder road to travel. Thanks for enduring the long hours I’ve spent away from home in the field, at conferences, in the lab, and in front of my computer, for tolerating my total absorption, procrastination, lack of income and general lack of sanity.

During my PhD I received financial support from a QUT Postgraduate Research Award, with added funding from QUT’s Institute for Sustainable Resources. Additional laboratory costs were kindly absorbed by PM’s research funds. The draft thesis benefited greatly from comments by A Baker and S Fuller, and two excellent anonymous external reviewers.

Chapter 1 General Introduction

Introduction

For a long time naturalists have recognised that landscape evolution has had a large influence on patterns of biodiversity and the distributions of different species and groups of species. Over the last 30 years, advances in genetic technologies and greater understanding of molecular evolution have allowed investigations into the relationship between geography and spatial patterns of genetic diversity. Such studies have provided new insights into the processes that have determined contemporary biogeographical patterns among groups of taxa. Furthermore, improvements in the theoretical understanding of population genetics and in analytical interpretation of genetic data have allowed investigations into the spatial distribution of genetic diversity at the intra-specific level (Avise 2009; Hickerson et al. 2010). This technology has been applied to a wide range of taxa, including plants, invertebrates and vertebrates in both aquatic and terrestrial environments (Beheregaray 2008).

It has become evident that historical landscape evolution can have significant impacts on the way genetic diversity is distributed in natural populations at the intra-specific level. Patterns of genetic diversity, however, are not always easy to predict, especially when contemporary landscapes differ greatly from the configuration of past landscapes. It is now widely recognised that conservation plans must account for spatial patterns of genetic diversity if the long term goal is to ensure species and/or population persistence over evolutionary time. Thus, information on phylogenetic relationships at the inter-specific and intra-specific levels is relevant both to addressing conservation goals and to providing insight into how landscape processes have shaped the evolution of regional biotas.

Distribution patterns of wild organisms

Biogeography

Biogeography considers the history of organisms in a geographical landscape (Morrone & Crisci 1995). The contemporary spatial distributions of organisms arise as a result of historical range expansions, extinctions, and speciation (Futuyma 1998). Underlying these events are the processes of dispersal (movement among sites, including colonisation of formerly unoccupied sites) and vicariance (division of an ancestral population resulting from cessation of dispersal among sites). Patterns of dispersal and vicariance are linked integrally to an organism’s ecology, and in particular to its intrinsic ability to disperse (vagility) (Kodandaramaiah 2009), especially where suitable habitat patches are subdivided across a natural geographical landscape. Historical changes in climate and geomorphology alter the spatial arrangement of suitable habitats, and hence drive changes in patterns of dispersal and vicariance (Wiens & Donoghue 2004; Woodruff 2010). In fact, in a biogeographical sense, “ecology and history are indissolubly tied together” (Posadas et al. 2006, p398), as an organism’s ecology will inherently determine its response to historical processes that modify natural geographical landscapes.

The relationship between ecology and history is clearly evident in the biogeography of contemporary tropical Asian biota. The flora and fauna of the Indian subcontinent have mixed origins, including representatives of relictual lineages of Gondwanan origin and more recent arrivals that evolved in the northern hemisphere (Karanth 2006). The ecology of some Gondwanan relictual Indian taxa has enabled them to disperse widely across southern and Southeast (SE) Asia in the time since the continental plate collision with Eurasia. Examples of taxa that have dispersed “out-of-India” include a number of amphibians (Bossuyt & Milinkovitch 2001; Gower et al. 2002), lizards (Macey et al. 2000; Melville et al. 2009), and plants (Conti et al. 2002; Morley 2003). Similarly, after the India-Asia plate collision, taxa of Eurasian origin were able to disperse into India, for example some flowering plants (Bande & Prakash 1986) and gastropods (Kohler & Glaubrecht 2007).

In contrast, other groups have not dispersed either into or out of the Indian subcontinent post-collision, for example burrowing frogs (Biju & Bossuyt 2003) and millipedes (Wesener & VandenSpiegel 2009). This is largely because, in spite of at least 35 million years of dry land connection with Eurasia (Ali & Aitchison 2008), the ecologies of these taxa have limited their ability to disperse and establish natural ranges that span Eurasia and the Indian subcontinent.

The biogeography of SE Asian biota has been of particular interest since the mid 19th century, when Alfred Wallace first noted an unexpectedly large shift in species composition between the Indonesian islands of Bali and Lombok, signifying an abrupt change from Asian species to those with Australian affinities across only a 25km ocean gap (Wallace 1863). Wallace suggested that the modern distributions of species he observed reflected the geological history of the land masses, and indeed, all western islands once formed a continuous land mass, Sundaland (Figure 1.1), that was connected to mainland SE Asia until the Eocene (54-33Mya) (Hall 1998). These islands have continued to experience periodic dry land connections with mainland SE Asia and each other during past times of lowered sea level (Moss & Wilson 1998; Woodruff 2010), including most recently during the Pleistocene (Voris 2000; Woodruff & Turner 2009).

Figure 1.1. Map of mainland SE Asia and western Indonesian archipelago showing the current land mass, contemporary freshwater drainage lines (white), the historical extent of Sundaland including drowned Molengraaff rivers, and Wallace’s original biogeographical line (1863). Figure modified from Voris (2000).

Periods of dry land connection between Sundaic islands and the SE Asian mainland have allowed floral and faunal (biotic) exchange across the SE Asian region. Many taxa are now known to have dispersed among islands during the Miocene and Pliocene, including many groups of mammals, frogs and snakes (Harrison et al. 2006; Heaney 1984; Inger & Voris 2001; Ruedi & Fumagalli 1996). As the geographical distributions of more SE Asian taxa were examined, Wallace and later biogeographers revised Wallace’s (1863) original zoogeographic division, and various boundaries have been drawn based on the distribution of different faunal groups (Mayr 1944). These boundaries, in essence, demonstrate how different life history traits and dispersal capabilities among organisms produce different natural geographical ranges, even where extrinsic historical factors have been the same.

For some SE Asian taxa, ecological traits have prevented widespread dispersal, even when marine regressions exposed dry land between islands on the Sunda Shelf. Dispersal of forest-dependent mammals, for example, may have been limited by the presence of large areas of open grasslands in the lowlands between closed forest habitats (Bird et al. 2005; Meijaard 2003), although the extent of grasslands and forest cover on the Sunda Shelf during the last glacial maximum remains controversial (Cannon et al. 2009; Woodruff 2010). In contrast, for members of other groups, such as some moths that are accomplished fliers, dispersal between island land masses has probably been ongoing, even during periods where marine transgressions have submerged land bridges on the Sunda Shelf (Beck et al. 2006). Within mainland SE Asia, ecological and geomorphological barriers are also known to have limited dispersal of some taxa, for example gibbons have been isolated either side of major rivers (Meijaard & Groves 2006; Thinh et al. 2010). These examples demonstrate that isolation of historical habitats may or may not limit dispersal, depending on the specific ecological traits of the taxa considered.

Obligate freshwater fishes provide a unique model for biogeographical study because, by nature, their distributions are restricted by the physical boundaries of the environment they inhabit, i.e., freshwater drainages. Unlike terrestrial plant and animal taxa, with few exceptions, freshwater fishes are unable to disperse via winds, oceanic floating debris, birds, or other secondary agents (Briggs 2003b). For most freshwater fishes this ecological constraint restricts dispersal to historical periods of drainage connectivity, such as before geomorphological change involving tectonic or erosional processes shaped current drainage configurations, amalgamating or subdividing historical drainage networks, or when isolated drainages met in extended river basins on exposed continental shelves during periods of lowered sea levels.

In southern and SE Asia, geomorphological and eustatic processes are known to have played a major role in reshaping freshwater networks throughout the Cenozoic, and in turn have influenced the distribution of freshwater taxa. SE Asia has among the highest freshwater diversity found anywhere in the world (Allan et al. 2005; Dudgeon 2000a), and different groups of taxa with varied ecologies are likely to have responded in different ways to changes in freshwater habitats associated with the evolution of contemporary drainage basins.

Freshwater drainage evolution in SE Asia

The precise history of freshwater drainage evolution across southern and SE Asia is so far relatively poorly documented; however, a number of significant changes are known to have occurred recently, and in the early and late Tertiary. After the Indian-Asian plate collision, the up-thrust of the Himalayas at the northern margin of the Indian subcontinent led to the formation of large eastern flowing rivers (Brookfield 1998; Hall 1998). Major uplifts in the Himalayas occurred at the Eocene-Oligocene boundary ~38Mya, during the Oligocene ~30Mya, in the Late Miocene 8Mya, and in the mid Pliocene ~3Mya (Dupont-Nivet et al. 2008; Pei et al. 2009; Rowley & Currie 2006; Zhisheng et al. 2001). This mountain building triggered large scale drainage captures (Brookfield 1998; Clark et al. 2004; Clift et al. 2006). Evolution of large eastern flowing rivers at this time may have promoted the eastward dispersal of Gondwanan freshwater taxa into SE Asia (Hall 1998).

The Mekong River is now the longest and largest drainage in SE Asia (van Zalinge et al. 2003; Zakaria-Ismail 1994), and it is also the most zoogeographically diverse freshwater region in Asia (Allan et al. 2005; Kottelat 1985). The Mekong is thought to have assumed its current configuration as recently as the Pleistocene (Rainboth 1996a). From headwaters on the Tibetan Plateau, the Mekong flows over 4,350km to drain into the South China Sea. The majority of the River Basin (82%) is located on SE Asia’s Indo-Chinese peninsula, where the drainage encompasses Lao PDR, northeastern Thailand, Cambodia, and parts of southern Vietnam (Figure 1.2).

Figure 1.2 Mainland SE Asia: (a) Mekong River Basin (yellow) and Chao Phraya Basin (orange), and (b) political boundaries of the Indo-Chinese Peninsula.

During its formation, the Mekong drainage system probably incorporated many parts of other river networks that previously drained the Indo-Chinese peninsula, and in doing so accumulated many species that had previously evolved in isolation (Coates 2001; Rainboth 1996a). Although there is no comprehensive reconstruction of past drainage lines based on geological evidence, it is likely that the growing Mekong Basin captured the headwaters of the ancestral Siam/Yom River (now the Chao Phraya River that drains central Thailand) and also tributaries that drain the Khorat Plateau of northeastern Thailand (Brookfield 1998; Rainboth 1996b). Figure 1.3 shows one reconstruction of former drainage lines in comparison with the current course of the Mekong River.

Figure 1.3 Freshwater drainages of mainland SE Asia, (a) Reconstruction of historical drainage morphology modified from Rainboth (1996b, page 168), (b) Contemporary drainage lines, Mekong River bolded, and (c) four Mekong Freshwater ecoregions as defined by Abell et al. (2008), Tibetan headwater ecoregion (Upper Lancang) not shown.

Mekong geomorphological changes have been ongoing. As recently as 0.6Mya, tectonic subsidence in the lower part of the drainage on the Lao-Cambodian border created an extensive series of rapids, the Khone Falls (Rainboth 1996a). Further subsidence of the Cambodian plate only 6Kya created the Tonle Sap Great Lake in central Cambodia, which is now the largest permanent freshwater body in SE Asia (Penny 2006; Rainboth 1996a). In the dry season the lake covers 2,520km² and is mostly less than 3m deep, but in the wet season it expands to around 15,780km², flooding savannah and deciduous forests as it receives floodwaters from the main Mekong channel (Kottelat 1985; Rainboth 1996a; Rainboth 1996b).

Many freshwater taxa are shared among mainland SE Asian drainages and drainage sections (Ng & Rainboth 2001; Rainboth 1996b), and are believed to have assumed their contemporary distributions as historical drainage rearrangement enabled dispersal into new drainage networks. In the Mekong Basin itself, five freshwater ecoregions have recently been defined based on broad ecological and evolutionary patterns in species distributions (Abell et al. 2008) (see Figure 1.3c). In general, these regions correspond with sections of the Basin that are thought to have previously been isolated.

In addition to large scale geomorphological change that has divided and united freshwater habitats in SE Asia, mainland and island freshwater habitats that are now isolated have also experienced extended periods of connectivity in drowned river basins on the Sunda Shelf (Molengraaff Rivers, see Figure 1.1) (Molengraaff 1921; Sathiamurthy & Voris 2006; Voris 2000). Many groups of freshwater fishes have contemporary distributions across island and mainland drainages that reflect this widespread pattern of freshwater connectivity (Dodson et al. 1995; Kottelat 1985; Rainboth 1996b; Taki 1975; Yap 2002).

Phylogeography

Ongoing processes of dispersal and vicariance, which over long periods of evolutionary time shape the historical biogeography of groups of taxa, also shape the distribution of lineages within taxa over shorter time scales. Information about patterns of variation of neutrally evolving genetic lineages within a single species can be used to trace the micro-evolutionary history of an organism, and can provide insights into how dispersal events, vicariance and demographic fluctuations have shaped the distribution of species in space and time (Avise 1998; Avise 2009; Avise et al. 1987).

The magnitude and patterns of phylogenetic differentiation among individuals can be interpreted in relation to models of population genetics, for example Coalescent Theory (Donnelly & Tavaré 1995; Hudson 1990; Kingman 1982; Stephens & Donnelly 2000). Using Coalescent Theory, simulations of gene genealogies back in time based on current gene variation allow inferences to be made about the history of populations, including estimates of past population sizes, growth rates, and the time to most recent common ancestor among samples (Avise 1989; Kingman 2000; Kuhner 2009; Rand 1996). When geographical distributions of genetic variation are considered, further inferences can be made concerning the origin of historical population founders and about ongoing rates of migration among demes, providing insight into how past and contemporary patterns of gene flow have shaped natural geographical ranges and how they contribute to the persistence of extant populations. The study of genealogy in a geographical landscape is known as phylogeography. Avise et al. (1987) outlined five generalised classes of phylogeographical pattern that may be expected in contiguous and discontinuous populations under varying amounts of gene flow and historical barriers to dispersal. They range from total sub-division with complete reciprocal monophyly and large divergence among individual demes, to total homogeneity of closely related lineages across large undivided geographical ranges. Each pattern can be related to the likely evolutionary circumstances under which it arose. For example, a number of deeply divergent lineages co-occurring across a large geographical range may indicate a history of recent secondary population admixture after a long period of evolution in isolated ranges, or alternatively may indicate the presence of previously unidentified sibling taxa that are reproductively isolated (Avise et al. 1987).
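As a minimal illustration of the coalescent reasoning described above (a sketch under stated assumptions, not an analysis performed in this study), the time to most recent common ancestor for a sample of lineages can be simulated under the standard Kingman coalescent. The sketch assumes a single panmictic population of constant size, neutral variation, and time measured in coalescent units scaled by effective population size. All function names are hypothetical.

```python
import random


def simulate_tmrca(n_samples, rng):
    """Simulate one time to most recent common ancestor (TMRCA),
    in coalescent units, for n_samples lineages.

    Assumption: standard Kingman coalescent (constant population
    size, no structure, no selection)."""
    t = 0.0
    k = n_samples
    while k > 1:
        # While k lineages remain, k*(k-1)/2 pairs can coalesce,
        # and the waiting time to the next event is exponential.
        rate = k * (k - 1) / 2.0
        t += rng.expovariate(rate)
        k -= 1
    return t


def expected_tmrca(n_samples):
    """Analytical expectation of the TMRCA under the same model."""
    return 2.0 * (1.0 - 1.0 / n_samples)


def mean_simulated_tmrca(n_samples, replicates=20000, seed=1):
    """Average TMRCA over many simulated genealogies."""
    rng = random.Random(seed)
    total = sum(simulate_tmrca(n_samples, rng) for _ in range(replicates))
    return total / replicates
```

Under these assumptions the simulated mean for a sample of 10 lineages should lie close to the analytical expectation 2(1 - 1/n). Observed genealogies that depart strongly from such expectations may point to population growth, decline or subdivision, which is the kind of inference referred to above.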

Freshwater fish populations are often structured spatially in nature (Avise 1992; Gyllensten 1985; Ward et al. 1994). This is due to the nature of freshwater environments, which often possess many potential barriers to dispersal for aquatic taxa, for example drainage divides, waterfalls, rapids, lakes, and even dams (Crispo et al. 2006; Meffe & Vrijenhoek 1988; Monaghan et al. 2001; Yamamoto et al. 2004). When the history of regional drainage patterns has been stable, predictions can be made regarding the degree of phylogeographical structuring within and among drainages. Generally, highest phylogenetic divergence is expected among populations that inhabit independent, historically isolated drainage basins. Within drainage networks, gene exchange and genetic similarity are expected to be highest (and phylogenetic divergence lowest) when populations are geographically close and barriers to dispersal are absent or limited (Meffe & Vrijenhoek 1988). In addition, unidirectional stream flow can bias dispersal, leading to the accumulation of genetic diversity in downstream populations (Hernandez-Martich & Smith 1997).

These phylogeographical predictions, based on a scenario of historically stable drainage geomorphology, are unlikely to be appropriate for describing phylogeographical patterns for taxa that have experienced recent large scale changes in freshwater habitat and habitat connectivity. In northeastern Australia and New Zealand, for example, where recent geomorphological change has re-shaped drainage basins, intra-specific diversity and divergence in freshwater fishes retain a signature of ancient freshwater connectivity (Burridge et al. 2007; Hurwood & Hughes 1998; Waters et al. 2001). In SE Asia, where historically recent drainage re-arrangement and eustasy have altered the probability of freshwater connectivity among stream sections and drainage basins, some obligate freshwater taxa may also show patterns of phylogeographical structure that have evolved in ancestral river networks. Similarity among isolated rivers has already been noted for a few SE Asian freshwater fishes, although patterns vary with the taxa considered (Adamson et al. 2009; Hurwood et al. 2008; McConnell 2004; Rainboth 1996b).

Adding further complexity to predictions of population structure, intrinsic ecological characteristics can play a significant role in determining the phylogeographical structuring of aquatic taxa in contiguous freshwater habitats (Bohonak 1999). Mode of dispersal (passive or active) (Avise 1992; Bohonak 1999), behavioural traits (such as territorial defence), life history traits (including discrete migration pathways, or breeding sites), and specific reproductive behaviours (Avise et al. 2002; Bohonak 1999; So et al. 2006a) can all influence phylogeography and population structure. In the Mekong River, discrete migration pathways are thought to maintain population structure in small cyprinids (Hurwood et al. 2008), while breeding site affinity is thought to influence population structure in a large catfish species (So et al. 2006a; So et al. 2006b).

The many freshwater taxa endemic to the recently evolved SE Asian river networks are likely to display a vast array of phylogeographical structures, influenced by regional history and unique ecological traits of target species.

Significance for management and conservation

Information on the level of contemporary gene flow, population structuring and historical phylogeography for a species has important implications for conservation and management. Understanding the magnitude of gene flow among local populations and groups of populations helps determine the spatial limits of functionally independent population units, or ‘Management Units’ (MUs) (Palsbøll et al. 2007). MUs are the logical unit for population monitoring and demographic study, and their recognition is fundamental for short term management of sustainable populations (Moritz 1994). This is especially true for harvested populations, such as heavily fished species, because management for (maximum) sustainable harvesting must account for the level at which fisheries resources operate as independent self-recruiting units, including where multiple MUs are present in established fishery “stocks” (stock complexity), and conversely, where multiple fisheries exploit single MUs (Begg & Waldman 1999; Ovenden 1990; Stephenson 1999).

At a higher level of hierarchical population structuring, patterns of phylogeographical structuring are important for defining Evolutionarily Significant Units (ESUs) for long term conservation management (Moritz 1994). ESUs may constitute one or more subpopulations (MUs), and each ESU represents a significant proportion of the total genetic diversity present within a species across its natural geographical range (McElhany et al. 2000). The idea behind ESUs is that, where a species is significantly structured into divergent sets of populations that experience substantial reproductive isolation, each individual unit contains distinct genetic variation, including adaptive differences, and as a consequence represents a unique evolutionary potential (Moritz 1994; Palsbøll et al. 2007). Over the long term, one of the main goals for conservation of biological diversity is to maintain the evolutionary potential of species by conserving levels of genetic diversity and genetic variability within and among natural populations (Ryman 1991; Woodruff 2001).

In SE Asia, where freshwaters have undergone significant large scale changes in recent history, phylogeographical patterns and the extent of ESUs for freshwater taxa are likely to be difficult to predict. Moreover, there is a high potential for cryptic diversity, as drainage amalgamation may have united divergent lineages and sibling species in single drainages. Effective management for conservation requires an understanding of how target organisms are structured genetically, involving information on the geographical scale at which independent populations, ESUs and species occur naturally.

Genetic analysis of intra-specific fish populations can reveal the distribution of discrete, semi-independent, or panmictic populations (Carvalho & Hauser 1994; Ward 2000). Understanding the magnitude of intra-specific population differentiation and/or similarity and the regional processes that maintain it will be critical for managing freshwater fish stocks effectively in perturbed environments. It will also allow management decisions to incorporate knowledge about ESUs and MUs, and to assess the potential impacts of erecting barriers to migration, changing habitat or flow regimes, and to account for high levels of harvesting at different spatial scales (Carvalho & Hauser 1994).

It is likely that due to historical geomorphological change and eustasy, SE Asian freshwater populations have undergone repeated episodes of vicariance and reconnection during recent times. This has probably led to many cases of secondary admixture, and in some extreme instances, the formation of sympatric sibling species (e.g., Henicorhynchus siamensis and H. lobatus distributions are sympatric across the greater Mekong zoogeographic region) (Rainboth 1996a). The extent to which the distributions of individual taxa have been affected by changes in drainage structure is determined by specific life history traits (Rainboth 1996b). Rainboth (1996b) found that at the intra-specific level, species of the cyprinid genus Hypsibarbus with preferences for small upland streams and coarse substrates show a different history of drainage invasion compared with species with a preference for large river habitat.

Synthesis of ecological data, phylogenetic reconstructions and geomorphological evidence can elucidate the phylogeography and population structure of other SE Asian freshwater fishes, providing insight into the regional processes that have shaped modern distributions of freshwater fish populations at the intra-specific level. Recent regional processes are likely to have had a huge impact on the distribution of intra-specific genetic diversity in freshwater fishes, and also on the scale at which local fish populations operate as independent demographic units. In SE Asia and in particular the greater Mekong region, fishes comprising a single taxon within a single drainage may represent divergent populations that have evolved independently in isolation for long periods of time and that have only recently recontacted. These populations may have developed life history traits specific to sub-drainages or river sections, including discrete migration pathways or discrete breeding grounds, and there may be little if any exchange between neighbouring stocks. Conversely but equally likely, taxa that evolved in past drainage configurations may have undergone historically recent expansions in range and population size when new drainage arrangements have allowed dispersal. Newly established stocks may now rely heavily on the inputs from migrants moving across the system to maintain local population sizes and levels of genetic diversity, or may have specific life history stages that tie individuals to specific localities or habitats for part of their life cycle. Thus, it is likely that the magnitude of recent geomorphological change and the frequency of drainage connection and vicariance in SE Asia have affected different fish taxa in a variety of ways and to different extents. 
In contrast to the historically stable biogeographical patterns and congruent intra-specific structuring reported for fishes in other world bioregions (e.g., Australia (Unmack 2001) and southern North America (Avise 1992; Bermingham & Avise 1986)), SE Asian freshwater fishes are likely to display a range of population structures that cannot easily be predicted from ancient geomorphology alone, and that may not even be congruent among congeneric taxa.

Snakehead Fishes

Asian snakehead fishes (Channa, Scopoli) are obligate freshwater fishes, with a natural distribution across southern and SE Asia including the Sunda Islands. They constitute a large proportion of the 2.5 million ton annual freshwater catch across the Mekong River Basin (Baran 2005) and command a high price on the Asian fish market (Lieng & van Zalinge unpublished; Wee 1982). There has been some taxonomic uncertainty surrounding members of the Channa genus (Courtenay & Williams 2004), as geographical distributions for some taxa across southern and SE Asia are vast, and in one case highly disjunct.

Channa species possess a number of unusual ecological traits that make predictions of population structure and phylogeography from geomorphological data alone difficult. For instance, snakeheads have a number of adaptations that could facilitate overland dispersal, including modified areas of the respiratory system that allow individuals to breathe air for extended periods of time (Hughes & Munshi 1986), and the ability to propel themselves across terrestrial environments by flexing their bodies (“ipsilateral tail action”) (Sayer 2005). These characteristics may indicate the importance of overland dispersal for members of the genus, and suggest that natural geographical distributions may not be correlated totally with the evolution of watersheds in SE Asia as may be the case for most other obligate freshwater fish taxa.

Channa are also known to tolerate poor water quality, can aestivate buried in mud during dry periods between flood events, and are traditionally known to be sedentary in nature (Wee 1982). Moreover, adults exercise high levels of parental care, “nest building” in mud and guarding young after hatching (Lee & Ng 1994; Wee 1982). These life history characteristics suggest that, despite the potential for high range dispersal, actual dispersal could be much more limited. In this case, large contemporary geographical distributions are unlikely to be the consequence of recent natural dispersal within and among freshwater habitats, but could instead result from ancient colonisation events or recent human mediated introductions.

Different levels of contemporary and historical gene flow, associated with ancient range establishment, recent natural colonisation, human mediated range expansions, or in response to landscape changes, determine patterns of phylogeographical structuring (Avise 2000). For snakehead fishes, ecological information alone is insufficient to accurately predict the level of geographical population structuring that may be present, both among populations within newly formed river networks and across populations now inhabiting isolated freshwater drainage basins. Knowledge of this nature is important, however, for devising effective conservation strategies in freshwater environments undergoing changes associated with human population (Carvalho & Hauser 1994; Ryman et al. 1995; Vrijenhoek 1998).

In SE Asia, the most common and most economically important snakehead species are Channa striata (the chevron snakehead) and C. micropeltes (the giant snakehead). Both species are caught in traps, with hook and line, by draining ponds, and in rice-fish culture (Deap et al. 2003; Rothuis et al. 1998b), but large-scale fishing operations increasingly target the two species. In Cambodia, these species combined constitute 11% of the total freshwater catch (Baran 2005); however, in targeted fishing operations, for example on the shores of the Cambodian Great Lake, they comprise the majority of the catch (Campbell et al. 2006), and can represent as much as 43% of local catches (Rot unpublished).

Although still relatively abundant in the Mekong Basin, Channa species have experienced recent population declines in other regions due to lack of management (e.g., Yusoff et al. 2006). Mekong Channa populations may also be vulnerable to decline, firstly as their reproductive cycle is linked closely with seasonal flood plain habitats that are increasingly exposed to modification (Lieng & van Zalinge unpublished), and secondly as populations are subject to unrestricted, high levels of harvesting of both juveniles and adults across the Mekong Basin.

As Channa species represent an important cultural and economic resource (Lee & Ng 1994; Wee 1982), as well as comprising a significant proportion of available protein for many subsistence families (Coates 2002), the conservation of these fish in the wild and their maintenance as viable fishery resources is an important priority for the riparian countries of SE Asia. Ecological information currently available on Channa species cannot adequately predict the stock structure of species in the genus, but successful sustainable management will require knowledge of this nature. Phylogenetic and population genetic approaches that incorporate molecular and ecological data and regional history can elucidate contemporary stock structure for important snakehead species. This information will be relevant to the formulation of future management plans that can assist conservation of this economically important group of fishes.

Current study

This thesis aimed to examine relationships among Asian snakehead fishes in order to clarify levels of genetic divergence within and among currently recognised species, and to determine the phylogeographical structure of two economically important species across SE Asia. A molecular genetics approach was employed to date divergences and to assess levels of contemporary and historical gene flow. Patterns of molecular evolution were interpreted in a regional historical biogeographical context to provide insight into what factors are likely to have impacted on the macro- and micro-evolutionary history of Channa spp. in SE Asia.

Chapter 2 aimed to construct a fossil calibrated molecular phylogeny and used this information to propose a hypothesis for the evolutionary history of the Channidae, and to elucidate levels of molecular divergence among and within snakehead species in SE Asia.

Chapter 3 aimed to determine the phylogeography and population structure of C. striata across SE Asia and examine the influence that historical landscape evolution and ecological traits have had on determining contemporary patterns of genetic diversity and divergence.

Chapter 4 aimed to determine the population structure of C. micropeltes in the Lower Mekong Basin, where this species is an important component of the Cambodian freshwater fish catch.

Chapter 5 aimed to place the historical biogeography and phylogeography of snakehead fishes in the broader context of other SE Asian endemic freshwater fish. Results were considered in relation to fisheries management for conservation and ongoing wild fishery production of snakeheads.

Chapter 2 Systematic investigation of the Asian snakeheads: Channa (Scopoli)

Phylogeny of the Channidae

INTRODUCTION

The primary aim of systematic biology is to characterise the phylogenetic relationships among species and groups of species, and in doing so, to describe patterns of biological diversity in an historical and evolutionary context (Kullander 1999). A number of definitions have been proposed that address the concept of what actually constitutes a ‘species’ (Mallet 1995). For the purposes of this chapter, a species can be thought of as a functional unit in a community that incorporates genetic (and demographic) exchangeability and that is monophyletic (Kullander 1999). Species are one of the fundamental units in biology (de Queiroz 2007), and are important in understanding biodiversity and evolutionary processes. Therefore, understanding the patterns by which species are related in time and/or space allows a researcher to address questions about the history of lineages, and about contemporary distributions of species, groups of species and communities. Systematics includes, and is relevant to, taxonomy (the formal naming and description of taxa) but is broader in scope because it considers the relationships among taxa.

The taxonomy of SE Asian freshwater biodiversity is, in general, less than comprehensive. Across Asia currently, there are approximately 3,500 described species of freshwater fish, mostly tropical, but it is likely that many more await formal description and classification (Kullander 1999). In the Mekong Basin alone there are records for at least 1,200–1,500 fish species (Bao et al. 2001; Kottelat 2001; MRC 2003; Rainboth 1996a), but as groups of fishes undergo systematic revisions, many new species are routinely resolved (Rainboth 1996a).

Unrecognised biodiversity is likely to be represented by species that exist in low numbers or that possess restricted ranges, small species that lack significant commercial value, and by morphologically cryptic species. Although in general, the occurrence of cryptic species in sympatry is likely to be rare for freshwater fish (Kotlík & Berrebi 2001), in SE Asia a recent history of large scale geomorphological upheaval may have promoted mixing of divergent lineages and sibling taxa via repeated isolation and reconnection of freshwater environments over evolutionary timescales (Kottelat 1985; Rainboth 1996a).

A recent molecular study of SE Asian mudcarp (Henicorhynchus spp.) revealed deep divergence between species that are very similar morphologically, and also recognised highly divergent populations of a single Henicorhynchus “species” that co-exist in the same drainage (Hurwood et al. 2008). This outcome suggests that systematics at the genus and species levels for Henicorhynchus is not well understood. Recent molecular study of some Mekong cyprinids also recognised a lack of genetic differentiation in taxa that appeared morphologically distinct and are currently classified as different species (Cyclocheilichthys lagleri and C. repasson) (Hurwood, unpublished). These findings demonstrate that in the absence of comprehensive systematic knowledge, biodiversity is likely to go undetected and its significance for conservation management will be unrecognised.

Systematics of Asian snakeheads, a molecular perspective

Asian snakeheads (Channa spp.) are of high economic significance and are common in markets across southern and SE Asia, and this may explain why they have received a moderate amount of systematic attention in comparison with many other Asian freshwater fish groups (for examples, see Abol-Munafi et al. 2007; Li et al. 2006; Vishwanath & Geetakumari 2009). Recent studies have recognised 26 (Li et al. 2006) or 27 (Ambak et al. 2006) species in the Channa genus, although this number may well change as the group is subject to ongoing taxonomic revision and reclassification (e.g., Alfred 1963; Musikasinthorn & Taki 2001; Tweedie 1950), and also as more complete ichthyological surveys lead to identification of new species with restricted ranges (e.g., Musikasinthorn 1998; Zhang et al. 2002).

To date, channid systematics has been focused primarily at the recognised species level or at higher taxonomic levels, ignoring the possibility of high levels of intra-specific differentiation. It has been suggested, however, that at least five species in the genus may in fact represent “species complexes” (Courtenay & Williams 2004), namely C. gachua, C. marulius, C. micropeltes, C. punctata, and C. striata. Two of these species (C. striata and C. micropeltes) are harvested from the wild in large numbers and are now also widely cultured across SE Asia. Given their economic importance, a better understanding of variation in these two species from a systematic standpoint needs to be developed.

Molecular phylogenetics provides an approach for addressing systematic questions at the individual, species, and genus level. In fact, the main goal of phylogenetic analysis is to reconstruct the evolutionary history of a group of organisms (Hillis 1987). In a strict sense “phylogeny is the (absolute) history of species and populations” (Edwards 2009), although in practice, gene trees that are used to infer the history of DNA lineages are constructed as an approximation for species’ history. Regardless, molecular phylogenies can resolve important information about levels of divergence among taxa and patterns of species divergence. This information is relevant to managers who seek to conserve wild populations for their intrinsic and/or commercial value (as is the case for many fish species or stocks).

Furthermore, phylogenies can be calibrated with molecular rates of evolution, and fossil or geological data to provide a chronological estimation of divergence times for discrete lineages and hence the evolution of species. This opens the door to addressing a range of questions regarding modes and causes of evolution for the group in question. Answers to these questions have important implications for understanding contemporary species and populations, especially as anthropogenic impacts on climate, habitat availability, and population connectivity have the ability to rapidly and drastically alter natural environments.
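The calibration logic described above can be sketched with a strict molecular clock, in which two lineages that split T million years ago accumulate a pairwise distance d = 2 × rate × T. This is a deliberately simplified illustration and not necessarily the dating method applied in this thesis, which may relax the assumption of a constant rate. The function names and all numerical values are hypothetical.

```python
def rate_from_calibration(pairwise_distance, calibration_age_mya):
    """Per-lineage substitution rate (substitutions/site/Myr) implied
    by a calibrated node.

    Assumption: strict clock, so the distance between two taxa is
    shared equally between the two descendant lineages."""
    return pairwise_distance / (2.0 * calibration_age_mya)


def divergence_time_mya(pairwise_distance, rate):
    """Divergence time (Mya) for an uncalibrated pair of lineages,
    given a per-lineage rate obtained elsewhere in the tree."""
    return pairwise_distance / (2.0 * rate)


# Hypothetical values, for illustration only:
# a calibrated split dated at 40 Mya with 0.40 substitutions/site
# between the two descendant clades.
rate = rate_from_calibration(0.40, 40.0)

# A second, undated pair of lineages separated by 0.10 substitutions/site.
undated_split = divergence_time_mya(0.10, rate)
```

In practice a fossil usually provides only a minimum age for a node, so a rate derived in this way is better treated as an upper bound, and dates inferred from it as minimum estimates.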

Asian snakeheads provide an interesting target for phylogenetic analysis for a number of reasons. First, there is an absence of clear systematic information at the species level that has led to speculation about the presence of cryptic taxa. This can in part be addressed by phylogenetic reconstruction. Second, Channa spp. have unusual ecological traits (air-breathing / terrestrial locomotion) that imply that speciation in this group of fishes may not follow the patterns observed for other freshwater fishes that have experienced long periods of isolation in discrete freshwater drainages. Third, members of the genus are commercially very important, and so good systematic knowledge concerning levels of variation within and among taxa has the potential to inform fishery managers and commercial producers. Finally, good fossil evidence is now available that can be used to calibrate a channid molecular phylogeny (Murray 2006; Murray & Thewissen 2008; Roe 1991), allowing biogeographical hypotheses proposed in past studies to be examined explicitly against new evidence. Previous studies have timed divergence of Asian snakeheads from their African counterparts from the early Cretaceous (over 100Mya estimated from mitochondrial DNA divergence) (Li et al. 2006) to as recently as the late Miocene – early Pliocene (8-4Mya estimated from (incomplete) fossil records) (Bohme 2004).

History of the Channidae

Fishes in the Family Channidae are classified into two extant genera, the African Parachanna and Asian Channa. They represent the only family in the sub-order Channoidei, one of 18 sub-orders that together make up the Order Perciformes, the most diversified vertebrate order (Nelson 1994).

The oldest known Perciform fossil dates back to the late Cretaceous, with the fossil record providing good evidence that diversification in this Order occurred from the Paleocene (65Mya) (Patterson 1993). Most extant Perciform families are considered to be more recent, having evolved during the Eocene – Miocene (55 – 5Mya), although with ongoing and substantial species radiations also occurring over the last 5 million years (Arratia et al. 2004).

The first fossil record of a channid appears in the early Eocene Kuldana Formation in Pakistan, where specimens described as belonging to an ancestral (now extinct) genus ‘Eochanna’ provide evidence for an early divergence from other Perciform families. This suggests that “the Channiformes had already emerged as a distinct phylogenetic entity” some 50Mya (Roe 1991). This date was used recently to calibrate a higher order teleost phylogeny (Santini et al. 2009), but is also applicable to lower order chronogram estimations as it provides a lower boundary for the divergence of channids from other Perciformes.

By the mid Eocene (40Mya) fossils that can be identified positively as Channa appear in deposits in Kashmir (Asia), in addition to another extinct channid, an Anchichanna (Murray & Thewissen 2009), suggesting that the Eocene was a period of diversification for the family. Around this period, faunal similarity has been noted between North African and Pakistani freshwater fish species assemblages, indicating likely Eocene exchange of freshwater taxa between these two regions. By the late Eocene (35Mya), the first Parachanna specimens appear in the Egyptian fossil record (Murray 2006), although this genus is likely to have diverged from the Channa lineage some time before this. The presence of Channa and Parachanna fossils provides clear evidence that the two genera had diverged from ancestral channids by at least 40Mya, providing a second fossil calibration point for reconstructing the evolutionary history of the family.

Contemporary channid geographical distribution

The African genus Parachanna is represented by three extant species with ranges restricted to western Africa. Members of the Asian genus Channa have a wide natural distribution, from Iran in the west, across the Indian subcontinent including Sri Lanka, to SE Asia including the Indonesian archipelago west of Huxley’s Line, and the Far East including China and Siberia (Wee 1982) (Figure 2.1).

Figure 2.1 Natural geographical distribution of Channidae genera Parachanna (Africa) and Channa (Asia).

It is not clear exactly when Asian Channa expanded east to assume their modern SE Asian distribution. Their presence in western Europe and central Asia during the Miocene is well documented, but no fossils have been found in east Asia before the Pliocene (>5Mya), leading to the suggestion that Channa may only have populated eastern Asia after the onset of the SE Asian monsoon (Bohme 2004). In comparison with other primary freshwater fish taxa, a number of anomalies can be observed in the ranges of individual Channa species that may indicate a relatively recent large scale shift in species distributions. For example, unlike most obligate freshwater taxa, some members of the genus Channa possess extensive natural ranges that span thousands of kilometers across hundreds of river drainages. This suggests that they have an extraordinary capacity for natural dispersal. In contrast, a number of Channa species have very limited geographical ranges that include one or perhaps only a few river basins, suggesting that dispersal capacity for these species is quite poor. Investigating the phylogenetic relationships among Channa species can help to develop a better understanding of how these patterns may have evolved.

Aims of this chapter

Specifically, the aim of this chapter was to construct gene trees from genetic sequence data obtained from a number of different gene loci that provide independent molecular estimates of the evolutionary relationships among Channa species. Ultimately, these molecular data were used to construct a phylogeny for Channa spp. calibrated in time to address the following specific questions:

1. LEVELS OF DIVERGENCE AMONG SAMPLED POPULATIONS
- What are the levels of divergence observed between different recognised species in the Asian genus Channa, and are the levels consistent with current taxonomic classifications for recognised species?
- What are the levels of intra-specific variation within species currently recognised in the genus Channa?
- Are currently recognised species monophyletic?

2. TEMPORAL SCALE OF SPECIATION EVENTS
- Over what time scales have Channa lineages diverged to form discrete species?
- Can specific divergence times for species be related to known historical climatic or geomorphological changes?
- What is the level of concordance between the timing of events estimated here and earlier biogeographical hypotheses for Channa spp?


Phylogeny of the Channidae

METHODS

Sample collection

Sampling of snakehead fish for phylogenetic reconstruction in the current study aimed to collect a range of species from the genus Channa, and where possible to obtain replicates of individual species across their natural distributions in SE Asia, in order to assess divergence both within and among species in the genus.

Samples used for phylogenetic analyses were collected across the Mekong Basin, primarily from local markets. Where possible at the time of sampling, fish were positively identified to species level by local government fisheries scientists. At the point of collection, fin tissue was abscised from the caudal, pectoral, or dorsal fin and samples were sealed individually in vials of 75% ethanol. Additional Channa samples (fin or muscle tissue) were supplied by collaborators in Vietnam, Malaysia, and India. Plates 2.1-2.5 show five Channa species collected for the current study and their sampling localities; see Table 2.1 for further information on collection.

Two outgroup species were chosen to represent divergence at higher systematic levels. A member of the family Eleotridae (Mogurnda adspersa) was chosen to represent divergence within the Order Perciformes, and at a higher level a member of the Atheriniformes (Melanotaenia splendida) was selected to represent divergence across the Percomorpha. M. splendida belongs to the Smegmamorpha, a group that is currently considered to be relatively closely related to the Channidae (Chen et al. 2003). Outgroup tissue samples were provided by D. Hurwood.

Plate 2.1. C. gachua – The dwarf snakehead, = sampling locations.


Plate 2.2. C. lucius – The splendid snakehead, = sampling locations.

Plate 2.3. C. micropeltes – The giant snakehead, = sampling locations.

Plate 2.4. C. striata - The striped or chevron snakehead, = sampling locations.


Plate 2.5. C. sp “x” (unidentified – but similar to C. marulia - The bullseye snakehead), = sampling locations.

Table 2.1. Sampling details; see Table 3.1 for geographical co-ordinates of sample locations.

| Species | Location and date of collection | Collected by | Plate |
|---|---|---|---|
| C. gachua (Hamilton 1822) | Nam Song R., Mekong Basin. Lao PDR. Aug 2006. | E. Adamson | 1 |
| C. gachua | Songkhram R., Mekong Basin. Thailand. Nov 2005. | E. Adamson | 1 |
| C. gachua | Mun R./Mekong R. confluence, Mekong Basin. Thailand. Nov 2005. | E. Adamson | 1 |
| C. gachua | Buon Ma Thuot, Daklak Prov. Mekong Basin, Vietnam. Feb 2007. | Nguyen Trong Phuc | 1 |
| C. lucius | Kratie, Mekong Basin. Cambodia. Apr 2007. | E. Adamson | 2 |
| C. lucius (Cuvier 1831) | Riau, Sumatra. Indonesia. 2007. | Dr Estu Nugroho | 2 |
| C. micropeltes (Cuvier 1831) | Mun R., Mekong Basin. Thailand. Nov 2005. | E. Adamson | 3 |
| C. micropeltes | Tien Bien District, Mekong Delta. Vietnam. Feb 2007. | E. Adamson | 3 |
| C. diplogramma (Day 1865) | Meenachil R., Kaduthuruthy, . India. 2008. | - | - |
| C. striata (Bloch 1793) | Kanakkankadavu, Chalakkudy R. Basin, Kerala, India. 2008. | - | 4 |
| C. striata | Sayaburi, Mekong Basin. Lao PDR. Aug 2006. | E. Adamson | 4 |
| C. striata | Sayaburi, Mekong Basin. Lao PDR. Aug 2006. | E. Adamson | 4 |
| C. striata | Nam Song R., Mekong Basin. Lao PDR. Aug 2006. | E. Adamson | 4 |
| C. striata | Vientiane, Mekong River. Lao PDR. Aug 2006. | E. Adamson | 4 |
| C. striata | Songkhram R., Mekong Basin. Thailand. Nov 2005. | E. Adamson | 4 |
| C. striata | Chi R., Mekong Basin. Thailand. Nov 2005. | E. Adamson | 4 |
| C. striata | Sekong R., Mekong Basin. Cambodia. Apr 2007. | E. Adamson | 4 |
| C. striata | Gai Lai Province, Kontum, Mekong Basin. Vietnam. Feb 2007. | Phuc Dinh Phan | 4 |
| C. striata | Tonle Sap Great Lake, Mekong Basin. Cambodia. Apr 2007. | E. Adamson | 4 |
| C. striata | Vinh Thuan Province, Mekong Delta. Vietnam. Feb 2007. | E. Adamson | 4 |
| C. striata | Saraburi Province, Chao Phraya R. Basin. Thailand. 2007. | N. Sukmasavin | 4 |
| C. striata | Tanjung Karang. Malaysia. 2007. | Dr S Bhassu | 4 |
| C. striata | Lampung, Sumatra. 2007. | Dr E Nugroho | 4 |
| C. cf. maculata (Lacepède 1801) (sensu GENBANK) | Hanoi, Red R. Basin. Vietnam. 2007. | Thuy Nguyen Gia | - |
| C. sp “x” | Sekong R., Mekong Basin. Cambodia. April 2007. | E. Adamson | 5 |
| C. sp “x” | Sekong R., Mekong Basin. Cambodia. April 2007. | E. Adamson | 4 |
| Mogurnda adspersa (Castelnau, 1878) | Barron R., QLD. Australia | Dr D. Hurwood | - |
| Melanotaenia splendida (Peters, 1866) | Johnstone R., QLD. Australia | Dr D. Hurwood | - |


DNA Marker Selection

Four DNA fragments were targeted for phylogenetic analysis in order to provide a range of DNA markers representing faster and slower evolving regions of the genome. The regions were chosen with the aim of generating a robust multilocus data set that could resolve divergence at both recent and ancient evolutionary time scales.

Two regions of the maternally inherited mitochondrial (mtDNA) genome were amplified, corresponding to partial fragments of the 16S ribosomal RNA gene and the Cytochrome b gene, henceforth referred to as “16S” and “Cyt b”. The 16S gene represents a slowly evolving, conserved region of mtDNA that typically exhibits levels of variation useful for answering phylogenetic questions among distantly related species (Meyer 1994). This region is not translated into protein; instead, the structure of the transcribed RNA product forms a functional unit involved in protein synthesis. Because 16S is therefore not a “coding region” organised into triplet codons corresponding with amino acids, mutations in this region can include insertion / deletion events (indels) that change sequence length, as well as substitutions. In contrast, Cyt b is a moderately fast evolving region of mtDNA that commonly shows intra-specific diversity, but this fragment also has utility for resolving higher level relationships (Farias et al. 2001; Johns & Avise 1998; Zardoya & Meyer 1996). The gene codes for a membrane protein involved in the respiration chain and energy transport, and as a coding region its evolution is constrained by reading frame, resulting in conserved sequence lengths. Both 16S and Cyt b have been used commonly to infer phylogenies within and among genera, for example, 16S: (Li et al. 2008; Smith & Wheeler 2004; Smith & Wheeler 2006); Cyt b: (Farias et al. 2001; Planes et al. 2001; Slechtova et al. 2006; Zaki et al. 2008).

Two regions of the nuclear genome (nDNA) were selected to provide independent (unlinked) co-inherited markers representing divergence in the nuclear genome. The first nDNA region chosen was a partial fragment of the single-copy Recombination Activating Gene 1 (RAG1). RAG1 is a highly conserved gene, with a substitution rate up to 12 times slower than mtDNA Cyt b (Quenouille et al. 2004). This region of the nuclear genome is popular for reconstructing teleost phylogenies (for example Holcroft 2004; Lopez et al. 2004; Rícan et al. 2008; Rüber et al. 2004).

The second nuclear region selected was a faster evolving, non-coding region: the first intron (RP1) within the S7 Ribosomal Protein Gene. This region is also a popular nDNA marker used to resolve teleost phylogenetic relationships (for examples see Lavoue et al. 2003; Near & Cheng 2008), and as a non-coding single copy region without constraint on mutations in length or substitution, this fragment typically exhibits much higher levels of variation than the conserved coding RAG1 locus.

DNA extraction, amplification, cloning and sequencing of PCR products

DNA was extracted from fin tissue following a standard salt extraction protocol (Miller et al. 1988). See Appendix 1 for further details on extraction method. Four DNA fragments were amplified, representing three partial gene regions and a single intron. Full details of primer sequences used are listed in Table 2.2. PCR conditions for each fragment are detailed in Table 2.3. Each PCR procedure included a negative control (no DNA template). After successful PCR amplification of target fragments, amplified products were purified before the template was sequenced in both directions on a capillary sequencer. See Appendix 2 for full details.

Sequence data were edited manually to confirm all base pair assignments from chromatograms using BIOEDIT software (Hall 1999). After sequencing, some individuals were found to be heterozygotes at the RP1 locus. Of these individuals, a number had alleles that differed at multiple point substitutions or differed in allele length, making it impossible to distinguish individual alleles from the heterozygote sequence read. In these cases, an attempt was made to isolate individual alleles by cloning, followed by re-sequencing of the cloned PCR product to obtain single allele sequence reads. For details of cloning protocols see Appendix 3.

Table 2.2. Primers used to amplify DNA regions used for Channa spp. phylogenetic analysis.

| Primer | Sequence | DNA region | Reference |
|---|---|---|---|
| 16sbr-L | 5’-CGC CTG TTT ATC AAA AAC AT-3’ | mtDNA 16S | Palumbi et al. (1991) |
| 16sbr-H | 5’-CCG GTC TGA ACT CAGA TCA CGT-3’ | mtDNA 16S | Palumbi et al. (1991) |
| GLUDG-L | 5’-TGA CTT GAA RAA CCA YCG TTG-3’ | mtDNA Glut & Cyt b | Palumbi et al. (1991) |
| CB3-H | 5’-GGC AAA GAG AAA RTA TCA TTC-3’ | mtDNA Glut & Cyt b | Palumbi et al. (1991) |
| RAG1F1 | 5’-CTG AGC TGC AGT CAG TAC CAT AAG ATG T-3’ | nDNA RAG1 | Lopez et al. (2004) |
| RAG1R1 | 5’-CTG AGT CCT TGT GAG CTT CCA TRA AYT T-3’ | nDNA RAG1 | Lopez et al. (2004) |
| S7RPEX1F | 5’-TGG CCT CTT CCT TGG CCG TC-3’ | nDNA RP1 | Chow & Hazama (1998) |
| S7RPEX2R | 5’-AAC TCG TCT GGC TTT TCG CC-3’ | nDNA RP1 | Chow & Hazama (1998) |


Table 2.3. PCR conditions used to amplify target DNA fragments.

Reaction mix (25 µL volume), listed identically for all four DNA regions: 50-200 ng genomic DNA; 0.2 µM of each primer; 1 µL 10 mM dNTPs (Roche™); 2.5 µL 10X PCR Reaction Buffer (Roche™); 1 µL 25 mM MgCl2 (Fisher™); 0.5 Units of Taq DNA Polymerase (Roche™).

| DNA region | Cycling conditions | Product size |
|---|---|---|
| 16S | 94°C - 2 min; 35 × (94°C - 15 s, 50°C - 15 s, 72°C - 30 s); 72°C - 1 min; 15°C - hold | 569-576 bps |
| Glut & Cyt b | 94°C - 2 min; 35 × (94°C - 15 s, 50°C - 15 s, 72°C - 30 s); 72°C - 1 min; 15°C - hold | 832-834 bps (Cyt b: 809 bps) |
| RAG1 | Annealing temp varied for individual taxon. 94°C - 2 min; 40 × (94°C - 30 s, 50-57°C - 30 s, 72°C - 40 s); 72°C - 5 min; 15°C - hold | 1484 bps |
| RP1 | Following Chow & Hazama (1998). 95°C - 1 min; 35 × (95°C - 30 s, 60°C - 1 min, 72°C - 2 min); 72°C - 10 min; 15°C - hold | 626-808 bps |

Additional data

As the number of species obtained in the field was small, and in two cases samples were unidentified at the species level, additional sequence information for four species was acquired from GENBANK™ (Benson et al. 1999). These samples were included firstly to help classify unidentified samples collected in the study, secondly to provide nodes that could be used as calibration points in chronogram estimation, thirdly to help disperse homoplasy across the tree (Heath et al. 2008) and to overcome possible long branch attraction during phylogenetic tree estimation (Graybeal 1998), and finally to provide a broader sample of taxa to increase the accuracy of phylogenetic reconstruction, especially as the number of characters (base pairs) used in the analysis was substantial (Hillis et al. 2003). Details of additional sequences added to the dataset are presented in Table 2.4. GENBANK™ sequences covering all four regions used for phylogenetic estimation were not available for any of the four additional taxa chosen, but the data that were available were still included, as missing data in Bayesian phylogenetic reconstruction are reportedly not a problem when the overall number of characters assessed is large (Wiens & Moen 2008; although see Lemmon et al. 2009).

Table 2.4. Details of GENBANK™ sequences used in the Channidae phylogenetic analysis.

| Species | DNA region | GENBANK™ Accession Number | Reference |
|---|---|---|---|
| Parachanna obscura (Günther, 1861) | 16S | AY763726 | Rüber et al. (2006) |
| Parachanna obscura | RAG1 | AY763788 | Rüber et al. (2006) |
| Parachanna obscura | Cyt b | AY763772 | Rüber et al. (2006) |
| Channa bleheri (Hamilton 1822) | 16S | AY763724 | Rüber et al. (2006) |
| Channa bleheri | RAG1 | AY763786 | Rüber et al. (2006) |
| Channa bleheri | Cyt b | AY763770 | Rüber et al. (2006) |
| Channa marulia (Hamilton 1822) | 16S | AY763725 | Rüber et al. (2006) |
| Channa marulia | RAG1 | AY763787 | Rüber et al. (2006) |
| Channa marulia | Cyt b | AY763771 | Rüber et al. (2006) |
| Channa maculata | Cyt b | AF479271 | Direct submission, Bai et al. (2002) |

Sequence alignment and gap coding

Coding regions (Cyt b and RAG1) were homologous in length and could be easily aligned by eye. The RNA gene fragment (16S) and the non-coding RP1 intron varied in length among taxa, and so more complex multiple alignment methods were investigated to provide robust alignments with gap inference for these fragments. M-COFFEE (Wallace et al. 2006), a meta-method that combines the outputs of several multiple alignment methods to estimate a consensus alignment, was used to align the fragments. Previous authors (e.g., Pei 2008) have found that M-COFFEE consistently outperforms all other alignment methods. Methods trialed here included CLUSTALW (Larkin et al. 2007), MAFFT (Katoh et al. 2005), MUSCLE (Edgar 2004a; Edgar 2004b), T-COFFEE (Notredame et al. 2000), and R-COFFEE (Moretti et al. 2008), an alignment method designed especially for RNA sequences that here increased the overall number of indels inferred in 16S by 550%. Compared with M-COFFEE, all other methods tended to infer many more gaps and fewer parsimoniously informative gaps, and produced alignments that were overly complicated with large regions of no homology. Alignments were performed on the M-COFFEE online server (Moretti et al. 2007; Wallace et al. 2006). All alignments were further checked and adjusted by eye after automated multiple alignments had been performed.

After multiple alignment, parsimoniously informative indels (gaps common to two or more individuals) were coded as binary presence/absence data following the “Simple Indel Coding” method of Simmons & Ochoterena (2000), except in a single region of RP1 in C. gachua, where a simple tandem sequence repeat (microsatellite) was coded to reflect a stepwise model of mutation. Coding indels as binary characters allows phylogenetic estimation to incorporate gap information that would otherwise be overlooked, as raw gaps (‘-’) in sequence alignments are treated as “missing data” by programs that estimate phylogenies. Harnessing the parsimoniously informative signal from indel data by treating gaps as binary characters leads to improved accuracy in phylogenetic reconstruction (Dwivedi & Gadagkar 2009).
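The gap coding step described above can be illustrated with a minimal Python sketch. It is not the procedure or software used in this study: the function names and the toy alignment are hypothetical, leading and trailing gaps are treated like any other gap (an assumption; they are often scored as missing data instead), and "informative" follows the definition given in the text (a gap shared by two or more sequences).

```python
def find_gaps(seq):
    """Return inclusive (start, end) index pairs for each run of '-' in seq."""
    gaps, start = [], None
    for i, ch in enumerate(seq):
        if ch == "-" and start is None:
            start = i
        elif ch != "-" and start is not None:
            gaps.append((start, i - 1))
            start = None
    if start is not None:
        gaps.append((start, len(seq) - 1))
    return gaps


def simple_indel_coding(alignment, informative_only=True):
    """Code each distinct gap (unique start and end) as one binary character.

    alignment: dict mapping sequence name -> aligned sequence of equal length.
    Returns (characters, matrix): characters is a sorted list of gap
    coordinates; matrix maps name -> string of '1' (gap present),
    '0' (gap absent) or '?' (gap hidden inside a longer gap).
    """
    per_seq = {name: find_gaps(seq) for name, seq in alignment.items()}
    characters = sorted({g for gaps in per_seq.values() for g in gaps})
    if informative_only:
        # Keep only gaps shared by two or more sequences.
        characters = [g for g in characters
                      if sum(g in gaps for gaps in per_seq.values()) >= 2]
    matrix = {}
    for name, gaps in per_seq.items():
        states = []
        for start, end in characters:
            if (start, end) in gaps:
                states.append("1")
            elif any(gs <= start and ge >= end for gs, ge in gaps):
                # A longer gap spans this one, so its state cannot be scored.
                states.append("?")
            else:
                states.append("0")
        matrix[name] = "".join(states)
    return characters, matrix


# Hypothetical toy alignment, not data from this study.
toy = {
    "taxonA": "ACGT--ACGT",
    "taxonB": "ACGT--ACGT",
    "taxonC": "ACGTTTACGT",
    "taxonD": "AC------GT",
}
characters, matrix = simple_indel_coding(toy)
```

In the toy alignment only the two-base gap shared by taxonA and taxonB is retained as a character; taxonD carries a longer gap that spans it, so its state is scored as unknown rather than absent.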

Genetic diversity and gene tree reconstruction

Tamura and Nei’s pair-wise genetic distance (Tamura & Nei 1993), referred to here as p-distance, was calculated individually for each locus to examine divergence within and among species using DAMBE software (Xia & Xie 2001). p-distance is appropriate for comparing both intra-specific and interspecific levels of divergence up to the family level (Kartavtsev & Lee 2006). Tamura and Nei’s (1993) mutational model was chosen over simpler models of nucleotide substitution, e.g., the Kimura 2-Parameter model (Kimura 1980), because it accounts for variable substitution rates among sites, unequal nucleotide frequencies, as well as different frequencies of transition and transversion events (Tamura & Nei 1993). These issues are more important when dealing with variation at the interspecific level, where simpler methods tend to underestimate real divergence (Tamura & Nei 1993).
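To make the distance measure concrete, the following Python sketch computes the Tamura and Nei (1993) distance for one pair of aligned sequences. It is an illustration only and is not DAMBE's implementation: it omits the gamma correction for rate variation among sites, estimates base frequencies from the two sequences being compared (an assumption), and the function name and example sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def tn93_distance(seq1, seq2):
    """Tamura-Nei (1993) distance, without gamma correction.

    Sites with gaps or ambiguity codes in either sequence are skipped.
    All four bases must be present across the two sequences, otherwise
    the formula divides by zero.
    """
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    n = len(pairs)

    # Base frequencies pooled over both sequences.
    freq = {base: 0.0 for base in "ACGT"}
    for a, b in pairs:
        freq[a] += 1
        freq[b] += 1
    g_a, g_c, g_g, g_t = (freq[base] / (2 * n) for base in "ACGT")
    g_r, g_y = g_a + g_g, g_c + g_t

    # P1: A<->G transitions, P2: C<->T transitions, Q: transversions.
    p1 = sum(1 for a, b in pairs if {a, b} == PURINES) / n
    p2 = sum(1 for a, b in pairs if {a, b} == PYRIMIDINES) / n
    q = sum(1 for a, b in pairs if (a in PURINES) != (b in PURINES)) / n

    k1 = 2 * g_a * g_g / g_r
    k2 = 2 * g_t * g_c / g_y
    k3 = 2 * (g_r * g_y - g_a * g_g * g_y / g_r - g_t * g_c * g_r / g_y)
    w1 = 1 - p1 / k1 - q / (2 * g_r)
    w2 = 1 - p2 / k2 - q / (2 * g_y)
    w3 = 1 - q / (2 * g_r * g_y)
    return -k1 * math.log(w1) - k2 * math.log(w2) - k3 * math.log(w3)


# Hypothetical sequences: one transition (site 1) and one transversion (site 20).
d = tn93_distance("ACGTACGTACGTACGTACGT", "GCGTACGTACGTACGTACGA")
```

Because the model corrects for multiple substitutions at the same site, the estimate for these two sequences is larger than the uncorrected proportion of differing sites (2 of 20).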


For each DNA locus, saturation was investigated by plotting the number of transition and transversion events observed between pairs of sequences against p-distance. As sequences diverge, the rate at which identical substitutions start to arise through multiple independent mutation events approaches the rate at which new differences arise, leading to an observed “saturation” of nucleotide substitutions in the DNA region (Kocher & Carleton 1997). As transition events are generally more frequent than transversions, saturation is likely to occur first in transitional substitutions, and for coding regions third base positions are likely to reach saturation before first and second positions, as many third base substitutions are synonymous and therefore escape selection for protein structure (Kocher & Carleton 1997). When there is a high occurrence of multiple substitutions (saturation), the true evolutionary rate, and hence the true level of divergence between sequences, is hidden (Kocher & Carleton 1997). Saturation plots were drawn using DAMBE (Xia & Xie 2001).
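The quantities plotted in a saturation plot can be sketched in Python as follows. This is an illustrative helper, not DAMBE's routine; the function name is hypothetical, and the codon option assumes that the alignment begins at a first codon position.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq1, seq2, codon_position=None):
    """Proportion of transitions and transversions between two aligned sequences.

    codon_position: None for all sites, or 1, 2 or 3 to restrict the count
    to that codon position (assumes the alignment starts at position 1).
    Returns (ts_proportion, tv_proportion) over the sites compared.
    """
    ts = tv = sites = 0
    for i, (a, b) in enumerate(zip(seq1.upper(), seq2.upper())):
        if codon_position is not None and i % 3 != codon_position - 1:
            continue
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        sites += 1
        if a == b:
            continue
        if {a, b} == PURINES or {a, b} == PYRIMIDINES:
            ts += 1  # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            tv += 1  # purine <-> pyrimidine
    if sites == 0:
        return 0.0, 0.0
    return ts / sites, tv / sites
```

Calling the function for every pair of sequences, and plotting each proportion against the corresponding pair-wise distance, gives one point per pair; a transition series that levels off while distance keeps increasing is the plateau interpreted as saturation.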

Of the methods currently available for inferring phylogenetic relationships among genes, probabilistic methods, namely Bayesian and Maximum Likelihood approaches, are generally considered to be superior for tree estimation when compared with Parsimony / Neighbour Joining methods, as they incorporate information on the level of divergence under an evolutionary model and permit tree topologies to be recovered that are not strictly bifurcating (Hall 2008). These methods have also been shown to be more accurate for phylogenetic inference when gap information is included (Dwivedi & Gadagkar 2009). Both methods were used here to construct gene trees for all four loci independently. This was firstly to investigate the resolution provided by each individual gene fragment, and secondly to identify incongruence between gene trees, which can arise if loci have independent evolutionary histories (Degnan & Rosenberg 2009).

Bayesian inference of gene relationships was implemented using the program MRBAYES (Ronquist & Huelsenbeck 2003). A number of separate analyses were run including and excluding indels, and with different levels of partitioning among codons for Cyt b and RAG1. After initial short runs to confirm adequate settings, analyses used three runs of four chains, with all rates and partitions unlinked and allowed to vary across partitions. The ‘4by4’ nucleotide model, an nst of 6, and an invariable sites model combined with a gamma model were applied to nucleotide data. States in gap (binary) data partitions were allowed to vary according to the beta distribution. Final analyses were run for 12,000,000 generations, with trees sampled every 1,000 generations. The first 25% of samples were discarded as burn in. At the end of runs, plots of generation versus log probability of the data were examined to confirm adequate sampling from the posterior probability distribution. In addition, the potential scale reduction factors for each parameter were checked to confirm convergence. Results were summarised as consensus trees with posterior probabilities for clades. Majority rule consensus trees have recently been advocated as the best way for presenting agreement among trees, although they should not strictly be interpreted as phylogenies (Holder et al. 2008).
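The burn-in and convergence checks mentioned above can be illustrated with a short Python sketch. MRBAYES reports potential scale reduction factors itself, so this is only a sketch of the general Gelman-Rubin idea under simplifying assumptions; the function names are hypothetical and the exact formula used by MRBAYES may differ.

```python
from statistics import mean, variance


def discard_burn_in(samples, fraction=0.25):
    """Drop the first `fraction` of a list of sampled values."""
    return samples[int(len(samples) * fraction):]


def potential_scale_reduction(chains):
    """Gelman-Rubin potential scale reduction factor for one parameter.

    chains: list of equal-length lists of post-burn-in samples, one list
    per independent run (at least two runs, at least two samples each).
    Values close to 1 suggest the runs sample the same distribution.
    """
    n = len(chains[0])
    chain_means = [mean(chain) for chain in chains]
    within = mean(variance(chain) for chain in chains)   # W
    between = n * variance(chain_means)                  # B
    pooled = (n - 1) / n * within + between / n          # pooled variance estimate
    return (pooled / within) ** 0.5


# Hypothetical values: two runs that agree and two that do not.
agree = potential_scale_reduction([[1, 2, 3, 4], [1, 2, 3, 4]])
disagree = potential_scale_reduction([[1, 2, 3, 4], [11, 12, 13, 14]])
```

When independent runs have settled on the same region of parameter space, the between-run variance is small relative to the within-run variance and the factor is near 1; runs stuck in different regions give values well above 1.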

Maximum Likelihood analysis was implemented in RAxML (Stamatakis 2006) via the online CIPRES PORTAL (www.phylo.org). Again, a number of initial runs were executed under different parameters, with and without codon partitioning for the coding genes. Gap characters were not included. Final analyses were conducted with the GTR Gamma Model with 10,000 multi-parametric bootstrap replicates. Majority rule 50% consensus trees for the RAxML output were constructed in the program MESQUITE (Maddison & Maddison 2009).

Species tree inference and estimation of divergence times

Bayesian and ML estimation of the Channa species phylogeny was performed for two sets of combined DNA fragments. The first approach concatenated all four loci following the “simultaneous analysis” approach (Nixon & Carpenter 1996). In cases where individuals were represented by more than a single nuclear allele, one allele was chosen at random to represent that individual in the concatenated analysis. The second approach discarded loci that were poorly resolved in the independent gene tree reconstruction, to cut down possible “noise” in the data originating from conflicting, poorly resolved relationships at the interspecific level, loosely following the “prior agreement” approach outlined by Bull et al. (1993). In all multi-locus analyses, loci were partitioned, with rates and models of evolution allowed to vary across partitions. Analyses were performed in MRBAYES and RAxML following the methodology described above for independent gene tree estimation.
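The concatenation step can be sketched in Python as below. This is a hypothetical illustration rather than the workflow used here: the function name, the data layout, and the use of ‘?’ to pad individuals that lack a locus are all assumptions.

```python
import random


def concatenate_loci(loci, lengths, rng=None):
    """Build one concatenated sequence per individual.

    loci: dict locus name -> dict individual -> list of aligned alleles.
    lengths: dict locus name -> aligned length of that locus.
    Individuals lacking a locus are padded with '?' (missing data).
    Returns (supermatrix, partitions); partitions holds the 1-based
    (start, end) columns of each locus so that models can be partitioned.
    """
    rng = rng or random.Random()
    individuals = sorted({ind for alleles in loci.values() for ind in alleles})
    supermatrix = {ind: "" for ind in individuals}
    partitions, start = {}, 1
    for locus, by_individual in loci.items():
        length = lengths[locus]
        for ind in individuals:
            alleles = by_individual.get(ind)
            # One allele is drawn at random when an individual has several.
            seq = rng.choice(alleles) if alleles else "?" * length
            supermatrix[ind] += seq
        partitions[locus] = (start, start + length - 1)
        start += length
    return supermatrix, partitions


# Hypothetical toy data: individual "a" is heterozygous at the second locus,
# individual "b" has no sequence for it.
toy_loci = {
    "16S": {"a": ["ACGT"], "b": ["ACGA"]},
    "RP1": {"a": ["TT", "TC"]},
}
matrix, parts = concatenate_loci(toy_loci, {"16S": 4, "RP1": 2}, random.Random(1))
```

Returning the column range of each locus alongside the concatenated sequences keeps the information needed to let rates and models vary across partitions.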

Bayesian estimation of divergence times was implemented in BEAST (Drummond & Rambaut 2007). Both data sets used in the species tree reconstruction were analysed separately. Clades were defined for all major groups of taxa following the results of the multi-locus species tree analysis. Intra-specific clades were also defined where there was clear evidence for geographical isolation across the species range (i.e., Sumatra versus mainland Asia for C. lucius and C. striata) and also where deep intra-specific divergence had already been determined from independent gene tree topologies. Data files were compiled with BEAUti software (part of the BEAST package) using the following parameters: ‘GTR’ (General Time Reversible) nucleotide substitution model incorporating Gamma + invariant sites heterogeneity, a relaxed molecular clock with uncorrelated lognormal distribution (Drummond et al. 2006), and randomly generated starting trees with tree prior set to ‘Speciation – Yule Process’.

Two fossil dates were entered as priors to provide a timeframe for other node estimates, with time values assigned lognormal distributions following advice from Ho (2007) and Ho and Phillips (2009). Assigning a non-uniform probability distribution to calibration points (soft bounds) allows sequence data to correct for poor calibrations if conflict between sequence data and calibration priors arises (Yang & Rannala 2006). The node representing divergence between Channa and Parachanna was constrained to a minimum of 40 Mya (offset = 40, mean = 1.5, SD = 1), representing the first appearance of distinct Channa spp. in the fossil record around this time (Murray 2006). The prior for a second node representing the divergence of the Channidae from other Perciformes was set at 48 Mya (offset = 48, mean = 1.71, SD = 1.4) (Roe 1991; Santini et al. 2009). For the first run, the analysis incorporated 10,000,000 generations with parameters logged every 1,000th generation. At the end of this run, output was examined and the operators adjusted following the program log to increase Effective Sample Sizes (ESSs) in future runs (for the full data set: branch rates window size increased to 2.0, scale factor for the clock rate parameter set to 0.7775; for the reduced data set: branch rates window size increased to 4.0, scale factor for the clock rate parameter set to 0.7722). The analysis was then re-run three times and the log files combined in the program LOGCOMBINERv1.5.1 (part of the BEAST package). The program TRACERv1.4.1 (Rambaut & Drummond 2007) was used to assess the performance of the analysis by checking the convergence of each parameter (“trace”) and viewing ESS scores, and to read divergence times for each clade, including upper and lower 95% highest posterior density bounds (credible intervals).
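The shape of an offset lognormal calibration prior such as those above can be explored with a short Python sketch. It assumes that the stated mean and SD are parameters of the underlying normal distribution (log space), which the text does not specify; the function name is hypothetical and this is not BEAST code.

```python
import math
from statistics import NormalDist


def offset_lognormal_summary(offset, log_mean, log_sd, lower=0.025, upper=0.975):
    """Summarise an offset lognormal prior on node age (Mya).

    Assumes log_mean and log_sd describe the underlying normal
    distribution; the offset is the hard minimum age.
    """
    standard_normal = NormalDist()

    def quantile(p):
        return offset + math.exp(log_mean + log_sd * standard_normal.inv_cdf(p))

    return {
        "minimum": offset,
        "median": quantile(0.5),
        "lower": quantile(lower),
        "upper": quantile(upper),
    }


# Values quoted in the text for the two calibrated nodes.
channa_parachanna = offset_lognormal_summary(40, 1.5, 1.0)
channidae_perciformes = offset_lognormal_summary(48, 1.71, 1.4)
```

Under this assumption the median of each prior lies exp(mean) million years above the offset, and the long upper tail is what allows the sequence data to pull a node older than the fossil minimum.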


RESULTS

DNA sequences from a total of 32 individuals were used in the phylogenetic analysis, including two outgroup taxa (Mogurnda adspersa and Melanotaenia splendida), one African Parachanna species (P. obscura), and 29 Asian Channa representing nine putative species. Six Channa species were represented by more than a single individual (C. gachua: n = 4, C. lucius: n = 2, C. maculata: n = 2, C. micropeltes: n = 2, C. striata: n = 14, C. unknown: n = 2).

Channa genetic variation and phylogenetic analyses based on mitochondrial DNA

Two mtDNA regions encompassing partial sequences of two genes (16S and Cyt b) were sequenced for taxa in Table 2.1. Additional samples were included from GENBANK™ as outlined in Table 2.4.

16S RNA

A total of 30 16S sequences from 11 putative species were used in the phylogenetic analysis. Ten sequences were incomplete, with 3-4% missing data for these sequences. Amplified fragment length varied among taxa but length was consistent within species, with the shortest sequences observed in C. striata (569 bps) and the longest sequence observed in M. splendida (576 bps). Sequences were aligned to a total length of 582bps that included 177 variable sites and 11 indel regions. One hundred and twenty-three sites were variable among Channa sequences, and only two indel regions were exclusive to the Channidae (Parachanna and Channa spp). Gap coding of parsimoniously informative indels produced eight binary characters.

Within Channa spp the maximum p-distance was 0.139 (between C. bleheri and C. unknown, and C. gachua and C. unknown), rising to a maximum divergence of 0.145 between Channa (C. bleheri) and P. obscura sequences. Figure 2.2c illustrates the number of transitions and transversions identified in pair-wise 16S sequence comparisons. As can be seen from the ts curve, as p-distance between sequences increases above 0.14, the frequency of transitions begins to plateau, indicating that saturation is likely to be masking divergence between outgroup and ingroup taxa after this point.


[Figure 2.2: four saturation plots, (a) Cyt b, (b) RAG1, (c) 16S and (d) RP1. Vertical axes: transitions & transversions; horizontal axes: pair-wise genetic distance (Tamura & Nei, 1993). Series: ts and tv, split by codon position (1st, 2nd, 3rd, all) in panels (a) and (b).]

Figure 2.2. Saturation plots for DNA fragments. Transitions (ts) and transversions (tv) plotted against pair-wise genetic distance (Tamura and Nei, 1993). Protein coding regions, (a) mtDNA Cyt b and (b) nDNA RAG1, show mutation frequency split by codon position (1st, 2nd, and 3rd), with data summarised in curves. For non-protein coding regions, (c) mtDNA 16S RNA gene and (d) nDNA RP1 intron, scatter and curves are presented.

Bayesian analyses, including and excluding gaps, and ML analyses failed to resolve relationships among taxa based solely on 16S data, except at some of the shallowest nodes (see Figure 2.3). Topology of the 16S phylogeny did support sister relationships between C. diplogramma and C. micropeltes, C. marulia and C. sp “x”, and C. bleheri and C. gachua (Bayesian and ML bootstrap clade support values of 1.00/97, 0.93/99, and 1.00/96 respectively). The data also indicated divergence between three C. striata lineages: the first lineage was represented by the single Indian sample, the second by samples across mainland SE Asia and Sumatra, and the third was restricted to the mid-to-upper Mekong River Basin, where it occurs in sympatry with the second clade at one site (Sayaburi, North Western Lao PDR). Henceforth, these C. striata clades will be referred to as West Asia (WA), East Asia (EA) and Middle Mekong (MM). 16S data alone, however, were insufficient to resolve relationships between the three clades, or to support plausible phylogenetic relationships at deeper nodes in the gene tree.


[Figure 2.3: 16S phylogram image; tree topology is not recoverable from the extracted text. Tips are labelled by taxon and sampling locality, nodes by posterior probability/bootstrap support, and C. striata tips by clade (WA, MM, EA). Scale bar: 0.1.]

Figure 2.3. Bayesian consensus phylogram for 16S RNA mtDNA data. Numbers indicate Bayesian posterior probabilities/Maximum likelihood percent bootstrap support. Coloured bars indicate Channa spp. groups, black bars indicate C. striata clades: West Asia (WA), Middle Mekong (MM), East Asia (EA). Note the large basal polytomy, representing poor resolution of phylogenetic relationships at intermediate and deep nodes. Note also that the outgroup taxa M. adspersa and P. obscura occupy internal positions in this tree, suggesting that 16S is not a suitable DNA marker for resolving relationships between the Perciformes included in this study.


Cytochrome b

Sequence data were obtained for the first 809 bases of the mtDNA Cyt b gene for 32 individuals comprising Melanotaenia, Mogurnda, and Parachanna outgroups as well as all nine putative Channa taxa. Six sequences had missing data of between 0.6 and 27.5% of total length (average for all six was 10%).

A total of 361 variable Cyt b sites were identified. Within Channa spp. there were 313 variable sites, with a maximum p-distance between taxa of 0.246 between C. striata and C. micropeltes. Figure 2.2 shows the frequency of mutations in relation to p-distance for Cyt b. The frequency of third base transitions clearly plateaus as p-distance approaches 0.25, indicating that saturation has probably occurred for transitions in the 3rd codon position. To explore the influence of this saturation / homoplasy on phylogenetic signal, phylogenies were constructed for Cyt b under a range of data partitioning options. In all cases, models of evolution that did not partition bases among codon position recovered more resolved and better supported gene tree topologies. Figure 2.4 shows the relationships among taxa resolved in Cyt b phylogenetic analyses.

The three sister relationships established in the 16S analyses were also supported by relationships resolved in the Cyt b phylogram. The Cyt b gene tree, however, also recovered relationships at deeper nodes within the Channidae, as well as at some shallow nodes that were unresolved in the 16S analysis. The three C. striata clades (WA, EA, and MM) can be clearly identified in Figure 2.4, and the relationship between them is now resolved, with WA grouping with MM to form a sister group to the EA haplotypes. Relationships among C. gachua samples also show more resolution, with samples from Lao PDR and Thailand forming a monophyletic group to the exclusion of C. gachua from the Vietnamese Central Highlands.


Chapter 2.

Figure 2.4. Cyt b Bayesian consensus phylogram. Numbers indicate Bayesian posterior probabilities/maximum likelihood percent bootstrap support. Coloured bars (right hand side) indicate Channa spp. groups; black bars indicate C. striata clades: West Asia (WA), Middle Mekong (MM) and East Asia (EA).


Channa genetic variation and phylogenetic analysis based on nuclear DNA

RAG1 gene

Initially, a 1484 bp region of the RAG1 gene was amplified for the taxa studied here. However, problems with PCR results (multiple products) and poor sequence read lengths meant that fragments were reduced to a 780 bp region located in the middle of the gene. One shortened sequence remained incomplete, with 19% missing data. Sequences generated in this study were aligned with three GenBank RAG1 channid sequences to create a data set with the three outgroup species (Melanotaenia, Mogurnda, and Parachanna) and eighteen Channa taxa representing the nine putative Channa spp.

Across all taxa 231 base positions were variable, with 95 point mutations identified among Channa spp. Maximum p-distance among Channa was 0.069, found between C. gachua and C. lucius. The saturation plot (Figure 2.2b) showed some evidence that third base transitions may be approaching saturation at the highest divergence levels observed between ingroup and outgroup taxa (above 0.2). A range of phylogenetic analyses employing different partitioning among codon positions were undertaken to investigate the role of third base mutations in promoting or masking phylogenetic signal. Bayesian analysis with no codon partitioning resulted in better resolution of tree topologies (fewer polytomies), and better support for individual nodes, than models of evolution examining individual codon positions or the 3rd base separately. In the ML method, partitioning had no effect on consensus topology, although slightly higher bootstrap values were observed for nodes in partitioned analyses.
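The quantities plotted in a saturation plot (transitions and transversions per pair-wise comparison, optionally restricted to one codon position) can be counted as in the following minimal Python sketch. It assumes that the alignment begins at the first codon position and that only unambiguous A, C, G and T are scored; the function name and the toy sequences are hypothetical and do not come from this study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def ts_tv_counts(seq_a, seq_b, codon_position=None):
    """Return (transitions, transversions) between two aligned sequences.

    codon_position: None to score every site, or 1, 2 or 3 to score only
    that codon position (assumes column 0 is a first codon position).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    for i, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        if codon_position is not None and i % 3 != codon_position - 1:
            continue
        if a == b or a not in BASES or b not in BASES:
            continue
        # A transition stays within purines or within pyrimidines.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


# Hypothetical toy coding sequences (three codons each).
ref = "ATGGCCAAA"
ts_only = "ATGGCTAAG"   # C>T and A>G, both at third codon positions
tv_only = "ATGGCAAAT"   # C>A and A>T, both at third codon positions
print(ts_tv_counts(ref, ts_only, codon_position=3))
print(ts_tv_counts(ref, tv_only, codon_position=3))
```

Plotting these counts against pair-wise p-distance for all sequence pairs would give curves of the kind summarised in Figure 2.2; a transition curve that flattens while p-distance continues to rise is the pattern interpreted in the text as saturation.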

The consensus phylogram of the nuclear RAG1 gene fragment is shown in Figure 2.5. While providing good support for deeper nodes in the gene tree, RAG1 sequences were unable to resolve intra-specific relationships for C. gachua and C. striata. Although individuals from the C. striata mtDNA MM clade do form a monophyletic group in the RAG1 gene tree, the three clades previously identified by mtDNA analysis were less evident in the RAG1 data.

More significantly, however, topology at intermediate nodes in the RAG1 gene tree was not entirely congruent with relationships constructed from Cyt b sequence data. Specifically, three taxa, C. diplogramma, C. lucius, and C. micropeltes, assume different positions with respect to other sampled taxa. While the close sister relationship between C. diplogramma and C. micropeltes is still evident, the two taxa now form a basal monophyletic group with C. lucius. This relationship is quite different to that observed for mtDNA lineages, where C. lucius showed a closer relationship to C. marulia, C. striata, and C. sp “x”.

Figure 2.5. RAG1 Bayesian consensus tree. Numbers indicate posterior probabilities/maximum likelihood percent bootstrap support. ML bootstraps were taken from the tree where nucleotides were partitioned into 1st + 2nd versus 3rd base codon positions. Coloured bars (left hand side) indicate Channa spp. groups; black bars indicate C. striata clades: West Asia (WA), Middle Mekong (MM) and East Asia (EA).


RP1 intron

The data set for the 1st intron (RP1) of the S7 ribosomal protein gene consisted of 35 individual sequences that were generated from 28 individuals. This data set included alleles sequenced from homozygotes, sequences derived from heterozygote genotypes (with variable sites coded following the IUPAC sequence alphabet for ambiguous bases), and sequences from heterozygote PCR that were cloned to produce single allele sequence reads in cases where length polymorphism prevented generation of combined allele sequence data. Data were generated for the two outgroups, Melanotaenia and Mogurnda, and for seven of the putative Channa species. Twelve sequences had some missing data (average for all twelve 5.3%).
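The coding of heterozygote genotypes with IUPAC ambiguity symbols can be illustrated with a short Python sketch. It covers only the two-base ambiguity codes needed for a diploid genotype and assumes both alleles have the same length; the function name and example alleles are hypothetical. Alleles that differ in length cannot be merged this way, which is consistent with the cloning step described above for length-polymorphic heterozygotes.

```python
# IUPAC symbols for single bases and two-base ambiguities.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
}


def genotype_sequence(allele_1, allele_2):
    """Merge two equal-length alleles into one IUPAC-coded sequence."""
    if len(allele_1) != len(allele_2):
        raise ValueError("alleles differ in length; they cannot be merged "
                         "site by site and must be kept as separate alleles")
    merged = []
    for a, b in zip(allele_1.upper(), allele_2.upper()):
        key = frozenset((a, b))
        if key not in IUPAC:
            raise ValueError(f"unsupported character pair: {a}/{b}")
        merged.append(IUPAC[key])
    return "".join(merged)


# Hypothetical alleles differing at two sites (C/T at each).
print(genotype_sequence("ACGTA", "ATGCA"))
```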

As expected for non-coding nDNA, the RP1 fragment proved to be highly variable. Sequence lengths ranged from 626 bp (for the shortest C. lucius allele) to 808 bp (for M. adspersa), and varied within individuals and among putative species as well as between species and genera. In C. gachua a microsatellite repeat contributed to allele length differences in the fragment. Multiple alignment inferred many indels, resulting in an aligned sequence length of 877 bp. After alignment, a total of 62 parsimony-informative gaps were coded as binary data for phylogenetic reconstruction. The aligned RP1 fragments had a total of 578 variable sites, with 441 positions variable within the Channa sequence alignment. The RP1 saturation plot (Figure 2.2d) illustrates the frequency of transitions and transversions as divergence increases across pair-wise comparisons. Both transition and transversion curves appear to be approaching a plateau at high p-distances, indicating that saturation is likely to be masking true divergence between more distantly related species in this study. At low levels of divergence, however (p-distance less than 0.5), the RP1 fragment shows minimal saturation and therefore is likely to represent true divergence at the intra-specific level.
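The idea of coding alignment gaps as binary characters can be sketched in Python as follows. The exact gap-coding scheme used in this study is not stated in this section, so the sketch adopts a simplified assumption: each distinct gap, defined by its start and end columns, is one character, scored 1 in sequences that carry exactly that gap and 0 otherwise, and a character is treated as parsimony-informative when each state occurs in at least two sequences. Function names and the toy alignment are hypothetical.

```python
import re


def gap_intervals(seq):
    """Return the set of (start, end) column ranges covered by runs of '-'."""
    return {(m.start(), m.end()) for m in re.finditer(r"-+", seq)}


def code_gaps(alignment):
    """Code each distinct gap as a binary presence/absence character."""
    per_sequence = [gap_intervals(seq) for seq in alignment]
    gaps = sorted(set().union(*per_sequence))
    matrix = [[1 if gap in found else 0 for gap in gaps]
              for found in per_sequence]
    return gaps, matrix


def informative_columns(matrix):
    """Indices of gap characters with each state in at least two sequences."""
    informative = []
    n_chars = len(matrix[0]) if matrix else 0
    for j in range(n_chars):
        ones = sum(row[j] for row in matrix)
        zeros = len(matrix) - ones
        if ones >= 2 and zeros >= 2:
            informative.append(j)
    return informative


# Hypothetical toy alignment: one shared 2 bp gap and one singleton gap.
toy_alignment = [
    "ACG--TACGT",
    "ACG--TACGT",
    "ACGTTTA-GT",
    "ACGTTTACGT",
]
gaps, matrix = code_gaps(toy_alignment)
print(gaps)                         # gap positions as (start, end) columns
print(informative_columns(matrix))  # only the shared gap is informative
```

This sketch does not treat leading or trailing gaps as missing data, nor does it score overlapping gaps as inapplicable, both of which a full indel-coding method would normally address.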

The consensus phylogram of RP1 alleles constructed using both nucleotide and binary gap data is presented in Figure 2.6. Phylogenetic analysis of RP1 alone was only able to resolve intra-specific and very shallow interspecific relationships with any certainty. At a lower taxonomic level, however, use of this marker was successful at recovering relationships among alleles. Firstly, the S7 gene tree reveals that C. gachua alleles from the Vietnamese highlands are different from all alleles found in Lao PDR and Thailand. For this species, the remainder of RP1 alleles fall into two shallow clades that occur in sympatry across northern Lao and eastern Thailand. This result supports the similarity of C. gachua across this region that had been indicated by monophyly of individuals in the Cyt b mtDNA phylogram.

Figure 2.6. Bayesian consensus tree of RP1 alleles. Numbers indicate posterior probabilities/maximum likelihood percent bootstrap support. Left hand side: patterned columns indicate Channa spp. groups; vertical striped bars indicate RP1 lineages: India (i), broad (b), and restricted (r); black bars indicate C. striata clades: West Asia (WA), Middle Mekong (MM) and East Asia (EA).


Secondly, the RP1 gene tree clearly indicates the presence of divergent nDNA lineages within C. striata, shown in Figure 2.6 by double horizontal stripes, and henceforth referred to as lineage i (India), lineage b (broad) and lineage r (restricted). The first, lineage i, was present in the southern Indian sample. This lineage forms a closely related sister group to lineage b, which is represented by samples that are broadly distributed across SE Asia from northwest Lao PDR to the island of Sumatra (average p-distance between lineages i and b = 0.0151). The third lineage, lineage r, is relatively divergent from other C. striata RP1 alleles (average p-distance versus lineages i and b = 0.0412 and 0.0418 respectively). This lineage was found in C. striata individuals restricted to three sites in northern Lao PDR and northeastern Thailand. Although heterozygotes were examined at this locus, no samples sequenced here were found to possess “inter-lineage” heterozygote genotypes.

As with the C. striata Cyt b mtDNA data, two divergent nDNA lineages were found in the mid-upper Mekong River Basin of mainland SE Asia, and a third and different type was found in India / West Asia. Despite this apparent similarity, Cyt b and RP1 gene trees are not congruent with respect to both the relationship between lineages and the assortment of individual genotypes across lineages. Firstly, mtDNA inference of phylogenetic relationship places the Indian haplotype (WA) as sister to the mid-upper Mekong clade (MM), whilst the Indian RP1 allele (lineage i) groups more closely with the broadly distributed lineage b. Secondly, within the SE Asian samples, genotypes from two individuals were found to be comprised of MM mtDNA yet lineage b nDNA (individuals sampled at Stung Treng, Cambodia, and Sayaburi, Lao PDR; see striped and black bars, Figure 2.6).

Multi-locus phylogeny reconstruction and chronogram estimation

Two multi-locus phylogenies were constructed to estimate species relationships. The first was reconstructed from a concatenation of all four fragments that had been analysed independently above. For the second, loci that failed to resolve intra-specific relationships independently were discarded, and the two remaining loci (the protein coding regions Cyt b and RAG1) were concatenated and analysed together. Both methods inferred well resolved relationships among all taxa, and the topologies of the phylogenies constructed under each method were very similar. The phylogeny estimated from the four loci is shown in Figure 2.7. It differs slightly from the phylogeny estimated from coding loci alone, where C. lucius, C. micropeltes and C. diplogramma form a sister group to all other Channa spp. See Appendix 4 to compare topologies.

Figure 2.7. Bayesian Channa phylogeny consensus tree estimated from four loci. Numbers indicate posterior probabilities/maximum likelihood percent bootstrap support.
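Concatenation of separately aligned loci into a single multi-locus data set, as described above, can be sketched in Python. The sketch assumes each locus is held as a mapping from taxon name to aligned sequence and that taxa absent from a locus are padded with `?`; these conventions, the function name and the toy sequences are assumptions for illustration, not a description of the software used here.

```python
def concatenate_loci(loci, missing="?"):
    """Concatenate per-locus alignments into one supermatrix.

    loci: list of dicts, one per locus, mapping taxon -> aligned sequence.
    Returns (supermatrix, partitions), where partitions holds 1-based,
    inclusive (start, end) column ranges for each locus.
    """
    lengths = []
    for locus in loci:
        sizes = {len(seq) for seq in locus.values()}
        if len(sizes) != 1:
            raise ValueError("each locus must be aligned to a single length")
        lengths.append(sizes.pop())

    taxa = sorted(set().union(*(locus.keys() for locus in loci)))
    supermatrix = {
        taxon: "".join(locus.get(taxon, missing * n)
                       for locus, n in zip(loci, lengths))
        for taxon in taxa
    }

    partitions = []
    start = 1
    for n in lengths:
        partitions.append((start, start + n - 1))
        start += n
    return supermatrix, partitions


# Hypothetical toy loci; the second lacks one taxon, which is padded.
locus_1 = {"C_striata": "ATGC", "C_gachua": "ATGT", "P_obscura": "ATAC"}
locus_2 = {"C_striata": "GG-A", "C_gachua": "GGTA"}
supermatrix, partitions = concatenate_loci([locus_1, locus_2])
print(supermatrix["P_obscura"])
print(partitions)
```

Keeping the partition boundaries alongside the concatenated sequences allows a separate substitution model to be assigned to each locus in downstream analyses.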

Both multi-locus data sets were analysed separately to estimate divergence times among taxa. The times and credible intervals estimated were very similar for each data set; however, lower effective sample sizes (ESSs) and longer times to parameter convergence were observed for the four locus data set (data not shown). Divergence times, upper and lower 95% credible intervals, and ESS scores for the coding region only data set are presented in Table 2.5.

Table 2.5. Divergence times estimated by BEAST from a two gene data set. For location of nodes see Figure 2.8. Estimates for nodes used as initial calibration points are indicated by *. Time since most recent common ancestor (tMRCA) and credible intervals are expressed in millions of years.

Node    Mean tMRCA (Mya)    95% credible interval (Lower – Upper)    ESS
C1*     53.95               48.15 – 63.68                            15197.9
C2*     43.56               40.18 – 48.59                            21589.57
C3      33.91               24.99 – 42.31                            3826.74
C4      28.82               20.02 – 37.19                            2583.47
C5      25.21               14.5 – 35.70                             1390.26
C6      23.93               15.10 – 33.04                            1865.561
C7      22.98               14.98 – 31.48                            1580.2
C8      12.72               4.58 – 22.43                             684.97
C9      11.79               6.13 – 18.31                             1242.184
C10     10.45               3.56 – 18.24                             1511.16
C11     8.05                3.47 – 13.91                             511.1
C12     5.99                2.14 – 10.65                             579.22
C13     5.66                1.42 – 11.86                             1494.14
C14     3.72                1.34 – 7.44                              770.49
C15     3.17                1.13 – 5.76                              1695.53

The Bayesian chronogram estimated in BEAST is presented in Figure 2.8. Most divergence between Channa spp. is estimated to have occurred during the Oligocene and Miocene. The most ancient intra-specific divergence is evident in C. striata (Node C11). The credible interval for this split overlaps with those for the interspecific divergences of C. diplogramma and C. micropeltes (C8), C. marulia and C. sp “x” (C10), and C. bleheri and C. gachua (C9). This suggests that some lineages have undergone speciation more rapidly than others in the recent past; however, as these nodes are quite distant from the calibration points at the root of the tree, these tMRCA estimates should be treated with caution.
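The overlap among credible intervals noted above can be checked directly from the values in Table 2.5. The following Python sketch is illustrative only; the interval bounds are copied from Table 2.5, while the function and variable names are hypothetical.

```python
# 95% credible intervals (Mya) for selected nodes, copied from Table 2.5.
INTERVALS = {
    "C8": (4.58, 22.43),   # C. diplogramma / C. micropeltes
    "C9": (6.13, 18.31),   # C. bleheri / C. gachua
    "C10": (3.56, 18.24),  # C. marulia / C. sp "x"
    "C11": (3.47, 13.91),  # deepest split within C. striata
}


def shared_range(a, b):
    """Return the overlapping (lower, upper) range of two intervals, or None."""
    lower = max(a[0], b[0])
    upper = min(a[1], b[1])
    return (lower, upper) if lower <= upper else None


for node in ("C8", "C9", "C10"):
    print(node, shared_range(INTERVALS["C11"], INTERVALS[node]))
```

Each interspecific node shares a substantial part of its interval with node C11, which is the basis for the statement that the deepest C. striata split cannot be distinguished in age from these species-level divergences.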


Figure 2.8. Bayesian inference chronogram from BEAST. Divergence times were estimated for nodes C1–C15. Open circles (C1 and C2) indicate nodes that were calibrated with fossil dates. Grey bars around nodes C1–C15 represent 95% credible intervals for tMRCA estimates (as per Table 2.5).


DISCUSSION

Evidence from four DNA loci was combined here to reconstruct the evolutionary history of members of the Channa genus in SE Asia. Analysis of coding mitochondrial and nuclear loci, both independently and in combination, provided a generally well resolved phylogeny that reveals a history of divergence that spans the last 40 million years. Ribosomal and intron DNA sequences support intra-specific and close sister group relationships, but fail to resolve deeper relationships, reflecting the large divergence present within this ancient genus.

Patterns of intra-specific divergence

In the phylogenetic reconstruction, five species were represented by more than a single individual collected from different geographical locations across the species’ natural ranges. The phylogeny assessed individuals belonging to at least three species that have formerly been identified as being potentially cryptic: C. striata, C. gachua, and C. micropeltes, as well as representatives of an unidentified taxon that was found to be monophyletic with C. marulia, another potential cryptic species. In all cases, taxa identified as the same species formed monophyletic groups, suggesting that none of the samples assessed here represent paraphyletic cryptic species.

The chevron snakehead, C. striata, has an extensive natural distribution, where it co-occurs with C. gachua across the full extent of the Asian snakehead distribution. A total of 14 C. striata individuals were included in the phylogeny, representing samples from five major drainage basins located in southern India, mainland SE Asia, Peninsular Malaysia and the island of Sumatra (Indonesia). All samples formed a monophyletic group, but this species showed the highest level of intra-specific divergence of any taxon included in the analysis. This divergence was observed for both mtDNA and nDNA loci. Such congruence is unlikely to have resulted from random lineage sorting in a large population. Instead, the pattern of divergence indicates that a number of separate groups within C. striata have most likely been evolving in isolation for long periods of evolutionary time. Divergence between southern Indian and SE Asian groups is not unexpected, given the large geographical distance between these sites; however, divergent lineages were also detected in individuals that occurred in sympatry in the Mekong Basin (at Sayaburi, northwestern Lao PDR). The majority of individuals genotyped that possessed Middle Mekong (MM) mtDNA haplotypes also possessed restricted (lineage r) nDNA; however, in two instances, individuals with MM mtDNA possessed broad (lineage b) nDNA genotypes (Figure 2.6). This indicates that genetic exchange is occurring between the two divergent groups in the Mekong Basin, and hence that the deep intra-specific divergence detected does not reflect reproductive isolation. Chapter 3 explores diversity and phylogeography of C. striata in more depth.

Four C. gachua individuals were included in the phylogeny, collected from sites separated by over 1,000 km of river distance in the Mekong River Basin (Plate 2.1). These individuals grouped closely together for all loci examined, and did not provide evidence for cryptic diversity in this species across the study area. The native range of C. gachua is recorded to span the complete geographical extent of Asian snakehead fishes (Figure 2.1), from Pakistan, across India and Sri Lanka, and across mainland SE Asia to islands in the Indonesian archipelago. Although no evidence for cryptic diversity was found in the Mekong Basin for C. gachua, it is quite possible that evidence for cryptic diversity would emerge from sampling across its entire range.

In their 2004 USGS Circular on snakehead fishes, Courtenay and Williams report that C. micropeltes has a “remarkably disjunctive distribution” (p. 94), where the majority of the natural range of the species is represented by rivers in mainland SE Asia, Peninsular Malaysia, Sumatra and Borneo, but that the species also occurs in an isolated pocket in southwestern India. Although the Indian “C. micropeltes” was originally described as Ophiocephalus diplogramme (Day 1865), Courtenay and Williams (2004) suggest that this is a misidentification, and that C. micropeltes was translocated to southwest India sometime before the 19th century. Contrary to this hypothesis, the present study supports the original description of the Indian taxon as a distinct species. While the two taxa do form a monophyletic group in every gene tree reconstruction, divergence between C. diplogramma and the SE Asian C. micropeltes is larger than for any other intra-specific comparison here, suggesting that the two sister species have been evolving in isolation for a long period of evolutionary time that is unlikely to be explained by a recent introduction.

Two Channa species collected for this study could not be identified to species level at the time of collection. The first, collected from northern Vietnam, was classified as C. cf. maculata based on molecular comparison with DNA database records (GenBank, Bai et al. unpublished) and on records of species occurrence (Arthur & Te 2006). The second species, collected from Cambodia and referred to here as C. sp. “x” (Plate 2.5), remains unclassified. In all gene and multi-locus reconstructions the unknown species forms a sister group to C. marulia, a species with a large mainland south Asian distribution. In the current study C. marulia is represented only by a DNA database sample collected from Bengal, north eastern India, over 2000 km distant from the collection site for C. sp. “x” (Rüber pers. comm. 2009). While C. sp. “x” and C. marulia consistently group together, considerably more divergence was present between the two taxa than was found within any recognised species groupings. In a previous channid phylogeny (Li et al. 2006), C. marulia formed a sister group to C. marulioides, but this species is not known to occur in Cambodia where C. sp. “x” was collected. The unknown species here may in fact represent evidence for a C. marulius “cryptic species complex” as suggested by Rainboth (1996a). The closest image found to the image taken at the time of collection (Plate 2.5) is shown in Plate 2.6. Although this image also lacks formal identification, it was presented by Courtenay and Williams (2004) (among others) as C. marulia, and superficially, it does closely resemble C. sp. “x” in the current study.

Plate 2.6. Adult C. marulia guarding young. Photographed by Ianaré Sévi. (Available from Wikimedia Commons http://en.wikipedia.org/wiki/File:Channa_marulius.jpg)


Relationships among taxa

A number of robust interspecific relationships were identified from the analysis using independent and combined loci. C. gachua was consistently identified as the closest relative of C. bleheri. These are two of the smaller snakehead species and only grow to a maximum length of about 20 cm. While C. gachua has an extensive natural distribution, C. bleheri is endemic to the Brahmaputra River Basin in northeastern India (Courtenay & Williams 2004), where presumably the two species currently occur in sympatry. This genetic affinity has been reported previously by other authors, based on three morphological traits (U-shaped isthmus, single sensory pore arrangement, and presence of large cycloid scales on each side of the lower jaw (Vishwanath & Geetakumari 2009)), and mtDNA relationships (Li et al. 2006). As this group occupies a distal position in the phylogeny, the discriminating traits identified by Vishwanath and Geetakumari (2009) may be recently derived.

The C. bleheri + C. gachua clade consistently nested with C. maculata, a larger Channa species restricted historically to a limited natural distribution in southern China and northern Vietnam at the eastern extent of the larger range of C. gachua. Sister to these taxa is a clade composed of C. marulia, C. sp “x” and C. striata. Both C. marulia and C. striata possess extensive natural distributions across southern and SE Asia, and it is interesting that the two species also share a close genetic affinity.

C. micropeltes, C. diplogramma and C. lucius group together within the phylogeny resolved here. This result is not congruent with the topology obtained by Li et al. (2006); however, in both cases C. micropeltes and C. lucius lineages represent descendants of the most ancient divergence events within Channa. Both species are characterised by a region of gular scales that are also present in African Parachanna, but are absent in the majority of the Asian species (Musikasinthorn & Taki 2001). This supports the hypothesis proposed by Li et al. (2006) that this morphological trait is plesiomorphic, or an ancestral character state. Two other species, C. bankanensis and C. pleurophthalmus, are reported to share the gular scales character with C. lucius and C. micropeltes. All four species exhibit ranges restricted to SE Asia, with the former two restricted to islands in western Indonesia. In contrast, C. diplogramma is only recorded from southwestern India, and it is not yet documented if this species also possesses the gular scale character trait.


Channid divergence times

Divergence times estimated from nuclear and mitochondrial DNA data calibrated with fossil evidence show a clear history of channid diversification over the last 40 million years. Estimated divergence times vary greatly from previous estimates based on only single lines of evidence, i.e., incomplete fossil history (Bohme 2004), and single locus mtDNA evolutionary rates (Li et al. 2008). Current divergence time estimates do agree, however, with crude estimates based on general rates of Cyt b evolution. For example, BEAST estimated the divergence time for Node C3 at 33.91 Mya (24.99 – 42.31) (Figure 2.8, Table 2.5), compared with estimates based on p-distance that yielded 24.6 Mya, 36.9 Mya, and 49.2 Mya with published molecular clock rates of 1%, 1.5%, and 2%, respectively (Bernardi et al. 2004; Johns & Avise 1998).

The root of the fossil calibrated chronogram (Figure 2.8) is aligned to the established hypothesis that divergence and radiation of ancestral perciforms was initiated when the Indian and continental Asian Plates collided, placing India as the centre of origin for perciform families (Bagra et al. 2009; Briggs 2003a; Karanth 2006). The northern India / Pakistan region also probably represents the centre of origin for the Channidae family, as the earliest fossil records (~50Mya) have been uncovered here, dating from a time before the Himalayas were up-thrust (Roe 1991).

The chronogram illustrates the estimated divergence times for members of the Asian Channidae. It provides evidence that the Channa genus underwent significant radiation during the Oligocene to early Miocene, 34–20 Mya. Globally, the mid-Oligocene was a time of major sea regression (Hall 1998). Islands in western Indonesia were connected to mainland SE Asia, and major river systems flowed southwards and eastwards from areas of high relief on the Tibetan Plateau, possibly driving dispersal of many taxa eastwards into SE Asia (Clark et al. 2004; Clift et al. 2006; Hall 1998). For channids, a landscape that provided potential for long distance hydrological connectivity may have presented the ideal opportunity to expand their ranges across southern Asia. As the Oligocene gave way to the Miocene around 24 Mya, major drainage rearrangements occurred on the Tibetan Plateau, isolating the formerly connected headwaters of the Yangtze, Red, Mekong and Salween Rivers (which flow to China, Vietnam, mainland SE Asia, and Burma, respectively) (Clark et al. 2004; Clift et al. 2006). It is notable that this period of vast geomorphological change coincided with the period in which the Asian snakeheads appear to have undergone their primary speciation wave. Descendants of lineages that arose around this time are represented in taxa that today span the full geographical extent of the Asian snakehead distribution, suggesting significant dispersal has occurred since the Oligocene / Miocene divergence.

Snakeheads prefer a wet climate, and the contemporary distributions of both Channa and Parachanna spp. are thought to be limited to areas with periods of high rainfall (>150 mm in the wettest month) and warm temperatures (mean temperature of 20 °C in the wettest month) (Bohme 2004). The Miocene in SE Asia was generally characterised by a wet climate, culminating in the development of the East Asian Monsoon in the Late Miocene over 7 Mya (An 2000; Morley 1998). The late Miocene also appears to represent a time when ancestral Channa were diverging further to become what can now be recognised as sister taxa (nodes C8, C9, and perhaps C10, Figure 2.8, Table 2.5). The onset of the monsoon system around this time brought cyclical climatic changes, alternating between dominance of dry-cold winters and warm-humid summers (An 2000; Morley 1998), resulting in associated repeated expansion and contraction cycles of rainforest assemblages (Morley 1998). This change may also have acted to isolate populations across ancestral species’ ranges in greater southern Asia, facilitating the formation of sister species pairs. The deep divergence among C. striata lineages also arose around this time.

More recently, the Miocene / Pliocene boundary (5 Mya) perhaps represents a time of channid population expansion, when contemporary taxa may have expanded their ranges. The most recent common ancestors for mainland and Sumatran individuals of C. striata and C. lucius are estimated to date to the Pliocene, although the land bridge connecting Sumatra to continental Asia was present as recently as 20 kya (Woodruff & Turner 2009). Some care should be taken in interpreting more recent divergence times, as nodes that lie distant from calibration points (for this data set C1 and C2) are hard to estimate accurately (Linder et al. 2005).


Conclusion

All species examined in the current study were monophyletic, and in the two cases where taxonomic confusion was present, the taxa in question (C. diplogramma and C. sp “x”) were found to be monophyletic with the recognised species with which they had previously been confused. Despite this monophyly, the large divergence (>10Mya) between these taxa suggests that taxonomic status as distinct species is probably warranted in each case, especially as this divergence is of a similar magnitude to divergence observed between taxa currently recognised as distinct (C. bleheri and C. gachua). The level of divergence uncovered among C. striata individuals, however, indicates that deep divergence has not necessarily resulted in the formation of reproductive isolation for all lineages in the genus.

The combination of fossil evidence and molecular divergence presented here indicates that the divergence between African and Asian snakeheads is most likely to have taken place soon after the Indian-Asian plate collision (Eocene), in contrast to previous biogeographical hypotheses that have suggested Gondwanan (early Cretaceous) (Li et al. 2006) or Miocene (Bohme 2004) divergence between genera. In Asia, the presence of multiple divergence events in the Miocene that reflect division between species with western and eastern distributions suggests that dispersal across southern Asia may have been limited during this time, possibly in association with aridifying climate and Himalayan upthrust.


Chapter 3 Patterns of genetic diversity and phylogeography of Channa striata in SE Asia


Phylogeography of C. striata

INTRODUCTION

Channa striata, commonly known in English as the ‘chevron snakehead’, ‘striped snakehead’, or ‘striped snakehead murrel’, is perhaps the most common species in the Asian snakehead genus. C. striata is locally abundant across a wide natural range that encompasses most of southern and SE Asia, extending from Sri Lanka in the west to Borneo and Sumatra in the southeast (Courtenay & Williams 2004; Fishbase 2010).

Growing up to 90cm, C. striata is also one of the largest snakeheads (Rainboth 1996a). Across its native range this species is harvested for food from the wild in vast numbers (Figure 3.1a). The species is popular for human consumption, and is generally marketed fresh or alive. In mainland SE Asia, C. striata is possibly the most common fish species in markets along the Mekong River, where it is among the most important species from wild capture and commands the highest prices from culture production (MRC Fisheries Program 1999; Naret et al. 2002). In the Vietnamese delta alone, C. striata accounts for 20-40% of all household money spent on fish (Dey et al. 2005).

Figure 3.1. Global fisheries production for C. striata 1950-2007. Source: Fisheries Statistics Data (FAO 2010). (a) Production from wild capture fisheries, (b) Production from aquaculture, (c) A Cambodian fish vendor sells C. striata at a local market. [Panels (a) and (b): production in thousands of tonnes against year, 1950-2000; data © FAO - Fisheries and Aquaculture Information and Statistics Service, 20/02/2010. Panel (c): photograph © MRC.]


In the Mekong Basin C. striata are harvested in large numbers from flooded forests, rice fields, associated canals and other wetland areas in all seasons and at all stages of maturity (Ambak & Jalal 2006; personal observation). The species is commonly caught using seines, gill-nets, traps and baited hooks (Nguyen et al. 2006; Rainboth 1996a). The majority of fishing operations are small-scale, family-run enterprises, although larger operations are also present.

C. striata production from aquaculture represents a rapidly expanding industry in the region (Figure 3.1b), with fish often grown at very high stocking densities (greater than 30/m2 for fish of average length 20cm) where they perform well on formulated feeds (Qin & Fast 1998). At present, almost all aquaculture production is established using wild caught juveniles (Ali 1999; Poulsen et al. 2008; So & Haing 2007). As demand for fish in Asia continues to grow (Delgado et al. 2003; Huang & Bouis 1996), it is likely that fishing pressure will increase on natural populations of C. striata, both as a capture fishery and as a source for cultured stocks.

Ecology

C. striata are primarily carnivorous, and as they grow from fry to adults their diet is thought to shift from planktonic crustaceans, snails and worms towards larger prey items including fish, frogs, and small aquatic snakes (Lee & Ng 1994; Wee 1982). Individuals prefer standing or slow moving water up to 1m in depth (Courtenay & Williams 2004), and are often found in rice paddies, drainage canals, lakes and ponds. C. striata is non-migratory (known in the Mekong Basin as “black fish”), but does undertake short lateral migrations to and from flood plain habitats (Poulsen et al. 2008). In fact, C. striata is one of the main ‘self-recruiting species’ indigenous to SE Asia, and is quick to colonise ephemeral freshwater habitats where it occurs, including inundated rice fields and temporary flooded forest habitats (Amilhat et al. 2009a). Despite this capacity to disperse quickly at a local scale, the maximum dispersal distance reported for the species is only 3km (over a 2 year study), with average recorded dispersal distances in the order of only 500m (Amilhat & Lorenzen 2005), suggesting that movement is generally limited over individual lifetimes.

C. striata are solitary except during breeding, and become reproductively mature at 2-3 years of age (total length >20 cm) (Kilambi 1986). Reproductive females produce an average of 4326-9017 oocytes annually (Ali 1999; Kilambi 1986), although in nature the maximum brood size is probably around 5000 (Parameswaran & Murugesan 1976). This is relatively low in comparison with many other freshwater fishes (Wee 1982). Individuals are “opportunistic breeders” (Poulsen et al. 2008), and are capable of multiple extended spawning under the right conditions, but breeding effort is commonly reserved for the onset of the wet season, when fish move into floodplain habitat (Ali 1999). Across mainland SE Asia, pairs are thought to spawn up to twice a year during the flood season, and eggs are guarded by parents until they hatch (after around 24 hrs) (Ali 1999; Campbell et al. 2006; Courtenay & Williams 2004; Marimuthu & Haniffa 2007).

The presence of suprabranchial chambers in adult C. striata enables the species to breathe air and so remain out of water for extended periods of time, at least 28 hrs, or even longer provided humidity is sufficient to keep individuals moist (Hughes & Munshi 1986; Sayer 2005). Of all the snakehead species, C. striata is perhaps the most tolerant of poor water quality, but it is also capable of traversing terrestrial environments, or aestivating under hardened mud (Wee 1982). These characteristics enable individuals to respond to seasonal changes in habitat availability and quality by either moving short distances overland in search of new favoured aquatic habitat or by remaining dormant until local conditions improve. These are uncommon attributes in freshwater fish species.

Unusual life history characteristics make the population structure of C. striata in SE Asia hard to predict. Populations of freshwater species commonly show population structure tightly-linked to the drainage basins they inhabit (Meffe & Vrijenhoek 1988), yet C. striata is also capable of some overland dispersal and is known to be a rapid coloniser, suggesting that unlike many freshwater species it may have potential to disperse widely among drainage basins. A previous study of C. striata genetic diversity in Thailand found differentiation among populations in the middle Mekong Basin and Chao Phraya Basin at a level expected among local races, yet noted close genetic similarity between geographically proximate C. striata populations inhabiting the upper regions of both drainages (Hara et al. 1998). This pattern suggests that neither geographical proximity nor drainage divides are likely to fully explain contemporary population structure of C. striata populations.

Furthermore, spatial arrangement of genetic diversity in relatively sedentary species often conforms to a pattern of isolation by distance (Wright 1943), where genetic exchange is inversely proportional to geographical distance among sites, with higher rates of gene exchange (and hence similarity) apparent among neighbouring populations. For populations arranged in a linear habitat (such as along a large river system), this expectation can be refined to expectations of Kimura and Weiss’ (1964) One-dimensional Stepping Stone model, or perhaps to Meffe and Vrijenhoek’s (1988) Stream Hierarchy model, where probability of gene exchange (dispersal) within a drainage is influenced directly by the level of freshwater connectivity. For C. striata in the Mekong, these predictions do not seem to be satisfied. Results of the phylogenetic analysis (Chapter 2) revealed that not only are two deeply divergent C. striata lineages present in sympatry in the Mekong, but ancestors of one lineage were more closely related to individuals from southern India, separated by over 2,000km, than to other individuals collected at the same sampling site.
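The isolation by distance expectation is commonly examined by correlating pairwise geographical and genetic distances among sites and assessing significance by permutation (a Mantel test). The following is a minimal sketch of that idea only; it is not the analysis used in this study, and the site positions and genetic distances are invented toy values chosen so that genetic distance is exactly proportional to distance along a hypothetical river.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def upper_triangle(matrix, order):
    """Pairwise values above the diagonal, with sites relabelled by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def mantel(geo, gen, permutations=999, seed=1):
    """Return (r, one-sided p) for the correlation of two distance matrices."""
    identity = list(range(len(geo)))
    geo_vec = upper_triangle(geo, identity)
    observed = pearson(geo_vec, upper_triangle(gen, identity))
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # relabel sites in the genetic matrix
        if pearson(geo_vec, upper_triangle(gen, perm)) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)

# Hypothetical sites at 0, 100, 250 and 450 km along a river (toy values).
positions = [0.0, 100.0, 250.0, 450.0]
geo = [[abs(a - b) for b in positions] for a in positions]
# Toy genetic distances, exactly proportional to river distance.
gen = [[0.001 * d for d in row] for row in geo]

r, p = mantel(geo, gen)
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

With only four sites there are few distinct permutations, so the p-value here is coarse; real data sets with more sites give a finer null distribution.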

Understanding C. striata diversity in a regional context

Given the unusual ecology of C. striata and the importance of this species to human populations, a number of environmental and anthropogenic factors may have influenced contemporary population structure and genetic diversity patterns in C. striata populations in SE Asia. These can be broadly grouped into factors promoting population and/or range expansion, factors causing possible population declines, and factors enhancing population connectivity or promoting admixture.

C. striata are known to prefer slow moving shallow water habitat. The formation of the vast shallow Tonle Sap Great Lake in the Lower Mekong Basin during the mid Holocene (Penny 2006; Rainboth 1996a) was likely to have expanded available preferred habitat for the species, and hence could have promoted population expansions. It has also been suggested that recent expansion of rice growing by human populations across SE Asia with associated conversion of natural habitats to paddy fields may have benefited wild C. striata populations (Poulsen et al. 2008), leading to sustained large population sizes.

Conversely, habitat loss and overexploitation of wild stocks have the potential to severely impact fish populations over very short time scales (Collares-Pereira & Cowx 2004; Kenchington 2003; Mullon et al. 2005). C. striata populations have been the focus of sustained and intensive harvesting, particularly over the last 50 years (Figure 3.1a), and this could have resulted in marked local declines in genetic diversity. Local declines in populations have been reported in some parts of the species’ natural range (Courtenay & Williams 2004), and these incidents may reflect a wider pattern of decline due to increased harvesting pressure.


Finally, throughout the Pleistocene, sea levels fluctuated greatly, and large parts of the Sunda and Sahul shelves were repeatedly exposed, connecting freshwater drainages and forming extended river basins (Rainboth 1996b; Voris 2000; Woodruff & Turner 2009). In addition, geomorphological changes in mainland SE Asia are thought to have reshaped drainage lines, both isolating and connecting freshwater habitats (Rainboth 1996a). Expanded freshwater connections among what are now isolated drainage basins, which are known to have enhanced dispersal in other freshwater taxa (de Bruyn et al. 2004; Dodson et al. 1995; McConnell 2004), could also have promoted gene exchange and population admixture in C. striata. Dispersal of C. striata across southern Asia may also have resulted from translocations by humans. Today C. striata is an important, high value food fish in India and in SE Asia. Translocation by humans is common for economically important freshwater taxa, and it is possible that some C. striata lineages may, in part, owe their current distribution patterns to human populations moving individuals to new habitats across southern Asia for food or commercial reasons.


Aims of this chapter

This chapter aimed to characterise the levels and patterns of genetic diversity in wild C. striata populations across SE Asia and to determine the pattern of population structure present in this species. Mitochondrial and nuclear microsatellite data were interpreted in a geographical, demographic and historical context to reconstruct a model for the micro-evolutionary history of this species in the region.

Specifically, the questions addressed at both local and regional scales include:

• How is genetic variation distributed across the study area?
• How does divergence in mtDNA diversity compare with divergence and diversity in nDNA?
    o Do different loci indicate similar patterns of divergence?
    o Is this pattern reflected at the individual as well as at the gene level?
• To what extent do divergent lineages occur in sympatry?
    o What is the extent of hybridisation among lineages?
• Is there any evidence for recent demographic changes that may have been associated with anthropogenic activities?
• What does the spatial arrangement of genetic variation reveal about past and present patterns of gene flow, genetic exchange, and colonisation by the species?
• How do observed patterns of phylogeography and population structure relate to the ecology and life history of C. striata?


METHODS

Sample Collection

Sampling of C. striata aimed to collect individuals from across the natural distribution of the species in SE Asia at a range of different hierarchical scales. At the finest spatial scale, multiple sites were sampled across the Mekong River Basin to assess variation, dispersal and population structure at the within-drainage basin level. At a higher level of the spatial hierarchy, sites were assessed in the Chao Phraya River drainage basin and Mekong River drainage basin to provide comparisons at the neighbouring drainage scale. Individuals from a third drainage in mainland SE Asia (at Tanjung Karang, Malaysia) were included to provide additional comparison at the regional scale. At the highest level of spatial hierarchy, a site from Sumatra in the Indonesian archipelago and a site from southern India were included to assess divergence/similarity across the full extent of the natural range of the species, spanning a marine barrier to dispersal (Sumatra) and large geographical distance (India) (Figures 3.2 and 3.3).

Across the Mekong Basin, C. striata samples were collected primarily from local markets. At the time of sampling, fish were positively identified to species level by local government fisheries scientists. All fish sampled were caught locally from the wild, usually within a 1-5km radius of the point of collection, and fish collected at a single site were assumed to be representative of local population variation at and around that site. Where possible, 30-50 individuals per site were sampled to allow for a robust analysis of variation at the local population level (Ruzzante 1998). Fin tissue was excised from the caudal, pectoral, or dorsal fin and samples were sealed individually in vials of 75% ethanol. Additional C. striata samples (fin or muscle tissue) were supplied by collaborators in Vietnam, Malaysia, Indonesia and India.

Geographical co-ordinates for each site and details of sample collectors are presented in Table 3.1. Figure 3.2 and Figure 3.3 illustrate the location of collection sites. Information for each location, including river drainage basin membership and country, is presented in Table 3.2. MtDNA and nDNA analyses utilised slightly different sets of samples (see Table 3.2 for details). Colours of collection sites in Figure 3.2 and Figure 3.3 indicate which sites correspond with the mtDNA and/or nDNA analyses. In preparation for genetic screening, total genomic DNA was extracted following Miller et al.’s (1998) standard salt extraction method (Appendix 1).


Table 3.1 Geographical co-ordinates for C. striata sample sites. Collaborators that assisted with sample collection are presented in the right hand column. See Table 3.2 for site code abbreviations and further sampling details.

| Collection site | Latitude | Longitude | Collectors and sampling dates |
|---|---|---|---|
| Kanakkankadavu | 10°08.00’N | 76°07.00’E | Unknown. 2008. |
| Chiang Mai | 18°47.00’N | 99°00.00’E | D. Hurwood & E. Adamson. Nov 2005. |
| Saraburi | 14°37.50’N | 100°54.00’E | N. Sukmasavin & Thai Dept. of Fisheries. 2007. |
| Tanjung Karang | 03°25.78’N | 101°16.76’E | S. Bhassu. 2007. |
| Lampung | 03°20.30’S | 101°10.00’E | E. Nugroho. 2007. |
| Sayaburi | 19°16.10’N | 101°42.68’E | Boonsong, D. Hurwood & E. Adamson. Aug 2006. |
| Vang Vien | 18°56.52’N | 102°26.85’E | Boonsong, D. Hurwood & E. Adamson. Aug 2006. |
| Vientiane | 17°57.61’N | 102°36.86’E | Boonsong, D. Hurwood & E. Adamson. Aug 2006. |
| Songkhram | 17°47.90’N | 104°00.43’E | U. Suntornata, D. Hurwood & E. Adamson. Nov 2005. |
| Mukdahan | 16°32.33’N | 104°43.65’E | U. Suntornata, D. Hurwood & E. Adamson. Nov 2005. |
| Kong Jeam | 15°18.88’N | 105°30.04’E | U. Suntornata, D. Hurwood & E. Adamson. Nov 2005. |
| Burirum | 14°57.75’N | 102°58.20’E | N. Sukmasavin & Thai Dept. of Fisheries. 2007. |
| Kontum | 14°19.22’N | 108°01.63’E | Phuc Dinh Phan. 2007. |
| Buon Ma Thot | 12°43.80’N | 107°55.00’E | Phuc Dinh Phan. 2007. |
| Stung Treng | 13°31.73’N | 105°58.26’E | S. Lieng, E. Adamson & Cambodian DoF. Apr 2007. |
| Kratie | 12°29.10’N | 106°01.02’E | S. Lieng, E. Adamson & Cambodian DoF. Apr 2007. |
| Snoul | 12°04.56’N | 106°25.36’E | S. Lieng & E. Adamson. Apr 2007. |
| Memot | 11°49.62’N | 106°10.86’E | S. Lieng & E. Adamson. Apr 2007. |
| Kampong Cham | 11°59.38’N | 105°27.86’E | S. Lieng, E. Adamson & Cambodian DoF. Apr 2007. |
| Pha Ao | 12°01.70’N | 104°51.86’E | S. Lieng & E. Adamson. Apr 2007. |
| Kampong Chhnang | 12°15.27’N | 104°40.13’E | E. Adamson & Cambodian DoF. Apr 2007. |
| Pursat | 12°32.32’N | 103°55.10’E | E. Adamson & Cambodian DoF. Apr 2007. |
| Battambang | 13°07.59’N | 103°12.69’E | T. Roth, E. Adamson & Cambodian DoF. Apr 2007. |
| Takeo | 10°56.00’N | 104°50.00’E | Cambodian DoF. Apr 2007. |
| Tinh Bien | 10°37.16’N | 105°00.01’E | Nguyen Nguyen Du & E. Adamson. Feb 2007. |
| Tram Chim | 10°40.20’N | 105°33.57’E | Nguyen Nguyen Du & E. Adamson. Feb 2007. |
| Tan An | 09°52.31’N | 105°07.45’E | Nguyen Nguyen Du & E. Adamson. Feb 2007. |
| Vinh Thuan | 09°30.73’N | 105°15.53’E | Nguyen Nguyen Du & E. Adamson. Feb 2007. |
| Phung Hiep | 09°48.75’N | 105°49.26’E | Nguyen Nguyen Du & E. Adamson. Feb 2007. |
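The co-ordinates in Table 3.1 are given in degrees and decimal minutes. As a minimal sketch (not part of the original analysis), they can be converted to decimal degrees and used to compute straight-line (great-circle) distances between sites. The function names and the spherical Earth radius of 6371 km are assumptions made for illustration, and such distances ignore river connectivity, so they are only a first approximation of the separation relevant to a freshwater fish.

```python
import math

def dm_to_decimal(degrees, minutes, hemisphere):
    """Convert degrees and decimal minutes to signed decimal degrees."""
    value = degrees + minutes / 60.0
    return -value if hemisphere in ("S", "W") else value

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km on a sphere (radius is an assumption)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

# Two Mekong sites, co-ordinates as listed in Table 3.1.
vientiane = (dm_to_decimal(17, 57.61, "N"), dm_to_decimal(102, 36.86, "E"))
phung_hiep = (dm_to_decimal(9, 48.75, "N"), dm_to_decimal(105, 49.26, "E"))

distance = haversine_km(*vientiane, *phung_hiep)
print(f"Vientiane to Phung Hiep: {distance:.0f} km (straight line)")
```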


Figure 3.2. Map of Southern Asia showing broad scale sampling sites for C. striata. Stars indicate collection sites. Colour of star indicates type of genetic analysis performed: yellow stars, Cyt b mtDNA only; red stars, both Cyt b and microsatellite DNA. [Map label: See Figure 3.3.]

Figure 3.3. Map of mainland SE Asia showing fine scale sampling sites for C. striata in the Mekong River Drainage. Stars indicate collection sites. Colour of star indicates type of genetic analysis performed: yellow stars, Cyt b mtDNA only; green star, microsatellite DNA only; red stars, both Cyt b and microsatellite DNA.


Table 3.2. C. striata collection sites and sample sizes. See Figures 3.2 and 3.3, and Table 3.1 for geographical location of sampling sites.

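| Collection site | Site code | n, mtDNA analysis (total = 988) | n, microsatellite analysis (total = 654) | Country | River drainage |
|---|---|---|---|---|---|
| Kanakkankadavu | K | 3 | - | India | Chalakkudy |
| Chiang Mai | CM | 10 | 10 | Thailand | Chao Phraya |
| Saraburi | CP | 49 | 48 | Thailand | Chao Phraya |
| Tanjung Karang | TK | 30 | 30 | Malaysia | Sungai Tengi |
| Lampung | LP | 7 | 24 | Indonesia | Eastern Coastal |
| Sayaburi | SB | 28 | 30 | Lao PDR | Mekong |
| Vang Vien | VV | 2 | 20 | Lao PDR | Mekong |
| Vientiane | VT | 42 | 42 | Lao PDR | Mekong |
| Songkhram | SM | 2 | 8 | Thailand | Mekong |
| Mukdahan | MD | 21 | 29 | Thailand | Mekong |
| Kong Jeam | KJ | 10 | 11 | Thailand | Mekong |
| Burirum | NE | 50 | 50 | Thailand | Mekong |
| Kontum | GL | 39 | 39 | Vietnam (Highlands) | Mekong |
| Buon Ma Thot | LL | 40 | 40 | Vietnam (Highlands) | Mekong |
| Stung Treng | ST | 47 | 48 | Cambodia | Mekong |
| Kratie | KK | 49 | - | Cambodia | Mekong |
| Snoul | SN | 50 | - | Cambodia | Mekong |
| Memot | ME | 39 | - | Cambodia | Mekong |
| Kampong Cham | KC | 50 | 48 | Cambodia | Mekong |
| Pha Ao | PA | 49 | - | Cambodia | Mekong |
| Kampong Chhnang | KH | 51 | - | Cambodia | Mekong |
| Pursat | PS | 56 | 48 | Cambodia | Mekong |
| Battambang | BB | 50 | - | Cambodia | Mekong |
| Takeo | TP | 49 | - | Cambodia | Mekong |
| Tinh Bien | TT | 42 | 42 | Vietnam (Mekong Delta) | Mekong |
| Tram Chim | TC | 53 | - | Vietnam (Mekong Delta) | Mekong |
| Tan An | TL | - | 45 | Vietnam (Mekong Delta) | Mekong |
| Vinh Thuan | VI | 28 | - | Vietnam (Mekong Delta) | Mekong |
| Phung Hiep | PH | 42 | 42 | Vietnam (Mekong Delta) | Mekong |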


DNA marker selection

Both mitochondrial DNA (mtDNA) and nuclear DNA (nDNA) diversity were assessed for this study. In each case, regions of the genomes were selected to address specific aims of the research, and species-specific primers compatible with high-throughput screening techniques were designed to target selected loci.

A region of the Cytochrome b gene of the mtDNA genome was chosen to address genealogical relationships among individuals and populations. MtDNA is well suited for phylogeographic investigations, and is frequently used to address questions regarding population history, demography, and spatial patterns of genetic diversity and divergence (Avise 2009). MtDNA has a number of characteristics that make it especially useful for intra-specific phylogeographic analysis; these have been discussed elsewhere in depth and are now generally well recognised (Avise 1994, 2000, 2009; Wilson et al. 1985). Briefly, mtDNA is a haploid, maternally inherited, non-recombining, rapidly evolving locus with an effective population size one quarter that of nuclear loci. These characteristics mean that mtDNA provides an unbroken genealogical record of species’ and population histories. Under assumptions of neutrality and coalescent theory, mtDNA diversity can be analysed to infer historical demographic events, including population bottlenecks and range expansions, and to map the micro-evolutionary history of a species in a spatial framework. The Cytochrome b gene region (Cyt b) was selected for screening here as this mtDNA region was found in a pilot study to contain suitable levels of variation to address intra-specific relationships among C. striata populations.
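The one-quarter figure for the effective population size of mtDNA can be recovered by counting transmitted gene copies. The short derivation below is a sketch under assumptions not stated in the text: an equal sex ratio, equal variance in reproductive success between the sexes, and strictly maternal transmission. Here N_f and N_m denote the numbers of breeding females and males, and N = N_f + N_m.

```latex
\[
\frac{N_{e}^{\mathrm{mt}}}{N_{e}^{\mathrm{nuc}}}
  = \frac{N_f}{2\,(N_f + N_m)}
  = \frac{N/2}{2N}
  = \frac{1}{4}
  \qquad \text{when } N_f = N_m = N/2 .
\]
```

The numerator counts one transmitted mtDNA copy per female (haploid, maternal inheritance), while the denominator counts two autosomal copies per individual of either sex.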

To further investigate population processes, microsatellite regions of the nuclear genome were selected. Microsatellite loci are simple sequence repeats that are present in high frequencies across eukaryote genomes and that theoretically represent multiple unlinked Mendelian loci (Ellegren 2004; Zhang & Hewitt 2003). Most microsatellites are in non-coding regions of the genome and typically have much higher mutation rates than rates of nucleotide base substitution in mtDNA or nDNA protein coding regions. These high mutation rates generate high levels of allelic diversity at the intra-specific level, relevant to addressing fine-scale ecological questions (Schlötterer 2000; Selkoe & Toonen 2006; Zhang & Hewitt 2003). Microsatellite data are used commonly to address questions regarding contemporary levels of population structure, gene flow, demographic history, interbreeding, and relatedness among individuals (Pearse & Crandall 2004; Selkoe & Toonen 2006).


Molecular techniques - Mitochondrial DNA

PCR amplification

Initial sequencing was undertaken for a number of C. striata individuals to confirm suitability of Cyt b as a variable marker for addressing phylogenetic and population genetic questions for this species. Several 834 base pair mtDNA fragments encompassing 25 bases of the Glutamate tRNA gene and 809 bases at the start of the Cyt b gene were amplified with primers GLUDG-L and CB3-H (Palumbi et al. 1991) prior to sequencing in both directions. For primer sequences and PCR conditions see Table 2.2 and Table 2.3; for the sequencing protocol consult Appendix 2. Initial sequences were aligned by eye using BIOEDIT software (Hall 1999) and a region of partial homology identified from bases 174 to 193 of the Cyt b gene, from which a Channa specific primer was designed; Ch-CBi: 5’-CAT YAC CAC CGY CTT CTC AT-3’. Used in combination with CB3-H, this primer yielded a 657 base pair internal fragment of the Cyt b gene that was suitable for mass screening with Temperature Gradient Gel Electrophoresis (TGGE).
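The Ch-CBi primer contains the IUPAC ambiguity code Y (C or T) at two positions, so a synthesised primer is a mixture of four sequences. The sketch below, which is illustrative rather than part of the original workflow, expands the degenerate primer into its explicit variants and checks that its length matches the 20-base span from base 174 to base 193. The function name and the reduced IUPAC table are assumptions for illustration.

```python
from itertools import product

# Subset of IUPAC nucleotide codes needed here (Y = pyrimidine, R = purine).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "R": "AG"}

def expand_degenerate(primer):
    """Return every explicit sequence encoded by a degenerate primer."""
    bases = primer.replace(" ", "")
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in bases))]

CH_CBI = "CAT YAC CAC CGY CTT CTC AT"  # Ch-CBi as given in the text, 5'-3'
variants = expand_degenerate(CH_CBI)

primer_length = len(CH_CBI.replace(" ", ""))
span_length = 193 - 174 + 1  # bases 174 to 193 inclusive

for v in variants:
    print(v)
print(primer_length, span_length)
```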

In preparation for screening, the internal Cyt b region was amplified for each sample as follows: 25µL PCR reaction volumes contained 50-200ng genomic DNA, 5pmol of each primer, 1µL of 10mM dNTPs (Roche™), 2.5µL of 10X PCR Reaction Buffer (Roche™), 1µL of 25mM MgCl2 (Fisher™) and 0.5units of Taq DNA Polymerase (Roche™). PCR cycling conditions began with denaturing at 94°C for 2mins, followed by 35 cycles of 15sec denaturation at 94°C, 15sec annealing at 50°C and 30sec extension at 72°C. A final extension of one minute was followed by a 15°C hold step to complete the reaction cycles. All PCR reactions included a negative control (one reaction with no DNA template). After PCR amplification, reaction success was checked for each sample using agarose gel electrophoresis. Figure 3.4 shows an example of a successful PCR check gel; see Appendix 2 for details.
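For readers wishing to compare this recipe with other protocols, the stock volumes can be converted to final concentrations using C1V1 = C2V2, and the programmed cycling time can be summed. The sketch below uses only the quantities stated above; the variable and function names are illustrative assumptions. It treats the 10mM dNTP stock as stated (whether this is the total or the per-nucleotide concentration is not specified in the text), ignores any MgCl2 that the 10X buffer itself may contain, and excludes thermal ramping time.

```python
def final_concentration(stock, volume_added_ul, total_volume_ul):
    """Dilution by C1 * V1 = C2 * V2; result is in the units of `stock`."""
    return stock * volume_added_ul / total_volume_ul

TOTAL_UL = 25.0  # reaction volume stated in the text

dntp_mM = final_concentration(10.0, 1.0, TOTAL_UL)    # 1 uL of 10 mM stock
mgcl2_mM = final_concentration(25.0, 1.0, TOTAL_UL)   # 1 uL of 25 mM stock
buffer_x = final_concentration(10.0, 2.5, TOTAL_UL)   # 2.5 uL of 10X buffer
primer_uM = 5.0 / TOTAL_UL                            # 5 pmol per 25 uL; pmol/uL equals uM

# Programmed hold times only: initial denaturation, 35 cycles, final extension.
cycle_s = 15 + 15 + 30
programmed_s = 2 * 60 + 35 * cycle_s + 1 * 60

print(f"dNTPs {dntp_mM} mM, MgCl2 {mgcl2_mM} mM, buffer {buffer_x}X, primers {primer_uM} uM each")
print(f"Programmed time: {programmed_s / 60:.0f} min")
```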

Figure 3.4. Agarose check gel showing positive amplification of Cyt b mtDNA gene fragment. M: Molecular weight marker (Marker 9, Roche), S1-6: PCR product from successful amplification, -ve: negative control. [Lanes as labelled on the gel: M, -ve, S1, S2, S3, S4, S5, S6; labelled marker bands: 1357, 1078, 872, 603 and >300 bps; loading wells at the top.]


Mass screening using TGGE

Variation in Cyt b was screened using Temperature Gradient Gel Electrophoresis in combination with Outgroup Heteroduplex Analysis (TGGE-OHA) (Campbell et al. 1995; Elphinstone & Baverstock 1997). This method provides a relatively low cost, high throughput means for screening variation at mtDNA loci. TGGE is a ‘genetic fingerprinting’ technique that can be applied to identify different DNA sequences (allelic variants) based on their relative electrophoretic mobility when partially melted (denatured) (Lessa & Applebaum 1993; Muyzer & Smalla 1998). OHA increases the sensitivity of TGGE, allowing for detection of differences as small as one base pair between allelic variants (Campbell et al. 1995).

In OHA, PCR product from the target DNA fragment of each individual is annealed to PCR product from a single reference sample (outgroup), creating mismatched double stranded DNA fragments of the target DNA region (heteroduplexes). Every heteroduplex is characterised by a precise melting temperature that is determined by the specific nucleotide base mismatches between sample and reference DNA strand. TGGE on high-resolution polyacrylamide gels can be used to discriminate between heteroduplexes based on these denaturation temperatures, allowing samples to be classified into haplotype groups based on the banding patterns unique to each haplotype heteroduplex.

A C. striata sample collected from Tanjung Karang (Malaysia) was selected as the principal outgroup for Cyt b screening in this study. A second sample, collected from Vientiane (Lao PDR), was used as an alternative outgroup to rerun samples that had failed to produce banding patterns when the first outgroup was used. Samples were screened on a horizontal TGGE system (modelled from a DIAGEN (now QIAGEN) TGGE-System). Each heteroduplex reaction contained ~15ng sample PCR product and ~10ng reference PCR product. After optimisation of electrophoresis conditions, samples were run through 5% polyacrylamide gels at 300V for 3hrs over a perpendicular gradient of 40-60°C. DNA bands were visualised via silver staining. Examples of banding pattern and gel scoring are presented in Figure 3.5. For a detailed explanation of methods, including optimisation of the TGGE-OHA technique for screening Cyt b diversity in C. striata, please consult Appendix 6.

Cyt b haplotype frequencies were scored individually for all sample populations. After all samples for each site had been sorted into haplotype classes, single individuals of each haplotype present for each site were sequenced for the Cyt b fragment using the Ch-CBi primer following the protocol outlined in Appendix 2. The DNA sequences of these individuals were assumed to be representative of all individuals of the same haplotype class at their site of collection. Sequences were aligned by eye using BIOEDIT software and all mutations checked against chromatograph sequencing output. Ends of sequences were discarded and, where this resulted in the loss of unique mutations, haplotype frequencies were pooled to reflect diversity in the shortened gene fragment.
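The pooling step described above can be sketched as follows: once sequence ends are trimmed, haplotype classes that differed only in the discarded region become identical and their frequencies are summed. This is an illustrative sketch only; the function name, the haplotype labels, the toy sequences and counts, and the trimming positions are all invented for the example.

```python
from collections import defaultdict

def pool_after_trimming(haplotypes, start, end):
    """Trim aligned sequences to [start:end] and pool identical results.

    `haplotypes` maps a haplotype label to (aligned sequence, count).
    Returns a dict keyed by trimmed sequence with pooled labels and counts.
    """
    pooled = defaultdict(lambda: {"members": [], "count": 0})
    for label, (sequence, count) in haplotypes.items():
        core = sequence[start:end]
        pooled[core]["members"].append(label)
        pooled[core]["count"] += count
    return dict(pooled)

# Toy data: H1 and H2 differ only at the final base, which is trimmed away.
toy = {
    "H1": ("ACGTACGTAC", 10),
    "H2": ("ACGTACGTAT", 3),
    "H3": ("ACGAACGTAC", 5),
}

result = pool_after_trimming(toy, start=1, end=9)
for core, info in result.items():
    print(core, info["members"], info["count"])
```

In this toy case H1 and H2 collapse into a single class of 13 individuals, while H3 remains distinct because its mutation lies inside the retained region.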

Figure 3.5. Banding patterns observed for four Cyt b TGGE-OHA gels. Letters indicate gel (A, B, C and D), numbers indicate allele classification based on scored banding pattern. The standard outgroup only homoduplex (Ref) is on the far left.


Molecular techniques - Microsatellite DNA

Isolation of microsatellite loci, primer design and PCR amplification

Nuclear DNA diversity was assessed by screening variation at eight microsatellite loci. While microsatellite loci are considered to be ubiquitous in most eukaryote genomes (Ellegren 2004), they are often located within non-coding DNA regions with high mutation rates, meaning that homology in microsatellite priming sites between species is generally low. This often necessitates isolation of taxon-specific microsatellite regions when examining new taxa (Zane et al. 2002). This proved to be the case for C. striata, as no previously published microsatellite primers were available for this, or any other member of the Channidae.

Isolation of microsatellite DNA involved: 1) fragmenting the C. striata genome, 2) inserting individual fragments into bacterial cells to create a genomic library, 3) screening the genomic library for specific tandem repeat sequences using radioactive probes, and 4) sequencing microsatellite DNA inserts identified in the screening process. For details see Archangi et al. (2009), Chand et al. (2005) and Appendix 7.

Sequences that contained discrete microsatellite repeat regions with large (at least 40 base) flanking regions either side of the repeat were selected for primer design. The online program PRIMER3 (Rozen & Skaletsky 2000) was used to produce a list of primer pairs for amplifying each repeat region. From these pairs, primers were chosen that amplified fragment sizes around the 100-200bp size to facilitate short gel run times and ease of scoring against molecular size standards. Information about the eight primers developed here is presented in Table 3.3. Additional primers not used in this study are presented in Appendix 8.
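The selection criterion described above (a discrete repeat region with at least 40 bases of flanking sequence on either side) can be expressed as a simple sequence scan. The sketch below is illustrative only and is not the procedure used in this study, where primer pairs were produced with PRIMER3; the function name, the minimum of eight repeat units, and the toy sequences are assumptions. It considers dinucleotide repeats only.

```python
import re

def find_dinucleotide_repeats(sequence, min_repeats=8, min_flank=40):
    """Find dinucleotide repeat runs with enough flanking sequence.

    Returns a list of (unit, number_of_repeats, start, end) tuples for runs
    with at least `min_flank` bases on both sides.
    """
    # A two-base unit followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(r"([ACGT]{2})\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence):
        unit = match.group(1)
        if unit[0] == unit[1]:
            continue  # a run such as AAAA... is a homopolymer, not a dinucleotide repeat
        left_flank = match.start()
        right_flank = len(sequence) - match.end()
        if left_flank >= min_flank and right_flank >= min_flank:
            repeats = (match.end() - match.start()) // 2
            hits.append((unit, repeats, match.start(), match.end()))
    return hits

# Toy clone with 50-base flanks around a GT(14) repeat (invented sequence).
good_clone = "ACGATCAGCA" * 5 + "GT" * 14 + "CCATAGGCTA" * 5
# Toy clone whose repeat sits only 12 bases from the start of the insert.
short_flank_clone = "ACGT" * 3 + "CA" * 12 + "CCATAGGCTA" * 5

good_hits = find_dinucleotide_repeats(good_clone)
short_hits = find_dinucleotide_repeats(short_flank_clone)
print(good_hits)
print(short_hits)
```

The reported unit depends on where the run starts, so GT and TG (or CA and AC) describe the same repeat in different phase.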

PCR reaction mix was optimised for each primer set by trialling multiple samples under a range of annealing temperatures and MgCl2 concentrations and visualising products on GELSCAN 3000 polyacrylamide gels. All reactions were 12µL in total and contained 5pmol of each primer, 0.5mM dNTPs (Roche™), 1.2µL 10X PCR Reaction Buffer (Roche™), and 0.5 Units of Taq DNA Polymerase (Roche™). Reactions were performed in an EPPENDORF Mastercycler S under the following cycling conditions: 2min initial denaturation at 94°C; 35 cycles of 15s denaturation at 94°C, 15s annealing at the locus-specific temperature (Table 3.3) and 30s extension at 72°C; final extension ran for 1 min at 72°C.


Table 3.3. Microsatellite primers for C. striata. Repeat motif is reported as observed in sequence used in primer design. Flanking region refers to the length in base pairs of the sequence within the priming region that is not composed of tandem repeats. Size range of PCR products is reported under Allele size, the corresponding number of tandem repeats is reported under Repeat range. For each primer set annealing temperature in degrees Celsius (Ta) and quantity (µL) of 25mM MgCl2 per 12µL were optimised for screening samples on the GELSCAN 3000 System.

Primer  Sequence                              Repeat motif     Flanking  Allele   Repeat  Ta  MgCl2
                                                               region    size     range

Cs-1    5’-GGC AGT GTT CCA CTC CAG TT-3’      GT(11)-GG-GT(6)  138       162-198  12-30   50  0.1
        5’-CCG GGG ATC TTT TCA GTT TT-3’
Cs-2    5’-GGT TAC ACT GCG GGT CAG AG-3’      TG(12)           89        103-131  7-21    56  0.4
        5’-GGA TGG GTC TAA CCT GCC TA-3’
Cs-3    5’-TGC ACT GTT TCT GAC TAA ATG TG-3’  TG(14)           83        95-151   6-34    55  0.4
        5’-TGC CAA ACT AAA CCG ACT TTG-3’
Cs-4    5’-TCG CAG TTT ATG TAC CGA CA-3’      CA(15)           128       146-202  9-37    53  0.4
        5’-CTC CAG GGG AAT TTA CAG CA-3’
Cs-5    5’-AAA CCC AAA AGC CAC ACT TC-3’      CA(14)           125       137-197  6-36    59  0.1
        5’-TGA AAT AGA GCC TGT GAC TGA TG-3’
Cs-6    5’-ACT TGA CAA AAC CTG CCA CA-3’      AC(20)-AT-AC(8)  92        122-172  15-40   59  0.15
        5’-ACT TGT TCT TGG TAG ATG CCA CT-3’
Cs-7    5’-CTG TGT GAA GCA GCG CAT TA-3’      GT(14)           122       142-176  10-27   59  0.1
        5’-GTC CAG TCT AGC AGG AGT AAC GA-3’
Cs-8    5’-CTC CGA GGA TGT GTC TCT CC-3’      GT(9)            121       133-193  6-36    59  0.15
        5’-CTT CAT TTC TCC CCC ACC TT-3’

Mass screening using real-time gel fragment analysis

Variation in microsatellite allele sizes for each locus was assessed by high-density gel electrophoresis on the CORBETT Gel-Scan™ 3000 System (QIAGEN). This system utilises temperature stable high-density vertical polyacrylamide gels to separate single stranded DNA products based on their electrophoretic mobility (sequence length). As PCR fragments migrate through the gel, fluorescently tagged DNA is read by a laser to generate digital gel images. In this study primers were fluorescently tagged with HEX dye (Geneworks). Digital images were then scored using ONE-Dscan 2.05 software (SCANALYTICS) to determine length of alleles (genotype) for each sample.

PCR product was denatured at 95°C in the presence of urea to create single stranded DNA prior to electrophoresis. This product was run through 18.5cm 5% polyacrylamide gels at 1500V to separate alleles based on size. To control for consistency in size scoring across gels, each gel was run with a minimum of 3 molecular size standards (Tamra350, APPLIED BIOSYSTEMS) and also included internal standards for each locus that were created by mixing PCR products representing a range of pre-determined allele sizes. Figure 3.6 illustrates the digital image and gel scoring. The full screening protocol can be found in Appendix 9.

After scoring, data for all loci were collated in spreadsheet format using Microsoft EXCEL, and then checked with MICROCHECKER software Ver2.2.3 (Van Oosterhout et al. 2004) to identify possible genotyping errors. Genotyping errors can result from PCR amplification failure due to mutations at priming sites (null alleles) or preferential short allele amplification (large allele dropout), or from mis-scoring, for example scoring of stutter bands (shorter PCR products formed by polymerase slippage during PCR amplification) or typographic error during data collation.

[Figure 3.6 image: panels (a), (b), (c) and (d). Lane labels in (b): m, m, m, r, m. Annotations in (d): allele, stutter band, heterozygote, homozygote.]

Figure 3.6. Microsatellite Gel images. (a) raw digital image produced on the GELSCAN™ system, (b) area of gel with microsatellite allele banding pattern: m = molecular size standard, r = reference of pre-scored alleles, (c) microsatellite gel after scoring genotypes with ONE-Dscan software, (d) detail from (b) showing the genotypes of two individuals. All images are alleles of Cs-2 locus amplified for C. striata individuals sampled at Tan An, Mekong Delta, Vietnam.


Statistical Analyses

Mitochondrial DNA analysis

Diversity

FINDMODEL (Tao et al. 2009) was used to identify the most appropriate model of evolution for the dataset via the Akaike information criterion (Posada & Buckley 2004), and where possible, all subsequent analyses utilised the closest available evolutionary model. Two approaches were employed to describe the relationship between all C. striata Cyt b haplotypes. Firstly, neighbour joining (NJ) and Bayesian phylogenetic tree building methods were used to construct gene trees of all haplotypes. In both cases a C. marulia Cyt b sequence (AY763771: Ruber et al. 2006) was included to root the tree. The NJ method, which recovers the specific tree topology that minimises the sum of all branch lengths, was implemented in MEGA Version 4 (Tamura et al. 2007; Tamura et al. 2004) under the Maximum Composite Likelihood (MCL) model of evolution (Tamura & Nei 1993; Tamura et al. 2004), and tested with 1,000 bootstrap replicates. Bayesian inference of phylogeny was implemented in MrBAYES Version 3.1.2 (Ronquist & Huelsenbeck 2003). The analysis was run for 20,000,000 generations with a 4by4 nucleotide model, nst of 6, and invariable sites with gamma evolutionary model settings.

A second approach employed a maximum parsimony-median joining (MP-MJ) network to construct an estimate of the evolutionary relationships among haplotypes using NETWORK software Version 4.5.1 (Bandelt et al. 1999), with all characters weighted equally. Among methods of network estimation, the MP-MJ method provides the best estimate of true genealogy, especially when node (internal) haplotypes are absent from the data set (Cassens et al. 2005). The star contraction algorithm (Forster et al. 2001) was applied to the network to identify star-like clusters that may be indicative of historical population expansions (Slatkin & Hudson 1991). Conservatively, the contraction radius (Δ) was set to 1 and the network was subjected to only a single round of contraction. Only star-like clusters whose ancestral state was supported by a wide geographical distribution (Crandall & Templeton 1993; Fedorov et al. 2008) and which represented more than 10% of all samples were retained (Forster et al. 2001). To further elucidate the evolutionary divergence among major clades identified in tree and network analyses, Tamura-Nei (1993) corrected distance and Maximum Composite Likelihood net evolutionary divergence (Da) (Tamura & Nei 1993; Tamura et al. 2004) were calculated among clades, using the software DAMBE Version 5.5.1 (Xia & Xie 2001) and MEGA software respectively.

Three measures of diversity were calculated to describe mtDNA variation at each site, and also within drainages where multiple samples were available. This permitted comparisons of diversity estimates among population samples. The first measure, haplotype diversity (Hd), is the probability that two randomly chosen haplotypes within a sample will be different (Nei 1987). The second measure, nucleotide diversity (π) (Tajima 1983), is the mean number of pair-wise nucleotide differences between all individuals in the sample; this measure is more robust for describing variation where populations possess different sample sizes. Finally, Theta S (θS) was also calculated (Watterson 1975). This measure is based on the number of segregating (polymorphic) sites among haplotypes in a sample, and is therefore useful for indicating which sites represent a mix of individuals with divergent haplotypes. All diversity measures were calculated using ARLEQUIN software Version 3.1 (Excoffier et al. 2005).
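The three measures can be illustrated with a minimal Python sketch. It follows the textbook definitions (Nei 1987; Tajima 1983; Watterson 1975) for a set of aligned sequences of equal length; it is not the ARLEQUIN implementation, and the function names and the toy sequences used to check it are hypothetical.

```python
from itertools import combinations


def haplotype_diversity(seqs):
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies) (Nei 1987)."""
    n = len(seqs)
    counts = {}
    for s in seqs:
        counts[s] = counts.get(s, 0) + 1
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


def mean_pairwise_differences(seqs):
    """Mean number of nucleotide differences over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs)


def theta_s(seqs):
    """Watterson's estimator: segregating sites S divided by a1 = sum_{i<n} 1/i."""
    n = len(seqs)
    segregating = sum(len(set(column)) > 1 for column in zip(*seqs))
    a1 = sum(1 / i for i in range(1, n))
    return segregating / a1
```

Note that the second function returns differences per sequence, matching the definition given in the text; dividing by sequence length would give a per-site value.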

Each sample and drainage (pooled sample) was also tested for deviation from mutation-drift and gene flow-drift equilibrium. Deviation from neutral expectations may be evidence for past demographic events such as population growth or the presence of population sub-structuring within a sample. Statistics for detecting such deviations can be divided into three classes (I, II, and III) based on the information they incorporate (Ramos-Onsins & Rozas 2002). Here, Tajima’s D (Tajima 1989), which compares the number of segregating sites to nucleotide diversity in a sample (Class I), was used to test for deviation from neutrality due to selection, population bottleneck, or admixture (Rand 1996). Fu’s FS (Fu 1997), which compares θ estimated from nucleotide diversity with the expected number of haplotypes under Ewens’ (1972) distribution given the sample size (Class II), was used to detect past fluctuations in population size, and R2 (Ramos-Onsins & Rozas 2002), which uses information from the mismatch distribution of pair-wise differences (Class III), was also employed to examine demographic events. Tajima’s D and Fu’s FS were calculated in ARLEQUIN, and Ramos-Onsins and Rozas’ R2 was calculated in DNASP Version 5 (Librado & Rozas 2009). Significance values for all three tests were calculated using coalescent simulations implemented in DNASP (given θ, with 1,000 replicates for each simulation), and adjusted for family-wise error rate using the False Discovery Rate Procedure (Benjamini & Hochberg 1995; Verhoeven et al. 2005). In addition, FS and R2 were calculated for each clade independently to look for evidence of clade specific demographic expansion. Where R2 reflected a distribution expected under the population expansion model (Rogers & Harpending 1992), τ was calculated to investigate the length of time since population expansion [mutational units of time (τ) = 2ut, where t is time (in generations) since population expansion and u is the mutation rate of the whole fragment (Rogers & Harpending 1992)]. τ was calculated using DNASP software.
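The relationship τ = 2ut can be rearranged to t = τ / (2u). The short Python sketch below shows the arithmetic only; the function name is hypothetical, and any τ value or per-site mutation rate supplied to it would be an assumption of the user rather than a value estimated in this study. The whole-fragment rate u is the per-site rate multiplied by the fragment length (570bp for the Cyt b fragment analysed here).

```python
def generations_since_expansion(tau, per_site_rate, fragment_length):
    """Solve tau = 2*u*t for t, where u = per-site rate * fragment length.

    per_site_rate is the mutation rate per site per generation; the result
    is expressed in generations.
    """
    u = per_site_rate * fragment_length  # mutation rate of the whole fragment
    return tau / (2.0 * u)
```

Multiplying the result by generation time in years would convert it to calendar time.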

Population structure

To examine the level of differentiation among samples and to investigate spatial population structuring, pair-wise ΦST analysis (Excoffier et al. 1992), which partitions genetic variation within and among sites, was performed in ARLEQUIN. The analysis used Slatkin’s linearized measure of genetic distance (Slatkin 1991) with the Tamura and Nei (1993) distance method, and significance of each pair-wise comparison was tested with a nonparametric permutation procedure (incorporating 10,000 iterations). Significance values were adjusted to account for family-wise error rate using the False Discovery Rate Procedure (FDR) (Benjamini & Hochberg 1995; Verhoeven et al. 2005), which accounts for the increased risk of type 1 error (finding a significant result by chance) when multiple statistical tests are performed (Verhoeven et al. 2005). Spatial analysis of molecular variance (SAMOVA) (Dupanloup et al. 2002) was undertaken to identify groups of samples that were most similar. In this analysis a simulated annealing approach was applied to maximise the proportion of genetic variation between groups of samples (ΦCT), incorporating information on haplotype divergence and geographical proximity. Data were forced into k groups (where k = 2-28) to identify the combination of sites that resulted in the highest ΦCT, and hence to identify the strongest signature of population structure (Dupanloup et al. 2002). Spatial data were entered as geographical co-ordinates (latitude and longitude) to account for the possible modes of dispersal for this species (overland or via permanent freshwater connection).
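For readers unfamiliar with the FDR adjustment, the step-up procedure of Benjamini & Hochberg (1995) can be sketched in a few lines of Python. This is an illustrative sketch only; the function name and the threshold q = 0.05 are assumptions, not settings reported for this study.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the test is significant at FDR q.

    P-values are ranked from smallest to largest; the largest rank i with
    p(i) <= (i / m) * q is found, and all tests up to that rank are rejected.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant
```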

To further test for significant spatial structuring in patterns of genetic differentiation, a Mantel’s test for isolation by distance (IBD) was conducted. Straight line distances for each pair-wise site comparison were log transformed prior to correlation with ΦST values. Calculations were performed in ARLEQUIN, with significance calculated from 10,000 permutations. Mantel’s tests were applied to the whole data set, and then to mainland SE Asian sites, excluding the southern Indian site (K) and the Sumatran site (LP). These sites are isolated from mainland SE Asian sites by extensive expanses of agricultural land (>1,500km) and marine environment (Strait of Malacca) respectively, and so neither site is likely to contribute contemporary migrants to mainland SE Asian sites. Sites in the Lower Mekong Basin were also tested for IBD independently, as this represents a vast area of suitable habitat with no obvious barriers to dispersal, and so sites may be less influenced by potential in-stream barriers including rapids (e.g., the Khone Falls), which could mask IBD by establishing high genetic gaps over short geographical distances.
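The logic of the Mantel test can be sketched as follows. This Python example is a simplified, one-tailed permutation test written for illustration; it is not the ARLEQUIN routine, and the function names, the default seed and the one-tailed p-value convention are assumptions. Sites of one matrix are shuffled while the log-transformed distance matrix is held fixed.

```python
import math
import random


def upper_triangle(matrix):
    """Return the above-diagonal entries of a square matrix, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(genetic, distance_km, permutations=10000, seed=1):
    """Correlate genetic distance with log geographic distance.

    Returns (r, p), where p is the proportion of site permutations giving a
    correlation at least as large as the observed one.
    """
    n = len(genetic)
    log_dist = [math.log(d) for d in upper_triangle(distance_km)]
    r_obs = pearson(upper_triangle(genetic), log_dist)
    rng = random.Random(seed)
    order = list(range(n))
    as_extreme = 0
    for _ in range(permutations):
        rng.shuffle(order)  # relabel sites, permuting rows and columns together
        permuted = [[genetic[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(permuted), log_dist) >= r_obs:
            as_extreme += 1
    return r_obs, (as_extreme + 1) / (permutations + 1)
```

Rows and columns are permuted together because pair-wise distances are not independent observations, which is the reason an ordinary correlation test is not used.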

Nuclear DNA analysis

Diversity

Raw microsatellite data were summarised into allele frequencies for each locus at each site using the software CONVERT (Glaubitz 2004). This software was also used to write data files for subsequent analyses using other statistical software packages. To enable comparison of diversity across sites, allelic richness (A) was calculated for each locus at each site in the software program FSTAT Version 2.9.3.2 (Goudet 1995), which corrects for different sample sizes using rarefaction (Leberg 2002; Petit et al. 1998) (in this study the smallest sample was 5 for site SM at the Cs-8 locus).
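Rarefaction can be illustrated with a short Python sketch of the standard formula for the expected number of alleles in a random subsample of g gene copies. The function name and the toy allele counts used to check it are hypothetical, and the sketch is not the FSTAT code.

```python
from math import comb


def allelic_richness(allele_counts, g):
    """Expected number of distinct alleles in a subsample of g gene copies.

    allele_counts holds the number of copies of each allele observed at one
    locus in one sample. Each allele contributes the probability that it
    appears at least once among g copies drawn without replacement.
    """
    total = sum(allele_counts)
    if g > total:
        raise ValueError("subsample size exceeds the number of gene copies")
    return sum(1 - comb(total - c, g) / comb(total, g) for c in allele_counts)
```

For diploid genotypes, a smallest sample of 5 individuals would correspond to g = 10 gene copies under this formulation.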

To test for presence of significant associations between alleles across microsatellite loci (i.e., linkage), the likelihood-ratio test of linkage disequilibrium (Slatkin & Excoffier 1996) was conducted for all pair-wise locus combinations for all sites using the EM algorithm, as implemented in ARLEQUIN with 10,000 permutations. An exact test was used to determine the statistical significance of the magnitude of linkage disequilibrium. Detection of significant linkage disequilibrium between loci may be evidence for physical proximity on a chromosome, or indicate that the sample is not representative of a population at Hardy-Weinberg Equilibrium (HWE). A sample may show HW disequilibrium if it constitutes a mix of populations (stocks), cryptic species, or individuals that are not mating at random.

ARLEQUIN was also used to perform exact tests (Guo & Thompson 1992) to check data for each locus for deviation from HWE expectations at each site. Deviation from HWE is important to detect as equilibrium is an inherent assumption for population genetic analysis, and deviation may have important biological implications, such as non-random mating within the sample. More generally, testing for deviation from population mutation-drift equilibrium can reveal historical demographic events. Two analyses were performed for each site to specifically investigate recent population demography. The first analysis assessed deviation from HWE against a distribution obtained from coalescent simulations under a microsatellite model of mutation using a Wilcoxon’s test (2-tailed). This analysis was performed with BOTTLENECK software Version 1.2.02 (Piry et al. 1999) under the two phase model (TPM) (incorporating 95% stepwise mutation and 5% infinite allele model) with variance of 12 and 100,000 iterations. Significance values for the linkage, exact HWE and Wilcoxon’s tests were adjusted following the FDR procedure.

Secondly, the Garza-Williamson Index (M: essentially the number of alleles divided by the allelic range) (Garza & Williamson 2001) was calculated in ARLEQUIN. This statistic, which can be averaged across loci, ranges from 0 to 1, with a value of 1 indicating historically stationary (stable) population size, and very small values indicating a recent reduction in population size (bottleneck). As a general rule, if M < 0.68, it can be assumed that there has been a recent reduction in population size (given a data set of seven or more loci) (Garza & Williamson 2001).
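A minimal Python sketch of the index for a single locus is given below. It assumes that M is the number of distinct alleles divided by the number of possible allelic states between the smallest and largest allele (range in repeat units plus one), and that alleles differ by whole dinucleotide repeats; the function name and default repeat length are hypothetical.

```python
def garza_williamson_m(allele_sizes_bp, repeat_length=2):
    """M = k / (r + 1) for one locus.

    k is the number of distinct alleles observed and r is the allelic range
    expressed in repeat units, so r + 1 is the number of allelic states that
    lie between the smallest and largest allele inclusive.
    """
    sizes = sorted(set(allele_sizes_bp))
    k = len(sizes)
    states = (sizes[-1] - sizes[0]) // repeat_length + 1
    return k / states
```

A locus with every intermediate allele present returns 1, whereas gaps in the allele size distribution, as expected after a bottleneck, lower the value.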

Population structure

Three pair-wise estimators of differentiation were calculated to assess levels of population sub-division among sample sites. FST and RST analogues (Slatkin 1991, 1995) were calculated in ARLEQUIN with 10,000 permutations, with significance values corrected using the FDR procedure. FST considers the number of alleles and their frequency, and is appropriate for describing differentiation when sub-populations are weakly structured, while RST estimates, which account for microsatellite size variation under the stepwise mutation model, outperform FST when structure is more pronounced (and migration lower) (Balloux & Goudet 2002). While both FST and RST analyses are used commonly to assess population differentiation, it has increasingly been recognised that both methods fail to recover the “true” magnitude of differentiation when diversity and/or differentiation is high (Balloux et al. 2000; Jost 2008). To avoid this bias, a third measure of differentiation, Jost’s Dest (estimator of actual differentiation) (Jost 2008), which avoids the statistical problems inherent in fixation indices, was calculated for each locus using SMOGD software (Crawford 2010), and data were combined across all loci by taking the harmonic mean approximation proposed by Chao (2009) (http://www.ngcrawford.com/django/jost/). FST and RST values were plotted against Dest to assess the similarity of estimates qualitatively for each pair-wise comparison.
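The contrast between a fixation index and Jost’s measure can be illustrated with a small Python sketch. It computes the uncorrected D of Jost (2008) and GST directly from allele frequencies at a single locus, with equal weighting of subpopulations; it omits the small-sample bias correction that distinguishes Dest, and the function names are hypothetical.

```python
def heterozygosities(pop_freqs):
    """Return (HS, HT) for a list of allele-frequency dicts, one per subpopulation."""
    n = len(pop_freqs)
    alleles = set().union(*pop_freqs)
    # HS: mean expected heterozygosity within subpopulations
    hs = sum(1 - sum(p.get(a, 0.0) ** 2 for a in alleles) for p in pop_freqs) / n
    # HT: expected heterozygosity of the pooled (mean) allele frequencies
    ht = 1 - sum((sum(p.get(a, 0.0) for p in pop_freqs) / n) ** 2 for a in alleles)
    return hs, ht


def jost_d(pop_freqs):
    """D = [(HT - HS) / (1 - HS)] * [n / (n - 1)] (Jost 2008, uncorrected)."""
    n = len(pop_freqs)
    hs, ht = heterozygosities(pop_freqs)
    return (ht - hs) / (1 - hs) * n / (n - 1)


def gst(pop_freqs):
    """GST = (HT - HS) / HT, a fixation index for comparison."""
    hs, ht = heterozygosities(pop_freqs)
    return (ht - hs) / ht
```

For two subpopulations that share no alleles but each carry two equally frequent alleles, D equals 1 whereas GST is only one third, which is the behaviour at high diversity that motivates the use of Jost’s measure.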

Two visual methods were adopted to illustrate the magnitude and pattern of differentiation among samples. Firstly, Dest values were used to construct a population tree using the NJ method implemented in MEGA Version 4 to visualise the relationship among sites. An alternative approach, factorial correspondence analysis (FCA) (Benzécri 1973), was used to plot all individuals in p-dimensional hyperspace (where each allele defines a dimension) and to determine each sample’s centre of gravity (midpoint of all individuals) within the hyperspace. This was then displayed as plots of all individuals and of all populations in 3-dimensional space with axes corresponding to the three factors that account for the most variation within the data set (Jombart et al. 2009). FCA was undertaken using GENETIX software Version 4.05.2 (Belkhir et al. 1996-2004).

To examine the spatial distribution of genetic diversity quantitatively, the significance of the correlation between FST and straight line distance (log transformed) was tested using a Mantel’s test (10,000 permutations), performed in ARLEQUIN. To further test for patterns in the spatial arrangement of genetic similarity, genetic distance estimates among sites within the Mekong River Basin were mapped onto the stream sections that connect them, following the Stream Trees method proposed by Kalinowski et al. (2008), which uses least squares (regression) analysis to fit genetic distance to river sections. Unlike straight tests for IBD, the Stream Trees model allows individual sections in a drainage network to represent different degrees of genetic distance, allowing for the possibility that in-stream barriers create large genetic gaps over short geographical distances. Similar to the predictive Stream Hierarchy Model (Meffe & Vrijenhoek 1988), the Stream Trees model assumes that, in the absence of headwater exchange, large fluctuations in population size or differential life history traits, genetic difference will accumulate across a drainage network (Kalinowski et al. 2008). Regression was performed in the STREAM TREES software package (Kalinowski et al. 2008) using Dest distance estimates.

Finally, two clustering methods were used to assign individuals into relatively homogenous groups. In the first method, individuals were assigned to populations using a Bayesian model-based clustering approach that classifies individuals into K groups, optimising HWE and LE for each group (Guillot et al. 2009; Pritchard et al. 2000). Clustering was applied to the full data set, and also independently across a reduced data set constituting a subset of sites where one site was not significantly differentiated from neighbouring sites, but showed strong evidence for HW and linkage disequilibrium. The analysis was performed at the smaller scale because, in simulations, Bayesian clustering has been shown to detect only the uppermost hierarchical level of population structuring when multiple layers of sub-structuring are present (Evanno et al. 2005), and hence analysis of smaller scale (geographically localised) data sets may avoid including individuals from top tiers in the hierarchical structure, the presence of which would obscure detection of fine scale sub-structuring.

Bayesian clustering was implemented using the program STRUCTURE Version 2.2 (Falush et al. 2003, 2007; Pritchard et al. 2000), with the majority of analyses performed on the CBSU bioHPC facility (http://cbsuapps.tc.cornell.edu/structure.aspx). Analyses assumed an admixture model, with a constant λ of 1.0, and were run for an MCMC chain length of 1,000,000 following a burn-in of 500,000 (these values were chosen after preliminary analysis indicated that summary statistics [FST, α, likelihood, and log probability (LnP(D))] had stabilised after this run length). For analysis of the reduced data set, 20 replicates of each K from 2 to 2n (n=4) were performed. To reduce computational time for the full data set, only 10 replicates of each K from 2 to 2n (n=19) were performed.

Following advice from Evanno et al. (2005), to determine the optimal number of groups, LnP(D) values for each K across all replicate runs were collated and the rate of change in LnP(D) was assessed by plotting ΔK (Evanno et al. 2005) using EXCEL and SIGMAPLOT software. The analysis in which K corresponded with the highest peak in the distribution of ΔK was then considered to be the best estimate of population groupings, and the probabilities of assignment to this number of individual clusters were aligned across replicates using the program CLUMPP Version 1.1.2 (Jakobsson & Rosenberg 2007b), using the appropriate algorithm determined by calculation of D for each method as defined in the user manual (Jakobsson & Rosenberg 2007a) (the “Greedy” algorithm in each case). Results were visualised using DISTRUCT version 1.1 (Rosenberg 2004).
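The ΔK calculation can be sketched in Python as follows. The sketch uses one common reading of Evanno et al. (2005), in which the absolute second-order difference of the mean LnP(D) across replicates is divided by the standard deviation of LnP(D) at that K; the function name and input layout are hypothetical, and this is not the spreadsheet procedure used here.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: deltaK} for every K that has both K-1 and K+1 available.

    lnp_by_k maps each K to the list of LnP(D) values from replicate runs.
    deltaK = |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / sd[L(K)].
    Replicates at a given K must not all be identical (sd would be zero).
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / stdev(lnp_by_k[k])
    return result
```

The K with the largest ΔK can then be taken with `max(result, key=result.get)`. Because the statistic needs values on both sides, ΔK is undefined for the smallest and largest K tested.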

The second clustering method was employed specifically to examine the extent of possible introgression between the two lineages earlier identified in the phylogenetic analysis (Chapter 2). In this approach, individual genotypes were assigned to one of two clusters (K = 2), following the a priori assumption that introgression has occurred between two C. striata groups, previously referred to as the EA lineage and the MM lineage (see Figure 2.3). This analysis was implemented using FLOCK software (Duchesne & Turgeon 2009). The non-Bayesian approach adopted in FLOCK attempts to group contemporary admixed individuals “along their ancestral differentiation lines”, and has been shown to perform better than the Bayesian algorithm implemented in STRUCTURE when pure genotypes (non-introgressed individuals) are absent from the data set (Duchesne & Turgeon 2009). After all individuals have been assigned to ancestral clusters, it is possible to re-allocate individuals following multilocus maximum likelihood (Paetkau et al. 1995). For each individual, log-likelihoods were estimated for membership to each of K reference clusters (K = 2). The difference in log-likelihood allocation (LLOD score) was then calculated for each individual, by subtracting the log-likelihood of membership to reference 2 from the same value for reference 1. In this way, it follows that individuals with a high probability of membership to reference 1 and low probability of membership to reference 2 will have large positive LLOD scores, and conversely, a large negative LLOD score indicates a high probability of membership to reference 2. Individuals that have roughly equal likelihoods of membership will have LLOD scores close to zero, and are interpreted as admixed.
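The LLOD calculation can be illustrated with a minimal Python sketch. It assumes Hardy-Weinberg proportions within each reference cluster and independence among loci, uses base-10 logarithms, and replaces allele frequencies of zero with a small floor value so that the logarithm is defined; these choices, the function names and the floor of 0.01 are assumptions of the sketch and are not taken from FLOCK.

```python
import math


def genotype_log_likelihood(genotype, cluster_freqs, floor=0.01):
    """Log10 likelihood of a multilocus genotype in one reference cluster.

    genotype is a list of (allele_1, allele_2) tuples, one per locus;
    cluster_freqs is a list of dicts mapping allele to frequency, one per locus.
    """
    log_likelihood = 0.0
    for (a, b), freqs in zip(genotype, cluster_freqs):
        p = max(freqs.get(a, 0.0), floor)
        q = max(freqs.get(b, 0.0), floor)
        # homozygote: p^2; heterozygote: 2pq
        log_likelihood += math.log10(p * p if a == b else 2 * p * q)
    return log_likelihood


def llod(genotype, reference_1, reference_2):
    """Log-likelihood for reference 1 minus log-likelihood for reference 2."""
    return (genotype_log_likelihood(genotype, reference_1)
            - genotype_log_likelihood(genotype, reference_2))
```

The mean of these scores over the individuals at a site gives the per-sample value used to map admixture.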

Here, mean LLOD scores for each sample (representing the mean of differences in log-likelihood allocation to the reference group for each individual) were plotted to generate an admixture map for C. striata sample sites. This approach pinpoints sites that may represent hybridisation zones between the two lineages. In addition, LLOD scores were plotted for each individual that had also been genotyped for mtDNA to examine the level of nuclear introgression between mitochondrial clades.

83

Chapter 3.

RESULTS

MtDNA diversity and phylogeography

A survey of 988 C. striata individuals collected from 28 sites identified a total of 70 unique haplotypes for the 570bp Cyt b mtDNA fragment. Among haplotypes, 96 sites were variable (Table 3.4), with 12 representing non-synonymous mutations. Frequencies of haplotypes for each site are listed in Table 3.5. Of the 70 haplotypes observed, 41 (accounting for 17.31% of all individuals sampled) were detected at only a single site (private haplotypes), and of these 23 haplotypes were detected in only a single individual (singletons).
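The definitions of private haplotypes and singletons used above can be made explicit with a short Python sketch. The function name and the nested-dictionary input (site, then haplotype, then count) are hypothetical, and the sketch operates on counts rather than the frequencies reported in Table 3.5.

```python
def private_and_singleton(counts_by_site):
    """Return (private, singletons) as sets of haplotype labels.

    A private haplotype occurs at exactly one site; a singleton occurs in
    exactly one individual across the whole data set.
    """
    sites_with = {}
    totals = {}
    for site, haplotypes in counts_by_site.items():
        for haplotype, count in haplotypes.items():
            if count > 0:
                sites_with.setdefault(haplotype, set()).add(site)
                totals[haplotype] = totals.get(haplotype, 0) + count
    private = {h for h, sites in sites_with.items() if len(sites) == 1}
    singletons = {h for h, total in totals.items() if total == 1}
    return private, singletons
```

Every singleton is necessarily private, which is why singletons are reported as a subset of private haplotypes.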

Phylogeography

All 70 haplotypes could be assigned to the three distinct lineages detected in the phylogenetic study (Figure 3.7). The West Asian lineage (WA) was represented by a single haplotype detected in all three individuals surveyed from southern India (haplotype 65). Five haplotypes were classified as belonging to the Middle Mekong lineage (MM) (haplotypes 66-70), which were found in 51 individuals from 6 sites in the Mid-upper Mekong River Basin (sites SB, VV, VT, SM, KJ and ST, Figure 3.3). The majority of individuals assessed for Cyt b (83.5%) possessed haplotypes from the East Asian lineage (EA) (haplotypes 1-64), which were found across all eastern sampling sites except for SM and VV (n=2 for each). Within the EA group, all samples from Sumatra (site LP, haplotypes 60-64) and samples from Malaysia (site TK, haplotypes 49 and 50) formed distinct clades (see Figure 3.7).

The relationship between haplotypes is also illustrated in Figure 3.8. The EA Cyt b clade was the most diverse, with a maximum of 16 mutations present among the 64 haplotypes. The other eastern clade, MM, was detected less often in sampling, and was less diverse, with a maximum of 5 mutations among MM haplotypes. The MM clade was highly divergent from the widespread EA clade, with a minimum of 39 base pair mutations present between haplotypes from each clade, representing a minimum divergence of 7.45% corrected distance (Tamura-Nei 1993), with net evolutionary divergence between groups of 0.071 (SE = 0.013) (MCL). The WA haplotype is more closely related to the EA clade than to the MM clade, with minimum divergences of 3.44% (19 mutations) and 6.02% (31 mutations) corrected distances, respectively. This pattern of divergence between major clades indicates that the split between the common ancestors of the MM clade and other C. striata populations is likely to pre-date divergence between the WA and EA lineages, despite the current occurrence of MM and EA clades in sympatry at four sites in the Mekong River Basin (sites SB, VT, KJ and ST).

Table 3.4. Variable sites for 70 haplotypes of 570 nucleotide bases of Cyt b for C. striata.

                     11111111111111112222222222222222222223333333333333333444444444444455555555555555555
           233446789900112223334677790011123344455566778990112244555777889001455667888900112233444555667
        239739283840925676792592512721703624713635817365079581725157238473251139025678917363547369289140
Hap 1   GTACATCTCGCCATAACAATTCCGTATCTCTTTCTTAGCTCAAACCCAATCCCCCCATATTTCAATGTCGTTCTATGCCGGCATAACCCCCCCATG
Hap 2   ...... G......
Hap 3   ...... G...... A..G......
Hap 4   ...... T...... T...... G...... C......
Hap 5   ...... T...... G...... A......
Hap 6   ...... T...... G...... T...... A......
Hap 7   ...... T...... G...... G...A......
Hap 8   ...... T...... G.G...... G...A......
Hap 9   ...... G...... T...... G...... G...A......
Hap 10  ...... T...... G...... A......
Hap 11  ...... T...... G...... T.....
Hap 12  ...... T...... T...... G...... T.....
Hap 13  ...... T...... G...... T..G..
Hap 14  ...... T...... G...... T...
Hap 15  ...... G...T...... G...... A...... T...
Hap 16  ...... T....T...... G......
Hap 17  ...... A...... T...... G...... G......
Hap 18  ...... A...... G......
Hap 19  ...... A...... G..C......
Hap 20  ...... G..C......
Hap 21  ...... C...... G..C......
Hap 22  ...... G..C...... A......
Hap 23  ...... CG..C......
Hap 24  ...... A...... G.CC......
Hap 25  ...... A...... G......
Hap 26  ...... G......
Hap 27  ...... G...... C......
Hap 28  ...... C...... G......
Hap 29  .C...... G......
Hap 30  ...... G...... A
Hap 31  ...... T...... G......
Hap 32  ...... C...... G......
Hap 33  ...... C...... T...... G......
Hap 34  ...... G...... T......
Hap 35  ...... G...... A......
Hap 36  ...... G...... G......
Hap 37  ...... A...... G......
Hap 38  ...T...... G......
Hap 39  ...... G...... G......
Hap 40  ...... C...... G......
Hap 41  ...... C...... G......
Hap 42  ...... A...... T...G......
Hap 43  ...... A...... A...... T...G......
Hap 44  ...... A...... T...... T...G......
Hap 45  ...... A...... T.....T..T...G......
Hap 46  ...... A...... T...... T...G...... T......
Hap 47  ...... A...... T...... T...G...... A......
Hap 48  ...... A...... T...... T...G...... A......
Hap 49  ...... Ac...... T...... T...... T..tG...... C......
Hap 50  ...... A...... G...... T...... T...... T..tG...... C......
Hap 51  ...... A...... G...... C...... T...... TT..G...... G.....A.A.....c......
Hap 52  ...... A...... G...... C...... T...... TT..G...... G......
Hap 53  ...... A...... G...... C...... T...... TT..G......
Hap 54  ...... A...... G...... C...... T...... TT..G...... C......
Hap 55  ...... A...... G...... G...... C...... T...... TT..G......
Hap 56  A...... A...... G...... G...... C...... T...... TT..G...... C......
Hap 57  ...... A...... G...... G...... C...... T...... TT..G...... A...C......
Hap 58  ..G...... A...... G...... T...... T...G...... C...... c......
Hap 59  ..G...... A...... G...... T...... T...G...... T......
Hap 60  ...... A...C...... CG...... C...... TA...... T...G....G...... T...... c......
Hap 61  ...... A...C...... CG...... C...... T..G.....T...G....G...... T......
Hap 62  ...... A...C...... CG...... C...... TA.G.....T...G....G...... T......
Hap 63  ...... TA...... C...... T.T....T...... T...G....G...... C.....A.T......
Hap 64  ...... TA...... C...... T.T....T...... T...G....G...... C...... T......
Hap 65  .....C..T...... C.T..C...... T.C....T.GG.T.C...... T...... C...A...... TAA.T...... T....
Hap 66  .C...CT..AT.G...TC.C.T..C..T..CC...C...CT.GGTT.C.C..TTT..C.CC.TGGC.C..CC.C.....AA...... T......
Hap 67  .C..GCTC.AT..T..TC.C.T..C..T..C....CG..CT.GGTT.c.C..TTT..C.CC.TGGC.C..CC.C.....AA.A....T...... C.
Hap 68  .C..GCT..AT.....TC.C.T..C..T..C....CG..CT.GGTT.C.C..TTT..C.CC.TGGC.C..CC.C.....AA...... T...... C.
Hap 69  .C..GCT..AT.....TC.CAT..C..T..C....CG..CT.GGTT.C.C..TTT..C.CC.TGGC.C..CC.C.....AA...... T...... C.
Hap 70  .C...CT..AT.....TC.C.T..C..TC.CC.T.C...CT.GGTT.C.C..TTT..C.CC.TGGC.C..CC.C.....AA...... T...... C.


Table 3.5. C. striata mitochondrial Cyt b haplotype frequencies for total sample and for each site individually (n = 988)

site              K  CM CP TK LP SB VV VT SM MD KJ NE GL LL ST KK SN ME KC PA KH PS BB TP TT TC VL PH
clade haplotype n 3  10 49 30 7  28 2  42 2  21 10 50 39 40 47 49 50 39 50 49 51 56 50 49 42 53 28 42

EA 1 0.013 0.04 0.10 0.02 0.02 0.10

EA 2 0.005 0.09

EA 3 0.001 0.02

EA 4 0.008 0.08 0.02 0.02 0.02 0.02

EA 5 0.007 0.06 0.08 0.02

EA 6 0.001 0.02

EA 7 0.027 0.67 0.04

EA 8 0.001 0.02

EA 9 0.009 0.23

EA 10 0.001 0.03

EA 11 0.007 0.02 0.04 0.10

EA 12 0.004 0.02 0.04 0.02

EA 13 0.001 0.02 0.00

EA 14 0.007 0.06 0.08

EA 15 0.010 0.02 0.02 0.00 0.19

EA 16 0.005 0.02 0.04 0.07

EA 17 0.001 0.03

EA 18 0.001 0.02

EA 19 0.004 0.08

EA 20 0.076 0.20 0.28 0.36 0.20 0.18 0.05 0.14 0.10 0.06 0.02 0.10

EA 21 0.006 0.06 0.05 0.02

EA 22 0.021 0.23 0.16

EA 23 0.005 0.11

EA 24 0.003 0.02 0.04

EA 25 0.001 0.02

EA 26 0.328 0.14 0.20 0.08 0.51 0.67 0.48 0.56 0.58 0.76 0.39 0.61 0.62 0.41 0.77

EA 27 0.021 0.50

EA 28 0.001 0.02

EA 29 0.036 0.61 0.45

EA 30 0.001 0.02

EA 31 0.003 0.03 0.02 0.02

EA 32 0.001 0.02

EA 33 0.001 0.02

EA 34 0.002 0.05

EA 35 0.079 0.86 0.30 0.64 0.12 0.33 0.04


EA 36 0.002 0.02 0.03

EA 37 0.004 0.04 0.04

EA 38 0.002 0.02 0.02

EA 39 0.002 0.04

EA 40 0.001 0.02

EA 41 0.043 0.98 0.02 0.02 0.02

EA 42 0.009 0.32

EA 43 0.013 0.46

EA 44 0.012 0.70 0.14 0.02

EA 45 0.001 0.10

EA 46 0.002 0.04

EA 47 0.001 0.10

EA 48 0.032 0.65

EA 49 0.029 0.97

EA 50 0.001 0.03

EA 51 0.004 0.08

EA 52 0.012 0.10 0.22

EA 53 0.013 0.12 0.05 0.08 0.02

EA 54 0.002 0.05

EA 55 0.001 0.02

EA 56 0.040 0.12 0.29 0.02 0.29 0.31

EA 57 0.007 0.04 0.06 0.02 0.02

EA 58 0.002 0.05

EA 59 0.001 0.03

EA 60 0.002 0.29

EA 61 0.001 0.14

EA 62 0.002 0.29

EA 63 0.001 0.14

EA 64 0.001 0.14

WA 65 0.003 1.00

MM 66 0.005 0.10 0.50

MM 67 0.001 0.50

MM 68 0.035 0.07 0.69 0.30 0.02

MM 69 0.009 0.19 0.50

MM 70 0.001 0.50


[Figure 3.7 image: neighbour-joining tree of haplotypes 1-70. Labelled groups: EA (including the Malaysia and Sumatra haplotype groups), WA (India) and MM. Scale bar: 0.02.]

Figure 3.7. Neighbour joining tree for C. striata mtDNA Cyt b haplotypes. Selected NJ bootstrap support values presented above the line, Bayesian posterior probabilities below the line. Three major clades indicated by grey vertical bars: EA = East Asian, WA = West Asian, and MM = Middle Mekong.


Figure 3.8. Median-Joining Network of C. striata Cyt b haplotypes. Open circles represent individual haplotypes, and the size of each circle indicates relative frequency (see Table 3.5 for absolute frequencies). Lines between haplotypes represent single mutational changes; where additional mutational changes have occurred, small black circles represent hypothetical intermediate haplotypes not detected in sampling. Dotted lines show alternative connections. Three clades identified from phylogenetic analysis: EA = East Asian, WA = West Asian, and MM = Middle Mekong.


Figure 3.9 presents the geographical distribution of the three major C. striata Cyt b mtDNA clades, illustrating that the two eastern lineages, EA and MM, occur together in the upper-mid Mekong River Basin. The EA lineage is distributed across SE Asia, including the island of Sumatra.

In Figure 3.10 the MJ network has been colour-coded by sections of drainage basins, to illustrate the diversity and to describe the genetic relationships found within and across four sample drainages in SE Asia. The single sites from Malaysian and Indonesian drainages (coloured blue and green) have unique sets of haplotypes. In both cases, haplotypes detected in these drainages grouped together (Figure 3.7) and these clades formed tips in the MJ network (Figure 3.8 and Figure 3.10). Tip haplotypes are generally considered under coalescent theory to be recently evolved (Castello & Templeton 1994; Crandall 1996), representing new genetic variants that have not had sufficient time to disperse widely across the distribution at large. Haplotypes found in Indonesia represent two tip lineages that are as closely related to each other as they are to an internal haplotype (haplotype 44; coloured purple and red) that was found in the northern Chao Phraya and the upper Mekong. Haplotype 44 was also the most closely related haplotype to Malaysian haplotypes.

Haplotypes, such as haplotype 44, that have internal positions in a network are generally considered ancestral types (Castello & Templeton 1994; Crandall & Templeton 1993; Donnelly & Tavaré 1986), especially when their distributions are widespread (as in this case across two drainages). Therefore, it is likely that both the Malaysian and two Indonesian lineages (haplotypes 60-62 and 63-64) identified here represent descendants of separate divergence events from this ancestral EA haplotype.

Along with haplotype 44, the Chao Phraya drainage in Thailand (coloured purple) has 6 unique haplotypes. Four of these haplotypes (haplotypes 45-48) appear to have evolved directly from the ancestral haplotype 44. Fifteen of the sixteen individuals with the other two unique Chao Phraya haplotypes (haplotypes 51 and 52) were detected in the Lower Chao Phraya at Saraburi (site CP). These haplotypes were most closely related to haplotype 53, which is found in the lower Mekong Drainage at sites in southern and western Cambodia, along with four other related haplotypes (haplotypes 54-57, yellow in Figure 3.10). All have distributions restricted to the lower Mekong south of Kratie (site KK, on the main Mekong channel in central Cambodia above the Mekong-Tonle Sap confluence; Figure 3.3).


Figure 3.9. (a) Map of southern Asia showing the broad geographical distributions of the three Cyt b mtDNA clades, and (b) Median-Joining network illustrating the relationship between clades.

Figure 3.10. (a) Median-Joining network of C. striata Cyt b mtDNA haplotypes, with colours indicating section of drainage basin where haplotypes were detected, (b) Map of SE Asia with drainage basin sections indicated by colours.


In total, 56 haplotypes were detected from sites in the Mekong River Basin. Sites in this drainage were divided into three groups in Figure 3.10 (red, yellow and orange) to reflect their genetic compositions. Upper Mekong sites (sites SB, VV, VT, and SM; coloured red) had the highest frequency of MM clade haplotypes, which represented 69% of all C. striata sampled in this river section and comprised fifty-one of the fifty-four individuals from this clade sampled across the entire drainage. This Upper Mekong section was also the only region within the Mekong Basin where haplotype 44 was detected, as well as two other unique haplotypes (haplotypes 42 and 43, found exclusively at site SB, the most upstream Mekong Basin sampling site). Sayaburi (SB) had a much lower frequency of MM individuals (~7%, n = 28) than the other three sites in the Upper Mekong group, which had MM frequencies of 100% (n = 2), 97% (n = 42) and 100% (n = 2) for sites VV, VT and SM, respectively.

The suite of haplotypes detected at Kontum (site GL) in the Vietnamese highlands represents a distinct component of the Mekong River Basin C. striata mtDNA diversity, and so this river section (coloured orange in Figure 3.10) appears to be independent from other downstream Mekong Basin sample sites. All haplotypes in GL belong to the EA Clade; however, they are not all closely related to each other within the clade. For example, haplotypes 58 and 59 (coloured orange to the right of Figure 3.10a) are more closely related to Upper Mekong (red), Chao Phraya (purple) and even to Malaysian (blue) haplotypes than to other GL types (haplotypes 7, 9, and 17 to the left of Figure 3.10a). In addition, four of five GL haplotypes are unique to this site, forming tips in the MJ network.

The remaining 18 sites in the Middle and Lower Mekong River Basin were grouped together in Figure 3.10 (coloured yellow) because, at these sites, haplotypes formed star-like clusters in the MJ network. A total of twenty-one haplotypes were considered to belong to star-clusters (haplotypes 19-23 and 26-41), illustrated in Figure 3.11a. These haplotypes were found in a total of 632 individuals in the Middle and Lower Mekong River Basin, accounting for more than 81% of all individuals sampled from this region of the drainage (illustrated in Figure 3.11b). The occurrence of star-like patterns suggests that C. striata populations have undergone significant population size expansions in the relatively recent past (Forster et al. 2001; Slatkin & Hudson 1991). The geographical range of star haplotypes suggests that population size expansion may have been limited to populations in the Middle and Lower Basin (excluding site GL in the Vietnamese highlands), and the high frequency of star-haplotypes in this region suggests that new mutations retained during such an expansion may have contributed significantly to genetic diversity at these sites.

Recent population expansion of the EA clade is supported by significant values for Ramos-Onsins and Rozas’ R2 (R2 = 0.0258, p = 0.030) and Fu’s FS (FS = -25.240, p ≤ 0.001), both consistent with population growth. Tau (τ) = 0.963 for this clade which, given a generation time of 2 years (Ali 1999; Kilambi 1986) and a mutation rate for perciform Cyt b of 1-2% per million years (Cárdenas et al. 2005; Johns & Avise 1998), potentially indicates that the EA mtDNA clade expanded in the Late Pleistocene, between 338 and 169 Kya. Site-specific and drainage-wide tests for deviation from mutation-drift and gene flow-drift equilibrium, however, did not generally support the hypothesis of population expansion, except for FS in the Mekong Basin (Table 3.6).
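
The dating step above rests on the standard mismatch-distribution relationship between τ and the time since expansion. The following is a sketch only: the symbols u, μ, L, g, t and T are introduced here for illustration and are not notation used in this chapter, and L (the length of the Cyt b fragment in base pairs) is not restated in this section.

```latex
% t = time since expansion in generations
% u = mutation rate per haplotype (whole fragment) per generation
% \mu = mutation rate per site per year, L = fragment length (bp),
% g = generation time in years, T = time since expansion in years
\tau = 2ut, \qquad u = \mu L g
\quad\Longrightarrow\quad
T = t\,g = \frac{\tau}{2\mu L}
```

Because T is inversely proportional to μ under these assumptions, doubling the assumed rate from 1% to 2% per million years halves the estimate, which is consistent with the two reported bounds differing by a factor of two (338 Kya versus 169 Kya).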

Figure 3.11. (a) Median-Joining Network of EA Cyt b mtDNA clade. Green circles show haplotypes which form star-like clusters that indicate recent population expansion, (b) Map of mainland SE Asian freshwater drainage lines. Geographical distribution of star clustered haplotypes is shaded green.


Table 3.6. Summary statistics for all sites individually, river drainages, and all data combined. Symbols denote significance (p ≤ 0.05) before (*) and after (†) FDR correction for multiple comparisons. Sites K, VV and SM have been excluded due to low sample sizes (≤3).

Hd = haplotype diversity; π = nucleotide diversity; θS = Theta (S); R2 = Ramos-Onsins & Rozas’ R2.

Site         n    n haplotypes  Hd     π       θS      Tajima’s D   Fu’s FS      R2
CM           10   4             0.533  1.200   2.121   -1.796*      -0.497       0.200
CP           49   4             0.526  2.212   1.346   1.647        3.6987       0.184
TK           30   2             0.067  0.067   0.252   -1.147*      -1.212*†     0.180
LP           7    5             0.905  4.762   4.081   0.894        0.2002       0.227
SB           28   4             0.680  6.230   10.536  -1.526*      8.629        0.048*
VT           42   4             0.489  3.028   9.751   -2.415*†     5.039        0.132
MD           21   2             0.257  0.257   0.278   -0.133       0.341        0.129
KJ           10   4             0.822  20.422  15.553  1.524        9.023        0.232
NE           50   3             0.516  0.882   0.447   1.739        1.731        0.220
GL           39   5             0.511  1.579   2.602   -1.199       0.941        0.072
LL           40   2             0.050  0.150   0.702   -1.716*†     -0.150       0.156
ST           7    4             0.610  2.481   9.736   -2.571*†     4.191        0.138
KK           49   5             0.508  0.872   1.346   -0.899       -0.506       0.074
SN           50   9             0.727  2.405   2.902   -0.514       -0.478       0.090
ME           39   9             0.971  1.555   3.075   -1.550*      -2.628       0.058*
KC           50   9             0.638  1.597   4.242   -1.976*      -2.046       0.045*
PA           49   6             0.422  1.471   2.916   -1.491       0.108        0.058
KH           51   11            0.737  1.655   4.445   -1.996*†     -3.706*      0.065
PS           56   7             0.556  1.194   1.306   -0.213       -1.234       0.115
BB           50   8             0.598  1.668   2.902   -1.275       -1.060       0.064
TP           49   13            0.797  3.636   5.158   -0.959       -1.532       0.076
TT           42   7             0.671  5.069   3.846   1.435        4.127        0.164
TC           53   8             0.737  1.128   3.526   -2.086*†     -2.428       0.065
VI           28   4             0.563  4.656   3.341   1.320        6.481        0.176
PH           42   5             0.678  5.660   3.951   1.391        7.834        0.163
Mekong       899  57            0.841  7.016   11.403  -1.069       -24.053*†    0.042
Chao Phraya  59   7             0.655  2.227   1.483   0.780        0.208        0.137
Global       988  70            0.870  7.207   12.847  -1.222       -23.997*†    0.037
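
As an illustration of how a haplotype diversity (Hd) value such as those in Table 3.6 can be obtained from haplotype counts, the sketch below uses Nei's unbiased estimator. This is an assumption about the estimator (the software output is not reproduced here), and the function name is hypothetical. The example counts for site TK (29 and 1) are inferred from n = 30 and the site frequencies of 0.97 and 0.03 in Table 3.5.

```python
def haplotype_diversity(counts):
    """Nei's estimator: Hd = n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two sampled individuals")
    sum_sq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1.0 - sum_sq)


# Site TK: 30 individuals, two haplotypes (assumed counts 29 and 1).
hd_tk = haplotype_diversity([29, 1])
print(round(hd_tk, 3))  # 0.067, matching the Hd listed for TK in Table 3.6
```

The same calculation with assumed counts of 39 and 1 gives 0.050, the value listed for site LL.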

Of the 378 pair-wise ΦST comparisons (Table 3.7), 355 revealed significant population differentiation after FDR correction. Samples from Peninsular Malaysia (site TK) and Sumatra (site LP) were differentiated from all other sampled sites, as was the southern Indian population (site K) when sites of low sample size (n = 2) were excluded. This broad-scale differentiation is reflected in the results of the Mantel test across all sites, which showed a significant association between genetic distance (ΦST) and geographical distance (log of straight-line distance in km; r = 0.547, p ≤ 0.001).
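
For readers unfamiliar with the Mantel test, the following is a minimal permutation sketch, not the software procedure used in this study. The function names, the one-sided test and the toy matrices are all assumptions made for illustration. It correlates the upper triangles of a genetic and a geographic distance matrix, then permutes site labels of the genetic matrix to build a null distribution.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Upper-triangle entries of a square matrix after relabelling sites by `order`."""
    k = len(order)
    return [matrix[order[i]][order[j]] for i in range(k) for j in range(i + 1, k)]


def mantel(gen, geo, permutations=999, seed=1):
    """Return (observed r, one-sided permutation p value)."""
    rng = random.Random(seed)
    identity = list(range(len(gen)))
    geo_vec = upper_triangle(geo, identity)
    r_obs = pearson(upper_triangle(gen, identity), geo_vec)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute rows and columns together
        if pearson(upper_triangle(gen, order), geo_vec) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)


# Toy example: five hypothetical sites along a line, with genetic distance
# set proportional to geographic distance (so r should be 1).
coords = [0.0, 1.0, 3.0, 7.0, 12.0]
geo = [[abs(a - b) for b in coords] for a in coords]
gen = [[0.05 * d for d in row] for row in geo]
r, p = mantel(gen, geo)
print(round(r, 3), p)
```

Permuting whole site labels, rather than individual matrix cells, preserves the dependence among pair-wise distances that share a site.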


Table 3.7. Population pair-wise ΦST values for C. striata mtDNA data. Above the diagonal: ΦST estimates; below the diagonal: significance (α = 0.05) before (*) and after (†) False Discovery Rate correction. Grey cells indicate comparisons that are not statistically significant.

   K CM CP TK LP SB VV VT SM MD KJ NE GL LL ST KK SN ME KC PA KH PS BB TP TT TC VI PH
K  0.95 0.90 0.99 0.85 0.75 0.94 0.92 0.94 0.99 0.41 0.96 0.94 0.99 0.89 0.96 0.90 0.93 0.93 0.94 0.93 0.95 0.93 0.85 0.80 0.95 0.83 0.79
CM *† 0.23 0.86 0.66 0.12 0.96 0.94 0.96 0.87 0.26 0.79 0.69 0.92 0.56 0.79 0.48 0.63 0.64 0.64 0.64 0.73 0.60 0.36 0.27 0.69 0.33 0.25
CP *† *† 0.67 0.70 0.32 0.95 0.94 0.95 0.74 0.51 0.74 0.69 0.79 0.62 0.76 0.55 0.64 0.65 0.65 0.66 0.69 0.62 0.46 0.39 0.67 0.45 0.38
TK *† *† *† 0.89 0.47 0.99 0.96 0.99 0.98 0.56 0.91 0.85 0.98 0.76 0.90 0.73 0.83 0.82 0.83 0.82 0.87 0.81 0.64 0.57 0.86 0.65 0.54
LP *† *† *† *† 0.46 0.89 0.92 0.88 0.88 0.31 0.87 0.81 0.92 0.74 0.86 0.73 0.80 0.80 0.81 0.81 0.85 0.79 0.65 0.56 0.84 0.59 0.54
SB *† *† *† *† 0.84 0.88 0.84 0.44 0.21 0.50 0.54 0.55 0.37 0.45 0.33 0.37 0.40 0.39 0.42 0.46 0.37 0.29 0.26 0.43 0.26 0.25
VV *† *† *† *† *† 0.12 -0.34 0.99 0.45 0.98 0.96 0.99 0.94 0.98 0.95 0.96 0.96 0.97 0.96 0.97 0.96 0.92 0.89 0.97 0.90 0.88
VT *† *† *† *† *† *† 0.01 0.95 0.76 0.96 0.95 0.96 0.93 0.96 0.94 0.95 0.95 0.95 0.95 0.95 0.95 0.92 0.91 0.95 0.91 0.90
SM *† *† *† *† *† 0.99 0.43 0.98 0.96 0.99 0.94 0.98 0.94 0.96 0.96 0.95 0.96 0.97 0.96 0.92 0.88 0.97 0.89 0.87
MD *† *† *† *† *† *† *† *† *† 0.36 0.13 0.75 0.90 0.33 0.53 0.33 0.41 0.31 0.40 0.18 0.47 0.38 0.25 0.37 0.46 0.39 0.33
KJ *† *† *† *† *† *† *† *† 0.49 0.51 0.54 0.36 0.48 0.40 0.40 0.44 0.44 0.44 0.48 0.43 0.34 0.32 0.47 0.28 0.30
NE *† *† *† *† *† *† *† *† *† *† *† 0.73 0.72 0.21 0.32 0.24 0.28 0.19 0.28 0.08 0.31 0.27 0.22 0.40 0.34 0.42 0.37
GL *† *† *† *† *† *† *† *† *† *† *† *† 0.81 0.57 0.68 0.53 0.59 0.61 0.62 0.61 0.68 0.60 0.44 0.46 0.64 0.49 0.42
LL *† *† *† *† *† *† *† *† *† *† *† *† *† 0.44 0.64 0.45 0.54 0.51 0.53 0.52 0.59 0.50 0.36 0.46 0.58 0.51 0.42
ST *† *† *† *† *† *† *† *† *† *† *† *† *† *† 0.05 0.06 0.07 0.06 0.07 0.10 0.05 0.06 0.09 0.29 0.12 0.27 0.26
KK *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† 0.06 0.03 0.02 0.03 0.08 0.07 0.03 0.08 0.32 0.05 0.33 0.29
SN *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† 0.02 0.04 0.03 0.09 0.10 0.01 0.02 0.19 0.07 0.19 0.17
ME *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† 0.01 0.01 0.06 0.07 -0.01 0.05 0.25 0.02 0.24 0.22
KC *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† * -0.01 0.02 0.07 0.01 0.05 0.26 0.03 0.26 0.23
PA *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† 0.07 0.08 0.01 0.04 0.24 0.02 0.24 0.22
KH *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† 0.12 0.07 0.09 0.30 0.09 0.29 0.26
PS *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† 0.07 0.10 0.34 0.13 0.35 0.32
BB *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† 0.04 0.24 0.03 0.24 0.22
TP *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† 0.12 0.06 0.11 0.10
TT *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† 0.28 0.10 0.08
TC *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† 0.28 0.24
VI *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† 0.01
PH *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *†


Excluding the two outlying sites, samples across mainland SE Asia retained a strong signature of IBD with straight-line (overland) distance (r = 0.627, p ≤ 0.001; Figure 3.12). The two populations from the Chao Phraya River drainage (sites CM and CP) were differentiated from each other (ΦST = 0.228, p ≤ 0.01); however, no significant ΦST was observed between site CM and site SB in the Mekong Drainage, suggesting that across northern Thailand and northern Lao, patterns of differentiation do not reflect current drainage boundaries. Although located in separate drainages, these sites are approximately 300 km apart, less than the ~500 km distance between the two Chao Phraya sampling sites.

Significant differentiation was also observed across most sites in the Mekong Basin. Comparisons between sites VV, VT, and SM, however, indicated no significant differentiation, revealing a cluster of genetically similar sample sites in the upper-mid Mekong. Spatial analysis of molecular variance (SAMOVA) also identified this group: when k = 2, the three sites were partitioned from all other sites, resulting in the highest between-group diversity value observed for any k value (ΦCT = 0.90120, p < 0.00001). Interestingly, site KJ was not significantly differentiated from sites VV and SM in the ΦST analysis; however, site MD, which lies on the Mekong River between KJ and SM, was significantly differentiated from all other sites. This site had relatively low diversity compared with neighbouring sites (Table 3.6) and, unlike sites VV, VT, SM, and KJ, no individuals belonging to the MM clade were detected there.

Sites at the headwaters of Mekong tributaries in the Vietnamese highlands were strongly differentiated from each other (ΦST = 0.806, p ≤ 0.001) and from all other Mekong sampled populations. Average significant ΦST values between these and other Mekong sites were 0.671 (for GL) and 0.670 (for LL). Stung Treng (ST), which is the closest downstream site to the Vietnamese highlands, was highly differentiated from GL and LL (195 km, ΦST = 0.569, p ≤ 0.001 and 270 km, ΦST = 0.441, p ≤ 0.001, respectively), but least differentiated from its own nearest downstream neighbour (site KK: 115 km, ΦST = 0.046, p ≤ 0.002). This suggests that distance and direction of stream flow are not the only factors that influence patterns of differentiation among these sites. Stream connections between the Vietnamese highlands and the Mekong proper provide quite different freshwater habitat to the main Mekong channel between ST and KK, as the highland tributaries are predominantly clear-water forested streams, in comparison with the Mekong’s large, fast-flowing current that carries a high silt load (Rainboth 1996a).


[Figure 3.12 plot. Y-axis: ΦST (-0.35 to 1.05). X-axis: log of straight line geographic distance (kms), 1.5 to 3.3. r = 0.627, P < 0.001.]

Figure 3.12. ST plotted against the log of straight line (overland) geographic distance for C. striata mtDNA data. Trendline shows the general pattern of increasing genetic distance with greater geographical distance (IBD).

In Cambodia (sites ST, KK, SN, ME, KC, PA, KH, PS, BB, TP), all sampling locations lie downstream from the largest complex of rapids in the Mekong Basin, the Khone Falls. Cambodian sites span the transition in the Mekong from a fast-flowing upland river with deep pools and sand bars to a lowland meandering river with vast floodplains and numerous oxbows. Across Cambodia, 35 of 45 pair-wise comparisons showed significant ΦST values among sites, but values were relatively low (average of significant ΦST values = 0.063). As the Mekong approaches southern Vietnam, it splits into a number of channels and takes the form of a high estuary, which means that although freshwater, sample locations in the Mekong delta (sites TT, TC, VI and PH) experience current and velocity changes associated with tidal effects at the river mouth. More differentiation was generally observed among these sites than among Cambodian sites (average significant ΦST among delta sites of 0.152), and these sites were also strongly differentiated from upstream Lower Basin sites (average significant ΦST between delta and Cambodian sites = 0.222). This suggests that genetic differentiation is not necessarily proportional to geographical distance across the Lower Mekong Basin. This hypothesis is supported by the results of the Mantel correlation analysis, which, although significant, was weak (r = 0.196, p < 0.004), indicating that IBD is less appropriate for describing the overall pattern of genetic differentiation at this spatial scale.


Nuclear DNA results.

Diversity

A total of 654 individuals collected from 19 sites were genotyped at eight microsatellite loci. A total of 58 PCR amplifications were unsuccessful, resulting in a final data set with ~1.12% missing data. All loci were variable, with the number of alleles per locus ranging from 10 (for Cs-2) to a maximum of 27 (for Cs-5). Figure 3.13 shows the distribution of allele sizes for each locus pooled among sites. Most distributions were skewed, in particular that of Cs-5, where fewer than 25% of alleles fall within one standard deviation of the mean allele size. Allele frequencies for each locus at each site are presented in Appendix 10.

[Figure 3.13 plot. Eight panels, one per locus (Cs-1 to Cs-8). Y-axis: total frequency. X-axis: allele size (base pairs).]

Figure 3.13. Distribution of allele frequencies for C. striata dinucleotide microsatellite loci used in this study.

Private (site specific) alleles were observed at all loci except for Cs-1, and accounted for 28 of the 166 individual alleles recorded across all loci at all sites. In only four instances, a single private allele accounted for more than 5% of the alleles detected at a site (16.6% for Cs-4 allele 184 at SB, 8.3% for Cs-7 allele 142 at TK, and 11.4% and 7.1% for Cs-8 alleles 133 and 167 at LP and VT respectively). The greatest number of private alleles was detected at site SB, including five private alleles at the Cs-4 locus (alleles 184, 194, 196, 198 and 200). These accounted for ~28.3% of all alleles scored for this locus at this site.
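
The idea of a private (site-specific) allele can be made concrete with a short sketch. The function name and the example allele sizes below are hypothetical and are not data from this study. An allele is counted as private when it is observed at exactly one site.

```python
def private_alleles(alleles_by_site):
    """Return {site: sorted list of alleles observed at that site and nowhere else}."""
    seen_at = {}
    for site, alleles in alleles_by_site.items():
        for allele in set(alleles):
            seen_at.setdefault(allele, set()).add(site)
    result = {site: [] for site in alleles_by_site}
    for allele, sites in seen_at.items():
        if len(sites) == 1:
            result[next(iter(sites))].append(allele)
    return {site: sorted(found) for site, found in result.items()}


# Hypothetical allele sizes (base pairs) at one locus for three sites.
example = {
    "site_A": [180, 182, 184, 194],
    "site_B": [180, 182],
    "site_C": [182, 186],
}
print(private_alleles(example))
# {'site_A': [184, 194], 'site_B': [], 'site_C': [186]}
```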


Table 3.8. Summary statistics for C. striata microsatellite loci at all sites. n = number of individuals genotyped; n alleles = number of different allele sizes observed in each sample; allelic range = number of repeat units that alleles span; allelic richness (A); obs het = observed heterozygosity; exp het = expected heterozygosity; average het = heterozygosity averaged across all 8 loci; H/W p value = significance of test for deviation from Hardy-Weinberg Equilibrium; G/W index = Garza-Williamson Index across all loci; W/T p value = significance of Wilcoxon’s 2-tailed test for deviation from population mutation-drift equilibrium. For p values, significance (α = 0.05) is indicated before (*) and after (†) FDR correction. For the G/W index, values less than 0.68 (threshold for recent population decline) are indicated by (**).

site  statistic         Cs-1     Cs-2     Cs-3     Cs-4     Cs-5     Cs-6     Cs-7     Cs-8

CM    n                 10       10       10       10       9        9        10       9
      n alleles         6        4        14       8        6        9        3        3
      allelic range     15       6        22       9        17       13       2        3
      allelic richness  4.75     3.45     8.58     6.03     4.53     6.84     2.76     2.55
      obs het           0.70     0.80     0.90     0.60     0.56     0.78     0.50     0.67
      exp het           0.76     0.70     0.97     0.88     0.73     0.92     0.61     0.50
      H/W p value       0.45     0.53     0.29     0.02*    0.32     0.12     0.77     0.63
      average het (8 loci) 0.69; G/W index 0.64** (SD: 0.21); W/T p value 0.38

CP    n                 47       48       48       46       47       47       47       44
      n alleles         12       7        17       11       16       14       4        5
      allelic range     18       8        19       13       25       16       3        5
      allelic richness  4.83     3.86     7.09     6.10     5.99     5.97     2.89     2.44
      obs het           0.74     0.54     0.88     0.65     0.68     0.77     0.53     0.41
      exp het           0.71     0.74     0.91     0.88     0.84     0.86     0.60     0.40
      H/W p value       0.40     0.02*    <0.01*†  <0.01*†  <0.01*†  <0.01*†  <0.01*†  <0.01*†
      average het (8 loci) 0.65; G/W index 0.79 (SD: 0.12); W/T p value 0.07

TK    n                 30       30       30       30       30       30       30       30
      n alleles         5        4        11       5        9        10       4        4
      allelic range     10       6        25       7        30       21       4        3
      allelic richness  2.73     2.73     6.01     2.19     3.87     5.26     2.31     3.16
      obs het           0.33     0.47     0.77     0.27     0.43     0.50     0.23     0.40
      exp het           0.44     0.58     0.86     0.25     0.67     0.81     0.30     0.65
      H/W p value       <0.01*†  0.37     0.12     1.00     <0.01*†  <0.01*†  0.16     <0.01*†
      average het (8 loci) 0.43; G/W index 0.58** (SD: 0.21); W/T p value 0.02

LP    n                 24       23       24       24       19       21       24       22
      n alleles         5        2        5        1        8        3        2        4
      allelic range     7        3        7        n/a      7        2        1        5
      allelic richness  3.05     1.99     3.42     1.00     5.91     1.81     1.90     2.38
      obs het           0.46     0.61     0.63     n/a      0.63     0.19     0.13     0.27
      exp het           0.46     0.46     0.66     n/a      0.88     0.18     0.31     0.32
      H/W p value       0.38     0.18     0.12     n/a      <0.01*†  1.00     <0.01*†  0.08
      average het (8 loci) 0.42; G/W index 0.80 (SD: 0.20); W/T p value 0.47

SB    n                 30       30       30       30       30       30       30       30
      n alleles         4        6        11       14       8        9        7        5
      allelic range     4        9        16       24       16       19       15       9
      allelic richness  3.72     4.48     6.53     6.09     4.62     4.75     2.97     3.63
      obs het           0.70     0.83     0.83     0.60     0.57     0.37     0.50     0.80
      exp het           0.74     0.78     0.90     0.87     0.75     0.75     0.59     0.70
      H/W p value       0.13     0.43     0.31     <0.01*†  <0.01*†  <0.01*†  0.16     0.82
      average het (8 loci) 0.65; G/W index 0.59** (SD: 0.11); W/T p value 0.46

VV    n                 20       20       20       20       16       17       20       15
      n alleles         2        2        4        3        1        4        5        5
      allelic range     1        1        9        4        n/a      12       11       14
      allelic richness  1.25     2.00     2.82     1.69     1.00     2.57     2.45     2.56
      obs het           0.05     0.40     0.55     0.15     n/a      0.65     0.25     0.27
      exp het           0.05     0.49     0.54     0.14     n/a      0.51     0.32     0.31
      H/W p value       1.00     0.64     0.65     1.00     n/a      0.18     0.48     0.14
      average het (8 loci) 0.33; G/W index 0.63** (SD: 0.30); W/T p value 0.05

VT    n                 42       42       42       42       42       42       42       42
      n alleles         3        2        7        8        2        11       11       10
      allelic range     2        1        6        8        1        20       11       18
      allelic richness  1.86     1.74     4.85     4.50     1.12     5.34     5.93     5.65
      obs het           0.26     0.19     0.62     0.64     0.02     0.74     0.71     0.81
      exp het           0.23     0.21     0.81     0.74     0.02     0.84     0.87     0.86
      H/W p value       1.00     0.45     0.05     0.02*    1.00     0.32     0.06     <0.01*†
      average het (8 loci) 0.50; G/W index 0.86 (SD: 0.20); W/T p value 0.31

SM    n                 8        8        8        8        6        6        8        5
      n alleles         6        2        5        8        1        6        10       5
      allelic range     7        1        6        9        n/a      11       14       9
      allelic richness  4.38     1.63     4.37     6.36     1.00     5.62     7.43     5.00
      obs het           0.63     0.13     0.63     0.63     n/a      0.67     0.75     0.80
      exp het           0.62     0.13     0.76     0.90     n/a      0.86     0.93     0.84
      H/W p value       0.77     1.00     0.07     0.07     n/a      0.32     0.11     0.34
      average het (8 loci) 0.60; G/W index 0.74 (SD: 0.19); W/T p value 0.94

MD    n                 29       29       29       29       29       27       28       29
      n alleles         7        4        9        8        8        11       4        3
      allelic range     8        7        10       13       8        19       3        2
      allelic richness  3.62     2.75     3.76     4.26     4.85     5.30     3.28     2.17
      obs het           0.62     0.62     0.59     0.69     0.69     0.67     0.68     0.59
      exp het           0.65     0.51     0.67     0.72     0.81     0.81     0.60     0.49
      H/W p value       0.88     0.59     0.41     0.19     0.05     <0.01*†  0.67     0.22
      average het (8 loci) 0.64; G/W index 0.76 (SD: 0.19); W/T p value 0.05

KJ    n                 11       11       11       10       11       11       11       11
      n alleles         6        4        5        7        6        7        3        2
      allelic range     6        6        11       8        12       7        2        1
      allelic richness  4.82     2.77     3.08     5.50     4.93     5.31     2.97     1.97
      obs het           0.73     0.45     0.27     0.50     0.91     0.91     0.64     0.27
      exp het           0.79     0.40     0.41     0.85     0.79     0.84     0.68     0.37
      H/W p value       0.06     1.00     0.21     0.04*    0.92     0.63     0.32     0.44
      average het (8 loci) 0.59; G/W index 0.74 (SD: 0.22); W/T p value 0.74

NE    n                 50       50       50       50       50       50       50       48
      n alleles         6        3        8        7        10       10       4        4
      allelic range     7        3        8        9        12       18       3        3
      allelic richness  3.49     2.58     3.49     4.36     5.48     5.21     2.54     2.85
      obs het           0.74     0.60     0.56     0.70     0.82     0.82     0.66     0.48
      exp het           0.68     0.58     0.56     0.77     0.85     0.82     0.49     0.55
      H/W p value       0.25     0.32     0.30     0.63     0.18     0.41     0.03*    0.27
      average het (8 loci) 0.67; G/W index 0.80 (SD: 0.15); W/T p value 0.46

GL    n                 37       39       39       39       39       39       36       39
      n alleles         3        3        5        6        8        10       3        5
      allelic range     4        6        10       8        18       19       2        5
      allelic richness  1.39     2.24     3.63     3.86     3.29     5.57     2.85     3.76
      obs het           0.08     0.41     0.67     0.56     0.36     0.51     0.53     0.59
      exp het           0.08     0.50     0.71     0.65     0.53     0.83     0.63     0.73
      H/W p value       1.00     0.34     0.25     <0.01*†  <0.01*†  <0.01*†  0.08     <0.01*†
      average het (8 loci) 0.46; G/W index 0.61** (SD: 0.20); W/T p value 0.74

LL    n                 40       40       40       40       40       40       40       40
      n alleles         4        3        3        8        6        6        4        1
      allelic range     3        6        5        8        14       14       3        n/a
      allelic richness  2.49     2.87     2.62     3.44     3.09     2.40     3.08     1.00
      obs het           0.40     0.55     0.60     0.30     0.33     0.38     0.73     n/a
      exp het           0.41     0.64     0.58     0.52     0.63     0.34     0.68     n/a
      H/W p value       0.35     0.63     1.00     <0.01*†  <0.01*†  0.47     0.48     n/a
      average het (8 loci) 0.47; G/W index 0.70 (SD: 0.27); W/T p value 0.30

ST    n                 48       47       48       48       48       48       48       48
      n alleles         7        7        14       9        9        12       5        8
      allelic range     8        9        17       9        12       15       7        15
      allelic richness  4.30     3.18     5.89     4.07     3.45     5.28     2.84     3.39
      obs het           0.79     0.62     0.88     0.63     0.44     0.65     0.42     0.58
      exp het           0.76     0.62     0.84     0.71     0.50     0.83     0.47     0.67
      H/W p value       0.06     0.42     0.38     0.08     0.11     <0.01*†  0.02*    0.15
      average het (8 loci) 0.62; G/W index 0.72 (SD: 0.11); W/T p value 0.02*

KC    n                 48       48       48       48       47       48       47       47
      n alleles         10       6        16       12       12       13       5        5
      allelic range     13       9        23       14       18       24       4        27
      allelic richness  4.66     3.08     5.66     6.00     5.06     6.40     3.39     2.57
      obs het           0.81     0.56     0.73     0.71     0.79     0.81     0.57     0.51
      exp het           0.76     0.61     0.80     0.87     0.81     0.89     0.69     0.53
      H/W p value       0.68     0.73     0.12     0.06*    0.80     0.04*    0.15     0.97
      average het (8 loci) 0.69; G/W index 0.64** (SD: 0.22); W/T p value 0.05

PS    n                 48       48       48       48       48       48       48       48
      n alleles         7        5        15       8        9        14       6        5
      allelic range     7        5        17       9        12       21       7        4
      allelic richness  3.27     2.56     5.61     4.84     5.19     6.03     4.03     2.94
      obs het           0.75     0.52     0.83     0.63     0.77     0.81     0.69     0.65
      exp het           0.63     0.55     0.84     0.80     0.81     0.87     0.76     0.60
      H/W p value       0.22     0.82     0.81     <0.01*†  0.90     0.18     0.24     0.52
      average het (8 loci) 0.71; G/W index 0.80 (SD: 0.11); W/T p value 0.05

TT    n                 42       42       42       42       42       42       42       42
      n alleles         13       6        15       12       14       14       4        4
      allelic range     17       7        15       27       20       19       3        9
      allelic richness  5.68     3.25     5.47     6.40     5.66     5.90     3.66     2.76
      obs het           0.74     0.67     0.90     0.62     0.74     0.81     0.71     0.43
      exp het           0.83     0.64     0.80     0.89     0.84     0.84     0.73     0.58
      H/W p value       0.03*    0.51     0.44     <0.01*†  0.05     0.15     0.46     0.11
      average het (8 loci) 0.70; G/W index 0.70 (SD: 0.20); W/T p value 0.07

TL    n                 45       45       45       45       44       45       44       43
      n alleles         11       5        11       11       13       14       4        5
      allelic range     14       5        14       17       22       21       3        4
      allelic richness  4.98     2.84     4.23     5.39     5.23     6.11     3.64     2.45
      obs het           0.80     0.60     0.51     0.51     0.80     0.82     0.70     0.40
      exp het           0.78     0.60     0.62     0.83     0.80     0.87     0.73     0.55
      H/W p value       0.86     0.07     0.02*    <0.01*†  0.20     0.31     0.03*    0.09
      average het (8 loci) 0.64; G/W index 0.76 (SD: 0.16); W/T p value 0.04*

PH    n                 42       42       42       42       41       42       42       42
      n alleles         8        6        12       12       15       17       4        5
      allelic range     14       7        14       17       22       22       3        9
      allelic richness  3.88     2.47     4.29     5.82     5.30     6.61     3.26     2.97
      obs het           0.62     0.69     0.60     0.64     0.68     0.88     0.71     0.48
      exp het           0.66     0.55     0.63     0.86     0.80     0.89     0.69     0.62
      H/W p value       0.39     0.14     0.59     <0.01*†  0.36     0.66     0.46     0.04*
      average het (8 loci) 0.66; G/W index 0.71 (SD: 0.15); W/T p value 0.02*

After FDR correction, only 17 of the 532 pair-wise comparisons between individual loci at each site indicated significant deviation from linkage equilibrium (3.2% of tests performed; data not shown). This very limited linkage disequilibrium is unlikely to result from physical linkage between loci, as in that case a consistent linkage pattern would be expected among specific loci across all sites. Instead, significant results may indicate localised demographic effects. This is likely to be true for site CP, which returned 11 pairs of loci that showed linkage disequilibrium. At this site, the data for six of eight loci also showed significant departure from HWE after FDR correction, providing further evidence that the sample from CP (the southern site in the Chao Phraya River Basin) probably did not constitute a random sample collected from a single population evolving under neutral expectations. Therefore, the observed homozygote excess here is unlikely to be the result of the presence of null alleles. In total, 26 of 152 tests for HWE revealed significant deviations from equilibrium. All were homozygote excesses. Apart from site CP, three other sites showed departures from HWE at more than two loci: site SB (the most upstream Mekong site), site GL (at the headwaters of a Mekong tributary flowing from the Vietnamese central highlands) and site TK (the Malaysian sample). Generally, however, as there was no consistent pattern of homozygote excess within loci across sites, there is a high likelihood that any null alleles (if indeed they exist) will not significantly bias subsequent analyses. In fact, allele frequency adjustment would be likely to provide less accurate estimates of population structure than raw allele frequency data (Chapuis & Estoup 2007). As such, allele frequency adjustment to account for the presence of null alleles was not undertaken.
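
The distinction between significance before (*) and after (†) FDR correction can be illustrated with a short sketch. This assumes the Benjamini-Hochberg step-up procedure; the exact FDR procedure is not restated in this section, and the function name and p values below are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where a test remains significant after BH correction."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    # Find the largest rank k such that p_(k) <= alpha * k / m.
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank
    significant = [False] * m
    # Every test ranked at or below the cutoff is declared significant.
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


# Hypothetical p values: four are below 0.05 before correction,
# but only the two smallest survive the correction.
p = [0.001, 0.008, 0.039, 0.041, 0.20]
print(benjamini_hochberg(p))
# [True, True, False, False, False]
```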

Diversity, in terms of the average number of alleles, allelic range and allelic richness (A), was lowest towards the southern extent of the species range at site LP (Sumatra), and close to the headwaters of Mekong tributaries at higher altitudes (sites VV and GL). Among other sampling sites, average A was fairly constant at around 4.0, although both a greater number of alleles and larger allelic ranges were observed in the Chao Phraya, at site SB at the top of the Mekong, and at downstream Mekong sites than were observed in the middle Mekong sites on the Khorat Plateau. This contrasts with the pattern of mtDNA diversity, where both π and θS diversity estimates were highest for site KJ on the Khorat Plateau, and also relatively high for site LP.

Little or no evidence was found for recent population declines, with no sample sites displaying significant deviations from mutation-drift expectations under Wilcoxon’s test after FDR correction. The Garza-Williamson Index (M) did, however, provide some support for a population decline at a limited number of sites (six site M values fell below 0.68), although only one site showed an index more than one standard deviation below this threshold. This value was observed at site SB; however, SB showed one of the largest allelic ranges among all sites, and high allelic range is known to lower the value of M artificially (Garza & Williamson 2001), possibly in association with hybridisation (see below).
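
As a worked illustration of the index discussed above, the sketch below computes M for each locus as the number of alleles divided by (allelic range + 1), then averages across loci. The "+ 1" convention and the function names are assumptions made here, chosen because they reproduce the tabulated value for site CM; the input values are the n alleles and allelic range rows for CM in Table 3.8.

```python
def garza_williamson(n_alleles, allelic_range):
    """Per-locus M: observed alleles divided by the number of possible allele states."""
    return n_alleles / (allelic_range + 1)


def mean_m(allele_counts, ranges):
    """Average the per-locus M values across loci."""
    values = [garza_williamson(k, r) for k, r in zip(allele_counts, ranges)]
    return sum(values) / len(values)


# Site CM, loci Cs-1 to Cs-8 (Table 3.8).
cm_alleles = [6, 4, 14, 8, 6, 9, 3, 3]
cm_ranges = [15, 6, 22, 9, 17, 13, 2, 3]
print(round(mean_m(cm_alleles, cm_ranges), 2))  # 0.64
```

Because the denominator grows with allelic range, a single locus with a few very distant alleles pulls M down, which is the artefact noted for site SB.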

Population structure

Despite some variation in the absolute levels of differentiation measured (Figure 3.14) (theoretically all measures range from 0 to 1), the three pair-wise measures of differentiation calculated were largely congruent and indicated high differentiation in almost all pair-wise site comparisons (see Table 3.9 and Table 3.10). FST analysis, which generally yielded the lowest estimates, identified only one pair-wise comparison that did not equate to significant differentiation among samples, corresponding with a lack of differentiation between the two sites sampled in the Chao Phraya Drainage (CM and CP).

While RST analysis supported this finding for the Chao Phraya, this measure, which is more sensitive when differentiation is high (as is the case here), identified 12 pair-wise site comparisons that were not differentiated. Three of these were observed between neighbouring sites in the upper-, mid-, and lower Mekong Basin (VT-SM, MD-KJ, and KC-PS respectively), while the remainder indicated similarity between KJ in the mid-Mekong, GL in the Vietnamese highlands, and four sites across the lower Mekong Basin.

Results of the FST and RST analyses showed that the largest differentiation was observed consistently between sites LP (Sumatra), LL (Vietnamese central highlands) and VV (northern Lao) and all other sites. For Dest estimates, the greatest pair-wise differentiation was evident between sites TK (Malaysia), VT (northern Lao) and VV, and all other sites.
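
The difference in absolute level between fixation-type measures and Jost’s D can be seen in a small worked sketch. The code below computes GST (used here as a simple multi-allelic analogue of FST, which is an assumption and not the estimator used for Table 3.9) and Jost’s D for one locus from allele frequencies. It omits the sample-size corrections that define Dest, and the allele frequencies are hypothetical.

```python
def heterozygosity(freqs):
    """Expected heterozygosity (gene diversity) from allele frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def gst_and_jost_d(pop_freqs):
    """Return (GST, D) for one locus from per-population allele frequencies.

    Plain frequency-based formulas, without the sample-size corrections
    that distinguish Dest from D.
    """
    n = len(pop_freqs)
    # HS: mean within-population heterozygosity.
    hs = sum(heterozygosity(f) for f in pop_freqs) / n
    # HT: heterozygosity of the pooled, equally weighted populations.
    alleles = set().union(*pop_freqs)
    pooled = {a: sum(f.get(a, 0.0) for f in pop_freqs) / n for a in alleles}
    ht = heterozygosity(pooled)
    if ht == 0.0:
        # A locus fixed for one allele everywhere carries no differentiation.
        return 0.0, 0.0
    gst = (ht - hs) / ht
    d = (ht - hs) / (1.0 - hs) * n / (n - 1)
    return gst, d


# Two hypothetical populations that share no alleles at a variable locus.
no_shared = [{"a": 0.5, "b": 0.5}, {"c": 0.5, "d": 0.5}]
print(gst_and_jost_d(no_shared))
```

For these two populations, which share no alleles, D equals 1 while GST is only one third, because high within-population heterozygosity caps the fixation-type measure. This is one reason the absolute values of the measures can differ while their rankings remain broadly congruent.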

[Figure 3.14: differentiation (y-axis, 0 to 1) plotted against pair-wise comparison ranked by Dest (x-axis, 0 to 180). Series: Jost’s Dest values, FST values with FST trendline, RST values with RST trendline.]

Figure 3.14. Graph of measures of differentiation ranked by Dest.

Table 3.9. Pair-wise FSTs for C. striata microsatellite data. Above the diagonal (-): FST estimates; below the diagonal: significance (α = 0.05) before (*) and after (†) False Discovery Rate correction. Comparisons that are not statistically significant are marked ns.

     CM   CP   TK   LP   SB   VV   VT   SM   MD   KJ   NE   GL   LL   ST   KC   PS   TT   TL   PH
CM    - 0.00 0.19 0.36 0.05 0.44 0.26 0.17 0.12 0.12 0.15 0.17 0.30 0.15 0.12 0.12 0.06 0.07 0.12
CP   ns    - 0.17 0.26 0.08 0.38 0.29 0.18 0.11 0.10 0.15 0.16 0.26 0.14 0.12 0.12 0.07 0.07 0.12
TK   *†   *†    - 0.39 0.22 0.52 0.41 0.36 0.26 0.28 0.27 0.29 0.34 0.27 0.23 0.23 0.19 0.21 0.24
LP   *†   *†   *†    - 0.31 0.63 0.49 0.48 0.38 0.42 0.34 0.36 0.45 0.37 0.25 0.28 0.29 0.31 0.26
SB   *†   *†   *†   *†    - 0.35 0.23 0.13 0.17 0.16 0.19 0.22 0.29 0.19 0.14 0.15 0.10 0.13 0.16
VV   *†   *†   *†   *†   *†    - 0.26 0.32 0.45 0.50 0.39 0.49 0.56 0.42 0.38 0.40 0.34 0.37 0.39
VT   *†   *†   *†   *†   *†   *†    - 0.05 0.35 0.38 0.33 0.39 0.47 0.35 0.31 0.32 0.28 0.30 0.33
SM   *†   *†   *†   *†   *†   *†   *†    - 0.26 0.27 0.23 0.32 0.41 0.25 0.20 0.22 0.16 0.20 0.21
MD   *†   *†   *†   *†   *†   *†   *†   *†    - 0.05 0.06 0.14 0.28 0.08 0.11 0.08 0.06 0.05 0.10
KJ   *†   *†   *†   *†   *†   *†   *†   *†   *†    - 0.10 0.15 0.30 0.08 0.10 0.12 0.05 0.04 0.09
NE   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†    - 0.15 0.23 0.08 0.07 0.08 0.05 0.06 0.07
GL   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†    - 0.23 0.13 0.14 0.15 0.12 0.09 0.12
LL   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†    - 0.24 0.20 0.23 0.19 0.21 0.24
ST   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†    - 0.11 0.10 0.07 0.07 0.12
KC   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†    - 0.03 0.05 0.06 0.03
PS   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†    - 0.07 0.08 0.06
TT   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†    - 0.02 0.06
TL   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†    - 0.05
PH   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†   *†    -

Table 3.10. Pair-wise RSTs for C. striata microsatellite data. Above the line: RST estimates, below the line: Significance (α = 0.05) before (*) and after (†) False Discovery Rate Correction. Grey cells indicate comparisons that are not statistically significant.

CM CP TK LP SB VV VT SM MD KJ NE GL LL ST KC PS TT TL PH
CM 0.04 0.15 0.76 0.12 0.67 0.58 0.47 0.43 0.33 0.51 0.16 0.45 0.32 0.23 0.26 0.15 0.25 0.26
CP 0.07 0.56 0.23 0.52 0.51 0.36 0.24 0.18 0.29 0.09 0.24 0.25 0.15 0.16 0.08 0.13 0.13
TK *† *† 0.45 0.32 0.50 0.51 0.30 0.24 0.20 0.29 0.12 0.27 0.31 0.18 0.20 0.13 0.13 0.05
LP *† *† *† 0.70 0.87 0.76 0.79 0.70 0.77 0.67 0.46 0.70 0.66 0.51 0.59 0.46 0.47 0.41
SB *† *† *† *† 0.53 0.55 0.42 0.49 0.38 0.56 0.34 0.51 0.49 0.40 0.44 0.30 0.40 0.39
VV *† *† *† *† *† 0.05 0.20 0.66 0.68 0.67 0.43 0.72 0.62 0.50 0.58 0.39 0.45 0.46
VT *† *† *† *† *† *† 0.05 0.52 0.52 0.56 0.43 0.62 0.51 0.47 0.51 0.41 0.45 0.51
SM *† *† *† *† *† *† 0.44 0.43 0.47 0.21 0.55 0.42 0.28 0.37 0.21 0.25 0.29
MD *† *† *† *† *† *† *† *† 0.00 0.02 0.04 0.13 0.10 0.02 0.02 0.08 0.04 0.23
KJ *† *† *† *† *† *† *† *† 0.06 0.02 0.09 0.08 0.01 0.02 0.03 0.02 0.19
NE *† *† *† *† *† *† *† *† *† *† 0.06 0.11 0.18 0.04 0.06 0.10 0.05 0.24
GL *† *† *† *† *† *† *† *† *† *† 0.06 0.11 0.00 0.03 0.00 0.00 0.12
LL *† *† *† *† *† *† *† *† *† *† *† *† 0.17 0.05 0.09 0.09 0.06 0.25
ST *† *† *† *† *† *† *† *† *† *† *† *† *† 0.07 0.06 0.15 0.16 0.35
KC *† *† *† *† *† *† *† *† *† *† *† *† 0.01 0.03 0.03 0.19
PS *† *† *† *† *† *† *† *† *† *† *† *† *† 0.07 0.05 0.21
TT *† *† *† *† *† *† *† *† *† *† *† *† *† *† 0.02 0.12
TL *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† 0.09
PH *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *† *†

The pattern of microsatellite differentiation among sites is clear in the NJ tree constructed using Jost’s Dest. At the highest level of hierarchical structuring, sample sites can be classified into two main clades (Figure 3.15). As with the distribution of mtDNA clades, one grouping of samples is restricted to the upper Mekong Basin (sites SM, VV, and VT), while the other encompasses all remaining sites, including Malaysia and Sumatra. The most upstream Mekong site (SB) groups closely with the Chao Phraya sites (CM and CP). Although in a different drainage, SB is geographically close to the headwaters of the Chao Phraya, and is isolated from all other widespread clade sites in the Mekong by the occurrence of the restricted clade at sites directly downstream. Similarity between site SB and the Chao Phraya was also observed in the mtDNA dataset, where haplotype 44 was shared among sites. Interestingly, the southernmost samples from Malaysia and Sumatra group most closely with samples from the Vietnamese central highlands at the eastern extent of the sampled range (although with long branches indicating high differentiation).

Results of the FCA provide a clearer picture of how individual genotypes contributed to among-sample differentiation and similarity. The three factors that accounted for the most variability among samples together explained over 46% of the variation observed (Figure 3.16a). Sites VV, VT and SM, as before, were clearly closely related to the exclusion of all others. All sites in the mid- and lower Mekong clustered tightly together, indicating that although all samples may show significant differentiation in pair-wise analyses, the level of differentiation was much less than between sites in separate drainages and drainage sections. Although classified as a separate sub-clade in the NJ tree, results of the FCA also revealed that when population centres of gravity were considered (Figure 3.16b), site SB and the Chao Phraya sites CM and CP were relatively similar to the majority of other Mekong sample sites. In contrast, Sumatran (site LP) and Malaysian (site TK) samples clearly cluster away from genotypes sampled from the Chao Phraya and Mekong River Basins, both at the individual level (Figure 3.16c) and for their population centres of gravity (Figure 3.16b). This pattern of differentiation with increasing distance was reflected by a significant but weak signature of IBD (r² = 0.190, p = 0.005) when straight line geographical distance was considered. Within the Mekong Basin, however, genetic distance was not strongly correlated with stream section (r² = 0.769, a relatively poor fit; Kalinowski et al. 2008), indicating that distance within a landscape of contemporary freshwater connectivity was not the only factor influencing gene flow and the spatial arrangement of genetic differentiation.
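
The test used for IBD is not detailed in this section. As an illustration only, a Mantel-type permutation test is assumed below: the correlation between pair-wise genetic and geographical distance matrices is compared with correlations obtained after shuffling site labels. Function names, site positions and distances are hypothetical.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Flatten the upper triangle of a square matrix, sites taken in `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel_test(genetic, geographic, permutations=999, seed=0):
    """Return (r, p) for a one-sided Mantel test of positive correlation."""
    n = len(genetic)
    identity = list(range(n))
    geo = upper_triangle(geographic, identity)
    observed = pearson_r(upper_triangle(genetic, identity), geo)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        # Shuffle site labels of one matrix; rows and columns move together.
        shuffled = identity[:]
        rng.shuffle(shuffled)
        if pearson_r(upper_triangle(genetic, shuffled), geo) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Four hypothetical sites along a line; genetic distance proportional to
# geographical distance, so the correlation is perfect by construction.
positions = [0.0, 1.0, 3.0, 7.0]
geographic = [[abs(a - b) for b in positions] for a in positions]
genetic = [[0.1 * d for d in row] for row in geographic]
r, p = mantel_test(genetic, geographic, permutations=199)
print(round(r ** 2, 3), p)
```

Permuting whole site labels, rather than individual matrix cells, keeps the non-independence of pair-wise distances intact, which is the reason a permutation test is preferred over an ordinary regression p-value for data of this kind.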

[Figure 3.15: NJ tree. Tip labels:]
MD ~ Mekong Basin, Eastern Thailand
NE ~ Mekong Basin, Eastern Thailand
ST ~ Mekong Basin, Northern Cambodia
KJ ~ Mekong Basin, Eastern Thailand
PH ~ Mekong Basin, Vietnamese Delta
KC ~ Mekong Basin, Cambodia
PS ~ Mekong Basin, Cambodia
TL ~ Mekong Basin, Vietnamese Delta
TT ~ Mekong Basin, Vietnamese Delta
GL ~ Mekong Basin, Vietnamese Central Highlands
LL ~ Mekong Basin, Vietnamese Central Highlands
TK ~ Malaysia
LP ~ Sumatra
CP ~ Chao Phraya, Central Thailand
CM ~ Chao Phraya, Northern Thailand
SB ~ Mekong Basin, Northwest Lao PDR
SM ~ Mekong Basin, Northeastern Thailand
VV ~ Mekong Basin, Northern Lao PDR
VT ~ Mekong Basin, Northern Lao PDR
[Scale bar: 0.05]

Figure 3.15. NJ Tree of Dest microsatellite differentiation among C. striata samples.

[Figure 3.16: (a) percent of total inertia for axis 1 (22.96%), axis 2 (12.55%) and axis 3 (11.38%), with cumulative inertia of 35.51% across two axes and 46.89% across three; (b) population centres of gravity on axes 1 to 3, labelled TK, LP, SB, CP, LL, CM, SM, VT, VV and mid-lower Mekong; (c) individual genotypes on axes 1 to 3.]

Figure 3.16. Results of FCA: (a) Graph of the magnitude of inertia for the three factors that account for the greatest variability in the analysis, black line indicates cumulative inertia across three axes; (b) centres of gravity for each population; and (c) all individuals coloured by sample site.

Bayesian clustering of genotypes was performed for the entire data set, and separately for sites in, and adjacent to, the Chao Phraya (sites CM, CP, SB, and NE). The smaller data set was tested separately to examine the contribution that neighbouring sites made to Chao Phraya diversity because CP showed strong evidence for disequilibrium (hence possible subpopulation admixture), yet a lack of differentiation to CM. For each data set analysed, ΔK identified three as the best number of distinct groups (Figure 3.17). For the reduced data set comprising the Chao Phraya and neighbours, the three clusters corresponded largely with the Chao Phraya (CM+CP: green), site SB to the north (light blue), and site NE to the east (dark blue, see Figure 3.18). However, two individuals sampled at CM had a high probability of membership with the SB cluster, and two from the SB cluster had a high probability of membership with the Chao Phraya cluster. A comparison with individual mtDNA genotypes revealed that the two individuals from CM had haplotype 44 (common to CM and SB), but that the two SB individuals had mtDNA haplotype 43, which was found exclusively at SB.
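
The ΔK statistic (Evanno et al. 2005) relates the second-order rate of change in LnP(D) between successive K values to the variability among replicate runs. The sketch below uses a common simplification, assumed here, in which the second difference is taken from the replicate means. The function name and the LnP(D) values are hypothetical and are not output from this study.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Evanno-style delta K from replicate LnP(D) values.

    `lnp_by_k` maps each K to a list of LnP(D) values, one per replicate run.
    Delta K is only defined for K values with both neighbours (K-1 and K+1).
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue
        # Absolute second-order rate of change of the mean log probability.
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        spread = stdev(lnp_by_k[k])
        if spread == 0:
            raise ValueError(f"zero standard deviation at K={k}")
        result[k] = second_diff / spread
    return result


# Hypothetical replicate runs (two per K), not values from the thesis.
runs = {1: [-100, -102], 2: [-80, -82], 3: [-50, -52], 4: [-49, -51], 5: [-48, -50]}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(best_k)
```

Because the statistic needs both neighbours, ΔK cannot be evaluated at the smallest or largest K tested, so it cannot by itself identify K = 1 as the best solution.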

[Figure 3.17: panels (a) and (b) plot LnP(D) against K; panels (c) and (d) plot ΔK against K.]

Figure 3.17. Results of Bayesian Cluster Analysis: Mean Log probability of the data (LnP(D)) ±SD over replicates of each K value for (a) the entire data set and (b) four sites in or adjacent to the Chao Phraya River Drainage. Lower plots show ΔK for (c) entire data set and (d) reduced number of sites analysed in (b). The modal value of each ΔK distribution indicates the best number of groups identified in the analyses (3 in both cases).

[Figure 3.18: three bar graphs of individual cluster membership, labelled by sample site and by drainage basin of origin (Chao Phraya, Mekong, Malaysia, Sumatra).]

Figure 3.18. Bar graphs of membership coefficients estimated from Bayesian Cluster analysis. Each individual is represented by a horizontal bar partitioned into three sections which correspond to cluster membership, with each colour representing a different cluster. Sample site names are indicated to the left of the bar; drainage basin of origin is indicated to the right of the bar. LEFT GRAPH presents results of reduced analysis of Chao Phraya and neighbouring sites. CENTRE and RIGHT graphs show results of analysis of entire data set.

Broad scale analysis also identified three optimum clusters, although these groups did not correspond with the clusters identified in the small scale analysis, where SB and Chao Phraya sites were assigned to different groups, apparently reflecting population structure at a lower hierarchical level (Evanno et al. 2005).

Membership probabilities for each individual from each population are illustrated in the two right hand graphs in Figure 3.18. Samples taken from sites VV, VT and SM in the mid-upper Mekong constitute a single cluster (illustrated in dark orange). This corresponds with the pattern observed in the FCA, and also to one of two main clades identified in the NJ tree. A second cluster, coloured light orange, constitutes the majority of individuals from all Mekong Basin sites downstream of the dark orange cluster, with the notable exception of site LL in the Vietnamese central highlands, where individuals have a higher probability of clustering with the third group. This third cluster, indicated in pale yellow, is mostly represented by individuals to the west of the sampled range, encompassing SB in the upper Mekong, both Chao Phraya Basin sites, and TK and LP to the south of the sampled region. Site LL, and to a lesser extent a few individuals from GL and the lower Mekong (at sites KC, TT, TL and PH), also clustered with this predominantly western group. Surprisingly, some mtDNA haplotypes recovered from the Vietnamese highlands were quite closely related to Chao Phraya types (haplotypes 58 & 59), as were Malaysian haplotypes 49 & 50. These haplotypes all centre around the SB-Chao Phraya haplotype 44, which was close to the centre of the EA clade network (Figure 3.8), suggesting that ancestral diversity (at least for the EA clade) could have been retained in individuals in the Vietnamese highlands.

Some individuals from site TT (Mekong delta) could not be assigned with high probability to any cluster. Generally, sites in the delta had high numbers of alleles at each locus as well as high allelic ranges. Site TT was especially high in these respects, and this may explain the lack of confidence in cluster assignment for individuals from this site. Overall, however, Bayesian cluster analysis confidently assigned the majority of individuals to a single cluster, and within almost all sites there was very little evidence for the presence of multiple clusters. Among the most admixed sites were those of the Chao Phraya, which were inferred to hold a significant element of mid-lower Mekong diversity (light orange cluster).

Distributions of the two clades identified in the mtDNA analysis were largely congruent with the results of the microsatellite analysis, both of which identified a significantly different cluster of genotypes in the upper Mekong Basin. However, while both markers identified what could be classified as two C. striata lineages, some individuals appeared to be hybrids between the two. FLOCK analysis was employed specifically to investigate the extent of hybridisation among lineages and to generate an admixture map across sampled locations based on microsatellite variation. FLOCK classified 579 individuals in one reference group, and the remaining 75 as the other. The smaller group represented the restricted clade, and was made up of all individuals from sites VT, VV, and SM, as well as four individuals from SB and one from CM (this CM individual was not the same as the one formerly assigned to SB in the small scale cluster analysis). Average log-likelihood differences among reference groups for each site are presented in Figure 3.19. These values roughly equate to a hybrid index, with sites falling close to zero indicating admixture. Sites VV and VT show little if any gene introgression with widespread types, and, at least considering the sampling implemented here, these sites appear to be the centres for distribution of the restricted lineage. Site SB directly upstream and site SM downstream show the most evidence for hybrid introgression between the two lineages, while all other sites except perhaps for CM show no real signature of introgression.
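
The logic of a log-likelihood difference used as a hybrid index can be sketched as follows. This is not FLOCK’s implementation: the sketch assumes Hardy-Weinberg genotype probabilities within each reference group, a small floor frequency for alleles unseen in a group, and hypothetical locus names and allele frequencies.

```python
from math import log10

FLOOR = 0.001  # assumed minimum frequency for alleles unseen in a group


def genotype_log_likelihood(genotype, group_freqs):
    """log10 likelihood of a multilocus genotype under Hardy-Weinberg
    proportions, given per-locus allele frequencies of a reference group."""
    total = 0.0
    for locus, (a, b) in genotype.items():
        p = max(group_freqs[locus].get(a, 0.0), FLOOR)
        q = max(group_freqs[locus].get(b, 0.0), FLOOR)
        total += log10(p * p if a == b else 2 * p * q)
    return total


def llod(genotype, widespread_freqs, restricted_freqs):
    """Positive values favour the widespread group, negative the restricted."""
    return (genotype_log_likelihood(genotype, widespread_freqs)
            - genotype_log_likelihood(genotype, restricted_freqs))


# Hypothetical single-locus reference frequencies (allele sizes in bp).
widespread = {"locus1": {150: 0.9, 152: 0.1}}
restricted = {"locus1": {150: 0.1, 152: 0.9}}

for geno in [(150, 150), (150, 152), (152, 152)]:
    print(geno, round(llod({"locus1": geno}, widespread, restricted), 3))
```

In this toy case the two homozygotes score strongly positive and strongly negative, while the heterozygote scores close to zero, which is the behaviour that allows intermediate site means to be read as admixture.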

[Figure 3.19: mean LLOD (y-axis, -20 to 20; positive values = widespread, negative values = restricted) plotted for each sample site.]

Figure 3.19. Admixture levels between C. striata forms across SE Asia. Graph shows the mean log-likelihood difference (LLOD) between the two reference groups obtained by FLOCK across all sample sites assessed for microsatellite diversity in this study. Sites with high positive log-likelihoods belong to the widespread reference group, sites with high negative values belong to the restricted reference group. Sites with intermediate LLOD values are admixed.

When mtDNA clade membership was compared with microsatellite LLOD at the individual level (Figure 3.20), a signature of gene introgression was evident, although not widespread. The small number of individuals sampled downstream of site SM that had MM mtDNA haplotypes cluster with EA types from their site when microsatellite LLOD scores were considered. This is strong evidence that these locally rare MM mtDNA individuals do not represent long distance dispersers from source MM populations (as in this case LLOD scores would indicate low log-likelihood of assignment to the widespread microsatellite lineage), and hence provides evidence that the two forms are interbreeding at a local scale where they co-occur. This pattern was mirrored for the single mtDNA EA individual detected in the MM clade at VT, which clearly assigns to the restricted lineage when microsatellite data were considered. A signature of admixture was particularly evident at site SB, where LLOD scores clustered around zero. This sample comprised roughly 90% EA mtDNA. Figure 3.21 illustrates the geographical distribution of mtDNA and nDNA groups.

[Figure 3.20: individual LLOD scores (y-axis, -25 to 25; positive values = widespread, negative values = restricted) plotted by site group (Chao Phraya, Mekong upstream to downstream, Vietnamese highlands, Sumatra, Malaysia); markers distinguish the EA and MM mtDNA clades.]

Figure 3.20. Admixture analysis for C. striata individuals genotyped for both microsatellites and mtDNA. Log-likelihood difference scores (LLOD) based on microsatellite analysis, marker type indicates mtDNA genotype, for EA clade n = 504, for MM clade n = 51.

Figure 3.21. The distribution of genetic diversity for C. striata in SE Asia. (a) pie graphs indicate frequency of each mtDNA clade at each sample site (cream for EA clade and pink for MM clade, see Figure 3.9), (b) bar graphs show Bayesian population cluster membership for each site inferred by Bayesian inference of microsatellite data, colours correspond to Figure 3.18.

DISCUSSION

Two C. striata forms in mainland SE Asia

Results here confirmed the presence of two genetically distinct forms of C. striata in mainland SE Asia that are characterised by divergent sets of unique mtDNA haplotypes and different suites of alleles at different frequencies for nDNA microsatellite loci. Both sets of markers provided largely congruent outcomes when classifying individuals to their respective forms. Based on mtDNA, both groups were only distantly related to C. striata in southern India; however, the widespread East Asian clade (EA) appears to have shared a common ancestor with the Indian clade more recently than either group had with the restricted Mekong clade (MM).

The two forms both occur in the Mekong River Drainage Basin, yet the geographical zone of overlap between them is quite limited. While the widespread form (characterised by EA mtDNA haplotypes) occurred broadly across mainland SE Asia and extended into the Indonesian archipelago, the restricted form (characterised by MM mtDNA) has only a limited distribution that is centred around Vientiane (Site VT) on the northern Thai-Lao PDR border. Individuals with MM mtDNA haplotypes were found occasionally both upstream and downstream from MM dominated sites (for example at SB and ST), and there was limited evidence for nDNA genes that may have introgressed into the population sampled in the upper Chao Phraya (at site CM, see Figure 3.18). There was also evidence that the widespread clade occurred at low frequency within the centre of the restricted form’s distribution, with one individual possessing an ancestral EA haplotype (haplotype 44) detected among the 42 individuals sampled at Vientiane.

It is clear that where the two forms occurred in sympatry they were interbreeding, as rare individuals that were detected outside the main distribution of their lineage (identified based on mtDNA haplotype) could not be distinguished from other individuals sampled at their site when nDNA was examined (Figure 3.20). This indicates not only that hybridisation has occurred, but that hybridisation has resulted in ongoing gene exchange, at least for two generations (F2) but probably for many more generations considering that eight nDNA loci were examined. Analysis of nDNA revealed that Sayaburi (Site SB) had the highest admixture level of all sites surveyed (Figure 3.19), and at this site the gene pool (both for mtDNA and nDNA) was a mix of the two forms. In other sites where rare mtDNA haplotypes indicated admixture, nDNA introgression was less obvious. This may be primarily because microsatellite alleles that characterised the restricted form were less variable than those of the widespread form, and typical ‘restricted’ alleles were generally also present in relatively high frequencies in individuals of the widespread clade right across the study area. Only at two loci (Cs-7 and Cs-8) were alleles found that were present exclusively in the centre of the restricted form’s distribution (sites VV, VT and SM).

It is possible that the relatively narrow zone of overlap between the two lineages is the product of hybrid breakdown (Ohta 1980; Templeton 1981), preventing stepwise introgression of divergent lineages into populations further from the contact zone over successive generations. Given, however, the sedentary ecology of C. striata and the high and significant differentiation observed among all populations, the narrow zone of overlap is more likely to reflect an intrinsically low tendency to disperse rather than strong selection for independent co-adapted gene complexes. If hybrid breakdown were occurring, a) MM mtDNA haplotypes should not be observed at KJ and ST given nuclear diversity at these sites, and b) the population at Sayaburi, where extensive introgression was observed, should be undergoing some kind of catastrophic decline as independent lineages are swamped by maladapted genes. This is unlikely; rather, dispersal ecology has limited the geographical extent of introgression since historically recent recontact.

Given the distribution of the restricted form, with few individuals detected in sites distant from others where MM haplotypes were common, their apparent absence at Mukdahan (site MD) was surprising. Mukdahan lies directly downstream of Vientiane and directly upstream of Kong Jeam (site KJ), where three of the ten individuals sampled possessed MM haplotypes, and below which a single MM individual was detected from 47 samples at Stung Treng. It is reasonable to expect that individuals at site MD would represent a mix of haplotypes at some frequency intermediate between directly upstream (essentially 100% MM clade) and downstream sites (30% MM clade). This was not the case, as no MM haplotypes were observed among the 21 individuals assessed for mtDNA diversity at this site. This pattern could be explained in a number of ways. First, and perhaps most obviously, the sample taken at Kong Jeam (KJ) may not be truly representative of the frequency of haplotypes that occur at this site. Nine of the ten C. striata individuals sampled at Kong Jeam were juveniles of the same size (approx. 15 cm) and were sold by the same market vendor. They could have represented the offspring of only a few parents (four mtDNA haplotypes were detected at this site in total). Alternatively, the mtDNA screening method (TGGE-OHA) may have failed to detect MM haplotypes at Mukdahan if they were present. The method requires individuals to be run separately with two outgroups in order to positively identify them as members of the MM mtDNA clade. In screening samples from Mukdahan, eight individuals failed to produce adequate banding with either outgroup and so were discarded from further mtDNA analysis, although they were retained for nDNA analysis. It is possible that the screening method had failed, and that these individuals were truly members of the MM clade but had remained undetected. This fails to consider, however, the results of nDNA clustering for all 29 individuals from Mukdahan, which found no evidence for membership of the restricted form, suggesting that the absence of MM haplotypes in the data set truly reflects their absence at that site.

Another possible explanation for the pattern observed involves the geomorphology of the mid-Mekong region. A major Mekong sub-basin, the Mun River, drains the bulk of northeastern Thailand, channelling small and medium rivers to meet the Mekong at a confluence at Kong Jeam at the SE margin of the Khorat Plateau. Directly above the Mun sub-basin, another smaller Mekong tributary, the Songkhram River (including site SM) drains the northern extent of the Khorat Plateau, meeting with the Mekong approximately 300km upstream of Kong Jeam. The individuals collected from site SM all belonged to the restricted MM form. Mukdahan lies on the bank of the main Mekong channel between the two confluences, on a section of the Mekong that is fast flowing, with several long underwater canyons more than 100m deep (Rainboth 1996a). As C. striata is known to prefer slow moving shallow water, dispersal may be more likely to occur across the rice fields and drainage networks between northeastern Thai watersheds on the Khorat Plateau than along the adjacent section of the main Mekong channel. In this way, over generations, individuals of the restricted group could have dispersed southward across the Greater Mekong Basin while not following a colonisation route directly along the river course.

The occurrence of two distinct divergent lineages is strong evidence that the two groups have been evolving independently in isolation for a long period of evolutionary time (at least 3.5, probably closer to 8 million years, see previous chapter). Moreover, the overlapping distributions of contemporary populations were almost certainly the product of relatively recent secondary contact. It is likely that isolation was maintained until recent times by some long standing barrier to dispersal, such as a mountain range between drainage basins, a permanently arid expanse of untraversable terrestrial habitat, or extremely large geographical distances. In the case of the latter, only a long range dispersal event of the sort mediated by human translocation is likely to have brought the two lineages into contact (as has been the case among divergent populations of many freshwater fishes including cutthroat trout (Gyllensten et al. 1985) and pupfishes (Echelle & Connor 1989)). This is unlikely, however, as the diversity and divergence observed among individuals of both forms did not carry a signature of colonisation (recent expansion) postdating a founder event from a small number of individuals.

Where one form inhabits a small area of another form’s natural distribution, one explanation could be that the restricted form has been a recent arrival that has not yet had sufficient time to disperse across the entire geographical range. This would place the restricted form as the recent arrival, but evidence is insufficient to positively identify a likely source population. This group is unlikely to have come from the west of the species range, as it is the widespread form that groups most closely with western C. striata (from southern India). In all southern sites including Sumatra, individuals belong to the widespread form. By a process of elimination, this suggests a northeasterly origin for the restricted clade. The distribution of C. striata extends northward into southern China and east to the small coastal rivers of central Vietnam. Prior to the Pleistocene, the Mekong was not a major river, and it is possible that with its formation historical geographical barriers to gene flow were removed and the restricted form was able to disperse south. Alternatively, the contemporary range of the restricted form may have remained essentially unchanged, but with the Mekong’s formation, the habitat of the formerly isolated group was simply amalgamated into the much larger range of widespread C. striata. The area where the restricted form was detected is the northernmost region of the Khorat Plateau upland (Gupta 2005a), characterised by narrow river valleys bounded by steep mountains (Gupta 2004; Kiernan 2009). To the east, the Annamite mountains form a chain of almost unbroken 2000 m peaks from southern China to Cambodia (Gupta & Liew 2007), while to the north, the geography is dominated by mountains and limestone karsts (Gupta 2005a). This suggests that vertical relief could have been an important factor that may have isolated the restricted form in the past.
Additional sampling to the west of the Annamite range and north near the Lao PDR-China border may reveal the extent of the restricted form’s distribution and help pinpoint the possible centre of origin for this group.

The current data, however, did suggest that ancestors of this form were isolated from other C. striata well before the widespread and Indian groups diverged, either as the result of an early colonisation event from the west that was followed much later by westward dispersal of ancestors of the widespread form, or via some geomorphological event that isolated the north-restricted form from all other C. striata in southern and SE Asia.

Phylogeography of the widespread form

Both mtDNA relationships and nDNA cluster analysis were used to explore the micro-evolutionary history of the widespread C. striata form across greater SE Asia. Ancestral haplotypes occupying a central position in the mtDNA network were found primarily in the Chao Phraya drainage and the upper Mekong at site SB. Support from multiple lines of evidence suggests that until relatively recently (in evolutionary time), the upper Mekong flowed directly south into the Siam River drainage (Yom/paleo-Chao Phraya). This evidence comes from a combination of genetic studies that have recognised similarity of genotypes in freshwater species between the two regions (Adamson et al. 2009; Hara et al. 1998), overlapping distributions of species and genera across the upper Chao Phraya and upper Mekong (Rainboth 1996b), and geological evidence (Brookfield 1998; Gupta 2005b).

The Siam River appears to be the ancestral centre from which the widespread C. striata form migrated across SE Asia. To the south, the diversity of Malaysian individuals represents recent divergence from the ancestral Chao Phraya types that probably arose as migrants dispersed into the peninsula via a network of rivers that drained into the Gulf of Siam in past times of lower sea level. Further south, Sumatran individuals showed much greater divergence from Chao Phraya types but their most recent common ancestor was also a close relative of the Chao Phraya C. striata. Sumatran and Malaysian individuals each shared a common ancestor with Chao Phraya individuals more recently than they did with each other, suggesting that the two groups could represent different historical dispersal events, possibly occurring in association with different episodes of sea shore regression during the Pliocene (Woodruff 2010; Woodruff & Turner 2009).

Another region where the ancestral signature is apparent is the Vietnamese Central Highlands, far to the east of the Chao Phraya River drainage. These sites, which drain west into the Mekong Basin but lie in elevated upland headwaters, retain both an ancestral mtDNA signature and nDNA similarity with samples from Chao Phraya and southern sites. These individuals may, at least in part, be relicts of a first wave of migration that swept east from the Chao Phraya, either across Cambodia and into Vietnam, or quite possibly following a more southerly route along the northern gulf coast in times of sea shore regression.

For sites in the mid-lower Mekong Basin, both mtDNA and nDNA reveal more recent population processes. The pattern of diversity in mtDNA haplotypes across the region suggests that most Mekong individuals downstream from the Songkhram River (site SM) share relatively recent common ancestors, and Bayesian population assignment of nDNA genotypes also groups downstream Mekong sites into a single population. Data presented here suggest that these closely related populations descended from a population expansion in the Lower Mekong that probably took place around 200,000 - 300,000 years ago. This timing is not consistent with the hypothesis that the EA clade underwent significant expansion following the formation of the freshwater Tonle Sap Great Lake and Cambodian floodplains [mid-Holocene, ca. 6 Kya (Penny 2006; Rainboth 1996a)]. Instead, it corresponds to a period when major rivers drained the exposed Sunda and Sahul shelves (Voris 2000), providing increased freshwater habitat and greater freshwater connectivity in comparison with present day freshwater systems in SE Asia.

The data also suggest that there has been very limited gene flow between the lower Chao Phraya and Lower Mekong in the time following this hypothesised historical expansion. In the mtDNA data, haplotypes 53-57, which are most closely related to more recently evolved (less internal) Chao Phraya haplotypes, occur only in the lower reaches of the Cambodian Mekong Basin south of Kratie (site KK). Conversely, clustering of nDNA genotypes assigns some Chao Phraya individuals to the predominantly mid-lower Mekong sub-population. It is possible that freshwater connections existed between the Chao Phraya and Tonle Sap / lower Mekong during Pleistocene sea shore regression. In general, however, it appears that diversity originating from a hypothesised past C. striata population expansion in the Lower Mekong has had insufficient time or opportunity to disperse more fully into upstream populations and neighbouring drainages.

Fine scale differentiation

Perhaps the most significant finding here with regard to wild C. striata diversity was that all sample sites in the Mekong showed significant differentiation. In general, the number of mtDNA haplotypes found at each site was low, yet in the 23 sites sampled across the Mekong Basin, a total of 56 unique haplotypes were detected. Fifty percent of all Mekong haplotypes detected occurred only at single sites. This means that although the Mekong population as a whole may represent a diverse gene pool, at the scale of local sub-populations, gene flow is likely to be very restricted, with little genetic exchange occurring across the system. This conclusion is supported by the results of the nuclear DNA analysis, which found significant differences among most sites, and provides strong evidence that if any translocation by humans has occurred, it has not homogenised genetic diversity across the Mekong Basin. The level of local differentiation is not surprising given the ecology of the species. Adults are known to be mainly sedentary, moving only short distances between permanent freshwaters and temporary wetlands (Amilhat & Lorenzen 2005). The species is thought to prefer shallow water, and is perhaps reluctant to disperse widely across the major channel of the Mekong, which is fast flowing and can in parts be as deep as 60-90 m (Rainboth 1996a; Viravong et al. 2006). Furthermore, as adults exercise some level of parental care (Lee & Ng 1994), larvae are likely to be retained at the natal site, and hence dispersal is less likely to occur at this life history stage than for species with a pelagic larval phase. Despite C. striata’s ability to move overland and to quickly colonise new habitat, these other life history traits imply that the pattern of high differentiation observed here probably results from naturally low dispersal levels. When considered in light of the possible population expansion of the EA clade in the mid-lower basin 338-169 Kya, the high differentiation between sites may also, in part, be attributed to ancestral founder events that could have led to establishment of local populations from a biased subset of the total variation present in Mekong populations.

The high differentiation observed in the Mekong was not, however, observed to the same degree for the two Chao Phraya Basin samples. Although ΦST analysis showed significant mitochondrial differentiation between CM and CP, nDNA analysis, both in pair-wise estimates of differentiation and in cluster assignment, revealed that the two sites could be considered as a single sub-population. A number of factors could be responsible for generating this pattern. It is possible that gene flow has historically been higher in the smaller Chao Phraya Basin. It is also possible that in relatively recent times humans have moved C. striata across this region. In Thailand, C. striata was ranked among the top eight freshwater aquaculture species in 2004 (Thongrod 2007), and large shipments of live snakeheads are moved routinely across the country, especially from large operations near Bangkok to major population centres in the north (personal observation). A number of escapes from culture could have resulted in genetic homogenisation across the Basin. MtDNA diversity suggests, however, close ancestral ties between the upper Chao Phraya site (CM) and the geographically proximate upper Mekong site (SB). Site SB is located far from trade routes and population centres, and is therefore unlikely to have been either the recipient or donor of C. striata exports. This suggests that genetic diversity at site CM and across the Chao Phraya could maintain an ancestral signature that has not been swamped by recent migration events. Sites CM and CP had the highest average allelic richness of any site surveyed here, which would indicate a historically large stable population size. This may explain the similarity between the two sites, as a historically large stable population inhabiting a relatively steady and discrete geographical range may have had time to accumulate more diversity and to transfer genes across the drainage, even with low numbers of migrants per generation (Mills & Allendorf 1996).

Conclusion

Phylogeographic and population genetic analysis of C. striata populations across SE Asia revealed two distinct genetic forms based on both mitochondrial and nuclear DNA loci. Where the forms occurred in sympatry there was evidence of genetic introgression; however, the geographic zone of overlap between forms was narrow and confined to the upper Mekong basin. One form, referred to here as the widespread East Asian form, was distributed across mainland SE Asia and the Sunda islands (Sumatra). Diversity in individuals of this group indicated that population sizes expanded in the lower Mekong region during the Pleistocene. The pattern of phylogeographical structuring in the upper Mekong and Chao Phraya probably reflects historical drainage geomorphology rather than contemporary overland dispersal, especially as the overall pattern of high differentiation among sites indicates that contemporary dispersal is extremely limited. This implies that, despite the species’ ability to breathe air and traverse terrestrial habitat, these ecological traits are unlikely to result in significant levels of inter-drainage dispersal.

Chapter 4 Patterns of genetic diversity and phylogeography of Channa micropeltes in the Mekong River Basin

Phylogeography of C. micropeltes

INTRODUCTION

The giant snakehead, C. micropeltes (Cuvier, 1831), grows up to 1 m in length and can weigh over 20 kg, making it the largest by weight of all Asian snakehead species (Courtenay & Williams 2004; Rainboth 1996a). The natural distribution of C. micropeltes encompasses mainland SE Asia including the Mekong, Chao Phraya and Peninsular Malaysia, as well as the Greater Sunda Islands of Sumatra and Borneo (Courtenay & Williams 2004; Kottelat 1985).

Across SE Asia C. micropeltes is a popular and highly priced food fish. As a consequence, harvesting of wild C. micropeltes represents a significant income source for many fishermen, and this species is caught in large numbers across the region (Ambak & Jalal 2006; Baran et al. 2001). In the Lower Mekong Basin, C. micropeltes is one of the most important “black” fish (Campbell et al. 2006; van Zalinge et al. 2003), accounting for up to 20% of all fish harvested in large scale fishing lot systems (Lim et al. 1999; Rot unpublished) and 7% of the total catch from middle-scale fisheries in the Tonle Sap Great Lake (Hai Yen et al. 2009).

Over the last 20 years, fishing pressure on C. micropeltes stocks in and around the Tonle Sap Great Lake has increased significantly, and assessments of catch per unit effort indicate that C. micropeltes catches over this period have declined (Enomoto et al. 2005; Hai Yen et al. 2009). In other parts of the Lower Mekong Basin, altered flow regimes have been implicated in local C. micropeltes population declines, for example in northeastern Cambodia where water released from a new dam upstream in the Vietnamese Central Highlands is thought to have caused massive destruction of C. micropeltes nests by washing away eggs (Baird & Mean 2005). Since 1996, when dam construction first altered stream flow in the Sesan River (Mekong Basin), local catches of C. micropeltes have declined to 10% of pre-dam levels. C. micropeltes nevertheless remains one of the most important wild fishery species in the Lower Mekong Basin (van Zalinge 2002).

C. micropeltes is also a very popular and important aquaculture species across SE Asia (Campbell et al. 2006; Ingthamjitr et al. 2005; Onrizal et al. 2005; So & Haing 2007). In Cambodia, C. micropeltes is the dominant cage culture species, accounting for 77.8% of all aquaculture cages in 2004 (So & Haing 2007). So far, this culture industry has relied exclusively on wild fingerlings to stock cages, with 15 million fingerlings harvested from the wild to support the culture industry in Cambodia in 2004 alone (Phillips 2002; So & Haing 2007).

Ecology

C. micropeltes is found in still or slow flowing water but, unlike C. striata, this species prefers deeper water bodies and is more abundant in reservoirs, lakes, and rivers. It is not often found, however, in rice fields or other highly disturbed aquatic habitats (Ambak & Jalal 2006). In the Mekong Basin C. micropeltes prefers lake (lentic) over river (lotic) habitats (Lim et al. 1999). Although C. micropeltes possesses auxiliary air-breathing organs, adults of this species are unable to traverse terrestrial habitat, as they are restricted by their weight and body shape (Courtenay & Williams 2004).

C. micropeltes matures at approximately 40 cm (Ambak & Jalal 2006) and spawns in flood plain areas (Lim et al. 1999). After spawning, eggs and then juvenile fish are guarded aggressively by both parents, probably until offspring become demersal (Lee & Ng 1994; Wee 1982). Juveniles travel in schools and are often seen travelling close to the surface of the water catching insect prey (Lee & Ng 1994). C. micropeltes is a carnivore, with enlarged “knifelike” canine teeth (Ng et al. 1994), and sub-adult and adult fish feed diurnally in packs (Courtenay & Williams 2004; Wee 1982). Common prey items include other fish, small birds and frogs, and the species is known as a voracious predator (Courtenay & Williams 2004; Wee 1982).

Across the Mekong Basin, colour patterns of adult C. micropeltes vary (personal observation; see Plate 4.1). Fish vendors attribute this variation to the type of habitat individuals occupy, with individuals from less turbulent water showing more uniform (less blotchy) colour patterns.

Plate 4.1 Phenotypic variation in Channa micropeltes collected across the Mekong River Basin (this study). Panels: Mun River (juvenile), Thailand; Stung Treng, Cambodia (two panels); Tra Su, Vietnam; cultured fish, Vietnam.

Understanding C. micropeltes diversity in a regional context

The contemporary distribution of C. micropeltes across mainland SE Asia and the Sunda islands (Figure 4.1) suggests that contemporary populations could represent fragments of an historically large contiguous Pleistocene distribution that extended into major rivers on the Sunda shelf. Other species with similar distribution patterns are thought to have dispersed across the region in times of sea shore regression (Dodson et al. 1995; McConnell 2004; Rainboth 1996b). Recent records indicate, however, that at least part of the contemporary C. micropeltes range, the tip of the Malay Peninsula, was colonised only recently (Alfred 1966), probably as a result of human introduction. Additionally, populations of C. micropeltes in Malaysia may have undergone recent expansions where river impoundments have created large reservoir habitat (Ambak & Jalal 2006), and this may also be true for populations in the Mekong, including those in the Nam Ngum (Lao PDR) and Sirinthorn (Thailand) Reservoirs.

Figure 4.1. Natural geographical range of C. micropeltes in SE Asia. Extended Pleistocene drainage basins shown after Voris (2000).

Results presented earlier (Chapter 2) indicated that C. micropeltes individuals sampled from the Mun River (Mekong Basin, eastern Thailand) and the Vietnamese Mekong Delta were not genetically divergent. This information alone, however, was insufficient to establish the extent of population structure for this species across the river basin. More intensive sampling and examination of a suite of fast evolving DNA markers was required to reveal the distribution patterns of genetic diversity for C. micropeltes populations across the river system and to identify subdivided populations where they exist. In addition, these data can provide information about the effects of historical population processes, including gene flow and population size changes, which, when interpreted in parallel with ecological and geographical information, can provide insights into factors that have been important in determining the contemporary structure of C. micropeltes populations across the Mekong River Basin. Understanding how populations are structured is important, as management strategies that account for natural population structure can contribute to long-term population sustainability. As C. micropeltes represents a very important wild fishery resource and natural populations are currently experiencing intense pressure from harvesting, managing natural Mekong populations in a sustainable way will be essential for maintaining livelihoods and providing a reliable source of animal protein for local people, especially in the Lower Mekong Basin in Cambodia.

Aims of this chapter

This chapter aimed to characterise the levels and patterns of genetic diversity in wild C. micropeltes populations across the Mekong River Basin, focusing in particular on the Cambodian Mekong where C. micropeltes constitutes an important wild fishery resource. Mitochondrial DNA and nuclear microsatellite data were interpreted in a geographical, demographic and historical context to establish relationships between samples collected across the river basin and to infer levels of gene flow and population structure for this species in the study area.

Specific questions addressed include:

• How was genetic variation distributed across the Mekong River Basin?
• How did levels of divergence in mtDNA and microsatellite DNA compare?
• Could the samples collected across the river basin be assigned to groups representing sub-populations?
• How did observed patterns of population structure and phylogeography relate to the known ecology and life history of C. micropeltes?
• Was there evidence for recent demographic changes, and if so, could these be associated with recent anthropogenic impacts such as overfishing?
• How do levels of diversity and phylogeographical structuring for C. micropeltes compare with those for C. striata across the same geographical region?

METHODS

Sample Collection.

Wild C. micropeltes individuals were collected across the Mekong River Basin, with a particular focus on Cambodia (the Lower Mekong Basin), where this species represents a major wild fishery. The majority of fish were sampled from local markets. Before collection, all fish sampled were confirmed to have come from the wild and their capture location was noted. In one instance, fish were sampled directly following capture from Melaleuca wetland habitat in the Tra Su National Park, Vietnam (Site TS). At the time of sampling, fin tissue was abscised from the caudal, pectoral, or dorsal fin and samples were sealed individually in vials of 75% ethanol. Full sampling details are presented in Table 4.1, and illustrated in Figure 4.2.

Table 4.1. Sampling details for C. micropeltes, including geographic location of collection and sample sizes for mtDNA and nDNA analyses. Under River Section: UMB = Upper Mekong River Basin, MMB = Middle Mekong River Basin, and LMB = Lower Mekong River Basin.

Location and date of capture | River section | Site code | mtDNA (n=75) | nDNA (n=280) | Lat/Long (approx)
--- | --- | --- | --- | --- | ---
Luang Prabang, Lao PDR. Aug 2006. | Khan R: UMB | LP | 1 | 1 | 19°52.69’N 102°07.36’E
Houay Pamom, Tha Heua, Lao PDR. Aug 2006. | Nam Ngum Reservoir: UMB | TH | 1 | 1 | 18°46.73’N 102°30.69’E
Central Khorat Plateau, Thailand. Nov 2005. | Mun-Chi sub-basin: MMB | MC | 5 | 12 | 15°11.00’N 104°40.00’E
20-30kms from Stung Treng, Cambodia. April 2007. | Lower Sekong River: LMB | ST | 11 | 49 | 13°31.73’N 105°58.26’E
Kratie, Cambodia. April 2007. | Main Mekong Channel: LMB | KK | 10 | 50 | 12°29.10’N 106°01.02’E
Kampong Cham, Cambodia. April 2007. | Main Mekong Channel: LMB | KC | 9 | 9 | 12°29.10’N 106°01.02’E
10-15kms from Kampong Chhnang, Cambodia. April 2007. | Tonle Sap River: LMB | KH | 10 | 50 | 12°15.27’N 104°40.13’E
40km from Pursat, Cambodia. April 2007. | Tonle Sap Great Lake: LMB | PS | 10 | 50 | 12°32.32’N 103°55.10’E
Battambang, Cambodia. April 2007. | Tonle Sap Great Lake: LMB | BB | 10 | 50 | 13°07.59’N 103°12.69’E
Tra Su National Park, Tinh Bein, An Giang, Vietnam. Feb 2007. | River Delta, LMB | TS | 3 | 3 | 10°35.09’N 105°03.54’E
Tram Chim, Tam Nong, Dong Thap, Vietnam. Feb 2007. | River Delta, LMB | TC | 5 | 5 | 10°40.20’N 105°33.57’E

Figure 4.2. Map of the Mekong River Basin showing geographical locations of sample sites for C. micropeltes; neighbouring river drainages are not shown. (a) shows individual sites (see Table 4.1 for full site names), and (b) shows political borders for Mekong riparian countries.

Most C. micropeltes samples were taken from relatively unmodified freshwater habitats, including natural lakes and large rivers. Fish collected at two sites, however, were caught in artificial wetlands. These included one individual from Houay Pamom (site TH), and three individuals from Tra Su National Park (site TS). Houay Pamom lies on the banks of the Nam Ngum Reservoir, a large flooded valley formed by the closure of the Nam Ngum Hydropower Dam in the early 1970s (Claridge 1996; van Zalinge et al. 2003). The Tra Su flooded forest was established in 1980 to combat acidic soil in rice fields, with all resident fish having resulted from natural colonisation in the intervening 20 years before dykes built in 2000 stopped fish movement in and out of the sanctuary (personal communication with Tra Su National Park staff, Vang Giao Commune, Tinh Bein, 2007).

DNA Marker Selection

Two types of molecular markers were chosen to characterise diversity in C. micropeltes: the mtDNA Cyt b gene and microsatellite loci specific to C. micropeltes. The same suites of markers were chosen for both C. striata and C. micropeltes because a) both studies addressed essentially similar research questions, for which mtDNA and microsatellite markers were appropriate, and b) the resulting data sets would be directly comparable, simplifying comparisons of phylogeographic patterns between the two species.

Molecular techniques – Mitochondrial DNA

DNA was extracted from each sample following a standard Salt Extraction Procedure (Miller et al. 1988; Appendix 1).

PCR amplification and direct sequencing

Mitochondrial variation was determined for 75 individuals across the 11 sample sites; see Table 4.1 for sample sizes for each site. For each individual, an 832 base pair region of the mtDNA genome encompassing 22 bases of the Glutamate tRNA gene and 810 bases at the start of the Cyt b gene was amplified with the primers GLUDG-L and CB3-H (Palumbi et al. 1991). For full primer sequences and PCR conditions see Table 2.2 and Table 2.3.

After checking PCR amplification success, products were cleaned and sequenced with the GLUDG-L primer following the protocols outlined in Appendix 2. When unique haplotypes were identified from sequence data, a single replicate of each type was re-sequenced with the CB3-H primer to verify any mutations. All sequence reads were checked against chromatograms to ensure correct genotyping. Sequences were aligned using BIOEDIT software (Hall 1999), and sequence ends were trimmed to leave a 765 base pair fragment at the start of the Cyt b gene. This fragment was used in all mtDNA analyses.

Molecular techniques – Nuclear DNA

Isolation of microsatellite loci, primer design and PCR amplification

Microsatellites were isolated from C. micropeltes tissue provided by the Cambodian Department of Fisheries in 2005. Molecular techniques were performed by E. Adamson and V. Chand following the procedure outlined in Appendix 7 and employed previously by QUT researchers (Archangi et al. 2009; Chand et al. 2005). Five microsatellite loci were isolated with sufficient flanking regions to allow primer design; however, during early optimisations two loci proved to be monomorphic and so were discarded from the data collection process. Optimisation of PCR amplification for each locus followed the process outlined previously (Chapter 3). Table 4.2 presents information for each primer set.

Table 4.2. Microsatellite primers for C. micropeltes. Repeat motif is reported as observed in sequence used in primer design. Flanking region refers to the length in base pairs of the sequence within the priming region that is not composed of tandem repeats. Size range of PCR products is reported under Allele size, the corresponding number of tandem repeats is reported under Repeat range. For each primer set annealing temperature in degrees Celsius (Ta) and quantity (μL) of 25mM MgCl2 per 12μL were optimised for screening samples on the GELSCAN 3000 System. Loci Cm-4 and Cm-5 were monomorphic.

Primer | Sequences | Repeat motif | Flanking region | Allele size | Repeat range | Ta | MgCl2
--- | --- | --- | --- | --- | --- | --- | ---
Cm-1 | 5’-CAC GCA CCA AGT CTT TCA GA-3’ / 5’-ATG CAG GCA TGG TAA GAA CC-3’ | CA(11) | 131 | 149-207 | 9-38 | 55 | 0.4
Cm-2 | 5’-CAC TGT GCA GAT GTG GAG AAA-3’ / 5’-CTT TGC AAA AGC CCA GAG TC-3’ | AC(16) | 104 | 136-142 | 16-19 | 55 | 0.4
Cm-3 | 5’-AAG CAG AAA GTC ATT TAT GCT GTT T-3’ / 5’-AAT GGA TGA GCT GGA ACC TC-3’ | TG(16).CG.TG(4).CG.TG(7) | 93 | 151-167 | 29-37 | 50 | 0.2
Cm-4 | 5’-TTG ATG CAT TTG TGT GAG TCC-3’ / 5’-CCA TCT GCT TTC TCA GCA CA-3’ | GT(10) | 98 | 118 | n/a | 55 | 0.6
Cm-5 | 5’-TTA AGA AGA GGA GCG CCA AG-3’ / 5’-CAG CAG ATT GAA AGT GAT AAC ATA AA-3’ | AC(7).AG.AC(7) | 121 | 151 | n/a | 55 | 0.4
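
The relationship between the Flanking region, Allele size and Repeat range columns of Table 4.2 can be illustrated with a short sketch. It assumes that alleles differ only in the number of 2 bp repeat units (an assumption read from the dinucleotide motifs, not a description of how alleles were scored), and the function and dictionary names are hypothetical.

```python
# Flanking-region lengths (bp) for the three polymorphic loci in Table 4.2.
FLANKING_BP = {"Cm-1": 131, "Cm-2": 104, "Cm-3": 93}

# Assumption: every repeat unit is a dinucleotide (2 bp), as the motifs suggest.
REPEAT_UNIT_BP = 2


def repeats_from_allele(locus: str, allele_bp: int) -> int:
    """Return the number of tandem repeats implied by a PCR product size."""
    repeat_bp = allele_bp - FLANKING_BP[locus]
    if repeat_bp < 0 or repeat_bp % REPEAT_UNIT_BP != 0:
        raise ValueError(f"{allele_bp} bp is not a whole number of repeats at {locus}")
    return repeat_bp // REPEAT_UNIT_BP


# Smallest and largest allele sizes reported for each locus in Table 4.2.
ALLELE_RANGE_BP = {"Cm-1": (149, 207), "Cm-2": (136, 142), "Cm-3": (151, 167)}

for locus, (smallest, largest) in ALLELE_RANGE_BP.items():
    print(locus, repeats_from_allele(locus, smallest), repeats_from_allele(locus, largest))
```

Under this assumption the computed values reproduce the Repeat range column (9-38, 16-19 and 29-37 repeats for Cm-1, Cm-2 and Cm-3 respectively).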

Mass screening using real-time gel fragment analysis

Two hundred and eighty individuals were screened for allele size variation at three loci (Cm-1, Cm-2 and Cm-3) using the CORBETT Gel-Scan™ 3000 System (QIAGEN), as described earlier (Chapter 3) with full protocols outlined in Appendix 9. Allele sizes for each individual at each locus were scored from Gel-scan gel images using ONE-Dscan 2.05 software (SCANALYTICS). Figure 4.3 shows an example gel for the Cm-3 locus. Allelic genotypes were collated in MICROSOFT EXCEL before data were checked for the presence of null alleles, stutter scoring or genotyping errors with MICROCHECKER Version 2.2.3 software (Van Oosterhout et al. 2004).

Figure 4.3. Microsatellite gel image of locus Cm-3 showing all four alleles detected at this locus. (a) Digital image produced on the GELSCAN™ system showing molecular size standards “m” (edge labels: 200, 160, 150 and 139), and (b) a close-up of the same image showing scored alleles (edge labels: 167, 155, 153 and 151). Individual alleles are indicated in pink; numbers on edges correspond to allele sizes in base pairs.

Statistical Analyses

Mitochondrial DNA analysis

Diversity

To illustrate the relationship between Cyt b haplotypes a median-joining network was constructed with NETWORK software (Bandelt et al. 1999). To facilitate comparison with the C. striata data set three measures of diversity were calculated to describe variation in C. micropeltes mtDNA: haplotype diversity (Hd) (Nei 1987), nucleotide diversity (π) (Tajima 1983), and θS (Watterson 1975) using pair-wise distance. Each sample where n > 1 was tested for deviation from mutation-drift and gene-flow drift equilibrium using Tajima’s D (Tajima 1989), Fu’s FS (Fu 1997), and Ramos-Onsins and Rozas’ R2 (Ramos-Onsins & Rozas 2002). Diversity and neutrality statistics were also calculated for the pooled data set to obtain basin-wide parameters. All statistics were calculated using ARLEQUIN Version 3.1 software (Excoffier et al. 2005), except for Ramos-Onsins and Rozas’ R2, which was calculated using DnaSP Version 5 software (Librado & Rozas 2009). DnaSP was also used to calculate significance values for Tajima’s D and Fu’s FS using coalescent simulations (given θ, with 1,000 replicates run for each simulation). Where appropriate, significance values for each test were adjusted for multiple comparisons using the False Discovery Rate (FDR) procedure (Benjamini & Hochberg 1995; Verhoeven et al. 2005). As the data set included a number of very small sample sizes (where n < 5), a Spearman’s correlation was performed to examine the relationship between sample size and the number of haplotypes detected. This analysis was conducted using PASW STATISTICS (SPSS) software Version 18.

Population structure

Since some sample sizes were small and haplotypic variation was low, prior to analysis of genetic differentiation the statistical power of the FST analysis to correctly reject the null hypothesis of no differentiation was estimated using the program POWSIM (Ryman & Palm 2006). This program uses real sample sizes and pooled allele frequencies to simulate populations exposed to drift under a range of effective population sizes (Ne) and generations (t). Each simulation produces a “true” FST and replicate runs can be used to assess the proportion of times a hypothetical “true” FST of the same magnitude would return a significant result when testing differentiation in the actual data set (allele frequencies and sample sizes) using a χ² test. This analysis was performed for a range of Ne values (from 200 to 2000) and t values (from 25 to 500) and results were plotted to identify what level of differentiation it was possible to detect in the data set. Each simulation was run with 500 replicates.

To assess the level of differentiation among sites, pair-wise ΦST was calculated using ARLEQUIN Version 3.1 software (Excoffier et al. 2005) employing Slatkin’s linearized measure of genetic distance (Slatkin 1991) under a Kimura-2-parameter evolutionary model (Kimura 1980). Significance of ΦST estimates was tested with a nonparametric permutation procedure (10,000 iterations) and values were adjusted following the FDR procedure. Two methods were used to explore the pattern of population structure among sites. First, a Mantel’s test was employed to test for associations between geographical distance and genetic differentiation (ΦST) in ARLEQUIN, tested with 10,000 permutations. For this analysis river distance between sample locations was used, as the ecology of C. micropeltes suggests that dispersal is most likely to occur via permanent mid-deep water channels, and hence ‘dispersal distance’ among sample sites is likely to be linked to river channels rather than overland distance across ephemeral wetlands. As distances ranged from 80 km to 1585 km, distances were log transformed prior to analysis. For the second method, Spatial Analysis of Molecular Variance was used to partition sites into the number of groups (K) that maximised differentiation among them (ΦCT) whilst maintaining homogeneity within groups. This analysis was performed using the program SAMOVA (Dupanloup et al. 2002). Geographical co-ordinates were substituted with grid co-ordinates that better reflected river distance. The analysis was performed for each possible K (in this case 2 to 10) and the best K was selected by choosing the results with the highest significant ΦCT.
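
The logic of a Mantel’s test on log transformed river distance can be sketched as follows. This is a minimal one-tailed permutation version written for illustration, not ARLEQUIN’s implementation; the function names, the one-tailed p-value convention and all matrix values other than the 80 km and 1585 km extremes are assumptions.

```python
import math
import random


def _upper(matrix, order):
    """Upper-triangle values of a square matrix after reordering sites."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(genetic, river_km, permutations=10_000, seed=1):
    """Correlate genetic distance with log river distance; permute site labels."""
    n = len(genetic)
    identity = list(range(n))
    # Only off-diagonal entries are used, so the zero diagonal is never logged.
    log_dist = [[math.log(d) if d > 0 else 0.0 for d in row] for row in river_km]
    x = _upper(log_dist, identity)
    observed = _pearson(x, _upper(genetic, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        if _pearson(x, _upper(genetic, order)) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Invented four-site example (km and differentiation values are hypothetical).
river = [[0, 80, 400, 1585], [80, 0, 320, 1505], [400, 320, 0, 1185], [1585, 1505, 1185, 0]]
diff = [[0, 0.01, 0.05, 0.20], [0.01, 0, 0.04, 0.18], [0.05, 0.04, 0, 0.12], [0.20, 0.18, 0.12, 0]]
print(mantel(diff, river, permutations=999))
```

Permuting whole site labels, rather than individual matrix cells, preserves the dependence among pair-wise comparisons that share a site.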

Nuclear DNA analysis

Diversity

Microsatellite allele frequencies at each locus at each site were summarised and data files prepared with CONVERT software (Glaubitz 2004). Allelic richness (A) was calculated for each locus at each site where n > 1 using FSTAT Version 2.9.3.2 software (Goudet 1995). The likelihood-ratio test of linkage disequilibrium (Slatkin & Excoffier 1996) was undertaken between all pairs of loci for samples where n > 5 using the EM algorithm, as implemented in ARLEQUIN with 10,000 permutations. This incorporated an exact test to determine statistical significance (Slatkin & Excoffier 1996). Deviation from Hardy-Weinberg Equilibrium (HWE) was tested with exact tests for each locus at each site (Guo & Thompson 1992), also performed in ARLEQUIN. Wilcoxon’s tests against coalescent simulations of populations evolving under a two phase model (TPM: 95% stepwise mutation and 5% infinite allele model) of microsatellite mutation were performed, and Garza-Williamson’s M was calculated for each site to further investigate population demography, using BOTTLENECK Version 1.2.02 (Piry et al. 1999) and ARLEQUIN software, respectively. Significance was adjusted for linkage, HWE and Wilcoxon’s analyses following the FDR procedure (Benjamini & Hochberg 1995; Verhoeven et al. 2005).
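
The FDR adjustment referred to here follows the Benjamini & Hochberg (1995) step-up rule, which can be sketched in a few lines. The function name, the α of 0.05 and the example p-values are illustrative assumptions, not values from this study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return True for each test declared significant under FDR control."""
    m = len(p_values)
    ranked = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    # Step-up rule: find the largest rank k with p_(k) <= alpha * k / m.
    for rank, idx in enumerate(ranked, start=1):
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank
    significant = [False] * m
    # Every test ranked at or below the cutoff is declared significant.
    for idx in ranked[:cutoff_rank]:
        significant[idx] = True
    return significant


# Hypothetical p-values from four tests.
print(benjamini_hochberg([0.04, 0.01, 0.03, 0.50]))
```

Because the rule is step-up, a test can be declared significant even when its own p-value exceeds its rank threshold, provided a larger p-value further down the ranking meets its threshold.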

Population structure

As only three microsatellite loci were available for C. micropeltes, and all three showed relatively low levels of allelic variation, the power of the data to detect differentiation was tested using POWSIM (Ryman & Palm 2006) following the methods described above for the mtDNA dataset, excluding sites where n = 1 (LP and TH). To assess the level of population subdivision, FST and RST analogues (Slatkin 1991, 1995) were calculated in ARLEQUIN with 10,000 permutations, with significance values corrected using the FDR procedure. Jost’s Dest (Jost 2008) was also calculated for each locus using SMOGD software (Crawford 2010) and combined across all loci by taking the harmonic mean approximation proposed by Chao (2009, http://www.ngcrawford.com/django/jost/). FST and RST values were correlated against Dest to assess the similarity of estimates for each pair-wise comparison, using Spearman Rank Correlations in PASW Statistics (SPSS) Version 18.
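For orientation, the quantity that Dest estimates can be sketched for a single locus as follows. This is the uncorrected form of Jost's (2008) D computed from allele frequencies with equal population weights; it omits the sample-size corrections applied by SMOGD, and the function name and input layout are assumptions made for illustration.

```python
def jost_d(freqs_by_pop):
    """Uncorrected Jost's D for one locus.

    freqs_by_pop: list of dicts (allele -> frequency), one per population.
    D = ((H_T - H_S) / (1 - H_S)) * (n / (n - 1)), with n populations.
    """
    n_pops = len(freqs_by_pop)
    alleles = set().union(*freqs_by_pop)
    # Mean expected heterozygosity within populations.
    h_s = sum(1.0 - sum(p ** 2 for p in pop.values())
              for pop in freqs_by_pop) / n_pops
    # Expected heterozygosity of the pooled, equally weighted populations.
    h_t = 1.0 - sum(
        (sum(pop.get(a, 0.0) for pop in freqs_by_pop) / n_pops) ** 2
        for a in alleles
    )
    return (h_t - h_s) / (1.0 - h_s) * n_pops / (n_pops - 1)


# Two hypothetical populations sharing no alleles are fully differentiated.
print(jost_d([{"A": 1.0}, {"B": 1.0}]))
```

Unlike FST, this measure reaches 1 whenever populations share no alleles, regardless of within-population diversity, which is one reason the two indices can rank pair-wise comparisons differently.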

To explore the pattern and magnitude of differentiation among samples, Dest values were used to construct a population tree using the NJ method implemented in MEGA Version 4 (Tamura et al. 2007), and factorial correspondence analysis (FCA) (Benzécri 1973) was used to plot all individuals and populations based on genotype using GENETIX software Version 4.05.2 (Belkhir et al. 1996-2004).

Potential for an isolation by distance effect was investigated by correlating FST estimates and log transformed river distance among sites using a Mantel’s test (10,000 permutations), performed in ARLEQUIN. To further test for patterns of spatial arrangement of genetic similarity, genetic Dest distance estimates among sites were mapped onto stream sections that connected them and the goodness of fit assessed by regression analysis using STREAM TREES software (Kalinowski et al. 2008). SAMOVA (Dupanloup et al. 2002) was also employed to partition variation into geographically homogeneous groups.
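The logic of the Mantel's test can be sketched as below: the correlation between the two distance matrices is compared with correlations obtained after randomly relabelling the sites in one matrix. This is a simplified illustration, not the ARLEQUIN implementation; the function names, the one-sided p value and the toy river positions are all assumptions.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(genetic, distance, permutations=10000, seed=1):
    """Return (r, one-sided p) for two symmetric distance matrices."""
    n = len(genetic)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    x = [genetic[i][j] for i, j in pairs]
    r_obs = pearson(x, [distance[i][j] for i, j in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)  # relabel sites in the distance matrix
        y = [distance[order[i]][order[j]] for i, j in pairs]
        if pearson(x, y) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical sites at four positions (km) along a single channel.
positions_km = [0, 100, 300, 700]
n_sites = len(positions_km)
log_dist = [[0.0 if i == j
             else math.log10(abs(positions_km[i] - positions_km[j]))
             for j in range(n_sites)] for i in range(n_sites)]
# Toy genetic distances made exactly proportional to log river distance.
genetic = [[0.1 * d for d in row] for row in log_dist]
r, p = mantel_test(genetic, log_dist, permutations=999)
print(r, p)
```

Whole sites are permuted, rather than individual matrix cells, because pair-wise distances that share a site are not independent of one another.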

Lastly, Bayesian clustering of individuals was undertaken to investigate the possible presence of population sub-structuring. The program STRUCTURE Version 2.2 was used to assign individuals to K clusters (where K = 1-22) (Falush et al. 2003, 2007; Pritchard et al. 2000). Analyses assumed an admixture model, with a constant λ of 1.0, and after initial optimisation were run for a MCMC chain length of 300,000 following a burn-in of 150,000. Twenty replicate analyses were performed for each K value using the CBSU bioHPC facility (http://cbsuapps.tc.cornell.edu/structure.aspx). To determine the optimum number of groups from the results of the clustering analysis, the rate of change of the log probability of the data ΔK was plotted and the highest peak in this distribution taken as the best estimate of the ‘real’ number of clusters (Evanno et al. 2005). Probabilities of assignment were aligned across replicate analyses using CLUMPP Version 1.1.2 software (Jakobsson & Rosenberg 2007b) employing the ‘Greedy’ algorithm (with random input order repeated 1,000 times) and visualised using DISTRUCT version 1.1 software (Rosenberg 2004).
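The ΔK statistic of Evanno et al. (2005) can be sketched as follows. This version applies the second-order difference to the mean LnP(D) across replicates and divides by the standard deviation of LnP(D) at each K, which is one common way of computing it; the function name and all log probability values are hypothetical and are not results from this study.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return a dict K -> delta K.

    lnp_by_k: dict mapping K to a list of replicate LnP(D) values.
    delta K = |L(K+1) - 2 L(K) + L(K-1)| / sd(L(K)), using mean L per K.
    Undefined for the smallest and largest K tested.
    """
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 in lnp_by_k and k + 1 in lnp_by_k:
            second_diff = abs(mean(lnp_by_k[k + 1])
                              - 2 * mean(lnp_by_k[k])
                              + mean(lnp_by_k[k - 1]))
            delta[k] = second_diff / stdev(lnp_by_k[k])
    return delta


# Hypothetical replicate LnP(D) values for K = 1..6 (three replicates each).
example = {
    1: [-2100, -2102, -2098],
    2: [-1800, -1803, -1797],
    3: [-1700, -1702, -1698],
    4: [-1500, -1501, -1499],
    5: [-1490, -1500, -1480],
    6: [-1485, -1505, -1465],
}
delta_k = evanno_delta_k(example)
best_k = max(delta_k, key=delta_k.get)
print(delta_k, best_k)
```

Because ΔK needs both neighbours of each K, it cannot be evaluated at the lowest or highest K tested and therefore cannot identify K = 1 as the best solution.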


Phylogeography of C. micropeltes

RESULTS

MtDNA diversity and phylogeography

Five unique mtDNA Cyt b haplotypes were identified among the 75 individuals sequenced for this gene region. Among haplotypes, four polymorphic sites were identified, one of which was a nonsynonymous mutation at a first base codon position. The maximum number of haplotypes found together at a single site was only three (for sites KC and KH). Information on variable sites is presented in Table 4.3. Haplotype frequencies for each site are presented in Table 4.4. Figure 4.4 illustrates the evolutionary relationship among haplotypes and their geographical distribution across the Mekong River Basin.

Table 4.3. Variable sites in first 765 bases of C. micropeltes mtDNA Cyt b haplotypes (1-5). Gene position refers to number of nucleotide bases from start of gene. Dots indicate identity to Hap 1.

                            Cyt b gene position
Haplotype                   498   585   712   735
Hap 1                       A     C     A     C
Hap 2                       .     .     G     .
Hap 3                       .     A     G     .
Hap 4                       G     .     G     .
Hap 5                       .     .     G     T
Transition / transversion   ts    tv    ts    ts
Codon position              3rd   3rd   1st   3rd

Table 4.4. Mitochondrial Cyt b haplotype frequencies for C. micropeltes.

Haplotype
Site n Hap 1 Hap 2 Hap 3 Hap 4 Hap 5
LP 1 1
TH 1 1
MC 5 1
ST 11 1
KK 10 0.80 0.20
KC 9 0.11 0.78 0.11
KH 10 0.50 0.40 0.10
PS 10 0.50 0.50
BB 10 0.60 0.40
TS 3 1
TC 5 0.60 0.40
Mekong 75 0.47 0.45 0.03 0.04 0.01


Chapter 4.

Figure 4.4. (a) Map of the Mekong Basin showing frequency of C. micropeltes Cyt b haplotypes detected at each sampling location, and (b) Median Joining network illustrating hypothesised evolutionary relationship among the five different haplotypes detected. Size of circle indicates relative frequency of each type detected across all samples.


The five Cyt b haplotypes identified here were all closely related. Among them, two haplotypes (Hap 1 and Hap 2) were present in near equal frequencies, and together accounted for 92% of all individuals genotyped across the Mekong River Basin. Hap 1 was the only haplotype detected in the mid- and upper-Mekong Basin, and was entirely absent from three of the eight Lower Mekong sites, including the two most downstream sampling sites (TS and TC). In contrast, Hap 2 was only detected in the Lower Mekong Basin, where it accounted for at least 50% of all individuals sampled at each site south of Stung Treng (site ST). The other three haplotypes (Haps 3, 4, and 5), were more closely related to Hap 2 than to Hap 1, and were also found exclusively in Lower Mekong River sites. Haplotypes 3, 4, and 5 were detected in low frequency and had only localised distributions, suggesting that they may be recently derived.

Although diversity appeared to be higher in the Lower Mekong (for example in Figure 4.4 and Table 4.5), this may be an artifact of differences in sample sizes, as sites in the mid- and upper-River Basin were only represented by a small number of individuals. The correlation between sample size and number of haplotypes detected, however, was not significant (r = 0.461, p = 0.153), suggesting that relative diversity was not directly related to the number of observations per site. There was no evidence that diversity at the Cyt b locus deviated from expectations under mutation-drift and gene flow-drift equilibrium. There was also no evidence for recent population expansions (Table 4.5), although for the sample from Kampong Chang (site KC) both Tajima’s D and Fu’s FS returned negative values, which could indicate a recent population bottleneck or an excess retention of recent mutations due to reduced effects of drift associated with population size expansion (Fu 1997; Tajima 1989).

Table 4.5. Summary statistics for all sites individually and for the pooled data set (Mekong). An asterisk (*) denotes significance before FDR correction for multiple comparisons; no values were significant after FDR.

Columns: Site; n; n haplotypes; Haplotype diversity (Hd); Nucleotide diversity (π); Theta (S); Tajima’s D; Fu’s FS; Ramos-Onsins & Rozas’ R2

Site     n    n hap   Hd      π       Theta (S)   Tajima’s D   Fu’s FS   R2
LP       1    1       1.000   -       -           n/a          n/a       n/a
TH       1    1       1.000   -       -           n/a          n/a       n/a
MC       5    1       0       0       0           0            0         n/a
ST       11   1       0       0       0           0            0         n/a
KK       10   2       0.356   0.356   0.353       0.015        0.417     0.178
KC       9    3       0.417   0.444   0.736       -1.362       -1.081*   0.208
KH       10   3       0.644   0.756   0.707       0.222        -0.046    0.204
PS       10   2       0.556   0.556   0.353       1.464        1.096     0.278
BB       10   2       0.533   0.533   0.353       1.303        1.029     0.267
TS       3    1       0       0       0           0            0         n/a
TC       5    2       0.600   0.600   0.480       1.225        0.626     0.066
Mekong   75   5       0.582   0.622   0.818       -0.400       -0.902    0.084
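As a check on how the haplotype diversity column relates to the frequencies in Table 4.4, the sketch below applies the standard unbiased estimator Hd = n/(n - 1) × (1 - Σp²). The function name is hypothetical, and the haplotype counts are inferred here by multiplying the tabulated frequencies by n (for example 0.80 and 0.20 at KK with n = 10 gives counts of 8 and 2); they are not reported directly in the text.

```python
def haplotype_diversity(counts):
    """Unbiased haplotype diversity from haplotype counts at one site."""
    n = sum(counts)
    if n < 2:
        raise ValueError("haplotype diversity needs at least two sequences")
    sum_p_sq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1.0 - sum_p_sq)


# Counts inferred from Table 4.4 frequencies multiplied by n (an assumption).
inferred_counts = {
    "KK": [8, 2],     # n = 10, frequencies 0.80 and 0.20
    "KC": [1, 7, 1],  # n = 9, frequencies 0.11, 0.78 and 0.11
    "KH": [5, 4, 1],  # n = 10, frequencies 0.50, 0.40 and 0.10
    "PS": [5, 5],     # n = 10, frequencies 0.50 and 0.50
}
for site, counts in inferred_counts.items():
    print(site, round(haplotype_diversity(counts), 3))
```

Under these inferred counts the estimator returns 0.356, 0.417, 0.644 and 0.556 for KK, KC, KH and PS, matching the Hd values listed in Table 4.5.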

Figure 4.5. Results of the Power analysis for the C. micropeltes Cyt b data set, indicating that sufficient power was available to resolve FST values greater than approximately 0.15. Axes: hypothetical “true” FST (x) against proportion of significant results (y).

Power analysis suggested that the data set did possess the potential to resolve significant differentiation at FSTs ≥ approx 0.15 (Figure 4.5), if and where this magnitude of differentiation was present in the data. Pair-wise ΦST analysis did detect high and significant differentiation among several sites (Table 4.6). The magnitude of ΦST values was largely due to very low within site variation, which resulted in most variation being partitioned among sites. The most upstream sample site in the lower Mekong Basin, Stung Treng (ST), was the most differentiated from all other sites, with the highest FST estimates observed between ST and Tram Chim (TC) and ST and Kratie (KK). This pattern of differentiation was also evident in the raw haplotype frequencies, as the only haplotype observed at Stung Treng (haplotype 1) was entirely absent from the Tram Chim and Kratie sites (see Figure 4.4). Interestingly, the geographically closest sample site to Stung Treng was Kratie, the two locations being separated by only 120 km along the main Mekong River channel, indicating that geographical proximity does not necessarily indicate high genetic similarity for this species.


Table 4.6. Population pair-wise ΦST analysis of C. micropeltes mtDNA. Above the diagonal: ΦST estimates; below the diagonal: significance (α = 0.05) before (*) and after (†) False Discovery Rate correction. Negative ΦST estimates are treated as 0. Grey cells indicate comparisons that are not statistically significant.

MC ST KK KC KH PS BB TS TC
MC 0 0.80296 0.71853 0.25764 0.33333 0.22078 1 0.78571
ST 0.85875 0.79845 0.38689 0.46078 0.34975 1 0.86879
KK *† *† -0.05776 0.30556 0.34921 0.44444 -0.08597 0.23913
KC *† *† 0.15211 0.17998 0.28869 -0.18033 0.17803
KH *† *† -0.09259 -0.07407 0.16938 0.30723
PS *† *† -0.08889 0.2500 0.36317
BB * *† * 0.37824 0.44099
TS *† *† 0.11765
TC *† *† * * *†

Overall, however, the test for isolation by distance did find a significant association between ΦST and river distance (r = 0.502032, p = 0.0068, Figure 4.6). In addition, Spatial Analysis of Molecular Variance (SAMOVA) was able to identify four groups of samples in the mtDNA data set that represented geographically homogeneous groups (Figure 4.7, and illustrated in Figure 4.8). These groups correspond with the most upstream sites composed exclusively of haplotype 1 (“Upstream”: sites MC and ST), the Tonle Sap Great Lake sites (“Great Lake”: KH, PS and BB), and all other sites in the main Mekong downstream of ST (“Downstream”: sites KK, KC and TS), excluding Tram Chim (site TC), which formed the fourth group.

The four groups are not surprising given the general pattern in the distribution of mtDNA diversity uncovered for C. micropeltes in this study. At the broadest spatial scale the diversity observed at Stung Treng, the most upstream site in the Lower Mekong River Basin, appears to resemble diversity in the mid-and upper river Basin. Much less mtDNA variation was found in this section of the Mekong.

Among the seven downstream sites sampled in the Lower Mekong River Basin, diversity was comparatively higher. The three sites sampled in the Tonle Sap River – Great Lake sub-drainage appear to be very similar, with roughly 50% of individuals possessing haplotype 1 and 50% haplotype 2. In contrast, haplotype 1 was notably absent in the majority of individuals sampled from the four sites in the main Mekong downstream of Stung Treng, where haplotype 2 was dominant.

Figure 4.6. Raw ΦST plotted against log transformed geographic distance for C. micropeltes mtDNA data (r = 0.5020, p = 0.0068). Trendline shows the general pattern of increasing genetic distance with greater geographic distance (Isolation by Distance). Axes: log stream distance (x) against Slatkin’s linearised ΦST (y).

Figure 4.7. Graph of ΦCT values (differentiation between groups) obtained in SAMOVA analysis, which identified four as the number of groupings (K) which maximised variation among groups. Axes: K (x) against ΦCT (y).

Figure 4.8. Illustration of geographical extent of groups defined by SAMOVA analysis of mtDNA data. Groups shown: Upstream, Great Lake, Downstream and Tram Chim.


Nuclear DNA results

Diversity

Two hundred and eighty C. micropeltes were genotyped at three variable microsatellite loci. A total of twelve unique alleles were detected for Cm-1, while only four alleles were present for both Cm-2 and Cm-3 (see Appendix 11). At each locus, single most common alleles accounted for at least 60% of all alleles detected, with the two most common alleles at each locus accounting for over 90% of all alleles observed (Figure 4.9). Figure 4.10 shows the relative proportion of each allele at each sample site.

There was no real evidence in the dataset for the presence of nulls (undetected alleles), with only two loci at a single site showing significant deviations from Hardy-Weinberg equilibrium after FDR correction (Table 4.7). For sites where n > 1, average heterozygosity was 0.32509, 0.35136, and 0.38352 for Cm-1, Cm-2 and Cm-3, respectively. There was also very limited evidence for linkage disequilibrium, with only five of 21 pair-wise comparisons returning significant exact test results. Patterns of disequilibrium among loci were not consistent, with only one pair-wise test for linkage significant after FDR. These results indicate that the three loci analysed provide essentially independent estimates of neutral genetic variation for the individuals sampled here.

Highest allelic richness (A) was detected in individuals from sites in the main Mekong River in the Lower River Basin downstream of Stung Treng (sites KK and KC) (Table 4.7, Figure 4.11); however, the greatest values for both number of alleles detected and allelic range (difference in number of repeats among alleles) were observed in the Tonle Sap Great Lake (sites KH, PS, and BB) (Table 4.7). In general, A was lowest at upstream sites and in samples collected within the Great Lake distant from the Mekong Tonle-Sap confluence (sites PS and BB).

A number of samples showed evidence for recent population bottlenecks when Garza-Williamson’s M was considered (Table 4.7); however, the low number of loci used, the overall low number of alleles and the small allelic ranges make drawing any firm conclusions quite difficult, especially as the other specific test for population decline (Wilcoxon’s test against HWE) was not significant for any sample site. At Tram Chim (site TC), however, the M value was relatively low (0.35194, SD: 0.23465). This can be attributed to the high allelic range for locus Cm-1 (28) but low number of alleles. A high allelic range was also observed for locus Cm-1 at Great Lake sites (KH, PS and BB) but was not detected anywhere else in the Mekong Basin.
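The dependence of M on allelic range can be made explicit with a short worked example. The sketch assumes the usual definition M = k / (R + 1) per locus, where k is the number of alleles and R the allelic range in repeat units, averaged across loci, with the SD taken as the population standard deviation across loci. The function name is hypothetical; the input values are the Tram Chim (TC) entries from Table 4.7.

```python
from statistics import mean, pstdev


def garza_williamson_m(n_alleles, allelic_range):
    """Per-locus M ratio: number of alleles over (allelic range + 1)."""
    return n_alleles / (allelic_range + 1)


# (n alleles, allelic range) for Cm-1, Cm-2 and Cm-3 at site TC (Table 4.7).
tc_loci = [(3, 28), (2, 2), (2, 6)]
m_values = [garza_williamson_m(k, r) for k, r in tc_loci]
m_mean = mean(m_values)
m_sd = pstdev(m_values)  # population SD across the three loci
print([round(m, 3) for m in m_values], round(m_mean, 5), round(m_sd, 5))
```

Under these assumptions the per-locus values are roughly 0.103, 0.667 and 0.286, and their mean and SD agree with the values reported for Tram Chim (0.35194, SD: 0.23465), showing that the low M at this site is driven almost entirely by the wide allelic range at Cm-1.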

Figure 4.9. Allele frequencies observed at the three microsatellite loci, Cm-1 (top), Cm-2 (centre), and Cm-3 (lower). Colours match pie graphs in Figure 4.10. Axes in each panel: allele length in bps (x) against overall frequency (y).


Figure 4.10. (a) Map of the Mekong Basin showing frequency of C. micropeltes microsatellite alleles detected at each sampling location. Pie graphs from left to right show alleles of loci Cm-1, Cm-2, and Cm-3 respectively; (b) inset shows abbreviated location codes. Note that sample sizes varied, with n = 1 for sites LP and TH.

Table 4.7. Summary statistics for C. micropeltes microsatellite diversity. Significance (α = 0.05) of Hardy-Weinberg p values (H/W p value) indicated before (*) and after (†) FDR correction. Garza-Williamson’s M values (G/W index) below 0.68 threshold are identified by (**). Beside Wilcoxon’s significance (W/T p value) (~) indicates analysis compromised by low sample size.

site   locus:                 Cm-1       Cm-2       Cm-3
MC     n                      12         12         12
       n alleles              3          2          2
       allelic range          3          1          2
       allelic richness       2.42       1.25       1.96
       obs het                0.58       0.08       0.42
       exp het                0.60       0.80       0.49
       H/W p value            0.60       1.00       1.00
       average het (8 loci)   0.36
       G/W index              1.81 (SD: 0.14)
       W/T p value            1.00
ST     n                      49         49         49
       n alleles              2          2          2
       allelic range          3          1          2
       allelic richness       1.78       1.95       1.96
       obs het                0.35       0.39       0.31
       exp het                0.34       0.47       0.49
       H/W p value            1.00       0.23       0.009*
       average het (8 loci)   0.35
       G/W index              0.72 (SD: 0.21)
       W/T p value            0.13
KK     n                      50         50         50
       n alleles              3          2          4
       allelic range          6          1          8
       allelic richness       2.52       1.95       2.34
       obs het                0.24       0.26       0.42
       exp het                0.59       0.48       0.55
       H/W p value            <0.01*†    <0.01*†    0.12
       average het (8 loci)   0.31
       G/W index              0.62** (SD: 0.27)
       W/T p value            0.25
KC     n                      9          9          9
       n alleles              2          3          4
       allelic range          3          2          8
       allelic richness       1.57       2.68       3.20
       obs het                0.22       0.78       0.22
       exp het                0.21       0.65       0.73
       H/W p value            1.00       0.76       <0.01*
       average het (8 loci)   0.41
       G/W index              0.65** (SD: 0.25)
       W/T p value            ~0.38
KH     n                      50         50         50
       n alleles              5          4          4
       allelic range          25         3          8
       allelic richness       1.76       2.19       2.50
       obs het                0.28       0.38       0.56
       exp het                0.26       0.50       0.52
       H/W p value            1.00       0.18       0.59
       average het (8 loci)   0.41
       G/W index              0.55** (SD: 0.34)
       W/T p value            0.13
PS     n                      50         50         50
       n alleles              5          3          4
       allelic range          25         2          8
       allelic richness       1.47       2.08       1.91
       obs het                0.14       0.42       0.30
       exp het                0.15       0.51       0.32
       H/W p value            0.13       0.43       0.12
       average het (8 loci)   0.29
       G/W index              0.55** (SD: 0.34)
       W/T p value            0.25
BB     n                      50         50         50
       n alleles              5          2          4
       allelic range          28         1          8
       allelic richness       1.52       1.97       2.01
       obs het                0.18       0.32       0.36
       exp het                0.17       0.50       0.38
       H/W p value            1.00       0.01156*   0.07
       average het (8 loci)   0.29
       G/W index              0.54** (SD: 0.34)
       W/T p value            0.37
TS     n                      3          3          3
       n alleles              2          2          2
       allelic range          3          1          1
       allelic richness       2          2          2
       obs het                0.33       0.33       0.67
       exp het                0.33       0.33       0.53
       H/W p value            1.00       1.00       1.00
       average het (8 loci)   0.44
       G/W index              0.83 (SD: 0.23)
       W/T p value            ~0.25
TC     n                      5          5          5
       n alleles              3          2          2
       allelic range          28         2          6
       allelic richness       2.47       1.97       1.60
       obs het                0.60       0.20       0.20
       exp het                0.51       0.47       0.20
       average het (8 loci)   2.01
       H/W p value            1.00       0.33       1.00
       G/W index              0.35** (SD: 0.23)
       W/T p value            ~0.25

Figure 4.11. Average allelic richness for C. micropeltes microsatellite data for each site. Sites plotted: MC, ST, KK, KC, KH, PS, BB, TS and TC; y-axis: average allelic richness (A).


Population structure

Results of the power analysis indicated that the data could potentially resolve significant differentiation even when it was low (at or above FSTs ≈ 0.015, see Figure 4.12).

Figure 4.12. Results of the Power analysis for the C. micropeltes microsatellite data set, indicating that combined analysis of three loci provide sufficient power to resolve FST values greater than approximately 0.015. Axes: hypothetical “true” FST (x) against proportion of significant results (y).

Estimates of differentiation varied greatly among analyses (Figure 4.13), although in general they were low (<0.4). Dest and FST estimates were strongly correlated (rS = 0.892, p < 0.001), although FST analysis consistently resulted in higher pair-wise estimates of differentiation. RST analysis, which incorporates allele size variation under a stepwise mutation model in addition to allele frequencies, resulted in the widest range of differentiation estimates, and was poorly correlated with both Dest and FST estimates (rS = 0.624, p < 0.001 and rS = 0.634, p < 0.001 respectively). RST analysis performs best when differentiation is high (Balloux & Goudet 2002), and a lack of sensitivity of this estimator to low levels of differentiation may explain the disparity between indices observed here.

Figure 4.13. Graph of pair-wise measures of differentiation ranked by Dest. Series plotted: Jost’s Dest values, FST values with trendline, and RST values with trendline. Axes: pair-wise comparison, ranked by Dest (x) against differentiation (y).

Both FST and RST analyses detected significant differentiation between individuals from the middle Mekong (site MC), individuals from the most upstream site in the lower Mekong Basin (site ST), and almost all other samples (Table 4.8 and Table 4.9). Among samples taken from the Great Lake (sites KH, PS, and BB) there was no, or in one case very low, significant differentiation observed (FST = 0.02008). There was also no differentiation evident among samples from the main Mekong River south of Stung Treng (sites KK, KC, and TS), with the exception of site TC in the Mekong Delta. In general in the lower Mekong Basin, however, main Mekong River sites were differentiated from Great Lake sites. This general pattern is congruent with the “Upstream”, “Downstream”, “Great Lake”, and “Tram Chim” groups inferred from Spatial Analysis of Molecular Variance for the mtDNA data set.

Table 4.8. Results of pair-wise FST analysis of C. micropeltes microsatellite data for sites where n >1. Above the diagonal: FST estimates, below the diagonal: Significance (α = 0.05) before (*) and after (†) FDR correction. Grey cells indicate comparisons that were not significant.

MC ST KK KC KH PS BB TS TC
MC 0.14039 0.10674 0.26623 0.24417 0.35640 0.33513 0.20486 0.29772
ST *† 0.11563 0.27582 0.28578 0.34839 0.33613 0.26084 0.33155
KK *† *† 0.06247 0.06853 0.10763 0.09409 0.03715 0.11651
KC *† *† 0.05145 0.08981 0.06400 0.07129 0.16950
KH *† *† *† * 0.01628 0.02008 -0.01371 0.06889
PS *† *† *† *† -0.00482 0.05535 0.12490
BB *† *† *† * * 0.06230 0.14450
TS * *† 0.03675
TC *† *† *† *† *†

Table 4.9. Results of pair-wise RST analysis of C. micropeltes microsatellite data for sites where n >1. Above the diagonal: RST estimates, below the diagonal: Significance (α = 0.05) before (*) and after (†) FDR correction. Grey cells indicate comparisons that were not significant

MC ST KK KC KH PS BB TS TC
MC 0.12381 0.13555 0.23997 0.11742 0.05004 0.02281 0.18227 0.01465
ST *† 0.32342 0.40900 0.20001 0.13445 0.10327 0.37596 0.07364
KK * *† 0.01972 0.06665 0.01367 0.00054 -0.07488 0.11565
KC *† *† -0.00924 -0.02315 -0.02317 -0.06872 -0.02879
KH *† *† *† 0.00441 0.01370 -0.06052 -0.01762
PS *† -0.00836 -0.08900 -0.01172
BB *† -0.09261 -0.00931
TS *† -0.11258
TC

Figure 4.14. NJ tree of Dest microsatellite differentiation among C. micropeltes samples. Tip labels: PS, BB and KH (Great Lake, Lower Mekong Basin); TC and TS (Mekong Delta); KC, KK and ST (Lower Mekong Basin); MC (Middle Mekong Basin). Scale bar: 0.01.

Relationships evident in the population tree (Figure 4.14) illustrate the pattern observed for FST and RST pair-wise differentiation for microsatellite data, and also the general trend observed for mitochondrial diversity. Stung Treng (site ST), although situated in the lower Mekong River Basin, groups more closely with individuals from the middle Mekong sub-basin (site MC) (“Upstream” group). All other sites were essentially more similar to each other than they were to upstream sites (ST and MC). Among them, the two delta samples (sites TC and TS) were most similar to Great Lake samples (sites KH, PS, BB).

This pattern corresponds roughly with results of the mtDNA analysis, with genetic affinity apparently related to geographical units within the sampled section of the Mekong River Basin. In maximising variation among groups for microsatellite data, spatial analysis of molecular variance recovered only two groups, corresponding with “Upstream” (sites MC and ST) and “Downstream + Great Lake + Tram Chim” population groups (FCT = 0.21448, p = 0.02933, Figure 4.15). Although somewhat different from results of the mtDNA analysis, this division does not conflict with earlier results, but instead simply appears to have recovered the highest level of hierarchical population structuring present in the data set.

Figure 4.15. Graph of FCT values (differentiation between groups) obtained in SAMOVA analysis, identifying two groups as the configuration that maximised differentiation between groups whilst maintaining genetic homogeneity within groups. Axes: K (x) against FCT (y).

The Mantel’s test for Isolation by Distance provided further support for the presence of a geographical pattern of differentiation in the microsatellite data, clearly identifying a significant relationship between FST and log transformed river distance (r = 0.73153, p = 0.0002). The spread of FST values (Figure 4.16) adds weight to the suggestion that differentiation is not only present at the highest level of hierarchical structure identified by SAMOVA, indicating that some level of sub-structuring is also likely to exist across the geographical range of each of the two main groups.

Figure 4.16. Raw FST plotted against log transformed geographic distance for C. micropeltes microsatellite data (r = 0.7315, p = 0.0002). Trendline shows the general pattern of increasing genetic distance with greater geographic distance (Isolation by Distance). Axes: log stream distance (x) against FST (y).


The pair-wise distance data fit only relatively poorly, however, when actually mapped onto stream sections (Stream Tree r2 = 0.781, NJ fit r2 = 0.823 (Kalinowski et al. 2008)). Considered together, these results suggest that although there is a general trend of isolation by distance, this is not likely to result from local populations having reached equilibrium under low levels of stepping-stone like gene flow.

Furthermore, although differentiation was present amongst sample sites, the magnitude of differentiation was low, with all pair-wise comparisons among the “Downstream + Great Lake” sites producing FSTs of less than 0.2. The FCA plots illustrate the small difference and large overlap among individuals from across the Mekong Basin. Although the analysis was able to separate the sample sites based on three variables (Figure 4.17a), when plotted individually, very little variation was present among genotypes, which largely overlapped in the 3-D space defined by the highest inertia variables (Figure 4.17c).

Estimates of differentiation, SAMOVA and FCA analysis all incorporate information on sampling location to infer differences and similarity among groups of individuals. As a different approach, Bayesian clustering of individuals was used to infer population structure based on individual genotypes without geographical information. This analysis identified four separate clusters among all C. micropeltes individuals sampled for this study (Figure 4.18).

Assignment of individuals into specific clusters, however, was in most cases poor (Figure 4.19). Individuals collected from the “Upstream” sites (MC and ST) were generally assigned with relatively high probability to a single cluster (coloured dark purple in Figure 4.19), although there were a few exceptions. Around 20% of individuals from Kratie (site KK) were also assigned with high probability to this “Upstream” cluster. Kratie was the closest sampling site to Stung Treng, and is located only ~120km downstream on the main Mekong channel.

The remainder of individuals, accounting for over 75% of all fish genotyped, could not be assigned with any confidence to a single cluster. Their probability of assignment to the “Upstream” dark purple cluster, however, was essentially zero. In essence, these results reflect those of the Spatial Analysis of Molecular Variance, while in addition identifying some similarity between Kratie and the “Upstream” group that was also evident in the magnitude of pair-wise differentiation (see FSTs, Table 4.8).

Figure 4.17. Results of factorial correspondence analysis: (a) Centres of gravity for each population, plotted on axis 1 (51.99%), axis 2 (13.45%) and axis 3 (11.29%); (b) Graph showing the magnitude of inertia for the three factors in the analysis that account for the most variation (cumulative percentage of variation: 51.99% for axis 1, 65.44% for axes 1 and 2, 76.73% for axes 1 to 3), and; (c) all individuals coloured by sample site.

Figure 4.18. Results of Bayesian Cluster analysis identifying K = 4 as the best number of population groupings, (a) Mean log probability of the data (LnP(D)) ±SD over twenty replicates of each K value for K = 2 to K = 22, and (b) ΔK across all analyses. The peak of ΔK identifies the best number of groups, in this case four.

The strong difference between “Upstream” and all other individuals can be attributed to lower allelic diversity in the “Upstream” group. Careful examination of the raw allele frequencies for MC and ST shows that these sites possess only a subset of the total alleles present in downstream sites for two of the three loci examined here, while at the third locus (Cm-1), a unique allele was present in low frequency (0.08%) in MC individuals (see Appendix 11). This trend of reduced “Upstream” diversity representing a subset of Lower Basin variation is also true for mitochondrial haplotype diversity at MC and ST.

At a greater distance upstream, however, at sites not included in the bulk of analyses (LP and TH), unique alleles were observed at a single microsatellite locus (alleles 191 and 195 at the Cm-1 locus, see Appendix 11). This indicates that genetic variation observed in the lower Mekong Basin does not fully represent total C. micropeltes diversity across the entire Mekong River Basin.

Figure 4.19. Graphs of cluster membership coefficients estimated by Bayesian inference of population structure (K = 4). Each colour indicates a different cluster; sample locations (LP, TH, MC, ST, KK, KC, KH, PS, BB, TS and TC) are listed in centre of figure. LEFT GRAPH: Cluster membership for each C. micropeltes individual genotyped in the study. RIGHT GRAPH: Average cluster memberships for each sample location.


DISCUSSION

Diversity and phylogeography

Population genetic analysis of C. micropeltes in the Mekong River Basin showed limited diversity and divergence across the study area. Low nucleotide diversity was observed in mtDNA, with a total of only four base pair mutations observed among the five haplotypes detected. Microsatellite marker diversity was also low at the three loci examined, with an average allelic richness of 2.04 and average heterozygosity of only 0.35. These values are lower than expected averages for freshwater fishes (De Woody & Avise 2000; O'Connell & Wright 1997), and significantly less than diversity observed in other Mekong endemic fishes, for example Pangasianodon hypophthalmus (So et al. 2006a), and C. striata (previous chapter: average allelic richness of 3.94 and average heterozygosity of 0.58).

Low allelic diversity can result from widespread catastrophic population size declines (Maruyama & Fuerst 1985; Nei et al. 1975). In such cases, coalescent events among surviving individuals across the natural geographical range are likely to pre-date the bottleneck (random lineage sorting in small populations), but for C. micropeltes this was not the case. Instead, the very low divergence among all mtDNA haplotypes (maximum uncorrected distance < 0.004) and low allelic ranges at microsatellite loci indicate that all coalescent events are likely to have occurred in the recent past, and hence that all individuals sampled across the Mekong most probably shared a very recent common ancestor. This implies that a recent range expansion from a very small ancestral population may have occurred to account for the presence of C. micropeltes across the greater Mekong Basin. Such a range expansion may be the result of recent colonisation by members of a population in a neighbouring drainage (either naturally or via translocation by humans), or in fact may have occurred as a result of drainage amalgamation, where an ancestrally small geographical range was incorporated into the greater Mekong Basin, enabling C. micropeltes to colonise the entire system.
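The uncorrected distance referred to above is simply the proportion of aligned sites at which two sequences differ. A minimal sketch is given below; the function name is hypothetical, the two short sequences are invented for illustration, and the choice to skip gaps and ambiguous bases is an assumption rather than a description of the settings used here.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected (p) distance between two aligned DNA sequences.

    Sites where either sequence has a gap or an ambiguity code are skipped
    (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            differences += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Two invented 20 bp sequences differing at a single site.
hap_x = "ATGGCCAACCTACGAAAAAC"
hap_y = "ATGGCCAACCTACGGAAAAC"
distance = p_distance(hap_x, hap_y)
print(distance)
```

Because no substitution model is applied, the value is a lower bound on the true divergence, although at very low divergence the two are nearly identical.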

In both the mtDNA and microsatellite data sets, two alleles were common at each locus, accounting for approximately 90% of all alleles detected. Common alleles were not distributed in uniform frequencies across all sites, however, and inter-population differentiation was considerable, indicating that contemporary C. micropeltes populations in the Mekong River Basin are not panmictic. Furthermore, despite limited allelic diversity overall, a number of alleles were detected that were present only at single sampling localities (private alleles). Private alleles were identified in seven of the eleven sample sites surveyed and were observed at both mtDNA and microsatellite loci. As these alleles always occurred at very low frequency (excluding sites LP and TH where n = 1, the average frequency of private alleles was 0.025 across all loci), they are probably quite recently evolved (post range expansion) (Crandall & Templeton 1993; Donnelly & Tavaré 1995; Golding 1987), and due to low contemporary gene flow, have not been transmitted across the greater Mekong Basin.
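The definition of a private allele used above can be expressed as a short set operation. The sketch below is illustrative only: the function name, site labels and allele labels are hypothetical and do not correspond to the sites or alleles of this study.

```python
def private_alleles(alleles_by_site):
    """Map each site to the set of alleles observed at no other site."""
    private = {}
    for site, alleles in alleles_by_site.items():
        elsewhere = set()
        for other_site, other_alleles in alleles_by_site.items():
            if other_site != site:
                elsewhere |= set(other_alleles)
        private[site] = set(alleles) - elsewhere
    return private


# Hypothetical alleles observed at one locus in three sites.
observed = {
    "site_1": {"a", "b"},
    "site_2": {"a", "b", "c"},
    "site_3": {"a", "d"},
}
result = private_alleles(observed)
print(result)
```

In practice the count of private alleles depends on sample size, so sites with very few individuals are best interpreted with caution.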

Levels of gene flow for C. micropeltes among Mekong Basin sites appear to be related partially to the magnitude of freshwater distance between them, with results conforming to the classical expectations of isolation by distance (Kimura & Weiss 1964; Slatkin 1993; Wright 1943). Across the entire river basin, however, genetic diversity also shows a pattern of hierarchical structuring. When the nine sites in the middle and lower basin are considered together there is a clear difference between upstream and lower basin individuals collected south of Stung Treng. Among the lower basin samples, C. micropeltes individuals are further structured into Great Lake and main channel groups.

At the highest hierarchical level, variation among groups can be seen in marked differences in allelic diversity. Upstream sites possess fewer alleles at both mtDNA and microsatellite loci than do downstream sites. Furthermore, alleles found upstream at sites MC and ST are, with all but one exception (private allele 159 at Cm-1), a subset of the most common alleles and haplotypes found lower in the Mekong River Basin. One explanation for this pattern is that ancestors of individuals from upstream sites have passed recently through one or more population size reductions (genetic bottlenecks) associated with a recent upstream range expansion. Recent bottlenecks are known to reduce allelic diversity and usually eliminate the majority of low frequency allelic forms (Nei et al. 1975).
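The expectation that bottlenecks preferentially remove rare alleles can be illustrated with a simple sampling argument. Assuming that N diploid founders are drawn at random from the source population, an allele at frequency p is absent from all 2N sampled gene copies with probability (1 - p)^(2N). The sketch below evaluates this for a rare and a common allele; the founder number is hypothetical, and the model ignores drift in the generations after founding.

```python
def prob_allele_lost(freq, n_founders):
    """Probability that an allele at frequency `freq` is absent from
    `n_founders` diploid founders sampled at random (2N gene copies)."""
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must lie between 0 and 1")
    return (1.0 - freq) ** (2 * n_founders)


# Rare allele at the average private-allele frequency reported above (0.025)
# versus a common allele at 0.45, with a hypothetical 10 founders.
rare_lost = prob_allele_lost(0.025, 10)
common_lost = prob_allele_lost(0.45, 10)
print(round(rare_lost, 2), common_lost)
```

Under these assumptions the rare allele is more likely to be lost than retained, whereas loss of the common allele is very improbable, which is consistent with upstream sites carrying mainly a subset of the most common downstream alleles.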

Possible evidence for a recent range expansion includes Fu’s FS values, which are more likely to be negative and significant at the centre of a geographical range of expansion than at the edge (Ray et al. 2003), the presence of a spatially arranged cline in allele frequencies (Klopfstein et al. 2006), and slightly lower diversity in peripheral populations (Ray et al. 2003). For C. micropeltes a significant negative FS value was observed at Kampong Cham in the Lower main channel, suggesting that this site may represent the centre of a possible expansion. In addition, across the river basin, variation in allele frequencies appears to be clinal, and diversity is lower in upper Mekong sites, suggesting that upstream samples may represent newly established populations at the periphery of a spatial expansion. Specific tests for population size reduction did not identify a significant signature of genetic bottleneck at any site; however, the overall low allelic range across all sites and the low number of loci used are likely to have made the probability of detecting statistically significant effects low.

An alternative explanation for the higher diversity at downstream sites is asymmetrical gene flow. For freshwater systems, the unidirectional nature of stream flow can bias dispersal in a downstream direction (Hänfling & Weetman 2006; Hernandez-Martich & Smith 1997). Asymmetrical dispersal between sub-populations can be a significant influence in determining the spatial distribution of diversity (Fraser et al. 2004; Wilkinson-Herbots & Ettridge 2004). If significantly more dispersal occurs in a downstream direction than in an upstream direction, then diversity may be higher downstream due to the accumulation of migrant genotypes and retention of local variation. Although this hypothesis fails to account for the somewhat uniform genetic composition of upstream sites observed for C. micropeltes, it is possible that low population sizes in the upper Mekong may have prevented new mutations from being retained in situ over long periods of evolutionary time due to strong genetic drift, limiting divergence away from the most common allelic forms. Although sample sizes at the very top of the system at Luang Prabang and Tha Heua were very low (n = 1 for each), diversity observed at these sites fits with either hypothesis of recent northward range expansion or downstream accumulation and retention.

At the interface between upstream and downstream groups, pair-wise analysis of mtDNA and microsatellite loci revealed significant differentiation for each comparison between Stung Treng and Kratie. This break is very clearly demonstrated in the distribution of mtDNA variation (Figure 4.4), as these sites did not share a common haplotype (although sample sizes were small for the mtDNA data set). When clustering methods were employed to investigate microsatellite diversity in the larger data set, however, both FCA (Figure 4.17) and the Bayesian analysis (Figure 4.18) suggested some similarity between Kratie and upstream individuals. In addition, the sample collected from Kratie showed a significant heterozygote deficiency, which could indicate non-random mating due to inbreeding, assortative mating or a Wahlund effect (where the sample constitutes a mix of sub-populations) (Van Oosterhout et al. 2004). These findings indicate that the local Kratie C. micropeltes population in the transition area between major groups is not at equilibrium.
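The Wahlund effect mentioned above can be illustrated numerically. The sketch below assumes a biallelic locus and equal-sized sub-populations that are each in Hardy-Weinberg equilibrium, and shows that pooling them produces an apparent heterozygote deficit (positive FIS) whenever their allele frequencies differ. The function name and the allele frequencies are hypothetical and are not estimates from the Kratie sample.

```python
def pooled_fis(sub_freqs):
    """Apparent F_IS when equal-sized sub-populations, each in
    Hardy-Weinberg equilibrium at a biallelic locus, are pooled.

    `sub_freqs` holds the frequency of one allele in each sub-population.
    """
    k = len(sub_freqs)
    p_bar = sum(sub_freqs) / k
    if p_bar in (0.0, 1.0):
        raise ValueError("locus is monomorphic in the pooled sample")
    # Heterozygosity actually present: mean of within-group 2pq.
    h_obs = sum(2 * p * (1 - p) for p in sub_freqs) / k
    # Heterozygosity expected if the pooled sample were one random-mating unit.
    h_exp = 2 * p_bar * (1 - p_bar)
    return 1 - h_obs / h_exp


divergent = pooled_fis([0.2, 0.8])   # two differentiated sub-populations
identical = pooled_fis([0.5, 0.5])   # no differentiation, no deficit
print(round(divergent, 2), round(identical, 2))
```

The deficit equals the variance in allele frequency among sub-populations divided by the pooled p(1 - p), so even modest sub-structuring can generate a detectable departure from equilibrium.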


Despite the nuclear DNA analysis that assigned some Kratie individuals to the upstream group, it is unlikely that this sub-population has actually received significant numbers of downstream migrants from Stung Treng in very recent history, as no individuals carrying mtDNA haplotype 1 were detected among the sample. An alternative explanation for the assignment would be that dispersal is sex-biased, with only males moving downstream. This is unlikely to be the case, however, as no signature of sex-biased dispersal can be observed at other sites and there is no ecological evidence for the presence of such a life history trait in C. micropeltes. Instead, similarity may be an artifact of early upstream colonisation of first Kratie and subsequently Stung Treng, signaling the beginning of a gradual shift in allelic frequencies due to successive bottlenecks associated with a hypothesised stepwise colonisation upstream. Alternatively, unidirectional gene flow and low resident population sizes at Kratie may have limited diversity in comparison with downstream populations.

At Kratie the Mekong channel is more than 3km wide and has a maximum depth of only 5m, however low rocky hills adjacent to the river constrain the channel, and floodplains that characterise lower river sections are absent (Gupta & Liew 2007). C. micropeltes is known to spawn in floodplains (Lim et al. 1999), and the appearance of a Wahlund effect may reflect local population sub-structuring as a result of limited spawning habitat in this area. Cryptic population structuring in another endemic Mekong fish species, Pangasianodon hypophthalmus, has been attributed previously, in part, to limited spawning sites in this part of the Mekong Basin (So et al. 2006a).

Among the downstream group, the differentiation among Great Lake and mainstream sub-populations is characteristic of only limited dispersal among regions. Here, differences in allele frequencies rather than in allelic composition define sub-populations (see for example Figure 4.4, and Appendix 11). The three Great Lake samples (from sites BB, PS, and KH) displayed very similar gene frequencies for all loci examined, while for main channel sites individual allelic frequencies were more variable and overall allelic richness was generally higher. While FST analysis of microsatellite variation revealed low but significant variation between Mainstream and Great Lake sites (Table 4.8), RST analysis, which performs best when differentiation is high and in this case was poorly correlated with other measures of differentiation (Figure 4.13), could not differentiate between lower basin samples (Table 4.9).

The separation of Great Lake and Mainstream sub-populations detected in mtDNA SAMOVA (Figure 4.8) and nDNA FST analysis (Table 4.8) is not surprising, as ecological knowledge and the recovered pattern of isolation by distance (Figure 4.6, Figure 4.16) suggest dispersal is likely to be limited among river sections. In addition, if downstream accumulation of diversity is occurring, the Great Lake may receive migrants from main Mekong channel populations when the Tonle Sap River reverses flow annually and floods the Great Lake, but any downstream dispersal from Great Lake populations would not contribute to diversity at Kampong Cham and Kratie, which lie upstream of the confluence between the Mekong and Tonle Sap Rivers.

Differences between river sections in the lower basin may be a remnant of historical independent colonisations by a small number of founders (for example Bryan et al. 2005), a biased pattern of dispersal (for example Fraser et al. 2004), or alternatively, may reflect different demographic effects in the absence of large scale contemporary dispersal among regions (for example Castric et al. 2001). The Great Lake C. micropeltes population is subject to intense fishing pressure that removes large numbers of C. micropeltes annually. If this pressure is severe enough to reduce the number of effective breeders (Ne) significantly, the Great Lake population may be undergoing mild successive population bottlenecks, and hence may experience the effects of genetic drift at a more accelerated rate in comparison with other sub-populations. Fishing activity is known to impact levels of genetic diversity among harvested species (Jørgensen et al. 2007; Smith et al. 1991). As there is evidence to suggest that C. micropeltes in the Great Lake are currently being fished at or above maximum sustainable yield (Hai Yen et al. 2009), it is reasonable to assume that the resident population could be undergoing some level of fishery-induced change in genetic diversity. The virtually identical gene frequencies among the three Great Lake sites, however, do indicate that all individuals sampled constitute members of the same breeding population in this region.

While allele frequencies did vary among downstream main channel sites, significant differentiation was only observed in pair-wise comparisons between Tram Chim (n = 5) and other mainstream sites. This suggests that since their establishment in the downstream main channel, either the effective number of migrants (Nem) has been high enough or local population sizes have been large enough to counteract any local changes in gene frequency brought about by genetic drift, harvesting or bottlenecks where they have occurred. Interestingly, and despite the small sample size (n = 2), the recently established (<30yr old) and now totally isolated population in Tra Su National Park (site TS) showed no evidence for a significant decline in diversity or divergence from other mainstream sites.


While the results presented here indicate that C. micropeltes is most probably a recent coloniser of the greater Mekong Basin, with all resident individuals apparently descending from a common ancestor in the recent past, the data provide no insight into where the original founders may have come from, or indeed how they first dispersed into the Mekong system. The pattern of diversity observed may suggest that the lower basin represents the origin of a range expansion that progressed upstream, or alternatively that dispersal has been predominantly in a downstream direction. Potentially, dispersal into the Mekong could have been accomplished in the upper basin via the Siam drainage (palaeo-Chao Phraya), in the middle basin via rivers on the Khorat Plateau, or in the lower basin via extended Pleistocene drainage networks.

As the contemporary distribution of C. micropeltes extends across the Chao Phraya, eastern Malaysia, Sumatra and Borneo (Figure 4.1), recent dispersal across the exposed Sunda Shelf is an appealing hypothesis. Other freshwater taxa are thought to have dispersed across the same region during this period (Dodson et al. 1995; McConnell 2004). More extensive sampling, including sampling individuals from the upper and lower Chao Phraya, Malaysia, and the Indonesian archipelago will be required to identify which, if any, of these populations are most closely related to Mekong C. micropeltes, and if possible, to pinpoint a likely origin for founders of contemporary Mekong populations.

Furthermore, quantifying the level of divergence between Mekong and extra-Mekong individuals has the potential to elucidate how recently C. micropeltes dispersed into the Mekong, or alternatively, whether human populations could have introduced C. micropeltes to the Mekong River Basin. C. micropeltes’ desirability to humans as a high quality protein resource and its ability to survive out of water suggest that deliberate introduction could easily have been accomplished by early human inhabitants, who have had a long history in the Mekong Basin (Higham 1998). The pre-historic Khmer of central Cambodia are known to have been harvesting large snakehead fish from around the Great Lake 2,500 years ago, and probably regularly transported fish harvested from large rivers over distances of 20-25km for trade (Higham 1998; O’Reily et al. 2006).

With or without an initial human introduction, C. micropeltes may have dispersed across the greater Mekong Basin naturally via stepwise range expansion facilitated by either up or downstream dispersal, or initial populations may have been distributed via human mediated introductions along the watercourse. The pattern of observed diversity could even have resulted from a combination of both modes of dispersal. Many recent invasions by aquatic species have been characterised by a combination of natural and human mediated range expansions (for example Johnson & Carlton 1996; Kawamura et al. 2006; Lindholm et al. 2005).

The two hypotheses posed here present essentially opposing mechanisms to explain the observed pattern of differentiation (upstream invasion versus downstream dispersal), yet both could be expected to result in the similar patterns of spatial genetic diversity seen for C. micropeltes in the lower Mekong River, given low levels of genetic diversity and divergence. It is unlikely that the lack of resolution is due to the relatively small number of individuals sampled or the number of loci assayed, as all four independent markers reflect the same pattern of population structure. Unfortunately, without extra field sampling and genetic analysis to identify potential source populations outside the Mekong drainage, it would be difficult to reject one hypothesis in favour of the other. However, comparison with the known evolutionary histories of co-distributed species may provide insight into which of the hypotheses is a better explanation for the observed data. Phylogeographical patterns of other Mekong taxa are reviewed in the following chapter.

Stock Structure in the Lower Mekong Basin

The patterns of diversity observed here for C. micropeltes suggest that populations are structured into at least four management units across the lower Mekong River Basin. Individuals resident in the Great Lake, which are perhaps the most heavily fished of all C. micropeltes in the Mekong, appear to constitute a single stock from Battambang at the head of the Great Lake to Kampong Chhnang where the Lake flows into the Tonle Sap River. In the main Mekong Channel in Cambodia, individuals harvested from Kratie and from Kampong Cham appear to belong to a relatively homogenous gene pool that also includes C. micropeltes from the river delta (sites TS and TC), although Kratie may exchange fewer migrants with other downstream sites.

Fish sampled at Stung Treng, the most upstream site in the Cambodian section of the river, appear to constitute members of a stock distinct from lower mainstream populations. Diversity at this site comprises only a subset of the diversity resident downstream, indicating that effective upstream dispersal is low between Stung Treng and Kratie. This implies that potential for recruitment to the local Stung Treng area from the greater Mekong is limited, and hence local population declines due to overfishing are unlikely to be compensated for, in the short term at least, by recolonisation from other parts of the system.

In the Mun-Chi sub-drainage in Thailand, resident C. micropeltes constitute an independent management unit. Although diversity in this sample was similar to that observed for Stung Treng, the overall pattern of isolation by distance among all sites suggests that dispersal is likely to be low at this spatial scale, and that any genetic similarity is more likely to be a remnant of stepwise colonisation in the past than a result of contemporary gene flow.

Differences in Channa Mekong phylogeography

This thesis examined the phylogeography of two snakeheads in the Mekong basin. Although similar in ecology and life history, the two species show remarkably different patterns of geographical population structuring across this river basin. While C. striata is perhaps more capable of exploiting ephemeral freshwater habitats than C. micropeltes, which prefers deeper permanent freshwaters, both species share many ecological similarities, including nest building, parental care, and air-breathing. Given these similarities, and the fact that neither species appears to be recently arisen, the disparity in phylogeographical patterns is somewhat unexpected.

C. striata, which is present in two divergent forms in the upper Mekong region, has clearly been resident across the region for long periods of evolutionary time. Divergence among members of the widespread form indicates that populations have been established since at least the Pliocene, pre-dating the formation of contemporary Mekong drainage lines. Populations in the lower basin show a signature of demographic expansion dating back 150Kya, and the populations in this region today are haplotypically diverse, indicating that large population sizes have persisted in recent evolutionary time. The restricted C. striata form may be a relatively more recent arrival into the Mekong Basin, possibly in association with drainage re-alignment; however, the diversity and divergence within this group also suggest a historically stable population history. In contrast, C. micropeltes is almost certainly a very recent colonist of the Mekong drainage, with all individuals sampled likely to have descended from a small ancestral population in the very recent past.

While inherent differences in the evolutionary rate of Cyt b could possibly have resulted in faster accumulation of diversity in C. striata in comparison to C. micropeltes, rate variation is unlikely to have resulted in such vastly different phylogeographical patterns given that the two taxa are members of the same genus and have similar generation times (Baer et al. 2007; Martin & Palumbi 1993). Why then, if C. striata has been present in the Mekong region for long periods of evolutionary time, has C. micropeltes not also been present, and why does it therefore not show a similar pattern of phylogeographical structuring that pre-dates historical drainage rearrangement? Why, also, is the population size expansion for C. micropeltes likely to have occurred so much more recently than for C. striata in the same geographical region?

Although significantly structured across the Mekong basin, C. micropeltes appears to be less geographically structured than C. striata, implying that C. micropeltes may possess a higher dispersal potential. If this is true, then the apparent absence of C. micropeltes from the greater Mekong region until recent evolutionary time is even more intriguing, and not easily reconcilable with differences in dispersal ecology.

One explanation for the observed difference in phylogeographical structures is that C. micropeltes has previously been prevented from attaining a Mekong (and greater SE Asian) natural geographical distribution by the presence of some formerly long standing barrier to dispersal. Evolution in an isolated Molengraaff river basin (Figures 1.1 and 4.1), either draining northwest into the Andaman Sea at the western margin of Sundaland, or draining east across what is now the floor of the Java Sea at the southeast margin of the Sunda Shelf, may have prevented dispersal north into mainland SE Asia and the Mekong. Such a hypothesis requires that colonisation of greater SE Asia by C. micropeltes was able to be achieved only recently, either naturally during large habitat fluctuations in the Pleistocene, or through translocation by humans that effectively overcame natural barriers to dispersal that may still be in place.

Recent colonisation (and/or translocation) does not appear to have contributed significantly to contemporary phylogeographical structuring of C. striata populations in the Mekong, probably because populations of this species right across mainland SE Asia pre-date the formation of contemporary drainage networks. Instead, for C. striata, historical population expansions and deep divergence among lineages reflect much more ancient changes in freshwater habitats and habitat connectivity.


Conclusion

Phylogeographical and population genetic analysis of C. micropeltes across the Mekong Basin revealed very low genetic divergence among all individuals sampled, suggesting that this species has recently colonised the greater Mekong basin from a relatively small ancestral population. Despite low mtDNA nucleotide diversity and low microsatellite allelic richness in comparison with diversity in C. striata, contemporary C. micropeltes populations do appear to be significantly structured throughout the drainage. The geographical scale of structure is, however, most likely broader than the scale of population structuring for C. striata. This is in keeping with ecological information on the two species, which suggests that C. micropeltes is better able to utilise permanent deep water habitats such as the main river channel. This species, therefore, is potentially better able to disperse across larger geographical distances within the river basin.


Chapter 5 General Discussion



Snakehead fishes constitute an important component of wild freshwater fishery resources in SE Asia, especially in the Mekong Basin, where the fish are both commercially important and are very popular (Champasri 2003; Lieng & van Zalinge unpublished; MRC Fisheries Program 1999; Singhanouvong & Phouthavongs 2002; Yusoff et al. 2006). In addition to the large contribution that snakehead fishes make to Mekong capture fisheries, the genus is also important to indigenous culture industries in Thailand, Cambodia, and Vietnam (Boonyaratpalin et al. 1985; Campbell et al. 2006), as well as further afield in Malaysia and China (Courtenay & Williams 2004; Yusoff et al. 2006).

The current study examined the macro- and micro-evolutionary history of Asian snakeheads using molecular data. Macro-evolutionary relationships were constructed among members of the genus to assess when individual species likely arose and to quantify the level of divergence among taxa, and also within a few key taxa where natural geographical distributions were large. Divergence times were estimated using a combination of molecular data from multiple DNA loci and fossil evidence, and were interpreted with respect to past geological and climatic conditions. Together, this information was used to reconstruct the historical biogeography of the family Channidae, revising hypotheses previously posed in other studies.

Intra-specific genetic diversity was assessed across SE Asia for two common members of the Asian snakeheads, C. striata and C. micropeltes. These data were used to reconstruct the micro-evolutionary history of each species in SE Asia. For C. striata, sampling spanned multiple drainage basins and was able to address questions regarding levels of historical and contemporary gene flow across mainland SE Asia and Sumatra. The distribution of divergent lineages uncovered in the phylogenetic analysis was documented and the extent of genetic exchange among them assessed. For C. micropeltes, population structure in the lower Mekong Basin was determined, and hypotheses were presented that may explain the pattern of genetic diversity uncovered for this species in this part of its distribution.

Results presented here help to clarify the systematics of some Channa species, and provide information on stock structure for the two most economically important members of the genus in the Mekong Basin. This information is relevant to management of the wild fishery, and also provides data that may assist development of C. striata and C. micropeltes local stocks as sustainable and productive culture species. The genetic techniques used here, in particular the microsatellite loci that were isolated for each species, have the potential to be applied in future studies that address questions regarding genetic diversity in C. striata and C. micropeltes.

Historical biogeography of tropical Asian freshwater fishes

At the broadest scale, taxonomic composition and the level of species richness in freshwater fishes vary greatly among continents and zoogeographical regions (Lundberg et al. 2000). Tropical Asia, including SE Asia and the Indian subcontinent, however, shares a number of freshwater fish lineages with Africa, including anabantoids (Rüber et al. 2006), cyprinids (Stiassny & Getahun 2007; Tang et al. 2009), notopterids (Inoue et al. 2009), silurids (Agnese & Teugels 2005; Sullivan et al. 2006) and channids. The freshwater faunal affinity between continents results from shared Gondwanan fauna and also from later immigration of fishes between regions (Stewart 2001).

For those groups that are distributed across the two continents, estimates of divergence times among African and Asian groups vary greatly. African and tropical Asian freshwater knifefishes (Inoue et al. 2009), cichlids (Azuma et al. 2008), (Kumazawa & Nishida 2000), and killifish (Murphy & Collier 1997) may have diverged during the break-up of Gondwanaland more than 130Mya, with taxa subsequently reaching Asia after rafting on the Indian plate. Although absence of a fossil record in the Cretaceous has cast doubt on the validity of Gondwanan vicariance for some freshwater taxa (Briggs 2003b; Murray 2001; Vences et al. 2001), there is strong support for the credibility of this biogeographical pattern for freshwater fishes (e.g., Sparks & Smith 2005).

For labyrinth fishes (Anabantoidei), which like channids are able to breathe atmospheric oxygen and also exercise parental care, divergence among African and Asian lineages has been estimated to have occurred at least 30Mya in the Eocene, but possibly as early as the Late Cretaceous 87Mya (Rüber et al. 2006). For other groups, the Miocene appears to have been a period when divergence arose among African and Asian groups via dispersal events. African Schilbeidae and Asian Pangasiidae catfishes, for example, diverged in the early Miocene (Pouyaud et al. 2004). Walking catfish (family Clariidae), which are also able to breathe atmospheric oxygen and move overland, are believed to have arisen in Asia and dispersed into Africa from the Middle East during the Miocene at least 15Mya (Agnese & Teugels 2005). Exchange of cyprinid taxa among continents is also thought to have occurred in the early (23Mya) and late (9Mya) Miocene (Stewart 2001; Tang et al. 2009).

Contrary to earlier hypotheses about early Cretaceous vicariance associated with Gondwanan break-up (Li et al. 2006), or late-Miocene-Pliocene divergence associated with shifts in suitable habitat (Bohme 2004), results of the current study place divergence between Asian Channa and African Parachanna in the Eocene around 40-49Mya. This new, fossil calibrated molecular phylogeny for channids represents a significant revision of previous divergence time estimates, and requires a new biogeographical hypothesis to explain the distribution of the family. The topology of the current phylogeny is, however, broadly congruent with Li et al.’s (2006) reconstruction, suggesting that the relationships established in both analyses are robust in all but time.

Results presented here also suggest a south Asian origin for the family Channidae, with a westward dispersal event having given rise to the African Parachanna lineage in the mid-Eocene at least 40Mya. The reciprocal monophyly of Asian Channa and African Parachanna taxa, also observed by Li et al. (2006), indicates that if more recent exchange of African and Asian channid taxa occurred in parallel with the dispersal of other freshwater fishes, these dispersal events did not result in establishment of inter-continental ranges for either channid lineage.

Among tropical Asian fish taxa sampled here (Channa species), the earliest divergence is estimated to have occurred at the onset of the Oligocene around 30Mya. The majority of taxa belonging to one clade that arose then, the ‘gular scale’ clade, have ranges restricted to SE Asia, and it has been suggested here that Oligocene cladogenesis may have resulted from lowered east-west dispersal opportunity during this period across southern and SE Asia due to the onset of a dry climate (Morley 1998). An east-west split in featherback knifefish was estimated to have occurred around this time (Inoue et al. 2009). For , which like Channidae arose in the mid-Eocene in Asia, key basal lineages also diverged in the Oligocene (Rüber et al. 2007; Wang et al. 2007), indicating that this was a significant period for evolution of many Asian freshwater teleosts.

Among the Channa taxa considered here, most speciation events occurred during the Miocene. Generally, the Miocene appears to have been an important period in the radiation of many Asian freshwater fishes. Both Asian cyprinid and catfish lineages experienced species radiations during the Miocene, possibly in association with broad scale dispersal events (Agnese & Teugels 2005; Gaubert et al. 2009; Pouyaud et al. 2004; Pouyaud et al. 2000; Tang et al. 2009; Wang et al. 2007). The ancestral C. diplogramme (India) – C. micropeltes (SE Asia) lineage split in the mid-Miocene, suggesting factors affecting dispersal and vicariance may have altered significantly across southern Asia during this time. Other Channa lineages that arose during the Miocene seem to show a west/east division in species’ geographical ranges, for example C. bleheri (India) and C. maculata (SE Asia).

Unlike previous channid phylogenies, which have focused on reconstructing relationships at the interspecific level (e.g., Li et al. 2006; Vishwanath & Geetakumari 2009), the current phylogeny examined levels of divergence among individuals of the same species and placed this divergence in the context of divergence within the family. This approach provided new insight into the evolutionary history of individual Channa species across tropical Asia, with particular regard to divergence within taxa between India and SE Asia, between SE Asian mainland and island freshwaters, and across a recently formed large river drainage, the Mekong.

While few studies to date have examined the divergence between Indian and SE Asian freshwater fish taxa in a historical biogeographical context, the up-thrust of the Tibetan Plateau and associated formation of strong seasonality in precipitation (the East Asian monsoon) in the Late Miocene may have had a significant impact on dispersal for formerly widespread taxa. The most recent common ancestor of glyptosternoid catfishes in southern Tibet and the eastern Himalayas is estimated to date from the Late Miocene 6-8Mya (Peng et al. 2006), and other Tibetan taxa, including some lizards, show patterns of allopatric divergence that reflect isolation that was established around the Miocene-Pliocene boundary (Jin et al. 2008). Among C. striata, divergence between Indian and SE Asian lineages is estimated here to represent 6 to 8 million years of evolution in isolation, suggesting that geomorphological events in the late Miocene had the same impact on C. striata as they appear to have had on other freshwater taxa such as catfishes.

Extensive freshwater faunal exchange has been documented among mainland SE Asia and the islands on the Sunda Shelf during the Pliocene and Pleistocene (e.g., Dodson et al. 1995; McConnell 2004). Divergence between mainland and Sumatran C. striata and C. lucius suggested most recent common ancestors at 3.7 and 5.7Mya respectively, placing dispersal across Sundaland for both taxa well before the Pleistocene. Extended river basins may not have physically connected freshwaters of mainland and island drainages in the Pleistocene, and intermediate land bridges are thought to have been arid (Gorog et al. 2004; Heaney 1991; Morley 1998; van der Kaars et al. 2001; although see Cannon et al. 2009). Conditions such as these are likely to have prevented more recent inter-drainage dispersal for snakehead species even where marine barriers were absent. Extended rivers in the Gulf of Thailand did, however, unite Malaysian and Indo-Chinese drainage basins (Sathiamurthy & Voris 2006). The very low divergence between Malaysian and other mainland C. striata populations indicates that dispersal has occurred for this species at least at this spatial scale in more recent history, probably as a result of drowned drainage basins in the Gulf of Thailand.

Perhaps the most significant finding resulting from the phylogenetic reconstruction here relates to the deep divergence uncovered between two C. striata lineages that are sympatric in the upper Mekong Basin. Estimated divergence among lineages has been dated here to around 8Mya, placing the last common ancestor in the late Miocene. This finding is striking because not only are these highly divergent lineages found in sympatry, but divergence is greater among them than between Indian and SE Asian individuals. Furthermore, microsatellite data indicated genetic exchange where they were found together, indicating interbreeding. Divergence of this magnitude must have occurred in isolation, and while it is not uncommon for divergent lineages to re-contact after long periods of allopatric evolution, especially where drainages have undergone recent geomorphological change, it is unusual for such deeply divergent lineages to retain reproductive compatibility (i.e., display no obvious external morphological differentiation or reproductive isolation). The late Miocene to early Pliocene has been implicated in freshwater species radiations for Asian catfish (Guo et al. 2005; Peng et al. 2006; Pouyaud et al. 2000), cyprinids (Wang et al. 2007), and gastropods (Glaubrecht & Kohler 2004). For many Mekong sisorid catfish with limited distributions in northeastern Lao PDR (Ng & Rainboth 2001) it is likely that morphological divergence evolved during this period. It appears that the historical processes that promoted species radiations for many other taxa in mainland SE Asia during the Miocene-Pliocene did not have the same impact on C. striata, despite strong evidence that late Miocene vicariance directly affected this species in the hills of northern Lao PDR. It is possible, however, that speciation did occur in the Miocene-Pliocene for other species in the Channa genus that were not sampled here.
Li et al’s (2006) channid phylogeny identified close relationships among three species distributed in and surrounding regions: C. barca, C. burmanica, and C. bleheri. The current study has revised the divergence time estimated by Li et al for C. bleheri and C. gachua, and provided a new estimate of around 11Mya, indicating that subsequent cladogenesis events in the C. bleheri lineage may have occurred over the same time period during which intra-specific divergence arose among C. striata lineages. This highlights the importance of sampling broadly below the level of recognised species if we are to reach an understanding of how evolution has progressed for this family of freshwater fishes.

Phylogeography of Mekong fishes

The Mekong River Basin’s recent history of large scale geomorphological changes, including stream amalgamations, tectonic subsidence and changes in inclination, has undoubtedly led to the accumulation of suites of taxa with different phylogeographical histories. This means that generalisations about phylogeographical structure in freshwater systems, such as a pattern of isolation by distance or higher probability of similarity among demes within drainages than among drainages (Meffe & Vrijenhoek 1988), are unlikely to apply to all Mekong taxa.

Recently, the Mekong Basin has been subdivided into five major freshwater ecoregions based on the distributions of freshwater species (Abell et al. 2008), indicating that ecological, and probably historical, differences exist among river sections. Most species that span the entire drainage are likely to show some level of subdivision based on their individual ecology, life history, and historical presence in different river sections. Unfortunately, perceived similarities in ecological and life history requirements among taxa have proven to be poor indicators of congruence in population sub-structure among Mekong taxa. For example, the morphologically similar migratory cyprinids Henicorhynchus lobatus and H. siamensis display vast differences in diversity, population structure and intra-population divergence across the Mekong Basin (Adamson et al. 2009; Hurwood et al. 2008), demonstrating that historical processes have potentially played a major role in differential structuring of contemporary Mekong River fish populations.

The current study sampled the snakehead fishes C. striata and C. micropeltes across four Mekong River ecoregions: the Lower Lancang (northern Lao PDR), the Khorat Plateau (northeastern Thailand), Kratie-Stung Treng (Cambodian main channel and Vietnamese Highlands), and the Mekong Delta (including Tonle Sap Great Lake). Given the wide natural geographical distributions and relatively sedentary life histories of the two snakehead species, they were expected to show phylogeographical patterns that reflected deep historical rather than recent probabilities of connectivity (i.e., population structures that were established by past freshwater connections rather than in contemporary drainage networks). For C. striata this expectation was in general met. Populations from the Mekong Basin in northern Lao PDR (Lower Lancang) and the Upper Chao Phraya River Basin were genetically similar, most likely reflecting a history of connectivity pre-dating Pleistocene drainage rearrangement, although ongoing dispersal cannot be ruled out entirely given the species’ capacity for overland movement. Genetic similarity between drainage basins in northern Thailand had previously been observed for C. striata populations based on allozyme frequencies (Hara et al. 1998), but the current study, which incorporated mtDNA sequence data, provided genealogical information that can be directly interpreted with respect to population histories using the principles of coalescent theory (Hudson 1990; Kingman 1982). Results here show that individuals are similar among drainages in northern Thailand – northern Lao PDR because they shared a recent common ancestor. Given the inferred drainage history of the region (Attwood & Johnston 2001; Brookfield 1998; Rainboth 1996b), the common ancestor probably once inhabited the Siam River Basin (Paleo-Chao Phraya), which later lost its headwaters to the growing Mekong in the early-mid Pleistocene only 1-1.9Mya (Rainboth 1996a). Other endemic freshwater taxa, including small and large cyprinid species (Adamson et al. 2009; Rainboth 1996b), catfish (Leesa-Nga et al. 2000), and gastropods (Glaubrecht & Kohler 2004) also show genetic similarity and/or similar species composition across the same region.

Further downstream, the middle Mekong Basin including the Mun River sub-drainage (the Khorat Plateau ecoregion) is believed to be another recent addition to the Mekong watershed (Brookfield 1998). Populations of H. lobatus in the Mun River sub-drainage are thought to have diverged from mainstream Mekong individuals sometime around 2.5-3Mya, probably when the river sections were separate (Hurwood et al. 2008). For this species, however, greater genetic divergence (up to around 4.8Mya) was observed between Chao Phraya and Mekong Basin populations than among any Mekong Basin individuals, suggesting that the Khorat Plateau may historically have experienced a greater degree of isolation from the Siam than from the Paleo-Mekong during the Pliocene. Relatively few studies have examined phylogeographical patterns of freshwater taxa across the Khorat Plateau region compared with the Greater Mekong, probably in part because the trans-boundary nature of the Mekong means that such research requires international co-operation for sampling. Generally, however, it seems that higher genetic similarity is usually observed between Khorat and Greater Mekong sites than between the Chao Phraya and Khorat Plateau (e.g., Doi & Taki 1997), although some species, including some freshwater gastropods, show high rates of endemism and species diversity in the middle Mekong – Mun that are believed to have arisen in isolation from neighbouring freshwater courses (Davis 1979).

Diversity among individuals from the Khorat Plateau was compared in the current study with other Mekong individuals for three snakehead species, C. gachua, C. striata, and C. micropeltes. For C. gachua, which inhabits hill streams (Kottelat 2001), individuals from different Khorat Plateau tributaries were closely related (Cyt b p-distance < 0.01), suggesting that dispersal may have been maintained across the Plateau during the Pliocene, possibly as a result of small order stream captures. The level of mitochondrial sequence divergence among Khorat and lower Mekong genotypes (Cyt b p-distance < 0.019) suggests a common ancestor approximately 1-2Mya. For C. striata, most of the diversity present in the Khorat Plateau was also found in the lower Mekong Basin, and overall population assignment suggested that most Khorat individuals belong to the widespread C. striata population expansion that occurred across the mid and lower Mekong < 1Mya. For both species, the phylogeographical patterns recovered suggest that historically significant physical or ecological barriers that have potentially limited dispersal of freshwater taxa between the middle and lower Mekong, including drainage isolation (Brookfield 1998), rigid migration pathways (Hurwood et al. 2008) or low dispersal capability and habitat specificity (Davis 1979), have not restricted dispersal for these two Channa species.
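The step from a Cyt b p-distance to an approximate age for a common ancestor can be illustrated with a minimal sketch. It assumes a simple strict molecular clock with a pairwise divergence rate of 1-2% per million years; that rate range is an illustrative assumption for this example only, not a calibration taken from this chapter, and the function names are hypothetical. The fossil calibrated dates reported for the family phylogeny were not derived in this way.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of compared sites that differ between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid_bases = set("ACGT")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in valid_bases or b not in valid_bases:
            continue  # ignore gaps and ambiguity codes
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites in the alignment")
    return differences / compared


def divergence_time_mya(p: float, pairwise_rate_per_my: float) -> float:
    """Time since the common ancestor under a strict clock.

    pairwise_rate_per_my is the ASSUMED divergence rate between two
    lineages (substitutions per site per million years, both branches).
    """
    if pairwise_rate_per_my <= 0:
        raise ValueError("rate must be positive")
    return p / pairwise_rate_per_my


def divergence_time_range_mya(p: float,
                              slow_rate: float = 0.01,
                              fast_rate: float = 0.02) -> tuple[float, float]:
    """Youngest and oldest age implied by an assumed range of clock rates."""
    return (divergence_time_mya(p, fast_rate),
            divergence_time_mya(p, slow_rate))
```

Under these assumed rates, `divergence_time_range_mya(0.019)` gives roughly 0.95 to 1.9 million years, which is of the same order as the 1-2Mya quoted above. Uncorrected p-distances underestimate divergence for older splits because of multiple substitutions at the same site, so a sketch of this kind is only reasonable for recent divergences.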

In contrast to C. striata in the Upper Mekong, no Khorat C. striata were assigned to the ancestral Chao Phraya population, providing further evidence that more recent drainage connections must have been present between the upper Mekong-Chao Phraya than between the Khorat and Siam. If the Siam-Khorat connection did exist in the past, it must have been present prior to the arrival of the East Asian form of C. striata to mainland SE Asia, as all Khorat individuals represent recently derived members of this clade.

Although relatively little divergence was identified between Khorat and lower Mekong Basin individuals for C. gachua and C. striata, both species are present at the northern margin of the Plateau as more divergent forms, representing ~8 million years of evolution in isolation for C. striata, and ~3 million years for C. gachua (based on p-distance). This region has not been recognised, however, as a distinct freshwater ecoregion, although the underlying geomorphology has been classified as unique (Gupta & Liew 2007), and is characterised by alluvial channels between rock cut river sections. These features possibly indicate that this region has a geomorphological history different from the rest of the Khorat region. The strong phylogeographical pattern observed here for the two Channa species has not, to the best of current knowledge, been observed for any other Mekong taxa yet examined, but if enduring historical barriers to dispersal have been present for snakeheads in this area, it is likely that dispersal of other freshwater taxa will also have been limited historically. Evidence that divergent C. striata individuals from this region now occur in low frequency downstream in the lower Khorat Plateau suggests that any historical barrier to dispersal may no longer be present, and although dispersal in general appears to be low for C. striata, limited post-Pliocene dispersal may still be occurring.

In contrast to other Channa taxa examined here, C. micropeltes populations across the Mekong retain no phylogeographical signature of historical vicariance; instead, the extremely low nucleotide diversity present may indicate a very recent population expansion following a recent severe bottleneck. Another Mekong fish, Pangasianodon hypophthalmus, which like C. micropeltes prefers large rivers and occurs in vast numbers in the lower Mekong, is also hypothesised to have suffered a recent bottleneck due to recent dramatic changes in Mekong geomorphology (So et al. 2006c). Although the two species are similar in size and have similar habitat preferences, they have very different ecologies: C. micropeltes is relatively sedentary and guards larvae, whereas P. hypophthalmus is migratory and releases pelagic larvae. The fact that both species have very low levels of genetic diversity in the Mekong Basin suggests that the habitat on which they depend may have undergone recent large scale changes that impacted historical population sizes of both species. Although drainage rearrangement is likely to have played a large part in both cases, the Tonle Sap Great Lake and Mekong delta, which are important feeding grounds for P. hypophthalmus and provide year round habitat for C. micropeltes, were inundated by saline waters during the early to middle Holocene (Penny 2006). It is possible that this very recent marine transgression, combined with geomorphological upheavals, could also have had a major impact on available habitat, and hence on population sizes of large fish taxa reliant on slow flowing lowland Mekong ecosystems.
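Nucleotide diversity, the statistic referred to above for C. micropeltes, is conventionally calculated as the average proportion of sites that differ between all pairs of sequences in a sample. The sketch below is a minimal illustration of that calculation only; the function name and the toy haplotypes in the comments are hypothetical, it assumes aligned sequences of equal length with no gaps, and it omits the small-sample correction and weighting by haplotype frequency that dedicated population genetics software may apply.

```python
from itertools import combinations


def nucleotide_diversity(sequences: list[str]) -> float:
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("at least two sequences are required")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be non-empty and of equal length")

    pairwise = []
    for seq_a, seq_b in combinations(sequences, 2):
        # Count sites at which this pair of sequences differs.
        differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
        pairwise.append(differences / length)
    return sum(pairwise) / len(pairwise)


# Hypothetical toy samples (not data from this study):
# a sample dominated by one haplotype gives a low value,
low_diversity_sample = ["ACGTACGT", "ACGTACGT", "ACGTACGT", "ACGTACGA"]
# whereas a sample of mutually divergent haplotypes gives a higher one.
high_diversity_sample = ["ACGTACGT", "TCGTACGA", "ACCTAGGT", "ACGAACTT"]
```

A sample in which nearly all individuals carry the same or closely related haplotypes returns a value close to zero, which is the kind of signal interpreted above as consistent with a recent bottleneck followed by expansion.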

In addition to low diversity observed in C. micropeltes, low divergence among haplotypes indicates a very recent common ancestor for all Mekong individuals, probably associated with recent population expansion across the river basin. This presents a stark contrast to the phylogeographical history documented here for C. striata, a species with high diversity and with evidently a much longer history in the Mekong drainage. Without additional sampling of other rivers across the natural geographical range of C. micropeltes, it is impossible to say if colonisation was associated with a recent introduction from a source population in another drainage, or if Mekong populations of this species originated in a section of the Mekong pre-amalgamation. C. micropeltes prefers large river habitat, and is therefore unlikely to have dispersed naturally via small headwater capture events, suggesting that only large scale freshwater connections, either as a result of major drainage amalgamation or extended river basin connections on the exposed continental shelf, would have facilitated dispersal into the greater Mekong Basin. The natural geographical distribution of C. micropeltes across mainland SE Asia, the Malay Peninsula and western Indonesia suggests that dispersal in drowned rivers on the Sunda Shelf has historically been extensive for this species. This dispersal route could have enabled C. micropeltes to colonise the lower Mekong Basin and subsequently the entire drainage; however, the phylogeography of C. striata, which also has an extensive Sundaland distribution, indicates that Quaternary dispersal across the Sunda Shelf may have excluded lower Mekong populations. For this species, Malaysian and Sumatran individuals were more closely related to the ancestral Chao Phraya River genotypes than to lower Mekong ones. The close similarity between upper Mekong and upper Chao Phraya types may indicate that drainage re-arrangement, rather than extended river basins, has been more influential in shaping the distributions of snakeheads across mainland drainages.

Interestingly, although diversity was low in C. micropeltes, a genetic break identified with microsatellite analysis was found between northern Cambodian populations, and southern Cambodian Mekong and Great Lake populations. This break corresponds roughly with the break between the ‘Kratie-Stung Treng’ and ‘Mekong Delta’ ecoregions. As data presented here indicate, C. micropeltes has only recently established a wide distribution in the Greater Mekong, so the presence of a phylogenetic break for this species suggests that ecological factors are likely to have affected levels of gene flow between the two lower Mekong River sections. The existence, however, of C. marulia–like fishes previously suggested to belong to an undescribed species with a distribution limited to the ‘Kratie-Stung Treng’ section of the river (Rainboth 1996a), and shown here to be sister to, although very divergent from, other C. marulia (~10Mya), indicates that historical isolation has probably also played a role in structuring some elements of the fish faunal community in the lower Mekong Basin. Significant structure was also identified between mainstream Mekong and Tonle Sap Great Lake C. micropeltes populations, a pattern that has been reported previously for the featherback, Notopterus notopterus (Takagi et al. 2006), another endemic species with similar habitat preferences to C. micropeltes (Rainboth 1996a). This suggests that some type of ecological barrier may exist between the Mekong and Tonle Sap for species that share a habitat preference for large slow flowing water bodies.

Despite evidence that genetic bottlenecks have occurred in some species, the lower Mekong Basin appears in recent evolutionary time to have provided the opportunity for populations of many freshwater fish species to undergo demographic expansions. The highly diverse star-like phylogeny observed for C. striata haplotypes in the lower basin suggests a recent population expansion, and this pattern has also been seen for the small cyprinid, H. siamensis (Adamson et al. 2009), and a catfish, Pangasius bocourti (So et al. 2006c) across the same region. Population expansions may have occurred as a result of increased available freshwater habitat for fishes in the lower Mekong Basin, either due to the formation of the vast Tonle Sap Great Lake floodplains, or because new species were able to colonise the system from external source populations. High diversity associated with demographic expansions in the lower reaches of the Mekong River Basin, however, somewhat confounds conclusions about the contribution unidirectional downstream gene flow may make to promoting and maintaining high diversity in the lower basin. While most downstream diversity seems to have arisen in situ, at least for C. striata, different life history traits among Mekong taxa have ultimately influenced the way demographic expansion has impacted on population structure. For C. striata, this expansion extended onto the Khorat Plateau, while for H. siamensis, rigid migration pathways in the lower Mekong, probably constrained by the Khone Falls on the Lao PDR-Cambodia border, kept diversity that arose during demographic expansion confined to the lower Mekong Basin (Adamson et al. 2009). This emphasises the important influence that ecology, including life history, dispersal ability and habitat requirements, can have on shaping the distribution of genetic diversity for freshwater taxa, even in a ‘young’ system like the Mekong River drainage basin.

Managing snakeheads in the Mekong: a genetic perspective

As human populations continue to expand in SE Asia, the security of natural fish stocks for food and income is a high priority for most governments in the region (Coates 2002). Natural fish populations face threats from the fisheries sector, including over-exploitation and the use of destructive fishing methods, and also from the activities of external sectors associated with rapid regional development and human population growth, which can cause habitat loss, ecosystem simplification, pollution and changes in flow regimes (Baran & Cain 2001; Coates 2001). Although the overall wild fisheries catch remains high, over-fishing and riparian development have both recently been implicated in declining catches for some fish species in parts of the river basin (e.g., Allan et al. 2005; Dudgeon 2000b; Hai Yen et al. 2009). Management of the Mekong Basin, continued development, and sustainability of fishery resources are further complicated by the trans-boundary nature of the catchment (Lebel et al. 2005; Ratner 2003). While political borders and social responsibilities of regional governments limit the scale of management decisions, fish stocks often transcend boundaries, requiring multinational approaches to sustainable management. In order to meet the national goals of ensuring ongoing food security, alleviating poverty and generating income for people in Mekong riparian countries, the fisheries sector will require continued production from both the wild capture and culture fishery industries (Bush 2008), and fisheries policy and management initiatives will need to take this into account.

Effective conservation management of wild fisheries resources requires information on the scale at which populations of exploited species are structured (Graves 1998; McElhany et al. 2000; Palsbøll et al. 2007). The current study has provided a detailed picture of population structure for two important species, C. striata and C. micropeltes, across the Mekong River Basin. Although levels of genetic diversity and phylogeographical patterns vary greatly between the two species, populations of both species appear to be significantly structured into geographically discrete sub-populations across the entire drainage. The C. micropeltes sub-populations identified here can essentially be considered as independent management units (MUs), because they appear to exchange very few migrants, and thus represent essentially demographically independent groups, relying largely on local birth and death rates to maintain local population sizes and levels of genetic diversity (Palsbøll et al. 2007). For C. striata, the fine scale of differentiation uncovered suggests that an MU approach to management is inappropriate, as essentially every local population represents a unique and distinct component of total genetic variation for this species.

For each species, multiple sub-populations occur within the national boundaries of each Mekong riparian country, indicating that snakehead fisheries in different countries, and even provinces, are unlikely to be exploiting the same fish populations for these taxa. This has important implications for catch monitoring and the design of management plans for both species if, as is the case for most fisheries, the ultimate goal is to ensure the security of the fishery resource in perpetuity (Ward 2000). Management initiatives will need to be focused towards the scale of local sub-populations, and their effectiveness is likely to be only realised locally. Low dispersal among sub-populations indicates that population size declines associated with human activities such as over-fishing or habitat destruction are not likely to be compensated for by natural recruitment of migrants from other sub-populations. If such declines do occur, population recovery is likely to depend on reducing local mortality (by limiting catch) or increasing recruitment by other means (e.g., restoring habitat or via restocking).

From a longer term evolutionary perspective, local genetic diversity in each C. striata population is essentially unique, and if lost cannot be regained by colonisation of individuals from other parts of the Mekong Basin. Genetic diversity potentially allows individuals to use a wider array of environments, protects against short-term environmental fluctuations, and provides the building blocks for surviving future environmental changes (McElhany et al. 2000). Thus, maintenance of local genetic diversity by promoting local population persistence should be a management priority for this species. Although C. micropeltes displays only very low levels of genetic diversity and divergence across the Mekong, populations of this species should not be considered immune to losses of genetic diversity, as there is potential that undetected selective forces have impacted on diversity to different extents in independent sub-populations. Genetic diversity can be lost via population size reductions, but also by the introduction of hatchery reared or translocated individuals, which can undermine natural populations by replacing and/or hybridising with local stocks, so potentially introducing maladaptive genetic variation (Carvalho 1993; Lorenzen 2005; Utter 1998).

Such introductions may be undertaken deliberately, to enhance stock production or to supplement local recruitment (Lorenzen 2005), or occur accidentally via escapes from culture. The deliberate release of hatchery reared juveniles is already common in the upper Mekong Basin in Thailand for a range of economically important fish species (Bush et al. 2004; Little et al. 1996). Furthermore, in Thailand, gene introgression due to escapes from culture has been observed among wild populations and culture species (Na-Nakorn et al. 2004; Senanan et al. 2004). As ever increasing areas of the Mekong Basin are developed for aquaculture and reservoir fisheries to meet the demand for increased fish yields, it is likely that mixing of hatchery and local populations will become even more common, especially for snakeheads, which are becoming more important as culture species. In light of this, the establishment of broodstocks that represent local diversity should be a priority, and every attempt should be made to culture lines that are derived from local wild populations. If, in future, policy makers decide that augmentation of natural populations is necessary, local broodstocks could also then be used to supply hatchery individuals for localised release. Limiting movement of fishes between sub-populations and river sections may also assist in limiting the spread of potential diseases, which have previously been a problem for snakeheads in culture situations (Nash et al. 1988).

As well as representing a significant component of large scale cage culture operations in the Mekong Basin, snakeheads are harvested in many small scale operations including rice-field fisheries and trap-pond systems, which represent a move away from dependency on wild fish populations towards fisheries production in closed and semi-closed systems (‘Farmer Managed Aquatic Systems’) (Amilhat et al. 2009b; Bush 2008; Frei & Becker 2005; Rothuis et al. 1998a). In Vietnam these environments are heavily stocked with hatchery fish, but in Cambodia and Thailand many of these systems are able to exploit and enhance local recruitment of wild C. striata, and are effectively enhancing the availability of suitable habitat for local populations (Amilhat et al. 2009a, b; Hortle et al. 2008). This approach to managing local C. striata populations for sustainable harvesting appears to be a successful method for maintaining population sizes and levels of genetic diversity at the local scale, as it effectively balances increased fishery mortality by providing access to relatively high quality breeding and feeding areas, so allowing high numbers of individuals to complete their lifecycle naturally.

Although similar in life-history, C. micropeltes is less able to utilise these heavily managed aquatic ecosystems than C. striata. While each C. micropeltes MU recovered here inhabits a relatively small proportion of the entire drainage basin, local populations of this species are currently likely to be at least somewhat dependent on natural populations of wide ranging species as food resources. This is especially true for the large C. micropeltes population in the Tonle Sap Great Lake, which is known to rely on small migratory fishes as a major prey item (personal communication, Lieng Sopha, Cambodian DoF). Reliance on small migratory fish is already a recognised problem in the snakehead cage culture industry, where some limits have been imposed in an attempt to curb over-fishing of ‘trash fish’ for use as feed, although its use varies across the basin (Bush 2008; Ingthamjitr et al. 2005; Ngor et al. 2003).

This means that persistence of the large C. micropeltes populations that are currently heavily exploited by the wild capture fisheries sector will, in part, rely on maintaining adequate stocks of prey species, which will, in turn, require management at much larger spatial scales. Without maintaining access to migrating fish stocks, populations of C. micropeltes are unlikely to be able to persist in large numbers in the Lower Basin, regardless of stock augmentation measures. In Thailand, reservoirs are already regularly stocked with H. siamensis and other small prey species to augment declining natural populations. Restocking approaches, however, are reactionary (Collares-Pereira & Cowx 2004), and at best only a short term solution to providing fisheries resources where habitat modification has interrupted the natural life cycle of important species. In Cambodia and Lao PDR, where wild fisheries still constitute a major economic resource on which most rural communities firmly depend (Navy & Bhattarai 2009; Singhanouvong & Phouthavongs 2002), a better approach for management may be to attempt to maintain adequate areas for natural populations of key species to complete their life cycles.

The Mekong is an incredibly diverse aquatic ecosystem, and the sheer number of species precludes detailed analysis of population structure and ecological requirements for each one. Because the drainage is so young, ecology and life history alone are unlikely to be adequate predictors of the level of population structuring in all Mekong fishes. Assessment of a few key species, such as C. striata and C. micropeltes, can nevertheless provide insights into the scale at which population processes operate within the system. Knowledge of the spatial scale of population structuring can be applied to enhance existing management measures, for example by informing the appropriate placement of fish sanctuaries (Baird 2006; Baird & Flaherty 2005). While attempting to maximise the environmental benefit to target species, management approaches implemented at the spatial scales known to be important to some key taxa may also benefit suites of other aquatic organisms that share similar population structures. Over the long term, the genetic data presented here also have the potential to provide a baseline for future genetic monitoring for conservation, and a means to assess the effectiveness of management strategies undertaken in future years.

Conclusion

The current analysis suggests that the family Channidae arose in in the Eocene, before migrating west into Africa and east into SE Asia to assume its current distribution. Divergence times estimated here indicate that among Asian taxa, speciation events can be related to climate and geomorphological induced vicariance. It is likely that some lineages have experienced species radiations since the Late Miocene, while for others, such as the C. striata lineage, it appears that long periods of isolation have not resulted in the formation of new species.

Two Channa species with wide Mekong Basin distributions, C. striata and C. micropeltes, display very different phylogeographical patterns, despite having similar ecologies and life history strategies. These differences can be attributed largely to different histories in the Mekong. C. striata appears to have been present across freshwaters of the Mekong Basin for a considerable period of evolutionary time, while C. micropeltes is probably a recent arrival in most, if not all, sections of the drainage.

Generally, fishes of the Mekong Basin display a range of phylogeographical patterns and levels of diversity. Some taxa are characterised by deeply divergent lineages that probably result from historical isolation in pre-Mekong drainage lines; however, divergences are not necessarily congruent among taxa, spatially or temporally. Some species have evidently expanded from very low ancestral population sizes, while others appear to have maintained large population sizes over long periods of evolutionary time. Levels of diversity are not always indicative of population sizes, as both migratory and non-migratory taxa can show either high or low diversity across the river basin.

The Mekong is a very young drainage, and it is evident that the vast array of phylogeographical patterns observed among Mekong fishes have been heavily influenced by many independent geomorphological re-arrangements. Simple conclusions based on perceived ecological similarity are unlikely to accurately reflect true patterns of genetic structure for Mekong taxa, as for most there is no information on historical distributions prior to drainage amalgamation. Life history traits are still important, however, as they influence the amount of contemporary gene flow among sub-populations.

As Mekong taxa display such a multitude of different phylogeographical patterns across all trophic levels, management strategies based on ecological modelling of single taxa are unlikely to effectively conserve all wild fish stocks. Effective management strategies should aim to maintain local populations by preserving natural habitats, providing refuges from fishing, and accounting for natural dispersal pathways, including main channel migration routes and floodplain areas. To maintain natural patterns of genetic diversity, translocations among and within mainland SE Asian drainages should be avoided where possible, as they carry a high risk of mixing divergent lineages, even where freshwater connections are inferred to have existed historically recently. In addition, broodstocks for indigenous culture species should be sourced where possible from local wild stocks. Local variants are not only likely to be more tolerant of local conditions, but their escape from culture is also less likely to undermine natural patterns of genetic structure in wild populations.

References

Cited Literature

Abell R., Thieme M. L., Revenga C., Bryer M., Kottelat M., Bogutskaya N., Coad B., Mandrak N., Balderas S. C. & Bussing W. (2008) Freshwater ecoregions of the world: a new map of biogeographic units for freshwater biodiversity conservation. Bioscience 58: 403-414.

Abol-Munafi A. B., Ambak M. A., Ismail P. & Tam B. M. (2007) Molecular data from the Cytochrome b for the phylogeny of Channidae (Channa sp.) in Malaysia. Biotechnology 6: 22-27.

Adamson E. A. S., Hurwood D. A., Baker A. M. & Mather P. B. (2009) Population subdivision in Siamese mud carp Henicorhynchus siamensis in the Mekong River basin: implications for management. Journal of Fish Biology 75: 1371-1392.

Agnese J.-F. & Teugels G. G. (2005) Insight into the phylogeny of African Clariidae (Teleostei, Siluriformes): Implications for their body shape evolution, biogeography, and taxonomy. Molecular Phylogenetics and Evolution 36: 546-553.

Alfred E. R. (1963) Channa bistriata (Weber and de Beaufort), the young of the Snake-head Fish Channa lucius (Cuvier). The Bulletin of the National Museum 32: 156-157.

Alfred E. R. (1966) The fresh-water fishes of Singapore. Zoologische Verhandelingen 78: 74pp.

Ali A. B. (1999) Aspects of the reproductive biology of female snakehead (Channa striata Bloch) obtained from irrigated rice agroecosystem, Malaysia. Hydrobiologia 411: 71-77.

Ali J. R. & Aitchison J. C. (2008) Gondwana to Asia: Plate tectonics, paleogeography and the biological connectivity of the Indian sub-continent from the Middle Jurassic through latest Eocene (166-35 Ma). Earth-Science Reviews 88: 145-166.

Allan J. D., Abell R., Hogan Z., Revenga C., Taylor B. W., Welcomme R. L. & Winemiller K. (2005) Overfishing of inland waters. Bioscience 55: 1041-1051.

Ambak M. A., Bolong A.-M. A., Ismail P. & Tam B. M. (2006) Genetic variation of snakehead fish (Channa striata) populations using random amplified polymorphic DNA. Biotechnology 5: 104-110.

Ambak M. A. & Jalal K. C. A. (2006) Sustainability issues of reservoir fisheries in Malaysia. Aquatic Ecosystem Health & Management 9: 165-173.

Amilhat E. & Lorenzen K. (2005) Habitat use, migration pattern and population dynamics of chevron snakehead Channa striata in a rainfed rice farming landscape. Journal of Fish Biology 67: 23-34.

Amilhat E., Lorenzen K., Morales E. J., Yakupitiyage A. & Little D. C. (2009a) Fisheries production in Southeast Asian farmer managed aquatic systems (FMAS): I. Characterisation of systems. Aquaculture 296: 219-226.

Amilhat E., Lorenzen K., Morales E. J., Yakupitiyage A. & Little D. C. (2009b) Fisheries production in Southeast Asian Farmer Managed Aquatic Systems (FMAS): II. Diversity of aquatic resources and management impacts on catch rates. Aquaculture 298: 57-63.

An Z. (2000) The history and variability of the East Asian paleomonsoon climate. Quaternary Science Reviews 19: 171-187.

Archangi B., Chand V. & Mather P. B. (2009) Isolation and characterization of 15 polymorphic microsatellite DNA loci from Argyrosomus japonicus (mulloway), a new aquaculture species in Australia. Molecular Ecology Notes 9: 412-414.

Arratia G., Lopez-Arbarello A. & Prasad G. V. R. (2004) Late Cretaceous-Paleocene percomorphs (Teleostei) from India - Early radiation of Perciformes. In: Recent Advances in the origin and early radiation of vertebrates (eds. M. V. H. Wilson & R. Cloutier). Verlag, Munchen, Germany.

Arthur J. R. & Te B. Q. (2006) Checklist of the parasites of fishes of Viet Nam. In: FAO Fisheries Technical Paper pp. 133. FAO, Rome.

Attwood S. W. & Johnston D. A. (2001) Nucleotide sequence differences reveal genetic variation in Neotricula aperta (Gastropoda: Pomatiopsidae), the snail host of schistosomiasis in the lower Mekong Basin. Biological Journal of the Linnean Society 73: 23-41.

Avise J. (1998) The history and purview of phylogeography: a personal reflection. Molecular Ecology 7: 371-379.

Avise J., Jones A. & Walker D. (2002) Genetic mating systems and reproductive natural histories of fishes: Lessons for ecology and evolution. Annual Review of Genetics 36: 19-45.

Avise J. C. (1989) Gene trees and organismal histories: A phylogenetic approach to population biology. Evolution 43: 1192-1208.

Avise J. C. (1992) Molecular population structure and the biogeographic history of a regional fauna: a case history with lessons for conservation biology. Oikos 63: 62-76.

Avise J. C. (1994) Molecular Markers, Natural History and Evolution. Chapman and Hall, New York.

Avise J. C. (2000) Phylogeography. The History and Formation of Species. Harvard University Press, Cambridge, MA.

Avise J. C. (2009) Phylogeography: retrospect and prospect. Journal of Biogeography 36: 3-15.

Avise J. C., Arnold J., Ball R. M., Bermingham E., Lamb T., Neigel J. E., Reeb C. A. & Saunders N. C. (1987) Intraspecific Phylogeography: the mitochondrial DNA bridge between population genetics and systematics. Annual Review of Ecology and Systematics 18: 489-522.

Azuma Y., Kumazawa Y., Miya M., Mabuchi K. & Nishida M. (2008) Mitogenomic evaluation of the historical biogeography of cichlids toward reliable dating of teleostean divergences. BMC Evolutionary Biology 8: 215.

Baer C. F., Miyamoto M. M. & Denver D. R. (2007) Mutation rate variation in multicellular eukaryotes: causes and consequences. Nature Reviews Genetics 8: 619-631.

Bagra K., Kadu K., Nebeshwar-Sharma K., Laskar B. A., Sarkar U. K. & Das D. N. (2009) Ichthyological survey and review of the checklist of fish fauna of Arunachal Pradesh, India. Check List 5: 330-350.

Baird I. G. (2006) Strength in diversity: fish sanctuaries and deep-water pools in Lao PDR. Fisheries Management and Ecology 13: 1-8.

Baird I. G. & Flaherty M. S. (2005) Mekong River Fish Conservation Zones in Southern Lao: assessing effectiveness using local ecological knowledge. Environmental Management 36: 439-454.

Baird I. G. & Mean M. (2005) Sesan River fisheries monitoring in Ratanakiri province, northeast Cambodia: Before and after the construction of the Yali Falls dam in the Central Highlands of Viet Nam pp. 92. 3S Rivers Protection Network and the Global Association for People and the Environment, Ban Lung, Ratanakiri, Cambodia.

Balloux F., Brunner H., Lugon-Moulin N., Hausser J. & Goudet J. (2000) Microsatellites can be misleading: an empirical and simulation study. Evolution 54: 1414-1422.

Balloux F. & Goudet J. (2002) Statistical properties of population differentiation estimators under stepwise mutation in a finite island model. Molecular Ecology 11: 771-783.

Bande M. B. & Prakash U. (1986) The tertiary flora of Southeast Asia with remarks on its palaeoenvironment and phytogeography of the Indo-Malayan region. Review of Palaeobotany and Palynology 49: 203-233.

Bandelt H., Forster P. & Rohl A. (1999) Median-joining networks for inferring intraspecific phylogenies. Molecular Biology and Evolution 16: 37-48.

Bao T. Q., Bouakhamvonsa K., Chan S., Chhuon K. C., Phommavong T., Poulsen A. F., Rukawoma P., Suornratana U., Tien D. V., Tuan T. T., Tung N. T., Valbo-Jorgensen J., Viravong S. & Yoorong N. (2001) Local knowledge in the study of river fish biology: experiences from the Mekong. In: Mekong Development Series (eds. M. Hewitt & D. Paul) pp. 22. Mekong River Commission, Phnom Penh.

Baran E. (2005) Cambodia inland fisheries: facts, figures and context. WorldFish Center and Inland Fisheries Research and Development Institute, Phnom Penh, Cambodia.

Baran E. & Cain J. (2001) Ecological approaches of flood-fish relationships modelling in the Mekong River Basin. In: National workshop on Ecological and Environmental Modelling (eds. H. L. Koh & H. Y. Abu) pp. 20-27, Universiti Sains Malaysia, Penang, Malaysia.

Baran E., Van Zalinge N. & Ngor P. B. (2001) Floods, floodplains and fish production in the Mekong Basin: present and past trends. In: Proceedings of the Second Asian Wetlands Symposium, 27-30 August 2001 (ed. A. Ahyaudin) pp. 920-932. Penerbit Universiti Sains Malaysia, Pulau Pinang, Malaysia.

Beck J., Kitching I. J. & Linsenmair K. E. (2006) Wallace's line revisited: has vicariance or dispersal shaped the distribution of Malesian hawkmoths (Lepidoptera: Sphingidae)? Biological Journal of the Linnean Society 89: 455-468.

Begg G. A. & Waldman J. R. (1999) An holistic approach to fish stock identification. Fisheries Research 43: 35-44.

Beheregaray L. B. (2008) Twenty years of phylogeography: the state of the field and the challenges for the Southern Hemisphere. Molecular Ecology 17: 3754-3774.

Belkhir K. P., Borsa P., Chikhi L., Raufaste N. & Bonhomme F. (1996-2004) GENETIX 4.05, logiciel sous Windows™ pour la génétique des populations. Laboratoire Génome, Populations, Interactions CNRS UMR 5000, Université de Montpellier II, Montpellier (France). Available from http://www.genetix.univ-montp2.fr/genetix/genetix.htm.

Benjamini Y. & Hochberg Y. (1995) Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B (Methodological) 57: 289-300.

Benson D. A., Boguski M. S., Lipman D. J., Ostell J., Ouellette B. F., Rapp B. A. & Wheeler D. L. (1999) GenBank. Nucleic Acids Research 27: 12-17.

Benzécri J. P. (1973) L'Analyse des Données: T. 2, l'Analyse des correspondances. Dunod, Paris.

Bermingham E. & Avise J. C. (1986) Molecular Zoogeography of Freshwater Fishes in the Southeastern United States. Genetics 113: 939-965.

Bernardi G., Bucciarelli G., Costagliola D., Robertson D. & Heiser J. (2004) Evolution of coral reef fish Thalassoma spp. (Labridae). 1. Molecular phylogeny and biogeography. Marine Biology 144: 369-375.

Biju S. D. & Bossuyt F. (2003) New frog family from India reveals an ancient biogeographical link with the Seychelles. Nature 425: 711-714.

Bird M. I., Taylor D. & Hunt C. (2005) Palaeoenvironments of insular Southeast Asia during the Last Glacial Period: a savanna corridor in Sundaland? Quaternary Science Reviews 24: 2228-2242.

Bohme M. (2004) Migration history of air-breathing fishes reveals Neogene atmospheric circulation patterns. Geology 32: 393-396.

Bohonak A. J. (1999) Dispersal, gene flow, and population structure. The Quarterly Review of Biology 74: 21-45.

Boonyaratpalin M., McCoy E. W., Chittapalapong T. & Bangkhen B. (1985) Snakehead Culture and Its Socio-economics in Thailand. NACA/WP/85/26. Bangkok, Thailand.

Bossuyt F. & Milinkovitch M. C. (2001) Amphibians as indicators of Early Tertiary "Out-of-India" dispersal of vertebrates. Science 292: 93-95.

Briggs J. C. (2003a) The biogeographic and tectonic history of India. Journal of Biogeography 30: 381-388.

Briggs J. C. (2003b) Fishes and birds: Gondwana life rafts reconsidered. Systematic Biology 52: 548-553.

Brookfield M. E. (1998) The evolution of the great river systems of southern Asia during the Cenozoic India-Asia collision: rivers draining southwards. Geomorphology 22: 285-312.

Bryan M. B., Zalinski D., Filcek K. B., Libants S., Li W. & Scribner K. T. (2005) Patterns of invasion and colonization of the sea lamprey (Petromyzon marinus) in North America as revealed by microsatellite genotypes. Molecular Ecology 14: 3757-3773.

Bull J. J., Huelsenbeck J. P., Cunningham C. W., Swofford D. L. & Waddell P. J. (1993) Partitioning and combining data in phylogenetic analysis. Systematic Biology 42: 384.

Burridge C. P., Craw D. & Waters J. M. (2007) An empirical test of freshwater vicariance via river capture. Molecular Ecology 16: 1883-1895.

Bush A. M., Markey M. J. & Marshall C. R. (2004) Removing bias from diversity curves: The effects of spatially organized biodiversity on sampling-standardization. Paleobiology 30: 666-686.

Bush S. R. (2008) Contextualising fisheries policy in the Lower Mekong Basin. Journal of Southeast Asian Studies 39: 329-353.

Campbell I. C., Poole C., Giesen W. & Valbo-Jorgensen J. (2006) Species diversity and ecology of Tonle Sap Great Lake, Cambodia. Aquatic Sciences - Research Across Boundaries 68: 355-373.

Campbell N. J. H., Harriss F. C., Elphinstone M. S. & Baverstock P. R. (1995) Outgroup heteroduplex analysis using temperature gradient gel electrophoresis: high resolution, large scale, screening of DNA variation in the mitochondrial control region. Molecular Ecology 4: 407-418.

Cannon C. H., Morley R. J. & Bush A. B. G. (2009) The current refugial rainforests of Sundaland are unrepresentative of their biogeographic past and highly vulnerable to disturbance. Proceedings of the National Academy of Sciences 106: 11188-11193.

Cárdenas L., Hernández C. E., Poulin E., Magoulas A., Kornfield I. & Ojeda F. P. (2005) Origin, diversification, and historical biogeography of the genus Trachurus (Perciformes: Carangidae). Molecular Phylogenetics and Evolution 35: 496-507.

Carvalho G. R. (1993) Evolutionary aspects of fish distribution: genetic variability and adaptation. Journal of Fish Biology 43: 53-73.

Carvalho G. R. & Hauser L. (1994) Molecular genetics and the stock concept in fisheries. Reviews in Fish Biology and Fisheries 4: 326-350.

Cassens I., Mardulyn P. & Milinkovitch M. C. (2005) Evaluating intraspecific "network" construction methods using simulated sequence data: Do existing algorithms outperform the global maximum parsimony approach? Systematic Biology 54: 363-372.

Castello J. & Templeton A. R. (1994) Root probabilities for intraspecific gene trees under neutral coalescent theory. Molecular Phylogenetics and Evolution 3: 102-113.

Castric V., Bonney F. & Bernatchez L. (2001) Landscape Structure and Hierarchical Genetic Diversity in the Brook Charr, Salvelinus fontinalis. Evolution 55: 1016-1028.

Champasri T. (2003) Some ecological aspects, water properties and natural fish species of the Phrom River in Northeast Thailand. Pakistan Journal of Biological Sciences 6: 65-69.

Chand V., de Bruyn M. & Mather P. B. (2005) Microsatellite loci in the eastern form of the giant freshwater prawn (Macrobrachium rosenbergii). Molecular Ecology Notes 5: 308-310.

Chapuis M.-P. & Estoup A. (2007) Microsatellite null alleles and estimation of population differentiation. Molecular Biology and Evolution 24: 621-631.

Chen W.-J., Bonillo C. & Lecointre G. (2003) Repeatability of clades as a criterion of reliability: a case study for molecular phylogeny of Acanthomorpha (Teleostei) with larger number of taxa. Molecular Phylogenetics and Evolution 26: 262-288.

Chow S. & Hazama K. (1998) Universal PCR primers for S7 ribosomal protein gene introns in fish. Molecular Ecology 7: 1247-1263.

Claridge G. F. (1996) An inventory of wetlands of the Lao P.D.R. IUCN, Bangkok, Thailand.

Clark M. K., Schoenbohm L. M., Royden L. H., Whipple K. X., Burchfiel B. C., Zhang X., Tang W., Wang E. & Chen L. (2004) Surface uplift, tectonics, and erosion of eastern Tibet from large-scale drainage patterns. Tectonics 23: TC1006, doi:10.1029/2002TC001402.

Clift P. D., Blusztajn J. & Duc N. A. (2006) Large-scale drainage capture and surface uplift in eastern Tibet-SW China before 24 Ma inferred from sediments of the Hanoi Basin, Vietnam. Geophysical Research Letters 33.

Coates D. (2001) Biodiversity and fisheries management opportunities in the Mekong River Basin. In: Blue millennium - Managing global fisheries for biodiversity. GEFIDRC pp. 3-7. World Fisheries Trust

Coates D. (2002) Inland capture fishery statistics of Southeast Asia: current status and information needs. In: Asia-Pacific Fishery Commission pp. RAP Publication No. 2002/2011, 2114p. FAO, Bangkok, Thailand.

Collares-Pereira M. J. & Cowx I. G. (2004) The role of catchment scale environmental management in freshwater fish conservation. Fisheries Management and Ecology 11: 303-312.

Conti E., Eriksson T., Schönenberger J., Sytsma K. J. & Baum D. A. (2002) Early Tertiary out-of-India dispersal of Crypteroniaceae: evidence from phylogeny and molecular dating. Evolution 56: 1931-1942.

Courtenay W. R. & Williams J. D. (2004) Snakeheads (Pisces, Channidae): a biological synopsis and risk assessment. In: US Geological Survey Circular. US Geological Survey, Denver CO.

Crandall K. A. (1996) Multiple interspecies transmissions of human and simian T-cell leukemia/lymphoma virus Type I sequences. Molecular Biology and Evolution 13: 115-131.

Crandall K. A. & Templeton A. R. (1993) Empirical tests of some predictions from coalescent theory with applications to intraspecific phylogeny reconstruction. Genetics 134: 959-969.

Crawford N. G. (2010) SMOGD: software for the measurement of genetic diversity. Molecular Ecology Resources 10: 556-557.

Crispo E., Bentzen P., Reznick D. N., Kinnison M. T. & Hendry A. P. (2006) The relative influence of natural selection and geography on gene flow in guppies. Molecular Ecology 15: 49-62.

Davis G. M. (1979) The origin and evolution of the gastropod family Pomatiopsidae, with emphasis on the Mekong River Triculinae. Academy of Natural Sciences of Philadelphia Monograph 20: 120pp.

Day F. (1865) On the fishes of Cochin, on the Malabar Coast of India. Proceedings of the Zoological Society of London: 286-318.

de Bruyn M., Wilson J. A. & Mather P. B. (2004) Huxley's line demarcates extensive genetic divergence between eastern and western forms of the giant freshwater prawn, Macrobrachium rosenbergii. Molecular Phylogenetics and Evolution 30: 251-257.

de Queiroz K. (2007) Species concepts and species delimitation. Systematic Biology 56: 879-886.

De Woody J. A. & Avise J. (2000) Microsatellite variation in marine, freshwater and anadromous fishes compared with other animals. Journal of Fish Biology 56: 461-473.

Deap L., Degen P. & van Zalinge N. (2003) Fishing Gears of the Cambodian Mekong. Inland Fisheries Research and Development Institute of Cambodia (IFReDI).

Degnan J. H. & Rosenberg N. A. (2009) Gene tree discordance, phylogenetic inference and the multispecies coalescent. Trends in Ecology & Evolution 24: 332-340.

Delgado C., Wada N., Rosegrant M., Meijer S. & Ahmed M. (2003) Fish to 2020: supply and demand in changing global markets. International Food Policy Research Institute Washington, USA.

Dey M. M., Rab M. A., Paraguas F. J., Piumsombun S., Bhatta R., Alam M. F. & Mahfuzuddin A. (2005) Fish consumption and food security: A disaggregated analysis by types of fish and classes of consumers in selected Asian countries. Aquaculture Economics and Management 9: 89-111.

Dodson J. J., Colombani F. & Ng P. K. L. (1995) Phylogeographic structure in mitochondrial DNA of a South-east Asian freshwater fish, Hemibagrus nemurus (Siluroidei, Bagridae) and Pleistocene sea-level changes on the Sunda shelf. Molecular Ecology 4: 331-346.

Doi A. & Taki Y. (1997) Genetic differentiation of Osteochilus hasseltii (Teleostei: Cyprinidae) in Thailand. Raffles Bulletin of Zoology 45: 61-72.

Donnelly P. & Tavaré S. (1986) The ages of alleles and a coalescent. Advances in Applied Probability 18: 1-19.

Donnelly P. & Tavaré S. (1995) Coalescents and genealogical structure under neutrality. Annual Review of Genetics 29: 401-421.

Drummond A. & Rambaut A. (2007) BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evolutionary Biology 7: 214.

Drummond A. J., Ho S. Y. W., Phillips M. J. & Rambaut A. (2006) Relaxed phylogenetics and dating with confidence. PLoS Biology 4: e88.

Duchesne P. & Turgeon J. (2009) FLOCK: A method for quick mapping of admixture without source samples. Molecular Ecology Resources 9: 1333-1344.

Dudgeon D. (2000a) The ecology of tropical Asian rivers and streams in relation to biodiversity conservation. Annual Review of Ecology and Systematics 31: 239-263.

Dudgeon D. (2000b) Large-scale hydrological changes in Tropical Asia: Prospects for riverine biodiversity. Bioscience 50: 793-806.

Dupanloup I., Schneider S. & Excoffier L. (2002) A simulated annealing approach to define the genetic structure of populations. Molecular Ecology 11: 2571-2581.

Dupont-Nivet G., Hoorn C. & Konet M. (2008) Tibetan uplift prior to the Eocene-Oligocene climate transition: Evidence from pollen analysis of the Xining Basin. Geology 36: 987-990.

Dwivedi B. & Gadagkar S. (2009) Phylogenetic inference under varying proportions of indel-induced alignment gaps. BMC Evolutionary Biology 9: 211.

Echelle A. A. & Connor P. J. (1989) Rapid, geographically extensive genetic introgression after secondary contact between two pupfish species (Cyprinodon, Cyprinodontidae). Evolution 43: 717-727.

Edgar R. (2004a) MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5: 113-131.

Edgar R. C. (2004b) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research 32: 1792-1797.

Edwards S. V. (2009) Is a new and general theory of molecular systematics emerging? Evolution 63: 1-19.

Ellegren H. (2004) Microsatellites: simple sequences with complex evolution. Nature Reviews. Genetics 5: 435-445.

Elphinstone M. S. & Baverstock P. R. (1997) Detecting mitochondrial genotypes by temperature gradient gel electrophoresis and heteroduplex analysis. BioTechniques 23: 982-986.

Enomoto K., Ishikawa S., Sitha H., Thuok N. A. O. & Kurokura H. (2005) Relationship between fluctuation of water level and fish catches of Henicorhynchus spp. and Channa micropeltes in Kampong Thom province in Kingdom of Cambodia. In: International Symposium on Sustainable Development in the Mekong River Basin. Japan Science and Technology Agency, Southern Institute for Water Resources Research, Ho Chi Minh City.

Evanno G., Regnaut S. & Goudet J. (2005) Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Molecular Ecology 14: 2611-2620.

Ewens W. J. (1972) The sampling theory of selectively neutral alleles. Theoretical Population Biology 3: 87-112.

Excoffier L., Laval G. & Schneider S. (2005) Arlequin ver. 3.0: An integrated software package for population genetics data analysis. Evolutionary Bioinformatics Online 1: 47-50.

Excoffier L., Smouse P. E. & Quattro J. M. (1992) Analysis of molecular variance inferred from metric distances among DNA haplotypes: Application to human mitochondrial DNA restriction data. Genetics 131: 479-491.

Falush D., Stephens M. & Pritchard J. K. (2003) Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics 164: 1567-1587.

Falush D., Stephens M. & Pritchard J. K. (2007) Inference of population structure using multilocus genotype data: dominant markers and null alleles. Molecular Ecology Notes 7: 574-578.

FAO (2010) Channa striata (Bloch 1793). In: Species Fact Sheets. Fisheries and Aquaculture Department: Food and Agricultural Organisation of the United Nations, accessed 20/02/2010, available at: http://www.fao.org/fishery/species/3062/en.

Farias I. P., Orti G., Sampaio I., Schneider H. & Meyer A. (2001) The cytochrome b gene as a phylogenetic marker: the limits of resolution for analyzing relationships among cichlid fishes. Journal of Molecular Evolution 53: 89-103.

Fedorov V. B., Goropashnaya A. V., Boeskorov G. G. & Cook J. A. (2008) Comparative phylogeography and demographic history of the wood lemming (Myopus schisticolor): implications for late Quaternary history of the taiga species in Eurasia. Molecular Ecology 17: 598-610.

Fishbase (2010) Channa striata (Bloch, 1793) Snakehead murrel (eds. R. Froese & D. Pauly). www.fishbase.org, accessed 02/2010.

Forster P., Torroni A., Renfrew C. & Rohl A. (2001) Phylogenetic star contraction applied to Asian and Papuan mtDNA evolution. Molecular Biology and Evolution 18: 1864-1881.

Fraser D. J., Lippe C. & Bernatchez L. (2004) Consequences of unequal population size, asymmetric gene flow and sex-biased dispersal on population structure in brook charr (Salvelinus fontinalis). Molecular Ecology 13: 67-80.

Frei M. & Becker K. (2005) Integrated rice-fish culture: Coupled production saves resources. Natural Resources Forum 29: 135-143.

Fu Y. X. (1997) Statistical tests for neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147: 915-925.

Futuyma D. J. (1998) Evolutionary Biology. Sinauer Associates, Sunderland, MA.

Garza J. C. & Williamson E. G. (2001) Detection of reduction in population size using data from microsatellite loci. Molecular Ecology 10: 305-318.

Gaubert P., Denys G. & Oberdorff T. (2009) Genus-level supertree of Cyprinidae (Actinopterygii: Cypriniformes), partitioned qualitative clade support and test of macro-evolutionary scenarios. Biological Reviews 84: 653-689.

Glaubitz J. C. (2004) CONVERT: A user-friendly program to reformat diploid genotypic data for commonly used population genetics software packages. Molecular Ecology Notes 4: 309-310.

Glaubrecht M. & Kohler F. (2004) Radiating in a river: systematics, molecular genetics and morphological differentiation of viviparous freshwater gastropods endemic to the Kaek River, central Thailand (Cerithioidea, Pachychilidae). Biological Journal of the Linnean Society 82: 275-311.

Golding G. B. (1987) The detection of deleterious selection using ancestors inferred from a phylogenetic history. Genetical Research 49: 71-82.

Gorog A. J., Sinaga M. H. & Engstrom M. D. (2004) Vicariance or dispersal? Historical biogeography of three Sunda shelf murine rodents (Maxomys surifer, Leopoldamys sabanus and Maxomys whiteheadi). Biological Journal of the Linnean Society 81: 91-109.

Goudet J. (1995) FSTAT (Version 1.2): a computer program to calculate F-statistics. Journal of Heredity 86: 485-486.

Gower D. J., Kupfer A., Oommen O. V., Himstedt W., Nussbaum R. A., Loader S. P., Presswell B., Müller H., Krishna S. B., Boistel R. & Wilkinson M. (2002) A molecular phylogeny of ichthyophiid caecilians (Amphibia: Gymnophiona: Ichthyophiidae): out of India or out of South East Asia? Proceedings of the Royal Society Biological Sciences 269: 1536-1569.

Graves J. E. (1998) Molecular insights into the population structures of cosmopolitan marine fishes. Journal of Heredity 89: 427-437.

Graybeal A. (1998) Is it better to add taxa or characters to a difficult phylogenetic problem? Systematic Biology 47: 9-17.

Guillot G., Leblois R., Coulon A. & Frantz A. (2009) Statistical methods in spatial genetics. Molecular Ecology 18: 4734-4756.

Guo S. W. & Thompson E. A. (1992) Performing the exact test of Hardy-Weinberg proportion for multiple alleles. Biometrics 48: 361-372.

Guo X., He S. & Zhang Y. (2005) Phylogeny and biogeography of Chinese sisorid catfishes re-examined using mitochondrial cytochrome b and 16S rRNA gene sequences. Molecular Phylogenetics and Evolution 35: 344-362.

Gupta A. (2004) The Mekong River; morphology, evolution and palaeoenvironment. Progress in palaeohydrology; focus on monsoonal areas 64: 525-533.

Gupta A. (2005a) Landforms of Southeast Asia. In: The physical geography of Southeast Asia (ed. A. Gupta) pp. 38-64. Oxford University Press, New York.

Gupta A. (2005b) Rivers of Southeast Asia. In: The physical geography of Southeast Asia (ed. A. Gupta) pp. 65-79. Oxford University Press, New York.

Gupta A. & Liew S. C. (2007) The Mekong from satellite imagery: A quick look at a large river. Geomorphology 85: 259-274.

Gyllensten U. (1985) The genetic structure of fish: differences in the intraspecific distribution of biochemical genetic variation between marine, anadromous, and freshwater species. Journal of Fish Biology 26: 691-699.

Gyllensten U., Leary R. F., Allendorf F. W. & Wilson A. C. (1985) Introgression between two cutthroat trout subspecies with substantial karyotypic, nuclear and mitochondrial genomic divergence. Genetics 111: 905-915.

Hai Yen N. T., Sunada K., Oishi S., Ikejima K. & Iwata T. (2009) Stock assessment and fishery management of Henicorhynchus spp., Cyclocheilichthys enoplos and Channa micropeltes in Tonle Sap Great Lake, Cambodia. Journal of Great Lakes Research 35: 169-174.

Hall B. G. (2008) Phylogenetic trees made easy: a how-to manual. Sinauer Associates, Sunderland.

Hall R. (1998) The plate tectonics of Cenozoic SE Asia and the distribution of land and sea. In: Biogeography and Geological Evolution of SE Asia (eds. J. D. Holloway & R. Hall) pp. 99-132. Backhuys Publishers, Leiden.

Hall T. A. (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symposium 41: 95-98.

Hänfling B. & Weetman D. (2006) Concordant genetic estimators of migration reveal anthropogenically enhanced source-sink population structure in the River Sculpin, Cottus gobio. Genetics 173: 1487-1501.

Hara M., Sekino M. & Na-Nakorn U. (1998) Genetic differentiation of natural populations of the snake-head fish, Channa striatus in Thailand. Fisheries Science 64: 882-885.

Harrison T., Krigbaum J. & Manser J. (2006) Primate biogeography and ecology on the Sunda Shelf islands: A paleontological and zooarchaeological perspective. In: Primate Biogeography: Progress and Prospects (eds. S. M. Lehman & J. G. Fleagle). Springer.

Heaney L. R. (1984) Mammalian species richness on islands on the Sunda Shelf, Southeast Asia. Oecologia 61: 11-17.

Heaney L. R. (1991) A synopsis of climatic and vegetational change in Southeast Asia. Climatic Change 19: 53-61.

Heath T. A., Hedtke S. M. & Hillis D. M. (2008) Taxon sampling and the accuracy of phylogenetic analyses. Journal of Systematics and Evolution 46: 239-257.

Hernandez-Martich J. D. & Smith M. H. (1997) Downstream gene flow and genetic structure of Gambusia holbrooki (eastern mosquitofish) populations. Heredity 79: 295-301.

Hickerson M. J., Carstens B. C., Cavender-Bares J., Crandall K. A., Graham C. H., Johnson J. B., Rissler L., Victoriano P. F. & Yoder A. D. (2010) Phylogeography's past, present, and future: 10 years after. Molecular Phylogenetics and Evolution 54: 291-301.

Higham C. F. W. (1998) The transition from prehistory to the historic period in the Upper Mun Valley. International Journal of Historical Archaeology 2: 235-260.

Hillis D. M. (1987) Molecular versus morphological approaches to systematics. Annual Review of Ecology and Systematics 18: 23-42.

Hillis D. M., Pollock D. D., McGuire J. A. & Zwickl D. J. (2003) Is sparse taxon sampling a problem for phylogenetic inference? Systematic Biology 52: 124-126.

Ho S. Y. W. (2007) Calibrating molecular estimates of substitution rates and divergence times in birds. Journal of Avian Biology 38: 409-414.

Ho S. Y. W. & Phillips M. J. (2009) Accounting for calibration uncertainty in phylogenetic estimation of evolutionary divergence times. Systematic Biology 58: 367-380.

Holcroft N. I. (2004) A molecular test of alternative hypotheses of tetraodontiform (Acanthomorpha: Tetraodontiformes) sister group relationships using data from the RAG1 gene. Molecular Phylogenetics and Evolution 32: 749-760.

Holder M. T., Sukumaran J. & Lewis P. O. (2008) A justification for reporting the majority-rule consensus tree in Bayesian phylogenetics. Systematic Biology 57: 814-821.

Hortle K. G., Troeung R. & Lieng S. (2008) Yield and value of the wild fishery of rice fields in Battambang Province, near the Tonle Sap Lake, Cambodia. In: MRC Technical Paper No. 18 pp. 62. Mekong River Commission, Vientiane.

Huang J. & Bouis H. (1996) Structural Changes in the Demand for Food in Asia. International Food Policy Research Institute: 2020 Brief 41.

Hudson R. R. (1990) Gene genealogies and the coalescent process. Oxford Surveys in Evolutionary Biology 7: 1-44.

Hughes G. M. & Munshi J. S. D. (1986) Scanning electron microscopy of the accessory respiratory organs of the snake-headed fish, Channa striata (Bloch). Journal of Zoology, London 209: 305-317.

Hurwood D. & Hughes J. M. (1998) Phylogeography of the freshwater fish, Mogurnda adspersa, in streams of northeast Queensland, Australia: evidence for altered drainage patterns. Molecular Ecology 7: 1507-1517.

Hurwood D. A., Adamson E. A. S. & Mather P. B. (2008) Evidence for strong genetic structure in an economically important, highly vagile cyprinid (Henicorhynchus lobatus) in the Mekong River Basin. Ecology of Freshwater Fish 17: 273-283.

Inger R. F. & Voris H. K. (2001) The biogeographical relations of the frogs and snakes of Sundaland. Journal of Biogeography 28: 863-891.

Ingthamjitr S., Mattson N. S. & Hortle K. G. (2005) Use of inland trash fish for aquaculture feed in the lower Mekong Basin in Thailand and Lao PDR. In: Regional Workshop on Low Value and "Trash Fish" in the Asia-Pacific Region, Hanoi, Viet Nam.

Inoue J. G., Kumazawa Y., Miya M. & Nishida M. (2009) The historical biogeography of the freshwater knifefishes using mitogenomic approaches: A Mesozoic origin of the Asian notopterids (Actinopterygii: Osteoglossomorpha). Molecular Phylogenetics and Evolution 51: 486-499.

Jakobsson M. & Rosenberg N. A. (2007a) CLUMPP software and manual (available at http://rosenberglab.bioinformatics.med.umich.edu/clumpp.html). University of Michigan, © the Authors.

Jakobsson M. & Rosenberg N. A. (2007b) CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure. Bioinformatics 23: 1801-1806.

Jin Y. T., Brown R. P. & Liu N. F. (2008) Cladogenesis and phylogeography of the lizard Phrynocephalus vlangalii (Agamidae) on the Tibetan plateau. Molecular Ecology 17: 1971-1982.

Johns G. & Avise J. (1998) A comparative summary of genetic distances in the vertebrates from the mitochondrial cytochrome b gene. Molecular Biology and Evolution 15: 1481-1490.

Johnson L. E. & Carlton J. T. (1996) Post-establishment spread in large-scale invasions: dispersal mechanisms of the Zebra mussel Dreissena polymorpha. Ecology 77: 1686-1690.

Jombart T., Devillard S., Dufour A. B. & Pontier D. (2009) Genetic markers in the playground of multivariate analysis. Heredity 102: 330-341.

Jørgensen C., Enberg K., Dunlop E. S., Arlinghaus R., Boukal D. S., Brander K., Ernande B., Gårdmark A., Johnston F. & Matsumura S. (2007) Managing evolving fish stocks. Science 318: 1247-1248.

Jost L. (2008) GST and its relatives do not measure differentiation. Molecular Ecology 17: 4015-4026.

Kalinowski S. T., Meeuwig M. H., Narum S. R. & Taper M. L. (2008) Stream trees: a statistical method for mapping genetic differences between populations of freshwater organisms to the sections of streams that connect them. Canadian Journal of Fisheries and Aquatic Sciences 65: 2752-2760.

Karanth K. (2006) Out-of-India Gondwanan origin of some tropical Asian biota. Current Science 90: 789-792.

Kartavtsev Y. & Lee J. (2006) Analysis of nucleotide diversity at the cytochrome b and cytochrome oxidase 1 genes at the population, species, and genus levels. Russian Journal of Genetics 42: 341-362.

Katoh K., Kuma K.-i., Toh H. & Miyata T. (2005) MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Research 33: 511-518.

Kawamura K., Yonekura R., Katano O., Taniguchi Y. & Saitoh K. (2006) Origin and dispersal of bluegill sunfish, Lepomis macrochirus, in Japan and Korea. Molecular Ecology 15: 613-621.

Kenchington E. L. (2003) The effects of fishing on species and genetic diversity. In: Responsible fisheries in the marine ecosystem (eds. E. Kenchington, M. Sinclair & Valdimarsson) pp. 235-253. Food and Agriculture Organization of the United Nations.

Kiernan K. (2009) Distribution and character of karst in the Lao PDR. Acta Carsologica 38(1).

Kilambi R. V. (1986) Age, growth and reproductive strategy of the snakehead, Ophicephalus striatus Bloch, from Sri Lanka. Journal of Fish Biology 29: 13-22.

Kimura M. (1980) A simple method for estimating evolutionary rate of base substitution through comparative studies of nucleotide sequences. Journal of Molecular Evolution 16: 111-120.

Kimura M. & Weiss G. H. (1964) The Stepping Stone Model of population structure and the decrease of genetic correlation with distance. Genetics 49: 561-576.

Kingman J. F. C. (2000) Origins of the coalescent: 1974-1982. Genetics 156: 1461.

Kingman J. F. C. (1982) The coalescent. Stochastic Processes and their Applications 13: 235-248.

Klopfstein S., Currat M. & Excoffier L. (2006) The fate of mutations surfing on the wave of a range expansion. Molecular Biology and Evolution 23: 482-490.

Kocher T. D. & Carleton K. L. (1997) Base substitution in fish mitochondrial DNA: patterns and rates. In: Molecular Systematics of Fishes (eds. T. D. Kocher & C. A. Stepien) pp. 13–24. Academic Press, California.

Kodandaramaiah U. (2009) Vagility: The neglected component in historical biogeography. Evolutionary Biology 36: 327-335.

Köhler F. & Glaubrecht M. (2007) Out of Asia and into India: on the molecular phylogeny and biogeography of the endemic freshwater gastropod Paracrostoma Cossmann, 1900 (Caenogastropoda: Pachychilidae). Biological Journal of the Linnean Society 91: 627-651.

Kotlík P. & Berrebi P. (2001) Phylogeography of the barbel (Barbus barbus) assessed by mitochondrial DNA variation. Molecular Ecology 10: 2177-2185.

Kottelat M. (1985) Fresh-water fishes of Kampuchea; A provisory annotated check-list. Hydrobiologia 121: 249-279.

Kottelat M. (2001) Fishes of Lao. WHT Publications, Colombo.

Kuhner M. K. (2009) Coalescent genealogy samplers: windows into population history. Trends in Ecology & Evolution 24: 86-93.

Kullander S. O. (1999) Fish species – how and why. Reviews in Fish Biology and Fisheries 9: 325-352.

Kumazawa Y. & Nishida M. (2000) Molecular phylogeny of osteoglossoids: A new model for gondwanian origin and plate tectonic transportation of the Asian arowana. Molecular Biology and Evolution 17: 1869-1878.

Larkin M. A., Blackshields G., Brown N. P., Chenna R., McGettigan P. A., McWilliam H., Valentin F., Wallace I. M., Wilm A., Lopez R., Thompson J. D., Gibson T. J. & Higgins D. G. (2007) Clustal W and Clustal X version 2.0. Bioinformatics 23: 2947-2948.

Lavoue S., Sullivan J. P. & Hopkins C. D. (2003) Phylogenetic utility of the first two introns of the S7 ribosomal protein gene in African electric fishes (Mormyroidea: Teleostei) and congruence with other molecular markers. Biological Journal of the Linnean Society 78: 273-292.

Lebel L., Garden P. & Imamura M. (2005) The politics of scale, position, and place in the governance of water resources in the Mekong region. Ecology and Society 10: 18.

Leberg P. L. (2002) Estimating allelic richness: effects of sample size and bottlenecks. Molecular Ecology 11: 2445-2449.

Lee P. G. & Ng P. K. L. (1994) The systematics and ecology of snakeheads (Pisces: Channidae) in Peninsular Malaysia and Singapore. Hydrobiologia 285: 59-74.

Leesa-Nga S.-N., Siraj S. S., Daud S. K., Sodsuk P. K., Tan S. G. & Sodsuk S. (2000) Biochemical polymorphism in Yellow Catfish, Mystus nemurus (C&V), from Thailand. Biochemical Genetics 38: 77-86.

Lemmon A. R., Brown J. M., Stanger-Hall K. & Lemmon E. M. (2009) The effect of ambiguous data on phylogenetic estimates obtained by maximum likelihood and Bayesian inference. Systematic Biology 58: 130-145.

Lessa E. P. & Applebaum G. (1993) Screening techniques for detecting allelic variation in DNA sequences. Molecular Ecology 2: 119-129.

Li J., Wang X., Kong X., Zhao K., He S. & Mayden R. L. (2008) Variation patterns of the mitochondrial 16S rRNA gene with secondary structure constraints and their application to phylogeny of cyprinine fishes (Teleostei: Cypriniformes). Molecular Phylogenetics and Evolution 47: 472-487.

Li X., Musikasinthorn P. & Kumazawa Y. (2006) Molecular phylogenetic analyses of snakeheads (Perciformes: Channidae) using mitochondrial DNA sequences. Ichthyological Research 53: 148-159.

Librado P. & Rozas J. (2009) DnaSP v5: A software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25: 1451-1452.

Lieng S. & van Zalinge N. (unpublished) Fish Yield Estimation in the Floodplains of the Tonle Sap Great Lake and River, Cambodia.

Lim P., Lek S., TanaTouch S., Mao S.-O. & Chhouk B. (1999) Diversity and spatial distribution of freshwater fish in Great Lake and Tonle Sap river (Cambodia, Southeast Asia). Aquatic Living Resources 12: 379-386.

Linder H. P., Hardy C. R. & Rutschmann F. (2005) Taxon sampling effects in molecular clock dating: An example from the African Restionaceae. Molecular Phylogenetics and Evolution 35: 569-582.

Lindholm A. K., Breden F., Alexander H. J., Chan W. K., Thakurta S. G. & Brooks R. (2005) Invasion success and genetic diversity of introduced populations of guppies Poecilia reticulata in Australia. Molecular Ecology 14: 3671-3682.

Little D. C., Surintaraseree P. & Innes-Taylor N. (1996) Fish culture in rainfed rice fields of northeast Thailand. Aquaculture 140: 295-321.

Lopez J. A., Chen W.-J. & Orti G. (2004) Esociform phylogeny. Copeia 2004: 449-464.

Lorenzen K. (2005) Population dynamics and potential of fisheries stock enhancement: practical theory for assessment and policy analysis. Philosophical Transactions of the Royal Society B: Biological Sciences 360: 171-189.

Lundberg J. G., Kottelat M., Smith G. R., Stiassny M. L. J. & Gill A. C. (2000) So many fishes, so little time: An overview of recent ichthyological discovery in continental waters. Annals of the Missouri Botanical Garden 87: 26-62.

Macey J. R., Schulte J. A., II, Larson A., Ananjeva N. B., Wang Y., Pethiyagoda R., Rastegar-Pouyani N. & Papenfuss T. J. (2000) Evaluating trans-Tethys migration: an example using acrodont lizard phylogenetics. Systematic Biology 49: 233-256.

Maddison W. P. & Maddison D. R. (2009) Mesquite: A modular system for evolutionary analysis, (software package), available from: http://mesquiteproject.org/mesquite/mesquite.html.

Mallet J. (1995) A species definition for the modern synthesis. Trends in Ecology & Evolution 10: 294-299.

Marimuthu K. & Haniffa M. A. (2007) Embryonic and larval development of the striped snakehead Channa striatus. Taiwania 52: 84-92.

Martin A. P. & Palumbi S. R. (1993) Body size, metabolic rate, generation time, and the molecular clock. Proceedings of the National Academy of Sciences 90: 4087-4091.

Maruyama T. & Fuerst P. A. (1985) Population bottlenecks and nonequilibrium models in population genetics. II. Number of alleles in a small population that was formed by a recent bottleneck. Genetics 111: 675-689.

Mayr E. (1944) Wallace's Line in the light of recent zoogeographic studies. The Quarterly Review of Biology 19: 1-14.

McConnell S. (2004) Mapping aquatic faunal exchanges across the Sunda shelf, South-east Asia, using distributional and genetic data sets from the cyprinid fish Barbodes gonionotus (Bleeker, 1850). Journal of Natural History 38: 651-670.

McElhany P., Ruckelshaus M. H., Ford M. J., Wainwright T. C. & Bjorkstedt E. P. (2000) Viable salmonid populations and the recovery of evolutionarily significant units. In: NOAA Technical Memorandum NMFS-NWFSC-42 pp. 156. U.S. Department of Commerce, National Oceanic and Atmospheric Administration, National Marine Fisheries Service.

Meffe G. K. & Vrijenhoek R. C. (1988) Conservation genetics in the management of desert fishes. Conservation Biology 2: 157-169.

Meijaard E. (2003) Mammals of south-east Asian islands and their Late Pleistocene environments. Journal of Biogeography 30: 1245-1257.

Meijaard E. & Groves C. P. (2006) The geography of mammals and rivers in mainland Southeast Asia. In: Primate Biogeography: Progress and Prospects (eds. S. M. Lehman & J. G. Fleagle). Springer, New York.

Melville J., Hale J., Mantziou G., Ananjeva N. B., Milto K. & Clemann N. (2009) Historical biogeography, phylogenetic relationships and intraspecific diversity of agamid lizards in the Central Asian deserts of Kazakhstan and Uzbekistan. Molecular Phylogenetics and Evolution 53: 99-112.

Meyer A. (1994) DNA technology and phylogeny of fish. In: Genetics and Evolution of Aquatic Organisms (ed. A. R. Beaumont). Chapman & Hall, London.

Miller S. A., Dykes D. D. & Polesky H. F. (1988) A simple salting out procedure for extracting DNA from nucleated cells. Nucleic Acids Research 16: 1215.

Mills L. S. & Allendorf F. W. (1996) The one-migrant-per-generation rule in conservation and management. Conservation Biology 10: 1509-1518.

Molengraaff G. A. F. (1921) Modern deep-sea research in the East Indian Archipelago. The Geographical Journal 57: 95-118.

Monaghan M. T., Spaak P., Robinson C. T. & Ward J. V. (2001) Genetic differentiation of Baetis alpinus Pictet (Ephemeroptera: Baetidae) in fragmented alpine streams. Heredity 86: 395-403.

Moretti S., Armougom F., Wallace I. M., Higgins D. G., Jongeneel C. V. & Notredame C. (2007) The M-Coffee web server: a meta-method for computing multiple sequence alignments by combining alternative alignment methods. Nucleic Acids Research 35: W645-648.

Moretti S., Wilm A., Higgins D. G., Xenarios I. & Notredame C. (2008) R-Coffee: a web server for accurately aligning noncoding RNA sequences. Nucleic Acids Research 36: W10-13.

Moritz C. (1994) Defining 'Evolutionarily Significant Units' for conservation. Trends in Ecology & Evolution 9: 373-375.

Morley R. (1998) Palynological evidence for Tertiary plant dispersals in the SE Asian region in relation to plate tectonics and climate. In: Biogeography and Geological Evolution of SE Asia (eds. J. D. Holloway & R. Hall) pp. 211-234. Backhuys Publishers, Leiden.

Morley R. J. (2003) Interplate dispersal paths for megathermal angiosperms. Perspectives in Plant Ecology, Evolution and Systematics 6: 5-20.

Morrone J. J. & Crisci J. V. (1995) Historical biogeography: introduction to methods. Annual Review of Ecology and Systematics 26: 373-401.

Moss S. & Wilson M. (1998) Biogeographic implications of the Tertiary palaeogeographic evolution of Sulawesi and Borneo. In: Biogeography and Geological Evolution of SE Asia (eds. R. Hall & J. Holloway) pp. 133-163. Backhuys Publishers, Leiden.

MRC (2003) State of the Basin Report pp. 50. Mekong River Commission, Phnom Penh.

MRC Fisheries Program (1999) Snakeheads. In: Mekong Fisheries Network Newsletter, Vol 4, Number 3. Mekong River Commission, Bangkok.

Mullon C., Freon P. & Cury P. (2005) The dynamics of collapse in world fisheries. Fish and Fisheries 6: 111-120.

Murphy W. J. & Collier G. E. (1997) A molecular phylogeny for aplocheiloid fishes (Atherinomorpha, Cyprinodontiformes): the role of vicariance and the origins of annualism. Molecular Biology and Evolution 14: 790-799.

Murray A. M. (2001) The fossil record and biogeography of the Cichlidae (Actinopterygii: Labroidei). Biological Journal of the Linnean Society 74: 517-532.

Murray A. M. (2006) A new channid (Teleostei: Channiformes) from the Eocene and Oligocene of Egypt. Journal of Paleontology 80: 1172-1178.

Murray A. M. & Thewissen J. G. M. (2008) Eocene actinopterygian fishes from Pakistan, with the description of a new genus and species of channid (Channiformes). Journal of Vertebrate Paleontology 28: 41-52.

Murray A. M. & Thewissen J. G. M. (2009) Eocene actinopterygian fishes from Pakistan, with the description of a new genus and species of channid (Channiformes). Journal of Vertebrate Paleontology 28: 41-52.

Musikasinthorn P. (1998) Channa panaw, a new channid fish from the Irrawaddy and Sittang River basins, Myanmar. Ichthyological Research 45: 355-362.

Musikasinthorn P. & Taki Y. (2001) Channa siamensis (Gunther, 1861), a junior synonym of Channa lucius (Cuvier in Cuvier and Valenciennes, 1831). Ichthyological Research 48: 319-324.

Muyzer G. & Smalla K. (1998) Application of denaturing gradient gel electrophoresis (DGGE) and temperature gradient gel electrophoresis (TGGE) in microbial ecology. Antonie van Leeuwenhoek 73: 127-141.

Na-Nakorn U., Kamonrat W. & Ngamsiri T. (2004) Genetic diversity of walking catfish, Clarias macrocephalus, in Thailand and evidence of genetic introgression from introduced farmed C. gariepinus. Aquaculture 240: 145-163.

Naret K., Viryak S. & Griffiths D. (2002) Fish price monitoring in Kandal, Prey Veng and Takeo Provinces of Cambodia. IIEFT (International Institute of Fishery Economic and Trade): 11pp.

Nash G., Roberts R. J., Chinabut S., Areerat S. & Limsuwan C. (1988) Emaciation of pond-cultured snakehead, Channa striatus (Fowler). Journal of Fish Diseases 11: 215-224.

Navy H. & Bhattarai M. (2009) Economics and livelihoods of small-scale inland fisheries in the Lower Mekong Basin: a survey of three communities in Cambodia. Water Policy 11: 31-51.

Near T. J. & Cheng C. H. C. (2008) Phylogenetics of notothenioid fishes (Teleostei: Acanthomorpha): Inferences from mitochondrial and nuclear gene sequences. Molecular Phylogenetics and Evolution 47: 832-840.

Nei M. (1987) Molecular Evolutionary Genetics. Columbia University Press, New York, USA.

Nei M., Maruyama T. & Chakraborty R. (1975) The bottleneck effect and genetic variability in populations. Evolution 29: 1-10.

Nelson J. S. (1994) Fishes of the World. Wiley, New York.

Ng H. H. & Rainboth W. J. (2001) A review of the sisorid catfish genus Oreoglanis (Siluriformes: Sisoridae) with descriptions of four new species. Occasional Papers of the Museum of Zoology, The University of Michigan 732.

Ng P. K. L., Tay J. B. & Lim K. K. P. (1994) Diversity and conservation of blackwater fishes in Peninsular Malaysia, particularly in the north Selangor peat swamp forest. Hydrobiologia 285: 203-218.

Ngor S., Aun, Deap L. & Hortle K. G. (2003) The dai trey linh fishery in the Tonle Touch (Touch River), southeast Cambodia. In: Proceedings of the 6th Technical Symposium on Mekong Fisheries, Pakse, Lao PDR.

Nguyen D. N., Smallwood C., Hao N. V., Trinh N. X. & Tin (2006) Bo suu tap. Ngu cu noi dia vung. Dong bang song Cuu Long. Mekong River Commission and the Research Institute for Aquaculture No.2., HCMC.

Nixon K. C. & Carpenter J. M. (1996) On simultaneous analysis. Cladistics 12: 221-241.

Notredame C., Higgins D. G. & Heringa J. (2000) T-coffee: a novel method for fast and accurate multiple sequence alignment. Journal of Molecular Biology 302: 205-217.

O'Connell M. & Wright J. M. (1997) Microsatellite DNA in fishes. Reviews in Fish Biology and Fisheries 7: 331-363.

O'Reilly D. J. W., von den Driesch A. & Voeun V. (2006) Archaeology and archaeozoology of Phum Snay: A late prehistoric cemetery in northwestern Cambodia. Asian Perspectives 45: 188-211.

Ohta A. T. (1980) Coadaptive gene complexes in incipient species of Hawaiian Drosophila. The American Naturalist 115: 121-132.

Onrizal, Kusmana C., Saharjo B. H., Handayani I. P. & Kato T. (2005) Social and environmental issues of Danau Sentarum National Park, West Kalimantan. Biodiversitas 6: 220-223.

Ovenden J. R. (1990) Mitochondrial DNA and marine stock assessment: A review. Australian Journal of Marine and Freshwater Research 41: 835-853.

Paetkau D., Calvert W., Stirling I. & Strobeck C. (1995) Microsatellite analysis of population structure in Canadian polar bears. Molecular Ecology 4: 347-354.

Palsbøll P. J., Bérubé M. & Allendorf F. W. (2007) Identification of management units using population genetic data. Trends in Ecology & Evolution 22: 11-16.

Palumbi S., Martin A., Romano S., McMillan W., Stice L. & Grabowski (1991) The simple fool's guide to PCR. Version 2. Department of Zoology and Kewalo Marine Laboratory, Honolulu.

Parameswaran S. & Murugesan V. K. (1976) Observations on the hypophysation of murrels (Ophicephalidae). Hydrobiologia 50: 81-87.

Patterson C. (1993) An overview of the early fossil record of Acanthomorphs. Bulletin of Marine Science 52: 29-59.

Pearse D. E. & Crandall K. A. (2004) Beyond FST: Analysis of population genetic data for conservation. Conservation Genetics 5: 585-602.

Pei J. (2008) Multiple protein sequence alignment. Current Opinion in Structural Biology 18: 382-386.

Pei J., Sun Z., Wang X., Zhao Y., Ge X., Guo X., Li H. & Si J. (2009) Evidence for Tibetan plateau uplift in Qaidam basin before Eocene-Oligocene boundary and its climatic implications. Journal of Earth Science 20: 430-437.

Peng Z., Ho S. Y. W., Zhang Y. & He S. (2006) Uplift of the Tibetan plateau: Evidence from divergence times of glyptosternoid catfishes. Molecular Phylogenetics and Evolution 39: 568-572.

Penny D. (2006) The Holocene history and development of the Tonle Sap, Cambodia. Quaternary Science Reviews 25: 310-322.

Petit R. J., El Mousadik A. & Pons O. (1998) Identifying populations for conservation on the basis of genetic markers. Conservation Biology 12: 844-855.

Phillips M. J. (2002) Fresh water aquaculture in the Lower Mekong Basin. In: MRC Technical Paper No. 7 (ed. A. Bishop) pp. 62. Mekong River Commission, Phnom Penh.

Piry S., Luikart G. & Cornuet J. M. (1999) Computer note. BOTTLENECK: a computer program for detecting recent reductions in the effective population size using allele frequency data. Journal of Heredity 90: 502-503.

Planes S., Doherty P. J. & Bernardi G. (2001) Strong genetic divergence among populations of a marine fish with limited dispersal, Acanthochromis polyacanthus, within the Great Barrier Reef and the Coral Sea. Evolution 55: 2263-2273.

Posada D. & Buckley T. R. (2004) Model selection and model averaging in phylogenetics: advantages of Akaike Information Criterion and Bayesian approaches over likelihood ratio tests. Systematic Biology 53: 793-808.

Posadas P., Crisci J. V. & Katinas L. (2006) Historical biogeography: A review of its basic concepts and critical issues. Journal of Arid Environments 66: 389-403.

Poulsen A., Griffiths D., Nam S. & Nguyen T. T. (2008) Capture-based aquaculture of Pangasiid catfishes and snakeheads in the Mekong River Basin. In: Capture-based aquaculture. Global overview. FAO Fisheries Technical Paper No. 508 (eds. A. Lovatelli & P. F. Holthus) pp. 69-91. FAO, Rome.

Pouyaud L., Gustiano R. & Teugels G. G. (2004) Contribution to the phylogeny of the Pangasiidae based on mitochondrial 12S rDNA. Indonesian Journal of Agricultural Science 5: 45-62.

Pouyaud L., Teugels G. G., Gustiano R. & Legendre M. (2000) Contribution to the phylogeny of pangasiid catfishes based on allozymes and mitochondrial DNA. Journal of Fish Biology 56: 1509-1538.

Pritchard J. K., Stephens M. & Donnelly P. (2000) Inference of population structure using multilocus genotype data. Genetics 155: 945-959.

Qin J. G. & Fast A. W. (1998) Effects of temperature, size and density on culture performance of snakehead, Channa striatus (Bloch), fed formulated feed. Aquaculture Research 29: 299-303.

Quenouille B., Bermingham E. & Planes S. (2004) Molecular systematics of the damselfishes (Teleostei: Pomacentridae): Bayesian phylogenetic analyses of mitochondrial and nuclear DNA sequences. Molecular Phylogenetics and Evolution 31: 66-88.

Rainboth W. J. (1996a) Fishes of the Cambodian Mekong. Food and Agriculture Organization of the United Nations, Rome.

Rainboth W. J. (1996b) The taxonomy, systematics, and zoogeography of Hypsibarbus, a new genus of large barbs (Pisces, Cyprinidae) from the rivers of Southeastern Asia. University of California Press, Berkeley.

Rambaut A. & Drummond A. (2007) Tracer v1.4. Available from http://beast.bio.ed.ac.uk/Tracer

Ramos-Onsins S. E. & Rozas J. (2002) Statistical properties of new neutrality tests against population growth. Molecular Biology and Evolution 19: 2092-2100.

Rand D. M. (1996) Neutrality tests of molecular markers and the connection between DNA polymorphism, demography, and conservation biology. Conservation Biology 10: 665-671.

Ratner B. (2003) The politics of regional governance in the Mekong River Basin. Global Change, Peace & Security 15: 59-76.

Ray N., Currat M. & Excoffier L. (2003) Intra-deme molecular diversity in spatially expanding populations. Molecular Biology and Evolution 20: 76-86.

Rícan O., Zardoya R. & Doadrio I. (2008) Phylogenetic relationships of Middle American cichlids (Cichlidae, Heroini) based on combined evidence from nuclear genes, mtDNA, and morphology. Molecular Phylogenetics and Evolution 49: 941-957.

Roe L. J. (1991) Phylogenetic and ecological significance of Channidae (Osteichthyes Teleostei) from the early Eocene Kuldana Formation of Kohat, Pakistan. Contributions from the Museum of Paleontology, University of Michigan 28: 93-100.

Rogers A. R. & Harpending H. (1992) Population growth makes waves in the distribution of pairwise genetic differences. Molecular Biology and Evolution 9: 552-569.

Ronquist F. & Huelsenbeck J. P. (2003) MRBAYES: Bayesian phylogenetic inference under mixed models. Bioinformatics 19: 1572-1574.

Rosenberg N. A. (2004) DISTRUCT: a program for the graphical display of population structure. Molecular Ecology Notes 4: 137-138.

Rot T. (unpublished) Status of the flooded forest in fishing lot #2 Battambang Province. Mekong River Commission/Cambodian Department of Fisheries/Danida Fisheries Program (Cambodia).

Rothuis A. J., Nhan D. K., Richter C. J. J. & Ollevier F. (1998a) Rice with fish culture in the semi-deep waters of the Mekong Delta, Vietnam: interaction of rice culture and fish husbandry management on fish production. Aquaculture Research 29: 59-66.

Rothuis A. J., Nhan D. K., Richter C. J. J. & Ollevier F. (1998b) Rice with fish culture in the semi-deep waters of the Mekong Delta, Vietnam: a socio-economical survey. Aquaculture Research 29: 47-57.

Rowley D. B. & Currie B. S. (2006) Palaeo-altimetry of the late Eocene to Miocene Lunpola basin, central Tibet. Nature 439: 677-681.

Rozen S. & Skaletsky H. J. (2000) Primer3 on the web for general users and for biologist programmers. In: Bioinformatics Methods and Protocols: Methods in Molecular Biology (eds. S. Krawetz & S. Misener) pp. 365-386. Humana Press, Totowa, NJ.

Rüber L., Britz R., Tan H. H., Ng P. K. L. & Zardoya R. (2004) Evolution of mouthbrooding and life-history correlates in the fighting fish genus Betta. Evolution 58: 799-813.

Rüber L., Britz R. & Zardoya R. (2006) Molecular phylogenetics and evolutionary diversification of labyrinth fishes (Perciformes: Anabantoidei). Systematic Biology 55: 374-397.

Rüber L., Kottelat M., Tan H. H., Ng P. K. L. & Britz R. (2007) Evolution of miniaturization and the phylogenetic position of Paedocypris, comprising the world's smallest vertebrate. BMC Evolutionary Biology 7: 38.

Ruedi M. & Fumagalli L. (1996) Genetic structure of Gymnures (genus Hylomys; Erinaceidae) on continental islands of Southeast Asia: historical effects of fragmentation. Journal of Zoological Systematics and Evolutionary Research 34: 153-162.

Ruzzante D. E. (1998) A comparison of several measures of genetic distance and population structure with microsatellite data: bias and sampling variance. Canadian Journal of Fisheries and Aquatic Sciences 55: 1-14.

Ryman N. (1991) Conservation genetics considerations in fishery management. Journal of Fish Biology 39: 211-224.

Ryman N. & Palm S. (2006) POWSIM: a computer program for assessing statistical power when testing for genetic differentiation. Molecular Ecology Notes 6: 600-602.

Ryman N., Utter F. & Laikre L. (1995) Protection of intraspecific biodiversity of exploited fishes. Reviews in Fish Biology and Fisheries 5: 417-446.

Santini F., Harmon L. J., Carnevale G. & Alfaro M. E. (2009) Did genome duplication drive the origin of teleosts? A comparative study of diversification in ray-finned fishes. BMC Evolutionary Biology 9: 194.

Sathiamurthy E. & Voris H. K. (2006) Maps of Holocene sea level transgression and submerged lakes on the Sunda Shelf. Natural History Journal of Chulalongkorn University 2 (Supplement): 1-44.

Sayer M. D. J. (2005) Adaptations of amphibious fish for surviving out of water. Fish and Fisheries 6: 186-211.

Schlötterer C. (2000) Evolutionary dynamics of microsatellite DNA. Chromosoma 109: 365-371.

Selkoe K. A. & Toonen R. J. (2006) Microsatellites for ecologists: a practical guide to using and evaluating microsatellite markers. Ecology Letters 9: 615-629.

Senanan W., Kapuscinski A. R., Na-Nakorn U. & Miller L. M. (2004) Genetic impacts of hybrid catfish farming (Clarias macrocephalus×C. gariepinus) on native catfish populations in central Thailand. Aquaculture 235: 167-184.

Simmons M. P. & Ochoterena H. (2000) Gaps as characters in sequence-based phylogenetic analyses. Systematic Biology 49: 369-381.

Singhanouvong D. & Phouthavongs K. (2002) Fisheries baseline survey in Champasack Province, Southern Lao PDR. 5th Technical Symposium on Mekong Fisheries, Mekong River Commission, Phnom Penh.

Slatkin M. (1991) Inbreeding coefficients and coalescence times. Genetical Research 58: 167-175.

Slatkin M. (1993) Isolation by distance in equilibrium and non-equilibrium populations. Evolution 47: 264-279.

Slatkin M. (1995) A measure of population subdivision based on microsatellite allele frequencies. Genetics 139: 457-462.

Slatkin M. & Excoffier L. (1996) Testing for linkage disequilibrium in genotypic data using the Expectation-Maximization algorithm. Heredity 76: 377-383.

Slatkin M. & Hudson R. R. (1991) Pairwise comparisons of mitochondrial DNA sequences in stable and exponentially growing populations. Genetics 129: 555-562.

Slechtova V., Bohlen J., Freyhof J. & Rab P. (2006) Molecular phylogeny of the Southeast Asian freshwater fish family Botiidae (Teleostei: Cobitoidea) and the origin of polyploidy in their evolution. Molecular Phylogenetics and Evolution 39: 529-541.

Smith P. J., Francis R. I. C. C. & McVeagh M. (1991) Loss of genetic diversity due to fishing pressure. Fisheries Research 10: 309-316.

Smith W. L. & Wheeler W. C. (2004) Polyphyly of the mail-cheeked fishes (Teleostei: Scorpaeniformes): evidence from mitochondrial and nuclear sequence data. Molecular Phylogenetics and Evolution 32: 627-646.

Smith W. L. & Wheeler W. C. (2006) Venom evolution widespread in fishes: A phylogenetic road map for the bioprospecting of piscine venoms. Journal of Heredity 97: 206-217.

So N. & Haing L. (2007) An evaluation of freshwater fish seed resources in Cambodia. In: Assessment of freshwater fish seed resources for sustainable aquaculture (ed. M. G. Bondad-Reantaso) pp. 628. FAO, Rome.

So N., Maes G. E. & Volckaert F. A. M. (2006a) High genetic diversity in cryptic populations of the migratory sutchi catfish Pangasianodon hypophthalmus in the Mekong River. Heredity 96: 166-174.

So N., Maes G. E. & Volckaert F. A. M. (2006b) Intra-annual genetic variation in the downstream larval drift of sutchi catfish (Pangasianodon hypophthalmus) in the Mekong river. Biological Journal of the Linnean Society 89: 719-728.

So N., van Houdt J. & Volckaert F. (2006c) Genetic diversity and population history of the migratory catfishes Pangasianodon hypophthalmus and Pangasius bocourti in the Cambodian Mekong River. Fisheries Science 72: 469-476.

Sparks J. S. & Smith W. L. (2005) Freshwater fishes, dispersal ability, and nonevidence: "Gondwana Life Rafts" to the rescue. Systematic Biology 54: 158-165.

Stamatakis A. (2006) RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22: 2688-2690.

Stephens M. & Donnelly P. (2000) Inference in molecular population genetics. Journal of the Royal Statistical Society. Series B (Statistical Methodology) 62: 605-655.

Stephenson R. L. (1999) Stock complexity in fisheries management: a perspective of emerging issues related to population sub-units. Fisheries Research 43: 247-249.

Stewart K. M. (2001) The freshwater fish of Neogene Africa (Miocene-Pleistocene): systematics and biogeography. Fish and Fisheries 2: 177-230.

Stiassny M. L. J. & Getahun A. (2007) An overview of labeonin relationships and the phylogenetic placement of the Afro-Asian genus Garra Hamilton, 1922 (Teleostei: Cyprinidae), with the description of five new species of Garra from Ethiopia, and a key to all African species. Zoological Journal of the Linnean Society 150: 41-83.

Sullivan J. P., Lundberg J. G. & Hardman M. (2006) A phylogenetic analysis of the major groups of catfishes (Teleostei: Siluriformes) using rag1 and rag2 nuclear gene sequences. Molecular Phylogenetics and Evolution 41: 636-662.

Tajima F. (1983) Evolutionary relationship of DNA sequences in finite populations. Genetics 105: 437-460.

Tajima F. (1989) Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123: 585-595.

Takagi A., Ishikawa S., Nao T., Hort S., Nakatani M., Nishida M. & Kurokura H. (2006) Genetic differentiation of the bronze featherback Notopterus notopterus between Mekong River and Tonle Sap Lake populations by mitochondrial DNA analysis. Fisheries Science 72: 750-754.

Taki Y. (1975) Geographic distribution of primary freshwater fishes in four principal areas of Southeast Asia. Southeast Asian Studies 13: 200-214.

Tamura K., Dudley J., Nei M. & Kumar S. (2007) Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Molecular Biology and Evolution 24: 1596-1599.

Tamura K. & Nei M. (1993) Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Molecular Biology and Evolution 10: 512-526.

Tamura K., Nei M. & Kumar S. (2004) Prospects for inferring very large phylogenies by using the neighbor-joining method. Proceedings of the National Academy of Sciences 101: 11030-11035.

Tang Q., Getahun A. & Liu H. (2009) Multiple in-to-Africa dispersals of labeonin fishes (Teleostei: Cyprinidae) revealed by molecular phylogenetic analysis. Hydrobiologia 632: 261-271.

Tao N., Richardson R., Bruno W. & Kuiken C. (2009) FindModel. HCV sequence database: http://hcv.lanl.gov/content/sequence/HCV/ToolsOutline.html.

Templeton A. R. (1981) Mechanisms of speciation - A population genetic approach. Annual Review of Ecology and Systematics 12: 23-48.

Thinh V. N., Mootnick A. R., Geissmann T., Li M., Ziegler T., Agil M., Moisson P., Nadler T., Walter L. & Roos C. (2010) Mitochondrial evidence for multiple radiations in the evolutionary history of small apes. BMC Evolutionary Biology 10: 74.

Thongrod S. (2007) Analysis of feeds and fertilizers for sustainable aquaculture development in Thailand. In: Study and analysis of feeds and fertilizers for sustainable aquaculture development. FAO Fisheries Technical Paper No 497 (eds. M. R. Hasan, T. Hecht, S. S. De Silva & A. G. J. Tacon) pp. 309-330. FAO, Rome.

Tweedie M. W. F. (1950) Notes on Malayan fresh water fishes. The Bulletin of the Raffles Museum 21: 97-105.

Unmack P. J. (2001) Biogeography of Australian freshwater fishes. Journal of Biogeography 28: 1053-1089.

Utter F. (1998) Genetic problems of hatchery-reared progeny released into the wild, and how to deal with them. Bulletin of Marine Science 62: 623-640.

van der Kaars S., Penny D., Tibby J., Fluin J., Dam R. A. C. & Suparan P. (2001) Late Quaternary palaeoecology, palynology and palaeolimnology of a tropical lowland swamp: Rawa Danau, West-Java, Indonesia. Palaeogeography, Palaeoclimatology, Palaeoecology 171: 185-212.

Van Oosterhout C., Hutchinson W. F., Wills D. P. M. & Shipley P. (2004) Micro-checker: software for identifying and correcting genotyping errors in microsatellite data. Molecular Ecology Notes 4: 535-538.

van Zalinge N. (2002) Update on the status of the Cambodian inland capture fisheries sector. Mekong Fish - Catch and Culture 8.

van Zalinge N., Degen P., Pongsri C., Nuov S., Jensen J. G., Hao N. V. & Choulamany X. (2003) The Mekong River System. In: Contribution to the Second International Symposium on the Management of Large Rivers for Fisheries, Phnom Penh.

Vences M., Freyhof J., Sonnenberg R., Kosuch J. & Veith M. (2001) Reconciling fossils and molecules: Cenozoic divergence of cichlid fishes and the biogeography of Madagascar. Journal of Biogeography 28: 1091-1099.

Verhoeven K. J. F., Simonsen K. L. & McIntyre L. M. (2005) Implementing false discovery rate control: increasing your power. Oikos 108: 643-647.

Viravong S., Phounsavath S., Photitay C., Putrea S., Chan S., Kolding J., Jorgensen J. V. & Photavong K. (2006) Hydro-acoustic survey of deep pools in the Mekong River in Southern Lao PDR and Northern Cambodia. In: MRC Technical Paper No. 11 pp. 76. Mekong River Commission, Vientiane.

Vishwanath W. & Geetakumari K. (2009) Diagnosis and interrelationships of fishes of the genus Channa Scopoli (Teleostei: Channidae) of northeastern India. Journal of Threatened Taxa 1: 97-105.

Voris H. K. (2000) Maps of Pleistocene sea levels in Southeast Asia: shorelines, river systems and time durations. Journal of Biogeography 27: 1153-1167.

Vrijenhoek R. C. (1998) Conservation genetics of freshwater fish. Journal of Fish Biology 53: 394-412.

Wallace A. R. (1863) On the physical geography of the Malay Archipelago. Journal of the Royal Geographical Society of London 33: 217-234.

Wallace I. M., O'Sullivan O., Higgins D. G. & Notredame C. (2006) M-Coffee: combining multiple sequence alignment methods with T-Coffee. Nucleic Acids Research 34: 1692-1699.

Wang X., Li J. & He S. (2007) Molecular evidence for the monophyly of East Asian groups of Cyprinidae (Teleostei: Cypriniformes) derived from the nuclear recombination activating gene 2 sequences. Molecular Phylogenetics and Evolution 42: 157-170.

Ward R. D. (2000) Genetics in fisheries management. Hydrobiologia 420: 191-201.

Ward R. D., Woodwark M. & Skibinski D. O. F. (1994) A comparison of genetic diversity levels in marine, freshwater, and anadromous fishes. Journal of Fish Biology 44: 213-232.

Waters J. M., Craw D., Youngson J. H. & Wallis G. P. (2001) Genes meet geology: Fish phylogeographic pattern reflects ancient, rather than modern, drainage connections. Evolution 55: 1844-1851.

Watterson G. A. (1975) On the number of segregating sites in genetic models without recombination. Theoretical Population Biology 7: 256-276.

Wee K. L. (1982) Snakeheads - Their biology and culture. In: Recent Advances in Aquaculture (eds. J. F. Muir & R. J. Roberts) pp. 179-213. Westview Press, Boulder, Colorado.

Wesener T. & VandenSpiegel D. (2009) A first phylogenetic analysis of Giant Pill-Millipedes (Diplopoda: Sphaerotheriida), a new model Gondwanan taxon, with special emphasis on island gigantism. Cladistics 25: 545-573.

Wiens J. J. & Donoghue M. J. (2004) Historical biogeography, ecology and species richness. Trends in Ecology & Evolution 19: 639-644.

Wiens J. J. & Moen D. S. (2008) Missing data and the accuracy of Bayesian phylogenetics. Journal of Systematics and Evolution 46: 307-314.

Wilkinson-Herbots H. M. & Ettridge R. (2004) The effect of unequal migration rates on FST. Theoretical Population Biology 66: 185-197.

Wilson A. C., Cann R. L., Carr S. M., George M., Gyllensten U. B., Helm-Bychowski K. M., Higuchi R. G., Palumbi S. R., Prager E. M., Sage R. D. & Stoneking M. (1985) Mitochondrial DNA and two perspectives on evolutionary genetics. Biological Journal of the Linnean Society 26: 375-400.

Woodruff D. S. (2001) Colloquium Paper: Declines of biomes and biotas and the future of evolution. Proceedings of the National Academy of Sciences 98: 5471-5476.

Woodruff D. S. (2010) Biogeography and conservation in Southeast Asia: how 2.7 million years of repeated environmental fluctuations affect today's patterns and the future of the remaining refugial-phase biodiversity. Biodiversity and Conservation 19: 919-941.

Woodruff D. S. & Turner L. M. (2009) The Indochinese-Sundaic zoogeographic transition: a description and analysis of terrestrial mammal species distributions. Journal of Biogeography 36: 803-821.

Wright S. (1943) Isolation by distance. Genetics 28: 114-138.

Xia X. & Xie Z. (2001) DAMBE: Software package for data analysis in molecular biology and evolution. Journal of Heredity 92: 371-373.

Yamamoto S., Morita K., Koizumi I. & Maekawa K. (2004) Genetic differentiation of white-spotted charr (Salvelinus leucomaenis) populations after habitat fragmentation: Spatial-temporal changes in gene frequencies. Conservation Genetics 5: 529-538.

Yang Z. & Rannala B. (2006) Bayesian estimation of species divergence times under a molecular clock using multiple fossil calibrations with soft bounds. Molecular Biology and Evolution 23: 212-226.

Yap S. Y. (2002) On the distributional patterns of Southeast-East Asian freshwater fish and their history. Journal of Biogeography 29: 1187-1199.

Yusoff F. M., Shariff M. & Gopinath N. (2006) Diversity of Malaysian aquatic ecosystems and resources. Aquatic Ecosystem Health & Management 9: 119-135.

Zakaria-Ismail M. (1994) Zoogeography and biodiversity of the freshwater fishes of Southeast Asia. Hydrobiologia 285: 41-48.

Zaki S. A. H., Jordan W. C., Reichard M., Przybylski M. & Smith C. (2008) A morphological and genetic analysis of the European bitterling species complex. Biological Journal of the Linnean Society 95: 337-347.

Zane L., Bargelloni L. & Patarnello T. (2002) Strategies for microsatellite isolation: a review. Molecular Ecology 11: 1-16.

Zardoya R. & Meyer A. (1996) Phylogenetic performance of mitochondrial protein-coding genes in resolving relationships among vertebrates. Molecular Biology and Evolution 13: 933-942.

Zhang C.-G., Musikasinthorn P. & Watanabe K. (2002) Channa nox, a new channid fish lacking a pelvic fin from Guangxi, China. Ichthyological Research 49: 140-146.

Zhang D.-X. & Hewitt G. M. (2003) Nuclear DNA analyses in genetic studies of populations: practice, problems and prospects. Molecular Ecology 12: 563-584.

Zhisheng A., Kutzbach J. E., Prell W. L. & Porter S. C. (2001) Evolution of Asian monsoons and phased uplift of the Himalaya-Tibetan plateau since Late Miocene times. Nature 411: 62-66.

Appendices

Appendix 1. DNA EXTRACTION.

DNA was obtained from samples of caudal fin, pectoral fin, dorsal fin or muscle tissue. All samples were preserved in 70% ethanol solution at the time of collection.

Total genomic DNA was extracted following a modified version of the protocol outlined by Miller et al. (1988).

EXTRACTION PROTOCOL

Approximately 500mg of tissue was excised from larger samples using standard dissection equipment (scalpel and forceps). Equipment was washed in 100% ethanol and wiped twice between samples to minimise cross-contamination.

The tissue sample was placed directly into 500µL of buffer (50mM Tris-HCl; 20mM EDTA; 2% SDS) in a 1.5mL Eppendorf tube. After the addition of 5µL of Proteinase K (20mg/mL), samples were left to digest overnight at 37°C.

The following day, samples were chilled on ice for 10 mins to halt enzyme activity and then spun lightly before the addition of 250µL of saturated salt solution (6M NaCl). Tubes were inverted gently to mix the solutions before chilling on ice for a further 5 mins.

Samples were then spun at 8,000rpm for 15 mins to pellet cellular debris and NaCl. Immediately after this spin, 500µL of supernatant was carefully collected and transferred to a new labelled 1.5mL Eppendorf tube. To precipitate the DNA from this solution, 1mL of 100% AR grade ethanol was added and the solution cooled to -4°C for a minimum of 2 hours.

After chilling, samples were spun at maximum speed (11,000 – 14,000rpm) for 15 mins to pellet the DNA. All supernatant was removed before the DNA pellet was rinsed in cold 70% ethanol solution. After spinning at maximum speed for 5 mins, the supernatant was again discarded and the DNA was left to dry overnight at room temperature. Lids of Eppendorf tubes were left open to facilitate evaporation but covered with tissue to protect samples from dust.

After drying, DNA was re-suspended in 50-200µL of ddH2O, depending on the size of the pellet, and left at room temperature for a further 30 mins to rehydrate before storage at -4°C.
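The reagent volumes in this protocol imply approximate working concentrations at each stage. The following is a minimal sketch of that arithmetic, assuming that volumes are simply additive and ignoring the volume contributed by the tissue itself; the function name `final_concentration` is illustrative and is not part of the original protocol.

```python
def final_concentration(stock_conc, stock_vol, total_vol):
    """Concentration after diluting stock_vol of a stock into total_vol.

    Volumes must share one unit; the result has the unit of stock_conc.
    """
    return stock_conc * stock_vol / total_vol


# Digestion: 5 uL Proteinase K (20 mg/mL) added to 500 uL buffer.
digest_vol = 500 + 5
prot_k = final_concentration(20, 5, digest_vol)        # mg/mL

# Salting out: 250 uL of 6 M NaCl added to the digest.
salt_vol = digest_vol + 250
nacl = final_concentration(6, 250, salt_vol)           # M

# Precipitation: 1 mL of 100% ethanol added to 500 uL supernatant.
ethanol = final_concentration(100, 1000, 500 + 1000)   # % v/v

print(f"Proteinase K: {prot_k:.3f} mg/mL")
print(f"NaCl: {nacl:.2f} M")
print(f"Ethanol: {ethanol:.1f}% v/v")
```

Under these assumptions the digest contains roughly 0.2 mg/mL Proteinase K, the salting-out step roughly 2 M NaCl, and the precipitation step about 67% ethanol.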

Appendix 2. PCR Clean-up and Sequencing Protocol

PCR CLEAN-UP

After PCR, agarose gel electrophoresis was performed to verify success of amplification. 5µL of each reaction was mixed with 2µL of loading dye (glycerol and bromophenol blue) and loaded into a 1% agarose gel (1g agarose: 100mL 1xTBE buffer). A molecular weight standard ladder was also included on each gel. Electrophoresis was performed in 1xTBE buffer at 100V for 30 mins, and the gel was then stained with ethidium bromide by submersion for 10 mins in TBE-ethidium bromide solution. Amplified product was then visualised under UV light.

Successfully amplified product was cleaned using a MoBio UltraClean™ PCR Clean-up DNA Purification Kit following the manufacturer's instructions (www.mobio.com). After clean-up, product was again run on a 1% agarose gel to check the recovery rate of the clean-up procedure.

SEQUENCING

Clean PCR product was sequenced as follows:

Reactions contained:
1µL of template clean PCR product
0.015 μM of primer
1µL of BigDye Sequencing Buffer
1µL of BigDye Terminator Mix Version 3.1™
ddH2O to total volume of 20µL

Sequencing reactions were performed in an Eppendorf Mastercycler S under the following cycling conditions:

94°C for 5 mins
30 cycles of: 96°C for 10 s; 50°C for 5 s; 60°C for 4 mins
4°C for 10 mins
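As a rough planning aid, the total hold time of this cycling program can be tallied as below. This is a sketch under two assumptions: that the 30 cycles cover the 96°C, 50°C and 60°C steps, and that ramp times between temperatures are ignored (so a real run takes longer). The names `Step` and `hold_seconds` are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    seconds: int   # hold time at that temperature


INITIAL = [Step(94, 5 * 60)]
CYCLE = [Step(96, 10), Step(50, 5), Step(60, 4 * 60)]  # assumed cycled block
FINAL = [Step(4, 10 * 60)]
N_CYCLES = 30


def hold_seconds(initial, cycle, n_cycles, final):
    """Sum of programmed hold times, excluding ramping between steps."""
    per_cycle = sum(s.seconds for s in cycle)
    once = sum(s.seconds for s in initial) + sum(s.seconds for s in final)
    return once + n_cycles * per_cycle


total = hold_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
print(f"{total} s = {total / 60:.1f} min")
```

On these assumptions the programmed holds sum to 8550 s, or about 142.5 minutes.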

SEQUENCING CLEAN UP

Products of sequencing reactions were cleaned following an Ethanol / EDTA protocol as follows:

Sequencing product was added to a 1.5mL microcentrifuge tube containing 5µL of 125mM EDTA (pH 8.0). Product was then vortexed and spun briefly before the addition of 60µL of cold 100% ethanol. Samples were again vortexed and spun briefly before being left to incubate at room temperature for 15 mins, after which they were spun at maximum speed (11,000 – 14,000rpm) for 20 mins. Immediately after this step all solution was aspirated away from the pellet; this involved a quick second spin and second aspiration to remove any residual solution. The pellet was then rinsed by the addition of 250µL of cold 70% ethanol, vortexed briefly, and spun at maximum speed for 5 mins before all liquid was again removed. Pellets were then dried completely by placing open tubes in a heat block at 70°C until no liquid meniscus was visible (1-2 mins). Dry pellets were then protected from light and transferred to the Griffith University DNA Sequencing Facility (Brisbane, Australia; www.griffith.edu.au/science/dna-sequencing-facility), where samples were processed on an ABI 3130xl Genetic Analyser (www.AppliedBiosystems.com).

Appendix 3. Cloning PCR product

PCR product from individuals previously identified from sequencing as heterozygotes at the RP1 locus was cloned to isolate individual alleles before re-sequencing. The flow chart summarises the steps that were involved in the cloning process. See below for details of individual procedures.

PCR CLONING PROTOCOL

Ligation: PCR product was first incorporated into plasmid vector.

10 µL reaction contained:
2 µL clean PCR product
1 µL PGM Vector (Promega™)
5 µL Ligation Buffer (Promega™)
1 µL ddH2O
1 µL T4 DNA Ligase (Promega™) (added last)

Reaction mix was incubated overnight at 16°C.

Transformation: Plasmid vector containing PCR product was inserted into host bacterial cells.

1 µL of plasmid+PCR was added to 100µL of JM109 cells (Promega™). The heat shock transformation method was used to facilitate the uptake of plasmid DNA into bacterial cells; samples were placed in a 42°C water bath for 45-50 seconds, then placed on ice for 2 mins for cell recovery. The solution was then added to 900µL of cold SOC medium and allowed to grow for 60 mins at 37°C with gentle shaking to prevent cells clumping or attaching to the side of the vial.

SOC (makes 100mL)
2g Tryptone
0.5g Yeast Extract
0.05g NaCl
Add to 90mL dH2O
[Autoclave in 18mL or 9mL aliquots in bottles at 121°C for 20 mins; store at 4°C]

Plating and Growth: Bacterial cells containing plasmid DNA including single copies of PCR product were cultured. As bacterial cells replicated they replicated plasmid DNA, producing thousands of copies of the target allele (RP1).

After transformation, cells were cultured on LB media containing Ampicillin and X-gal. Only those cells that contain the plasmid insert with the Ampicillin resistance gene can replicate and form bacterial colonies on this media. The plasmid vector also contains a region that codes for the LacZ gene. LacZ hydrolyses X-gal to produce a blue colour; however, the vector's ligation site falls within the LacZ coding region such that vectors which have circularised to incorporate a PCR-insert no longer contain a functional LacZ gene. This blue metabolite allows discrimination between bacterial colonies which contain vector-only plasmids (blue) and colonies that contain vector+PCR plasmids (white).
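The selection logic described here can be written out as a small decision function. This is an illustrative sketch only: the function names and the example colony records are hypothetical, and the model ignores real-world complications such as faint-blue colonies or satellite colonies.

```python
def colony_outcome(has_plasmid: bool, has_insert: bool) -> str:
    """Expected phenotype on LB + Ampicillin + X-gal plates."""
    if not has_plasmid:
        # No Ampicillin resistance gene, so the cell cannot form a colony.
        return "no colony"
    if has_insert:
        # Insert disrupts LacZ, so X-gal is not hydrolysed.
        return "white colony"
    # Intact LacZ hydrolyses X-gal to the blue product.
    return "blue colony"


def pick_for_culture(colonies):
    """Return names of colonies expected to carry vector+PCR plasmids."""
    return [
        name
        for name, has_plasmid, has_insert in colonies
        if colony_outcome(has_plasmid, has_insert) == "white colony"
    ]


# Hypothetical plate: (label, carries plasmid, plasmid carries insert)
plate = [
    ("c1", True, True),
    ("c2", True, False),
    ("c3", False, False),
]
print(pick_for_culture(plate))
```

Only colonies that both grow and remain white are carried forward to overnight culture.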

Large (17cm) petri dishes containing LB-media were prepared the day before inoculation and refrigerated until needed. After transformation, 100µL of cell solution was spread on one plate of LB-media. After allowing the solution to air dry, plates were grown overnight at 37°C. The next day white (positive) colonies were identified. Cells from positive colonies were removed from agar plates and placed in individual vials of 3-5mL of growth media (LB-Media without agar, with Ampicillin). Colonies were grown overnight at 37°C with gentle shaking to increase bacterial yield.

Mini-prep: After positive colonies had been grown overnight, plasmid DNA was extracted and cleaned ready for sequencing.

1.5 mL of culture was decanted into an Eppendorf tube and spun at top speed to pellet cells in the bottom of the vial. Supernatant was removed and cells were resuspended in 100µL of cold TEG buffer.

Then:
200µL of freshly made SDS-NaOH solution was added and the solution mixed
100µL Acetate Solution (3M, pH 4.8) was added and the solution mixed
150µL Chloroform was added and the solution mixed
Solution was spun at top speed for 4 mins

After this the top layer of supernatant was carefully decanted into a clean tube, where 1mL of 100% cold ethanol was added. Plasmid DNA was allowed to precipitate for 2 mins before being spun at top speed for 4 mins. Ethanol was removed and the plasmid DNA was rinsed in 70% ethanol and spun for a further 4 mins. Ethanol was again removed, and the pellet was allowed to dry before resuspension in 50µL of dH2O. Finally, 1µL of RNase (10mg/mL) was added and the solution incubated at 37°C for 1 hr.

1µL of clean plasmid was used for sequencing following the protocol (APPENDIX 2) with the following pUC18/19 Vector Primers:

F: 5’-GTA AAA CGA CGG CCA GT-3’
R: 5’-CAG GAA ACA GCT ATG AC-3’

Solutions:

LB-MEDIA (makes 1L)
10 g Tryptone
5 g Yeast Extract
5 g NaCl
15 g Agar
Make to 1 L with dH2O
[Autoclave at 121°C for 20 mins; place in 55°C water bath to cool prior to adding ampicillin/X-gal/IPTG]
For 500 mL LB add:
2.5 mL 0.1 M IPTG
400 µL 50 mg/mL X-gal
1 mL 50 mg/mL Ampicillin

TEG Buffer
50 mM Glucose
25 mM Tris pH 8.0
10 mM EDTA pH 8.0
Prepare in 100 mL batches; autoclave at 121°C for 15 mins; store at 4°C.

1% SDS / 0.2 M NaOH
(100 µL 10% SDS, 40 µL 5 M NaOH, per 1 mL)

Acetate Solution (for 100 mL)
5 M Potassium Acetate 60 mL
100% Glacial Acetic acid 11.5 mL
dH2O 28.5 mL
Store at 4°C.
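The LB additive volumes are given per 500 mL of medium, so other batch sizes require linear scaling. The sketch below shows one way to do this and to derive the approximate working concentrations; it assumes the small additive volumes do not appreciably change the total volume, and the names used are illustrative rather than part of the original recipes.

```python
# Additive volumes (mL) per 500 mL of LB, as listed in the recipe.
LB_ADDITIVES_PER_500_ML = {
    "0.1 M IPTG": 2.5,
    "50 mg/mL X-gal": 0.4,
    "50 mg/mL Ampicillin": 1.0,
}


def scale_additives(lb_volume_ml, per_500=LB_ADDITIVES_PER_500_ML):
    """Additive volumes (mL) for an arbitrary volume of LB."""
    factor = lb_volume_ml / 500
    return {name: vol * factor for name, vol in per_500.items()}


def working_concentration(stock_conc, additive_ml, lb_volume_ml):
    """Approximate final concentration, in the unit of stock_conc."""
    return stock_conc * additive_ml / lb_volume_ml


for name, vol in scale_additives(1000).items():
    print(f"{name}: {vol:g} mL per 1 L")

amp = working_concentration(50, 1.0, 500)     # mg/mL
xgal = working_concentration(50, 0.4, 500)    # mg/mL
iptg = working_concentration(0.1, 2.5, 500)   # M
print(f"Ampicillin {amp * 1000:g} ug/mL, X-gal {xgal * 1000:g} ug/mL, "
      f"IPTG {iptg * 1000:g} mM")
```

On this approximation the recipe corresponds to about 100 µg/mL Ampicillin, 40 µg/mL X-gal and 0.5 mM IPTG in the plates.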

Appendix 4. Multilocus phylogenies

[Figure: multilocus species phylogenies; scale bars = 0.1]

Species phylogenies. LEFT/LOWER: Estimated on coding gene loci Cyt b and RAG1, and RIGHT/UPPER: Estimated on all four DNA loci. Red lines indicate differences in topology.

Appendix 5.

Published phylogeny

“A reappraisal of the evolution of Asian snakehead fishes (Pisces, Channidae) using molecular data from multiple genes and fossil calibration”

Eleanor A S Adamson, David A Hurwood, Peter B Mather. Molecular Phylogenetics and Evolution (2010) Volume 56, Pages 707-717.

Full paper available online at: http://dx.doi.org/10.1016/j.ympev.2010.03.027

Appendix 6. C. striata TGGE

After PCR amplification of the Cyt b mtDNA fragment for each individual, variation was screened using Temperature Gradient Gel Electrophoresis (TGGE) with outgroup heteroduplex analysis (OHA) on a modified DIAGEN (now QIAGEN) TGGE system. TGGE-OHA discriminates between DNA variants based on the specific melting temperatures of mismatched PCR products. After mass screening, representatives of each unique haplotype were sequenced following Appendix 2.

OPTIMISATION OF TGGE CONDITIONS

In order to determine the approximate melting temperature of heteroduplexed Cyt b PCR product, an optimisation TGGE gel was run with the temperature gradient perpendicular to the direction of current.

Homoduplex reaction contained:
50μL of reference PCR product
50μL of alternative PCR product
20μL 10xME Buffer + bromophenol blue
80μL dH2O

The homoduplex reaction (95°C for 5 mins, 50°C for 15 mins, 2°C hold) was performed on an Eppendorf™ Mastercycler S PCR machine, and the product was stored on ice prior to electrophoresis. Homoduplexed sample was loaded across the entire gel well and run at room temperature at 300V for 30 mins. The current was then halted and a temperature gradient of 20°C to 60°C established, before the gel was run at 300V for a further 60 mins. DNA was visualised via silver staining.

After the approximate melting temperature was established from perpendicular gel results, trial parallel gels (35°C-60°C) were run to optimise running time and banding resolution. In the time trials, a small number of samples were homoduplexed multiple times and run in batches on the same gel, for different run times. Homoduplex reactions (95°C for 5 mins, 50°C for 15 mins, 2°C hold) contained:
1μL of reference PCR product
1.5μL of sample PCR product
0.8μL 10xME Buffer + bromophenol blue
4μL of 8M Urea
0.7μL dH2O

The initial set of samples was loaded and run at 300V for 15 mins before the gel was halted and the second set was loaded; the gel was then run for a further 15 mins before the addition of the final set. The gel was then run for 3hrs 45mins before staining.

[Figure: perpendicular TGGE gel. Horizontal axis: temperature, 20°C to 60°C; electrode polarity marked +ve and -ve; bands labelled double-stranded DNA and single-stranded DNA.]

ABOVE: Perpendicular TGGE showing melting profile for C. striata Cyt b heteroduplexed PCR product.

BELOW: Two examples of Time Trial TGGE Gels.

TGGE RUNNING CONDITIONS

Optimum TGGE running conditions for C. striata Cyt b fragment were:

Temperature gradient: 40°C to 60°C
Run time: 3 hours
Outgroup (reference) 1: sample from Tanjung Karang in Malaysia (haplotype 50)
Outgroup (reference) 2*: sample from Vientiane in Lao PDR (haplotype 68)

Homoduplex reactions (95°C for 5 mins, 50°C for 15 mins, 2°C hold) contained:
1μL of reference PCR product
1.5μL of sample PCR product
0.8μL 10xME Buffer + bromophenol blue
4μL of 8M Urea
0.7μL dH2O

*In this study two outgroups were required as some individuals failed to produce adequate banding patterns when the first outgroup was used. This was due to the presence of two very divergent clades. When heteroduplexes were formed between individuals from different clades they melted at low temperatures, presumably due to the very high number of mismatches between nucleotide bases in these cases. This denaturation at low temperature prevented multi-clade heteroduplexes from migrating through the 40°C to 60°C temperature gradient, and therefore an outgroup belonging to the same clade was required to produce a banding pattern under optimised gel conditions.
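The reasoning behind the second outgroup can be illustrated with a toy model in which the reference that forms the fewest mismatches with a sample is preferred. This is a deliberately simplified sketch: the sequences are invented 20-base examples rather than real Cyt b data, mismatch count is only a crude proxy for heteroduplex melting behaviour, and in the study itself the outgroup was chosen from observed banding patterns rather than computed.

```python
def mismatches(seq_a: str, seq_b: str) -> int:
    """Number of positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def choose_reference(sample: str, references: dict) -> str:
    """Name of the reference with the fewest mismatches to the sample."""
    return min(references, key=lambda name: mismatches(sample, references[name]))


# Hypothetical aligned fragments standing in for two divergent clades.
references = {
    "outgroup_1": "ACGTACGTACGTACGTACGT",
    "outgroup_2": "ACGTTCGAACGGACCTACTT",
}
sample = "ACGTACGTACGTACGAACGT"

for name, ref in references.items():
    print(name, mismatches(sample, ref))
print("preferred:", choose_reference(sample, references))
```

In this toy case the sample differs from the first reference at one position and from the second at six, so the first would be the same-clade outgroup.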

TGGE SOLUTIONS

ME Buffer (50x):
209.6g MOPS (1M)
18.6g EDTA (50mM)
36g NaOH (buffered to pH8)
In 1L dH2O

Gel mix:
21.6g Urea
900μL 10xME Buffer
2.25mL 40% Glycerol
16.7mL dH2O
5.6mL 40% Acrylamide
136μL 10% APS
74μL TEMED
Components were stirred over low heat until dissolved; APS and TEMED were added immediately prior to casting.



Gel casting and loading:

Gels were cast onto GelBond® PAG film (203mm x 260mm) (GE Healthcare) between two glass plates (210mm x 210mm) and left to set horizontally for 1hr. Before electrophoresis, the glass plates were removed and the gel was placed horizontally, PAG film side down, on the temperature gradient block. A thin film of TRITON detergent was applied between the temperature block and the PAG film to maintain uniform contact for even heat transfer. Thick cloth was placed over each end of the gel and run into the buffer chambers (1xME Buffer) to form wicks that carried the current. A thin plastic sheet was placed over the exposed gel surface (excluding the loading wells) to prevent desiccation during electrophoresis. After the gel was loaded, heavy Perspex was placed over the set-up to hold the wicks in place against the gel.

Gel staining:

After electrophoresis, the gel was removed from the TGGE apparatus and placed in a staining tray. The gel was rinsed in Buffer A (fixing solution) for 3mins, then drained and resubmerged in fresh Buffer A for a further 3mins. Buffer A was removed and the gel was covered with Buffer B (silver nitrate) for 10mins, after which Buffer B was removed and the gel was quickly rinsed twice with dH2O. The gel was then developed in Buffer C (oxidising solution) for 10-20mins, depending on the intensity of the revealed bands. Buffer C was then discarded and Buffer D applied for 10mins. After Buffer D was removed, the gel was left to air-dry on the bench before wrapping.

Buffer A: 135mL dH2O plus 15mL Solution A (10% EtOH, 0.5% acetic acid)

Buffer B: 135mL dH2O plus 20mL Solution B (1% AgNO3)

Buffer C: (made fresh) 300mL dH2O plus 4.5g NaOH, 1.2mL formaldehyde, small scoop NaBH4

Buffer D: 135mL dH2O plus 15mL Solution D (0.75% Na2CO3)
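The working concentrations of Buffers A, B and D follow from the stock percentages and the volumes mixed. The sketch below applies the usual dilution relation (stock concentration x stock volume / total volume) to the volumes listed above. It assumes the volumes are simply additive, and the helper name is illustrative.

```python
def diluted_percent(stock_percent, stock_ml, water_ml):
    """Final percentage after mixing stock_ml of stock with water_ml of water."""
    return stock_percent * stock_ml / (stock_ml + water_ml)

# Buffer A: 15mL Solution A (10% EtOH, 0.5% acetic acid) plus 135mL dH2O.
ethanol_a = diluted_percent(10.0, 15, 135)
acetic_a = diluted_percent(0.5, 15, 135)
# Buffer B: 20mL Solution B (1% AgNO3) plus 135mL dH2O.
silver_b = diluted_percent(1.0, 20, 135)
# Buffer D: 15mL Solution D (0.75% stock) plus 135mL dH2O.
solute_d = diluted_percent(0.75, 15, 135)

print(f"Buffer A: {ethanol_a:.2f}% EtOH, {acetic_a:.3f}% acetic acid")
print(f"Buffer B: {silver_b:.3f}% AgNO3")
print(f"Buffer D: {solute_d:.3f}%")
```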


Appendix 7. Microsatellite Isolation

To identify microsatellite regions in the genomes of C. striata and C. micropeltes, genomic libraries were created for each species, and radioactive probes were used to identify specific repeat motifs. The protocols follow the methods of Archangi et al. (2009) and Chand et al. (2005). The steps of the process are summarised in the flow chart below.

DNA extraction (Appendix 1)
→ Restriction digest with DpnII/Sau3A
→ Gel extraction (to isolate DNA fragments 300-700bps)
→ Ligation, transformation, plating, growth (Appendix 3)
→ Hybridization
→ Labelling microsatellite containing clones with radioactive probes
→ X-ray screening
→ Positive colony selection, mini-prep and sequencing (Appendices 2 & 3)



Appendix 8. Additional microsatellite primers for C. striata

Species      Primer name   Primer sequence
C. striata   Cs-9          5’-CAT ATG GTA CAT GTC AC GCT CA-3’
                           5’-ATC TGA GTT TCG AAA ACC ACT TAA A-3’
C. striata   Cs-10         5’-GGC CAA AAT GTC CTC ACT TTA-3’
                           5’-AGG AAG CAG TTC AGCC AGC G-3’
C. striata   Cs-11         5’-CCT GGC CCT AAT TGT CTC AA-3’
                           5’-GCA GCT GCC TCA GGA GGA GT-3’
C. striata   Cs-12         5’-GCG CAC AGC GTT TCT ACT AA-3’
                           5’-GTT CGG ACT GGA AAA CCT TG-3’

Primer name   Repeat motif         Flanking region   Ta/MgCl2   Comment
Cs-9          GT(4)-GC-GT(7)       126               59/0.1     Low variation
Cs-10         CA (25)              108               59/0.5     Nulls?
Cs-11         GT repeat            <163                         Not optimised
Cs-12         Compound GA repeat   120                          Not optimised



Appendix 9. Gelscan Protocol

Variation in microsatellite allele size was screened by vertical polyacrylamide gel electrophoresis on the CORBETT Gel-Scan™ 3000 System (QIAGEN). Gels were cast between two glass plates, which were then installed directly into the Gel-Scan machine. A laser was used to generate a real-time image as denatured colour tagged PCR product was electrophoresed through the gel.

GEL CASTING

Gels (~1mm thickness) were cast between 210mm x 185mm specialised Gel-Scan™ glass plates. Immediately prior to casting, 8μL of TEMED and 80μL of 10% APS were added to 15mL of gel mix to catalyse polymerisation.

Gel mix (100mLs):
42g Urea
6mL 10xTBE Buffer (AMRESCO, ASTRAL Scientific)
40% Acrylamide: bis-Acrylamide (AMRESCO, ASTRAL Scientific)
dH2O to total volume 100mLs

After the gel had set (>1hr), it was installed vertically in the Gel-Scan™ 3000 System and the buffer chambers were filled with 1xTBE. Gels were then pre-run at 1200V for 30 mins before samples were loaded. All gels were run at a constant temperature of 40°C.

PRODUCT PREPARATION

PCR product was mixed with denaturing loading dye (Formamide & Bromophenol Blue) in a ratio of 1:1. Before electrophoresis, the product was denatured for 3mins at 94°C and then immediately placed on ice for 3mins. For each sample, 0.5-1μL of PCR product plus dye was loaded, depending on the concentration of the PCR product.

REAL-TIME ELECTROPHORESIS

Each gel run was loaded with 3 or more molecular size standards (TAMRA, Applied Biosystems), 2 internal standards (previously run product where allele sizes had been determined), and 20-40 unknown samples. Product was pulsed into gel (1200V for 8secs), before excess was flushed out of loading wells. Electrophoresis was carried out at 1200V for 30-45mins depending on allele sizes at each locus.
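The text does not state how allele sizes were read off against the molecular size standards. One common approach, sketched here purely as an illustration, is local linear interpolation between the two standard peaks that flank a sample peak. The ladder sizes and migration times below are hypothetical values, not data from this study, and the function name is an assumption.

```python
from bisect import bisect_left

def size_from_ladder(migration, ladder):
    """Interpolate fragment size (bp) from a sorted list of (migration, size) pairs."""
    positions = [position for position, _ in ladder]
    if migration < positions[0] or migration > positions[-1]:
        raise ValueError("peak lies outside the range of the size standard")
    index = bisect_left(positions, migration)
    if positions[index] == migration:
        # Peak coincides exactly with a standard peak.
        return float(ladder[index][1])
    (pos_lo, size_lo), (pos_hi, size_hi) = ladder[index - 1], ladder[index]
    fraction = (migration - pos_lo) / (pos_hi - pos_lo)
    return size_lo + (size_hi - size_lo) * fraction

# Hypothetical standard: migration time in minutes paired with fragment size in bp.
ladder = [(10.0, 100), (14.0, 150), (18.0, 200)]
estimated = size_from_ladder(12.0, ladder)
print(estimated)
```

In practice the internal standards of known allele size, run on every gel, give an independent check on whatever sizing method is used.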



Appendix 10. C. striata microsatellite frequencies

Microsatellite allele frequencies for C. striata loci across 19 sites; n is the number of individuals genotyped at each locus at each site.

Cs-1
Site    CM    CP    TK    LP    SB    VV    VT    SM    MD    KJ    NE    GL    LL    ST    KC    PS    TT    TL    PH
n       10    47    30    24    30    20    42    8     29    11    50    37    40    48    48    48    42    45    42
Allele size
162     0     0.01  0     0     0     0     0     0     0     0     0     0     0     0     0     0     0.01  0     0
164     0     0     0     0.06  0     0     0     0.06  0     0     0     0     0     0     0     0     0     0     0
166     0.05  0.01  0     0     0     0     0     0.13  0     0     0     0     0     0     0     0     0.01  0     0
168     0     0     0     0     0     0     0.01  0.06  0.03  0     0     0     0     0.01  0     0     0     0     0
170     0.1   0.02  0     0.08  0     0.03  0.12  0     0     0     0.01  0.03  0     0     0.10  0     0.10  0.09  0.12
172     0.1   0.04  0.02  0     0.2   0.98  0.87  0.63  0.16  0.09  0.32  0.01  0     0.11  0.04  0.06  0.04  0.11  0.02
174     0     0     0.73  0.10  0     0     0     0     0     0.05  0     0     0.01  0     0.04  0.02  0     0     0
176     0     0.09  0.07  0.02  0.32  0     0     0.06  0.22  0.18  0.17  0     0.06  0.09  0.28  0.39  0.19  0.19  0.20
178     0.45  0.51  0.17  0.73  0.33  0     0     0.06  0.53  0.41  0.44  0.96  0.75  0.33  0.39  0.47  0.33  0.41  0.54
180     0     0.05  0     0     0.15  0     0     0     0.02  0.14  0.03  0     0.18  0.31  0.05  0.03  0.07  0.07  0.07
182     0     0     0     0     0     0     0     0     0.02  0     0     0     0     0.01  0.02  0     0.04  0.01  0.01
184     0.1   0     0     0     0     0     0     0     0.02  0.14  0.03  0     0     0.13  0.04  0.01  0.08  0.03  0.02
186     0     0.11  0     0     0     0     0     0     0     0     0     0     0     0     0.02  0.02  0.05  0.03  0
188     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0.01  0.01  0
192     0     0.02  0.02  0     0     0     0     0     0     0     0     0     0     0     0     0     0     0.01  0
194     0     0.01  0     0     0     0     0     0     0     0     0     0     0     0     0     0     0.06  0     0
196     0.2   0.06  0     0     0     0     0     0     0     0     0     0     0     0     0.01  0     0.01  0     0
198     0     0.06  0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0.03  0.01

Cs-2
Site    CM    CP    TK    LP    SB    VV    VT    SM    MD    KJ    NE    GL    LL    ST    KC    PS    TT    TL    PH
n       10    48    30    23    30    20    42    8     29    11    50    39    40    47    48    48    42    45    42
Allele size
103     0     0     0     0     0     0     0     0     0     0     0     0     0     0.04  0.01  0     0     0     0
105     0     0.01  0     0     0     0     0     0     0     0     0     0     0     0     0.01  0     0     0     0
107     0.05  0.05  0.05  0.35  0     0     0     0     0.02  0.05  0     0.03  0.44  0.01  0.03  0     0.01  0     0.01
111     0     0     0     0     0     0     0     0     0     0     0     0     0     0.01  0     0.01  0.01  0.01  0.01
113     0.3   0.29  0.48  0.65  0.2   0     0     0     0.24  0.14  0.44  0.37  0.18  0.36  0.35  0.40  0.32  0.42  0.51
115     0     0.02  0     0     0.1   0.4   0.12  0     0     0     0     0     0     0     0     0.04  0     0     0
117     0.45  0.32  0.03  0     0.35  0.6   0.88  0.94  0     0.05  0.08  0     0     0.06  0     0     0.14  0.09  0.01
119     0.2   0.28  0.43  0     0.18  0     0     0.06  0.66  0.77  0.48  0.60  0.39  0.5   0.51  0.54  0.49  0.47  0.44
121     0     0.02  0     0     0.15  0     0     0     0.09  0     0     0     0     0.01  0.08  0.01  0.02  0.01  0.01
131     0     0     0     0     0.02  0     0     0     0     0     0     0     0     0     0     0     0     0     0


Cs-3
Site    CM    CP    TK    LP    SB    VV    VT    SM    MD    KJ    NE    GL    LL    ST    KC    PS    TT    TL    PH
n       10    48    30    24    30    20    42    8     29    11    50    39    40    48    48    48    42    45    42
Allele size
95      0     0     0.05  0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
103     0     0     0     0.06  0     0     0     0     0     0     0     0     0     0     0     0     0     0.01  0.06
105     0     0     0.12  0     0     0     0     0     0.02  0     0     0     0.09  0.07  0.07  0.06  0.01  0.07  0.02
107     0.05  0     0.17  0.35  0     0     0     0     0.02  0.05  0.1   0.19  0     0.06  0     0.01  0     0     0
109     0.1   0.19  0.07  0.46  0     0.63  0.01  0.44  0.45  0.77  0.64  0.40  0.4   0.31  0.41  0.19  0.40  0.6   0.60
111     0.05  0.08  0     0.02  0.07  0.08  0.32  0.25  0.05  0     0.02  0.03  0     0.02  0.14  0.21  0.08  0.09  0.02
113     0.05  0.01  0     0     0     0     0.21  0     0.36  0     0.14  0     0     0.02  0.08  0.26  0.02  0     0.01
115     0     0.02  0.02  0     0.07  0     0.11  0     0     0     0     0     0.51  0     0.02  0.03  0.02  0.08  0.01
117     0.05  0.13  0.03  0.10  0     0.28  0.13  0.13  0.03  0     0.01  0.06  0     0     0     0.02  0.17  0.01  0.10
119     0.1   0.09  0     0     0.02  0     0.14  0.13  0     0     0.01  0     0     0.09  0.05  0.04  0.06  0.02  0.01
121     0     0.06  0     0     0     0     0.07  0.06  0.02  0.09  0.07  0     0     0.20  0.02  0     0.05  0     0.01
123     0     0.04  0     0     0     0     0     0     0.02  0     0.01  0     0     0.05  0     0.01  0.02  0.01  0
125     0     0.02  0.28  0     0.07  0     0     0     0.03  0.05  0     0     0     0.02  0.02  0     0.07  0.06  0.08
127     0     0.04  0.05  0     0     0.03  0     0     0     0     0     0.32  0     0.04  0.04  0.01  0.01  0.03  0.05
129     0.1   0.08  0.03  0     0.07  0     0     0     0     0.05  0     0     0     0.02  0.03  0.01  0.01  0     0
131     0.1   0.02  0.12  0     0.2   0     0     0     0     0     0     0     0     0.05  0.04  0.10  0.04  0.02  0.02
133     0     0.09  0     0     0.15  0     0     0     0     0     0     0     0     0     0.03  0.01  0.01  0     0
135     0.1   0.02  0     0     0.1   0     0     0     0     0     0     0     0     0.02  0.01  0     0.01  0     0
137     0     0.04  0     0     0     0     0     0     0     0     0     0     0     0     0.01  0.02  0     0     0
139     0.05  0     0     0     0.13  0     0     0     0     0     0     0     0     0.01  0.01  0.01  0     0     0
141     0.05  0     0     0     0.08  0     0     0     0     0     0     0     0     0     0     0     0     0     0
143     0     0     0     0     0.05  0     0     0     0     0     0     0     0     0     0     0     0     0     0
145     0.1   0.02  0.07  0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
147     0.05  0.03  0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
151     0.05  0     0     0     0     0     0     0     0     0     0     0     0     0     0.01  0     0     0     0

Cs-4
Site    CM    CP    TK    LP    SB    VV    VT    SM    MD    KJ    NE    GL    LL    ST    KC    PS    TT    TL    PH
n       10    46    30    24    30    20    42    8     29    10    50    39    40    48    48    48    42    45    42
Allele size
146     0     0     0.05  0     0     0     0     0     0     0     0     0     0     0.01  0.09  0.01  0     0     0
148     0     0.01  0     0     0     0     0     0     0.21  0.15  0.15  0.03  0.03  0.46  0.13  0.33  0.13  0.09  0.10
150     0.05  0     0     0     0     0     0     0     0     0.2   0     0     0.01  0.01  0     0     0     0.01  0
152     0     0.01  0.02  0     0     0     0.02  0.06  0     0     0     0     0     0     0.01  0     0     0     0
154     0     0.23  0.05  1     0.25  0     0     0.06  0.02  0     0     0     0.04  0.02  0.07  0.02  0.06  0.03  0.04
156     0.2   0.15  0     0     0.02  0     0.01  0.19  0.47  0.15  0.32  0.17  0.14  0.25  0.04  0.16  0.12  0.19  0.08
158     0.1   0.14  0.87  0     0.05  0     0.06  0     0.16  0.05  0.31  0.12  0.68  0.10  0.21  0.21  0.21  0.16  0.12
160     0.2   0.08  0.02  0     0.03  0     0.44  0     0.07  0.05  0.03  0.55  0.01  0.02  0.06  0.04  0.05  0.30  0.11
162     0.2   0.10  0     0     0.03  0.05  0.21  0.19  0.02  0.3   0.13  0.10  0.06  0.11  0.23  0.11  0.10  0.12  0.27
164     0.15  0.09  0     0     0     0     0.05  0.19  0.05  0.1   0.04  0.04  0.04  0.01  0.08  0.11  0.08  0.04  0.18
166     0.05  0.09  0     0     0.13  0     0.11  0.06  0     0     0.02  0     0     0     0.01  0     0     0     0.01
168     0.05  0     0     0     0.02  0.93  0.10  0.06  0     0     0     0     0     0     0.05  0     0.13  0.01  0
170     0     0.08  0     0     0.17  0.03  0     0.19  0     0     0     0     0     0     0     0     0     0     0.01
172     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0.02  0     0.06
174     0     0.03  0     0     0     0     0     0     0.02  0     0     0     0     0     0.01  0     0     0.03  0.01
176     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0.02  0     0
178     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0.01  0     0
182     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0.01  0.01
184     0     0     0     0     0.02  0     0     0     0     0     0     0     0     0     0     0     0     0     0
194     0     0     0     0     0.02  0     0     0     0     0     0     0     0     0     0     0     0     0     0
196     0     0     0     0     0.05  0     0     0     0     0     0     0     0     0     0     0     0     0     0
198     0     0     0     0     0.03  0     0     0     0     0     0     0     0     0     0     0     0     0     0
200     0     0     0     0     0.17  0     0     0     0     0     0     0     0     0     0     0     0     0     0
202     0     0     0     0     0.02  0     0     0     0     0     0     0     0     0     0     0     0.06  0     0



Cs-5
Site    CM    CP    TK    LP    SB    VV    VT    SM    MD    KJ    NE    GL    LL    ST    KC    PS    TT    TL    PH
n       9     47    30    19    30    16    42    6     29    11    50    39    40    48    47    48    42    44    41
Allele size
137     0.06  0.10  0.03  0     0.05  0     0     0     0.29  0.41  0.23  0.67  0.49  0.70  0.32  0.26  0.30  0.38  0.22
139     0.17  0.17  0     0     0.05  0     0     0     0.17  0.18  0.16  0.01  0     0.10  0.19  0.31  0.05  0.19  0.10
141     0.17  0.02  0.03  0     0.43  1     0.99  1     0.02  0     0     0.01  0     0.01  0     0     0.12  0.03  0.07
143     0     0.02  0     0     0.1   0     0.01  0     0     0     0     0     0.01  0     0.01  0     0.01  0     0.01
145     0     0.02  0     0.11  0     0     0     0     0.02  0.09  0.18  0     0.36  0.01  0.19  0.04  0.24  0.03  0.02
147     0.5   0.33  0.5   0.16  0.22  0     0     0     0.24  0.09  0.1   0     0     0.02  0     0     0.01  0.03  0
149     0.06  0.09  0     0.18  0.02  0     0     0     0.07  0     0.19  0.01  0.01  0.03  0.11  0.11  0.05  0.03  0.01
151     0     0     0     0.18  0     0     0     0     0.16  0.14  0.03  0     0     0.05  0.10  0.06  0.05  0.02  0
153     0     0     0     0.11  0     0     0     0     0.03  0     0.04  0     0     0.04  0.02  0.06  0.01  0.01  0.02
155     0     0     0     0.03  0     0     0     0     0     0     0.04  0     0     0     0.01  0.02  0.02  0     0
157     0     0     0     0.13  0     0     0     0     0     0     0     0     0     0     0     0.06  0     0     0.01
159     0     0     0     0.11  0     0     0     0     0     0     0.01  0     0     0     0     0     0     0     0
161     0     0.04  0     0     0     0     0     0     0     0.09  0.02  0     0     0.03  0.02  0.06  0.04  0.07  0.01
163     0     0     0.02  0     0     0     0     0     0     0     0     0.03  0.11  0     0.01  0     0     0     0.01
165     0     0     0     0     0.1   0     0     0     0     0     0     0.14  0.01  0     0.01  0     0     0.01  0
167     0     0.05  0.08  0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0.01
169     0     0.04  0.28  0     0.03  0     0     0     0     0     0     0     0     0     0     0     0     0     0
171     0.06  0.01  0.02  0     0     0     0     0     0     0     0     0.01  0     0     0     0     0     0     0.01
173     0     0     0     0     0     0     0     0     0     0     0     0.12  0     0     0.01  0     0.07  0.15  0.37
175     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0.02  0.02  0.09
177     0     0.03  0     0     0     0     0     0     0     0     0     0     0     0     0     0     0.01  0     0
179     0     0.01  0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
181     0     0.02  0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0.01  0.02
183     0     0.03  0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
187     0     0.01  0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
195     0     0     0.02  0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
197     0     0     0.02  0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0

Cs-6
Site    CM    CP    TK    LP    SB    VV    VT    SM    MD    KJ    NE    GL    LL    ST    KC    PS    TT    TL    PH
n       9     47    30    21    30    17    42    6     27    11    50    39    40    48    48    48    42    45    42
Allele size
122     0     0     0     0.90  0     0     0     0     0     0     0     0     0     0     0     0     0.02  0     0
124     0     0     0     0.07  0     0     0.01  0     0     0     0     0     0     0     0.05  0     0     0     0.07
126     0     0     0.02  0.02  0     0.03  0.04  0     0     0     0     0     0     0.01  0.03  0.07  0     0.01  0.02
128     0     0     0     0     0.02  0     0.19  0.33  0     0     0     0.08  0     0     0.04  0.02  0.07  0.04  0
130     0     0     0     0     0     0     0     0     0     0     0     0     0     0.02  0     0     0     0     0
132     0     0.02  0.37  0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
134     0.06  0     0.1   0     0     0.65  0.05  0.17  0.037 0     0     0.13  0     0     0     0     0.01  0.08  0.10
136     0     0.02  0     0     0.05  0     0.25  0     0.35  0.14  0.32  0.06  0.01  0.07  0.15  0.07  0.18  0.14  0.14
138     0.11  0.05  0     0     0     0     0     0.08  0     0     0.09  0.13  0     0     0     0.01  0     0     0
140     0.06  0.14  0     0     0     0     0     0     0.02  0.05  0.09  0     0     0.04  0.05  0.02  0.04  0.02  0.01
142     0.11  0.06  0.12  0     0.18  0     0.02  0     0.07  0.14  0.01  0     0.14  0.24  0.05  0.19  0.08  0.07  0.02
144     0     0.02  0     0     0     0     0     0.17  0.04  0.05  0.07  0.35  0.8   0.25  0.14  0.10  0     0.04  0.04
146     0.11  0.16  0     0     0     0.29  0.21  0.17  0.11  0.09  0.23  0     0.01  0.21  0.19  0.05  0.32  0.27  0.08
148     0     0     0     0     0     0     0     0     0.09  0.27  0.07  0.12  0.03  0.02  0.04  0.03  0.06  0.03  0.08
150     0.11  0.13  0     0     0.1   0.03  0.15  0.08  0.22  0.27  0.1   0     0     0.09  0.18  0.25  0.10  0.17  0.26
152     0.22  0.28  0     0     0.02  0     0     0     0.02  0     0.01  0     0     0.02  0.03  0.14  0.01  0.02  0.04
154     0     0.01  0     0     0.05  0     0.02  0     0     0     0     0     0     0.01  0.04  0     0.02  0     0.02
156     0.06  0.02  0.02  0     0.1   0     0     0     0     0     0     0     0     0.01  0     0.01  0.01  0     0.01
158     0     0.05  0.17  0     0     0     0.04  0     0.02  0     0     0.03  0     0     0     0     0.02  0.08  0.01
160     0.17  0.01  0.03  0     0.45  0     0     0     0     0     0     0.03  0     0     0     0.01  0.05  0.01  0.04
162     0     0     0     0     0     0     0     0     0     0     0     0.03  0     0     0     0     0     0     0
164     0     0.02  0.12  0     0     0     0.01  0     0     0     0     0     0.01  0     0     0     0     0     0
166     0     0     0.05  0     0.03  0     0     0     0     0     0     0.06  0     0     0     0     0     0     0.02
168     0     0     0.02  0     0     0     0     0     0     0     0     0     0     0     0     0.02  0     0.01  0.02
172     0     0     0     0     0     0     0     0     0.02  0     0.01  0     0     0     0.01  0     0     0     0


Cs-7
Site    CM    CP    TK    LP    SB    VV    VT    SM    MD    KJ    NE    GL    LL    ST    KC    PS    TT    TL    PH
n       10    47    30    24    30    20    42    8     28    11    50    36    40    48    47    48    42    44    42
Allele size
142     0     0     0.08  0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
144     0     0     0     0.19  0     0     0     0     0     0     0     0     0.31  0     0.01  0.02  0     0     0
146     0.4   0.53  0.83  0.81  0.37  0.03  0     0.13  0.09  0.23  0.01  0.46  0.31  0.08  0.26  0.18  0.25  0.27  0.30
148     0.5   0.34  0.07  0     0.53  0.10  0     0.19  0.21  0.36  0.27  0.17  0.36  0.17  0.41  0.20  0.35  0.35  0.29
150     0.1   0.11  0.02  0     0.02  0.03  0     0.06  0.59  0.41  0.66  0.38  0.01  0.71  0.28  0.31  0.29  0.26  0.38
152     0     0.02  0     0     0     0     0.01  0     0.11  0     0.06  0     0     0.03  0.04  0.28  0.12  0.11  0.04
154     0     0     0     0     0     0     0.02  0.06  0     0     0     0     0     0     0     0     0     0     0
156     0     0     0     0     0.02  0     0.11  0.06  0     0     0     0     0     0     0     0     0     0     0
158     0     0     0     0     0     0     0.05  0.06  0     0     0     0     0     0     0     0.01  0     0     0
160     0     0     0     0     0     0     0.19  0.19  0     0     0     0     0     0.01  0     0     0     0     0
164     0     0     0     0     0     0     0.02  0     0     0     0     0     0     0     0     0     0     0     0
166     0     0     0     0     0.02  0.83  0.18  0.13  0     0     0     0     0     0     0     0     0     0     0
168     0     0     0     0     0     0.03  0.08  0.06  0     0     0     0     0     0     0     0     0     0     0
170     0     0     0     0     0     0     0.20  0     0     0     0     0     0     0     0     0     0     0     0
172     0     0     0     0     0.03  0     0.10  0     0     0     0     0     0     0     0     0     0     0     0
174     0     0     0     0     0     0     0.04  0.06  0     0     0     0     0     0     0     0     0     0     0
176     0     0     0     0     0.02  0     0     0     0     0     0     0     0     0     0     0     0     0     0

Cs-8
Site    CM    CP    TK    LP    SB    VV    VT    SM    MD    KJ    NE    GL    LL    ST    KC    PS    TT    TL    PH
n       9     44    30    22    30    15    42    5     29    11    48    39    40    48    47    48    42    43    42
Allele size
133     0     0     0     0.11  0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
137     0.06  0.02  0     0     0.08  0     0.02  0     0     0     0     0.15  0     0     0     0     0     0     0
139     0     0.02  0     0.82  0     0     0     0     0     0     0.21  0     0     0     0.61  0.51  0     0.01  0.49
141     0.28  0.19  0.33  0.05  0.32  0     0     0     0.36  0.23  0.63  0.38  1     0.40  0.33  0.36  0.51  0.47  0.37
143     0.67  0.75  0.47  0.02  0.42  0     0     0     0.62  0.77  0.16  0.29  0     0.41  0     0.09  0.39  0.49  0.12
145     0     0     0.17  0     0     0     0.13  0     0.02  0     0.01  0.01  0     0.14  0.01  0.01  0.07  0.02  0.01
147     0     0.01  0.03  0     0.17  0.03  0.24  0.3   0     0     0     0.15  0     0.02  0.04  0.02  0     0.01  0
151     0     0     0     0     0     0.07  0.13  0     0     0     0     0     0     0     0     0     0     0     0
153     0     0     0     0     0     0     0.06  0.3   0     0     0     0     0     0     0     0     0     0     0
155     0     0     0     0     0.02  0.83  0.19  0     0     0     0     0     0     0     0     0     0     0     0
157     0     0     0     0     0     0     0.01  0.1   0     0     0     0     0     0.01  0     0     0     0     0.01
159     0     0     0     0     0     0     0     0     0     0     0     0     0     0.01  0     0     0.02  0     0
163     0     0     0     0     0     0     0.01  0.1   0     0     0     0     0     0     0     0     0     0     0
165     0     0     0     0     0     0     0     0.2   0     0     0     0     0     0     0     0     0     0     0
167     0     0     0     0     0     0     0.07  0     0     0     0     0     0     0     0     0     0     0     0
169     0     0     0     0     0     0     0     0     0     0     0     0     0     0.01  0     0     0     0     0
171     0     0     0     0     0     0     0     0     0     0     0     0     0     0.01  0     0     0     0     0
173     0     0     0     0     0     0.03  0.13  0     0     0     0     0     0     0     0     0     0     0     0
175     0     0     0     0     0     0.03  0     0     0     0     0     0     0     0     0     0     0     0     0
193     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0.01  0     0     0     0
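Each frequency in the tables above is the share of allele copies of a given size among the individuals genotyped at a site. For a diploid species each individual contributes two copies, so a site with n = 10 has 20 copies and the smallest possible non-zero frequency is 0.05, which matches the CM column. The sketch below shows the calculation on hypothetical genotypes. The data and function name are illustrative and are not taken from the study.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Frequency of each allele size from a list of diploid (allele, allele) genotypes."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total_copies = 2 * len(genotypes)
    return {allele: counts[allele] / total_copies for allele in sorted(counts)}

# Hypothetical genotypes for five individuals at one locus (allele sizes as scored).
genotypes = [(172, 178), (178, 178), (166, 178), (178, 196), (170, 172)]
frequencies = allele_frequencies(genotypes)

for allele, frequency in frequencies.items():
    print(allele, round(frequency, 2))
```

Individuals that failed to amplify at a locus would simply be left out of the list, which is why n varies between loci at the same site.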



Appendix 11. C. micropeltes microsatellite frequencies

Microsatellite frequencies for C. micropeltes loci

Cm-1
Site    LP    TH    MC    ST    KK    KC    KH    PS    BB    TS    TC
n       1     1     12    49    50    9     50    50    50    3     5
Allele size
149     0     0     0     0     0.16  0     0.01  0     0     0     0.20
151     0     0     0     0     0     0     0     0     0.03  0     0
155     0     0     0.46  0.21  0.56  0.89  0.86  0.92  0.91  0.83  0.70
157     0     0     0     0     0     0     0.04  0.03  0.03  0     0
159     0     0     0.08  0     0     0     0     0     0     0     0
161     0     0.50  0.46  0.79  0.28  0.11  0.07  0.02  0     0.17  0
185     0     0     0     0     0     0     0     0.01  0     0     0
191     0     0.50  0     0     0     0     0     0     0     0     0
195     1.00  0     0     0     0     0     0     0     0     0     0
199     0     0     0     0     0     0     0.02  0     0     0     0
205     0     0     0     0     0     0     0     0.02  0.02  0     0.10
207     0     0     0     0     0     0     0     0     0.01  0     0

Cm-2
Site    LP    TH    MC    ST    KK    KC    KH    PS    BB    TS    TC
n       1     1     12    49    50    9     50    50    50    3     5
Allele size
136     0     0     0.96  0.62  0.61  0.33  0.63  0.56  0.54  0.83  0.70
138     1.00  1.00  0.04  0.38  0.39  0.50  0.32  0.42  0.46  0.17  0
140     0     0     0     0     0     0.17  0.04  0.02  0     0     0.30
142     0     0     0     0     0     0     0.01  0     0     0     0

Cm-3
Site    LP    TH    MC    ST    KK    KC    KH    PS    BB    TS    TC
n       1     1     12    49    50    9     50    50    50    3     5
Allele size
151     0     0     0.63  0.42  0.34  0.28  0.12  0.11  0.17  0     0
153     0     0     0     0     0.07  0.17  0.06  0.04  0.05  0.33  0
155     1.00  1.00  0.38  0.58  0.58  0.44  0.67  0.82  0.77  0.67  0.90
167     0     0     0     0     0.01  0.11  0.15  0.03  0.01  0     0.10
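Allele frequency tables of this kind are commonly summarised as expected heterozygosity (gene diversity), one minus the sum of squared allele frequencies. The sketch below applies this to the Cm-2 frequencies at site ST (n = 49) as an illustration. It is not a result reported in this appendix, and the small-sample correction of 2n/(2n - 1) is the standard unbiased form, included here as an assumption about what an analysis might use.

```python
def expected_heterozygosity(frequencies):
    """Gene diversity: one minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in frequencies)

def unbiased_heterozygosity(frequencies, n_individuals):
    """Gene diversity with the 2n / (2n - 1) correction for n diploid individuals."""
    copies = 2 * n_individuals
    return copies / (copies - 1) * expected_heterozygosity(frequencies)

# Cm-2 at site ST: allele 136 at 0.62 and allele 138 at 0.38, with n = 49.
cm2_st = [0.62, 0.38]
he = expected_heterozygosity(cm2_st)
he_unbiased = unbiased_heterozygosity(cm2_st, 49)

print(round(he, 4), round(he_unbiased, 4))
```

Sites with very small n, such as LP and TH (n = 1), give unstable estimates and would normally be excluded from this kind of summary.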
