Ecological genetics of dugongs (Dugong dugon) in Queensland

Alexandra May McGowan

Bachelor of Science

Bachelor of Applied Science (Honours)

A thesis submitted for the degree of Doctor of Philosophy at

The University of Queensland in 2019

School of Veterinary Science

Abstract

Effective management of species is paramount to their survival in the face of ever-increasing threats. To ensure appropriate conservation management, ongoing monitoring of a species’ dispersal patterns is required to assess the extent to which potential barriers limit the exchange of genetic material between populations. Additionally, threats to the overall health of species need to be identified to ensure conservation initiatives are effective. In this thesis, I used an array of genetic techniques to assess the impact of coastal threats to dugongs (Dugong dugon) along the coast of Queensland, Australia, in an effort to inform conservation management of the Queensland population. The dugong is a long-lived marine mammal that inhabits shallow sub-tropical and tropical coastal waters of the Indo-West Pacific region, feeding on meadows of seagrass. Australia is considered a stronghold for the species due to the large populations still found within its waters, although in Queensland, they are exposed to a number of threatening factors that have potentially contributed to observed population declines.

For my first study, I used microsatellite and mitochondrial DNA (mtDNA) genetic markers to investigate the genetic diversity and population structure of dugong populations along the eastern coastline of Queensland, between Torres Strait and Moreton Bay. A seascape genetics approach evaluated the factors that may have influenced dugong dispersal. The microsatellite analysis identified an abrupt genetic break within the Queensland dugong population in the Whitsunday Islands region, with a northern and southern cluster identified. Additionally, Fst values from the mtDNA analysis supported the division of these two clusters. While sea surface temperature and seagrass distribution were not found to be significant factors influencing dugong population structure and genetic distance, an oceanographic phenomenon known as the ‘sticky water’ effect, which is associated with strong tidal currents, may potentially be influencing dugong dispersal.

Next, I developed single nucleotide polymorphism (SNP) markers using the double-digest RAD-Seq method to better define Queensland dugong population structure and to identify highly discriminatory, genome-wide genetic markers. After filtering, a dataset of 10,690 loci detected the same abrupt genetic break as in the first study. However, the analysis also indicated a third genetic cluster, suggesting dugong dispersal may have occurred more often across the Whitsunday Islands region than the microsatellite and mtDNA analyses indicated. The identification of genome-wide genetic markers, including a subset of 464 highly differentiating SNPs, provides opportunities for future genomic studies of dugongs.

My third study aimed to test the effectiveness of a novel method, the commensal bacterial network, to detect dugong contemporary movements. This method aimed to build a network demonstrating dugong groupings based on whether bacterial genotypes were shared, assuming the transfer of bacteria through either direct or indirect contact among dugongs. Faecal samples were collected and commensal bacterial species cultured and sequenced to determine bacterial genotypes for construction of a network. The culture rate of Escherichia coli, a previously used commensal species, was surprisingly low (n=1). A 16S sequencing method identified Staphylococcus warneri as a potentially more suitable species, although low culture success and limited genetic diversity meant it was not possible to build a comprehensive commensal bacterial network to infer dugong contemporary movements.

Characterisation and investigation of differences in the dugong faecal microbiome along the Queensland coast was the aim of my fourth study. Diversity profiling demonstrated that the faecal microbial composition of three southern Queensland populations (Clairview, Hervey Bay and Moreton Bay) was different from that of the more northern locations according to Principal Coordinate Analysis and relative abundance averages of the three main phyla, potentially due to differences in diet. Interestingly, the microbiome geographical divide did not align with the genetic divide, potentially indicating ranging movements across the genetic break. Additionally, core microbial families were identified throughout all locations, including Clostridiaceae_1, Lachnospiraceae, Peptostreptococcaceae, Ruminococcaceae, Bacteroidaceae and Flavobacteriaceae, which are possibly important for seagrass digestion.

In my final study, antibiotic resistance profiles of bacterial isolates cultured from dugong faecal samples (n=9) collected at two locations, Moreton Bay (urban) and Newry Region (rural), were investigated. Whole Genome Sequencing (WGS) was used to identify resistance and virulence genes. All S. warneri isolates were resistant to penicillin, with two S. warneri isolates displaying multidrug resistance, while E. coli isolates were all resistant to ampicillin. Together with the finding of resistance genes in some of the WGS, these preliminary findings suggest possible contamination of the dugong’s environment with antibiotics from human and agricultural wastewater or run-off.

Overall, my findings indicated dugong dispersal is limited across the genetic break, although relatively unrestricted along the remainder of the Queensland coast, with sufficient gene flow to reduce the likelihood of inbreeding and other deleterious effects of low genetic diversity within the two identified clusters. The results from this thesis have the potential to improve dugong conservation management along the urbanised Queensland coast, with maintenance of movement corridors paramount to ensure genetic connectivity.

Declaration by author

This thesis is composed of my original work, and contains no material previously published or written by another person except where due reference has been made in the text. I have clearly stated the contribution by others to jointly-authored works that I have included in my thesis.

I have clearly stated the contribution of others to my thesis as a whole, including statistical assistance, survey design, data analysis, significant technical procedures, professional editorial advice, financial support and any other original research work used or reported in my thesis. The content of my thesis is the result of work I have carried out since the commencement of my higher degree by research candidature and does not include a substantial part of work that has been submitted to qualify for the award of any other degree or diploma in any university or other tertiary institution. I have clearly stated which parts of my thesis, if any, have been submitted to qualify for another award.

I acknowledge that an electronic copy of my thesis must be lodged with the University Library and, subject to the policy and procedures of The University of Queensland, the thesis be made available for research and study in accordance with the Copyright Act 1968 unless a period of embargo has been approved by the Dean of the Graduate School.

I acknowledge that copyright of all material contained in my thesis resides with the copyright holder(s) of that material. Where appropriate I have obtained copyright permission from the copyright holder to reproduce material in this thesis and have sought permission from co-authors for any jointly authored works included in the thesis.

Publications included in this thesis

No publications included.

Submitted manuscripts included in this thesis

Chapter 3, Seascape genetics of a mobile marine mammal: evidence of an abrupt break in dugong (Dugong dugon, Müller) gene flow along Australia’s eastern Queensland coast, has been submitted as a manuscript to Conservation Genetics. Contributions by authors, Alexandra M. McGowan, Janet M. Lanyon, Nicholas Clark, David Blair, Helene Marsh, Eric Wolanski, Jennifer M. Seddon, to the manuscript are presented in the page preceding Chapter 3.

Other publications during candidature

No other publications.

Contributions by others to the thesis

Contributions to the research design, interpretation of data, structure, editing and proofreading of this thesis were provided by my supervisors Prof Jennifer Seddon, Dr Janet Lanyon, Dr Nicholas Clark and Dr Justine Gibson. Additionally, Dr Janet Lanyon supervised and assisted with sample collection, Dr Nicholas Clark assisted with analyses for Chapters 3 and 4 and Dr Justine Gibson assisted with interpretation of Chapter 7 results. Dr Deirdre Mikkelsen from The University of Queensland Centre for Nutrition and Food Science and QAAFI assisted with the analysis and interpretation of the microbiome data in Chapter 6. Contributions by Prof David Blair, Prof Helene Marsh and Prof Eric Wolanski to Chapter 3 are outlined in the ‘Contributions by authors to submitted manuscript’ section preceding the chapter. Antimicrobial resistance testing services were conducted by Rochelle Price, Tina Maguire and Hester Rynhoud from UQ Veterinary Laboratory Services. Sequencing services for Chapters 3 and 5 were conducted by the Animal Genetics Laboratory, Gatton. SNP and diversity profiling sequencing services were conducted by the Australian Genome Research Facility.

Statement of parts of the thesis submitted to qualify for the award of another degree

No works submitted towards another degree have been included in this thesis.

Research Involving Human or Animal Subjects

Dugongs were sampled under The University of Queensland Animal Ethics Permits #ZOO/ENT/344/04, #SBS/290/11, SBS/360/14, SBS/181/18, Scientific Purposes Permits #WISP01660304, WISP03294105, WISP04937308, WISP07255110 and WISP14654414, Moreton Bay Marine Parks Permits #QS2000 to #QS2010CV L228 and MPP18-001119, Great Sandy Marine Parks Permit QS2010-GS043 and Great Barrier Reef Marine Park Permits #G07=23274:1 and G14/36987.1. Ethics approval letters are provided in the appendix.

Acknowledgements

I would like to thank my supervisors Prof Jennifer Seddon, Dr Janet Lanyon, Dr Nicholas Clark and Dr Justine Gibson for all of their support, advice and guidance throughout my thesis. They have been generous with their time and have shared their wealth of experiences and expertise to help make me a better researcher and I will be forever grateful. They have provided valuable contributions to all aspects of this thesis.

My sincere appreciation also goes to Rochelle Price, Tina Maguire and Charmaine Lubke from the School of Veterinary Science for support and guidance with my microbiology laboratory work. Thank you also to Sean Corley, Janene Harris and Deanne Waine from the Animal Genetics Laboratory for their assistance with my genetic laboratory work. I would also like to thank the UQ dugong research team, especially Helen Sneath and Rob Slade, staff of the Queensland Environment and Heritage Department through the StrandNet program, James Cook University dugong researchers, Dr Colin Limpus, A/Prof Caroline Gaus and members of the Badu, Mabuiag and Boigu communities in Torres Strait for their assistance with sample collection and sample donations. I also thank Dr Deirdre Mikkelsen from The University of Queensland Centre for Nutrition and Food Science and QAAFI for her assistance with the analysis and interpretation of the microbiome data.

Finally, I would like to thank my family, friends and fellow PhD candidates for their support throughout my thesis. Special thanks to Tatiana Proboste for being a truly great lab mate and friend and for your help with various aspects of my thesis. Thank you to my sisters and parents for your support and love through everything the last few years and for always providing the encouragement and motivation I needed.

Financial support

This research was supported by an Australian Government Research Training Program Scholarship. The research presented in this thesis was funded by two Sea World Research and Rescue Foundation grants, SWR/1/2015 awarded to Dr Janet Lanyon and SWR/6/2016 awarded to Prof Jennifer Seddon, and a Winifred Violet Scott Foundation grant, WVS/1/2015, awarded to Dr Janet Lanyon. Travel funding to attend conferences was provided by the School of Veterinary Science and through the UQ Graduate School Candidate Development Award.

Keywords

dugong, genetics, queensland, population structure, dispersal, movement, microbiology, microbiome, conservation, coastal threats

Australian and New Zealand Standard Research Classifications (ANZSRC)

ANZSRC code: 060411, Population, Ecological and Evolutionary Genetics, 60%

ANZSRC code: 060503, Microbial Genetics, 40%

Fields of Research (FoR) Classification

FoR code: 0604, Genetics, 70%

FoR code: 0605, Microbiology, 30%

Dedications

This thesis is dedicated to my parents, Michael and Lynne. Thank you to my dad for first inspiring my love of animals and to my mum for her never-ending support and encouragement.

Table of Contents

List of Figures
List of Tables
List of Abbreviations used in the thesis

CHAPTER 1: General introduction and thesis aims
  1.1 COASTAL THREATS
  1.2 GENETIC CONSEQUENCES OF COASTAL THREATS
    Measuring genetic diversity
    Genetic and health effects of habitat loss and fragmentation
  1.3 AIMS AND RATIONALE
  1.4 THESIS OVERVIEW

CHAPTER 2: Literature review
  2.1 DUGONGS
    Distribution and habitat
    Threats
    Estimates of Queensland dugong population sizes
    Life history and breeding
    Diet
    Movement
    Population genetics
    Dugong conservation management in Queensland

CHAPTER 3: Seascape genetics of a mobile marine mammal: evidence of an abrupt break in dugong (Dugong dugon, Müller) gene flow along Australia’s eastern Queensland coast
  3.1 ABSTRACT
  3.2 INTRODUCTION
  3.3 METHODS
    Study sites
    Sample collection
    DNA extraction
    Microsatellite genotyping
    Population differentiation – Microsatellite data
    Mitochondrial DNA sequencing and analysis
    Seascape genetics
    Oceanographic modelling
  3.4 RESULTS
    Population differentiation – Microsatellite data
    mtDNA diversity and population differentiation
    Seascape genetics
    Oceanographic modelling
  3.5 DISCUSSION

CHAPTER 4: Reduced representation sequencing of single nucleotide polymorphisms for detecting Queensland dugong population structure
  4.1 ABSTRACT
  4.2 INTRODUCTION
  4.3 METHODS
    Samples, library preparation and sequencing
    Sequence alignment and filtering
    Population differentiation and diversity statistics
    Identification of highly informative SNPs
  4.4 RESULTS
    Sequencing and filtering output
    Population differentiation and genetic diversity
    Identification of highly informative SNPs and sensitivity admixture analysis
  4.5 DISCUSSION

CHAPTER 5: Assessment of a novel method, the commensal bacterial network, to detect Queensland dugong contemporary movements
  5.1 ABSTRACT
  5.2 INTRODUCTION
  5.3 METHODS
    Sample collection
    Bacterial culture, DNA extraction, PCR and sequencing – E. coli
    Identification of commensal bacteria species – 16S sequencing
    Bacteria culture – S. warneri
    Investigating the genetic variability of S. warneri
  5.4 RESULTS
    Bacteria culture – E. coli
    Bacteria identified in dugong faecal samples – 16S sequencing
    Bacteria culture – S. warneri
    Genetic variability of S. warneri
  5.5 DISCUSSION

CHAPTER 6: Characterisation of the faecal microbiome of dugongs along the east Queensland coast
  6.1 ABSTRACT
  6.2 INTRODUCTION
  6.3 METHODS
    Study sites and sample collection
    DNA extraction, PCR amplification and sequencing
    Bioinformatics and statistical analyses
  6.4 RESULTS
    Bacterial sequencing data and depth
    Bacterial richness and diversity
    Bacterial
    Variation in beta diversity
    Analysis of location-specific bacterial communities
  6.5 DISCUSSION

CHAPTER 7: Antibiotic resistance profiles of bacteria cultured from Queensland dugong faeces
  7.1 ABSTRACT
  7.2 INTRODUCTION
  7.3 METHODS
    Bacterial isolates
    Antimicrobial susceptibility testing
    Whole Genome Sequencing
  7.4 RESULTS
    Bacterial isolate details
    Antimicrobial susceptibility testing
    Whole Genome Sequencing
  7.5 DISCUSSION

CHAPTER 8: General discussion
  8.1 HISTORICAL DUGONG DISPERSAL
  8.2 ECOLOGICAL DRIVERS OF DUGONG POPULATION STRUCTURE AND MOVEMENT
  8.3 CONTEMPORARY DUGONG MOVEMENTS
  8.4 DUGONG FAECAL BACTERIA ANALYSIS
  8.5 IMPLICATIONS FOR DUGONG MANAGEMENT IN QUEENSLAND

REFERENCES

APPENDICES

List of Figures

Figure 2.1. Global dugong distribution map. Sourced from Jefferson et al. (2015).

Figure 2.2. Map showing the Australian dugong range. Sourced from Marsh et al. (1999).

Figure 2.3. Maps of southern Queensland seagrass meadows. Sourced from McKenzie et al. (2010).

Figure 3.1. Map of Queensland, Australia, showing the cumulative assignment probabilities of individual dugongs sampled at each coastal location to one of the two clusters identified by the best-fitting STRUCTURE model.

Figure 3.2. Hierarchical clustering dendrogram of 293 individual dugong samples from along the Queensland coast based on 22 microsatellite loci.

Figure 3.3. Isolation by distance plot of pairwise genetic distance, calculated using genotypes from 293 individual dugongs across 22 microsatellite loci, against pairwise geographic distance (kilometres).

Figure 3.4. Isolation by distance plot of pairwise genetic distance, calculated using genotypes from 293 individual dugongs across 22 microsatellite loci, against pairwise geographic distance (kilometres) for individuals assigned to the northern or southern cluster.

Figure 3.5. Isolation by distance plot of pairwise genetic distance, calculated using genotypes from 293 individual dugongs across 22 microsatellite loci, against pairwise geographic distance (kilometres) for individuals across the 500 km section of the Whitsunday Islands genetic break, 500 km north of the Whitsunday Islands and 500 km south of the Whitsunday Islands.

Figure 3.6. Median-joining network showing relationships between mitochondrial control-region haplotypes from 639 individual dugongs, the proportions of each haplotype and their geographical origins.

Figure 3.7. Logistic regression coefficients for each variable’s predicted influence on an individual dugong’s probability of being assigned in the microsatellite analysis to Cluster 1 (the northern cluster).

Figure 3.8. The predicted movement of potential waterborne particles emanating from the three colour-coded seagrass meadows (site 1 - western side of Whitsunday Island (maroon), site 2 - Shute Harbour (blue) and site 3 - Airlie Beach (green)) in the Whitsunday Islands region (~20.32°S) after 184 hours (a) during calm weather conditions and (b) during the prevailing south-easterly winds.

Figure 4.1. Map showing sample collection locations of dugong tissue samples along the east Queensland coast that underwent genotyping using single nucleotide polymorphisms for population structure analysis.

Figure 4.2. The assignment probabilities of individual dugongs sampled at each coastal location to one of the three clusters (1 – blue, 2 – red, 3 – green) identified by the admixture analysis of 10,690 SNPs.

Figure 4.3. Principal Coordinate Analysis plot visualising the similarities and dissimilarities between individual samples and geographical locations based on 10,690 SNPs. GSS – Great Sandy Straits.

Figure 4.4. Genomic relatedness matrix network of 43 dugong samples from along the east Queensland coast based on 10,690 SNPs.

Figure 4.5. The assignment probabilities of individual dugongs sampled at each coastal location to one of the three clusters (1 – blue, 2 – red, 3 – green) identified by the admixture analysis using the 464 highly discriminatory SNPs identified.

Figure 5.1. Map of dugong faecal sample collection sites along the east Queensland coast.

Figure 5.2. Flow chart of dugong faecal bacterial culture and sequencing methods.

Figure 5.3. Maximum Likelihood consensus tree constructed using the Tamura-Nei (1993) model (1,000 bootstraps) of concatenated sequences across eight regions of the Staphylococcus warneri genome cultured from twenty dugong faecal samples collected along the east Queensland coast.

Figure 5.4. TCS network of the 20 Staphylococcus warneri concatenated sequences from eight genome regions. Staphylococcus warneri was cultured from dugong faecal samples collected along the east Queensland coast. The size of the circles represents the number of samples with that genotype and the colour indicates the site of collection. Cross lines over connections between sequences show the number of mutational changes between sequence haplotypes.

Figure 6.1. Map of dugong faecal sample collection sites along the east Queensland coast. Number of seagrass species present at each site is shown as a gradient colour scale. N, number of samples.

Figure 6.2. Rarefaction plot to show how well the sequencing captured each dugong faecal sample’s bacterial diversity.

Figure 6.3a. Box plot of the estimated OTU richness (Chao1 and Richness) of dugong faecal samples collected from different locations along the east Queensland coast. Locations ordered from north to south (left to right). Upper and lower quartiles, mean (X), median and upper and lower values are shown.

Figure 6.3b. Box plot of the estimated OTU diversity indexes (Shannon, Simpson’s and Evenness) of dugong faecal samples collected from different locations along the east Queensland coast.

Figure 6.4. Relative abundance (%) of different bacterial communities at the phylum (p) level in Queensland dugong faecal samples. Locations are arranged north to south (left to right).

Figure 6.5. Relative abundance (%) of different bacterial communities at the family level in Queensland dugong faecal samples.

Figure 6.6. Relative abundance (%) of the 20 most abundant bacterial communities at the OTU level in wild dugong faecal samples collected from Townsville, south to Moreton Bay, Queensland. Locations are organised from north to south (left to right).

Figure 6.7. Principal Coordinate Analysis (PCoA) plot showing similarities in dugong faecal bacterial communities at the OTU level amongst the collection sites using Bray-Curtis distances.

Figure 6.8. The eight most significantly different abundant families of bacterial communities present at each sampling location based on ANOVA.

Figure 6.9. Venn diagram to show the number of shared and unique OTUs in dugong faecal samples collected from north (blue; Townsville, Upstart Bay, Bowling Green Bay, Airlie Beach) and south (pink; Repulse Bay, Newry Region, Ince Bay, Clairview, Hervey Bay, Moreton Bay) Queensland.

Figure 7.1. Pathways for antibiotics used in human and veterinary medicine and agriculture to enter the environment. Adapted from Carvalho and Santos (2016).

Figure 7.2. Satellite map of a) Moreton Bay and b) Newry Region sample collection sites.

Figure 7.3. Phylogenetic relationship of E. coli sequence types, including ST196. The maximum likelihood phylogenetic tree was constructed in MEGA7 using the Tamura-Nei mutation model and bootstraps calculated with 1,000 iterations. Branch lengths are measured in the number of substitutions per site. ST196 was derived from concatenated sequences generated from an E. coli isolate cultured from a Queensland dugong faecal sample and is circled in red. A selection of subtypes from the A, B1, B2, D, AxB1 and ABD lineages were downloaded from the EnteroBase database.

List of Tables

Table 2.1. Relative abundance estimates (± standard error) of dugong populations along the Queensland coast calculated from aerial survey data.

Table 3.1. Microsatellite diversity metrics (estimated across 1,000 iterations) for the two primary clusters identified in STRUCTURE.

Table 3.2. Summary statistics for dugong mitochondrial sequence data by mitochondrial lineage, mtDNA sub-cluster and region.

Table 3.3. Pairwise FST values between mtDNA sub-clusters (1a - Torres Strait, 1b - Starke River to Airlie Beach, 2a - Midge Point to Gladstone, 2b - Bundaberg to Moreton Bay) calculated in Arlequin (distance method: pairwise differences).

Table 3.4. AMOVA of the mtDNA control region sequences for the combined northern clusters (Sub-clusters 1a and 1b) and southern clusters (Sub-clusters 2a and 2b). P-values were based on 1,000 permutations.

Table 3.5. Results from the multiple regression on distance matrices to examine relationships between individual pairwise genetic (microsatellite) distance and environmental seascape distances.

Table 4.1. Observed and expected heterozygosity (estimated across 1,000 iterations) of the three clusters.

Table 4.2. Pairwise FST values (estimated across 1,000 iterations) between the three clusters. Brackets show 95% confidence intervals.

Table 4.3. FIS values (estimated across 1,000 iterations) between the three clusters. Brackets show 95% confidence intervals.

Table 5.1. Primer sequences designed in this study and used to amplify potentially variable regions of S. warneri cultured from Queensland dugong faecal samples.

Table 5.2. The number of dugong faecal samples from each collection method and collection location attempted for bacterial culture (E. coli and S. warneri) and the number of isolates confirmed by sequencing as the target bacteria.

Table 5.3. Bacterial species identified in three dugong faecal samples (10 colonies each) using 16S sequencing methods.

Table 5.4. The genetic diversity observed at eight genome regions of S. warneri cultured from 20 Queensland dugong faecal samples.

Table 6.1. Dugong faecal sample collection sites (organised from north to south), number (n) of samples collected and seagrass species identified at each site - present (p) and absent (a).

Table 6.2. Shared and restricted (unique to specified locations) bacterial OTUs in dugong faecal samples collected from various locations along the east Queensland coast. p-phylum, c-class, o-order, f-family.

Table 7.1. Collection location and species identity of ten bacterial isolates cultured from wild dugong faeces (n=9) and marine sediment (n=1).

Table 7.2. Antimicrobial susceptibility determined by disc diffusion of bacteria cultured from wild dugong faecal samples (NW=Newry Region, MB=Moreton Bay) and a marine sediment sample from Queensland, Australia.

Table 7.3. Minimum inhibitory concentration (MIC, µg/ml) results of bacteria cultured from Queensland dugong faecal samples (NW=Newry Region, MB=Moreton Bay) and a marine sediment sample.

Table 7.4. Summary of sequence reads and alignment information for post-quality-control WGS aligned to reference sequences. Reference sequences were downloaded from GenBank. CP026085.1 used for E. coli isolates, CP003668.1 used for S. warneri isolates, CP016316.1 used for B. cereus and CP015224.1 used for L. sphaericus. %Q30 is the percentage of bases >Q30.

Table 7.5. Average nucleotide identity between E. coli WGS cultured from a dugong faecal sample (NW16031) collected from Newry Region, central Queensland, Australia.

Table 7.6. Resistance genes identified in the WGS of bacteria cultured from Queensland dugong faecal samples. HSP is the length of the alignment between the best matching resistance gene and the corresponding sequence in the WGS. N/A – not tested.

Table 7.7. Virulence genes identified by the database VirulenceFinder in the WGS of bacteria isolated from Queensland dugong faecal samples.

List of Abbreviations used in the thesis

AGRF Australian Genome Research Facility

AMOVA Analysis of Molecular Variance

ANOVA Analysis of Variance

BHI Brain Heart Infusion

BPW buffered peptone water

CLSI Clinical and Laboratory Standards Institute

ddRAD-Seq double digest RAD-Seq

DPA Dugong Protection Areas

GBR Great Barrier Reef

GBS genotyping by sequencing

HWE Hardy-Weinberg equilibrium

LGM last glacial maximum

MIC minimum inhibitory concentration

MLST multilocus sequence typing

mtDNA mitochondrial DNA

NGS Next-Generation Sequencing

PCoA Principal Coordinate Analysis

PCR polymerase chain reaction

QIIME 2 Quantitative Insights into Microbial Ecology 2

QTL quantitative trait loci

RAD-Seq Restriction site Associated DNA Sequencing

RRS reduced representation sequencing

SBA Sheep Blood Agar Columbia

SLIM Second-generation Louvain-la-Neuve Ice-ocean Model

SNP single nucleotide polymorphism

SST Sea Surface Temperature

TSS total-sum normalisation

VLS Veterinary Laboratory Services

WGS Whole Genome Sequencing

CHAPTER 1: General introduction and thesis aims

1.1 COASTAL THREATS

The world’s coastal ecosystems, including seagrasses, coral reefs, estuaries and coastal wetlands, provide important services to humans and other species whose distributions are within them (Gibson et al. 2007; Halpern et al. 2008). These services include habitat, food production, nutrient cycling, nursery grounds, barriers to erosion and regulation of water quality (Gibson et al. 2007). It is estimated that the global value of coastal ecosystems is ten times greater than that of terrestrial ecosystems (Costanza et al. 1997; Gibson et al. 2007). However, despite their ecological and economic importance, coastal ecosystems have dramatically declined (Waycott et al. 2009). The decline is mainly due to direct and indirect threats associated with the large and growing human population that lives close to (within 60 km of) the coast (Gray 1997). Threats to coastal ecosystems include over-fishing and hunting of important species, pollution (including chemicals, pathogenic bacteria and algal toxins), invasive species, climate change, physical alterations of coastal areas leading to erosion, tourism, and litter (Gray 1997; Gibson et al. 2007). However, the most significant threat to coastal ecosystems is habitat loss, which leads to local extinctions and to fragmented populations that lack dispersal connectivity (Gray 1997; Gibson et al. 2007). Developing ways of balancing the need for human activities and survival with the preservation of coastal species is vital. Conservation management of key coastal marine species potentially provides the opportunity for the maintenance and survival of coastal ecosystems and their constituent species in this evolving environment.

1.2 GENETIC CONSEQUENCES OF COASTAL THREATS

For wildlife species and populations to adapt to a changing environment, maintenance of genetic diversity is vital as the rate of evolutionary change is proportional to the level of genetic diversity (Frankham et al. 2004; Groom et al. 2006). Genetic diversity provides the ‘raw material’ for future adaptation and thus dictates the evolutionary ability of species to respond to environmental change (Frankham et al. 2004; Groom et al. 2006). In small, fragmented and threatened populations where genetic diversity has been reduced due to limited gene flow, their evolutionary potential and reproductive fitness may be restricted, increasing their extinction risk (Frankham et al. 2004; Groom et al. 2006). Therefore, species conservation requires the preservation of genetic diversity for survival (Allendorf and Leary 1986).

Measuring genetic diversity

Genetic diversity of a population or species is measured by the number and frequency of different alleles and genotypes present (Frankham et al. 2004; Hughes et al. 2008). Molecular techniques measure genetic variation at individual loci, with the extent of genetic diversity for the study group commonly assessed by the level of heterozygosity across many loci and the diversity of these loci assumed to be representative of the diversity across the whole genome (Frankham et al. 2004; Chapman et al. 2009). Advances in sampling methods, genetic markers and genetic technology (DeSalle and Amato 2004; Hudson 2008; Helyar et al. 2011) have markedly increased our ability to assess genetic diversity across a wide range of species, with samples collected non-invasively also able to generate high quality genetic data (Frankham 1995; Frankham et al. 2004). Genetic information collected from these studies allows assessment of the threats to the viability of a population by investigating the level of inbreeding, estimating effective breeding population sizes and determining the amount of current and past gene flow occurring between putative groups (DeSalle and Amato 2004; DeYoung and Honeycutt 2005; Groom et al. 2006; Hudson 2008). Additionally, genetic analysis enables taxonomic uncertainties to be resolved, and management units, breeding systems and parentage to be determined, and also allows for insights into species evolution and historical population structure (Frankham 1995; DeYoung and Honeycutt 2005; Groom et al. 2006; Hudson 2008).
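As an illustration of how heterozygosity at a single locus can be quantified, the following minimal sketch computes observed heterozygosity (the proportion of heterozygous individuals) and expected heterozygosity (one minus the sum of squared allele frequencies). The genotypes are hypothetical microsatellite allele sizes invented for the example, and the function names are illustrative rather than taken from any particular software package.

```python
from collections import Counter


def observed_heterozygosity(genotypes):
    """Proportion of individuals carrying two different alleles at the locus."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def expected_heterozygosity(genotypes):
    """Gene diversity at the locus: 1 minus the sum of squared allele frequencies."""
    alleles = [allele for genotype in genotypes for allele in genotype]
    counts = Counter(alleles)
    total = len(alleles)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical diploid genotypes (allele sizes in base pairs) for five individuals.
locus = [(180, 184), (180, 180), (184, 188), (180, 184), (188, 188)]

print(round(observed_heterozygosity(locus), 2))  # 0.6
print(round(expected_heterozygosity(locus), 2))  # 0.66
```

In practice these values would be averaged across many loci, and a small-sample correction is often applied to the expected value; neither step is shown here.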

Determining the degree of connectivity amongst wildlife populations is a central issue in conservation biology and has become particularly important in light of widespread habitat fragmentation (Groom et al. 2006; Kindlmann and Burel 2008). Immigration and dispersal between populations are important as they ensure the exchange of genetic material between populations, with sufficient levels of interbreeding between populations leading to fewer genetic differences and low genetic differentiation (Groom et al. 2006) as gene flow reduces the genetic effects of population fragmentation (Frankham et al. 2004). Measuring gene flow can be challenging when using observational studies as movement events are not always associated with effective breeding, movements are only documented over the observation period, and historical levels of dispersal cannot be determined (Groom et al. 2006). In contrast, studies employing genetic markers are able to more accurately and efficiently define dispersal rates from inferences of genetic differentiation amongst and within populations (Frankham et al. 2004; DeYoung and Honeycutt 2005). Individual immigrants can be identified from multilocus genotypes using assignment tests whereby an individual is considered a migrant if its genetic assignment to a population is not the same as the population from which it was sampled (Frankham et al. 2004; DeYoung and Honeycutt 2005). Assigning individuals to different genetic groups also determines whether or not there is evidence of population structuring, that is, whether or not individuals form distinct groupings due to differences in their genotype frequencies (DeYoung and Honeycutt 2005).
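The logic of an assignment test can be sketched as follows. This is a simplified, frequency-based illustration only: it assumes Hardy-Weinberg proportions and independent loci, uses hypothetical population names and allele frequencies, and applies an arbitrary frequency floor so that alleles unseen in a population do not produce a zero likelihood. It is not the procedure implemented in any specific assignment program.

```python
import math


def genotype_log_likelihood(genotype, allele_freqs, floor=0.01):
    """Log-likelihood of a multilocus genotype given one population's allele
    frequencies, assuming Hardy-Weinberg proportions and independent loci."""
    total = 0.0
    for locus, (a, b) in genotype.items():
        freqs = allele_freqs[locus]
        p = max(freqs.get(a, 0.0), floor)  # floor avoids log(0) for unseen alleles
        q = max(freqs.get(b, 0.0), floor)
        prob = p * p if a == b else 2.0 * p * q
        total += math.log(prob)
    return total


def assign_population(genotype, populations):
    """Name of the population under which the genotype is most likely."""
    return max(
        populations,
        key=lambda name: genotype_log_likelihood(genotype, populations[name]),
    )


# Hypothetical allele frequencies at two loci in two populations.
populations = {
    "north": {"locus1": {"A": 0.8, "B": 0.2}, "locus2": {"C": 0.7, "D": 0.3}},
    "south": {"locus1": {"A": 0.2, "B": 0.8}, "locus2": {"C": 0.1, "D": 0.9}},
}

# Hypothetical individual sampled in the southern population.
individual = {"locus1": ("A", "A"), "locus2": ("C", "D")}
sampled_in = "south"

assigned_to = assign_population(individual, populations)
is_putative_migrant = assigned_to != sampled_in
print(assigned_to, is_putative_migrant)  # north True
```

Here the individual is flagged as a putative migrant because its genotype is more likely under the northern allele frequencies than under those of the population in which it was sampled.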

Genetic and health effects of habitat loss and fragmentation

Habitat loss is the most significant cause of species population declines (Fahrig 1997; Frankham et al. 2004). Loss of habitat can result in small and/or fragmented populations where gene flow restrictions can be highly deleterious to the long-term survival of the population(s) (Frankham 1995). Human population growth and development have significantly reduced wildlife habitats across the globe, and in many cases have created new barriers to dispersal (e.g. cities, roads) that have led to fragmented and genetically non-connected populations (Manel et al. 2003; Frankham et al. 2004; DeYoung and Honeycutt 2005; Holderegger and Di Giulio 2010).

Small and fragmented habitats have a number of genetic consequences for populations. Small populations are more vulnerable to the effects of genetic drift, which will result in a loss of genetic diversity through stochastic loss of alleles (Frankham et al. 2004). Further, the choice of mates becomes restricted and inbreeding can occur. Inbreeding is considered one of the major threats to species survival, particularly in small and fragmented populations (Frankham et al. 2004; Groom et al. 2006). Inbreeding leads to reduced overall heterozygosity and increased homozygosity of deleterious recessive genotypes. The decrease in fitness and vigour as a result of inbreeding is known as inbreeding depression. Inbreeding depression typically results in reduced reproductive success and survival, and thus further reductions in population size and increased extinction risk due to the reduced fitness of individuals (Frankham 1995; Frankham et al. 2004; Groom et al. 2006; Chapman et al. 2009). With fragmentation and restricted dispersal possibilities, there is less opportunity for gene flow to break this cycle and reverse the effects of diversity loss and inbreeding.
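The effect of drift on diversity can be illustrated with the standard expectation for an idealised, isolated population, in which heterozygosity declines by a factor of 1 - 1/(2Ne) each generation, where Ne is the effective population size. The sketch below applies that expectation; the starting heterozygosity and the values of Ne are hypothetical and are not estimates for any dugong population.

```python
def heterozygosity_after_drift(h0, ne, generations):
    """Expected heterozygosity after a number of generations of drift in an
    idealised, isolated population: H_t = H_0 * (1 - 1 / (2 * Ne)) ** t."""
    return h0 * (1.0 - 1.0 / (2.0 * ne)) ** generations


# Hypothetical starting heterozygosity and effective population sizes.
h0 = 0.7
for ne in (50, 500):
    h10 = heterozygosity_after_drift(h0, ne, generations=10)
    print(ne, round(h10, 3))
```

Under these assumptions the smaller population loses heterozygosity roughly ten times faster per generation, which is why gene flow into small, fragmented populations matters for retaining diversity.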

Restricted gene flow between groups or populations due to limited interbreeding may result in genetic differentiation. Investigating the cause of genetic differentiation and restricted gene flow involves determining if there is sex biased dispersal, evidence of adaptations to specific environments and determining the barriers to effective breeding movements. This allows for assessment of how threats might impact on the connectivity and survival of species, particularly genetically isolated and fragmented populations. An understanding of dispersal, genetic population structure and their influencing factors is required for conservation management of threatened species.

Habitat or resource specialist species are particularly vulnerable to habitat loss as the loss of suitable habitat results in population declines and emigration, reducing genetic diversity and potentially leading to local extinctions. Therefore, conserving habitat and maintaining connectivity between habitats may be essential for sustaining genetically healthy populations in these wildlife species. The dugong (Dugong dugon, order Sirenia) is a long-lived species with a specialist habitat, restricted to shallow, coastal regions where its preferred seagrass species are present. These factors suggest that the dugong is a key indicator species for understanding the impacts of coastal habitat degradation and loss.

1.3 AIMS AND RATIONALE

The overall objective of this thesis was to evaluate the presence of coastal threats to dugong (Dugong dugon) populations along the east Queensland coast using a variety of genetic techniques and analyses. Specifically, dugong historical and contemporary movements and the factors that influence them were explored, as were the potential impacts of coastal threats to the health of the species in Queensland. This thesis will provide information to assist with dugong conservation in Queensland, a region that has shown recent population declines and is exposed to a number of threatening factors. Additionally, the dugong’s status as a key species within the Queensland coastal ecosystem makes it a potentially valuable sentinel species to assess the impact of coastal threats across the coastal environment.

The specific aims of this thesis were:

Aim 1 (Chapter 3) - Investigate the historical movements of dugongs along the entire east Queensland coast using microsatellite and mitochondrial genetic markers and evaluate ecological factors that may be influencing dugong movements.

Previous genetic studies of highly vagile marine mammals have identified fine-scale population structuring over relatively small distances, with ecological factors such as sea surface temperature suggested as explanatory variables for the observed population structure. Fine-scale population structuring has been identified previously in the dugong populations of southern Queensland; however, the reasons for this have not been explored. This chapter expands on current knowledge about Queensland dugong historical movements and identifies influencing ecological factors.

Aim 2 (Chapter 4) – Develop a suite of single nucleotide polymorphism (SNP) genetic markers to explore the genetic population structure of Queensland dugongs at the genome-wide level.

SNP markers are able to provide greater resolution compared to traditional markers when investigating population structure due to the significantly greater number of markers able to be readily developed. The greater resolution provided by the SNP markers will increase our understanding of Queensland dugong population structure and assist in future conservation efforts through the identification of highly informative markers.

Aim 3 (Chapter 5) - Explore the contemporary movements of dugongs in Queensland using a novel method in the marine environment, the commensal bacterial network.

Traditional methods used to study the contemporary movements of marine species, such as radio-tracking, are expensive, require large amounts of human contact hours and are challenging due to the nature of the marine environment. The construction of commensal bacterial networks based on shared bacterial genotypes in terrestrial species has revealed the same connections as found in social networks or from behavioural observations, providing a possible alternative method of determining contemporary species groupings and hence inference of movements. This method was trialled on dugongs in Queensland in an attempt to understand contemporary movements based on the presence of shared bacterial genotypes between individuals and populations.

Aim 4 (Chapter 6) – Characterise and compare the faecal microbiome of dugongs in Queensland.

The dugong’s microbiome has previously been investigated only for a single captive individual and in one population in sub-tropical Queensland; however, potential variations between geographical areas are unknown. The dugong’s reliance on seagrass is hypothesised to be a driver of population structure. Diet has been shown to be an influencing factor in shaping the bacterial composition of the gut and so gut microbiome differences might indicate restrictions of dugongs to specific seagrass habitats. In this chapter, differences in the dugong’s microbiome were explored and a preliminary assessment of the role diet plays in shaping the bacterial composition of the dugong’s gut was conducted.

Aim 5 (Chapter 7) – Investigate the presence of antibiotic resistant bacteria in Queensland dugongs using antimicrobial susceptibility testing and Whole Genome Sequencing (WGS).

The dugong’s shallow water coastal habitat means they are exposed to anthropogenic sources of contamination, including bacteria and antibiotics discharged directly and indirectly into their habitat. The aim of this chapter was to investigate the presence of antibiotic resistant bacteria in Queensland dugongs, to provide baseline data for future comparisons. This will enable future assessment of the impact coastal pollution has on the health of the coastal ecosystem that dugongs and other marine species inhabit.

1.4 THESIS OVERVIEW

Chapter 1 presents a general introduction to the thesis and presents the thesis aims. Chapter 2 presents a review of the relevant literature. Next, the historical population structure of dugongs along the east Queensland coast using microsatellite and mitochondrial DNA markers is reported in Chapter 3 along with the findings from the seascape genetics analysis. Chapter 4 uses newly developed SNP markers to investigate the historical population structure using a subset of samples from Chapter 3 and defines the most informative SNP markers for use in future dugong genetic studies. In Chapter 5, a novel method of determining contemporary movements is investigated using Queensland dugong faecal bacterial genotypes. The microbiome of dugong populations along the Queensland coast was investigated in Chapter 6 with reasons for differences explored. Chapter 7 presents the findings of antimicrobial susceptibility testing and WGS for bacterial species cultured from Queensland dugong faecal samples. To conclude, the overall findings, applications and future directions are discussed in Chapter 8.

CHAPTER 2: Literature review

2.1 DUGONGS

Distribution and habitat

The dugong is a herbivorous marine mammal with a distribution extending through the Indo-Pacific from east Africa to Vanuatu at latitudes between 26° north and 27° south of the equator (Marsh et al. 1999). They inhabit tropical and subtropical shallow coastal waters of more than 37 countries where suitable seagrass meadows are present (Figure 2.1; Marsh et al. 1999). Australia has been considered a stronghold for the species due to the large populations still found within its northern waters, with mostly remnant populations found elsewhere, many of which appear close to extinction (Marsh et al. 1999; Whiting 2008). Comprehensive surveys of northern Australia, particularly of Shark Bay, Western Australia, and Queensland, and of Saudi Arabia have provided detailed knowledge of dugong distributions around these countries (Preen 1989b; Marsh et al. 1999; Marsh et al. 2011; Sobtzick et al. 2012; Sobtzick et al. 2017); in contrast, dugong distribution in most other areas is relatively unknown (Marsh et al. 2011).

Figure 2.1. Global dugong distribution map. Sourced from Jefferson et al. (2015).

Globally, dugongs are listed as Vulnerable to extinction by the International Union for Conservation of Nature (IUCN) Red List of Threatened Species, with the last assessment occurring in July 2015 (Marsh and Sobtzick 2015; https://www.iucnredlist.org). The dugong’s distribution in shallow coastal waters means they are particularly vulnerable to human-mediated threats, with an estimated 82-85% of the dugong’s habitat range seriously threatened by human settlement (Marsh et al. 2011).

In Australia, dugongs are found between Shark Bay in Western Australia (25°S) around northern Australia and extending as far south as Moreton Bay in southern Queensland (27°S; Figure 2.2; Marsh et al. 2002). The largest populations in Australia are found in Shark Bay (Gales et al. 2004), Torres Strait (Marsh et al. 2004) and southern Queensland where herds of >100 individuals have been recorded (Preen 1992; Lanyon 2003). Along the Queensland coast, dugongs are found within numerous inshore bays where seagrass meadows are present.

Figure 2.2. Map showing the Australian dugong range. Sourced from Marsh et al. (1999).

Threats

The dugong’s shallow water coastal habitat makes it susceptible to a number of different anthropogenic threats, with threats to different populations largely dependent on the adjacent land use. Human related threats along the urbanised Queensland coast include runoff from urban, industrial and agricultural activities, dredging, trawling, pollution and mining (Marsh et al. 2002; Marsh et al. 2011). These activities contribute to the loss of seagrass habitat with increased sediment smothering seagrass meadows and increased turbidity reducing seagrass photosynthesis. Seagrass habitats are one of the most endangered ecosystems globally, with median rates of decline increasing from 0.9% per year before 1940 to 7% per year since 1990 (Waycott et al. 2009). Seagrass loss along the Queensland coast has potentially displaced dugongs from their primary habitat, impacting on dugong distribution and movement in the region. Human activities also release contaminants, including heavy metals, chemicals and antibiotics, into the dugong’s habitat, which could potentially impact on dugong health (Gray 1997; Costanzo et al. 2005; Haynes et al. 2005). Hunting, incidental drowning in fishing and shark nets and vessel strikes are other human threats that dugongs face and that have been directly linked to dugong deaths (Marsh et al. 2011). Additionally, habitat loss as a result of severe weather events, including cyclones and flooding, has also been associated with dugong deaths and displacement (Preen and Marsh 1995; Marsh et al. 2011; Sobtzick et al. 2012). Following the 1999 flooding of the Mary River, approximately 90% of the intertidal seagrasses were lost from the northern Great Sandy Strait, an area of significant habitat for dugongs, with full recovery of meadows taking three years (Campbell and McKenzie 2004). The dugong’s broad distribution throughout coastal Queensland and the fact that they are a long-lived species make them a good indicator species to assess the impact of coastal threats to the wider Queensland coastal environment. Also, the fact that dugong populations along the Queensland coast are adjacent to both urban and rural areas means the impact of the threats associated with either can be assessed and compared.

Estimates of Queensland dugong population sizes

Although there are a number of large dugong populations throughout Australia and in Queensland specifically, population declines and fluctuations have been indicated. Long-term modelling of dugong populations along the Queensland coast has indicated a significant decrease, with the current population suggested as only approximately 3% of 1960s levels based on Queensland government records of shark control programs (Marsh et al. 2001; Marsh et al. 2005). Dugong populations along the Queensland coast may be Critically Endangered based on population fluctuations and the threats they face in the region (Marsh et al. 2011).

Aerial strip transect surveys have been conducted at approximately five-yearly intervals since the mid-1980s along most of the Queensland coastline, including major dugong habitats, to determine changes in the abundance and distribution of dugongs in the region (Marsh and Saalfeld 1990a; Marsh and Saalfeld 1990b; Marsh et al. 1994; Marsh et al. 1996; Marsh and Lawler 2001; Marsh and Lawler 2007; Marsh et al. 2011; Sobtzick et al. 2012; Sobtzick et al. 2017). Moreton Bay (27.4°S), northern Great Barrier Reef (GBR; 12°S - 15°S) and Torres Strait (10°S) populations appear to be relatively stable, although fluctuations in estimates are evident (Table 2.1). In Hervey Bay (25°S), significant population declines occurred following the cyclone and major flooding events in 1992, with evidence of recovery in recent surveys (Table 2.1). In the southern GBR (16°S - 23°S) region however, there has been considerable variation in population size estimates, especially over the three most recent surveys, with a significant reduction in the 2011 survey but then a far greater population estimate from the most recent 2016 survey (Table 2.1). The cause of dugong population size estimate variation is unclear, although it may be the result of movements related to seagrass availability or deaths following severe weather events. During the summer of 2010/2011, severe flooding and cyclones adversely impacted the intertidal seagrass that dugongs rely on, with a large reduction in the percent cover of intertidal seagrass in the GBR region and an associated increase in dugong strandings along the Queensland coast (Sobtzick et al. 2012; Sobtzick et al. 2017). Emigration from the southern GBR region may also have contributed to the reduced population size in this region, with individuals moving to areas where seagrass availability and quality was superior. However, movement alone does not explain the large increase in population size in the southern GBR region determined from the 2016 aerial survey; the increase likely reflects issues with survey methodology, potentially due to differences in the tide times between surveys, which can affect the ability to detect dugongs.

Table 2.1. Relative abundance estimates (± standard error) of dugong populations along the Queensland coast calculated from aerial survey data.

Year  Moreton Bay   Hervey Bay      Southern GBR    Northern GBR      Torres Strait
1985  -             -               -               8110 (1073) a     -
1986  -             -               3434 (456) a    -                 -
1987  -             -               -               -                 13319 (2136) o
1988  442 (69) a    2175 (419) g    -               -                 -
1990  -             -               -               10471 (1578) k    -
1991  -             -               -               -                 24225 (3276) o
1992  -             1088 (382) g    1757 (286) i    -                 -
1993  -             524 (124) g     -               -                 -
1994  -             695 (140) h     1575 (233) h    -                 -
1995  968 (44) b    -               -               9444 (1381) k     -
1999  -             1653 (248) c    3911 (637) j    -                 -
2000  344 (88) c    -               -               9730 (1485) l     -
2001  -             -               -               -                 13465 (2152) p
2005  453 (97) d    1388 (323) d    1558 (300) d    -                 -
2006  -             -               -               8812 (1769) m     14767 (2292) m
2011  696 (106) e   1438 (438) e    537 (223) e     -                 12603 (2080) n
2013  -             -               -               6558 (1141) n     15727 (2942) n
2016  601 (80) f    2055 (382) f    2822 (600) f    -                 -

Different methods of calculating population sizes have been used in the past, with the Hagihara et al. (2014) method currently considered most accurate as it makes fewer assumptions. This method was used to calculate the relative abundance of Moreton and Hervey Bays and the southern GBR from the most recent aerial surveys (2016), with recalculations available for only two past surveys. The Pollock et al. (2006) method was used to calculate the northern GBR and Torres Strait relative abundances for 2000-2013. All pre-2000 estimates were calculated using the Marsh and Sinclair (1989) method.

a - Marsh and Saalfeld (1990a&b), b - Lanyon (2003), c - Lawler (2002), d - Marsh and Lawler (2007), e - Sobtzick et al. (2012), f - Sobtzick et al. (2017), g - Preen and Marsh (1995), h - Marsh et al. (1996), i - Marsh et al. (1994), j - Marsh and Lawler (2001), k - Marsh and Corkeron (1996), l - Marsh and Lawler (2002), m - Marsh et al. (2007), n - Sobtzick et al. (2014), o - Marsh et al. (1991), p - Marsh et al. (2004)

Life history and breeding

Dugongs are long-lived mammals with a lifespan of up to 70 years (Marsh et al. 1984c). Reproduction in dugongs appears to be dependent on seagrass availability and quality, which vary spatially and temporally due to differing growth conditions (Preen 1995; Marsh and Kwan 2008). From counts of the growth layer groups in the tusks of dead dugongs, age at reproductive maturity for males and females is spatially variable, ranging from ~6 - 15 years (Marsh 1980; Marsh et al. 1984a; Marsh et al. 1984b; Marsh et al. 1984c; Kwan 2002). Females from Torres Strait (far north Queensland) bear offspring at a younger age and at a smaller body length (Marsh et al. 1984c; Kwan 2002) compared to females in southern Queensland, whose maturation is the most protracted so far recorded (Burgess 2012). Female dugongs give birth to one calf every 2.6-6.8 years with a gestation period estimated to be between 12.4-17.3 months (Marsh et al. 1984c; Kwan 2002; Marsh and Kwan 2008). Calves remain with their mothers for an extended period and suckle for a minimum of 14-18 months (Marsh 1995). In southern Queensland, the mating season appears to occur between September-November (Burgess et al. 2012), and although there have been few observations of the mating behaviour of dugongs, it has been suggested that it is polygamous and promiscuous in nature (Preen 1989a; Burgess 2012). Due to their long lifespan, low reproductive rate, long generation time and large investment in each offspring, dugongs require a high survival rate to maintain population numbers (Marsh et al. 1984c) and will take a long time to recover from population declines.

Diet

Dugongs are seagrass specialists, with their distribution reflecting and limited to areas where their preferred tropical seagrass species are found (Marsh et al. 2002; Marsh et al. 2011). Seagrasses that appear to be most frequently selected by dugongs are in the Halophila and Halodule genera; these pioneer species are highest in available nitrogen and lowest in structural fibre and therefore more digestible by dugongs (Lanyon 1991; Marsh et al. 2002). However, seagrass growth and nutritional quality are highly influenced by climatic variables, including daily ambient temperature, rainfall (and the associated run-off) and day-length (Perez and Romero 1992; Lee et al. 2007; Allgeier et al. 2011). Differences in these conditions occur along the Queensland coast, with seagrass species diversity and biomass greater in northern than southern Queensland (Preen 1992; Long et al. 1993). In southern Queensland, the distribution limit of many tropical seagrass species, greater seasonal variation in climate is associated with marked seasonal variation in the nutrient availability and biomass of seagrass meadows, with dugongs in this area dependent on seagrasses that are nutrient-limited during winter (Preen 1992; Preen 1995).

Movement

Generally, dugongs do not make annual or seasonal migratory movements, unlike their relative the Florida manatee (Trichechus manatus latirostris, order Sirenia), which makes migratory movements in response to sea temperature changes (Deutsch et al. 2003).

Along the Queensland coast, dugong movements appear to be individualistic with tracking data showing a few individuals making large scale movements, while most only move small distances (Sheppard et al. 2006; Marsh et al. 2011). Tracking of 64 individuals captured at various locations along the Queensland coast between 1987 and 2006 using VHF and satellite transmitters found the majority of movements were less than 100 km in length between local seagrass meadows (Sheppard et al. 2006). However, some individuals (~19%) travelled distances greater than 100 km, with an individual caught at Hinchinbrook Island (18.25°S) travelling in excess of 500 km (Sheppard et al. 2006). This suggests dugongs are capable of moving large distances but rarely do so. Additionally, the tracking data demonstrated some individuals made return movements which appeared to be ranging in nature rather than dispersive due to the short period of time spent at the new location (Sheppard et al. 2006). Social cues, for example males looking for breeding females, may be driving these ranging movements (Sheppard et al. 2006). Support for ranging movements is also evident from the fact that many individuals apparently bypassed suitable dugong habitat, spending little or no time foraging (Sheppard et al. 2006). Telemetry tracking of dugongs has been conducted sporadically, likely due to the high expense; therefore, alternative methods for studying dugong movements require development.

Factors that have been suggested to influence dugong movements include changes in ocean temperature and tides (Sheppard et al. 2006; Sheppard et al. 2009; Zeh et al. 2018). For the Florida manatee, movements in response to ocean temperature are well documented, with mass migration events occurring during the cooler winter months, when manatees move towards southern Florida where water temperatures are warmer (Deutsch et al. 2003). While dugong mass migration events have not been recorded, there is evidence of movements consistent with thermal gain in the dugong’s sub-tropical range. At least some of the satellite tracked individuals investigated by Sheppard et al. (2006) made movements towards warmer waters, with localised daily movements documented in the Moreton Bay area where dugongs would move out of the bay and into the adjacent South Passage, an area where seagrass appears to be absent but where ocean temperatures are up to 5.5°C warmer (Zeh et al. 2018). However, movement events in response to temperature appear to be dependent on dugong nutritional requirements, with the activity spaces of dugongs in Moreton Bay reduced and limited to just inside South Passage during the cooler months to still allow for adequate foraging opportunities within the adjacent seagrass meadow (Zeh et al. 2018). In addition to temperature, daily movements appear to be tidally mediated, presumably in an effort to reduce energy expenditure and to maximise foraging. Almost 90% of trips in and out of Moreton Bay were made with the assistance of tidal transport, i.e. moving with the direction of the tide, which allowed individuals to access intertidal seagrass that is not accessible during low tide (Zeh et al. 2018). However, factors influencing dugong movements throughout the rest of their tropical range are not well understood.

Population genetics

Investigations of the population genetics of dugongs globally have been limited, with the Australian population most intensely studied. Initial investigation of the genetic structure and connectivity of Australian dugongs was undertaken with mitochondrial DNA (mtDNA) sequences, a genetic marker which is maternally inherited and which can be informative for investigating historical population structuring. Analysis of mtDNA control region sequences identified two distinct lineages (McDonald 2005; Blair et al. 2014). One lineage, which authors termed the widespread lineage, was found across the entire Australian dugong range, but was rare in Moreton Bay, while the other lineage, termed the restricted lineage, was found along the east coast of Queensland and in the Torres Strait but was rarely found further west (McDonald 2005; Blair et al. 2014). The division of the two lineages was hypothesised to have been caused by glacial cycles that occurred during the Pleistocene and the resultant Torres Strait landbridge, a physical barrier separating the east coast of Queensland from the rest of northern Australia when sea levels were low (Blair et al. 2014), which significantly reduced gene flow across the Cape York Peninsula. Although there have been fluctuations in sea levels over the past 2.5 million years, it is thought that there were limited instances when sea levels were at or above present day heights (Shackleton 1987; Raymo et al. 2006; Blair et al. 2014). Dugong movements across the Torres Strait region would have been limited until the landbridge was submerged after the most recent glacial period (~7,000 years ago), resulting in distinct genetic differences between dugongs on either side of the Torres Strait landbridge (Blair et al. 2014). The mtDNA analysis suggests that the landbridge in the Torres Strait likely contributed significantly to the genetic structuring of the Australian population, leaving a persisting genetic signature in Australian dugongs (Blair et al. 2014). However, as there is not a clear division of the two lineages either side of the Torres Strait, the landbridge may not be the only cause of the genetic structuring. Recent investigations of the phylogeography across the global dugong distribution using mtDNA have suggested the presence of significant genetic variation and geographical structuring (Plon et al. 2019).

Further investigation of dugong population structure using nuclear markers (seven microsatellite loci), developed originally for the related Florida manatee, concluded that there was no significant geographic structuring across the Australian population based on the STRUCTURE analysis (McDonald 2005). High genetic diversity within the Australian population was indicated with evidence of connectivity and gene flow among populations (McDonald 2005). These findings contrast with the mtDNA analysis, which determined that gene flow was, to some extent, restricted throughout the Australian dugong range due to past barriers to dispersal, although panmixia was not evident in the microsatellite analysis (McDonald 2005; Blair et al. 2014). Sex-biased dispersal has not been evident from dugong satellite tracking data; however, there has been only limited sampling conducted (Sheppard et al. 2006). A deficit in the number of genetic markers and their ability to detect dugong population structure may explain the lack of structuring identified.

An expanded set of nuclear markers (24 microsatellite loci) identified fine-scale population structuring of the southern Queensland dugong populations over relatively small distances (Seddon et al. 2014). Significant genetic differentiation (Fst values 0.005-0.339) was identified between all four major populations along the southern Queensland coast (Moreton Bay, Hervey Bay, Great Sandy Straits and Shoalwater Bay) based on microsatellite and mtDNA analyses (Seddon et al. 2014). Bayesian clustering analysis identified the Moreton Bay population as a distinct cluster and suggested this population to be a separate breeding group to the other southern Queensland populations (Seddon et al. 2014). Sex-biased dispersal was not evident within this region, with no differences between female and male genetic structuring identified (Seddon et al. 2014). The finding of significant genetic differentiation among the four southern Queensland populations suggested that if movement is occurring between populations, movement events are not always translating into effective breeding. There was evidence of individual movements of a small number of individuals translating into lower levels of genetic differentiation among neighbouring populations, although this did not result in panmixia (Seddon et al. 2014). It has been suggested that the genetic structuring among the southern Queensland populations may be due to the large distances between suitable seagrass habitat (Figure 2.3), with dugongs having to travel across exposed parts of the coast and across deeper waters to reach them (Blair et al. 2014; Seddon et al. 2014). This discontinuous distribution of seagrass meadows may have created a barrier to gene flow which is reflected in the population structuring of the southern Queensland populations (Seddon et al. 2014).

Figure 2.3. Maps of southern Queensland seagrass meadows. Sourced from McKenzie et al. (2010).

It is difficult to obtain accurate estimates of contemporary movements by dugongs among populations due to their cryptic nature. One attempt to do so used pedigree analysis of dugongs based on microsatellite genotypes; pedigrees were reconstructed and movement inferred from parent-offspring relationships in which the parent and offspring were found in different locations (Cope et al. 2015). This analysis showed that the amount of movement between Moreton Bay and Hervey Bay/Great Sandy Straits was between 2.15% and 3% annually (Cope et al. 2015). This level of movement was higher than suggested by the previous genetic analysis of the same populations and provided evidence to support the assumption that not all dugong movement is for the purpose of breeding (Seddon et al. 2014; Cope et al. 2015). While dugongs are capable of travelling long distances and can travel across deep ocean water (Marsh et al. 2002; Sheppard et al. 2006), such dispersal events to other areas do not appear common enough to result in changes in dugong genetic structure (Seddon et al. 2014). Alternatively, if movement is occurring, it is not resulting in breeding. Further investigation into the population structure of Australian dugongs using highly informative genetic markers is required, as is investigation of the possible barriers to effective breeding dispersal and movement of dugongs. Population genomic approaches have recently been more widely applied to wildlife species, providing greater insights (see Chapter 4 Introduction).

Dugong conservation management in Queensland

In Australia, dugongs are protected under the Federal Government’s Environment Protection and Biodiversity Conservation Act 1999 (EPBC Act) as a marine and migratory species, with dugongs in Queensland protected by the Nature Conservation (Wildlife) Regulation 2006 under the Nature Conservation Act 1992 (QLD) as Vulnerable to Extinction. In an effort to conserve Queensland dugong populations, a number of management initiatives have been introduced. The Great Barrier Reef Marine Park extends from just north of Bundaberg (24.3°S) to Cape York (10.4°S) and includes a number of Dugong Protection Areas where there are strict controls over netting practices to minimise bycatch (http://www.gbrmpa.gov.au/access-and-use/zoning/special-management-areas). The Moreton Bay and Great Sandy Marine Parks in southern Queensland include zoned protection areas, such as Go Slow areas to minimise boat strikes, and trawling and fishing restrictions in specific areas. Dugong strandings and mortalities are monitored by the Queensland Government Department of Environment and Science StrandNET program (https://environment.des.qld.gov.au/wildlife/animals/caring-for-wildlife/marine-strandings/data-reports/annual-reports). The Reef Water Quality Protection Plan 2013 (https://www.reefplan.qld.gov.au/__data/assets/pdf_file/0016/46123/reef-plan-2013.pdf) includes strategies to minimise the impact of terrestrial runoff on seagrass meadows. The Torres Strait Regional Authority’s Dugong and Turtle Management Project (http://www.tsra.gov.au/the-tsra/programmes/env-mgt-program/our-projects/dugong-and-turtle-management-project) aims to sustainably manage dugongs in the region with respect to indigenous culture and contemporary science. However, while management strategies exist for the main dugong habitats along the Queensland coast, there is little or no coordination between marine parks to ensure that movement corridors are protected to maintain dispersal rates.
Investigation of the gene flow, population structure and threats to the Queensland dugong populations is required to determine if there are restrictions to dugong dispersal and, if so, what factors may be contributing to this. This information will inform strategic conservation and management decisions.

Chapter 3 – Contributions by authors to submitted manuscript

My contribution to the submitted manuscript included: study design and concept (10%), performed experiments (80%), data analysis (70%), writing of the paper (95%) and editing of the paper (25%). Contributions by both Prof Jennifer Seddon and Dr Janet Lanyon each included: study design and concept (40%) and editing of the paper (20%). Dr Nicholas Clark contributed to: data analysis (15%) and editing of the paper (10%). Prof David Blair contributed to: study design and concept (5%), performed experiments (20%), data analysis (10%), writing of the paper (5%) and editing of the paper (10%). Prof Helene Marsh contributed to: study design and concept (5%) and editing of the paper (10%). Prof Eric Wolanski contributed to: data analysis (5%) and editing of the paper (5%).


CHAPTER 3: Seascape genetics of a mobile marine mammal: evidence of an abrupt break in dugong (Dugong dugon, Müller) gene flow along Australia’s eastern Queensland coast

3.1 ABSTRACT

Despite the lack of obvious physical barriers and their ability to travel significant distances, many marine mammals exhibit substantial population structuring over relatively short geographical distances. The dugong (Dugong dugon, Müller) is the only representative of family Dugongidae and its restriction to shallow coastal waters puts it at risk of the effects of urban and industrial coastal development. Although satellite tracking shows that dugongs in Queensland are capable of travelling >500 km, previous research identified at least two genetically distinct populations in southern Queensland separated by only a few hundred kilometres. This study investigated the genetic population structure of dugongs along the entire eastern Queensland coast (>2,000 km). Using 22 microsatellite loci and mitochondrial control region sequences, we employed a seascape genetics approach to test ecological factors that may be associated with the observed population structure. Bayesian clustering analysis of microsatellite genotypes from 293 dugongs sampled between Moreton Bay (27.4°S) and the Torres Strait (10°S) identified an abrupt genetic break in the Whitsunday Islands region (20.32°S), a finding supported by analysis of mitochondrial control region sequences. The seascape genetics analysis found that sea surface temperature and distribution of seagrass were not significantly associated with genetic structure. Oceanographic modelling of the Whitsunday Islands region detected the presence of the ‘sticky water’ phenomenon, opening questions about the role of this oceanographic effect in the movement of marine mammals.

3.2 INTRODUCTION

Identification of the landscape features that underpin the existence and adaptations of populations is critical for effective conservation management. It can be challenging to identify these features in the marine environment because of the limited opportunities to observe dispersal and behaviour of marine taxa compared with terrestrial species (Selkoe et al. 2016). However, seascape genetic studies have identified a number of highly vagile marine mammals with continuous habitats that exhibit significant genetic structuring, even in the absence of obvious barriers to gene flow. This population structuring has been linked to the presence of physical (e.g. Viricel and Rosel 2014), social (e.g. Hoelzel et al. 2007) and/or ecological barriers (e.g. Amaral et al. 2012). For example, differences in habitat and resource use have been hypothesised to result in genetic subdivision between inshore and offshore bottlenose dolphins (Tursiops spp., Sellas, Wells and Rosel 2005). Geographic distance and habitat choice (differences in water depth and sea surface temperature) were suggested as explanations for the distinction between four identified genetic clusters from the western North Atlantic, Gulf of Mexico and the Azores Islands in the Atlantic spotted dolphin (Stenella frontalis, Viricel and Rosel 2014). In addition, social behaviours have been found to affect the population structure of marine mammals through their influence on dispersal. Hoelzel et al. (2007) suggested that the evolution of the population structure of killer whale (Orcinus orca) populations in the North Pacific was linked to the strict sociality of groups, whereby differences in prey detection and capture methods between social groups may have led to specialisations on local prey resources, eventually creating lasting regional differentiation. 
The idea of matrilineal social systems leading to genetic differences between groups of cetaceans has been proposed for multiple species including the killer whale, sperm whale and other whale species (see Whitehead 2017).

Several ecological variables associated with the genetic structure of marine mammals have helped to explain the fine-scale population structuring observed (Fontaine et al. 2007; Mendez et al. 2010; Mendez et al. 2011; Amaral et al. 2012; Viricel and Rosel 2014). These include sea surface temperature (Fontaine et al. 2007; Mendez et al. 2010; Mendez et al. 2011; Amaral et al. 2012; Viricel and Rosel 2014), chlorophyll concentration (Fontaine et al. 2007; Mendez et al. 2010; Mendez et al. 2011; Amaral et al. 2012), water turbidity (Mendez et al. 2010; Mendez et al. 2011; Amaral et al. 2012; Viricel and Rosel 2014) and dissolved matter (Mendez et al. 2011). A recent review of seascape genetic studies (Selkoe et al. 2016) suggested that temperature could be as influential as geography in explaining regional-scale population genetic structure, with 43% of multi-factor studies finding temperature to be associated with genetic structure. Accounting for possible influences of ecological variables, such as temperature, on population differentiation is potentially relevant to management decisions. In this study, we sought to extend our understanding of the impacts ecological variables have on population structuring in the marine environment to dugong (Dugong dugon, Müller, order Sirenia) populations along the eastern Queensland coast.

The dugong is an obligate marine herbivore that favours shallow coastal regions where meadows of seagrass (their preferred food source) are present (Marsh et al. 2011). The dugong’s distribution covers the tropical and sub-tropical coastal and island regions of the Indo-West Pacific region. Australia is considered a stronghold for the species as its populations are the largest globally in terms of distribution and total population size (Marsh et al. 1999; Marsh et al. 2011). In Australia, dugongs are found along the coastline from Shark Bay in Western Australia (25.4°S, 113.5°E), around the north Australian coast to Moreton Bay, Queensland (27.3°S, 153.3°E).

Dugongs are listed as vulnerable to extinction at a global scale (Marsh and Sobtzick 2015); however, their conservation status is spatially variable. There are particular concerns about the conservation status of the species along Australia’s eastern Queensland coast south of Cooktown due to apparent significant recent declines in dugong numbers, particularly in the southern Great Barrier Reef (GBR) region (Marsh et al. 2011; Marsh et al. 2019). Whilst the causes of these population declines remain unclear, suggestions include loss of habitat due to coastal development, fatal interactions with boats, fishing gear and shark nets, and traditional hunting (Marsh et al. 2002). However, apparent regional declines could also be the result of temporary emigration due to the loss of seagrass meadows following cyclones and floods (Preen and Marsh 1995; Marsh et al. 2011).

A close relative of the dugong, the Florida manatee (Trichechus manatus latirostris, order Sirenia), is considered migratory as the species displays predictable movement patterns in response to changes in sea temperature (Deutsch et al. 2003). In contrast, a lack of consistency in the direction and length of movements made by Queensland dugongs indicates that movements are individualistic (Marsh et al. 2011) and are most likely ranging (movements out and back from a home range) in nature. Most dugongs that have been tracked by satellite telemetry along the eastern Queensland coast have remained relatively sedentary, although some have travelled distances in excess of 100 km, with two individuals tracked more than 500 km from their capture locations (Sheppard et al. 2006). The classification of such movements as ranging is supported by the finding of movement rates amongst the southern Queensland populations of 2.15-3.00% per year, whilst population genetic data suggest a migration rate of only 4-5% per generation, with evidence of genetically distinct breeding groups (Seddon et al. 2014; Cope et al. 2015). The annual movement estimate suggests markedly more movement between localities than detected through repeated direct sampling of individuals (Seddon et al. 2014) or through telemetry (Sheppard et al. 2006). These data also indicate that most movement events are not translating into breeding events.

Dugong distribution and movements have previously been linked to changes in ocean temperature in some areas, with populations in higher latitude habitats making meso- or micro-scale movements in winter in a proposed effort to escape colder waters (Sheppard et al. 2006). Zeh et al. (2018) also suggested that dugongs move in response to changes in ocean temperature at a local scale with some dugongs in Moreton Bay tracked moving between the seagrass beds inside the bay where they feed to the warmer waters outside the bay, particularly in the cooler autumn months. Other factors linked to dugong distribution and movements include distribution of seagrass beds, water depth and tides (Sheppard et al. 2009; Marsh et al. 2011). The identified ecological drivers of genetic structure of other marine mammals can be linked with ocean productivity (Behrenfeld and Falkowski 1997), high fish biomass (Longhurst 2006) and therefore regions of high prey availability, factors that are likely to be less relevant to the herbivorous dugong.

The only previously identified physical barrier to dugong movements in Australian waters is the former Pleistocene Torres Strait landbridge that stretched from the Cape York Peninsula north to New Guinea (Blair et al. 2014). The landbridge, which formed during periods of low sea levels in the past, most recently existed between ~115,000 and ~7,000 years ago and likely created a barrier to dugong movement across the Torres Strait until it was finally submerged by rising sea levels. The differentiation of two major mitochondrial lineages, from analysis of mitochondrial control region sequences, in dugongs has been attributed to this landbridge, with one lineage rarely found to the west of the Torres Strait (restricted lineage) whilst the other is widespread across the north Australian range, although found infrequently in southern Queensland (widespread lineage; Blair et al. 2014).

Genetic structuring among dugong populations in Queensland, Australia, has only been examined in detail in the most southerly populations, in which fine-scale population structuring occurs over relatively short distances (Seddon et al. 2014). The drivers of the genetic differentiation in this region were not identified but were likely related to distances between coastal seagrass beds (Seddon et al. 2014) or possibly to physical or ecological variables, as has been seen in other marine mammals. Considering the ongoing concern over dugong population declines in the region, further research is needed to identify variables that act as barriers to gene flow between dugong populations. The aims of the present study were to investigate genetic population structure and connectivity of dugong populations along the entire east Queensland coast from Torres Strait in the north to Moreton Bay in the south and to test the association of relevant ecological factors with the observed genetic structure.

3.3 METHODS

Study sites

Tissue samples were collected from dugongs (live and recovered carcasses) from major dugong foraging sites along the Queensland coast from Torres Strait, north Queensland (10°S), south to Moreton Bay in south-east Queensland (27.4°S), a distance of >2,000 km (Figure 3.1; Appendix 3.1). Dugong foraging grounds were identified as areas in which substantial numbers of dugongs have been frequently recorded during dedicated aerial surveys along the Queensland coast, conducted at five-yearly intervals since 1986 using standard strip transect methods (Marsh and Saalfeld 1990a; Marsh and Saalfeld 1990b; Marsh et al. 1994; Marsh et al. 1996; Marsh and Lawler 2001; Marsh and Lawler 2007; Marsh et al. 2011; Sobtzick et al. 2012; Sobtzick et al. 2017). Some of these sites have been designated as Dugong Protection Areas (DPAs; GBRMPA 2011). Samples were collected from all DPAs (see Appendix 3.1), except Hinchinbrook (carcass samples available) and Taylor’s Beach. Additional tissue samples were obtained opportunistically from areas outside the DPAs in an effort to sample from as much of the east Queensland coast as possible.

Figure 3.1. Map of Queensland, Australia, showing the cumulative assignment probabilities of individual dugongs sampled at each coastal location to one of the two clusters identified by the best-fitting STRUCTURE model. Colours of circles are shown on a gradient ranging from blue (Cluster 1) to white (uncertainty in assignments) to orange (Cluster 2). The number of samples collected at each location is represented by the size of the circles. The STRUCTURE plot for K = 2 is shown alongside the map with an arrow indicating the location of the genetic break in the Whitsunday Islands region.

Sample collection

Tissue samples (skin, muscle and/or liver, n = 249) were collected from live and dead dugongs using several methods over the period August 1997 to April 2016 inclusive. Skin samples were collected from the dorsum of live free-swimming dugongs as they surfaced to breathe, using a hand-held scraping device deployed from a boat (Lanyon et al. 2010a). Skin samples were also collected during live capture (Lanyon et al. 2006; Sheppard et al. 2006). Three cow/calf pairs were skin scraped and in each case the sample from the mother was removed from the analysis. Tissue samples (skin, liver and/or muscle) were also collected opportunistically from dugong carcasses recovered along the Queensland coast. Tissue samples were from dugongs of both sexes and all age classes. Fresh skin samples and the majority of carcass samples were stored in salt-saturated dimethyl sulfoxide (DMSO, 20%) solution and placed on ice whilst in the field, and then stored frozen at -20°C. The remaining carcass samples were dry frozen at -20°C.

Dugongs were sampled under The University of Queensland Animal Ethics Permits #ZOO/ENT/344/04, #SBS/290/11, SBS/360/14, SBS/181/18, Scientific Purposes Permits #WISP01660304, WISP03294105, WISP04937308, WISP07255110 and WISP14654414, Moreton Bay Marine Parks Permits #QS2000 to #QS2010CV L228 and MPP18-001119, Great Sandy Marine Parks Permit QS2010-GS043 and Great Barrier Reef Marine Park Permits #G07=23274:1 and G14/36987.1.

DNA extraction

DNA from ~10 mg of dugong tissue was extracted using the MagJET Genomic DNA Kit and KingFisher Flex (Thermo Scientific) with an overnight digestion at 56°C in 200 µl of digestion solution, 20 µl of Proteinase K and 2 µl 1M Dithiothreitol. The final elution volume was ~100 µl with final concentrations ranging between 13-3810 ng/µl (assessed by NanoDrop ND-1000).

Microsatellite genotyping

Dugong samples (n = 249) were genotyped with 24 species-specific microsatellite loci (Broderick et al. 2007). The forward primers had a 5’ M13 complementary tail for labelling with a fluorescent M13 primer (Schuelke 2000). Eight multiplex polymerase chain reactions (PCR) were performed, each containing ~10 ng genomic DNA, 0.03-0.18 µM reverse primer, 0.01-0.04 µM of M13-tailed forward primer, 0.25-0.44 µM dye-labelled M13 primer, 0.6 µl of Qiagen’s 5x Q solution and 3.0 µl of Qiagen’s 2x Multiplex Kit master mix (containing HotStarTaq DNA polymerase, MgCl2, dNTPs and PCR buffers) for a total of 6 µl per reaction. Multiplex 1 contained loci DduE11, DduG11 and DduH09; multiplex 2 contained DduC11, DduF07 and DduH04; multiplex 3 contained DduE03, DduE09 and DduH02; multiplex 4 contained DduA01, DduF06 and DduG10; multiplex 5 contained DduB02, DduB05 and DduC03; multiplex 6 contained DduE08, DduC09 and DduB01; multiplex 7 contained DduD11 and DduE04; multiplex 8 contained DduA12, DduF11, DduG12 and DduD08. PCR conditions were as follows: 94°C for 15 min, followed by 35 cycles of 94°C for 30 s, 58°C for 45 s and 72°C for 90 s with a final extension at 72°C for 45 min. PCR amplicons were separated by capillary electrophoresis on an ABI3730 sequencer (multiplexes 1-4 and 5-8 were separately pooled prior to electrophoresis) and scored using GeneMapper software (version 5, Applied Biosystems).

Null alleles were investigated using MICRO-CHECKER version 2.2.1 (Van Oosterhout et al. 2004) and loci DduF07 and DduD08 were removed due to excess homozygosity at these loci. Deviation from Hardy-Weinberg equilibrium (HWE) was assessed in GENEPOP web version 4.2 using exact tests (Raymond and Rousset 1995). Only samples that were genotyped at 13 or more loci were included in the analysis, eliminating 37 samples from the analysis. Samples were also investigated for duplicates using GenAlEx (version 6.5, Peakall and Smouse 2012). Nine pairs of samples were found to be duplicates based on exact matching of genotypes and, where available, on biological data and one of each duplicate pair was removed. Following this, there were 203 samples suitable for further analysis.
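The sample-screening steps described above can be illustrated with a minimal Python sketch. It is not the GenAlEx procedure used in this study: the data layout (a dictionary of per-locus allele pairs, with None marking a failed call), the function names and the rule of keeping the first sample identifier of each duplicate group are all assumptions made for illustration, and the sketch does not use the biological data that informed the actual duplicate checks.

```python
from collections import defaultdict

MIN_LOCI = 13  # minimum number of genotyped loci, as stated in the text


def count_scored_loci(genotype):
    """Count loci at which both alleles were scored (None marks a missing call)."""
    return sum(1 for a, b in genotype if a is not None and b is not None)


def filter_and_deduplicate(samples):
    """Drop under-genotyped samples, then keep one sample per exact genotype match.

    samples: dict mapping a (hypothetical) sample ID to a list of
    (allele1, allele2) tuples, one tuple per locus.
    """
    kept = {sid: g for sid, g in samples.items()
            if count_scored_loci(g) >= MIN_LOCI}

    groups = defaultdict(list)
    for sid, genotype in kept.items():
        # Order alleles within each locus so (180, 184) matches (184, 180);
        # missing alleles are coded as -1 purely so they can be sorted.
        key = tuple(
            tuple(sorted(-1 if a is None else a for a in pair))
            for pair in genotype
        )
        groups[key].append(sid)

    # Illustrative rule: keep the first ID (alphabetically) of each group.
    return sorted(sorted(ids)[0] for ids in groups.values())
```

Applied to the counts reported above, removing 37 under-genotyped samples and one member of each of the nine duplicate pairs from the 249 genotyped samples leaves 249 − 37 − 9 = 203 samples.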

In addition, we included a further 90 individuals (selected at random) from southern Queensland sampled in Moreton Bay (n = 30), Hervey Bay (n = 30) and Great Sandy Straits (n = 30) and reported in Seddon et al. (2014). Only thirty samples were selected from each location so as to not have a disproportionately high number of samples from one location. Including samples from southern Queensland ensured that our dataset covered the entire east Queensland coast region, giving a total of 293 genotypes for final analysis (see Figure 3.1).

Population differentiation – Microsatellite data

Bayesian clustering methods were employed to deduce the underlying population structure based on variation in allele frequencies for the entire microsatellite dataset using STRUCTURE 2.3.4 (Pritchard et al. 2000). The STRUCTURE analysis was repeated separately for each of the identified clusters to identify possible evidence of sub-structuring. Individuals were assigned to clusters without prior population definition. For all analyses, the ‘admixture’ model was used with correlated allele frequencies as each population was not considered to be discrete. Ten independent runs were performed for K = 1-10 clusters with 300,000 iterations and a discarded burn-in length of 150,000 for each run. The number of clusters (K) best fitting the dataset was inferred using the log probability of data [LnP(D)] and the ΔK index defined by Evanno, Regnaut and Goudet (2005). In addition, a hierarchical clustering dendrogram was constructed from the pairwise genetic distance matrix using the hclust function in R (R Core Team 2016).
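As an illustration of how the ΔK index is obtained from replicate STRUCTURE runs, the following Python sketch computes ΔK as the absolute second-order difference of the mean LnP(D) across successive values of K, divided by the standard deviation of LnP(D) at K. This is the commonly used mean-based form of the Evanno, Regnaut and Goudet (2005) statistic; the function name and input layout are assumptions for illustration and do not reproduce the software used in this study.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1.

    lnp_by_k: dict mapping K to the list of LnP(D) values from the
    replicate runs at that K (at least two runs per K).
    """
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in lnp_by_k or k + 1 not in lnp_by_k:
            continue  # delta K is undefined at the smallest and largest K
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # avoid division by zero when all runs agree exactly
        second_diff = abs(
            mean(lnp_by_k[k + 1]) - 2 * mean(lnp_by_k[k]) + mean(lnp_by_k[k - 1])
        )
        delta[k] = second_diff / sd
    return delta
```

Because ΔK requires values at K − 1 and K + 1, runs for K = 1-10 yield ΔK only for K = 2-9, which is one reason LnP(D) is also inspected directly.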

Genetic diversity metrics were calculated (mean observed number of alleles per locus (na), mean observed heterozygosity (HO) and mean expected heterozygosity (HE)) for each cluster identified by the best-fitting STRUCTURE model. Allelic richness was estimated for each combination of cluster and locus using rarefaction, with sample sizes based on the smallest number of alleles observed across all cluster x locus combinations. Pairwise FST values between clusters were calculated following Nei (1987). To account for uncertainties in cluster assignments, these analyses were repeated 1,000 times by randomly assigning individuals to one of the clusters based on their individual assignment probabilities (extracted from the STRUCTURE q-matrix). For all metrics, means and 95% confidence intervals are reported. Microsatellite differentiation metrics were calculated in the R programming language (R Core Team 2016) using functions in the ‘hierfstat’ package (Goudet and Jombart 2015). Isolation by distance was also explored by plotting pairwise genetic distance against pairwise geographic distance using the R programming language.
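The resampling used to propagate uncertainty in cluster assignment can be sketched as follows. This is a simplified Python illustration for a single locus and two clusters. It uses an uncorrected FST = (HT − HS)/HT with equal cluster weights rather than the sample-size-corrected Nei (1987) estimators implemented in ‘hierfstat’, and the function names, input layout and quantile rule are assumptions.

```python
import random
from collections import Counter


def allele_freqs(alleles):
    """Allele frequencies from a flat list of allele copies."""
    n = len(alleles)
    return {a: c / n for a, c in Counter(alleles).items()}


def fst_two_clusters(alleles_a, alleles_b):
    """Uncorrected single-locus FST = (HT - HS) / HT, equal cluster weights."""
    pa, pb = allele_freqs(alleles_a), allele_freqs(alleles_b)
    hs = ((1 - sum(p * p for p in pa.values()))
          + (1 - sum(p * p for p in pb.values()))) / 2
    pbar = {a: (pa.get(a, 0.0) + pb.get(a, 0.0)) / 2 for a in set(pa) | set(pb)}
    ht = 1 - sum(p * p for p in pbar.values())
    return 0.0 if ht == 0 else (ht - hs) / ht


def resampled_fst(genotypes, q_cluster1, n_iter=1000, seed=1):
    """Mean FST and 95% quantile interval over random cluster assignments.

    genotypes: list of (allele1, allele2) per individual at one locus.
    q_cluster1: per-individual probability of membership in cluster 1.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_iter):
        a, b = [], []
        for (x, y), q in zip(genotypes, q_cluster1):
            # Bernoulli draw: cluster 1 with probability q, else cluster 2.
            (a if rng.random() < q else b).extend((x, y))
        if a and b:  # skip draws that leave one cluster empty
            values.append(fst_two_clusters(a, b))
    if not values:
        raise ValueError("no iteration produced two non-empty clusters")
    values.sort()
    lo = values[int(0.025 * (len(values) - 1))]
    hi = values[int(0.975 * (len(values) - 1))]
    return sum(values) / len(values), (lo, hi)
```

Individuals with assignment probabilities near 0.5 change cluster often between iterations, so they widen the interval, whereas confidently assigned individuals contribute little uncertainty.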

Mitochondrial DNA sequencing and analysis

The samples collected (minus the nine duplicates, n = 240) were also sequenced for a 726 base pair fragment of the mitochondrial control region to investigate possible concordance between inferences gleaned from nuclear and mitochondrial markers. The forward CR-5 primer (TCACCATCAACACCCAAAGC; Garcia-Rodriguez et al. 1998) was used with a previously developed reverse primer (CR-DUG01, GTATGCGCTGGGAAATGG) based on Dugong dugon and Trichechus manatus sequences (Seddon et al. 2014). A 10 µl PCR reaction was prepared containing 0.625 µM MgCl2, 0.25 µM primers, 1x Qiagen PCR Buffer, 0.1 µl Qiagen HotStarTaq DNA polymerase and 0.25 µM dNTPs, with approximately 10 ng of DNA. Cycling conditions for the PCR were: 94°C for 15 min and 35 cycles of 94°C for 30 s, 58°C for 45 s, 72°C for 60 s with a final extension at 72°C for 10 min. Amplicons were purified using Exonuclease 1 (5 U per 5 µl PCR product, ThermoFisher Scientific) and shrimp alkaline phosphatase (1 U per 5 µl PCR product, ThermoFisher Scientific). Cycle sequencing was conducted using Big Dye Terminators v3.1 (Applied Biosystems) followed by capillary electrophoresis on a 3730 Genetic Analyser (Applied Biosystems). Mitochondrial control region sequences (n = 182) from southern Queensland utilised in the Seddon et al. (2014) study were included in the analysis.

A further dataset of mitochondrial control region sequences (410 bp, n = 217) generated from samples collected from along the east Queensland coast and used by Blair et al. (2014) were incorporated into this study. Following alignment of all sequences, the alignment was truncated to the shared 410 bp fragment prior to analysis. Descriptive statistics for the mitochondrial sequences were generated in the program DNASP v5.10.1 (Librado and Rozas 2009). Using the mitochondrial sequence data, a median-joining network was constructed using NETWORK v5.0 2015 (Fluxus Technology Ltd; Bandelt, Foster and Röhl 1999) allowing visualisation of the haplotypes and the phylogeography.

Population-pairwise FST values based on mitochondrial sequences were calculated in the program Arlequin v3.5 (Excoffier and Lischer 2010) using the distance method (pairwise differences). FST estimates departure from panmixia caused by population subdivision and can also be calculated across all populations. Overall values for FST (and for FSC, assessing departure from panmixia among sub-clusters within clusters, and FCT, assessing departure from panmixia among clusters) were also calculated in an analysis of molecular variance (AMOVA) framework, specifying a priori groupings of sub-clusters and clusters. This approach takes into account both haplotype frequencies and sequence differences between haplotypes (Excoffier, Smouse and Quattro 1992). We note in advance that values of FCT can never attain those required to reject the null hypothesis of panmixia because the total number of populations we specified a priori was too low (Fitzpatrick 2009).

Seascape genetics

The possible association between sea-surface temperature and dugong genetic structure was explored based on the importance of temperature in previous marine mammal seascape genetic studies and its association with local scale movements in dugongs and manatees. We also assessed coastal seagrass distribution as a driver of genetic structure due to the dugong’s reliance on seagrass as its primary food source.

Night-time Sea Surface Temperature (SST, °C) was tested as a predictor of the observed genetic differences between Queensland dugongs on a population and individual level using the microsatellite data. Remote sensing data for this variable for each sample location (Appendix 3.1) from January 2006 to April 2016 were obtained from the Australian Ocean Data Network (https://portal.aodn.org.au/; IMOS L3S Nighttime gridded multiple-sensor multiple-swath Australian region HRPT AVHRR skin SST) and extracted using the ‘ncdf4’ R package (Pierce 2012). HRPT AVHRR SSTskin retrievals were produced by the Australian Bureau of Meteorology as a contribution to the Integrated Marine Observing System, an initiative of the Australian Government being conducted as part of the National Collaborative Research Infrastructure Strategy and the Super Science Initiative. The imagery data were acquired from NOAA spacecraft by the Bureau, Australian Institute of Marine Science, Australian Commonwealth Scientific and Industrial Research Organization, Geoscience Australia, and Western Australian Satellite Technology and Applications Consortium. Total mean summer (November-February) and winter (May-August) SST values were calculated by gathering daily data over a 10-year period (January 2006 to April 2016), with an average of ~310 observations per sampling point for each month and ~1200 observations per sampling point for the ten-year period.
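The seasonal averaging described above can be illustrated with a short Python sketch. It assumes that the daily SST values for one sampling point have already been extracted into (date, temperature) pairs, with None marking days without a valid retrieval; the function name and data layout are hypothetical, and the sketch does not reproduce the ‘ncdf4’ extraction used in this study.

```python
from datetime import date
from statistics import mean

SUMMER_MONTHS = {11, 12, 1, 2}  # November-February, as defined in the text
WINTER_MONTHS = {5, 6, 7, 8}    # May-August, as defined in the text


def seasonal_mean_sst(observations):
    """Return (mean summer SST, mean winter SST) for one sampling point.

    observations: iterable of (datetime.date, sst_celsius or None).
    A season with no valid observations yields None.
    """
    obs = [(d, t) for d, t in observations if t is not None]
    summer = [t for d, t in obs if d.month in SUMMER_MONTHS]
    winter = [t for d, t in obs if d.month in WINTER_MONTHS]
    return (mean(summer) if summer else None,
            mean(winter) if winter else None)
```

Pooling all years before averaging, as here, weights each valid daily observation equally, so years with more cloud-free nights contribute more to the seasonal mean.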

Seagrass distribution mapping along the Queensland coast is incomplete; however, we were able to investigate connectivity of individual dugongs based on the modelled habitat suitability of coastal seagrass distribution for the GBR region during the dry season (Grech and Coles 2010). The modelled seagrass distribution map was downloaded from http://maps.eatlas.org.au/ and exported into ArcMap 10.5.1 where each 2 km x 2 km grid square was given a resistance value based on the probability of seagrass being present at that site and therefore being suitable/unsuitable habitat for dugongs. The resistance values were set at an arbitrary value of 20 for a probability of seagrass presence <0.25, at 10 for probabilities of 0.25-0.49, at 5 for probabilities of 0.5-0.74 and 1 for probabilities of 0.75-1. Connectivity among pairs of individuals was then calculated as a resistance distance using Circuitscape 4.0.5 (McRae, Shah and Mohapatra 2014) in the pairwise mode. The seagrass distribution model has only been mapped for parts of the GBR region, so we were only able to look at the effect of seagrass distribution on pairwise genetic distance (from the microsatellite analysis) for samples collected between Starcke River (14.77°S) and Rodds Bay (23.97°S). We also calculated geographic distances between sample locations (calculated between sampling GPS coordinates as beeline distance in km) to assess the influence of distance on population assignment and individual genetic distances. Other parameters were considered to be unimportant to dugongs or unreliable; remote-sensed data on chlorophyll levels were considered unreliable in the shallow water inhabited by dugongs (J Brodie pers. comm.).
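The resistance classification and the beeline distance can be written out as a minimal Python sketch. The treatment of the class boundaries as half-open intervals (for example, a probability of 0.495 falling in the 0.25-0.49 class) and the use of the haversine formula with a mean Earth radius of 6,371 km are assumptions for illustration; the text does not specify how beeline distances were computed.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)


def seagrass_resistance(p_seagrass):
    """Map the modelled probability of seagrass presence to a resistance value."""
    if not 0.0 <= p_seagrass <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if p_seagrass < 0.25:
        return 20
    if p_seagrass < 0.5:
        return 10
    if p_seagrass < 0.75:
        return 5
    return 1


def beeline_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in km between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

A beeline distance crosses land and open water alike, so for a coastal species it is a lower bound on the distance an animal would actually travel between two sites.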

Initially, we tested the influence of latitude and the oceanographic variables (SST and seagrass distribution) on the probability that an individual dugong would be assigned to a STRUCTURE cluster. Because our best-fitting STRUCTURE model identified only two clusters (see Results below), associations were tested using binomial logistic regression with a logit link function (fit using functions in the R package ‘nnet’; Venables and Ripley 2002). Predictor variables included latitude, mean summer SST, mean winter SST (all as continuous variables) and a binary variable representing whether or not the individual was sampled north of the Whitsunday Islands. This binary variable was included to help define the significance of the effects of the other ecological variables on population structure, as a genetic break was detected at this location (see below). All continuous covariates were mean-centred and scaled prior to regression to allow for directly comparable regression coefficients. For the response variables, individuals belonging to the different clusters were assigned a number code (e.g. 1, 2). As for the calculations of population differentiation metrics above, uncertainty in individuals’ cluster assignments was accounted for by repeating models 1,000 times and assigning individuals to either cluster in each iteration based on a binomial draw of their STRUCTURE assignment probabilities. Predictor variables were considered to have a significant influence on cluster assignment probability if their 95% quantile intervals did not overlap with zero.
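The covariate scaling and the significance rule can be illustrated with a short Python sketch. It covers only the steps surrounding the model fit (standardising a covariate, drawing a cluster code from an assignment probability and checking whether the 95% quantile interval of the repeated coefficient estimates excludes zero); the logistic regression itself is not reproduced. The function names and the linear-interpolation quantile rule are assumptions.

```python
import math
from statistics import mean, stdev


def standardise(values):
    """Mean-centre and scale a covariate to unit standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def draw_cluster(q_cluster1, rng):
    """Binomial draw of a cluster code (1 or 2) from P(cluster 1).

    rng: a random.Random instance (or any object with a random() method).
    """
    return 1 if rng.random() < q_cluster1 else 2


def quantile_interval(estimates, level=0.95):
    """Central quantile interval using linear interpolation between order statistics."""
    xs = sorted(estimates)

    def q(p):
        pos = p * (len(xs) - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(xs) - 1)
        return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

    tail = (1 - level) / 2
    return q(tail), q(1 - tail)


def excludes_zero(estimates, level=0.95):
    """True if the quantile interval of the coefficient estimates does not span zero."""
    lo, hi = quantile_interval(estimates, level)
    return lo > 0 or hi < 0
```

In this scheme, a coefficient collected from each of the 1,000 refitted models would be passed to excludes_zero, and the predictor treated as influential only when the function returns True.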


Following this, the association between location and ecological variation on individual-level genetic differentiation was examined. These relationships were explored using multiple regression on distance matrices (fit using functions in the R package ‘ecodist’; Goslee and Urban 2007). Individual pairwise genetic distances were calculated following the multilocus genetic dissimilarity measure proposed by Kosman and Leonard (2005). Predictor variables included pairwise geographic distances, two Gower’s distance matrices that accounted for environmental dissimilarity between individual sampling points (summer SST, winter SST; Gower 1971) and the resistance distances that were calculated for the seagrass distribution in Circuitscape. Gower’s distance matrices were built following methods in Pavoine et al. (2009) using the ecological variables outlined above, which were all treated as continuous, unweighted and scaled by range (dividing by the maximum). Significance of predictor regression coefficients was assessed as above using 1,000 model iterations. Because seagrass resistance values were not available for all pairwise comparisons (as data were only available for parts of the GBR region), we imputed these values from a uniform [0, 1] distribution (representing the observed range of scaled pairwise distances) in each iteration.
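As a rough illustration of two steps in this paragraph, the sketch below builds a Gower-type distance matrix for a single continuous variable and fills missing pairwise resistance values with uniform draws. It assumes the standard range-scaled form of Gower's distance for one continuous variable, represents missing pairs as `None`, and uses hypothetical function names; it does not reproduce the ‘ecodist’ regression or the Kosman and Leonard (2005) genetic distance.

```python
import random


def gower_matrix(values):
    """Pairwise |x_i - x_j| divided by the observed range of the variable."""
    span = max(values) - min(values)
    if span == 0:
        return [[0.0] * len(values) for _ in values]
    return [[abs(a - b) / span for b in values] for a in values]


def scale_by_max(matrix):
    """Divide every non-missing entry by the largest observed entry."""
    largest = max(v for row in matrix for v in row if v is not None)
    if largest == 0:
        return [row[:] for row in matrix]
    return [[None if v is None else v / largest for v in row]
            for row in matrix]


def impute_missing(matrix, rng):
    """Fill missing pairs with Uniform(0, 1) draws, keeping symmetry."""
    filled = [row[:] for row in matrix]
    n = len(filled)
    for i in range(n):
        for j in range(i + 1, n):
            if filled[i][j] is None:
                filled[i][j] = filled[j][i] = rng.random()
    return filled
```

Calling `impute_missing` once per model iteration with the same `random.Random` instance gives a fresh set of imputed values each time, which mirrors the per-iteration imputation described above.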

Oceanographic modelling

Oceanographic modelling was conducted to determine if there was any influence of currents and tides on dugong population structure along the east Queensland coast. We used the Second-generation Louvain-la-Neuve Ice-ocean Model (SLIM). This is a discontinuous Galerkin finite element model (Lambrechts et al. 2008; Critchell et al. 2015; Delandmeter et al. 2017) that has been successfully verified and used to simulate the hydrodynamics of the GBR, which is characterised by a complex topography and strong velocity gradients. Appendix 3.3 shows the model domain and mesh used by SLIM for this study. The domain comprised the entire GBR lagoon. The eastern boundary was the shelf break at 200 m depth where the oceanographic forcing by the Coral Sea was applied. The mesh had a very high resolution in the Whitsunday Islands area of central Queensland. We also modelled the fate of potential waterborne particles in this region. This was done using the SLIM Lagrangian advection-diffusion model; the value of the horizontal eddy diffusion coefficient, which parameterises sub-mesh size turbulent mixing, was set equal to 5 m² s⁻¹ following Wolanski and Kingsford (2014).
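The general idea of Lagrangian advection-diffusion particle tracking can be sketched with a simple random walk. This is not the SLIM implementation: it assumes a constant diffusivity, a forward Euler time step and a user-supplied velocity function (`velocity_at` is a hypothetical placeholder for the modelled current field). Only the diffusion coefficient of 5 m² s⁻¹ is taken from the text.

```python
import math
import random

K_H = 5.0  # horizontal eddy diffusion coefficient (m^2/s), value from the text


def step_particle(x, y, u, v, dt, rng, k_h=K_H):
    """Advance one particle by one step (positions in m, dt in s)."""
    sigma = math.sqrt(2.0 * k_h * dt)  # std. dev. of the diffusive jump
    return (x + u * dt + sigma * rng.gauss(0.0, 1.0),
            y + v * dt + sigma * rng.gauss(0.0, 1.0))


def track_particle(x0, y0, velocity_at, n_steps, dt, seed=0, k_h=K_H):
    """Follow one particle; velocity_at(x, y, t) returns (u, v) in m/s."""
    rng = random.Random(seed)
    x, y = x0, y0
    path = [(x, y)]
    for step in range(n_steps):
        u, v = velocity_at(x, y, step * dt)
        x, y = step_particle(x, y, u, v, dt, rng, k_h)
        path.append((x, y))
    return path
```

With a 60 s time step, the diffusive jump in each direction has a standard deviation of sqrt(2 × 5 × 60) ≈ 24.5 m, so over short intervals the advective term dominates wherever tidal currents are strong.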

3.4 RESULTS

Population differentiation – Microsatellite data

Analysis of the microsatellite genotypes in the east Queensland coast dugong population identified an abrupt genetic break in central Queensland. Likelihood values estimated from Bayesian clustering analyses executed in STRUCTURE identified a clear mode at K = 2. The geographical boundary between the two clusters was abrupt and occurred in the Whitsunday Islands region (20.32°S, Figure 3.1). Cluster 1 (the northern cluster) included dugongs sampled from Torres Strait (10.03°S) south to Airlie Beach (20.23°S), with only two individuals from this region showing a high probability of being assigned to Cluster 2 according to the inferred ancestry of individuals estimated from the run with the maximum LnP(D) (Figure 3.1). Cluster 2 (the southern cluster) included dugongs sampled from Midge Point (20.65°S) south to Moreton Bay (27.44°S; Figure 3.1). When the STRUCTURE plots were visualised for K = 2, 3, 4 and 5, they all showed a consistent division at the Whitsunday Islands region (Appendix 3.2). The hierarchical clustering dendrogram also supported the division of the 293 samples into two clusters (Figure 3.2). The pairwise Fst value between the two clusters was 0.011 (95% CI: 0.009 - 0.013). A relatively strong isolation-by-distance pattern was observed for the entire dataset, although this pattern was less distinct when only looking at individuals within the southern cluster separately (Figure 3.3 and 3.4). The isolation-by-distance pattern was visually stronger for individuals within the northern than the southern cluster; however, this may be influenced by the lack of samples analysed between Starcke River and the Torres Strait. The southern cluster demonstrated what appeared to be an asymptote in genetic and geographical distance at around 700 km. Additionally, the comparison between pairwise genetic distance and pairwise geographical distance found genetic distances to be higher when comparing individuals across the genetic break than when comparing individuals within 500 km either north or south of the genetic break (Figure 3.5).


Figure 3.2. Hierarchical clustering dendrogram of 293 individual dugong samples from along the Queensland coast based on 22 microsatellite loci.


Figure 3.3. Isolation by distance plot of pairwise genetic distance, calculated using genotypes from 293 individual dugongs across 22 microsatellite loci, against pairwise geographic distance (kilometres).

Figure 3.4. Isolation by distance plot of pairwise genetic distance, calculated using genotypes from 293 individual dugongs across 22 microsatellite loci, against pairwise geographic distance (kilometres) for individuals assigned to the northern or southern cluster.


Figure 3.5. Isolation by distance plot of pairwise genetic distance, calculated using genotypes from 293 individual dugongs across 22 microsatellite loci, against pairwise geographic distance (kilometres) for individuals across the 500km section of the Whitsunday Islands genetic break, 500km north of the Whitsunday Islands and 500km south of the Whitsunday Islands.

Cluster 2 had higher genetic diversity values compared to Cluster 1 (Table 3.1). All 22 loci were polymorphic for both Clusters 1 and 2. Allelic richness values for both clusters at each locus ranged from 2.00 to 14.08 alleles/locus (Appendix 3.4), with average values for each cluster presented in Table 3.1. No loci were identified as being out of HWE for both clusters (Appendix 3.5).

Table 3.1. Microsatellite diversity metrics (estimated across 1,000 iterations) for the two primary clusters identified in STRUCTURE. Sample size for each cluster (n = number of individual dugongs whose assignment probabilities were most strongly associated with each cluster), mean observed number of alleles per locus (na), average allelic richness per cluster (AR), observed heterozygosity (HO), expected heterozygosity (HE). Brackets show 95% confidence intervals. Note that diversity metrics and their associated confidence intervals reflect uncertainty in admixture assignments.

Cluster      n     na                    AR                    HO                       HE
1 (north)    100   6.99 (6.95 - 7.03)    6.45 (6.41 - 6.49)    0.546 (0.543 - 0.549)    0.585 (0.582 - 0.587)
2 (south)    193   7.67 (7.62 - 7.72)    7.04 (7.00 - 7.08)    0.568 (0.565 - 0.570)    0.612 (0.609 - 0.615)

Due to the identification of two large and clearly separated clusters, we investigated evidence for further sub-structuring by performing additional STRUCTURE analyses using subsets restricted to either Cluster 1 or 2. Within Cluster 1 (samples between Torres Strait and Airlie Beach), two sub-clusters were identified (Appendix 3.6): Sub-cluster 1a included samples from Torres Strait and Sub-cluster 1b included samples from Starcke River to Airlie Beach. The genetic break between the two sub-clusters was less abrupt than between the two primary clusters and was located in the region between Torres Strait and Starcke River, although an absence of samples from that region precluded refinement of the site of this break. Analysis of samples from Cluster 2 (samples from Midge Point to Moreton Bay) identified four clusters (K = 4), although there was a high value at K = 2 for the ΔK index, with a genetic break occurring in the Gladstone to Bundaberg region for both K = 2 and K = 4 (Appendices 3.7 and 3.8). Hence, Sub-cluster 2a contained samples from Midge Point to Gladstone and Sub-cluster 2b included samples from Bundaberg to Moreton Bay.

mtDNA diversity and population differentiation

A 410 bp segment of the mtDNA control region was assessed in 639 individual dugongs from across the entire east Queensland coast, with sequences from 441 dugongs falling into the restricted lineage and 198 into the widespread lineage (Table 3.2; Figure 3.6). Forty-seven haplotypes were identified with 45 variable sites. Truncation of all sequences to the 410 bp region resulted in the exclusion of five variable sites. To investigate the extent of concordance between nuclear and mtDNA inferences, the mtDNA samples were divided into four clusters that reflected the genetic break locations identified in the microsatellite analysis (see above; Table 3.2).

Numbers of haplotypes and of variable sites, haplotype diversity and nucleotide diversity, as well as the average number of differences between sequences all tended to increase from south to north. This pattern is likely a result of the increased representation of the more diverse widespread lineage in northern populations. The most northerly Sub-cluster (1a) had the highest number of haplotypes (31) and number of unique haplotypes (25) and the highest haplotype and nucleotide diversities (Table 3.2). Conversely, Sub-cluster 2b, the most southerly sub-cluster, had the lowest values for all these parameters. Only two haplotypes (h1 and h6, both in the restricted lineage) were shared across all four sub-clusters (Appendix 3.9). There were 37 variable sites for the combined northern populations (Sub-clusters 1a and 1b) and 31 for the combined southern populations (Sub-clusters 2a and 2b). Representation of the two mitochondrial lineages was not equal in the different clusters/sub-clusters. Table 3.2 illustrates the very low representation of the widespread lineage in Cluster 2, and especially in Sub-cluster 2b (Bundaberg to Moreton Bay). In the Torres Strait (Sub-cluster 1a), the two lineages are roughly equally represented. Interestingly, Sub-cluster 1b (samples between Starcke River and Airlie Beach) contains significantly fewer representatives of the restricted lineage than expected (Chi-square test, P < 0.0001) in comparison with Sub-cluster 1a. Table 3.2 also implies that the total number of haplotypes in Sub-cluster 2b (Bundaberg to Moreton Bay region) is probably not much greater than the number we sampled, but that many haplotypes remained unsampled in the more northerly populations.

Table 3.2. Summary statistics for dugong mitochondrial sequence data by mitochondrial lineage, mtDNA sub-cluster and region. N is number of individual dugongs. Nucleotide diversity is per-site. P (N haplotypes) is the probability of obtaining a sample with at least the number of haplotypes equal to the observed number of haplotypes.

                          North     North     North     South     South     South
mtDNA sub-cluster         1a        1b        overall   2a        2b        overall   All
N (total)                 139       102       241       177       221       398       639
N (Restricted)            75        12        87        137       217       354       441
N (Widespread)            64        90        154       40        4         44        198
No. variable sites        36        23        37        30        24        31        45
No. haplotypes            31        12        38        15        7         18        47
Unique haplotypes         25        3         28        6         2         9         N/A
Haplotype diversity       0.877     0.829     0.924     0.765     0.551     0.670     0.851
Nucleotide diversity      0.0220    0.0142    0.0223    0.0161    0.0037    0.0100    0.0199
Av. No. nuc. differences  9.018     5.817     9.132     6.611     1.498     4.102     8.122
P (N haplotypes)          0.041     0.033     0.025     0.016     0.158     0.101     0.01
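The contrast in lineage representation between Sub-clusters 1a and 1b can be checked from the counts in Table 3.2 with a Pearson chi-square test on a 2 x 2 table. The sketch below is an illustration under stated assumptions: it omits any continuity correction, and the exact form of the test used in the analysis (and the software used to run it) is not stated in the text. The function name is hypothetical.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2 x 2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variable with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: Sub-cluster 1a, Sub-cluster 1b; columns: restricted, widespread.
lineage_counts = [[75, 64], [12, 90]]
chi2, p_value = chi_square_2x2(lineage_counts)
```

For these counts the statistic is roughly 45 on one degree of freedom, giving a P-value far below 0.0001, which is consistent with the result reported in the text.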

Population structuring was identified in the mtDNA data, with all pairwise comparisons indicating significant restrictions to gene flow between sub-clusters (Table 3.3). The highest FST values were between Sub-cluster 1b and Sub-cluster 2b. The analysis of molecular variance (AMOVA) also found significant results for comparisons among sub-clusters within clusters and within sub-clusters but not between the main northern and southern clusters (Table 3.4), due to limitations of the AMOVA approach (see earlier comment).


Figure 3.6. Median-joining network showing relationships between mitochondrial control-region haplotypes from 639 individual dugongs, the proportions of each haplotype and their geographical origins. Inset is a key to colours indicating geographical origins, consistent with the sub-clustering identified in the microsatellite data. The size of each circle is proportional to the frequency of the haplotype it represents. Haplotype identification numbers are indicated, extending the numbering system of Blair et al. (2014). Not all haplotypes numbered in Blair et al. (2014) are in this figure because it includes data from the east coast of Queensland only. Haplotype #47 of Blair et al. (2014; GenBank accession EU835807) is not included: re-examination of sequence files for this haplotype revealed ambiguities indicating possible contamination. Slash lines across a line connecting haplotypes or median vectors indicate the number of mutational changes inferred as occurring between haplotypes. Numbers within circles indicate number of individuals represented, when more than one.

Table 3.3. Pairwise FST values between mtDNA sub-clusters (1a - Torres Strait, 1b - Starcke River to Airlie Beach, 2a - Midge Point to Gladstone, 2b - Bundaberg to Moreton Bay) calculated in Arlequin (distance method: pairwise differences). All values were significant (p=0.000).

      1b        2a        2b
1a    0.28176   0.15087   0.45274
1b    -         0.52447   0.80546
2a    -         -         0.15128

Table 3.4. AMOVA of the mtDNA control region sequences for the combined northern clusters (Sub-clusters 1a and 1b) and southern clusters (Sub-clusters 2a and 2b). P-values were based on 1000 permutations. Note that, although a substantial percentage of the variation is among clusters, the number of populations specified is too low to achieve a significant result for FCT.

Source of variation                  d.f.   Percentage of variation   F-stat       P-value
Among clusters (north and south)     1      36.50                     FCT 0.365    P = 0.331
Among sub-clusters within clusters   2      14.50                     FSC 0.228    P < 0.001
Within sub-clusters                  635    49.00                     FST 0.510    P < 0.001
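The three fixation indices in Table 3.4 follow directly from the percentages of variation under the standard hierarchical AMOVA definitions. The short sketch below reproduces that arithmetic; the function name is hypothetical and the calculation is shown only as a consistency check on the table, not as a re-analysis of the sequence data.

```python
def fixation_indices(pct_among_clusters, pct_among_subclusters,
                     pct_within_subclusters):
    """Return (FCT, FSC, FST) from the three AMOVA variance percentages."""
    a = pct_among_clusters
    b = pct_among_subclusters
    c = pct_within_subclusters
    total = a + b + c
    f_ct = a / total        # among clusters, relative to total variation
    f_sc = b / (b + c)      # among sub-clusters, relative to within-cluster variation
    f_st = (a + b) / total  # all among-group variation, relative to total
    return f_ct, f_sc, f_st


f_ct, f_sc, f_st = fixation_indices(36.50, 14.50, 49.00)
```

The values returned round to 0.365, 0.228 and 0.510, matching the table, and they satisfy the usual identity (1 - FST) = (1 - FSC)(1 - FCT).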

Seascape genetics

The logistic regression analysis identified the binary geographical location variable, ‘north of the Whitsunday Islands’, to be the most influential factor for predicting if an individual was genetically assigned in the microsatellite analysis to Cluster 1 (regression coefficient 95% CI: -1.41 – 9.82; Figure 3.7). This variable also accounted for the largest proportion of the explained variance (95% CI: 0.10 - 0.99; Appendix 3.10). Summer and winter SST regression coefficients overlapped with zero and were therefore not significant predictors of population assignment.


Figure 3.7. Logistic regression coefficients for each variable’s predicted influence on an individual dugong’s probability of being assigned in the microsatellite analysis to Cluster 1 (the northern cluster). Mean (lines within boxes), interquartile range (boxplot hinges) and 95% confidence intervals (whiskers) are shown. Variables were considered significant if their 95% confidence intervals did not overlap with zero.

The multiple regression on distance matrices found that geographic distance between sampling locations (positive correlation) had the strongest effect on pairwise genetic (microsatellite) distance and the highest relative importance value (Table 3.5). All other variables were relatively unimportant in explaining pairwise genetic distance.

Table 3.5. Results from the multiple regression on distance matrices to examine relationships between individual pairwise genetic (microsatellite) distance and environmental seascape distances. Statistics show mean values taken across 1,000 models, brackets show 95% quantiles. The R2 value was 0.3189 (95% CI: 0.3178 – 0.3216).

Distance                Regression statistic           Relative importance
Geographic distance     0.2787 (0.2772 – 0.2809)       0.7902 (0.7567 – 0.8243)
Mean summer SST         -0.1057 (-0.1087 – -0.1027)    0.1134 (0.1018 – 0.1262)
Mean winter SST         -0.0186 (-0.0198 – -0.0170)    0.0035 (0.0029 – 0.0042)
Seagrass distribution   -0.0958 (-0.1191 – -0.0653)    0.0931 (0.0457 – 0.1380)

Oceanographic modelling

Oceanographic modelling was conducted to identify possible explanatory mechanisms for the abrupt genetic break in the Whitsunday Islands region. Within this region, three seagrass hotspots have been identified from past seagrass presence/absence surveys (Layer id: ea_nesp1:GBR_NESP-TWQ-3-1_JCU_Seagrass_1984-2014_Site-surveys, https://maps.eatlas.org.au) and these are located on the western side of Whitsunday Island (site 1 in Figure 3.8), at Shute Harbour (site 2) and at Airlie Beach (site 3). The oceanographic model showed that, in the absence of wind, most potential waterborne particles remain within the Whitsunday Islands region, with few escaping (Figure 3.8a and Appendices 3.11 and 3.12). When this model is run with winds blowing from the south-east, again few particles drift beyond the Whitsunday Islands region (Figure 3.8b).

Figure 3.8. The predicted movement of potential waterborne particles emanating from the three colour-coded seagrass meadows (site 1 - western side of Whitsunday Island (maroon), site 2 - Shute Harbour (blue) and site 3 - Airlie Beach (green)) in the Whitsunday Islands region (~20.32°S) after 184 hours (a) during calm weather conditions and (b) during the prevailing south-easterly winds. Land masses are shown in white with the change in ocean bathymetry shown as grey-scale contours (the actual values of the depth are shown in Appendix 3.3).

3.5 DISCUSSION

Investigation of the genetic population structure of dugongs along the east Queensland coast found evidence of a major barrier to gene flow that has led to significant genetic differentiation. Analysis of the microsatellite genotypes from 293 dugongs sampled along the east Queensland coast identified an abrupt genetic break in the Whitsunday Islands region of central Queensland, indicating the presence of two genetically distinct clusters. Since dugongs in Queensland have been tracked making movements of up to 560 km (Sheppard et al. 2006), such a profound separation of the two clusters occurring over a distance of less than 100 km was surprising and supports our theory that the barrier in the Whitsunday Islands region is significantly restricting dugong gene flow across the region.

Our seascape genetics analyses accounted for population uncertainty and demonstrated a strong influence of location in relation to the Whitsunday Islands on an individual’s population assignment probability, indicating that dugongs on either side of the break were unlikely to move across this location and interbreed. Additionally, geographic distance between sample locations most significantly explained the observed pairwise genetic distances. Geographical distance is likely a significant parameter because the distribution of dugongs along the east Queensland coast south of Cooktown is essentially linear, with their distribution largely limited to sheltered shallow coastal regions supporting seagrass meadows (Marsh and Saalfeld 1989; Marsh and Saalfeld 1990b). Large-scale dugong movements therefore tend to be strongly two-directional (north or south; Sheppard et al. 2006; Zeh et al. 2016), facilitating isolation by distance.

The high-resolution oceanographic modelling (Figure 3.8 and Appendices 3.11 and 3.12) indicated the presence of the ‘sticky water’ effect, a unique phenomenon that might influence the movements of dugongs and potentially other marine mammals. This effect occurs when the mean currents flow around a matrix of reefs and islands, whilst strong tidal currents prevail within the matrix (Wolanski and Spagnol 2000). Our modelling clearly shows the effect of the ‘sticky water’ phenomenon on waterborne particles in the Whitsunday Islands region, with most particles remaining inside the region (Figure 3.8 and Appendices 3.11 and 3.12). The model shows that, by the time they exit the Whitsundays, the waterborne particles (which might include odour particles) originating from the Whitsundays are highly diluted, by a factor of 100:1, and probably also highly biodegraded because the water residence time is about 2 weeks. The sticky water effect does not physically prevent dugongs from moving; it may, however, prevent strong seagrass smell cues from reaching dugongs located north and south of the Whitsundays, cues that might otherwise attract them to swim to the Whitsundays in search of feeding grounds. This trapping of a smell plume within an archipelago is the same process as the sticky water trapping of non-swimming coral larvae in a reef matrix (Andutta et al. 2012). Along the inshore waters of eastern Queensland, this effect is found only at the Whitsunday Islands region (Wolanski and Kingsford 2014). We hypothesise that the currents and tides in this area might disrupt or inhibit the movements of dugongs traversing the Whitsunday Islands region, but further research is required to understand the phenomenon. There are few comparative data from other species in this region; further studies might verify whether or not other species show similar patterns of restricted gene flow at this location.

Past satellite tracking data show limited dugong movements across the region of the genetic break reported here. The fifty-two dugongs caught and tracked between Missionary Bay (18.2°S) and Hervey Bay (25.2°S) had the potential to cross the Whitsunday Islands region (20°S - 21°S), based on a maximum movement distance of 560 km (Sheppard et al. 2006), yet none did so. It is possible that dugongs are traversing the Whitsundays region but have not yet been tracked doing so, as pedigree analysis conducted by Cope et al. (2015) found that dugongs in south-east Queensland moved between locations more often than was suggested by genetic structuring (Seddon et al. 2014) or telemetry (Sheppard et al. 2006). Targeted satellite tracking of individuals from around the locality of the genetic break is required to confirm whether or not dugongs do indeed move across the Whitsunday Islands region.

The latitudinal gradient in SST data indicated that there was a greater than 6°C difference in SST from northern Queensland towards southern Queensland (Appendix 3.1) and so we might have expected SST to influence population assignment and genetic distance. However, no such association was detected in our analysis, in contrast to another sirenian, the Florida manatee (Trichechus manatus latirostris), which migrates with changes in temperature. Similarly, it was thought that seagrass distribution would help explain the pairwise genetic distances due to the dugong’s reliance on seagrass; however, our analysis using the modelled habitat suitability of coastal seagrass distribution for the GBR region found it to be relatively unimportant in explaining the genetic distances. Further investigations into the extent, quality and species composition of seagrasses along the entire Queensland coast are required to understand what influence, if any, seagrass distribution may have on dugong connectivity.

The same genetic breaks were identified in both the microsatellite and mtDNA analyses. MtDNA data look further back in time than do microsatellite data, which are more indicative of current gene flow (Martien et al. 2014). Agreement between these two sets of data provides compelling evidence that these breaks are somewhat historical. The mtDNA median-joining network suggests that breeding between dugongs from either side of the main genetic break has occurred in the past, possibly more frequently than at present. Any female crossing the region of the break and breeding has the potential to have her mtDNA genome persist for many generations, softening the apparent abruptness of the genetic break without additional migrants.

After sea levels rose following the last glacial maximum (LGM) and areas of suitable seagrass habitat expanded (Blair et al. 2014), dugong populations that were isolated from each other may have come into contact and bred. Blair et al. (2014) suggested that the Coral Sea plateaux, such as the Marion (adjacent to Townsville) and Queensland (adjacent to Cairns) Plateaux, might have provided refugia for dugongs during much of the last glacial period, during which time restricted lineage haplotypes diverged in isolation in each of these refugia. Dr R Beaman (pers. comm.) indicated that at the LGM, the margins of these two plateaux would have been sheer limestone cliffs, rising from deep water and likely unsuitable for the growth of seagrasses and hence not able to support dugongs. Limited shallow-water habitat would have existed at that time only at the southern end of what is now the Great Barrier Reef and around the mouth of the Fly River in the Gulf of Papua. We now postulate that the restricted mitochondrial lineage underwent some divergence in these two refugia. With rising sea levels post-LGM, areas of suitable habitat expanded from the refugia, but large parts of the northern GBR coastline would have remained unsuitable for dugongs until relatively recently. Following the flooding of the Torres Strait landbridge (~7,000 years ago: Blair et al. 2014), representatives of the widespread lineage would have been able to colonise the eastern coastline of Queensland. While this scenario can explain the phylogeography of the restricted lineage, it still does not explain the low representation of this lineage in mid-north Queensland, a portion of the coastline where the widespread lineage predominates. Neither does this scenario explain the persistence of the genetic break in the vicinity of the Whitsunday Islands. Sea levels in the area have been relatively stable for about the last 7,000 years (Lambeck et al. 2014) and inter- and sub-tidal seagrass habitats suitable for dugongs are widely distributed in the Whitsunday Islands region (Carter et al. 2016). In this study, we see greater mixing in the mtDNA sequence data, which potentially provides insights at a deeper time scale, while the microsatellite data likely exhibit the more fine-scale population structuring that has occurred due to the contemporary barriers to dugong gene flow proposed in this study.

At present, dugong management on the east coast of Queensland is largely conducted on a jurisdictional basis with separate (and very different) management arrangements for Torres Strait (north of 10.68°S), the Great Barrier Reef Region (10.68°S to 24.5°S) and south-east Queensland (24.5°S to the Queensland/NSW border; see Marsh et al. 2011 for details). Our finding of a distinct genetic break in the Whitsunday Islands region highlights the need for improved cross-jurisdictional co-ordination. Along the eastern Queensland coast, dugongs should be managed and monitored as two distinct breeding units, north and south of the Whitsunday Islands. While the microsatellite and mtDNA analyses both suggested four ‘populations’, a lack of samples from the proposed genetic break in the Torres Strait to Starcke River region and the uncertainty in the number of sub-clusters in Cluster 2 mean that further investigation is required to determine if this approach should be extended to four breeding units and consequent management units. As there is evidence of high gene flow within each of the two clusters/populations, maintaining the movement corridors currently utilised by dugongs in each of the two regions should be a priority.

CHAPTER 4: Reduced representation sequencing of single nucleotide polymorphisms for detecting Queensland dugong population structure

4.1 ABSTRACT

Technological advances in the molecular genetics field have greatly improved our ability to study population genetics. More recently, improvements in, and the greater affordability of, single nucleotide polymorphism (SNP) discovery and genotyping methods have allowed for genome-wide analyses of non-model species. The double digest RAD-Seq (ddRAD-Seq) method of SNP discovery and genotyping has been applied to a number of wildlife species and has improved the resolution of population structure analyses through the evaluation of genome-wide variation. In this study, the ddRAD-Seq method of SNP discovery and genotyping was applied to investigate Queensland dugong population structure and to make comparisons with results obtained using microsatellite markers and mitochondrial DNA sequences. A total of 47 individual dugongs sampled along the eastern Queensland coast between Torres Strait and Moreton Bay were genotyped, with 43 individuals remaining after filtering for genotype quality. Evaluation of dugong population structure using admixture analysis of 10,690 SNPs identified the presence of three genetic clusters. A northern and southern cluster were found with a distinct genetic break in the Whitsunday Islands region, a result very similar to that detected by microsatellite analysis. The third cluster consisted of individuals sampled from both northern and southern Queensland and indicated that effective dugong dispersal, with breeding, has occurred across the region of the genetic break. Further analysis identified 464 highly discriminatory SNPs that contributed the most to cluster assignments; a cost-effective SNP array could be developed from this SNP subset for additional dugong population genetics research. The discovery of genome-wide SNPs has provided improved Queensland dugong population structure resolution and provided opportunities for future genomic study.


4.2 INTRODUCTION

Population genetics analyses are able to evaluate gene flow between populations, providing an indication of the level of effective breeding dispersal occurring between populations and thus how connected populations are to one another (Frankham et al. 2004; Groom et al. 2006). Connectivity is important as it enables the exchanges of genetic material and thus reduces the likelihood of the deleterious effects of low genetic diversity and low gene flow. An understanding of the level of connectivity within a species ensures management action is effective in conservation of the species.

Over the past few decades, advances in molecular techniques and analysis methods have dramatically increased our ability to study population genetics. Substantial advances have been made using high-throughput sequencing technologies that allow for the discovery and sequencing of thousands of genetic markers and this has enabled the study of genetic variation at the genome-wide level for a number of taxa of interest (Davey et al. 2011; Valencia et al. 2018). The characterisation of a large number of genome-wide markers has increased the resolution of comparisons at the individual and population levels, providing insights into species population genetics not previously possible, for example, in detecting fine-scale population structuring (Dussex et al. 2018; Thrasher et al. 2018). The ability to discover large numbers of variants and the associated benefits of access to genome-wide data have resulted in a definite shift towards the use of single nucleotide polymorphisms (SNPs) as opposed to other molecular markers in genetics research on both model and non-model organisms (Morin et al. 2009; Garvin et al. 2010; Davey et al. 2011).

SNPs are substitutions of a single nucleotide at a particular position in the genome and represent the most common type of genetic variability in the genome (Morin et al. 2009; Lapegue et al. 2014). They are now the most widely used molecular marker and have been used in crop improvement, breeding programs and aquaculture, and extensively in human diagnostics, including targeted treatment plans based on individual genotypes (Schork et al. 2000; Shastry 2007). While research utilising SNPs had previously been restricted to model organisms and humans, the decrease in costs associated with SNP discovery and genotyping due to improvements in technology and their adaptability to automation (Liu and Cordes 2004; Davey et al. 2011) has led to an increase in their use in wildlife species for population genetics and conservation studies (Steiner et al. 2013; Dussex et al. 2018).

The key benefit of SNPs is their abundance throughout the genome which allows for detection of genome-wide variation that is not possible with other markers (Liu and Cordes 2004). This has caused a shift away from traditional markers such as microsatellites and mitochondrial DNA (mtDNA; Helyar et al. 2011), with studies utilising SNPs likely to continue to increase due to the ease and efficiency of SNP discovery and genotyping resulting in a large number of annotated markers (Morin and McCarthy 2007; Helyar et al. 2011). SNPs are also advantageous because they are simple to score (Garvin et al. 2010), they have lower mutation rates (Schork et al. 2000; Brumfield et al. 2003) and thus often have low levels of homoplasy (Brito and Edwards 2009). Additional benefits include easier calibration between laboratories compared to length based markers, although this does depend on the methodology used, low scoring error rates, the ability to genotype poorer quality samples such as ancient DNA, the capacity to investigate neutral and adaptive variation across the genome (Helyar et al. 2011) and their suitability for high-throughput sequencing (Sobrino et al. 2005; Garvin et al. 2010). SNP markers can be used in a broad range of population genetic studies including analysis of genetic diversity, population structure, taxonomic delineations and in parentage analyses (Morin and McCarthy 2007). However, their frequency throughout the genome also allows for gene mapping, mapping of quantitative trait loci (QTL, loci underlying phenotypic and adaptive traits; Steiner et al. 2013), disequilibrium-based association mapping (Lapegue et al. 2014), tracing the origin of adaptive traits (Ellegren 2014), identification of loci associated with inbreeding depression (Steiner et al. 2013) and genome-wide association studies (Thrasher et al. 2018).

In addition to their abundance throughout the genome and ease of genotyping, SNPs have further advantages over other nuclear markers. The development of microsatellite assays can be expensive and can require substantial laboratory time investment (Thrasher et al. 2018). The manual scoring of microsatellite alleles also requires substantial time and can include several forms of error making scoring unreliable, including allelic dropout, null allele issues and human error associated with the complex workflow (Thrasher et al. 2018). These issues can be less severe when using SNP assays as there are fewer steps involved and the genotyping is more automated (Brito and Edwards 2009; Thrasher et al. 2018). The mutation rate per generation differs between markers, with the mutation rate per generation higher for microsatellites compared to SNPs (Ellegren 2000; Lapegue et al. 2014). Due to this, SNPs are generally biallelic, while microsatellites are multiallelic, meaning a greater number of SNP loci are needed to achieve the same resolution as highly variable microsatellite loci (Lapegue et al. 2014; Thrasher et al. 2018). While advances in genetic technology have allowed for the discovery of numerous SNP markers and the creation of large amounts of sequence data, analysis tools have struggled to keep up with genomic data development, and some software is not able to handle large genomic datasets without updates (Steiner et al. 2013). Additionally, sequencing errors occurring from high-throughput sequencing can result in false SNP discovery, although these errors can be minimised by achieving sufficient sequence read depths and by using filters (e.g. removal of low quality sequence reads) to improve SNP calling (Ogden et al. 2013).

While early population genetics studies questioned the advantages of SNPs compared to neutral markers like microsatellites (e.g. Rosenberg et al. 2003), their utility for evaluating population structure has been shown to be similar if not better than microsatellites when an appropriate number of highly discriminatory markers are used. It has been suggested that around 100 SNPs are required to provide the same discriminatory power as 10-20 microsatellite loci (Kalinowski 2002; Helyar et al. 2011; Thrasher et al. 2018). Tens of thousands of SNPs can be discovered using Next-Generation Sequencing (NGS) methods and therefore they have the statistical power to outperform a few dozen microsatellites when determining population structure (Helyar et al. 2011; Dussex et al. 2018). A recent investigation of harbour porpoise population structure, using both SNP markers discovered by reduced representation sequencing (RRS) and microsatellite loci, found that assignment probabilities of individuals to clusters were improved in the Bayesian clustering analysis and that better resolution in the spatial analyses was obtained using the genome-wide SNPs compared to microsatellite markers, with this attributed to the greater number of SNP markers (Lah et al. 2016). In another study, SNP markers discovered using RRS were able to considerably improve the power to differentiate between potential relatives of the variegated fairy wren and provide more accurate estimates of relatedness coefficients compared to highly variable microsatellite markers (Thrasher et al. 2018).
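One rough way to compare marker panels of different types is to count independent alleles, that is, the number of alleles at each locus minus one, summed over loci. The Python sketch below is illustrative only: the microsatellite allele counts are hypothetical values chosen for the example, not figures from the studies cited above, and equal numbers of independent alleles do not guarantee equal statistical power.

```python
def independent_alleles(alleles_per_locus):
    """Sum of (number of alleles - 1) over all loci in a panel."""
    return sum(k - 1 for k in alleles_per_locus)


# 100 biallelic SNPs: each locus contributes one independent allele.
snp_panel = [2] * 100

# Hypothetical microsatellite panels (allele counts are assumptions).
msat_panel_a = [11] * 10   # 10 loci with 11 alleles each
msat_panel_b = [6] * 20    # 20 loci with 6 alleles each

for name, panel in [("SNPs", snp_panel),
                    ("microsatellites A", msat_panel_a),
                    ("microsatellites B", msat_panel_b)]:
    print(name, independent_alleles(panel))
```

Under these assumed allele counts, all three panels carry the same number of independent alleles, which gives some intuition for why many more biallelic loci are needed to match a small set of highly variable loci.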

The application of SNP markers to non-model species has been greatly improved by the introduction of high-throughput sequencing methods that allow for SNP discovery without the prior need for the costly and complex process of Whole Genome Sequencing and annotation. Reduced representation sequencing methods allow for the discovery of informative SNPs throughout the genome at a lower cost and without the need for a reference genome or other annotated genomic data (Thrasher et al. 2018; Valencia et al. 2018). Restriction site Associated DNA Sequencing (RAD-Seq) is a genotyping by sequencing method and a commonly used RRS method for SNP discovery that allows for the simultaneous identification and screening of a large number of polymorphic SNP loci for many individuals of a non-model species (Andrews et al. 2016). Analysis is possible at intra- and interspecific levels, making it a useful method for investigating fine-scale population structure, gene flow and phylogeography (Valencia et al. 2018). The RAD-Seq method can also be applied to linkage and QTL mapping studies (Baird et al. 2008) and for determining inbreeding levels and effective population sizes (Andrews et al. 2016). RAD-Seq uses restriction enzymes to cut genomic DNA into numerous fragments, with size selection of the digested fragments reducing the number of fragments requiring sequencing (Andrews et al. 2016; Valencia et al. 2018). Sequencing of these fragments using next generation sequencing platforms enables the discovery and genotyping of a large number of SNPs in a single, simple and cost-effective step, resulting in libraries of an unbiased subset of loci distributed throughout the genome (Valencia et al. 2018).

The double digest RAD-Seq (ddRAD-Seq) method of SNP discovery is a refinement of the RAD-Seq method and uses two restriction enzymes (a common and a rarer cutter) to fragment genomic DNA and includes a precise size selection step that provides improved consistency, homogeneity and reproducibility in the selection of fragments across samples for sequencing (Peterson et al. 2012; Valencia et al. 2018). The method is designed to construct reduced representation libraries that contain a greater number of homologous regions within and between samples and so tends to result in higher sequencing depths at each locus, which ensures polymorphisms detected are true sequence variants and not sequencing errors (Peterson et al. 2012; Valencia et al. 2018). The ddRAD-Seq method reduces the amount of the genome to be sequenced and allows for high coverage sequencing of each marker, providing an excellent, cost-effective alternative for non-model species SNP discovery and genotyping (Steiner et al. 2013).

This study aimed to characterise a set of SNP markers in dugongs (Dugong dugon) to allow for evaluation of population structure of dugong populations along the east coast of Queensland, Australia between Torres Strait and Moreton Bay using the ddRAD-Seq method of SNP discovery and genotyping. It was hypothesised that the greater number of SNP markers developed would provide greater resolution and confidence in assignment probabilities compared to the fewer microsatellite loci and the mtDNA markers employed to deduce population structure in Chapter 3. Comparisons were therefore made between the three marker types to determine the SNP markers’ performance in detecting dugong population structure. Additionally, the study aimed to identify a set of highly discriminatory SNP markers that could be developed into a cost-effective SNP array tool for future dugong population genetics and conservation genetics studies.

4.3 METHODS

Samples, library preparation and sequencing

A subset (n = 47) of DNA extractions from tissue samples collected from free-ranging Queensland dugongs between Torres Strait and Moreton Bay (Figure 4.1) was selected for SNP genotyping using the ddRAD-seq method. These included 34 extractions from between Torres Strait and Shoalwater Bay (used in Chapter 3) as well as 13 extractions previously extracted from southern Queensland dugong tissue samples (Great Sandy Straits, Hervey and Moreton Bays; Seddon et al. 2014; Figure 4.1). These extractions were selected as they returned genotypes at all 22 loci in Chapter 3 and were thus deemed likely to generate high quality sequencing data. DNA was quantified using QuantiFluor (Promega, Madison, Wisconsin) and then visualised on a Genomic DNA ScreenTape. For the initial library establishment phase, three DNA extracts were required, each with 2 μg of DNA at a minimum concentration of 20 ng/μl. For ddRAD-seq of the 47 samples, 800 ng of DNA per sample at a minimum concentration of 20 ng/μl was required.

Figure 4.1. Map showing sample collection locations of dugong tissue samples along the east Queensland coast that underwent genotyping using single nucleotide polymorphisms for population structure analysis. Sample sizes (N) are shown for each location.

Three DNA extracts (one each from Torres Strait, Townsville and Moreton Bay) were selected and an initial establishment phase was undertaken by the Australian Genome Research Facility (AGRF) to determine the appropriate restriction enzymes to use for ddRAD-seq and to determine the size selection to use for the analysis of the full set of 47 samples. The enzyme combination was determined using a pooled sample of the three initial extractions that were digested with eight different restriction enzyme combinations, with libraries made and examined by electrophoresis to determine the enzyme combination least likely to yield repetitive sequences. Following this step, size-selected libraries (narrow or wide range) were made from this trio of samples, and genotyping by sequencing (GBS) performed using an Illumina MiSeq (160 cycles) to assess tag numbers and the level of polymorphism.

Following the establishment phase, ddRAD-seq was performed for the 47 samples using the restriction enzymes found to give optimal results in the establishment phase (EcoRI and HpyCH4IV), following the library preparation protocol of Peterson et al. (2012). DNA digestion of samples with a negative control was conducted using the two restriction enzymes followed by ligation of barcoded adapters compatible with the restriction site overhang, purification and size selection (280-375 bp) of digested-ligated fragments with the Blue Pippin (Sage Science), amplification of libraries via PCR with indexed primers, and finally paired-end sequencing on the Illumina NextSeq500 with 150 cycles in MID-output mode. The establishment phase, library preparation and sequencing of samples were all conducted by the Australian Genome Research Facility (Melbourne).
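The double digest and size selection steps can be illustrated with a small in silico sketch in Python. It is not the AGRF workflow. It assumes complete digestion of a linear sequence, uses the standard recognition sites for EcoRI (G^AATTC) and HpyCH4IV (A^CGT), and applies the 280-375 bp window to the bare fragment length, whereas in the laboratory protocol the window applies to fragments with adapters ligated. The function names are hypothetical.

```python
import re

# (recognition site, cut offset from the start of the site)
ECORI = ("GAATTC", 1)     # G^AATTC
HPYCH4IV = ("ACGT", 1)    # A^CGT


def cut_positions(seq, site, offset):
    """Positions at which the enzyme cuts; lookahead finds overlapping sites."""
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]


def double_digest(seq, enz_a=ECORI, enz_b=HPYCH4IV, min_len=280, max_len=375):
    """Return fragments flanked by one cut from each enzyme and within the
    size window. Assumes complete digestion, so only adjacent cuts pair up."""
    cuts = sorted([(p, "A") for p in cut_positions(seq, *enz_a)] +
                  [(p, "B") for p in cut_positions(seq, *enz_b)])
    fragments = []
    for (p1, e1), (p2, e2) in zip(cuts, cuts[1:]):
        if e1 != e2 and min_len <= p2 - p1 <= max_len:
            fragments.append(seq[p1:p2])
    return fragments


# Toy sequence: one EcoRI site and one HpyCH4IV site 306 bp apart.
toy = "TT" + "GAATTC" + "C" * 300 + "ACGT" + "TT"
print([len(f) for f in double_digest(toy)])
```

Fragments flanked by two cuts from the same enzyme are discarded, which is the property that lets ddRAD-seq recover the same subset of loci across samples.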

Sequence alignment and filtering

SNP genotyping was performed using Stacks software (version 1.47; Catchen et al. 2013) using the process_radtags (demultiplexing, filtering and removal of low quality reads), ustacks (builds loci), cstacks (creates catalogue of loci), sstacks (matches samples against the catalogue of tags) and genotypes (determines sample genotypes from common variants) functions. Sequences in FASTQ.GZ format were de-convoluted for each read according to the inline barcodes with reads also checked for quality and restriction site presence. FASTQ files were created for each sample and these were trimmed to the shortest read size minus two bases to compensate for differences in read length. The alignment process then created stacks of similar reads, also known as tags, for each sample. The tags that were common across samples were then gathered into catalogue tags and genotypes calculated from the common polymorphic sites. Filtering of genotypes was conducted in R using functions in the package dartR (version 1.1.11; R code in Appendix 4.1). Loci with a call rate of less than 95% as well as monomorphic loci were removed prior to analysis.
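The two filtering criteria stated above (call rate of at least 95% and removal of monomorphic loci) can be sketched in Python as follows. This is not the dartR code from Appendix 4.1. It assumes genotypes are coded as 0, 1 or 2 copies of the alternate allele, with None for a missing call, and the function name is hypothetical.

```python
def filter_loci(genotypes, min_call_rate=0.95):
    """Return the column indices of loci that pass both filters.

    genotypes: list of rows (one per individual); each entry is 0, 1, 2
    (copies of the alternate allele) or None for a missing call.
    """
    n_ind = len(genotypes)
    n_loci = len(genotypes[0])
    kept = []
    for j in range(n_loci):
        calls = [row[j] for row in genotypes if row[j] is not None]
        # Filter 1: drop loci genotyped in fewer than 95% of individuals.
        if len(calls) / n_ind < min_call_rate:
            continue
        # Filter 2: drop monomorphic loci (only one allele observed).
        alt_copies = sum(calls)
        if alt_copies == 0 or alt_copies == 2 * len(calls):
            continue
        kept.append(j)
    return kept


# Toy data: 20 individuals, 4 loci.
toy = [[i % 3,                       # polymorphic, fully called
        0,                           # monomorphic
        (i % 2) if i >= 2 else None, # call rate 18/20 = 0.90
        1 if i >= 1 else None]       # call rate 19/20 = 0.95
       for i in range(20)]
print(filter_loci(toy))
```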

Population differentiation and diversity statistics

Genetic population structure was investigated for Queensland dugongs using an admixture model. Similar to the program STRUCTURE, the snmf function in the R package LEA (version 2.6.0; R code in Appendix 4.1) estimates admixture coefficients for individuals from the genotype matrix and provides least-squares estimates of ancestry proportions (Frichot et al. 2014). As in Bayesian clustering methods like STRUCTURE, the total number of ancestral population clusters is unknown a priori. Individuals were assigned to a cluster in the absence of prior population association, with the total number of ancestral population clusters (K) assessed for K = 1-10 using 100 repetitions. The optimal number of clusters was defined using the cross-entropy criterion, with the run that had the smallest cross-entropy value selected. A bar chart was constructed from the best run for the defined number of clusters to visualise the assignment probabilities of each individual to each of the clusters. In addition, a Principal Coordinates Analysis (PCoA) was conducted to visualise the similarities and dissimilarities between individual samples and among geographical locations using the dartR package in R. The dartR package was also used to generate a genomic relatedness matrix network using the gl.grm function.

Genetic diversity metrics (mean observed heterozygosity (HO) and mean expected heterozygosity (HE)) for each cluster and pairwise FST values between clusters were calculated using methods outlined in Chapter 3. FIS values were calculated following Nei (1987) using the STRUCTURE.popgen package in R.
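As an illustration of the diversity metrics named above, the Python sketch below computes mean HO and mean HE across biallelic loci, along with the textbook relationship FIS = 1 - HO/HE. It is a simplified stand-in, not the Chapter 3 method or the Nei (1987) estimators: HE here is the uncorrected 2p(1 - p), with no small-sample correction, and genotypes are assumed to be coded 0/1/2 with None for missing calls.

```python
def heterozygosity(genotypes):
    """Return (mean HO, mean HE, FIS) across loci.

    genotypes: list of rows (one per individual); entries are 0, 1, 2
    (copies of the alternate allele) or None for a missing call.
    """
    n_loci = len(genotypes[0])
    ho_values, he_values = [], []
    for j in range(n_loci):
        calls = [row[j] for row in genotypes if row[j] is not None]
        n = len(calls)
        if n == 0:
            continue  # locus has no data; skip it
        ho_values.append(sum(1 for g in calls if g == 1) / n)
        p = sum(calls) / (2 * n)          # alternate allele frequency
        he_values.append(2 * p * (1 - p)) # uncorrected expected heterozygosity
    ho = sum(ho_values) / len(ho_values)
    he = sum(he_values) / len(he_values)
    fis = 1 - ho / he if he > 0 else float("nan")
    return ho, he, fis


# One locus in Hardy-Weinberg proportions: HO equals HE, so FIS is 0.
print(heterozygosity([[0], [1], [1], [2]]))
```

A deficit of heterozygotes relative to expectation gives a positive FIS, while an excess gives a negative value.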

Identification of highly informative SNPs

Identification of SNPs with the greatest discriminatory power in determining cluster assignment was performed by calculating observed versus expected frequencies of genotypes across the clusters that were identified from the analysis conducted above. Chi square tests comparing observed versus expected genotype frequencies were conducted and the SNPs with chi square values that were significantly larger than expected, using a p-value threshold of 0.05, were deemed to be those with the greatest discriminatory power in detecting population genetic structure. Loci in which the SNP was in the first or last 29 bp of the sequence fragment were discarded, leaving a set of SNP loci that could be used as the basis of future SNP-based assays. To confirm these SNPs were able to detect similar/same population structure results as the analysis using all of the filtered SNPs, a sensitivity analysis was performed whereby the number of clusters detected and the assignment probabilities of each individual to each cluster were determined following methods described above using only the final subset of highly discriminatory SNPs.
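The chi square screening step can be sketched in Python as a test of independence on a cluster-by-genotype contingency table for each SNP. The text above does not specify exactly how expected frequencies were derived, so the use of marginal totals here is an assumption, as are the function names. The critical values are the standard 0.05 thresholds for 1 to 4 degrees of freedom, which covers up to three clusters and three genotype classes.

```python
# Standard chi square critical values at p = 0.05 for df = 1..4.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def chi_square(table):
    """Chi square statistic and degrees of freedom for a table of counts.

    table: one row per cluster, one column per genotype class (0, 1, 2).
    Empty rows and columns are dropped before testing.
    """
    rows = [r for r in table if sum(r) > 0]
    cols = [j for j in range(len(table[0])) if sum(r[j] for r in rows) > 0]
    rows = [[r[j] for j in cols] for r in rows]
    total = sum(sum(r) for r in rows)
    row_tot = [sum(r) for r in rows]
    col_tot = [sum(r[j] for r in rows) for j in range(len(cols))]
    stat = 0.0
    for i, r in enumerate(rows):
        for j, observed in enumerate(r):
            expected = row_tot[i] * col_tot[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(cols) - 1)
    return stat, df


def is_discriminatory(table):
    """True if genotype counts differ among clusters at the 0.05 level."""
    stat, df = chi_square(table)
    return df > 0 and stat > CRITICAL_05[df]


# Two clusters fixed for alternative homozygotes: strongly discriminatory.
print(chi_square([[10, 0, 0], [0, 0, 10]]))
```

With small cluster sizes the expected counts in some cells are low, so the chi square approximation is rough and the resulting SNP list is best treated as a ranking aid.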

4.4 RESULTS

Sequencing and filtering output

A total of 142,887,564 de-multiplexed raw reads across four lanes were generated from the sequencing, yielding 21.58 GB of raw sequence data. Individual sample reads ranged from 437,972 - 7,598,616 with a mean of 2,961,519 reads per individual sample. Prior to filtering, 1,048,574 catalogue tags were identified with 10,690 retained after filtering. Four samples (one each from Cairns, Cardwell, Townsville and Midge Point) were removed from the final analysis due to low call rates within those samples. Three of the removed samples had some of the lowest reads among the 47 samples. A total of 10,690 loci were used in the population admixture analysis to determine the genetic population structure of the remaining 43 samples.

Population differentiation and genetic diversity

Admixture analysis identified that the optimal number of ancestral populations (K) for dugong samples collected along the Queensland coast was three (Figure 4.2). The geographical distributions of two of these clusters showed a distinct latitudinal pattern. Samples collected from north of Airlie Beach to Torres Strait formed Cluster 1 (northern cluster, blue colour in Figure 4.2) and samples collected south of Midge Point to Moreton Bay formed Cluster 2 (southern cluster, red colour in Figure 4.2). There was an abrupt break between Clusters 1 and 2 in the region of Airlie Beach in the Whitsunday Islands. One sample from Airlie Beach in the Whitsunday Islands region had an assignment probability that best fit Cluster 2 (red colour in Figure 4.2) while the two remaining samples from Airlie Beach were assigned to Cluster 1. The only sample remaining in the analysis (after filtering) from Midge Point (Whitsunday Islands region) had an almost 50% split in assignment to Clusters 1 and 2. The third cluster (Cluster 3; green in Figure 4.2) contained a small number of individual dugongs from mostly northern Queensland locations, including Torres Strait, Cairns, Townsville, Bowling Green Bay and Airlie Beach, with 3 of 4 Great Sandy Straits samples having a mixed assignment to Clusters 2 and 3. The multidimensional analysis (PCoA) also indicated the presence of the two main geographical clusters suggested by the admixture analysis, with the individuals that formed the third cluster also shown to be dissimilar to the individuals that formed Cluster 1 and 2 in the PCoA plot (Figure 4.3). However, the genomic relatedness matrix network did not find support for a third cluster, only finding support for the northern and southern clusters (Figure 4.4).

Observed heterozygosity ranged from 0.199 to 0.290 across the three clusters (Table 4.1). Point estimates of pairwise FST values were highest between Clusters 2 and 3 (0.025) and lowest between Clusters 1 and 3 (0.019), although overlapping confidence intervals suggested that pairwise FST values were generally similar across each pair of clusters (Table 4.2).

Table 4.1. Observed and expected heterozygosity (estimated across 1,000 iterations) of the three clusters.

Cluster    HO       HE
1          0.290    0.311
2          0.199    0.226
3          0.215    0.238

Table 4.2. Pairwise FST values (estimated across 1,000 iterations) between the three clusters. Brackets show 95% confidence intervals.

Cluster    1    2                      3
1          -    0.023 (0.019-0.031)    0.019 (0.016-0.023)
2          -    -                      0.025 (0.021-0.032)
3          -    -                      -

Table 4.3. FIS values (estimated across 1,000 iterations) between the three clusters. Brackets show 95% confidence intervals.

Cluster    1    2              3
1          -    0 (0-0.008)    0 (0-0.001)
2          -    -              0.008 (0-0.043)
3          -    -              -

Identification of highly informative SNPs and sensitivity admixture analysis

Chi squared tests of observed vs expected allele frequencies identified a total of 464 SNPs among 444 catalogue tags as having the most discriminatory power in determining genetic population structure (Appendix 4.2). A sensitivity admixture analysis using only these highly discriminatory SNPs identified a very similar result in terms of assignment probabilities to that identified using the full set of 10,690 loci (Figure 4.5). Three clusters were identified again with the north/south split between Clusters 1 and 2 again occurring in the Whitsunday Islands region (blue and red in Figure 4.5). Similar assignment probabilities were identified, although some samples were assigned more closely to one cluster using the smaller number of highly discriminatory SNPs compared to using all SNPs. For example, the three Great Sandy Straits samples that had a mixed assignment between Clusters 2 and 3 were assigned to Cluster 2 using the highly discriminatory SNPs (red colour in Figure 4.5).

[Figure 4.2 image: bar plot; the vertical axis shows assignment probability (0 to 1) and the horizontal axis lists each sample by collection location.]

Figure 4.2. The assignment probabilities of individual dugongs sampled at each coastal location to one of the three clusters (1 – blue, 2 – red, 3 – green) identified by the admixture analysis of 10,690 SNPs. Each individual is represented by a column. Collection locations of individual samples are shown below the plot. The plot is organised according to location from north (left side of the plot) to south Queensland (right side of the plot) with the location of an apparent genetic break in the Whitsunday Islands region circled in black.

Figure 4.3. Principal Coordinate Analysis plot visualising the similarities and dissimilarities between individual samples and geographical locations based on 10,690 SNPs. GSS – Great Sandy Straits.

Figure 4.4. Genomic relatedness matrix network of 43 dugong samples from along the east Queensland coast based on 10,690 SNPs.

[Figure 4.5 image: bar plot; the vertical axis shows assignment probability (0 to 1) and the horizontal axis lists each sample by collection location.]

Figure 4.5. The assignment probabilities of individual dugongs sampled at each coastal location to one of the three clusters (1 – blue, 2 – red, 3 – green) identified by the admixture analysis using the 464 highly discriminatory SNPs identified. Each individual is represented by a column. Collection locations of individual samples are shown below the plot. The plot is organised according to location from north (left side of the plot) to south Queensland (right side of the plot) with the location of an apparent genetic break in the Whitsunday Islands region circled in black.

4.5 DISCUSSION

In this study, I used a RRS method to discover polymorphic SNPs present in the dugong, where a reference genome is not available. The discovery of 10,690 genome-wide SNPs using ddRAD-seq has allowed for evaluation of fine-scale population structuring and genetic diversity of dugongs along the entire east Queensland coast. Admixture and multidimensional analyses identified the presence of three population genetic clusters, with evidence of an abrupt genetic break in the region of the Whitsunday Islands between the two primary clusters. Individuals assigned to Cluster 1 were all sampled in north Queensland, between Torres Strait and Airlie Beach, while individuals assigned to Cluster 2 were sampled from locations in southern Queensland between Clairview and Moreton Bay (Figure 4.2). The analyses also detected the presence of a third cluster, consisting of dugongs sampled from both northern and southern Queensland, potentially indicative of widespread historical dugong dispersal. Post-hoc investigation of discriminatory power for individual SNPs identified a reduced set of SNPs that reliably reproduced the clustering analysis, providing informative potential targets for the design of cost-effective SNP arrays to improve future dugong population genetic studies in Australia.

There was strong concordance in the population structure identified from the different genetic datasets analysed in this thesis. The population structure of Queensland dugongs deduced using the SNP markers discovered in this study was comparable to that obtained using microsatellite markers and mtDNA sequences in Chapter 3. The STRUCTURE analysis conducted in Chapter 3 using 22 microsatellite markers identified the presence of two clusters, where dugongs sampled from Torres Strait south to Airlie Beach were assigned to a northern cluster and those from Midge Point south to Moreton Bay assigned to a southern cluster. The evidence for an abrupt genetic break in the east Queensland dugong population in the Whitsunday Islands region was identified using both the SNP and microsatellite markers. This use of multiple genomic datasets and different analysis methods to detect a major genetic break provides concordant evidence that can be valuable to conservation managers tasked with designing management plans to protect dugong populations.

Both the mtDNA (Chapter 3) and SNP analyses indicate dugong effective breeding dispersal occurred across the region of the genetic break that was not readily detected by the microsatellite analysis. The SNP analysis identified a third cluster that included a mix of samples from both northern and southern Queensland and might indicate dispersal across the Whitsunday Islands region. While the microsatellite analysis clearly assigned all Airlie Beach individuals to the northern cluster, this study found that one Airlie Beach individual, included in both the microsatellite and SNP analysis, had a greater than 90% assignment to the southern cluster according to the SNP analysis. The mtDNA analysis in Chapter 3 found evidence of more dispersal between the northern and southern clusters than the microsatellite analysis detected, with the mtDNA sequence analysis reflecting the widespread and restricted lineages described by Blair et al. (2014). The widespread lineage has been identified in samples collected throughout the dugong’s range while the restricted lineage primarily included samples collected from southern Queensland. Comparison of the ability of different genetic markers to detect population structure in Queensland dugongs indicates resolution can be variable depending on the marker used. MtDNA sequences are often used to detect historical dispersal but only from the maternal line, while the higher mutation rate of microsatellite markers makes them suitable to detect more contemporary dispersal. The greater number of SNP markers that are distributed throughout the dugong genome likely provides the greatest level of resolution in terms of dugong population structure. Hence, the analyses of population structure of Queensland dugongs suggest that dispersal patterns have differed over time.

Other studies comparing the performance of SNPs and microsatellites in detecting population structure for highly vagile species have also found similar overall results, although they have indicated greater resolution of assignments when using SNP markers. Dussex et al. (2018) found similar genetic population structuring of New Zealand fur seals using SNP markers and previously published microsatellite loci. However, the larger number of SNP markers allowed for improved assignment of pups to their colony of origin and to genetic clusters (Dussex et al. 2018). In the Baltic Sea region, three harbour porpoise populations had previously been suggested but their delineation remained unclear using microsatellite markers (Lah et al. 2016). SNP markers generated using ddRAD-seq were able to provide greater resolution, improving assignment of individuals to geographical clusters (Lah et al. 2016).

In addition to analysing Queensland dugong population structure using the 10,690 SNPs discovered, further analysis of these SNPs identified 464 highly discriminatory SNPs. These 464 SNPs were found to contribute most strongly to the observed population structure of Queensland dugongs, with admixture analysis using this smaller number of SNPs able to reliably detect the same population structure as when using the full number of SNPs. Simulations of a hypothetical case suggested only 80 or more SNPs were required to maximise the statistical power to detect differentiation (Morin et al. 2009). The refinement of SNP markers in this study provides opportunities for future dugong genomic research, whereby investigators will be able to develop a SNP array specific for dugong population genetics research based on the filtering of genome-wide SNPs performed in this study. The establishment of cost-effective SNP panels that consist of a small number of highly discriminatory SNP markers opens up further opportunities for genome-wide analysis of dugong population genetics, particularly where funds are limited, DNA quality is low and where conservation effort is needed. The highly discriminatory SNPs in this study were selected because they detected the same population structuring as the entire SNP dataset; for other applications, e.g. pedigree discrimination, alternative SNP markers would need to be selected for an array. This study provides the first steps for genome-wide research in dugongs; further SNP genotyping over a wider geographical area is needed prior to selecting informative SNPs that can be used for the global dugong population.

While the use of SNPs has provided greater insights into dugong population structure, the limited number of samples analysed in this study may be limiting further refinement of population boundaries. In Chapter 3, the microsatellite analysis identified sub-clusters within the two major northern and southern clusters, indicating genetic structuring in the Queensland dugong population may be prominent. These sub-clusters were not evident from the SNP analysis, although this may be due to the smaller number of samples analysed. In addition, the genomic relatedness matrix network did not find support for the third cluster suggested by the admixture and PCoA analyses. Analysis of a greater number of samples will likely aid in accurately identifying Queensland dugong population structure, with larger sample sizes, as opposed to a larger number of loci, shown to increase the power to detect differentiation (Morin et al. 2009). Further sampling from areas where only a limited number of samples were available for analysis, particularly around the Whitsunday Islands genetic break location, will also help refine population boundaries.

The discovery and development of SNP markers using the reduced representation method of ddRAD-Seq has allowed for improved refinement of Queensland dugong population structure without the need for a whole genome or other genomic data. While the microsatellite markers detected similar population structuring, the greater number of genome-wide SNPs have improved assignment probabilities and have indicated historical dispersal occurred across the region of the Whitsunday Islands genetic break not detected by the microsatellite markers. In addition, further refinement of the SNP markers has provided opportunities for future genome-wide analysis of dugong population genetics due to the smaller number of markers required to accurately identify population structure and thus the reduced costs associated with sequencing. After further samples have been analysed, future studies will be able to use the SNP markers to determine relatedness, conduct pedigree analyses and estimate effective population sizes. The characterisation of SNP markers not only allows for evaluation of dugong population genetics but also for future linkage and quantitative trait locus (QTL) mapping to detect genotype-phenotype associations and potentially allows for identification of selected or adaptive genetic pathways when whole genomes for dugongs become available.

CHAPTER 5: Assessment of a novel method, the commensal bacterial network, to detect Queensland dugong contemporary movements

5.1 ABSTRACT

Population genetic studies have provided valuable information about the historical breeding movements that occur between populations; however, their ability to detect contemporary (breeding and non-breeding) movements is limited. An understanding of a species’ contemporary movements is required for effective conservation management as it will help identify recent changes to barriers to movements. Traditional methods used to study contemporary movements are financially expensive, time consuming and only capture movements occurring during the study observation period; therefore, more efficient methods of studying contemporary movements are required, particularly in the marine environment where observational studies are more challenging. The approach investigated in this study is the commensal bacterial network method, which uses shared bacterial genotypes between individuals to build a network that demonstrates the connections between individuals across the landscape. The aim of this study was to investigate the potential of the commensal bacterial network to detect contemporary movements in a marine mammal, the dugong. Faecal samples (n = 423) were collected from dugongs at various locations along the east coast of Queensland, Australia. Attempts to culture Escherichia coli, a bacterial species commonly used for this method, failed, with E. coli cultured from only 1 of 17 samples tested. Therefore, 16S rRNA sequencing was conducted to identify a commensal bacterial species common in dugong faeces. Staphylococcus warneri was identified and culturing was attempted from 101 faecal samples, with S. warneri cultured from 20 samples. Sequences generated from the 20 S. warneri samples using newly developed primers showed limited genetic variability between faecal samples and it was therefore not possible to construct a commensal bacterial network. Further 16S sequencing of faecal samples using an optimal sample collection method is required to determine a bacterial species that could be used to build a commensal bacterial network that demonstrates dugong contemporary movements.

5.2 INTRODUCTION

Effective management of threatened species is paramount to their survival in the face of ever-increasing anthropogenic threats (Escoda et al. 2017). One important parameter informing management is the ongoing monitoring of a species’ dispersal patterns, which assesses the impact that barriers to movements have on the exchange of genetic material between populations (Escoda et al. 2017). This information is important because it allows for assessment of the impact that disturbance events, either natural (e.g. flooding) or human mediated (e.g. development), have on connectivity (Frankham et al. 2004; Groom et al. 2006). Genetic markers, such as microsatellites or single nucleotide polymorphisms (SNPs) in an animal’s genome, allow us to infer the amount of effective migration among populations. The widespread use of genetic markers to investigate the gene flow and genetic connectivity of wild animal populations has provided valuable information about the amount of effective breeding between populations and their long-term migration rates (Storfer et al. 2007; Escoda et al. 2017). Studies utilising genetic markers have been able to define the historical dispersal and genetic population structure of species and the barriers that have restricted effective breeding movements (Storfer et al. 2010; Escoda et al. 2017; Escoda et al. 2019). However, the ability of genetic markers to detect contemporary breeding and non-breeding (e.g. for foraging or socialising) movements is quite limited, as movements can only be detected by these studies if an individual’s genetic assignment differs from that of the area in which it was sampled. An understanding of a species’ contemporary movements and the barriers that may be limiting them is crucial for understanding species’ responses to new alterations or disturbances in their habitat and therefore for future species conservation management (Escoda et al. 2019).

Traditionally, contemporary movements have been investigated through the tracking of individuals using telemetry (Sheppard et al. 2006), or by observation studies that use individually discriminatory physical features to distinguish individuals (Wells et al. 2008). While such study methodologies have successfully identified movements of tracked individuals, they require substantial financial expense and considerable human contact hours to recapture or re-observe individuals at different locations, and are only able to capture the movements that occur during the study period (Clapham 1996; Sheppard et al. 2006; Cope et al. 2015). Traditional contemporary movement studies are more challenging for marine mammals than for most terrestrial species because individuals can only be observed when at or near the surface of the water. Therefore, it would be beneficial to develop new methods of studying contemporary movements in the marine environment.

A method for studying contemporary movements that has been used in the terrestrial environment to study pathogen transmission involves identifying interactions among individuals through their sharing of commensal (non-pathogenic) bacteria. This approach involves building a network that represents the connections between individuals based on the sharing of their commensal bacterial genotypes (Bull et al. 2012). Borrowing from the approach of establishing host contact networks for infectious disease, each individual in the network is represented by a node and the relationship between nodes is diagrammatically represented by an edge indicating contact or interaction between individuals (Keeling and Eames 2005; Bansal et al. 2007; Craft and Caillaud 2011). For commensal bacterial networks, the edge represents the sharing of a bacterial genotype. To draw inferences about population structuring or movement, it is assumed that individuals that share the same bacterial genotype are likely to have been in close contact either through direct social interactions or indirectly through sharing of resources such as space and diet (VanderWaal et al. 2013; VanderWaal et al. 2014b). Using these data, the contemporary movements and social structure of a species can be defined through this network.
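The network definition above can be sketched in a few lines of Python. This is a minimal illustration only and not the analysis pipeline used in this study: the function names (`build_network`, `degree`) and the individual and genotype labels are hypothetical, and it assumes each individual has already been assigned one or more bacterial genotype labels.

```python
from collections import defaultdict
from itertools import combinations


def build_network(genotypes):
    """Build a commensal bacterial network.

    genotypes: dict mapping an individual's ID to the set of bacterial
    genotype labels recovered from that individual (hypothetical input).
    Returns a dict mapping (individual_a, individual_b) to the set of
    genotypes the pair shares. Each key is one edge of the network.
    """
    # Invert the mapping: which individuals carry each genotype?
    carriers = defaultdict(set)
    for individual, types in genotypes.items():
        for genotype in types:
            carriers[genotype].add(individual)

    # Every pair of individuals carrying the same genotype is joined by an edge.
    edges = defaultdict(set)
    for genotype, individuals in carriers.items():
        for a, b in combinations(sorted(individuals), 2):
            edges[(a, b)].add(genotype)
    return dict(edges)


def degree(edges):
    """Count the number of edges attached to each connected individual."""
    counts = defaultdict(int)
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return dict(counts)


# Hypothetical example: D4 carries a genotype nobody else shares,
# so it has no edge and does not appear in the degree counts.
example = {"D1": {"g1"}, "D2": {"g1", "g2"}, "D3": {"g2"}, "D4": {"g3"}}
network = build_network(example)
```

In this sketch an edge only records that a genotype is shared; it says nothing about whether the contact was direct or indirect, which matches the assumption stated above.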

The application of this method to terrestrial species has identified connections in the commensal bacterial network that matched social links among animals based on observed behavioural data (VanderWaal et al. 2013; VanderWaal et al. 2014b), thus providing information about the contemporary groupings of individuals. For example, giraffe pairs with strong social links based on a behavioural data network were more likely to share an Escherichia coli genotype, and individuals with more connections in the social behaviour network had more connections in the bacterial network (VanderWaal et al. 2014a). This is consistent with the notion that for individual giraffes to share the same E. coli genotype they must have moved and come into contact with each other, either directly (e.g. for socialising or breeding) or indirectly via being in the same feeding area. To date, E. coli has been the most commonly used bacterium in commensal bacterial networks due to the well-established methods for genotyping, its large genetic diversity and the ease with which E. coli can be cultured from faecal samples (VanderWaal et al. 2014a; VanderWaal et al. 2014b). However, other bacterial species have also been used to construct commensal bacterial networks. In Australian sleepy lizards (Tiliqua rugosa), a network built using Salmonella enterica (considered in this species to be a commensal rather than a pathogen) showed that pairs of lizards that shared the same bacterial genotype were more connected in the social network than pairs of lizards that did not (Bull et al. 2012).

The benefits of the commensal bacterial network approach compared to traditional movement studies are that it is less expensive, requires less study time and allows larger numbers of individuals to be sampled. Researchers are not required to continually monitor the behaviour of individuals and instead only need to collect a single sample (e.g. faeces) from each. Also, if the only interest is in determining whether individuals from one population interact with or move to another population, then it is not necessary to identify individuals, although host genotyping could be used in conjunction to do so. A limitation of this method is that preliminary studies need to be conducted to determine which commensal bacterial species is most suitable for the target species.

Due to the commensal bacterial network’s success in identifying connections between individuals that were comparable to social networks of terrestrial species, the application of this new methodology to the marine environment was investigated using the dugong (Dugong dugon) as a test species. The dugong is a long-lived marine mammal that inhabits shallow sub-tropical and tropical coastal waters of the Indo-West Pacific region, feeding on meadows of seagrass (Marsh et al. 2011). Previous tracking (using VHF and satellite telemetry) of dugongs in eastern Queensland waters identified that many animals remain relatively sedentary although they are capable of travelling significant distances (up to 560 km; Sheppard et al. 2006). However, the genetic population structure analyses described in Chapters 3 and 4 indicated dugong gene flow to be restricted in some areas of the Queensland coast, indicating breeding movements are limited between the two clusters identified. Pedigree construction by Cope et al. (2015) using genetic and ancillary biological data elucidated movements of dugongs in southern Queensland over the last one or two generations and was able to identify greater movements than detected through telemetry (Sheppard et al. 2006) or direct recapture of individuals. The pedigree analysis indicated a higher annual movement rate than had been detected in the genetic analysis of the southern Queensland populations (Seddon et al. 2014), indicating dugongs are making more non-breeding movements than can be detected through genetic population structure studies. While the pedigree construction method was able to provide some information about recent dugong movements, the method requires collection of data from a large number of individuals and identification of a sufficient number of parent-offspring pairs to allow pedigree construction (Cope et al. 2015; Escoda et al. 2017), limiting its wide application to infer movement patterns.

The aim of the present study was to test the feasibility and effectiveness of the commensal bacterial network method to define contemporary movements of dugongs along the east Queensland coast between Townsville and Moreton Bay. Investigating the contemporary movements of these dugong populations will help us better understand the connections between individuals and populations and identify the current barriers that restrict dugong movements in Queensland, ensuring the species is managed effectively.

5.3 METHODS

Sample collection

Dugong faecal samples (n = 423) were collected between November 2015 and September 2017 between Moreton Bay and Townsville, Queensland, Australia, using four collection methods. First, during field trips to the Townsville, Upstart Bay and Bowling Green Bay dugong foraging grounds, dugong faeces found floating on the surface of the water were collected, placed into individual zip lock bags and frozen at -20°C during field collection. Approximately 5 grams from the centre of each sample was later (≤1 month) transferred into 20% glycerol and stored at -80°C (n = 262, collection method 1). Secondly, during field trips to dugong foraging grounds between Edgecombe Bay and Clairview, dugong faeces found floating on the surface of the water were collected and an aliquot from each was immediately placed into 20% glycerol and stored at -20°C and then transferred to -80°C approximately 1 week later (n = 121, collection method 2). Thirdly, during health assessments of dugongs in Moreton Bay where individual dugongs were caught and lifted onto the vessel (Lanyon et al. 2010b), approximately 5 grams of fresh faeces was collected from each animal (n = 40) and stored in 20% glycerol at -20°C and then transferred to -80°C ≤1 week after collection (collection method 3). Finally, transport media swabs (Amies media, Labtek, Brendale, Queensland, Australia) were used to swab the fresh faecal samples collected from dugongs while on board the vessel during the 2017 health assessment and were stored at 4°C until culturing (n = 16, samples were stored for ≤1 week, collection method 4). Sample collection locations are shown in Figure 5.1.

Figure 5.1. Map of dugong faecal sample collection sites along the east Queensland coast.

A flow chart demonstrating the bacterial culture and sequencing methods utilised in this study is shown in Figure 5.2.

Figure 5.2. Flow chart of dugong faecal bacterial culture and sequencing methods.

Bacterial culture, DNA extraction, PCR and sequencing – E. coli

As E. coli has been the most commonly used bacterium to construct commensal bacterial networks (VanderWaal et al. 2014a; VanderWaal et al. 2014b) and has also previously been cultured from dugong gastrointestinal samples (Nielsen et al. 2013), a subset of faecal samples (n = 17) was randomly selected from each collection region to optimise E. coli culture conditions. Faecal samples stored in glycerol were thawed and inoculated directly onto Brilliance™ E. coli selective medium (ThermoFisher Scientific, Scoresby, Victoria, Australia) and Sheep Blood Agar Columbia (SBA; ThermoFisher Scientific). In addition, samples were placed into 900 µl phosphate-buffered saline (PBS), vortexed for 5 seconds and serially diluted ten-fold, with the 10⁻¹, 10⁻⁴ and 10⁻⁷ dilutions plated onto Brilliance™ E. coli selective medium and SBA. Plates were incubated aerobically at 37°C overnight. After incubation, plates were observed for the characteristic signs of E. coli growth and the concentration of presumed E. coli colonies per dilution was determined. For DNA extraction, up to ten colonies from each culture-positive faecal sample were individually placed into 100 µl of ultrapure water and heated at 100°C for 2 minutes (min). Due to poor culture outcomes for E. coli, pre-enrichment and overnight incubation of samples in both buffered peptone water (BPW) and EC Broth (4 ml, rotating at 37°C overnight, ThermoFisher Scientific) was trialled prior to culturing on Brilliance™ E. coli selective medium.
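The step of determining colony concentration from a dilution plate follows the standard plate-count relationship, which can be sketched as below. This is an illustrative calculation only: the function name `cfu_per_ml` is hypothetical, and the plated volume is an assumed parameter because the volume spread on each plate is not stated in the methods above.

```python
def cfu_per_ml(colony_count, dilution, plated_volume_ml):
    """Estimate colony-forming units per ml of the undiluted suspension.

    colony_count:      colonies counted on the plate
    dilution:          dilution of the plated suspension, e.g. 1e-4 for 10^-4
    plated_volume_ml:  volume spread on the plate in ml (assumed value,
                       not reported in the text)
    """
    if dilution <= 0 or plated_volume_ml <= 0:
        raise ValueError("dilution and plated volume must be positive")
    # Standard plate-count formula: count / (dilution x volume plated).
    return colony_count / (dilution * plated_volume_ml)


# Hypothetical example: 42 colonies on the 10^-4 plate, assuming 0.1 ml plated.
estimate = cfu_per_ml(42, 1e-4, 0.1)
```

Plating widely spaced dilutions, as done here, makes it likely that at least one plate carries a countable number of colonies whatever the starting concentration.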

The bacterial cultures that phenotypically appeared to be E. coli were confirmed as E. coli by PCR using primers uspA-1 and uspA-2 (Chen and Griffiths 1998) in a 10 µL PCR reaction containing 1.25 µM MgCl2, 0.25 µM of each primer, 1x Qiagen (Chadstone, Victoria, Australia) PCR Buffer, 0.1 µl Qiagen HotStarTaq DNA polymerase and 0.25 µM dNTPs, with 1 µL of the DNA extract. PCR conditions were 95°C for 15 min followed by 30 cycles of 95°C for 30 s, 63°C for 30 s and 72°C for 1 min and a final extension at 72°C for 5 min. PCR products were separated by agarose gel electrophoresis to identify positive amplicons. To confirm the identity of the PCR amplicons, amplicons were purified using Exonuclease 1 (5 U per 5 µl PCR product, ThermoFisher Scientific) and shrimp alkaline phosphatase (1 U per 5 µl PCR product, ThermoFisher Scientific). Cycle sequencing was then conducted using Big Dye Terminators v3.1 (Applied Biosystems, Foster City, California, USA) followed by capillary electrophoresis on a 3730 Genetic Analyser (Applied Biosystems). The resulting sequences were aligned using the program Geneious (version 8, Auckland, New Zealand, https://www.geneious.com/). Sequences were compared to the GenBank sequence database using BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi, U.S. National Library of Medicine) to determine the species of bacteria present in the samples based on the highest percentage match to a GenBank deposited sequence.

Identification of commensal bacteria species – 16S sequencing

As E. coli was cultured from only 1 of 17 faecal samples trialled (see results), 16S sequencing was conducted to identify a commensal bacterial species common across dugong faecal samples. Three faecal samples were selected from Moreton Bay (n = 2) and Newry Bay (n = 1), inoculated on SBA media and incubated aerobically at 37°C overnight. Following incubation, ten colonies were randomly selected from each sample and placed into 100 µl of ultrapure water and heated at 100°C for 2 min for DNA extraction. A fragment of the 16S ribosomal RNA gene was amplified by PCR using primers 16S-27F and 16S-1492R (Tsukinowa et al. 2008). The PCR reaction mixture was the same as described above with PCR conditions as follows: 95°C for 15 min, 35 cycles of 95°C for 30 s, 53°C for 30 s and 72°C for 1 min, followed by 53°C for 1 min and a final extension at 72°C for 10 min. Amplicon purification was followed by cycle sequencing using Big Dye Terminators v3.1 (Applied Biosystems) and capillary electrophoresis on a 3730 Genetic Analyser (Applied Biosystems). The 16S sequences were aligned as above and the bacterial species identified based on highest percentage matches to a GenBank deposited sequence.

Bacteria culture - S. warneri

As Staphylococcus warneri was identified from the three tested dugong faecal samples, a subset of dugong faecal samples (n = 101) was randomly selected from each sample collection region and inoculated onto a medium most likely to successfully culture S. warneri. A 10 µl loopful of each dugong faecal sample was added to ~4 ml of BPW, vortexed for 5 seconds and then incubated overnight on a roller at 37°C. Following incubation, samples were plated onto Columbia CNA medium (ThermoFisher Scientific), a medium selective for gram-positive bacteria, and incubated overnight. Up to ten colonies that were phenotypically like S. warneri from each sample were then inoculated onto SBA plates and incubated aerobically at 37°C overnight. This culture method was repeated for each sample.

Investigating the genetic variability of S. warneri

The ability of previously published primers to identify genetic variability between S. warneri sequences from dugong faecal samples was assessed. For all colonies that phenotypically resembled S. warneri, DNA was extracted and regions of the S. warneri genome were sequenced to identify regions with sufficient genetic variability to construct a commensal bacterial network. DNA was extracted from the individual colonies of each faecal sample using the boiling method described above. A literature search (Web of Science) identified five primers which had been used to sequence regions of Staphylococcus spp., including S. warneri, and these primers were trialled. The primers tested amplified regions of the spa (Larsen et al. 2008), tuf (Ke et al. 1999), HSP60 (Goh et al. 1996), sodA (Poyart et al. 2001) and gap genes (Yugueros et al. 2000). A 10 µL PCR reaction contained 1.25 µM MgCl2, 0.25 µM of each primer, 1x Qiagen PCR Buffer, 0.1 µl Qiagen HotStarTaq DNA polymerase and 0.25 µM dNTPs, with 2 µL of DNA extract. PCR conditions were 95°C for 3 min followed by 35 cycles of 95°C for 30 s, 55°C for 30 s and 72°C for 60 s with a final extension of 72°C for 10 min. Cycle sequencing was conducted using Big Dye Terminators v3.1 (Applied Biosystems) followed by capillary electrophoresis on a 3730 Genetic Analyser (Applied Biosystems). Sequences were aligned using Geneious alignment and compared against GenBank uploaded sequences to confirm species identity.

An additional ten primer pairs (Table 5.1) were designed around variable regions identified from analysis of the Whole Genome Sequences (WGS) of S. warneri cultured from Queensland dugong faecal samples (see Chapter 7). Ten colonies from each of five faecal samples were sequenced at three of these gene regions (Staph2F&2R, Staph4F&4R and Staph9F&9R) and were aligned (using Geneious alignment) to determine if colonies from the same faecal sample were identical. Following this, only one colony from each sample was sequenced for the ten regions. The PCR mixture for these primers was as above with PCR conditions: 94°C for 15 min followed by 35 cycles of 94°C for 30 s, 65°C for 45 s and 72°C for 60 s with a final extension of 72°C for 10 min. Amplicon purification was followed by cycle sequencing using Big Dye Terminators v3.1 (Applied Biosystems) and capillary electrophoresis on a 3730 Genetic Analyser (Applied Biosystems).

Sequences generated using the primers designed from the WGS were imported into Geneious (version 9, https://www.geneious.com/) and sequences trimmed to remove low quality regions (at least 30 bp at 5’ end and at least 20 bp at 3’ end). All sequences were run through BLAST to confirm identity as S. warneri, using a sequence match at >98% against GenBank sequences. Following this, one sequence per faecal sample was aligned for all ten regions using Geneious alignment. The sequence alignments for each of the regions were then imported into MEGA7 (version 7.0.26, https://www.megasoftware.net) and a Maximum Likelihood consensus tree constructed using the Tamura-Nei (1993) model with 1,000 bootstraps. In addition, the program PopART (version 1.7, http://popart.otago.ac.nz) was used to generate a TCS network (Clement et al. 2000) to show the relationship among sequences.

Table 5.1. Primer sequences designed in this study and used to amplify potentially variable regions of S. warneri cultured from Queensland dugong faecal samples. Primer locations in relation to reference sequence CP003668.1 (S. warneri, complete genome) from GenBank are shown.

Primer      Sequence 5' to 3'              Location
Staph1F     GATTCATTGTTAGTTTGACTTTGTCAT    2,005,344 – 2,005,371
Staph1R     AAATGGCAGAAATACGTGGTGA         2,005,722 – 2,005,743
Staph2F     GACAAAGTCGCCATGGTCCC           51,520 – 51,540
Staph2R     CGTGAAACGCGTTGAAAATAGT         51,883 – 51,905
Staph3F     GTCTTTTTGACATTGAACCATACTCCT    160,872 – 160,898
Staph3R     TTTGACCTCAGACAAACCGA           161,230 – 161,249
Staph4F     TCATTATGCAGTATTAGCATTACTCAT    232,277 – 232,303
Staph4R     GCACAATTAATTATCGATTACCTTCAA    232,689 – 232,715
Staph5F     TCTGAAAAATCAAAACTTTCTTGTTCA    334,316 – 334,343
Staph5R     AAATTTTTCTAAAGCCATCGTTATCGC    334,674 – 334,700
Staph6F     TGGTCATGATATTCATATGGCAAGT      626,780 – 626,804
Staph6R     ACATTCCATGTATTTCCACATGAAA      627,187 – 627,211
Staph7F     TGGTTGTCGTTACACATGAAATG        985,851 – 985,873
Staph7R     GCCCCCACATTATAAAAATATACTCT     986,219 – 986,244
Staph8F     ATGGTGCGTCTAAAGCTGCT           1,307,761 – 1,307,780
Staph8R     TGACGATAATTCATTTAGTGGCTTT      1,308,131 – 1,308,155
Staph9F     TCATACGGCTTTTTAAAATTCATTTCT    1,645,188 – 1,645,214
Staph9R     ACGCTAGCAGCAATTTTTGAT          1,645,517 – 1,645,537
Staph10F    CCCTGAACATGTTCGAACTGT          2,086,865 – 2,086,885
Staph10R    AGGCTTCTTTAACGTATATATCATTGT    2,087,242 – 2,087,268
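The reference coordinates in Table 5.1 imply an approximate span for each target region. The sketch below works this out under two stated assumptions that are not confirmed in the text: that the listed locations are 1-based inclusive positions on CP003668.1, and that each region runs from the first base of the forward primer to the last base of the reverse primer. The names `PRIMER_SPANS` and `amplicon_span` are hypothetical.

```python
# (start of forward primer, end of reverse primer), copied from Table 5.1.
PRIMER_SPANS = {
    "Staph1": (2005344, 2005743),
    "Staph2": (51520, 51905),
    "Staph3": (160872, 161249),
    "Staph4": (232277, 232715),
    "Staph5": (334316, 334700),
    "Staph6": (626780, 627211),
    "Staph7": (985851, 986244),
    "Staph8": (1307761, 1308155),
    "Staph9": (1645188, 1645537),
    "Staph10": (2086865, 2087268),
}


def amplicon_span(start, end):
    """Length in bp of a region given 1-based inclusive coordinates (assumed)."""
    if end < start:
        raise ValueError("end coordinate must not precede start coordinate")
    return end - start + 1


# Expected span, primers included, for every primer pair in the table.
spans = {name: amplicon_span(*coords) for name, coords in PRIMER_SPANS.items()}
```

Under these assumptions the spans fall between 350 and 439 bp. Usable sequence is shorter than the span once primer sites and low quality ends are trimmed.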

5.4 RESULTS

Bacteria culture – E. coli

Initially, a subset of 17 dugong faecal samples was cultured to identify the presence and genotype variability of E. coli. However, only one of these samples yielded colonies that phenotypically resembled E. coli, and sequencing confirmed the isolate as E. coli (NW16031 from Newry Bay, Table 5.2).

Table 5.2. The number of dugong faecal samples from each collection method and collection location attempted for bacterial culture (E. coli and S. warneri) and the number of isolates confirmed by sequencing as the target bacteria. All bacterial isolates suspected as the target bacteria species based on phenotypic assessment underwent sequencing to confirm identity.

Collection method   Location            # E. coli attempted   # E. coli confirmed   # S. warneri attempted   # S. warneri confirmed
1 (n = 262)         Townsville          1                     0                     10                       2
                    Bowling Green Bay   0                     0                     10                       0
                    Upstart Bay         2                     0                     12                       3
                    Total               3                     0                     32                       5
2 (n = 121)         Edgecombe Bay       0                     0                     2                        0
                    Cape Gloucester     0                     0                     1                        0
                    Airlie Bay          0                     0                     8                        1
                    Newry Bay           7                     1                     7                        0
                    Ince Bay            0                     0                     3                        1
                    Clairview           2                     0                     8                        5
                    Total               9                     1                     29                       7
3 (n = 40)          Moreton Bay         5                     0                     40                       8
                    Total               5                     0                     40                       8
4 (n = 16)          Moreton Bay         0                     0                     16                       0
                    Total               0                     0                     16                       0

Collection method 1 – floating faecal samples collected and ≤1 month later stored in 20% glycerol and frozen at -80°C
Collection method 2 – floating faecal samples collected and immediately stored in 20% glycerol and frozen at -80°C
Collection method 3 – faecal sample collected following defaecation and immediately stored in 20% glycerol and frozen at -80°C
Collection method 4 – swab of faecal sample following defaecation using transport media swab

Bacteria identified in dugong faecal samples – 16S sequencing

As E. coli was not cultured from many of the samples, other bacterial species that could be cultured readily under aerobic conditions were identified. From three dugong faecal test samples, 30 colonies were cultured and comparison of the 16S sequences generated identified the presence of S. warneri, Pseudomonas sp., Bacillus lentus, Bacillus sp. and Bacillus cereus (Table 5.3) based on a greater than 98% match with GenBank deposited sequences. However, only S. warneri (based on minimum 700 bp sequences) was identified in all three dugong faecal samples (Table 5.3).

Bacteria culture – S. warneri

Staphylococcus warneri was cultured from only 20 of the 101 (19.80%) faecal samples attempted. The number of individual S. warneri colonies cultured varied between faecal samples, with some samples only yielding one colony while other samples yielded more than 10 individual colonies. Based on a maximum number of 10 colonies collected, there was an average of 6.4 individual colonies per faecal sample cultured and confirmed by sequencing as S. warneri. Comparison of the different faecal sample collection methods identified collection method 2 (floating faecal samples immediately placed into 20% glycerol) as the method with the highest success rate of S. warneri culture as confirmed by PCR (using primers designed using the WGS) with 24.14% (7 of 29) of isolates identified as the target bacteria (Table 5.2). For collection method 1, the culture success rate was 15.63% (5 of 32), 20% (8 of 40) for collection method 3 and 0% (0 of 16) for collection method 4 (Table 5.2).
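The success rates quoted above can be recomputed directly from the counts in Table 5.2. The sketch below is an arithmetic check only; the names `CULTURE_COUNTS` and `success_rate` are hypothetical. It uses decimal arithmetic with half-up rounding because Python's built-in `round` rounds exact halves to the nearest even digit, which would turn 15.625% into 15.62% rather than the 15.63% reported.

```python
from decimal import Decimal, ROUND_HALF_UP

# Collection method -> (S. warneri confirmed, S. warneri attempted), from Table 5.2.
CULTURE_COUNTS = {1: (5, 32), 2: (7, 29), 3: (8, 40), 4: (0, 16)}


def success_rate(confirmed, attempted):
    """Culture success as a percentage, rounded half-up to two decimal places."""
    if attempted <= 0:
        raise ValueError("attempted must be positive")
    percentage = Decimal(100 * confirmed) / Decimal(attempted)
    return percentage.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Per-method success rates, plus the overall rate reported in the text (20 of 101).
rates = {method: success_rate(*counts) for method, counts in CULTURE_COUNTS.items()}
overall = success_rate(20, 101)
```

With these inputs the per-method values are 15.63%, 24.14%, 20.00% and 0.00%, and the overall value is 19.80%, in agreement with the figures given above.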

Genetic variability of S. warneri

Initially, genetic variability of S. warneri colonies cultured from the same dugong faecal sample was assessed. Alignment and pairwise comparisons of sequences from ten individual colonies from each of five selected faecal samples using the primers Staph2F&2R, Staph4F&4R and Staph9F&9R showed colonies from the same faecal sample to be genetically identical. Therefore, it was assumed that there was no or very limited variability of S. warneri within a single dugong sample and so only one colony was chosen at random to be sequenced from each subsequent sample and used for analyses.

Table 5.3. Bacterial species identified in three dugong faecal samples (10 colonies each) using 16S sequencing methods. Sequences were compared against GenBank deposited sequences with the highest similarity match presented.

Sample     Colony   Species            GenBank #     % similarity
MB15702    1        Pseudomonas sp.    MH398515.1    99.79%
           2        Pseudomonas sp.    MF948937.1    99.81%
           3        S. warneri         CP033098.1    99.44%
           4        N/A
           5        Pseudomonas sp.    KU341738.1    99.62%
           6        Pseudomonas sp.    MH398515.1    100%
           7        Pseudomonas sp.    MH398515.1    100%
           8        N/A
           9        Pseudomonas sp.    MH398515.1    100%
           10       Pseudomonas sp.    MK542822.1    96.27%
MB15703    1        S. warneri         MK713616.1    99.87%
           2        N/A
           3        S. warneri         MK841411.1    100%
           4        S. warneri         MK841411.1    100%
           5        S. warneri         MK841411.1    100%
           6        S. warneri         MK841411.1    100%
           7        Bacillus lentus    LS483476.1    100%
           8        S. warneri         MK841411.1    100%
           9        S. warneri         MF185384.1    100%
           10       S. warneri         MK841411.1    100%
NW16025    1        Bacillus cereus    MK894129.1    100%
           2        S. warneri         MK841411.1    100%
           3        Bacillus cereus    MK894129.1    100%
           4        S. warneri         MK841411.1    99.88%
           5        S. warneri         MK841411.1    99.88%
           6        S. warneri         MK841411.1    99.88%
           7        S. warneri         MK841411.1    99.88%
           8        Bacillus cereus    MK894129.1    100%
           9        N/A
           10       S. warneri         MK841411.1    100%

N/A – species identity not available due to poor quality sequence

It was not possible to amplify two of the proposed regions (primers 7F, 7R, 8F and 8R, designed using WGS) despite numerous attempts at PCR optimisation. The genetic variability of the eight sequenced regions for the 20 S. warneri samples is presented in Table 5.4, with the number of variable sites ranging from 22 to 60.

Table 5.4. The genetic diversity observed at eight genome regions of S. warneri cultured from 20 Queensland dugong faecal samples.

Primers         # samples where sequences available   Number of base pairs sequenced   Variable sites   Parsimony informative sites
Staph1F&1R      17                                    351                              46               45
Staph2F&2R      19                                    343                              30               27
Staph3F&3R      19                                    332                              60               59
Staph4F&4R      17                                    386                              27               24
Staph5F&5R      17                                    332                              37               18
Staph6F&6R      14                                    383                              45               31
Staph9F&9R      16                                    303                              38               37
Staph10F&10R    18                                    357                              22               19
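The two diversity measures in Table 5.4 have simple definitions: a variable site is an alignment column with more than one nucleotide, and a parsimony informative site is a column with at least two nucleotides that each occur in at least two sequences. The sketch below illustrates these definitions on a toy alignment. It is not the software used in this study, the function name `site_summary` is hypothetical, and it assumes that gaps and ambiguous bases are simply ignored, which may differ from how a given program treats them.

```python
from collections import Counter


def site_summary(alignment):
    """Count variable and parsimony informative sites in an alignment.

    alignment: list of aligned nucleotide sequences of equal length.
    Returns (variable_sites, parsimony_informative_sites).
    Characters other than A, C, G and T are ignored (an assumption).
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("alignment must contain sequences of equal length")

    variable = 0
    informative = 0
    for column in zip(*alignment):
        counts = Counter(base.upper() for base in column if base.upper() in "ACGT")
        if len(counts) >= 2:
            variable += 1
            # Informative: at least two states each present in two or more sequences.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return variable, informative


# Toy alignment: columns 2 and 4 split the sequences two against two,
# column 5 differs in a single sequence only.
toy = ["ACGTA", "ACGTA", "ATGCA", "ATGCT"]
summary = site_summary(toy)
```

In the toy example three columns are variable but only two are parsimony informative, because a difference confined to one sequence cannot group sequences together. The same logic explains why the informative counts in Table 5.4 never exceed the variable counts.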

The Maximum Likelihood consensus tree showed that the 20 concatenated sequences from the 20 dugong faecal samples fell into four lineages (Figure 5.3). The reference sequence CP003668.1 formed a monophyletic lineage with four faecal samples from Moreton Bay. The second lineage included only two identical sequences (MB15702, collected in Moreton Bay and TV15027, collected in Townsville). The third lineage was comprised of a single Townsville sample and the fourth lineage consisted of the remaining samples from a range of locations. The TCS network generated showed a similar relationship, with a large number of nucleotide changes separating samples MB15702 and TV15027 from the remaining samples (Figure 5.4). The remaining samples were either identical to each other or were separated by relatively few nucleotide changes. All individuals from Clairview, Ince Bay, Airlie Beach and Upstart Bay were connected in the network along with two Moreton Bay individuals. Another four Moreton Bay individuals were shown to share the same bacterial genotype and TV15027 from Townsville shared the same bacterial genotype with MB15702 from Moreton Bay. There were two individuals which did not share the same bacterial genotype with another individual (MB16889 – Moreton Bay, TV15002 – Townsville).

Figure 5.3. Maximum Likelihood consensus tree constructed using the Tamura-Nei (1993) model (1,000 bootstraps) of concatenated sequences across eight regions of the Staphylococcus warneri genome cultured from twenty dugong faecal samples collected along the east Queensland coast. Includes the concatenated sequences from the reference S. warneri genome CP003668.1 downloaded from GenBank from the same eight regions. Branch lengths represent the number of substitutions per site. MB=Moreton Bay, TV=Townsville, CV=Clairview, IN=Ince Bay, UP=Upstart Bay, AB=Airlie Beach. Numbers above the branches are bootstrap values for the node.

Figure 5.4. TCS network of the 20 Staphylococcus warneri concatenated sequences from eight genome regions. Staphylococcus warneri was cultured from dugong faecal samples collected along the east Queensland coast. The size of the circles represents the number of samples with that genotype and the colour indicates the site of collection. Cross lines over connections between sequences show the number of mutational changes between sequence haplotypes.

5.5 DISCUSSION

A commensal bacterial network is based on the sharing of commensal bacterial genotypes among individuals. A suitable bacterial species is one that is present in all or most samples, is easy to culture and has sufficient genetic variability to facilitate network construction. These attributes enable construction of a network that can demonstrate connections that are either directly (social contact) or indirectly (shared feeding area) made between individuals within the study group. In this study, the feasibility of the commensal bacterial network to infer contemporary movements in the marine environment was trialled using dugong faecal samples collected from along the Queensland coast. While efforts were made to culture a suitable commensal bacterial species, low culture success rates and limited genetic variability meant it was not possible to construct a commensal bacterial network of sufficient depth to infer groupings of dugongs.

While various attempts were made to culture E. coli from dugong faecal samples, including using pre-enrichment, E. coli was only successfully cultured from one sample. E. coli was initially chosen as the commensal bacterium to build the commensal bacterial network based on its successful culture and use in previous studies (VanderWaal et al. 2014a; VanderWaal et al. 2014b) and the fact that it is ubiquitous in the gastrointestinal region of many animals (Dixit et al. 2004; Derakhshandeh et al. 2013; Ju and Willing 2018). Furthermore, E. coli was previously cultured from the gastrointestinal regions (small intestine contents, liver, colon contents, cardiac gland swab, and faeces) from 5 of 36 dugongs from the Moreton Bay area, Queensland, submitted for post-mortem examinations (Nielsen et al. 2013) using similar culture methods to those used here. Therefore, it was assumed that E. coli could be cultured from dugong faecal samples. While it remains unclear why E. coli was not cultured from more samples, it is possible that the bacteria did not survive contact with salt water. Survival of E. coli appears to be dependent on the salinity levels of seawater, whereby decreased salinity increases the survival of E. coli (Anderson et al. 1979). E. coli sensitivity to salinity may therefore be expected to have an effect on culture success for the floating dugong faecal samples collected from the sea, but is less able to explain why E. coli was not cultured from samples collected from dugongs undergoing health assessments, where faecal samples had limited contact with seawater. The 16S sequencing did not identify E. coli as a species cultured from dugong faeces and the microbiome analysis conducted in Chapter 6 did not find E. coli to be present in the 47 dugong faecal samples tested. Therefore, it is likely that E. coli is not a bacterial species commonly found within the faeces of dugongs.

The results of the 16S sequencing indicated that S. warneri was present in dugong faecal samples and may therefore be a more suitable species to use to construct a commensal bacterial network. S. warneri is a gram-positive, catalase-positive, oxidase-negative and coagulase-negative commensal bacterium that has previously been identified in a number of marine fish species (Metin et al. 2014) and is present on the skin of humans (Gil et al. 2000). Unfortunately, culture success rates of S. warneri from dugong faecal samples were not as high as expected (19.8%). Additionally, genetic variability among the 20 S. warneri sample sequences amplified using the newly developed primers was also low. Both the Maximum Likelihood consensus tree and the TCS network showed very few nucleotide differences between sequences, with quite a few identical sequences. This is surprising as samples were collected over a broad geographic area and therefore genetic variability of S. warneri sequences was expected to be greater. Based on these findings, it was not possible to build a commensal bacterial network that could be used to provide information about the contemporary movements and/or social structure of Queensland dugongs.

As the commensal bacterial network method has previously been able to infer the contemporary movements of terrestrial species (Bull et al. 2012; VanderWaal et al. 2014a; VanderWaal et al. 2014b), further research may be able to demonstrate its effectiveness in the marine environment. For the commensal bacterial network method to be more effective at inferring contemporary movements in dugongs, a bacterial species that is easily cultured and has high genetic variability needs to be identified. A possible method to test its efficacy may involve using next generation sequencing methods to identify common bacterial species in dugong faecal samples and then using sequences from a specific region that is highly variable to construct a network, rather than only culturing bacteria and identifying what has cultured. However, doubt has recently been raised over the inter-year variability of bacterial networks and their vulnerability to parameter choice in construction (Proboste et al. 2019).

An understanding of a species’ contemporary movements is important for effective conservation management but remains a challenge. As the genetic population structure analyses in Chapters 3 and 4 identified largely unrestricted movements north and south of the Whitsunday Islands region, it is important that we understand whether movement within the two clusters is changing with ongoing habitat modification and anthropogenically induced perturbations.

CHAPTER 6: Characterisation of the faecal microbiome of dugongs along the east Queensland coast

6.1 ABSTRACT

It has become increasingly clear that the gut microbiome is associated with the health of the host. Research has shown a number of factors influence the microbial composition of the gut, including diet. The aims of this study were to i) characterise and compare the dugong faecal microbiome at various locations along the east Queensland coast between Townsville and Moreton Bay, and ii) investigate the potential relationship between seagrass species diversity and the faecal microbiome. Forty-seven dugong faecal samples (two to five from each of ten known dugong feeding grounds) underwent diversity profiling of the V1–V3 hypervariable regions of the bacterial 16S rRNA gene by polymerase chain reaction and next generation sequencing. The diversity profiling identified Firmicutes (62%) as the most abundant phylum across all samples, followed by Bacteroidetes (30%), Actinobacteria (5%) and Proteobacteria (2%). Principal Coordinate Analysis demonstrated that the three southern Queensland populations (Clairview, Hervey and Moreton Bays) had different microbial community compositions compared to those from more northern Queensland locations. A possible reason for this difference may be the variation in seagrass species presence and distribution along the Queensland coast, with greater species diversity found in north Queensland. This difference in microbial composition suggests that there is some adaptive change associated with location; the implications for the movement of dugongs are unknown. However, the following bacterial families, Clostridiaceae_1, Lachnospiraceae, Peptostreptococcaceae, Ruminococcaceae, Bacteroidaceae and Flavobacteriaceae, were detected in samples from all locations, suggesting they are important in dugong hindgut digestion of seagrass. Further research, including study of the seagrass species dugongs actually consume, will help understand the role diet and other factors play in the composition of the dugong’s microbiome.

6.2 INTRODUCTION

It is well recognised that the gut microbiome plays an important role in the maintenance of a host’s health and disease status (Glad et al. 2010; Nelson et al. 2015; Ahasan et al. 2017b). The gut microbiome is an important contributor to the digestion and utilisation of food material, aiding in the breakdown of complex food particles, fermentation of complex carbohydrates, uptake and utilisation of nutrients and vitamins, and energy harvest and storage (Lavery et al. 2012; Smith et al. 2013; Medeiros et al. 2016; Ahasan et al. 2017b). Gut bacteria have also been linked to the development and maintenance of the immune system and are involved in protection against pathogenic organisms (Lavery et al. 2012; Smith et al. 2013; Nelson et al. 2015; Ahasan et al. 2017b; Erwin et al. 2017). A suite of intrinsic and extrinsic factors have been indicated to influence the community composition of the microbiome of humans, domesticated and captive animals. These include gut physiology and structure (Sommer and Backhed 2013), host genotype (Zhang et al. 2010), developmental stage of the host (Sommer and Backhed 2013), antimicrobial exposure (Jernberg et al. 2010), health status (Ley et al. 2008) and diet (Ley et al. 2008). In the marine environment, age (Eigeland et al. 2012; Smith et al. 2013), health status (Ahasan et al. 2017b) and diet (Nelson et al. 2013b) have been suggested as factors contributing to variations in microbial communities of marine mammals and turtles.

A host’s diet may influence its gut microbiota due to differences in the microbes and nutritional composition that are associated with its prey or food source (Nelson et al. 2013b; Medeiros et al. 2016). A significant difference in the gut microbiota between wild and captive leopard seals was identified, and this was attributed to captive seals being fed only fish caught from one location, whereas wild seals were able to feed on a variety of prey including fish, krill and penguins (Nelson et al. 2013b). Wild green sea turtles sampled in Queensland, Australia were found to have higher bacterial diversity and richness compared to stranded green turtles from the same region, potentially due to the stranded turtles’ poor health state (Ahasan et al. 2017b). The green sea turtle study also highlighted the importance of a diverse gut microbiome for a healthy host, with changes to the microbiota increasing the risk of proliferation of pathogenic bacteria within the gastrointestinal tract, resulting in disease (Ahasan et al. 2017b).

Fundamental to the association between microbial community composition and diet is the segregation of bacterial phyla among individuals or species according to their diet (Ley et al. 2008). The microbiome of herbivores has been shown to be distinct from that of carnivores and omnivores, with terrestrial and marine species also found to have different microbial compositions (Ley et al. 2008; Nelson et al. 2013a; Bik et al. 2016). To be able to access the complex carbohydrates present in plants (celluloses and starches), evolving herbivorous species extended their gut retention times to allow bacterial fermentation in either the enlarged foregut or hindgut (Ley et al. 2008). This allowed for digestion of lower quality forage (Ley et al. 2008). Due to the more complex digestion of plant material, the microbiome of herbivores has been found to be more diverse than that of carnivores and omnivores (Ley et al. 2008; Nelson et al. 2013a). Further, there are microbial community differences amongst herbivorous mammals based on their gut morphology. While a significant proportion of the fermentative bacteria are themselves digested in foregut fermenters, in hindgut fermenters these bacteria are more likely to be excreted in faeces (Ley et al. 2008) and therefore detected in faecal microbiome analysis.

The dugong is a marine hindgut fermenting herbivore with a diet consisting almost exclusively of sub-tropical and tropical seagrass (Marsh et al. 2011). Digestion of seagrass by dugongs includes release of and enzymatic digestion of plant cell contents by the mouthparts (Lanyon and Sanson 2006a), followed by fibre fermentation in their expanded 30 m long colon (Murray et al. 1977; Lanyon and Marsh 1995). Variable digestibility of seagrass species has been reported in the dugong: more fibrous genera such as Zostera and Cymodocea have apparent digestibility of ~60% whilst less fibrous genera (Halophila and Halodule) have apparent digestibility of greater than 85% (Murray et al. 1977). Dugongs lack the enamelled dentition that allows other mammalian hindgut fermenters to masticate high fibre plant material, and instead may preferentially forage on low fibre seagrasses such as Halophila and Halodule spp. (Lanyon and Sanson 2006b). The dugong’s apparent preference for certain seagrass species may be associated with the evolution of a unique microbiome that facilitates their effective digestion (Eigeland et al. 2012).

The hindgut microbiome of dugongs was previously investigated using faecal samples taken from wild individuals from sub-tropical Moreton Bay on the mid-east coast of Australia, and from two individuals in captivity (Eigeland et al. 2012). The analysis identified Firmicutes and Bacteroidetes as the most abundant bacterial phyla overall, and found differences in the gut microbiota between wild and captive individuals that could be attributed to dietary differences and captive conditions (Eigeland et al. 2012). It is likely that diet is pivotal in shaping the gut microbiome of dugongs and so the aim of this study was to characterise and investigate variation in the gut microbiome of dugong populations located along the urbanised east Queensland coast, from Townsville south to Moreton Bay, a distance of greater than 1,000 km. I posited that variation in the gut microbiome is likely to reflect dietary differences related to seagrass availability along the east coast of Queensland, where environmental conditions are likely influencing seagrass diversity and biomass (Lanyon 1991; Preen 1992). It was further hypothesised that differences in Queensland dugong microbiomes may reflect the genetic population structuring identified in Chapters 3 and 4 if ranging movements across the Whitsunday Islands genetic break are restricted, or that they may be able to inform more contemporary movements of individuals or populations.

6.3 METHODS

Study sites and sample collection

Between November 2011 and September 2017, dugong faecal samples were collected from each of ten known dugong feeding grounds in coastal waters between Townsville and Moreton Bay, Queensland, Australia (Figure 6.1). Collection methods and sample storage for each location are described in Chapter 5, with the Hervey Bay samples collected using the same methods described for samples collected from Moreton Bay. In addition, a search of the literature was conducted to identify journal articles and reports which have described the seagrass species found at each sample collection site along the Queensland coast (Table 6.1).

Figure 6.1. Map of dugong faecal sample collection sites along the east Queensland coast. Number of seagrass species present at each site is shown as a gradient colour scale. N, number of samples.

Table 6.1. Dugong faecal sample collection sites (organised from north to south), number (n) of samples collected and seagrass species identified at each site - present (p) and absent (a).

| Site | n | Halodule uninervis | Halodule pinifolia | Halophila ovalis | Halophila spinulosa | Halophila tricostata | Halophila decipiens | Cymodocea rotundata | Cymodocea serrulata | Syringodium isoetifolium | Thalassia hemprichii | Zostera capricorni | Total number of species present | Sources |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Townsville | 5 | p | p | p | p | p | p | p | p | p | p | p | 11 | a,b,c,d,e |
| Bowling Green Bay | 5 | p | p | p | a | a | a | a | a | a | a | p | 4 | a,d,e |
| Upstart Bay | 5 | p | p | p | p | p | p | a | p | a | a | p | 8 | a,d,e,f |
| Airlie Beach | 5 | p | p | p | p | a | a | p | p | p | p | p | 9 | a,d,e |
| Repulse Bay | 5 | p | a | p | p | a | a | a | a | p | a | p | 5 | a,e |
| Newry Region | 5 | p | a | p | p | a | p | a | p | p | a | p | 7 | a,e,f |
| Ince Bay | 5 | p | p | p | p | a | p | a | a | a | a | p | 6 | a,e,f |
| Clairview | 5 | p | p | p | a | a | a | a | a | a | a | p | 4 | a,e,f |
| Hervey Bay | 2 | p | p | p | p | a | p | a | a | a | a | p | 6 | a,d |
| Moreton Bay | 5 | p | a | p | p | a | a | a | p | p | a | p | 6 | c,d,g |

Sources: a - Long et al. (1993), b - McKenzie et al. (2018), c - Preen (1992), d - Seagrass-Watch (http://www.seagrasswatch.org/australia.html), e - https://maps.eatlas.org.au (GBR: seagrass site surveys 1984-2014), f - Coles et al. (2002), g - Young and Kirkman (1975)

DNA extraction, PCR amplification and sequencing

Each of the 47 dugong faecal samples (Figure 6.1) underwent bacterial diversity profiling. Genomic DNA (gDNA) was extracted from the thawed faecal samples using the QIAGEN DNeasy PowerLyzer PowerSoil Kit (Chadstone, Victoria, Australia), following the manufacturer’s instructions. Primers 27F (AGAGTTTGATCMTGGCTCAG; Lane 1991) and 519R (GWATTACCGCGGCKGCTG; Turner et al. 1999) were used to amplify the V1–V3 hypervariable regions of the bacterial 16S rRNA gene by polymerase chain reaction (PCR). PCR conditions were as follows: 95°C for 7 min, followed by 29 cycles at 94°C for 45 s, 50°C for 1 min and 72°C for 1 min, with a final extension at 72°C for 7 min. AmpliTaq Gold 360 mastermix (Life Technologies, Mulgrave, Victoria, Australia) was used for the primary PCR, with a secondary PCR using TaKaRa Taq DNA Polymerase (TBUSA, Mountain View, California, USA) performed to index the amplicons. Resulting amplicons were measured by fluorometry (Invitrogen PicoGreen, ThermoFisher Scientific, Scoresby, Victoria, Australia) and normalised. The equimolar pool was measured by qPCR (KAPA) followed by sequencing using Illumina MiSeq (San Diego, California, USA) with 2 x 300 base pair paired-end chemistry. DNA extraction, library preparation and sequencing were performed by the Australian Genome Research Facility, Adelaide, Australia.
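
Both primers contain degenerate bases (M in 27F; W and K in 519R), so each primer name denotes a small pool of oligonucleotides. As a minimal sketch, the standard IUPAC nucleotide codes can be used to enumerate the concrete sequences each primer represents. The function name `expand_primer` is introduced here for illustration only.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}


def expand_primer(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Primer sequences as given in the text above.
PRIMER_27F = "AGAGTTTGATCMTGGCTCAG"
PRIMER_519R = "GWATTACCGCGGCKGCTG"

for name, primer in (("27F", PRIMER_27F), ("519R", PRIMER_519R)):
    variants = expand_primer(primer)
    print(name, len(variants), variants)
```

Primer 27F expands to two sequences and 519R to four, which is why degenerate primers can match a broader range of bacterial 16S templates than a single fixed sequence.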

Bioinformatics and statistical analyses

Paired-end reads were assembled by aligning the forward and reverse reads using PEAR, version 0.9.5 (Zhang et al. 2014). Primers were identified and trimmed. Trimmed sequences were processed using Quantitative Insights into Microbial Ecology 2 (QIIME 2; Bolyen et al. 2019), USEARCH (version 8.0.1623; Edgar 2010) and UPARSE (Edgar 2013) software. Using USEARCH tools, sequences were quality filtered, full-length duplicate sequences were removed and the remaining sequences were sorted by abundance. Singletons (or unique reads) in the dataset were discarded. Sequences were clustered, followed by chimera filtering using the “RDP_gold” database as reference. To obtain the number of reads in each Operational Taxonomic Unit (OTU), reads were mapped back to OTUs with a minimum identity of 97%. Using QIIME 2, taxonomy was assigned using the Greengenes database (version 13.8, Aug 2013), where OTUs were assigned at a threshold setting of 97% similarity for sequence identity.
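
The dereplication, abundance sorting and singleton removal steps can be illustrated conceptually as follows. This is a simplified sketch and not the USEARCH implementation; it assumes that a singleton is a sequence observed exactly once, and the reads shown are hypothetical.

```python
from collections import Counter


def dereplicate(reads, min_size=2):
    """Collapse identical reads, drop those seen fewer than min_size times,
    and return (sequence, count) pairs sorted by decreasing abundance."""
    counts = Counter(reads)
    kept = [(seq, n) for seq, n in counts.items() if n >= min_size]
    # Sort by abundance (descending); break ties alphabetically for stability.
    kept.sort(key=lambda item: (-item[1], item[0]))
    return kept


# Hypothetical trimmed reads (assumption for illustration only).
toy_reads = ["ACGT", "ACGT", "ACGT", "TTGA", "TTGA", "GGCC"]
unique_sorted = dereplicate(toy_reads)
print(unique_sorted)
```

Sorting by abundance before clustering matters because greedy OTU clustering typically treats the most abundant sequences as the first candidate centroids.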

Read counts for each OTU in each sample were normalised to relative abundance using total-sum normalisation (TSS) in Calypso, version 8.84 (Zakrzewski et al. 2017). The depth of sequence reads was visualised in a rarefaction plot to show how well the sequencing reflected the sample diversity. The OTU tables generated by QIIME 2 were filtered, with low-abundance OTUs (< 1%) and unclassified OTUs grouped into the ‘other’ category. Bar charts were generated in Microsoft Excel (2016) to show the proportion of each OTU at the phylum, family and OTU classifications.
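
Total-sum normalisation and the grouping of low-abundance OTUs can be sketched as below. This is a minimal illustration rather than the Calypso implementation; it assumes the 1% threshold is applied to per-sample relative abundance, and the OTU names and counts are hypothetical.

```python
def total_sum_scale(counts):
    """Convert raw OTU read counts for one sample to relative abundance (%)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {otu: 100.0 * n / total for otu, n in counts.items()}


def group_low_abundance(rel_abund, threshold=1.0, unclassified=(), label="other"):
    """Pool OTUs below the threshold (%) and any unclassified OTUs into one category."""
    grouped = {}
    pooled = 0.0
    for otu, pct in rel_abund.items():
        if pct < threshold or otu in unclassified:
            pooled += pct
        else:
            grouped[otu] = pct
    if pooled > 0:
        grouped[label] = pooled
    return grouped


# Hypothetical read counts for a single sample (assumption for illustration only).
toy_counts = {"OTU_1": 620, "OTU_2": 300, "OTU_3": 75, "OTU_4": 5}
rel = total_sum_scale(toy_counts)
grouped = group_low_abundance(rel)
print(rel)
print(grouped)
```

Because each sample is scaled to sum to 100%, samples sequenced to different depths can be compared directly in stacked bar charts.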

The program Calypso was used to calculate alpha diversity parameters: Chao1 (abundance-based richness estimate), Evenness (how equally abundant OTUs are), Richness (the number of OTUs present), Shannon Index (includes richness and evenness) and Simpson’s Index (number of OTUs present and the relative abundance of each OTU), as well as to determine location-specific microbial diversity and richness. The variation among different locations was evaluated using Bray-Curtis dissimilarity (Bray and Curtis 1957) and visualised using Principal Coordinate Analysis (PCoA) in Calypso. The most abundant families were determined using Calypso to show the shared microbial communities present across all locations, and the unique microbial communities of dugong faeces at each location. Finally, the abundance of family-level bacterial communities amongst the different sampling locations was compared using an Analysis of Variance (ANOVA).
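
The alpha diversity parameters listed above can be written out for a single sample as a minimal sketch. The exact formulas used by Calypso are not stated here, so the following choices are assumptions: the bias-corrected form of Chao1, the natural-log Shannon index, Simpson’s index expressed as 1 - sum(p^2), and Pielou’s evenness. The counts are hypothetical.

```python
import math


def alpha_diversity(counts):
    """Compute common alpha diversity measures from OTU read counts of one sample."""
    counts = [c for c in counts if c > 0]
    n = sum(counts)
    richness = len(counts)                      # observed number of OTUs
    f1 = sum(1 for c in counts if c == 1)       # OTUs seen exactly once
    f2 = sum(1 for c in counts if c == 2)       # OTUs seen exactly twice
    # Bias-corrected Chao1 (assumed variant); defined even when f2 == 0.
    chao1 = richness + f1 * (f1 - 1) / (2 * (f2 + 1))
    props = [c / n for c in counts]
    shannon = -sum(p * math.log(p) for p in props)
    simpson = 1.0 - sum(p * p for p in props)   # Gini-Simpson form (assumed)
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    return {
        "richness": richness,
        "chao1": chao1,
        "shannon": shannon,
        "simpson": simpson,
        "evenness": evenness,
    }


# Hypothetical OTU counts (assumption for illustration only).
toy = alpha_diversity([10, 5, 3, 1, 1, 2])
even = alpha_diversity([5, 5, 5, 5])
print(toy)
print(even)
```

A perfectly even community gives an evenness of 1, whereas Chao1 exceeds observed richness only when rare OTUs suggest that further OTUs remain unsampled.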

6.4 RESULTS

Bacterial sequencing data and depth

Amplicon sequencing of the 47 dugong faecal samples yielded 36,635 to 169,076 raw sequencing reads per sample and after quality control, sequence reads per sample ranged from 17,831-79,700 with a mean of 37,406. The depth of sequencing for each dugong faecal sample was assessed using rarefaction, indicating that each sample was sequenced to an adequate depth and reflected the bacterial diversity present (Figure 6.2).
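
A rarefaction curve of the kind shown in Figure 6.2 can be approximated by repeatedly subsampling reads without replacement and counting the OTUs observed at each depth. The sketch below is illustrative only and is not the procedure implemented in Calypso; the counts, depths and number of iterations are hypothetical.

```python
import random


def rarefaction_curve(counts, depths, n_iter=10, seed=0):
    """Return (depth, mean observed OTUs) for each depth not exceeding the read total."""
    rng = random.Random(seed)
    # Expand counts into one label per read so reads can be drawn without replacement.
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    curve = []
    for depth in depths:
        if depth > len(pool):
            break  # cannot subsample more reads than the sample contains
        observed = [len(set(rng.sample(pool, depth))) for _ in range(n_iter)]
        curve.append((depth, sum(observed) / n_iter))
    return curve


# Hypothetical OTU counts for one sample (assumption for illustration only).
toy_counts = {"A": 50, "B": 30, "C": 15, "D": 5}
curve = rarefaction_curve(toy_counts, depths=[10, 50, 100, 200])
for depth, mean_otus in curve:
    print(depth, mean_otus)
```

A curve that flattens before the maximum depth indicates that additional sequencing would be unlikely to reveal many more OTUs in that sample.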

The bacterial sequences generated identified 98 OTUs that were taxonomically classified to at least the phylum level, using a similarity of at least 97% (Appendix 6.2).

Figure 6.2. Rarefaction plot to show how well the sequencing captured each dugong faecal sample’s bacterial diversity. Coloured lines represent individual dugong faecal samples. Sample names indicate locations: TV=Townsville, BG=Bowling Green Bay, UP=Upstart Bay, AB=Airlie Beach, RP=Repulse Bay, NW=Newry Region, IN=Ince Bay, CV=Clairview, HB=Hervey Bay, MB=Moreton Bay.

Bacterial richness and diversity

Bacterial richness and diversity within dugong faeces were assessed at the OTU level for Chao1, Evenness, Richness, Shannon Index and Simpson’s Index (Figure 6.3a&b), using a rarefied read depth of 1416. Faecal bacterial richness was significantly different (ANOVA, P < 0.05) between sampling locations, with Moreton Bay (Chao1 Index = 30, Richness Index = 25.38) and Repulse Bay (Chao1 Index = 29.20, Richness Index = 23.15) having the highest bacterial richness, while Newry Region (Chao1 Index = 16.20, Richness Index = 14.82) had the lowest. There was no significant difference (ANOVA, P > 0.05) among locations in any of the Evenness, Simpson’s or Shannon indices of bacterial diversity (Figure 6.3b).
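
The comparison of an index among locations rests on the one-way ANOVA F statistic, the ratio of between-group to within-group variance. The following sketch computes F and its degrees of freedom from first principles; it does not reproduce the Calypso output, and the values are hypothetical. Obtaining a P value additionally requires the F distribution (for example via `scipy.stats.f_oneway`, which is not used here in order to keep the example dependency-free).

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical index values for three locations (assumption for illustration only).
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)
print(f_stat, df_b, df_w)
```

A large F indicates that differences among location means are large relative to the spread of samples within each location.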

Figure 6.3a. Box plot of the estimated OTU richness (Chao1 and Richness) of dugong faecal samples collected from different locations along the east Queensland coast. Locations ordered from north to south (left to right). Upper and lower quartiles, mean (X), median and upper and lower values are shown.

Figure 6.3b. Box plot of the estimated OTU diversity indexes (Shannon, Simpson’s and Evenness) of dugong faecal samples collected from different locations along the east Queensland coast. Locations ordered from north to south (left to right). Upper and lower quartiles, mean (X), median and upper and lower values are shown.

Bacterial taxonomy

The OTUs identified in the dugong faecal samples could be assigned to five main bacterial phyla (Figure 6.4). Firmicutes was the most abundant phylum overall, with an average relative abundance of 62%, followed by Bacteroidetes (30%), Actinobacteria (5%) and Proteobacteria (2%). For the faecal samples collected from the southern Queensland dugong populations (Clairview, Hervey Bay and Moreton Bay), the relative abundances of bacterial phyla differed from the overall averages. Actinobacteria made a greater contribution to their bacterial communities, with an average relative abundance of 16%, while Bacteroidetes averaged only 15% for these individuals (Appendix 6.1). One of the samples from Moreton Bay (MB16886) had a large relative abundance of Proteobacteria (74.48%, Figure 6.4).

Figure 6.4. Relative abundance (%) of different bacterial communities at the phylum (p) level in Queensland dugong faecal samples. Locations are arranged north to south (left to right). TV=Townsville, BG=Bowling Green Bay, UP=Upstart Bay, AB=Airlie Beach, RP=Repulse Bay, NW=Newry Region, IN=Ince Bay, CV=Clairview, HB=Hervey Bay, MB=Moreton Bay.

Bacterial diversity was also assessed at a finer taxonomic level, with a total of 12 families identified which had relative abundances greater than 1% (Figure 6.5). The most dominant family among the samples was Clostridiaceae (Firmicutes), with a relative abundance of 30%, followed by Lachnospiraceae (Firmicutes; 21%) and Bacteroidaceae (Bacteroidetes; 18%; Figure 6.5). In dugong faecal samples collected from southern Queensland (Clairview, Hervey and Moreton Bays), Clostridiaceae was found to have a higher relative abundance (40%) than the overall average, and an unclassified Actinobacteria family was also identified in these samples, with an average relative abundance of 15% (Figure 6.5). The top 20 most abundant OTUs for Queensland dugong faecal samples are shown in Figure 6.6.

Figure 6.5. Relative abundance (%) of different bacterial communities at the family level in Queensland dugong faecal samples. Locations are organised from north to south (left to right). TV=Townsville, BG=Bowling Green Bay, UP=Upstart Bay, AB=Airlie Beach, RP=Repulse Bay, NW=Newry Region, IN=Ince Bay, CV=Clairview, HB=Hervey Bay, MB=Moreton Bay. p-phylum, c-class, o-order, f-family.

Figure 6.6. Relative abundance (%) of the 20 most abundant bacterial communities at the OTU level in wild dugong faecal samples collected from Townsville south to Moreton Bay, Queensland. Locations are organised from north to south (left to right). TV=Townsville, BG=Bowling Green Bay, UP=Upstart Bay, AB=Airlie Beach, RP=Repulse Bay, NW=Newry Region, IN=Ince Bay, CV=Clairview, HB=Hervey Bay, MB=Moreton Bay. p-phylum, c-class, o-order, f-family, g-genus, s-species.

Variation in beta diversity

Differences in microbial diversity across faecal samples were assessed using beta diversity. PCoA at the OTU level demonstrated dissimilarity in dugong faecal bacterial communities between locations and showed two groupings (Figure 6.7). Faecal samples from Clairview, Hervey and Moreton Bays, plus one sample from each of Ince Bay and Newry Region, clustered loosely together (Figure 6.7). The remaining samples from the other locations (Ince Bay north to Townsville) formed a second cluster (Figure 6.7).
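
The Bray-Curtis dissimilarity underlying the PCoA is the sum of absolute abundance differences between two samples divided by their combined abundance, giving 0 for identical communities and 1 for communities with no OTUs in common. A minimal sketch of the pairwise matrix follows; the ordination step itself (eigendecomposition of the double-centred matrix) is omitted, and the sample names and counts are hypothetical.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples given as {OTU: abundance}."""
    otus = set(a) | set(b)
    numerator = sum(abs(a.get(o, 0) - b.get(o, 0)) for o in otus)
    denominator = sum(a.get(o, 0) + b.get(o, 0) for o in otus)
    if denominator == 0:
        raise ValueError("both samples are empty")
    return numerator / denominator


def distance_matrix(samples):
    """Return sorted sample names and the square Bray-Curtis matrix."""
    names = sorted(samples)
    matrix = [[bray_curtis(samples[i], samples[j]) for j in names] for i in names]
    return names, matrix


# Hypothetical OTU abundances (assumption for illustration only).
toy_samples = {
    "S1": {"OTU_1": 6, "OTU_2": 4},
    "S2": {"OTU_1": 3, "OTU_2": 7},
    "S3": {"OTU_3": 10},
}
names, matrix = distance_matrix(toy_samples)
for name, row in zip(names, matrix):
    print(name, row)
```

Samples that plot close together in a PCoA are those with small pairwise values in this matrix.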

Figure 6.7. Principal Coordinate Analysis (PCoA) plot showing similarities in dugong faecal bacterial communities at the OTU level amongst the collection sites using Bray-Curtis distances.

Analysis of location-specific bacterial communities

A number of bacterial families were significantly associated with all sample locations whilst some bacterial families were detected at particular locations only (Figure 6.8). Clostridiaceae and Lachnospiraceae were found to be significantly associated with all sampling locations (ANOVA, P < 0.05), with a high association between the Clostridiaceae family and Clairview. An unclassified Actinobacteria family was significantly (ANOVA, P < 0.05) associated with dugong faecal samples collected from Clairview, Hervey and Moreton Bays, i.e. the southern Queensland locations.

1 - p_Firmicutes;c_Clostridia;o_Clostridiales;f_Clostridiaceae_1

2 - p_Firmicutes;c_Clostridia;o_Clostridiales;f_Lachnospiraceae

3 - p_Bacteroidetes;c_Bacteroidia;o_Bacteroidales;f_Bacteroidaceae

4 - p_Firmicutes;c_Clostridia;o_Clostridiales;f_Peptostreptococcaceae

5 - p_Bacteroidetes;c_Bacteroidia;o_Flavobacteriales;f_Flavobacteriaceae

6 - p_Actinobacteria;c_

7 - p_Firmicutes;c_Clostridia;o_Clostridiales;f_

8 - p_Firmicutes;c_Clostridia;o_Clostridiales;f_Ruminococcaceae

Figure 6.8. The eight bacterial families whose abundances differed most significantly among sampling locations based on ANOVA. The significantly different families are shown as a bar chart (P < 0.05) with standard error shown as error bars. p-phylum, c-class, o-order, f-family.

Assessment of the shared bacterial communities across all sampling locations identified 20 shared bacterial OTUs (Figure 6.9). These included six shared bacterial OTUs from phylum Firmicutes and three shared bacterial OTUs from phylum Bacteroidetes (Table 6.2). There were no unique/restricted bacterial OTUs identified in locations from Townsville south to Airlie Beach; however, a total of 13 OTUs were identified only in dugong faecal samples collected between Repulse Bay and Moreton Bay (Figure 6.9). Unique bacterial OTUs were also identified exclusively within each of Repulse Bay, Clairview and Moreton Bay (Table 6.2).

[Venn diagram values: north only = 0; shared = 20; south only = 13]

Figure 6.9. Venn diagram to show the number of shared and unique OTUs in dugong faecal samples collected from north (blue; Townsville, Upstart Bay, Bowling Green Bay, Airlie Beach) and south (pink; Repulse Bay, Newry Region, Ince Bay, Clairview, Hervey Bay, Moreton Bay) Queensland.
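
The counts in a two-set Venn diagram of this kind follow from simple set operations. The sketch below assumes that the OTUs detected at the sites within each region are pooled before the regions are compared; how Calypso pools sites is not stated here, and the site and OTU names are hypothetical.

```python
def shared_and_unique(north_sites, south_sites):
    """Pool OTUs per region, then split into north-only, shared and south-only sets."""
    north = set().union(*north_sites.values())
    south = set().union(*south_sites.values())
    return {
        "north_only": north - south,
        "shared": north & south,
        "south_only": south - north,
    }


# Hypothetical OTU sets per site (assumption for illustration only).
toy_north = {"site_N1": {"A", "B"}, "site_N2": {"B", "C"}}
toy_south = {"site_S1": {"B", "C", "D"}, "site_S2": {"C", "E"}}

venn = shared_and_unique(toy_north, toy_south)
for region, otus in venn.items():
    print(region, len(otus), sorted(otus))
```

Under this pooling assumption, an OTU counts as shared if it occurs at any northern site and any southern site, which is a looser criterion than requiring presence at every individual site.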

Table 6.2. Shared and restricted (unique to specified locations) bacterial OTUs in dugong faecal samples collected from various locations along the east Queensland coast. p-phylum, c-class, o-order, f-family.

| Taxa | Type | Location |
|---|---|---|
| p_Bacteroidetes;c_Bacteroidia;o_Bacteroidales;f_Bacteroidaceae | shared | all |
| p_Bacteroidetes;c_Bacteroidia;o_Flavobacteriales;f_Flavobacteriaceae | shared | all |
| p_Bacteroidetes;c_Bacteroidia;o_ | shared | all |
| p_Firmicutes;c_Clostridia;o_Clostridiales;f_Clostridiaceae_1 | shared | all |
| p_Firmicutes;c_Clostridia;o_Clostridiales;f_Lachnospiraceae | shared | all |
| p_Firmicutes;c_Clostridia;o_Clostridiales;f_Peptostreptococcaceae | shared | all |
| p_Firmicutes;c_Clostridia;o_Clostridiales;f_Ruminococcaceae | shared | all |
| p_Firmicutes;c_Clostridia;o_Clostridiales;f_ | shared | all |
| p_Firmicutes;c_ | shared | all |
| p_Bacteroidetes;c_Bacteroidia;o_Bacteroidales;f_ | restricted | Airlie Beach, Bowling Green Bay, Clairview, Hervey Bay, Ince Bay, Newry Region, Repulse Bay, Townsville, Upstart Bay |
| p_Firmicutes;c_Clostridia;o_Clostridiales;f_Christensenellaceae | restricted | Airlie Beach, Bowling Green Bay, Clairview, Ince Bay, Moreton Bay, Newry Region, Repulse Bay, Townsville, Upstart Bay |
| p_Actinobacteria;c_Coriobacteriia;o_Coriobacteriales;f_ | restricted | Airlie Beach, Bowling Green Bay, Hervey Bay, Ince Bay, Moreton Bay, Newry Region, Repulse Bay, Townsville, Upstart Bay |
| p_Actinobacteria;c_Coriobacteriia;o_Coriobacteriales;f_Eggerthellaceae | restricted | Airlie Beach, Bowling Green Bay, Clairview, Hervey Bay, Ince Bay, Moreton Bay, Newry Region, Repulse Bay, Upstart Bay |
| p_Firmicutes;c_Erysipelotrichia;o_Erysipelotrichales;f_Erysipelotrichaceae | restricted | Airlie Beach, Bowling Green Bay, Ince Bay, Moreton Bay, Newry Region, Repulse Bay, Upstart Bay |
| p_Tenericutes;c_Mollicutes;o_ | restricted | Airlie Beach, Bowling Green Bay, Repulse Bay, Townsville, Upstart Bay |
| p_Proteobacteria;c_Gammaproteobacteria;o_Betaproteobacteriales;f_Burkholderiaceae | restricted | Bowling Green Bay, Clairview, Hervey Bay, Moreton Bay, Repulse Bay |
| p_Actinobacteria;c_ | restricted | Clairview, Hervey Bay, Ince Bay, Moreton Bay |
| p_Verrucomicrobia;c_Verrucomicrobiae;o_Verrucomicrobiales;f_Akkermansiaceae | restricted | Bowling Green Bay, Moreton Bay, Repulse Bay |
| p_Bacteroidetes;c_Bacteroidia;o_Flavobacteriales;f_ | restricted | Bowling Green Bay, Repulse Bay, Townsville |
| p_Actinobacteria;c_Actinobacteria;o_ | restricted | Clairview, Hervey Bay, Moreton Bay |
| p_Bacteroidetes;c_Bacteroidia;o_Bacteroidales;f_Prevotellaceae | restricted | Clairview, Hervey Bay, Moreton Bay |
| p_Bacteroidetes;c_Bacteroidia;o_Bacteroidales;f_Rikenellaceae | restricted | Clairview, Hervey Bay, Moreton Bay |
| p_Firmicutes;c_Clostridia;o_Clostridiales;f_Family_XIII | restricted | Clairview, Hervey Bay, Moreton Bay |
| p_Synergistetes;c_Synergistia;o_Synergistales;f_Synergistaceae | restricted | Bowling Green Bay, Repulse Bay |
| p_Firmicutes;c_Bacilli;o_Bacillales;f_ | restricted | Hervey Bay, Moreton Bay |
| p_Firmicutes;c_Bacilli;o_Bacillales;f_Bacillaceae | restricted | Moreton Bay |
| p_Firmicutes;c_Bacilli;o_Bacillales;f_Planococcaceae | restricted | Moreton Bay |
| p_Firmicutes;c_Clostridia;o_Clostridiales;f_Eubacteriaceae | restricted | Repulse Bay |
| p_Spirochaetes;c_Spirochaetia;o_Spirochaetales;f_Spirochaetaceae | restricted | Repulse Bay |
| p_Tenericutes;c_Mollicutes;o_Izimaplasmatales;f_uncultured_organism | restricted | Moreton Bay |
| p_Verrucomicrobia;c_Verrucomicrobiae;o_Opitutales;f_ | restricted | Clairview |
| p_Verrucomicrobia;c_Verrucomicrobiae;o_ | restricted | Clairview |

6.5 DISCUSSION

This study presents the first comparison of the faecal microbiome of dugongs from multiple foraging sites along the Queensland coast. It provides insights into the likely key bacteria involved in dugong hindgut digestion. Using high-throughput sequencing technology, I was able to characterise the dugong faecal microbiome and detect differences in the bacterial composition of faeces sampled at different sites. The observed differences between sites contribute to our understanding of both the relationship between diet and the faecal microbiome, and dugong contemporary movements in this region of Australia.

This study provides the most comprehensive characterisation of the dugong faecal microbiome to date. Diversity profiling of dugong faecal samples identified Firmicutes (62%) as the most abundant phylum, followed by Bacteroidetes (30%) and Actinobacteria (5%). Previous studies have found different relative abundances of the two main bacterial phyla identified in this study. Analysis of faecal samples from wild dugongs in Moreton Bay found 75.6% Firmicutes and 19.9% Bacteroidetes (Eigeland et al. 2012), while a single captive dugong fed only Zostera marina was found to have 83% Firmicutes and 15% Bacteroidetes (Tsukinowa et al. 2008). This study also provides more accurate information about the core bacterial families that are likely to be important in dugong hindgut digestion due to the greater number of faecal samples analysed and the broader geographic sampling area. Four bacterial families (Clostridiaceae_1, Lachnospiraceae, Peptostreptococcaceae, and Ruminococcaceae) from the Firmicutes phylum and two families (Bacteroidaceae, Flavobacteriaceae) from the Bacteroidetes phylum were shared across all sampling locations (Table 6.2). These bacterial families have been shown to be involved in fibre degradation and the digestion of cellulose, starch and other polysaccharides (Gamage et al. 2017), and therefore may be important bacteria involved in seagrass digestion in all dugongs in this coastal region of Australia. Additionally, unique/restricted bacterial families were identified in the dugong populations of Repulse Bay south to Moreton Bay, with the majority of these within the three most southerly populations (Table 6.2). An Actinobacteria family that could not be identified by QIIME 2 was found in Clairview, Hervey and Moreton Bays; this phylum is known to function in the modulation of gut permeability, metabolism, immune function and the breakdown of resistant starch (Binda et al. 2018).
Interestingly, Prevotellaceae, which functions to break down carbohydrates and proteins, was identified in Clairview, Hervey and Moreton Bay samples, and has also been found in the rumen and hindgut of sheep and cattle (Rosenberg 2014).
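The shared/restricted distinction used in Table 6.2 can be expressed as a simple set comparison. The sketch below is illustrative only: the function name `classify_taxa` is hypothetical, the site list is assumed to be the ten locations named in Table 6.2, and the two example families use the locations reported in that table.

```python
from typing import Dict, Set

# Assumption: the ten sampling locations named in Table 6.2 form the full site list.
SITES: Set[str] = {
    "Airlie Beach", "Bowling Green Bay", "Clairview", "Hervey Bay", "Ince Bay",
    "Moreton Bay", "Newry Region", "Repulse Bay", "Townsville", "Upstart Bay",
}


def classify_taxa(presence: Dict[str, Set[str]], all_sites: Set[str]) -> Dict[str, str]:
    """Label a taxon 'shared' if it was detected at every site, else 'restricted'."""
    return {
        taxon: ("shared" if all_sites <= sites else "restricted")
        for taxon, sites in presence.items()
    }


# Example entries taken from Table 6.2.
presence = {
    "Lachnospiraceae": set(SITES),  # listed as shared across all locations
    "Prevotellaceae": {"Clairview", "Hervey Bay", "Moreton Bay"},
}

labels = classify_taxa(presence, SITES)
print(labels)
```

A set-based representation also makes it straightforward to ask which restricted families occur only in the three southern locations.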

Differences in the bacterial community composition of Queensland dugong faecal samples were identified between regions. In the southern Queensland populations (Clairview, Hervey and Moreton Bays), the relative abundance averages at the phylum and family levels differed from the overall averages. Additionally, the PCoA plot indicated differences in the faecal microbial communities between locations, with the samples from southern Queensland (Clairview, Hervey and Moreton Bays) loosely clustering together, while all but two samples from the remaining more northerly locations were found to cluster together. This indicates that dugongs inhabiting the most southern Queensland coastal waters have a different microbiome compared to dugongs located in more northern coastal waters. Differences in seagrass species availability, or consumption by dugongs in different locations, may explain the different bacterial community compositions and the unique bacterial families found in some locations.
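A PCoA plot is built from pairwise dissimilarities between samples. The dissimilarity measure behind the plot is not stated in this section; the sketch below assumes Bray-Curtis, a common choice for relative abundance data. Purely to illustrate the calculation, it uses the phylum-level percentages quoted earlier in this discussion (this study and Eigeland et al. 2012), with everything other than Firmicutes and Bacteroidetes collapsed into an assumed "other" category.

```python
from typing import Sequence


def bray_curtis(a: Sequence[float], b: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity between two abundance profiles (0 = identical)."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of categories")
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / total


# Order of categories: Firmicutes, Bacteroidetes, other (remainder to 100%).
this_study = [62.0, 30.0, 8.0]
moreton_bay_2012 = [75.6, 19.9, 4.5]

d = bray_curtis(this_study, moreton_bay_2012)
print(round(d, 3))
```

In practice the dissimilarity would be computed between every pair of samples at a finer taxonomic level, and the resulting matrix passed to an ordination routine.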

Diet has previously been demonstrated to be a key determinant of the faecal microbiome (Ley et al. 2008). It is therefore likely that diet plays a role in the bacterial composition of the dugong faecal microbiome. A review of the literature identified differences in the number of seagrass species present at the various collection sites, with greater seagrass species diversity identified in tropical north Queensland and a general gradual reduction towards sub-tropical southern Queensland (Table 6.1), although these data do not indicate relative availability at a site. Seagrass meadows of northern Queensland have higher species diversity and biomass due to more optimal seagrass growth conditions including more sunlight hours, greater rainfall and associated nutrient runoff, and higher sea temperatures in tropical regions (Perez and Romero 1992; Long et al. 1993; Lee et al. 2007). By contrast, the sub-tropical meadows of southern Queensland tend to have lower species diversity and biomass abundance due to poorer growth conditions associated with lower sea temperatures and fewer sunlight hours (Lanyon 1991; Preen 1992). Further, the intertidal seagrass meadows on which dugongs feed can be highly dynamic, with assemblage biomass and plant growth varying substantially (Rasheed and Unsworth 2011). The biomass of a tropical seagrass meadow in north Queensland, where only Halodule uninervis and Halophila ovalis were found, differed significantly between years, with elevated temperatures and decreased river flow significantly correlated with periods of lower seagrass biomass (Rasheed and Unsworth 2011). Further, the location of coastal and intertidal seagrass meadows influences their susceptibility to major climatic events, such as cyclones and raised sea temperatures (Rasheed and Unsworth 2011), with events such as these being more common in north Queensland. Thus, regional differences in seagrass species presence and biomass may explain some of the observed variation in the faecal microbiome in this study.

However, while there appears to be a greater number of seagrass species present in more northern dugong habitats, whether dugongs are consuming all of these species remains unclear. The species actually consumed by dugongs are known only for some locations at specific sampling periods. Dugong stomach contents (n = 65) were analysed from Townsville individuals, with the Halodule, Halophila, Cymodocea and Thalassia genera found in the greatest number of individuals, while Zostera was found in only a small number of individuals, and Syringodium was absent, although it was present at this site (Marsh et al. 1982). There was also considerable variation in the number of seagrass genera identified from individual Townsville dugong stomach contents, ranging from 1 to 5 genera between individuals. In Moreton Bay, analysis of dugong faeces found Z. capricorni, H. ovalis, H. spinulosa, H. uninervis and S. isoetifolium (Preen 1995). However, there are difficulties in identifying seagrass species in dugong faeces, with more fibrous species, such as Z. capricorni, more easily identified as they are less digestible (Lanyon 1991). It has been suggested that dugongs prefer Halophila and Halodule species due to their higher nitrogen content (Marsh et al. 2002; Marsh et al. 2011). Therefore, while more seagrass species are present in north Queensland, dugongs may or may not be consuming them, or they may only incidentally consume them when grazing mixed seagrass assemblages. Further study is required to determine the diet composition of dugongs in different regions and the distribution and abundance of seagrass species available for dugongs to graze. This may be possible using molecular taxonomy methods whereby analysis of dugong faecal samples could be used to identify seagrass species consumed by dugongs.

Analysis of the differences and similarities in the Queensland dugong’s microbiome allows for some interpretation of the population structure of the species. The different bacterial compositions and the finding of unique microbial communities in the southern Queensland populations suggest they are harbouring these bacteria potentially due to the species of seagrass they are consuming, or possibly from sediment or other material (e.g. marine invertebrates; Preen 1995) that they may ingest when feeding. As the three southern Queensland populations were found to have unique bacterial families in their faeces, this indicates they are either feeding in the same areas or are ecologically similar, whilst individuals from more northern populations are not moving into and feeding on the same seagrass meadows, at least within the period of sampling. Analysis of the shared microbiome of dugong populations over time, and comparisons between seasons, may help to reveal how frequently movements occur between populations and whether the microbiome adapts to a new diet, as analysis of samples from a single time period reflects only what the animals have eaten recently. Interestingly, the boundary (Whitsunday Islands) between the two genetically defined clusters in Chapters 3 and 4 was not reflected in the faecal microbiome findings, where the boundary was north of Clairview, well south (>200 km) of the Whitsunday Islands. This indicates either that dugongs from the northern genetic cluster are likely to be foraging south of the genetic boundary or, alternatively, that the seagrass species distribution or dugong preferences for seagrass species do not change across this genetic break location, although the diversity of seagrass species seems to decline south of Airlie Beach.

Comparison of the dugong faecal microbiome with other hindgut herbivores demonstrates shared core bacterial families. The Lachnospiraceae, Flavobacteriaceae, Bacteroidaceae, Clostridiaceae and Ruminococcaceae families, identified in this study as core bacterial families in all dugong populations studied, were also identified as core bacterial families in the white rhinoceros (Bian et al. 2013), horse (Dougal et al. 2013), donkey, rabbit and chinchilla, which are all hindgut fermenters (Donnell et al. 2017). Additionally, other core terrestrial hindgut bacterial families, Prevotellaceae and Rikenellaceae (Bian et al. 2013; Dougal et al. 2013; Donnell et al. 2017), were only found in the three southern Queensland locations, while Spirochaetaceae was only found in Repulse Bay samples. Core bacterial families identified in dugongs have also been identified in the Florida manatee (Bacteroidaceae, Lachnospiraceae, Clostridiaceae, Ruminococcaceae; Merson et al. 2013). The large number of shared bacterial families supports the importance of these families in hindgut digestion in both terrestrial and marine herbivores.

There are some factors that may be biasing or limiting interpretation of the results from this study. Firstly, the samples analysed from Hervey and Moreton Bays were fresh faecal samples collected directly from dugongs on board the vessel, while all other samples were floating faecal samples for which the time since defaecation is unknown. Whether the time since defaecation and contact with seawater affect the microbiome findings requires investigation. Additionally, a number of bacterial communities found in dugong faecal samples could not be classified at either the genus or species level, which has also been found in other marine species (Eigeland et al. 2012; Merson et al. 2014; Ahasan et al. 2017b). This indicates there are novel or undescribed bacteria present in the dugong, or that there are deficits in the microbial databases. Therefore, further work should be conducted to try to classify microbes found in the dugong as they may represent microbial assemblages unique to herbivorous marine mammals. Ontogeny has previously been suggested to influence the microbial composition and diversity of dugong faeces; however, this could not be assessed here as the age of the dugongs from which the majority of samples were collected was unknown. While diet appears to influence the dugong microbiome, the impact of other factors requires investigation.

CHAPTER 7: Antibiotic resistance profiles of bacteria cultured from Queensland dugong faeces

7.1 ABSTRACT

Therapeutic antibiotic usage by humans has resulted in significantly reduced global mortality; however, routine use or over-use has caused a significant increase in the frequency of antibiotic resistant bacteria. While antimicrobial resistance mutations are likely to be found in all species, selection through the continued exposure of bacteria to antibiotics has caused the proportion of resistant bacteria to increase. Recently, concerns have been raised about the frequency of antibiotic resistant bacteria isolated from wildlife and associated environments where direct application of antibiotics does not occur. Antibiotics that are un-metabolised are discharged from human wastewater treatment plants, and are present in run-off from agricultural, horticultural and aquaculture operations, eventually flowing into marine environments. Studies of wild marine species indicate that antimicrobial resistance is an increasingly common and widespread issue in the marine environment. The aim of this study was to assess the antibiotic resistance profiles of bacteria cultured from faeces of wild dugongs sampled at two sites, one adjacent to a large city (Moreton Bay) and one adjacent to an agricultural region (Newry Region), in coastal Queensland, Australia. Nine isolates cultured from four dugong faecal samples and one isolate from associated marine sediment underwent antibiotic susceptibility testing using disc diffusion and assessment of minimum inhibitory concentrations. Antibiotic resistance and virulence genes were investigated using Whole Genome Sequencing. All four Staphylococcus warneri isolates and the Bacillus cereus isolate from dugong faeces were resistant to penicillin, with two S. warneri isolates also displaying resistance to trimethoprim. The four Escherichia coli isolates were all found to be resistant to ampicillin.
Resistance genes, including FosB, BcII, dfrC, blaZ and mdf(A), were identified in the isolates cultured from dugong faeces, with two virulence genes (GAD and lpfA) identified in all E. coli isolates. Lysinibacillus sphaericus cultured from marine sediment collected from a dugong foraging ground (Newry Region) displayed multidrug resistance (fosfomycin, amoxicillin-clavulanic acid, cephalothin and penicillin). These results demonstrate that antibiotic resistant bacteria are present in dugongs in at least two locations along the developed coast of Queensland, highlighting the potential threat that human and agricultural discharges pose to the marine environment. This study provides baseline data for future investigation of the frequency and geographic extent of antimicrobial resistance in Queensland dugongs.

7.2 INTRODUCTION

Since the discovery of antibiotics in the early 20th century, antibiotic usage has significantly reduced the mortality rates of humans and domestic animals (Kummerer 2009; Carvalho and Santos 2016). Antibiotics have been used extensively in human and veterinary medicine to treat bacterial diseases and have been given to livestock to help promote growth and productivity (Kummerer 2009; Carvalho and Santos 2016). They have also been applied to fruit and vegetable crops to control diseases and more recently have been used in aquaculture for therapeutic purposes and as prophylactic agents (Kummerer 2009; Gaw et al. 2016). However, while antibiotics have been able to dramatically improve human and animal health, their usage has resulted in an increasing frequency of multidrug resistant bacteria, with development of new and effective antibiotics unable to keep pace with the increase in resistance (Grundmann et al. 2011; Carvalho and Santos 2016; Ahasan et al. 2017a). Antibiotic resistance is an adaptive genetic trait that is either inherent or acquired by bacteria, allowing them to survive and grow even when an antibiotic is administered in therapeutic concentrations (Carvalho and Santos 2016). Resistance genes are likely found in all wild type bacterial isolates; however, continued exposure of bacteria to antibiotics results in the selection of bacteria with resistance genes that are able to survive and subsequently spread throughout the environment, thereby increasing the frequency of bacteria with antibiotic resistance (Levy and Marshall 2004). Horizontal gene transfer also increases the spread of antibiotic resistance genes (Hawkey and Jones 2009). The appropriate, inappropriate and over-use of antibiotics has increased the rapidity of the spread of antibiotic resistance, which is now widespread, with some infections no longer responding to even last-resort antimicrobials (Levy and Marshall 2004; Carvalho and Santos 2016).
Of particular concern recently have been findings of antibiotic resistant bacteria in marine environments and in animals where therapeutic antibiotics have not been directly administered (Foti et al. 2009; Kummerer 2009; Rose et al. 2009; Wallace et al. 2013; Prichula et al. 2016), prompting the need for further investigation.

Antibiotics enter marine and freshwater environments via a number of pathways (Figure 7.1). One route is through the discharge of water from wastewater treatment plants into coastal and ocean outfalls (Stewart et al. 2014). Antibiotics are largely un-metabolised by the human and animal body, with between 30% and 90% of administered antibiotics excreted in either urine or faeces (Costanzo et al. 2005). Sewage treatment plants are unable to completely remove antibiotics, resulting in the discharge of antibiotics and metabolites into marine environments (Costanzo et al. 2005; Kummerer 2009; Gaw et al. 2016). Another route is from agriculture and horticulture run-off, where un-metabolised or excess antibiotics make their way into surface and ground waters that then flow into marine environments (Kummerer 2009; Carvalho and Santos 2016; Gaw et al. 2016). Antibiotic use by commercial aquaculture farms located in coastal areas also contributes to contamination of the marine environment (Gaw et al. 2016). Aquatic organisms may then be exposed to antibiotic resistant bacteria through their diets, through their gills or via contact with contaminated sediments, with the likelihood of exposure dependent on the proximity to these sources (Gaw et al. 2016).

Figure 7.1. Pathways for antibiotics used in human and veterinary medicine and agriculture to enter the environment. Adapted from Carvalho and Santos (2016).

While antibiotic resistance may occur naturally in marine bacteria, the frequency of antibiotic resistant bacteria in the marine environment has increased, likely due to the increased exposure of aquatic bacteria to antibiotics that are entering these environments (Rose et al. 2009; Wallace et al. 2013). Comparison of the frequency of antibiotic resistant bacteria cultured from stranded pinnipeds from the Northwest Atlantic during 2004-2006 to those sampled in 2010 found there had been a significant increase (11.7% to 44.2%) in antimicrobial resistance over a relatively short period (Wallace et al. 2013). Increased antimicrobial resistance has also been detected in the common bottlenose dolphin (Tursiops truncatus) in Florida, with a statistically significant increase in the percentage of resistance found for 11 of the 15 antibiotics tested across the various bacterial species (Schaefer et al. 2019). Further, multidrug resistance was identified for a number of bacterial species isolated from marine mammals and seabirds from the USA’s north-eastern coast, suggesting that antibiotic resistance is now relatively common and widespread in these marine environments (Rose et al. 2009). In Australia, approximately one third of Enterobacteriales isolates from green sea turtles caught off the coast of Townsville displayed multidrug resistance (Ahasan et al. 2017a), suggesting that antibiotic resistance may be widespread in this coastal marine system. Given these concerns, ongoing research in this area is required to determine baseline levels of antimicrobial resistance, particularly in areas where there is significant urban and/or agricultural run-off. Defining the antibiotic resistance profiles of bacteria isolated from different populations of marine species may improve our understanding of its spread throughout the environment and the potential health threat to the inhabited region.

The aim of this study was to determine if there were antibiotic resistant bacteria in dugong faecal samples collected from coastal Queensland, Australia. The dugong’s long life span and shallow water coastal habitat dependence (Marsh et al. 2011) make it a good indicator species to assess the environmental health of this region. Their coastal habitat means they are in close proximity to human and agricultural activities and are therefore likely to come into contact with antibiotic contaminants from these sources. As the antibiotic resistance of bacteria found in wild sirenians generally, and dugongs specifically, has not been reported previously, this study will provide critical baseline data.

7.3 METHODS

Bacterial isolates

Nine bacterial isolates (randomly selected) from four dugong faecal samples collected from wild dugongs at two major dugong foraging grounds on the east Queensland, Australia coast were used in this study: Moreton Bay (southern Queensland, n = 2) and Newry Region (central Queensland, n = 2; Figure 7.2). Additionally, one marine sediment sample collected from the Newry Region dugong foraging area using a van Veen grab deployed from a boat was investigated. Dugong faeces sample collection, bacterial culture and species identification methods are described in Chapter 5, with these methods followed for the marine sediment sample. The nine bacterial isolates cultured from dugong faecal samples and the one isolate cultured from the marine sediment sample (Table 7.1) were stored in Brain Heart Infusion (BHI) with 20% glycerol at -80°C following culturing.

Moreton Bay (Figure 7.2) supports a population of around 900 dugongs (Lanyon 2003) and is located adjacent to the city of Brisbane, which has a population of 2.5 million people; the Brisbane River passes through the city and flows into western Moreton Bay. The dugong foraging areas are in the Eastern Banks area of Moreton Bay, approximately 15-20 km from the developed western shore, and are protected within the Moreton Bay Marine Park (Lanyon 2003). In contrast, the Newry Region is located approximately 50 km north of Mackay in central Queensland and is surrounded by beef cattle, sugar cane, tomato growing and other horticultural farms, and associated townships. The Newry Region is a Dugong Protected Area A with a relatively small (~122 individuals) dugong population (Sobtzick et al. 2017).

Figure 7.2. Satellite map of a) Moreton Bay and b) Newry Region sample collection sites.

Antimicrobial susceptibility testing

The stored isolates were thawed and re-cultured on SBA and submitted to the Veterinary Laboratory Services (VLS), The University of Queensland, for antimicrobial susceptibility testing. Disc diffusion susceptibility was assessed following Clinical and Laboratory Standards Institute guidelines (CLSI 2018a) using 16 antimicrobials: amoxicillin-clavulanic acid (20/10 µg), ampicillin (10 µg), ceftazidime (30 µg), cephalothin (30 µg), ciprofloxacin (5 µg), clindamycin (2 µg), enrofloxacin (5 µg), erythromycin (15 µg), fosfomycin (50 µg), gentamicin (10 µg), penicillin (10 units), sulphonamide (300 µg), tetracycline (30 µg), ticarcillin-clavulanic acid (75/10 µg), trimethoprim (5 µg), and trimethoprim-sulphamethoxazole (1.25/23.7 µg; all from ThermoFisher Scientific).

Additionally, antimicrobial resistance was assessed using the minimum inhibitory concentration (MIC) method in duplicate for isolates that were identified as intermediate or resistant from disc diffusion testing or for isolates that had resistance genes identified (where antibiotics were available; CLSI 2018a). The antimicrobials tested using MIC were amoxicillin-clavulanic acid, ampicillin, cephalothin, cephazolin, clindamycin and penicillin (Sigma-Aldrich, Castle Hill, NSW, Australia).

Clinical and Laboratory Standards Institute (CLSI 2018b) guidelines for bacteria isolated from animals were used for interpretative breakpoints for disc diffusions and MICs as there are none specifically available for dugongs. Staphylococcus spp. breakpoints were used for Lysinibacillus sphaericus and Bacillus cereus. There are no CLSI breakpoints available for fosfomycin, so a zone diameter of >32 mm was considered sensitive and a zone diameter of 0 mm was considered resistant for the disc diffusion susceptibility testing.
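Interpretive breakpoints turn a measured zone diameter into a susceptibility category. A minimal sketch of that lookup is given below, assuming a larger zone indicates greater susceptibility. The function name `interpret_zone` and the breakpoint values in the example are hypothetical placeholders, not CLSI values and not the values used in this study.

```python
def interpret_zone(zone_mm: float, susceptible_min: float, resistant_max: float) -> str:
    """Return 'S', 'I' or 'R' for a disc diffusion zone diameter in mm.

    susceptible_min: smallest zone diameter still read as susceptible.
    resistant_max: largest zone diameter still read as resistant.
    """
    if zone_mm < 0:
        raise ValueError("zone diameter cannot be negative")
    if resistant_max >= susceptible_min:
        raise ValueError("resistant_max must be below susceptible_min")
    if zone_mm >= susceptible_min:
        return "S"
    if zone_mm <= resistant_max:
        return "R"
    return "I"


# Hypothetical breakpoints: S at 20 mm or more, R at 14 mm or less.
for zone in (25, 17, 10):
    print(zone, interpret_zone(zone, susceptible_min=20, resistant_max=14))
```

Keeping the breakpoints as arguments rather than constants reflects the fact that they differ by antimicrobial and by bacterial group.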

Whole Genome Sequencing

The ten bacterial isolates used for antimicrobial susceptibility testing were also submitted for Whole Genome Sequencing (WGS) to assess the presence of resistance and virulence genes (E. coli only) and for multilocus sequence typing (MLST, E. coli only). Isolates frozen in BHI and glycerol were re-cultured onto SBA medium and DNA extracted using the QIAamp DNA mini kit from QIAGEN following manufacturer’s instructions for bacteria.

DNA quality was assessed using the Qubit fluorometer (ThermoFisher Scientific). DNA was submitted to the Ramaciotti Centre for Genomics (Sydney, Australia) for library preparation (Nextera XT, Illumina), sequencing (Illumina MiSeq 250bp PE run) and quality control.

Bioinformatics procedures were conducted in Geneious (version r9, Auckland, New Zealand, https://www.geneious.com/). All reads were trimmed to remove low quality sequence using an error probability limit of 0.1, with at least 20 bp trimmed from the 5’ end and 30 bp from the 3’ end. Genomes were assembled against a reference genome (downloaded from GenBank) using Geneious alignment with default parameters, with CP026085.1 used for E. coli isolates, CP003668.1 used for S. warneri isolates, CP016316.1 used for B. cereus and CP015224.1 used for Lysinibacillus sphaericus.
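The error probability limit used for trimming can be related to the Phred quality scale, which is also the basis of the %Q30 values reported for the sequencing runs. The sketch below applies the standard Phred definition, Q = -10 log10(p). The function names are hypothetical, and how Geneious applies the limit internally is not shown here.

```python
import math


def error_prob_to_phred(p: float) -> float:
    """Convert a base-call error probability to a Phred quality score."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to a base-call error probability."""
    return 10 ** (-q / 10)


# An error probability limit of 0.1 corresponds to roughly Q10.
print(round(error_prob_to_phred(0.1), 2))

# Q30 corresponds to an error probability of about 1 in 1,000.
print(phred_to_error_prob(30))
```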

Consensus sequences were uploaded to ResFinder (threshold of 80% identity; http://cge.cbs.dtu.dk/services/ResFinder/) to identify antimicrobial resistance genes and chromosomal point mutations (E. coli only) and to the CARD database (http://arpcard.mcmaster.ca) to identify antimicrobial resistance genes. E. coli sequences were uploaded to VirulenceFinder (threshold of 90% identity; https://cge.cbs.dtu.dk/services/VirulenceFinder/) to identify virulence genes and submitted to EnteroBase (http://enterobase.warwick.ac.uk/) to identify the MLST. VirulenceFinder does not have an option to scan for virulence genes for the other bacterial species. E. coli sequence types from the A, B1, B2, D, AxB1 and ABD lineages used by EnteroBase were compared against the sequence types identified from the E. coli WGS in this study, with the seven housekeeping genes (adk, fumC, gyrB, icd, mdh, purA, and recA) associated with each of the sequence types downloaded from http://enterobase.warwick.ac.uk/, and the concatenated sequences for each sequence type aligned using MEGA7 (version 7.0.26, https://www.megasoftware.net). A Maximum Likelihood phylogenetic tree was constructed using the Tamura-Nei model with 1,000 bootstraps to display the relationship between the sequence types identified in this study and those downloaded from EnteroBase. Average nucleotide identity was calculated between the E. coli WGS using http://enve-omics.ce.gatech.edu/ani/index following Goris et al. (2007) to determine the similarity of isolates.
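The concatenation of the seven housekeeping genes for each sequence type can be sketched as follows. This is an illustration under stated assumptions: the locus order follows the order in which the genes are listed above, the function name `concatenate_loci` is hypothetical, and the three-base sequences are toy placeholders rather than real allele sequences.

```python
from typing import Dict

# Locus order assumed to follow the order given in the text.
MLST_LOCI = ("adk", "fumC", "gyrB", "icd", "mdh", "purA", "recA")


def concatenate_loci(allele_seqs: Dict[str, str]) -> str:
    """Join the seven MLST allele sequences in a fixed locus order."""
    missing = [locus for locus in MLST_LOCI if locus not in allele_seqs]
    if missing:
        raise KeyError(f"missing loci: {', '.join(missing)}")
    return "".join(allele_seqs[locus].upper() for locus in MLST_LOCI)


# Toy placeholder sequences (not real alleles).
toy_alleles = {
    "adk": "atg", "fumC": "CCA", "gyrB": "GGT", "icd": "TTA",
    "mdh": "ACG", "purA": "CAT", "recA": "GAC",
}

concatenated = concatenate_loci(toy_alleles)
print(concatenated)
```

Using a fixed locus order for every sequence type keeps homologous positions in the same columns when the concatenated sequences are aligned.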

7.4 RESULTS

Bacterial isolate details

Collection location and species identity (dugong faecal isolates identified in Chapter 5) for all tested isolates are presented in Table 7.1. The isolate cultured from marine sediment was identified as Lysinibacillus sphaericus, as its 16S sequence had the highest similarity (>98%) to a Lysinibacillus sphaericus sequence previously submitted to GenBank (CP015224.1).

Table 7.1. Collection location and species identity of ten bacterial isolates cultured from wild dugong faeces (n=9) and marine sediment (n=1).

Dugong/Sediment Sample ID | Isolate ID | Type | Species | Location
Sediment 4 | Sediment 4-4 | Marine sediment | Lysinibacillus sphaericus | Newry Region
NW16025 | NW16025-3 | Dugong faeces | Bacillus cereus | Newry Region
NW16025 | NW16025-2 | Dugong faeces | Staphylococcus warneri | Newry Region
MB15702 | MB15702-3 | Dugong faeces | Staphylococcus warneri | Moreton Bay
MB15703 | MB15703-3 | Dugong faeces | Staphylococcus warneri | Moreton Bay
MB15703 | MB15703-4 | Dugong faeces | Staphylococcus warneri | Moreton Bay
NW16031 | NW16031-2 | Dugong faeces | Escherichia coli | Newry Region
NW16031 | NW16031-3 | Dugong faeces | Escherichia coli | Newry Region
NW16031 | NW16031-6 | Dugong faeces | Escherichia coli | Newry Region
NW16031 | NW16031-7 | Dugong faeces | Escherichia coli | Newry Region

Antimicrobial susceptibility testing

The four S. warneri isolates cultured from Queensland dugong faecal samples were resistant to penicillin and two isolates were also resistant to trimethoprim based on disc diffusion susceptibility testing (Table 7.2). All E. coli isolates were resistant only to ampicillin (Table 7.2). The B. cereus isolate was resistant to penicillin and was found to have intermediate susceptibility to cephalothin and clindamycin (Table 7.2). The marine sediment isolate was resistant to amoxicillin-clavulanic acid, cephalothin, fosfomycin and penicillin and was also found to have intermediate susceptibility to clindamycin (Table 7.2).

MIC testing of isolates against the antimicrobials to which they were resistant or intermediate in disc diffusion testing indicated that all bacterial species were resistant to all antimicrobials tested, except for clindamycin, to which the B. cereus and L. sphaericus isolates had intermediate susceptibility (Table 7.3).

Table 7.2. Antimicrobial susceptibility determined by disc diffusion of bacteria cultured from wild dugong faecal samples (NW=Newry Region, MB=Moreton Bay) and a marine sediment sample from Queensland, Australia. R=resistant, I=intermediate, S=susceptible and Blank=not tested. Zone of inhibition in mm shown for fosfomycin.

Sample ID-Isolate ID | Sediment4-4, NW16025-3, NW16025-2, MB15702-3, MB15703-3, MB15703-4, NW16031-2, NW16031-3, NW16031-6, NW16031-7
Species | L. sphaericus, B. cereus, S. warneri, S. warneri, S. warneri, S. warneri, E. coli, E. coli, E. coli, E. coli
Ciprofloxacin | S, S, S, S, S, S, S, S, S, S
Enrofloxacin | S, S, S, S, S, S, S, S, S, S
Ceftazidime | S, S, S, S
Amoxicillin-Clavulanic acid | R, S, S, S, S, S, S, S, S, S
Ticarcillin-Clavulanic acid | S, S, S, S
Cephalothin | R, I, S, S, S, S, S, S, S, S
Penicillin | R, R, R, R, R, R
Ampicillin | R, R, R, R
Trimethoprim | R, R, S, S
Sulphonamide | S, S, S, S
Trimethoprim-Sulphamethoxazole | S, S, S, S, S, S, S, S, S, S
Fosfomycin | R (0 mm), S (32 mm)
Tetracycline | S, S, S, S, S, S, S, S, S, S
Gentamicin | S, S, S, S, S, S, S, S, S, S
Erythromycin | S, S, S, S, S, S
Clindamycin | I, I, S, S, S, S

Table 7.3. Minimum inhibitory concentration (MIC, µg/mL) results of bacteria cultured from Queensland dugong faecal samples (NW=Newry Region, MB=Moreton Bay) and a marine sediment sample. Blank=not tested.

Sample ID-Isolate ID | Species | Amoxicillin-Clavulanic acid, Cephalothin, Penicillin, Ampicillin, Cephazolin, Clindamycin
Sediment 4-4 | Lysinibacillus sphaericus | 8/4, 64, >64, >16, 1
NW16025-3 | Bacillus cereus | 8/4, 64, >64, >16, 1
NW16025-2 | Staphylococcus warneri | 4
MB15702-3 | Staphylococcus warneri | 4
MB15703-3 | Staphylococcus warneri | 4
MB15703-4 | Staphylococcus warneri | 8
NW16031-2 | Escherichia coli | 4
NW16031-3 | Escherichia coli | 4
NW16031-6 | Escherichia coli | 4
NW16031-7 | Escherichia coli | 4

Whole Genome Sequencing

A total of 47,790,042 reads were obtained from the Illumina MiSeq sequencing for the ten isolates, with a yield of 10.49 Gbp and 97.4% passing quality control. The number of reads for each isolate and the percentage of reads assembled to make a single contig using the reference sequences are presented in Table 7.4.

Table 7.4. Summary of sequence reads and alignment to reference sequence information of post quality control WGS to reference sequences. Reference sequences were downloaded from GenBank. CP026085.1 used for E. coli isolates, CP003668.1 used for S. warneri isolates, CP016316.1 used for B. cereus and CP015224.1 used for L. sphaericus. %Q30 is the percentage of bases >Q30.

Sample ID-Isolate ID | Species | # of reads from sequencing | Mean sequence length of all reads | %Q30 | % of reads assembled to form contig | Reference GenBank sequence length
Sediment 4-4 | Lysinibacillus sphaericus | 3,309,622 | 244.5 ± 24.8 | 89.8 | 90.12 | 4,692,801
NW16025-3 | Bacillus cereus | 5,037,580 | 243.8 ± 26.7 | 89.7 | 85.77 | 5,218,997
NW16025-2 | Staphylococcus warneri | 4,102,368 | 243 ± 28.4 | 91.8 | 84.92 | 2,486,042
MB15702-3 | Staphylococcus warneri | 4,938,256 | 237.4 ± 35.4 | 92.0 | 85.26 | 2,486,042
MB15703-3 | Staphylococcus warneri | 3,460,458 | 243.9 ± 27.1 | 90.4 | 76.52 | 2,486,042
MB15703-4 | Staphylococcus warneri | 3,909,866 | 239.0 ± 34.0 | 93.3 | 78.79 | 2,486,042
NW16031-2 | Escherichia coli | 3,441,848 | 244.3 ± 24.7 | 84.6 | 90.67 | 4,833,062
NW16031-3 | Escherichia coli | 3,438,936 | 244.8 ± 24.1 | 83.9 | 90.34 | 4,833,062
NW16031-6 | Escherichia coli | 4,778,254 | 244.8 ± 24.0 | 83.2 | 90.49 | 4,833,062
NW16031-7 | Escherichia coli | 2,943,366 | 245.5 ± 25.2 | 82.8 | 90.61 | 4,833,062

Submission of the E. coli WGS to the EnteroBase database identified all four E. coli isolates as belonging to sequence type ST196. The maximum likelihood tree constructed in MEGA7 showed that ST196 grouped most closely with sequence types belonging to the AxB1 E. coli clade (Figure 7.3). Average nucleotide identity (Table 7.5) among the four E. coli WGS, all from isolates cultured from a single dugong faecal sample, indicated very high similarity (>99.9%) among all sequences.

Table 7.5. Average nucleotide identity between E. coli WGS cultured from a dugong faecal sample (NW16031) collected from Newry Region, central Queensland, Australia.

| | NW16031-2 | NW16031-3 | NW16031-6 | NW16031-7 |
|---|---|---|---|---|
| NW16031-2 | | 99.97% | 99.97% | 99.97% |
| NW16031-3 | | | 99.96% | 99.97% |
| NW16031-6 | | | | 99.96% |
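As a quick consistency check on the ">99.9%" statement, the pairwise values in Table 7.5 can be tabulated and summarised. The sketch below is illustrative only: it assumes the table is read as an upper triangle (row isolate versus column isolate), and the helper name `summarise_ani` is hypothetical rather than part of any tool used in this thesis.

```python
from itertools import combinations

# Pairwise ANI values (%) transcribed from Table 7.5, read as an upper triangle
# (an assumption about the table layout).
ani = {
    ("NW16031-2", "NW16031-3"): 99.97,
    ("NW16031-2", "NW16031-6"): 99.97,
    ("NW16031-2", "NW16031-7"): 99.97,
    ("NW16031-3", "NW16031-6"): 99.96,
    ("NW16031-3", "NW16031-7"): 99.97,
    ("NW16031-6", "NW16031-7"): 99.96,
}

isolates = ["NW16031-2", "NW16031-3", "NW16031-6", "NW16031-7"]


def summarise_ani(pairs, threshold):
    """Return (lowest, highest, all_above_threshold) for pairwise ANI values."""
    values = list(pairs.values())
    return min(values), max(values), all(v > threshold for v in values)


# Sanity check: every unordered pair of the four isolates is present exactly once.
assert set(ani) == set(combinations(isolates, 2))

lowest, highest, all_similar = summarise_ani(ani, threshold=99.9)
print(f"ANI range: {lowest:.2f}-{highest:.2f}%, all > 99.9%: {all_similar}")
# prints "ANI range: 99.96-99.97%, all > 99.9%: True"
```

The six values span only 0.01 percentage points, which is the basis for treating the four isolates as effectively clonal.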

[Figure 7.3: phylogenetic tree image; clades are labelled by lineage (A, AxB1, B1, B2, ABD, D).]

Figure 7.3. Phylogenetic relationship of E. coli sequence types, including ST196. The maximum likelihood phylogenetic tree was constructed in MEGA7 using the Tamura-Nei model, with bootstrap support calculated from 1,000 replicates. Branch lengths are measured in the number of substitutions per site. ST196 was derived from concatenated sequences generated from an E. coli isolate cultured from a Queensland dugong faecal sample and is circled in red. A selection of subtypes from the A, B1, B2, D, AxB1 and ABD lineages was downloaded from the EnteroBase database.

A number of resistance genes were identified by comparing the WGS with previously published gene sequences. The ResFinder database identified one resistance gene in the genome of the B. cereus isolate: fosB1 (87.41% identity), which encodes fosfomycin resistance (Table 7.6). The CARD database also identified fosfomycin resistance (fosB, 88.41% identity) and additionally identified cephalosporin resistance (BcII, 91.37% identity) in the B. cereus isolate (Table 7.6). According to the CARD database, the Staphylococcus warneri isolates MB15702-3 and NW16025-2 carried a trimethoprim resistance gene (dfrC, both 100% identity) and MB15703-4 carried the beta-lactamase gene blaZ (95% identity) (Table 7.6). The ResFinder database identified macrolide resistance (mdf(A) gene, 98.78% identity) in the four E. coli isolates (Table 7.6). Genotypic resistance was found for some isolates where phenotypic resistance had been indicated by the susceptibility testing (Table 7.6). VirulenceFinder identified two virulence genes in all four E. coli sequences: GAD and lpfA, both at 100% identity (Table 7.7).

Table 7.6. Resistance genes identified in the WGS of bacteria cultured from Queensland dugong faecal samples. HSP is the length of the alignment between the best matching resistance gene and the corresponding sequence in the WGS. N/A = not tested.

| Sample ID-Isolate ID | Species | Database | Drug class | Gene | Position in contig | Query/HSP length | % Identity | Accession # | Phenotypic resistance |
|---|---|---|---|---|---|---|---|---|---|
| NW16025-3 | Bacillus cereus | ResFinder | Fosfomycin | FosB1 | 1981449..1981861 | 417/413 | 87.41 | CP001903 | No |
| NW16025-3 | Bacillus cereus | CARD | Fosfomycin | FosB | | | 88.41 | 3000172 | No |
| NW16025-3 | Bacillus cereus | CARD | Cephalosporin, Penam | BcII | | | 91.37 | 3002878 | Yes |
| MB15702-3 | Staphylococcus warneri | CARD | Diaminopyrimidine antibiotic | dfrC | | | 100 | 3002865 | Yes |
| MB15703-4 | Staphylococcus warneri | CARD | Penam | blaZ | | | 95 | 3000621 | Yes |
| NW16025-2 | Staphylococcus warneri | CARD | Diaminopyrimidine antibiotic | dfrC | | | 100 | 3002865 | Yes |
| NW16031-2 | Escherichia coli | ResFinder | Macrolide | mdf(A) | 3145966..3147198 | 1233/1233 | 98.78 | Y08743 | N/A |
| NW16031-3 | Escherichia coli | ResFinder | Macrolide | mdf(A) | 3145966..3147198 | 1233/1233 | 98.78 | Y08743 | N/A |
| NW16031-6 | Escherichia coli | ResFinder | Macrolide | mdf(A) | 3145965..3147197 | 1233/1233 | 98.78 | Y08743 | N/A |
| NW16031-7 | Escherichia coli | ResFinder | Macrolide | mdf(A) | 3145938..3147170 | 1233/1233 | 98.78 | Y08743 | N/A |

Table 7.7. Virulence genes identified by the database VirulenceFinder in the WGS of bacteria isolated from Queensland dugong faecal samples.

| Sample ID-Isolate ID | Species | Virulence factor | Identity % | Protein function | Contig number | Position in contig | Query/HSP | Accession No. |
|---|---|---|---|---|---|---|---|---|
| NW16031-2 | Escherichia coli | GAD | 100 | Glutamate decarboxylase | 1 | 2418301..2419701 | 1401/1401 | AP010953 |
| NW16031-3 | Escherichia coli | GAD | 100 | Glutamate decarboxylase | 1 | 2418261..2419661 | 1401/1401 | AP010953 |
| NW16031-6 | Escherichia coli | GAD | 100 | Glutamate decarboxylase | 1 | 2418290..2419690 | 1401/1401 | AP010953 |
| NW16031-7 | Escherichia coli | GAD | 100 | Glutamate decarboxylase | 1 | 2418277..2419677 | 1401/1401 | AP010953 |
| NW16031-2 | Escherichia coli | lpfA | 100 | Long polar fimbriae | 1 | 4800593..4801165 | 573/573 | CP002185 |
| NW16031-3 | Escherichia coli | lpfA | 100 | Long polar fimbriae | 1 | 4800557..4801129 | 573/573 | CP002185 |
| NW16031-6 | Escherichia coli | lpfA | 100 | Long polar fimbriae | 1 | 4800600..4801172 | 573/573 | CP002185 |
| NW16031-7 | Escherichia coli | lpfA | 100 | Long polar fimbriae | 1 | 4800552..4801124 | 573/573 | CP002185 |

7.5 DISCUSSION

This study was the first to investigate antibiotic resistant bacteria in wild dugongs. Based on phenotypic and genotypic testing, antibiotic resistant bacteria were found in faecal samples collected from both the urban-adjacent (Moreton Bay) and agriculture-adjacent (Newry Region) study sites. It was hypothesised that antibiotic resistance may be present at the Moreton Bay sampling site due to its close proximity to major wastewater treatment operations and other urban developments, while the Newry Region site is adjacent to agricultural land where potential run-off occurs.

The three bacterial species isolated from dugong faecal samples all displayed phenotypic antibiotic resistance. The disc diffusion susceptibility method identified resistance to at least one antibiotic for all bacterial isolates, with multidrug resistance identified for two S. warneri isolates, one from Moreton Bay and one from the Newry Region. All S. warneri isolates were resistant to penicillin while all E. coli isolates, which the average nucleotide identity indicated were clonal, were resistant to ampicillin. The L. sphaericus isolate from marine sediment (Newry Region) was found to have multidrug resistance, with resistance to four of the 11 antibiotics tested. The resistance found with disc diffusion testing was confirmed with MICs with the exception of B. cereus, where resistance to amoxicillin-clavulanic acid was not found using disc diffusion but was demonstrated with MICs.

Based on analysis of the whole genome sequences, genotypic resistance was also identified for some of the isolates that displayed phenotypic resistance. Two of the S. warneri isolates (MB15702-3 and NW16025-2) were found to have the trimethoprim resistance gene (dfrC) using the CARD database, and both displayed phenotypic resistance. A beta-lactamase gene (blaZ) was also identified in the S. warneri MB15703-4 isolate, which showed phenotypic resistance; however, no beta-lactamase gene was found in the other S. warneri isolates that displayed phenotypic resistance. This may be because the resistance mechanism for this bacterial species against this antibiotic has not previously been identified, suggesting possible inherent resistance. The ResFinder and CARD databases found the fosB1 gene, which encodes fosfomycin resistance, in the B. cereus isolate; however, the disc diffusion testing did not find resistance against this antimicrobial based on the breakpoints used, and it was not possible to investigate the MIC for this antibiotic as it was not available. Further testing of dugong B. cereus isolates would be required to determine the resistance breakpoint for fosfomycin.

The findings in this study are similar to those reported by other studies investigating antimicrobial resistance in marine mammals. For example, Staphylococcus spp. isolated from live and dead-stranded sea otters from California and Alaska either had intermediate resistance or were above the resistance threshold for penicillin based on MIC testing, as were E. coli isolates tested against ampicillin (Brownstein et al. 2011). Multidrug resistance has also been recorded for bacterial species isolated from wild bottlenose dolphins in south-eastern USA, with 48% of dolphins found to have bacteria resistant to more than one antibiotic (Stewart et al. 2014). The results from the present study also support previous findings of antimicrobial resistance in a marine species that shares a coastal seagrass habitat with dugongs. The green sea turtle is also a long-lived herbivore with a coastal foraging distribution similar to that of the dugong in Queensland waters. Bacteria isolated from wild green turtles foraging in the Townsville region of Queensland (>300 km north of the Newry Region in this study) were found to be resistant to multiple antibiotics, with the highest frequency of resistance found to be against beta-lactam class antibiotics (Ahasan et al. 2017a). Both the results from this study and the green sea turtle study indicate antibiotic resistant bacteria are present at various locations along the Queensland coast.

The multi-locus sequence typing of the E. coli WGS showed that the isolates belonged to a previously described sequence type. ST196 was found to group with other sequence types that belong to the AxB1 lineage according to the maximum likelihood phylogenetic tree. This sequence type has previously been identified in a faecal sample from a German pig farmer (Fischer et al. 2017), a commensal isolate in cattle from Germany (Fischer et al. 2014) and in sequences uploaded to the EnteroBase database from companion and livestock animals (USA), human faeces (Cambodia), common brushtail possum (Australia), chickens (Kenya), primates (Gambia) and environmental samples taken in the USA.

Virulence genes were also identified in the four E. coli WGS. The GAD and lpfA genes were identified by the VirulenceFinder database with 100% identity. The enzyme glutamate decarboxylase (GAD) is found in all E. coli strains and functions to protect cells in acidic environments (Bergholz et al. 2007). The long polar fimbria protein A (lpfA) gene has been found in clinical and commensal isolates from humans and cattle and is associated with cell adhesion (Toma et al. 2006; Blum and Leitner 2013; Kidsley et al. 2018).

Due to the small number of isolates tested in this study, further investigation is required before conclusions can be drawn about the extent of antimicrobial resistance in Queensland dugongs and about whether the isolates are wild type isolates that harbour resistance or acquired clinical isolates. To determine this, further phenotypic and genotypic testing needs to be done on a larger number of isolates from more locations, including pristine regions that are not adjacent to likely run-off from cities, industry or agriculture. This would allow comparison of the resistance profiles of isolates cultured from samples collected adjacent to urban areas with those from rural or remote areas, which may help identify the pathways by which antibiotics enter the dugong’s habitat. Susceptibility testing of seawater and sediment samples from dugong seagrass habitats would also provide insights into the degree of transfer of antibiotic resistant bacteria from the environment to dugongs. More dugong faecal bacterial culturing is also required to determine which commensal species are present in dugongs, so that a surveillance bacterial species can be identified to monitor the impact that indirect contact with antibiotics has on dugong resistance profiles and subsequently the health of their habitat. Samples should also be taken and tested from individuals that appear to be diseased to determine if the disease is caused by resistant bacteria. The resistance guidelines used in this study (CLSI 2018b) were not developed using data from dugongs or other marine mammals; instead, they have mainly been developed for companion and production animals. Antimicrobial susceptibility testing of a larger number of bacterial isolates will help inform the development of guidelines for interpreting clinical and wild type breakpoints for dugong cultured isolates.

This is the first study to report antimicrobial resistance in sirenians and in dugongs specifically, providing baseline data for future comparisons to assess the impact antibiotic contaminants are having on these marine mammals. The results from this study indicate that dugongs in Queensland are harbouring antibiotic resistant bacteria that may have been acquired from terrestrial sources, as suggested by the finding of resistance genes and the E. coli sequence typing. As dugongs globally are found in shallow coastal areas, it is likely that populations close to anthropogenic influences will be exposed to antibiotic resistant bacteria due to antibiotics entering their environment through wastewater discharge and run-off. However, this relationship between human activity and antimicrobial resistance in the aquatic environment requires further investigation to determine how antimicrobial resistance is acquired and what impact it is having on wild populations that are not directly administered antibiotics. With antibiotic resistant bacteria considered an indicator of aquatic pollution (Foti et al. 2009) and the potential for the aquatic environment to be a reservoir for antibiotic resistant bacteria that could be harmful to humans, livestock and wild animals (Costanzo et al. 2005; Stoddard et al. 2008), ways of minimising antibiotic entry into aquatic environments need to be developed.

CHAPTER 8: General discussion

In this thesis, I have applied various genetic techniques and analyses to understand historical dispersal and contemporary movements of dugongs in Queensland, to evaluate factors that may be influencing dugong movements and to assess the impact of coastal threats on population connectivity. Overall, the knowledge gained from this thesis provides important information necessary for Queensland dugong conservation and management in the face of increasing coastal threats. The dugong’s restricted coastal habitat means it is susceptible to a number of human related threats, including habitat loss and pollution. Therefore, understanding the factors that affect dugong population viability, including genetic connectivity and threats, is necessary for effective management.

8.1 HISTORICAL DUGONG DISPERSAL

The use of multiple genetic datasets and different analysis methods in this thesis has detected significant genetic structuring and suggested the presence of an abrupt genetic break in the eastern Queensland dugong population between Torres Strait (10°S) and Moreton Bay (27.4°S, distance >2,000 km). All nuclear genetic analyses detected the presence of two main genetic clusters, with the genetic break in the Whitsunday Islands region (20.32°S) suggesting dugong effective breeding dispersal and gene flow are limited across the region. The microsatellite analysis using 22 microsatellite loci detected the presence of the two main dugong clusters, with individuals from Torres Strait to Airlie Beach forming one cluster, while individuals from Midge Point south to Moreton Bay formed the second cluster. Similarly, the admixture analysis using the newly discovered 10,690 SNP loci detected two main geographical clusters separated at the Whitsunday Islands. My review of published literature did not reveal such an abrupt genetic break in other animal species along this coastline, and this was not expected in dugongs due to their high dispersal ability. The fact that both the small number of microsatellite markers and the larger number of genome-wide SNPs were able to detect the genetic break in the Whitsunday Islands region suggests the break is significant and that gene flow is likely restricted across the region.


While two clusters were evident from the genetic analyses, the three genetic markers suggested different levels of gene flow across the east Queensland dugong population. In the SNP analysis, a third cluster was indicated that included samples from northern and southern Queensland, suggesting that historically dugong dispersal may have occurred across the region of the genetic break. The mtDNA sequence analysis also supports this assertion through the finding of mixed northern and southern Queensland haplotypes within clades. In contrast, a more in-depth microsatellite analysis indicated sub-structuring within each of the two main clusters, with significant FST values (0.15087-0.80546) from the mtDNA sequence analysis also supporting this. It is difficult to reconcile these differences, but the microsatellite and mtDNA analyses potentially indicate gene flow is further restricted within each of the two clusters, with the SNP markers having lower sensitivity to detect this sub-structuring due to the smaller sample size. Hence, further sampling around these potential genetic breaks followed by SNP analysis is required to confirm further restrictions to gene flow.
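To make the FST range quoted above easier to interpret, the following sketch computes a textbook form of FST, (H_T - H_S) / H_T, from haplotype frequencies in two populations. It is an illustration only: the haplotype frequencies are hypothetical, the two populations are weighted equally, and this simple estimator is an assumption for the example rather than the estimator used for the mtDNA analysis in this thesis.

```python
def gene_diversity(freqs):
    """Haplotype (gene) diversity: 1 minus the sum of squared frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def fst(pop_a, pop_b):
    """F_ST = (H_T - H_S) / H_T for two equally weighted populations."""
    haplotypes = set(pop_a) | set(pop_b)
    # Pooled frequencies: simple average of the two populations.
    pooled = {h: (pop_a.get(h, 0.0) + pop_b.get(h, 0.0)) / 2 for h in haplotypes}
    h_s = (gene_diversity(pop_a) + gene_diversity(pop_b)) / 2  # mean within-population diversity
    h_t = gene_diversity(pooled)                               # diversity of the pooled sample
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


# Hypothetical haplotype frequencies, NOT data from this thesis.
north = {"hapA": 0.8, "hapB": 0.2}
south = {"hapA": 0.2, "hapB": 0.8}
fixed_a = {"hapA": 1.0}
fixed_b = {"hapB": 1.0}

print(round(fst(north, south), 2))      # moderate differentiation
print(round(fst(north, north), 2))      # identical populations
print(round(fst(fixed_a, fixed_b), 2))  # no shared haplotypes
```

Under this definition, values near 0 indicate that populations share haplotypes at similar frequencies, while values approaching 1 indicate that they share almost none.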

Further refinement of the SNP markers enabled identification of 464 highly discriminatory markers (Chapter 4) that detected the same population structure as the overall analysis using the 10,690 loci. The discovery of these highly discriminatory genome-wide markers potentially provides opportunities for future genetic diversity and population structure studies of dugong populations in Australia using a developed SNP array that is more cost-effective than the sequencing of thousands of SNPs. However, sequencing of a greater number of samples is required throughout the rest of the Australian and global dugong population to determine how variable these SNPs are and if there is any ascertainment bias. The identification of highly discriminatory genome-wide SNPs may also enable sequencing and analysis of lower quality DNA samples, such as non-invasively collected faecal samples, without the loss of resolution. As the SNP markers are only assessing variation at one nucleotide location, relatively small regions need to be amplified, usually smaller than those required for microsatellite markers. Therefore, assessment of population structure using SNP markers may be possible in regions where tissue samples are difficult to collect but where degraded DNA from floating faecal samples or stranded dugong carcasses is available. Additionally, further genomic analyses will be possible using the SNP loci discovered, for example mapping of quantitative trait loci and identification of variants associated with phenotypic effects, particularly once a dugong genome is available. Depending on the genomic analyses undertaken, it may be necessary to use different selection criteria to select SNPs for an array, e.g. for pedigree discrimination, for future genomic studies.

Gene flow was evident within each of the two main genetic clusters identified, even with potential sub-structuring indicated, suggesting the prospects for population recovery from possible regional population size declines are promising. The recovery of the Hervey Bay population following major weather events in 1992, in which a considerable reduction in the population size was detected (Table 2.1; Preen and Marsh 1995), is evidence that population recoveries are possible if there is connectivity between populations. Population size estimates of the major Queensland populations are relatively high (Table 2.1); however, population size estimates are small for some locations (Sobtzick et al. 2017). Because of the dugong’s reliance on seagrass, adequate protection of seagrass meadows along the Queensland coast should maintain dugong movement corridors and sustain genetic connectivity.

The remote location of some dugong populations, their cryptic nature and poor water clarity mean collection of tissue samples for genetic analysis can be difficult and expensive. As a result, there were certain regions of the Queensland coast where no samples or only a limited number of samples were available for genetic analysis, for example between Torres Strait and Starcke River where a potential genetic break was indicated. This has potentially limited evaluation of more fine-scale population structuring, although major divisions were revealed.

8.2 ECOLOGICAL DRIVERS OF DUGONG POPULATION STRUCTURE AND MOVEMENT

While genetic analyses are able to provide information about where barriers to gene flow are located, a landscape/seascape genetic approach will help evaluate factors that may be associated with the observed population structuring. In this thesis, I used a seascape genetics approach to investigate potential drivers of dugong population structure and genetic variability detected by the microsatellite analysis. Sea Surface Temperature (SST) has previously been demonstrated to be an ecological factor that influences population structure in other marine mammals (Fontaine et al. 2007; Mendez et al. 2010; Mendez et al. 2011; Amaral et al. 2012; Viricel and Rosel 2014) and the greater than 6°C difference in SST from northern to southern Queensland waters was hypothesised to be a driver of dugong population structure, either directly through influencing movements or indirectly through affecting other ecological factors such as seagrass abundance. However, SST was not found to significantly influence dugong population structure or genetic distance. The SST gradient across the study region matches the coastline direction and may be correlated with other factors. Additionally, seagrass distribution was hypothesised to be another ecological factor that influences dugong population structure due to the dugong’s reliance on seagrass meadows. Seagrass distribution was also not found to be significant. Seagrass mapping along the Queensland coast is limited, incomplete or out-dated in some areas; therefore, the influence that seagrass distribution, abundance and species composition have on dugong population structure and connectivity may not be fully understood until a comprehensive map is available. Seagrass loss globally has been high in recent years (Waycott et al. 2009), with losses likely in Queensland due to coastal threats affecting seagrass growth. Therefore, an understanding of the impact seagrass has on dugong population structure and connectivity is important as the loss of seagrass may be acting as a barrier to dugong dispersal and gene flow.

While SST and seagrass distribution were not found to influence dugong population structuring, an oceanographic phenomenon known as the ‘sticky water’ effect occurs across the Whitsunday Islands region. Strong tidal currents are associated with the ‘sticky water’ phenomenon, which along the inshore waters of eastern Queensland is found only in the Whitsunday Islands region (Wolanski and Spagnol 2000; Wolanski and Kingsford 2014). It is unclear whether the co-location of this phenomenon with the genetic break identified in Queensland dugongs is incidental, or whether the phenomenon is limiting the dugong’s ability to traverse the region. If the latter, this effect might be contributing to the genetic break and acting as a barrier to gene flow. However, given the dugong’s strong dispersal ability, it is unlikely that this phenomenon is the only factor contributing to the genetic break, and other potential factors require investigation, as does the actual effect this phenomenon has on dugong movement.

Comparison of the microbiome of dugong faecal samples collected at various locations along the Queensland coast potentially indicates seagrass distribution, species diversity and/or availability, and subsequently diet, are affecting dugong movements. The faecal microbial composition of the three most southern Queensland populations (Clairview, Hervey Bay and Moreton Bay) revealed dugongs from these populations had distinctly different microbiomes compared to the populations sampled further north. Interestingly, the boundary between the northern and southern faecal microbiomes did not coincide with the genetic break suggested by the genetic analyses; instead, it was further south (>200 km south of the Whitsunday Islands). These differences were potentially due to variations in diet, as seagrass species diversity differs along the Queensland coast, with lower species diversity apparent in southern Queensland. Therefore, differences in diet, either due to individual preference or regional availability of seagrass species, may be driving the differences in the microbiome observed in Queensland. Diet has been shown to substantially influence the microbiome of various mammals (Ley et al. 2008), with differences in plant species consumption shown to result in different microbial compositions (Brice et al. 2019). Diet may be a driver of the dugong’s microbiome; however, further study of the differences in seagrass species availability and the actual diet of individuals at different locations along the Queensland coast is required to understand the links between diet and microbiome. The microbiome analysis from my study period may potentially suggest dugongs from either side of the indicated divide are not feeding in the converse feeding grounds, potentially providing some indication of dugong contemporary movements, although it is not yet understood if or how rapidly a microbiome would adapt to an altered diet following migration of a dugong to a region of altered seagrass composition.

Further research is required into other factors that may be influencing dugong movements but could not be examined in this thesis, including social behaviour, especially given the large number of coastal threats dugongs face. Additionally, expanding the geographical distribution of samples to include the entire Australian or global dugong distribution would be beneficial, as including a greater number of regions for comparison would increase the power of the seascape genetics analysis. The factors that influence dugong population structure and genetic diversity are potentially different from those affecting other marine mammals due to the dugong’s classification as a coastal marine herbivore.

8.3 CONTEMPORARY DUGONG MOVEMENTS

Genetic methods, such as those employed in Chapters 3 and 4, are well-suited to detecting historical population structure and the level of gene flow. However, changes to the coastal environment, such as port developments, increased run-off from urban or agricultural areas, increased sediment load from floods, or damage from cyclones, can occur over relatively short time frames. These impacts can have an effect over a period of time that is less than one generation of the dugong. Loss of connectivity may take considerable time to be detected by genetic methods due to the dugong’s long generation time; therefore, methods that detect contemporary movements are needed to assess possible recent barriers to movements. Telemetry and pedigree analysis have previously been used in dugongs to assess contemporary movements; however, limitations of these methods restrict widespread use.

Genetic tools were trialled in this thesis to determine contemporary movements made by dugongs in Queensland. A novel method of detecting contemporary movements (commensal bacterial network), based on the sharing of commensal bacteria cultured from dugong faecal samples collected along the Queensland coast, was evaluated. It is based on a method successfully implemented in a small number of terrestrial species where commensal bacterial networks matched social networks (VanderWaal et al. 2014a; VanderWaal et al. 2014b). Interestingly, E. coli, which is a bacterial species very commonly found in animal faecal samples (Dixit et al. 2004; Derakhshandeh et al. 2013; Ju and Willing 2018), was not able to be widely cultured from dugong faecal samples. The microbiome analysis of 47 dugong faecal samples was also not able to detect the presence of E. coli in dugong faeces, indicating E. coli is not a bacterial species that regularly colonises the lower intestinal tract of dugongs. S. warneri was indicated to be a suitable bacterial species to use to construct a commensal bacterial network due to improved culture success, but the lack of genetic variability detected between individual dugong bacterial isolates meant an informative commensal bacterial network could not be constructed to infer Queensland dugong contemporary movements. A common commensal bacterial species that is easily cultured and has substantial genetic variability would need to be identified, potentially from the microbiome analysis (Chapter 6), to be able to fully explore the method’s capacity to determine dugong contemporary movements.
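The reason a lack of genetic variability defeats the commensal bacterial network approach can be illustrated with a minimal sketch. It assumes a deliberately simplified rule, in which two dugongs are linked if they share at least one bacterial genotype; the dugong IDs, genotype labels and function names are hypothetical, and the published method (VanderWaal et al. 2014a; VanderWaal et al. 2014b) is more involved than this.

```python
from itertools import combinations


def shared_genotype_network(isolates_by_dugong):
    """Link two dugongs if they share at least one bacterial genotype."""
    edges = set()
    for a, b in combinations(sorted(isolates_by_dugong), 2):
        if isolates_by_dugong[a] & isolates_by_dugong[b]:
            edges.add((a, b))
    return edges


def is_informative(isolates_by_dugong):
    """A network is informative only if some, but not all, pairs are linked."""
    n = len(isolates_by_dugong)
    possible_pairs = n * (n - 1) // 2
    linked_pairs = len(shared_genotype_network(isolates_by_dugong))
    return 0 < linked_pairs < possible_pairs


# Hypothetical example data, NOT results from this thesis.
# Case 1: genotypes vary among dugongs, so sharing is selective.
variable = {"D1": {"g1", "g2"}, "D2": {"g2"}, "D3": {"g3"}}
# Case 2: every isolate has the same genotype, so every pair is linked.
invariant = {"D1": {"g1"}, "D2": {"g1"}, "D3": {"g1"}}

print(shared_genotype_network(variable))  # {('D1', 'D2')}
print(is_informative(variable))           # True
print(is_informative(invariant))          # False
```

When all isolates are genetically indistinguishable, every pair of dugongs appears connected, so the network cannot separate individuals that have been in contact from those that have not.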

8.4 DUGONG FAECAL BACTERIA ANALYSIS

Analysis of the dugong faecal microbial composition along the Queensland coast revealed core bacterial families that are likely important in seagrass digestion. While regional differences in the faecal microbiome were indicated, the analysis also found a number of core bacterial families, including Clostridiaceae_1, Lachnospiraceae, Peptostreptococcaceae, Ruminococcaceae, Bacteroidaceae and Flavobacteriaceae, that were shared amongst all dugong populations sampled between Townsville and Moreton Bay. This suggests their importance in digestion of the dugong’s food source of seagrass. All core bacterial families, except Peptostreptococcaceae, were also found to be shared with terrestrial herbivores that are also hindgut fermenters and grass specialists (Bian et al. 2013; Dougal et al. 2013; Donnell et al. 2017), indicating their evolutionary importance in hindgut herbivory digestion. Additionally, the finding of a number of unique OTUs potentially indicates dugongs are harbouring novel bacterial species that may be important to digestion of seagrass. Investigation of the role the core bacterial families and unique OTUs play in seagrass digestion and analysis of samples from other dugong populations should enable assessment of how evolutionarily important they are to dugong digestion.

The close proximity of dugong habitat to the developed Queensland coast puts their habitat and dugongs at risk from human activities. In this thesis, I investigated antibiotic resistance of dugong faecal bacterial isolates for the first time as a possible new pollutant that could be a threat to dugongs. While only limited samples were tested (n = 9 isolates, 2 locations), antibiotic resistance was identified for all bacterial isolates tested, with all S. warneri isolates resistant to penicillin and all E. coli isolates resistant to ampicillin. Multidrug resistance was also identified in two of the nine dugong faecal isolates and the single marine sediment sample. These results are consistent with other studies of marine mammals, with antimicrobial resistance found to have increased in the marine environment (Wallace et al. 2013). Additionally, the WGS analysis indicated resistance may have occurred due to antibiotic runoff from human and/or agricultural sources, with resistance genes identified from databases of published resistance genes from human and livestock isolates.

I conducted a preliminary assessment of antimicrobial resistance in this thesis, with only a small number of samples tested; therefore, only limited conclusions can be drawn at this time about how extensive antimicrobial resistance is in Queensland dugongs. Testing of a greater number of samples from both pristine dugong populations and those adjacent to highly urbanised areas should help determine how widespread this threat is to dugongs in Queensland and whether it reflects intrinsic resistance or is caused by antibiotic runoff. This should also allow for some assessment of the health of the marine ecosystem dugongs share with other coastal species.

8.5 IMPLICATIONS FOR DUGONG MANAGEMENT IN QUEENSLAND

The findings from this thesis suggest dugong management in Queensland should be updated to reflect the two breeding groups indicated by the various genetic population structure analyses. All three genetic markers employed in this thesis identified the presence of an abrupt genetic break in the Whitsunday Islands region and indicated dugong breeding across the region is substantially restricted. The development of new genetic markers and analysis of a greater number of samples from along the entire eastern Queensland coastline have improved our understanding of dugong population structure and gene flow in Queensland. The Great Barrier Reef Marine Park is responsible for managing threats to dugongs throughout a large proportion of their eastern Queensland range, including where the genetic break is located. Current management initiatives include protection of dugongs within the specified dugong protection areas, with a concerted effort to reduce mortality due to human threats. However, given gene flow appears to be limited across the genetic break, updates to the GBR Marine Park management plans could incorporate this finding. If significant population declines occur in one of the clusters, it appears highly unlikely that individuals from the other cluster would repopulate the losses due to the restricted gene flow across the Whitsunday Islands region. If a population is significantly depleted, this could lead to reduced connectivity of populations and subsequently greater genetic differentiation of populations. Therefore, the knowledge gained from this thesis should help managers more effectively restore declining populations to more viable sizes and ensure connectivity within each of the two clusters remains high. Coordination with the other dugong managers in Queensland, including Moreton Bay and Great Sandy Marine Parks, would also help maintain connectivity through the protection of dugong movement corridors and important habitats. Additionally, the aerial survey design for the Queensland coast has previously been used to calculate a population size estimate for the southern GBR region, which includes populations on both sides of the genetic break. The design could be updated to reflect the two breeding populations so that an accurate population size estimate for each of the two breeding groups is known.

While dugong populations in Queensland are thought to have declined over the last few decades (Marsh et al. 2001; Marsh et al. 2005), genetic connectivity appears to be sufficient to maintain genetically viable populations. However, with coastal threats growing as the human population increases, effective conservation and management of dugongs in Queensland are needed to ensure that population sizes and connectivity remain high. Timely action is required because the dugong is a long-lived, slowly reproducing animal with a restricted habitat, meaning recovery from declines can take considerable time. Given the dugong’s extensive distribution along the Queensland coast, conservation and management of dugongs also have the potential to protect other species that share the dugong’s habitat.

REFERENCES

Ahasan MS, Picard J, Elliott L, Kinobe R, Owens L, Ariel E (2017a) Evidence of antibiotic resistance in Enterobacteriales isolated from green sea turtles, Chelonia mydas on the Great Barrier Reef. Marine Pollution Bulletin 120:18-27. https://doi.org/10.1016/j.marpolbul.2017.04.046
Ahasan MS, Waltzek TB, Huerlimann R, Ariel E (2017b) Fecal bacterial communities of wild-captured and stranded green turtles (Chelonia mydas) on the Great Barrier Reef. Fems Microbiology Ecology 93:11. https://doi.org/10.1093/femsec/fix139
Allendorf FW, Leary RF (1986) Heterozygosity and fitness in natural populations of animals. Conservation biology: the science of scarcity and diversity 57:58-72.
Allgeier JE, Rosemond AD, Layman CA (2011) Variation in nutrient limitation and seagrass nutrient content in Bahamian tidal creek ecosystems. Journal of Experimental Marine Biology and Ecology 407:330-336. https://doi.org/10.1016/j.jembe.2011.07.005
Amaral AR et al. (2012) Seascape genetics of a globally distributed, highly mobile marine mammal: the short-beaked common dolphin (genus Delphinus). Plos One 7. https://doi.org/10.1371/journal.pone.0031482
Anderson IC, Rhodes M, Kator H (1979) Sublethal stress in Escherichia coli - function of salinity. Applied and Environmental Microbiology 38:1147-1152.
Andrews KR, Good JM, Miller MR, Luikart G, Hohenlohe PA (2016) Harnessing the power of RADseq for ecological and evolutionary genomics. Nature Reviews Genetics 17:81-92. https://doi.org/10.1038/nrg.2015.28
Andutta FP, Kingsford MJ, Wolanski E (2012) 'Sticky water' enables the retention of larvae in a reef mosaic. Estuarine Coastal and Shelf Science 101:54-63. https://doi.org/10.1016/j.ecss.2012.02.013
Baird NA et al. (2008) Rapid SNP Discovery and Genetic Mapping Using Sequenced RAD Markers. Plos One 3:7. https://doi.org/10.1371/journal.pone.0003376
Bandelt H-J, Forster P, Röhl A (1999) Median-joining networks for inferring intraspecific phylogenies. Molecular biology and evolution 16:37-48.
Bansal S, Grenfell BT, Meyers LA (2007) When individual behaviour matters: homogeneous and network models in epidemiology. Journal of the Royal Society Interface 4:879-891. https://doi.org/10.1098/rsif.2007.1100
Behrenfeld MJ, Falkowski PG (1997) Photosynthetic rates derived from satellite-based chlorophyll concentration. Limnology and Oceanography 42:1-20. https://doi.org/10.4319/lo.1997.42.1.0001
Bergholz TM, Tarr CL, Christensen LM, Betting DJ, Whittam TS (2007) Recent gene conversions between duplicated glutamate decarboxylase genes (gadA and gadB) in pathogenic Escherichia coli. Molecular Biology and Evolution 24:2323-2333. https://doi.org/10.1093/molbev/msm163
Bian GR, Ma L, Su Y, Zhu WY (2013) The Microbial Community in the Feces of the White Rhinoceros (Ceratotherium simum) as Determined by Barcoded Pyrosequencing Analysis. Plos One 8:9. https://doi.org/10.1371/journal.pone.0070103
Bik EM et al. (2016) Marine mammals harbor unique microbiotas shaped by and yet distinct from the sea. Nature Communications 7:13. https://doi.org/10.1038/ncomms10516
Binda C, Lopetuso LR, Rizzatti G, Gibiino G, Cennamo V, Gasbarrini A (2018) Actinobacteria: A relevant minority for the maintenance of gut homeostasis. Digestive and Liver Disease 50:421-428. https://doi.org/10.1016/j.dld.2018.02.012
Blair D, McMahon A, McDonald B, Tikel D, Waycott M, Marsh H (2014) Pleistocene sea level fluctuations and the phylogeography of the dugong in Australian waters. Marine Mammal Science 30:104-121. https://doi.org/10.1111/mms.12022
Blum SE, Leitner G (2013) Genotyping and virulence factors assessment of bovine mastitis Escherichia coli. Veterinary Microbiology 163:305-312. https://doi.org/10.1016/j.vetmic.2012.12.037
Bolyen E et al. (2019) Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nature biotechnology 37:852-857.
Bray JR, Curtis JT (1957) An ordination of the upland forest communities of southern Wisconsin. Ecological Monographs 27:326-349.
Brice KL, Trivedi P, Jeffries TC, Blyton MDJ, Mitchell C, Singh BK, Moore BD (2019) The Koala (Phascolarctos cinereus) faecal microbiome differs with diet in a wild population. Peerj 7:27. https://doi.org/10.7717/peerj.6534
Brito PH, Edwards SV (2009) Multilocus phylogeography and phylogenetics using sequence-based markers. Genetica 135:439-455. https://doi.org/10.1007/s10709-008-9293-3
Broderick D, Ovenden J, Slade R, Lanyon JM (2007) Characterization of 26 new microsatellite loci in the dugong (Dugong dugon). Molecular Ecology Notes 7:1275-1277. https://doi.org/10.1111/j.1471-8286.2007.01853.x
Brownstein D et al. (2011) Antimicrobial susceptibility of bacterial isolates from sea otters (Enhydra lutris). Journal of Wildlife Diseases 47:278-292. https://doi.org/10.7589/0090-3558-47.2.278
Brumfield RT, Beerli P, Nickerson DA, Edwards SV (2003) The utility of single nucleotide polymorphisms in inferences of population history. Trends in Ecology & Evolution 18:249-256. https://doi.org/10.1016/s0169-5347(03)00018-1
Bull CM, Godfrey SS, Gordon DM (2012) Social networks and the spread of Salmonella in a sleepy lizard population. Molecular Ecology 21:4386-4392. https://doi.org/10.1111/j.1365-294X.2012.05653.x
Burgess EA (2012) Determination of critical reproductive parameters of live, wild dugongs in a subtropical environment. PhD thesis, The University of Queensland, Australia
Burgess EA, Lanyon JM, Keeley T (2012) Testosterone and tusks: maturation and seasonal reproductive patterns of live, free-ranging male dugongs (Dugong dugon) in a subtropical population. Reproduction 143:683-697. https://doi.org/10.1530/rep-11-0434
Campbell SJ, McKenzie LJ (2004) Flood related loss and recovery of intertidal seagrass meadows in southern Queensland, Australia. Estuarine Coastal and Shelf Science 60:477-490. https://doi.org/10.1016/j.ecss.2004.02.007
Carter AB, McKenna SA, Rasheed MA, McKenzie L, Coles RG (2016) Seagrass mapping synthesis: A resource for coastal management in the Great Barrier Reef World Heritage Area. Report to the National Environmental Science Programme. Cairns, Australia
Carvalho IT, Santos L (2016) Antibiotics in the aquatic environments: A review of the European scenario. Environment International 94:736-757. https://doi.org/10.1016/j.envint.2016.06.025
Catchen J, Hohenlohe PA, Bassham S, Amores A, Cresko WA (2013) Stacks: an analysis tool set for population genomics. Molecular Ecology 22:3124-3140. https://doi.org/10.1111/mec.12354
Chapman JR, Nakagawa S, Coltman DW, Slate J, Sheldon BC (2009) A quantitative review of heterozygosity-fitness correlations in animal populations. Molecular Ecology 18:2746-2765. https://doi.org/10.1111/j.1365-294X.2009.04247.x
Chen J, Griffiths M (1998) PCR differentiation of Escherichia coli from other Gram-negative bacteria using primers derived from the nucleotide sequences flanking the gene encoding the universal stress protein. Letters in applied microbiology 27:369-371.
Clapham PJ (1996) The social and reproductive biology of Humpback Whales: An ecological perspective. Mammal Review 26:27-49. https://doi.org/10.1111/j.1365-2907.1996.tb00145.x
Clement M, Posada D, Crandall KA (2000) TCS: a computer program to estimate gene genealogies. Molecular Ecology 9:1657-1659. https://doi.org/10.1046/j.1365-294x.2000.01020.x
CLSI (2018a) Performance Standards for Antimicrobial Disk and dilution susceptibility tests for bacteria isolated from animals. Fourth edition. CLSI supplement VET08. Wayne, Pennsylvania, USA.
CLSI (2018b) Performance standards for antimicrobial disk and dilution susceptibility tests for bacteria isolated from animals; Approved standard, fifth edition. CLSI standard VET01. Wayne, Pennsylvania, USA.
Coles R, Long W, McKenzie L, Roder C (2002) Seagrasses and marine resources in the dugong protection areas of Upstart Bay, Newry Region, Sand Bay, Llewellyn Bay, Ince Bay and the Clairview Region, April/May 1999 and October 1999. Research publication no.72. Great Barrier Reef Marine Park Authority, Townsville
Cope RC, Pollett PK, Lanyon JM, Seddon JM (2015) Indirect detection of genetic dispersal (movement and breeding events) through pedigree analysis of dugong populations in southern Queensland, Australia. Biological Conservation 181:91-101. https://doi.org/10.1016/j.biocon.2014.11.011
Costanza R et al. (1997) The value of the world's ecosystem services and natural capital. Nature 387:253-260. https://doi.org/10.1038/387253a0
Costanzo SD, Murby J, Bates J (2005) Ecosystem response to antibiotics entering the aquatic environment. Marine Pollution Bulletin 51:218-223. https://doi.org/10.1016/j.marpolbul.2004.10.038
Craft ME, Caillaud D (2011) Network models: an underutilized tool in wildlife epidemiology? Interdisciplinary perspectives on infectious diseases 2011
Critchell K, Grech A, Schlaefer J, Andutta FP, Lambrechts J, Wolanski E, Hamann M (2015) Modelling the fate of marine debris along a complex shoreline: Lessons from the Great Barrier Reef. Estuarine Coastal and Shelf Science 167:414-426. https://doi.org/10.1016/j.ecss.2015.10.018
Davey JW, Hohenlohe PA, Etter PD, Boone JQ, Catchen JM, Blaxter ML (2011) Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Nature Reviews Genetics 12:499-510. https://doi.org/10.1038/nrg3012
Delandmeter P et al. (2017) Submesoscale tidal eddies in the wake of coral islands and reefs: satellite data and numerical modelling. Ocean Dynamics 67:897-913. https://doi.org/10.1007/s10236-017-1066-z
Derakhshandeh A, Firouzi R, Moatamedifar M, Motamedi A, Bahadori M, Naziri Z (2013) Phylogenetic analysis of Escherichia coli strains isolated from human samples. Molecular Biology Research Communications 2:143-149.
DeSalle R, Amato G (2004) The expansion of conservation genetics. Nature Reviews Genetics 5:702-712. https://doi.org/10.1038/nrg1425
Deutsch CJ, Reid JP, Bonde RK, Easton DE, Kochman HI, O'Shea TJ (2003) Seasonal movements, migratory behavior, and site fidelity of West Indian manatees along the Atlantic Coast of the United States. Wildlife Monographs:1-77.
DeYoung RW, Honeycutt RL (2005) The molecular toolbox: Genetic techniques in wildlife ecology and management. Journal of Wildlife Management 69:1362-1384. https://doi.org/10.2193/0022-541x(2005)69[1362:tmtgti]2.0.co;2
Dixit SM, Gordon DM, Wu XY, Chapman T, Kailasapathy K, Chin JJC (2004) Diversity analysis of commensal porcine Escherichia coli - associations between genotypes and habitat in the porcine gastrointestinal tract. Microbiology-Sgm 150:1735-1740. https://doi.org/10.1099/mic.0.26733-0
Donnell MMO, Harris HMB, Ross RP, O'Toole PW (2017) Core fecal microbiota of domesticated herbivorous ruminant, hindgut fermenters, and monogastric animals. Microbiologyopen 6:11. https://doi.org/10.1002/mbo3.509
Dougal K, de la Fuente G, Harris PA, Girdwood SE, Pinloche E, Newbold CJ (2013) Identification of a Core Bacterial Community within the Large Intestine of the Horse. Plos One 8:12. https://doi.org/10.1371/journal.pone.0077660
Dussex N, Taylor HR, Stovall WR, Rutherford K, Dodds KG, Clarke SM, Gemmell NJ (2018) Reduced representation sequencing detects only subtle regional structure in a heavily exploited and rapidly recolonizing marine mammal species. Ecology and Evolution 8:8736-8749. https://doi.org/10.1002/ece3.4411
Edgar RC (2010) Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26:2460-2461. https://doi.org/10.1093/bioinformatics/btq461
Edgar RC (2013) UPARSE: highly accurate OTU sequences from microbial amplicon reads. Nature Methods 10:996-998. https://doi.org/10.1038/nmeth.2604
Eigeland KA et al. (2012) Bacterial Community Structure in the Hindgut of Wild and Captive Dugongs (Dugong dugon). Aquatic Mammals 38:402-411. https://doi.org/10.1578/am.38.4.2012.402
Ellegren H (2000) Microsatellite mutations in the germline: implications for evolutionary inference. Trends in Genetics 16:551-558. https://doi.org/10.1016/s0168-9525(00)02139-9
Ellegren H (2014) Genome sequencing and population genomics in non-model organisms. Trends in Ecology & Evolution 29:51-63. https://doi.org/10.1016/j.tree.2013.09.008
Erwin PM, Rhodes RG, Kiser KB, Keenan-Bateman TF, McLellan WA, Pabst DA (2017) High diversity and unique composition of gut microbiomes in pygmy (Kogia breviceps) and dwarf (K-sima) sperm whales. Scientific Reports 7:11. https://doi.org/10.1038/s41598-017-07425-z
Escoda L, Fernandez-Gonzalez A, Castresana J (2019) Quantitative analysis of connectivity in populations of a semi-aquatic mammal using kinship categories and network assortativity. Molecular Ecology Resources 19:310-326. https://doi.org/10.1111/1755-0998.12967
Escoda L, Gonzalez-Esteban J, Gomez A, Castresana J (2017) Using relatedness networks to infer contemporary dispersal: Application to the endangered mammal Galemys pyrenaicus. Molecular Ecology 26:3343-3357. https://doi.org/10.1111/mec.14133
Evanno G, Regnaut S, Goudet J (2005) Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Molecular Ecology 14:2611-2620. https://doi.org/10.1111/j.1365-294X.2005.02553.x
Excoffier L, Lischer HE (2010) Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Molecular ecology resources 10:564-567.
Excoffier L, Smouse PE, Quattro JM (1992) Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics 131:479-491.
Fahrig L (1997) Relative effects of habitat loss and fragmentation on population extinction. Journal of Wildlife Management 61:603-610. https://doi.org/10.2307/3802168
Fischer J, Hille K, Ruddat I, Mellmann A, Kock R, Kreienbrock L (2017) Simultaneous occurrence of MRSA and ESBL-producing Enterobacteriaceae on pig farms and in nasal and stool samples from farmers. Veterinary Microbiology 200:107-113. https://doi.org/10.1016/j.vetmic.2016.05.021
Fischer J et al. (2014) bla(CTX-M-15)-carrying Escherichia coli and Salmonella isolates from livestock and food in Germany. Journal of Antimicrobial Chemotherapy 69:2951-2958. https://doi.org/10.1093/jac/dku270
Fitzpatrick BM (2009) Power and sample size for nested analysis of molecular variance. Molecular Ecology 18:3961-3966.
Fontaine MC et al. (2007) Rise of oceanographic barriers in continuous populations of a cetacean: the genetic structure of harbour porpoises in Old World waters. Bmc Biology 5:16. https://doi.org/10.1186/1741-7007-5-30
Foti M, Giacopello C, Bottari T, Fisichella V, Rinaldo D, Mammina C (2009) Antibiotic Resistance of Gram Negatives isolates from loggerhead sea turtles (Caretta caretta) in the central Mediterranean Sea. Marine Pollution Bulletin 58:1363-1366. https://doi.org/10.1016/j.marpolbul.2009.04.020
Frankham R (1995) Conservation genetics. Annual Review of Genetics 29:305-327. https://doi.org/10.1146/annurev.ge.29.120195.001513
Frankham R, Ballou JD, Briscoe DA (2004) A primer of conservation genetics. Cambridge University Press, Cambridge, United Kingdom
Frichot E, Mathieu F, Trouillon T, Bouchard G, Francois O (2014) Fast and Efficient Estimation of Individual Ancestry Coefficients. Genetics 196:973-+. https://doi.org/10.1534/genetics.113.160572
Gales N, McCauley RD, Lanyon J, Holley D (2004) Change in abundance of dugongs in Shark Bay, Ningaloo and Exmouth Gulf, Western Australia: evidence for large-scale migration. Wildlife Research 31:283-290. https://doi.org/10.1071/wr02073
Gamage H, Tetu SG, Chong RWW, Ashton J, Packer NH, Paulsen IT (2017) Cereal products derived from wheat, sorghum, rice and oats alter the infant gut microbiota in vitro. Scientific Reports 7:12. https://doi.org/10.1038/s41598-017-14707-z
Garcia-Rodriguez AI et al. (1998) Phylogeography of the West Indian manatee (Trichechus manatus): how many populations and how many taxa? Molecular Ecology 7:1137-1149. https://doi.org/10.1046/j.1365-294x.1998.00430.x
Garvin MR, Saitoh K, Gharrett AJ (2010) Application of single nucleotide polymorphisms to non-model species: a technical review. Molecular Ecology Resources 10:915-934. https://doi.org/10.1111/j.1755-0998.2010.02891.x
Gaw S, Thomas K, Hutchinson TH (2016) Pharmaceuticals in the Marine Environment. In: Hester RE, Harrison RM (eds) Pharmaceuticals in the Environment, vol 41. Issues in Environmental Science and Technology Series. Royal Soc Chemistry, Cambridge, pp 70-91
Great Barrier Reef Marine Park Authority (2011) Special Management Areas http://www.gbrmpa.gov.au/zoning-permits-and-plans/special-management-areas. 2018
Gibson RN, Atkinson RJA, Gordon JDM (2007) Loss, status and trends for coastal marine habitats of Europe. In: Oceanography and Marine Biology, vol 45. Crc Press-Taylor & Francis Group, Boca Raton, pp 345-405. doi:10.1201/9781420050943
Gil P, Vivas J, Gallardo CS, Rodriguez LA (2000) First isolation of Staphylococcus warneri, from diseased rainbow trout, Oncorhynchus mykiss (Walbaum), in Northwest Spain. Journal of Fish Diseases 23:295-298. https://doi.org/10.1046/j.1365-2761.2000.00244.x
Glad T, Bernhardsen P, Nielsen KM, Brusetti L, Andersen M, Aars J, Sundset MA (2010) Bacterial diversity in faeces from polar bear (Ursus maritimus) in Arctic Svalbard. Bmc Microbiology 10:10. https://doi.org/10.1186/1471-2180-10-10
Goh SH, Potter S, Wood JO, Hemmingsen SM, Reynolds RP, Chow AW (1996) HSP60 gene sequences as universal targets for microbial species identification: Studies with coagulase-negative staphylococci. Journal of Clinical Microbiology 34:818-823.
Goris J, Konstantinidis KT, Klappenbach JA, Coenye T, Vandamme P, Tiedje JM (2007) DNA-DNA hybridization values and their relationship to whole-genome sequence similarities. International Journal of Systematic and Evolutionary Microbiology 57:81-91. https://doi.org/10.1099/ijs.0.64483-0
Goslee SC, Urban DL (2007) The ecodist package for dissimilarity-based analysis of ecological data. Journal of Statistical Software 22:1-19.
Goudet J, Jombart T (2015) Hierfstat: estimation and tests of hierarchical F-statistics. R package version 004-22
Gower JC (1971) General coefficient of similarity and some of its properties. Biometrics 27:857-871. https://doi.org/10.2307/2528823
Gray JS (1997) Marine biodiversity: Patterns, threats and conservation needs. Biodiversity and Conservation 6:153-175. https://doi.org/10.1023/a:1018335901847
Grech A, Coles RG (2010) An ecosystem-scale predictive model of coastal seagrass distribution. Aquatic Conservation-Marine and Freshwater Ecosystems 20:437-444. https://doi.org/10.1002/aqc.1107
Groom MJ, Meffe GK, Carroll CR, Andelman SJ (2006) Principles of conservation biology. Sinauer Associates, Inc., Sunderland, Massachusetts, USA
Grundmann H et al. (2011) A framework for global surveillance of antibiotic resistance. Drug Resistance Updates 14:79-87. https://doi.org/10.1016/j.drup.2011.02.007
Hagihara R, Jones RE, Grech A, Lanyon JM, Sheppard JK, Marsh H (2014) Improving population estimates by quantifying diving and surfacing patterns: A dugong example. Marine Mammal Science 30:348-366. https://doi.org/10.1111/mms.12041
Halpern BS et al. (2008) A global map of human impact on marine ecosystems. Science 319:948-952. https://doi.org/10.1126/science.1149345
Hawkey PM, Jones AM (2009) The changing epidemiology of resistance. Journal of Antimicrobial Chemotherapy 64:3-10. https://doi.org/10.1093/jac/dkp256
Haynes D, Carter S, Gaus C, Muller J, Dennison W (2005) Organochlorine and heavy metal concentrations in blubber and liver tissue collected from Queensland (Australia) dugong (Dugong dugon). Marine Pollution Bulletin 51:361-369. https://doi.org/10.1016/j.marpolbul.2004.10.020
Helyar SJ et al. (2011) Application of SNPs for population genetics of nonmodel organisms: new opportunities and challenges. Molecular Ecology Resources 11:123-136. https://doi.org/10.1111/j.1755-0998.2010.02943.x
Hoelzel AR, Hey J, Dahlheim ME, Nicholson C, Burkanov V, Black N (2007) Evolution of population structure in a highly social top predator, the killer whale. Molecular Biology and Evolution 24:1407-1415. https://doi.org/10.1093/molbev/msm063
Holderegger R, Di Giulio M (2010) The genetic effects of roads: A review of empirical evidence. Basic and Applied Ecology 11:522-531. https://doi.org/10.1016/j.baae.2010.06.006
Hudson ME (2008) Sequencing breakthroughs for genomic ecology and evolutionary biology. Molecular Ecology Resources 8:3-17. https://doi.org/10.1111/j.1471-8286.2007.02019.x
Hughes AR, Inouye BD, Johnson MTJ, Underwood N, Vellend M (2008) Ecological consequences of genetic diversity. Ecology Letters 11:609-623. https://doi.org/10.1111/j.1461-0248.2008.01179.x
Jefferson TA, Webber MA, Pitman RL (2015) Marine mammals of the world: a comprehensive guide to their identification. Elsevier,
Jernberg C, Lofmark S, Edlund C, Jansson JK (2010) Long-term impacts of antibiotic exposure on the human intestinal microbiota. Microbiology-Sgm 156:3216-3223. https://doi.org/10.1099/mic.0.040618-0
Ju TT, Willing BP (2018) Isolation of Commensal Escherichia coli Strains from Feces of Healthy Laboratory Mice or Rats. Bio-Protocol 8:9. https://doi.org/10.21769/BioProtoc.2780
Kalinowski ST (2002) How many alleles per locus should be used to estimate genetic distances? Heredity 88:62-65. https://doi.org/10.1038/sj.hdy.6800009
Ke DB, Picard FJ, Martineau F, Menard C, Roy PH, Ouellette M, Bergeron MG (1999) Development of a PCR assay for rapid detection of enterococci. Journal of Clinical Microbiology 37:3497-3503.
Keeling MJ, Eames KTD (2005) Networks and epidemic models. Journal of the Royal Society Interface 2:295-307. https://doi.org/10.1098/rsif.2005.0051
Kidsley AK et al. (2018) Antimicrobial Susceptibility of Escherichia coli and Salmonella spp. Isolates From Healthy Pigs in Australia: Results of a Pilot National Survey. Frontiers in Microbiology 9:11. https://doi.org/10.3389/fmicb.2018.01207
Kindlmann P, Burel F (2008) Connectivity measures: a review. Landscape Ecology 23:879-890. https://doi.org/10.1007/s10980-008-9245-4
Kosman E, Leonard KJ (2005) Similarity coefficients for molecular markers in studies of genetic relationships between individuals for haploid, diploid, and polyploid species. Molecular Ecology 14:415-424. https://doi.org/10.1111/j.1365-294X.2005.02416.x
Kummerer K (2009) Antibiotics in the aquatic environment - A review - Part I. Chemosphere 75:417-434. https://doi.org/10.1016/j.chemosphere.2008.11.086
Kwan D (2002) Towards a sustainable indigenous fishery for dugongs in Torres Strait: A contribution of empirical data analysis and process. PhD thesis, James Cook University, Australia
Lah L et al. (2016) Spatially Explicit Analysis of Genome-Wide SNPs Detects Subtle Population Structure in a Mobile Marine Mammal, the Harbor Porpoise. Plos One 11:23. https://doi.org/10.1371/journal.pone.0162792
Lambeck K, Rouby H, Purcell A, Sun YY, Sambridge M (2014) Sea level and global ice volumes from the Last Glacial Maximum to the Holocene. Proceedings of the National Academy of Sciences of the United States of America 111:15296-15303. https://doi.org/10.1073/pnas.1411762111
Lambrechts J, Hanert E, Deleersnijder E, Bernard PE, Legat V, Remacle JF, Wolanski E (2008) A multi-scale model of the hydrodynamics of the whole Great Barrier Reef. Estuarine Coastal and Shelf Science 79:143-151. https://doi.org/10.1016/j.ecss.2008.03.016
Lane D (1991) 16S/23S rRNA sequencing. In: Stackebrandt E, Goodfellow M (eds) Nucleic acid techniques in bacterial systematics. John Wiley and Sons, New York, USA,
Lanyon J (1991) The nutritional ecology of the dugong (Dugong dugon) in tropical north Queensland. PhD thesis, Monash University, Australia
Lanyon J (2003) Distribution and abundance of dugongs in Moreton Bay, Queensland, Australia. Wildlife Research 30:397-409. https://doi.org/10.1071/wr98082
Lanyon J, Slade R, Sneath H, Broderick D (2006) A method for capturing dugongs (Dugong dugon) in open water. Aquatic Mammals 32:196-201.
Lanyon J, Sneath H, Long T (2010a) Three skin sampling methods for molecular characterisation of free-ranging dugong (Dugong dugon) populations. Aquatic Mammals 36:298-306. https://doi.org/10.1578/am.36.3.2010.298
Lanyon JM, Marsh H (1995) Digesta passage times in the dugong. Australian Journal of Zoology 43:119-127. https://doi.org/10.1071/zo9950119
Lanyon JM, Sanson GD (2006a) Degenerate dentition of the dugong (Dugong dugon), or why a grazer does not need teeth: morphology, occlusion and wear of mouthparts. Journal of Zoology 268:133-152. https://doi.org/10.1111/j.1469-7998.2005.00004.x
Lanyon JM, Sanson GD (2006b) Mechanical disruption of seagrass in the digestive tract of the dugong. Journal of Zoology 270:277-289. https://doi.org/10.1111/j.1469-7998.2006.00135.x
Lanyon JM, Sneath HL, Long T, Bonde RK (2010b) Physiological Response of Wild Dugongs (Dugong dugon) to Out-of-Water Sampling for Health Assessment. Aquatic Mammals 36:46-58. https://doi.org/10.1578/am.36.1.2010.46
Lapegue S et al. (2014) Development of SNP-genotyping arrays in two shellfish species. Molecular Ecology Resources 14:820-830. https://doi.org/10.1111/1755-0998.12230
Larsen AR, Stegger M, Sorum M (2008) spa typing directly from a mecA, spa and pvl multiplex PCR assay-a cost-effective improvement for methicillin-resistant Staphylococcus aureus surveillance. Clinical Microbiology and Infection 14:611-614. https://doi.org/10.1111/j.1469-0691.2008.01995.x
Lavery TJ, Roudnew B, Seymour J, Mitchell JG, Jeffries T (2012) High Nutrient Transport and Cycling Potential Revealed in the Microbial Metagenome of Australian Sea Lion (Neophoca cinerea) Faeces. Plos One 7:9. https://doi.org/10.1371/journal.pone.0036478
Lee KS, Park SR, Kim YK (2007) Effects of irradiance, temperature, and nutrients on growth dynamics of seagrasses: A review. Journal of Experimental Marine Biology and Ecology 350:144-175. https://doi.org/10.1016/j.jembe.2007.06.016
Levy SB, Marshall B (2004) Antibacterial resistance worldwide: causes, challenges and responses. Nature Medicine 10:S122-S129. https://doi.org/10.1038/nm1145
Ley RE et al. (2008) Evolution of mammals and their gut microbes. Science 320:1647-1651. https://doi.org/10.1126/science.1155725
Librado P, Rozas J (2009) DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25:1451-1452. https://doi.org/10.1093/bioinformatics/btp187
Liu ZJ, Cordes JF (2004) DNA marker technologies and their applications in aquaculture genetics. Aquaculture 238:1-37. https://doi.org/10.1016/j.aquaculture.2004.05.027
Long WJL, Mellors JE, Coles RG (1993) Seagrasses between Cape York and Hervey Bay, Queensland, Australia. Australian Journal of Marine and Freshwater Research 44:19-31.
Longhurst A (2006) Ecological geography of the sea. 2nd edn. Academic Press, San Diego, USA
Manel S, Schwartz MK, Luikart G, Taberlet P (2003) Landscape genetics: combining landscape ecology and population genetics. Trends in Ecology & Evolution 18:189-197. https://doi.org/10.1016/s0169-5347(03)00008-9
Marsh H (1980) Age determination of the dugong (Dugong dugon (Muller)) in northern Australia and its biological implications. Zoology Department, James Cook University, Australia
Marsh H (1995) The life history, pattern of breeding, and population dynamics of the dugong. Department of tropical environment studies and geography, James Cook University, Townsville, Australia
Marsh H, Breen B, Preen A (1994) The status of dugongs, sea turtles and dolphins in the Great Barrier Reef region, south of Cape Bedford. Great Barrier Reef Marine Park Authority, Townsville, Australia
Marsh H, Channells PW, Heinsohn GE, Morrissey J (1982) Analysis of stomach contents of dugongs from Queensland. Australian Wildlife Research 9:55-67.
Marsh H, Corkeron P, Lawler I, Preen A, Lanyon J (1996) The status of the dugong in the southern Great Barrier Reef Marine Park. Great Barrier Reef Marine Park Authority, Townsville, Australia
Marsh H, De'ath G, Gribble N, Lane B (2005) Historical marine population estimates: Triggers or targets for conservation? The dugong case study. Ecological Applications 15:481-492. https://doi.org/10.1890/04-0673
Marsh H, De'ath G, Gribble N, Lane B, Lawler I (2001) Shark control records hindcast serious decline in dugong numbers off the urban coast of Queensland and Dugong distribution and abundance in the southern Great Barrier Reef Marine Park and Hervey Bay: results of an aerial survey in October-December 1999. Great Barrier Reef Marine Park Authority, Townsville, Australia
Marsh H, Eros C, Corkeron P, Breen B (1999) A conservation strategy for dugongs: implications of Australian research. Marine and Freshwater Research 50:979-990.
Marsh H, Hagihara R, Hodgson A, Rankin R, Sobtzick S (2019) Monitoring dugongs within the Reef 2050 Integrated Monitoring and Reporting Program. Final report of the dugong team in the megafauna expert group, July 2018. Townsville, Australia
Marsh H, Heinsohn GE, Channells PW (1984a) Changes in the ovaries and uterus of the dugong, Dugong dugon (sirenia, Dugongidae), with age and reproductive activity. Australian Journal of Zoology 32:743-766. https://doi.org/10.1071/zo9840743
Marsh H, Heinsohn GE, Glover TD (1984b) Changes in the male reproductive-organs of the dugong, Dugong dugon (sirenia, Dugongidae) with age and reproductive activity. Australian Journal of Zoology 32:721-742. https://doi.org/10.1071/zo9840721
Marsh H, Heinsohn GE, Marsh LM (1984c) Breeding cycle, life-history and population-dynamics of the dugong, Dugong dugon (sirenia, Dugongidae). Australian Journal of Zoology 32:767-788. https://doi.org/10.1071/zo9840767
Marsh H, Kwan D (2008) Temporal variability in the life history and reproductive biology of female dugongs in Torres Strait: The likely role of sea grass dieback. Continental Shelf Research 28:2152-2159. https://doi.org/10.1016/j.csr.2008.03.023
Marsh H, Lawler I (2001) Dugong distribution and abundance in the Southern Great Barrier Reef Marine Park and Hervey Bay: results of an aerial survey in October–December 1999. Townsville, Australia
Marsh H, Lawler I (2007) Dugong distribution and abundance on the urban coast of Queensland: a basis for management. Unpublished Report
Marsh H, Lawler IR, Kwan D, Delean S, Pollock K, Alldredge M (2004) Aerial surveys and the potential biological removal technique indicate that the Torres Strait dugong fishery is unsustainable. Animal Conservation 7:435-443. https://doi.org/10.1017/s1367943004001635
Marsh H, O'Shea TJ, Reynolds JE (2011) Ecology and conservation of the Sirenia: dugongs and manatees. vol 18. Cambridge University Press, Cambridge, UK
Marsh H, Penrose H, Eros C, Hughes J (2002) Dugong status report and action plans for countries and territories. United Nations Environment Programme, Nairobi
Marsh H, Saalfeld W (1990a) The distribution & abundance of dugongs in southern Queensland. Final report to the Queensland Department of Primary Industries. Townsville, Australia
Marsh H, Saalfeld W (1990b) The distribution and abundance of dugongs in the Great Barrier Reef Marine Park south of Cape Bedford. Australian Wildlife Research 17:511-524.
Marsh H, Saalfeld WK (1989) Distribution and abundance of dugongs in the northern Great Barrier Reef Marine Park. Australian Wildlife Research 16:429-440.
Marsh H, Sobtzick S (2015) Dugong dugon, The IUCN Red List of Threatened Species. http://dx.doi.org/10.2305/IUCN.UK.2015-4.RLTS.T6909A43792211.en.
Martien KK et al. (2014) Nuclear and Mitochondrial Patterns of Population Structure in North Pacific False Killer Whales (Pseudorca crassidens). Journal of Heredity 105:611-626. https://doi.org/10.1093/jhered/esu029
McDonald B (2005) Population genetics of dugongs around Australia: Implications of gene flow and migration. PhD thesis, James Cook University, Australia
McKenzie L, Collier C, Langlois L, Yoshida R, Smith N, Waycott M (2018) Marine Monitoring Program: Annual Report for inshore seagrass monitoring 2016-2017. Great Barrier Reef Marine Park Authority, Townsville
McRae B, Shah V, Mohapatra T (2014) Circuitscape 4 User Guide http://www.circuitscape.org.
Medeiros AW et al. (2016) Characterization of the faecal bacterial community of wild young South American (Arctocephalus australis) and Subantarctic fur seals (Arctocephalus tropicalis). Fems Microbiology Ecology 92:8. https://doi.org/10.1093/femsec/fiw029
Mendez M, Rosenbaum HC, Subramaniam A, Yackulic C, Bordino P (2010) Isolation by environmental distance in mobile marine species: molecular ecology of franciscana dolphins at their southern range. Molecular Ecology 19:2212-2228. https://doi.org/10.1111/j.1365-294X.2010.04647.x
Mendez M et al. (2011) Molecular ecology meets remote sensing: environmental drivers to population structure of humpback dolphins in the Western Indian Ocean. Heredity 107:349-361. https://doi.org/10.1038/hdy.2011.21
Merson SD, Ouwerkerk D, Gulino LM, Klieve A, Bonde RK, Burgess EA, Lanyon JM (2014) Variation in the hindgut microbial communities of the Florida manatee, Trichechus manatus latirostris over winter in Crystal River, Florida. Fems Microbiology Ecology 87:601-615. https://doi.org/10.1111/1574-6941.12248
Metin S, Kubilay A, Onuk EE, Didinen BI, Yildirim P (2014) First isolation of Staphylococcus warneri from cultured rainbow trout (Oncorhynchus mykiss) broodstock in Turkey. Bulletin of the European Association of Fish Pathologists 34:165-174.
Morin PA, Martien KK, Taylor BL (2009) Assessing statistical power of SNPs for population structure and conservation studies. Molecular Ecology Resources 9:66-73. https://doi.org/10.1111/j.1755-0998.2008.02392.x
Morin PA, McCarthy M (2007) Highly accurate SNP genotyping from historical and low-quality samples. Molecular Ecology Notes 7:937-946. https://doi.org/10.1111/j.1471-8286.2007.01804.x

154 Murray RM, Marsh H, Heinsohn GE, Spain AV (1977) Role of midgut cecum and large-intestine in digestion of sea grasses by dugong (mammalia-sirenia). Comparative Biochemistry and Physiology Part A: Physiology 56:7-10. https://doi.org/10.1016/0300-9629(77)90432-7 Nei M (1987) Molecular evolutionary genetics. Columbia university press, New York, NY, USA, Nelson TM, Apprill A, Mann J, Rogers TL, Brown MV (2015) The marine mammal microbiome: current knowledge and future directions. Microbiology Australia 36:8-13. https://doi.org/10.1071/ma15004 Nelson TM, Rogers TL, Brown MV (2013a) The Gut Bacterial Community of Mammals from Marine and Terrestrial Habitats. Plos One 8:8. https://doi.org/10.1371/journal.pone.0083655 Nelson TM, Rogers TL, Carlini AR, Brown MV (2013b) Diet and phylogeny shape the gut microbiota of Antarctic seals: a comparison of wild and captive animals. Environmental Microbiology 15:1132- 1145. https://doi.org/10.1111/1462-2920.12022 Nielsen KA, Owen HC, Mills PC, Flint M, Gibson JS (2013) Bacteria isolated from dugongs (Dugong dugon) submitted for postmortem examination in Queensland, Australia, 2000-2011. Journal of Zoo and Wildlife Medicine 44:35-41. https://doi.org/10.1638/1042-7260-44.1.35 Ogden R et al. (2013) Sturgeon conservation genomics: SNP discovery and validation using RAD sequencing. Molecular Ecology 22:3112-3123. https://doi.org/10.1111/mec.12234 Pavoine S, Vallet J, Dufour AB, Gachet S, Daniel H (2009) On the challenge of treating various types of variables: application for improving the measurement of functional diversity. Oikos 118:391-402. https://doi.org/ 10.1111/j.1600-0706.2008.16668.x Peakall R, Smouse PE (2012) GenAlEx 6.5: genetic analysis in Excel. Population genetic software for teaching and research-an update. Bioinformatics 28:2537-2539. 
https://doi.org/ 10.1093/bioinformatics/bts460 Perez M, Romero J (1992) Photosynthetic response to light and temperature of the seagrass Cymodocea nodosa and the prediction of its seasonality. Aquatic Botany 43:51-62. https://doi.org/10.1016/0304-3770(92)90013-9 Peterson BK, Weber JN, Kay EH, Fisher HS, Hoekstra HE (2012) Double Digest RADseq: An Inexpensive Method for De Novo SNP Discovery and Genotyping in Model and Non-Model Species. Plos One 7:11. https://doi.org/10.1371/journal.pone.0037135 Pierce D (2012) ncdf4: Interface to Unidata netCDF (version 4 or earlier) format data files. R package, URL http://CRAN R-project org/package= ncdf4 Plon S, Thakur V, Parr L, Lavery SD (2019) Phylogeography of the dugong (Dugong dugon) based on historical samples identifies vulnerable Indian Ocean populations. Plos One 14:19. https://doi.org/10.1371/journal.pone.0219350 Pollock KH, Marsh HD, Lawler IR, Alldredge MW (2006) Estimating animal abundance in heterogeneous environments: An application to aerial surveys for Dugongs. Journal of Wildlife Management 70:255-262. https://doi.org/10.2193/0022-541x(2006)70[255:eaaihe]2.0.co;2 Poyart C, Quesne G, Boumaila C, Trieu-Cuot P (2001) Rapid and accurate species-level identification of coagulase-negative staphylococci by using the sodA gene as a target. Journal of Clinical Microbiology 39:4296-4301. https://doi.org/10.1128/jcm.39.12.4296-4301.2001 Preen A (1989a) Observations of mating behaviour in dugongs (Dugong dugon). Marine Mammal Science 5:382-387. https://doi.org/10.1111/j.1748-7692.1989.tb00350.x Preen A (1989b) The status and conservation of dugongs in the Arabian region. Metereological and Environmental Protection Administration Kingdom of Saudi Arabia Preen A (1995) Diet of dugongs-are they omnivores? Journal of Mammalogy 76:163-171. https://doi.org/10.2307/1382325 Preen A, Marsh H (1995) Response of dugongs to large-scale loss of seagrass from Hervey Bay, Queensland Australia. Wildlife Research 22:507-519. 
Preen AR (1992) Interactions between dugongs and seagrasses in a subtropical environment. PhD thesis, James Cook University, Australia Prichula J et al. (2016) Resistance to antimicrobial agents among enterococci isolated from fecal samples of wild marine species in the southern coast of Brazil. Marine Pollution Bulletin 105:51-57. https://doi.org/10.1016/j.marpolbul.2016.02.071

155 Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics 155:945-959. Proboste T, Corvalan P, Clark N, Beyer HL, Goldizen AW, Seddon JM (2019) Commensal bacterial sharing does not predict host social associations in kangaroos. Journal of Animal Ecology Rasheed MA, Unsworth RKF (2011) Long-term climate-associated dynamics of a tropical seagrass meadow: implications for the future. Marine Ecology Progress Series 422:93-103. https://doi.org/10.3354/meps08925 Raymo ME, Lisiecki LE, Nisancioglu KH (2006) Plio-pleistocene ice volume, Antarctic climate, and the global delta O-18 record. Science 313:492-495. https://doi.org/10.1126/science.1123296 Raymond M, Rousset F (1995) Genepop (version-1.2) - population-genetics software for exact tests and ecumenicism. Journal of Heredity 86:248-249. https://doi.org/ 10.1093/oxfordjournals.jhered.a111573 R Core Team (2016) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. 2014. Rose JM, Gast RJ, Bogomolni A, Ellis JC, Lentell BJ, Touhey K, Moore M (2009) Occurrence and patterns of antibiotic resistance in vertebrates off the Northeastern United States coast. Fems Microbiology Ecology 67:421-431. https://doi.org/10.1111/j.1574-6941.2009.00648.x Rosenberg E (2014) The family prevotellaceae. In: Rosenberg E, DeLong EF, Lory S, Stackebrandt E, F T (eds) The Prokaryotes. Springer, Berlin, Heidelberg, pp 825-827 Rosenberg NA, Li LM, Ward R, Pritchard JK (2003) Informativeness of genetic markers for inference of ancestry. American Journal of Human Genetics 73:1402-1422. https://doi.org/10.1086/380416 Schaefer AM, Bossart GD, Harrington T, Fair PA, McCarthy PJ, Reif JS (2019) Temporal Changes in Antibiotic Resistance Among Bacteria Isolated from Common Bottlenose Dolphins (Tursiops truncatus) in the Indian River Lagoon, Florida, 2003-2015. Aquatic Mammals 45:533-542. 
https://doi.org/10.1578/am.45.5.2019.533 Schork NJ, Fallin D, Lanchbury JS (2000) Single nucleotide polymorphisms and the future of genetic epidemiology. Clinical Genetics 58:250-264. https://doi.org/10.1034/j.1399-0004.2000.580402.x Schuelke M (2000) An economic method for the fluorescent labeling of PCR fragments. Nature Biotechnology 18:233-234. https://doi.org/ 10.1038/72708 Seddon JM, Ovenden JR, Sneath HL, Broderick D, Dudgeon CL, Lanyon JM (2014) Fine scale population structure of dugongs (Dugong dugon) implies low gene flow along the southern Queensland coastline. Conservation Genetics 15:1381-1392. https://doi.org/ 10.1007/s10592-014-0624-x Selkoe KA et al. (2016) A decade of seascape genetics: contributions to basic and applied marine connectivity. Marine Ecology Progress Series 554:1-19. https://doi.org/ 10.3354/meps11792 Sellas AB, Wells RS, Rosel PE (2005) Mitochondrial and nuclear DNA analyses reveal fine scale geographic structure in bottlenose dolphins (Tursiops truncatus) in the Gulf of Mexico. Conservation Genetics 6:715-728. https://doi.org/ 10.1007/s10592-005-9031-7 Shackleton NJ (1987) Oxygen isotopes, ice volume and sea level. Quaternary Science Reviews 6:183-190. https://doi.org/10.1016/0277-3791(87)90003-5 Shastry BS (2007) SNPs in disease gene mapping, medicinal drug development and evolution. Journal of Human Genetics 52:871-880. https://doi.org/10.1007/s10038-007-0200-z Sheppard JK, Jones RE, Marsh H, Lawler IR (2009) Effects of Tidal and Diel Cycles on Dugong Habitat Use. Journal of Wildlife Management 73:45-59. https://doi.org/ 10.2193/2007-468 Sheppard JK, Preen AR, Marsh H, Lawler IR, Whiting SD, Jones RE (2006) Movement heterogeneity of dugongs, Dugong dugon (Muller), over large spatial scales. Journal of Experimental Marine Biology and Ecology 334:64-83. 
https://doi.org/ 10.1016/j.jembe.2006.01.011 Smith SC, Chalker A, Dewar ML, Arnould JPY (2013) Age-related differences revealed in Australian fur seal Arctocephalus pusillus doriferus gut microbiota. Fems Microbiology Ecology 86:246-255. https://doi.org/10.1111/1574-6941.12157 Sobrino B, Brion M, Carracedo A (2005) SNPs in forensic genetics: a review on SNP typing methodologies. Forensic Science International 154:181-194. https://doi.org/10.1016/j.forsciint.2004.10.020 Sobtzick S, Cleguer C, Hagihara R, Marsh H (2017) Distribution and abundance of dugong and large marine turtles in Moreton Bay, Hervey Bay and the southern Great Barrier Reef. Townsville, Australia

156 Sobtzick S, Hagihara R, Grech A, Marsh H (2012) Aerial survey of the urban coast of Queensland to evaluate the response of the dugong population to the widespread effects of the extreme weather events of the summer of 2010-11. Final Report to the Australian Marine Mammal Centre and the National Environmental Research Program Sommer F, Backhed F (2013) The gut microbiota - masters of host development and physiology. Nature Reviews Microbiology 11:227-238. https://doi.org/10.1038/nrmicro2974 Steiner CC, Putnam AS, Hoeck PEA, Ryder OA (2013) Conservation Genomics of Threatened Animal Species. In: Lewin HA, Roberts RM (eds) Annual Review of Animal Biosciences, Vol 1, vol 1. Annual Review of Animal Biosciences. Annual Reviews, Palo Alto, pp 261-281. doi:10.1146/annurev-animal-031412- 103636 Stewart JR et al. (2014) Survey of antibiotic-resistant bacteria isolated from bottlenose dolphins Tursiops truncatus in the southeastern USA. Diseases of Aquatic Organisms 108:91-102. https://doi.org/10.3354/dao02705 Stoddard RA et al. (2008) Risk factors for infection with pathogenic and antimicrobial-resistant fecal bacteria in northern elephant seals in California. Public Health Reports 123:360-370. https://doi.org/10.1177/003335490812300316 Storfer A et al. (2007) Putting the 'landscape' in landscape genetics. Heredity 98:128-142. https://doi.org/10.1038/sj.hdy.6800917 Storfer A, Murphy MA, Spear SF, Holderegger R, Waits LP (2010) Landscape genetics: where are we now? Molecular Ecology 19:3496-3514. https://doi.org/10.1111/j.1365-294X.2010.04691.x Tamura K, Nei M (1993) Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Molecular Biology and Evolution 10:512-526. Thrasher DJ, Butcher BG, Campagna L, Webster MS, Lovette IJ (2018) Double-digest RAD sequencing outperforms microsatellite loci at assigning paternity and estimating relatedness: A proof of concept in a highly promiscuous bird. 
Molecular Ecology Resources 18:953-965. https://doi.org/10.1111/1755-0998.12771 Toma C, Higa N, Iyoda S, Rivas M, Iwanaga M (2006) The long polar fimbriae genes identified in Shiga toxin- producing Escherichia coli are present in other diarrheagenic E-coli and in the standard E-coli collection of reference (ECOR) strains. Research in Microbiology 157:153-161. https://doi.org/10.1016/j.resmic.2005.06.009 Tsukinowa E, Karita S, Asano S, Wakai Y, Oka Y, Furuta M, Goto M (2008) Fecal microbiota of a dugong (Dugong dugong) in captivity at Toba Aquarium. Journal of General and Applied Microbiology 54:25-38. https://doi.org/10.2323/jgam.54.25 Turner S, Pryer KM, Miao VPW, Palmer JD (1999) Investigating deep phylogenetic relationships among cyanobacteria and plastids by small submit rRNA sequence analysis. Journal of Eukaryotic Microbiology 46:327-338. https://doi.org/10.1111/j.1550-7408.1999.tb04612.x Valencia LM, Martins A, Ortiz EM, Di Fiore A (2018) A RAD-sequencing approach to genome-wide marker discovery, genotyping, and phylogenetic inference in a diverse radiation of primates. Plos One 13:34. https://doi.org/10.1371/journal.pone.0201254 Van Oosterhout C, Hutchinson WF, Wills DPM, Shipley P (2004) MICRO-CHECKER: software for identifying and correcting genotyping errors in microsatellite data. Molecular Ecology Notes 4:535-538. https://doi.org/ 10.1111/j.1471-8286.2004.00684.x VanderWaal KL, Atwill ER, Hooper S, Buckle K, McCowan B (2013) Network structure and prevalence of Cryptosporidium in Belding's ground squirrels. Behavioral Ecology and Sociobiology 67:1951-1959. https://doi.org/10.1007/s00265-013-1602-x VanderWaal KL, Atwill ER, Isbell LA, McCowan B (2014a) Linking social and pathogen transmission networks using microbial genetics in giraffe (Giraffa camelopardalis). Journal of Animal Ecology 83:406-414. 
https://doi.org/10.1111/1365-2656.12137 VanderWaal KL, Atwill ER, Isbell LA, McCowan B (2014b) Quantifying microbe transmission networks for wild and domestic ungulates in Kenya. Biological Conservation 169:136-146. https://doi.org/10.1016/j.biocon.2013.11.008 Venables W, Ripley B (2002) Modern applied statistics with S. 4 edn. Springer-Verlag, New York

157 Viricel A, Rosel PE (2014) Hierarchical population structure and habitat differences in a highly mobile marine species: the Atlantic spotted dolphin. Molecular Ecology 23:5018-5035. https://doi.org/ 10.1111/mec.12923 Wallace CC, Yund PO, Ford TE, Matassa KA, Bass AL (2013) Increase in Antimicrobial Resistance in Bacteria Isolated from Stranded Marine Mammals of the Northwest Atlantic. Ecohealth 10:201-210. https://doi.org/10.1007/s10393-013-0842-6 Waycott M et al. (2009) Accelerating loss of seagrasses across the globe threatens coastal ecosystems. Proceedings of the National Academy of Sciences of the United States of America 106:12377- 12381. https://doi.org/10.1073/pnas.0905620106 Wells K, Lakim MB, Pfeiffer M (2008) Movement patterns of rats and treeshrews in Bornean rainforest inferred from mark-recapture data. Ecotropica 14:113-120. Whitehead H (2017) Gene-culture coevolution in whales and dolphins. Proceedings of the National Academy of Sciences of the United States of America 114:7814-7821. https://doi.org/10.1073/pnas.1620736114 Whiting SD (2008) Movements and distribution of dugongs (Dugong dugon) in a macro-tidal environment in northern Australia. Australian Journal of Zoology 56:215-222. https://doi.org/10.1071/zo08033 Wolanski E, Kingsford MJ (2014) Oceanographic and behavioural assumptions in models of the fate of coral and coral reef fish larvae. Journal of the Royal Society Interface 11:12. https://doi.org/ 10.1098/rsif.2014.0209 Wolanski E, Spagnol S (2000) Sticky waters in the Great Barrier Reef. Estuarine Coastal and Shelf Science 50:27-32. https://doi.org/ 10.1006/ecss.1999.0528 Young P, Kirkman H (1975) The seagrass communities of Moreton Bay, Queensland. Aquatic Botany 1:191- 202. Yugueros J, Temprano A, Berzal B, Sanchez M, Hernanz C, Luengo JM, Naharro G (2000) Glyceraldehyde-3- phosphate dehydrogenase-encoding gene as a useful taxonomic tool for Staphylococcus spp. Journal of Clinical Microbiology 38:4351-4355. 
Zakrzewski M, Proietti C, Ellis JJ, Hasan S, Brion MJ, Berger B, Krause L (2017) Calypso: a user-friendly web- server for mining and visualizing microbiome-environment interactions. Bioinformatics 33:782-783. https://doi.org/10.1093/bioinformatics/btw725 Zeh DR, Heupel MR, Hamann M, Jones R, Limpus CJ, Marsh H (2018) Evidence of behavioural thermoregulation by dugongs at the high latitude limit to their range in eastern Australia. Journal of Experimental Marine Biology and Ecology 508:27-34. https://doi.org/10.1016/j.jembe.2018.08.004 Zeh DR, Heupel MR, Hamann M, Limpus CJ, Marsh H (2016) Quick Fix GPS technology highlights risk to dugongs moving between protected areas. Endangered Species Research 30:37-44. https://doi.org/10.3354/esr00725 Zhang CH et al. (2010) Interactions between gut microbiota, host genetics and diet relevant to development of metabolic syndromes in mice. Isme Journal 4:232-241. https://doi.org/10.1038/ismej.2009.112 Zhang JJ, Kobert K, Flouri T, Stamatakis A (2014) PEAR: a fast and accurate Illumina Paired-End reAd mergeR. Bioinformatics 30:614-620. https://doi.org/10.1093/bioinformatics/btt593

APPENDICES

Appendix 0.1. Animal ethics approval letter copies for sample collections.


Appendix 3.1. Locations (from north to south) along the east Queensland coast from which dugong tissue samples were collected for microsatellite DNA analysis. Average summer and winter sea surface temperatures are presented. N is the number of samples from each location. 'Not recorded' indicates that data were not available.

Location               GPS location     N   Dates of collection                 Dugong Protection Area  Average summer temperature (°C)  Average winter temperature (°C)
Torres Strait          -10.03, 142.15   38  1997-2999                           no                      28.74                            25.95
Starcke River          -14.77, 145.05   1   2012                                no                      28.37                            23.77
Cape Flattery          -14.96, 145.35   1   2011                                no                      28.37                            23.77
Daintree               -15.47, 145.31   1   2010                                no                      27.97                            23.67
Port Douglas           -16.61, 145.57   2   2011                                no                      28.26                            23.57
Cairns                 -16.81, 145.87   10  1999, 2011, 2012                    no                      28.12                            23.52
Yarrabah               -16.99, 145.97   1   2012                                no                      28.06                            23.53
Cardwell/Hinchinbrook  -18.21, 146.05   7   1999, 2010, 2011                    yes                     28.82                            23.31
Balgal Beach           -19.03, 146.47   1   2009                                no                      28.74                            22.98
Townsville             -19.27, 146.89   23  1999, 2000, 2007, 2009-2011, 2013   yes                     28.44                            22.90
Bowling Green Bay      -19.27, 147.26   8   2011, 2015                          yes                     27.82                            22.89
Home Hill              -19.67, 147.63   1   2011                                no                      27.97                            22.81
Upstart Bay            -19.75, 147.69   1   2011                                yes                     28.12                            22.76
Bowen                  -19.99, 148.29   2   1999, 2011                          yes                     27.68                            22.38
Airlie Beach           -20.23, 148.71   2   2016                                no                      27.38                            22.22
Midge Point            -20.65, 148.75   4   1998, 1999, 2013                    no                      27.85                            21.33
Cape Palmerston        -21.63, 149.47   1   not recorded                        no                      27.20                            20.65
Clairview              -22.07, 149.55   15  1997, 2013, 2016                    yes                     27.08                            20.50
Shoalwater Bay         -22.27, 150.21   67  2005, 2006, 2012, 2014              yes                     26.72                            21.40
Port Clinton           -22.53, 150.81   1   2014                                yes                     26.34                            21.52
Yeppoon                -23.13, 150.79   2   2015                                no                      26.52                            20.21
Gladstone              -23.47, 151.17   8   2010, 2011, 2014, 2015              yes                     26.29                            20.40
Rodds Bay              -23.97, 151.69   1   2011                                yes                     26.04                            20.61
Bundaberg              -24.71, 152.35   3   2010, 2013                          no                      25.96                            19.48
Woodgate               -25.09, 152.59   2   2009, 2011                          no                      26.01                            19.23
Hervey Bay             -25.23, 152.75   30  2010-2011                           no                      25.97                            19.13
Great Sandy Straits    -25.33, 152.95   30  2006-2008                           no                      25.86                            19.13
Moreton Bay            -27.44, 153.39   30  2002-2007                           no                      23.88                            21.04


Appendix 3.2. STRUCTURE bar plots showing individual sample assignments to K = 2, 3, 4 and 5 clusters. Each individual is represented as a column and samples are arranged from north (left) to south geographically. Each colour shows the estimated assignment probability of each individual to each cluster.


Appendix 3.3. The oceanographic model domain over the whole Great Barrier Reef, showing the non-structured mesh with a very high resolution in the Whitsundays area and the reefs offshore. The colour bars indicate the bathymetry scale ranging from 0 to 100 m, with dark blue water indicating a depth of < 10 m. White indicates land.


Appendix 3.4. Allelic richness values per locus for Cluster 1 (northern cluster) and Cluster 2 (southern cluster). These values were calculated from 1,000 replicates, allowing individuals to be randomly assigned to one of the two clusters in each replicate based on their assignment probabilities from STRUCTURE.
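The resampling described in Appendix 3.4 can be sketched in R as follows. This is a minimal illustration only, not the code used in the thesis: the assignment probabilities (q) and genotypes (geno) are hypothetical toy data for a single locus, and richness is simplified to the count of distinct alleles per cluster, without the rarefaction normally applied when estimating allelic richness.

```r
set.seed(1)

# Hypothetical assignment probabilities (rows = individuals, columns = clusters),
# standing in for the Q values exported from STRUCTURE
q <- matrix(c(0.95, 0.05,
              0.90, 0.10,
              0.60, 0.40,
              0.30, 0.70,
              0.10, 0.90,
              0.02, 0.98), ncol = 2, byrow = TRUE)

# Hypothetical diploid genotypes at one microsatellite locus (allele sizes)
geno <- matrix(c(150, 152,
                 150, 150,
                 152, 154,
                 154, 156,
                 156, 156,
                 154, 158), ncol = 2, byrow = TRUE)

n_reps <- 1000
richness <- matrix(NA_real_, nrow = n_reps, ncol = 2)

for (r in seq_len(n_reps)) {
  # Draw one cluster label per individual from its own assignment probabilities
  cl <- apply(q, 1, function(p) sample(1:2, size = 1, prob = p))
  for (k in 1:2) {
    # Pool the alleles of all individuals assigned to cluster k in this replicate
    alleles <- as.vector(geno[cl == k, , drop = FALSE])
    # Simplified richness: number of distinct alleles (0 if the cluster is empty)
    richness[r, k] <- length(unique(alleles))
  }
}

# Average over replicates, one value per cluster
mean_richness <- colMeans(richness)
print(mean_richness)
```

Individuals with intermediate assignment probabilities contribute to both clusters across replicates, which is the point of resampling rather than assigning each individual to its most likely cluster once.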

Appendix 3.5. Hardy-Weinberg Equilibrium for each cluster at each locus. P-values were considered significant if < 0.05. P-values for loci that differed significantly from Hardy-Weinberg Equilibrium are marked with an asterisk (*). No loci were out of Hardy-Weinberg Equilibrium in both clusters.

Locus     Cluster 1  Cluster 2
H200B02   0.3076     0.0142*
H200B05   0.3816     0*
H200C03   0.0002*    0.0942
H200B01   0.1545     0.0038*
H200C09   0.0569     0.4889
H200E08   0.5525     0.0706
H200D11   0.0101*    0.9257
H200E04   0.0012*    0.3021
H200A12   0.2002     0.5218
H200F11   0.0552     0.0552
H200G12   0.6824     0.2156
H200E11   0.4396     0.1414
H200G11   0.3948     0.2834
H200H09   0.8161     0.0293*
H200C11   0.5102     0.0947
H200H04   0.2174     0.426
H200E03   0.0696     1
H200E09   0.5792     0.0057*
H200H02   0.055      0*
H200A01   0.7459     0.0192*
H200F06   1          1
H200G10   0.0477*    0.5478


Appendix 3.6. Sub-structuring of samples from Cluster 1 (Torres Strait to Airlie Beach). STRUCTURE bar plot shows individual sample assignment to K = 2 clusters. Each individual is represented as a column and samples are arranged from north (left) to south geographically. Each colour shows the estimated assignment probability of each individual to either of the two clusters. No samples were available from a > 500 km region between Torres Strait and Starcke River.

Appendix 3.7. Sub-structuring of samples from Cluster 2 (Midge Point to Moreton Bay). Individual sample assignment values for K = 4 from the program STRUCTURE. Each individual is represented as a column and samples are arranged from north (left) to south geographically. Each colour shows the estimated assignment probability of each individual to each of the four clusters.


Appendix 3.8. Sub-structuring of samples from Cluster 2 (Midge Point to Moreton Bay). Individual sample assignment values for K = 2 from the program STRUCTURE. Each individual is represented as a column and samples are arranged from north (left) to south geographically. Each colour shows the estimated assignment probability of each individual to either of the two clusters.

Appendix 3.9. Distribution of mtDNA haplotypes across geographic sub-clusters identified from microsatellite analysis (sub-clusters 1a, 1b, 2a, 2b). The total number of times each mtDNA haplotype was represented in each microsatellite sub-cluster region is shown.

Columns: Haplotype number; Torres Strait (1a); Starcke River to Airlie Beach (1b); Midge Point to Gladstone (2a); Bundaberg to Moreton Bay (2b); Totals; GenBank accession. Only the non-empty sub-cluster counts are listed for each haplotype, in column order, followed by the row total and the accession number.

h1R    2 3 77 137   219   EU835761
h2R    13           13    EU835762
h3R    38           38    EU835763
h4R    5            5     EU835764
h5R    9 3 1        13    EU835765
h6R    2 5 14 34    55    EU835766
h7R    28           28    EU835767
h8R    1 11 46      58    EU835768
h9R    4            4     EU835769
h10R   2            2     EU835770
h11R   2            2     EU835771
h12R   2            2     EU835772
h101R  1            1     xxxxxxxx
h102R  1            1     xxxxxxxx
h103R  1            1     xxxxxxxx
h104R  1            1     xxxxxxxx
h105R  1            1     xxxxxxxx
h106R  1            1     xxxxxxxx
h13W   2            2     EU835773
h14W   24 4         28    EU835774
h15W   2 21 1       24    EU835775
h16W   22 2         24    EU835776
h17W   3 4          7     EU835777
h18W   1            1     EU835778
h19W   1            1     EU835779
h20W   2            2     EU835780
h21W   15           15    EU835781
h22W   26 19        45    EU835782
h23W   1            1     EU835783
h25W   1            1     EU835785
h27W   1            1     EU835787
h33W   10 3         13    EU835793
h34W   1 1          2     EU835794
h40W   1            1     EU835800
h42W   2            2     EU835802
h45W   4            4     EU835805
h46W   1            1     EU835806
h201W  1            1     xxxxxxxx
h202W  1            1     xxxxxxxx
h205W  1            1     xxxxxxxx
h206W  3            3     xxxxxxxx
h207W  2            2     xxxxxxxx
h208W  1            1     xxxxxxxx
h209W  0 7          7     KJ944389
h210W  1            1     xxxxxxxx
h211W  1            1     xxxxxxxx
h212W  1            1     KJ944382
Totals 139 102 177 221   639

R - restricted mitochondrial lineage; W - widespread mitochondrial lineage

Appendix 3.10. Proportion of explained variance of each variable’s influence on an individual’s population assignment.

Appendix 3.11. An animation of the potential waterborne particles emanating from the three colour-coded Whitsundays seagrass meadows over 184 h during the prevailing SE winds. The colour bar shows the depth in metres with the scale ranging from 0 (dark blue) to 100 m.

Appendix 3.12. An animation of the potential waterborne particles emanating from the three colour-coded Whitsundays seagrass meadows over 184 h during calm weather conditions. The colour bar shows the depth in metres with the scale ranging from 0 (dark blue) to 100 m.

Appendix 4.1. R code for SNP population structure analysis and diversity metrics.

# install and load the required packages
install.packages("dartR", dependencies = TRUE)
install.packages("BiocManager")
BiocManager::install(c("SNPRelate", "qvalue"))
library(devtools)
install_github("green-striped-gecko/PopGenReport")
install_github("green-striped-gecko/dartR")
library(adegenet)
library(dartR)
library(ape)
library(pegas)
library(vcfR)
library(dplyr)  # provides %>% and left_join(), used below

# set the path

## can use the unfiltered vcf file from AGRF but need to find a way to convert from vcf to genlight
vc <- vcfR::read.vcfR("./Alex_data/8035-1A.vcf")
d <- read.csv("./Alex_data/metadata.csv")
d$ind <- as.character(trimws(d$ind))
d$id <- as.character(d$id)
d <- d[order(d$id), ]

### Load in genlight object and rename
gg <- vcfR::vcfR2genlight(vc)
test <- as.matrix(gg)
test.names <- data.frame(ind = trimws(gsub('-.*', '', gg@ind.names)))
test.names <- test.names %>% dplyr::left_join(d)
pop(gg) <- as.factor(d[, 2])
table(pop(gg))

### once you have a dataset, you need to recalculate metrics
gg <- gl.recalc.metrics(gg, v = 3)
saveRDS(gg, "Duggonsgg.gl")
gg <- readRDS("Duggonsgg.gl")

# to visualise data
gl.plot(gg)  # smear plot of SNPs - white = missing data
gl.report.callrate(gg)  # indicates how much data will be lost by filtering and allows the threshold to be set

### Filters
# recalculate the matrix with applied threshold of 95%
# NOTE: gg1 is not created in the code shown here; judging by the comment below, it is gg after filtering individuals
gg2 <- gl.filter.callrate(gg1, threshold = 0.95)
gg2 <- gl.recalc.metrics(gg2, v = 3)  # this is the matrix with both filters: the individuals and the 95% threshold
saveRDS(gg2, "duggonsgg2.gl")
gg2 <- readRDS("duggonsgg2.gl")
test <- t(as.matrix(gg2)[1:10, 1:50])
test[is.na(test)] <- 9
gl2structure(gg2, outfile = 'E:/SNP_workshop/gg2.txt', outpath = '.')
gl.plot(gg2)
gl.report.callrate(gg2)

# then filter for monomorphs - removing all monomorphic loci left after filtering
gg3 <- gl.filter.monomorphs(gg2)
gg3 <- gl.recalc.metrics(gg3, v = 3)  # recalculate on gg3, not gg2, so the monomorph filter is kept
saveRDS(gg3, "duggonsgg3.gl")
gg3 <- readRDS("duggonsgg3.gl")
gl.plot(gg3)
gl.report.callrate(gg3)

# dimensions of the matrix ind x loc
dim(as.matrix(gg3))

# showing the first 10 individuals and the first 10 loci
as.matrix(gg3[1:10, 1:10])

# run sNMF to determine K (number of clusters)
library(LEA)
write.geno(test, output.file = 'test.geno.format.geno')
list.files()
project.snmf <- snmf("test.geno.format.geno", K = 1:10, entropy = TRUE, repetitions = 100, project = "new")
project <- load.snmfProject("test.geno.format.snmfProject")

# plot cross entropy to select K
plot(project.snmf, cex = 1.2, col = "blue", pch = 19)

# get the cross-entropy for K = 3
ce <- cross.entropy(project.snmf, K = 3)

# select the run with the lowest cross-entropy for K = 3
best <- which.min(ce)
my.colors <- c("tomato", "lightblue", "olivedrab", "gold")

# plot barchart of assignment probabilities to each cluster for each individual
bp <- barchart(project.snmf, K = 3, run = best, sort.by.Q = FALSE, border = NA, space = 0, col = my.colors, xlab = "Individuals", ylab = "Ancestry proportions", main = "Ancestry matrix")
# NOTE: reduced.test is not defined in the code shown here
axis(1, at = 1:length(bp$order), labels = rownames(reduced.test), las = 3, cex.axis = .4)

# make Q matrix of best run
Q.3 <- Q(project.snmf, K = 3, run = best)

# diversity metrics
gl2structure(gg3)
allel.mat <- as.matrix(gg3)
allel.mat[1, 1:10]
rownames(Q.3) <- rownames(allel.mat)
devtools::install_github('nicholasjclark/STRUCTURE.popgen', force = TRUE)
library("STRUCTURE.popgen")

# need to have / separated allele data
allel.mat[allel.mat == 0] <- '0/0'
allel.mat[allel.mat == 1] <- '0/1'
allel.mat[allel.mat == 2] <- '1/1'
allel.mat[is.na(allel.mat)] <- '-9'
allel.dat <- data.frame(allel.mat)
allel.dat[20, 1:10]

Q.3 <- data.frame(Q.3)
Q.3$ind <- rownames(Q.3)
colnames(Q.3) <- c('one', 'two', 'three', 'ind')
Q.3 <- Q.3[, c('ind', 'one', 'two', 'three')]

# Fst
cluster.fst <- fst.STRUCTURE.popgen(qmatrix = Q.3, allele.dat = allel.dat, nclusters = 3, nreps = 100, NA.symbol = '-9', nsamp = 1000, ncores = 3)
# Fis
cluster.fis <- fis.STRUCTURE.popgen(qmatrix = Q.3, allele.dat = allel.dat, nclusters = 3, nreps = 100, NA.symbol = '-9', nsamp = 1000, ncores = 3)
# heterozygosity
het.stats <- obs.metrics.STRUCTURE.popgen(qmatrix = Q.3, allele.dat = allel.dat, nclusters = 3, nreps = 100, NA.symbol = '-9', nsamp = 1000, ncores = 3)

# observed heterozygosity
het.stats$obs.Ho %>%
  dplyr::ungroup() %>%
  dplyr::group_by(cluster) %>%
  summarise(lower = Hmisc::smean.cl.boot(Ho)[2], mean = Hmisc::smean.cl.boot(Ho)[1], upper = Hmisc::smean.cl.boot(Ho)[3])
# expected heterozygosity
het.stats$obs.Hs %>%
  dplyr::ungroup() %>%
  dplyr::group_by(cluster) %>%
  summarise(lower = Hmisc::smean.cl.boot(Hs)[2], mean = Hmisc::smean.cl.boot(Hs)[1], upper = Hmisc::smean.cl.boot(Hs)[3])

## 3.2. Genomic relatedness matrix
# We can calculate genomic relatedness via:
grmatrix <- gl.grm(gg3)
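The call-rate and monomorphic-locus filters applied in Appendix 4.1 can be illustrated on a plain genotype matrix. The following is a minimal base-R sketch under stated assumptions: the matrix geno is hypothetical toy data, and the logic only approximates what gl.filter.callrate and gl.filter.monomorphs do. It is not dartR's implementation.

```r
# Hypothetical toy genotype matrix: rows = individuals, columns = loci;
# 0/1/2 = number of alternate alleles, NA = missing call
geno <- matrix(c(0, 1, NA, 2,
                 0, 2, NA, 2,
                 1, 1,  0, 2,
                 0, 0, NA, 2,
                 2, 1,  1, 2),
               nrow = 5, byrow = TRUE,
               dimnames = list(paste0("ind", 1:5), paste0("loc", 1:4)))

# Call rate per locus: proportion of individuals with a non-missing genotype
call_rate <- colMeans(!is.na(geno))

# Keep loci genotyped in at least 95% of individuals
threshold <- 0.95
kept <- geno[, call_rate >= threshold, drop = FALSE]

# A locus is treated as monomorphic when every called genotype is
# homozygous for the same allele (all 0 or all 2)
is_monomorphic <- apply(kept, 2, function(x) {
  x <- x[!is.na(x)]
  length(x) > 0 && (all(x == 0) || all(x == 2))
})
filtered <- kept[, !is_monomorphic, drop = FALSE]

print(round(call_rate, 2))
print(colnames(filtered))
```

In this toy example loc3 is dropped by the call-rate filter and loc4 by the monomorph filter, leaving loc1 and loc2. Applying the call-rate filter first matters because removing poorly genotyped loci or individuals can leave previously polymorphic loci monomorphic.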


Appendix 4.2. Table of highly discriminatory SNPs showing consensus sequences for each catalog tag and the SNP position.

No.  Catalog Tag Id  SNP position (Reference>New)  Consensus sequence
1  41  75A>T  AATTCCCAAGGTCGATATCATTTTTACTATTTATAATGAAATTATAGTTAGGTCAGGAGGACTTTTCAACCTGGAACTTTATAAAGACTACAAAGAAGATTTAATGAGAAGAATTACTATGAAATAAGAATTGGGTAGCA
2  365  111G>A  AATTCATTCCAGCTAGTAATGAAAACTTCTCTGGTTGGGCACACTGAATTTACAATCAATAATTAGCTCTAAATTGAATTTATAATTGCTTTTGTAATGCTATTATAATCGTGGGCATTCAGAATGCTGACTTCTCAGAC
3  996  75G>A  AATTCCAGCATTGTTTTTATGATGTCCCAAGTTGTCTTCAGTTAACCCAAGGCAGGGACAAGAAATTCTAAAACGAAATGAAATACTCCTTTGTCAAAAACCTCTCTCTCTATGGAAAATGTGGGGTCATAAATAGCCTC
4  1165  66T>C  AATTCAATTGGTGTTGTAATTCAGCCACTCTTACAATAAGACTCTGGGTTTGGTTTTTGGCGATATCTGCTCTGTTGCTAGAAAAATAAAAGCTTTCTTTCAGGGCACAAGTGGCAACTTTGAGGTCATTTATGCAGCTC
5  1358  97A>T  AATTCTCCTCTTCTAGAGGCAGTCGGTACATTCTGTCTTTGGACACTTGTGACTGCCCAAAAATCTCCCTTGAGAAGTGGGTGTGATACCATAAACAGGAAAATGTACATATATGGACCATCTGCAATTCCATATTCGGT
6  1413  76C>T:95A>T  AATTCTATTCACTATTTCAATTTAGGAAAACCAATTTCTATCTAGACAAAGCCTAAATCTAAGCTCACTTTGCTACTTCTTTTCTTTTCTTTTTATTGGCATAGTTAGGTTCATTGGCTTAAACAAAATCCATGCTTCCA
7  1480  52C>T  AATTCTAATAAATAAGGGATGCACTGACCTGCAATTCACTCCTCTCAGGGGCATTACTATCCAGAGGAAATGCTTTCAGCATGAACATTCTGGAAGCACTAAAATGTACAAAGTTCCAAGACCACTACTAAGAAAGGGAA
8  1521  46T>C  AATTCTTTCTCAGGTGCTTCCATTGCCTTATTTTCCTCTGGAAAGTTTCCTAGTTTCTTATTTTTGTTGCTTGCTAGAGCTATCTTGACCTGCGTCTTTATGTGATTTGATATTGGTTGTTGTCTCTGAGGCATCAATAA
9  1610  104C>T  AATTCAGACTTCTAGCCTCCTGGACAGTGAGAGAATACTTTTCTCTTTGTTAATTGTGGTATTTCTGTTATATTAGCACTAGATAACTAAGACACCTAGTGACCGGCAAATAGAAAAGTCCTCAGTACACAATACCAAAA
10  1880  38C>T  AATTCTTCTATTACCCAGTGATGTGTAAGCAGGTTATTATTCAGTTTCCATGTAATTTAGCTTTTATCTTTGCTCTTTCTGTTGTTAATGTCTAACTTGAGGGCACTGTGGGCAGAGAAGATACTTTGTATTATCTCAAT
11  2085  100T>C  AATTCTCTGTTGTAAATATCACTGAGAAAGGCCCATCAGGGAGAAACAAAGGGCCAAATGTCACAGCTGTGAACACATGTGAGACATGCCCTGGAGATATCAACAATTGACTTTAGGAAGAACTAGAGGGCTCAAAATAA
12  2346  110T>A  AATTCAAAGTGTCTCTATTTGCAGATGATATGATTTTATACACAGAAAACCCGAAAGAATCCACAAGAAAACTACTGAAACTAATAGAAGGTTTCAGCAAAGTTTCAGGATACAAGATTAACATACAAAAATCAGTTGGA
13  2407  101A>C  AATTCCTTCTTTAAGAAACTTCAGTTTTTGCTCTTTAGGCCTTCAACTGATTGAATAAGTTTCACACATATTCTCCAGAATAATTTTCATTACTTAAAGTCAACCTACTGTAACTGTTAATCACATCTACAAAATATCTT
14  2575  85G>A  AATTCTGTGTACTACACCTGGAAGATTGTGATGCCTACAGTCTAGAAGGAAGGAGAAGACACAGATTTAAATACATGTTTCTCAGTCTCTTGATTCAACAGAGATTCAATATATTGTCATTAGTAGGTATCCTTGGTTTT
15  2611  87T>C  AATTCATGAAGATTAAACGGGTTCCATACATGATCAACAAGCACAGCCACCTAGAAGCATTGATATATGGCTCAAAGTCCCATTGCTTGCACCTCTTTACAGAGGTCTCTGTCCTCTGACCCCACTAAAGGCTGTCTGGC
16  2735  102A>G  AATTCCTCGCTAAGATAATATTCTATCCCTTGGGGAAGGATAACTTAAAGGGGTTAAGAAGTTGTTGGTAGAGCAATTAATTTAGGAAGTTGCTTTTAGTTAGGACTTCAGGCTATGCTAAAAAGATAGGTCACTGATAT
17  3290  109T>C  AATTCTATTGTGTCCCCCCAAAATATGTGTTAGAATCCAACTCCTATACCTGTGGATTTAATGCCATTTGGAAATTGGATTTTCTTTGTTATGTTAATGAGGCCATATCAGTGTCGGGTATGTCTTAAGTTTTATCACTT
18  3885  32T>C  AATTCAACAACATATTAAAAAACATAATTCATCATGATCAAATGGGATTTTTACAAAGTAGGCAGGGATGGTTCAACACTAGAAAAACAATCAATGTAATCCATCACATAAATAAAACAAAAGACAAGAACAACATGATC
19  3894  111T>G  AATTCCAGGAAAGTGCTCTTAGTCAGCACTGCCTGAACACGGTTTTCTTCTGAACTATCATGATATCACTGCAAAAGAGCCAATAATCCTACTTCCTGCAGGAAAAAAGTGATTTTATTTTTTAGTAAATGGGTTTAGTT
20  3952  39T>G  AATTCATGTAAGAAATACTTTCAAAGAGCCTTAATAACTGAAAGGAAAAAAAATGACTTTGGCAAAGGATTCCTGCCAAACTTTGGCTGATTTATCTTGATGCTTCAAATAAAACTCAATTCTTTTCATCGGCTCCACTG
21  4126  93A>G  AATTCCAGCAAACTCTGGCAGTAACAGAATTAGAATAAACTGATTTCTTTCTTTATATTAGATCGGAAATGGAGAAAACAAGGCTGTAGATAGGATAATTCATCTCTGCAAAAAAGAGTCACTTAAGATGAAAAACTGTT
22  4203  108T>C  AATTCCAGAACACTTAATTGTGCTTATGAGGAATGTGTACATAGATCAAGAGGTAGTCATTTGAACAGCACAAGGAGATACAGCATGATTTAAAGTCAGGAAGGGTGTGTGTCAGAGCTGTATTCTTTCACCATACCTAT
23  4261  78A>G  AATTCTGTGAGCACAGGTAATATTGAAACGAGAAGTTAGGTAGATAGAAGAGTTCCAAAATCATATTTCCAATACTTATGAATTGAAAATAGAACTGAACTAGAACTAGTAAGAACTGGACACAGATAGGACTTTTTTAA
24  4489  60T>C  AATTCTATGCTAAGGTTTAAAAACCGGTTGATTACACATACAAATCTGGATTTCCAGTTTTGTTGAGTCAGACAGCCCATCTGTAGCCCGGGCGCACATTTTCAGAGGATAGCGATTGTGTGGGGAGTCCCTGGGTGGTG
25  4957  33T>C  AATTCGAACTGTCCTTTAAATATCACCATCATTGTTGCAGCCTATAAGCCCATCTGTATAACTTTGCCTAGCTTTTAAGACTAATGAAAAAGGAAATAGTTCAACTCCATACATTCTGAAAAGGCTTCAGTATTAATGTG
26  5240  65C>T  AATTCTTGGCTCAGTGCTAGAGACAGATGACCCAATAAGCCAGGTACTCACGGGCTTACTTGTGTTTACTTACTAATCTGCAGTGCACAATCTCACTGGTTTTTTGCCACATCAAAGGCTCAGGGCCTGTAAACTTTCAA
27  5486  95A>C  AATTCTACCTCCAGTAATTCCAAGAACTCCTCTTCCTTTGGTAAGTTTCTTGGTTCTTTGTTTGGGGGAACTTGCTGAAGCCATCATGGTCTGCCTCTTTATGTGATTTAATATTGACTGTTGTTTCCAAGCTATCAGTA
28  5497  87G>T  AATTCATTTCCACCTACCTCCTCACACTCAACCAACCCCTAAATTTACACACTCACATTCTTTCCTACATGCATCCTTTTTATGGTGAGGTTGCTTGTTGTTAAAGGTATTTAACCACCTAACAAGTATAATCAAAAAGG
29  5538  41A>G  AATTCAAACATGTTGCCGTTGTCTGACTCATGGCGACCCCATGTGTGTTAGAGTAGAACTGAGCTCCCTAGGGTTTTCAACGTCTGTAATTTTATGGAAGTAGATCACCAGGCCTTTCTTCCATGGTGCCTCTAAGTAGA
30  5813  47T>C  AATTCTACCACTGAGCTACCAATGCCACAGTGAATTAGGAGGTTGTTTGTGCTTCGAGGAGCCCTGGTGGCGCAGTG
GTTAAGAGCTTGGCTGCTAACCAAAAGGTCGGCAGTTCAAATCCAACAGCTGCTCTTTGGAAA 31 5952 AATTCAACTTGTTAGGATACTGTAGAATTTTATTTATTCTCTTTAATTAATATTGGGTATTGAGATTTCTCAAAACCATAA 56A>G TCACCTAATTGGGGAAAAAAATTACTCAGATATAAATCAGATGCACAAAATAATAACCA 32 6103 AATTCACTTATGTACACAGCTTTGGGCAGATCGAGACCCAAGAGTCAGATTCCTACATGGTGACATACTAGCAGGAC 36G>A TTAAAAATAGATGACAGTGGTCAGTGTCTGAAAAAAGTACTAAGAAAAACCAATTGCTATGGA 33 6588 AATTCTCAGTTCTGTTCCATTGGTCTATGTGTCTGTTGTTGTGCCAGTACCAGGCTGTTTTGACTACCGTGGCGGTAT 73C>T AATAGGTTCTAAAATCAGGTAGTGTGAGGCCTCCAACTTTGTTCTTCTTGAATAACGCTTTA 34 7108 AATTCCCGTTCTTTGTGATGTTATCTGTAATTTGTTATGATCCACACAGCTGAGTGCCTTTGCACAGTCGATAAAACAC 71A>G AAGTGAACATCTTCCTGGTGTTCTCTTTTTTCAGCCAGGATCCATCTGACATCGGCAATGA 35 7376 AATTCAGGAAGGCTAGCCAAAGTGTCATGAAGCCTAAGTAAAGTTGACACAAAATATTCTACTTAATAATATAATTTAA 59T>C TTGTATTTTGGAAAATAGAGTTACTTCTCCAAACACACAGACATAAGTACTACTCAGAGAC 36 7623 AATTCAGTTTATAAACCTTTCATTCTGAAAAATAAATTGAGAACCTTCAGAAATATATTGAGTATAAAGTGCTATGCTGA 77C>A AACTGATGATTATCAAAAGAAAAGCCTTTACTCCTGTTAAAGTGGCACTAATTGTAAAGA 37 7856 AATTCAGTTCAGTGTGTTATTCTTAATATATTGTTCTTAACTATTAGTGAATAAATCCACTTATCAAAAATTTATTTTTAA 71T>C ATTAGGTAAAGAAGAATATAATTCTAAAACATAACTTCTTTTTAAAGTATATCCAAAGA

182 38 7925 AATTCTTGAACCAGCATCATTGTGCATTTTACAGAGATAACAAGCAGCATCAGCTTGCCAATGCGTACTCCACAGAAG 56A>T TCTAATTCTAGCTCTAAGGAGAACTTCCTCTGCATTTCTGAGGCATCTGTTCCATTCTCAGA 39 8157 AATTCATTAGACAAGATTGGACTGTAATTCCACTTTTTCATTTGTTTCCCTTCAAAGGTATCTTAACAGATAAACGGCT 76G>A GAGGCATTAATCTTCCTAGAGAAAAGTCACAGCATAAACCAGCCTACACTGATAACAGCCA 40 8347 AATTCAGGGTTCCAGCTCCAGGGGAAGGCTTTCTCTCTGTTTGCTCTGAGAGAAGGTCCTTGTCATCAATCTTCCGTT 87A>G:102G>C GGGTCTAGAAGTTTCTCAGCACAGGGATCTGGGGTCCAAAGGATGCGCTGTTCTCCTTGCTC 41 8880 AATTCATCATGCACCTGACTCCTCTTCTTACTTATTTACCTACCATCTGATTGGAAGAAGAGAGAGATCTAGCCCGTC 45A>T AGTGCATGCAGCTGGTTCAGGGGAAATCTTTGACTTCAAATAGGAATAATGAGAATCTGGAT 42 9122 AATTCTATGAGCATGTGCCCATTACAACATTTTTTTTGCTGTGAATTGAGTTCCTTGATCGGAAGCAATGTTGTATGGA 33C>T AGACTGTAACTGGGAATAAGCCATCTGGTAAGTAAAGGATGGTAGTTTGGGCAGAAAGATT 43 9361 AATTCAGGAACAGGAGGAGACAAAAAAGAGCACAGAGAGAATTTTTGAAGATTTATTGGCAGAAAACTTCCATAATAC 72A>C CATAAAAGATGAGAAGTTTCCATCCAAGAAGCTCAGTGAACCACATATGTGATAGACATCCA 44 9569 AATTCCCAAACTACAAGCGCAAAGCAACTTGGTGATCTCTCACCAAAAATTATTTTAATGGTCTATTGATGACATAGGT 108A>G CCCCCTTCAATTTTGCTGAAAGCAAATAATTAGGAAACTCCTGTCTCACTAAAGGATTTAT 45 9610 AATTCGATATGAACGCGGAGTATCTGACGCAGGGCTGTGCTCAGAACACAGCTGCTAGACCGTAATTTAAGGACTGG 48A>G TCACTTCCCTTCCATCTGTGGTCTTGAGAAGCCCAGGAGGAAAGCTGCTGCAGAACTGTGGAG 46 9646 AATTCCAGAACACTTAATTGTGTTCATGAGGAACCTGTACATAGGCCAAGAGGCAGTCGGTTGAGCAGAACGAGGGG 35C>A ATACTGCGTACTTTAAAGTCAGGAAAAGTGTGCCTCAGGGTTGTATCTTTTCACTATACTTAT 47 9749 AATTCTCATCCTCAGCCAGTTCCTTGTGCCCCTGTGCGTATCCAGTTTAACAAGTGCTTAAATTATATATGCAATTTTC 94A>G TGAGTGGTTCGCTTACTTTTCAAGTTTGTAAACTAACACATCAACAATTCCCAATCGATAG 48 9878 AATTCATGTTATGGTCCTGAGCCTTTGACCAAATAACCTCAGATGCCTTGGGAGAGTGACAGCTTTCCCCTCAGCTG 102G>A GGTATATGACTTTACAATTCAGGGAATGTGTACAAATAAGTGAGAAGGTGGAGATGTAAAGAA 49 9989 AATTCTGAGGAAATTTATTTACCTATTAATACATTCCTTTATTTATATTGGCGCTTTTATTAATTTATTTTATTTTTGCTGT 72G>A TGTTGTTGAGAATATACACAGCAGAACATACACCAATTCAACAACATCTACATGTACA 50 10027 AATTCACGAGTACTGATAAAGGAGAGAATCAAACAGTGTGCAAACAGACCTGGGAAGGAAGGGGAGAGAAGCAGGG 63G>A 
TCTAAGGGTGGGGAGTAGATTATAAGTCATGAACTTAGGGACTATTGGCTAATTTAAAACAAGT 51 10162 AATTCAGGAAAACAATGCAAAATAAATTCGCAACTAGAAATTATACAAAAACAACAATTAGAAATCCAAAAGATAAACA 32A>G ACAAGATTTCAGAAATGGACGGTGTCATAGAAGGGCTGAGGAGCAGATTTGAAATGGTGGA 52 10300 AATTCTCTATCATCTGCTACTTTGATGACATCAACAGGGTTTTATATCCCGGCGTCAGCATGTTCCCCAGCTAAGGGC 51G>A AGGGACTGGAGAACACCTGGGTTAGGGGTAGGGGCTACATTTGCCAAACATGGCATCTGTGG 53 10571 AATTCCTATGTATGCATGTGGATTGCAGAAAAATCGTGAAAGGTAAATTTAAAGAGTTACTATTGTCCATATAATTTCA 34T>C AAAGCAATAATTATGAAAAAAATCTTCAGGCTTTTTACAGCTGTGAAAAAAGGTTGACACG 54 10584 AATTCTAGAAGACTCGTTCTTGAGTTGTTATTTTTGTCAGGTACCACTGAGTCAGTTCCAACTCAGAGTGACCCGATG 74C>T TACAACAGAACAAAATACTACTCGGTCCTGCACCATCCTCATAATCCTTGCTATGTTTGAAT 55 11114 AATTCTAATGGCTAAAATTTTGTTGAGGATTTTTGCGTCAATGTTCATGAGGGATATCAGTCTGTAATTTTCTTTTTTTG 51A>G TGGTGTCTTTACCTGGTTTTGGTATCAGGGTTATGCTGGCTTCATAGAATAAGTTTGGGA 56 11174 AATTCTTGCCTTCCATGCAGGAGACCGGGGTTCAATTCCCAGCCAACGCACCTCAGTGCACAGCCACTACCTGTCAG 97G>A TGGAAGCTTACATGTTGCTATGATGCTGAATAGGTTTCAGCAGAGCTTTCAGACTAAGATGGA 57 11768 AATTCCATTATTAAAGCTTAAATGTAAAAACATACAGCTTATCAACAGTCTAATATCCTTAATTCTCCATTCCACACTGC 47A>C AGTTATGCAATTTAAGCTTTAGAAGTTCACACACAGAGCCCAAAGGAATTTTATTTAACT

183 58 11839 AATTCTCCAGTGAAGCCAACTGGGCCAGGGCTTTTTTTTTGTTGGGAGTTTTTTTTCAATCTCTTCTCTTGTTATGGGT 68C>T CTGTTCAGATTTTCAACATCAATTTGTGTTAGTTTGGGTAGCGTGTTTCTAGAAATTTGTT 59 12664 AATTCAGTTGTAAAATGGTAGATTAATTGCCTTTTGTATTAAATGAAATAAGAATCTCAAATATGCAAAGTCAGGAAATT 74G>A TAATTGTGATCTTCAATTGCTTATTCAATGCATTGGAAAACCGAGTGCAGAAGTACAGAA 60 13006 AATTCTGGTCTTCTATTTCTTAGGGCAGTGACTCAACTAATTGCCCAAGGGACTCACAAAGTACAATATTTTTATAACA 88A>G GAATTTTTATTCTTTTGCTGTGGTCACTAAAGAGAACATTGTTTCTTGGAAACATGGACTC 61 13934 AATTCTCAGTCCTTGGAGACAAGGAATCAATGGTAGGTAATTAAGTACATCTCTCAGGCATCTGGGGAAAGCGGGGG 89T>C CAATACAATTATAAATGCAACTTGAGACACACACAGGTTATCCCAGTCAGTAAATGGCATTCT 62 14136 AATTCTGCTTAAAGAATTTTTTAAAATTAAAGCATGAAGTCTTTAGTTTGAAAATTAAAAATAGACATTTGACTGTTTTTG 32G>T AATGTGTGGTAGGATTTCAATAGAGAGGACGCATGTTTTACCCCACCATCTAGATGTAG 63 14344 AATTCCAGAACACTTAATTGTGCTCATGCGGATCCTGTACATACACCAAGAGGCAGGAGATACTGCGTCATTTAAAAT 58A>G CAGGAAAGGTGTTCCTCAGGGTTGTATCCTTTCACCATACTTATTCAACCTGTAAACTGAGC 64 14410 AATTCTAGACTCACTGTGTTTGTGCTGATAGCCACTGAACTAAAAATAAATTGGTAGATTCCTGAGTTCTCAGATTATG 40G>C TATATAAATTCCCTTAAGTTCTTATTCCCTCTTGATTGTAAGGATTGTAGGTTTAAAAGAG 65 14639 AATTCTTTGGCCATTAGGAAAACCTATGAACCCCCTCTCAGAATAATGCTTTTAAATGTGTAAAATAAAATAGATTAAA 72A>G AAGGAAGGCAATTATATTGAAATACAGTTCTCAAAATATTAAAAATCCAATTTGTAGTATG 66 15700 AATTCTGCTCTCTGCTGTCCAGGCCTGAGTCAGTGGCTGCATCCCAGGAAGCGAGGGCATAGAGAGGGTGCATGCT 73C>T AAGCTGGTGGAGTCCAGGTACTGGGCTTTTAGGTGAGGCCCTGGGGCACTCAGGAAGGCATTAG 67 16274 AATTCAGCATCATATCAAAAAAAATAATACACCATGAACAAGTAGGATTCATACCAGGTATGCAAGGATGGTTCAACAT 98G>A TAGAAGATCTATCAACGGAATCCACCACATAAATAAAACAATCACATGATCATCTCAATTG 68 16486 AATTCCTATCTAGAGAAGTCTGTTAACATGCACATATGAATTAGCTTTGTTTACCATTACCCTCTAAGCTAACAATCAG 37C>T AATTTTATTAATTAGAAATAAATGTCTTCCCATCCTTGTTGATTGTTATGAAATGAATTTT 69 16727 AATTCACAGGGGGGCTGGAGGACATGTTAAACTTTCAAGGGAAACACAATGATGCTCAGTATCTGTCAGCCACTGCA 87C>T CGAACCAGTTCACAGTTTTCATGTTAGATATCCCTGCATTCCCTTCAATGACGGCGTATCTTT 70 16908 AATTCATTTGCTTTCCAACGCTGACTATGCCTCTAAAGTAACATCACTATTTCTTTAAAATATTTCACTCAAAGTTTATTT 35G>A 
AGGAAGCCTTATATTGATAGATAGTGCAAAGTTAAACTGCAGGCAAACGTGATTTTTTT 71 17731 AATTCAGGAAAATAATGCAGGAACAAAATGCCAAAGTACATTCACAGTTAGAAATCATACAAAAACAACAATGAGAAAT 42C>T CCAAAAGATAAACAACAAGATTTCAGAAATGGACAGTGTCATAGAAGGGCTGAGGAGGAGG 72 17835 AATTCCACAACACTTAATTGTGCTCATGAGGAACCTGTACTTATATCAAGATTAGTTGTTTGGCCAGAACAAACGGATA 63A>G CTGCTGGTTTAAAGTGAAGAACAGTATGTGTCAGGGTTCTATTCTTTCACCATAGTTATTC 73 18505 AATTCAGCAACATATGAAAAAAATAGTTCACCATGACCAAGTGGGATTCATACAAGGTATGCACGGGTGGTTCAACAT 54A>G:90A>T TAGAAAAACAAACAATGTGATCCATCACATAAATAAAACAAAAGATAAAAACCACACGGTCT 74 18849 AATTCTTTGGAAAATGCTGGGAGACTTAGTGCTTCAAGGTATACAAGGAACTCTCCGTCCAATCATTAACCAATGATT 87T>G AAGCTAACTGAACAGAAACTGAACATGGCAAGTAAAGACTTTACATAATTAGTCCAGTAAAG 75 19427 AATTCACCCCAGTTAGTAATTGCTTTCTTTCCTGTTTCCTTTCCACTGCACCTTCGGATACAGAAATACATTTTGTTGTT 60A>G GTTGTTAGTTGCCGTCCAGTTGGCTCTGACTCATGGGGACCTTCGGTATAACGAAAGAAA 76 19635 AATTCTACCCTTAGGACAAATCTATGCCCACATTATGCCTTCCACGTATTTTATTCACTTTTTAAAATAACTGTATGTAG 77G>C TAACTGTAACAGGCTTTTGTGAGGTGAGGAGAATCTAAGTTTAGAAACAGAGATGTGCAG 77 20017 AATTCATTTCCAGCATTAAGAAATTGGGAGATTTTACATAAAGATACGGATTTCTGGCATCTCTTGAAAATTAGAATATT 92C>A GAAAACACTGGATCTCCATTCCCACCTGGCTGTCATCATCAGAGGCTGAAGAACACCTGC

184 78 20254 AATTCCGTTAGCAGCATGTCTCTGGGATAGGATGGTTTATGGTAGGTGCATCAAGAATCCATGGATTTGATGGGTAT 84A>C GATTCCATGGAAAATGTCATGTCAGCTATATAATTACCTGAGCATAAAATGGAGATGTTTCAT 79 20593 AATTCACATGACCTGCTGCTATATAAAATGAAAATCTCTGCATACGTGTATTCTTGTACTCAGACGAGTTTTGGGTTTT 65C>T TCTATGATTATTCTGTGGATTCAAATAAACCGGCCATTTCTACTTAATTCAACTTAAGTAT 80 20676 AATTCAGCAAAGTTGCAGGGTACAAGGTCAAGAAACAAAAATCAGCTGGGTTTCTATACACCAGCAATGAGAAGTTTG 59C>T GAAAGGGAATTAGGAAAACAATTCCATTTACAACAGCATCTAAGAGAATAAAATAACTAATA 81 20788 AATTCAAGTGCCTCTGTTTAGGGGTGAAGAAATGGACTCGAGGGGACCTAAGTGACTTGCCTGATGTTACAAAAGTA 33T>C GATGTGACCGAGGTGAGACTCAGACTTGGGGCTCCTGGCTCCCAGTTCTGTACTCTGTGCCTG 82 21546 AATTCTCATCCTAAACTTATTATTTAAGGATACATATACATGTCCTCAGTCTGTATGGAAAATCCAGGGCTTCCAAGTG 97G>C GTTCTTTCTGGTTGGAACCCCTGCTGAGAATTTGCATTTCCATCCCAGGTATACTGAATCA 83 22104 AATTCCTGGCAGTTCCCTGTCACTATACTGCTACAACTGCTGTTGAATGATCTTCAGCAAAATTTTCTCTGTGTGTGAT 68T>C ATTAATGATATTGTTTGATAATTTCCACATTTGACTGGATCACCTTTCTTTGGATAACTAA 84 22455 AATTCCATTTTATTTGGCTACACAATCCCCATGGATGCAGCAGTCAGGAAATCAAACAACACGCTGCATCAGGCAAAT 71A>G CTGCAGCAAAAGACCTCTTTAAAGTGTTAAAAAGCAAAGGATGTCATTAAGGACTAGGGTGC 85 22811 AATTCACTAAATAGCAGGTAACACAACAGAAAGATGCTTTTGGAGCTGGAAAGACTGGACATCGTGCTCCCCGACAC 54A>G CACTGGTAAGAACATCCCAATACCAGGATTAAATTTAGTCAAGAAAGAACCTGCATATGGAGG 86 22857 AATTCTATCTTTTGGAATTATTTTAATTATTTACACAATCAATAAAATAAAATCAACAGGGCTAGGGAAAAACTCTAAAG 100G>C TGAAATATTAGCAGAAAATGAATTGGCAAATGGAATATAAGCAGAAAGTGCATTAATTTG 87 23013 AATTCTGAGCTCTTAATACCTGTACACTGGGCCCAACATATTTAATGCATTGTGGGAGCCTGTTTGCATTTTAAATGCA 75A>G CTGTGTTTTGAACTTGAAGGTGACTCCCAAGGACTTACCCTGTTTGCTTTTGTTCCACTCT 88 23209 AATTCCCTTGGAAATGAAGAAAATCTACATATCTTCCAAAGATGCACATGGAAACAAAACAAAGTATTTTAGTGAAAGG 90C>T AAAATATTTGTGCTGTATTATTCTTTTAAAGAATCATACGTGCAGAAGACATCAATATTTG 89 23671 AATTCTTTTTTTCTAGCTTCAGACTCCAATATTGAACTCTGTCAAACATCTTTACCAGGATATTTCACAGGCGCCTGAA 58A>G:73A>G GAGTATTCCAAAATATGTGCAATGACTTGGAGATGAGAAGGGAGAGGTTTGTAGATGAAAT 90 23848 AATTCCATTCTGTTGTGCATCGGGTCACCATGAGTCAAGGCCGACTCAACAGCAGCTAAGAACAACAACAATAAAAG 40G>A 
ACGCGTTATTTCAGTGAGCGCTGTGAAGGTGGAGTGAAAGCTTGACTATATGCTATAGCAGTA 91 24158 AATTCCCGTGCTTTGCAATGTTATCCGTAATTTGTTATGATCCACAGAGTTGAATGCGTTTGCATAGTCAATAAAACAC 42C>G AGGTAAACGTCTTTCTGGTATTCTCTGCTTTCAGCCAAGATCTGTCGACATCAGCAATGAT 92 24331 AATTCCATATGGCAAATAAATACATGAGAAATTAATTACAGATATCTCATTGTCCTATAAAAATGAAAGCACAGTGCAT 71G>A AAGGAAGTATGTACAAGGCTTTGGGTGAATCTGGAAGCAGTCTGAACTTACATCAGTAGGA 93 24710 AATTCCTGAGTCATCAAAAGTAGTCAATTAGATATATGAGATGTGTGATTAGGGTGAGGAAAGAGTTAGTCCACTACT 77C>T TCTGTCTTTTGAGACTAGGGGGATGGGTTTTGCCACAGAACGATACAGACTTAACAGAGTAA 94 25284 AATTCACTGGGGGGCCAACTGTTTGATTTCTCTCTCTCATGCTTGACACAAGAATGAAATTGTAGCTGCGTTTATAGC 63C>T GGAAGGTTGGATTATATTCTCTTTTGTAGGAATTAAACCCCATAGAATAATTCATGGACAGC 95 26006 AATTCAGATTAGTAGGCAAATGCTACTTGACCTGAACAAGAAGAATCTTAAGGACTCTTACCGGGGGTATTAAGAGTG 36A>G CAATGAAGAAGGTAAGAACTTTAACAGGTCAAGAGATGGAGGCTGACTGGTGGTGAATAAAG 96 26383 AATTCTGAGCGTTGACTGCATCCAAGATGCAGACAGAAGAGAAGGAACTGCAGAGAGTTCATAAGGAGATCAACTGT 82A>G GATCGGAGAACCAGGAACCCACAGGGTCACTCAGGCCACAGAGGCCAGTATCAAGGAGGGAGT 97 26401 AATTCTGGATTTAGCAAATCCAATCTAAGTAAATAATGATCATTTTTTTTAGCAAACGGGTAAATAGAAGCATTATTAAA 58A>G:74A>T CTCACTGTGTTTGACTATTGTGGTTGTTAGCAAATGTTTGTTTTTTAACTCTTTCCTGTA

185 98 26733 AATTCATCCATTCCAGTACAGAATATCCAAAAGAGTAAAGAAAAAGCATTATTCACTCAAGCAGATATCTCTTGCATAG 33G>A ACATTTCTTTAGAAAACGATAACTTTTCAAGAGCCAGGTCTTGAGTTGCATATGTAGTACA 99 26902 AATTCCGACACCTGGACTACTAACTGTGAGAAAATGAATTTTTGTTCTTCAATGCCACCCACTTGTGGTATTTGTGTTA 52G>A TGGCAGCAGTATGTAACTAAGACAGTCACTGTATTGTTCCTGTGGCCCTAGGTAAAACTCA 100 26985 AATTCTGCTATTGGGCTACAATGTTCAGTTTACTACCATTTGACAATACATGTAGAATAGCACAGGTTACTTTCTCCCT 64C>A GTCCACTGCTGCTTTGATTATCACTTTCCAGAAAGTGTTGGTAAAATGTGTTTTATTCTTT 101 27175 AATTCAGTGGAGCTGATGTGATGAAGAAACAATGAATGGAGAAAGGAATCTCCTTAACTTAGGTGGCTATTACTTTTT 45G>A TCTTCCAATCTGAGGAAGGGCATGATATCAAAGAAAAGGAAATATAAACCTGATGAGAGATT 102 28664 AATTCTGGGAGACAGCTAGAACAAGCAGCTTAAAAAATCGGCTCTCCCAGGGCTGCAGTTTGCCATTCCTGTCTTAG 33G>A AAAAGGCCCATTCTAGAATGATCATGATGAGGATTGGGTCTCCACCCAATGCCTCCAGCATGG 103 29535 AATTCACACTGGTCTAGATTTATCTGACACTTCCTTTTCCTTTCTGTATAGTCTCGGGTAAATTACTTAACCTCATTTTG 36C>T CCTTCACCTCCTTATCTGTAAAATGGGATAATAATAGTATATCTAGATACATAAAATTAA 104 29588 AATTCCTATTTACTTGATGTTTTTTGAGAGCTTGTATTATTAAAATTAGAGAAAAGTTATCGTAGAATGGTGTATATTGG 48A>G CACTTGAGAAATTTGGTTGACAAGGTTATTCTCTGGGTTTATTTGCTCTTAAGCTGTAAA 105 29875 AATTCTGTTAAGGAGGCTGACAATTGGGTATCGCTCTCTCTCCCTGTTTCTTGTCAGAGAAGATCGCTCTATTCTGAG 66T>G CTCACAGTAACTTTTTCTGTCTCCATGACTACCTATGACTACTTAATTGTACTATAGTAATT 106 30256 AATTCCCAAGAGTCTTAGGAGCCAAAGATGGCTGAAAGTGTTGGCTCTTAGGTTACCAGGAGCTTTTCATCCAGACC 70T>G TTGGAGAATCAGCAGCATGAACAAGAGCTGAAAGGGGGTGGAGTAATGAAGTAGGAGCCACCA 107 32308 AATTCAGCAACATTGAAAGATACAAGCTCAACACACAAAAATCAGTTGAGTTCCTTTTCAATAACAAAGAGAATCTAGA 73A>G AAAAGAAATCAAGAAAACCATATTATTTACAATAACCTCCAAACTGATAAAATGCCTAGGA 108 32870 AATTCACAACAGGAGACAGAGATACTGTGTAAACAAGCCTAGAATGAAACAAGGTGAGTGGGATGCAAGCCAGGGTT 84A>G AAAAAGAGTAAGCACTGGAAGACTGGGTGAGAGCTCCAGTGATCCTGTTGACCACAAAGAAGA 109 37826 AATTCTCTTTTTTTCAATTTGTAGGATAATTTTCAATATTAATAAAGAAGTATTTTTTTCAAAATTATTATTGAAACCTACA 37T>C TGGGCCAAACCTAGAAATATATGTTAAGTGGTTAAAAGCACATGAGGACAACTAGCTG 110 39126 AATTCTCTTTTTTTTGCCGTTCAAGTTCTAGATGCCAATCAAACGTTGTGGTAGATTAAAAACTCTTTACTGCTCTTCTG 88G>A 
ATTAAGAGTGGATTCTATCTCCCTACCCTTGAATTTGGACTGCTTCATGACTTGCTCTGC 111 39514 AATTCATGGTCCCAGGTAGGACAGCAACATCTTTTGTGGAGCAGACAATATCACAGCTGGTGCAATAATCATTTACAT 39C>G TGCTTCATTTTGGATGTGTCCAAGCATCAGCCACACAAACTATTCCCCTTCACTCGGTCCTA 112 40071 AATTCAAATCAAGCTTATCTGACTCCTAAATCATTATGTACACTCTATGCTAGCAGTATATTTATATTCAGGTTTCTTTT 72A>G GCTATTTCACCATAAAATGCTGACACCCCATCTTAGCGTTGGAAGAGATTTAACTCTCAA 113 41381 AATTCTGACATGAAAAAATTGTTGTTGTTCTTAGGTGCCGTTGAGTTGGTTCTAACTCATAGTGATCCTATTACAACAG 83G>A AACGAAATACTGCCTGGTCCTGTACCATCCTCACAATCTTTGCTATGCTAGAGCCCATCAT 114 41561 AATTCAAGCCTTGAGAAACAATATTTAAGGATTCTAAAGGGAAAATATTAAACAACGCAAGAAGCATCAAAAGAAGAT 56C>T GAATGTAATACACAGAGTCACTGTACCAAAAAGAATTAGTTGACGTTCAACCATTTCAGGAG 115 42113 AATTCAGACCTAGAGACAAGATTCAACCTTGAACTGAAGAGCACCAAGGGCGACAGCAACACGGGGGAGACACCTG 82C>T GTCTGCGTGGGGTAAACTAGAGGGAGAAGTGCTTCCTGTAGAATCTGTGCCTGCCTCAGTTACT 116 42162 AATTCACCAGCCGCTCCTTGGAAACCCTATAGGGCAGTTCTACCCTGTCCTATAGGGTCGCTATGAGTTGGAATCAA 40T>C CTTGACGGCAATGTTATTTTTTGTGGGGGGAGGTAGGGTCACAGGTAATAAGTAGTCCAGCTG 117 42315 AATTCTACCACTGAACCACCACTGCCTCCATAGGCAGAAAATACCCTCCTAAATTTTTCCATGTCCTAATTCTGGAAAC 84G>A CTGTGCACATGTTACCTTATATGGCAAAAGAGAATTTGCAGATATGATTAAATTAAGGATC

186 118 43916 AATTCAGGAACCTTTTCTCAGGTAGTTCTGCTTCACTGCAGATAGGTTTGGTACATTCTCTTTATATGTTTTAGGCAAT 98A>T TACTGGGCTCTCTTCTGTATTCTTTCACTGGCACCTTTGTGGCTGCTCAAAGTCACTGACC 119 44114 AATTCAGAGGGGAAGGTGGAATGTGTTTTCCAAAACCCAAAGATGCTAGCAGGGGAGAAGGAACAAGAGGAGTAAT 35A>G TATATCCGTTTGAGCCATTGGAAAGACTCCCTCCTCTCAGGAATGCCTTGCTACTTTTGCTTTC 120 44668 AATTCCATGGTGAATCCTTAGTCAATTTATGAAATAATTAACTCTTAATCCGGCATTTATATGAGTATTGGCTTTTATAC 61A>G ATGCAGTTTACTTCAAGTGGGACACGTGTAGACGTATGATGAGAGAATAGATACTAAGAA 121 44838 AATTCAGACTTGATGTTGGGTTGTTTTAGCTTTTGTTTATTTGTGTCATGAGTTACTTAAGTACATAGTGTGTCCTCTGT 84C>A CTGCTTATCAGCCCCGAGTTCATCTTTGCTGAATATCACTGTTAAAAAAGATAGTCTTCA 122 45505 AATTCCTGACGGGCTCTCAGGGACACTGACATTGTGGGAGACGCATTCCCTTGCCCCATGATCGTTGCCTCTGTGG 63C>T GAAACCCCAAGGAACCAGGGCAGGATATGCACCGTCTCCAGCTCCACTCTTTGCTATTTTCACA 123 45899 AATTCACAATACATAGGGAAATCACATCAGATCATAAAATGGTAGACAATCACACAATACTGGGAATCATGACCTAGC 53T>C:68T>C CAAGTTGACAGATATTTTGGGGGCCACAATTCAATCCATGACAGGAGTTTTTTGGGTTTTTT 124 47938 AATTCAGAACTAGAAATTATACAAAAACAACAATTAGAAATCCAAAAGATAAAAAGATTTAAAAAATGGATAGTGTCATA 34T>C GAAGGGCTGAGGAGCAGACTTGAAATGGTGGAAGACAGGATCAGTGGAATTGAAGACAAA 125 53313 AATTCATTCTAGGAAGCCAGCATAACCCTGATACTGAAACCAGGCAAAGATGCCACAAAAAATGAAAATTACAGATCA 35C>T ATATCTCTCATGAATAAAGATGCAAAAATTCTCAACAAAATTCTAGCCAGTAGTTCAGCATC 126 57442 AATTCGGGGGGTCTCTGAACTTGATCCAAGCTCAAGGTGTTCACACTTCCGTGGGGACCCCTCCCTGGAGTGTAGG 37A>G CAGGACATGGGACTGGCTCCCAGTATAGAACATGGCAGAAGCGGTCAGTGTCAGTCAGGGATCG 127 57545 AATTCCTGTTTGTGAGAAAATAAGTGTCTCTTTTTCTAGTAATTCAGGGATAAGCTATTTGAGGTTGAAGGCAAGGCC 98G>A CTGGGGGCTGGGTATTAGCGAAAGCCAAGGCACAAAGGATGTGGCTAAACACCTTTGCATGA 128 58173 AATTCTACCACTGCCCTCTGCTCATATATAAAATGGGACTAATAACCACAGTCTTACAGGCTTTCTTGTAGAATAAATG 40T>G AAATAAAGTACCCTTAAGTACGTCTTCTTGAACTGTGGAGTGAATAAATTTGGCTGCTAAA 129 58962 AATTCAAAAGTAGATATATCTACATGACCACGGCGCCATCAAGGCTCCTGTATTTCCACACATAGTTATACAATATGAA 31C>T CACGTGAAGAATATTGTTCATCGTATTTTTATGTATAACGCTTGTGGAGATTACAAGTTTC 130 59547 AATTCTTCTGGATTGAAAATGAATCTTGGTTTGATACTTGCTGAAGCATCAGAGGCCTACTGTGGTCCAGAACTAACC 
52A>G TCTAGGTGGTATGATGAGACCCCAGAGTTGTTCTGGGGCTAGAAGCAGTGAAAGGGGCTATT 131 61038 AATTCCAGTAGGCTTGTTGCCCCAAATTGGAGCCTATGGTTTTCTTTAAATAATTGGACAAATGGAAATCCCCCAATG 86A>G GATAAACAGAGGTCTGATTGGGTGTCAGAAAATATAATGCAGTAATTTAATGCAAAACTATT 132 61744 AATTCAAAATAATATCAAAAAATAATAATAATAATACACCATGACCAAGTGGGATTCATACTAGGTTATGCAAGGACAG 86A>G GTCAACATTAGAAAATTGATCAATGCAACCCACCACATAAATAAAACAAAAGAATGTGATC 133 63453 AATTCTGCCAGAGTAGTGTAGTGGCGTCTTCTTGAGAACGCAGAAATGTCCACCAGGAAAGGAGCACATAGGCCCTT 88G>A CATTACCCACGCCGGTTTTTTGCCATTATGATACTTTCTCTCTTGGCATCTTATACAGCTTTC 134 63539 AATTCTCATACCTTGCTGATAGGAATGCAAAATAGCACAGAATACACACACAAACCCACTGCTGTCAAGTCAATTCCG 87T>C ACTCATAGTGACCCTACAGGAGAAAGTAGAACTACCCCATAGGGTTTCCAAGGCTATAATTT 135 63631 AATTCCTATGAGGTCAGCCCTACCCAAGTCATGTATAAAAATCAAAATGTATTCCTGAGGTGAAATGAGTATCCAGAG 32T>A GTGCATCCTGAAGGATAGGAAAGTGGTTGTTTGGTGGATGGCAGTAGATCTGACTCACCAGA 136 63775 AATTCTATAATGTCGTCGCGTTTTCAACACGGAAACATTAGGAGGAGACACATTTTATGTTTAATTTCATTTTAGGAAT 37A>C TTCTGTTCAACTTATTTCCTAACGAAGCTCTGTAGCCAGTCAGTTGATAACTAATCTTTTA 137 65231 AATTCAGGAAGTGTTGCTTCCTTTTCTATGCTCTGAAATAGTTTGAGTTGTACTGGTGTAAGCTCTTCTCTGAATGTTT 42T>C GGTAGAATTATCCAGTGAAGGCATCTGGGCCAGAGCTTTTTTTGGTTGGAAGTTTTTTTTT

187 138 66481 AATTCAGGGCACCAGCTCAGGAAGAAGAAGGCTTGTTCTCTCTGTCAGCTCTGGGGAAAGATTATTGTCTCTTTTCA 60G>A GTTTCTATTCCTGGGTTCCTTGATTGACATGTGGCAAGTGTCTTCCTTGCTTGTTCTTGTTTA 139 67992 AATTCTGGTGGCTTTTGTGGAATTTTTTTTATCACCTGTCTGTCAGTTTGTCTTACTGTGGTGTTTTGTGTGTTGCTAT 91T>C GATGCTGGTAGCTATGACACCGATATTTCAAATACCAGCAGGGTCACCCATGATGGACAAG 140 72383 AATTCTCTGACGGAGGCAGTCAGTGAACACTGTCAGTTGCCAAGCTATTGTTAGGAGGTGGATAAGGGAAGTATTAT 66G>A CTGTAGATTAAAAAGAAAAGAAAATCCCTGAGCCTAGAGGATTGTAGGTGGCTTAAACAGGCA 141 72411 AATTCAGAAATCTCAGTTGGAGAATGGGACGTAGTGGATGTAGTGGATGCAACAGCTTCCCACATACAACACGTGTT 73A>G TCTCACATCTCTTGTATAGAAAGTGATTGCTCTGCCAGATGTATAATGGTATAGTAAGCTCTA 142 72753 AATTCACATTGCCTTGTGCTAGAAAGATCTGATCACTTCACTGCTGATCCGTGTCACACACAGACTTGAGTCACCTGC 80G>A CGTAGTCATCTCCATAAGCGACAGGTGTTGCTCTGTCTTTATCAGATGTCACACGCACACAC 143 72844 AATTCTCCCTCTGGTAATTCTAAGAAATTCTCTTCCTCCAGAAAGTTTCTTGATTCTTTGTTTGGTGGCTTGCTGCAGC 38C>A:63T>A CTTCATGTTCTGCCTGTTTACGTGATTTGATATTGAGCTGTTACCTCCAAGTCATCAATAA 144 73540 AATTCAGTTCCCCAGTTAATCAGACAATCTTCCAAACTGCTTCCCTCAGGTAAATCATGCCTCTGTATTATTTAAATATT 71T>C:83G>T CTGACCAGAAGCAAAGCATGACTAATCTCAATGTTATTTGTATAGGGTATTTAAACAGAT 145 74898 AATTCAACTGATTTTTATATGTTTATCTTGTATCCTGAAACTTTTCTGAAATATTCCATTAGTTCCAGTAGTTTTCTTGTG 45T>G:86C>A CATTCTTTGGGGTTTTCTATATATAAGATCATATCATCTGCAAATAGAGATACTTTGAC 146 77759 AATTCTATCAAACATTTAAGTAATAGTAAACACCAATACTTCTCAAACTCTTCCAAATAAATAGAAGAGGAGAGAACAC 53T>C:64T>G TTCCTAACTCATTCTATGAGGCCAGCATTACCCTGATACCAAAGCCAGACAAGGATACCAC 147 78931 AATTCAGTGAAAGGGATGTTCTGGAGCTTCTGAGTCCCAGAGCTCTGTCATAACAAGCGTTGGCAGTTTCCACCTAA 80T>G GCGTCTTGGATTGCTTGCTCTTGGTGTATTTCCGCTCAGGATTCTCCCTCTGGGAAACCAGTC 148 79636 AATTCAAAGGAATGTCAAAGATCACTGGCAAACATCAGAAACTAGGAGAAAGGCTCACAGAAGGTATCAACATGGCT 65T>A GAGACCCTGATTTGGAATTTTAGTCTCCAGAAGTGTGAGAAAATAAATGTCTACTCTTTACAG 149 79655 AATTCCCAGTAAAAAGGTCTTGAAGGAGCATGCAAAAAGGCTCCATAGACTTAAGATGACATAATGTGCAGAAAATGA 50C>G AGCTGAGAGTATGAAACCCCAGTTTTAATCTCCATTCAAGTTGTATAACTTTGGGAAAGTTG 150 79782 
AATTCTACAGTGAAGCCAGAGTAGAATTATCTATTTTCCCCCAAAATATTGTTGAAATACTTTTACATAGTGTTAATGTT 48G>A GTTGCTGTTGTTGGGTACTGCTGAGTTGATTCTGACTCATGGGGACCCCATGTGACAGGA 151 80868 AATTCTAAATGGCAAATTTGGTAATATAAAGATGGTGAGTTAAATTTGGATGTGGATGTATTTGCCTGAGTGACAGTAC 52G>A ATAAAGTGCTTTGACTTAGATGTTGAGTAAGGGGGGATTGTGGTGAAGACTGGGTGAAACT 152 82025 AATTCTACTATAATTGTGCCCAGTCTGGCAACTGAGAGAAAAAAGGCACGGAAACTAGGCAGTAACTTGCGCAGACC 70C>T TCAACATAGTTGGACAGGTAGACAAACGTACTATCTCTGGTTTTGGTGCCTCCTTCTGGGAAT 153 83189 AATTCTCTCTCGATAATCCAAACGATTCCCTCAGAAACTTTTCAAGAGACGCTCAACACTGGGGAGGTTGTGCTCTTC 89G>A AAACAAAACTGAGTCCAGTAGTCACAGATGAATCGGCCAAATGACAGAACACCGGTCACTAT 154 83371 AATTCCAGAACACTTAATTTTGCTCATGCAGAAGCTGTACACAGCCCAAGAAGCAGTAGTCTGAACAGAAGAAAGGG 58A>T ATAATACATGGCTTCGAATTAGGAAGGTGTGTGTCAAGGTTTTATGCTTCCACCATACCTATT 155 83484 AATTCCGTGCTCATGGGGCACGTGAGCAGTACGTGGGAGTGGGTAGCTAGGGACTGCGTGTTGTGGGTATCTCCTC 111T>C CGTGTCGTACTCCTGCCATTCTCCCATCGGACCTTGCTTACAAAACACAGTTTAAAGACAAAAC 156 83689 AATTCAGGGTGCCAGCTCCAGTAGAAGTCTTTCTCTCTCTGTTAGCTCTGGAGGAAGGTTCTTGTCATCAATCTTCCC 57G>A TTGGTAGAGGAACTTCTCTGTGCAGGAACCCTGGGTCCAAAGGACACGTTCTGCTCCTGGCA 157 83844 AATTCCAGTGTGGGCAGCAACTGGCTGTAAAGTTCAGGAGAGGCTCTGTTTGTGGAGATGTCCTCAGCTGACATAAA 98C>T TGTCAAGTGATGTATGGCTACTGACACTATACTCCATCGTTGCAGACACAAGATGGAGAGCTG

188 158 83996 AATTCAAGGTTCCAGGGCATCAGGAGACAATACTAAAAAAGCATGAGTGAATCCTGGCACAGTCTTCGGTTCCACTTT 33T>C ACACGCAAGCGGTGAACGAGGGACAATGTGTGCGACACTTACCCGTCTGAAGCTGTAGCTCC 159 84051 AATTCATGTCTTGGGCAGTGGGAAACTATTTGTAATTAGCCCCACCAAAACATGTTTTTAATTAGTTCATTTTGGCATA 95C>T AAGTAGGTAAGGATCCCCCATGTGCATTTTTAAAATCAAATATAAAATGGATAAAAATACC 160 84225 AATTCCTATTATGTGAAAAATAGCTACTTCATACAAAAATGAAAACCAAACCCACTGCTGTGGAGTCAATTCCAACTCA 65A>G CAGCAACCCTATAGGACAGAGTAGAACTGCCCCATAGAGCTTTCAAGGAGCACCCAGTGGA 161 84667 AATTCAAACTCCTGACTTTTTGGTTGGCAGCCAAGCTCTTAACCACTACCCCATCAAGGTTCCTAAGTGAGATAAATT 105G>A GACTAACAAACCATAAAAGTGAAAAGGAAATATGTATTGTCATTGGAACAAGTTAGTTAAAT 162 84760 AATTCCCAGGACCCCTAAAAAGAATCTTGCATAAATCAGACCTGTGTAGACTTTCTTTGGACTTCATCTCACTGGAAC 105A>G ACTAATGAAGACAGAGGAAAGAATCTAGCCCATGTGGGATGGAAAGGCCACCTCACTTGGGC 163 85020 AATTCAAATATCCAACCTTTCTATCAGTAGTCCAGTATGCTAACTGTTTGCATCACTCAGGGACTCCCAATCTCAGGAA 64C>T ACTTTTGGGGTTTTGAGGTGAAGAAGAAGACTGAAGCCAGTTTTGCATTGCCCATCCCCAT 164 85198 AATTCTATGCATGGGGTAAACATGGAGTTGTTAGGTGCCATCAAGTCAGTTCCAACACATACCGATGCTGTTTACAGC 99T>C AGAACAAAACACTGCCCGGTTCTGCACCATCCTCACAATCTTTGCTATGTCTGAGTGCATTG 165 85207 AATTCTCACTTTCCTCATAGCAGGCCCAGGTTCAATTCCTGGCCAATGTACCTCATGTGCTGCCAGTACCTATCTGTC 107T>C ACTGGGAGCTTGTGTGTTGCTATGATGCTGAACAGGTTTCAGTGGAGCTTCCAGATAAGATA 166 85554 AATTCAGGAAAATAATATGGGAACAAAAAGCCAAAATAAATTCACAACTAGAAATCATACAAACACAACAATTAGAAAT 58C>T CCAAAAGATAAACAACAAGATAGCAGAAATGGATAGTGTCATAGAAGGGCTGAAGAACAGG 167 86488 AATTCTGGGGATGTTGAGACTTTAGGACCAGCTGTAGCATAAGATAAATGAGGAAAAAGAATAGAAAATATTTTTTAAA 76T>A GTAGATTGGCATAAGATCTTAGACTTGTGTGCTATGCTAAAAAACTTGAATTGTGTTATAT 168 86503 AATTCATCAGCCGCTCCTTAGAAGCAGTTCTACTCTGTCCTATAGGGTCACTATGAGTCGAAATCAACTCGACGGCAA 79C>G CGGGTTTTTATTTTATTCTTAACCTATGGGCCATACAAAAACAGGCCGTGGGTGGAATTGGC 169 87163 AATTCCTTTTGTTTGTGGATTTTTTAAATTTCTTTTGTTTTTGTAGATTTTGTTGTTACTGAGACTTTATGTTTTTCTTCCT 33T>C TATTTTGATGAGTAGGTTTGTTAACTTTCTTTGTGGTTAGCTTGAAATTTACCCTTAT 170 87223 AATTCAATACATAGGTGATCAGTTTTGAATTTCCTGCAAGGTGGTTTAGAAAGTGAAGTCCCAAAGGAAGTCTTTTAG 55G>T 
GTTCAGAGGGAGGCGGGCAGGGATGAGTAAGCGGAGCACAGGGCATTTTTAGGGTGGCAAAA 171 87505 AATTCACACACATTTTTGGTCTCCATTCAAGATACTTGAGAGCAGCATGAAATTAAATTCTTCGTAATAGAAGACTACT 72A>G TCAAAATGGTAGGGAATCCAGAAAAAGAAGATAGGCTTGAACAAATTGCATGAAATAACAA 172 88224 AATTCCACTTGTAGGGTTCCATTGGGGGAAAATGACAAGAAAGGTAACCAAGGGATGATTATCATGATCCAATTTGTG 37A>G ACAGTGAAAATGGAAAATAGAAATATGGAAAGAGACGCTGGTCATGCACTGTGGTACACCCA 173 88663 AATTCAAGATTGACAGGTAAGAAGGCAGGCTTCTGGCTCAAGTCCTAAGAACTGGAGATCAGATGATGAGGCGCCAA 73A>G ATGCAGGATCCAGAGCAAAGCAATAGCCAGTAAGCTTTGCCAGAAAGTCCATACATATTGTAT 174 89927 AATTCTGAAGACAGGTCTGTCCTCATCAGCATGCTGCTCAAGACCTGGGAGAGCGCAGTCCCTCTTGGTCACACCAG 59G>T GATGGATTGTAGTGTTGGCCTGAGGACACCAACCAGCAGGGGAAGCTTAGTCTCAGACAGGTC 175 90182 AATTCATTTTATGAAGCCAGCATAACACTGATACCAAAACCAGGCAAAGATGCCACAAAAAAAGAAAATTACAGACCA 97G>C ATATCTCTCATGAATATAGATGCAAAATTTCTCAACAAAACTCTAGCCAATAGAATCCAACA 176 90331 AATTCTACGATAATCCTAGAAATTCCTGTCCCCAGGTGTACATGGTTTGTAAAATCCCCTCTCCTTGATTGTGGGCCT 104T>C GTGAATATGATGGGGTAGTCACTCCTTTGATTAGGTTAACTTATATAACAACAGTTATAGGA 177 90495 AATTCCATTGCCGCTTACTTTATGGAGTAGATGTGTTCCTGAAAAGGTGTGGCTTGACTTAATTTCTGTAAAAGTATTA 73A>T GAAGGCCCATTAATTTGTAACGAGAGTGTATTGCACCACTCTCTAACCTCACTTACAGAGG

189 178 90498 AATTCCTTCTATAGCGTCTCTGACCCTGCCCGGTTTGCCTCAAGGTCTTCTTCTCTTTCTTGCTCTTACTGGTTCAATT 44G>A CAGTAAATTTATACTTCTCACATATTATGTGTAAAGCGCGTAAGATACAGTTCTCAAATGA 179 91958 AATTCTGTTTGTACTTCTTCCATGACACGTTTGTTCATTCTTTTGGCAATCCATGGTATATTCCATATTCTTTGCCAACA 83A>G CCACAATTCAAAGGCATCAATTCTTCTTCAGTCTTCCTTATTCATCGTCCAGCTTTCCCA 180 92293 AATTCATGCATTTGGAGGGGCAGGCATGTGTGTTTGCAAGGATCTATGCATGCCAGACAGAGTCTGTGTACATATGT 39A>T:41G>C GCACATGCATACTGATCTCTACTTGTCTGCTCTGTGCTGAGCACAGGCAGGTGCTTTAATAAC 181 92346 AATTCTCAAATGGAGTTTGGACAGGAGAGACAGAGGGCAGGAATCCCGCCCTTCAGCTCTGAGCAGCCTGAGCATG 83G>A GGGTACGGGAAGGGAAGTGTGGGAAACAGCAGATGAGGCATGACTTGAGAGGTACGTGTTCCCT 182 92568 AATTCCTTCTTCCCAAGAGTTTAGAAACAGTTCTGAGAAGCTTTAAGCCTCTGCAAATATTTATTGAGTCTCCCTTTCC 36A>G TGAGCAGGATTCGGCACACTCTTTCTTTAAAGGGGCAGATAATAAATATTTCAGGATTTTC 183 92702 AATTCTACCACTGAACCACTGTCCCCTGAATAATGCCTACCTAACAGAAAAAAAAACAACTGAAAACAGCTCTAGATA 57C>A CTATGAATAGGGGAGCCATTAAATAAATTACGGTACATCTATAGGATGATTTATTATACAAA 184 93302 AATTCTGGGCACAAGCTCTAAGGGAAGGCTCTTTCTCTGTTGGCCTGGGGGAAGATCCTTGTCTCTTTCAACTTCTGT 39G>C AGCCCCTCTGTTCCTTGGTTTCTTGGTGACATCCACGTGGCGTGTATCTTTCTCTGTTTGTG 185 93656 AATTCATGATTAGCAAGATAATACAGGATGGATCTCTCCCATTGGAGGTCTTCAGTTAATCTCTTTAAAACCCTAAGTG 72C>A TTCTTCCCAAGGGCAAGCTGCCCCTCCTTGGTTTACCTATTAACAGTAATGGGTCACTTTC 186 93856 AATTCAAATCTCTAATATTCAGAGCTACCCAAGAGCTGAAGCAAAAAGCAGACAAAAATGAGGAAAAAACAGACAAAT 93A>G TCACGGAAAATACAAATAAAAAAATGGAAAAATTCAGGAAAATAATATAGGAACAAAATGCC 187 94241 AATTCTGTGCCCCTACTTCTTTGATTTACTATTTTTTATGGTAAGGAAGTAGGATGACAGAGCCTGTTGGGTATGTGA 44G>A GGGAAGTCTGTGGAACTCTTGTTCACACCTGCTGAGGTTCTGGCAGTGGTATTTGCTAAGTG 188 94390 AATTCAATTTATTGCAATTGTAGAACAAAGATCTTCAATTAATTTCTGGCTGTTGAAATAGCTAAAGGGAGTTAAATATG 71G>A TCCACCCTCATATCCTAGAAGCCACCTGTAATTCTTTGCCACATTGGGGTTCTCCAGCAT 189 94541 AATTCATGAAGGTTCTAGGTGTTAATGTCCATTTTTGTAACTACTGAGGGACACAAAAAATAATGAACAAATTGAAAGT 94C>G ATTACCATGTTTTTCGATAGGAAGGTTCAAATTCATAAAATTGTTATATTACCATAATTTA 190 95005 AATTCTCAAATTTAGCAATCACAGTGTCTTGCAAGCAGGATAAATAAAAAGAAATCCATGGCTATGTACAAGATAAAAC 
65T>C TGTAGATTATTAAAGACAAAAATAAAAATTCTAAAGCCAGTGAGAAAGAAAACACAGATTA 191 95048 AATTCCAAAAGCATCTCTGACACAAAGTCCTCTACCTCCACCATATTCCAAGGAGGGAAGAAGTAAACATCATTGCAC 55G>A AGGTTTTCTGTTGCATGAAACAGCAGCCCAGAGAGTGGAAGTGAATTGTGTAAGGTGTATAG 192 95216 AATTCACTTTCCTTTATCCATCAATATTCCACAATTGATCCCAAGTTTATCCCTTCCTTGGACTTCTTCAGGCACTCAG 95G>T AGTAACTGAGATGTCGTTACTGTGGGTCTTAGCATTTTTTCAAATCTCAGCTTATGTTGGC 193 95289 AATTCTTCAGAAACTGAACACCAAAATGAAGAGAAAGTGTTATTATTCAGGTAATGCAGGCTTGCTTACCACTCAACAT 85G>A GTTCAGTTAGATATTTTCTGACCCTGAGCTCCATGATGGCTGGAGCTGATTTTTCCTCCTT 194 95348 AATTCAGTTGAACCAATTCACTTAGCAAGGGAAAGGTTTGAATTTGGCCAGTAGTGATTATGTATGTTTTCCTTCGGTT 76G>A CCTCACCAGTTATTAATTATATCTTTCAGAGTCCGTAGGTGGTTTAAACAGTTAACACGCT 195 95435 AATTCATAATTTATCAAAACCATATAAAGTTTATCCCAGGAATGTAAGGTTAGATCAATATAAGGAAATTTTTAATATGA 62A>C GGTTAAGTTATATAATTACATCAATTAATTCTATGAAAGTACTTAATAATATTTAATACC 196 95580 AATTCCAAACACTTAACTGTGCTCATGAGGAACCTGTACATAAATCAAGAGGAAGTCGTTTGGACAGAACAAGGGGA 58G>A TACCATGTGGTTTAAAGTCAGAAAAGGTGTGCATCAGGGTTGTATCTTTTCACCATATTTATT 197 96502 AATTCAGAACAGTGCTCCCAATATAGTAAACACTACACCAGTTAGCAGTTATTGTTCTTTTTAATTGCATGGGATTCCT 69G>A CATGGGGATGTACCATAGTTTATTTCTACATTACCTTCTGCGTTACTTTCTTAGGGCTGCA

198 96868 AATTCCCAGGACGAGGGTGGGGGTGGGAGGGTGAACAGCCCGCGCAGGTAAGTATGGGAGCCACATGGTTTCGCG 70T>C AATGGAAAAAAGGAAAATCGTCTTCAAATGAAAGCAGTCAGGTGGGGGCAGATCGAAGCTGAAAG
199 97132 AATTCTTCAGGGAAGGGTTAGAGAGATGGACATACCACCTTCAGAAGTGTATAGACCTAGATGGAGGATATGTCAAG 76A>G AAACAATAACTTCACATTTTGATATTTTTGTTTAGTAAAAGTTTGTGTGAGTTTGTAGCAGTA
200 97394 AATTCACTTAGGAACAATGAGGTGGAGCATTCTATTATTAGCTGCTGTCCCAAGGGCTTCTGAATGAGTTAAACTGTT 96T>C ACATGAATTAGTTTTACTTTGATGTCATAGTGTAATTAAAATGTCATTCGTGGGAGCTCCCA
201 97456 AATTCAGGAAAATAATGGAGGAACAAAATGCCAAAATAAATTCACAACTAGAAATTATACAAAGACAACAATTAGAAAT 56T>C CGAAAAAGTAAACAACAAGATTTCAGAAATGGACAGTGTCATAGAAGGGCTGAGAAGAAGA
202 97595 AATTCTACCCAAAGCAATTTACAAATACAATGCAATCCCAATCCAAATGTGAACAATATTCTCCAAGGAGACGGAAAAA 63C>T CTAATCATTCACTTTATATGGAAAGGGAAGAAGCCCCAAATAAGTAAAGCACTACTGAAGA
203 98782 AATTCCAGAACTCTTAATTGTGCTTATGCGCAACCTGTACTTAGACAAAGGAAGTTGTCTGAACAGAACAAGGAGTTA 59C>T CTGCATGGTTTAAAATCAGGAAAGGTGTACGTTAGTGTTGTATCATTTCACCATACTTATTC
204 99006 AATTCAGGAAAATAATGCAGGAACAAAATGCCAAAATAAATTCACAACTAGAAATTATACAAAGACAACAATTAGAAAG 45G>C CCAAAAGATAAACAACAAGATTTCAGAAATGGACGGTGTAATAGACGGGGTGAGAAGAAGA
205 99463 AATTCACATTTCTTCTCTCAGTGTTGTGTTTCCGGTTTCGCTGTCAATGTTTATCCCGAGGGGCGACAGTCACAATCA 33C>T AGAACAGCTAACAGCTTGCTGTTTTGAAATATTTATAACAATATCTGTGTTAAATGCTGTCT
206 99870 AATTCCAGTAAGGCATTCCCCAGCCCACATCCTGATCCCACACAAGCACACAAAAGCATGCACAGTTTTTTTAATGGT 87T>C ACTCAGATTCTAAAGAGTTTATTCATTTTAGTAAAGTTCATCATTTCTTGCTCCTTCTGAAG
207 100108 AATTCTTTTCAAGGCTTTTATTTATAAGTGCGTGTTCCAAAAACGCAATTGTGTCTGATGTTAATAACAGGTTGGGGGG 45G>A TAAAAAATACAAACTACCAACAACCATGCAACTTTATATGTAGATAAGTACAAGGCCATTT
208 100393 AATTCAAGGTTTAGAAAGTGGACATTAGAGCAAATTCAGGTGACTGAGCACTTCAAACCATCCCACCACACTGAACCT 91A>C:100C>T TCTTCGAATCAGAATTAGCCACCTCTGGGCAGAGAACAAGCACAAAATAGAAGGTGTGCAGA
209 100444 AATTCAGTTCTGCCTTTTTAGAATTTTTATTGTGGTGAAATATAGGTAACAAAACACTTGCCATTTCAACCATTTTTACT 85C>T TGTACGATTCGGTGACATTAATTCTGTTCTAAATGAAGAGAAACCATCACCACTCTCAGA
210 101121 AATTCACCAGTGAAGCCATTTGGGTCTGGTGGAATTTTTGGTGGAAAGTTTTTAATTACAGATTTGAGTTTCTTAAAAG 55A>T GTATAAGACAGTTCAGGCTTTATATTTTTTATCTATCAGGCTTGAGATACTGTGTTTTAAA
211 102506 AATTCCAACTGCCAACCTTTTGGTTAGCAGCCAAGCTCTTAACCACTGCACCACCAGGGCTGCTGATAAAGTCATTAG 85T>C GCTGTATGAAATATTTTCCTCTCTCTACATGTTCAATGGTAGCTGATAGTGCATTTCTTCCC
212 104095 AATTCTTGATCAAAATGTTCAGTAAATGTTAACTGGGCACCATCCAATTCTTCTTATCTCAGGGCAAAGGAGGCAGTT 62G>T GTTCATGGAGTCAATTAGCCACACATTCCATATCCTCCTTCTATTCCTGACTCTTATTTCTC
213 104159 AATTCTCCTGAGCCAGAATGTTCTGAACCTGATGAGGTTGCAGTAATTGTTTTACATGGAAGAGTTGTTGCATACACC 32T>A ATAAACTTTCCTTACTACATAACAGCATTTGCTCTAAGTAAATTTGATTATCTTCAGTTCAA
214 104587 AATTCCAAATCTACTTATTTTCAGTCTTGTTTTTAGTGTCTTCTTTGCCAGTTGGAAACCCTGGTGGCATAGTGGTGAA 51G>A GAGCTACAGCTGCTAACCAAAAGGTCAGCAGTTCGAATCCACAAGGCACTCCCTGGAAACC
215 104737 AATTCCCTATCAGGTAGTTCTAGTGCATTTTCTTCCTCCTGAAAGTCATCTGGTGTTTTATTTTTGGATGCCTACTGGA 83A>T ACCATCCTGTCCTGTTTTTGTAAATGTTTTGACATATTCTGCTGTCTTCAGGGTATTCAGT
216 104799 AATTCCAGGATACTGCCTGACTTGGGCTTTTCTGAGAGCAAAGGTTACCTTTCTTTCTATATCAGCAAACTTTTTAAAA 57C>A GCTTGCCTAGATATTCATATTAATTACATGGGGTCAGGGTGGCAAATAATATTGGAACACC
217 104841 AATTCTGCCACTGACCCACCACTGAGCCTGGTTTTTTCTGTTTGTTTGTTTGTTTGTTTTTGGCACTCCCTTCAATTTT 56G>T GTGCCTGAGGCAAGTGCCTCACTCACCTCACCCTAATCCTGACCCTGCTAATAATTAACAT
218 105099 AATTCCAATTGAGATGTTTCGACAAACGGATGCAATCCTGGAAGTGCTCACTCGTCTATGCCAAGAAATTTGGAAGAC 47C>G AGCTACCTGGCCAACCGACTGGAAGAGATCCATATTTGTACCCATTCCAAAGAAAGGTAATC
219 105590 AATTCTAATAGCATAGATTCATTTTGCCCATTTTACTATGTGACATTATCTTTTTTTATGATTAAATTCACATGACATGTA 78T>G AATGAATAATTTCTTTTATAAGGATCAGCACAAAGCTGGTTCCTATGAGGTGGAGGTTA
220 106055 AATTCACCAGCCGCTCCTCGGAAACCTTGTGGGGCAGTTCTACTCCGTCCTATAGGGTTGCTATGAGTCTCAATCAA 46C>T CTCGAAGGCAATGGGTTTGGTTTGCTTTGAAAGACAAATTCATTTTGCATATTAAATGAATGA
221 106463 AATTCCTGCTTGTTACACAGCTTATCTCCTCTGAAACAAAGGTTTTGATCGAATTAACAAGTAGGGAATTTTCTGTGAC 55T>A TTAGGTGCAGTAAAAAGTGTTTCCTATTTTCCTCCCTCACCCCACTATGAGTACCTGGACT
222 106522 AATTCAGTTACTACTCAAAACAATTCTTTACAAACCAGAAAAGAGACTGGGAGACAATATTCCCAGCTTTGCACGAGG 95C>A TCATAATCTGATCTTGCACCTTGAGTAAATACATTTCACTTTCATGATCAAAGGAACAAAAC
223 106523 AATTCATAAAAACCAAACCAAACTTACAAGAAATGTTAAAGGGAGTTCTTTGGTTAGAGAATCGAGCAACAACTGGAG 47T>C TCTAGAACACAGGATAACGTCAGCCAAATACCAACCTAAGTAAAGAACTCTCAATAATAAAA
224 106638 AATTCAAAAATTATCAATGATAATTTGGATTTTTTTAATCGGTTTTTAATTTTCATCTAATAACTACTTTTTGATAAATTTA 95C>A TGCAATCATGATCACTTTGATTTTAAGCTCTTTTCAGCTATACTCCTGTTAGTCTGTC
225 106658 AATTCAAAACTAATACAAACTTAGGATTAATGATAATAAAAAAAATAATTATTCACTTAGAAAATGTCCAAAGGATATGT 50A>T ACAAACACTTTCACTAAAGAAGACAGGTAAATGCAAATTAAAATCACAATGAGATACCAC
226 106846 AATTCTCTCACATTTTTAGAGACCAGGGTTCTGAGATCAGTATCACTGGACTGAAATCAAAGTTTCTGCAAGCTATGAT 86A>T TCCTCCAAAGGCTTTGGCAGAGAATCTGTTTCTTGCTTCATCCAGCTTCTGGTGGCTGCCA
227 107237 AATTCCCATTCTTCACAATGTTATCCACAACTTGTTATGATCCACACAGTTGAATGCCTCTGCATAGTCAATAAAACAC 31C>T AGGTAAACATCTTTCTGGTATGCTCTGCTTTCAGCCAAGATATATCTGACACCAGCAATGA
228 107242 AATTCTAGAAGTCTTTCCCCCCTTTTCCAGCATTATACATTTGAATGCTAGTATAACTGTAGAGACAGGAGAAGTGTTC 50A>C:83G>A CCAGTAATGATTAAGATTGATTTTTGCCACCATCTGATGGTATCTTAGATGGCTCATATAT
229 107700 AATTCATCAAAATGCTGATCAGTATAATGGATGGATGCCCTATCCAGAAAGCTCCAGTATCATCTCCCTGAAGCTTGC 88T>C CAGAAATCTCGGCATGGGGTTCTACTGGCAGCAACATCCCCCCTAAACTGAGGCTGGGAGAT
230 107761 AATTCTGGCGAGGATGTGGAACCAGGGATATCATTGCTGTTGCCAGATGGATCCTAGCTGAAAGCAAATAACACCAG 42G>T AAAGATGTTAACCTTTGTTTTATTGACTATCCAAAGGCATTTGACTGTGTGGATCATAATAAA
231 107780 AATTCTGCCTATTTTCTTAAAAATTGTTCATAAACTCTTCCCCCATTCTCTCTCCTCCTGACACTGCCTTTTCTTTCCCT 62C>T CTCCACTTAAATTCTTTCTTCTACTCTTTCTTCCTCTCCTTGCTCTGAATGCATTTTATT
232 108034 AATTCACCTTCACTTTTTCCACTATTAAGGACACTTGGCTGCTTATTTTTTGAAACAACTCACTATTTTCCTGTTTCCAT 31A>G AGCAATAAGGCCACATTAAACATTACAGTCAATACTGAGGCACAATTATTATCCATTTGC
233 108248 AATTCTGGCCTCTAGCCTCCTAAACTGTGAGAAAATAAATTTGTTTGTTAAAGCCACCCATTTGTGGTATTTCTGTTAT 82C>G AGCAGCAGTAGATAACTGACACTAACATAAAAATTGAATCACCTAATTTAAAAGAAATAGG
234 108331 AATTCTAAATGTACCTTAACCACTTCCAGTGAGTGTTCATTTTGAATTATTTCTGGTTTGTCATGGAACCAGAGTCAGA 83C>T ATACTGAAAGTAGGCCTATAATATTTTACTTTGAAGAATAATAACTTTTTGTATTATGCTC
235 108863 AATTCTTATTAAATCTGTATTGGACAGTCTTAATTGTTGTCTTGAAAACATTAAAAATTTTCCCAGACTTTCCTATTACC 39G>C CTGTGTTCCTGGGGATGTTAATTCTCTAACTTCTCTTGTCTACCCACACTCCCTTACAGT
236 109096 AATTCAAAACCAGGCCTGTCTGACTCCAGAGACTAATTTTTTTTTTCTTAGTTTTTAAGTGATATTCCTTTTTATGGAGC 62A>G CCTGGTGGTGCAGTGGTTAAGAGCTTGGCTGCCAACCAAAAGTTCGGTAGTTCGAATCCA
237 109751 AATTCTACCAAATATTCAGAAAAGAGTTGACACCAATCCTACTCAATCTCTTCAAAAACACAGTAGAGTAAGGAACACT 103G>A ACCAAACTCTTTCTATGAAGCCAGCATAACTCTGATACCAAAACCAGAAAAAGATAAAACA
238 109801 AATTCTGCCACCTATGAATGTAGGTGCCCAATAGTAATGTGGCAATAAGTTAAGTGGCACTGAGTGAGTGTGGAACA 60C>T TCTCAGTGGTGCGCCTGTGAATGAGCGAGCTAACACCTGAGTATAACATTTGGCTACATAAAC
239 109962 AATTCTCACTCACAGAAGCCTCTATCTTGGTGATAATGCACCTTCTCCCCACACATGCCCGAAGTGTAAAAACACAGG 49T>C CTCAAGGCATCACCGAGTTTTAAAGATGGAAAATAAACCTAGGTGAAAAAAGCAGTTTGACT
240 110271 AATTCCAATCTCTTGAAAAGTACCACTGCAATGCATTACTCCTTATTGCAAAGGCATTATTTACTGTTCAGAAGCTAGA 33G>A AAAGCGGGGAGCCAGTAACTCCTGGAAACATGGATGGCATCAAGGAGAAAATGGCCACTGT
241 110293 AATTCTCAGGACCCTTTTTTATACAGTAGACGCAAATGAAAACACAAAGTCCAAACGGTTTGGTTCAATTACTTATGGG 32G>A AACAAAATAACTTGAAAGTCATGCTTTACTGGGGAAAATGCTTAACTTTCTTATCCTTTTA
242 111277 AATTCAGTTCAAAACACACATCAAAACATTCACCAATACAACAACTTCCACATATACATCTCAATGACATGGATTACGT 77C>A TTTTCATGTTGTGCCACCATTATCGCTATCCTTTTCCACAATATTCCACCACCATTAACAT
243 111807 AATTCTCTGATGCGCCAAGGCCCCTGATAATGAACACTGCCCCTGAGCTGAATGGTCAGCTCACCCTCACCCAGCC 96G>A GCCTTTCTCTGGGGTTGCTGTGGCACCAGCAGTAAAGGCTCACTGGTGGGGGAATGCAGCCACA
244 111808 AATTCTATGCCAATATTTGGGTTTATATTATTATTTATACTTCATGGATTTTAACACCTATAATGCCCATCACTGCTCCA 43C>T CACTTAGAATTGCAATTAGGGTAATACCCTTGGTTTGCATTTTCTTTTACTGACACTAAA
245 111814 AATTCACCTTCGATATTGATTATATGCTCCTAAATTCTTGCCTGTTGGCCTGTCCCAAAGACTGAAATTATTTCCCCCA 78C>T:79A>G AAAAACTCGTTCCTCTGGGTTCAGACTGTGCACATCCGAGGGTCTCCATTTGAGTGCAGAT
246 112070 AATTCAGCTAAATTGGTTTATTTGTTGTCATAGGAGAAAAGCTTAGATGCTTGTATCACTTTTCTGTAGTCACTAAAAC 47A>T AAATTACTACAAACTTCGTGGCTTAAAACAACAGAAATCTGTTTTCTTACATTTCTGGAGG
247 112111 AATTCACTCTGAGTGACTTAGTAGAAGGTCCTGGCAATTTACTTGTGAAAGGTCACAGCCATTAAAAACCATATGGAG 42C>T TACAGTTCTTCTCTGAAACACATGGTGTAGCCATGAGTCAACATGAGTCAAATTGACTAGAT
248 112692 AATTCCAAGAAACACCCAGGGCTACCAGATCTGAAACAGACAAGGAAGGCTCTTTCCCTAGAGGAGACAGGGAAAA 51C>T GCATAGGTCTGCCAAGGCTCTGAATTTGAACTTCCAGCCTCCAGAACCAGGAGACAATAAATGT
249 113097 AATTCACCTTCCCATCATCTGAAATTTAGCTCGTTAAAGACCTCCATGACGATGCCAGACATTAGTTACTTCCTCCTCT 52A>G ATAAATCCAAAATATGTTGTCTGGACACACTGTATTTTAGCATCTATCACAATTAGTTACT
250 113115 AATTCCATGAGCATGGGCCCACTATCACACTTCATTTGCTGTGAAGTGATTCTCTTGATCTGAGGCAATGCTGTGGG 90A>C GGATACCATGACCATGGATAAGGCATTTTGTAAGTCCACGGATAGTAGTTTTAGCAGAAGCAT
251 114449 AATTCTACCTCCGGTAATTCTAAGACCTTCTCTTTCTCCAGGAGGTTTCCTGGTTCTTTGTTCTGTTCGCTTGTTGGAG 69G>A CCATCTTGATCTGCCTCTTTATGTGATTAGATATTGACTGTTGTCTCCAAGTCATCAGTAA
252 114586 AATTCTAGAGAGGTGGGGTTTTCCCTTCTATCGCTGTCTTGAAACATAAACTTTATCATTGTTCGGGTGGCAGGTTCC 44A>C AGAGCAAAAGGAGCAATCCCGTCAGCCCAGAGTCCTCCCCACACGCCCTTTTAAGGCTTTCC
253 115505 AATTCTTAGAAAATGAAAATTACCAAAATGCAGTCCAACAAATAGAAAACTGAATAATCCCAGACAGAATAAAAGCTGT 69A>G TCTTGGGTGGTGAAAAGGTTAAGCACTGGACCACTAGCCAAAAGGTTGGAGGTTCAGACTC
254 115891 AATTCTACCACTGAACCATCGCTGCCCTGGTCCATATTTTAATATCTCTGAAATTGGGGTATTTCTTACAACCCATGGC 76T>C ATCTTACCATTGCTGGTGGCCAGGCAATGCTGCTGACATTTGTCCTTGCCTGTCCATGGGT
255 115964 AATTCAGTGAAAATATTGACCAAATTCTTTACTCAAGATTGCTTAAACATGATATTTGGTAAATTCACAAAGTTCTTCAT 44T>C TTTCCACCCACTTGTTACAGAAAGTAAGACATAAACATTAATGATATATAGGTAGGACAA
256 116102 AATTCAGTCTCTTTCTAAGCCCTTGGGTACTACTGGCATAAAGAACCCAGCCTTATCTGTATCTGTTTTTATTATTGTC 63C>T TTTTCTAAAGACTGGGGGATAAATAATTTATAAACAGGAAAAAAATATCAAGATGAAGTCT
257 116174 AATTCAGTGCAAGAAATAAATGCACTAATAGCAAAATAATCAAATAAATCTTAACCACCCCAGAATGACCATTAAAAGC 39T>A CATATAGGTTATACACTGAACATGTCAAGGGAGTCCCATAATCTGCAAATCATATGTGATT
258 116372 AATTCTGTCTTAGTTATCTAGTGCTGCCATAATAGAAATACCATGAGTGGATGGTTTTAACAAAGAGAAGTTTATTTCC 42C>T TCACAGTCAAAGTAGGCTGAAAGTCCAAATTCAGGGTGTCAGCTCCAGGGGAAGGCTTTTT
259 116968 AATTCCATTTCTCACACTGGAGTGAAAACTGAGAGGCAGAAACACAGCTGTTGAAGAGTAGTTGCTGTCCTGGTCAA 93C>T CCTGTCTTCACAAACCTGCTCAGTTAAACCCCGGTTAACACCAGTAGTGTGCGTCCCTGTTGG
260 117489 AATTCAGATTTCTAGTGTCTTAAACTGTGAGAAAACAAATTTCTGTTTGTTAGAGCCATCCTCCTGTGGTATTTCTGTT 88C>A ATAGCAGCCCTAGGTAACTAAGACACATTCTAATTTCATTTATAACATTTTTATTTTTAGA
261 117491 AATTCTTCCATTTTTGTCTGTATTTTCCATGAATTTGTCTGCCTTTTCCATGAATTTGTTTATTCTTTCCTCATTTTTCTC 50A>T TGTTTTTTGCCTCAACTCTTGGATAATTCTGAGTATTAGAGATTTAAATTCCCTATCAG
262 117675 AATTCTGATCTCTTCTCTGGCATGTGGCCTGCCTCCATGGAGAAATCTTTCCATTGTCCAGATGTGAGAGTTACTAAG 96A>G:100A>G AGATTGTCTAATTCCCCGGCAGAGTATTATGAAGCTCATTGTGGCCATATATCCACTGATCC
263 117797 AATTCTACAAAACATTTACAGAAGAGCTTGTACCAGTTCTACTCAAGCTATTTCAGAGCATAGAAGAGGAAGCAAAAC 92C>A TTCCAAATTCATTCTATGAAACCAGCATAATGCTGATATCAAAACCAGGAAAAGATGTCATA
264 118207 AATTCTGTCTCCCAAAAATATGTTTCAACTTGGTTAGGCCATGATTCCCAGTATTGTATGGCTGTCCTCCATTTTGTGA 107A>G TTGTAATTTTATGTTAAAGAGAATTAGGGTGGGATTATAACACCCTTACTAAGGTCGCATC
265 118466 AATTCATTAGTTGTTCAATCAGAAAAAAACTACCAATGTATAAAACAATCTCTGGAGAGGATAATTGAAAAAAAGGAAA 106A>T GAAATATACTCTTTCTAAGCCATCTTAAAAGCAAAACATGATTTAGAAACTGGATGTAGAT
266 119149 AATTCATCTGTCTTGGATAACAGTTTCTGTGTTAGCCAAGGTTTTCCAGAGAAAGATAACCAAAGGGATATGTTTAGA 65A>G GAGACAGAGAGAGAGTGAGAGAGGTTTATCTTAAGAAATTGGCTCACGCATCGTGGGGACTG
267 119468 AATTCTAGGACTCCAGATCCTAAAAAAAGCTCTGGTGTATCAAACCTAGTTTAACCTATATTCCAAAGCAGAACACTAT 98A>G TTGGATTTTTTGAGACACATAACAGCTGCCAGCTGTTATTTACGCTTTTTCCTATTTACAC
268 119674 AATTCAGTAATTAACATTATTGTTAAGTAGGATTAGAACAGGAAAATCAAGCCGTTTACCTCACAGATTACTGTTTCAT 82T>C CTTCTTGTTGTTGTTGAAAAATAGACATAGCAGAACATACACCAAGTTTCTACATGTACAA
269 119767 AATTCAGTCTGGGACAAGTTAAGTTCCCCCAGGCATCTACTACTTTGTCGTACTGTGGGGGCTTGTGTGTTGCTGTG 49T>C ATGCTGGAAGATTTGCCACCAATATTCAAATACCAGCAGGATCACCCATGGTGGACAAGTTTC
270 120598 AATTCTATCACTGAACCACCAAGGCAGAATGAATAAAAACTGAGAGACTGTAAGTTCTATTTAGCTGGGTTTCTGCAA 50G>A:66T>C TCGTTTCTCTTCGCCTTTGTGTGATTATGTTGCCCTTCCTCATTAGGGCAACGTGGGCTAAC
271 120924 AATTCCATAAATGCTTGTTGATGAATGAATCGATAGATGGATACACATGTATTCAAACATAAATGCATACAAACTTACA 78C>T GCCAGTACACCTTCTACTCATTTCCTTATCTACAGAGTCCATCATACTGGAATTGTCACTT
272 121350 AATTCCAGAACACTTAATTGTGCTCATGAGGAAACTGTACATAGATCAAGAGGCAGTCATTCGAACAGAACAAGGGG 77G>A ATACTGCATGGTTTAAAGTCAGGAAAGGTGTGCGTCAGGGTTGCATCCTTTCACCATGTGTTC
273 121807 AATTCTTATAATAGTGGTCTTGTACAATATTTGTCCTTCTGTGACTGACTAAATTCACTCAGAATAATGCCATCCAGGT 35C>T TCCTCCATGTTATGTTTCACGGCTTCATTGTTGTTCTTTATCATTGTGTAGTATTCCATTG
274 122437 AATTCTAGTCTACATAGAATGCAATGCCCCTATAAGAAATTTAAATCCATTTTACCTATAGGAACTGTATTGCTTAAAGA 34A>T TGCATATTGGATCCCAGTATAAATTAATAATTTGTAGACTGTAAAAGATAATACATAAAG
275 122884 AATTCACATATCAAAACTGTTCTGAGTTGTGATGCTAATATGCTTTATCAAAGTCTAAATTAACATTTACTGCTGTTAAC 40G>A CAGAGGCAATGCCTCATAAAAAAAGAAAAATTTAGATGATTGTGCCAAAATTTCACATTT
276 123302 AATTCCAGAACACTTAAATTTAAGAGGACTAAGTTGTGCTTGATCGAAGCCATAGTATTTTCAAATGCCTCACATGAAT 96T>G GTGAAAGCTGGTCAATTAATAAGAAAGACCAAAGAAGAACTGATGCTTTTGAATTAGAGTG
277 123312 AATTCTGACAAACTGCGCTTACAACTGTCTGTTTTTATTTTTCTTTTTACACCTTATGGTTAGTAATGCATACTACATAT 70G>A CATGTTGCCGTGAGTAATTTGCTGATGTTATCATTCTCATGCCAACCCCTAAATACAAAG
278 123319 AATTCTCTTTGAAGTTTTTCTTCCCAGTTGCATGGGAGGTGGGTGCTGCTGTTGCCTCCTGGCTGTGTATGAGACGAT 92T>C CAGATGCCCCTTGTGTTCCTTGTTTTCCTTCTGTTAGCCCTGCTCAGAAATCTCTTTGAAAT
279 123596 AATTCTCTGGTTTTCAGAGATGCTTAAATGTTACTTCTTTAGGAAAACTTACCCTGACACCCAAATGTAGGTCAGCCCT 90A>G TCCTTCCCTCGTAGGACCCTCTATCTTTTTTATATATGTATATAGTACGCCTTAAAATTGT
280 123842 AATTCAATACACTGCTAATCTTGATGCTCATGGAGACTTCCTCTAAGAGGAAACAATCAAACATTTTTATTTCTCCGTC 76T>C GTAGTATAGCTTCTCAAATTTGGCACCACGGATAAAGGCTTTGTAAATTTAATAGTATGGA
281 123936 AATTCCCATTCTTTGCAACGTTATCCATGATTTGTTCAACGATCCATACAGTCGAATGCCTTTGCATAGTCAATAAAAC 40T>C ACAAGTAAACATTTTTCTGGTATTTTCTGCTTTCAGTCAAGACCCATCTGACATCAGCAAT
282 124406 AATTCAGCATCAGGATTTGAGGAAGATTCATTAACAACCTGCTATATGCAGATGACATAACCTTCCTTGCTGAAAGCG 81G>A AAGAGGACTTGAAGCACTTACTGATGCAGATCAAAGACTACAGCCTTCAGTATGGATTATAC
283 124638 AATTCTAAGGTAGGAATTACTACTACTACTACCCGCTGCCGTCTAGTTGATTCTGACTCATAGTGACCCTAGAGGACA 76A>G GAGGAGAACTGCCCCATAAGGTAGGAATTGGGGAATCCCAAAAGGAAGATATAAATCCCCAA
284 124865 AATTCTAGTTCTACAGATCTTCACTGGGCCTCAATTTCCTCTGCTCTAAAATGGGAATAATAATAATAACTGCTTCTAA 100C>T GTGCCGTGGTAATGAAATAACTAAACGGTAGGAGCTAGTGAGAAGAGCAGCCTTGCTTGCT
285 125026 AATTCAGAAAAAATGTTTGCTGTGTATATCACAGGCACTTGATACATAGTAGAGTAGAATAAATGTTTCTTGACTGAAT 38C>T GTAAAAATAATTAGTAAAATAAGAATATAAAATATATCTCTGATACATTATAAATTTTTGT
286 125571 AATTCAATCTAAGTTTCCACCTTAGAAAACTAAAGAGAGAAGAGCATTAAGCTTAAAGCATATGAGAAGGGGAATAAC 95A>G AAAAGTAAGAGCAGAAATCAGTGAAATCGAAAGTGAGTAGAGCAAAGTCTGCAGGATAAAGG
287 125636 AATTCAATGAGACAGAGAAACAGGTGATGGGCATGAAATAGGAGTTTGATAAATAGCCTCTTCACCTCCTTCCCTTGC 45T>C TCTTTCTCTTCTCCCTCCCTCCCTGCCTTCCTCCTTTTTCCTTTCTTTCCCCAAAAAGATAT
288 126402 AATTCTGTATTTTTGCAGTCATGAAGTACTTGAAGTTCCTGAAGTTTATTCTGAAGTCAATTAAAAATATATCATAAAAT 90A>G GTCATGTAAAAGCTATCATTTTTTAAGTCTTTAAAATCAATAAATTATATGAATATACAG
289 126429 AATTCCCACTCCTGCTCTTTTTGGTAGTTGTTAGCTTGATATATTTTTTTCCAACCTTTGGCTTTTAATAAATTCCCATC 61A>G TTTGTTTCTAAGGTGTGTCTCTTGTAGAGAGCATATTGATGGATCCTGTTTTTTTATCCA
290 127382 AATTCCAGAACATTTAATTGTGCTCATGAGGAACCTGTACACAGATAACGATGCAGTTGTGCAAACTGAACAAGGGGA 50A>G TACTGCAGGGCTTAAAGTCAGGAGATGTGTGCATCAGGGTTGTATCCTTTCACCATACTTAT
291 127533 AATTCTGGCTCTGTCACTTATTGGCTGTATAACCTTGGATGGTTATTTAACTTAACTATATAATGGGAATAATAATAATA 40C>T ATATCATCATAAAGTTGTTAGGAGGATAAAATCAGTGAATATCTGAATAGTTCTTGAAAC
292 127571 AATTCTGCCTTTCATTCAGGCCTTGAGGTTCCAGATGAAGCTAAGGAATTTTATGAGAGTTTGGGGATGTCTTTCTATT 33A>G GCCTTCCCTTAGGAAAGGCTGTGCGGCAGGGTTCCGCAATCTTGTGCATCACTGTTTTATA
293 128317 AATTCTAACTCCTAGACAAAGGGGAATGAAAGGGGACAGGACATCAAGTGAGGACCATTGGGAAGACTACAGAAGA 35G>A ACTCACATGGTACGAGAGATTCCTCTAGTGATTTGGCTGCTGTCACAAACCTCCAGATTCTCAT
294 128658 AATTCTTCAGAAGGGGGTTTTGGGTTTTTTTGTTTGTTTGTTTGTTTTCCTTTTCAAACTTCTATACTTCAGTGGCCTCA 86G>C TAAGTCAAGAATATGAAAAAACATTAATAAACAAATTTACAATAACCATTTCTGACTATA
295 129600 AATTCTACTTCCAGTAATTCCAAGAAGTTCTCTTCCTCTGGAAGGTTTCTTGATTCCTTGTTTTGGGTGCTTGCTGAAG 100A>C CCATCATGGTCTGCCTCTTTCTGTGATTTGATATTGACTGTTGTCTCTGAGCCATCAATAG
296 130231 AATTCCAGAGCACTTAATTGTGCTCATGATCAACCTGTACATAGACCAAGAGGCAGTCATTCAAATGGAACGAGGGG 92C>A ATATTGTGTGGTTTAGAATCAGGAAAGGTGTGCATCAGGGTTGTATCATTTCACCATACTTAT
297 130332 AATTCTAGGGAAGAAATTATTAGCATTTAGGTCAAGTACATATAGATGGATCAGGGGAAACACAGAGAGAGAAAGAG 100C>A AATTATTTCTTATTGCAGAGAGCAAAATGTCACAAAGCTCCTGGACAGAGTCTGACTTACAAA
298 130477 AATTCAGTAACAAAGGGTCTTGAATGCAGCTGGAGTTGTTTTAATTTAATGCTTTGAGAAGTTGTGAGTGATAAGATA 37T>C GGCACTCTGATATATACAGCTTAACATGTGCCTGCGTGTGATGCATTAGAATTGAAATAACG
299 130766 AATTCACTTCCTTTTATACTGGTTGCTTGGGCATCCTTCAGTCAAATGGTCTGCCTGGGTTGCTGATATGTCAGTGGA 41G>A ACTATGATCTGGTTGGCCTGGATCCATTCCAGCTGTCAGTGGCTCAGATAAGAAACCTCAGT
300 131729 AATTCTATTTTGCAATGTCCCCTACATTTCATGCATAAGATTTTAAAGATGTAATTCCCTTGTCTTCAAGTTCTCATTGT 65C>T TTCTATTCAGAAGATGATTTCCTATCCTACTTTTTATCTGAATTAATGAAAAATAAAATT
301 133258 AATTCCCATTCTCTGCAATGTTATCCATAATTTGTTATGATCCACACAGTCAAATGCCTTTGCACAGTCAATAAAACAC 89C>T AGGTAAACACCTTTCTGGCATTATCTGCTTTCAGCCAGGATCCATCTGACATCAGCAATGA
302 133274 AATTCTGTCTAGAAACATTGATATTTAGAAGCTACTCATATTTACTAGTTAGAGGCTAGTGCCTGGAAGGGAGTCTGA 98C>T TGCTAGAATTACAGTAATCCAAATAACAATCACAACAATATTAATGATTACTTATTGAACTT
303 133426 AATTCTAAACATGGTTCCCTAAGATTCTAGTTCCCTGGTTATTCAATCAAATATTCACCTAAGTATGCTGTGAAAGGAC 94T>C TCTGCAGATGGAATCAAGGTTACTAGTCAGCTGACATGAAGACAGAGTAATATACTGAATT
304 133574 AATTCAGGCTTACACTGTTCAGGGGCTAAGCACGGCTCTGCCAGTGACTGACCAGAGCAAGGTGAGACCTCGCCTA 58C>A GGCAGCTGTTCTCAGGTGCCGGTCTCTGGTGAAGGCGAGAAAACACGCACCAGTACTTCTGCTC
305 134296 AATTCTCTCAACTTTTGCATGTCTGAAAGAGTCTTTACTCCACCTTCATTTTTGAAAGACATTTTTGCTGGGTATAGAAT 89T>G TTTAGCTTTTCAGTTTTTTTTTCTTTCAGTACTTTAAAGATTTTGCTCTTTGGTCGTCTT
306 135686 AATTCAAGGGCACAGGTGTTGGTGTGGAGCTAGTGTTTGAGCCCTAGCACTGAGTATGTGCCCCTGGGCAGTTGTTA 61C>A TAGATCCACTGTGCCTTGGTCTCCTCTCCTGAAGCCCTCAGATGGTTCTGGTACCCATCTTGT
307 136094 AATTCAGGAGCAGCTTAGCTGACTGCTTCTAGCTTGGGGTTGTTCTAAGATTGTAATCAAGGTATCCACTAAGGCTTC 71A>G AATCATCTTATGAAGGATCCACTTCCCAGCTCATTCACATGACTGTTTATAGGCCCCCAGCT
308 136258 AATTCCAGAACGCTTAATTGTGCTCATACGAAACCTGTACATAGACCAACAGGCAGTTGTTTGAACAGAAAAAGGGGA 68C>G TTCAGCATGGTTTAAAATCAGGAAAGGTGTGCATCAGGGTTGTATCCTTTCACTGTACTTAT
309 137489 AATTCTTATACTAGTGGTCTCTTACAATATTTGTCCTTTTGTGACTGACTTATTTCACTCAGCATAATGCCCTCCAGATT 97T>C CATCCAAGTTGTAAGATGTTTAGCTAATTCATCATTGTACTTTATTGTTGCATAGTGTGT
310 137723 AATTCAGACTTCTAGCCCCTTAGACTGTGAGAGAATAAATTTCTTTTTGTTAAAGCCATCCGCTGGTGGTATTTCTGTT 100G>A ATAGTAGCACTACATGACTAGGACAGTGACCAAGTAAATTATCTCTGAAATTTATGATAAA
311 137727 AATTCTTGAAGACTGCTGTGATTGCTATTTTATGTAAATCAGATATGACAATGGGAACTGCCCTAACTGAATTAAGATA 45A>T CCTAACTACAATGGGGCTAATTGGACCTGGTGGCAGTAGGGGACAAGTGGCTGCACTCAAT
312 138755 AATTCCAAGAATTGAAAGCAAGGACACAAACAGGTATTTGCACACAAATGTTCATTGCAACACTATTCATAACAGCCA 34G>A ATAAGTTGAAATAACTCAAATGTCCATCAACAGATGAATAGATAAACAAAATGTGGTATGCC
313 139031 AATTCACCAGTGGAGTCATTGGTCCTGGGGTTCTCTTTCTGGTAACGTTGTTAACCACAGGTTCAAATTCTTTAAAAG 61G>A CTATAGGACTTTCAGAACACACATTTTCTTTTGAGTGGCCTTTCTGAAGTTTGTGTCTTACA
314 139796 AATTCAAATTGATAGATTACTCTGTTTTAATATATTTTCCTCATATACAGTACTACAGGGCATCTACAGAGAAAAAGACC 56C>T:59A>G AAGGAGATCCTTCAGACCCATTATTAGAATTAACAAAGAGTTTAACATAATGCCTAGGTG
315 139845 AATTCAACTGCTGGTGACATCTCATATTATATATATAAAAATTGTTATAAAGATAAAGATGATATGGTTTATATAATTGTT 61G>A GTAAAGATAAATATAATACATGTTATATAATTGTTGTAAAGATAAATATAATACAACAT
316 141533 AATTCCGAGATCAACAGATAAGCTGCTAGCTCAAGTCCCACCAACTGGAGGTGAGACAACAAGAGGCAGCTGCTGA 42G>C GTCTGGAACAAGCCACAAACCTTGGTAAGGTAAGCAGGCAGAAGGTAACCCGGTGGAAGTCAGA
317 141883 AATTCATTGCAATGATTGACTTATTTTAGTTCCCTCGCTTTTGCAAGTCACCTGGAACTGGTGTTCCCATTGGTTGTTT 36C>T ACCAGGACTTGTGCAAGAGCATATCCTTATGTTTGTATCACAACAAAGAGCATTGCTGACC
318 141932 AATTCAGAACAGCTGAAATAAAGTGAGTCTGAATTATTTAATTTGAATTGTACAGCATTTCTGTGAAGATATTCAGTGT 43T>C TAACACTAACAAGTGTCTGCACATAATGGGATGAAACAGATCTTTATTGCGCTTCTTCCAT
319 141964 AATTCCAGGGCTCAGAGCTTCCTGGCCGCAGGGAGGAGTCGTGAAGGGAAGTTTCCTGAAGGGTCGGGGGTGGGG 95A>G GCGGACACAGATTTCCTTGGAGGATAGAGCTAAGGTTGCGAGGGCCCCGCAGCCCGGCTCGCCCC
320 142498 AATTCTGAAGTTATATATTGTTGCTGAAGGCAAATACTTCCTGATTTGATATGCTGTCTTTTTAATAATTGCTTTTCAAT 100T>G TTCTCTTTAATATAATTTTGATTCTAAATGTATGTCTATAATAAAAAATCTGTTGCTGTC
321 142972 AATTCCTTTTGGTTGATGGATAGCATTACTATGACCCTACACTATTTTGACACTAGACCGCTTGGGTCCTGGGATTGT 77A>G AAATGGGTTTCCACAGTTTTCTCTTGGCCTGAAGGCAGAGGTTTACTTTGAAGGTTTTCTTT
322 143294 AATTCAGAAATAACCCAAATCCAAACTAATAGAAGACATGAGCATAAATAAGCAACTTAATTCCATTTTAGAGCTGTGA 91T>C CAAATCTAAAATAAAGTGGAAATTACAGGCATAAGCTTAAGGAAAAATGAAACCGAAATAT
323 143503 AATTCACTGTCAAAAACTGGGTCTCAAAATTTTACCCCTTTGTGTTCTTCCAAGAGTTTTATGGTTTTAGCTCTTACATT 44G>T AGGTCTTTGATCCATTTTGAGTTAATTTTCATATATGGTGTGAGATGTGGGTCCAGCTTC
324 144646 AATTCATGTTGACTCCTAGTTAATCTTTAGTTTTACAACAAGTTGGGCTGCTTTTTCTCCTTAATGATTTAATTAAGCAT 83T>G CTTATACTGAGCTTTCCTGTCAAATAAACCAAAAGTCTGGCTTCATTCTATTTTGAAGTT
325 144808 AATTCACATTAGCATGGATTTCTAGCGGCATCTGTAGTGTCTATAATGCCCCGTCCTATCAGACTGCTAAGCCATAAT 34A>G AGACCTGCTTTGAATTTAGGACACGATTTTTGTATTTGAAAACGAAAGCTCTGATCATATTT
326 144827 AATTCCAAACAGTGTATGTAGATACTACCCCTTCCAGGAGATGTAGTATTTTTCTTTTTGAGAAAACAAACAAGAAAAC 38G>A CCATTGTTGTCAAGTCAATTCCAAGTCATAGCAACCCTTTAAGATACAGCAGAACTACCCC
327 145066 AATTCCAAAGGTCTTTTCCCTGGCACTTGGAATCTCCACATCGTATGTTATCTCATTTAATCCCCACAAATATCCCATG 43G>A AGAGAGATATTATTATCATCATTTTATTGATAAGAAAACTGTTTTAGGCTGGGTTCTCTAG
328 145334 AATTCTAGTTAGCTGGTTAATTTGCAGACAACTCGGAACAATACAAATTTCACTATTTTGCAGGGGGGAGTCTCCTCT 80C>T TTGGGGGGAGGAAGCAAATCTTCACCTCTCTGTGCATATTTTCATTCTCTGCAGTGATGAAA
329 145851 AATTCTTTTCTGTTTTCTGCAGAAATTCCCTGAAGGAAAATATTATTCTCCACATTTTTATGCTGGATTGACTTTCTATC 36G>A AGATTTGTCTTTATATCTTAGTTTCCAGTTGATGAGAAAACTCAAAGTTTTTGAAAAATC
330 146074 AATTCAGTGCAGGAGGGTCACGGAGAGGGCGGGGTGGGGGGTGGGGGAGCAGCTCATTCCCTCCAAATGACCAAA 55C>T ATCTAAATGGAAAAGGGTTTGTTACAACAAGATAAGGAGCCCAAGAGGCATAGCTACCGACAAAA
331 146152 AATTCCCTCTACTTAGAGATTTAAAATCATTCTCAGAAAAGAAGTACAATACCCTGGAAGGTGTTTTGGTTGTCAGAAT 61G>T TGTGGGTCAGGGGTGTTACTGATCTTTAGGGAGTGACCAGGGAGGCCAGATTCCCTGCAAT
332 146278 AATTCTGTTTCGGGGGTGGGGGGTGTTAACTCCAGTGATTCTGCTTGCCTTGAGTGTGGACAAGCACATTACCAGAA 53A>C AAGGGAGCCACAGGCAGAGAGACACAAGTATTGCCAGCTTGCAGCCTTAGGTGTGGAGAAGCA
333 146393 AATTCACAGAGTCCCTGTACCAAAAAGAATTGGTCTATGTTCGACCACTTTAGGAGGCAGCATATGATCAAGAGCCAA 86T>G CAGTAGTTAAGGAAGAAGTCCAAGCTGCACTGAAGGTATTGGCAAAAAACAAGGTTCCAGTA
334 146681 AATTCAGGATATGCATTCAACAATATCATCACGGACGGTGAAAAAAGAAGATACTTAAAAAGGTTAATAGTTTGGATTT 37G>C TCATCATTTTCCTTGTGTGTTCCAGATCACGGACTAAAATTTTTACTTTGCACCTTAGATT
335 146694 AATTCCCATAATAACCCTGAGATAATTGAAACGAAGGCAGAGAGGGATTTGGAGGCCAGATTATCCAGAAAATTCCT 32A>C GTCTCAGCCTGGAATGGCCTGCGAGAGGCCATTTTCCCCAGGGCACAGGATAAAGCCACAGCT
336 146813 AATTCAGGAAGATGATGCAGGAACAAAATGCCAAAATAAATTCACAGCTAGAAATTATACAAAGACAACAACTAGAAA 72C>T TCCAAAAGATAAACAACAAGATTTCAGAAAAGGATGATGTCACAGAAGGGTTAAGAAGCAGA
337 147229 AATTCTTCTCTATATCCTTGTCGGTATTTGGTATTGTCAGAGTTTAAATTGTAGCCATTCTAGTGGGTTTGTAGTGGTG 89T>C TCTCATTATTACTAAGGATGTTGAGTATGCTTTCATGTGCGTTTTTGTCTTTTTTGTTTTT
338 147789 AATTCAACAACATCAAAAAAATAATTCACCATGATCAAGTGGGATTTATAGCAGGTATGCAGGGATGGTTCAACAACA 84A>G GAAAAACAGTCCACATAATCCATCACATAAATAAAACTAAAGACAAGAACCACATGATCTTA
339 147792 AATTCTCATCACCAGAAAAAGAGCAAGACATACCCCCTGAGCAGCCTGCCCCGCAAACAAGAGTCTGGAGGGAACA 85C>T CTACGCTGCAGAGGAGCCCGGAGCTGCTGTTACCTTTGTTCAGTGAGGGCAGCTTCAACGGGAT
340 148398 AATTCTACAAGAGATGTGACTTTGTGGAGTAAGCCCGCTTTAGTCTCGGTGAGTCTCATTATTCTAGCAACTCTTCAG 95T>C CTTGCAATTGGTGAGATATATTTGTGCCTGTTTATGTAAATCTAATTTGTGTTCTTGGGAAT
341 148612 AATTCAGGTTATTTCCAACCTCTTTTCAGTCTACCATACTACCTTGTAAAAACTGAAGATGAAAGGGAAGAAATTCTTT 55G>C TGAGACCTAGTTCTCTCTCCTTGACACTCTATAACTTATTTTACTATTTCTTATCTGATTT
342 149179 AATTCCAAGACCAGCAGGCAAGCCCACAAGGTTTCTCCTGATTCACACAGCTGCAGGAGCTGGCAAACCCAAGATCA 90C>G GCAGGTCAGAGAGCAGGGCTCTGCTCACAGGCTGTGATGATCAGCAAATCCCAAGTTTGGCAG
343 149247 AATTCCAGAACACTTAATTGTGCTCATGCAGAACCTATGCATACACCAAGAGGCAGTCCTTAGAACAGAACAAGGGG 76G>A ATACTGCAAACGAGGGGATACTGTGTGGTTTCAGGAAAAGTGTGCGTCAGAGTTGTATCCTTT
344 149472 AATTCAGAAAAATAATACGAGGACAAAATGTCAAAATTAATAAACAATTAGAAATCATTTTAAAAAATTAGAAATCCAAA 38T>A TGATAAAAAACAAAATTTCAGAAATGGACAAAGCAATAGAAGGTTTTAGGAGCAAATTTG
345 149681 AATTCAAACTGGGAAGTCAAGAGACAATTATAATTCAACAGAGAGAGAAAGGCTGTGTTATCTCTCTCCTGCTCACGA 95C>T TTCACCCAAGGCATCCCTAAGTAGAAGCTGATACCTTGCTGTGATACTTTAGATGGTCCACA
346 149718 AATTCTACCTCTGAACTACCACTGCCCTCTCCATGTTAATTAGTATGATATAAAAAGCCCTTCACAGTCTCTCTCTAGC 75C>A CTATCTATTTTGGTTTTATCTCTTGGCTCTCTGCCCCACCTCCACCTTCATCTCTACTCCT
347 149749 AATTCAGGGCATCAGCTCCAGGAGAAGTCTTTCTCTCTCTGTCGGCTCTGGAGGAAGGTCCTTGTGATGCATCAGTC 47T>C TTTCCTTGGTCTGGGAGCATCTCAGTGCAAGAACCTCAGGTCCAAAGGATACGCTCTGCTCCC
348 150371 AATTCAGTAGAGGCAGTACAGGAGGCACGAAGGGGGAAAGTGAGAGCGGAGTGTGGCTCTCTAGACCCCACTTTCA 36G>A TTTTCACCACAGCAGCTCCAATTCATAGCTAATGCTTTGTTTGTAAAAGGGATTCTATGGCTAA
349 150567 AATTCTCATTAATTTATTTTTATCGTTTCTCATAGGAAGGTCTGAAATATGTTGACAGATATGCTATTTGAGTGAAATGT 42C>A CAGTTTAAATTAGTTTAAATGTTGTCATTGTATGTAATCCTAGTCTGAAAGACAGTTTAC
350 150603 AATTCCCAGTCTTCTCAATGTTATCCATAATTTGTTATGATCCACACAGTCGAATGTCTTTGCATAGTCAATAAAATACA 76G>A GGTAAACATCTTTCTGGTATTCTCTGTTTCATCCAGGATCCATCTGACATCAGCAGTGAT
351 151301 AATTCTAGTCCTGGTTCTTCTATTCACTAGTTTCGTGGATGCAGGCAGGCAACCTGGGCTTCCTTATCTATGAAATGA 61T>C GGAGGCTGGACTTTGATCCCCAAGGCCCCTGGCAGCTCTAATGTCCCCATGACTCTAGGCTC
352 151688 AATTCCTGTATTGTGTGCCAAACTTGCATTTCTCAAGTCTTGAAGCAAGAGTCTTGCAGCCTTGTGGAAATAGTACAC 85G>A TGCTGAGAAGAAGAGTACTTGAAAGCCATTGAATTTTCTTAATTACTATTTCCTCTTTTCCT
353 152064 AATTCAGGGCAACAGCTCCAGGGGAACTCTTCTCAGTCAGCTTTGGAGGAGGATCCTTGTCATCAGTCTTCCCCTGG 39A>G TCTAGGAGCTTCTCCATGCAGGAACCCCAGGTCCAAAGGAAGTGCTCTGCTCCTGGCCTCCCT
354 152638 AATTCCAGCTTAAAGCTTTTAAAGTAGAGATTGTGTGGAGAAAGCACTGTGGATCTCTGTGACAGAAACTCGACTGAT 99A>G GGACATTCAATGTAGTCACAAATAATGTGCTGTTTATAGGAGGAGGCTAGGCTGAGGTTTGA
355 152694 AATTCTCTGAGGCATAATATGGGGAAACGAATTTTGCTACCATTTTTCCAAGTCAGGTATTTCTTAATATCAGTTTCCT 76T>C TTTTACAGAATTTTTGTGAATATTCAGAATAGCAGCTTTAAAATTATATCTTTTATAATAC
356 152976 AATTCTTACTAAGCCACCATTGCCCTCCTGCTCATGTGTAAAATGGGAATTTATAATGGAATCTAAAGTTGAATTTACC 45G>A CAGGATTGGTGACAGGATTCAGTAGAATAACATGGTGTGATTATTATAATAGTAATAATTG
357 153235 AATTCATGATTGAATGTGTTTGGTTCTCTTAAAAGAGCAAGTACACTTGGTTTGACTAAGCTATCTGTAGGAAACTATT 99G>A AAATAGATTTATAAAACATGTAATGTGTAAAGCAAAATAATATGTCTTAGGAAACACTGGG
358 153420 AATTCTGGCAGTTCCCTGTCAATGTACTGCTGCAACTGTGTTTGAATTATCTTCAGCAAAATTTTACTTGCGTGTGATA 67C>T GTAGTGACATTGTTCAATAATTTCCACATTCAGTTGGGTCACCTTTCTTTGGAATGGGCAC
359 153736 AATTCAGTCCCTTGGGGAGTTTGGATGTGTCTATGGGGCTTCCATGACCTTTCCTTCTACAGGTTGTGCTAGTTTCCC 76T>C CTGTATTGTGTACTGTCTCACCCTTCACCCAAGATACCACTTATCTATTGCCTATTTAGTAT
360 154943 AATTCAGCACAAGAGAGTAAGAAGAACAATGAAAATGTCAGCACCACAAAAAAAGCACTACAAAATGAAAACAATAAA 90C>T CTCATATTTATCAATAATTACACGGAAAGTAAATGGATTCAATGCACCCATAAAGAGACAGA
361 154976 AATTCTGACTGTACTTTTTCCAAGATAGATTTGTTTCTTCTTCTAGCAGGCCATGGTACATTCAATATTTACTGCCACA 49A>G CCATGATTTAAAGGCATCAATTCTTCTGTCTTCCTTATTCATTGTCCAGCTTTTGCATGCA
362 155054 AATTCACTAGCTGTACATTCTTGTTCAGTTCAATACATTTTTAGGATTAAAATAAATCAGAACAAAGAATTTAAATTTTAT 68A>G TAGAAGAGGAAAAGTAAAGAAAGATACAATCATTTGCCTGCCATACTCAGAGCTCAACA
363 155203 AATTCTATACCCAGCCAAATTATCTTTTAAGGATGGGGGTAAAATAGAAAAAAATCCCAGCTTTATTCTTGTGAAAGAC 40C>T CAAGTGAAGATGAAACACATTTTTATGCAAATAACATGTGCCTTCTACGTTTGTTTGCTAA
364 155440 AATTCTTAACCACTGGGCTGGAAGGGATGGTGGGAGCAGACAGTGGAAGGGAGAGAGATGGATGAAGGAATGGAG 59T>C:64G>A GGAACACCACAGATATCAATATATGTCTTGTCTCCCCAAATGAACGGAAAATTCATAATTGCCCC
365 155469 AATTCACTCCCACCATCAAAAGTTAAACCCTACAAAAACCAAGTAGCGATCATGTGAGAACATTATTAGGGAATCAAG 92C>A CCATTCAGTCATACAAGGCTTAAATAGTAAATCCAGGCAGGGCCTGATTCTGATTCTAGTTC
366 156016 AATTCTTCTAGAGGGAGTTAGAATTTTGATTATCTGTATATCCCAGCACAGTGAGAGACATGATCCACAGCACTGCTC 85G>A GGGGGAAGTCACAGCCTGAAGGACTTTCTCATGGCTTGTATTCCAAACCAGGTCTCCAGGGT
367 156914 AATTCATGGTACCCTACTTGTCATGGATTGAATTGTGTCCCTCAAAAATATCTGTCAACTTGGCTGGGTCACTATTCCC 82C>T AGTATTGTATGATTGTGTACCATTTTGTAATCTGATGTGATTCCCTTATGTGTTGTAAATG
368 157022 AATTCAAAATCACAAGGACATAGCATGCTGAGTGTGTCGGGATAAATAAAACCAGCTTCTGTGGAGTTGACTCTGACT 91G>A CATCACAACTCCGTGTGTGTCAGAGTAGAACTGTGCTCCATGGAGTTTTCAGTGGCTGATTG
369 157989 AATTCCTTATTTTCTGAAATAATTCCAAATCTCATGATCAAACTCCCAGTTTCAAAATTCTTAACCCCCAGGTATCTTGG 67C>T GCATTAGCATAAGTAATGAAGTAGACAAGGCTCACTTTAGATGCACACATAGAAGATCTG
370 158028 AATTCTACCATGCCATAACGTACTCCAGTAATATAATACCTCACGTTTTTGTTGTTGTTAGATGCCGTTGAGTCTATTTT 63T>G:76G>A GACTTATAGGTACTCCATGTGACAGAGTAGAACTTCCCCATAGGTTTTGTAGGCTGTAAT
371 158157 AATTCTGACAGCAGCTTGAGAAAAACAAAAAGTCACATACAGAGGAGAAACAATAAGACTAAACTCAGATTACTCAGC 36C>A AGAAACCATGCAGGCAAGAAGGCACTGGAATGACATATATAAAACCTTGAAAGGACAAAAAT
372 158226 AATTCAGCAGAGAAGTAGAGAGTGCCGTGGGAGGGACAGTTCGTGTTAGAACTGGCAAAAAGTTTGGAGAGTGGTG 42C>G GCATACCTGACCTCCACCTCATTGGAATCAACCTTGGGCTGACGATCTTCATTCTCCTTCCTCC
373 158275 AATTCACACAGAGTTTGGATTATATTAGTAGTCCTCAAATTTCTCTAGTTATCATAACTACCTGGAATCTTTTTAAAAAA 92C>T TATGGATTTCCCGGTCTCTACCTTAGATGTACCACATCAGAAGTTCCAGAAGAGAATCCT
374 158546 AATTCTGGCCTCCATGTGGGAGACCTGTGTTCGATTCCCTGCCTGTGCACTTCATGCACAGCTACCATCTATCTGTCA 48T>C GTGAACAATTATGTGTTGCCACAATGCTGAAGGACCCCTGGTGGCGCAGTGGTTAAGAGTTC
375 158567 AATTCTCATAACACACAAGCCACCACAGTACAACAAACTGGCAGACCAGTGAGTGCCCATTTAAGACCATATTAAATA 52A>G GTAACCAGTATTTTGCAGCTACTAAAAATAAAGAGCCAGTAATGTATTTATAGCACACAGCG
376 158786 AATTCCATTTTATTTCTGTTCTTCTTTTTATTTTTTTGTTTAAAATTAACCTGTATACTGGTTTCACAAGTAACAGAAGGG 88G>C AGAGTAGTAAGAATATAGTGATGCTTAAGCCAGGAGAAAATGTTCCTGGAAATATATGT
377 158910 AATTCAAAGAAAATTGCTCACCTTAAATAAAGTTTGGGTTTTTTTATGAAATGGGTAAAAACTTTCTCATTCAAATGTTT 77A>G TATTAACATTTCTTGACTGACTGGCTTTTCTTTTCTTTTTTACTTTTATTCGTTCATCCA
378 159313 AATTCTACAAAACATTTACAGAAGAGCTTATACCAGTACTACTCAAGGTATTTCAGAGCATAGAAGAGGAAGGAACAC 63C>G TTCCAAATTCATTCTATGAAGCCAGCATAACTCTGATACCAAAACCAGGCAAAGACACCACA
379 159535 AATTCCTGGACCTTGAAAGATAGGCTTTATTGTTGTTGAGACTTACACTGTTGAGAGAAAAATATCTCACAGGCTTATC 97A>C ATGTTCATGTGATTCTCAAATTTTGAAAACCAAGATATAGATTCATCAGCACTCACAGAAT
380 159649 AATTCCCTTCATTTTAATTTTACCTGAAAAATGAGTAAATCATACGCATTTGGCAGAAATGCGACAATGCCGTTCAAGC 46G>A TTCTAGCCCAAATGCTTGCAATATCATAGTTTCTCAATGAACATTTTCTGTGATCACCAAC
381 160128 AATTCAAAAAAAATTCTAAATCAGTTTTAAAATTGAAAGGACAAATCTTTCTCTTTTTTCATACTCATAGAGGTACTGCT 86C>T GATTGCTAATTTTAATCTTTAAAGATTTTCTAGTACTGTAACACTAAGACAAAAAAGTAA
382 160370 AATTCCAGAACACTTAATTCTGCTTGTGAGGAACCTGTACATAGGCCGAGAGGCAGTCCTTTGAATGGAATAAAGGG 74A>G ATACTGTGTGGTTTAAAATCACAAAAGGTGTTCATCGAGTTGTTGTATCCTTTCACCATATTA
383 160576 AATTCTTCAAAGAAGGAGCCAGAGTCTGACTGTTTCTTCGGTGGTGGAGACAAGGAAATTATCAGGAAGTCAAGTCT 39C>T AAGAGGTGTCATAAGCCCCAAGTGATATCCGTATGGGTATCAGGGAGAGCTGTATGCTCAAGG
384 160688 AATTCATCAGAGAAAAGATGAGAATGGCAAGATCTCTTACTCCCTTTCATATTAGAAAATCTTATCAGTAAAGTTTTTTA 65T>A:81G>A GGTAGAAAAGAAAGCTTCTCTTAACATATTTCATGCTATTTTCAATTATTATCCTATATT
385 161107 AATTCACAATCCTACTGCAGAAAAGGGTTAGTAAAGCAGCTACAGGTCCCTGAGTGGCCCAAAAGGTTTGTGCTTGA 47T>C CTGCTAACCTAAAGGTTGGAGGTTCAAACACACCCAGCACTGCTGCAGAAGAAAAGCCCAGCA
386 161522 AATTCACCAGTGATTTCTGCCTATAATCAAAATTACAAGGTAGAGATGGGATTTGTAAAACCTGTTGGTTGTAGGGTA 64G>C AGTAGCCAGCCATTCTTCTATAAACTGTTAGCATATTTAGTAAGTGCATGTCTTTTCACTAG
387 161757 AATTCTTCTGCTTATACATATTAAGTAAATATATAGTAGATTTTCCATCTGTTTCACTCCTAGGGACACCATGACTAAGT 46T>C ATCCAGTGCCATAAGTCCATGCAAGTCAGACTGTTCTGATTACTGTTTTGACTCTGCTGT
388 161865 AATTCAGCCTTTGCAACAGTGTTGAAATAAATGAACAAACATAAAGGGAGAGGTCAGAGAGGCCCTCCTCTTACGGT 74T>C AAAGAAAGTCTCTTTATGCCTTTGACCCCTGGTTCATGCATTTATGGTCACCATTTGAGAAGT
389 162625 AATTCTAATGTTGGCAGTTTCTCTTTGTGCTGGGGCAGCACTTGCAAAAGCACATTCTTCTGCTAAATGCATATACAC 71A>G AGTGAGATAATCACAGGCCTCCACTGCTATCAGGGAGCTCCCACTGGAATGGATTTGGAGAG
390 162646 AATTCTTTGTGGTTGGAAAACATGAATAAGTTGCTGGAAATCTTGTTTATATATTATGATAGTTCAGTTTAATAAGAGTA 87C>G ATTAGCCTTTTATATGAACTTTTCTGTTATACAGGTGTCTGCTGACTGCCATCTTGAATT
391 163142 AATTCAAAATCTAAGAAGAGAGCCAGGGGAGGCAGACAGAGATAGAAATATCAGCATTATTACCGACAACGCTACAT 75C>G TACAGCTGAGGCTGGGCCCATGGCTTCCATTTTAGGACAATTTGCCCAAGGGGATGATGGGTC
392 163434 AATTCTTTACCCAGCCAAAATGAAAATCAAGTGTCAAGACATTTTCAGATATGCAGAATTTTAAAAATTTAACTCTCATA 32T>C CAATCTTTTCCAGGAAGCTATCAGAAGAAATATTCCACAAAAACAAGAAAATAAACAGAG
393 163737 AATTCTCAATTATATGCCACTGGTCTGTATGTCTATCCTTATGCCAGTACCACATTGTTTTGATTACTCTAATTTTATAG 84A>G TAAGTTTTGAAAACTAGGAGATTCATTTTTATTAGGAACTTTTTTCTCAAATGTATTTTT
394 164192 AATTCTAAAATAAACTACATCAATCTATTATCGGTCTTTGTCTGTGAACAGAAATAATTGTTAAGTAGACCATCTCAGAT 72A>G ATGAGTATTTAGTCTGCCAAACCTGCTCTTTATATCAGTGCAAGAGTTAGTGTAAGAAGA
395 164393 AATTCCAGACACCAATCTGACACAAGCAACTCTGTGACCATGATGGAGCCTGGCTATAACAAGACTAGCTCATAATTA 77T>C TGTCTGAGCACAGAGGAAAACAGCAACATTGTTGAAACCACAAAATTGACTAAACTGGTTTG
396 164463 AATTCAGACTTCTAGCCTCCTAAAGTGTGAGAAAATAAATGTCTGTTAAAGCCACTCACTTGTGGTATTTCTGTGATAG 81A>C CAGCACTAGATAATTAAGACACATAGAAAATTGTTCTTATCCACCAGCTTTCTCATCTGTG
397 164467 AATTCACATTTATTCACAGGGGAGATTTCTGTGAGGTGATATTGATGGGGAGCTTTTTGGTGCATGAGACTGGGCTCT 52G>A GAACATTTCCAAATGAGCCATGGGAGAGGAATCTGGTTGTATAGAAATTTTCTTTCCCTGAA
398 164845 AATTCTTGGGATTGTTGATTTGTTAAGGATATTAAGCTACTTAGTGATACCAAGGGAATTTCCCAAAAGTGTATTACAA 94T>G ATATAACACATATATCAGAAGAAACCAATGATCTAATTCCATATTCTGTATCTTACAGTCA
399 164908 AATTCATCCTCCAGATAAATTCTCAGAAATGTGCCAAGATGAATGCACAAGGATGTTTATTGTGGCATTGTTCACAATG 64G>A GCAAATATCTGAAATCAATTGAATATAATGCTGCTACTAAAAACAATGAGGTACATGTTGA
400 165365 AATTCAATCACGTGGTCAGTAAATCAGTCAATCAAACATGCCAACATAATGAGGTCTCAATAAAAACTCTGGATACAG 45C>A TAATCTCCCCCTTCCTATTGAATTATTTCCATCAGACTAAAACAAGCTATTATTTCTCCAAT
401 165553 AATTCAGACCTTCCAGGGTTCAAGTTACTGAACCTCACTGTTCCTCTCAGCTTTCTCATTCATTTATATACAAATATGT 96A>C GAAAGGAGTAGCAGAGATGTATGTCCAGAGTACCTATTTCATGCACTCATTAAATATTTTT
402 165615 AATTCATATTCAAACTTTTGCGTTGTTCTGCTTGCCATCTCATATTTCAGTGCATGCTCTGTCATGGATTGAATTGTGT 40T>C CCCCCAGAAATGAGTGTCAGCTTGGCTAGGCCACGATTCCCAGTATTGTGTGATTGTCCAC
403 166421 AATTCAATTCATGGTGATCCTCACCATGCACTTACTGATAAAGGTGGGGTTTGTCTACCTTGCCACTTGTTGGCACAG 90G>C TGGGCAGTACAGCCCTCTTTGCCTGTTCCTCTCAACTCTGTTCTATCCAATACAAGCCCTTA
404 170221 AATTCCTGGTTATTTGGCACAAGATGGCCTCCTGTTGTAACAACACTAAATTTCTTTTCTCTCATATCGTCTTTGTAATT 69G>A TTTCTACTATAGTAATTCATTGTGAAAATTAAGATTAAAAAAATCATCTACGAAAATTTC
405 172068 AATTCCCATTCTTCTCAGTGTTATCCATAATTTGTTATGATCCACGTAGTCAAATGCCTTGGCATAGTCAATAAAACAC 84A>T GGGCAATCATCTTTCTGTTATCCTCTGTTTTCAGCCAACATCCATCTGACATCGGCAATGA
406 172417 AATTCTCCGAGCTCCTTAACTTGGGGTGGCTTTAAGCAGCATGCTGTGCAGCCATGCTTGGAGTGTGTGACAGAGAA 39T>G CTGCAAAATGCCTGTCAAAACCTTTGACAAGTGAAAGACAGGGAAACTGGGGGGAGGCCGCCC
407 173042 AATTCTTTCCTTCCATGCAGGAGACCCAGGTTCATTTCCTGGCCAATGTACCTCATGCGCAGCCACTACTCATCTGTC 35T>A AGTGGAAGATTGAATGTTGCAATGACATTTAACAGGTTTCATTGGAGCTTCTAGATTAAAAC
408 173710 AATTCAGGAAAATAATGCAGGAACAAAATTCAAGAATAAATGCACAACTAGAAATTATACAAAACAGCAATTAGTAATA 79G>A CAAAAGATAAACAACAAGATTTCAGAAATGGACGGTGTCATTGAAGGGCTGAGGAGCAGAT
409 173941 AATTCGTAGTGAGTCCTTAAAGAAGTATTATTAAAAATATACAAATAAAATTCAGGTCCTATATATACAATATGCTCTAA 96A>C AGTATTACTGACTGGATTAGTTCAACAATTGGGCCACACAGGATGTTTCATAAAGTTTAG
410 175437 AATTCCAACTCCAGTCTCCCTGTCTCCTGAGCCTGGCTCTTTCTAAGACGCTTTTTGGAAAAGTCATGACTAAGTTAT 49T>C CCCCATTTGGAAGGGCTCCTTCTTTTTCAACGATGCAGCTCATTATCAGAGCTGACTTACAT
411 177415 AATTCTTTAAGGACCTTCCAGTTATAGTTTCATTTTTCAGTGCAATGGAGCAAGGCTGTGACTGTTTACTCTGCGTTTA 83G>A GAGAGGAAAAATGTGACACCTCTCTGCCCCATTTATCAGGAATTTGGCATTTTTGTGGGCA
412 179917 AATTCACTCTCAGCCTGCAGTACCCCCTGGCTAGAAGGCATAGATGTATACAATGCCTGGTATTCAGGGAACAGAAA 111A>C GAGAAGGAAGATATAGAGAGATGAGAAACTGAGCAATGTAGAAATACAAAAAGGGATAAAGAC
413 180185 AATTCATATTGGAAGGTATGTATCAACACTAACTATATTTGTCAAATATTTTATTAGTACTAATGCTAAGTTGGAAACCC 43C>T TGATGGCATAGTGGTTAAGAGCTACGGCCACTAACTAAAAGGTCAGCAGTTCGAATGCAC
414 180323 AATTCCAACTTCAAACCTGTGTGAGGCTGAGCTTCCTTTAGTAAATCTCCAAGGAATTGTTTTTCTCATTCCATCTAGC 66C>T GTCAATACCAAGACAAGCATTTCCATGCTCCTCACTTTTTTTCTTTGTTTAATTTTGTCAC
415 183590 AATTCAAGTCCTTTATCCAGTAGGGGTTTTGCCAATATTTTCTCCCAGTTTGTGGCTTGTCATTTGATCGTTAACAGTG 69C>T TCTTTCATAGAGCAGAAGTTTTTAATTTTAATAAAGCCCAACTTACCAATTTTTTTTCTTA
416 184020 AATTCCCATTCTTCACAATGTTGTCCATAATTCGTTATGATCCACACAGTTGAATGCCTTTACTTATTCAATAAAACACA 91C>T GGTAACCATCTTTCTGGTATTCTCTGCTTTCAGCCAGGATCTGTCTGACATCAGCAGTGA
417 185075 AATTCAGAACTGAAGTAATGAGTTCCAAGAGCAGAAACTCTGAAAAGATGGAAGGTAATTACTGCAGAAGAAAGTGCT 32C>T GCAGCAGCATTTGCTAGAACTATTCAAGCATTCTTGGGTCAGTCCACCCCATTAACCAAGGC
418 185242 AATTCCAAAGGGAACACCTTATTCTCCAGTGATGATAAGGATACCTCCATATCCATTGAGAAGAGGGGGTGAGTTGA 35A>G AACGCATTGCTAGACTCTGAATAGGTTATAAGCACGAGTTTGTTATGGATGTCCTATCCGACT
419 185899 AATTCTGTCTGTTTTGGGAAACTGACTCCAAAGATTGGAATGAGAGCTATGTTTAGGTTAAATCAGTAATTAGTTTAAT 66G>C CAGTCTTTAAGACAGTGTCAGTACAAAATACCTGTCTAAAGTTTATAATCAAACTGTCACC
420 186912 AATTCTGTAGCTGGCCCTGCTCCTGGCTATAGGTCCTAACCCAAAGGAAAACATGCTAGGACTGCTGAGCATTGGGT 50A>G AGGACAACCTCTGCTGAAACACCTCATGGGTTTTGTTCTCTAAGATGAGGGCCATAAGCTTTC
421 186970 AATTCTCCTTTGAACCATTTGTAATGTCTTACTGACTCCTTCATTAATTAATGAGTGAAATAAAAATTTTGACACTCCTT 35G>A CATTTTATATTTGTATGACAGTCATGATGCTTCATTTAACAAGGGTAGCCCCCAGGGCAT
422 187465 AATTCTCTGTTTCCTGATTTGGACAATAAATTATGTCAATATTCTACTTTAAATAACATTGTTTTTGCCGTGAGTTTCAA 92T>C GTACAAAACTTCTTTTTTAAGTTTTGTACCTTTATTAGGAGTCCCTGAGTGGCACAGTTA
423 187709 AATTCTGCCTCCAGTAATTCCAAGGCCTTCTCTTTCTCCAGAAGGTTTCTTGGTTCTTTGTTTCATCACTTGCTGAAGC 65A>G CATCATGGCCTGCCTCTTTATGTGATTTAATATTGACTGTTGTCTCTGAGCTATCAATAAG
424 187767 AATTCCAGAACACTTAACTGTGCTCCTGAGGAAACTGTACATAGATCAAGAGGCAGTTGTTTGAACATACATATATGT 70C>T ATATAAATTTTTACAATTAATGATATAGATTTCCATGTTTAAGAAGATGGAGCATATATATT
425 187844 AATTCCTCTATTTTCCTGGGAAAAGAAGGATTATTTGTAATCCTACAGAAGCCCACCTTAACAGAATGGGCATGGCAT 83A>G AGTGAAAGAGAAAGCTAAACAAAAAAACTTCCTTCTGTAAAGCTGGGCTAAGGACTGTAACA
426 188437 AATTCTTACAGACCACCAGACACACTCTGAGAAATACTAAAAAGCAGTCATGTCTCTAAATAAATATGGAGGCAACAG 94C>A GTGTACCCATGTGTACAATCTGAAACAAAAATGGAAATGCCTGCCTTTTCTTGAGAACAGAA
427 188527 AATTCTGTGACTGAACCAGCACTGTCTTTTGCAGATCAAAAAATAACCAAAGCACCCAGATCAATCACTGTACAGTTT 85T>C ATTCTTCGGCAAAACCTCTTTCCTTCTTCTAGATAAAGCTTGTTGCAGTTTCTCAAACTAAA
428 188645 AATTCAAGTGGGGATGGAGAAGATAAAGTTCCAAGGGTACGGGACTTGAAACATAAATGGTAGATAGGATTGTATCT 100A>G AAACAAATTAGCATGGACTTCAATTAGCTGAGACTCACTTCAGAAGAATAAAAAATCCTTTTT
429 188964 AATTCTTAGGGAAACCCAAATGCATCAGGGAATAAAACACACAATCTGTCACACACACACATGCAGGAGAGAGACAA 89G>A AGAGGAAATAGGAGAAACTGAAGCCTTTGGCATTTACAGCTACAGCAAACATTAAACACAGCC
430 189110 AATTCTATTTCTCATCCAATGCACACAAAAATTGATTGTAATCAAAGCATTTTATTGCTTTATGTCTCATATAGATTCTTA 89G>A AATTAAAGAATGAATAGCTGTTTGAAATTAAAAATAGTCTTTTATGGAGAACTTGAGAT
431 189335 AATTCAGGAAGAAACACATAAATGTGATAATTCTACATCTACTTTTTTATCTTCTCATCTTAACGACAGATAACAATGCT 49A>G GCAATAGATAAAAAATGTAATGAGTGAAATCTTAGTAATAGCACTGGCTTTAAAAACAAA
432 189503 AATTCAGGATAATTGCTTCACTGGACTGTTGCAACATTATGTAAATGAGCTGAAGAGAGAACCTTCCAAATGGTCTGG 68G>A AATTTCTACCATGATGACATGTAGTTTATCTTAAATGTGCACAATCTTATAGTAAATTCTGC
433 190184 AATTCCTATTCTTCGCAATGTTACCCATCATTTGTTACGGTCCACACAGTCGAATGCCTTTGCATAGTCAATAAAACTC 78A>T AGGTAAACATCTTTCTGGTATTCTCTGCTTTCAGCCAAGATCCATTTGATATCAGCAATGA
434 209470 AATTCTCTGTTCCATCATCTGAGAGCCATAAGGACTGGGCGCAGTACACCACCATGCTTTTGAGCTCTGATCCATGG 62A>G ACTATTCAACACATAGCCTTCCTTCAGAGCCATTTAGGTCAGAAGCAGTATAGGTATTAACCT
435 210238 AATTCATATCAAAGTTTCCAATACTTCCAATTTCAGTCCAAAACCTTAGGGTTCTTTCCAGCTGTCTCCTGTTCCATAAT 85G>A TATAAATTCTTTTTCAGACACTGAGAAACCTTGCTCCCACATTATCAGTATACTTACTTA
436 217824 AATTCCACACCCAGGTGACAGCTGTCCCTGCATGACACCCCAATCATTGATTGTAATGTGCATAGAAAGAAACTTCTA 65G>A TTGTGTTAAGCAACTGAGAATTTTTGGAGTCCCTGGGTGGTGCAAATAAGCCCCTGGGTGGC
437 219585 AATTCAAGTTCTAGAACTTGAATCTAGAGTCCCATACTGGTTAAGAACCAAAGCTCTTAAGTAAGGCCGGGTTCAAAT 32C>T CAGAGCTGCCATTAGTGTGGCTTTGGGCAAGTTATTTAACCTTGAGGAGGCTCACTATCCTC
202 438 219816 AATTCGGAAACCCCTTTTGTTCTCTTGAGCACCTTGATGGGTCTAGGTTCAAATATTTAGTTAGAAGCCCTTAATGTGT 94A>G ATTTACTAGTTCTCGATCTATTATCTTCAGCACGCCTTCCAGGTAAAGTATGAAATTAAAA 439 228341 AATTCATTTCTCAGAAGGTTTCAAATTTCAAGTATAGCATAAAGATGCTTGATAGCTTACAGCAATACATTTTTACCAGT 35T>G TATCAAAAAAAGAATAAAATGTACTTCTGTTCATAACTAAAAAATAGACAAAGAAGCACC 440 229523 AATTCATGTATTTATTATTTCTGTTTGTAAATCTAGCACCGAGAACTGTTATCTACTATTCAAGAGAAACTCTATACATG 41G>T TGTGAAACTCTGAGCAGTCAGGACTGCATAAGAAGGAGGGAGATGAACTTTAAACATGAA 441 232926 AATTCCACATTGTTTCTAGGGCATGGCCTGGAAGCAGGACTCCCAGAGCTCAAGCTGAGCCGATGACTAGAAGCTG 80G>A GGCATCTGTCTTCCCTTATTGCTTTTGCTTCCACCTGGTAACTGAACATCCAGAATGTCTTTAG 442 236905 AATTCTCCAGCTGCTCTGCGGGAAAGATGAGGCAGTCTGCTTCCGTAAAGATTTACAGCCTTGGAAACCCTGTGGGG 44C>T CAGATCTACTCTGTCCTATAGGGTCACTATAAGTCAGAATCCACTCGACAGCAATGGACTTGG 443 239008 AATTCCCATTCTTTGTAATGTTATCCTTAATTTGTTATGATCCACACAGTGGAATACCTTTGCATACTCAATAAAATGCA 100T>G GGTAAATATCTTTCTGATAGTCTCTGCTTTCAGCCATGATACCTCTGACATCAACAATGA 444 239009 AATTCTGTGCATTCACTCACCCATCCACCCATCCATCCATTAAAAACTAAACGCTCTATATGAGTTTTGGAAAGTGTGT 52C>T CAACTAATATTATTAGAGATTGAGGATGTCCTATTAGTAGGGGATTTTTTTTCAAAAATAA

Appendix 6.1. Relative abundance averages for bacterial phyla identified in Queensland dugong faecal samples. Southern Queensland included Clairview, Hervey Bay and Moreton Bay.

Phylum              Overall averages    Southern Queensland averages
p_Firmicutes        62%                 60%
p_Bacteroidetes     30%                 15%
p_Actinobacteria    5%                  16%
p_Proteobacteria    2%                  6%
p_Verrucomicrobia   0%                  0%
p_Other             1%                  3%

Appendix 6.2. Operational Taxonomic Units (OTUs) identified in the 47 dugong faecal samples collected in Queensland from analysis using QIIME 2. Classification from phylum (D_1) to species (D_6) level.

OTU  Classification
1  D_0__Bacteria;D_1__Actinobacteria;D_2__Actinobacteria;__;__;__;__
2  D_0__Bacteria;D_1__Actinobacteria;D_2__Coriobacteriia;D_3__Coriobacteriales;D_4__Eggerthellaceae;__;__
3  D_0__Bacteria;D_1__Actinobacteria;D_2__Coriobacteriia;D_3__Coriobacteriales;__;__;__
4  D_0__Bacteria;D_1__Actinobacteria;D_2__Thermoleophilia;D_3__Solirubrobacterales;D_4__Solirubrobacteraceae;__;__
5  D_0__Bacteria;D_1__Actinobacteria;__;__;__;__;__
6  D_0__Bacteria;D_1__Bacteroidetes;D_2__Bacteroidia;D_3__Bacteroidales;D_4__Bacteroidaceae;D_5__Bacteroides;__
7  D_0__Bacteria;D_1__Bacteroidetes;D_2__Bacteroidia;D_3__Bacteroidales;D_4__Muribaculaceae;D_5__wallaby gut metagenome;D_6__
8  D_0__Bacteria;D_1__Bacteroidetes;D_2__Bacteroidia;D_3__Bacteroidales;D_4__Muribaculaceae;__;__
9  D_0__Bacteria;D_1__Bacteroidetes;D_2__Bacteroidia;D_3__Bacteroidales;D_4__Paludibacteraceae;D_5__Paludibacter;D_6__bacterium enrichment culture clone DPHB06
10  D_0__Bacteria;D_1__Bacteroidetes;D_2__Bacteroidia;D_3__Bacteroidales;D_4__Paludibacteraceae;__;__
11  D_0__Bacteria;D_1__Bacteroidetes;D_2__Bacteroidia;D_3__Bacteroidales;D_4__Prevotellaceae;__;__
12  D_0__Bacteria;D_1__Bacteroidetes;D_2__Bacteroidia;D_3__Bacteroidales;D_4__Rikenellaceae;D_5__dgA-11 gut group;D_6__uncultured Bacteroidaceae bacterium
13  D_0__Bacteria;D_1__Bacteroidetes;D_2__Bacteroidia;D_3__Bacteroidales;__;__;__
14  D_0__Bacteria;D_1__Bacteroidetes;D_2__Bacteroidia;D_3__Flavobacteriales;D_4__Flavobacteriaceae;D_5__Polaribacter;D_6__uncultured Flavobacteriaceae bacterium
15  D_0__Bacteria;D_1__Bacteroidetes;D_2__Bacteroidia;D_3__Flavobacteriales;D_4__Flavobacteriaceae;__;__
16  D_0__Bacteria;D_1__Bacteroidetes;D_2__Bacteroidia;D_3__Flavobacteriales;__;__;__
17  D_0__Bacteria;D_1__Bacteroidetes;D_2__Bacteroidia;__;__;__;__
18  D_0__Bacteria;D_1__Cyanobacteria;D_2__Melainabacteria;D_3__Gastranaerophilales;__;__;__
19  D_0__Bacteria;D_1__Cyanobacteria;D_2__Oxyphotobacteria;D_3__Chloroplast;__;__;__
20  D_0__Bacteria;D_1__Firmicutes;D_2__Bacilli;D_3__Bacillales;D_4__Bacillaceae;D_5__Bacillus;__
21  D_0__Bacteria;D_1__Firmicutes;D_2__Bacilli;D_3__Bacillales;D_4__Bacillaceae;__;__
22  D_0__Bacteria;D_1__Firmicutes;D_2__Bacilli;D_3__Bacillales;D_4__Family XII;D_5__Exiguobacterium;D_6__Exiguobacterium sp. AG2
23  D_0__Bacteria;D_1__Firmicutes;D_2__Bacilli;D_3__Bacillales;D_4__Family XII;D_5__Exiguobacterium;__
24  D_0__Bacteria;D_1__Firmicutes;D_2__Bacilli;D_3__Bacillales;D_4__Paenibacillaceae;D_5__Brevibacillus;__
25  D_0__Bacteria;D_1__Firmicutes;D_2__Bacilli;D_3__Bacillales;D_4__Paenibacillaceae;D_5__Paenibacillus;__
26  D_0__Bacteria;D_1__Firmicutes;D_2__Bacilli;D_3__Bacillales;D_4__Planococcaceae;D_5__Lysinibacillus;__
27  D_0__Bacteria;D_1__Firmicutes;D_2__Bacilli;D_3__Bacillales;D_4__Planococcaceae;__;__
28  D_0__Bacteria;D_1__Firmicutes;D_2__Bacilli;D_3__Bacillales;__;__;__
29  D_0__Bacteria;D_1__Firmicutes;D_2__Bacilli;__;__;__;__
30  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Christensenellaceae;D_5__Christensenellaceae R-7 group;D_6__uncultured Ruminococcus sp.
31  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Christensenellaceae;D_5__Christensenellaceae R-7 group;__
32  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Christensenellaceae;__;__
33  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Clostridiaceae 1;D_5__Clostridium sensu stricto 13;__
34  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Clostridiaceae 1;D_5__Clostridium sensu stricto 1;__
35  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Clostridiaceae 1;__;__
36  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Eubacteriaceae;D_5__;__
37  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Eubacteriaceae;__;__
38  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Family XI;D_5__Sedimentibacter;D_6__Sedimentibacter hongkongensis
39  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Family XI;__;__
40  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Family XIII;D_5__[Eubacterium] nodatum group;D_6__uncultured bacterium
41  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Family XIII;D_5__[Eubacterium] nodatum group;__
42  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Family XIII;__;__
43  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Lachnospiraceae;D_5__Blautia;D_6__Blautia massiliensis
44  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Lachnospiraceae;D_5__Lachnospiraceae ND3007 group;D_6__uncultured bacterium
45  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Lachnospiraceae;__;__
46  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Peptostreptococcaceae;__;__
47  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;D_5__Acetanaerobacterium;D_6__uncultured marine bacterium
48  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;D_5__Acetanaerobacterium;__
49  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;D_5__Anaerofilum;__
50  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;D_5__Caproiciproducens;__
51  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;D_5__Faecalibacterium;D_6__bacterium ic1379
52  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;D_5__GCA-900066225;D_6__Ruminococcaceae bacterium
53  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;D_5__GCA-900066225;__
54  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;D_5__Oscillibacter;D_6__uncultured Oscillibacter sp.
55  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;D_5__Oscillibacter;__
56  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;D_5__Pygmaiobacter;D_6__uncultured Clostridiaceae bacterium
57  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;D_5__Ruminiclostridium 5;__
58  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;D_5__Ruminococcaceae UCG-002;D_6__Trichuris trichiura (human whipworm)
59  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;D_5__Ruminococcaceae UCG-010;D_6__uncultured Bacillus sp.
60  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;D_5__Ruminococcaceae UCG-010;D_6__uncultured organism
61  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;D_5__Ruminococcaceae UCG-010;__
62  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;D_5__Ruminococcaceae UCG-013;D_6__uncultured Acetivibrio sp.
63  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;D_5__Ruminococcaceae UCG-014;__
64  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;D_5__Ruminococcus 1;D_6__Clostridium islandicum
65  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;D_5__Ruminococcus 1;__
66  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;D_5__Subdoligranulum;__
67  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;D_5__[Eubacterium] coprostanoligenes group;D_6__uncultured Clostridium sp.
68  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;__;__
69  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;__;__;__
70  D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;__;__;__;__
71  D_0__Bacteria;D_1__Firmicutes;D_2__Erysipelotrichia;D_3__Erysipelotrichales;D_4__Erysipelotrichaceae;D_5__Asteroleplasma;D_6__gut metagenome
72  D_0__Bacteria;D_1__Firmicutes;D_2__Erysipelotrichia;D_3__Erysipelotrichales;D_4__Erysipelotrichaceae;D_5__Catenibacterium;__
73  D_0__Bacteria;D_1__Firmicutes;D_2__Erysipelotrichia;D_3__Erysipelotrichales;D_4__Erysipelotrichaceae;D_5__Erysipelotrichaceae UCG-004;D_6__uncultured Mollicutes bacterium
74  D_0__Bacteria;D_1__Firmicutes;D_2__Erysipelotrichia;D_3__Erysipelotrichales;D_4__Erysipelotrichaceae;D_5__Turicibacter;__
75  D_0__Bacteria;D_1__Firmicutes;D_2__Negativicutes;D_3__Selenomonadales;D_4__Acidaminococcaceae;D_5__Succiniclasticum;D_6__uncultured Veillonellaceae bacterium
76  D_0__Bacteria;D_1__Firmicutes;__;__;__;__;__
77  D_0__Bacteria;D_1__Fusobacteria;D_2__Fusobacteriia;D_3__Fusobacteriales;D_4__Fusobacteriaceae;D_5__Fusobacterium;__
78  D_0__Bacteria;D_1__Lentisphaerae;D_2__Lentisphaeria;D_3__Victivallales;D_4__Victivallaceae;D_5__Victivallis;D_6__uncultured organism
79  D_0__Bacteria;D_1__Planctomycetes;D_2__Planctomycetacia;__;__;__;__
80  D_0__Bacteria;D_1__Proteobacteria;D_2__Gammaproteobacteria;D_3__Betaproteobacteriales;D_4__Burkholderiaceae;__;__
81  D_0__Bacteria;D_1__Proteobacteria;D_2__Gammaproteobacteria;D_3__Oceanospirillales;D_4__Halomonadaceae;D_5__Halomonas;D_6__Halomonas caseinilytica
82  D_0__Bacteria;D_1__Proteobacteria;D_2__Gammaproteobacteria;D_3__Oceanospirillales;D_4__Halomonadaceae;D_5__Halomonas;__
83  D_0__Bacteria;D_1__Proteobacteria;D_2__Gammaproteobacteria;D_3__Pseudomonadales;D_4__Moraxellaceae;D_5__Acinetobacter;D_6__Acinetobacter sp. Salman2
84  D_0__Bacteria;D_1__Proteobacteria;D_2__Gammaproteobacteria;D_3__Pseudomonadales;D_4__Moraxellaceae;D_5__Psychrobacter;__
85  D_0__Bacteria;D_1__Proteobacteria;D_2__Gammaproteobacteria;D_3__Pseudomonadales;D_4__Moraxellaceae;__;__
86  D_0__Bacteria;D_1__Proteobacteria;D_2__Gammaproteobacteria;D_3__Pseudomonadales;D_4__Pseudomonadaceae;D_5__Pseudomonas;D_6__Pseudomonas sp. IGCAR-24/08
87  D_0__Bacteria;D_1__Proteobacteria;D_2__Gammaproteobacteria;D_3__Pseudomonadales;D_4__Pseudomonadaceae;D_5__Pseudomonas;__
88  D_0__Bacteria;D_1__Proteobacteria;D_2__Gammaproteobacteria;__;__;__;__
89  D_0__Bacteria;D_1__Proteobacteria;__;__;__;__;__
90  D_0__Bacteria;D_1__Spirochaetes;D_2__Spirochaetia;D_3__Spirochaetales;D_4__Spirochaetaceae;D_5__Treponema 2;__
91  D_0__Bacteria;D_1__Synergistetes;D_2__Synergistia;D_3__Synergistales;D_4__Synergistaceae;__;__
92  D_0__Bacteria;D_1__Tenericutes;D_2__Mollicutes;D_3__Izimaplasmatales;D_4__uncultured organism;D_5__;D_6__
93  D_0__Bacteria;D_1__Tenericutes;D_2__Mollicutes;D_3__Izimaplasmatales;__;__;__
94  D_0__Bacteria;D_1__Tenericutes;D_2__Mollicutes;D_3__Mollicutes RF39;D_4__gut metagenome;D_5__;D_6__
95  D_0__Bacteria;D_1__Tenericutes;D_2__Mollicutes;__;__;__;__
96  D_0__Bacteria;D_1__Verrucomicrobia;D_2__Verrucomicrobiae;D_3__Opitutales;__;__;__
97  D_0__Bacteria;D_1__Verrucomicrobia;D_2__Verrucomicrobiae;D_3__Verrucomicrobiales;D_4__Akkermansiaceae;D_5__Akkermansia;D_6__uncultured Akkermansia sp.
98  D_0__Bacteria;D_1__Verrucomicrobia;D_2__Verrucomicrobiae;__;__;__;__
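
The classification strings in Appendix 6.2 follow a semicolon-delimited layout in which each level carries a D_n__ prefix and unassigned levels appear as "__" or as a bare prefix. The following is a minimal sketch of how such a string might be split into named ranks. It is not part of the thesis workflow: the function names parse_taxonomy and deepest_rank are hypothetical, and the rank names between phylum (D_1) and species (D_6) are assumed to follow the conventional class, order, family, genus sequence.

```python
# Rank names for the D_0..D_6 prefixes. D_1 = phylum and D_6 = species come
# from the appendix caption; the other names are assumed (conventional ranks).
RANKS = ["domain", "phylum", "class", "order", "family", "genus", "species"]


def parse_taxonomy(label):
    """Split a 'D_0__...;D_1__...' string into a {rank: name} dict.

    Levels left unassigned ('__' or an empty 'D_5__') are mapped to None.
    """
    result = {rank: None for rank in RANKS}
    for i, field in enumerate(label.split(";")):
        if i >= len(RANKS):
            break
        field = field.strip()
        prefix = "D_%d__" % i
        # Keep only the text after the level prefix, if the prefix is present.
        name = field[len(prefix):] if field.startswith(prefix) else ""
        result[RANKS[i]] = name or None
    return result


def deepest_rank(label):
    """Return the name of the lowest rank that carries an assignment."""
    parsed = parse_taxonomy(label)
    deepest = None
    for rank in RANKS:
        if parsed[rank] is not None:
            deepest = rank
    return deepest


if __name__ == "__main__":
    otu6 = ("D_0__Bacteria;D_1__Bacteroidetes;D_2__Bacteroidia;"
            "D_3__Bacteroidales;D_4__Bacteroidaceae;D_5__Bacteroides;__")
    print(parse_taxonomy(otu6)["genus"])  # Bacteroides
    print(deepest_rank(otu6))             # genus
```

Applied to the list above, a helper of this kind would make it straightforward to count how many OTUs were resolved only to phylum or class level and how many reached genus or species.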
