Microsatellite Data Quality Checks
Microsatellite alleles were binned with the R package MsatAllele version 1.02
following a previously described protocol (Alberto 2009). All individuals were then
checked for duplicate genotypes using the R package ALLELEMATCH (Galpern et al.
2012). The probability of sampling an identical twin was extremely low, ranging
from 1.0 × 10⁻²⁰ to 8.0 × 10⁻²⁶, so duplicate genotypes most likely
resulted from repeated sampling of the same individual. All duplicate genotypes were
removed from subsequent analyses. Each microsatellite locus was examined with
MICROCHECKER to check for patterns generated by null alleles, allele scoring error
due to either large allele dropout or stutter, or other natural processes (Van Oosterhout et
al. 2004). Linkage disequilibrium among all loci was tested using GENEPOP (Raymond
and Rousset 1995; Rousset 2008). All loci were used for multivariate analyses because these
statistical methods are not biased by deviations from Hardy-Weinberg equilibrium (HWE)
or linkage disequilibrium (Jombart et al. 2009).
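As a rough illustration of the duplicate-genotype screen described above, the following Python sketch groups samples that share an identical multilocus genotype and keeps one sample per group. It assumes exact matching only; ALLELEMATCH also accommodates genotyping error and missing data, which this sketch does not attempt. The function names and the toy genotypes are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def find_duplicate_genotypes(genotypes):
    """Group sample IDs that share an identical multilocus genotype.

    genotypes: dict mapping sample ID -> tuple of (allele1, allele2) pairs,
    one pair per locus. Alleles within a locus are sorted so that
    (150, 154) and (154, 150) count as the same genotype.
    """
    groups = defaultdict(list)
    for sample_id, loci in genotypes.items():
        key = tuple(tuple(sorted(pair)) for pair in loci)
        groups[key].append(sample_id)
    # Only groups with more than one member are duplicates.
    return [ids for ids in groups.values() if len(ids) > 1]


def drop_duplicates(genotypes):
    """Keep the first sample of each duplicate group and discard the rest."""
    to_drop = {
        sample_id
        for ids in find_duplicate_genotypes(genotypes)
        for sample_id in ids[1:]
    }
    return {sid: g for sid, g in genotypes.items() if sid not in to_drop}


# Hypothetical toy data: three samples scored at two loci.
samples = {
    "A01": ((150, 154), (201, 205)),
    "A02": ((154, 150), (205, 201)),  # same genotype as A01, alleles reordered
    "B01": ((150, 150), (201, 209)),
}

duplicates = find_duplicate_genotypes(samples)
unique_samples = drop_duplicates(samples)
```

Sorting the two alleles within each locus before comparison avoids treating a reordered heterozygote as a different genotype.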
Table S1 Connectivity Modeling System parameterization

Module      Parameter                               Input
Physical    Ocean circulation, nest 1               HyCOM Global + NCODA assimilated
Physical    Ocean circulation, nest 2               HyCOM Gulf of Mexico expt 30.1
Physical    Ocean circulation, nest 3               ROMS Bahamas (Chérubin 2014)
Physical    Ocean circulation, nest 4               HyCOM FLK (Kourafalou and Kang 2012)
Physical    Horizontal diffusivity, nest 1          15 m² s⁻¹
Physical    Horizontal diffusivity, nests 2, 3, 4   8 m² s⁻¹
Physical    Vertical diffusivity, nests 1, 2, 3, 4  0.05 m² s⁻¹
Physical    Timestep                                45 min
Physical    Maximum tracking time                   196 days
Physical    Simulation timespan                     Jan 2004 through Sept 2009
Biological  Competence                              152 days
Biological  Habitat site size                       ~50 × 35 km
Biological  Habitat site numbers                    261 reef polygons
Biological  Mortality                               ~90% of larvae
Biological  Ontogenetic vertical migration          Enabled (Butler et al. 2011)
Biological  Release pattern                         Seasonal (Kough et al. 2013)
Biological  Release magnitude                       10,314,220 larvae annually in Caribbean, varying with habitat (Kough et al. 2013)
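To put the timestep and maximum tracking time from Table S1 on a common scale, the short Python sketch below converts the tracking time into a number of integration steps. Only the two numeric values come from the table; the variable and function names are illustrative, and the sketch makes no claim about how the Connectivity Modeling System handles time internally.

```python
# Values from Table S1.
TIMESTEP_MIN = 45          # integration timestep, minutes
MAX_TRACKING_DAYS = 196    # maximum tracking time, days

MINUTES_PER_DAY = 24 * 60
STEPS_PER_DAY = MINUTES_PER_DAY // TIMESTEP_MIN


def steps_for(days, timestep_min=TIMESTEP_MIN):
    """Number of whole timesteps needed to cover the given number of days."""
    total_minutes = days * MINUTES_PER_DAY
    if total_minutes % timestep_min != 0:
        raise ValueError("duration is not a whole number of timesteps")
    return total_minutes // timestep_min


max_tracking_steps = steps_for(MAX_TRACKING_DAYS)
```

With a 45 min timestep there are 32 steps per day, so 196 days of tracking corresponds to 6272 steps per particle.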
Table S2 Sample location GPS coordinates with levels of larval local retention (calculated with
the Panulirus argus biophysical model; 0 = no local retention, 1 = 100% local retention)

Country         Site name        Local retention  Latitude   Longitude
Bermuda         Bermuda          *                32.350715  -64.781607
Panama          San Blas         0                9.489993   -78.810006
Cayman Islands  Grand Cayman     0                19.335664  -81.386871
Puerto Rico     Puerto Rico      0.02             17.966078  -67.244514
Belize          Glover's Reef    0.04             16.788778  -87.774517
Belize          Caye Caulker     0.26             17.746381  -87.999082
Nicaragua       Corn Islands     0.98             12.501816  -82.85405
Belize          Sapodilla Cayes  0.9              16.163048  -88.330863
Bahamas         Andros Island    0.82             24.618112  -77.707897
Venezuela       Los Roques       0.72             11.776538  -66.580912
Microsatellite locus characteristics and conformity to HWE
The observed heterozygosity (HO) and number of alleles per locus have been described previously for all the microsatellite loci used (Diniz et al. 2005, 2007; Tringali et al. 2008). Six loci (Par1, Par2, Par7, Par9, fwc04, and argus2) showed heterozygote deficiencies and statistically significant deviations from HWE after corrections for multiple comparisons. Analysis with MICROCHECKER suggested the presence of null alleles in the six loci that deviated from HWE (Table S3). There was no evidence of scoring errors due to stutter or the dropout of large alleles. The six loci with null alleles were removed from further FST-based analyses because of their potential to bias estimates of genetic differentiation, leaving a final dataset of 13 loci. Once these loci were removed, Puerto Rico was the only location that consistently deviated from HWE (fwc05, fwc07, fwc08, fwc17, fwc18, argus2). No evidence of linkage disequilibrium was observed among any combinations of loci.
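The heterozygote deficiencies reported above can be illustrated by comparing observed heterozygosity with the heterozygosity expected under HWE at a single locus. The Python sketch below is a minimal example that assumes diploid genotypes and uses the uncorrected expected heterozygosity (no small-sample correction); it is not the exact test implemented in GENEPOP or the null-allele procedure in MICROCHECKER. The function name and the toy genotypes are hypothetical.

```python
from collections import Counter


def heterozygosity_summary(genotypes):
    """Return (H_obs, H_exp, F_IS) for one locus.

    genotypes: list of (allele1, allele2) tuples, one per individual.
    H_exp is 1 minus the sum of squared allele frequencies, without any
    small-sample correction. F_IS = 1 - H_obs / H_exp; positive values
    indicate a deficit of heterozygotes.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes supplied")
    h_obs = sum(a != b for a, b in genotypes) / n
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total_alleles = 2 * n
    h_exp = 1.0 - sum((c / total_alleles) ** 2 for c in allele_counts.values())
    f_is = 1.0 - h_obs / h_exp if h_exp > 0 else float("nan")
    return h_obs, h_exp, f_is


# Hypothetical locus with two equally frequent alleles but few heterozygotes.
toy_locus = [(150, 150)] * 4 + [(154, 154)] * 4 + [(150, 154)] * 2
h_obs, h_exp, f_is = heterozygosity_summary(toy_locus)
```

In this toy locus both alleles have frequency 0.5, so half of the individuals are expected to be heterozygous, but only 2 of 10 are; the positive F_IS reflects that deficit, which is the kind of pattern a null allele can produce.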
Table S3 Departures from Hardy-Weinberg Equilibrium (HWE). P-values for each combination of sampling location and locus.
Significant departures from HWE after the sequential goodness-of-fit correction for multiple tests are shown in bold (P < 0.008). The
suggested presence of null alleles after analysis with MICROCHECKER is indicated by *. Loci shaded in grey were excluded from FST and
Jost's D analyses of genetic differentiation because the majority of sites deviated from HWE or potentially contained null alleles

Site         Par1    Par2    Par3    Par4    Par6    Par7    Par9    fwc04   fwc05   fwc07   fwc08   fwc14a  fwc14b  fwc17   fwc18   argus2  argus5  Unplaced *
Nicaragua    0.070   0.000*  0.433   0.302   0.148   0.000*  0.000   0.000   0.168   0.224   0.281   0.114   0.363   0.065   0.155   0.000   0.294   * * *
Bermuda      0.020   0.000*  0.258   0.378   0.232   0.000*  0.000   0.001   0.194   0.536   0.036   0.139   0.282   0.118   0.194   0.000   0.092   * * *
Glover's     0.003*  0.236   0.373   0.424   0.398   0.000*  0.000   0.001   0.248   0.722   0.010   0.473   0.233   0.570   0.412   0.000   0.059   * * *
Venezuela    0.000*  0.000*  0.274   0.325   0.153   0.000*  0.000   0.000   0.182   0.163   0.000   0.326   0.261   0.016   0.169   0.000   0.460   * * * *
Puerto Rico  0.000*  0.000*  0.578   0.461   0.097   0.000*  0.000   0.000   0.000   0.000*  0.000   0.000*  0.100   0.000*  0.000*  0.000   0.011   * * * * *
Panama       0.022   0.000*  0.416   0.051   0.438   0.000*  0.011   0.219   0.560   0.421   0.576   0.363   0.519   0.067   0.000*  0.000   0.017   *
Cayman       0.000*  0.000*  0.264   0.353   0.189   0.000*  0.000   0.000   0.029   0.019   0.001   0.173   0.009   0.017   0.093   0.000   0.257   * * *
Andros       0.000*  0.025   0.473   0.083   0.280   0.000*  0.001   0.017   0.293   0.097   0.001   0.457   0.363   0.274   0.010   0.001   0.076   * * *
Sapodilla    0.000*  0.000*  0.344   0.094   0.328   0.000*  0.000   0.002   0.396   0.136   0.041   0.407   0.532   0.103   0.495   0.000   0.306   * *
Caulker      0.000*  0.026   0.111   0.495   0.048   0.000*  0.152   0.001   0.168   0.115   0.001   0.551   0.544   0.007   0.233   0.000   0.356   * *

Unplaced *: null-allele markers that the source layout separates from their values; each belongs to one of the loci to the right of Par4 in that row.

Table S4 Summary statistics including the country, location, local oceanographic environment, levels of local retention (calculated
with the Panulirus argus biophysical model; 0 = no local retention, 1 = 100% local retention), number of samples (NS), observed
heterozygosity (HO), total expected heterozygosity (HT), allelic richness (AR) calculated with rarefaction, inbreeding coefficient (GIS),
and proportion of individuals assigned to each of three clusters in the population genetics software STRUCTURE. The standard deviation across
populations was 0.05 for HO and 0.04 for HT
Country         Site name        NS  HO    AR     GIS   Cluster 1  Cluster 2  Cluster 3
Bermuda         Bermuda          75  0.75  13.18  0.01  0.96       0.03       0.01
Panama          San Blas         41  0.70  12.42  0.07  0.30       0.65       0.04
Cayman Islands  Grand Cayman     87  0.71  13.48  0.06  0.92       0.07       0.02
Puerto Rico     Puerto Rico      38  0.60  12.81  0.21  0.74       0.25       0.01
Belize          Glover's Reef    33  0.73  12.91  0.01  0.86       0.12       0.02
Belize          Caye Caulker     56  0.75  13.21  0.01  0.93       0.07       0.01
Nicaragua       Corn Islands     81  0.75  13.20  0.01  0.98       0.02       0.00
Belize          Sapodilla Cayes  60  0.75  13.42  0.03  0.67       0.26       0.07
Bahamas         Andros Island    36  0.73  13.76  0.04  0.87       0.03       0.09
Venezuela       Los Roques       74  0.71  13.49  0.04  0.93       0.07       0.00

* Biophysical modeling data is not available for Bermuda

Table S5 Pairwise comparisons of genetic differentiation among sampling sites. Pairwise FST values are located below the diagonal
and pairwise Hedrick's G'ST values are located above the diagonal. Values marked in bold were significant after using a sequential
goodness-of-fit correction for multiple tests

             Nicaragua  Bermuda  Glover's  Venezuela  Puerto Rico  Panama  Cayman   Andros   Sapodilla  Caulker
Nicaragua    -          0.0030   -0.0067   0.0001     0.0054       0.0230  -0.0029  0.0136   0.0045     0.0088
Bermuda      0.0036     -        -0.0054   -0.0021    0.0140       0.0209  -0.0057  0.0047   0.0047     -0.0048
Glover's     0.0037     0.0041   -         -0.0066    0.0218       0.0107  -0.0027  0.0098   0.0041     -0.0009
Venezuela    0.0034     0.0032   0.0041    -          0.0252       0.0140  -0.0014  0.0152   0.0067     0.0045
Puerto Rico  0.0053     0.0065   0.0106    0.0078     -            0.0248  0.0048   0.0273   0.0157     0.0160
Panama       0.0065     0.0066   0.0083    0.0061     0.0096       -       0.0235   0.0658   0.0220     0.0312
Cayman       0.0027     0.0025   0.0041    0.0031     0.0049       0.0061  -        -0.0012  0.0001     0.0033
Andros       0.0059     0.0053   0.0088    0.0065     0.0114       0.0149  0.0045   -        0.0037     0.0060
Sapodilla    0.0041     0.0043   0.0058    0.0046     0.0076       0.0073  0.0035   0.0060   -          0.0109
Caulker      0.0048     0.0033   0.0056    0.0045     0.0079       0.0088  0.0040   0.0065   0.0057     -
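For readers unfamiliar with the standardized statistic in Table S5, the Python sketch below shows how Hedrick's G'ST rescales GST by the maximum value GST can reach given the within-population heterozygosity and the number of populations. It is a minimal sketch that assumes HS and HT are already known and omits the sample-size corrections that population genetics software typically applies, so it will not reproduce the tabulated values. The function names and example inputs are hypothetical.

```python
def gst(h_s, h_t):
    """Nei's G_ST from mean within-population (h_s) and total (h_t) heterozygosity."""
    if h_t <= 0:
        raise ValueError("h_t must be positive")
    return (h_t - h_s) / h_t


def hedrick_gst(h_s, h_t, k):
    """Hedrick's standardized G'_ST for k populations.

    G'_ST = G_ST / G_ST(max), where
    G_ST(max) = (k - 1) * (1 - h_s) / (k - 1 + h_s).
    """
    if k < 2:
        raise ValueError("need at least two populations")
    g_max = (k - 1) * (1.0 - h_s) / (k - 1 + h_s)
    return gst(h_s, h_t) / g_max


# Hypothetical pairwise comparison (k = 2) with high within-site diversity.
example = hedrick_gst(h_s=0.70, h_t=0.71, k=2)

# Two populations that share no alleles, each with h_s = 0.5, give h_t = 0.75;
# the standardized statistic then reaches its maximum of 1.
fully_differentiated = hedrick_gst(h_s=0.5, h_t=0.75, k=2)
```

The second example shows why the standardization matters for highly polymorphic microsatellites: unstandardized GST is only 1/3 even when the two populations share no alleles.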
References
Alberto F (2009) MsatAllele_1.0: an R package to visualize the binning of microsatellite alleles. J Hered 100:394–397
Butler IV MJ, Paris CB, Goldstein JS, Matsuda H, Cowen RK (2011) Behavior constrains the dispersal of long-lived spiny lobster larvae. Mar Ecol Prog Ser 422:223–237
Chérubin LM (2014) High-resolution simulation of the circulation in the Bahamas and Turks and Caicos Archipelagos. Prog Oceanogr 127:21–46
Diniz FM, Maclean N, Ogawa M, Paterson IG, Bentzen P (2005) Microsatellites in the overexploited spiny lobster, Panulirus argus: isolation, characterization of loci and potential for intraspecific variability studies. Conserv Genet 6:637–641
Diniz FM, Iyengar A, Lima PSDC, Maclean N, Bentzen P (2007) Application of a double-enrichment procedure for microsatellite isolation and the use of tailed primers for high throughput genotyping. Genet Mol Biol 30:380–384
Galpern P, Manseau M, Hettinga P, Smith K, Wilson P (2012) Allelematch: an R package for identifying unique multilocus genotypes where genotyping error and missing data may be present. Mol Ecol Resour 12:771–778
Jombart T, Pontier D, Dufour AB (2009) Genetic markers in the playground of multivariate analysis. Heredity 102:330–341
Kough AS, Paris CB, Butler IV MJ (2013) Larval connectivity and the international management of fisheries. PLoS One 8:e64970
Kourafalou VH, Kang H (2012) Florida Current meandering and evolution of cyclonic eddies along the Florida Keys reef tract: are they interconnected? J Geophys Res Oceans 117 [doi:10.1029/2011JC007383]
Raymond M, Rousset F (1995) Genepop (version 1.2): population genetics software for exact tests and ecumenicism. J Hered 86:248–249
Rousset F (2008) Genepop’007: a complete re-implementation of the Genepop software for windows and linux. Mol Ecol Resour 8:103–106
Tringali MD, Seyoum S, Schmitt SL (2008) Ten di- and trinucleotide microsatellite loci in the Caribbean spiny lobster, Panulirus argus, for studies of regional population connectivity. Mol Ecol Resour 8:650–652
Van Oosterhout C, Hutchinson WF, Wills DPM, Shipley P (2004) MICRO-CHECKER: software for identifying and correcting genotyping errors in microsatellite data. Mol Ecol Notes 4:535–538
Figure caption
Fig. S1 Estimation of the optimum number of genetically unique clusters (K) using all alleles
from the microsatellite dataset for Panulirus argus. Bayesian clustering analysis was performed in
the population genetics program STRUCTURE using 10⁵ burn-in iterations followed by 10⁶
Markov chain Monte Carlo iterations. Three independent replicates were carried out for each
K from two through eight. These data were uploaded into the online version of
STRUCTURE Harvester, which identified an optimal K of three
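The choice of K from replicate STRUCTURE runs is commonly made with the ΔK statistic, which compares the second-order change in the log likelihood between successive K values with the spread among replicates. The Python sketch below is a minimal illustration that assumes the second difference is taken on the replicate means and divided by the sample standard deviation at K; it is not the code used by STRUCTURE Harvester, and the log-likelihood values are invented for the example.

```python
from statistics import mean, stdev


def delta_k(log_likelihoods):
    """Compute delta K for every K that has neighbours K - 1 and K + 1.

    log_likelihoods: dict mapping K -> list of ln P(X|K) values, one per
    replicate run (at least two replicates per K).
    Returns a dict mapping K -> delta K.
    """
    result = {}
    for k in sorted(log_likelihoods):
        if k - 1 not in log_likelihoods or k + 1 not in log_likelihoods:
            continue  # delta K is undefined at the ends of the K range
        second_diff = abs(
            mean(log_likelihoods[k + 1])
            - 2 * mean(log_likelihoods[k])
            + mean(log_likelihoods[k - 1])
        )
        spread = stdev(log_likelihoods[k])
        if spread == 0:
            continue  # avoid division by zero when replicates are identical
        result[k] = second_diff / spread
    return result


# Invented values for three replicates at K = 2, 3 and 4.
toy_runs = {
    2: [-1000, -1002, -998],
    3: [-900, -901, -899],
    4: [-890, -895, -885],
}
toy_delta_k = delta_k(toy_runs)
```

Because the statistic needs both neighbours, runs spanning K = 2 through 8 yield ΔK values only for K = 3 through 7.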