Genetic architecture of water-use efficiency
Running Header: Genetic architecture of water-use efficiency

SUPPLEMENTARY MATERIALS

The genetic architecture of local adaptation II: The QTL landscape of water-use efficiency for foxtail pine (Pinus balfouriana Grev. & Balf.)

Andrew J. Eckert1*, Douglas E. Harwood2, Brandon M. Lind2, Erin M. Hobson1, Annette Delfino Mix3, Patricia E. Maloney4, and Christopher J. Friedline1

1Department of Biology, Virginia Commonwealth University, Richmond, VA 23284
2Integrative Life Sciences Program, Virginia Commonwealth University, Richmond, VA 23284
3Institute of Forest Genetics, USDA Pacific Southwest Research Station, Placerville, CA 95667
4Department of Plant Pathology, University of California, Davis, CA 95616
Supplemental Tables

Table S1. Definitions of the 19 bioclimatic variables (Bio) used to construct species distribution models (SDMs). More information is available at: http://www.worldclim.org/bioclim
Bio    Definition                           Variable Type
Bio1   Annual Mean Temperature              Temperature
Bio2   Mean Diurnal Range                   Temperature
Bio3   Isothermality                        Temperature
Bio4   Temperature Seasonality              Temperature
Bio5   Max Temperature of Warmest Month     Temperature
Bio6   Min Temperature of Coldest Month     Temperature
Bio7   Temperature Annual Range             Temperature
Bio8   Mean Temperature of Wettest Quarter  Both
Bio9   Mean Temperature of Driest Quarter   Both
Bio10  Mean Temperature of Warmest Quarter  Temperature
Bio11  Mean Temperature of Coldest Quarter  Temperature
Bio12  Annual Precipitation                 Precipitation
Bio13  Precipitation of Wettest Month       Precipitation
Bio14  Precipitation of Driest Month        Precipitation
Bio15  Precipitation Seasonality            Precipitation
Bio16  Precipitation of Wettest Quarter     Precipitation
Bio17  Precipitation of Driest Quarter      Precipitation
Bio18  Precipitation of Warmest Quarter     Both
Bio19  Precipitation of Coldest Quarter     Both
Table S2. Variable contribution (VC) and permutation importance (PI) scores for each species distribution model (SDM) for each bioclimatic variable (Bio). Values are either means or standard deviations (sd) across the 10 replicated runs. KM = Klamath Mountains; SN = southern Sierra Nevada.
Bio    KM VC (mean)  KM VC (sd)    KM PI (mean)  KM PI (sd)    SN VC (mean)  SN VC (sd)    SN PI (mean)  SN PI (sd)
Bio1    9.4669        0.9973       69.3646        0.0418        1.0681        0.9973        0.0684        0.0418
Bio2    0.6066        1.8227        0.2639       19.2140       41.3070        1.8227       10.9376       19.2140
Bio3    7.7503        1.0824        2.0402        2.3958       25.4156        1.0824        4.1491        2.3958
Bio4    1.5862        0.0272        9.6481        0.4408        0.1172        0.0272        1.6714        0.4408
Bio5    7.2930        2.8700        0.0000        0.1336       20.8623        2.8700        0.0988        0.1336
Bio6    2.3979        0.0787        1.0080        0.4417        0.0818        0.0787        0.1551        0.4417
Bio7    2.0505        0.1575        0.0000        0.0437        0.1638        0.1575        0.0242        0.0437
Bio8    0.0281        0.1454        0.0000       12.7328        0.2716        0.1454       34.3415       12.7328
Bio9    0.4293        0.2292        0.4839        0.6193        0.3272        0.2292        1.5396        0.6193
Bio10   1.4292        1.1211        0.6715        0.0150        3.1607        1.1211        0.0057        0.0150
Bio11   0.3747        0.1462        0.4255        0.1171        0.1865        0.1462        0.0747        0.1171
Bio12   5.8994        0.1988        0.0410        2.3376        0.2082        0.1988        3.7135        2.3376
Bio13   0.0341        0.0606        0.0240        0.4824        0.0924        0.0606        0.5832        0.4824
Bio14   1.3270        0.5418        6.0128        0.5319        2.5582        0.5418        1.3117        0.5319
Bio15   0.3237        0.4655        0.3469       11.9870        2.8783        0.4655       38.7247       11.9870
Bio16  20.8919        0.1043        1.3043        0.0559        0.0557        0.1043        0.0412        0.0559
Bio17  37.7992        0.0660        6.9994        0.6256        0.1596        0.0660        1.0055        0.6256
Bio18   0.3012        0.1371        1.3658        0.1851        0.1777        0.1371        0.3119        0.1851
Bio19   0.0110        0.2424        0.0000        1.2179        0.9081        0.2424        1.2421        1.2179
Table S3. Summary of annotation (InterPro domains) information given by Friedline et al. (2015) for RADtags within 3 cM of the QTL positions in Table 5. Unique annotations are separated by a semicolon.
Trait  LGa  Position (cM)  RADtagsb  P. taedac  Putative genesd  Putative gene annotationse
δ15N  1  0  356  119  7  Uncharacterized protein family FPL; SANT/Myb domain|Myb-like domain; Winged helix-turn-helix DNA-binding domain|Peptidase M24, structural domain|Peptidase M24A, methionine aminopeptidase, subfamily 2; Glycoside hydrolase, family 3, N-terminal|Glycoside hydrolase, superfamily|Glycoside hydrolase family 3 C-terminal domain; GDP-fucose protein O-fucosyltransferase; SS18 family|Crotonase superfamily; DNA mismatch repair protein MutS, clamp|DNA mismatch repair protein MutS, C-terminal|DNA mismatch repair protein MutS-like, N-terminal|DNA mismatch repair protein MutS, core|DNA mismatch repair protein MutS, connector domain
δ15N  1  79  35  14  2  Pentatricopeptide repeat; UDP-glucose 4-epimerase GalE|NAD(P)-binding domain
δ13C  1  13  144  40  3  Leucine-rich repeat|Leucine-rich repeat, typical subtype|Serine/threonine-/dual specificity protein kinase, catalytic domain|Concanavalin A-like lectin/glucanase, subgroup|Protein kinase, ATP binding site|Tyrosine-protein kinase, catalytic domain|Protein kinase domain|Protein kinase-like domain|Serine-threonine/tyrosine-protein kinase catalytic domain|Serine/threonine-protein kinase, active site|Leucine-rich repeat-containing N-terminal, type 2; Zinc finger, C3HC-like; RWP-RK domain|Phox/Bem1p
δ13C  1  98  76  24  0  NA
δ13C  2  66  52  18  2  Class II glutamine amidotransferase domain|Alpha-helical ferredoxin|Glutamate synthase, central-N|Glutamate synthase, alpha subunit, C-terminal|Aldolase-type TIM barrel|Glutamine amidotransferase type 2 domain|Glutamate synthase, central-C|FAD-dependent pyridine nucleotide-disulphide oxidoreductase|Glutamate synthase, NADH/NADPH, small subunit 1; Protein kinase, ATP binding site|Serine/threonine-protein kinase, active site|Tyrosine-protein kinase, catalytic domain|Serine/threonine-/dual specificity protein kinase, catalytic domain|Concanavalin A-like lectin/glucanase, subgroup|Protein kinase domain|Protein kinase-like domain
δ13C  2  77  102  30  6  Leucine-rich repeat, cysteine-containing subtype; Amino acid transporter, transmembrane; Ribosomal protein L4/L1e|Ribosomal protein L4/L1e, bacterial-type; NAD(P)-binding domain|Multi antimicrobial extrusion protein; Tetratricopeptide-like helical; Pseudouridine synthase, catalytic domain|Pseudouridine synthase I, TruA|Pseudouridine synthase I, TruA, C-terminal
δ13C  3  14  113  37  2  Ribosome maturation protein SBDS|Ribosome maturation protein SBDS, conserved site|Ribosome maturation protein SBDS, C-terminal|Ribosome maturation protein SBDS, N-terminal; Leucine-rich repeat
δ13C  3  34  62  22  3  DNA topoisomerase I|DNA breaking-rejoining enzyme, catalytic core|DNA topoisomerase I, DNA binding, eukaryotic-type|DNA topoisomerase I, DNA binding, mixed alpha/beta motif, eukaryotic-type|DNA topoisomerase I, catalytic core, eukaryotic-type|DNA topoisomerase I, catalytic core, alpha/beta subdomain|DNA topoisomerase I, active site|DNA topoisomerase I, domain 1|DNA topoisomerase I, eukaryotic-type|DNA topoisomerase I, catalytic core, alpha-helical subdomain, eukaryotic-type; Haem peroxidase, plant/fungal/bacterial|Haem peroxidase|Peroxidases heam-ligand binding site; Iron hydrogenase, large subunit, C-terminal|Iron hydrogenase
δ15N  3  35  70  25  4  DNA topoisomerase I|DNA breaking-rejoining enzyme, catalytic core|DNA topoisomerase I, DNA binding, eukaryotic-type|DNA topoisomerase I, DNA binding, mixed alpha/beta motif, eukaryotic-type|DNA topoisomerase I, catalytic core, eukaryotic-type|DNA topoisomerase I, catalytic core, alpha/beta subdomain|DNA topoisomerase I, active site|DNA topoisomerase I, domain 1|DNA topoisomerase I, eukaryotic-type|DNA topoisomerase I, catalytic core, alpha-helical subdomain, eukaryotic-type; Haem peroxidase, plant/fungal/bacterial|Haem peroxidase|Peroxidases heam-ligand binding site; Iron hydrogenase, large subunit, C-terminal|Iron hydrogenase; Mediator complex, subunit Med14
δ15N  3  52  101  29  6  RNA polymerase II, heptapeptide repeat, eukaryotic|RNA polymerase Rpb1, domain 5; PUB domain|Zinc finger, C2H2|UBX|PUG domain; Aldehyde dehydrogenase domain|Aldehyde dehydrogenase NAD(P)-dependent|Aldehyde/histidinol dehydrogenase|Aldehyde dehydrogenase, C-terminal; RNA recognition motif domain|NAD(P)-binding domain|Nucleotide-binding, alpha-beta plait; Pentatricopeptide repeat|Tetratricopeptide-like helical; Cation/H+ exchanger|Cation/H+ exchanger
δ13C  5  64  42  13  2  Molybdopterin biosynthesis MoaE|Molybdopterin biosynthesis MoaE|Molybdopterin biosynthesis MoaE|Molybdopterin biosynthesis MoaE; Lipase, class 3
δ13C  6  46  91  30  4  Concanavalin A-like lectin/glucanases superfamily|Serine/threonine-/dual specificity protein kinase, catalytic domain|Legume lectin, beta chain, Mn/Ca-binding site|Legume lectin domain|Concanavalin A-like lectin/glucanase, subgroup|Tyrosine-protein kinase, catalytic domain|Protein kinase domain|Protein kinase-like domain|Serine/threonine-protein kinase, active site; PAR1; Transcription factor CBF/NF-Y/archaeal histone|Histone-fold; G10 protein|BUD31/G10-related, conserved site
δ13C  6  56  54  22  2  EF-hand domain pair|Endonuclease/exonuclease/phosphatase|EF-Hand 1, calcium-binding site; Ubiquitin-like 1 activating enzyme, catalytic cysteine domain|NAD(P)-binding domain|Molybdenum cofactor biosynthesis, MoeB
δ15N  7  62  87  34  3  HAD-like domain; Cytochrome P450, E-class, group I|Cytochrome P450, conserved site|Cytochrome P450; Protein kinase, ATP binding site|Serine/threonine-protein kinase, active site|Tyrosine-protein kinase, catalytic domain|Serine/threonine-/dual specificity protein kinase, catalytic domain|Concanavalin A-like lectin/glucanase, subgroup|Protein kinase domain|Protein kinase-like domain
δ15N  7  80  68  23  3  Protein kinase domain|Protein kinase-like domain|Serine-threonine/tyrosine-protein kinase catalytic domain; Zinc finger, RING-type|Zinc finger, C3HC4 RING-type|Zinc finger, RING/FYVE/PHD-type; Glycoside hydrolase, family 85
δ15N  8  68  57  22  0  NA
δ15N  8  71  80  28  0  NA
δ15N  9  64  74  29  2  Ribosomal protein L38e|Cytochrome P450; Thiolase-like|Thiolase, N-terminal|Thiolase
δ15N  9  95  238  81  9  Citrate synthase, type II|Citrate synthase active site|Citrate synthase-like|Citrate synthase-like, large alpha subdomain|Citrate synthase-like, small alpha subdomain|Citrate synthase-like, core; G-patch domain; Pollen Ole e 1 allergen/extensin; Protein kinase, ATP binding site|Serine/threonine-protein kinase, active site|Tyrosine-protein kinase, catalytic domain|Serine/threonine-/dual specificity protein kinase, catalytic domain|Protein kinase domain|Protein kinase-like domain; ARID/BRIGHT DNA-binding domain|High mobility group box domain; Homeodomain-like|SANT/Myb domain|Myb domain; Lipase, class 3; NAD(P)-binding domain; RNA recognition motif domain|Nucleotide-binding, alpha-beta plait|Nuclear transport factor 2|Nuclear transport factor 2, eukaryote
δ13C  12  23  172  58  4  Protein of unknown function DUF247, plant; Myc-type, basic helix-loop-helix (bHLH) domain; UDP-glucuronosyl/UDP-glucosyltransferase|UDP-glucuronosyl/UDP-glucosyltransferase; Serine/threonine-protein kinase, active site|Tyrosine-protein kinase, catalytic domain|Serine/threonine-/dual specificity protein kinase, catalytic domain|Protein kinase domain|Protein kinase-like domain
δ13C  12  43  28  6  2  Alpha/beta hydrolase fold-1; ATPase, AAA-type, core|DNA polymerase III, gamma subunit, domain III|DNA polymerase III, subunit gamma/ tau
Supplemental Figures
Figure S1. The geographical distribution for foxtail pine as defined by Little (1971). The GIS shapefile used to create this figure was obtained from the USGS Geosciences and Environmental Change Science Center (http://esp.cr.usgs.gov/data/little/).
Figure S2. Number of SNPs with quality > 20 as a function of the percent of samples missing a genotype call for that SNP.
Figure S3. Variant calling statistics for n = 11943 biallelic SNPs called for n = 182 individuals across five families. QUAL is the Phred-scaled quality score for calling the alternate allele (filtered with –minQ 20), MQ is the RMS mapping quality, and DP is the combined depth across samples.
Figure S4. Mean receiver operating characteristic (ROC) curves for each species distribution model (SDM) for each regional population (top = Klamath Mountains, bottom = southern Sierra Nevada).
Figure S5. Differences between the ecological niches of regional populations of foxtail pine are revealed using a biplot from a principal components analysis (PCA) based on 19 bioclimatic variables (BC) extracted from GIS layers for the 209 observed occurrences (n = 65 for the Klamath Mountains, n = 144 for the southern Sierra Nevada) and 10,000 randomly sampled background points (n = 5,000 points/regional population). Illustrated are the first two principal components (PCs), which explained 84.26% of the variance. Observed occurrences for foxtail pine in the Klamath Mountains are shown as filled blue circles, whereas observed occurrences in the southern Sierra Nevada are shown as filled red circles. Average values for the first two PCs are plotted as triangles for each regional population (blue = Klamath Mountains, red = southern Sierra Nevada). Background points for the Klamath Mountains are given as filled salmon circles, whereas background points for the southern Sierra Nevada are given as filled green circles. Vectors for each bioclimatic variable are colored by whether it is related to temperature (orange), precipitation (blue), or both temperature and precipitation (purple). Shown above and to the right of the biplot are loadings for 19 bioclimatic variables on the first (left to right: BC1 to BC19) and second (top to bottom: BC1 to BC19) PCs, respectively. Colors in the bar plots have the same meaning as colors for the vectors in the biplot.
Figure S6. Regional means for 19 bioclimatic variables (BC) based on occurrences of foxtail pine illustrate differences between climates of the Klamath Mountains and southern Sierra Nevada. Values were centered and standardized across the entire dataset (occurrences plus background points); thus, a value of zero is the global mean, from which values deviate in units of global standard deviations. Values plotted on the left for each bioclimatic variable (lighter colors) are for the Klamath Mountains, while values on the right (darker colors) are for the southern Sierra Nevada. Vertical lines give the 95% confidence interval for the mean (± 1.96 × standard error of the mean). Bioclimatic variables are colored based on whether they are temperature-related (orange, brown), precipitation-related (blue, dark blue), or related to both temperature and precipitation (purple, dark purple).
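The standardization and confidence interval described in the Figure S6 caption can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names and toy values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the caption does not state which estimator was used.

```python
import math
import statistics


def standardize(values, reference):
    """Center and scale `values` by the mean and sd of `reference` (the global dataset)."""
    mu = statistics.mean(reference)
    sd = statistics.stdev(reference)  # assumption: sample sd (n - 1 denominator)
    return [(v - mu) / sd for v in values]


def mean_ci95(values):
    """Return (mean, lower, upper) using mean +/- 1.96 x standard error of the mean."""
    m = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return m, m - 1.96 * sem, m + 1.96 * sem


# Hypothetical toy data: one bioclimatic variable for occurrences plus background points.
global_values = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
regional_values = [8.0, 10.0, 12.0]  # occurrences from one regional population

z = standardize(regional_values, global_values)
mean_z, lower, upper = mean_ci95(z)
```

A regional mean whose interval excludes zero differs from the global mean on this standardized scale, which is how the vertical lines in the figure can be read.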
Figure S7. Correlations of bioclimatic variables (bio) for localities of foxtail pine in the Klamath Mountains (upper diagonal) and the southern Sierra Nevada (lower diagonal) range from strongly negative to strongly positive. Values within cells are Pearson correlation coefficients (r) based on bioclimatic variables extracted from WorldClim GIS layers for the localities where foxtail pine was known to occur in each regional population (n = 65 for Klamath Mountains, n = 144 for the southern Sierra Nevada). Cell colors are proportional to the value of r, with dark red being r = -1.0 and white being r = 1.0.
Figure S8. The relationship between permutation importance (PI) scores for bioclimatic variables (Bio) derived from each species distribution model (SDM). Values are not significantly correlated between SDMs at α = 0.05 (Spearman’s ρ = -0.089, P = 0.7171). For symbols without apparent error bars, the diameter of the circle was greater than the standard deviation. Values > 10 are labeled for each SDM.
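The rank correlation reported in the Figure S8 caption can be illustrated with a small sketch. Spearman's ρ is computed here as Pearson's r on average ranks, which is the standard definition; the function names and the toy scores are hypothetical and are not taken from the analysis itself, and no P-value is computed.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the average ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical permutation importance scores for five variables in two SDMs.
pi_model_a = [69.4, 0.3, 2.0, 9.6, 0.0]
pi_model_b = [0.1, 10.9, 4.1, 1.7, 0.1]
rho = spearman_rho(pi_model_a, pi_model_b)
```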
Figure S9. Jackknife estimates of variable importance based on AUC for bioclimatic variables derived from the species distribution model (SDM) for the Klamath Mountains population of foxtail pine.
Figure S10. Jackknife estimates of variable importance based on test gain for bioclimatic variables (bio) derived from the species distribution model (SDM) for the Klamath Mountains population of foxtail pine.
Figure S11. Jackknife estimates of variable importance based on regularized test gain for bioclimatic variables (bio) derived from the species distribution model (SDM) for the Klamath Mountains population of foxtail pine.
Figure S12. Jackknife estimates of variable importance based on AUC for bioclimatic variables derived from the species distribution model (SDM) for the southern Sierra Nevada population of foxtail pine.
Figure S13. Jackknife estimates of variable importance based on test gain for bioclimatic variables derived from the species distribution model (SDM) for the southern Sierra Nevada population of foxtail pine.
Figure S14. Jackknife estimates of variable importance based on regularized test gain for bioclimatic variables derived from the species distribution model (SDM) for the southern Sierra Nevada population of foxtail pine.
Figure S15. Summaries of niche divergence for foxtail pine reveal statistically significant divergence. (A) Observed localities of each regional population (red = southern Sierra Nevada; blue = Klamath Mountains) relative to 5,000 background points (green = southern Sierra Nevada; salmon = Klamath Mountains) for two bioclimatic variables (Bio18 on x-axis and Bio19 on y-axis). (B) Null distributions (n = 100 permutations) for the D and I statistics relative to the observed values (red lines) reveal that the null hypothesis that the two SDMs are no more differentiated than those randomly drawn from a common SDM with non-overlapping geographical distributions for each regional population is not well supported. (C – F) Null distributions (n = 100 permutations) of the D (panels C and D) and I (panels E and F) statistics relative to observed values (red lines) for the southern Sierra Nevada relative to the Klamath Mountains background (panels C and E) and the Klamath Mountains relative to the southern Sierra Nevada (panels D and F) reveal that observed values are not predicted well by the null hypothesis.
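The overlap statistics and permutation tests named in the Figure S15 caption can be sketched as below. The caption does not define D and I, so this assumes that D is Schoener's D and that I is the Hellinger-distance-based similarity statistic commonly paired with it in niche-overlap analyses. The function names, the toy suitability scores, and the one-sided P-value convention (with a +1 correction) are all hypothetical choices for illustration, not the authors' procedure.

```python
import math


def normalize(scores):
    """Scale non-negative suitability scores so that they sum to 1."""
    total = sum(scores)
    return [s / total for s in scores]


def schoener_d(p, q):
    """Schoener's D: 1 - 0.5 * sum(|p_i - q_i|); 0 = no overlap, 1 = identical."""
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def hellinger_i(p, q):
    """I statistic: 1 - 0.5 * sum((sqrt(p_i) - sqrt(q_i))^2); 0 = no overlap, 1 = identical."""
    return 1.0 - 0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))


def permutation_p_value(observed, null_values):
    """Assumed one-sided P-value: share of null overlaps at or below the observed overlap."""
    as_extreme = sum(1 for v in null_values if v <= observed)
    return (as_extreme + 1) / (len(null_values) + 1)


# Hypothetical suitability scores over the same three grid cells for two SDMs.
p = normalize([1.0, 1.0, 2.0])
q = normalize([2.0, 1.0, 1.0])
d_obs = schoener_d(p, q)
i_obs = hellinger_i(p, q)
```

Under this convention, a small P-value means the observed overlap is lower than most permuted overlaps, which matches the caption's reading that the null hypothesis is poorly supported.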
Figure S16. Autocorrelation functions (ACF = Pearson’s r) at various lags reveal moderate levels of spatial autocorrelation for the F-statistic used to test for the presence of QTLs along linkage groups for each trait. (A) Autocorrelation of the F-statistic for δ13C. (B) Autocorrelation of the F-statistic for δ15N.
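The autocorrelation function in the Figure S16 caption is stated to be Pearson's r at various lags. One reading of that, sketched below, correlates the series of F-statistics with a copy of itself shifted by the lag. The function names and toy values are hypothetical, and whether the authors used this lagged-pair form or an estimator based on the overall mean and variance is not stated in the caption.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def autocorrelation(series, lag):
    """Correlate the series with a copy of itself shifted by `lag` positions."""
    if lag == 0:
        return 1.0
    return pearson_r(series[:-lag], series[lag:])


# Hypothetical F-statistics at consecutive positions along one linkage group.
f_stats = [1.2, 1.5, 2.1, 2.8, 2.6, 1.9, 1.4, 1.1, 1.3, 1.8]
acf = [autocorrelation(f_stats, lag) for lag in range(0, 4)]
```

Values of `acf` that decay toward zero as the lag grows indicate that nearby positions carry correlated test statistics while distant positions do not.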
Figure S17. The relationship between F-statistics measured for each trait at a resolution of 1-cM windows reveals little to no correlation.
Figure S18. The negative relationship between the correlation in family effects from the two-locus QTL models, as assessed using the t-statistic for each family at each QTL on the same linkage group, and the difference in position of the QTLs on the same linkage group.
Figure S19. Estimates of h² for δ13C in sugar pine are, on average, overestimated when a small number of families (n) is used. Averages are arithmetic means across 100 randomly selected datasets for each value of n. The horizontal red line is the estimate reported in Eckert et al. (2015).