Supporting Information
Convergent Evolution of Toxin Resistance

SUPPORTING INFORMATION

Title: Widespread convergence in toxin resistance by predictable molecular evolution

Authors: Beata Ujvari, Nicholas R. Casewell, Kartik Sunagar, Kevin Arbuckle, Wolfgang Wüster, Nathan Lo, Denis O’Meally, Christa Beckmann, Glenn F. King, Evelyne Deplazes and Thomas Madsen

Abstract: The question about whether evolution is unpredictable and stochastic or intermittently constrained along predictable pathways is the subject of a fundamental debate in biology, in which understanding convergent evolution plays a central role. At the molecular level, documented examples of convergence are rare and limited to occurring within specific taxonomic groups. Here we provide evidence of constrained convergent molecular evolution across the metazoan tree of life. We show that resistance to toxic cardiac glycosides produced by plants and bufonid toads is mediated by similar molecular changes to the sodium-potassium-pump (Na+/K+-ATPase) in insects, amphibians, reptiles and mammals. In toad-feeding reptiles, resistance is conferred by two point mutations that have evolved convergently on four occasions, whilst evidence of a molecular reversal back to the susceptible state in varanid lizards migrating to toad-free areas suggests that toxin resistance is maladaptive in the absence of selection. Importantly, resistance in all taxa is mediated by replacements of two of the 12 amino acids comprising the Na+/K+-ATPase H1-H2 extracellular domain that constitutes a core part of the cardiac glycoside binding site. We provide mechanistic insight into the basis of resistance by showing that these alterations perturb the interaction between the cardiac glycoside bufalin and the Na+/K+-ATPase.
Thus, similar selection pressures have resulted in convergent evolution of the same molecular solution across the breadth of the animal kingdom, demonstrating how a scarcity of possible solutions to a selective challenge can lead to highly predictable evolutionary responses.

SI Materials and Methods

Sequence data. Squamate sequence data for the H1-H2 domain (encoding the αM1-αM2 extracellular loop) of the α3 subunit of the Na+/K+-ATPase were generated as previously described (1). Genomic DNA was isolated by phenol-chloroform extraction. Using the primers previously detailed (2), DNA corresponding to the H1-H2 domain was subsequently amplified and sequenced on an ABI 3130xl Genetic Analyzer using a BigDye Terminator Kit v.3.1 (Applied Biosystems). In total, we sampled 43 squamates, including the 18 varanid lizards previously reported (1). The DNA sequence data generated in this study have been submitted to GenBank with the accession numbers KP238131–KP238176. We supplemented these data with sequences generated from ongoing or completed genome sequencing projects for squamates (Burmese python, king cobra, green anole and bearded dragon) and non-squamate outgroups (American alligator, chicken, tuatara). A list of the sequenced taxa and their propensity to feed on bufonid toads is displayed in the SI Appendix, Table S1. Sequence data for the H1-H2 domain of the α subunit of the Na+/K+-ATPase from insects and anurans were obtained from previous studies (3, 4). The mammalian dataset was generated by isolating the previously reported sequence for the rat (5, 6) and undertaking BLAST similarity searches using the nucleotide, genome, EST and SRA databases of NCBI (http://www.ncbi.nlm.nih.gov) to identify homologous genes in related taxa.

Ancestral reconstructions and sequence evolution.
For each group of taxa (squamates, insects, anurans and mammals), separate sequence alignments were generated using the MUSCLE algorithm (7). Species trees of the taxa represented in each dataset were then constructed from previously published studies (see Fig. 1 and SI Appendix, Figs. S4-S6 and S8 for details). These data were then used to reconstruct ancestral sequences at various nodes of the Na+/K+-ATPase phylogenies using the marginal sequence reconstruction method (8), implemented using the ASR algorithm on the Datamonkey web-server (9). The rate of evolution of the Na+/K+-ATPase gene was estimated using the maximum-likelihood model (M8) of the PAML package (10). The influence of episodic adaptive selection was assessed on each dataset using the state-of-the-art ‘mixed effects model of evolution’ (MEME) implemented in HyPhy (11). Coevolving amino acid sites were detected using the Spidermonkey algorithm (12) in HyPhy: Spidermonkey reconstructs the substitution history of the alignment using a maximum likelihood-based phylogenetic approach, followed by assessment of the joint distribution of substitution events using Bayesian graphical models to detect significant evolutionary associations among amino acid positions. Finally, the Fast, Unconstrained Bayesian AppRoximation (FUBAR) method (13) was utilised to detect sites in each dataset evolving under the pervasive influence of selection, whilst the Directional Evolution in Protein Sequences (DEPS) algorithm (14) was used for identifying sites that are the subject of directional evolution.

Phylogenetic analyses. To identify which α subunit of the multi-locus Na+/K+-ATPase gene family was the subject of resistance-conferring amino acid replacements in each animal lineage, we reconstructed the evolutionary history of these genes.
Full-length sequence data were obtained for each α subunit by BLAST similarity searching representative taxonomic groups in the genome and protein databases of NCBI using various α subunit template sequences. The resulting amino acid sequence data were aligned using the MUSCLE algorithm (7) and then checked manually. For gene tree generation, we performed phylogenetic analysis using MrBayes v3.2 (15). First, we selected an appropriate model of evolution favoured by the Akaike Information Criterion using ModelGenerator (16). The selected model (GTR + G) was implemented in a Bayesian inference analysis using MrBayes on the CIPRES Science Gateway (www.phylo.org). The analysis was run in duplicate using four chains simultaneously (three heated and one cold) for 5 × 10^6 generations, sampling every 500th cycle from the chain and using default parameters with regard to priors. Tracer v1.5 (http://beast.bio.ed.ac.uk/tracer) was used to estimate effective sample sizes for all parameters (all were well in excess of the accepted minimum of 200) and to construct plots of ln(L) against generation to verify the point of convergence (burn-in). Trees generated prior to the point of convergence were discarded through a conservative cutoff of the first 25% of samples, and a consensus gene tree was generated from the remaining sampled trees. The resulting gene tree was annotated with details of which α subunit confers resistance in the different animal lineages (see SI Appendix, Fig. S2).

Changes in isoelectric point and charged residues. The isoelectric point and changes in charge of the H1-H2 extracellular domain of the α Na+/K+-ATPase were calculated using the ProtParam tool (http://web.expasy.org/protparam) hosted at the ExPASy Bioinformatics Resource Portal.
Both isoelectric points and the addition/loss of charged amino acid residues were calculated for all sequence data sourced from the susceptible and resistant taxa analysed in each taxonomic group (SI Appendix, Table S2). Statistical comparisons of changes in isoelectric point between resistant and susceptible taxa were performed using an unequal variance two-tailed t-test. We next investigated whether resistance was associated with a shift from neutral to charged amino acids using binomial tests in R v.3.1.0 (17). Binomial tests compare an observed proportion, in this case the proportion of resistance mutations that involved a neutral to charged amino acid shift, to an expected proportion. In effect, this test asks whether shifts to charged amino acids (as we observe) occur more frequently than shifts to other amino acids (that we do not observe). To ensure the robustness of our results, we used three different expected proportions. In the first, we simply used the proportion of all amino acids (excluding the ancestral one) that are charged. In the second, the expected proportion was as before except that amino acid shifts were weighted by the number of codons that code for each amino acid. This is likely to be more realistic as it accounts for the fact that, if mutations were random, it may be easier to shift to a new amino acid that has four possible codons than to one that has only two. In the third, the expected proportion was based on a null model of equal nucleotide-base substitution, such that amino acid shifts were weighted by the number of individual base changes required to shift from one amino acid to another. This is more realistic again as it accounts explicitly for silent mutations in the evolutionary process. Since all three analyses yielded qualitatively identical results (first version, P = 6.86 × 10^-5; second version, P = 1.81 × 10^-10; third version, P = 6.90 × 10^-7), we report only the third, most realistic version in the main text of the paper.
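The first two expected proportions described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the R code used for the analysis. The set of residues treated as charged (D, E, K, R), the one-sided upper-tail form of the test, the ancestral residue and the counts in the usage example are all assumptions made for illustration, and the third (nucleotide-substitution-weighted) expectation is not reproduced here.

```python
from math import comb

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODONS_PER_AA = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}

# Assumption: residues counted as charged (histidine is left out here).
CHARGED = {"D", "E", "K", "R"}


def expected_unweighted(ancestral):
    """Proportion of the other 19 amino acids that are charged."""
    others = [aa for aa in CODONS_PER_AA if aa != ancestral]
    return sum(aa in CHARGED for aa in others) / len(others)


def expected_codon_weighted(ancestral):
    """Same proportion, with each amino acid weighted by its codon count."""
    others = [aa for aa in CODONS_PER_AA if aa != ancestral]
    total = sum(CODONS_PER_AA[aa] for aa in others)
    charged = sum(CODONS_PER_AA[aa] for aa in others if aa in CHARGED)
    return charged / total


def binomial_upper_tail(k, n, p):
    """Exact one-sided P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Hypothetical inputs: a neutral ancestral residue and made-up counts.
    ancestral = "Q"
    n_shifts, n_to_charged = 10, 8
    for label, p0 in [
        ("unweighted", expected_unweighted(ancestral)),
        ("codon-weighted", expected_codon_weighted(ancestral)),
    ]:
        p_value = binomial_upper_tail(n_to_charged, n_shifts, p0)
        print(f"{label}: expected proportion = {p0:.3f}, P = {p_value:.3g}")
```

With a neutral ancestral residue, the unweighted expectation is 4/19 and the codon-weighted expectation is 12/59 under these assumptions. The observed number of shifts to charged residues is then compared against each expectation in turn.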
Molecular modelling. All docking simulations were performed with AutoDock Vina 2.0 (18) using an approach similar to that described by Zhen et al. (19). To establish a docking protocol, we first re-docked the cardiac glycoside bufalin into the crystal structure of bufalin bound to the pig Na+/K+-ATPase (PDB: 4RES) (20). The bufalin ligand