Sensitivity of Ribosomal RNA Character Sampling in the Phylogeny of Rhabditida
Total Page:16
File Type:pdf, Size:1020Kb
Journal of Nematology 47(4):337–355. 2015. Ó The Society of Nematologists 2015. Sensitivity of Ribosomal RNA Character Sampling in the Phylogeny of Rhabditida 1 2 2 OLEKSANDR HOLOVACHOV, LAUREN CAMP, AND STEVEN A. NADLER Abstract: Near-full-length 18S and 28S rRNA gene sequences were obtained for 33 nematode species. Datasets were constructed based on secondary structure and progressive multiple alignments, and clades were compared for phylogenies inferred by Bayesian and maximum likelihood methods. Clade comparisons were also made following removal of ambiguously aligned sites as determined using the program ProAlign. Different alignments of these data produced tree topologies that differed, sometimes markedly, when analyzed by the same inference method. With one exception, the same alignment produced an identical tree topology when analyzed by different methods. Removal of ambiguously aligned sites altered the tree topology and also reduced resolution. Nematode clades were sensitive to differences in multiple alignments, and more than doubling the amount of sequence data by addition of 28S rRNA did not fully mitigate this result. Although some individual clades showed substantially higher support when 28S data were combined with 18S data, the combined analysis yielded no statistically significant increases in the number of clades receiving higher support when compared to the 18S data alone. Secondary structure alignment increased accuracy in positional homology assignment and, when used in combination with paired-site substitution models, these structural hypotheses of characters and improved models of character state change yielded high levels of phylogenetic resolution. Phylogenetic results included strong support for inclusion of Daubaylia potomaca within Cephalobidae, whereas the position of Fescia grossa within Tylenchina varied depending on the alignment, and the relationships among Rhabditidae, Diplogastridae, and Bunonematidae were not resolved. Key words: Bayesian inference, Daubaylia, Fescia, maximum likelihood, molecular phylogenetics, multiple sequence alignment, progressive alignment, Rhabditida, ribosomal RNA, secondary structure. The first molecular phylogeny that included a repre- Ribosomal RNA genes remain the most commonly sentative sample of the phylum Nematoda was based on sequenced and analyzed molecules for phylogenetics of a single gene (18S rRNA) and had 53 terminal taxa nematodes, with studies usually focused on either the (Blaxter et al., 1998). Since then, numerous molecular near-complete 18S rRNA gene or partial sequences of phylogenetic hypotheses for nematodes have been the 28S rRNA gene (e.g., D2-D3 or D1-D2-D3 domains). published. These include many based on one genetic Because these two genes are from the same locus, they locus, such as the 18S rRNA gene from more than 1,200 are non-independent estimators of the same un- species (van Megen et al., 2009) or analysis of 12 derlying gene tree. Differences in topology between protein-coding genes from the mitochondrial genomes phylogenies inferred using 18S and 28S sequences for of up to 100 species (Kang et al., 2009; Park et al., 2011; the same taxa occur, but they are best explained as Sun et al., 2014; Kim et al., 2015). The application of stochastic inference errors for the individual datasets. multilocus data to estimate nematode phylogeny has so Analysis of near-complete 18S or partial 28S sequences far been focused on taxonomically restricted questions do not provide sufficient resolution for certain phylo- (Nadler and Hudspeth, 2000; Nadler et al., 2006a; genetic questions involving nematodes (Smythe and Kiontke et al., 2007, 2011; Subbotin et al., 2008; Mayer Nadler, 2007; Holovachov et al., 2011; Shokoohi et al., et al., 2009). Whether focused on single genes or mul- 2013), but their continued use is justified because of tiple loci, optimal estimates of evolutionary history re- the large database of comparative information available quire consideration of many factors that influence in GenBank and the relative ease of rRNA PCR amplifi- phylogenetic analysis of sequence data. For example, cation and sequencing, even from individual nematodes. methods for producing multiple alignments of ribo- Structural information from rRNA has been fre- somal sequences and decisions to include or exclude quently used to improve alignments of 18S and 28S aligned sites because of ambiguity in positional ho- genes in evolutionary studies of nematodes that address mology can have profound effects on inferred nema- questions at broad taxonomic levels (Aleshin et al., tode relationships (Smythe et al., 2006). 1998; Blaxter et al., 1998; Holterman et al., 2006, 2008a, 2008b; Meldal et al., 2007; Holovachov et al., 2009, 2012, 2013b; van Megen et al., 2009; Bik et al., 2010) or within individual orders (Chilton et al., 2006; Kiontke Received for publication December 30, 2014. 1Department of Zoology, Swedish Museum of Natural History, Box 50007, SE- et al., 2007; Bert et al., 2008; Holterman et al., 2009; 104 05, Stockholm, Sweden. Holovachov et al., 2013a). Some of these studies di- 2Department of Entomology and Nematology, University of California, One Shields Avenue, Davis, CA 95616. vided the alignment into ‘‘stem’’ (paired sites) and This research was partly supported by the following grants: ‘‘Training the ‘‘loop’’ (non-paired sites) partitions with separate sub- next generation of nematode taxonomists: Applying the tools of modern monography across free-living and parasitic Tylenchina’’ (NSF PEET, USA, stitution models for each (Holterman et al., 2006, DEB-0731516) to S.A.N.; ‘‘Taxonomy and distribution of free-living nematodes 2008a, 2008b, 2009; Bert et al., 2008; van Megen et al., of the order Plectida in Sweden’’ (Swedish Taxonomy Initiative, Sweden) to O.H. The authors are grateful to Dr. Andy Vierstraete (Ghent University, 2009; Bik et al., 2010; Holovachov et al., 2012, 2013a, Belgium) for making publications by J. Aerts and A. Hendrickx known to us. E-mail: [email protected]. 2013b), but few used paired-site substitution models This paper was edited by Erik J. Ragsdale. in consideration of compensatory substitutions (Bert 337 338 Journal of Nematology, Volume 47, No. 4, December 2015 et al., 2008; Holovachov et al., 2012, 2013a, 2013b). et al., 2007). Near-complete nuclear 28S (26S/28S or There are also few studies of nematode relationships large subunit) rDNA was amplified and sequenced in above the family level that use nearly full-length 18S three overlapping pieces. Two 28S rRNA regions rep- and 28S rRNA sequences for phylogeny inference (e.g., resenting ~80% of the gene were amplified using three Chilton et al., 2006; Kiontke et al., 2007), and none that PCR reactions. Approximately 1,000 bp of the 28S 59- combine such sequences with paired-site substitution end was amplified using forward PCR primer 391 (59- models for both genes. AGCGGAGGAAAAGAAACTAA) and reverse primer In phylogenetic analyses based on sequence data, 501 (59-TCGGAAGGAACCAGCTACTA). Approximately there are several factors that can influence the out- 1,600 bp of the remaining 28S rRNA was amplified in come, that is, the tree topology and its branch lengths. two overlapping pieces. These two amplicons were nor- These include (i) multiple sequence alignment mally obtained using forward primer 527 (59-CTAAG- (Smythe et al., 2006); (ii) the number of genes in- GAGTGTGTAACAACTCACC) in combination with cluded in the analysis (Rokas and Carroll, 2005); (iii) reverse primer 532 (59-AATGACGAGGCATTTGGC- the length of the sequences used for analysis (Rosenberg TACCTT), and forward primer 537 (59-GATCCG- and Kumar, 2001); (iv) the choice of inference TAACTTCGGGAAAAGGAT) with reverse primer 531 method and associated models (Kelchner and Thomas, (59-CTTCGCAATGATAGGAAGAGCC). If amplification 2006); and (v) selection of model parameter values. using primers 527/531 failed, the alternative forward Herein, we explored factors i, iii, and iv in some detail primer 563 (59-ACCCGAAAGATGGTGATCTAT) was for a group of Rhabditida (Table 1), evaluating if used for PCR in combination with either primer 532 or combined analysis of near-full-length 18S and 28S primer 531. PCR reactions (25 ml) consisted of 0.5 mMof rRNA sequences yields increased clade resolution and each primer, 200 mM deoxynucleoside triphosphates, support over analysis of 18S data alone. In addition, we and MgCl2 ranging from 1.5 to 3.0 mM as needed for evaluated if phylogenetic analyses based on secondary specific and robust amplification. Proofreading polymer- structure alignments, either alone or combined with ase (0.5 units, Finnzymes DNAzyme EXT; MJ Research) paired-site substitution models, are advantageous when was used for PCR; the range of PCR cycling parameters compared to those based on computer-based pro- included denaturation at 948C for 3 min, followed by 35 gressive multiple alignment. Finally, we evaluated the cycles of 948Cfor30sec,508Cto608Cfor30sec,and728C phylogenetic effect of removing characters judged as for 60 to 80 sec, followed by a post-amplification exten- alignment ambiguous from the progressive alignment sion at 728Cfor7min. dataset. PCR products were prepared for direct sequencing as described previously (Nadler et al., 2006b). For in- MATERIALS AND METHODS stances when PCR products could not be directly se- quenced, the amplicons were cloned and sequenced Species selection: When selecting species to be included using methods detailed in the work of Nadler et al. in the analysis, we considered the