597.Full.Pdf

Copyright © 1993 by the Genetics Society of America
Genetics 134: 597-608 (June, 1993)

Rates and Patterns of Base Change in the Small Subunit Ribosomal RNA Gene

Lisa Vawter¹ and Wesley M. Brown

Molecular Systematics Laboratory and Insect Division, Museum of Zoology, and Department of Biology, University of Michigan, Ann Arbor, Michigan 48109-1079

¹ Current address: Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts 02138.

Manuscript received October 28, 1992
Accepted for publication February 4, 1993

ABSTRACT

The small subunit ribosomal RNA gene (srDNA) has been used extensively for phylogenetic analyses. One common assumption in these analyses is that substitution rates are biased toward transitions. We have developed a simple method for estimating relative rates of base change that does not assume rate constancy and takes into account base composition biases in different structures and taxa. We have applied this method to srDNA sequences from taxa with a noncontroversial phylogeny to measure relative rates of evolution in various structural regions of srRNA and relative rates of the different transitions and transversions. We find that: (1) the long single-stranded regions of the RNA molecule evolve slowest, (2) biases in base composition associated with structure and phylogenetic position exist, and (3) the srDNAs studied lack a consistent transition/transversion bias. We have made suggestions based on these findings for refinement of phylogenetic analyses using srDNA data.

THE accumulation of DNA sequence data from disparate taxa makes study of the nature of DNA evolution possible. Because certain classes of DNA characters tend to change in a related fashion, the nature of change within these classes can be explored statistically, so that we might learn about the evolution of molecules, as well as refine the assumptions for the use of molecular data in phylogenetic analysis.

A transition-transversion rate bias has been observed in primate mitochondrial DNA (BROWN et al. 1982). These authors noted that transitions (C ↔ T and A ↔ G changes) were more common than expected, given random change, and proposed that the mitochondrial DNA bias toward transitions is due to mutation bias. LI, WU and LUO (1985) found a transition bias in nuclear protein-coding genes and pseudo-genes, though the bias is not as pronounced here as in mitochondrial genes. This bias is often discussed theoretically in terms of the silent sites of protein-coding genes (e.g., JUKES and BHUSHAN 1986; JUKES 1987). Because both BROWN et al. and LI et al. grouped all transitions and all transversions in their studies, they did not discuss specifically whether each transition, taken individually, was more common than each individual transversion. Though this does hold true of the BROWN et al. data set, whether it holds for the LI et al. data set is less clear (L. VAWTER, unpublished observations). However, except for those structural RNA genes in the mitochondrial genome (HIXSON and BROWN 1986), a transition/transversion bias has yet to be demonstrated in structural RNA genes.

Despite the lack of evidence bearing upon individual transition/transversion rates in structural RNA genes, phylogenetic analyses that require specific assumptions about them have been undertaken. Though MINDELL and HONEYCUTT (1990) did tally transitions and transversions for srDNAs, they did not calculate rates of change, and used taxa in their study that were too closely related to allow calculation of rates from their tallies because of sample sizes. A transition rate bias and equal transversion rates for different nucleotides have been used as assumptions for phylogenetic analysis of rDNAs using the method of invariants (LAKE 1988, 1989). The transition-transversion bias assumption has also been suggested for cladistic analyses of rDNAs (MISHLER et al. 1988; PATTERSON 1989; MICKEVICH and WELLER 1990). It is unclear how applicable the transition-transversion rate bias assumption is for phylogenetic analysis using non-protein-coding genes, however, including srRNA (OLSEN and WOESE 1989; WOESE 1989).

We demonstrate a simple method to estimate relative rates of change among nucleotides for different structural classes within srDNA. This method has the advantage that it circumvents a constant molecular clock assumption. We then use these relative rates to evaluate the assumptions of a transition-transversion bias and equal transversion probabilities for srDNA.

METHODS AND RESULTS

The data set: Here, we present the set of srDNA sequence characters we used (nucleic acid positions), as well as the methods we used for inferring homology (in the evolutionary sense) among the characters. We used a parsimony method to infer hypothetical ancestral sequences for the evolutionary ancestors (HTUs) of the taxa that bear these srDNA characters. For this study, we used published srDNA sequences from a mouse, Mus musculus (RAYNAL, MICHOT and BACHELLERIE 1984), a rat, Rattus norvegicus (CHAN et al. 1984), a human, Homo sapiens (TORCZYNSKI, FUKE and BOLLON 1985), a frog, Xenopus laevis (SALIM and MADEN 1981), a brine shrimp, Artemia salina (NELLES et al. 1984) and a nematode, Caenorhabditis elegans (ELLIS, SULSTON and COULSON 1986). We chose these taxa because their evolutionary relationships are noncontroversial (Figure 1) (KEMP 1988; MILNER 1988; NOVACEK, WYSS and MCKENNA 1988; WILLMER 1990) and are derived from data sets other than srDNA. By doing this, we avoid the logical circularity that would result if we were to use srDNA sequence to derive an evolutionary network, and then derive conclusions about changes of srDNA sequence from the network derived from those same changes.

We inferred homology among the srDNA nucleic acid characters through alignment of both primary sequence and secondary structure. These alignments were performed as detailed in SOGIN and ELWOOD (1986), except that we discovered regions of sequence similarity by the method of LAWRENCE and GOLDMAN (1988), as implemented in EuGene 3.2 (Molecular Biology Information Resource, 1989). We confirmed the secondary structures from the literature using energetic (ZUKER 1989; JAEGER, TURNER and ZUKER 1990) and phylogenetic considerations (WOESE et al. 1983). Where the structure was ambiguous, we discarded the structural information. The aligned sequences will be provided electronically by the author (L.V.) upon request and are given in VAWTER (1991).

The method of structural classification we used (stem, loop, bulge or "other") is shown in Figure 2. We emphasize here that we are including G-U pairs as stem structure, where they are not terminal to the stem. Unpaired regions within stems are classified as bulges, so that no unpaired positions are included as stem bases. The "other" class is not an arbitrary designation for positions with unknown structure; rather, it comprises long single-stranded regions that interact with ribosomal proteins (WOESE et al. 1983). All data, including those for which structure was ambiguous, were included in calculation of overall base compositions. Those positions where the alignment was unambiguous but the structure unknown (e.g., positions 1226-1321 of the human sequence) were excluded from structural analysis. Sequence data that were not included in the remaining analyses because of ambiguity of alignment or structure are detailed in VAWTER (1991) and are available from the author (L.V.) upon request.

For the phylogeny in Figure 1, we calculated hypothetical ancestral sequences (HTUs) and predicted base changes for all varying positions using parsimony (PAUP; SWOFFORD 1989). As was pointed out by FITCH and MARKOWITZ (1970), it is necessary to superimpose sequence data on a phylogeny, rather than merely to tally differences between the taxa when taken pairwise, in order to utilize all changes required by the phylogeny. This approach to identifying base changes has the advantage over utilizing correction formulae to estimate numbers of changes between pairs of taxa, as it allows not only estimation of the

and structural data from Caenorhabditis and the hypothetical ancestral taxa included, and once with them eliminated. Because of ambiguities in structure

FIGURE 1.-Unrooted phylogenetic network of all OTUs and HTUs in the data set. The relationships among these taxa are noncontroversial (KEMP 1988; MILNER 1988; NOVACEK, WYSS and MCKENNA 1988; WILLMER 1990). We used a parsimony method (HENNIG 1966), as implemented in the computer program PAUP, to assign base changes to branches. Branch lengths, excluding the dashed portion of the branch between Caenorhabditis and Artemia, are proportional to the number of changes they represent and are based on all characters, including cladistically noninformative characters. The consistency index (KLUGE and FARRIS 1969), based only on informative characters, is 0.94 and is discussed later in the text.

FIGURE 2.-Illustration of terminology for categorization of bases into bulges (B), loops (L), stems (S) and "other" (O). The "other" category comprises long single-stranded regions that are thought to interact with the ribosomal proteins (WOESE et al. 1983).

FIGURE 3.-Matrix of genetic distances between the taxa in the analysis. We calculated these as strict percent difference, with no corrections for multiple hits. As suggested by SOGIN and ELWOOD (1986), we omitted unique insertions from the calculations.
