
Proc. Natl. Acad. Sci. USA
Vol. 91, pp. 9916-9920, October 1994
Evolution

Group I introns are inherited through common ancestry in the nuclear-encoded rRNA of Zygnematales (Charophyceae)
(green algae/lateral transfer/phylogeny/secondary structure)

DEBASHISH BHATTACHARYA*†, BARBARA SUREK*, MATTHIAS RÜSING*, SIMON DAMBERGER‡, AND MICHAEL MELKONIAN*

*Universität zu Köln, Botanisches Institut, Gyrhofstrasse 15, 50931 Cologne, Germany; and ‡Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Porter Biosciences Building, Campus Box 347, Boulder, CO 80309-0347

Communicated by Thomas R. Cech, June 23, 1994

ABSTRACT Group I introns are found in organellar genomes, in the genomes of eubacteria and phages, and in nuclear-encoded rRNAs. The origin and distribution of nuclear-encoded rRNA group I introns are not understood. To elucidate their evolutionary relationships, we analyzed diverse nuclear-encoded small-subunit rRNA group I introns including nine sequences from the green-algal order Zygnematales (Charophyceae). Phylogenetic analyses of group I introns and rRNA coding regions suggest that lateral transfers have occurred in the evolutionary history of group I introns and that, after transfer, some of these elements may form stable components of the host-cell nuclear genomes. The Zygnematales introns, which share a common insertion site (position 1506 relative to the Escherichia coli small-subunit rRNA), form one subfamily of group I introns that has, after its origin, been inherited through common ancestry. Since the first Zygnematales appear in the middle Devonian within the fossil record, the "1506" group I intron presumably has been a stable component of the Zygnematales small-subunit rRNA coding region for 350-400 million years.

Group I introns are characterized by conserved RNA secondary structures essential for splicing and are often capable of self-splicing or require protein factors for excision (1-3). The origin and distribution of group I introns are not understood. Group I introns have been found most often in the organellar and nuclear genomes of green algae, higher plants, and fungi and in the genomes of some eubacteria and phages (3). Since the phage group I introns are readily mobile and the phage genome represents a mosaic of gene segments, it is not possible to address group I intron origin with these sequences (4). Of the organellar group I introns, some contain an open reading frame (ORF) which encodes a sequence-specific endonuclease to mediate their lateral transfer into homologous sequences [intron homing (5)]. Group I intron mobility is also postulated to result from reverse splicing (6).

Some group I introns which lack endonuclease coding regions appear to be nonmobile and provide a potentially valuable tool for tracing the evolutionary history of these sequences (2): the presence of a nonmobile group I intron positioned in the homologous site of the tRNALeu of cyanobacteria and in plastids of photosynthetic lineages that diverged as representatives of the eukaryotic crown group (7, 8) radiation (e.g., green algae, land plants, , glaucocystophytes) suggests that this intron was present in the progenitor(s) of these plastids and therefore is at least one billion years old (9). Within eukaryotes, the apparent absence of group I introns within the earliest-diverging amitochondrial and aplastidial Archezoa (see ref. 10 for definition) suggests that they were introduced into the nucleus of later-diverging eukaryotes (i.e., Metakaryota) by gene transfer from the intron-containing cyanobacterium that gave rise to the plastid [i.e., tRNALeu group I intron (4, 9)] or the α-purple eubacterium that gave rise to the mitochondrion. Group I introns have been found within eukaryotes that diverge after the Archezoa [e.g., Naegleria spp. (11, 12), Physarum polycephalum (13)] and before the radiation of the eukaryotic crown groups.

To assess the origin and distribution of rRNA group I introns lacking ORFs, we analyzed a data set of small-subunit (SSU) rRNA group I introns from , Zygnematales ("desmids"); ; and fungi. Zygnematales are members of the Charophyceae and occupy a basal position within the radiation of green algae and land plants (14).§

Abbreviations: ORF, open reading frame; SSU, small subunit; LSU, large subunit.
†To whom reprint requests should be addressed.
§The rRNA sequences of Cosmarium botrytis, Cosmocladium saxonicum, Genicularia spirotaenia, Mesotaenium caldariorum, Sphaerozosma granulatum, Staurastrum sp. M752, Staurastrum sp. M753, and Zygnemopsis circumcarinata have been deposited in the GenBank database (accession nos. X79498, X79497, X74753, X75763, X79496, X74752, X77452, and X79495).
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

MATERIALS AND METHODS

Complete SSU rRNA sequences have been previously determined for four Zygnematales (14): Mesotaenium caldariorum (Mesotaeniaceae), Genicularia spirotaenia and Staurastrum sp. M752 (Desmidiaceae, sensu ref. 15), and Mougeotia scalaris (Zygnemataceae, ref. 16). To enlarge on this data set we determined the complete rRNA coding regions of four members of the Desmidiaceae, Cosmarium botrytis [strain 274, Sammlung von Conjugaten-Kulturen, University of Hamburg (SCK), refs. 17 and 18], Cosmocladium saxonicum (strain 320, SCK), Sphaerozosma granulatum (strain 204, SCK), and Staurastrum sp. (strain M753, Culture Collection Melkonian, Cologne), and one member of the Zygnemataceae, Zygnemopsis circumcarinata (strain 241, SCK).

DNA Amplification and Sequencing. Total DNA from Cosmarium botrytis, Cosmocladium saxonicum, Sphaerozosma granulatum, Staurastrum sp. M753, and Zygnemopsis circumcarinata was prepared as described (14). SSU rRNA genes were amplified by PCR (19) using oligonucleotide primers complementary to conserved sequence elements proximal to the 5' and 3' termini of rRNA coding regions (20). SSU rRNA sequences were determined by the dideoxynucleotide chain-termination procedure (21) using single-stranded templates produced with the Dynabeads 280 streptavidin system (Dynal; ref. 22). Coding and noncoding strands of Zygnematales rRNAs were determined with oligonucleotide primers complementary to conserved regions within these coding regions. Two Zygnematales intron-

specific primers (6715F, 5'-ACCTTATCATTTAG-3'; 6716R, 5'-TTTAGTCTGTGAAC-3') were used to determine double-stranded sequences over these regions.

Host-Cell Phylogeny. To analyze the host-cell phylogeny of intron-containing taxa, SSU rRNAs of 28 eukaryotes including members of the Zygnematales and other green algae and land plants were manually aligned, and only regions which could be unambiguously aligned in all the sequences were used for the phylogenetic analysis (1718 nt). Distance analysis of group I introns was implemented with the PHYLIP (version 3.5c; ref. 23) computer program. The neighbor-joining method (24) was used to infer an unrooted phylogenetic tree from evolutionary distances estimated by the method of Kimura (25). Bootstrap resamplings (26) were used to assess stability of monophyletic groups.

The host-cell phylogenetic analysis included the following sequences (with GenBank accession numbers shown where available): Acrosiphonia sp. (U03757), Ankistrodesmus stipitatus (X56100), Chara foetida (X70704), Characium saccatum (M84319), Chlorella ellipsoidea (X63520), Coleochaete orbicularis (M95611), Cosmarium botrytis (X79498), Cosmocladium saxonicum (X79497), Dunaliella parva (M62988), Dunaliella salina (M84320), Friedmannia israelensis (M62995), Ginkgo biloba (D16448), Genicularia spirotaenia (X74753), Gloeotilopsis planctonica (27), Klebsormidium flaccidum (M95613), Mesotaenium caldariorum (X75763), Mougeotia scalaris (X70705), Nephroselmis olivacea (X74754), Nitella sp. (M95615), Pneumocystis carinii (X12708), Porphyra umbilicalis (L26201), Sphaerozosma granulatum (X79496), Staurastrum sp. M752 (X74752), Staurastrum sp. M753 (X77452), Ustilago maydis (X62396), Zamia pumila (M20017), Zea mays (K02202), and Zygnemopsis circumcarinata (X79495).

Group I Intron Analysis. The alignment of Zygnematales group I intron sequences was aided by the secondary structure-based alignments of Cech (1), Michel and Westhof (28), and S.D. and R. Gutell (unpublished data). The 5'-P-Q-R-S-3' regions were initially aligned, and then other conserved regions within the SSU rRNA group I introns were subsequently added to the alignment on the basis of secondary-structure considerations (28). With the aid of an alignment program which codes different nucleotides with different colors (SEQAPP REL 1.9; ref. 29), 228 group I intron positions could be unambiguously aligned for evolutionary analysis (Fig. 1). The phylogenetic analyses were restricted to 25 SSU rRNA group I introns and 2 LSU rRNA group I introns (Tetrahymena thermophila, Pneumocystis carinii) that are classified in subclass IC1 (28). The Tetrahymena thermophila LSU intron sequence was used as the outgroup for this study since the LSU rRNA group I introns are evolutionarily distinct from those interrupting SSU rRNA coding regions (30, 31); as an exception, the Pneumocystis carinii LSU rRNA group I intron is closely related to an intron located in its SSU rRNA coding region (32). All group I introns (LSU, SSU, and organellar) are postulated to share a common ancestry based on conserved sequences and secondary structures (1).

[Fig. 1: alignment of group I intron sequences, with paired regions P2 through P9 labeled above the sequences; sequence rows not reproduced.]

FIG. 1. Partial alignment of 27 group I intron sequences used in the phylogenetic analyses shown in Fig. 3. The catalytic core regions (P, Q, R, S) and other group I intron secondary-structure elements are indicated above the sequences. Dots are used to mark breaks within all the aligned sequences, whereas stars are used to mark sequences which have had one or more nucleotides excluded to optimize the alignment. Alignment gaps are marked with dashes. LSU, large subunit. See Materials and Methods for genera.

The distance method was implemented with the PHYLIP computer program as described above. Maximum-parsimony analysis of the intron sequences was done with a weighting scheme (1/no. of steps, over an interval of 1-100) for each site within the aligned data set (MACCLADE 3; ref. 33). The weighted data were used as input for a heuristic bootstrap analysis with a branch-swapping algorithm (TBR, tree bisection-reconnection, PAUP 3.1.1; ref. 34). The maximum-likelihood (PHYLIP) and weighted maximum-parsimony (branch-and-bound; PAUP) methods were also implemented to determine the evolutionary relationships among 17 and 12 group I intron sequences, respectively, which share a common insertion site (position 1506; see below). Bootstrap resamplings (26) were used in neighbor-joining and maximum-parsimony analyses to assess the stability of monophyletic groups.

The phylogenetic analyses included the following group I intron sequences (with GenBank accession numbers shown where available): Ankistrodesmus stipitatus (X56100), Characium saccatum (M84319), Chlorella ellipsoidea (X63520), Chlorella mirabilis (X74000), Chlorella sorokiniana (X73993), Cosmarium botrytis (X79498), Cosmocladium saxonicum (X79497), Dunaliella parva (M62988), Dunaliella salina (M84320), Genicularia spirotaenia (X74753), Gloeotilopsis planctonica (T. Friedl and C. Zeltner, personal communication), Hildenbrandia rubra (L19345), Mesotaenium caldariorum (X75763), Mougeotia scalaris (X70705), Pneumocystis carinii SSU (X12708), Pneumocystis carinii LSU (M86760), Porphyra spiralis var. amplifolia (35), Protomyces inouye (36), Sphaerozosma granulatum (X79496), Staurastrum sp. M752 (X74752), Staurastrum sp. M753 (X77452), Tetrahymena thermophila LSU (V01416), Urospora penicilliformis (31), Ustilago maydis (X62396), and Zygnemopsis circumcarinata (X79495).

RESULTS AND DISCUSSION

PCR products of SSU rRNAs from 19 Zygnematales that represent members of all families within this order (D.B. and B.S., unpublished data) were found to be larger (2.1-2.4 kb) than expected for these coding regions (≈1.8 kb). Sequence analysis of 13 of these taxa, which represent several major

FIG. 2. Phylogenetic analysis of SSU rRNA coding regions. Unrooted tree constructed with the neighbor-joining method (24) based on structural distances (25) between SSU rRNA coding regions is shown. A total of 1718 nt were considered. Evolutionary distances are represented in the horizontal axis by the sum of branch lengths separating taxa. The distance that corresponds to 4% sequence divergence is indicated by the scale. Bootstrap percentage values based on 100 resamplings of these data are shown at the internal nodes. Taxa which contain a group I intron(s) within their rRNA are shown in boldface type. Members of the Zygnematales families Zygnemataceae (Zyg.), Mesotaeniaceae (Mes.), and Desmidiaceae (Des.) are indicated.
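The distances behind this tree are attributed to the method of Kimura (25). As a rough illustration only, the two-parameter form of that distance, d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of transition and transversion differences between two aligned sequences, can be sketched in Python. The function name, the decision to skip gaps and ambiguous symbols, and the toy sequences are assumptions made for this sketch; the PHYLIP programs used in the study may treat these details differently.

```python
import math

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CU")


def kimura_2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned RNA/DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    # Treat DNA and RNA alike by mapping T to U.
    seq_a = seq_a.upper().replace("T", "U")
    seq_b = seq_b.upper().replace("T", "U")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a, seq_b):
        known_a = a in PURINES or a in PYRIMIDINES
        known_b = b in PURINES or b in PYRIMIDINES
        if not (known_a and known_b):
            continue  # assumption: gaps and ambiguous symbols are skipped
        compared += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1    # A<->G or C<->U
        else:
            transversions += 1  # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    term_ts = 1.0 - 2.0 * p - q
    term_tv = 1.0 - 2.0 * q
    if term_ts <= 0.0 or term_tv <= 0.0:
        raise ValueError("sequences too divergent for this correction")
    return -0.5 * math.log(term_ts) - 0.25 * math.log(term_tv)


if __name__ == "__main__":
    # Toy 10-nt sequences (not from the paper) differing by one transition.
    print(round(kimura_2p_distance("ACGUACGUAC", "GCGUACGUAC"), 4))  # prints 0.1116
```

A matrix of such pairwise distances is the kind of input a neighbor-joining program takes; the correction grows faster than the raw proportion of differences, which is why highly divergent pairs contribute long branches.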

[Fig. 3 tree graphics not reproduced. G+C contents shown next to the taxa in panel B: Mougeotia scalaris, 54%; Zygnemopsis circumcarinata, 54%; Mesotaenium caldariorum, 51%; Staurastrum sp. M753, 52%; Staurastrum sp. M752, 50%; Cosmarium botrytis, 62%; Sphaerozosma granulatum, 61%; Cosmocladium saxonicum, 54%; Genicularia spirotaenia, 52%; Chlorella mirabilis, 60%; Chlorella ellipsoidea, 52%; Protomyces inouye A, 50%; Hildenbrandia rubra, 57%; Porphyra spiralis, 53%; Pneumocystis carinii LSU, 47%; Pneumocystis carinii, 47%; Tetrahymena thermophila LSU, 48%.]
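The G+C percentages attached to the taxa in Fig. 3B are base-composition summaries over the aligned intron positions. A minimal Python sketch of that calculation follows. The function names and the toy sequences are hypothetical, and the paper does not say how gaps or ambiguous positions were counted, so ignoring them here is an assumption.

```python
def gc_percent(aligned_seq: str) -> float:
    """Percent G+C over the unambiguous nucleotides of one aligned sequence."""
    # Assumption: gap characters and ambiguity codes are ignored.
    bases = [b for b in aligned_seq.upper() if b in "ACGUT"]
    if not bases:
        raise ValueError("sequence has no unambiguous nucleotides")
    gc_count = sum(1 for b in bases if b in "GC")
    return 100.0 * gc_count / len(bases)


def gc_range(alignment: dict[str, str]) -> tuple[float, float]:
    """Lowest and highest G+C percentage across the sequences of an alignment."""
    values = [gc_percent(seq) for seq in alignment.values()]
    return min(values), max(values)


if __name__ == "__main__":
    # Invented toy alignment, for illustration only.
    toy = {"seq1": "GGCC--AU", "seq2": "GCAUAU-A"}
    print(gc_range(toy))
```

A narrow range across a clade is what one would look for before arguing that similar tree topologies are not an artifact of shared base composition.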

FIG. 3. Phylogenetic analyses of rRNA group I introns. (A) Rooted tree constructed with the neighbor-joining method (24) based on structural distances (25) between group I introns. A total of 228 nt were considered. The distance that corresponds to 10% sequence divergence is indicated by the scale. Bootstrap percentage values based on 100 resamplings of these data are shown above the internal nodes. Bootstrap percentage values shown below the internal nodes in italic type are inferred from a weighted maximum-parsimony analysis (34) of the same data set (100 resamplings); this phylogram had a consistency index of 0.6. Intron insertion sites, based on their positions in the Escherichia coli SSU rRNA coding region, are shown at right. (B) Maximum-likelihood analysis of "1506" rRNA group I introns. The global-search option was used with a transition/transversion ratio of 2, empirically determined base frequencies, and a jumbled species input; the tree has a log-likelihood of -2921.37. The percent G+C contents of the group I intron sequences used in this phylogenetic analysis (over the aligned 228 nt) are shown next to the species names. Both A and B are rooted with the Tetrahymena thermophila group I intron sequence. (C) Phylogeny of twelve "1506" rRNA group I introns inferred with the maximum-parsimony method (34) using 228 nt. This phylogram has a consistency index of 0.777 and is the consensus of 1000 bootstrap resamplings that resulted from a branch-and-bound search procedure; bootstrap percentage values are shown at the internal nodes. C is rooted with the Pneumocystis carinii SSU rRNA intron sequence.

families of the Zygnematales, demonstrates that the insertions are attributable to single group I introns located at an identical position (nt 1506, relative to the Escherichia coli SSU rRNA coding region) in these rRNAs. Zygnematales other than those described here which also contain "1506" group I introns are interruptus (Mesotaeniaceae), Desmidium swartzii and Phymatodocis nordstedtiana (Desmidiaceae), and sticticum (Zygnemataceae). The only member of the Zygnematales that does not contain an intron is sp. (Zygnemataceae, strain 253, SCK).

Phylogenetic Analyses. Phylogenetic analysis of the SSU rRNA coding regions shows that the Zygnematales is a distinct evolutionary lineage within the green algae/land plants which shares a most recent common ancestry with other Charophyceae and land plants (Fig. 2). The Charophyceae/land plants are a monophyletic assemblage which is a sister group to all other green algae. Distance analysis of the Zygnematales SSU rRNA sequences demonstrates that Genicularia spirotaenia, Cosmarium botrytis, Cosmocladium saxonicum, Sphaerozosma granulatum, and Staurastrum spp. (Desmidiaceae) are closely related to one another and distinct from Mesotaenium caldariorum (Mesotaeniaceae) and Mougeotia scalaris and Zygnemopsis circumcarinata (Zygnemataceae).

Phylogenetic analyses of the rRNA group I introns demonstrate a nearly identical topology of the Zygnematales 1506 intron subtree in the distance, maximum-likelihood, and maximum-parsimony analyses (Fig. 3) and its close similarity with the SSU rRNA phylogeny of these taxa (Fig. 2). These results, taken together with the shared insertion site and the lack of a systematic G+C bias within the Zygnematales group I introns (see Fig. 3B), support a monophyletic origin of the 1506 intron in this group. That the bootstrap analyses provide little support for the monophyly of all Zygnematales group I introns presumably reflects the small number of positions which could be used in this analysis coupled with the relatively high divergence of these sequences [e.g., 42.4% evolutionary distance (25) over 221 aligned nucleotides between Staurastrum sp. M752 and Genicularia spirotaenia]. Reduction of the number of positions used in the phylogenetic analysis to those defining the conserved P-Q-R-S secondary-structure elements (97 nt) did not result in any case in greater resolution of the evolutionary relationships or higher bootstrap support for the groupings identified in this study (D.B. and B.S., unpublished data). The monophyly of Zygnemopsis circumcarinata/Mesotaenium caldariorum, for example, which is supported in the neighbor-joining and maximum-parsimony bootstrap analyses in Fig. 3A (74% and 82%, respectively), in the maximum-parsimony bootstrap analysis in Fig. 3C (94%), and in the coding-region phylogeny in Fig. 2 (85%) has no bootstrap support in a neighbor-joining analysis of the reduced (i.e., 97-nt) data set (41%).

Importantly, there is evidence in the bootstrapped neighbor-joining analysis for the monophyletic origin of group I introns in two independent Zygnematales lineages (i.e., Mesotaenium caldariorum/Mougeotia scalaris/Zygnemopsis circumcarinata and Genicularia spirotaenia/Cosmarium botrytis/Cosmocladium saxonicum/Sphaerozosma granulatum/Staurastrum spp.; Fig. 3A). These results are mirrored in the rRNA coding-region phylogeny (Fig. 2). The only uncertainty within the topologies of the Zygnematales intron and coding-region phylogenies concerns the branch points of the Desmidiaceae (excluding Genicularia spirotaenia). The Desmidiaceae define a closely related clade whose members radiate over a relatively short evolutionary distance (≈0.3%; Fig. 2).

To further test the monophyly of the Zygnematales group I introns, user-defined trees were created which disrupt the monophyly of these sequences and were used to constrain maximum-likelihood analyses. Placement of the Chlorella spp. 1506 introns at the base of the Desmidiaceae resulted in a phylogeny with a lower log-likelihood (-2929.15) that was not significantly different (37) from the "best" tree shown in Fig. 3B (log-likelihood, -2921.37). Positioning of the Pneumocystis carinii group I intron sequences at the base of the Desmidiaceae lineage resulted in a maximum-likelihood phylogeny with a log-likelihood (-2942.90) that was significantly "worse" than that of Fig. 3B. These results provide a measure of support for the branch length uniting the Chlorella spp./Zygnematales 1506 group I introns. All methods support placement of the Chlorella spp. introns outside of the Zygnematales assemblage.

We propose that the common ancestor of the Zygnematales contained a 1506 group I intron which has, for an unknown reason, remained stationary within the rRNA coding region. The alternative hypothesis that the 1506 element has been frequently laterally transferred within the Zygnematales and spread throughout all member taxa is not supported by our analyses. Frequent lateral transfers would result in intron phylogenies which are widely discordant with that derived from the nuclear-encoded rRNAs. In the absence of a G+C bias (the aligned nucleotides used in the Zygnematales rRNA coding region phylogeny range from 46% to 47% G+C content), intron and coding-region topologies are not expected to converge by chance alone. We cannot, however, unequivocally distinguish between a paraphyletic origin of the 1506 intron during divergence of the two major groups of the Zygnematales and the presence in the common ancestor of these taxa. The inclusion of additional Zygnematales 1506 group I intron sequences in the phylogenetic analyses may result in greater bootstrap support for the monophyletic origin of this intron subfamily. Given that our hypotheses accurately reflect the evolutionary history of the 1506 intron in the Zygnematales, we approximate a minimum age of 350-400 million years (i.e., middle Devonian) for this element, based on the first appearance of a "desmid" within the fossil record (15, 38).

The complex distribution of group I introns within eukaryotes suggests, however, that both lateral transfer and common ancestry have played important roles in the evolutionary history of these elements. Evidence for the lateral transfer of rRNA group I introns comes from the analyses of these elements in Tetrahymena spp. LSU rRNA (39) and from the present phylogenetic study. The clustering of group I introns from distinct insertion sites (i.e., 943 and 1046; Fig. 3A) suggests that these introns may be traced back to one or more invasive elements which were independently inserted into the rRNA of green algae and fungi. After transfer, some of these introns may be vertically inherited within the SSU rRNA (i.e., Zygnematales 1506 and 1512) or lost from this coding region (40). In contrast to the Zygnematales 1506 intron, the sporadic distribution of group I introns within the green algae and fungi and the small number of available intron sequences representing each of these subfamilies preclude a clearer understanding of their evolutionary origin(s). In addition to further sequence analyses, uncovering the mechanisms which control the lateral transfer of group I introns lacking ORFs (e.g., reverse splicing; ref. 6) should provide a more complete understanding of their phylogenetic distribution.

We thank T. Friedl (University of Bayreuth), V. A. R. Huss (University of Erlangen), and M. Ragan (National Research Council, Halifax) for making rRNA sequences available to us prior to their publication, R. Gutell (University of Colorado, Boulder) for help with the intron alignment, and E. Miller for technical assistance. This research was supported by a grant from the Deutsche Forschungsgemeinschaft (ME 658/11-2) to M.M., a fellowship from the Ministerium für Wissenschaft und Forschung Nordrhein-Westfalen (Wiedereinstiegsstipendium HSP II) to B.S., an Alexander von Humboldt Foundation Research Award to D.B., and support from the National Institutes of Health (GM48207) to R. Gutell for S.D.

1. Cech, T. R. (1988) Gene 73, 259-271.
2. Belfort, M. (1993) Science 262, 1009-1010.
3. Lambowitz, A. M. & Belfort, M. (1993) Annu. Rev. Biochem. 62, 587-622.
4. Belfort, M. (1991) Cell 64, 9-11.
5. Dujon, B. (1989) Gene 82, 91-114.
6. Woodson, S. A. & Cech, T. R. (1989) Cell 57, 335-345.
7. Knoll, A. H. (1992) Science 256, 622-627.
8. Wainwright, P. O., Hinkle, G., Sogin, M. L. & Stickel, S. K. (1993) Science 260, 340-342.
9. Kuhsel, M. G., Strickland, R. & Palmer, J. D. (1990) Science 250, 1570-1573.
10. Cavalier-Smith, T. (1993) Microbiol. Rev. 57, 953-994.
11. Embley, T. M., Dyal, P. & Kilvington, S. (1992) Nucleic Acids Res. 20, 6411.
12. De Jonckheere, J. F. (1993) J. Eukaryotic Microbiol. 40, 179-187.
13. Ruoff, B., Johansen, S. & Vogt, V. M. (1992) Nucleic Acids Res. 20, 5899-5906.
14. Surek, B., Beemelmanns, U., Melkonian, M. & Bhattacharya, D. (1994) Plant Syst. Evol. 191, 171-181.
15. Hoshaw, R. W., McCourt, R. M. & Wang, J.-C. (1990) in Handbook of Protoctista, eds. Margulis, L., Corliss, J. O., Melkonian, M. & Chapman, D. J. (Jones and Bartlett, Boston), pp. 119-131.
16. Huss, V. A. R., Siegler, M.-L. & Kranz, H. D. (1993) Plant Mol. Biol. 22, 557-560.
17. Mix, M. (1973) Mitt. Staatsinst. Allg. Bot. Hamburg 14, 135-169.
18. Engels, M. & Mix, M. (1980) Mitt. Inst. Allg. Bot. Hamburg 17, 165-171.
19. Saiki, R. K., Gelfand, D. H., Stoffel, S., Scharf, S. J., Higuchi, R., Horn, G. T., Mullis, K. B. & Erlich, H. A. (1988) Science 239, 487-491.
20. Medlin, L., Elwood, H. J., Stickel, S. & Sogin, M. (1988) Gene 71, 491-499.
21. Sanger, F., Nicklen, S. & Coulson, A. R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467.
22. Hultman, T., Bergh, S., Moks, T. & Uhlén, M. (1991) BioTechniques 10, 84-93.
23. Felsenstein, J. (1993) PHYLIP Manual (Dept. of Genet., Univ. of Washington, Seattle), Version 3.5, p. 172.
24. Saitou, N. & Nei, M. (1987) Mol. Biol. Evol. 4, 406-425.
25. Kimura, M. (1980) J. Mol. Evol. 16, 111-120.
26. Felsenstein, J. (1985) Evolution 39, 783-791.
27. Zeltner, C. & Friedl, T. (1994) J. Phycol. 30, 500-506.
28. Michel, F. & Westhof, E. (1990) J. Mol. Biol. 216, 585-610.
29. Gilbert, D. (1992) SEQAPP, A Biosequence Editor and Analysis Application (Biology Dept., Indiana Univ., Bloomington), Release 1.9a159, p. 22.
30. Johansen, S., Johansen, T. & Haugli, F. (1992) Curr. Genet. 22, 297-304.
31. Van Oppen, M. J. H., Olsen, J. L. & Stam, W. T. (1993) Mol. Biol. Evol. 10, 1317-1326.
32. Liu, Y., Rocourt, M., Pan, S., Liu, C. & Leibowitz, M. J. (1992) Nucleic Acids Res. 20, 3763-3772.
33. Maddison, W. P. & Maddison, D. R. (1992) MACCLADE (Sinauer, Sunderland, MA), Version 3, p. 398.
34. Swofford, D. L. (1993) PAUP, Phylogenetic Analysis Using Parsimony (Ill. Nat. Hist. Surv., Champaign), Version 3.1.1, p. 117.
35. Oliveira, M. C. & Ragan, M. A. (1994) Mol. Biol. Evol. 11, 195-207.
36. Nishida, H., Blanz, P. A. & Sugiyama, J. (1993) J. Mol. Evol. 37, 25-28.
37. Kishino, H. & Hasegawa, M. (1989) J. Mol. Evol. 29, 170-179.
38. Baschnagel, R. A. (1966) Trans. Am. Microsc. Soc. 85, 297-302.
39. Sogin, M. L., Ingold, A., Karlok, M., Nielsen, H. & Engberg, J. (1986) EMBO J. 5, 3625-3630.
40. Wilcox, L. W., Lewis, L. A., Fuerst, P. A. & Floyd, G. L. (1992) Mol. Biol. Evol. 9, 1103-1118.