DOI 10.1002/bies.200900036 Review article

Trans-splicing of organelle introns – a detour to continuous RNAs
Stephanie Glanz and Ulrich Kück*
Lehrstuhl für Allgemeine und Molekulare Botanik, Ruhr-Universität Bochum, Bochum, Germany

In eukaryotes, RNA trans-splicing is an important RNA-processing form for the end-to-end ligation of primary transcripts that are derived from separately transcribed exons. So far, three different categories of RNA trans-splicing have been found in organisms as diverse as algae to man. Here, we review one of these categories: the trans-splicing of discontinuous group II introns, which occurs in chloroplasts and mitochondria of lower eukaryotes and plants. Trans-spliced exons can be predicted from DNA sequences derived from a large number of sequenced organelle genomes. Further molecular genetic analysis of mutants has unravelled proteins, some of which are part of high-molecular-weight complexes that promote the splicing process. Based on data derived from the alga Chlamydomonas reinhardtii, a model is provided which defines the composition of an organelle spliceosome. This will have a general relevance for understanding the function of RNA-processing machineries in eukaryotic organelles.

Keywords: chloroplasts and mitochondria; group II introns; organelle spliceosome; trans-splicing

Introduction

Introns were first discovered in 1977 and were subsequently identified in organisms from all three kingdoms, namely prokaryotes, archaea and eukaryotes.(1–3) On the basis of their splicing mechanisms and conserved RNA-folding patterns, introns are classified into the following categories: group I and group II introns, nuclear tRNA introns, archaeal introns and spliceosomal mRNA introns. Group I introns are widely distributed in genomes of prokaryotic and eukaryotic organisms but not in archaea,(4) while the tRNA and/or archaeal introns are found in eukaryotic nuclear tRNAs as well as in archaeal mRNAs, rRNAs and tRNAs.(5) Spliceosomal mRNA introns were exclusively discovered in nuclear genomes of eukaryotes (see Glossary),(6) whereas group II introns are restricted to chloroplasts and mitochondria of lower eukaryotes, plants and some prokaryotes. These prokaryotes belong to cyanobacterial and proteobacterial lineages and are believed to be potential ancestors of chloroplasts and mitochondria. More recently, group II introns have been discovered outside these eukaryote organelles in the genome of the archaean Methanosarcina sp.(7) and in the nuclear genome of the bilaterian Nephtys sp.(8)

During splicing, introns are removed from a precursor RNA, and the concomitant ligation of exons results in the formation of a mature transcript. This process of intramolecular ligation involves only a single RNA molecule and is called cis-splicing. In cases, however, when more than one is involved in an intermolecular ligation, the RNA is processed by trans-splicing. So far, three different categories of RNA trans-splicing have been found in genomes as diverse as archaeans to man: the spliced-leader (SL) trans-splicing, the alternative trans-splicing and the trans-splicing of discontinuous group II introns (Fig. 1).

The term SL trans-splicing describes the spliceosomal transfer of a short RNA sequence (the SL, 15–50 nt) from the 5′-end of a particular non-coding RNA donor molecule (the SL RNA, 45–140 nt) to unpaired splice acceptor sites on pre-mRNA molecules. As a result, diverse mRNAs, ranging from a small proportion to 100% of the mRNA population in different organisms, acquire a common 5′-sequence.(9) This whole process converts a polycistronic transcript into translatable monocistronic mRNAs (Fig. 1A). The phenomenon of SL trans-splicing was first discovered in pre-mRNAs from nuclear genes of trypanosomes. In this organism, the capped 5′-terminal sequence of SL RNAs is a mini-exon containing an AUG start codon, which is trans-spliced onto the 5′-end of each mRNA.(10) Since then, SL trans-splicing has been found in diverse groups of eukaryotes including ascidians, cnidarians, dinoflagellates, euglenozoans, flatworms, and […];(11–13) however, it has an as yet unknown evolutionary origin.

Alternative trans-splicing has recently been discovered in Drosophila and mammals. In this case, exons located on separate primary transcripts are selectively joined to produce mature mRNAs encoding proteins with distinct structures and functions. Alternative trans-splicing can essentially be differentiated into intragenic and intergenic trans-splicing processes. Intragenic trans-splicing is known to occur in rat and involves exon repetitions, whereas intergenic trans-splicing was found in man and mouse and involves trans-splicing of two RNA molecules originating from two different genes (Fig. 1B).(14–16)

*Correspondence to: U. Kück, Lehrstuhl für Allgemeine und Molekulare Botanik, Fakultät für Biologie und Biotechnologie, Ruhr-Universität Bochum, 44780 Bochum, Germany.
E-mail: [email protected]

BioEssays 31:921–934, © 2009 Wiley Periodicals, Inc.

Figure 1. Three categories of trans-splicing. A: Spliced leader (SL) trans-splicing. SL trans-splicing accurately joins sequences derived from separately transcribed small non-coding RNAs and independently transcribed pre-mRNAs. The SL sequence is donated from the SL RNA to pre-mRNAs to form the 5′-terminal mini-exon of the mature mRNA. Outron indicates the 5′-segment of a trans-spliced pre-mRNA upstream of the trans-splice acceptor site. B: Alternative trans-splicing. Intragenic trans-splicing generates mRNAs containing tandem duplications of specific exons (dark blue) and intergenic trans-splicing generates chimeric mRNAs (grey and light blue) between pre-mRNAs originating from two different genes (A or B). C: Group II trans-splicing. Primary transcripts derived from distantly located exonic sequences are joined end to end and ligated after assembly and splicing of the flanking group II intron sequences. Exons are shown as boxes and wavy lines represent non-translated sequences.
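The SL transfer summarised in panel A can be pictured, very loosely, as a string operation: the mini-exon from the 5′-end of the SL RNA replaces the outron of the pre-mRNA. The following is only a minimal sketch under stated assumptions; the function name, the toy sequences and the acceptor position are hypothetical and are not taken from any organism discussed here.

```python
def sl_trans_splice(sl_rna, sl_length, pre_mrna, acceptor_site):
    """Toy model of SL trans-splicing (hypothetical helper, not a real tool).

    sl_rna        -- sequence of the SL RNA donor molecule
    sl_length     -- number of 5' nucleotides forming the SL (mini-exon)
    pre_mrna      -- sequence of the acceptor pre-mRNA
    acceptor_site -- index of the first nucleotide kept from the pre-mRNA;
                     everything upstream of it is the discarded outron
    """
    mini_exon = sl_rna[:sl_length]        # SL donated by the SL RNA
    body = pre_mrna[acceptor_site:]       # pre-mRNA without its outron
    return mini_exon + body


# Made-up sequences: lower case marks the outron of the pre-mRNA.
sl_rna = "AACGCUAUUAGUUUCUG"
pre_mrna = "uuuuuccagAUGGCC"
mature = sl_trans_splice(sl_rna, 8, pre_mrna, 9)
```

Applying the same function to each coding region of a polycistronic precursor would give every resulting mRNA the same 5′-sequence, which is the outcome described for SL trans-splicing.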

Finally, trans-splicing occurs between transcripts derived from scrambled fragments flanked by discontinuous group II introns (Fig. 1C). Group II introns are characterised by a conserved secondary structure configuration. This structure consists of six major stem-loops, corresponding to domains D1–D6, radiating from a central core of single-stranded RNA segments that brings the 5′- and 3′-splice junctions into close proximity. For correct folding and catalytic function, the formation of tertiary interactions is essential. Within group II introns, two main subclasses of secondary structures, IIA and IIB, each consisting of two forms (IIA1, IIA2 and IIB1, IIB2), have been found and more recently, two further classes, IIC and IID, have been discovered in bacteria.(6,17) A model for the secondary structure of a typical group IIB intron is shown in Fig. 2 and excellent reviews concerning the structure and folding of these introns have been published.(18,19)

Discontinuous group II introns are found in chloroplasts of algae as well as in chloroplasts and mitochondria of higher plants and will be the focus of this review. The genes involved consist of exons that are distributed throughout the genome and flanked by 5′- and 3′-regions of group II intron consensus sequences. Due to the assembly of these regions, a functional group II intron secondary structure is restored in trans. In many cases, the correct assembly and splicing reaction seems to depend on trans-acting factors, which could be RNA molecules and/or proteins. Discontinuous group II introns share a common splicing mechanism with spliceosomal introns and are therefore considered as an evolutionary link between cis-splicing group II introns and nuclear spliceosomal introns. These nuclear spliceosomal introns depend functionally on the trans-acting spliceosome machinery.(20,21) Previously, it was assumed that DNA rearrangements within group II introns result in discontinuous mosaic genes with exons scattered over the genome.(22)

Chloroplast trans-splicing

The plastid DNA (plastome) of plants and algae codes for 100–140 genes. Most of these genes are organised in operons and the corresponding polycistronic precursor transcripts undergo complex posttranscriptional processes. These processes include the stabilisation of the RNA as well as cis- and trans-splicing processes, endonucleolytic cleavage, RNA editing, and terminal nucleotide additions and/or deletions. We have analysed a total of 120 sequenced chloroplast genomes with regard to trans-splicing processes (Table S1). Our analysis revealed that no trans-splicing events take place in the three members of the Apicomplexa and in a single cercozoan species sequenced so far, but that 101 trans-splicing events take place in the other 116 analysed eukaryotic genomes. Table 1 summarises all characterised chloroplast trans-splicing introns known to date, and Fig. 3A shows examples for the organisation of exon sequences in chloroplast genomes from selected organisms. The first examples of chloroplast trans-splicing were discovered for the group II introns from the Marchantia polymorpha and Nicotiana tabacum rps12 gene, encoding the 30S ribosomal protein S12, and from the C. reinhardtii psaA gene, encoding the P700 chlorophyll a-apoprotein of photosystem I reaction centre.(23–26)

Figure 2. Secondary structure model of a typical group IIB intron. Intron and exon sequences are given as thin and thick lines, respectively. Arabic numerals denote the six conserved domains of group II introns (D1–D6). The dashed domain D4 is the most variable intron region and sometimes contains an open reading frame. A branch point involved in group II intron splicing is circled. Arrows indicate 5′ to 3′ strand polarity. Potential fragmentation sites of trans-spliced introns are mapped with arrowheads in domains D1, D2, D3 and D4. Typical tertiary interactions between exon-binding sites (EBS) and intron-binding sites (IBS) are indicated. For a complete set of tertiary interactions in group II introns see Pyle et al.(95) and Fedorova et al.(18)

Maturation of the rps12 mRNA represents a complex trans-splicing process, and the corresponding gene shows a highly similar organisation in chloroplasts of charophytes and embryophytes. The mosaic rps12 genes, like other discontinuous chloroplast genes, contain introns encoding cis- and/or trans-spliced primary transcripts that are flanked by sequences showing features of group II introns (Table S1). In Fig. 3A, two out of four possible organisations of the rps12 genes are depicted. In all cases, splicing of exon 1 and exon 2 occurs in trans, with an intronic fragmentation site in domain D3, while exon 2 can harbour two further exonic sequences that can be processed on the RNA level by cis-splicing. Alternatively, the continuous exon 2 sequence or the discontinuous exon 2-exon 3 sequences can be located either in a large single copy region or in the two inverted repeats of chloroplast genomes. Both the 5′- and 3′-rps12 gene fragments are organised in operons and are expressed as polycistronic transcripts.(27,28) Maturation of the rps12 mRNA comprises both endonucleolytic cleavage of the polycistronic transcripts and trans-splicing of exon 1 with exon 2 as well as cis-splicing of exons 2 and 3.

Numerous sequencing projects have enabled the in silico analysis of further genes with putative trans-spliced introns, which has revealed genes other than those mentioned above. For example, these genes include psaA of the green alga Scenedesmus obliquus, pbsA (heme oxygenase) of the red alga Rhodella violacea, petD (subunit IV of cytochrome-b6/f-complex) and psaC (subunit VII of photosystem I) of the green algae Stigeoclonium helveticum and Oedogonium cardiacum, rbcL (large subunit of ribulose 1,5-bisphosphate carboxylase/oxygenase) of the green algae S. helveticum and Floydiella terrestris, and ndhH (subunit of the NAD(P)H dehydrogenase complex) of Triticum aestivum.(29–33)

The highest number of trans-spliced introns within a plastid genome was predicted from complete sequencing of the chloroplast DNA from the green alga S. helveticum.(29) Four discontinuous group II introns were identified in the pre-mRNAs, one each in the petD and psaC genes and two in the rbcL gene. Recently, further chloroplast DNAs from the green algae F. terrestris and O. cardiacum were reported to encode petD, psaC or rbcL pre-mRNAs that are spliced in trans (Table 1, Table S1 and Fig. 3B).

To date, the best-analysed trans-splicing process is that of the psaA mRNA in the unicellular green alga C. reinhardtii. This alga can be regarded as the model organism for the analysis of plastid gene expression during photosynthesis and the communication between the nucleus and the chloroplast.(34) Mutant strains with a defective photosystem I served as the basis for the identification of trans-splicing processes.(35) As early as 1987, it was known that the three exons of the psaA gene are distributed on the plastome and transcribed separately from each other.(25) Two trans-splicing steps are necessary to form the mature mRNA. For intron 1, three independently transcribed RNA molecules assemble into a functional group II intron structure by base pairings and tertiary interactions. This tripartite group IIB intron is interrupted in domains D1 and D4; thereby exon 1 is flanked by a portion of domain D1 and exon 2 by a portion of domain D4 as well as the entire domains D5 and D6. The rest of domains D1 and D4 as well as the entire domains D2 and D3 are delivered from a third RNA molecule, the plastid-encoded tscA RNA, which is 450 nt in length.(36) The tscA RNA was also detected in C. gelatinosa (376 nt) and in C. zebra (466 nt) and, in both species, exhibits a sequence identity of approximately 55% to the tscA RNA of C. reinhardtii. Analysis of the secondary structure of the three tscA RNAs also showed a high degree of similarity with the


Table 1. Distribution of discontinuous chloroplast group II introns from selected algae, higher plants and mosses.

Gene | Intron [a] | Splicing type | Intron type [b], fragmented domain | Organism [c]
pbsA | pbsA-i1 | trans | n.d. | Rhodella violacea(33)
 | pbsA-i2 | cis? | n.d. |
petD | petD-i1 | trans | IIB, bi, D1 | Oedogonium cardiacum, Stigeoclonium helveticum(29,30)
psaA | psaA-i1 | trans | IIB, tri, D1 + D4 | Chlamydomonas reinhardtii(25)
 | psaA-i2 [d] | trans | IIB, bi, D4 |
 | psaA-i1 [d] | trans | IIB, bi, D4 | Scenedesmus obliquus(31)
psaC | psaC-i1 | trans | IIB, bi, D1 | Oedogonium cardiacum, Stigeoclonium helveticum(29,30)
rbcL | rbcL-i1 | trans | IIB, bi, D1 | Floydiella terrestris, Stigeoclonium helveticum(29,30)
 | rbcL-i2 | trans | IIA, bi, D2 | Floydiella terrestris, Stigeoclonium helveticum(29,30)
rps12 | rps12-i1 | trans | IIB, bi, D3 | Epifagus virginiana, Hordeum vulgare L., Marchantia polymorpha, Nicotiana tabacum(23,24,26,98,99)
 | rps12-i2 | cis | IIA, bi |
 | rps12-i1 | trans | IIA, bi, D3 | Cuscuta europaea, Staurastrum punctulatum, Zygnema circumcarinatum(100,101)

This list contains examples that have been thoroughly analysed by cDNA and/or Northern or sequence analyses. A complete list of chloroplast introns shows that in Chlorophyta, 6 out of 27 genomes encode nine trans-spliced RNAs. Similarly, in charophytes such as Chara vulgaris, 4 out of 6 genomes encode rps12 RNAs that are predicted to be trans-spliced. The same is true for the 83 embryophytes whose plastomes are completely sequenced. An exception seems to be the ndhA and ndhH transcripts, which are most probably trans-spliced in wheat (Table S1).(97)
Abbreviations and gene products: bi, bipartite intron; D1-4, domains D1-D4 of a typical group II intron; n.d., not determined; pbsA, heme oxygenase; petD, subunit IV of cytochrome-b6/f-complex; psaA, P700 chlorophyll a-apoprotein of photosystem I reaction centre; psaC, subunit VII of photosystem I; rbcL, large subunit of RubisCO; rps12, 30S ribosomal protein S12; tri, tripartite intron.
[a] The intron nomenclature is based on the flowering plant mitochondrial literature used by Bonen.(43)
[b] Prediction of the secondary structure and the classification into subclasses IIA and IIB rely on sequence analyses, based on models of Michel(6) and Michel and Ferat.(42)
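The "fragmented domain" column of Table 1 can be read as a list of break points within the six-domain intron structure. The sketch below is only an illustration of that reading, not a method used by the authors: it derives which domains each separately transcribed RNA piece would carry for a given set of breaks, and checks whether a set of pieces covers D1 to D6 again. The function names and the "5' part"/"3' part" labels are hypothetical.

```python
DOMAINS = ["D1", "D2", "D3", "D4", "D5", "D6"]


def split_intron(break_domains):
    """Return the domain content of each RNA piece.

    break_domains -- domains that contain a fragmentation site,
                     e.g. ["D1", "D4"] for a tripartite intron.
    A broken domain is shared by two neighbouring pieces.
    """
    breaks = set(break_domains)
    pieces, current = [], []
    for domain in DOMAINS:
        if domain in breaks:
            current.append(domain + " (5' part)")
            pieces.append(current)            # close the upstream piece
            current = [domain + " (3' part)"]  # open the downstream piece
        else:
            current.append(domain)
    pieces.append(current)
    return pieces


def restores_intron(pieces):
    """True if the pieces, taken in order, cover D1..D6 exactly once each."""
    covered = []
    for piece in pieces:
        for segment in piece:
            name = segment.split()[0]
            if not covered or covered[-1] != name:
                covered.append(name)
    return covered == DOMAINS


# Entry "tri, D1 + D4" (psaA-i1): three pieces are expected.
psaA_i1 = split_intron(["D1", "D4"])
# Entry "bi, D4" (psaA-i2): two pieces are expected.
psaA_i2 = split_intron(["D4"])
```

For the tripartite case, the middle piece returned by `split_intron` carries the 3′ part of D1, all of D2 and D3, and the 5′ part of D4; leaving it out makes `restores_intron` return False, which mirrors the need for a third RNA molecule in such introns.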

formation of domains D2 and D3 and partial domains D1 and D4, all of which are indicative of group II introns.(37) Plastome sequencing of the green alga S. obliquus revealed that the psaA gene is split into two exons, which are likewise ligated by a trans-splicing process (Fig. 3B). This discontinuous group II intron is located and interrupted within domain D4 at the same position as the second trans-spliced group II intron in the psaA gene of C. reinhardtii.(31)

The large number of so far characterised trans-spliced RNAs allows a comparison of the sites of fragmentation within the conserved group II intron structure. With the exception of domains D5 and D6 (Fig. 2), of which domain D5 shows the most conserved sequence similarity within all group II introns, all other domains can be fragmented as listed in Table 1. As described in the next section, this list of multipartite chloroplast genes can be extended by a number of mitochondrial genes showing similar fragmented group II intron structures (Table 2).

Mitochondrial trans-splicing

Mitochondrial genomes (chondriomes) of eukaryotes show a great variation in size, ranging from about 15 kb in Metazoans and a few algae to about 2 000 kb in species of the Cucurbitaceae.(38) Chondriomes that are larger than 200 kb are almost exclusively found in vascular plants with the exception of the protist species ichthyosporean Amoebidium parasiticum with a chondriome size of >200 kb.(39) The size difference of plant chondriomes compared to other eukaryotic chondriomes is mostly due to the presence of an additional set of genes, promiscuous DNAs of nuclear or plastid origin, repetitive DNAs, and numerous group I or group II introns.(40,41)

Our analysis of 59 sequenced algal and plant chondriomes identified 19 genomes with genes whose pre-mRNA is predicted to be trans-spliced (Table S2). Soon after the discovery of several trans-spliced group II introns in chloroplasts, mainly DNA sequencing work led to the detection of split group II introns in a range of mitochondria. Phylogenetic analyses demonstrated that group II introns of plant chondriomes can be distinguished from those found in the chloroplast genome.(17,42) In addition, sequence analyses revealed that many mitochondrial group II introns of flowering plants, as compared with bacterial and chloroplast introns, show variations in the sequence, structure and/or length of typical group II introns.(43)

Most group II introns in higher plant mitochondria are processed by cis-splicing; however, a distinct set of transcripts, encoding subunits of the NADH dehydrogenase complex, are trans-spliced. PCR and phylogenetic analyses of cis-homologue introns in early branching land plants such


Table 2. Distribution of discontinuous mitochondrial introns from selected organisms.

Gene | Intron | Splicing type | Intron type [a], fragmented domain | Organism [b]
cox1 | cox1-i1 | trans | – | Diplonema papillatum, Emiliania huxleyi(54,102)
cox3 | cox3-i1 | trans | – | Karlodinium micrum(103)
nad1 | nad1-i1 | trans | IIB, bi, D4 | Arabidopsis thaliana, Brassica napus, Oenothera berteriana, Petunia hybrida, Triticum aestivum, Vicia faba, Zea mays(45–47,104–107)
 | nad1-i2 | cis | bi |
 | nad1-i3 | trans | IIB, bi, D4 |
 | nad1-i4 | cis | bi | Arabidopsis thaliana, Brassica napus, Oenothera berteriana, Vicia faba(47,105–107)
 | | trans | IIB, bi, D4 | Petunia hybrida, Triticum aestivum, Zea mays(45,46,104)
nad2 | nad2-i1 | cis | IIA, bi | Arabidopsis thaliana, Brassica napus, Oenothera berteriana, Triticum aestivum, Zea mays (sterile line)(48,51,108,109)
 | nad2-i2 | trans | IIA, bi, D4 |
 | nad2-i3 | cis | IIA, bi |
 | nad2-i4 | cis | IIA, bi |
nad3 | nad3-i1 | trans | IIA, bi, D4 | Mesostigma viride(110)
 | nad3-i2 | trans | IIA, bi, D4 |
nad5 | nad5-i1 | cis | IIA, bi | Arabidopsis thaliana, Brassica napus, Oenothera berteriana, Triticum aestivum, Vicia faba, Zea mays(49,105,111,112)
 | nad5-i2 | trans | IIA, bi, D4 |
 | nad5-i3 | trans | IIB, bi, D4 | Arabidopsis thaliana, Brassica napus, Triticum aestivum, Vicia faba, Zea mays(49,105,111,112)
 | | trans | IIB, tri, D4 | Oenothera berteriana(49)
 | nad5-i4 | cis | IIA, bi | Arabidopsis thaliana, Brassica napus, Oenothera berteriana, Triticum aestivum, Vicia faba, Zea mays(49,105,111,112)

This list contains examples that have been thoroughly analysed by cDNA and/or Northern or sequence analyses. A complete list of all organelle introns predicted to be trans-spliced is given in the supplemental material (Table S2).
Abbreviations and gene products: bi, bipartite intron; cox1, cox3, subunits of cytochrome c oxidase; D4, domain D4 of a group II intron; nad1, nad2, nad3, nad5, subunits of NADH dehydrogenase complex; n.d., not determined; tri, tripartite intron.
[a] The prediction of the secondary structure as well as the classification into the subclasses IIA or IIB are based on sequence analyses, according to the models of Michel(6) and Michel and Ferat.(42)
[b] Accession numbers are given in the supplemental material (Table S2).

as ferns, horsetails, hornworts and mosses have suggested that trans-spliced introns might have evolved from originally cis-arranged continuous exon–intron structures by disruption due to DNA rearrangements.(44) These genes include nad1,(45–47) nad2(48) and nad5(49) (see Table 2 and Fig. 3C) and recent sequencing of the first mitochondrial genome of a gymnosperm, the cycad Cycas taitungensis, revealed trans-spliced group II introns within the homologous genes.(50) Similar to their chloroplast counterparts, these mosaic genes contain introns encoding cis- or trans-spliced primary transcripts that are flanked by sequences showing features of group II introns (Fig. 3C).(6,34)

The genomic organisation, e.g. the exon/intron boundaries as well as the high degree of sequence identity, is conserved in different organisms. For instance, the intron nad2-i2 is split at the same position in angiosperms and shows 98% sequence identity in exons of Arabidopsis, Brassica, Oenothera and Triticum.(51)

Another conserved example is the third intron of the nad5 gene, which is trans-spliced in all angiosperms investigated. However, this intron can have a bipartite or a tripartite organisation. In O. berteriana, sequence analyses of the tripartite organisation showed that an intronic region downstream of exon 3 is missing, which is encoded by a distant genomic region named tix locus (trans-splicing intron fragment; Fig. 3C).(52) This tripartite structure is reminiscent of intron 1 of the chloroplast psaA RNA from C. reinhardtii that requires the tscA RNA in order to form the correct secondary structure.(36) Finally, despite their sequence dissimilarities, both tix and tscA show a highly conserved secondary structure with fragmentation sites in domains D1 and D4 at homologous sites.(52)

An unusual trans-splicing mechanism was predicted in both the dinoflagellate Karlodinium micrum (Alveolata) (cox3)(53) and the diplonemid Diplonema papillatum (Euglenozoa) (cox1).(54) In D. papillatum, a member of diplonemids, which are a sister group of kinetoplastids, a fragmented cox1 gene encoded on two different chromosomes was found. Interestingly, the flanking regions do not exhibit any characteristics of organelle or nuclear introns nor contain


conserved sequences adjacent to coding regions, and therefore, this trans-splicing mechanism can be predicted to be different from those processes described above for group II introns.(55) The second remarkable example is the bipartite cox3 gene (cytochrome c oxidase subunit 3) from the dinoflagellate K. micrum. Similar to the example mentioned above, no evidence of flanking group II introns was found. Instead, numerous inverted repeats in the intergenic sequences, which might form secondary structures, led to the assumption that they play a role in splicing. At the splice site, five adenine nucleotides are found that seem to be derived from the polyA-tail of the 5′-upstream fragment. Therefore, ligation of exonic sequences seems to occur without involvement of group II intron sequences, and the exact mechanism of the splicing process has still to be resolved.(56)

Trans-acting factors

Although some group II introns exhibit autocatalytic splicing activity in vitro (see Glossary), both cis- and trans-splicing introns require cofactors for efficient splicing in vivo.(57) In principle, factors encoded by organelle or nuclear genomes can be distinguished, and most of our current knowledge stems from work with mutants having a defect in RNA splicing.(34,58) The organelle-encoded components can be differentiated into RNA and protein factors (Table 3). As already mentioned above, the tscA RNA from algal chloroplasts and the tix RNA from plant mitochondria are the only organelle-encoded RNA factors so far known to support the splicing process in trans.(36,52) As described in the next chapter, the tscA RNA is most probably part of an organelle spliceosome that, similar to the nuclear spliceosome, contains protein as well as RNA components.(34)

Maturases are highly conserved organelle-encoded proteins, and are usually encoded in domain D4 of some of the characterised group II introns. These enzymes catalyse the excision of non-autocatalytic introns, e.g. the excision of the intron from its own primary transcript, and together with the intron RNA, they form a ribonucleoprotein (RNP) complex.(59,60) Moreover, maturases have reverse transcriptase activity, mediating the integration of their 'mobile' introns into new DNA sites (see Glossary).(61) Functional maturases encoded by bacterial introns were shown to promote splicing, e.g. of the group II intron Ll.LtrB from Lactococcus lactis.(62) Recently, the Ll.LtrB intron was used as a model system to study group II intron trans-splicing in bacteria. A highly sensitive splicing/conjugation assay was developed and it was demonstrated that assembly and trans-splicing of a fragmented group II intron can efficiently take place in bacterial cells. The authors mimicked naturally occurring fragmentation sites, e.g. the site in domain D1 of psaA, and further showed that the Ll.LtrB intron-encoded maturase LtrA is essential for trans-splicing.(63,64)

Nuclear-encoded trans-acting factors are the second class of components that are able to compensate for the loss of autocatalytic splicing activity in organelle introns (Table S3). It is generally accepted that mitochondria and chloroplasts are the result of an endosymbiosis of α-proteobacteria-like and

Table 3. Examples of nuclear-encoded factors controlling trans-splicing of group II introns.

Affected RNA | Gene | Organism | Local. | Function | Sequence homology
nad1 | OTP43 | A. thaliana(78) | mt | trans-splicing of nad1 intron 1 | PPR protein
psaA | Raa1 | C. reinhardtii(80) | cp, m | trans-splicing of psaA (class B, 3′ end processing of tscA RNA; group IIB) | n.d. (possible PPR protein)
 | Raa2 | C. reinhardtii(92) | cp, LDM | trans-splicing of psaA (class A; group IIB) | Pseudouridine synthases
 | Raa3 | C. reinhardtii(89) | cp, s + m | trans-splicing of psaA (class C; group IIB) | Pyridoxamine 5′-phosphate oxidases
 | Rat1 | C. reinhardtii(88) | cp, m | trans-splicing of psaA (class C, 3′ end processing of tscA RNA; group IIB) | NAD+-binding domain of poly(ADP-ribose) polymerases
 | Rat2 | C. reinhardtii(88) | n.b. | trans-splicing of psaA (class C, 3′ end processing of tscA RNA; group IIB) | Domain of a putative RNA-binding protein of Synechococcus spec. WH8102
rps12 | ppr4 | Z. mays(79) | cp, s | trans-splicing of rps12 (intron 1), biogenesis of ribosomes | PPR protein

An extended list of trans-acting factors is given in the supplemental material (Table S3). Abbreviations: cp, chloroplast; LDM, low-density membrane; local., localisation of gene products; m, membrane; mt, mitochondrion; n.d., not determined; OTP, organelle transcript processing defect; ppr, pentatricopeptide repeat; Raa, RNA maturation of psaA; Rat, RNA maturation of psaA tscA RNA; s, stroma.


Figure 3. Organisation of selected chloroplast and mitochondrial genes with trans-spliced RNAs. A: Two possible genomic organisations of rps12 genes from different sources. The bottom example is representative for higher plants with duplicated 3′-exons in the inverted repeat (IR) regions IRa and IRb and a 5′-exon in the large single copy region (LSC). B: Examples of chloroplast genes from diverse algae. C: Schemes of mitochondrial loci from the five exons of nad1 and nad5. The transcripts of nad1 and nad5 are polycistronic (data not shown).(96) The genome of O. berteriana was not yet completely sequenced. Exons are represented by black boxes with their corresponding size in base pairs. Arrows indicate direction of transcription. Cis-spliced introns are depicted as red boxes and trans-spliced introns are marked in yellow. Black double slashes indicate split gene fragments, which are separately transcribed. Distances in kb were determined from a clockwise orientation of the chloroplast genomes. Abbreviations of species are as follows: C.v., Chara vulgaris (NC_008097); N.t., Nicotiana tabacum (NC_001879); O.b., Oenothera berteriana (X07566, X60046, X60049, X99516); S.o., Scenedesmus obliquus (NC_008101); S.h., Stigeoclonium helveticum (NC_008372); T.a., Triticum aestivum (NC_007579). Abbreviated gene designations are explained in the legends of Tables 1 and 2.

cyanobacteria-like prokaryotes, respectively. This process is accompanied by the relocation of a major part of the prokaryotic genomes into the chromosome of the host cell. As a consequence, the nuclear-encoded organelle proteins have to be retargeted to their ancestral compartments.(65,66) In Fig. 4, this situation is depicted for the chloroplast of the unicellular green alga C. reinhardtii. RNA-processing, translation, as well as assembly of membrane or membrane-associated complexes, is dependent on both organelle- and nuclear-encoded proteins. The latter are translated on cytosolic ribosomes and will be transported through the chloroplast membranes into the inner space of the organelle.(67–69) Table 3 summarises nuclear-encoded proteins involved in trans-splicing of group II introns together with the functionally characterised gene products. When known, we also give the homology of the proteins to other factors as well as their subcellular localisation.

While some of these trans-acting factors seem to be specific for only a single intron, other factors are involved in splicing of a set of introns. Moreover, some of these nuclear-encoded proteins have acquired, in addition to their catalytic role, further organelle functions during splicing. It is therefore assumed that during evolution some of these nuclear-encoded factors were adapted to the binding of intron structures, thus playing a role in the splicing process.(22)

The trans-acting factors involved in splicing of a set of introns can be divided into three groups. The first group of


nuclear-encoded factors comprises enzymes involved in RNA maturation processes, and recently, new members of this group with homologies to mitochondrial maturases were detected. Four genes for group II intron maturases, nMat-1a, nMat-1b, nMat-2a and nMat-2b, were identified in the nuclear genomes of both A. thaliana and Oryza sativa. Interestingly, these maturase-like proteins are not intron-encoded. The predicted mature proteins show homology to mitochondrial counterparts and contain putative mitochondrial import sequences. It is assumed that they were transferred during evolution from the chondriome to the nuclear genome and may have retained their role in splicing of mitochondrial group II introns, as was functionally demonstrated with an A. thaliana mutant analysis.(70,71) Nuclear-encoded RNA-editing factors also play a role in splicing. RNA editing in plant organelles is mediated by specific nuclear-encoded factors(72) and is essential for the formation and stabilisation of splicing-competent primary and secondary structures of several mitochondrial group II introns.(48,51) Almost all protein-coding transcripts as well as some introns and tRNAs in mitochondria of higher plants are edited. The editing event comprises mostly C-to-U substitutions.(73) For example, fusion of the trans-splicing intron nad1-i3 from O. berteriana with sequences from an autocatalytic splicing intron from yeast revealed that unedited intron sequences are not able to form a functional, splicing-competent group II intron structure.(74)

Figure 4. Dependence of chloroplast biogenesis on nuclear-encoded factors. During chloroplast biogenesis, coordination of gene expression is achieved by nuclear-encoded factors that affect RNA processing, translation, and also assembly of complexes. Chloroplast multisubunit complexes are thus formed by both nuclear- (arrows) and chloroplast- (dashed arrows) encoded polypeptides.

The second group of nuclear-encoded factors exhibit repeated motifs of 34–38 amino acids, e.g. like in pentatricopeptide repeat (PPR) proteins. The PPR protein family is characterised by tandem repeats of a motif consisting of a degenerate 35 amino acid repeat. Several PPR proteins are encoded in the genomes of animals, fungi and trypanosomes,(75) and most of these proteins are found in genomes of higher plants.(76) Already characterised PPR proteins show RNA-binding features and affect the processing or the translation of specific RNA molecules in mitochondria and chloroplasts.(77) Recently, the nuclear OTP43 (organelle transcript processing defect) gene was shown to be specifically required for trans-splicing of mitochondrial nad1-i1 in A. thaliana.(78) Another example is PPR4 of Zea mays, which is responsible for trans-splicing of the first intron of the chloroplast rps12 RNA by directly binding to this intron.(79) Finally, Raa1 (RNA maturation of psaA) from C. reinhardtii, which is involved in the trans-splicing process of both psaA introns, also harbours tandem repeats similar to those found in PPR proteins.(80)

The third and final group of nuclear-encoded factors represent proteins that cannot be assigned to any classified function. CRS1 (chloroplast RNA splicing 1) from Z. mays, for example, contains a new RNA-binding domain, the CRM domain (chloroplast RNA splicing and ribosome maturation), which is also found in archaeal and bacterial proteins, involved in the maturation of ribosomes.(81) This CRM domain is likewise found in CRS2, another splicing factor from maize.(82) Moreover, the trans-acting factors in Z. mays are known to be part of a high-molecular-weight ribonucleoprotein complex that also contains spliced intron sequences.(79) Similar data are available from C. reinhardtii, which in recent years was the subject of intense mutational analyses all of which show a defect in trans-splicing of the psaA RNA.(80,83) As detailed below, these studies led to the notion of a complex chloroplast spliceosome being involved in the splicing process.

Putative chloroplast spliceosome of the model alga C. reinhardtii

Trans-splicing of the psaA RNA from C. reinhardtii requires a plastid-encoded tscA RNA and at least 14 nuclear-encoded chloroplast factors.(36,84) The corresponding nuclear mutants are grouped into three classes according to their mode of

Figure 5. Trans-splicing of the chloroplast psaA RNA of C. reinhardtii. The psaA gene is fragmented into three independently transcribed exons, which are flanked by consensus sequences of group II introns (green wavy lines). To generate a mature psaA mRNA, two trans-splicing steps are necessary. For the formation of the first group II intron, a small chloroplast-encoded RNA (tscA) is required that interacts with precursor transcripts. The tscA RNA is co-transcribed with chlN and is the subject of various 3′ end processing events. Ovals represent nuclear mutant classes, which are affected in different steps of the trans-splicing process. Colours indicate class A, B and C mutants, respectively. The first group II intron is labelled as denoted in Fig. 2. Arrowheads indicate the sites of fragmentation. Abbreviations: EBS2, exon binding site 2; IBS2, intron binding site 2.

action (Fig. 5): class A mutants fail to trans-splice exon 2 and 3 primary transcripts; class B mutants neither splice exon 1 and 2 nor exon 2 and 3 primary transcripts; and class C mutants are not able to splice exon 1 and 2 primary transcripts.(85,86) Lack of correct splicing in class B and C mutants can take place at two different levels of RNA processing, i.e. either splicing of the primary psaA transcripts or 3′-end processing of the tscA RNA is affected. Of note is that maturation of the tscA precursor is a prerequisite for correct splicing of exon 1 and 2. To date, five trans-acting factors were characterised in C. reinhardtii with three belonging to class C factors (Table 3).

Based on cofractionations and sucrose density gradient centrifugations using protein extracts of wild-type or splicing-deficient mutants, Rochaix and co-workers proposed at least three protein complexes, two of which are associated with chloroplast RNAs. Thus, the latter two can be considered as chloroplast RNP (cpRNP) complexes that might be part of the chloroplast spliceosome. This concept of a spliceosome-like complex was further supported by reciprocal coimmunoprecipitations, showing that either protein of the complex can be immunoprecipitated with the other. However, these data do not yet answer the question whether the identified splicing factors interact with each other.(80,87)

Figure 6 provides a current model of chloroplast spliceosomal complexes and their action in splicing and integrates data from several experimental approaches. The tscA RNA participating in the formation of the secondary group II intron structure is involved in splicing of the first group II intron (psaA-i1).(36) At least three factors are involved in the processing of the precursor molecule containing the tscA RNA. As mentioned above, Raa1 is related to PPR proteins and is a factor involved in tscA RNA maturation. It contains two distinct domains of which the C-terminal domain is involved in processing of the tscA RNA, and the central domain in splicing of intron 2. The function of both domains was deciphered when different truncated versions of the Raa1 gene were used in restoration experiments analysing two different mutants.(80)
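The grouping of nuclear mutants into classes A, B and C (Fig. 5) amounts to a small decision table over the two trans-splicing steps. The following Python sketch only illustrates that logic; the function name, the boolean inputs and the `None` return for an unaffected strain are assumptions made for this example and are not taken from the cited studies.

```python
from enum import Enum
from typing import Optional


class MutantClass(Enum):
    """psaA trans-splicing mutant classes as defined in the text."""
    A = "A"  # exon 2 and 3 transcripts are not trans-spliced
    B = "B"  # neither exon 1/2 nor exon 2/3 transcripts are spliced
    C = "C"  # exon 1 and 2 transcripts are not trans-spliced


def classify_mutant(splices_exon1_2: bool,
                    splices_exon2_3: bool) -> Optional[MutantClass]:
    """Map the observed splicing phenotype to a mutant class.

    Hypothetical helper: returns None when both steps proceed,
    i.e. when the strain shows no trans-splicing defect.
    """
    if splices_exon1_2 and splices_exon2_3:
        return None
    if splices_exon1_2:
        return MutantClass.A  # only the second step fails
    if splices_exon2_3:
        return MutantClass.C  # only the first step fails
    return MutantClass.B      # both steps fail


if __name__ == "__main__":
    print(classify_mutant(splices_exon1_2=True, splices_exon2_3=False))
```

Note that the sketch does not distinguish whether a class B or C defect arises from impaired splicing itself or from impaired 3′-end processing of the tscA RNA; that distinction requires additional data.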

Figure 6. Model of chloroplast psaA RNA trans-splicing complexes in C. reinhardtii. The scheme integrates data from several groups of investigators as described in the text. Depicted are proteins and protein complexes required for trans-splicing of three psaA mRNA precursors. The two secondary structures resemble the folding of 5′- and 3′-intronic RNAs flanking exon 1, 2 and 3 sequences modified after Goldschmidt-Clermont et al.(84) For details of the secondary structure from the first intron see Fig. 5. Abbreviations: chlN, subunit of the light-independent protochlorophyllide reductase; Cpn60, chaperonine 60; cNAPL, chloroplast nucleosome assembly protein-like; pL118B and pL137H, class B factors, and pL121G, class A factor, which are defined genetically;(87) Raa1-6, RNA maturation of psaA; Rat1-3, RNA maturation of psaA tscA RNA; tscA, trans-splicing chloroplast. Abbreviations are as described in the legend of Fig. 2, and see also the text for further details.

Rat1 and Rat2 (RNA maturation of psaA tscA RNA), both of which are encoded by two adjacently located nuclear genes, are also part of the maturation process. Interestingly, only when both genes are simultaneously transferred into the corresponding splicing-deficient mutant, they are able to restore the wild-type phenotype. The deduced amino acid sequence of Rat1, which directly interacts with tscA RNA, shows 26% sequence homology to the conserved NAD+-binding domain of poly(ADP-ribose) polymerases (PARP).(88) All proteins involved in processing of the tscA precursor are associated with the thylakoid membrane. The processed tscA RNA is also associated with a stromal 1700 kDa protein complex that additionally contains the exon 1 primary transcript with its 5′-intron. A component of this protein complex is Raa3, showing homologies to pyridoxamine 5′-phosphate oxidases. The cofractionation of these two RNAs together with Raa3 was shown by size exclusion chromatography.(89) Recently, another factor (Raa4) that shares a small protein domain with tRNA synthetases was shown to be involved in splicing of the first group II intron, and it remains to be determined whether it is also part of a high-molecular-weight complex (Glanz, unpublished).

A biochemical approach including UV-crosslinking experiments, yeast three-hybrid analysis and mass spectrometry identified three further chloroplast proteins with a more general affinity to group II introns. These include a 31 kDa protein with a 39% sequence homology to the NAD+-binding domain of 6-phosphogluconate dehydrogenases, Cpn60, a bacterial homologue of GroEL ATPases, and a chloroplast-localised cNAPL protein, showing high similarity to nucleosome assembly proteins.(83,90,91)

For splicing of the second intron, apparently two membrane-associated complexes are involved (Fig. 6). The first is the 670 kDa complex containing the above-mentioned Raa1, together with so far uncharacterised RNA molecules and protein factors. The second is a 400–500 kDa multiprotein Raa1/Raa2 complex, which is probably not associated with RNA, since no direct interaction with psaA-i2 or solubilised chloroplast extracts in vitro was detected.(87) The Raa2 polypeptide contains conserved motifs with significant

sequence similarity to two domains of pseudouridine synthases; however, this enzyme activity is not a prerequisite for trans-splicing. Therefore, Raa2 was speculated to be a bifunctional protein acting in pseudouridination as well as trans-splicing.(92) Moreover, it was suggested that this complex represents a pre-spliceosome, which is assembled and/or stabilised via three genetically defined factors. It was discussed that this complex has an indirect role in recognition and assembly of primary exon 2 and 3 RNAs, and thus this complex may be involved in the storage of trans-splicing factors. Finally, upon gene activation, this complex may specifically be redistributed to the site of transcription.(87,92,93)

The spatial separation of the complexes into membrane and stromal chloroplast fractions indicates that they may act in different modes and at different steps in the psaA trans-splicing process. The first reaction probably takes place in the stromal phase, whereas the second reaction is associated with the membrane. It can be further speculated that membranous splicing of the second intron is coupled with the translation and integration of the psaA protein into the thylakoid membrane system.(87)

Although no homologues of C. reinhardtii factors have been identified in higher plant chloroplasts (see Table 3), it may be envisioned that proteins promoting trans-splicing act as RNA chaperones and stabilise or support the correct folding of intron structures. Alternatively, they may mediate splicing indirectly by interaction with other protein factors. Indeed, cpRNPs in tobacco were shown to act as stabilising factors for a number of non-ribosome-bound stromal chloroplast mRNAs.(94) Even though the exact functions of so far characterised factors in the trans-splicing process have to be elucidated, the presented high-molecular RNP complexes and splicing factors provide a basis for the isolation and characterisation of further trans-splicing factors and for the analyses of their general functional role in an organelle spliceosome.

Conclusions

Trans-splicing of discontinuous group II introns is a phenomenon that occurs in a huge number of organelles from plants and diverse lower eukaryotes. We provide a complete survey of 187 organelle trans-spliced introns that were predicted from the complete sequencing data of 179 organelle genomes. Furthermore, a summary of trans-splicing factors that are supposed to promote group II intron splicing is given. Genetic and biochemical data from splicing-deficient mutants support the assumption that intron trans-splicing is promoted by a set of trans-acting factors as part of high-molecular-weight complexes. The presented model predicts that these complexes may be involved in the storage of trans-splicing factors and can specifically be redistributed to the site of transcription. A spatial separation of the complex in membrane and non-membrane fractions indicates further different modes of action during the trans-splicing process.

Supporting Information available online

Acknowledgments: The experimental work of the authors is supported by the Deutsche Forschungsgemeinschaft (SFB480, B3).

Glossary

Autocatalytic splicing: Self-splicing of group I, II and III introns in vitro under non-physiological reaction conditions in the absence of protein factors. These introns are referred to as ribozymes (ribonucleic acid enzymes) and can catalyse their own cleavage or the cleavage of other RNAs. However, efficient in vivo splicing almost always requires the assistance of a catalytic enzyme, RNA molecules and/or other protein factors that are either encoded by the nucleus or the intron itself (maturases).

Mobile group II introns: Mobile group II introns are found in bacterial and organelle genomes. They are both catalytic RNAs and retrotransposable elements with an intron-encoded protein that has reverse transcriptase activity. Group II introns can transpose with high efficiencies (retrohoming) into defined sites or can invade at ectopic sites (retrotransposition).

Nuclear spliceosome: Nuclear pre-mRNA introns are not able to splice autocatalytically without the assistance of trans-acting RNA or protein factors. Eukaryotic pre-mRNA splicing takes place in the spliceosome, a ribonucleoprotein (RNP) complex of 60S that assembles from the five U-rich small nuclear ribonucleoproteins (snRNPs) U1, U2, U4, U5 and U6, which are temporarily associated with more than 70 proteins such as RNA helicases and SR proteins. For accurate spliceosome assembly, a range of dynamic protein-protein, RNA-protein and RNA-RNA interactions are required.

Splicing mechanism of group II introns: Splicing occurs via two sequential transesterification reactions. First, the 2′-OH of a specific branch point nucleotide within the intron performs a nucleophilic attack on the first nucleotide of the intron at the 5′-splice site forming the lariat intermediate. Second, the 3′-OH of the released 5′-exon performs a nucleophilic attack at the last nucleotide of the intron at the 3′-splice site, thereby joining the exons and releasing the intron lariat.
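The two sequential transesterification reactions described in the Glossary entry on the splicing mechanism of group II introns can be followed as a simple bookkeeping exercise on sequence segments. The Python sketch below is a toy model under stated assumptions: the `Precursor` type, the function names and the short sequences are hypothetical and chosen only for illustration, and no chemistry, branch point position or intron folding is modelled.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Precursor:
    """Toy precursor RNA: 5' exon, intron and 3' exon as plain strings."""
    exon5: str
    intron: str
    exon3: str


def first_transesterification(pre: Precursor) -> Tuple[str, Tuple[str, str]]:
    """Step 1: the branch point 2'-OH attacks the 5' splice site.

    The 5' exon is released; the intron (now a lariat) stays joined
    to the 3' exon as the lariat intermediate.
    """
    return pre.exon5, (pre.intron, pre.exon3)


def second_transesterification(free_exon5: str,
                               intermediate: Tuple[str, str]) -> Tuple[str, str]:
    """Step 2: the 3'-OH of the free 5' exon attacks the 3' splice site.

    The exons are joined and the intron lariat is released.
    """
    intron_lariat, exon3 = intermediate
    return free_exon5 + exon3, intron_lariat


def splice(pre: Precursor) -> Tuple[str, str]:
    """Run both steps and return (ligated exons, released intron)."""
    free_exon5, intermediate = first_transesterification(pre)
    return second_transesterification(free_exon5, intermediate)


if __name__ == "__main__":
    # Arbitrary illustrative sequences, not real psaA sequences.
    mature, lariat = splice(Precursor("AUGGC", "GUGCGACUAU", "CCUAA"))
    print(mature, lariat)
```

In trans-splicing the intron halves reside on separate transcripts that first associate to form the intron structure; in this toy model that would correspond to assembling the `intron` field from two molecules before calling `splice`.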

References

1. Berget, S. M., Moore, C. and Sharp, P. A., Spliced segments at the 5′ terminus of adenovirus 2 late mRNA. Proc Natl Acad Sci USA 1977. 74: 3171–3175.
2. Breathnach, R., Mandel, J. L. and Chambon, P., Ovalbumin gene is split in chicken DNA. Nature 1977. 270: 314–319.
3. Chow, L. T., Gelinas, R. E., Broker, T. R. and Roberts, R. J., An amazing sequence arrangement at the 5′ ends of adenovirus 2 messenger RNA. Cell 1977. 12: 1–8.
4. Haugen, P., Simon, D. M. and Bhattacharya, D., The natural history of group I introns. Trends Genet 2005. 21: 111–119.
5. Cavalier-Smith, T., Intron phylogeny: a new hypothesis. Trends Genet 1991. 7: 145–148.
6. Michel, F., Umesono, K. and Ozeki, H., Comparative and functional anatomy of group II catalytic introns - a review. Gene 1989. 82: 5–30.
7. Toro, N., Bacteria and archaea group II introns: additional mobile genetic elements in the environment. Environ Microbiol 2003. 5: 143–151.
8. Vallès, Y., Halanych, K. M. and Boore, J. L., Group II introns break new boundaries: presence in a bilaterian’s genome. PLoS ONE 2008. 3: e1488.
9. Hastings, K. E. M., SL trans-splicing: easy come or easy go? Trends Genet 2005. 21: 240–247.
10. Liang, X. H., Haritan, A., Uliel, S. and Michaeli, S., trans and cis splicing in trypanosomatids: mechanism, factors, and regulation. Eukaryotic Cell 2003. 2: 830–840.
11. Stover, N. A., Kaye, M. S. and Cavalcanti, A. R. O., Spliced leader trans-splicing. Curr Biol 2006. 16: R8–R9.
12. Zhang, H., Hou, Y., Miranda, L., Campbell, D. A., Sturm, N. R., et al. Spliced leader RNA trans-splicing in dinoflagellates. Proc Natl Acad Sci USA 2007. 104: 4618–4623.
13. Blumenthal T., 2005. Trans-splicing and operons. WormBook, ed. The C. elegans Research Community, WormBook, http://www.wormbook.org.
14. Horiuchi, T. and Aigaki, T., Alternative trans-splicing: a novel mode of pre-mRNA processing. Biol Cell 2006. 98: 135–140.
15. Dorn, R., Reuter, G. and Loewendorf, A., Transgene analysis proves mRNA trans-splicing at the complex mod(mdg4) locus in Drosophila. Proc Natl Acad Sci USA 2001. 98: 9724–9729.
16. Pirrotta, V., trans-splicing in Drosophila. BioEssays 2002. 24: 988–991.
17. Toor, N., Hausner, G. and Zimmerly, S., Coevolution of group II intron RNA structures with their intron-encoded reverse transcriptases. RNA 2001. 7: 1142–1152.
18. Fedorova, O. and Zingler, N., Group II introns: structure, folding and splicing mechanism. Biol Chem 2007. 388: 665–678.
19. Pyle, A. M., Fedorova, O. and Waldsich, C., Folding of group II introns: a model system for large, multidomain RNAs? Trends Biochem Sci 2007. 32: 138–145.
20. Valadkhan, S., The spliceosome: a ribozyme at heart? Biol Chem 2007. 388: 693–697.
21. Nilsen, T. W., The spliceosome: the most complex macromolecular machine in the cell? Bioessays 2003. 25: 1147–1149.
22. Lehmann, K. and Schmidt, U., Group II introns: Structure and catalytic versatility of large natural ribozymes. Crit Rev Biochem Mol 2003. 38: 249–303.
23. Fromm, H., Edelman, M., Koller, B., Goloubinoff, P. and Galun, E., The enigma of the gene coding for ribosomal protein S12 in the chloroplasts of Nicotiana. Nucleic Acids Res 1986. 14: 883–898.
24. Fukuzawa, H., Kohchi, T., Shirai, H., Ohyama, K., Umesono, K., et al. Coding sequences for chloroplast ribosomal protein S12 from the liverwort, Marchantia polymorpha, are separated far apart on the different DNA strands. FEBS Lett 1986. 198: 11–15.
25. Kück, U., Choquet, Y., Schneider, M., Dron, M. and Bennoun, P., Structural and transcriptional analysis of two homologous genes for the P700 chlorophyll a-apoproteins in Chlamydomonas reinhardtii: evidence for in vivo trans-splicing. EMBO J 1987. 6: 2185–2195.
26. Torazawa, K., Hayashida, N., Obokata, J., Shinozaki, K. and Sugiura, M., The 5′ part of the gene for ribosomal protein S12 is located 30 kb downstream from its 3′ part in tobacco chloroplast genome. Nucleic Acids Res 1986. 14: 3143–3143.
27. Hildebrand, M., Hallick, R. B., Passavant, C. W. and Bourque, D. P., Trans-splicing in chloroplasts: the rps12 loci of Nicotiana tabacum. Proc Natl Acad Sci USA 1988. 85: 372–376.
28. Koller, B., Fromm, H., Galun, E. and Edelman, M., Evidence for in vivo trans-splicing of pre-mRNAs in tobacco chloroplasts. Cell 1987. 48: 111–119.
29. Bélanger, A. S., Brouard, J. S., Charlebois, P., Otis, C., Lemieux, C. and Turmel, M., Distinctive architecture of the chloroplast genome in the chlorophycean green alga Stigeoclonium helveticum. Mol Genet Genomics 2006. 276: 464–477.
30. Brouard, J. S., Otis, C., Lemieux, C. and Turmel, M., Chloroplast DNA sequence of the green alga Oedogonium cardiacum (Chlorophyceae): unique genome architecture, derived characters shared with the Chaetophorales and novel genes acquired through horizontal transfer. BMC Genomics 2008. 9: 290.
31. de Cambiaire, J. C., Otis, C., Lemieux, C. and Turmel, M., The complete chloroplast genome sequence of the chlorophycean green alga Scenedesmus obliquus reveals a compact gene organization and a biased distribution of genes on the two DNA strands. BMC Evol Biol 2006. 6: 37.
32. Ogihara, Y., Isono, K., Kojima, T., Endo, A., Hanaoka, M., et al. Structural features of a wheat plastome as revealed by complete sequencing of chloroplast DNA. Mol Genet Genomics 2002. 266: 740–746.
33. Richaud, C. and Zabulon, G., The heme oxygenase gene (pbsA) in the red alga Rhodella violacea is discontinuous and transcriptionally activated during iron limitation. Proc Natl Acad Sci USA 1997. 94: 11736–11741.
34. Barkan, A. and Goldschmidt-Clermont, M., Participation of nuclear genes in chloroplast gene expression. Biochimie 2000. 82: 559–572.
35. Nickelsen, J. and Kück, U., The unicellular green alga Chlamydomonas reinhardtii as an experimental system to study chloroplast RNA metabolism. Naturwissenschaften 2000. 87: 97–107.
36. Goldschmidt-Clermont, M., Choquet, Y., Girard-Bascou, J., Michel, F., Schirmer-Rahire, M. and Rochaix, J. D., A small chloroplast RNA may be required for trans-splicing in Chlamydomonas reinhardtii. Cell 1991. 65: 135–143.
37. Turmel, M., Choquet, Y., Goldschmidt-Clermont, M., Rochaix, J. D., Otis, C. and Lemieux, C., The trans-spliced intron 1 in the psaA gene of the Chlamydomonas chloroplast: a comparative analysis. Curr Genet 1995. 27: 270–279.
38. Ward, B. L., Anderson, R. S. and Bendich, A. J., The mitochondrial genome is large and variable in a family of plants (Cucurbitaceae). Cell 1981. 25: 793–803.
39. Burger, G., Forget, L., Zhu, Y., Gray, M. W. and Lang, B. F., Unique mitochondrial genome architecture in unicellular relatives of animals. Proc Natl Acad Sci USA 2003. 100: 892–897.
40. Knoop, V., The mitochondrial DNA of land plants: peculiarities in phylogenetic perspective. Curr Genet 2004. 46: 123–139.
41. Kubo, T. and Mikami, T., Organization and variation of angiosperm mitochondrial genome. Physiol Plant 2007. 129: 6–13.
42. Michel, F. and Ferat, J. L., Structure and activities of group II introns. Annu Rev Biochem 1995. 64: 435–461.
43. Bonen, L., Cis- and trans-splicing of group II introns in plant mitochondria. Mitochondrion 2008. 8: 26–34.
44. Malek, O. and Knoop, V., Trans-splicing group II introns in plant mitochondria: The complete set of cis-arranged homologs in ferns, fern allies and a hornwort. RNA 1998. 4: 1599–1609.
45. Chapdelaine, Y. and Bonen, L., The wheat mitochondrial gene for subunit I of the NADH dehydrogenase complex: a trans-splicing model for this gene-in-pieces. Cell 1991. 65: 465–472.
46. Conklin, P. L., Wilson, R. K. and Hanson, M. R., Multiple trans-splicing events are required to produce a mature nad1 transcript in a plant mitochondrion. Genes Dev 1991. 5: 1407–1415.
47. Wissinger, B., Schuster, W. and Brennicke, A., Trans-splicing in Oenothera mitochondria: nad1 mRNAs are edited in exon and trans-splicing group II intron sequences. Cell 1991. 65: 473–482.

48. Binder, S., Marchfelder, A., Brennicke, A. and Wissinger, B., RNA editing in trans-splicing intron sequences of nad2 mRNAs in Oenothera mitochondria. J Biol Chem 1992. 267: 7615–7623.
49. Knoop, V., Schuster, W., Wissinger, B. and Brennicke, A., Trans-splicing integrates an exon of 22 nucleotides into the nad5 mRNA in higher plant mitochondria. EMBO J 1991. 10: 3483–3493.
50. Chaw, S. M., Shih, A. C. C., Wang, D., Wu, Y. W., Liu, S. M. and Chou, T. Y., The mitochondrial genome of the gymnosperm Cycas taitungensis contains a novel family of short interspersed elements, Bpu sequences, and abundant RNA editing sites. Mol Biol Evol 2008. 25: 603–615.
51. Morawala-Patell, V., Gualberto, J. M., Lamattina, L., Grienenberger, J. M. and Bonnard, G., Cis- and trans-splicing and RNA editing are required for the expression of nad2 in wheat mitochondria. Mol Gen Genet 1998. 258: 503–511.
52. Knoop, V., Altwasser, M. and Brennicke, A., A tripartite group II intron in mitochondria of an angiosperm plant. Mol Gen Genet 1997. 255: 269–276.
53. Waller, R. F. and Jackson, C. J., Dinoflagellate mitochondrial genomes: stretching the rules of molecular biology. Bioessays 2009. 31: 237–245.
54. Marande, W., Lukes, J. and Burger, G., Unique mitochondrial genome structure in diplonemids, the sister group of kinetoplastids. Eukaryotic Cell 2005. 4: 1137–1146.
55. Marande, W. and Burger, G., Mitochondrial DNA as a genomic jigsaw puzzle. Science 2007. 318: 415.
56. Nash, E. A., Nisbet, R. E., Barbrook, A. C. and Howe, C. J., Dinoflagellates: a mitochondrial genome all at sea. Trends Genet 2008. 24: 328–335.
57. Bonen, L. and Vogel, J., The ins and outs of group II introns. Trends Genet 2001. 17: 322–331.
58. Nickelsen, J., Chloroplast RNA-binding proteins. Curr Genet 2003. 43: 392–399.
59. Dai, L. X., Chai, D. G., Gu, S. Q., Gabel, J., Noskov, S. Y., et al. A three-dimensional model of a group II intron RNA and its interaction with the intron-encoded reverse transcriptase. Mol Cell 2008. 30: 472–485.
60. Rambo, R. P. and Doudna, J. A., Assembly of an active group II intron-maturase complex by protein dimerization. Biochemistry 2004. 43: 6486–6497.
61. Kelchner, S. A., Group II introns as phylogenetic tools: structure function, and evolutionary constraints. Am J Bot 2002. 89: 1651–1669.
62. Toro, N., Jiménez-Zurdo, J. I. and García-Rodríguez, F. M., Bacterial group II introns: not just splicing. FEMS Microbiol Rev 2007. 31: 342–358.
63. Belhocine, K., Mak, A. B. and Cousineau, B., Trans-splicing of the Ll.LtrB group II intron in Lactococcus lactis. Nucleic Acids Res 2007. 35: 2257–2268.
64. Belhocine, K., Mak, A. B. and Cousineau, B., Trans-splicing versatility of the Ll.LtrB group II intron. RNA 2008. 14: 1782–1790.
65. Bhattacharya, D., Archibald, J. M., Weber, A. P. M. and Reyes-Prieto, A., How do endosymbionts become organelles? Understanding early events in plastid evolution. Bioessays 2007. 29: 1239–1246.
66. Gould, S. B., Waller, R. R. and McFadden, G. I., Plastid evolution. Annu Rev Plant Biol 2008. 59: 491–517.
67. Kleine, T., Maier, U. G. and Leister, D., DNA transfer from organelles to the nucleus: the idiosyncratic genetics of endosymbiosis. Annu Rev Plant Biol 2008. 60: 115–138.
68. Bock, R. and Timmis, J. N., Reconstructing evolution: gene transfer from plastids to the nucleus. Bioessays 2008. 30: 556–566.
69. Patron, N. J. and Waller, R. F., Transit peptide diversity and divergence: a global analysis of plastid targeting signals. Bioessays 2007. 29: 1048–1058.
70. Mohr, G. and Lambowitz, A. M., Putative proteins related to group II intron reverse transcriptase/maturases are encoded by nuclear genes in higher plants. Nucleic Acids Res 2003. 31: 647–652.
71. Nakagawa, N. and Sakurai, N., A mutation in At-nMat1a, which encodes a nuclear gene having high similarity to group II intron maturase, causes impaired splicing of mitochondrial NAD4 transcript and altered carbon metabolism in Arabidopsis thaliana. Plant Cell Physiol 2006. 47: 772–783.
72. Tillich, M., Poltnigg, P., Kushnir, S. and Schmitz-Linneweber, C., Maintenance of plastid RNA editing activities independently of their target sites. EMBO Rep 2006. 7: 308–313.
73. Wissinger, B., Brennicke, A. and Schuster, W., Regenerating good sense - RNA editing and trans-splicing in plant mitochondria. Trends Genet 1992. 8: 322–328.
74. Börner, G. V., Mörl, M., Wissinger, B., Brennicke, A. and Schmelzer, C., RNA editing of a group II intron in Oenothera as a prerequisite for splicing. Mol Gen Genet 1995. 246: 739–744.
75. Mingler, M. K., Hingst, A. M., Clement, S. L., Yu, L. E., Reifur, L. and Koslowsky, D. J., Identification of pentatricopeptide repeat proteins in Trypanosoma brucei. Mol Biochem Parasitol 2006. 150: 37–45.
76. Lurin, C., Andres, C., Aubourg, S., Bellaoui, M., Bitton, F., et al. Genome-wide analysis of Arabidopsis pentatricopeptide repeat proteins reveals their essential role in organelle biogenesis. Plant Cell 2004. 16: 2089–2103.
77. Delannoy, E., Stanley, W. A., Bond, C. S. and Small, I. D., Pentatricopeptide repeat (PPR) proteins as sequence-specificity factors in post-transcriptional processes in organelles. Biochem Soc Trans 2007. 35: 1643–1647.
78. de Longevialle, A. F., Meyer, E. H., Andrés, C., Taylor, N. L., Lurin, C., et al. The pentatricopeptide repeat gene OTP43 is required for trans-splicing of the mitochondrial nad1 intron 1 in Arabidopsis thaliana. Plant Cell 2007. 19: 3256–3265.
79. Schmitz-Linneweber, C., Williams-Carrier, R. E., Williams-Voelker, P. M., Kroeger, T. S., Vichas, A. and Barkan, A., A pentatricopeptide repeat protein facilitates the trans-splicing of the maize chloroplast rps12 pre-mRNA. Plant Cell 2006. 18: 2650–2663.
80. Merendino, L., Perron, K., Rahire, M., Howald, I., Rochaix, J. D. and Goldschmidt-Clermont, M., A novel multifunctional factor involved in trans-splicing of chloroplast introns in Chlamydomonas. Nucleic Acids Res 2006. 34: 262–274.
81. Barkan, A., Klipcan, L., Ostersetzer, O., Kawamura, T., Asakura, Y. and Watkins, K. P., The CRM domain: An RNA binding module derived from an ancient ribosome-associated protein. RNA 2007. 13: 55–64.
82. Ostheimer, G. J., Rojas, M., Hadjivassiliou, H. and Barkan, A., Formation of the CRS2-CAF2 group II intron splicing complex is mediated by a 22-amino acid motif in the COOH-terminal region of CAF2. J Biol Chem 2006. 281: 4732–4738.
83. Bunse, A. A., Nickelsen, J. and Kück, U., Intron-specific RNA binding proteins in the chloroplast of the green alga Chlamydomonas reinhardtii. Biochim Biophys Acta 2001. 1519: 46–54.
84. Goldschmidt-Clermont, M., Girard-Bascou, J., Choquet, Y. and Rochaix, J. D., Trans-splicing mutants of Chlamydomonas reinhardtii. Mol Gen Genet 1990. 223: 417–425.
85. Choquet, Y., Goldschmidt-Clermont, M., Girard-Bascou, J., Kück, U., Bennoun, P. and Rochaix, J. D., Mutant phenotypes support a trans-splicing mechanism for the expression of the tripartite psaA gene in the C. reinhardtii chloroplast. Cell 1988. 52: 903–913.
86. Hahn, D., Nickelsen, J., Hackert, A. and Kück, U., A single nuclear locus is involved in both chloroplast RNA trans-splicing and 3′ end processing. Plant J 1998. 15: 575–581.
87. Perron, K., Goldschmidt-Clermont, M. and Rochaix, J. D., A multiprotein complex involved in chloroplast group II intron splicing. RNA 2004. 10: 704–711.
88. Balczun, C., Bunse, A., Hahn, D., Bennoun, P., Nickelsen, J. and Kück, U., Two adjacent nuclear genes are required for functional complementation of a chloroplast trans-splicing mutant from Chlamydomonas reinhardtii. Plant J 2005. 43: 636–648.
89. Rivier, C., Goldschmidt-Clermont, M. and Rochaix, J. D., Identification of an RNA-protein complex involved in chloroplast group II intron trans-splicing in Chlamydomonas reinhardtii. EMBO J 2001. 20: 1765–1773.
90. Balczun, C., Bunse, A., Schwarz, C., Piotrowski, M. and Kück, U., Chloroplast heat shock protein Cpn60 from Chlamydomonas reinhardtii exhibits a novel function as a group II intron-specific RNA-binding protein. FEBS Lett 2006. 580: 4527–4532.
91. Glanz, S., Bunse, A., Wimbert, A., Balczun, C. and Kück, U., A nucleosome assembly protein-like polypeptide binds to chloroplast group II intron RNA in Chlamydomonas reinhardtii. Nucleic Acids Res 2006. 34: 5337–5351.
92. Perron, K., Goldschmidt-Clermont, M. and Rochaix, J. D., A factor related to pseudouridine synthases is required for chloroplast group II

