Department of Physics, Chemistry and Biology

Master Thesis

Serine/Arginine-rich proteins in Physcomitrella patens

Andreas Ring LITH-IFM-A-EX--11/2447—SE

Supervisor: Johan Edqvist, Linköpings universitet Examiner: Matthias Laska, Linköpings universitet

Department of Physics, Chemistry and Biology Linköpings universitet SE-581 83 Linköping, Sweden

Division, Department: Molecular Genetics, IFM
Date: 2011-06-03
Language: English
Report category: Examensarbete (degree project)
ISRN: LITH-IFM-A-Ex--11/2447--SE


Title

Serine/Arginine-rich proteins in Physcomitrella patens

Author

Andreas Ring

Abstract

Serine/Arginine-rich proteins (SR-proteins) have been well characterized in metazoans and in the flowering plant Arabidopsis thaliana, but SR-proteins in the moss Physcomitrella patens have not previously been characterized. SR-proteins are a conserved family of splicing regulators essential for constitutive and alternative splicing. They mediate alternative splicing (AS) and may themselves be alternatively spliced as a form of gene regulation. Three novel SR-proteins of the SR-subfamily were identified in P. patens. The three genes show conserved intron-exon structure and protein domain distribution, which is not surprising since the gene family has evidently evolved through gene duplications. The SR-proteins PpSR40 and PpSR36 show differential tissue-specific expression, whereas PpSR39 does not. Tissue-specific expression of SR-proteins has also been observed in A. thaliana. SR-proteins determine splice-site usage in a concentration-dependent manner, and SR-protein overexpression experiments in A. thaliana and Oryza sativa have shown altered splicing patterns of endogenous SR-proteins. Overexpression of PpSR40 did not alter the splicing patterns of PpSR40, PpSR36 or PpSR39, which suggests that their transcripts might not be substrates for PpSR40. These first results of SR-protein characterization in P. patens may provide insights into the SR-protein regulation mechanisms of the common land plant ancestor.

Keywords

Physcomitrella patens, SR protein, gene expression analysis, overexpression

Contents

1. Abstract
2. List of abbreviations
3. Introduction
4. Materials and methods
   4.1 Plant material and standard growth conditions
   4.2 Sequence data mining
   4.3 Phylogenetic analysis
   4.4 Primer design
   4.5 Gene expression analysis
   4.6 Overexpression of PpSR40 in P. patens
      4.6.1 Generation of pCMAK1:PpSR40 construct
      4.6.2 Transformation of P. patens
      4.6.3 Gene expression analysis of pCMAK1:PpSR40 transformants
5. Results
   5.1 Sequence Data Mining
      5.1.1 Identification of SR proteins in P. patens
      5.1.2 SR-protein transcripts
   5.2 Phylogenetic analysis
   5.3 Gene expression analysis
   5.4 Overexpression of PpSR40 in P. patens
      5.4.1 Gene expression analysis of pCMAK1:PpSR40 transformants
6. Discussion
   6.1 Protein identification and characterization
   6.2 SR-protein transcripts in P. patens
   6.3 Gene expression analysis
   6.4 Overexpression of PpSR40 in P. patens
      6.4.1 Transformation of P. patens
      6.4.2 Gene expression analysis of P. patens transformants
   6.5 Conclusion
7. Acknowledgements
8. References


1. Abstract

Serine/Arginine-rich proteins (SR-proteins) have been well characterized in metazoans and in the flowering plant Arabidopsis thaliana, but SR-proteins in the moss Physcomitrella patens have not previously been characterized. SR-proteins are a conserved family of splicing regulators essential for constitutive and alternative splicing. They mediate alternative splicing (AS) and may themselves be alternatively spliced as a form of gene regulation. Three novel SR-proteins of the SR-subfamily were identified in P. patens. The three genes show conserved intron-exon structure and protein domain distribution, which is not surprising since the gene family has evidently evolved through gene duplications. The SR-proteins PpSR40 and PpSR36 show differential tissue-specific expression, whereas PpSR39 does not. Tissue-specific expression of SR-proteins has also been observed in A. thaliana. SR-proteins determine splice-site usage in a concentration-dependent manner, and SR-protein overexpression experiments in A. thaliana and Oryza sativa have shown altered splicing patterns of endogenous SR-proteins. Overexpression of PpSR40 did not alter the splicing patterns of PpSR40, PpSR36 or PpSR39, which suggests that their transcripts might not be substrates for PpSR40. These first results of SR-protein characterization in P. patens may provide insights into the SR-protein regulation mechanisms of the common land plant ancestor.

2. List of abbreviations

Alternative splicing – AS
Cauliflower mosaic virus – CaMV
Coding sequence – CDS
Complementary DNA – cDNA
Exonic splicing enhancer – ESE
Expressed sequence tags – EST
Heterogeneous nuclear ribonucleoprotein – hnRNP
Nonsense-mediated decay – NMD
Nuclear localization signal – NLS
Polymerase chain reaction – PCR
Polyethylene glycol – PEG
Regulated unproductive splicing and translation – RUST
Reverse transcriptase polymerase chain reaction – RT-PCR
RNA recognition motif – RRM
Small nuclear ribonucleoprotein – snRNP
Untranslated region – UTR

Keywords: Physcomitrella patens, SR protein, gene expression analysis, overexpression


3. Introduction

Traditionally, Arabidopsis thaliana has been regarded as the undisputed premier model for studying plant genetics. Recently, a contender has emerged in the form of the modest moss Physcomitrella patens. However, the potential of moss in genetics was already recognized in the early 1920s by von Wettstein (Cove et al., 1997; von Wettstein, 1924). P. patens belongs to the division Bryophyta (mosses) and, characteristically, has an alternation of generations in its life cycle: a dominant haploid gametophytic generation and a diploid sporophytic generation. The gametophytes of P. patens are monoecious, that is, they carry both male and female gamete-producing organs, called antheridia and archegonia, respectively. Upon fertilization, a diploid zygote is produced, which undergoes cell division until a multicellular mature sporophyte emerges. The mature sporophyte produces haploid spores through meiosis, which in turn grow into a filamentous juvenile stage called protonema. Protonema consists of two cell types: chloronema, which has densely packed cells with large chloroplasts, and caulonema, which has fewer chloroplasts with less chlorophyll (Cove et al., 1997). After spore germination or regeneration, chloronemal filaments are produced, from which caulonemal filaments can branch. Buds originate from caulonema and later develop into leafy gametophytes, thus completing the life cycle of P. patens.

The moss P. patens has the key attributes of a good model system: it is easy to grow and maintain, it is amenable to genetic manipulation, and phenotypes can be selected for (Cove, 2005; Wood, 2000). In molecular genetics, P. patens has become increasingly popular over the past 10 years, as reflected in the 5-fold increase in the number of publication hits for “Physcomitrella patens” in the widely used article databases PubMed, Scopus and Web of Knowledge. There are several reasons for this. P. patens (and other mosses) has emerged as a valuable organism for studying cell polarity (Cove et al., 1996; Sievers et al., 1996) and other complex biological processes (Cove et al., 2006; Cove et al., 1997; Reski, 1998; Wood, 2000). Mosses hold an interesting evolutionary position between photosynthetic algae and vascular land plants. P. patens diverged from the vascular land plant lineage some 430 million years ago, which places it among the earliest plants that conquered land (Kenrick and Crane, 1997). Another feature that sets it apart from A. thaliana is that efficient gene targeting can be achieved (Schaefer, 2001, 2002; Schaefer and Zrÿd, 1997). Because P. patens consists predominantly of haploid gametophytes, this approach is a viable option, and selection for a preferred genotype is also easier. Moss can be transformed efficiently using a polyethylene glycol (PEG)-mediated method (Schaefer et al., 1991). The most important reason for the increased popularity of P. patens is very likely the sequencing of its genome (Rensing et al., 2008).

Most eukaryotic protein-coding nuclear genes are interrupted by non-coding sequences called introns (Sharp and Burge, 1997). Following gene transcription, introns are excised from the precursor mRNA (pre-mRNA) in a process called splicing. This is carried out by a macromolecular complex called the spliceosome, which is made up of the U1, U2, U4/U6 and U5 small nuclear ribonucleoproteins (snRNPs) (Sharp and Burge, 1997). Introns are typically identified by the consensus dinucleotides GT and AG at the 5’ and 3’ intron boundaries, respectively. However, exceptions to the GT-AG rule exist: minor class introns use the AT and AC dinucleotides instead (Hall and Padgett, 1994; Jackson, 1991). After a series of publications from different groups, a novel spliceosome was unveiled, consisting of the snRNPs denoted U11 (Kolossova and Padgett, 1997; Tarn and Steitz, 1996b), U12 (Hall and Padgett, 1996; Tarn and Steitz, 1996b), U5 (Tarn and Steitz, 1996b), U4atac and U6atac (Tarn and Steitz, 1996a).

There are two main splicing mechanisms, constitutive splicing (CS) and alternative splicing (AS). In CS, all introns are removed from the pre-mRNA, leaving all exons to form a mature mRNA. AS is a mechanism by which a pre-mRNA from a single gene can give rise to several structurally and functionally different protein products (Graveley, 2001; Manley and Tacke, 1996). This makes AS a major contributor to protein diversity, but AS is also a form of post-transcriptional gene regulation. AS of pre-mRNA transcripts in which introns or “poison exons” are spliced into an mRNA can change the coding sequence, and any such change could potentially introduce a premature termination codon (PTC). Transcripts containing PTCs are often not translated; instead they are targeted by nonsense-mediated decay (NMD), an RNA surveillance system that prevents the synthesis of truncated and potentially hazardous protein products (Belgrader et al., 1994; Maquat, 2004). A recent survey of alternatively spliced SR-protein transcripts containing PTCs shows that about half of them were targeted by NMD (Palusa and Reddy, 2010). It has been proposed that AS and NMD play important roles in post-transcriptional gene regulation through a prevalent process named regulated unproductive splicing and translation (RUST) (Lewis et al., 2003). RUST has been shown to be highly conserved in mammalian Serine/Arginine-rich proteins (SR-proteins) (Lareau et al., 2007) and has been suggested to be important for the autoregulation of SR-protein expression (Ni et al., 2007).

The SR-proteins are a family of conserved splicing factors essential for intron recognition, spliceosome assembly and the regulation of AS (Xiao-Qin et al., 2007). SR-proteins contain one or two N-terminal RNA recognition motifs (RRMs) and a C-terminal region of variable length consisting of SR-dipeptides, which constitutes the SR-domain (Manley and Tacke, 1996; Wu and Maniatis, 1993). The RRM domains have high affinity for splicing enhancer elements in the mRNA and are vital for in vitro splicing (Chandler et al., 1997; Tacke and Manley, 1995). The SR-domain, which can be extensively phosphorylated, is important for mediating the protein-protein interactions that are necessary for proper splicing (Kohtz et al., 1994; Wu and Maniatis, 1993; Xiao and Manley, 1997). It has been shown that the AS of SR-proteins in A. thaliana is controlled in a developmental and tissue-specific manner (Golovkin and Reddy, 1998; Kalyna et al., 2003; Lazar et al., 1995; Lopato et al., 1999; Lopato et al., 1996; Palusa et al., 2007). Alternatively spliced SR-proteins with even subtle differences have been linked to distinct biological functions (Zhang and Mount, 2009). SR-proteins determine splice-site usage in a concentration-dependent manner (Lazar and Goodman, 2000; Smith and Valcárcel, 2000). Generally, an increase in SR-protein concentration leads to the alternative usage of a proximal 5’ splice site (Cáceres et al., 1994; Fu et al., 1992; Screaton et al., 1995; Wang and Manley, 1995; Zahler et al., 1993), but individual SR-proteins have been shown to have the opposite effect on splice-site selection, namely promoting the use of a distal 5’ splice site (Dauksaite and Akusjärvi, 2004). The usage of a proximal 5’ splice site is counteracted by the heterogeneous nuclear ribonucleoprotein (hnRNP) A/B family of proteins (Mayeda and Krainer, 1992; Mayeda et al., 1994). Taken together, this suggests that splice-site selection in vivo is ultimately determined by the relative concentrations of splicing factors.
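The boundary-dinucleotide rule for major (GT-AG) and minor (AT-AC) class introns can be illustrated with a minimal sketch. The function name `classify_intron` and the example sequences are hypothetical and not part of this work; real intron classification also depends on branch-point and extended splice-site consensus sequences, which are ignored here.

```python
def classify_intron(intron_dna: str) -> str:
    """Classify an intron by its terminal dinucleotides only.

    Simplified illustration: 'GT-AG' for the major class boundaries,
    'AT-AC' for the minor class boundaries, 'other' for anything else.
    """
    seq = intron_dna.upper()
    if len(seq) < 4:
        raise ValueError("intron too short to carry both boundary dinucleotides")
    five_prime, three_prime = seq[:2], seq[-2:]
    if (five_prime, three_prime) == ("GT", "AG"):
        return "GT-AG"
    if (five_prime, three_prime) == ("AT", "AC"):
        return "AT-AC"
    return "other"


# Made-up intron sequences, for illustration only.
print(classify_intron("GTAAGTTTTCAG"))
print(classify_intron("atatcctttac"))
```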

The aim of this project was to identify and characterize putative SR-proteins in P. patens. Three P. patens SR-proteins of the SR-subfamily were identified. The SR-proteins PpSR40 and PpSR36 show tissue-specific expression in protonema. Overexpression of PpSR40 did not significantly alter the splicing patterns of PpSR40, PpSR36 and PpSR39. These first results of SR-protein characterization in P. patens may provide insights into the SR-protein regulation mechanisms of the common land plant ancestor.

4. Materials and methods

4.1 Plant material and standard growth conditions

P. patens ssp. patens (strain Gransden 2004) was grown axenically on solid BCD medium (1 mM MgSO4, 1.85 mM KH2PO4, 10 mM KNO3, 45 µM FeSO4, 0.22 µM CuSO4, 0.19 µM ZnSO4, 10 µM H3BO3, 0.10 µM Na2MoO4, 2 µM MnCl2, 0.23 µM CoCl2 and 0.17 µM KI) supplemented with 5 mM ammonium tartrate, 1 mM CaCl2 and 0.8 % (w/v) plant agar in petri dishes (90 mm diameter) (Ashton and Cove, 1977). The moss was grown under standard conditions, that is, a 24-hour light cycle, 25 °C and a continuous photosynthetic photon flux of 155 µmol m⁻² s⁻¹, in a growth chamber (Percival CU-36L/5; CLF Plant Climatics, Emersacker, Germany). The moss was routinely subcultured monthly onto fresh supplemented BCD medium. Protonema from P. patens was generated by homogenizing gametophyte colonies in 100 ml liquid supplemented BCD medium with an Ultra-Turrax® T18 basic (IKA, Staufen, Germany). The liquid culture was grown for 5 days, re-homogenized and subsequently transferred to cellophane-covered supplemented solid BCD medium.

4.2 Sequence data mining

The protein sequences of four A. thaliana SR-proteins (AtSR30; AtSR34; AtSR34a; AtSR34b) were obtained from the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/) and used as queries for homologous proteins in P. patens (taxid: 3218) using NCBI BLASTp with default settings (Altschul et al., 1997). Protein sequence hits with E-values greater than 1e-20 were rejected. The proteins found were built on pre-V1.6 gene models, so they were subsequently used as queries in a BLASTp search with default settings against the P. patens database at http://www.phytozome.net to obtain the V1.6 gene models. To find expressed sequence tags (ESTs) in the P. patens transcriptome, BLAST at http://www.cosmoss.org was used with default settings for each gene. Genomic DNA was used as query in a BLASTn search against the pp0409_fil database, with an E-value threshold of 1e-2. Multiple sequence alignment of ESTs was done using MUSCLE with default settings at the European Bioinformatics Institute (EBI) (http://www.ebi.ac.uk/Tools/msa/muscle/) (Edgar, 2004). Protein masses were calculated with the PeptideMass tool at http://www.expasy.ch (Wilkins et al., 1997).
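The E-value filtering step can be sketched as follows. This is only an illustration: it assumes that the hits have been exported in the tab-separated BLAST tabular layout, in which the E-value is the 11th column, whereas the searches described here were run through web interfaces. The function name `filter_hits` and the example rows are hypothetical.

```python
import csv
import io

EVALUE_CUTOFF = 1e-20  # protein-level threshold stated in the text


def filter_hits(tabular_text: str, cutoff: float = EVALUE_CUTOFF):
    """Return (query, subject, evalue) for hits with E-value at or below the cutoff."""
    kept = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        evalue = float(row[10])  # 11th column of the assumed tabular layout
        if evalue <= cutoff:
            kept.append((row[0], row[1], evalue))
    return kept


# Made-up example rows, not results from this work.
example = (
    "AtSR30\thit_A\t65.0\t250\t80\t3\t1\t250\t1\t252\t1e-90\t300\n"
    "AtSR30\thit_B\t30.0\t90\t60\t2\t5\t95\t10\t100\t2e-05\t45\n"
)
print(filter_hits(example))
```

Passing `cutoff=1e-2` would mimic the more permissive threshold used for the BLASTn search against the EST database.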

4.3 Phylogenetic analysis

A. thaliana and P. patens SR-protein sequences from the SR-subfamily were aligned with ClustalW2 (Larkin et al., 2007) at EBI (http://www.ebi.ac.uk/Tools/msa/clustalw2/) using default settings; the alignment was saved in the PHYLogeny Inference Package (PHYLIP) (Felsenstein, 1989) format. A neighbor-joining (Saitou and Nei, 1987) tree was created by running the PHYLIP v.3.6.9 programs Seqboot, Protdist, Neighbor and Consense, in that order, all with default settings and, where possible, with the multiple datasets option set to 100. A maximum likelihood tree was created using PhyML 3.0 online (http://www.atgc-montpellier.fr/phyml/) (Guindon and Gascuel, 2003; Guindon et al., 2005) using default settings and 100 bootstraps. All phylogenetic trees were viewed in FigTree v.1.3.1 (http://tree.bio.ed.ac.uk/software/figtree/).
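The bootstrap step underlying both trees resamples alignment columns with replacement to generate replicate datasets. The sketch below illustrates that idea only; it is not the PHYLIP or PhyML implementation, and the function name `bootstrap_alignment` and the toy sequences are hypothetical.

```python
import random


def bootstrap_alignment(alignment: dict, rng: random.Random) -> dict:
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    # Draw as many column indices as the alignment has columns.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


# Toy alignment with made-up sequences; the names are placeholders.
toy = {"seqA": "MSRV-K", "seqB": "MSRVYK", "seqC": "MTRV-K"}

# 100 replicates, mirroring the number of bootstraps used in the text.
replicates = [bootstrap_alignment(toy, random.Random(i)) for i in range(100)]
print(replicates[0])
```

Each replicate would then be passed to distance calculation and tree building, and the resulting trees summarized in a consensus tree.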


4.4 Primer design

Gene-specific primers, based on the pre-V1.6 gene models, were designed using the pattern XXXXYYYYYYZZZZZZZZZZZZZZZ, where X is random bases, Y is the restriction site consensus for Xho1 (CTCGAG) in the forward primer and for Apa1 (GGGCCC) in the reverse primer, and Z is gene-specific untranslated region sequence upstream of the start codon (forward primer) or downstream of the termination codon (reverse primer). For RT-PCR, the gene-specific forward primers were used, and reverse primers were designed to span several exons so that the resulting amplicons would reveal evidence of alternative splicing. The housekeeping gene beta-tubulin 1 was used as an endogenous control (Holm et al., 2010). The full list of primers is given in Table 1.

Table 1. Primer sequences for cDNA constructs, cDNA synthesis, RACE-PCR and RT-PCR.

Primer              Sequence 5' to 3'
dT-adaptor          GGC CAC GCG TCG ACT AGT ACT TTT TTT TTT TTT TTT T
Apa1-adaptor        ACT TTG GGC CCG GCC ACG CGT CGA CTA GTA C
PpSR40_f            AGA ACT CGA GAA CCC ATC GAA ACC AGC ACG AT
PpSR40_r            GAA CGG GCC CAC TGA GAT TCG CGT GAC GGA AC
PpSR36_f            TAA GCT CGA GAG ACG ATA TTG TAA CCA GCA CA
PpSR36_r            CAA CGG GCC CTA CAT AAT TAA CAA GCA GAA AC
PpSR39_f            GGA ACT CGA GGT TGC TTA TCA CAT TCA ACA AC
PpSR39_r            ATA TGG GCC CTT AGC AGT CAG AAG GCT ATG AC
PpSR40_r_RT-PCR     GTC CAC GAT GCC CAT TGT TCC C
PpSR36_r_RT-PCR     GGC GTA TTT CAT ATC GTC GTA G
PpSR39_r_RT-PCR     TTC CAG CTG AGC CAT CAC GAA A
Beta-tubulin 1_f    GAC TGC TTG CAA GGT TTC CAA G
Beta-tubulin 1_r    TTT AGC TGC CCA GGG AAT CGG A
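The primer layout (four padding bases, a restriction site, then gene-specific sequence) can be checked against the cloning primers in Table 1 with a short sketch. The helper `split_primer` is a hypothetical name introduced for illustration; the primer sequences and restriction sites are copied from the text.

```python
XHO1_SITE = "CTCGAG"  # forward-primer restriction site given in the text
APA1_SITE = "GGGCCC"  # reverse-primer restriction site given in the text


def split_primer(primer: str, site: str, pad_len: int = 4):
    """Split a primer into (padding, restriction site, gene-specific part)."""
    seq = primer.replace(" ", "").upper()
    pad = seq[:pad_len]
    found = seq[pad_len:pad_len + len(site)]
    specific = seq[pad_len + len(site):]
    if found != site:
        raise ValueError(f"expected {site} after {pad_len} padding bases, found {found}")
    return pad, found, specific


# Cloning primers copied from Table 1.
forward = {
    "PpSR40_f": "AGA ACT CGA GAA CCC ATC GAA ACC AGC ACG AT",
    "PpSR36_f": "TAA GCT CGA GAG ACG ATA TTG TAA CCA GCA CA",
    "PpSR39_f": "GGA ACT CGA GGT TGC TTA TCA CAT TCA ACA AC",
}
reverse = {
    "PpSR40_r": "GAA CGG GCC CAC TGA GAT TCG CGT GAC GGA AC",
    "PpSR36_r": "CAA CGG GCC CTA CAT AAT TAA CAA GCA GAA AC",
    "PpSR39_r": "ATA TGG GCC CTT AGC AGT CAG AAG GCT ATG AC",
}

for name, primer in forward.items():
    print(name, split_primer(primer, XHO1_SITE))
for name, primer in reverse.items():
    print(name, split_primer(primer, APA1_SITE))
```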

4.5 Gene expression analysis

Total RNA was isolated from ten-day-old protonema and from the upper half of five-week-old gametophytes in an RNase-free environment using the RNeasy Plant Mini Kit (Qiagen GmbH, Hilden, Germany). Total RNA isolates were treated with DNase I (Fermentas, Vilnius, Lithuania). cDNA was synthesized from DNA-free total RNA using RevertAid H-minus Reverse Transcriptase (Fermentas) and random hexamer primers. RT-PCR was performed with DreamTaq DNA polymerase (Fermentas) using 1 µg of cDNA as template and gene-specific primers for PpSR40, PpSR36, PpSR39 and beta-tubulin 1, respectively. PCR was run with the following settings: 3 min at 94 °C; 35 cycles of 30 sec at 94 °C, 30 sec at 55 °C and 1 min at 72 °C; and 7 min at 72 °C. The resulting amplicons were separated on 1.2 % agarose (Saveen & Werner, Limhamn, Sweden) gels with 0.5x TBE running buffer (Fermentas) and visualized with 1x SYBR Safe™ (Invitrogen, Carlsbad, USA). Gene expression data were corrected for background luminescence and normalized to beta-tubulin 1 as endogenous control using ImageJ (version 1.43u) (Abramoff et al., 2004). The expression of PpSR40, PpSR36 and PpSR39 was analyzed using a one-way ANOVA with Tukey and Fisher post-hoc analyses.
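The background correction and normalization of band densities can be sketched as below. This is a minimal illustration under the assumption that each band and its local background are measured as single density values; the function names and the numbers are hypothetical and are not measurements from this work.

```python
def normalized_expression(band: float, background: float,
                          ref_band: float, ref_background: float) -> float:
    """Background-corrected band density relative to the reference gene."""
    ref = ref_band - ref_background
    if ref <= 0:
        raise ValueError("reference band must exceed its background")
    return (band - background) / ref


def fold_change(sample: float, baseline: float) -> float:
    """Expression in a sample relative to a baseline tissue."""
    return sample / baseline


# Made-up densities for illustration only.
gametophyte = normalized_expression(1200.0, 200.0, 2200.0, 200.0)
protonema = normalized_expression(2200.0, 200.0, 2200.0, 200.0)
print(fold_change(protonema, gametophyte))
```

Dividing by the beta-tubulin 1 signal from the same sample compensates for differences in the amount of template loaded in each reaction.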

4.6 Overexpression of PpSR40 in P. patens

4.6.1 Generation of pCMAK1:PpSR40 construct

Total RNA was isolated from three-week-old gametophytes in an RNase-free environment using the RNeasy Plant Mini Kit (Qiagen). Total RNA isolates were treated with DNase I (Fermentas). Full-length cDNA was synthesized from DNA-free total RNA using RevertAid H-minus Reverse Transcriptase (Fermentas) and the dT-adaptor primer.


RACE-PCR was performed with DreamTaq DNA polymerase (Fermentas) using 1 µg of cDNA as template, gene-specific forward primers for PpSR40, PpSR36 and PpSR39, respectively, and the Apa1-adaptor as reverse primer for all reactions. PCR was run with the following settings: 3 min at 94 °C; 35 cycles of 30 sec at 94 °C, 30 sec at 55 °C and 1 min 30 sec at 72 °C; and 7 min at 72 °C. One µl of the resulting PCR product was used as template for a second PCR using gene-specific primers for PpSR40, PpSR36 and PpSR39, respectively, with the same settings. The resulting amplicons were separated on 1.2 % agarose (Saveen & Werner) gels with 0.5x TBE running buffer (Fermentas) and visualized with 1x SYBR Safe™ (Invitrogen). A ~900 bp amplicon, corresponding to the PpSR40 full-length cDNA, was excised and purified using the QIAEX II Gel Extraction Kit (Qiagen). The pCMAK1 plasmid (Fig. 1) was kindly supplied by the National Institute of Agrobiological Resources, Tsukuba, Japan. The pCMAK1 plasmid and the PpSR40 full-length cDNA were digested with Xho1 (New England Biolabs, Ipswich, USA) and Apa1 (New England Biolabs). The fragments were purified between the Xho1 and Apa1 digestions and again after the Apa1 digestion, using the GeneJET™ PCR Purification Kit (Fermentas). The digested pCMAK1 plasmid and the digested PpSR40 full-length cDNA amplicon were ligated using T4 DNA ligase (Invitrogen). The pCMAK1:PpSR40 plasmid was amplified in Subcloning Efficiency™ DH5α™ Chemically Competent E. coli (Invitrogen) and isolated using the Maxi Plasmid Kit (Qiagen).

Figure 1. The plasmid pCMAK1 is suitable for bacterial and P. patens transformation. The plasmid is a pCR4-TOPO vector with a pUC replication origin, a lac promoter coupled to the LacZα-ccdB lethal gene, and kanamycin and ampicillin resistance genes for bacterial selection. The plasmid contains an expression cassette that can be released by a number of restriction enzymes. The expression cassette is targeted to the BS213 locus. It uses a variant of the Cauliflower mosaic virus (CaMV) 35S promoter, namely the P7113 promoter, which is a strong constitutive promoter. Between the promoter and the nopaline synthase terminator there are multiple cloning sites for the insertion of fragments. Furthermore, the cassette carries a zeocin resistance cassette for selection in eukaryotic cells.

4.6.2 Transformation of P. patens

Transformation was performed by the PEG-mediated method (Schaefer et al., 1991). Prior to the transformation, MMM solution was made by dissolving 8.5 g D-mannitol, 0.305 g MgCl2·6H2O and 10 ml 1 % MES pH 5.6 in 100 ml distilled water. The MMM solution was then autoclaved. PEG solution was made by dissolving 80 g PEG-4000 and 2 ml 1 M Tris pH 7.5 in autoclaved 0.38 M D-mannitol/0.1 M Ca(NO3)2 solution to a final volume of 200 ml. The solution was heated to 45 °C to dissolve the PEG, the pH was adjusted to 8, and the solution was subsequently filter sterilized. On the day of the transformation, 1 % Driselase solution was prepared by dissolving 500 mg Driselase (Sigma, St. Louis, USA) in 50 ml autoclaved 8.5 % D-mannitol. It was incubated for 15 minutes at room temperature and centrifuged at 2500 x g for 5 minutes. The clear supernatant was collected, adjusted to pH 5.6 and filter sterilized. In a sterile hood, 10-day-old protonema from 5-10 plates was collected and treated with 1 % Driselase for 30 minutes at room temperature in order to obtain protoplasts. The tube was turned from time to time to facilitate mixing. The moss protoplasts were poured through a large-meshed filter into a petri dish. The filtrate was incubated for ten minutes and then poured through a fine-meshed filter into another petri dish. The protoplast solution was then transferred to a 50 ml falcon tube and washed by centrifugation at 700 rpm in a Universal 320 centrifuge (Hettich, Tuttlingen, Germany) at 5 °C for 5 minutes. The supernatant was poured off and the resulting pellet was resuspended in the original volume of 8.5 % D-mannitol. The wash was repeated and the protoplast pellet was resuspended in 2 ml of 8.5 % D-mannitol. Ten µl of the protoplast solution was used in a Bürker chamber to estimate the number of intact protoplasts. Three 10 µg aliquots of pCMAK1:PpSR40 plasmid, from which the expression cassette (Fig. 2) had been released by Not1 (Fermentas) digestion, were pipetted into 15 ml falcon tubes.
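The nominal concentrations implied by these recipes can be checked with a short calculation. The sketch below is only an approximation: it treats the stated solvent volume as the final volume, and the molar masses (182.17 g/mol for D-mannitol, 203.30 g/mol for MgCl2·6H2O) are standard values that are not taken from the text. The function names are hypothetical.

```python
MOLAR_MASS = {  # g/mol, standard values (not stated in the text)
    "D-mannitol": 182.17,
    "MgCl2.6H2O": 203.30,
}


def molarity_mM(mass_g: float, volume_ml: float, compound: str) -> float:
    """Nominal concentration in mM, treating volume_ml as the final volume."""
    moles = mass_g / MOLAR_MASS[compound]
    return moles / (volume_ml / 1000.0) * 1000.0


def percent_w_v(mass_g: float, volume_ml: float) -> float:
    """Weight/volume percentage (g per 100 ml)."""
    return mass_g / volume_ml * 100.0


# MMM solution: 8.5 g D-mannitol and 0.305 g MgCl2.6H2O per 100 ml.
mannitol_mM = molarity_mM(8.5, 100.0, "D-mannitol")
mgcl2_mM = molarity_mM(0.305, 100.0, "MgCl2.6H2O")
# Driselase solution: 500 mg in 50 ml.
driselase_pct = percent_w_v(0.5, 50.0)

print(round(mannitol_mM), round(mgcl2_mM, 1), round(driselase_pct, 2))
```

Under these assumptions the MMM solution contains roughly 0.47 M mannitol and 15 mM MgCl2, and the Driselase solution is 1 % (w/v), consistent with the values quoted in the text.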

Figure 2. The expression cassette released from the pCMAK1:PpSR40 plasmid with the Not1 restriction enzyme. The cassette uses the locus BS213 as target for homologous recombination. The P7113 promoter is a strong constitutive promoter; Tnos is the Agrobacterium tumefaciens nopaline synthase transcription terminator. The cDNA clone is inserted in the cassette between the promoter and the terminator. Transgenic plants are selected for growth on zeocin.

The protoplast suspension was centrifuged, as much of the supernatant as possible was pipetted off, and the protoplasts were resuspended to a density of 1.2 × 10⁶ cells in MMM solution. 300 µl of the protoplast suspension was pipetted into each of the three tubes containing pCMAK1:PpSR40 plasmid and mixed gently. Three µl of PEG solution was added to each tube, gently mixed, and the tubes were heat shocked at 45 °C for 5 minutes. After the heat shock the tubes were incubated at room temperature for 10 minutes. The transformation mix was progressively diluted with regeneration medium (BCD medium supplemented with 5 mM ammonium tartrate, 1 mM CaCl2, 6.6 % (w/v) D-mannitol and 0.5 % (w/v) D-glucose), first 5 x 300 µl dropwise and then 5 x 1 ml slowly. The tubes were incubated in darkness at room temperature overnight. Regeneration medium was then removed from the tubes until 3 ml was left, without disturbing the sedimented protoplasts. The protoplasts were carefully resuspended, and 3 ml of 42 °C liquid top-layer regeneration medium (BCD medium supplemented with 5 mM ammonium tartrate, 1 mM CaCl2, 6.6 % (w/v) D-mannitol, 0.5 % (w/v) D-glucose and 1.2 % (w/v) plant agar) was added to each tube and quickly mixed. Two ml of the mixture was poured onto each cellophane-covered regeneration plate (BCD medium supplemented with 5 mM ammonium tartrate, 1 mM CaCl2, 6.6 % (w/v) D-mannitol, 0.5 % (w/v) D-glucose and 0.8 % (w/v) plant agar), three plates per tube and nine plates in total. The plates were sealed with micropore tape and grown under standard conditions. After 5 days the cellophane with the moss on top was transferred to selective BCD medium containing 50 µg/ml zeocin (Invitrogen, Carlsbad, USA). Moss was selected for two weeks, and surviving colonies were transferred to BCD medium supplemented with 5 mM ammonium tartrate, 1 mM CaCl2 and 0.8 % (w/v) plant agar. After two weeks of non-selective growth, parts of the moss colonies were transferred to selective medium. The stability of the transformants was assayed by examining moss growth after two weeks on this second round of selection. Moss colonies that showed significant growth were regarded as stable transformants.

4.6.3 Gene expression analysis of pCMAK1:PpSR40 transformants

Total RNA was isolated from three-week-old moss colonies in an RNase-free environment using the RNeasy Plant Mini Kit (Qiagen). Total RNA isolates were treated with DNase I (Fermentas). cDNA was synthesized from DNA-free total RNA using RevertAid H-minus Reverse Transcriptase (Fermentas) and random hexamer primers. RT-PCR was performed with DreamTaq DNA polymerase (Fermentas) using 1 µg of cDNA as template and gene-specific primers for PpSR40, PpSR36, PpSR39 and beta-tubulin 1, respectively. PCR was run with the following settings: 3 min at 94 °C; 35 cycles of 30 sec at 94 °C, 30 sec at 55 °C and 1 min at 72 °C; and 7 min at 72 °C. The resulting amplicons were separated on 1.2 % agarose (Saveen & Werner) gels with 0.5x TBE running buffer (Fermentas) and visualized with 1x SYBR Safe™ (Invitrogen). Gene expression data were corrected for background luminescence and normalized to beta-tubulin 1 as endogenous control using ImageJ (version 1.43u).

5. Results

5.1 Sequence Data Mining

5.1.1 Identification of SR proteins in P. patens

In order to identify SR-proteins in P. patens, four A. thaliana SR-proteins (AtSR30; AtSR34; AtSR34a; AtSR34b) were used as queries in a BLASTp search, which revealed three homologous predicted SR-proteins in P. patens. The query proteins show high similarity to Phypa_205547, Phypa_140978 and Phypa_113630 (Table 2). The three homologous proteins Phypa_205547, Phypa_140978 and Phypa_113630 are 355, 333 and 349 amino acids long, respectively. Furthermore, Phypa_205547, Phypa_140978 and Phypa_113630 each have an RRM1 domain (prosite:PS50102) at residues 7-82, an RRM2 domain with the conserved signature motif SWQDLKD (prosite:PS50102) at residues 123-202, 127-206 and 125-204, respectively, and an RS-rich domain at the C-terminus of the protein. The proteins all contain a nuclear localization signal (NLS) when analyzed by TargetP 1.1 (Emanuelsson et al., 2007) and WoLF PSORT (Horton et al., 2007). The three identified proteins can be classified as members of the SR-subfamily of SR-proteins (Barta et al., 2010). The calculated molecular weights of the Phypa_205547, Phypa_140978 and Phypa_113630 proteins were 39534, 36275 and 38519 daltons, respectively. The novel proteins were named according to the suggested nomenclature of SR-proteins (Barta et al., 2010): Phypa_205547, Phypa_140978 and Phypa_113630 were denoted PpSR40, PpSR36 and PpSR39, respectively.
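The assigned names are consistent with combining a species prefix, the subfamily label and the calculated mass rounded to the nearest kilodalton. The sketch below illustrates that reading; the rounding rule is an assumption inferred from the masses and names reported here, and `sr_name` is a hypothetical helper.

```python
# Calculated masses in daltons, as reported in the text.
masses_da = {
    "Phypa_205547": 39534,
    "Phypa_140978": 36275,
    "Phypa_113630": 38519,
}


def sr_name(mass_da: int, species_prefix: str = "Pp", family: str = "SR") -> str:
    """Build a name from prefix, family and mass rounded to whole kDa (assumed rule)."""
    return f"{species_prefix}{family}{round(mass_da / 1000)}"


names = {protein: sr_name(mass) for protein, mass in masses_da.items()}
print(names)
```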

Table 2. Amino acid sequence identity / similarity (in percent) between the four A. thaliana query proteins and the novel P. patens proteins, and between the novel proteins themselves.

          AtSR30    AtSR34    AtSR34a   AtSR34b   PpSR40    PpSR36    PpSR39
PpSR40    65 / 77   62 / 80   65 / 81   66 / 80   -         81 / 86   84 / 89
PpSR36    61 / 74   62 / 80   68 / 82   64 / 78   81 / 86   -         83 / 87
PpSR39    65 / 76   63 / 81   66 / 83   67 / 83   84 / 89   83 / 87   -
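Percent identity values of the kind listed in Table 2 are derived from pairwise alignments. The sketch below shows one common way to compute identity from two aligned sequences, counting only columns without gaps; the text does not state which convention was used for Table 2, so this is an assumption, and the function name and toy sequences are hypothetical. Similarity would additionally require a substitution matrix or residue grouping and is not shown.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over alignment columns that contain no gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Toy aligned sequences (made up): 4 of the 5 gap-free columns are identical.
print(percent_identity("MSRV-K", "MTRVYK"))
```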


Figure 3. Multiple sequence alignment of A. thaliana and P. patens SR-proteins in the SR-subfamily. The protein domains RRM1 and RRM2 show high sequence conservation. The NLS and SWQDLKD motifs are indicated. Black backgrounds represent amino acid residues that show identity in at least 4 sequences. Gray backgrounds represent amino acids that show similar characteristics, or if the amino acid residues show consensus with similar characteristics.

The protein sequence hits were then used in a BLASTp search for the corresponding genes at www.phytozome.net using default settings. The three protein sequences corresponded to the V1.6 gene models Pp1s28_193V6 (PpSR40), Pp1s173_12V6 (PpSR36) and Pp1s7_8V6 (PpSR39). As can be seen in Fig. 4, the three genes are highly similar in their intron-exon structure. The protein PpSR40 is encoded by a gene that is 4596 bp long, consisting of 13 exons and 12 introns. The protein PpSR36 is encoded by a gene that is 5438 bp long, consisting of 14 exons and 13 introns. The protein PpSR39 is encoded by a gene that is 4438 bp long, consisting of 13 exons and 12 introns. The gene lengths are measured from transcription start to transcription end.

Figure 4. The intron-exon structure of the genes is highly conserved. The distribution of RRM domains over the intron-exon structure is also highly conserved.

5.1.2 SR-protein transcripts

SR-proteins are known to be alternatively spliced, so to investigate AS events, the genomic sequences of PpSR40, PpSR36 and PpSR39 were used as queries in transcriptome BLAST at www.cosmoss.org. The search yielded 13, 16 and 8 redundant ESTs for PpSR40, PpSR36 and PpSR39, respectively. The EST hits were aligned with the PpSR40, PpSR36 and PpSR39 full-length transcripts. The PpSR40 gene has two mRNA isoforms, which differ in the third UTR exon. For PpSR36, only one distinct isoform was found (Fig. 5). PpSR39, however, has two very distinct transcription starts (Fig. 4), but no other variation in the transcripts was seen in the EST analysis. PpSR40 is thus the only gene that shows signs of alternative splicing.

Figure 5. Intron-exon structure of SR-protein mRNA-isoforms in P. patens. The mRNA isoforms are built on the V1.6 gene models.

5.2 Phylogenetic analysis

Protein sequences from A. thaliana and P. patens were aligned using ClustalW, and a neighbor-joining tree and a maximum likelihood tree were created with 100 bootstrap replicates. The proteins from P. patens form a separate clade that is supported by high bootstrap values. Seemingly, the P. patens gene family has evolved through two gene duplications occurring after the separation of mosses. The SR-protein ancestor in P. patens was subject to a gene duplication, resulting in the PpSR39 protein and the PpSR40/PpSR36 ancestor, which was in turn duplicated and subsequently diverged into the proteins PpSR40 and PpSR36.
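The bootstrap support mentioned above rests on resampling alignment columns with replacement and rebuilding a tree from each pseudo-alignment. The following Python sketch shows only the resampling step and the conversion of a count into a support percentage. The tree-building step is omitted, and all function names and the toy alignment are hypothetical. It is a sketch of the general technique, not the procedure of any particular phylogenetics program.

```python
import random


def bootstrap_alignment(aligned, rng):
    """Resample alignment columns with replacement, returning a
    pseudo-alignment with the same number of sequences and columns."""
    n_cols = len(aligned[0])
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[i] for i in picks) for seq in aligned]


def bootstrap_replicates(aligned, n_replicates=100, seed=0):
    """Generate `n_replicates` pseudo-alignments (100 in the analysis above)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(aligned, rng) for _ in range(n_replicates)]


def support_percent(n_supporting, n_replicates):
    """Percentage of replicate trees that contain a given clade."""
    return 100.0 * n_supporting / n_replicates


# Toy alignment (invented sequences, for illustration only).
toy_alignment = ["ACGT", "ACGA", "TCGA"]
replicates = bootstrap_replicates(toy_alignment, n_replicates=100)
print(len(replicates), replicates[0])
```

A tree would be inferred from each replicate, and the support for a clade is the share of replicate trees in which that clade appears.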

Figure 6. Two phylogenetic trees, (A) neighbor-joining and (B) maximum likelihood, were created from SR-subfamily protein sequences from A. thaliana and P. patens. Numbers indicate the percent of bootstraps that support the topology; bootstrap values under 50 are not shown. The first two characters denote the organism of origin (At = A. thaliana, Pp = P. patens); the last characters denote the gene name.


5.3 Gene expression analysis

It is known that expression levels and splicing patterns of transcripts encoding plant SR proteins may vary in a developmental and tissue-specific manner (Golovkin and Reddy, 1998; Lazar et al., 1995; Lopato et al., 1999; Lopato et al., 1996; Palusa et al., 2007). Therefore the expression of the PpSR40, PpSR36 and PpSR39 genes was investigated in gametophyte and protonema tissues. The PpSR40 and PpSR36 genes were upregulated in protonema tissue, whereas the expression of PpSR39 was similar in the two tissues (fig. 7a). The expression of PpSR40 and PpSR36 was significantly greater in protonema than in gametophyte (p = 0.019 and p = 0.025, respectively), and post-hoc analysis confirmed this difference (fig. 7b). Furthermore, there was no measurable alternative splicing in the two tissues.

Figure 7. Band density results of PCR products from the tissue-specific expression of the genes PpSR40, PpSR36 and PpSR39 in gametophytes and protonema. Beta-tubulin 1 was used as endogenous control. (A) The genes PpSR40 and PpSR36 show upregulation in protonemal tissue, while the expression of PpSR39 is similar in the two tissues. (B) ImageJ analysis of tissue-specific expression of the genes PpSR40, PpSR36 and PpSR39. Gene expression data were normalized with beta-tubulin 1 as reference for endogenous overall transcription levels. Data are presented as means ± SD fold-change relative to gametophyte (*: p=0.05, one-way ANOVA, N=2, Tukey and Fisher post-hoc analysis).
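The normalization described in the caption (target band density divided by the beta-tubulin 1 band density, then expressed as fold-change relative to gametophyte) can be sketched in Python as follows. The function names are hypothetical, and the band densities are invented placeholder values chosen only to make the arithmetic visible. They are not measurements from this study.

```python
from statistics import mean, stdev


def normalize(target_densities, reference_densities):
    """Divide each target band density by the matching reference
    (endogenous control) band density from the same sample."""
    return [t / r for t, r in zip(target_densities, reference_densities)]


def fold_change(sample_norm, baseline_norm):
    """Express normalized sample values relative to the mean normalized
    baseline; return (mean fold-change, sample standard deviation)."""
    baseline = mean(baseline_norm)
    ratios = [value / baseline for value in sample_norm]
    return mean(ratios), stdev(ratios)


# Invented placeholder densities for two replicates (N=2), arbitrary units.
gametophyte = normalize([100.0, 100.0], [200.0, 200.0])
protonema = normalize([200.0, 300.0], [200.0, 200.0])

mean_fc, sd_fc = fold_change(protonema, gametophyte)
print(f"fold-change relative to gametophyte: {mean_fc:.2f} ± {sd_fc:.2f}")
```

Dividing by the control band first removes differences in overall cDNA input between samples, so the fold-change reflects the target gene rather than loading.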

5.4 Overexpression of PpSR40 in P. patens

To further characterize the biological function of the PpSR40 SR-protein gene, a cDNA copy was cloned into the pCMAK1 plasmid (fig. 1). pCMAK1 is suitable for overexpression in P. patens because it can be used both for bacterial transformation and for transformation of P. patens. The plasmid’s expression cassette has a modified 35S promoter called P7113, which is a strong constitutive promoter. A modified 35S promoter is used since the 35S promoter is weak in P. patens (Horstmann et al., 2004; Saidi et al., 2009). The target of the expression cassette is the genomic locus 213 (Schaefer and Zrÿd, 1997). Following the transformation, 42 initial transformants were generated after selection on selective medium containing 50 µg/ml zeocin (fig. 8a). After two weeks of non-selective growth (fig. 8b), the 42 transformants were subjected to a second round of selection to obtain stable transformants (Ashton et al., 2000). After two weeks of selective growth, six transformants showed significantly greater growth (fig. 8c). These are regarded as stable transformants that have incorporated the expression cassette into the genome.


Figure 8. The post-transformational selection process. (A) The idea behind the selection process is that only moss that has taken up the expression cassette will survive. (B) The surviving transformants are moved to non-selective medium for growth. This step helps in the generation of stable transformants. (C) Transformants that have incorporated the cassette in the genome will show significant growth compared with episomal transformants (non-incorporated cassette).

There were six transformation lines, which showed some characteristic phenotypes (Table 3). Transformation lines one, two and three looked similar to wild-type colonies, consisting of a foundation of protonema, a considerable amount of which had differentiated into leafy gametophytes, giving the colonies a bushy appearance. Transformation line four consisted primarily of protonema, of which little had differentiated into gametophytes. The last two lines, five and six, consisted almost exclusively of protonema, with no or very few gametophytes. Furthermore, colony composition seemed to be connected with the water content of the colonies. Transformation lines five and six had a considerably higher water content than lines three and four, which in turn had a higher water content than lines one and two.

5.4.1 Gene expression analysis of pCMAK1:PpSR40 transformants

To determine the success of the overexpression in the pCMAK1:PpSR40 transformants, gene expression analyses were carried out on the individual transformation lines. The results show varied levels of PpSR40 overexpression in the different lines. Interestingly, the overexpression of PpSR40 also affected the expression of PpSR36 and PpSR39 to some extent. This effect was most pronounced in line 2, in which the expression of PpSR36 and PpSR39 was almost as high as that of the overexpressed PpSR40.

Table 3. The observed phenotypes of the six transformant lines correlated with expression of PpSR40, PpSR36 and PpSR39. The scale for gametophyte distribution ranges from low / none (+) through some / few (++) up to wild-type differentiation (+++). The amount of protonema is inversely correlated to the amount of gametophytes. The water content scale ranges from normal (+) through high (++) to very high (+++). The scale for SR-protein expression ranges from up to 2-fold increase (+) through 2 to 4-fold increase (++) to 6 to 8-fold increase (+++).

Line  Gametophyte  Protonema  Water  PpSR40  PpSR36  PpSR39
L1    +++          +          +      +++     +       ++
L2    +++          +          +      +++     +++     +++
L3    +++          +          ++     +++     -       ++
L4    ++           ++         ++     +       +       ++
L5    +            +++        +++    +++     ++      ++
L6    +            +++        +++    ++      -       +


Figure 9. Band density results of PCR products of the genes PpSR40, PpSR36 and PpSR39 from the pCMAK1:PpSR40 transformants. Beta-tubulin 1 was used as endogenous control. The six different transformant lines show varied overexpression of PpSR40. Furthermore, the overexpression of PpSR40 also affected the expression of PpSR36 and PpSR39 to some extent. Gene expression data were normalized with beta-tubulin 1 as reference for endogenous overall transcription levels (data are presented as means ± SD fold-change relative to control).

6. Discussion

6.1 Protein identification and characterization

Three novel SR-proteins of the SR-subfamily were identified in Physcomitrella patens. They are highly similar in their intron-exon structure. This is not surprising, as the phylogenetic analysis suggests that they are the result of two gene duplications and subsequent sequence divergence. This is in agreement with similar events in A. thaliana, in which three rounds of genome duplication have likely occurred (Simillion et al., 2002) and in which 12 SR-protein genes are located in the duplicated regions (Kalyna and Barta, 2004). It has previously been shown that SR-protein genes are under positive selection after gene duplication and further diversification (Escobar et al., 2006). To measure positive selection, the number of synonymous substitutions per synonymous site (dS) and the number of nonsynonymous substitutions per nonsynonymous site (dN) are estimated (Escobar et al., 2006); a dN/dS ratio greater than 1 is taken as evidence of positive selection. This means that there has been selection for mutations that alter the amino acid sequence, thus diversifying the SR-protein gene family. The SR-proteins recognize and bind exonic splicing enhancers (ESEs) (Maniatis and Tasic, 2002), so diversification of the sequence-binding region in the SR-proteins could possibly increase the number of substrates of the SR-protein gene family. This suggests that it is beneficial for the organism to have more SR-proteins, thereby increasing protein diversity. Mutations can, however, also occur in regulatory sequences that control spatial and temporal expression (Kalyna and Barta, 2004). The A. thaliana SR-proteins atSRp30 and atSRp34/SR1 are both expressed during early root formation, whereas atSRp34/SR1 in later stages is only expressed in the growing root tip (Lopato et al., 1999). This highlights the potential functional overlap, or perhaps redundancy, and the later highly specialized function of different SR-protein genes.
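The dN/dS comparison described above can be made concrete with a short Python sketch. It only interprets a pair of already estimated rates. Estimating dN and dS from aligned coding sequences requires a substitution model and is not shown. The function names and the example rates are hypothetical, and a real analysis would test statistically whether the ratio differs from 1 rather than apply a fixed cutoff.

```python
def omega(dn, ds):
    """Ratio of nonsynonymous (dN) to synonymous (dS) substitution rates."""
    if ds <= 0:
        raise ValueError("dS must be positive to compute dN/dS")
    return dn / ds


def selection_regime(dn, ds, tolerance=1e-9):
    """Classify the ratio: above 1 suggests positive selection,
    below 1 purifying selection, and about 1 neutral evolution."""
    ratio = omega(dn, ds)
    if ratio > 1 + tolerance:
        return "positive"
    if ratio < 1 - tolerance:
        return "purifying"
    return "neutral"


# Invented example rates, for illustration only.
for dn, ds in [(0.3, 0.1), (0.05, 0.5), (0.2, 0.2)]:
    print(dn, ds, selection_regime(dn, ds))
```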


6.2 SR-protein transcripts in P. patens

SR-proteins are essential for AS (Dauksaite and Akusjärvi, 2004) and are themselves known to be subject to AS (Isshiki et al., 2006; Kalyna et al., 2003; Lopato et al., 1999; Palusa et al., 2007). Therefore AS was investigated in PpSR40, PpSR36 and PpSR39 transcripts by analyzing ESTs. In the ESTs examined, no alternative transcripts were found that affected the coding sequences and thus the protein products. However, the analysis revealed different transcriptional starts and subtle variations in the 5’ UTR, as well as an alternative 5’ UTR exon in the PpSR40 transcripts. The UTRs, primarily the 3’ UTR, are thought to contain various regulatory sequences and are believed to affect the stability of the transcript (Mazumder et al., 2003). Because few ESTs were found and no full-length cDNA was available, not all transcripts may be represented; in addition, depending on the sequencing method (5’ or 3’), different regions of the transcripts are underrepresented. The ESTs are all from standard conditions but from different tissues, such as gametophytes, protonema and protoplasts. Even though it was not seen in this EST analysis, SR-proteins are known to be alternatively spliced (Isshiki et al., 2006; Lopato et al., 1999; Tanabe et al., 2007). If alternative splicing introduces a PTC, the transcript would encode a truncated protein. Transcripts containing PTCs are, however, often not translated but are instead targeted by NMD (Belgrader et al., 1994; Maquat, 2004). AS and NMD might play an important role in post-transcriptional gene regulation through RUST (Lewis et al., 2003), a process which is highly conserved in mammalian SR-proteins (Lareau et al., 2007) and has been suggested to be important for the autoregulation of SR-protein expression (Ni et al., 2007). In P. patens no signs of AS of SR-proteins were seen in control conditions in different tissues. These first results suggest that SR-proteins are not regulated through NMD in the tissues examined. This does not, however, rule out that AS of SR-proteins in P. patens is used in certain developmental stages and tissues.
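How an alternative splicing event can introduce a PTC may be easier to see in code. The Python sketch below scans a coding sequence in frame and reports whether a stop codon occurs before the final codon. The function names and the toy sequences are hypothetical, and the sketch assumes the standard genetic code and a sequence that starts at the first base of the start codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def first_in_frame_stop(cds):
    """Return the 0-based nucleotide index of the first in-frame stop
    codon, or None if the reading frame contains no stop codon."""
    cds = cds.upper().replace("U", "T")
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i
    return None


def has_premature_stop(cds):
    """True if an in-frame stop codon occurs before the last complete codon."""
    position = first_in_frame_stop(cds)
    if position is None:
        return False
    last_codon_start = (len(cds) // 3 - 1) * 3
    return position < last_codon_start


# Toy sequences (invented): the second mimics an isoform with an early stop.
normal_isoform = "ATGGCTTGGTAA"    # ATG GCT TGG TAA
altered_isoform = "ATGTGAGCTTAA"   # ATG TGA GCT TAA
print(has_premature_stop(normal_isoform), has_premature_stop(altered_isoform))
```

Whether such a transcript is actually degraded by NMD depends on further features of the mRNA that this sketch does not model.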

6.3 Gene expression analysis

Gene expression analyses of SR-proteins in A. thaliana have shown that SR-proteins are differentially expressed in tissues and developmental stages (Golovkin and Reddy, 1998; Kalyna et al., 2003; Lazar et al., 1995; Lopato et al., 1999; Lopato et al., 1996; Palusa et al., 2007). In rice, Oryza sativa, by contrast, SR-proteins are not markedly differentially expressed in tissues (Isshiki et al., 2006). Gene expression analysis of the SR-proteins PpSR40, PpSR36 and PpSR39 showed tissue-specific differential expression. The genes PpSR40 and PpSR36 showed significant differential expression between protonema and gametophyte tissue, whereas PpSR39 did not. This is consistent with the notion that in certain developmental stages or in different tissues there could be a functional overlap of SR-proteins. In A. thaliana and P. patens there is differential expression of SR-proteins between tissues, but O. sativa shows no marked difference in tissue-specific expression. P. patens, representing early diverging land plants, shows, like A. thaliana, variation of SR-protein concentration between tissues and developmental stages. This raises the question of whether this tissue-specific difference in gene expression is an ancestral mechanism that has been lost in O. sativa or one that evolved independently in A. thaliana and P. patens. To answer this question, further experimentation on more plant SR-proteins would need to be carried out.

6.4 Overexpression of PpSR40 in P. patens

6.4.1 Transformation of P. patens

The generation of stable transformants in P. patens has proven to be time consuming. Even though P. patens readily takes up transformation cassettes and can, with great efficiency, incorporate supplied linearized DNA into the genome by homologous recombination, a strenuous post-transformational procedure is required. It is known that intact bacterial plasmids are replicated in P. patens cells without integration into the genome (Ashton et al., 2000). All moss transformants, whether they have incorporated the transformation cassette or carry it extrachromosomally, will be able to survive and grow on selection (Hohe et al., 2004). This results in a surplus of transformants that are not stable. The non-stable transformants have the same morphology as the stable transformants that successfully incorporated the transformation cassette in the genome and cannot be distinguished from them by appearance. When the moss transformants are subjected to two rounds of selection, separated by a period of growth without selection, stable transformants that have incorporated the transformation cassette show significant growth. The second round of selection ensures that extrachromosomally replicating plasmids are lost (Hohe et al., 2004). In this overexpression experiment 42 initial transformants were able to grow on the first round of selection; after a second round of selection six moss colonies showed significant growth and were regarded as stable transformants. This indicates that the overexpression of PpSR40 is not inherently toxic.

The six transformant lines exhibit differences in their morphology. Lines one, two and three differentiate similarly to wild-type moss, whereas line four colonies consist primarily of protonema with a few gametophytes, and line five and six colonies consist of undifferentiated protonema. It is, however, unclear what the reasons for the distinct phenotypes are. Moss that undergoes transformation will sometimes acquire more chromosomes in the nucleus, a phenomenon called polyploidy (Schween et al., 2005). P. patens protoplasts are haploid but can become diploid or tetraploid after transformation. The ploidy of moss affects the differentiation and gametophyte morphology (Schween et al., 2005). Transformant lines one, two and three show wild-type differentiation, which indicates haploid or diploid moss (Schween et al., 2005). Haploid moss can in most cases be distinguished from diploid moss by studying the morphology of gametophytes (Schween et al., 2005). However, transformant lines four, five and six have few to no gametophytes, thus displaying the characteristics of tetraploidy (Schween et al., 2005). To be certain of the ploidy of the moss lines, flow cytometric analysis would need to be employed.

6.4.2 Gene expression analysis of P. patens transformants

When the relative concentration of SR-proteins is altered, the splice-site choice in target genes may change (Lazar and Goodman, 2000; Smith and Valcárcel, 2000). The six P. patens transformant lines show varied levels of overexpression of PpSR40, ranging from 2.5- to 7.5-fold relative to wild type. Increased expression was not limited to PpSR40; the expression of PpSR36 and PpSR39 was also increased in most lines. In moss transformant line two, the expression of all three SR-protein genes was increased about 7-fold relative to wild type. How the ploidy of the moss affects gene expression, and specifically the expression of SR-proteins, is not clear. It is known, however, that SR-protein concentration can be altered by stress (Palusa et al., 2007; Tanabe et al., 2007). The overexpression of a specific SR-protein could be stressful for the plant. A. thaliana plants overexpressing atRSZ33 have reduced viability (Kalyna et al., 2003), suggesting that control of the expression levels of SR-proteins is vital. It has been proposed that AS, RUST and subsequent NMD play a role in the regulation of SR-protein levels. In the six PpSR40 overexpression lines, which all had elevated PpSR40 levels, no significant alteration of splicing patterns was seen. It is known that SR-proteins determine splice-site usage in a concentration-dependent manner. These first results suggest that the PpSR40, PpSR36 and PpSR39 transcripts are not substrates for PpSR40. Furthermore, the results suggest that the SR-protein levels in these overexpression lines are not regulated through RUST.


Overexpression studies of SR-proteins in other plants, A. thaliana and O. sativa, show alteration of the splicing patterns of not only the overexpressed protein but also other SR-proteins (Isshiki et al., 2006; Kalyna et al., 2003; Lopato et al., 1999). Alteration of these splicing patterns results in a decrease in the expression of full-length transcript, thereby regulating SR-protein expression (Isshiki et al., 2006; Kalyna et al., 2003; Lopato et al., 1999). Why is this not seen in moss? The SR-protein regulation in A. thaliana and O. sativa might be more refined than in the early diverging moss P. patens. Both A. thaliana and O. sativa are morphologically more complex and have more complex biological processes. In complex life forms, homeostasis is of utmost importance and SR-proteins play pivotal roles in gene expression.

6.5 Conclusion

The three identified novel SR-proteins of the SR-subfamily in P. patens show high similarity in intron-exon structure and in protein domain distribution. The SR-protein gene family is likely the result of gene duplication with subsequent sequence divergence, as also seen in the flowering plant A. thaliana. The proteins PpSR40 and PpSR36 show differential tissue-specific expression, while PpSR39 does not. Tissue-specific expression of SR-proteins is seen in A. thaliana but not in O. sativa. Overexpression of PpSR40 does not alter the splicing pattern of PpSR40, PpSR36 and PpSR39, in contrast to overexpression of SR-proteins in other plants. P. patens shows most of the SR-protein characteristics seen in later diverging plants, suggesting that the SR-protein regulation mechanisms were already established in the common land plant ancestor and were further refined in flowering plants. However, more research on SR-proteins would need to be carried out in more early diverging plant lineages, e.g. the green alga Chlamydomonas reinhardtii or the lycophyte Selaginella moellendorffii.

7. Acknowledgements

I would like to thank my supervisor Dr. Johan Edqvist for all his help and guidance. I would like to thank Monika Edstam for all the assistance in the lab. I would also like to thank Jessica Olsen for all her help in the lab, ideas and rewarding discussions.

8. References

Abramoff, M.D., Magelhaes, P.J., and Ram, S.J. (2004). Image Processing with ImageJ. Biophotonics International 11, 36-42.

Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research 25, 3389-3402.

Ashton, N.W., Champagne, C.E.M., Weiler, T., and Verkoczy, L.K. (2000). The bryophyte Physcomitrella patens replicates extrachromosomal transgenic elements. New Phytologist 146, 391-402.

Ashton, N.W., and Cove, D.J. (1977). The isolation and preliminary characterisation of auxotrophic and analogue resistant mutants of the moss, Physcomitrella patens. Molecular and General Genetics MGG 154, 87-95.

Barta, A., Kalyna, M., and Reddy, A.S.N. (2010). Implementing a Rational and Consistent Nomenclature for Serine/Arginine-Rich Protein Splicing Factors (SR Proteins) in Plants. The Plant Cell Online 22, 2926-2929.


Belgrader, P., Cheng, J., Zhou, X., Stephenson, L.S., and Maquat, L.E. (1994). Mammalian nonsense codons can be cis effectors of nuclear mRNA half-life. Mol Cell Biol 14, 8219-8228.

Cáceres, J.F., Stamm, S., Helfman, D.M., and Krainer, A.R. (1994). Regulation of Alternative Splicing in Vivo by Overexpression of Antagonistic Splicing Factors. Science 265, 1706-1709.

Chandler, S.D., Mayeda, A., Yeakley, J.M., Krainer, A.R., and Fu, X.-D. (1997). RNA splicing specificity determined by the coordinated action of RNA recognition motifs in SR proteins. Proceedings of the National Academy of Sciences 94, 3596-3601.

Cove, D.J. (2005). The moss Physcomitrella patens. Annual Review of Genetics 39, 339-358.

Cove, D.J., Bezanilla, M., Harries, P., and Quatrano, R.S. (2006). Mosses as Model Systems for the Study of Metabolism and Development. Annu Rev Plant Biol 57, 497-520.

Cove, D.J., Knight, C.D., and Lamparter, T. (1997). Mosses as model systems. Trends in plant science 2, 99-105.

Cove, D.J., Quatrano, R.S., and Hartmann, E. (1996). The alignment of the axis of asymmetry in regenerating protoplasts of the moss, Ceratodon purpureus, is determined independently of axis polarity. Development 122, 371-379.

Dauksaite, V., and Akusjärvi, G. (2004). The second RNA-binding domain of the human splicing factor ASF/SF2 is the critical domain controlling adenovirus E1A alternative 5'-splice site selection. Biochem J 381, 343-350.

Edgar, R.C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research 32, 1792-1797.

Emanuelsson, O., Brunak, S., von Heijne, G., and Nielsen, H. (2007). Locating proteins in the cell using TargetP, SignalP and related tools. Nat Protocols 2, 953-971.

Escobar, A., Arenas, A., and Gomez-Marin, J. (2006). Molecular evolution of serine/arginine splicing factors family (SR) by positive selection. In Silico Biol 6, 347-350.

Felsenstein, J. (1989). PHYLIP - Phylogeny Inference Package (Version 3.2). Cladistics 5, 164-166.

Fu, X.D., Mayeda, A., Maniatis, T., and Krainer, A.R. (1992). General splicing factors SF2 and SC35 have equivalent activities in vitro, and both affect alternative 5' and 3' splice site selection. Proceedings of the National Academy of Sciences 89, 11224-11228.

Golovkin, M., and Reddy, A.S.N. (1998). The Plant U1 Small Nuclear Ribonucleoprotein Particle 70K Protein Interacts with Two Novel Serine/Arginine-Rich Proteins. The Plant Cell Online 10, 1637-1648.

Graveley, B.R. (2001). Alternative splicing: increasing diversity in the proteomic world. Trends in Genetics 17, 100-107.


Guindon, S., and Gascuel, O. (2003). A Simple, Fast, and Accurate Algorithm to Estimate Large Phylogenies by Maximum Likelihood. Systematic Biology 52, 696-704.

Guindon, S., Lethiec, F., Duroux, P., and Gascuel, O. (2005). PHYML Online—a web server for fast maximum likelihood-based phylogenetic inference. Nucleic Acids Research 33, W557-W559.

Hall, S.L., and Padgett, R.A. (1994). Conserved Sequences in a Class of Rare Eukaryotic Nuclear Introns with Non-consensus Splice Sites. Journal of Molecular Biology 239, 357-365.

Hall, S.L., and Padgett, R.A. (1996). Requirement of U12 snRNA for in Vivo Splicing of a Minor Class of Eukaryotic Nuclear Pre-mRNA Introns. Science 271, 1716-1718.

Hohe, A., Egener, T., Lucht, J., Holtorf, H., Reinhard, C., Schween, G., and Reski, R. (2004). An improved and highly standardised transformation procedure allows efficient production of single and multiple targeted gene-knockouts in a moss, Physcomitrella patens. Current Genetics 44, 339-347.

Holm, K., Kallman, T., Gyllenstrand, N., Hedman, H., and Lagercrantz, U. (2010). Does the core circadian clock in the moss Physcomitrella patens (Bryophyta) comprise a single loop? BMC Plant Biology 10, 109.

Horstmann, V., Huether, C., Jost, W., Reski, R., and Decker, E. (2004). Quantitative promoter analysis in Physcomitrella patens: a set of plant vectors activating gene expression within three orders of magnitude. BMC Biotechnology 4, 13.

Horton, P., Park, K.-J., Obayashi, T., Fujita, N., Harada, H., Adams-Collier, C.J., and Nakai, K. (2007). WoLF PSORT: protein localization predictor. Nucleic Acids Research 35, W585-W587.

Isshiki, M., Tsumoto, A., and Shimamoto, K. (2006). The Serine/Arginine-Rich Protein Family in Rice Plays Important Roles in Constitutive and Alternative Splicing of Pre-mRNA. The Plant Cell Online 18, 146-158.

Jackson, L.J. (1991). A reappraisal of non-consensus mRNA splice sites. Nucleic Acids Research 19, 3795-3798.

Kalyna, M., and Barta, A. (2004). A plethora of plant serine/arginine-rich proteins: redundancy or evolution of novel gene functions? Biochemical Society Transactions 32, 561-564.

Kalyna, M., Lopato, S., and Barta, A. (2003). Ectopic Expression of atRSZ33 Reveals Its Function in Splicing and Causes Pleiotropic Changes in Development. Mol Biol Cell 14, 3565-3577.

Kenrick, P., and Crane, P.R. (1997). The origin and early evolution of plants on land. Nature 389, 33-39.


Kohtz, J.D., Jamison, S.F., Will, C.L., Zuo, P., Luhrmann, R., Garcia-Blanco, M.A., and Manley, J.L. (1994). Protein-protein interactions and 5'-splice-site recognition in mammalian mRNA precursors. Nature 368, 119-124.

Kolossova, I., and Padgett, R.A. (1997). U11 snRNA interacts in vivo with the 5' splice site of U12-dependent (AU-AC) pre-mRNA introns. RNA 3, 227-233.

Lareau, L.F., Inada, M., Green, R.E., Wengrod, J.C., and Brenner, S.E. (2007). Unproductive splicing of SR genes associated with highly conserved and ultraconserved DNA elements. Nature 446, 926-929.

Larkin, M.A., Blackshields, G., Brown, N.P., Chenna, R., McGettigan, P.A., McWilliam, H., Valentin, F., Wallace, I.M., Wilm, A., Lopez, R., et al. (2007). Clustal W and Clustal X version 2.0. Bioinformatics 23, 2947-2948.

Lazar, G., and Goodman, H.M. (2000). The Arabidopsis splicing factor SR1 is regulated by alternative splicing. Plant Molecular Biology 42, 571-581.

Lazar, G., Schaal, T., Maniatis, T., and Goodman, H.M. (1995). Identification of a plant serine-arginine-rich protein similar to the mammalian splicing factor SF2/ASF. Proceedings of the National Academy of Sciences 92, 7672-7676.

Lewis, B.P., Green, R.E., and Brenner, S.E. (2003). Evidence for the widespread coupling of alternative splicing and nonsense-mediated mRNA decay in humans. Proceedings of the National Academy of Sciences 100, 189-192.

Lopato, S., Kalyna, M., Dorner, S., Kobayashi, R., Krainer, A.R., and Barta, A. (1999). atSRp30, one of two SF2/ASF-like proteins from Arabidopsis thaliana, regulates splicing of specific plant genes. Genes & Development 13, 987-1001.

Lopato, S., Waigmann, E., and Barta, A. (1996). Characterization of a Novel Arginine/Serine-Rich Splicing Factor in Arabidopsis. The Plant Cell Online 8, 2255-2264.

Maniatis, T., and Tasic, B. (2002). Alternative pre-mRNA splicing and proteome expansion in metazoans. Nature 418, 236-243.

Manley, J.L., and Tacke, R. (1996). SR proteins and splicing control. Genes & Development 10, 1569-1579.

Maquat, L.E. (2004). Nonsense-mediated mRNA decay: splicing, translation and mRNP dynamics. Nat Rev Mol Cell Biol 5, 89-99.

Mayeda, A., and Krainer, A.R. (1992). Regulation of alternative pre-mRNA splicing by hnRNP A1 and splicing factor SF2. Cell 68, 365-375.

Mayeda, A., Munroe, S.H., Cáceres, J.F., and Krainer, A.R. (1994). Function of conserved domains of hnRNP A1 and other hnRNP A/B proteins. EMBO J 13, 5483-5495.

Mazumder, B., Seshadri, V., and Fox, P.L. (2003). Translational control by the 3'-UTR: the ends specify the means. Trends in Biochemical Sciences 28, 91-98.


Ni, J.Z., Grate, L., Donohue, J.P., Preston, C., Nobida, N., O’Brien, G., Shiue, L., Clark, T.A., Blume, J.E., and Ares, M. (2007). Ultraconserved elements are associated with homeostatic control of splicing regulators by alternative splicing and nonsense-mediated decay. Genes & Development 21, 708-718.

Palusa, S.G., Ali, G.S., and Reddy, A.S.N. (2007). Alternative splicing of pre-mRNAs of Arabidopsis serine/arginine-rich proteins: regulation by hormones and stresses. The Plant Journal 49, 1091-1107.

Palusa, S.G., and Reddy, A.S.N. (2010). Extensive coupling of alternative splicing of pre-mRNAs of serine/arginine (SR) genes with nonsense-mediated decay. New Phytologist 185, 83-89.

Rensing, S.A., Lang, D., Zimmer, A.D., Terry, A., Salamov, A., Shapiro, H., Nishiyama, T., Perroud, P.-F., Lindquist, E.A., Kamisugi, Y., et al. (2008). The Physcomitrella Genome Reveals Evolutionary Insights into the Conquest of Land by Plants. Science 319, 64-69.

Reski, R. (1998). Development, genetics and molecular biology of mosses. Bot Acta 111, 1-15.

Saidi, Y., Schaefer, D.G., Goloubinoff, P., Zrÿd, J., and Finka, A. (2009). The CaMV 35S promoter has a weak expression activity in dark grown tissues of moss Physcomitrella patens. Plant Signal Behav 4, 457-459.

Saitou, N., and Nei, M. (1987). The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4, 406-425.

Schaefer, D., Zryd, J.P., Knight, C.D., and Cove, D.J. (1991). Stable transformation of the moss Physcomitrella patens. Molecular and General Genetics MGG 226, 418-424.

Schaefer, D.G. (2001). Gene targeting in Physcomitrella patens. Current Opinion in Plant Biology 4, 143-150.

Schaefer, D.G. (2002). A NEW MOSS GENETICS: Targeted Mutagenesis in Physcomitrella patens. Annual Review of Plant Biology 53, 477-501.

Schaefer, D.G., and Zrÿd, J.-P. (1997). Efficient gene targeting in the moss Physcomitrella patens. The Plant Journal 11, 1195-1206.

Schween, G., Schulte, J., and Reski, R. (2005). Effect of Ploidy Level on Growth, Differentiation, and Morphology in Physcomitrella patens. The Bryologist 108, 27-35.

Screaton, G.R., Cáceres, J.F., Mayeda, A., Bell, M.V., Plebanski, M., Jackson, D.G., Bell, J.I., and Krainer, A.R. (1995). Identification and characterization of three members of the human SR family of pre-mRNA splicing factors. EMBO J 14, 4336-4349.

Sharp, P.A., and Burge, C.B. (1997). Classification of Introns: U2-Type or U12-Type. Cell 91, 875-879.


Sievers, A., Buchen, B., and Hodick, D. (1996). Gravity sensing in tip-growing cells. Trends in plant science 1, 249-250.

Simillion, C., Vandepoele, K., Van Montagu, M.C.E., Zabeau, M., and Van de Peer, Y. (2002). The hidden duplication past of Arabidopsis thaliana. Proceedings of the National Academy of Sciences 99, 13627-13632.

Smith, C.W.J., and Valcárcel, J. (2000). Alternative pre-mRNA splicing: the logic of combinatorial control. Trends in Biochemical Sciences 25, 381-388.

Tacke, R., and Manley, J.L. (1995). The human splicing factors ASF/SF2 and SC35 possess distinct, functionally significant RNA binding specificities. EMBO J 14, 3540-3551.

Tanabe, N., Yoshimura, K., Kimura, A., Yabuta, Y., and Shigeoka, S. (2007). Differential Expression of Alternatively Spliced mRNAs of Arabidopsis SR Protein Homologs, atSR30 and atSR45a, in Response to Environmental Stress. Plant and Cell Physiology 48, 1036-1049.

Tarn, W.-Y., and Steitz, J.A. (1996a). Highly Diverged U4 and U6 Small Nuclear RNAs Required for Splicing Rare AT-AC Introns. Science 273, 1824-1832.

Tarn, W.-Y., and Steitz, J.A. (1996b). A Novel Spliceosome Containing U11, U12, and U5 snRNPs Excises a Minor Class (AT AC) Intron In Vitro. Cell 84, 801-811.

Wang, J., and Manley, J.L. (1995). Overexpression of the SR proteins ASF/SF2 and SC35 influences alternative splicing in vivo in diverse ways. RNA 1, 335-346.

Wilkins, M.R., Lindskog, I., Gasteiger, E., Bairoch, A., Sanchez, J.-C., Hochstrasser, D.F., and Appel, R.D. (1997). Detailed peptide characterization using PEPTIDEMASS – a World-Wide-Web-accessible tool. ELECTROPHORESIS 18, 403-408.

von Wettstein, F. (1924). Morphologie und Physiologie des Formwechsels der Moose auf genetische Grundlage, II. Indukt Abstamm Vererbungsl 33, 1924-1941.

Wood, A.J., Oliver, M.J., and Cove, D.J. (2000). Bryophytes as Model Systems. The Bryologist 103, 128-133.

Wu, J.Y., and Maniatis, T. (1993). Specific interactions between proteins implicated in splice site selection and regulated alternative splicing. Cell 75, 1061-1070.

Xiao-Qin, G., Hong-Zhi, Z., and De-Bao, L. (2007). Alternative splicing of pre-mRNA in plants. Chinese Journal of Agricultural Biotechnology 4, 1-8.

Xiao, S.H., and Manley, J.L. (1997). Phosphorylation of the ASF/SF2 RS domain affects both protein-protein and protein-RNA interactions and is necessary for splicing. Genes & Development 11, 334-344.

Zahler, A.M., Neugebauer, K.M., Lane, W.S., and Roth, M.B. (1993). Distinct Functions of SR Proteins in Alternative Pre-mRNA Splicing. Science 260, 219-222.


Zhang, X.-N., and Mount, S.M. (2009). Two Alternatively Spliced Isoforms of the Arabidopsis SR45 Protein Have Distinct Roles during Normal Plant Development. Plant Physiology 150, 1450-1458.
