ABSTRACT

COMPETITION BETWEEN ALTERNATIVE SPLICING AND POLYADENYLATION DEFINES THE EXPRESSION OF THE OXT6 GENE ENCODING TWO PROTEINS INVOLVED IN MRNA PROCESSING

by Zhaoyang Liu

This thesis reports a study of the interaction between messenger RNA polyadenylation and splicing in the expression of the Arabidopsis gene OXT6. This gene encodes two proteins that may be involved in both the polyadenylation and splicing processes. Interestingly, alternative polyadenylation or splicing of intron-2 of the gene defines the expression ratio of the two transcripts. To reveal the relationship between these processing events, a set of mutations was introduced at the splice sites and polyadenylation signals within intron-2, and the constructs were transformed into oxt6 mutant plants. The splicing and polyadenylation events were monitored by quantitative RT-PCR, revealing the competitive nature of alternative polyadenylation and splicing during OXT6 transcript processing. ChIP (chromatin immunoprecipitation) analysis performed against RNA Polymerase II in splicing mutant lines suggests that Pol II binding may affect the switch between polyadenylation and splicing during transcript processing. This work is the first of its kind to exemplify the interplay among transcription, splicing and polyadenylation that defines a gene’s expression outcome in plants.

COMPETITION BETWEEN ALTERNATIVE SPLICING AND POLYADENYLATION DEFINES THE EXPRESSION OF THE OXT6 GENE ENCODING TWO PROTEINS INVOLVED IN MRNA PROCESSING

A Thesis

Submitted to the Faculty of Miami University in partial fulfillment of the requirements for the degree of Master of Science

Cell, Molecular and Structural Biology Graduate Program

by Zhaoyang Liu

Miami University, Oxford, Ohio

2010

Advisor______(Qingshun Quinn Li)

Reader______(Chun Liang)

Reader______(Nicholas P. Money)

TABLE OF CONTENTS

CHAPTER 1: INTRODUCTION

MULTIPLE LINKS BETWEEN MESSENGER RNA PROCESSING THROUGH POLYADENYLATION AND SPLICING

MESSENGER RNA POLYADENYLATION MECHANISM AND POLYADENYLATION FACTORS

MESSENGER RNA SPLICING MECHANISMS AND SPLICING FACTORS

INTERACTIONS BETWEEN POLYADENYLATION AND SPLICING

THE OXT6 GENE AND ITS CONNECTION WITH ALTERNATIVE SPLICING AND POLYADENYLATION

CHAPTER 2: MATERIALS AND METHODS

ISOLATION OF THE OXT6 MUTANT

DETECTION OF OXT6 POLYADENYLATION SIGNALS

CONSTRUCTION OF OXT6 SPLICING AND POLYADENYLATION MUTATIONS

PLANT TRANSFORMATIONS

QUANTITATIVE RT-PCR ANALYSIS OF MUTANT GENE EXPRESSION

CHROMATIN IMMUNOPRECIPITATION (CHIP) ASSAYS

CHAPTER 3: RESULTS

POLYADENYLATION AND SPLICING PROFILES OF THE OXT6 GENE

MUTAGENESIS OF THE SPLICE SITES AND POLY(A) SIGNALS

POLYADENYLATION AND SPLICING PATTERNS OF OXT6 WERE ALTERED BY MUTATION

COMPETITIVE RELATIONSHIP BETWEEN SPLICING AND POLYADENYLATION IN INTRON-2 OF OXT6

MUTATIONS ON SPLICE SITES CHANGE THE BINDING AFFINITY OF POL II

CHAPTER 4: DISCUSSION AND CONCLUDING REMARKS

THE COMPETITION NATURE OF POLYADENYLATION AND SPLICING DURING PROCESSING OF THE OXT6 GENE

THE REGULATORY ROLE OF POL II IN OXT6 GENE PROCESSING

FUTURE PERSPECTIVE

REFERENCES

LIST OF TABLES

TABLE 1. PRIMERS USED

TABLE 2. OXT6 MUTATION CONSTRUCTS AND TRANSGENIC PLANTS USED

LIST OF FIGURES

FIGURE 1. THE STRUCTURE OF THE OXT6 GENE AND ITS RNA TRANSCRIPTS

FIGURE 2. POLY(A) SITES DETECTED IN ATCPSF30 AND ATC30Y TRANSCRIPTS

FIGURE 3. AN ILLUSTRATION OF THE SEVEN OXT6 MUTATION CONSTRUCTS

FIGURE 4. CRYPTIC POLY(A) SITES AND CRYPTIC SPLICE SITES DETECTED IN TRANSGENIC LINES

FIGURE 5. RATIO FOLD OF CHANGE OF THE ATCPSF30 AND ATC30Y TRANSCRIPTS IN THE OXT6 MUTATION CONSTRUCTS

FIGURE 6. CHIP ASSAYS OF THE SPLICING MUTANT TRANSGENIC LINES

ACKNOWLEDGEMENT

I would like to express my sincere gratitude to my advisor, Dr. Qingshun Quinn Li, for his constant encouragement and guidance. As an amazing advisor, he devoted great effort to helping me uncover the intriguing truth behind complicated phenomena, and he led me through many critical moments during my master’s research.

I am also grateful to my committee members, Dr. Chun Liang and Dr. Nicholas P. Money, for their insightful comments, valuable suggestions and warm support for both my study and research.

I would like to thank all the fellow members of the Li lab who offered me great help. I acknowledge Dr. Man Liu, Dr. Denghui Xing, Yingjia Shen and Jun Zheng for their generous collaboration and helpful discussions. Thanks to the student assistants who worked in the lab during the past two years. In particular, I appreciate Daniel F. Comiskey for his assistance with my research.

I would also like to thank the faculty, staff and students in the Department of Botany. I acknowledge Ms. Barb Wilson and Ms. Vickie Sandlin for their diligent work, and I thank Dr. John W. Hawes and Ms. Xiaoyun Deng at the Center of Bioinformatics and Functional Genomics, Miami University, for their excellent technical support. Thanks to all my friends who have been so helpful and friendly to me.

Without the support of my family, I would not have been able to complete my work. I sincerely thank my parents for their lasting love and understanding.

Finally I would like to extend my gratitude to the NSF award and the Academic Challenge Program in the Department of Botany, Miami University, for funding support of this research.

Chapter 1: Introduction

Multiple links between messenger RNA processing through polyadenylation and splicing

In eukaryotic cells, messenger RNA (mRNA) is transcribed by RNA polymerase II (Pol II), capped at the 5’ end, spliced by the spliceosome and polyadenylated at the 3’ end to form a mature mRNA. Splicing is a process in which introns are removed and exons are ligated together by a two-step transesterification reaction. Polyadenylation is a process in which cleavage at the polyadenylation site is followed by the polymerization of an adenosine tail at the 3’ end of the mRNA molecule. Both splicing and polyadenylation serve many important biological functions, including mRNA stability, translocation to the cytoplasm and translation (Coller et al., 1998; Huang and Carmichael, 1996; Kuhn and Wahle, 2004), and both interact with multiple mRNA processing events (Zhao et al., 1999; Kornblihtt et al., 2004; Proudfoot, 2004; Gilmartin, 2005).

The first function of mRNA polyadenylation is to protect the mature mRNA (Coller et al., 1998). In eukaryotic cells, mRNA is mainly degraded by two pathways, both of which are initiated by shortening of the 3’ polyadenosine [poly(A)] tail and followed by 3’-exonuclease or 5’-exonuclease digestion (Franks and Lykke-Andersen, 2008). Poly(A) tails can protect mature mRNA from unregulated digestion from both ends (Kuhn and Wahle, 2004). Secondly, the poly(A) tail promotes the export of mRNA from the nucleus to the cytoplasm (Huang and Carmichael, 1996; Kessler et al., 1997). Both polyadenylation-related proteins and poly(A)-binding proteins are believed to be involved in mRNA export (Coller et al., 1998; Brodsky and Silver, 2000; Poon et al., 2000). In addition, poly(A)-binding proteins interact with the translational machinery and greatly increase translation efficiency by enhancing translation initiation (Kuhn and Wahle, 2004; Siddiqui et al., 2007). Moreover, some polyadenylation-related proteins are also involved in other cellular processes, such as the maturation of small nucleolar RNAs (snoRNAs) (Nedea et al., 2003; Morlando et al., 2004) and the formation of the 3’ end of cell cycle-regulated histone mRNAs (Dominski et al., 2005; Benoit et al., 2008).

Polyadenylation is also coupled to other mRNA processes. For example, some polyadenylation complex subunits interact with the transcription complex at the stages of transcription initiation (Calvo and Manley, 2003; Proudfoot, 2004), elongation (Dantonel et al., 1997; Barilla et al., 2001) and termination (Hammell et al., 2002; Buratowski, 2005). It has been discovered that one of the polyadenylation factors, CPSF (cleavage and polyadenylation specificity factor), can be purified with the general transcription factor TFIID, which plays a central role in promoter recognition during transcription initiation. The interaction with TFIID recruits CPSF to the promoter and brings it to Pol II concomitant with transcription initiation (Dantonel et al., 1997). The largest Pol II subunit contains a carboxyl-terminal domain (CTD), a heptad amino acid sequence repeated 52 times in mammals and 26 times in yeast. Phosphorylation of two serine residues (Ser2 and Ser5) per repeat is essential at different stages of transcription elongation (Dahmus, 1996). The CTD serves as a gathering and delivering platform for polyadenylation factors and also acts as an integral component of the 3’ end processing complex (McCracken et al., 1997; Hirose and Manley, 1998). These interactions indicate multiple links between mRNA polyadenylation and other mRNA processing events, as well as the association of the polyadenylation machinery with the transcriptional apparatus.

On the other hand, mRNA splicing is also tightly coupled with other RNA processing events, such as mRNA transport from the nucleus to the cytoplasm (Luo and Reed, 1999). mRNAs generated by splicing were exported more efficiently than their identical counterparts lacking introns. This could be explained by the assembly of spliced mRNAs into a distinct mRNP complex, named the junction complex, which consists of multiple proteins such as REF. The splicing reaction may recruit the mRNA export factor ALY to the mRNA and thereby promote export efficiency (Zhou et al., 2000). The recruitment of REF during mRNA biogenesis was also proposed to increase the export of spliced mRNA (Zhou et al., 2000; Strasser and Hurt, 2001).

Growing evidence shows that splicing occurs co-transcriptionally (Proudfoot et al., 2002; Pandya-Jones and Black, 2009). Indeed, all three mRNA processing reactions (capping, splicing, and polyadenylation) can be tightly coupled to Pol II transcription (Proudfoot et al., 2002; Bentley, 2002). In the case of splicing, some splicing factors and splicing-related proteins, such as the U1 70 kDa subunit, were found to associate with the CTD of Pol II (Proudfoot et al., 2002). Zeng and Berget (2000) found that depletion of Pol II inactivated splicing of pre-mRNA and that addition of recombinant CTD restored the activity. For a long gene, some introns could be spliced co-transcriptionally, while others could be processed post-transcriptionally. The time that Pol II takes to synthesize each intron defines the minimal time in which splicing factors can be recruited to form a spliceosome, while the time that Pol II takes to reach the end of the transcription unit defines the maximal time in which splicing could occur co-transcriptionally (Neugebauer, 2002). Pandya-Jones and Black (2009) reported that in pre-mRNA transcripts of the human c-Src and fibronectin genes, the majority of introns had already been excised in a general 5’-to-3’ order, and the introns flanking alternative exons were also removed co-transcriptionally, with excision efficiency differing between cell lines under different regulatory conditions. These results indicate that the decision on alternative splicing is made co-transcriptionally, and that the efficiency of intron excision varies within the pre-mRNA. The elongation rate of Pol II may play a role in selecting the alternative splice sites. Kadener et al. (2000) reported that less processive transcription elongation promoted the inclusion of an alternatively spliced exon, while more processive transcription had the reverse effect. These results suggest that when Pol II is less processive, it may spend more time transcribing through an alternative splice site of a gene, and may therefore allow more time for the splicing factors to assemble on the splice site. If this intron also happens to house an alternative poly(A) site, the pausing time at the site may influence the decision between splicing and polyadenylation.
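The kinetic argument above can be made concrete with a small sketch. It only divides a length by an elongation rate; the function names and the numbers used (a 1000 nt intron, rates of 50 and 25 nt per second) are hypothetical placeholders chosen for illustration and are not measurements from this study.

```python
def intron_synthesis_time(intron_length_nt, rate_nt_per_s):
    """Seconds Pol II needs to transcribe an intron of the given length."""
    if rate_nt_per_s <= 0:
        raise ValueError("elongation rate must be positive")
    return intron_length_nt / rate_nt_per_s


def time_to_unit_end(distance_to_end_nt, rate_nt_per_s):
    """Seconds Pol II needs to travel from a position to the end of the unit."""
    if rate_nt_per_s <= 0:
        raise ValueError("elongation rate must be positive")
    return distance_to_end_nt / rate_nt_per_s


# Hypothetical values: halving the elongation rate doubles the time
# available for splicing factors to assemble while the intron is made.
fast = intron_synthesis_time(1000, 50.0)
slow = intron_synthesis_time(1000, 25.0)
print(fast, slow)
```

Under these assumed values, a polymerase that elongates at half the rate leaves twice as long for factors to assemble on the nascent intron, which is the intuition behind the processivity effect described above.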

Messenger RNA polyadenylation mechanism and polyadenylation factors

Polyadenylation is directed by polyadenylation signals that are present in pre-mRNAs as cis-elements. In mammalian cells, there are three core polyadenylation signals: the highly conserved AAUAAA sequence located between 10 and 35 nucleotides (nts) upstream of the cleavage site; a less conserved U-rich or GU-rich element located downstream of the cleavage site; and the cleavage site itself, which is also referred to as the poly(A) site (Zhao et al., 1999; Tian et al., 2005). In plants, the polyadenylation signals are somewhat different from their mammalian counterparts. Three different classes of cis-elements have been described: the near-upstream element (NUE), an equivalent of the AAUAAA signal in mammals, which is an A-rich element situated between 10 and 35 nts upstream of its associated poly(A) site; the far-upstream element (FUE), which is located as far as 100 nts upstream of the poly(A) site (Rothnie, 1996; Li and Hunt, 1997); and the poly(A) site with its flanking U-rich element, named the Cleavage Element (CE; Loke et al., 2005).
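As a minimal sketch of how the positional rule above can be applied, the function below looks for a hexamer such as AAUAAA that lies entirely within the window 10 to 35 nts upstream of a given cleavage site. The function name, the 0-based coordinate convention and the example sequence are assumptions made for illustration; this is not the procedure used in this thesis to identify signals.

```python
def signal_positions(rna, site, motif="AAUAAA", near=10, far=35):
    """Return 0-based start positions of `motif` lying fully inside the
    window between `far` and `near` nucleotides upstream of `site`.

    `site` is the 0-based index of the first nucleotide downstream of the
    cleavage position (an assumed convention for this sketch).
    """
    rna = rna.upper()
    lo = max(0, site - far)   # most upstream position of the window
    hi = max(0, site - near)  # window ends `near` nts before the site
    hits = []
    for i in range(lo, hi - len(motif) + 1):
        if rna[i:i + len(motif)] == motif:
            hits.append(i)
    return hits


# Hypothetical sequence: the hexamer starts 21 nts upstream of the site.
example = "C" * 20 + "AAUAAA" + "C" * 15 + "G" * 10
print(signal_positions(example, site=41))
```

Because plant NUEs are A-rich rather than strictly AAUAAA, the `motif` argument would in practice be swapped for other candidate hexamers; the window logic stays the same.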

These polyadenylation signals are recognized by an apparatus consisting of a complex of about 25 to 30 protein factors. In mammals, these factors are termed cleavage and polyadenylation specificity factor (CPSF), cleavage stimulation factor (CstF), cleavage factors CF Im and CF IIm, and poly(A) polymerase (PAP) (Zhao et al., 1999; Shi et al., 2009). In plants, a set of polyadenylation complex factors has been characterized, including the Arabidopsis homologues of PAP, CPSF and CstF subunits, as well as the Fip1 subunit (Xu et al., 2004; Delaney, 2006; Forbes et al., 2006; Hunt et al., 2008). In particular, the Arabidopsis CPSF complex (AtCPSF) contains subunits such as AtCPSF30, AtCPSF73-I, AtCPSF73-II, AtCPSF100, AtCPSF160, AtFIPS5 and AtFY, where AtCPSF30 serves as a bridge that associates the AtCPSF complex with other polyadenylation factors (Hunt et al., 2008).

In both mammals and plants, pre-mRNAs can be polyadenylated in different ways through the use of more than one polyadenylation signal. This process is called alternative polyadenylation (APA), and it allows a single gene to generate different transcripts. In some cases these transcripts differ only in their 3’ ends, and in other cases they can encode entirely different proteins because of changes in the coding sequences (Lutz, 2008). In the latter case, APA is thought to interact with pre-mRNA splicing. The best-known example of APA in plants is the FCA gene, which contains an APA site in an intron whose use yields a non-functional truncated protein. The ratio between the truncated protein and the full-length protein is crucial for FCA to regulate flowering time (Simpson et al., 2004b; Xing et al., 2008). According to Tian et al. (2005), over half of human genes (~54%) have APA products. These poly(A) sites can lead to variable 3’ untranslated regions (UTRs) that contain different cis-elements controlling the stability or translatability of the mRNA, leading to disease or gene expression regulation (Mayr and Bartel, 2009; Ji, 2009; Singh, 2009). A large number of poly(A) sites are located in introns and internal exons, upstream of the 3’-most exon. In humans, the use of alternative poly(A) sites in introns or exons may lead to distinct protein products (Farh et al., 2005; Khabar et al., 2005; Simpson et al., 2004a). Recent studies have also revealed APA profiles in plants. Shen et al. (2008) analyzed the rice genome and found that 50% of the genes analyzed had more than one unique poly(A) site, and that about 1% of all the transcripts possessed alternative poly(A) sites located in introns. Together, these data indicate that APA is common in both mammals and plants and can contribute significantly to the complexity of the transcriptome and proteome in the cell.

Polyadenylation also participates in quantitative and qualitative regulation of gene expression (Millevoi and Vagner, 2010). For transcripts carrying a single poly(A) signal, the function of the polyadenylation machinery is to determine whether to process the transcript, and transcripts not processed at the 3’ end will be degraded (Kuhn and Wahle, 2004). For transcripts containing more than one poly(A) signal, the role of the regulatory factors is to determine where to process the transcript. APA in the 3’ UTR may generate transcripts with different stability, localization, transport and translation properties (Zhao et al., 1999; Lutz, 2008), while APA in exons and introns may change the coding sequences and result in various protein isoforms. A poly(A) site in an intron can convert an internal exon into a 3’ terminal exon, a so-called composite exon, or can be associated with otherwise skipped exons (Edwalds-Gilbert, 1997). Both kinds of APA interact with alternative splicing, and growing evidence shows that splicing and polyadenylation are coupled events that take place co-transcriptionally (Proudfoot et al., 2002; Pandya-Jones and Black, 2009).

Messenger RNA splicing mechanisms and splicing factors

The pre-mRNA splicing mechanism is well studied in mammals. Four short sequences define an intron: the 5’ splice site (5’ss) with a conserved GT dinucleotide, which is recognized by the U1 small nuclear ribonucleoprotein particle (snRNP); the 3’ splice site (3’ss) with a conserved AG dinucleotide, which is recognized by the U2 snRNP auxiliary factor 35 (U2AF35); a branch site sequence located about 17-40 nucleotides upstream of the 3’ss, which is recognized by the U2 snRNP; and a polypyrimidine tract located between the branch site and the 3’ss, which is recognized by U2AF65 (Black, 2003). Splicing is a two-step reaction. First, the 5’ss is attacked by the 2’-OH of an adenosine at the branch site to form a 5’ exon with a free 3’-OH and a lariat intermediate. Second, the 3’-OH of the 5’ exon attacks the 3’ss to join with the 3’ exon and release the lariat. How the splicing machinery identifies introns and exons in plants is not well understood; recently, however, some basic information about plant splicing signals has been revealed. Analysis of the 5’ss and 3’ss in all currently known introns of Arabidopsis and rice indicates that these sites are very similar to those of humans (Alexandrov et al., 2004). The polypyrimidine tract found upstream of the 3’ss in humans consists mostly of uridines in plants and can function as a splicing signal (Simpson et al., 2004a). Furthermore, the branch site is not clearly defined in plants (Reddy et al., 2007).
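The sequence features listed above lend themselves to a simple check. The sketch below tests whether an intron (written in DNA notation, as in the text) begins with GT and ends with AG, and reports the pyrimidine content of the stretch just upstream of the 3’ss. The function names, the 10 nt tract length and the example intron are illustrative assumptions only.

```python
def has_canonical_splice_sites(intron):
    """True if the intron starts with GT (5'ss) and ends with AG (3'ss)."""
    intron = intron.upper()
    return len(intron) >= 4 and intron[:2] == "GT" and intron[-2:] == "AG"


def pyrimidine_fraction(intron, tract_length=10):
    """Fraction of C/T in the `tract_length` nts immediately upstream of
    the terminal AG (a rough stand-in for the polypyrimidine tract)."""
    intron = intron.upper()
    region = intron[-(tract_length + 2):-2]
    if not region:
        return 0.0
    return sum(base in "CT" for base in region) / len(region)


# Hypothetical short intron used only to exercise the functions.
toy_intron = "GTAAGTTTTCTCTTTTCAG"
print(has_canonical_splice_sites(toy_intron))
print(pyrimidine_fraction(toy_intron))
```

A mutation at either splice site, such as changing the leading GT to GC, would make `has_canonical_splice_sites` return `False`, which is the kind of change a splice-site mutagenesis experiment introduces.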

Most intron splicing is carried out by the spliceosome, a large complex composed of five snRNPs as well as up to 300 accessory proteins (Zhou et al., 2002; Jurica et al., 2003). There are two types of spliceosomes in higher eukaryotes. The major U2-type spliceosome consists of U1, U2, U4, U5 and U6 snRNPs and catalyzes the removal of introns containing canonical (GT-AG) splice sites, while the minor U12-type spliceosome contains U11, U12, U4atac, U5 and U6atac snRNPs and recognizes a small number of introns with noncanonical splice sites (Reddy, 2001; Jurica, 2003). The sequence-specific binding of the five snRNPs to the splice signals of the pre-mRNA results in the removal of an intron and the ligation of the flanking exons (Brow, 2002). The spliceosome also contains many splicing-related proteins, such as heterogeneous nuclear ribonucleoprotein particles (hnRNPs), DExD/H-box proteins, and SR (Serine-Arginine-rich) proteins (Zhou et al., 2002). In addition, a number of cis-elements located in exons and introns that play enhancing or repressing roles in splicing have been detected (Ladd and Cooper, 2002). The mechanism regulating splicing in plants is less well understood, and plant spliceosomes have never been isolated. Some trans-acting factors, such as SR proteins and hnRNP-like proteins, were identified in the Arabidopsis genome based on similarities to the mammalian trans-acting factors (Lambermon, 2000; Lorkovic, 2000). A total of 74 snRNA genes and 395 genes encoding splicing-related proteins have been identified in this model plant, and most of them are conserved (Wang and Brendel, 2004). Meanwhile, splicing factors such as SR proteins and hnRNPs are vastly expanded in plants, with many novel proteins, indicating plant-specific mechanisms in splice site recognition and splicing regulation (Wang and Brendel, 2004; Reddy, 2007).

Alternative splicing, a process that generates two or more mRNAs from the same pre-mRNA by choosing different splice sites, can lead to structurally and functionally distinct proteins (Woodley et al., 2002). Such a mechanism exists widely in both mammals and plants and contributes significantly to transcriptome and proteome diversity in higher eukaryotes. Alternative splicing in humans is known to be common: approximately 70-80% of human genes have been demonstrated to undergo alternative splicing (Clark et al., 2007). In plants, the corresponding rate is much lower. Recent estimates of alternative splicing in Arabidopsis and rice based on EST/cDNA evidence suggest that over 20% of genes undergo alternative splicing (Campbell et al., 2006; Wang and Brendel, 2006). The most abundant alternative splicing event in humans is exon skipping (~42%), followed by alternative 5’ and alternative 3’ splice site selection, while intron retention is the least common (Kim et al., 2007). In contrast, ~40% of the alternative splicing events observed in Arabidopsis and rice are intron retention, and exon skipping is relatively rare in plants (Ner-Gaon et al., 2004).

Currently there are two models for spliceosome assembly: the intron-definition model, which applies to the splicing of short introns, and the exon-definition model, which applies to the splicing of long introns (Berget, 1995; Wang and Brendel, 2006). Global analysis of gene structure has revealed differences in the architecture of plant and animal genes. In humans, as in other vertebrates, the average gene is 28 kb long, with 8.8 exons of about 130 nucleotides and 7.8 introns of about 3000 nucleotides (Lander et al., 2001; Sakharkar et al., 2006). In contrast, plant genes are generally smaller and contain relatively short introns. In Arabidopsis, the average gene is 2.4 kb long, with 5 exons of about 193 nucleotides and 4 introns of about 433 nucleotides (Wang and Brendel, 2004; Reddy, 2007).

Interactions between polyadenylation and splicing

Increasing evidence indicates that splicing and polyadenylation are coupled events that take place co-transcriptionally (Proudfoot et al., 2002; Pandya-Jones and Black, 2009). Perturbation of either splicing or polyadenylation can affect the other. For example, mutation of the AAUAAA polyadenylation signal can impair the in vitro splicing of the last intron in vertebrate precursor RNAs (Niwa et al., 1992), and the poly(A) site strength has a proportional effect on the efficiency of the splicing reaction in vivo (Scott et al., 1996). Shi et al. (2009) reported that polyadenylation factors can form a functional complex with splicing factors in human cells. Recently, the interrelationship of splicing and polyadenylation has been demonstrated in cases where direct coupling of the two processes is mediated by protein-protein interactions, such as U1 with CF Im (Awasthi and Alwine, 2003), CPSF with U2 (Kyburz, 2006), and U2AF65 with CF Im (Millevoi et al., 2006).

One model explaining the positive regulation between polyadenylation and splicing is the direct recruitment mechanism, in which splicing factors mediate the functional interplay between splicing and 3’ end processing (Millevoi, 2010). This mechanism was first revealed during the process of exon definition in RNA splicing, and it operates mostly at the terminal exon. Three factors, U2AF65, U2 snRNP and SRm160, were found to be directly involved in the coupling. U2AF65 binds to the pyrimidine tract upstream of the 3’ss and stimulates polyadenylation by recruiting the cleavage factor CF Im to the polyadenylation signal (Millevoi, 2006). Coupling between splicing and polyadenylation also requires the interaction between U2 and CPSF, which associates with the polyadenylation signals. Besides U2, the splicing factors SR protein and SRm160 can also stimulate the assembly of the polyadenylation machinery by targeting the CPSF complex (McCracken, 2002; Kyburz, 2006).

On the other hand, the interaction between splicing and polyadenylation can also be negative. Proudfoot (1996) reported that poly(A) site recognition is inhibited by the proximity of a 5’ss. This inhibition normally occurs in genes where the poly(A) site is situated within the body of the gene. The cross-talk between splicing and polyadenylation within a gene may lead to alternative splicing and APA. The best-studied example in mammals is the immunoglobulin M (IgM) heavy chain gene, which contains a relatively weak internal poly(A) site positioned within an intron. The weak upstream internal poly(A) site is used more efficiently in plasma cells, while usage of the downstream poly(A) site is dominant in early developing B cells (Bruce et al., 2003). In plasma cells, the weaker poly(A) site can compete with the splicing process, probably because of the higher amount of CstF-64, which allows recognition of the weaker intronic poly(A) signal (Takagaki and Manley, 1998). Bioinformatic analysis also suggested competition between the splicing and polyadenylation processes. Tian et al. (2007) identified polyadenylation sites in introns for all currently known human genes using cDNA/EST and genome sequences, and found that ~20% of human genes have at least one intronic polyadenylation event that can lead to different mRNA variants. They also suggested that the intronic polyadenylation activity can vary under different cellular conditions for most genes, and that the strength of splice sites and polyadenylation signals may play essential roles in the dynamic interplay between polyadenylation and splicing.

All these results demonstrate that polyadenylation and splicing factors interact with each other, and that these interactions may have evolved into regulatory mechanisms. This interaction is possibly mediated by the Pol II CTD, since it can interact with both splicing factors, such as the U1 70 kDa subunit, and the CPSF complex (Dantonel et al., 1997; Gunderson et al., 1998). However, much remains to be learned regarding the regulatory mechanisms of APA and alternative splicing in plants, particularly the interrelationship between these two events.

The OXT6 gene and its connection with alternative splicing and polyadenylation

The focus of this thesis is an Arabidopsis gene named OXT6 (Oxidative Stress Tolerant-6; gene locus ID At1g30460). It was so named because a mutation of the gene conferred resistance to oxidative stress and was therefore selected in a screen (Zhang et al., 2008). This gene encodes two mRNAs, and hence two proteins, AtCPSF30 and AtC30Y, through alternative polyadenylation and splicing (see below). The oxt6 mutant line, in which a T-DNA insertion in the first exon of the OXT6 gene (Figure 1) disrupts the expression of both transcripts, displayed reduced sensitivity to oxidative stress treatment compared with the wild type, suggesting the involvement of the OXT6 gene in the control of the plant’s response to stress. In addition, the expression levels of the two transcripts in the wild type were altered after oxidative stress treatment, indicating that the alternative processing of the OXT6 gene may play a role in this process. 3’ RACE applied to four genes showed that the poly(A) site choice in oxt6 differed from that in wild type and complemented plants, and microarray data showed that the transcription profiles of many ROS (reactive oxygen species)-related genes were altered in the oxt6 mutant line. More interestingly, AtCPSF30 is a calmodulin-binding protein, and its RNA-binding activity can be inhibited by calmodulin in a calcium-dependent manner. Taken together, these results suggest that alternative processing of OXT6 is involved in a number of important biological functions, such as the plant response to oxidative stress, by regulating mRNA 3’ end processing, and that the regulation of these processes may act through calmodulin.

It is therefore important to understand the mechanism that regulates the expression of the OXT6 gene. This gene was demonstrated to encode two proteins that may link the polyadenylation and splicing processes, and one intron in OXT6 may play an essential role in the formation of the two different transcripts (Delaney et al., 2006). This intron (970 bp, called intron-2 hereafter) contains an APA region where poly(A) sites can be found (see Results for details; Figure 1). When one of these poly(A) sites is used, a shorter transcript of the OXT6 gene is generated. It encodes the polyadenylation factor subunit AtCPSF30, an Arabidopsis ortholog of mammalian CPSF30 (the 30 kDa subunit of Cleavage and Polyadenylation Specificity Factor). AtCPSF30 serves as a hub of an extensive network of protein-protein interactions in plant mRNA polyadenylation (Hunt et al., 2008). It can interact physically with several other Arabidopsis CPSF and CstF subunits, and even with itself (Delaney et al., 2006; Xu et al., 2006; Hunt et al., 2008). On the other hand, if alternative splicing removes intron-2, a larger transcript is produced, which can be translated into a 68 kDa protein named AtC30Y. This protein, which contains all but the C-terminal 13 of the 250 amino acids of AtCPSF30 fused to an additional YT521-B domain, is less well understood. The YT521-B domain was first reported as a member of the family of mammalian splicing-associated proteins (Imai, 1998; Hartmann, 1999). YT521-B was identified in two-hybrid screens as well as in a co-immunoprecipitation assay as interacting with several splicing factors, and it contains a conserved YTH domain that may bind to RNA (Stoilov et al., 2002). However, whether the YT521-B domain of AtC30Y has similar regulatory potential is not yet clear.

The OXT6 gene provides a good model for studying the interaction between splicing and polyadenylation, because the poly(A) site of the short transcript is located within intron-2, which can also be alternatively spliced. This 970 bp intron contains the canonical conserved splicing signals “GT” and “AG” at the 5’ss and the 3’ss, respectively. While no highly conserved polyadenylation signals (e.g. AAUAAA) were found, we identified some potential poly(A) signals (particularly NUEs) in both intron-2 and the 3’ UTR of the OXT6 gene (see Results). Our working model is that mutations of the poly(A) signals and/or splice sites that are critical to pre-mRNA processing will significantly change the APA and/or alternative splicing pattern in intron-2. Thus, the hypothesis to be tested is that, through recognition of these splicing and polyadenylation signals, the splicing factors and polyadenylation factors interact with each other during the processing of intron-2. It is their competition and/or coordination that determines the inclusion or exclusion of intron-2, a regulatory event that leads to differential expression of the OXT6 gene. In this thesis research, the analysis of a set of splicing and polyadenylation mutations in intron-2 revealed a dynamic interplay among splicing, polyadenylation, and Pol II binding.

Chapter 2: Materials and methods

Isolation of the oxt6 mutant

The oxt6 mutant bearing a T-DNA (pROK2) insertion in the first exon of AT1G30460 was identified from a collection of Arabidopsis thaliana (ecotype Columbia) from the Arabidopsis Biological Resource Center (Ohio State University) as described in Zhang et al (2008). The oxt6 mutant seeds were provided by Dr. Deane Falcone, now at the University of Massachusetts at Lowell.

Detection of OXT6 polyadenylation signals

Rapid Amplification of cDNA 3’ Ends (3’RACE) was conducted as described previously (Frohman, 1988) with modifications. RT-PCR analysis of both OXT6 transcripts was performed using one reverse primer [oligo d(T) adaptor] and two forward primers for each transcript (all primers are listed in Table 1). The C30 F-2 primer is situated 79 nt downstream of the C30 F-1 primer in exon-2 of the OXT6 gene, while the C30Y F-2 primer is situated 31 nt downstream of the C30Y F-1 primer in exon-7 of the OXT6 gene.

Construction of OXT6 splicing and polyadenylation mutations

The full-length OXT6 gene (including 2.3 kb of native promoter) was cloned into the Gateway vector pENTR/D-TOPO (Invitrogen Inc.) to generate pENT-C30ASG. This was done previously in our lab (R Xu and QQ Li, unpublished data). Site-directed mutagenesis of the OXT6 gene was performed according to the manufacturer’s instructions (Stratagene, Inc.). With the QuikChange® II XL Site-Directed Mutagenesis Kit, the desired mutations are built into a set of oligonucleotides that are used to regenerate the vector DNA by PCR. Upon completion of the PCR, the newly synthesized vector plasmids contain the desired mutation. Primers used for mutagenesis are listed in Table 1.

Plant transformations

Seven OXT6 mutation constructs (Figure 2) were transferred into the binary vector pMDC32 (Brand et al., 2006). All of these binary vectors were introduced into Agrobacterium strain GV3101 (Koncz and Schell, 1986) and transformed into the oxt6 mutant background by the floral dip method (Clough and Bent, 1998). Arabidopsis plants were grown at 24°C under a 16 h light / 8 h dark cycle. Both first (T1) and second (T2) generation transgenic plants were selected on 1/1000 BASTA (ARBICO Organics, Inc.).

Quantitative RT-PCR analysis of mutant gene expression

Total RNA was extracted from the leaves of four-week-old plants, and cDNA was synthesized from 1 μg of total RNA using SuperScript-III Reverse Transcriptase (Invitrogen Inc.) and an oligo-d(T) adapter primer (Table 1). Quantitative RT-PCR was performed with at least three biological repeats on a Bio-Rad iQ cycler apparatus with the Quantitech SYBR green kit (Bio-Rad Inc.). At4g34270 (encoding a TIP41-like protein) was used as an internal reference gene (Czechowski et al., 2005). The relative abundance of the tested transcripts was determined by the comparative threshold cycle method (∆∆Ct) according to the manufacturer’s instructions (Bio-Rad Inc.). The transcript ratios between AtC30Y and AtCPSF30 in transgenic plants were normalized to that in the wild type (calibrator sample) by this formula:

$$\Delta\Delta C_t\,(\text{AtC30Y/AtCPSF30 ratio fold of change}) = \left[\Delta C_t\,\text{AtC30Y (transgenic sample)} - \Delta C_t\,\text{AtC30Y (WT sample)}\right] - \left[\Delta C_t\,\text{AtCPSF30 (transgenic sample)} - \Delta C_t\,\text{AtCPSF30 (WT sample)}\right]$$

The relative expression level between AtC30Y and AtCPSF30 is then

$$2^{\Delta\Delta C_t\,(\text{AtC30Y/AtCPSF30 ratio fold of change})}$$

The primers used are listed in Table 1. The forward primer AtCPSF30/AtC30Y-qPCR-F is situated 99 nt into OXT6 exon 1, upstream of the T-DNA insertion. The reverse primers AtCPSF30-qPCR-R and AtC30Y-qPCR-R are situated in OXT6 intron-2 and exon-7, respectively. The positions of the primers are shown in Figure 1.
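The ratio calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the ∆Ct values are hypothetical, and because the text does not state how each ∆Ct is oriented relative to the reference gene, the inputs are assumed to be oriented so that a positive ∆∆Ct corresponds to an increased AtC30Y/AtCPSF30 ratio, as the text requires.

```python
def delta_delta_ct(dct_c30y_tg, dct_c30y_wt, dct_cpsf30_tg, dct_cpsf30_wt):
    """ddCt of the AtC30Y/AtCPSF30 ratio, following the formula in the text.

    Each argument is a dCt value (target versus the internal reference gene).
    The orientation of dCt is an assumption; see the lead-in paragraph.
    """
    return (dct_c30y_tg - dct_c30y_wt) - (dct_cpsf30_tg - dct_cpsf30_wt)


def relative_ratio(ddct):
    """Relative AtC30Y/AtCPSF30 expression level, 2 raised to the ddCt."""
    return 2.0 ** ddct


# Hypothetical dCt values, chosen only to make the arithmetic easy to follow.
ddct = delta_delta_ct(
    dct_c30y_tg=2.0, dct_c30y_wt=4.0,      # AtC30Y: transgenic, wild type
    dct_cpsf30_tg=5.0, dct_cpsf30_wt=4.0,  # AtCPSF30: transgenic, wild type
)
ratio = relative_ratio(ddct)
print(ddct, ratio)  # (2 - 4) - (5 - 4) = -3, and 2 ** -3 = 0.125
```

A negative ∆∆Ct, as in this made-up example, means the transgenic line has a lower AtC30Y/AtCPSF30 ratio than the wild-type calibrator.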

Chromatin Immunoprecipitation (ChIP) assays

ChIP was done as described previously with modifications (He et al., 2003). Seeds of splicing mutation transgenic lines were grown at 24°C under a 16 h light / 8 h dark cycle. Leaves were harvested four weeks after germination, fixed in 1% formaldehyde for 10 min in a vacuum chamber, and neutralized with 0.1 M glycine for 5 min. Approximately 0.6 g of sample was ground into a fine powder in liquid nitrogen and resuspended in lysis buffer containing protease inhibitors. The isolated chromatin was sonicated six times for 10 s each. The antibody for immunoprecipitation was purchased from Covance: RNA polymerase II H5 Monoclonal Antibody (MMS-129R). Immunoprecipitated DNA was purified with phenol/chloroform/isoamyl alcohol, precipitated with ethanol, and analyzed by quantitative PCR (Bio-Rad iCycler) with the primers listed in Table 1. Forward primers OXT6 5'ss-ChIP-F and OXT6 3'ss-ChIP-F are situated at the end of exon-2 and the end of intron-2, respectively, while reverse primers OXT6 5'ss-ChIP-R and OXT6 3'ss-ChIP-R are situated at the beginning of intron-2 and the beginning of exon-3, respectively. A ~60 bp region at the beginning of exon-1 of the OXT6 gene was used as an internal reference. The ChIP signal was calculated by the following formula: $2^{\Delta C_t\,\text{ChIP} - \Delta C_t\,\text{internal reference}}$. The ChIP signals were normalized to the input sample and expressed as % input.
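As a rough illustration of the ChIP signal calculation, the sketch below computes 2 raised to (∆Ct ChIP minus ∆Ct internal reference) and then expresses an immunoprecipitated signal as a percentage of an input signal. The text does not spell out how the input sample enters the calculation, so the assumption here is that the same formula is applied to the input and the two signals are divided; the function names and numbers are hypothetical.

```python
def chip_signal(dct_chip, dct_reference):
    """ChIP signal as given in the text: 2 ** (dCt ChIP - dCt internal reference)."""
    return 2.0 ** (dct_chip - dct_reference)


def percent_of_input(ip_signal, input_signal):
    """Express an IP signal as % input (assumed normalization, see lead-in)."""
    if input_signal <= 0:
        raise ValueError("input signal must be positive")
    return 100.0 * ip_signal / input_signal


# Hypothetical values for one amplicon in one line.
ip = chip_signal(dct_chip=1.0, dct_reference=3.0)   # 2 ** -2 = 0.25
pct = percent_of_input(ip, input_signal=5.0)        # 100 * 0.25 / 5 = 5.0
print(ip, pct)
```

Comparing such % input values between a mutant line and the wild type at the same amplicon is what indicates whether Pol II occupancy went up or down.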

Chapter 3: Results

Polyadenylation and splicing profiles of the OXT6 gene

Intron-2 of OXT6 contains typical splicing signals (the 5’ss begins with GU and the 3’ss ends with AG). To detect the polyadenylation signals in both intron-2 and the 3’ UTR of OXT6, Rapid Amplification of cDNA 3’ Ends (3’RACE) was applied to both the AtCPSF30 and AtC30Y transcripts. Two closely spaced polyadenylation regions (named pA1 and pA2) were detected in intron-2, and one more (named pA3) was detected in the 3’ UTR (Figure 2). The pA1 region is located about 60 nt downstream of the AtCPSF30 stop codon (TAA) and contains one poly(A) site and a presumed NUE signal (R. Xu and QQ Li, unpublished data), while the pA2 region is located about 130 nt downstream of the AtCPSF30 stop codon and contains four poly(A) sites and presumed NUE signals. The pA3 region is located about 120 nt downstream of the AtC30Y stop codon (TGA) and contains six poly(A) sites and presumed NUE signals. These results indicate that intron-2 of OXT6 contains typical splicing signals and potential polyadenylation signals. Both the splice sites and the polyadenylation signals were therefore mutated in order to study the interaction of splicing and polyadenylation in the production of both OXT6 transcripts.

Mutagenesis of the splice sites and poly(A) signals

The full-length OXT6 gene in vector pENT-C30ASG was used for making mutations. A set of mutations was introduced at the splice sites and/or polyadenylation signals by site-directed mutagenesis. The splice sites and the pA1 NUE signal were mutated to contain restriction enzyme recognition sites. The 3’ss was mutated to an EcoRV site, while the pA1 NUE signal was mutated to a HaeII site. In addition, a 36 bp pA2 region and a 78 bp pA3 region were deleted. Some constructs contain double mutations at both the splice sites and the polyadenylation region. A total of seven OXT6 mutation constructs were made (Figure 3). These constructs were confirmed by PCR followed by restriction enzyme digestion and DNA sequencing. These mutation constructs were introduced into the oxt6 mutant background to generate transgenic lines (Table 2).
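The idea of replacing a processing signal with a restriction site, and of carrying that change in a complementary primer pair, can be sketched as follows. Everything here is illustrative: the wild-type sequence, the position, and the flank length are hypothetical, and the actual mutagenic primers are those listed in Table 1. Only the EcoRV recognition sequence (GATATC) is a known fact.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
ECORV_SITE = "GATATC"  # EcoRV recognition sequence


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def substitute_site(seq, start, site):
    """Overwrite len(site) bases of seq, beginning at start, with site."""
    seq = seq.upper()
    if start < 0 or start + len(site) > len(seq):
        raise ValueError("site does not fit inside the sequence")
    return seq[:start] + site.upper() + seq[start + len(site):]


def mutagenic_primer_pair(mutant_seq, start, site_len, flank):
    """Return a complementary primer pair centred on the mutated site."""
    lo = max(0, start - flank)
    hi = min(len(mutant_seq), start + site_len + flank)
    forward = mutant_seq[lo:hi]
    return forward, reverse_complement(forward)


# Hypothetical stretch spanning an intron/exon boundary (not the OXT6 sequence).
wild_type = "TTTCTTGCAGGTTCAAGC"
mutant = substitute_site(wild_type, start=6, site=ECORV_SITE)
fwd, rev = mutagenic_primer_pair(mutant, start=6, site_len=len(ECORV_SITE), flank=4)
print(mutant)
print(fwd, rev)
```

Because the introduced site is a palindrome, it appears in both primers, which is also why the mutation can later be confirmed by restriction digestion of a PCR product. Real mutagenic primers are considerably longer than this toy flank.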

Polyadenylation and splicing patterns of OXT6 were altered by mutation

To investigate the effects of the mutations on polyadenylation and splicing patterns, the polyadenylation and splicing profiles of the first generation of transgenic plants (T1) were analyzed. 3' RACE was applied to oxt6::OXT6-pA1+pA2 transgenic plants (name annotation: OXT6 gene with pA1 and pA2 mutations transformed into the oxt6 background), and two previously cryptic polyadenylation sites were detected between the stop codon and the 36 bp pA2 deletion region (Figure 4A). This result indicated that the polyadenylation site in intron-2 of the OXT6 gene can be shifted by mutating the polyadenylation signals, but that the shorter transcript of the OXT6 gene (coding for AtCPSF30) can still be generated. Quantitative RT-PCR was therefore needed to measure the amounts of the two OXT6 transcripts.

To determine the splicing profile in mutant plants, cDNAs from T1 plants of the splice site mutants oxt6::OXT6-5’ss, oxt6::OXT6-3’ss, oxt6::OXT6-pA1+pA2+5’ss and oxt6::OXT6-pA1+pA2+3’ss were cloned and sequenced. One cryptic 5’ss and one cryptic 3’ss of intron-2 were detected (Figure 4B). The cryptic splice sites were located near the original splice sites and shared the same four-nucleotide sequence with the original ones. Across all 14 samples sequenced, the original 5’ss (or 3’ss) was joined either to the cryptic 3’ss (or 5’ss) or to the original 3’ss (or 5’ss). These results indicated that the splice sites of intron-2 can be shifted by mutating the splicing signals, but that intron-2 can still be removed from the OXT6 transcript, albeit with altered exon sequences. The efficiency of cryptic splice site usage was analyzed by quantitative RT-PCR.
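The observation that cryptic sites lie close to the original sites and share a four-nucleotide sequence with them suggests a simple way to list candidate cryptic sites in a sequence. The sketch below is a hypothetical illustration only: the sequence, motif, position, and window size are made up, since the actual four-nucleotide sequence is shown in Figure 4B rather than in the text.

```python
def find_cryptic_candidates(seq, motif, original_pos, window):
    """List start positions of motif within window nt of the original site.

    The original site itself (at original_pos) is excluded from the result.
    """
    seq, motif = seq.upper(), motif.upper()
    start = max(0, original_pos - window)
    end = min(len(seq), original_pos + window + len(motif))
    hits = []
    i = seq.find(motif, start)
    while i != -1 and i + len(motif) <= end:
        if i != original_pos:
            hits.append(i)
        i = seq.find(motif, i + 1)
    return hits


# Hypothetical sequence with the same 4-nt motif at positions 3 and 12.
example = "CCAGGTAACTTTGGTAATCC"
near = find_cryptic_candidates(example, "GGTA", original_pos=3, window=10)
far = find_cryptic_candidates(example, "GGTA", original_pos=3, window=5)
print(near, far)  # [12] []
```

A scan like this only proposes candidates; whether a candidate is actually used has to come from sequencing cDNA clones, as was done here.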

Competitive relationship between splicing and polyadenylation in intron-2 of OXT6

Since mutations of the polyadenylation and splicing signals could change the polyadenylation and splicing patterns of OXT6 gene expression, the relationship between polyadenylation and splicing in intron-2 could be inferred by measuring the amounts of the two transcripts, AtCPSF30 and AtC30Y. In T1 transgenic plants, multiple T-DNA insertions may introduce more than one copy of the OXT6 mutation construct, which may skew the absolute expression level between different transgenic lines. To avoid this problem, the ratio between AtC30Y and AtCPSF30 transcripts (AtC30Y/AtCPSF30) of each transgenic line was analyzed (Figure 5). If the relationship between polyadenylation and splicing is competitive, mutation of the polyadenylation signals should promote splicing and thus the exclusion of intron-2, which will lead to an increase in AtC30Y transcripts and therefore increase the ratio of AtC30Y/AtCPSF30. Similarly, mutations of the splice sites should stimulate polyadenylation within intron-2, generating more AtCPSF30 transcripts and decreasing the ratio of AtC30Y/AtCPSF30. If the relationship between polyadenylation and splicing is cooperative, the results should be the opposite.
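The two alternative predictions in this paragraph can be written out as a small lookup, which makes the logic of the test explicit. The function name and labels are hypothetical; the mapping itself is taken directly from the reasoning above.

```python
def expected_ratio_change(mutated_signal, model):
    """Predicted direction of change in the AtC30Y/AtCPSF30 ratio.

    mutated_signal: "polyA" (intronic poly(A) signals) or "splice" (splice sites)
    model: "competition" or "cooperation"
    """
    competition = {"polyA": "increase", "splice": "decrease"}
    if mutated_signal not in competition:
        raise ValueError("mutated_signal must be 'polyA' or 'splice'")
    if model not in ("competition", "cooperation"):
        raise ValueError("model must be 'competition' or 'cooperation'")
    direction = competition[mutated_signal]
    if model == "cooperation":
        # Under cooperation the predictions are reversed.
        direction = "decrease" if direction == "increase" else "increase"
    return direction


for signal in ("polyA", "splice"):
    for model in ("competition", "cooperation"):
        print(signal, model, expected_ratio_change(signal, model))
```

Comparing the measured direction of change for each construct with these predictions is what distinguishes the two models.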

To distinguish these possibilities, quantitative RT-PCR was performed to measure the levels of the transcripts. This was done for both the T1 and T2 generations of the transgenic plants, and log2 (AtC30Y/AtCPSF30 expression level) [∆∆Ct AtC30Y/AtCPSF30 ratio fold of change] is shown in Figure 5. The ∆∆Ct AtC30Y/AtCPSF30 ratio fold of change in the seven mutation constructs was normalized to that in the wild type plant (which expresses the wild-type OXT6 gene). If the ratio between AtC30Y and AtCPSF30 transcripts is increased, the ∆∆Ct AtC30Y/AtCPSF30 ratio fold of change should be positive, and vice versa. As expected, mutations of the polyadenylation signals in intron-2 (pA1+pA2) led to an increased ratio of AtC30Y/AtCPSF30 (~$2^{3}$), indicating the promotion of splicing of intron-2 in the OXT6 gene. Mutation of either splice site (5’ss or 3’ss) resulted in a similarly decreased ratio of AtC30Y/AtCPSF30 (~$2^{-3}$), while mutation of both splice sites (5’ss+3’ss) led to a much larger decrease in the ratio of AtC30Y/AtCPSF30 (~$2^{-6}$ in the T1 generation and $2^{-13}$ in the T2 generation). In the splicing and polyadenylation double mutation constructs (pA1+pA2+5’ss and pA1+pA2+3’ss), the ratio of AtC30Y/AtCPSF30 was also decreased. However, such decreases were comparable to, or slightly smaller than, those in the single splicing mutations (compare pA1+pA2+5’ss to 5’ss, or pA1+pA2+3’ss to 3’ss in Figure 5). Comparing the double splicing mutant (5’ss+3’ss) with the splicing/polyadenylation double mutants, the double splicing mutant caused a greater reduction in AtC30Y (more negative), suggesting that mutations of the splice sites had a more severe effect than mutations of the polyadenylation signals. Finally, mutations in the 3’ UTR (pA3) inhibited the polyadenylation of the AtC30Y transcripts and led to decreased production of AtC30Y. Results in the T2 generation were consistent with those in the T1 generation. Our data suggest a competitive relationship between splicing and polyadenylation in intron-2 in the differential expression of the OXT6 gene, and indicate that splicing of intron-2 has more influence than polyadenylation.
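Because Figure 5 is plotted on a log2 scale, it can help to convert the approximate log2 values reported in this paragraph (about 3, -3, -6, and -13) back into linear fold changes. The short sketch below does only that arithmetic; the dictionary labels are informal shorthand for the constructs, not names used elsewhere in the thesis.

```python
from fractions import Fraction


def fold_from_log2(log2_ratio):
    """Convert an integer log2 ratio into an exact linear fold change."""
    return Fraction(2) ** log2_ratio


# Approximate log2(AtC30Y/AtCPSF30) values quoted in the text.
reported = {
    "pA1+pA2": 3,
    "5'ss or 3'ss": -3,
    "5'ss+3'ss (T1)": -6,
    "5'ss+3'ss (T2)": -13,
}

for construct, log2_ratio in reported.items():
    print(construct, fold_from_log2(log2_ratio))
# pA1+pA2 8
# 5'ss or 3'ss 1/8
# 5'ss+3'ss (T1) 1/64
# 5'ss+3'ss (T2) 1/8192
```

On a linear scale, then, the poly(A) signal mutations raise the ratio roughly 8-fold, a single splice site mutation lowers it roughly 8-fold, and mutating both splice sites lowers it by about 64-fold or more.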

Mutations on splice sites change the binding affinity of Pol II

Previous studies showed that transcription and pre-mRNA splicing may be linked via Pol II binding (Proudfoot et al., 2002; Bentley, 2002). To test whether Pol II binding to the chromatin at the mutated OXT6 gene was affected by the mutation of splicing signals, Chromatin Immunoprecipitation (ChIP) analysis was performed using an antibody against Pol II in two transgenic lines, oxt6::OXT6-3’ss and oxt6::OXT6-5’ss+3’ss. Immunoprecipitated DNA was purified, and ~60 bp regions around the 3’ss and 5’ss of intron-2 were analyzed by quantitative PCR (Figure 6). The ChIP signal was normalized to the input of each sample and is presented as % input in the figure. A stronger ChIP signal represents a higher level of Pol II binding affinity to the DNA, which suggests potential Pol II pausing at the site. Interestingly, Pol II affinity at both the 3’ss and 5’ss of intron-2 of OXT6 was decreased compared to that of the wild type. This result indicates that mutations of the splice sites of intron-2 could decrease the Pol II binding affinity, suggesting a higher Pol II elongation efficiency. The latter may decrease the usage of the mutated splice sites, which may allow the switch from splicing of intron-2 to polyadenylation within intron-2 to occur more frequently.

Chapter 4: Discussion and Concluding Remarks

The competitive nature of polyadenylation and splicing during processing of the OXT6 gene

Alternative polyadenylation and alternative splicing exist widely in eukaryotic cells. Growing evidence indicates that there must be a dynamic interplay between splicing and polyadenylation, which leads to widespread intronic polyadenylation events and contributes to the complexity of the transcriptome in the cell (Proudfoot, 2004; Tian, 2007). Here we found that in the OXT6 gene, mutations of the intronic polyadenylation signals promoted the splicing of the intron, while mutations of the splice sites stimulated intronic polyadenylation in the same intron. This result suggests that polyadenylation and splicing compete during OXT6 gene processing, and that APA and alternative splicing play an important role in the regulation of OXT6 expression.

While the splicing mechanism in plants is not well understood, some basic information about plant splicing signals has been revealed. The 5’ss and 3’ss in Arabidopsis are very similar to those of humans (canonical GT-AG splice sites); however, the branch site is not very clearly defined (Alexandrov et al., 2004; Reddy et al., 2007). As a result, we only introduced mutations at the 5’ss and 3’ss of intron-2 in OXT6. Our data suggested that mutation of one splice site (5’ss or 3’ss) was adequate to alter the splicing pattern, and therefore to change the expression ratio of the two transcripts of the OXT6 gene.

Recent studies reported that in human genes, introns that contain alternative poly(A) sites and composite exons tend to have a weak 5’ss and a large size (Tian, 2007). These results suggested that the usage of an intronic poly(A) site may be promoted by a weak 5’ss and a large intron size. A large intron would require a longer time to transcribe and splice, providing a time window for polyadenylation within the intron, while a weak 5’ss decreases the competition between alternative splicing and polyadenylation, indicating that polyadenylation within the intron may take place before transcription of the intron is complete. However, our data showed no significant difference between the two mutation constructs at the 5’ss and 3’ss. Mutation of either splice site decreased the ratio between AtC30Y and AtCPSF30 transcripts to ~$2^{-3}$ of the wild-type level, while mutation of both splice sites almost abolished the generation of the AtC30Y transcript. These results indicate that both splice sites of intron-2 contribute equally to OXT6 gene processing, and that a 3’ss mutation can shift the balance toward polyadenylation as well as a 5’ss mutation does, suggesting that polyadenylation may take place after the completion of transcription of intron-2. Indeed, though intron-2 (970 bp) is longer than the average intron in Arabidopsis (433 bp; Reddy, 2007), it is still much shorter than the average intron in humans (3000 bp; Sakharkar et al., 2006); thus polyadenylation within intron-2 is more likely to happen after the transcription of intron-2.

The expression profile of the OXT6 gene may be determined by the balance of several features, including the strength of the polyadenylation and splicing signals, the size of intron-2, and the presence of potential auxiliary cis-elements for splicing (Ladd and Cooper, 2002). Intron-2 contains the typical splicing signal GT-AG. Although no highly conserved polyadenylation signals (e.g. AAUAAA) were found, some potential poly(A) signals (particularly NUE) that were shown to be authentic NUEs in other genes (Li and Hunt, 1997) were detected. Analysis of the double mutation constructs pA1+pA2+5’ss and pA1+pA2+3’ss showed a decreased ratio of AtC30Y/AtCPSF30 transcripts, indicating that the splice sites may have a more severe effect than the polyadenylation signals on OXT6 gene processing. In addition, it has been reported that a number of cis-elements situated in both exons and introns may act as enhancers or silencers in the regulation of alternative mRNA processing (Ladd and Cooper, 2002; Tian et al., 2005). Since we only mutated the major splicing and polyadenylation signals, the ratio between different OXT6 transcripts may also result from the dynamic antagonism between trans-acting factors binding to the potential cis-elements.

For the OXT6 gene, the interplay of APA and alternative splicing in intron-2 may reflect the self-regulatory expression of this gene. APA of this Arabidopsis gene (as well as its counterparts in other plants; AG Hunt, unpublished data) leads to a smaller transcript encoding AtCPSF30, while alternative splicing generates a longer transcript encoding AtC30Y, which contains all but the C-terminal 13 amino acids of AtCPSF30 fused to an additional YT521-B domain. AtCPSF30, like its other eukaryotic counterparts, is an RNA-binding protein and interacts with many proteins in the plant mRNA polyadenylation complex, such as AtCPSF160, AtCPSF100 and other CPSF and CstF subunits. AtCPSF30 may serve as a hub around which other subunits assemble into a large complex. The various interactions may reflect a progression through the steps of polyadenylation or the different complexes involved in APA, where the plant polyadenylation complex acts as a dynamic system that changes its composition (Hunt et al., 2008). Thus AtCPSF30 generated from APA of the OXT6 gene can promote the assembly of the APA complex, which may in turn stimulate the APA process of the OXT6 gene and increase the expression level of AtCPSF30. Though the function of AtC30Y is less well understood, the YT521-B domain has been reported to associate with splicing (Imai, 1998; Hartmann, 1999). Therefore, AtC30Y generated from alternative splicing of the OXT6 gene may potentially be involved in the splicing mechanism and affect the expression level of AtC30Y itself. This predicted self-regulatory mechanism of the OXT6 gene indicates a distinct role of AtCPSF30 in the plant mRNA polyadenylation complex, and may suggest different mechanisms of plant mRNA 3’ processing.

The regulatory role of Pol II in OXT6 gene processing

In eukaryotic cells, both splicing and polyadenylation factors may interact extensively with the CTD of Pol II, playing critical roles at different stages of transcription, such as initiation, elongation and termination (Hirose and Manley, 2000; Proudfoot et al., 2002). Studies also suggest that the selection of alternative splice sites may be determined by the elongation efficiency of Pol II (Kadener et al., 2000; Kornblihtt and Mata, 2004). More processive transcription elongation can promote the skipping of an alternatively spliced exon, since such an exon usually contains a weak 3’ss that needs more time to be recognized; hence a lower transcription elongation rate would help its splicing (Kornblihtt and Mata, 2004). Recent studies of the immunoglobulin heavy chain gene linked Pol II elongation to polyadenylation factors (Martincic et al., 2009). The Pol II transcription elongation factor ELL2 can promote the binding of the polyadenylation factor CstF-64 to the phosphorylated Pol II CTD, increasing the local concentration of polyadenylation factors and in turn promoting the recognition of the APA site. This is consistent with the finding that more processive Pol II elongation inhibits the recognition of a weak splice site. In human cells, the binding level of Pol II is higher in exons than in introns at all levels of expression (Schwartz, 2009), suggesting that Pol II elongation is slower in exons than in introns.

Our data suggest that, with mutation of the 3’ss and 5’ss of intron-2, Pol II affinity decreased in the region around the 3’ss and 5’ss in the splice site mutation transgenic lines compared with wild type plants, indicating an increased Pol II elongation efficiency around the mutated splice sites. Under more processive transcription elongation, the usage of mutant splice sites (which can be treated as weak splice sites) should be decreased, which will lead to less competition between splicing and polyadenylation and thus decrease the ratio between AtC30Y and AtCPSF30 transcripts. With more AtCPSF30 transcripts, the increased amount of AtCPSF30 may promote the recruitment of other polyadenylation factors (such as CstF-64) to Pol II, further increasing the usage of the poly(A) site within intron-2 of the OXT6 gene. This is consistent with the prediction that the expression of the OXT6 gene can be self-regulated.

An interesting question is how mutations of a splice site can change the Pol II elongation rate. Recent studies revealed that the recognition of exons and introns may be linked to chromatin structure. Schwartz et al. (2009) reported that nucleosome occupancy is higher along exons than along introns, and that the marking of exons by nucleosomes might have a role in defining the exon-intron architecture of a gene. They suggested that nucleosome positioning at the DNA level could affect exon recognition at the RNA level by two mechanisms. First, the nucleosomes may function as “speed bumps” that decrease the Pol II elongation rate, which will allow more time for the splice site at the exon-intron junction to be processed. Second, preferential positioning of nucleosomes along exons may mark the exons with modified histones that interact with the splicing machinery to enhance the recognition of exons. The Pol II-mediated cross-talk between chromatin structure and exon-intron architecture provides new insight into the interaction between co-transcriptional polyadenylation and splicing. It would be interesting to determine whether mutation of a splice site changes the chromatin structure around the OXT6 gene.

Future Perspective

Regulation of alternative mRNA processing is increasingly considered an important step in gene regulation. Here we report a competitive interaction between alternative polyadenylation and alternative splicing in OXT6 gene expression; however, much remains to be learned regarding the functional relevance of these two processes. The function of AtC30Y is not known, and the possible roles of the YT521-B domain in plants are not clear. To address this problem, mutant backgrounds that express only AtC30Y or only AtCPSF30 need to be generated. In addition, the mutation constructs used in this study can be transformed into the new mutant backgrounds to determine whether the AtC30Y and AtCPSF30 products can affect the expression level of the OXT6 gene, which may reveal the self-regulatory mechanism of OXT6 gene expression.

Regulation of alternative mRNA processing can also be involved in physiological (Delaney et al., 2006) or pathological (Mayr and Bartel, 2009; Ji, 2009; Singh, 2009) processes, and may act in a global or tissue-specific manner. This reflects an extensive interaction between trans-acting factors and cis-elements. The expression level of the trans-acting factors and the recognition efficiency of the cis-elements in different tissues at different developmental stages may make the study more complicated. In order to understand the regulatory role of intron-2 in OXT6 gene processing, a mini-gene construct that contains only intron-2 with a marker gene should be helpful. Transformation of the mini-gene into WT and oxt6 mutant backgrounds (as well as newly generated mutant backgrounds) may provide direct evidence that APA and alternative splicing compete with each other during intron-2 processing.

References

Alexandrov, N. N, Troukhan, M. E, Brover, V. V, Tatarinova, T, Flavell, R. B, Feldmann, K. A., 2006, Features of Arabidopsis genes and genome discovered using full-length cDNAs, Plant Mol Biol 60, 69–85.

Awasthi, S. and Alwine, J. C., 2003, Association of polyadenylation cleavage factor I with U1 snRNP, RNA 9, 1400–1409.

Barilla, D., Lee, B. A. and Proudfoot, N.J., 2001, Cleavage/polyadenylation factor IA associates with the carboxyl-terminal domain of RNA polymerase II in Saccharomyces cerevisiae. Proc Natl Acad Sci USA 98, 445–450.

Benoit, P., Papin, C., Kwak, J. E., Wickens, M. and Simonelig, M., 2008, PAP- and GLD-2-type poly(A) polymerases are required sequentially in cytoplasmic polyadenylation and oogenesis in Drosophila, Development 135, 1969-1979.

Bentley, D., 2002, The mRNA assembly line: Transcription and processing machines in the same factory, Curr Opin Cell Biol 14, 336–342.

Berget, S.M., 1995, Exon recognition in vertebrate splicing, J Biol Chem 270, 2411–2414.

Black, D. L., 2003, Mechanisms of alternative premessenger RNA splicing, Annu Rev Biochem 72, 291–336.

Brand, L., Hoerler, M., Nuesch, E., Vassalli, S., Barrell, P., Yang, W., Jefferson, R. A., Grossniklaus, U. and Curtis, M. D., 2006, A versatile and reliable two-component system for tissue-specific gene induction in Arabidopsis, Plant Physiol 141, 1194-1204.

Brodsky, A. S. and Silver, P. A., 2000, Pre-mRNA processing factors are required for nuclear export, RNA 6, 1737-1749.

Brodsky, A. S., Meyer, C. A., Swinburne, I. A., Hall, G., Keenan, B. J., Liu, X. S., Fox, E. A. and Silver, P. A., 2005, Genomic mapping of RNA polymerase II reveals sites of co-transcriptional regulation in human cells, Genome Biol 6, R64.

Brow, D. A., 2002, Allosteric cascade of spliceosome activation, Annu. Rev. Genet 36, 333–360.

Bruce, S. R., Dingle, R. W. and Peterson, M. L., 2003, B-cell and plasma-cell splicing differences: A potential role in regulated immunoglobulin RNA processing, RNA 9, 1264–1273.

Calvo, O. and Manley, J. L., 2003, Strange bedfellows: polyadenylation factors at the promoter, Genes Dev 17, 1321-1327.

Campbell, M.A., Haas, B.J., Hamilton, J.P., Mount, S.M. and Buell, C.R., 2006, Comprehensive analysis of alternative splicing in rice and comparative analyses with Arabidopsis, BMC Genomics 7, 327.

Clark, T. A., Schweitzer, A. C., Chen, T. X., Staples, M. K., Lu, G., Wang, H., Williams, A. and Blume, J. E., 2007, Discovery of tissue-specific exons using comprehensive human exon microarrays, Genome Biol 8, R64.

Coller, J.M., Gray, N.K., and Wickens, M.P., 1998, mRNA stabilization by poly(A) binding protein is independent of poly(A) and requires translation, Genes Dev 12, 3226-3235.

Czechowski, T., Stitt, M. and Altmann, T., 2005, Genome-Wide Identification and Testing of Superior Reference Genes for Transcript Normalization in Arabidopsis. Plant Physiol 139, 5-17.

Dahmus, M. E., 1996, Reversible phosphorylation of the C-terminal domain of RNA polymerase II, J. Biol. Chem 271, 19009–19012.

Dantonel, J. C., Murthy, K. G., Manley, J. L. and Tora, L., 1997, Transcription factor TFIID recruits factor CPSF for formation of 3’ end of mRNA. Nature 389, 399–402.

Delaney, K., Xu, R., Li, Q. Q., Yun, K. Y., Falcone, D. L. and Hunt, A. G., 2006, Calmodulin interacts with and regulates the RNA-binding activity of an Arabidopsis polyadenylation factor subunit, Plant Physiol 140, 1507-1521.

Dominski, Z., Yang, X. C. and Marzluff, W. F. 2005, The polyadenylation factor CPSF-73 is involved in histone-pre-mRNA processing, Cell 123, 37-48.

Farh, K. K., Grimson, A., Jan, C., Lewis, B.P., Johnston, W.K., Lim, L.P., Burge, C.B., and Bartel, D.P., 2005, The widespread impact of mammalian MicroRNAs on mRNA repression and evolution, Science 310, 1817–1821.

Forbes, K.P., Addepalli, B. and Hunt, A.G., 2006, An Arabidopsis Fip1 homolog interacts with RNA and provides conceptual links with a number of other polyadenylation factor subunits, J Biol Chem 281, 176-186.

Franks, T. M. and Lykke-Andersen, J., 2008, The control of mRNA decapping and P-body formation, Mol Cell 32, 605–615.

Frohman, M. A., 1993, Rapid amplification of complementary DNA ends for generation of full-length complementary DNAs: Thermal RACE, Methods Enzymol 218, 340-356.

Gilmartin, G.M., 2005, Eukaryotic mRNA 3’ processing: a common means to different ends, Genes & Development 19, 2517-2521.

Stratagene, La Jolla, CA., 2005, QuikChange® II XL Site-Directed Mutagenesis Kit Instruction Manual.

Gunderson, S. I., Polycarpou-Schwarz, M. and Mattaj, I.W., 1998, U1 snRNP inhibits pre-mRNA polyadenylation through a direct interaction between U1 70K and poly(A) polymerase, Mol Cell 1, 255–264.

Hartmann, A.M., 1999, The interaction and colocalization of Sam68 with the splicing-associated factor YT521-B in nuclear dots is regulated by the Src family kinase p59 (fyn), Mol Biol Cell 10, 3909–3926.

He, Y., Michaels, D. S. and Amasino, M.R., 2003, Regulation of flowering time by histone acetylation in Arabidopsis, Science 302, 1751-1754.

Hirose, Y. and Manley, J. L., 1998, RNA polymerase II is an essential mRNA polyadenylation factor. Nature 395, 93–96.

Hirose, Y. and Manley, J. L., 2000, RNA polymerase II and the integration of nuclear events, Genes & Dev 14, 1415–1429.

Huang, Y., and Carmichael, G. C., 1996, Role of polyadenylation in nucleocytoplasmic transport of mRNA, Mol Cell Biol 16, 1534-1542.

Hunt, A. G., 1994, Messenger RNA 3' end formation in plants, Annu Rev Plant Physiol Plant Mol Biol 45, 47-60.

Hunt, A. G., Xu, R., (10 others) and Li, Q. Q., 2008, Arabidopsis mRNA polyadenylation machinery: comprehensive analysis of protein-protein interactions and gene expression profiling, BMC Genomics 9, 220

Imai, Y., 1998, Cloning of a gene, YT521, for a novel RNA splicing-related protein induced by hypoxia/reoxygenation, Brain Res Mol Brain Res 53, 33–40

Ji, Z., Lee, J.Y., Pan, Z., Jiang, B. and Tian, B., 2009, Progressive lengthening of 3’ untranslated regions of mRNAs by alternative polyadenylation during mouse embryonic development. Proc. Natl Acad. Sci. USA 106, 7028–7033.

Jurica, M. S. and Moore, M. J., 2003, Pre-mRNA splicing: awash in a sea of proteins, Mol. Cell 12, 5–14.

Kadener, S., Cramer, P., Nogués, G., Cazalla, D., Mata, M., Fededa, J. B., Werbajh, S. E., Srebrow, A. and Kornblihtt, A. R., 2000, Antagonistic effects of T-Ag and VP16 reveal a role for RNA pol II elongation on alternative splicing. EMBO J 20, 5759–5768.

Kaneko, S. and Manley, J.L., 2005, The mammalian RNA polymerase II C-terminal domain interacts with RNA to suppress transcription-coupled 3’ end formation, Mol Cell 20, 91–103.

Kessler, M. M., Henry, M. F., Shen, E., Zhao, J., Gross, S., Silver, P. A. and Moore, C.L., 1997, Hrp1, a sequence-specific RNA-binding protein that shuttles between the nucleus and the cytoplasm, is required for mRNA 3'-end formation in yeast, Genes Dev 11, 2545-2556.

Kim, E., Magen, A., and Ast, G. , 2007, Different levels of alternative splicing among eukaryotes, Nucleic Acids Res 35, 125–131.

Koncz, C. and Schell, J., 1986, The promoter of the TL-DNA gene 5 controls the tissue-specific expression of chimaeric genes carried by a novel type of Agrobacterium binary vector, Mol Gen Genet 204, 383-396.

Kornblihtt A. R., Mata, M., Fededa, J. P., Munoz, M. J., and Nogues, G., 2004, Multiple links between transcription and splicing, RNA 10(10), 1489-1498.

Kuhn, U. and Wahle, E., 2004, Structure and function of poly(A) binding proteins, Biochim Biophys Acta 1678, 67-84.

Kyburz, A., 2006, Direct Interactions between Subunits of CPSF and the U2 snRNP Contribute to the Coupling of Pre-mRNA 3' End Processing and Splicing, Molecular Cell 23, 195-205

Ladd, A. N. and Cooper, T. A., 2002, Finding signals that regulate alternative splicing in the post-genomic era, Genome Biol 3, reviews0008.

Lambermon, M.H.L., 2000, UBP1, a novel hnRNP-like protein that functions at multiple steps of higher plant nuclear pre-mRNA maturation, EMBO J 19, 1638–1649

Lander, E. S., Linton, L. M., Birren, B., Nusbaum, C., Zody, M. C. et al., 2001, Initial sequencing and analysis of the human genome, Nature 409, 860–921

Li, Q. and Hunt, A. G., 1997, The polyadenylation of RNA in plants, Plant Physiol 115(2), 321-325.

Loke, J.C., Stahlberg, E.A., Strenski, D.G., Haas, B.J., Wood, P.C. and Li, Q.Q., 2005, Compilation of mRNA polyadenylation signals in Arabidopsis revealed a new signal element and potential secondary structures, Plant Physiol 138, 1457-1468.

Lorkovic, Z.J., 2000, Pre-mRNA splicing in higher plants, Trends Plant Sci. 5, 160–167

Luo, M. J. and Reed, R., 1999, Splicing is required for rapid and efficient mRNA export in metazoans, Proc Natl Acad Sci USA 96, 14937–14942.

Lutz, C. S., 2008, Alternative polyadenylation: A twist on mRNA 3’ end formation, ACS Chem Biol 3 (10), 609–617

Martincic, K., Alkan, S. A., Cheatle, A., Borghesi, L. and Milcarek, C., 2009, Transcription elongation factor ELL2 directs immunoglobulin secretion in plasma cells by stimulating altered RNA processing, Nat Immunol 10, 1102–1109.

Mayr, C. and Bartel, D., 2009, Widespread Shortening of 3' UTRs by Alternative Cleavage and Polyadenylation Activates Oncogenes in Cancer Cells. Cell 138 (4), 673-684.

McCracken, S., Fong, N., Yankulov, K., Ballantyne, S., Pan, G., Greenblatt, J., Patterson, S. D., Wickens, M. and Bentley, D. L., 1997, The C-terminal domain of RNA polymerase II couples mRNA processing to transcription. Nature 385, 357–361.

McCracken, S., Lambermon, M. and Blencowe, B. J., 2002, SRm160 splicing coactivator promotes transcript 3'-end cleavage, Mol Cell Biol 22, 148–160.

Millevoi, S., Loulergue, C., Dettwiler, S., Karaa, S. Z., Keller, W., et al., 2006, An interaction between U2AF 65 and CFIm links the splicing and 3' end processing machineries, EMBO J 25, 4854-4864.

Millevoi, S. and Vagner, S., 2010, Molecular mechanisms of eukaryotic pre-mRNA 3’ end processing regulation, Nucleic Acids Research 38, 2757–2774

Morlando, M., Ballarino, M., Greco, P., Caffarelli, E., Dichtl, B., and Bozzoni, I., 2004, Coupling between snoRNP assembly and 3' processing controls box C/D snoRNA biosynthesis in yeast, EMBO J 23, 2392-2401.

Nedea, E., He, X.Y., Kim, M., Pootoolal, J., Zhong, G. Q., Canadien, V., Hughes, T., Buratowski, S., Moore, C.L. and Greenblatt, J., 2003, Organization and function of APT, a subcomplex of the yeast cleavage and polyadenylation factor involved in the formation of mRNA and small nucleolar RNA 3'-ends, J Biol Chem 278, 33000-33010.

Ner-Gaon, H., Halachmi, R., Savaldi-Goldstein, S., Rubin, E., Ophir, R. and Fluhr, R., 2004, Intron retention is a major phenomenon in alternative splicing in Arabidopsis, Plant J 39, 877–885.

Neugebauer, K. M., 2002, On the importance of being cotranscriptional. J Cell Sci 115, 3865–3871.

Niwa, M., MacDonald, C. C. and Berget, S.M., 1992, Are vertebrate exons scanned during splice-site selection? Nature 360, 277–280.

Pandya-Jones, A. and Black, D., 2009, Co-transcriptional splicing of constitutive and alternative exons, RNA 15, 1896-1908

Poon, L. L. M., Fodor, E., and Brownlee, G. G., 2000, Polyuridylated mRNA synthesized by a recombinant influenza virus is defective in nuclear export, J Virol 74, 418-427.

Proudfoot, N., 1996, Ending the message is not so simple. Cell 87, 779–781.

Proudfoot, N., 2004, New perspectives on connecting messenger RNA 3' end formation to transcription, Curr Opin Cell Biol 16 (3), 272-278.

Proudfoot, N.J., Furger, A. and Dye, M.J., 2002, Integrating mRNA processing with transcription. Cell 108, 501–512.

Reddy, A. S. N, 2001, Nuclear pre-mRNA splicing in plants, Crit Rev Plant Sci 20, 523–71.

Reddy, A. S. N, 2007, Alternative Splicing of Pre-Messenger RNAs in Plants in the Genomic Era, Annu Rev Plant Biol 58, 267–94.

Rothnie, H.M., 1996, Plant mRNA 3'-end formation. Plant Mol Biol 32, 43-61.

Sakharkar, M.K., Kangueane, P., Sakharkar, K.R., and Zhong, Z., 2006, Huge proteins in the human proteome and their participation in hereditary diseases, In Silico Biol 6, 275–279.

Schwartz, S., Meshorer, E. and Ast, G., 2009, Chromatin organization marks exon–intron structure, Nat Struct Mol Biol 16, 990–995

Scott, J. M. and Imperiale, M. J., 1996, Reciprocal effects of splicing and polyadenylation on human immunodeficiency virus type 1 pre-mRNA processing, Virology 224, 498–509.

Shen, Y., Ji, G., Haas, B.J., Wu, X., Zheng, J., Reese, G.J. and Li, Q.Q., 2008, Genome level analysis of rice mRNA 3’-end processing signals and alternative polyadenylation, Nucleic Acids Res 36, 3150–3161

Shi, Y., Di Giammartino, D.C., Taylor, D., Sarkeshik, A., Rice, W. J., Yates, J. R., Frank, J., and Manley, J. L., 2009, Molecular architecture of the human pre-mRNA 3' processing complex, Mol Cell 33(3), 365-76.

Siddiqui, N., Mangus, D. A., Chang, T. C., Palermino, J. M., Shyu, A. B. and Gehring, K., 2007, Poly(A) nuclease interacts with the C-terminal domain of polyadenylate-binding protein domain from poly(A)-binding protein. J Biol Chem 282, 25067-25075.

Simpson, C. G., Jennings, S. N., Clark, G. P., Thow, G. and Brown, J. W., 2004a, Dual functionality of a plant U-rich intronic sequence element. Plant J 37, 82–91.

Simpson, G. G., Quesada, V., Henderson, I. R., Dijkwel, P. P., Macknight, R., and Dean, C., 2004b, RNA processing and Arabidopsis flowering time control. Biochem Soc Trans 32, 565-566.

Singh, P., Alley, T. L., Wright, S. M., Kamdar, S., Schott, W., Wilpan, R. Y., Mills, K. D. and Graber, J. H., 2009, Global Changes in Processing of mRNA 3′ Untranslated Regions Characterize Clinically Distinct Cancer Subtypes, Cancer Res 69 (24), 9422-9430

Stoilov, P., Rafalska, I., and Stamm, S., 2002, YTH: a new domain in nuclear proteins, Trends Biochem Sci 27, 495-497.

Strasser, K. and Hurt, E., 2001, Splicing factor Sub2p is required for nuclear mRNA export through its interaction with Yra1p, Nature 413, 648–652

Takagaki, Y. and Manley, J. L., 1998, Levels of polyadenylation factor CstF-64 control IgM heavy chain mRNA accumulation and other events associated with B cell differentiation, Mol. Cell 2, 761–771.

Tian, B., Hu, H., Zhang, H., and Lutz, C. S., 2005, A large-scale analysis of mRNA polyadenylation of human and mouse genes, Nucleic Acids Res 33, 201–212.

Tian, B., Pan, Z. and Lee, J. Y., 2007, Widespread mRNA polyadenylation events in introns indicate dynamic interplay between polyadenylation and splicing, Genome Res 17, 156–165.

Wang, B. B. and Brendel, V., 2004, The ASRG database: identification and survey of Arabidopsis thaliana genes involved in pre-mRNA splicing, Genome Biol 5, R102.

Wang, B.B. and Brendel, V., 2006, Genomewide comparative analysis of alternative splicing in plants, Proc Natl Acad Sci 103, 7175–7180.

Woodley, L. and Valcarcel, J., 2002, Regulation of alternative pre-mRNA splicing, Brief Funct. Genomic Proteomic 1, 266–277.

Xing, D., Zhao, H., Xu, R., and Li, Q. Q., 2008, Arabidopsis PCFS4, a homologue of yeast polyadenylation factor Pcf11p, regulates FCA alternative processing and promotes flowering time, Plant J 54, 899-910.

Xu, R., Ye, X. and Li, Q. Q., 2004, AtCPSF73-II gene encoding an Arabidopsis homolog of CPSF 73 kDa subunit is critical for early embryo development, Gene 324, 35-45.

Zeng, C. and Berget, S. M., 2000, Participation of the C-Terminal Domain of RNA Polymerase II in Exon Definition during Pre-mRNA Splicing, Mol Cell Biol 20, 8290-8301.

Zhang, J., Addepalli, B., Yun, K. Y., Hunt, A. G., Xu, R., Rao, S., Li, Q. Q. and Falcone, D. L., 2008, A Polyadenylation Factor Subunit Implicated in Regulating Oxidative Signaling in Arabidopsis thaliana, PLoS ONE 3(6), e2410.

Zhao, J., Hyman, L. and Moore, C., 1999, Formation of mRNA 3’ ends in eukaryotes: Mechanism, regulation, and interrelationships with other steps in mRNA synthesis, Microbiol Mol Biol Rev 63, 405–445

Zhou, Z., Licklider, L. J., Gygi, S. P. and Reed, R., 2002, Comprehensive proteomic analysis of the human spliceosome, Nature 419, 182–185.

Zhou, Z., Luo, M. J., Straesser, K., Katahira, J., Hurt, E. and Reed, R., 2000, The protein Aly links pre-messenger-RNA splicing to nuclear export in metazoans. Nature 407, 401–405.

Table 1. Primers used

Oligo name | Sequence (5' to 3') | Used for
Oligo d(T) with Adaptor | TTCTAGAATTCAGCATTCGCTTCTTTTTTTTTTTTTTTTTV | Reverse transcription
Oligo d(T) Adaptor | TTCTAGAATTCAGCATTCGCTTC | 3'RACE reverse primer
C30 F-1 | CTGGACCTCCACCACCAGTTGA | 3'RACE forward primer 1 for AtCPSF30
C30 F-2 | CCACAGCTACAAGATAGACCTCA | 3'RACE forward primer 2 for AtCPSF30
C30Y F-1 | AGGATGCATCACATGACATGGA | 3'RACE forward primer 1 for AtC30Y
C30Y F-2 | GAAGAGAGTGAAAGTGAAGAC | 3'RACE forward primer 2 for AtC30Y
OXT6 pA1 | GTTGTTATTGGTTCAGTGGCGCCATATTGGTTTCTTATA | To introduce mutation at pA1 polyadenylation region in intron-2 of OXT6. pA1 is replaced with a HaeII restriction site (ggc/gcc).
OXT6 pA1-comp | GCTCTTATAAGAAACCAATATGGCACTGAACCA | Complementary primer of OXT6 pA1
OXT6 pA2 | GAATGTAACTGGTTATTTGCAAGTTCAGTATCTACCTGATTTTAGGATAATTTTTC | To introduce mutation at pA2 polyadenylation region in intron-2 of OXT6. 36 bp of pA2 polyadenylation region is deleted.
OXT6 pA2-comp | GAAAAATTATCCTAAAATCAGGTAGATACTGAACTTGCAAATAACCAGTTACATTC | Complementary primer of OXT6 pA2
OXT6 5'ss | GCCTCAAGGGGTAAATAGATGCGTTCAGAGTCCTAAGGT | To introduce mutation at 5'ss in intron-2 of OXT6. 5'ss is mutated from AG/gtgt to AG/atgc.
OXT6 5'ss-comp | ACCTTAGGACTCTGAACGCATCTATTTACCCCTTGAGGC | Complementary primer of OXT6 5'ss
OXT6 3'ss | CTCTTCATAGGTAATCCCTTACATAATTTAGGATCTTGTTTTGATATCGTATTTTGTAGTTAAAAGTAACAATCGAG | To introduce mutation at 3'ss in intron-2 of OXT6. 3'ss is mutated from tttcag/GT to gatatc/GT (an EcoRV site).
OXT6 3'ss-comp | CTCGATTGTTACTTTTAACTACAAAATACGATATCAAAACAAGATCCTAAATTATGTAAGGGATTACCTATGAAGAG | Complementary primer of OXT6 3'ss
OXT6 pA3 | AGCGGCAGGTTGTTGGTGGTCTATGGCATTAG | To introduce mutation at pA3 polyadenylation region in 3'UTR of OXT6. 76 bp of pA3 polyadenylation region is deleted.
OXT6 pA3-comp | CTAATGCCATAGACCACCAACAACCTGCCGCT | Complementary primer of OXT6 pA3
AtCPSF30/AtC30Y-qPCR-F | CCGCCTGAAAACTCTTCCT | qPCR for AtCPSF30 and AtC30Y, forward primer
AtCPSF30-qPCR-R | TGAACCAATAACAACGTCTTGA | qPCR for AtCPSF30. Reverse primer located 37 nts after the stop codon of C30 short transcript. Product size: 703 bp.
AtC30Y-qPCR-R | AGCTTCATTGCTCCTTTGTG | qPCR for C30Y. Reverse primer located at 70 nts in the beginning of C30Y Exon-3. Product size: 689 bp.
OXT6 Ref-ChIP-F | ATTTCGAAGGCGGTCTTG | Internal reference for qPCR for ChIP assay. Forward primer located 28 nts at the beginning of Exon-1 of the OXT6 gene.
OXT6 Ref-ChIP-R | GTTTTCAGGCGGAGCAAC | Internal reference for qPCR for ChIP assay. Reverse primer located 81 nts at the beginning of Exon-1 of the OXT6 gene.
OXT6 5'ss-ChIP-F | CATCTCATCCTTTGCCTCAA | qPCR for ChIP assay of 5'ss of intron-2 in OXT6, forward primer
OXT6 5'ss-ChIP-R | ACTGAACCAATAACAACGTCTTG | qPCR for ChIP assay of 5'ss of intron-2 in OXT6, reverse primer
OXT6 3'ss-ChIP-F | CTCTTCATAGGTAATCCCTTACA | qPCR for ChIP assay of 3'ss of intron-2 in OXT6, forward primer
OXT6 3'ss-ChIP-R | ACACACCTTGTTGCACAGAT | qPCR for ChIP assay of 3'ss of intron-2 in OXT6, reverse primer
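As a quick consistency check on Table 1, a mutagenesis primer can be compared with its "-comp" partner by reverse complementation. The sketch below is illustrative only and is not taken from the thesis methods: the function names are hypothetical, and the two sequences are the OXT6 pA3 and OXT6 pA3-comp entries copied from the table.

```python
# Illustrative helper (hypothetical names): check whether two primers are
# exact reverse complements of each other.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_complementary_pair(primer: str, comp: str) -> bool:
    """True if `comp` is the exact, full-length reverse complement of `primer`."""
    return reverse_complement(primer).upper() == comp.upper()


# Sequences copied from the OXT6 pA3 rows of Table 1.
OXT6_PA3 = "AGCGGCAGGTTGTTGGTGGTCTATGGCATTAG"
OXT6_PA3_COMP = "CTAATGCCATAGACCACCAACAACCTGCCGCT"

if __name__ == "__main__":
    print(is_complementary_pair(OXT6_PA3, OXT6_PA3_COMP))
```

This check is strict: a pair whose two primers differ in length, or that overlap only partially, will be reported as non-complementary even if the overlap itself pairs correctly.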

Table 2. OXT6 mutation constructs and transgenic plants used

Gene | Mutation constructs | Transgenic plants
OXT6 | pA1+pA2 | oxt6::OXT6-pA1+pA2
OXT6 | 5’ss | oxt6::OXT6-5’ss
OXT6 | 3’ss | oxt6::OXT6-3’ss
OXT6 | 5’ss+3’ss | oxt6::OXT6-5’ss+3’ss
OXT6 | pA1+pA2+5’ss | oxt6::OXT6-pA1+pA2+5’ss
OXT6 | pA1+pA2+3’ss | oxt6::OXT6-pA1+pA2+3’ss
OXT6 | pA3 | oxt6::OXT6-pA3

[Figure 1 diagram: OXT6 gene map with labels T-DNA, Intron-2, pA1 and pA2, and pA3, above the AtCPSF30 and AtC30Y transcripts]

Figure 1. The structure of the OXT6 gene and its RNA transcripts. Exons are indicated in gray (region found in AtCPSF30) or black (only in AtC30Y). UTRs are indicated in white boxes, while introns are in thick lines. The position of the T-DNA insertion in the oxt6 mutant is indicated above exon-1. The polyadenylation regions inside intron-2 (pA1 and pA2) and 3’UTR (pA3) are indicated between two arrows. The positions of the primers designed for quantitative RT-PCR to specifically detect the two transcripts are indicated above the corresponding region with arrows.

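[Figure 2 diagram: stop codon, distance to the first poly(A) site, and sequence of each polyadenylation region; clone counts are listed in the order of the capitalized poly(A) sites]

pA1: TAA, 60 nts, attaaaatattggtttcttaTaagag; clones per poly(A) site: 1

pA2: TAA, 129 nts, aatgaactttatatatacagAtatattttcaaCTT; clones per poly(A) site: 2, 1, 4, 4

pA3: TGA, 120 nts, aaatattttgtAttattagacaaagagtagttaaTaCaactctcgcgcgtctttctTtagtAtT; clones per poly(A) site: 1, 2, 1, 1, 2, 1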

Figure 2. Poly(A) sites detected in AtCPSF30 and AtC30Y transcripts. Poly(A) sites and presumed NUE signals of pA1, pA2 and pA3 are shown. Poly(A) sites are indicated by capital letters, and NUEs are indicated by underlined italic letters. The number above each arrow represents the number of independent clones sequenced. The distance (nts) between the stop codon and the first poly(A) site within each polyadenylation region is also indicated (not to scale).

[Figure 3 diagram: schematics of the seven constructs pA1+pA2, 5’ss, 3’ss, 5’ss+3’ss, pA1+pA2+5’ss, pA1+pA2+3’ss and pA3]

Figure 3. An illustration of the seven OXT6 mutation constructs. The names of the mutant constructs are on the left-hand side. Mutated regions are indicated with “Xs”.

[Figure 4 diagram]

(A) pA2 deletion: TAA, 85 nts, gaatGtaactggttatttGcaagttcagtatct; clones per poly(A) site: 2, 2

(B) Intron-2 of OXT6, 5’ to 3’, with the original and cryptic splice sites marked at each end

Figure 4. Cryptic poly(A) sites and cryptic splice sites detected in transgenic lines. (A) Cryptic poly(A) sites detected in oxt6::OXT6-pA1+pA2 transgenic plants. Poly(A) sites are indicated by capital letters, and the position of the pA2 deletion in the pA1+pA2 construct is indicated by an italic “g”. The numbers above the arrows represent the numbers of independent clones sequenced. (B) Cryptic 5’ss and 3’ss of intron-2 in the OXT6 gene detected in splice-site mutant lines. Original splice sites are indicated in black letters and cryptic splice sites in red letters; both are shaded in yellow. The blue letters and * represent the rest of the intron-2 sequence.

[Figure 5 plot: y-axis, Log2 (AtCPSF30/AtC30Y Expression Level), from -20 to 5; x-axis, WT, pA1+pA2, 5’ss, 3’ss, 5’ss+3’ss, pA1+pA2+5’ss, pA1+pA2+3’ss and pA3; series T1 and T2; asterisks mark significant differences]

Figure 5. Fold change in the ratio of the AtCPSF30 and AtC30Y transcripts in the OXT6 mutation constructs, detected by quantitative RT-PCR. Copy numbers of the transcripts were determined by the comparative threshold cycle method (∆∆Ct). Results were derived from the averages of at least three independent transgenic plant lines. Two-tailed t-test results are indicated by (*) p<0.05, (**) p<0.01. All significant differences are relative to the wild type (WT).
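The quantity plotted in Figure 5 can be illustrated with a minimal sketch of the comparative threshold cycle calculation. This is not the thesis's own analysis script: the function names and the Ct values are hypothetical, and the sketch assumes both amplicons amplify with roughly 100% efficiency (one doubling per cycle), which is the usual assumption behind 2^(-∆∆Ct).

```python
# Illustrative ddCt sketch (hypothetical names and values).
# Transcript A would be AtCPSF30 and transcript B AtC30Y; the control is WT.

def log2_ratio_change(ct_a_sample: float, ct_b_sample: float,
                      ct_a_control: float, ct_b_control: float) -> float:
    """Log2 change of the A/B transcript ratio in a sample versus a control."""
    d_ct_sample = ct_a_sample - ct_b_sample      # dCt in the mutant line
    d_ct_control = ct_a_control - ct_b_control   # dCt in the wild type
    dd_ct = d_ct_sample - d_ct_control           # ddCt
    return -dd_ct                                # log2 of 2^(-ddCt)


def fold_change(log2_value: float) -> float:
    """Convert a log2 change back to a linear fold change."""
    return 2.0 ** log2_value


if __name__ == "__main__":
    # Made-up Ct values, for illustration only.
    change = log2_ratio_change(28.0, 22.0, 24.0, 23.0)
    print(change, fold_change(change))
```

A negative value means the A/B ratio dropped relative to the control, which is how bars below zero in a log2 plot of this kind would be read.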

[Figure 6 plot: y-axis, ChIP Signal (% Input), from 0 to 14; x-axis, 3'ss and 5'ss regions; series WT, oxt6::OXT6-5’ss+3’ss and oxt6::OXT6-3’ss; asterisks mark significant differences]

Figure 6. ChIP assays of the splicing mutant transgenic lines. The relative amount (% Input) of co-precipitated DNA determined by qPCR was compared to that of the wild type. Results were derived from averages of two to three independent samples. Two-tailed t-test results are indicated by (*) p<0.05, (**) p<0.01. All significant differences are relative to the wild type (WT).
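For readers unfamiliar with the % Input unit in Figure 6, the sketch below shows one common way to compute it from qPCR Ct values. It is an assumption, not the calculation documented in the thesis: the function name, the input fraction and the Ct values are all hypothetical, and amplification is assumed to double the product each cycle.

```python
import math

# Illustrative percent-input sketch (hypothetical names and values).

def percent_input(ct_ip: float, ct_input: float, input_fraction: float) -> float:
    """ChIP signal as a percentage of input chromatin.

    input_fraction is the share of chromatin saved as the input sample
    (for example 0.01 for 1%). The input Ct is first adjusted to what it
    would be for 100% of the chromatin, then compared with the IP Ct.
    """
    if not 0.0 < input_fraction <= 1.0:
        raise ValueError("input_fraction must be in (0, 1]")
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


if __name__ == "__main__":
    # Made-up Ct values, for illustration only.
    print(percent_input(ct_ip=27.0, ct_input=25.0, input_fraction=0.25))
```

Expressing each region as a percentage of its own input lets the 5'ss and 3'ss amplicons be compared between lines without depending on absolute template amounts.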
