Investigating the Role of the Putative Splicing Factor EMB-4 in Small RNA Pathways in Caenorhabditis elegans

by

Amena Nabih

A thesis submitted in conformity with the requirements for the degree of Master’s of Science

Department of Molecular Genetics University of Toronto

© Copyright by Amena Nabih 2017

Abstract

Small RNA pathways are essential for germline expression and maintenance of germline genome integrity. In C. elegans, two Argonautes, CSR-1 and WAGO-9/HRDE-1, function antagonistically to promote proper germline gene expression and development. WAGO-9/HRDE-1 acts downstream of the piRNA pathway to silence deleterious nucleic acids, while CSR-1 functions to protect germline genes from piRNA-mediated silencing. The two interact with an array of co-factors to elicit changes in gene expression in the nucleus, although their mechanisms are not fully understood. This study identifies one such co-factor, the putative splicing factor EMB-4. Using integrated cell and molecular biology and genomic approaches, I show that loss of emb-4 significantly perturbs CSR-1 and WAGO-9/HRDE-1 small RNA and mRNA transcriptomes. Moreover, I find that EMB-4 interacts with target transcripts of the two pathways differentially. These data implicate EMB-4 in small RNA biogenesis, potentially in relation to its role in the nuclear export of its targets.

Acknowledgements

This thesis is dedicated to women in science.

I owe a debt of gratitude to many people, without whom this thesis would not be possible. I extend my thanks to the following:

My supervisor, Dr. Julie Claycomb. Tackling this project was no easy feat and I could not be more grateful for her patience, support and encouragement throughout, for sharing my frustrations and celebrating my triumphs, and for teaching me about persistence and resilience.

My committee members Dr. Ben Blencowe and Dr. Andrew Spence, for their helpful suggestions and fruitful discussions throughout my research.

Dr. Olivia Rissland, for raising the bar, asking tough questions, expecting no less than excellence, broadening my horizons, teaching me about the true essence of science and for being both intimidatingly and inspirationally extraordinary in every way.

Dr. Katarzyna Tyc, for analyzing my plethora of sequencing data, for her patience with me when I failed to understand them and for her commitment to perfection in our collaborative work.

Chris Wedeles, my favorite scientist and bay mate, for allowing me to adopt his brain child, then helping me as I struggled to raise it into a presentable adult, for all the techniques he’s taught me, for readily doling out advice and for all his jokes, pranks and general friendship.

Monica Wu, the person with whom I share the deepest sense of camaraderie, for her generosity with her time, for teaching me how to produce the most beautiful data, for always dropping what she was doing when I needed help and for always being there when I needed to vent.

Amanda Charlseworth, for providing everything I needed when I needed it: encouragement when I failed, cheers when I succeeded, laughter when I was sad, food when I was hungry and most importantly, entertainment when the day felt dull.

To paraphrase the talented Ms. Tina Fey, I’d like to thank my parents for somehow raising me to have confidence that is disproportionate to my intelligence and abilities. They’ve sacrificed their own comfort for mine, carried my burdens, lifted me up and despite the distance I never felt without them. My father’s curious, analytical and rational mind and my mother’s incredible work ethic and persistence have taught me more than any young scientist could hope to learn.

The rest of my family, especially my siblings, brother-in-law and niece. There are no words to describe how much joy they bring me, how much I enjoy their mocking and appreciate their support, their weekly calls and daily texts and how much I love being their nerdy sister.

My friends all over, for pretending to understand what I was saying when I talked about my work, for always taking the edge off, for sharing food, drinks, laughter and words of encouragement. They make it all easier to bear and I wouldn’t survive a day in this world without them.

Crossfit416 and the 7am crew, for giving me a reason to get out of bed in the morning on days when I was ready to quit, for every “Keep going!” and “You got this!” that were screamed so loud at me at 7am that they resonated in my ears well into the workday, when I needed them even more. I have both loved and hated every grueling early morning I’ve spent with them.

Table of Contents

Acknowledgements
Table of Contents
List of Tables
List of Figures
Abbreviations

1. Introduction
1.1. Small RNAs and Argonautes
1.2. AGO structure and function
1.3. Classes of small RNAs
1.4. C. elegans small RNA pathways
1.4.1. miRNAs
1.4.2. 21U-RNAs (piRNAs)
1.4.3. Endo-siRNAs
1.4.3.1. 26G-RNAs
1.4.3.2. 22G-RNAs
1.4.3.2.1. CSR-1 22G-RNAs
1.4.3.2.2. WAGO 22G-RNAs
1.5. Small RNAs in the cytoplasm
1.6. Small RNAs in the nucleus
1.6.1. Small RNA pathways and transcription
1.6.1.1. S. pombe
1.6.1.2. A. thaliana
1.6.1.3. D. melanogaster
1.6.1.4. C. elegans
1.6.2. Small RNA pathways and splicing
1.6.3. Small RNA pathways and nuclear export
1.6.4. A link between small RNAs and chromatin in C. elegans primordial germ cells
1.7. The conserved splicing factor EMB-4/AQR
1.7.1. Homologs of EMB-4
1.8. Thesis Rationale

2. Materials and Methods
2.1. Nematode strains, culture, brood size counts and RNAi
2.1.1. Brood size counts
2.1.2. RNAi feedings
2.2. EMB-4 antibody generation and purification
2.3. Lysate preparation for Western blot, silver stain and immunoprecipitation
2.4. RNA preparation for qRT-PCR, mRNA-seq, small RNA-seq and RIP-seq
2.5. Processing and analysis of high-throughput sequencing data
2.5.1. Processing of small RNA-seq data
2.5.2. Processing of mRNA-seq data
2.5.3. Processing of RIP-seq data
2.6. Chromatin immunoprecipitation
2.7. Quantitative real-time PCR methods
2.8. Immunofluorescence experiments

3. Results
3.1. EMB-4 antibodies specifically recognize EMB-4
3.2. Developmental timing of emb-4 mRNA and protein expression
3.3. EMB-4 localization in the adult germline and embryo
3.4. Loss of emb-4 leads to decreased broods and embryonic lethality
3.5. EMB-4 interacts with C. elegans WAGOs CSR-1 and WAGO-9/HRDE-1
3.6. csr-1 and wago-9/hrde-1 mutants display defects in germline chromatin resetting similar to EMB-4
3.7. EMB-4 is present at CSR-1 and WAGO-9/HRDE-1 target genomic loci
3.8. Changes in the emb-4(hc60) small RNA and mRNA transcriptomes
3.9. EMB-4 interacts with nearly 12,000 transcripts
3.10. The impact of EMB-4 binding on target transcript levels
3.11. EMB-4 interacts with TREX component DDX-19

4. Discussion and Future Questions
4.1. Discussion
4.1.1. EMB-4 expression and localization are reflective of its function
4.1.2. emb-4 mutants show defects in fertility and viability, reflective of germline roles
4.1.3. EMB-4 interacts with 22G-RNA pathways in the germline
4.1.4. EMB-4 associates with chromatin at 22G-RNA target loci and binds a diverse set of transcripts
4.1.5. emb-4(hc60) small RNA and mRNA transcriptomes are perturbed
4.1.7. EMB-4 interacts with TREX in C. elegans
4.1.8. Summary
4.2. Future Questions
4.2.1. How do the complexes in which EMB-4 is found with WAGO-9/HRDE-1 and CSR-1 differ?
4.2.2. How do the functions and targets of EMB-4 and DDX-19 overlap?
4.2.3. Does loss of EMB-4 lead to mislocalization of transcripts?
4.3. Conclusions

5. References

List of Tables

Table 1. Splicing factors implicated in RNAi and their homologs

Table 2. C. elegans strains used in this study

Table 3. Peptides used to generate EMB-4 antibody

Table 4. Primers used in this study

Table 5. Spectral counts of EMB-4 and CSR-1 in IPs

Table 6. CSR-1 and WAGO-9/HRDE-1 22G-RNAs changed upon loss of emb-4

Table 7. Global transcriptome changes in the emb-4(hc60) mutant

Table 8. Changes in CSR-1 and WAGO-9/HRDE-1 small RNAs and corresponding mRNAs upon loss of emb-4

Table 9. CSR-1 and WAGO-9/HRDE-1 targets showing EMB-4 enrichment

List of Figures

Figure 1. Crystal structure of human Ago2

Figure 2. Phylogenetic tree of C. elegans AGOs

Figure 3. Germline chromatin resetting phenotype in emb-4(hc60) mutant

Figure 4. The EMB-4 antibodies are specific to EMB-4

Figure 5. emb-4 is predominantly expressed in embryos and adult germlines

Figure 6. EMB-4 is expressed in the nuclei of the germline and adult embryo

Figure 7. The emb-4(hc60) mutant is temperature sensitive and has increased sterility and embryonic lethality

Figure 8. EMB-4 interacts with WAGOs CSR-1 and WAGO-9/HRDE-1

Figure 9. csr-1 and wago-9/hrde-1 mutants have defects in germline chromatin resetting, similar to emb-4 mutants

Figure 10. EMB-4 is enriched at CSR-1 and WAGO-9/HRDE-1 target loci

Figure 11. CSR-1 22G-RNAs are depleted upon loss of emb-4

Figure 12. CSR-1 targets are downregulated in the emb-4(hc60) mutant in correlation with a depletion of their small RNAs, while WAGO-9/HRDE-1 targets show no distinct pattern of misregulation

Figure 13. EMB-4 binds approximately 12000 transcripts and is enriched at …

Figure 14. The impact of EMB-4 binding on CSR-1 and WAGO-9/HRDE-1 target mRNA levels

Figure 15. EMB-4 interacts with TREX component DDX-19

Abbreviations

AGO    Argonaute
ALG-1/2/3/4    Argonaute-Like Gene-1/2/3/4
AQR    Aquarius
Aub    Aubergine
CCR4-NOT    CCR4-NOT transcription complex subunit 4
ChIP    Chromatin immunoprecipitation
CLRC    Cryptic Loci Regulator Complex
co-IP    co-immunoprecipitation
CSR-1    Chromosome Segregation and RNAi-deficient-1
DAPI    4',6-diamidino-2-phenylindole
DCL3    Dicer-Like 3
DCP1/2    mRNA-decapping enzyme subunit 1/2
DCR-1    Dicer-related-1
DCR-2    Dicer-2
DDX-19    DEAD box helicase homolog-19
DNA    Deoxyribonucleic acid
DRH-3    Dicer-Related Helicase-3
DRSH-1    Ortholog of Drosophila Drosha-1
dsRNA    double-stranded RNA
DTBP    dimethyl 3,3ʹ-dithiobispropionimidate
EGO-1    Enhancer of GLP-1
eIF4E    eukaryotic Initiation Factor 4E
EJC    Exon-junction complex
EKL-1    Enhancer of KSR-1 Lethality-1
EMB-4    Abnormal embryogenesis-4
EMU    Erecta mRNA under-expressed
endo-siRNA    endogenous short interfering RNA
ERGO-1    Endogenous-RNAi deficient arGOnaute-1
ESP3    Enhanced Silencing Phenotype 3
FISH    Fluorescent in situ hybridization
GLH    Germline Helicase
HEL-1    Helicase-1
HENN-1    HEN1 of nematode
HP1a    Heterochromatin Protein 1a
HRDE-1    Heritable RNAi Deficient-1
IBC    Intron-binding complex
IBP160    Intron-Binding Protein 160
IF    immunofluorescence
IP    Immunoprecipitation
IPTG    Isopropyl β-D-1-thiogalactopyranoside
LB    Luria Broth
MAGOH    Mago-Nashi homolog
mes-4    maternal effect sterile-4
met-1    histone methyltransferase-like-1
miRISC    microRNA-Induced Silencing Complex
miRNA    microRNA
MLN51    Metastatic lymph node gene 51
mRNA    messenger RNA
NGM    Nematode growth medium
NMD    Nonsense-mediated decay
NRDE-1/2/3/4    Nuclear RNAi Deficient-1/2/3/4
nt    nucleotide
NXF-1    Nuclear export factor-1
P granule    Processing granule
PAGE    Polyacrylamide gel electrophoresis
PAZ    PIWI–Argonaute–Zwille
PBS    Phosphate Buffered Saline
PCR    polymerase chain reaction
PGC    Primordial germ cell
PGL    P-granule abnormality-1
PIE-1    Pharynx and intestine in excess-1
piRNA    PIWI-interacting RNA
PIWI    P-element induced wimpy testes
PPM    Parts per million
PRG-1    Piwi-related gene-1
pri-miRNA    primary miRNA
PTGS    Post-transcriptional gene silencing
qRT-PCR    quantitative real-time PCR
RdRC    RNA-dependent RNA Polymerase Complex
RdRM    RNA-directed DNA methylation
RdRP    RNA-dependent RNA Polymerase
RIP    RNA-immunoprecipitation
RISC    RNA-induced silencing complex
RITS    RNA-induced transcriptional silencing complex
RNA    Ribonucleic acid
RNA POL    RNA Polymerase
RNAi    RNA interference
RRF-1    RNA-dependent RNA polymerase Family-1
rRNA    ribosomal RNA
SET-25    SET-domain-containing-25
siRNA    short interfering RNA
snRNP    small nuclear ribonucleoprotein
TB    Terrific Broth
TGS    Transcriptional gene silencing
TREX    Transcription export complex
WAGO    worm-specific argonaute
XRN1    5'-3' exoribonuclease-1

1. Introduction

The germline is arguably the most important tissue, as it is responsible for passing on genetic information from one generation to the next. Thus, proper germline development is essential for the propagation of organisms and their continuation as a species. Proper germline development occurs under the direction of very particular patterns of gene expression. It is crucial that gene expression in the germline is tightly regulated as mistakes in gene expression can have deleterious effects, not only on the organism itself, but also on coming generations, as they will be passed on. Gene expression in the germline is highly regulated by small RNA pathways in many organisms [1]. These pathways ensure that the proper germline genes are expressed and that somatic genes as well as any deleterious and foreign DNA are silenced [2]. Therefore, small RNA pathways lie at the center of many important biological processes and are essential for organism development and the protection of the germline genome.

The field of small RNA biology has been burgeoning over the past few decades. Although much is now known about the biological roles of small RNAs, current knowledge has left much to be explored by way of the functional mechanisms of these diverse molecules. A primary means of understanding small RNA pathways is aimed at uncovering the factors involved in these processes. This line of investigation enables the field to gain insight into the small RNA-related roles of these factors, which may or may not be different from other roles these factors have within cells. This thesis will focus on the characterization of a novel small RNA component, the putative conserved splicing factor, EMB-4, examining its role in small RNA pathways and germ cell development in the nematode, C. elegans.

1.1. Small RNAs and Argonautes:

Small RNAs are short non-coding RNAs, ranging in length from 18 to 30 nucleotides [3]. They are known to regulate gene expression through various mechanisms, ranging from affecting transcription and chromatin architecture in the nucleus to causing mRNA degradation and translational inhibition in the cytoplasm [4,5,6]. In most organisms, three classes of small RNAs exist: the PIWI-interacting RNAs (piRNAs), the short interfering RNAs (siRNAs) and the microRNAs (miRNAs), all of which have different biogenesis mechanisms and recognize target transcripts by varying degrees of sequence complementarity [3,7]. After their biogenesis, small RNAs are loaded into large, multi-domain proteins called Argonautes (AGOs), which are the effector proteins at the core of the small RNA pathways. The small RNA serves as a specificity factor, guiding the AGO to its target RNA transcripts by sequence complementarity. When target transcripts are recognized by AGO/small RNA complexes, or RISCs (RNA-Induced Silencing Complexes), gene regulation can be achieved by a number of mechanisms, depending on the type of small RNA, association with co-factors, localization within the cell, and organization of the AGO structure itself [8,9].

1.2. AGO structure and function:

AGO structure supports its function as a modulator of gene expression in an RNA-dependent manner. AGOs share three common domains that are highly conserved across species: the PAZ domain, the MID domain and the C-terminal PIWI domain (Figure 1) [9,10,11,12]. The N-terminal region of the protein is highly variable and relatively unstructured. It has been implicated in the unwinding of duplex small RNA complexes, thereby preparing the small RNA for loading [13]. The PAZ domain of AGOs is responsible for the association with the 3' end of the small RNA, whereas the MID domain holds the binding pocket for the 5' end of the small RNA [14,15]. The PIWI domain is the catalytic domain in many AGOs and possesses endonucleolytic activity that can cleave RNAs in a manner consistent with RNase H activity [16,17]. It is important to note that although AGOs possess these three characteristic domains, not all of them display endonucleolytic (or “Slicer”) activity, because some lack the key catalytic residues (DEDX) within the PIWI domain [18].

AGO structure is also tailored to accommodate interactions with important co-factors that enable mRNA regulation. For example, the PIWI domain of human Ago2 possesses binding pockets for tryptophan, which promote binding with tryptophan-rich co-factors, such as GW182 [11]. GW182 is a member of the core RNAi pathway in the cytoplasm, as it functions to recruit the deadenylase and decapping enzyme complexes, CCR4:NOT and DCP1:DCP2 respectively [19].

Although AGO structure is highly conserved across species, the number of AGOs encoded in an organism’s genome is highly variable, ranging from one in the fission yeast Schizosaccharomyces pombe, to six in D. melanogaster, 26 in the nematode C. elegans and four in humans [8]. AGOs that bind siRNAs and miRNAs are ubiquitously expressed, whereas PIWIs (piRNA-associated AGOs) are found predominantly in animal germlines as well as in ciliated protozoans [9].

Figure 1. Crystal structure of human Ago2 [11]. The crystal structure of human Ago2 shows the MID (green), PAZ (navy) and PIWI (grey) domains of the AGO with nucleotides 1-8 of the 5' end and nucleotide 21 of the 3' end of a guide RNA (red) threaded in their binding pockets.

1.3. Classes of small RNAs:

miRNAs are encoded by specific miRNA genes or clusters of genes. Once transcribed by RNA polymerase II, the primary miRNA (pri-miRNA) folds into a hairpin structure, generating dsRNA, which is first processed in the nucleus by the RNase III-type enzyme Drosha to create a precursor miRNA (pre-miRNA) [20,21,22,23]. The pre-miRNA is then exported from the nucleus and processed into the mature miRNA in the cytoplasm by the RNase III-type nuclease Dicer [24,25,26]. miRNAs represent an important platform to regulate gene expression at the post-transcriptional level in the cytoplasm, via mRNA destabilization (by de-capping and de-adenylation followed by exonuclease degradation), mRNA degradation (by Slicer activity), and translational inhibition [27,28]. Generally, miRNAs interact with a sub-group of AGOs known as the “classical” Argonautes.

PIWI-interacting RNAs (piRNAs) are restricted to the germline in animals, or are found in single-celled ciliated protozoa. They associate with the PIWI subfamily of AGOs, which act in the nucleus to silence transposable elements, repeats and other deleterious nucleic acids and thereby maintain the integrity of the germline genome in animals [29,30,31]. piRNAs are transcribed and processed from intergenic repetitive elements, transposons or large genomic clusters, termed piRNA clusters [32]. piRNA processing has been shown to be Dicer- and Drosha-independent [33]. It is hypothesized, however, that piRNA processing may depend on Piwi nuclease activity.

In ciliated protozoans, piRNAs are involved in the process of developmentally programmed genome rearrangement, where they can promote either the retention (Oxytricha) or expulsion (Tetrahymena) of genomic sequences in the formation of the macronucleus [199,200].

Endogenous siRNAs (endo-siRNAs) are a more generalized and all-encompassing group of endogenous small RNAs. They generally target transgenes, transposons and viral RNA and are therefore considered important components of the host defense pathway against foreign nucleic acids [34,35]. However, many are also antisense to protein-coding transcripts. In contrast to piRNAs and miRNAs, whose biogenesis is relatively well conserved, siRNAs vary in function and biogenesis from organism to organism. In some organisms such as flies, siRNAs arise from ectopically introduced long double-stranded RNAs (dsRNAs), whereas others, such as C. elegans and plants, express RNA-dependent RNA polymerases, which utilize transcripts as templates to generate siRNAs [36,37].

1.4. C. elegans small RNA pathways:

Small RNA pathways in C. elegans are particularly diverse, likely due in part to the presence of 26 different AGO proteins. In addition to the classical miRNA-binding AGOs and the Piwis, C. elegans also possesses an additional class of sixteen AGOs called the Worm-specific AGOs (WAGOs) (Figure 2) [38]. WAGOs bind to a unique class of endo-siRNAs called the 22G-RNAs, named for their length of 22 nucleotides and their bias for a 5' guanine nucleotide [39,40]. Together, the three distinct small RNA pathways play important roles in gene regulation throughout the lifespan of the worm and in particular tissues, leading to proper development and fertility. As my thesis hinges on work performed using C. elegans as a model system, I will focus on small RNA pathways in C. elegans throughout this thesis.

Figure 2. Phylogenetic tree of C. elegans AGOs [206]. C. elegans AGOs are divided into three sub-clades: the miRNA-binding AGOs (outlined in grey), the piRNA-binding AGOs (green) and the 22G-RNA-binding worm-specific AGOs (red).

1.4.1. miRNAs

The C. elegans miRNA pathway has been shown to be a crucial component of several developmental and physiological processes. Several miRNAs such as lin-4 and let-7 play central roles in the specification of developmental timing during embryonic and larval development [41]. For example, let-7 is involved in the regulation of the molting processes in larvae [42]. As well, lin-4 is required for axon elongation during nervous system development [43]. Furthermore, the miR-35 and miR-51 (also known as miR-100 in humans) families of miRNAs appear to be collectively essential for embryogenesis [44,45,46].

miRNAs also play vital roles in normal physiological processes. For example, miR-1, miR-34 and miR-240 have been found to be involved in neuronal signaling, DNA damage response and defecation, respectively [47,45]. Several miRNAs are also involved in regulating longevity and lifespan, including miR-71, miR-238, and miR-246, which act to increase longevity, while miR-34 and miR-239 reduce lifespan [48,49].

C. elegans miRNAs are synthesized in a manner that is comparable to most animals. Most miRNAs are transcribed from genomic loci by RNA polymerase II to generate a long primary miRNA (pri-miRNA) molecule [20]. This pri-miRNA is processed by the RNase III endonuclease DRSH-1, the C. elegans homolog of Drosha, to give rise to a 60-70 nt long hairpin precursor miRNA (pre-miRNA) [50]. Precursors are then processed into 21-23 nt miRNAs by the RNase III DCR-1, the C. elegans homolog of Dicer, and are loaded onto one of the worm miRNA-specific AGOs, ALG-1 or ALG-2 [25,51]. The AGO then releases one of the RNA strands and remains bound to the second, or guide strand, which possesses a 5' mono-phosphorylated residue [52]. The miRISC then uses a 6-8 nt seed region near the 5' end of the miRNA to bind to its target transcripts and induce gene silencing [53].
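Seed-based target recognition can be illustrated with a minimal Python sketch. It is not part of the analyses in this thesis. It assumes that the seed occupies nucleotides 2-8 of the guide strand and that only perfect Watson-Crick pairing counts as a site; both are simplifying assumptions, and the function names and sequences are hypothetical.

```python
# Toy illustration of seed-based target recognition by a miRNA guide strand.
# Assumptions (not from the thesis): the seed is nucleotides 2-8 of the guide,
# and a site requires perfect Watson-Crick pairing with the seed.

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def seed_sites(mirna: str, target: str, seed_start: int = 2, seed_len: int = 7) -> list[int]:
    """Return 0-based positions in `target` where the seed can pair perfectly."""
    seed = mirna.upper()[seed_start - 1 : seed_start - 1 + seed_len]
    site = reverse_complement(seed)  # sequence the seed would base-pair with
    target = target.upper()
    hits = []
    pos = target.find(site)
    while pos != -1:
        hits.append(pos)
        pos = target.find(site, pos + 1)  # allow overlapping sites
    return hits


# Hypothetical 22 nt guide and a made-up target fragment containing one site.
toy_mirna = "UACGGAUCCAGUUCGAUGCAAU"
toy_target = "AAAGAUCCGUAAACCC"
print(seed_sites(toy_mirna, toy_target))
```

Real target prediction also weighs features such as site context and conservation, which this sketch deliberately ignores.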

1.4.2. 21U-RNAs (piRNAs)

Similar to many other animals, the C. elegans piRNA pathway is a germline-specific pathway used to protect the germline from foreign, mobile and deleterious nucleic acids such as transposable and repeat elements [54,55,56]. This is supported by the observed germline mortality and sterility in piRNA pathway mutants [57]. Despite the fact that the C. elegans genome seemingly encodes two PIWI proteins, only one functional PIWI has been detected so far, namely PRG-1. PRG-1 associates with piRNAs, termed 21U-RNAs in the worm due to their length of 21 nt and a general 5' bias for a uridine [58]. piRNAs are encoded in the genome and are transcribed by RNA polymerase II from two large clusters on chromosome IV, as well as from regions upstream of certain protein-coding genes [58,59]. Their precursors are thought to be 26 nt long capped small RNAs, which are subsequently decapped and trimmed at both ends by an unknown mechanism [59]. Finally, they are 2'-O-methylated at the 3' end by the conserved methyl-transferase HENN-1, and loaded into PRG-1 [60]. Targeting of PRG-1 to mature mRNA by its associated 21U-RNA elicits secondary siRNA biogenesis of the worm-specific 22G-RNAs against that target [57,61]. Secondary siRNAs then function in association with the worm-specific AGOs WAGO-1 and WAGO-9/HRDE-1 to amplify the silencing response and silence the target of interest as described below [62]. piRNAs recognize their targets by full or partial sequence complementarity. The level of complementarity between the piRNA and its target is directly correlated to the amplitude of resulting 22G-RNA biogenesis: the higher the degree of complementarity, the more 22G-RNAs are produced [62].
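The notion of a "degree of complementarity" can be made concrete with a small Python sketch. It only scores a guide against a candidate site and does not model 22G-RNA output. The scoring rule (fraction of Watson-Crick paired positions, with no wobble pairs or gaps), the function name and the sequences are assumptions made for illustration.

```python
# Toy score for the degree of complementarity between a small RNA guide and a
# candidate target site. Assumptions (not from the thesis): only Watson-Crick
# pairs count, no G:U wobble pairs, no gaps, and equal-length sequences.

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def complementarity(guide: str, target_site: str) -> float:
    """Fraction of guide positions paired with the target site.

    Both sequences are written 5'->3'; pairing is antiparallel, so the guide
    is compared against the target site read in reverse.
    """
    guide, target_site = guide.upper(), target_site.upper()
    if not guide or len(guide) != len(target_site):
        raise ValueError("guide and target site must be non-empty and of equal length")
    paired = sum((g, t) in WATSON_CRICK for g, t in zip(guide, reversed(target_site)))
    return paired / len(guide)


# Hypothetical 4 nt examples: a fully complementary site and a site with one mismatch.
print(complementarity("UAGC", "GCUA"))
print(complementarity("UAGC", "GCUU"))
```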

1.4.3. Endo-siRNAs

Endo-siRNAs are anti-sense to protein-coding genes and are generated by one of four members of the RNA-dependent RNA polymerases (RdRPs) [39,37]. C. elegans has two distinct classes of endogenous siRNAs, the 26G-RNAs and the 22G-RNAs. They differ in length, biogenesis, and function. Within the 26G-RNA and 22G-RNA groups, further distinctions are made based on the particular small RNAs with which the AGOs associate.

1.4.3.1. 26G-RNAs

26G-RNAs play important roles in germline development. There are two classes of 26G-RNAs: Class I, which is primarily found in sperm and bound by the AGOs ALG-3 and ALG-4, and Class II, which is predominantly present in embryos and oocytes and bound by ERGO-1. Both pathways have been shown to be responsible for the silencing of their targets. Hence, 26G-RNAs are thought to regulate mRNA expression during spermatogenesis as well as in the developing zygote due to their maternal deposition [64,65]. As their name suggests, 26G-RNAs are 26 nucleotides in length and have a 5' guanine bias. They are derived from endogenous transcripts which are converted to long dsRNA by the RNA-dependent RNA polymerase RRF-3 [66]. These long dsRNAs are converted into 26-nt species via the activity of DCR-1; thus they have a 5' mono-phosphorylated residue. Similar to piRNAs, the ERGO-1 class of 26G-RNAs are also 2'-O-methylated at their 3' ends by HENN-1, whereas the ALG-3 and ALG-4 bound 26G-RNAs are not modified [67]. 26G-RNA recognition of transcripts occurs by full sequence complementarity and leads to the synthesis of additional 22G-RNAs against that transcript by the RdRPs RRF-1 or EGO-1 [69].

1.4.3.2. 22G-RNAs

22G-RNAs play a crucial role in genome surveillance. They are 22 nt in length and have a bias for a 5' guanine. Dicer is not required for the biogenesis of 22G-RNAs, thus they possess a 5' tri-phosphorylated residue. After their discrete synthesis by the RdRPs, EGO-1 or RRF-1, the 22G-RNAs are loaded into one of the WAGOs [69]. In addition to the RdRPs, 22G-RNA biogenesis is highly dependent on the Dicer-related helicase DRH-3 and the dual Tudor-domain-containing protein, EKL-1, which form a complex with EGO-1 or RRF-1 [70]. The 22G-RNAs target protein-coding genes, transposable elements, poorly annotated transcripts, and repetitive sequences, and are most abundant in the germline. Thus, this class of small RNAs plays essential roles in germline development, specification and maintenance of the germline genome [74,75,115]. There are two distinct classes of 22G-RNAs: the WAGO-bound 22G-RNAs, which are generally synthesized by RRF-1 and EGO-1, and the CSR-1 bound 22G-RNAs, which are only synthesized by EGO-1 [69,71].
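The naming conventions described above (21U-RNAs, 22G-RNAs and 26G-RNAs) can be expressed as a simple length and 5' nucleotide rule. The Python sketch below is a heuristic illustration only and is not the pipeline used in this thesis. The function names are hypothetical, and because the 5' nucleotide preference is a bias rather than a strict rule, and miRNAs of 21-23 nt can share these features, real classification would also rely on genome annotation.

```python
# Heuristic binning of small RNA reads by length and 5' nucleotide, following
# the naming conventions of the 21U-, 22G- and 26G-RNA classes.
# Assumption (not from the thesis): length plus first nucleotide is used as the
# only criterion; annotation-based filtering (e.g. of miRNAs) is omitted.

from collections import Counter


def classify_small_rna(read: str) -> str:
    """Assign a read to a small RNA class from its length and first nucleotide."""
    seq = read.upper().replace("T", "U")  # accept DNA-alphabet sequencing reads
    length, first = len(seq), seq[:1]
    if length == 21 and first == "U":
        return "21U-RNA"
    if length == 22 and first == "G":
        return "22G-RNA"
    if length == 26 and first == "G":
        return "26G-RNA"
    return "other"


def tally_classes(reads) -> Counter:
    """Count how many reads fall into each class."""
    return Counter(classify_small_rna(read) for read in reads)


# Made-up reads for illustration.
toy_reads = ["G" + "A" * 21, "T" + "A" * 20, "G" + "A" * 25, "A" * 22]
print(tally_classes(toy_reads))
```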

1.4.3.2.1. CSR-1 22G-RNAs

CSR-1 is the only singly essential AGO in C. elegans. The CSR-1 small RNA pathway is crucial for proper chromosome organization and germline development, as loss of csr-1 leads to defects in oocyte and embryo chromosome morphology, abnormalities in P granule (germ granule) morphology, embryonic lethality (in the few embryos that are produced) and sterility (the more common phenotype) [71,72,73]. CSR-1-bound 22G-RNAs are antisense to protein-coding genes expressed predominantly in the germline, and encompass nearly 5000 genes (20% of the transcriptome) [71]. Unlike other C. elegans AGOs studied thus far, CSR-1 promotes the transcription of its targets rather than silencing them. This is thought to be a mechanism to protect germline genes from silencing by the piRNA pathway and a means to differentiate self from non-self transcripts in the germline [74,75,116].

1.4.3.2.2. WAGO 22G-RNAs

Although the WAGO-bound 22G-RNAs also function to survey the genome, they target fewer protein-coding genes than CSR-1 (approximately 1800 protein-coding genes are the targets of each of the WAGO-1 and WAGO-9/HRDE-1 pathways) [69,55]. Rather, they tend to target transposable elements, cryptic loci and transposons. WAGO 22G-RNAs have been observed to silence their targets, mainly in the germline, although some of them also function in the soma. The two best-studied WAGOs that are known to function in germline genome protection are WAGO-1 and WAGO-9/HRDE-1. Many of the small RNAs that are loaded into WAGO-1 and WAGO-9/HRDE-1 are generated by the RdRPs, RRF-1 and EGO-1, in response to PRG-1-mediated initiation of silencing [62,37,76]. They therefore function to amplify and stabilize the silencing established by the piRNA pathway. In particular, WAGO-9/HRDE-1 has been shown to be crucial for maintaining proper germline development and fertility [77]. WAGO-9/HRDE-1 and CSR-1 are thought to act in an antagonistic manner to promote proper germline gene expression and protect the germline genome from deleterious nucleic acids. The net effect of these pathways, which are in a delicate balance, is to promote proper germline development and fertility.

1.5. Small RNAs in the cytoplasm:

Post-transcriptional gene silencing (PTGS) by small RNAs in the cytoplasm encompasses three main mechanisms: mRNA degradation, mRNA decay and translational inhibition [78,79,80,81,82,83].

One of the best-known examples of small RNA-mediated gene silencing is siRNA-mediated mRNA degradation during exogenous RNA interference (RNAi). AGO is the catalytic core of the RISC complex, which is conserved across eukaryotes and functions in a similar capacity in these organisms. When exogenous dsRNA is added to cells or organisms such as C. elegans to elicit experimental gene silencing, siRNAs loaded into RISC guide the recognition and endonucleolytic destruction of the target mRNA via the Slicer activity of the AGO [78,79,80].

Silencing in the cytoplasm can also be achieved by affecting mRNA stability without direct cleavage of the mRNA by the AGO. In fact, it has been shown that miRNA-associated AGOs can destabilize mRNAs by recruiting the deadenylase and decapping complexes, CCR4-NOT and DCP1-DCP2 respectively, with the help of GW182. As a result, deadenylation-dependent decapping occurs and the decapped mRNA is degraded by 5'->3' exonucleases, such as XRN-1 [27,19,81].

In the case of translational inhibition, protein levels are reduced due to miRNA-dependent inhibition of translation initiation. One way that this can occur is if the AGO binds to the 5' m7G cap of the mRNA and inhibits translation initiation by impeding the recruitment of 80S ribosomes to the mRNA. This also conceals the cap such that binding of the initiation factor eIF4E cannot occur [82,83]. There is also evidence to suggest that miRNAs are capable of inhibiting translation in a cap-independent manner. In this case, RISC may recruit the inhibitory protein eIF6, which inhibits ribosomal subunit binding [84].

1.6. Small RNAs in the nucleus:

With the advent of high throughput sequencing tools to investigate changes in genome organization and transcriptome content, the molecular mechanisms of nuclear small RNAs are emerging. Initially, the field focused primarily on small RNA-mediated changes to chromatin (and their subsequent impact on transcript levels). However, recent data, including my own, point to a role of small RNAs in many other nuclear processes ranging from transcription to splicing and nuclear export. Here, I summarize what is currently known about the roles of small RNAs in transcription, splicing and nuclear export.

1.6.1. Small RNAs and transcription

Small RNA pathways play important roles in transcriptional gene silencing (TGS) in the nucleus. First studied in plants and fungi, and later in metazoans, nuclear small RNA-mediated silencing commonly occurs in the germline of animals as a means to protect the genome from foreign and deleterious nucleic acid sequences such as viruses and transposon-derived sequences [85]. In plants and fungi, nuclear silencing pathways generally utilize siRNAs, while in most animal germlines, piRNAs serve in this capacity [86,87]. In C. elegans, an additional level of nuclear regulation is mediated by the 22G-RNAs. When they act in the germline, nuclear small RNA pathways elicit epigenetic changes that can be inherited trans-generationally [88].

Small RNA and AGO-mediated gene silencing in the nucleus often results in the generation of transcriptionally silent heterochromatic regions at small RNA target loci throughout the genome. AGO/small RNA effector complexes are recruited to nascent transcripts where they interact with and regulate the transcriptional machinery [85]. Due to the necessity of active transcription of a locus for it to be silenced, the process is co-transcriptional in nature. While TGS is a process that exists in many organisms, the mechanisms by which it occurs are not entirely conserved. Here I provide a brief overview of TGS in some of the model organisms where it has been observed.

1.6.1.1. S. pombe

Gene silencing by TGS in S. pombe occurs by the production of heterochromatin at pericentromeric regions. AGO1 is targeted to nascent transcripts and interacts with Dicer, DCR1. This complex then recruits the RNA-dependent RNA polymerase complex (RdRC) to amplify the siRNA response by generating more siRNAs against the transcript [89,90]. These siRNAs are then loaded into AGO1, the catalytic component of the RNA-induced transcriptional silencing complex (RITS) [91,92]. The siRNA then guides the RITS to the nascent transcript and recruits the cryptic loci regulator complex (CLRC). This complex leads to the deposition of H3K9me at these loci, forming heterochromatin that is essential for proper centromere function, and resulting in the release of RNA POLII via an unknown mechanism to attenuate transcription [93,94].

1.6.1.2. A. thaliana

In A. thaliana, silencing occurs by RNA-directed DNA methylation at genomic loci [95,96]. 24nt siRNAs are generated with the help of an RNA-dependent RNA polymerase, RDR2 [97]. RDR2 uses transcripts generated by RNA polymerase IV as templates to generate double-stranded RNA transcripts. These dsRNAs are then processed by DICER-LIKE 3 (DCL3) into 24nt siRNAs [98]. In the cytoplasm, siRNAs are loaded into AGO4, which then returns to the nucleus to identify nascent transcripts [99,100]. This in turn leads to the co-transcriptional deposition of repressive cytosine methylation, in a process referred to as RNA-directed DNA methylation (RdDM), which facilitates further heterochromatin formation by histone modification and results in gene silencing [101,102].

1.6.1.3. D. melanogaster

The Drosophila piRNA pathway is perhaps the best characterized of the germline nuclear piRNA pathways. piRNA silencing in Drosophila is mediated by the two germline PIWIs, Piwi and Aubergine (Aub), together with Argonaute3 (Ago3) [103,104,105]. It has become clear that the PIWIs and Ago3 each exclusively bind a unique subset of piRNAs; the PIWIs Aub and Piwi associate with antisense piRNAs, whereas Ago3 associates with sense piRNAs [106,107]. Piwi acts in the somatic follicle cells surrounding the oocyte to silence transposons. Here, a large proportion of the piRNAs are antisense piRNAs transcribed from the flamenco locus [109,110]. In nurse cells and ovaries, piRNAs function to silence a vast number of transposable elements, which are targeted by Ago3 and Aub. The piRNAs involved in this process are generated from a wide range of genomic piRNA clusters and mRNA transcripts of active transposons [107,110]. They are loaded into Aub and Ago3 and enter an amplification loop, called the ping-pong cycle. Amplification occurs as follows: upon recognition of an mRNA target, the AGO (Aub or Ago3) cleaves the transcript, generating piRNAs that are loaded into the reciprocal AGO (Ago3 or Aub) [108]. This process leads to the amplification of the piRNA signal and an increase in piRNA abundance in the germline. Upon target recognition, Ago3 and Aub function to degrade their targets and direct the accumulation of H3K9me2 by associating with the chromatin factor HP1a [111,112]. In order to prevent total silencing of piRNA clusters by AGO and HP1a activity, the chromatin factor Rhino localizes to piRNA clusters, both as a marker for the clusters and to promote their transcription [113].

1.6.1.4. C. elegans

As mentioned previously, C. elegans small RNA pathways are diverse and complex. Three C. elegans AGOs have been shown to mediate nuclear RNAi in a co-transcriptional manner: the silencing WAGOs WAGO-12/NRDE-3 and WAGO-9/HRDE-1, and the activating WAGO CSR-1 [114,115,116,71,74,75].

The WAGOs WAGO-12/NRDE-3 and WAGO-9/HRDE-1 are the sole known modulators of nuclear silencing in the worm thus far and act in analogous manners, with WAGO-12/NRDE-3 acting in somatic nuclei and WAGO-9/HRDE-1 in germline nuclei. Both are triggered by 22G-RNA binding to enter the nucleus, where they associate with nascent pre-mRNA targets. They then recruit the cofactors NRDE-1, NRDE-2 and NRDE-4, which aid in the deposition of the repressive H3K9me2 chromatin mark and the inhibition of POLII elongation. Genetic interactions have been observed between WAGO-9/HRDE-1 and the H3K9 methyltransferase SET-25, suggesting that SET-25 is responsible for the WAGO-9/HRDE-1-mediated deposition of H3K9me2 [114,115,116,55]. WAGO-12/NRDE-3 and WAGO-9/HRDE-1 are both responsible for the trans-generational inheritance of target silencing initiated by exo-siRNAs in their respective tissues and have therefore been implicated in trans-generational epigenetic inheritance. It is important to note that WAGO-12/NRDE-3 maintains silencing over a single generation only, whereas WAGO-9/HRDE-1 silencing is multigenerational [117,118].

The germline WAGO CSR-1 functions to counteract silencing by the piRNA pathway and its downstream WAGO-9/HRDE-1 22G-RNA pathway. It plays an activating, rather than silencing, role in gene expression [74,75]. CSR-1 is recruited to nascent transcripts by 22G-RNAs and leads to the activation of transcription at its target loci. Although the mechanisms of CSR-1 action have not yet been fully elucidated, CSR-1 appears to act in a co-transcriptional manner, as can be inferred from its RNA-dependent interaction with RNA polymerase II [74]. Unpublished data from our lab point to a role for CSR-1 in affecting chromatin architecture, as CSR-1 has an effect on the genome-wide accumulation of several histone modifications, including the euchromatin-associated modifications H3K36me3 and H3K4me2. Loss of csr-1 leads to decreases in the levels of these histone marks at CSR-1 target genes as well as a global decrease in the transcription levels of its target mRNAs, while other portions of the genome acquire these modifications aberrantly and are transcribed at abnormally high levels. These data, in conjunction with the fact that CSR-1 interacts with the methyltransferases MES-4 and MET-1, suggest that CSR-1 may be functioning to recruit these histone modifiers to its target transcripts (Christopher Wedeles, Claycomb lab, unpublished data). Taken together, these data have led us to a model in which CSR-1 associates with nascent transcripts through full or partial sequence complementarity with its small RNA binding partner. Upon target mRNA binding, CSR-1 then recruits histone-modifying enzymes to its target genomic loci, where they deposit euchromatic histone marks to further promote transcription of these loci. Because CSR-1 targets almost 5000 germline genes and is enriched in the germline, its key function appears to be promoting proper germline gene expression and development.

1.6.2. Small RNA pathways and splicing

The necessity for the coupling of small RNA pathways and transcription is clear: in order to be regulated by small RNAs, a locus must first be transcribed. However, there are additional functional implications for this relationship that must be considered, as transcription is but the first step in the expression of a transcript. For instance, splicing is one key co-transcriptional step in the maturation of a pre-mRNA into an mRNA that is competent to be exported from the nucleus. Splicing involves the ATP-dependent excision of intronic sequences from the pre-mRNA and the joining of the flanking exons [119,120]. This two-step process is catalyzed by a large complex of ribonucleoproteins, called the spliceosome. The spliceosome is a dynamic multi-protein complex, composed of at least 145 factors [121]. The composition of splicing factors within the spliceosome changes over the course of the splicing reaction, with several rearrangements, associations and dissociations taking place during the process.

At the core of the spliceosome lie five uridine-rich small nuclear ribonucleoproteins (snRNPs), the U1, U2, U4, U5 and U6 RNA-protein complexes [121,122]. The U snRNPs are responsible for pre-mRNA binding of the spliceosome as well as the catalysis of the two-step splicing reaction. Within each snRNP is a common set of seven Sm proteins which form a ring-like structure around the RNA. Sm proteins aid in assembling the U snRNPs with other spliceosome adaptor proteins [123,124,125,126].

Several recent studies have pointed to the fact that many splicing factors may be involved in RNA-mediated gene regulation in the nucleus (Table 1). Phylogenetic analyses, which searched for proteins that have similar conservation profiles to small RNA pathway components, identified splicing factors as possible components of RNAi machinery in a wide range of organisms [134]. More intriguingly, in several organisms, mutations in splicing factors have been shown to have deleterious effects on endogenous small RNA pathways. For example, it has been reported that in S. pombe, defects in splicing factors, but not splicing itself, affect the biogenesis of siRNAs. Mutants of the essential splicing factors Cwf10 and Prp39 display decreased silencing at centromeric regions [127]. Cwf10 is homologous to the Saccharomyces cerevisiae U5 small nuclear ribonucleoprotein Snu114 [128]. Prp39 is associated with U1 snRNA and is required for commitment to pre-mRNA splicing [129].

In these mutants, decreased centromere silencing was found to be associated with a reduction in centromeric siRNA accumulation. It is important to note that although siRNA accumulation is impaired in these mutants, splicing is not greatly affected by the mutations, thus implying that the defects in silencing are likely unrelated to the roles of Cwf10 and Prp39 as splicing factors. Interestingly, Bayne et al. also showed that many of the splicing factors that had been implicated in RNAi, including Cwf10, Prp10, Prp5, and Prp12, co-immunoprecipitated with the RNA-directed RNA polymerase in S. pombe, known as Cid12, further implicating them in small RNA pathway activity. Other splicing factors, such as Prp5, Prp10, Prp12, Prp8 and Cwf11, were also observed to interact with the RdRP Cid12 by IP/mass spectrometry. Their loss also leads to defective silencing by siRNAs, potentially implicating them in RNAi pathways in S. pombe [127].

Studies in A. thaliana have also identified splicing factors such as ESP3, a homolog of yeast Prp2, as components of plant small RNA pathways. Herr et al. showed that loss of esp3 leads to enhanced silencing during flowering and prevents transcripts from entering endogenous silencing pathways. This occurs due to improper RNA processing of aberrant intron-containing transcripts [130]. Unlike in other organisms, where the role of splicing factors in RNAi appears to be unrelated to splicing, splicing activity of ESP3 has a direct effect on small RNA biogenesis and subsequent silencing by the endogenous RNAi machinery.

Genome-wide screens in C. elegans as well as phylogenetic studies show that some splicing factors are universally required for RNA-mediated gene silencing. Using an engineered RNAi sensor screen, Kim et al. identified several factors involved in C. elegans RNAi pathways, including the conserved spliceosomal protein RNP-2 [131]. Moreover, a similar RNAi screen in C. elegans identified several RNA processing genes, including a previously uncharacterized gene, F32B6.3, a homolog of human HPRP18, which interacts with U5 snRNP [132,133].

Recently, studies in Drosophila have implicated the splicing factor SmD1 in RNAi, independently of its splicing activity [134,135]. SmD1 is one of the seven Sm proteins, which form the heptameric ring structure that accommodates the U snRNPs and constitutes the core of the spliceosome. In their studies, Xiong et al. showed that SmD1 is required for RNAi in vivo and that it interacts with components of the siRNA biogenesis machinery such as R2D2, and one Drosophila homolog of Dicer, DCR-2. Moreover, reduced levels of the splicing factor were found to lead to reduced levels of siRNAs as well as impaired siRISC assembly due to reduced AGO2 slicer activity. Upon examination of other Sm proteins, it was determined that loss of SmE also leads to defects in RNAi. It is important to note that although SmE and SmD1 appear to have crucial roles in the RNAi pathway, mutants do not display any significant defects in splicing. Interestingly, other Sm proteins such as SmF have been shown to play no significant role in the RNAi pathway but appear to be essential for splicing [135].

Although links between splicing factors and small RNA pathways have been made in many organisms, the mechanisms by which splicing factors function to elicit gene regulation via small RNAs have yet to be elucidated. Splicing-related factors that have been implicated in small RNA pathways are summarized in Table 1.

Table 1. Splicing factors implicated in RNAi and their homologs.

| H. sapiens | D. melanogaster | C. elegans | S. cerevisiae | S. pombe | A. thaliana | Interaction with RNAi machinery observed in | Direct effect on siRNA-mediated silencing observed? | Ref. |
|---|---|---|---|---|---|---|---|---|
| EFTUD2 | CG4849 | EFTU-2 | Snu114 | Cwf10 | MEE5 | S. pombe | Yes | 127 |
| AQR | CG31368 | EMB-4 | Sen1 | Cwf11 | EMB2765 | S. pombe | Yes | 127 |
| BRR2 | I(3)72Ab | SNRP-200 | Brr2 | Brr2 | EMB1507 | S. pombe | Not Tested | 127 |
| CDC5 | CG6905 | D1081.8 | Cef1 | Cdc5 | CDC5 | S. pombe | Not Tested | 127 |
| PLRG1 | Tango4 | PLRG-1 | Prp46 | Prp5 | PRL1 | S. pombe | Yes | 127 |
| SYF1 | CG6197 | C50F2.3 | Syf1 | Cwf3 | AT5G28740 | S. pombe | Not Tested | 127 |
| PRPF45 | Bx42 | SKP-1 | Prp45 | Prp45 | SKIP | S. pombe | Not Tested | 127 |
| SYF3 | CRN | M03F8.3 | Syf3 | Cwf4 | AT5G41770 | S. pombe | Not Tested | 127 |
| PRPF17 | CG6015 | PRP-17 | Prp17 | Prp17 | AT1G10580 | S. pombe | Not Tested | 127 |
| SPF38 | CG3436 | F08G12.2/C18E3.5 | - | Spf38 | AT2G43770 | S. pombe | Not Tested | 127 |
| SPF27 | CG4980 | T12A2.7 | - | Cwf7 | MOS4 | S. pombe | Not Tested | 127 |
| ECM2 | - | - | Ecm2 | Cwf5 | - | S. pombe | Not Tested | 127 |
| SYF2 | CG12343 | K04G7.11 | Syf2 | Syf2 | AT2G16860 | S. pombe | Not Tested | 127 |
| CWF19 | CG7741 | F17A9.2 | DRN1 | Cwf19 | AT5g56900 | S. pombe | Not Tested | 127 |
| PRPF3 | Prp3 | PRP-3 | Prp3 | Prp3 | AT1G28060 | S. pombe | Not Tested | 127 |
| ISY1 | CG9667 | F53B7.3 | Isy1 | Cwf12 | - | S. pombe | Not Tested | 127 |
| U2-A’ | U2af38 | UAF-2 | Lea1 | Lea1 | U2AF35B | S. pombe | Not Tested | 127 |
| SF3B3 | CG13900 | TEG-4 | Rse1 | Prp12 | AT3G55200 | S. pombe | Yes | 127 |
| PRPF22 | Pea | MOG-5 | Prp22/Dhr2 | Prp22 | ESP3 | S. pombe, A. thaliana | Not Tested | 127, 130 |
| SRRM2 | CG7971 | RSR-2 | Cwc21 | Cwf21 | - | S. pombe | Not Tested | 127 |
| PRPF43 | CG11107 | F56D2.6 | Prp43 | Prp43 | AT3G62310 | S. pombe | Not Tested | 127 |
| SF3B1 | CG2807 | T08A11.2 | Hsh155 | Prp10 | AT5G64270 | S. pombe | Yes | 127 |
| BUD31 | I(1)10Bb | C07A9.2 | Bud31 | Smb-1 | AT4G21110 | S. pombe | Not Tested | 127 |
| PRPF39 | CG1646 | F25B4.5 | Prp39 | Prp39 | PRP39 | S. pombe | Not Tested | 127 |
| SNRPD1 | SmD1 | SNR-3 | Smd1 | Smd1 | AT4G02840 | D. melanogaster | Yes | 134, 135 |
| SNRPE | SmE | SNR-6 | Sme1 | - | - | D. melanogaster | Yes | 135 |
| SNRPA | snf | RNP-2 | Msl1 | Usp1-2 | U1A | C. elegans | Not Tested | 131 |
| HPRP18 | Prp18 | F32B6.3 | Prp18 | - | AT1G-3140 | D. melanogaster | Not Tested | 133 |

1.6.3. Small RNA pathways and nuclear export

The link between small RNA pathways and nuclear export is rather novel and highly dependent on the Exon Junction Complex (EJC). The EJC is another large multi-protein complex that is deposited by the spliceosome onto the mRNA in a sequence-independent manner, precisely 24nt upstream of newly spliced exon junctions [137]. The core of the EJC consists of four key proteins: eIF4A3 (eukaryotic initiation factor 4A3), MAGOH, Y14 (also known as RNA-binding motif 8A) and MLN51 (metastatic lymph node 51; also known as CASC3) [138,139,140,141]. The EJC plays multiple roles as a beacon that identifies transcripts as spliced, a chaperone to escort transcripts out of the nucleus, and a recruiter of factors involved in post-transcriptional processing of the mRNA. Notably, the EJC remains bound to the mature mRNA at that site until it is disassembled during the first round of translation [142,143,144].

The EJC also serves as an intermediary that links the splicing and nuclear export processes, as EJC and pre-EJC assembly is highly reliant on structural changes in the spliceosome and splicing machinery [142,145]. MAGOH, Y14 and eIF4A3 are already present on an unspliced transcript as members of the spliceosome. After release of five spliceosomal small nuclear RNPs (snRNPs) and exon ligation, the pre-EJC is joined by MLN51 for stabilization. The association of MLN51 with the mRNA occurs in synchrony with the dissociation of the transcript from the spliceosome [146,147,148,149]. The intron-binding complex (IBC) also plays a significant role in EJC assembly, further linking the splicing process with nuclear export and post-transcriptional processing. Core members of the IBC, most notably the RNA helicase AQUARIUS (AQR), have been found to be necessary for the loading of the EJC on the mRNA [150,151,152]. Following assembly of the EJC core, peripheral factors that aid in subsequent post-transcriptional processes are recruited. Many of these peripheral factors will go on to form the TREX complex. These include the mRNA export factors UAP56 (also known as DDX39B) and Aly/REF export factor (ALYREF; also known as THOC4) [145,153,154].

The TREX complex (transcription/export) actively facilitates the transport of mRNAs from the nucleus to the cytoplasm. Many members of the TREX complex are highly conserved adaptor proteins that bind to the mRNA early on during the transcription process as members of the spliceosome. These include UAP56 and ALY/REF [154]. UAP56 is a conserved protein that functions both as a splicing factor and as an important component of the TREX complex. UAP56 remains associated with the mRNA after splicing by association with the TREX subunits ALY/REF and THO. Together, UAP56, THO and ALY/REF form the core of the TREX complex, which functions with nuclear export factor 1 (NXF1) to shuttle transcripts out of the nucleus through the nuclear pore [155,156].

Recently, several pieces of evidence have emerged to support a role for the EJC and TREX in small RNA pathways in several organisms. In D. melanogaster, UAP56 has been implicated in the piRNA pathway, as loss of UAP56 disrupts the localization of the Piwi proteins Aub and Ago3 to the perinuclear nuage in the germline, which is a site of AGO and small RNA biogenesis and activity. As a result, piRNA production is greatly reduced in the uap56 mutant, leading to increased transposon activity [157].

Arabidopsis THO2, the largest protein subunit of the TREX complex, has also been shown to impact RNAi in the plant. Loss of tho2 leads to reduced siRNAs and miRNAs and, subsequently, to loss of gene silencing [158]. One possible model for the decrease in miRNAs in tho2 mutants is that THO2 binds miRNA precursors and recruits them to the proper processing complexes for maturation. Another component of the Arabidopsis THO complex, EMU, was identified as a component of the miRNA pathway by genetic interaction with Arabidopsis ago1 and the miRNA biogenesis factor hyl1. In addition to enhancing the morphological defects associated with ago1 and hyl1 mutants, loss of emu was shown to result in a depletion of miRNAs [159].

Taken together, these data suggest a possible link between small RNA pathways, splicing, and nuclear export. They provide a solid point from which to enter this new and largely unexplored territory of small RNA biology.

1.6.4. A link between small RNAs and chromatin in C. elegans primordial germ cells

The C. elegans germline is set apart from the somatic lineage very early during development by a series of asymmetric cell divisions. The cell that is destined to give rise to the germline segregates away from the somatic cell lineages into the P, or posterior, lineage. This first primordial germ cell (PGC) persists as a single cell until approximately the 100-cell stage, when it undergoes mitosis and gives rise to two daughter cells, the Z2/Z3 cells. The Z2/Z3 cells eventually proliferate and give rise to a mature germline during the larval stages of development [160]. Proper germline development is highly dependent on the expression and repression of the appropriate genes. Specification of the germline occurs as early as the 16-cell stage by three factors: (i) global transcriptional quiescence mediated by the PIE-1 transcriptional repressor protein, (ii) germline-specific changes in chromatin architecture, and (iii) the association of cytoplasmic RNA- and protein-rich granules called P-granules (germ granules in other organisms) [161,162,163,164].

Transcriptional silencing by PIE-1 is essential for germline identity during the early stages of germ cell specification, as loss of PIE-1 is a characteristic of somatic cells. Thus, PIE-1 persists in the germ cells until the Z2/Z3 stage where PIE-1 is rapidly degraded [161]. PIE-1 degradation is closely linked to changes in global chromatin architecture that also occur at this point in development, namely loss of H3K4me2 and H4K8ac (both associated with transcriptional activity) as well as the acquisition of H3K27me2 (associated with transcriptional repression) [165,166]. Since chromatin in the early PGCs appears to have transcription-inducing marks, it is believed that PIE-1 silencing is introduced to “overwrite” this transcription activation and maintain quiescence. Later on, when the germline is ready to proliferate, H3K4me2/me3 return to the germline genome [166]. At this stage the germline genome is poised for expression of the appropriate germline genes allowing for proper germline development.

P-granules are cytoplasmic aggregates of RNA and protein that are maternally loaded into the oocyte, deposited in the embryo at fertilization, and segregated into the P-lineage from the very first cell division. They remain cytoplasmic for the first few cell divisions, then associate with the nuclear periphery of the PGC and remain perinuclear throughout the remainder of germline development [164,167,168]. P-granules are thought to deliver maternally deposited proteins and RNA to the nascent germline as well as regulate newly synthesized transcripts as they exit the nucleus into the cytoplasm. Many maternally deposited mRNAs that play significant roles in development have been shown to accumulate in P-granules [169]. Additionally, P-granules are home to a wide array of RNA-binding proteins, including the PGL proteins and the GLH family of DEAD box helicases, which are the primary components of the P-granules. Several other proteins that play important roles in RNA processing and translation can be found in P-granules, such as the cap-binding initiation factor IFE-1, Sm proteins and PIE-1. Many of these also affect germ cell development [169,170,171]. It is important to note that P-granules are also home to a number of factors involved in the diverse small RNA pathways of C. elegans. For example, the Piwi PRG-1, the miRNA AGOs ALG-1 and ALG-2, and AGOs associating with the 22G-RNA class, including CSR-1, WAGO-9/HRDE-1, and WAGO-1, are all localized to P-granules [172,71,65]. Moreover, other members of the small RNA biogenesis machinery, including the RdRPs EGO-1 (22G-RNAs) and RRF-3 (26G-RNAs), Dicer (miRNAs, 26G-RNAs) and the Dicer-like helicase DRH-3 (22G-RNAs), associate with P-granules [71]. Hence, P-granules also play an important role in maintaining germline integrity and proper germline gene expression as small RNA factories. This, in addition to the known effect of small RNAs on chromatin, suggests a complex role for small RNA pathways in germ cell development and maintenance.

1.7. The conserved splicing factor EMB-4/AQR:

Due to the unique activity of CSR-1 in licensing or promoting germline gene expression and its apparent role in germline development and maintenance, our lab is interested in understanding how CSR-1 performs its functions. Therefore, when she was a post-doctoral fellow, Dr. Julie Claycomb performed IP/mass spectrometry on CSR-1 to identify binding partners of CSR-1. Using a rabbit immune serum generated against CSR-1, she immunoprecipitated CSR-1 (or performed a negative control IP with pre-immune rabbit serum) and analyzed the associated proteins by SDS-PAGE followed by silver stain. From there, she isolated an approximately 170 kDa protein band that was enriched in the IP using the anti-CSR-1 serum but not in the sample pulled down using the pre-immune serum control. Mass spectrometry revealed that the putative CSR-1 interacting protein present in this band was the conserved splicing factor, EMB-4.

emb-4 was first discovered in a screen for mutations causing embryonic lethality. However, the gene was not characterized extensively following its discovery. Only a handful of studies have examined the phenotypes and possible functions of emb-4 in the worm. The first study, published in 2006, implicates emb-4 in lin-12 (Notch receptor) activity and Notch signaling during cell specification and germline development. This study showed that emb-4 plays a positive role in the activity of LIN-12 downstream of the receptor signal [173]. The second study describes a role for emb-4 in chromatin remodeling in the primordial germ cells during embryogenesis. Checchi and Kelly (2006) observed that emb-4 mutants display severe defects in development, leading to the arrest of most progeny in the later stages of embryogenesis [174]. They observed that, in addition to embryonic lethality, emb-4 mutants displayed defects in germline chromatin resetting at the Z2/Z3 stage of PGC development. As mentioned previously, the emergence of Z2/Z3 marks the division of the PGC into two cells that will eventually give rise to the mature germline. This cell division is accompanied by significant changes in the chromatin landscape within the PGCs, including the genome-wide loss of H3K4me2 and H4K8ac as well as the acquisition of H3K27me2. In the absence of emb-4, this loss of H3K4me2 is significantly delayed or does not occur (Figure 3). Interestingly, emb-4 only affects H3K4me2 loss and not other chromatin remodeling steps such as H4K8ac loss. Additionally, emb-4 appears to affect the role of PIE-1 in the specification of the germline and its effect on transcriptional quiescence, as emb-4 mutants exhibit a delay in the loss of PIE-1 in the PGC. In wild type embryos, PIE-1 is degraded in the PGCs Z2/Z3 immediately after their birth; however, it persists up until the 100- to 150-cell stage in emb-4 mutants.

[Figure 3 image: MERGE, DAPI and H3K4me2 panels for wild type and emb-4 mutant embryos at the 200 cell stage]

Figure 3. Germline chromatin resetting phenotype in emb-4 mutants [174]. emb-4 mutants were shifted to 25°C overnight, dissected and stained for H3K4me2 (green), DAPI (red) and PGL-1 (blue). By the Z2/Z3 stage H3K4me2 is lost in wild type embryos, but retained in emb-4 mutants.

In their experiments, Checchi and Kelly used a null allele of emb-4, emb-4(hc60), that appears to be temperature sensitive but is poorly described at the molecular level. The temperature-sensitive nature of this allele is not due to protein instability (as the allele is a presumptive null allele); rather, the biological process in which EMB-4 participates appears to be temperature sensitive itself. Thus, the severity of emb-4 phenotypes is reduced but not fully abrogated at the lower “permissive” temperatures. In other words, these data suggest that emb-4 is not absolutely essential for normal development but that its activity is required for some temperature-sensitive processes that are involved in development. The number of escapers (embryos that survive to adulthood) is greater at permissive temperatures, although they exhibit stark morphological abnormalities. These results suggest that emb-4 is required for a number of processes involved in the development of the PGCs as well as somatic cells.

Most recently, Shiimori and Sakamoto (2013) uncovered a possible role for EMB-4 as a member of the pre-Exon Junction Complex (pre-EJC) [175]. They show that EMB-4 immunoprecipitates with Y14, one of the three core subunits of the EJC, in an RNA-independent manner. Furthermore, EMB-4 was associated with intronic sequences, as we would expect for a member of the IBC. This suggests that EMB-4 is required for the recruitment of EJC components to the spliceosome, and implicates EMB-4 as an important factor in the splicing-dependent recruitment of mRNA export-related proteins.

1.7.1. Homologs of EMB-4

EMB-4 is a highly conserved protein and has homologs in several other model organisms, which have been studied to varying extents. What we know about EMB-4 is mainly derived from studies regarding the EMB-4 homolog in S. pombe (Cwf11p) and from in vitro experiments on its human homolog Aquarius/Intron-binding protein 160 (AQR/IBP160), with which EMB-4 shares a high degree of sequence identity and similarity. Remarkably, the crystal structure of AQR has also recently been solved, providing additional insights as to the activity of AQR.

The splicing factor Cwf11 has been shown to immunoprecipitate with a host of other factors involved in pre-mRNA splicing. As its loss is not lethal, it is thought to be a non-essential component of the S. pombe spliceosome [177]. Interestingly, Cwf11 has been implicated in the RNAi pathway in S. pombe, as its loss compromises maintenance of heterochromatin by the RNAi pathway. It has also been shown to interact with the RdRP Cid12 [176,127]. However, its mechanism of action in the context of RNAi has yet to be elucidated.

As described above, AQR is a non-sequence-dependent intron binding protein that helps to recruit the EJC to introns. EJC binding and integrity are important factors that contribute to directing transcripts into the nuclear export pathway rather than into the nonsense-mediated decay (NMD) pathway. AQR has been shown to be a core component of the human splicing machinery, as it directly interacts with a host of known splicing factors such as hSyf1 (also called Xab2), CCDC16, hIsy1 and CypE, which are also linked to the U2 snRNP. AQR is recruited to the spliceosome as a member of the intron-binding complex (IBC) [151,152].

AQR is an RNA helicase. Owing to its DEAD box domain, AQR possesses 3'-to-5' unwinding activity, as it is capable of unwinding single-stranded RNA (ssRNA), as well as RNA duplexes with a 3' overhang. This activity is dependent on ATP hydrolysis, which is aided by its ATP-binding UvrD domain. AQR activity has been shown to be non-essential to splicing in vitro, but depletion of AQR by RNAi leads to moderately reduced splicing efficiency in cell culture. These data are consistent with the role of AQR as an intron-binding protein and splicing factor. Taken together, these data suggest a role for AQR in splicing as well as other RNA-processing events such as EJC assembly [151,152]. However, AQR has not been previously linked to small RNA pathways.

1.8. Thesis Rationale:

Clearly, EMB-4/AQR is an interesting and integral part of the life cycle of an mRNA. EMB-4 has already been implicated in germline development due to its effect on chromatin. Its homolog, AQR, has been implicated in other RNA-processing events. This, in conjunction with the growing evidence linking splicing factors and small RNA pathways, makes EMB-4 an intriguing candidate for examination in the context of small RNA pathways. In this thesis, I investigate the role of EMB-4 in small RNA pathways in the worm. Using an integrated approach, I test whether EMB-4 is a component of small RNA pathways and examine how they are affected by its loss.

2. Materials and Methods

2.1. Nematode strains, culture, brood counts, and RNAi:

All C. elegans strains were derived from the Bristol N2 strain (wild-type) and cultured as described previously [178]. Worms were grown at 20°C on Nematode Growth Medium (NGM) plates inoculated with a lawn of E. coli OP50, unless otherwise stated. A list of strains used is provided in Table 2.

Table 2. List of C. elegans strains used in this study.

Strain Name Text Name Genotype

N2 wild type wild type

MJ60 emb-4 mutant emb-4(hc60) V

WM193 csr-1 mutant csr-1(tm892) IV; neIs19[pie-1::3xflag::csr-1, unc-119(+)]

WM191 wago-9/hrde-1 mutant hrde-1(tm1200)

BA17 fem-1 mutant fem-1(hc17) IV

CB4108 fog-2 mutant fog-2(q71) V

SS104 glp-4 mutant glp-4(bn2) I

2.1.1. Brood size counts

We conducted brood size counts on N2 and emb-4(hc60) mutants reared at 14°C, 20°C and 25°C. Parents were propagated at 20°C and 10-15 individuals from their offspring were shifted to 14°C, 20°C or 25°C at the L1 stage (1 worm/plate), where they remained until they became gravid adults. 24 hours after egg-laying began, the adult hermaphrodite was moved to a new plate and the embryos were counted. The following day, the numbers of larvae and embryos were scored in order to determine how many embryos hatched. This was important as the emb-4(hc60) mutant displays a degree of embryonic lethality. This continued until the adult hermaphrodite was no longer laying eggs. The total number of embryos laid by a single hermaphrodite over the course of egg-laying was determined, as well as the total number of larvae that hatched. Data from all plates were then used to generate box and whisker plots and compare total brood size and viable brood size for each genotype at each temperature.
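The per-animal bookkeeping described above can be sketched as follows. This is a minimal illustration only, not the analysis code used in this study: the `PlateCount` record, the function name, and the example numbers are all hypothetical. It assumes that total brood is the sum of embryos laid across daily plates, that viable brood is the sum of hatched larvae, and that embryonic lethality is the unhatched fraction.

```python
from dataclasses import dataclass


@dataclass
class PlateCount:
    """Counts from one daily plate for a single hermaphrodite (hypothetical record)."""
    embryos_laid: int    # embryos counted after the adult was moved off the plate
    larvae_hatched: int  # larvae scored on the same plate the following day


def summarize_brood(plates):
    """Return total brood, viable brood and percent embryonic lethality for one animal."""
    total = sum(p.embryos_laid for p in plates)
    viable = sum(p.larvae_hatched for p in plates)
    # Lethality is the fraction of laid embryos that never hatched.
    lethality = 100.0 * (total - viable) / total if total else 0.0
    return {
        "total_brood": total,
        "viable_brood": viable,
        "percent_embryonic_lethality": lethality,
    }


# Made-up counts for one animal followed over three days of egg-laying.
example_plates = [PlateCount(80, 60), PlateCount(100, 75), PlateCount(20, 15)]
summary = summarize_brood(example_plates)
print(summary)
```

One such summary per animal would then feed the box and whisker plots for each genotype and temperature.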

2.1.2 RNAi feedings

RNAi experiments were carried out as previously described [179]. ddx-19 RNAi feeding clones were obtained from the Ahringer RNAi library. Small-scale (100 ml) overnight cultures of RNAi or empty vector (L4440) bacterial clones were grown in LB supplemented with 50 µg/ml carbenicillin in a 25°C incubator. When cultures reached log growth, they were transferred to TB supplemented with 50 µg/ml carbenicillin and placed in a 37°C incubator with shaking. When they achieved log growth, cultures were induced with 2 mM IPTG, left for 4 h, and then collected. After RNAi or control (L4440) food was plated on large 15 cm plates, approximately 100,000 synchronized L1 worms were added to the plates and left to grow until gravid adulthood. They were then harvested for protein lysate generation.

2.2. EMB-4 antibody generation and purification:

Mouse monoclonal antibodies were generated against twelve different peptides corresponding to the C-terminal domain of the EMB-4 protein by Abmart Inc. (SEAL Package) (Table 3). Nineteen potential antibodies were generated through this process, and were tested for specificity and recognition of EMB-4 by western blotting of worm lysates, performed by Christopher Wedeles. Based on the success of particular antibodies during western blotting, we obtained the cell lines for two of these antibodies, clones 7K3-12 (generated against peptide HFEDMDHEMQ, residues 1437-1446 in EMB-4) and 5M19-8 (generated against peptide ETEHEKKHRE, residues 1409-1418 in EMB-4).

Table 3. List of peptides used to generate anti-EMB-4 monoclonal antibodies

Peptide Start End Sequence

1443 1452 HEMQEPAATA

748 757 DGFDEKEAVP

432 441 NEKPLFPTEK

230 239 SLEPNEAQES

1189 1198 GHGETQPSPH

1426 1435 QEMDDKKEAD

1348 1357 ERSKDGEPME

1409 1418 ETEHEKKHRE

1261 1270 TVDKYQGQQN

1437 1446 HFEDMDHEMQ

1248 1257 DTNPLIGMPA

1323 1332 RIFAKYPRKL
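As a consistency check on Table 3, the coordinates and sequences can be compared programmatically. The sketch below is illustrative only; the function names are hypothetical. It assumes that Start and End are 1-based, inclusive residue positions in EMB-4, so each peptide should span End - Start + 1 residues and any two peptides with overlapping coordinates should share the same residues over the overlap.

```python
# (start, end, sequence) rows copied from Table 3.
PEPTIDES = [
    (1443, 1452, "HEMQEPAATA"),
    (748, 757, "DGFDEKEAVP"),
    (432, 441, "NEKPLFPTEK"),
    (230, 239, "SLEPNEAQES"),
    (1189, 1198, "GHGETQPSPH"),
    (1426, 1435, "QEMDDKKEAD"),
    (1348, 1357, "ERSKDGEPME"),
    (1409, 1418, "ETEHEKKHRE"),
    (1261, 1270, "TVDKYQGQQN"),
    (1437, 1446, "HFEDMDHEMQ"),
    (1248, 1257, "DTNPLIGMPA"),
    (1323, 1332, "RIFAKYPRKL"),
]


def length_mismatches(peptides):
    """Rows whose coordinate span does not equal the sequence length."""
    return [p for p in peptides if p[1] - p[0] + 1 != len(p[2])]


def overlap_conflicts(peptides):
    """Pairs of peptides that overlap in position but disagree in sequence."""
    conflicts = []
    for i, (s1, e1, q1) in enumerate(peptides):
        for s2, e2, q2 in peptides[i + 1:]:
            lo, hi = max(s1, s2), min(e1, e2)
            if lo <= hi:  # coordinate ranges intersect
                if q1[lo - s1:hi - s1 + 1] != q2[lo - s2:hi - s2 + 1]:
                    conflicts.append(((s1, e1, q1), (s2, e2, q2)))
    return conflicts


print(length_mismatches(PEPTIDES))
print(overlap_conflicts(PEPTIDES))
```

Under these assumptions, the only overlapping pair is the peptides starting at 1437 and 1443, which share the residues HEMQ at positions 1443-1446.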

EMB-4 antibodies were purified from the supernatants from the 7K3-12 and 5M19-8 hybridoma cell lines using the Mouse TSC Purification System from Abcam. This kit uses protein A coated agarose resin to capture the antibody from the cell supernatant. The antibody is then eluted from the resin using a low-pH buffer. I eluted each antibody 4 times.

To test the titer and efficacy of each antibody eluate and ensure specificity of the purified fractions of antibody, 40 µg of wild type (N2) and emb-4(hc60) mutant (strain MJ60) total lysate were run on a precast gradient gel (4-12%, Invitrogen) and transferred to Hybond-C membrane (Amersham Biosciences). The membrane was incubated overnight at 4°C with either anti-EMB-4 (1:500) or anti-Tubulin (1:10000). On the following day, the membrane was incubated 1 h at room temperature with HRP-conjugated secondary antibody (Jackson Immunoresearch) diluted 1:1000 in PBST. Membranes were washed 3x10 min in PBST and then visualized by Luminata Forte Western Horse Radish Peroxidase substrate.

2.3. Protein lysate preparation for western blot, silver stain, and immunoprecipitation:

Synchronous populations of animals were grown on NGM plates with OP50 E. coli at a density of approximately 90,000 animals per 15 cm Petri dish, and harvested at specific stages of development. The harvested animals were washed three times with M9 buffer and incubated for 30 minutes in M9 buffer to remove the bacteria from the gut. The incubation was followed by three washes with M9 buffer. A last wash was performed with cold protein lysis buffer (30 mM HEPES-KOH [pH 7.4], 2 mM magnesium acetate, 100 mM potassium acetate) and the pellet was flash frozen in a dry ice and ethanol bath. The frozen pellets were kept at -80ºC.

The frozen pellet was resuspended in ice-cold DROSO complete buffer 1:1 (v/v) containing 2 mM DTT, 0.1% Igepal CA 630 (Fluka), 4x concentration Complete protease inhibitor (Roche) and 1% (v/v) Phosphatase Inhibitors 2 and 3 (Sigma), and homogenized using a stainless steel Dounce homogenizer (Wheaton Incorporated) until worms and embryos were no longer visible. In cases where RNA was to be isolated from immunoprecipitations using the protein lysates, 1% (v/v) SuperaseIN RNAse inhibitor (Invitrogen) was also added. The homogenized extract was clarified by centrifugation at 13,000 x g for 10 min at 4°C. The concentration of the supernatant (total worm protein) was determined by Lowry assay using the Biorad Lowry assay kit and a Nanodrop 1800C spectrophotometer.

For immunoprecipitation or co-immunoprecipitation experiments, each IP was performed from 5 mg protein lysate. Lysate was precleared with 25 µl protein A/G agarose bead slurry (Santa Cruz Biotechnology; beads were equilibrated in DROSO complete buffer prior to use) for one hour at 4°C with rotating. 6.25 µg of CSR-1 antibody, 50 µl of EMB-4 antibody (5M19-8, 80 ng/µl), 5 µg of anti-FLAG antibody (Sigma Aldrich), or buffer alone (no antibody control) was added to each IP sample and incubated for two hours with rotating at 4°C. Immune complexes were recovered using 50 µl of a 50% slurry of Protein-A/G agarose beads (Santa Cruz Biotechnology) and washed 6 x 5 min at 4°C with DROSO buffer. Protein was eluted from beads and denatured by incubation in Thermofisher 2x LDS sample buffer for 10 min at 70°C. Input samples were prepared from the same lysate at a concentration of 2 µg/µl using Thermofisher 2x LDS sample buffer and reducing agent. Samples were then subjected to SDS-PAGE and western blotting or silver stain.

For western blots, proteins were resolved by SDS-PAGE on precast gradient gels (4-12%, Invitrogen) and transferred to Hybond-C membrane (Amersham Biosciences) using a Bio-Rad semi-dry transfer apparatus (25V for 50 min). After washing with PBST (0.1% (v/v) Tween-20 in PBS) and blocking with 5% milk-PBST, the membrane was incubated overnight at 4°C with either purified anti-EMB-4 (5M19-8, 40 µg/µl, diluted 1:200) or anti-alpha-tubulin (Sigma) diluted 1:2000, in 5% milk-PBST (137 mM NaCl, 10 mM Phosphate, 2.7 mM KCl, pH 7.4, and 5% [w/v] dried nonfat milk). The membrane was incubated 1 h at room temperature with HRP-conjugated secondary antibody (Jackson Immunoresearch) diluted 1:1000 in PBST. Membranes were washed 3x10 min in PBST and then visualized by Luminata Forte Western HRP substrate.

For silver stain followed by mass spectrometry, proteins were resolved by SDS-PAGE on precast gradient gels (4-12%, Invitrogen) as for western blots. Proteins were visualized by silver stain (SilverQuest silver staining kit, Invitrogen), as per the kit’s protocol. Bands corresponding to a molecular weight of approximately 170 kDa, 115 kDa and 95 kDa were cut from each IP lane and sent for mass spectrometry to the Taplin Mass Spectrometry Facility at Harvard Medical School, Harvard University.

2.4. RNA preparation for qRT-PCR, mRNA-seq, small RNA-seq, and RIP-seq:

General RNA preparations were performed similarly to protein extraction, with the following modifications. When harvesting worms, the last wash was performed using sterile water, then animals were frozen in three to five pellet volumes of TRI Reagent (MRC, Inc.), flash frozen, and stored at -80ºC. Worms were thawed on ice and homogenized in a glass or metal dounce homogenizer, depending on volume, and total RNA was isolated as described previously [71]. For RT reactions, cDNA was generated from 1 µg C. elegans total RNA using random hexamers with Superscript III Reverse Transcriptase (Invitrogen).

For mRNA and small RNA-seq experiments, N2 and emb-4(hc60) mutant worms were propagated at 14°C then shifted to 20°C or 25°C for one generation and harvested in TriReagent as young adults. Typical total RNA extraction and ethanol precipitation were performed, followed by quantification and crude assessment of RNA quality using the Qubit HS RNA kit (Invitrogen).

1 µg of total RNA was used for small RNA library preparation, using the Truseq Small RNA Library Prep Kit (Illumina), following the protocol provided by the manufacturer. The resulting PCR product DNA was visualized using PAGE (8% gel) and bands of the appropriate size (150 bp) were excised. DNA was then eluted from the excised bands overnight in buffer containing 10 mM Tris-Cl (pH7.5), 1 mM EDTA, and 0.3 M NaCl, then precipitated with 20 µg glycogen as the carrier and 1 volume of isopropanol. The resulting cDNA was resuspended in 30µl ultrapure water and quantified using Qubit HS DNA kit (Invitrogen). All samples were pooled in equal amounts into a 20nM solution and sequenced by TCAG Sequencing Facility at Peter Gilgan Centre for Research and Learning, SickKids Hospital on a HiSeq 2500 Sequencing System (Illumina). For each sample, biological duplicate libraries were prepared.

For mRNA library preparation, 4 µg of total RNA were treated with DNase I (Invitrogen), ethanol precipitated and quantified with the Qubit HS RNA kit (Invitrogen). 1 µg of DNase-treated RNA was used for each of rRNA depletion (Ribo-Zero Gold rRNA Removal Kit, Illumina) and Poly-A selection (Truseq Stranded mRNA Sample Preparation Kit). Library preparation was executed as described by the Truseq Stranded mRNA Sample Preparation Kit manual (Illumina). Samples were quantified by the Qubit HS DNA kit (Invitrogen), pooled and sent for sequencing as described above.

For EMB-4 RNA IPs (RIPs), EMB-4 IPs were performed as described above on gravid adult wild type (N2) worms, using 50 µl anti-EMB-4 (7K3-12, 80 ng/µl). To preserve RNA integrity, 1:100 Superase-In was added to DROSO Complete buffer prior to lysate preparation, as well as to any additional DROSO complete buffer added to each IP. 4x5 mg IPs were set up for each sample (anti-EMB-4 and no antibody control). After pre-clearing, incubation with antibody and collection of immunocomplexes, beads were washed 5x5 min and all four IPs were pooled for each sample prior to the last wash. 10% of the beads were saved for IP validation by western blot. The remaining 90% were resuspended in 4x volume of TriReagent and RNA extraction was performed. The extracted RNA was precipitated overnight at -80°C. The Qubit HS RNA kit (Invitrogen) was used to quantify the extracted RNA and up to 2 µg of each sample were treated with DNase I (Invitrogen). DNase-treated RNA was then phenol-chloroform extracted, ethanol precipitated and re-quantified using the Qubit HS RNA kit (Invitrogen), as well as examined for quality by BioAnalyzer (Agilent). 600 ng of DNase-treated RNA were used for subsequent sample preparation. Ribosomal RNAs were removed from each sample using the Ribo-Zero Gold rRNA Removal Kit (Illumina), as per the manufacturer’s protocol. The remaining RNA was ethanol precipitated and prepared for sequencing using the Truseq Stranded mRNA Sample Preparation Kit (Illumina), following the protocol provided by the manufacturer. The resulting PCR product DNA was quantified using the Qubit HS DNA kit (Invitrogen) and all samples were pooled in equal amounts into a 20 nM solution and sequenced by the TCAG Sequencing Facility at Peter Gilgan Centre for Research and Learning, SickKids Hospital on a HiSeq 2500 Sequencing System (Illumina). For validation by qRT-PCR, 100 ng of RNA from each of the input, EMB-4 RIP and mock RIP samples was used for each reverse transcription reaction. Reverse transcription was performed as described above.


2.5. Processing and analysis of high-throughput sequencing data:

Processing mRNA-seq data: polyA and ribozero samples were both processed in similar fashion. The raw sequences were checked for quality using FastQC [180]. The adapters were trimmed from the reads using cutadapt (version 1.10) [181]. Sequencing reads were then mapped to the C. elegans genome, version ce10/WS220, using TopHat2, allowing a maximum of two mismatches in the final alignment (2). Reads mapping to ribosomal loci were removed. Only reads with quality scores higher than one were kept for the downstream analysis, removing reads mapping to more than 10 locations in the genome. Reads mapping to protein coding genes were counted using featureCounts [182]. For differential expression analysis we used the DESeq2 package in R [205].

Processing small RNA-seq data: The raw sequences were checked for quality using FastQC [180]. The adapters were trimmed with cutadapt (version 1.10) [181]. The reads were then aligned to the C. elegans genome, version ce10/WS220, using STAR (version STAR_2.3.0e.OSX_x86_64) [183] with parameters suitable for alignment of small RNAs (--outFilterMismatchNoverLmax 0.05 --outFilterMatchNmin 16 --outFilterScoreMinOverLread 0 --outFilterMatchNminOverLread 1 --alignIntronMax 1).
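For illustration, the small RNA alignment parameters listed above can be assembled into a STAR command line as sketched below. Only the five filtering and alignment parameters come from this section. The index directory, FASTQ file name, output prefix, and the use of `--genomeDir`, `--readFilesIn` and `--outFileNamePrefix` are assumptions added to make the example complete, and the function name is hypothetical.

```python
import shlex


def build_star_command(genome_dir, reads_fastq, out_prefix):
    """Assemble a STAR argument list using the small RNA parameters from this section."""
    return [
        "STAR",
        # Input and output locations: assumed for illustration, not taken from the text.
        "--genomeDir", genome_dir,
        "--readFilesIn", reads_fastq,
        "--outFileNamePrefix", out_prefix,
        # Parameters quoted in the text.
        "--outFilterMismatchNoverLmax", "0.05",  # mismatches allowed as a fraction of length
        "--outFilterMatchNmin", "16",            # minimum number of matched bases
        "--outFilterScoreMinOverLread", "0",     # no read-length-normalized score cutoff
        "--outFilterMatchNminOverLread", "1",    # matched bases relative to read length
        "--alignIntronMax", "1",                 # intron size capped at 1, so no spliced hits
    ]


# Hypothetical paths; printing the command rather than running it.
cmd = build_star_command("ce10_star_index", "trimmed_reads.fastq", "sample1_")
print(shlex.join(cmd))
```

Building the command as a list keeps each flag paired with its value and avoids shell quoting problems if it is later passed to `subprocess.run`.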

Reads mapping to rRNAs were removed from the library. For each genomic feature, small RNAs were counted using the multiBamCov function in bedtools (v2.25.0) [184]. For differential analysis of small RNAs mapping to the 20,389 protein coding genes, we counted only antisense mapping reads and normalized the counts to library size, expressed per million mapped reads (PPM values). Genes with coverage lower than 5 PPM in at least one of the four samples (small RNA-seq of two MJ60 replicates and two N2 replicates at 20°C) were not considered. This reduced the list to 10,566 genes, which were then subjected to differential expression analysis using the samr package in R [185].
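The normalization and filtering step can be sketched as follows. This is a minimal illustration, not the pipeline used in this study: the function names, sample names, gene names and counts are made up, and it assumes that library size means the total number of mapped reads in a sample. A gene is kept only if it reaches at least 5 PPM in every sample, which is equivalent to discarding genes below 5 PPM in at least one sample.

```python
def to_ppm(antisense_counts, library_size):
    """Scale raw antisense read counts to reads per million mapped reads (PPM)."""
    return {gene: count * 1_000_000 / library_size
            for gene, count in antisense_counts.items()}


def genes_passing_filter(ppm_by_sample, min_ppm=5.0):
    """Keep genes with at least min_ppm coverage in every sample."""
    samples = list(ppm_by_sample.values())
    shared_genes = set.intersection(*(set(s) for s in samples))
    return sorted(g for g in shared_genes
                  if all(s[g] >= min_ppm for s in samples))


# Made-up counts and library sizes for two N2 and two MJ60 replicates.
raw = {
    "N2_rep1": ({"geneA": 50, "geneB": 2, "geneC": 10}, 2_000_000),
    "N2_rep2": ({"geneA": 80, "geneB": 40, "geneC": 24}, 4_000_000),
    "MJ60_rep1": ({"geneA": 30, "geneB": 8, "geneC": 7}, 1_000_000),
    "MJ60_rep2": ({"geneA": 12, "geneB": 20, "geneC": 16}, 2_000_000),
}
ppm = {name: to_ppm(counts, size) for name, (counts, size) in raw.items()}
kept = genes_passing_filter(ppm)
print(kept)
```

In this toy example geneB is dropped because it falls to 1 PPM in one N2 replicate, even though it exceeds the threshold in the other three samples.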

Processing RIP-seq data: The raw sequences were checked for quality using FastQC [180]. The adapters were trimmed with cutadapt (version 1.10) [181]. The reads were then aligned to the C. elegans genome, version ce10/WS220, using bowtie2 [204]. Reads mapping to ribosomal loci were removed. Only reads with quality scores higher than one were kept for the downstream analysis. We downloaded the list of introns from the UCSC Table Browser (http://genome.ucsc.edu/cgi-bin/hgTables). The enrichments over gene body or introns were expressed as a log2(FC), where fold-change is calculated as follows:

\[
\mathrm{FC} = \frac{\#\,\text{reads mapping to a region in IP}}{\#\,\text{reads mapping to a region in input}} \times \frac{\#\,\text{all mapped reads in input}}{\#\,\text{all mapped reads in IP}}
\]
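The enrichment calculation for a single region (a gene body or an intron) can be sketched as below. This is an illustrative example only: the function name and the read counts are hypothetical. It assumes that the fold change is the ratio of region reads in IP to region reads in input, scaled by the ratio of total mapped reads in input to total mapped reads in IP. Regions with zero reads are rejected rather than handled with a pseudocount, since no pseudocount is described in the text.

```python
import math


def log2_enrichment(ip_region_reads, ip_total_reads,
                    input_region_reads, input_total_reads):
    """log2 fold change of IP over input for one region, scaled by library size."""
    if min(ip_region_reads, ip_total_reads,
           input_region_reads, input_total_reads) <= 0:
        raise ValueError("all read counts must be positive")
    fold_change = (ip_region_reads / input_region_reads) * \
                  (input_total_reads / ip_total_reads)
    return math.log2(fold_change)


# Made-up counts: 400 of 1,000,000 IP reads versus 100 of 2,000,000 input reads.
example = log2_enrichment(400, 1_000_000, 100, 2_000_000)
print(example)
```

With these numbers the region holds four times as many reads in the IP despite the IP library being half the size, giving an eightfold enrichment.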

2.6. Chromatin Immunoprecipitation:

N2 worms were grown at 20°C, harvested in M9 as gravid adults and washed three times. They were then treated with 10 mM of the cross-linking agent dimethyl 3,3´-dithiobispropionimidate (DTBP, Thermo Fisher Scientific) diluted in M9 buffer (50 ml total volume) for 30 min at room temperature with rotating. DTBP was quenched by the addition of 2.5 ml of 2.5 M glycine for 5 min at room temperature. Embryos were washed three times with M9 before proceeding to formaldehyde cross-linking.

In a second cross-linking step, worms were then treated using a final concentration of 1% formaldehyde for 30 minutes at room temperature (10 ml total volume) followed by quenching with 0.55 ml 2.5 M glycine for 5 min at room temperature. Worms were then washed three times with M9 buffer, once with FA buffer (50 mM HEPES/KOH pH 7.5, 1 mM EDTA, 1% Triton X-100, 0.1% sodium deoxycholate, 150 mM NaCl), and frozen at −80°C.

Extracts were prepared by resuspending pellets in 1 volume FA Buffer supplemented with protease and phosphatase inhibitors, followed by dounce homogenization and sonication. Sonication was performed in a Diagenode Bioruptor® Pico (15 cycles of 30 sec ON/30 sec OFF, high output) in a volume of 1.5 ml.

Lysate was then spun down at 13,000 x g for 15 min. Protein concentration was determined by the Lowry method and 3.3 mg of extract was used for each ChIP in a total volume of 400 µl. 10% of each IP was removed as input (40 µl). Each IP sample was pre-cleared with 25 µl of a 50% slurry of Protein A/G beads for one hour at 4°C prior to incubation with the antibody. 5 µg of anti-RNA Pol II antibody (Abcam, #5408), 50 µl of EMB-4 antibody, or buffer alone (no antibody control) was added to each IP sample and incubated for two hours with rotating at 4°C.

Immune complexes were recovered using 50 µl of a 50% slurry of Protein-A/G agarose beads (Santa Cruz Biotechnology) and washed at room temperature with 1 ml of each of the following solutions: FA Buffer (2x 5 min), FA Buffer with 1 M NaCl (1x 5 min), FA Buffer with 500 mM NaCl (1x 10 min), TEL (0.25 M LiCl, 1% NP-40, 1% sodium deoxycholate, 1 mM EDTA, 10 mM Tris-HCl, pH 8.0) (1x 10 min), and TE (1 mM EDTA, 10 mM Tris-HCl, pH 8.0) (2x 5 min). Samples were eluted twice with 150 µl elution buffer (1% SDS in TE with 250 mM NaCl) for 15 minutes at 65°C with shaking.

Eluates were combined and treated with 1 µl (20 mg/ml) Proteinase K (Biobasic). Input samples were also treated with 150 µl of elution buffer and 1 µl Proteinase K. Then, crosslinks were reversed for all samples by incubation overnight at 65°C with shaking. DNA was recovered by phenol/chloroform extraction and ethanol precipitation. All samples were resuspended in 50 µl of ultrapure water (Invitrogen) and stored at -20°C. ChIP samples were subsequently analyzed by quantitative real-time PCR.

2.7. Quantitative real-time PCR methods:

For mRNA analysis, qRT-PCR was performed on the StepOnePlus System using Applied Biosystems Fast SYBR Green PCR Master mix. Thermocycling was done for 40 cycles; reactions were 15 µl total volume (7.5 µl SYBR master mix, 0.6 µl of 10mM primer, 2 µl cDNA, 4.3 µl dH2O).

For ChIP analysis, input samples were diluted 1:100 in ultrapure water (Invitrogen) and IP samples were diluted to 1:7. For RIP analysis, cDNA was diluted 1:50. qRT-PCR was performed on the StepOnePlus System using Applied Biosystems Fast SYBR Green PCR Master mix. Thermocycling was done for 40 cycles; reactions were 15 µl total volume (7.5 µl SYBR master mix, 0.6 µl of 10mM primer, 4 µl cDNA, 2.3 µl dH2O). Fold enrichment was determined relative to the Beads Only Control. Quantity values for each target were normalized to input in each sample. Quantity in the ChIP/RIP samples was then normalized to quantity in the beads only control. Error is calculated as described in [186]. All primers used in qRT-PCR experiments are listed in Table 4.
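The two normalization steps used for ChIP and RIP qRT-PCR can be sketched as follows. This is an illustrative example only: the function names and quantities are hypothetical. It assumes that each sample's quantity is first divided by the quantity measured in its own input, and that the input-normalized IP value is then divided by the input-normalized beads only value. Dilution factors and error propagation are not handled here.

```python
def input_normalized(quantity, input_quantity):
    """Quantity of a target in a pulldown relative to its matched input."""
    if input_quantity <= 0:
        raise ValueError("input quantity must be positive")
    return quantity / input_quantity


def fold_enrichment(ip_quantity, ip_input_quantity,
                    beads_quantity, beads_input_quantity):
    """Input-normalized IP signal relative to the beads only control."""
    ip_norm = input_normalized(ip_quantity, ip_input_quantity)
    beads_norm = input_normalized(beads_quantity, beads_input_quantity)
    if beads_norm <= 0:
        raise ValueError("beads only control must have a positive signal")
    return ip_norm / beads_norm


# Made-up quantities for one target, with the same input used for both pulldowns.
example = fold_enrichment(4.0, 64.0, 0.5, 64.0)
print(example)
```

In this toy case the IP recovers eight times more of the target than the beads only control, so the reported fold enrichment is 8.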

Table 4. Primers used for qRT-PCR.

Primers are listed for the following experiments, in table order: emb-4 expression time course; EMB-4/RNA POLII ChIP analysis; EMB-4 RIP validation.

Target Forward Primer Reverse Primer

emb-4 TTCGTCCCCTGTTCCATATC ATCGGCTTCTGGCCTAAAAT

daf-21 GGATGATCTACACCCGCCAAAAT CCAACTGTCGACACAAGCTTCCT

bub-1 GACCCAAATTGACCAAGCGTTC AAGCGTCTCGAGTTGTGTCGTG

klp-16 ACTCGGTGCTCAAACTGGTTGATT AGAACGGAGAGGGACTGATTGATG

D2096.1 TGGCGAACTTATGGGATATG CGGCAAGAACCTCTTCAACT

W08F4.9 GAAGATCCTCTGGTCGTGGA CCGGACGAATCGAACAGTAG

bath-45 GCTCTGGCGAGAATCAACTT TTAGAAGAACGTCTCAGTGTGAACT

clp-3 GCATGTTTCAGCCAGAAAGTTGAA AACGCGCTCCATTGCAAAAATA

M01G12.9 CCGTAATTATGAAGGCCCGAGTC TGCGAGAAGCGTTGTGAGTTTTT

Y47H10.4 CCGTAATTATGAAGGCCCGAGTC TGCGAGAAGCGTTGTGAGTTTTT

K09F6.10 ACGAGGAAGATGATGGATTTTGGA CATCGTCAACGCATAGTCCCTTC

F37H8.7 CCAGAACAAGTGAGCAACCCATC TTTCGGAAGCAGAACTTCACTCG

Y20F4.8 GCTGGAAACGCTGGTAATTGAAAC AGCATCGTCCGGATCTTTCTTGT

C16C8.21 GTCAAAACAGAGCTGATTGAGACGAA TCGCAAATTCTCCTTCCTGAATCTT

F58H7.5 GCCGATGAAGCACAAAGAAGAGA CTGGAGAATCCTGTGGTGAGTCG

rla-1 CTTGCGTCTACGCTGCTCTCATC CTGGCCAGTATGGCTCGAACTC

rps-9 GATCCAAAGCGTCTTTTCGAAGG TCGAGCTTCATCTTGGTCTCGTC

2.8. Immunofluorescence experiments:

To examine EMB-4 localization and assess antibody specificity, gonads and embryos were excised from wild type (N2) and emb-4(hc60) mutant gravid adult worms grown at 20°C. Dissections were performed in 1x PBS on poly-L-lysine coated slides, and samples were frozen and cracked on dry ice for greater than 10 minutes, then fixed at -20°C for 5 min each (15 min total) in each of the following, in order: 100% methanol, 50% methanol/50% acetone, and 100% acetone. All sample incubations were performed in a humid chamber. Samples were washed 2x 5 min with 1x PBS, then 2x 5 min with 1x PBS/0.1% Tween. Next, slides were blocked for one hour in 1x PBS/0.1% Tween-20/3% BSA (PBST+BSA) at room temperature, and then incubated with primary antibody overnight at 4°C (mouse monoclonal anti-EMB-4, 1:500; mouse monoclonal IgM anti-PGL-1 (clone K76), 1:100). Slides were washed 3x 10 min with PBST, and then incubated for 1 hour in PBST+BSA. Secondary antibodies were from Jackson Immunoresearch. Incubation with anti-mouse IgG and IgM (1:100) fluorescently conjugated secondary antibodies was performed for one hour in PBST+BSA at room temperature. Slides were washed 3x 10 min in PBST, 3x 5 min in PBS, then incubated with DAPI (1:2500) for 10 min at room temperature. Finally, slides were washed in PBS 3 times for 5 minutes and mounted in Vectashield (Vector Labs). All images were collected using a Nikon Ti-S inverted microscope with NIS Elements AR software.

To examine H3K4me2 in emb-4 and RNAi mutants, N2, csr-1(tm892) IV; neIs19[pie-1::3xflag::csr-1, unc-119(+)], wago-9(tm1200) and emb-4(hc60) worms were propagated at 20°C and shifted to 25°C at the L1 stage. Once they had become gravid adults, embryos were dissected and stained as described above for EMB-4. H3K4me2 primary rabbit polyclonal IgG antibody from Millipore was used at 1:20000 and PGL-1 mouse monoclonal IgM antibody (clone K76) was used at 1:100. Secondary antibodies from Jackson Immunoresearch were used at 1:100 (Fluorescein (FITC) AffiniPure F(ab')₂ Fragment Goat Anti-Mouse IgM and Rhodamine (TRITC) AffiniPure Donkey Anti-Rabbit IgG).

3. Results

3.1. EMB-4 antibodies specifically recognize EMB-4:

We generated monoclonal antibodies (5M19-8 and 7K3-12) against the C-terminal end of the EMB-4 protein. In order to validate that these antibodies recognized EMB-4, I ran 20 µg of wild type (N2) worm lysate next to emb-4(hc60) mutant worm lysate on a gel and probed for EMB-4 using each antibody separately. The emb-4(hc60) mutant is a null mutant; therefore, there should not be any protein present in the mutant worm lysate. The EMB-4 protein is expected to migrate in correspondence with a molecular weight of 170 kDa. Both antibodies recognized a band at 170 kDa that is present in wild type and not present in the mutant (Figure 4a). This shows that both antibodies recognize the EMB-4 protein. Tubulin was used as a loading control. In order to test the specificity of the antibodies for EMB-4 alone, I ran wild type worm lysate on a gel and probed the western blot with each elution of each antibody. All elutions of both antibodies recognized a single band at 170 kDa. Since no other bands are visible on the western blot, we concluded that both EMB-4 antibodies specifically recognize EMB-4 alone (Figure 4b).

[Figure 4 image: western blots. Panel a: wild type and emb-4(hc60) lysates probed with 5M19-8 and 7K3-12; bands labeled EMB-4 (170 kDa) and TUBULIN (63 kDa). Panel b: elutions 1-4 of 5M19-8 and 7K3-12, with molecular weight (MW) markers in kDa.]

Figure 4. The EMB-4 antibodies are specific for EMB-4. a) 20 µg of wild type and emb-4 mutant worm lysate were run on a gel. The western blot was probed with one of the two EMB-4 antibodies: 5M19-8 (left) and 7K3-12 (right) and anti-tubulin as a loading control. Both antibodies recognize a band at 170 kDa that is not present in the mutant, confirming that it is EMB-4. b) Each elution of both EMB-4 antibodies was used to probe for EMB-4 in wild type worms. The full western blot shows that the antibodies recognize only one band at 170 kDa, showing that they are specific for EMB-4.

3.2. Developmental timing of emb-4 mRNA and protein expression:

To determine when in development emb-4 mRNA and protein were expressed, we extracted total RNA and protein from developmentally staged wild-type worms (embryos, L1, L2, L3, L4 larval stages, young adult hermaphrodites without embryos, and gravid adult hermaphrodites with embryos), adults that have a female only or a male only germline (fem-1(hc17) and fog-2(q71) mutants, respectively) and worms lacking a germline altogether (glp-4(bn2) mutant). To examine mRNA expression, we performed reverse transcription followed by quantitative real-time PCR (qRT-PCR) using emb-4 specific primers. Expression data were normalized to the control housekeeping gene act-3. We observed that emb-4 mRNA is present throughout worm development, but that it is predominantly expressed in the embryo and gravid adult worms when a germline is present. Data from glp-4(bn2) mutants lacking a germline support the notion that the primary contribution of emb-4 to the whole worm is from the germline, as only small amounts of emb-4 mRNA are detected in somatic tissue (Figure 5a).

The emb-4 mRNA expression pattern is reiterated in the protein expression pattern. When probed with the EMB-4 antibody, the western blot shows that EMB-4 protein is present at all developmental stages, but enriched in stages with a germline and embryos. Once again, EMB-4 appears to be predominantly expressed in the germline as no EMB-4 protein is detected in glp-4(bn2) mutant samples. In these experiments, the housekeeping gene alpha-tubulin is used as a loading control (Figure 5b).

[Figure 5 image: a) bar chart of mRNA expression relative to act-3, with x-axis labels Embryo, L1, L2, L3, L4, Young Adult, Gravid Adult, Gravid Adult, Female only, Male only and Soma only, and a 25°C annotation; b) western blot of EMB-4 and tubulin (TUB) across larval and adult stages.]

Figure 5. emb-4 is predominantly expressed in embryos and gravid adult germlines. Total RNA and protein were extracted from staged worms, worms with a male or female-only (fog-2(q71) and fem-1(hc17) mutants, respectively) germline and worms lacking a germline altogether (glp-4(bn2) mutants). a) Reverse transcription followed by qRT-PCR using emb-4 specific primers was performed and quantities were normalized to the housekeeping gene act-3. mRNA expression data show that emb-4 is predominantly expressed in embryos and gravid adult worms. b) Western blots were probed using the EMB-4 antibody and show that EMB-4 is predominantly expressed in embryos and gravid adult worms. EMB-4 is not detectable in somatic tissue, suggesting that it is primarily expressed in the germline.

3.3. EMB-4 localization in the adult germline and embryos:

To determine the subcellular localization of EMB-4 in the tissues where it is predominantly expressed, I performed immunofluorescence (IF) confocal microscopy using the EMB-4 antibody. For these experiments, I examined the patterns of EMB-4 (in some cases along with a germline marker, PGL-1) as well as DAPI (for DNA) in dissected adult gonads and developing embryos. I found that EMB-4 is generally restricted to the nucleus in the germline and embryo (Figure 6a). Closer examination of EMB-4 staining in the germline shows a highly dynamic pattern during meiosis and germ cell development. In the mitotic and transition regions of the germline, EMB-4 is present within the nucleus in foci that surround the DNA (Figure 6b and c). The mitotic region represents a stem cell population of mitotically dividing nuclei that are constantly producing germ cells, which then enter into the early stages of meiotic prophase in the transition zone, where chromosomes begin to pair [187,188,189,190]. This pattern changes in the pachytene zone, where chromatids begin to separate and EMB-4 is concentrated at one distinct focus in the center of the nucleus and in a surrounding crescent shape (Figure 6d) [191]. We speculate, based on the position of the distinct EMB-4 focus, that it represents the nucleolus. Further in germline development, during diakinesis, EMB-4 appears to be evenly dispersed in the oocyte nuclei (Figure 6e) [192].

In the embryo, at both earlier and later stages, EMB-4 is present in the nuclei of all cells, including both germline and somatic tissues. In these cells, EMB-4 is also localized to several foci within each nucleus (Figure 6f). The fact that EMB-4 is restricted to the nucleus is consistent with its role as a putative splicing factor and with possible roles in coordinating nuclear export. Together with the mRNA and protein expression patterns, these data suggest that EMB-4 could function in the germline of adults and embryos, and in somatic cells during embryonic development. In contrast, I did not obtain compelling evidence for EMB-4 localization in adult somatic tissues, consistent with the developmental mRNA and protein expression data. I observed minimal co-localization of EMB-4 with P-granules (Figure 6g). The implications of this co-localization are not entirely clear at this point, but it could indicate a role for EMB-4 in transiting nuclear pore complexes.

Figure 6. EMB-4 is expressed in the nuclei of the adult germline and embryo. The EMB-4 antibody was used to stain dissected germlines and embryos of wild type worms. DAPI (blue) is used to stain DNA. a) EMB-4 is expressed in all germline nuclei. b) and c) EMB-4 is present in foci in the nuclei of the mitotic and transition zones of the germline. d) It is concentrated in a single focus and a crescent shape in the pachytene zone. e) In oocyte nuclei EMB-4 is evenly dispersed. f) In both early and late-staged embryos, EMB-4 is expressed in the nuclei of all cells and is concentrated in foci. g) PGL-1 (green) is used as a marker for P-granules. EMB-4 shows minimal co-localization with PGL-1 in the germline.

3.4. Loss of emb-4 leads to decreased broods and embryonic lethality:

The emb-4(hc60) mutant was reported by Checchi et al. (2006) to be a null allele and to display temperature sensitivity, as the authors observed more severe defects in germline chromatin resetting at the non-permissive temperature [174]. My assessment of EMB-4 protein levels in wild type and emb-4(hc60) mutants by western blot (Section 3.1) confirms that the allele is in fact a protein null mutant. To then assess the validity of the temperature sensitive phenotype, I conducted brood size counts on wild type (N2) and emb-4(hc60) mutants reared at 14°C, 20°C and 25°C. In these experiments, I also counted the number of larvae that hatched in order to quantify the degree of embryonic lethality and the effect of temperature stress on this phenotype.

My data show that emb-4(hc60) mutants lay fewer embryos than wild type worms at all temperatures (Figure 7a). At 14°C the median total brood size for wild type worms is 189 individuals, whereas emb-4(hc60) mutants only laid approximately 35 embryos in total. At 20°C wild type and emb-4(hc60) mutants laid 256 and 61 embryos respectively, whereas at 25°C wild type worms laid 110 and mutants laid 10 embryos. These data confirm a temperature dependent brood size defect for emb-4(hc60) mutants that is exacerbated at both low and high temperatures.

In addition to a reduced total brood size, emb-4(hc60) mutants also display a significant degree of embryonic lethality (Figure 7b and c). When considering the number of larvae that hatched as a proportion of the total number of embryos laid, I observed that only 38% and 50% of emb-4(hc60) embryos hatched to produce viable larvae at 14°C and 20°C, respectively. This is in stark contrast to 96% and 97% of wild type embryos at 14°C and 20°C, respectively. Moreover, the embryonic lethal phenotype appears to be fully penetrant at 25°C for emb-4(hc60), with 100% of embryos arresting during embryogenesis. In comparison, wild type embryo hatching is reduced, but only to 87% at 25°C, which is expected due to temperature stress. Taken together, these data show that the emb-4(hc60) mutant is temperature sensitive and has a reduced brood size as well as a significant embryonic lethal phenotype.
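The hatching rates quoted above are the number of hatched larvae expressed as a percentage of the total number of embryos laid. A minimal Python sketch of that arithmetic follows; the function names are hypothetical and no counts from this study are reproduced.

```python
def percent_hatched(larvae_hatched, embryos_laid):
    """Hatched larvae as a percentage of all embryos laid."""
    if embryos_laid <= 0:
        raise ValueError("embryos_laid must be positive")
    if not 0 <= larvae_hatched <= embryos_laid:
        raise ValueError("larvae_hatched must lie between 0 and embryos_laid")
    return 100.0 * larvae_hatched / embryos_laid


def percent_embryonic_lethality(larvae_hatched, embryos_laid):
    """Embryos that failed to hatch, as a percentage of all embryos laid."""
    return 100.0 - percent_hatched(larvae_hatched, embryos_laid)
```

Under this definition, a fully penetrant embryonic lethal phenotype corresponds to 0% hatched and 100% lethality.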

[Figure 7: a) Total Brood Size and b) Viable Brood Size for wild type (N2) and emb-4(hc60) at 14°C, 20°C and 25°C. c) Percent embryos hatched for Wild Type (N2) and emb-4(hc60) at 14°C, 20°C and 25°C.]

Figure 7. The emb-4(hc60) mutant is temperature sensitive and has increased sterility and embryonic lethality. Brood size counts were performed on wild type worms and emb-4(hc60) mutants grown at 14°C, 20°C and 25°C. a) The total number of embryos laid was counted. emb-4(hc60) mutants lay fewer embryos than wild type worms at all temperatures, indicating sterility. At 25°C the number of embryos is greatly reduced relative to 14°C and 20°C, indicating that the mutant is temperature sensitive. b) and c) The total number of larvae hatched was counted for both genotypes at 14°C, 20°C and 25°C. A smaller proportion of emb-4(hc60) embryos hatch into viable larvae at all temperatures, suggesting an embryonic lethal phenotype. At 25°C, all embryos arrested prior to hatching, suggesting that the embryonic lethal phenotype is fully penetrant at 25°C.

3.5. EMB-4 interacts with C. elegans WAGOs CSR-1 and WAGO-9/HRDE-1

To validate the interaction between CSR-1 and EMB-4 observed by Dr. Julie Claycomb in her initial IP/mass spectrometry experiments, I repeated the same experiment using the same CSR-1 antibody. I also tested whether the interaction was detectable following immunoprecipitation of EMB-4 using the newly-generated EMB-4 antibodies. In these experiments, I immunoprecipitated either EMB-4 or CSR-1 from 5 mg of wild type (N2) worm lysate each. In mock IP negative control experiments, I used only the Protein A/G agarose beads to perform the IP. I visualized the proteins associated with the immunocomplexes by silver stain on a gradient SDS polyacrylamide gel (PAGE) and cut out the bands corresponding to 170 kDa (EMB-4), 116 kDa (CSR-1 long isoform) and 96 kDa (CSR-1 short isoform) from each IP (Figure 8a). The gel bands were sent to the Taplin Mass Spectrometry Facility at Harvard Medical School, Harvard University to determine protein identity (this is where Dr. Claycomb also had the initial IP/mass spectrometry analysis performed). We observed a high number of spectral counts of EMB-4 in the 170 kDa band of both the EMB-4 IP and the CSR-1 IP, supporting the conclusion that EMB-4 does interact with CSR-1. Unfortunately, I was unable to obtain any strong reading in terms of unique peptides from the 116 kDa band in either IP, and thus cannot draw a strong conclusion about CSR-1 from these data (Table 5).

Table 5. Spectral counts of EMB-4 and CSR-1 in IPs

                       EMB-4 IP    CSR-1 IP
EMB-4                  56          18
CSR-1 Short Isoform    1           1
CSR-1 Long Isoform     0           0

I next sought to validate these results by co-IP/Western Blot. In these experiments, I immunoprecipitated EMB-4 and CSR-1 from wild type (N2) worms using the same antibodies and conditions as I used for the IP/Mass spectrometry experiment. The resulting IP samples were run on SDS-PAGE and the region of the western blot greater than 135 kDa was probed using the EMB-4 antibody, while the lower molecular weight portion of the western blot was probed with the CSR-1 antibody. I successfully confirmed the interaction observed by IP/mass spectrometry in this experiment. In the CSR-1 IP, I detected a clear EMB-4 band at 170 kDa. In the EMB-4 IP, I only robustly detected the short isoform of CSR-1 (96 kDa) (Figure 8b). Finally, no non-specific binding of either EMB-4 or CSR-1 was detected in the Mock IP sample, which consisted of an IP performed using only the Protein A/G agarose beads. Collectively, my data indicate that CSR-1 and EMB-4 physically interact in the worm, and because of the reciprocal nature of the interaction, I have high confidence in these data.

Our collaborators in Dr. Eric Miska’s lab at the University of Cambridge (UK) also identified EMB-4 as an interactor of the C. elegans AGO, WAGO-9/HRDE-1, using quantitative mass spectrometry studies in which they immunoprecipitated WAGO-9/HRDE-1. In the course of our collaboration with the Miska group, I performed experiments using the EMB-4 antibody to validate this interaction by co-IP. I used a worm strain that carries a FLAG-tagged version of WAGO-9/HRDE-1 in a wago-9/hrde-1(tm1200) null mutant background, such that the only version of the WAGO-9/HRDE-1 protein produced is epitope tagged. I immunoprecipitated FLAG::HRDE-1 using the anti-FLAG M2 antibody and EMB-4 using the EMB-4 antibodies, performed SDS-PAGE, and probed the western blot for each protein as I had for CSR-1. This experiment confirmed the interaction observed by the Miska lab by reciprocal IPs. In the FLAG IP, EMB-4 is detected and vice versa, with no non-specific binding in the Mock IP (Figure 8c). My data demonstrate that EMB-4 clearly physically interacts with at least two members of the WAGO group of Argonautes in the worm, thus placing it in an interesting biological context to impact multiple, opposing, small RNA pathways within the germline.

[Figure 8: a) gel lanes: Input (1 ug, 2 ug, 5 ug), EMB-4 IP, MOCK and CSR-1 IP; bands labelled EMB-4, CSR-1 Long and CSR-1 Short. b) IPs: Input, CSR-1, EMB-4 and Mock lanes, probed with αEMB-4 and αCSR-1. c) IPs: Input, EMB-4, FLAG::HRDE-1 and MOCK lanes, probed with αEMB-4 and αFLAG. Size markers shown: 175 kDa, 120 kDa, 117 kDa and 96 kDa.]

Figure 8. EMB-4 interacts with WAGOs CSR-1 and WAGO-9/HRDE-1. (a) EMB-4 and CSR-1 IPs were performed in lysate from gravid adult worms using antibodies specific for each protein. Western blots were probed for EMB-4 and CSR-1. (b) Worms expressing FLAG::WAGO-9/HRDE-1 were used to pull down EMB-4 and FLAG::WAGO-9/HRDE-1 (using anti-FLAG antibody). Western blots were probed for EMB-4 and FLAG.

3.6. csr-1 and wago-9/hrde-1 mutants display defects in germline chromatin resetting similar to emb-4 mutants:

As described in Chapter 1, emb-4 mutants have severe defects in germline chromatin resetting during the development of the Primordial Germ Cells, Z2 and Z3. In particular, emb-4 mutants show significant delays or a complete block in the global loss of H3K4me2 from the germline genome when the Z2/Z3 cells emerge. We observed interaction between EMB-4 and the WAGOs CSR-1 and WAGO-9/HRDE-1, which both play important roles in the germline by co-transcriptionally licensing (CSR-1) and silencing (WAGO-9/HRDE-1) particular genes in the adult germline. Moreover, both CSR-1 and WAGO-9/HRDE-1 appear to have roles in altering the chromatin architecture surrounding their targets. Therefore, we hypothesized that the CSR-1 and WAGO-9/HRDE-1 small RNA pathways might also influence germline chromatin resetting at the Z2/Z3 stage.

To test the effect of the germline small RNA pathways on H3K4me2 in the developing embryo, I dissected embryos from wild type (N2), emb-4(hc60), wago-9/hrde-1(tm1200) null mutants and a csr-1 partially rescued mutant strain (csr-1(tm892) IV; neIs19[pie-1::3xflag::csr-1, unc-119(+)]) and performed immunolocalization studies using an antibody specific to H3K4me2. Anti-PGL-1 staining was also used as a marker for P-granules, and thus helped to identify germ cells in the embryos. Wild type and emb-4(hc60) mutants were used as controls, and partly to verify previous findings on the emb-4(hc60) allele. Worms were generally reared at 20°C, then shifted to 25°C at the L1 stage and allowed to grow to adulthood. At the gravid adult stage, worms were dissected to release their embryos for examination. I performed these experiments using a temperature shift, as the emb-4(hc60) germline chromatin resetting phenotype was previously observed to be fully penetrant at 25°C. Moreover, wago-9/hrde-1(tm1200) mutants were shown to have transgenerational fertility defects when grown at 25°C over successive generations; thus we reasoned that any phenotype might be more evident at this temperature. It was necessary to shift the worms from 20°C to 25°C at the L1 stage due to the embryonic lethality of the emb-4(hc60) mutant and the increased sterility of the csr-1 partially rescued mutant strain at 25°C. Using csr-1(tm892) null mutants would not be possible, as they produce very few embryos that do not reach the Z2/Z3 developmental stage at any temperature.

As expected, I observed that almost all (99%) wild type embryos were normal and displayed proper loss of H3K4me2 when Z2/Z3 emerge (Figure 9a, top row). In contrast, almost all (98%) emb-4(hc60) mutants retain H3K4me2, in agreement with previously published data by Checchi et al. (2006) (Figure 9a, second row). Interestingly, approximately 60% of csr-1 partially rescued embryos displayed H3K4me2 retention well past the Z2/Z3 stage (Figure 9a, middle two rows, Figure 9b). wago-9/hrde-1(tm1200) mutants also displayed defects in germline chromatin resetting; however, the proportion of such embryos was much lower, at approximately 25% (Figure 9a, bottom two rows, Figure 9b). These data demonstrate that small RNA pathway mutants also have defects in germline chromatin resetting, suggesting that germline chromatin and germ cell specification might be a point of intersection between EMB-4 and small RNA pathways.


Figure 9. csr-1 and wago-9/hrde-1 mutants have defects in germline chromatin resetting, similar to emb-4 mutants. H3K4me2 specific antibody (red) was used to stain wild type (N2), emb-4(hc60), csr-1 and wago-9/hrde-1 mutant embryos past the Z2/Z3 stage. PGL-1 (green) was used as a marker for P-granules. a) Wild type embryos experience the expected loss of H3K4me2 at Z2/Z3, whereas emb-4(hc60) embryos retain H3K4me2 (top two rows). csr-1 (middle two rows) and wago-9/hrde-1(tm1200) (bottom two rows) mutant embryos display both the normal loss of H3K4me2 at the Z2/Z3 stage and the aberrant H3K4me2 retention. b) H3K4me2-stained embryos were counted.

3.7. EMB-4 is present at CSR-1 and WAGO-9/HRDE-1 target genomic loci:

Because of the known role for the CSR-1 and WAGO-9/HRDE-1 small RNA pathways in affecting chromatin at their target loci in the adult germline, the physical interaction between these WAGOs and EMB-4, and the similar germline chromatin resetting defects shared by emb-4 and wago mutants, we wondered whether EMB-4 was capable of interacting with chromatin at the genomic loci of CSR-1 and WAGO-9/HRDE-1. To test this possibility, I performed chromatin immunoprecipitation (ChIP) on wild type (N2) gravid adult worms using the EMB-4 antibody. To increase the likelihood of capturing interactions with proteins more distantly associated with chromatin, such as those that interact with nascent transcripts rather than directly with chromatin, we employed two crosslinking agents, DTBP (Dimethyl 3,3′-dithiopropionimidate dihydrochloride) and formaldehyde. DTBP has a longer crosslinker length than formaldehyde, and can thus enable the detection of longer-range interactions. As EMB-4 is a putative splicing factor, we suspected that it was unlikely to directly associate with chromatin, but that it was more likely to associate with nascent transcripts; thus the DTBP crosslinking was important.

I used qRT-PCR to test for the enrichment of CSR-1 and WAGO-9/HRDE-1 target gene loci in the ChIP sample, relative to the Mock IP negative control, which consisted of a ChIP performed with only Protein A/G agarose beads. I also used RNA polymerase II (RNAPII) ChIP as a positive control, as we know that RNAPII is enriched at CSR-1 target gene loci. I tested the enrichment of several CSR-1 targets and RNAPII enriched genes as well as WAGO-9/HRDE-1 targets in the EMB-4 ChIP. CSR-1 target genes were identified as those that were strongly enriched for 22G-RNAs in CSR-1 IP-sequencing experiments [71,201]. Putative WAGO-9/HRDE-1 target gene loci were provided by our collaborators in the Miska lab, and were characterized as genes which were upregulated two-fold or more in mRNA-seq experiments on wago-9/hrde-1(tm1200) mutants relative to wild type worms. I observed an enrichment of RNAPII at all loci examined, suggesting that the fixation and ChIP protocols were successful. I also observed an enrichment of EMB-4 at CSR-1 and WAGO-9/HRDE-1 target gene loci in the EMB-4 ChIP sample relative to the beads only negative control (Figure 10). I did not detect enrichment of non-small RNA target gene loci in either ChIP sample, supporting the conclusion that our ChIP was specific in immunoprecipitating true interaction loci for EMB-4 and RNAPII. Collectively, these data point to a model whereby EMB-4 is present at CSR-1 and WAGO-9/HRDE-1 target gene loci and could act on their nascent transcripts during transcription.
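The text does not state how enrichment over the beads-only control was calculated from the PCR readout. As an illustration only, the Python sketch below assumes a simple comparison of threshold cycle (Ct) values with perfect amplification efficiency (one doubling per cycle) and equal input into both reactions. The function names and the Ct-based formula are assumptions, not the method used in this study; the two-fold threshold is the one stated in the Figure 10 legend.

```python
def fold_enrichment(ct_chip, ct_mock):
    """Fold enrichment of a locus in the ChIP over the beads-only control.

    Assumes one doubling per PCR cycle, so a locus that crosses the
    threshold N cycles earlier in the ChIP is 2**N-fold enriched.
    """
    return 2.0 ** (ct_mock - ct_chip)


def is_enriched(ct_chip, ct_mock, threshold=2.0):
    """Apply a fold-enrichment cut-off (two-fold by default)."""
    return fold_enrichment(ct_chip, ct_mock) >= threshold
```

For example, a locus detected three cycles earlier in the ChIP than in the mock sample would be scored as eight-fold enriched under these assumptions.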

[Figure 10: bar graph of enrichment over the beads-only negative control for POLII ChIP and EMB-4 ChIP. Loci shown: daf-21, bub-1, klp-16, bath-45, clp-3, D2096.1, W08F4.9, M01G12.9 and Y4710.4, grouped as CSR-1/POLII Targets, Putative WAGO-9/HRDE-1 Targets and Non-Targets.]

Figure 10. EMB-4 is enriched at CSR-1 and WAGO-9/HRDE-1 target loci. Chromatin immunoprecipitation of EMB-4 was performed on gravid adult worms after double fixation using formaldehyde and DTBP. EMB-4 was immunoprecipitated using the EMB-4 antibody. qRT-PCR was used to determine enrichment of CSR-1/RNAPII targets, putative WAGO-9/HRDE-1 targets and non-targets relative to the beads-only negative control. Two-fold enrichment was set as a threshold.

3.8. Changes in the emb-4(hc60) small RNA and mRNA transcriptomes:

My data collectively point to a role for EMB-4 in the CSR-1 and WAGO-9 22G-RNA pathways. To begin to tease apart the function of EMB-4 in these small RNA pathways, we first examined the effect of loss of emb-4 on small RNA populations in adult worms. To do this, I performed small RNA cloning followed by Illumina sequencing on duplicate samples from emb-4(hc60) mutants and wild type (N2) worms grown at 20°C. Subsequent computational analysis of these and additional sequencing data (below) was performed by computational post-doctoral fellow, Dr. Katarzyna Tyc.

First, we examined the proportions of small RNAs belonging to different classes: miRNAs, piRNAs, siRNAs (specifically CSR-1-22Gs, WAGO-1 or WAGO-9/HRDE-1-22G-RNAs), and the remaining reads. We observed a dramatic increase in miRNAs (22.2% in the mutant relative to 14.6% in wild type), and piRNAs (2.75% in the mutant relative to 1.1% in the wild type). We also observed a decrease in CSR-1 22G-RNAs in the emb-4(hc60) mutant compared to N2 (7.93% in wild type and 4.55% in the mutant) (Figure 11a). Interestingly, the class of WAGO-9/HRDE-1-associated small RNAs occupies the same proportion of the total small RNA population in the mutant relative to wild type.

We then shifted from a global view of each small RNA class to a more quantitative examination of the differential expression of each small RNA species in the emb-4(hc60) mutant relative to wild type. This provides a more detailed view of the effect of loss of emb-4 on small RNA populations and allows us to identify differentially expressed small RNAs. In this analysis, we interrogated protein coding genes and differences in the number of small RNAs mapping to their genomic loci. We expressed these values as PPM (parts per million; precisely: (# reads mapping to the genomic locus)/(# all reads mapped)*10^6). We then compared the relative quantity of each small RNA species in the emb-4(hc60) mutant relative to wild type (N2) and focused on CSR-1 and WAGO-9/HRDE-1 associated small RNAs, as these are the two AGOs with which EMB-4 interacts.

Our comparison revealed that 3,882 CSR-1 targets (78.8% of all CSR-1 22G-RNA targets) display alterations in 22G-RNAs upon loss of emb-4. CSR-1 targets are transcripts against which CSR-1 associated 22G-RNAs have been identified [71]. Of these, the majority of genes (3,762) showed a depletion of 22G-RNAs and only 46 genes showed increased levels of 22G-RNAs. Although we did not observe any changes in the WAGO-9/HRDE-1 22G-RNAs in terms of overall proportion, closer examination of these small RNAs at the single gene level revealed significant differences. 1,591 WAGO-9/HRDE-1 22G-RNA target genes showed changes in 22G-RNA levels upon loss of emb-4 (95.7% of all WAGO-9/HRDE-1 22G-RNAs). Unlike CSR-1 22G-RNAs, which are almost all depleted in the mutant, we observed a fairly even split in the WAGO-9/HRDE-1 22G-RNAs between those that are up-regulated and those that are down-regulated. 892 of the WAGO-9/HRDE-1 target genes had reduced levels of small RNAs in the mutant compared to wild type and 699 of the WAGO-9/HRDE-1 target genes had increased levels (Figure 11b, Table 6). The fact that one subset of WAGO-9/HRDE-1 associated 22G-RNAs is up-regulated and another is down-regulated explains why we did not observe any changes in this class of 22G-RNAs when looking only at the overall proportions. These data suggest that there are two differentially regulated groups of WAGO-9/HRDE-1 associated 22G-RNAs, one that is depleted and one that is enriched upon loss of emb-4 (Figure 11c, Table 6).
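The PPM normalization defined above can be written out directly. The following is a minimal Python sketch of that formula; the function and variable names are hypothetical, and the per-gene read counts are assumed to have been obtained beforehand by mapping small RNA reads to each locus.

```python
def reads_to_ppm(locus_reads, total_mapped_reads):
    """Small RNA reads at one locus, in parts per million of all mapped reads."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    return locus_reads * 1_000_000 / total_mapped_reads


def normalize_library(reads_by_gene, total_mapped_reads):
    """Convert raw per-gene read counts from one library into PPM values."""
    return {gene: reads_to_ppm(n, total_mapped_reads)
            for gene, n in reads_by_gene.items()}
```

Normalizing each library to its own total of mapped reads is what allows small RNA levels to be compared between the wild type and emb-4(hc60) samples despite differences in sequencing depth.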

[Figure 11: a) bar graph of % of total small RNA population in Wild Type (N2) and the emb-4 mutant (MJ60) for CSR-1-associated, WAGO-9-associated and WAGO-1-associated 22G-RNAs, miRNAs, 21U-RNAs (piRNAs) and Other. b) scatter plots of small RNAs in emb-4(hc60) in ppm against small RNAs in wild type (20ºC) in ppm; left panel: All Genes and CSR-1 Targets; right panel: All Genes and WAGO-9/HRDE-1 Targets.]

Figure 11. CSR-1 22G-RNAs are depleted upon loss of emb-4. a) Bar graph showing the proportions of miRNAs, 21U-RNAs and CSR-1, WAGO-9/HRDE-1 and WAGO-1-associated 22G-RNAs in wild type (blue) and the emb-4(hc60) mutant (red). miRNAs and piRNAs increase in proportion, while CSR-1 22G-RNAs are reduced upon loss of emb-4. b) Scatter plots showing small RNAs in wild type (Y-axis) plotted against small RNAs in the emb-4(hc60) mutant (X-axis). CSR-1 22G-RNAs are highlighted in red (left) and WAGO-9/HRDE-1 22G-RNAs are highlighted in green (right); all other small RNAs are highlighted in blue. Small RNA quantities are in ppm. Each dot represents one small RNA species. CSR-1 22G-RNAs are depleted in the emb-4(hc60) mutant, whereas WAGO-9/HRDE-1 22G-RNAs are both depleted and enriched.

Table 6. CSR-1 and WAGO-9/HRDE-1 22G-RNAs changed upon loss of emb-4

Small RNA Class                  Total Number Changed    UP in emb-4(hc60)    DOWN in emb-4(hc60)
CSR-1 22G-RNAs (4932)            3882                    46                   3762
WAGO-9/HRDE-1 22G-RNAs (1661)    1591                    699                  892

In parallel to the small RNA sequencing experiments in wild type versus emb-4(hc60) mutants, I performed mRNA-seq on these same RNA samples. Although Poly-A selection is traditionally used to generate mRNA-seq libraries, it presents the caveat of biasing against incompletely processed mRNAs and pre-mRNAs, which may mask some of the effects of loss of EMB-4 due to its potential role as a splicing factor. Therefore, we split each total RNA sample and treated one half with the Illumina Ribozero Gold kit to deplete ribosomal RNAs and the other half with the Illumina Poly-A selection kit to select for polyA+ mRNAs. Using DESeq, we determined which transcripts were changed two-fold or greater in the emb-4(hc60) mutant relative to wild type. Using these data, we then examined how targets of the CSR-1 and WAGO-9/HRDE-1 pathways changed in relation to small RNA populations when emb-4 is lost. The data summarized in the following paragraph were generated from PolyA-selected mRNA samples; however, we obtained similar results upon analysis of data from rRNA-depleted samples.

Our analyses show that a total of 8,447 genes are misregulated upon loss of emb-4. Of these, 4,374 are upregulated and 4,073 are downregulated. We then focused on CSR-1 and WAGO-9/HRDE-1 targets and found that of the misregulated genes, 3,432 are CSR-1 targets (out of 4,932 total CSR-1 targets) while only 670 are WAGO-9/HRDE-1 targets (out of 1,661 total WAGO-9/HRDE-1 targets). The majority of changed CSR-1 targets appear to be down-regulated upon loss of emb-4, as 3,014 are decreased while only 418 are increased in their steady state mRNA levels. In contrast, WAGO-9/HRDE-1 targets show no distinct pattern of mis-regulation, as 421 are up-regulated and 249 are down-regulated (Table 7).

Table 7. Global transcriptome changes upon loss of emb-4

Gene Category                   Total Number Changed    UP in emb-4(hc60)    DOWN in emb-4(hc60)
All Genes                       8447                    4374                 4073
CSR-1 Targets (4932)            3432                    418                  3014
WAGO-9/HRDE-1 Targets (1661)    670                     421                  249

We then examined the changes in small RNAs in correlation with the changes in mRNA levels. Since the levels of CSR-1 22G-RNAs were dramatically depleted, we chose to first examine the effects of loss of emb-4 on CSR-1 22G-RNA target transcript levels. We compared fold changes of small RNAs to fold changes of their target mRNA transcripts. We observed that 2,351 CSR-1 target mRNAs are reduced, in correlation with a depletion of their CSR-1 22G-RNAs. 6 CSR-1 target mRNAs are increased in parallel with their 22G-RNAs (Figure 12a (left), Table 8a). This is consistent with the current model of CSR-1 acting to license expression of its target genes. An additional 321 CSR-1 target transcripts are decreased in correlation with enrichment in their 22G-RNAs and 25 genes are increased in mRNA levels in parallel to a depletion of their 22G-RNAs.

Changes in WAGO-9/HRDE-1 22G-RNAs did not show an obvious pattern. 133 WAGO-9/HRDE-1 target mRNAs were increased in correlation with decreased WAGO-9/HRDE-1 22G-RNAs, while 31 mRNAs were decreased in correlation with increased WAGO-9/HRDE-1 22G-RNAs (Figure 12a (right), Table 8b). These two subsets are consistent with existing data that WAGO-9/HRDE-1 is responsible for the silencing of its targets. 94 WAGO-9/HRDE-1 target mRNAs increase in correlation with increasing 22G-RNAs and 97 decrease in conjunction with a reduction in WAGO-9/HRDE-1 22G-RNAs. Similar patterns are observed in our analysis using rRNA-depleted samples (Figure 12b).
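The four-way grouping of targets described above (small RNAs up or down, mRNA up or down) can be sketched as a simple classification on log2 fold changes. The Python example below is an illustration only: it applies the fold-change cut-offs given in the Figure 12 legend (two-fold for mRNAs, 1.5-fold for small RNAs) and ignores the statistical testing performed by DESeq. All names and the example values are hypothetical.

```python
import math
from collections import Counter

MRNA_LOG2_CUTOFF = 1.0              # two-fold change in mRNA level
SRNA_LOG2_CUTOFF = math.log2(1.5)   # 1.5-fold change in small RNA level


def direction(log2_fc, cutoff):
    """Label a log2 fold change as 'up', 'down' or 'unchanged'."""
    if log2_fc >= cutoff:
        return "up"
    if log2_fc <= -cutoff:
        return "down"
    return "unchanged"


def classify(srna_log2_fc, mrna_log2_fc):
    """Return (small RNA direction, mRNA direction) for one gene."""
    return (direction(srna_log2_fc, SRNA_LOG2_CUTOFF),
            direction(mrna_log2_fc, MRNA_LOG2_CUTOFF))


def tally(fold_changes):
    """Count genes per category; values are (small RNA, mRNA) log2 FC pairs."""
    return Counter(classify(s, m) for s, m in fold_changes.values())


# Hypothetical values, for illustration only.
example = {"geneA": (-2.0, -1.5), "geneB": (-1.0, 1.2), "geneC": (0.1, 2.0)}
counts = tally(example)
```

In this scheme, a licensing relationship of the kind proposed for CSR-1 would appear as genes in the ("down", "down") or ("up", "up") categories, whereas a silencing relationship would appear as ("down", "up") or ("up", "down").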

Table 8. Changes in CSR-1 and WAGO-9/HRDE-1 small RNAs and corresponding mRNAs upon loss of emb-4 a) b)

[Figure 12: scatter plots of mRNA Log2 FC (emb-4(hc60)/wild type, 20ºC) against small RNA Log2 FC (emb-4(hc60)/wild type, 20ºC). a) PolyA-selected mRNA. b) rRNA-depleted mRNA. In each row, left panel: All Genes and CSR-1 targets; right panel: All Genes and WAGO-9/HRDE-1 targets.]

Figure 12. CSR-1 targets are downregulated in the emb-4(hc60) mutant in correlation with a depletion of their small RNAs, while WAGO-9/HRDE-1 targets show no distinct pattern of mis-regulation. Scatter plots plotting mRNAs changed 2-fold or greater in the emb-4(hc60) mutant relative to wild type (Y-axis) against small RNAs changed 1.5-fold or greater in the emb-4(hc60) mutant relative to wild type (X-axis). Each dot represents one protein-coding gene. CSR-1 targets are highlighted in red (left), WAGO-9/HRDE-1 targets are highlighted in green (right). (a) mRNA-seq data were generated from Poly-A-selected mRNA samples. (b) mRNA-seq data were generated from rRNA-depleted RNA samples.

Finally, as EMB-4 is a putative splicing factor, we examined whether there were changes in splicing upon loss of emb-4. We examined whether transcripts abnormally retained introns using rMATS [207]. We did not observe any significant intron retention upon loss of EMB-4 and concluded that, while EMB-4 is likely to be a part of the splicing machinery in C. elegans, it may not be essential for splicing to proceed normally.

3.9. EMB-4 interacts with nearly 12,000 transcripts:

Because of its role in the Intron Binding Complex, I sought to determine which transcripts EMB-4 interacts with using an RNA IP-Illumina sequencing approach (RIP-seq). Because of the data I have accumulated thus far, I suspected that EMB-4 would interact with a large proportion of CSR-1 and WAGO-9 target transcripts. However, I speculated that EMB-4 would likely also be associated with a number of non-small RNA transcripts based on its broad role in splicing in other organisms. In these experiments, I isolated EMB-4 complexes by IP along with a total RNA input sample and a Mock IP negative control consisting of an IP performed with Protein A/G beads only. After IP, I DNase I treated the RNA isolated from the samples, and depleted the total RNA of rRNAs using the Illumina Ribozero Gold kit. Libraries were prepared using the TruSeq single stranded mRNA prep kit (Illumina).

We first determined the Log2-fold change of entire transcripts in the EMB-4 RIP relative to the Beads Only Negative Control, and observed that EMB-4 is bound to transcripts corresponding to 3,532 genes (Figure 13a (left)). This is based on a cut-off of two-fold or greater enrichment in the EMB-4 RIP sample over the Mock IP negative control. However, when we restricted our analysis to reads mapping to intronic sequences only, we found that EMB-4 is bound to introns corresponding to 11,826 genes (Figure 13a (right)). (Note that the total intron bound gene set of 11,826 genes contains the vast majority of genes where EMB-4 is bound along the entire transcript, 3,371/3,532.) These data confirm that EMB-4, like its human homolog AQR/IBP160, is primarily an intron-binding protein.

We further analyzed these data to determine the enrichment of CSR-1 and WAGO-9/HRDE-1 target transcripts in the EMB-4 RIP, and observed a striking difference between the two pathways. First, when we examined genes in which EMB-4 is bound over the entire length of the transcript (exons and introns), we found that 483 (out of 3,532 total genes) are WAGO-9/HRDE-1 targets and only 119 are CSR-1 targets (29.1% and 2.4% of all WAGO-9/HRDE-1 and CSR-1 targets, respectively). Next, when we examined the set of genes in which EMB-4 is only enriched on introns (11,826 genes in total), we observed that 1,213 are WAGO-9/HRDE-1 targets and 3,270 are CSR-1 targets (72.9% and 66.3% of all WAGO-9/HRDE-1 and CSR-1 targets, respectively) (Table 9). These data once again highlight a bifurcation in WAGO-9/HRDE-1 data sets, as there appear to be two distinct groups of WAGO-9/HRDE-1 genes. How these two groups of genes differ is currently unclear.

Table 9. CSR-1 and WAGO-9/HRDE-1 showing EMB-4 enrichment

                                Enriched in EMB-4 RIP           Enriched in EMB-4 RIP
                                (exons and introns) (3532)      (introns only) (11826)
CSR-1 Targets (4932)            119                             3270
WAGO-9/HRDE-1 Targets (1662)    483                             1213

Taken together, these data highlight potentially different modes of activity for EMB-4 in binding to and regulating CSR-1 vs. WAGO-9/HRDE-1 targets, as EMB-4 is found along the entirety of WAGO-9/HRDE-1 targets, but only at CSR-1 target introns.
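The comparison summarized in Table 9 amounts to two steps: calling a gene EMB-4-bound when its RIP signal is at least two-fold over the Mock IP, and then intersecting the bound set with each small RNA target list. A minimal Python sketch of that logic is given below; the function names, gene identifiers and fold-change values are hypothetical, and the log2 fold changes are assumed to have been computed already.

```python
import math


def bound_genes(log2_fc_rip_over_mock, min_fold=2.0):
    """Genes enriched at least min_fold in the RIP relative to the Mock IP."""
    cutoff = math.log2(min_fold)
    return {gene for gene, fc in log2_fc_rip_over_mock.items() if fc >= cutoff}


def target_overlap(bound, targets):
    """Number of targets that are bound, and that number as a % of all targets."""
    shared = bound & targets
    return len(shared), 100.0 * len(shared) / len(targets)


# Hypothetical values, for illustration only.
rip = {"g1": 2.3, "g2": 0.4, "g3": 1.0, "g4": -0.7}
targets = {"g1", "g2", "g4", "g5"}
n_shared, pct = target_overlap(bound_genes(rip), targets)
```

Applied to the numbers reported above, 483 bound genes out of 1,661 WAGO-9/HRDE-1 targets gives the 29.1% quoted in the text.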

These data were generated from one biological replicate using the anti-EMB-4 (7K3-12) antibody. I have since performed this experiment on a second biological replicate with both antibodies in order to validate the initial RIP. After RNA extraction and DNase treatment, I performed reverse transcription followed by qRT-PCR to determine the enrichment of known EMB-4 targets in each RIP relative to the Mock RIP (no-antibody control). I used primers specific to six of the most-enriched EMB-4 targets identified in the initial RIP-seq experiment, as well as three that are specific to genes that are depleted in the EMB-4 RIP, i.e. non-targets of EMB-4. I observed an enrichment of the six EMB-4 targets in both EMB-4 RIP samples (5M19-8 and 7K3-12) relative to the Mock RIP. Furthermore, I found that the three EMB-4 non-targets are depleted in both RIP samples. These data show that both antibodies pull down EMB-4-associated transcripts, and they provide a second biological replicate with which we can validate our first RIP experiment (Figure 13b).

[Figure 13: a) scatter plots of Log2 FC in EMB-4 RIP against Log2 FC in Mock RIP, over the length of the entire transcript (Pearson corr.=0.98) and over the length of introns only (Pearson corr.=0.84). b) bar graph of enrichment over the beads-only negative control for EMB-4 RIP (5M19-8) and EMB-4 RIP (7K3-12); genes shown: rla-1, rps-9, K09F6.10, F37H8.7, Y20F4.8, C16C8.21 and F58H7.5, grouped as EMB-4 Targets and Non-Targets.]

Figure 13. EMB-4 binds approximately 12,000 transcripts and is enriched at introns. Gravid adult worms were used to pull down EMB-4. RNA associated with EMB-4 was deep sequenced. a) Enrichment of entire genes was plotted against the Mock RIP (left). Enrichment of intronic sequences only was plotted against the Mock RIP (right). EMB-4 is enriched at intronic sequences, showing that it is primarily an intron-binding protein but also binds the exons of a subset of transcripts. b) qRT-PCR validation of the second biological replicate of the EMB-4 RIP. RIPs were performed using the two EMB-4 antibodies (5M19-8, 7K3-12). Fold enrichment was determined relative to the Mock RIP (no-antibody control). EMB-4 targets are enriched in both RIPs, and non-targets (genes not enriched in EMB-4 binding) are depleted.

3.10. The impact of EMB-4 binding on target transcript levels:

To characterize the possible mode of regulation of EMB-4 target genes, we integrated the RIP-seq data with the mRNA-seq data. We started by separating transcripts that are mis-regulated in the emb-4(hc60) mutant mRNA-seq datasets into two categories: ones that are bound by EMB-4 and ones that are not bound by EMB-4. We found that of the 8,447 total genes that show mis-regulation in transcript level upon loss of emb-4, 5,774 are bound by EMB-4 and the remaining 2,673 are not. We filtered both categories for genes that are germline expressed, as this is the key tissue in which EMB-4 and the CSR-1 and WAGO-9/HRDE-1 small RNA pathways intersect [208]. Of the EMB-4 bound transcripts that are mis-regulated, 4,012 are germline expressed genes (of these, 1,459 are up- and 2,553 are down-regulated). Of the genes that are not bound by EMB-4 that are mis-regulated, 1,516 genes are expressed in the germline (500 increase upon loss of emb-4, while 1,016 decrease).

We aimed to correlate the binding of EMB-4 with any changes in gene expression for the CSR-1 and WAGO-9/HRDE-1 pathway target genes. To do this, we first calculated the number of germline-expressed small RNA pathway target genes that were also EMB-4 targets, regardless of their differential expression status in emb-4(hc60) mutants. 11% of the EMB-4 target transcripts expressed in the germline are WAGO-9/HRDE-1 target genes (768/6,699), while 50% are CSR-1 target genes (3,387/6,699). We postulated that if EMB-4 plays no role in regulating the expression of these small RNA target genes, we would expect the same proportions among the up- and down-regulated EMB-4 targets in emb-4(hc60) mutants (i.e. 738 up and 1,291 down for CSR-1 targets and 167 up and 292 down for WAGO-9/HRDE-1) (Figure 14). We observed 217 of the up-regulated and 148 of the down-regulated EMB-4 targets to be also WAGO-9/HRDE-1 targets. 
We used a Chi-square test to determine the statistical significance of this observation and found that it deviated significantly from the expected values (p<10^(-10)). For the CSR-1 pathway, we observed that 243 of the up-regulated and 2,154 of the down-regulated germline EMB-4 target genes are CSR-1 targets. Again, this is a significant deviation from the expected values (p<10^(-16)). Together these observations point to a role for EMB-4 binding in regulating target gene expression.
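The expected counts above can be reproduced from the stated proportions, and the direction of the deviation can be checked with a simple two-category chi-square. The sketch below is one possible formulation and is not necessarily the exact test performed for this analysis: it asks whether a pathway's observed up/down split matches the up/down split of all mis-regulated, EMB-4-bound germline genes. All function names are illustrative, and the p-value uses the closed form for one degree of freedom.

```python
import math


def expected_counts(n_pathway, n_background, n_up, n_down):
    """Expected up/down counts if pathway targets were mis-regulated
    in proportion to their share of the background gene set."""
    share = n_pathway / n_background
    return share * n_up, share * n_down


def chi_square_1df(observed, expected):
    """Goodness-of-fit chi-square for two categories (1 degree of freedom).

    Returns (statistic, p_value).
    """
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 df, the upper-tail probability equals erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))


def up_down_test(obs_up, obs_down, n_up, n_down):
    """Test whether a pathway's observed up/down split matches the
    up/down split of all mis-regulated genes in the same category."""
    n_obs = obs_up + obs_down
    frac_up = n_up / (n_up + n_down)
    expected = (n_obs * frac_up, n_obs * (1.0 - frac_up))
    return chi_square_1df((obs_up, obs_down), expected)


# EMB-4-bound, germline-expressed genes (numbers from the text).
N_BOUND, N_UP, N_DOWN = 6699, 1459, 2553

print(expected_counts(768, N_BOUND, N_UP, N_DOWN))    # WAGO-9/HRDE-1
print(expected_counts(3387, N_BOUND, N_UP, N_DOWN))   # CSR-1
print(up_down_test(217, 148, N_UP, N_DOWN))           # WAGO-9/HRDE-1
print(up_down_test(243, 2154, N_UP, N_DOWN))          # CSR-1
```

Because this formulation conditions on the number of observed pathway targets, its p-values need not match the reported ones exactly; it is meant only to make the logic of the comparison explicit.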

We performed the same type of analysis for germline genes that were not EMB-4 targets (4,057 genes in total, Figure 14). We observed no statistically significant deviation from the expected value for WAGO-9/HRDE-1 pathway targets, as 43 of the up-regulated and 59 of the down-regulated genes are targets of this pathway (p=0.19; expected numbers: 44 up-regulated and 88 down-regulated WAGO-9/HRDE-1 genes). In contrast, we found that 123 of the up-regulated and 799 of the down-regulated genes are CSR-1 pathway targets, which is a significant deviation from the expected value (p<2.2*10^(-16); expected numbers: 150 up-regulated and 307 down-regulated CSR-1 target genes). These observations suggest that EMB-4 binding to a WAGO-9/HRDE-1 target is important for maintaining its proper expression level. More specifically, they suggest that EMB-4 plays a repressive role in the WAGO-9/HRDE-1 pathway, as loss of EMB-4 binding leads to a significant up-regulation of WAGO-9/HRDE-1 targets (Figure 14). Furthermore, these data suggest that EMB-4 binding per se has no direct impact on the transcript levels of CSR-1 targets. Rather, it appears that the loss of emb-4 causes these changes, as CSR-1 target transcript levels are reduced in the emb-4(hc60) mutant regardless of whether they are bound by EMB-4. Overall, these data support our observation that EMB-4 plays differential roles in the CSR-1 and WAGO-9/HRDE-1 pathways.

[Figure 14. Bar graphs of the number of genes, expected vs. observed.
Bound by EMB-4: CSR-1 targets, DOWN 1291 expected / 2154 observed, UP 738 expected / 243 observed; WAGO-9/HRDE-1 targets, DOWN 292 expected / 148 observed, UP 167 expected / 217 observed.
Not bound by EMB-4: CSR-1 targets, DOWN 307 expected / 799 observed, UP 150 expected / 123 observed; WAGO-9/HRDE-1 targets, DOWN 88 expected / 59 observed, UP 44 expected / 43 observed.]

Figure 14. The impact of EMB-4 binding on CSR-1 and WAGO-9/HRDE-1 target transcript levels. Bar graphs show expected and observed values of CSR-1 (red) and WAGO-9/HRDE-1 (green) target genes bound by EMB-4 (left) and the ones not bound by EMB-4 (right). CSR-1 targets are downregulated upon loss of EMB-4 regardless of EMB-4 binding, whereas only EMB-4-bound WAGO-9/HRDE-1 targets are significantly mis-regulated in the emb-4(hc60) mutant.

3.11. EMB-4 interacts with TREX component DDX-19

My data indicate that EMB-4 plays important and differential roles in the CSR-1 and WAGO-9/HRDE-1 pathways. I therefore became interested in investigating possible mechanistic functions of EMB-4. As discussed in Chapter 1, it is known that the human homolog of EMB-4, AQR/IBP160, is an important component of the pre-exon junction complex (pre-EJC) and is necessary for the recruitment of EJC components to the intron. After some reorganization, many components of the pre-EJC and EJC remain bound to the mRNA and form the TREX complex, which is responsible for shuttling the mature mRNA transcripts out of the nucleus. I therefore tested for interactions between EMB-4 and members of the EJC and TREX complex, using a set of existing antibodies available in several labs, and co-IP/western blot experiments. I first tested UAP56/HEL-1, a helicase and important component of the EJC and TREX, and observed no interaction (data not shown) [198]. Next, I tested for an interaction between EMB-4 and the nuclear export factor NXF-1. Again, I did not observe an interaction between these factors (data not shown) [202]. I next turned to the TREX complex, where I looked for an interaction between EMB-4 and the export factor DDX-19. DDX-19 is a DEAD-box helicase that is involved in the nuclear export of transcripts. It localizes to the nuclear periphery and to P granules. We were generously provided with anti-DDX-19 mouse cell culture supernatant from Dr. Jim Priess’ lab at the Fred Hutchinson Cancer Research Center that allowed us to probe for DDX-19 by western blotting [202]. I then immunoprecipitated EMB-4 and probed for DDX-19 and observed a band of the expected size in the EMB-4 IP but not in the Mock IP. Unfortunately, I was unable to detect DDX-19 in the input protein sample. Therefore, in order to validate the specificity of the DDX-19 antibody, I performed the same IP experiment on worms that were grown on ddx-19 RNAi food. When I immunoprecipitated EMB-4 in these ddx-19 RNAi worms, I detected far less DDX-19 in the EMB-4 IP. These data suggest an interaction between EMB-4 and the TREX complex, which would be consistent with the role of AQR/IBP160 in humans. This role in nuclear export could be an essential facet of the proper routing of small RNA pathway targets out of the nucleus and into the germ granules for small RNA biogenesis and subsequent regulation.

[Figure 15. Western blot of EMB-4 IP and beads-only samples from worms grown on L4440 (control) or ddx-19 RNAi food, probed for EMB-4 (170 kDa) and DDX-19 (108 kDa).]

Figure 15. EMB-4 interacts with TREX component DDX-19. EMB-4 IPs were performed on wild type worms grown on control food (L4440) or ddx-19 RNAi food. Western blots were probed with anti-EMB-4 and anti-DDX-19. DDX-19 co-immunoprecipitates with EMB-4 in worms grown on control RNAi food. DDX-19 levels are reduced in the EMB-4 IP in the worms grown on ddx-19 RNAi food.

4. Discussion and Future Directions

4.1. Discussion

4.1.1. EMB-4 expression and localization are reflective of its functions:

Due to our interest in uncovering the mechanisms of action of CSR-1 in the worm, we sought to characterize one of its interactors, the putative splicing factor EMB-4. Since little was previously known about EMB-4 in C. elegans, we began by generating antibodies specific to EMB-4. We used these antibodies to characterize EMB-4 protein (along with mRNA) expression throughout development, and to assess the subcellular localization of EMB-4. We show that emb-4 mRNA is present throughout worm development but is predominantly expressed in the C. elegans embryo and in adult hermaphrodites. In addition, emb-4 mRNA appears to be predominantly present in the germline, as worms possessing no germline (glp-4(bn2) mutants) show low levels of emb-4 expression. This expression is mirrored in the protein expression data, which also show that EMB-4 is present throughout development, but that the major contribution of EMB-4 is in the germline, with undetectable levels of EMB-4 in the adult soma.

We further examined the localization of EMB-4 in the tissues and cells where it is predominantly expressed and found that EMB-4 is restricted to the nucleus in the germline and the embryo, as we observed co-staining with DAPI, but none with the cytoplasmic P-granule marker, PGL-1. EMB-4, however, displays a dynamic expression pattern in the germline, as it is concentrated in foci in the mitotic and transition zones of the germline, but shifts to one distinct focus, presumptively within the nucleolus, in the pachytene. EMB-4 dissipates in the oocytes and is uniformly distributed in oocyte nuclei. Despite not being present in adult somatic cells, in the embryo, EMB-4 is present in the nuclei of all cells, somatic and germline alike, and is also observed in subnuclear foci. As EMB-4 is a putative splicing factor, we expected to observe its localization in and restriction to the nucleus. 
It has also been previously reported in a number of organisms that splicing factors localize to subnuclear foci, termed nuclear speckles. These are concentrated hubs of splicing activity within the nucleus and have been shown to be highly dynamic [193,194]. In contrast to this pattern, we did not observe multiple nuclear speckles of EMB-4 localization in the adult germline. However, in embryos, multiple subnuclear foci were evident and could be consistent with nuclear speckles. The localization of EMB-4 to one focus in the pachytene nuclei is unusual, and although this focus may represent the nucleolus, a site where ribosomal RNA biogenesis occurs, the functional relevance of this pattern is not yet clear [195]. Nonetheless, the dynamic nature of EMB-4 localization in germline nuclei, along with the emb-4 defect in germline chromatin resetting, suggests an important role for EMB-4 in the germline and its development.

4.1.2. emb-4 mutants show defects in fertility and viability, reflective of germline roles:

In order to better characterize the emb-4(hc60) mutant, which is central to our understanding of the molecular roles of emb-4, I analyzed the fertility and viability of emb-4(hc60) mutants. It had been previously reported that the emb-4(hc60) mutant is a temperature-sensitive null allele with increased embryonic lethality at higher temperatures. I performed brood size counts at 14°C, 20°C, and 25°C and scored the total number of embryos laid as well as the total number of larvae hatched. From these experiments, I concluded that the mutant phenotype is temperature sensitive, as the brood size decreased drastically at 25°C relative to wild type worms. Notably, I also observed decreased broods for the emb-4(hc60) mutant at 14°C, relative to the normal culture temperature of 20°C, where the mutants were healthiest. Interestingly, these brood defects could be due to differential effects on sperm and egg development at the different temperatures, as studies in related Caenorhabditis species have shown that high temperature loss of fertility is often reflective of defective spermatogenesis, while low temperature loss of fertility is reflective of defective oogenesis [196]. We have not tested the possibility of different gametogenesis defects at different temperature extremes for emb-4(hc60) mutants, but could do so in the future. Moreover, our studies have focused on hermaphrodites and embryos, but further insights could be gained by examining EMB-4 localization in the male germline, and the fertility of emb-4(hc60) mutant males.

In addition to reduced total brood size, emb-4(hc60) mutants also display a significant degree of embryonic lethality, as only 38% and 50% of the embryos hatched at 14°C and 20°C, respectively, relative to 96% and 97% of wild type embryos. This phenotype is fully penetrant at 25°C, as no larvae were observed in the mutant, while 87% of wild type embryos hatched into viable larvae. We therefore conclude that emb-4 does indeed have an embryonic lethal phenotype, which is fully penetrant at the non-permissive temperature of 25°C. Taken together, these data support a role for the putative splicing factor emb-4 in development. It is not uncommon to observe some level of lethality when examining splicing factor mutants due to the central role of splicing and its factors in mRNA processing. For instance, the Saccharomyces cerevisiae splicing factor Prp40 is essential [197]. Moreover, mutations in genes encoding constitutive splicing factors in C. elegans such as rnp-2 and rnp-3 (C. elegans U1A and U2B, respectively) also lead to lethality [198].

4.1.3. EMB-4 interacts with 22G-RNA pathways in the germline:

I determined by immunoprecipitation followed by mass spectrometry that EMB-4 interacts with the short isoform of CSR-1. These data were confirmed by co-IP/western blot, as was an interaction between EMB-4 and WAGO-9/HRDE-1. Although there are two isoforms of CSR-1, a long and a short isoform, EMB-4 appears to interact predominantly with the short isoform. The two CSR-1 isoforms differ only in their N-terminus, where the long isoform possesses an additional 163 amino acids, including an arginine and glycine rich motif that may impact the post-translational regulation of CSR-1 (Amanda Charlesworth, Claycomb lab, unpublished). Little is known about the functional difference between the two CSR-1 isoforms in the worm; therefore, we do not know the relevance of this selectivity in EMB-4 interaction with CSR-1. As mentioned above, CSR-1 and WAGO-9/HRDE-1 play important roles in germline development and regulation of proper germline gene expression by licensing the transcription of the appropriate genes (CSR-1) and silencing foreign and deleterious nucleic acids (WAGO-9/HRDE-1). These interactions suggest a role for EMB-4 in both small RNA pathways and in maintaining the critical transcriptome balance in the germline.

To begin to tease apart the role of EMB-4 in the CSR-1 and WAGO-9/HRDE-1 small RNA pathways, we built on one key previous observation of the emb-4 mutant phenotype. Checchi and Kelly reported that emb-4 mutants have defects in chromatin resetting during the transcriptional activation of primordial germ cells, Z2/Z3 [174]. Loss of emb-4 leads to the retention of the H3K4me2 mark in the PGCs well past the Z2/Z3 stage, when it should be removed. Due to the known role of CSR-1 and WAGO-9/HRDE-1 in germ cell development and their effects on chromatin, as well as their interaction with EMB-4, we asked whether they also display defects in germline chromatin resetting. Like emb-4 mutants, partially rescued csr-1 mutants also displayed a high occurrence of aberrant germline chromatin resetting, as over 60% of embryos show H3K4me2 retention. In wago-9/hrde-1(tm1200) mutants, this phenotype is observed in only 25% of embryos. This discrepancy is interesting considering the differences between these two mutants. The csr-1 mutant is not a null mutant, but rather can be thought of as a hypomorphic allele, whereas the wago-9/hrde-1 mutant is a complete null. Genetics already suggested that csr-1 plays a more significant role in germline development and fertility, as csr-1(tm892) null mutants are fully sterile, whereas wago-9/hrde-1(tm1200) mutants are viable and only show significantly reduced fertility under high temperature stress [73]. In both cases, however, our data suggest a role for the WAGOs CSR-1 and WAGO-9/HRDE-1 in embryonic germline chromatin resetting, possibly in conjunction with EMB-4, as the primordial germ cells adopt a zygotic transcriptional program.

4.1.4. EMB-4 associates with chromatin at 22G-RNA target loci and binds a diverse set of transcripts:

As a step toward understanding the mechanisms of EMB-4 function in small RNA pathways, we sought to determine whether EMB-4 was associated with the target gene loci and transcripts of the CSR-1 and WAGO-9/HRDE-1 small RNA pathways, by performing Chromatin IP experiments and RNA IP-sequencing experiments for EMB-4. First, in ChIP experiments, I observed an enrichment of EMB-4 at CSR-1 and WAGO-9/HRDE-1 small RNA target gene loci relative to the Mock IP negative control. As EMB-4 is a putative splicing factor, we predict that it acts upon these targets in a co-transcriptional manner, and is likely to associate with nascent transcripts rather than directly binding to chromatin. Both CSR-1 and WAGO-9/HRDE-1 are enriched at their gene targets by ChIP experiments, and interact with RNAPII in an RNA dependent manner [71,74]. Thus, these AGOs are thought to be associated with target gene nascent transcripts, via sequence specificity provided by their 22G-RNA binding partners. As EMB-4 homologs such as AQR/IBP160 do not interact with transcripts in a sequence specific manner, one possibility is that EMB-4 could utilize its AGO binding partners as a specificity factor to aid it in identifying its small RNA pathway target transcripts [151]. However, because EMB-4 is likely to be a constitutive splicing factor and member of the intron-binding complex in C. elegans (like AQR/IBP160), an alternative possibility is that EMB-4 may simply be present at nascent transcripts during transcription as a component of the RNA processing machinery. This would especially be likely for those genes bound by EMB-4 that are not targets of any known small RNA pathways.

In RIP experiments for EMB-4, I again observed enrichment for EMB-4 binding of CSR-1 and WAGO-9/HRDE-1 target transcripts. Approximately 12,000 transcripts overall were enriched for EMB-4 binding, with 3,092 genes being CSR-1 22G-RNA targets, 1,087 genes being WAGO-9/HRDE-1 targets, and the rest not being targeted by known small RNA pathways. When we examined where EMB-4 was enriched on target transcripts, we noted a striking difference between CSR-1 and WAGO-9/HRDE-1 target genes. CSR-1 targets were enriched for EMB-4 predominantly over their introns, while WAGO-9/HRDE-1 targets were enriched for EMB-4 binding over their entire lengths (including both introns and exons). Non-small RNA genes bound by EMB-4 also displayed enrichment predominantly within introns. These data highlight a potential difference in the activity of EMB-4 for each of the 22G-RNA pathways, but the mechanistic significance of this difference is, as yet, not entirely clear.

4.1.5. emb-4(hc60) small RNA and mRNA transcriptomes are perturbed:

Our collective data pointed strongly to a link between EMB-4 and the CSR-1 and WAGO-9/HRDE-1 22G-RNA pathways. Thus, I next asked whether small RNA and mRNA transcriptomes were altered in emb-4(hc60) mutants grown at 20°C. Dr. Katarzyna Tyc, a computational postdoctoral fellow in the Claycomb lab, analyzed these high throughput sequencing data, and from these analyses we made several striking observations. First, small RNA analysis revealed increases in miRNA populations and piRNA populations along with a concomitant decrease in CSR-1 22G-RNAs. The WAGO-9/HRDE-1 22G-RNA changes were less uniform, and could be split into two groups: about 300 genes showed increases in 22G-RNAs, while about 450 showed decreases in 22G-RNAs when emb-4 was lost. Our analyses have not yet revealed any other distinguishing (or common) characteristics between the two WAGO-9/HRDE-1 22G-RNA target gene groups; however, this analysis is still ongoing. Taken together, our data suggest that emb-4 somehow affects 22G-RNA biogenesis or stability. It is important to note that in these analyses, we have represented all small RNA classes as a proportion of the total small RNAs, which can lead to a slight misrepresentation of some of the data. For example, the number of miRNAs may actually be the same in the mutant and the wild type samples, but it may appear as though miRNAs are changed in the mutant as they occupy a larger proportion of the total small RNA population due to loss of other classes of small RNAs, such as the CSR-1 22G-RNA class.

I also performed mRNA-seq on libraries generated from the same RNA samples and found that a total of 8,447 genes were changed in the emb-4(hc60) mutant relative to wild type. Of these, 4,374 are up-regulated and 4,073 are down-regulated. The majority of down-regulated transcripts were found to be CSR-1 targets (3,104 out of 4,073 genes down-regulated in emb-4, and 3,104 out of 4,392 CSR-1 target genes) and this down-regulation appears to be correlated with a decrease in CSR-1 associated 22G-RNAs against these targets, as 74% of these also show a decrease in small RNA level. These data are consistent with published data on the role of CSR-1 as a positive regulator of its targets [74,75]. 11% of down-regulated CSR-1 targets show an unexpected increase in small RNA levels. This may be due to the fact that these transcripts are targeted by other silencing small RNA pathways. The remaining 15% show no change in small RNA level. 
A small number of CSR-1 targets (418 genes) are up-regulated in the emb-4(hc60) mutant, with no obvious changes in small RNAs correlating with this up-regulation.

In contrast, WAGO-9/HRDE-1 targets show no unified direction of changes in expression levels. A total of 670 WAGO-9/HRDE-1 targets are changed in the emb-4(hc60) mutant, with 421 being up-regulated and 249 being down-regulated. Only 123 of the 421 up-regulated targets show the expected loss of WAGO-9/HRDE-1 small RNAs and only 26 down-regulated targets show the expected correlation of increased small RNA levels. These data suggest that loss of emb-4 may have a less pronounced effect on the WAGO-9/HRDE-1 pathway than it does on the CSR-1 pathway overall. This may be in part due to the fact that CSR-1 targets are almost exclusively targeted by CSR-1, whereas WAGO-9/HRDE-1 targets may be subject to targeting by other small RNA pathways such as the PRG-1 and the WAGO-1 pathways [71,55]. While we do not observe an interaction between EMB-4 and WAGO-1, suggesting a lack of a role for EMB-4 in the WAGO-1 small RNA pathway, an interaction between EMB-4 and PRG-1 has not been tested. Clearly, these data provide an additional line of evidence supporting the difference between EMB-4 activity in each small RNA pathway.

In the CSR-1 small RNA pathway, there appears to be a strong correlation between the levels of small RNAs and their mRNA targets. However, it is difficult to infer causality from our data, based on what is already known about the biogenesis of 22G-RNAs. Target transcripts are used as a template to generate 22G-RNAs by RNA-dependent RNA polymerase enzymes [69]. Therefore, there must always be a basal level of transcription of these target genes in order to continue the biosynthesis of their 22G-RNAs. In this sense it is impossible to determine from our data whether the loss of CSR-1 22G-RNAs leads to a reduction in CSR-1 mRNA target level (as the pathway is thought to promote gene expression), or vice versa.

To gain further insights into the differential role of EMB-4 in the CSR-1 and WAGO-9/HRDE-1 small RNA pathways, we performed additional analyses of the RIP-seq data in conjunction with our mRNA-seq data. Once again, we found a striking difference between the CSR-1 and WAGO-9/HRDE-1 targets in the mutant. 
Namely, when we separated the transcripts with altered steady state levels in the emb-4(hc60) mutant relative to wild type into two categories based on whether or not they are bound by EMB-4, we found that a higher than expected number of germline expressed CSR-1 targets are down-regulated in the emb-4(hc60) mutant regardless of EMB-4 binding status. However, when we performed the same analysis on WAGO-9/HRDE-1 targets, we found that only WAGO-9/HRDE-1 targets that are directly bound by EMB-4 showed a higher than expected number of changed transcripts in the emb-4(hc60) mutant. Importantly, these EMB-4-bound WAGO-9/HRDE-1 targets are up-regulated in the emb-4(hc60) mutant. These data further solidify our hypothesis that EMB-4 targets the CSR-1 and WAGO-9/HRDE-1 pathways in a differential manner. Furthermore, they suggest that EMB-4 itself, but not its binding, is necessary for the proper expression of CSR-1 targets. These data may point to indirect effects from EMB-4 in the CSR-1 pathway, or they could suggest that EMB-4 is required for proper chromatin and/or transcription rates at CSR-1 target genes. Further ChIP studies of histone modifications and RNAPII in emb-4 mutants will help to clarify these possibilities. The activity of EMB-4 in the CSR-1 pathway is in contrast to the WAGO-9/HRDE-1 targets, where EMB-4 binding appears to be necessary for their silencing, suggesting that EMB-4 plays an actively repressive role in the WAGO-9/HRDE-1 pathway. Furthermore, our collaborators in the Miska lab have performed assays that suggest EMB-4 is required for the ability of RNA dependent RNA polymerases to traverse introns during the synthesis of 22G-RNAs.
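The normalization caveat raised in this section, namely that expressing each small RNA class as a proportion of total small RNAs can make an unchanged class appear to increase, can be illustrated with a small numerical sketch. The read counts below are hypothetical and chosen only to make the arithmetic transparent; they are not data from this study.

```python
def proportions(counts):
    """Express each small RNA class as a fraction of the total reads."""
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


# Hypothetical read counts, for illustration only (not data from this study).
# miRNA and piRNA counts are identical in both samples; only the
# CSR-1 22G-RNA class is reduced in the hypothetical mutant.
wild_type = {"miRNA": 1000, "piRNA": 1000, "CSR-1 22G-RNA": 2000}
mutant = {"miRNA": 1000, "piRNA": 1000, "CSR-1 22G-RNA": 500}

wt_prop = proportions(wild_type)
mut_prop = proportions(mutant)

for cls in wild_type:
    print(f"{cls:15s} wild type {wt_prop[cls]:.2f}  mutant {mut_prop[cls]:.2f}")
```

In this toy case the miRNA fraction rises although the miRNA count is unchanged, which is the kind of apparent increase described above.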

4.1.7. EMB-4 interacts with TREX in C. elegans

As an alternative approach to tease apart the molecular mechanisms of EMB-4 function, we set out to identify additional protein interactors of EMB-4 using a candidate approach. Since it has been shown that the human homolog of EMB-4, AQR/IBP160, plays an important role in the loading of the exon junction complex onto the mRNA, we hypothesized that EMB-4 may be playing a similar role in the worm. I therefore tested for interactions between EMB-4 and several members of the EJC and TREX complexes in C. elegans. From these studies I identified DDX-19, a component of the C. elegans TREX complex, as an interactor of EMB-4. These data suggest a potential role for EMB-4 in the export of mRNAs out of the nucleus. As of yet, it remains unclear whether EMB-4 is also a member of the pre-EJC and EJC in C. elegans. Although I tested for interactions between EMB-4 and UAP56/HEL-1, a core member of the EJC and TREX complexes, I was unable to detect an interaction by co-IP. Since UAP56/HEL-1 is one of the first components of the EJC that are recruited to the mRNA early on during mRNA processing and remains bound to it until export has occurred and the first round of translation has initiated, one would expect that if EMB-4 were interacting with members of the EJC and TREX, UAP56/HEL-1 would be among these interactors. It is possible, as the spliceosome, EJC and TREX are large and dynamic complexes, that the two proteins are too distant to detect an interaction by co-IP. It is also possible, due to the dynamic nature of these complexes, that an interaction between the two is transient and therefore cannot be sufficiently captured by our methods. More exhaustive measures, such as more extensive mass spectrometry experiments or even Bio-ID, should be utilized in the future to further dissect interactions between EMB-4 and other members of the RNA processing machinery in the nucleus (see Future Directions) [203].

4.1.8. Summary

In summary, our data suggest an important and differential role for EMB-4 in the CSR-1 and WAGO-9/HRDE-1 small RNA pathways. I found that EMB-4, CSR-1 and WAGO-9/HRDE-1 affect the crucial process of germline chromatin resetting during embryogenesis. EMB-4 appears to affect biogenesis or stability of 22G-RNAs and their target transcripts in the CSR-1 pathway, as we observe changes in their associated small RNAs that correlate with target transcript decreases upon loss of emb-4. Furthermore, we show that EMB-4, like its human homolog AQR/IBP160, is primarily an intron-binding protein and that many of the introns EMB-4 binds are CSR-1 targets. EMB-4 associates with the exons of a limited number of transcripts, mostly WAGO-9/HRDE-1 targets. Perhaps the most exciting conclusion from our data is that EMB-4 appears to target the CSR-1 and WAGO-9/HRDE-1 small RNA pathways in different ways and appears to be playing different roles in each pathway. Finally, we found that EMB-4 interacts with the nuclear export factor DDX-19, suggesting a role for EMB-4 as a component of the C. elegans EJC and/or TREX complex.

We propose a model where EMB-4 acts co-transcriptionally on CSR-1 and WAGO-9/HRDE-1 target transcripts in the nucleus. EMB-4 may aid in the splicing of these transcripts, but may also assist in marking transcripts for appropriate export from the nucleus and into the P-granules. The P-granules are enriched for small RNA biogenesis and regulation activities, and possess factors like CSR-1, WAGO-9/HRDE-1, and the RdRPs; thus proper routing to these sites may enable the proper biogenesis of 22G-RNAs and may ultimately reinforce the proper regulation of these targets by 22G-RNA pathways. Therefore, we speculate that loss of emb-4 leads to inefficient export of transcripts, which results in a loss of a large proportion of 22G-RNAs. At the organismal level, these 22G-RNAs are necessary for proper germline chromatin resetting, germ cell specification and germline development.

4.2. Future Questions

4.2.1. How do the complexes in which EMB-4 is found with WAGO-9/HRDE-1 and CSR-1 differ?

Our data collectively point to different mechanisms of action of EMB-4 in the CSR-1 and WAGO-9/HRDE-1 small RNA pathways. Therefore, it is necessary that we understand the similarities and differences between the two EMB-4/WAGO complexes. One experiment we will perform to address this question is a sequential IP experiment. In this experiment we will use worm strains expressing FLAG::CSR-1 or FLAG::WAGO-9/HRDE-1 and use anti-FLAG antibody to pull down the FLAG::CSR-1 and FLAG::WAGO-9/HRDE-1 protein complexes. We would then use purified FLAG-peptide to elute the complexes and perform an EMB-4 IP, using the EMB-4 antibody, on the AGO protein complexes. This would eliminate the protein complexes in which the AGOs are found in the absence of EMB-4. The resulting protein complexes could then be sent to a mass spectrometry facility to determine what other proteins are found in the complexes. This would help us elucidate which proteins are common and which are different in each EMB-4/AGO complex and is likely to lead us in the direction of understanding the different functions of EMB-4 in each context.

It would also be interesting to perform IP followed by mass spectrometry on EMB-4 alone, as EMB-4 may be a member of the C. elegans spliceosome or other complexes in contexts that are not related to small RNA pathways. RNase treatment could be incorporated to understand which interactions were facilitated by RNA or existed in the absence of a transcript. These data would help inform subsequent experiments to understand the diverse molecular functions of EMB-4.

Beyond these initial experiments that are eminently feasible, given the reagents we have in hand, we could attempt more involved experiments that involve building transgenic strains, such as Bio-ID of EMB-4. In these experiments, EMB-4 would be expressed in the worm as a BirA biotin ligase fusion protein. 
After confirming that the EMB-4 fusion protein is functional via rescue experiments, we would identify EMB-4 interacting proteins via streptavidin precipitation of biotinylated partners and subsequent mass spectrometry. These methods would require a more substantial commitment to building reagents and troubleshooting these novel assays in the worm.
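Once mass spectrometry results are in hand, the comparison of the two EMB-4/AGO complexes described above reduces to set operations on the two lists of identified proteins. The following Python sketch is illustrative only: the function name and the protein identifiers (protA to protD) are hypothetical placeholders, not results from this study, and a real analysis would first filter each list against a control IP.

```python
def compare_interactors(csr1_hits, wago9_hits):
    """Partition proteins from two sequential-IP mass spec runs
    into shared and complex-specific groups."""
    csr1 = set(csr1_hits)
    wago9 = set(wago9_hits)
    return {
        "shared": sorted(csr1 & wago9),      # found in both EMB-4/AGO complexes
        "csr1_only": sorted(csr1 - wago9),   # specific to the EMB-4/CSR-1 complex
        "wago9_only": sorted(wago9 - csr1),  # specific to the EMB-4/WAGO-9/HRDE-1 complex
    }


# Hypothetical placeholder identifiers, for illustration only.
example = compare_interactors(
    ["protA", "protB", "protC"],
    ["protB", "protC", "protD"],
)
print(example)
```

Running the same partition on samples prepared with and without RNase treatment would separate RNA-dependent from RNA-independent interactors in each group.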

4.2.2. How do the functions and targets of EMB-4 and DDX-19 overlap?

As we have uncovered a new interaction between EMB-4 and the export factor DDX-19 in C. elegans, it is important to characterize this interaction in more depth. If EMB-4 is involved in the nuclear export of transcripts with DDX-19, we expect a large overlap between the transcripts bound by the two proteins. It would be useful to perform an RNA immunoprecipitation (RIP) experiment on DDX-19 and examine to what extent EMB-4 and DDX-19 share targets. Using qRT-PCR, we can test whether EMB-4 targets are enriched in the DDX-19 RIP. If we saw some overlap, or at least evidence that the DDX-19 RIP was successful, we could perform a large-scale RIP-seq experiment to gain a global view of DDX-19 targets and their overlap with EMB-4 targets. We also hypothesize that EMB-4 recruits DDX-19 and other members of the TREX complex to their target transcripts. Therefore, we expect that, upon loss of EMB-4, these factors should not be able to find their targets. We can test this hypothesis by performing RNA immunoprecipitation on DDX-19 in an emb-4(hc60) mutant background. Once again, we can use qRT-PCR to determine whether DDX-19 targets are no longer enriched in the DDX-19 RIP upon loss of emb-4. Similarly, DDX-19 localizes to P-granules, and we could test whether loss of emb-4 leads to mislocalization of DDX-19 or a loss of DDX-19 from P-granules, as our model would predict. These experiments will provide mechanistic evidence for the function of EMB-4 in the export of transcripts and will provide a novel link between co-transcriptional RNA processing and RNA export in C. elegans.
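The two quantitative readouts proposed here, enrichment of a transcript in the DDX-19 RIP by qRT-PCR and the extent of overlap between DDX-19 and EMB-4 target sets, could be computed as in the following minimal Python sketch. It rests on assumptions that are not part of this study: enrichment is calculated by the standard 2^-ΔΔCt method relative to a mock (control) IP with roughly 100% amplification efficiency, overlap significance is assessed with a one-sided hypergeometric test, and all function names and numbers are hypothetical.

```python
from math import comb


def rip_fold_enrichment(ct_ip, ct_input, ct_mock_ip, ct_mock_input):
    """Fold enrichment of a transcript in a RIP relative to a mock IP,
    by the 2^-ddCt method (assumes ~100% amplification efficiency)."""
    d_ct_ip = ct_ip - ct_input               # IP normalized to its input
    d_ct_mock = ct_mock_ip - ct_mock_input   # mock IP normalized to its input
    return 2.0 ** -(d_ct_ip - d_ct_mock)


def overlap_p_value(n_universe, n_set_a, n_set_b, n_overlap):
    """One-sided hypergeometric probability of observing at least
    n_overlap shared targets between two sets drawn from n_universe genes."""
    max_k = min(n_set_a, n_set_b)
    total = comb(n_universe, n_set_b)
    tail = sum(
        comb(n_set_a, k) * comb(n_universe - n_set_a, n_set_b - k)
        for k in range(n_overlap, max_k + 1)
    )
    return tail / total


# Hypothetical Ct values and set sizes, for illustration only.
print(rip_fold_enrichment(ct_ip=24.0, ct_input=26.0, ct_mock_ip=27.0, ct_mock_input=26.0))
print(overlap_p_value(n_universe=1000, n_set_a=100, n_set_b=80, n_overlap=30))
```

The same enrichment calculation applied to DDX-19 RIPs from wild-type and emb-4(hc60) animals would show whether target enrichment is lost in the mutant, as the recruitment model predicts.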

4.2.3. Does loss of EMB-4 lead to mislocalization of transcripts?

Our data lead us to hypothesize that EMB-4 is involved in the export of its targets out of the nucleus, possibly into P-granules or directly into the cytoplasm. If this is the case, we would expect a mislocalization of transcripts upon loss of emb-4. We could test this hypothesis by performing RNA fluorescent in situ hybridization experiments using probes against known targets of EMB-4. As EMB-4 appears to play an important role in the germline, these experiments should be performed on dissected germlines. In the wild-type background, these experiments will reveal the normal localization of EMB-4 targets in the germline: where they reside in the nucleus and which other subcellular compartments they are routed to after transcription. In parallel, we would perform the same experiment in an emb-4(hc60) mutant background to determine whether EMB-4 targets are mislocalized upon loss of EMB-4. Mislocalization of EMB-4 targets would suggest an active role for EMB-4 in the export of its targets and the regulation of their localization.

4.3. Conclusions

Collectively, the data generated from this study implicate EMB-4 in small RNA pathways and development in C. elegans. EMB-4 interacts with the C. elegans WAGOs CSR-1 and WAGO-9/HRDE-1, which play important roles in the maintenance of germline gene expression and genome integrity. I have shown that loss of csr-1 and wago-9/hrde-1 recapitulates one of the phenotypes associated with emb-4, namely a defect in germline chromatin resetting. Reciprocally, I also show that loss of emb-4 recapitulates the sterility phenotypes associated with csr-1 and wago-9/hrde-1. Perhaps the most intriguing finding provided here is the apparent discrimination between the CSR-1 and WAGO-9/HRDE-1 small RNA pathways with regard to EMB-4 targeting and function. Loss of emb-4 perturbs the two small RNA pathways in distinct ways. While CSR-1 22G-RNAs are generally depleted upon loss of emb-4, WAGO-9/HRDE-1 22G-RNAs appear to be divided into two subsets, one that is depleted and another that is enriched. This is reiterated in the changes in target transcript levels of the two pathways: CSR-1 target transcripts are uniformly downregulated, in correlation with a depletion of CSR-1 small RNAs, whereas WAGO-9/HRDE-1 targets are changed but not in one particular direction. Another major difference lies in the binding of EMB-4 to the targets of the two pathways and the effect of this binding on transcript levels. EMB-4 binds WAGO-9/HRDE-1 targets at both introns and exons, and this binding appears to be necessary for the repression of these targets. In contrast, EMB-4 binds CSR-1 targets at introns only, and it appears that binding of EMB-4 is not necessary for the emb-4-dependent misregulation of CSR-1 target transcript levels. Taken together, these data provide compelling evidence for fundamental differences in the mechanisms that govern the CSR-1 and WAGO-9/HRDE-1 small RNA pathways. Finally, I show that EMB-4 interacts with the export factor DDX-19, suggesting a possible role for EMB-4 in the nuclear export of its targets. This study represents the first molecular characterization of EMB-4 as well as the first example of the involvement of a splicing factor in small RNA pathways in the worm. Furthermore, it provides initial key insights into possible mechanisms of splicing factor activity in small RNA pathways, interactions that are likely to play conserved roles across species.

83 5. References

1.Saxe, J. & Lin, H. Small Noncoding RNAs in the Germline. Cold Spring Harbor Perspectives in Biology 3, a002717-a002717 (2011). 2.Lau, N. Small RNAs in the animal gonad: Guarding genomes and guiding development. The International Journal of Biochemistry & Cell Biology 42, 1334- 1347 (2010). 3.Finnegan, E. The small RNA world. Journal of Cell Science 116, 4689-4693 (2003). 4.Hammond, S., Bernstein, E., Beach, D. & Hannon, G. An RNA-directed nuclease mediates post-transcriptional gene silencing in Drosophila cells. Nature 404, 293- 296 (2000). 5.Gu, S. et al. Amplification of siRNA in Caenorhabditis elegans generates a transgenerational sequence-targeted histone H3 lysine 9 methylation footprint. Nature Genetics 44, 157-164 (2012). 6.Olsen, P. & Ambros, V. The lin-4 Regulatory RNA Controls Developmental Timing in Caenorhabditis elegans by Blocking LIN-14 Protein Synthesis after the Initiation of Translation. Developmental Biology 216, 671-680 (1999). 7.Kim, V., Han, J. & Siomi, M. Biogenesis of small RNAs in animals. Nature Reviews Molecular Cell Biology 10, 126-139 (2009). 8.Hutvagner, G. & Simard, M. Argonaute proteins: key players in RNA silencing. Nature Reviews Molecular Cell Biology 9, 22-32 (2008). 9.Azlan, A., Dzaki, N. & Azzam, G. Argonaute: The executor of small RNA function. Journal of Genetics and Genomics 43, 481-494 (2016). 10.Jinek, M. & Doudna, J. A three-dimensional view of the molecular machinery of RNA interference. Nature 457, 405-412 (2009). 11.Schirle, N. & MacRae, I. The Crystal Structure of Human Argonaute2. Science 336, 1037-1040 (2012). 12.Nakanishi, K., Weinberg, D., Bartel, D. & Patel, D. Structure of yeast Argonaute with guide RNA. Nature 486, 368-374 (2012). 13.Kwak, P. & Tomari, Y. The N domain of Argonaute drives duplex unwinding during RISC assembly. Nature Structural & Molecular Biology 19, 145-151 (2012). 14.Simon, B. et al. Recognition of 2′-O-Methylated 3′-End of piRNA by the PAZ Domain of a Piwi Protein. 
Structure 19, 172-180 (2011). 15.Elkayam, E. et al. The Structure of Human Argonaute-2 in Complex with miR- 20a. Cell 150, 100-110 (2012). 16.Liu, J. Argonaute2 Is the Catalytic Engine of Mammalian RNAi. Science 305, 1437- 1441 (2004).

84 17.Hauptmann, J. et al. Turning catalytically inactive human Argonaute proteins into active slicer enzymes. Nature Structural & Molecular Biology 20, 814-817 (2013). 18.Faehnle, C., Elkayam, E., Haase, A., Hannon, G. & Joshua-Tor, L. The Making of a Slicer: Activation of Human Argonaute-1. Cell Reports 3, 1901-1909 (2013). 19.Behm-Ansmant, I. mRNA degradation by miRNAs and GW182 requires both CCR4:NOT deadenylase and DCP1:DCP2 decapping complexes. Genes & Development 20, 1885-1898 (2006). 20.Lee, Y. et al. MicroRNA genes are transcribed by RNA polymerase II. The EMBO Journal 23, 4051-4060 (2004). 21.Borchert, G., Lanier, W. & Davidson, B. RNA polymerase III transcribes human microRNAs. Nat Struct Mol Biol 13, 1097-1101 (2006). 22.Kim, V. MicroRNA biogenesis: coordinated cropping and dicing. Nature Reviews Molecular Cell Biology 6, 376-385 (2005). 23.Lee, Y. et al. The nuclear RNase III Drosha initiates microRNA processing. Nature 425, 415-419 (2003). 24.Lee, Y. et al. Distinct Roles for Drosophila Dicer-1 and Dicer-2 in the siRNA/miRNA Silencing Pathways. Cell 117, 69-81 (2004). 25.Ketting, R. Dicer functions in RNA interference and in synthesis of small RNA involved in developmental timing in C. elegans. Genes & Development 15, 2654- 2659 (2001). 26.Jiang, F. Dicer-1 and R3D1-L catalyze microRNA maturation in Drosophila. Genes & Development19, 1674-1679 (2005). 27.Nishihara, T., Zekri, L., Braun, J. & Izaurralde, E. miRISC recruits decapping factors to miRNA targets to enhance their degradation. Nucleic Acids Research 41, 8692- 8705 (2013). 28.Djuranovic, S., Nahvi, A. & Green, R. miRNA-Mediated Gene Silencing by Translational Repression Followed by mRNA Deadenylation and Decay. Science 336, 237-240 (2012). 29.Aravin, A. et al. Double-stranded RNA-mediated silencing of genomic tandem repeats and transposable elements in the D. melanogaster germline. Current Biology 11, 1017-1027 (2001). 30.Aravin, A. et al. 
The Small RNA Profile during Drosophila melanogaster Development. Developmental Cell 5, 337-350 (2003). 31.Kalmykova, A. Argonaute protein PIWI controls mobilization of retrotransposons in the Drosophila male germline. Nucleic Acids Research 33, 2052-2059 (2005). 32.Olovnikov, I. & Kalmykova, A. piRNA clusters as a main source of small RNAs in the animal germline. Biochemistry (Moscow) 78, 572-584 (2013).

85 33.Vagin, V. A Distinct Small RNA Pathway Silences Selfish Genetic Elements in the Germline. Science 313, 320-324 (2006). 34.Wang, X. et al. RNA Interference Directs Innate Immunity Against Viruses in Adult Drosophila. Science 312, 452-454 (2006). 35.van Rij, R. et al. The RNA silencing endonuclease Argonaute 2 mediates specific antiviral immunity in Drosophila melanogaster. Genes & Development 20, 2985- 2995 (2006). 36.Birchler, J. Ubiquitous RNA-dependent RNA polymerase and gene silencing. Genome Biol 10, 243 (2009). 37.Duan, G. et al. C. elegans RNA-dependent RNA polymerases rrf-1 and ego-1 silence Drosophila transgenes by differing mechanisms. Cellular and Molecular Life Sciences 70, 1469-1481 (2012). 38.Wedeles, C., Wu, M. & Claycomb, J. A multitasking Argonaute: exploring the many facets of C. elegans CSR-1. Chromosome Research 21, 573-586 (2013). 39.Ambros, V., Lee, R., Lavanway, A., Williams, P. & Jewell, D. MicroRNAs and Other Tiny Endogenous RNAs in C. elegans. Current Biology 13, 807-818 (2003). 40.Billi, A. Endogenous RNAi pathways in C. elegans. WormBook 1-49 (2014). doi:10.1895/wormbook.1.170.1 41.Ambros, V. MicroRNAs and developmental timing. Current Opinion in Genetics & Development 21, 511-517 (2011). 42.Hayes, G., Frand, A. & Ruvkun, G. The mir-84 and let-7 paralogous microRNA genes of Caenorhabditis elegans direct the cessation of molting via the conserved nuclear hormone receptors NHR-23 and NHR-25. Development 133, 4631-4641 (2006). 43.Olsson-Carter, K. & Slack, F. A Developmental Timing Switch Promotes Axon Outgrowth Independent of Known Guidance Receptors. PLoS Genetics 6, e1001054 (2010). 44.Alvarez-Saavedra, E. & Horvitz, H. Many Families of C. elegans MicroRNAs Are Not Essential for Development or Viability. Current Biology 20, 367-373 (2010). 45.Miska, E. et al. Most Caenorhabditis elegans microRNAs Are Individually Not Essential for Development or Viability. PLoS Genetics 3, e215 (2007). 46.Shaw, W., Armisen, J., Lehrbach, N. 
& Miska, E. The Conserved miR-51 microRNA Family Is Redundantly Required for Embryonic Development and Pharynx Attachment in Caenorhabditis elegans. Genetics 185, 897-905 (2010). 47.Kato, M. et al. The mir-34 microRNA is required for the DNA damage response in vivo in C. elegans and in vitro in human breast cancer cells. Oncogene 28, 3008- 3008 (2009).

86 48.Boulias, K. & Horvitz, H. The C. elegans MicroRNA mir-71 Acts in Neurons to Promote Germline-Mediated Longevity through Regulation of DAF-16/FOXO. Cell Metabolism 15, 439-450 (2012). 49.de Lencastre, A. et al. MicroRNAs Both Promote and Antagonize Longevity in C. elegans. Current Biology 20, 2159-2168 (2010). 50.Lee, Y. et al. The nuclear RNase III Drosha initiates microRNA processing. Nature 425, 415-419 (2003). 51.Denli, A., Tops, B., Plasterk, R., Ketting, R. & Hannon, G. Processing of primary microRNAs by the Microprocessor complex. Nature 432, 231-235 (2004). 52.Zinovyeva, A., Veksler-Lublinsky, I., Vashisht, A., Wohlschlegel, J. & Ambros, V. Caenorhabditis elegans ALG-1 antimorphic mutations uncover functions for Argonaute in microRNA guide strand selection and passenger strand disposal. Proceedings of the National Academy of Sciences 112, E5271-E5280 (2015). 53.Bartel, D. MicroRNAs: Target Recognition and Regulatory Functions. Cell 136, 215- 233 (2009). 54.Lee, H. et al. C. elegans piRNAs Mediate the Genome-wide Surveillance of Germline Transcripts. Cell 150, 78-87 (2012). 55.Shirayama, M. et al. piRNAs Initiate an Epigenetic Memory of Nonself RNA in the C. elegans Germline. Cell 150, 65-77 (2012). 56.Ashe, A. et al. piRNAs Can Trigger a Multigenerational Epigenetic Memory in the Germline of C. elegans. Cell 150, 88-99 (2012). 57.Batista, P. et al. PRG-1 and 21U-RNAs Interact to Form the piRNA Complex Required for Fertility in C. elegans. Molecular Cell 31, 67-78 (2008). 58.Ruby, J. et al. Large-Scale Sequencing Reveals 21U-RNAs and Additional MicroRNAs and Endogenous siRNAs in C. elegans. Cell 127, 1193-1207 (2006). 59.Gu, W. et al. CapSeq and CIP-TAP Identify Pol II Start Sites and Reveal Capped Small RNAs as C. elegans piRNA Precursors. Cell 151, 1488-1500 (2012). 60.Montgomery, T. et al. PIWI Associated siRNAs and piRNAs Specifically Require the Caenorhabditis elegans HEN1 Ortholog henn-1. PLoS Genetics 8, e1002616 (2012). 61.Das, P. et al. 
Piwi and piRNAs Act Upstream of an Endogenous siRNA Pathway to Suppress Tc3 Transposon Mobility in the Caenorhabditis elegans Germline. Molecular Cell 31, 79-90 (2008). 62.Bagijn, M. et al. Function, Targets, and Evolution of Caenorhabditis elegans piRNAs. Science 337, 574-578 (2012).

87 63.Billi, A., Freeberg, M. & Kim, J. piRNAs and siRNAs collaborate in Caenorhabditis elegans genome defense. Genome Biol 13, 164 (2012). 64.Han, T. et al. 26G endo-siRNAs regulate spermatogenic and zygotic gene expression in Caenorhabditis elegans. Proceedings of the National Academy of Sciences 106, 18674-18679 (2009). 65.Conine, C. et al. Argonautes ALG-3 and ALG-4 are required for spermatogenesis- specific 26G-RNAs and thermotolerant sperm in Caenorhabditis elegans. Proceedings of the National Academy of Sciences 107, 3588-3593 (2010). 66.Vasale, J. et al. Sequential rounds of RNA-dependent RNA transcription drive endogenous small-RNA biogenesis in the ERGO-1/Argonaute pathway. Proceedings of the National Academy of Sciences 107, 3582-3587 (2010). 67.Billi, A. et al. The Caenorhabditis elegans HEN1 Ortholog, HENN-1, Methylates and Stabilizes Select Subclasses of Germline Small RNAs. PLoS Genetics 8, e1002617 (2012). 68.LEE, R. Interacting endogenous and exogenous RNAi pathways in Caenorhabditis elegans. RNA 12, 589-597 (2006). 69.Gu, W. et al. Distinct Argonaute-Mediated 22G-RNA Pathways Direct Genome Surveillance in the C. elegans Germline. Molecular Cell 36, 231-244 (2009). 70.Aoki, K., Moriguchi, H., Yoshioka, T., Okawa, K. & Tabara, H. In vitro analyses of the production and activity of secondary small interfering RNAs in C. elegans. The EMBO Journal 26, 5007-5019 (2007). 71.Claycomb, J. et al. The Argonaute CSR-1 and Its 22G-RNA Cofactors Are Required for Holocentric Chromosome Segregation. Cell 139, 123-134 (2009). 72.She, X., Xu, X., Fedotov, A., Kelly, W. & Maine, E. Regulation of Heterochromatin Assembly on Unpaired Chromosomes during Caenorhabditis elegans Meiosis by Components of a Small RNA-Mediated Pathway. PLoS Genetics 5, e1000624 (2009). 73.Yigit, E. et al. Analysis of the C. elegans Argonaute Family Reveals that Distinct Argonautes Act Sequentially during RNAi. Cell 127, 747-757 (2006). 74.Wedeles, C., Wu, M. & Claycomb, J. 
Protection of Germline Gene Expression by the C. elegans Argonaute CSR-1. Developmental Cell 27, 664-671 (2013). 75.Seth, M. et al. The C. elegans CSR-1 Argonaute Pathway Counteracts Epigenetic Silencing to Promote Germline Gene Expression. Developmental Cell 27, 656-663 (2013). 76.Maniar, J. & Fire, A. EGO-1, a C. elegans RdRP, Modulates Gene Expression via Production of mRNA-Templated Short Antisense RNAs. Current Biology 21, 449- 459 (2011).

88 77.Ni, J., Chen, E. & Gu, S. Complex coding of endogenous siRNA, transcriptional silencing and H3K9 methylation on native targets of germline nuclear RNAi in C. elegans. BMC Genomics 15, 1157 (2014). 78.Bagga, S. et al. Regulation by let-7 and lin-4 miRNAs Results in Target mRNA Degradation. Cell122, 553-563 (2005). 79.Liu, J. Argonaute2 Is the Catalytic Engine of Mammalian RNAi. Science 305, 1437- 1441 (2004). 80.Rivas, F. et al. Purified Argonaute2 and an siRNA form recombinant human RISC. Nat Struct Mol Biol 12, 340-349 (2005). 81.Braun, J. et al. A direct interaction between DCP1 and XRN1 couples mRNA decapping to 5′ exonucleolytic degradation. Nature Structural & Molecular Biology 19, 1324-1331 (2012). 82.Humphreys, D., Westman, B., Martin, D. & Preiss, T. MicroRNAs control translation initiation by inhibiting eukaryotic initiation factor 4E/cap and poly(A) tail function. Proceedings of the National Academy of Sciences 102, 16961-16966 (2005). 83.Jackson, R., Hellen, C. & Pestova, T. The mechanism of eukaryotic translation initiation and principles of its regulation. Nature Reviews Molecular Cell Biology 11, 113-127 (2010). 84.Chendrimada, T. et al. MicroRNA silencing through RISC recruitment of eIF6. Nature 447, 823-828 (2007). 85.Castel, S. & Martienssen, R. RNA interference in the nucleus: roles for small RNAs in transcription, epigenetics and beyond. Nature Reviews Genetics 14, 100-112 (2013). 86.Madison-Villar, M., Sun, C., Lau, N., Settles, M. & Mueller, R. Small RNAs from a Big Genome: The piRNA Pathway and Transposable Elements in the Salamander Species Desmognathus fuscus. Journal of Molecular Evolution 83, 126-136 (2016). 87.Ito, H. Small RNAs and regulation of transposons in plants. Genes Genet. Syst. 88, 3-7 (2013). 88.Heard, E. & Martienssen, R. Transgenerational Epigenetic Inheritance: Myths and Mechanisms. Cell157, 95-109 (2014). 89.Colmenares, S., Buker, S., Buhler, M., Dlakić, M. & Moazed, D. 
Coupling of Double- Stranded RNA Synthesis and siRNA Generation in Fission Yeast RNAi. Molecular Cell 27, 449-461 (2007). 90.Motamedi, M. et al. Two RNAi Complexes, RITS and RDRC, Physically Interact and Localize to Noncoding Centromeric RNAs. Cell 119, 789-802 (2004).

89 91.Verdel, A. RNAi-Mediated Targeting of Heterochromatin by the RITS Complex. Science 303, 672-676 (2004). 92.Irvine, D. Argonaute Slicing Is Required for Heterochromatic Silencing and Spreading. Science 313, 1134-1137 (2006). 93.Zhang, K., Mosch, K., Fischle, W. & Grewal, S. Roles of the Clr4 methyltransferase complex in nucleation, spreading and maintenance of heterochromatin. Nat Struct Mol Biol 15, 381-388 (2008). 94.Djupedal, I. RNA Pol II subunit Rpb7 promotes centromeric transcription and RNAi- directed chromatin silencing. Genes & Development 19, 2301-2306 (2005). 95.Mette, M., Aufsatz, W., van der Winden, J., Matzke, M. & Matzke, A. Transcriptional silencing and promoter methylation triggered by double-stranded RNA. The EMBO Journal 19, 5194-5201 (2000). 96.He, X., Ma, Z. & Liu, Z. Non-Coding RNA Transcription and RNA-Directed DNA Methylation in Arabidopsis. Molecular Plant 7, 1406-1414 (2014). 97.Law, J., Vashisht, A., Wohlschlegel, J. & Jacobsen, S. SHH1, a Homeodomain Protein Required for DNA Methylation, As Well As RDR2, RDM4, and Chromatin Remodeling Factors, Associate with RNA Polymerase IV. PLoS Genetics 7, e1002195 (2011). 98.Kasschau, K. et al. Genome-Wide Profiling and Analysis of Arabidopsis siRNAs. PLoS Biology 5, e57 (2007). 99.Zilberman, D. ARGONAUTE4 Control of Locus-Specific siRNA Accumulation and DNA and Histone Methylation. Science 299, 716-719 (2003). 100.Wierzbicki, A., Ream, T., Haag, J. & Pikaard, C. RNA polymerase V transcription guides ARGONAUTE4 to chromatin. Nature Genetics 41, 630-634 (2009). 101.He, X. et al. An Effector of RNA-Directed DNA Methylation in Arabidopsis Is an ARGONAUTE 4- and RNA-Binding Protein. Cell 137, 498-508 (2009). 102.Gao, Z. et al. An RNA polymerase II- and AGO4-associated protein acts in RNA- directed DNA methylation. Nature 465, 106-109 (2010). 103.Saito, K. Specific association of Piwi with rasiRNAs derived from retrotransposon and heterochromatic regions in the Drosophila genome. 
Genes & Development 20, 2214-2222 (2006). 104.Gunawardane, L. et al. A Slicer-Mediated Mechanism for Repeat-Associated siRNA 5' End Formation in Drosophila. Science 315, 1587-1590 (2007). 105.Nishida, K. et al. Gene silencing mechanisms mediated by Aubergine piRNA complexes in Drosophila male gonad. RNA 13, 1911-1922 (2007). 106.Gunawardane, L. et al. A Slicer-Mediated Mechanism for Repeat-Associated siRNA 5' End Formation in Drosophila. Science 315, 1587-1590 (2007).

90 107.Brennecke, J. et al. Discrete Small RNA-Generating Loci as Master Regulators of Transposon Activity in Drosophila. Cell 128, 1089-1103 (2007). 108.Czech, B. & Hannon, G. One Loop to Rule Them All: The Ping-Pong Cycle and piRNA-Guided Silencing. Trends in Biochemical Sciences 41, 324-337 (2016). 109.Li, C. et al. Collapse of Germline piRNAs in the Absence of Argonaute3 Reveals Somatic piRNAs in Flies. Cell 137, 509-521 (2009). 110.Malone, C. et al. Specialized piRNA Pathways Act in Germline and Somatic Tissues of the Drosophila Ovary. Cell 137, 522-535 (2009). 111.Wang, S. & Elgin, S. Drosophila Piwi functions downstream of piRNA production mediating a chromatin-based transposon silencing mechanism in female germ line. Proceedings of the National Academy of Sciences 108, 21164-21169 (2011). 112.Brower-Toland, B. et al. Drosophila PIWI associates with chromatin and interacts directly with HP1a. Genes & Development 21, 2300-2311 (2007). 113.Klattenhoff, C. et al. The Drosophila HP1 Homolog Rhino Is Required for Transposon Silencing and piRNA Production by Dual-Strand Clusters. Cell 138, 1137-1149 (2009). 114.Guang, S. et al. An Argonaute Transports siRNAs from the Cytoplasm to the Nucleus. Science 321, 537-541 (2008). 115.Buckley, B. et al. A nuclear Argonaute promotes multigenerational epigenetic inheritance and germline immortality. Nature 489, 447-451 (2012). 116.Cecere, G., Hoersch, S., O'Keeffe, S., Sachidanandam, R. & Grishok, A. Global effects of the CSR-1 RNA interference pathway on the transcriptional landscape. Nature Structural & Molecular Biology 21, 358-365 (2014). 117.Ashe, A. et al. piRNAs Can Trigger a Multigenerational Epigenetic Memory in the Germline of C. elegans. Cell 150, 88-99 (2012). 118.Burton, N., Burkhart, K. & Kennedy, S. Nuclear RNAi maintains heritable gene silencing in Caenorhabditis elegans. Proceedings of the National Academy of Sciences 108, 19683-19688 (2011). 119.Berget, S., Moore, C. & Sharp, P. 
Spliced segments at the 5′ terminus of adenovirus 2 late mRNA. Proceedings of the National Academy of Sciences 74, 3171-3175 (1977). 120.Chow, L., Gelinas, R., Broker, T. & Roberts, R. An amazing sequence arrangement at the 5′ ends of adenovirus 2 messenger RNA. Cell 12, 1-8 (1977). 121.Will, C. & Luhrmann, R. Spliceosome Structure and Function. Cold Spring Harbor Perspectives in Biology 3, a003707-a003707 (2010). 122.Kramer, A. The Structure and Function of Proteins Involved in Mammalian Pre- mRNA Splicing. Annual Review of Biochemistry 65, 367-409 (1996).

91 123.Kroiss, M. et al. Evolution of an RNP assembly system: A minimal SMN complex facilitates formation of UsnRNPs in Drosophila melanogaster. Proceedings of the National Academy of Sciences 105, 10045-10050 (2008). 124.Leung, A., Nagai, K. & Li, J. Structure of the spliceosomal U4 snRNP core domain and its implication for snRNP biogenesis. Nature 473, 536-539 (2011). 125.Kambach, C. et al. Crystal Structures of Two Sm Protein Complexes and Their Implications for the Assembly of the Spliceosomal snRNPs. Cell 96, 375-387 (1999). 126.Zhang, R. et al. Structure of a Key Intermediate of the SMN Complex Reveals Gemin2's Crucial Function in snRNP Assembly. Cell 146, 384-395 (2011). 127.Bayne, E. et al. Splicing Factors Facilitate RNAi-Directed Silencing in Fission Yeast. Science 322, 602-606 (2008). 128.Bartels, C. The ribosomal translocase homologue Snu114p is involved in unwinding U4/U6 RNA during activation of the spliceosome. EMBO Reports 3, 875-880 (2002). 129.Lockhart, S. & Rymond, B. Commitment of yeast pre-mRNA to the splicing pathway requires a novel U1 small nuclear ribonucleoprotein polypeptide, Prp39p. Molecular and Cellular Biology14, 3623-3633 (1994). 130.Herr, A., Molnar, A., Jones, A. & Baulcombe, D. Defective RNA processing enhances RNA silencing and influences flowering of Arabidopsis. Proceedings of the National Academy of Sciences 103, 14994-15001 (2006). 131.Kim, J. Functional Genomic Analysis of RNA Interference in C. elegans. Science 308, 1164-1167 (2005). 132.Parry, D., Xu, J. & Ruvkun, G. A Whole-Genome RNAi Screen for C. elegans miRNA Pathway Genes. Current Biology 17, 2013-2022 (2007). 133.Horowitz, D. & Abelson, J. Stages in the second reaction of pre-mRNA splicing: the final step is ATP independent. Genes & Development 7, 320-329 (1993). 134.Tabach, Y. et al. Identification of small RNA pathway genes using patterns of phylogenetic conservation and divergence. Nature 493, 694-698 (2012). 135.Zhou, R. et al. 
Comparative Analysis of Argonaute-Dependent Small RNA Pathways in Drosophila. Molecular Cell 32, 592-599 (2008). 136.Xiong, X. et al. Core small nuclear ribonucleoprotein particle splicing factor SmD1 modulates RNA interference in Drosophila. Proceedings of the National Academy of Sciences 110, 16520-16525 (2013). 137.Le Hir, H. The spliceosome deposits multiple proteins 20-24 nucleotides upstream of mRNA exon-exon junctions. The EMBO Journal 19, 6860-6869 (2000). 138.Ballut, L. et al. The exon junction core complex is locked onto RNA by inhibition of eIF4AIII ATPase activity. Nat Struct Mol Biol 12, 861-869 (2005).

92 139.TANGE, T. Biochemical analysis of the EJC reveals two new factors and a stable tetrameric protein core. RNA 11, 1869-1883 (2005). 140.Fribourg, S., Gatfield, D., Izaurralde, E. & Conti, E. A novel mode of RBD-protein recognition in the Y14–Mago complex. Nature Structural Biology 10, 433-439 (2003). 141.Singh, K., Wachsmuth, L., Kulozik, A. & Gehring, N. Two mammalian MAGOH genes contribute to exon junction complex composition and nonsense-mediated decay. RNA Biology 10, 1291-1298 (2013). 142.Kataoka, N. et al. Pre-mRNA Splicing Imprints mRNA in the Nucleus with a Novel RNA-Binding Protein that Persists in the Cytoplasm. Molecular Cell 6, 673-682 (2000). 143.Le Hir, H. The exon-exon junction complex provides a binding platform for factors involved in mRNA export and nonsense-mediated mRNA decay. The EMBO Journal 20, 4987-4997 (2001). 144.Dostie, J. & Dreyfuss, G. Translation Is Required to Remove Y14 from mRNAs in the Cytoplasm. Current Biology 12, 1060-1067 (2002). 145.Luo, M. et al. Pre-mRNA splicing and mRNA export linked by direct interactions between UAP56 and Aly. Nature 413, 644-647 (2001). 146.Bessonov, S., Anokhina, M., Will, C., Urlaub, H. & Lührmann, R. Isolation of an active step I spliceosome and composition of its RNP core. Nature 452, 846-850 (2008). 147.Reichert, V. 5' exon interactions within the human spliceosome establish a framework for exon junction complex structure and assembly. Genes & Development 16, 2778-2791 (2002). 148.Makarov, E. Small Nuclear Ribonucleoprotein Remodeling During Catalytic Activation of the Spliceosome. Science 298, 2205-2208 (2002). 149.Zhang, Z. & Krainer, A. Splicing remodels messenger ribonucleoprotein architecture via eIF4A3-dependent and -independent recruitment of exon junction complex components. Proceedings of the National Academy of Sciences 104, 11574-11579 (2007). 150.Ideue, T., Sasaki, Y., Hagiwara, M. & Hirose, T. 
Introns play an essential role in splicing-dependent formation of the exon junction complex. Genes & Development 21, 1993-1998 (2007). 151.De, I. et al. The RNA helicase Aquarius exhibits structural adaptations mediating its recruitment to . Nature Structural & Molecular Biology 22, 138-144 (2015).

93 152.Gehring, N., Lamprinaki, S., Hentze, M. & Kulozik, A. The Hierarchy of Exon- Junction Complex Assembly by the Spliceosome Explains Key Features of Mammalian Nonsense-Mediated mRNA Decay. PLoS Biology 7, e1000120 (2009). 153.Gatfield, D. et al. The DExH/D box protein HEL/UAP56 is essential for mRNA nuclear export in Drosophila. Current Biology 11, 1716-1721 (2001). 154.Zhou, Z. et al. The protein Aly links pre-messenger- RNA splicing to nuclear export in metazoans. Nature 407, 401-405 (2000). 155.Chi, B. et al. Aly and THO are required for assembly of the human TREX complex and association of TREX components with the spliced mRNA. Nucleic Acids Research 41, 1294-1306 (2012). 156.Viphakone, N. et al. TREX exposes the RNA-binding domain of Nxf1 to enable mRNA export. Nature Communications 3, 1006 (2012). 157.Zhang, F. et al. UAP56 Couples piRNA Clusters to the Perinuclear Transposon Silencing Machinery. Cell 151, 871-884 (2012). 158.Francisco-Mangilet, A. et al. THO2, a core member of the THO/TREX complex, is required for microRNA production in Arabidopsis. The Plant Journal 82, 1018-1029 (2015). 159.Furumizu, C., Tsukaya, H. & Komeda, Y. Characterization of EMU, the Arabidopsis homolog of the yeast THO complex member HPR1. RNA 16, 1809-1817 (2010). 160.Pazdernik, N. & Schedl, T. Introduction to Germ Cell Development in C. elegans. Advances in Experimental Medicine and Biology 1-16 (2013). 161.Mello, C. et al. The PIE-1 protein and germline specification in C. elegans embryos. Nature 382, 710-712 (1996). 162.Batchelder, C. et al. Transcriptional repression by the Caenorhabditis elegans germ-line protein PIE-1. Genes & Development 13, 202-212 (1999). 163.Shin, T. & Mello, C. Chromatin regulation during C. elegans germline development. Current Opinion in Genetics & Development 13, 455-462 (2003). 164.Saffman, E. & Lasko, P. Germline development in vertebrates and invertebrates. Cellular and Molecular Life Sciences 55, 1141 (1999). 
165.Bender, L., Cao, R., Zhang, Y. & Strome, S. The MES-2/MES-3/MES-6 Complex and Regulation of Histone H3 Methylation in C. elegans. Current Biology 14, 1639- 1643 (2004). 166.Schaner, C., Deshpande, G., Schedl, P. & Kelly, W. A Conserved Chromatin Architecture Marks and Maintains the Restricted Germ Cell Lineage in Worms and Flies. Developmental Cell 5, 747-757 (2003).

167.Hird, S., Paulsen, J. & Strome, S. Segregation of germ granules in living Caenorhabditis elegans embryos: cell-type-specific mechanisms for cytoplasmic localisation. Development 122, 1303-1312 (1996).

168.Cheeks, R. et al. C. elegans PAR Proteins Function by Mobilizing and Stabilizing Asymmetrically Localized Protein Complexes. Current Biology 14, 851-862 (2004).

169.Kawasaki, I. et al. PGL-1, a Predicted RNA-Binding Component of Germ Granules, Is Essential for Fertility in C. elegans. Cell 94, 635-645 (1998).

170.Kawasaki, I. The PGL Family Proteins Associate With Germ Granules and Function Redundantly in Caenorhabditis elegans Germline Development. Genetics 167, 645-661 (2004).

171.Gruidl, M. et al. Multiple potential germ-line helicases are components of the germ-line-specific P granules of Caenorhabditis elegans. Proceedings of the National Academy of Sciences 93, 13837-13842 (1996).

172.Wang, G. & Reinke, V. A C. elegans Piwi, PRG-1, Regulates 21U-RNAs during Spermatogenesis. Current Biology 18, 861-867 (2008).

173.Katic, I. & Greenwald, I. EMB-4: A Predicted ATPase That Facilitates lin-12 Activity in Caenorhabditis elegans. Genetics 174, 1907-1915 (2006).

174.Checchi, P. & Kelly, W. emb-4 Is a Conserved Gene Required for Efficient Germline-Specific Chromatin Remodeling During Caenorhabditis elegans Embryogenesis. Genetics 174, 1895-1906 (2006).

175.Shiimori, M., Inoue, K. & Sakamoto, H. A Specific Set of Exon Junction Complex Subunits Is Required for the Nuclear Retention of Unspliced RNAs in Caenorhabditis elegans. Molecular and Cellular Biology 33, 444-456 (2012).

176.Bayne, E. et al. A systematic genetic screen identifies new factors influencing centromeric heterochromatin integrity in fission yeast. Genome Biol 15, 481 (2014).

177.Ohi, M. et al. Proteomics Analysis Reveals Stable Multiprotein Complexes in Both Fission and Budding Yeasts Containing Myb-Related Cdc5p/Cef1p, Novel Pre-mRNA Splicing Factors, and snRNAs. Molecular and Cellular Biology 22, 2011-2024 (2002).

178.Brenner, S. The genetics of Caenorhabditis elegans. Genetics 77, 71-94 (1974).

179.Timmons, L., Court, D. & Fire, A. Ingestion of bacterially expressed dsRNAs can produce specific and potent genetic interference in Caenorhabditis elegans. Gene 263, 103-112 (2001).

180.Andrews, S. FastQC: a quality control tool for high throughput sequence data. Available online at: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/ (2010).

181.Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17, 10 (2011).

182.Liao, Y., Smyth, G. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923-930 (2013).

183.Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15-21 (2012).

184.Quinlan, A. & Hall, I. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841-842 (2010).

185.Tusher, V., Tibshirani, R. & Chu, G. Significance analysis of microarrays applied to the ionizing radiation response. Proceedings of the National Academy of Sciences 98, 5116-5121 (2001).

186.Claycomb, J., MacAlpine, D., Evans, J., Bell, S. & Orr-Weaver, T. Visualization of replication initiation and elongation in Drosophila. J Cell Biol 159, 225-236 (2002).

187.Crittenden, S. et al. A conserved RNA-binding protein controls germline stem cells in Caenorhabditis elegans. Nature 417, 660-663 (2002).

188.Eckmann, C. GLD-3 and Control of the Mitosis/Meiosis Decision in the Germline of Caenorhabditis elegans. Genetics 168, 147-160 (2004).

189.Hansen, D., Albert Hubbard, E. & Schedl, T. Multi-pathway control of the proliferation versus meiotic development decision in the Caenorhabditis elegans germline. Developmental Biology 268, 342-357 (2004).

190.Dernburg, A. et al. Meiotic Recombination in C. elegans Initiates by a Conserved Mechanism and Is Dispensable for Homologous Chromosome Synapsis. Cell 94, 387-398 (1998).

191.Woglar, A. & Jantsch, V. Chromosome movement in meiosis I prophase of Caenorhabditis elegans. Chromosoma 123, 15-24 (2013).

192.Greenstein, D. Control of oocyte meiotic maturation and fertilization. WormBook (2005). doi:10.1895/wormbook.1.53.1

193.Hall, L., Smith, K., Byron, M. & Lawrence, J. Molecular anatomy of a speckle. The Anatomical Record Part A: Discoveries in Molecular, Cellular, and Evolutionary Biology 288A, 664-675 (2006).

194.Spector, D. & Lamond, A. Nuclear Speckles. Cold Spring Harbor Perspectives in Biology 3, a000646-a000646 (2010).

195.Reeder, R. rRNA synthesis in the nucleolus. Trends in Genetics 6, 390-394 (1990).

196.Prasad, A., Croydon-Sugarman, M., Murray, R. & Cutter, A. Temperature-dependent fecundity associates with latitude in Caenorhabditis briggsae. Evolution 65, 52-63 (2010).

197.Kao, H. & Siliciano, P. Identification of Prp40, a novel essential yeast splicing factor associated with the U1 small nuclear ribonucleoprotein particle. Molecular and Cellular Biology 16, 960-967 (1996).

198.MacMorris, M. UAP56 levels affect viability and mRNA export in Caenorhabditis elegans. RNA 9, 847-857 (2003).

199.Feng, X. & Guang, S. Non-coding RNAs mediate the rearrangements of genomic DNA in ciliates. Science China Life Sciences 56, 937-943 (2013).

200.Mochizuki, K. Developmentally programmed, RNA-directed genome rearrangement in Tetrahymena. Development, Growth & Differentiation 54, 108-119 (2011).

201.Tu, S. et al. Comparative functional characterization of the CSR-1 22G-RNA pathway in Caenorhabditis nematodes. Nucleic Acids Research 43, 208-224 (2014).

202.Sheth, U., Pitt, J., Dennis, S. & Priess, J. Perinuclear P granules are the principal sites of mRNA export in adult C. elegans germ cells. Development 137, 1305-1314 (2010).

203.Schulze, W. & Mann, M. A Novel Proteomic Screen for Peptide-Protein Interactions. Journal of Biological Chemistry 279, 10756-10764 (2003).

204.Langmead, B. & Salzberg, S. Fast gapped-read alignment with Bowtie 2. Nature Methods 9, 357-359 (2012).

205.Love, M., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15, (2014).

206.Youngman, E. & Claycomb, J. From early lessons to new frontiers: the worm as a treasure trove of small RNA biology. Frontiers in Genetics 5, (2014).

207.Shen, S. et al. rMATS: Robust and flexible detection of differential alternative splicing from replicate RNA-Seq data. Proceedings of the National Academy of Sciences 111, E5593-E5601 (2014).

208.Ortiz, M., Noble, D., Sorokin, E. & Kimble, J. A New Dataset of Spermatogenic vs. Oogenic Transcriptomes in the Nematode Caenorhabditis elegans. G3: Genes|Genomes|Genetics 4, 1765-1772 (2014).
