University of Connecticut OpenCommons@UConn

Doctoral Dissertations University of Connecticut Graduate School

3-28-2019 Modeling Microcephaly Caused by Inactivation of the Minor Using a U11 Conditional Knockout Mouse Mary Baumgartner University of Connecticut - Storrs, [email protected]

Follow this and additional works at: https://opencommons.uconn.edu/dissertations

Recommended Citation Baumgartner, Mary, "Modeling Microcephaly Caused by Inactivation of the Using a U11 Conditional Knockout Mouse" (2019). Doctoral Dissertations. 2076. https://opencommons.uconn.edu/dissertations/2076 ABSTRACT Modeling Microcephaly Caused by Inactivation of the Minor Spliceosome Using a U11 Conditional Knockout Mouse Mary Baumgartner, Ph.D University of Connecticut, 2019

The minor spliceosome is one of two pre-mRNA splicing machineries required for -coding expression. In multiple eukaryotic lineages, the minor spliceosome is required to remove <1% of introns (minor introns) from the pre-mRNA transcripts of minor -containing (MIGs). Despite the few minor introns in the genome, disruption of minor splicing impairs development in numerous species. In , minor spliceosome mutations are linked numerous developmental diseases that impact central nervous system (CNS) development. However, the role of minor splicing and MIG expression in mammalian CNS development and function remains unexplored. To address this, I characterized the expression of the minor spliceosome-specific small nuclear RNA (snRNA) components and MIGs in the developing mouse embryo and retina. This approach revealed enriched expression of the minor spliceosome-specific snRNAs in the developing mouse CNS and limb buds, and in differentiating retinal neurons. I next sought to inactivate the minor spliceosome in the developing mouse cortex (pallium) by ablating Rnu11, which encodes the minor spliceosome-specific U11 snRNA. In the Rnu11 conditional knockout (cKO) pallium, U11-null radial glial cells (RGCs) displayed DNA damage, accumulation, and cycle defects, culminating in self-amplifying RGC death and microcephaly. Conversely, neurons appeared to survive postnatally. These results suggested that MIG expression is essential for cycling cell survival, but is not immediately necessary for neuron survival. To investigate this, I leveraged publications identifying genes essential for cycling cell survival. MIGs were significantly enriched in the essential gene lists, underscoring their importance for cycling cell survival. In parallel, I explored how the developmental defects in the Rnu11 cKO mice affected their behavior in adulthood. The Rnu11 mutant mice displayed heightened anxiety and significant social and motor behavior impairments. Unexpectedly, Rnu11 heterozygous mice showed reduced sociability and enhanced whole-body motor performance. These findings suggest that U11 differentially impacts brain development/function, and identify the first phenotype associated with U11 haploinsufficiency. In all, these findings reveal the roles minor splicing plays in mammalian CNS development/function, laying the foundation for studies of disease pathogenesis and for research linking minor splicing to progenitor cell behavior, cell type- specific survival, and mouse behavior.

i

Modeling Microcephaly Caused by Inactivation of the Minor Spliceosome Using a U11 Conditional Knockout Mouse

Mary Baumgartner

B.A., Mount Holyoke College, 2013 M.S., University of Connecticut, 2016

A Dissertation Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy at the University of Connecticut

2019

ii

APPROVAL PAGE

Doctor of Philosophy Dissertation

Modeling Microcephaly Caused by Inactivation of the Minor Spliceosome Using a U11 Conditional Knockout Mouse

Presented by

Mary Baumgartner, M.S.

Major Advisor: ______

Rahul Kanadia, Ph.D

Associate Advisor: ______

Joseph LoTurco, Ph.D

Associate Advisor: ______

Jianjun Sun, Ph.D

University of Connecticut

2019

iii

Dedicated to my Grandmother, Barbara May Baumgartner, and my Grandfather, Reno Franconi

iv

ACKNOWLEDGEMENTS

First, I would like to thank Dr. Rahul Kanadia for being a great mentor for the past six years. Not only did he dedicate countless hours guiding me to become better at writing, thinking, presenting, and interpreting science, but he always prioritized my training, bringing me a new opportunity almost every week. Without his presence to help push me into new scientific terrain, I would not be the scientist I am today.

Second, I would like to thank my collaborator and director of the Murine Behavioral Neurogenetics Facility (MBNF) at the University of Connecticut, Dr. Holly Fitch, for sharing her expertise, facilities, and vast knowledge of mouse behavior, and for taking the time for our numerous meetings to discuss and interpret mouse behavioral data. I would also like to thank her students, Peter Perrino and Nicholas Buitrago, for performing the behavioral experiments at the MBNF.

Third, I would like to thank my committee members, Dr. Joseph LoTurco and Dr. Jianjun Sun, for their unwavering support, advice, and invaluable suggestions during our meetings. They both were essential for helping me manage this at times unwieldy project.

I would also like to thank the entire Kanadia lab, past and present, for their support and ability to make the most unrelenting day more bearable. Dr. Krishna Karunakaran, a past member of the lab and my first mentor, has always been an unending resource of advice, suggestions, and knowledge. I am not sure I could have survived my first SfN conference without her. I would also like to thank Dr. Abdul Rouf Banday, another past member of the lab, for his immense contributions and bioinformatics knowledge, which were both essential for the success of my first ever project in the lab. Similarly, Christopher Lemoine, another past member of the lab, was an amazing technical resource, and helped me troubleshoot many a protocol. I cannot imagine the countless hours he saved in the process. I would also like to thank Anouk Olthof, who is currently pursuing her PhD in the Kanadia lab, for her constant help and advice, both on scientific and non-scientific matters. Thank you to Anouk; Katery Hyatt, who is pursuing her Masters in the Kanadia lab; Alisa White, who is pursuing her PhD; Kyle Drake, who is pursuing his PhD; Dan Munteanu, who is pursuing his PhD; and Victor Calle, who is also pursuing his PhD, for always keeping the lab a positive place, despite how many experiments failed that day. Every one of you has always been available to help whoever might need it, either in the lab or outside of it, and you’ve all helped make the lab a true family.

I would also like to thank the entire PNB department, who have always been available to provide advice or suggestions for any scientific question. In particular, I would like to thank the newly- minted Drs. Laura Mickelsen, for being the best roommate possible and for her support over the past five years, and Fu-Shan Kuo, for offering Laura and I a place to stay during a turbulent time and for being an incredible short-term roommate. I can’t thank them, and Fu-Shan’s husband Andres Barretto, enough for educating me on all the classic movies I somehow missed out on. I would also like to thank Drs. Geoffrey Tanner and Anastasios Tzingounis, for their invaluable career advice; Dr. Xinnian Chen, for opening my eyes to new methods of teaching and for the sheer amount of time she has dedicated over the years to making PNB 2274 as rewarding as possible; Dr. Kristen Kimball; Penny Dobbins; Ed Lechowicz; Ann Hamlin; and Sonda Davis for making PNB 2264/2274 both incredibly useful, in terms of experience, but also incredibly fun. I

v would also like to thank Kathy Kelleher, Linda Armstrong, Noreen Sgro, and Taylor Renaud for their immense help navigating the administrative part of the program—especially for the last 6 months, which have been particularly rough. I would be lost without their help.

Last but not least, I would like to thank my friends and family, who have been an incredible (and incredibly understanding) support system throughout this entire program, and for their constant encouragement whenever things were rocky, or when I would fall off of the face of the planet during long writing/research stretches. In particular, I’d like to thank my grandfather for his unwavering support, and for welcoming me into his house both for my first few months of graduate school and when I was commuting from Mount Holyoke College to spend a weekend working in the Kanadia lab. Speaking of emotional support, I would like to thank my cat-based support, for being both ridiculous and adorable: Khensu and Felix, along with Laura’s cats Addison (RIP), Roary, and Logan.

Thank you!

vi

TABLE OF CONTENTS ACKNOWLEDGEMENTS ...... v LIST OF FIGURES AND TABLES ...... x Chapter 1: Minor intron splicing ...... 1 1.1 RNA splicing ...... 1 1.1.1 Intron definition: the concept of consensus sequences ...... 3 1.1.2 The spliceosome and the splicing reaction ...... 3 1.2 Minor introns and the minor spliceosome ...... 5 1.2.1 Divergent, minor-class introns ...... 5 1.2.2 The distinct, minor-class spliceosome and its action ...... 6 1.2.3 Conservation and utility of minor intron splicing ...... 12 1.2.4 Minor intron-containing genes (MIGs) ...... 15 1.3 Minor splicing, development, and disease ...... 18 1.3.1 Minor splicing and early/embryonic development ...... 19 1.3.2 Minor splicing and the central nervous system ...... 24 1.3.2.1 Microcephaly ...... 24 1.3.2.2 Motor disorders ...... 31 Chapter 2: Central nervous system development ...... 37 2.1 Retinal development ...... 37 2.1.1 Structure of the mouse retina ...... 37 2.1.2 Development of the mouse retina ...... 39 2.2 Cortical development ...... 41 2.2.1 Structure of the mammalian cortex ...... 41 2.2.2 Timeline of cortical development ...... 44 Chapter 3: The role of minor intron splicing in mouse retinal development ...... 48 3.1 Rationale...... 48 3.2 Results ...... 49 3.2.1 Identification of potential murine gene copies of the minor spliceosome-specific snRNAs ...... 49 3.2.2 The minor spliceosomal snRNAs are enriched in the developing murine CNS and limbs ...... 51 3.2.3 The minor spliceosomal snRNAs are expressed across embryonic mouse retinal development...... 52 3.2.4 The minor spliceosomal snRNAs are enriched in differentiating retinal neurons ...... 53

vii

3.2.5 Expression of MIG isoforms that require minor intron splicing shifts between embryonic and postnatal retinal development ...... 53 3.2.6 The minor spliceosome is required for the survival of differentiating retinal neurons 56 3.3 Discussion ...... 58 3.4 Materials and methods ...... 61 Chapter 4: The role of minor intron splicing in mouse cortical development ...... 78 4.1 Rationale...... 78 4.2 Pathogenesis of microcephaly ...... 79 4.2.1 The process of cell division ...... 79 4.2.2 Microcephaly and cell division defects ...... 82 4.2.3 Microcephaly, p53 activation, and DNA damage ...... 86 4.3 Results ...... 89 4.3.1 Constitutive loss of the minor spliceosome-specific U11 snRNA results in embryonic lethality ...... 89 4.3.2 Ablation of U11 in the developing mouse cortex results in microcephaly ...... 90 4.3.3 Ablation of U11 in the developing mouse cortex causes cell death ...... 90 4.3.4 Ablation of U11 predominantly affects survival of neural progenitor cells ...... 91 4.3.5 Cell types of the developing mouse cortex display differential sensitivity to U11 loss92 4.3.6 Few genes are differentially expressed in the embryonic day 12 U11-null pallium .... 94 4.3.7 U11-null neural progenitor cells display DNA damage and p53 upregulation ...... 94 4.3.8 Most DNA damage and p53 upregulation occur in U11-null radial glial cells ...... 96 4.3.9 Ablation of U11 results in significant minor intron retention in the E12 U11-null pallium ...... 96 4.3.10 MIGs involved in and DNA synthesis have significant minor intron retention in the E12 U11-null pallium ...... 98 4.3.11 U11-null radial glial cells display mitotic and cytokinetic defects ...... 99 4.3.12 U11 loss affects the fraction of radial glial cells in ...... 100 4.3.13 Most cells with DNA damage in the U11-null pallium were not in S-phase or mitosis ...... 100 4.3.14 P53 upregulation in the U11-null pallium occurs predominantly in radial glial cells in G1 phase ...... 101 4.3.15 Most dying cells in the U11-null pallium are not in S, G2, or M phase ...... 102 4.3.16 tdTomato+ cells are present in the neonatal mutant cortex ...... 102 4.3.17 U11-null neurons are present in the neonatal mutant cortex ...... 103 4.3.18 U11-null neurons born after Rnu11 ablation are present in the neonatal and juvenile mutant cortex ...... 103

viii

4.3.19 Lack of corticospinal projections and the corpus callosum in the neonatal Rnu11 mutant mouse ...... 105 4.4 Discussion ...... 105 4.5 Materials and methods ...... 109 Chapter 5: Identifying MIGs essential for the survival of cycling cells ...... 138 5.1 Identification of the “essentialome” ...... 138 5.2 Rationale...... 142 5.3 Results ...... 143 5.3.1 MIGs are significantly enriched in the “total” essentialome and in each of the 10 distinct essentialomes derived from 10 different cell lines ...... 143 5.3.2 MIGs are significantly enriched in the core essentialome ...... 143 5.3.3 MIGs of the core essentialome are predominantly involved in RNA processing and ...... 145 5.3.4 Genes of the core essentialome are evolutionarily ancient, including MIGs of the core essentialome...... 146 5.4 Discussion ...... 147 5.5 Materials and methods ...... 152 Chapter 6: The behavioral outcomes of Emx1-Cre-mediated U11 ablation in the Rnu11 conditional knockout mouse...... 174 6.1 The Emx1-Cre transgenic mouse...... 174 6.2 Rationale...... 178 6.3 Results ...... 178 6.3.1 Emx1-Cre heterozygous, Rnu11 mutant mice survive to adulthood ...... 178 6.3.2 Rnu11 mutant mice display heightened anxiety ...... 179 6.3.3 Rnu11 mutant mice have significantly impaired motor coordination ...... 182 6.3.4 Rnu11 mutant and heterozygous mice vocalize significantly less than wild-type controls in socio-sexual encounters ...... 182 6.3.5 Rnu11 heterozygous mice display enhanced performance on complex, whole-body motor tasks ...... 183 6.4 Discussion ...... 186 6.5 Materials and methods ...... 195 Chapter 7: Conclusions and future directions ...... 206 Chapter 8: List of References ...... 212 Chapter 9: Supplementary figures ...... 257 Chapter 10: Supplementary tables ...... 262 Chapter 11: Appendix I ...... 274

ix

LIST OF FIGURES AND TABLES

Chapter 1: Minor intron splicing ...... 1 Figure 1.1. Comparison of the components and action of the major, U2-type and minor, U12- type splicing pathways...... 36 Chapter 3: The role of minor intron splicing in mouse retinal development ...... 48 Figure 3.1. Gene isotypes of U4atac and U6atac in the mouse genome...... 66 Figure 3.2. Gene isotypes of U11 and U12 in the mouse genome...... 67 Figure 3.3. Multiple minor snRNA gene copies are expressed in the deep sequencing data from E16 and P0 mouse retinae...... 68 Figure 3.4. The minor snRNAs are enriched in the embryonic mouse head...... 69 Figure 3.5. Control analysis for WISH with sense and antisense probes for the minor snRNAs...... 70 Figure 3.6. The minor snRNAs are expressed in embryonic progenitor cells and differentiating neurons in the developing mouse retina...... 71 Figure 3.7. The minor snRNAs are enriched in differentiating neurons in the postnatal mouse retina...... 72 Figure 3.8. U11 and U12 snRNA expression is enriched in retinal ganglion cells at P0...... 73 Figure 3.9. Custom microarray design and probe set statistics...... 74 Figure 3.10. MIGs play different roles in embryonic versus postnatal retinal development..... 75 Figure 3.11. U12 snRNA function is essential for the survival of differentiating postnatal retinal neurons...... 76 Figure 3.12. Functions executed by MIGs are dynamic across retinal development...... 77 Chapter 4: The role of minor intron splicing in mouse cortical development ...... 78 Figure 4.1. Generation and confirmation of the Rnu11 cKO mice...... 121 Figure 4.2. U11 loss in the developing mouse neocortex causes severe microcephaly...... 122 Figure 4.3. Cell death contributes to microcephaly in the U11 cKO mouse...... 123 Figure 4.4. U11 loss results in depletion of radial glial cells...... 124 Figure 4.5. U11 loss and death of self-amplifying radial glial cells in the mutant pallium. .... 125 Figure 4.6. U11 loss triggers DNA damage and p53 upregulation in radial glial cells...... 126 Figure 4.7. U11-null radial glial cells exhibit S-phase, , and defects. . 127 Figure 4.8. Validation of expression and minor intron retention changes in the E12 mutant pallium...... 128

x

Figure 4.9. Cell cycle defects are not observed in the E11 mutant pallium, in IPCs of the E12 mutant pallium, or in G2 phase...... 129 Figure 4.10. P53 upregulation and predominantly occur in either cells in G1 or post- mitotic cells...... 130 Figure 4.11. U11-null cells with DNA damage are not in mitosis...... 131 Figure 4.12. Cells positive for the tdTomato Cre reporter are observed in the P0 mutant brain...... 132 Figure 4.13. Cells positive for the tdTomato Cre reporter are observed in the P21 mutant brain...... 133 Figure 4.14. U11-null neurons are present in the P0 mutant cortex...... 134 Figure 4.15. Embryonically-produced, U11-null neurons are present postnatally...... 135 Figure 4.16. Subcerebral projection defects in the Rnu11 mutant brain...... 136 Figure 4.17. Model of cortical development in U11 cKO mice...... 137 Chapter 5: Identifying MIGs essential for the survival of cycling cells ...... 138 Figure 5.1. MIGs are enriched in the core essentialome...... 154 Figure 5.2. Most genes of the core essentialome, including the MIGs, are ancient...... 155 Table 5.1. Mammalian functions of the MIGs identified in the core essentialome...... 156 Chapter 6: The behavioral outcomes of Emx1-Cre-mediated U11 ablation in the Rnu11 conditional knockout mouse...... 174 Figure 6.1. Survival and weight gain kinetics in the microcephalic Rnu11 mutant mice...... 203 Figure 6.2. Anxiety, motor performance, and socio-sexual behavior analysis of Rnu11 wild- type (WT), heterozygous (het), and mutant (mut) mice...... 204 Figure 6.3. Analysis of motor ability in the Rnu11 heterozygous (het) mouse...... 205

xi

Chapter 1: Minor intron splicing

This chapter covers the process of RNA splicing—in particular, a specific type of RNA splicing called minor intron splicing. Briefly, the chapter begins with an overview of RNA splicing research, leading up to the discovery of minor intron splicing in the 1990s. The second half of the chapter will focus on minor intron splicing itself, including the identification of minor introns, the machinery that removes them, conservation of minor intron splicing, and connections between minor intron splicing, development, and disease. Specifically, the relationship between disruption of minor intron splicing and diseases of the central nervous system will be detailed. While the majority of my dissertation research addresses how minor intron splicing inactivation causes microcephaly, I began my dissertation research by asking a broader question: how does minor spliceosome inactivation impact central nervous system development? Therefore, in the following chapters, I will expand upon the role of minor intron splicing in both systems I employed in my dissertation research: the developing mouse retina and the developing mouse cortex.

1.1 RNA splicing

Until the 1960s, scientists thought that eukaryotic protein-coding genes were organized similarly to prokaryotic genes—i.e., they were continuous, protein-coding sequences of DNA [1].

That theory was complicated by the results of pulse-chase experiments, in which newly transcribed

RNA in HeLa cells was radiolabeled with tritiated uridine, often followed by application of the inhibitor actinomycin. In these experiments, accumulation of radiolabeled RNA was first observed in the nucleus, followed by accumulation of radiolabeled RNA in the [2].

Notably, the nuclear radiolabeled RNA species varied in size, some of which were much larger than the cytoplasmic radiolabeled RNA species observed [3-5]. Further analysis of this heterogeneous nuclear RNA (hnRNA) in HeLa cells revealed that some hnRNA species were

1 similar in nucleotide composition to rRNA (>60% GC content), while other, larger hnRNA species were similar in nucleotide composition to HeLa cell DNA (~45% GC content) [6]. Moreover, investigation into the DNA-like hnRNA revealed that these RNA were polyadenylated and had a

5’ cap structure similar to mRNA [7-13]. In light of these findings, a popular hypothesis was that the DNA-like hnRNA were mRNA precursors, which were cleaved into shorter, mature mRNAs, much like the process of rRNA production from larger rRNA precursors [1, 14].

The precise relationship between the DNA-like hnRNA and cytoplasmic mRNA was finally elucidated in 1977. Two independent groups, led by Dr. Phillip Sharp and Dr. Richard

Roberts, employed DNA-RNA hybridization of fragmented adenovirus 2 (Ad2) DNA and viral mRNA produced by Ad2-infected HeLa cells [15, 16]. The R-loop structures formed by DNA-

RNA hybridization were then visualized by electron microscopy. Unexpectedly, they found that the viral mRNAs were not hybridizing to a continuous Ad2 DNA sequence. Instead, some viral mRNA species partially hybridized to the Ad2 DNA fragments, with unbound 5’ or 3’ mRNA tails. More strikingly, other viral mRNAs hybridized fully to the Ad2 DNA fragment—but this hybridization was not to a continuous DNA sequence. Instead, sequences in these mRNAs mapped to different DNA regions, creating loops of unhybridized, intervening DNA along the RNA-DNA hybrid. These publications concluded that precursor mRNAs undergo post-transcriptional processing, in which these intervening sequences are removed and the coding regions are linked together; this process became known as RNA splicing [15, 16].

Based on these findings, in 1978 Dr. Walter Gilbert described eukaryotic genes as mosaics of intervening, intragenic “introns” and expressed “,” which are retained in the mature mRNA [17]. However, this newly discovered gene organization raised a new question: how does the cell distinguish between introns and exons in the pre-mRNA transcript?

2

1.1.1 Intron definition: the concept of consensus sequences

In the years following the seminal 1977 RNA splicing papers, work from multiple groups revealed that specific, conserved sequences mark the 5’ -intron boundary and the 3’ intron- exon boundary. Based on analysis of available intronic sequences, the sequence at the 5’ end of the intron, called the 5’ splice site (5’SS), was defined as 5’-AG/GTRAGT (R=purine, /=exon- intron boundary) [18-20]. Similarly, a 3’ splice site (3’SS) at the 3’ end of intron was defined as

5’-NYAG/G-3’ (N=any nucleotide, Y=pyrimidine), and this sequence was always preceded by a stretch of at least 10 pyrimidines, known as the polypyrimidine tract (PPT) [18-21] (Figure 1.1).

While some variation in these splice site sequences was observed, virtually all of the introns studied contained a 5’ terminal nucleotide pair of GT and a 3’ terminal nucleotide pair of AG, leading to the GT-AG intron boundary rule [22]. However, these exon/intron boundary consensus sequences are not sufficient for successful splicing, which also requires an intronic consensus sequence of 5’-CTRACT-3’, called the branch point sequence (BPS), found upstream of the PPT

[20, 21, 23, 24].

1.1.2 The spliceosome and the splicing reaction

With these consensus sequences identified, the search for the cellular process allowing for the identification and removal of introns began. In 1980, the Wall and Steitz groups both proposed a mechanism by which small nuclear RNAs (snRNAs) that were complementary to these consensus sequences would base-pair to these sites on the pre-mRNA transcript, allowing them to identify intronic sequences and drive the splicing reaction [25, 26]. Over the next few years, sequence alignments and biochemical experiments led to the identification of five splicing snRNAs: U1, U2, U4, U5, and U6 [27-33]. Moreover, autoimmune from patients with

3 systemic lupus erythematosus were used to identify a large array of that associate with these snRNAs to form small nuclear ribonucleoprotein particles () [27]. It has now been shown that these five snRNPs comprise the U2-type spliceosome, named for the U2 snRNA that is crucial for the splicing process [34, 35].

The process of RNA splicing can be divided into three basic steps: intron recognition; the first trans-esterification reaction, forming the intron lariat and releasing the 5’ exon; and the second trans-esterification reaction, which releases the intron lariat and ligates the 5’ and 3’ exons together

[36, 37]. Briefly, the process of intron recognition is mediated by U1 base-pairing to the 5’SS and

U2 snRNA base-pairing to the BPS (Figure 1.1) [28-30]. Base-pairing between the U2 snRNA and the BPS is stabilized by the U2AF protein, which binds to the downstream PPT and 3’SS (Figure

1.1) [38-40]. The base-pairing between the U2 snRNA and the BPS is imperfect, causing protrusion of an adenosine in the BPS (Figure 1.1) [41]. The three remaining snRNAs exist as a tri-snRNP, U4/U6.U5, in which U4 and U6 are base-paired and the U5 snRNP associates via protein interactions (Figure 1.1) [42]. The U4/U6.U5 tri-snRNP manipulates the orientation of the pre-mRNA transcript so that the splicing reaction can occur. Specifically, U6 dissociates from U4 in order to base-pair to the 5’SS, displacing U1, and to base-pair to the U2 snRNA (Figure 1.1)

[43-47]. While U4 leaves the active spliceosome at this step and does not directly base-pair to the pre-mRNA transcript, it is essential for stabilizing the tri-snRNP up until this point [33, 48, 49].

Meanwhile, U5 binds to the 3’SS and regions in the 5’ exon [50]. Ultimately, these events re-orient the pre-mRNA transcript, looping the intron such that the BPS is adjacent to the 5’SS and the 3’ exon is next to the 5’ exon (Figure 1.1). The close proximity of the 5’SS and the protruding BPS adenosine triggers the first trans-esterification reaction, in which the 2’ hydroxyl group of the BPS adenosine performs a nucleophilic attack on the 5’ phosphate group at the end of the 5’ exon. This

4 reaction links the 5’ end of the intron to the BPS adenosine, creating an intron lariat and releasing the 5’ exon (Figure 1.1) [36, 37]. The U5 snRNP remains bound to the 5’ and 3’ exons, maintaining them in close proximity (Figure 1.1) [51, 52]. This prompts the second trans-esterification reaction, in which the 3’ hydroxyl group at the end of the 5’ exon performs a nucleophilic attack on the 5’ phosphate group at the start of the 3’ exon. This second reaction excises the intron lariat, which is degraded in the nucleus, and links the end of the 5’ exon to the beginning of the 3’ exon

(Figure 1.1) [36, 37].

1.2 Minor introns and the minor spliceosome

In every step of RNA splicing, base-pairing between the spliceosomal snRNAs and the pre-mRNA transcript is both extensive and essential, as described above. Therefore, it was widely believed that all spliceosomal introns contained the aforementioned consensus sequences.

However, as more sequencing data became available, scientists began to identify spliceosomal introns without these consensus sequences—and yet, these introns were being successfully spliced from pre-mRNA transcripts.

1.2.1 Divergent, minor-class introns

In 1991, Jackson re-evaluated the intronic sequences previously described and analyzed by

Senapathy and Shapiro [53, 54] and identified two introns, found in the cartilage matrix protein

(CMP) and nucleolar protein P120 (P120) genes, with divergent splice site sequences that did not follow the accepted GT-AG rule [55]. Instead, these introns contained AT-AC terminal dinucleotides, which led to the name “AT-AC introns” [56]. Further examples of these AT-AC introns were identified in 1994, when Hall and Padgett identified two additional AT-AC introns,

5 one in the mouse Rep-3 gene and the other in the gene prospero. Their analysis revealed that all four of these AT-AC introns contained unique 5’SS and 3’SS consensus sequences of 5’-/ATATCCTT-3’ and 5’-YCCAC/-3’, the latter of which was not preceded by the PPT seen in canonical GT-AG introns [57]. They also identified the presence of a highly conserved branch point sequence of 5’-TCCTTAAC-3’ [57]. Later alignment and mutational studies further clarified these sequences to be 5’-/RTATCCTTT-3’ for the 5’SS, 5’-TTCCTTRAY-3’ for the BPS, and 5’-

G YA C/-3’ for the 3’SS [19, 20, 58]. Due to the divergence in these three consensus sequences, Hall and Padgett noted that the complementarity required for spliceosomal snRNA base-pairing was lost. Then, based on the known splicing mechanism, it should not be possible for these introns to be removed—except, they were being spliced successfully. Therefore, Hall and Padgett proposed that a specialized, parallel set of spliceosomal snRNAs must exist to splice this minor class of divergent introns [57].

1.2.2 The distinct, minor-class spliceosome and its action

The successful splicing of these minor-class introns, which could not be mediated by the

U2-type spliceosome, led to the hunt for snRNAs that could recognize and splice these introns. In their 1994 paper, Hall and Padgett curated sequences of all known snRNAs to identify candidate snRNAs that could recognize the divergent consensus sequences of these introns [57]. This revealed two potential candidates. The first, the U12 snRNA, contained a stretch of nucleotides

(3’-AGGAATC-5’) that was complementary to the divergent BPS they identified, 5’-

TCCTTAAC-3’ [57]. Importantly, if this region of U12 were to base-pair to the AT-AC intron

BPS, one of the BPS adenosines would bulge outward, due to imperfect complementarity (Figure

1.1). In the U2-type spliceosome, this BPS adenosine bulge is required for the splicing reaction to

6 occur [41]. If this minor class of AT-AC introns were removed by a parallel set of splicing snRNAs in steps analogous to those of U2-type splicing, then U12 likely functioned as the analogue of U2.

The second candidate, the U11 snRNA, was proposed as a potential analogue to U1, since it contained a 3’-GGAAA-5’ sequence that could base-pair to the divergent 5’SS (5’-ATATCCTTT-

3’) (Figure 1.1) [57]. However, this hypothesized base-pairing would only span the last 5 nucleotides of the divergent 5’SS, which Hall and Padgett cautioned might be insufficient for 5’SS recognition [57]. Both U11 and U12 were sequenced by Montzka and Steitz in 1988, who noted that both were low abundance, nuclear snRNAs [59]. To distinguish these low abundance snRNA candidates from the abundant splicing snRNAs of the U2-type spliceosome, Hall and Padgett described U11 and U12 as “minor snRNAs,” sowing the seeds for future terminology: the divergent introns became known as “minor introns,” due to their small number in the genome, which are spliced by the “minor spliceosome” [20, 57, 60]. To parallel this terminology, U2-type introns were named “major introns,” and the U2-type spliceosome was termed the “major spliceosome.” Hall and Padgett also highlighted Montzka and Steitz’s finding that the majority of

U12 snRNA existed in a di-snRNP with U11 [57, 59]. This led Hall and Padgett to suggest that the strong base-pairing of U12 to the BPS might stabilize U11 base-pairing to the 5’SS, despite the incomplete complementarity [57].

In 1996, two separate groups, led by Steitz and Padgett, independently confirmed that U12 does base-pair to the BPS, and that U12 is crucial for minor intron splicing. Using 2’-O-methyl oligonucleotides, Tarn and Steitz showed that disrupting the U12-BPS interaction prevented the removal of a minor intron [56]. Similarly, Hall and Padgett showed that mutating the BPS within a minor intron also ablated its splicing, and compensatory mutations in U12 rescued minor splicing

[60]. Notably, while the primary sequence of U12 differed greatly from that of U2, its predicted

7 secondary structure was very similar to that of U2 [59, 61]. Given the essential role of U12 in the removal of these introns, minor introns are also known as U12-type introns, which are spliced by the U12-type spliceosome [34, 35]. Interestingly, the requirement for BPS conservation is not observed for major introns, as BPS mutations in these introns have no significant effect on U2 snRNP binding [62]. Instead, U2 snRNP binding requires the presence of a downstream polypyrimidine tract, to which the protein U2 auxiliary factor (U2AF) binds [38-40].

In their 1996 paper, Tarn and Steitz also identified U11 and the U5 snRNA in the minor spliceosome, by in vitro splicing assays followed by Northern blot and chromatography analyses

[56]. When they applied a 2’-O-methyl oligonucleotide that targeted a region of U5 involved in exon interactions during major splicing, they observed inhibition of both major and minor splicing in their in vitro splicing assay, suggesting that U5 plays the same role in the major and minor splicing pathways [56]. However, it was unclear whether U11 directly participated in splicing, because they could not target U11 by 2’-O-methyl oligonucleotides [56]. This echoed earlier attempts to target U11 for study: Christofori and Keller were unable to target U11 snRNA for

RNase degradation using an array of complementary oligonucleotides [63], and Montzka and

Steitz found that deoxyoligonucleotide-targeted nuclease digestion failed to degrade U11, despite the use of 11 different oligonucleotides staggered along the entirety of the U11 sequence [59].

Together, these results suggested that the U11 snRNP is too compact to be targeted by complementary oligonucleotides; therefore, Tarn and Steitz could not discount the possibility that

U11 was only present in the minor spliceosome because the majority of U12 exists as a U11/U12 di-snRNP [56, 59]. However, they did note that the predicted secondary structure of U11 was very similar to that of U1 [56]. In 1997, the Steitz and Padgett groups finally showed that U11 interacts with the 5’SS and that disruption of this base-pairing negatively affected minor splicing, through

8 experiments using 4-thiouridine (4SU) crosslinking and mutation analysis, respectively [34, 64].

The importance of U11 in minor splicing was further underscored in 1999, when Frilander and

Steitz used peptide nucleic acid (PNA) oligomers targeting the 5’SS interacting site of U11 to prevent minor splicing [65]. They also reported that the recognition of the 5’SS by U11 depended on the presence of the U12-BPS interaction, and vice versa [65]. From these studies, a splicing model was proposed in which U11 and U12 simultaneously recognize their respective consensus sequences during minor intron splicing, essentially condensing the first two steps observed in major intron splicing into one (Figure 1.1) [20, 65].

In a second 1996 paper, Tarn and Steitz identified the minor splicing counterparts of the

U6 and U4 snRNAs, using a pull-down strategy targeting the cap structure of U6, which they reasoned was shared with the minor-class U6 analogue [66]. This strategy pulled down two unknown snRNAs, which Tarn and Steitz named U6atac and U4atac for the AT-AC introns they remove [66]. Although the primary sequences of these snRNAs differed from that of U6 and U4, the predicted secondary structures of U6atac and U4atach were similar to that of U6 and U4, respectively [66, 67]. Sequence analysis of U6atac revealed that it contained a 5’-AAGGA-3’ sequence, which could base-pair to the 5’SS of minor introns (5’-ATATCCTTT-3’) (Figure 1.1)

[66]. To test whether this sequence was necessary for minor splicing, Tarn and Steitz added an oligonucleotide targeting this region of U6atac to their in vitro splicing reaction, which specifically ablated minor splicing [66]. In addition to testing this 5’SS-U6atac interaction, Tarn and Steitz also investigated whether U6atac base-paired with U12, since base-pairing is observed between

U2 and U6 in major splicing. Indeed, they found that a 2’-O-methyl oligonucleotide against the 5’ end of U12 prevented minor splicing, but did not affect major splicing. The association of U6atac with U12 in the active minor spliceosome was further verified by psoralen crosslinking, which

9 showed that the presence of a minor intron-containing substrate was necessary to generate significant U12/U6atac crosslinking [66]. In 1997, Yu and Steitz confirmed the base-pairing between U6atac and the 5’SS by a 4-thiouridine (4SU) crosslinking analysis [64]. Subsequently,

Incorvaia and Padgett showed that, when minor splicing was prevented by 5’SS mutations, compensatory mutations in the proposed 5’SS-interacting region of U6atac could rescue minor splicing [68]. For U4atac, Tarn and Steitz did not identify any potential base-pairing between

U4atac and the minor-class consensus sequences, indicating that it might have a function similar to U4 [66]. To test whether U4atac was necessary for minor splicing, Tarn and

Steitz used oligonucleotide-directed RNase H degradation to remove U4atac from their in vitro splicing reaction. They found that degradation of U4atac resulted in a loss of minor splicing, but no change in major splicing, indicating its importance in the minor splicing process [66].

With the snRNA components of the minor spliceosome identified, the question of the protein components of the minor spliceosome emerged. In the first study of minor spliceosomal proteins in 1999, Will et al. found that, of the 20 U11/U12 di-snRNP-related proteins they identified, the majority were also found in the major spliceosome; only 6 proteins were unique to the minor spliceosomal snRNP [69]. Similar analysis of the U4atac/U6atac.U5 tri-snRNP revealed that its protein composition was highly similar to that of the U4/U6.U5 tri-snRNP [70]. Further work by the Luhrmann group in 2004 identified 7 proteins specific to the U11/U12 di-snRNP:

U11/U12-65K, U11/U12-59K, U11/U12-48K, U11/U12-35K, U11/U12-31K, U11/U12-25K, and

U11/U12-20K [71]. While all seven proteins were associated with the U11/U12 di-snRNP, four of them—U11/U12-59K, U11/U12-48K, U11/U12-35K, and U11/U12-25K—could be found in the mono-U11 snRNP, indicating that they interact with the U11 snRNA [71]. The specific functions of many of the minor spliceosome-specific proteins remain unknown. In 2004, the Luhrmann

10 group interrogated the role of the U11-associated U11/U12-59K protein in HeLa cells by RNAi- mediated knockdown, which resulted in cell death. While this result underscored the importance of this protein’s function, its specific role in minor splicing could not be assessed [71]. In 2005,

Benecke et al. found that the U11/U12-65K protein binds to both U11/U12-59K protein and the

3’ stem loop of the U12 snRNA, suggesting that it may play a role in stabilizing the U11/U12 di- snRNP [72]. Turunen et al. similarly described the U11/U12-59K protein as an anchor protein in

2008, when they found that the U11-associated U11/U12-48K protein binds to the U11/U12-59K protein and the 5’ end of the 5’SS of the pre-mRNA transcript [73]. They also reported that RNAi- mediated knockdown of this 48K protein reduced the amount of U11/U12 di-snRNP in HeLa cells, indicating that this protein may also play a role in U11/U12 di-snRNP stabilization [73]. Based on its domains and experiments performed in Arabidopsis thaliana, the U11/U12-31K protein likely functions as an RNA chaperone [74]. Although knockdown of the 31K protein negatively affects minor splicing, its exact role in the process is unclear [74]. For the remaining 35K, 25K, and 20K

U11/U12 proteins, no reports describing their functions have been published since their initial discovery. However, binding assays using plant minor spliceosome-specific proteins suggest that the 35K protein binds to 25K, 31K, and 35K, and that the 25K protein binds to the 65K, 59K, 48K,

35K, 31K, and 25K proteins [75].

Since the components of the minor spliceosome act analogously to those of the major spliceosome, the process of minor splicing is extremely similar to that of major splicing. The primary difference lies in the first step of minor splicing, a.k.a. intron recognition. Unlike U1 and

U2 in the major spliceosome, U11 and U12 exist as a di-snRNP [59]. Therefore, the first step of minor splicing involves simultaneous recognition of the 5’SS and BPS by U11 and U12, respectively (Figure 1.1) [20, 65]. U11 base-pairing to the last 5 nucleotides of the 5’SS is

11 stabilized by the U11/U12-48K protein, which binds to the 5’ end of the 5’SS and to the anchor protein U11/U12-59K [73]. After this intron recognition step, the U4atac/U6atac.U5 tri-snRNP is recruited to the intron, at which point U6atac base-pairs to the 5’SS and to U12, displacing both

U11 and U4atac in the process (Figure 1.1) [20, 64, 66]. Meanwhile, U5 binds to both the 5’ exon and the 3’SS (Figure 1.1) [56, 76]. The close proximity of the bulging adenosine in the BPS to the

5’ exon triggers the first trans-esterification event, in which the 2’ hydroxyl of the BPS adenosine performs a nucleophilic attack on the 5’ phosphate group at the end of the 5’ exon. Like in major splicing, this first trans-esterification reaction forms an intron lariat, while freeing the 5’ exon, which remains in close proximity with the 3’ exon due to U5 snRNP binding (Figure 1.1). This elicits the second trans-esterification reaction, in which the 3’ phosphate group at the end of the

5’ exon performs a nucleophilic attack on the 5’ phosphate group at the beginning of the 3’ exon, ligating these exons together and freeing the intron lariat (Figure 1.1) [20, 56]. Notably, the vast majority of minor intron-containing genes (MIGs) primarily contain major introns, with only 1-2 minor introns [58]. Therefore, MIG expression requires the coordinated activity of both the major and the minor to generate a fully spliced mRNA transcript.

Ultimately, minor intron splicing is a parallel splicing pathway for a small minority of introns (<0.5% of introns in the human genome), which shares its biochemical pathway, the U5 snRNA, and most of its proteins with the major spliceosome [20, 56, 58, 69, 70]. Since its discovery, scientists have sought to understand when minor splicing emerged during evolution, and why such a complex machinery has been maintained, in spite of the existence of another spliceosome.

1.2.3 Conservation and utility of minor intron splicing

12

Throughout the 1990s and early 2000s, multiple groups reported that minor introns were found throughout metazoans and land plants, pinning the age of the minor spliceosome to at least

1 billion years [77-79]. Unexpectedly, they could not identify any minor introns or minor snRNAs in the nematode Caenorhabditis elegans, indicating that minor splicing had been lost altogether in this metazoan lineage [77]. It was only in 2006, when Russell et al. analyzed partially sequenced protist and fungal genomes, that the first minor introns and minor spliceosome components were identified in the amoeboid protist A. castellanii, the fungus R. oryzae, and two water molds, P. ramorum and P. sojae [80]. Based on the presence of minor introns and components of the minor spliceosome in all five eukaryotic supergroups, they suggested that minor splicing was likely present in the last eukaryotic common ancestor (LECA) [80]. In 2008, Davila Lopez et al. performed the most comprehensive search for minor snRNAs yet described, scanning a total of

149 eukaryotic genomes [81]. Their analysis identified minor snRNA genes in the nematode T spiralis, the slime mold P. polypephalum, and the fungal lineages Zygomycota and

Chytridiomycota [81]. More strikingly, they did not identify substantial numbers of minor snRNA genes in the fungal lineages of Microsporidia, Ascomytoca, or Basidiomycota, and no minor snRNA genes were identified in red or green algae, indicating multiple losses of the minor spliceosome in eukaryotic evolution [81]. In 2010, the same group performed a similar analysis of minor introns, in which they extracted all minor introns from a subset of these eukaryotic genomes

[82]. From their findings in both of these papers, Davila Lopez et al. calculated that minor splicing had been independently lost at least nine times across eukaryotic evolution [82]. Even in lineages that have retained minor introns and the minor spliceosome, there are indications of minor intron loss across evolution. For example, the Makałowski group found higher rates of minor intron loss in dipteran insects than in other Insecta lineages, which was unrelated to total genome size (average

13 of 18.1 minor introns vs average of 42.8) [83]. Oddly, however, the dipteral lineage also contains the only known example of minor intron gain in a species. Notably this gene, Zrsr2, has been shown to be essential for minor splicing, and the addition of a minor intron in this gene may represent a drive toward auto-regulation within the minor splicing pathway [83]. Overall, these findings exemplify the dynamic flux in the maintenance of the minor spliceosome and minor introns across eukaryotic evolution, tracing back to early .

Despite the preponderance of evidence indicating a drive to lose minor splicing throughout evolution, minor introns and the minor spliceosome are highly conserved in most metazoans and plants. In fact, compared to major introns, the positions of minor introns are more highly conserved between plants and animals [84]. This contradiction is unexpected—why is this pathway so highly conserved among metazoans and plants, when it has been lost multiple times in other eukaryotic lineages? One possibility, as Mount proposed in 1996, is that minor splicing is a molecular fossil

[85]. In this model, the maintenance of the minor introns and the minor spliceosome is dependent primarily on chance. For example, if an ancestral minor intron sits in an essential gene, it is likely it will be retained and conserved; if an ancestral minor intron is found in a gene that is rarely expressed, it is likely to be lost. Similarly, if a minor intron-containing gene (MIG) is less essential in species A than it is in species B, it is more likely to be lost in species A [85].

Another possibility is that the conservation of minor introns in early evolution may have been ensured by their presence in crucial genes, but they have now evolved additional regulatory roles separate from that requirement. For example, it has been shown that minor intron splicing is less efficient than major intron splicing in HeLa and S2 cell lines, due to the low levels of the minor snRNAs [86]. These unspliced transcripts would either be trapped in the nucleus or be degraded by nonsense-mediated decay in the cytoplasm [86, 87]. In this model,

14 minor intron splicing is a regulatory switch that allows the cell to regulate all minor intron- containing gene (MIG) expression en masse, by altering the levels of the minor spliceosome components and thereby controlling the rate of minor intron splicing. This model is supported by work from the Dreyfuss group, who found that U6atac was the rate-limiting component of the minor spliceosome [88]. When they stabilized U6atac snRNA in HeLa cells, resulting in increased

U6atac levels, they observed increased rates of minor intron splicing both in a transfected splicing construct and across multiple endogenous genes [88]. Additionally, the minor spliceosome may play a role in regulation. In Drosophila, the MIG prospero, which encodes a neuronal , contains an intron with a twintron structure, in which a major intron is nested within a minor intron. Minor intron splicing of this twintron would remove 29 amino acids, 5 of which encode the N-terminal homeodomain [89]. In 2004, Scamborova et al. showed that the nested major intron is preferentially spliced during early embryogenesis, while the outer minor intron is preferentially spliced in later development [90]. Genes with similar twintron architectures have been identified in insects and vertebrates, indicating that this type of isoform regulation may be relatively common [91].

In all, the conservation of minor introns, regardless of the evolutionary pressures that sustained them, suggests an important biological role for minor splicing, particularly in lineages that have maintained high numbers of minor introns [58]. In order to understand this role, it is necessary to determine the functions of MIGs.

1.2.4 Minor intron-containing genes (MIGs)

Most efforts to identify MIGs rely on scans of high-quality genomes for introns with these minor-class consensus sequences. As of now, three multi-species minor intron databases have been

15 published: SpliceRack, the U12 Database (U12DB), and ERISdb, a plant-specific database [19,

58, 92]. Of these databases, only the U12DB both contains entries for a large range (20) of species and is still accessible, although it has not been updated since 2007 [58]. Even before the publication of these databases, multiple groups suspected that minor splicing might control specific functions, based on observations that minor introns were enriched in specific gene families. For example, in

1997, Wu and Krainer noted the presence of multiple MIGs within the voltage-gated sodium channel (VGSC) and voltage-gated calcium channel (VGCC) alpha subunit gene families, which encode the pore region of these voltage-gated ion channels [93-95]. Since these gene families share a common ancestral gene, Wu and Krainer predicted that minor introns would be identified in other genes within the voltage-gated ion channel superfamily [93]. Indeed, based on the human minor introns listed in the U12DB, 90% of both VGSC and VGCC alpha subunit genes contain at least one minor intron [58, 94, 95]. Moreover, all 4 VGCC auxiliary alpha2delta subunit genes, which function in VGCC trafficking and gating modulation, are MIGs [58, 96]. Together, these observations strongly suggest that cell excitability, particularly action potential propagation and muscle contraction, and vesicle fusion are regulated by minor splicing [94-96].

Since 2000, multiple other MIG-rich gene families have been identified, spanning a wide variety of functions. While sequencing the human matrilin genes, Muratoglu et al. (2000) identified minor introns in all 4 matrilin genes, implicating minor splicing in the regulation of extracellular matrix assembly, particularly in cartilage [97, 98]. In 2001, Levine and Durbin identified 404 minor introns in the , and they noted enrichment of MIGs in the transcription factor, phospholipase C, diaphanous-related formin, and a subset of guanine nucleotide exchange factor (GEF) gene families [99]. Based on the human minor introns listed in the U12DB, 75% of E2F genes, 47% of phospholipase C genes, 100% of diaphanous-related

16 formin genes, 75% of Ras guanyl-releasing (RASGRP) genes, and 71% of Rap-GEF genes contain minor introns [58]. The U12DB also identified numerous other gene families with minor intron enrichment, such as the mitogen activated-protein (MAPK) family, of which 62% contain minor introns, and the phosphatase 2 (PP2A) regulatory B55 subunit genes, which are all MIGs

[58]. Notably, the RASGRP and MAPK gene families converge on the Ras signaling pathway, which modulates cell survival, growth, and differentiation [100]. Moreover, the E2F transcription factors are well-known regulators of cell proliferation and differentiation, while the B55 subunit of PP2A conveys substrate specificity, allowing for the dephosphorylation of retinoblastoma family and Smad proteins, in turn inhibiting E2F activity [101, 102]. Given the high percentage of

MIGs found in all of these gene families, it is plausible that minor splicing also plays a role in regulating cell proliferation and differentiation. Indeed, disruptions in minor splicing have been associated with abnormal cell cycle regulation and differentiation [103, 104].

With the advent of bioinformatics-based approaches to identify MIGs throughout genomes, it became possible to investigate the functional classifications of MIGs on a wider scale. For example, in 1998, Burge et al. reported that many (25%) of the 56 MIGs they identified in their multi-genome scan were involved in “information processing” functions, such as DNA replication,

DNA repair, transcription, RNA processing, and [77]. Similarly, when Chang et al. extracted ~500 minor introns from the human genome, they observed enrichment of MIGs in RNA metabolism pathways, spanning both mRNA and non-coding RNA processing, and transcription regulation [105]. More recently, Merico et al. leveraged mouse phenotypes from the Mammalian

Phenotype Ontology dataset and functional gene-set resources, such as , KEGG, and BioCarta, to identify common functions performed by the 744 MIGs they identified in the human genome [106]. Using Fisher’s exact tests to determine statistical significance, they

17 observed enrichment for numerous “information processing”-related functions, including DNA replication/repair, transcription regulation, and RNA processing, along with enrichment for ion channels [106]. However, they also identified functional enrichments for cell cycle, neurodevelopment, skeletal development, and embryonic survival [106]. In plants, MIG functions were similarly investigated by Gault et al. in 2017, who used domain analysis to study MIGs conserved between maize (Zea mays) and human. Their initial analysis of all 408 maize MIGs identified in the ERISdb revealed functional enrichments in cell cycle, RNA processing, transcription, translation, /degradation, and metabolism, similar to the functions identified in previous reports of human MIGs [104]. When they analyzed the 233 maize MIGs with human homologs, they found that 22% (50) of these conserved MIGs regulated cell cycle, while an additional 14% (33) were involved in protein glycosylation [104].

In the past decade, most efforts to investigate the functions regulated by minor splicing have sought to study the effects of minor spliceosome inactivation in a given model system; by studying the biological processes that are affected, researchers can extract the functions that require normal MIG expression. There are two common methods for this type of research: (1) to study naturally occurring diseases that are caused by minor spliceosome inactivation; and (2) to study an experimental model of minor spliceosome inactivation. The results generated from these two approaches are discussed in the following section.

1.3 Minor splicing, development, and disease

From the discovery of minor splicing until the mid- to late-2000s, most publications in the field sought to investigate the components, biochemistry, and evolution of minor splicing. Barring reports describing two models of minor spliceosome disruption [74, 89], most research was

18 performed in vitro or in cell culture. In 2011, however, the field underwent a major shift. That year, two back-to-back Science papers reported that inactivation of the minor spliceosome caused a severe developmental disorder, called microcephalic osteodysplastic primordial dwarfism type

1 (MOPD1) [107, 108]. Since 2011, research into the connection between minor splicing, development, and disease has exploded; currently, disruption of minor intron splicing has been linked to nine diseases, seven of which affect the central nervous system.

1.3.1 Minor splicing and early/embryonic development

The first study indicating the importance of the minor spliceosome in early development was forward genetics work in Drosophila melanogaster, published in 2002. Otake et al. determined that the embryonic lethality and third instar death observed in two fly lines (EP(3)3147 and l(2)k01105) were caused by P element-mediated disruption of the U12 and U6atac snRNA genes, respectively [89]. Introducing wild-type U12 or U6atac transgenes into these lines rescued these phenotypes, and RT-PCR confirmed that minor splicing was disrupted in the U6atac mutant larvae [89]. Surprisingly, they also detected the presence wild-type U6atac snRNA in the U6atac mutant larvae; the levels of this wild-type U6atac snRNA were inversely correlated with the amounts of mutant U6atac, suggesting that wild-type U6atac is maternally contributed in flies [89].

They proposed that this might explain why these U6atac mutant flies escaped embryonic lethality, unlike the U12 mutants [89]. In 2010, the Frilander group further characterized the U6atac mutant line by microarray and RT-PCR analysis [109]. The microarray they employed was designed to detect the expression of most exons in the Drosophila genome, by designing one probe for each exon. For hybridization, total RNA samples from first, second, and third instar larvae from heterozygote controls and homozygous mutants were used. Unexpectedly, this microarray analysis

19 did not reveal dramatic changes in the expression of MIGs in the mutant line, and the results did not indicate global aberrant splicing near minor introns in these MIGs [109]. While RT-PCR analysis of minor splicing did indicate increased minor intron retention in the U6atac mutants, it also revealed substantial amounts of properly spliced transcripts, suggesting that minor spliceosome function is not completely disrupted in this fly line [109]. Although the microarray did not reveal dramatic shifts in MIG expression, it did show significant downregulation of general metabolism genes in the mutant larvae [109]. The most downregulated MIG in the mutant larvae was CG15081/l(2)03709, a gene that encodes the mitochondrial protein prohibitin, which is necessary for mitochondrial stability and function [109]. Importantly, mutation in this gene has previously been shown to cause larval lethality in Drosophila [110]. Based on these findings, they concluded that disruption of this MIG likely caused the downregulation of metabolism genes and, eventually, larval death, even though the minor splicing defect in this line appeared to be relatively mild [109].

In the same year, the Kang group published the first study describing the effects of minor spliceosome dysfunction on Arabidopsis thaliana development, by targeting the gene that encodes the U11/U12-31K protein [74]. Initially, this group employed a mutant Arabidopsis thaliana line, which contained a T-DNA insertion in the coding region of U11/U12-31K. However, they were unable to generate homozygous mutants, suggesting that complete loss of U11/U12-31K function is embryonically lethal [74]. To further study the role of this gene in Arabidopsis, they instead employed artificial microRNA (amiRNA) to knock down U11/U12-31K levels by ~70-85% in developing plants. These transgenic plants exhibited various developmental abnormalities, including leaf serration, arrested meristem growth, and post-bolting rosette leaf production [74].

Additionally, the Kang group found that subsequent generations of transgenic amiRNA lines with

20 the strongest U11/U12-31K knockdown displayed more dramatic phenotypes, ranging from an inability to grow beyond the vegetative stage, death, and embryonic lethality, further underscoring the importance of U11/U12-31K function in Arabidopsis thaliana development [74]. RT-PCR analysis of minor splicing in these transgenic lines revealed minor intron retention correlating with the strength of amiRNA-mediated U11/U12-31K knockdown; however, for some MIGs, minor intron retention was not observed in these amiRNA transgenic lines [74]. They also reported that these phenotypes and the observed minor intron retention could be rescued with amiRNA-resistant

U11/U12-31K and, as described in the Kang group’s 2012 paper, with O. sativa U11/U12-31K

[74, 111]. In addition, external application of the hormone gibberellic acid, which induces cell division and expansion, could rescue the inflorescence stem growth in the transgenic lines; qRT-

PCR analysis of the meristem regions of the untreated amiRNA plants revealed that expression of the genes involved in gibberellic acid metabolism was downregulated compared to wild-type levels [111]. To further clarify the role of the minor spliceosome in A. thaliana development, the

Kang group used a similar strategy to target the gene encoding a different minor spliceosome- specific protein, U11/U12-65K [112]. As with U11/U12-31K, they found that T-DNA disruption of the U11/U12-65K gene was embryonically lethal, since they could not generate homozygous mutants. Similarly, amiRNA-mediated knockdown of U11/U12-65K resulted in a familiar complement of phenotypes, including stunted primary inflorescence stems, leaf serration, post- bolting rosette leaf formation, and small size; RT-PCR analysis also revealed higher levels minor intron retention in the amiRNA plants than in wild-type plants [112]. All of the phenotypes and the minor intron retention were rescued with amiRNA-resistant U11/U12-65K [112]. Together, these studies underscore the importance and necessity of minor spliceosome function in

Arabidopsis thaliana development.

21

Currently, there are four reports describing the role of minor splicing in vertebrate development. In 2014, the Heath group leveraged the clbn zebrafish mutant, which they found was caused by mutation in an intron of rnpc3, the gene encoding the U11/U12-65K protein [113]. The clbn mutants were characterized by microcephaly, micropthalmia, swim bladder defects, delayed yolk absorption, hypoplasia of the pancreas and liver, and intestinal epithelium abnormalities

[113]. These phenotypes began to emerge 4.5 days postfertilization; the mutants died 2.5 to 5.5 days later [113]. To determine the global effect of this rnpc3 mutation on and splicing, Markmiller et al. performed RNAseq using RNA from 108 hour-postfertilization wild- type and mutant larvae [113]. In parallel, to prevent ambiguous read mapping to short, repetitive sequences in introns, they generated a zebrafish uniqueome, which contained only regions to which reads would uniquely map. They then mapped the RNAseq reads to this zebrafish uniqueome.

They also acquired an updated dataset of zebrafish minor intron candidates, extracted by Alioto using the U12DB pipeline, and discarded minor intron candidates that did not have conserved terminal dinucleotides of either AT-AC or GT-AG, conservation at the 5’SS and BPS, or sufficient evidence from mRNA, EST, published RNAseq, or published protein-coding gene data [58, 113].

This RNAseq analysis revealed that minor intron retention was more common than major intron retention in the wild-type larvae (65.5% vs. 49.6%), and that retention of ~25% of minor introns was unique to the clbn mutant [113]. Interestingly, as was reported in Drosophila and Arabidopsis, the majority of MIGs (~90%) were not differentially expressed between the wild-type and mutant samples [113]. Of the differentially expressed MIGs, 40 were downregulated in the mutant larvae, compared with the 10 that were upregulated [113]. Overall, Markmiller et al. found that MIGs involved in mRNA quality control and cell cycle repression were differentially expressed or displayed some degree of minor intron retention in the mutant larvae [113]. This finding led them

22 to propose a model in which disruption of these mRNA control pathway MIGs leads to the destabilization of overall gene expression, minor and major intron-containing alike, by affecting multiple levels of mRNA processing, including transcription, splicing, mRNA transport, and mRNA degradation [113].

In December of 2018, the Heath group extended their investigation into the role of

U11/U12-65K in development to a mammalian system, by employing a novel transgenic mouse line targeting the U11/U12-65K gene Rnpc3 for deletion [114]. Specifically, exons 4 and 5 of the

Rnpc3 gene were flanked by loxP sites to conditionally deplete Rnpc3 levels [114]. In the germline

Rnpc3-/- mice, they observed embryonic lethality prior to blastocyst implantation; they did not observe any phenotype in the germline Rnpc3+/- mice [114]. Using a tamoxifen-inducible, systemic

Cre recombinase, Doggett et al. depleted Rnpc3 throughout adult transgenic mice. Within a week of tamoxifen induction, the mutant mice displayed significantly lower levels of lymphocytes, monocytes, erythrocytes, and thrombocytes; decreased thymus size; and substantial degeneration of the lining of the gastrointestinal tract [114]. The degeneration of the gastrointestinal mucosa occurred alongside significant cell death and reductions in cell proliferation [114]. Both semi- quantitative and quantitative RT-PCR revealed increased minor intron retention in numerous

MIGs, without a corresponding increase in major intron retention, in purified epithelial cells from the small and large intestines of the mutant mice [114]. Within the gastrointestinal mucosa,

Doggett et al. also reported higher Rnpc3 mRNA levels in the proliferating compartments relative to the differentiated cell populations; similar trends of expression were observed in the skin and the lung epithelia [114]. While Doggett et al. did not investigate the effects of Rnpc3 depletion on the hematopoietic stem cells of the red bone marrow, their findings also indicated that Rnpc3 is required for normal hematopoiesis [114, 115]. From their findings, Doggett et al. concluded that

23

Rnpc3 expression is likely essential for the survival of stem cell populations, while differentiated cells appear to survive with very low levels of U11/U12-65K [114].

The other two papers investigating the role of minor spliceosome in vertebrate development, Baumgartner et al. 2015 and Baumgartner et al. 2018, are described in chapters 3 and 4, respectively [116, 117].

1.3.2 Minor splicing and the central nervous system

Until 2014, there were no experimental models for studying minor spliceosome inactivation in mammalian development. To my knowledge, the papers published as part of my dissertation research (Baumgartner et al. 2015 and Baumgartner et al. 2018) were the first two publications investigating minor splicing disruption in a mammalian model system. Thus, the primary resource available for studying the role of minor splicing in a mammalian system are clinical reports of diseases linked to minor spliceosome disruption. Of the nine diseases that have been linked to disruption in minor splicing, seven are disorders of the central nervous system [103,

106-108, 118-123]. Specifically, five of these disorders are characterized by either microcephaly or hypoplasia of specific brain regions [106-108, 119, 121, 122]. The remaining two disorders, along with one disease marked by cerebellar hypoplasia, involve degeneration of motor system neurons [118, 120, 121].

1.3.2.1 Microcephaly

Minor spliceosome dysfunction was first linked to a human disease in 2011, when two independent groups showed that mutation in the U4atac gene, RNU4ATAC, causes the severe developmental disorder microcephalic osteodysplastic primordial dwarfism type 1 (MOPD1), also

24 known as Taybi-Linder syndrome (TALS) [107, 108]. This disorder is characterized by restricted intrauterine growth, microcephaly, brain abnormalities, malformed bones and joints, dry skin, and sparse hair. Common brain abnormalities include agenesis of the corpus callosum, gyrification defects, hypoplasia of the cortical lobes, and agenesis of the cerebellar vermis. Individuals with the disorder rarely survive past early childhood [124-127]. Using a combined approach of genome- wide mapping and high throughput, second generation sequencing, He et al. found that all MOPD1 patients from a consanguineous Ohio Amish population shared the same genomic G to A mutation at nucleotide 51 (g.51G>A) of the RNU4ATAC gene [107]. He et al. also identified RNU4ATAC g.30G>A, g.55G>A, and g.111G>A mutations in German MOPD1 patients [107]. To determine the effects of these mutations on minor splicing efficiency, He et al. employed an in vivo splicing assay in CHO cells, using a splicing reporter that could only be targeted by exogenous U11,

U6atac, and U4atac snRNAs. This assay revealed that each MOPD1-linked U4atac mutant construct reduced minor splicing efficiency by over 90% [107]. Further analysis of the effects of the g.51G>A mutation, using fibroblasts derived from MOPD1 patients with this mutation, showed that minor intron splicing was less efficient in the MOPD1 fibroblasts compared to wild-type cells, and this minor splicing defect could be rescued by wild-type U4atac snRNA [107]. Interestingly, they found that not all minor introns were equally retained in the MOPD1 fibroblasts [107]. In the second 2011 paper, Edery et al. used high-throughput targeted sequencing to identify RNU4ATAC mutations in MOPD1 families and patients from the Mediterranean basin, Morocco, India, North

America, and Norway [108]. In addition to finding the same g.51G>A mutation described by He et al., Edery et al. identified three additional RNU4ATAC mutations of g.50G>A, g.50G>C, and g.53C>G [108]. Unexpectedly, most MIGs they interrogated showed either normal expression or significant upregulation in the MOPD1 cell lines [108]. Despite this, Edery et al. detected increased

25 minor intron retention in all of the MIGs interrogated by qRT-PCR and semi-quantitative PCR analysis [108]. As observed by He et al., the degree of minor intron retention varied across the tested MIGs, indicating a differential effect on minor splicing across these minor introns [108].

Subsequent reports have identified six additional RNU4ATAC mutations in MOPD1 patients: g.29T>C, g.40C>T, g.46G>A, g.66G>C, g.124G>A, and an 85-bp insertion in position g.101, caused by an 85 bp tandem duplication of nucleotides 16-100 [127-130]. Most of the mutations associated with MOPD1 are located in the 5’ stem-loop (nt 26-57) of the U4atac snRNA, which could disrupt binding of the U4atac/U6atac di-snRNP proteins 15.5K and PRPF31 [131-

134]. In particular, the Padgett group found that mutations in positions 50 and 51 significantly reduced binding of 15.5K to U4atac [134]. Since the binding of 15.5K protein allows additional di-snRNP proteins to bind to U4atac, these mutations also prevented PRPF31 binding [132]. They also found that the g.30G>A mutation blocked PRPF31 binding [134]. For the remaining MOPD1- linked mutations, g.66G>C is located in linear RNA at the base of stem I, which would extend this stem; g.111>A sits in the stem of the 3’ stem loop (nt 83-115); and g.124>A is found in the Sm binding region (nt 116-124), which is important for snRNP assembly and maintenance of U4atac levels [67, 134, 135]. The 85 bp insertion into position 100 of RNU4ATAC is predicted to disrupt the 3’ stem loop [127]. Notably, these different mutations appear to correlate with different survival outcomes in patients. For instance, Nagy et al. reported a 9-year-old MOPD1 patient with homozygous 55G>A mutations [125]. Similarly, Abdel-Salam et al. described three MOPD1 patients with mild phenotypes, associated with homozygous g.55G>A and heterozygous g.66G>C and g.124G>A mutations [128]. Another MOPD1 patient, who lived to 12.75 years, had one g.30G>A and one g.111G>A allele [125]; an 18-year-old MOPD1 patient was identified who was homozygous for the g.55G>A mutation [136]. Kroigard et al. reported a pair of siblings,

26 aged 17 and 24, who were heterozygous for the g.40C>T and 85-bp duplication mutations [127].

The latter of these patients is the oldest known MOPD1 patient [127]. Together, these results suggest that mutations located outside of the 5’ stem-loop are less detrimental to minor splicing than 5’ stem-loop mutations, especially the g.51G>A mutation, which has been linked to a shorter lifespan compared to other mutations [125].

In 2015, mutation of U4atac was linked to another developmental disorder, called Roifman syndrome [106]. Roifman syndrome is characterized by restricted intrauterine growth, microcephaly, malformation of the bone and joints, retinal dystrophy, and deficiency

[106, 137]. Unlike MOPD1, premature death has not been reported in Roifman syndrome; the oldest reported patient was 20 years old at the time of publication [106, 138]. Using whole-genome sequencing, Merico et al. found that Roifman syndrome patients had compound heterozygous mutations in RNU4ATAC. Specifically, of the 6 patients, two were heterozygous for g.13C>T and g.37G>A mutations; one had heterozygous g.13C>T and g.48G>A mutations; two patients had one g.16G>A and one g.51G>A allele; and one patient had g.8C>T and g.118T>C [106].

As expected, RNAseq of Roifman syndrome patient mononuclear blood cells revealed significantly elevated minor intron retention than observed in carrier controls, while major splicing was not affected [106]. Merico et al. also observed a general trend of MIG upregulation in patients, which they suggested might be a compensatory response to increased minor intron retention [106].

Between 2016 and 2017, RNU4ATAC mutations were identified in two additional Roifman syndrome patients [139, 140]. One of these patients had compound heterozygous mutations of g.17G>A and g.116A>G [139, 140]. The other had a homozygous g.16G>A RNU4ATAC mutation

[140].

27

Of the mutations identified in Roifman syndrome patients, only the g.51G>A has been observed in MOPD1; the Roifman syndrome patients with this mutation also had a g16G>A mutation, which is located in stem II (nt 1-19), a structure formed by base-pairing between U4atac and U6atac [67, 106]. Notably, in 5 of these 8 Roifman syndrome patients, the compound heterozygous mutations included one mutation located in the 5’ stem-loop of U4atac, and one mutation in stem II. In two of the remaining 3 patients, the compound heterozygous mutations were located in stem II and the Sm binding region, respectively [106, 139, 140]. The final Roifman syndrome patient is the only one with a homozygous RNU4ATAC mutation, located in stem II

[140]. Notably, stem II mutations have not been reported in MOPD1 patients, suggesting that they are unique to Roifman syndrome [107, 108, 127-130]. Disruption of stem II in these patients could underlie the presence of Roifman syndrome-specific phenotypes, such as retinal dystrophy and antibody deficiency [106, 140]. Based on the phenotype of Roifman syndrome patients, relative to that of MOPD1, the Roifman group hypothesized that stem II mutations in U4atac are less detrimental to minor spliceosome function than the 5’ stem-loop mutations observed in MOPD1

[106].

In 2018, Farach et al. connected RNU4ATAC mutation to yet another developmental disorder: Lowry Wood syndrome. Symptoms of Lowry Wood syndrome include restricted intrauterine growth, microcephaly, and skeletal dysplasia; is also common

[122, 141, 142]. Sequencing of RNU4ATAC in three patients revealed compound heterozygous mutations. One patient had heterozygous g.5A>C and g.51G>A mutations; the other two had one g.111G>A allele and an allele with both g.46G>A and 123G>A mutations [122]. A subsequent report, in 2018, identified two additional Lowry Wood syndrome patients with compound heterozygous RNU4ATAC mutations: one patient had heterozygous g.53C>T and g.8C>A

28 mutations, while the second patient had one g.120T>G allele and one g.114G>C allele and more severe symptoms [143]. Three of these mutations, g.46G>A, g.51G>A, and g.111G>A, have been observed in MOPD1 patients; while the nucleotide substitutions vary, the positions of the g.8C>A and g.53C>T mutations match those reported in Roifman syndrome and MOPD1, respectively

[106-108, 127-130]. These were the first reports of the g.5A>C, g.114G>C, g.120T>G, and g.123G>A mutations. The g.5A>C mutation is located within stem II. However, based on mutational studies of the complementary position of U6atac, this mutation probably has a mild impact on minor spliceosome activity [144]. The g.114G>C mutation is located at the base of the

3’ stem-loop, where it would disrupt complementarity; mutational analyses suggest that such a disruption would have a minor effect on minor spliceosome activity [144]. The g.120T>G and g.123G>A mutations would affect the Sm , potentially impacting snRNP assembly and

U4atac stability [134, 135]. Like Roifman syndrome, Lowry Wood syndrome patients with mutations in the 5’ stem-loop have compound heterozygous mutations, with the other mutation located elsewhere in RNU4ATAC, either in stem II, the 3’ stem-loop, or the Sm binding region.

Both Roifman syndrome and Lowry Wood syndrome are also mild disorders, relative to MOPD1.

Even within MOPD1, there is a wide range of symptom severity, dependent on the specific

RNU4ATAC mutations [107, 127]. It is likely that these disorders exist on a continuum of minor spliceosome disruption, such that severe MOPD1 lies at the end corresponding to the most severe minor spliceosome disruption, while Lowry Wood syndrome and Roifman syndrome sit at the opposite end, and mild cases of MOPD1 are spread throughout the middle [143].

Alongside U4atac, mutations in other minor spliceosome-specific components are also associated with microcephaly. In 2014, Argente et al. linked mutations in RNPC3, which encodes the U11/U12-65K protein, to a family with isolated growth hormone deficiency (IGHD) [119].

29

IGHD patients grow at slower than normal rates, resulting in a small stature. In addition to their short stature, all three patients examined displayed mild microcephaly and hypoplasia of the anterior pituitary [119]. Argente et al. determined that the RNPC3 mutations in the IGHD patients would lead to a P474T missense or R502X nonsense mutation in the C-terminal RNA binding motif (RRM) in the U11/U12-65K protein [119]. Given that U11/U12-65K bridges the U11- associated 59K protein and the U12 snRNA, and that the 65K protein binds U12 via its C-terminal

RRM, this mutation could affect U11/U12 di-snRNP integrity [72]. Indeed, subsequent biochemical work revealed that these mutations significantly reduced U11/U12 di-snRNP stability, due to either impaired RRM folding or NMD-mediated reduction in U11/U12-65K protein levels [119, 145]. As expected, RNAseq of patient and family control mononuclear blood cell samples revealed increased minor intron retention and aberrant splicing around minor introns; however, relative to the entire MIG pool, relatively few MIGs displayed altered splicing of the minor intron in the patient samples [119].

In 2017, Elsaid et al. identified the first disease linked to mutation in the U12 snRNA: familial, early-onset cerebellar ataxia [121]. The affected individuals in this Qatari family had significantly delayed motor milestones from early infancy onward. While none of the patients were microcephalic, all of them had cerebellar hypoplasia, indicative of abnormal cerebellar development [121]. Unlike the other disorders described in this subsection, however, this is not a purely developmental disease: Elsaid et al. report indications of cerebellar degeneration, as well

[121]. Whole genome sequencing of these patients revealed a homozygous g.84C>T mutation, positioned at the base of stem loop III [121]. This mutation would disrupt base-pairing at the base of the stem region of stem loop III, shortening the overall structure. Based on mutational work from the Shukla group, the length of stem loop III has a mild effect on minor spliceosome activity

30

[146]. However, RNAseq of mononuclear blood cell samples revealed elevated minor intron retention of 144 minor introns in patient samples relative to carrier controls [121].

As implied by the phenotype of these cerebellar ataxia patients, minor spliceosome disruption is not implicated only in disorders affecting central nervous system development. A growing body of literature is linking minor spliceosome disruption to specific motor disorders that are characterized by degeneration of motor system neurons.

1.3.2.2 Motor disorders

The association between motor disorders and minor spliceosome function was first suggested in 2007, when Gabanella et al. published work on a mouse model of severe spinal muscular atrophy (SMA) [147]. SMA, a motor neuron disorder characterized by motor neuron degeneration and muscle atrophy, is caused by mutation in the gene SMN1 [148]; its protein, SMN, is required for the formation of all spliceosomal snRNPs, except for U6 and U6atac [149, 150].

Using immunoprecipitation, Gabanella et al. found that U11 and U12 were the only snRNPs with significantly lower levels in the SMA mouse spinal cord and brain extracts analyzed [147]. These findings were supported by Zhang et al. in 2008 and Boulisfane et al. in 2011, who observed downregulation of minor snRNAs/snRNPs in an SMA mouse model and SMA patient-derived cells, respectively [151, 152]. Despite observing a 25-fold reduction in the U4atac/U6atac.U5 tri- snRNP, Boulisfane et al. only identified aberrant minor intron splicing in 4 of the 30 MIGs they interrogated by qRT-PCR [152]. From these results, they concluded that the inefficient splicing of one or two MIGs specifically involved in motor neuron function might result in SMA. The idea that the missplicing of specific MIGs may lead to the symptoms of SMA was further supported in

2012, when Lotti et al. employed a Drosophila model of SMA to study minor snRNA levels and

31 minor intron splicing, which revealed significant downregulation of all snRNAs, minor intron retention in two MIGs, and downregulation of 7 other MIGs [118]. Using Drosophila larvae, they found that cholinergic neuron-specific knockdown of one of these MIGs, which they named stasimon, resulted in a neurotransmitter release defect very similar to that observed in smn mutant flies [118]. Analysis of stasimon, also known as Tmem41b, in a mouse model of SMA revealed a significant downregulation of stasimon mRNA in motor and proprioceptive neurons in the early symptomatic stage of the disease [118]. Moreover, antisense morpholino-mediated knockdown of stasimon in zebrafish also resulted in phenotypes similar to those seen after smn knockdown [118].

Notably, misexpression of stasimon could rescue specific smn phenotypes in both Drosophila and zebrafish [118]. Based on these findings, they concluded that while the minor snRNA downregulation in SMA does not cause universal minor intron missplicing, misexpression of specific MIGs, such as stasimon, could potentially cause SMA [118].

However, not all evidence supports the model that SMA is primarily caused by aberrant minor intron splicing or MIG misexpression. In 2009, Baumer et al. reported that minor intron splicing was largely unaffected in an SMA mouse model, based on exon microarray analysis [153].

Huo et al. performed a similar analysis in 2014, using exon junction microarrays with SMA mouse motor neuron samples; they found that all significant changes in alternative splicing occurred at major, not minor, introns [154]. Furthermore, when the Dreyfuss group interrogated splicing and gene expression in motor neurons and white matter glia from pre-symptomatic SMA mice by

RNAseq, they found that only 9 MIGs were alternatively spliced in the SMA samples [155].

Notably, none of these splice events occurred adjacent to the minor introns within these genes, and they did not detect altered splicing or expression of stasimon, suggesting that minor splicing

32 defects are not the primary cause of SMA [155]. In all, it is currently unclear where minor splicing dysfunction fits in the timeline SMA progression.

More recently, minor spliceosome function has also been implicated in the motor disorder amyotrophic lateral sclerosis (ALS), which is characterized by progressive degeneration of motor neurons culminating in muscle atrophy [156]. Mutations in multiple proteins have been associated with ALS, including the proteins fused in sarcoma (FUS) and TAR-DNA-binding protein 43 kDa

(TDP-43) [156]. Both FUS and TDP-43 are heterogeneous nuclear ribonucleoproteins (hnRNPs) that are involved in RNA transcription and splicing [157]. TDP-43 also associates with gemins, which are involved in snRNP biogenesis [135, 158]. In ALS patients, cytoplasmic protein aggregates form in the motor neurons [159]. In most patients, these aggregates contain TDP-43; in a subset of ALS patients, the cytoplasmic protein aggregates contain FUS [160]. Cytoplasmic aggregation of either protein can reduce their levels in the nucleus, which is proposed to disrupt their function [159]. In 2013, Ishihara et al. reported significant downregulation of gemin and, unexpectedly, SMN protein levels both in TDP-43 knockdown cells and in samples from ALS patients [161]. They also observed downregulation of both the U12 snRNA and the U11/U12 di- snRNP in these cells and in spinal neuron samples from ALS patients [161]. Investigation into the splicing of minor and major introns in three specific MIGs (IPO4, IFT80, and GARS) by qRT-

PCR revealed that only IPO4 showed significantly elevated minor intron retention, while the other two MIGs were spliced normally [161]. Overall, these results suggested that downregulation of the U11/U12 di-snRNP might cause MIG-specific misexpression and minor intron retention in

ALS patients. This was further supported in 2014, when Highley et al. used exon microarrays to identify gene expression and splicing changes in lower motor neurons from ALS patients. This analysis showed downregulation of U11/U12-25K (SNRNP25) and U11/U12-48K (SNRNP48) in

33 the ALS samples, as would be expected from the downregulation of the U11/U12 di-snRNP observed by Ishihara et al. [161, 162]. However, Highley et al. did not report whether they identified changes in MIG expression or splicing from their microarray analysis, so it remained unclear whether widespread MIG misexpression occurred in ALS.

In 2016, Reber et al. linked minor spliceosome disruption with an ALS-causative FUS mutation, the first report of its kind [120]. They were also the first to establish that FUS binds to the U11/U12 snRNP, based on pulldown experiments with anti-FUS antibodies and biotinylated antisense oligonucleotides targeting U11 [120]. Subsequent experiments with in vivo splicing constructs indicated that the FUS protein recruits the U11/U12 di-snRNP to minor introns [120].

To directly assess the effects of ALS-associated cytoplasmic FUS accumulation on minor intron splicing, they studied the ALS-causative mutation P525L, which is located in the FUS nuclear localization signal [120]. This mutation causes early-onset, aggressive ALS, in which symptoms appear before 25 years of age [163]. Strikingly, this mutation not only disrupted minor intron splicing, but it also sequestered U11 and U12 snRNA to cytoplasmic FUS aggregates [120]. Based on their findings, Reber et al. proposed a model in which ALS associated with FUS mutation is caused by sequestration of the U11/U12 di-snRNP in the cytoplasmic FUS protein aggregates, resulting in depletion of U11/U12 di-snRNP in the nucleus and global inhibition of minor intron splicing [120]. This model predicts that ALS is not caused by a loss of FUS function, but rather by a toxic gain of function. Indeed, a report from the Shneider group, published the same year as the Reber et al. paper, showed that FUS conditional knockout mice did not develop ALS-like symptoms [164]. Using a FUS-P525L mouse model of ALS, they also determined that motor neuron death was not caused by interaction between the mutant and wild-type FUS, or by increased

34

FUS activity [164]. Instead, the P525L mutation itself caused a toxic gain of function—just as the model proposed by Reber et al. would predict [120, 164].

These motor disorders suggest that mature motor neurons are particularly sensitive to levels of minor splicing activity, such that impaired minor splicing leads to their death. Moreover, the developmental disorders linked to disrupted minor splicing indicate that dividing, progenitor cell populations are differentially sensitive to levels of minor splicing activity. Specifically, the progenitor cells of the central nervous system and the limbs appear to be the most sensitive to minor spliceosome disruption. In my dissertation research, I am interested in understanding the effects of minor spliceosome disruption on the progenitor cells of the central nervous system.

35

Chapter 1: Figures

Figure 1.1. Comparison of the components and action of the major, U2-type and minor, U12-type splicing pathways. At top is a schematic of a pre-mRNA, in which boxes are exons and lines are introns, with the locations of the three consensus sequences: the 5’ splice site (5’SS), branch point sequence (BPS), and 3’ splice site (3’SS). The steps of the U2-type, major and U12-type, minor splicing pathways are shown at the left and right, respectively. The analogous snRNAs and snRNPs are color-coded (U1/U11=blue; U2/U12=green; U4/U4atac=orange; U6/U6atac=red; U5=purple). The center panels show snRNA base-pairing to either the intron or to other snRNAs during each of these steps. The red “A” in the BPS panels marks the branchpoint adenosine.

36

Chapter 2: Central nervous system development

This chapter will cover the model systems used in my dissertation research: the mouse retina and the mouse cortex. Both of these tissues are affected in diseases caused by minor spliceosome inactivation. For example, as discussed in Chapter 1, many of the developmental disorders linked to disruption in minor intron splicing are characterized by microcephaly [119,

124, 140, 143]. In the most severe of these, MOPD1, microcephaly often presents alongside numerous brain defects, many of which are cortical abnormalities [124-127]. In both Lowry Wood and Roifman syndromes, many patients display retinal dystrophy [140, 143]. Details about the structure and development of these tissues are discussed in the following subsections.

2.1 Retinal development

Thanks to its stereotyped structure, easy accessibility, and comparatively simple neuronal composition, the retina was one of the first characterized models of central nervous system organization and development [165, 166]. These advantages of the retina made it a powerful initial system in which to study the role of minor intron splicing in central nervous system development.

Moreover, some of the diseases associated with minor spliceosome disruption are characterized by retinal defects, further strengthening the rationale for employing the developing mouse retina for my dissertation research [140, 143].

2.1.1 Structure of the mouse retina

The mammalian eye is comprised of three distinct layers of tissue: the outermost sclera, a tough layer of fibrous connective tissue; the choroid, the vascular, middle layer of connective tissue; and the retina, the innermost layer of thin neural tissue [167]. A single-cell layer of retinal

37 pigment epithelium (RPE) is located between the choroid and the outermost neurons of the retina, the photoreceptors. The pigment in the RPE cells allow them to absorb scattered light, protecting the retina and improving its function [168]. The cells also act as an interface between the blood vessels of the choroid and the cells of the retina, by transporting nutrients from the blood to the retina while shuttling retinal waste products into the blood [169]. On the other side of the retina, the inner limiting membrane separates the retina from the vitreous humor of the eye. This structure is formed by astrocytes and the end feed of Müller glia, the major glial cell of the retina [167]. The retina is comprised of three nuclear layers of neurons and two synaptic layers. The outer nuclear layer (ONL) contains the nuclei of the rod and cone photoreceptors; the cytoplasmic outer segments of these photoreceptors lie between the ONL and the RPE [167, 170]. These outer segments contain stacks of membranes, in which the rod- and cone-specific visual pigments are embedded [171]. Nuclei of the interneurons and glia of the retina are found in the middle nuclear layer, called the inner nuclear layer (INL). Amacrine cells, bipolar cells, and horizonal cells comprise the interneurons of the INL; the INL also houses the Müller glia. The outermost nuclear layer is the ganglion cell layer (GCL), which contains the output cell of the retina, the ganglion cells. The two synaptic layers are sandwiched between the nuclear layers of the retina: the outer plexiform layer (OPL) is comprised of synapses between the ONL photoreceptors and the horizonal and bipolar cells of the INL, while the inner plexiform layer (IPL) consists of synapses between amacrine and bipolar cells of the INL and the ganglion cells of the GCL [167, 170]. Axons from the ganglion cells throughout the retina coalesce into the optic nerve, ultimately synapsing in target brain regions, such as the geniculate nucleus and superior colliculus [167, 172].

During the process of phototransduction, light has to pass through the GCL, IPL, INL,

OPL, and ONL, to reach the outer segments of the photoreceptors. These photons interact with the

38 visual pigments found in the rod and cone outer segments, changing the conformations of these proteins [171]. Activation of these visual pigments results in hyperpolarization of the photoreceptors, which in turn reduces the amount of neurotransmitter released into the OPL [171].

The response of the bipolar cells of the INL depends on the specific bipolar cell type and their post-synaptic glutamate receptors; ultimately, the signal is transmitted through the bipolar cells in the INL to the ganglion cells of the GCL [173]. The horizontal cells of the INL, which synapse with multiple photoreceptors, mediate lateral inhibition of these photoreceptors, thereby modulating and coordinating photoreceptor output to the bipolar cells [174]. The final interneuron of the INL, the amacrine cells, modulate activity between the bipolar cells and the ganglion cells in the IPL [175]. The signal received by the ganglion cells of the GCL is then transmitted through the optic nerve and to the brain [167]. While Müller glia are not directly involved with this process, evidence suggests that they act as photo-optic tubes, allowing light to pass more easily through the retina to the photoreceptors. Since Müller glia stretch from the inner limiting membrane to the far edge of the ONL and contain few , light can pass through this structure with minimal scattering [167, 176]. They also function in the maintenance of neuronal health, through uptake of neuronal waste products and excess neurotransmitters [177].

2.1.2 Development of the mouse retina

In the mouse, eye development around embryonic day (E) 8.5. At this time-point, the lateral regions of the diencephalon evaginate to form the optic vesicles, which continue to expand laterally

[178]. When the distal side of the optic vesicle comes in close contact with the surface ectoderm at around E9.5, it triggers thickening of the surface ectoderm into the lens placode [178, 179]. The lens placode then signals back to the optic vesicle, inducing optic vesicle invagination [178, 179].

39

By E10.5, this optic vesicle invagination forms a thin, cup-like structure, fittingly termed the optic cup. The inner wall of the optic cup, which used to be the lateral wall of the optic vesicle, becomes the neural retina. In contrast, the outer wall receives signals from the surrounding mesenchyme and gives rise to the RPE [178, 179].

As retinal development progresses, the retinal neuroepithelial cells (a.k.a. retinal progenitor cells) begin to divide at the apical edge of the retina, adjacent to the RPE, to produce retinal neurons

[180]. As neurons are born, they migrate out of the proliferative, neuroblastic zone to their respective neuronal layer [181, 182]. Neurogenesis in the developing mouse retina begins around embryonic day (E) 11; both neurogenesis and gliogenesis end by approximately postnatal day (P)

14 [178]. Thanks to extensive pulse-chase experiments performed by Drs. Richard L Sidman and

Richard W. Young, the birth windows of the different retinal neurons are known [183, 184]. The first neurons born are ganglion cells, amacrine cells, horizonal cells, and cone photoreceptors, starting at E11. While the birth of these neurons begins around the same time in retinal development, their birth windows terminate at different time-points. Horizonal cell birth only continues until E15, while cone photoreceptors are born until E18. Birth of ganglion cells and amacrine cells continues until P3, although very few ganglion cells are born postnatally. Bipolar cells are the next cell type generated, with their production starting at E14 and ending at P11. The first Müller glia are born at E16, and their production continues until P9. Finally, rod photoreceptors are born both embryonically and postnatally, with production starting around E12 and ending around P14. In summary, most ganglion cells, horizontal cells, amacrine cells, and cone photoreceptors are born during embryonic retinal development. In contrast, most bipolar cells, Müller glia, and rod photoreceptors are born postnatally [183, 184].

40

While the adult retina contains three distinct nuclear layers, the E11 retina is initially comprised of a single layer of retinal progenitor cells. As neurogenesis begins, newborn ganglion cells begin to migrate to the basal edge of the retina [178]. By E14, the retina has two separate layers: the GCL and the outer neuroblastic layer (ONBL), which contains the retinal progenitor cells and newly born neurons [185]. Although the INL and ONL are not distinguishable at this point in development, newborn amacrine cells line the inner edge of the ONBL [186]. At approximately P5, the ONBL separates into the ONL and INL; active separation of these two nuclear layers continues until P10 [187].

2.2 Cortical development

As mentioned in Chapter 1, microcephaly is one of the most common symptoms of the

CNS disorders linked to minor spliceosome disruption [124, 140, 143]. Given the size of the in the human brain, microcephaly is often caused by impaired development of the cerebral cortex [188]. Indeed, hypoplasia of the cortical lobes has been reported in MOPD1 patients. Moreover, the microcephaly observed in MOPD1 commonly presents alongside cortical abnormalities, including hypoplasia of the cortical lobes, aberrant gyrification, and agenesis of the corpus callosum [124-127]. Therefore, I sought to study the effects of minor spliceosome inactivation in the developing mammalian cortex, using the developing mouse cortex as a model system. The following subsections expound upon the structure, development, and progenitor cells of the mammalian cortex.

2.2.1 Structure of the mammalian cortex

The mammalian cerebral cortex is, as its name suggests, the outer layer of the cerebrum.

This gray matter structure is comprised of six distinct neuronal layers, numbered from the

41 outermost I to innermost VI layers, along with a host of glial cells. Layer VI is located above a layer of white matter, comprised of axons from layers II-VI. Neurons of the cortex are divided into two general types: excitatory pyramidal cells, which project to different regions of the brain and make up ~80% of cortical neurons; and inhibitory interneurons, which modulate the activity of these excitatory neurons and comprise the remaining ~20% of cortical neurons [189]. Inhibitory interneurons are found throughout the cortical layers, although they are concentrated in layers I,

II, and V [190].

The outermost neuronal layer, layer I, is also known as the molecular layer. In the adult brain, layer I contains sparsely distributed inhibitory neurons, which receive input from the thalamus and other cortical regions [191, 192]. However, in the developing cortex, this layer is tightly packed with inhibitory Cajal-Retzius cells [191]; their source and function are described in the next subsection. Layers II and III contain commissural projection neurons, due to their projections to the contralateral cerebral cortex through the corpus callosum or, less commonly, the anterior commissure. The commissural projection neurons that cross through the corpus callosum are also known as callosal projection neurons [193]. In rodents, layers II and III are difficult to distinguish based on their cytoarchitecture alone; therefore, these layers are often studied together and referred to as layer II/III [189]. Layer IV is characterized by excitatory granular neurons, which receive input from thalamocortical projections and project up into layer II/III [193]. Layers V and

VI primarily consist of output projection neurons that transmit signals to subcortical regions of the brain, called corticofugal projection neurons [193]. Specifically, most projection neurons of layer

V are subcerebral projection neurons, which synapse with targets in the midbrain, brainstem, or spinal cord, depending on the cortical location of the neuron’s cell body [193]. For example, subcerebral projection neurons of the motor cortex send axons to the spinal cord and motor nuclei

42 of the brainstem [193]. Layer VI consists of corticothalamic projection neurons, which alter thalamic neuron activity and therefore modulate the signals sent from the thalamus to layer IV

[193, 194]. While most commissural projection neurons are found in layer II/III, commissural neurons are found throughout layers IV-VI. Some of these are associative projection neurons that project ipsilaterally, to different regions within the same hemisphere, while some are callosal neurons that project contralaterally. Some commissural neurons in layer V project ventrally, to the striatum [193, 195]. In addition to excitatory neurons that project to a single target, some cortical projection neurons have multiple projections that target different brain regions [193].

In addition to neurons, multiple subtypes of glial cells are present in the mammalian cerebral cortex. Astrocytes, the most common and most commonly studied glial subtype, are found throughout the cortical layers, where they associate with neurons and the vasculature [196, 197].

Astrocytic processes wrap around the blood vessels in the cortex; as a result, astrocytes contribute to the blood brain barrier, regulate vascular tone in the brain, and transport water, nutrients, and wastes between the blood and nearby neurons [198]. The many and varied functions of astrocytes also include the regulation of extracellular ion homeostasis; uptake and metabolism of extracellular neurotransmitters; synaptogenesis; promotion of synaptic pruning; and, more controversially, modulation of neuronal activity via the release of “gliotransmitters” [198-201]. The next glial subtype, the oligodendrocytes, are found in all cortical layers, although they are the densest in the deep layers (V-IV) [202]. Oligodendrocytes are the myelinating cells of the central nervous system, allowing for faster action potential conduction [203]. Oligodendrocytes are produced from

NG2-expressing oligodendrocyte precursor cells (OPCs), the third glial cell type in the cortex [202,

203]. OPCs also produce a subset of astrocytes in the mouse brain [202]. In the mature cortex,

OPCs are scattered throughout the cortical layers [202]. Evidence indicates that they serve as a

43 pool of quiescent progenitor cells that, in the presence of a demyelinating insult, migrate to the site of injury, proliferate, and differentiate into oligodendrocytes, thereby driving remyelination [204].

The final type of glia in the cortex is microglia. Microglia are a type of macrophage that function in the brain’s inflammatory immune response, and, like astrocytes and OPCs, microglia are uniformly distributed across the cortical layers [205]. Microglia have also been implicated in synaptic pruning, the promotion of neuron survival during development, and the regulation of neural progenitor cell number in the prenatal cortex [205].

2.2.2 Timeline of cortical development

In the mouse, the developing forebrain is first distinguishable in the closing neural tube around E8. At this point, specific regions along the rostral end of the closing neural tube begin to grow outward, bulging into defined vesicles [206]. The most rostral of these is the prosencephalon, or the primordial forebrain. By E9, the prosencephalon has segmented further, into the rostral- most telencephalon and the diencephalon. Specifically, the rostral end of the prosencephalon bulges laterally into two symmetrical vesicles, laying the foundation for the two cerebral hemispheres [207]. The dorsal region of the telencephalon will develop into the cerebral cortex and hippocampus, while the ventral telencephalon will produce the striatum. These regions of the telencephalon are alternatively called the pallium (dorsal telencephalon) and the subpallium

(ventral telencephalon) [208].

When the dorsal telencephalon is formed, it is comprised of a monolayer of pseudostratified neuroepithelial cells, whose processes anchor them to the apical (ventricular) and basal (pial) edges of the dorsal telencephalon. As these cells progress through the cell cycle, their nuclei move along the apical-basal plane such that mitosis occurs at the apical/ventricular edge,

44 while S-phase occurs at the basal edge [209]. Until E10, these neuroepithelial cells undergo symmetric, proliferative divisions to produce two neuroepithelial daughter cells, in order to amplify the progenitor pool [210, 211]. At E10, the neuroepithelial cells transition into radial glial cells (RGCs), also known as apical progenitors (APs) [211, 212]. Like the neuroepithelial cells,

RGCs undergo INM as they progress through the cell cycle [209, 211]. Most of these RGCs undergo symmetric proliferative divisions, producing two RGCs. However, starting between E10 and E11, some RGCs undergo asymmetric, differentiative divisions, which make one RGC and one differentiating cell [213-216]. Some of these differentiative divisions are neurogenic, in which this second daughter cell is a newborn excitatory neuron [215-217]. These newborn cortical neurons use the basal process of their mother RGC to migrate from the ventricular edge to the basal edge of the dorsal telencephalon. As more neurons reach their final destinations, the dorsal telencephalon becomes laminar, with the RGC-comprised ventricular zone (VZ) and the neuronal cortical plate (CP). Other differentiative divisions generate one RGC and another progenitor cell of the cortex, called intermediate progenitor cells (IPCs), also known as basal progenitors (BPs)

[216, 218, 219]. These IPCs migrate away from the ventricular edge to sit between the VZ and the

CP. These IPCs eventually form a third layer in the pallium, called the subventricular zone (SVZ), which becomes distinguishable from the VZ by E13 [220]. Unlike RGCs, IPCs have multipolar, short processes that do not span the dorsal telencephalon, and they do not undergo INM [218, 219,

221]. Most IPCs undergo symmetric neurogenic divisions to produce two neurons, which use nearby RGC basal processes to migrate to the CP [211, 216, 222]. A small fraction of IPCs divide symmetrically to expand the IPC pool, producing two IPCs, prior to terminal symmetric neurogenic division [216]. As a result, most excitatory neurons of the mouse cortex originate from

IPCs [220]. RGCs can also divide to produce one RGC, which remains in the VZ, and a monopolar

45

RGC lacking an apical process, which will reside in the SVZ. These basal RGCs, also known as outer RGCs, primarily undergo asymmetric neurogenic divisions [223, 224].

Neurogenesis in the developing mouse cortex occurs from E11 to E18, just prior to birth.

Cortical excitatory neurons are born in an “inside-out” manner, in which the deep layer neurons are born first and the upper layer neurons late in cortical neurogenesis. The first neurons born in the dorsal telencephalon are subplate neurons, which are produced from E10 until E12 [193].

These subplate neurons form a transitory neuronal layer below layer VI that persists until approximately P10. These neurons are important for cortical development, playing roles in synapse maturation and in the guidance of both incoming thalamocortical axons to layer IV and outgoing corticothalamic axons [225]. In this window of time, the neurons that will form layer I are also invading the basal dorsal telencephalon, forming the transient marginal zone of the developing cortex. These inhibitory neurons, which include Cajal-Retzius cells, are produced from other regions of the developing forebrain and migrate tangentially into the cortex. Specifically, the Cajal-

Retzius cells originate from the cortical hem, the pallial/subpallial border, and the retrobulbar area/pallial septum. These cells produce the extracellular matrix protein reelin, which serves as a signal to stop basally directed migration of newborn excitatory neurons [226]. Given their developmental function, most Cajal-Retzius cells undergo apoptosis, starting from approximately

P8 [227, 228]. Layer VI corticothalamic projection neurons are the next born in the dorsal telencephalon, between E11.5 and E13.5. From E13.5 until E14.5, the subcerebral projection neurons of layer V are born. Layer IV granular neurons are the next type generated, between E13.5 and E14.5. While the majority of the commissural projection neurons of layer II/III are born last in cortical development, between E14.5 and E18, production of these neurons actually begins by

E12.5 [193]. Throughout the production of excitatory neurons from the neural progenitor cells

46

(NPCs) of the dorsal telencephalon, inhibitory neuron production is occurring in the ganglionic eminences of the ventral telencephalon. These inhibitory neurons migrate both tangentially and radially into the developing cortex, starting at E12 and ending by birth [229].

Production of glia from the progenitor cells of the dorsal telencephalon starts in late embryonic development, with most glia being born postnatally. At the end of neurogenesis, the

RGCs of the VZ begin to transform into astrocytes, while other astrocytes are born from SVZ progenitor cells [230]. These newborn astrocytes can divide to amplify the astrocyte pool in the first two weeks of postnatal development [231]. Astrocytes are also produced from OPCs, another glial subtype and primary source of oligodendrocytes in the cerebral cortex [202]. Most OPCs are produced in the ventral telencephalon, from the ganglionic eminences and the anterior entopeduncular region. The first wave of these ventrally produced OPCs migrate tangentially to the dorsal telencephalon by E16, in a migratory stream concentrated in the deep layers of the pallium [202, 232]. Postnatally, OPCs are produced from progenitor cells of the developing cortex, which migrate radially up into the neuronal layers [202, 232]. The final cortical glial subtype, the microglia, originate from the yolk sac in early embryonic development. These microglia invade the developing brain by E9.5, prior to the onset of neurogenesis; in the dorsal telencephalon, these microglia cluster in the proliferative VZ and SVZ [233, 234].

47

Chapter 3: The role of minor intron splicing in mouse retinal development

This chapter marks the beginning of my research into the role of minor intron splicing in central nervous system development, by focusing on the retina. First, the expression of the minor spliceosomal snRNAs in the whole mouse embryo and in the developing mouse retina is discussed, followed by a microarray-based analysis of MIG expression across retinal development and investigation into the outcome of minor spliceosome inactivation in the neonatal mouse retina.

3.1 Rationale

As discussed in Chapter 1, diseases associated with minor spliceosome dysfunction indicate that minor intron splicing is essential for mammalian CNS development [103, 106-108,

118-123]. However, prior to 2015, there were no publications investigating how minor intron splicing regulates CNS development in a mammalian model system. Given the paucity of research on minor splicing in mammalian CNS development, I sought to characterize the expression of the minor spliceosome-specific snRNAs in the developing mouse CNS. To study how minor splicing affects CNS development, I employed two separate approaches, which I ran in parallel. First, I began breeding a U11 conditional knockout mouse the Kanadia laboratory had generated to a transgenic mouse line that expresses cre recombinase in the developing cortex; these mouse lines are explained in more detail in Chapter 4 [117, 235]. Second, I sought to study minor splicing in an easily accessible model of CNS development: the developing mouse retina. Notably, retinal dystrophy is observed in Roifman syndrome and Lowry Wood syndrome, further emphasizing the importance of minor splicing in the mammalian retina [140, 143]. To study minor splicing in mouse retinal development, I employed a custom microarray for MIG expression, using cDNA samples spanning embryonic and postnatal retinal development, and investigated the effects of

48 minor spliceosome inactivation in the neonatal retina, by P0 retinal electroporations of a published antisense morpholino that blocks association of the U12 snRNA with the BPS of minor introns

[236, 237]. The following sections include the contents of one of my publications, entitled “Minor splicing snRNAs are enriched in the developing mouse CNS and are crucial for the survival of differentiating retinal neurons” [116].

3.2 Results

3.2.1 Identification of potential murine gene copies of the minor spliceosome-specific snRNAs

First, we wanted to explore the expression kinetics of the minor snRNAs across mouse development. For this, we sought to design probes specific to the U4atac, U6atac, U11, and U12 snRNAs. However, current databases do not reflect possible isotypes (i.e., gene copies) of these snRNAs in the mammalian genome. While two isotypes for U6atac and U12 have already been reported [238], we undertook a bioinformatics analysis to identify other potential isotypes for all the minor-class snRNAs. For a given snRNA, we first extracted the known human minor-class snRNA sequence from the NCBI database. We then employed BLAT to align this known human sequence with the mouse genome in order to identify potential isotypes [239]. This yielded multiple loci; only those with >87% identity were considered potential isotypes. Next, we identified the that was syntenic to the human locus of each snRNA.

Only one U4atac gene copy (Gm22710) was identified in the mouse genome, which was considered syntenic to the human RNU4ATAC, since both are located within the same intron of the gene Clasp1 (Figure 3.1A). The U6atac isotype (Gm26473) found on 2 was considered syntenic to the human RNU6ATAC, since both are flanked by Wdr5 and Rxra (Figure

3.1B). The U11 isotype (Rnu11) found on chromosome 4 was considered syntenic to the human

49

RNU11, as both are flanked by the genes Gmeb1 and Taf12 (Figure 3.2A). The U12 isotype

(Rnu12) found on was considered syntenic to the human RNU12, as both are flanked by the same genes, Poldip3 and Cyb5r3 (Figure 3.2B). To further characterize whether or not the remaining potential gene copies would produce functional snRNAs, we investigated conservation at their respective functional sites. For U6atac, there are four functional sites: the 5’ interaction site, the U12-U6atac helix I site, the stem I site, and the stem II site [66-68].

Interestingly, for all the potential U6atac gene copies (Gm25541, Gm26573, Gm26044, Gm23879,

Gm23494, and Gm24831), none were 100% conserved within these four functional domains

(Figure 3.1B). For U11, there are two functional sites: the 5’ interaction site and the Sm-binding site [34, [240]. Four U11 isotypes (Gm25097, Gm22472, Gm25785, and Gm24890) from the

Ensembl database showed near 100% identity for these functional sites (Figure 3.2A). For U12, there are three crucial functional sites: the U12-U6atac helix I site, the branch site interaction region, and the Sm-binding site [56, 61, [240]. There were two other U12 isotypes (Gm25703 and

Gm24525) from the Ensembl database that showed near 100% identity for all three functional sites

(Figure 3.2B).

The bioinformatics strategy used to identify possible gene copies does not a priori predict the transcription of the genes. To determine this, we mined RNAseq data obtained from cytoplasmic extracts of embryonic day (E) 16 and postnatal day (P) 0 mouse retinae [241]. First, we sought to verify whether the RNAseq data reflected previous findings in regards to snRNA expression levels, where the major snRNAs are expressed 100-fold higher than the minor snRNAs

[56, 59]. Indeed, we found that the minor-class snRNA levels were at least 100-fold lower than their major-class snRNA counterparts, except for U11, whose levels were only 30-fold lower than

U1 (Figure 3.3A). This confirmation of the dataset provided us the confidence to interrogate the

50 expression levels of the identified gene copies in the RNAseq data. For U6atac, two of the six identified U6atac gene copies, Gm25541 and Gm23879, were expressed at these two time points

(Figure 3.3B). For U11, Rnu11 was the highest expressed gene copy, and, of the remaining candidates, only Gm24890 showed expression above threshold (Figure 3.3B; methodology: [241]).

Notably, although the functional sites of U11 are conserved in this potential gene copy, Gm24890 is 32 nucleotides shorter than the known U11 sequence. The levels of the known U12 gene, Rnu12, were the highest, with Gm25703 expressed at much lower levels (Figure 3.3B). The other identified U12 gene copies were not expressed (Figure 3.3B). These results suggest that U6atac,

U11, and U12 have at least two potential gene copies that are expressed at some point during development. However, whether these products of these potential gene copies are functional in minor intron splicing remains is unclear.

3.2.2 The minor spliceosomal snRNAs are enriched in the developing murine CNS and limbs

Next, we employed whole mount in situ hybridization (WISH) on embryos at E9.5, E10.5, and E12.5 with probes for U4atac, U6atac, U11, and U12. Here, we utilized full length snRNA sequence for probe preparation, which could hybridize with all snRNAs transcribed from potential gene copies (Figures 3.1&3.2). Expression of U4atac was observed in the entire embryo at E9.5, with enrichment in the developing CNS, branchial arches, fore- and hind-limb buds, and the somites (Figure 3.4). At E10.5 and E12.5, U4atac expression persisted in the same structures, except expression in the developing heart was visibly lower than the surrounding tissues at E12.5

(Figure 3.4A’&A’’ arrowhead). Similarly, expression of U6atac was observed in the developing

CNS, branchial arches, and fore- and hind-limb buds at E9.5, with persistent expression in the same structures at E10.5 and E12.5 (Figure 3.4B-B’’). As with U4atac, there was little U6atac

51 expression in the developing heart at E12.5 (Figure 3.4B’’ arrowhead). Both U11 and U12 showed robust expression in the CNS, branchial arches, fore- and hind-limb buds, and somites at E9.5

(Figure 3.4C&D). While the expression pattern continued in the same structures at E10.5 and

E12.5 for both U11 and U12, there was a distinctly low signal in the developing hindbrain (Figure

3.4C’’&D’’ arrowheads). WISH was performed using either an antisense and a control sense probe for each snRNA. No signal was observed using the U11, U12, or U6atac sense probes, but sense

U4atac probe showed faint signal in the developing ventricles in the E10 embryo (Figure 3.5).

Overall, WISH analysis showed near ubiquitous expression of all minor-class snRNAs, with specific enrichment in the developing head and limb buds.

3.2.3 The minor spliceosomal snRNAs are expressed across embryonic mouse retinal development

Our WISH analysis revealed that all minor-class snRNAs were enriched in the developing head compared to the rest of the embryo, in agreement with the phenotypes observed in developmental diseases characterized by minor spliceosome disruption [124, 140, 143]. To further investigate the expression of the minor splicing components in the developing mouse CNS, we used the mouse retina as a model system and performed section in situ hybridization (SISH) across embryonic retinal development (E12, E14, E16, and E18). During embryonic development, all four minor-class snRNAs were observed in both the ONBL, which consists of newly born neurons and cycling progenitor cells, and the newly forming ganglion cell layer (GCL) (Figure 3.6) [178,

185]. It should be noted that expression of U12 appeared to be highly enriched in the ganglion cells, such that its relative expression was reduced in the ONBL by E18, barring scattered cells in

52 the ONBL with high U12 expression (Figure 3.6M-P). In contrast, robust expression of U11 was maintained in the ONBL at E18 (Figure 3.6I-L).

3.2.4 The minor spliceosomal snRNAs are enriched in differentiating retinal neurons

Next, we extended our SISH analysis across postnatal retinal development (P0, P4, P6,

P14, and adult). At P0, expression of all minor-class snRNAs except U6atac was observed at a higher level in the differentiating neurons in the GCL and the newly differentiating amacrine cells lining the basal edge of the ONBL (Figure 3.7A, K, & P arrowheads) [186]. Specifically, U12 and

U11 showed increased levels of expression in the GCL (Figure 3.7K&P). To confirm this enrichment, we dissociated P0 retinae and performed double fluorescent in situ hybridization for

U11 or U12, along with NF68, a marker for retinal ganglion cells [242]. Indeed, expression of U11 and U12 was higher in cells that were also positive for NF68 when compared to the other DAPI+ cells (Figure 3.8). At P4, P6, and P14, expression of all minor-class snRNAs appeared to be enriched in differentiating neurons (Figure 3.7B-S). Specifically, horizontal cells were observed with enriched expression of all four snRNAs at P4 (Figure 3.7B, G, L, Q arrowheads). At P6, expression of all four snRNAs was also observed in photoreceptors at the basal edge of the ONBL

(Figure 3.7C, H, M, & R arrowheads). Finally, expression of all four snRNAs was observed in all retinal neurons at P14, and this expression was maintained in the adult retina (Figure 3.7D, I, N,

S, E, J, O, & T).

3.2.5 Expression of MIG isoforms that require minor intron splicing shifts between embryonic and postnatal retinal development

53

Another way to understand the role of the minor spliceosome is by understanding its targets. In this case, those targets are minor intron-containing genes (MIGs). To study the MIGs in the mouse genome, we extracted 461 MIGs from the U12 Intron Database [58]. To glean the molecular pathways in which MIGs might participate, we performed DAVID analysis on these genes [243, 244]. This analysis yielded 45 GOTERM terms with Bonferroni corrections ≤0.005.

Shown in Table S1 are select functions enriched by these MIGs, such as intracellular transport, transmembrane transport, voltage-gated ion transport, MAP kinase activity, and ATP binding.

While this analysis revealed the potential functions that MIGs could execute, only by understanding which MIGs are expressed and when these MIGs are expressed in the developing retina can one determine the molecular pathways in which they participate. To this end, we designed a custom microarray, which allowed us to interrogate the expression of exon-exon junctions that require minor splicing (Figure 3.9A). We employed this custom microarray on cDNA made from RNA prepared from cytoplasmic extracts from E12, E16, E18, P0, P4, P10, and

P25 retinae. Here, we chose to specifically employ cytoplasmic extracts because the cytoplasmic mRNA pool represents mRNAs that have been successfully spliced and exported, thus proxying for the protein produced.

The raw microarray data was processed through Genesis (v1.7.6) and subjected to k-means clustering to find MIGs that had exon junctions with similar expression kinetics [245]. This analysis generated 10 clusters, which were further subdivided into “embryonic,” “postnatal,” or both embryonic+postnatal (“E+P”) categories (Figure 3.10A-C). The production of these three categories is described in detail in Section 3.4, the materials and methods subsection corresponding to this chapter. There were three clusters that showed enrichment of interrogated exon junctions during embryonic development. The first, cluster 1, showed enrichment at E12 to P0, followed by

54 a precipitous decline from P4 to P25. The second cluster showed enrichment at E12, followed by a drop to threshold at E16, after which it remained at threshold, except for a drop at P4 below threshold (Figure 3.10A). The final embryonic cluster, cluster 3, showed a spike in expression at

E16 and a spike just above threshold at P4, with expression remaining at or below threshold for all other time points (Figure 3.10A). There were also three clusters with exon junctions enriched postnatally. The first cluster, cluster 4, showed a steady increase in expression from E12 until P25, with expression above threshold from P4 onward (Figure 3.10B). In cluster 5, expression of exon junctions was just above threshold at E12, below threshold from E14 to P4, and showed a steady increase in expression from P4 to P25 (Figure 3.10B). The final postnatal cluster, cluster 6, contained exon junctions with a sharp rise in expression at P4 (Figure 3.10B). The remaining four clusters were grouped in the “embryonic+postnatal” (E+P) category. The first, cluster 7, showed bimodal expression, with the first peak at E16 and the second at P10 (Figure 3.10C). The second

E+P cluster, cluster 8, also showed bimodal exon junction expression, with the largest peak at P0 and a second peak just about threshold at P10 (Figure 3.10C). The third cluster, cluster 9, showed a steady expression just above threshold at all time points, except for the subthreshold troughs at

E12 and P10 (Figure 3.10C). The final E+P cluster, cluster 10, showed bimodal expression, with peaks at E12 and P4 (Figure 3.10C).

While the identification of MIGs that have exon junctions with overlapping expression patterns suggests dynamic regulation of MIGs, it does not inform their biological role in the developing retina. To extract this information, we curated genes belonging to either the embryonic clusters, the postnatal clusters, or the E+P clusters, followed by DAVID analysis. Interestingly,

MIGs in the embryonic category enriched for functions involved in RNA metabolism, including

RNA processing and ribosome biogenesis, along with cell cycle, nuclear transport, and mRNA

55 transport (Figure 3.10A’). In contrast, MIGs in the postnatal category enriched for transport-related functions, including intracellular protein transport, vesicle-mediated transport, MAP kinase activity, and voltage-gated calcium channel activity (Figure 3.10B’). Interestingly, the genes underlying vesicle-mediated transport are genes mostly involved in synapse formation, such as

Huntingtin, which is shown to be required for synaptic development [246], and syntaxin binding protein 5, which is known to be involved in the docking and fusion of synaptic vesicle [247].

Finally, in the E+P clusters, MIGs enriched for protein transport, sodium ion transport, voltage- gated calcium channel activity, and phospholipase C activity (Figure 3.10C’).

3.2.6 The minor spliceosome is required for the survival of differentiating retinal neurons

Our custom microarray analysis revealed that MIGs are dynamically expressed across retinal development. Moreover, DAVID analysis showed distinct function enrichments for MIGs enriched embryonically versus postnatally (Figure 5). Specifically, MIGs enriched postnatally showed enrichment for functions such as ion transport, protein transport, and vesicle-mediated transport. All of these functions suggest that these genes are involved in the terminal differentiation of retinal neurons, such as synapse formation. This result, together with the minor-class snRNA enrichment in differentiating neurons observed in SISH analysis, led us to hypothesize that the minor spliceosome is required for proper neuronal differentiation. To test this hypothesis, we employed in vivo P0 retinal co-electroporations of antisense morpholino (MO) against the U12 snRNA and a pCAG-GFP expression plasmid to drive GFP expression in electroporated cells (ref); a scrambled morpholino (Scr-MO) was co-electroporated with pCAG-GFP in littermate controls.

Specifically, the antisense MO targeting U12 was complementary to nucleotides 9-33, which would disrupt base-pairing both between U12 and the BPS and between U12 and U6atac [236].

56

The electroporated pups were harvested at P14 to analysis. In the U12 MO-electroporated P14 retina, we observed very few GFP+ neurons relative to the control, suggesting that the electroporated neurons might be undergoing apoptosis in the absence of minor spliceosome activity (Figure 3.11). To confirm this, we performed deoxynucleotidyl dUTP nick end labeling (TUNEL) to identify cell death in U12 MO- and scrambled MO-electroporated neurons

(Figure 3.11D&D’). Indeed, we observed TUNEL+ cells in the outer nuclear layer (ONL) of the

U12 MO-electroporated retina (Figure 3.11D’), including GFP+/TUNEL+ neurons undergoing apoptosis (Figure 3.11D’’). In all, this suggested that loss of minor spliceosome function most likely resulted in neuronal death. However, it did not rule out progenitor cell death. To distinguish between these two possibilities, we harvested retinae of P0-electroporated mice at P3, P7, and P10.

At P3, we observed U12 MO-electroporated retinae with GFP+ progenitor cells and newly differentiating amacrine cells at the basal edge of the ONBL, similar to those in retinae electroporated with the scrambed control MO. We also performed TUNEL on these sections, and we observed no difference in number of TUNEL+ cells between control and U12 MO- electroporated retinae. A similar analysis was performed at P7, where we observed GFP+ cells in both control and U12 MO-electroporated retinae. However, we observed more TUNEL+ cells in the U12 MO-electroporated retinae. A similar increase in TUNEL+ cells was observed at P10

(Figure 3.10F). In all, the majority of the TUNEL+ cells were observed between P7 and P10, so we quantified the number of TUNEL+ cells as a percentage of the total number of GFP+ electroporated cells. This analysis showed that there was a significant increase in TUNEL+ cells, relative to the number of GFP+ electroporated cells, in the U12 MO-electroporated retinae when compared to the scrambled MO-electroporated retinae at P7 and P10 (Figure 3.10E&F).

57

3.3 Discussion

Here, we have bioinformatically shown the presence of multiple potential isotypes for each minor-class snRNA, except for U4atac, along with the expression of these isotypes in the E16 and

P0 retina. While their transcription across the entirety of development remains unexplored, their presence raises the possibility that a mutation in one could be compensated by the other isotypes.

It is also possible that each isotype might have a tissue-specific expression pattern, and therefore a tissue-specific function, necessitating further exploration into the expression of these potential isotypes. Interestingly, U4atac, which lacks even potential gene copies, is the snRNA linked to the most diseases, including MOPD1 [124, 140, 143]. Some of the symptoms of MOPD1 include microcephaly, micrognathia, and skeletal dysplasia [126]. Indeed, WISH expression analysis for

U4atac across embryonic mouse development revealed that it was enriched in the developing head, branchial arches, and limb buds, thus suggesting an important role for this snRNA and, by extension, the minor spliceosome, in mammalian CNS and skeletal development. Given that the minor spliceosome requires the presence of all minor snRNAs to function, it is not surprising that the other snRNAs show similar expression patterns. The efficiency of minor splicing has been associated with limiting amounts of these snRNAs. Specifically, in HeLa cells, the levels of U6atac are maintained very low and are upregulated under stress to increase the efficiency of minor splicing [88]. Our RNA sequencing analysis also supports the premise of U6atac as a limiting factor, as its levels were much lower than any of the other minor-class snRNAs (Figure 3.3A).

Thus, it can be imagined that a similar mechanism, whereby U6atac levels inform the efficiency of minor intron splicing, might play a role in overall embryonic development. WISH analysis showed that all four minor-class snRNAs are enriched in the developing CNS and limb buds, which suggests that minor splicing is most effective in the developing CNS and limb buds (Figure 3.4).

58

Unexpectedly, WISH analysis using the sense probe for U4atac also revealed enrichment in the developing head (Figure 3.5). Since the mouse U4atac gene, Gm22710, lies within a Clasp1 intron on the opposite strand, it is possible that the U4atac sense probe is hybridizing with unspliced

Clasp1 transcripts, resulting in the observed signal (Figure 3.1A).

The enrichment of the minor-class snRNAs in the developing head suggests that the functions executed by the minor spliceosome and its targets play a major role in the developing mouse CNS. Specifically, in the retina, the minor-class snRNAs were further enriched in newly differentiating retinal neurons during postnatal development (Figures 3.6&3.7). Interestingly, this enrichment was also observed in differentiating retinal neurons in embryonic development, but only for U12 (Figure 3.6N-P); the other minor-class snRNAs were enriched ubiquitously in the embryonic retina (Figure 3.6). Given that all minor-class snRNAs are required to form a functional minor spliceosome, this divergence in expression was unexpected. One possibility is that the U12 antisense probe is detecting the expression of both U12 and the piwi-interacting RNA (piRNA) that is embedded in the Rnu12 gene [248, 249]. Another possibility is that the U12 snRNA performs another function, separate from its role in minor splicing, that is needed in these differentiating neurons. In either case, further investigation is required to understand this divergence in U12 expression. The expression kinetics of another minor snRNA, U6atac, also differs from those of the other minor snRNAs (Figure 3.7). Instead of the differentiating neuron- specific enrichment at P0 seen for the other snRNAs, U6atac signal was ubiquitous in the P0 retina

(Figure 3.7A,F,K,P). Since U6atac has been linked to stress response, it is possible that stress caused by the birthing process could result in the ubiquitous expression observed [88]. As with

U12, this divergent expression warrants further investigation.

59

The shift from ubiquitous minor-class snRNA signal in the embryonic retina to more specific enrichment in newly differentiating retinal neurons in postnatal development suggests that the demand for minor splicing shifts across retinal development. In turn, this shift suggests that the targets of minor splicing, the MIGs, play shifting roles in mouse retinal development. Indeed, our microarray analysis showed that MIGs enriched in embryonic development were involved in a specific set of functions, while those enriched in postnatal development were involved in a separate set of functions. These shifts in MIG-regulated functions across retinal development are schematized in Figure 3.12, with each function denoted by a certain color. Here, RNA processing and cell cycle were enriched only in embryonic retinal development, while vesicle-mediated transport was only enriched in postnatal development (Figure 3.12, orange, green, and purple).

Intracellular protein transport was maintained across all of retinal development (Figure 3.12, red).

Additionally, calcium homeostasis-related functions, including voltage-gated calcium channel activity and phospholipase C activity, were present across retinal development, with enrichment in postnatal development (Figure 3.12, yellow). Notably, the embryonic functions identified here mirror those affected in developing zebrafish with deficient minor splicing, which die 7-10 days post fertilization [113]. In these larval zebrafish, the expression of genes involved in exosome formation, mRNA transport, and cell cycle was disrupted [113].

Intriguingly, some of the postnatally enriched MIGs involved in vesicle-mediated transport have been linked to neuronal differentiation, such as syntaxin binding protein 5 and huntingtin

[246, 247]. Taken together with the retinal SISH data, this suggests that minor splicing is vital for the terminal differentiation of retinal neurons. This idea is further captured by our observation that

U12 morpholino-mediated disruption of minor spliceosome function resulted in apoptosis between

P7 and P10, when the majority of neurons born postnatally undergo differentiation. In all, our

60 analysis is one of the first attempts to decipher the role of the minor spliceosome in mouse development. Moreover, it creates a framework within which we can begin to further delineate the role of this unique machinery in mammalian development.

3.4 Materials and methods

Gene Copy Analysis

Known human minor spliceosome-specific snRNA gene sequences for RNU4ATAC,

RNU6ATAC, RNU11, and RNU12 were obtained from the National Center for Biotechnology

Information (NCBI) database. These were then compared to the mouse genome utilizing the UCSC

BLAT function [239]. All sequences were compared against the Dec. 2011 (GRCm38/mm10) assembly, using default settings for the BLAT function [248]. Sequence similarity was rated by

BLAT based on percent sequence conservation (% identity) and a score of nucleotide matches vs. mismatches between the query and return sequences (score). Any BLAT return with overall identity of less than 87% or a score of less than 87 was ignored. The gene loci of BLAT returns were verified using the Ensembl Genome Browser [250]. Minor spliceosomal snRNA secondary structures were also obtained from Ensembl.

Animal Protocol

CD1 mice from Charles River Laboratory, MA, were used for all experiments. All mice procedures were compliant with the protocols approved by the University of Connecticut’s

Institutional Animal Care and Use Committee (IACUC).

RNA Sequencing

61

CD1 mouse retinae were collected at embryonic day (E) 16 and postnatal day (P) 0. RNA was isolated using the Thermo Scientific NE-PER Nuclear and Cytoplasmic Extraction Kit

(Thermo Fisher Scientific), as described previously [251]. The cytoplasmic samples were then used for RNAseq on the Illumina HighSeq 2000 platform. The details of data processing and analysis are available in [241].

Whole Mount in Situ Hybridization (WISH)

Amplified PCR products of full-length mouse U11, U12, U4atac, and U6atac snRNAs were generated using the primers in Table S2. These products were then cloned into pGEMT, followed by PCR with T7 and SP6 primers. The resulting PCR products were then used for in vitro transcription for production of DIG-labeled antisense RNA probes. Whole-mount in situ hybridization on E9.5, E10.5, and E12.5 mouse embryos was performed as described previously

[252].

Section in Situ Hybridization (SISH)

Sixteen μm cryosections of CD1 mouse heads (for embryonic time points) and retinae (for postnatal time points) were used for in situ hybridization, as described previously [242]. The same probes used for whole mount in situ hybridization were also used for section in situ hybridization.

Double Fluorescent in Situ Hybridization

Retinae from P0 CD1 mice were dissociated and processed for double fluorescent in situ hybridization, as described previously [242, 253]. Antisense RNA probes against U11 and U12

62 were labeled with FITC. The DIG-labeled antisense probe against NF68 was the same as that described previously [242].

Custom Microarray Design

The custom microarray was designed by leveraging the U12DB [58]. From the U12DB, a list of 461 MIGs was extracted, and the exon-exon junctions requiring minor splicing were curated.

Of the 461 MIGs, 329 MIGs had 1 minor intron, 116 had 2 minor introns, 11 had 3 minor introns, and 5 had 4 minor introns (Figure 3.9B). As shown in Figure 3.9A, we designed probe sets at the exon-exon junctions formed by minor splicing, which allowed us to detect both the successful splicing of each minor intron and the expression of all MIG isoforms generated by minor intron splicing. Internal control probe sets were designed by Affymetrix to validate successful hybridization, and only those arrays that met these requirements were used for our analysis.

Retinae from CD1 mice at different stages of development (E12, E14, E16, E18, P0, P4, and P25) were harvested and subjected to the RNA fractionation as previously described [251].

One µg of the cytoplasmic RNA was then employed for microarray analysis, which was performed at the Yale Center for Genome Analysis.

Microarray Data Analysis

Expression levels of probe targets were computed from the raw intensity values using the

Robust Multichip Average (RMA), which was performed with the affy R package [254-256]. Here, data normalization was performed for a probe-set over time, and not among the probe-sets. This analysis allowed us to capture expression kinetics of a given exon-exon junction over time without the influence of expression levels of different probe-sets at the same time point. Subsequently, the

63 data were processed through Genesis (v1.7.6) for k-means clustering [245]. Here, we ran 50 iterations to generate a total of 10 clusters that fell into three categories based on expression kinetics. These trends were defined as embryonic, postnatal, and both (“embryonic+postnatal”).

However, we encountered an issue with probe distribution within clusters. In our analysis, a given

MIG can have multiple probe-sets, each of which detects the splicing of a specific minor intron.

Additionally, different combinations of these minor introns may be spliced in different isoforms.

As a result, a single probe set for a MIG could hybridize with a single isoform, or all isoforms, or select isoforms of that MIG, depending on how many isoforms contain that exon-exon junction.

Since each isoform may have a different expression profile across time, the probe-sets for the same

MIG could show up in different clusters. Indeed, this is what we observed. Thus, to find MIGs with the highest degree of overlap in expression kinetics, we analyzed how each MIG’s probe-sets were distributed across the aforementioned clusters. If all (100%) of a gene’s probe-sets were found in clusters that belonged to the same category (e.g., embryonic), the gene was considered to belong to that category (e.g., embryonic). In contrast, if all of a MIG’s probe-sets were not present in the same category, that gene was moved to the embryonic+postnatal category. After this binning process, the final three gene lists were separately analyzed using DAVID [243, 244].

U12 Morpholino P0 in vivo Electroporation

Previously published U12 morpholino (5’-TCGTTATTTTCCTTACTCATAAGTT-3’) or scrambled control morpholino (5’-CCTCTTACCTCAGTTACAATTTATA-3’) were co- electroporated with pCAG-GFP in P0 mouse retinae [221] (Gene Tools). Electroporation was performed as previously described, with one modification [213, 222]. Instead of using a Hamilton

64 needle, we employed a glass needle attached to a femtojet (Eppendorf) to deliver plasmids.

Electroporated retinae were harvested at P3, P7, P10, and P14.

TUNEL

TUNEL assay was performed on 16 μm cryosections of P0 electroporated retinae using the in situ cell death detection kit, TMR red (Roche Diagnostics), in accordance with the manufacturer’s instructions.

Image Acquisition and TUNEL Quantification

Processed sections were imaged by confocal fluorescence microscopy, which was performed with a Leica SP2 confocal microscope. Collected images were further processed using

IMARIS (Bitplane Inc.) and Adobe Photoshop CS4 (Adobe Systems Inc.). GFP+ and TUNEL+ cells were manually counted using the spot tool in the IMARIS software. The mean

TUNEL+/GFP+ ratios for U12 morpholino- and scrambled morpholino-electroporated retinal images were calculated. Statistical significance was determined using two-tailed student’s t-tests.

65

Chapter 3: Figures

Figure 3.1. Gene isotypes of U4atac and U6atac in the mouse genome. Human snRNA gene sequences for (A) RNU4ATAC and (B) RNU6ATAC were compared to the mouse genome utilizing the UCSC BLAT tool and returns were rated based on percent sequence conservation (% Identity) and a match vs. mismatch score (Score). At the top of each table is the human snRNA gene. All genes listed under the dark dividing line are BLAT returns organized by score. Gene loci were determined using Ensembl Genome browser to visualize genes flanking all BLAT returns. The secondary structure for each snRNA was also given by Ensembl. Colored boxes indicate sequences within the snRNA which are important for either RNA or protein interactions. Deviations in these binding sequences are denoted by either blue letters (nucleotide mismatches), red dashes (nucleotide deletions), or green letters (nucleotide additions).

66

Figure 3.2. Gene isotypes of U11 and U12 in the mouse genome. Human minor spliceosome snRNA gene sequences for (A) RNU11 and (B) RNU12 were compared to the mouse genome utilizing the UCSC BLAT tool and returns were rated based on percent sequence conservation (% Identity) and a match vs. mismatch score (Score). At the top of each table is the human snRNA gene. All genes listed under the dark dividing line are BLAT returns organized by score. Gene loci were determined using Ensembl Genome browser to visualize genes flanking all BLAT returns. The secondary structure for each snRNA was also given by Ensembl. Colored boxes indicate sequences within the snRNA which are important for either RNA or protein interactions. Deviations in these binding sequences are denoted by either blue letters (nucleotide mismatches), green letters (nucleotide additions), or red dashes (nucleotide deletions).

67

Figure 3.3. Multiple minor snRNA gene copies are expressed in the deep sequencing data from E16 and P0 mouse retinae. (A) Expression levels of the major and minor snRNAs in the cytoplasm of the E16 and P0 mouse retinae, by deep sequencing. Samples are listed on the x-axis, with FPKM values on the y-axis, on a log scale. Solid lines represent the levels of the minor snRNAs (purple=U4atac, green=U6atac, blue=U11, red=U12), while dashed lines represent the levels of their major snRNA counterparts (purple=U4, green=U6, blue=U1, red=U2). (B) Expression of the potential minor snRNA gene copies in the E16 and P0 cytoplasmic extracts by deep sequencing. Heading colors correspond to the colors from (A). A threshold of 1 FPKM was used for deeming whether a gene was expressed. FPKM = fragments per kilobase per million reads. FPKM = fragments per kilobase of transcript per million mapped reads

68

Figure 3.4. The minor snRNAs are enriched in the embryonic mouse head. Whole mount in situ hybridization with DIG-labeled anti-sense RNA probe for U4atac (A-A”), U6atac (B-B’’), U11 (C-C”), and U12 (D-D”) in E9.5 (A, B, C, D), E10.5 (A’, B’, C’, D’), and E12.5 (A’’, B’’, C’’, D’’) mouse embryos. Lateral and dorsal views of embryos are shown here.

69

Figure 3.5. Control analysis for WISH with sense and antisense probes for the minor snRNAs. In situ hybridization using DIG-labeled RNA sense (right column) and antisense (left column) probes for all minor snRNAs in the E10 mouse embryo. Lateral and dorsal views of the embryos are shown here.

70

Figure 3.6. The minor snRNAs are expressed in embryonic progenitor cells and differentiating neurons in the developing mouse retina. Section in situ hybridization with DIG-labeled anti-sense RNA probe for U4atac (A-D), U6atac (E-H), U11 (I-L), and U12 (M-P) across embryonic retinal development. Minor snRNA names are listed on the left, with the developmental time points listed at top. The entire collage was modified in Adobe Photoshop CS4 with the following values: 50 for shadow input, 0.85 for midtone input, and 239 for highlight input. Scale bars are all 100 µm. ONBL = outer neuroblastic layer; GCL = ganglion cell layer.

71

Figure 3.7. The minor snRNAs are enriched in differentiating neurons in the postnatal mouse retina. Section in situ hybridization with DIG-labeled anti-sense RNA probe for U4atac (A-E), U6atac (F-J), U11 (K-O), and U12 (P-T) across postnatal retinal development from P0 to adult. Minor snRNA names are listed on the left, with the developmental time points listed at top. The entire collage was modified in Adobe Photoshop CS4 with the following values: 61 for shadow input, 0.87 for midtone input, and 240 for highlight input. Scale bars are all 100 µm. ONL = outer nuclear layer; INL = inner nuclear layer; GCL = ganglion cell layer.

72

Figure 3.8. U11 and U12 snRNA expression is enriched in retinal ganglion cells at P0. (A-D) Double fluorescent in situ hybridization on P0 retinal cells with FITC-labeled U11 probe (green) along with DIG-labeled Nf68 (red). (E-H) FITC-labeled U12 probe (green) along with DIG-labeled NF68 (red). Nuclei were stained with DAPI (blue). Each color layer of the collage was modified with Adobe Photoshop CS4 with the following values. Blue layer: 0 for shadow input, 0.73 for midtone input, and 113 for highlight input. Green layer: 0 for shadow input, 0.61 for midtone input, and 72 for highlight input. Red layer: 0 for shadow input, 0.65 for midtone input, and 110 for highlight input.

73

Figure 3.9. Custom microarray design and probe set statistics. (A) Probe set design. Major introns are shown in black and minor introns in orange. Probe sets were designed to span minor splicing-dependent exon-exon junctions, shown here as orange lines above these exon-exon junctions. (B) Minor intron exon-exon junction probe set statistics for all MIGs identified by the microarray. Numbers within each slice represent the number of MIGs with the specified number of minor splicing-dependent exon-exon junction probe sets.

74

Figure 3.10. MIGs play different roles in embryonic versus postnatal retinal development. Cluster images on the left show normalized exon-exon junction expression across cytoplasmic retinal samples at E12, E16, E18, P0, P4, P10, P25 (x-axis). Normalized expression values are listed on the y-axis. Clusters are represented in centroid view obtained from Genesis, where each bar at a given time point represents the variation in expression kinetics of each exon-exon junction at that time point. The background for embryonic clusters is colored blue (A), the background for postnatal clusters is yellow (B), and the background for clusters in the embryonic+postnatal category is colored gray (C). Shown on right are the results of DAVID analysis. Headings for the three bins correspond to their cluster color and contain total gene lists for that bin: blue for embryonic (A’), yellow for postnatal (B’), and gray for embryonic+postnatal (C’). Only terms from the GOTERM category with a Benjamini value of ≤0.05 were included. Domain-specific DAVID results were discarded. Synonymous terms were also discarded in favor of a representative term with the highest number of genes. Indented terms indicate that the indented function was enriched by a subset of genes in the preceding functional term.

75

Figure 3.11. U12 snRNA function is essential for the survival of differentiating postnatal retinal neurons. Shown here are sections of retinae that were electroporated with either control (scrambled) morpholino (left column) or morpholino against U12 (right column), followed by harvests at P3 (A, A’), P7 (B, B’), P10 (C, C’), and P14 (D-D’’’). Cell death was detected by TUNEL assay, shown in red, and nuclei are marked by DAPI (blue). Higher magnification of a GFP+ neuron undergoing apoptosis (red) is shown in the inset (D’’). Quantification of TUNEL+ and GFP+ cells in images collected from electroporated retinae (n=4 for control morpholino, n=4 for U12 morpholino) harvested at P7 (E) and P14 (E’). The y-axis shows the ratio of TUNEL+ cells to GFP+ cells. The x-axis denotes the electroporated morpholino (scrambled = Scr-MO; U12 = U12- MO). The bars represent the mean ratio calculated for a single image from a given condition. The p value calculated using a two-tailed t-test is shown over the black line connecting the two bars.

76

Figure 3.12. Functions executed by MIGs are dynamic across retinal development. Schematized representation of the kinetics of MIG expression and the functions they execute across retinal development (from E12 to P25), as determined by microarray analysis. Colors and shapes correspond to the temporal shifts in functions across retinal development. The Ca2+ homeostasis category can be broken down further into “voltage-gated calcium channel activity” and “phospholipase C activity.” Time on the x-axis is not to scale.

77

Chapter 4: The role of minor intron splicing in mouse cortical development

This chapter addresses the requirement for minor intron splicing for mouse cortical development, using a novel Rnu11 conditional knockout (cKO) mouse generated by the Kanadia lab in association with the University of Connecticut Health Center. First, this chapter covers the effects of constitutive ablation of Rnu11, the gene encoding the minor spliceosome-specific U11 snRNA. The majority of this chapter describes the impact of U11 ablation on the developing mouse cortex—i.e., the microcephaly we observe in these mice.

4.1 Rationale

As discussed in Chapters 1 and 2, the symptoms of the disorders linked to minor spliceosome disruption indicate that this process is essential for normal brain development and function [124, 140, 143]. In particular, the numerous and consistent cortical defects observed in

MOPD1 patients further underscore the importance of minor intron splicing for cortical development [124-127]. To my knowledge, there are no publications describing mammalian models of minor spliceosome inactivation, barring my own papers. Moreover, I could not find any reports of any model systems of development, mammalian or otherwise, in which the minor spliceosome could be conditionally ablated. Given this gap of knowledge in the field, I sought to study both the effects of constitutive minor spliceosome inactivation in the developing mouse, and the effects of conditional minor spliceosome inactivation in the developing mouse cortex, using our Rnu11 conditional knockout mouse. The following sections contain the contents of one of my publications, entitled “Minor spliceosome inactivation causes microcephaly, owing to cell cycle defects and death of self-amplifying radial glial cells” [117]; prior to its publication in

Development, a version of this paper was published as a bioRxiv pre-print, entitled “Minor

78 spliceosome inactivation in the developing mouse cortex causes self-amplifying radial glial cell death and microcephaly” [257].

4.2 Pathogenesis of microcephaly

The microcephaly observed in diseases linked to minor spliceosome disruption is a type of primary microcephaly, or microcephaly caused by defects in brain development that result in microcephaly at birth. Given the expansion of cortical volume in humans, relative to total brain size, microcephaly is often caused by impaired cortical development [188]. Proper cortical development requires a balance between amplification of the neural progenitor cells (NPCs) and neurogenesis. Overproduction of neurons at the expense of self-amplifying NPC divisions can deplete the NPC pool early in cortical development, resulting in microcephaly [258]. This is particularly relevant for the RGCs, which function to maintain the NPC population of the developing cortex [222]. Shifts in the proportions of RGCs and IPCs can also impact cortical growth, because IPCs primarily function in neuron production [220]. If RGCs produce fewer IPCs in favor of neurogenic divisions, then the total number of neurons that can be produced in the finite window of cortical neurogenesis drops substantially, resulting in microcephaly [259]. In short, primary microcephaly can be caused by (1) death of the NPC or neuronal populations; (2) increased neuron production, resulting in depletion of the NPC pool; and/or (3) decreased IPC production in favor of neurogenic divisions, resulting in fewer neurons [260]. The following subsections explain some of the known mechanisms underlying microcephaly that is linked to these developmental defects, including cell cycle defects and the DNA damage-p53-apoptosis axis.

4.2.1 The process of cell division

79

The cell cycle is divided into three basic phases, which can themselves be further subdivided. In interphase, the progenitor cell grows in size and duplicates its DNA, to ensure there is enough material to split between two daughter cells. In mitosis, the cell’s doubled contents are physically moved to separate poles of the mother cell. Finally, in cytokinesis, contracture of a cytoskeletal ring pinches the cytoplasmic membrane and frees the two daughter cells. Interphase is subdivided into three stages. First is a “gap”/growth phase called G1, in which the cell decides whether it will continue through interphase or exit the cell cycle [261]. Following G1 phase is the

DNA synthesis phase called S phase, during which both the cell’s DNA and its are replicated [262]. The centrosome is a -organizing comprised of two , cylindrical structures consisting of nine microtubule triplets that recruit a matrix of pericentriolar proteins to the centrosome [263]. These function in bipolar spindle assembly during mitosis [263]. During S phase, the newly formed sister chromatids are linked together by cohesin, a complex of four protein subunits; these sister chromatids will remain linked by cohesin until mid-mitosis [264]. The third phase of interphase is a second gap/growth phase called G2 phase [261].

In interphase, there are multiple cell cycle checkpoints to ensure proper cell division. The intra-S-phase checkpoint monitors DNA replication during S phase, when the cell risks accumulating DNA damage. This checkpoint is mediated by the ATR and Chk2 proteins, which recognize DNA polymerase-protein complexes (replisomes) that encounter a blockade to their progression and stall on the template DNA [265]. These proteins function to stabilize the stalled replisome on the template DNA until the blockade is removed, often via the DNA damage repair pathway [265]. The second checkpoint is the G2/M checkpoint, which involves surveillance of cell size and of the DNA, to identify any sites of DNA damage. If the cell determines it is of

80 insufficient size or detects sites of DNA damage, the cell will arrest in G2 phase until the problem is addressed [265, 266].

Mitosis is subdivided into five phases: prophase, prometaphase, metaphase, , and . During prophase, the DNA condenses into , while the two centrosomes translocate to opposite poles of the cell. The centrosomes also begin driving the nucleation of , which dynamically grow and retract as they search for their targets. Some of these microtubules will bind to the mitotic chromosomes; others will associate with the proteins lining the cell cortex or with microtubules extending from the opposite pole of the cell [267, 268].

Simultaneously, begin to assemble at the centromeric DNA of the chromosomes.

Kinetochores are large protein complexes that mediate the connection between the mitotic chromosomes and the mitotic spindle microtubules [267, 269]. While a subset of inner proteins is bound to the centromere throughout cell cycle, recruitment of the outer kinetochore proteins only occurs in prophase [269]. In prometaphase, the nuclear envelope separating the mitotic chromosomes from the probing microtubules breaks down. These microtubules then bind to the kinetochores, such that each paired sister chromatid is anchored by two microtubules, one originating from each centrosome [267, 268]. Next, anaphase is triggered by the degradation of cohesin from the paired sister chromatids, which is mediated by a specific called the anaphase-promoting complex (APC/C). The APC/C ubiquitinates multiple target proteins, including the protein securin, the inhibitory subunit of the separase protease. Degradation of securin triggers activation of separase, which cleaves cohesin [270]. This allows the sister chromatids to separate and for each microtubule to pull a sister chromatid to its respective centrosome. As these sister chromatids are drawn to the two poles of the cell, unbound microtubules of the mitotic spindle elongate to overlap in the center of the cell. When the

81 chromosomes arrive at their respective pole, telophase is triggered, at which point the nuclear envelopes begin to reform around these bipolar assemblies of decondensing chromosomes [267,

268]. Like in interphase, there are multiple checkpoints during mitosis, including the spindle assembly checkpoint. As its name suggests, the purpose of this checkpoint is to ensure that microtubules are bound to all paired sister chromatids prior to the metaphase-to-anaphase transition. Activation of this checkpoint is triggered by signaling between the unbound kinetochore and the mitotic checkpoint complex (MCC). Activation of the MCC triggers inhibition of the

Cdc20 protein, a co-factor of the APC/C. As a result, APC/C activity is inhibited, the cohesin linking the sister chromatids remains intact, and mitosis is halted in metaphase [271, 272].

While cytokinesis is completed after telophase, the cell begins preparing for this process in anaphase. In anaphase, an actomyosin ring starts forming around the , the region of overlapping microtubules located in the middle of the dividing cell [273]. Phosphorylation of the myosin in the contractile ring triggers closure of the ring, pinching the membrane down over the central spindle as the cell continues through telophase. Simultaneously, the central spindle is remodeled into a tightly packed intracellular bridge called the cytokinetic midbody [273]. This structure is essential for the final stage of cytokinesis, abscission, in which the pinched plasma membrane fully separates, releasing two daughter cells [273].

4.2.2 Microcephaly and cell division defects

Most of the known mutations underlying primary microcephaly affect cell division, resulting in the death of NPCs, an increased drive to differentiation in RGCs, or a combination of the two [274]. In a recent review of primary microcephaly published by the Walsh group, they highlight that over half of the known primary microcephaly genes are localized to the centrosome

82 and are involved in centrosome biogenesis/duplication [275]. Other genes associated with primary microcephaly are involved in kinetochore assembly and cytokinesis [275].

Notably, many of the biogenesis genes linked to primary microcephaly interact with each other during centrosome duplication. For example, the protein products of two of the most commonly mutated primary microcephaly genes, ASPM and WDR62, interact at the centriole, where they are both function in centriole biogenesis [275, 276]. Specifically, the WDR62 protein is required for proper localization of both ASPM and the primary microcephaly protein CEP63 to the centriole. Localization of WDR62 itself to the centriole is also dependent on another primary microcephaly protein, CEP152. Ultimately, the presence of these four proteins at the centriole recruits yet another primary microcephaly protein, SAS-4, to the centriole [276]. Work using Aspm and Wdr62 knockout mice indicate that disruption of this pathway causes centriole duplication defects; delocalizes RGCs from the VZ; and drives RGCs to undergo differentiative, IPC- producing divisions at the expense of proliferative divisions, resulting in depletion of the NPC pool and therefore microcephaly [276]. Sas-4 knockout mice displayed a similar phenotype, with centriole depletion and delocalization of RGCs from the VZ. Instead of an increased drive to differentiate, Sas4-null progenitor cells displayed defective mitotic spindle assembly, resulting in mitotic delays characterized by prolongation of prophase/pro-metaphase [277, 278].

Another subset of primary microcephaly-linked genes is involved in attachment between the spindle assembly microtubules and the kinetochores of the mitotic chromosomes. CENPE encodes a kinetochore motor protein that functions in spindle microtubule-kinetochore attachment; mutation of this gene causes a microcephalic primordial dwarfism (MPD) syndrome. Cells derived from patients with CENPE mutations displayed spindle assembly defects, chromosome missegregation, and mitotic delays [275, 279]. Similarly, mutations in the gene CASC5, a

83 kinetochore protein required for both proper kinetochore-microtubule attachment and the spindle assembly checkpoint, are linked to primary microcephaly [275, 280].

Mutations in genes that regulate cytokinesis are also associated with primary microcephaly.

For example, mutation of the cytokinetic midbody protein (Citk) causes abnormal cytokinesis in NPCs, resulting in multinucleated cells, apoptosis, and microcephaly [281, 282].

Subsequent work has revealed that citron kinase functions in midbody organization, both by linking the proteins of the contractile ring and the central spindle and by maintaining the protein architecture of the midbody during late cytokinesis [283]. Loss of citron kinase function results in disorganization of the cytokinetic midbody, loss of cell cortex-central spindle linkages, and ultimately failure of abscission [283]. Recently, mutations in the gene KIF14 were also linked to primary microcephaly [284]. KIF14 is another protein of the cytokinetic midbody, and depletion of KIF14 revealed its importance for the localization of citron kinase to the midbody [285]. As anticipated, patient fibroblasts displayed aberrant cytokinesis, with many binucleated and apoptotic cells [284].

The majority of the cell division-associated primary microcephaly genes are involved in mitosis or cytokinesis. However, a host of genes involved in DNA replication are also linked to primary microcephaly. Specifically, causative mutations of primary microcephaly were identified in ORC1, ORC4, and ORC6, genes that encode different subunits of the origin of replication complex, which is necessary for the initiation of DNA replication in S-phase [286-289]. Mutations of other genes that function in the pre-replication complex, including CDT1 and CDC6, also cause primary microcephaly [286]. Primary microcephaly is also linked to stabilizing mutations in

GMNN (geminin), which interacts with CDT1 to inhibit DNA replication [290]. More recently, primary microcephaly has been linked to loss-of-function mutations in the MIG CDC45, which

84 functions in the pre-initiation complex and in DNA unwinding during DNA replication [291]. It is notable that mutations in these genes all cause the same MPD disorder, called Meier-Gorlin syndrome. In contrast, the primary microcephaly mutations in the aforementioned centrosomal, kinetochore, and cytokinetic genes result in different severities of microcephaly and in different sets of symptoms [275].

Experimental work investigating the connection between cell cycle and NPC behavior has complemented the identification of cell cycle-regulating genes linked to primary microcephaly. In

2003, the Huttner group investigated the effects of G1 length on NPC behavior by culturing mouse embryos in the cell cycle inhibitor olomoucine, which inhibits both the G1/S and G2/M transitions

[292]. In the embryos cultured with olomoucine, they observed premature neuron production in the E9.5 dorsal telencephalon, leading them to propose that G1 length informs NPC division type, such that longer G1 length drives NPCs to differentiative divisions [292]. In 2009, Pilaz et al. employed the opposite approach to study the effect of G1 length on NPC behavior: they sought to overexpress the D1 and E1, which function in G1/S progression, to shorten G1 phase in ventricular NPCs [293]. As expected, this in utero electroporation strategy significantly shortened

G1 in the electroporated NPCs and resulted in larger clone sizes, indicative of a shift from differentiative to proliferative division [293]. Analysis of the RGCs and IPCs in the electroporated pallium revealed that G1 shortening drove both RGCs and IPCs to proliferative divisions at the expense of differentiative, neurogenic divisions [293]. In 2016, an effort to characterize microcephaly in mice heterozygous for the exon-junction complex Magoh gene (MagohWT/KO) revealed that length of mitosis can also inform NPC behavior [294]. Pilaz et al. identified mitotic delay, characterized by shortening of prophase and lengthening of prometaphase, in the Magoh heterozygous RGCs. The mitotically delayed RGCs from the Magoh heterozygous pallium also

85 preferentially divided to produce neurons, at the expense of proliferative divisions; Pilaz et al. also observed significantly more apoptotic progeny from these RGCs [294]. Strikingly, culturing brain slices with drugs that inhibit the progression of prometaphase recapitulated the phenotype of the

Magoh heterozygous RGCs: RGCs that had been delayed in prometaphase preferentially divided to produce neurons, and longer lengths of mitosis increased the likelihood of producing apoptotic progeny [294]. Given that most of the genes linked to primary microcephaly are involved in mitosis progression, it is possible that a similar mechanism may underlie microcephaly in these patients.

4.2.3 Microcephaly, p53 activation, and DNA damage

In many mouse models, primary microcephaly is caused by activation of the transcription factor p53 and p53-mediated apoptosis. For example, in a mouse model of microcephaly caused by inactivation of the primary microcephaly gene Cep63, cortical NPCs showed upregulation of p53 alongside mitotic delay and DNA damage—and genetic ablation of p53 rescued microcephaly in the Cep63-deficient mice [295]. In mice, deletion of the primary microcephaly gene Sas-4 caused mitotic delay, prophase/prometaphase prolongation, p53 upregulation, and apoptosis in cortical NPCs. Moreover, genetic ablation of p53 rescued microcephaly in the Sas-4 knockout mice [277, 278]. Similarly, haploinsufficiency of Magoh resulted in mitotic delay, elongation of prometaphase/metaphase, p53 upregulation, and apoptosis; genetic deletion of p53 rescued microcephaly in these mice [294, 296]. Loss-of-function mutation in Kif20b, a kinesin microtubule motor protein, caused abnormal abscission and apoptosis in cortical NPCs [297]; a recent preprint showed that genetic ablation of p53 partially rescued microcephaly in these mice [298]. In microcephalic mice heterozygous (WT/KO) for the gene Tubb5, which is linked to primary microcephaly [299], cortical NPCs displayed mitotic delay, elongation of metaphase, upregulation

86 of p53, and apoptosis [300]. Genetic deletion of Citk caused aberrant cytokinesis, DNA damage, p53 upregulation, and apoptosis, resulting in microcephaly [301, 302]. Overall, these mouse models of primary microcephaly reveal a trend in their cellular defects: p53 upregulation and apoptosis are often associated with mitotic delay, specifically mitotic delay caused by prolongation of prophase, prometaphase, or metaphase.

Activation of the p53 transcription factor is triggered by a variety of cellular stress signals, which increase levels of p53 protein. Specifically, under normal physiological conditions, p53 is targeted for degradation by the E3 [303]. Cellular stress signals trigger phosphorylation, , and other post-translational modification of p53, which disrupts the association between p53 and MDM2 [304]. As a result, protein levels of p53 increase, without requiring changes in transcription from the TP53 gene [303, 304]. Increased levels of p53 correspond to nuclear translocation of p53 and increased transcriptional activity of the protein, resulting in upregulation of its targets [305]. Generally, the target genes of p53 can be categorized into two functions: cell cycle arrest or apoptosis. The specific target genes that p53 upregulates— cell cycle arrest genes versus pro-apoptotic genes—depend upon multiple factors in the cell, including p53 levels, such that higher levels inform the specific post-transcriptional modifications made to p53, determine the expression levels of different p53 co-factors, and drive p53-mediated apoptosis [306, 307]. Evidence indicates that cytoplasmic and mitochondrial p53 can also drive apoptosis, independently of p53 transcriptional activity in the nucleus [308]. Notably, embryo culture work from Bazzi and Anderson (2014) showed that drug-induced mitotic delay triggered rapid accumulation of p53 in the subsequent G1 phase, resulting in apoptosis shortly after [277].

Currently, the specific protein interactions underlying this pathway are unclear. Instead, most research into p53-mediated apoptosis has investigated the link between DNA damage and p53

87 upregulation. Briefly, DNA damage triggers activation of the protein ATM, ATR, and

DNA-PK and proteins of the PARP family, which function as key mediators of the DNA damage response [309]. The aforementioned kinases induce p53 activation by either directly phosphorylating p53 or by activating the effector kinases CHK1 and CHK2, which themselves phosphorylate p53. This results in p53 upregulation, increased expression of pro-apoptotic target genes, and p53-mediated apoptosis [309, 310]. The PARP proteins PARP1/2 function to recruit

ATM to sites of DNA damage, thereby indirectly mediating p53 activation and p53-mediated apoptosis in response to DNA damage [309].

In fact, many microcephaly-linked genes regulate the DNA damage response. For example, the first gene linked to microcephaly, , functions in the DNA damage response pathway [311, 312]. Many of the causative mutations of Seckel syndrome, an MPD disorder, are found in genes that regulate the DNA damage response, such as ATR, which functions in the G2/M

DNA damage checkpoint and in the intra-S-phase checkpoint to stabilize stalled replication forks

[265, 313]; RBBP8, which regulates double-strand break resection and ATR recruitment to sites of DNA damage [314, 315]; DNA2, a MIG that functions in multiple types of DNA damage repair

[58, 316, 317]; and TRAIP, whose function is required for phosphorylation of DNA damage markers during S-phase [318]. Other DNA damage response genes that have been linked to primary microcephaly include DONSON [319, 320]; LIG4 [321]; NBS1 [322, 323]; NSMCE2

[324]; PNKP [325]; and XLF [275, 326]. It is thought that mutations in these genes result in the accumulation of DNA damage, causing NPC death and microcephaly [275]. Indeed, in Nbs1 conditional knockout mice, genetic ablation of Nbs1 in the developing CNS caused significant

DNA damage, upregulation of p53, reduced NPC proliferation, and apoptosis; complete genetic ablation of p53 partially rescued microcephaly in these mice [327]. The DNA damage-p53-

88 apoptosis axis is also implicated in microcephaly caused by Zika virus infection, based on work in

Zika virus-infected human NPCs [328].

Comparison of the cellular phenotypes of these mouse models of primary microcephaly suggest that two different pathways can induce p53 activation and p53-mediated apoptosis. In the first of these, disruption of primary microcephaly genes that regulate cell cycle causes mitotic delays, which then induce p53 upregulation and p53-mediated apoptosis. In the second, DNA damage accumulation induces activation of key DNA damage response mediator proteins, and signaling between these DNA damage response proteins and p53 results in p53 upregulation and p53-mediated apoptosis.

4.3 Results

4.3.1 Constitutive loss of the minor spliceosome-specific U11 snRNA results in embryonic lethality

We generated the Rnu11 cKO mouse by engineering loxP sites 1090 bp upstream and 1159 bp downstream of the Rnu11 gene (Figures 4.1 and 4.2A). Successful targeting of the loxP sites was confirmed by long-range nested PCR in targeted embryonic stem cells (Figure 4.1B-C) and further validated by the loss of the wild-type (WT) allele in Rnu11Flx/Flx mice (Figure 4.2A). EIIa-

Cre-mediated germline recombination generated Rnu11WT/KO mice, which showed the presence of the knockout (KO) allele that was absent in Rnu11WT/WT genomic DNA (Figure 4.2A). Quantitative

PCR (qPCR) for the WT allele showed 50% reduction in Rnu11WT/KO mice compared to

Rnu11WT/WT mice (Figure 4.2B). Intercrossing Rnu11WT/KO mice did not yield Rnu11KO/KO mice

(Figure 4.2C), indicating embryonic lethality.

89

4.3.2 Ablation of U11 in the developing mouse cortex results in microcephaly

We employed Emx1-Cre mice to specifically ablate Rnu11 in the pallium [235]. Emx1-Cre expression begins at E9.5 in the neuroepithelium of the pallium, prior to the production of RGCs,

IPCs, or neurons at E10-E11 [235, 329]. Therefore, Emx1 promoter-driven Cre activity drives recombination of the Rnu11 floxed allele in these neuroepithelial cells, and all cells originating from these neuroepithelial cells—i.e., the RGCs, IPCs, and excitatory neurons—would then inherit the recombined knockout allele. Ablation of U11 in the pallium resulted in profound microcephaly at birth in Rnu11Flx/Flx::Emx1-Cre+/- mutant mice, due to collapse of the cortex and absence of the hippocampus (Figure 4.2D). To understand how this microcephaly precipitated, we sought to determine the kinetics of U11 snRNA loss after Emx1-Cre-mediated Rnu11 ablation. Section in situ hybridization (SISH) for U11 snRNA revealed a reduction in U11 signal (purple) in the E10 mutant pallium, relative to the control (Rnu11WT/Flx::Emx1-Cre+/-) (Figure 4.2E). This U11-null domain was expanded in the E11 mutant pallium, and by E12, the majority of cells in the Emx1-

Cre domain lacked U11 expression (Figure 4.2F&G). We observed scattered U11+ cells at this point (Figure 4.2G), which may represent either inefficient Emx1-Cre recombination or migrating interneurons produced in the ventral telencephalon, where Emx1-Cre is not active [235, 330].

There was no observable morphological difference between mutant and littermate control pallia at

E10, E11, and E12 (Figure 4.2E-G). At E13, we observed persistence of the U11-null domain alongside collapse of the lateral ventricle in the mutant telencephalon, without a significant change in pallial thickness (Figure 4.2H&J). At E14, the thickness of the mutant pallium was significantly reduced relative to the control (Figure 4.2I-J).

4.3.3 Ablation of U11 in the developing mouse cortex causes cell death

90

We first performed fluorescent ISH (FISH) for U11 to identify the U11-null domain, followed by terminal dUTP nick-end labeling (TUNEL) to detect apoptotic cells. Results showed sporadic loss of U11 snRNA in the E11 mutant pallium without a significant increase in TUNEL+ cells (Figure 4.3A). Onset of cell death was observed at E12, with a statistically significant increase in TUNEL+ cells in the mutant compared to littermate controls (Figure 4.3B), which continued at

E13 and E14 (Figure 4.3C-D). Since TUNEL marks cells in late stages of apoptosis, we sought to verify the timing of cell death in the mutant pallium using a marker of early apoptosis [331]. For this, we employed immunofluorescence (IF) for cleaved caspase 3 (CC3) [332] at E11 and E12, which confirmed that the onset of cell death was occurring at E12 (Figure 4.3E-F).

4.3.4 Ablation of U11 predominantly affects survival of neural progenitor cells

To identify the cell type being lost in the mutant pallium, we investigated the numbers of proliferating NPCs (Ki67+) versus neurons (NeuN+) by IF [333, 334]. We found no change in the number of proliferating NPCs in the E12 mutant pallium compared to the control (Figure

4.4A&D). However, at E13 and E14, the number of Ki67+ NPCs was significantly reduced in the mutant pallium compared to the control (Figure 4.4B-D). Unlike NPCs, we did not observe a significant difference in the number of neurons between the control and mutant at E12 or E13

(Figure 4.4A-B&D). To confirm that the neurons present in the E12 and E13 mutant pallium were in fact U11-null, we performed U11 FISH in conjunction with IF for Ki67 and NeuN, which showed the presence of both U11-null NPCs and U11-null neurons (Figure 4.5A). Finally, at E14, there were significantly fewer neurons in the mutant compared to the control, which was consistent with the depletion of the NPC population at E13 (Figure 4.4B-D). To further investigate whether neurons were dying in the U11-null pallium, we employed IF for Tuj1, which marks both immature

91 and mature neurons [335], and CC3, and quantified the percentage of CC3+ dying cells that were

Tuj1+ neurons. In the E12 mutant pallium, the vast majority (93.4±2.1%) of the CC3+ cells were pyknotic, indicative of late-stage apoptotic cells (Figure 4.5C). By E13, virtually all CC3+ cells were pyknotic (data not shown). Since pyknotic cells in the late stages of apoptosis are unlikely to retain Tuj1 expression, we only considered non-pyknotic CC3+ cells for the analysis. This quantification revealed that 4.44±2.42% of non-pyknotic CC3+ cells in the E12 mutant pallium were Tuj1+ neurons, which was not significantly higher than that observed in the control (0%;

Figure 4.5C). Together, these data suggested that most of the dying cells were likely the U11-null

NPCs, not the U11-null neurons.

4.3.5 Cell types of the developing mouse cortex display differential sensitivity to U11 loss

We next sought to identify the type of NPC being lost in the U11-null pallium. Using the nuclear RGC marker Pax6, we did not observe a significant difference in the number of RGCs between the control and mutant at E12 (Figure 4.4E&H) [336]. At E13, there was a significant reduction in the number of RGCs with nuclear Pax6 signal in the mutant compared to the control

(Figure 4.4F&H). However, widespread cytoplasmic Pax6 expression, which was rarely seen in control sections, was observed in the mutant (Figure 4.4F). This cytoplasmic staining was primarily observed around condensed nuclei, suggesting that the cytoplasmic Pax6 expression was occurring in dying cells. This was confirmed by colocalization of the cytoplasmic Pax6 expression with CC3 (Figure 4.5B-B’’’). Therefore, only cells with nuclear Pax6 expression were quantified.

At E14, there were few RGCs in the mutant, which was consistent with the small number of Ki67+ cells at this time-point (Figure 4.4D&G-H). Next, we investigated the number of IPCs in the U11- null pallium using the marker Tbr2 [337], which revealed no significant difference in the E12

92 mutant compared to the control (Figure 4.4I&L). At E13, we observed a significant reduction in the IPC population in the mutant relative to the control (Figure 4.4J&L), which continued at E14

(Figure 4.4K-L).

Both the RGC and IPC populations were significantly reduced in the mutant pallium at

E13, but not at E12, relative to the control (Figure 4.4D&H). However, the trends of RGC and IPC loss across mutant pallium development were not similar: While there was a significant decline in the number of RGCs in the E13 mutant pallium compared to the E12 mutant pallium, there was no significant change in the number of IPCs between these time-points in the mutant (Figure 4.5D-

E). A significant decline in IPC number was only observed between E13 and E14 mutant pallium

(Figure 4.5E). Therefore, the decline in RGC number occurred earlier in the mutant pallium than the decline in IPCs, suggesting that IPCs were being produced in the U11-null pallium.

The observation that the E13 control and mutant pallia both contained similar numbers of neurons suggested that neuron production was also occurring in the E12 mutant pallium (Figure

4.4B&D). To study neuron production in the mutant pallium, we pulsed pregnant dames carrying

E12 embryos with the thymidine analog 5-bromo-2’-deoxyruidine (BrdU), to mark S-phase NPCs, followed by harvest at E13 [338]. After removing pyknotic BrdU+ cells, quantification showed a significant decrease in NeuN+/BrdU+ cells as a percentage of all NeuN+ cells in the E13 mutant pallium compared to the control. In contrast, there was no significant difference in the percentage of non-pyknotic BrdU+ cells that were NeuN+ between the E13 control (15.7±2.17%) and mutant

(15.5±3.5%; Figure 4.5F) pallium. The presence of BrdU+ neurons in the mutant indicated that

U11-null NPCs produced neurons between E12 and E13. Together, these findings suggested that the majority of cells dying between E12 and E13 were self-amplifying RGCs.

93

4.3.6 Few genes are differentially expressed in the embryonic day 12 U11-null pallium

To understand the molecular defects underlying the loss of self-amplifying RGCs, we performed RNAseq on control (N=5) and mutant (N=5) pallia from E12 embryos (Figure 4.6A inset). At this time-point, U11 loss occurs throughout the entire Emx1-Cre domain without major histological changes, such as shifts in NPC number (Figure 4.2G and 4.4D). Confirmation of the pallial dissection was reflected by the expression of Emx1 in the control (19.1 fragments per kilobase per million mapped reads [FPKM]) and mutant (20.3 FPKM) (Table S3). Expression of

Dbx1, which marks the pallial-subpallial boundary, and Lhx6, which is expressed in the medial ganglionic eminence [339], were expressed below 1 FPKM in both the control and mutant samples

(Table S3). Together, these results confirm the integrity of our dissection. Expression of Rnu11 was reduced by 2.95-fold in the mutant compared to the control, which was further confirmed by quantitative reverse transcriptase-PCR (qRT-PCR) (Figure 4.6B, Table S3). The incomplete loss of U11 expression in the mutant likely reflects (1) contamination of non-Emx1-lineage cells of the meninges and blood vessels (Figure 4.2G) [340]; (2) contamination from non-Emx1-lineage cells migrating into the pallium (Figure 4.2G) [330]; and/or (3) incomplete recombination by Emx1-

Cre. Regardless, the mutant samples showed a clear reduction in U11 expression compared to the control, which was in agreement with the ISH analysis (Figure 4.2G). Next, we interrogated the expression of all protein-coding genes with ≥1 FPKM in the control or mutant. Overall, there was an increase in the number of genes upregulated (74; ≥2-fold change [FC], P≤0.01) relative to those downregulated (11; ≥2-FC, P≤0.01) in the mutant (Figure 4.6A).

4.3.7 U11-null neural progenitor cells display DNA damage and p53 upregulation

94

DAVID analysis [243, 244] performed on the downregulated protein-coding genes revealed no significant enrichments for GO-terms by Benjamini score. However, the upregulated protein-coding genes significantly enriched for the GO-term “intrinsic apoptotic signaling pathway in response to DNA damage by p53 class mediator” (Table S4), implicating this pathway in the cell death observed at this time-point (Figure 4.3). To test whether DNA damage and p53 activation were occurring in the E12 mutant, we employed IF with the DNA damage marker

ɣH2AX [341], along with antibody against p53. We observed that the majority of ɣH2AX+ cells were pyknotic (54.1±3.5%), suggesting that they were dying cells (Figure 4.6D). To verify this, we employed the early-apoptotic marker CC3, which revealed that 93.2±3.5% of ɣH2AX+ cells were CC3+ (Figure 4.6D). Because most cells undergoing the final stages of apoptosis do not express nuclear markers, we removed all pyknotic ɣH2AX+ cells from our quantification strategy.

Using this approach, we observed significantly more non-pyknotic ɣH2AX+ cells scattered throughout the E12 U11-null pallium relative to the control (Figure 4.6E). Similarly, we observed significantly more cells upregulating p53 throughout the E12 U11-null pallium compared to the control (Figure 4.6E). Given the GO-term enriched for by the upregulated protein-coding genes, and reports that p53 stabilization occurs in response to DNA damage [342], we expected that most

ɣH2AX+ cells would also upregulate p53. Indeed, 74.7±11.9% of ɣH2AX+ cells were also upregulating p53 (Figure 4.6E, left inset). In contrast, only 6.7±2.0% of p53-upregulating cells were also ɣH2AX+ (Figure 4.6E, right inset). At E11, we observed significantly more ɣH2AX+ cells in the mutant pallium (2.67±0.76) compared to the control (0; Figure 4.6E). However, we did not observe a significant difference in the number of p53-upregulating cells in the control and mutant, suggesting that the precipitation of DNA damage begins at E11, while cell death occurs at

E12 (Figure 4.6E).

95

4.3.8 Most DNA damage and p53 upregulation occur in U11-null radial glial cells

Since the majority of the cells dying between E12 and E13 were self-amplifying RGCs, we next investigated whether RGCs were the cell type accumulating DNA damage and upregulating p53. We found that 46.3±3.2% of ɣH2AX+ cells were Pax6+ RGCs, while virtually none were

NeuN+ neurons (5.1±5.1%) or Tbr2+ IPCs (0%, Figure 4.6F). To test whether RGCs were the primary cell type upregulating p53, we also performed IF for p53 and Pax6. This revealed that the vast majority of p53-upregulating cells were Pax6+ RGCs (98.1±0.85%), a significant enrichment compared to the percentage of Pax6+ RGCs found in the mutant pallium (93.61±0.56%; Figure

4.6G). In all, we found that U11-null RGCs are the primary cell type both accumulating DNA damage and upregulating p53, which likely contribute to their widespread death between E12 and

E13.

4.3.9 Ablation of U11 results in significant minor intron retention in the E12 U11-null pallium

Since U11 is an essential component of the minor spliceosome [64, 65], the expected result of U11 loss is disruption of minor intron splicing. However, none of the genes upregulated in the

E12 mutant pallium that enriched for p53-mediated apoptosis contained minor introns, indicating that their upregulation is a secondary effect of U11 loss. Therefore, we next investigated the expression of the minor spliceosome targets, i.e., the MIGs, and the splicing of their minor introns.

Overall, the vast majority of MIGs were not differentially expressed (Figure 4.6A). Only one MIG,

Spc24, showed >2-fold downregulation in the mutant (Figure 4.6A arrow, Table S3), which was independently confirmed by qRT-PCR (Figure 4.6C). To assess minor intron splicing, we

96 employed the strategy previously described by Madan et al. in 2015 [103], which utilizes a mis- splicing index (MSI) to reflect intron retention. Using this strategy, we identified 299 (out of 545) minor introns and 11,187 (out of 54,500 randomized major introns) that passed the filtering criteria, indicating that they show some level of intron retention (Section 4.5 Materials and methods). We then calculated the difference in MSI for each intron between the mutant and the control (mutant MSI – control MSI; ∆MSI) and generated frequency plots (Figure 4.7A). These revealed a rightward ∆MSI shift for the 299 minor introns, while the 11,187 randomized major introns followed a normal distribution (Figure 4.7A). This indicated that loss of U11 inactivates the minor spliceosome, resulting in elevated retention of minor, but not of major, introns.

To identify the specific minor introns with elevated retention in the mutant, we employed student’s t-tests to compare the MSI of each minor intron between the control and mutant samples. Of the

299 minor introns, we identified 186 minor introns, found in 178 MIGs, with significantly elevated minor intron retention in the mutant. For three of these MIGs (Coa3, Parp1, and Pten), this was confirmed by qRT-PCR (Figure 4.8A). Minor intron retention has been predicted to result in the introduction of premature stop codons, subsequently targeting MIG transcripts for degradation by the nonsense mediated decay (NMD) pathway [86]. To test whether retention of these 186 minor introns would indeed introduce a premature stop codon, we curated a list of all annotated isoforms of these 178 MIGs, and extracted all the isoforms that would require minor intron splicing for their production, resulting in a list of 330 isoforms. Bioinformatics analysis showed that minor intron retention in 99.1% of these isoforms would introduce a stop codon prior to the annotated stop codon (Figure 4.7B). Since stop codons found >50 nt upstream of an exon-exon junction are predicted to trigger the NMD pathway [343], we also interrogated the location of the premature stop codons in these 330 MIG transcripts in relation to their exon-exon junctions. Again, we found

97 that the majority (83.8%) of the premature stop codons would be predicted to activate NMD, resulting in transcript degradation (Figure 4.7B). For the remaining 53 MIG transcripts, minor intron retention would introduce a stop codon in the 3’ end of the transcript, potentially allowing these transcripts to escape NMD and be translated, resulting in the production of truncated proteins

(Figure 4.7B inset).

4.3.10 MIGs involved in mitosis and DNA synthesis have significant minor intron retention in the E12 U11-null pallium

Regardless of whether a MIG transcript would be degraded via NMD or would produce an aberrant protein, the functions executed by these MIGs would likely be compromised. To identify which biological processes would be affected in the mutant, we submitted the 178 MIGs with elevated minor intron retention to DAVID [243, 244]. Functions significantly enriched for by these

MIGs included both broad GO-terms, such as nucleotide binding (36 genes) and protein transport

(19 genes), and more specific GO-terms, such as cell cycle (18 genes) and centrosome (13 genes).

The enrichment of these more specific functions suggested that aberrant minor intron splicing might result in cell cycle defects (Figure 4.8B, Table S5). Moreover, the only downregulated MIG in the mutant pallium, Spc24, is involved in kinetochore assembly (Figure 4.6A&C, Table S3)

[344]. Among the 21 cell cycle-regulating MIGs we identified with significantly elevated minor intron retention, there were two noteworthy enrichments: nine MIGs that regulate mitosis and cytokinesis, and six MIGs that regulate S-phase and DNA replication (Table S5). In all, disruption of minor intron splicing likely affected cell cycle in the E12 mutant pallium, particularly in mitosis/cytokinesis and S-phase.

98

4.3.11 U11-null radial glial cells display mitotic and cytokinetic defects

To study mitosis in the U11-null pallium, we performed IF for Ki67 and PH3, a marker of mitosis [345], and quantified PH3+ mitotic cells as a percentage of the entire Ki67+ cycling cell population, to identify the mitotic fraction of the NPC population (Figure 4.7C). This quantification revealed a significantly larger mitotic fraction in the E12 mutant pallium relative to the control (Figure 4.7C). To investigate whether a specific stage within mitosis was affected in the U11-null E12 pallium, we employed the mitotic marker Aurora B. The staining pattern of

Aurora B corresponds to specific phases within mitosis [346] (Figure 4.9A); by quantifying cells within these specific phases of mitosis, we were able to identify the fraction of mitotic cells in prophase, prometaphase+metaphase, anaphase, and telophase. Aurora B staining was combined with IF for either Pax6 or Tbr2, to assess cell type-specific defects. This staining paradigm revealed a significantly higher percentage of mitotic Aurora B+/Pax6+ RGCs in prometaphase+metaphase in the E12 mutant pallium than the control, while prophase, anaphase, and telophase were not significantly different (Figure 4.8D-E). In mitotic Tbr2+ IPCs, we did not observe any significant differences in the fractions in prophase, prometaphase+metaphase, anaphase, or telophase (Figure

4.9B-C). To investigate cytokinesis in the E12 mutant pallium, we performed IF for citron kinase

(Citk), which marks the cytokinetic midbody [347]. At E12, there was a significant reduction in

Citk+ cytokinetic midbodies lining the ventricular edge in the U11-null pallium (Figure 4.8F). This reduction was not observed at E11, when the number of Citk+ cytokinetic midbodies across the

U11-null pallium was similar to the control (Figures 4.8F and 4.9D). These data suggest that U11 loss causes both prolonged prometaphase/metaphase, specifically in RGCs, and cytokinesis defects.

99

4.3.12 U11 loss affects the fraction of radial glial cells in S phase

Next, we explored how minor intron retention in the S phase-regulating MIGs impacted S- phase kinetics in the mutant pallium. For this, we pulsed pregnant females with the thymidine analog EdU [348], followed by harvest one hour later. EdU detection revealed a significantly lower percentage of EdU+/Pax6+ RGCs in the mutant compared to the control at E12 (Figure 4.8G), while there was no significant difference in the percentage of EdU+/Tbr2+ IPCs between the E12 control and mutant pallium (Figure 4.8H). In addition, we observed no change in the percentage of

EdU+/Pax6+ RGCs in the E11 mutant pallium compared to the control (Figure 4.9E). Thus, U11 loss in RGCs causes S-phase defects at E12, while U11-null IPCs are spared.

Together, these findings indicate that U11 loss disrupts the splicing of MIGs that are involved in mitosis, cytokinesis, and S-phase, resulting in defects in these cell cycle phases. To determine whether other cell cycle phases were affected in the U11-null pallium, we employed the markers B1, which is cytoplasmic in G2 phase, and Aurora B, which is localized to condensing chromosomes in cells in G2 [349, 350] (Figure 4.9A). Neither quantification of cyclin

B1+ G2 cells, nor quantification of the percentage of G2 Aurora B+ RGCs or IPCs, revealed a significant difference between the E12 mutant and control pallium (Figure 4.9B&F-G), further underscoring that U11 loss causes three significant and specific cell cycle defects, affecting (1) prometaphase/metaphase, (2) cytokinesis, and (3) S-phase.

4.3.13 Most cells with DNA damage in the U11-null pallium were not in S-phase or mitosis

Since the MIGs with significantly elevated minor intron retention enriched for cell cycle functions, we expected that the observed cell cycle defects were the primary cause of RGC death, while the observed DNA damage and p53 upregulation were secondary effects. To determine

100 whether DNA damage and p53 upregulation were occurring in NPCs in S-phase and mitosis/cytokinesis, we combined IF for either ɣH2AX or p53 with different cell cycle markers.

To determine whether cells accruing DNA damage were in S-phase, we employed the aforementioned EdU pulse strategy. IF for ɣH2AX combined with EdU detection revealed that

38.5±9.3% of ɣH2AX+ cells were also EdU+, similar to the total percentage of EdU+ cells in the mutant pallium (37.8±1.0%, Figure 4.10A). We did not observe any ɣH2AX+/PH3+ cells in the

E12 mutant pallium, indicating that cells with DNA damage were not in mitosis (Figure 4.11).

4.3.14 P53 upregulation in the U11-null pallium occurs predominantly in radial glial cells in

G1 phase

To investigate convergence between p53 upregulation and cell cycle phase, we employed the aforementioned EdU pulse paradigm along with IF for Aurora B. In the E12 mutant pallium, approximately one third (32.5±4.4%) of p53-upregulating cells were also EdU+ (Figure 4.10B).

However, this enrichment was not statistically significant, when compared to the total percentage of EdU+ cells in the E12 mutant pallium (42.3±1.2%, Figure 4.10B). We then leveraged Aurora B to interrogate p53 upregulation in G2 and mitosis. Quantification of cells with G2-specific Aurora

B staining in the E12 mutant pallium revealed that 14.4±2.1% of p53-upregulating cells were in

G2 phase, a significantly lower percentage than would be expected based on the total percentage of G2 cells in the mutant pallium (26.5±1.4%, Figure 4.10C). Similar quantification of mitotic

Aurora B+ cells showed that 2.4±0.3% of p53-upregulating cells were mitotic, which was significantly lower than expected from the total percentage of mitotic cells in the mutant pallium

(8.5±0.5%, Figure 4.10C). Thus, approximately half of p53-upregulating cells were not labeled by

101 these S-, G2-, and M-phase markers, suggesting that these p53-upregulating cells were in G1 phase.

4.3.15 Most dying cells in the U11-null pallium are not in S, G2, or M phase

Given that we observed a relationship between p53 upregulation and cell cycle phase, we sought to determine whether apoptosis in U11-null cells was also linked to cell cycle phase. Since pyknotic cells in later stages of apoptosis would not express cell cycle markers, we employed both

IF for CC3, to mark cells in early-to-late apoptosis, with TUNEL, to mark cells in late stages of apoptosis. For our analysis, we excluded pyknotic TUNEL+ cells, so that we only considered cells in early stages of apoptosis. First, to investigate S-phase cells, we employed the aforementioned

EdU pulse paradigm. Of the CC3+ cells of the E12 mutant pallium, 38.0±19.1% were also EdU+, which was expected based on the total percentage of EdU+ cells in the mutant pallium (42.2±1.2%,

Figure 4.10D). To assess G2 and mitotic cells, we employed IF for Aurora B. In the E12 mutant pallium, we observed virtually no CC3+/Aurora B+ cells (2.1±2.1%, Figure 4.10E), indicative of minimal death of cells in G2 and M phases. Together, these findings indicate a relationship between apoptosis and cell cycle that mirrors that of p53 upregulation and cell cycle: few dying cells are in G2 or M phase, and less than half are in S phase. Therefore, the majority of dying cells in the E12 mutant pallium are most likely in G1 phase or are neurons.

4.3.16 tdTomato+ cells are present in the neonatal mutant cortex

To determine whether U11-null cells survive postnatally, we utilized the Ai9 tdTomato Cre reporter line [351], which showed tdTomato+ cells in both the P0 and P21 control and mutant cortex (Figures 4.12 and 4.13). In rostral cortical sections at both time-points, landmarks such as

102 the anterior commissure (Figure 4.12I, bottom white arrow), the choroid plexus (Figure 4.13I-K, white arrow), and the lateral septum (Figures 4.12D&E, 4.12J&K, 4.13B&H; LS) were identifiable in the mutant. Additionally, scattered tdTomato+ cells were visible in the control and mutant striatum (Figures 4.12C’, 4.12I’, 4.13A’, 4.13G’), consistent with reports of Emx1-Cre-lineage cells in that brain structure [235, 329]. However, fewer landmarks were identifiable in more caudal sections of the mutant brain (Figures 4.12F-H, 4.12L-N, 4.13F&L).

4.3.17 U11-null neurons are present in the neonatal mutant cortex

Given that tdTomato expression in the Ai9 tdTomato mouse line is induced by Cre- mediated recombination, the presence of tdTomato+ cells in the postnatal mutant cortex suggested that recombined, U11-null neurons were present in the postnatal mutant cortex. To test this, we performed ISH for U11 on coronal sections of the P0 control and mutant heads. In the

P0 mutant cortex, we observed scattered U11-null cells in the mediorostral cortex; in contrast,

U11 expression was observed throughout the cells of the P0 control cortex (Figure 4.14A-B). To determine whether these U11-null cells in the P0 mutant cortex were neurons, we employed U11

FISH combined with IF for NeuN. This approach revealed the presence of scattered, U11-null neurons in the P0 mutant cortex; again, we did not observe U11-null cells in the P0 control cortex (Figure 4.14C-D’).

4.3.18 U11-null neurons born after Rnu11 ablation are present in the neonatal and juvenile mutant cortex

To determine whether the recombined, U11-null neurons in the postnatal mutant cortex were produced during embryonic development, we employed a pulse-chase strategy using EdU

103

[348]. We injected pregnant females with EdU once daily from E11, when U11 is first lost throughout the pallium (Figure 4.2F), until E13, to ensure we would label U11-null neurons born after Rnu11 ablation (Figures 4.4B-C and 4.5F). The EdU pulsed at each time-point would be incorporated into the replicating genome of S-phase NPCs, and this EdU signal would then be inherited by neurons born from these EdU-labeled progenitor cells. Since EdU signal is diluted by each round of cell division, only post-mitotic neurons born shortly after the initial EdU pulse would retain strong EdU signal postnatally. We harvested control and mutant brains at P0 and P21 and performed FISH for U11, followed by IF for NeuN and detection of EdU (Figure 4.15A). This approach revealed consistent expression of U11 across the P0 control cortex (Figures 4.15B and

S1B). In the P0 mutant cortex, U11-null neurons were interspersed among U11+ neurons, consistent with the presence of tdTomato- cells in the P0 mutant cortex (Figures 4.12I’’, 4.15C,

S1F-G, and S1I-J). By P21, control neurons displayed varying levels of U11 expression (Figures

4.15D, S1K&L, and S1K’-O’). In the P21 mutant cortex, we observed U11-null neurons scattered among U11+ neurons (Figures 4.15E and S1P-T), consistent with the presence of tdTomato+ and tdTomato- cells in the P21 mutant cortex (Figure 4.13G’’). To interrogate the presence of U11- null neurons born between E11 and E13, we quantified the number of U11-/NeuN+/EdU+ cells in the control and mutant cortex, as a percentage of the total number of NeuN+/EdU+ cells. This revealed significantly higher percentages of U11-/NeuN+/EdU+ cells in the mutant P0 cortex compared to the control (Figures 4.15B-C’’’’, 4.15F, and S1A-J) and in the P21 mutant cortex compared to the control (Figures 4.15D-E’’’’, 4.15G, and S1K-T). At both P0 and P21, we observed virtually no U11-/EdU+ neurons in the control sections (Figures 4.15B-G, S1A-E, and

S1K-O). In conclusion, we found that U11-null neurons can survive postnatally.

104

4.3.19 Lack of corticospinal projections and the corpus callosum in the neonatal Rnu11 mutant mouse

To interrogate the effects of U11 loss on neurite outgrowth in the excitatory neurons of the cortex, we examined the whole brains of the Ai9 tdTomato+ control and mutant mice at P0. We found that the tdTomato+ subcerebral tracts, which originate from layer V neurons of the cortex

[193], were not visible in the Rnu11 mutant brain (Figure 4.16B). Moreover, we observed agenesis of the corpus callosum in coronal sections of the P0 mutant brain (Figures 4.12I-N and 4.13G-L).

Together, these findings suggested that although U11-null neurons could survive postnatally, their neurite outgrowth was profoundly affected by U11 loss.

4.4 Discussion

Patients with MOPD1, Roifman syndrome, or Lowry Wood syndrome are thought to escape embryonic lethality due to impairment, but not complete inactivation, of minor spliceosome activity. Despite the systemic impairment of minor splicing, the specific phenotypes observed in the brain and limbs of patients indicate tissue/cell-type specific sensitivity to minor spliceosome activity [106-108, 122, 124, 140, 143]. Even within the brain, progenitor cells of the cortex are the most sensitive to inhibited minor splicing, as evidenced by the consistent set of cortical abnormalities observed in MOPD1 [106]. However, investigation into the requirement of minor splicing for cortical development, and whether this requirement is cell type-specific within the cortex has, until now, been unexplored.

Here, we present the first mammalian model of microcephaly caused by minor spliceosome inactivation (Figure 4.2D), allowing us to study the pathogenesis of microcephaly in disorders caused by minor spliceosome disruption. Unexpectedly, we found that even within the developing

105 cortex, different cells showed different sensitivities to U11 loss. Specifically, self-amplifying

RGCs were highly sensitive to minor spliceosome inactivation, while RGCs undergoing differentiative divisions, IPCs, and neurons were less affected (Figures 4.4, 4.5C-F, and 4.17A).

We did not observe a significant difference between the percentage of non-pyknotic BrdU+ cells that were NeuN+ neurons in the E13 control and mutant pallium, suggesting similar rates of neurogenesis from the control and U11-null NPCs labeled at E12 (Figure 4.5F). These findings are discordant with the similar numbers of neurons observed in the control and mutant at E13

(Figure 4.4D), which may be explained by compensatory neuron production prior to the BrdU pulse. The U11-null neurons born from these U11-null NPCs can survive postnatally (Figure 4.15), although the observed agenesis of the cerebrospinal tracts and corpus callosum in the postnatal mutant brain indicated that their differentiation was profoundly affected (Figures 4.12, 4.13, and

4.16). Together, our findings show that minor spliceosome inactivation in the developing mouse cortex predominantly affects the survival of self-amplifying RGCs and may shift RGCs to neurogenic divisions. Ultimately, this would deplete the RGC pool (Figures 4.4H & 4.17A), resulting in the observed microcephaly (Figure 4.2D).

In the developing mouse cortex, the U11-null RGCs experience multiple cell cycle defects.

For example, fewer U11-null RGCs were in S-phase (Figure 4.7G), suggesting faster S-phase progression or death in late G1/early S-phase. U11-null RGCs also displayed prolonged prometaphase/metaphase (Figure 4.7C-E), indicative of spindle assembly defects. Since prolongation of mitosis—and particularly of prometaphase/metaphase—has also been linked to increased neurogenic drive in cortical NPCs [294], this mitotic defect may drive U11-null RGCs to produce neurons. In addition, we observed fewer Citk+ cytokinetic midbodies along the ventricular edge where RGCs divide, without a change in the fraction of mitotic RGCs in telophase

106

(Figure 4.7E-F). This could be caused by RGC death prior to cytokinesis or improper Citk localization to the midbody. Since proper Citk localization is required for cytokinesis, mislocalization would cause aberrant cytokinesis, resulting in RGC death post-division [301]. In addition to these cell cycle defects, we observed accumulation of DNA damage at E11, followed by further DNA damage accumulation and p53 upregulation in RGCs at E12 (Figure 4.6E-G). The majority of RGCs with DNA damage also upregulated p53 (Figure 4.6E). Thus, activation of the

DNA damage response pathway in these cells likely results in the stabilization of p53 protein, as has been reported [352]. However, the vast majority of p53-upregulating RGCs did not have DNA damage, suggesting that DNA damage accumulation is not the sole cause of p53 upregulation and cell death in the mutant pallium (Figure 4.6E). For these RGCs, we expect that p53 upregulation is triggered by prometaphase/metaphase prolongation and cytokinesis defects, as has been proposed (Figure 4.7C-E) [353, 354]. This is consistent with the observation that most p53- upregulating cells were not in S-, G2-, or M-phases, suggesting that the majority of p53- upregulating cells were either NPCs in G1 phase or neurons (Figure 4.10B-C). Since the vast majority of p53-upregulating cells were Pax6+ (Figure 4.6G), most of these p53-upregulating cells were likely RGCs in G1 phase (Figure 4.10B-C). RNAseq of the pallium revealed that upregulated genes enriched for “intrinsic apoptotic signaling pathway in response to DNA damage by p53 class mediator,” which is consistent with the observed DNA damage and p53 upregulation in RGCs

(Figure 4.6F-G & Table S4). Two of the genes enriching for this functional category were the well- known p53-mediated apoptosis effectors Puma and Noxa (Table S4). Upregulation of these p53- mediated apoptosis effectors is likely specific to RGCs, particularly those in G1 phase, since (1) p53 upregulation was observed in RGCs (Figures 4.6G and 4.10B-D), and (2) the majority of cells in the E12 pallium are RGCs (Figure 4.4E).

107

In essence, all of these RGC-specific cellular defects are caused by minor spliceosome inactivation—i.e., missplicing of MIGs. In the U11-null pallium, 62% of minor introns were significantly retained compared to the control, 21 of which are involved in cell cycle regulation

(Figure 4.8 & Table S5). Notably, four of these cell cycle-regulating MIGs—Cdc45, Dna2, Prim1, and Rfc5—regulate DNA replication and S-phase progression (Table S5); therefore, disruption of their function likely results in DNA damage and cell death in S-phase, which is consistent with the observed cellular defects (Figures 4.6D-E, 4.7G, & 4.17B). Ineffective DNA damage repair, due to minor intron retention in the 13 MIGs regulating this process (Table S6), would further contribute to DNA damage accumulation and the subsequent p53 upregulation. This pathway may underlie the DNA damage observed in the E11 mutant pallium, prior to p53 upregulation (Figure

4.6E & 4.17B). Disrupted function of many of the remaining cell cycle-regulating MIGs, such as

Ahctf1, Dctn3, Dync1li1, Gak, Mau2, and Zfp207, is reported to cause spindle assembly defects, chromosome mis-segregation, and aberrant cytokinesis (Table S5). Thus, minor intron retention in these genes likely contributes to the prolonged prometaphase/metaphase and cytokinesis defects observed in U11-null RGCs at E12, which may in turn trigger the observed p53 upregulation in the absence of DNA damage (Figures 4.6E, 4.7E-F, and 4.17B). The identification of these subsets of MIGs, which show significantly elevated minor intron retention and whose functions track with the observed cellular defects in the U11-null pallium, opens new avenues of research for scientists investigating the pathogenesis of the developmental disorders linked to minor spliceosome inactivation.

Out of the 186 minor introns that showed significantly elevated retention in the U11-null pallium, we predicted that the majority would result in nonsense-mediated decay (Figure 4.6B).

However, none of the 147 MIGs containing these retained minor introns were downregulated in

108 the E12 U11-null pallium. This is in agreement with the finding that mutation in an essential minor spliceosome protein (rnpc3) in zebrafish did not affect expression levels of 90% of MIGs, while still resulting in elevated minor intron retention [113]. These findings suggest that inactivation of the minor spliceosome, while disrupting minor intron splicing, does not necessarily result in downstream changes in overall gene expression. This could be a result of a transcriptional feedback loop that drives compensatory upregulation of MIG transcription in response to impaired function of these MIGs [106]. Translation of these transcripts could result in aberrant protein production, with loss- or gain-of-function effects, contributing to the RGC-specific cellular defects and death.

In both the U11-null pallium and in MOPD1 patients, the observed differential sensitivity to minor spliceosome inactivation may be informed by differences in MIG expression across cell types, where higher MIG expression is correlated with increased sensitivity to minor spliceosome inactivation. Furthermore, if minor introns themselves display differential sensitivity to minor spliceosome inactivation, as our data suggest, then the identity of MIGs expressed in specific cell types could inform that cell type’s susceptibility to minor spliceosome inactivation. In all, our findings establish how minor spliceosome inactivation in the developing mammalian cortex leads to microcephaly. Moreover, we are the first to identify cell type-specific requirements for minor spliceosome activity within the developing cortex, raising new questions about the mechanisms underlying cell type-specific sensitivity to minor spliceosome activity in mammalian development and disease.

4.5 Materials and methods

Animal Procedures and Generation of the Rnu11 cKO Mouse

109

All mouse procedures were performed according to the protocols approved by the

University of Connecticut Institutional Animal Care and Use Committee. The Rnu11 conditional knockout (cKO) mouse was generated by the University of Connecticut Health Center. A single targeting construct was utilized to introduce both loxP sites into the Rnu11 locus in mouse embryonic stem (ES) cells (Figure 4.1A). This construct contained a 5’ loxP site 1090 bp from the

Rnu11 gene. Immediately upstream of the 3’ loxP site was a phosphoglycerine kinase (PGK)-

Neomycin (Neo) cassette flanked by Frt sites. Additionally, this construct contained a PGK-dTA

(diphtheria toxin A) negative selection cassette downstream of the 3’ arm of homology. This construct was electroporated into 129X1/SvJ mouse embryonic stem (ES) cells. Successful targeting was verified by G418-mediated positive selection. Subsequently, nested long-range PCR was employed to confirm successful at both the 5’ and 3’ loxP sites

(Figure 4.1C). Two ES-cell clones, F4 and H8, were then used for morula aggregation using

C57BL/6 embryos to generate a chimera. Germline transmission and ablation of the Neo cassette was verified by introducing germline Flp recombinase. Proper loxP site placement in the resulting mouse line was confirmed by PCR (Figure 4.1A; Table S7) to identify loss of the wild-type allele in the homozygous floxed line and by breeding in the germline EIIa-Cre [355] (Figure 4.2A).

Emx1-Cre was bred into the Rnu11 conditional knockout line to target Rnu11 for removal in the developing forebrain [235]. The primers used for Rnu11, Cre, Emx1-Cre zygosity, and Ai9 tdTomato Cre reporter genotyping PCRs are listed in Table S7. All experiments were performed using male and female Rnu11WT/Flx and Rnu11Flx/Flx mice that were Emx1-Cre+/-. For matings intended for embryonic harvests, the plug date was considered E0; on the harvest date, the embryonic time-point was further verified by Theiler staging of the embryos.

110

Quantitative PCR

Quantitative PCR (qPCR) using Genomic DNA To determine Rnu11 heterozygosity and loss of the WT allele, 25 ng of gDNA from Rnu11WT/KO and Rnu11WT/WT mice was used for qPCR with Rnu11 cKO, 5’ loxP primers (Table S7). Equal gDNA input was controlled with amplification for Hist1h1a, an intronless gene, which was then used to normalize amplification for

Rnu11.

Quantitative Reverse-Transcriptase PCR (qRT-PCR) Loss of U11 expression was confirmed by qRT-PCR with U11 primers (Table S7) on cDNA prepared from total RNA extracted from the pallium of control (N=3) and mutant (N=4) E12 embryos. In this case, Neat1, a non-coding RNA that did not show change in expression between the control and mutant samples by RNAseq, was used to normalize expression of U11 by qRT-PCR [356] (Table S3). The same cDNA was used to determine expression of Spc24, using primers designed in the 3’ UTR of the gene (Table S7). The

Cq values obtained for Sfrs10 (Table S7) were used to normalize expression of Spc24 [251]. For minor intron retention in Coa3, Parp1, and Pten, expression values of exon-intron (unspliced) product were normalized using Cq values generated using intergenic primers (Table S7) to offset any gDNA contamination [251]. Expression values of exon-exon (spliced) product were normalized using Sfrs10 Cq values [251]. Minor intron retention for each of these MIGs was calculated by dividing the normalized fold expression values of the unspliced product by the normalized fold expression values of the spliced product. Statistical significance was determined by two-tailed student’s t-tests.

Transcardial Perfusion

111

P21 and P70 control and mutant mice were anesthetized using isoflurane, followed by transcardial perfusion performed as previously described [357]. After the brains were removed, they were fixed overnight in 4% PFA at 4°C, followed by PBS washes and cryoprotection in 30% sucrose in PBS at 4°C for two days. Following an overnight incubation in half 30% sucrose/half OCT compound

(Fisher Scientific, #23-730-571), the brains were embedded in OCT compound and sectioned using a cryostat (Leica CM 3050 S).

Hematoxylin and Eosin Staining

Frozen 16 µm coronal sections of P0 mouse brains were cured at room temperature for 15 minutes, followed by incubation in PBS for 15 minutes. The tissue was then dehydrated using an ethanol gradient (25%, 50%, and 75%). Slides were washed twice in 100% ethanol for 30 seconds each, with agitation, then washed twice in 95% ethanol for 30 seconds each, with agitation.

Following a rinse in running tap water, the sections were incubated in Harris hematoxylin (Fisher

Scientific, #6765001) for 7 minutes. Excess hematoxylin was removed by washing in running water for 30 seconds. The slides were briefly agitated in 1% acid alcohol (1% HCl in ethanol), then briefly rinsed in running water. Sections were then incubated for 2-3 minutes in saturated lithium carbonate (1.54% in dH2O), followed by a 5-minute incubation in tap water. After a 30 second wash in 80% ethanol with agitation, slides were incubated in an eosin Y-phloxine B solution (0.1% eosin Y, 0.01% phloxine B, 0.45% glacial acetic acid, 82.9% ethanol) for 1 minute.

The sections were then washed in 95% ethanol for 30 seconds with agitation, followed by two 30- second washes in 100% ethanol with agitation and incubation in 100% ethanol for 1 minute. After washing in xylene for 30 seconds with agitation, the slides were washed in xylene for 1 minute, then mounted using PermountTM mounting medium (Fisher Scientific, #SP15-100).

112

In Situ Hybridization

16 µm cryosections of mouse heads (for E10–E12 and P0), telencephalons (E13–E14), and whole brains (for P21), N=3 for each time-point and genotype, were used for section in situ hybridization (SISH). SISH was performed using antisense, digoxigenin-labeled U11 RNA probe, which was detected using either alkaline phosphatase (AP) or fluorescent labeling (FISH), as described in [116]. The U11 probe was generated using the U11 expression primers in Table S7.

TUNEL Assay

Sixteen µm cryosections of either embryonic mouse heads (for E10 – E12) or telencephalons (E13 – E14) were used for terminal deoxynucleotidyl transferase dUTP nick ending labeling (TUNEL) analysis. The in situ cell death detection kit, TMR red (Roche, #12156792910), was used, according to the manufacturer’s instructions.

Immunofluorescence

Immunofluorescence was performed using one of three protocols. Staining for Tbr2 was performed as described in [233]. For staining with anti-NeuN antibody, heat-mediated antigen retrieval was performed using citrate buffer, per the manufacturer’s instructions (Vector

Laboratories, #H-3300), followed by 10 minute PBS washes (3 times). After three 10 minute washes in 0.5% Triton® X-100 in PBS (PBT), sections were incubated in 10% FBS in PBS for one hour at room temperature. Sections were then incubated in a PBS solution containing 1% FBS,

0.1% Triton® X-100, and the primary antibody overnight at 4°C. Following five-minute washes in PBT (10 times), incubation with the secondary antibody (1:500) was performed similarly to the

113 primary antibody incubation, except for two hours at room temperature. The slides were then washed five times with PBT, followed by three PBS washes and mounting with ProLong® Gold

Antifade Mountant (ThermoFisher Scientific, #P36930). For all other antibodies, staining was performed as described in [358]. Primary antibodies were diluted to 1:50 (mouse anti-CRIK/CITK,

BD Biosciences, #611376; rabbit anti-cyclin B1, Cell Signaling, #4138; MoBu-1 clone mouse anti-BrdU, Santa Cruz, #sc-51514), 1:100 (mouse anti-Aurora-B/AIM1, BD Biosciences,

#611082; mouse anti-Pax6, Developmental Studies Hybridoma Bank, #PAX6-b), 1:200 (mouse anti-γH2AX, MilliporeSigma, #05-636; rabbit anti-NeuN, abcam, #ab17787), 1:300 (rabbit anti- cleaved caspase 3, Cell Signaling, #9665S; mouse anti-KI67, BD Biosciences, #556003; rabbit anti-p53, Leica Microsystems, #p53-CM5p; rabbit anti-Pax6, MilliporeSigma, #AB2237), or

1:500 (rabbit anti-PH3, Bethyl Laboratories, Inc., #IHC-00061; rabbit anti-TBR2, abcam,

#ab23345).

5-bromo-2’-deoxyuridine (BrdU) and 5-ethynyl-2’deoxyuridine (EdU) Pulse Experiments

Timed-pregnant Rnu11WT/Flx::Emx1-Cre+/+ females were first weighed and peritoneally injected with 0.2 µL 25 mM BrdU or EdU in PBS per 100 mg body weight, one hour prior to embryonic harvest. BrdU incorporation was detected using the BrdU-specific MoBu-1 clone anti-

BrdU antibody (Santa Cruz, #sc-51514) [359]. EdU detection was then performed on sections of the pallium using the Click-iTTM EdU Alexa FluorTM 647 imaging kit (ThermoFisher Scientific,

#C10340) in accordance with the manufacturer’s instructions.

Image Acquisition and Quantification

114

Processed slides were imaged with a Leica SP2 confocal microscope, where settings for laser intensity, excitation-emission windows, gain, and offset conditions were identical between control and mutant sections on a given slide. For every slide that was processed for ISH, FISH, or

IF, the slide contained at least one control and one mutant section. Sections serial to those that revealed loss of U11 by ISH analysis were used for IF staining, and images were collected in regions of the pallium that showed U11 loss in the mutant by ISH (Figure 4.2E). For each channel for every experimental condition, confocal imaging settings were optimized for fluorescence in the control section on a given slide, and maintained for all other sections on that slide. For TUNEL,

CC3, γH2AX, and p53 imaging, collection settings were adjusted using the control section. These settings were maintained when imaging the mutant section(s) on the same slide. Further processing was performed on IMARIS v8.3.1 (Bitplane Inc.) and Adobe Photoshop CS4 (Adobe Systems

Inc.). All image processing in IMARIS and Photoshop was identical between images of control and mutant sections from the same slide. Manual quantification was performed using the spot tool in the IMARIS software, on 240 µm-wide columns of the pallium; statistical significance was calculated using two-tailed student’s t-tests. For quantification of mitotic phases, as identified by

Aurora B staining (Figure 4.9A), prometaphase and metaphase cells were summed together into a single category (prometaphase/metaphase). For the quantification of U11-/NeuN+/EdU+ cells, only

DAPI+ nuclei were considered during the quantification process, due to the presence of non- nuclear NeuN signal. We subsequently employed both the 3D view and slice functions in IMARIS to ensure these DAPI+ nuclei were truly U11-null, NeuN+, and EdU+ (Figures S3 and S4). In the

P0 and P21 control and mutant cortices, we observed non-nuclear FISH signal in the raw MIP images, with some FISH signal observed ~10 µm from the nearest nucleus in the section (Figures

S4 and S5). Given that U11 expression is predominantly nuclear, this non-nuclear FISH signal was

115 considered background staining. For quantification of U11-null cells, U11 signal for each slide was adjusted in IMARIS based on two criteria: (1) non-nuclear FISH signal in the control and/or mutant images, and (2) FISH signal in NeuN+ cells in the control. Specifically, FISH levels were adjusted to minimize the background FISH signal, while maintaining FISH signal in all NeuN+ cells in the control, to prevent false negative calls (Figures S2 and S3). Since we were only interrogating NeuN+ cells in the analysis, and due to the presence of U11-null glia/endothelial cells in the control P21 cortex (Figure S1), we did not consider NeuN-negative cells in the FISH thresholding process. In the case of P21 sections, we observed a high level of FISH background staining, which was not completely ablating using this thresholding process (Figures S4 and S5).

Therefore, given the presence of FISH background in these images, U11+ cell calls were made based on the presence of multiple (>1) grains of FISH signal in the nucleus of a NeuN+/EdU+ cell.

To further prevent false negatives from confounding our analysis, we divided the absolute number of U11-/NeuN+/EdU+ cells for each cortical column by the total number of NeuN+/EdU+ cells in that column, to determine the percentage of NeuN+/EdU+ cells that were U11-null. For analysis of

Pax6+ and Tbr2+ cell quantification across mutant cortical development, statistical significance was determined by one-way ANOVA, followed by the post hoc Tukey test. For comparisons between the ɣH2AX+, p53+, or CC3+ subpopulation and all DAPI+ cells in the E12 mutant pallium, statistical significance was determined by Fisher exact test. Specifically, for each N, a separate

Fisher exact test was performed on quantification data from a 240 µm-wide column of the mutant pallium. The P values from these Fisher exact tests were then combined using Fisher’s method.

Bioinformatics Analysis

116

RNAseq The pallium was dissected from E12 (N=5 Rnu11WT/Flx Emx1-Cre+/-, N=5

Rnu11Flx/Flx::Emx1-Cre+/-) embryos, followed by RNA extraction using TRIzol (Thermo Fisher

Scientific, #15596018), per the manufacturer’s instructions. Sample preparation and sequencing were performed by the University of Connecticut Center for Genome Innovation. Total RNA library was prepared by Illumina TruSeq Stranded Total RNA Library Sample Prep Kit (RS-122-

2201) with Ribozero to remove ribosomal RNA, and were then sequenced using Illumina NextSeq

500. Each sample was barcoded so that the sequencing could be multiplexed. In total, we obtained

189,989,434 paired-end 151 bp reads for the control (N=5) and 219,849,711 paired-end 151 bp reads for the mutant samples (N=5).

Gene expression First, reads of individual replicates were merges and mapped to the mm10 genome (UCSC genome browser) using Hisat2 (options –no-discordant and –no-mixed), with an overall alignment rate of 85% [248, 360]. Reads that mapped to multiple locations in the genome were removed. Since the samples were obtained from C57BL/6-129X1/SvJ mice, SNVQ was run to call single nucleotide variations (SNVs) [361], which were used to create a new reference genome. Reads from all replicates were then mapped separately to the newly generated reference genome using Hisat2. Isoform expression was determined by IsoEM2, an expectation- maximization algorithm that bootstraps samples in order to determine an expression value in fragments per kilobase per million mapped reads (FPKM) unit [362]. Gene expression was then calculated by taking the sum of the FPKM values of a given gene’s constituent isoforms. Finally, differential gene expression (≥2 fold change, P≤0.01) was determined by IsoDE2 [362]. Briefly, this tool utilizes the 200 bootstrapping iterations to compute a confident log2FC, which is the maximum log2FC that can be computed within the specified P-value. A log2FC of 100 is reported when the FPKM value for one of the conditions was 0 in one of the bootstrapping iterations.

117

Splicing efficiency To assess minor intron splicing, sequences of the 555 minor introns were curated from the U12DB [58] and blasted using Ensembl83 to obtain the minor intron coordinates in the canonical transcript. Sequences of eight minor introns (found in Arpc5l,

Cacna1c, Exosc2, Golga7, Ift80, Mlst8, Tmem161b, and Morc4) that did not have 100% identity with any annotated intron in the Ensembl database were removed from our analysis. All gene and exon coordinates from the mm10 genome were obtained from a Biomart file (Ensembl v84) [363].

Major intron coordinates were downloaded from the UCSC genome browser (Ensembl v84) and randomized after removal of introns smaller than 4 nt.

Reads from each of the biological replicates were separately mapped with Hisat2 to the previously described post-SNVQ generated reference genome. Reads that mapped to multiple locations were then filtered out. To determine splicing efficiency of minor introns, we employed the strategy described in [103]. Briefly, all reads mapping to an exon-intron boundary were extracted and quantified using BEDTools [364]. The mis-splicing index (MSI) was then calculated for each intron by dividing the number of exon-intron junction reads by the number of reads mapping to the canonical exon-exon splice-junctions. Only introns with >4 exon-intron boundary reads, ≥1 read aligning to either the 5’ or 3’ splice site, and >95% intron coverage in at least one sample were used for this analysis [103]. As a control for the 545 minor introns subjected to these criteria, we also generated 100 lists of 545 randomized major introns (54,500 randomized major introns in total), to which we applied the same criteria. For each intron that passed the filtering criteria, we calculated the difference between the mutant and the control (mutant MSI – control

MSI; ∆MSI). To determine which minor introns were retained significantly more in the mutant, we compared the average MSI of each minor intron between the control and mutant samples using two-tailed student’s t-tests.

118

ORF analysis To analyze the effect of minor intron retention on the ORF, we curated all annotated isoforms of the 178 MIGs containing minor introns with significantly elevated retention in the mutant from Ensembl (v84) [363]. From this list of isoforms, we extracted only isoforms that would require minor intron splicing for their production, i.e. the blasted minor intron coordinates match one of the annotated intron coordinates 100%, resulting in 330 isoforms.

Comparison of the annotated open reading frame (ORF) sequence, with the sequence of the isoform containing a retained minor intron, was performed to determine whether minor intron retention would result in introduction of a premature stop codon in these isoforms. Isoforms were then binned into one of four categories: (1) truncation of the ORF, due to a premature stop codon;

(2) elongation of the ORF, with usage of a premature stop codon; (3) elongation of the ORF, with usage of the isoform’s annotated stop codon; and (4) no change in ORF size, with usage of a premature stop codon. For the 273 isoforms binned in category (1), we identified the location of the stop codon relative to the downstream exon-exon junction. Based on this information, we further subdivided isoforms from category 1 into (a) isoforms with premature stop codons >50 nt upstream of an exon-exon junction, which would be predicted to activate nonsense mediated decay

(NMD) of the transcript [343]; and (b) isoforms with premature stop codons >50 nt upstream of an exon-exon junction, which might escape NMD and be used for protein production.

RNA Extraction and cDNA Preparation

Palliums were dissected from E12 Rnu11 control (N=3) and mutant (N=4) mice, and the tissue collected from each embryo was individually used for total RNA isolation. Tissue from each embryo was separately triturated in 100 µL of TRIzol (Invitrogen, #15596026). RNA was extracted using the DirectZOLTM RNA MiniPrep kit (Zymo Research, #R2050), per the

119 manufacturer’s instructions. For total RNA samples, 500 ng of RNA was used for cDNA synthesis, which was performed as previously described [116].

120

Chapter 4: Figures

Figure 4.1. Generation and confirmation of the Rnu11 cKO mice. (A) Schematic of the Rnu11 locus. The gray box represents the predicted gene Gm28874. Below the Rnu11 locus schematic is the targeting construct used to introduce the loxP sites by homologous recombination (dashed lines). (B) Schematic representing the targeted allele, showing the primers used to interrogate the 5’ loxP site (black and green arrows, left) and the 3’ loxP site (black and purple arrows, right). For the 5’ loxP site primer set, the outermost forward primer (black arrow, left) was designed outside the 5’ arm of homology, with the outermost reverse primer (black arrow, left) positioned downstream of the 5’ loxP site. The inner set of nested primers (green arrows) were designed in the 5’ arm of homology and within the 5’ loxP site, respectively. For the 3’ loxP site primer set, the outermost forward primer (black arrow, right) was designed in the Frt site located upstream of the 3’ loxP site, with the outermost reverse primer (black arrow, right) positioned downstream of the 3’ arm of homology. The inner set of nested primers (purple arrows) were designed in the 3’ loxP site and the 3’ arm of homology, respectively. (C) Agarose gel images of long-range nested PCRs, performed on genomic DNA (gDNA) from targeted ES cells, using the 5’ loxP site primer set (black and green arrows, left) and the 3’ loxP site primer set (black and purple arrows, right). The number at the right of each gel image represents the expected product size produced from this strategy.

121

Figure 4.2. U11 loss in the developing mouse neocortex causes severe microcephaly. (A) Schematic of the Rnu11 floxed (Flx) allele with positions of the loxP sites (blue triangles), with agarose gel image showing PCR results detecting the upstream (left) and downstream (right) loxP sites. Below is a schematic of the Rnu11 knockout (KO) allele, confirmed by PCR. See also Fig. S1 and Table S7. (B) Results of quantitative PCR (qPCR) detecting the WT allele. (C) Table showing genotype frequency of pups produced from crosses of Rnu11WT/KO mice. (D) Images of P0 Rnu11WT/Flx::Emx1-Cre+/- (control, ctrl) and Rnu11Flx/Flx::Emx1-Cre+/- (mutant, ctrl) pups, with lateral view of P0 ctrl and mut brains. Scale bar=5 mm. At right are coronal sections of P0 ctrl and mut brains stained with hematoxylin and eosin. Arrowheads indicate the choroid plexis. Scale bar=500 µm. (E-I) In situ hybridization (ISH) for U11 expression (purple signal) in sagittal sections of the telencephalon from ctrl (top row) and mut (bottom row) (E) E10, (F) E11, (G) E12, (H) E13, and (I) E14 embryos. Scale bars=500 µm. Insets show higher magnification of the boxed regions; scale bars=200 µm. (J) Bar graphs showing the average thickness of the E12, E13, and E14 ctrl and mut pallium. Quantification data are represented as mean±SEM. n.s.=not significant; *=P<0.05; ***=P<0.001.

122

Figure 4.3. Cell death contributes to microcephaly in the U11 cKO mouse. (A-E) Fluorescent ISH (FISH) signal for U11 (white) combined with TUNEL (red) on control (ctrl, left) and mutant (mut, middle) pallium sections from (A) E11, (B) E12, (C) E13, and (D) E14 embryos, with quantification of TUNEL+ cells (right). Ap=apical; ba=basal. (E-F) Immunofluorescence (IF) for cleaved caspase 3 (CC3) in ctrl (left) and mut (middle) pallium sections from (E) E11 and (F) E12 embryos, with quantification of CC3+ cells (right). Scale bars=30 µm. Data are represented as mean±SEM.. n.s.=not significant; *=P<0.05; **=P<0.01; ***=P<0.001.

123

Figure 4.4. U11 loss results in depletion of radial glial cells. (A-C) IF for Ki67 (green) and NeuN (magenta) on sagittal sections of the control (left) and mutant (right) pallium across (A) E12, (B) E13, and (C) E14, with (D) quantification. (E-G) IF for Pax6 (green) on sagittal sections of the control (left) and mutant (right) pallium across (E) E12, (F) E13, and (G) E14, with (H) quantification. (I-K) IF for Tbr2 (green) on sagittal sections of the control (left) and mutant (right) pallium across (I) E12, (J) E13, and (K) E14, with (L) quantification. Scale bars=30 µm. Data are represented as mean±SEM. Subp.=subpallium; n.s.=not significant; *=P<0.05; **=P<0.01; ***=P<0.001.

124

Figure 4.5. U11 loss and death of self-amplifying radial glial cells in the mutant pallium. (A) Separated channels for sagittal section U11 FISH (middle row) and immunofluorescence (IF) for Ki67 (green) and NeuN (magenta) (top row), with overlay (bottom row), from the E12 and E13 control (ctrl) and mutant (mut) pallium. Scale bars=30 µm. (B) IF for Pax6 (green) and cleaved caspase 3 (CC3, magenta) on sagittal section of E13 mutant pallium. White line marks the boundary between the subpallium and the pallium. Scale bar=30 µm. (B’-B’’’) Magnified images of the dashed box in (B), with overlay of all channels (B’), Pax6 and DAPI (B’’), and CC3 and DAPI (B’’’). Scale bars=7 µm. (C) IF for CC3 (magenta) and Tuj1 (green) in sagittal section of the E12 ctrl and mut pallium, with bar graphs showing the percentage of CC3+ cells that were pyknotic (left) and the percentage of non- pyknotic CC3+ cells that were Tuj1+ (right). Scale bars=30 µm. (D-E) Bar graphs showing quantification of cells with nuclear Pax6 staining (D) or cells with nuclear Tbr2 staining (E) across development in the mutant pallium, from E12 and E14. Statistical significance across the three tested time-points was determined by one-way ANOVA, followed by Tukey’s multiple comparison test to determine specific P values. (F) IF for BrdU (magenta) and NeuN (green) in the E13 ctrl and mut pallium, which had been pulsed with BrdU at E12. Scale bars=30 µm. At right are bar graphs showing the percentage of NeuN+ cells that were BrdU+ in the E13 pallium, or the percentage of non-pyknotic BrdU+ cells that were NeuN+ in the E13 pallium. Statistical significance was determined by two-tailed student’s t-tests. Quantification data are represented as mean±SEM from N=3 for each condition per time-point. Non-pyk.=non- pyknotic. n.s.=not significant; *=P<0.05; **=P<0.01.

125

Figure 4.6. U11 loss triggers DNA damage and p53 upregulation in radial glial cells. (A) Scatterplot depicting the expression of protein-coding genes ≥1 FPKM in the control (ctrl, x- axis) and mutant (mut, y-axis) E12 pallium. Inset shows U11 expression in the E12 mut pallium by SISH, with dashed lines marking the dissected region. (B-C) Normalized expression, determined by qRT-PCR on E12 ctrl and mut pallium, for (B) Rnu11 and (C) Spc24. (D) IF for CC3 (green) and ɣH2AX (magenta) in the E12 ctrl and mut pallium, with quantification. (E) IF for ɣH2AX (magenta) and p53 (green) in E11 and E12 ctrl and mut sagittal pallial sections, with quantification. Inset pie charts show the percentage of ɣH2AX+ cells that upregulated p53 (p53+) (left) and the percentage of p53+ cells that were ɣH2AX+ (right). (F) IF for ɣH2AX (magenta) and Pax6 (green), Tbr2 (green), or NeuN (green), on sagittal sections of the E12 mut pallium, with quantification. (G) IF for p53 (magenta) and Pax6 (green) in the E12 mut pallium, with pie charts showing the percentage of Pax6+ cells of the p53+ population (left) and of all DAPI+ cells (right). Scale bars=30 µm. Quantification data are represented as mean±SEM. n.s.=not significant; *=P<0.05; **=P<0.01; ***=P<0.001.

126

Figure 4.7. U11-null radial glial cells exhibit S-phase, metaphase, and cytokinesis defects. (A) Frequency plot of the change in mis-splicing index (∆MSI, x-axis) of major (blue) and minor (red) introns. (B) Pie chart showing the effect of minor intron retention on the open reading frame (ORF), for the 186 minor introns with significantly elevated retention, across all annotated MIG transcripts requiring minor intron splicing. The truncation category is further broken down into the percentage of transcripts predicted to be degraded by nonsense-mediated decay (NMD) or to produce protein (prot.). (C) IF for Ki67 (green) and PH3 (magenta) in sagittal sections of the E12 control and mutant pallium, with quantification. (D) IF for Aurora B (green, AurB) and Pax6 (magenta) in sagittal sections from E12 ctrl and mut pallium, with quantification of mitosis- specific Aurora B staining (M). (E) Quantification of the percentage of Aurora B+ cells in different mitotic phases for E12 ctrl and mut pallium. (F) IF for Citk (green) in sagittal sections of ctrl and mut E12 pallium, quantification of Citk+ cytokinetic midbodies along the ventricular edge in the E11 and E12 control and mutant pallium. (G) IF for Pax6 (green) and EdU (magenta) on E12 ctrl and mut sagittal pallial sections, with quantification. (H) IF for Tbr2 (green) and EdU (magenta) on sagittal sections of E12 ctrl and mut pallium, with quantification. Scale bars=30 µm. Data are represented as mean±SEM. n.s.=not significant, *=P<0.05, **=P<0.01, ***=P<0.001.

127

Figure 4.8. Validation of expression and minor intron retention changes in the E12 mutant pallium. (A) Minor intron retention of Coa3 (left), Parp1 (middle), and Pten (right) in the E12 ctrl and mut pallium, as determined by qRT-PCR. The value plotted is the average ratio of normalized unspliced expression/normalized spliced expression ± SEM from N=3 for each condition. Individual data points are superimposed on the bar graphs. Statistical significance was determined by two-tailed student’s t-tests. *=P<0.05, **=P<0.01, ***=P<0.001. (B) GO Terms enriched for by DAVID analysis of MIGs with statistically significant elevated minor intron retention in the E12 mutant pallium, with cell cycle-related terms highlighted in yellow.

128

Figure 4.9. Cell cycle defects are not observed in the E11 mutant pallium, in IPCs of the E12 mutant pallium, or in G2 phase. (A) IF for Aurora B (green, AurB) showing phase-specific Aurora B staining patterns in cells in G2, prophase (pro.), prometaphase (prometa.), metaphase (meta.), anaphase (ana.), and telophase (telo.) in the E12 pallium. Scale bars=4 µm. (B) IF for Aurora B (green) and Tbr2 (magenta) in sagittal sections of control (ctrl) and mutant (mut) E12 pallium, with bar graph showing the percentage of Tbr2+ cells with mitosis- and G2-specific (N=4) Aurora B staining patterns. (C) Bar graph showing the percentage of Aurora B+/Tbr2+ mitotic cells in pro., prometa./meta., ana., and telo. (N=4). (D) IF for Citk (green) in sagittal sections of E11 control and mutant pallium. (E) IF for Pax6 (green) and EdU detection (magenta) in sagittal sections from E11 control (ctrl, left) and mutant (mut, right) pallium, with bar graph showing the percentage of Pax6+ cells that were EdU+ in the E11 ctrl and mut pallium. (F) IF for cyclin B1 (green) in sagittal sections of the E12 ctrl and mut pallium, with bar graph showing the number of cell with cytoplasmic (cytopl.) cyclin B1. (G) Bar graph showing the percentage of Pax6+ cells with G2-specific Aurora B+ staining in the E12 ctrl and mut pallium. Scale bars=30 µm, unless otherwise specified. Quantification data are represented as mean±SEM from N=3 for each condition per time-point, unless otherwise specified. Statistical significance was determined by two-tailed student’s t-test. n.s.=not significant.

129

Figure 4.10. P53 upregulation and apoptosis predominantly occur in either cells in G1 or post-mitotic cells. (A) IF for ɣH2AX (green) and EdU (magenta) in sagittal sections of E12 mutant (mut) pallium, with pie charts showing the percentage of EdU+ cells of the non-pyknotic (non-pyk.) ɣH2AX+ and DAPI+ populations. (B) IF for p53 (green) and EdU (magenta) in sagittal sections of E12 mut pallium, with pie charts showing the percentage of EdU+ cells of the p53+ and DAPI+ populations. (C) IF for Aurora B (green, AurB) and p53 (magenta) in sagittal section of the E12 mut pallium, with pie charts showing the percentage of G2 Aurora B+ cells of the p53+ and DAPI+ populations (left), and the percentage of mitotic (M) Aurora B+ cells of the p53+ and DAPI+ populations (right). (D) IF for CC3 (green), EdU (magenta), and TUNEL (yellow) in sagittal section of E12 mut pallium, with pie charts showing the percentage of EdU+ cells of the non-pyknotic CC3+ cell population. (E) IF for CC3 (green), Aurora B (magenta), and TUNEL (yellow) in sagittal sections of the E12 mut pallium, with quantification. Scale bars=30 µm. Quantification data are represented as mean±SEM. n.s.=not significant, ***=P<0.001.

130

Figure 4.11. U11-null cells with DNA damage are not in mitosis. IF for PH3 (green) and ɣH2AX (magenta) in a sagittal section of the E12 mutant pallium. Scale bar=30 µm.

131

Figure 4.12. Cells positive for the tdTomato Cre reporter are observed in the P0 mutant brain. (A-B) Dorsal view of the whole P0 control (A) and mutant (B) brain, showing tdTomato signal (red). Exposure time for red fluorescence was determined separately for these conditions. Scale bar=1 mm. (C-N) Rostral to caudal coronal sections of tdTomato+ P0 control (C- H) and mutant (I-N) brain. Solid arrowheads indicate structures that are identifiable in both the control and the mutant brain. Open arrowheads denote structures unique to the mutant sections, relative to the control. Scale bars=500 µm. LS=lateral septum. (C’) and (I’) are magnified images of the boxed regions in (C) and (I), respectively. In these images, nuclei are marked by DAPI (blue). Ctx=cortex, str=striatum. Scale bars=250 µm. (I’’) Magnified image of the box in (I’). Scale bar=60 µm.

132

Figure 4.13. Cells positive for the tdTomato Cre reporter are observed in the P21 mutant brain. (A-L) Rostral to caudal coronal sections of tdTomato+ (red) control (A-F) and mutant (G-L) brain. Solid arrowheads indicate structures that are identifiable in both the control and mutant brain. Open arrowheads denote structures unique to the mutant sections, relative to the control. Scale bars=1 mm. LS=lateral septum. (A’) and (G’) are magnified images of the boxed regions in (A) and (G), respectively. In these images, nuclei are marked by DAPI (blue). Ctx=cortex, str=striatum. Scale bars=500 µm. (G’’) Magnified image of the box in (G’). Scale bar=60 µm

133

Figure 4.14. U11-null neurons are present in the P0 mutant cortex. (A-B) AP SISH for U11 performed on coronal sections of P0 ctrl (A) and mut (B) heads. Scale bars=200 µm. (C-D) Combined FISH for U11 (green) and IF for NeuN (magenta) on coronal sections of P0 ctrl (C) and mut (D) heads. Scale bars=200 µm. (C’-D’) are magnified images of the boxes in (C) and (D), respectively. Scale bars=60 µm. Nuclei are marked by DAPI.

134

Figure 4.15. Embryonically-produced, U11-null neurons are present postnatally. (A) Schematic of the EdU pulse-chase strategy (top) and the order of signal detection (bottom). (B-C) Maximum intensity projection (MIP, z-stack=12 µm) images of U11 FISH (green), NeuN IF (blue), and EdU detection (magenta) on coronal sections of the P0 control (B) and mutant (C) cortex. (D-E) Maximum intensity projection (MIP, z-stack=18 µm) images of U11 FISH (green), NeuN IF (blue), and EdU detection (magenta) on the P21 control (D) and mutant (D) cortex. Scale bars=60 µm. Dashed lines indicate the division between the cortex and the striatum (str). Insets show high magnification images of optical slices of the boxed regions in (C) and (E), respectively, with arrowheads pointing to U11-null, NeuN+, EdU+ cells in the P0 (C’-C’’’’) and P21 (E’-E’’’’) mutant cortex. Inset scale bars=5 µm. (F-G) Bar graphs showing quantification of the percentage of U11-/NeuN+/EdU+ cells out of the total number of NeuN+/EdU+ cells in the control and mutant cortex at P0 (F) and P21 (G), based on cells observed within a 240 µm-wide column of the cortex. Statistical significance was determined by two-tailed student’s t-tests. *=P<0.05, **=P<0.01.

135

Figure 4.16. Subcerebral projection defects in the Rnu11 mutant brain. Ventral view of tdTomato+, Emx1-Cre+ brains from (A) ctrl and (B) mut P0 mice. Arrowheads point to the region of the brainstem where the corticospinal projections are normally observed.

136

Figure 4.17. Model of cortical development in U11 cKO mice. (A) Schematic representing cortical development in wild-type and mutant embryos, from E12 to E14. Black arrows show type of cell division. Question marks and dashed lines indicate events that remain to be explored. (B) Schematic of proliferating RGCs progressing through the cell cycle in the late E11/early E12 pallium. Arrow color denotes presence (blue) or absence (black) of U11. MIGs with significantly elevated minor intron retention, and with known functions at specific cell cycle phases, are listed in boxes. Pro.=prophase; met.=metaphase; ana.=anaphase; tel.=telophase; cyt.=cytokinesis.

137

Chapter 5: Identifying MIGs essential for the survival of cycling cells

This chapter investigates the link between minor intron splicing and the survival of cycling cells, which I identified in Chapter 4. Specifically, this chapter first describes the published datasets I mined for this analysis. These datasets contain the identities of genes that, when disrupted, result in cycling cell death. Since these genes are essential for the survival of cycling cells, this set of genes is called the cycling cell “essential genome,” a.k.a. the “essentialome.” The chapter next describes the enrichment of MIGs in these cycling cell “essentialomes” and the functions these essential MIGs perform.

5.1 Identification of the “essentialome”

In 2015, three separate reports were published in Science and Cell Biology that sought to identify all genes of the human genome that are essential for the survival of cycling cells

[365-367]. For this, all three groups employed methods to disrupt the expression of most protein- coding genes in the human genome, via either gene trapping or CRISPR/Cas9-mediated mutagenesis. Between these three publications, the essentialomes of 10 different human cancer cell lines were identified [365-367]. Specifics about these cell lines, the methods, and findings of each of these reports are detailed below.

Unlike the other two groups, Blomen et al. only employed gene trapping to target protein- coding genes for inactivation. In their approach, they designed a unidirectional, retroviral gene- trapping vector, which contained a strong, adenoviral 3’ splice site at the 5’ end of the gene trap cassette, followed by blue fluorescent protein (BFP) and the SV40 signal, the latter of which functions as a transcriptional termination sequence [365, 368]. Sense insertion of this gene trap cassette into an intron would result in improper splicing of this intron, expression of

138 the BFP reporter, and production of a truncated mRNA, ultimately resulting in loss of gene function. In contrast, antisense insertion into an intron would have no effect on pre-mRNA splicing or gene expression [365]. Since retroviral infection inserts a single copy of the gene trap vector into the genome, this approach can only target one allele of one gene in each infected cell [369].

Therefore, Blomen et al. used two haploid cancer cell lines for their study: the near-haploid chronic myeloid leukemia line KBM7, and its derivative cell line, a nonhematopoietic haploid line called

HAP1 [365]. Moreover, post-infection, Blomen et al. sorted cells by DNA content, to remove any diploid cells from the pool [365]. Insertion sites were amplified and identified by linear amplification-polymerase chain reaction (LAM-PCR) and deep sequencing of the resulting PCR product [365, 370]. By plotting the number of sense versus antisense insertions observed in each gene, Blomen et al. identified the genes with significantly fewer disruptive, sense insertions than antisense insertions to be the genes essential for cell survival. This revealed 2054 genes that were essential for the survival of KBM7 cells and 2181 that were essential for HAP1 cell survival.

Through comparison of the 2054 genes of the KBM7 essentialome to the 2181-gene HAP1 essentialome, Blomen et al. identified 1734 shared essential genes, which they referred to as the

“core essentialome,” that are required for the survival of both KBM1 and HAP1 cells. Using the eggNOG protein orthology database [371], they found that the majority (77%) of the identified core essentialome genes are evolutionarily ancient, tracking back to Opisthokonta or earlier [365].

The functional properties of the core essentialome genes were also extracted from the eggNOG database, which revealed enrichment for functions relating to transcription, translation, and cell cycle [365].

The second Science paper, published by Wang et al., employed the CRIPSR/Cas9 system to introduce mutations in each gene targeted [367]. Specifically, they designed a library of guide

139

RNAs (gRNAs) such that the library would span the protein-coding exons of virtually all protein- coding genes (18,166 protein-coding genes), allowing them to target these genes for Cas9- mediated cleavage; non-targeting gRNAs were designed as controls. Cas9 and this genome-wide gRNA library were then sequentially transduced into cells via retroviral infection [367, 372]. Cells expressing gRNAs targeting essential genes would be depleted from the total cell population, allowing for the identification of essential genes via genomic DNA isolation, gRNA amplification by PCR, and deep sequencing of the amplified product [367, 372]. Wang et al. first tested this approach in the near-haploid KBM7 cell line. They assigned each gene they interrogated a CRISPR gene score (CS), calculated as the average log2 fold-change in gRNA abundance (final gRNA abundance / initial gRNA abundance). Using a CS score threshold of -0.1 and P<0.05, they identified 1878 essential genes in the KBM7 cell line. Application of a similar gene trapping strategy as Blomen et al. (2015) validated their findings in the KBM7 cell line, except for the diploid [367]. While the gene trap approach was unable to reveal any essential genes from chromosome 8, the CRISPR/Cas9 method revealed a similar proportion of essential genes on chromosome 8 as those identified on the haploid autosomes. With this evidence of efficient biallelic inactivation via the CRISPR/Cas9 system, Wang et al. expanded their approach to three other, diploid cell lines: the chronic myeloid leukemia line K562 and the Burkitt’s lymphoma cell lines Raji and Jiyoye. This strategy revealed 1660 essential genes in the K562 line;

1464 in the Raji line; and 1630 in the Jiyoye line. Unlike Blomen et al., Wang et al. did not investigate the genes shared between all four essentialomes they identified. Instead, they highlighted the presence of cell line- and tumor type-specific essential genes in their data [367].

The third report, from Hart et al., employed the same CRISPR/Cas9 approach as Wang et al., except they did not use gene trapping to validate their findings. This was likely due to their

140 decision to only investigate diploid cancer cell lines [366]. In all, Hart et al. designed gRNAs that would target the protein-coding exons of 17,232 protein-coding genes, and they used this gRNA library to interrogate the essentialomes of five cancer cell lines: the colorectal carcinoma cell lines

HCT116 and DLD1; the cervical carcinoma HeLa cell line; a glioblastoma cell line (GBM); and the immortalized retinal epithelial cell line RPE1 [366]. Gene essentiality scores were calculated using a Bayesian algorithm, which had been trained using a previously published “gold standard” reference dataset of essential and non-essential genes [366, 373]. Using a 5% false discovery rate cut-off, Hart et al. identified 2073 genes essential for HCT116 cells; 1893 genes essential for

DLD1 cells; 1696 genes essential for HeLa cells; 2197 genes essential for GBM cells; and 2038 genes essential for RPE1 cells [366]. Comparison of these five cell line essentialomes revealed

829 “core essentialome” genes that were essential for the survival of all five cell lines. However,

Hart et al. instead used a weaker criterion in their analysis of “core fitness genes”: each gene had to be present in at least 3 of the 5 cell line essentialomes, resulting in a list of 1580 genes. The biological processes enriched in this 1580-gene set were identified using the Panther database, which revealed significant enrichment for RNA processing, DNA repair/replication, and mitosis

[366]. Comparisons between the 1580-gene set and lists of essential genes extracted from yeast,

C. elegans, D. melanogaster, and Mus musculus phenotype databases showed a high degree of overlap, suggesting that the essential functions of these genes have been conserved [366]. In addition to these “core fitness genes,” Hart et al. investigated the large number of cell line-specific essential genes in their dataset, employing functional enrichment analysis and drug-based assays to study differences in cell viability between these cell lines [366].

Together, these three reports identified the essentialomes of 10 human cancer cell lines; each of these essentialomes contains between ~1500 to ~2600 genes. Notably, the cancer cell lines studied

141 represent a broad swath of cancer type/origin, including two chronic myeloid leukemia lines, two lymphoma cell lines, two colon cancer lines, a glioblastoma line; one cervical carcinoma line, an immortalized RPE line, and a haploid, non-hematopoietic line derived from a chronic myeloid leukemia cell line. Since these papers were published back-to-back, none of them included comparisons among all of the 10 cycling cell essentialomes identified in 2015 [365-367]. However, comparisons between the cell line essentialomes identified within each paper suggest that genes of the core essentialome are evolutionarily ancient and function in transcription, translation, genome integrity, and cell cycle.

5.2 Rationale

As detailed in Chapter 4, progenitor cells and neurons displayed vastly different responses to complete minor spliceosome inactivation. Constitutive inactivation of the minor spliceosome resulted in embryonic lethality (Figure 4.2C), and the distribution of genotypes identified in embryonic harvests suggested that this occurs early in embryonic development (data not shown).

Loss of U11 in the developing cortex caused rapid death of self-amplifying RGCs, while U11-null neurons survived postnatally (Figures 4.5D and 4.15). Together, our findings indicate that minor spliceosome activity is essential for the survival of cycling cells, but it is not immediately necessary for neuron survival. Since inactivation of the minor spliceosome results in missplicing of MIGs, and the resulting disruption of MIG function is what ultimately leads to the observed phenotypes,

I sought to identify MIGs known to be essential for cycling cell survival. To address this question,

I extracted the 10 published cycling cell essentialomes to study MIG enrichment in these essential gene sets.

142

5.3 Results

5.3.1 MIGs are significantly enriched in the “total” essentialome and in each of the 10 distinct essentialomes derived from 10 different cell lines

If MIGs are essential for the survival of cycling cells, then one would expect MIGs to be enriched in the essentialome gene sets. To test this, we first extracted lists of all genes interrogated by at least one of the three essentialome-defining papers and of the genes comprising the ten cell line essentialomes [365-367]. Since Blomen et al. did not report the full list of genes they investigated by their approach, the list of interrogated genes was derived from the Hart et al. and

Wang et al. publications, resulting in a list of 18,835 genes. The ten cell-line essentialomes were also combined into a single, “total” essentialome of 5206 genes, consisting of all genes shown to be essential for the survival of at least one cell line. In parallel, a list of all human MIGs was extracted from the U12DB [58]; any MIGs that were not interrogated in at least one of the three essentialome publications were excluded from the downstream analysis, resulting in a list of 577 human MIGs.

Next, we examined the percentage of MIGs found in these gene sets. We observed significant enrichment of MIGs in the total essentialome (5.05%; P=1.33E-20) compared to their proportion in the list of all genes interrogated by these studies (3.06%; Figure 5.1A). Moreover, analysis of the individual cell line essentialomes revealed even higher enrichment of MIGs in all ten essentialomes, with MIG enrichments ranging from 6.74% to 7.82% (Figure 5.1B; P≤4.14E-

19).

5.3.2 MIGs are significantly enriched in the core essentialome

143

We observed little variation in MIG enrichment among the individual cell-line essentialomes. This could indicate minimal cell type-specificity for MIG enrichment, such that a core set of essential MIGs might be shared among all ten essentialomes. To investigate this, we identified the genes of the total essentialome that were common to all ten cell-line essentialomes, which produced a core essentialome of 460 genes (Figure 5.1C). In this process, we noticed that some genes in the 18835 genes of the total essentialome were not interrogated in both the Wang et al. and Hart et al. papers [366, 367]. Therefore, even if these genes were essential for the survival of all cycling cells, our approach would not identify them as part of the core essentialome. To circumvent this problem of false negatives, we only counted genes jointly interrogated by both

Hart et al. and Wang et al. for the core essentialome analysis. This produced a 16994-gene list of all “jointly interrogated” genes and a list of 548 jointly interrogated MIGs.

Strikingly, we observed a significant, 3.1-fold increase in MIG enrichment in the core essentialome (10%; P=9.39E-12) compared to the proportion of MIGs in the jointly interrogated gene list (3.22%, Figure 5.1D). To further verify this enrichment, we sought to calculate the enrichment of a series of control gene lists, each linked by either a common function or a common feature, in the core essentialome. Since MIGs were chosen based on the presence of a specific intronic feature, we first sought to test whether genes sharing a different intronic feature would also be significantly enriched in the core essentialome. For this, we turned to micro RNA (miRNA) genes. While some miRNA genes are found in intergenic regions of the genome, many miRNA genes are found embedded in the introns of protein-coding genes [374]. Using the miRBase database [375], we extracted 705 protein-coding genes with intronic miRNA genes in the human genome. We did not observe significant enrichment for these genes in the core essentialome

(3.91%; P=0.91) compared to their proportion in the jointly interrogated gene list (4.15%, Figure

144

5.1E), suggesting that a shared intronic feature alone does not explain the significant MIG enrichment we observe. We next sought to investigate the role of shared gene function on gene list enrichment in the core essentialome. In the three essentialome publications, genes involved in cell signaling were generally depleted in the core essentialome, while genes regulating RNA processing, transcription, translation, and cell cycle were enriched. Therefore, to generate two negative control lists, we extracted 492 human kinase genes (the kinome) and 1697 human transcription factor genes from the Gene Ontology knowledgebase [376]. For a positive control list, we also extracted 1712 human cell cycle genes [376]. As expected, analysis of the human kinome did not reveal significant enrichment in the core essentialome, relative to the jointly interrogated gene list (1.95% vs 2.90%, P=0.26; Figure 5.1F). Moreover, transcription factors were significantly depleted in the core essentialome, relative to the jointly interrogated gene list (3.91% vs 9.99%, P=1.23E-06; Figure 5.1F). In contrast, the cell cycle genes were significantly enriched in the core essentialome, with a 3.2-fold increase in cell cycle gene enrichment (31.74% vs 10.07%,

P=1.22E-38; Figure 5.1F). Together, these results suggest that the enrichment of MIGs in the core essentialome is likely driven by shared essential function, not by a shared intronic feature.

5.3.3 MIGs of the core essentialome are predominantly involved in RNA processing and cell cycle

To investigate the biological processes performed by the 46 MIGs found in the core essentialome, we manually curated the literature on each MIG of the core essentialome and extracted their functions (Table 5.1). Of the 46 MIGs, nearly half (21 genes) function in cell cycle regulation; moreover, 9 of these 21 MIGs are specifically involved in mitosis (Table 5.1). The next largest functional category is RNA processing, comprised of 11 MIGs. These 11 MIGs can be

145 further subdivided into two categories: those functioning in pre-mRNA splicing (7 MIGs), and those that regulate RNA metabolism (4 MIGs). The next largest functional categories are transcription, in which 9 MIGs function, and non-coding RNA biogenesis, which is regulated by

8 MIGs. The remaining functional classifications are listed in Table 5.1.

5.3.4 Genes of the core essentialome are evolutionarily ancient, including MIGs of the core essentialome

When Hart et al. extracted the core essentialome of the five cell lines they identified, they observed enrichment of genes that overlapped with essential genes in yeast, C. elegans, D. melanogaster, and mouse [366]. Likewise, Blomen et al. extracted the core essentialome of the two cell lines they investigated and identified a subset of ancestral genes [365]. Therefore, we sought to interrogate whether the core essentialome genes we identified were similarly ancient.

For this, we employed a similar ortholog identification approach as that used by Blomen et al., which utilized the evolutionary genealogy of genes: Non-supervised Orthologous Groups

(eggNOG) database to identify "ancient" essential genes [371]. We found that 93% of the genes in the core essentialome we identified track back to the last eukaryotic common ancestor (LECA) or further, compared to the 78% of the non-core essentialome genes (P=3.12E-18; Figure 5.2A).

To avoid any sampling biases, we generated 100 randomized lists of 460 genes from the 4745 genes found in the total essentialome but not in the core essentialome, which we termed “non-core essential genes.” Fisher exact tests performed to determine ancient gene enrichment revealed that the proportion of ancient genes in the core essentialome was statistically significant, with a median

P-value of 3.11E-11 (Figure 5.2A). Thus, even within the essentialome, there is an “old” or ancient essentialome that was present in or before LECA, and a “new” or younger essentialome that

146 emerged later. Moreover, we observed a statistically significant enrichment of MIGs in this ancient core essentialome relative to their prevalence in all jointly interrogated genes (9.58%; P=4.92E-

10; Figure 5.2B). Of the 46 MIGs of the core essentialome, 41 are found in the ancient core essentialome; the 5 MIGs missing from the ancient core essentialome are C1orf109, a cell cycle regulator; DDB1, which functions in DNA damage response; KANSL2, which encodes a subunit of the acetyltransferase complex; SPC24, a gene encoding a component of the kinetochore; and TAF1C, which regulates rRNA transcription (Table 5.1).

5.4 Discussion

Since the identification of MIGs in the 1990s, investigators have sought to understand why these specific genes have minor introns, and why these genes have such a high degree of minor intron conservation. One possibility is that MIGs share specific functions, and the presence of minor introns in these genes allow the cell to regulate specific processes en masse by adjusting minor spliceosome component expression, thereby modulating minor intron splicing efficiency and MIG expression/function [77, 86]. However, comparisons of MIG functions do not reveal a small, select set of functional categories; often, researchers who have attempted to identify common functions of MIGs have to generate new, broader categories, such as “information processing,” to adhere to this concept [77]. Our investigation of the U11-null pallium provided a new lens in which to view this question. Specifically, our results indicated that minor spliceosome activity, and therefore proper MIG expression, was essential for the survival of proliferating progenitor cells [117]. This is supported by numerous findings that minor spliceosome activity is necessary for early embryonic development, when the embryo is rapidly growing [89, 111-114,

117]. Therefore, it was possible that MIGs might be linked by their requirement for cycling cell

147 survival, instead of by a specific function (e.g., cell cycle, transcription). Strikingly, we found that

MIGs are highly enriched in the 460 human genes that are essential for survival of ten different cancer cell lines—i.e., cells that undergo rapid division. Genetic ablation of 13 of these core essentialome MIGs—Actl6a, Ddb1, Ddx54, Donson, Gars, Nepro, Nup155, Orc3, Pola2, Sacm1l,

Upf1, Vps25, and Znf259—has been reported to cause early embryonic lethality in mice [319, 377-

386]. The functions performed by the 46 core essentialome MIGs—primarily cell cycle, RNA processing, and transcription—closely mirrored the functional enrichments identified both in the core essentialome Blomen et al. derived from two cell lines, which enriched for transcription, translation, and cell cycle, and in the core essentialome Hart et al. reported from five cell lines, which enriched for RNA processing, DNA repair/replication, and mitosis [265, 266]. Nearly half of the core essentialome MIGs function in cell cycle regulation, and a fifth of the core essentialome

MIGs specifically regulate mitosis (Table 5.1). This is highly consistent with the cell cycle defects observed in the U11-null RGCs of the developing cortex, where most defects affected mitosis

(Figure 4.7C-E).

Notably, some of the core essentialome MIGs have been linked to microcephaly and/or dwarfism. For example, deleterious mutations in Donson are associated with two microcephalic primordial dwarfism syndromes, both of which present with limb and craniofacial abnormalities

[319, 320]. Mutations in Ints8 are linked to intellectual disability with “borderline microcephaly,” limb dysmorphia, and craniofacial abnormalities [387]. While the MIG Orc3 has not been implicated in disease, it encodes a subunit of the origin recognition complex (ORC), which functions in the initiation of DNA replication during S-phase; mutations in the ORC subunits

ORC1, ORC4, and ORC6 are associated with primary microcephaly [286-289]. Moreover, genetic ablation of Orc3 in the developing mouse cortex results in death of RGCs and microcephaly [388].

148

Similarly, genetic ablation of Ddb1 in the developing mouse brain causes NPC apoptosis and subsequently microcephaly [379]. Primordial dwarfism without microcephaly has been linked to mutations of the MIGs Exosc2 and Znf259, presenting alongside either hearing loss and retinitis pigmentosa or renal syndrome, respectively [389, 390]. The identification of these MIGs, which both are necessary for the survival of rapidly dividing cells and are linked to microcephaly and/or dwarfism, represent exciting new targets for scientists investigating the pathogenesis of the developmental disorders linked to minor spliceosome inactivation.

To better understand the molecular underpinnings of microcephaly caused by U11 ablation in the developing mouse cortex, we identified the 43 core essentialome MIGs found in the mouse genome and extracted their expression and minor splicing retention data from the RNAseq experiment described in Chapter 4 (Table 5.1). Of the 16 MIGs with significantly elevated minor intron retention in the U11-null pallium, six function in transcription, six are cell cycle regulators, and three are involved in non-coding RNA biogenesis (Table 5.1). Of the cell cycle-regulating

MIGs, three function in mitosis (Nat10, Nup107, and Ssu72) and two regulate S-phase (Pole2 and

Rfc5), which correspond to the cell cycle defects observed in the U11-null RGCs of the developing cortex (Figure 4.7; Table 5.1). These 16 MIGs also included Ddb1, a DNA damage regulator that is necessary for NPC survival [379], and the dwarfism-associated gene Exosc2 [390]. Many of the microcephaly-associated MIGs of the core essentialome either did not pass our minor intron retention criteria (see section 4.5) or were not significantly retained in the mutant pallium (Table

5.1). This could be due to low levels of expression of these MIGs, resulting in fewer reads corresponding to these transcripts and therefore failure to pass our criteria for minor intron retention (see section 4.5); alternative splicing that circumvents minor intron splicing in the U11- null pallium; slow turnover rates of correctly spliced mRNA transcripts produced prior to U11 loss

149 in the mutant pallium; or the correct splicing of the minor introns of these MIGs (Table 5.1).

Ultimately, the 16 core essentialome MIGs with significantly elevated minor intron retention in the U11-null pallium represent high-priority targets for further investigation into the molecular underpinnings of microcephaly caused by minor spliceosome inactivation.

In the U11 cKO mouse, we observed survival of neurons in the postnatal cortex, suggesting that the missplicing of MIGs does not significantly impact neuron survival (Figure 4.15). Nearly half of the core essentialome MIGs regulate cell cycle; these MIGs might not be expressed in post- mitotic neurons, and if they were, their disruption would not likely have as detrimental an effect as their disruption in actively cycling cells. However, the other MIGs of the core essentialome perform functions that are not intuitively cell type-specific, such as RNA processing, non-coding

RNA biogenesis, and transcription. One possibility is that neurons express a different complement of genes that perform these functions, allowing these functions to continue unaffected despite minor spliceosome inactivation. In this case, the MIGs performing these functions would likely be expressed at low levels, reducing the detrimental effects of minor spliceosome inactivation in neurons. As suggested in the Discussion in Chapter 4 (section 4.4), the MIGs that are expressed in neurons may contain minor introns with low sensitivity to minor spliceosome inactivation; this possibility is bolstered by our observations of such minor introns in RNAseq data from the E12 mutant pallium (Figure 4.7A). Notably, numerous groups have shown that neuron-specific expression of splicing factors drive neuron-specific alternative splicing events in the mouse brain; it is possible that this unique complement of splicing factors may drive preferential expression of

MIG isoforms that skip minor intron splicing specifically in neurons (reviewed in [391]).

Our analysis of the core essential revealed that it is comprised of ancient genes, the majority of which can be traced back to the last eukaryotic common ancestor (LECA, Figure 5.2A). This

150 finding, together with our observation of significant MIG enrichment within this ancient core essentialome (Figure 5.2B), leads us to propose a modification to one of the models explaining the emergence of minor introns, called the parasitic invasion model [77]. In our revised model, minor introns emerged from a second invasion of group II self-splicing introns into the early eukaryotic genome, which already possessed major introns and the major spliceosome [392]. During genome evolution, the fragmentation of these group II self-splicing introns into catalytic subunits and inert sequences then gave rise to the minor-class snRNAs and minor introns, respectively. As a result of this invasion and subsequent fragmentation, minor introns were randomly distributed across the genome, in genes consisting primarily of major introns that executed both non-essential and essential functions. This gene organization resulted in inefficient splicing and reduction in the amount of mRNA encoding full-length proteins. Given this bottleneck, one can imagine a strong selective pressure to remove these minor introns from the genome. Indeed, this is supported by the progressive loss of minor introns across evolution, which can occur either by complete loss of the minor intron and fusion of the neighboring exons, or by conversion of the minor intron into a major intron [77, 82]. The latter of these mechanisms, minor-to-major intron conversion, would require specific mutations in the minor-class splice site sequences, which would allow the intron to be recognized and removed by the major spliceosome [393]. Both of these mechanisms of minor intron loss can cause perturbation in MIG expression levels or the production of aberrant proteins, ultimately impacting the functions executed by MIG-encoded proteins. While the risks associated with minor intron loss might not be an issue for genes executing non-essential functions, the same would not be true for genes essential for survival. Given the risks associated with perturbing the gene organization of essential MIGs, one would expect limited loss of the minor introns in these essential genes. Indeed, our data support this idea, as we observed significant enrichment of MIGs

151 in the essentialomes, especially in the core, ancestral essentialome (Figure 5.2). This revised model of minor intron emergence and evolution emphasizes the importance of studying minor intron conservation and MIG function in concert, particularly in organisms with few MIGs—i.e., species in lineages that have undergone significant minor intron loss/conversion. Ultimately, this revision of a model proposed 20 years ago will hopefully spark new investigations into the evolution of minor intron splicing, a subfield of minor splicing research that has been relatively dormant in recent years.

5.5 Materials and methods

Identification and extraction of human MIGs for the essentialome analysis

The list of 695 minor introns in the human genome was first extracted from the U12

Database [58]. For each minor intron, the raw intron sequence was aligned to the current hg38 assembly of the human genome using BLAST, in order to update the minor intron coordinates.

Any minor intron that did not have 100% identity with an annotated intron in the hg38 assembly was removed from the analysis. This process also allowed us to extract updated gene IDs for each minor intron-containing gene. This approach produced a list of 640 minor introns, distributed across 583 MIGs.

Prior to the comparison of this MIG list to the essentialome, the MIG gene IDs were compared to the 18835 genes interrogated across the three essentialome studies [365-367]. Five

MIGs (TSPY3, TSPY4, TSPY9P, TSPY10, and VPS11) were not interrogated in any of the three essentialome publications, and they were excluded from downstream analyses. For two MIGs,

PTEN and SLC24A6, different synonyms were used in the separate essentialome papers (TEP1 and SLC8B1, respectively) [365-367]. For each synonym pair, the essentialome data corresponding

152 to these synonyms were combined into a single entry. Together, these adjustments revised the MIG list to 577 genes. This 577-MIG list was used for assessing MIG enrichment in the total essentialome. For each cell line, the essentialome was identified using the thresholds detailed in the methods of its respective publication [365-367]. The KBM7 cell line was interrogated by two separate groups, utilizing different gene-targeting and thresholding methods. As a result, the two published KBM7 essentialomes were not identical in size (2054 genes vs 1878 genes) nor gene identity [365, 367]. This discrepancy was addressed by combining the two KBM7 essentialomes.

Specifically, any gene that was present in at least one of the two KBM7 essentialomes was retained, resulting in a combined KBM7 essentialome of 2595 genes. Statistical significance for MIG enrichment was calculated by Fisher’s exact test.

Analysis of gene enrichment in the core essentialome

Identification of the core essentialome was performed by extracting genes present in all 10 cell-line essentialomes. To minimize the confounding effect of false negatives, we discarded any genes that were not interrogated in both the Wang et al. and Hart et al. publications [366, 367].

This rule did not extend to the Blomen et al. manuscript because that paper only reported the essential genes they identified, not all genes interrogated in their approach [365]. Application of this rule produced a “jointly interrogated” list of 16994 genes and a “jointly interrogated” MIG list of 548 genes. The same approach was used to revise the gene lists used for positive (cell cycle genes) and negative (genes with intronic miRNAs, the kinome, and transcription factor genes) controls. Statistical significance for MIG enrichment was calculated by Fisher’s exact test.

153

Chapter 5: Figures and Tables

Figure 5.1. MIGs are enriched in the core essentialome. (A) Pie charts showing the percentage of MIGs (orange) in all genes interrogated in the essentialome reports (left) and in all essential genes (right). (B) Daisy model representing the ten published essentialomes (petals) and the 577 MIGs (orange center). The overlap of the center with the petals indicates the enrichment of MIGs in the cell-line essentialomes. (C) Daisy model showing the number of essential genes common to all ten cell-line essentialomes (“core essentialome”; circle). (D) Pie charts showing the percentage of MIGs in all genes jointly interrogated in the Hart et al. and Wang et al. essentialome reports (left) and in the core essentialome (right). (E) Pie charts showing the percentage of genes with intronic microRNA genes (GInt-miRs) in all jointly interrogated genes (left) and in the core essentialome (right). (F) Pie charts showing the percentage of genes with kinase activity (leftmost), transcription factor activity (middle), or a role in cell cycle regulation (rightmost) in all jointly interrogated genes and in the core essentialome. Statistical significance was determined by Fisher’s exact test. n.s.=not significant; *=P<2.0E-06.

154

Figure 5.2. Most genes of the core essentialome, including the MIGs, are ancient. (A) Evolutionary age of the core essentialome genes (green) and the non-core essentialome genes (purple). Clades are listed on the y-axis. The x-axis shows the percentage of genes that can be traced to the listed clade, but not to an older clade. The non-core essentialome gene data represent the median values of 100 lists of 460 genes from the non-core essentialome. (B) Pie chart showing the enrichment of MIGs (orange) in the ancient core essentialome (turquoise). Statistical significance, relative to the percentage of MIGs present in all jointly interrogated genes, was determined by Fisher’s exact test; *=P=4.92E-10.

155

Table 5.1. Mammalian functions of the MIGs identified in the core essentialome. MIGs are subdivided by species (Spp.) information, based on whether the MIG is found only in the Homo sapiens (Hs) genome (“Hs only”), only in the Mus musculus (Mm) genome (“Mm only”), or in both Hs and Mm genomes (“Hs + Mm”). For MIGs found in the mouse genome, average MIG expression and minor intron retention values from the E12 ctrl and mut RNAseq data analysis (Chapter 4) are reported. Syn=synonym; Ave=average; MSI=mis-splicing index. P values were calculated by two-tailed student’s t test. “-”=no mouse MIG data or value not calculatable; “N/A”=minor intron did not pass our thresholding criteria (see section 4.5). Ave Ave Ave Ave Functional MIG Syn Spp. FPKM FPKM MSI MSI P-value ∆ MSI Function Description Categories Ctrl Mut Ctrl Mut part of remodeling complex [394, 395]; necessary transcription; cell Hs + and sufficient for neural Actl6a 28.823 28.9842 1.1056 2.5092 0.09536 1.4036 cycle Mm progenitor proliferation [394]; (proliferation) constitutive knockout is early embryonic lethal [383] centrosomal protein that regulates microtubule nuclea- tion and organization [396]; a fraction of ARL2 protein localizes to the mitochondria, where it regulates ATP levels and mitochondrial integrity metabolism; cell Hs ARL2 ARFL2 ------[397]; based on work in non- cycle (mitosis/ only mammalian model systems, cytokinesis) mutations in ARF2 may cause shortening of microtubules, aberrant microtubule and spindle organization, cell division and cytokinesis defects [398, 399] Hs + acidifying vacuolar ATPase Atp6v1a 30.064 30.8883 1.7557 4.6257 0.05069 2.87 ion transport Mm (vATPase) subunit A [400]

156

Ave Ave Ave Ave Functional MIG Syn Spp. FPKM FPKM MSI MSI P-value ∆ MSI Function Description Categories Ctrl Mut Ctrl Mut involved in maintenance of neocortex neural progenitor cells downstream of Notch and plays a role in repression of proneural gene expression; cell cycle misexpression causes inhibi- (proliferation/ Nepro, Hs + tion of neuronal differentiation BC027231 4.4732 5.78214 N/A N/A N/A N/A progenitor cell C3orf17 Mm in the early neocortex, while identity); CNS knockdown drives neuron diff- development erentiation [401]; localized to nucleolus, and knockout in mice causes impaired blasto- cyst formation and apoptosis [385] functions in cancer cell pro- Hs liferation, based on mis- cell cycle C1orf109 ------only expression and knockdown (proliferation) studies [402] large subunit of DNA damage- binding complex [403]; nu- cleotide excision repair [404]; involved in ubiquitin complex via interaction with the MIG Cul4a [405]; constitutive Hs + genome integrity Ddb1 86.094 88.3333 0.4618 2.7062 0.00438 2.2444 ablation is embryonically Mm (DNA repair) lethal in mice, and brain- specific ablation leads to accumulation of cell cycle regulators, genomic instability, and apoptosis of proliferating NPCs [379] 157

Ave Ave Ave Ave Functional MIG Syn Spp. FPKM FPKM MSI MSI P-value ∆ MSI Function Description Categories Ctrl Mut Ctrl Mut nucleolar RNA ; reduced function in cancer cells reduces proliferation Has1, Hs + [406]; cell cycle progression in Ddx18 13.787 14.2109 37.491 81.752 0.00073 44.261 cell cycle MrDb Mm hematopoeitic zebrafish cells, given that disruption causes p53-dependent G1 cell cycle arrest [407] RNA helicase that localizes to the nucleolus; in the DNA da- mage response, increases splic- ing efficiency of pre-mRNA transcripts generated in res- ponse to DNA damage [408]; interacts with constitutive androstane nuclear (response to xeno-chemical stimuli) and acts as co- transcription; RNA activator to upregulate ex- processing Hs + Ddx54 DP97 6.4588 6.934 N/A N/A N/A N/A pression of downstream genes (splicing); CNS Mm involved in drug metabolism development; stress [409]; estrogen-dependent in- response teraction with nuclear estrogen receptors results in inhibition of transcriptional activity [410]; constitutive ablation is early embryonically lethal [386]; binds to myelin basic protein (MBP) in oligodendro- cytes, and knockdown causes depletion of MBP [411] 158

Ave Ave Ave Ave Functional MIG Syn Spp. FPKM FPKM MSI MSI P-value ∆ MSI Function Description Categories Ctrl Mut Ctrl Mut a replisome protein that functions in stabilization of the replication fork and the intra-S genome integrity Hs + phase checkpoint [320]; muta- (DNA repair); cell Donson 12.196 9.60999 N/A N/A N/A N/A Mm tion causes microcephalic cycle (checkpoint dwarfism [319, 320]; control) constitutive ablation is early embryonic lethal in mice [319] translation initiation factor, but not required for translation initiation [412]; in intestinal epithelial cells, ectopic over- expression triggers onco- genesis [413]; mTOR directly interacts with eIF3 (all Eif3s, not just the product of this gene) to increase association to Hs + translation; cell Eif3i TRIP1 211.31 129.232 N/A N/A N/A N/A eIF4G [414]; regulates osteo- Mm cycle; development blast differentiation and pro- liferation, and knockdown decreases the number of cells in S phase while increasing cells in G2/M phase [415]; in vitro overexpression triggers increased cell size, increased proliferation and cell cycle progression [416] exosome component; high Hs + affinity for binding to phos- Exosc2 9.7051 9.16142 6.7412 15.899 0.01154 9.1576 RNA metabolism Mm phorylated Upf1 (involved in nonsense mediated decay… 159

Ave Ave Ave Ave Functional MIG Syn Spp. FPKM FPKM MSI MSI P-value ∆ MSI Function Description Categories Ctrl Mut Ctrl Mut … pathway; considered signal to recruit mRNA degradation factors to transcript) [417, 418]; in HEp-2 cells, required Hs + Exosc2 9.7051 9.16142 6.7412 15.899 0.01154 9.1576 for exosome stability, with RNA metabolism Mm knockdown inhibiting [419]; in yeast, required for processing of 7S pre-rRNA to 5.8S rRNA [420] component of the MMXD complex; possibly required for Aurora B localization; local- ized to mitotic spindle, and knockdown results in ab- normal mitotic spindle form- ation, chromosome missegre- gation, and multi-nucleation [421]; interacts with and down- regulates E2-2 (role in endo- Hs + thelial cell quiescence), while cell cycle (mitosis); Fam96b 25.531 14.9509 - 100 - - Mm also enhancing endothelial mi- iron homeostasis gration, proliferation, and tube formation [422]; knockdown experiments indicate target specificity of its Fe-S assembly activity (necessary for matura- tion of nucleotide metabolism proteins DPYD and GPAT; interacts with DNA2); invol- ved in the regulation of iron homeostasis by decreasing… 160

Ave Ave Ave Ave Functional MIG Syn Spp. FPKM FPKM MSI MSI P-value ∆ MSI Function Description Categories Ctrl Mut Ctrl Mut Hs + … IRE-binding activity and cell cycle (mitosis); Fam96b 25.531 14.9509 - 100 - - Mm protein levels of IRP2 [423] iron homeostasis glycyl-tRNA synthetase, which covalently links glycine with corresponding tRNAs [424]; mutations are linked to Hs + Gars 31.641 34.0717 100 100 #DIV/0! 0 Charcot-Marie-Tooth (CMT) translation Mm disease, which specifically affects neurons [424]; constitu- tive genetic disruption in mice is embryonically lethal [380] in integrator complex, which associates with RNA poly- non-coding RNA Hs + Ints8 39.174 38.3163 1.8454 4.1572 0.10709 2.3117 merase II and mediates 3' end biogenesis Mm processing of snRNAs (only (snRNA) U1 and U2 tested here) [425] subunit of the NSL histone Hs + Kansl2 Nsl2 44.5 39.369 4.7214 20.86 9.4E-05 16.139 acetyltransferase complex transcription Mm [426, 427] N-acetyltransferase [428]; di- rect role in decondensation of chromosomes at mitosis exit; non-coding RNA knockdown results in prolong- biogenesis; Hs + ed chromosome condensation transcription Nat10 12.006 12.2529 8.4811 29.421 0.00872 20.94 Mm [429]; DNA damage triggers (rRNA); cell cycle increased amount of Nat10 in (mitosis/cyto- mitotic midbody, and results in kinesis) enhanced acetylation of alpha- of midbody [428]… 161

Ave Ave Ave Ave Functional MIG Syn Spp. FPKM FPKM MSI MSI P-value ∆ MSI Function Description Categories Ctrl Mut Ctrl Mut … knockdown results in ab- normal nucleolus size, length- ened G2/M transition, multi- nucleated cells, and defects in cytokinesis, sometimes result- ing in cell death [428]; asso- ciates with U3 snoRNA and is non-coding RNA required for 18S rRNA pro- biogenesis; Hs + cessing; knockdown results in transcription Nat10 12.006 12.2529 8.4811 29.421 0.00872 20.94 Mm decreased levels of 47S pre- (rRNA); cell cycle rRNA, indicating that Nat10 is (mitosis/cyto- a transcriptional UTP (partici- kinesis) pates in pre-rRNA tran- scription) by targeting UBF for acetylation to facilitate association with RNA pol I- associated factor [430]; 18S rRNA processing [431] one of two proteins in the mRNA cap binding protein complex (other = CBP20), which regulates mRNA spli- RNA metabolism; cing [432]; nonsense mutation- RNA processing containing transcripts are Hs + (mRNA splicing); Ncbp1 Cbp80 24.319 27.8961 2.5035 4.8214 0.25505 2.3178 bound with Ncbp1 during non- Mm RNA transport; sense-mediated decay (NMD) translation (pioneer [433]; through interaction with round) TREX component Aly, allows proper mRNA export [434]; re- quired for poly(A) RNA export [435]; as CBC, mediates… 162

Ave Ave Ave Ave Functional MIG Syn Spp. FPKM FPKM MSI MSI P-value ∆ MSI Function Description Categories Ctrl Mut Ctrl Mut …translation initially for RNA metabolism; pioneer round, then replaced RNA processing by eIF4E, which controls Hs + (mRNA splicing); Ncbp1 Cbp80 24.319 27.8961 2.5035 4.8214 0.25505 2.3178 steady-state translation [436]; Mm RNA transport; associates with Upf1 to translation (pioneer promote nonsense-mediated round) decay [437, 438] one of two proteins in the mRNA cap binding protein complex (other = CBP80), which affects RNA stability, splicing, export (specifically U RNA export), and translation RNA metabolism; initiation by binding 5' end RNA processing [439]; Ncbp2 specifically (mRNA splicing); Hs + binds the cap [440]; also invol- Ncbp2 Cbp20 18.075 17.511 N/A N/A N/A N/A RNA transport Mm ved in processing of 3' end of (snRNA); mRNA transcripts [441]; translation (pioneer nonsense mutation-containing round) transcripts bound with Ncbp2 during NMD [433]; as CBC, mediates translation initially, then replaced by eIF4E, which controls steady-state trans- lation [442] initially discovered as prolif- eration-associated based on Hs + Nop2 P120 10.392 10.5398 4.2518 7.7775 0.06934 3.5257 expression, with moderately cell cycle Mm strong ribosome RNA methyl transferase activity [443]… … 163

Ave Ave Ave Ave Functional MIG Syn Spp. FPKM FPKM MSI MSI P-value ∆ MSI Function Description Categories Ctrl Mut Ctrl Mut … introduction of antisense RNA limited proliferation in NIH 3T3 cells [444]; Hs + Nop2 P120 10.392 10.5398 4.2518 7.7775 0.06934 3.5257 proliferation marker of neural cell cycle Mm stem cells, and is expressed in adult brain [445]; potential role in neutrophil maturation [446] nucleoporin; depletion causes apoptosis [447]; prevents assembly of subset of nucleo- porins into nuclear pore com- plex (leads to codepletion of these proteins), mRNA export (not fully dependent on Nup- 107, but affected) [448]; RNA transport Hs + Nup107 28.66 26.2714 2.0407 13.983 0.00037 11.942 required for bipolar spindle (mRNA); cell cycle Mm assembly [449]; nuclear pore (mitosis) complex disassembles during mitosis with many constituents distributing onto spindles and kinetochores, and the Nup107- 160 complex is implicated in spindle assembly by regulating [450] nucleoporin gene; functions in mRNA export and import of N155, Hs proteins to the nucleus [382]; protein transport; NUP155 ------ATFB15 only constitutive loss of function is RNA transport embryonically lethal in mice [382] 164

Ave Ave Ave Ave Functional MIG Syn Spp. FPKM FPKM MSI MSI P-value ∆ MSI Function Description Categories Ctrl Mut Ctrl Mut member of the origin recog- nition complex (ORC) [451]; in Drosophila, knockdown results in reduced proliferation of NPCs of the CNS and death in the early pupae stage [452]; in human cells, directly inter- acts with ORC2 and is required to recruit ORC4 and ORC5 to the core complex [453]; in conditional mutant mouse line crossed to Emx1-Cre, signi- ficant reduction in cortical Lat, NPC division at E13, severe cell cycle (DNA Hs + Orc3 Orc3l, 37.179 36.4778 16.161 29.367 0.0544 13.206 block of NPC division by E14, replication/S Mm Latheo and disrupted cortical organiz- phase) ation [388]; in the cKO mouse, germline deletion is embryo- nically lethal [384]; in the mouse cKO crossed with Nestin-Cre, reduction in NPCs by E15, fewer TBR2+ cells at E16, postnatal cortical reduc- tion, compromised upper layer neuron production/migration, little/no effect on neural fate specification, and neonate cerebral hemorrhage concen- trated on midline [384]

165

Ave Ave Ave Ave Functional MIG Syn Spp. FPKM FPKM MSI MSI P-value ∆ MSI Function Description Categories Ctrl Mut Ctrl Mut expressed in immature and early-stage B lymphocytes; regulates mRNA export of RNA transport; Hs + Pcid2 10.884 9.17728 N/A N/A N/A N/A mitotic checkpoint gene protein transport; Mm MAD2 [454]; involved in cell cycle (mitosis) nuclear protein export, based on results of knockdown [455] interacts with the U3 snoRNA (involved in rRNA maturation/ biogenesis), and knockdown Alg4, Hs + represses 18S rRNA ma- non-coding RNA Pdcd11 Nfbp, 6.0127 6.71921 2.5879 10.345 0.09684 7.7567 Mm turation [456]; part of SSU biogenesis (rRNA) Rrp5 processome, where it likely recruits U3 to sites of rRNA maturation [457] indirectly, regulates DNA replication by increasing pro- tein synthesis and nuclear translocation of catalytic alpha subunit (p180) [458]; hyperphosphorylated by cell cycle (DNA Hs + cyclin-dependent kinases in Pola2 21.661 21.6355 2.3927 3.7602 0.36278 1.3675 replication/S Mm G2 phase, which enhances phase) activation of pol-alpha (DNA replication) by phosphorylated Rb [459]; constitutive knockout is embryonically lethal in mice [386] Hs + DNA polymerase accessory cell cycle (DNA Pole2 17.927 15.1327 0.3623 4.3916 0.01647 4.0292 Mm subunit (epsilon), 55 kDa… replication/S-phase 166

Ave Ave Ave Ave Functional MIG Syn Spp. FPKM FPKM MSI MSI P-value ∆ MSI Function Description Categories Ctrl Mut Ctrl Mut … subunit [460]; possible stabilizing role for DNA cell cycle (DNA Hs + polymerase epsilon complex replication/S-phase Pole2 17.927 15.1327 0.3623 4.3916 0.01647 4.0292 Mm [460]; role in chromatin phase); regulation, based on reporter transcription plasmid assays [461] subunit shared by all RNA polymerases [462]; role in transcription activation, according to work in yeast [463]; in COS1 cells, directly binds RAP30, component of Hs + Polr2e RPB5 62.784 40.1102 0.7415 3.1917 0.00269 2.4502 the general transcription factor transcription Mm IIF (TFIIF) complex; this complex is assembled within the ini-tiation complex and is known to associate with RNA pol II, inhibiting association of TFIIF and pol II [464, 465] subunit of RNA polymerase III [466]; likely directs pol III binding to TFIIIB-DNA transcription; non- RPC3, Hs + Polr3c 14.743 10.1487 N/A N/A N/A N/A complex [466]; required for coding RNA RPC62 Mm promotor-specific biogenesis transcription initiation by pol III [467] subunit of RNA polymerase transcription; non- Hs + Polr3h 11.154 8.41693 5.2643 15.526 0.00187 10.261 III, acting paralogously to coding RNA Mm Rpb7 in pol II [468] biogenesis

167

Ave Ave Ave Ave Functional MIG Syn Spp. FPKM FPKM MSI MSI P-value ∆ MSI Function Description Categories Ctrl Mut Ctrl Mut arginine N-methyltransferase; regulates oligodendrocyte ma- turation and myelination [469]; methylates FUS and regulates nucleocytoplasmic distribu- tion, and is linked to disrupt- tions in histone modifications/ chromatin remodeling in cyto- plasmic FUS-related ALS protein ANM1, [470]; chromatin regulation by modification; HCP1, Mm methylation of [471]; transcription; Prmt1 43.663 46.2355 N/A N/A N/A N/A IR1B4, only regulates DNA repair and sig- protein transport; Mrmt1 nal transduction [472]; requi- genome integrity; red for early post-implantation CNS development mouse development but not for cell survival, based on gene trap approach [473]; knock- down causes DNA damage and chromosomal segregation de- fects [474]; involved in neurite outgrowth, via interaction with Btg2 [475] Hs + component of 20S proteasome Psma1 162.68 155.28 N/A N/A N/A N/A protein metabolism Mm [476] geranylgeranyl transferase; mutation in splice acceptor site protein Hs + results in gunmetal mouse, Rabggta 6.0551 4.81826 7.8456 18.3 0.0595 10.454 modification; Mm which has platelet and mega- hemostasis karyocyte defects (prolonged bleeding), and…

168

Ave Ave Ave Ave Functional MIG Syn Spp. FPKM FPKM MSI MSI P-value ∆ MSI Function Description Categories Ctrl Mut Ctrl Mut protein Hs + … macrothrombocytopinea Rabggta 6.0551 4.81826 7.8456 18.3 0.0595 10.454 modification; Mm [477] hemostasis member of the small ribosomal subunit (SSU) processome Rnac, Hs + non-coding RNA Rcl1 6.3878 5.57027 N/A N/A N/A N/A [478]; functions in rRNA RPCL1 Mm biogenesis (rRNA) maturation and therefore ribosome biogenesis [479] 36.5 kDa subunit of the replication factor C complex; cell cycle (DNA Recc5, Hs + interacts with the DNA clamp Rfc5 22.473 14.612 4.9543 12.849 0.00166 7.8948 replication/S RFC36 Mm PCNA, thereby regulating phase) DNA replication during S phase [480] the minor spliceosome-specific U11/U12 65k protein; when bound to the U11/U12 di- snRNP, it bridges the U12 Hs + snRNA and U11-59K protein, RNA processing Rnpc3 17.448 31.214 4.1637 6.1783 0.4499 2.0146 Mm stabilizing the di-snRNP and (splicing) thereby regulating minor splicing [72]; disruption linked to isolated familial growth hormone deficiency [119] Hs + ribosomal protein component Rpl4 391.13 435.342 0.8984 11.067 0.00017 10.168 translation Mm of the 60S subunit [481] 30 kDa subunit of the ribo- Hs + non-coding RNA Rpp30 14.166 15.4626 1.2119 2.8378 0.0006 1.6259 nuclease P/MRP complex, Mm biogenesis which processes tRNA [482]

169

Ave Ave Ave Ave Functional MIG Syn Spp. FPKM FPKM MSI MSI P-value ∆ MSI Function Description Categories Ctrl Mut Ctrl Mut phosphoinositide phosphatase integral membrane protein lo- calized to endoplasmic reticu- lum and Golgi apparatus [483, cell cycle (mitosis); Hs 484]; mouse KO is early em- SACM1L SAC1 ------Golgi-mediated only bryonic lethal [381]; knock- transport down causes Golgi apparatus disorganization, delayed G2/M transition, and aberrant mitotic spindle formation [381] the "E" subunit of the Sm com- plex, which functions in snRNP maturation; associates with U snRNPs, including those involved in RNA spli- RNA processing SME, Hs cing and histone mRNA pro- (splicing); SNRPE Sm-E, ------only cessing [485-488]; in cancer transcription; cell HYPT11 cells, misexpression prevents cycle DNA synthesis and arrests cells in G2, while knockdown drives cells through these phases [489] NDC80 kinetochore complex component; required to esta- Hs + Spc24 33.589 13.2965 100 100 #DIV/0! 0 blish and maintain kineto- cell cycle (mitosis) Mm chore-microtubule attachment in mitosis [344] phosphatase associated with Hs + transcription; cell Ssu72 21.074 14.7427 6.0058 11.568 0.0025 5.5625 CTD of RNA pol II (TFIIB) Mm cycle (mitosis) [490]; regulates RNA pol II…

170

Ave Ave Ave Ave Functional MIG Syn Spp. FPKM FPKM MSI MSI P-value ∆ MSI Function Description Categories Ctrl Mut Ctrl Mut … activity via CTD phos- phatase activity in yeast [491]; Hs + transcription; cell Ssu72 21.074 14.7427 6.0058 11.568 0.0025 5.5625 phosphorylated by Aurora B to Mm cycle (mitosis) regulate sister chromatid cohesion during mitosis [492] TATA box-binding protein associated factor (TAF) for Sl1, RNA pol I; part of SL1 Hs + non-coding RNA Taf1c Tafi110, 4.3941 4.43861 9.5208 12.856 0.81863 3.3351 complex, which directs pol I Mm biogenesis (rRNA) Tafi95 transcription and can independently interact with rDNA promoter [493] cytosolic respon- sible for stabilizing folding of numerous proteins, including [494] and tubulin [495, Hs + 496]; involved in ciliogenesis Tcp1 Cct 127.7 128.145 2.1049 4.9633 0.00048 2.8584 protein folding Mm and biogenesis of rod outer segment [497]; via chaperone function for Tcab1, controls scaRNA localization and telo- merase function [498] ubiquitin-like protein; knock- down experiments indicate a role in stabilization of the RNA processing Hs spliceosome and in mitotic UBL5 HUB1 ------(splicing); cell only progression to anaphase [499]; cycle (mitosis) via this role, important for the proper splicing of the cohesin factor sororin, thereby… 171

Ave Ave Ave Ave Functional MIG Syn Spp. FPKM FPKM MSI MSI P-value ∆ MSI Function Description Categories Ctrl Mut Ctrl Mut RNA processing Hs … indirectly promoting sister UBL5 HUB1 ------(splicing); cell only chromatid cohesion [499] cycle (mitosis) nonsense-mediated mRNA Hs + decay [500]; genome integrity RNA metabolism; Upf1 18.959 14.8731 3.9997 16.801 0.02216 12.802 Mm [501]; constitutive knockout is genome integrity embryonically lethal [377] part of ESCRT (endosomal sorting complex required for transport)-II [502]; role in cargo sorting, especially DERP9, sorting of ubiquitinated cargo transport; Hs + Vps25 EAP20, 13.868 13.4069 1.1976 4.9033 0.00487 3.7057 [503]; by regulating receptor limb/skeletal Mm FAP20 number, affects FGF signaling development in limbs, indicative of a role in skeletal/limb development [504]; constitutive knockout is embryonically lethal [386] depletion disrupts nucleolar function, including pre-ribo- somal RNA expression [505]; interacts with the Smn protein and colocalizes in Geminins Zpr1, Hs + and Cajal bodies; depletion an- RNA processing Znf259 6.4644 4.51749 N/A N/A N/A N/A Zfp259 Mm alysis indicates Zfp259 regu- (splicing) lates localization of Smn in nu- clear bodies [506]; in condi- tional knockout mouse, loss of Zfp259 disrupts the subcellular localization of Smn and…

172

Ave Ave Ave Ave Functional MIG Syn Spp. FPKM FPKM MSI MSI P-value ∆ MSI Function Description Categories Ctrl Mut Ctrl Mut … the spliceosomal snRNPs; early embryonic lethality is observed, showing reduced proliferation and increased apoptosis [378]; depletion blocks S-phase progression, triggers G1 and G2 arrest, and Zpr1, Hs + causes mislocalization of Smn RNA processing Znf259 6.4644 4.51749 N/A N/A N/A N/A Zfp259 Mm and NPAT [506]; in Zfp259 (splicing) deficient mice, observe axonal pathology, neurodegeneration [507]; in spinal muscular atrophy, loss of Zpr1 increases motor neuron loss and severity and depletes SMN-containing subnuclear bodies [508]

173

Chapter 6: The behavioral outcomes of Emx1-Cre-mediated U11 ablation in

the Rnu11 conditional knockout mouse

In Chapter 4, I described microcephaly in the embryonic and juvenile Rnu11 mutant mouse; in

Chapter 5, I expanded upon the potential genes underlying the primary cause of microcephaly in these mice: progenitor cell death. This chapter covers the postnatal effects of minor spliceosome inactivation in the developing mouse cortex. The chapter begins by describing the Emx1-Cre transgenic mouse used to ablate Rnu11 in the developing mouse cortex, which is followed by examination of the postnatal survival and behavior of the Emx1-Cre+, Rnu11 mutant mice.

6.1 The Emx1-Cre transgenic mouse

Emx1 is one of two transcription factor orthologs of the Drosophila gene empty spiracles (ems), which functions in anterior head patterning in Drosophila development [509, 510].

In the developing mouse brain, Emx1 expression begins at E9.5, with specific enrichment in the progenitor cells and differentiating neurons of the dorsal telencephalon, spanning the primordium of the cerebral cortex and hippocampus, and the olfactory bulb [511-513]. Expression of Emx1 persists in the excitatory neurons of the postnatal cerebral cortex, hippocampus, and olfactory bulb

[513, 514]. Therefore, when the Jones group sought to generate a transgenic mouse line in which

Cre-mediated recombination would be driven in the developing cortex, they chose the Emx1 promoter. Specifically, they inserted the Cre expression cassette, comprised of an internal ribosomal entry site followed by the Cre recombinase gene and a polyadenylation signal, into the final exon of the endogenous Emx1 gene, which encodes the 3’ untranslated region of the Emx1 mRNA [235]. This Emx1-Cre mouse was then bred with the transgenic R26R Cre reporter mouse line. In this Cre reporter mouse, the lacZ expression cassette is flanked by loxP sites, such that

174 expression of β-galactosidase only occurs in Cre-expressing cells [235, 515]. As expected, staining of E10.5 Emx1-Cre+::R26R+ mouse embryos with the β-galactosidase substrate X-gal revealed robust Cre-mediated recombination in the dorsal telencephalon. However, they also observed recombination in the branchial arches, which give rise to cells of the jaw and facial bones, the muscles of the jaw and face, and a subset of cranial ganglia; scattered recombination was also observed in the limb and tail ectoderm [235, 516-519]. Subsequent characterization of Emx1-Cre activity in the Mouse Genome Informatics database also revealed scattered recombination in the kidneys of postnatal Emx1-Cre+ mice [516-518].

In the adult brain, Gorski et al. observed robust recombination in the excitatory, but not inhibitory, neurons of the cerebral cortex. Recombination was also observed in oligodendrocytes and astrocytes of the cortex and cortical white matter; they did not investigate the presence of Cre- mediated recombination in OPCs or microglia [235]. However, Gorski et al. also reported the presence of recombined cells outside of the cerebral cortex, hippocampus, and olfactory bulbs.

Scattered recombined cells were observed in the dorsal striatum, which receives input from layer

V projection neurons [235, 520]. Notably, while virtually none of the recombined cortical neurons were inhibitory, the recombined striatal cells were medium spiny neurons (MSNs), which are inhibitory cells [235, 520]. Recombined cells were also prevalent in the lateral, basolateral, and medial regions of the amygdala [235].

The identities of these recombined, Emx1-lineage striatal and amygdalar neurons were investigated by the Corbin group in 2009. Analysis of the Cre reporter-expressing, Emx1-lineage cells of the adult basolateral and lateral amygdala indicated that virtually all of the recombined cells in this region were excitatory neurons [329]. In contrast, analysis of the recombined cells in the dorsal striatum revealed that approximately half of the Emx1-lineage cells were neurons, while

175 the other half were glia, primarily oligodendrocytes and OPCs [329]. Virtually all of the recombined striatal neurons were medium spiny neurons (MSNs); these recombined MSNs represented approximately 4% of the entire MSN population in the dorsal striatum [329]. In the dorsal striatum, MSNs can be grouped based on their location within one of two striatal compartments—the striosome or the matrix—or based on the targets of their projections—the direct or indirect pathway. Generally, the MSNs of the striosome predominantly receive inputs from limbic-associated cortical regions, while those of the matrix are primarily innervated by projection neurons from the sensorimotor cortices [521, 522]. Unlike matrix MSNs, which project to the output nuclei of the basal ganglia, striosomal MSNs also project to the substantia nigra pars compacta (SNc), where they synapse with dopaminergic neurons [521, 522]. When Cocas et al. investigated the distribution of recombined MSNs between these two structural regions, they found that the majority (91%) were striosomal MSNs; all of these striosomal MSNs received inputs from the dopaminergic neurons of the substantia nigra [329]. The second method to categorize MSNs relies on their direct targets. MSNs of the direct pathway project to the entopeduncular nucleus

(EP) and the substantia nigra pars reticulata (SNr). The inhibitory neurons from these two regions of the basal ganglia project both to motor nuclei of the brainstem and to the motor thalamus, which innervates the motor areas of the cortex. MSNs in the indirect pathway instead project to the globus pallidus, and the inhibitory neurons of the global pallidus project to the subthalamic nucleus

(STN). These glutamatergic STN neurons innervate the EP and SNr, at which point the direct and indirect pathways converge. Put simply, activation of direct pathway MSNs generally increases motor cortex activity, thereby facilitating movement, while activation of the MSNs of the indirect pathway generally reduces motor cortex activity, inhibiting movement [521, 522]. Cocas et al. found that the majority (75%) of Emx1-lineage MSNs were part of the indirect pathway [329].

176

Cocas et al. also interrogated the origin of these non-cortical, Emx1-lineage cells. Via pulse-chase experiments, they established that the Emx1-lineage neurons of the amygdala were born between E9 and E13, while Emx1-lineage MSNs were born later, from E11 to E15. Most of these cells were born from Emx1-expressing progenitor cells that migrated from the Emx1 domain of the dorsal telencephalon into the ventral regions of the brain [329]. These migrating Emx1- lineage progenitor cells displayed shifting gene expression profiles as they moved into the subpallium. Specifically, the migrating Emx1-lineage cells downregulated Emx1 expression [329].

A subset of these migrating Emx1-lineage cells maintained Pax6 expression as they traveled outside of pallium; these cells produced excitatory neurons of the amygdala [329]. In contrast, other migrating Emx1-lineage cells downregulated Pax6 and instead upregulated the subpallial markers Gsx2 and Dlx2; these cells produced inhibitory MSNs of the striatum [329].

In summary, Emx1-Cre drives robust recombination in the NPCs of the dorsal telencephalon and in the olfactory bulb; in the adult brain, the vast majority of Emx1-Cre-mediated recombination is observed in the glia and excitatory neurons of the cerebral cortex, hippocampus, and olfactory bulb [235, 329]. However, due to subpallial migration of Emx1-expressing NPCs during brain development, a relatively small subset of Emx1-lineage cells produces excitatory neurons of the amygdala and inhibitory MSNs of the dorsal striatum [235, 329]. Most of these inhibitory MSNs receive dopaminergic and limbic system inputs, and the majority of afferents from the Emx1-lineage MSNs project to the globus pallidus, as part of the indirect pathway of the basal ganglia [329]. Emx1-Cre-mediated recombination is also observed outside of the brain, primarily in the jaw/face primordium (the branchial arches) and kidneys [235, 516-518].

177

6.2 Rationale

While primary microcephaly precipitates during embryonic development, the effects of these developmental defects profoundly affect postnatal development and quality of life [523]. In

MOPD1, Roifman syndrome, and Lowry Wood syndrome, patients display speech and motor delays and intellectual disability; some MOPD1 patients also suffer seizures [106-108, 122, 124-

129, 140-143]. Therefore, we sought to investigate the behavioral outcomes of microcephaly caused by minor spliceosome inactivation, which we pursued in collaboration with the Murine

Behavioral Neurogenetics Facility (MBNF) at the University of Connecticut. In order to plan the behavioral testing paradigms to be used, it was first necessary to determine the survival kinetics of the microcephalic, Emx1-Cre+ Rnu11 mutant mice.

6.3 Results

6.3.1 Emx1-Cre heterozygous, Rnu11 mutant mice survive to adulthood

Despite their severe microcephaly, many Emx1-Cre+ Rnu11 mutant mice survived to adulthood, with some mutant mice surviving up to 1.5 years (Figure 6.1A-E). The adult mutant mice could breed successfully, although the females did not take care of their pups. However, we observed a sharp decline in mutant mouse survival around P20, indicating that a subpopulation within the mutant mouse cohort was dying prematurely (Figure 6.1E). This window corresponds to the weaning age of mice, which provided a potential explanation for this pattern of death: failure to transition to solid food. Indeed, we found that the mutant mice showed slower weight gain as they approached the age of weaning (P21) (Figure 6.1F). Notably, previous reports have shown

Emx1-Cre reporter expression in the branchial arches of E10.5 embryos; these structures give rise to the developing jaw [235, 516-519]. In agreement with these reports, we observed expression of

178 the Ai9 tdTomato Cre reporter in the jaw of P0 control and mutant mice (Figure 6.1G&H) [351].

This expression persisted in both the masseter and temporalis muscles of P21 control and mutant mice (Figure 6.1I-J’). To ease the transition to solid food, the food given to the mutant mice was wet daily, starting at P19. However, this did not prevent death in the weaning period (Figure 6.1K).

Genotyping to determine Emx1-Cre zygosity revealed that mutant mice homozygous for Emx1-

Cre (Rnu11Flx/Flx::Emx1-Cre+/+) were the ones dying in the weaning period; in contrast, virtually all mutant mice that were heterozygous for Emx1-Cre survived to adulthood (Figure 6.1K). Our findings suggest that homozygosity of Emx1-Cre may result in increased recombination in the branchial arches, failure to transition to solid food, malnourishment, and early death, even when wet food is provided.

6.3.2 Rnu11 mutant mice display heightened anxiety

Given that the microcephalic Emx1-Cre heterozygous, Rnu11 mutant mice survived to adulthood, we sought to understand how minor spliceosome inactivation-mediated microcephaly affected brain function—i.e., mouse behavior. For this, we employed a battery of behavioral tests in collaboration with Dr. Holly Fitch of the MBNF at the University of Connecticut, to assess a variety of behaviors in adult male Rnu11 wild-type (WT), heterozygous (het), and mutant (mut) mice. Specifically, two separate studies were performed, using a group of 22 male mice (7 WT, 7 het, 8 mut) and a group of 43 male mice (15 WT, 15 het, 13 mut).

Assessment of anxiety was performed using the elevated plus maze and open field task. In the elevated plus maze, anxiety is measured by the amount of time spent in the open arms vs the closed arms, and by the number of entries into the open vs closed arms. More time in and entries into the open arms indicate increased exploratory behavior and reduced anxiety [524]. Univariate

179

ANOVA revealed a significant effect of genotype on the number of closed arm entries

[F(2,19)=14.208, P<0.001) and total number of arm entries [F(2,19)=10.151, P=0.001], but no significant effect of genotype on the number of open arm entries [F(2,19)=1.000, P=0.386] (Figure

6.2A). Subsequent two-way student’s t-tests showed that the Rnu11 mutant mice entered closed arms significantly less than the Rnu11 wild-type controls (P<0.05); significantly fewer total arm entries were observed for Rnu11 mutant mice relative to both Rnu11 wild-type and Rnu11 heterozygous control genotypes (P<0.05; Figure 6.2A). Univariate ANOVA also identified a significant effect of genotype on the percentage of time spent in the open arms [F(2,19)=9.948,

P=0.001], driven by the substantially longer percentage of time Rnu11 mutant mice spent in the open arms [62.7±11.5% vs 15.2±9.9% (WT) and 8.3±5.0% (het); P<0.05; Figure 6.2B]. In summary, Rnu11 mutant mice spent significantly more time in the open arms and made significantly fewer closed arm and overall arm entries, compared to the control genotype groups.

Notably, during the testing period, Rnu11 mutant mice displayed freezing behavior in the open arms, which would indicate high levels of anxiety and also explain these results (data not shown).

Due to the robustness of these results, the elevated plus maze task was only employed in the first study, using 22 male mice.

In the open field task, the subject is placed in a wall-enclosed, square arena; exploratory behavior is measured by the amount of time spent in the center region of the field, compared to the outer region adjacent to the walls. Distance traveled and velocity in the outer vs center regions can also indicate increased stress/anxiety in mice [525]. While most studies separate the open field into two regions (outer and center), we divided the open field into four zones (out-outer, in-outer, out-center, and in-center; Figure 6.2C) to identify nuanced shifts in behavior [525]. For this analysis, mice from both cohorts were tested using the open field task; since there was no

180 significant effect of study on duration, distance traveled, or velocity in the open field zones (P- value range: 0.07–0.90), the results of these studies were pooled. In total, 16 WT, 16 het, and 17 mutant mice were tested in the open field. Univariate ANOVA revealed a significant effect of genotype on the distance traveled in total [F(2,43)=7.5, P=0.002] and in the in-center

[F(2,43)=20.6, P<0.001], out-center [F(2,43)=14.761, P<0.001], in-outer [F(2,43)=3.526,

P=0.04], and out-outer [F(2,43)=12.8, P<0.001] zones; there was no significant effect of genotype on the duration spent in these open field zones (P-value range: 0.127-0.416). For velocity, univariate ANOVA identified a significant effect of genotype on velocity in the in-center

[F(2,43)=19.476, P<0.001], out-center [F(2,43)=5.585, P=0.007], and out-outer [F(2,43)=16.393,

P<0.001] zones.

Subsequent two-way student’s t-tests revealed that Rnu11 mutant mice travelled a significantly longer distance in the out-outer zone, relative to Rnu11 wild-type (P<0.001) and heterozygous (P=0.001) control groups (Figure 6.2D). However, Rnu11 mutant mice travelled significantly shorter distances in the out-center and in-center zones, compared to Rnu11 wild-type

(P=0.21; P=0.003) and heterozygous (P<0.001; P<0.001) controls (Figure 6.2E-F). Overall,

Rnu11 mutant mice travelled significantly farther than Rnu11 wild-type (P=0.002) and heterozygous (P=0.016) mice (data not shown). As expected from our open field distance findings,

Rnu11 mutants travelled at a significantly higher velocity in the out-outer zone, and a significantly lower velocity in the in-center zone, relative to both the Rnu11 WT (P<0.001) and heterozygous

(P<0.001) control mice (Figure 6.2G; data not shown). While there was not a significant difference in the velocity of Rnu11 mutant and wild-type mice in the out-center zone, Rnu11 mutant mice travelled at a significantly slower velocity than Rnu11 heterozygous mice (P=0.011; data not shown). In all, our findings indicate that Rnu11 mutant mice travelled significantly faster and

181 farther in the outermost zone of the open field (the out-outer zone), while they travelled significantly more slowly, and a shorter distance, in the innermost zone of the open field (the in- center zone).

6.3.3 Rnu11 mutant mice have significantly impaired motor coordination

To investigate the effect of Rnu11 ablation on motor coordination, we employed the rotarod test, in which the mouse is placed on a rotating beam, and latency to fall from the beam is measured

[526]. To also assess motor learning, testing was performed across five consecutive days. Rotarod testing was performed in both studies, on a total of 21 WT, 19 het, and 20 mut mice. ANOVA indicated that there was no significant effect of study on the rotarod results [F(1,54)=2.384,

P=0.128], allowing us to pool the rotarod data. A 5 (day) x 3 (genotype) repeated measures

ANOVA revealed a significant effect of genotype on latency to fall from the rotarod [F(2,54)=30.2,

P<0.001] and a significant effect of day tested on latency to fall [F(4,216)=6.5, P<0.001], evidence of motor learning (Figure 6.2H). However, we did not identify a significant interaction between day and genotype [F(8,216)=1.183, P=0.311], indicating that all groups displayed learning over the five day period. Subsequent two-way student’s t-tests identified significantly shorter latencies to fall (i.e., worse motor performance) for Rnu11 mutant mice relative to wild-type (P<0.001) and heterozygous (P<0.001) control groups across all days of testing (Figure 6.2H).

6.3.4 Rnu11 mutant and heterozygous mice vocalize significantly less than wild-type controls in socio-sexual encounters

To assess social behavior, we employed an analysis of socio-sexual ultrasonic vocalizations (USVs). For this analysis, one male mouse was placed in a cage housing a female,

182 wild-type mouse for five minutes, and the amount of time the male test subject spent vocalizing was quantified. This analysis was performed in the second study, on the 43 mice in that cohort.

Univariate ANOVA revealed a significant effect of genotype on the percentage of time spent vocalizing [F(2,40)=8.6; P=0.001]. Specifically, Rnu11 mutant mice vocalized for a significantly smaller percentage of time than Rnu11 wild-type mice, by two-way student’s t-test (P=0.001;

Figure 6.1I). Unexpectedly, we also found that Rnu11 heterozygous mice vocalized significantly less than Rnu11 wild-type controls (P=0.01; Figure 6.2I). Moreover, comparison of the percentage of time spent vocalizing between Rnu11 heterozygous and mutant mice revealed no significant difference (P=0.622), suggesting that cortical U11 haploinsufficiency may impact sociability as severely as complete U11 ablation (Figure 6.2I).

6.3.5 Rnu11 heterozygous mice display enhanced performance on complex, whole-body motor tasks

Further investigation into the Rnu11 heterozygous mice revealed significantly affected performance in other behavioral tasks, particularly in motor tasks. While Rnu11 heterozygous mice behaved similarly to wild-type controls in the elevated plus maze (Figure 6.2A), Rnu11 heterozygous mice traveled significantly farther in the out-center and in-center zones of the open field relative to wild-type controls (P=0.043, P=0.024; Figure 6.2E-F). In the pooled rotarod data, the average latency to fall for the Rnu11 heterozygous mice trended longer than wild-type controls on all days tested, although this difference was not statistically significant (P=0.157; Figure 6.2H).

However, separating the rotarod data by study revealed two visibly different trends in Rnu11 heterozygous performance on the rotarod task. In the first study, P80 Rnu11 heterozygous mice had significantly shorter latencies to fall on days 1 and 2 of testing, relative to P80 Rnu11 wild-

183 type controls (Figure 6.3A). In the second study, the average performance of P105 Rnu11 heterozygous mice was virtually identical to P105 wild-type controls on days 1 and 2 of testing, with slight, insignificant gains relative to wild-type performance on the following days (Figure

6.3B).

To further investigate motor behavior in the Rnu11 heterozygous mice, we added several motor tasks to the testing paradigm used for the second study: a modified, faster rotarod task; a balance beam test, to assess coordination/balance; a grid-walking task, to test coordination; and a wire hang task, to assay strength [526]. In all, 14 WT, 12 het, and 12 mut mice were tested in this paradigm. A 2 (day) x 3 (genotype) repeated measures ANOVA revealed a significant effect of genotype, but not day, on latency to fall on the faster rotarod task [F(2,35)=8.5, P=0.001;

F(1,35)=1.1, P=0.294]. As expected, Rnu11 mutant mice performed significantly worse than

Rnu11 wild-type (P=0.008) and heterozygous (P=0.001) mice, by two-way student’s t-test (Figure

6.3C). While the average latency to fall trended higher in the Rnu11 heterozygous mice relative to wild-type controls, there was no significant difference in rotarod performance between these groups (P=0.716; Figure 6.3C). In the balance beam test, univariate ANOVA revealed no significant effect of genotype on the time taken to cross the balance beam [F(2,35)=2.7, P=0.084]

(Figure 6.3D). Coordination was further investigated using the grid walking task, in which the mouse must cross an elevated metal grid and the number of correct foot placements on the wire grid is counted. Univariate ANOVA revealed a significant effect of genotype on the number of correct foot placements (P=0.003), due to the significantly fewer correct foot placements by Rnu11 mutant mice relative to Rnu11 wild-type (P=0.005) and heterozygous (P=0.011) mice, by two- way student’s t-tests (Figure 6.3E). There was no significant difference in performance between

Rnu11 heterozygous and wild-type mice (P=0.986). Finally, for the wire hang task, in which a

184 mouse is coaxed to grasp a wire grid and the latency to fall from this grid is quantified, univariate

ANOVA identified a significant effect of genotype on latency to fall [F(2,35)=78.7, P<0.001].

Consistent with previous results, Rnu11 mutant mice performed significantly worse than Rnu11 wild-type (P<0.001) and heterozygous (P<0.001) mice, based on two-way student’s t-tests (Figure

6.3F). However, virtually none of the Rnu11 wild-type or heterozygous mice fell from the wire grid during the 60-second task; as a result, their performance in this task was not significantly different (P=1.000) (Figure 6.3F).

In parallel with these tests, we had begun investigating spatial learning and memory in the

Rnu11 wild-type and heterozygous mice using the Morris water maze. In this test, the mouse is placed in a circular pool and must find and swim to a submerged platform to escape the water, using spatial cues located outside of the maze itself. This test is then repeated across five consecutive days. There are two general measures of memory/spatial learning in this task: latency to reach the platform (“escape latency”), where shorter latency corresponds to better memory/spatial learning; and path length from the starting position to the platform, which is a more accurate measurement of memory/spatial learning in this task [527]. A 5 (day) x 2 (genotype) repeated measures ANOVA revealed a significant effect of genotype (P=0.011) and day

[F(4,27)=15.9, P<0.001], and a significant interaction between day and genotype [F(4,27)=2.6,

P=0.038], on average escape latency (data not shown). Overall, performance of the Rnu11 heterozygous mice was better than the wild-type controls (data not shown). However, the average path length from the starting position to the platform was not significantly affected by genotype

(P=0.559); repeated measures ANOVA only identified a significant effect of day on average path length [F(4,27)=8.2, P<0.001] (Figure 6.3G). Instead, a repeated measures ANOVA revealed that the average swimming velocity was significantly impacted by genotype [F(1,27)=9.5, P=0.005]

185 and day [F(4,27)=25.3, P<0.001]. On all days of testing, the Rnu11 heterozygous mice had higher average swimming velocities relative to wild-type controls (Figure 6.3H). In summary, we did not observe a significant difference in Rnu11 heterozygous performance, relative to wild-type controls, on simple motor tasks designed to test specific domains of motor performance. Instead, significant enhancement of motor performance in the Rnu11 heterozygous mice, compared to wild-type controls, was primarily observed in complex, whole-body tests of motor performance (e.g., rotarod, swim test).

6.4 Discussion

In the multiple developmental disorders linked to minor spliceosome disruption, patients have numerous behavioral abnormalities. In MOPD1, patients display psychomotor developmental delays in infancy and early childhood. In the patients with mild MOPD1 who reach late childhood, intellectual disability is common [107, 108, 124-129]. In the more mild Roifman syndrome, cognitive delays and hypotonia are often observed [106, 140]. Patients with Lowry

Wood syndrome have a variable degree of intellectual disability, ranging from significant to mild or no intellectual disability [122, 141-143, 528-530]. Our effort to characterize the behavioral outcomes of minor spliceosome inactivation-induced microcephaly revealed a wide range of behavioral abnormalities in the microcephalic, Emx1-Cre+ Rnu11 mutant mice, including increased anxiety, global impairment of motor behavior, and reduced vocalization in socio-sexual encounters. The mutant mice particularly struggled in tasks requiring coordination and fine motor skills, such as the rotarod and grid-walking tasks (Figures 6.2H, 6.3C&E); in contrast, general locomotion appeared relatively unaffected, based on their performance in the open field test

(Figure 6.2D&G). In the P0 mutant brain, we observed loss of the corticospinal tracts, which are

186 comprised of primary motor neuron axons (Figure 4.16B). Notably, lesion studies in both primates and rodents indicate that disruptions of the corticospinal tracts severely impact fine motor skills, such as skilled walking, coordination, and manual dexterity, while voluntary locomotion and gait are not significantly affected [531, 532]. Therefore, the observed motor impairments in the Rnu11 mutant mice may be due to loss of the descending cortical axons in the spinal cord.

There are many potential causes of the observed anxiety and vocalization deficits in the mutant mice. For example, Emx1-lineage excitatory neurons are observed in the basolateral amygdala, an important region for fear conditioning [329, 533]; it is possible that either loss or dysfunction of these Emx1-lineage excitatory neurons could underlie this phenotype.

Alternatively, changes in Emx1-lineage cortical neurons that project to the amygdala could explain the heightened anxiety. Specifically, activation of the medial prefrontal cortex (mPFC), which projects to the basolateral and central amygdala, has been shown to inhibit amygdalar activity, linking mPFC activity to reduction in conditioned fear [235, 534]. Loss of these mPFC-amygdala projections, or their reduced activity, could result in increased activity in the amygdala and increased anxiety. The prefrontal cortex (PFC) is also implicated in social behavior, including social motivation. Lesion of the anterior cingulate cortex (ACC), a region within the PFC, reduces interest in social interaction [535, 536]. Moreover, the PFC also regulates social , which informs socio-sexual vocalization emission such that more dominant males vocalize more than subordinates [535, 537]. Therefore, disruption of either of these PFC functions could cause the observed reduction in socio-sexual vocalizations in Rnu11 mutant mice. Yet another possibility is that inefficient motor control could underlie this change in vocalization. Specifically, orofacial motor deficits have been linked to impaired vocalization production [538, 539]. Notably, in addition to their poor motor performance, Rnu11 mutant mice struggled to transition to solid food,

187 likely due to the prevalence of Emx1-lineage cells in the jaw musculature (Figure 6.1E-K). This is consistent with the micrognathia commonly observed in MOPD1 patients [126, 127]. It is feasible that motor deficits, caused by disruption of motor cortex function and/or impaired development/function of the jaw musculature, could contribute to the reduction in socio-sexual vocalization in the Rnu11 mutant mice.

During the span of both behavioral studies, multiple Rnu11 mutant mice died unexpectedly, despite being provided wetted food daily. These mice were visibly smaller than littermate controls and displayed signs of dehydration, including hunched posture and mussed facial fur caused by piloerection [540, 541]. Similar symptoms were observed in the Emx1-Cre heterozygous, Rnu11 mutant mice that died unexpectedly during adulthood (data not shown; Figure 6.1E&K). Given that thirst originates in the hypothalamus [542], which is located outside of the Emx1-Cre domain, these unexpected deaths most likely relate to Emx1-Cre activity in non-brain structures, specifically the jaw musculature and/or the kidneys (Figure 6.1G-J’) [235, 516-518]. Abnormal function of the jaw musculature could prevent these mice from efficiently drinking from the available water bottle. However, outcomes of occasional cage flooding events, wherein piled cage bedding came in contact with the water bottle nozzle and leached most of the water from the bottle, in cages housing control and Rnu11 mutant female mice suggested that Rnu11 mutant mice were more sensitive to dehydration than their control cage-mates (data not shown). Thus, it is most likely that Emx1-Cre-mediated U11 ablation in the renal cortex and, to a lesser extent, the renal medulla negatively impacts kidney function, resulting in water loss and dehydration.

Unexpectedly, we observed significantly altered social and motor behavior in the Rnu11 heterozygous mice relative to wild-type controls, even though Rnu11 wild-type and heterozygous mice appeared morphologically similar. Like the Rnu11 mutant mice, Rnu11 heterozygous mice

188 vocalized significantly less in socio-sexual interactions than wild-type controls; in fact, Rnu11 heterozygosity had a similar impact on USV production as complete Rnu11 ablation (Figure 6.2I), suggesting that the development/function of the cortical circuits underlying social behavior is highly sensitive to U11 levels. Since (i) the Rnu11 heterozygous brain is grossly similar to the wild-type control brain, and (ii) Rnu11 heterozygous mice did not display motor deficits by any of the tasks we employed, the underlying cause of reduced USV production in these mice likely differs from that of the microcephalic Rnu11 mutant mice. In the literature, vocalization defects have been observed in transgenic mice without microcephaly [543-547]. For instance, Epac2- deficient mice display severe sociosexual vocalization deficits alongside reduced cell density and dendritic spine number specifically in layer II/III neurons of the ACC [545]. Thus, one possibility in the Rnu11 heterozygous mice is that a subset of MIGs is required specifically for prefrontal cortex development, rendering the prefrontal cortex particularly sensitive to U11 levels. As a result, U11 haploinsufficiency would cause abnormal splicing and function of these MIGs, leading to prefrontal cortex-specific changes in cell density, neuron differentiation, and prefrontal cortex function. While many MIGs, such as Atxn10 [548], Braf [549], Dock4 [550, 551], and Myo10

[552], have been linked to abnormal neuron differentiation, we did not find any reports linking these MIGs to prefrontal cortex-specific defects. A potential mechanism that could reduce cell density in the upper layers of the prefrontal cortex is abnormal cell cycle in the prefrontal cortex

NPCs. In the U11-null pallium, we observed a significantly higher fraction of mitotic NPCs, with a significant fraction of mitotic RGCs in prometaphase/metaphase, relative to the control pallium

(Figure 4.7C-E). Prolongation of these mitotic phases is associated with increased neuron production [294]. If NPCs of the rostral pallium express MIGs at higher levels than NPCs elsewhere in the pallium, or if cell cycle in these rostral NPCs is primarily controlled by MIGs,

189 then slight changes in minor splicing efficiency could prolong mitosis length. This would shift

NPCs toward differentiative divisions, which would increase the production of early-born, lower layer neurons and gradually deplete the NPC population as development continued. In turn, fewer later-born, upper-layer neurons would be produced in the rostral cortex, resulting in a similar phenotype as that seen in the Epac2-deficient mice [545].

Alternatively, the social deficits in the Rnu11 heterozygous mice might be linked to abnormal olfaction. Emx1-Cre expression is mosaic in the olfactory bulb, with substantial recombinase activity observed in the mitral/tufted cells, the output neurons of the region [235, 516-

518, 553]. Moreover, mitral cell function has been linked to urine detection, with a subset of mitral cells specifically responding to the volatile compounds found in mouse urine [554]. Impaired detection of such scents has been associated with impaired social behavior [555-557]. Therefore, it is possible that U11 haploinsufficiency disrupts the function of MIGs important for olfactory bulb development/function, resulting in defective olfaction and social deficits. Further investigation is needed to determine whether U11 haploinsufficiency specifically impacts development/function of the prefrontal cortex, the olfactory bulb, or both. Regardless, these results are the first to implicate the minor spliceosome in either social behavior or cortical region-specific development/function in the mammalian brain, raising the possibility that MOPD1, Roifman syndrome, and Lowry Wood syndrome patients may display subtle social or olfaction deficits that have not yet been tested/identified.

In contrast with the social behavior phenotype of Rnu11 heterozygous mice, which correlated with Rnu11 mutant social deficits, we observed subtle but significant improvement of complex, whole-body motor task performance in the Rnu11 heterozygous mice relative to that of wild-type controls. The subtlety of this phenotype was highlighted by the differing rotarod

190 performance across our two studies. In our first study, in which the rotarod task was performed around P80, Rnu11 heterozygous mice performed significantly better than wild-type controls on days 1 and 2 of testing. The enhanced performance on the first two days of testing suggested that these mice displayed better baseline motor control than wild-type controls, rather than enhanced motor learning. However, in our second study, wherein the rotarod task was performed around

P105, we did not observe a significant difference between Rnu11 heterozygous and wild-type control performance. This appeared to be due to better performance of the wild-type control mice in study two, rather than worse performance by the Rnu11 heterozygous mice (Figure 6.3A-B).

While increased age has been associated with worse rotarod performance, most studies investigating the relationship between age and rotarod performance group mice by age intervals of two months or longer, while the difference in age between the mice in studies one and two is less than one month (2.67 months vs. 3.5 months) [558, 559]. Therefore, it is unclear if this observed shift in wild-type performance represents normal mouse motor development or if this shift is unique to the Rnu11 transgenic mouse line. In addition to age, multiple groups have reported that weight has a confounding effect on rotarod performance, with increased weight linked to shorter latencies to fall from the rod [560-562]. If the Rnu11 heterozygous mice of study two, but not study one, weighed more than the wild-type controls, this could explain why we did not observe improved heterozygous rotarod performance in study two. Assessment of complex motor performance by swimming speed, which has not been correlated with weight [563], would allow us to circumvent the confounding effect of weight on rotarod performance in the study two mice. In fact, when motor performance was tested via the Morris water maze task, the Rnu11 heterozygous mice used in study two swam significantly faster than wild-type controls on all days of testing (Fig. 6.3H). Similar to the rotarod performance enhancement observed in the first study,

191 the average Rnu11 heterozygous mouse swimming speed was significantly higher than that of wild-type controls on the first days of testing, indicating a higher baseline level of complex motor task performance as opposed to enhanced motor learning (Fig. 6.3H). We did not observe significantly enhanced performance on tasks that test specific domains of motor behavior, such as strength, coordination, or balance, indicating the presence of subtle motor performance enhancements that were only detectable in complex, whole-body motor tasks.

In the mouse brain, voluntary motor control originates from the motor cortex, which sends projections to the brainstem, spinal cord, and the dorsal striatum. The dorsal striatum integrates signals from the motor, sensory, and limbic cortices and signals to other nuclei in the basal ganglia, ultimately modulating motor signals via brainstem motor nuclei and thalamic neurons, which project back to the motor cortex (see section 6.1). Dopaminergic signaling from the ventral striatum, specifically the substantia nigra pars compacta (SNc), to the dorsal striatum regulates

MSN firing, thereby informing motor coordination and motivation [564, 565]. Generally, signaling from the motor cortex is associated with movement initiation and direction, while dorsal striatum function is linked to the regulation of movement speed, response time, and movement size [566-

570]. In the Emx1-Cre transgemic mouse, Cre-mediated recombination has been reported throughout the cortex, in both excitatory neurons and in pallium-derived glia, and in scattered cells of the dorsal striatum [235, 329]. Specifically, Emx1-lineage cells in the dorsal striatum include a subset of oligodendrocytes and their precursors (OPCs) and approximately 4% of the MSNs [329].

Therefore, in the Emx1-Cre+, Rnu11 heterozygous mice, U11 haploinsufficiency could affect enhanced motor performance by impacting the development and/or function of the motor cortex or the basal ganglia.

192

Enhanced rotarod performance in transgenic mouse lines is often linked to changes in striatal function (MGI refs)[516-518, 571-574]. In contrast, neither literature curation nor a phenotype-based search, using the MGI database, revealed any transgenic mouse lines in which enhanced motor performance has been associated with altered motor cortex function [516-518].

Thus, we suspect that changes in the dorsal striatum of the Emx1-Cre+, Rnu11 heterozygous mice most likely underlie the enhanced motor performance we observed. While Emx1-lineage MSNs comprise only ~4% of MSNs of the dorsal striatum, Cocas et al. found that most of the Emx1- lineage MSNs are found in striosomes (91% of all Emx1-lineage MSNs) and most are D2-receptor expressing MSNs of the indirect pathway (75% of all Emx1-lineage MSNs) [329]. In the wild-type rodent dorsal striatum, approximately 15% of all MSNs are found in the striosome; of these striosomal MSNs, 20-40% are D2 receptor-expressing MSNs of the indirect pathway. Based on the reported prevalences of these cell types in the rodent dorsal striatum, one can calculate that ~3-

6% of all dorsal striatum MSNs are D2 receptor-expressing striosomal MSNs. Thus, even though

Emx1-lineage MSNs comprise only a small fraction of the total MSN population in the dorsal striatum, they likely represent a substantial fraction of the D2 receptor-expressing striosomal MSNs of the region. It is possible that Emx1-Cre-mediated Rnu11 haploinsufficiency in these striosomal, indirect pathway MSNs would impair the development/function of these cells, resulting in the enhanced motor performance observed in the Rnu11 heterozygous mice. Since most of the cortical inputs to striosomal MSNs originate from the limbic system-associated cortices [521, 522], Rnu11 haploinsufficiency may alter limbic system-mediated control of motor coordination in these mice.

However, since most research on striosome function targets either all striosomal cells or the D1 receptor-expressing MSNs of the direct pathway, it is unclear how the striosomal, indirect pathway

MSNs specifically regulate motor behavior. It is possible that altered function/development of

193 these cells leads to secondary effects in the ventral striatum, which houses the dopaminergic neurons that project to the dorsal striatum [564, 565]. Increased levels of dopamine in the dorsal striatum have been linked to enhanced attention, motivation, and/or motor learning, in turn leading to better performance on the rotarod task in transgenic mice [516-518, 571, 572, 574-577].

Specifically, motor learning is particularly sensitive to dopamine levels [572, 574, 578-580]. While the Rnu11 heterozygous mice display enhanced baseline motor performance on the rotarod and

Morris tasks, our data do not support enhanced motor learning in these mice, relative to wild-type controls (Figures 6.2H, 6.3A-B, and 6.3H). Therefore, if high dopamine levels contribute to the observed phenotype, they likely increase motivation and/or attention in the Rnu11 heterozygous mice [565]. Yet another possibility is that striatal function in the Emx1-Cre+, Rnu11 heterozygous mice is modulated by Emx1-lineage OPCs and oligodendrocytes, which comprise ~50% of all

Emx1-lineage cells of the dorsal striatum [329]. Changes in myelination could impact striatal performance, thereby causing the observed motor phenotype. Since little is known about the specific roles of OPCs and oligodendrocytes, or the myelination patterns, in the dorsal striatum, it is unclear if increased myelination or regionally increased myelination could enhance motor performance/coordination.

Generally, it is difficult to predict the effects of U11 haploinsufficiency on the expression, splicing, and function of specific MIGs. For example, the majority of voltage-gated sodium channel alpha subunit genes and voltage-gated calcium channel alpha subunit genes contain at least one minor intron [58]. If one predicts that U11 haploinsufficiency results in inefficient minor intron splicing, in turn causing abnormal splicing around all minor introns and impaired MIG function, then one would expect impaired neuron firing across the cortex and significantly worse performance on all tasks relative to the wild-type controls. However, we did not observe signs of global cortical

194 function impairment in the Rnu11 heterozygous mice. Instead, we observed region-specific changes in brain function, likely affecting the prefrontal cortex and the basal ganglia. Moreover, the behavioral phenotypes of the Rnu11 heterozygous mice caused impairment of one of these domains (sociability), while enhancing the other (whole-body motor performance). Together, our findings indicate that specific regions of the brain are differentially sensitive to U11 haploinsufficiency, with regions underlying sociability being the most sensitive. These findings raise numerous questions regarding the carriers of mutant minor spliceosome-specific snRNAs— e.g., the parents of MOPD1, Lowry Wood syndrome, and Roifman syndrome patients. While these heterozygous carriers have previously been considered asymptomatic, these findings suggest that heterozygous carriers of these mutations may display subtle behavioral phenotypes that have not yet been described.

6.5 Materials and methods

Animal Procedures

All mouse procedures were performed according to the protocols approved by the

University of Connecticut Institutional Animal Care and Use Committee. The Rnu11 conditional knockout (cKO) mouse and its generation is described in Chapter 4 (section 4.5). The primers used for Rnu11, Cre, Emx1-Cre zygosity, and Ai9 tdTomato Cre reporter genotyping PCRs are listed in

Table S7. For survival and weight gain kinetics, both male and female control (Rnu11 wild-type and heterozygous) and mutant mice were used. To minimize suffering, animals with profound weight loss, indicative of starvation; severe dehydration, identified by hunched posture, moderate to severe skin tenting, and sunken eyes; or self-inflicted wounds were euthanized [541]. These mice were included in the survival kinetics data, using the age at which they were examined and

195 euthanized. Mice transferred to the MBNF for behavioral testing that were euthanized or died unexpectedly were not included in the survival kinetics data. For behavioral testing, adult male mice were used for pioneer studies, since behavioral analysis of female mice necessitates synchronization of estrus cycle, which has been shown to impact behavior in mice [581].

To generate the mice used for behavioral testing, timed matings were set such that each cohort would be comprised of age-matched male mice born within a 20-day (study 1) or a 13-day

(study 2) window. For study 1, 7 wild-type (WT; Emx1-Cre-negative Rnu11WT/Flx and Rnu11Flx/Flx),

7 heterozygous (het; Rnu11WT/Flx::Emx1-Cre+), and 8 mutant (mut; Rnu11Flx/Flx::Emx1-Cre+/-) mice born between 7/22/2015 and 8/11/2015 were transferred to the MBNF vivarium in late-September of 2015. For study 2, 15 wild-type (WT; Rnu11WT/WT::Emx1-Cre+), 15 heterozygous (het;

Rnu11WT/Flx::Emx1-Cre+/-), and 15 mutant (mut: Rnu11Flx/Flx::Emx1-Cre+/-) mice born between

7/2/2016 and 7/15/2016 were transferred to the MBNF vivarium in early October of 2016. Upon arrival, subjects were singly housed, with water and food available ad libitum, and maintained on a 12-hour light/12-hour dark cycle. Behavioral testing did not begin until all subjects in a study reached adulthood (>P60). All behavioral testing was performed during subjects’ light period and blind to subject genotype. The number of mice used for each behavioral test was determined both by the number of healthy mice available, as unexpected deaths particularly depleted the mutant mouse groups, and feasibility of testing within a given time window.

Transcardial Perfusion

P70 control and mutant mice were anesthetized using isoflurane, followed by transcardial perfusion performed as previously described [117]. After the brains were removed, they were fixed overnight in 4% PFA at 4°C, followed by PBS washes and cryoprotection in 30% sucrose

196 in PBS at 4°C for two days. Following an overnight incubation in half 30% sucrose/half OCT compound (Fisher Scientific, #23-730-571), the brains were embedded in OCT compound and sectioned using a cryostat (Leica CM 3050 S).

Hematoxylin and Eosin Staining

Frozen 16 µm coronal sections of P0 mouse brains were cured at room temperature for 15 minutes, followed by incubation in PBS for 15 minutes. The tissue was then dehydrated using an ethanol gradient (25%, 50%, and 75%). Slides were washed twice in 100% ethanol for 30 seconds each, with agitation, then washed twice in 95% ethanol for 30 seconds each, with agitation.

Following a rinse in running tap water, the sections were incubated in Harris hematoxylin (Fisher

Scientific, #6765001) for 7 minutes. Excess hematoxylin was removed by washing in running water for 30 seconds. The slides were briefly agitated in 1% acid alcohol (1% HCl in ethanol), then briefly rinsed in running water. Sections were then incubated for 2-3 minutes in saturated lithium carbonate (1.54% in dH2O), followed by a 5-minute incubation in tap water. After a 30 second wash in 80% ethanol with agitation, slides were incubated in an eosin Y-phloxine B solution (0.1% eosin Y, 0.01% phloxine B, 0.45% glacial acetic acid, 82.9% ethanol) for 1 minute.

The sections were then washed in 95% ethanol for 30 seconds with agitation, followed by two 30- second washes in 100% ethanol with agitation and incubation in 100% ethanol for 1 minute. After washing in xylene for 30 seconds with agitation, the slides were washed in xylene for 1 minute, then mounted using PermountTM mounting medium (Fisher Scientific, #SP15-100).

Elevated Plus Maze

197

The elevated plus maze apparatus consisted of four arms arranged in a “plus” (+) configuration and an open square in the middle of the apparatus. Two arms, located opposite of one another, had high walls blocking their sides and ends; the other two arms were open, i.e., without walls. The arm maze apparatus was elevated approximately 1 meter from the ground and oriented perpendicular to the ground. Trials were performed in a room with consistent illumination and a downward-facing, ceiling-mounted digital video camera that captured the full apparatus within its field of view. For each trial, the subject was placed in the center of the apparatus, facing an open arm; mouse activity was then recorded for five minutes. Videos were analyzed using the

TopScan (CleverSys, Reston, VA) tracking software, with manual verification.

Open Field Task

The open field consisted of a square, plastic chamber with smooth, high walls. Trials were performed in a room with consistent illumination and a downward-facing, ceiling-mounted digital video camera that captured the full open field chamber within its field of view. For each trial, the subject was placed in the center of the open field, and mouse activity was recorded for 30 minutes.

Videos were analyzed using the TopScan tracking software. For analysis, the open field was divided into four zones of equal width: an out-outer zone, located along the walls of the chamber; an in-outer zone, located just within the out-outer zone; an out-center zone, located internal to the in-outer zone; and the in-center zone, comprised of the centermost region of the open field.

Analysis of time spent in, distance travelled, and velocity within these zones was performed using the TopScan software.

Rotarod

198

For the rotarod test, each mouse was placed on a rotating cylindrical rod; the rotation speed gradually increased from 4 to 40 rotations per minute across each two-minute trial. For every test day, four trials were administered, and mouse performance was recorded on a digital camera set on a tripod. Latency to fall from the rotating rod was then quantified based on these video recordings. For each mouse, the latency to fall was averaged across the four trials for every test day. For the accelerated rotarod paradigm employed in section 6.3.5, rotation speed of the rod increased from 4 to 40 rotations in a one-minute period.

Socio-sexual vocalization assessment

Adult male mice were introduced to a recording chamber containing an unfamiliar, wild- type female, along with seven-day dirty bedding from the female’s cage to ensure exposure to pheromones from the full estrus period, for a five-minute interacting period. Socio-sexual vocalizations were recorded using a ¼ inch condenser microphone [Bruel & Kjaer (B&K) type

4136, Naerum, Denmark] hung 10 cm above the recording chamber. Since female vocalizations in socio-sexual encounters are both rare and much quieter than that of males, this set-up would efficiently detect socio-sexual vocalizations from the male test subjects. For preamplification and subsequent amplification of the microphone signal, we employed a B&K type 2619 preamplifier and a B&K type 2636 amplifier, respectively. Digitization of the amplified signal was performed using a Tucker Davis Technologies (Alachua, FL) multifunction processor (RX6), at a 200 kHz sampling rate; a custom MatLab program run on a Dell Pentium IV PC was used to save the resulting data as a .wav file. Vocalization recordings were then analyzed using the Adobe Audition software. From these recordings, the total time spent vocalizing was extracted using vocalization intervals (defined as continuous epochs of vocalization <200 ms apart), which were distinguished

199 from periods of silence. The resulting calculation of total time spent vocalizing was then used to calculate the percentage of time spent vocalizing during the five-minute interacting period. All recordings were performed in a sound-proofing room to both block external auditory stimuli during the encounter and minimize acoustic noise during the recording.

Domain-specific motor tasks

Balance beam test The balance beam apparatus employed consisted of a 1 meter-long, flat, 6 mm-wide beam raised above a table-top. One end of the beam was illuminated by light, while the opposite end was enclosed in a box with one side removed, to allow entry from the beam into the box. Bedding from the test subject’s home cage was added to this “safe box” to encourage mice to travel across the beam. For each test, the test mouse was placed at the illuminated end of the beam; mouse movement was recorded using a ceiling-mounted, downward-facing digital camera. The resulting recording was then used to calculate the latency to cross the balance beam to the enclosed platform. For each mouse, the results of four trials were averaged to calculate the latency to cross used for downstream comparisons.

Grid walking test The grid walking test, also known as the foot fault task, was performed using a metal wire grid with a 3.5 cm x 13.5cm grid. For the task, the wire grid was elevated and the test mouse was coaxed to cross the grid. The number of correct foot placements, in which the mouse successfully grasps the wire grid with its paw, was quantified from recordings made on a tripod-mounted digital camera.

Wire hang test The same metal wire grid used for the grid walking test was employed for the wire hang test. In this task, the test mouse was coaxed to grip the elevated wire grid and suspended from the grid for a 60-second interval. A tripod-mounted digital camera recorded each

200 trial, and the resulting recordings were used to calculate the latency to fall from the grid for each mouse.

Morris water maze

The Morris water maze apparatus was comprised of a 48-inch diameter, oval tub of room- temperature (~22˚C) water and a 8-inch diameter platform submerged in 2 cm of water; this hidden platform was located at the same location in the pool across all trials across all testing days. Testing was performed in a testing room with large black shapes painted on the walls and doorframe, to provide extra-maze cues for the test animals. For each trial, the mouse was placed in one of four quadrants of the pool (north, east, south, or west) and required to swim until it located the submerged platform. Prior to testing on the Morris water maze, test animals were acclimated using a water escape task, which employed the same apparatus design except that the platform was located above the water’s surface. Due to overall poor motor performance, declining health, and increased anxiety of the Rnu11 mutant mice, they were not tested on either the water escape task or the Morris water maze to prevent severe stress/drowning. For both the water escape task and the Morris water maze trials, mouse performance was recorded using a ceiling-mounted, downward-facing Sony Handcam DCR-TRV280 digital camera. The SMART Version 2.5 tracking software, installed on a Dell Dimensions E21 computer, was employed to trace mouse swimming paths and record time to reach the platform (i.e., time to escape the maze).

Measurements of path length and escape time were then used to calculate swimming velocity for each trial. If a test subject took longer than 45 seconds to escape the maze, the experimenter guided the mouse to the platform and allowed it to rest there for 10 seconds. Four trials were performed each day, with a rest period of 2 minutes under a warming lamp between trials. For each trial, the

201 start position was varied between the north, east, south, and west starting locations, without repetition on that testing day; the order of start position location was randomized for each separate day of testing. Results from these four trials were averaged to calculate performance for that day; this testing was performed across five consecutive days.

Statistical analysis

All statistical testing was performed using the SPSS 15.0 software. Two-tailed statistical tests were employed, with an alpha cut-off of 0.05. For all behavioral tests performed on both studies, ANOVA was used to determine the effect of study on test performance; if no significant effect of study was observed, the results from the two studies were pooled. Univariate analyses of variation (ANOVAs) were employed to compare performance on each task as a function of genotype (3 levels: WT, het, mut). For behavioral tests performed over multiple days, such as the rotarod and Morris water maze testing paradigms, repeated measures ANOVAs were employed.

In both cases, variables included genotype (3 levels: WT, het, mut) and day (5 levels: day 1, day

2, day 3, day 4, day 5). If a significant effect of genotype was revealed by ANOVA, pairwise comparisons of genotype-specific performance (e.g., WT vs mut, het vs mut, WT vs het) was completed using two-way student’s t-tests.

202

Chapter 6: Figures

Figure 6.1. Survival and weight gain kinetics in the microcephalic Rnu11 mutant mice. (A) Lateral view of skulls from P70 ctrl and mut mice. Scale bar=5 mm. (B) Dorsal view of the brains removed from the skulls in (A). Scale bar=5 mm. (C-D) Coronal sections of the P70 brains in (B), stained with hematoxylin and eosin. Scale bar=1 mm. (E) Survival kinetics of control (solid line, N=5) and mutant (dashed line, N=22) mice from P10 to P700. (F)Weight kinetics of Emx1- Cre+/- Rnu11 ctrl (solid line, N=3) and Rnu11 mut (dashed line, N=3) mice from birth until P30. Statistical significance at each time point was determined by two-tailed student’s t-test. *=P<0.05. (G-H) tdTomato Cre reporter signal (red) in lateral views of the P0 ctrl (G) and mut (H) head. (I- J) Lateral views of the jaw musculature of P21 ctrl (I) and mut (J) mice; Mas=masseter, tm=temporalis. (I’-J’) tdTomato Cre reporter signal in the jaw musculature of the ctrl and mut mice in (I) and (J), respectively. (K) Survival kinetics of ctrl (solid line, N=69), Rnu11 mutants heterozygous for Emx1-Cre (dashed line, N=25), and Rnu11 mutants homozygous for Emx1-Cre (dotted line, N=11) from birth to adulthood (P60).

203

Figure 6.2. Anxiety, motor performance, and socio-sexual behavior analysis of Rnu11 wild- type (WT), heterozygous (het), and mutant (mut) mice. (A-B) Elevated plus maze performance by (A) arm entry number and (B) time spent in open arms. (C) Schematic of the zones in the open field test. (D-G) Open field performance by distance in the (D) out-outer, (E) out-center, and (F) in-center zones, and by (G) velocity in the out-outer zone. (H) Latency to fall on the rotarod across 5 days of testing for mice pooled from both studies. (I) Percent time spent vocalizing in male-female, 5-minute interactions. Green=WT, orange=het, blue=mut. Statistical significance was determined by either univariate or repeated measures ANOVA, followed by two-tailed student’s t-tests. N.s.=not significant, *=P<0.05, **=P<0.01, ***=P<0.001.

204

Figure 6.3. Analysis of motor ability in the Rnu11 heterozygous (het) mouse. (A-B) Latency to fall off the rotarod across 5 days of testing for mice from (A) study 1, tested at P80, and (B) study 2, tested at P105. (C) Latency to fall off the rotarod using a rapid acceleration protocol across 5 days of testing for mice from study 2. (D) Balance beam performance by time to cross the beam, using mice from study 2. (E) Grid walking performance, measured by number of correct foot placements, for mice from study 2. (F) Wire hang test performance, measured by latency to fall, for mice from study 2. (G-H) Morris water maze performance by (G) path length to platform and (H) average swimming velocity, for mice from study 2. Orange=WT, green=het, blue=mut. Statistical significance was determined by either univariate or repeated measures ANOVA, followed by two-tailed student’s t-tests. *=P<0.05, **=P<0.01, ***=P<0.001.

205

Chapter 7: Conclusions and future directions

In Chapter 3, it was shown that the minor spliceosome-specific snRNAs are enriched in the developing CNS and limb buds, which corresponds strongly to the tissues affected in MOPD1

[107, 108, 124, 125] (Figure 3.4). In the early embryonic retina, expression of these snRNAs was relatively high in the retinal progenitor cells, but as development continued, the highest enrichment was later observed in the differentiating retinal neurons (Figures 3.6&3.7). Moreover, P0 electroporation of U12 morpholino revealed death of differentiating retinal neurons, with complete loss of electroporated cells observed by P14 (Figure 3.11). In contrast, genetic Rnu11 ablation in the E12 pallium caused widespread death of self-amplifying RGCs, while both IPCs and neurons were produced (Figures 4.4, 4.5C-F, and 4.15). In fact, our data suggest that U11-null neurons produced in the pallium can survive postnatally (Figure 4.15). It is unclear if this discrepancy is due to a difference in the knockdown/knockout strategy used—for example, morpholino-based knockdown can lead to artifacts due to non-specific binding [582]—or the timing of the knockdown/knockout, since Rnu11 ablation occurred prior to the start of cortical neurogenesis while the U12 morpholino was introduced at P0, in the middle of retinal neurogenesis [235, 329]

(Chapter 3). Alternatively, cell type-specific differences in MIG expression and splicing may underlie this discrepancy. Notably, retinal phenotypes are not commonly reported in MOPD1, while microcephaly is observed in MOPD1, Roifman Syndrome, and Lowry Wood syndrome, which supports this possibility [106-108, 122, 124-129, 139-143]. To test this, breeding the retinal progenitor cell-specific Chx10-Cre transgenic mouse into the Rnu11 cKO moue line would allow for the genetic ablation of U11 in the retinal progenitor cells starting at E11 [583], and would circumvent the problems associated with the U12 morpholino experiment.

206

In Chapter 4, prolongation of mitosis, and particularly of prometaphase/metaphase, was observed in the U11-null RGCs (Figure 4.7C-E); in 2016, Pilaz et al. reported that prolongation of prometaphase/metaphase in RGCs results in increased differentiative drive [294]. To investigate

IPC and neuron production in the developing cortex, a two-pronged approach employing in vitro and in vivo techniques would be the most effective. To study IPC and neuron production from

RGCs in vitro, NPC derivation from E11 or E12 Emx1-Cre+ and Emx1-Cre- Rnu11Flx/Flx::Mapt-

EGFP and Rnu11Flx/Flx::Tbr2-EGFP pallia followed by live-imaging would allow us to track and quantify the number of proliferative, IPC-producing, and neurogenic divisions in the wild-type and mutant pallium [220, 294, 584]. Subsequent fixation and immunofluorescence for the RGC marker Pax6 would further allow us to identify the unlabeled, dividing cells. This approach would also allow us to calculate the length of cell division in the wild-type RGCs, wild-type IPCs, mutant

RGCs, and mutant IPCs. To study divisions in vivo, in utero electroporation (IUE) of either a pCAG-Cre or pTbr2-Cre plasmid into Rnu11WT/WT::loxp-STOP-loxp-CAG-tdTomato+ and

Rnu11Flx/Flx::loxp-STOP-loxp-CAG-tdTomato+ E13 embryos would allow us to specifically ablate

Rnu11 in a subset of either RGCs (pCAG-Cre) or IPCs (pTbr2-Cre) [585, 586]. Harvest of the electroporated embryos after 48 hours, followed by quantification of tdTomato+ RGCs, IPCs, and neurons in the wild-type or mutant embryos, would allow us to determine whether IPC or neuron production is altered by U11 loss. Moreover, this strategy would allow us to distinguish between the effects of U11 loss on RGCs versus IPCs, by employing the IPC-specific pTbr2-Cre plasmid, gifted to the Kanadia lab by Dr. Tarik Haydar [585]. Additionally, by performing this experiment and harvesting at postnatal time-points, we can also use this strategy to verify and study U11-null neuron survival and differentiation.

207

To further study the cell type-specificity observed in Chapter 4, fluorescence-assisted cell sorting (FACS) is a powerful method that would allow us to study MIG expression and splicing across distinct cell types of the cortex and, potentially, the striatum [587]. This approach would allow us to determine whether the U11-null neurons in the Rnu11 mutant pallium have shifted their transcriptional programs to circumvent those requiring MIG function, or if all neurons rely on transcriptional programs with few MIGs. Moreover, we could determine whether neurons express any of the core essentialome MIGs identified in Chapter 5. To this end, we have bred the

Mapt-EGFP mouse in the Emx1-Cre+, tdTomato+ Rnu11 cKO line. By dissociating the pallium, pooling embryos of the same genotype, and employing FACS to separate the EGFP+/tdTomato+

Emx1-Cre+ neurons from the EGFP-/tdTomato+ Emx1-Cre+ NPCs, we can perform RNAseq on these distinct populations. The same approach can be used to study the Emx1-lineage cells of the striatum at later embryonic time-points, once these cells have fully migrated from the pallium to their final positions in the striatum [329]. This approach would also identify any cell type-specific alternative splicing occurring in either MIGs or non-MIGs; Anouk Olthof of the Kanadia lab is currently pursuing this line of research.

In Chapter 4, the self-amplifying RGC death we observed occurred alongside DNA damage and p53 accumulation (Figure 4.6E). In numerous microcephalic transgenic mouse lines, p53- mediated apoptosis drives microcephaly, and genetic ablation of p53 can rescue the cell death and microcephaly (see section 4.2.3). It is unclear if this is the case in the Rnu11 mutant mouse, because

(1) a subset of MIGs are essential for cycling cell survival, as discussed in Chapter 5; and (2) inactivation of the minor spliceosome is projected to affect the expression and function of hundreds of genes, each of which has its own distinct interactors. To determine if p53-mediated apoptosis is the single driver of cell death in the Rnu11 mutant pallium, we have begun crossing the Trp53

208 conditional knockout mouse into the Emx1-Cre+ Rnu11 cKO line, which will be analyzed using the same postnatal and embryonic approaches detailed in Chapter 4 [588]. Alisa White of the

Kanadia lab is currently pursuing this line of investigation.

In the behavioral testing described in Chapter 6, we observed significant behavioral changes as a result of Rnu11 heterozygosity, raising the question of whether U11 haploinsufficiency impacts progenitor cell behavior, neuron production, neuron differentiation, and/or neuron function. To understand how Rnu11 heterozygosity impacts cortical development, we must first extend the embryonic analyses described in Chapter 4 to the Rnu11 heterozygous pallium, to determine if cell survival, proliferation/differentiation, and migration is affected by

U11 haploinsufficiency. The histological analysis would have to be extended to postnatal time- points spanning from P0 to P21, to assess changes in glial cell production from the developing postnatal cortex [202, 230-232]. Given the laminar structure of the cortex, wherein the neurons of each layer have different morphologies, projections, and overall functions, we would also need to employ layer-specific neuronal markers, such as an antisense FISH probe for Cux2, to mark layers

II-IV [589]; Rorb antisense probe to mark neurons in layer IV [590]; Ctip2 antibody or Er81 antisense probe to mark layer V [591]; Foxp2 antibody to mark layer VI [592]; and Ctgf antisense probe to mark the subplate [593]. Both the absolute and relative numbers of the neurons of each layer would need to be quantified, using NeuN antibody to mark all excitatory neurons. In addition to histological characterizations, qRT-PCR and RNAseq of the Rnu11 heterozygous vs the wild- type pallium would be essential, to characterize the relative levels of U11 in the Rnu11 heterozygous pallium and identify any minor intron retention and/or MIG alternative splicing in the heterozygous pallium. To understand the effects of Rnu11 heterozygosity on neuronal differentiation, we have begun breeding the Thy1-YFP allele in the Emx1-Cre+ Rnu11 cKO mouse

209 line. In this transgenic mouse, high YFP expression is observed in scattered pyramidal cells of layer V; this strong fluorescent staining would allow us to trace the dendritic branches of the labeled neurons, and therefore assess neuronal differentiation [594, 595]. In parallel, Golgi-Cox staining could be used to assess dendritic branching and spine morphology of pyramidal cells of the cortex [596]. To understand the effects of U11 haploinsufficiency on the Emx1-lineage oligodendrocytes, OPCs, and MSNs of the dorsal striatum, we would employ the Ai9 tdTomato cre reporter, combined with staining for the oligodendrocyte marker CC1, the OPC marker Ng2, and the MSN marker DARPP-32 [329], to quantify the number of these Emx1-lineage cells. To cover the possibility of changes in neuron firing, we have discussed collaborating with Dr.

Anastasios Tzingounis to investigate changes in the electrophysiology of the cortical Rnu11 mutant and heterozygous neurons, and to potentially perform calcium imaging. For this purpose, we have begun crossing the GCaMP5 calcium reporter into the Emx1-Cre+ Rnu11 cKO line, to visualize calcium flux in live brain slices [597]. This is particularly relevant, given the number of voltage-gated calcium channel alpha subunit genes that contain minor introns [58].

Finally, as discussed in section 6.4, there are many potential causes underlying the unexpected behavioral changes in the Rnu11 heterozygous mice. For example, changes in olfaction could impair sociability, and changes in motivation/attention could cause the increased motor performance observed. Therefore, additional behavioral experiments could help narrow down the specific domains affected by U11 haploinsufficiency. The olfactory habituation/dishabituation test, using both non-social and social odors, could eliminate impaired olfaction as a potential cause of the sociability deficit [598]. The use of home cage running wheels, which can record the amount of voluntary motor activity, would provide insight into the inherent motivation of these mice to perform motor behaviors. Finally, the 5-choice serial

210 reaction time task (5-CSRTT) could be employed to study changes in attention—an essential test, given that increased attention has been linked to better rotarod performance [599, 600].

211

Chapter 8: List of References

1. Sharp, P.A., The discovery of split genes and RNA splicing. Trends Biochem Sci, 2005. 30(6): p. 279-81. 2. Zalokar, M., Nuclear origin of ribonucleic acid. Nature, 1959. 183(4671): p. 1330. 3. Harris, H. and J.W. Watts, The relationship between nuclear and cytoplasmic ribonucleic acid. Proc R Soc Lond B Biol Sci, 1962. 156: p. 109-21. 4. Scherrer, K., H. Latham, and J.E. Darnell, Demonstration of an unstable RNA and of a precursor to ribosomal RNA in HeLa cells. Proc Natl Acad Sci U S A, 1963. 49: p. 240- 8. 5. Sibatani, A., et al., Isolation of a nuclear RNA fraction resembling DNA in its base composition. Proc Natl Acad Sci U S A, 1962. 48: p. 471-7. 6. Soeiro, R., H.C. Birnboim, and J.E. Darnell, Rapidly labeled HeLa cell nuclear RNA. II. Base composition and cellular localization of a heterogeneous RNA fraction. J Mol Biol, 1966. 19(2): p. 362-72. 7. Salditt-Georgieff, M., et al., Methyl labeling of HeLa cell hnRNA: a comparison with mRNA. Cell, 1976. 7(2): p. 227-37. 8. Perry, R.P. and D.E. Kelley, Kinetics of formation of 5' terminal caps in mRNA. Cell, 1976. 8(3): p. 433-42. 9. Rottman, F., A.J. Shatkin, and R.P. Perry, Sequences containing methylated nucleotides at the 5' termini of messenger RNAs: possible implications for processing. Cell, 1974. 3(3): p. 197-9. 10. Darnell, J.E., et al., Polyadenylic acid sequences: role in conversion of nuclear RNA into messenger RNA. Science, 1971. 174(4008): p. 507-10. 11. Jelinek, W., et al., Further evidence on the nuclear origin and transfer to the cytoplasm of polyadenylic acid sequences in mammalian cell RNA. J Mol Biol, 1973. 75(3): p. 515- 32. 12. Nakazato, H., D.W. Kopp, and M. Edmonds, Localization of the polyadenylate sequences in messenger ribonucleic acid and in the heterogeneous nuclear ribonucleic acid of HeLa cells. J Biol Chem, 1973. 248(4): p. 1472-6. 13. Edmonds, M., M.H. Vaughan, Jr., and H. Nakazato, Polyadenylic acid sequences in the heterogeneous nuclear RNA and rapidly-labeled polyribosomal RNA of HeLa cells:

212

possible evidence for a precursor relationship. Proc Natl Acad Sci U S A, 1971. 68(6): p. 1336-40. 14. Berk, A.J., Discovery of RNA splicing and genes in pieces. Proc Natl Acad Sci U S A, 2016. 113(4): p. 801-5. 15. Berget, S.M., C. Moore, and P.A. Sharp, Spliced segments at the 5' terminus of adenovirus 2 late mRNA. Proc Natl Acad Sci U S A, 1977. 74(8): p. 3171-5. 16. Chow, L.T., et al., An amazing sequence arrangement at the 5' ends of adenovirus 2 messenger RNA. Cell, 1977. 12(1): p. 1-8. 17. Gilbert, W., Why genes in pieces? Nature, 1978. 271(5645): p. 501. 18. Mount, S.M., A catalogue of splice junction sequences. Nucleic Acids Res, 1982. 10(2): p. 459-72. 19. Sheth, N., et al., Comprehensive splice-site analysis using comparative . Nucleic Acids Res, 2006. 34(14): p. 3955-67. 20. Patel, A.A. and J.A. Steitz, Splicing double: insights from the second spliceosome. Nat Rev Mol Cell Biol, 2003. 4(12): p. 960-70. 21. Reed, R. and T. Maniatis, Intron sequences involved in lariat formation during pre- mRNA splicing. Cell, 1985. 41(1): p. 95-105. 22. Breathnach, R. and P. Chambon, Organization and expression of eucaryotic split genes coding for proteins. Annu Rev Biochem, 1981. 50: p. 349-83. 23. Fouser, L.A. and J.D. Friesen, Mutations in a yeast intron demonstrate the importance of specific conserved nucleotides for the two stages of nuclear mRNA splicing. Cell, 1986. 45(1): p. 81-93. 24. Reed, R. and T. Maniatis, The role of the mammalian branchpoint sequence in pre-mRNA splicing. Genes Dev, 1988. 2(10): p. 1268-76. 25. Lerner, M.R., et al., Are snRNPs involved in splicing? Nature, 1980. 283(5743): p. 220-4. 26. Rogers, J. and R. Wall, A mechanism for RNA splicing. Proc Natl Acad Sci U S A, 1980. 77(4): p. 1877-9. 27. Lerner, M.R. and J.A. Steitz, Antibodies to small nuclear RNAs complexed with proteins are produced by patients with systemic lupus erythematosus. Proc Natl Acad Sci U S A, 1979. 76(11): p. 5495-9.

213

28. Mount, S.M., et al., The U1 small nuclear RNA-protein complex selectively binds a 5' splice site in vitro. Cell, 1983. 33(2): p. 509-18. 29. Krainer, A.R. and T. Maniatis, Multiple factors including the small nuclear ribonucleoproteins U1 and U2 are necessary for pre-mRNA splicing in vitro. Cell, 1985. 42(3): p. 725-36. 30. Black, D.L., B. Chabot, and J.A. Steitz, U2 as well as U1 small nuclear ribonucleoproteins are involved in premessenger RNA splicing. Cell, 1985. 42(3): p. 737- 50. 31. Berget, S.M. and B.L. Robberson, U1, U2, and U4/U6 small nuclear ribonucleoproteins are required for in vitro splicing but not polyadenylation. Cell, 1986. 46(5): p. 691-6. 32. Grabowski, P.J. and P.A. Sharp, Affinity chromatography of splicing complexes: U2, U5, and U4 + U6 small nuclear ribonucleoprotein particles in the spliceosome. Science, 1986. 233(4770): p. 1294-9. 33. Konarska, M.M. and P.A. Sharp, Interactions between small nuclear ribonucleoprotein particles in formation of spliceosomes. Cell, 1987. 49(6): p. 763-74. 34. Kolossova, I. and R.A. Padgett, U11 snRNA interacts in vivo with the 5' splice site of U12-dependent (AU-AC) pre-mRNA introns. RNA, 1997. 3(3): p. 227-33. 35. Sharp, P.A. and C.B. Burge, Classification of introns: U2-type or U12-type. Cell, 1997. 91(7): p. 875-9. 36. Maschhoff, K.L. and R.A. Padgett, The stereochemical course of the first step of pre- mRNA splicing. Nucleic Acids Res, 1993. 21(23): p. 5456-62. 37. Moore, M.J. and P.A. Sharp, Evidence for two active sites in the spliceosome provided by stereochemistry of pre-mRNA splicing. Nature, 1993. 365(6444): p. 364-8. 38. Valcarcel, J., et al., Interaction of U2AF65 RS region with pre-mRNA branch point and promotion of base pairing with U2 snRNA [corrected]. Science, 1996. 273(5282): p. 1706-9. 39. Ruskin, B., P.D. Zamore, and M.R. Green, A factor, U2AF, is required for U2 snRNP binding and splicing complex assembly. Cell, 1988. 52(2): p. 207-19. 40. Zamore, P.D. and M.R. Green, Identification, purification, and biochemical characterization of U2 small nuclear ribonucleoprotein auxiliary factor. Proc Natl Acad Sci U S A, 1989. 86(23): p. 9243-7.

214

41. Query, C.C., M.J. Moore, and P.A. Sharp, Branch nucleophile selection in pre-mRNA splicing: evidence for the bulged duplex model. Genes Dev, 1994. 8(5): p. 587-97. 42. Behrens, S.E. and R. Luhrmann, Immunoaffinity purification of a [U4/U6.U5] tri-snRNP from human cells. Genes Dev, 1991. 5(8): p. 1439-52. 43. Lamond, A.I., et al., Spliceosome assembly involves the binding and release of U4 small nuclear ribonucleoprotein. Proc Natl Acad Sci U S A, 1988. 85(2): p. 411-5. 44. Madhani, H.D. and C. Guthrie, A novel base-pairing interaction between U2 and U6 snRNAs suggests a mechanism for the catalytic activation of the spliceosome. Cell, 1992. 71(5): p. 803-17. 45. Datta, B. and A.M. Weiner, Genetic evidence for base pairing between U2 and U6 snRNA in mammalian mRNA splicing. Nature, 1991. 352(6338): p. 821-4. 46. Hausner, T.P., L.M. Giglio, and A.M. Weiner, Evidence for base-pairing between mammalian U2 and U6 small nuclear ribonucleoprotein particles. Genes Dev, 1990. 4(12A): p. 2146-56. 47. Wu, J.A. and J.L. Manley, Base pairing between U2 and U6 snRNAs is necessary for splicing of a mammalian pre-mRNA. Nature, 1991. 352(6338): p. 818-21. 48. Yean, S.L. and R.J. Lin, U4 small nuclear RNA dissociates from a yeast spliceosome and does not participate in the subsequent splicing reaction. Mol Cell Biol, 1991. 11(11): p. 5571-7. 49. Black, D.L. and J.A. Steitz, Pre-mRNA splicing in vitro requires intact U4/U6 small nuclear ribonucleoprotein. Cell, 1986. 46(5): p. 697-704. 50. Newman, A.J. and C. Norman, U5 snRNA interacts with exon sequences at 5' and 3' splice sites. Cell, 1992. 68(4): p. 743-54. 51. Newman, A.J., The role of U5 snRNP in pre-mRNA splicing. EMBO J, 1997. 16(19): p. 5797-800. 52. Sontheimer, E.J. and J.A. Steitz, The U5 and U6 small nuclear RNAs as components of the spliceosome. Science, 1993. 262(5142): p. 1989-96. 53. Senapathy, P., M.B. Shapiro, and N.L. Harris, Splice junctions, branch point sites, and exons: sequence statistics, identification, and applications to genome project. Methods Enzymol, 1990. 183: p. 252-78.

215

54. Shapiro, M.B. and P. Senapathy, RNA splice junctions of different classes of eukaryotes: sequence statistics and functional implications in gene expression. Nucleic Acids Res, 1987. 15(17): p. 7155-74. 55. Jackson, I.J., A reappraisal of non-consensus mRNA splice sites. Nucleic Acids Res, 1991. 19(14): p. 3795-8. 56. Tarn, W.Y. and J.A. Steitz, A novel spliceosome containing U11, U12, and U5 snRNPs excises a minor class (AT-AC) intron in vitro. Cell, 1996. 84(5): p. 801-11. 57. Hall, S.L. and R.A. Padgett, Conserved sequences in a class of rare eukaryotic nuclear introns with non-consensus splice sites. J Mol Biol, 1994. 239(3): p. 357-65. 58. Alioto, T.S., U12DB: a database of orthologous U12-type spliceosomal introns. Nucleic Acids Res, 2007. 35(Database issue): p. D110-5. 59. Montzka, K.A. and J.A. Steitz, Additional low-abundance human small nuclear ribonucleoproteins: U11, U12, etc. Proc Natl Acad Sci U S A, 1988. 85(23): p. 8885-9. 60. Hall, S.L. and R.A. Padgett, Requirement of U12 snRNA for in vivo splicing of a minor class of eukaryotic nuclear pre-mRNA introns. Science, 1996. 271(5256): p. 1716-8. 61. Tarn, W.Y., T.A. Yario, and J.A. Steitz, U12 snRNA in vertebrates: evolutionary conservation of 5' sequences implicated in splicing of pre-mRNAs containing a minor class of introns. RNA, 1995. 1(6): p. 644-56. 62. Ruskin, B., J.M. Greene, and M.R. Green, Cryptic branch point activation allows accurate in vitro splicing of human beta-globin intron mutants. Cell, 1985. 41(3): p. 833- 44. 63. Christofori, G. and W. Keller, 3' cleavage and polyadenylation of mRNA precursors in vitro requires a poly(A) polymerase, a , and a snRNP. Cell, 1988. 54(6): p. 875-89. 64. Yu, Y.T. and J.A. Steitz, Site-specific crosslinking of mammalian U11 and u6atac to the 5' splice site of an AT-AC intron. Proc Natl Acad Sci U S A, 1997. 94(12): p. 6030-5. 65. Frilander, M.J. and J.A. Steitz, Initial recognition of U12-dependent introns requires both U11/5' splice-site and U12/branchpoint interactions. Genes Dev, 1999. 13(7): p. 851-63. 66. Tarn, W.Y. and J.A. Steitz, Highly diverged U4 and U6 small nuclear RNAs required for splicing rare AT-AC introns. Science, 1996. 273(5283): p. 1824-32.

216

67. Padgett, R.A. and G.C. Shukla, A revised model for U4atac/U6atac snRNA base pairing. RNA, 2002. 8(2): p. 125-8. 68. Incorvaia, R. and R.A. Padgett, Base pairing with U6atac snRNA is required for 5' splice site activation of U12-dependent introns in vivo. RNA, 1998. 4(6): p. 709-18. 69. Will, C.L., et al., Identification of both shared and distinct proteins in the major and minor spliceosomes. Science, 1999. 284(5422): p. 2003-5. 70. Schneider, C., et al., Human U4/U6.U5 and U4atac/U6atac.U5 tri-snRNPs exhibit similar protein compositions. Mol Cell Biol, 2002. 22(10): p. 3219-29. 71. Will, C.L., et al., The human 18S U11/U12 snRNP contains a set of novel proteins not found in the U2-dependent spliceosome. RNA, 2004. 10(6): p. 929-41. 72. Benecke, H., R. Luhrmann, and C.L. Will, The U11/U12 snRNP 65K protein acts as a molecular bridge, binding the U12 snRNA and U11-59K protein. EMBO J, 2005. 24(17): p. 3057-69. 73. Turunen, J.J., et al., The U11-48K protein contacts the 5' splice site of U12-type introns and the U11-59K protein. Mol Cell Biol, 2008. 28(10): p. 3548-60. 74. Kim, W.Y., et al., The Arabidopsis U12-type spliceosomal protein U11/U12-31K is involved in U12 intron splicing via RNA chaperone activity and affects plant development. Plant Cell, 2010. 22(12): p. 3951-62. 75. Park, S.J., et al., Structural features important for the U12 snRNA binding and minor spliceosome assembly of Arabidopsis U11/U12-small nuclear ribonucleoproteins. RNA Biol, 2016. 13(7): p. 670-9. 76. Frilander, M.J. and J.A. Steitz, Dynamic exchanges of RNA interactions leading to catalytic core formation in the U12-dependent spliceosome. Mol Cell, 2001. 7(1): p. 217- 26. 77. Burge, C.B., R.A. Padgett, and P.A. Sharp, Evolutionary fates and origins of U12-type introns. Mol Cell, 1998. 2(6): p. 773-785. 78. Zhu, W. and V. Brendel, Identification, characterization and molecular phylogeny of U12-dependent introns in the Arabidopsis thaliana genome. Nucleic Acids Res, 2003. 31(15): p. 4561-72. 79. Wu, H.J., et al., Non-canonical introns are at least 10(9) years old. Nat Genet, 1996. 14(4): p. 383-4.

217

80. Russell, A.G., et al., An early evolutionary origin for the minor spliceosome. Nature, 2006. 443(7113): p. 863-6. 81. Davila Lopez, M., M.A. Rosenblad, and T. Samuelsson, Computational screen for spliceosomal RNA genes aids in defining the phylogenetic distribution of major and minor spliceosomal components. Nucleic Acids Res, 2008. 36(9): p. 3001-10. 82. Bartschat, S. and T. Samuelsson, U12 type introns were lost at multiple occasions during evolution. BMC Genomics, 2010. 11: p. 106. 83. Janice, J., et al., U12-type spliceosomal introns of Insecta. Int J Biol Sci, 2012. 8(3): p. 344-52. 84. Basu, M.K., et al., U12 intron positions are more strongly conserved between animals and plants than U2 intron positions. Biol Direct, 2008. 3: p. 19. 85. Mount, S.M., AT-AC introns: an ATtACk on dogma. Science, 1996. 271(5256): p. 1690- 2. 86. Patel, A.A., M. McCarthy, and J.A. Steitz, The splicing of U12-type introns can be a rate-limiting step in gene expression. EMBO J, 2002. 21(14): p. 3804-15. 87. Hirose, T., M.D. Shu, and J.A. Steitz, Splicing of U12-type introns deposits an exon junction complex competent to induce nonsense-mediated mRNA decay. Proc Natl Acad Sci U S A, 2004. 101(52): p. 17976-81. 88. Younis, I., et al., Minor introns are embedded molecular switches regulated by highly unstable U6atac snRNA. Elife, 2013. 2: p. e00780. 89. Otake, L.R., et al., The divergent U12-type spliceosome is required for pre-mRNA splicing and is essential for development in Drosophila. Mol Cell, 2002. 9(2): p. 439-46. 90. Scamborova, P., A. Wong, and J.A. Steitz, An intronic enhancer regulates splicing of the twintron of Drosophila melanogaster prospero pre-mRNA by two different spliceosomes. Mol Cell Biol, 2004. 24(5): p. 1855-69. 91. Janice, J., M. Jakalski, and W. Makalowski, Surprisingly high number of Twintrons in vertebrates. Biol Direct, 2013. 8: p. 4. 92. Szczesniak, M.W., et al., ERISdb: a database of plant splice sites and splicing signals. Plant Cell Physiol, 2013. 54(2): p. e10.

218

93. Wu, Q. and A.R. Krainer, AT-AC pre-mRNA splicing mechanisms and conservation of minor introns in voltage-gated ion channel genes. Mol Cell Biol, 1999. 19(5): p. 3225- 36. 94. Catterall, W.A., A.L. Goldin, and S.G. Waxman, International Union of Pharmacology. XLVII. Nomenclature and structure-function relationships of voltage-gated sodium channels. Pharmacol Rev, 2005. 57(4): p. 397-409. 95. Buraei, Z. and J. Yang, The ss subunit of voltage-gated Ca2+ channels. Physiol Rev, 2010. 90(4): p. 1461-506. 96. Davies, A., et al., Functional biology of the alpha(2)delta subunits of voltage-gated calcium channels. Trends Pharmacol Sci, 2007. 28(5): p. 220-8. 97. Muratoglu, S., et al., Primary structure of human matrilin-2, chromosome location of the MATN2 gene and conservation of an AT-AC intron in matrilin genes. Cytogenet Cell Genet, 2000. 90(3-4): p. 323-7. 98. Klatt, A.R., et al., The matrilins: modulators of extracellular matrix assembly. Int J Biochem Cell Biol, 2011. 43(3): p. 320-30. 99. Levine, A. and R. Durbin, A computational scan for U12-dependent introns in the human genome sequence. Nucleic Acids Res, 2001. 29(19): p. 4006-13. 100. Cargnello, M. and P.P. Roux, Activation and function of the MAPKs and their substrates, the MAPK-activated protein kinases. Microbiol Mol Biol Rev, 2011. 75(1): p. 50-83. 101. Dimova, D.K. and N.J. Dyson, The E2F transcriptional network: old acquaintances with new faces. Oncogene, 2005. 24(17): p. 2810-26. 102. Kurimchak, A. and X. Grana, PP2A: more than a reset switch to activate pRB proteins during the cell cycle and in response to signaling cues. Cell Cycle, 2015. 14(1): p. 18-30. 103. Madan, V., et al., Aberrant splicing of U12-type introns is the hallmark of ZRSR2 mutant myelodysplastic syndrome. Nat Commun, 2015. 6: p. 6042. 104. Gault, C.M., et al., Aberrant splicing in maize rough endosperm3 reveals a conserved role for U12 splicing in eukaryotic multicellular development. Proc Natl Acad Sci U S A, 2017. 114(11): p. E2195-E2204. 105. Chang, W.C., et al., Alternative splicing and bioinformatic analysis of human U12-type introns. Nucleic Acids Res, 2007. 35(6): p. 1833-41.

219

106. Merico, D., et al., Compound heterozygous mutations in the noncoding RNU4ATAC cause Roifman Syndrome by disrupting minor intron splicing. Nat Commun, 2015. 6: p. 8718. 107. He, H., et al., Mutations in U4atac snRNA, a component of the minor spliceosome, in the developmental disorder MOPD I. Science, 2011. 332(6026): p. 238-40. 108. Edery, P., et al., Association of TALS developmental disorder with defect in minor splicing component U4atac snRNA. Science, 2011. 332(6026): p. 240-3. 109. Pessa, H.K., et al., Gene expression profiling of U12-type spliceosome mutant Drosophila reveals widespread changes in metabolic pathways. PLoS One, 2010. 5(10): p. e13215. 110. McClung, J.K., et al., Prohibitin: potential role in senescence, development, and tumor suppression. Exp Gerontol, 1995. 30(2): p. 99-124. 111. Kwak, K.J., et al., The minor spliceosomal protein U11/U12-31K is an RNA chaperone crucial for U12 intron splicing and the development of dicot and monocot plants. PLoS One, 2012. 7(8): p. e43707. 112. Jung, H.J. and H. Kang, The Arabidopsis U11/U12-65K is an indispensible component of minor spliceosome and plays a crucial role in U12 intron splicing and plant development. Plant J, 2014. 78(5): p. 799-810. 113. Markmiller, S., et al., Minor class splicing shapes the zebrafish transcriptome during development. Proc Natl Acad Sci U S A, 2014. 111(8): p. 3062-7. 114. Doggett, K., et al., Early developmental arrest and impaired gastrointestinal homeostasis in U12-dependent splicing-defective Rnpc3-deficient mice. RNA, 2018. 24(12): p. 1856- 1870. 115. Jagannathan-Bogdan, M. and L.I. Zon, Hematopoiesis. Development, 2013. 140(12): p. 2463-7. 116. Baumgartner, M., et al., Minor splicing snRNAs are enriched in the developing mouse CNS and are crucial for survival of differentiating retinal neurons. Dev Neurobiol, 2015. 75(9): p. 895-907. 117. Baumgartner, M., et al., Minor spliceosome inactivation causes microcephaly, owing to cell cycle defects and death of self-amplifying radial glial cells. Development, 2018. 145(17).

220

118. Lotti, F., et al., An SMN-dependent U12 splicing event essential for motor circuit function. Cell, 2012. 151(2): p. 440-54. 119. Argente, J., et al., Defective minor spliceosome mRNA processing results in isolated familial growth hormone deficiency. EMBO Mol Med, 2014. 6(3): p. 299-306. 120. Reber, S., et al., Minor intron splicing is regulated by FUS and affected by ALS- associated FUS mutants. EMBO J, 2016. 35(14): p. 1504-21. 121. Elsaid, M.F., et al., Mutation in noncoding RNA RNU12 causes early onset cerebellar ataxia. Ann Neurol, 2017. 81(1): p. 68-78. 122. Farach, L.S., et al., The expanding phenotype of RNU4ATAC pathogenic variants to Lowry Wood syndrome. Am J Med Genet A, 2018. 176(2): p. 465-469. 123. Xu, G.J., et al., Systematic autoantigen analysis identifies a distinct subtype of scleroderma with coincident cancer. Proc Natl Acad Sci U S A, 2016. 113(47): p. E7526- E7534. 124. Putoux, A., et al., Refining the phenotypical and mutational spectrum of Taybi-Linder syndrome. Clin Genet, 2016. 90(6): p. 550-555. 125. Nagy, R., et al., Microcephalic osteodysplastic primordial dwarfism type I with biallelic mutations in the RNU4ATAC gene. Clin Genet, 2012. 82(2): p. 140-6. 126. Sigaudy, S., et al., Microcephalic osteodysplastic primordial dwarfism Taybi-Linder type: report of four cases and review of the literature. Am J Med Genet, 1998. 80(1): p. 16-24. 127. Kroigard, A.B., et al., Two novel mutations in RNU4ATAC in two siblings with an atypical mild phenotype of microcephalic osteodysplastic primordial dwarfism type 1. Clin Dysmorphol, 2016. 25(2): p. 68-72. 128. Abdel-Salam, G.M., et al., Further delineation of the clinical spectrum in RNU4ATAC related microcephalic osteodysplastic primordial dwarfism type I. Am J Med Genet A, 2013. 161A(8): p. 1875-81. 129. Kilic, E., et al., A novel mutation in RNU4ATAC in a patient with microcephalic osteodysplastic primordial dwarfism type I. Am J Med Genet A, 2015. 167A(4): p. 919- 21.

221

130. Wang, Y., et al., Identification of compound heterozygous variants in the noncoding RNU4ATAC gene in a Chinese family with two successive foetuses with severe microcephaly. Hum Genomics, 2018. 12(1): p. 3. 131. Nottrott, S., et al., Functional interaction of a novel 15.5kD [U4/U6.U5] tri-snRNP protein with the 5' stem-loop of U4 snRNA. EMBO J, 1999. 18(21): p. 6119-33. 132. Nottrott, S., H. Urlaub, and R. Luhrmann, Hierarchical, clustered protein interactions with U4/U6 snRNA: a biochemical role for U4/U6 proteins. EMBO J, 2002. 21(20): p. 5527-38. 133. Liu, S., et al., Structural basis for the dual U4 and U4atac snRNA-binding specificity of spliceosomal protein hPrp31. RNA, 2011. 17(9): p. 1655-63. 134. Jafarifar, F., et al., Biochemical defects in minor spliceosome function in the developmental disorder MOPD I. RNA, 2014. 20(7): p. 1078-89. 135. Kiss, T., Biogenesis of small nuclear RNPs. J Cell Sci, 2004. 117(Pt 25): p. 5949-51. 136. Abdel-Salam, G.M., et al., Long-term survival in microcephalic osteodysplastic primordial dwarfism type I: Evaluation of an 18-year-old male with g.55G>A homozygous mutation in RNU4ATAC. Am J Med Genet A, 2016. 170A(1): p. 277-82. 137. Heremans, J., et al., Abnormal differentiation of B cells and megakaryocytes in patients with Roifman syndrome. J Allergy Clin Immunol, 2018. 142(2): p. 630-646. 138. Robertson, S.P., C. Rodda, and A. Bankier, Hypogonadotrophic hypogonadism in Roifman syndrome. Clin Genet, 2000. 57(6): p. 435-8. 139. Dinur Schejter, Y., Merico, D., Manson, D., Reid, B., Vong, L., A novel mutation in Roifman syndrome redefines the boundaries of the Sm protein-binding site. LymphoSign Journal, 2016. 3(4): p. 159-163. 140. Dinur Schejter, Y., et al., A homozygous mutation in the stem II domain of RNU4ATAC causes typical Roifman syndrome. NPJ Genom Med, 2017. 2: p. 23. 141. Lowry, R.B. and B.J. Wood, Syndrome of epiphyseal dysplasia, short stature, microcephaly and nystagmus. Clin Genet, 1975. 8(4): p. 269-74. 142. Lowry, R.B., et al., Epiphyseal dysplasia, microcephaly, nystagmus, and retinitis pigmentosa. Am J Med Genet, 1989. 33(3): p. 341-5.

222

143. Shelihan, I., et al., Lowry-Wood syndrome: further evidence of association with RNU4ATAC, and correlation between genotype and phenotype. Hum Genet, 2018. 137(11-12): p. 905-909. 144. Shukla, G.C., et al., Domains of human U4atac snRNA required for U12-dependent splicing in vivo. Nucleic Acids Res, 2002. 30(21): p. 4650-7. 145. Norppa, A.J., et al., Mutations in the U11/U12-65K protein associated with isolated growth hormone deficiency lead to structural destabilization and impaired binding of U12 snRNA. RNA, 2018. 24(3): p. 396-409. 146. Sikand, K. and G.C. Shukla, Functionally important structural elements of U12 snRNA. Nucleic Acids Res, 2011. 39(19): p. 8531-43. 147. Gabanella, F., et al., Ribonucleoprotein assembly defects correlate with spinal muscular atrophy severity and preferentially affect a subset of spliceosomal snRNPs. PLoS One, 2007. 2(9): p. e921. 148. Lefebvre, S., et al., Identification and characterization of a spinal muscular atrophy- determining gene. Cell, 1995. 80(1): p. 155-65. 149. Workman, E., S.J. Kolb, and D.J. Battle, Spliceosomal small nuclear ribonucleoprotein biogenesis defects and motor neuron selectivity in spinal muscular atrophy. Brain Res, 2012. 1462: p. 93-9. 150. Li, D.K., et al., SMN control of RNP assembly: from post-transcriptional gene regulation to motor neuron disease. Semin Cell Dev Biol, 2014. 32: p. 22-9. 151. Zhang, Z., et al., SMN deficiency causes tissue-specific perturbations in the repertoire of snRNAs and widespread defects in splicing. Cell, 2008. 133(4): p. 585-600. 152. Boulisfane, N., et al., Impaired minor tri-snRNP assembly generates differential splicing defects of U12-type introns in lymphoblasts derived from a type I SMA patient. Hum Mol Genet, 2011. 20(4): p. 641-8. 153. Baumer, D., et al., Alternative splicing events are a late feature of pathology in a mouse model of spinal muscular atrophy. PLoS Genet, 2009. 5(12): p. e1000773. 154. Huo, Q., et al., Splicing changes in SMA mouse motoneurons and SMN-depleted neuroblastoma cells: evidence for involvement of splicing regulatory proteins. RNA Biol, 2014. 11(11): p. 1430-46.

223

155. Zhang, Z., et al., Dysregulation of synaptogenesis genes antecedes motor neuron pathology in spinal muscular atrophy. Proc Natl Acad Sci U S A, 2013. 110(48): p. 19348-53. 156. Brown, R.H. and A. Al-Chalabi, Amyotrophic Lateral Sclerosis. N Engl J Med, 2017. 377(2): p. 162-172. 157. Purice, M.D. and J.P. Taylor, Linking hnRNP Function to ALS and FTD Pathology. Front Neurosci, 2018. 12: p. 326. 158. Wang, I.F., N.M. Reddy, and C.K. Shen, Higher order arrangement of the eukaryotic nuclear bodies. Proc Natl Acad Sci U S A, 2002. 99(21): p. 13583-8. 159. Blokhuis, A.M., et al., Protein aggregation in amyotrophic lateral sclerosis. Acta Neuropathol, 2013. 125(6): p. 777-94. 160. !!! INVALID CITATION !!! 161. Ishihara, T., et al., Decreased number of Gemini of coiled bodies and U12 snRNA level in amyotrophic lateral sclerosis. Hum Mol Genet, 2013. 22(20): p. 4136-47. 162. Highley, J.R., et al., Loss of nuclear TDP-43 in amyotrophic lateral sclerosis (ALS) causes altered expression of splicing machinery and widespread dysregulation of RNA splicing in motor neurones. Neuropathol Appl Neurobiol, 2014. 40(6): p. 670-85. 163. Conte, A., et al., P525L FUS mutation is consistently associated with a severe form of juvenile amyotrophic lateral sclerosis. Neuromuscul Disord, 2012. 22(1): p. 73-5. 164. Sharma, A., et al., ALS-associated mutant FUS induces selective motor neuron degeneration through toxic gain of function. Nat Commun, 2016. 7: p. 10465. 165. Young, R.W., Cell proliferation during postnatal development of the retina in the mouse. Brain Res, 1985. 353(2): p. 229-39. 166. Piccolino, M., Cajal and the retina: a 100-year retrospective. Trends Neurosci, 1988. 11(12): p. 521-5. 167. Remington, L.A., Goodwin, D., Clinical Anatomy and Physiology. 3rd ed. 2011: Butterworth-Heinemann. 588. 168. Sparrow, J.R., D. Hicks, and C.P. Hamel, The retinal pigment epithelium in health and disease. Curr Mol Med, 2010. 10(9): p. 802-23. 169. Boulton, M. and P. Dayhaw-Barker, The role of the retinal pigment epithelium: topographical variation and ageing changes. Eye (Lond), 2001. 15(Pt 3): p. 384-9.

224

170. Hoon, M., et al., Functional architecture of the retina: development and disease. Prog Retin Eye Res, 2014. 42: p. 44-84. 171. Kefalov, V.J., Rod and cone visual pigments and phototransduction through pharmacological, genetic, and physiological approaches. J Biol Chem, 2012. 287(3): p. 1635-41. 172. Jeffries, A.M., N.J. Killian, and J.S. Pezaris, Mapping the primate lateral geniculate nucleus: a review of experiments and methods. J Physiol Paris, 2014. 108(1): p. 3-10. 173. Euler, T., et al., Retinal bipolar cells: elementary building blocks of vision. Nat Rev Neurosci, 2014. 15(8): p. 507-19. 174. Boije, H., et al., Horizontal Cells, the Odd Ones Out in the Retina, Give Insights into Development and Disease. Front Neuroanat, 2016. 10: p. 77. 175. Masland, R.H., The tasks of amacrine cells. Vis Neurosci, 2012. 29(1): p. 3-9. 176. Franze, K., et al., Muller cells are living optical fibers in the vertebrate retina. Proc Natl Acad Sci U S A, 2007. 104(20): p. 8287-92. 177. Bringmann, A., et al., Muller cells in the healthy and diseased retina. Prog Retin Eye Res, 2006. 25(4): p. 397-424. 178. Graw, J., Eye development. Curr Top Dev Biol, 2010. 90: p. 343-86. 179. Colozza, G., M. Locker, and M. Perron, Shaping the eye from embryonic stem cells: Biological and medical implications. World J Stem Cells, 2012. 4(8): p. 80-6. 180. Centanin, L. and J. Wittbrodt, Retinal neurogenesis. Development, 2014. 141(2): p. 241- 4. 181. Amini, R., M. Rocha-Martins, and C. Norden, Neuronal Migration and Lamination in the Vertebrate Retina. Front Neurosci, 2017. 11: p. 742. 182. Blackshaw, S., et al., Genomic analysis of mouse retinal development. PLoS Biol, 2004. 2(9): p. E247. 183. Sidman, R.L., Histogenesis of mouse retina studied with thymidine-H3, in The Structure of the Eye, G.K. Smelser, Editor. 1961, Academic Press: New York. p. 487-506. 184. Young, R.W., Cell differentiation in the retina of the mouse. Anat Rec, 1985. 212(2): p. 199-205. 185. Karunakaran, D.K., et al., The expression analysis of Sfrs10 and Celf4 during mouse retinal development. Gene Expr Patterns, 2013. 13(8): p. 425-36.

225

186. Wetzel, R.K., E. Arystarkhova, and K.J. Sweadner, Cellular and subcellular specification of Na,K-ATPase alpha and beta isoforms in the postnatal development of mouse retina. J Neurosci, 1999. 19(22): p. 9878-89. 187. Olney, J.W., An electron microscopic study of synapse formation, receptor outer segment development, and other aspects of developing mouse retina. Invest Ophthalmol, 1968. 7(3): p. 250-68. 188. Mochida, G.H. and C.A. Walsh, Molecular genetics of human microcephaly. Curr Opin Neurol, 2001. 14(2): p. 151-6. 189. Petersen, C.C. and S. Crochet, Synaptic computation and sensory processing in neocortical layer 2/3. Neuron, 2013. 78(1): p. 28-48. 190. Meyer, H.S., et al., Inhibitory interneurons in a cortical column form hot zones of inhibition in layers 2 and 5A. Proc Natl Acad Sci U S A, 2011. 108(40): p. 16807-12. 191. Hestrin, S. and W.E. Armstrong, Morphology and physiology of cortical neurons in layer I. J Neurosci, 1996. 16(17): p. 5290-300. 192. Larkum, M., A cellular mechanism for cortical associations: an organizing principle for the cerebral cortex. Trends Neurosci, 2013. 36(3): p. 141-51. 193. Greig, L.C., et al., Molecular logic of neocortical projection neuron specification, development and diversity. Nat Rev Neurosci, 2013. 14(11): p. 755-69. 194. Deck, M., et al., Pathfinding of corticothalamic axons relies on a rendezvous with thalamic projections. Neuron, 2013. 77(3): p. 472-84. 195. Sohur, U.S., et al., Anatomic and molecular development of corticostriatal projection neurons in mice. Cereb Cortex, 2014. 24(2): p. 293-303. 196. Lopez-Hidalgo, M., W.B. Hoover, and J. Schummers, Spatial organization of astrocytes in ferret visual cortex. J Comp Neurol, 2016. 524(17): p. 3561-3576. 197. Jakel, S. and L. Dimou, Glial Cells and Their Function in the Adult Brain: A Journey through the History of Their Ablation. Front Cell Neurosci, 2017. 11: p. 24. 198. Kimelberg, H.K., Functions of mature mammalian astrocytes: a current view. Neuroscientist, 2010. 16(1): p. 79-106. 199. Sloan, S.A. and B.A. Barres, Looks can be deceiving: reconsidering the evidence for gliotransmission. Neuron, 2014. 84(6): p. 1112-5.

226

200. Farhy-Tselnicker, I. and N.J. Allen, Astrocytes, neurons, synapses: a tripartite view on cortical circuit development. Neural Dev, 2018. 13(1): p. 7. 201. Savtchouk, I. and A. Volterra, Gliotransmission: Beyond Black-and-White. J Neurosci, 2018. 38(1): p. 14-25. 202. Nishiyama, A., et al., Lineage, fate, and fate potential of NG2-glia. Brain Res, 2016. 1638(Pt B): p. 116-128. 203. Michalski, J.P. and R. Kothary, Oligodendrocytes in a Nutshell. Front Cell Neurosci, 2015. 9: p. 340. 204. Nishiyama, A., et al., Polydendrocytes (NG2 cells): multifunctional cells with lineage plasticity. Nat Rev Neurosci, 2009. 10(1): p. 9-22. 205. Cronk, J.C. and J. Kipnis, Microglia - the brain's busy bees. F1000Prime Rep, 2013. 5: p. 53. 206. Jacobson, A.G. and P.P. Tam, Cephalic neurulation in the mouse embryo analyzed by SEM and morphometry. Anat Rec, 1982. 203(3): p. 375-96. 207. Rallu, M., J.G. Corbin, and G. Fishell, Parsing the prosencephalon. Nat Rev Neurosci, 2002. 3(12): p. 943-51. 208. Gotz, M. and L. Sommer, Cortical development: the art of generating cell diversity. Development, 2005. 132(15): p. 3327-32. 209. Spear, P.C. and C.A. Erickson, Interkinetic nuclear migration: a mysterious process in search of a function. Dev Growth Differ, 2012. 54(3): p. 306-16. 210. Fernandez, V., C. Llinares-Benadero, and V. Borrell, Cerebral cortex expansion and folding: what have we learned? EMBO J, 2016. 35(10): p. 1021-44. 211. Gotz, M. and W.B. Huttner, The cell biology of neurogenesis. Nat Rev Mol Cell Biol, 2005. 6(10): p. 777-88. 212. Taverna, E., M. Gotz, and W.B. Huttner, The cell biology of neurogenesis: toward an understanding of the development and evolution of the neocortex. Annu Rev Cell Dev Biol, 2014. 30: p. 465-502. 213. Reid, C.B., I. Liang, and C. Walsh, Systematic widespread clonal organization in cerebral cortex. Neuron, 1995. 15(2): p. 299-310. 214. Grove, E.A., et al., Multiple restricted lineages in the embryonic rat cerebral cortex. Development, 1993. 117(2): p. 553-61.

227

215. Malatesta, P., E. Hartfuss, and M. Gotz, Isolation of radial glial cells by fluorescent- activated cell sorting reveals a neuronal lineage. Development, 2000. 127(24): p. 5253- 63. 216. Noctor, S.C., et al., Cortical neurons arise in symmetric and asymmetric division zones and migrate through specific phases. Nat Neurosci, 2004. 7(2): p. 136-44. 217. Noctor, S.C., et al., Neurons derived from radial glial cells establish radial units in neocortex. Nature, 2001. 409(6821): p. 714-20. 218. Haubensak, W., et al., Neurons arise in the basal neuroepithelium of the early mammalian telencephalon: a major site of neurogenesis. Proc Natl Acad Sci U S A, 2004. 101(9): p. 3196-201. 219. Miyata, T., et al., Asymmetric production of surface-dividing and non-surface-dividing cortical progenitor cells. Development, 2004. 131(13): p. 3133-45. 220. Kowalczyk, T., et al., Intermediate neuronal progenitors (basal progenitors) produce pyramidal-projection neurons for all layers of cerebral cortex. Cereb Cortex, 2009. 19(10): p. 2439-50. 221. Smart, I.H., Proliferative characteristics of the ependymal layer during the early development of the mouse neocortex: a pilot study based on recording the number, location and plane of cleavage of mitotic figures. J Anat, 1973. 116(Pt 1): p. 67-91. 222. Noctor, S.C., V. Martinez-Cerdeno, and A.R. Kriegstein, Distinct behaviors of neural stem and progenitor cells underlie cortical neurogenesis. J Comp Neurol, 2008. 508(1): p. 28-44. 223. Wang, X., et al., A new subtype of progenitor cell in the mouse embryonic neocortex. Nat Neurosci, 2011. 14(5): p. 555-61. 224. Florio, M., et al., Human-specific gene ARHGAP11B promotes basal progenitor amplification and neocortex expansion. Science, 2015. 347(6229): p. 1465-70. 225. Hoerder-Suabedissen, A. and Z. Molnar, Development, evolution and pathology of neocortical subplate neurons. Nat Rev Neurosci, 2015. 16(3): p. 133-46. 226. Kirischuk, S., H.J. Luhmann, and W. Kilb, Cajal-Retzius cells: update on structural and functional properties of these mystic neurons that bridged the 20th century. Neuroscience, 2014. 275: p. 33-46.

228

227. Derer, P. and M. Derer, Cajal-Retzius cell ontogenesis and death in mouse brain visualized with horseradish peroxidase and electron microscopy. Neuroscience, 1990. 36(3): p. 839-56. 228. Chowdhury, T.G., et al., Fate of cajal-retzius neurons in the postnatal mouse neocortex. Front Neuroanat, 2010. 4: p. 10. 229. Kelsom, C. and W. Lu, Development and specification of GABAergic cortical interneurons. Cell Biosci, 2013. 3(1): p. 19. 230. Ge, W.P. and J.M. Jia, Local production of astrocytes in the cerebral cortex. Neuroscience, 2016. 323: p. 3-9. 231. Ge, W.P., et al., Local generation of glia is a major astrocyte source in postnatal cortex. Nature, 2012. 484(7394): p. 376-80. 232. Kessaris, N., et al., Competing waves of oligodendrocytes in the forebrain and postnatal elimination of an embryonic lineage. Nat Neurosci, 2006. 9(2): p. 173-9. 233. Arno, B., et al., Neural progenitor cells orchestrate microglia migration and positioning into the developing cortex. Nat Commun, 2014. 5: p. 5611. 234. Prinz, M. and J. Priller, Microglia and brain macrophages in the molecular age: from origin to neuropsychiatric disease. Nat Rev Neurosci, 2014. 15(5): p. 300-12. 235. Gorski, J.A., et al., Cortical excitatory neurons and glia, but not GABAergic neurons, are produced in the Emx1-expressing lineage. J Neurosci, 2002. 22(15): p. 6309-14. 236. Matter, N. and H. Konig, Targeted 'knockdown' of spliceosome function in mammalian cells. Nucleic Acids Res, 2005. 33(4): p. e41. 237. Karunakaran, D.K. and R. Kanadia, In Vivo and Explant Electroporation of Morpholinos in the Developing Mouse Retina. Methods Mol Biol, 2017. 1565: p. 215-227. 238. Marz, M., T. Kirsten, and P.F. Stadler, Evolution of spliceosomal snRNA genes in metazoan animals. J Mol Evol, 2008. 67(6): p. 594-607. 239. Kent, W.J., BLAT--the BLAST-like alignment tool. Genome Res, 2002. 12(4): p. 656-64. 240. Yong, J., et al., snRNAs contain specific SMN-binding domains that are essential for snRNP assembly. Mol Cell Biol, 2004. 24(7): p. 2747-56. 241. Karunakaran, D.K., et al., Network-based bioinformatics analysis of spatio-temporal RNA-Seq data reveals transcriptional programs underpinning normal and aberrant retinal development. BMC Genomics, 2016. 17 Suppl 5: p. 495.

229

242. Trimarchi, J.M., et al., Molecular heterogeneity of developing retinal ganglion and amacrine cells revealed through single cell gene expression profiling. J Comp Neurol, 2007. 502(6): p. 1047-65. 243. Huang da, W., B.T. Sherman, and R.A. Lempicki, Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc, 2009. 4(1): p. 44-57. 244. Huang da, W., B.T. Sherman, and R.A. Lempicki, Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res, 2009. 37(1): p. 1-13. 245. Sturn, A., J. Quackenbush, and Z. Trajanoski, Genesis: cluster analysis of microarray data. Bioinformatics, 2002. 18(1): p. 207-8. 246. McKinstry, S.U., et al., Huntingtin is required for normal excitatory synapse development in cortical and striatal circuits. J Neurosci, 2014. 34(28): p. 9455-72. 247. Fujita, Y., et al., Tomosyn: a syntaxin-1-binding protein that forms a novel complex in the neurotransmitter release process. Neuron, 1998. 20(5): p. 905-15. 248. Kent, W.J., et al., The human genome browser at UCSC. Genome Res, 2002. 12(6): p. 996-1006. 249. Girard, A., et al., A germline-specific class of small RNAs binds mammalian Piwi proteins. Nature, 2006. 442(7099): p. 199-202. 250. Flicek, P., et al., Ensembl 2014. Nucleic Acids Res, 2014. 42(Database issue): p. D749- 55. 251. Banday, A.R., et al., Replication-dependent histone genes are actively transcribed in differentiating and aging retinal neurons. Cell Cycle, 2014. 13(16): p. 2526-41. 252. Kanadia, R.N., et al., Developmental expression of mouse muscleblind genes Mbnl1, Mbnl2 and Mbnl3. Gene Expr Patterns, 2003. 3(4): p. 459-62. 253. Kanadia, R.N., et al., Temporal requirement of the alternative-splicing factor Sfrs1 for the survival of retinal neurons. Development, 2008. 135(23): p. 3923-33. 254. Irizarry, R.A., et al., Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res, 2003. 31(4): p. e15. 255. Irizarry, R.A., et al., Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics, 2003. 4(2): p. 249-64.

230

256. Gautier, L., et al., affy--analysis of Affymetrix GeneChip data at the probe level. Bioinformatics, 2004. 20(3): p. 307-15. 257. Baumgartner, M., Olthof, A. M., Hyatt, K. C., Lemoine, C., Drake, K., Sturrock, N., Nguyen, N., Al Seesi, S., Kanadia, R. N., Minor spliceosome inactivation in the developing mouse cortex causes self-amplifying radial glial cell death and microcephaly. bioRxiv, 2017. 182816. 258. Homem, C.C., M. Repic, and J.A. Knoblich, Proliferation control in neural stem and progenitor cells. Nat Rev Neurosci, 2015. 16(11): p. 647-59. 259. Sessa, A., et al., Tbr2 directs conversion of radial glia into basal precursors and guides neuronal amplification by indirect neurogenesis in the developing neocortex. Neuron, 2008. 60(1): p. 56-69. 260. Laguesse, S., et al., A Dynamic Unfolded Protein Response Contributes to the Control of Cortical Neurogenesis. Dev Cell, 2015. 35(5): p. 553-567. 261. Matson, J.P. and J.G. Cook, Cell cycle proliferation decisions: the impact of single cell analyses. FEBS J, 2017. 284(3): p. 362-375. 262. Nurse, P., Y. Masui, and L. Hartwell, Understanding the cell cycle. Nat Med, 1998. 4(10): p. 1103-6. 263. Conduit, P.T., A. Wainman, and J.W. Raff, Centrosome function and assembly in animal cells. Nat Rev Mol Cell Biol, 2015. 16(10): p. 611-24. 264. Brooker, A.S. and K.M. Berkowitz, The roles of cohesins in mitosis, meiosis, and human health and disease. Methods Mol Biol, 2014. 1170: p. 229-66. 265. Barnum, K.J. and M.J. O'Connell, Cell cycle regulation by checkpoints. Methods Mol Biol, 2014. 1170: p. 29-40. 266. Nurse, P., Universal control mechanism regulating onset of M-phase. Nature, 1990. 344(6266): p. 503-8. 267. Nigg, E.A., Mitotic kinases as regulators of cell division and its checkpoints. Nat Rev Mol Cell Biol, 2001. 2(1): p. 21-32. 268. Scholey, J.M., I. Brust-Mascher, and A. Mogilner, Cell division. Nature, 2003. 422(6933): p. 746-52. 269. Cheeseman, I.M., The kinetochore. Cold Spring Harb Perspect Biol, 2014. 6(7): p. a015826.

231

270. Primorac, I. and A. Musacchio, Panta rhei: the APC/C at steady state. J Cell Biol, 2013. 201(2): p. 177-89. 271. Musacchio, A. and E.D. Salmon, The spindle-assembly checkpoint in space and time. Nat Rev Mol Cell Biol, 2007. 8(5): p. 379-93. 272. Foley, E.A. and T.M. Kapoor, Microtubule attachment and spindle assembly checkpoint signalling at the kinetochore. Nat Rev Mol Cell Biol, 2013. 14(1): p. 25-37. 273. Green, R.A., E. Paluch, and K. Oegema, Cytokinesis in animal cells. Annu Rev Cell Dev Biol, 2012. 28: p. 29-58. 274. Gilmore, E.C. and C.A. Walsh, Genetic causes of microcephaly and lessons for neuronal development. Wiley Interdiscip Rev Dev Biol, 2013. 2(4): p. 461-78. 275. Jayaraman, D., B.I. Bae, and C.A. Walsh, The Genetics of Primary Microcephaly. Annu Rev Genomics Hum Genet, 2018. 19: p. 177-200. 276. Jayaraman, D., et al., Microcephaly Proteins Wdr62 and Aspm Define a Mother Centriole Complex Regulating Centriole Biogenesis, Apical Complex, and Cell Fate. Neuron, 2016. 92(4): p. 813-828. 277. Bazzi, H. and K.V. Anderson, Acentriolar mitosis activates a p53-dependent apoptosis pathway in the mouse embryo. Proc Natl Acad Sci U S A, 2014. 111(15): p. E1491-500. 278. Insolera, R., et al., Cortical neurogenesis in the absence of centrioles. Nat Neurosci, 2014. 17(11): p. 1528-35. 279. Mirzaa, G.M., et al., Mutations in CENPE define a novel kinetochore-centromeric mechanism for microcephalic primordial dwarfism. Hum Genet, 2014. 133(8): p. 1023- 39. 280. Genin, A., et al., Kinetochore KMN network gene CASC5 mutated in primary microcephaly. Hum Mol Genet, 2012. 21(24): p. 5306-17. 281. Mitchell, B.D., et al., Aberrant apoptosis in the neurological mutant Flathead is associated with defective cytokinesis of neural progenitor cells. Brain Res Dev Brain Res, 2001. 130(1): p. 53-63. 282. Sarkisian, M.R., et al., Citron-kinase, a protein essential to cytokinesis in neuronal progenitors, is deleted in the flathead mutant rat. J Neurosci, 2002. 22(8): p. RC217. 283. D'Avino, P.P., Citron kinase - renaissance of a neglected mitotic kinase. J Cell Sci, 2017. 130(10): p. 1701-1708.

232

284. Moawia, A., et al., Mutations of KIF14 cause primary microcephaly by impairing cytokinesis. Ann Neurol, 2017. 82(4): p. 562-577. 285. Gruneberg, U., et al., KIF14 and citron kinase act together to promote efficient cytokinesis. J Cell Biol, 2006. 172(3): p. 363-72. 286. Bicknell, L.S., et al., Mutations in the pre-replication complex cause Meier-Gorlin syndrome. Nat Genet, 2011. 43(4): p. 356-9. 287. Bicknell, L.S., et al., Mutations in ORC1, encoding the largest subunit of the origin recognition complex, cause microcephalic primordial dwarfism resembling Meier-Gorlin syndrome. Nat Genet, 2011. 43(4): p. 350-5. 288. Guernsey, D.L., et al., Mutations in origin recognition complex gene ORC4 cause Meier- Gorlin syndrome. Nat Genet, 2011. 43(4): p. 360-4. 289. Leonard, A.C. and M. Mechali, DNA replication origins. Cold Spring Harb Perspect Biol, 2013. 5(10): p. a010116. 290. Burrage, L.C., et al., De Novo GMNN Mutations Cause Autosomal-Dominant Primordial Dwarfism Associated with Meier-Gorlin Syndrome. Am J Hum Genet, 2015. 97(6): p. 904-13. 291. Fenwick, A.L., et al., Mutations in CDC45, Encoding an Essential Component of the Pre- initiation Complex, Cause Meier-Gorlin Syndrome and Craniosynostosis. Am J Hum Genet, 2016. 99(1): p. 125-38. 292. Calegari, F. and W.B. Huttner, An inhibition of cyclin-dependent kinases that lengthens, but does not arrest, neuroepithelial cell cycle induces premature neurogenesis. J Cell Sci, 2003. 116(Pt 24): p. 4947-55. 293. Pilaz, L.J., et al., Forced G1-phase reduction alters mode of division, neuron number, and laminar phenotype in the cerebral cortex. Proc Natl Acad Sci U S A, 2009. 106(51): p. 21924-9. 294. Pilaz, L.J., et al., Prolonged Mitosis of Neural Progenitors Alters Cell Fate in the Developing Brain. Neuron, 2016. 89(1): p. 83-99. 295. Marjanovic, M., et al., CEP63 deficiency promotes p53-dependent microcephaly and reveals a role for the centrosome in meiotic recombination. Nat Commun, 2015. 6: p. 7676.

233

296. Mao, H., et al., Haploinsufficiency for Core Exon Junction Complex Components Disrupts Embryonic Neurogenesis and Causes p53-Mediated Microcephaly. PLoS Genet, 2016. 12(9): p. e1006282. 297. Janisch, K.M., et al., The vertebrate-specific Kinesin-6, Kif20b, is required for normal cytokinesis of polarized cortical stem cells and cerebral cortex size. Development, 2013. 140(23): p. 4672-82. 298. Little, J.N., Dwyer, N. D., p53 deletion rescues apoptosis and microcephaly in a Kif20b mouse mutant. bioRxiv, 2018. 272393. 299. Breuss, M., et al., Mutations in the beta-tubulin gene TUBB5 cause microcephaly with structural brain abnormalities. Cell Rep, 2012. 2(6): p. 1554-62. 300. Breuss, M., et al., Mutations in the murine homologue of TUBB5 cause microcephaly by perturbing cell cycle progression and inducing p53-associated apoptosis. Development, 2016. 143(7): p. 1126-33. 301. Di Cunto, F., et al., Defective neurogenesis in citron kinase knockout mice by altered cytokinesis and massive apoptosis. Neuron, 2000. 28(1): p. 115-27. 302. Bianchi, F.T., et al., Citron Kinase Deficiency Leads to Chromosomal Instability and TP53-Sensitive Microcephaly. Cell Rep, 2017. 18(7): p. 1674-1686. 303. Moll, U.M. and O. Petrenko, The MDM2-p53 interaction. Mol Cancer Res, 2003. 1(14): p. 1001-8. 304. Dai, C. and W. Gu, p53 post-translational modification: deregulated in tumorigenesis. Trends Mol Med, 2010. 16(11): p. 528-36. 305. O'Brate, A. and P. Giannakakou, The importance of p53 location: nuclear or cytoplasmic zip code? Drug Resist Updat, 2003. 6(6): p. 313-22. 306. Kracikova, M., et al., A threshold mechanism mediates p53 cell fate decision between growth arrest and apoptosis. Cell Death Differ, 2013. 20(4): p. 576-88. 307. Carvajal, L.A. and J.J. Manfredi, Another fork in the road--life or death decisions by the tumour suppressor p53. EMBO Rep, 2013. 14(5): p. 414-21. 308. Moll, U.M., et al., Transcription-independent pro-apoptotic functions of p53. Curr Opin Cell Biol, 2005. 17(6): p. 631-6. 309. Ciccia, A. and S.J. Elledge, The DNA damage response: making it safe to play with knives. Mol Cell, 2010. 40(2): p. 179-204.

234

310. Oberle, C. and C. Blattner, Regulation of the DNA Damage Response to DSBs by Post- Translational Modifications. Curr Genomics, 2010. 11(3): p. 184-98. 311. Xu, X., J. Lee, and D.F. Stern, Microcephalin is a DNA damage response protein involved in regulation of CHK1 and BRCA1. J Biol Chem, 2004. 279(33): p. 34091-4. 312. Liu, X., Z.W. Zhou, and Z.Q. Wang, The DNA damage response molecule MCPH1 in brain development and beyond. Acta Biochim Biophys Sin (Shanghai), 2016. 48(7): p. 678-85. 313. O'Driscoll, M., et al., A splicing mutation affecting expression of ataxia-telangiectasia and Rad3-related protein (ATR) results in Seckel syndrome. Nat Genet, 2003. 33(4): p. 497-501. 314. Sartori, A.A., et al., Human CtIP promotes DNA end resection. Nature, 2007. 450(7169): p. 509-14. 315. Qvist, P., et al., CtIP Mutations Cause Seckel and Jawad Syndromes. PLoS Genet, 2011. 7(10): p. e1002310. 316. Shaheen, R., et al., Genomic analysis of primordial dwarfism reveals novel disease genes. Genome Res, 2014. 24(2): p. 291-9. 317. Pawlowska, E., J. Szczepanska, and J. Blasiak, DNA2-An Important Player in DNA Damage Response or Just Another DNA Maintenance Protein? Int J Mol Sci, 2017. 18(7). 318. Harley, M.E., et al., TRAIP promotes DNA damage response during genome replication and is mutated in primordial dwarfism. Nat Genet, 2016. 48(1): p. 36-43. 319. Evrony, G.D., et al., Integrated genome and transcriptome sequencing identifies a noncoding mutation in the genome replication factor DONSON as the cause of microcephaly-micromelia syndrome. Genome Res, 2017. 27(8): p. 1323-1335. 320. Reynolds, J.J., et al., Mutations in DONSON disrupt replication fork stability and cause microcephalic dwarfism. Nat Genet, 2017. 49(4): p. 537-549. 321. O'Driscoll, M., et al., DNA ligase IV mutations identified in patients exhibiting developmental delay and immunodeficiency. Mol Cell, 2001. 8(6): p. 1175-85. 322. Carney, J.P., et al., The hMre11/hRad50 protein complex and Nijmegen breakage syndrome: linkage of double-strand break repair to the cellular DNA damage response. Cell, 1998. 93(3): p. 477-86.

235

323. Varon, R., et al., Nibrin, a novel DNA double-strand break repair protein, is mutated in Nijmegen breakage syndrome. Cell, 1998. 93(3): p. 467-76. 324. Payne, F., et al., Hypomorphism in human NSMCE2 linked to primordial dwarfism and insulin resistance. J Clin Invest, 2014. 124(9): p. 4028-38. 325. Shen, J., et al., Mutations in PNKP cause microcephaly, seizures and defects in DNA repair. Nat Genet, 2010. 42(3): p. 245-9. 326. Buck, D., et al., Cernunnos, a novel nonhomologous end-joining factor, is mutated in human immunodeficiency with microcephaly. Cell, 2006. 124(2): p. 287-99. 327. Li, R., et al., A distinct response to endogenous DNA damage in the development of Nbs1-deficient cortical neurons. Cell Res, 2012. 22(5): p. 859-72. 328. Ghouzzi, V.E., et al., ZIKA virus elicits P53 activation and genotoxic stress in human neural progenitors similar to mutations involved in severe forms of genetic microcephaly. Cell Death Dis, 2016. 7(10): p. e2440. 329. Cocas, L.A., et al., Emx1-lineage progenitors differentially contribute to neural diversity in the striatum and amygdala. J Neurosci, 2009. 29(50): p. 15933-46. 330. Bartolini, G., G. Ciceri, and O. Marin, Integration of GABAergic interneurons into cortical cell assemblies: lessons from embryos and adults. Neuron, 2013. 79(5): p. 849- 64. 331. Kyrylkova, K., et al., Detection of apoptosis by TUNEL assay. Methods Mol Biol, 2012. 887: p. 41-7. 332. Faleiro, L., et al., Multiple species of CPP32 and Mch2 are the major active caspases present in apoptotic cells. EMBO J, 1997. 16(9): p. 2271-81. 333. Wolf, H.K., et al., NeuN: a useful neuronal marker for diagnostic histopathology. J Histochem Cytochem, 1996. 44(10): p. 1167-71. 334. Scholzen, T. and J. Gerdes, The Ki-67 protein: from the known and the unknown. J Cell Physiol, 2000. 182(3): p. 311-22. 335. Zhang, J. and J. Jiao, Molecular Biomarkers for Embryonic and Adult Neural Stem Cell and Neurogenesis. Biomed Res Int, 2015. 2015: p. 727542. 336. Gotz, M., A. Stoykova, and P. Gruss, Pax6 controls radial glia differentiation in the cerebral cortex. Neuron, 1998. 21(5): p. 1031-44.

236

337. Englund, C., et al., Pax6, Tbr2, and Tbr1 are expressed sequentially by radial glia, intermediate progenitor cells, and postmitotic neurons in developing neocortex. J Neurosci, 2005. 25(1): p. 247-51. 338. Takahashi, T., R.S. Nowakowski, and V.S. Caviness, Jr., The cell cycle of the pseudostratified ventricular epithelium of the embryonic murine cerebral wall. J Neurosci, 1995. 15(9): p. 6046-57. 339. Grant, E., A. Hoerder-Suabedissen, and Z. Molnar, Development of the corticothalamic projections. Front Neurosci, 2012. 6: p. 53. 340. Siegenthaler, J.A. and S.J. Pleasure, We have got you 'covered': how the meninges control brain development. Curr Opin Genet Dev, 2011. 21(3): p. 249-55. 341. Sharma, A., K. Singh, and A. Almasan, Histone H2AX phosphorylation: a marker for DNA damage. Methods Mol Biol, 2012. 920: p. 613-26. 342. Cheng, Q. and J. Chen, Mechanism of p53 stabilization by ATM after DNA damage. Cell Cycle, 2010. 9(3): p. 472-8. 343. Nagy, E. and L.E. Maquat, A rule for termination-codon position within intron- containing genes: when nonsense affects RNA abundance. Trends Biochem Sci, 1998. 23(6): p. 198-9. 344. McCleland, M.L., et al., The vertebrate Ndc80 complex contains Spc24 and Spc25 homologs, which are required to establish and maintain kinetochore-microtubule attachment. Curr Biol, 2004. 14(2): p. 131-7. 345. Gurley, L.R., et al., Histone phosphorylation and chromatin structure during mitosis in Chinese hamster cells. Eur J Biochem, 1978. 84(1): p. 1-15. 346. Sun, L., et al., EB1 promotes Aurora-B kinase activity through blocking its inactivation by protein phosphatase 2A. Proc Natl Acad Sci U S A, 2008. 105(20): p. 7153-8. 347. Paramasivam, M., Y.J. Chang, and J.J. LoTurco, ASPM and citron kinase co-localize to the midbody ring during cytokinesis. Cell Cycle, 2007. 6(13): p. 1605-12. 348. Chehrehasa, F., et al., EdU, a new thymidine analogue for labelling proliferating cells in the nervous system. J Neurosci Methods, 2009. 177(1): p. 122-30. 349. Clute, P. and J. Pines, Temporal and spatial control of cyclin B1 destruction in metaphase. Nat Cell Biol, 1999. 1(2): p. 82-7.

237

350. Crosio, C., et al., Mitotic phosphorylation of : spatio-temporal regulation by mammalian Aurora kinases. Mol Cell Biol, 2002. 22(3): p. 874-85. 351. Madisen, L., et al., A robust and high-throughput Cre reporting and characterization system for the whole mouse brain. Nat Neurosci, 2010. 13(1): p. 133-40. 352. Williams, A.B. and B. Schumacher, p53 in the DNA-Damage-Repair Process. Cold Spring Harb Perspect Med, 2016. 6(5). 353. Uetake, Y. and G. Sluder, Prolonged prometaphase blocks daughter cell proliferation despite normal completion of mitosis. Curr Biol, 2010. 20(18): p. 1666-71. 354. Ganem, N.J., et al., Cytokinesis failure triggers hippo tumor suppressor pathway activation. Cell, 2014. 158(4): p. 833-848. 355. Lakso, M., et al., Efficient in vivo manipulation of mouse genomic sequences at the zygote stage. Proc Natl Acad Sci U S A, 1996. 93(12): p. 5860-5. 356. Hutchinson, J.N., et al., A screen for nuclear transcripts identifies two linked noncoding RNAs associated with SC35 splicing domains. BMC Genomics, 2007. 8: p. 39. 357. Gerfen, C.R., Basic neuroanatomical methods. Curr Protoc Neurosci, 2003. Chapter 1: p. Unit 1 1. 358. Karunakaran, D.K., et al., Loss of citron kinase affects a subset of progenitor cells that alters late but not early neurogenesis in the developing rat retina. Invest Ophthalmol Vis Sci, 2015. 56(2): p. 787-98. 359. Liboska, R., et al., Most anti-BrdU antibodies react with 2'-deoxy-5-ethynyluridine -- the method for the effective suppression of this cross-reactivity. PLoS One, 2012. 7(12): p. e51679. 360. Kim, D., B. Langmead, and S.L. Salzberg, HISAT: a fast spliced aligner with low memory requirements. Nat Methods, 2015. 12(4): p. 357-60. 361. Duitama, J., P.K. Srivastava, and Mandoiu, II, Towards accurate detection and genotyping of expressed variants from whole transcriptome sequencing data. BMC Genomics, 2012. 13 Suppl 2: p. S6. 362. Mandric, I., et al., Fast bootstrapping-based estimation of confidence intervals of expression levels and differential expression from RNA-Seq data. Bioinformatics, 2017. 33(20): p. 3302-3304. 363. Yates, A., et al., Ensembl 2016. Nucleic Acids Res, 2016. 44(D1): p. D710-6.

238

364. Quinlan, A.R. and I.M. Hall, BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics, 2010. 26(6): p. 841-2. 365. Blomen, V.A., et al., Gene essentiality and synthetic lethality in haploid human cells. Science, 2015. 350(6264): p. 1092-6. 366. Hart, T., et al., High-Resolution CRISPR Screens Reveal Fitness Genes and Genotype- Specific Cancer Liabilities. Cell, 2015. 163(6): p. 1515-26. 367. Wang, T., et al., Identification and characterization of essential genes in the human genome. Science, 2015. 350(6264): p. 1096-101. 368. Carette, J.E., et al., Haploid genetic screens in human cells identify host factors used by pathogens. Science, 2009. 326(5957): p. 1231-5. 369. Stanford, W.L., J.B. Cohn, and S.P. Cordes, Gene-trap mutagenesis: past, present and beyond. Nat Rev Genet, 2001. 2(10): p. 756-68. 370. Schmidt, M., et al., High-resolution insertion-site analysis by linear amplification- mediated PCR (LAM-PCR). Nat Methods, 2007. 4(12): p. 1051-7. 371. Huerta-Cepas, J., et al., eggNOG 4.5: a hierarchical orthology framework with improved functional annotations for eukaryotic, prokaryotic and viral sequences. Nucleic Acids Res, 2016. 44(D1): p. D286-93. 372. Wang, T., et al., Genetic screens in human cells using the CRISPR-Cas9 system. Science, 2014. 343(6166): p. 80-4. 373. Hart, T., et al., Measuring error rates in genomic perturbation screens: gold standards for human functional genomics. Mol Syst Biol, 2014. 10: p. 733. 374. Olena, A.F. and J.G. Patton, Genomic organization of microRNAs. J Cell Physiol, 2010. 222(3): p. 540-5. 375. Kozomara, A. and S. Griffiths-Jones, miRBase: annotating high confidence microRNAs using deep sequencing data. Nucleic Acids Res, 2014. 42(Database issue): p. D68-73. 376. The Gene Ontology, C., Expansion of the Gene Ontology knowledgebase and resources. Nucleic Acids Res, 2017. 45(D1): p. D331-D338. 377. Medghalchi, S.M., et al., Rent1, a trans-effector of nonsense-mediated mRNA decay, is essential for mammalian embryonic viability. Hum Mol Genet, 2001. 10(2): p. 99-105.

239

378. Gangwani, L., R.A. Flavell, and R.J. Davis, ZPR1 is essential for survival and is required for localization of the survival motor neurons (SMN) protein to Cajal bodies. Mol Cell Biol, 2005. 25(7): p. 2744-56. 379. Cang, Y., et al., Deletion of DDB1 in mouse brain and lens leads to p53-dependent elimination of proliferating cells. Cell, 2006. 127(5): p. 929-40. 380. Seburn, K.L., et al., An active dominant mutation of glycyl-tRNA synthetase causes neuropathy in a Charcot-Marie-Tooth 2D mouse model. Neuron, 2006. 51(6): p. 715-26. 381. Liu, Y., et al., The Sac1 phosphoinositide phosphatase regulates Golgi membrane morphology and mitotic spindle organization in mammals. Mol Biol Cell, 2008. 19(7): p. 3080-96. 382. Zhang, X., et al., Mutation in nuclear pore component NUP155 leads to atrial fibrillation and early sudden cardiac death. Cell, 2008. 135(6): p. 1017-27. 383. Krasteva, V., et al., The BAF53a subunit of SWI/SNF-like BAF complexes is essential for hemopoietic stem cell function. Blood, 2012. 120(24): p. 4720-32. 384. Ma, S., H.J. Kwon, and Z. Huang, A functional requirement for astroglia in promoting blood vessel development in the early postnatal brain. PLoS One, 2012. 7(10): p. e48001. 385. Hashimoto, M., et al., Nepro is localized in the nucleolus and essential for preimplantation development in mice. Dev Growth Differ, 2015. 57(7): p. 529-38. 386. Dickinson, M.E., et al., High-throughput discovery of novel developmental phenotypes. Nature, 2016. 537(7621): p. 508-514. 387. Oegema, R., et al., Human mutations in integrator complex subunits link transcriptome integrity to brain development. PLoS Genet, 2017. 13(5): p. e1006809. 388. Kwon, H.J., S. Ma, and Z. Huang, Radial glia regulate Cajal-Retzius cell positioning in the early embryonic cerebral cortex. Dev Biol, 2011. 351(1): p. 25-34. 389. Smith, A., Clericuzio, C., Ahmed, A., McDonell, L., Sawyer, S., Bulman, D., Boycott, K., MG-131 ZNF259 is a candidate gene for alopecia-primordial dwarfism-renal syndrome (APDRS). J Med Genet, 2015. 52: p. A11. 390. Di Donato, N., et al., Mutations in EXOSC2 are associated with a novel syndrome characterised by retinitis pigmentosa, progressive hearing loss, premature ageing, short stature, mild intellectual disability and distinctive gestalt. J Med Genet, 2016. 53(6): p. 419-25.

240

391. Raj, B. and B.J. Blencowe, Alternative Splicing in the Mammalian Nervous System: Recent Insights into Mechanisms and Functional Roles. Neuron, 2015. 87(1): p. 14-27. 392. Basu, M.K., I.B. Rogozin, and E.V. Koonin, Primordial spliceosomal introns were probably U2-type. Trends Genet, 2008. 24(11): p. 525-8. 393. Dietrich, R.C., R. Incorvaia, and R.A. Padgett, Terminal intron dinucleotide sequences do not distinguish between U2- and U12-dependent introns. Mol Cell, 1997. 1(1): p. 151-60. 394. Lessard, J., et al., An essential switch in subunit composition of a chromatin remodeling complex during neural development. Neuron, 2007. 55(2): p. 201-15. 395. Bao, X., et al., ACTL6a enforces the epidermal progenitor state by suppressing SWI/SNF-dependent induction of . Cell Stem Cell, 2013. 12(2): p. 193-203. 396. Zhou, C., et al., Arl2 and Arl3 regulate different microtubule-dependent processes. Mol Biol Cell, 2006. 17(5): p. 2476-87. 397. Newman, L.E., et al., The ARL2 GTPase is required for mitochondrial morphology, motility, and maintenance of ATP levels. PLoS One, 2014. 9(6): p. e99270. 398. Antoshechkin, I. and M. Han, The C. elegans evl-20 gene is a homolog of the small GTPase ARL2 and regulates cytoskeleton dynamics during cytokinesis and morphogenesis. Dev Cell, 2002. 2(5): p. 579-91. 399. Price, H.P., et al., The small GTPase ARL2 is required for cytokinesis in Trypanosoma brucei. Mol Biochem Parasitol, 2010. 173(2): p. 123-31. 400. van Hille, B., et al., Identification of two subunit A isoforms of the vacuolar H(+)-ATPase in human osteoclastoma. J Biol Chem, 1993. 268(10): p. 7075-80. 401. Muroyama, Y. and T. Saito, Identification of Nepro, a gene required for the maintenance of neocortex neural progenitor cells downstream of Notch. Development, 2009. 136(23): p. 3889-93. 402. Liu, S.S., et al., Identification and characterization of a novel gene, c1orf109, encoding a CK2 substrate that is involved in cancer cell proliferation. J Biomed Sci, 2012. 19: p. 49. 403. Yeh, J.I., et al., Damaged DNA induced UV-damaged DNA-binding protein (UV-DDB) dimerization and its roles in chromatinized DNA repair. Proc Natl Acad Sci U S A, 2012. 109(41): p. E2737-46.

241

404. Li, J., et al., DNA damage binding protein component DDB1 participates in nucleotide excision repair through DDB2 DNA-binding and 4A ubiquitin ligase activity. Cancer Res, 2006. 66(17): p. 8590-7. 405. Leung-Pineda, V., J. Huh, and H. Piwnica-Worms, DDB1 targets Chk1 to the Cul4 E3 ligase complex in normal cycling cells and in cells experiencing replication stress. Cancer Res, 2009. 69(6): p. 2630-7. 406. Dubaele, S. and P. Chene, Cellular studies of MrDb (DDX18). Oncol Res, 2007. 16(12): p. 549-56. 407. Payne, E.M., et al., Ddx18 is essential for cell-cycle progression in zebrafish hematopoietic cells and is mutated in human AML. Blood, 2011. 118(4): p. 903-15. 408. Milek, M., et al., DDX54 regulates transcriptome dynamics during DNA damage response. Genome Res, 2017. 27(8): p. 1344-1359. 409. Kanno, Y., et al., DP97, a DEAD box DNA/RNA helicase, is a target gene-selective co- regulator of the constitutive androstane receptor. Biochem Biophys Res Commun, 2012. 426(1): p. 38-42. 410. Rajendran, R.R., et al., Regulation of transcriptional activity by a novel DEAD box RNA helicase (DP97). J Biol Chem, 2003. 278(7): p. 4628-38. 411. Zhan, R., et al., A DEAD-box RNA helicase Ddx54 protein in oligodendrocytes is indispensable for myelination in the central nervous system. J Neurosci Res, 2013. 91(3): p. 335-48. 412. Masutani, M., et al., Reconstitution reveals the functional core of mammalian eIF3. EMBO J, 2007. 26(14): p. 3373-83. 413. Qi, J., et al., EIF3i promotes colon oncogenesis by regulating COX-2 protein synthesis and beta-catenin activation. Oncogene, 2014. 33(32): p. 4156-63. 414. Harris, T.E., et al., mTOR-dependent stimulation of the association of eIF4G and eIF3 by insulin. EMBO J, 2006. 25(8): p. 1659-68. 415. Metz-Estrella, D., et al., TRIP-1: a regulator of osteoblast function. J Bone Miner Res, 2012. 27(7): p. 1576-84. 416. Ahlemann, M., et al., Carcinoma-associated eIF3i overexpression facilitates mTOR- dependent growth transformation. Mol Carcinog, 2006. 45(12): p. 957-67.

242

417. Isken, O., et al., Upf1 phosphorylation triggers translational repression during nonsense- mediated mRNA decay. Cell, 2008. 133(2): p. 314-27. 418. Lejeune, F., X. Li, and L.E. Maquat, Nonsense-mediated mRNA decay in mammalian cells involves decapping, deadenylating, and exonucleolytic activities. Mol Cell, 2003. 12(3): p. 675-87. 419. van Dijk, E.L., G. Schilders, and G.J. Pruijn, Human cell growth requires a functional cytoplasmic exosome, which is involved in various mRNA decay pathways. RNA, 2007. 13(7): p. 1027-35. 420. Mitchell, P., E. Petfalski, and D. Tollervey, The 3' end of yeast 5.8S rRNA is generated by an exonuclease processing mechanism. Genes Dev, 1996. 10(4): p. 502-13. 421. Ito, S., et al., MMXD, a TFIIH-independent XPD-MMS19 protein complex involved in chromosome segregation. Mol Cell, 2010. 39(4): p. 632-40. 422. Yang, W., et al., Interference of E2-2-mediated effect in endothelial cells by FAM96B through its limited expression of E2-2. Cancer Sci, 2011. 102(10): p. 1808-14. 423. Stehling, O., et al., Human CIA2A-FAM96A and CIA2B-FAM96B integrate iron homeostasis and maturation of different subsets of cytosolic-nuclear iron-sulfur proteins. Cell Metab, 2013. 18(2): p. 187-98. 424. Motley, W.W., K. Talbot, and K.H. Fischbeck, GARS axonopathy: not every neuron's cup of tRNA. Trends Neurosci, 2010. 33(2): p. 59-66. 425. Baillat, D., et al., Integrator, a multiprotein mediator of small nuclear RNA processing, associates with the C-terminal repeat of RNA polymerase II. Cell, 2005. 123(2): p. 265- 76. 426. Cai, Y., et al., Subunit composition and substrate specificity of a MOF-containing histone acetyltransferase distinct from the male-specific lethal (MSL) complex. J Biol Chem, 2010. 285(7): p. 4268-72. 427. Dias, J., et al., Structural analysis of the KANSL1/WDR5/KANSL2 complex reveals that WDR5 is required for efficient assembly and chromatin targeting of the NSL complex. Genes Dev, 2014. 28(9): p. 929-42. 428. Shen, Q., et al., NAT10, a nucleolar protein, localizes to the midbody and regulates cytokinesis and acetylation of microtubules. Exp Cell Res, 2009. 315(10): p. 1653-67.

243

429. Chi, Y.H., et al., Histone acetyltransferase hALP and nuclear membrane protein hsSUN1 function in de-condensation of mitotic chromosomes. J Biol Chem, 2007. 282(37): p. 27447-58. 430. Kong, R., et al., hALP, a novel transcriptional U three protein (t-UTP), activates RNA polymerase I transcription by binding and acetylating the upstream binding factor (UBF). J Biol Chem, 2011. 286(9): p. 7139-48. 431. Ito, S., et al., Human NAT10 is an ATP-dependent RNA acetyltransferase responsible for N4-acetylcytidine formation in 18 S ribosomal RNA (rRNA). J Biol Chem, 2014. 289(52): p. 35724-30. 432. Izaurralde, E., et al., A nuclear cap binding protein complex involved in pre-mRNA splicing. Cell, 1994. 78(4): p. 657-68. 433. Ishigaki, Y., et al., Evidence for a pioneer round of mRNA translation: mRNAs subject to nonsense-mediated decay in mammalian cells are bound by CBP80 and CBP20. Cell, 2001. 106(5): p. 607-17. 434. Cheng, H., et al., Human mRNA export machinery recruited to the 5' end of mRNA. Cell, 2006. 127(7): p. 1389-400. 435. Gebhardt, A., et al., mRNA export through an additional cap-binding complex consisting of NCBP1 and NCBP3. Nat Commun, 2015. 6: p. 8192. 436. Maquat, L.E., et al., CBP80-promoted mRNP rearrangements during the pioneer round of translation, nonsense-mediated mRNA decay, and thereafter. Cold Spring Harb Symp Quant Biol, 2010. 75: p. 127-34. 437. Hosoda, N., et al., CBP80 promotes interaction of Upf1 with Upf2 during nonsense- mediated mRNA decay in mammalian cells. Nat Struct Mol Biol, 2005. 12(10): p. 893- 901. 438. Hwang, J., et al., UPF1 association with the cap-binding protein, CBP80, promotes nonsense-mediated mRNA decay at two distinct steps. Mol Cell, 2010. 39(3): p. 396-409. 439. Izaurralde, E., et al., A cap-binding protein complex mediating U snRNA export. Nature, 1995. 376(6542): p. 709-12. 440. Calero, G., et al., Structural basis of m7GpppG binding to the nuclear cap-binding protein complex. Nat Struct Biol, 2002. 9(12): p. 912-7.

244

441. Flaherty, S.M., et al., Participation of the nuclear cap binding complex in pre-mRNA 3' processing. Proc Natl Acad Sci U S A, 1997. 94(22): p. 11893-8. 442. Kim, K.M., et al., A new MIF4G domain-containing protein, CTIF, directs nuclear cap- binding protein CBP80/20-dependent translation. Genes Dev, 2009. 23(17): p. 2033-45. 443. Freeman, J.W., et al., Identification and characterization of a human proliferation- associated nucleolar antigen with a molecular weight of 120,000 expressed in early G1 phase. Cancer Res, 1988. 48(5): p. 1244-51. 444. Valdez, B.C., et al., A region of antisense RNA from human p120 cDNA with high homology to mouse p120 cDNA inhibits NIH 3T3 proliferation. Cancer Res, 1992. 52(20): p. 5681-6. 445. Kosi, N., et al., Nop2 is expressed during proliferation of neural stem cells and in adult mouse and human brain. Brain Res, 2015. 1597: p. 65-76. 446. Khanna-Gupta, A., et al., p120 nucleolar-proliferating antigen is a direct target of G- CSF signaling during myeloid differentiation. J Leukoc Biol, 2006. 79(5): p. 1011-21. 447. Banerjee, H.N., et al., Depletion of a single nucleoporin, Nup107, induces apoptosis in eukaryotic cells. Mol Cell Biochem, 2010. 343(1-2): p. 21-5. 448. Boehmer, T., et al., Depletion of a single nucleoporin, Nup107, prevents the assembly of a subset of nucleoporins into the nuclear pore complex. Proc Natl Acad Sci U S A, 2003. 100(3): p. 981-5. 449. Orjalo, A.V., et al., The Nup107-160 nucleoporin complex is required for correct bipolar spindle assembly. Mol Biol Cell, 2006. 17(9): p. 3806-18. 450. Mishra, R.K., et al., The Nup107-160 complex and gamma-TuRC regulate microtubule polymerization at kinetochores. Nat Cell Biol, 2010. 12(2): p. 164-9. 451. Springer, J., et al., Identification and chromosomal localization of murine ORC3, a new member of the mouse origin recognition complex. Cytogenet Cell Genet, 1999. 87(3-4): p. 245-51. 452. Pinto, S., et al., latheo encodes a subunit of the origin recognition complex and disrupts neuronal proliferation and adult olfactory memory when mutant. Neuron, 1999. 23(1): p. 45-54. 453. Dhar, S.K., L. Delmolino, and A. Dutta, Architecture of the human origin recognition complex. J Biol Chem, 2001. 276(31): p. 29067-71.

245

454. Nakaya, T., et al., Critical role of Pcid2 in B cell survival through the regulation of MAD2 expression. J Immunol, 2010. 185(9): p. 5180-7. 455. Cunningham, C.N., et al., Human TREX2 components PCID2 and 2, but not ENY2, have distinct functions in protein export and co-localize to the centrosome. Exp Cell Res, 2014. 320(2): p. 209-18. 456. Sweet, T., et al., Evidence for involvement of NFBP in processing of ribosomal RNA. J Cell Physiol, 2008. 214(2): p. 381-8. 457. Turner, A.J., et al., A novel small-subunit processome assembly intermediate that contains the U3 snoRNP, nucleolin, RRP5, and DBP4. Mol Cell Biol, 2009. 29(11): p. 3007-17. 458. Mizuno, T., et al., Molecular architecture of the mouse DNA polymerase alpha-primase complex. Mol Cell Biol, 1999. 19(11): p. 7886-96. 459. Takemura, M., et al., Role of the second-largest subunit of DNA polymerase alpha in the interaction between the catalytic subunit and hyperphosphorylated in late S phase. Biochim Biophys Acta, 2006. 1764(9): p. 1447-53. 460. Li, Y., et al., Purification, cDNA cloning, and gene mapping of the small subunit of human DNA polymerase epsilon. J Biol Chem, 1997. 272(51): p. 32337-44. 461. Wada, M., et al., The second largest subunit of mouse DNA polymerase epsilon, DPE2, interacts with SAP18 and recruits the Sin3 co-repressor protein to DNA. J Biochem, 2002. 131(3): p. 307-11. 462. Cheong, J.H., et al., Human RPB5, a subunit shared by eukaryotic nuclear RNA polymerases, binds human hepatitis B virus X protein and may play a role in X transactivation. EMBO J, 1995. 14(1): p. 143-50. 463. Miyao, T. and N.A. Woychik, RNA polymerase subunit RPB5 plays a role in transcriptional activation. Proc Natl Acad Sci U S A, 1998. 95(26): p. 15281-6. 464. Wei, W., et al., Direct interaction between the subunit RAP30 of transcription factor IIF (TFIIF) and RNA polymerase subunit 5, which contributes to the association between TFIIF and RNA polymerase II. J Biol Chem, 2001. 276(15): p. 12266-73. 465. Wei, W., et al., Interaction with general transcription factor IIF (TFIIF) is required for the suppression of activated transcription by RPB5-mediating protein (RMP). Cell Res, 2003. 13(2): p. 111-20.

246

466. Wang, Z. and R.G. Roeder, Three human RNA polymerase III-specific subunits form a subcomplex with a selective function in specific transcription initiation. Genes Dev, 1997. 11(10): p. 1315-26. 467. Lefevre, S., et al., Structure-function analysis of hRPC62 provides insights into RNA polymerase III transcription initiation. Nat Struct Mol Biol, 2011. 18(3): p. 352-8. 468. Hu, P., et al., Characterization of human RNA polymerase III identifies orthologues for RNA polymerase III subunits. Mol Cell Biol, 2002. 22(22): p. 8044-55. 469. Hashimoto, M., et al., Severe Hypomyelination and Developmental Defects Are Caused in Mice Lacking Protein Arginine Methyltransferase 1 (PRMT1) in the Central Nervous System. J Biol Chem, 2016. 291(5): p. 2237-45. 470. Tibshirani, M., et al., Cytoplasmic sequestration of FUS/TLS associated with ALS alters histone marks through loss of nuclear protein arginine methyltransferase 1. Hum Mol Genet, 2015. 24(3): p. 773-86. 471. Wang, H., et al., Methylation of at arginine 3 facilitating transcriptional activation by nuclear . Science, 2001. 293(5531): p. 853-7. 472. Bedford, M.T. and S.G. Clarke, Protein arginine methylation in mammals: who, what, and why. Mol Cell, 2009. 33(1): p. 1-13. 473. Pawlak, M.R., et al., Arginine N-methyltransferase 1 is required for early postimplantation mouse development, but cells deficient in the enzyme are viable. Mol Cell Biol, 2000. 20(13): p. 4859-69. 474. Yu, Z., et al., A mouse PRMT1 null allele defines an essential role for arginine methylation in genome maintenance and cell proliferation. Mol Cell Biol, 2009. 29(11): p. 2982-96. 475. Miyata, S., Y. Mori, and M. Tohyama, PRMT1 and Btg2 regulates neurite outgrowth of Neuro2a cells. Neurosci Lett, 2008. 445(2): p. 162-5. 476. Cron, K.R., et al., Proteasome inhibitors block DNA repair and radiosensitize non-small cell lung cancer. PLoS One, 2013. 8(9): p. e73710. 477. Detter, J.C., et al., Rab geranylgeranyl transferase alpha mutation in the gunmetal mouse reduces Rab prenylation and platelet synthesis. Proc Natl Acad Sci U S A, 2000. 97(8): p. 4144-9.

247

478. Sloan, K.E., et al., The roles of SSU processome components and surveillance factors in the initial processing of human ribosomal RNA. RNA, 2014. 20(4): p. 540-50. 479. Wang, M., L. Anikin, and D.G. Pestov, Two orthogonal cleavages separate subunit RNAs in mouse ribosome biogenesis. Nucleic Acids Res, 2014. 42(17): p. 11180-91. 480. Mossi, R., et al., Replication factor C interacts with the C-terminal side of proliferating cell nuclear antigen. J Biol Chem, 1997. 272(3): p. 1769-76. 481. Chan, Y.L., J. Olvera, and I.G. Wool, The primary structures of rat ribosomal proteins L4 and L41. Biochem Biophys Res Commun, 1995. 214(3): p. 810-8. 482. Jarrous, N., et al., Autoantigenic properties of some protein subunits of catalytically active complexes of human ribonuclease P. RNA, 1998. 4(4): p. 407-17. 483. Nemoto, Y., et al., Functional characterization of a mammalian Sac1 and mutants exhibiting substrate-specific defects in phosphoinositide phosphatase activity. J Biol Chem, 2000. 275(44): p. 34293-305. 484. Rohde, H.M., et al., The human phosphatidylinositol phosphatase SAC1 interacts with the coatomer I complex. J Biol Chem, 2003. 278(52): p. 52689-99. 485. Urlaub, H., et al., Sm protein-Sm site RNA interactions within the inner ring of the spliceosomal snRNP core structure. EMBO J, 2001. 20(1-2): p. 187-96. 486. Pillai, R.S., et al., Purified U7 snRNPs lack the Sm proteins D1 and D2 but contain Lsm10, a new 14 kDa Sm D1-like protein. EMBO J, 2001. 20(19): p. 5470-9. 487. Chari, A., et al., An assembly chaperone collaborates with the SMN complex to generate spliceosomal SnRNPs. Cell, 2008. 135(3): p. 497-509. 488. Pasternack, S.M., et al., Mutations in SNRPE, which encodes a core protein of the spliceosome, cause autosomal-dominant hypotrichosis simplex. Am J Hum Genet, 2013. 92(1): p. 81-7. 489. Li, Z. and B.M. Putzer, Spliceosomal protein E regulates neoplastic cell growth by modulating expression of cyclin E/CDK2 and G2/M checkpoint proteins. J Cell Mol Med, 2008. 12(6A): p. 2427-38. 490. St-Pierre, B., et al., Conserved and specific functions of mammalian ssu72. Nucleic Acids Res, 2005. 33(2): p. 464-77. 491. Krishnamurthy, S., et al., Ssu72 Is an RNA polymerase II CTD phosphatase. Mol Cell, 2004. 14(3): p. 387-94.

248

492. Kim, H.S., et al., Functional interplay between and Ssu72 phosphatase regulates sister chromatid cohesion. Nat Commun, 2013. 4: p. 2631. 493. Friedrich, J.K., et al., TBP-TAF complex SL1 directs RNA polymerase I pre-initiation complex formation and stabilizes upstream binding factor at the rDNA promoter. J Biol Chem, 2005. 280(33): p. 29551-8. 494. Neal, B.S. and J.N. Joyce, Neonatal 6-OHDA lesions differentially affect striatal D1 and D2 receptors. Synapse, 1992. 11(1): p. 35-46. 495. Yaffe, M.B., et al., TCP1 complex is a molecular chaperone in tubulin biogenesis. Nature, 1992. 358(6383): p. 245-8. 496. Yam, A.Y., et al., Defining the TRiC/CCT interactome links chaperonin function to stabilization of newly made proteins with complex topologies. Nat Struct Mol Biol, 2008. 15(12): p. 1255-62. 497. Sinha, S., et al., Essential role of the chaperonin CCT in rod outer segment biogenesis. Invest Ophthalmol Vis Sci, 2014. 55(6): p. 3775-85. 498. Freund, A., et al., Proteostatic control of function through TRiC-mediated folding of TCAB1. Cell, 2014. 159(6): p. 1389-403. 499. Bogolepova, I.N., [Structural organization of the hippocampus in the prenatal ontogenesis of the lower monkeys]. Zh Nevropatol Psikhiatr Im S S Korsakova, 1973. 73(5): p. 687-9. 500. Sun, X., et al., A mutated human homologue to yeast Upf1 protein has a dominant- negative effect on the decay of nonsense-containing mRNAs in mammalian cells. Proc Natl Acad Sci U S A, 1998. 95(17): p. 10009-14. 501. Azzalin, C.M. and J. Lingner, The human RNA surveillance factor UPF1 is required for S phase progression and genome stability. Curr Biol, 2006. 16(4): p. 433-9. 502. Yorikawa, C., et al., Human CHMP6, a myristoylated ESCRT-III protein, interacts directly with an ESCRT-II component EAP20 and regulates endosomal cargo sorting. Biochem J, 2005. 387(Pt 1): p. 17-26. 503. Im, Y.J., et al., Structure and function of the ESCRT-II-III interface in multivesicular body biogenesis. Dev Cell, 2009. 17(2): p. 234-43. 504. Handschuh, K., et al., ESCRT-II/Vps25 constrains digit number by endosome-mediated selective modulation of FGF-SHH signaling. Cell Rep, 2014. 9(2): p. 674-87.

249

505. Galcheva-Gargova, Z., et al., The cytoplasmic protein ZPR1 accumulates in the nucleolus of proliferating cells. Mol Biol Cell, 1998. 9(10): p. 2963-71. 506. Gangwani, L., et al., Spinal muscular atrophy disrupts the interaction of ZPR1 with the SMN protein. Nat Cell Biol, 2001. 3(4): p. 376-83. 507. Doran, B., et al., Deficiency of the zinc finger protein ZPR1 causes neurodegeneration. Proc Natl Acad Sci U S A, 2006. 103(19): p. 7471-5. 508. Ahmad, S., et al., The zinc finger protein ZPR1 is a potential modifier of spinal muscular atrophy. Hum Mol Genet, 2012. 21(12): p. 2745-58. 509. Dalton, D., R. Chadwick, and W. McGinnis, Expression and embryonic function of empty spiracles: a Drosophila homeo box gene with two patterning functions on the anterior- posterior axis of the embryo. Genes Dev, 1989. 3(12A): p. 1940-56. 510. Cohen, S.M. and G. Jurgens, Mediation of Drosophila head development by gap-like segmentation genes. Nature, 1990. 346(6283): p. 482-5. 511. Simeone, A., et al., Two vertebrate homeobox genes related to the Drosophila empty spiracles gene are expressed in the embryonic cerebral cortex. EMBO J, 1992. 11(7): p. 2541-50. 512. Simeone, A., et al., Nested expression domains of four homeobox genes in developing rostral brain. Nature, 1992. 358(6388): p. 687-90. 513. Gulisano, M., et al., Emx1 and Emx2 show different patterns of expression during proliferation and differentiation of the developing cerebral cortex in the mouse. Eur J Neurosci, 1996. 8(5): p. 1037-50. 514. Chan, C.H., et al., Emx1 is a marker for pyramidal neurons of the cerebral cortex. Cereb Cortex, 2001. 11(12): p. 1191-8. 515. Soriano, P., Generalized lacZ expression with the ROSA26 Cre reporter strain. Nat Genet, 1999. 21(1): p. 70-1. 516. Smith, C.L., et al., Mouse Genome Database (MGD)-2018: knowledgebase for the . Nucleic Acids Res, 2018. 46(D1): p. D836-D842. 517. Finger, J.H., et al., The mouse Gene Expression Database (GXD): 2017 update. Nucleic Acids Res, 2017. 45(D1): p. D730-D736. 518. Bult, C.J., et al., Mouse genome database 2016. Nucleic Acids Res, 2016. 44(D1): p. D840-7.

250

519. Frisdal, A. and P.A. Trainor, Development and evolution of the pharyngeal apparatus. Wiley Interdiscip Rev Dev Biol, 2014. 3(6): p. 403-18. 520. Haber, S.N., Corticostriatal circuitry. Dialogues Clin Neurosci, 2016. 18(1): p. 7-21. 521. Crittenden, J.R. and A.M. Graybiel, Basal Ganglia disorders associated with imbalances in the striatal striosome and matrix compartments. Front Neuroanat, 2011. 5: p. 59. 522. Lanciego, J.L., N. Luquin, and J.A. Obeso, Functional neuroanatomy of the basal ganglia. Cold Spring Harb Perspect Med, 2012. 2(12): p. a009621. 523. Nawathe, A., J. Doherty, and P. Pandya, Fetal microcephaly. BMJ, 2018. 361: p. k2232. 524. Walf, A.A. and C.A. Frye, The use of the elevated plus maze as an assay of anxiety- related behavior in rodents. Nat Protoc, 2007. 2(2): p. 322-8. 525. Seibenhener, M.L. and M.C. Wooten, Use of the Open Field Maze to measure locomotor and anxiety-like behavior in mice. J Vis Exp, 2015(96): p. e52434. 526. Brooks, S.P. and S.B. Dunnett, Tests to assess motor phenotype in mice: a user's guide. Nat Rev Neurosci, 2009. 10(7): p. 519-29. 527. D'Hooge, R. and P.P. De Deyn, Applications of the Morris water maze in the study of learning and memory. Brain Res Brain Res Rev, 2001. 36(1): p. 60-90. 528. Nevin, N.C., P.S. Thomas, and J. Hutchinson, Syndrome of short stature, microcephaly, mental retardation, and multiple epiphyseal dysplasia--Lowry-Wood syndrome. Am J Med Genet, 1986. 24(1): p. 33-9. 529. Hankenson, L.G., M.B. Ozonoff, and S.B. Cassidy, Epiphyseal dysplasia with coxa vara, microcephaly, and normal intelligence in sibs: expanded spectrum of Lowry-Wood syndrome? Am J Med Genet, 1989. 33(3): p. 336-40. 530. Magnani, C., et al., Multiple joint dislocations: an additional skeletal finding in Lowry- Wood syndrome? Am J Med Genet A, 2009. 149A(4): p. 737-41. 531. Lemon, R.N. and J. Griffiths, Comparing the function of the corticospinal system in different species: organizational differences for motor specialization? Muscle Nerve, 2005. 32(3): p. 261-79. 532. Bieler, L., et al., Motor deficits following dorsal corticospinal tract transection in rats: voluntary versus skilled locomotion readouts. Heliyon, 2018. 4(2): p. e00540. 533. Poulos, A.M., et al., Persistence of fear memory across time requires the basolateral amygdala complex. Proc Natl Acad Sci U S A, 2009. 106(28): p. 11737-41.

251

534. Likhtik, E., et al., Prefrontal control of the amygdala. J Neurosci, 2005. 25(32): p. 7429- 37. 535. Bicks, L.K., et al., Prefrontal Cortex and Social Cognition in Mouse and Man. Front Psychol, 2015. 6: p. 1805. 536. Rudebeck, P.H., et al., Distinct contributions of frontal areas to emotion and social behaviour in the rat. Eur J Neurosci, 2007. 26(8): p. 2315-26. 537. Nyby, J., G.A. Dizinno, and G. Whitney, Social status and ultrasonic vocalizations of male mice. Behav Biol, 1976. 18(2): p. 285-9. 538. Vargha-Khadem, F., et al., FOXP2 and the neuroanatomy of speech and language. Nat Rev Neurosci, 2005. 6(2): p. 131-8. 539. Castellucci, G.A., M.J. McGinley, and D.A. McCormick, Knockout of Foxp2 disrupts vocal development in mice. Sci Rep, 2016. 6: p. 23305. 540. Bekkevold, C.M., et al., Dehydration parameters and standards for laboratory mice. J Am Assoc Lab Anim Sci, 2013. 52(3): p. 233-9. 541. Burkholder, T., et al., Health Evaluation of Experimental Laboratory Mice. Curr Protoc Mouse Biol, 2012. 2: p. 145-165. 542. Oka, Y., M. Ye, and C.S. Zuker, Thirst driving and suppressing signals encoded by distinct neural populations in the brain. Nature, 2015. 520(7547): p. 349-52. 543. Nakatani, J., et al., Abnormal behavior in a chromosome-engineered mouse model for human 15q11-13 duplication seen in autism. Cell, 2009. 137(7): p. 1235-46. 544. Penagarikano, O., et al., Absence of CNTNAP2 leads to epilepsy, neuronal migration abnormalities, and core autism-related deficits. Cell, 2011. 147(1): p. 235-46. 545. Srivastava, N., et al., EphB2 receptor forward signaling controls cortical growth cone collapse via Nck and Pak. Mol Cell Neurosci, 2013. 52: p. 106-16. 546. Gigliucci, V., et al., Region specific up-regulation of oxytocin receptors in the opioid oprm1 (-/-) mouse model of autism. Front Pediatr, 2014. 2: p. 91. 547. Lu, D.H., et al., Impairment of social behaviors in Arhgef10 knockout mice. Mol Autism, 2018. 9: p. 11. 548. Waragai, M., et al., Ataxin 10 induces neuritogenesis via interaction with G-protein beta2 subunit. J Neurosci Res, 2006. 83(7): p. 1170-8.

252

549. O'Donovan, K.J., et al., B-RAF kinase drives developmental axon growth and promotes axon regeneration in the injured mature CNS. J Exp Med, 2014. 211(5): p. 801-14. 550. Xiao, Y., et al., The atypical guanine nucleotide exchange factor Dock4 regulates neurite differentiation through modulation of Rac1 GTPase and actin dynamics. J Biol Chem, 2013. 288(27): p. 20034-45. 551. Ueda, S., M. Negishi, and H. Katoh, Rac GEF Dock4 interacts with cortactin to regulate dendritic spine formation. Mol Biol Cell, 2013. 24(10): p. 1602-13. 552. Yu, H., et al., Headless Myo10 is a regulator of microtubule stability during neuronal development. J Neurochem, 2015. 135(2): p. 261-73. 553. Nagayama, S., R. Homma, and F. Imamura, Neuronal organization of olfactory bulb circuits. Front Neural Circuits, 2014. 8: p. 98. 554. Lin, D.Y., et al., Encoding social signals in the mouse main olfactory bulb. Nature, 2005. 434(7032): p. 470-7. 555. Fleischmann, A., et al., Mice with a "monoclonal nose": perturbations in an olfactory map impair odor discrimination. Neuron, 2008. 60(6): p. 1068-81. 556. Mandiyan, V.S., J.K. Coats, and N.M. Shah, Deficits in sexual and aggressive behaviors in Cnga2 mutant mice. Nat Neurosci, 2005. 8(12): p. 1660-2. 557. Whitney, G., Nyby, J., Cues that elicit ultrasounds from adult male mice. Amer Zool, 1979. 19(2): p. 457-463. 558. Justice, J.N., et al., Battery of behavioral tests in mice that models age-associated changes in human motor function. Age (Dordr), 2014. 36(2): p. 583-92. 559. Shoji, H., et al., Age-related changes in behavior in C57BL/6J mice from young adulthood to middle age. Mol Brain, 2016. 9: p. 11. 560. Cook, M.N., et al., Behavioral differences among 129 substrains: implications for knockout and transgenic mice. Behav Neurosci, 2002. 116(4): p. 600-11. 561. McFadyen, M.P., et al., Differences among eight inbred strains of mice in motor ability and motor learning on a rotorod. Genes Brain Behav, 2003. 2(4): p. 214-9. 562. Mao, J.H., et al., Identification of genetic factors that modify motor performance and body weight using Collaborative Cross mice. Sci Rep, 2015. 5: p. 16247. 563. Wong, A.A. and R.E. Brown, Visual detection, pattern discrimination and visual acuity in 14 strains of mice. Genes Brain Behav, 2006. 5(5): p. 389-403.

253

564. Robbins, T.W., Everitt, B. J., Functions of dopamine in the dorsal and ventral striatum. Semin Neurosci, 1992. 4(2): p. 119-127. 565. Palmiter, R.D., Dopamine signaling in the dorsal striatum is essential for motivated behaviors: lessons from dopamine-deficient mice. Ann N Y Acad Sci, 2008. 1129: p. 35- 46. 566. Evarts, E.V., Pyramidal tract activity associated with a conditioned hand movement in the monkey. J Neurophysiol, 1966. 29(6): p. 1011-27. 567. Georgopoulos, A.P., A.B. Schwartz, and R.E. Kettner, Neuronal population coding of movement direction. Science, 1986. 233(4771): p. 1416-9. 568. Turner, R.S. and M. Desmurget, Basal ganglia contributions to motor control: a vigorous tutor. Curr Opin Neurobiol, 2010. 20(6): p. 704-16. 569. Yin, H.H., Action, time and the basal ganglia. Philos Trans R Soc Lond B Biol Sci, 2014. 369(1637): p. 20120473. 570. Sreenivasan, V., et al., Movement Initiation Signals in Mouse Whisker Motor Cortex. Neuron, 2016. 92(6): p. 1368-1382. 571. Yue, M., et al., Progressive dopaminergic alterations and mitochondrial abnormalities in LRRK2 G2019S knock-in mice. Neurobiol Dis, 2015. 78: p. 172-95. 572. Rubinstein, M., et al., Mice lacking dopamine D4 receptors are supersensitive to ethanol, cocaine, and methamphetamine. Cell, 1997. 90(6): p. 991-1001. 573. Heng, M.Y., et al., Longitudinal evaluation of the Hdh(CAG)150 knock-in murine model of Huntington's disease. J Neurosci, 2007. 27(34): p. 8989-98. 574. Li, X., et al., Enhanced striatal dopamine transmission and motor performance with LRRK2 overexpression in mice is eliminated by familial Parkinson's disease mutation G2019S. J Neurosci, 2010. 30(5): p. 1788-97. 575. Xu, J., et al., Altered activity-rest patterns in mice with a human autosomal-dominant nocturnal frontal lobe epilepsy mutation in the beta2 nicotinic receptor. Mol Psychiatry, 2011. 16(10): p. 1048-61. 576. Threlfell, S. and S.J. Cragg, Dopamine signaling in dorsal versus ventral striatum: the dynamic role of cholinergic interneurons. Front Syst Neurosci, 2011. 5: p. 11. 577. Avila-Luna, A., et al., Dopamine D1 receptor activation maintains motor coordination and balance in rats. Metab Brain Dis, 2018. 33(1): p. 99-105.

254

578. Durieux, P.F., S.N. Schiffmann, and A. de Kerchove d'Exaerde, Differential regulation of motor control and response to dopaminergic drugs by D1R and D2R neurons in distinct dorsal striatum subregions. EMBO J, 2012. 31(3): p. 640-53. 579. Molina-Luna, K., et al., Dopamine in motor cortex is necessary for skill learning and synaptic plasticity. PLoS One, 2009. 4(9): p. e7082. 580. Rioult-Pedotti, M.S., et al., Dopamine Promotes Motor Cortex Plasticity and Motor Skill Learning via PLC Activation. PLoS One, 2015. 10(5): p. e0124986. 581. Meziane, H., et al., Estrous cycle effects on behavior of C57BL/6J and BALB/cByJ female mice: implications for phenotyping strategies. Genes Brain Behav, 2007. 6(2): p. 192- 200. 582. Schulte-Merker, S. and D.Y. Stainier, Out with the old, in with the new: reassessing morpholino knockdowns in light of genome editing technology. Development, 2014. 141(16): p. 3103-4. 583. Rowan, S. and C.L. Cepko, Genetic analysis of the homeodomain transcription factor Chx10 in the retina using a novel multifunctional BAC transgenic mouse reporter. Dev Biol, 2004. 271(2): p. 388-402. 584. Tucker, K.L., M. Meyer, and Y.A. Barde, Neurotrophins are required for nerve growth during development. Nat Neurosci, 2001. 4(1): p. 29-37. 585. Tyler, W.A., et al., Neural precursor lineages specify distinct neocortical pyramidal neuron types. J Neurosci, 2015. 35(15): p. 6142-52. 586. LoTurco, J., J.B. Manent, and F. Sidiqi, New and improved tools for in utero electroporation studies of developing cerebral cortex. Cereb Cortex, 2009. 19 Suppl 1: p. i120-5. 587. Shields, C.W.t., C.D. Reyes, and G.P. Lopez, Microfluidic cell sorting: a review of the advances in the separation of cells from debulking to rare cell isolation. Lab Chip, 2015. 15(5): p. 1230-49. 588. Marino, S., et al., Induction of medulloblastomas in p53-null mutant mice by somatic inactivation of Rb in the external granular layer cells of the cerebellum. Genes Dev, 2000. 14(8): p. 994-1004. 589. Nieto, M., et al., Expression of Cux-1 and Cux-2 in the subventricular zone and upper layers II-IV of the cerebral cortex. J Comp Neurol, 2004. 479(2): p. 168-80.

255

590. Schaeren-Wiemers, N., et al., The expression pattern of the orphan nuclear receptor RORbeta in the developing and adult rat nervous system suggests a role in the processing of sensory information and in circadian rhythm. Eur J Neurosci, 1997. 9(12): p. 2687- 701. 591. Arlotta, P., et al., Neuronal subtype-specific genes that control corticospinal motor neuron development in vivo. Neuron, 2005. 45(2): p. 207-21. 592. Hisaoka, T., et al., The forkhead transcription factors, Foxp1 and Foxp2, identify different subpopulations of projection neurons in the mouse cerebral cortex. Neuroscience, 2010. 166(2): p. 551-63. 593. Hoerder-Suabedissen, A., et al., Novel markers reveal subpopulations of subplate neurons in the murine cerebral cortex. Cereb Cortex, 2009. 19(8): p. 1738-50. 594. Feng, G., et al., Imaging neuronal subsets in transgenic mice expressing multiple spectral variants of GFP. Neuron, 2000. 28(1): p. 41-51. 595. Porrero, C., et al., Mapping of fluorescent protein-expressing neurons and axon pathways in adult and developing Thy1-eYFP-H transgenic mice. Brain Res, 2010. 1345: p. 59-72. 596. Bayram-Weston, Z., et al., Optimising Golgi-Cox staining for use with perfusion-fixed brain tissue validated in the zQ175 mouse model of Huntington's disease. J Neurosci Methods, 2016. 265: p. 81-8. 597. Gee, J.M., et al., Imaging activity in neurons and glia with a Polr2a-based and cre- dependent GCaMP5G-IRES-tdTomato reporter mouse. Neuron, 2014. 83(5): p. 1058-72. 598. Yang, M. and J.N. Crawley, Simple behavioral assessment of mouse olfaction. Curr Protoc Neurosci, 2009. Chapter 8: p. Unit 8 24. 599. Hennis, M.R., et al., Surprising behavioral and neurochemical enhancements in mice with combined mutations linked to Parkinson's disease. Neurobiol Dis, 2014. 62: p. 113- 23. 600. Higgins, G.A. and L.B. Silenieks, Rodent Test of Attention and Impulsivity: The 5-Choice Serial Reaction Time Task. Curr Protoc Pharmacol, 2017. 78: p. 5 49 1-5 49 34.

256

Chapter 9: Supplementary figures

Figure S1. Related to Figure 4.15. Dynamic expression of U11 in the control P21 cortex. Individual channels showing NeuN IF staining (blue), U11 FISH staining (green), EdU (magenta), merged channels, and merged channels+DAPI signal (white) in the P0 control (A-E), P0 mutant (F-J), P21 control (K-O), and P21 mutant (P-T) cortex. Scale bars=60 µm. (K’-O’) Magnified images of the three cells boxed in (K-O) with low U11 expression. The nuclei of these three cells, as defined by DAPI staining, were outlined in (K’-O’). Scale bars=5 µm. Str=striatum. 257

Figure S2. Related to Figure 4.15. Raw maximum intensity projection (MIP) and optical slice images used in Figure 4.15C-C’’’’. (A) One of the raw MIP images used in Figure 6C, which contains the cells magnified in Figure 4.15C’-C’’’’. This is one of three images used in the collage in Figure 4.15C, which shows a column of a coronal section of the P0 mutant cortex, with NeuN IF staining (blue), U11 FISH staining (green), and EdU detection (magenta). Scale bar=30 µm. (A’-A’’’’’) Magnified images of the boxed region in (A, MIP image), representing the raw MIP images of the cells shown in Figure 4.15C’-C’’’’, with NeuN IF staining (blue, A’), U11 FISH staining (green, A’’), EdU detection (magenta, A’’’), merged channels (A’’’’), and merged channels with DAPI (white) (A’’’’’). Open arrowheads mark where the arrowheads in Figure 4.15C’-C’’’’ are located. The nuclei of the three cells indicated by these arrowheads are outlined in dashed lines, as determined by DAPI staining. The color of the dashed lines indicates proximity of the outlined region to neighboring nuclei. Dashed white lines indicate that, in this MIP image, no other nuclei lie directly above or below this region, while dashed red lines indicate that at least one other nucleus is located directly above or below this region. Scale bars=5 µm. (B) Schematic of the optical slices (24 in total) comprising the MIP image in A’’’, corresponding to the raw MIP version of Figure 4.15C’. Raw optical slices numbered 16 to 20 are shown below this schematic; optical slice 18 (red box) was used to generate Figure 4.15C’-C’’’’. (C) The raw optical slice 18 image that corresponds to the MIP image shown in (A). (C’-C’’’’’) Magnified images of the boxed region in (C, optical slice), representing the raw optical slice images of the cells shown in Figure 4.15C’-C’’’’, showing NeuN IF staining (blue, C’), U11 FISH staining (green, C’’), EdU detection (magenta, C’’’), merged channels (C’’’’), and merged channels with DAPI (white) (C’’’’’). Open arrowheads mark where the arrowheads in Figure 4.15C’-C’’’’ are located. The nuclei of the three cells indicated by these arrowheads are outlined in dashed lines, as determined by DAPI staining. Scale bars=5 µm.

258

Figure S3. Related to Figure 4.15. Adjusted maximum intensity projection (MIP) and optical slice images used in Figure 4.15E-E’’’’. (A) One of the channel-adjusted MIP images used in Figure 4.15E, which contains the cells magnified in Figure 4.15E’-E’’’’. This is one of the three images used in the collage in Figure 4.15E, which showed a column of a coronal section of the P21 mutant cortex. with NeuN IF staining (blue), U11 FISH staining (green), and EdU detection (magenta). Scale bar=30 µm. (A1’-A1’’’’) Magnified images of the boxed regions in (A, MIP image), showing U11-negative, NeuN+, EdU+ cells, with EdU detection (magenta, A1-A3), Neun IF staining (blue, A1’-A3’), U11 FISH staining (green, A1’’-A3’’), merged channels (A1’’’-A3’’’), and merged channels with DAPI (white) (A1’’’’-A3’’’’). The nuclei of the U11-/NeuN+/EdU+ cells in these images were outlined in dashed lines, as determined by DAPI staining. The color of the dashed lines indicates proximity of the outlined region to neighboring nuclei. Dashed white lines indicate that, in this MIP image, no other nuclei lie directly above or below this region. Dashed red lines indicate that, in this MIP image, at least one other nucleus is located directly above or below this region. (A1-A’’’) represent the channel-adjusted MIP images of the cells shown in Figure 4.15E’-E’’’’. Scale bars=5 µm. (B) Schematic of the optical slices (36 in total) comprising the channel-adjusted MIP image in A. The optical slices numbered 3 to 7 are shown below this schematic; optical slice 5 (red box) was used to generate Figure 4.15E’-E’’’’. (C) The channel-adjusted optical slice 5 image that corresponds to the MIP image shown in (A). (C’-C’’’’’) Magnified images of the boxed region in (C, optical slice), representing the channel-adjusted optical slice images of the cells shown in Figure 4.15E’-E’’’’, showing EdU detection (magenta, C’), NeuN IF staining (blue, C’’), U11 FISH staining (green, C’’’), merged channels (C’’’’), and merged channels with DAPI (white) (C’’’’’). The nuclei of the two U11-/NeuN+/EdU+ cells were outlined in dashed lines, as determined by DAPI staining. The optical slice images in (C’-C’’’’) represent the images used to generate Figure 4.15E’-E’’’’. Scale bars=5 µm.

259

Figure S4. Related to Figure 4.15. High non-nuclear U11 FISH staining in P21 control and mutant cortical sections. (A-B) One of the raw MIP images used in Figure 4.15D (A, P21 control) and Figure 4.15E (B, P21 mutant). These images were used in the collages in Figure 4.15D&E, which show a column of coronal sections of the P21 control and mutant cortices, with U11 FISH staining (green), NeuN IF staining (blue), EdU detection (magenta), and nuclear staining by DAPI (white). The raw MIP image from the P21 mutant cortex (B) contains the cells magnified in Figure 4.15E’-E’’’’; the raw MIP image from the P21 control cortex (A) corresponds to the same location within the cortex as (B). Scale bars=30 µm. Magnified images of the boxed regions in (A) and (B) are shown in (A’-A’’’) and (B’-B’’’), respectively, showing U11 FISH staining (green, A’&B’), DAPI staining (white, A’’&B’’), and merged channels (A’’’&B’’’). Open arrowheads indicate non-nuclear U11 FISH staining. Scale bars=5 µm. (C-D) The channel-adjusted MIP images of the P21 control (C) and mutant (D) cortices, corresponding to panels (A) and (B), respectively, with the U11 channel adjusted to minimize the non-nuclear U11 FISH staining observed in both the P21 control and mutant cortices (A-B’’’), while retaining U11 FISH signal in NeuN+ cells in the P21 control cortex (Figure S1K’-O’). Scale bars=30 µm. Magnified images of the boxed regions in (C) and (D) are shown in (C’-C’’’) and (D’-D’’’), respectively, showing U11 FISH staining (green, C’&D’), DAPI staining (white, C’’&D’’), and merged channels (C’’’&D’’’). Open arrowheads indicate non-nuclear U11 FISH staining. Scale bars=5 µm.

260

Figure S5. Related to Figure S4. Higher resolution images of the non-nuclear U11 FISH staining in P21 control and mutant cortical sections. (A-F) Non-nuclear U11 FISH staining in raw MIP images from the P21 control (A-C) and P21 mutant (D-F) cortical images from Figure S4A’-B’’, showing U11 FISH staining (green, A&D), DAPI staining (white, B&E), and merged channels (C&F). (A’-F’) Non-nuclear U11 FISH staining in channel-adjusted MIP images from the P21 control (A’-C’) and P21 mutant (D’-F’) cortical images in Figure R6C’-D’’, showing U11 FISH staining (green, A’&D’), DAPI staining (white, B’&E’), and merged channels (C’&F’). Open arrowheads indicate non-nuclear U11 FISH staining. Scale bars=5 µm.

261

Chapter 10: Supplementary tables

Table S1. Select results from DAVID analysis of the 461 MIGs interrogated by custom microarray. Only terms from the GOTERM category with a Bonferroni correction ≤0.005 were included. Domain-specific DAVID results were discarded. Synonymous terms were also discarded in favor of a representative term with the highest number of genes. Term # Genes Bonferroni Intracellular transport 42 2.82E-10 Protein transport 48 8.96E-08 Nucleocytoplasmic transport 17 3.05E-06 Protein import into nucleus, docking 7 1.57E-03 Nuclear lumen 49 3.23E-04 Nuclear pore 12 1.04E-04 Transmembrane transport 33 3.07E-04 Voltage-gated ion channel activity 21 7.47E-06 Voltage-gated calcium channel activity 9 8.52E-06 Voltage-gated sodium channel activity 7 3.96E-05 MAP kinase activity 10 2.39E-09 SAP kinase activity 6 2.31E-05 Nucleotide binding 96 2.41E-06 ATP binding 66 3.91E-04

262

Table S2. Primers used for in situ hybridization probe preparation. RNA Primer Primer Sequence Target Direction Forward 5'-CTCCAAACAAGCTCTCAAGGTCCA-3' 7SK Reverse 5'-ATGCAGCGCCTCATTTGGATGTGT-3' Forward 5'-AAAGGGCTTCTGTCGTGAGTGGC-3' U11 Reverse 5'-CCGGGACCAACGATCACCAG-3' Forward 5'-ACGCCCGAGTCCTCACTGCTTAT-3' U12 Reverse 5'-ACCCAGGCATCCCGCAAAGTA-3' Forward 5'-TTTCTTGGGGTTGCGCTACTGTC-3' U4atac Reverse 5'-AAATTGCACCAAGGTAAAGCAGAGCT-3' Forward 5'-GTGTTGTATGAAAGGAGAGAAGGTTAGCAC-3' U6atac Reverse 5'-AACAGTGGTTAGATGCCACGAAGTAGGT-3'

263

Table S3. Expression data for key genes in RNAseq data in Chapter 4. FC=expression fold- change between control and mutant; Ave=average. Fold-Change Gene Log2FC Ave FPKM Control Ave FPKM Mutant (FC) Dbx1 0 1 0.608509842 0.530471762

Emx1 0.062359 1.04417 19.0897891 20.28353723

Hist1h1a -0.19403 1.14395 1229.474093 998.5326327

Lhx6 0.111118 1.08007 0.681803986 0.487783471

Neat1 0.254737 1.19312 1.388437498 1.185303486

Rnu11 -1.56145 2.95151 1877.723894 766.3865674

Spc24 -1.32088 2.49818 33.58894779 13.29649526

Tp53 0.188527 1.1396 9.33811728 10.35488266

264

Table S4. Mammalian functions of the 4 genes upregulated in the mutant pallium that enriched for “intrinsic apoptotic signaling pathway in response to DNA damage by p53 mediator” by DAVID. Related to Figure 4.6. FC=expression fold-change between control and mutant. Ave=average. Ave Ave FPKM Gene Synonyms FPKM FC Function Description Mut Ctrl p53 upregulated modulator of apoptosis, a BH3 domain-containing gene Puma, Jfy1, (PMID: 11572983); alongside Pmaip1 (below), important activator of Bbc3 0.7199858 1.8057943 2.43627 Jfy-1 genotoxic stress-induced apoptosis in NPCs (PMID: 16822983); required for apoptosis driven by increased p53 expression (PMID: 12574499) both EDA2R and its ligand, EDA, are transcriptionally activated by p53 (PMID: 19543321, 20501644); regulator of p53-mediated apoptosis via Eda2r Xedar 1.6703153 10.5807371 6.85729 interaction with the death receptor FAS, and increased expression of EDA2R results in upregulation/stabilization of FAS protein levels (PMID: 19543321) p53 apoptosis factor related to PMP-22; transcriptionally activated by p53 during p53-mediated apoptosis, but not during p53-mediated G1 arrest (PMID: 10733530, 14707288); induces p53 upregulation, post-translational Kcp1, Krtcap1, Perp 0.4671268 1.0282991 2.23922 p53 modifications that disrupt MDM2-p53 binding, and p53 nuclear Pigpc1, Thw translocation (PMID: 21451571); important positive regulator of p53- mediated apoptosis in the embryonic mouse brain (PMID: 14614825); DNA damage upregulates Perp expression (PMID: 14614825) pro-apoptotic, BH3 domain-containing gene whose transcription is activated Pmaip1 Noxa, Apr 0.7210288 2.3228602 2.56006 by p53 (PMID: 10807576, 14500851); alongside Bbc3 (above), important activator of genotoxic stress-induced apoptosis in NPCs (PMID: 16822983)

265

Table S5. Mammalian functions of the 21 MIGs regulating cell cycle with significantly elevated minor intron retention the mutant pallium. Related to Figure 4.7&9. P-values were determined by two-tailed student’s t-test. Syns=synonyms, ave=average. MSI=mis-splicing index. Ave Ave Ave Ave MSI MIG Syns FPKM FPKM MSI MSI P- ∆MSI Cell Cycle Function Description Ctrl Mut Ctrl Mut value in HeLa cells, localizes to nuclear pore and kinetochores, and RNAi-mediated knockdown causes cytokinesis defects Elys, (PMID: 17098863); required for nuclear pore assembly Ahctf1 Mst108, 25.9714 27.71142 7.64499 27.5516 0.0018 19.9066 (PMID: 17098863); siRNA-mediated knockdown in HeLa Tmbs62 cells results in mislocalization of LBR, which is important for reforming nuclear envelope post-mitosis (PMID: 22555603) cyclin K; highly expressed in pluripotent mouse embryonic stem cells, and knockdown results in increased cell Cpr4, differentiation (PMID: 22547058); knockdown of two Cdk Ccnk 5.34881 5.326958 5.05311 27.0929 0.0046 22.0398 CycK partners identified in study (Cdk12 & Cdk13), enhanced differentiation, suggesting role in self-renewal (PMID: 22547058) normally associates with chromosomes during mitosis, and Ccnt, mislocalization caused arrest at G1 and apoptosis, indicating Ccnt1 Cyct1, 16.7709 19.37268 8.40885 16.5704 0.0456 8.16158 necessary role in transcribing genes during cell division Hive1 (PMID: 18292778) initiation of DNA replication (PMID: 25933514); recruits Cdc45 24.4613 27.80446 2.46166 19.2321 0.0017 16.7704 DNA polymerase α-primase (see Prim1 below) to the DNA replication complex (PMID: 10518787) siRNA-mediated knockdown in HeLa cells shows important role in G2/M checkpoint of cell cycle (PMID: 18283122); in Cep164 Nphp15 5.06087 5.324126 5.90409 30.0608 0.0004 24.1567 human RPE-FUCCI cells, knockdown causes shorter G1, G2, and M phases, but reduced proliferation, slowed S phase, and increased apoptosis (PMID: 25340510)

266

Ave Ave Ave Ave MSI MIG Syns FPKM FPKM MSI MSI P- ∆MSI Cell Cycle Function Description Ctrl Mut Ctrl Mut value cell growth regulator; transcriptionally upregulated by p53, Cgr19, Cgrrf1 3.37512 3.102399 7.18815 35.1293 0.0015 27.9411 and its overexpression suppresses cell proliferation (PMID: Rnf197 8968090) helicase/nuclease; functions in DNA replication & S/G2 phase DNA stability (PMID: 22570476); chromosome Dna2 Dna2l 3.92161 3.757846 4.80785 17.4288 0.015 12.6209 segregation role, based on effect of knockdown (PMID: 23604072) Dctn22, dynactin necessary for mitosis progression and chromosome Dctn3 49.6795 41.57011 1.6562 12.2327 0.002 10.5765 p22, p24 segregation (PMID: 25645239, 21163948) , light intermediate chain 1; knockdown alongside LIC2 results in misregulation of centriole cohesion during Lic1, mitosis (PMID: 25422374); depletion in human cells causes Dync1li1 Dncli1, 11.1754 11.34764 2.30119 8.52481 0.021 6.22362 metaphase delay and increased interkinetichore distance, Dnclic1 suggesting a role in the spindle assembly checkpoint (PMID: 19229290) transcription factor induced by cellular stress, including DNA damage, resulting in apoptosis (PMID: 12717439); target of pRB and complexes with it—depending on circumstance and cell cycle stage, can drive either activation or repression (PMID: 20016602, 20224733); however, based Rbp3, on KO mouse studies (which interrogated , , & E2f1 Rbap1, 6.06778 6.486815 12.9125 53.2398 0.001 40.3273 ), not required for cell division in embryonic tissues and Rbbp3 specific adult tissues tested, but it is required for DNA integrity and cell survival (PMID: 20016602); misexpression in embryonic lens fiber cells causes cell cycle entry and subsequent apoptosis (PMID: 11095619); AKT is a target of E2f1 (PMID: 22298593) Rb-independent transcription repressor that may repress E2f6b, E2f6 9.95253 8.848759 6.96865 31.4943 0.0006 24.5257 E2F-responsive genes during S phase; also interacts with Ema chromatin remodeling complex member Brg1 on G1/S gene..

267

Ave Ave Ave Ave MSI MIG Syns FPKM FPKM MSI MSI P- ∆MSI Cell Cycle Function Description Ctrl Mut Ctrl Mut value … promoters during S phase, suggesting role in G1/S phase transition (PMID: 23082233); in response to DNA E2f6b, replication stress, E2f6 replaces activating E2fs at promoters E2f6 9.95253 8.848759 6.96865 31.4943 0.0006 24.5257 Ema to repress transcription before G1/S transition, and dissociation is driven by Chk phosphorylation of E2f6 (PMID: 23954429) cyclin G-associated kinase; in Nestin-cre conditional knockout, observed morphological changes in ventricular zones (cells less dense) and hypothalamus, indicating Dnaj26, Gak 17.3443 19.80183 4.88252 15.8289 0.0012 10.9464 proliferation defects (PMID: 18434600); siRNA revealed Dnajc26 importance of GAK for spindle assembly and chromosome alignment during mitosis (arrest during mitosis) (PMID: 20237935) Emp, Emlp, macrophage erythroblast attacher protein; localizes to mitotic Maea Gid9, 16.3218 17.36623 1.20219 4.79234 0.0407 3.59015 spindle and contractile ring during mitosis/cytokinesis Pig5, (PMID: 16510120) Hlc-10 sister chromatid cohesion factor required for loading cohesin onto chromosomes (PMID: 16802858, 16682347); siRNA- mediated knockdown in HeLa cells disrupted sister Mau2 Scc4 20.3413 22.14701 1.78068 9.64785 0.0129 7.86717 chromatid cohesion, resulting in chromosome mis- segregation and prolonged prometaphase, at the expense of metaphase and anaphase (PMID: 16682347) centrosomal protein that is required for γ-tubulin ring complex localization to the centrosome, with knockdown Gcp-wd, Nedd1 23.2697 21.244 7.73854 14.7733 0.0263 7.0348 causing defects in centrosomal microtubule nucleation, Tubgcp7 aberrant mitotic spindles, and inhibition of centriole duplication (PMID: 16461362)

268

Ave Ave Ave Ave MSI MIG Syns FPKM FPKM MSI MSI P- ∆MSI Cell Cycle Function Description Ctrl Mut Ctrl Mut value phosphatase regulatory subunit; in cancer cell line, knockdown enhanced T-cell proliferation and cytokine Pr55δ, production, and inhibited T-cell apoptosis (PMID: Ppp2r2d PP2A- 13.9211 11.13086 5.06246 17.2976 0.005 12.2351 24476824); work in suggests PP2a-B55δ inhibits B55δ CDK1 activation, thereby regulating mitosis entry/exit (PMID: 19696736) catalytic primase subunit of DNA polymerase α-primase (PMID: 20030400); generates an RNA primer, followed by Prim1 p48, p49 22.8139 25.48825 4.46452 23.1345 5E-06 18.67 primer extension to produce RNA-DNA primer, to initiate DNA replication (reviewed in PMID: 9759502) Recc5, component of the replication factor C complex, which loads 36kDa, PCNA onto DNA at primer-template junctions or onto Rfc5 22.4726 14.61202 4.95427 12.8491 0.0017 7.89481 36.5kDa, nicked duplex DNA, in turn allowing for DNA synthesis via p36 DNA polymerase δ (PMID: 9759502) in KO mouse, observe embryonic lethality ~E7, with impaired growth of inner cell mass, inability of trophoblast Nae2, Uba3 44.7306 35.89686 2.52561 8.4869 0.0199 5.96128 cells to enter S phase, with ICM cells undergoing Ube1c , with dysregulation of cyclin expression and reduced degradation of B-catenin (PMID: 11696557) overexpression in ovarian cancer cells causes inhibition of vascularization of tumor and of tumor growth (PMID: Vash1 6.94233 6.493455 2.45056 6.43191 0.0212 3.98134 26460696); knockdown in cancer cells, followed by transplantation into mice led to significantly increased tumor growth and metastasis (PMID: 25797264) microtubule-associated factor required for chromosome Zfp207 Bugz 74.5366 99.31389 2.50424 11.093 0.0003 8.58876 alignment in mitosis (PMID: 24462186, 24462187)

269

Table S6. The DNA damage response functions of 14 MIGs with significantly elevated minor intron retention in the mutant pallium. Related to Figure 4.6. MSI=mis-splicing index. P-values were determined by two-tailed student’s t-test. Syns=synonyms, ave=average. Ave Ave Ave MSI Ave MSI MSI MIG Syns FPKM FPKM ∆MSI DNA Damage Response Function Description Ctrl Mut P-value Ctrl Mut forms remodeling complex with ISWI (complex: WICH), which phosphorylates and Wstf, maintains this phosphorylation of H2AX Baz1b Wbscr9, 21.907384 25.945518 5.447441 12.41971 0.019728 13.97227 (ɣH2AX), thereby regulating important steps in Wbscr10 the DNA damage response process (PMID: 19092802) cyclin K; when complexed with Cdk9, Cpr4, accumulates on chromatin in response to Ccnk 55.348807 5.326979 5.053107 27.09289 0.004641 22.03978 CycK replication stress, and ssDNA in stressed cells (PMID: 20930849) centrosomal protein that is phosphorylated by the DNA damage response proteins ATR and ATM; 24.15670 siRNA-mediated knockdown in HeLa cells results Cep164 Nphp15 5.060874 5.324126 5.904086 30.06079 0.000429 1 in reduced phosphorylation of multiple DNA damage response proteins and chromosome missegregation (PMID: 18283122) as part of the SCF E3 ubiquitin ligase complex, ubiquitinates Ku80, a component of the initiating Cul1 41.697261 47.309440 2.804814 16.18051 0.001812 13.3757 complex of the NHEJ double-strand break DNA repair pathway, thereby regulating this complex's removal from DNA (PMID: 23324393) cullin; ubiquitin ligase that targets Ddb1 (another MIG, below) and Ddb2, both of which act to Cul4a 11.101313 12.071160 3.608965 15.74483 0.044297 12.13587 initiate the DNA damage response, for ubiquitination and degradation, thereby regulating the DNA damage response (PMID: 19481525)

270

Ave Ave Ave MSI Ave MSI MSI MIG Syns FPKM FPKM ∆MSI DNA Damage Response Function Description Ctrl Mut P-value Ctrl Mut due to 5’ to 3’ endonuclease activity, involved in resection extension in homologous recombination- Dna2 Dna2l 3.9216094 3.7578457 4.807850 17.42876 0.015044 12.62091 mediated DNA double stranded break repair (PMID: 28718810) large subunit of DNA damage-binding complex, which functions in nucleotide excision repair XPE, (PMID: 16951172); loss of Ddb1 in mouse brain DDBA, Ddb1 86.093717 88.333331 0.461818 2.706198 0.004385 2.244381 results in accumulation of cell cycle regulators and XAP1, increased genomic instability, ultimately causing XPCE apoptosis of proliferating neuronal progenitors (PMID: 17129780) upregulated in response to DNA damage, and Rbp3, promotes DNA damage-induced apoptosis E2f1 Rbap1, 6.067779 6.486815 12.91249 53.23981 0.000964 40.32732 (PMID: 11459832); indirectly regulates the Rbbp3 transcription of the DNA damage response gene GADD45A (PMID: 20713352) single strand-specific DNA endonuclease Ercc5 Xpg 5.363914 6.275879 1.333767 7.642992 0.048076 6.309225 involved in the 3' incision step of nucleotide excision repair (PMID: 7657672) 5’ to 3’ endonuclease involved in DNA mismatch Msa, repair and resection extension in homologous Exo1 4.4268470 4.1420745 3.731127 9.386061 0.008405 5.654934 Hex1 recombination-mediated DNA double stranded break repair (PMID: 24705021) recruited to sites of DNA damage and interacts Ints7 8.821004 8.814813 3.590800 13.52046 0.012655 9.929663 with SSB1, a DNA damage sensor recognized by ATM (PMID: 21659603)

271

Ave Ave Ave MSI Ave MSI MSI MIG Syns FPKM FPKM ∆MSI DNA Damage Response Function Description Ctrl Mut P-value Ctrl Mut nuclear protein that promotes formation of Parp, poly(ADP-ribose) chains (PARylation), which Ppol, transfers ADP-ribose group from NAD+ to target Parp1 Adprt, 47.879274 47.679811 0.771961 14.72329 0.002560 13.95133 protein and forms a scaffold around DNA breaks, Artd1, allowing for recruitment of essential DNA damage Adprt1 response factors (PMID: 21989215, 22431722) after DNA damage, phosphorylated by ATM, UBPO, resulting in Usp10 stabilization and transport to Usp10 10.625277 8.050707 4.474797 17.26511 0.012563 12.79031 Uchrp the nucleus, where it deubiquitylates p53, thereby stabilizing p53 (PMID: 20096447)

272

Table S7. Primer sequences employed in Chapter 4. Primer Name Primer Direction Sequence (5' to 3') Forward (primer 1) ACCCTCCCCTACTGTTTTAC Rnu11 cKO, 5' loxP Reverse (primer 2) AGGCTGCTACAGGATGACTC Forward (primer 3) CATGTGTTTGCTGGGAATTG Rnu11 cKO, 3' loxP Reverse (primer 4) CTCATGAGGCAGATCTCTGAA Forward AGAAGAACAACAGCCGCATCAAACTGG Hist1h1a expression Reverse CTTGGACTCAGCCTTCTTGTTCAGCTT Forward TATCCAGCAACATTTGGGCCAGCT Cre genotyping Reverse AACATTCTCCCACCGTCAGTACGTGA Emx1-Cre zygosity, Forward AAGGTGTGGTTCCAGAATCG wild-type Reverse CTCTCCACCAGAAGGCTGAG Emx1-Cre zygosity, Forward GATCTCCGGTATTGAAACTCCAGC mutant Reverse GCTAAACATGCTTCATCGTCGG Ai9 tdTomato Cre Forward AAGGGAGCTGCAGTGGAGTA reporter, wild-type Reverse CCGAAAATCTGTGGGAAGTC Ai9 tdTomato Cre Forward GGCATTAAAGCAGCGTATCC reporter, mutant Reverse CTGTTCCTGTACGGCATGG Forward AAAGGGCTTCTGTCGTGAGTGGC U11 expression Reverse CCGGGACCAACGATCACCAG Forward AATTGGCCAGAAGACAACAGGGTTTGC Neat1 expression Reverse GTATTCAGTGGCAAAGCACTCATGAGG Forward CAGTTGCAGTTTATGCGGCAGGTG Coa3 minor intron Reverse (intronic) CTAGCCACCCTTGCTGTTTTCCCAAA splicing Reverse (exonic) CAGCTTTGGCTTCATCTTCCAGCTC Forward CTCCCAGACATGACAGCCATCATCAA Pten minor intron Reverse (intronic) CTACTCCCACGTTATCAGAGTGACAGAA splicing Reverse (exonic) CAAGTCTTTCTGCAGGAAATCCCATAGC Forward AAAACCACCCCTGACCCTTCG Parp1 minor intron Reverse (intronic) GCACACAGCATAGCCAAGAAAGG splicing Reverse (exonic) AATGTACCTGGGGAGGGCAGTT Forward AGCAGGTCTTACAGCCGAGATTATCG Sfrs10 expression Reverse CCAAACACGCCAAGACAACAGTTG Forward CTCATACCTTGCACAGAACTGGGGTT Spc24 expression Reverse AATAAAAAAGAAGCTGCAGGCCAGCC Forward GGATAGTTCATCTCCTGCAGGTCACAAG Intergenic primers Reverse GTCCCACCCATCTAGTTTAGCATCAGC

273

Chapter 11: Appendix I

Before I began my graduate work studying the role of minor intron splicing on CNS development, I pursued an independent research project in the Kanadia laboratory as an undergraduate, in which I sought to identify transcriptional networks underpinning mouse retinal development. In the process, I found that replication-dependent histone genes were differentially expressed in the E16 and P0 mouse retina, contradicting long-standing models about the regulation of these genes. In my final semester of my undergraduate education at Mount Holyoke College, I continued to investigate the expression of these replication-dependent histone genes across mouse retinal development. My findings culminated in my first publication, entitled “Replication- dependent histone genes are actively transcribed in differentiating and aging retinal neurons,” which comprises the entirety of this appendix [251].

Background

Paralogous genes are functionally related with a high degree of (AA) conservation, and are derived via gene duplication. The value of paralogous genes was conceptualized by Ohno, who proposed the idea of evolution through gene duplication [601]. The central idea was that when a gene duplicated, the new paralog was relieved from the pre-existing functional constraints and was now free to evolve novel functions [601]. Recently, a more nuanced interpretation of this original idea has emerged, and it suggests that the most likely outcome of gene duplication is an accelerated evolution of both paralogs, which in turn leads to subcompartmentalization of the function of the ancestral genes. This subcompartmentalization can be achieved by transcription regulation of these paralogs in a tissue specific manner and across development. A good example of paralogs is that of the histone and the snRNA genes of the major

274 spliceosome. Here the genes have duplicated as clusters of many gene copies, or isotypes, within a locus. The large number of paralogs in this case is thought to have evolved to meet the heavy demands of the cell for these proteins and snRNAs. However, a recent publication from the

Ackerman laboratory showed that the loss of function of one of the U2 snRNA genes, called Rnu2-

8, results in ataxia and neurodegeneration [602]. This observation showed that, like protein-coding gene paralogs, a snRNA gene whose expression was thought to be ubiquitous was in fact spatiotemporally regulated. Thus, the question we sought to study was whether histone genes followed similar spatiotemporal transcription regulation.

There are five classes of histone proteins in the eukaryotic chromatin, including H2a, H2b,

H3, H4, and H1. The core nucleosome octamer consists of two molecules of , H2b,

H3, and H4, around which 146 bp of left-handed superhelical DNA is wrapped in ~1.6 turns.

Higher order structure is achieved by linking these DNA-wrapped octamers via linker histone called H1. In mammals each histone protein is encoded by multiple genes that are categorized into two major groups: replication-dependent (canonical) and replication-independent (replacement) histone genes [603, 604]. Here we focus on transcription regulation of replication-dependent histone genes that are found in three clusters: Hist1 (Chr-13), Hist2 (Chr-3) and Hist3 (Chr-11) in the mouse genome. An exception to this is the single replication-dependent histone Hist4h4, which is located on . In mice there are 6 paralogs for , 20 for H2a, 18 for H2b,

12 for H3, and 13 for H4 (Table A1.1) [603].

Replication dependent histone genes have been shown to be actively transcribed during the

S-phase of the cell cycle. They also generally lack introns and are not poly-adenylated [605-607].

Instead, their 3’ untranslated region (3’UTR) has a stem loop to which two trans-acting factors, stem-loop binding protein (SLBP) and U7 snRNP, bind to facilitate rapid export [606, 607]. This

275 in turn allows for rapid histone protein production necessary for a cell undergoing cell division.

This suggests that in a developing tissue that is composed of mitotic and post-mitotic cells, replication-dependent histone genes are not transcribed in the post-mitotic cells. However, a recent publication showed that histone proteins are translated in adult neurons, which suggests that in the central nervous system, transcription of replication-dependent histones might not be coupled to cell cycle [608]. To further understand the transcriptional regulation of histones in senescent neurons, we employed the mouse retina as our model system. The retina is derived from the developing central nervous system and is composed of six neurons, including rod photoreceptors, cone photoreceptors, amacrine cells, bipolar cells, horizontal cells, and retinal ganglion cells. The retina also contains one glial cell type, called Müller glia. The retina has a stereotypic architecture where different cells are tiled together to form three layers, including the ganglion cell layer

(GCL), which contains retinal ganglion cells and displaced amacrine cells; the inner nuclear layer

(INL), which contains amacrine cells, bipolar cells, horizontal cells, Müller glia, and displaced ganglion cells; and the outer nuclear layer (ONL), which contains rod photoreceptors and cone photoreceptors. The retinal cell types are produced from the multipotent retinal progenitors in a stereotypic temporal sequence [609, 610]. In mice, the production of retinal neurons spans from embryonic day (E) 12.5 to approximately postnatal day (P) 10 followed by terminal differentiation, synaptogenesis, and pruning until P21 [184, 609].

Here, we discovered that replication-dependent histone genes were not only transcribed during development, but also in adult and aging retinal neurons. Similarly, the replication- dependent histone genes of a gene family, irrespective of coding for identical or different protein variants, were differentially expressed during embryonic and postnatal development. Similar variation was observed in the single cell microarray analysis on individual retinal neurons. This

276 suggests that there is differential contribution of the various histone isotypes to the total histone pool that will ultimately constitute the nucleosome. This suggests the existence of a variant nucleosome across retinal development and among the various retinal neurons. Thus, mutations that lead to loss of function of one isotype might not be compensated by other histone genes.

Indeed, this possibility was recently confirmed by a report that showed that mutations in specific histone genes can lead to pediatric glioblastoma [611, 612].

Results

Deep sequencing shows differential transcription of replication-dependent histone genes

Transcription of replication-dependent histone genes in yeast and HeLa cells has been shown to be synchronized to the S-phase of the cell cycle [613]. We wanted to interrogate whether this holds true in a developing tissue such as the retina. To this end, we mined our deep sequencing data from cytoplasmic extract (CE) of E16 and P0 retinae [241]. First, we interrogated the expression patterns of housekeeping genes, such as Actb and Gapdh, along with Pax6, which showed equivalent expression in both E16 CE and P0 CE (Figure A1.1A) [614-616]. Next, we interrogated genes that are known to shift in their expression patterns, such as Fgf15, Fgf13, Nr2e3,

Nrl, and Rho [617-621]. We also used Xist to determine cross-contamination from the nuclear extract to the cytoplasmic extract. As expected, Fgf15 and Fgf13 showed higher expression at E16 than P0, while Nr2e3 and Nrl showed expression only at P0 (Figure A1.1A), which is in agreement with previous reports [618-620]. Finally, Rho did not show expression at either of these time points, and the nuclear-specific Xist transcripts were not observed in the cytoplasmic extracts at either time point (Figure A1.1A). Here we found that all histone genes were not transcribed at the same level within and between the two time points (Figure A1.1B-F). Specifically, for H1 genes

277

Hist1h (1a and 1b) were highly expressed followed by Hist1h (1e and 1c) (Figure A1.1B).

Hist1h1d was the lowest expressed H1 histone. As predicted, Hist1h1t was not detected (Figure

A1.1B) in the retina as it is exclusively expressed in spermatozoa [622]. Overall, the trend for H1 genes was higher expression in P0 CE than in E16 CE. For the H2a genes, Hist1h (2aa2, 2al, 2an,

2ao, and 2ag) were highly expressed (Figure A1.1C). This was followed by Hist2h2ac, Hist3h2a,

Hist1h (2ab and 2ak), Hist2h (2aa1 and 2ab), Hist1h (2ap, 2ah, 2af, 2ai, 2ae, 2ad, and 2ac) (Figure

A1.1C). In contrast, Hist1h (2aj and 2aa) were not expressed (Figure A1.1C). Again, like H1 genes, higher transcript levels were observed for the H2a genes in P0 CE compared to E16 CE

(Figure A1.1B&C). For H2b genes, the highly expressed genes were Hist2h2bb, Hist1h (2bk, 2bb, and 2bf) followed by Hist1h (2bp and 2bc), Hist3h2ba, and Hist1h (2bj, 2bg, 2bh and 2br) (Figure

A1.1D). The three genes Hist1h2bn, Hist2h2be, and Hist1h2bm were expressed at lower levels, and all three had higher expression levels in P0 CE compared to E16 CE. Finally, the H2b genes that were not expressed were Hist1h (2be, 2ba, 2bl, and 2bq) (Figure A1.1D). Overall, the expression pattern for H2b was similar to that of H2a (Figure A1.1C&D). For H3 genes, the highly expressed genes were Hist1h3b, Hist2h3b, and Hist1h (3i and 3h), followed by Hist1h (3g),

Hist2h3c2, and Hist1h (h3e, h3d, 3c, 3a, and 3f), while Hist2h3c1 was not expressed (Figure

A1.1E). Overall, most of the H3 genes had relatively higher expression levels at P0 than at E16, except for Hist1h3g and Hist2h3c2, which had the same expression levels at both time points

(Figure A1.1E). For H4 genes, Hist1h (4f, 4k, and 4m) were relatively highly expressed compared to Hist1h (4h, 4c, 4d, 4a, 4j, and 4b), Hist2h4, Hist4h4, and Hist1h (4n and 4i) (Figure A1.1F).

Again, the overall expression for H4 was higher at P0 than E16.

Replication-dependent histone gene isotypes encode variant proteins

278

Given that most of the histone genes were differentially expressed at E16 and P0, we wanted to investigate whether there are functional consequences to these dynamic expression patterns. To study this, we interrogated whether all of the histone paralogs encode identical proteins. First, we sought to arrange the amino acid (AA) sequences of each histone protein.

Previously, primary sequence variants have been reported [603], but there have been continuous updates in the databases, so we revisited this analysis. Based on NCBI, Ensembl, UCSC genome browser, MGI, and published literature, we organized the histone genes by name and cluster (Table

A1.1). Second, we performed multiple sequence alignments based on neighbor-joining by percent identity of polypeptides (Figure A1.2). All H1 polypeptide sequences diverged significantly at the

N-terminus (1-60 AA) and C-terminus (119-223 AA), but were highly conserved from AA 60-118

(Figure A1.2A). These sequence variations have previously been described for human H1 proteins

[623]. The most divergent protein was Hist1h1t followed by Hist1h1a, and the most similar were

Hist1h1d and Hist1h1e (Figure A1.2A). In the case of H2a, 9 proteins had 100% identical

(canonical protein) AA sequence, while 10 had AA variations compared to the canonical protein

(Figure A1.2B). Among the latter nine histone variants, Hist1h2ak and Hist1h2af differed from the canonical protein at S123T and A127P, respectively (Figure A1.2B). Two proteins, Hist1h2am and Hist1h2ah, lacked two AA residues at the end of the C-terminus compared to the canonical form. Four proteins, Hist1h2al, Hist2h2aa1, Hist3h2a, and Hist2h2ac, showed AA changes at

T17S, S41A, L52M, and R100K (Figure A1.2B). Hist2h2ac also showed amino acid changes at

H125K and G129S, and it lacked an amino acid at position K126 (Figure A1.2B). The most divergent polypeptides were Hist2h2ab and Hist1h2aa (Figure A1.2B). Hist2h2ab differed at

T17S, S41A, L52M, I88V, R100G, H125K, K126P, A127G, and G129N (Figure A1.2B).

Hist1h2aa differed from the canonical polypeptide at R4P, G5T, Q7R, A15V, T17S, K37Q, S41A,

279

E42Q, V44I, I63V, I80T, H125K, K126S, A127Q, K128T, G129K, and it also lacked a ‘K’ at position K130 (Figure A1.2B). For H2b, 6 proteins had 100% identical (canonical protein) AA sequence, while 12 proteins had variations (Figure A1.2C). For example, Hist1h2bk differed from the canonical polypeptide at position S126A (Figure A1.2C). Three H2b proteins, Hist1h2b (c, e, and g), were the same, but differed from the canonical polypeptide at position S77G (Figure

A1.2C). Similar proteins were encoded by Hist1h2bm and Hist1h2bh, but they also differed from the canonical peptide at positions A5T and V20L, respectively. The rest of the H2b proteins varied from the canonical polypeptide at several positions, such as Hist1h2bb (A5S, V20I, and T21S),

Hist2h2bb (E3D, A23V, and S77G), Hist3h2ba (A5S, K7R, A9T, V20I, S34G, V41I, and I96V) and Hist2h2be (P4L, V41I, S77N, A99S, and S126A) (Figure A1.2C). The two most distant polypeptides were Hist1h2bp and Hist1h2ba (Figure A1.2C), which differed from the canonical polypeptide at many positions. The former differed at A5V and A9V, and the latter at P4V, S8G,

P10T, A11I, P12S, S16F, A23T, D27E, K29R, S34C, V41I, V43I, G62S, and N69T (Figure

A1.2C). Additionally, the Hist1h2bp protein had 12 extra AA residues (ILWNKFYYLPSF) at its

C-terminal, and Hist1h2ba had an extra AA ‘V’ at position six (Figure A1.2C). For H3, there were

12 proteins, of which 8 were 100% identical and 4 were the same, except for a single AA change at S97C (Figure A1.2D). For H4, all 13 proteins were 100% identical (Figure A1.2E).

Histone genes are dynamically expressed across embryonic and postnatal retinal development

The various histone genes were differentially expressed at E16 and P0. Moreover, the proteins encoded by these genes have variations in their AA sequences. Taken together it suggested that expression of the different paralogs might be regulated during retinal development. To address

280 this issue, we employed qRT-PCR analysis to interrogate the transcription of specific histone isotypes across retinal development starting with E14, E16, E18, P0, P2, P4, P10, and P25. For this, we fractionated the retinae from each time point into cytoplasmic and nuclear RNA followed by cDNA synthesis. However, first we wanted to check the level of cross-contamination between the two retinal fractions. For this, we determined the expression profiles of two genes, Xist and

Malat1, both of which are known to reside predominantly in the nucleus [356, 624, 625]. Here the qRT-PCR results showed that across all time points, Xist and Malat1 RNA was predominantly observed in the nuclear fraction (Figure A1.3A). The expression levels shown in Figure 2A were normalized to that of Gapdh and the values were plotted on a log scale where values below 0.001 were considered not expressed. For Xist, expression was observed only in the nuclear fraction while the values for the cytoplasmic fraction were below the threshold (Figure A1.3A). Moreover, the primers employed for Xist were designed such that they would only detect successfully spliced

RNAs, thereby eliminating any genomic DNA contribution. For Malat1, expression was enriched in the nuclear fraction, but it was also observed in the cytoplasmic fraction (Figure A1.3A). This is consistent with previous reports that show that Malat1 does shuttle between the two fractions, but is predominantly localized in the nucleus [356]. In the case of P25, levels of Xist were below the threshold in the nuclear fraction, which was not surprising, since the retinae used in this case were obtained from male mice and Xist is exclusively expressed in females. Regardless, the enriched levels of Malat1 in the nuclear fraction confirmed minimal contamination between the two fractions from P25 (Figure A1.3A). The validation of the integrity of fractions across time also revealed the expression kinetics of these two genes. In this case, the expression kinetics for

Xist and Malat1 were such that mRNA levels remained relatively steady across retinal development. While the differences in the levels of expression of Xist can vary based on the

281 number of female retinae in the pool, the expression of Malat1 is not sex-linked. Thus, the trends observed for Malat1 reflected its transcription. For example, there was a ~2.5-fold higher expression of Malat1 in the nuclear fraction than in the cytoplasmic fraction from E14 to P0

(Figure A1.3A). However, this difference in expression between the nuclear and cytoplasmic fraction was reduced at P4 and P10 followed by an increase at P25 in the NE fraction (Figure

A1.3A).

The expression profile of Malat1 raised another issue regarding the quality of the cDNA as it relates to development, i.e., whether these changes are developmental or due to variation in cDNA quality. To address this issue, we interrogated the expression of fibroblast growth factor 15

(Fgf15) and rhodopsin (Rho) in all the fractions across development. Fgf15 expression is known to be high in embryonic development and fall during postnatal development [617], while Rho expression begins around P0 and increases postnatally [621]. Again, the qRT-PCR results for

Fgf15 and Rho were as predicted. Expression of Fgf15 and Rho intersected at P2 where Fgf15 levels were declining while levels of Rho were increasing (Figure A1.3B). For both genes, relatively higher levels were observed in the nuclear fraction compared to the cytoplasmic fraction, which was also the case for Gapdh (data not shown). Interestingly, for Rho, the expression in the nuclear fraction approached the threshold around E18, and Rho levels in the cytoplasmic fraction mirrored this at P0 (Figure A1.3B). In all, the retinal cDNA prepared had minimal cross- fractionation contamination, and it reflected expression profiles of known retinal genes. Therefore, these cDNA libraries were suitable for assessing the expression of histone genes.

Having confirmed the integrity our retinal cDNA libraries, we next sought to interrogate the expression kinetics of the histone genes. In order to profile histone gene expression, we selected several candidate genes for qRT-PCR analysis from each histone gene family that met the

282 following criteria. First, the availability of sequences to which qRT-PCR-quality primers with the right Tm, GC contribution, no palindromic sequences, no self-annealing, and no hairpins could be designed. Second, the interrogated paralogs of a specific histone family should encode a specific protein variant. Third, the primers for qRT-PCR should amplify only that paralog and not cross- hybridize to other members. Based on these criteria we found a limited number of genes from each of the five histone families to interrogate across retinal development. For histone H1, we interrogated Hist1h1a, Hist1h1b, Hist1h1c, and Hist1h1e across retinal development (Figure

A1.4A). Overall, expression of Hist1h1a, Hist1h1b, and Hist1h1e was higher during embryonic development compared to Hist1h1c expression, which was ~3-fold lower at E14 (Figure A1.4A).

In contrast, the expression of Hist1h1a, Hist1h1b, and Hist1h1e declined postnatally, while that of

Hist1h1c was higher, with peak expression observed at P4 followed by a steady decline at P10 and

P25 (Figure A1.4A). Specifically, expression of Hist1h1a was highest at E14 followed by a steady decline, and its transcript levels were consistently higher in the cytoplasmic extract compared to the nuclear extract (Figure A1.4A). In the case of Hist1h1b, expression remained steady from E14 to P2, followed by a precipitous fall at P4, where transcript levels in the nuclear fraction exceeded levels in the cytoplasmic fraction. This was followed by a steady decline at P10 and P25 (Figure

A1.4A). Similarly, Hist1h1e expression pattern followed the trends observed for Hist1h1b during embryonic development, except that it was expressed at a lower level than levels of cytoplasmic

Hist1h1b (Figure 3A). However, during postnatal development the expression of Hist1h1e peaked at P4, to coincide with the peak of Hist1h1c, followed by a steady decline at P10 and P25 (Figure

A1.4A). Again, the difference in transcript levels of Hist1h1e between the nuclear and cytoplasmic fractions was the largest at P4 (Figure A1.4A).

283

For histone H2a, we interrogated Hist3h2a, Hist1h2ac, and Hist1h2ab across retinal development. Overall, the expression of Hist3h2a and Hist1h2ab was 4-fold higher than

Hist1h2ac, except at P0, when Hist3h2a and Hist1h2ab levels were 2-fold higher (Figure A1.4B).

Expression of both Hist3h2a and Hist1h2ab was similar in that the expression levels remained steady across postnatal development. Specifically, Hist3h2a expression dipped at E18, followed by a spike at P0, and then it remained steady during postnatal development. In contrast, expression of Hist1h2ab showed a steady increase in expression with a peak at P0, followed by a steady decline, but one that was not as drastic as the decline observed for the histone H1 genes examined

(Figure A1.4B). Finally, transcript levels of Hist1h2ac were not observed over the threshold at any time (Figure A1.4B).

For , we interrogated Hist2h2bb, Hist2h2be, and Hist3h2ba during retinal development. Overall, the expression of Hist3h2ba was 1-fold higher than the expression of

Hist2h2bb or Hist2h2be (Figure A1.4C). Specifically, Hist3h2ba expression dipped at E18 followed by a spike at P0 and then steady decline through P4. This was followed by an increase from P10 to P25 (Figure A1.4C). Also, the transcript levels of Hist3h2ba were consistently higher in the cytoplasmic extract compared to the nuclear extract, which was not observed for Hist2h2bb or Hist2h2be at E16, P4, P10, and P25 (Figure A1.4C).

For histone H3 genes, we were able to design only a single primer pair that could meet the aforementioned criteria, and it was for Hist2h3c2. Regardless, we designed two additional primer pairs, of which the first could hybridize to Hist1h3e and Hist1h3f and the second set could bind to all of the H3 genes. Overall, the expression of all H3s was such that it was high embryonically and peaked at P0, with a steady decline postnatally. Specifically, expression of Hist2h3c2 was high embryonically with a dip at E18, followed by a spike at P0 and a steady increase postnatally (Figure

284

A1.4D). A similar expression pattern was observed for Hist1h3e&f, albeit at a lower level (Figure

A1.4D). Interestingly, RNA for Hist1h3e&f and Hist2h3c2 were highly enriched in the nuclear fraction compared to the cytoplasmic fraction, except for Hist1h3e&f at E18 (Figure A1.4D).

Finally, for Hist2h3c2, expression was only above threshold in both nuclear and cytoplasmic fractions after P10 followed by increased expression at P25 (Figure A1.4D).

For histone H4 genes, we interrogated Hist1h4a, Hist1h4d, and Hist4h4 across retinal development (Figure A1.4E). Overall, the expression Hist1h4a, Hist1h4d, and Hist4h4 genes showed the same pattern across retinal development (Figure A1.4E). Moreover, their transcript levels were consistently higher in the cytoplasmic extract compared to the nuclear extract, except for Hist4h4 at P25 when it was reversed (Figure A1.4E). Specifically, expression of all three H4 genes, like that of Hist3h2a and Hist3h2ba, dipped at E18 followed by a spike at P0 and then a steady decline during postnatal development.

For most histone genes tested by qRT-PCR, we observed that their transcript levels were comparatively very low at E18 and very high at P0. In order to ensure that it was not an artifact of

RNA preparation, we further tested our fractionated RNA samples (n=3) by examining expression of known genes. We analyzed three genes, Nr2e3 and Nrl, which are known to be expressed at late embryonic and postnatal time points [618-620], and Pax6, which is known to be expressed across retinal development [614-616] (Figure A1.4F). As predicted, we found that Nr2e3 and Nrl transcript levels were very low at E14 and E16 but were considerably high from E18 onward, with a steady increase through all postnatal time points examined. Nuclear transcript levels were higher than cytoplasmic levels at E14 and E16. The Pax6 expression levels were high at E18 and low at

P0, which further confirmed that histone expression levels were not due to artifacts of RNA preparation (Figure A1.4F).

285

Our qRT-PCR analysis reported the levels of histone transcripts in nuclear and cytoplasmic extracts across retinal development. To understand the export kinetics of histone mRNA during development, we further analyzed the qRT-PCR data for representative genes from each family, except for H3, where our primer pairs hybridized with all H3 transcripts. Here, we found that

Hist1h1b transcripts were predominantly in the cytoplasmic extract relative to the nuclear extract, and the pattern continued until postnatal day 2, which is in agreement with previous reports of efficient histone transport (Figure A1.5A) [626]. Hist1h1b levels were also higher in the cytoplasm than the nucleus at P10. However, at P4 and P25, there were higher levels of transcript observed in the NE compared to the CE (Figure A1.5A). A similar pattern was observed for Hist3h2a, where the CE had higher transcript levels compared with the NE from E14 until P2 (Figure A1.5B). Like

Hist1h1b, at P4 the ratio of Hist3h2a transcripts in the cytoplasmic and nuclear extracts began to shift, although not enough for the majority of transcripts to be nuclear. At P10 and P25, higher levels of Hist3h2a were detected in the NE compared to the CE (Figure A1.5B). This, however, was not the case for Hist2h2bb, where higher transcript levels in the NE were observed as early as

E16, followed by a shift toward higher levels in the CE from E18 to P2 (Figure A1.5C.

Interestingly, the levels in the NE were consistent with the previously mentioned histone genes in the postnatal retina, except that the NE levels were significantly higher than the CE levels (Figure

A1.5C). The transcript levels for Hist1h4d were consistently higher in the CE compared to the NE from E14 to P25, which differed from the pattern observed for the three aforementioned histone genes (Figure A1.5D). Finally, we also mined our deep sequencing data for the P0 CE and P0 NE data [241] for these four histone genes and found that the transcript levels in the P0 CE vs. P0 NE were in agreement with our qRT-PCR data at P0 (Figure A1.5A-D).

286

Replication-dependent histone mRNAs are expressed in progenitor cells and retinal neurons during development

The qRT-PCR analysis revealed a dynamic transcription pattern of the different histone genes during retinal development. However, it only revealed the presence/absence of a transcript, not the specific cell types of the retina that express these genes. Additionally, expression of specific histone genes was gradually increasing after P10 (Figure A1.4), when there are no cycling cells

[627]. Together, it suggested that replication-dependent histone genes might not be restricted to dividing cells and that they might be transcribed in differentiating retinal neurons. To address this issue, we employed section in situ hybridization (ISH) across retinal development starting with

E14, E16, E18, P0, P3, P7, and P14 using probes specifically designed for Hist1h1c, Hist1h2ab,

Hist3h2ba, Hist1h3a, and Hist4h4. However, given the similarity of the nucleotide sequences among the paralogs of a given histone family, three of the probes tested here, Hist1h2ab,

Hist3h2ba, and Hist1h3a, would cross-hybridize to the transcripts of other members of that family

(Table A1.4).

H3 ISH: At E14, ISH analysis showed expression of H3 in progenitor cells toward the peripheral retina and asymmetric enrichment toward the vitreal side of the developing retina

(Figure A1.6A’&A’’). At E16, expression of H3 in progenitor cells continued in the outer neuroblastic layer (ONBL). Again, expression of H3 in the peripheral retina was enriched asymmetrically toward the vitreal side of the retina (Figure A1.6B&B’). In addition, H3 expression was observed elsewhere in the brain (Figure A1.6B’’). At E18, expression of H3 in progenitor cells continued in the ONBL across the whole retina (Figure A1.6C). Moreover, the ISH signal for

H3 was enriched in cells asymmetrically toward the apical end of the retina (Figure A1.6C’&C’’),

287 where the retinal progenitor cells undergo the M-phase of their cell cycle [165, 184]. A similar expression pattern was observed elsewhere in the brain, where the signal was enriched asymmetrically in a sub-population of cells (Figure A1.6D&D’), possibly either in M-phase or exiting the cell cycle. During postnatal development, H3 mRNA was again observed in proliferating progenitor cells, and the ISH signal was enriched toward the lower part of the ONBL at P0 and P3 (Figure A1.6E&F). Specifically, at P0, H3 mRNA was highly enriched in the central retina compared to the peripheral retina (Figure A1.6E’&E’’). At P3, the signal was highly enriched in the periphery compared to the central retina. The latter was in agreement with the decrease in progenitor population in the center compared to the periphery (Figure A1.6F’&F’’).

At P7, H3 ISH signal was enriched in progenitor cells in the peripheral retina; however, in the central retina, its expression was asymmetrically enriched in the GCL and amacrine cells of the

INL (Figure A1.6G-G’’). Finally, at P14, H3 expression was enriched in the INL across the whole retinal section with very little signal observed in the photoreceptor layer (Figure A1.6H).

Hist1h1c ISH: At E14, Hist1h1c mRNA was enriched in progenitor cells toward the peripheral retina and there the mRNA was enriched asymmetrically toward the vitreal side of the developing retina (Figure A1.7A-A’’). A similar expression pattern was observed elsewhere in the brain, where it was asymmetrically enriched in a small population of cells (Figure A1.7A’’’). At E16, expression of Hist1h1c in progenitor cells continued as it was observed in the ONBL where progenitor cells reside, but not in the newly formed GCL (Figure A1.7B). Again, expression of

Hist1h1c in the peripheral retina was enriched asymmetrically toward the vitreal side of the retina

(Figure A1.7B’&B’’). Moreover, Hist1h1c was also observed toward the apical end in cells evenly distributed across the whole retinal section. Like E14, expression of Hist1h1c at E16 was also

288 observed elsewhere in the brain and was highly expressed in a subpopulation of cells (Figure

A1.7B’’’). At E18, Hist1h1c expression was observed in progenitor cells and was also enriched in the peripheral retina, but it was toward the apical end of the retina (Figure A1.7C-C’’). During postnatal development, Hist1h1c expression was enriched in proliferating progenitor cells, and the

ISH signal was enriched toward the lower part of the ONBL at P0 (Figure A1.7D). Specifically, expression level was higher in the periphery compared to the central retina, which was in agreement with the decrease in progenitor population in the center compared to the periphery

(Figure A1.7D’&D’’). This trend continued at P3 and P7 (Figure A1.7E&F), except that at P7, expression was also observed in cells that were most likely bipolar cells, based on their localization

(Figure A1.7F’&F’’). Finally, at P14, expression of Hist1h1c was observed in the outer plexiform layer and in photoreceptor cells, which were most likely rods and cone photoreceptors (Figure

A1.7G).

Hist4h4 ISH: At E14, ISH analysis showed expression of Hist4h4 in progenitor cells with enrichment of signal in puncta toward the apical end in the peripheral retina (Figure A1.8A-A’).

Similar expression was observed at E16, except the puncta observed at E14 in the periphery were now observed across the entire retina (Figure A1.8B&B’). Again, like H3, expression of Hist4h4 was observed in specific brain regions, where cells appeared to be migrating away from the dividing progenitor pool (Figure A1.8B’’&B’’’). At E18, expression was again observed in progenitor cells in the ONBL, except the puncta observed at E16 were not observed (Figure

A1.8C&C’). Elsewhere in the brain, cells expressing Hist4h4 all appeared to be migrating (Figure

A1.8C’’&C’’’). At P0, expression was highly enriched in progenitor cells (Figure A1.8D), and the pattern continued at P3 with a central to peripheral gradient (Figure A1.8E-E’’). At P7, Hist4h4

289

ISH signal was enriched in few cells of the INL, which, by their location, were most likely amacrine cells and also toward the apical part of the ONL (Figure A1.8F). The cells in the apical end of the ONL, by their position, were most likely differentiating photoreceptors (Figure A1.8F).

Finally, at P14, robust Hist4h4 expression was observed in all of the retinal neurons (Figure

A1.8G-G’’).

H2a ISH: Section ISH showed that H2a mRNA was not restricted to progenitor cells during embryonic development at E14, E16, and E18 (Figure A1.9A-C). Expression at E18 appeared to be highly enriched in differentiating neurons and in cells at the apical end, which might be RPE cells and/or cells at the M-phase of the cell cycle (Figure A1.9C’&C’’). Expression in ganglion cells continued at P0 (Figure A1.9D&D’), but was not observed at P3, where only the progenitor cells showed H2a expression (Figure A1.9E). Specifically, at P3, H2a mRNA was highly enriched in the peripheral retina compared to the central retina (Figure A1.9E’&E’’). This is in agreement with the decrease in progenitor population in the center compared to the periphery. At P7 and P14,

H2a ISH signal was observed in all three layers of retina but was enriched in the GCL and some cells in the INL, which were possibly horizontal cells based on their location (Figure A1.9F&G).

H2b ISH: Section ISH of H2b gene showed a similar trend to that of H2a (Figure A1.9&10). At

E14, H2b gene expression was enriched in progenitor cells, but the mRNA was again asymmetrically enriched toward the vitreal side and the apical side of the developing retina (Figure

A1.10A-A’’). The apical enrichment is likely in the RPE cells. A similar pattern was observed at

E16, where expression of H2b in progenitor cells was observed in the ONBL and was enriched in the newly formed GCL (Figure A1.10B-B’’). At E18, ISH signal for H2b was enriched in the

ONBL compared to the GCL (Figure A1.10C-C’). During postnatal development, H2b mRNA was

290 again observed in proliferating progenitor cells that reside in the ONBL at P0 and P3 (Figure

A1.10D&E). Specifically, at P0, H2b mRNA was highly enriched in the periphery compared to the central retina (Figure A1.10D’&D’’), while at P3, few cells showed expression in a central to peripheral gradient (Figure A1.10E-E’’). At P7, H2b ISH signal was enriched in few cells of the

INL, which, by their location, are most likely horizontal cells (Figure A1.10F). Like H2a, H2b

ISH signal was observed in all three layers of retina but was enriched in the GCL and some cells in the INL at P14 (Figure A1.10G).

Replication-dependent histone genes are differentially expressed among the different cell types of the retina

ISH analysis revealed expression of the different histone genes in subtypes of differentiating retinal neurons. Therefore, we sought to investigate whether the histone isotypes are differentially expressed among the different neurons at the same developmental time point. For this, we mined the published single-cell microarray data from the Cepko laboratory [242, 628-

630]. This analysis showed differential expression of the histone isotypes in single neurons profiled at different developmental stages starting at E12 through adult (Figure A1.11). First, ganglion cells from E12, E13, E14, E15, E16, and P0 showed large variation in the subtype of histone genes that were expressed. For example, Hist2h2ab was expressed in most of the embryonic ganglion cells and P0, but not in the ganglion cell from E16. In contrast, expression of

Hist1h2be was observed in the E16 ganglion cell, but not in the E14 or E15 ganglion cells. Similar variation was observed in amacrine and bipolar cells (Figure A1.11). The most dramatic expression pattern was observed in the adult cone photoreceptor, in which expression of isotypes for all histone families except H4 was observed. In this cone photoreceptor, expression of specific histone

291 isotypes, such as Hist2h2bb, was observed, but this was not the case for Hist1h2bp (Figure A1.11).

Expression of histone genes was significantly different in the adult rod photoreceptor compared to the adult cone photoreceptor in that very few histone isotypes were expressed in both, including

Hist1h1c and Hist1h3a. Interestingly, the same histone isotypes were not observed in a developing rod photoreceptor cell from P5, which showed expression of Hist2h2ab, Hist1h2bp, and

Hist1h2be, which were absent in the adult rod photoreceptor cell (Figure A1.11). Finally, expression of H1, H2a, and H2b was observed in the P13 Müller glial cell, while expression of H3 and H4 isotypes was not observed. The microarray utilized for single-cell profiling does not have probe sets for every histone isotype. The data obtained for the subset of histone genes showed dynamic patterns of expression of histone genes among retinal neurons across time.

Replication-dependent histone genes are transcribed in the aging retina

Our qRT-PCR analysis showed that replication-dependent histone genes are differentially transcribed in the retina long after cell proliferation has stopped (Figure A1.4). Specifically, genes such as Hist2h3c2, Hist1h2ab, and Hist1h1c showed upward trends in expression at P25 (Figure

A1.4B-D). This observation was further confirmed by our in situ analysis, which showed expression of histone genes in differentiating neurons (Figures A1.6-A1.10). Together, this suggested that the transcription of replication-dependent histone genes might continue throughout aging. To study this, we mined our microarray data obtained from retinae starting with P12, 5 weeks (w), 8w, 9w, 25w, 40w, and 75w. First, we organized the microarray values for the histone genes within the context of the values obtained for genes with known expression patterns, such as

Rho, Nrl, Nr2e3, Pax6, Gapdh, and Fgf15 (Figure A1.12). The values for these genes were used as boundaries to define categories of histone genes that were expressed at high, intermediate, and

292 low levels. The histone genes that were highly expressed included Hist1h2bc, Hist3h2a, and

Hist1h1c, in that order, and were flanked by the expression values for Rho and Pax6 (Figure

A1.12). The histone genes that were expressed at intermediate levels included Hist3h2ba,

Hist2h2aa1, Hist2h3c1, and Hist2h2be, in that order, and were flanked by the expression values for Pax6 and Gapdh. Hist1h4h, Hist1h2be, Hist1h3d, Hist2h2bb, Hist1h2bp, Hist1h1t, Hist1h4i, and Hist1h3i were expressed at low levels and were flanked by the expression values for Gapdh and Fgf15, which has been reported to be turned off in the adult retina [617]. Thus, the genes with values below that of Fgf15, such as Hist1h4i, Hist1h1e, and Hist1h4f, were considered not expressed (Figure A1.12). Most histone genes showed steady expression across time, except for

Hist1h3d and Hist2h2bb, which showed an increase in expression at 75w. Surprisingly, Hist1h1t, which has been reported to be a spermatozoa-specific histone, was expressed at 75w [622] (Figure

A1.12).

mTOR-mediated regulation of cyclins D1 and E is decoupled in retinal neurons

Our data thus far suggest that the transcription of replication-dependent histones is decoupled from cell cycle, which implies that neurons might lose histones due to chromatin remodeling and, as a result, there is active histone transcription in aging neurons. However, there is a second possibility: that replication-dependent histone transcription is being driven by the accumulation of cyclins in these aging neurons. Elevated expression of cyclins E, D, A, and B has been observed in differentiated neocortex neurons, and recent cell culture experiments have also shown that cyclins

D1 and E accumulate in senescent cells [631, 632]. Previous studies have described a relationship between cyclin E and replication-dependent histone transcription, mediated by Cdk2 and Npat. In this pathway, cyclin E, when bound to Cdk2, activates the protein Npat, which then initiates the

293 transcription of replication-dependent histones [633]. Thus, senescence-driven cyclin E accumulation could drive replication-dependent histone expression in post-mitotic cells. Before testing this possibility, it is necessary to define “senescence.” In a senescent cell, its cell cycle is arrested and growth-promoting pathways, such as mTOR, remain active. As a result, the cell loses its ability to proliferate entirely. In a quiescent cell, the cell cycle is arrested, but growth-promoting pathways are inactive [634]. Based on these definitions, neurons are senescent, as they have not been shown to have the potential to enter cell cycle and also possess an active mTOR pathway

[635, 636]. In cell culture, Leontieva et al. were able to show that treatment with rapamycin, an inhibitor of the mTOR pathway, resulted in a reduction in cyclins D1 and E [632]. Given that cyclin E interacts with Cdk2 to activate Npat, a transcription factor that regulates histone transcription, we reasoned that mTOR inhibition in neurons might decrease transcription of replication-dependent histones. To test this idea, we treated adult mice with rapamycin or carrier for 5 days by IP injections. The retinae were harvested, and both protein and RNA were generated from these samples. To confirm the effectiveness of rapamycin in the inhibition of mTOR, we interrogated the phosphorylation of S6K, a key target of mTOR activity. We saw loss of S6K phosphorylation in rapamycin-treated to untreated samples, suggesting the effective downregulation of mTOR activity (Figure A1.13A). This was followed by qRT-PCR analysis on cDNA prepared from rapamycin-treated and untreated samples for cyclin D1 (Ccnd1), cyclin E1

(Ccne1), cyclin E2 (Ccne2), and Npat (Table A1.3). Here, we found no significant change in transcript levels for all four tested genes in the treated and untreated samples (Figure A1.13B).

This suggested that in the adult retina, inhibition of mTOR by rapamycin did not significantly alter transcription of targets observed in cell culture experiments. This result, in turn, predicted that we would not observe changes in replication-dependent histone transcription between the treated and

294 untreated samples. To test this idea, we only selected histones that showed active transcription in the postnatal retina and performed qRT-PCR on cDNA from treated and untreated samples. We tested two histone genes from each histone type. Hist1h1a and Hist1h1c were selected from the

H1 histones, Hist1h2ab and Hist3h2a were selected from the H2a histones, Hist2h2be and

Hist3h2ba were selected from the H2b histones, and Hist1h4a and Hist4h4 were selected from the

H4 histones. For the H3 histones, we used the all H3s primer pair and the primer pair for

Hist2h3c2. For all 10 histones tested, none showed a significant difference in normalized fold expression between the treated and untreated samples (Figure A1.13C-G). To determine whether there is a change in cyclins during aging, we also interrogated our microarray data and found that

Ccnd1, Ccne1, and Ccne2 remain relatively steady in expression, except for a slight inversion in

Ccnd1 and Ccne1 where, at 75 weeks, Ccnd1 is upregulated while Ccne1 is further downregulated

(Figure A1.13H).

Discussion

The clustered organization of the different replication-dependent histone genes for the five different histone families has been thought to play a vital role in synchronized transcription of these genes to the S-phase of the cell cycle [603, 613]. However, our deep sequencing data showed that not all histone genes are transcribed simultaneously at the same levels in either E16 or P0 retinae. This is in agreement with a recent report by Singh et al., which showed that H2A genes are differentially transcribed in cancer cells [637]. This suggests that histone genes within clusters are also differentially transcribed and regulated during retinal development. For example, we did not observe expression of specific histone genes, such as Hist1h2aa (H2a), Hist1h2bq (H2b), and

Hist2h3c1 (H3) (Figure A1.1C-E). We also did not observe the expression of the spermatozoa-

295 specific histone Hist1h1t (Figure A1.1B) [622, 638]. Thus, it is possible that these histone genes are expressed in other tissues and/or in the retina at a different time point. The utilization of a subset of histone genes within a family that encode a specific histone can have developmental consequences, since these subsets of genes encode different protein variants (Figure A1.2). For example, Hist2h2bb was expressed at a much higher level than Hist1h2bn (Figure A1.1D), which suggests that, of the two, Hist2h2bb is the predominant H2b protein in the nucleosome octamer.

Amino acid sequence alignment of these two proteins shows that Hist2h2bb differs from

Hist1h2bn at E3D, A23V, and S77G (Figure A1.2C). Thus, the constituent H2b proteins of the nucleosome at these developmental time points are a mix of different H2b variants. Similarly, differential expression of H2a and H3 family members was also observed, along with amino acid sequence variations for the different protein variants (Figure A1.1C&E). For H4 histone genes, the expression kinetics of the different isotypes were not different, which was expected, since all of the H4 family members encode for proteins that are 100% identical (Figures A1.1F and A1.2E).

Interestingly, all of the H1 histone genes that are expressed in the retina show a great degree of variation in their amino acid sequences (Figure A1.2A). Thus, the H1 variant that is used as the linker for the nucleosome octamers could impact higher-order chromatin organization, which in turn could influence gene expression. In all, our data suggest that the nucleosome at a specific developmental time can be composed of an assortment of histone variants for H1, H2a, H2b, and

H3.

The differential contribution of the different histone isotypes to the production of the nucleosome during development was further confirmed by our qRT-PCR analysis across retinal development. This observation was underscored by the expression kinetics of Hist1h1c compared to that of Hist1h1b (Figure A1.4A). Specifically, Hist1h1b was expressed ~3-fold higher than

296

Hist1h1c from E14 to P2, at which point Hist1h1b expression began to decline while Hist1h1c expression spiked (Figure A1.4A). This suggests that two genes within the same cluster must be controlled by a higher-order regulatory mechanism. Since these two genes encode for different H1 histone variants, our qRT-PCR data suggest a dynamic switch in the expression of the gene contributing to the linker protein between P2 and P4. This observation is coincident with the initiation of terminal differentiation and the decline of the progenitor cell population, which was further reflected in our ISH results (Figure A1.7) [627]. Here we observed expression of Hist1h1c at P14 in differentiating photoreceptor cells (Figure A1.7G). The expression of Hist1h1c in photoreceptor cells is in accord with a recent publication that showed that Hist1h1c regulates the condensation of chromatin in rod photoreceptor cells [639].

Similarly, for H2a, we observed dynamic expression of Hist3h2a and Hist1h2ab, which encode different H2a variants. For example, at E18, the expression of Hist1h2ab was elevated, while that of Hist3h2a expression was significantly lower (Figure A1.4B). This difference in expression suggests differential contribution of these two genes toward the production of the H2a proteins incorporated into the nucleosome. This phenomenon is also observed at later time points, suggesting the existence of variant throughout retinal development. For H2b, H3, and H4, the expression of all genes examined showed similar patterns in that they all showed a dip in expression at E18, followed by a spike at P0 (Figure A1.4C-E). This suggests the possibility of the existence of a regulatory mechanism that represses transcription of histone at a very critical time in development, i.e., birth. The drop in transcription prior to birth was specific to these histone genes, as genes with known expression profiles, including Nr2e3, Nrl, and Pax6 [614-616, 618-

620], and other histone genes, such as Hist1h1b and Hist1h2ab, did not show a drop in expression at E18 (Figure A1.4A,B&F). Moreover, for all five histone protein families, specific genes showed

297 an upward trend in expression levels between P10 and P25 (Figure A1.4). For example, all examined H2a genes and Hist2h3c2 showed an increase in expression between P10 and P25

(Figure A1.4B). Given that there are no progenitor cells dividing at this stage, expression of these replication-dependent histone genes must occur in differentiating neurons. Indeed, like Hist1h1c

(Figure A1.7), our ISH results showed expression of H3 and H2b in bipolar, amacrine, and ganglion cells in the P14 retina (Figure A1.6&A1.10). Similarly, expression of Hist4h4 was observed in newly differentiating photoreceptors at P7, and then it was panretinal, i.e., in all retinal layers, at P14 (Figure A1.8). The expression of all the histone genes tested by qRT-PCR and ISH strongly suggests the continued expression of replication-dependent histone genes in differentiating neurons.

Another important feature of replication-dependent histone genes is that their mRNA is efficiently transported to the cytoplasm. Histone transcripts do not have a poly-A tail; instead, their

3’ UTR has a hairpin loop structure to which trans-acting factors, such as SLBP and U7 snRNA, bind and mediate export [613]. This efficient export mechanism affords dividing cells the ability to produce large amounts of histone proteins [613]. Indeed, our observation of higher transcript levels in the CE compared to the NE during early retinal development for the histone genes examined is in agreement with this idea. However, during postnatal development starting with P4, there were higher transcript levels in the NE compared to the CE for most of the histone genes examined (Figure A1.5). This is coincident with a decrease in progenitor cell population and the initiation of retinal differentiation [627]. Thus, it is possible that the histone transcript export might be regulated like that of other protein-coding transcripts in differentiating neurons, as the demand for histones might be lower than for a progenitor cell. Another possibility is that the turnover of the histone transcripts in the CE increases in differentiating neurons compared to turnover in

298 progenitor cells. In either case, our results show a dynamic shift in histone mRNA transport and turnover between progenitor cells and neurons.

In all, we observed differential spatiotemporal expression of histone isotypes encoding different histone variants in the developing retina. Based on this, here we propose a model of a variant nucleosome that consists of histone variants of H1, H2a, and H2b. This model does not include H3, even though H3 has two different replication-dependent protein variants, because our primer pairs do not distinguish transcripts from genes that encode these two variants. In this model, for the four tested genes in the H1 family, we propose that the mRNA contribution shifts from predominantly Hist1h1b and Hist1h1a during embryonic development to Hist1h1c and Hist1h1e after P4 (Figures A1.4A&A1.14A). Interestingly, for H2a, of the three genes we examined, the predominant contribution across development was observed for Hist1h2ab (Figures

A1.4B&A1.14B). However, the overall percentage of Hist1h2ab and Hist3h2a changed over time.

Similarly, for H2b, the predominant contribution across development was observed for Hist3h2ba, and the overall percentage of Hist3h2ba, Hist2h2bb, and Hist2h2be changed over time (Figures

A1.4C&A1.14C). It must be noted that the model we propose is only based on the genes we were able to interrogate via qRT-PCR. However, one could imagine that the complexity of this model would increase significantly if the expression of all the isotypes were considered across development.

Our model predicts variant nucleosomes across time, but our ISH results also suggested that variant nucleosomes might exist among the different retinal cell types at a given time point.

Indeed, interrogation of the single-cell microarray data supports this possibility, as cell type- specific expression of histone isotypes is observed. For example, Hist1h1c, which was expressed in photoreceptor cells according to ISH (Figure A1.7), showed high expression in the two adult

299 rod and cone photoreceptor cells, while expression was not observed in ganglion and amacrine cells from E12, E13, E14, E15, E16, and P0 (Figure A1.11). In contrast, Hist3h2a was observed in all of the single cells tested, except for the ganglion cell at P0 and the adult rod photoreceptor

(Figure A1.11). This dataset further confirms the possibility of variant nucleosomes within the different neurons of the retina, which in turn could inform the specific transcriptome profiles necessary for the differentiation/function of these neurons.

The persistent expression of the replication-dependent histone isotypes post-development was an intriguing observation, which was further confirmed by our microarray analysis of the aging retina. Here, we found a subset of histone genes that were expressed at levels above Gapdh and Pax6, suggesting high demand for these histone protein variants (Figure A1.12). The decoupling of the transcription of a subset of replication-dependent histones and cell cycle warrants a change in the nomenclature for these specific histone isotypes. While these isotypes are linked to cell cycle early in development, their transcriptional regulation in neurons suggests independent mechanisms of regulation, one in progenitor cells and the other in neurons. Another possibility is that the transcription of this subset of genes is regulated similarly to the other histones in progenitor cells, except that they are not turned off in neurons. The specific pathway that ensures the transcription of this subset of histones in neurons is also not directly linked to the mTOR pathway that, in neurons, is shown to operate other biological processes, such as neuromodulation

[636, 640]. One possibility is that the age-related changes in the chromatin require exchange of these histones, as has been described for replacement histones [641, 642], except that here we show a similar phenomenon for replication-dependent histones. Indeed, recent reports show that while some histone proteins are long-lived, such as H3.3, others, including H2a and H2b, are actively translated [608].

300

The significance of our findings is that replication-dependent histone genes are actively transcribed in non-dividing cells, i.e., neurons. Moreover, we show that there is dynamic expression of the different isotypes, which might contribute to the production of a variant nucleosome within cell types across time. This suggests that the histone function is subcompartmentalized by the differential transcription regulation of the various paralogs. This in turn might inform chromatin organization and transcriptional profiles of the different retinal cell types. Indeed, recent publications show that single point mutations at the amino acids that are specific to histone variants cause cancers, such as carcinoma of the endometrium, large intestine, lung, and breast tissue and lymphoid neoplasm of hematopoietic and lymphoid tissue (Table A1.6)

[642, 643]. Recent publications further underscore the importance of investigating specific replication-dependent histone genes [611, 612]. Here, they found that a point mutation in either

H3F3A, a replacement histone gene, or HIST1H3B, a replication-dependent histone gene, was observed in nearly 80% of pediatric diffuse intrinsic pontine gliomas (DIPGs) and 22% of non– brain stem gliomas. In addition, a recent study showed that siRNA knockdown of replication-dependent

HIST1H2AC gene leads to the increased rates of cell proliferation and tumorigenicity [637]. These findings, along with our observation of the continued expression of replication-dependent histone genes in retinal neurons across development and aging, show the need to further investigate the regulation of the different replication-dependent histone isotypes.

Materials and Methods

Animal Procedures

301

All experiments used CD1 mice from Charles River Laboratory, MA. All mice procedures were compliant with the protocols approved by the University of Connecticut’s Institutional Animal

Care and Use Committee (IACUC).

Fractionation and cDNA Preparation

CD1 mice retinae were collected at different developmental time points. Retinal RNA isolation was performed using the Thermo Scientific NE-PER Nuclear and Cytoplasmic Extraction

Kit (Thermo Fisher Scientific, 78833*) with the following modifications. Approximately 20 retinae at E14, E16, E18, and P0; 8-15 retinae at P2 and P4; 6-8 retinae at P10 and P14; and 2-4 retinae at P25 were used for fractionation. The retinae were harvested in PBS pH 7.4 and washed once in a microfuge tube followed by centrifugation at 4° C for 30 seconds. After aspirating the

PBS out of the tube, the retinae were resuspended in 500 µL of CER-I reagent from the NE-PER kit followed by the addition of 5 uL 1M DTT and 10 µL of RNase Inhibitor (Roche, 03335402001).

The retinae were then vortexed for 15 seconds at room temperature followed by incubation on ice for 10 minutes. Subsequently, 27.5 uL of Cer-II (NE-PER Kit) was added followed by vortexing for 5 seconds and incubation on ice for 1 minute. The lysed retinae with intact nuclei were then centrifuged at 14,000 rpm for 10 minutes at 4° C. The supernatant, which was the cytoplasmic extract, was then mixed with 500 uL of TRIzol followed by RNA preparation as per the manufacturer’s instructions. The nuclei were obtained as a pellet, which was resuspended in 500 uL of TRIzol followed by sonication to lyse the nuclei. This step was followed by RNA preparation as per the manufacturer’s instructions. 1 μg of retinal RNA was used for cDNA synthesis [644].

Deep Sequencing

302

Fractionated cytoplasmic RNA of E16 and P0 retinal time points was used for RNAseq on the Illumina HighSeq 2000 platform, as described in [241]. qRT-PCR

qRT-PCR was performed in 20 μL reactions containing 30 pmol of each primer, 1 μL of diluted (1:100) cDNA template, and 10 μL of SsoFastTM EvaGreen® Supermix (Bio-Rad, 172-

5203) using the CFX96 Touch™ Real-Time PCR Detection System (Bio-Rad, 185-5195). All reactions were formed in triplicate by using qRT-PCR protocol as seen in Table A1.2. Xist and

Malat1 primers were used to validate cDNA fractionation. Primers for Fgf15, Rho, Nr2e3, Nrl, and Pax6 were used to verify cDNA quality. Primers for 16 histone genes across all five histone groups were designed to examine histone expression trends throughout development (Table A1.3).

Gapdh was used for normalization, as suggested by a recent publication for retina [645]\. The analysis was performed by the 2(-delta delta Cq) method [646].

In situ Hybridization

Sixteen μm cryosections of CD1 mouse heads (for embryonic time points) and retinae (for postnatal time points) were used for in situ hybridization as previously described [242]. Five DIG- labeled antisense probes against Hist1h1c, Hist1h2ab, Hist3h2ba, Hist1h3a, and Hist4h4 mRNA were created from PCR amplified product. Each probe was created by adding T7 and SP6 sites onto their respective qRT-PCR primer sequences (Table A1.4). To determine probe specificity, we performed BLAST analysis. We found that except for the H1 and H4 probes, H3, H2a, and

H2b probes can hybridize with various histone isotypes within their families (Table A1.5).

Microarray

303

Three to four retinae from each time point were used for RNA extraction with TRIzol, as per the manufacturer’s instructions (Invitrogen, 15596-026). A minimum of three arrays were analyzed per time point, including P12, 5 weeks (w), 8w, 9w, 25w, 40w, and 75w. The microarray platform utilized and the bioinformatics analysis performed were the same as described previously

[647]. The microarray platform did not include probes for all histone isotypes.

Rapamycin Treatment

Rapamycin (LC Laboratories) stock was diluted at 10 mg/mL in 50% ethanol. For intra- peritoneal injections, the stock was further diluted to 2.5 mg/mL in 50% ethanol. Adult (9 week)

CD1 mice (n=4) were injected daily with rapamycin at a concentration of 3 µg/g body weight for

5 days. Control mice (n=3) were injected with equivalent volumes of 50% ethanol. Retinae were harvested 5 hours after the fifth injection.

Protein and RNA Extraction

Retinae were dissected in ice-cold 1x PBS. From each mouse, one retina was placed in 250

µL of TRIzol (Life Technologies), and the other was sonicated in 100 µL RIPA buffer containing protein and phosphatase inhibitors (Roche; cOmplete: protease inhibitor cocktail; and PhosStop: phosphatase inhibitor cocktail). RNA extraction was performed according to the manufacturer’s instructions. The sonicated sample was centrifuged at 10,000 RPM for 10 minutes at 4°C, and the supernatant was collected into a fresh tube.

Western Blot Analysis

Protein extracts (10 µg) were resolved on a 4-20% Tris-Glycin gradient gel (BioRad), followed by immunoblot analysis, as described previously [253]. The following primary antibodies were used: rabbit α-pS6240/244 (1:1000; Cell Signaling) and rabbit α-S6 (1:1000; Cell Signaling).

304

Appendix I: Figures and Tables

Figure A1.1. Expression of replication-dependent histone genes by deep sequencing analysis. Shown here are the heat maps representing the FPKM values converted to log2. (A) Expression of genes with known expression kinetics in E16 and P0 CE. Expression of H1 isotypes (B), H2a isotypes (C), H2b isotypes (D), H3 isotypes (E), and H4 isotypes (F) in E16 and P0 CE. Shown in the box in the top right corner is the key representing the expression levels ranging from 12 (blue) to 0 (yellow).

305

Figure A1.2. Multiple amino acid sequence alignment of all histone families. ClustalW was employed to generate the images. Shown on left are the gene names with sequences that are identical shown in blue background, while the changes are indicated by white boxes. Amino acid sequences of all H1 (A), H2a (B), H2b (C), H3 (D), and H4 (E) proteins in mouse. The 5 AA residues at the N-terminal of Hist1h2al were not annotated in the MGI database.

306

Figure A1.3. Validation of fractionation/quality of cDNA. Shown here are graphs representing expression values determined by qRT-PCR analysis, with the x-axis showing different time points at which the retinae were harvested. The y-axis is in log scale showing gene expression levels normalized to Gapdh values. Here the error bars represent the standard error of the mean. Dashed lines indicate expression levels of the genes tested in the nuclear fraction, while the solid lines represent expression in the cytoplasmic fraction. (A) Normalized expression of Xist (gray) and Malat1 (black) across all time points and fractions. (B) Normalized expression of Rho (gray) and Fgf15 (black) across all time points and fractions.

307

Figure A1.4. Histone mRNA expression across retinal development and fractions as determined by qRT-PCR analysis. Shown here are graphs representing expression values determined by qRT-PCR analysis, with the x-axis showing different time points at which the retinae were harvested. The y-axis is in log scale showing gene expression levels normalized to Gapdh values. Here the error bars represent the standard error of the mean. Dashed lines indicate expression levels of the genes tested in the nuclear fraction, while the solid lines represent expression in the cytoplasmic fraction. (A) Normalized expression of four H1 histone genes, including Hist1h1a (blue), Hist1h1b (black), Hist1h1e (green), and Hist1h1c (red). (B) Normalized expression of three H2a histone genes, including Hist3h2a (red), Hist1h2ac (blue), and Hist1h2ab (black). (C) Normalized expression of three H2b histone genes, including Hist2h2bb (red), Hist3h2ba (blue), and Hist2h2be (black). (D) Normalized expression of H3 histone genes, including all H3 histone genes (red), Hist1h3e and Hist1h3f (black), and Hist2h3c2 (blue). (E) Normalized expression of H4 histone genes, including Hist1h4d (red), Hist4h4 (black), and Hist1h4a (blue). (F) Additional validation of cDNA preparation across retinal development interrogated by the expression of Nr2e3 (red), Nrl (black), and Pax6 (green).

308

Figure A1.5. Transcript levels of replication-dependent histone genes in CE vs. NE fractions. The values observed in qPCR analysis shown in Figure 4 were employed to generate pie charts to reflect histone transcript levels in CE and NE fractions. One histone from each family was selected for this analysis across retinal development: (A) Hist1h1b, (B) Hist3h2a, (C) Hist2h2bb, and (D) Hist1h4d. Shown in the far right column are the pie charts generated from deep sequence from P0 CE and NE for these genes.

309

Figure A1.6. Spatiotemporal expression analysis by in situ hybridization for all H3 histone mRNA across retinal development. In situ hybridization with DIG-labeled anti-sense RNA probe detecting all H3 histone mRNA across retinal development from E14 to P14. (A) All H3 histone mRNA in situ signal in E14 section including the brain and retina. (A’) and (A’’) are magnified images of the boxes around the retina in (A) and (A’), respectively, and (A’’’) is a magnified image of the box around the brain in (A). (B) All H3 histone mRNA in situ signal in E16 section including the brain and retina. (B’) and (B’’) are magnified images of the boxes around the retina and brain (B), respectively. (C) All H3 histone mRNA in situ signal in E18 retinal section. (C’), (C’’), and (C’’’) are magnified images of the boxes in the central retina in (C), the peripheral retina in (C), and the peripheral retina in (C’’), respectively. (D) All H3 histone mRNA in situ signal for E18 brain section. (D’) is a magnified image of the box in (D). (E) All H3 histone mRNA in situ signal in P0 retinal section. (E’) and (E’’) are magnified images of the boxes in the peripheral and central retina in (E), respectively. (F) All H3 histone mRNA in situ signal in P3 retinal section. (F’) and (F’’) are magnified images of the boxes around the peripheral and central retina in (F), respectively. (G) All H3 histone mRNA in situ signal in P7 retinal section. (G’) and (G’’) are magnified images of the boxes around the peripheral and central retina in (G), respectively. (H) All H3 histone mRNA in situ signal in P14 retinal section.

310

Figure A1.7. Spatiotemporal expression analysis by in situ hybridization for Hist1h1c across retinal development. In situ hybridization with DIG-labeled anti-sense RNA probe detecting Hist1h1c across retinal development from E14 to P14. (A) Hist1h1c in situ signal in E14 section including the brain and retina. (A’), (A’’), and (A’’’) are magnified images of the boxes around the retina in (A), the retina in (A’), and the brain in (A), respectively. (B) Hist1h1c in situ signal in E16 section including brain and retina. (B’), (B’’), and (B’’’) are magnified images of the boxes around the retina in (B), the peripheral retina in (B’), and the brain in (B), respectively. (C) Hist1h1c in situ signal in E18 section including brain and retina. (C’) and (C’’) are magnified images of the boxes around the retina in (C) and the peripheral retina in (C’), respectively. (C’’’) Hist1h1c in situ signal in the brain at E18. (D) Hist1h1c in situ signal in P0 retinal section. (D’) and (D’’) are magnified images of the central and peripheral retina from (D), respectively. (E) Hist1h1c in situ signal in P3 retinal section. (E’) and (E’’) are magnified images of the central and peripheral retina from (E), respectively. (F) Hist1h1c in situ signal in P7 retinal section. (F’) and (F’’) are magnified images of the central and peripheral retina from (F), respectively. (G) Hist1h1c in situ signal in P14 retinal section.

311

Figure A1.8. Spatiotemporal expression analysis by in situ hybridization for Hist4h4 mRNA across retinal development. In situ hybridization with DIG-labeled anti-sense RNA probe detecting Hist4h4 mRNA across retinal development from E14 to P14. Hist4h4 mRNA in situ signal in the peripheral (A) and central (A’) retina in E14 retinal section. (B) Hist4h4 mRNA in situ signal in E16 retinal section. (B’) is a magnified image of the box around the retina in (B). (B’’) Hist4h4 mRNA in situ signal in E16 brain section. (B’’’) is a magnified image of the box in (B’’). (C) Hist4h4 mRNA in situ signal in E18 retinal section. (C’) is a magnified image of the box in (C). (C’’) Hist4h4 mRNA in situ signal in E18 brain section. (C’’’) is a magnified image of the box in (C’’). (D) Hist4h4 mRNA in situ signal in P0 peripheral retinal section. (E) Hist4h4 mRNA in situ signal in P3 retinal section. (E’) and (E’’) are magnified images of the boxes around the peripheral and central retina in (E), respectively. (F) Hist4h4 mRNA in situ signal in P7 retinal section. Hist4h4 mRNA in situ signal in P14 peripheral (G) and central (G’) retinal section. (G’’) is a magnified image of the box in (G’).

312

Figure A1.9. Spatiotemporal expression analysis by in situ hybridization for H2a histone mRNA across retinal development. In situ hybridization with DIG-labeled anti-sense RNA probe detecting H2a histone across retinal development from E14 to P14. (A) H2a histone in situ signal in E14 retinal section. (A’) is a magnified image of the box around the retina in (A). (B) H2a histone in situ signal in E16 retinal section. (B’) is a magnified image of the box around the retina in (B’). (C) H2a histone in situ signal in E18 retinal section. (C’) and (C’’) are magnified images of the boxes around the central and peripheral retina in (C), respectively. (D) H2a histone in situ signal in P0 retinal section. (D’) is a magnified image of the box around the retina in (D). (E) H2a histone in situ signal in P3 retinal section. (E’) and (E’’) are magnified images of the boxes around the peripheral and central retina in (E), respectively. (F) H2a histone in situ signal in P7 retinal section. (G) H2a histone in situ signal in P14 retinal section.

313

Figure A1.10. Spatiotemporal expression analysis by in situ hybridization for H2b histone mRNA across retinal development. In situ hybridization with DIG-labeled anti-sense RNA probe detecting H2b histone across retinal development from E14 to P14. (A) H2b histone in situ signal in E14 retinal section. (A’) and (A’’) are magnified images of the boxes around the peripheral and central retina in (A), respectively. (B) H2b histone in situ signal in E16 retinal section. (B’) and (B’’) are magnified images of the boxes around the peripheral and central retina in (B), respectively. (C) H2b histone in situ signal in E18 retinal section. (C’) is a magnified image of the box around the retina in (C). (D) H2b histone in situ signal in P0 retinal section. (D’) and (D’’) are magnified images of the boxes around the central and peripheral retina in (D), respectively. (E) H2b histone in situ signal in P3 retinal section. (E’) and (E’’) are magnified images of the boxes around the central and peripheral retina in (E), respectively. (F) H2b histone in situ signal in P7 peripheral retinal section. (G) H2b histone in situ signal in P14 retinal section.

314

Figure A1.11. Cell type-specific expression of histone genes by single-cell microarray analysis. Shown here is a heat map (blue = high, yellow = low) reflecting expression of different histone genes within specific cells at different time points (embryonic and postnatal). GC = ganglion cell, AC = amacrine cell, BP = bipolar cell, MG = Müller glia.

315

Figure A1.12. Expression of replication-dependent histone genes in the aging retina. Shown here is the heat map reflecting expression values (blue = high, yellow = low) of histone transcripts as observed by microarray analysis. Genes with known expression kinetics in the retina were used to contextualize the expression of the histone genes and are shown in red.

316

Figure A1.13. Cyclin and histone transcription in rapamycin-treated retinae. (A) Validation of rapamycin treatment shown by immunoblot analysis, using rabbit α-pS6240/244 (1:1000; Cell Signaling) antibody. Total S6 detected by rabbit α-S6 (1:1000; Cell Signaling) antibody used as the loading control. (B) Shown are normalized fold expression changes as detected by qPCR analysis in untreated (ctrl) and rapamycin-treated (rap) for Ccnd1, Ccne1, Ccne2, and Npat. Normalized fold expression of the various histone genes in ctrl and rap detected by qRT-PCR for Hist1h1a, Hist1h1c (C); Hist1h2ab, Hist3h2a (D); Hist2h2be, Hist3h2ba (E); Hist2h3c2, all H3s (F); and Hist1h4a and Hist4h4 (G). All qRT-PCR values for ctrl (n=3) and rap (n=4) were first normalized to the geometric mean of Gapdh values, followed by averaging and two-tailed t-test, which showed no statistical significance for all values shown. (H) Gapdh, Ccnd1, Ccne1, and Ccne2 expression levels detected in the microarray analysis of the aging retina.

317

Figure A1.14. Variant nucleosome model. Shown here is the schematic of DNA wrapped around the core octamer and connected to the linker histone. Shown here are pie charts reflecting the total mRNA contributions (qRT-PCR data) of the different histone isotypes to the total RNA of that histone family across retinal development. (A) Relative contributions of Hist1h1a (blue), Hist1h1b (red), Hist1h1e (green), and Hist1h1c (purple) to the production of the linker histone H1 across retinal development. (B) Relative contributions of Hist1h2ab (green), Hist3h2a (blue), and Hist1h2ac (red) to the production of the core histone H2a across retinal development. (C) Relative contributions of Hist3h2ba (red), Hist2h2bb (blue), and Hist2h2be (green) to the production of the core histone H2b across retinal development.

318

Table A1.1, Mouse replication-dependent histone genes organized by type and cluster. Histone Locus Histone Genes Type (Cluster Name) Hist1h1a Hist1h1c Hist1h1e H1 (Hist1) Hist1h1b Hist1h1d Hist1h1t

Hist1h2aa Hist1h2ae Hist1h2ai Hist1h2an Hist1h2ab Hist1h2af Hist1h2aj Chromosome 13 (Hist1) Hist1h2ao Hist1h2ac Hist1h2ag Hist1h2ak Hist1h2ap Hist1h2ad Hist1h2ah Hist1h2al H2A (Hist2) Hist2h2aa1 Hist2h2aa2 Hist2h2ab Hist2h2ac

Chromosome 1 (Hist3) Hist3h2a

Hist1h2ba Hist1h2bf Hist1h2bk Hist1h2bp Hist1h2bb Hist1h2bg Hist1h2bl Chromosome 13 (Hist1) Hist1h2bq Hist1h2bc Hist1h2bh Hist1h2bm Hist1h2br Hist1h2be Hist1h2bj Hist1h2bn H2B Chromosome 3 (Hist2) Hist2h2bb Hist2h2be

Chromosome 1 (Hist3) Hist3h2ba

Hist1h3a Hist1h3d Hist1h3g Chromosome 13 (Hist1) Hist1h3b Hist1h3e Hist1h3h H3 Hist1h3c Hist1h3f Hist1h3i

Chromosome 3 (Hist2) Hist2h3b Hist2h3c1 Hist2h3c2

Hist1h4a Hist1h4d Hist1h4i Hist1h4m Chromosome 13 (Hist1) Hist1h4b Hist1h4f Hist1h4j Hist1h4n Hist1h4c Hist1h4h Hist1h4k H4 Chromosome 3 (Hist2) Hist2h4

Chromosome 6 (Hist4) Hist4h4

319

Table A1.2. qRT-PCR protocol used. Step Number Temperature Time Number of Cycles Step 1 95°C 3 minutes 1 Step 2 95°C 10 seconds Step 3 60°C 30 seconds 40 Step 4 65°C 30 seconds 1

320

Table A1.3. Full qRT-PCR Primer List. Gene Primer Direction Primer Sequence Forward 5’-CTGCCGAGAAGTTGTGCATCTACACT-3’ Ccnd1 Reverse 5’-CAGGTTCCACTTGAGCTTGTTCACCA-3’ Forward 5’-GGGCAATAGAGAAGAGGTTTGGAGGA-3’ Ccne1 Reverse 5’-GCCAATCCAGAAGAACTGCTCTCATC-3’ Forward 5’-AAGTGTGAGTCCAGTGAAGCTGAAGAC-3’ Ccne2 Reverse 5’-CCTCCATTACACACTGGTGACAGCT-3’ Forward 5'-AGATATACGGGCTGATTCGCTACTCG-3' Fgf15 Reverse 5'-GCTTGGCCTGGATGAAGATGATATGG-3' Forward 5’-AGAAGAACAACAGCCGCATCAAACTGG-3’ Hist1h1a Reverse 5’-CTTGGACTCAGCCTTCTTGTTCAGCTT-3’ Forward 5’-TCTCCCGCCAAGAAGAAGACAACGAA-3’ Hist1h1b Reverse 5’-CTCCTTAGAGGCAGAAACAGCCTTAG-3’ Forward 5'-GCGTCTAAAGCCGTAAAGCCAAAGG-3' Hist1h1c Reverse 5'-TCGAGTCCCTTGCAACCTTGCTTCATT-3' Forward 5’-CCAAGAAGAGCACCAAGAAGACTCCA-3’ Hist1h1e Reverse 5’-TTAGTTGCCTTTGCCTTCTTCGGGCT-3’ Forward 5’-AACGACGAGGAGCTCAACAAGCTG-3’ Hist1h2ab Reverse 5’-CTTGTGGTGGCTCTCGGTCTTCTT-3’ Forward 5’-GCAGCTGAGAATGACTGTGGACTACTGA-3’ Hist1h2ac Reverse 5’-GCACAGTGATATCAGACTTCGTAAGCCC-3’ Forward 5’-TGACTGCTGAGATCCTGGAATTGGCTG-3’ Hist3h2a Reverse 5’-TTGTTGAGCTCCTCGTCGTTGCGGAT-3’ Forward 5’-GGTCTACCTTACGTTGCATCTTTAGACGC-3’ Hist2h2bb Reverse 5’-TTTCGTGACAGCTTTCTTAGAGCCCTTCT-3’ Forward 5’-TGTGTACAAAGTGCTGAAGCAGGTGC-3’ Hist2h2be Reverse 5’-AGAAGCCTCGTTTGCTATGCGCTCAA-3’ Forward 5’-TACTCCATCTACGTGTACAAGGTGCTG-3’ Hist3h2ba Reverse 5’-CGATGCGCTCGAAGATGTCATTGACG-3’ Forward 5’-ACTTCAAGACCGACCTGCGCTT-3’ all H3s Reverse 5’-ATGATGGTGACACGCTTGGCGT-3’ Forward 5’-TGTCACCATCATGCCCAAGGACAT-3’ Hist1h3e&f Reverse 5’-TTTGGTGGATAATGGAGGTGGCTC-3’ Forward 5’-CACATTATCCCGCTCCATAGCTCTAG-3’ Hist2h3c2 Reverse 5’-CAAGCTAGGAGTCTGAATAAGACCGC-3’ Forward 5’-GATAACATCCAGGGCATCACCAAG-3’ Hist1h4a Reverse 5’-TCACGTTCTCCAGGAACACCTTCA-3’ Forward 5’-AAGGGTCTTGGTAAGGGTGGTGCTAA-3’ Hist1h4d Reverse 5’-AACACCTTCAGCACACCACGGGTCT-3’ Hist4h4 Forward 5’-TTTCCTTGAGAATGTGATCCGCGACG-3’

321

Reverse 5’-TTGAGCGCGTACACCACGTCCATA-3’ Forward 5'-TTTGCATTGGACTTGAGCTGAGGTGC-3' Malat1 Reverse 5'-TTCCTAGCTTCACCAAACTGGCTTCG-3' Forward 5’-TGCCAAGATAACTCTCCACTGCAGAG-3’ Npat Reverse 5’-CAACAGACGCTGCATCATTAGAGAGCTG-3’ Forward 5'-CCTGGTCCTCTTCAAACCTGAAACAC-3' Nr2e3 Reverse 5'-TGAGCCTTGCTATGCTGGCTTAGCAT-3' Forward 5'-GGCAGTAGTCTCAACTGTGTCACACA-3' Nrl Reverse 5'-TCTTGAGGAGACCATTAGACCTGCCA-3' Forward 5'-GGAGTGAATCAGCTTGGTGGTGTCTT-3' Pax6 Reverse 5'-TGGAGTCGCCACTCTTGGCTTACT-3' Forward 5'-GAGAATCACGCTATCATGGGTGTGGT-3' Rho Reverse 5'-GCTTGAGTGTGTAGTAGTCAATCCCG-3' Forward 5'-TTTGTGCTCCTGCCTCAAGAAGAAG-3' Xist Reverse 5'-AATAGGTCGCCAGCACTGCAAA-3'

322

Table A1.4. In situ probe primer table.

Primer Gene Primer Sequence (5’-3’) Name All H3s All H3s F: TAAATCCACTGTGATATCGACTTCAAGACCGACCTGCGCTT R: ATTATGCTGAGTGATATCTCGTGTGTGGCTCTGAAAAGAGCCT H1C Hist1h1c F: TAAATCCACTGTGATATCGCGTCTAAAGCCGTAAAGCCAAAGG R: ATTATGCTGAGTGATATCCACAAGCAGGGCGGAACTTTCAAATC Hist4h4 Hist4h4 F: TAAATCCACTGTGATATCAAGCATCAGAAAGGAGCTGTGGACAT R: ATTATGCTGAGTGATATCTGCCCTCATTCAGGTCCACTGTCT Hist1h2ab Hist1h2ab F: TAAATCCACTGTGATATCAACGACGAGGAGCTCAACAAGCTG R: ATTATGCTGAGTGATATCGGGTGGCTCTGAAAAGAGCCTTTGTT Hist3h2ba Hist3h2ba F: TAAATCCACTGTGATATCTACTCCATCTACGTGTACAAGGTGCTG R: ATTATGCTGAGTGATATCGAGCTGGTGTATTTGGTGACTGCCTT

323

Table A1.5. BLAST alignment percent identity (%age of match) from alignment of the individual in situ probe sequences to the relevant histone genes. Probe Gene %age of match Hist1h3a 100 Hist1h3h 98 Hist1h3g 97 Hist1h3i 97 Hist1h3f 95 Hist1h3c 95 All H3s Hist1h3b 95 Hist1h3e 94 Hist1h3d 94 Hist2h3c1 92 Hist2h3c2 92 Hist2h3b 92 Hist4h4 Hist4h4 100 Hist1h2ab 100 Hist1h2ai 100 Hist1h2ac 99 Hist1h2ae 99 Hist1h2ag 99 Hist1h2ad 99 Hist1h2ak 99 Hist1h2ao 98 Hist1h2ap 98 H2a Hist3h2a 98 Hist1h2ah 100 Hist1h2af 97 Hist1h2an 97 Hist2h2aa2 94 Hist2h2aa1 94 Hist1h2aa 97 H2afx 96 H2afj 97 Hist2h2ac 97 Hist1h2bl 96 Hist1h2br 94 Hist1h2bq 94 Hist1h2bp 94 Hist1h2bb 94 H2b Hist1h2bn 94 Hist1h2bf 94 Hist1h2bg 94 Hist1h2bc 93 Hist1h2bh 93

324

Hist1h2bk 93 Hist1h2be 96 Hist1h2bm 93 Hist1h2bj 93 Hist1h2ba 90 Hist2h2be 89 Hist2h2bb 86 H1C Hist1h1c 100

325

Table A1.6. Carcinoma-linked mutations in replication-dependent histone genes. Gene Mutation: Mutation: Disease Tissue Protein Zygosity Ref CDS AA Domain HIST1H2AA 10C>T R4 Carcinoma Endometrium N- Heterozygous NA Large Intestine Terminal HIST2H2BE c.11C>T P4L Carcinoma Lung N- Unknown (642) Breast Terminal Heterozygous HIST1H1D c.75G>A p.K25K Lymphoid Haematopoietic N- Unknown NA neoplasm and Lymphoid Terminal Tissue HIST1H1E c.74G>C p.R25P Carcinoma Lung N- Unknown NA Terminal HIST1H1B c.80A>G p.K27R Carcinoma Large Intestine N- Heterozygous NA Terminal HIST1H1A c.38C>T p.A13V Carcinoma Endometrium N- Heterozygous NA Terminal HIST1H1C c.540G>A p.A180A Carcinoma Lung Heterozygous (643) Ns HIST1H1B c.410C>G p.A137G Lymphoid Haematopoietic Unknown (642) neoplasm And Lymphoid Tissue

326

Appendix I: List of References

601. Ohno, S., Evolution by gene duplication. 1970, London, New York,: Allen & Unwin; Springer-Verlag. xv, 160 p. 602. Jia, Y., J.C. Mu, and S.L. Ackerman, Mutation of a U2 snRNA gene causes global disruption of alternative splicing and neurodegeneration. Cell, 2012. 148(1-2): p. 296-308. 603. Marzluff, W.F., et al., The human and mouse replication-dependent histone genes. Genomics, 2002. 80(5): p. 487-98. 604. Chen, M., et al., Decoupling epigenetic and genetic effects through systematic analysis of gene position. Cell Rep, 2013. 3(1): p. 128-37. 605. Wu, R.S., K.W. Kohn, and W.M. Bonner, Metabolism of ubiquitinated histones. J Biol Chem, 1981. 256(11): p. 5916-20. 606. Harris, M.E., et al., Regulation of histone mRNA in the unperturbed cell cycle: evidence suggesting control at two posttranscriptional steps. Mol Cell Biol, 1991. 11(5): p. 2416- 24. 607. Manske, M., et al., Analysis of Plasmodium falciparum diversity in natural infections by deep sequencing. Nature, 2012. 487(7407): p. 375-9. 608. Toyama, B.H., et al., Identification of long-lived proteins reveals exceptional stability of essential cellular structures. Cell, 2013. 154(5): p. 971-982. 609. Alexiades, M.R. and C.L. Cepko, Subsets of retinal progenitors display temporally regulated and distinct biases in the fates of their progeny. Development, 1997. 124(6): p. 1119-31. 610. Rapaport, D.H., et al., Timing and topography of cell genesis in the rat retina. J Comp Neurol, 2004. 474(2): p. 304-24. 611. Wu, G., et al., Somatic histone H3 alterations in pediatric diffuse intrinsic pontine gliomas and non-brainstem glioblastomas. Nat Genet, 2012. 44(3): p. 251-3. 612. Lewis, P.W., et al., Inhibition of PRC2 activity by a gain-of-function H3 mutation found in pediatric glioblastoma. Science, 2013. 340(6134): p. 857-61. 613. Jaeger, S., et al., Expression of metazoan replication-dependent histone genes. Biochimie, 2005. 87(9-10): p. 827-34. 614. Collinson, J.M., et al., The roles of Pax6 in the cornea, retina, and olfactory epithelium of the developing mouse embryo. Dev Biol, 2003. 255(2): p. 303-12.

327

615. Ashery-Padan, R. and P. Gruss, Pax6 lights-up the way for eye development. Curr Opin Cell Biol, 2001. 13(6): p. 706-14. 616. Li, S., D. Goldowitz, and D.J. Swanson, The requirement of for postnatal eye development: evidence from experimental mouse chimeras. Invest Ophthalmol Vis Sci, 2007. 48(7): p. 3292-300. 617. Kurose, H., et al., Expression of Fibroblast growth factor 19 (Fgf19) during chicken embryogenesis and eye development, compared with Fgf15 expression in the mouse. Gene Expr Patterns, 2004. 4(6): p. 687-93. 618. Liu, Q., et al., Expression of the bZIP transcription factor gene Nrl in the developing nervous system. Oncogene, 1996. 12(1): p. 207-11. 619. Cheng, H., et al., Photoreceptor-specific nuclear receptor NR2E3 functions as a transcriptional activator in rod photoreceptors. Hum Mol Genet, 2004. 13(15): p. 1563- 75. 620. Chen, J., A. Rattner, and J. Nathans, The rod photoreceptor-specific nuclear receptor Nr2e3 represses transcription of multiple cone-specific genes. J Neurosci, 2005. 25(1): p. 118-29. 621. Cepko, C.L., The patterning and onset of opsin expression in vertebrate retinae. Curr Opin Neurobiol, 1996. 6(4): p. 542-6. 622. Doenecke, D., et al., Histones: genetic diversity and tissue-specific gene expression. Histochem Cell Biol, 1997. 107(1): p. 1-10. 623. Harshman, S.W., et al., H1 histones: current perspectives and challenges. Nucleic Acids Res, 2013. 41(21): p. 9593-609. 624. Ng, K., et al., Xist and the order of silencing. EMBO Rep, 2007. 8(1): p. 34-9. 625. Ji, P., et al., MALAT-1, a novel noncoding RNA, and thymosin beta4 predict metastasis and survival in early-stage non-small cell lung cancer. Oncogene, 2003. 22(39): p. 8031-41. 626. Williams, A.S., et al., Changes in the stem-loop at the 3' terminus of histone mRNA affects its nucleocytoplasmic transport and cytoplasmic regulation. Nucleic Acids Res, 1994. 22(22): p. 4660-6. 627. Alexiades, M.R. and C. Cepko, Quantitative analysis of proliferation and cell cycle length during development of the rat retina. Dev Dyn, 1996. 205(3): p. 293-307.

328

628. Trimarchi, J.M., M.B. Stadler, and C.L. Cepko, Individual retinal progenitor cells display extensive heterogeneity of gene expression. PLoS One, 2008. 3(2): p. e1588. 629. Cherry, T.J., et al., Development and diversification of retinal amacrine interneurons at single cell resolution. Proc Natl Acad Sci U S A, 2009. 106(23): p. 9495-500. 630. Roesch, K., et al., The transcriptome of retinal Muller glial cells. J Comp Neurol, 2008. 509(2): p. 225-38. 631. Schmetsdorf, S., U. Gartner, and T. Arendt, Constitutive expression of functionally active cyclin-dependent kinases and their binding partners suggests noncanonical functions of cell cycle regulators in differentiated neurons. Cereb Cortex, 2007. 17(8): p. 1821-9. 632. Leontieva, O.V., et al., Hyper-mitogenic drive coexists with mitotic incompetence in senescent cells. Cell Cycle, 2012. 11(24): p. 4642-9. 633. Zhao, J., et al., NPAT links cyclin E-Cdk2 to the regulation of replication-dependent histone gene transcription. Genes Dev, 2000. 14(18): p. 2283-97. 634. Blagosklonny, M.V., Cell cycle arrest is not yet senescence, which is not just cell cycle arrest: terminology for TOR-driven aging. Aging (Albany NY), 2012. 4(3): p. 159-65. 635. Wu, X., et al., Insulin promotes rat retinal neuronal cell survival in a p70S6K-dependent manner. J Biol Chem, 2004. 279(10): p. 9167-75. 636. Garelick, M.G. and B.K. Kennedy, TOR on the brain. Exp Gerontol, 2011. 46(2-3): p. 155- 63. 637. Singh, R., et al., Increasing the complexity of chromatin: functionally distinct roles for replication-dependent histone H2A isoforms in cell proliferation and carcinogenesis. Nucleic Acids Res, 2013. 41(20): p. 9284-95. 638. Nayernia, K., et al., Male mice lacking three germ cell expressed genes are fertile. Biol Reprod, 2003. 69(6): p. 1973-8. 639. Popova, E.Y., et al., Developmentally regulated linker histone H1c promotes heterochromatin condensation and mediates structural integrity of rod photoreceptors in mouse retina. J Biol Chem, 2013. 288(24): p. 17895-907. 640. Magnuson, B., B. Ekim, and D.C. Fingar, Regulation and function of ribosomal protein S6 kinase (S6K) within mTOR signalling networks. Biochem J, 2012. 441(1): p. 1-21. 641. Ahmad, K. and S. Henikoff, Histone H3 variants specify modes of chromatin assembly. Proc Natl Acad Sci U S A, 2002. 99 Suppl 4: p. 16477-84.

329

642. Quesada, V., et al., Exome sequencing identifies recurrent mutations of the splicing factor SF3B1 gene in chronic lymphocytic leukemia. Nat Genet, 2011. 44(1): p. 47-52. 643. Nik-Zainal, S., et al., Mutational processes molding the genomes of 21 breast cancers. Cell, 2012. 149(5): p. 979-93. 644. Kanadia, R.N., et al., Reversal of RNA missplicing and myotonia after muscleblind overexpression in a mouse poly(CUG) model for myotonic dystrophy. Proc Natl Acad Sci U S A, 2006. 103(31): p. 11748-53. 645. Rocha-Martins, M., B. Njaine, and M.S. Silveira, Avoiding pitfalls of internal controls: validation of reference genes for analysis by qRT-PCR and Western blot throughout rat retinal development. PLoS One, 2012. 7(8): p. e43028. 646. Livak, K.J. and T.D. Schmittgen, Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method. Methods, 2001. 25(4): p. 402-8. 647. Punzo, C., K. Kornacker, and C.L. Cepko, Stimulation of the insulin/mTOR pathway delays cone death in a mouse model of retinitis pigmentosa. Nat Neurosci, 2009. 12(1): p. 44-52.

330