Identification of Novel Branch Points Reveals Insights Into RNA Processing

Identification of Novel Branch Points Reveals Insights into RNA Processing

by

Genevieve Michelle Gould

B.A. Molecular and Cell Biology with an emphasis in Genetics, Genomics, and Development
University of California, Berkeley (2009)

Submitted to the Department of Biology in Partial Fulfillment of the Requirements for the Degree of

DOCTOR OF PHILOSOPHY

at the

MASSACHUSETTS INSTITUTE OF TECHNOLOGY

September 2015

© Massachusetts Institute of Technology 2015. All rights reserved.

Signature of Author: Department of Biology, August 31, 2015

Certified by: Christopher B. Burge, Professor of Biology, Thesis Supervisor

Accepted by: Michael Hemann, Associate Professor of Biology, Co-Chair, Biology Graduate Committee

Identification of Novel Branch Points Reveals Insights into RNA Processing

by

Genevieve Michelle Gould

Submitted to the Department of Biology on August 31, 2015 in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Biology

Abstract

Pre-mRNA splicing is a ubiquitous process necessary for the production of functional eukaryotic mRNAs. The branch point (BP) sequence is one of three key nucleotide sequences required for pre-mRNA splicing; however, in metazoa it has been less comprehensively studied than the 5' splice site (5'SS) and 3' splice site (3'SS) due to the relative difficulty of identifying each sequence element. 5'SS and 3'SS are readily identified by aligning spliced cDNAs, ESTs, or RNA-seq reads to the genome, while lower throughput techniques such as primer extension are usually required to map BPs, with some exceptions. To understand how the BP affects splicing outcomes, we developed an experimental method to locate BPs on a genome-wide scale. Applying our method to Saccharomyces cerevisiae (S. cerevisiae), one of the only eukaryotes for which most BPs are known, allowed us to assess the sensitivity and specificity of our method. We enriched for RNA lariats by isolating RNA from debranching enzyme null yeast and purified circular RNAs (including lariats) from linear RNAs using a 2D PAGE gel. This was followed by a custom library preparation protocol that produced insert ends that identified the BP and 5'SS of individual lariats. Using this method, we located known BPs and discovered a substantial number of novel BPs both in annotated introns and in other genomic regions. We attempted to verify these novel introns using RNA-seq and Lariat-seq and, surprisingly, observed considerable amounts of alternative splicing (AS) in S. cerevisiae beyond the previously known stress-regulated intron retention events and handful of alternative splice sites. Additionally, we observed several introns with 2 BPs and one intron with 3 BPs. In the LSM2 transcript, we showed that alternative BP usage was associated with alternative splice site usage, where one of the mRNA isoforms contains a premature termination codon and leads to nonsense-mediated mRNA decay of the transcript. This suggests that AS may control gene expression levels in yeast, as is known to be the case in metazoans. Preliminary application of our method to Drosophila melanogaster showed that recursive splicing, a phenomenon known to occur only in introns larger than 10 kb, occurs in a 383 nt intron.

Thesis supervisor: Christopher B.
Burge
Title: Professor of Biology

Acknowledgements

I’d like to begin by thanking my advisor, Chris Burge, for allowing me to join his lab and pursue a risky project that let me combine my desire to perform both experimental and computational biology research. The Burge lab has been a great environment for me to learn and grow. Thank you, Chris, for being receptive to my requests over the years, agreeing to meet with me regularly to discuss my research, and allowing me to present my findings at several scientific venues.

To my committee members, Phil Sharp and Tom RajBhandary, thank you for all of your helpful advice over the years. Also, Robin Reed, thank you for agreeing to serve on my thesis committee and for providing me with the HeLa Nuclear Extracts that were essential to the success of my research.

Next, thank you to all the members of the Burge lab, past and present, who have made the lab a great environment for doing research. I appreciate all you have taught me through sharing your own knowledge of techniques and through your efforts critiquing my presentations and writing over the years. Special thanks to Nicole for encouraging me to purify yeast DBR1 protein, which was the key to getting my protocol to work; to Athma for patiently helping me learn R; to Alex, Jason, Noah, Charles, Maria, Peter F., and Peter S. for teaching me new Python tricks; to Matt for insightful suggestions on ways to plot data; to Eric for initial ideas pertaining to my project; to Jess for talking some sense into me when trying to get last-minute experiments to work the night before group meeting; to Reut for helpful conversations over her late-morning breakfast in the dry lab; to Joe for always being upbeat and being a wonderfully motivated guy to work with; and to Jennifer, Dan, Caitlin, Razvan, Robin, Yarden, Albert, Rob, Vincent, Monica, Yevgenia, Chetan, Abby, Cassie, Ritu, Dima, Daniel, Phil, and Brad for making my time in the lab so memorable.
Thank you to my collaborators Boris, Yuchun, and Joe for countless conversations and questions; they have been some of the best parts of grad school.

I’d also like to thank all of my friends in the building, especially all of my 2nd and 3rd floor neighbors, for making the lab a lively place to do science, providing moral support, and organizing fun extracurricular activities.

Thank you to my classmates. It’s been great bouncing ideas off of you, and it has been comforting to know I always have good friends nearby. I believe the bonds we have formed will last a lifetime, and I look forward to learning of everyone’s future accomplishments. Also, thank you to my BBS friends. It’s been fun to observe the differences between the MIT and Harvard Biology PhD programs over the years, and it’s been wonderful having more friends in the area who understand the time requirements of research. Also, thank you to my roommates, past and present, who have always been there for me when I needed to unwind at the end of the day.

Thanks to MIT’s extracurricular activities, I’ve been able to maintain a work-life balance. Thank you to the friendly staff and volunteers at the MIT Sailing Pavilion, members of the MIT Figure Skating Club, and volunteers at the MIT Rock Wall for creating positive outlets.

Thank you to my friends from home. Even though some of you admitted you probably wouldn’t understand what I was studying, you were always willing to give it a try and wanted to catch up anyway. Thank you to my college friends, especially the Cal Sailing Team, who still make the time to get together even though we are now scattered across the globe. And to those Cal Sailors with whom I discuss scientific topics from afar, I look forward to our future conversations about scientific breakthroughs and what the general public thinks of them.

Thank you to Mike Eisen for allowing me to experience firsthand what computational biology was all about.
If I hadn’t worked in your lab, I wouldn’t have come to grad school. Thank you to my additional mentors outside the lab, Kim Hamad-Schifferli, Frank Solomon, and Alan Grossman, who have provided me with valuable advice over the years.

I would like to especially thank my best friend, Dr. Lauren Barclay, for always being there for me. As we both know, grad school can be trying at times, and having my best friend nearby, who was going through a PhD herself, was the best thing I could have asked for. Thanks for making time to catch up and getting me out of the lab to enjoy New England!

I’d like to thank my high school biology teacher for instilling in me my love of biology. Mr. Van Loo was an excellent teacher who really worked hard to make the subject matter he was teaching interesting and memorable. I’ll never forget when he dressed up as a hockey player to demonstrate the Calvin Cycle, bringing a puck of “carbon” in the open “stomata” door to show us where the carbon went and what happened to it once it entered the “cell” classroom, or the time when he had a student volunteer stand on a chair, hold a couple of branches, and try, to no avail, to drink water through a long straw from a water bottle on the floor to demonstrate why transpiration was important for plants to transport water from their roots to their branches. He made biology fun and accessible. It was also through his course that I learned about the UC Davis Young Scholars Program and ended up having my first of many research experiences.

I’d like to thank my extended family in the Boston area that made Cambridge a home away from home for me. It’s been great spending time with you, especially since we lived so far apart while I was growing up. I’ve really enjoyed all of our great meals together, Red Sox games, trips to the Cape, and other outings. Also, thank you for opening your home to me after the Boston Marathon bombing.

A special thank you to the officers who protect MIT, especially Officer Sean Collier.
To my grandparents, thank you for always wanting to hear about my latest endeavors. To my “little” brother, thanks for being born after me; you would have been a tough act to follow. I’ve enjoyed all of our fun East Coast visits and appreciate all your advice over the years.
