Protein-RNA Interaction Prediction with Deep Learning: Structure Matters

Total Page:16

File Type:pdf, Size:1020Kb

Protein-RNA Interaction Prediction with Deep Learning: Structure Matters Protein-RNA interaction prediction with deep learning: Structure matters Junkang Wei * 1 Siyuan Chen * 2 Licheng Zong * 1 Xin GaoB 2 Yu LiB 1 3 Abstract their importance, researchers have made significant efforts Protein-RNA interactions are of vital importance to understand the interactions (Corley et al., 2020) and the to a variety of cellular activities. Both experi- related molecular mechanism behind the processes (Lin and mental and computational techniques have been Miles, 2019; Yi et al., 2020). Due to the difficulty to perform developed to study the interactions. Due to the high-throughput structural biological experiments in the last limitation of the previous database, especially the century, the progress of this field was slow (Conn et al., lack of protein structure data, most of the exist- 1999). However, with the development and advancement of in vivo ing computational methods rely heavily on the high-throughput assays, such as the RIP-seq (Keene in vitro sequence data, with only a small portion of the et al., 2006) and CLIP-seq (Ule et al., 2005), and the methods utilizing the structural information. Re- RNACompete (Ray et al., 2009) and HT-SELEX (Roulet cently, AlphaFold has revolutionized the entire et al., 2002), we have witnessed the significant progress of protein and biology field. Foreseeably, the protein- this field as well as the large amount of accumulated data RNA interaction prediction will also be promoted (Alipanahi et al., 2015). Computational methods emerge to significantly in the upcoming years. In this work, analyze the data and accelerate the discovery (Hafner et al., we give a thorough review of this field, survey- 2021; Sun et al., 2021; Bernstein et al., 2005; Van Nostrand ing both the binding site and binding preference et al., 2020; Lam et al., 2019; Hentze et al., 2018). prediction problems and covering the commonly Similar to the experimental techniques, which can be di- used datasets, features, and models. We also point vided into the structure-based methods and the assay-based out the potential challenges and opportunities in methods, the computational methods can also be classi- this field. This survey summarizes the develop- fied into two categories, either predicting the RNA binding ment of the RBP-RNA interaction field in the past sites on the protein surface (Miao and Westhof, 2015; Yan and foresees its future development in the post- et al., 2016) or modeling the preferred RNA sequences of AlphaFold era. an RNA-binding protein (RBP) (Trabelsi et al., 2019). In the first category, people essentially resolve a binary clas- sification problem. Given the protein, researchers want to 1. Introduction predict whether it is an RBP, and if it is an RBP, at which amino acids it can interact with an RNA. In the latter one, Protein-RNA interactions are involved in a variety of cellu- given a protein with the high-throughput assay experimen- lar activities, such as gene expression regulations (Weirauch tal data, people extract the frequency of each nucleotide at et al., 2013), post-transcriptional regulations (Alipanahi each position on the preferred RNA sequences, using k-mer et al., 2015), and protein synthesis (Yan et al., 2016). Per- models (Lee et al., 2015), position weight matrix (PWM) arXiv:2107.12243v1 [q-bio.BM] 26 Jul 2021 turbation of such interactions can lead to fatal cellular dys- models (Weirauch et al., 2013), or deep learning models function and diseases (Ramanathan et al., 2019). Owing to (Alipanahi et al., 2015). If the computational method targets *Equal contribution 1Department of Computer Science and En- on genome-wide prediction, sometimes, it is also referred as gineering (CSE), The Chinese University of Hong Kong (CUHK), the binding sites prediction on RNAs (Pan and Shen, 2018a; Hong Kong, People’s Republic of China 2Computational Bio- Li et al., 2017a), which may cause confusion to the readers. science Research Center (CBRC), Computer, Electrical and Math- In the rest of the paper, binding sites prediction refers to ematical Sciences and Engineering Division (CEMSE), King Ab- dullah University of Science and Technology (KAUST), Thuwal predicting the RNA binding sites on the protein surface, 23955-6900, Saudi Arabia 3The CUHK Shenzhen Research while the binding preference prediction refers to predicting Institute, Hi-Tech Park, Nanshan, Shenzhen 518057, China. the protein binding preference against RNA sequences. On Correspondence to: Xin Gao <[email protected]>, Yu Li the other hand, as both of the two main research directions <[email protected]>. are protein-centric (Ramanathan et al., 2019), which means Copyright 2021 by the author(s). that there is intrinsic relation between the two research top- Protein-RNA interaction prediction with deep learning ics, researchers are also trying to predict both information sive local structural encoding can significantly improve the simultaneously with a unified deep learning method (Lam model’s performance. For example, people have used vox- et al., 2019). els (Torng and Altman, 2019) and graphs (Xia et al., 2021a) to encode the protein 3D structures.In terms of the RNA Since the first computational method was proposed to tackle features, the logic is similar to the protein ones. Regard- the interaction between RNA and protein specifically (Jeong ing the sequence features, people have been using the se- et al., 2004), a number of algorithms have been developed quence one-hot encodings (Grønning et al., 2020; Sun et al., to handle the problems (Miao and Westhof, 2015; Yan et al., 2021; Alipanahi et al., 2015), k-mer models (Orenstein et al., 2016; Yan and Zhu, 2020; Sagar and Xue, 2019; Pan et al., 2016), and position-weighted matrix (PWM) (Kazan et al., 2019a). They can be divided into the following categories. 2010; Orenstein et al., 2016). However, unlike the protein Firstly, based on the assumption that similar structures may secondary structures, RNA secondary structural information have similar function, people have used the template-based has been significantly emphasized, including both the pre- method to predict the binding sites (Yang et al., 2013; Chen dicted RNA secondary structures and the in vivo structure et al., 2014; Wu et al., 2018; Xie et al., 2020a) and the bind- profiles (Hiller et al., 2006; Maticzka et al., 2014; Kazan ing preference (Zheng et al., 2016). Although such methods et al., 2010; Sun et al., 2021). Meanwhile, the tertiary struc- can perform well on queries with homologs, they have diffi- tures are also shown to be very important (Zhang et al., culty in handling new sequences without homologs (Senior 2016). Despite the large variety of existing features, unfor- et al., 2020). Secondly, people combine hand-crafted fea- tunately, people have not taken full advantage of them for tures, which will be discussed in the next paragraph, with the following two reasons. Firstly, in the binding site predic- shallow-learning methods, such as support vector machine tion, people usually only consider the protein information, (SVM) (Jolma et al., 2020; Maticzka et al., 2014; Zhang while in the binding preference prediction, people usually and Liu, 2017; Su et al., 2019a), logistic regression (Kazan only consider the RNA information. The interactions be- et al., 2010; Orenstein et al., 2016; Hiller et al., 2006; Yan tween RNAs and proteins include at least two molecules, and Kurgan, 2017), and random forest (Li et al., 2017b; Sun and using information from only one side can lead to infe- et al., 2016), to investigate the topic. The commonly used rior performance. Secondly, the RNA and protein structural k-mer models (Orenstein et al., 2016) and position-weighted information has not been fully utilized as well, mainly due matrix (PWM) models (Kazan et al., 2010) are classified to the limitation of previous structure prediction methods into this category, because they are usually combined with and the unsatisfactory structure encoding methods. logistic regression. Notice that this category of methods is still under active development (Zhang and Liu, 2017; Su In recent years, we have witnessed the significant improve- et al., 2019a), even after the surge of deep learning, because ment of both the structure determination methods (Yip et al., it is difficult to represent and encode the raw structural infor- 2020) and prediction methods (Marks et al., 2011; Wang mation, which will be discussed in detail in this paper. The et al., 2017a; 2019a; Senior et al., 2020; Jumper et al., 2021; last category is the deep learning-based methods (Alipanahi Baek et al., 2021; Tunyasuvunakool et al., 2021). Consid- et al., 2015; Lam et al., 2019; Sun et al., 2021), which have ering the success of the previous computational methods been very popular in recent years. With such models, peo- targeting protein-RNA interaction prediction based on struc- ple only need to input the raw representation of the proteins tural information, it is foreseeable that researchers will make or RNAs, and let the models learn and extract useful in- significant progress in this field (Figure1). Given that, we formation by themselves. However, the transparency and review this field thoroughly in this paper, emphasizing the interpretability of the models are usually questioned (Li structural information. In this work, we also consider the et al., 2019). protein-RNA interaction binding site and binding preference prediction simultaneously for the first time, considering their Within the above algorithms, people have been using various intrinsic relationship. We notice that there are some existing features, including the ones from both proteins and RNAs. related reviews focusing on different aspects of this prob- Regarding the protein features, researchers have developed lem. More specifically, Pan et al.(2019b); Yan and Zhu representations from sequences, such as sequence one-hot (2020); Sagar and Xue(2019) list the recently developed encodings (Yan et al., 2016), position-specific scoring ma- deep learning tools for predicting binding preference.
Recommended publications
  • An Effective Computational Method Incorporating Multiple Secondary
    Old Dominion University ODU Digital Commons Computer Science Faculty Publications Computer Science 2017 An Effective Computational Method Incorporating Multiple Secondary Structure Predictions in Topology Determination for Cryo-EM Images Abhishek Biswas Old Dominion University Desh Ranjan Old Dominion University Mohammad Zubair Old Dominion University Stephanie Zeil Old Dominion University Kamal Al Nasr See next page for additional authors Follow this and additional works at: https://digitalcommons.odu.edu/computerscience_fac_pubs Part of the Biochemistry Commons, Computer Sciences Commons, Molecular Biology Commons, and the Statistics and Probability Commons Repository Citation Biswas, Abhishek; Ranjan, Desh; Zubair, Mohammad; Zeil, Stephanie; Al Nasr, Kamal; and He, Jing, "An Effective Computational Method Incorporating Multiple Secondary Structure Predictions in Topology Determination for Cryo-EM Images" (2017). Computer Science Faculty Publications. 81. https://digitalcommons.odu.edu/computerscience_fac_pubs/81 Original Publication Citation Biswas, A., Ranjan, D., Zubair, M., Zeil, S., Al Nasr, K., & He, J. (2017). An effective computational method incorporating multiple secondary structure predictions in topology determination for cryo-em images. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 14(3), 578-586. doi:10.1109/tcbb.2016.2543721 This Article is brought to you for free and open access by the Computer Science at ODU Digital Commons. It has been accepted for inclusion in Computer Science Faculty Publications by an authorized administrator of ODU Digital Commons. For more information, please contact [email protected]. Authors Abhishek Biswas, Desh Ranjan, Mohammad Zubair, Stephanie Zeil, Kamal Al Nasr, and Jing He This article is available at ODU Digital Commons: https://digitalcommons.odu.edu/computerscience_fac_pubs/81 HHS Public Access Author manuscript Author ManuscriptAuthor Manuscript Author IEEE/ACM Manuscript Author Trans Comput Manuscript Author Biol Bioinform.
    [Show full text]
  • Large-Scale Analysis of Genome and Transcriptome Alterations in Multiple Tumors Unveils Novel Cancer-Relevant Splicing Networks
    Downloaded from genome.cshlp.org on September 28, 2021 - Published by Cold Spring Harbor Laboratory Press Research Large-scale analysis of genome and transcriptome alterations in multiple tumors unveils novel cancer-relevant splicing networks Endre Sebestyén,1,5 Babita Singh,1,5 Belén Miñana,1,2 Amadís Pagès,1 Francesca Mateo,3 Miguel Angel Pujana,3 Juan Valcárcel,1,2,4 and Eduardo Eyras1,4 1Universitat Pompeu Fabra, E08003 Barcelona, Spain; 2Centre for Genomic Regulation, E08003 Barcelona, Spain; 3Program Against Cancer Therapeutic Resistance (ProCURE), Catalan Institute of Oncology (ICO), Bellvitge Institute for Biomedical Research (IDIBELL), E08908 L’Hospitalet del Llobregat, Spain; 4Catalan Institution for Research and Advanced Studies, E08010 Barcelona, Spain Alternative splicing is regulated by multiple RNA-binding proteins and influences the expression of most eukaryotic genes. However, the role of this process in human disease, and particularly in cancer, is only starting to be unveiled. We system- atically analyzed mutation, copy number, and gene expression patterns of 1348 RNA-binding protein (RBP) genes in 11 solid tumor types, together with alternative splicing changes in these tumors and the enrichment of binding motifs in the alter- natively spliced sequences. Our comprehensive study reveals widespread alterations in the expression of RBP genes, as well as novel mutations and copy number variations in association with multiple alternative splicing changes in cancer drivers and oncogenic pathways. Remarkably, the altered splicing patterns in several tumor types recapitulate those of undifferen- tiated cells. These patterns are predicted to be mainly controlled by MBNL1 and involve multiple cancer drivers, including the mitotic gene NUMA1. We show that NUMA1 alternative splicing induces enhanced cell proliferation and centrosome am- plification in nontumorigenic mammary epithelial cells.
    [Show full text]
  • IGF2BP1 (D-9): Sc-166344
    SANTA CRUZ BIOTECHNOLOGY, INC. IGF2BP1 (D-9): sc-166344 BACKGROUND STORAGE Insulin like growth factor 2 mRNA binding proteins (IGF2BPs) bind RNA and Store at 4° C, **DO NOT FREEZE**. Stable for one year from the date of influence RNA synthesis and metabolism. IGF2BP1, also known as coding shipment. Non-hazardous. No MSDS required. region determinant-binding protein/Insulin-like growth factor II mRNA-binding protein (CRD-BP), IMP1 or VICKZ1; IGF2BP2 (IMP2, VICKZ2, p62); and IGF2BP2 DATA (IMP3, KOC1, VICKZ3) contain a unique combination of RNA recognition motifs A B and four hnRNP K homology domains. IGF2BP1 is abundant in embryonal ABC DEF tissues and is expressed in 81% of colon cancers, 73% of sarcomas and 132 K – 58.5% of breast cancers. It recognizes c-Myc, IGF-II and t mRNAs, and H19 90 K – RNA, and plays a major role in proliferation of K-562 cells by an IGF-II-depen- 55 K – IGF2BP1 dent mechanism. IGF2BP2 binds the 5' UTR of IGF-II mRNA and influences tumor cell growth, in which IGF2BP2 is associated with apoptosis induced 43 K – by tretinoin. IGF2BP3 knockdown by RNA interference decreases levels of IGF-II protein without affecting IGF-II, c-Myc, or b Actin mRNA and H19 RNA levels. IGF2BP3 is a marker for carcinomas and high-grade dysplastic lesions IGF2BP1 (D-9): sc-166344. Western blot analysis IGF2BP1 (D-9): sc-166344. Immunofluorescence staining ofIGF2BP1 expressioninHEK293 (A), NIH/3T3 (B), of methanol-fixed HeLa cells showing nucleolar and of pancreatic ductal epithelium. K-562 (C), U-698-M (D) and Raji (E)wholecell lysates cytoplasmic localization (A).
    [Show full text]
  • Exome Sequencing Revealed DNA Variants in NCOR1, IGF2BP1, SGLT2 and NEK11 As Potential Novel Causes of Ketotic Hypoglycemia in C
    www.nature.com/scientificreports OPEN Exome sequencing revealed DNA variants in NCOR1, IGF2BP1, SGLT2 and NEK11 as potential novel causes of ketotic hypoglycemia in children Yazeid Alhaidan1,2,3,4*, Martin J. Larsen 1,2, Anders Jørgen Schou5, Maria H. Stenlid6, Mohammed A. Al Balwi 3,4, Henrik Thybo Christesen2,5,7,9 & Klaus Brusgaard1,2,8,9 Unexplained or idiopathic ketotic hypoglycemia (KH) is the most common type of hypoglycemia in children. The diagnosis is based on the exclusion of routine hormonal and metabolic causes of hypoglycemia. We aimed to identify novel genes that cause KH, as this may lead to a more targeted treatment. Deep phenotyping of ten preschool age at onset KH patients (boys, n = 5; girls, n = 5) was performed followed by trio exome sequencing and comprehensive bioinformatics analysis. Data analysis revealed four novel candidate genes: (1) NCOR1 in a patient with KH, iron defciency and loose stools; (2) IGF2BP1 in a proband with KH, short stature and delayed bone age; (3) SLC5A2 in a proband with KH, intermittent glucosuria and extremely elevated p-GLP-1; and (4) NEK11 in a proband with ketotic hypoglycemia and liver afiction. These genes are associated with diferent metabolic processes, such as gluconeogenesis, translational regulation, and glucose transport. In conclusion, WES identifed DNA variants in four diferent genes as potential novel causes of IKH, suggesting that IKH is a heterogeneous disorder that can be split into several novel diseases: NCOR1-KH, IGF2BP1-KH, SGLT2-KH or familial renal glucosuria KH, and NEK11-KH. Precision medicine treatment based on exome sequencing may lead to advances in the management of IKH.
    [Show full text]
  • Overlap of Quantitative Trait Loci for Early Growth Rate, and for Body
    REPRODUCTIONRESEARCH Functional reconstruction of NANOS3 expression in the germ cell lineage by a novel transgenic reporter reveals distinct subcellular localizations of NANOS3 Masashi Yamaji1, Takashi Tanaka2, Mayo Shigeta1, Shinichiro Chuma2, Yumiko Saga3 and Mitinori Saitou1,4,5 1Laboratory for Mammalian Germ Cell Biology, RIKEN Center for Developmental Biology, 2-2-3 Minatojima- Minamimachi, Chuo-ku, Kobe 650-0047, Japan, 2Department of Development and Differentiation, Institute for Frontier Medical Sciences, Kyoto University, Shogoin, Sakyo-ku, Kyoto 606-8507, Japan, 3Department of Genetics and Division of Mammalian Development, National Institute of Genetics, SOKENDAI, 1111 Yata, Mishima, Shizuoka 411-8540, Japan, 4Department of Anatomy and Cell Biology, Graduate School of Medicine, Kyoto University, Yoshida- Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan and 5JST, CREST, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan Correspondence should be addressed to M Saitou at Department of Anatomy and Cell Biology, Graduate School of Medicine, Kyoto University; Email: [email protected] Abstract Mutations of RNA-binding proteins such as NANOS3, TIAL1, and DND1 in mice have been known to result in the failure of survival and/or proliferation of primordial germ cells (PGCs) soon after their fate is specified (around embryonic day (E) 8.0), leading to the infertility of these animals. However, the mechanisms of actions of these RNA-binding proteins remain largely unresolved. As a foundation to explore the role of these RNA-binding proteins in germ cells, we established a novel transgenic reporter strain that expresses NANOS3 fused with EGFP under the control of Nanos3 regulatory elements. NANOS3–EGFP exhibited exclusive expression in PGCs as early as E7.25, and continued to be expressed in female germ cells until around E14.5 and in male germ cells throughout the fetal period with declining expression levels after E16.5.
    [Show full text]
  • Evidence for Differential Alternative Splicing in Blood of Young Boys With
    Stamova et al. Molecular Autism 2013, 4:30 http://www.molecularautism.com/content/4/1/30 RESEARCH Open Access Evidence for differential alternative splicing in blood of young boys with autism spectrum disorders Boryana S Stamova1,2,5*, Yingfang Tian1,2,4, Christine W Nordahl1,3, Mark D Shen1,3, Sally Rogers1,3, David G Amaral1,3 and Frank R Sharp1,2 Abstract Background: Since RNA expression differences have been reported in autism spectrum disorder (ASD) for blood and brain, and differential alternative splicing (DAS) has been reported in ASD brains, we determined if there was DAS in blood mRNA of ASD subjects compared to typically developing (TD) controls, as well as in ASD subgroups related to cerebral volume. Methods: RNA from blood was processed on whole genome exon arrays for 2-4–year-old ASD and TD boys. An ANCOVA with age and batch as covariates was used to predict DAS for ALL ASD (n=30), ASD with normal total cerebral volumes (NTCV), and ASD with large total cerebral volumes (LTCV) compared to TD controls (n=20). Results: A total of 53 genes were predicted to have DAS for ALL ASD versus TD, 169 genes for ASD_NTCV versus TD, 1 gene for ASD_LTCV versus TD, and 27 genes for ASD_LTCV versus ASD_NTCV. These differences were significant at P <0.05 after false discovery rate corrections for multiple comparisons (FDR <5% false positives). A number of the genes predicted to have DAS in ASD are known to regulate DAS (SFPQ, SRPK1, SRSF11, SRSF2IP, FUS, LSM14A). In addition, a number of genes with predicted DAS are involved in pathways implicated in previous ASD studies, such as ROS monocyte/macrophage, Natural Killer Cell, mTOR, and NGF signaling.
    [Show full text]
  • Chapter 13 Protein Structure Learning Objectives
    Chapter 13 Protein structure Learning objectives Upon completing this material you should be able to: ■ understand the principles of protein primary, secondary, tertiary, and quaternary structure; ■use the NCBI tool CN3D to view a protein structure; ■use the NCBI tool VAST to align two structures; ■explain the role of PDB including its purpose, contents, and tools; ■explain the role of structure annotation databases such as SCOP and CATH; and ■describe approaches to modeling the three-dimensional structure of proteins. Outline Overview of protein structure Principles of protein structure Protein Data Bank Protein structure prediction Intrinsically disordered proteins Protein structure and disease Overview: protein structure The three-dimensional structure of a protein determines its capacity to function. Christian Anfinsen and others denatured ribonuclease, observed rapid refolding, and demonstrated that the primary amino acid sequence determines its three-dimensional structure. We can study protein structure to understand problems such as the consequence of disease-causing mutations; the properties of ligand-binding sites; and the functions of homologs. Outline Overview of protein structure Principles of protein structure Protein Data Bank Protein structure prediction Intrinsically disordered proteins Protein structure and disease Protein primary and secondary structure Results from three secondary structure programs are shown, with their consensus. h: alpha helix; c: random coil; e: extended strand Protein tertiary and quaternary structure Quarternary structure: the four subunits of hemoglobin are shown (with an α 2β2 composition and one beta globin chain high- lighted) as well as four noncovalently attached heme groups. The peptide bond; phi and psi angles The peptide bond; phi and psi angles in DeepView Protein secondary structure Protein secondary structure is determined by the amino acid side chains.
    [Show full text]
  • The Ire1a-XBP1 Pathway Promotes T Helper Cell Differentiation by Resolving Secretory Stress and Accelerating Proliferation
    bioRxiv preprint doi: https://doi.org/10.1101/235010; this version posted December 15, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. The IRE1a-XBP1 pathway promotes T helper cell differentiation by resolving secretory stress and accelerating proliferation Jhuma Pramanik1, Xi Chen1, Gozde Kar1,2, Tomás Gomes1, Johan Henriksson1, Zhichao Miao1,2, Kedar Natarajan1, Andrew N. J. McKenzie3, Bidesh Mahata1,2*, Sarah A. Teichmann1,2,4* 1. Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom 2. EMBL-European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom 3. MRC Laboratory of Molecular Biology, Cambridge Biomedical Campus, Francis Crick Avenue, Cambridge CB2 OQH, United Kingdom 4. Theory of Condensed Matter, Cavendish Laboratory, 19 JJ Thomson Ave, Cambridge CB3 0HE, United Kingdom. *To whom correspondence should be addressed: [email protected] and [email protected] Keywords: Th2 lymphocyte, XBP1, Genome wide XBP1 occupancy, Th2 lymphocyte proliferation, ChIP-seq, RNA-seq, Th2 transcriptome Summary The IRE1a-XBP1 pathway, a conserved adaptive mediator of the unfolded protein response, is indispensable for the development of secretory cells. It maintains endoplasmic reticulum homeostasis by facilitating protein folding and enhancing secretory capacity of the cells. Its role in immune cells is emerging. It is involved in dendritic cell, plasma cell and eosinophil development and differentiation. Using genome-wide approaches, integrating ChIPmentation and mRNA-sequencing data, we have elucidated the regulatory circuitry governed by the IRE1a-XBP1 pathway in type-2 T helper cells (Th2).
    [Show full text]
  • A Role for Dazl in Commitment to Gametogenic Fate in Embryonic Germ Cells of C57BL/6 Mice
    A Role for Dazl in Commitment to Gametogenic Fate in Embryonic Germ Cells of C57BL/6 Mice by Yanfeng Lin (Yen-Hong Lim) B.S. Biochemistry and Molecular and Cell Biology University of Wisconsin-Madison, 1998 SUBMITTED TO THE DEPARTMENT OF BIOLOGY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY IN BIOLOGY AT THE MASSACHUSETTS INSTITUTE OF TECHNOLOGY SEPT 2005 C 2005 Yanfeng Lin. All rights reserved. The author hereby grants to MIT permission to reproduce and distribute publicly paper and electronic copies of this thesis document in whole or in part. Signature of Author __ _ __ Department of Biology August, 2005 V /-' ~J-2 Certified by David C. Page Professor of Biology Howard Hughes Medical Institute Thesis Supervisor Accepted b) Stephen P. Bell Co-chair, Biology Graduate Student Committee 'MACHUS ETT S NS1 I OF TECHNOLOGY ARCHIVES' OCT 0 2005 . ,. -~ I LIBRARIES .i. __ A Role for Dazl in Commitment to Gametogenic Fate in Embryonic Germ Cells of C57BL/6 Mice by Yanfeng Lin (Yen-Hong Lim) Submitted to the Department of Biology on June, 2005 in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Biology Abstract Germ cells can be defined as the cells that undergo the terminal differentiating process of meiosis. In mice, as XX germ cells enter meiosis around Embryonic days 13.5-14.5 (E13.5-E14.5), they form meiotic figures and down-regulate pluripotency markers. XY germ cells enter proliferation arrest between E13.5 and E16.5, which is accompanied by a distinct morphological change as well.
    [Show full text]
  • Downloads/ (Accessed on 17 January 2020)
    cells Review Novel Approaches for Identifying the Molecular Background of Schizophrenia Arkadiy K. Golov 1,2,*, Nikolay V. Kondratyev 1 , George P. Kostyuk 3 and Vera E. Golimbet 1 1 Mental Health Research Center, 34 Kashirskoye shosse, 115522 Moscow, Russian; [email protected] (N.V.K.); [email protected] (V.E.G.) 2 Institute of Gene Biology, Russian Academy of Sciences, 34/5 Vavilova Street, 119334 Moscow, Russian 3 Alekseev Psychiatric Clinical Hospital No. 1, 2 Zagorodnoye shosse, 115191 Moscow, Russian; [email protected] * Correspondence: [email protected] Received: 5 November 2019; Accepted: 16 January 2020; Published: 18 January 2020 Abstract: Recent advances in psychiatric genetics have led to the discovery of dozens of genomic loci associated with schizophrenia. However, a gap exists between the detection of genetic associations and understanding the underlying molecular mechanisms. This review describes the basic approaches used in the so-called post-GWAS studies to generate biological interpretation of the existing population genetic data, including both molecular (creation and analysis of knockout animals, exploration of the transcriptional effects of common variants in human brain cells) and computational (fine-mapping of causal variability, gene set enrichment analysis, partitioned heritability analysis) methods. The results of the crucial studies, in which these approaches were used to uncover the molecular and neurobiological basis of the disease, are also reported. Keywords: schizophrenia; GWAS; causal genetic variants; enhancers; brain epigenomics; genome/epigenome editing 1. Introduction Schizophrenia is a severe mental illness that affects between 0.5% and 0.7% of the human population [1]. Both environmental and genetic factors are thought to be involved in its pathogenesis, with genetic factors playing a key role in disease risk, as the heritability of schizophrenia is estimated to be 70–85% [2,3].
    [Show full text]
  • The Splicing Factor RNA-Binding Fox Protein 1 Mediates the Cellular Immune Response in Drosophila Melanogaster
    The Splicing Factor RNA-Binding Fox Protein 1 Mediates the Cellular Immune Response in Drosophila melanogaster This information is current as Ashley E. Nazario-Toole, Javier Robalino, Kwame Okrah, of September 27, 2021. Hector Corrada-Bravo, Stephen M. Mount and Louisa P. Wu J Immunol published online 11 July 2018 http://www.jimmunol.org/content/early/2018/07/10/jimmun ol.1800496 Downloaded from Supplementary http://www.jimmunol.org/content/suppl/2018/07/10/jimmunol.180049 Material 6.DCSupplemental http://www.jimmunol.org/ Why The JI? Submit online. • Rapid Reviews! 30 days* from submission to initial decision • No Triage! Every submission reviewed by practicing scientists • Fast Publication! 4 weeks from acceptance to publication by guest on September 27, 2021 *average Subscription Information about subscribing to The Journal of Immunology is online at: http://jimmunol.org/subscription Permissions Submit copyright permission requests at: http://www.aai.org/About/Publications/JI/copyright.html Email Alerts Receive free email-alerts when new articles cite this article. Sign up at: http://jimmunol.org/alerts The Journal of Immunology is published twice each month by The American Association of Immunologists, Inc., 1451 Rockville Pike, Suite 650, Rockville, MD 20852 Copyright © 2018 by The American Association of Immunologists, Inc. All rights reserved. Print ISSN: 0022-1767 Online ISSN: 1550-6606. Published July 11, 2018, doi:10.4049/jimmunol.1800496 The Journal of Immunology The Splicing Factor RNA-Binding Fox Protein 1 Mediates the Cellular Immune Response in Drosophila melanogaster Ashley E. Nazario-Toole,*,† Javier Robalino,* Kwame Okrah,‡ Hector Corrada-Bravo,‡ Stephen M. Mount,*,‡ and Louisa P.
    [Show full text]
  • SND1, a Component of RNA-Induced Silencing Complex, Is Up-Regulated in Human Colon Cancers and Implicated in Early Stage Colon Carcinogenesis
    Research Article SND1, a Component of RNA-Induced Silencing Complex, Is Up-regulated in Human Colon Cancers and Implicated in Early Stage Colon Carcinogenesis Naoto Tsuchiya, Masako Ochiai, Katsuhiko Nakashima, Tsuneyuki Ubagai, Takashi Sugimura, and Hitoshi Nakagama Biochemistry Division, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo, Japan Abstract transcription factors is well known to be a causative event involved Colon cancers have been shown to develop after accumulation in malignant transformation, for example, with functional inacti- vation or intragenic mutation of p53 (3), and gene amplification or of multiple genetic and epigenetic alterations with changes in global gene expression profiles, contributing to the establish- overexpression of c-myc (4). ment of widely diverse phenotypes. Transcriptional and Recently, posttranscriptional regulation, especially translational posttranscriptional regulation of gene expression by small control, of gene expression has become a major interest in the field of cancer research, because accumulating evidence strongly RNA species, such as the small interfering RNA and microRNA suggests its biological relevance for cancer development and and the RNA-induced silencing complex (RISC), is currently drawing major interest with regard to cancer development. progression (5). Some oncogenes and tumor suppressor genes, including c-myc (6), Mdm-2(7), and p53 (8), have been shown to be SND1, also called Tudor-SN and p100 and recently reported to regulated at the translational level. Overexpression of the transla- be a component of RISC, is among the list of highly expressed tion initiation factor eIF4E promotes malignant transformation of genes in human colon cancers. In the present study, we normal fibroblasts (9) and cooperates with c-Myc to drive B-cell showed remarkable up-regulation of SND1 mRNA in human lymphomatogenesis in eIF4E transgenic mice (10).
    [Show full text]