Crowdsourcing RNA Structural Alignments with an Online Computer Game
Recommended publications
The ELIXIR Core Data Resources: Fundamental Infrastructure for The
Supplementary Data: The ELIXIR Core Data Resources: fundamental infrastructure for the life sciences
The “Supporting Material” referred to within this Supplementary Data can be found in the Supporting.Material.CDR.infrastructure file, DOI: 10.5281/zenodo.2625247 (https://zenodo.org/record/2625247).
Figure 1. Scale of the Core Data Resources
Table S1. Data from which Figure 1 is derived:
Year                        2013        2014        2015        2016        2017
Data entries                765881651   997794559   1726529931  1853429002  2715599247
Monthly user/IP addresses   1700660     2109586     2413724     2502617     2867265
FTEs                        270         292.65      295.65      289.7       311.2
Figure 1 includes data from the following Core Data Resources: ArrayExpress, BRENDA, CATH, ChEBI, ChEMBL, EGA, ENA, Ensembl, Ensembl Genomes, EuropePMC, HPA, IntAct/MINT, InterPro, PDBe, PRIDE, SILVA, STRING, UniProt
● Note that Ensembl’s compute infrastructure physically relocated in 2016, so “Users/IP address” data are not available for that year. In this case, the 2015 numbers were rolled forward to 2016.
● Note that STRING made only minor releases in 2014 and 2016, in that the interactions were re-computed, but the number of “Data entries” remained unchanged. The major releases that changed the number of “Data entries” happened in 2013 and 2015. So, for “Data entries”, the number for 2013 was rolled forward to 2014, and the number for 2015 was rolled forward to 2016.
Figure 2: Usage of Core Data Resources in research
The following steps were taken:
1. API calls were run on open access full text articles in Europe PMC to identify articles that mention Core Data Resource by name or include specific data record accession numbers.
PLK-1 Promotes the Merger of the Parental Genome Into A
RESEARCH ARTICLE
PLK-1 promotes the merger of the parental genome into a single nucleus by triggering lamina disassembly
Griselda Velez-Aguilera1, Sylvia Nkombo Nkoula1, Batool Ossareh-Nazari1, Jana Link2, Dimitra Paouneskou2, Lucie Van Hove1, Nicolas Joly1, Nicolas Tavernier1, Jean-Marc Verbavatz3, Verena Jantsch2, Lionel Pintard1*
1Programme Equipe Labellisée Ligue Contre le Cancer - Team Cell Cycle & Development - Université de Paris, CNRS, Institut Jacques Monod, Paris, France; 2Department of Chromosome Biology, Max Perutz Laboratories, University of Vienna, Vienna Biocenter, Vienna, Austria; 3Université de Paris, CNRS, Institut Jacques Monod, Paris, France
Abstract Life of sexually reproducing organisms starts with the fusion of the haploid egg and sperm gametes to form the genome of a new diploid organism. Using the newly fertilized Caenorhabditis elegans zygote, we show that the mitotic Polo-like kinase PLK-1 phosphorylates the lamin LMN-1 to promote timely lamina disassembly and subsequent merging of the parental genomes into a single nucleus after mitosis. Expression of non-phosphorylatable versions of LMN-1, which affect lamina depolymerization during mitosis, is sufficient to prevent the mixing of the parental chromosomes into a single nucleus in daughter cells. Finally, we recapitulate lamina depolymerization by PLK-1 in vitro demonstrating that LMN-1 is a direct PLK-1 target. Our findings indicate that the timely removal of lamin is essential for the merging of parental chromosomes at the beginning of life in C. elegans and possibly also in humans, where a defect in this process might be fatal for embryo development.
*For correspondence: [email protected]
Competing interests: The authors declare that no
Introduction After fertilization, the haploid gametes of the egg and sperm have to come together to form the genome of a new diploid organism.
DECIPHER: Harnessing Local Sequence Context to Improve Protein Multiple Sequence Alignment
Wright, BMC Bioinformatics (2015) 16:322, DOI 10.1186/s12859-015-0749-z
RESEARCH ARTICLE, Open Access
DECIPHER: harnessing local sequence context to improve protein multiple sequence alignment
Erik S. Wright1,2
Abstract
Background: Alignment of large and diverse sequence sets is a common task in biological investigations, yet there remains considerable room for improvement in alignment quality. Multiple sequence alignment programs tend to reach maximal accuracy when aligning only a few sequences, and then diminish steadily as more sequences are added. This drop in accuracy can be partly attributed to a build-up of error and ambiguity as more sequences are aligned. Most high-throughput sequence alignment algorithms do not use contextual information under the assumption that sites are independent. This study examines the extent to which local sequence context can be exploited to improve the quality of large multiple sequence alignments.
Results: Two predictors based on local sequence context were assessed: (i) single sequence secondary structure predictions, and (ii) modulation of gap costs according to the surrounding residues. The results indicate that context-based predictors have appreciable information content that can be utilized to create more accurate alignments. Furthermore, local context becomes more informative as the number of sequences increases, enabling more accurate protein alignments of large empirical benchmarks. These discoveries became the basis for DECIPHER, a new context-aware program for sequence alignment, which outperformed other programs on large sequence sets.
Conclusions: Predicting secondary structure based on local sequence context is an efficient means of breaking the independence assumption in alignment. Since secondary structure is more conserved than primary sequence, it can be leveraged to improve the alignment of distantly related proteins.
Deep Profiling of Protease Substrate Specificity Enabled by Dual Random and Scanned Human Proteome Substrate Phage Libraries
Deep profiling of protease substrate specificity enabled by dual random and scanned human proteome substrate phage libraries
Jie Zhoua, Shantao Lib, Kevin K. Leunga, Brian O’Donovanc, James Y. Zoub,d, Joseph L. DeRisic,d, and James A. Wellsa,d,e,1
aDepartment of Pharmaceutical Chemistry, University of California, San Francisco, CA 94158; bDepartment of Biomedical Data Science, Stanford University, Stanford, CA 94305; cDepartment of Biochemistry and Biophysics, University of California, San Francisco, CA 94158; dChan Zuckerberg Biohub, San Francisco, CA 94158; and eDepartment of Cellular and Molecular Pharmacology, University of California, San Francisco, CA 94158
Edited by Benjamin F. Cravatt, Scripps Research Institute, La Jolla, CA, and approved August 19, 2020 (received for review May 11, 2020)
Proteolysis is a major posttranslational regulator of biology inside and outside of cells. Broad identification of optimal cleavage sites and natural substrates of proteases is critical for drug discovery and to understand protease biology. Here, we present a method that employs two genetically encoded substrate phage display libraries coupled with next generation sequencing (SPD-NGS) that allows up to 10,000-fold deeper sequence coverage of the typical six- to eight-residue protease cleavage sites compared to state-of-the-art synthetic peptide libraries or proteomics.
… lysate and miss low abundance proteins and those simply not expressed in cell lines tested that typically express only half their genomes (13). To potentially screen a larger and more diverse sequence space, investigators have developed genetically encoded substrate phage (14, 15) or yeast display libraries (16, 17). Degenerate DNA sequences (up to 10^7) encoding random peptides were fused to a phage or yeast coat protein gene for a catch-and-release strategy
Methods in and Applications of the Sequencing of Short Non-Coding RNAs (2013)
University of Pennsylvania ScholarlyCommons
Publicly Accessible Penn Dissertations, 2013
Methods in and Applications of the Sequencing of Short Non-Coding RNAs
Paul Ryvkin, University of Pennsylvania, [email protected]
Follow this and additional works at: https://repository.upenn.edu/edissertations
Part of the Bioinformatics Commons, Genetics Commons, and the Molecular Biology Commons
Recommended Citation: Ryvkin, Paul, "Methods in and Applications of the Sequencing of Short Non-Coding RNAs" (2013). Publicly Accessible Penn Dissertations. 922. https://repository.upenn.edu/edissertations/922
Abstract
Short non-coding RNAs are important for all domains of life. With the advent of modern molecular biology their applicability to medicine has become apparent in settings ranging from diagnostic biomarkers to therapeutics and fields ranging from oncology to neurology. In addition, a critical, recent technological development is high-throughput sequencing of nucleic acids. The convergence of modern biotechnology with developments in RNA biology presents opportunities in both basic research and medical settings. Here I present two novel methods for leveraging high-throughput sequencing in the study of short non-coding RNAs, as well as a study in which they are applied to Alzheimer's Disease (AD). The computational methods presented here include High-throughput Annotation of Modified Ribonucleotides (HAMR), which enables researchers to detect post-transcriptional covalent modifications to RNAs in a high-throughput manner. In addition, I describe Classification of RNAs by Analysis of Length (CoRAL), a computational method that allows researchers to characterize the pathways responsible for short non-coding RNA biogenesis.
(DDD) Project: What a Genomic Approach Can Achieve
The Deciphering Developmental Disorders (DDD) project: What a genomic approach can achieve
RCP ADVANCED MEDICINE, LONDON, FEB 5TH 2018
HELEN FIRTH DM FRCP DCH, SANGER INSTITUTE
3,000,000,000 bases in each human genome
Health & development
Disease & developmental disorders
Fascinating facts about your genome!
- ~20,000 protein-coding genes
- ~30% of genes have a known role in disease or developmental disorders
- ~10,000 protein altering variants
- ~100 protein truncating variants
- ~70 de novo mutations (~1-2 coding, i.e. in exons of genes)
Rare Disease affects 1 in 17 people
• Prior to DDD, diagnostic success in patients with rare paediatric disease was poor
• Not possible to diagnose many patients with current methodology in routine use; maximum benefit in this group
• DDD recruited patients with severe/extreme clinical features present from early childhood with high expectation of genetic basis
• Recruitment was primarily of trios (i.e. child and both parents) ~90%
[Image: The Doctor, Sir Luke Fildes (1887)]
Making a genomic diagnosis of a rare disease improves care
• Accurate diagnosis is the cornerstone of good medical practice, informing management, treatment, prognosis and prevention
• Enables risk to other family members to be determined, enabling predictive testing with potential for surveillance and therapy in some disorders
February 28th 2018
• Reduces sense of isolation, enabling better access to support and information
• Curtails the diagnostic odyssey
• Not just a descriptive label; identifies the fundamental cause of disease
A genomic diagnosis can be a gateway to better treatment
• Not just a descriptive label; identifies the fundamental cause of disease
• Biallelic mutations in the CFTR gene cause Cystic Fibrosis
• CFTR protein is an epithelial ion channel regulating absorption/secretion of salt and water in the lung, sweat glands, pancreas & GI tract.
Comparing Tools for Non-Coding RNA Multiple Sequence Alignment Based On
Downloaded from rnajournal.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press
TITLE
RNAconTest: Comparing tools for non-coding RNA multiple sequence alignment based on structural consistency
Running title: RNAconTest: benchmarking comparative RNA programs
Author: Erik S. Wright1,*
1 Department of Biomedical Informatics, University of Pittsburgh (Pittsburgh, PA)
* Corresponding author: Erik S. Wright ([email protected])
Keywords: Multiple sequence alignment, Secondary structure prediction, Benchmark, non-coding RNA, Consensus secondary structure
ABSTRACT
The importance of non-coding RNA sequences has become increasingly clear over the past decade. New RNA families are often detected and analyzed using comparative methods based on multiple sequence alignments. Accordingly, a number of programs have been developed for aligning and deriving secondary structures from sets of RNA sequences. Yet, the best tools for these tasks remain unclear because existing benchmarks contain too few sequences belonging to only a small number of RNA families. RNAconTest (RNA consistency test) is a new benchmarking approach relying on the observation that secondary structure is often conserved across highly divergent RNA sequences from the same family. RNAconTest scores multiple sequence alignments based on the level of consistency among known secondary structures belonging to reference sequences in their output alignment. Similarly, consensus secondary structure predictions are scored according to their agreement with one or more known structures in a family.
Annual Scientific Report 2013 on the Cover Structure 3Fof in the Protein Data Bank, Determined by Laponogov, I
EMBL-European Bioinformatics Institute Annual Scientific Report 2013
On the cover: Structure 3fof in the Protein Data Bank, determined by Laponogov, I. et al. (2009) Structural insight into the quinolone-DNA cleavage complex of type IIA topoisomerases. Nature Structural & Molecular Biology 16, 667-669.
© 2014 European Molecular Biology Laboratory
This publication was produced by the External Relations team at the European Bioinformatics Institute (EMBL-EBI). A digital version of the brochure can be found at www.ebi.ac.uk/about/brochures. For more information about EMBL-EBI please contact: [email protected]
Contents
Introduction & overview
Services
Genes, genomes and variation
Molecular atlas
Proteins and protein families
Molecular and cellular structures
Chemical biology
Molecular systems
Cross-domain tools and resources
Research
Support
ELIXIR
Facts and figures
Funding & resource allocation
Growth of core resources
Collaborations
Our staff in 2013
Scientific advisory committees
Major database collaborations
Publications
Organisation of EMBL-EBI leadership
Foreword
Welcome to EMBL-EBI’s 2013 Annual Scientific Report. Here we look back on our major achievements during the year, reflecting on the delivery of our world-class services, research, training, industry collaboration and European coordination of life-science data. The past year has been one full of exciting changes, both scientifically and organisationally. We unveiled a new website that helps users explore our resources more seamlessly, saw the publication of ground-breaking work in data storage and synthetic biology, joined the global alliance for global health, built important new relationships with our partners in industry and celebrated the launch of ELIXIR.
A Unicellular Relative of Animals Generates a Layer of Polarized Cells
RESEARCH ARTICLE
A unicellular relative of animals generates a layer of polarized cells by actomyosin-dependent cellularization
Omaya Dudin1†*, Andrej Ondracka1†, Xavier Grau-Bové1,2, Arthur AB Haraldsen3, Atsushi Toyoda4, Hiroshi Suga5, Jon Bråte3, Iñaki Ruiz-Trillo1,6,7*
1Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra), Barcelona, Spain; 2Department of Vector Biology, Liverpool School of Tropical Medicine, Liverpool, United Kingdom; 3Section for Genetics and Evolutionary Biology (EVOGENE), Department of Biosciences, University of Oslo, Oslo, Norway; 4Department of Genomics and Evolutionary Biology, National Institute of Genetics, Mishima, Japan; 5Faculty of Life and Environmental Sciences, Prefectural University of Hiroshima, Hiroshima, Japan; 6Departament de Genètica, Microbiologia i Estadística, Universitat de Barcelona, Barcelona, Spain; 7ICREA, Barcelona, Spain
Abstract In animals, cellularization of a coenocyte is a specialized form of cytokinesis that results in the formation of a polarized epithelium during early embryonic development. It is characterized by coordinated assembly of an actomyosin network, which drives inward membrane invaginations. However, whether coordinated cellularization driven by membrane invagination exists outside animals is not known. To that end, we investigate cellularization in the ichthyosporean Sphaeroforma arctica, a close unicellular relative of animals. We show that the process of cellularization involves coordinated inward plasma membrane invaginations dependent on an actomyosin network and reveal the temporal order of its assembly. This leads to the formation of a polarized layer of cells resembling an epithelium. We show that this stage is associated with tightly regulated transcriptional activation of genes involved in cell adhesion.
*For correspondence: [email protected] (OD); [email protected] (IR-T)
Human Genome Sequence 2/3 Complete
Human Genome Sequence 2/3 Complete
March 29, 2000
Two Thirds of Human DNA Script Deciphered by Human Genome Project; Public Consortium To Complete “Working Draft” in June
The Human Genome Project international consortium announced today that two billion of the three billion “letters” that constitute the genetic instruction book of humans have been deciphered and deposited into GenBank. GenBank, the public database of DNA sequence operated by the National Institutes of Health, is accessible freely and without restrictions to all scientists in industry and academia. The two billionth “letter,” or base pair, was deposited earlier this month by the Wellcome Trust’s Sanger Centre in Great Britain. The “letter” was a “T,” the abbreviation for thymine, one of the four chemicals or bases that make up DNA. The 2,178,076,000 unique base pairs now in GenBank have been mapped to their locations on the 24 human chromosomes.
The Human Genome Project is on track to complete the “working draft,” which will include 90 percent of the human DNA sequence, with an accuracy of 99.9 percent, in June. The Human Genome Project worldwide will invest an estimated $250 million in producing the working draft. The finished, stand-the-test-of-time version of the human DNA sequence will be ready on or before 2003. Just four months ago, the Human Genome Project reached the one-billionth base pair milestone.
“It’s good news that we’re moving so fast but it’s even better news that researchers throughout the world are using this data now to investigate the genetic underpinnings of health and diseases ranging from Alzheimer’s to diabetes,” said Dr.
ATM Orchestrates the DNA-Damage Response to Counter Toxic Non-Homologous End-Joining at Broken Replication Forks
ARTICLE, OPEN, https://doi.org/10.1038/s41467-018-07729-2
ATM orchestrates the DNA-damage response to counter toxic non-homologous end-joining at broken replication forks
Gabriel Balmus1,2,3, Domenic Pilger1, Julia Coates1, Mukerrem Demir1, Matylda Sczaniecka-Clift1, Ana C. Barros2, Michael Woods2, Beiyuan Fu2, Fengtang Yang2, Elisabeth Chen2, Matthias Ostermaier4, Tatjana Stankovic5, Hannes Ponstingl2, Mareike Herzog1,2, Kosuke Yusa2, Francisco Munoz Martinez1, Stephen T. Durant6, Yaron Galanty1, Petra Beli4, David J. Adams2, Allan Bradley2, Emmanouil Metzakopian2,3, Josep V. Forment1,6 & Stephen P. Jackson1
Mutations in the ATM tumor suppressor gene confer hypersensitivity to DNA-damaging chemotherapeutic agents. To explore genetic resistance mechanisms, we performed genome-wide CRISPR-Cas9 screens in cells treated with the DNA topoisomerase I inhibitor topotecan. Thus, we here establish that inactivating terminal components of the non-homologous end-joining (NHEJ) machinery or of the BRCA1-A complex specifically confer topotecan resistance to ATM-deficient cells. We show that hypersensitivity of ATM-mutant cells to topotecan or the poly-(ADP-ribose) polymerase (PARP) inhibitor olaparib reflects delayed engagement of homologous recombination at DNA-replication-fork associated single-ended double-strand breaks (DSBs), allowing some to be subject to toxic NHEJ. Preventing DSB ligation by NHEJ, or enhancing homologous recombination by BRCA1-A complex disruption, suppresses this toxicity, highlighting a crucial role for ATM in preventing toxic LIG4-mediated chromosome fusions. Notably, suppressor mutations in ATM-mutant backgrounds are different to those in BRCA1-mutant scenarios, suggesting new opportunities for patient stratification and additional therapeutic vulnerabilities for clinical exploitation.
JUN-AN CHEN 陳俊安, Ph.D. [email protected]
JUN-AN CHEN 陳俊安, Ph.D. [email protected]
Institute of Molecular Biology, Academia Sinica, Taipei, Taiwan, 11529
Tel: +886-2-2788-0460 Fax: +886-2-2782-6085
Lab Website http://www.imb.sinica.edu.tw/~jachen/
ORCID ID orcid.org/0000-0001-9870-3203
EDUCATION
University of Cambridge, 2002~2006: PhD, The Wellcome Trust Cancer Research UK Gurdon Institute, UK.
National Taiwan University (NTU), 1997~1999: MSc, Institute of Biochemical Sciences, Taiwan.
National Chung Cheng University (NCCU), 1993~1997: BSc, Department of Chemistry, Taiwan.
CAREER/ACADEMIC POSITIONS
Associate Research Fellow, 2019~present: Principal Investigator, Institute of Molecular Biology, Academia Sinica, Taiwan.
Assistant Research Fellow, 2012~2019: Principal Investigator, Institute of Molecular Biology, Academia Sinica, Taiwan.
Postdoctoral Fellow, 2007~2012: Supervised by Dr. Hynek Wichterle, Columbia University, USA.
PhD student, 2002~2006: Supervised by Dr. Enrique Amaya, The Wellcome Trust/CR UK Gurdon Institute, University of Cambridge, UK.
Teaching Instructor, 2001~2002: Institute of Biochemical Sciences, National Taiwan University, Taiwan.
PROFESSIONAL HONORS & RECOGNITION
Academia Sinica Junior Research Investigators Award (年輕學者著作獎), 2018
Academia Sinica Career Development Award (前瞻計畫), 2018~2023
Taiwan National Science Council Talented Investigator Fellowship, 2012~2015
Academia Sinica Investigator Fellowship (新聘學術獎), 2012~2014
Taiwan National Science Council Postdoctoral Fellowship, 2008~2009
The Wellcome Trust Scholarship, 2002~2006