Science at Scale
Recommended publications
-
UniProt at EMBL-EBI's Role in CTTV
Barbara P. Palka, Daniel Gonzalez, Edd Turner, Xavier Watkins, Maria J. Martin, Claire O’Donovan. European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK. UniProt at EMBL-EBI’s role in CTTV: contributing to improved disease knowledge. Introduction. The mission of UniProt is to provide the scientific community with a comprehensive, high quality and freely accessible resource of protein sequence and functional information. The UniProt Knowledgebase (UniProtKB) is the central hub for the collection of functional information on proteins, with accurate, consistent and rich annotation. As much annotation information as possible is added to each UniProtKB record and this includes widely accepted biological ontologies, classifications and cross-references, and clear indications of the quality of annotation in the form of evidence attribution of experimental and computational data. The Centre for Therapeutic Target Validation (CTTV) launched in Dec 2015 a new web platform for life-science researchers that helps them identify therapeutic targets for new and repurposed medicines. CTTV is a public-private initiative to generate evidence on the validity of therapeutic targets based on genome-scale experiments and analysis. CTTV is working to create an R&D framework that applies to a wide range of human diseases, and is committed to sharing its data openly with the scientific community. CTTV brings together expertise from four complementary institutions: GSK, Biogen, EMBL-EBI and Wellcome Trust Sanger Institute. UniProt’s disease expert curation: Q5VWK5 (IL23R_HUMAN). This section provides information on the disease(s) associated with genetic variations in a given protein. The information is extracted from the scientific literature and diseases that are also described in the OMIM database are represented with a controlled vocabulary. -
ANNUAL REVIEW 1 October 2005–30 September
WELLCOME TRUST ANNUAL REVIEW 1 October 2005–30 September 2006. The Wellcome Trust is the largest charity in the UK and the second largest medical research charity in the world. It funds innovative biomedical research, in the UK and internationally, spending around £500 million each year to support the brightest scientists with the best ideas. The Wellcome Trust supports public debate about biomedical research and its impact on health and wellbeing. www.wellcome.ac.uk CONTENTS: Director’s statement; Advancing knowledge; Using knowledge; Engaging society; Developing people; Facilitating research; Developing our organisation; Wellcome Trust 2005/06; Financial summary 2005/06; Funding developments 2005/06; Streams funding 2005/06; Technology Transfer; Wellcome Trust Genome Campus; Public Engagement; Library and information resources; Advisory committees. BOARD OF GOVERNORS (as at January 2007): William Castell (Chairman), Martin Bobrow (Deputy Chairman), Adrian Bird, Leszek Borysiewicz, Patricia Hodgson, Richard Hynes, Ronald Plasterk, Alastair Ross Goobey, Peter Smith, Jean Thomas, Edward Walker-Arnott. Images: 1 Surface of the gut. 2 Young children in Kenya. 3 Zebrafish. 4 A scene from Y Touring’s Every Breath. 5 Cells in a developing fruit fly. 6 Data management at the Sanger Institute. This Annual Review covers the Wellcome Trust’s financial year, from 1 October 2005 to 30 September 2006. EXECUTIVE BOARD: Mark Walport (Director), Ted Bianco (Director of Technology Transfer). MAKING A DIFFERENCE: The Wellcome Trust’s mission is to foster and promote research with the aim of improving human and animal health. Developing people: To foster a research community and individual researchers who can contribute to the advancement and use of knowledge -
Text Mining in the Biocuration Workflow: Applications for Literature Curation at WormBase, dictyBase and TAIR
Database, Vol. 2012, Article ID bas040, doi:10.1093/database/bas040. Original article. Text mining in the biocuration workflow: applications for literature curation at WormBase, dictyBase and TAIR. Kimberly Van Auken1,*, Petra Fey2, Tanya Z. Berardini3, Robert Dodson2, Laurel Cooper4, Donghui Li3, Juancarlos Chan1, Yuling Li1, Siddhartha Basu2, Hans-Michael Muller1, Rex Chisholm2, Eva Huala3, Paul W. Sternberg1,5 and the WormBase Consortium. 1Division of Biology, California Institute of Technology, 1200 E. California Boulevard, Pasadena, CA 91125, 2Northwestern University Biomedical Informatics Center and Center for Genetic Medicine, 420 E. Superior Street, Chicago, IL 60611, 3Department of Plant Biology, Carnegie Institution, 260 Panama Street, Stanford, CA 94305, 4Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331 and 5Howard Hughes Medical Institute, California Institute of Technology, 1200 E. California Boulevard, Pasadena, CA 91125, USA. *Corresponding author: Tel: +1 609 937 1635; Fax: +1 626 568 8012; Email: [email protected] Submitted 18 June 2012; Revised 30 September 2012; Accepted 2 October 2012 -
Ensembl Genomes: Extending Ensembl Across the Taxonomic Space
Published online 1 November 2009. Nucleic Acids Research, 2010, Vol. 38, Database issue D563–D569, doi:10.1093/nar/gkp871. Ensembl Genomes: Extending Ensembl across the taxonomic space. P. J. Kersey*, D. Lawson, E. Birney, P. S. Derwent, M. Haimel, J. Herrero, S. Keenan, A. Kerhornou, G. Koscielny, A. Kähäri, R. J. Kinsella, E. Kulesha, U. Maheswari, K. Megy, M. Nuhn, G. Proctor, D. Staines, F. Valentin, A. J. Vilella and A. Yates. EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK. Received August 14, 2009; Revised September 28, 2009; Accepted September 29, 2009. ABSTRACT: Ensembl Genomes (http://www.ensemblgenomes.org) is a new portal offering integrated access to genome-scale data from non-vertebrate species of scientific interest, developed using the Ensembl genome annotation and visualisation platform. Ensembl Genomes consists of five sub-portals (for bacteria, protists, fungi, plants and invertebrate metazoa) designed to complement the availability of vertebrate genomes in Ensembl. Many of the databases supporting the portal have been built in [...] nucleotide archives; numerous other genomes exist in states of partial assembly and annotation; thousands of viral genomes sequences have also been generated. Moreover, the increasing use of high-throughput sequencing technologies is rapidly reducing the cost of genome sequencing, leading to an accelerating rate of data production. This not only makes it likely that in the near future, the genomes of all species of scientific interest will be sequenced; but also the genomes of many individuals, with the possibility of providing accurate and sophisticated annotation through the similarly low-cost application of functional assays. -
The EMBL-European Bioinformatics Institute the Hub for Bioinformatics in Europe
The EMBL-European Bioinformatics Institute: The hub for bioinformatics in Europe. Blaise T.F. Alako, PhD [email protected] www.ebi.ac.uk What is EMBL-EBI? • Part of the European Molecular Biology Laboratory • International, non-profit research institute • Europe’s hub for biological data, services and research. The European Molecular Biology Laboratory: Heidelberg (basic research, administration, EMBO); Hamburg (structural biology); Hinxton, Cambridge (bioinformatics); Grenoble (structural biology); Monterotondo, Rome (mouse biology). EMBL staff: 1500 people, >60 nationalities. EMBL member states: Austria, Belgium, Croatia, Denmark, Finland, France, Germany, Greece, Iceland, Ireland, Israel, Italy, Luxembourg, the Netherlands, Norway, Portugal, Spain, Sweden, Switzerland and the United Kingdom. Associate member state: Australia. Who we are: ~500 members of staff; ~400 work in services & support; ~120 focus on basic research; >53 nationalities. EMBL-EBI’s mission • Provide freely available data and bioinformatics services to all facets of the scientific community in ways that promote scientific progress • Contribute to the advancement of biology through basic investigator-driven research in bioinformatics • Provide advanced bioinformatics training to scientists at all levels, from PhD students to independent investigators • Help disseminate cutting-edge technologies to industry • Coordinate biological data provision throughout Europe. Services: Data and tools for molecular life science. www.ebi.ac.uk/services Browse our services. What services do we provide? Labs around the -
Functional Effects Detailed Research Plan
GeCIP Detailed Research Plan Form. Background: The Genomics England Clinical Interpretation Partnership (GeCIP) brings together researchers, clinicians and trainees from both academia and the NHS to analyse, refine and make new discoveries from the data from the 100,000 Genomes Project. The aims of the partnerships are: 1. To optimise: • clinical data and sample collection • clinical reporting • data validation and interpretation. 2. To improve understanding of the implications of genomic findings and improve the accuracy and reliability of information fed back to patients. To add to knowledge of the genetic basis of disease. 3. To provide a sustainable thriving training environment. The initial wave of GeCIP domains was announced in June 2015 following a first round of applications in January 2015. On 18 June 2015 we invited the inaugurated GeCIP domains to develop more detailed research plans, working closely with Genomics England. These will be used to ensure that the plans are complementary, add real value across the GeCIP portfolio and address the aims and objectives of the 100,000 Genomes Project. They will be shared with the MRC, Wellcome Trust, NIHR and Cancer Research UK as existing members of the GeCIP Board to give advance warning and manage funding requests to maximise the funds available to each domain. However, formal applications will then be required to be submitted to individual funders. They will allow Genomics England to plan shared core analyses and the required research and computing infrastructure to support the proposed research. They will also form the basis of assessment by the Project’s Access Review Committee, to permit access to data. -
Rare Variant Contribution to Human Disease in 281,104 UK Biobank Exomes
https://doi.org/10.1038/s41586-021-03855-y Accelerated Article Preview. Rare variant contribution to human disease in 281,104 UK Biobank exomes. Quanli Wang, Ryan S. Dhindsa, Keren Carss, Andrew R. Harper, Abhishek Nag, Ioanna Tachmazidou, Dimitrios Vitsios, Sri V. V. Deevi, Alex Mackay, Daniel Muthas, Michael Hühn, Sue Monkley, Henric Olsson, Sebastian Wasilewski, Katherine R. Smith, Ruth March, Adam Platt, Carolina Haefliger & Slavé Petrovski. Received: 3 November 2020. Accepted: 28 July 2021. Accelerated Article Preview published online 10 August 2021. Cite this article as: Wang, Q. et al. Rare variant contribution to human disease in 281,104 UK Biobank exomes. Nature https://doi.org/10.1038/s41586-021-03855-y (2021). Open access. This is a PDF file of a peer-reviewed paper that has been accepted for publication. Although unedited, the content has been subjected to preliminary formatting. Nature is providing this early version of the typeset paper as a service to our authors and readers. The text and figures will undergo copyediting and a proof review before the paper is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers apply. Nature | www.nature.com -
The ELIXIR Core Data Resources: Fundamental Infrastructure for the Life Sciences
Supplementary Data: The ELIXIR Core Data Resources: fundamental infrastructure for the life sciences. The “Supporting Material” referred to within this Supplementary Data can be found in the Supporting.Material.CDR.infrastructure file, DOI: 10.5281/zenodo.2625247 (https://zenodo.org/record/2625247).
Figure 1. Scale of the Core Data Resources
Table S1. Data from which Figure 1 is derived:
Year    Data entries    Monthly user/IP addresses    FTEs
2013    765881651       1700660                      270
2014    997794559       2109586                      292.65
2015    1726529931      2413724                      295.65
2016    1853429002      2502617                      289.7
2017    2715599247      2867265                      311.2
Figure 1 includes data from the following Core Data Resources: ArrayExpress, BRENDA, CATH, ChEBI, ChEMBL, EGA, ENA, Ensembl, Ensembl Genomes, EuropePMC, HPA, IntAct/MINT, InterPro, PDBe, PRIDE, SILVA, STRING, UniProt.
● Note that Ensembl’s compute infrastructure physically relocated in 2016, so “Users/IP address” data are not available for that year. In this case, the 2015 numbers were rolled forward to 2016.
● Note that STRING made only minor releases in 2014 and 2016, in that the interactions were re-computed but the number of “Data entries” remained unchanged. The major releases that changed the number of “Data entries” happened in 2013 and 2015. So, for “Data entries”, the number for 2013 was rolled forward to 2014, and the number for 2015 was rolled forward to 2016.
Figure 2: Usage of Core Data Resources in research. The following steps were taken: 1. API calls were run on open access full text articles in Europe PMC to identify articles that mention Core Data Resource by name or include specific data record accession numbers. -
Exploring the Structure and Function Paradigm
Available online at www.sciencedirect.com Exploring the structure and function paradigm. Oliver C Redfern, Benoit Dessailly and Christine A Orengo. Advances in protein structure determination, led by the structural genomics initiatives have increased the proportion of novel folds deposited in the Protein Data Bank. However, these structures are often not accompanied by functional annotations with experimental confirmation. In this review, we reassess the meaning of structural novelty and examine its relevance to the complexity of the structure-function paradigm. Recent advances in the prediction of protein function from structure are discussed, as well as new sequence-based methods for partitioning large, diverse superfamilies into biologically meaningful clusters. Obtaining structural data for these functionally coherent groups of proteins will allow us to better understand the relationship between structure and function. [...] Figure 1 shows that as the international genomics initiatives gather pace, both the number of sequences and protein families are still growing at an exponential rate, although the rate of expansion of protein families is substantially less. This trend is also observed among domain families, which are 10-fold fewer (<10 000) than the number of protein families. By targeting these, the structural genomics initiatives can aim to characterise the major building blocks of whole proteins and since these domains recur in different combination in the genomes, it will be an important step towards understanding the complete structural repertoire in nature. Furthermore, the complement of molecular functions found within an organism is likely to be even fewer. For example, 97% of proteins in yeast can be assigned one or more of 4000 unique GO terms. -
Open Targets Genetics
bioRxiv preprint doi: https://doi.org/10.1101/2020.09.16.299271; this version posted September 17, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license. Open Targets Genetics: An open approach to systematically prioritize causal variants and genes at all published GWAS trait-associated loci. Edward Mountjoy1,2, Ellen M. Schmidt1,2, Miguel Carmona2,3, Gareth Peat2,3, Alfredo Miranda2,3, Luca Fumis2,3, James Hayhurst2,3, Annalisa Buniello2,3, Jeremy Schwartzentruber1,2,3, Mohd Anisul Karim1,2, Daniel Wright1,2, Andrew Hercules2,3, Eliseo Papa4, Eric Fauman5, Jeffrey C. Barrett1,2, John A. Todd6, David Ochoa2,3, Ian Dunham1,2,3, Maya Ghoussaini1,2,*. 1. Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK 2. Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK 3. European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK 4. Systems Biology, Biogen, Cambridge, MA, 02142, United States 5. Integrative Biology, Internal Medicine Research Unit, Pfizer Worldwide Research, Development and Medical, Cambridge, MA 02139, United States 6. Wellcome Centre for Human Genetics, Nuffield Department of Medicine, NIHR Oxford Biomedical Research Centre, University of Oxford, Roosevelt Drive, Oxford, OX3 7BN, UK * Corresponding author -
CRISPR and Beyond Perturbations at Scale to Understand Genomes 2-4
CRISPR and Beyond: Perturbations at Scale to Understand Genomes. 2-4 September 2019, Wellcome Genome Campus, UK. Lectures to be held in the Francis Crick Auditorium. Lunch and dinner to be held in the Hall Restaurant. Poster sessions to be held in the Conference Centre. Spoken presentations: If you are an invited speaker, or your abstract has been selected for a spoken presentation, please give an electronic version of your talk to the AV technician. Poster presentations: If your abstract has been selected for a poster, please display this in the Conference Centre on arrival. Conference programme. Monday, 2 September 2019. 11:45-12:50 Registration. 12:50-13:00 Welcome and introductions: Leopold Parts, Wellcome Sanger Institute, UK. 13:00-14:00 Keynote lecture: Allan Bradley, Kymab, UK. 14:00-15:30 Session 1: Understanding impact of coding variations. Chair: Leopold Parts, Wellcome Sanger Institute, UK. 14:00 New approaches for addressing the effect of genetic variation at scale (Douglas Fowler, University of Washington, USA). 14:30 Tools for systematic proactive functional testing of human missense variants (Frederick Roth, University of Toronto). 15:00 The mutational landscape of a prion-like domain (Benedetta Bolognesi, Institute for Bioengineering of Catalonia, Spain). 15:15 Functional determination of all possible disease-associated variants in a region of CARD11 using saturation genome editing (Richard James, University of Washington, USA). 15:30-16:00 Afternoon tea. 16:00-17:30 Session 2: Measuring consequences of non-coding variation. Chair: Lea Starita, University -
Advances in Understanding Cancer Genomes Through Second-Generation Sequencing
Advances in understanding cancer genomes through second-generation sequencing. Matthew Meyerson, Stacey Gabriel and Gad Getz. A major near-term medical impact of the genome technology revolution will be the elucidation of mechanisms of cancer pathogenesis, leading to improvements in the diagnosis of cancer and the selection of cancer treatment. Thanks to second-generation sequencing technologies [1–5], recently it has become feasible to sequence the expressed genes (‘transcriptomes’) [6,7], known exons (‘exomes’) [8,9], and complete genomes [10–15] of cancer samples. These technological advances are important for advancing our understanding of malignant neoplasms because cancer is fundamentally a disease of the genome. A wide range of genomic alterations (including point mutations, copy number changes and rearrangements) can lead to the development of cancer. [...] comprehensive genome-based diagnosis of cancer is becoming increasingly crucial for therapeutic decisions. During the past decades, there have been major advances in experimental and informatic methods for genome characterization based on DNA and RNA microarrays and on capillary-based DNA sequencing (‘first-generation sequencing’, also known as Sanger sequencing). These technologies provided the ability to analyse exonic mutations and copy number alterations and have led to the discovery of many important alterations in the cancer genome [20]. However, there are particular challenges for the detection and diagnosis of cancer genome alterations. For example, some genomic alterations in cancer are [...] Glossary: Second-generation sequencing. Used in this Review to refer to sequencing methods that have emerged since 2005 that parallelize the sequencing process and produce millions of typically short sequence reads (50–400 bases) from amplified DNA clones. It is also often known as next-generation sequencing.