EMBL-EBI Research

EMBL-EBI Research Infrastructures for research and innovation
Professor Ewan Birney FRS
Director, EMBL-EBI
www.ebi.ac.uk

Outline of talk
• Who Am I, What is EMBL?
• The change in genomics
• The needs for stratified patients in clinical care and drug discovery
• Europe’s assets
• A path to releasing Europe’s strengths

The European Molecular Biology Laboratory
80+ nationalities, >1600 personnel, 6 sites in Europe
• Heidelberg, Germany: Main Laboratory
• Hinxton, Cambridge, UK: Bioinformatics
• Grenoble, France: Structural Biology
• Barcelona, Spain: Tissue Biology, Disease Modeling
• Rome, Italy: Neuroscience
• Hamburg, Germany: Structural Biology

Ewan Birney
• Led the original team that analysed the human genome (gene sets)
• Algorithm research in genomic information
• Set up many key databases in genomics (e.g., Ensembl)
• Director of EMBL-EBI
• Non-executive director for Genomics England (NHS clinical genomics)
• Formal advice to UK, Finnish, Danish, US governments; informal to other governments
• Advisor to both large (GSK) and small (Oxford Nanopore) companies
• Chair of the Global Alliance for Genomics and Health (GA4GH)

We have been living through a revolution.
[Figure: one genome, 2003 to 2018. The cost of sequencing a genome in 2003 compared with the cost of sequencing a genome in 2018]

Imaging: new technologies change the game
• EM tomography, atomic-scale models from EM
• Super-resolution light microscopy
• High-resolution MRI and CT
• Light sheet microscopy

Genomics: from research to healthcare
Research
• English language
• Light-weight legal
• Similar systems
• Open data
• Publications
• Grant funding
Practicing Medicine
• National language
• Heavy legal framework
• Different systems
• Closed data
• Not published
• Contract funding

Big numbers!

Stratification of Patients
[Figure: patients stratified into Class A, Class B and Class C]

Benefits of stratification
• In clinical practice
  • Better diagnosis and prognosis
  • Better use of (expensive) medicines (“personalised” medicine)
  • Specific care pathways optimised for the cases
• In drug discovery
  • More clarity on the therapeutic goals in early development
  • Cheaper and more likely to succeed Phase II and Phase III trials

4 Pillars of stratification
• At scale genomic assays
• Clear legal basis to access appropriate data and approach patients
• Very Large Virtual Cohorts ideally with population scale ascertainment
• Harmonised representation of key aspects of EHRs

Europe’s Assets

Well regulated, often state run healthcare
• Total population size of >200 million
• The largest coherent EHR records in the world (Denmark, 6 million Danish citizens)
• Sweden, Norway, Finland all have good record keeping
• Large, predominantly state run systems in France and UK
• Historical as well as future health data

The most advanced clinical + population genomics programs globally
• Finland – >10% of the population sequenced in 5 years
• Estonia – aiming for all 1 million biobanked
• Denmark – 5 million EHRs, 100,000 sequenced
• UK – Goal of 5 million with genomic assays within 5 years
• France – Clinical + Population scale assays for ~1 million within 5 years
• Spain – Variety of regional programs with scale to millions

A European Framework: MEGA

Genomic Infrastructure
• EMBL-EBI
  • World leader in genome information and analysis
  • The most comprehensive life science datasets globally
• ELIXIR
  • European wide network with National nodes to connect local research and healthcare

ELIXIR Node Map: Associated Institutes
• ELIXIR-BE Katholieke Universiteit Leuven
• ELIXIR-BE University of Antwerp
• ELIXIR-BE University of Liège
• ELIXIR-BE Vrije Universiteit Brussel
• ELIXIR-BE Universiteit Hasselt
• ELIXIR-BE Interuniversity Institute of Bioinformatics Brussels
• ELIXIR-CZ: Masaryk University (CEITEC)
• ELIXIR-CZ: Masaryk University (CERIT-SC)
• ELIXIR-CZ: Institute of Chemical Technology
• ELIXIR-CZ: Institute of Experimental Botany AS CR, v. v. i.
• ELIXIR-CZ: Institute of Molecular Genetics of the AS CR
• ELIXIR-CZ: Institute of Microbiology ASCR
• ELIXIR-CZ: Cesnet
• ELIXIR-CZ: University of South Bohemia

The need for infrastructure
[Diagram: Clinical Record + Diagnosis; Reference Infrastructure; National Genome Database]

A vibrant commercial research sector
• Many European large scale pharmaceutical companies
  • Sanofi, GSK, Roche, AstraZeneca, Novartis
  • Balance of US vs European research intensity
• Vibrant SME community
  • Based around clusters – Heidelberg-Stuttgart-Munich-Basel, Paris-Brussels-Amsterdam, Oxford-London-Cambridge, Barcelona, Stockholm-Helsinki
• Public-private partnerships
  • IMI
  • OpenTargets @EMBL-EBI

A path for European stratified populations

Alignment of European programs
• Million genomes declaration
• EMBL-EBI and ELIXIR (ESFRI) as genomic infrastructure
• IMI programs as an instrument to foster cross-institutional, trans-national, public-private partnerships

Engagement with Nation state Health strategy
• Practical “on the ground” implementation is in the hands of the operations and regulation of the healthcare systems in Europe
  • Source of EHR information
  • Source of genomic information
• Fundamental need to have >100 million person cohorts will drive trans-national work
  • Clear for smaller countries that between-country federation is needed
  • Clear for rare disease in all countries; will become relevant to more diseases

Engagement with global structures
• Europe has to tackle trans-national coordination far earlier than the US or Chinese systems
• Similar opportunity as mobile phone GSM standards – the need for ultimately trans-national access places Europe as the leader in how to solve this
  • Legal and ethical components (GDPR)
  • Technical components
• Leadership in global bodies, such as GA4GH (Global Alliance for Genomics and Health)

Thank you!

EMBL-EBI
Follow me on twitter @ewanbirney
1/11/2019

Our mission
• Deliver excellent research
• Deliver scientific services
• Train the next generation of scientists
• Engage with European industry
• Coordinate bioinformatics in Europe

Life science: many data types
• Genes, genomes & variation
• Gene, protein & metabolite expression
• Protein sequences, families & motifs
• Phenotypes
• Macromolecular structures
• Interactions, reactions & pathways
• Chemogenomics & metabolomics

Data resources at EMBL-EBI
• Genes, genomes & variation: Ensembl, Ensembl Genomes, GWAS Catalog, Metagenomics portal
• Gene, protein & metabolite expression: Expression Atlas, Metabolights, PRIDE, RNA Central
• Protein sequences, families & motifs: InterPro, Pfam, UniProt
• Molecular structures: Protein Data Bank in Europe, Electron Microscopy Data Bank
• Literature & ontologies: Experimental Factor Ontology, Gene Ontology, BioStudies, Europe PMC
• Chemical biology: ChEBI, ChEMBL, SureChEMBL
• Systems: BioModels, BioSamples, Enzyme Portal, IntAct, Reactome
• Molecular Archives: European Nucleotide Archive, European Variation Archive, European Genome-phenome Archive, ArrayExpress

Worldwide collaborations
• ~410 people
• Global reference data
• See the live map at www.ebi.ac.uk/about/our-impact

Big data, big demand
• Scientists at over 3.2 million unique IP addresses use EMBL-EBI websites
• ~27 million requests to EMBL-EBI websites every day
• EMBL-EBI delivered 1-5 US$ billion in efficiency savings worldwide

Sustainable Funding
• Over 40 different funding agencies worldwide
• Forward commitment of over £100 million

Research groups at EMBL-EBI
Alex Bateman, Ewan Birney, Pedro Beltrao, Alvis Brazma, Rob Finn, Paul Flicek, Andrew Leach, Moritz Gerstung, Nick Goldman, Zamin Iqbal, John Marioni, Evangelia Petsalaki, Oliver Stegle, Janet Thornton, Virginie Uhlmann, Daniel Zerbino

Research data at EMBL-EBI
• Mutations affecting proteins implicated in rare diseases
• Evolution of phosphorylation sites
• Proteomic & RNA comparison
• Genomics of infectious disease
• Modeling unwanted variation in single-cell transcriptome studies
• Single Cell Genomics
• Translational bioinformatics

EMBL Research Community
[Research group picture]
• ~170 people
• ~50 visitors / year

Medical Genomics
Serious efforts under way
• Genomics England
  • 100,000 Genomes by end of 2019 (35,000 done now)
  • Long term 60K-100K from “routine healthcare” across NHS
• Plan France Génomique
  • ~100,000 genomes / year by 2025, first sites selected
• Iceland
  • 40% of the population genotyped/sequenced + imputed
• Switzerland
  • SPRT program to promote genomic medicine
• Finland
  • at least ~10% (0.5 million) of the population with sequence data by 2020
• US – Complex payer/insurance led market
  • Mixture of HMO (Geisinger) and NIH (All of Us – mainly a cohort)

Genomics: from research to healthcare
Research
• English language
• Light-weight legal
• Similar systems
• Open data
• Publications
• Grant funding
Practicing Medicine
• National language
• Heavy legal framework
• Different systems
• Closed data
• Not published
• Contract funding

Bridges need at least two anchors

Long-term goals
• Ideal: “Institute for Biomedical Informatics” in each country
  • Large nations/populations: Distributed network with a clear centre of gravity
• EMBL-EBI & ELIXIR handle research data: reference collections and sharing amongst researchers (including clinical)
• Institute for Biomedical Informatics:
  • Responsible for exploiting molecular reference data
  • Provides the national link and point of reference (e.g., around legislation)
  • Broker for research data (back to EMBL-EBI, NCBI & ELIXIR)

France EMBL-EBI Basic Research
• Working collaboratively with ELIXIR-France
  • Orphanet, CAZy
• Support training in bioinformatics
• Ensuring French scientists and institutes exploit EMBL-EBI
• Seamless APIs to allow submission of data driven by institutes (less complexity for user/scientist, use EMBL-EBI as backup)