EMBL-EBI Research
Infrastructures for research and innovation
Professor Ewan Birney FRS
Director, EMBL-EBI
www.ebi.ac.uk

Outline of talk
• Who am I, what is EMBL?
• The change in genomics
• The need for stratified patients in clinical care and drug discovery
• Europe's assets
• A path to releasing Europe's strengths

The European Molecular Biology Laboratory
80+ nationalities, >1600 personnel, 6 sites in Europe
• Heidelberg, Germany: Main Laboratory
• Hinxton, Cambridge, UK: Bioinformatics
• Grenoble, France: Structural Biology
• Barcelona, Spain: Tissue Biology, Disease Modeling
• Rome, Italy: Neuroscience
• Hamburg, Germany: Structural Biology

Ewan Birney
• Led the original team that analysed the human genome (gene sets)
• Algorithm research in genomic information
• Set up many key databases in genomics (e.g., Ensembl)
• Director of EMBL-EBI
• Non-executive director for Genomics England (NHS clinical genomics)
• Formal advice to UK, Finnish, Danish, US governments; informal to other governments
• Advisor to both large (GSK) and small (Oxford Nanopore) companies
• Chair of the Global Alliance for Genomics and Health (GA4GH)

We have been living through a revolution.

One genome, 2003 to 2018
[Figure: the cost of sequencing a genome in 2003 compared with the cost of sequencing a genome in 2018]

Imaging: new technologies change the game
• EM tomography, atomic-scale models from EM
• Super-resolution light microscopy
• High-resolution MRI and CT
• Light sheet microscopy

Genomics: from research to healthcare
Research
• English language
• Light-weight legal
• Similar systems
• Open data
• Publications
• Grant funding
Practicing Medicine
• National language
• Heavy legal framework
• Different systems
• Closed data
• Not published
• Contract funding

Big numbers!
Stratification of Patients
[Figure: patients stratified into Class A, Class B and Class C]

Benefits of stratification
• In clinical practice
  • Better diagnosis and prognosis
  • Better use of (expensive) medicines ("personalised" medicine)
  • Specific care pathways optimised for the cases
• In drug discovery
  • More clarity on the therapeutic goals in early development
  • Cheaper and more likely to succeed Phase II and Phase III trials

4 Pillars of stratification
• At scale genomic assays
• Clear legal basis to access appropriate data and approach patients
• Very large virtual cohorts, ideally with population scale ascertainment
• Harmonised representation of key aspects of EHRs

Europe's Assets

Well regulated, often state run healthcare
• Total population size of >200 million
• The largest coherent EHR records in the world (Denmark, 6 million Danish citizens)
• Sweden, Norway, Finland all have good record keeping
• Large, predominantly state run systems in France and UK
• Historical as well as future health data

The most advanced clinical + population genomics programs globally
• Finland: >10% of the population sequenced in 5 years
• Estonia: aiming for all 1 million biobanked
• Denmark: 5 million EHRs, 100,000 sequenced
• UK: goal of 5 million with genomic assays within 5 years
• France: clinical + population scale assays for ~1 million within 5 years
• Spain: variety of regional programs with scale to millions

A European Framework: MEGA

Genomic Infrastructure
• EMBL-EBI
  • World leader in genome information and analysis
  • The most comprehensive life science datasets globally
• ELIXIR
  • European-wide network with national nodes to connect local research and healthcare

ELIXIR Node Map: Associated Institutes
• ELIXIR-BE: Katholieke Universiteit Leuven
• ELIXIR-BE: University of Antwerp
• ELIXIR-BE: University of Liège
• ELIXIR-BE: Vrije Universiteit Brussel
• ELIXIR-BE: Universiteit Hasselt
• ELIXIR-BE: Interuniversity Institute of Bioinformatics Brussels
• ELIXIR-CZ: Masaryk University (CEITEC)
• ELIXIR-CZ: Masaryk University (CERIT-SC)
• ELIXIR-CZ: Institute of Chemical Technology
• ELIXIR-CZ: Institute of Experimental Botany AS CR, v. v. i.
• ELIXIR-CZ: Institute of Molecular Genetics of the AS CR
• ELIXIR-CZ: Institute of Microbiology ASCR
• ELIXIR-CZ: Cesnet
• ELIXIR-CZ: University of South Bohemia

The need for infrastructure
[Figure: Clinical Record + Diagnosis; Reference Infrastructure; National Genome Database]

A vibrant commercial research sector
• Many European large scale pharmaceutical companies
  • Sanofi, GSK, Roche, AstraZeneca, Novartis
  • Balance of US vs European research intensity
• Vibrant SME community
  • Based around clusters: Heidelberg-Stuttgart-Munich-Basel, Paris-Brussels-Amsterdam, Oxford-London-Cambridge, Barcelona, Stockholm-Helsinki
• Public-private partnerships
  • IMI
  • OpenTargets @EMBL-EBI

A path for European stratified populations

Alignment of European programs
• Million genomes declaration
• EMBL-EBI and ELIXIR (ESFRI) as genomic infrastructure
• IMI programs as an instrument to foster cross-institutional, trans-national, public-private partnerships

Engagement with nation state health strategy
• Practical "on the ground" implementation is in the hands of the operations and regulation of the healthcare systems in Europe
  • Source of EHR information
  • Source of genomic information
• Fundamental need to have >100 million person cohorts will drive trans-national work
  • Clear for smaller countries that between-country federation is needed
  • Clear for rare disease in all countries; will become relevant to more diseases

Engagement with global structures
• Europe has to tackle trans-national coordination far earlier than the US or Chinese systems
• Similar opportunity to the mobile phone GSM standards: the need for ultimately trans-national access places Europe as the leader in how to solve this
  • Legal and ethical components (GDPR)
  • Technical components
• Leadership in global bodies, such as GA4GH (Global Alliance for Genomics and Health)

Thank you!
EMBL-EBI
Follow me on Twitter: @ewanbirney
1/11/2019

Our mission
• Deliver excellent research
• Deliver scientific services
• Train the next generation of scientists
• Engage with European industry
• Coordinate bioinformatics in Europe

Life science: many data types
• Genes, genomes & variation
• Gene, protein & metabolite expression
• Protein sequences, families & motifs
• Phenotypes
• Macromolecular structures
• Interactions, reactions & pathways
• Chemogenomics & metabolomics

Data resources at EMBL-EBI
• Genes, genomes & variation: Ensembl, Ensembl Genomes, GWAS Catalog, Metagenomics portal
• Gene, protein & metabolite expression: Expression Atlas, Metabolights, PRIDE, RNA Central
• Protein sequences, families & motifs: InterPro, Pfam, UniProt
• Molecular structures: Protein Data Bank in Europe, Electron Microscopy Data Bank
• Chemical biology: ChEBI, ChEMBL, SureChEMBL
• Systems: BioModels, BioSamples, Enzyme Portal, IntAct, Reactome
• Literature & ontologies: Experimental Factor Ontology, Gene Ontology, BioStudies, Europe PMC
• Molecular Archives: European Nucleotide Archive, European Variation Archive, European Genome-phenome Archive, ArrayExpress
~410 people

Worldwide collaborations
Global reference data
See the live map at www.ebi.ac.uk/about/our-impact

Big data, big demand
• ~27 million requests to EMBL-EBI websites every day
• Scientists at over 3.2 million unique IP addresses use EMBL-EBI websites
• EMBL-EBI delivered US$ 1-5 billion in efficiency savings worldwide

Sustainable Funding
• Over 40 different funding agencies worldwide
• Forward commitment of over £100 million

Research groups at EMBL-EBI
• Alex Bateman, Ewan Birney, Pedro Beltrao, Alvis Brazma, Rob Finn, Paul Flicek, Andrew Leach, Moritz Gerstung
• Nick Goldman, Zamin Iqbal, John Marioni, Evangelia Petsalaki, Oliver Stegle, Janet Thornton, Virginie Uhlmann, Daniel Zerbino

Research data at EMBL-EBI
• Mutations affecting proteins implicated in rare diseases
• Evolution of phosphorylation sites
• Proteomic & RNA comparison
• Genomics of infectious disease
• Modeling unwanted variation in single-cell transcriptome studies
• Single Cell Genomics
• Translational bioinformatics

EMBL Research Community
[Figure: research group picture]
• ~170 people
• ~50 visitors / year

Medical Genomics
Serious efforts under way
• Genomics England
  • 100,000 Genomes by end of 2019 (35,000 done now)
  • Long term 60K-100K from "routine healthcare" across NHS
• Plan France Génomique
  • ~100,000 genomes / year by 2025, first sites selected
• Iceland
  • 40% of the population genotyped/sequenced + imputed
• Switzerland
  • SPRT program to promote genomic medicine
• Finland
  • At least ~10% (0.5 million) of the population with sequence data by 2020
• US: complex payer/insurance led market
  • Mixture of HMO (Geisinger) and NIH (All of Us, mainly a cohort)

Genomics: from research to healthcare
Research
• English language
• Light-weight legal
• Similar systems
• Open data
• Publications
• Grant funding
Practicing Medicine
• National language
• Heavy legal framework
• Different systems
• Closed data
• Not published
• Contract funding
Bridges need at least two anchors

Long-term goals
• Ideal: "Institute for Biomedical Informatics" in each country
  • Large nations/populations: distributed network with a clear centre of gravity
• EMBL-EBI & ELIXIR handle research data: reference collections and sharing amongst researchers (including clinical)
• Institute for Biomedical Informatics:
  • Responsible for exploiting molecular reference data
  • Provides the national link and point of reference (e.g., around legislation)
  • Broker for research data (back to EMBL-EBI, NCBI & ELIXIR)

France EMBL-EBI
Basic Research
• Working collaboratively with ELIXIR-France
• Orphanet, CAZy
• Support training in bioinformatics
• Ensuring French scientists and institutes exploit EMBL-EBI
• Seamless APIs to allow submission of data driven by institutes (less complexity for user/scientist, use EMBL-EBI as backup)