1000 Genomes Project


Genomics: Beginning & Basis of Biotechnology

The Fourth Asian Conference on Biotechnology and Development, February 12-13, 2009, Kathmandu, Nepal

Huanming Yang, Ph.D., Beijing Genomics Institute (BGI), China

1. A world of crises and opportunities
2. Two pillars and platforms of genomics
3. “Three BY ALLs” for genomics
4. Four proposals for collaboration
5. Five collaborative projects undergoing

Biotechnology: challenges and opportunities for Asia, for developing countries

2007-2008: global energy crisis, global food crisis, global financial crisis, global economy crisis

Crisis: Crisis & Opportunity

2007: A Year of Miracle

“There were more breakthroughs (in life sciences) last year (2007) than in several of the past decades combined.” Dr. Eric Topol (Scripps Institute, March 11, 2008): The hunt for genetic gold

“A Year of Miracle”

“These findings are just a prelude to what's shaping up as a true conceptual and technological revolution. Just as physics shocked the world in the 20th century, it is now clear that the life sciences will shake up the world in the 21st.” Newsweek, Oct. 15, 2007

2007: The Year of Sequencing. Nature Methods 5:11-14, 2008

2007 Question of the Year

The editors of Nature Genetics posed the “Question of the Year” for 2007: what would it mean if the cost of sequencing a human genome fell below $1,000? The “$1,000 genome” is no longer a question of whether it is possible, but of when it will be realized. Such a question was hard to imagine only four years ago.

BREAKTHROUGH OF THE YEAR: Human Genetic Variation

Equipped with faster, cheaper technologies for sequencing DNA and assessing variation in genomes on scales ranging from one to millions of bases, researchers are finding out how truly different we are from one another. Science 318: 1842-1843, 21 December 2007

Now, we have moved from asking what in our DNA makes us human to striving to know what in my DNA makes me me. Volume 22 | Issue 12 | Dec. 2008

TIME’s Best Inventions of 2008 (9 Nov. 2008)

1. The Retail DNA Test (the personal DNA testing service tops the list)

“We are at the beginning of a personal-genomics revolution that will transform not only how we take care of ourselves but also what we mean by personal information.” TIME

Top 10 Medical Breakthroughs of 2008

4. Genomes for the Masses

James Watson did it. So did Craig Venter. Now you, too, can map your entire genome and reveal some of its many secrets. Scientists debate whether that information is really worth anything at the moment (in many cases, there isn't enough scientific knowledge to interpret what it really means to have this gene variant or that one), but companies at least make it possible for you to take a gander at your genetic data. (Although the service was available previously, until this year, it's been prohibitively expensive.) You provide a sample of saliva, from which your DNA is extracted, copied and combed for the presence of 90 known genetic variations that code for different traits or conditions, from lactose intolerance (though you could probably drink a glass of milk and find out for far cheaper) to prostate cancer. Right now, there's no way to know whether you'll get cancer just because you have the gene, but once the science has advanced, the hope is that such genetic mining will predict disease, giving people the option of seeking treatment before they get sick.

Nature favours developing countries: develop your own bioindustry by taking advantage of your rich genetic resources.

Genomics: a science to find genes. Basis and beginning of biotechnology (biotech cannot be done without genes).

Two Pillars of Genomics: “Life is of sequence”

Genetic information is in sequence: ATTCGGTAACGATTAGAA

DNA sequence: the essence of “life is life”! The same for plants.

Two Pillars of Genomics: “Life is digital!”

“the instructions for making a life from one generation to the next is digital, NOT analogue …” John Sulston

[Slide graphic: rows of binary digits]

“Life is of sequence”, “Life is digital”: making sequencers and supercomputers major tools for genomics

Two Pillars of Genomics: “Life is of sequence, Life is digital!”

Revolutionizing life sciences. No sequence, no knowledge!

“All biology in the future will start with the knowledge of genomes and proceed hopefully.” J. D. Watson, 2003

An issue of cost!

“One ‘buck’, one base”

HGP: $3.0 b / 3 Gb ($1.0/bp)
1999: $0.5 b / 3 Gb
2005: $30 m / 3 Gb ($0.01/bp)
Future (1): $3.0 m / 3 Gb
Future (2): $0.1 m / 3 Gb

Sequencing is no longer a “luxury” tool.

Next-Generation Sequencers: aiming at the “$100,000 genome”

Sequencing Revolution: 454. 20-500 Mb (100-500 bp/read) per 6-hour run

Sequencing Revolution: Sequencing by Ligation. SOLiD, 3 Gb/run

Sequencing Revolution: Illumina Solexa. 3-6 Gb per 6 days; ~1000 molecules per ~1 um cluster; ~1000 clusters per 100 um square; random array of ~40 million clusters per experiment

Next-Next-Generation Sequencers: aiming at the “$1000 genome”

NEXT-GEN SEQUENCING TECHNOLOGIES 26 (10):1146, Oct. 2008

PacBio to Start Selling Next-Gen Sequencer To Early Users in 2010

The company projects that with improvements to its enzyme biochemistry and in camera technology, it will eventually be able to generate more than 100 gigabases of sequence data per hour, provide reads at least as long as Sanger sequencing, and offer run times measuring in minutes at a cost of hundreds of dollars. Pacific Biosciences also prepares the 15-Minute Genome by 2013. GenomeWeb Newsroom, February 13, 2008

VisiGen to Offer 'Nano-Sequencing' $1,000 Genome Service by 2009

[February 12, 2008] By Bernadette Toner, GenomeWeb News Editor

SALT LAKE CITY (GenomeWeb News) – Next-generation sequencing firm VisiGen Biotechnologies plans to offer a service based on its real-time single-molecule sequencing, or "nano-sequencing machine", technology by the end of 2009, and to follow that with the launch of equipment and reagents in another 18 months to two years. The technology could enable researchers to sequence an entire human genome in less than a day for under $1,000, which can generate around 4 gigabases of data per day. At that throughput, the technology could sequence 44 human genomes per year at 10-fold coverage for around $1,000 per genome. In addition, read lengths for the instrument are expected to be around one kilobase.
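The per-base costs on the “One ‘buck’, one base” slide and the throughput claim in the VisiGen item can be checked with simple arithmetic. The following is only a sketch: the function and variable names are illustrative, the 3 Gb genome size is the one used on the cost slide, and a 365-day year with no instrument downtime is an assumption that the source does not state.

```python
# Back-of-the-envelope check of the cost and throughput figures quoted above.
# All names are illustrative; the numbers come from the slides.

GENOME_SIZE_BP = 3_000_000_000  # "3 Gb", as used on the cost slide

# (label, total cost in US dollars) from the "One 'buck', one base" slide
COST_MILESTONES = [
    ("HGP", 3.0e9),
    ("1999", 0.5e9),
    ("2005", 30e6),
    ("Future (1)", 3.0e6),
    ("Future (2)", 0.1e6),
]


def cost_per_base(total_cost_usd: float, genome_size_bp: int = GENOME_SIZE_BP) -> float:
    """Return the sequencing cost in dollars per base pair."""
    return total_cost_usd / genome_size_bp


def genomes_per_year(
    gb_per_day: float,
    coverage: float,
    genome_size_gb: float = 3.0,
    days: int = 365,  # assumption: the instrument runs every day of the year
) -> float:
    """Return how many genomes fit into a year of output at a given coverage."""
    return gb_per_day * days / (coverage * genome_size_gb)


if __name__ == "__main__":
    for label, cost in COST_MILESTONES:
        print(f"{label:>10}: ${cost_per_base(cost):.6f} per bp")
    # VisiGen item: ~4 Gb per day at 10-fold coverage
    print(f"{genomes_per_year(4.0, 10.0):.1f} genomes per year")
```

Under these assumptions the HGP and 2005 rows reproduce the $1.0/bp and $0.01/bp shown on the slide, and the throughput works out to roughly 49 genomes per year, in the same range as the 44 quoted. The gap depends on the genome size and uptime assumed, which the quoted article does not give.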
Oxford Nanopore Technologies

• Earlier this year, UK-based Oxford Nanopore Technologies, a startup company developing a nanopore-based sequencing technology, raised £10 million ($20 million) in a second financing round from non-VC institutional and private investors, adding to a £7.5 million ($15 million) round in 2006 (see In Sequence 4/8/2008).

Technical Platform of Genomics

It is essential for developing countries to build powerful infrastructure for sustainable development of science.

① Sequencing

104 × MegaBACE
25 × ABI 3700/3730
18 × Illumina Solexa I
1 × 454
2 × ABI SOLiD

“BGI to Ramp up Sequencing Abilities”

NEW YORK, March 26, 2008 (GenomeWeb News) – Beijing Genomics Institute is dramatically expanding its DNA sequencing capacity by adding fourteen new next-generation sequencers, … to bring BGI's raw-sequencing data output to up to 20 Gbps per day or more, ranking it the third-largest center in the world by capacity.

Space for 40 new sequencers. Programmers: > 150

② Bioinformatics

Supercomputers at BGI, Beijing & Hangzhou; supercomputers in Beijing. Systems shown: Dawning 3000, SGI, IBM, Dawning 2000, SUN.

CPUs: 1192. Memory: 2.3 T. Storage: 1458 T. Speed: 20 TFLOPS.

Home-made new supercomputer at BGI-Shenzhen

To read the genome: BGI has contributed to most, if not all, programs for CNV detection and other applications of the next-generation sequencers, especially Solexa.

This is a revolution. “This is just the start”

An imbalanced world

Two “tribes” in the world: the Rich & the Poor.

Where is her hope? (Kathmandu, Feb. 11, 2008)

Hope!

The challenge is not only technology, but also humanity!

Genomics should not create more differences or make the existing differences even bigger.

Nature, April 27, 2006: “Human genome sequencing