Strategic Plan 2011-2016

Wellcome Trust Sanger Institute Strategic Plan 2011-2016

Mission

The Wellcome Trust Sanger Institute uses genome sequences to advance understanding of the biology of humans and pathogens in order to improve human health.

CONTENTS

Foreword
Overview
1. History and philosophy
2. Organisation of the science
3. Developments in the scientific portfolio
4. Summary of the Scientific Programmes 2011 – 2016
    4.1 Cancer Genetics and Genomics
    4.2 Human Genetics
    4.3 Pathogen Variation
    4.4 Malaria
    4.5 Mouse and Zebrafish Genetics
    4.6 Genome Informatics
5. Research into diseases of developing countries
6. Core platforms for data generation and analysis
7. Strategic issues across all Programmes
    7.1 Strategy for DNA sequencing
    7.2 Strategy for data handling and sharing
8. Strategy for translation
9. Developing our people
10. Developing our organisation
11. Premises
12. Strategic relations
13. Spreading the word
14. Societal aspects of the Personal Genome
15. Resources
Supporting documents

Foreword

We present below the Strategic Plan for the Wellcome Trust Sanger Institute for 2011-2016.
The centrepiece of the Plan is the portfolio of scientific experiments we propose to conduct over the next quinquennium. However, the document also addresses training of young scientists and the ways in which we will inform and interact with the public about our work. Our science is conducted by people in a complex physical and societal environment. The Plan therefore further reflects on our scientific infrastructure and strategy for its development, the organisation and governance of the Institute, our interactions with other scientific organisations, the development of our people, our buildings, and the resources we are requesting in order to implement the Plan. It has been written subsequent to external review of our proposals and a Site Visit by a review panel at Sanger in May 2010, and incorporates the advice and comments received.

Overview

The Sanger Institute’s primary scientific theme in the 2011-2016 quinquennium will be the study of differences in DNA sequence between individual genomes, understanding the consequences of this variation for the biology of humans and other organisms and developing approaches by which this understanding can be used to improve human health. Our mission and strategy are entirely congruent with those of the Wellcome Trust and we aim to contribute as much as we can from our scientific corner to achieving the Trust’s goals.

Why study variation in DNA sequence? Because individual members of a species, whether humans, mice, fish, worms, parasites, bacteria or viruses, differ from one another in their anatomy, physiology, metabolism and behaviour, and differences in their DNA sequences make a substantial contribution to this variation in phenotype. The overarching aim of our studies is, therefore, to characterise differences between individual genomes and identify the phenotypic changes which are the consequence of genomic variants.
Our research will entail exploration of variation in DNA sequence between human genomes, between the genomes of microorganisms that cause infectious disease (pathogens) and between the genomes of individual animals from two model organism species, mouse and zebrafish. We will extensively explore naturally occurring DNA variation and will complement it with experimental genomic differences that we ourselves introduce.

Our studies of DNA variation in humans will be wide ranging. They aim to identify DNA variants that cause phenotypic differences which fall within the normal spectrum, for example in hair colour or height, and also variants that result in overt disease. A spectrum of human diseases, rare and common, prevalent in developed and developing countries, will be investigated, including obesity, inflammatory disease, epilepsy, severity of response to malarial infection and disorders of development. Our studies in humans will also encompass somatic DNA variation, differences between the genomes of individual cells in the body, which underlies the development of all cancers.

Naturally occurring variation will also be studied in many disease-associated microorganisms, allowing us to elucidate the genomic variants that are associated with important phenotypic features for human health such as microbial virulence, resistance to antibiotics and escape from vaccination.

Because DNA variation distinguishes one member of a species from another and tells us how closely they are related, studying DNA sequence differences also allows us to track patterns and changes in the distribution of organisms. For example, it reveals how human populations have migrated in the past and the environmental or lifestyle factors implicated in selecting some humans over others, such as infectious disease and food scarcity.
In a similar manner, it will allow us to track the spread of disease-causing microbes, monitor their routes of geographical dissemination and understand the waxing and waning of epidemics. Indeed, we will also be able to track cancer cells using somatically acquired genomic differences, even when they are undetectable by other methods.

Importantly, investigation of natural variation at Sanger is complemented by our ability to introduce artificial changes into genomes. We will knock out genes in many microorganisms, allowing us to identify those that are absolutely required for the organism to survive and which therefore potentially recommend themselves as drug targets. Our studies of experimentally induced variation in zebrafish, mice and embryonic stem cells will systematically and comprehensively explore the changes in phenotype that result when genes are deleted, one by one. We will also use engineered mutations in mice to cause cancers and thus understand in greater depth the genes that are implicated in tumour formation. Of course, these experiments will also provide the animals or cells themselves, which can then be used to investigate further the disturbance in biology that is caused by the DNA variant and which ultimately results in the phenotype.

The results emanating from all these experiments will be made freely available in data repositories. In many cases we will, in addition, organise and annotate the data in intelligible and easily usable databases, transforming the ability of other research scientists to use the information. Physical resources such as animals, cells and DNA clones will also be made available and are already widely used by many research communities around the world. In some instances, the discoveries we make will lead directly to changes in medical practice which will have impact on the
