Telomere-To-Telomere Assembly of a Complete Human X Chromosome W

Total Page:16

File Type:pdf, Size:1020Kb

Telomere-To-Telomere Assembly of a Complete Human X Chromosome W https://doi.org/10.1038/s41586-020-2547-7 Accelerated Article Preview Telomere-to-telomere assembly of a W complete human X chromosome E VI Received: 30 July 2019 Karen H. Miga, Sergey Koren, Arang Rhie, Mitchell R. Vollger, Ariel Gershman, Andrey Bzikadze, Shelise Brooks, Edmund Howe, David Porubsky, GlennisE A. Logsdon, Accepted: 29 May 2020 Valerie A. Schneider, Tamara Potapova, Jonathan Wood, William Chow, Joel Armstrong, Accelerated Article Preview Published Jeanne Fredrickson, Evgenia Pak, Kristof Tigyi, Milinn Kremitzki,R Christopher Markovic, online 14 July 2020 Valerie Maduro, Amalia Dutra, Gerard G. Bouffard, Alexander M. Chang, Nancy F. Hansen, Amy B. Wilfert, Françoise Thibaud-Nissen, Anthony D. Schmitt,P Jon-Matthew Belton, Cite this article as: Miga, K. H. et al. Siddarth Selvaraj, Megan Y. Dennis, Daniela C. Soto, Ruta Sahasrabudhe, Gulhan Kaya, Telomere-to-telomere assembly of a com- Josh Quick, Nicholas J. Loman, Nadine Holmes, Matthew Loose, Urvashi Surti, plete human X chromosome. Nature Rosa ana Risques, Tina A. Graves Lindsay, RobertE Fulton, Ira Hall, Benedict Paten, https://doi.org/10.1038/s41586-020-2547-7 Kerstin Howe, Winston Timp, Alice Young, James C. Mullikin, Pavel A. Pevzner, (2020). Jennifer L. Gerton, Beth A. Sullivan, EvanL E. Eichler & Adam M. Phillippy C This is a PDF fle of a peer-reviewedI paper that has been accepted for publication. Although unedited, the Tcontent has been subjected to preliminary formatting. Nature is providing this early version of the typeset paper as a service to our authors and readers. The text andR fgures will undergo copyediting and a proof review before the paper is published in its fnal form. Please note that during the production process errors may beA discovered which could afect the content, and all legal disclaimers apply.D E T A R E L E C C A Nature | www.nature.com Article Telomere-to-telomere assembly of a complete human X chromosome W 1,24 2,24 2 3 4 https://doi.org/10.1038/s41586-020-2547-7 Karen H. Miga ✉, Sergey Koren , Arang Rhie , Mitchell R. Vollger , Ariel Gershman , 5 6 7 3 E 3 Andrey Bzikadze , Shelise Brooks , Edmund Howe , David Porubsky , Glennis A. Logsdon , Received: 30 July 2019 Valerie A. Schneider8, Tamara Potapova7, Jonathan Wood9, William Chow9, JoelI Armstrong1, Accepted: 29 May 2020 Jeanne Fredrickson10, Evgenia Pak11, Kristof Tigyi1, Milinn Kremitzki12, Christopher Markovic12, 13 11 6 2V 14 Valerie Maduro , Amalia Dutra , Gerard G. Bouffard , Alexander M. Chang , Nancy F. Hansen , Published online: 14 July 2020 Amy B. Wilfert3, Françoise Thibaud-Nissen8, Anthony D. Schmitt15, Jon-Matthew Belton15, Siddarth Selvaraj15, Megan Y. Dennis16, Daniela C. Soto16, Ruta SahasrabudheE 17, Gulhan Kaya16, Josh Quick18, Nicholas J. Loman18, Nadine Holmes19, Matthew Loose19, Urvashi Surti20, Rosa ana Risques10, Tina A. Graves Lindsay12, Robert Fulton12, IraR Hall12, Benedict Paten1, Kerstin Howe9, Winston Timp4, Alice Young6, James C. Mullikin6, Pavel A. Pevzner21, Jennifer L. Gerton7, Beth A. Sullivan22, Evan E. Eichler3,23P & Adam M. Phillippy2 ✉ After two decades of improvements, the currentE human reference genome (GRCh38) is the most accurate and complete vertebrateL genome ever produced. However, no one chromosome has been fnished end to end, and hundreds of unresolved gaps persist1,2. Here we present a de novoC human genome assembly that surpasses the continuity of GRCh382, alongI with the frst gapless, telomere-to-telomere assembly of a human chromosome. This was enabled by high-coverage, ultra-long-read nanopore sequencing of the completeT hydatidiform mole CHM13 genome, combined with complementary technologies for quality improvement and validation. Focusing our R 3 eforts on the human X chromosome , we reconstructed the ~3.1 megabase centromericA satellite DNA array and closed all 29 remaining gaps in the current reference, including new sequence from the human pseudoautosomal regions and cancer-testis ampliconic gene families (CT-X and GAGE). These novel sequences will be integratedD into future human reference genome releases. Additionally, a complete chromosome X, combined with the ultra-long nanopore data, allowed us to map Emethylation patterns across complex tandem repeats and satellite arrays for the frst T time. Our results demonstrate that fnishing the entire human genome is now within reach and the data presented here will enable ongoing eforts to complete the A remaining human chromosomes. R Complete, telomere-to-telomere reference assemblies are neces- length and greater than 98% identical between paralogs. Due to their sary to ensure that allE genomic variants are discovered and studied. absence from the reference, these repeat-rich sequences are often Currently, unresolved regions of the human genome are defined by excluded from genetics and genomics studies, limiting the scope of multi-megabaseL satellite arrays in the pericentromeric regions and the association and functional analyses4,5. Unresolved repeat sequences rDNA arrays on acrocentric short arms, as well as regions enriched in also result in unintended consequences such as paralogous sequence segmentalE duplications that are greater than hundreds of kilobases in variants incorrectly called as allelic variants6 and the contamination of 1UC SantaC Cruz Genomics Institute, University of California, Santa Cruz, CA, USA. 2Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA. 3Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA. 4Department of Molecular Biology & Genetics, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA. 5Graduate Program in Bioinformatics and Systems Biology, University of CCalifornia, San Diego, CA, USA. 6NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Rockville, MD, USA. 7Stowers Institute for Medical Research, Kansas City, MO, USA. 8National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA. 9Wellcome Sanger Institute, Cambridge, UK. 10University of Washington, Department of Pathology, Seattle, WA, USA. 11Cytogenetic and Microscopy Core, National Human Genome Research Institute, National Institutes of A 12 13 Health, Bethesda, MD, USA. McDonnell Genome Institute at Washington University, St. Louis, MO, USA. Undiagnosed Diseases Program, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA. 14Comparative Genomics Analysis Unit, Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA. 15Arima Genomics, San Diego, CA, USA. 16Department of Biochemistry and Molecular Medicine, Genome Center, MIND Institute, University of California, Davis, CA, USA. 17DNA Technologies Core, Genome Center, University of California, Davis, CA, USA. 18Institute of Microbiology and Infection, University of Birmingham, Birmingham, UK. 19DeepSeq, School of Life Sciences, University of Nottingham, Nottingham, UK. 20Department of Pathology, University of Pittsburgh, Pittsburgh, PA, USA. 21Department of Computer Science and Engineering, University of California, San Diego, CA, USA. 22Department of Molecular Genetics and Microbiology, Division of Human Genetics, Duke University Medical Center, Durham, NC, USA. 23Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA. 24These authors contributed equally: Karen H. Miga, Sergey Koren. ✉e-mail: [email protected]; adam. [email protected] Nature | www.nature.com | 1 Article bacterial gene databases7. Completion of the entire human genome is after Nanopore polishing and 99.99% after PacBio polishing. Illumina expected to contribute to our understanding of chromosome function8, data was only used to correct small insertion and deletion errors in human disease9, and genomic variation which will improve technologies uniquely-mappable regions of the genome, which had a marginal effect in biomedicine that use short-read mapping to a reference genome on average accuracy but reduced the number of frame-shifted genes. (e.g. RNA-seq10, ChIP-seq11, ATAC-seq12). Putative mis-assemblies were identified via analysis of the Illumina The fundamental challenge of reconstructing a genome from many linked-read barcodes (10x Genomics) and optical mapping (Bionano comparatively short sequencing reads—a process known as genome Genomics) data not used in the initial assembly. The initial contigs assembly—is distinguishing the repeated sequences from one another13. were broken at regions of low mapping coverage and the correctedW W Resolving such repeats relies on sequencing reads that are long enough contigs were then ordered and oriented relative to one another using to span the entire repeat or accurate enough to distinguish each repeat the optical map. Over 90% of six chromosomes are represented in two 14 E E copy on the basis of unique variants . The difficulty of the assembly contigs and ten are represented by 2 scaffolds. (Fig 1a). problem and limits of past technologies are highlighted by the fact The final assembly consists of 2.94 Gbp in 448 contigsI with a contig I that the human genome remains unfinished 20 years after its initial NG50 of 70 Mbp. A total of 98 scaffolds (173 contigs) were unambigu- 15 V V release in 2001 . The first human reference genome released by the ously assigned to a
Recommended publications
  • X-Linked Recessive Inheritance
    X-LINKED RECESSIVE Fact sheet 09 INHERITANCE This fact sheet talks about how genes affect our health when they follow a well understood pattern of genetic inheritance known as X-linked recessive inheritance. The exception to this rule applies to the genes carried on the sex chromosomes called X and Y. IN SUMMARY The genes in our DNA provide the instructions • Genes contain the instructions for for proteins, which are the building blocks of the growth and development. Some gene cells that make up our body. Although we all have variations or changes may mean that variation in our genes, sometimes this can affect the gene does not work properly or works in a different way that is harmful. how our bodies grow and develop. • A variation in a gene that causes a Generally, DNA variations that have no impact health or developmental condition on our health are called benign variants or is called a pathogenic variant or polymorphisms. These variants tend to be more mutation. common in people. Less commonly, variations can change the gene so that it sends a different • If a genetic condition happens when message. These changes may mean that the gene a gene on the X chromosome has does not work properly or works in a different way a variation, this is called X-linked that is harmful. A variation in a gene that causes inheritance. a health or developmental condition is called a • An X-linked recessive gene is a gene pathogenic variant or mutation. located on the X chromosome and affects males and females differently.
    [Show full text]
  • Reference Genome Sequence of the Model Plant Setaria
    UC Davis UC Davis Previously Published Works Title Reference genome sequence of the model plant Setaria Permalink https://escholarship.org/uc/item/2rv1r405 Journal Nature Biotechnology, 30(6) ISSN 1087-0156 1546-1696 Authors Bennetzen, Jeffrey L Schmutz, Jeremy Wang, Hao et al. Publication Date 2012-05-13 DOI 10.1038/nbt.2196 Peer reviewed eScholarship.org Powered by the California Digital Library University of California ARTICLES Reference genome sequence of the model plant Setaria Jeffrey L Bennetzen1,13, Jeremy Schmutz2,3,13, Hao Wang1, Ryan Percifield1,12, Jennifer Hawkins1,12, Ana C Pontaroli1,12, Matt Estep1,4, Liang Feng1, Justin N Vaughn1, Jane Grimwood2,3, Jerry Jenkins2,3, Kerrie Barry3, Erika Lindquist3, Uffe Hellsten3, Shweta Deshpande3, Xuewen Wang5, Xiaomei Wu5,12, Therese Mitros6, Jimmy Triplett4,12, Xiaohan Yang7, Chu-Yu Ye7, Margarita Mauro-Herrera8, Lin Wang9, Pinghua Li9, Manoj Sharma10, Rita Sharma10, Pamela C Ronald10, Olivier Panaud11, Elizabeth A Kellogg4, Thomas P Brutnell9,12, Andrew N Doust8, Gerald A Tuskan7, Daniel Rokhsar3 & Katrien M Devos5 We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion.
    [Show full text]
  • Chromosomal Aberrations in Head and Neck Squamous Cell Carcinomas in Norwegian and Sudanese Populations by Array Comparative Genomic Hybridization
    825-843 12/9/08 15:31 Page 825 ONCOLOGY REPORTS 20: 825-843, 2008 825 Chromosomal aberrations in head and neck squamous cell carcinomas in Norwegian and Sudanese populations by array comparative genomic hybridization ERIC ROMAN1,2, LEONARDO A. MEZA-ZEPEDA3, STINE H. KRESSE3, OLA MYKLEBOST3,4, ENDRE N. VASSTRAND2 and SALAH O. IBRAHIM1,2 1Department of Biomedicine, Faculty of Medicine and Dentistry, University of Bergen, Jonas Lies vei 91; 2Department of Oral Sciences - Periodontology, Faculty of Medicine and Dentistry, University of Bergen, Årstadveien 17, 5009 Bergen; 3Department of Tumor Biology, Institute for Cancer Research, Rikshospitalet-Radiumhospitalet Medical Center, Montebello, 0310 Oslo; 4Department of Molecular Biosciences, University of Oslo, Blindernveien 31, 0371 Oslo, Norway Received January 30, 2008; Accepted April 29, 2008 DOI: 10.3892/or_00000080 Abstract. We used microarray-based comparative genomic logical parameters showed little correlation, suggesting an hybridization to explore genome-wide profiles of chromosomal occurrence of gains/losses regardless of ethnic differences and aberrations in 26 samples of head and neck cancers compared clinicopathological status between the patients from the two to their pair-wise normal controls. The samples were obtained countries. Our findings indicate the existence of common from Sudanese (n=11) and Norwegian (n=15) patients. The gene-specific amplifications/deletions in these tumors, findings were correlated with clinicopathological variables. regardless of the source of the samples or attributed We identified the amplification of 41 common chromosomal carcinogenic risk factors. regions (harboring 149 candidate genes) and the deletion of 22 (28 candidate genes). Predominant chromosomal alterations Introduction that were observed included high-level amplification at 1q21 (harboring the S100A gene family) and 11q22 (including Head and neck squamous cell carcinoma (HNSCC), including several MMP family members).
    [Show full text]
  • Telomere Length Shortening in Microglia: Implication for Accelerated Senescence and Neurocognitive Deficits in HIV
    Article Telomere Length Shortening in Microglia: Implication for Accelerated Senescence and Neurocognitive Deficits in HIV Chiu-Bin Hsiao 1, Harneet Bedi 2, Raquel Gomez 2, Ayesha Khan 2, Taylor Meciszewski 2, Ravikumar Aalinkeel 2, Ting Chean Khoo 3, Anna V. Sharikova 3, Alexander Khmaladze 3 and Supriya D. Mahajan 2,* 1 Medicine Institute, School of Medicine, Infectious Diseases, Drexel University, Positive Health Clinic, Allegheny General Hospital, Allegheny Health Network, Pittsburgh, PA 15212, USA; [email protected] 2 Department of Medicine, Division of Allergy, Immunology & Rheumatology, University at Buffalo’s Clinical Translational Research Center, Buffalo, NY 14203, USA; [email protected] (H.B.); [email protected] (R.G.); [email protected] (A.K.); [email protected] (T.M.); [email protected] (R.A.) 3 Department of Physics, University at Albany SUNY, Albany, NY 12222, USA; [email protected] (T.C.K.); [email protected] (A.V.S.); [email protected] (A.K.) * Correspondence: [email protected]; Tel.: +1-1-716-888-4776 Abstract: The widespread use of combination antiretroviral therapy (cART) has led to the accelerated aging of the HIV-infected population, and these patients continue to have a range of mild to moderate HIV-associated neurocognitive disorders (HAND). Infection results in altered mitochondrial function. The HIV-1 viral protein Tat significantly alters mtDNA content and enhances oxidative stress in immune cells. Microglia are the immune cells of the central nervous system (CNS) that exhibit Citation: Hsiao, C.-B.; Bedi, H.; a significant mitotic potential and are thus susceptible to telomere shortening. HIV disrupts the Gomez, R.; Khan, A.; Meciszewski, T.; normal interplay between microglia and neurons, thereby inducing neurodegeneration.
    [Show full text]
  • Epigenetic Control of Mammalian Centromere Protein Binding: Does DNA Methylation Have a Role?
    Journal of Cell Science 109, 2199-2206 (1996) 2199 Printed in Great Britain © The Company of Biologists Limited 1996 JCS3386 Epigenetic control of mammalian centromere protein binding: does DNA methylation have a role? Arthur R. Mitchell*, Peter Jeppesen, Linda Nicol†, Harris Morrison and David Kipling MRC Human Genetics Unit, Western General Hospital, Crewe Road, Edinburgh EH4 2XU, UK *Author for correspondence (internet [email protected]) †Present address: MRC Reproductive Biology Unit, Edinburgh, UK SUMMARY Chromosome 1 of the inbred mouse strain DBA/2 has a block of minor satellite DNA sequences on chromosome 1. polymorphism associated with the minor satellite DNA at The binding of the CENP-E protein does not appear to be its centromere. The more terminal block of satellite DNA affected by demethylation of the minor satellite sequences. sequences on this chromosome acts as the centromere as We present a model to explain these observations. This shown by the binding of CREST ACA serum, anti-CENP- model may also indicate the mechanism by which the B and anti-CENP-E polyclonal sera. Demethylation of the CENP-B protein recognises specific sites within the arrays minor satellite DNA sequences accomplished by growing of minor satellite DNA on mouse chromosomes. cells in the presence of the drug 5-aza-2′-deoxycytidine results in a redistribution of the CENP-B protein. This protein now binds to an enlarged area on the more terminal Key words: Centromere satellite DNA, Demethylation, Centromere block and in addition it now binds to the more internal antibody INTRODUCTION A common feature of many mammalian pericentromeric domains is that they contain families of repetitive DNA The centromere of mammalian chromosomes is recognised at sequences (Singer, 1982).
    [Show full text]
  • Fructose Causes Liver Damage, Polyploidy, and Dysplasia in the Setting of Short Telomeres and P53 Loss
    H OH metabolites OH Article Fructose Causes Liver Damage, Polyploidy, and Dysplasia in the Setting of Short Telomeres and p53 Loss Christopher Chronowski 1, Viktor Akhanov 1, Doug Chan 2, Andre Catic 1,2 , Milton Finegold 3 and Ergün Sahin 1,4,* 1 Huffington Center on Aging, Baylor College of Medicine, Houston, TX 77030, USA; [email protected] (C.C.); [email protected] (V.A.); [email protected] (A.C.) 2 Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX 77030, USA; [email protected] 3 Department of Pathology, Baylor College of Medicine, Houston, TX 77030, USA; fi[email protected] 4 Department of Physiology and Biophysics, Baylor College of Medicine, Houston, TX 77030, USA * Correspondence: [email protected]; Tel.: +1-713-798-6685; Fax: +1-713-798-4146 Abstract: Studies in humans and model systems have established an important role of short telomeres in predisposing to liver fibrosis through pathways that are incompletely understood. Recent studies have shown that telomere dysfunction impairs cellular metabolism, but whether and how these metabolic alterations contribute to liver fibrosis is not well understood. Here, we investigated whether short telomeres change the hepatic response to metabolic stress induced by fructose, a sugar that is highly implicated in non-alcoholic fatty liver disease. We find that telomere shortening in telomerase knockout mice (TKO) imparts a pronounced susceptibility to fructose as reflected in the activation of p53, increased apoptosis, and senescence, despite lower hepatic fat accumulation in TKO mice compared to wild type mice with long telomeres. The decreased fat accumulation in TKO Citation: Chronowski, C.; Akhanov, is mediated by p53 and deletion of p53 normalizes hepatic fat content but also causes polyploidy, V.; Chan, D.; Catic, A.; Finegold, M.; Sahin, E.
    [Show full text]
  • X-Chromosome Meiotic Drive in Drosophila Simulans: a QTL Approach Reveals the Complex Polygenic Determinism of Paris Drive Suppression
    Heredity (2019) 122:906–915 https://doi.org/10.1038/s41437-018-0163-1 ARTICLE X-chromosome meiotic drive in Drosophila simulans: a QTL approach reveals the complex polygenic determinism of Paris drive suppression 1 1,2 1 2 2 Cécile Courret ● Pierre R. Gérard ● David Ogereau ● Matthieu Falque ● Laurence Moreau ● Catherine Montchamp-Moreau1 Received: 31 July 2018 / Revised: 14 October 2018 / Accepted: 24 October 2018 / Published online: 5 December 2018 © The Genetics Society 2018 Abstract Meiotic drivers are selfish genetic elements that promote their own transmission into the gametes, which results in intragenomic conflicts. In the Paris sex-ratio system of Drosophila simulans, drivers located on the X chromosome prevent the segregation of the heterochromatic Y chromosome during meiosis II, and hence the production of Y-bearing sperm. The resulting sex-ratio bias strongly impacts population dynamics and evolution. Natural selection, which tends to restore an equal sex ratio, favors the emergence of resistant Y chromosomes and autosomal suppressors. This is the case in the Paris 1234567890();,: 1234567890();,: sex-ratio system where the drivers became cryptic in most of the natural populations of D. simulans. Here, we used a quantitative trait locus (QTL) mapping approach based on the analysis of 152 highly recombinant inbred lines (RILs) to investigate the genetic determinism of autosomal suppression. The RILs were derived from an advanced intercross between two parental lines, one showing complete autosomal suppression while the other one was sensitive to drive. The confrontation of RIL autosomes with a reference XSR chromosome allowed us to identify two QTLs on chromosome 2 and three on chromosome 3, with strong epistatic interactions.
    [Show full text]
  • Y Chromosome Dynamics in Drosophila
    Y chromosome dynamics in Drosophila Amanda Larracuente Department of Biology Sex chromosomes X X X Y J. Graves Sex chromosome evolution Proto-sex Autosomes chromosomes Sex Suppressed determining recombination Differentiation X Y Reviewed in Rice 1996, Charlesworth 1996 Y chromosomes • Male-restricted • Non-recombining • Degenerate • Heterochromatic Image from Willard 2003 Drosophila Y chromosome D. melanogaster Cen Hoskins et al. 2015 ~40 Mb • ~20 genes • Acquired from autosomes • Heterochromatic: Ø 80% is simple satellite DNA Photo: A. Karwath Lohe et al. 1993 Satellite DNA • Tandem repeats • Heterochromatin • Centromeres, telomeres, Y chromosomes Yunis and Yasmineh 1970 http://www.chrombios.com Y chromosome assembly challenges • Repeats are difficult to sequence • Underrepresented • Difficult to assemble Genome Sequence read: Short read lengths cannot span repeats Single molecule real-time sequencing • Pacific Biosciences • Average read length ~15 kb • Long reads span repeats • Better genome assemblies Zero mode waveguide Eid et al. 2009 Comparative Y chromosome evolution in Drosophila I. Y chromosome assemblies II. Evolution of Y-linked genes Drosophila genomes 2 Mya 0.24 Mya Photo: A. Karwath P6C4 ~115X ~120X ~85X ~95X De novo genome assembly • Assemble genome Iterative assembly: Canu, Hybrid, Quickmerge • Polish reference Quiver x 2; Pilon 2L 2R 3L 3R 4 X Y Assembled genome Mahul Chakraborty, Ching-Ho Chang 2L 2R 3L 3R 4 X Y Y X/A heterochromatin De novo genome assembly species Total bp # contigs NG50 D. simulans 154,317,203 161 21,495729
    [Show full text]
  • Evolution on the X Chromosome: Unusual Patterns and Processes
    REVIEWS Evolution on the X chromosome: unusual patterns and processes Beatriz Vicoso and Brian Charlesworth Abstract | Although the X chromosome is usually similar to the autosomes in size and cytogenetic appearance, theoretical models predict that its hemizygosity in males may cause unusual patterns of evolution. The sequencing of several genomes has indeed revealed differences between the X chromosome and the autosomes in the rates of gene divergence, patterns of gene expression and rates of gene movement between chromosomes. A better understanding of these patterns should provide valuable information on the evolution of genes located on the X chromosome. It could also suggest solutions to more general problems in molecular evolution, such as detecting selection and estimating mutational effects on fitness. Haldane’s rule Sex-chromosome systems have evolved independently the predictions of theoretical models of X-chromosome The disproportionate loss of many times, and have attracted much attention from evolution will shed light on the assumptions on which fitness to the heterogametic evolutionary geneticists. This work has mainly focused the models are based, such as the degree of dominance of sex in F1 hybrids between on the steps leading to the initial evolution of sex chro- mutations and the existence of opposing forces species. mosomes, and the genetic degeneration of Y and W of selection on males and females, leading to a better 1 Clade chromosomes . Here, we discuss the evolution of the understanding of the forces that shape the evolution of A group of species which share X chromosome in long-established sex-chromosome eukaryotic genomes. a common ancestor.
    [Show full text]
  • The Bacteria Genome Pipeline (BAGEP): an Automated, Scalable Workflow for Bacteria Genomes with Snakemake
    The Bacteria Genome Pipeline (BAGEP): an automated, scalable workflow for bacteria genomes with Snakemake Idowu B. Olawoye1,2, Simon D.W. Frost3,4 and Christian T. Happi1,2 1 Department of Biological Sciences, Faculty of Natural Sciences, Redeemer's University, Ede, Osun State, Nigeria 2 African Centre of Excellence for Genomics of Infectious Diseases (ACEGID), Redeemer's University, Ede, Osun State, Nigeria 3 Microsoft Research, Redmond, WA, USA 4 London School of Hygiene & Tropical Medicine, University of London, London, United Kingdom ABSTRACT Next generation sequencing technologies are becoming more accessible and affordable over the years, with entire genome sequences of several pathogens being deciphered in few hours. However, there is the need to analyze multiple genomes within a short time, in order to provide critical information about a pathogen of interest such as drug resistance, mutations and genetic relationship of isolates in an outbreak setting. Many pipelines that currently do this are stand-alone workflows and require huge computational requirements to analyze multiple genomes. We present an automated and scalable pipeline called BAGEP for monomorphic bacteria that performs quality control on FASTQ paired end files, scan reads for contaminants using a taxonomic classifier, maps reads to a reference genome of choice for variant detection, detects antimicrobial resistant (AMR) genes, constructs a phylogenetic tree from core genome alignments and provide interactive short nucleotide polymorphism (SNP) visualization across
    [Show full text]
  • Lawrence Berkeley National Laboratory Recent Work
    Lawrence Berkeley National Laboratory Recent Work Title 1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life. Permalink https://escholarship.org/uc/item/7cx5710p Journal Nature biotechnology, 35(7) ISSN 1087-0156 Authors Mukherjee, Supratim Seshadri, Rekha Varghese, Neha J et al. Publication Date 2017-07-01 DOI 10.1038/nbt.3886 Peer reviewed eScholarship.org Powered by the California Digital Library University of California RESOU r CE OPEN 1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life Supratim Mukherjee1,10, Rekha Seshadri1,10, Neha J Varghese1, Emiley A Eloe-Fadrosh1, Jan P Meier-Kolthoff2 , Markus Göker2 , R Cameron Coates1,9, Michalis Hadjithomas1, Georgios A Pavlopoulos1 , David Paez-Espino1 , Yasuo Yoshikuni1, Axel Visel1 , William B Whitman3, George M Garrity4,5, Jonathan A Eisen6, Philip Hugenholtz7 , Amrita Pati1,9, Natalia N Ivanova1, Tanja Woyke1, Hans-Peter Klenk8 & Nikos C Kyrpides1 We present 1,003 reference genomes that were sequenced as part of the Genomic Encyclopedia of Bacteria and Archaea (GEBA) initiative, selected to maximize sequence coverage of phylogenetic space. These genomes double the number of existing type strains and expand their overall phylogenetic diversity by 25%. Comparative analyses with previously available finished and draft genomes reveal a 10.5% increase in novel protein families as a function of phylogenetic diversity. The GEBA genomes recruit 25 million previously unassigned metagenomic proteins from 4,650 samples, improving their phylogenetic and functional interpretation. We identify numerous biosynthetic clusters and experimentally validate a divergent phenazine cluster with potential new chemical structure and antimicrobial activity.
    [Show full text]
  • X Chromosome-Linked and Mitochondrial Gene Control of Leber
    Proc. Nati. Acad. Sci. USA Vol. 88, pp. 8198-8202, September 1991 Genetics X chromosome-linked and mitochondrial gene control of Leber hereditary optic neuropathy: Evidence from segregation analysis for dependence on X chromosome inactivation (two-locus inheritance/cytoplasmic inheritance/reduced penetrance) XIANGDONG BU AND JEROME I. ROTTER* Medical Genetics Birth Defects Center, Departments of Medicine and Pediatrics, Cedars-Sinai Medical Center and University of California School of Medicine, Los Angeles, CA 90048 Communicated by Giuseppe Attardi, July 1, 1991 ABSTRACT Leber hereditary optic neuropathy (LHON) linkage analysis, Chen et al. (5) excluded an X-linked gene has been shown to involve mutation(s) of mitochondrial DNA, alone as the cause for LHON. Earlier, Imai and Moriwaki (6) yet there remain several confusing aspects of Its inheritance not advanced the theory of cytoplasmic inheritance. The identi- explained by mitochondrial inheritance alone, including male fication of a mtDNA point mutation for LHON by Wallace et predominance, reduced penetrance, and a later age of onset in al. (3) and Singh et al. (4) subsequently proved the decades- females. By extending segregation analysis methods to disor- old hypothesis regarding the cytoplasmic inheritance of ders that involve both a mitochondrial and a nuclear gene LHON (6). As mentioned above, however, a mitochondrial locus, we show that the available pedigree data for LHON are mutation alone still cannot explain many ofthe features ofthe most consistent with a two-locus disorder, with one responsible transmission pattern of LHON, including the strong male gene being mitochondrial and the other nuclear and X chro- bias and the reduced penetrance of LHON in the maternal mosome-linked.
    [Show full text]