The Tiny Eukaryote Ostreococcus Provides Genomic Insights Into the Paradox of Plankton Speciation

Total Page:16

File Type:pdf, Size:1020Kb

The Tiny Eukaryote Ostreococcus Provides Genomic Insights Into the Paradox of Plankton Speciation The tiny eukaryote Ostreococcus provides genomic insights into the paradox of plankton speciation Brian Palenika,b, Jane Grimwoodc, Andrea Aertsd, Pierre Rouze´ e, Asaf Salamovd, Nicholas Putnamd, Chris Duponta, Richard Jorgensenf, Evelyne Derelleg, Stephane Rombautsh, Kemin Zhoud, Robert Otillard, Sabeeha S. Merchanti, Sheila Podellj, Terry Gaasterlandj, Carolyn Napolif, Karla Gendlerf, Andrea Manuellk, Vera Taia, Olivier Vallonl, Gwenael Piganeaug,Se´ verine Jancekg, Marc Heijdem, Kamel Jabbarim, Chris Bowlerm, Martin Lohrn, Steven Robbensh, Gregory Wernerd, Inna Dubchakd, Gregory J. Pazouro, Qinghu Renp, Ian Paulsenp, Chuck Delwicheq, Jeremy Schmutzc, Daniel Rokhsard, Yves Van de Peerh, Herve´ Moreaug, and Igor V. Grigorievb,d aScripps Institution of Oceanography, University of California at San Diego, La Jolla, CA 92093-0202; cJoint Genome Institute and Stanford Human Genome Center, Stanford University School of Medicine, 975 California Avenue, Palo Alto, CA 94304; dU.S. Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598; eLaboratoire Associe´de l’Institut National de la Recherche Agronomique (France), Ghent University, Technologiepark 927, B-9052 Ghent, Belgium; fDepartment of Plant Sciences, University of Arizona, 303 Forbes Building, Tucson, AZ 85721-0036; gObservatoire Oce´anologique, Laboratoire Arago, Centre National de la Recherche Scientifique/Universite´Pierre et Marie Curie Paris 6, Unite´Mixte de Recherche 7628, BP 44, 66651 Banyuls sur Mer Cedex, France; hDepartment of Plant Systems Biology, Flanders Interuniversity Institute for Biotechnology, Ghent University, Technologiepark 927, B-9052 Ghent, Belgium; iDepartment of Chemistry and Biochemistry, University of California, Box 951569, Los Angeles, CA 90095; jScripps Genome Center, Scripps Institution of Oceanography, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0202; kDepartment of Cell Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037; lInstitut de Biologie Physico-Chimique, Centre National de la Recherche Scientifique/Universite´Paris 6, Unite´Mixte de Recherche 7141, 13, Rue Pierre et Marie Curie, 75005 Paris, France; mDe´partement de Biologie, Ecole Normale Supe´rieure, Formation de Recherche en Evolution 2910, Centre National de la Recherche Scientifique, 46, Rue D’Ulm, 75230 Paris Cedex 05, France; nInstitut fu¨r Allgemeine Botanik, Johannes Gutenberg-Universita¨t, D-55099 Mainz, Germany; oProgram in Molecular Medicine, University of Massachusetts Medical School, Suite 213, Biotech II, 373 Plantation Street, Worcester, MA 01605; pThe Institute for Genomic Research, 9712 Medical Center Drive, Rockville, MD 20850; and qCell Biology and Molecular Genetics, University of Maryland, H. J. Patterson Hall, Building 073, College Park, MD 20742-5815 Edited by Robert Haselkorn, University of Chicago, Chicago, IL, and approved March 13, 2007 (received for review December 12, 2006) The smallest known eukaryotes, at Ϸ1-␮m diameter, are Ostreo- tion, with no cell wall or flagella, and with a single chloroplast and coccus tauri and related species of marine phytoplankton. The mitochondrion (4). Recent work has shown that small-subunit genome of Ostreococcus lucimarinus has been completed and rDNA sequences of Ostreococcus from cultures and environmental compared with that of O. tauri. This comparison reveals surprising samples cluster into four different clades that are likely distinct differences across orthologous chromosomes in the two species enough to represent different species (6, 9). from highly syntenic chromosomes in most cases to chromosomes Here we report on the gene content, genome organization, and with almost no similarity. Species divergence in these phytoplank- deduced metabolic capacity of the complete genome of Ostreococ- ton is occurring through multiple mechanisms acting differently on cus sp. strain CCE9901 (7), a representative of surface–ocean different chromosomes and likely including acquisition of new adapted Ostreococcus, referred to here as Ostreococcus lucimarinus. genes through horizontal gene transfer. We speculate that this We compare it to the analogous features of the related species O. latter process may be involved in altering the cell-surface charac- tauri strain OTH95 (10). Our results show that many processes have teristics of each species. In addition, the genome of O. lucimarinus been involved in the evolution and speciation of even these sister provides insights into the unique metal metabolism of these organisms, from dramatic changes in genome structure to signifi- organisms, which are predicted to have a large number of seleno- cant differences in metabolic capabilities. cysteine-containing proteins. Selenoenzymes are more catalyti- cally active than similar enzymes lacking selenium, and thus the cell Results may require less of that protein. As reported here, selenoenzymes, Gene Content. O. lucimarinus is the first closed and finished genome novel fusion proteins, and loss of some major protein families of a green alga and as such will provide a great resource for in-depth including ones associated with chromatin are likely important analysis of genome organization and the processes of eukaryotic adaptations for achieving a small cell size. genome evolution. O. lucimarinus has a nuclear genome size of 13.2 million base pairs found in 21 chromosomes, as compared with a green algae ͉ picoeukaryote ͉ genome evolution ͉ selenium ͉ synteny genome size for O. tauri of 12.6 million base pairs found in 20 hytoplankton living in the oceans perform nearly half of total PLANT BIOLOGY Pglobal photosynthesis (1). Eukaryotic phytoplankton exhibit Author contributions: J.G. and A.A. performed research; B.P., P.R., A.S., N.P., C. Dupont, R.J., E.D., S. Rombauts, K.Z., R.O., S.S.M., S.P., T.G., C.N., K.G., A.M., V.T., O.V., G.P., S.J., M.H., K.J., great diversity that contrasts with the lower apparent diversity of C.B., M.L., S. Robbens, G.W., I.D., G.J.P., Q.R., I.P., C. Delwiche, J.S., D.R., Y.V.d.P., H.M., and ecological niches available to them in aquatic ecosystems. This I.V.G. analyzed data; and B.P., P.R., C. Dupont, R.J., K.Z., S.S.M., M.L., H.M., and I.V.G. wrote observation, know as the ‘‘paradox of the plankton,’’ has long the paper. puzzled biologists (2). By providing molecular level information on The authors declare no conflict of interest. related species, genomics is poised to provide new insights into this This article is a PNAS Direct Submission. paradox. Abbreviations: Chr n, chromosome n; GC, guanine plus cytosine. Picophytoplankton, with cell diameters Ͻ2 ␮m, play a significant Data deposition: The O. lucimarinus genome sequence, predicted genes, and annotations role in major biogeochemical processes, primary productivity, and reported in this paper have been deposited in the GenBank database (accession nos. food webs, especially in oligotrophic waters. Within this size class, CP000581–CP000601 for Chr 1 through Chr 21). The O. lucimarinus strain (CCE9901) used here has been deposited in the Provasoli-Guillard Culture Collection of Marine Phytoplank- the smallest known eukaryotes are Ostreococcus tauri and related ton (accession no. CCMP2514). species. Although more similar to flattened spheres in shape, these bTo whom correspondence may be addressed. E-mail: [email protected] or ivgrigoriev@ organisms are Ϸ1 ␮m in diameter (3, 4) and have been isolated or lbl.gov. detected from samples of diverse geographical origins (5–8). They This article contains supporting information online at www.pnas.org/cgi/content/full/ belong to the Prasinophyceae, an early diverging class within the 0611046104/DC1. green plant lineage, and have a strikingly simple cellular organiza- © 2007 by The National Academy of Sciences of the USA www.pnas.org͞cgi͞doi͞10.1073͞pnas.0611046104 PNAS ͉ May 1, 2007 ͉ vol. 104 ͉ no. 18 ͉ 7705–7710 Downloaded by guest on September 26, 2021 Table 1. Summary of predicted genes in Ostreococcus lucimarinus, and 7,892 genes are found in the genome of O. tauri. sp. genomes Overall gene content is similar between the genomes (Table 1). Properties O. lucimarinus O. tauri Approximately one-fifth of all genes in both genomes have mul- tiexon structure, most of which belong to chromosome 2 (Chr 2), Genome size, Mbp 13.2 12.6 and have the introns of unusual size and structure that were Chromosomes 21 20 reported earlier for O. tauri (10). A total of 6,753 pairs of orthologs No. of genes 7,651 7,892 have been identified between genes in the two Ostreococcus species Multiexon genes, % 20 25 Supported by, % with an average coverage of 93% and an average amino acid Multiple methods 28 19 identity of 70%. A comparison of the amino acid identity between Genome conservation 65 73 other sister taxa shows that they are more divergent than charac- Homology to another strain 93 92 terized species of Saccharomyces with similar levels of overall Homology to SwissProt 84 79 synteny [supporting information (SI) Table 2]. ESTs 28 21 Approximately 5–6% of gene models are genome-specific and do Peptides 13 N/D not display homology to the other species (SI Table 3). These are Average gene size, bp 1,284 1,245 mostly due to lineage-specific gene loss or acquisition or remaining Transcript size, bp 1,234 1,175 No. of exons per gene 1.27 1.57 gaps in the O. tauri sequence. The number of lineage-specific Exon size, bp 970 750 duplications is also low, 9% for O. lucimarinus and 4% for O. tauri, Intron size, bp 187 126 mostly because of several segmental duplications. N/D, not determined. Genome Structure. Based on analysis of gene content, orthology, and DNA alignments, 20 chromosomes in each genome have a counterpart in the other species. Eighteen of these 20 are highly chromosomes (10) (Table 1). For comparison here, both genomes syntenic (Fig. 1) and formed the core of the ancestral Ostreococcus were annotated by using the same tools, as described in Methods. genome. The remaining two chromosomes of O.
Recommended publications
  • Morphology, Genome Plasticity, and Phylogeny in the Genus
    Protist, Vol. 164, 643–659, xx 2013 http://www.elsevier.de/protis Published online date xxx ORIGINAL PAPER Morphology, Genome Plasticity, and Phylogeny in the Genus Ostreococcus Reveal a Cryptic Species, O. mediterraneus sp. nov. (Mamiellales, Mamiellophyceae) a,b c d a,b Lucie Subirana , Bérangère Péquin , Stéphanie Michely , Marie-Line Escande , a,b,2 a,b e a,b Julie Meilland , Evelyne Derelle , Birger Marin , Gwenaël Piganeau , a,b a,b a,b,1 Yves Desdevises , Hervé Moreau , and Nigel H. Grimsley a CNRS, UMR7232 Laboratoire de Biologie Intégrative des Organisms Marins (BIOM), Observatoire Océanologique, Banyuls-sur-Mer, France b UPMC Univ Paris 06, Observatoire Océanologique, Banyuls-sur-Mer, France c Québec-Océan, Département de Biologie and Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec (QC) G1 V 0A6, Canada d INRA UMR1319, AgroParisTech, Micalis, Bimlip, Bât. CBAI, BP 01, 78850 Thiverval-Grignon, France e Biozentrum Köln, Botanisches Institut, Universität zu Köln, Zülpicher Str. 47b, 50674 Köln, Germany Submitted December 4, 2011; Accepted June 18, 2013 Monitoring Editor: David Moreira Coastal marine waters in many regions worldwide support abundant populations of extremely small (1-3 ␮m diameter) unicellular eukaryotic green algae, dominant taxa including several species in the class Mamiellophyceae. Their diminutive size conceals surprising levels of genetic diversity and defies classical species’ descriptions. We present a detailed analysis within the genus Ostreococcus and show that morphological characteristics cannot be used to describe diversity within this group. Karyotypic analyses of the best-characterized species O. tauri show it to carry two chromosomes that vary in size between individual clonal lines, probably an evolutionarily ancient feature that emerged before species’ divergences within the Mamiellales.
    [Show full text]
  • Genome Analysis of the Smallest Free-Living Eukaryote Ostreococcus
    Genome analysis of the smallest free-living eukaryote SEE COMMENTARY Ostreococcus tauri unveils many unique features Evelyne Derellea,b, Conchita Ferrazb,c, Stephane Rombautsb,d, Pierre Rouze´ b,e, Alexandra Z. Wordenf, Steven Robbensd, Fre´ de´ ric Partenskyg, Sven Degroeved,h, Sophie Echeynie´ c, Richard Cookei, Yvan Saeysd, Jan Wuytsd, Kamel Jabbarij, Chris Bowlerk, Olivier Panaudi, BenoıˆtPie´ gui, Steven G. Ballk, Jean-Philippe Ralk, Franc¸ois-Yves Bougeta, Gwenael Piganeaua, Bernard De Baetsh, Andre´ Picarda,l, Michel Delsenyi, Jacques Demaillec, Yves Van de Peerd,m, and Herve´ Moreaua,m aObservatoire Oce´anologique, Laboratoire Arago, Unite´Mixte de Recherche 7628, Centre National de la Recherche Scientifique–Universite´Pierre et Marie Curie-Paris 6, BP44, 66651 Banyuls sur Mer Cedex, France; cInstitut de Ge´ne´ tique Humaine, Unite´Propre de Recherche 1142, Centre National de la Recherche Scientifique, 141 Rue de Cardonille, 34396 Montpellier Cedex 5, France; dDepartment of Plant Systems Biology, Flanders Interuniversity Institute for Biotechnology and eLaboratoire Associe´de l’Institut National de la Recherche Agronomique (France), Ghent University, Technologiepark 927, 9052 Ghent, Belgium; fRosenstiel School of Marine and Atmospheric Science, University of Miami, 4600 Rickenbacker Causeway, Miami, FL 33149; gStation Biologique, Unite´Mixte de Recherche 7144, Centre National de la Recherche Scientifique–Universite´Pierre et Marie Curie-Paris 6, BP74, 29682 Roscoff Cedex, France; hDepartment of Applied Mathematics, Biometrics and
    [Show full text]
  • (PWP2) Gene Mapping to 21Q22.3 Ronald G
    Downloaded from genome.cshlp.org on October 2, 2021 - Published by Cold Spring Harbor Laboratory Press LETTER Isolation and Genomic Structure of a Human Homolog of the Yeast Periodic Tryptophan Protein 2 (PWP2) Gene Mapping to 21q22.3 Ronald G. Lafrenii~re, 1'4 Daniel L. Rochefort, 1 Nathalie Chr~tien, ~ Catherine E. Neville, 2 Robert G. Korneluk, 2 Lin Zuo, 3 Yalin Wei, 3 Jay Lichter, 3 and Guy A. Rouleau ~ 1Centre for Research in Neuroscience, McGill University, and Department of Neurology, Montreal General Hospital Research Institute, Montreal, Canada H3G 1A4; 2DNA Sequencing Core Facility, Canadian Genetic Diseases Network, Children's Hospital of Eastern Ontario, Ottawa, Canada K1H 8L1; 3Sequana Therapeutics, Inc., La Jolla, California 92037 As part of efforts to identify candidate genes for diseases mapping to the 21q22.3 region, we have assembled a 770-kb cosmid and BAC contig containing eight tightly linked markers. These cosmids and BACs were restriction mapped using eight rare cutting enzymes, with the goal of identifying CpG-rich islands. One such island was identified by the clustering of lqotl, Eagl, Sstll, and BssHIl sites, and corresponded to the Nod linking clone LJI04 described previously. A 7.6-kb l-lindlll fragment containing this CpG-rich island was subcloned and partially sequenced. A homology search using the sequence obtained from either side of the Nod site identified an expressed sequence tag with homology to the yeast periodic tryptophan protein 2 (PWP2). Several cDNAs corresponding to the human PWP2 gene were identified and partially sequenced. Northern blot analysis revealed a 3.3-kb transcript that was well expressed in all tissues tested.
    [Show full text]
  • Assembly and Annotation of an Ashkenazi Human Reference Genome
    bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997395; this version posted March 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. Assembly and Annotation of an Ashkenazi Human Reference Genome Alaina Shumate1,2,† Aleksey V. Zimin1,2,† Rachel M. Sherman1,3 Daniela Puiu1,3 Justin M. Wagner4 Nathan D. Olson4 Mihaela Pertea1,2 Marc L. Salit5 Justin M. Zook4 Steven L. Salzberg1,2,3,6* 1Center for Computational Biology, Johns Hopkins University, Baltimore, MD 2Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 3Department of Computer Science, Johns Hopkins University, Baltimore, MD 4National Institute of Standards and Technology, Gaithersburg, MD 5Joint Initiative for Metrology in Biology, Stanford University, Stanford, CA 6Department of Biostatistics, Johns Hopkins University, Baltimore, MD †These authors contributed equally to this work. *Corresponding author. Email: [email protected] Abstract Here we describe the assembly and annotation of the genome of an Ashkenazi individual and the creation of a new, population-specific human reference genome. This genome is more contiguous and more complete than GRCh38, the latest version of the human reference genome, and is annotated with highly similar gene content. The Ashkenazi reference genome, Ash1, contains 2,973,118,650 nucleotides as compared to 2,937,639,212 in GRCh38. Annotation identified 20,157 protein-coding genes, of which 19,563 are >99% identical to their counterparts on GRCh38. Most of the remaining genes have small differences.
    [Show full text]
  • Red and Green Algal Monophyly and Extensive Gene Sharing Found in a Rich Repertoire of Red Algal Genes
    Current Biology 21, 328–333, February 22, 2011 ª2011 Elsevier Ltd All rights reserved DOI 10.1016/j.cub.2011.01.037 Report Red and Green Algal Monophyly and Extensive Gene Sharing Found in a Rich Repertoire of Red Algal Genes Cheong Xin Chan,1,5 Eun Chan Yang,2,5 Titas Banerjee,1 sequences in our local database, in which we included the Hwan Su Yoon,2,* Patrick T. Martone,3 Jose´ M. Estevez,4 23,961 predicted proteins from C. tuberculosum (see Table and Debashish Bhattacharya1,* S1 available online). Of these hits, 9,822 proteins (72.1%, 1Department of Ecology, Evolution, and Natural Resources including many P. cruentum paralogs) were present in C. tuber- and Institute of Marine and Coastal Sciences, Rutgers culosum and/or other red algae, 6,392 (46.9%) were shared University, New Brunswick, NJ 08901, USA with C. merolae, and 1,609 were found only in red algae. A total 2Bigelow Laboratory for Ocean Sciences, West Boothbay of 1,409 proteins had hits only to red algae and one other Harbor, ME 04575, USA phylum. Using this repertoire, we adopted a simplified recip- 3Department of Botany, University of British Columbia, 6270 rocal BLAST best-hits approach to study the pattern of exclu- University Boulevard, Vancouver, BC V6T 1Z4, Canada sive gene sharing between red algae and other phyla (see 4Instituto de Fisiologı´a, Biologı´a Molecular y Neurociencias Experimental Procedures). We found that 644 proteins showed (IFIBYNE UBA-CONICET), Facultad de Ciencias Exactas y evidence of exclusive gene sharing with red algae. Of these, Naturales, Universidad de Buenos Aires, 1428 Buenos Aires, 145 (23%) were found only in red + green algae (hereafter, Argentina RG) and 139 (22%) only in red + Alveolata (Figure 1A).
    [Show full text]
  • Is Chloroplastic Class IIA Aldolase a Marine Enzyme&Quest;
    The ISME Journal (2016) 10, 2767–2772 © 2016 International Society for Microbial Ecology All rights reserved 1751-7362/16 www.nature.com/ismej SHORT COMMUNICATION Is chloroplastic class IIA aldolase a marine enzyme? Hitoshi Miyasaka1, Takeru Ogata1, Satoshi Tanaka2, Takeshi Ohama3, Sanae Kano4, Fujiwara Kazuhiro4,7, Shuhei Hayashi1, Shinjiro Yamamoto1, Hiro Takahashi5, Hideyuki Matsuura6 and Kazumasa Hirata6 1Department of Applied Life Science, Sojo University, Kumamoto, Japan; 2The Kansai Electric Power Co., Environmental Research Center, Keihanna-Plaza, Kyoto, Japan; 3School of Environmental Science and Engineering, Kochi University of Technology, Kochi, Japan; 4Chugai Technos Corporation, Hiroshima, Japan; 5Graduate School of Horticulture, Faculty of Horticulture, Chiba University, Chiba, Japan and 6Environmental Biotechnology Laboratory, Graduate School of Pharmaceutical Sciences, Osaka University, Osaka, Japan Expressed sequence tag analyses revealed that two marine Chlorophyceae green algae, Chlamydo- monas sp. W80 and Chlamydomonas sp. HS5, contain genes coding for chloroplastic class IIA aldolase (fructose-1, 6-bisphosphate aldolase: FBA). These genes show robust monophyly with those of the marine Prasinophyceae algae genera Micromonas, Ostreococcus and Bathycoccus, indicating that the acquisition of this gene through horizontal gene transfer by an ancestor of the green algal lineage occurred prior to the divergence of the core chlorophytes (Chlorophyceae and Treboux- iophyceae) and the prasinophytes. The absence of this gene in some freshwater chlorophytes, such as Chlamydomonas reinhardtii, Volvox carteri, Chlorella vulgaris, Chlorella variabilis and Coccomyxa subellipsoidea, can therefore be explained by the loss of this gene somewhere in the evolutionary process. Our survey on the distribution of this gene in genomic and transcriptome databases suggests that this gene occurs almost exclusively in marine algae, with a few exceptions, and as such, we propose that chloroplastic class IIA FBA is a marine environment-adapted enzyme.
    [Show full text]
  • WO 2019/079361 Al 25 April 2019 (25.04.2019) W 1P O PCT
    (12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) (19) World Intellectual Property Organization I International Bureau (10) International Publication Number (43) International Publication Date WO 2019/079361 Al 25 April 2019 (25.04.2019) W 1P O PCT (51) International Patent Classification: CA, CH, CL, CN, CO, CR, CU, CZ, DE, DJ, DK, DM, DO, C12Q 1/68 (2018.01) A61P 31/18 (2006.01) DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, GT, HN, C12Q 1/70 (2006.01) HR, HU, ID, IL, IN, IR, IS, JO, JP, KE, KG, KH, KN, KP, KR, KW, KZ, LA, LC, LK, LR, LS, LU, LY, MA, MD, ME, (21) International Application Number: MG, MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, PCT/US2018/056167 OM, PA, PE, PG, PH, PL, PT, QA, RO, RS, RU, RW, SA, (22) International Filing Date: SC, SD, SE, SG, SK, SL, SM, ST, SV, SY, TH, TJ, TM, TN, 16 October 2018 (16. 10.2018) TR, TT, TZ, UA, UG, US, UZ, VC, VN, ZA, ZM, ZW. (25) Filing Language: English (84) Designated States (unless otherwise indicated, for every kind of regional protection available): ARIPO (BW, GH, (26) Publication Language: English GM, KE, LR, LS, MW, MZ, NA, RW, SD, SL, ST, SZ, TZ, (30) Priority Data: UG, ZM, ZW), Eurasian (AM, AZ, BY, KG, KZ, RU, TJ, 62/573,025 16 October 2017 (16. 10.2017) US TM), European (AL, AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HR, HU, ΓΕ , IS, IT, LT, LU, LV, (71) Applicant: MASSACHUSETTS INSTITUTE OF MC, MK, MT, NL, NO, PL, PT, RO, RS, SE, SI, SK, SM, TECHNOLOGY [US/US]; 77 Massachusetts Avenue, TR), OAPI (BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, Cambridge, Massachusetts 02139 (US).
    [Show full text]
  • De Novo Genome Assembly and Analysis of Non- Allelic Recombination in Pathogenic Yeast Candida Glabrata
    DE NOVO GENOME ASSEMBLY AND ANALYSIS OF NON- ALLELIC RECOMBINATION IN PATHOGENIC YEAST CANDIDA GLABRATA by Zhuwei Xu A dissertation submitted to Johns Hopkins University in conformity with the requirements for the degree of Doctor of Philosophy Baltimore, Maryland November 2019 © 2019 Zhuwei Xu All Rights Reserved Abstract Candida glabrata is an opportunistic pathogen in humans, responsible for approximately 20% of disseminated candidiasis. C. glabrata’s ability to adhere to host tissue is mediated by GPI- anchored cell wall proteins (GPI-CWPs); the corresponding genes contain long tandem repeat regions and form large gene families. These tandem repeats cause mis-assemblies of GPI-CWP genes in C. glabrata genome. Subtelomeres of C. glabrata are particularly rich in GPI-CWP genes and share homology with each other. Consequently, the subtelomeres are mis-assembled in genome sequences assembled from short sequencing reads. In this thesis, we used the long single-molecule real time (SMRT) reads and performed de novo genome assembly of the C. glabrata genome to establish the correct structure of GPI-CWP genes and the subtelomeres. We assembled the genome of six C. glabrata strains: the type strain, CBS138; our lab strain, BG2; four serial clinical isolates, BG3993-96 to assess genome changes during infection. With high quality sequences in hand, we then assess recombinational exchange between GPI-CWP genes by non-allelic mitotic recombination. This question is difficult to address with normal aligners, and we developed a k-mer based method to identify recombination. Our assembly established the correct subtelomere structure of Candida glabrata and provides correct structure of the GPI-CWP gene families.
    [Show full text]
  • Visualization Assisted Hardware Configuration for Bioinformatics Algorithms Priyesh Dixit Advisors: Drs. K. R. Subramanian, A. Mukherjee, A
    Visualization assisted hardware configuration for Bioinformatics algorithms Priyesh Dixit Advisors: Drs. K. R. Subramanian, A. Mukherjee, A. Ravindran May 2005 1 Special Thanks I would like to thank Dr. Subramanian for his faith in my abilities and for selecting me to do this project. I also thank Dr. Mukherjee and Dr. Ravindran for their patience and support throughout the year and making this project possible. Thanks to Josh Foster and Nalin Subramanian for help with the FLTK and VTK. And also I would like to especially thank Jong-Ho Byun for his all his help. 2 Table of Contents Chapter 1: Introduction...................................................................................................4 Chapter 2: Background...................................................................................................6 2.1 Bioinformatics.......................................................................................................6 2.2 Field Programmable Gate Arrays..........................................................................6 2.3 The Smith-Waterman algorithm............................................................................9 2.4 Visualization.......................................................................................................10 Chapter 3: System Description......................................................................................11 3.1 Overview.............................................................................................................11 3.2 Design flow.........................................................................................................12
    [Show full text]
  • Title: "Implementation of a PRRSV Strain Database" – NPB #04-118
    Title: "Implementation of a PRRSV Strain Database" – NPB #04-118 Investigator: Kay S. Faaberg Institution: University of Minnesota Co-Investigators: James E. Collins, Ernest F. Retzel Date Received: August 1, 2006 Abstract: We describe the establishment of a PRRSV ORF5 database (http://prrsv.ahc.umn.edu/) in order to fulfill one objective of the PRRS Initiative of the National Pork Board. PRRSV ORF5 nucleotide sequence data, which is one key viral protein against which neutralizing antibodies are generated, was assembled from our diagnostic laboratory’s private collection of field samples. Searchable webpages were developed, such that any user can go from isolate sequence to generating alignments and phylogenetic dendograms that show nucleotide or amino acid relatedness to that input sequence with minimal steps. The database can be utilized for understanding PRRSV variation, epidemiological studies, and selection of vaccine candidates. We believe that this database will eventually be the only website in the world to obtain recent, as well as archival, and complete information that is critical to PRRS elimination. Introduction: This project was proposed to fulfill the stated directive of the PRRS Initiative to implement a National PRRSV Sequence Database. The organization that oversees the development, care and maintenance of the database is the Center for Computational Genomics and Bioinformatics at the University of Minnesota. The Center is not affiliated with any diagnostic laboratory or department, but is a fee-for-service facility that possesses high-throughput computational resources, and has developed databases for a number of nationwide initiatives. The database is now freely available to all PRRS researchers, veterinarians and producers for web-based queries concerning relationships to other sequences, RFLP analysis, year and state of isolation, and other related research-based endeavors.
    [Show full text]
  • Biological Sequence Analysis
    Biological Sequence Analysis CSEP 521: Applied Algorithms – Final Project Archie Russell (0638782), Jason Hogg (0641054) Introduction Background The schematic for every living organism is stored in long molecules known as chromosomes made of a substance known as DNA (deoxyribonucleic acid. Each cell in an organism has a complete copy of its DNA, also known as its genomewhich is conveniently modeled as a sequence of symbols (alternately referred to as nucleotides or bases) in the DNA alphabet {A,C,T,G}. In humans, and most mammals, this sequence is about 3 billion bases long. Through a process known as protein synthesis, instructions in our DNA, known as genes, are interpreted by machinery in our bodies and transformed first into intermediate molecules called RNA, and then into structures called proteins, which are the primary actors in living systems. These molecules can also be modeled as sequences of symbols. These genes determine core characteristics of the development of an organism, as well as its day-to-day operations. For example, the Hox1 gene family determines where limbs and other body parts will grow in a developing fetus, and defects in the BRCA1 gene are implicated in breast cancer. The rate of data acquisition has been immense. As of 2006, GenBank, the national public repository for sequence information, had accumulated over 130 billion bases of DNA and RNA sequence, a 200-fold increase over the 1996 total2. Over 10,000 species have at least one sequence in GenBank, and 50 species have at least 100,000 sequences3. Complete genome sequences, each roughly 3-gigabases in size, are available for human, mouse, rat, dog, cat, cow, chicken, elephant, chimpanzee, as well as many non-mammals.
    [Show full text]
  • Investigation of RNA Binding Proteins Regulated by Mtor
    Investigation of RNA binding proteins regulated by mTOR Thesis submitted to the University of Leicester for the degree of Doctor of Philosophy Katherine Morris BSc (University of Leicester) March 2017 1 Investigation of RNA binding proteins regulated by mTOR Katherine Morris, MRC Toxicology Unit, University of Leicester, Leicester, LE1 9HN The mammalian target of rapamycin (mTOR) is a serine/threonine protein kinase which plays a key role in the transduction of cellular energy signals, in order to coordinate and regulate a wide number of processes including cell growth and proliferation via control of protein synthesis and protein degradation. For a number of human diseases where mTOR signalling is dysregulated, including cancer, the clinical relevance of mTOR inhibitors is clear. However, understanding of the mechanisms by which mTOR controls gene expression is incomplete, with implications for adverse toxicological effects of mTOR inhibitors on clinical outcomes. mTOR has been shown to regulate 5’ TOP mRNA expression, though the exact mechanism remains unclear. It has been postulated that this may involve an intermediary factor such as an RNA binding protein, which acts downstream of mTOR signalling to bind and regulate translation or stability of specific messages. This thesis aimed to address this question through the use of whole cell RNA binding protein capture using oligo‐d(T) affinity isolation and subsequent proteomic analysis, and identify RNA binding proteins with differential binding activity following mTOR inhibition. Following validation of 4 identified mTOR‐dependent RNA binding proteins, characterisation of their specific functions with respect to growth and survival was conducted through depletion studies, identifying a promising candidate for further work; LARP1.
    [Show full text]