Genome Sequence of Halobacterium Species NRC-1
Total Page:16
File Type:pdf, Size:1020Kb
Genome sequence of Halobacterium species NRC-1 Wailap Victor Nga,b, Sean P. Kennedyc, Gregory G. Mahairasa,b, Brian Berquistc, Min Pana,b, Hem Dutt Shuklac, Stephen R. Laskya,b, Nitin S. Baligac, Vesteinn Thorssona,b, Jennifer Sbrognac, Steven Swartzella, Douglas Weirc, John Halla, Timothy A. Dahla,b, Russell Weltia,b, Young Ah Gooa,b, Brent Leithausera, Kim Kellera, Randy Cruza, Michael J. Dansond, David W. Houghd, Deborah G. Maddocksd, Peter E. Jablonskie, Mark P. Krebsf, Christine M. Angevinef, Heather Dalef, Thomas A. Isenbargerf, Ronald F. Peckf, Mechthild Pohlschroderg, John L. Spudichh, Kwang-Hwan Jungh, Maqsudul Alami, Tracey Freitasi, Shaobin Houi, Charles J. Danielsj, Patrick P. Dennisk, Arina D. Omerk, Holger Ebhardtk, Todd M. Lowel, Ping Liangm, Monica Rileym, Leroy Hooda,b,n, and Shiladitya DasSarmac,n aDepartment of Molecular Biotechnology, University of Washington, Seattle, WA 98195; cDepartment of Microbiology, University of Massachusetts, Amherst, MA 01003; dCentre for Extremophile Research, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, United Kingdom; eDepartment of Biological Sciences, Northern Illinois University, DeKalb, IL 60115; fDepartment of Biomolecular Chemistry, University of Wisconsin Medical School, Madison, WI 53706; gDepartment of Biology, University of Pennsylvania, Philadelphia, PA 19104; hDepartment of Microbiology and Molecular Genetics, University of Texas Medical School, Houston, TX 77030; iDepartment of Microbiology, University of Hawaii, Honolulu, HI 96822; jDepartment of Microbiology, Ohio State University, Columbus, OH 43210; kDepartment of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, Canada V6T 1Z3; lDepartment of Genetics, Stanford University School of Medicine, Stanford, CA 94305; mMarine Biological Laboratory, Woods Hole, MA 02543; and bInstitute for Systems Biology, Seattle, WA 98105 Contributed by Leroy Hood, July 20, 2000 We report the complete sequence of an extreme halophile, in size (8–11). We sequenced pNRC100 as a preliminary step in Halobacterium sp. NRC-1, harboring a dynamic 2,571,010-bp ge- this genome project (12) and found a dynamic 191,346-bp nome containing 91 insertion sequences representing 12 families replicon containing 176 putative genes, several of which are and organized into a large chromosome and 2 related minichro- likely to be essential. mosomes. The Halobacterium NRC-1 genome codes for 2,630 pre- The complete genome sequence of Halobacterium NRC-1 is dicted proteins, 36% of which are unrelated to any previously notable because of the excellent characteristics of halophiles as reported. Analysis of the genome sequence shows the presence of experimental organisms among the archaea (13). Culturing is pathways for uptake and utilization of amino acids, active sodium- facile, because they are both aerobic and mesophilic. DNA- proton antiporter and potassium uptake systems, sophisticated mediated transformation may be accomplished at high effi- photosensory and signal transduction pathways, and DNA repli- ciency, and cloning and expression vectors with selectable mark- cation, transcription, and translation systems resembling more ers are readily available. Several gene replacement and knockout complex eukaryotic organisms. Whole proteome comparisons strategies have been used successfully, including a recently show the definite archaeal nature of this halophile with additional developed selectable and counterselectable method by using the similarities to the Gram-positive Bacillus subtilis and other bacteria. yeast ura3 gene homolog, which should permit systematic knock- The ease of culturing Halobacterium and the availability of meth- out of all nonessential genes (14). Moreover, large-scale PCR ods for its genetic manipulation in the laboratory, including con- amplification has been conducted successfully and DNA arrays struction of gene knockouts and replacements, indicate this halo- constructed for interrogating patterns of gene expression. For phile can serve as an excellent model system among the archaea. biochemical analysis, Halobacterium proteins can be released by lysis in hypotonic medium and stabilized by addition of salts and alobacterium specieso are obligately halophilic microorgan- other compatible solutes. Both membrane and soluble proteins Hisms that have adapted to optimal growth under conditions have been useful for structural studies using electron and x-ray of extremely high salinity—10 times that of sea water. They methods (15–17). These characteristics, coupled with the com- contain a correspondingly high concentration of salts internally plete genome sequence, make Halobacterium NRC-1 an excel- and exhibit a variety of unusual and unique molecular charac- lent experimental model among the archaea. teristics. Since their discovery, extreme halophiles have been studied extensively by chemists, biochemists, microbiologists, Genome Sequence, Annotation, and Organization and molecular biologists to define both molecular diversity and We sequenced the Halobacterium NRC-1 genome by using a universal features of life. A notable list of early research whole genome shotgun strategy. Approximately 45,000 high- milestones on halophiles includes the discovery of a cell envelope quality sequences were obtained by using automated Applied composed of an S-layer glycoprotein, archaeol ether lipids and Biosystems sequencers, which provided ϫ7.5 coverage of the purple membrane, and metabolic and biosynthetic processes operating at saturating salinities (1). These early discoveries established the value of investigations directed at extremophiles Abbreviation: IS, insertion sequence. and set the stage for pioneering phylogenetic studies leading to Data deposition: The sequences reported in this paper have been deposited in the GenBank the three-domain view of life and classification of Halobacterium database (accession nos. AE004437, AE004438, and AF016485). nTo whom reprint requests should be addressed. E-mail: [email protected] or as a member of the archaeal domain (2, 3). [email protected]. The Halobacterium genome was originally studied in the 1960s oHalobacterium species are referred to in the literature by a variety of designations, and found to be composed of two components, a GC-rich (68%) including H. halobium, H. cutirubrum, H. salinarium, and H. salinarum. The precise major fraction and a relatively AT-rich (58% GC) satellite (4, 5). relationships among these organisms and Halobacterium sp. strain NRC-1 are not entirely Subsequent work showed that satellite DNA corresponded to the clear (18). Strain NRC-1 was a gift from W. F. Doolittle, Dalhousie University, Halifax, presence of large and variable covalently closed extrachromo- Canada. The strain has been deposited with the American Type Culture Collection, Manassas, VA (reference no. ATCC 700922). somal circles and a large number of transposable insertion The publication costs of this article were defrayed in part by page charge payment. This sequence (IS) elements, which explained the observed genetic article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. plasticity of halophiles (6, 7). For Halobacterium NRC-1, 3 §1734 solely to indicate this fact. Ϸ circular replicons were mapped, a 2-Mbp chromosome and 2 Article published online before print: Proc. Natl. Acad. Sci. USA, 10.1073͞pnas.190337797. large replicons, pNRC200 and pNRC100, about 350 and 200 Kbp Article and publication date are at www.pnas.org͞cgi͞doi͞10.1073͞pnas.190337797 12176–12181 ͉ PNAS ͉ October 24, 2000 ͉ vol. 97 ͉ no. 22 Downloaded by guest on September 24, 2021 large chromosome. We used 505 oligonucleotides for directed salinity reaches near saturation. In the laboratory, cells are sequencing of lower-quality regions or regions with single cov- cultured best in a complex medium (13, 25). A minimal medium erage. The remaining low-quality regions were covered by described for Halobacterium includes all but 5 of the 20 amino sequencing both ends of 124 PCR amplified genomic fragments. acids for growth (26). Several amino acids may be used as a The shotgun Halobacterium NRC-1 sequences were assembled source of energy, including arginine and aspartate, which are by using the PHREDPHRAP programs (19–21). Initially, all of the passed to the citric acid cycle via 2-oxoglutarate and oxaloac- known and putative new IS elements were masked in the etate, respectively (Fig. 1). Under aerobic conditions, arginine is assembly. This resulted in 84 high-quality contigs, which were presumably converted to glutamate via the arginine deiminase subsequently merged into groups of 2–10 adjacent contigs by a pathway, and this amino acid then enters the cycle via glutamate second round of assembly without repeat masking. Finally, the dehydrogenase. The arginine deiminase pathway is coded by the grouped contig consensus sequences were merged into three arcRACB genes (27), which are found on pNRC200. circular contigs by using a third round of assembly with the In accordance with the ability of Halobacterium NRC-1 to PHREDPHRAP programs. The sequences have been deposited in grow on amino acids, which ultimately are catabolized by the GenBank and assigned the following accession numbers: citric acid cycle, the genes coding all of the enzymes for an AE004437, AE004438, and AF016485. aerobic cycle are present (Fig. 1). In common with all archaea, Our results confirmed the expected size and structure of the the conversion of pyruvate