Published online 14 March 2020. NAR Genomics and Bioinformatics, 2020, Vol. 2, No. 2. doi: 10.1093/nargab/lqaa018

Diversity and evolution of bacterial bioluminescence genes in the global ocean

Thomas Vannier 1,2,*, Pascal Hingamp 1,2, Floriane Turrel 1, Lisa Tanet 1, Magali Lescot 1,2,* and Youri Timsit 1,2,*

1 Aix Marseille Univ, Université de Toulon, CNRS, IRD, MIO UM110, 13288 Marseille, France and 2 Research Federation for the study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 3 rue Michel-Ange, 75016 Paris, France

Received October 21, 2019; Revised February 14, 2020; Editorial Decision March 02, 2020; Accepted March 06, 2020

*To whom correspondence should be addressed. Tel: +33 4 86 09 06 66; Email: [email protected]
Correspondence may also be addressed to Thomas Vannier. Email: [email protected]
Correspondence may also be addressed to Youri Timsit. Email: [email protected]

© The Author(s) 2020. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]

ABSTRACT

Although bioluminescent bacteria are the most abundant and widely distributed of all light-emitting organisms, the biological role and evolutionary history of bacterial luminescence are still shrouded in mystery. Bioluminescence has so far been observed in the genomes of three families of Gammaproteobacteria in the form of canonical lux operons that adopt the CDAB(F)E(G) gene order. LuxA and luxB encode the two subunits of bacterial luciferase responsible for light emission. Our deep exploration of public marine environmental databases considerably expands this view by providing a catalog of new lux homolog sequences, including 401 previously unknown luciferase-related genes. It also reveals a broader diversity of the lux operon organization, which we observed in previously undescribed configurations such as CEDA, CAED and AxxCE. This expanded operon diversity provides clues for deciphering lux operon evolution and propagation within the bacterial domain. Leveraging quantitative tracking of marine bacterial genes afforded by planetary scale metagenomic sampling, our study also reveals that the novel lux genes and operons described herein are more abundant in the global ocean than the canonical CDAB(F)E(G) operon.

INTRODUCTION

Marine biodiversity and evolution are intimately related with biogeography and ecology (1–3). The Tara Oceans expedition recently provided a global picture of the complex interactions between marine micro-organisms and their environment (4–6). Bioluminescence, the chemical emission of visible light, is produced by a remarkable diversity of organisms and is particularly widespread in marine species (7–9). The luciferase enzymes that catalyze the emission of photons have evolved independently over 30 times, by convergence from non-luminescent enzymes (10,11). Although bioluminescent bacteria are the most abundant and widely distributed of all light-emitting organisms (7,12), certain functional and evolutionary aspects of bacterial luminescence still remain enigmatic, such as its biological role, which remains a matter of debate (13). Early on, bioluminescence was proposed to have evolved from ancient oxygen-detoxifying mechanisms (14–16). It has also been argued that stimulation of DNA repair through the activation of DNA photolyase may confer an advantage to luminous bacteria (17), although this hypothesis is still controversial (18). Yet another hypothesis is that bioluminescence is a visual attractant for zooplankton and fish that both provide ingested bacteria with growth medium and means for dispersal (19). Symbiosis with squid or fish is also an intriguing feature of specific bioluminescent bacteria (20,21).

To date, most of the few culturable light-emitting bacterial species that have been characterized fall within the Gammaproteobacteria class. These bacteria cluster phylogenetically in three families (Vibrionaceae, Shewanellaceae and Enterobacteriaceae) which all carry a highly conserved lux operon (12). Since its first identification 40 years ago, the canonical luxCDAB(F)E(G) organization has been systematically observed in all the bacterial bioluminescent genomes (22–24). The luxA and luxB genes encode the alpha and beta subunits of the luciferase heterodimer that emits light by the oxidation of FMNH2 and a long chain aldehyde. Whereas they both adopt a TIM-barrel fold (25), LuxA specifically displays a disordered loop playing a critical role in light emission (26). LuxC, D and E together form a fatty acid reductase complex responsible for the synthesis of the long chain aldehyde substrate (24,27).

Despite a highly conserved core, some variations have been observed in the lux operon organization. Small differences in gene content, for instance the presence of optional riboflavin genes or luxF in Photobacterium species, have been observed (28,29). LuxG, which reduces FMN into FMNH2, is absent in Photorhabdus spp., whose operon also contains multiple insertions of ERIC sequences (30). Natural merodiploidy of the lux-rib operon has also been noticed in some strains of Photobacterium leiognathi (31). While the phylogeny of lux genes generally supports a vertical inheritance, multiple examples of instability of the lux locus and horizontal gene transfers (HGT) have been reported in different clades, at various taxonomic levels (32–34). Also, mutations or loss of the lux operon are frequently observed in non-luminous strains and appear to correlate with some environmental parameters (35–38).

Although bioluminescent bacteria are cosmopolitan in the oceans and occupy a great diversity of ecological niches, including surface and deep waters (39–43), many studies have revealed an intricate relationship between bacterial bioluminescent phenotype, lux operon diversity, environmental parameters and life style (29,44–50). Given the apparent ubiquity of bioluminescence in the ocean and the ease with which light emitting bacteria can be isolated from seawater, it has come as a surprise that bacterial bioluminescence has so far escaped detection by previous marine metagenomic studies (51). According to the pioneering authors, the unexpected absence of lux genes might have been explained by sampling protocols which filtered out size classes of potential interest, and by sequencing depth which might have been insufficient to catch bioluminescent bacteria if these were of low abundance (51).

In the present report, we surveyed the distribution of bacterial lux-related genes in a compilation of publicly available large-scale metagenomic environmental databases (Tara Oceans 2009–2013, Malaspina 2010, GOS and OSD2014), giving special care to screen the largest possible variety of organismal size sampling fractions. Spanning a wide spectrum of marine bacterial diversity, including a majority of unculturable species, our study reveals new insights into the distribution, diversity and evolution of marine lux-related genes and their operon organization at a planetary scale.

MATERIALS AND METHODS

LuxA reference sequence dataset

The coordinates of the Vibrio harveyi LuxA/B heterodimeric luciferase (pdbid: 3fgc) were obtained from the Protein Data Bank (PDB) (52) (https://www.rcsb.org/, version 06/22/2019) and used as a reference luciferase structure. The corresponding V. harveyi LuxA protein sequence (UniProtKB: P07740) was used to query UniProtKB/Swiss-Prot (53,54) (https:

[...] and a hidden Markov model (HMM) profile was built using hmmbuild from HMMER v3.0 with default parameters (58) (http://hmmer.org). The resulting LuxA HMM profile was used to search for additional luciferase homologs using hmmsearch in NR (59) (https://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz, version 06/15/2019) with an E-value threshold of 1.0E-186 set to avoid LuxB homologs. In addition, 55 draft whole genome shotgun (WGS) marine bacterial genomes (https://www.ncbi.nlm.nih.gov/genbank/wgs/, version 07/25/2018) and 334 marine bacterial complete genomes (https://www.ncbi.nlm.nih.gov/genbank/genome/, version 07/25/2018) were screened using the LuxA HMM profile; after removing sequence redundancy, partial and synthetic sequences, we obtained a reference dataset of 129 LuxA protein sequences. This dataset was then used to compute a final LuxA HMM profile (Supplementary Table S1). A similar procedure was used for building the LuxB, C, D, E, G and F reference datasets.

Diversity of lux operon

GenBank bacterial genomes containing the reference and marine lux-like sequences were downloaded from the NCBI web site (https://www.ncbi.nlm.nih.gov/genbank/). A synteny graph representing the operon structural organizations was produced using Easyfig (60).

OM-RGC Lux homologs search

The Tara Oceans OM-RGC dataset (5,61) was screened with each of the Lux HMM profiles obtained above, using hmmsearch with an E-value threshold of 1.0E-10 (Supplementary Table S1). Further filtering based on alignment lengths eliminated incomplete Lux sequences. The length thresholds were set to 340, 300, 430, 275, 300 and 200 aa for the LuxA, B, C, D, E and F homologs, respectively.

OM-RGC LuxA homolog structural filtering

Bacterial luciferases and monooxygenases share a highly conserved TIM-barrel fold (25), which makes them difficult to discriminate from each other on the basis of primary sequence alignments alone. We therefore developed a specific procedure to help luciferase/monooxygenase discrimination based on three-dimensional (3D) structure modeling and comparison. The 3D coordinates of close structural homologs of bacterial luciferases were retrieved from PDB (52) and structurally superimposed with PyMOL (The PyMOL Molecular Graphics System, Version 1.2r3pre, Schrödinger, LLC.) (Supplementary Table S2 and Figure S2a–i).
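As an illustration of the two-step filter described for the OM-RGC Lux homolog search (an E-value cutoff of 1.0E-10 followed by family-specific alignment-length thresholds), the following Python sketch applies both criteria to a list of hits. It is not the authors' pipeline: the `Hit` record, the function names and the example hits are hypothetical, and the choice of inclusive comparisons (E-value at or below the cutoff, length at or above the threshold) is an assumption, since the text does not state how boundary cases were handled. Only the numerical thresholds come from the text. LuxG is not listed among the length thresholds, so this sketch simply leaves it out.

```python
from dataclasses import dataclass

# Values quoted in the text: E-value cutoff for the OM-RGC search and
# minimum alignment lengths (aa) for the LuxA, B, C, D, E and F homologs.
EVALUE_CUTOFF = 1.0e-10
MIN_ALIGNED_LENGTH = {
    "LuxA": 340, "LuxB": 300, "LuxC": 430,
    "LuxD": 275, "LuxE": 300, "LuxF": 200,
}


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one HMM-profile hit (not a HMMER data type)."""
    gene_id: str
    family: str          # which Lux profile produced the hit, e.g. "LuxA"
    evalue: float
    aligned_length: int  # alignment length in amino acids


def keep_hit(hit: Hit) -> bool:
    """Return True if the hit passes both the E-value and length filters.

    Assumption: both comparisons are inclusive. Families without a listed
    length threshold (e.g. LuxG) are rejected by this sketch.
    """
    min_len = MIN_ALIGNED_LENGTH.get(hit.family)
    if min_len is None:
        return False
    return hit.evalue <= EVALUE_CUTOFF and hit.aligned_length >= min_len


def filter_hits(hits):
    """Keep only the hits that satisfy keep_hit, preserving input order."""
    return [h for h in hits if keep_hit(h)]


# Made-up hits, for illustration only.
example_hits = [
    Hit("gene_1", "LuxA", 1e-120, 352),  # passes both filters
    Hit("gene_2", "LuxA", 1e-120, 210),  # too short for LuxA
    Hit("gene_3", "LuxC", 1e-5, 450),    # E-value above the cutoff
    Hit("gene_4", "LuxF", 1e-10, 200),   # exactly on both boundaries
]
kept_ids = [h.gene_id for h in filter_hits(example_hits)]
```

Keeping the thresholds in a single dictionary keyed by family makes it easy to see that each of the six listed values is paired with the intended Lux protein.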