An Optional C-Terminal Domain Is Ancestral in α-Amylases of Bilaterian Animals
Amylase 2017; 1: 26–34
Research Article, Open Access
Jean-Luc Da Lage*
DOI 10.1515/amylase-2017-0003
Received February 17, 2017; accepted March 28, 2017

*Corresponding author: Jean-Luc Da Lage, Evolution, Génomes, Comportement, Ecologie, CNRS, IRD, Univ Paris-Sud, Université Paris-Saclay, F-91198 Gif-sur-Yvette, France; E-mail: [email protected]

© 2017 Jean-Luc Da Lage, published by De Gruyter Open. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License.

Abstract: The modular structure and organization of most proteins is a fascinating aspect of their origin and evolution. α-Amylases are known to be formed of at least three domains. In a number of bacterial α-amylases, one or several additional domains may exist, which are carbohydrate binding modules interacting with raw substrates. In animal α-amylases, however, no additional domain has been described. Here we report the presence of a C-terminal domain, previously described only in the bacterium Pseudoalteromonas haloplanktis. This domain is widely distributed in invertebrate α-amylases and must be ancestral, although it has been lost in important phyla or groups, such as vertebrates and insects. Its function is still unknown. In a single genome, enzymes with and without the terminal domain may coexist. In a few instances, this domain has been recruited by other proteins in both bacteria and animals through domain shuffling.

Keywords: domain shuffling; evolution; intron; horizontal gene transfer; carbohydrate binding module

Abbreviations: AHA, Pseudoalteromonas haloplanktis α-amylase; CBM, carbohydrate binding module; SBD, starch binding domain.

1 Introduction

Protein evolution and innovation often proceed through the admixture of existing materials, such as pieces of existing proteins, often fully functional domains. Countless examples have been described, such as muscular proteins, immunoglobulins and enzymes (for details, see, e.g. [1]). The design and building of novel proteins often rely on exon shuffling, in which introns serve as linking regions between the pieces to be joined. This process, which had been predicted by Walter Gilbert in his seminal article [2], is quite common in Eukaryotes, especially in Vertebrates, but has also been exemplified, for example, as a recent event in Drosophila [3]. The domains, joined together, may be tightly linked or attached to each other by a low complexity protein region named a linker. However, introns are not necessary, since domain shuffling may also occur in Prokaryotes, which are devoid of introns (see below).

α-Amylase plays a crucial role in the hydrolysis of starch and other polysaccharides from food and nutrients into maltose and maltodextrins. Although amino acid sequences are highly variable among organisms, the general tertiary structure is well conserved. It is made of a “(β/α)₈ barrel” or “TIM barrel” (domain A), a protruding, loosely structured domain B involved in Ca²⁺ ion binding and catalysis at the interface with domain A, and an all-β domain C adopting a Greek key conformation. In about 10% of bacterial and fungal α-amylases and related enzymes, an additional terminal domain is present, sometimes in N-terminal position, but most often in C-terminal position, whose function is to bind raw or granular starch, hence its name, starch binding domain (SBD) [4-6]. SBDs belong mainly to the carbohydrate binding module (CBM) families 20, 21, 25, 26, 34, 41, 45, 48, 53, 58, 68 and 69 ([7], http://www.cazy.org/). They consist of all-β domains forming an open, distorted β-barrel, with binding sites for polysaccharides, which use conserved tryptophan residues.

A different, unrelated additional C-terminal domain has been described from a bacterium, Pseudoalteromonas haloplanktis, which is an Antarctic, marine γ-proteobacterium (Alteromonadaceae). Its unique α-amylase (AHA), which is adapted to cold temperature, has been extensively studied (e.g. [8-11]). Its amino acid sequence is strongly related to animal ones, and we have suggested that AHA and animal (bilaterian) α-amylases could be related to each other through a lateral transfer, whose direction is still not ascertained, but more likely from bacteria to bilaterians [12,13]. AHA possesses an original C-terminal domain, unrelated to SBDs, and not yet ascribed to any family, which has been shown to possibly act as a secretion helper, cleaved from the core enzyme by a non-specific protease after exportation across the outer cell wall [10]. Within the bacterial kingdom, we had previously found this “propeptide” domain only in a few species of the Pseudoalteromonas genus, and also in the bacterium Saccharophagus degradans (Cellvibrionaceae, formerly ascribed to Alteromonadaceae). In the latter case, the C-terminal domain was attached to a plant-like α-amylase, but not to the orthologous animal-like α-amylase, which also exists in this species, showing an example of domain translocation and shuffling without the need for an intron [12]. In animals, however, no extra domain had been characterized until now. In fact, sequence similarity between the AHA C-terminal domain and an animal sequence had only been noticed [10] in Caenorhabditis elegans.

In the present study, we show that this domain not only exists in animals, but is widespread, and thus is probably ancestral in bilaterians. The long-standing presence of this domain in various non-vertebrate animals suggests that it has a function. Although this hypothetical function cannot be determined for now, we draw an evolutionary history of this uncharacterized, yet quite common, protein domain.

2 Materials and methods

2.1 Experimental data

We checked the presence of a C-terminal domain experimentally in several molluscan species. The α-amylase genes were entirely sequenced in the bivalves Corbicula fluminea and Mytilus edulis, and almost entirely in the limpet Patella vulgata, using the Genome walker Universal kit (Clontech). The C-terminal domains were identified by BLAST search [14] in the GenBank database. From the alignment of these domains with those of P. haloplanktis, S. degradans and C. elegans, we designed PCR primers in conserved parts of the domain (Ctermdir: CARGAYCTNTTYATHCGNGG; Ctermrev: TCNGCNCCRTACCARTC) and used various combinations for amplification of fragments, if possible showing attachment to the core α-amylase sequence, i.e. also using primers designed within the core enzyme. The species assayed by PCR were the chiton Acantochitona sp. (Mollusca, Polyplacophora) and the oyster Crassostrea gigas (Mollusca, Bivalvia). Sequence data were deposited in GenBank (accession numbers are indicated in Table S1).

2.2 Searches in databases

Using the putative C-terminal domains of C. fluminea or AHA as a query, we searched by BLASTP and TBLASTN [14] in sequence databases for other occurrences of domains similar to the AHA C-terminal domain. The GenBank nr, GenBank EST and GenBank Bacterial genomes databases (http://blast.ncbi.nlm.nih.gov/Blast.cgi) were searched. We investigated trace archives at the NCBI for non-vertebrate metazoans. More specific genome servers were also used: the EchinoBase server (http://www.echinobase.org/) for echinoderms; the server of the Joint Genome Institute (http://genome.jgi.doe.gov/) for Daphnia pulex, Capitella teleta, Lottia gigantea, Branchiostoma floridae, Helobdella robusta and Ciona intestinalis; specific servers for the Aplysia californica EST project (http://aplysia.cu-genome.org/) and the nematode Pristionchus pacificus (http://www.pristionchus.org/); nematode-specific servers (http://www.nematode.net/); the “neglected genomes” database (http://www.nematodes.org/bioinformatics/databases.shtml); a server for marine organisms (http://www.marinegenomics.org/); for the Urochordate Oikopleura dioica (http://www.genoscope.cns.fr/); for the tick Ixodes scapularis (http://www.vectorbase.org/); and the InsectBase for various insects (http://www.insect-genome.com/). The accession numbers of sequences are given in Table S1.

Amino acid sequences were aligned with the program MUSCLE [15], and the alignment was manually curated for uncertainties and then used to build a maximum likelihood tree using the server Phylogeny.fr (http://www.phylogeny.fr/), with default parameters [16].

3 Results

3.1 Occurrence of the AHA-like C-terminal domain in Bacteria

The C-terminal domain, around 185 amino acids in length, was found only in a limited set of γ-proteobacteria, mostly Alteromonadales, for instance Pseudoalteromonas atlantica (YP_662421) and a number of closely related species, e.g. P. tunicata (ZP_01134110) and Alteromonadales bacterium TW7 (ZP_01613200), all attached to an α-amylase orthologous to that of P. haloplanktis (subfamily GH13_32 in the classification of Stam et al. [17]). It was also found in two Cellvibrionaceae: first in Saccharophagus degradans, where it had previously been found linked to a plant-type α-amylase (subfamily GH13_6) [12]. We found a gene orthologous to this plant-type gene, also with the C-terminal domain, in Cellvibrio japonicus (ACE84223). Moreover, in this species, we also found a duplicate of the C-terminal

animal α-amylases is quite complex, because numerous independent gene duplications and sequence divergences occurred, so that a tree drawn from these sequences does not faithfully reflect animal phylogeny (for instance, nematode sequences are highly divergent, whereas