
FEBS Letters 585 (2011) 2627–2634

journal homepage: www.FEBSLetters.org

Hypothesis

Molecular machines encoded by bacterially-derived multi-domain gene fusions that potentially synthesize, N-methylate and transfer long chain polyamines in diatoms

Anthony J. Michael

Department of Pharmacology, University of Texas Southwestern Medical Center, Dallas, TX 75390-9041, USA

Article info

Article history: Received 20 July 2011; Accepted 21 July 2011; Available online 4 August 2011

Edited by Miguel De la Rosa

Keywords: Diatom; Biosilica glass; Long chain polyamine biosynthesis; N-methylation; Aminopropyltransferase

Abstract

Silica glass formation in diatoms requires the biosynthesis of unusual, very long chain polyamines (LCPA) composed of iterated aminopropyl units. Diatoms processively synthesize LCPA, N-methylate the amine groups and transfer concatenated, N-dimethylated aminopropyl groups to silaffin proteins. Here I show that diatom genomes possess signal peptide-containing gene fusions of bacterially-derived polyamine biosynthetic S-adenosylmethionine decarboxylase (AdoMetDC) and an aminopropyltransferase, sometimes fused to a eukaryotic histone N-methyltransferase domain, that potentially synthesize and N-methylate LCPA. Fusions of similar, alternatively configured domains but with a catalytically dead AdoMetDC and in one case a Tudor domain, may N-dimethylate and transfer multiple aminopropyl unit polyamines onto silaffin proteins.

© 2011 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

Abbreviations: LCPA, long chain polyamine; ODC, ornithine decarboxylase; AdoMet, S-adenosylmethionine; AdoMetDC, S-adenosylmethionine decarboxylase; MTA, methylthioadenosine

E-mail address: [email protected]

1. Introduction

Diatoms are ubiquitous, aquatic single-celled eukaryotic organisms responsible for 20% of planetary photosynthetic activity, equal to all terrestrial rainforests combined [1]. Their highly elaborate, biosilica glass cell wall is constructed from monosilicic acid (Si(OH)4) by a biomineralization process initiated by highly modified proteins known as silaffins [2–5] and by very long chain linear polyamines (LCPA) (Fig. 1A) that may be N-methylated on secondary and primary amines [6–10]. In some species, proteins called cingulins form an organic matrix that acts as a template to guide silicification [11]. Silaffin proteins are modified by dimethylation of some lysine residues and by transfer of multiple aminopropyl unit polyamines to other lysine ε-amino groups. Transferred polyamine moieties may be dimethylated on primary and secondary amine groups. The LCPA may be composed of as many as 20 aminopropyl units [12], with the chain length and the degree of methylation of primary and secondary amine groups depending on the species. In the diatom Thalassiosira pseudonana, LCPA may contain putrescine, spermidine or 1,3-diaminopropane as well as the multiple aminopropyl repeat units [7,13].

Polyamines are small organic polycations found in almost all cells and are essential for growth and cell proliferation in eukaryotes. In normal eukaryotic polyamine metabolism (Fig. 1A), the triamine spermidine is formed by transfer of an aminopropyl group to the diamine putrescine (1,4-diaminobutane). The aminopropyl group is derived from decarboxylated S-adenosylmethionine formed by S-adenosylmethionine decarboxylase (AdoMetDC) [14]. Transfer of the aminopropyl group to putrescine is performed by the aminopropyltransferase spermidine synthase [15]. The tetramines spermine and thermospermine are formed from spermidine by transfer of another aminopropyl group to the N8 (aminobutyl) or N1 (aminopropyl) ends of spermidine, respectively, by the aminopropyltransferases spermine synthase [16] and thermospermine synthase [17]. At the beginning of the pathway (Fig. 1A), putrescine is formed from ornithine by the action of ornithine decarboxylase (ODC) [18]. In eukaryotic cells, spermidine levels are highly regulated at the level of biosynthesis, catabolism, uptake and export [19]. The key biosynthetic enzymes ODC and AdoMetDC are negatively regulated in response to polyamine levels by sensitive post-transcriptional feedback systems that include programmed ribosomal frameshifting and ribosome stalling [20,21]. These powerful homeostatic systems adjust spermidine concentrations to the level required by the physiological state of the cell. Synthesis of LCPA therefore must not disrupt spermidine homeostasis and LCPA must be sequestered from the normal cellular binding sites of polyamines. The pathway for LCPA biosynthesis, polyamine

0014-5793/$36.00 © 2011 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved. doi:10.1016/j.febslet.2011.07.038

[Figure 1A scheme. Labels: ornithine; ornithine decarboxylase; putrescine; S-adenosylmethionine; S-adenosylmethionine decarboxylase; decarboxylated S-adenosylmethionine; spermidine synthase; spermidine; thermospermine synthase; thermospermine. Silica deposition vesicle: aminopropyltransferase/S-adenosylmethionine decarboxylase fusion domain for successive aminopropyl group transfer (n = up to 20); SET domain methyltransferase; S-adenosylhomocysteine; "present in some long chain polyamines".]

Fig. 1A. Polyamine and LCPA biosynthesis in diatoms.

modification of silaffins, and methylation of primary and secondary amines has remained enigmatic. In particular, it is unclear how multiple aminopropyl units could be sequentially transferred to the growing polyamine chain of LCPA. It is unlikely that LCPA biosynthesis is similar to normal polyamine biosynthesis, where each aminopropyl group transfer is achieved by a different aminopropyltransferase enzyme; otherwise there would need to be up to 20 different successive aminopropyltransferases. It is also not clear how multiple amino groups in a single LCPA chain could be methylated. LCPA synthesis would require a relatively large supply of decarboxylated S-adenosylmethionine, and a mechanism would have to be in place to prevent physiologically damaging perturbation of normal polyamine homeostasis. Here I show that the genomes of T. pseudonana, Phaeodactylum tricornutum and Fragilariopsis cylindrus encode one set of molecular machines that potentially perform the tasks of iterative addition of multiple aminopropyl groups to form LCPA, by supplying their own decarboxylated S-adenosylmethionine, and concomitant or subsequent N-methylation of the LCPA. Another set of fusion proteins could perform aminopropyl group transfer to lysine residues of silaffins, and processive N-dimethylation of amine groups. The molecular machines are derived from bacterial polyamine biosynthetic fusions and chromatin protein modification and binding domains.

2. Materials and methods

2.1. Sequence and phylogenetic analysis

The diatom genomes were analysed using the US Department of Energy Joint Genome Institute Genome Portal (http://genome.jgi-psf.org/). BLASTP of diatom proteins in the NCBI non-redundant protein sequences yielded only very partial proteins (e.g., 70–90 a.a. instead of 750–950 a.a., using a Thermotoga maritima AdoMetDC sequence to screen the diatom proteins). Phylogenetic analysis was performed as previously described [22]. Alignments were performed with ClustalW in ClustalX [23] and trees were made with PAUP* [24]. Signal peptides were analysed using the SignalP server (http://www.cbs.dtu.dk/services/SignalP/), taking into consideration the results from both the neural network analysis and the Hidden Markov models. All aminopropyltransferase fusions were identified by TBLASTN and BLASTP using the DoE JGI database for each diatom species, and additional domains in the fusion proteins were identified by BLASTP or PSIBLAST using the NCBI server (http://blast.ncbi.nlm.nih.gov/Blast.cgi).

3. Results and discussion

3.1. Variations on normal eukaryotic polyamine biosynthesis in diatoms

A typical eukaryotic ODC orthologue (Fig. 1A, Table S1) is found in T. pseudonana (ThaODC) and P. tricornutum (PhaODC1). However, P. tricornutum also possesses an aberrant ODC paralogue, PhaODC2, that lacks the critical residues corresponding to C360 and D361 [18] of the mouse ODC, required for substrate binding and catalysis (Fig. S1). It is very unlikely that PhaODC2 will be active; however, it does possess the PLP-binding lysine residue corresponding to the mouse K69 position. It is well established that ODC functions as a head to tail homodimer with two active sites formed across the interface of the dimer, each being constituted by residues from the N-terminal domain of one subunit and the C-terminal domain of the other. In principle, PhaODC2 could form a dimer with PhaODC1 to yield a mixed subunit ODC enzyme operating at 50% activity due to only one of the two active sites being functional. One site would be inactive because of the absence of the C360 substrate binding residue but the other active site would have a normal complement of essential residues. Alternatively, PhaODC2 may be an antizyme inhibitor orthologue [25], i.e., a catalytically dead ODC homologue that binds to and sequesters the ODC inhibitory protein antizyme [20]. In F. cylindrus, there are two ODC-like proteins, FraODCa and FraODCb (Table S1), which both lack critical residues: FraODCa lacks the mouse K69 equivalent (replaced by an arginine) but possesses the C360/D361 residues, whereas FraODCb lacks the C360/D361 positions but does possess a K69 equivalent. A mixed dimer of FraODCa and FraODCb may be active because one functional active site could be constituted from the C-terminal domain of FraODCa and the N-terminal domain of FraODCb, in the head to tail arrangement of the dimer. An obligate mixed dimer of mutant ODC paralogues has never been seen before in any species and it is probable that this occurs in F. cylindrus because there is no other route for putrescine biosynthesis. There are agmatine deiminase/iminohydrolase (AIH) and N-carbamoylputrescine amidohydrolase (NCPAH) orthologues, which are of bacterial/plant origin, in P. tricornutum and F. cylindrus (Table S1). These two enzymes convert the decarboxylated product of arginine, i.e., agmatine, to putrescine. However, there are no discernible orthologues of arginine decarboxylase (ADC) in the genomes, suggesting either that AIH and NCPAH do not play a role in polyamine biosynthesis or that there is a cryptic ADC gene in these species.

Formation of the triamine spermidine from putrescine in eukaryotes requires AdoMetDC and the aminopropyltransferase spermidine synthase (Fig. 1A). Typical eukaryotic AdoMetDC orthologues are present in the three diatom genomes (Table S1), and in T. pseudonana and P. tricornutum AdoMetDC mRNAs, for which mRNA sequences with long 5′ leader sequences are available, an upstream open reading frame encoding peptides of 29 and 26 amino acids, respectively, is present in the mRNA 5′ region. The upstream ORFs likely encode ribosome-stalling peptides analogous to those of plant and mammalian AdoMetDC mRNAs [21], as part of a polyamine-responsive autoregulatory negative feedback system. Typical spermidine synthase orthologues are present in each of the three diatom genomes, and in T. pseudonana, the spermidine synthase paralogue ThaSDS1 has been shown [26] to be functional. There are three spermidine synthase paralogues in F. cylindrus, and two each in P. tricornutum and T. pseudonana (shown below in Fig. 3). Uniquely for spermidine synthase orthologues, the F. cylindrus FraSDS3 and the T. pseudonana ThaSDS2 proteins have acquired signal peptides. It is possible that one of the spermidine synthase paralogues encodes the related aminopropyltransferase spermine synthase. Each of the three sequenced diatom genomes contains a thermospermine synthase orthologue (listed in Fig. 3), including the characterised T. pseudonana thermospermine synthase [17]. The thermospermine synthase orthologues differ from spermidine and spermine synthases by having a glutamate, rather than an aspartate, in the motif hhhhGGG(D/E)G (where h = L/V/I/M). The phylogenetic distribution of the diatom thermospermine synthases suggests that they have been acquired from the red algal

Fig. 1B. Diatom AdoMetDC-aminopropyltransferase fusion proteins. FraAPT, Fragilariopsis cylindrus; PhaAPT, Phaeodactylum tricornutum; ThaAPT, Thalassiosira pseudonana. The signal peptides are indicated in purple, Class 1b AdoMetDC in blue, aminopropyltransferase in red, SET domain N-methyltransferase in green. Some examples of bacterial orthologous fusion proteins are shown below the diatom fusions. The presence of an aspartate, glutamate or other residue in the aminopropyltransferase motif hhhhGGG(E/D/X)G is indicated.
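As an illustrative aid, the motif hhhhGGG(D/E)G (h = L/V/I/M) used to distinguish thermospermine synthase-like from spermidine/spermine synthase-like domains can be expressed as a regular expression. The Python sketch below is not part of the original analysis: the function name and the toy sequences are hypothetical, and a real classification would rest on alignments and phylogeny rather than on a single motif match.

```python
import re

# h = L/V/I/M as defined in the text; the residue after GGG is captured.
MOTIF = re.compile(r"[LVIM]{4}GGG(.)G")

def classify_motif(seq):
    """Return (matched_motif, label) for the first hhhhGGG(X)G hit, or None."""
    m = MOTIF.search(seq.upper())
    if m is None:
        return None
    residue = m.group(1)
    if residue == "E":
        label = "thermospermine synthase-like (glutamate)"
    elif residue == "D":
        label = "spermidine/spermine synthase-like (aspartate)"
    else:
        label = "other residue"
    return m.group(0), label

# Hypothetical toy sequences, not taken from any real protein.
for toy in ("AVLVIGGGEGAAR", "QQLIVMGGGDGSS", "TTMMLLGGGSGKK", "ACDEFGHIK"):
    print(toy, classify_motif(toy))
```

Capturing the variable position as a single group keeps the three cases shown in Fig. 1B (aspartate, glutamate or other residue) in one pattern.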

Fig. 1C. A Neighbor Joining tree of diatom and bacterial AdoMetDC-aminopropyltransferase fusion proteins. The tree is unrooted and percentage bootstrap support for 1000 replicates is shown. Diatom sequences are highlighted in colour. GenBank accession numbers and DoE Joint Genome Institute protein model descriptions are shown for the bacterial and diatom fusions, respectively. The alignment on which the tree is based is shown in Fig. S2. FraAPT, F. cylindrus; PhaAPT, P. tricornutum; ThaAPT, T. pseudonana.

endosymbiont and thermospermine synthase orthologues are found in primary algae, terrestrial plants and chromalveolates [16].

3.2. Molecular machines for long chain polyamine biosynthesis

Unlike normal polyamine biosynthesis, where either one or two aminopropyl group transfers are required for triamine or tetraamine biosynthesis respectively, LCPA biosynthesis can involve up to 20 aminopropyl groups. Furthermore, amino groups may be mono- or di-methylated. Close inspection of the three diatom genomes using the US Department of Energy's Joint Genome Institute Genome Portal reveals a potential solution to production of LCPA and their extensive N-methylation. Diatom genomes encode multidomain proteins, based on horizontally acquired bacterial fusion proteins encoding bacterial Class 1b AdoMetDC domains [27] and aminopropyltransferase domains (Fig. 1B). Similar but much smaller fusion proteins in bacteria are found in the α-, β- and δ-Proteobacteria, Cyanobacteria, Bacteroidetes, Chlorobi, Actinobacteria and Firmicutes (Fig. 1B and 1C). We have shown recently that some of these bacterial AdoMetDC-aminopropyltransferase fusion proteins are able to synthesize de novo spermidine from putrescine and sym-norspermidine from 1,3-diaminopropane [28]. This is a strong indication that the diatom AdoMetDC-aminopropyltransferase fusion proteins are likely to be functional polyamine biosynthetic modules, and being much bigger, the diatom fusion proteins may be capable of producing iteratively elongated LCPA within their active sites. The diatom fusion proteins possess conserved blocks of sequence in the linker region between the AdoMetDC and aminopropyltransferase domains (Fig. S2), suggesting that the diatom AdoMetDC-aminopropyltransferase gene fusions evolved from a single bacterial fusion gene that subsequently duplicated and acquired additional functional domains.

All the diatom AdoMetDC-aminopropyltransferase proteins possess signal peptides (Fig. 1B) and are considerably larger than the bacterial counterparts. A thermospermine synthase motif is present in the aminopropyltransferase domains from the F. cylindrus fusion proteins FraAPT1, FraAPT2, FraAPT3 and FraAPT4, the P. tricornutum PhaAPT1, PhaAPT2 and PhaAPT3 and the T. pseudonana ThaAPT1, ThaAPT2, ThaAPT3 and ThaAPT4 (Fig. 1B), suggesting transfer of an aminopropyl group to another aminopropyl, rather than an aminobutyl, acceptor group. Thermospermine is synthesized from spermidine by transfer of an aminopropyl group to the N1-aminopropyl side of the spermidine molecule, whereas spermine is formed by aminopropyl group transfer to the N8-aminobutyl side. A thermospermine synthase motif in the fusion proteins is consistent with a role in the iterative addition of aminopropyl groups during LCPA biosynthesis. The DoE JGI protein model accessions for all the diatom aminopropyltransferase sequences are presented in Fig. 1C. AdoMetDC requires a self-generated covalent pyruvoyl cofactor for activity, formed from an internal serine exposed after autocatalytic cleavage of the proenzyme to form the processed α- and β-subunits. The pyruvoyl cofactor is found at the N-terminus of the α-subunit [14]. AdoMetDC domains of each of the fusion proteins appear to have all the critical residues for activity and processing except for ThaAPT3, which has replaced the equivalent of C83 of the Thermotoga maritima enzyme [27], required for normal autocatalytic processing of the proenzyme, with a threonine residue. The presence of AdoMetDC and aminopropyltransferase domains in the same protein suggests that aminopropyl group addition could be iterative and processive, elaborating a longer polyamine chain by supplying the requisite decarboxylated AdoMet to generate multiple aminopropyl group additions to the growing LCPA. Because of the AdoMetDC domain, the fusion proteins would be processed into a small N-terminal chain containing the β-subunit of AdoMetDC and a large C-terminal chain containing the rest of the protein.

Three of the fusion proteins, FraAPT1, FraAPT2 and ThaAPT4, indicate a solution to the problem of N-methylation of the LCPA. They contain C-terminal SET domains, i.e., protein domains originally found to N-methylate lysine residues in histone H3 and H4. The SET domain methyltransferase has now been shown to be capable of mono-, di- and trimethylation of histone lysine residues

[Figure 2A. Upper panel: structures of a silaffin lysine residue and examples of its modifications. Lower panel: domain diagrams of ThaAPT5 (970 aa), ThaAPT6 (987 aa), FraAPT5 (1027 aa), PhaAPT4 (1270 aa) and FraAPT6 (1535 aa). Legend labels: signal peptide; aminopropyltransferase; degraded AdoMetDC; ShKT domain; Tudor domain.]

Fig. 2A. Diatom fusion proteins containing aminopropyltransferase and catalytically dead AdoMetDC domains. The upper panel shows a silaffin protein lysine residue and a sample of modifications found in diatom silaffin protein lysine positions. Lower panel shows fusion proteins found in the genomes of F. cylindrus, P. tricornutum and T. pseudonana. Signal peptides are shown in purple, the aminopropyltransferase domain in red, the catalytically dead AdoMetDC domain is indicated by a boxed sphere and the SET domain N-methyltransferase domain is green.
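As an illustrative aid, the domain composition of the five spermidine synthase-like fusion proteins described in the text and in Fig. 2A can be written down as a small lookup table. The Python sketch below is not part of the original analysis: the class and function names are hypothetical, the lengths are those given in the figure labels, and only the additional domains named in the text are recorded (their order within each protein is not modelled).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fusion:
    name: str
    species: str
    length_aa: int
    extra_domains: frozenset = frozenset()

# All five share a signal peptide, a spermidine synthase-like
# aminopropyltransferase domain and a catalytically dead AdoMetDC domain.
FUSIONS = [
    Fusion("ThaAPT5", "T. pseudonana", 970, frozenset({"ShKT"})),
    Fusion("ThaAPT6", "T. pseudonana", 987),
    Fusion("FraAPT5", "F. cylindrus", 1027, frozenset({"SET"})),
    Fusion("PhaAPT4", "P. tricornutum", 1270, frozenset({"SET"})),
    Fusion("FraAPT6", "F. cylindrus", 1535, frozenset({"Tudor"})),
]

def with_domain(domain):
    """Names of fusion proteins carrying the given additional domain."""
    return [f.name for f in FUSIONS if domain in f.extra_domains]

print(with_domain("SET"))
```

A query such as `with_domain("Tudor")` then picks out the single protein proposed to recognise already methylated lysines.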

Fig. 2B. Alignment of AdoMetDC domains from diatom and bacterial fusion proteins. Putatively active diatom AdoMetDC domains are shown in yellow, the catalytically dead diatom AdoMetDC (*) domains in blue, AdoMetDC domains from bacterial AdoMetDC-aminopropyltransferase fusion proteins in purple and stand-alone bacterial AdoMetDC proteins in green. The serine position of the pyruvoyl cofactor and autocatalytic proenzyme processing site is highlighted in red. Noc.fa, (YP_119998) Nocardia farcinica IFM 10152; cya.UC, (YP_003421651) cyanobacterium UCYN-A; Dic.tu, (YP_002353349) Dictyoglomus turgidum DSM 6724; The.ma, (NP_228464) Thermotoga maritima MSB8; Clo.di, (YP_001087363) Clostridium difficile 630; Bac.su, (ZP_03592689) Bacillus subtilis 168; Del.ac, (YP_001561775) Delftia acidovorans SPH-1; Rho.ru, (YP_426779) Rhodospirillum rubrum ATCC 11170; Pel.sp, (ZP_05069376) Candidatus Pelagibacter sp. HTCC7211; Chl.he, (YP_001996377) Chloroherpeton thalassium ATCC 35110.

[29–31]. A mix of both unmethylated and methylated LCPA could be synthesized by the suite of AdoMetDC-aminopropyltransferase fusion enzymes encoded in each diatom genome, and the presence of the signal peptide in each protein indicates that they are physically separated from normal polyamine biosynthesis within the cell, probably in the silica deposition vesicle. Unusually, the SET domain-containing fusion proteins must bind two molecules of S-adenosylmethionine at separate sites within the fusion protein, one in the SET domain and the other in the AdoMetDC domain. The P. tricornutum AdoMetDC-aminopropyltransferase fusion

[Figure 3 tree, leaf labels in the order shown; an asterisk marks experimentally validated enzymes.]

Aureococcus anophagefferens (jgi|Auran1|59112) Chromalveolata, Stramenopiles, Pelagophyceae
Ectocarpus siliculosus (CBJ29508) Chromalveolata, Stramenopiles
Blastocystis hominis (CBK21746) Chromalveolata, Stramenopiles
Phytophthora sojae (jgi|Physo1_1|109524) Chromalveolata, Chromista, Heterokonta, Oomycetes
Phytophthora infestans T30-4 (XP_002901681) Chromalveolata, Stramenopiles, Oomycetes
Albugo laibachii Nc14 (CCA19405) Chromalveolata, Stramenopiles, Oomycetes
Emiliania huxleyi (jgi|Emihu1|461445) Chromalveolata, Chromista, Haptophyta
FraSDS2 jgi|Fracy1|225230|fgenesh2_pm.5_#_121 (295 a.a.)
PhaSDS2 jgi|Phatr2|53420|phatr1_ua_pm.chr_11000034 (327 a.a.)
Micromonas pusilla (EEH54321) Archaeplastidia, Chlorophyta, Prasinophyceae
Ostreococcus lucimarinus (XP_001420452) Archaeplastidia, Chlorophyta, Prasinophyceae
Chlamydomonas reinhardtii (XP_001702843) Archaeplastidia, Chlorophyta, Chlorophyceae
Chlorella vulgaris (jgi|Chlvu1|37758) Archaeplastidia, Chlorophyta, Trebouxiophyceae
*Arabidopsis thaliana (NP_568785) Archaeplastidia, Streptophyta, Eudicotyledons (SMS)
*Arabidopsis thaliana (CAB64644) Archaeplastidia, Streptophyta, Eudicotyledons (SDS1)
ThaSDS1 jgi|Thaps3|30691|estExt_Genewise1.C_chr_220141 (299 a.a.)
PhaSDS1 jgi|Phatr2|23974|estExt_gwp_gw1.C_chr_280140 (299 a.a.)
FraSDS1 jgi|Fracy1|172086|estExt_Genewise1.C_120103 (303 a.a.)
Naegleria gruberi (jgi|Naegr1|55563) Excavata, Heterolobosea SDS1
Naegleria gruberi (jgi|Naegr1|34946) Excavata, Heterolobosea SDS2
Perkinsus marinus (EER15002) Chromalveolata, Alveolata, Dinozoa, Perkinsea
*Plasmodium falciparum (CAB71155) Chromalveolata, Alveolata, Apicomplexa (SDS)
*Leishmania major (XP_888543) Excavata, Euglenozoa, Kinetoplastids (SDS)
*Trypanosoma brucei (XP_827124) Excavata, Euglenozoa, Kinetoplastids (SDS)
FraSDS3 (signal peptide) jgi|Fracy1|180248|e_gw1.2.1857.1 (351 a.a.)
ThaSDS2 (signal peptide) jgi|Thaps3|256152|thaps1_ua_pm.chr_3000120 (363 a.a.)
*Bacillus subtilis (NP_391630) Bacteria, Firmicutes (SDS)
*Thermotoga maritima (Q9WZC2) Bacteria, Thermotogales (SDS)
*Escherichia coli K12 (NP_414663) Bacteria, γ-Proteobacteria (SDS)
*Pyrococcus furiosus (NP_577856) Archaea, Euryarchaeota, Thermococci (AAPT)
Phytophthora sojae (jgi|Physo1_1|145276) Chromalveolata, Chromista, Heterokonta, Oomycetes
Phytophthora infestans T30-4 (XP_002896487) Chromalveolata, Stramenopiles, Oomycetes
Albugo laibachii Nc14 (CCA23198) Chromalveolata, Stramenopiles, Oomycetes
Ectocarpus siliculosus (CBJ31336) Chromalveolata, Stramenopiles, Phaeophyceae
Aureococcus anophagefferens (jgi|Auran1|31657) Chromalveolata, Stramenopiles, Pelagophyceae
FraTSMS jgi|Fracy1|206948|estExt_Genewise1Plus.C_40229 (312 a.a.)
PhaTSMS jgi|Phatr2|51460|phatr1_ua_kg.chr_1000138 (347 a.a.)
*ThaTSMS jgi|Thaps3|269901|estExt_thaps1_ua_kg.C_chr_170029 (291 a.a.)
Perkinsus marinus (EER03316) Chromalveolata, Alveolata, Dinozoa, Perkinsea
Ostreococcus lucimarinus (XP_001418368) Archaeplastidia, Chlorophyta, Prasinophyceae
Micromonas pusilla (EEH56215) Archaeplastidia, Chlorophyta, Prasinophyceae
Chlamydomonas reinhardtii (XP_001696651) Archaeplastidia, Chlorophyta, Chlorophyceae
Chlorella vulgaris (jgi|Chlvu1|40225) Archaeplastidia, Chlorophyta, Trebouxiophyceae
Emiliania huxleyi (jgi|Emihu1|430738) Chromalveolata, Chromista, Haptophyta
*Arabidopsis thaliana (AAF01311) Archaeplastidia, Streptophyta, Eudicotyledons (TSMS)
Caldivirga maquilingensis (YP_001540925) Archaea, Crenarchaeota
*Thermus thermophilus (YP_004447) Bacteria, Thermus-Deinococcus (AAPT)
Thermofilum pendens (YP_920782) Archaea, Crenarchaeota
FraAPT1 jgi|Fracy1|234978|fgenesh2_pg.2_#_1086 (871 a.a.)
FraAPT2 jgi|Fracy1|234451|fgenesh2_pg.2_#_559 (881 a.a.)
FraAPT3 jgi|Fracy1|234961|fgenesh2_pg.2_#_1069 (664 a.a.)
PhaAPT1 jgi|Phatr2|34890|fgenesh1_pg.C_chr_6000263 (870 a.a.)
PhaAPT3 jgi|Phatr2|39424|fgenesh1_pg.C_chr_19000181 (877 a.a.)
ThaAPT1 jgi|Thaps3|5581|fgenesh1_pg.C_chr_5000623 (903 a.a.)
ThaAPT4 jgi|Thaps3|1987|fgenesh1_pg.C_chr_1001121 (946 a.a.)
ThaAPT2 jgi|Thaps3|25666|estExt_fgenesh1_pg.C_chr_200201 (704 a.a.)
ThaAPT3 jgi|Thaps3|1032|fgenesh1_pg.C_chr_1000166 (901 a.a.)
PhaAPT2 jgi|Phatr2|32063|fgenesh1_pg.C_chr_1000664 (748 a.a.)
FraAPT4 jgi|Fracy1|259790|estExt_fgenesh2_pg.C_20809 (758 a.a.)
Thermoanaerobacter tengcongensis (NP_623338) Bacteria, Firmicutes
Cyanidioschyzon merolae (CMR256C) Archaeplastidia, Rhodophyta
Trichoplax adhaerens (XP_002117001) Metazoa, Radiata, Placozoa
Strongylocentrotus purpuratus (XP_789223) Metazoa, Bilateria, Deuterostomia
*Homo sapiens (NP_004586) Metazoa, Bilateria, Deuterostomia, Chordata (SMS)
Gallus gallus (NP_001025974) Metazoa, Bilateria, Deuterostomia, Chordata
Danio rerio (NP_571831) Metazoa, Bilateria, Deuterostomia, Chordata
Monosiga brevicollis (jgi|Monbr1|30201) Choanozoa, Choanoflagellatea
Drosophila melanogaster (NP_729798) Metazoa, Bilateria, Ecdysozoa, Arthropoda
Anopheles gambiae (XP_315341) Metazoa, Bilateria, Ecdysozoa, Arthropoda
Apis mellifera (XP_393567) Metazoa, Bilateria, Ecdysozoa, Arthropoda
FraAPT5 jgi|Fracy1|251263|fgenesh2_pg.35_#_8 (1027 a.a.)
ThaAPT6 jgi|Thaps3|9980|fgenesh1_pg.C_chr_14000325 (987 a.a.)
PhaAPT4 gi|219110941|ref|XP_002177222.1 (1270 a.a.)
FraAPT6 jgi|Fracy1|263348|estExt_fgenesh2_pg.C_140358 (1535 a.a.)
ThaAPT5 gi|223996511|ref|XP_002287929.1 (970 a.a.)
*Homo sapiens (NP_003123) Metazoa, Bilateria, Deuterostomia, Chordata (SDS)
Danio rerio (NP_957328) Metazoa, Bilateria, Deuterostomia, Chordata
Trichoplax adhaerens (XP_002113833) Metazoa, Radiata, Placozoa
Strongylocentrotus purpuratus (XP_796573) Metazoa, Bilateria, Deuterostomia
Monosiga brevicollis (XP_001744025) Choanozoa, Choanoflagellatea
Hydra magnipapillata (XP_002167122) Metazoa, Radiata, Cnidaria
*Caenorhabditis elegans (CAC37332) Metazoa, Bilateria, Ecdysozoa (SDS)
*Dictyostelium discoideum (XP_647009) Amoebozoa, Conosa, Mycetozoa (SDS)
Ustilago maydis (XP_761965) Fungi, Basidiomycota, Ustilaginomycetes
*Cryptococcus neoformans (AAS48112) Fungi, Basidiomycota, Hymenomycetes (SDS)
*Aspergillus nidulans (AAL11443) Fungi, Ascomycota, Euascomycetes (SDS)
*Saccharomyces cerevisiae (AAC17191) Fungi, Ascomycota, Hemiascomycetes (SDS)
Schizosaccharomyces pombe (NP_596015) Fungi, Ascomycota, Archiascomycetes
*Saccharomyces cerevisiae (ACC19368) Fungi, Ascomycota, Hemiascomycetes (SMS)
Cyanidioschyzon merolae (CMR329C) Archaeplastidia, Rhodophyta

Fig. 3. Neighbour Joining Tree of aminopropyltransferase domains including all diatom aminopropyltransferase sequences. The unrooted tree is shown with percentage bootstrap support from 200 replicates. Diatom aminopropyltransferase sequences are shown in green boxes. Aminopropyltransferases in coloured type, preceeded by a red asterisk, have been experimentally validated. SDS, spermidine synthase (blue type); SMS, spermine synthase (red type); TSMS, thermospermine synthase (orange type); AAPT, agmatine aminopropyltransferase (green type). The DoE JGI protein model accessions are presented for each diatom sequence. A.J. Michael / FEBS Letters 585 (2011) 2627–2634 2633 proteins do not possess SET domains, which suggests that LCPA in in silaffins. Although spermidine synthase can aminopropylate this species may not be methylated, similar to the unmethylated decarboxylated lysine (cadaverine) to form aminopropylcadaver- LCPA of another diatom, Coscinodiscus granii [13]. It is not clear in ine, the fusion enzyme would have to also recognise and bind general how aminopropyl groups would be iteratively added to the silaffin protein, and then transfer an aminopropyl group or the growing LCPA. The intermediate LCPA structures may diffuse LCPA to the aminobutyl moiety of a lysine residue in the silaffin. from the active site and subsequently bind again in a different po- There is a precedent for such behaviour in polyamine metabolism. sition to allow aminopropyl group addition, or the intermediate Deoxyhypusine synthase binds to translation initiation factor LCPA molecules may shift without leaving the active site. It is also eIF5A and transfers an aminobutyl group from spermidine to a ly- not clear whether methylation of secondary amines would be coor- sine residue in eIF5A. 
However, deoxyhypusine synthase can also dinated with each iterative extension of the LCPA or whether bind putrescine instead of eIF5A, and the transfer of an aminobutyl methylation occurs on the fully extended LPCA. Another enigma group from spermidine to putrescine produces sym-homospermi- is how the length of the LCPA would be determined. It is possible dine [36]. In a similar manner, it is possible that the diatom sper- that the suite of AdoMetDC-aminopropyltransferases in a given midine synthase-like/dead AdoMetDC fusion proteins could bind species work in concert, extending different sections of the LCPA silaffins. The spermidine synthase-like domains could transfer depending on how much of the growing LPCA can be accommo- LCPA or aminopropyl groups to silaffin lysine residues, and the dated in the active sites. SET domains could N-methylate the aminopropyl units.

3.3. Silaffin modification 3.4. Precursors and co-products

The silaffin proteins of diatoms are important components of the biosilification process. Some lysine residues of silaffin proteins are dimethylated and, additionally, multi-aminopropyl group polyamines are found on other lysine residues in silaffins, and may be N-dimethylated on the primary and secondary amines [2,32]. Another group of fusion proteins can be discerned in the diatom genomes (Fig. 2A), containing spermidine synthase-like, rather than thermospermine synthase-like, aminopropyltransferase domains (FraAPT5, FraAPT6, PhaAPT4, ThaAPT5 and ThaAPT6). The side group of lysine is an aminobutyl group, and spermidine/spermine synthases transfer an aminopropyl group to the primary amine of an aminobutyl moiety. Each of these fusion proteins possesses an N-terminal signal peptide and also a degraded, catalytically dead bacterial Class 1b AdoMetDC domain immediately downstream of the spermidine synthase-like domain. These degraded AdoMetDC domains lack most of the catalytic and processing residues (Fig. 2B). The T. pseudonana protein ThaAPT6 is the simplest of these fusion proteins, possessing the signal peptide, spermidine synthase-like domain and dead AdoMetDC domain. Metazoan spermine synthase contains a catalytically dead bacterial Class 1b AdoMetDC domain at the N-terminus, which, although inactive as an AdoMetDC, is nevertheless essential for activity of the aminopropyltransferase domain due to its role in dimer formation [16,33]. Two fusion proteins, FraAPT5 and PhaAPT4, contain an additional N-terminal SET domain, and FraAPT6 contains a C-terminal Tudor domain (Fig. 2A). Tudor domains bind methylated histones (methylated on lysine positions) and some Tudor domains are found in lysine-specific demethylases [34]. It is intriguing that lysine-specific histone demethylases such as LSD1 [35] are homologous to the polyamine catabolic enzyme spermine oxidase. The fusion protein ThaAPT5 contains a C-terminal ShKT domain, although the significance of this domain, found in toxin proteins, is unknown. The spermidine synthase-like domains of ThaAPT5, ThaAPT6, FraAPT5, FraAPT6 and PhaAPT4 contain several insertions relative to normal spermidine synthases.

One role of the spermidine synthase-like/dead AdoMetDC fusion proteins may be to bind silaffins and transfer methylated LCPA or aminopropyl groups to specific lysine residues in the silaffin proteins. The presence of a Tudor domain in FraAPT6 could facilitate binding of silaffins already possessing dimethylated lysine residues. Aminopropyltransferase domains of these fusion proteins are closely related, suggesting a unique origin for this class of diatom fusion protein (Fig. 3). A number of stand-alone SET domain proteins, containing signal peptides, are found in the diatom genomes (examples from F. cylindrus are listed in Table S1). Stand-alone SET domain proteins could feasibly dimethylate lysine residues in the silaffin proteins. The spermidine synthase-like domain could transfer LCPA or aminopropyl groups to other lysine residues.

Diatom LCPA may contain a putrescine moiety, which would have to be transported into the silica deposition vesicle, but sometimes contain only aminopropyl moieties, indicating that 1,3-diaminopropane was the initial substrate for aminopropyl group transfer. There is no obvious biosynthetic explanation for 1,3-diaminopropane formation in diatoms. However, spermidine oxidation by a plant-like polyamine oxidase [37] could produce aminobutyraldehyde and 1,3-diaminopropane as catabolic products. The Acanthamoeba culbertsoni polyamine oxidase acts on N8-acetylspermidine to produce large quantities of 1,3-diaminopropane in this organism [38] and, if equivalent activity is present in diatoms, the 1,3-diaminopropane could serve as the initial substrate for the AdoMetDC/thermospermine synthase-like fusion proteins. A co-product of aminopropyltransferases is methylthioadenosine (MTA), a potent inhibitor of methyltransferases and aminopropyltransferases. The methionine salvage pathway, which rescues/detoxifies MTA, is initiated by MTA/S-adenosylhomocysteine nucleosidase [39]. An orthologue of MTA/S-adenosylhomocysteine nucleosidase is present in each of the diatom genomes and, unusually, there is a signal peptide present in each orthologue (Table S1). Furthermore, the co-product of N-methyl transfer, S-adenosylhomocysteine, is also salvaged by the MTA/S-adenosylmethionine homocysteine pathway. Salvage of methionine in LCPA biosynthesis is likely to be critically important because a single LCPA with 20 methylated aminopropyl units would produce 20 molecules of MTA, 20 of S-adenosylhomocysteine and consume 40 of S-adenosylmethionine.

4. Conclusions

Each of the two classes of polyamine biosynthetic gene fusions found in diatoms has evolved from a single distinct progenitor fusion gene of bacterial origin. It is likely that the fusion genes were present in diatoms before the evolution of biosilification in these organisms, and that new domains, including the N-methyltransferase domains, were recruited to the fusions as the biosilification process evolved. There are no orthologues of these fusion proteins in other Stramenopiles including Oomycetes, which are known to have acquired bacterial genes by horizontal gene transfer and which have a large complement of fusion proteins [40]. A concomitant recruitment of methionine salvage enzymes to LCPA production prevented methylthioadenosine/S-adenosylhomocysteine accumulation to toxic levels and avoided massive depletion of S-adenosylmethionine in the cells. It is possible that similar but independently acquired fusion genes exist in other organisms that use LCPA for biosilification, such as glass sponges [41].
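The cofactor budget quoted in the text for a single LCPA (20 methylated aminopropyl units giving 20 MTA, 20 S-adenosylhomocysteine and a cost of 40 S-adenosylmethionine) can be reproduced with a short sketch. It is illustrative only and rests on two assumptions that the quoted totals imply but the text does not spell out: each aminopropyl unit costs one S-adenosylmethionine (via decarboxylated S-adenosylmethionine) and releases one MTA, and each unit carries one methyl group, costing one further S-adenosylmethionine and releasing one S-adenosylhomocysteine. The function name and the `methyls_per_unit` parameter are hypothetical.

```python
def lcpa_cofactor_budget(units: int, methyls_per_unit: int = 1) -> dict:
    """Tally cofactor turnover for one long chain polyamine (LCPA).

    Assumptions (not stated explicitly in the paper):
      - one aminopropyl transfer consumes one AdoMet and yields one MTA;
      - each N-methylation consumes one AdoMet and yields one
        S-adenosylhomocysteine (SAH);
      - methyls_per_unit = 1 matches the totals quoted in the text.
    """
    if units < 0 or methyls_per_unit < 0:
        raise ValueError("counts must be non-negative")
    mta = units                      # one MTA per aminopropyl group transferred
    sah = units * methyls_per_unit   # one SAH per methyl group transferred
    return {"MTA": mta, "SAH": sah, "AdoMet": mta + sah}


# The example given in the text: 20 methylated aminopropyl units.
print(lcpa_cofactor_budget(20))  # {'MTA': 20, 'SAH': 20, 'AdoMet': 40}
```

Setting `methyls_per_unit=2` would model fully N-dimethylated units under the same assumptions; that case is not quantified in the text.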

Acknowledgement

This work was supported by the University of Texas Southwestern Medical Center.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.febslet.2011.07.038.

References

[1] Armbrust, E.V. (2009) The life of diatoms in the world’s oceans. Nature 459, 185–192.
[2] Sumper, M., Hett, R., Lehmann, G. and Wenzl, S. (2007) A code for lysine modifications of a silica biomineralizing silaffin protein. Angew. Chem. Int. Ed. Engl. 46, 8405–8408.
[3] Kroger, N., Deutzmann, R. and Sumper, M. (1999) Polycationic peptides from diatom biosilica that direct silica nanosphere formation. Science 286, 1129–1132.
[4] Kroger, N., Lorenz, S., Brunner, E. and Sumper, M. (2002) Self-assembly of highly phosphorylated silaffins and their function in biosilica morphogenesis. Science 298, 584–586.
[5] Poulsen, N. and Kroger, N. (2004) Silica morphogenesis by alternative processing of silaffins in the diatom Thalassiosira pseudonana. J. Biol. Chem. 279, 42993–42999.
[6] Kroger, N. and Wetherbee, R. (2000) Pleuralins are involved in theca differentiation in the diatom Cylindrotheca fusiformis. Protist 151, 263–273.
[7] Sumper, M., Brunner, E. and Lehmann, G. (2005) Biomineralization in diatoms: characterization of novel polyamines associated with silica. FEBS Lett. 579, 3765–3769.
[8] Sumper, M. and Lehmann, G. (2006) Silica pattern formation in diatoms: species-specific polyamine biosynthesis. Chembiochem 7, 1419–1427.
[9] Sumper, M., Lorenz, S. and Brunner, E. (2003) Biomimetic control of size in the polyamine-directed formation of silica nanospheres. Angew. Chem. Int. Ed. Engl. 42, 5192–5195.
[10] Sumper, M. (2002) A phase separation model for the nanopatterning of diatom biosilica. Science 295, 2430–2433.
[11] Scheffel, A., Poulsen, N., Shian, S. and Kroger, N. (2011) Nanopatterned protein microrings from a diatom that direct silica morphogenesis. Proc. Natl. Acad. Sci. USA 108, 3175–3180.
[12] Kroger, N., Deutzmann, R., Bergsdorf, C. and Sumper, M. (2000) Species-specific polyamines from diatoms control silica morphology. Proc. Natl. Acad. Sci. USA 97, 14133–14138.
[13] Sumper, M. and Brunner, E. (2008) Silica biomineralization in diatoms: the model organism Thalassiosira pseudonana. Chembiochem 9, 1187–1194.
[14] Pegg, A.E. (2009) S-Adenosylmethionine decarboxylase. Essays Biochem. 46, 25–45.
[15] Wu, H., Min, J., Ikeguchi, Y., Zeng, H., Dong, A., Loppnau, P., Pegg, A.E. and Plotnikov, A.N. (2007) Structure and mechanism of spermidine synthases. Biochemistry 46, 8331–8339.
[16] Pegg, A.E. and Michael, A.J. (2010) Spermine synthase. Cell Mol. Life Sci. 67, 113–121.
[17] Knott, J.M., Romer, P. and Sumper, M. (2007) Putative spermine synthases from Thalassiosira pseudonana and Arabidopsis thaliana synthesize thermospermine rather than spermine. FEBS Lett. 581, 3081–3086.
[18] Pegg, A.E. (2006) Regulation of ornithine decarboxylase. J. Biol. Chem. 281, 14529–14532.
[19] Igarashi, K. and Kashiwagi, K. (2010) Modulation of cellular function by polyamines. Int. J. Biochem. Cell Biol. 42, 39–51.
[20] Kahana, C. (2009) Regulation of cellular polyamine levels and cellular proliferation by antizyme and antizyme inhibitor. Essays Biochem. 46, 47–61.
[21] Ivanov, I.P., Atkins, J.F. and Michael, A.J. (2010) A profusion of upstream open reading frame mechanisms in polyamine-responsive translational regulation. Nucleic Acids Res. 38, 353–359.
[22] Lee, J., Michael, A.J., Martynowski, D., Goldsmith, E.J. and Phillips, M.A. (2007) diversity and the structural basis of substrate specificity in the beta/alpha-barrel fold basic amino acid decarboxylases. J. Biol. Chem. 282, 27115–27125.
[23] Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F. and Higgins, D.G. (1997) The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882.
[24] Swofford, D.L. (2000) PAUP*: Phylogenetic Analysis Using Parsimony (and Other Methods). Sinauer Associates, Inc., Sunderland, MA.
[25] Kahana, C. (2009) Antizyme and antizyme inhibitor, a regulatory tango. Cell Mol. Life Sci. 66, 2479–2488.
[26] Romer, P., Faltermeier, A., Mertins, V., Gedrange, T., Mai, R. and Proff, P. (2008) Investigations about N-aminopropyl probably involved in biomineralization. J. Physiol. Pharmacol. 59 Suppl 5, 27–37.
[27] Bale, S. and Ealick, S.E. (2010) Structural biology of S-adenosylmethionine decarboxylase. Amino Acids 38, 451–460.
[28] Green, R., Hanfrey, C.C., Elliott, K.A., McCloskey, D.E., Wang, X., Kanugula, S., Pegg, A.E. and Michael, A.J. (2011) Independent evolutionary origins of functional polyamine biosynthetic enzyme fusions catalysing de novo diamine to triamine formation. Mol. Microbiol. doi:10.1111/j.1365-2958.2011.07757.x.
[29] Dillon, S.C., Zhang, X., Trievel, R.C. and Cheng, X. (2005) The SET-domain : protein lysine methyltransferases. Genome Biol. 6, 227.
[30] Qian, C. and Zhou, M.M. (2006) SET domain protein lysine methyltransferases: structure, specificity and catalysis. Cell Mol. Life Sci. 63, 2755–2763.
[31] Yeates, T.O. (2002) Structures of SET domain proteins: protein lysine methyltransferases make their mark. Cell 111, 5–7.
[32] Kroger, N., Deutzmann, R. and Sumper, M. (2001) Silica-precipitating peptides from diatoms. The chemical structure of silaffin-A from Cylindrotheca fusiformis. J. Biol. Chem. 276, 26066–26070.
[33] Wu, H. et al. (2008) Crystal structure of human spermine synthase: implications of substrate binding and catalytic mechanism. J. Biol. Chem. 283, 16135–16146.
[34] Lasko, P. (2010) Tudor domain. Curr. Biol. 20, R666–R667.
[35] Shi, Y., Lan, F., Matson, C., Mulligan, P., Whetstine, J.R., Cole, P.A. and Casero, R.A. (2004) Histone demethylation mediated by the nuclear amine oxidase homolog LSD1. Cell 119, 941–953.
[36] Ober, D. and Hartmann, T. (1999) Homospermidine synthase, the first pathway-specific enzyme of pyrrolizidine alkaloid biosynthesis, evolved from deoxyhypusine synthase. Proc. Natl. Acad. Sci. USA 96, 14777–14782.
[37] Sebela, M., Radova, A., Angelini, R., Tavladoraki, P., Frebort, I.I. and Pec, P. (2001) FAD-containing polyamine oxidases: a timely challenge for researchers in biochemistry and physiology of plants. Plant Sci. 160, 197–207.
[38] Shukla, O.P., Aisien, S.O., Bergmann, B., Hellmund, C. and Walter, R.D. (1996) Identification of the polyamine N8-acetyltransferase involved in the pathway of 1,3-diaminopropane production in Acanthamoeba culbertsoni. Parasitol. Res. 82, 270–272.
[39] Albers, E. (2009) Metabolic characteristics and importance of the universal methionine salvage pathway recycling methionine from 5′-methylthioadenosine. IUBMB Life 61, 1132–1142.
[40] Morris, P.F., Schlosser, L.R., Onasch, K.D., Wittenschlaeger, T., Austin, R. and Provart, N. (2009) Multiple horizontal gene transfer events and domain fusions have created novel regulatory and metabolic networks in the oomycete genome. PLoS One 4, e6133.
[41] Matsunaga, S., Sakai, R., Jimbo, M. and Kamiya, H. (2007) Long-chain polyamines (LCPAs) from marine sponge: possible implication in spicule formation. Chembiochem 8, 1729–1735.