Molecular Phylogenetics and Evolution 68 (2013) 327–339


Phylogenetic- and genome-derived insight into the evolution of N-glycosylation in Archaea

Lina Kaminski a, Mor N. Lurie-Weinberger b, Thorsten Allers c, Uri Gophna b, Jerry Eichler a,⇑

a Department of Life Sciences, Ben Gurion University, Beersheva 84105, Israel
b Department of Molecular Microbiology and Biotechnology, Tel Aviv University, Tel Aviv 69978, Israel
c School of Biology, University of Nottingham, Nottingham NG7 2UH, UK

Article history:
Received 25 January 2013
Revised 23 March 2013
Accepted 26 March 2013
Available online 6 April 2013

Keywords:
Archaea
N-glycosylation
Oligosaccharyltransferase

Abstract

N-glycosylation, the covalent attachment of oligosaccharides to target protein Asn residues, is a post-translational modification that occurs in all three domains of life. In Archaea, the N-linked glycans that decorate experimentally characterized glycoproteins reveal a diversity in composition and content unequaled by their bacterial or eukaryal counterparts. At the same time, relatively little is known of archaeal N-glycosylation pathways outside of a handful of model strains. To gain insight into the distribution and evolutionary history of the archaeal version of this universal protein-processing event, 168 archaeal genome sequences were scanned for the presence of aglB, encoding the known archaeal oligosaccharyltransferase, an enzyme key to N-glycosylation. Such analysis predicts the presence of AglB in 166 species, with some species seemingly containing multiple versions of the protein. Phylogenetic analysis reveals that the events leading to aglB duplication occurred at various points during archaeal evolution. In many cases, aglB is found as part of a cluster of putative N-glycosylation genes. The presence, arrangement and nucleotide composition of genes in aglB-based clusters in five species of the halophilic archaeon Haloferax point to lateral gene transfer as contributing to the evolution of archaeal N-glycosylation.

© 2013 Elsevier Inc. All rights reserved.

1. Introduction

Originally thought to be a process restricted to Eukarya, it is now clear that Bacteria and Archaea can also modify proteins via the addition of oligosaccharides to selected Asn residues, i.e. perform N-glycosylation (Calo et al., 2010; Nothaft and Szymanski, 2010; Larkin and Imperiali, 2011; Eichler, 2013). At present, understanding of archaeal N-glycosylation lags behind that of the parallel process in Eukarya and Bacteria. Nonetheless, analysis of even a limited number of archaeal glycoproteins has made it clear that archaeal N-linked glycans show a diversity of content and structure that is not seen elsewhere (Schwarz and Aebi, 2011; Eichler, 2013). To date, N-linked glycans decorating glycoproteins or reporter peptides from Archaeoglobus fulgidus, Halobacterium salinarum, Haloferax volcanii, Methanococcus maripaludis, Methanococcus voltae, Methanothermus fervidus, Pyrococcus furiosus, Sulfolobus acidocaldarius and Thermoplasma acidophilum have been characterized (Wieland et al., 1983; Lechner et al., 1985; Kärcher et al., 1993; Zähringer et al., 2000; Voisin et al., 2005; Abu-Qarn et al., 2007; Igura et al., 2008; Kelly et al., 2009; Peyfoon et al., 2010; Ng et al., 2011; Matsumoto et al., 2012; Vinogradov et al., 2012). Apart from the two similar Methanococcus N-linked glycans, distinct protein-bound oligosaccharides are seen in each species. Moreover, in both Halobacterium salinarum and Haloferax volcanii, the S-layer glycoprotein is simultaneously modified by two distinct N-linked glycans (Wieland et al., 1983; Lechner et al., 1985; Guan et al., 2012).

Given the species-specific profile of N-linked glycans that decorate archaeal glycoproteins, pathways of oligosaccharide assembly unique to a given archaeon likely exist. Indeed, in the four Agl (archaeal glycosylation) pathways responsible for N-glycosylation studied to date, namely those of the halophile Hfx. volcanii, the methanogens M. voltae and M. maripaludis, and the thermoacidophile S. acidocaldarius (Calo et al., 2010; Jarrell et al., 2010; Albers and Meyer, 2011; Eichler, 2013), few common components are seen. The oligosaccharyltransferase (OST) AglB, responsible for delivery of the assembled glycan and its precursors from a phosphorylated dolichol lipid carrier to target protein Asn residues, is, however, present in each pathway. With this in mind, Magidovich and Eichler (2009) relied on the presence of aglB, encoding the only OST currently identified in Archaea, to predict the existence of an N-glycosylation pathway in 54 of the 56 species for which complete genome sequence information was available at the time.

⇑ Corresponding author. Address: Department of Life Sciences, Ben Gurion University, P.O. Box 653, Beersheva 84105, Israel. Fax: +972 8647 9175.
E-mail address: [email protected] (J. Eichler).

1055-7903/$ - see front matter © 2013 Elsevier Inc. All rights reserved.
http://dx.doi.org/10.1016/j.ympev.2013.03.024

Today, the number of publicly available archaeal genome sequences, including those of several phylogenetically proximal species, is approaching 200. This wealth of genomic data lends itself to a detailed examination of archaeal N-glycosylation from an evolutionary perspective. Accordingly, the examination of putative N-glycosylation pathway components across genome lines reported here offers novel insight into the evolution of the archaeal version of this universal post-translational protein modification.

2. Materials and methods

2.1. Databases

The list of AglB proteins, identified as containing a multi-membrane-spanning N-terminal domain and a soluble C-terminal domain that includes the WWDYG consensus motif implicated in OST function across evolution (Yan and Lennarz, 2002; Maita et al., 2010; Lizak et al., 2011), was obtained by scanning the following: GT family 66 at the Carbohydrate-Active Enzymes database (http://www.cazy.org), the Integrated Microbial Genomes – Genome Encyclopedia of Bacteria and Archaea Genomes (IMG/GEBA) (http://img.jgi.doe.gov/cgi-bin/w/main.cgi), using the term 'EC 2.4.1.119' as query, and the NCBI Protein Database (http://www.ncbi.nlm.nih.gov/protein) sites, using the terms 'Stt3' or 'AglB' as query. These searches were complemented by manual searches of non-annotated proteins for the presence of WWDXG, a relaxed form of the WWDYG motif.

2.2. Phylogenetic analysis

The sequences of Haloferax AglB proteins were retrieved from the IMG/GEBA website utilizing the "Gene Neighborhood" function. Homologs were aligned using MUSCLE (Edgar, 2004). The Halorubrum lacusprofundi AglB sequence served as an out-group. The alignment was manually edited and ambiguously aligned positions were removed. The tree was then constructed utilizing the PhyML server (http://www.atgc-montpellier.fr/phyml/) (Guindon et al., 2010), using the JTT model + 4 gamma categories to approximate the different substitution rates among sites, an estimation of invariant sites, and 100 bootstrap trials. A neighbor-joining phylogenetic tree was generated from the list of euryarchaeal species containing more than one copy of AglB utilizing MEGA 5 software (Tamura et al., 2011). Homologs were aligned using ClustalW (Larkin et al., 2007). Robustness of the tree was assessed by a bootstrap test based on 500 pseudo-replicates. Bootstrap values are shown on the nodes of the tree where greater than 50%.

2.3. Calculation of the codon adaptation index (CAI) and effective number of codons (ENC)

Calculation of CAI (Sharp and Li, 1987) and ENC (Wright, 1990) for all clustered agl genes in five Haloferax species was performed utilizing Inca 2.0 (Supek and Vlahovicek, 2004). The CAI calculations required manual indication of highly expressed ribosomal protein-encoding genes, which were located relying on genomic annotations.

3. Results and discussion

3.1. In Archaea, the gene encoding AglB is almost universally detected

Given the central role played by AglB in archaeal N-glycosylation, 168 publicly available archaeal genomes were scanned for the presence of the encoding gene. In this manner, better insight into the prevalence, and by extension, the importance of N-glycosylation in Archaea, may be obtained. Of the 168 genomes considered, 166 contained AglB-encoding sequences (Table 1), including A. fulgidus AF0329, Hfx. volcanii HVO_1530, M. voltae MVO1749, M. maripaludis MMP1424 and P. furiosus PF0156, AglB proteins all experimentally demonstrated to possess OST activity (Chaban et al., 2006; Abu-Qarn et al., 2007; Igura et al., 2008; VanDyke et al., 2009; Matsumoto et al., 2012). In fact, AglB is predicted to exist in members of all five archaeal phyla, i.e. Crenarchaeota, Euryarchaeota, Korarchaeota, Nanoarchaeota and Thaumarchaeota, further pointing to N-glycosylation as being a common trait in Archaea. It should, however, be noted that in the vast majority of cases, neither N-glycosylation nor transcription of the predicted AglB-encoding gene has been confirmed. Finally, and as previously reported (Magidovich and Eichler, 2009), no aglB sequence was detected in either Aeropyrum pernix or Methanopyrus kandleri, suggesting that these species do not perform N-glycosylation. Alternatively, given that Aeropyrum pernix and Methanopyrus kandleri are characterized by an atypical gene content (Brochier et al., 2004), it is possible that a different, currently unrecognized OST mediates N-glycosylation in these species.

3.2. Multiple versions of AglB appeared throughout evolution

In 31 of the 113 euryarchaeal species considered, and in only 2 of the 55 non-euryarchaeal species addressed, two or more aglB sequences were identified. Of the 31 euryarchaeal species, 14 were methanogens (out of a total of 49 methanogens considered). In examining those methanoarchaeal species containing two or more copies of aglB, no common phenotypic trait, such as an ability to grow under a given condition, is apparent. On the other hand, in addressing thermo- and hyperthermophilic euryarchaea, two or more aglB sequences were identified in all nine Thermococcus species, in six of the seven Pyrococcus species considered, and in two of the three Archaeoglobus species examined. Yet, it is unlikely that multiplicity of AglB in a given species is related to an elevated optimal growth temperature, since of the 45 crenarchaeal species, all of which are thermo- or hyperthermophiles, only two, belonging to different genera, contain a pair of predicted AglB-encoding genes.

To gain insight into the evolutionary relationship of the multiple versions of AglB found in euryarchaeal species, phylogenetic analysis was performed (Fig. 1). The phylogenetic tree obtained assigned the multiple AglB proteins into two major groups, termed group A, including AglB sequences from the methanogens and A. fulgidus, and group B, including Pyrococcus and Thermococcus AglB sequences. Group A was in turn divided into two major clades (clades a and b) and an additional clade (clade c) containing the two Methanobacterium sp. AL-21 AglB sequences, while group B was divided into two major clades (clades d and e). A more comprehensive taxonomic representation of AglB phylogeny can be found in Fig. S1.

Closer analysis of the phylogenetic tree revealed a situation whereby the multiple versions of AglB encoded by certain species appeared at different times during evolution, rather than being the result of a single duplication event. For example, the two Methanocella arvoryzae AglB sequences found adjacent to each other in the genome likely appeared due to a recent gene duplication event. A third AglB sequence from this species is assigned to clade b, reflecting an earlier appearance of two versions of the protein (Fig. 1, arrows). This claim is supported by the fact that a similar pattern of AglB distribution is seen in other species from the same genus, i.e. Methanocella conradii and Methanocella paludicola. On the other hand, the pair of Methanoculleus marisnigri AglB proteins encoded by genes found at distant positions from each other in the

Table 1
Archaeal AglB homologs.

Genus (Phylumᵃ/Classᵇ/Orderᶜ/Familyᵈ) | Species | AglB
Acidilobus (C/Tp/Ac/Ac) | A. saccharovorans | ASAC_0278,0710
Caldisphaera (C/Tp/Ca/Ca) | C. lagunensis | Calag_0026,1336
Desulfurococcus (C/Tp/De/De) | D. amylolyticus | SphmelDRAFT_0373
| D. fermentans | Desfe_0211
| D. kamchatkensis | DKAM_0136
| D. mucosus | Desmu_0294
Ignicoccus (C/Tp/De/De) | I. hospitalis | Igni_0016
Ignisphaera (C/Tp/De/De) | I. aggregans | Igag_0094
Staphylothermus (C/Tp/De/De) | S. hellenicus | Shell_0596
| S. marinus | Smar_0223
Thermogladius (C/Tp/De/De) | T. cellulolyticus | TCELL_1363
Thermosphaera (C/Tp/De/De) | T. aggregans | Tagg_0313
Hyperthermus (C/Tp/De/Py) | H. butylicus | Hbut_1205
Pyrolobus (C/Tp/De/Py) | P. fumarii | Pyrfu_1528
Fervidicoccus (C/Tp/Fe/Fe) | F. fontis | FFONT_0123
Acidianus (C/Tp/Su/Su) | A. hospitalis | Ahos_1254
Metallosphaera (C/Tp/Su/Su) | M. cuprina | Mcup_0430
| M. sedula | Msed_1805
| M. yellowstonensis | MetMK1DRAFT_00024050
Sulfolobus (C/Tp/Su/Su) | S. acidocaldarius | Saci_1274
| S. islandicus HVE10/4 | SiH_1127
| S. islandicus L.D.8.5 | LD85_1283
| S. islandicus L.S.2.15 | LS215_1264
| S. islandicus M.14.25 | M1425_1167
| S. islandicus M.16.27 | M1627_1231
| S. islandicus M.16.4 | M164_1156
| S. islandicus REY15A | SiRe_1041
| S. islandicus Y.G.57.14 | YG5714_1163
| S. islandicus Y.N.15.51 | YN1551_1688
| S. solfataricus 98/2 | Ssol_2025
| S. solfataricus P2 | SSO1052
| S. tokodaii | ST0940
Thermofilum (C/Tp/Th/Thf) | T. pendens | Tpen_0640
Caldivirga (C/Tp/Th/Thp) | C. maquilingensis | Cmaq_0438
Pyrobaculum (C/Tp/Th/Thp) | P. aerophilum | PAE3030
| P. arsenaticum | Pars_1781
| P. calidifontis | Pcal_0997
| P. islandicum | Pisl_0431
| P. oguniense | Pogu_0350
| Pyrobaculum sp. 1860 | P186_1486
Thermoproteus (C/Tp/Th/Thp) | T. neutrophilus | Tneu_1689
| T. tenax | TTX_0519
| T. uzoniensis | TUZN_0151
Vulcanisaeta (C/Tp/Th/Thp) | V. distributa | Vdis_2064
| V. moutnovskia | VMUT_0472
Archaeoglobus (E/Ar/Ar/Ar) | A. fulgidus | AF0040,0329ᵉ,0380
| A. profundus | Arcpr_0726,1194
| A. veneficus | Arcve_0568
Ferroglobus (E/Ar/Ar/Ar) | F. placidus | Ferp_2437
Haladaptatus (E/H/H/H) | Hap. paucihalophilus | ZOD2009_20113
Halalkalicoccus (E/H/H/H) | Hac. jeotgali | HacjB3_10630
Haloarcula (E/H/H/H) | Har. californiae | HAH_00005860
| Har. hispanica | HAH_1202
| Har. marismortui | rrnAC0431
| Har. sinaiiensis | HAI_00022250
| Har. vallismortis | HAJ_00008880
| Haloarcula sp. AS7094 | pSCM201p1
Halobacterium (E/H/H/H) | Hbt. salinarum R1 | OE2548F
| Halobacterium sp. NRC-1 | VNG1068G
| Halobacterium sp. DL1 | HalDL1_1649
Halobiforma (E/H/H/H) | Hbf. lacisalsi | HlacAJ_010100009178
Haloferax (E/H/H/H) | Hfx. denitrificans | HAK_00032060
| Hfx. mediterranei | HFX_1592
| Hfx. mucosum | HAM_16650
| Hfx. sulfurifontas | HAN_00007740
| Hfx. volcanii | HVO_1530
Halogeometricum (E/H/H/H) | Hgm. borinquense | Hbor_17000
Halomicrobium (E/H/H/H) | Hmc. mukohataei | Hmuk_2752
Halopiger (E/H/H/H) | Hpg. xanaduensis | Halxa_2340
Haloquadratum (E/H/H/H) | Hqr. walsbyi C23 | Hqrw_3013
| Hqr. walsbyi DSM 16790 | HQ2681A
Halorhabdus (E/H/H/H) | Hrd. tiamatea | HLRTI_06344
| Hrd. utahensis | Huta_2808
Halorubrum (E/H/H/H) | Hrr. lacusprofundi | Hlac_1062
Haloterrigena (E/H/H/H) | Htg. turkmenica | Htur_2957
Natrialba (E/H/H/H) | Nab. magadii | Nmag_0927
Natrinema (E/H/H/H) | Nnm. pellirubrum | Natpe_0008
Natronobacterium (E/H/H/H) | Nbt. gregoryi | Natgr_1685
Natronomonas (E/H/H/H) | Nmn. pharaonis | NP3720A
Methanobacterium (E/Mtb/Mtb/Mtb) | Methanobacterium sp. AL-21 | Metbo_0534,0719
| Methanobacterium sp. SWAN-1 | MSWAN_1855
Methanobrevibacter (E/Mtb/Mtb/Mtb) | M. ruminantium | mru_0391
| M. smithii ATCC 35061 | Msm_0716
| M. smithii DSM 2374 | METSMIF1_02364
| M. smithii DSM 2375 | METSMIALI_01371
Methanosphaera (E/Mtb/Mtb/Mtb) | M. stadtmanae | Msp_0368
Methanothermobacter (E/Mtb/Mtb/Mtb) | M. marburgensis | MTBMA_c02090
| M. thermautotrophicus | MTH1623
Methanothermus (E/Mtb/Mtb/Mtm) | M. fervidus | Mfer_0177
Methanocaldococcus (E/Mtc/Mtc/Mtc) | M. fervens | Mefer_0590
| M. infernus | Metin_1222
| M. jannaschii | MJ1525
| Methanocaldococcus sp. FS406-22 | MFS40622_0538
| M. vulcanius | Metvu_0243
Methanotorris (E/Mtc/Mtc/Mtc) | M. formicicus | MetfoDRAFT_0029
| M. igneus | Metig_1797
Methanococcus (E/Mtc/Mtc/Mcc) | M. aeolicus | Maeo_1409
| M. maripaludis C5 | MmarC5_0154
| M. maripaludis C6 | MmarC6_1249
| M. maripaludis C7 | MmarC7_0669
| M. maripaludis S2 | MMP1424ᵉ
| M. maripaludis X1 | GYY_07955
| M. vannielii | Mevan_0735
| M. voltae A3 | MVO_1038
| M. voltae PS | MVO1749ᵉ
Methanothermococcus (E/Mtc/Mtc/Mcc) | M. okinawensis | Metok_0791
Methanocella (E/Mtm/Mtl/Mtl) | M. arvoryzae | LRC539,541,558
| M. conradii | Mtc_0182,0183,0205
| M. paludicola SANAE | MCP_2705,2723
Methanocorpusculum (E/Mtm/Mmb/Mcp) | M. labreanum | Mlab_0662
Methanoculleus (E/Mtm/Mmb/Mmc) | M. marisnigri | Memar_0175,2235
Methanofollis (E/Mtm/Mmb/Mmc) | M. liminatans | Metli_2406
Methanoplanus (E/Mtm/Mmb/Mmc) | M. limicola | Metlim_1216
| M. petrolearius | Mpet_0084,2443
Methanolinea (E/Mtm/Mmb/Mrg) | M. tarda | MettaDRAFT_0779
Methanoregula (E/Mtm/Mmb/Mrg) | M. boonei | Mboo_0249,1209
Methanosphaerula (E/Mtm/Mmb/Mrg) | M. palustris | Mpal_0785
Methanospirillum (E/Mtm/Mmb/Msp) | M. hungatei | Mhun_2859,3066,3149
Methanosaeta (E/Mtm/Msc/Msa) | M. concilii | MCON_1133,1444
| M. harundinacea | Mhar_0540,1091,1439,1730
| M. thermophila | Mthe_1164,1498
Methanococcoides (E/Mtm/Msc/Msr) | M. burtonii | Mbur_1579
Methanohalobium (E/Mtm/Msc/Msr) | M. evestigatum | Metev_1257
Methanohalophilus (E/Mtm/Msc/Msr) | M. mahii | Mmah_0123
Methanosalsum (E/Mtm/Msc/Msr) | M. zhilinae | Mzhil_1653
Methanosarcina (E/Mtm/Msc/Msr) | M. acetivorans | MA_1172,3752,3753,3754
| M. barkeri | Mbar_A0242,A0243,A0368
| M. mazei | MM_0646,0647,2210
Candidatus Haloredivivus (E/Nnh//) | Candidatus Haloredivivus sp. G17 | HRED_02810
Candidatus Nanosalina (E/Nnh//) | Candidatus Nanosalina sp. J07AB43 | J07AB43_03340
Candidatus Nanosalinarum (E/Nnh//) | Candidatus Nanosalinarum J07AB56 | J07AB56_11160
Pyrococcus (E/Tc/Tc/Tc) | P. abyssi | PAB0974,1586,2202
| P. furiosus COM1 | PFC_07420
| P. furiosus DSM 3638 | PF0156ᵉ,0411
| P. horikoshii | PH0242,1271
| P. yayanosii | PYCH_17920,19200
| Pyrococcus sp. NA2 | PNA2_0761,1113
| Pyrococcus sp. ST04 | Py04_0309,0456
Thermococcus (E/Tc/Tc/Tc) | T. barophilus | TERMP_00665,02078,02121
| T. gammatolerans | TGAM_0406,0937
| T. kodakarensis | TK0810,1718
| T. litoralis | OCC_09883,01289,05039
| T. onnurineus | TON_0775,1820
| T. sibiricus | TSIB_0007,0418
| Thermococcus sp. 4557 | GQS_05995,06090,01010
| Thermococcus sp. AM4 | TAM4_672,1026
| Thermococcus sp. CL1 | CL1_0839,0859,1904
Picrophilus (E/Tl/Tl/Pi) | P. torridus | PTO0786
Thermoplasma (E/Tl/Tl/Ts) | T. acidophilum | Ta1136
| T. volcanium | TVN1212
Aciduliprofundum (E///) | A. boonei | Aboo_0310
Candidatus Micrarchaeum (E///) | Candidatus M. acidiphilum | UNLARM2_0813
Candidatus Parvarchaeum (E///) | Candidatus P. acidiphilum | BJBARM4_0616
| Candidatus P. acidophilus | BJBARM5_0254
uncultured marine group II euryarchaeote | | MG2_1283
Candidatus Korarchaeum (K///) | Candidatus K. cryptofilum | Kcr_1056
Nanoarchaeum (N///) | N. equitans | NEQ155
Cenarchaeum (T//Ce/Ce) | C. symbiosum | CENSYa_1939
Candidatus Nitrosoarchaeum (T//Ni/Ni) | Candidatus N. koreensis | MY1_0015
| Candidatus N. limnia BG20 | CNitlB_010100007878
| Candidatus N. limnia SFB1 | Nlim_2107
Nitrosopumilus (T//Ni/Ni) | Candidatus N. salaria | BD31_I1640
| N. maritimus | Nmar_0075
| Nitrosopumilus sp. MY1 | MY1_0015
Candidatus Caldiarchaeum (T///) | Candidatus C. subterraneum | CSUB_C0660
unclassified Archaea | halophilic archaeon DL31 | Halar_1620

Sequences obtained from the Carbohydrate-Active enZYmes Database (http://www.cazy.org/Home.html) (August, 2012), the Integrated Microbial Genomes – Genome Encyclopedia of Bacteria and Archaea Genomes (IMG/GEBA) (http://img.jgi.doe.gov/cgi-bin/geba/main.cgi) (August, 2012), UCSC Archaeal Genome Browser (http://archaea.ucsc.edu/) (August, 2012) and the NCBI Protein Database (http://www.ncbi.nlm.nih.gov/protein) (August, 2012) sites.
Listed as AglB at CAZy glycosyltransferase group 66 (oligosaccharyltransferases) but lacking the WWDXG motif involved in oligosaccharyltransferase activity: HAH_0492, MSWAN_1515, MSWAN_1516, MTBMA_c4670, MTBMA_c04680, MTH420, MTH1898, MTH1906, Mfer_0275, Mfer_0623, Mhun_2859, Mhun_3066, Mhun_3149, Mthe_1548.
ᵃ Phylum: Crenarchaeota, C; Euryarchaeota, E; Korarchaeota, K; Nanoarchaeota, N; Thaumarchaeota, T.
ᵇ Class: Thermoprotei, Tp; Archaeoglobi, Ar; Halobacteria, H; Methanobacteria, Mtb; Methanococci, Mtc; Methanomicrobia, Mtm; Methanopyri, Mtp; Nanohaloarchaea, Nnh; Thermococci, Tc; Thermoplasmata, Tl.
ᶜ Order: Acidilobales, Ac; Caldisphaeraceae, Ca; Desulfurococcales, De; Fervidicoccales, Fe; Sulfolobales, Su; Thermoproteales, Th; Archaeoglobales, Ar; Halobacteriales, H; Methanobacteriales, Mtb; Methanococcales, Mtc; Methanocellales, Mtl; Methanomicrobiales, Mmb; Methanosarcinales, Msc; Thermococcales, Tc; Thermoplasmatales, Tl; Cenarchaeales, Ce; Nitrosopumilales, Ni; Nitrososphaerales, Nt.
ᵈ Family: Acidilobaceae, Ac; Caldisphaera, Ca; Desulfurococcaceae, De; Pyrodictiaceae, Py; Fervidicoccaceae, Fe; Sulfolobaceae, Su; Thermofilaceae, Thf; Thermoproteaceae, Thp; Archaeoglobaceae, Ar; Halobacteriaceae, H; Methanobacteriaceae, Mtb; Methanothermaceae, Mtm; Methanocaldococcaceae, Mtc; Methanococcaceae, Mcc; Methanocellaceae, Mtl; Methanocorpusculaceae, Mcp; Methanomicrobiaceae, Mmc; Methanoregulaceae, Mrg; Methanospirillaceae, Msp; Methanosaetaceae, Msa; Methanosarcinaceae, Msr; Thermococcaceae, Tc; Picrophilaceae, Pi; Thermoplasmataceae, Ts; Cenarchaeaceae, Ce; Nitrosopumilaceae, Ni; Nitrososphaeraceae, Nt.
ᵉ Experimentally verified to be an oligosaccharyltransferase.
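The AglB column of Table 1 uses a compact notation in which only the first locus tag of a cell carries the genome prefix; for example, the A. fulgidus AF0329 and P. furiosus PF0411 proteins named in the text appear in the table as later entries of "AF0040,0329,0380" and "PF0156,0411". The following minimal Python sketch, which assumes that this shared-prefix convention holds for every cell, expands such a cell into full locus tags and counts the predicted AglB copies per species. The function name `expand_locus_tags` and the four-entry `sample` dictionary are illustrative only (footnote markers are omitted from the sample cells); this is not the procedure used by the authors.

```python
import re


def expand_locus_tags(cell):
    """Expand a compact cell such as 'AF0040,0329,0380' into full locus tags.

    Assumption: entries after the first reuse the prefix of the first tag,
    taken as everything up to the last underscore or, when there is no
    underscore, the leading run of letters.
    """
    parts = [p.strip() for p in cell.split(",") if p.strip()]
    if not parts:
        return []
    first = parts[0]
    if "_" in first:
        prefix = first[: first.rfind("_") + 1]
    else:
        prefix = re.match(r"[A-Za-z]*", first).group(0)
    # Leave an entry alone if it already carries the prefix.
    return [first] + [p if p.startswith(prefix) else prefix + p for p in parts[1:]]


# A few cells copied from Table 1 (illustrative subset only).
sample = {
    "A. fulgidus": "AF0040,0329,0380",
    "Hfx. volcanii": "HVO_1530",
    "M. barkeri": "Mbar_A0242,A0243,A0368",
    "T. litoralis": "OCC_09883, 01289, 05039",
}

copies = {species: len(expand_locus_tags(cell)) for species, cell in sample.items()}
multi_copy = sorted(species for species, n in copies.items() if n >= 2)

print(copies)
print(multi_copy)
```

Applied to the whole table, a count of this kind would correspond to the tally of species predicted to encode two or more AglB sequences, provided every cell follows the same notation.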

genome appeared at yet another point during evolution and clustered with homologs from Methanoplanus petrolearius (Fig. 1, arrowheads). Species-specific evolutionary patterns were also seen for AglB sequences from non-methanogens. For instance, the three Thermococcus litoralis AglB sequences also appeared at distinct points during evolution (Fig. 1, diamonds). Indeed, two of these sequences cluster with their homologs from Thermococcus sibiricus and Thermococcus barophilus, while the third sequence clusters with AglB from Thermococcus kodakarensis.

3.3. AglB multiplicity in a given species may carry physiological significance

The presence of the multiple AglB sequences in a single species could be a reflection of differences in OST substrate or target preference, prevalence or availability, possibly as a function of local growth conditions. Accordingly, closer examination of the multiple versions of AglB in a given species reveals differences in the consensus WWDYG motif. It is conceivable that these modifications reflect different activities of the various versions of the protein. Accordingly, examination of the phylogenetic distribution of the distinct versions of AglB found in a single species often suggests that these proteins appeared early in evolution. AglB proteins from members of the Family Thermococcaceae (Group B) that can be distinguished on the basis of variability at the fourth position of the consensus WWDYG catalytic motif offer such examples. Within the Thermococcaceae, those AglB proteins in which Tyr is replaced with either His or Gln were all assigned to clade e (i.e., P. furiosus PF0411, Pyrococcus horikoshii PH1271, Thermococcus sp. 4557 GQS_05995 and GQS_06090, Thermococcus gammatolerans TGAM_0406, Thermococcus sp. AM4 TAM4_1026, Thermococcus sp. CL-1 CL1_0839 and CL1_0859 and Thermococcus onnurineus TON_1820). By contrast, AglB proteins from the same species containing Tyr at position four of this catalytic motif were all assigned to clade d. Similarly, the two Methanocella arvoryzae AglB proteins containing the WWDYG motif were assigned to clade a, while the third AglB, in which this motif was modified to WWDDG, was assigned to clade b. On the other hand, both AglB proteins from Methanoplanus petrolearius and from Methanobacterium sp. AL-21 were assigned to the same clade, despite presenting differences in this motif. Likewise, the Methanosaeta harundinacea AglB sequence where a modified WWDRG motif is found (Mhar_1439) is assigned to clade b, along with two of the three additional AglB proteins predicted in this organism, each of which contains a WWDYG motif at this position.

The existence of OSTs possessing unique specificities in a single organism offers a strategy for the addition of different N-linked glycans in a single species. This could be tested in the future in genetically tractable species containing multiple AglB proteins, such as Methanosarcina or Thermococcales species (Leigh et al., 2011), once proof of N-glycosylation in these species has been provided. However, while such differential N-glycosylation has been observed in Hbt. salinarum and Hfx. volcanii, where S-layer glycoproteins are simultaneously modified by two distinct N-linked glycans (Wieland et al., 1983; Lechner et al., 1985; Guan et al., 2012), each species encodes only a single AglB protein. Nonetheless, it remains possible that these species contain a second OST that can no longer be recognized as an AglB ortholog, or alternatively, that relies on a distinct catalytic mechanism. The fact that the two N-linked glycans in Hbt. salinarum contain different linking sugars implies the existence of two OSTs employing different mechanisms of catalysis.

3.4. Identification of putative aglB-based N-glycosylation loci

In Hfx. volcanii, one of the few archaeal species for which detailed information on N-glycosylation is available, all but one of the genes known to participate in the assembly and attachment

Fig. 1. Phylogenetic tree of euryarchaeal AglB sequences. An alignment of 77 AglB sequences from 31 euryarchaeal species containing more than one copy of AglB was used to construct a Neighbor-Joining tree. Robustness of the tree was assessed by a bootstrap test based on 500 pseudo-replicates. Bootstrap values are shown on the nodes of the tree where greater than 50%. Each entry lists the species followed by the genome-derived name of AglB, as indicated in Table 1. The limits of the different groups and clades are marked. The arrows indicate Methanocella arvoryzae AglB sequences, while the meanings of the arrowhead and diamond symbols are provided in the text. Those AglB sequences in which the Tyr of the consensus WWDYG motif is modified are indicated by the full circles.
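The full circles in Fig. 1 mark AglB sequences in which the Tyr of the consensus WWDYG motif is replaced. As a rough illustration of how such variants could be flagged computationally, the Python sketch below searches a protein sequence for the relaxed WWDXG pattern and reports the residue found at the fourth position. The three sequences are short made-up strings, not real AglB proteins, and the helper names `find_wwdxg` and `classify` are hypothetical; the original study relied on database queries complemented by manual searches, not on this code.

```python
import re

# Relaxed motif: Trp-Trp-Asp-X-Gly, capturing the residue at position four.
WWDXG = re.compile(r"WWD([A-Z])G")


def find_wwdxg(sequence):
    """Return (0-based start, motif, position-four residue) for each match."""
    return [
        (m.start(), m.group(0), m.group(1))
        for m in WWDXG.finditer(sequence.upper())
    ]


def classify(sequence):
    """Label a sequence by the kind of WWDXG motif it contains, if any."""
    hits = find_wwdxg(sequence)
    if not hits:
        return "no WWDXG motif"
    if any(residue == "Y" for _, _, residue in hits):
        return "canonical WWDYG"
    return "modified: " + ", ".join(motif for _, motif, _ in hits)


# Toy sequences invented for this example (not taken from any genome).
toy_sequences = {
    "toy_canonical": "MKTLLWWDYGHQAV",
    "toy_modified": "MSSAWWDHGYQLT",
    "toy_absent": "MKTAYLLGHQ",
}

for name, seq in toy_sequences.items():
    print(name, find_wwdxg(seq), classify(seq))
```

A pattern match of this kind only locates the motif; whether a protein is a genuine AglB homolog would still depend on the other criteria described in Section 2.1, such as the predicted domain organization.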

of a pentasaccharide to selected Asn residues of N-glycosylated proteins are sequestered within an aglB-containing gene cluster beginning at HVO_1517, encoding AglJ, and extending to HVO_1531, encoding AglM (Yurist-Doutsch and Eichler, 2009; Yurist-Doutsch et al., 2010). This gene cluster also includes aglP, aglQ, aglE, aglR, aglS, aglF, aglI and aglG. Only aglD, encoding the GT responsible for charging the dolichol phosphate carrier with the final sugar of the pentasaccharide (Abu-Qarn et al., 2007; Guan et al., 2010), mannose, is found outside this cluster. On the other hand, no N-glycosylation gene clusters (defined as containing aglB and at least three other putative N-glycosylation pathway component-encoding genes) are seen in M. voltae, M. maripaludis or S. acidocaldarius, other species where genes involved in N-glycosylation have also been identified (Chaban et al., 2006; Magidovich and Eichler, 2009; VanDyke et al., 2009; Meyer et al., 2011). Hence, to assess the prevalence of aglB-based N-glycosylation gene

Table 2 Clustering of putative N-glycosylation genes around archaeal aglB homologs.

Genus (Phylum/Class/Order/Familya) Species Members of putative N-glycosylation cluster Acidilobus (C/Tp/Ac/Ac) A. saccharovorans n.d. Desulfurococcus (C/Tp/De/De) D. fermentans n.d. D. kamchatkensis n.d. D. mucosus n.d. Ignicoccus (C/Tp/De/De) I. hospitalis n.d. Ignisphaera (C/Tp/De/De) I. aggregans n.d. Staphylothermus (C/Tp/De/De) S. hellenicus n.d. S. marinus n.d. Thermogladius (C/Tp/De/De) T. cellulolyticus n.d. Thermosphaera (C/Tp/De/De) T. aggregans n.d. Hyperthermus (C/Tp/De/Py) H. butylicus n.d. Pyrolobus (C/Tp/De/Py) P. fumarii n.d. Fervidicoccus (C/Tp/Fe/Fe) F. fontis n.d. Acidianus (C/Tp/Su/Su) A. hospitalis n.d. Metallosphaera (C/Tp/Su/Su) M. cuprina Mcup_0425,0426,0427,0430 M. sedula Msed_1805,1808,1809,1810,1811,1814,1816 M. yellowstonensis MetMK1DRAFT_00024050,00024120,00024130,00024150,00024220 Sulfolobus (C/Tp/Su/Su) S. acidocaldarius n.d. S. islandicus HVE10/4 n.d. S. islandicus L.D.8.5 n.d. S. islandicus L.S.2.15 n.d. S. islandicus M.14.25 n.d. S. islandicus M.16.27 n.d. S. islandicus M.16.4 n.d. S. islandicus REY15A n.d. S. islandicus Y.G.57.14 n.d. S. islandicus Y.N.15.51 n.d. S. solfataricus 98/2 n.d. S. solfataricus P2 n.d. S. tokodaii n.d. Thermofilum (C/Tp/Th/Thf) T. pendens n.d. Caldivirga (C/Tp/Th/Thp) C. maquilingensis n.d. Pyrobaculum (C/Tp/Th/Thp) P. aerophilum n.d. P. arsenaticum n.d. P. calidifontis n.d. P. islandicum n.d. P. oguniense n.d. Pyrobaculum sp. 1860 n.d. Thermoproteus (C/Tp/Th/Thp) T. neutrophilus n.d. T. tenax n.d. T. uzoniensis n.d. Vulcanisaeta (C/Tp/Th/Thp) V. distributa n.d. V. moutnovskia n.d. Archaeoglobus (E/Ar/Ar/Ar) A. fulgidus AF0035,0038,0039,0040,0043,0044,0045/0321,0322,0323a, 0323b,0324,032,0326,0327,0328,0329 A. profundus Arcpr_1194,1195,1196,1201,1202,1203,1204,1207,1214 A. veneficus Arcve_0544,0545,0546,0552,0556,055,0562,0566,0567,0568 Ferroglobus (E/Ar/Ar/Ar) F. placidus n.d. Haladaptatus (E/H/H/H) Hap. paucihalophilus ZOD2009_20058,20063,20073,20083,20098,20113 Halalkalicoccus (E/H/H/H) Hac. 
jeotgali HacjB3_10595,10600,10620,10625,10630 Haloarcula (E/H/H/H) Har. californiae HAH_00005730,00005780,00005820,00005840,00005850,00005860 Har. hispanica HAH_1202,1203,1206,1208,1210,1214 Har. marismortui rrnAC0419,0421,0427,0429,0430,0431 Har. sinaiiensis HAI_00022150,00022180,0002210,0002230,0002240,00022250 Har. vallismortis HAJ_00008880,00008890,00008900,00008920 Halobacterium (E/H/H/H) Hbt. salinarum R1 OE2524R,2528R,2529F,2530F,2535R,2537F,2546F,254,2548F Halobacterium sp. NRC-1 VNG1048G,1053G,1054G,1055G,1059C,1062G,1066C,1067G,1068G Halobacterium sp. DL1 HalDL1DRAFT_1630,1631,1632,1633,1634,1639,1640,1641,1642,1643,1644,1645, 1646,1647,1649 Halobiforma (E/H/H/H) Hbf. lacisalsi HlacAJ_010100009153,010100009163,010100009168,010100009173,010100009178 Haloferax (E/H/H/H) Hfx. denitrificans HAK_000032050,00032060,000032070,000032080,000032090, 000032110,000032120,000032130,000032150 Hfx. mediterranei HFX_1580,1581,1582,1587,1591,1592 Hfx. mucosum HAM_16650,16660,16700,16750,16760,16770 Hfx. sulfurifontas HAN_00007730,00007740,00007750,00007760,00007780,00007790,00007800, 00007810,00007850,00007860,00007870 Hfx. volcanii HVO_1517b,1522b,1523b,1523.1b,1524,1525b,1526b,1527b,1528b,1529b,1530,1531b Halogeometricum (E/H/H/H) Hgm. borinquense Hbor_16990,17000,17010,17020,17030,17040,17050,17060,17070,17100,17110, 17120,17130,17140,17180,17190,17200,17210 Halomicrobium (E/H/H/H) Hmc. mukohataei Hmuk_2752,2753,2754,2756,2757,2758, Halopiger (E/H/H/H) Hpg. xanaduensis Halxa_2340,2341,2342,2344,2348,2349,2351,2352,2355,2357,2538,2361,2368, 2369,2371,2372,2379,2380,2381 Haloquadratum (E/H/H/H) Hqr. walsbyi C23 Hqrw_3012,3013,3016,3017,3021,3023,3029,3036,3040,3043,3044,3045 Hqr. walsbyi DSM 16790 HQ2680A,2681A,2682A,2683A,2686A,2687A,2691A,2692A,2694A Halorhabdus (E/H/H/H) Hrd. tiamatea n.d.

(continued on next page) 334 L. Kaminski et al. / Molecular Phylogenetics and Evolution 68 (2013) 327–339

Table 2 (continued)

Genus (Phylum/Class/Order/Family^a) | Species | Members of putative N-glycosylation cluster
 | Hrd. utahensis | n.d.
Halorubrum (E/H/H/H) | Hrr. lacusprofundi | Hlac_1062,1063,1065,1067,1069,1071,1073,1074,1075
Haloterrigena (E/H/H/H) | Htg. turkmenica | Htur_2947,2949,2954,2955,2956,2957
Natrialba (E/H/H/H) | Nab. magadii | Nmag_0916,0917,0922,0924,0925,0926,0927
Natrinema (E/H/H/H) | Nnm. pellirubrum | NatpeDRAFT_0005,0006,0007,0008
Natronobacterium (E/H/H/H) | Nbt. gregoryi | NatgrDRAFT_1666,1669,1670,1675,1682,1683,1684,1685
Natronomonas (E/H/H/H) | Nmn. pharaonis | n.d.
Methanobacterium (E/Mtb/Mtb/Mtb) | Methanobacterium sp. AL-21 | Metbo_0719,0720,0721,0722,0723,0725,0726,0727,0729,0734
 | Methanobacterium sp. SWAN-1 | n.d.
Methanobrevibacter (E/Mtb/Mtb/Mtb) | M. ruminantium | n.d.
 | M. smithii ATCC 35061 | n.d.
 | M. smithii DSM 2374 | n.d.
 | M. smithii DSM 2375 | n.d.
Methanosphaera (E/Mtb/Mtb/Mtb) | M. stadtmanae | n.d.
Methanothermobacter (E/Mtb/Mtb/Mtb) | M. marburgensis | n.d.
 | M. thermautotrophicus | n.d.
Methanothermus (E/Mtb/Mtb/Mtm) | M. fervidus | n.d.
Methanocaldococcus (E/Mtc/Mtc/Mtc) | M. fervens | n.d.
 | M. infernus | n.d.
 | M. jannaschii | n.d.
 | Methanocaldococcus sp. FS406-22 | n.d.
 | M. vulcanius | n.d.
Methanotorris (E/Mtc/Mtc/Mtc) | M. formicicus | n.d.
 | M. igneus | n.d.
Methanococcus (E/Mtc/Mtc/Mcc) | M. aeolicus | n.d.
 | M. maripaludis C5 | n.d.
 | M. maripaludis C6 | n.d.
 | M. maripaludis C7 | n.d.
 | M. maripaludis S2 | n.d.
 | M. maripaludis X1 | n.d.
 | M. vannielii | n.d.
 | M. voltae | n.d.
 | M. voltae | n.d.
Methanothermococcus (E/Mtc/Mtc/Mcc) | M. okinawensis | n.d.
Methanocella (E, Mtm, Mtl, Mtl) | M. arvoryzae | LRC537,539,541,542,543,544,545,547,548,549,550,551,552,553,555,558
 | M. conradii | Mtc_0169,0171,0172,0182,0183,0186,0187,0188,0189,0190,0191,0193,0197,0198,0199,0201,0202,0203,0205,206
 | M. paludicola SANAE | MCP_2704,2705,2706,2707,2708,2709,2710,2711,2714,2715,2716,2717,2718,2719,2720,2723
Methanocorpusculum (E, Mtm, Mmb, Mcp) | M. labreanum | Mlab_0662,0663,664,665,666
Methanoculleus (E, Mtm, Mmb, Mmc) | M. marisnigri | Memar_0175,0183,0184,0185,0186,0187,0188,0189,0192
Methanofollis (E, Mtm, Mmb, Mmc) | M. liminatans | n.d.
Methanoplanus (E, Mtm, Mmb, Mmc) | M. limicola | n.d.
 | M. petrolearius | n.d.
Methanolinea (E, Mtm, Mmb, Mrg) | M. tarda | MettaDRAFT_0779,0781,0782,0783,0784,0785,0786,0787
Methanoregula (E, Mtm, Mmb, Mrg) | M. boonei | Mboo_0249,0250,0252,0253,0254,0255
Methanosphaerula (E, Mtm, Mmb, Mrg) | M. palustris | n.d.
Methanospirillum (E, Mtm, Mmb, Msp) | M. hungatei | Mhun_2852,2853,2854,2855,2856,2857,2858,2859/3065,3066,3067,3072,3073,3074,3075,3076,3077,3078,3079,3080,3084,3090/3138,3145,3147,3149,3151,3154,3161
Methanosaeta (E, Mtm, Msc, MSa) | M. concilii | n.d.
 | M. harundinacea | Mhar_1091,1093,1094,1095,1096,1097,1098,1099,1100,1101,1102,1103,1104,1106,1110
 | M. thermophila | n.d.
Methanococcoides (E, Mtm, Msc, MSr) | M. burtonii | Mbur_1579,1581,1582,1583,1584,1585,1586,1587,1590,1593,1594,1597,1603,1604,1605,1607,1608,1612,1613,1615,1617
Methanohalobium (E, Mtm, Msc, MSr) | M. evestigatum | Metev_1236,1237,1242,1244,1250,1252,1253,1254,1255,1257
Methanohalophilus (E, Mtm, Msc, MSr) | M. mahii | n.d.
Methanosalsum (E, Mtm, Msc, MSr) | M. zhilinae | Mzhil_1638,1639,1640,1641,1642,1643,1645,1648,1649,1651,1652,1653,1655,1656
Methanosarcina (E, Mtm, Msc, MSr) | M. acetivorans | MA_1172,1173,1174,1175,1176,1177,1179,1180,1181,1183,1184,1185,1186,1187/3752,3753,3754,3755,3756,3757,3758,3764,3766,3767,3769a,3769b,3777,3778,3779,3780,3781
 | M. barkeri | Mbar_A0229,A0230,A0231,A0232,A0233,A0234,A0235,A0236,A0237,A0238,A0239,A0240,A0241,A0242,A0243/A0366,A0368,A0369,A0373,A0374,A0375
 | M. mazei | MM_0646,0647,0648,0649,0650,0651,0652,0653,0654,656,657,658,659,660/2208,2210,2213,2214,2215,2216,2217,2221,2222,2223
Candidatus Haloredivivus (E/Nnh//) | Candidatus Haloredivivus sp. G17 | HRED_02640,02670,02680,02720,02810
Candidatus Nanosalina (E/Nnh//) | Candidatus Nanosalina sp. J07AB43 | J07AB43_03180,03190,03200,03210,03240,03270,03310,03320,03330,03340
Candidatus Nanosalinarum (E/Nnh//) | Candidatus Nanosalinarum J07AB56 | J07AB56_11160,11200,11210,11240,11250
Pyrococcus (E/Tc/Tc/Tc) | P. abyssi | PAB1411,1410,1409,0973,0974/0783,0784,0785,0787,0789,0790.1nn,0973,0795,0796,1587,1586

Table 2 (continued)

Genus (Phylum/Class/Order/Family^a) | Species | Members of putative N-glycosylation cluster
 | P. furiosus COM1 | n.d.
 | P. furiosus DSM 3638 | n.d.
 | P. horikoshii | n.d.
 | P. yayanosii | PYCH_17860,17870,17880,17900,17910,17920
 | Pyrococcus sp. NA2 | PNA2_1113,1114,1115,1120,1121
 | Pyrococcus sp. ST04 | Py04_0454,0455,0456,0457,0461,0465
Thermococcus (E/Tc/Tc/Tc) | T. barophilus | TERMP_02119,02121,02122,02123,02124,02129/2078,2079,2080,2084,2089,2091,2094,2096,2097,2099,2100
 | T. gammatolerans | n.d.
 | T. kodakarensis | TK1708,1711,1712,1713,1714,1715,1716,1717,1718,1719,1720,1721,1722,1723,1725,1731,1732,1733
 | T. litoralis | OCC_01289,01319,01324,01329,01334,01354/05039,05049,05054,05059,05064
 | T. onnurineus | TON_1818,1819,1820,1821,1822,1823
 | T. sibiricus | TSIB_2044,2045,2047,2048,2049,2050,2054,2059,2061,0003,0004,0005,0006,0007
 | Thermococcus sp. 4557 | GQS_05950,05955,05960,05975,05995,06000,0600/06075,06080,06085,06090
 | Thermococcus sp. AM4 | TAM4_1088,1094,1026,1040
 | Thermococcus sp. CL1 | CL1_0827,0828,0830,0834,0838,0839,0840,0841/0850,0856,0857,0859
Picrophilus (E/Tl/Tl/Pi) | P. torridus | n.d.
Thermoplasma (E/Tl/Tl/Ts) | T. acidophilum | n.d.
 | T. volcanium | n.d.
Aciduliprofundum (E///) | A. boonei | n.d.
Candidatus Micrarchaeum (E///) | Candidatus M. acidiphilum | n.d.
Candidatus Parvarchaeum (E///) | Candidatus P. acidiphilum | n.d.
 | Candidatus P. acidophilus | n.d.
 | uncultured marine group II euryarchaeote | n.d.
Candidatus Korarchaeum (K///) | Candidatus K. cryptofilum | n.d.
Nanoarchaeum (N///) | N. equitans | n.d.
Cenarchaeum (T//Ce/Ce) | C. symbiosum A | n.d.
Candidatus Nitrosoarchaeum (T//Ni/Ni) | Candidatus N. koreensis | n.d.
 | Candidatus N. limnia BG20 | n.d.
 | Candidatus N. limnia SFB1 | n.d.
Nitrosopumilus (T//Ni/Ni) | Candidatus N. salaria | n.d.
 | N. maritimus | n.d.
 | Nitrosopumilus sp. MY1 | n.d.
Candidatus Caldiarchaeum (T///) | Candidatus C. subterraneum | n.d.
unclassified Archaea | halophilic archaeon DL31 | Halar_1591,1600,1601,1610,1611,1612,1613,1615,1616,1620

Glycosylation-related annotation at the Integrated Microbial Genomes – Genome Encyclopedia of Bacteria and Archaea Genomes (IMG/GEBA) (http://img.jgi.doe.gov/cgi-bin/geba/main.cgi) (August, 2012), UCSC Archaeal Genome Browser (http://archaea.ucsc.edu/) (August, 2012) and the NCBI Protein Database (http://www.ncbi.nlm.nih.gov/protein) (August, 2012) sites. A cluster is defined as including AglB and at least 3 other putative proteins involved in glycosylation; AglB in bold. n.d., not detected.
a The abbreviations used for the different phyla, classes, orders, and families are provided in the legend to Table 1.
b Sequences other than AglB experimentally confirmed as participating in N-glycosylation.
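The cluster criterion in the legend above (AglB plus at least 3 other putative glycosylation proteins) can be sketched in code. The legend does not say how many unrelated genes may separate cluster members, so the `max_gap` parameter below is an assumption introduced only for illustration, as are the `Gene` record, the function name and the toy locus tags. This is a minimal sketch of one way to apply the criterion, not the procedure used to build Table 2, and it considers only the first aglB found.

```python
from typing import List, NamedTuple, Optional


class Gene(NamedTuple):
    locus: str             # locus tag (toy values below, not real annotations)
    glyco: bool            # annotated as glycosylation-related (hypothetical flag)
    is_aglb: bool = False  # annotated as encoding AglB


def aglb_cluster(genes: List[Gene], max_gap: int = 3,
                 min_partners: int = 3) -> Optional[List[str]]:
    """Return locus tags of an aglB-based cluster, or None if the criterion fails.

    genes must be in chromosomal order. max_gap (an assumption, not stated in
    the source) is the number of unrelated genes tolerated between two
    consecutive cluster members.
    """
    anchors = [i for i, g in enumerate(genes) if g.is_aglb]
    if not anchors:
        return None
    anchor = anchors[0]  # only the first aglB is considered in this sketch
    members = {anchor}
    for step in (-1, 1):  # walk upstream, then downstream
        last = anchor
        i = anchor + step
        while 0 <= i < len(genes) and abs(i - last) <= max_gap + 1:
            if genes[i].glyco or genes[i].is_aglb:
                members.add(i)
                last = i
            i += step
    if len(members) - 1 < min_partners:  # AglB itself is not a partner
        return None
    return [genes[i].locus for i in sorted(members)]


# Toy neighbourhood: g04 stands in for aglB; g14 lies beyond the allowed gap.
glyco_tags = {"g02", "g04", "g05", "g08", "g14"}
EXAMPLE = [Gene(f"g{n:02d}", f"g{n:02d}" in glyco_tags, n == 4)
           for n in range(1, 15)]
cluster = aglb_cluster(EXAMPLE)
```

With the toy data, `cluster` holds the tags of the AglB stand-in and its three nearby glycosylation-related neighbours; raising `min_partners` to 4 makes the same neighbourhood fail the criterion.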

clustering across the Archaea, those regions down- and upstream of genes annotated as encoding AglB were examined (Table 2).

In the 45 crenarchaeal species considered, aglB-based gene clustering was only observed in the three species belonging to the genus Metallosphaera. Given the broad geographic distribution of Metallosphaera cuprina (sulfuric hot spring in Tengchong, Yunnan, China; Liu et al., 2011), Metallosphaera sedula (thermal pond, Pisciarelli Solfatara, Naples, Italy; Huber et al., 1989) and Metallosphaera yellowstonensis (acidic geothermal springs in Yellowstone National Park; Kozubal et al., 2008), it would appear that N-glycosylation gene clustering occurred prior to division of an ancestor into the three species. In the Euryarchaeota, aglB-based glycosylation gene clustering was detected in 26 of the 29 available haloarchaeal genomes, with only the two Halorhabdus species and Natronomonas pharaonis not presenting such an arrangement. This is not unexpected, since gene clusters appear to be better conserved in haloarchaea than in other archaeal groups (Berthon et al., 2008). In the 49 methanoarchaeal species examined, aglB-based glycosylation gene clustering was observed largely along genus lines, with some genera in a given family displaying aglB-based glycosylation gene clustering and others in the same family not. Indeed, even within a given methanoarchaeal genus, only some species presented such gene clustering. In the thermo- and hyperthermophilic euryarchaeota, aglB-based glycosylation gene clustering was seen in the three Archaeoglobus species, in all Thermococcus species apart from T. gammatolerans, and in four of the seven Pyrococcus species. No aglB-based clustering was seen in the Korarchaeota, Nanoarchaeota or Thaumarchaeota.

In most cases where aglB-based glycosylation gene clustering was observed, aglB itself corresponds to one edge of the cluster. In a limited number of cases, an additional glycosylation gene adjacent to aglB serves this role. Where multiple AglB sequences are found, some species presented each aglB in a glycosylation gene cluster, others organized only some of the multiple AglB sequences into such clusters, while yet other species containing multiple AglB sequences did not cluster N-glycosylation genes around aglB at all. In each of the three Methanocella species, the multiple versions of AglB were all found in a common gene cluster.

Finally, the distribution of genes known or believed to mediate N-glycosylation in other Archaea suggests that N-glycosylation gene clusters not anchored by aglB may also exist. For example, N-glycosylation roles have been demonstrated for the products of M. maripaludis MMP1079-MMP1088, while AglB is encoded by MMP1424 (Chaban et al., 2006, 2009; Shams-Eldin et al., 2008; Jones et al., 2012).

Fig. 2. Schematic representation of aglB-based gene clusters in five Haloferax species. The positions of agl genes in Hfx. volcanii and their homologs in Hfx. denitrificans, Hfx. mediterranei, Hfx. mucosum and Hfx. sulfurifontis are indicated, as are those of other glycosylation-related genes. The genes are arbitrarily drawn in terms of size. In those species where the orientation of the gene cluster is opposite to that in Hfx. volcanii, a double-headed arrow is found next to the species name. The legend describes the meaning of the coloring scheme employed.

3.5. Evolutionary insight into N-glycosylation in Haloferax species derived from gene cluster comparison and gene content

To demonstrate how the phenomenon of aglB-based glycosylation gene clustering can be used for making predictions related to N-glycosylation in a given species and to gain insight into the evolution of this post-translational modification, the five Haloferax species for which genomic information is presently available were considered. Geographically, the five species were isolated far from one another, with Hfx. denitrificans originally having been isolated from a saltern in California, USA (Tomlinson et al., 1986), Hfx. mediterranei from a saltern near Alicante, Spain (Rodriguez-Valera et al., 1980), Hfx. mucosum from Shark Bay, Australia (Allen et al., 2008), Hfx. sulfurifontis from a sulfur spring in Oklahoma, USA (Elshahed et al., 2004) and Hfx. volcanii from the Dead Sea (Mullakhanbhai and Larsen, 1975).

At present, the pathway of N-glycosylation has been delineated in Hfx. volcanii, based on a series of genetic and biochemical studies. In the Agl pathway in this species, aglJ, aglG, aglI and aglE encode GTs that sequentially add four nucleotide-activated sugars to a common dolichol phosphate carrier on the inner face of the plasma membrane (Abu-Qarn et al., 2008; Plavner and Eichler, 2008; Yurist-Doutsch et al., 2008; Guan et al., 2010; Kaminski et al., 2010). Once the tetrasaccharide-bearing dolichol phosphate has been translocated across the membrane, AglB delivers the glycan to target protein Asn residues (Abu-Qarn et al., 2007). At the same time, the only N-glycosylation pathway component encoded by a gene outside the aglB-based glycosylation gene cluster, AglD, adds the final sugar of the N-linked pentasaccharide, mannose, to a distinct dolichol phosphate (Abu-Qarn et al., 2007; Yurist-Doutsch and Eichler, 2009; Guan et al., 2010). Mannose-charged dolichol phosphate is 'flipped' across the membrane in a process involving AglR (Kaminski et al., 2012), at which point AglS delivers the mannose to the Asn-bound tetrasaccharide (Cohen-Rosenzweig et al., 2012). In addition, AglF, AglM and AglP serve various sugar-processing roles (Magidovich et al., 2010; Yurist-Doutsch et al., 2010).

Examination of the genomes of Hfx. denitrificans, Hfx. mediterranei, Hfx. mucosum and Hfx. sulfurifontis reveals aglB-based gene clusters containing homologs to many of the Hfx. volcanii agl genes (Fig. 2). For example, the Hfx. denitrificans aglB-based gene cluster is almost identical to its Hfx. volcanii counterpart, except for the presence of transposases in the latter. The Hfx. sulfurifontis aglB-based gene cluster also contains several homologs to Hfx. volcanii agl sequences, albeit differently arranged. In addition, the Hfx. sulfurifontis aglB-based gene cluster contains sequences encoding GTs and other sugar-processing proteins not found in the comparable Hfx. volcanii cluster. Thus, based on the composition of the Hfx. denitrificans aglB-based gene cluster, it can be predicted that N-linked glycans in this species will be highly similar if not identical to the N-linked pentasaccharide decorating glycoproteins in Hfx. volcanii. By the same reasoning, one would expect a somewhat different N-glycan in Hfx. sulfurifontis. In the identical aglB-based gene clusters seen in Hfx. mediterranei and Hfx. mucosum, not only are far fewer glycosylation-related genes observed, but the few homologs of Hfx. volcanii Agl protein-encoding genes are also distributed differently than in the Hfx. volcanii cluster. As such, the N-glycans predicted to decorate Hfx. mediterranei and Hfx. mucosum glycoproteins are expected to be identical to each other, yet to differ significantly from what is found in the other Haloferax strains.

Fig. 3. Phylogenetic tree of Haloferax AglB proteins. The phylogenetic relationships of AglB from Hfx. volcanii, Hfx. denitrificans, Hfx. mediterranei, Hfx. mucosum and Hfx. sulfurifontis are presented. The Hrr. lacusprofundi AglB sequence served as an outgroup. Numbers represent the percent of bootstrap support for each node.

The organization of aglB and other agl genes within each cluster offers evolutionary insight into the N-glycosylation process. The similarities of the aglB-based gene clusters in Hfx. volcanii, Hfx.

Table 3 G + C content of Haloferax agl genes.

Gene | Hfx. volcanii | Hfx. denitrificans | Hfx. sulfurifontis | Hfx. mediterranei | Hfx. mucosum
aglJ^a | 0.61 | 0.60 | 0.62 | 0.62 | 0.62
aglP | 0.45 | 0.44 | –^b | – | –
aglQ | 0.48 | 0.47 | – | – | –
aglE | 0.47 | 0.47 | 0.62 | – | –
aglR | 0.47 | 0.46 | 0.49 | 0.61 | 0.62
aglS | 0.41 | 0.41 | – | – | –
aglF | 0.55 | 0.54 | 0.54 | – | –
aglI | 0.58 | 0.58 | 0.51 | 0.64 | 0.64
aglG | 0.54 | 0.56 | – | – | –
aglB | 0.63 | 0.62 | 0.62 | 0.62 | 0.63
aglM | 0.66 | 0.65 | 0.67 | – | –
aglD | 0.70 | 0.71 | 0.71 | 0.64 | 0.67
Genome^c | 0.62 | 0.66 | 0.66 | 0.60 | 0.62

a Genes are listed in the order they appear in the Hfx. volcanii agl gene cluster, except for aglD, which is found elsewhere in the genome.
b Gene not detected.
c G + C content as listed at http://img.jgi.doe.gov/cgi-bin/w/main.cgi.
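The comparison that Table 3 supports, gene G + C content against the genomic value, can be sketched as follows. The per-gene numbers are copied from the Hfx. volcanii column of Table 3; the function names are illustrative. The section's "10%" criterion for A + T-rich genes is read here as a difference of at least 0.10 in G + C fraction (ten percentage points), which is an assumption, since a relative difference would be an equally possible reading.

```python
from typing import Dict, List


def gc_fraction(seq: str) -> float:
    """G + C fraction of a nucleotide sequence, ignoring non-ACGT symbols."""
    seq = seq.upper()
    canonical = sum(seq.count(base) for base in "ACGT")
    if canonical == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (seq.count("G") + seq.count("C")) / canonical


def at_rich_genes(gene_gc: Dict[str, float], genome_gc: float,
                  min_diff: float = 0.10) -> List[str]:
    """Genes whose G + C fraction is at least min_diff below the genome value.

    min_diff = 0.10 is an assumed reading of the 10% criterion in the text.
    """
    # round() guards against floating-point noise in the subtraction
    return [gene for gene, gc in gene_gc.items()
            if round(genome_gc - gc, 4) >= min_diff]


# Values copied from the Hfx. volcanii column of Table 3
GC_VOLCANII = {
    "aglJ": 0.61, "aglP": 0.45, "aglQ": 0.48, "aglE": 0.47,
    "aglR": 0.47, "aglS": 0.41, "aglF": 0.55, "aglI": 0.58,
    "aglG": 0.54, "aglB": 0.63, "aglM": 0.66, "aglD": 0.70,
}
GENOME_GC_VOLCANII = 0.62

flagged = at_rich_genes(GC_VOLCANII, GENOME_GC_VOLCANII)
```

Under this reading, the genes flagged for Hfx. volcanii are aglP, aglQ, aglE, aglR and aglS, while aglB and aglM stay close to the genomic value.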

Table 4 Codon usage bias in Haloferax agl genes.

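Gene | Metric | Hfx. volcanii | Hfx. denitrificans | Hfx. sulfurifontis | Hfx. mediterranei | Hfx. mucosum
aglJ^a | ENC^b | 40.89 | 41.82 | 38.74 | 35.79 | 32.89
aglJ | CAI^c | 0.506 | 0.4802 | 0.5435 | 0.6255 | 0.7186
aglP | ENC | 51.25 | 51.54 | –^d | – | –
aglP | CAI | 0.1748 | 0.1551 | – | – | –
aglQ | ENC | 56.37 | 55.82 | – | – | –
aglQ | CAI | 0.1718 | 0.1624 | – | – | –
aglE | ENC | 56.37 | 55.30 | 40.76 | – | –
aglE | CAI | 0.1831 | 0.1628 | 0.5159 | – | –
aglR | ENC | 57.21 | 55.91 | 56.22 | 39.60 | 41.30
aglR | CAI | 0.1806 | 0.1658 | 0.2015 | 0.5275 | 0.5337
aglS | ENC | 56.37 | 49.76 | – | – | –
aglS | CAI | 0.1718 | 0.1246 | – | – | –
aglF | ENC | 46.98 | 48.19 | 50.83 | – | –
aglF | CAI | 0.4565 | 0.4065 | 0.3378 | – | –
aglI | ENC | 47.67 | 46.94 | 56.74 | 34.16 | 33.05
aglI | CAI | 0.358 | 0.3436 | 0.2089 | 0.6864 | 0.7078
aglG | ENC | 43.91 | 42.63 | – | – | –
aglG | CAI | 0.4306 | 0.4613 | – | – | –
aglB | ENC | 38.28 | 38.84 | 38.01 | 37.72 | 34.97
aglB | CAI | 0.5495 | 0.5371 | 0.5328 | 0.6469 | 0.6948
aglM | ENC | 33.90 | 34.35 | 32.51 | – | –
aglM | CAI | 0.6683 | 0.61 | 0.675 | – | –
Genome | ENC | 33.78 | 34.17 | 34.33 | 42.41 | 40.09
Genome | CAI | 0.6585 | 0.6511 | 0.6494 | 0.5533 | 0.5974
s.d.^e | ENC | ±5.95 | ±6.64 | ±6.67 | ±7.06 | ±6.28
s.d. | CAI | ±0.1199 | ±0.1293 | ±0.1284 | ±0.1121 | ±0.0990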
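The footnotes to Table 4 mark values by how many standard deviations they lie from the genomic average (higher for ENC, lower for CAI). The banding itself can be recomputed from the Genome and s.d. rows, as in the sketch below. The ENC values are copied from the Hfx. volcanii column; the function name and the 0/1/2 return code are illustrative choices, and treating a value exactly one s.d. away as inside the first band is an assumption.

```python
from typing import Dict


def sd_band(value: float, genome_mean: float, sd: float,
            higher: bool = True) -> int:
    """Band of a value relative to the genomic mean.

    Returns 0 if within one s.d., 1 if between one and two s.d.,
    2 if more than two s.d. away, in the direction given by `higher`
    (True for ENC, False for CAI).
    """
    deviation = (value - genome_mean) if higher else (genome_mean - value)
    z = deviation / sd
    if z > 2:
        return 2
    if z >= 1:
        return 1
    return 0


# ENC values copied from the Hfx. volcanii column of Table 4
ENC_VOLCANII: Dict[str, float] = {
    "aglJ": 40.89, "aglP": 51.25, "aglQ": 56.37, "aglE": 56.37,
    "aglR": 57.21, "aglS": 56.37, "aglF": 46.98, "aglI": 47.67,
    "aglG": 43.91, "aglB": 38.28, "aglM": 33.90,
}
GENOME_ENC, SD_ENC = 33.78, 5.95

enc_bands = {gene: sd_band(enc, GENOME_ENC, SD_ENC)
             for gene, enc in ENC_VOLCANII.items()}
```

For the CAI rows the same function applies with `higher=False`, for example `sd_band(0.1718, 0.6585, 0.1199, higher=False)` for Hfx. volcanii aglS.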

a Genes are listed in the order they appear in the Hfx. volcanii agl gene cluster.
b ENC value – values between one and two standard deviations higher than the genomic average are in bold, while values more than two standard deviations higher than the genomic average are in bold and underlined.
c CAI value – values between one and two standard deviations lower than the genomic average are in bold, while values more than two standard deviations lower than the genomic average are in bold and underlined.
d Gene not detected.
e s.d. = standard deviation.

denitrificans and Hfx. sulfurifontis and in Hfx. mediterranei and Hfx. mucosum are in agreement with the grouping of these species into separate clades, based on genomic segment loss and gain studies (Lynch et al., 2012). AglB protein phylogeny (Fig. 3) is congruent with the Haloferax tree generated by this earlier study. At the same time, homologs of Hfx. volcanii aglD, the only component of the N-glycosylation pathway in this species found outside the aglB-based gene cluster, were detected in Hfx. denitrificans (HAK_00016980), Hfx. mediterranei (HAL_00010870), Hfx. mucosum (HAM_00012150) and Hfx. sulfurifontis (HAN_00024650). In each case, the identical six downstream and three upstream genes bordered this GT-encoding gene. However, in the case of the Hfx. volcanii aglD and its Hfx. denitrificans and Hfx. sulfurifontis homologs, this region was expanded to include the 9 upstream and the 28 downstream genes (not shown).

At the same time, variations in the composition of agl genes in the aglB-based clusters of the different Haloferax strains, together with the concept that differences in N-linked glycosylation could provide adaptive advantages, raise the possibility that lateral gene transfer (LGT) played a role in the evolution of AglB-based N-glycosylation in this genus. With the exception of Haloquadratum walsbyi, haloarchaea are characterized by a high genomic G + C content (typically 65% G + C) (Hartman et al., 2010). The G + C contents of aglB and aglM tend to resemble the genomic average. This is not the case for other agl genes, which, in several species, present a highly unusual base composition (Table 3). For instance, several of the clustered genes encoding Agl homologs in Hfx. volcanii, Hfx. denitrificans and Hfx. sulfurifontis are A + T-rich, relative to the rest of the genome (i.e. a difference of ≥10% from the genomic mean), indicative of fairly recent horizontal acquisition. Furthermore, these genes also have extremely high effective number of codons (ENC) values and low codon adaptation index (CAI) values, indicative of the use of codons that are rare in the respective genomes (Table 4). It is noteworthy that the CAI of aglB is substantially higher in Hfx. mediterranei and Hfx. mucosum than in the other species, implying higher levels of gene expression. Combined with the fact that the other N-glycosylation genes in these two genomes appear to be ancestral rather than recently acquired, it would appear that N-glycosylation is a more fundamental trait of Hfx. mediterranei and Hfx. mucosum. Additionally, there is high within-cluster variation in nucleotide composition in the agl clusters of Hfx. volcanii and Hfx. denitrificans, indicating that agl genes were recruited from different sources, in agreement with the protein-based phylogenies of these genes, which show conflicting evolutionary histories (Fig. S2).

4. Conclusions

In 1976, the Hbt. salinarum S-layer glycoprotein became the first non-eukaryal N-glycosylated protein reported (Mescher and Strominger, 1976). Over the next fifteen years, advances in describing both the structures of glycans N-linked to archaeal glycoproteins and archaeal N-glycosylation pathways were made. More recently, the availability of complete genome sequences, together with the development of appropriate molecular tools and techniques, led to renewed interest in this topic. In the last decade, considerable progress has been made in addressing genes and proteins involved in N-glycosylation in several species. Today, alongside such efforts being conducted at the molecular level, insight into the archaeal version of this universal post-translational modification can now be gleaned at the genome level.

As discussed here, virtually all Archaea encode components of an N-glycosylation pathway, pointing to such protein processing as being a common event in this life form. Moreover, even though relatively few examples have been experimentally characterized, it is already abundantly clear that archaeal N-glycosylation involves more variety in terms of sugars, glycan structures, and by extension, biosynthetic pathways than seen elsewhere. By focusing on a single component of the archaeal N-glycosylation pathway, it was shown that gene duplication and modification had occurred at numerous different points during evolution. Furthermore, comparison of the organization and content of N-glycosylation genes in five members of the same genus revealed that substantial LGT had occurred over the course of time.

Despite advances made in deciphering pathways of archaeal N-glycosylation, numerous unanswered questions remain. For instance, one can ask what species-specific changes allow the archaeal oligosaccharyltransferase, AglB, to accommodate such a wide range of glycan structures. Do Archaea encountering similar environmental extremes decorate their proteins with similar N-linked glycans? How common is the ability to modify N-glycosylation in response to changing surroundings, a phenomenon recently observed in Hfx. volcanii? As new species appeared, did N-glycosylation change at the same rate? Finally, one can ask whether it will become possible to describe the composition of the N-linked glycans decorating archaeal glycoproteins based on their glycosylation gene content. Examining archaeal N-glycosylation from the genomic perspective will help address these and elucidate other facets of the archaeal version of this universal protein-processing event.

Acknowledgments

The authors thank Sam Haldenby for his early contributions to this project. JE is supported by grants from the Israel Science Foundation (30/07) and the US Army Research Office (W911NF-11-1-520). UG is supported by grants from the Israel Science Foundation (201/12) and the German-Israeli Project Cooperation (DIP). TA has been supported by a Royal Society University Research Fellowship. LK is the recipient of a Negev-Zin Associates Scholarship.

Appendix A. Supplementary material

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.ympev.2013.03.024.

References

Abu-Qarn, M., Yurist-Doutsch, S., Giordano, A., Trauner, A., Morris, H.R., Hitchen, P., Medalia, O., Dell, A., Eichler, J., 2007. Haloferax volcanii AglB and AglD are involved in N-glycosylation of the S-layer glycoprotein and proper assembly of the surface layer. J. Mol. Biol. 374, 1224–1236.
Abu-Qarn, M., Giordano, A., Battaglia, F., Trauner, A., Hitchen, P.G., Morris, H.R., Dell, A., Eichler, J., 2008. Identification of AglE, a second glycosyltransferase involved in N-glycosylation of the Haloferax volcanii S-layer glycoprotein. J. Bacteriol. 190, 3140–3146.
Albers, S.V., Meyer, B.H., 2011. The archaeal cell envelope. Nat. Rev. Microbiol. 9, 414–426.
Allen, M.A., Goh, F., Leuko, S., Echigo, A., Mizuki, T., Usami, R., Kamekura, M., Neilan, B.A., Burns, B.P., 2008. Haloferax elongans sp. nov. and Haloferax mucosum sp. nov., isolated from microbial mats from Hamelin Pool, Shark Bay, Australia. Int. J. Syst. Evol. Microbiol. 58, 798–802.
Berthon, J., Cortez, D., Forterre, P., 2008. Genomic context analysis in Archaea suggests previously unrecognized links between DNA replication and translation. Genome Biol. 9, R71.
Brochier, C., Forterre, P., Gribaldo, S., 2004. Archaeal phylogeny based on proteins of the transcription and translation machineries: tackling the Methanopyrus kandleri paradox. Genome Biol. 5, R17.
Calo, D., Kaminski, L., Eichler, J., 2010. Protein glycosylation in Archaea: sweet and extreme. Glycobiology 20, 1065–1076.
Chaban, B., Voisin, S., Kelly, J., Logan, S.M., Jarrell, K.F., 2006. Identification of genes involved in the biosynthesis and attachment of Methanococcus voltae N-linked glycans: insight into N-linked glycosylation pathways in Archaea. Mol. Microbiol. 61, 259–268.
Chaban, B., Logan, S.M., Kelly, J.F., Jarrell, K.F., 2009. AglC and AglK are involved in biosynthesis and attachment of diacetylated glucuronic acid to the N-glycan in Methanococcus voltae. J. Bacteriol. 191, 187–195.
Cohen-Rosenzweig, C., Yurist-Doutsch, S., Eichler, J., 2012. AglS, a novel component of the Haloferax volcanii N-glycosylation pathway, is a dolichol phosphate-mannose mannosyltransferase. J. Bacteriol. 194, 6909–6916.
Edgar, R.C., 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucl. Acids Res. 32, 1792–1797.
Eichler, J., 2013. Extreme sweetness: protein glycosylation in Archaea. Nat. Rev. Microbiol. 11, 151–156.
Elshahed, M.S., Savage, K.N., Oren, A., Gutierrez, M.C., Ventosa, A., Krumholz, L.R., 2004. Haloferax sulfurifontis sp. nov., a halophilic archaeon isolated from a sulfide- and sulfur-rich spring. Int. J. Syst. Evol. Microbiol. 54, 2275–2279.
Guan, Z., Naparstek, S., Kaminski, L., Konrad, Z., Eichler, J., 2010. Distinct glycan-charged phosphodolichol carriers are required for the assembly of the pentasaccharide N-linked to the Haloferax volcanii S-layer glycoprotein. Mol. Microbiol. 78, 1294–1303.
Guan, Z., Naparstek, S., Calo, D., Eichler, J., 2012. Protein glycosylation as an adaptive response in Archaea: growth at different salt concentrations leads to alterations in Haloferax volcanii S-layer glycoprotein N-glycosylation. Environ. Microbiol. 14, 743–753.
Guindon, S., Dufayard, J.F., Lefort, V., Anisimova, M., Hordijk, W., Gascuel, O., 2010. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59, 307–321.
Hartman, A.L., Norais, C., Badger, J.H., Delmas, S., Haldenby, S., Madupu, R., Robinson, J., Khouri, H., Ren, Q., Lowe, T.M., Maupin-Furlow, J., Pohlschroder, M., Daniels, C., Pfeiffer, F., Allers, T., Eisen, J.A., 2010. The complete genome sequence of Haloferax volcanii DS2, a model archaeon. PLoS One 5, e9605.
Huber, G., Spinnler, C., Gambacorta, A., Stetter, K.O., 1989. Metallosphaera sedula gen. nov. and sp. nov. represents a new genus of aerobic, metal-mobilizing, thermoacidophilic archaebacteria. Syst. Appl. Microbiol. 12, 38–47.

Igura, M., Maita, N., Kamishikiryo, J., Yamada, M., Obita, T., Maenaka, K., Kohda, D., 2008. Structure-guided identification of a new catalytic motif of oligosaccharyltransferase. EMBO J. 27, 234–243.
Jarrell, K.F., Jones, G.M., Kandiba, L., Nair, D.B., Eichler, J., 2010. S-layer glycoproteins and flagellins: reporters of archaeal posttranslational modifications. Archaea. pii: 612948.
Jones, G.M., Wu, J., Ding, Y., Uchida, K., Aizawa, S.I., Robotham, A., Logan, S.M., Kelly, J., Jarrell, K.F., 2012. Identification of genes involved in the acetamidino group modification of the flagellin N-linked glycan of Methanococcus maripaludis. J. Bacteriol. 194, 2693–2702.
Kaminski, L., Abu-Qarn, M., Guan, Z., Naparstek, S., Ventura, V.V., Raetz, C.R., Hitchen, P.G., Dell, A., Eichler, J., 2010. AglJ adds the first sugar of the N-linked pentasaccharide decorating the Haloferax volcanii S-layer glycoprotein. J. Bacteriol. 192, 5572–5579.
Kaminski, L., Guan, Z., Abu-Qarn, M., Konrad, Z., Eichler, J., 2012. AglR is required for addition of the final mannose residue of the N-linked glycan decorating the Haloferax volcanii S-layer glycoprotein. Biochim. Biophys. Acta 1820, 1664–1670.
Kärcher, U., Schröder, H., Haslinger, E., Allmaier, G., Schreiner, R., Wieland, F., Haselbeck, A., König, H., 1993. Primary structure of the heterosaccharide of the surface glycoprotein of Methanothermus fervidus. J. Biol. Chem. 268, 26821–26826.
Kelly, J., Logan, S.M., Jarrell, K.F., VanDyke, D.J., Vinogradov, E., 2009. A novel N-linked flagellar glycan from Methanococcus maripaludis. Carbohydr. Res. 344, 648–653.
Kozubal, M., Macur, R.E., Korf, S., Taylor, W.P., Ackerman, G.G., Nagy, A., Inskeep, W.P., 2008. Isolation and distribution of a novel iron-oxidizing crenarchaeon from acidic geothermal springs in Yellowstone National Park. Appl. Environ. Microbiol. 74, 942–949.
Larkin, A., Imperiali, B., 2011. The expanding horizons of asparagine-linked glycosylation. Biochemistry 50, 4411–4426.
Larkin, M.A., Blackshields, G., Brown, N.P., Chenna, R., McGettigan, P.A., McWilliam, H., Valentin, F., Wallace, I.M., Wilm, A., Lopez, R., Thompson, J.D., Gibson, T.J., Higgins, D.G., 2007. Clustal W and Clustal X version 2.0. Bioinformatics 23, 2947–2948.
Lechner, J., Wieland, F., Sumper, M., 1985. Biosynthesis of sulfated saccharides N-glycosidically linked to the protein via glucose. Purification and identification of sulfated dolichyl monophosphoryl tetrasaccharides from halobacteria. J. Biol. Chem. 260, 860–866.
Leigh, J.A., Albers, S.V., Atomi, H., Allers, T., 2011. Model organisms for genetics in the domain Archaea: methanogens, halophiles, Thermococcales and Sulfolobales. FEMS Microbiol. Rev. 35, 577–608.
Liu, L.J., You, X.Y., Guo, X., Liu, S.J., Jiang, C.Y., 2011. Metallosphaera cuprina sp. nov., an acidothermophilic, metal-mobilizing archaeon. Int. J. Syst. Evol. Microbiol. 61, 2395–2400.
Lizak, C., Gerber, S., Numao, S., Aebi, M., Locher, K.P., 2011. X-ray structure of a bacterial oligosaccharyltransferase. Nature 474, 350–355.
Lynch, E.A., Langille, M.G., Darling, A.E., Wilbanks, E.G., Haltiner, C., Shao, K.S., Starr, M.O., Teiling, C., Harkins, T.T., Edwards, R.A., Eisen, J.A., Facciotti, M.T., 2012. Sequencing of seven haloarchaeal genomes reveals patterns of genomic flux. PLoS One 7, e41389.
Magidovich, H., Eichler, J., 2009. Glycosyltransferases and oligosaccharyltransferases in Archaea: putative components of the N-glycosylation pathway in the third domain of life. FEMS Microbiol. Lett. 300, 122–130.
Magidovich, H., Yurist-Doutsch, S., Konrad, Z., Ventura, V.V., Dell, A., Hitchen, P.G., Eichler, J., 2010. AglP is a S-adenosyl-L-methionine-dependent methyltransferase that participates in the N-glycosylation pathway of Haloferax volcanii. Mol. Microbiol. 76, 190–199.
Maita, N., Nyirenda, J., Igura, M., Kamishikiryo, J., Kohda, D., 2010. Comparative structural biology of eubacterial and archaeal oligosaccharyltransferases. J. Biol. Chem. 285, 4941–4950.
Matsumoto, S., Igura, M., Nyirenda, J., Matsumoto, M., Yuzawa, S., Noda, N., Inagaki, F., Kohda, D., 2012. Crystal structure of the C-terminal globular domain of oligosaccharyltransferase from Archaeoglobus fulgidus at 1.75 Å resolution. Biochemistry 51, 4157–4166.
Mescher, M.F., Strominger, J.L., 1976. Purification and characterization of a prokaryotic glucoprotein from the cell envelope of Halobacterium salinarium. J. Biol. Chem. 251, 2005–2014.
Meyer, B.H., Zolghadr, B., Peyfoon, E., Pabst, M., Panico, M., Morris, H.R., Haslam, S.M., Messner, P., Schäffer, C., Dell, A., Albers, S.V., 2011. Sulfoquinovose synthase – an important enzyme in the N-glycosylation pathway of Sulfolobus acidocaldarius. Mol. Microbiol. 82, 1150–1163.
Mullakhanbhai, M.F., Larsen, H., 1975. Halobacterium volcanii spec. nov., a Dead Sea halobacterium with a moderate salt requirement. Arch. Microbiol. 104, 207–214.
Ng, S.Y., Wu, J., Nair, D.B., Logan, S.M., Robotham, A., Tessier, L., Kelly, J.F., Uchida, K., Aizawa, S., Jarrell, K.F., 2011. Genetic and mass spectrometry analyses of the unusual type IV-like pili of the archaeon Methanococcus maripaludis. J. Bacteriol. 193, 804–814.
Nothaft, H., Szymanski, C.M., 2010. Protein glycosylation in bacteria: sweeter than ever. Nat. Rev. Microbiol. 8, 765–778.
Peyfoon, E., Meyer, B., Hitchen, P.G., Panico, M., Morris, H.R., Haslam, S.M., Albers, S.V., Dell, A., 2010. The S-layer glycoprotein of the crenarchaeote Sulfolobus acidocaldarius is glycosylated at multiple sites with chitobiose-linked N-glycans. Archaea. pii: 754101.
Plavner, N., Eichler, J., 2008. Defining the topology of the N-glycosylation pathway in the halophilic archaeon Haloferax volcanii. J. Bacteriol. 190, 8045–8052.
Rodriguez-Valera, F., Ruiz-Berraquero, F., Ramos-Cormenzana, A., 1980. Isolation of extremely halophilic bacteria able to grow in defined inorganic media with single carbon sources. J. Gen. Microbiol. 119, 535–538.
Schwarz, F., Aebi, M., 2011. Mechanisms and principles of N-linked protein glycosylation. Curr. Opin. Struct. Biol. 21, 576–582.
Shams-Eldin, H., Chaban, B., Niehus, S., Schwarz, R.T., Jarrell, K.F., 2008. Identification of the archaeal alg7 gene homolog (encoding N-acetylglucosamine-1-phosphate transferase) of the N-linked glycosylation system by cross-domain complementation in Saccharomyces cerevisiae. J. Bacteriol. 190, 2217–2220.
Sharp, P.M., Li, W.H., 1987. The codon Adaptation Index – a measure of directional synonymous codon usage bias, and its potential applications. Nucl. Acids Res. 15, 1281–1295.
Supek, F., Vlahovicek, K., 2004. INCA: synonymous codon usage analysis and clustering by means of self-organizing map. Bioinformatics 20, 2329–2330.
Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M., Kumar, S., 2011. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 28, 2731–2739.
Tomlinson, G.A., Jahnke, L.L., Hochstein, L.I., 1986. Halobacterium denitrificans sp. nov., an extremely halophilic denitrifying bacterium. Int. J. Syst. Bacteriol. 36, 66–70.
VanDyke, D.J., Wu, J., Logan, S.M., Kelly, J.F., Mizuno, S., Aizawa, S., Jarrell, K.F., 2009. Identification of genes involved in the assembly and attachment of a novel flagellin N-linked tetrasaccharide important for motility in the archaeon Methanococcus maripaludis. Mol. Microbiol. 72, 633–644.
Vinogradov, E., Deschatelets, L., Lamoureux, M., Patel, G.B., Tremblay, T.L., Robotham, A., Goneau, M.F., Cummings-Lorbetskie, C., Watson, D.C., Brisson, J.R., Kelly, J.F., Gilbert, M., 2012. Cell surface glycoproteins from Thermoplasma acidophilum are modified with an N-linked glycan containing 6-C-sulfofucose. Glycobiology 22, 1256–1267.
Voisin, S., Houliston, R.S., Kelly, J., Brisson, J.R., Watson, D., Bardy, S.L., Jarrell, K.F., Logan, S.M., 2005. Identification and characterization of the unique N-linked glycan common to the flagellins and S-layer glycoprotein of Methanococcus voltae. J. Biol. Chem. 280, 16586–16593.
Wieland, F., Heitzer, R., Schaefer, W., 1983. Asparaginylglucose: novel type of carbohydrate linkage. Proc. Natl. Acad. Sci. USA 80, 5470–5474.
Wright, F., 1990. The 'effective number of codons' used in a gene. Gene 87, 23–29.
Yan, Q., Lennarz, W.J., 2002. Studies on the function of oligosaccharyl transferase subunits. Stt3p is directly involved in the glycosylation process. J. Biol. Chem. 277, 47692–47700.
Yurist-Doutsch, S., Eichler, J., 2009. Manual annotation, transcriptional analysis, and protein expression studies reveal novel genes in the agl cluster responsible for N glycosylation in the halophilic archaeon Haloferax volcanii. J. Bacteriol. 191, 3068–3075.
Yurist-Doutsch, S., Abu-Qarn, M., Battaglia, F., Morris, H.R., Hitchen, P.G., Dell, A., Eichler, J., 2008. AglF, aglG and aglI, novel members of a gene island involved in the N-glycosylation of the Haloferax volcanii S-layer glycoprotein. Mol. Microbiol. 69, 1234–1245.
Yurist-Doutsch, S., Magidovich, H., Ventura, V.V., Hitchen, P.G., Dell, A., Eichler, J., 2010. N-glycosylation in Archaea: on the coordinated actions of Haloferax volcanii AglF and AglM. Mol. Microbiol. 75, 1047–1058.
Zähringer, U., Moll, H., Hettmann, T., Knirel, Y.A., Schäfer, G., 2000. Cytochrome b558/566 from the archaeon Sulfolobus acidocaldarius has a unique Asn-linked highly branched hexasaccharide chain containing 6-sulfoquinovose. Eur. J. Biochem. 267, 4144–4149.