
Supplementary Materials:

Phylogenomics provides new insights into gains and losses of selenoproteins among

Hongping Liang1,2,3#, Tong Wei2,3,4#, Yan Xu1,2,3#, Linzhou Li2,3,6, Sunil Kumar Sahu2,3,4, Hongli Wang1,2,3, Haoyuan Li1,2, Xian Fu2,3, Gengyun Zhang2,4, Michael Melkonian7, Xin Liu2,3,4, Sibo Wang2,4,5*, Huan Liu2,4,5*

1 BGI Education Center, University of Chinese Academy of Sciences, Beijing, China.
2 BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China.
3 China National Gene Bank, Institute of New Agricultural Resources, BGI-Shenzhen, Jinsha Road, Shenzhen 518120, China.
4 State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen 518083, China.
5 Department of Biology, University of Copenhagen, Copenhagen, Denmark.
6 School of Biology and Biological Engineering, South China University of Technology, 510006, China.
7 Botanical Institute, Cologne Biocenter, University of Cologne, Cologne D-50674, Germany.
# These authors contributed equally to this work.
* Correspondence: *[email protected]

Supplementary Figure S1: Completeness of assemblies and stop codon statistics. Genome quality was assessed with the BUSCO program. The numbers of single, multiple, missing and fragmented genes are shown in the middle histogram. The stop codon usage of genes is shown in the right histogram. The species that lost the Sec machinery are marked with an asterisk in the tree. Each group is shown with a different background color in the left column.

[Figure S1. Columns: Group; Species; Numbers (BUSCO gene number: single, multiple, fragmented, missing); Stop Codon (gene stop codon: TGA, TAG, TAA; axis from 0% to 100%). Groups: Streptophyte, Chlorophyte, Rhodoplantae, Glaucoplantae. Legend entry: loss of Sec trait.]
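The stop codon statistics described in the caption of Figure S1 can be illustrated with a minimal sketch. It assumes that each gene is available as a coding sequence string whose last three bases are the stop codon; the function name `stop_codon_usage` and the toy sequences are hypothetical and are not taken from the authors' pipeline.

```python
from collections import Counter

# The three standard stop codons reported in Figure S1.
STOP_CODONS = ("TGA", "TAG", "TAA")


def stop_codon_usage(cds_sequences):
    """Return the fraction of sequences that end in each stop codon.

    Sequences whose last three bases are not a standard stop codon
    (for example truncated gene models) are ignored.
    """
    counts = Counter()
    for seq in cds_sequences:
        # Normalise case and allow RNA input (U instead of T).
        codon = seq[-3:].upper().replace("U", "T")
        if codon in STOP_CODONS:
            counts[codon] += 1
    total = sum(counts.values())
    if total == 0:
        return {codon: 0.0 for codon in STOP_CODONS}
    return {codon: counts[codon] / total for codon in STOP_CODONS}


# Toy coding sequences (hypothetical, for illustration only).
example_cds = ["ATGGCTTGA", "ATGAAATAA", "ATGCCCTGA", "ATGTTTTAG"]
usage = stop_codon_usage(example_cds)
```

Multiplying the returned fractions by 100 gives per-species percentages of the kind plotted on the 0% to 100% axis.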

Supplementary Figure S2: Species tree with families of selenoproteins and cysteine (Cys) homologues predicted by Selenoprofiles. Each column corresponds to a selenoprotein family, plotted in different colors to distinguish Sec-containing genes (green), Cys homologues (red), and others (grey). The 33 out of 62 selenoprotein families having at least one Sec-containing gene were selected for the analysis. The number in the left box represents the number of selenoproteins.

[Figure S2. Legend: Selenoprotein, Cys-homologs, Other homologs. Selenoprotein families (columns): GPx, AhpC, SelW, SelM, MSP, FrnE, SelT, TR, msra_b, EhSEP2, selenou, Ostsp3, SelH, Ostsp2, di_b, Sel15, ars_s, MsrA, SelK, prx, SelO, rhor, SelS, Ostsp1, fdha, SPS, SelR, gpx_b, prx_like, ugsc, gst, grx, rsam. Rows: species in tree order, with per-family gene counts.]

Supplementary Figure S3: The phylogenetic tree of SBP2, PSTK and SPS. All the predicted candidates were used to reconstruct a maximum-likelihood tree, with 500 bootstrap replicates. Here, Orange: ; Yellow: core ; Green: Streptophyte; Red: Rhodoplantae; and Grey: Glaucoplantae.

[Figure S3. Panels: a. SBP2; b. PSTK; c. SPS. Tip labels are species names; node labels are bootstrap support values. Tree scale: 1 in each panel.]

Supplementary Figure S4: The phylogenetic tree of SPS. A maximum-likelihood tree was reconstructed from the reference set of SPS proteins using the GTR+I+G model.

[Figure S4. Legend groups: Rhodophyta, Protist, Fungi. Tip labels are species names, many prefixed with gi identifiers or protein accessions; node labels are support values. Tree scale: 1.]

Supplementary Figure S5: The alignment of the SPS domain. The complete alignment of SPS domains in the three clades. Nucleotide conservation is indicated with red and yellow coloring. The three clades are separated by spaces. From top to bottom, they are clade III, clade II and clade I.

[Figure S5. Aligned sequences, from top to bottom: Methanothermococcus thermolithotrophicus, Methanocaldococcus fervens, Methanocaldococcus vulcanius, Methanocaldococcus jannaschii, Methanocaldococcus infernus, Cedecea davisae, Escherichia fergusonii, Citrobacter koseri, Mannheimia granulomatis, Klebsiella oxytoca, Burkholderiales bacterium, Comamonadaceae bacterium, Hydrogenophaga flava, Hydrogenophaga pseudoflava, Limnobacter sp. Position ruler labels visible: 170 to 350 in steps of 10.]
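The conservation coloring mentioned in the caption of Figure S5 can be illustrated with a minimal sketch of a per-column conservation score. This is an assumption-laden example, not the method used to draw the figure: the function name `column_conservation`, the scoring rule (fraction of sequences sharing the most common residue, with gaps counted against conservation) and the three six-residue toy strings are all chosen here for illustration.

```python
from collections import Counter


def column_conservation(aligned, gap_chars=".-"):
    """Score each alignment column by the share of its most common residue.

    `aligned` is a list of equal-length strings. Gap characters are not
    counted as residues, but they still count in the denominator, so a
    column with many gaps scores low. Returns a list of floats in [0, 1].
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for i in range(length):
        residues = [seq[i] for seq in aligned if seq[i] not in gap_chars]
        if not residues:
            scores.append(0.0)
            continue
        most_common_count = Counter(residues).most_common(1)[0][1]
        scores.append(most_common_count / len(aligned))
    return scores


# Toy input: three short aligned strings (illustrative only).
toy_alignment = ["DLVLIG", "DLVLVG", "DIVLVG"]
scores = column_conservation(toy_alignment)
```

A score of 1.0 marks a strictly identical column; a plotting tool could map such columns to one color and partially conserved columns to another, with the threshold left as a free choice.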

320 330 340 350 Methanothermococcus thermolithotrophicus EE L QKNQ.CY A FE VG T V KKSGV G . KA Y L RDDVEILE V E GK AV. Methanocaldococcus fervens NE L EKSN.CY A FE VG RVV KKG EG . KAVL SKDVKVIE I DK.... Methanocaldococcus vulcanius EE L RKNK.VY A FE VG KVV KKG EG . RAIL SKDVKIVE V V GK ... Methanocaldococcus jannaschii DE L EK A K.CY A FE VG RVV KKG EG . KAVL SKDVKVIE I E GR AI. Methanocaldococcus infernus DE L KK VG ..Y A FE VG E VV GRG D .. KA T L SKDLKIIE V E GK AV. Cedecea davisae A VA ERN G .IK L SP IG E L KDARA G . R P MI EI...... R A..... Escherichia fergusonii AT A AEF G .IE L S AI G E LV SARG G . RAMV EI...... L G GR AMV Citrobacter koseri VT A AEF G .IE L T AI G E LV DARG G . RAMV EI...... R G GR AMV Mannheimia granulomatis Q IA KN A C.VS L FH VG RLL EQE EG . KALI EV...... I E GK A.. Klebsiella oxytoca A AA AEY G .IT L T AI G E LV TARG G . R P MI EI...... R G GR ... Burkholderiales bacterium R AL KA AG YPQTA AI G RI RGAS D VLEP VV LT...... A ...... Comamonadaceae bacterium Q AL KA AG YVHTA II G RI TLQG D AIEP VL LK...... T ...... Hydrogenophaga flava R AL QA AG YPHTA AI G RV SAAS D ALEP VV LS...... A ...... Hydrogenophaga pseudoflava R AL QA AG YPHTA AI G RI SAAS D ALEP VV LS...... A E..... Limnobacter sp. A AL KK LG YEHTA II G RIL PQG E AIEP IV LK...... G ......