
Supplemental Material

Supplemental Figures

[Figure: (a) reference tree with the archaeal groups Methanobacteriales, Methanococcales, Thermoplasmatales, Archaeoglobales, Methanomicrobiales, Methanocellales, Methanosarcinales, Haloarchaea, Thermococcales, Sulfolobales, Desulfurococcales, Thermoproteales and Others. (b) Bar chart; x axis: number of different gain events for the same protein family (1 to 5); y axis: number of protein families (0 to 1800).]

Supplemental Figure 1: Gain events for archaeal protein families calculated with Count’s birth-and-death model (NS dataset). (a) Reference tree of archaeal genomes with the number of gain events calculated by Count for each archaeal group depicted in the respective node of the origin of the condensed phylum. (b) Number of different gains per protein family calculated by Count (no origins were calculated to occur at the root).

[Figure: presence-absence matrix of plastid protein families, with the phylogenetic reference tree on the left and the OTUs on the right. x axis: number of protein families (50 to 250). OTU labels as extracted, top to bottom: Cathaya argyrophylla, Pinus thunbergii, Keteleeria davidiana, Cedrus deodara, Cryptomeria japonica, Gnetum parvifolium, Welwitschia mirabilis, Ephedra equisetina, Cycas taitungensis, Cheilanthes lindheimeri, Adiantum capillus-veneris, Pteridium aquilinum subsp. aquilinum, Alsophila spinulosa, Angiopteris evecta, Psilotum nudum, Equisetum arvense, Isoetes flaccida, Selaginella moellendorffii, Huperzia lucidula, Anthoceros formosae, Physcomitrella patens subsp. patens, Syntrichia ruralis, Marchantia polymorpha, Staurastrum punctulatum, Zygnema circumcarinatum, Chara vulgaris, Chaetosphaeridium globosum, Mesostigma viride, Chlorokybus atmophyticus, Pseudendoclonium akinetum, Oltmannsiellopsis viridis, Chlorella vulgaris, Parachlorella kessleri, Bryopsis hypnoides, Leptosira terrestris, Bigelowiella natans, Chlamydomonas reinhardtii, Scenedesmus obliquus, Floydiella terrestris, Stigeoclonium helveticum, Oedogonium cardiacum, Micromonas sp. RCC299, Micromonas pusilla CCMP1545, Ostreococcus tauri, Monomastix sp. OKE1, Euglena gracilis, Pyramimonas parkeae, Pycnococcus provasolii, Nephroselmis olivacea, Thalassiosira oceanica CCMP1005, Odontella sinensis, Durinskia baltica, Kryptoperidinium foliaceum, Phaeodactylum tricornutum, Ectocarpus siliculosus, Vaucheria litorea, Aureococcus anophagefferens, Aureoumbra lagunensis, Guillardia theta, Rhodomonas salina, Porphyra purpurea, Porphyra yezoensis, Gracilaria tenuistipitata var. liui, Cyanidium caldarium, Cyanidioschyzon merolae strain 10D, Cyanophora paradoxa.]

Supplemental Figure 2: Presence-absence pattern of plastid protein families, with sequences that have homologs in cyanobacteria highlighted (PL dataset). Each tick indicates the presence of a protein in an OTU, colored green if it has a homolog in cyanobacteria and black if it does not. The number of protein families is indicated on the x axis. The OTUs are listed on the right side of the matrix, and the corresponding phylogenetic reference tree is shown on the left.
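As an illustration of the data structure behind such a plot, the following minimal Python sketch builds a presence-absence matrix (protein families by OTUs) and assigns a tick color to each present cell. It is not the code used to produce the figure. The function names, the family names `famA` and `famB`, and the toy input are hypothetical; only the OTU names and the green/black color rule come from the caption and the figure labels.

```python
def build_presence_matrix(family_members, otus):
    """Return {family: [0 or 1 per OTU]} from {family: set of OTUs that contain it}."""
    return {
        fam: [1 if otu in members else 0 for otu in otus]
        for fam, members in family_members.items()
    }


def tick_colors(matrix, has_cyano_homolog):
    """Map every present cell (family, OTU index) to 'green' or 'black'.

    'green' means the family has a homolog in cyanobacteria, 'black' means it
    does not. Absent cells get no entry because no tick is drawn for them.
    """
    colors = {}
    for fam, row in matrix.items():
        color = "green" if has_cyano_homolog.get(fam, False) else "black"
        for col, present in enumerate(row):
            if present:
                colors[(fam, col)] = color
    return colors


# Toy input: hypothetical families, not taken from the PL dataset.
otus = ["Cyanophora paradoxa", "Porphyra purpurea", "Chara vulgaris"]
family_members = {
    "famA": {"Cyanophora paradoxa", "Porphyra purpurea", "Chara vulgaris"},
    "famB": {"Porphyra purpurea"},
}
has_cyano_homolog = {"famA": True, "famB": False}

matrix = build_presence_matrix(family_members, otus)
colors = tick_colors(matrix, has_cyano_homolog)
```

Keeping the homolog flag per family rather than per cell mirrors the caption, where the color depends only on whether the protein family has a cyanobacterial homolog.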

[Figure: (a) phylogenetic reference tree of the PL dataset with the number of losses at each node given as BD / DP / WP; the OTUs are grouped as Gymnosperms, Ferns, Lycophytes, Rhizaria, Chlorophyceae, Mamiellophyceae, Excavata, SAR, Hacrobia, Rhodophyta and Glaucophyta. (b) Stacked bar chart for the Birth and death model, Dollo parsimony and Wagner parsimony; y axis: number of protein families (0 to 300); legend: no loss and 1 to 14 losses per protein family.]

Supplemental Figure 3: Loss events for plastid protein families calculated with Count’s traditional phylogenetic methods (PL dataset). (a) Loss events for plastid protein families are depicted at the respective nodes in the following order, separated by slashes: Birth-and-Death model; Dollo Parsimony; Wagner Parsimony. Inner and outer nodes where no values are plotted have no loss events according to the calculations of Count. (b) Number of different losses per protein family for each phylogenetic model in Count; a gradient from blue to red shows multiple losses for the same protein family.

Supplemental Tables

Supplemental Table 1: Genomes used in the BLAST of the PL dataset against cyanobacteria

Organism name (NCBI) | Assembly accession | TaxID
Acaryochloris marina MBIC11017 | GCF_000018105.1 | 329726
Anabaena cylindrica PCC 7122 | GCF_000317695.1 | 272123
Anabaena sp. 90 | GCF_000312705.1 | 46234
Anabaena sp. wa102 | GCF_001277295.1 | 1647413
Calothrix sp. PCC 6303 | GCF_000317435.1 | 1170562
Calothrix sp. 336/3 | GCF_000734895.2 | 1337936
Calothrix sp. PCC 7507 | GCF_000316575.1 | 99598
Candidatus Atelocyanobacterium thalassa isolate ALOHA | GCF_000025125.1 | 1453429
Chroococcidiopsis thermalis PCC 7203 | GCF_000317125.1 | 251229
Crinalium epipsammum PCC 9333 | GCF_000317495.1 | 1173022
Cyanobacterium aponinum PCC 10605 | GCF_000317675.1 | 755178
cyanobacterium endosymbiont of Epithemia turgida isolate EtSB Lake Yunoko | GCF_000829235.1 | 1228987
Cyanobium gracile PCC 6307 | GCF_000316515.1 | 292564
Cyanothece sp. ATCC 51142 | GCF_000017845.1 | 43989
Cyanothece sp. PCC 7424 | GCF_000021825.1 | 65393
Cyanothece sp. PCC 7425 | GCF_000022045.1 | 395961
Cyanothece sp. PCC 7822 | GCF_000147335.1 | 497965
Cyanothece sp. PCC 8801 | GCF_000021805.1 | 41431
Cyanothece sp. PCC 8802 | GCF_000024045.1 | 395962
Dactylococcopsis salina PCC 8305 | GCF_000317615.1 | 13035
Fischerella sp. NIES-3754 | GCF_001548455.1 | 1752063
Geitlerinema sp. PCC 7407 | GCF_000317045.1 | 1173025
Geminocystis sp. NIES-3708 | GCF_001548095.1 | 1615909
Geminocystis sp. NIES-3709 | GCF_001548115.1 | 1617448
Gloeobacter kilaueensis JS1 | GCF_000484535.1 | 1183438
Gloeobacter violaceus PCC 7421 | GCF_000011385.1 | 251221
Gloeocapsa sp. PCC 7428 | GCF_000317555.1 | 1173026
Halothece sp. PCC 7418 | GCF_000317635.1 | 65093
Leptolyngbya sp. NIES-3755 | GCF_001548435.1 | 1752064
Leptolyngbya sp. O-77 | GCF_001548395.1 | 1080068
Leptolyngbya sp. PCC 7376 | GCF_000316605.1 | 111781
Microcoleus sp. PCC 7113 | GCF_000317515.1 | 1173027
Microcystis aeruginosa NIES-843 | GCF_000010625.1 | 449447
Microcystis aeruginosa NIES-2549 | GCF_000981785.1 | 1641812
Microcystis aeruginosa NIES-2481 | GCF_001704955.1 | 1698524
Microcystis panniformis FACHB-1757 | GCF_001264245.1 | 1638788
Nostoc punctiforme PCC 73102 | GCF_000020025.1 | 63737
Nostoc sp. NIES-3756 | GCF_001548375.1 | 1751286
Nostoc sp. PCC 7107 | GCF_000316625.1 | 317936
Nostoc sp. PCC 7120 | GCF_000009705.1 | 103690
Nostoc sp. PCC 7524 | GCF_000316645.1 | 28072
Oscillatoria acuminata PCC 6304 | GCF_000317105.1 | 56110
Oscillatoria nigro-viridis PCC 7112 | GCF_000317475.1 | 179408
Pleurocapsa sp. PCC 7327 | GCF_000317025.1 | 118163
Prochlorococcus marinus subsp. marinus str. CCMP1375 | GCF_000007925.1 | 167539
Prochlorococcus marinus subsp. pastoris str. CCMP1986 | GCF_000011465.1 | 59919
Prochlorococcus marinus str. MIT 9313 | GCF_000011485.1 | 74547
Prochlorococcus marinus str. NATL2A | GCF_000012465.1 | 59920
Prochlorococcus marinus str. MIT 9312 | GCF_000012645.1 | 74546
Prochlorococcus marinus str. AS9601 | GCF_000015645.1 | 146891
Prochlorococcus marinus str. MIT 9515 | GCF_000015665.1 | 167542
Prochlorococcus marinus str. NATL1A | GCF_000015685.1 | 167555
Prochlorococcus marinus str. MIT 9303 | GCF_000015705.1 | 59922
Prochlorococcus marinus str. MIT 9301 | GCF_000015965.1 | 167546
Prochlorococcus marinus str. MIT 9215 | GCF_000018065.1 | 93060
Prochlorococcus sp. MIT 0604 | GCF_000757845.1 | 1501268
Prochlorococcus sp. MIT 0801 | GCF_000757865.1 | 1501269
Pseudanabaena sp. PCC 7367 | GCF_000317065.1 | 82654
Rivularia sp. PCC 7116 | GCF_000316665.1 | 373994
Stanieria cyanosphaera PCC 7437 | GCF_000317575.1 | 111780
Synechococcus elongatus PCC 6301 | GCF_000010065.1 | 269084
Synechococcus elongatus PCC 7942 | GCF_000012525.1 | 1140
Synechococcus sp. CC9311 | GCF_000014585.1 | 64471
Synechococcus sp. CC9605 | GCF_000012625.1 | 110662
Synechococcus sp. CC9902 | GCF_000012505.1 | 316279
Synechococcus sp. JA-2-3B'a(2-13) | GCF_000013225.1 | 321332
Synechococcus sp. JA-3-3Ab | GCF_000013205.1 | 321327
Synechococcus sp. KORDI-100 | GCF_000737535.1 | 1280380
Synechococcus sp. KORDI-49 | GCF_000737575.1 | 585423
Synechococcus sp. KORDI-52 | GCF_000737595.1 | 585425
Synechococcus sp. PCC 6312 | GCF_000316685.1 | 195253
Synechococcus sp. PCC 7002 | GCF_000019485.1 | 32049
Synechococcus sp. PCC 7003 | GCF_001693255.1 | 374981
Synechococcus sp. PCC 7117 | GCF_001693275.1 | 195498
Synechococcus sp. PCC 73109 | GCF_001521855.1 | 374982
Synechococcus sp. PCC 7502 | GCF_000317085.1 | 1173263
Synechococcus sp. PCC 8807 | GCF_001693295.1 | 195248
Synechococcus sp. RCC307 | GCF_000063525.1 | 316278
Synechococcus sp. UTEX 2973 | GCF_000817325.1 | 1350461
Synechococcus sp. WH 7803 | GCF_000063505.1 | 32051
Synechococcus sp. WH 8102 | GCF_000195975.1 | 84588
Synechococcus sp. WH 8103 | GCF_001182765.1 | 29410
Synechococcus sp. WH 8109 | GCF_000161795.2 | 166314
Synechocystis sp. PCC 6714 | GCF_000478825.2 | 1147
Synechocystis sp. PCC 6803 | GCF_000009725.1 | 1148
Synechocystis sp. PCC 6803 | GCF_000270265.1 | 1148
Synechocystis sp. PCC 6803 substr. GT-I | GCF_000284135.1 | 1080228
Synechocystis sp. PCC 6803 substr. PCC-N | GCF_000284215.1 | 1080229
Synechocystis sp. PCC 6803 substr. PCC-P | GCF_000284455.1 | 1080230
Synechocystis sp. PCC 6803 | GCF_000340785.1 | 1148
Synechocystis sp. PCC 6803 | GCF_001318385.1 | 1148
Thermosynechococcus elongatus BP-1 | GCF_000011345.1 | 197221
Trichodesmium erythraeum IMS101 | GCF_000014265.1 | 203124
'Nostoc azollae' 0708 | GCF_000196515.1 | 551115

Supplemental Table 2: Abbreviations for species names in the EK dataset

Supergroups | Abbreviation | Species name
 | Arabidopsis | Arabidopsis thaliana
 | Oryza | Oryza sativa
 | Selaginella | Selaginella moellendorffii
 | Physcomitrella | Physcomitrella patens
 | Chlorella | Chlorella variabilis
 | Coccomyxa | Coccomyxa subellipsoidea
 | Chlamydomonas | Chlamydomonas reinhardtii
 | Micromonas | Micromonas sp. RCC299
 | Ostreococcus | Ostreococcus lucimarinus
 | Chondrus | Chondrus crispus
 | Porphyridium | Porphyridium purpureum
 | Cyanidioschyzon | Cyanidioschyzon merolae
 | Galdieria | Galdieria sulphuraria
 | Cyanophora | Cyanophora paradoxa
Hacrobia | Emiliania | Emiliania huxleyi
Hacrobia | Guillardia | Guillardia theta
SAR | Thalassiosira | Thalassiosira pseudonana
SAR | Phaeodactylum | Phaeodactylum tricornutum
SAR | Aureococcus | Aureococcus anophagefferens
SAR | Ectocarpus | Ectocarpus siliculosus
SAR | | Nannochloropsis gaditana
SAR | Saprolegnia | Saprolegnia parasitica
SAR | Phytophthora | Phytophthora sojae
SAR | | Blastocystis hominis
SAR | Plasmodium |
SAR | Theileria | Theileria annulata
SAR | Toxoplasma |
SAR | Cryptosporidium |
SAR | | Symbiodinium minutum
SAR | Tetrahymena | Tetrahymena thermophila
SAR | Oxytricha | Oxytricha trifallax
SAR | Bigelowiella | Bigelowiella natans
Excavata | Trypanosoma | Trypanosoma brucei
Excavata | Leishmania | Leishmania major
Excavata | Naegleria | Naegleria gruberi
Excavata | Trichomonas | Trichomonas vaginalis
Excavata | Giardia | Giardia lamblia
Opisthokonta | Drosophila | Drosophila melanogaster
Opisthokonta | Schistosoma | Schistosoma mansoni
Opisthokonta | Caenorhabditis | Caenorhabditis elegans
Opisthokonta | Homo | Homo sapiens
Opisthokonta | Nematostella | Nematostella vectensis
Opisthokonta | Trichoplax | Trichoplax adhaerens
Opisthokonta | Amphimedon | Amphimedon queenslandica
Opisthokonta | Monosiga | Monosiga brevicollis
Opisthokonta | Capsaspora | Capsaspora owczarzaki
Opisthokonta | Saccharomyces | Saccharomyces cerevisiae
Opisthokonta | Cryptococcus | Cryptococcus neoformans
Opisthokonta | Rhizopus | Rhizopus delemar
Opisthokonta | Batrachochytrium | Batrachochytrium dendrobatidis
Opisthokonta | Encephalitozoon | Encephalitozoon cuniculi
Amoebozoa | Dictyostelium | Dictyostelium discoideum
Amoebozoa | Polysphondylium | Polysphondylium pallidum
Amoebozoa | Entamoeba | Entamoeba histolytica
Amoebozoa | Acanthamoeba | Acanthamoeba castellanii