
Supplementary Figures for

Photosynthesis is not a Universal Feature of the Phylum Cyanobacteria

Rochelle M. Soo, Connor T. Skennerton, Yuji Sekiguchi, Michael Imelfort, Samuel J. Paech, Paul G. Dennis, Jason A. Steen, Donovan H. Parks, Gene W. Tyson, and Philip Hugenholtz†

† Correspondence to: Philip Hugenholtz, [email protected]

Sample collection: Zagget feces (3 samples), EBPR (9 samples) and UASB (4 samples)

DNA extraction: MP-BIO FASTSPIN spin kit and bead-beating

Paired end sequencing: Illumina HiSeq and MiSeq

Upload data from MetaHIT database (7 samples)

Community profiling: 16S rRNA gene amplified and clustered to identify

CLC assembly: Samples co-assembled for GroopM

GroopM: Population genome binning based on coverage patterns

CheckM: Check for completion and contamination of standard draft population genome bins

Phylogenetic placement: Phylogenetic placement of standard draft population genome bins using complete bacterial genomes from IMG and

Mate pair sequencing: Illumina HiSeq and MiSeq

SSPACE scaffolding: Improvement of draft population genome bins

CheckM: Check for completion and contamination of standard draft population genome bins

IMG-ER submission: Automated annotation

16S rRNA gene reconstruction: HMM models of the 16S rRNA; mapped to Greengenes using BWA-MEM

Phylogenetic placement: Phylogenetic placement of standard draft population genome bins

Metabolic reconstruction: KEGG maps

Concatenated gene trees: 83 and 38 single copy marker genes; ML, MP and NJ

16S rRNA gene trees: ML, MP and NJ

Fig. S1. Flow diagram of the methods used in this paper.

Fig. S2 panels: ML (RAxML, JTT+Gamma, 100x bootstrapping) rooted with Archaea; ML (RAxML, JTT+Gamma, 100x bootstrapping); ML (FastTree, JTT+CAT, 100x bootstrapping) on RAxML tree; MP (PAUP*, 100x bootstrapping, >50% consensus tree). Scale bar: 0.10.

Fig. S2. Phylogeny of Oxyphotobacteria and Melainabacteria genomes based on up to 38 marker genes. Phylogenetic maximum likelihood (RAxML, JTT, G) and maximum parsimony (PAUP*) trees based on a concatenated alignment of up to 38 marker genes, showing the phylogenetic robustness of the phylum Cyanobacteria and the classes Oxyphotobacteria and Melainabacteria. 422 OTUs (operational taxonomic units) from Bacteria and Archaea were used to produce the phylogenetic tree (Table S4). Bootstrap analyses (100 replicates) were performed for the data set with maximum likelihood (RAxML, JTT, G; FastTree, JTT, CAT) and maximum parsimony (PAUP*) methods, and the values obtained are shown in the respective trees, except for the values from FastTree, which are shown on the tree generated with RAxML. Genomes in green are representatives from Di Rienzi et al., 2013 and genomes in red are Melainabacteria from this study. Bootstrap values >60% are shown. p_ represents phylum and c_ represents class. Candidatus has been abbreviated to Ca. for the most complete genomes.

Fig. S3 panels: ML (RAxML, JTT+Gamma, 100x bootstrapping); ML (RAxML, JTT+Gamma, 100x bootstrapping) unrooted; ML (FastTree, JTT+CAT, 100x bootstrapping) on RAxML tree; MP (PAUP*, 100x bootstrapping, >50% consensus tree). Scale bar: 0.10.

Fig. S3. Phylogeny of Oxyphotobacteria and Melainabacteria among the bacterial phyla based on up to 83 marker genes. Phylogenetic maximum likelihood (RAxML, JTT, G) and maximum parsimony (PAUP*) trees based on a concatenated alignment of up to 83 marker genes, showing the phylogenetic robustness of the phylum Cyanobacteria and the classes Oxyphotobacteria and Melainabacteria. 322 OTUs from Bacteria were used to produce the phylogenetic tree (Table S4). Bootstrap analyses (100 replicates) were performed for the data set with maximum likelihood (RAxML, JTT, G; FastTree, JTT, CAT) and maximum parsimony (PAUP*) methods, and the values obtained are shown in the respective trees, except for the values from FastTree, which are shown on the tree generated with RAxML. Genomes in green are representatives from Di Rienzi et al., 2013 and genomes in red are Melainabacteria from this study. Bootstrap values >60% are shown. p_ represents phylum, c_ represents class and o_ represents order. Candidatus has been abbreviated to Ca. for the most complete genomes.

Fig. S4 panels: ML (RAxML, JTT+Gamma, 100x bootstrapping); ML (FastTree, JTT+CAT, 100x bootstrapping) on RAxML tree; MP (PAUP*, 100x bootstrapping, >50% consensus tree). Tip labels: Oxyphotobacteria Groups A to G, Geitlerinema sp. PCC 7407, Gloeobacter violaceus PCC 7421, the Melainabacteria genomes and p__Chloroflexi. Scale bar: 0.10.

Fig. S4. Phylogeny of Oxyphotobacteria and Melainabacteria in the Cyanobacteria phylum based on up to 83 marker genes. Phylogenetic maximum likelihood (RAxML, JTT, G) and maximum parsimony (PAUP*) trees based on a concatenated alignment of up to 83 marker genes, showing the phylogenetic robustness of the phylum Cyanobacteria and the classes Oxyphotobacteria and Melainabacteria, with Chloroflexi used as the outgroup. The Oxyphotobacteria were grouped according to Shih et al., 2013 (Table S3). Bootstrap analyses (100 replicates) were performed for the data set with maximum likelihood (RAxML, JTT, G; FastTree, JTT, CAT) and maximum parsimony (PAUP*) methods, and the values obtained are shown in the respective trees, except for the values from FastTree, which are shown on the tree generated with RAxML. Genomes in green are representatives from Di Rienzi et al., 2013 and genomes in red are Melainabacteria from this study. Bootstrap values >60% are shown. p_ represents phylum, c_ represents class and o_ represents order. Candidatus has been abbreviated to Ca. for the most complete genomes.
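The phrase "up to 83 marker genes" implies that not every genome contributes every marker. One common way to build a concatenated alignment in that situation is to pad absent markers with gap characters. The Python sketch below illustrates that general idea only; it is an assumption about the technique, not a description of the pipeline used in this study, and the function name, genome names, marker names and sequences are all made up.

```python
def concatenate_alignments(marker_alignments, genomes, gap="-"):
    """Concatenate per-marker alignments, padding absent markers with gaps.

    marker_alignments: dict of marker name -> {genome name: aligned sequence}.
    Every sequence within one marker must have the same length.
    """
    concatenated = {g: [] for g in genomes}
    for marker in sorted(marker_alignments):
        aligned = marker_alignments[marker]
        lengths = {len(seq) for seq in aligned.values()}
        if len(lengths) != 1:
            raise ValueError(f"marker {marker!r} is not a proper alignment")
        width = lengths.pop()
        for g in genomes:
            # A genome lacking this marker contributes a run of gaps.
            concatenated[g].append(aligned.get(g, gap * width))
    return {g: "".join(parts) for g, parts in concatenated.items()}


# Hypothetical toy input: genome_Y lacks marker_2.
markers = {
    "marker_1": {"genome_X": "MKV-A", "genome_Y": "MKVLA"},
    "marker_2": {"genome_X": "GG-T"},
}
result = concatenate_alignments(markers, ["genome_X", "genome_Y"])
print(result)
```

Padding keeps every row the same length, so the concatenated matrix can be passed to a tree-building program as a single alignment, while genomes with fewer markers simply carry more missing data.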
Fig. S5 tree: lineages are collapsed and labelled at order (o__), class (c__), phylum (p__) or kingdom (k__) level. Labels under p__Cyanobacteria: o__Caenarcaophilales, o__SHAS531, o__Vampirovibrioniales, o__Gastroanaerophilales, o__Obscuribacteriales and c__Oxyphotobacteria. The remaining labels are other bacterial phyla and classes, and k__Archaea. Scale bar: 0.10.

Fig. S5. 16S rRNA gene tree showing the phylogenetic robustness of the Cyanobacteria and its class-level lineages. Maximum likelihood (RAxML, GTR, G, I; FastTree, JTT, CAT) trees based on the 16S rRNA gene, showing the phylogenetic robustness of the classes Oxyphotobacteria and Melainabacteria. 418 OTUs (operational taxonomic units) from Bacteria and Archaea were used to construct the trees. Bootstrap analyses (100 replicates) were performed for the data set with maximum likelihood (RAxML, GTR, G, I; FastTree, JTT, CAT). Bootstrap values >60% are shown.

Fig. S6 tree: gene groups are labelled HydE, HydF, HydG, HypD, HypE, HypF, NAD(P)-dependent iron-only dehydrogenase catalytic subunit, [FeFe] hydrogenase group B1/B3, hydrogenase 4 and Ni,Fe-hydrogenase III large subunit.

Fig. S6. Gene tree showing the hydrogenase genes found in the Melainabacteria representatives. A representative collection of hydrogenase genes publicly available from NCBI was combined with genes identified in this study and in Di Rienzi et al., 2013. Hydrogenase genes common to genomes from both studies are colored blue and grouped together. Hydrogenases found only in this study are colored red, and genes identified only in the genomes of Di Rienzi et al., 2013 are colored green.

Fig. S7 panel A: tree of MEL_C1, MH_37, Zag_1, MEL_A1, Zag_111, Ca. G. phascolarctosicola, MEL_B1, MEL_B2, ACD20, Ca. O. phosphatis and Ca. C. bioreactoricola, annotated with "Gain of flagella" and "Loss of flagella" arrows, beside a heat map (color key 0 to 6) with columns FlgA, FlgB, FlgC, FlgD, FlgE, FlgF, FlgG, FlgH, FlgI, FlgJ, FlgK, FlgL, FlgN, FlhA, FlhB, FliC, FliD, FliE, FliF, FliG, FliH, FliI, FliJ, FliK, FliL, FliM, FliO, FliP, FliQ, FliR, FliS, FliT, MotA, MotB, OmpA and FliN.SpoA.

Fig. S7 panel B: gene tree of the Melainabacteria genomes with the groups p_Chlamydiae (type III secretory flagellar biosynthesis), p_Proteobacteria (type III secretion protein SctV) and p_Chlamydia (flagellar secretion protein). Scale bar: 0.10.

Fig. S7 panel C: gene tree of ACD20, MEL_B1 and MEL_B2 with reference genomes from other bacterial phyla. Scale bar: 0.10.

Fig. S7. Heat map showing the presence and absence of flagella genes in the Melainabacteria representatives. (A) A maximum parsimony tree based on a concatenated alignment of up to 83 marker genes. Purple arrows indicate the gain of functional flagella and blue arrows indicate the loss of functional flagella. The heat map indicates the square root of the number of flagella genes found in each of the Melainabacteria representatives. (B) Gene tree for the flagellar protein FlhA. (C) Gene tree for the flagellar protein FliM. Genomes used to make the trees are listed in Table S6. Genomes in green are Melainabacteria representatives from Di Rienzi et al., 2013 and genomes in red are Melainabacteria representatives from this study. Candidatus has been abbreviated to Ca. for the most complete genomes.

Fig. S8 panel A (Venn diagram): MH_37 total = 2358; MEL_C1 total = 2114; Zag_1 total = 2156. Region counts as extracted, without their positions: 516, 220, 33, 1589, 230, 75, 459.

Fig. S8 panel B: pairwise comparisons of MH_37, MEL_C1 and Zag_1.
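The extracted Venn counts have lost their positions in the diagram, but they can be cross-checked against the three stated genome totals. The Python sketch below is an illustration added for that purpose and is not part of the original analysis; the region names (only_MH_37, core and so on) are labels introduced here. It searches for every assignment of the seven counts to the seven Venn regions that reproduces all three totals.

```python
from itertools import permutations

# Counts and totals as they appear in the Fig. S8 panel A text.
COUNTS = [516, 220, 33, 1589, 230, 75, 459]
TOTALS = {"MH_37": 2358, "MEL_C1": 2114, "Zag_1": 2156}

# Region names are labels introduced for this sketch.
REGIONS = ("only_MH_37", "only_MEL_C1", "only_Zag_1",
           "MH_37_MEL_C1", "MH_37_Zag_1", "MEL_C1_Zag_1", "core")


def consistent_assignments(counts, totals):
    """Return every assignment of counts to regions that matches the totals."""
    found = []
    for perm in permutations(counts):
        r = dict(zip(REGIONS, perm))
        sums = {
            "MH_37": r["only_MH_37"] + r["MH_37_MEL_C1"]
                     + r["MH_37_Zag_1"] + r["core"],
            "MEL_C1": r["only_MEL_C1"] + r["MH_37_MEL_C1"]
                      + r["MEL_C1_Zag_1"] + r["core"],
            "Zag_1": r["only_Zag_1"] + r["MH_37_Zag_1"]
                     + r["MEL_C1_Zag_1"] + r["core"],
        }
        if sums == totals:
            found.append(r)
    return found


solutions = consistent_assignments(COUNTS, TOTALS)
for s in solutions:
    print(s)
```

Under these assumptions only one assignment satisfies all three totals: 1589 core orthologs; 516, 230 and 459 genes found only in MH_37, MEL_C1 and Zag_1 respectively; and 220, 33 and 75 genes shared only by the pairs MH_37/MEL_C1, MH_37/Zag_1 and MEL_C1/Zag_1.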

Fig. S8. Venn diagram of the three Gastranaerophilales genomes from the same species. (A) Shared orthologs between MH_37, MEL_C1 and Zag_1, showing the number of genes that are core to the three genomes, those shared between two genomes, and those found in individual genomes only. The total number for each genome does not include paralogs. (B) Average nucleotide identity (ANI) between MH_37, MEL_C1 and Zag_1, where ≥95% ANI is used to define a species.

Fig. S9 legend: Oxyphotobacteria; Melainabacteria; 95% confidence intervals.

COG and description; q-value (corrected)

[COG0600] ABC-type nitrate/sulfonate/bicarbonate transport system, permease; < 1e-15
[COG0715] ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components; < 1e-15
[COG1116] ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component; < 1e-15
[COG1122] ABC-type cobalt transport system, ATPase component; 4.60e-14
[COG3221] ABC-type phosphate/phosphonate transport system, periplasmic component; 1.22e-12
[COG0025] NhaP-type Na+/H+ and K+/H+ antiporters; 2.44e-11
[COG1226] Kef-type K+ transport systems, predicted NAD-binding component; 5.94e-10
[COG0226] ABC-type phosphate transport system, periplasmic component; 2.05e-9
[COG0609] ABC-type Fe3+-siderophore transport system, permease component; 3.44e-8
[COG0474] Cation transport ATPase; 1.99e-7
[COG1117] ABC-type phosphate transport system, ATPase component; 2.11e-7
[COG1629] Outer membrane receptor proteins, mostly Fe transport; 4.43e-7
[COG0614] ABC-type Fe3+-hydroxamate transport system, periplasmic component; 3.72e-6
[COG0475] Kef-type K+ transport systems, membrane components; 1.20e-5
[COG2217] Cation transport ATPase; 6.81e-4
[COG0607] Rhodanese-related sulfurtransferase; 1.28e-3
[COG0735] Fe2+/Zn2+ uptake regulation proteins; 5.94e-3

Axes: Mean proportion (%); Difference in mean proportions (%).

Fig. S9. COG category P (inorganic ion transport and metabolism) for Oxyphotobacteria and Melainabacteria

Shown are all COGs from category P with a difference in mean proportions >= 0.02% between Oxyphotobacteria and Melainabacteria and a Storey's q-value of <= 0.05.
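The selection rule in this caption amounts to a two-condition filter on effect size and corrected significance. A minimal Python sketch follows; the record layout, the function name and every number in the example are hypothetical and are not values from the study. It also assumes that the difference in mean proportions is compared by absolute value, which the caption does not state.

```python
from typing import NamedTuple


class CogResult(NamedTuple):
    """One COG comparison; field names are assumptions for this sketch."""
    cog_id: str
    diff_mean_prop_pct: float  # difference in mean proportions, in percent
    q_value: float             # multiple-test corrected q-value


def select_cogs(results, min_diff_pct=0.02, max_q=0.05):
    """Keep COGs with |difference| >= min_diff_pct and q-value <= max_q."""
    return [
        r for r in results
        if abs(r.diff_mean_prop_pct) >= min_diff_pct and r.q_value <= max_q
    ]


# Hypothetical records, not data from the paper.
example = [
    CogResult("COG_A", 0.080, 1e-10),  # large and significant: kept
    CogResult("COG_B", 0.010, 1e-6),   # significant but too small: dropped
    CogResult("COG_C", -0.030, 0.20),  # large enough, not significant: dropped
    CogResult("COG_D", -0.025, 0.01),  # kept, sign ignored
]

kept = select_cogs(example)
print([r.cog_id for r in kept])
```

Requiring both conditions avoids reporting COGs whose difference is statistically detectable but too small to matter, as well as large differences that do not survive multiple-test correction.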

Fig. S10. COG category Q (secondary metabolites and biosynthesis, transport and catabolism) for Oxyphotobacteria and Melainabacteria

Shown are all COGs from category Q with a difference in mean proportions >= 0.02% between Oxyphotobacteria and Melainabacteria and a Storey's q-value of <= 0.05.