Characterization of Heliorhodopsins Detected Via Functional Metagenomics in Freshwater Actinobacteria, Chloroflexi and Archaea
Supporting information for

Characterization of heliorhodopsins detected via functional metagenomics in freshwater Actinobacteria, Chloroflexi and Archaea

Ariel Chazan, Andrey Rozenberg, Kentaro Mannen, Takashi Nagata, Ran Tahan, Shir Yaish, Shirley Larom, Keiichi Inoue, Oded Béjà, and Alina Pushkarev

Corresponding author: Alina Pushkarev
Email: [email protected]

This PDF file includes:
Figures S1 to S11

Other supporting information includes:
Dataset S1 (flat GenBank format). Annotated sequences of the reported fosmids and proteorhodopsin gene used as positive control in Figure 2B.
Dataset S2 (Excel spreadsheet). Protein sequences and metadata for prokaryotic HeR and DTE proton pump genes collected for the analysis of gene neighbors.
Dataset S3 (Excel spreadsheet): Summary statistics and results of Fisher’s exact test for the different Pfam protein families and domains that deviate significantly between groups of rhodopsins (heliorhodopsins vs. DTE proton pumps) or between the vicinities of rhodopsin genes and background genomic locations.
Dataset S4 (flat hmm file): Protein profile used to collect DTE proton pumps, built from proteorhodopsin and xanthorhodopsin sequences in UniRef90. Note that the matches were required to have the conserved DTE motif in TM helix C and lysine in TM helix G.

[Figure S1 image. Panel labels: Hula, Ein Afek.]

Figure S1. Sampling sites. Colored dots represent the different sampling sites: pink – Hula A (33°06'22.4"N 35°36'09.4"E), red – Hula B (33°06'24.4"N 35°36'04.7"E), and yellow – Ein Afek (32°50'44.15"N 35°6'49.04"E). Figures were taken from Google Earth Pro 7.3.3 (December 20, 2014). North of Israel. Positions as mentioned above, eye altitude 762 m.

[Figure S2 image: two phylogenetic trees, scale bars 0.1. Panel titles: "Bacteria, Actinobacteria: the Aquiluna/Rhodoluna clade" and "Bacteria, Actinobacteria: Nanopelagicales". Legend: Heliorhodopsin genes, group A and group B. Clone tips in the Aquiluna/Rhodoluna panel: HULAa3G5: 17/33, HULAa2F4: 20/41. Clone tips in the Nanopelagicales panel: EINA62G7: 27/40, EINA20F1: 34/48, HULAa50H9: 7/31, HULAb132A11: 21/40, 48C12: 14/27.]

Figure S2. Phylogenetic affinities of six of the actinobacterial clones isolated in this study. Estimated phylogenetic position of the actinobacterial clones including the previously reported clone 48C12. For named branches present in GTDB the corresponding taxa are provided above the branches. Below the branches the three numbers indicate: bootstrap branch support from the GTDB reference tree, effective number of genes in the species inference and local posterior probability (notice that for low values of the effective number of genes, posterior probabilities are not informative). The tips are labelled with NCBI assembly accessions (in black) or clone names (in red).
For the environmental clones the two numbers after the colon indicate the number of the genes taken for gene phylogenies (this number could decrease further due to filtering) and the total number of the genes on the contig. Dots indicate the presence of HeR genes from the two phylogenetic groups.

[Figure S3 image: two phylogenetic trees, scale bars 0.1. Panel titles: "Archaea, Thermoplasmatota: class E2" and "Archaea, Thermoplasmatota: class Thermoplasmata". Legend: Heliorhodopsin genes, group B. Contig tip in the class E2 panel: LSSD01000066: 15/33. Clone tip in the class Thermoplasmata panel: HULAa36F11: 6/32.]

Figure S3. Phylogenetic affinities of the archaeal clone isolated in this study. For comparison, the same analysis was performed for the metagenomic contig that contained TaHeR (GenBank accession number LSSD01000066).