Understanding Life in Extreme Environments: from a single colony to a million sequences

Avinash Sharma (PhD)
Wellcome Trust-DBT India Alliance Fellow

Microorganisms are everywhere

Source: www.microbiomesupport.eu

What are they? Microbes living where nothing else can

Why are they interesting? Medicine, Environment, Human Gut, Agriculture, Food, etc.

Why we need to study Extreme Environments

• Microorganisms represent the most important and diverse group of organisms
• Widely distributed in many environmental habitats
• Important for ecosystem functioning
• Diversity and structure of complex microbial communities are still poorly understood
• Evaluating microbial diversity in complex environments remains a great challenge in microbial ecology

Woese and Fox, 1977

Introduction to Extremophiles

What are they? Microbes living where nothing else can

How do they survive?

Why are they interesting?

Extremophiles are well known for their enzymes

Why enzymes from extremophiles…?

Stability even under extreme conditions

Life in Extreme Environments

• Many organisms adapt to extreme environments
  – Thermophiles (liking heat)
  – Acidophiles (liking acidic environments)
  – Psychrophiles (liking cold)
  – Halophiles (liking salty environments)
• This demonstrates that life flourishes even in the harshest of locations

Categories of Extremophiles

Environmental factor  Category                          Definition              Major microbial habitat
Temperature           Hyperthermophile, Thermophile     Opt. growth at > 80°C   Hot springs and vents, sub-surface
                      Psychrophile                      < 15°C                  Ice, deep ocean, Arctic
Salinity              Halophile                         2-5 M NaCl              Salt lakes, solar salterns, brines
Pressure              Piezophile (Barophile)            < 1000 atm              Deep sea, e.g. Mariana Trench, sub-surface
pH (low)              Acidophile                        pH < 2                  Acidic hot springs
pH (high)             Alkaliphile                       pH > 10                 Soda lakes, deserts
Oxygen (no)           Anaerobe (Anoxiphile)             Cannot tolerate O2      Sediments, sub-surface
Oxygen (high)         -                                 High O2 tension?        Sub-glacial lakes
Radiation             Radioresistant                    -                       Soil, contaminated areas
Toxic heavy metals    Metallophiles                     Tolerate heavy metals   Contaminated areas
Low nutrition         Oligotrophs                       -                       Lakes
Inert substrates      CH4 oxidizers, hydrocarbons etc.  -                       Soil, water etc.

Microbial Identification Methods

• Morphological and microscopic features
  – Colony morphology, cell shape and size, staining, etc.
• Biochemical features
  – Catalase, Oxidase, Indole, Citrate, Urease, Sugar fermentation, etc.
• Molecular features
  – Nucleic acids (DNA and RNA), fatty acids, proteins, etc.

16S rRNA Gene Sequencing

• Most common housekeeping genetic marker, used for a number of reasons
  – Its presence in almost all bacteria
  – Large enough for informatics purposes (~1500 bp)
  – No change in its function over time
• In 1980, the Approved Lists contained 1,791 valid names
• Today, this number has ballooned to >16,000

Unknown microbial diversity

The Great Plate Count Anomaly

Sequencing technologies

• First generation:
  - Maxam-Gilbert method
  - Sanger’s dideoxy method

• Next generation:
  - Roche 454
  - SOLiD by ABI
  - Genome Analyzer/HiSeq by Illumina

• Compact PGM Sequencer
  - Ion Torrent
  - MiSeq by Illumina

• Third generation:
  - SMRT by Pacific Biosciences
  - Nanopore by University of Illinois

DNA sequencing technologies ideally should be

1. Fast
2. Accurate
3. Easy to operate
4. Cost-effective

DNA sequencing: Importance

• Basic blueprint for life; Aesthetics.
• Gene and protein
  – Function
  – Structure
  – Evolution
• Genome-based diseases: “inborn errors of metabolism”
  – Genetic disorders
  – Genetic predispositions to infection
  – Diagnostics
  – Therapies

Evolution of Sequencing

• Remarkable improvement in sequencing efficiency since inception
• The amount of sequencing that one person can perform has increased dramatically
  – 1980: 0.1–1 kb per year
  – 1985: 2–10 kb per year
  – 1990: 25–50 kb per year
  – 1996: 100–200 kb per year
  – 2000: 500–1,000 kb per year
  – 2020: ~300–1,000 Gb per day
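To put the throughput figures above on a common scale, the short Python sketch below converts each one to bases per year and computes fold increases. It takes the upper bound of each quoted range, and for 2020 it assumes the per-day figure is sustained for 365 days; both choices are illustrative assumptions, not part of the original figures.

```python
# Per-person sequencing throughput, upper bound of each range quoted above.
KB = 1_000            # bases in a kilobase
GB = 1_000_000_000    # bases in a gigabase

throughput_bases_per_year = {
    1980: 1 * KB,
    1985: 10 * KB,
    1990: 50 * KB,
    1996: 200 * KB,
    2000: 1_000 * KB,
    # Quoted per day; assumes (hypothetically) 365 days of continuous running.
    2020: 1_000 * GB * 365,
}


def fold_increase(start_year, end_year):
    """Ratio of yearly throughput between two of the listed years."""
    return throughput_bases_per_year[end_year] / throughput_bases_per_year[start_year]


if __name__ == "__main__":
    years = sorted(throughput_bases_per_year)
    for earlier, later in zip(years, years[1:]):
        print(f"{earlier} -> {later}: {fold_increase(earlier, later):,.0f}-fold")
```

Under these assumptions the increase between 1980 and 2000 is about a thousand-fold, while the increase between 2000 and 2020 is several orders of magnitude larger.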

Cost of sequencing technologies over the years

I have enough sequencing data... What's next?

Strategies for Microbial Diversity Analysis

[Flowchart: sample collection; community DNA; direct cloning; transformation; PCR amplification; DGGE; metagenomic DNA library; sequencing; direct sequencing using NGS platform; isolation of culturable microorganisms; phylogenetic trees; structural and functional analysis; microbial diversity estimation]

Microbial Community Structure and their survival strategies

Sample  Date       Humidity  Overhead    Pressure  Temperature  Total      UVA        UVB        Wind
ID                 (%)       ozone (DU)  (hPa)     (°C)         radiation  radiation  radiation  speed
                                                                (MJ m-2)   (MJ m-2)   (MJ m-2)   (m s-1)
ST01    8-Jan-19   78.32     271.26      978.28    0.44         0.14       0.011      78.16      17.88
ST02    10-Jan-19  49.7      272.44      985.69    3.09         0.18       0.014      78.08      12.43
ST03    12-Jan-19  41.42     276.86      982.01    1.57         0.18       0.013      78.09      13.33
ST04    14-Jan-19  48.71     277.53      986.73    0.98         0.19       0.013      78.07      8.58
ST05    16-Jan-19  44.97     307.67      980.71    1.6          0.19       0.013      77.64      16.79
ST06    18-Jan-19  46.38     306.21      981.19    0.46         0.19       0.013      78.14      10.17
ST07    20-Jan-19  66.48     295.83      982.98    0.05         0.11       0.009      78.17      10.06
ST08    22-Jan-19  47.4      299.81      971.88    0.85         0.17       0.011      78.13      10.71
ST09    24-Jan-19  54.5      305.79      978.32    -2.53        0.12       0.009      78.15      7.8
ST10    26-Jan-19  72.97     304.04      977.76    -0.78        0.06       0.006      78.07      7.98

Assessment of physical parameters under temporal variation of UV radiation

Sample ID  Chao1  Observed ASVs  Shannon
ST01       1863   1863           7.28
ST02       1151   1151           6.76
ST03       1550   1550           7.13
ST04       1431   1431           6.89
ST05       1629   1629           7.06
ST06       1746   1746           7.19
ST07       1448   1448           6.97
ST08       1240   1240           6.90
ST09       1584   1584           7.08
ST10       1431   1431           7.03
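The indices in the table above can be illustrated with a minimal Python sketch. It assumes the natural-log form of the Shannon index and the bias-corrected form of Chao1; the slides do not state which variants or which software were used, and the count vectors below are hypothetical, not data from this study. The sketch also shows that Chao1 equals the observed richness whenever a sample contains no singletons, which is one situation in which the two columns would be identical.

```python
import math


def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln p_i); natural log is an assumption here."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of taxa seen exactly once and exactly twice.
    """
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical ASV count vectors, for illustration only.
with_singletons = [10, 8, 5, 2, 2, 1, 1, 1]
no_singletons = [10, 8, 5, 2, 2]

if __name__ == "__main__":
    for name, counts in [("with singletons", with_singletons),
                         ("no singletons", no_singletons)]:
        observed = sum(1 for c in counts if c > 0)
        print(f"{name}: observed={observed}, "
              f"chao1={chao1(counts):.1f}, shannon={shannon_index(counts):.2f}")
```

Because the Shannon index depends on the logarithm base, values computed with a different base (for example log2) are not directly comparable with natural-log values.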

Estimates of alpha diversity parameters

Real time PCR based estimation of bacterial biomass

Distribution of bacterial communities under the UVB radiation

Functional study: abundance and distribution of genes

Marisediminicola senii sp. nov. isolated from Queen Maud Land, Antarctica

Scanning electron micrograph of strain SM7_A14T.

Strain SM7_A14T, isolated from a glacier-fed sediment sample collected from Queen Maud Land, Antarctica (70°45’28” S, 11°37’36” E)

[Figure: phylogenetic tree] Reconstruction of phylogenetic tree based on 16S rRNA gene sequences using the neighbour-joining algorithm, depicting the position of strain SM7_A14T with its closest members belonging to the genera of the family Microbacteriaceae. Bootstrap values (expressed as percentages of 1000 replications) above 50% are shown at the branch points. Scale bar, 0.005.

[Figure: phylogenetic tree] Genome-wide phylogeny constructed from whole genome sequences, depicting the distinct positioning of strain SM7_A14T among members of the family Microbacteriaceae. Bootstrap values (expressed as percentages of 1000 replications) above 50% are shown at the branch points.

Microbial Ecology

Soldhar Hot Spring

[Figure: Soldhar hot spring, showing the source of hot water and the 1st to 5th sampling sites; labelled temperatures: 90±1°C, 92±1°C, 90±1°C, 88±1°C and 90±1°C]

Geography of the Spring

• Longitude 79° 39’ 29” E

• Latitude 30° 29’ 25” N

• Altitude 1900m amsl

• Surrounding temperature during sampling min -2° C, max 8° C

• Interesting hydrogeology, as the entire region has an array of small hot springs

17 isolates with 2 Genera

6 Genera in DGGE, 22 species in library construction

Sampling: Agatti Island (10° 52' 47.32" N, 72° 10' 11.86" E) is surrounded by land on its northern side, which makes it a unique geographic location, as it is distinguished from the extent of the northern low-temperature zones.

Sediment samples were collected from various depths of the continental shelf, viz. 1 meter to 40 meters.

Microbial community structure was analyzed by targeting the V3 region of the 16S rRNA gene on the Illumina MiSeq platform (2x150 bp).

Results:

Bacterial phyla distribution

Phylogenetic richness analysis at different depths

Beta diversity analysis

Cultivable analysis

Conclusion

• The continental shelf harbors a wide diversity. (Differences in UCS and LCS)

• Understanding the widespread bacterial diversity of the marine environment can provide baseline data for future multi-omics studies aiming to understand the ecology of marine habitats in relation to biogeochemical cycles.

• Strain SD111T represents a novel species of the genus Domibacillus, for which the name Domibacillus indicus sp. nov. was proposed.

Two Approaches

1. Onsite cultivation
2. Cultivation in the laboratory after transportation of samples (after 2 days)

A total of 449 isolates were obtained

Number of shared (50%) and unique genera obtained from samples processed onsite and in the laboratory. Kajale et al., 2020

Extremophiles-Research Group

Acknowledgements
• Director NCCS, Pune
• Dr. Yogesh Shouche
• Dr. Kunal Jani
• Mr. Swapnil Kajale
• Master's students
• DBT, ICMR and MoES

“All our dreams can come true, if we have the courage to pursue them”

Walt Disney

Thank you for your attention