Microbial Community Response to Heavy and Light Crude Oil in the Great Lakes
Stephen Techtmann 10/24/19 Microbial Sensors Techtmann Lab @ MTU
Investigating the applications of environmental microbial communities
Hydraulic Fracturing Related Antibiotic Resistance Oil Bioremediation Techtmann Lab @ MTU Overview
• Background on oil biodegradation • Microbial response to light and heavy crude oil in the Great Lakes • Machine learning for prediction of contamination in the Great Lakes. Oil Spills Deepwater Horizon Enbridge Line 6B Deepwater Horizon Oil Spill
• 4,1000,000 bbl of oil released • Light Sweet Crude oil released • April 20, 2010 • 1101.7 miles of shoreline oiled
Atlas and Hazen 2011 Enbridge Line 6B Spill – Marshall MI
• 20,082 bbl of oil released • Diluted Bitumen • July 26, 2010 • 70 miles of shoreline oiled
https://www.mlive.com/news/kalamazoo/2010/07/state_of_emergency_declared_as.html Oil Transmissions Pipelines in the Great Lakes Region
Line 5: • 645 miles from Superior WI to Sarnia Ontario • 540,000 barrels per day • Light crude and natural gas liquids (NGLs) Crude oil Oil types and API Gravity Microbes and Biotechnology (Bioremediation)
Low cost input Microbe High value output Decreased Cost Contaminant Increased Efficiency Carbon dioxide or non- toxic daughter products
Carbon dioxide
Microbial Biomass
Petroleum Microbe Daughter Products
Water Microbial Ecology and Biotechnology
Low cost input Microbe High value output
Decreased Cost/Increased Efficiency Complex input Input A Microbe Microbe Output A
Input B Microbe Output B High value input output Input C Microbe Output C
Input D Microbe Output D
Complex input Microbial High value output Community Conceptual Model of Oil Biodegradation
Oxygen Carbon dioxide
pH Microbial Biomass
Petroleum Microbial Community Daughter Products Temperature
Nutrients (N, P, Fe) Water Next-generation sequencing and microbial ecology
Nucleic Acid Extraction
DNA RNA
Sequencing
Marker Gene Shotgun Sequencing Metagenomics Metatranscriptomics Bioinformatic analyses Phylogenetic Genes and Microbial Composition Functional Activity Potential Hazen et al 2013 16S rRNA Sequencing for Community Profiling
Environmental DNA
rRNA is useful in microbial ecology because 1) It is conserved in all organisms (16S rRNA – Amplified rRNA Bacteria and Archaea, 18S rRNA - Eukarya genes 2) It is rarely horizontally transferred 3) Minor difference in hypervariable regions can distinguish between microbial taxa
Next-generation sequencing can be used to Sequence generate millions of 16S rRNA reads from hundreds of samples in 24 hours
Operational Taxonomic Unit Phylogenetic Diversity (OTU) – Taxonomic grouping defined solely on sequence Relative abundance of differences microbial taxa Microbial Community Response to Released Oil in the Gulf of Mexico
Released oil resulted in a dramatic change in the microbial
community composition Mason et al 2012 ISME Oceanospirillales dominate oil-impacted sites throughout the Gulf
The microbial community changed in a reliable manner in oil impacted sites
Mason et al 2012 ISME Line 6B Microbial Response
Pseudomonas spp. dominate the microbial community in Dilbit amended microcosms from the Kalamazoo River
Deshpande et al 2018 Study Goals • Expand the understanding of the microbial response to oil in the Great Lakes. • Determine the impact of different oil types on the microbial response to crude oil. Oxygen Carbon dioxide
pH Microbial Biomass Microbial Petroleum Community Daughter Products Temperature
Nutrients (N, P, Fe) Water Overview
• Background on oil biodegradation • Microbial response to light and heavy crude oil in the Great Lakes • Machine learning for prediction of contamination in the Great Lakes. Sampling
• Surface water samples were collected from seven sites. • Water was transported back to the lab for microcosm experiments Methods - Microcosms
Control – No Oil
Bakken Crude
16S rRNA sequencing Diluted Bitumen North Dakota Bakken Cold Lake Diluted Crude Bitumen
API Gravity (° API) 40.6 API Gravity (° API) 21.7 GLRC MI141 STR1 STR2 STR3 SUP1 SUP4 1.00 Taxonomic Composition of Microcosms
GLRCGLRC MI141MI141 STR1STR1 STR2STR2 STR3STR3 SUP1SUP1 SUP4SUP4 1.00
y 0.75 t Class
i Rank3 n
y 0.75
t i
u Rank3
D_2__Acidimicrobiia D_2__Nitrospira
n u m D_2__AcidimicrobiiaD_2__Actinobacteria D_2__Nitrospira D_2__OPB35 soil group
m D_2__Actinobacteria D_2__OPB35 soil group m
m D_2__AlphaproteobacterD_2__Alphaproteobacteria D_2__Phia ycisphaerae D_2__Phycisphaerae
o o D_2__Betaproteobacteria D_2__Planctomycetacia
C D_2__Betaproteobacteria D_2__Planctomycetacia
C f
0.50 D_2__Chloroflexia D_2__Solibacteres
f o
D_2__Chloroflexia D_2__Solibacteres
0.50 D_2__Cyanobacteria D_2__Spartobacteria
o
n
o D_2__CytophagiaD_2__Cyanobacteria D_2__Sphingobacteriia D_2__Spartobacteria
i t
n D_2__Deltaproteobacteria D_2__Thermoleophilia r
o D_2__Cytophagia D_2__Sphingobacteriia o
i D_2__Flavobacteriia D_2__Verrucomicrobia Incertae Sedis
p t
o D_2__GammaproteobacterD_2__Deltaproteobacteria D_2__Via errucomicrobiae D_2__Thermoleophilia
r r
o D_2__Holophagae
P 0.25 D_2__Flavobacteriia D_2__Verrucomicrobia Incertae Sedis p
o D_2__Gammaproteobacteria D_2__Verrucomicrobiae r D_2__Holophagae P 0.25
0.00
L L L L L L L
N N N N N N N N N N N N N N
O O O O O O O
E E E E E E E E E E E E E E
R R R R R R R
K K K K K K K
M M M M M M M
T T T T T T T
K K K K K K K
U U U U U U U
N N N N N N N
A A A A A A A
T T T T T T T
I I I I I I I
B B B B B B B
O O O O O O O
B B B B B B
0.00 B
C C C C C C C
L L L L L L L
N N N N N N N N N N N N N N
O O O O O O O
E E E E E E E E E E E E E E
R R R R R R R
K K K K K K K
M M M M M M M
T T T T T T T
K K K K K K K
U U U U U U U
N N N N N N N
A A A A A A A
T T T T T T T
I I I I I I I
B B B B B B B
O O O O O O O
B B B B B B B
C C C C C C C Microbial Community Response to Oil
0.2
PERMANOVA comparing oil type
shape Stat BAKKEN F Model 7.8567 0.0 BITUMEN R2 0.03144 CONTROL
P value 0.001 ]
% SITE
2
.
0
1
[
GLRC
2
.
s i x MI141 A −0.2 Pairwise PERMANOVA by oil type Axis 2 2 [10.2%] Axis STR1 STR2 Control Bakken Dilbit STR3 Control 0.002 0.002 SUP1 SUP4 Bakken 0.0430 0.013 −0.4 Dilbit 0.0326 0.0156
−0.2 0.0 0.2 AxisAxis.1 1 [24.3%] [24.3%] PCoA Straits
0.2 Oil addition selects for a distinct community PCoA Straits
0.1 0.2 shape ] 0.1
% STR1
shape
] 9
% STR1
. STR2 9
. STR2 2 2 STR3STR3
1 0.0
1 0.0
[
[
OIL_TYPE
2
.
s
i OIL_TYPEBAKKEN x
2 BITUMEN .
A −0.1 CONTROL s
i BAKKEN x BITUMEN A −0.1 −0.2 CONTROL
−0.3
−0.2 0.0 0.2 −0.2 Axis.1 [24.4%]
−0.3
−0.2 0.0 0.2 Axis.1 [24.4%] STR1 STR2 STR3
Shifts in Microbial Community Composition
Straits 1 Straits 3 STR1 StraitsSTR2 2 STR3
0.2 ]
% 0.2
4
. ]
4 OIL_TYPE
%
2
4
.
[
4 OIL_TYPE BAKKEN
2
[ 1
BAKKEN
1
s
s i
i BITUMENBITUMEN
x
x A
0.0 CONTROL
A A
CONTROL
0.0 o
C
A
P
o
C P
−0.2
T0 T1 T2 T3 T4 T5 T0 T1 T2 T3 T4 T5 T0 T1 T2 T3 T4 T5 Time −0.2
T0 T1 T2 T3 T4 T5 T0 T1 T2 T3 T4 T5 T0 T1 T2 T3 T4 T5 Time Differentially Abundant OTUs between Oil and Control
Oil Control Differentially Abundant OTUs between Oil and Control
Top 5 OTUs Enriched in Oil Microcosms across all the Great Lakes Domain Phylum Class Order Family log2Fold padj Change Bacteria Proteobacteria Betaproteobacteria Rhodocyclales Rhodocyclaceae -4.825034181 1.15E-09 Bacteria Proteobacteria Alphaproteobacteria Sphingomonadales Sphingomonadaceae -4.813390899 1.01E-80 Bacteria Proteobacteria Alphaproteobacteria Alphaproteobacteria Unknown Family -4.76857179 2.89E-05 Incertae Sedis Bacteria Proteobacteria Alphaproteobacteria Sphingomonadales Sphingomonadaceae -4.437041976 0.000114849 Bacteria Proteobacteria Alphaproteobacteria SAR11 clade LD12 freshwater group -4.351315443 6.79E-12 Top 5 OTUs Enriched in Control Microcosms across all the Great Lakes Domain Phylum Class Order Family log2Fold padj Change Bacteria Proteobacteria Betaproteobacteria Burkholderiales Comamonadaceae 4.916518199 9.13E-14 Bacteria Cyanobacteria Chloroplast uncultured diatom uncultured diatom 4.409444746 3.70E-05 Bacteria Actinobacteria Actinobacteria Frankiales Sporichthyaceae 4.407054997 2.84E-28 Bacteria Proteobacteria Betaproteobacteria Burkholderiales Comamonadaceae 3.949906563 1.16E-16 Bacteria Proteobacteria Gammaproteobacteria Pseudomonadales Moraxellaceae 3.908584356 5.68E-10 Oil Type Selects for a Distinct Community
Straits 1 Straits 3 STR1 StraitsSTR2 2 STR3
Pairwise PERMANOVA by oil type 0.2
] Control Bakken Dilbit
%
4 .
4 OIL_TYPE Control 0.002 0.002
2 [
BAKKEN 1
Bakken 0.0430 0.013 s
i BITUMEN
x A
0.0 CONTROL Dilbit 0.0326 0.0156
A
o
C P
−0.2
T0 T1 T2 T3 T4 T5 T0 T1 T2 T3 T4 T5 T0 T1 T2 T3 T4 T5 Time Differentially Abundant OTUs between Bakken and Dilbit
Bakken DilBit Differentially Abundant OTUs between Bakken and Dilbit
Top 5 OTUs Enriched in Bakken Microcosms across all the Great Lakes
Phlyum Class Order Family Genus log2Fold padj Change Proteobacteria Betaproteobacteria Burkholderiales Comamonadaceae Aquabacterium -7.267517002 4.83E-12 Proteobacteria Betaproteobacteria Burkholderiales Comamonadaceae Aquabacterium -6.885120619 9.00E-12 Proteobacteria Betaproteobacteria Burkholderiales Comamonadaceae NA -5.246257406 4.22E-12 Proteobacteria Betaproteobacteria Burkholderiales Burkholderiaceae Polynucleobacter -3.833837931 0.004553027 Proteobacteria Alphaproteobacteria SAR11 clade LD12 freshwater group uncultured bacterium -3.758914796 2.83E-07 Top 5 OTUs Enriched in Dilbit Microcosms across all the Great Lakes Phlyum Class Order Family Genus log2Fold padj Change Proteobacteria Betaproteobacteria Burkholderiales Comamonadaceae Brachymonas 2.771554143 0.00492125 Proteobacteria Betaproteobacteria Burkholderiales Oxalobacteraceae NA 2.189047269 0.002388574 Proteobacteria Alphaproteobacteria Sphingomonadales Sphingomonadaceae Novosphingobium 1.886806751 0.00492125 Proteobacteria Alphaproteobacteria Sphingomonadales Sphingomonadaceae NA 1.87143764 0.004692176 Proteobacteria Alphaproteobacteria SAR11 clade LD12 freshwater group uncultured bacterium 1.509995256 0.001452114 Summary so far • A diverse set of microbes responds to oil in the Great Lakes. • Alpha- and Betaproteobacteria are the primary groups enriched in response to oil. • Many of the enriched OTUs are related to known oil-degrading taxa. • There is a distinct microbial response to different types of crude oil. Overview
• Background on oil biodegradation • Microbial response to crude oil in the Great Lakes • Comparison of the microbial response to light and heavy crude oil. • Machine learning for prediction of contamination in the Great Lakes. Gulf of Mexico Sampling Sites
Smith et al 2015 mBio Ability of the Microbial Community Structure to Predict the Presence of Oil
Smith et al 2015 mBio The abundance of two taxa can predict oil contamination in the GoM
Smith et al 2015 mBio Machine Learning and Microbial Ecology
List of taxa found in oiled Sample Train ML model to identify features (taxa) that distinguish List of taxa between oil and no oil found in Non- Oiled Sample 10 oil samples Model10 no oil Yes No Taxa A?
10 oil samples 0 oil samples 0 no oil samples 10 no oil samples Machine Learning and Microbial Ecology
List of taxa found in oiled Taxa A – oil Sample Train ML model to identify features (taxa) Taxa B – oil that distinguish Taxa C – no oil List of taxa between oil and no oil Taxa D – no oil found in Non- Oiled Sample
O
Model O 75% accurate O classification
N ? O
Presence of Oil across the Great Lakes
0 .
• Random forests model 1
predicting the presence of
8
. 0
oil in microcosms for all n
o
i
6
.
s 0
sites was constructed. i
c
e
4
. r
• Accuracy on test set : 0
Precision P
0.8226 2
. 0
• F1 of test set : 0.7179 AUC = 0.911
0 . CONTROL OIL 0 CONTROL 14 3 0.0 0.2 0.4 0.6 0.8 1.0 Recall OIL 8 37 Recall
Presence of Oil in the Straits of Mackinac
0 .
• Random forests model 1
predicting the presence 8
.
0 n
of oil in microcosms from o
i
6
.
s 0
the Straits of Mackinac. i
c
e
4
. r
• Accuracy on test set: 1 0
Precision
P 2 • F1 of test set : 1 .
0 AUC = 0.974
0 . CONTROL OIL 0 CONTROL 11 0 0.0 0.2 0.4 0.6 0.8 1.0 Recall OIL 0 22 Recall Classifying the type of oil across the Great Lakes
• A random forests model was constructed to classify samples into control, Bakken or Diluted Bitumen crude oils based on the microbial community composition. Metric Control Bakken Dilbit Accuracy Mean F1 F1 0.8627 0.8000 0.7429 All Sites 0.8095 0.8019 Balanced 0.9146 0.8452 0.8017
All Sites All Accuracy Straits 0.9697 0.9696 F1 0.9565 1.0000 0.9524 Balanced 0.9773 1.0000 0.9545 Straits Accuracy Biosignatures
Model Number of features with importance >1 All sites – predict presence of oil 40 All sites – classify oil type 333 Straits – predict presence of oil 177 Straits – classify oil type 119
The most important features were OTUs classified as Sphingomonadales, Comamonadaceae, and Sporichthyaceae Summary • A diverse set of microbes responds to oil in the Great Lakes. • There is a distinct microbial response to different types of crude oil. • Machine learning can serve as an effective tool for interrogating the microbial for features that are predictive of contamination. • Models for prediction of the presence of oil across the Great Lakes were less accurate than those predicting the presence and type of oil in one location. • Models for prediction of oil and classification of the type of contamination were highly accurate in the Straits of Mackinac region. Acknowledgments
• Tim Butler • Paige Webb • Ryan Ghannam • Emma Byrne • The captain and crew of the NOAA R/V 5501 • Jorge Santo Domingo (US EPA) • MTU – REF and startup funs