<<

Nova Southeastern University NSUWorks

HCNSO Student Theses and Dissertations HCNSO Student Work

11-4-2019

Inferred Function and Dynamics of Microbial Communities from the Northern Gulf of Mexico

Deepesh Tourani

Follow this and additional works at: https://nsuworks.nova.edu/occ_stuetd

Part of the Environmental Microbiology and Microbial Ecology Commons, Genomics Commons, Commons, and the and Atmospheric Sciences and Meteorology Commons

Share Feedback About This Item

This Thesis has supplementary content. View the full record on NSUWorks here: https://nsuworks.nova.edu/occ_stuetd/523

NSUWorks Citation Deepesh Tourani. 2019. Inferred Function and Dynamics of Microbial Communities from the Northern Gulf of Mexico. Master's thesis. Nova Southeastern University. Retrieved from NSUWorks, . (523) https://nsuworks.nova.edu/occ_stuetd/523.

This Thesis is brought to you by the HCNSO Student Work at NSUWorks. It has been accepted for inclusion in HCNSO Student Theses and Dissertations by an authorized administrator of NSUWorks. For more information, please contact [email protected]. Thesis of Deepesh Tourani

Submitted in Partial Fulfillment of the Requirements for the Degree of

Master of Science M.S. Marine Biology

Nova Southeastern University Halmos College of Natural Sciences and Oceanography

November 2019

Approved: Thesis Committee

Major Professor: Jose V. Lopez, PhD

Committee Member: Cole G. Easson, PhD

Committee Member: Matthew W. Johnston, PhD

This thesis is available at NSUWorks: https://nsuworks.nova.edu/occ_stuetd/523

Inferred Function and Dynamics of Microbial Communities from the Northern Gulf of Mexico

Deepesh Tourani

Department of Marine and Environmental Sciences Halmos College of Natural Sciences and Oceanography Nova Southeastern University

Major Advisor: Jose V. Lopez, PhD Committee Member: Cole G. Easson, PhD Committee Member: Matthew W. Johnston, PhD

1

TABLE OF CONTENTS

ACKNOWLEDGEMENTS ------4

ABSTRACT ------6

LIST OF FIGURES AND TABLES ------7

INTRODUCTION ------9 Marine Microbial Ecology ------9 16S Amplicon Genomics ------11 Functional Profiling of Microbial Communities ------13 Nutrient Cycling and Circulation in Northern Gulf of Mexico (NGoM) ------14 Microbial Community Dynamics of NGoM ------16 Hypotheses ------18

METHODS ------19 Microbial Sequence and Environmental Data ------19 Taxonomic Assignment ------21 Functional Prediction ------21 Statistical Analyses ------22

RESULTS ------24 DP03-DP04 Cruise and Sequence Data ------24 Taxonomic Analyses – Composition and Structure ------24 Functional Analyses – Composition and Structure ------31 Temporal Analyses – Structure and Dynamics ------40

DISCUSSION ------42 Taxonomic Composition and Structure ------42 Functional Composition and Structure ------42 Depth Stratification of Microbial Function ------45 Environmental Drivers and Functional Redundancy ------47 Microbial Functional Dynamics across NGoM ------49 Spatial, Mesoscale, and Temporal Dynamics ------49 Considerations, Developments and Future Applications ------53

CONCLUSION ------56

REFERENCES ------57

APPENDIX I Environmental Data Mapping Table ------66

APPENDIX II QIIME and PICRUSt code ------82

2

APPENDIX III PICRUSt Functions Relative Abundance Table ------84

APPENDIX IV R code for Multivariate Statistical Analyses ------85

APPENDIX V Supplemental Figures ------97

APPENDIX VI Manuscript ------100

3

ACKNOWLEDGEMENTS

Like any research project, this work was only made possible by the support of a number of people. Chief among which would be my advisor, Dr. Joe Lopez. I would like to thank him for handing me a nascent project, and letting me explore it’s potential and steer it my own way. I’m really thankful for his patience and constant support, especially during my initial, and prolonged, waywardness. I also thank him for helping me apply for almost every scholarship and grant that came my way. I’m glad to have worked under your guidance and in your lab. I would like to thank my committee member, Dr. Cole Easson, for building that foundation for the two most crucial technical skills I learned throughout this work, bioinformatics and R. With his help in understanding, using, and troubleshooting these techniques to study microbial ecology, I was able to overcome my general aversion to coding and tackle these expansive fields of study. I would also like to thank my last committee member, Dr. Matt Johnston, for teaching me the interesting field of GIS, and being the one tech-savvy person at the Oceanographic Center I could chat technology with. I would also like to thank him for the giving me the general career guidance that I needed from time to time.

I’m also thankful to Dr. James White, for helping me get acquainted with PICRUSt and sharing his subject matter expertise. I would like to mention Dr. Emily Schmitt Lavin here, for being a great support, giving me the opportunity to work as a teaching assistant, and being the person you could talk to. I would like to appreciate the help all the faculty, admin, and staff have provided throughout my time here at the OC. A lot of minor doubts and major problems were solved because of their diligence. Thank you, to the many friends I made at the OC, for the happy hours, intramurals, and company when needed. Cheers really, to the many people I call friends. It would be hard to write all of their names here, nevertheless, they’ve been there when needed.

From back home in India, thank you Dr. Nupur Bannerjee, for being, not just a high school teacher, but a great mentor and helping me find my path through the most confused of times. To my brother, Chandraprakash Tourani, and his family, thank you for being my getaway destination, and essentially a home-away-from-home. I have always believed that for any work to conclude and see the light of day, it takes the constant support and faith of a number of people. And working on this thesis and towards this graduate degree has only strengthened this belief. But the two people who have been there throughout, and without whom I could never have reached this stage, are my

4

parents, Dr. Rachna Tourani and Dr. Harish Tourani. From letting me wander off to remote Himalayan mountains year after year, to yet making sure I wasn’t lost, they have been a teacher, a boss, a mentor, a friend, and a role model. I hope through this work I can show my immense gratitude and respect for all you have done. You are a beacon of light, and I am forever indebted.

5

ABSTRACT

Microbial communities, or microbiomes, are the major drivers of global biogeochemical cycles, acting as primary producers and decomposers across the water column in the . Thus, they reflect changes in physicochemical properties and nutrient composition of the . However, this correlation between ecological changes and the function of marine microbiomes is poorly understood. Large-scale oceanic events such as the bottom-water oxygen-depleted zone (i.e., “dead zone”) and the Deepwater Horizon oil spill in the Gulf of Mexico (GoM) render the ecosystem fragile. These events decrease survival rates of pelagic and coastal macrofauna and affect the biodiversity of the region. As part of the DEEPEND Consortium, previously sequenced 16S rRNA gene data of 466 samples from two cruises in 2016 (May: DP03, August: DP04) were used to characterize the taxa and function of microbiomes across the Northern GoM (NGoM). The Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt) approach was used for predicting biomolecular function based on the KEGG database inferred from 16S rRNA sequences. Metabolism was the most abundant function at KEGG Level 1 (DP03: 53.1%, DP04: 52.4%), and further analyses were centered on pathways within this function. Strong depth stratification of metabolic function was observed (p<0.001), with a major shift in function between the euphotic zone (0-200m) and the aphotic zone (200-2000m), associated primarily with decreasing relative abundance of photosynthetic functional signatures with depth. These functions were followed by methane metabolism in abundance, indicating that the microbial communities switch to primarily chemosynthetic processes in the aphotic zones. A CCA test on five physicochemical environmental variables, sampling depth, temperature, salinity, Dissolved Oxygen, and salinity, resulted in a strong association with function, explaining more variation in metabolism (DP03: 63.2%, DP04: 77.3%) than taxa (DP03: 14.3%, DP04: 25.8%). Among these variables, temperature has been reported as the primary driver of community composition, and was also the major environmental driver of function for this study (DP03: 55.4%, DP04: 68%). GIS spatial analysis of methane metabolism across the aphotic zone showed that offshore sites had relatively higher methane metabolism than nearshore, indicating potential pelagic effects of the Mississippi River outflow. Temporal analyses showed photosynthetic primary productivity was significantly different across seasons but not annually, which may be attributed to high seasonal outflow of the river.

Keywords: microbial ecology, Next-Gen Sequencing, microbiome, Gulf of Mexico, DEEPEND, QIIME, PICRUSt

6

LIST OF FIGURES AND TABLES

List of Figures

Fig 1 Marine Food Web ------10

Fig 2 Satellite image of Mississippi River drainage basin and algal bloom in Gulf of Mexico - 16

Fig 3 DP03 DP04 Sampling Stations ------20

Fig 4 PICRUSt workflow ------22

Fig 5 Average Relative Abundance of Top 10 Phyla for DP03 DP04 ------25

Fig 6 DP03 Relative Abundance of Major Phyla by Depth Zone ------26

Fig 7 DP04 Relative Abundance of Major Phyla by Depth Zone ------27

Fig 8 DP03 Taxa Bray-Curtis PCoA ------28

Fig 9 DP04 Taxa Bray-Curtis PCoA ------29

Fig 10 DP03 Taxa CCA ------30

Fig 11 DP04 Taxa CCA ------31

Fig 12 Average Relative Abundance of all KEGG Level 1 Functional Pathways from DP03 and DP04 ------32

Fig 13 Average Relative Abundance of KEGG Level 2 Metabolic Pathways from DP03 ------33

Fig 14 Average Relative Abundance of KEGG Level 2 Metabolic Pathways from DP04 ------34

Fig 15 Average Relative Abundance of KEGG Level 2 GIP and EIP Pathways from DP03 ---- 35

Fig 16 Average Relative Abundance of KEGG Level 2 GIP and EIP Pathways from DP04 ---- 36

Fig 17 DP03 Metabolism Bray-Curtis PCoA ------37

Fig 18 DP04 Metabolism Bray-Curtis PCoA ------38

Fig 19 DP03 Metabolism CCA ------39

Fig 20 DP04 Metabolism CCA ------40

Fig 21 GIS Map of Methane Metabolism across Aphotic Zone during DP03 ------51

7

Fig 22 GIS Map of Methane Metabolism across Aphotic Zone during DP04 ------52

Fig 23 Weighted UniFrac PCoA plot of Taxa with scaled coordinates ------99

Fig 24 NMDS Plot of Metabolism across Seasons ------100

Fig 25 NMDS Plot of Metabolism across Years ------101

List of Tables

Table 1 Sampling Summary ------21

Table 2 Taxa CCA results ------45

Table 3 DP03 Metabolism SIMPER results ------46

Table 4 DP04 Metabolism SIMPER results ------47

Table 5 Metabolism CCA results ------49

Table 6 Relative Abundance of all PICRUSt KEGG Level 1 Pathways for DP03 ------85

Table 7 Relative Abundance of all PICRUSt KEGG Level 1 Pathways for DP04 ------85

8

INTRODUCTION

Marine Microbial Ecology

Microbial communities constitute a significant portion of the ocean biomass and are the sole primary producers in the oceans (Moran 2015; Sunagawa et al., 2015). These communities are key components of major biogeochemical cycles, driving the nutrient cycles in the ocean. In the euphotic zone, microbes fix atmospheric carbon and nitrogen into organic matter which gets transported up the trophic levels. Microbial debris contributes to the sinking particle flux, known as , thereby transporting nutrients to deeper waters. Microbes in the aphotic zones colonize sinking particulate matter and form microenvironments, thereby supporting life at those depths. Thus, microbial production at the surface supports and affects biota beyond the sunlit zone. Even in oligotrophic waters, defined by limited available nutrients and low productivity, organic carbon is channeled and regenerated through microbes in a process termed the microbial loop (Pomeroy et al., 2007). On the , microbes carry out chemosynthesis while inhabiting sediments and extreme environments such as hydrothermal vents and methane seeps (Ruff et al., 2015; Fortunato et al., 2018) (Figure 1).

Large scale studies in the past decade have analyzed the dynamics of these communities and their correlation and interaction with marine environments across the global oceans. The Global Ocean Sampling (GOS) expedition was one of the first comprehensive phylogenetic studies of marine microbial communities (Rusch et al., 2007). A detailed global study undertaken later by the Tara Oceans also analyzed the functional composition of microbial communities (Sunagawa et al., 2015).

9

Figure 1. Marine Food Web. Bacterioplankton are the primary producers, supporting the marine food web across depth zones. US Department of Energy, Office of Biological and Environmental Research. (https://public.ornl.gov/site/gallery/detail.cfm?id=326)

Microbial community structure and function is strongly driven by environmental factors. The structure of ocean microbial communities is primarily driven by depth, while in the euphotic zone, temperature is the major environmental factor, followed by Dissolved Oxygen (DO), driving the taxonomic composition and function of microbial communities (Sunagawa et al., 2015). The most abundant microbial taxa in the oceans across depth zones are the Alphaproteobacteria and Gammaproteobacteria classes of the Proteobacteria phyla. The photosynthetic phyla cyanobacteria, consisting of genus Prochlorococcus and Synechococcus, are abundant in the sunlit zone and responsible for microbial production (Salazar and Sunagawa, 2017). The abundance of cyanobacteria in turn decreases with increasing depth. The coastal ocean as a result has higher abundance of cyanobacteria and heterotrophic bacteria than open ocean (Moran, 2015). Vertical distribution patterns also showed an increase in diversity and abundance with depth. GOS found

10

evidence of phylogenetically related communities across distant geographic locations. Microbial communities differ seasonally with negligible annual variation. This is primarily attributed to the stratification of ocean layers in higher temperatures of summer months and the higher freshwater and nutrient input.

Microbial communities reflect changes in nutrient composition and physicochemical properties such as temperature, salinity, and DO (DeLong, 2005; Xu, 2006; Fuhrman, 2009). The microbial ecology of marine environments differentiates between pelagic habitats and offers biological insight into the overall functioning and health of the oceans and constituent fauna. Characterizing the diversity and distribution of marine microbes is particularly unique and challenging as compared to other natural environments such as soil due to the dilute aquatic conditions and global mixing of water masses. Although the richness and abundance of free-living microbes is lower than host-associated communities such as human gut microbiome, these communities play a crucial role in quantitatively driving biotic life cycles in the natural ecosystems. Global oceans are considered a sink for carbon, wherein microbial production at depths plays a key role. Environmental changes induced by anthropogenic activities can thus alter the community structure and function, the effects of which will escalate to higher trophic levels and ultimately affect the entire marine ecosystem.

16S Amplicon Genomics

One of the most common ways of studying microbial communities is using metagenomics, which involves sequencing genes from environmental samples for microbial communities, most of which are not cultivable in vitro (Schloss and Handelsman, 2003; Handelsman, 2004; Allen and Banfield, 2005). Computational developments have evolved classical microbiology, which previously relied solely on lab-scale cultivation of individual species of microbes, into an informatics-based approach to analyze whole communities of microbes (Gilbert and Dupont, 2010). Characterization of such communities can be carried out by sequencing amplicons of 16S rRNA gene (Handelsman, 2004; Tringe and Hugenholtz, 2008). 16S is a component of the 30S small subunit of a ribosomal RNA found in prokaryotes. A near universal presence in prokaryotes along with slower rates of evolution and relatively unchanged function across evolutionary

11

lineages has propelled the use of 16S amplicon sequencing in microbial phylogenetic studies (Tringe and Hugenholtz, 2008; Caporaso et al., 2011).

The amplicon sequencing approach allows for the analysis of structure and composition of microbial communities, or microbiomes, for ecological studies in diverse environments such as the human gut (Turnbaugh et al., 2009), skin (Grice et al., 2009) and oral cavity (Dewhirst et al., 2010), fish gut (Sanchez et al., 2012), rhizosphere (Berendsen, Pieterse, and Bakker, 2012), microbial mats (Ley et al., 2006), and marine (Morris et al., 2002; Rusch et al., 2007; Fuhrman et al., 2008; Sunagawa et al., 2015). Research in the Molecular Microbiology and Genomics lab has employed Illumina amplicon sequencing to characterize microbial communities from freshwater and of South Florida (Campbell et al., 2015; O'Connell, 2015; Karns, 2017; Skutas, 2017; Donnelly, 2018) and anglerfish symbiont (Freed et al., 2019).

Amplicon data is commonly analyzed by clustering read sequences with a minimum 97% similarity into a common taxon ’unit’, called an operational taxonomic unit (OTU) (Schloss et al., 2009). Common methods of clustering include reference-based, where sample sequences are compared with sequences from a reference database, and de novo, where sequences are clustered based on similarities within the sample dataset. While 16S data are used for characterizing community structure in most large-scale microbiome studies, there are no widely accepted standards for taxonomic assignment of 16S-derived microbes (Pollock et al., 2018). Recent studies have shown that clustering of reads and the arbitrary similarity cut-off used by the OTU approach has some inherent limitations such as the inability to resolve closely related species (Tikhonov, Leach, and Wingreen, 2015; Callahan, McMurdie, and Holmes, 2017). Reference-based OTU clustering also relies on the curation of existing 16S sequence databases and are subject to biases introduced by the respective databases (Westcott and Schloss, 2015). A recent approach, called Amplicon Sequence Variants, addresses these concerns by inferring taxonomy based on a biological sample sequence by considering each nucleotide, rather than clustering (Callahan, McMurdie, and Holmes, 2017). The International Census of Marine Microbes (ICoMM; http://icomm.mbl.edu) is a repository that facilitates storage and distribution of microbial sequence and structure data from such studies representing various marine environments.

12

Functional Profiling of Microbial Communities Microbial ecology studies from different marine environments are mostly based on composition of microbial communities, with fewer studies on functional composition. Analyses of such communities have revealed patterns that could not be explained by taxonomic composition. Global trends show similar taxonomic composition across different spatial environments (Sunagawa et al., 2015), and varying community structure in similar environments (Fernández et al., 1999). Taxonomic studies hence do not allow enough resolution to effectively analyze community dynamics with respect to biogeochemical cycles. A possible explanation for this disparity is the lack of inclusion of phenomena such as Horizontal Gene Transfer (HGT), which allows microbes in an environment to transfer genetic material to perform ecological niche- specific functions (Boucher et al., 2003; Falkowski, Fenchel, and Delong 2008). Taxonomically different clades in a similar environment can, as a result, perform similar metabolic functions, contributing to survival of whole community.

Microbial community function has only been studied recently, with the application of Next- Gen Sequencing (NGS) in 16S and metagenomic surveys. The Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt) approach assists in finding out whole community gene composition and community function using 16S sequence data and reference databases (Langille et al., 2013). Recent applications of PICRUSt include studies on the coastal Arabian Sea (Kumar, Mishra, and Jha, 2019), the Mariana Trench (Gao et al., 2019), effect of hydrocarbons and dispersants on marine biofilms (Salerno et al., 2018), and dinoflagellate (Zhou et al., 2018) and blooms (Nowinski et al., 2019).

Recent studies on community functional profiles showed higher correlation between environmental conditions and function than composition (Louca, Parfrey, and Doebeli 2016). The idea of functional redundancy explains these variations by suggesting that community functional dynamics are independent of community composition. Functional profiling of microbial communities more closely reflects correlation between microbial communities and local environmental conditions. To study marine microbial community dynamics, functional profiling can also account for globally connected water masses which may transport microbes across global oceans and show unexplained taxonomic trends. Based on metabolic pathways, marine water bodies can be differentiated and environmental stressors can be identified (Rusch et al., 2007).

13

An emerging trend in the upper layers of the global oceans is that temperature is the main environmental variable driving community structure (Sunagawa et al., 2015; Logares et al., 2018). Metabolic functions such as photosynthesis and photoheterotrophy are the most abundant in this region, where Calvin cycle carbon fixation enzymes were highly expressed. The Tara Oceans study also found genes responsible for aerobic respiration and photosynthesis in the mesopelagic, suggesting the presence of upper-layer microbes in the sinking marine snow, as they adapt to different niches and carry out remineralization (Sunagawa et al., 2015). As depth increases and sunlight and temperature decrease, microbial communities switch to chemosynthetic processes and expand into metabolic niches, increasing functional richness with depth (Orcutt et al., 2011; Sunagawa et al., 2015). Nitrogen and sulfur metabolism processes increase to support the energy requirements. Most heterotrophic respiration is fueled by sinking organic matter (Mestre et al., 2018). These processes support nutrient cycles in the aphotic zone by fixing or remineralizing organic and inorganic matter. Along with sinking organic matter, environments such as hydrothermal vents, seeps, and sediments also show high microbial activity.

Nutrient Cycling and Circulation in the Northern Gulf of Mexico

The Gulf of Mexico (GoM) is an ocean basin between the North and South American continents, providing high economic and ecological value to the region (Goolsby et al., 1999; Shen et al., 2016; Ward and Tunnell, 2017). One of the major geographic features of the Northern GoM (NGoM) is the Mississippi River (MR) delta (Figure 2) which drains the contiguous 48 states of the United States and accounts for 90% of the freshwater input into the GoM (Rabalais et al., 1996). The Mississippi river flows through regions of high agricultural activity and urban settlements and serves as a drainage for runoffs from these areas, thus accumulating sediments and nutrients. As the river flows into the GoM, it transfers sediments and nutrients along with the freshwater, thereby acting as a major driving force shaping the microbial communities and the ecosystem (Rabalais, Turner, and Wiseman Jr, 2002).

The Loop Current (LC), as part of the GoM mesoscale (1-100km) circulation, is another oceanographic feature driving the ecosystem as it brings warm water in the region (Liu et al., 2012; Williams et al., 2015). The LC enters the south-eastern GoM from the Yucatan Channel, flows

14

north into the pelagic GoM where it forms a loop, and exits through Straits of Florida forming the Gulfstream current (Leipper, 1970; Oey, Ezer, and Lee, 2005). During this process, the LC sheds cyclonic and anti-cyclonic eddies which form in the east and ebb as they travel west across the GoM. These mesoscale circulations facilitate the mixing of nutrient-rich coastal waters with pelagic waters, consequently shaping the productivity and microbial communities.

Nitrogen and phosphorus are the two major nutrients in aquatic ecosystems and are essential for primary productivity. As such, these nutrients are indicators of ecosystem health and susceptibility (Rabalais et al., 1996; Alexander, Smith, and Schwarz, 2000). There has been a documented increase in nitrogen and phosphorus concentrations in the GoM since the industrial revolution in mid-20th century, leading to atypically increased surface primary productivity and eutrophication (Rabalais et al., 2002). These changes have been attributed to excessive use of nitrogenous fertilizers in agriculture and combustion of fossil fuels which increased the nutrient concentration and loading in the Mississippi river (Rabalais et al., 2002; Turner et al., 2007). This increased nutrient flux results in over-enrichment of river water which in turn causes the higher primary production in the continental shelf as it flows into the Gulf. The increased productivity causes eutrophication on the surface and a corresponding sinking flux of organic and inorganic matter, containing dead algae and fecal pellets. Microbial communities decompose this particulate matter and grow, thereby decreasing the DO and causing hypoxia which results in oxygen stress for pelagic organisms.

The Gulf of Mexico hypoxic zone, or Dead Zone, is the second largest hypoxic zone in the world and covers 20%-50% across the water column (Rabalais et al., 2002), affecting the biogeochemical cycles across the region (Rabalais, Turner, and Wiseman Jr, 2002; Bianchi et al. 2010). The Dead Zone follows a seasonal cycle with an intermittent low DO concentration from March to May, and a sustained hypoxia from June through August (Rabalais et al., 1999). This pattern follows the Mississippi River discharge which peaks from March to May and is circulated throughout summer by the LC into the pelagic GoM. Although the hypoxic zone is studied, managed, and forecasted using statistical models (Scavia et al., 2017), less is known about the microbial communities in the river plume and the water column in the NGoM causing the hypoxia.

An ongoing bottom-water hypoxic zone present in the summer (and in progress since 1985), combined with events such as the Deepwater Horizon oil spill in 2010, has disrupted this

15

fragile ecosystem and has decreased the survival rates of fauna across all trophic levels (Rabalais et al., 2007).

Figure 2. Satellite image of Mississippi river drainage basin and algal bloom in Gulf of Mexico. As the river drains through cities (red) and farms (green) across the continental US, the nutrients and sediments ultimately get transported to the Gulf of Mexico, where they have been causing hypoxic zones for the past three decades due to intense algal blooms (darker colors). Environmental Visualization Lab, NOAA (www.nnvl.noaa.gov/MediaDetail2.php?MediaID=1062&MediaTypeID=3&ResourceID=104616)

Microbial Community Dynamics of Gulf of Mexico

The composition of microbial communities of the GoM has been well characterized using NGS over the past decade. A geophysical feature of the NGoM driving the community structure is the Mississippi River plume. Nutrient input and unique freshwater taxa from the MR plume drive the coastal communities. Depending on the time of year, one study found a decrease in diversity from the coastal to (Mason et al., 2016), while a previous study found no effect of geospatial distance on diversity (King et al., 2013). Higher diversity was observed in MR than in the pelagic ocean, with the outflow potentially enriching pelagic microbes by introducing higher nutrient content or non-marine microbes (Mason et al., 2016). In the hypoxic Dead Zone, high abundance of Thaumarchaeota has been found, suggesting hydrocarbon-degrading activity.

16

The major factor driving microbial communities across the GoM is depth, with communities differing across the euphotic, mesopelagic, and bathypelagic zones (King et al., 2013). Alphaproteobacteria and Bacteroidetes dominate the upper layers while Gammaproteobacteria and Thaumarchaeota are the most abundant taxa in deeper layers. The GoM also has high amount of hydrocarbons due to the presence of numerous methane seeps and vents which may influence microbial communities in the aphotic zone (Rakowski et al., 2015). Studies on the impact of events such as the Macondo blowout and the following DWHOS have shown that microbial communities responded to the increased hydrocarbon in the water with an increase in hydrocarbon-degrading Gammaproteobacteria (Joye, Teske, and Kostka, 2014). The effects of the oil spill and resultant change in pelagic microbial communities was also observed in beach samples off Louisiana, showing the relatively quick coastal impacts with presence of hydrocarbon- degrading taxa such as Gammaproteobacteria (Lamendella et al., 2014).

With fewer functional studies on the pelagic GoM, the interaction and effects of oceanographic features such as MR plume and the LC, and events such as the DWHOS, on community dynamics are not clearly understood. Given the lack of baseline data, the DEEPEND Consortium aims to characterize the structure and function of microbial communities across the NGoM (Sutton et al., 2017). Analysis from the biannual cruises from 2015 showed strong depth selection of microbial communities, correlated with salinity and at upper layers and circulation patterns at depth (Easson and Lopez, 2019). While much is known about the community composition and diversity, the metabolic roles of these microbes and their interaction with regional environmental features is less known. A subsequent study on predictive functional data from one of the cruises from Easson and Lopez, DP02 (August 2015), analyzed xenobiotic metabolism across the NGoM (Bos et al. 2018). This thesis uses microbial sequence data, predictive algorithms, and ETOPO1 (Amante and Eakins, 2009) bathymetry data to identify and characterize the structure and function of microbial communities from the NGoM to assess natural and anthropogenic changes, and quantitatively assist future environmental impact assessments.

17

Hypotheses

The primary goal of this study was to characterize the functional composition of pelagic microbial communities of GoM using the predictive approach of the PICRUSt pipeline. The following hypotheses were tested -

• Temperature is the major environmental variable driving inferred microbial community function across NGoM. • Coastal microbial communities are functionally distinct from pelagic communities. Functional parameters will outline the presence and extent of a hypoxic zone across coastal sampling sites. • Archaeal phyla, anaerobic metabolism, and loop current isolate the bathypelagic into a functionally distinct zone. • Integrating previously generated DP02 PICRUSt data (Bos et al. 2018) with DP03 and DP04 for temporal analyses, functional data will show no significant annual variation in surface primary productivity across the three cruises (1.5 years), with significant seasonal variation. Data from these cruises will also be analyzed for seasonal and annual ecological changes across depth zones from the timepoints August 2015, May 2016, and August 2016.

18

METHODS

Microbial Sequence and Environmental data

Seawater samples from the NGoM were previously obtained from bi-annual cruises in 2016 as part of the DEEPEND Consortium (Fig 3). Sample processing and sequence data made available for this study were previously generated by Dr. Cole G. Easson. The two cruises from 2016 analyzed in this study for community structural and functional analyses sampled 9 stations during 30 April-14 May (DP03), and 12 stations during 5-9 August (DP04). Seawater was collected at each station across four depth zones: Surface (0-10m), Epipelagic (20-200m), Mesopelagic (201-1000m) and Bathypelagic (1000-1600m). Each station had day and night sampling across the four depth zones (Table 1). Sampling stations were also classified based on the mesoscale-level water classification given by Johnston et al (2019). Based on sea-surface height and temperature, stations were classified as present in common water, LC, or mixed water, with mixed water defined as intermediate to common water and LC. Environmental parameters included in the analyses for this study include temperature, salinity, chlorophyll a concentration, and dissolved oxygen (for complete environmental data mapping table see Appendix I).

Seawater was filtered through 0.45 µm filter membranes, and DNA was extracted following standard Earth Microbiome Project (EMP) protocols. The V4 hypervariable region of the 16S rRNA gene was amplified using primers 515F and 806R and sequenced on Illumina MiSeq by Dr. Cole G. Easson (for detailed methods, see: Easson and Lopez, 2019). The resulting raw paired- end sequence data for a total of 476 microbial samples from cruises DP03 (n=293) and DP04 (n=183) used for this study is publicly available from the NCBI Sequence Read Archive database (www.ncbi.nlm.nih.gov/sra) under Accession #PRJNA429259, and the Gulf of Mexico Research Initiative Information and Data Cooperative (GRIIDC) (https://data.gulfresearchinitiative.org, RFP-IV, DEEPEND) database under UDI R4.x257.228:0008. Nine ONOFF samples from DP03 were excluded from further analyses as they were collected during transition between stations and not from a specific station site, and CTD41.S.2 was lost during taxonomic assignment, resulting in a total of 283 samples for cruise DP03. The respective environmental data is available on GRIIDC under UDI R4.x257.230:0011 (DP03) and R4.x257.230:0012 (DP04). PICRUSt data has been submitted as dataset R4.x257.000:0011 and DOI: 10.7266/531X8N14.

19

Figure 3. DP03 DP04 Sampling stations. DP03 cruise covered 9 stations and DP04 cruise covered 12 stations across north Gulf of Mexico as part of the DEEPEND Consortium. The basemap is ETOPO1 bathymetry.

20

Table 1. Sampling summary. Sample location and cruise description. Microbial samples from each station and depth had 3 replicates (included).

Cruise DP03 DP04 Dates 30 April - 14 May 2016 5 - 9 August 2016 Season End of Spring End of Summer # Stations 9 12 CTD casts 23 15 Surface (0-10m) 57 45 Epipelagic (20-200m) 76 48 Mesopelagic (201-1000) 104 45 Bathypelagic (1000-1600m) 56 45 Total 293 183

Taxonomic Assignment

Taxonomic assignment of 16S sequence data was carried out on Quantitative Insights Into Microbial Ecology (QIIME) ver. 1.9.1 (Caporaso et al., 2010). The metadata mapping file was validated using the validate_mapping_file.py script with options to disable BarcodeSequence (-b) and LinkerPrimerSequence (-p) check. The script multiple_split_libraries_fastq.py, with the option Phred_offset 33, was run on the per-sample sequence data for demultiplexing. Demultiplexed data was then used as input to assign taxonomy and pick OTUs using the pick_closed_reference_otus.py script, the standard 97% sequence similarity cutoff, and Greengenes ver. 13.8 (DeSantis et al., 2006) as the reference database. The output was a per- sample OTU abundance table in Biological Observation Matrix (BIOM) format. This workflow and options were recommended for the data to be compatible with the PICRUSt pipeline (Langille et al., 2013). The BIOM table was converted to text format using biom convert script and options --to-tsv and --header-key taxonomy (for complete script see Appendix II).

Functional Prediction

Functional inference of sequence data was carried out using PICRUSt ver. 1.1.3 (Langille et al., 2013) using command-line interface locally in the Python-based Conda programming environment. The OTU table output from taxonomic assignment was used as input for the script

21

normalize_by_copy_number.py to normalize the OTU abundance for gene copy number. This script normalizes OTU abundance for variation in the 16S gene copy number across species. Per- sample functional predictions are then carried out based on Kyoto Encyclopedia of Genes and Genomes (KEGG) database Orthologs (KO) using the corrected OTUs, using the script predict_metagenomes.py (Kanehisa et al., 2007). KOs uniquely store molecular-level functions for orthologs. Weighted Nearest Sequenced Taxon Index (NSTI) values were also calculated with the option -a --with_confidence. NSTI values are calculated to check for phylogenetic distance between samples and the closest related sequenced genome in reference databases. KOs were then combined into biological pathways using categorize_by_function.py (Fig 4). The final output is a per-sample functional pathways abundance table based on the KEGG database (for complete script see Appendix II).

Figure 4. PICRUSt Workflow. General workflow of PICRUSt ver. 1.1.3 for metagenomic inference. OTU taxonomic data in BIOM format based on Greengenes database is required as input.

22

Statistical Analyses

Depth profiles for temperature, salinity, chlorophyll a, and DO were generated in MS Excel. Whole abundance OTU and PICRUSt data was summarized by depth zone using the QIIME script summarize_taxa_through_plots.py. Further statistical analyses were carried out in R Studio ver. 1.1.453 using the packages ggplot2, clusterSim, and vegan (Oksanen et al., 2011). Raw OTU abundance data was cleaned up to reduce noise by removing OTUs that were present once or had a total absolute abundance of 1. Due to the dilute nature of microbial communities in the marine environment, the Quality Control (QC) cleanup process for OTU abundance was less stringent than other microbial environments, such as freshwater or soil, to ensure sufficient coverage across samples. Whole abundance OTU and PICRUSt tables were then normalized across each sample using the option n10 of the data.Normalization function of clusterSim package. All further statistical tests were conducted on the relative abundance data.

Taxonomic and functional data were analyzed using the package vegan in R Studio. Beta diversity was analyzed using Bray-Curtis dissimilarity. Beta diversity refers to between-sample differences. Bray-Curtis is a dissimilarity distance matrix widely used in microbial ecology statistics to study the compositional structure as it takes into account the null observances. Bray- Curtis dissimilarity was calculated using the vegdist function. Principal Coordinate Analysis (PCoA) was used to visualize this dissimilarity matrix in a two-dimensional plot using the function betadisper. The function adonis was used for pairwise comparison between Bray-Curtis dissimilarity and environmental data. To find the features causing the (dis)similarity, the simper function was used which calculates Similarity of Percentages. For temporal analyses, previously generated DP02 PICRUSt data (Bos et al. 2018) was integrated with DP03 and DP04. Functional data from DP02 (August 2015) provides an additional timepoint and statistical depth to analyze temporal changes in metabolism across the NGoM.

23

RESULTS

DP03-DP04 Cruise and Sequence Data

Following EMP Ontology (EMPO) classification, the microbial environment for this study is defined as free-living (level 1), saline (level 2), and surface and water (level 3) (Thompson et al., 2017). As community structure and dynamics differ between environment types, EMPO classification helps standardize organization of microbial environments and data types which will further facilitate reference-based analyses. Cruise DP03 was undertaken during the end of spring season which sees a higher outflow of MR river, lower surface temperatures and just before the onset of hypoxia (Rabalais, Turner, and Wiseman Jr, 2002). Cruise DP04 was undertaken near the end of summer season which sees a relatively less outflow from MR, has higher surface temperatures, and is after the hypoxia season. Sampling depths in the epipelagic zone corresponded to the (DCM) and in the to oxygen minimum zone (OMZ).

Raw 16S sequence data was summarized using FastQC (Andrews, 2010) and MultiQC (Ewels et al., 2016) on the Galaxy platform (Afgan et al., 2018). DP03 consists of an average 77,152 sequences per sample with an average sequence length of 253 bp. Samples with duplicate MiSeq sequencing runs were combined to get accurate averages. DP04 sequence data consists of average 107,738 sequences per sample with an average sequence length of 251 bp.

Taxonomic Analyses – Composition and Structure

Closed-reference OTU picking on QIIME based on the Greengenes 13_8 database resulted in a total of 12,824 raw OTUs for DP03 and 10,423 raw OTUs for DP04. After QC cleanup, a total of 466 samples (DP03=283, DP04=183) were analyzed for community composition and structure.

OTU data was summarized using the summarize_taxa_through_plots.py script in the QIIME platform. Average relative abundance of OTUs across depth zones shows that Proteobacteria was the dominant microbial phyla (DP03: 45.3%, DP04: 48.4%), followed by Cyanobacteria (DP03: 18.6%, DP04: 16.2%) (Fig. 5). For DP03, these were followed by Actinobacteria (6.9%), Crenarchaeota (6.8%), and Bacteroidetes (6.4%) averaged across the depth

24

zones, while for DP04 it was Euryarchaeota (6.3%), Bacteroidetes (6.1%), and SAR406 (4.9%). Specifically, at individual depth zones, relative abundance of Proteobacteria increased from the sunlit euphotic zone (surface+epipelagic) (DP03: 38.0%, DP04: 43.15%) to the aphotic zone (mesopelagic+bathypelagic) (DP03: 52.55%, DP04: 53.75%) (Fig. 6, 7). The relative abundance of Cyanobacteria decreased sharply from the euphotic (DP03: 33.8%, DP04: 31.4%) to aphotic zone (DP03: 3.45%, DP04: 1.0%), being replaced by Crenarchaeota (12.7%) during DP03 and Euryarchaeota (9.0%) during DP04.

Figure 5. Average Relative Abundance of Top 10 Phyla for DP03 and DP04. Percent relative abundance of major phyla from cruise DP03 and DP04 cruise representing 97.5% and 97.6% of taxa respectively. Proteobacteria was the most abundant phyla, followed by Cyanobacteria.

25

Figure 6. DP03 Relative Abundance of Major Phyla by Depth Zone. Top seven abundant Phyla from cruise DP03 representing 93.9% of taxa. Dashed lines represent abundance of taxa decreasing from surface to bathypelagic. Abundance of Proteobacteria increased with depth, while that of Cyanobacteria decreased.

26

Figure 7. DP04 Relative Abundance of Major Phyla by Depth Zone. Top eight abundant Phyla from cruise DP03 representing 93.6% of taxa. Dashed lines represent abundance of taxa decreasing from surface to bathypelagic. Abundance of Proteobacteria increased with depth, while that of Cyanobacteria decreased.

Initial Beta-diversity analyses were carried out using the beta_diversity_through_plots.py script in QIIME. Beta diversity is used to analyze differences between samples. Weighted UniFrac is a quantitative method used to calculate distance between communities which also takes into account the phylogenetic relatedness of the OTUs (Lozupone and Knight, 2005). Microbial community composition varied significantly across depth zones (adonis test: df = 3, F = 95.476, R2 = 0.50657, p<0.001) which explained 50.6% of the variation in DP03 and 65.9% in DP04. The distance between samples in DP03 and DP04 and the spread across depth zones were visualized in a scaled-coordinates PCoA plot (Appendix V). PCoA plots based on Bray-Curtis dissimilarity

27

index reinforce the spread of taxa based on depth zones (Fig. 8, 9). Bray-Curtis index is based on the OTU abundance data.

Figure 8. DP03 Taxa Bray-Curtis PCoA. Principal Coordinate Analysis (PCoA) plot for DP03 cruise taxa. Taxa from the euphotic zone (surface+epipelagic) cluster separate from aphotic zone (mesopelagic+bathypelagic) with some overlap.

28

Figure 9. DP04 Taxa Bray-Curtis PCoA. PCoA plot for DP04 cruise taxa. Taxa from surface and epipelagic (DCM) cluster independent from each other and from the aphotic zone.

For DP03, the primary taxa causing a difference between epipelagic to mesopelagic were OTUs 355538, 557211, 823476, and 884345. OTUs 355538 of Cyanobacteria (genus Prochlorococcus) and 884345 of Gammaproteobacteria (family SUP05) were significantly different (SIMPER, p<0.01). For DP04, the primary OTUs significantly different from surface to epipelagic were 823476 (SIMPER, p<0.01) of Gammaproteobacteria (family Alteromonadaceae), and 557211 and 427495 (SIMPER, p<0.01) of Cyanobacteria (genus Synechococcus). Between epipelagic and mesopelagic, OTU 823476 of Gammaproteobacteria (family Alteromonadaceae) was the primary taxa causing a significant change (SIMPER, p<0.01), followed by 355538 of Cyanobacteria (family Synechococcaceae).

A CCA test was conducted using a forward-selecting model building function to analyze the effects of environmental variables on taxonomic composition. Environmental variables explained 14.3% (DP03) and 32.7% (DP04) of the variation across water column, with temperature being the major environmental driver (R2 adjusted, DP03: 0.081, DP04: 0.152) (Table 2) (Fig. 10, 11).

29

Figure 10. DP03 Taxa CCA. Canonical Correlation Analysis (CCA) plot showing correlation of environmental variables with taxonomic abundance from DP03 cruise. Variables explained 14.3% of the variation, with temperature as the major environmental driver.

30

Figure 11. DP04 Taxa CCA. CCA plot showing correlation of environmental variables with taxonomic abundance from DP04 cruise. Variables explained 32.7% of the variation, with temperature as the major environmental driver.

Functional Analyses – Composition and Structure

Weighted NSTI values were calculated as recommended by Langille et al. (2013). Mean weighted NSTI values for DP03 (0.17 ± 0.042) and DP04 (0.17 ± 0.047) were relatively high, indicating fewer marine metagenomes available in reference databases. PICRUSt output resulted in a total of 310 KEGG level 3 functional pathways for DP03 and 314 pathways for DP04. PICRUSt data was also summarized using the summarize_taxa_through_plots.py script after reorganizing the raw functional abundance output into a QIIME-compatible BIOM format. Average relative abundance of KEGG level 1 pathways across the water column shows that Metabolism was the dominant functional category (DP03: 53.1%, DP04: 52.4%) (Fig. 12). The next known abundant pathway was Genetic Information Processing (GIP) (DP03: 18.0%, DP04: 17.8%), which consists of pathways such as transcription and translation at KEGG level 2. GIP

31

was followed by Environmental Information Processing (EIP) (DP03: 11.0%, DP04: 11.0%), which consists of pathways such as membrane transport and signal transduction at level 2.

Figure 12. Average Relative Abundance of all KEGG Level 1 Functional Pathways from DP03 and DP04. Based on relative abundance, Metabolism was the most abundant pathway for both cruises (DP03: 53.1%, DP04: 52.4%), followed by Genetic Information Processing.

For KEGG level 2 analyses, percentages are calculated across respective KEGG level 1 pathways. The most abundant metabolic pathway was amino acid metabolism (DP03: 21.3%, DP04: 21.2%), followed by carbohydrate metabolism (DP03: 18.7%, DP04: 18.6%) (Appendix III). The abundance of amino acid metabolism and carbohydrate metabolism, increased with depth, while the following two, energy metabolism and metabolism of cofactors and vitamins, decreased (Fig. 13, 14). Among GIP and EIP, membrane transport and replication and repair processes decreased with depth, while translation, signal transduction and folding, sorting and degradation, and transcription increased (Fig. 15, 16).

32

Figure 13. Average Relative Abundance of KEGG Level 2 Metabolic Pathways from DP03. Depth profiles of KEGG level 2 pathways under the level 1 category of Metabolism, representing 91.2% of metabolism for cruise DP03. Dashed lines represent pathway abundance decreasing from surface to bathypelagic.

33

Figure 14. Average Relative Abundance of KEGG Level 2 Metabolic Pathways from DP04. Depth profiles of KEGG level 2 pathways under the level 1 category of Metabolism, representing 90.9% of metabolism for cruise DP04. Dashed lines represent pathway abundance decreasing from surface to bathypelagic.

34

Figure 15. Average Relative Abundance of KEGG Level 2 GIP and EIP Pathways from DP03. Depth profiles of KEGG level 2 pathways under the level 1 of Genetic Information Processing and Environmental Information Processing for cruise DP03. Dashed lines represent pathway abundance decreasing from surface to bathypelagic. Inset- overall relative abundance of GIP and EIP.

35

Figure 16. Average Relative Abundance of KEGG Level 2 GIP and EIP Pathways from DP04. Depth profiles of KEGG level 2 pathways under the level 1 of Genetic Information Processing and Environmental Information Processing for cruise DP04. Dashed lines represent pathway abundance decreasing from surface to bathypelagic. Inset- overall relative abundance of GIP and EIP.

Further analyses were centered on metabolism at KEGG level 2 and 3 pathways. At the finest functional resolution, KEGG level 3, metabolic functions varied significantly across individual depth zones (adonis test, DP03: p<0.001, DP04: p<0.001). PCoA plots based on Bray- Curtis distance of KEGG level 3 metabolic functions showed that metabolism in the surface and epipelagic zones clustered distinct from that in mesopelagic and bathypelagic zones, with a slight extension of epipelagic function into the aphotic zone during DP03 (Fig. 17, 18).

36

Figure 17. DP03 Metabolism Bray-Curtis PCoA. Clustering of metabolism in euphotic and aphotic zone based on a PCoA plot. The difference in function between epipelagic and mesopelagic (8.8%, SIMPER) was twice more than that between other consecutive depth zones.

37

Figure 18. DP04 Metabolism Bray-Curtis PCoA. Distinct clustering of metabolism in euphotic and aphotic zone based on a PCoA plot. The difference in function between epipelagic and mesopelagic (8.36%, SIMPER) was more than that between other consecutive depth zones.

Across consecutive depth zones, the largest difference was between epipelagic and mesopelagic (SIMPER summary, DP03: 8.8%, DP04: 8.36%), twice more than surface-epipelagic and mesopelagic-bathypelagic. The major pathways significantly different between epipelagic and mesopelagic were photosynthesis proteins, photosynthesis, and porphyrin and chlorophyll metabolism (SIMPER cumulative sum, DP03: 25.1%, DP04: 27.7%) (Table 2). These were followed by methane metabolism (DP03: 3.7%, DP04: 3.4%).

A CCA test was conducted using a forward-selecting model building function to analyze the effects of environmental variables on community function. Environmental variables explained 63.2% (DP03) and 77.3% (DP04) of the variation across water column, with temperature being the major environmental driver (R2 adjusted, DP03: 0.554, DP04: 0.679). Across geospatial

38

features, metabolism across stations of both cruises was significantly different in the euphotic (adonis test, p<0.001) and aphotic zones (adonis test, p<0.002). Across common water, mixed water, and LC water classifications, metabolism was significantly different across euphotic zones of both cruises (adonis test, p<0.001) and across aphotic zone in DP03 (adonis test, p<0.05). However, metabolism was not different across aphotic zone of DP04.

Figure 19. DP03 Metabolism CCA. CCA plot showing correlation of environmental variables with metabolic functional abundance from DP03 cruise. Temperature was the major environmental driver, explaining 55.4% of the variation.

39

Figure 20. DP04 Metabolism CCA. CCA plot showing correlation of environmental variables with metabolic functional abundance from DP03 cruise. Similar to DP03, temperature was the major environmental driver, explaining 67.9% of the variation.

Temporal Analyses – Structure and Dynamics

Stratification of the oceanic water column in summer seasons was evident as depth zones explained 70.6% variation in metabolism during DP04 as compared to 63.6% variation during DP03. Functional abundance data from three cruises, DP02, DP03 and DP04 was used for the following temporal analyses. Microbial metabolism was significantly different across seasons of May and August (adonis test, dF = 1, F = 14.133, R2 = 0.02305, p<0.001). The top three differential metabolic functions across seasons were photosynthesis proteins, photosynthesis, and porphyrin and chlorophyll metabolism (SIMPER, cumulative sum 22.1%). Metabolism was also significantly different annually across 2015 and 2016 (adonis test, dF = 1, F = 22.476, R2 = 0.03617, p<0.001). Although the top three differential functions were the same as those across seasons, they were not significantly different annually. Across individual cruises, these

40

photosynthetic functions were significantly different across DP03-DP04 (SIMPER, p<0.01), and not across DP02-DP03 or DP02-DP04. Overall, metabolism across seasons and year was more stable with increase in depth (Appendix V).

41

DISCUSSION

Microbial ecology studies have increased rapidly in the past decade, owing to the technological and computational advancements in molecular NGS and biological data analysis techniques (Franzosa et al., 2015; Stephens et al., 2015). These approaches have facilitated whole microbial community, or microbiome, analyses, as opposed to the classical microbiology of cultivable in situ single gene or species analyses. Large scale projects such as Human Microbiome Project (Huttenhower et al., 2012) and EMP (Gilbert, Jansson, and Knight, 2014; Thompson et al., 2017) employ these approaches extensively to study the structure and function of microbiomes across various environments. In the natural environment, microbes are the major drivers of biogeochemical cycles, cycling major elements and nutrients such as carbon and nitrogen, across soil and water. They are primary producers and decomposers in the most ubiquitous and extreme ecosystems. This means that microbial communities are strongly linked to the environment and respond to ecological changes brought on by seasonal, geological, or anthropogenic events. Studying the environmental microbiomes can thus indicate how these environmental changes will affect the higher trophic levels.

Marine microbial communities constitute a significant portion of the ocean biomass and are the sole primary producers across depth zones. However, they have only recently been studied for structure, function, distribution, and dynamics. Large-scale studies such as GOS (Rusch et al., 2007), Malaspina expedition (Duarte, 2015), and Tara Oceans (Sunagawa et al., 2015) studied global ocean trends and dynamics of microbiome. Small-scale studies analyzed regional microbial profiles, studying local effects of currents and drainage, and found distinct community structure across freshwater and marine water bodies. To this end, one of the goals of the DEEPEND consortium is to establish a baseline for microbial community structure and dynamics across the NGoM (Sutton et al., 2017). As part of this goal, the primary objective of this study was to characterize the functional profiles of the NGoM microbiome. Overall, a total of 466 16S rRNA amplicon samples from cruises during May and August of 2016 were analyzed for microbial community dynamics.

42

Taxonomic Composition and Structure

Before analyzing the functional dynamics, it is informative to look at the taxonomic dynamics of the NGoM microbiome during two different seasons in 2016. The overall taxonomic composition and structure was consistent with other marine environments, with Proteobacteria being the most abundant phylum, followed by cyanobacteria. Proteobacteria increased with depth, with overall relative abundance of class Gammaproteobacteria more than Alphaproteobacteria. Gammaproteobacteria was more abundant in the bathypelagic, while Alphaproteobacteria was more abundant at surface. This trend has been reported to vary from that in most other marine water bodies and the global ocean, where abundance of Gammaproteobacteria is higher (King et al., 2013; Sunagawa et al., 2015). This variation in trend has been attributed to higher amounts of hydrocarbons in the deep NGoM compared with other deep pelagic habitats, which support the growth of Gammaproteobacteria. Genus Pseudoalteromonas of this phylum specifically responds strongly to the presence of hydrocarbons (King et al., 2013). In DP03, the relative abundance of Pseudoalteromonas in bathypelagic was 7.4%, while in DP04 it was 1%. The higher abundance may also indicate hypoxic conditions at depths, leading to more utilization of hydrocarbons. Another explanation for this varying trend may be due to the 515F-806R primer bias which is known to underestimate the Alphaproteobacterial clade SAR11 or overestimate Gammaproteobacteria (Parada et al., 2016; Easson and Lopez, 2019).

Abundance of cyanobacteria abundance decreased with depth due to decreasing sunlight, although Cyanobacteria was observed in low abundance even beyond the euphotic zone. This trend has been suggested due to the role of cyanobacteria in sinking particle flux (Sunagawa et al., 2015). Another possible explanation for the presence of cyanobacteria in the deeper layers is the variation in abundance of genus and species-level taxa, and ecotypes within, which are better adapted to low light conditions (Hess, 2004). Abundance of cyanobacteria was higher in DP03 than DP04, indicating higher primary production in the spring season. Actinobacteria phyla also had higher relative abundance in DP03, which has been proposed as a microbial freshwater signature (Mason et al., 2016). Archaeal phyla Crenarchaeota and Euryarchaeota increased in abundance from surface to bathypelagic (Fig. 6, 7). It must be noted that due to the use of Greengenes 13.8 database as reference for OTU picking, Thaumarchaeota is ranked as a class under Crenarchaeota. Thaumarchaeota has since been established as a major phylum of Archaea (Brochier-Armanet et

43

al., 2008). In the bathypelagic, relative abundance of Thaumarchaeota was higher in DP03 (10.6%) than in DP04 (6.6%). This is consistent with previous reports of higher abundance of Ammonia Oxidizing Archaea, primarily Thaumarchaeota, in the hypoxic zone (Campbell et al., 2019; Bristow et al., 2015). Combined with higher abundance of Pseudoalteromonas in the early summer of DP03, these trends may possibly indicate the onset of a bottom-water hypoxia. While the beta- diversity trends of family or genus-level taxa within each of these classes might vary across depth, since lower-level taxa are still grouped phylogenetically, a class-level analysis offers a broad resolution in terms of taxonomic analyses.

PCoA plots based on Bray-Curtis dissimilarity index display a major shift in taxonomic composition from euphotic to aphotic zone, as surface and epipelagic cluster distinct from mesopelagic and bathypelagic, with some overlap during end of spring season (Fig. 8, 9). This trend has also been reported for other studies on the NGoM, including DEEPEND cruise DP02, and the global ocean (King et al., 2013; Sunagawa et al., 2015; Easson and Lopez, 2019). A CCA test was carried out to study the effect of physicochemical environmental variables (Fig. 10, 11). The environmental variables included were temperature, salinity, DO, and chlorophyll a concentration. These variables together explained 14% and 26% of the variation in taxonomic structure, with temperature being the major environmental driver (Table 2). While the microbial communities are structured along depth zones, temperature, and not individual sampling depth, was the major environmental driver. This was also found in the cruise DP02 (Easson and Lopez, 2019). The thermal stratification during summer months was evident in the PCoA and CCA analyses of DP04. Depth was the second major environmental driver after temperature in DP04, while in DP03 it was DO. In DP04, the surface and epipelagic also cluster quite distinctly from mesopelagic and bathypelagic, while in DP03, there is slight overlap from epipelagic to mesopelagic. This is also why temperature explained almost twice as much variation in DP04 (15.3%) as in DP03 (8.1%).

44

Table 2: CCA of environmental variables on OTUs. Effect of environmental variables on community taxonomic composition of DP03 and DP04 showed temperature as the major environmental driver.

Cruise Environmental p value 푹ퟐ Adjusted variable + Temperature 0.002 0.081 + DO 0.002 0.025

DP03 + Chla 0.002 0.020 + Salinity 0.002 0.01

0.143

+ Temperature 0.002 0.152 + Salinity 0.002 0.058

DP04 + Depth 0.002 0.052 + DO 0.002 0.046 0.327

Functional Composition and Structure

Depth Stratification of Microbial Function

The functional profiles of microbial communities, inferred using the PICRUSt approach, showed metabolism as the most abundant functional category at KEGG level 1 for both DP03 and DP04 (relative abund. > 50%) (Fig. 12). At KEGG level 2 of metabolism, amino acid metabolism and carbohydrate metabolism were the most abundant pathways (Fig. 13, 14). Amino acid metabolism is a functional group that consists of specific pathways such as Histidine metabolism and Tryptophan metabolism. Carbohydrate metabolism consists of pathways such as Pyruvate metabolism and Glycolysis. A targeted functional study on carbon source utilization by surface microbial communities from the Arabian Sea also found carbohydrates and amino acids as the most utilized substrates (Kumar, Mishra, and Jha, 2019). Relative abundance of amino acid metabolism and carbohydrate metabolism increased with depth and were the most abundant in deeper waters, indicating higher biochemical function. The next two abundant pathways, energy metabolism and metabolism of cofactors and vitamins, decreased with depth, primarily due to the decrease in photosynthetic activity as categorized in KEGG level 3 (Fig. 13, 14). Based on the level 2 of GIP and EIP functions, microbes in deeper waters increased their key biomolecular

45

processes necessary for survival as abundance of transcription and translation pathways increased, while membrane transport and replication and repair decreased (Fig. 15, 16). Specifically, high genetic function was noted in the bathypelagic zone as Purine metabolism was the most abundant pathway in both the cruises.

All further analyses were centered on the level 3 pathways of most abundant level 1 function, metabolism. Metabolism was significantly different across depth zones for both cruises (adonis test, p<0.001). As the physicochemical parameters change noticeably from euphotic to aphotic zone, microbial communities adapt to this change by having a major functional shift. A PCoA plot based on Bray-Curtis index helps visualize this dissimilarity in metabolic abundance (Fig. 17, 18). Photosynthesis proteins, photosynthesis, and Porphyrin and chlorophyll metabolism were the major differential pathways between epipelagic and mesopelagic zones, accounting for 25% and 28% of functional abundance for DP03 and DP04 respectively (Table 3, 4). This shift in function quantitatively supports the significance of sunlight and photosynthesis in driving the oceanic microbial function, as a major portion of the photosynthesis-derived organic carbon is exported to deeper layers (Falkowski, Barber, and Smetacek, 1998; Mestre et al., 2018). Thus any changes in microbial function in the euphotic zone will lead to changes in community function across depth zones, in turn affecting availability of organic material to higher trophic levels and global biogeochemical cycles (Falkowski, Barber, and Smetacek, 1998).

Table 3: SIMPER test on metabolic pathways of DP03. Major pathways causing functional shift from epipelagic to mesopelagic in DP03.

KEGG Level 2 category Pathway p value Cumulative sum Photosynthesis proteins 0.01 0.11 Energy Metabolism Photosynthesis 0.01 0.20 Porphyrin and chlorophyll 0.01 0.25 metabolism Methane metabolism 0.01 0.29 Energy Metabolism Carbon fixation pathways 0.01 0.32 in prokaryotes Citrate cycle (TCA cycle) 0.01 0.35 Carbohydrate Butanoate metabolism 0.01 0.38 Metabolism Propanoate metabolism 0.01 0.404

46

Table 4: SIMPER test on metabolic pathways of DP04. Major pathways causing functional shift from epipelagic to mesopelagic in DP04.

KEGG Level 2 category Pathway p value Cumulative sum Photosynthesis proteins 0.01 0.12 Energy Metabolism Photosynthesis 0.01 0.21 Porphyrin and chlorophyll 0.01 0.28 metabolism Methane metabolism 0.01 0.31 Energy Metabolism Carbon fixation pathways 0.01 0.35 in prokaryotes Citrate cycle (TCA cycle) 0.01 0.375 Carbohydrate Butanoate metabolism 0.01 0.40 Metabolism Photosynthesis-antenna 0.01 0.423 proteins

Contrary to early theories, studies in the past two decades have shown that microbes are present and active in the deep pelagic (Arístegui et al., 2009). In this study, the energy metabolism of microbial communities switched from light-dependent photosynthesis in the euphotic zone, to chemosynthetic processes such as methane metabolism, other sources of carbon fixation, and butanoate metabolism in the aphotic zone. A post-oil spill study from the DWHOS site found high amounts of methane in the deep pelagic, accounting for 15% of the released hydrocarbons (Reddy et al., 2012). Microbial studies in the months following the oil spill found high abundance of methanotrophic bacteria and low abundance in the following month, suggesting insufficient methane consumption (Joye, Teske, and Kostka, 2014). For this study in 2016, methane metabolism was slightly higher in May (DP03: 2.2%) as compared to August (DP04: 2.15%), with highest abundance in the mesopelagic zone.

Quantitative studies based on Net Primary Productivity (NPP) by definition do not include carbon fixed by non-photosynthetic processes, which primarily occur in the aphotic zone, making it important to study those processes and include them in biogeochemical budgets (Orcutt et al., 2011). As photosynthesis is replaced by chemosynthetic processes as the method of primary productivity in the sunlight-deprived aphotic zone, studying these processes can help identify dynamics of processes driving microbial communities. Activity of microbial communities in the deeper layers have also been connected strongly to the sinking particle flux (Nagata et al., 2000;

47

Mestre et al., 2018). Microbes are known to colonize this particle flux, called marine snow, due to their high organic matter content. The switch in function between euphotic and aphotic zone may also be attributed to this increased reliance on readily available organic carbon as energy source (Orcutt et al., 2011). Targeted deep ocean studies may be required to tease apart the function and driving factors of particle flux-associated microbes versus free-living communities and how much do they contribute to marine nutrient cycles and global biogeochemical cycles.

Environmental Drivers and Functional Redundancy

Environmental variables often explain more variation in community function more composition (King et al., 2013; Thompson et al., 2017). For cruises DP03 and DP04, environmental variables explained 63% and 77% variation in metabolic function respectively. Temperature was the major environmental driver and positively correlated with function during DP03 and negatively during DP04 (Fig. 19, 20). Even though function varied significantly across depth zones (adonis test, p<0.001), temperature, and not individual sampling depth, was the major environmental driver of community function across the water column (Table 5). Temperature explained only 8.1% and 15.2% variation in taxa, while it explained 55.4% and 68% of variation in function (Table 2, 5). This underlines the significance of temperature in driving local microbial community structure and function along with its influence across global marine environments (Sunagawa et al. 2015; Thompson et al., 2017). This difference in community taxa and community function is also observed when comparing relative abundance of taxonomic composition and metabolic potential. Variation in abundance of phylum-level taxa (relative abund. SD, DP03: 0.069, DP04: 0.072) was higher than that of metabolic pathways (relative abund. SD, DP03: 0.0082, DP04: 0.0081). A study on the global ocean microbiome also found high variation in taxonomic abundance and relatively stable gene functions (Sunagawa et al., 2015). The results above highlight this variation between taxonomic and functional potential of marine microbial communities, and support the idea of functional redundancy, where community function more closely reflects environmental conditions.

Taxonomic analyses may not always be conclusive, as a recent study on microbial response to oil spill off the South-Eastern coast of India found high abundance of hydrocarbon degraders before and after oil spill in sediment and seawater, but enhanced hydrocarbon genes only post oil spill (Neethu et al., 2018). The current microbial classification system also does not accommodate

48

for processes such as HGT and high mutation rates in the taxonomy, which are key for carrying out selective functions for niche adaptation (Medini et al., 2005). The pan-genome concept addresses this discrepancy by suggesting that the genetic repertoire of a microbe comprises of its core genome, which is shared across species and is essential for basic survival functions, and a dispensable genome which allows for selective adaptive functions and differentiates between strains (Medini et al., 2005; Vernikos et al., 2015). Hence, functional redundancy potentially offers a buffering capacity for microbial communities, where the underlying function may remain within a small range while the composition may vary considerably (Sunagawa et al., 2015; Louca, Parfrey, and Doebeli, 2016). It is this underlying function which is seen more closely associated with environmental factors. Nutrient data was not included while testing environmental drivers as the Tara Oceans study did not find their significance on a global level (Sunagawa et al., 2015).

Table 5: CCA of environmental variables on metabolic function. Effect of environmental variables on community function of DP03 and DP04 showed temperature as the major environmental driver.

Cruise Environmental p value 푹ퟐ Adjusted variable + Temperature 0.002 0.554 + DO 0.002 0.043

DP03 + Chla 0.002 0.019 + Salinity 0.002 0.013

0.632

+ Temperature 0.002 0.68 + DO 0.002 0.053

DP04 + Chla 0.002 0.025 + Salinity 0.002 0.014 0.773

Microbial Functional Dynamics across the NGoM

Spatial, Mesoscale, and Temporal Dynamics

There are varying reports of spatial diversity of microbial taxa in the NGoM, with cases for both, presence (Mason et al., 2016; Easson and Lopez, 2019) and absence (King et al., 2013) of spatial variability, depending on sampling time and location. Overall metabolic function in the

49

NGoM was significantly different across sampling stations in the euphotic and aphotic zones in May (DP03) and August (DP04) of 2016. To further analyze what factors could be causing these differences, mesoscale-level water classification, based on sea-surface height and temperature, across the GoM was used (Johnston et al., 2019). Sampling stations were classified as present in common water, mixed water, or LC region, with mixed water defined as intermediate to common water and LC. Function was significantly different across water classification in the euphotic zone during both sampling times, May (p=0.007) and August (p=0.001). While microbial function was also significantly different across the aphotic zone during May (p=0.023), it was similar in the aphotic zone during August (p=0.109). This observation can be explained by the combined effect of high residence time and the presence of strong water column stratification. As the water is stratified during summer, there is less vertical mixing and the water currents in deeper layers do not cause sufficient mixing, leading to similar microbial activity across the deep waters of the NGoM and the function in aphotic being effectively isolated from that in the euphotic zone.

The euphotic zone has been extensively studied for microbial composition and functional dynamics across the global oceans, with emphasis on surface primary production as it is responsible for ~46% of global primary productivity and feeds the deeper layers with organic matter (Logares et al., 2018). However, supply of organic matter is less than the microbial activity noted in the aphotic zone, indicating significant primary production (Acinas et al., 2019). Methane metabolism is one of the primary methods of fixing inorganic carbon in the deep. Using the PICRUSt functional data from sampling stations of cruises DP03 and DP04, relative abundance of methane metabolism across the aphotic zone was spatially mapped. The abundance of methane metabolism at the site closest to the shore, B175, was higher in late summer of 2016 (Fig. 22) than in early spring (Fig. 21), indicating seasonal variation. Another sampling site slightly further offshore, B252, had high metabolism throughout both seasons. No correlation was found with mesoscale water classification, possibly due to low sample coverage of the LC, a weak LC during both cruises or the absence of the LC in bathypelagic. The GoM seabed has high hydrocarbon content, including approximately 22,000 methane seeps (Joye et al., 2014). While specific natural seeps were not mapped along with the cruise stations, this difference in metabolism across the two relatively close sites may be due to the presence of a natural seep at/near station B252 responsible for a high steady release of methane, causing increased metabolism than a nearby site. Studies on microbial taxa from the GoM have shown that methane may be a key driving force structuring the

50

community across water column and needs to be further studied (Rakowski et al., 2015). A longer time-scale analysis might also be needed to study temporal changes on microbial methane metabolism in the aphotic zone. Due to the higher number of hydrocarbon seeps in the GoM as compared with other marine ecosystems, there is higher concentration of methane in the deeper layers (Rakowski et al., 2015). These concentrations also increased 10-1000 times after the DWHOS (Joye et al., 2011). As methane is a major greenhouse gas and responsible for global ocean warming (Lelieveld, Crutzen, and Dentener, 1998), among other large-scale effects, studying methane microbial dynamics may help understand the factors driving and affecting these dynamics.

Figure 21. GIS Map of Methane Metabolism across Aphotic Zone of the NGoM during DP03. Lighter colors stations show low abundance, while darker colors show higher abundance. Stations further offshore had high relative abundance of methane metabolism in the aphotic zone. The station closest to the shore, B175, had the lowest abundance. The basemap is ETOPO1 bathymetry. Mapped using Equal Interval distribution in QGIS ver. 3.8.2.

51

Figure 22. GIS Map of Methane Metabolism across Aphotic Zone of the NGoM during DP04. Lighter colors stations show low abundance, while darker colors show higher abundance. Stations further offshore had high relative abundance of methane metabolism in the aphotic zone. The station closest to the shore, B175, also had the highest abundance, as opposed to low abundance during DP03. The basemap is ETOPO1 bathymetry. Mapped using Equal Interval distribution in QGIS ver. 3.8.2.

The strength of association of environmental variables with microbial communities differs between taxa and function, with seasonality strongly affecting how these variables drive microbial communities (Sunagawa et al., 2015). For the current dataset, environmental variables explained more variation of function than of taxa (Table 2, 5). Salinity had a positive correlation with relative abundance of metabolism during both time periods, while temperature, oxygen, and chlorophyll a were positively correlated during early spring (DP03) and negatively during late summer (DP04) (Fig. 19, 20). Along with spatial dynamics, time-series studies of marine microbiomes can help determine seasonal and annual patterns, along with potential changes to function caused by irregular natural or anthropogenic events such as hurricanes and oil spills. Functional data from

52

cruise DP02 (Bos et al., 2018) was integrated with the current study of DP03 and DP04 for temporal analyses. The stratification of oceanic depth zones during summer months is also reflected in the metabolic structure of microbial communities, with depth zones explaining more metabolic variation during August 2016 (DP04) and the distinct clustering of euphotic zones and aphotic zones, than in 2016 (DP03). Photosynthetic primary productivity overall was significantly different across seasons (May and August), but not annually (2015 and 2016), supporting a previous study that also found surface primary productivity differs seasonally (Gilbert et al., 2012). Testing for this seasonal variation individually between the three time points of this study, photosynthetic functional abundance was found significantly different between May 2016 and August 2016, but not between August 2015 and May 2016. While the spring season in the NGoM is associated with higher primary productivity than summer, productivity across summer of 2015 was similar to that of spring 2016. A possible explanation for this deviation may be the presence of a strong LC observed during August 2015. Cell abundance studies across the NGoM have shown that the upwelling eddies formed due to the LC enhance nutrient mixing across the water column, thereby causing an increase in microbial plankton abundance and primary productivity (Williams et al., 2015). This deviation may also be due to the lack of sufficient sampling within the LC, as only two sites were sampled within the LC during May 2016 (DP03). Long-term seasonal and annual sampling in the LC might be needed to verify the above analysis. As the LC and the MR outflow are two major mesoscale circulations in the GoM, further studies focused on water circulation and microbial community function can help understand potential pelagic effects of the increased nutrient loading from the Mississippi River runoff.

Considerations, Developments and Future Applications

Whole-genome and amplicon sequencing are immensely powerful tools in studying free- living microbial communities in natural environments such as soil, sediments, and oceans. These techniques are often computationally demanding, thus requiring analysis with caution to check for computational and/or statistical biases. For this study, PICRUSt NSTI values were relatively high (>0.15). NSTI values show the strength of functional predictions and higher values may indicate lower confidence. As the functional predictions and OTU taxonomic clustering depend on availability of whole genomes in reference databases, a higher NSTI number is not uncommon for

53

marine microbiomes as there are fewer sequenced marine microbes (Langille et al., 2013). This is also why NSTI values from PICRUSt analyses are generally lower for human microbiomes as they are better surveyed and the reference databases are more comprehensive. In the current study, this can be seen in the relatively high abundance of Unclassified KEGG level 1 category of microbial functions (Fig. 12).

There are primarily three points of concerns in using the PICRUSt approach. The first concern arises from the use of reference database Greengenes. As microbial single-gene and whole-genome studies are reported, they are added to publicly-accessible reference databases, which are then curated. The version of PICRUSt used in this study only accepted taxonomy derived from Greengenes, which is a relatively older microbial sequence database and might not contain recently surveyed microbes. The second concern is the reliance of PICRUSt on OTUs as taxonomic classification. OTUs are known to resolve taxonomy between genus and species level due to the 97% sequence clustering. This inability to resolve taxonomy to a finer level leaves out the biological variation between microbes, which often have strain-level diversity. To overcome this problem, microbial compositional studies are increasingly using the ASV approach which resolves taxonomy without clustering thereby accommodating for every nucleotide in classification (Callahan, McMurdie, and Holmes, 2017). A third general concern is the overall use of 16S rRNA gene. There are some known biases with the selection of variable regions of the 16S gene (V4, V5, V6) primers (515F-806R) used in profiling microbial communities (Klindworth et al., 2013; Brooks et al., 2015; Willis, Desai, and LaRoche, 2019). With increased specificity and sophistication of lab-based sample processing protocols, NGS, bioinformatic, and statistical tools, these biases can often be accounted or corrected for (Hughes et al., 2001; Schloss and Westcott, 2011; Kennedy et al., 2014; Eisenstein, 2018).

PICRUSt continues to be used in studying aquatic and marine microbial communities with reasonably accurate functional inferences, which will potentially improve with the application of PICRUSt2 (Douglas et al., 2019). While deep-sequencing metagenomics studies offer in-depth insights into the gene content and function of marine microbial communities, they are often expensive and computationally intensive. The 16S amplicon-based tools such as PICRUSt thus offer cost-effective alternatives. Both approaches have in effect highlighted the importance of

54

analyzing microbial function to study marine environments. Microbial analyses can also support studies on marine chemical data and nutrient cycles.

Development of high-throughput sequencing standards, single-gene surveys, metagenomic surveys, and curation of sequence databases are key methods to accurately analyze microbial communities in natural environments. As marine microbial community composition and function has evolved distinctively from other natural environments, a marine environment-specific database such as OMRGC (Sunagawa et al., 2015) can serve as a repository and reference database for future amplicon, metagenomic and metatranscriptomic analyses, avoiding ecosystem bias.

55

CONCLUSIONS

The current study is the first comprehensive report of microbial community function from surface to bathypelagic zone across the pelagic Northern Gulf of Mexico. Metabolic functional data across cruises in May and August of 2016 showed strong depth stratification, with a major shift in function from the euphotic to aphotic zone. Photosynthetic functional signatures such as photosynthesis proteins and photosynthesis metabolism were the major functions driving this shift. Sunlight-dependent primary productivity at the surface was replaced by methane metabolism as the most abundant pathway for primary productivity in the aphotic zone. Methane metabolism in the aphotic zone mapped across the sampling sites showed high abundance in the farthest sites. Analysis of association of physicochemical environmental variables showed temperature as the major driver among other parameters: salinity, absolute depth, chlorophyll a concentration, and dissolved oxygen. Environmental variables overall also had a stronger correlation with function (63% and 77%), explaining more variation than taxonomic composition (14% and 26%).

Metabolic function was also significantly different across the euphotic and aphotic zone of both sampling seasons, indicating spatial variability. Based on mesoscale water classification, common water, mixed water and Loop Current, function was found similar across the deeper layers of the NGoM during late summer, August 2016, which may be attributed to a strong stratification of water column. Previously analyzed data from an earlier cruise in August 2015, was included for temporal analyses. Photosynthetic primary productivity overall was significantly different across seasons (May and August), but not annually (2015 and 2016). This analysis shows potential pelagic effects of the Mississippi River as early spring season of May corresponds with a higher river outflow.

Marine microbes are key drivers of the biogeochemical cycles, and studying their function can help assess ecological changes and environmental effects which may also affect higher trophic levels. The analyses of functional data presented here contribute to studying the dynamics of microbial communities of the GoM and establishing a functional baseline for the region.

56

REFERENCES

Acinas, S. G., Sánchez, P., Salazar, G., Cornejo-Castillo, F. M., Sebastián, M., Logares, R., et al. (2019) Metabolic Architecture of the Deep Ocean Microbiome. bioRxiv, 635680.

Afgan, E., Baker, D., Batut, B., Van Den Beek, M., Bouvier, D., Čech, M., et al. (2018) The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update. Nucleic acids research, 46(W1), W537-W544.

Alexander, R. B., Smith, R. A., & Schwarz, G. E. (2000) Effect of stream channel size on the delivery of nitrogen to the Gulf of Mexico. Nature, 403(6771), 758.

Allen, E. E., & Banfield, J. F. (2005) Community genomics in microbial ecology and evolution. Nature Reviews Microbiology, 3(6), 489.

Amante, C., & Eakins, B. W. (2009) ETOPO1 arc-minute global relief model: procedures, data sources and analysis.

Andrews, S. (2010) FastQC: a quality control tool for high throughput sequence data. In: Babraham Bioinformatics, Babraham Institute, Cambridge, United Kingdom.

Arístegui, J., Gasol, J. M., Duarte, C. M., & Herndld, G. J. (2009) Microbial oceanography of the dark ocean's pelagic realm. and Oceanography, 54(5), 1501-1529.

Berendsen, R. L., Pieterse, C. M., & Bakker, P. A. (2012) The rhizosphere microbiome and plant health. Trends in plant science, 17(8), 478-486.

Bianchi, T. S., DiMarco, S., Cowan Jr, J., Hetland, R., Chapman, P., Day, J., & Allison, M. (2010) The science of hypoxia in the Northern Gulf of Mexico: a review. Science of the Total Environment, 408(7), 1471-1484.

Bos, R. P., Lopez, J., Easson, C., & White, J. (2018) PICRUSt Analysis of Shallow and Deep- Water Microbial Communities in the Gulf of Mexico.

Boucher, Y., Douady, C. J., Papke, R. T., Walsh, D. A., Boudreau, M. E. R., Nesbø, C. L., et al. (2003) Lateral gene transfer and the origins of prokaryotic groups. Annual review of genetics, 37(1), 283-328.

Bristow, L. A., Sarode, N., Cartee, J., Caro‐Quintero, A., Thamdrup, B., & Stewart, F. J. (2015) Biogeochemical and metagenomic analysis of nitrite accumulation in the G ulf of M exico hypoxic zone. Limnology and Oceanography, 60(5), 1733-1750.

Brochier-Armanet, C., Boussau, B., Gribaldo, S., & Forterre, P. (2008) Mesophilic Crenarchaeota: proposal for a third archaeal phylum, the Thaumarchaeota. Nature Reviews Microbiology, 6(3), 245.

57

Brooks, J. P., Edwards, D. J., Harwich, M. D., Rivera, M. C., Fettweis, J. M., Serrano, M. G., et al. (2015) The truth about metagenomics: quantifying and counteracting bias in 16S rRNA studies. BMC microbiology, 15(1), 66.

Callahan, B. J., McMurdie, P. J., & Holmes, S. P. (2017) Exact sequence variants should replace operational taxonomic units in marker-gene data analysis. The ISME journal, 11(12), 2639.

Campbell, A. M., Fleisher, J., Sinigalliano, C., White, J. R., & Lopez, J. V. (2015) Dynamics of marine bacterial community diversity of the coastal waters of the reefs, inlets, and wastewater outfalls of southeast F lorida. MicrobiologyOpen, 4(3), 390-408.

Campbell, L. G., Thrash, J. C., Rabalais, N. N., & Mason, O. U. (2019) Extent of the annual Gulf of Mexico hypoxic zone influences microbial community structure. PloS one, 14(4), e0209055.

Caporaso, J. G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F. D., Costello, E. K., et al. (2010) QIIME allows analysis of high-throughput community sequencing data. Nature methods, 7(5), 335.

Caporaso, J. G., Lauber, C. L., Walters, W. A., Berg-Lyons, D., Lozupone, C. A., Turnbaugh, P. J., et al. (2011) Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proceedings of the National Academy of Sciences, 108(Supplement 1), 4516- 4522.

DeLong, E. F. (2005) Microbial community genomics in the ocean. Nature Reviews Microbiology, 3(6), 459.

DeSantis, T. Z., Hugenholtz, P., Larsen, N., Rojas, M., Brodie, E. L., Keller, K., et al. (2006) Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Applied and Environmental Microbiology, 72(7), 5069-5072.

Dewhirst, F. E., Chen, T., Izard, J., Paster, B. J., Tanner, A. C., Yu, W.-H., et al. (2010) The human oral microbiome. Journal of bacteriology, 192(19), 5002-5017.

Donnelly, C. P. (2018) Microbial Ecology of South Florida Surface Waters: Examining the Potential for Anthropogenic Influences.

Douglas, G. M., Maffei, V. J., Zaneveld, J., Yurgel, S. N., Brown, J. R., Taylor, C. M., et al. (2019) PICRUSt2: An improved and extensible approach for metagenome inference. bioRxiv, 672295.

Duarte, C. M. (2015) Seafaring in the 21st century: The Malaspina 2010 circumnavigation expedition. Limnology and Oceanography Bulletin, 24(1), 11-14.

58

Easson, C. G., & Lopez, J. V. (2018) Depth-dependent environmental drivers of microbial plankton community structure in the northern Gulf of Mexico. Frontiers in microbiology, 9.

Ewels, P., Magnusson, M., Lundin, S., & Käller, M. (2016) MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics, 32(19), 3047 -3048.

Eisenstein, M. (2018) Microbiology: making the best of PCR bias. In: Nature Publishing Group.

Falkowski, P. G., Barber, R. T., & Smetacek, V. (1998) Biogeochemical controls and feedbacks on ocean primary production. Science, 281(5374), 200-206.

Falkowski, P. G., Fenchel, T., & Delong, E. F. (2008) The microbial engines that drive Earth's biogeochemical cycles. Science, 320(5879), 1034-1039.

Fernández, A., Huang, S., Seston, S., Xing, J., Hickey, R., Criddle, C., & Tiedje, J. (1999) How stable is stable? Function versus community composition. Appl. Environ. Microbiol., 65(8), 3697-3704.

Fortunato, C. S., Larson, B., Butterfield, D. A., & Huber, J. A. (2018) Spatially distinct, temporally stable microbial populations mediate biogeochemical cycling at and below the seafloor in hydrothermal vent fluids. Environmental microbiology, 20(2), 769-784.

Franzosa, E. A., Hsu, T., Sirota-Madi, A., Shafquat, A., Abu-Ali, G., Morgan, X. C., & Huttenhower, C. (2015) Sequencing and beyond: integrating molecular'omics' for microbial community profiling. Nature Reviews Microbiology, 13(6), 360.

Freed, L. L. (2018) Characterization of the bioluminescent symbionts from ceratioids collected in the Gulf of Mexico.

Fuhrman, J. A. (2009) Microbial community structure and its functional implications. Nature, 459(7244), 193.

Fuhrman, J. A., Steele, J. A., Hewson, I., Schwalbach, M. S., Brown, M. V., Green, J. L., & Brown, J. H. (2008) A latitudinal diversity gradient in planktonic marine bacteria. Proceedings of the National Academy of Sciences, 105(22), 7774-7778.

Gao, Z. M., Huang, J. M., Cui, G. J., Li, W. L., Li, J., Wei, Z. F., et al. (2019) In situ meta‐omic insights into the community compositions and ecological roles of hadal microbes in the Mariana Trench. Environmental microbiology.

Gilbert, J. A., & Dupont, C. L. (2010) Microbial metagenomics: beyond the genome.

Gilbert, J. A., Jansson, J. K., & Knight, R. (2014) The Earth Microbiome project: successes and aspirations. BMC biology, 12(1), 69.

59

Gilbert, J. A., Steele, J. A., Caporaso, J. G., Steinbrück, L., Reeder, J., Temperton, B., et al. (2012) Defining seasonal marine microbial community dynamics. The ISME journal, 6(2), 298.

Goolsby, D. A., Battaglin, W. A., Lawrence, G. B., Artz, R. S., Aulenbach, B. T., Hooper, R. P., et al. (1999) Flux and sources of nutrients in the Mississippi-Atchafalaya River Basin. National Oceanic and Atmospheric Administration National Ocean Service Coastal Ocean Program.

Grice, E. A., Kong, H. H., Conlan, S., Deming, C. B., Davis, J., Young, A. C., et al. (2009) Topographical and temporal diversity of the human skin microbiome. Science, 324(5931), 1190-1192.

Handelsman, J. (2004) Metagenomics: application of genomics to uncultured microorganisms. Microbiology and molecular biology reviews, 68(4), 669-685.

Hess, W. R. (2004) Genome analysis of marine photosynthetic microbes and their global role. Current opinion in biotechnology, 15(3), 191-198.

Hughes, J. B., Hellmann, J. J., Ricketts, T. H., & Bohannan, B. J. (2001) Counting the uncountable: statistical approaches to estimating microbial diversity. Appl. Environ. Microbiol., 67(10), 4399-4406.

Huttenhower, C., Gevers, D., Knight, R., Abubucker, S., Badger, J. H., Chinwalla, A. T., et al. (2012) Structure, function and diversity of the healthy human microbiome. Nature, 486(7402), 207.

Johnston, M. W., Milligan, R. J., Easson, C. G., DeRada, S., English, D. C., Penta, B., & Sutton, T. T. (2019) An empirically validated method for characterizing pelagic habitats in the Gulf of Mexico using ocean model data. Limnology and Oceanography: Methods.

Joye, S. B., MacDonald, I. R., Leifer, I., & Asper, V. (2011) Magnitude and oxidation potential of hydrocarbon gases released from the BP oil well blowout. Nature Geoscience, 4(3), 160.

Joye, S. B., Teske, A. P., & Kostka, J. E. (2014) Microbial dynamics following the Macondo oil well blowout across Gulf of Mexico environments. BioScience, 64(9), 766-777.

Kanehisa, M., Araki, M., Goto, S., Hattori, M., Hirakawa, M., Itoh, M., et al. (2007) KEGG for linking genomes to life and the environment. Nucleic acids research, 36(suppl_1), D480- D484.

Karns, R. C. (2017) Microbial Community Richness Distinguishes Shark Species Microbiomes in South Florida.

60

Kennedy, K., Hall, M. W., Lynch, M. D., Moreno-Hagelsieb, G., & Neufeld, J. D. (2014) Evaluating bias of Illumina-based bacterial 16S rRNA gene profiles. Appl. Environ. Microbiol., 80(18), 5717-5722.

King, G. M., Smith, C., Tolar, B., & Hollibaugh, J. T. (2013) Analysis of composition and structure of coastal to mesopelagic bacterioplankton communities in the northern gulf of Mexico. Frontiers in microbiology, 3, 438.

Klindworth, A., Pruesse, E., Schweer, T., Peplies, J., Quast, C., Horn, M., & Glöckner, F. O. (2013) Evaluation of general 16S ribosomal RNA gene PCR primers for classical and next-generation sequencing-based diversity studies. Nucleic acids research, 41(1), e1-e1.

Kumar, R., Mishra, A., & Jha, B. (2019) Bacterial community structure and functional diversity in subsurface seawater from the western coastal ecosystem of the Arabian Sea, India. Gene, 701, 55-64.

Lamendella, R., Strutt, S., Borglin, S. E., Chakraborty, R., Tas, N., Mason, O. U., et al. (2014) Assessment of the Deepwater Horizon oil spill impact on Gulf coast microbial communities. Frontiers in microbiology, 5, 130.

Langille, M. G., Zaneveld, J., Caporaso, J. G., McDonald, D., Knights, D., Reyes, J. A., et al. (2013) Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nature biotechnology, 31(9), 814.

Leipper, D. F. (1970) A sequence of current patterns in the Gulf of Mexico. Journal of Geophysical Research, 75(3), 637-657.

Lelieveld, J., Crutzen, P. J., & Dentener, F. J. (1998) Changing concentration, lifetime and climate forcing of atmospheric methane. Tellus B, 50(2), 128-150.

Ley, R. E., Harris, J. K., Wilcox, J., Spear, J. R., Miller, S. R., Bebout, B. M., et al. (2006) Unexpected diversity and complexity of the Guerrero Negro hypersaline microbial mat. Applied and Environmental Microbiology, 72(5), 3685-3695.

Liu, Y., Lee, S. K., Muhling, B. A., Lamkin, J. T., & Enfield, D. B. (2012) Significant reduction of the Loop Current in the 21st century and its impact on the Gulf of Mexico. Journal of Geophysical Research: Oceans, 117(C5).

Logares, R., Deutschmann, I. M., Giner, C. R., Krabberød, A. K., Schmidt, T. S., Rubinat-Ripoll, L., et al. (2018) Different processes shape prokaryotic and picoeukaryotic assemblages in the sunlit ocean microbiome. bioRxiv, 374298.

Louca, S., Parfrey, L. W., & Doebeli, M. (2016) Decoupling function and taxonomy in the global ocean microbiome. Science, 353(6305), 1272-1277.

61

Lozupone, C., & Knight, R. (2005) UniFrac: a new phylogenetic method for comparing microbial communities. Appl. Environ. Microbiol., 71(12), 8228-8235.

Mason, O. U., Canter, E. J., Gillies, L. E., Paisie, T. K., & Roberts, B. J. (2016) Mississippi river plume enriches microbial diversity in the northern gulf of Mexico. Frontiers in microbiology, 7, 1048.

Medini, D., Donati, C., Tettelin, H., Masignani, V., & Rappuoli, R. (2005) The microbial pan- genome. Current opinion in genetics & development, 15(6), 589-594.

Mestre, M., Ruiz-González, C., Logares, R., Duarte, C. M., Gasol, J. M., & Sala, M. M. (2018) Sinking particles promote vertical connectivity in the ocean microbiome. Proceedings of the National Academy of Sciences, 115(29), E6799-E6807.

Moran, M. A. (2015) The global ocean microbiome. Science, 350(6266), aac8455.

Morris, R. M., Rappé, M. S., Connon, S. A., Vergin, K. L., Siebold, W. A., Carlson, C. A., & Giovannoni, S. J. (2002) SAR11 clade dominates ocean surface bacterioplankton communities. Nature, 420(6917), 806.

Nagata, T., Fukuda, H., Fukuda, R., & Koike, I. (2000) Bacterioplankton distribution and production in deep Pacific waters: Large–scale geographic variations and possible coupling with sinking particle fluxes. Limnology and Oceanography, 45(2), 426-435.

Neethu, C., Saravanakumar, C., Purvaja, R., Robin, R., & Ramesh, R. (2019) oil-spill triggered shift in Indigenous Microbial structure and Functional Dynamics in Different Marine environmental Matrices. Scientific reports, 9(1), 1354.

Nowinski, B., Smith, C. B., Thomas, C. M., Esson, K., Marin, R., Preston, C. M., et al. (2019) Microbial metagenomes and metatranscriptomes during a coastal phytoplankton bloom. Scientific data, 6(1), 129.

O'Connell, L. M. (2015) Fine-Grained Bacterial Compositional Analsysis of the Port Everglades Inlet (Broward County, FL) Microbiome using High Throughput DNA Sequencing.

Oey, L., Ezer, T., & Lee, H. (2005) Loop Current, rings and related circulation in the Gulf of Mexico: A review of numerical models and future challenges. Geophysical Monograph- American Geophysical Union, 161, 31.

Oksanen, J., Blanchet, F. G., Kindt, R., Legendre, P., Minchin, P. R., O’hara, R., et al. (2011) vegan: Community ecology package. R package version, 117-118.

Orcutt, B. N., Sylvan, J. B., Knab, N. J., & Edwards, K. J. (2011) Microbial ecology of the dark ocean above, at, and below the seafloor. Microbiol. Mol. Biol. Rev., 75(2), 361-422.

62

Parada, A. E., Needham, D. M., & Fuhrman, J. A. (2016) Every base matters: assessing small subunit rRNA primers for marine microbiomes with mock communities, time series and global field samples. Environmental microbiology, 18(5), 1403-1414.

Pollock, J., Glendinning, L., Wisedchanwet, T., & Watson, M. (2018) The madness of microbiome: attempting to find consensus “best practice” for 16S microbiome studies. Applied and Environmental Microbiology, 84(7), e02627-02617.

Pomeroy, L. R., leB. WILLIAMS, P. J., Azam, F., & Hobbie, J. E. (2007) The microbial loop. Oceanography, 20(2), 28-33.

Rabalais, N. N., Turner, R., Gupta, B. S., Boesch, D., Chapman, P., & Murrell, M. (2007) Hypoxia in the northern Gulf of Mexico: Does the science support the plan to reduce, mitigate, and control hypoxia? Estuaries and Coasts, 30(5), 753-772.

Rabalais, N. N., Turner, R. E., Dortch, Q., Justic, D., Bierman, V. J., & Wiseman, W. J. (2002) Nutrient-enhanced productivity in the northern Gulf of Mexico: past, present and future. In Nutrients and Eutrophication in Estuaries and Coastal Waters (pp. 39-63): Springer.

Rabalais, N. N., Turner, R. E., Justic, D., Dortch, Q., & Wiseman Jr, W. J. (1999) Characterization of hypoxia: topic I report for the integrated assessment on hypoxia in the Gulf of Mexico.

Rabalais, N. N., Turner, R. E., Justić, D., Dortch, Q., Wiseman, W. J., & Gupta, B. K. S. (1996) Nutrient changes in the Mississippi River and system responses on the adjacent continental shelf. Estuaries, 19(2), 386-407.

Rabalais, N. N., Turner, R. E., & Wiseman Jr, W. J. (2002) Gulf of Mexico hypoxia, aka “The dead zone”. Annual Review of ecology and Systematics, 33(1), 235-263.

Rakowski, C., Magen, C., Bosman, S., Gillies, L., Rogers, K., Chanton, J., & Mason, O. U. (2015) Methane and microbial dynamics in the Gulf of Mexico water column. Frontiers in Marine Science, 2, 69.

Reddy, C. M., Arey, J. S., Seewald, J. S., Sylva, S. P., Lemkau, K. L., Nelson, R. K., et al. (2012) Composition and fate of gas and oil released to the water column during the Deepwater Horizon oil spill. Proceedings of the National Academy of Sciences, 109(50), 20229-20234.

Ruff, S. E., Biddle, J. F., Teske, A. P., Knittel, K., Boetius, A., & Ramette, A. (2015) Global dispersion and local diversification of the methane seep microbiome. Proceedings of the National Academy of Sciences, 112(13), 4015-4020.

Rusch, D. B., Halpern, A. L., Sutton, G., Heidelberg, K. B., Williamson, S., Yooseph, S., et al. (2007) The Sorcerer II global ocean sampling expedition: northwest Atlantic through eastern tropical Pacific. PLoS biology, 5(3), e77.

63

Salazar, G., & Sunagawa, S. (2017) Marine microbial diversity. Current Biology, 27(11), R489- R494.

Salerno, J., Little, B., Lee, J., & Hamdan, L. J. (2018) Exposure to crude oil and chemical dispersant may impact marine microbial biofilm composition and steel corrosion. Frontiers in Marine Science, 5(196)

Sanchez, L. M., Wong, W. R., Riener, R. M., Schulze, C. J., & Linington, R. G. (2012) Examining the fish microbiome: vertebrate-derived bacteria as an environmental niche for the discovery of unique marine natural products. PloS one, 7(5), e35398.

Scavia, D., Bertani, I., Obenour, D. R., Turner, R. E., Forrest, D. R., & Katin, A. (2017) Ensemble modeling informs hypoxia management in the northern Gulf of Mexico. Proceedings of the National Academy of Sciences, 114(33), 8823-8828.

Schloss, P. D., & Handelsman, J. (2003) Biotechnological prospects from metagenomics. Current opinion in biotechnology, 14(3), 303-310.

Schloss, P. D., & Westcott, S. L. (2011) Assessing and improving methods used in operational taxonomic unit-based approaches for 16S rRNA gene sequence analysis. Appl. Environ. Microbiol., 77(10), 3219-3226.

Schloss, P. D., Westcott, S. L., Ryabin, T., Hall, J. R., Hartmann, M., Hollister, E. B., et al. (2009) Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Applied and Environmental Microbiology, 75(23), 7537-7541.

Shen, Y., Fichot, C. G., Liang, S. K., & Benner, R. (2016) Biological hot spots and the accumulation of marine dissolved organic matter in a highly productive ocean margin. Limnology and Oceanography, 61(4), 1287-1300.

Skutas, J. L. (2017) Microbial and Genomic Analysis of Environmental Samples in Search of Pathogenic Salmonella.

Stephens, Z. D., Lee, S. Y., Faghri, F., Campbell, R. H., Zhai, C., Efron, M. J., et al. (2015) Big data: astronomical or genomical? PLoS biology, 13(7), e1002195.

Sunagawa, S., Coelho, L. P., Chaffron, S., Kultima, J. R., Labadie, K., Salazar, G., et al. (2015) Structure and function of the global ocean microbiome. Science, 348(6237), 1261359.

Sutton, T., Cook, A., Boswell, K. M., Bracken-Grissom, H. D., DeRada, S., English, D., et al. (2017) Deep-Pelagic Research in the Gulf of Mexico: The DEEPEND Consortium.

Thompson, L. R., Sanders, J. G., McDonald, D., Amir, A., Ladau, J., Locey, K. J., et al. (2017) A communal catalogue reveals Earth’s multiscale microbial diversity. Nature, 551(7681).

64

Tikhonov, M., Leach, R. W., & Wingreen, N. S. (2015) Interpreting 16S metagenomic data without clustering to achieve sub-OTU resolution. The ISME journal, 9(1), 68.

Tringe, S. G., & Hugenholtz, P. (2008) A renaissance for the pioneering 16S rRNA gene. Current opinion in microbiology, 11(5), 442-446.

Turnbaugh, P. J., Ridaura, V. K., Faith, J. J., Rey, F. E., Knight, R., & Gordon, J. I. (2009) The effect of diet on the human gut microbiome: a metagenomic analysis in humanized gnotobiotic mice. Science translational medicine, 1(6), 6ra14-16ra14.

Turner, R. E., Rabalais, N. N., Alexander, R. B., McIsaac, G., & Howarth, R. (2007) Characterization of nutrient, organic carbon, and sediment loads and concentrations from the Mississippi River into the northern Gulf of Mexico. Estuaries and Coasts, 30(5), 773- 790.

Vernikos, G., Medini, D., Riley, D. R., & Tettelin, H. (2015) Ten years of pan-genome analyses. Current opinion in microbiology, 23, 148-154.

Ward, C. H., & Tunnell, J. W. (2017) Habitats and Biota of the Gulf of Mexico: An Overview. In Habitats and Biota of the Gulf of Mexico: Before the Deepwater Horizon Oil Spill (pp. 1- 54): Springer.

Westcott, S. L., & Schloss, P. D. (2015) De novo clustering methods outperform reference-based methods for assigning 16S rRNA gene sequences to operational taxonomic units. PeerJ, 3, e1487.

Williams, A. K., McInnes, A. S., Rooker, J. R., & Quigg, A. (2015) Changes in microbial plankton assemblages induced by mesoscale oceanographic features in the northern Gulf of Mexico. PloS one, 10(9), e0138230.

Willis, C., Desai, D., & LaRoche, J. (2019) Influence of 16S rRNA variable region on perceived diversity of marine microbial communities of the Northern North Atlantic. FEMS microbiology letters, 366(13), fnz152.

Xu, J. (2006) Invited review: microbial ecology in the age of genomics and metagenomics: concepts, tools, and recent advances. Molecular ecology, 15(7), 1713-1731.

Zhou, J., L Richlen, M., R Sehein, T., M Kulis, D., Anderson, D., & Cai, Z. (2018) Microbial community structure and associations during a marine dinoflagellate bloom. Frontiers in microbiology, 9, 1201.

65

APPENDIX I Environmental Data Mapping Table

Combined microbial sample and environmental data mapping table. Original cruise data obtained from the GRIIDC database under UDI: R4.x257.230:0011 (DP03) and R4.x257.230:0012 (DP04)

Depth CTD Identifier Cruise Station mesoscale_feat (m) Pelagic_zone Temperature Salinity Chla Oxygen NCBI SRA CTD31.1600.1 DP03 B082 Common_water 1600 4.Bathypelagic 4.297 34.968 0.01 NA SRR6457113 CTD31.1600.2 DP03 B082 Common_water 1600 4.Bathypelagic 4.297 34.968 0.01 NA SRR6457087 CTD31.1600.3 DP03 B082 Common_water 1600 4.Bathypelagic 4.297 34.968 0.01 NA SRR6457092 CTD31.450.1 DP03 B082 Common_water 450 3.Mesopelagic 8.0095 34.963 0 NA SRR6457114 CTD31.450.2 DP03 B082 Common_water 450 3.Mesopelagic 8.0095 34.963 0 NA SRR6457174 CTD31.450.3 DP03 B082 Common_water 450 3.Mesopelagic 8.0095 34.963 0 NA SRR6457173 CTD31.80.12.1 DP03 B082 Common_water 80 2.Epipelagic 20.4115 36.577 0.597 NA SRR6457086 CTD31.80.12.2 DP03 B082 Common_water 80 2.Epipelagic 20.4115 36.577 0.597 NA SRR6457085 CTD31.80.12.3 DP03 B082 Common_water 80 2.Epipelagic 20.4115 36.577 0.597 NA SRR6457095 CTDS.SWSYS.1 DP03 B082 Common_water 2 1.Surface 25.042 36.289 0.043 NA SRR6457096 CTDS.SWSYS.2 DP03 B082 Common_water 2 1.Surface 25.042 36.289 0.043 NA SRR6457269 CTDS.SWSYS.3 DP03 B082 Common_water 2 1.Surface 25.042 36.289 0.043 NA SRR6457268 CTD32.1600.1 DP03 B082 Common_water 1600 4.Bathypelagic 4.285 34.97 0.007 4.66 SRR6457267 CTD32.1600.2.2 DP03 B082 Common_water 1600 4.Bathypelagic 4.285 34.97 0.007 4.66 SRR6457266 CTD32.1600.3 DP03 B082 Common_water 1600 4.Bathypelagic 4.285 34.97 0.007 4.66 SRR6457265 CTD32.450.1 DP03 B082 Common_water 450 3.Mesopelagic 8.312 34.988 0 2.674 SRR6457264 CTD32.450.2 DP03 B082 Common_water 450 3.Mesopelagic 8.312 34.988 0 2.674 SRR6457263 CTD32.450.3 DP03 B082 Common_water 450 3.Mesopelagic 8.312 34.988 0 2.674 SRR6457262 CTD32.80.10.1 DP03 B082 Common_water 80 2.Epipelagic 19.86 36.528 0.521 3.115 SRR6457272 CTD32.80.10.2 DP03 B082 Common_water 80 2.Epipelagic 19.86 36.528 0.521 3.115 SRR6457271 CTD32.80.10.3 DP03 B082 Common_water 80 2.Epipelagic 19.86 36.528 0.521 3.115 SRR6457210 CTD32.S.11.1 DP03 B082 Common_water 2 1.Surface 24.9435 36.261 0.032 4.677 SRR6457211 CTD32.S.11.2 DP03 B082 Common_water 2 1.Surface 24.9435 36.261 0.032 4.677 SRR6457208 CTD32.S.11.3 DP03 B082 Common_water 2 1.Surface 24.9435 36.261 0.032 4.677 SRR6457209

67 CTD33.1500.1 DP03 B082 Common_water 1500 4.Bathypelagic 4.3095 34.968 0.01 4.638 SRR6457206 CTD33.1500.2 DP03 B082 Common_water 1500 4.Bathypelagic 4.3095 34.968 0.01 4.638 SRR6457207

CTD33.1500.3 DP03 B082 Common_water 1500 4.Bathypelagic 4.3095 34.968 0.01 4.638 SRR6457204 CTD33.377.1 DP03 B082 Common_water 377 3.Mesopelagic 9.469 35.116 0.005 2.609 SRR6457205 CTD33.377.2 DP03 B082 Common_water 377 3.Mesopelagic 9.469 35.116 0.005 2.609 SRR6457202 CTD33.377.3 DP03 B082 Common_water 377 3.Mesopelagic 9.469 35.116 0.005 2.609 SRR6457203 CTD33.68.10.1 DP03 B082 Common_water 68 2.Epipelagic 21.0005 36.447 1.477 4.419 SRR6457536 CTD33.68.10.2 DP03 B082 Common_water 68 2.Epipelagic 21.0005 36.447 1.477 4.419 SRR6457535 CTD33.68.10.3 DP03 B082 Common_water 68 2.Epipelagic 21.0005 36.447 1.477 4.419 SRR6457538 CTD33.S.12.1 DP03 B082 Common_water 2 1.Surface 25.4345 36.287 0.043 4.651 SRR6457537 CTD33.S.12.2 DP03 B082 Common_water 2 1.Surface 25.4345 36.287 0.043 4.651 SRR6457532 CTD33.S.12.3 DP03 B082 Common_water 2 1.Surface 25.4345 36.287 0.043 4.651 SRR6457531 CTD34.1600.1 DP03 B082 Common_water 1600 4.Bathypelagic 4.278 34.97 0.007 4.66 SRR6457534 CTD34.1600.2 DP03 B082 Common_water 1600 4.Bathypelagic 4.278 34.97 0.007 4.66 SRR6457533 CTD34.1600.3 DP03 B082 Common_water 1600 4.Bathypelagic 4.278 34.97 0.007 4.66 SRR6457530 CTD34.375.1 DP03 B082 Common_water 375 3.Mesopelagic 9.4965 35.125 0 2.568 SRR6457529 CTD34.375.3 DP03 B082 Common_water 375 3.Mesopelagic 9.4965 35.125 0 2.568 SRR6457457 CTD34.375.7.2 DP03 B082 Common_water 375 3.Mesopelagic 9.4965 35.125 0 2.568 SRR6457458 CTD34.50.10.1 DP03 B082 Common_water 50 2.Epipelagic 22.641 36.783 0.999 4.198 SRR6457459 CTD34.50.10.2 DP03 B082 Common_water 50 2.Epipelagic 22.641 36.783 0.999 4.198 SRR6457460 CTD34.50.10.3 DP03 B082 Common_water 50 2.Epipelagic 22.641 36.783 0.999 4.198 SRR6457453 CTD34.S.12.1 DP03 B082 Common_water 2 1.Surface 24.922 36.297 0.032 4.681 SRR6457454 CTD34.S.12.2 DP03 B082 Common_water 2 1.Surface 24.922 36.297 0.032 4.681 SRR6457455 CTD34.S.12.3 DP03 B082 Common_water 2 1.Surface 24.922 36.297 0.032 4.681 SRR6457456 CTD35.1500.1 DP03 B287 Common_water 1500 4.Bathypelagic 4.277 34.97 0.032 4.668 SRR6457461 CTD35.1500.2 DP03 B287 Common_water 1500 4.Bathypelagic 4.277 34.97 0.032 4.668 SRR6457462 CTD35.1500.3 DP03 B287 Common_water 1500 4.Bathypelagic 4.277 34.97 0.032 4.668 SRR6457342 CTD35.303.1 DP03 B287 Common_water 303 3.Mesopelagic 10.5005 35.266 0.021 2.543 SRR6457341 CTD35.303.2 DP03 B287 Common_water 303 3.Mesopelagic 10.5005 35.266 0.021 2.543 SRR6457340 CTD35.303.3 DP03 B287 Common_water 303 3.Mesopelagic 10.5005 35.266 0.021 2.543 SRR6457339 CTD35.56.8.1 DP03 B287 Common_water 56 2.Epipelagic 21.262 36.46 1.227 4.495 SRR6457346 CTD35.56.8.2 DP03 B287 Common_water 56 2.Epipelagic 21.262 36.46 1.227 4.495 SRR6457345 CTD35.56.8.3 DP03 B287 Common_water 56 2.Epipelagic 21.262 36.46 1.227 4.495 SRR6457344

68

CTD35.S.12.1 DP03 B287 Common_water 2 1.Surface 25.555 36.285 0.075 4.712 SRR6457343

CTD35.S.12.2 DP03 B287 Common_water 2 1.Surface 25.555 36.285 0.075 4.712 SRR6457337 CTD35.S.12.3 DP03 B287 Common_water 2 1.Surface 25.555 36.285 0.075 4.712 SRR6457336 CTD36.1500.1 DP03 B287 Common_water 1500 4.Bathypelagic 4.273 34.97 0.043 4.67 SRR6457516 CTD36.1500.2 DP03 B287 Common_water 1500 4.Bathypelagic 4.273 34.97 0.043 4.67 SRR6457290 CTD36.1500.3 DP03 B287 Common_water 1500 4.Bathypelagic 4.273 34.97 0.043 4.67 SRR6457287 CTD36.160.1 DP03 B287 Common_water 160 2.Epipelagic 14.771 35.943 0.021 3.065 SRR6457288 CTD36.160.2 DP03 B287 Common_water 160 2.Epipelagic 14.771 35.943 0.021 3.065 SRR6457293 CTD36.160.3 DP03 B287 Common_water 160 2.Epipelagic 14.771 35.943 0.021 3.065 SRR6457294 CTD36.283.1 DP03 B287 Common_water 283 3.Mesopelagic 10.7695 35.308 0.01 2.546 SRR6457291 CTD36.283.2 DP03 B287 Common_water 283 3.Mesopelagic 10.7695 35.308 0.01 2.546 SRR6457292 CTD36.283.3 DP03 B287 Common_water 283 3.Mesopelagic 10.7695 35.308 0.01 2.546 SRR6457296 CTD36.52.1 DP03 B287 Common_water 52 2.Epipelagic 21.753 36.542 0.836 4.116 SRR6457297 CTD36.52.2 DP03 B287 Common_water 52 2.Epipelagic 21.753 36.542 0.836 4.116 SRR6457122 CTD36.52.3 DP03 B287 Common_water 52 2.Epipelagic 21.753 36.542 0.836 4.116 SRR6457121 CTD36.S.11.1 DP03 B287 Common_water 2 1.Surface 25.4075 36.326 0.065 5.714 SRR6457124 CTD36.S.11.2 DP03 B287 Common_water 2 1.Surface 25.4075 36.326 0.065 5.714 SRR6457123 CTD36.S.11.3 DP03 B287 Common_water 2 1.Surface 25.4075 36.326 0.065 5.714 SRR6457126 CTD37.TR.245.1 DP03 B287 Common_water 245 3.Mesopelagic 11.245 35.377 0.021 2.712 SRR6457125 CTD37.TR.245.1.R2 DP03 B287 Common_water 245 3.Mesopelagic 11.245 35.377 0.021 2.712 SRR6457128 CTD37.TR.245.2 DP03 B287 Common_water 245 3.Mesopelagic 11.245 35.377 0.021 2.712 SRR6457127 CTD37.TR.245.3 DP03 B287 Common_water 245 3.Mesopelagic 11.245 35.377 0.021 2.712 SRR6457130 CTD37.TR.274.1 DP03 B287 Common_water 274 3.Mesopelagic 10.2645 35.234 0.021 2.669 SRR6457129 CTD37.TR.274.2 DP03 B287 Common_water 274 3.Mesopelagic 10.2645 35.234 0.021 2.669 SRR6457099 CTD37.TR.274.3 DP03 B287 Common_water 274 3.Mesopelagic 10.2645 35.234 0.021 2.669 SRR6457338 CTD37.TR.50.1 DP03 B287 Common_water 50 2.Epipelagic 22.262 36.379 0.358 4.738 SRR6457357 CTD37.TR.50.2 DP03 B287 Common_water 50 2.Epipelagic 22.262 36.379 0.358 4.738 SRR6457368 CTD37.TR.50.3 DP03 B287 Common_water 50 2.Epipelagic 22.262 36.379 0.358 4.738 SRR6457370 CTD37.TR.50.3.R2 DP03 B287 Common_water 50 2.Epipelagic 22.262 36.379 0.358 4.738 SRR6457118 CTD38.244.1 DP03 B003 Common_water 244 3.Mesopelagic 9.4595 35.152 0.032 2.748 SRR6457091 CTD38.244.2 DP03 B003 Common_water 244 3.Mesopelagic 9.4595 35.152 0.032 2.748 SRR6457298 CTD38.244.3 DP03 B003 Common_water 244 3.Mesopelagic 9.4595 35.152 0.032 2.748 SRR6457295

69

CTD38.59.7.1 DP03 B003 Common_water 59 2.Epipelagic 21.832 36.414 0.945 4.232 SRR6457201

CTD38.59.7.2 DP03 B003 Common_water 59 2.Epipelagic 21.832 36.414 0.945 4.232 SRR6457397 CTD38.59.7.3 DP03 B003 Common_water 59 2.Epipelagic 21.832 36.414 0.945 4.232 SRR6457278 CTD38.S.1 DP03 B003 Common_water 2 1.Surface 25.329 36.343 0.086 4.658 SRR6457275 CTD38.S.2 DP03 B003 Common_water 2 1.Surface 25.329 36.343 0.086 4.658 SRR6457274 CTD38.S.3 DP03 B003 Common_water 2 1.Surface 25.329 36.343 0.086 4.658 SRR6457273 DPO3.38.1500.7A DP03 B003 Common_water 1500 4.Bathypelagic 4.291 34.969 0.032 4.652 SRR6457119 DPO3.38.1500.7B DP03 B003 Common_water 1500 4.Bathypelagic 4.291 34.969 0.032 4.652 SRR6457120 DPO3.38.1500.7C DP03 B003 Common_water 1500 4.Bathypelagic 4.291 34.969 0.032 4.652 SRR6457089 TR.50.39.2A DP03 B003 Common_water 50 2.Epipelagic 23.3315 36.356 0.227 5.083 SRR6457144 TR.50.39.2B DP03 B003 Common_water 50 2.Epipelagic 23.3315 36.356 0.227 5.083 SRR6457143 TR.50.39.2C DP03 B003 Common_water 50 2.Epipelagic 23.3315 36.356 0.227 5.083 SRR6457477 TR.500.39.1.2A DP03 B003 Common_water 301 3.Mesopelagic 8.501 35.033 0.032 2.99 SRR6457473 TR.500.39.1.2B DP03 B003 Common_water 301 3.Mesopelagic 8.501 35.033 0.032 2.99 SRR6457474 TR.500.39.1.2C DP03 B003 Common_water 301 3.Mesopelagic 8.501 35.033 0.032 2.99 SRR6457471 TR.500.39.2.2A DP03 B003 Common_water 300 3.Mesopelagic 8.5275 35.042 0.032 2.998 SRR6457478 TR.500.39.2.2B DP03 B003 Common_water 300 3.Mesopelagic 8.5275 35.042 0.032 2.998 SRR6457475 TR.500.39.2.2C DP03 B003 Common_water 300 3.Mesopelagic 8.5275 35.042 0.032 2.998 SRR6457476 CTD40.1500.1 DP03 B003 Common_water 1500 4.Bathypelagic NA NA NA NA SRR6457472 CTD40.1500.2 DP03 B003 Common_water 1500 4.Bathypelagic NA NA NA NA SRR6457469 CTD40.1500.3 DP03 B003 Common_water 1500 4.Bathypelagic NA NA NA NA SRR6457470 CTD40.252.1 DP03 B003 Common_water 252 3.Mesopelagic 9.52 35.153 0.021 2.707 SRR6457150 CTD40.252.2 DP03 B003 Common_water 252 3.Mesopelagic 9.52 35.153 0.021 2.707 SRR6457149 CTD40.252.3 DP03 B003 Common_water 252 3.Mesopelagic 9.52 35.153 0.021 2.707 SRR6457152 CTD40.64.1 DP03 B003 Common_water 64 2.Epipelagic 22.529 36.994 0.966 NA SRR6457151 CTD40.64.2 DP03 B003 Common_water 64 2.Epipelagic 22.529 36.994 1.966 NA SRR6457146 CTD40.64.3 DP03 B003 Common_water 64 2.Epipelagic 22.529 36.994 2.966 NA SRR6457145 CTD40.S.1 DP03 B003 Common_water 2 1.Surface 24.771 36.316 0.086 3.381 SRR6457148 CTD40.S.2 DP03 B003 Common_water 2 1.Surface 24.771 36.316 0.086 3.381 SRR6457147 CTD40.S.3 DP03 B003 Common_water 2 1.Surface 24.771 36.316 0.086 3.381 SRR6457142 CTD41.1500.1 DP03 B003 Common_water 1500 4.Bathypelagic 4.285 34.968 0.032 4.645 SRR6457141 CTD41.1500.2 DP03 B003 Common_water 1500 4.Bathypelagic 4.285 34.968 0.032 4.645 SRR6457226

70

CTD41.1500.3 DP03 B003 Common_water 1500 4.Bathypelagic 4.285 34.968 0.032 4.645 SRR6457227

CTD41.237.1 DP03 B003 Common_water 237 3.Mesopelagic 10.1745 35.243 0.01 2.733 SRR6457228 CTD41.237.2 DP03 B003 Common_water 237 3.Mesopelagic 10.1745 35.243 0.01 2.733 SRR6457229 CTD41.237.3 DP03 B003 Common_water 237 3.Mesopelagic 10.1745 35.243 0.01 2.733 SRR6457222 CTD41.70.1 DP03 B003 Common_water 70 2.Epipelagic 21.4245 36.913 1.059 NA SRR6457223 CTD41.70.2 DP03 B003 Common_water 70 2.Epipelagic 21.4245 36.913 1.059 NA SRR6457224 CTD41.70.3 DP03 B003 Common_water 70 2.Epipelagic 21.4245 36.913 1.059 NA SRR6457225 CTD41.S.1 DP03 B003 Common_water 2 1.Surface 24.7885 36.362 0.086 4.675 SRR6457230 CTD41.S.3 DP03 B003 Common_water 2 1.Surface 24.7885 36.362 0.086 4.675 SRR6457361 CTD42.1500.2.1 DP03 B079 Common_water 1500 4.Bathypelagic 4.291 34.968 0.032 4.619 SRR6457360 CTD42.1500.2.2 DP03 B079 Common_water 1500 4.Bathypelagic 4.291 34.968 0.032 4.619 SRR6457359 CTD42.1500.2.3 DP03 B079 Common_water 1500 4.Bathypelagic 4.291 34.968 0.032 4.619 SRR6457358 CTD42.347.4.1 DP03 B079 Common_water 347 3.Mesopelagic 10.3 35.283 0.021 NA SRR6457365 CTD42.347.4.2 DP03 B079 Common_water 347 3.Mesopelagic 10.3 35.283 0.021 NA SRR6457364 CTD42.347.4.3 DP03 B079 Common_water 347 3.Mesopelagic 10.3 35.283 0.021 NA SRR6457363 CTD42.94.1 DP03 B079 Common_water 94 2.Epipelagic 21.149 36.5 0.76 4.198 SRR6457362 CTD42.94.2 DP03 B079 Common_water 94 2.Epipelagic 21.149 36.5 0.76 4.198 SRR6457367 CTD42.94.3 DP03 B079 Common_water 94 2.Epipelagic 21.149 36.5 0.76 4.198 SRR6457366 CTD42.S.1 DP03 B079 Common_water 2 1.Surface 25.92 36.465 0.075 4.517 SRR6457519 CTD42.S.2 DP03 B079 Common_water 2 1.Surface 25.92 36.465 0.075 4.517 SRR6457520 CTD42.S.3 DP03 B079 Common_water 2 1.Surface 25.92 36.465 0.075 4.517 SRR6457517 CTD43.1500.2.1 DP03 B079 Common_water 1500 4.Bathypelagic 4.307 34.968 0.021 4.616 SRR6457518 CTD43.1500.2.2 DP03 B079 Common_water 1500 4.Bathypelagic 4.307 34.968 0.021 4.616 SRR6457523 CTD43.1500.2.3 DP03 B079 Common_water 1500 4.Bathypelagic 4.307 34.968 0.021 4.616 SRR6457524 CTD43.360.5.1 DP03 B079 Common_water 360 3.Mesopelagic 10.392 35.246 0.01 2.565 SRR6457521 CTD43.360.5.2 DP03 B079 Common_water 360 3.Mesopelagic 10.392 35.246 0.01 2.565 SRR6457522 CTD43.360.5.2.R2 DP03 B079 Common_water 360 3.Mesopelagic 10.392 35.246 0.01 2.565 SRR6457525 CTD43.360.5.3 DP03 B079 Common_water 360 3.Mesopelagic 10.392 35.246 0.01 2.565 SRR6457526 CTD43.86.8.1 DP03 B079 Common_water 86 2.Epipelagic 21.3 36.511 0.684 4.4 SRR6457192 CTD43.86.8.2 DP03 B079 Common_water 86 2.Epipelagic 21.3 36.511 0.684 4.4 SRR6457191 CTD43.86.8.3 DP03 B079 Common_water 86 2.Epipelagic 21.3 36.511 0.684 4.4 SRR6457194 CTD43.S.10.1 DP03 B079 Common_water 2 1.Surface 24.686 36.179 0.075 4.628 SRR6457193

71

CTD43.S.10.2 DP03 B079 Common_water 2 1.Surface 24.686 36.179 0.075 4.628 SRR6457196

CTD43.S.10.3 DP03 B079 Common_water 2 1.Surface 24.686 36.179 0.075 4.628 SRR6457195 CTD44.300.4.1A DP03 SE4 Loop_Current 300 3.Mesopelagic 16.4615 36.21 0.021 3.053 SRR6457198 CTD44.300.4.2A DP03 SE4 Loop_Current 300 3.Mesopelagic 16.4615 36.21 0.021 3.053 SRR6457197 CTD44.300.4.3A DP03 SE4 Loop_Current 300 3.Mesopelagic 16.4615 36.21 0.021 3.053 SRR6457200 CTD44.300.5.1B DP03 SE4 Loop_Current 301 3.Mesopelagic 16.4615 36.21 0.01 3.075 SRR6457199 CTD44.300.5.2B DP03 SE4 Loop_Current 301 3.Mesopelagic 16.4615 36.21 0.01 3.075 SRR6457301 CTD44.300.5.3B DP03 SE4 Loop_Current 301 3.Mesopelagic 16.4615 36.21 0.01 3.075 SRR6457302 CTD44.50.6.1 DP03 SE4 Loop_Current 50 2.Epipelagic 26.317 36.33 0.097 4.53 SRR6457303 CTD44.50.6.2 DP03 SE4 Loop_Current 50 2.Epipelagic 26.317 36.33 0.097 4.53 SRR6457304 CTD44.50.6.3 DP03 SE4 Loop_Current 50 2.Epipelagic 26.317 36.33 0.097 4.53 SRR6457305 CTD45.105.9.1 DP03 SE4 Mixed_water 105 2.Epipelagic 25.734 36.304 0.586 4.33 SRR6457306 CTD45.105.9.2 DP03 SE4 Mixed_water 105 2.Epipelagic 25.734 36.304 0.586 4.33 SRR6457307 CTD45.105.9.3 DP03 SE4 Mixed_water 105 2.Epipelagic 25.734 36.304 0.586 4.33 SRR6457308 CTD45.145.7.1 DP03 SE4 Mixed_water 145 2.Epipelagic 25.4275 36.623 0.162 3.783 SRR6457299 CTD45.145.7.2 DP03 SE4 Mixed_water 145 2.Epipelagic 25.4275 36.623 0.162 3.783 SRR6457300 CTD45.145.7.3 DP03 SE4 Mixed_water 145 2.Epipelagic 25.4275 36.623 0.162 3.783 SRR6457419 CTD45.1500.1.1 DP03 SE4 Mixed_water 1500 4.Bathypelagic 4.319 34.967 0.021 4.592 SRR6457418 CTD45.1500.1.2 DP03 SE4 Mixed_water 1500 4.Bathypelagic 4.319 34.967 0.021 4.592 SRR6457417 CTD45.1500.1.3 DP03 SE4 Mixed_water 1500 4.Bathypelagic 4.319 34.967 0.021 4.592 SRR6457416 CTD45.533.4.1 DP03 SE4 Mixed_water 533 3.Mesopelagic 10.1605 35.22 0.032 2.463 SRR6457415 CTD45.533.4.2 DP03 SE4 Mixed_water 533 3.Mesopelagic 10.1605 35.22 0.032 2.463 SRR6457414 CTD45.533.4.2.R2 DP03 SE4 Mixed_water 533 3.Mesopelagic 10.1605 35.22 0.032 2.463 SRR6457413 CTD45.533.4.3 DP03 SE4 Mixed_water 533 3.Mesopelagic 10.1605 35.22 0.032 2.463 SRR6457412 CTD45.S.12.1 DP03 SE4 Mixed_water 2 1.Surface 26.309 36.335 0.043 4.463 SRR6457421 CTD45.S.12.2 DP03 SE4 Mixed_water 2 1.Surface 26.309 36.335 0.043 4.463 SRR6457420 CTD45.S.12.3 DP03 SE4 Mixed_water 2 1.Surface 26.309 36.335 0.043 4.463 SRR6457377 CTD46.300.4.1A DP03 SE4 Loop_Current 300 3.Mesopelagic 17.5725 36.399 0 3.271 SRR6457378 CTD46.300.4.1A.R1 DP03 SE4 Loop_Current 300 3.Mesopelagic 17.5725 36.399 0 3.271 SRR6457375 CTD46.300.4.1B DP03 SE4 Loop_Current 301 3.Mesopelagic 17.5725 36.399 0 3.271 SRR6457376 CTD46.300.4.2A DP03 SE4 Loop_Current 300 3.Mesopelagic 17.5725 36.399 0 3.271 SRR6457373 CTD46.300.4.2A.R1 DP03 SE4 Loop_Current 300 3.Mesopelagic 17.5725 36.399 0 3.271 SRR6457374

72

CTD46.300.4.2B DP03 SE4 Loop_Current 301 3.Mesopelagic 17.5725 36.399 0 3.271 SRR6457371

CTD46.300.4.3A DP03 SE4 Loop_Current 300 3.Mesopelagic 17.5725 36.399 0 3.271 SRR6457372 CTD46.300.4.3B DP03 SE4 Loop_Current 301 3.Mesopelagic 17.5725 36.399 0 3.271 SRR6457379 CTD46.300.5.2B DP03 SE4 Loop_Current 301 3.Mesopelagic 17.5725 36.399 0 3.271 SRR6457380 CTD46.50.9.1 DP03 SE4 Loop_Current 50 2.Epipelagic 26.403 36.337 0.065 4.535 SRR6457484 CTD46.50.9.2 DP03 SE4 Loop_Current 50 2.Epipelagic 26.403 36.337 0.065 4.535 SRR6457483 CTD46.50.9.3 DP03 SE4 Loop_Current 50 2.Epipelagic 26.403 36.337 0.065 4.535 SRR6457486 CTD47.106.8.1 DP03 SE5 Mixed_water 106 2.Epipelagic 25.7235 36.306 0.532 4.353 SRR6457485 CTD47.106.8.2 DP03 SE5 Mixed_water 106 2.Epipelagic 25.7235 36.306 0.532 4.353 SRR6457480 CTD47.106.8.3 DP03 SE5 Mixed_water 106 2.Epipelagic 25.7235 36.306 0.532 4.353 SRR6457479 CTD47.1500.1.1 DP03 SE5 Mixed_water 1500 4.Bathypelagic 4.315 34.971 0.032 4.609 SRR6457482 CTD47.1500.1.2 DP03 SE5 Mixed_water 1500 4.Bathypelagic 4.315 34.971 0.032 4.609 SRR6457481 CTD47.1500.1.3 DP03 SE5 Mixed_water 1500 4.Bathypelagic 4.315 34.971 0.032 4.609 SRR6457488 CTD47.511.4.1 DP03 SE5 Mixed_water 511 3.Mesopelagic 9.5735 35.275 0.01 0 SRR6457487 CTD47.511.4.2 DP03 SE5 Mixed_water 511 3.Mesopelagic 9.5735 35.275 0.01 0 SRR6457088 CTD47.511.4.2.R2 DP03 SE5 Mixed_water 511 3.Mesopelagic 9.5735 35.275 0.01 0 SRR6457112 CTD47.511.4.3 DP03 SE5 Mixed_water 511 3.Mesopelagic 9.5735 35.275 0.01 0 SRR6457115 CTD47.S.1 DP03 SE5 Mixed_water 2 1.Surface 26.4155 36.35 0.059 4.501 SRR6457090 CTD47.S.2 DP03 SE5 Mixed_water 2 1.Surface 26.4155 36.35 0.059 4.501 SRR6457109 CTD47.S.3 DP03 SE5 Mixed_water 2 1.Surface 26.4155 36.35 0.059 4.501 SRR6457550 CTD48.1500.1.2 DP03 B252 Common_water 1500 4.Bathypelagic 4.3225 34.97 0.021 4.602 SRR6457111 CTD48.1500.1.3 DP03 B252 Common_water 1500 4.Bathypelagic 4.3225 34.97 0.021 4.602 SRR6457110 CTD48.396.4.1 DP03 B252 Common_water 396 3.Mesopelagic 9.1815 35.089 0 2.612 SRR6457117 CTD48.396.4.1.R2 DP03 B252 Common_water 396 3.Mesopelagic 9.1815 35.089 0 2.612 SRR6457116 CTD48.396.4.2 DP03 B252 Common_water 396 3.Mesopelagic 9.1815 35.089 0 2.612 SRR6457449 CTD48.396.4.3 DP03 B252 Common_water 396 3.Mesopelagic 9.1815 35.089 0 2.612 SRR6457450 CTD48.396.4.3.R2 DP03 B252 Common_water 396 3.Mesopelagic 9.1815 35.089 0 2.612 SRR6457447 CTD48.64.7.1 DP03 B252 Common_water 64 2.Epipelagic 21.569 36.586 0.934 4.052 SRR6457448 CTD48.64.7.2 DP03 B252 Common_water 64 2.Epipelagic 21.569 36.586 0.934 4.052 SRR6457445 CTD48.64.7.3 DP03 B252 Common_water 64 2.Epipelagic 21.569 36.586 0.934 4.052 SRR6457446 CTD48.S.10.1 DP03 B252 Common_water 2 1.Surface 24.73 36.14 0.13 4.717 SRR6457443 CTD48.S.10.2 DP03 B252 Common_water 2 1.Surface 24.73 36.14 0.13 4.717 SRR6457444

73

CTD48.S.10.3 DP03 B252 Common_water 2 1.Surface 24.73 36.14 0.13 4.717 SRR6457451

CTD49.1500.1.1 DP03 B252 Common_water 1500 4.Bathypelagic 4.337 34.97 0.01 4.584 SRR6457452 CTD49.1500.1.2 DP03 B252 Common_water 1500 4.Bathypelagic 4.337 34.97 0.01 4.584 SRR6457331 CTD49.1500.1.3 DP03 B252 Common_water 1500 4.Bathypelagic 4.337 34.97 0.01 4.584 SRR6457330 CTD49.360.4.1 DP03 B252 Common_water 360 3.Mesopelagic 9.823 35.171 0.01 2.556 SRR6457411 CTD49.360.4.1.R2 DP03 B252 Common_water 360 3.Mesopelagic 9.823 35.171 0.01 2.556 SRR6457410 CTD49.360.4.2 DP03 B252 Common_water 360 3.Mesopelagic 9.823 35.171 0.01 2.556 SRR6457335 CTD49.360.4.2.R2 DP03 B252 Common_water 360 3.Mesopelagic 9.823 35.171 0.01 2.556 SRR6457334 CTD49.360.4.3 DP03 B252 Common_water 360 3.Mesopelagic 9.823 35.171 0.01 2.556 SRR6457333 CTD49.65.7.1 DP03 B252 Common_water 65 2.Epipelagic 21.397 36.56 0.858 4.2 SRR6457332 CTD49.65.7.2 DP03 B252 Common_water 65 2.Epipelagic 21.397 36.56 0.858 4.2 SRR6457399 CTD49.65.7.3 DP03 B252 Common_water 65 2.Epipelagic 21.397 36.56 0.858 4.2 SRR6457398 CTD49.S.1 DP03 B252 Common_water 2 1.Surface 24.653 36.198 0.075 NA SRR6457545 CTD49.S.2 DP03 B252 Common_water 2 1.Surface 24.653 36.198 0.075 NA SRR6457549 CTD49.S.3 DP03 B252 Common_water 2 1.Surface 24.653 36.198 0.075 NA SRR6457189 CTD50.1500.1.1 DP03 B081 Common_water 1500 4.Bathypelagic 4.3375 34.97 0.032 4.581 SRR6457190 CTD50.1500.1.2 DP03 B081 Common_water 1500 4.Bathypelagic 4.3375 34.97 0.032 4.581 SRR6457183 CTD50.1500.1.3 DP03 B081 Common_water 1500 4.Bathypelagic 4.3375 34.97 0.032 4.581 SRR6457184 CTD50.467.4.1 DP03 B081 Common_water 467 3.Mesopelagic 8.961 35.058 0.021 2.609 SRR6457185 CTD50.467.4.1.R2 DP03 B081 Common_water 467 3.Mesopelagic 8.961 35.058 0.021 2.609 SRR6457186 CTD50.467.4.2 DP03 B081 Common_water 467 3.Mesopelagic 8.961 35.058 0.021 2.609 SRR6457179 CTD50.467.4.3 DP03 B081 Common_water 467 3.Mesopelagic 8.961 35.058 0.021 2.609 SRR6457180 CTD50.49.1 DP03 B081 Common_water 49 2.Epipelagic 23.612 36.699 1.081 4.062 SRR6457166 CTD50.49.2 DP03 B081 Common_water 49 2.Epipelagic 23.612 36.699 1.081 4.062 SRR6457165 CTD50.49.3 DP03 B081 Common_water 49 2.Epipelagic 23.612 36.699 1.081 4.062 SRR6457168 CTD50.S.1 DP03 B081 Common_water 2 1.Surface NA NA NA NA SRR6457167 CTD50.S.2 DP03 B081 Common_water 2 1.Surface NA NA NA NA SRR6457170 CTD50.S.3 DP03 B081 Common_water 2 1.Surface NA NA NA NA SRR6457169 CTD51.1500.2.1 DP03 B081 Common_water 1500 4.Bathypelagic 4.3365 34.97 0.032 4.574 SRR6457172 CTD51.1500.2.2 DP03 B081 Common_water 1500 4.Bathypelagic 4.3365 34.97 0.032 4.574 SRR6457171 CTD51.1500.2.3 DP03 B081 Common_water 1500 4.Bathypelagic 4.3365 34.97 0.032 4.574 SRR6457164 CTD51.480.4.1 DP03 B081 Common_water 480 3.Mesopelagic 8.936 35.329 0 NA SRR6457163

74

CTD51.480.4.1.R2 DP03 B081 Common_water 480 3.Mesopelagic 8.936 35.329 0 NA SRR6457108

CTD51.480.4.2 DP03 B081 Common_water 480 3.Mesopelagic 8.936 35.329 0 NA SRR6457500 CTD51.480.4.3 DP03 B081 Common_water 480 3.Mesopelagic 8.936 35.329 0 NA SRR6457497 CTD51.53.1 DP03 B081 Common_water 53 2.Epipelagic 23.487 36.685 1.021 4.064 SRR6457498 CTD51.53.2 DP03 B081 Common_water 53 2.Epipelagic 23.487 36.685 1.021 4.064 SRR6457503 CTD51.53.3 DP03 B081 Common_water 53 2.Epipelagic 23.487 36.685 1.021 4.064 SRR6457504 CTD51.S.1 DP03 B081 Common_water 2 1.Surface NA NA NA NA SRR6457501 CTD51.S.2 DP03 B081 Common_water 2 1.Surface NA NA NA NA SRR6457502 CTD51.S.3 DP03 B081 Common_water 2 1.Surface NA NA NA NA SRR6457491 CTD52.1500.2.1 DP03 B175 Common_water 1500 4.Bathypelagic 4.3325 34.971 0.021 4.577 SRR6457492 CTD52.1500.2.2 DP03 B175 Common_water 1500 4.Bathypelagic 4.3325 34.971 0.021 4.577 SRR6457396 CTD52.1500.2.3 DP03 B175 Common_water 1500 4.Bathypelagic 4.3325 34.971 0.021 4.577 SRR6457395 CTD52.485.4.1 DP03 B175 Common_water 485 3.Mesopelagic 9.066 35.073 0.01 2.6 SRR6457394 CTD52.485.4.2 DP03 B175 Common_water 485 3.Mesopelagic 9.066 35.073 0.01 2.6 SRR6457393 CTD52.54.1 DP03 B175 Common_water 54 2.Epipelagic 23.807 36.586 0.717 4.301 SRR6457392 CTD52.54.2 DP03 B175 Common_water 54 2.Epipelagic 23.807 36.586 0.717 4.301 SRR6457391 CTD52.54.3 DP03 B175 Common_water 54 2.Epipelagic 23.807 36.586 0.717 4.301 SRR6457390 CTD52.S.1 DP03 B175 Common_water 2 1.Surface NA NA NA NA SRR6457389 CTD52.S.2 DP03 B175 Common_water 2 1.Surface NA NA NA NA SRR6457388 CTD52.S.3 DP03 B175 Common_water 2 1.Surface NA NA NA NA SRR6457387 CTD53.1500.1.1 DP03 B175 Common_water 1500 4.Bathypelagic 4.339 34.971 0.012 4.574 SRR6457252 CTD53.1500.1.2 DP03 B175 Common_water 1500 4.Bathypelagic 4.339 34.971 0.012 4.574 SRR6457253 CTD53.1500.1.3 DP03 B175 Common_water 1500 4.Bathypelagic 4.339 34.971 0.012 4.574 SRR6457254 CTD53.507.4.1 DP03 B175 Common_water 507 3.Mesopelagic 9.009 35.064 0.012 2.599 SRR6457255 CTD53.507.4.2 DP03 B175 Common_water 507 3.Mesopelagic 9.009 35.064 0.012 2.599 SRR6457256 CTD53.507.4.2.R2 DP03 B175 Common_water 507 3.Mesopelagic 9.009 35.064 0.012 2.599 SRR6457257 CTD53.507.4.3 DP03 B175 Common_water 507 3.Mesopelagic 9.009 35.064 0.012 2.599 SRR6457258 CTD53.507.4.3.R2 DP03 B175 Common_water 507 3.Mesopelagic 9.009 35.064 0.012 2.599 SRR6457259 CTD53.59.1 DP03 B175 Common_water 59 2.Epipelagic 23.174 36.501 0.181 4.668 SRR6457260 CTD53.59.2 DP03 B175 Common_water 59 2.Epipelagic 23.174 36.501 0.181 4.668 SRR6457261 CTD53.59.3 DP03 B175 Common_water 59 2.Epipelagic 23.174 36.501 0.181 4.668 SRR6457239 CTD53.S.1 DP03 B175 Common_water 2 1.Surface 26.6675 36.464 0.015 NA SRR6457238

75

CTD53.S.2 DP03 B175 Common_water 2 1.Surface 26.6675 36.464 1.015 NA SRR6457241

CTD53.S.3 DP03 B175 Common_water 2 1.Surface 26.6675 36.464 2.015 NA SRR6457240 CTD54.130.8.1 DP04 SW6 Loop_Current 130 2.Epipelagic 25.7015 36.284 0.47 4.098 SRR6457235 CTD54.130.8.2 DP04 SW6 Loop_Current 130 2.Epipelagic 25.7015 36.284 0.47 4.098 SRR6457234 CTD54.130.8.3 DP04 SW6 Loop_Current 130 2.Epipelagic 25.7015 36.284 0.47 4.098 SRR6457237 CTD54.1499M.1.1 DP04 SW6 Loop_Current 1499 4.Bathypelagic 4.3435 34.963 0.029 4.503 SRR6457236 CTD54.1499M.1.2 DP04 SW6 Loop_Current 1499 4.Bathypelagic 4.3435 34.963 0.029 4.503 SRR6457233 CTD54.1499M.1.3 DP04 SW6 Loop_Current 1499 4.Bathypelagic 4.3435 34.963 0.029 4.503 SRR6457232 CTD54.2M.12.1 DP04 SW6 Loop_Current 2 1.Surface 30.3845 36.44 0.036 4.154 SRR6457083 CTD54.2M.12.2 DP04 SW6 Loop_Current 2 1.Surface 30.3845 36.44 0.036 4.154 SRR6457084 CTD54.2M.12.3 DP04 SW6 Loop_Current 2 1.Surface 30.3845 36.44 0.036 4.154 SRR6457081 CTD54.545M.6.1 DP04 SW6 Loop_Current 545 3.Mesopelagic 10.2415 35.232 0.022 2.442 SRR6457082 CTD54.545M.6.2 DP04 SW6 Loop_Current 545 3.Mesopelagic 10.2415 35.232 0.022 2.442 SRR6457079 CTD54.545M.6.3 DP04 SW6 Loop_Current 545 3.Mesopelagic 10.2415 35.232 0.022 2.442 SRR6457080 CTD55.125.7.1 DP04 SW6 Loop_Current 125 2.Epipelagic 25.704 36.282 0.49 4.169 SRR6457077 CTD55.125.7.2 DP04 SW6 Loop_Current 125 2.Epipelagic 25.704 36.282 0.49 4.169 SRR6457078 CTD55.125.7.3 DP04 SW6 Loop_Current 125 2.Epipelagic 25.704 36.282 0.49 4.169 SRR6457551 CTD55.1502.1.1 DP04 SW6 Loop_Current 1502 4.Bathypelagic 4.3295 34.963 0.022 4.559 SRR6457076 CTD55.1502.1.2 DP04 SW6 Loop_Current 1502 4.Bathypelagic 4.3295 34.963 0.022 4.559 SRR6457505 CTD55.1502.1.3 DP04 SW6 Loop_Current 1502 4.Bathypelagic 4.3295 34.963 0.022 4.559 SRR6457284 CTD55.2M.10.1 DP04 SW6 Loop_Current 2 1.Surface 30.4855 36.373 0.022 4.283 SRR6457285 CTD55.2M.10.2 DP04 SW6 Loop_Current 2 1.Surface 30.4855 36.373 0.022 4.283 SRR6457286 CTD55.2M.10.3 DP04 SW6 Loop_Current 2 1.Surface 30.4855 36.373 0.022 4.283 SRR6457279 CTD55.516M.4.1 DP04 SW6 Loop_Current 516 3.Mesopelagic 10.9815 35.337 0.022 2.464 SRR6457280 CTD55.516M.4.2 DP04 SW6 Loop_Current 516 3.Mesopelagic 10.9815 35.337 0.022 2.464 SRR6457281 CTD55.516M.4.3 DP04 SW6 Loop_Current 516 3.Mesopelagic 10.9815 35.337 0.022 2.464 SRR6457282 CTD56.1500.1.1 DP04 SW4 Common_water 1500 4.Bathypelagic 4.293 34.965 0.029 4.607 SRR6457276 CTD56.1500.1.2 DP04 SW4 Common_water 1500 4.Bathypelagic 4.293 34.965 0.029 4.607 SRR6457277 CTD56.1500.1.3 DP04 SW4 Common_water 1500 4.Bathypelagic 4.293 34.965 0.029 4.607 SRR6457405 CTD56.2M.12.1 DP04 SW4 Common_water 2 1.Surface 31.3845 28.793 0.32 4.407 SRR6457404 CTD56.2M.12.2 DP04 SW4 Common_water 2 1.Surface 31.3845 28.793 0.32 4.407 SRR6457403 CTD56.2M.12.3 DP04 SW4 Common_water 2 1.Surface 31.3845 28.793 0.32 4.407 SRR6457402

76

CTD56.43.8.1 DP04 SW4 Common_water 43 2.Epipelagic 26.4605 35.893 0.565 4.58 SRR6457409

CTD56.43M.8.2 DP04 SW4 Common_water 43 2.Epipelagic 26.4605 35.893 0.565 4.58 SRR6457408 CTD56.43M8.3 DP04 SW4 Common_water 43 2.Epipelagic 26.4605 35.893 0.565 4.58 SRR6457407 CTD56.446M.4.1 DP04 SW4 Common_water 446 3.Mesopelagic 9.233 35.091 0.029 2.576 SRR6457406 CTD56.446M.4.2 DP04 SW4 Common_water 446 3.Mesopelagic 9.233 35.091 0.029 2.576 SRR6457401 CTD56.446M.4.3 DP04 SW4 Common_water 446 3.Mesopelagic 9.233 35.091 0.029 2.576 SRR6457400 CTD57.1485.3.1 DP04 SE1 Common_water 1485 4.Bathypelagic 4.315 34.964 0.042 4.59 SRR6457512 CTD57.1495.3.2 DP04 SE1 Common_water 1485 4.Bathypelagic 4.315 34.964 0.042 4.59 SRR6457513 CTD57.1495.3.3 DP04 SE1 Common_water 1485 4.Bathypelagic 4.315 34.964 0.042 4.59 SRR6457510 CTD57.2M.11.1 DP04 SE1 Common_water 2 1.Surface 30.794 34.255 0.192 4.316 SRR6457511 CTD57.2M.11.2 DP04 SE1 Common_water 2 1.Surface 30.794 34.255 0.192 4.316 SRR6457508 CTD57.2M.11.3 DP04 SE1 Common_water 2 1.Surface 30.794 34.255 0.192 4.316 SRR6457509 CTD57.441M.4.1 DP04 SE1 Common_water 441 3.Mesopelagic 9.136 35.09 0.05 2.537 SRR6457506 CTD57.441M.4.2 DP04 SE1 Common_water 441 3.Mesopelagic 9.136 35.09 0.05 2.537 SRR6457507 CTD57.441M.4.3 DP04 SE1 Common_water 441 3.Mesopelagic 9.136 35.09 0.05 2.537 SRR6457514 CTD57.68M.8.1 DP04 SE1 Common_water 68 2.Epipelagic 21.8925 36.464 0.476 4.351 SRR6457515 CTD57.68M.8.2 DP04 SE1 Common_water 68 2.Epipelagic 21.8925 36.464 0.476 4.351 SRR6457528 CTD57.68M.8.3 DP04 SE1 Common_water 68 2.Epipelagic 21.8925 36.464 0.476 4.351 SRR6457527 CTD58.1501.1.1 DP04 SE3 Common_water 1501 4.Bathypelagic 4.3085 34.964 0.036 4.6 SRR6457182 CTD58.1501.1.2 DP04 SE3 Common_water 1501 4.Bathypelagic 4.3085 34.964 0.036 4.6 SRR6457181 CTD58.1501.1.3 DP04 SE3 Common_water 1501 4.Bathypelagic 4.3085 34.964 0.036 4.6 SRR6457176 CTD58.2M.11.1 DP04 SE3 Common_water 2 1.Surface 29.796 34.206 0.165 4.241 SRR6457175 CTD58.2M.11.2 DP04 SE3 Common_water 2 1.Surface 29.796 34.206 0.165 4.241 SRR6457178 CTD58.2M.11.3 DP04 SE3 Common_water 2 1.Surface 29.796 34.206 0.165 4.241 SRR6457177 CTD58.444M.5.1 DP04 SE3 Common_water 444 3.Mesopelagic 9.836 35.17 0.029 2.532 SRR6457188 CTD58.444M.5.2 DP04 SE3 Common_water 444 3.Mesopelagic 9.836 35.17 0.029 2.532 SRR6457187 CTD58.444M.5.3 DP04 SE3 Common_water 444 3.Mesopelagic 9.836 35.17 0.029 2.532 SRR6457320 CTD58.90M.7.1 DP04 SE3 Common_water 90 2.Epipelagic 21.9465 36.418 0.402 4.338 SRR6457321 CTD58.90M.7.2 DP04 SE3 Common_water 90 2.Epipelagic 21.9465 36.418 0.402 4.338 SRR6457322 CTD58.90M.7.3 DP04 SE3 Common_water 90 2.Epipelagic 21.9465 36.418 0.402 4.338 SRR6457323 CTD59.1500.3.1 DP04 SE3 Common_water 1500 4.Bathypelagic 4.3005 34.965 0.042 4.603 SRR6457324 CTD59.1500.3.2 DP04 SE3 Common_water 1500 4.Bathypelagic 4.3005 34.965 0.042 4.603 SRR6457325

77

CTD59.1500.3.3 DP04 SE3 Common_water 1500 4.Bathypelagic 4.3005 34.965 0.042 4.603 SRR6457326

CTD59.2M.10.1 DP04 SE3 Common_water 2 1.Surface 29.996 34.368 0.117 4.268 SRR6457327 CTD59.2M.10.2 DP04 SE3 Common_water 2 1.Surface 29.996 34.368 0.117 4.268 SRR6457328 CTD59.2M.10.3 DP04 SE3 Common_water 2 1.Surface 29.996 34.368 0.117 4.268 SRR6457329 CTD59.418M.4.1 DP04 SE3 Common_water 418 3.Mesopelagic 9.882 35.177 0.036 2.531 SRR6457442 CTD59.418M.4.2 DP04 SE3 Common_water 418 3.Mesopelagic 9.882 35.177 0.036 2.531 SRR6457441 CTD59.418M.4.3 DP04 SE3 Common_water 418 3.Mesopelagic 9.882 35.177 0.036 2.531 SRR6457440 CTD59.86M.8.1 DP04 SE3 Common_water 86 2.Epipelagic 22.088 36.367 0.409 4.266 SRR6457439 CTD59.86M.8.2 DP04 SE3 Common_water 86 2.Epipelagic 22.088 36.367 0.409 4.266 SRR6457438 CTD59.86M.8.3 DP04 SE3 Common_water 86 2.Epipelagic 22.088 36.367 0.409 4.266 SRR6457437 CTD60.1500.3.1 DP04 SE2 Common_water 1500 4.Bathypelagic 4.287 34.965 0.029 4.617 SRR6457436 CTD60.1500.3.2 DP04 SE2 Common_water 1500 4.Bathypelagic 4.287 34.965 0.029 4.617 SRR6457435 CTD60.1500.3.3 DP04 SE2 Common_water 1500 4.Bathypelagic 4.287 34.965 0.029 4.617 SRR6457434 CTD60.2M.12.1 DP04 SE2 Common_water 2 1.Surface 30.0715 34.341 0.155 4.279 SRR6457433 CTD60.2M.12.2 DP04 SE2 Common_water 2 1.Surface 30.0715 34.341 0.155 4.279 SRR6457102 CTD60.2M.12.3 DP04 SE2 Common_water 2 1.Surface 30.0715 34.341 0.155 4.279 SRR6457103 CTD60.386M.5.1 DP04 SE2 Common_water 386 3.Mesopelagic 9.9995 35.199 0.029 2.512 SRR6457100 CTD60.386M.5.2 DP04 SE2 Common_water 386 3.Mesopelagic 9.9995 35.199 0.029 2.512 SRR6457101 CTD60.386M.5.3 DP04 SE2 Common_water 386 3.Mesopelagic 9.9995 35.199 0.029 2.512 SRR6457106 CTD60.86M.9.1 DP04 SE2 Common_water 86 2.Epipelagic 21.607 36.395 0.497 4.198 SRR6457107 CTD60.86M.9.2 DP04 SE2 Common_water 86 2.Epipelagic 21.607 36.395 0.497 4.198 SRR6457104 CTD60.86M.9.3 DP04 SE2 Common_water 86 2.Epipelagic 21.607 36.395 0.497 4.198 SRR6457105 CTD61.1500.2.1 DP04 SW3 Mixed_water 1500 4.Bathypelagic 4.2755 34.966 0.016 4.617 SRR6457097 CTD61.1500.2.2 DP04 SW3 Mixed_water 1500 4.Bathypelagic 4.2755 34.966 0.016 4.617 SRR6457098 CTD61.1500.2.3 DP04 SW3 Mixed_water 1500 4.Bathypelagic 4.2755 34.966 0.016 4.617 SRR6457245 CTD61.2M.10.1 DP04 SW3 Mixed_water 2 1.Surface 29.604 28.691 1.73 4.393 SRR6457244 CTD61.2M.10.2 DP04 SW3 Mixed_water 2 1.Surface 29.604 28.691 1.73 4.393 SRR6457247 CTD61.2M.10.3 DP04 SW3 Mixed_water 2 1.Surface 29.604 28.691 1.73 4.393 SRR6457246 CTD61.359.4.2 DP04 SW3 Mixed_water 359 3.Mesopelagic 10.3975 35.248 0.029 2.538 SRR6457249 CTD61.359M.4.1 DP04 SW3 Mixed_water 359 3.Mesopelagic 10.3975 35.248 0.029 2.538 SRR6457248 CTD61.359M.4.3 DP04 SW3 Mixed_water 359 3.Mesopelagic 10.3975 35.248 0.029 2.538 SRR6457251 CTD61.76M.8.1 DP04 SW3 Mixed_water 76 2.Epipelagic 21.2735 36.53 0.239 3.764 SRR6457250

78

CTD61.76M.8.2 DP04 SW3 Mixed_water 76 2.Epipelagic 21.2735 36.53 0.239 3.764 SRR6457243

CTD61.76M.8.3 DP04 SW3 Mixed_water 76 2.Epipelagic 21.2735 36.53 0.239 3.764 SRR6457242 CTD62.110M.7.1 DP04 SW5 Mixed_water 110 2.Epipelagic 25.6535 36.446 0.293 3.845 SRR6457385 CTD62.110M.7.2 DP04 SW5 Mixed_water 110 2.Epipelagic 25.6535 36.446 0.293 3.845 SRR6457386 CTD62.110M.7.3 DP04 SW5 Mixed_water 110 2.Epipelagic 25.6535 36.446 0.293 3.845 SRR6457270 CTD62.1500.3.1 DP04 SW5 Mixed_water 1500 4.Bathypelagic 4.305 34.965 0.029 4.581 SRR6457432 CTD62.1500.3.2 DP04 SW5 Mixed_water 1500 4.Bathypelagic 4.305 34.965 0.029 4.581 SRR6457381 CTD62.1500.3.3 DP04 SW5 Mixed_water 1500 4.Bathypelagic 4.305 34.965 0.029 4.581 SRR6457382 CTD62.2M.11.1 DP04 SW5 Mixed_water 2 1.Surface 30.01 36.325 0.07 4.279 SRR6457383 CTD62.2M.11.2 DP04 SW5 Mixed_water 2 1.Surface 30.01 36.325 0.07 4.279 SRR6457384 CTD62.2M.11.3 DP04 SW5 Mixed_water 2 1.Surface 30.01 36.325 0.07 4.279 SRR6457283 CTD62.498M.5.1 DP04 SW5 Mixed_water 498 3.Mesopelagic 9.872 35.183 0.029 2.477 SRR6457289 CTD62.498M.5.2 DP04 SW5 Mixed_water 498 3.Mesopelagic 9.872 35.183 0.029 2.477 SRR6457094 CTD62.498M.5.3 DP04 SW5 Mixed_water 498 3.Mesopelagic 9.872 35.183 0.029 2.477 SRR6457093 CTD63.1520.2.1 DP04 B064 Common_water 1520 4.Bathypelagic 4.3235 34.964 0.022 4.561 SRR6457490 CTD63.1520.2.2 DP04 B064 Common_water 1520 4.Bathypelagic 4.3235 34.964 0.022 4.561 SRR6457489 CTD63.1520.2.3 DP04 B064 Common_water 1520 4.Bathypelagic 4.3235 34.964 0.022 4.561 SRR6457496 CTD63.2m.10.1 DP04 B064 Common_water 2 1.Surface 29.247 27.472 2.685 4.645 SRR6457495 CTD63.2m.10.2 DP04 B064 Common_water 2 1.Surface 29.247 27.472 2.685 4.645 SRR6457494 CTD63.2m.10.3 DP04 B064 Common_water 2 1.Surface 29.247 27.472 2.685 4.645 SRR6457493 CTD63.421M.4.1 DP04 B064 Common_water 421 3.Mesopelagic 9.2825 35.104 0.036 2.516 SRR6457319 CTD63.421M.4.2 DP04 B064 Common_water 421 3.Mesopelagic 9.2825 35.104 0.036 2.516 SRR6457499 CTD63.421M.4.3 DP04 B064 Common_water 421 3.Mesopelagic 9.2825 35.104 0.036 2.516 SRR6457312 CTD63.97M.9.1 DP04 B064 Common_water 97 2.Epipelagic 20.6695 36.495 0.158 3.516 SRR6457311 CTD63.97M.9.2 DP04 B064 Common_water 97 2.Epipelagic 20.6695 36.495 0.158 3.516 SRR6457310 CTD63.97M.9.3 DP04 B064 Common_water 97 2.Epipelagic 20.6695 36.495 0.158 3.516 SRR6457309 CTD64.1500.1.1 DP04 B064 Common_water 1500 4.Bathypelagic 4.314 34.964 0.036 SRR6457316 CTD64.1500.1.2 DP04 B064 Common_water 1500 4.Bathypelagic 4.314 34.964 0.036 SRR6457315 CTD64.1500.1.3 DP04 B064 Common_water 1500 4.Bathypelagic 4.314 34.964 0.036 SRR6457314 CTD64.22M.9.1 DP04 B064 Common_water 22 2.Epipelagic 29.3635 34.767 0.7 SRR6457313 CTD64.22M.9.2 DP04 B064 Common_water 22 2.Epipelagic 29.3635 34.767 0.7 SRR6457318 CTD64.22M.9.3 DP04 B064 Common_water 22 2.Epipelagic 29.3635 34.767 0.7 SRR6457317

79

CTD64.2M.10.1 DP04 B064 Common_water 2 1.Surface 30.224 32.536 0.131 SRR6457157

CTD64.2M.10.2 DP04 B064 Common_water 2 1.Surface 30.224 32.536 0.131 SRR6457158 CTD64.2M.10.3 DP04 B064 Common_water 2 1.Surface 30.224 32.536 0.131 SRR6457159 CTD64.415M.3.1 DP04 B064 Common_water 415 3.Mesopelagic 9.441 35.121 0.029 SRR6457160 CTD64.415M.3.2 DP04 B064 Common_water 415 3.Mesopelagic 9.441 35.121 0.029 SRR6457153 CTD64.415M.3.3 DP04 B064 Common_water 415 3.Mesopelagic 9.441 35.121 0.029 SRR6457154 CTD64.95M.1 DP04 B064 Common_water 95 2.Epipelagic 20.645 36.468 0.185 SRR6457155 CTD64.95M.2 DP04 B064 Common_water 95 2.Epipelagic 20.645 36.468 0.185 SRR6457156 CTD64.95M.3 DP04 B064 Common_water 95 2.Epipelagic 20.645 36.468 0.185 SRR6457161 CTD65.1500.3.1 DP04 B065 Common_water 1500 4.Bathypelagic 4.3225 34.964 0.042 4.571 SRR6457162 CTD65.1500.3.2 DP04 B065 Common_water 1500 4.Bathypelagic 4.3225 34.964 0.042 4.571 SRR6457546 CTD65.1500.3.3 DP04 B065 Common_water 1500 4.Bathypelagic 4.3225 34.964 0.042 4.571 SRR6457369 CTD65.2M.10.1 DP04 B065 Common_water 2 1.Surface 29.67 33.874 0.246 4.388 SRR6457548 CTD65.2M.10.2 DP04 B065 Common_water 2 1.Surface 29.67 33.874 0.246 4.388 SRR6457547 CTD65.2M.10.3 DP04 B065 Common_water 2 1.Surface 29.67 33.874 0.246 4.388 SRR6457542 CTD65.334M.6.1 DP04 B065 Common_water 334 3.Mesopelagic 9.266 35.093 0.036 2.597 SRR6457541 CTD65.334M.6.2 DP04 B065 Common_water 334 3.Mesopelagic 9.266 35.093 0.036 2.597 SRR6457544 CTD65.334M.6.3 DP04 B065 Common_water 334 3.Mesopelagic 9.266 35.093 0.036 2.597 SRR6457543 CTD65.58M.7.1 DP04 B065 Common_water 58 2.Epipelagic 23.937 36.373 0.531 4.862 SRR6457540 CTD65.58M.7.2 DP04 B065 Common_water 58 2.Epipelagic 23.937 36.373 0.531 4.862 SRR6457539 CTD65.58M.7.3 DP04 B065 Common_water 58 2.Epipelagic 23.937 36.373 0.531 4.862 SRR6457430 CTD66.1503.3.1 DP04 B287 Mixed_water 1503 4.Bathypelagic 4.313 34.964 0.036 4.581 SRR6457431 CTD66.1503.3.2 DP04 B287 Mixed_water 1503 4.Bathypelagic 4.313 34.964 0.036 4.581 SRR6457428 CTD66.1503.3.3 DP04 B287 Mixed_water 1503 4.Bathypelagic 4.313 34.964 0.036 4.581 SRR6457429 CTD66.2M.11.1 DP04 B287 Mixed_water 2 1.Surface 29.9555 34.103 0.226 4.388 SRR6457426 CTD66.2M.11.2 DP04 B287 Mixed_water 2 1.Surface 29.9555 34.103 0.226 4.388 SRR6457427 CTD66.2M.11.3 DP04 B287 Mixed_water 2 1.Surface 29.9555 34.103 0.226 4.388 SRR6457424 CTD66.340M.4.1 DP04 B287 Mixed_water 340 3.Mesopelagic 10.613 35.275 0.063 2.595 SRR6457425 CTD66.340M.4.2 DP04 B287 Mixed_water 340 3.Mesopelagic 10.613 35.275 0.063 2.595 SRR6457422 CTD66.340M.4.3 DP04 B287 Mixed_water 340 3.Mesopelagic 10.613 35.275 0.063 2.595 SRR6457423 CTD66.70M.8.1 DP04 B287 Mixed_water 70 2.Epipelagic 21.7925 36.508 0.449 4.051 SRR6457354 CTD66.70M.8.2 DP04 B287 Mixed_water 70 2.Epipelagic 21.7925 36.508 0.449 4.051 SRR6457353

80

CTD66.70M.8.3 DP04 B287 Mixed_water 70 2.Epipelagic 21.7925 36.508 0.449 4.051 SRR6457352

CTD67.1501.3.1 DP04 B252 Common_water 1501 4.Bathypelagic 4.3165 34.965 0.029 4.573 SRR6457351 CTD67.1501.3.2 DP04 B252 Common_water 1501 4.Bathypelagic 4.3165 34.965 0.029 4.573 SRR6457350 CTD67.1501.3.3 DP04 B252 Common_water 1501 4.Bathypelagic 4.3165 34.965 0.029 4.573 SRR6457349 CTD67.2M.11.1 DP04 B252 Common_water 2 1.Surface 30.0415 34.652 0.124 4.261 SRR6457348 CTD67.2M.11.2 DP04 B252 Common_water 2 1.Surface 30.0415 34.652 0.124 4.261 SRR6457347 CTD67.2M.11.3 DP04 B252 Common_water 2 1.Surface 30.0415 34.652 0.124 4.261 SRR6457356 CTD67.415M.5.1 DP04 B252 Common_water 415 3.Mesopelagic 9.58 35.134 0.029 2.591 SRR6457355 CTD67.415M.5.2 DP04 B252 Common_water 415 3.Mesopelagic 9.58 35.134 0.029 2.591 SRR6457214 CTD67.415M.5.3 DP04 B252 Common_water 415 3.Mesopelagic 9.58 35.134 0.029 2.591 SRR6457215 CTD67.80M.7.1 DP04 B252 Common_water 80 2.Epipelagic 21.6455 36.432 0.618 4.221 SRR6457216 CTD67.80M.7.2 DP04 B252 Common_water 80 2.Epipelagic 21.6455 36.432 0.618 4.221 SRR6457217 CTD67.80M.7.3 DP04 B252 Common_water 80 2.Epipelagic 21.6455 36.432 0.618 4.221 SRR6457218 CTD68.1500.1.1 DP04 B175 Common_water 1500 4.Bathypelagic 4.3525 34.963 0.036 4.548 SRR6457219 CTD68.1500.1.2 DP04 B175 Common_water 1500 4.Bathypelagic 4.3525 34.963 0.036 4.548 SRR6457220 CTD68.1500.1.3 DP04 B175 Common_water 1500 4.Bathypelagic 4.3525 34.963 0.036 4.548 SRR6457221 CTD68.2.11.1 DP04 B175 Common_water 2 1.Surface 30.288 35.197 0.09 4.342 SRR6457212 CTD68.2M.11.2 DP04 B175 Common_water 2 1.Surface 30.288 35.197 0.09 4.342 SRR6457213 CTD68.2M.11.3 DP04 B175 Common_water 2 1.Surface 30.288 35.197 0.09 4.342 SRR6457132 CTD68.374M.4.1 DP04 B175 Common_water 374 3.Mesopelagic 10.706 35.297 0.05 2.507 SRR6457131 CTD68.374M.4.2 DP04 B175 Common_water 374 3.Mesopelagic 10.706 35.297 0.05 2.507 SRR6457134 CTD68.374M.4.3 DP04 B175 Common_water 374 3.Mesopelagic 10.706 35.297 0.05 2.507 SRR6457133 CTD68.51M.7.1 DP04 B175 Common_water 51 2.Epipelagic 24.039 36.377 0.686 4.81 SRR6457136 CTD68.51M.7.2 DP04 B175 Common_water 51 2.Epipelagic 24.039 36.377 0.686 4.81 SRR6457135 CTD68.51M.7.3 DP04 B175 Common_water 51 2.Epipelagic 24.039 36.377 0.686 4.81 SRR6457138

81

APPENDIX II QIIME and PICRUSt code

QIIME Code

Version: QIIME 1.9.1 Coding Environment: Conda (Python based) on MacOS OTU picking method: closed reference Reference database: Greengenes 13.8 (Aug 2013) release OTU threshold: 97% similarity

source activate base conda activate qiime1 cd ~/working_directory validate_mapping_file.py -p -b -m mapping_file.txt -o validated_mapping_file echo "split_libraries_fastq:phred_offset 33" >> split_libraries_params.txt multiple_split_libraries_fastq.py -i persample_sequence_data -o split_libraries -m sampleid_by_file -p split_libraries_params.txt echo "pick_otus:enable_rev_strand_match True" >> $PWD/otu_picking_params_97.txt echo "pick_otus:similarity 0.97" >> $PWD/otu_picking_params_97.txt pick_closed_reference_otus.py -i $PWD/seqs.fna -o $PWD/ucrC97/ -p $PWD/otu_picking_params_97.txt -r $PWD/gg_13_8_otus/rep_set/97_otus.fasta -t $PWD/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt biom convert -i ucrC97/otu_table.biom -o testrun_otutable.txt --to-tsv --header-key taxonomy

82

PICRUSt Code

Version: PICRUSt 1.1.3 Coding Environment: Conda (Python based) on MacOS Requirements: QIIME 1.9.1 Data output type: with chloroplast, whole abundance

conda install -c bioconda picrust conda activate download_picrust_files.py normalize_by_copy_number.py -i otu_table.biom -o otus_corrected.biom biom convert -i otus_corrected.biom -o otus_corrected.txt --to-tsv -- header-key taxonomy predict_metagenomes.py -i otus_corrected.biom -o ko_predictions.biom - a –-with_confidence biom convert -i ko_predictions.biom -o ko_predictions.txt --to-tsv -- header-key KEGG_Description categorize_by_function.py -i ko_predictions.biom -c KEGG_Pathways -l 3 -o pathway_predictions.biom biom convert -i pathway_predictions.biom -o pathway_predictions.txt -- to-tsv --header-key KEGG_Pathways

83

APPENDIX III PICRUSt Functions Relative Abundance Table

Table 6: Relative Abundance of all PICRUSt KEGG Level 1 Functions for DP03. Normalized across samples and averaged across depth zones.

KEGG Level 1 Pathways 1.Surface 2.Epipelagic 3.Mesopelagic 4.Bathypelagic Cellular Processes 0.0275 0.0279 0.0332 0.0394 Environmental Information Processing 0.1054 0.1060 0.1127 0.1168 Genetic Information Processing 0.1842 0.1824 0.1802 0.1744 Human Diseases 0.0121 0.0110 0.0104 0.0106 Metabolism 0.5485 0.5437 0.5209 0.5116 Organismal Systems 0.0078 0.0079 0.0077 0.0078 Unclassified 0.1145 0.1211 0.1349 0.1394

Table 7: Relative Abundance of all PICRUSt KEGG Level 1 Functions for DP04. Normalized across samples and averaged across depth zones.

KEGG Level 1 Pathways 1.Surface 2.Epipelagic 3.Mesopelagic 4.Bathypelagic Cellular Processes 0.0291 0.0318 0.0370 0.0412 Environmental Information Processing 0.1040 0.1060 0.1118 0.1169 Genetic Information Processing 0.1796 0.1795 0.1790 0.1730 Human Diseases 0.0124 0.0110 0.0103 0.0107 Metabolism 0.5427 0.5305 0.5153 0.5095 Organismal Systems 0.0076 0.0079 0.0077 0.0077 Unclassified 0.1245 0.1332 0.1389 0.1409

84

APPENDIX IV R code for Multivariate Statistical Analyses

R Studio version: 1.1.453 R version: 3.5.3 Libraries/packages used: data.table, base, ggplot2, vegan

##Load all required libraries/packages library(data.table) library(base) library(ggplot2) library(vegan)

##Part1: OTU Taxonomic Analyses------

#import raw OTU data and metadata otus.raw <- read_xlsx(file.choose(), header=T) metadata <- read_xlsx(file.choose(), header=T)

#make sure the row names are same in both the datasets all.equal(rownames(otus.raw),rownames(metadata))

##1.1 OTU table cleanup------

#Reduce noise: remove OTUs with total absolute abundance of 1

(singletons), sum > 1 n <- 1 unique.col <- which(colSums(otus.raw)>n) otus.nonoise <- otus.raw[,unique.col]

#save the denoised data write.table(otus.nonoise, "C:\file_path\file_name.txt", sep="\t")

85

#Remove rare taxa: remove OTUs that are present only once, count > 1

#carried out in Excel using COUNT function on denoised data

##1.2 Normalize for relative abundance------

#import OTU data after previous quality control steps otus <- read.table(file.choose(), header=T, sep ="\t")

#normalize raw abundance data by sample otus.relabund <- decostand(otus, method = "total")

#save the relative abundance data write.table(otus.relabund, "otus_norm.txt", sep="\t")

##1.3 Beta diversity using Bray-Curtis dissimilarity index------otus.bray <- vegdist(otus.relabund, method = "bray")

#Test effect of pelagic zones (depth stratification) on OTUs adonis(otus.bray~Pelagic_zone, data = metadata)

#Test the effect of stations (spatial distribution) on OTUs adonis(otus.bray~Station, data = metadata)

##1.4 PCoA plot------

#calculate betadispersion otus.betadisp <- betadisper(otus.relabund, metadata$Pelagic_zone)

#view in boxplot boxplot(otus.betadisp)

86

#PCoA plot plot(otus.betadisp, main = "Taxa Bray-Curtis PCoA")

##1.5 SIMPER test to obtain similarity/dissimilarity percentages----- otus.simp <- simper(otus.relabund, metadata$Pelagic_zone,

permutations = 999) sink("otus-simper_depthzone.csv") summary(otus.simp) sink()

#to summarize dissimilarity percentages by group/site lapply(otus.simp, FUN=function(x){x$overall)

##1.6 CCA plot for effect of environmental variables------

#samples with missing environmental data were excluded set.seed(42); env.cca <-

cca(otus.relabund~Temperature+Salinity+Depth+Chla+DO,

data=metadata) vif.cca(env.cca)

anova.cca(env.cca)

#zero the variables set.seed(42); lwr <- cca(otus.relabund~1, data=metadata) lwr

#using a forward selecting model

87

set.seed(42); mods.all <- ordiR2step(lwr, scope = formula(env.cca)) #takes a long time... mods.all vif.cca(mods.all) anova.cca(mods.all)

R2.adj.all <- RsquareAdj(mods.all)

R2.adj.all

mods.all$anova

#CCA plot cca.otus <- plot(mods.all,type = "none") points(cca.otus, "sites", col= "green", pch=3,

select = metadata$Pelagic_zone == "1.Surface") points(cca.otus, "sites", col= "black", pch=1,

select = metadata$Pelagic_zone == "2.Epipelagic") points(cca.otus, "sites", col= "blue", pch=4,

select = metadata$Pelagic_zone == "3.Mesopelagic") points(cca.otus, "sites", col= "red", pch=2,

select = metadata$Pelagic_zone == "4.Bathypelagic")

ef.all <- envfit(cca.otus,

metadata[,c("Temperature","Oxygen","Chla","Salinity")]) plot(ef.all, title(main="OTU Taxa CCA"))

legend("topright",legend=as.character(paste(" ",

88

unique(metadata$Pelagic_zone))), cex = 0.99, pch=1:4,

col=1:length(unique(metadata$Pelagic_zone)))

#rerun steps 1.1 - 1.6 for the second cruise dataset

##Part2: PICRUSt Functional Analyses------picrust.raw <- read_xlsx(file.choose(), header=T)

#import subsetted KEGG level 3 metabolism pathway abundances from the complete picrust dataset metabolism.raw <- read_xlsx(file.choose(), header=T)

##2.1 Normalize for relative abundance------picrust.relabund <- decostand(picrust.raw, method = "total") #normalizes across sample

#save the relative abundance data write.table(picrust.relabund, "otus_norm.txt", sep="\t")

metabolism.relabund <- decostand(metabolism.raw, method = "total") #normalizes across sample

#save the relative abundance data write.table(metabolism.relabund, "otus_norm.txt", sep="\t")

##2.2 Beta diversity using Bray-Curtis dissimilarity index------metabolism.bray <- vegdist(metabolism.relabund, method = "bray")

#Test effect of pelagic zones (depth stratification) on metabolic pathways

89

adonis(metabolism.bray~Pelagic_zone, data = metadata)

#Test effect of stations (spatial distribution) on metabolic pathways adonis(metabolism.bray~Station, data = metadata)

#Test effect of mesoscale features on metabolic pathways adonis(metabolism.bray~mesoscale_feat, data = metadata)

##2.3 PCoA plot------metabolism.betadisp <- betadisper(metabolism.relabund,

metadata$Pelagic_zone) boxplot(metabolism.betadisp) #view in boxplot plot(metabolism.betadisp, main = "Metabolism Bray-Curtis PCoA")

#PCoA plot

##2.4 SIMPER test to obtain similarity/dissimilarity percentages------metabolism.simp <- simper(metabolism.relabund,

metadata$Pelagic_zone, permutations = 999) sink("metabolism-simper_depthzone.csv") summary(metabolism.simp) sink()

#to summarize dissimilarity percentages by group/site lapply(metabolism.simp, FUN=function(x){x$overall})

##2.5 CCA plot for effect of environmental variables------

#samples with missing environmental data were excluded

90

set.seed(42); env.cca <-

cca(metabolism.relabund~Temperature+Salinity+Depth+Chla+DO,

data=metadata) vif.cca(env.cca)

anova.cca(env.cca)

#zero the variables set.seed(42); lwr <- cca(otus.relabund~1, data=metadata) lwr

#using a forward selecting model set.seed(42); mods.all <- ordiR2step(lwr, scope = formula(env.cca)) #takes a long time... mods.all vif.cca(mods.all) anova.cca(mods.all)

R2.adj.all <- RsquareAdj(mods.all)

R2.adj.all

mods.all$anova

#CCA plot cca.metabolism <- plot(mods.all,type = "none") points(cca.metabolism, "sites", col= "green", pch=3,

select = metadata$Pelagic_zone == "1.Surface")

91

points(cca.metabolism, "sites", col= "black", pch=1,

select = metadata$Pelagic_zone == "2.Epipelagic") points(cca.metabolism, "sites", col= "blue", pch=4,

select = metadata$Pelagic_zone == "3.Mesopelagic") points(cca.metabolism, "sites", col= "red", pch=2,

select = metadata$Pelagic_zone == "4.Bathypelagic")

ef.all <- envfit(cca.metabolism,

metadata[,c("Temperature","Oxygen","Chla","Salinity")]) plot(ef.all, title(main="PICRUSt Metabolism CCA"))

legend("topright",legend=as.character(paste(" ",

unique(metadata$Pelagic_zone))), cex = 0.99, pch=1:4,

col=1:length(unique(metadata$Pelagic_zone)))

#rerun steps 2.1 – 2.5 for the second cruise dataset

##2.6 NMDS plot for seasonal and annual effects by depth zone------

#from the interated PICRUSt dataset, sort and subset metabolism data,

#and the associated mapping data by depth zone srf <- read_xlsx(file.choose(), header=T) #surface data epi <- read_xlsx(file.choose(), header=T) #epipelagic data meso <- read_xlsx(file.choose(), header=T) #mesopelagic data bathy <- read_xlsx(file.choose(), header=T) #bathypelagic data

metadata.srf <- read_xlsx(file.choose(), header=T) #surface data metadata.epi <- read_xlsx(file.choose(), header=T) #epipelagic data

92

metadata.meso <- read_xlsx(file.choose(), header=T) #mesopelagic data metadata.bathy <- read_xlsx(file.choose(), header=T) #bathypelagic

data

#NMDS plot based on Bray-Curtis distance of all four depth zones in one image par(mfrow=c(2,2))

#by season

#surface srf.bray.nmds <- metaMDS(srf, distance="bray", trymax=100) mds.fig.srf <- ordiplot(srf.bray.nmds, display="sites") ordiellipse(mds.fig.srf, metadata.srf$Season, label = F,

col=2:5, conf=0.95, lwd=1.5, cex=1.5, title(main="Surface"))

#epipelagic epi.bray.nmds <- metaMDS(epi, distance="bray", trymax=100) mds.fig.epi <- ordiplot(epi.bray.nmds, display="sites") ordiellipse(mds.fig.epi, metadata.epi$Season, label = F,

col=2:5, conf=0.95, lwd=1.5, cex=1.5, title(main="Epipelagic"))

#mesopelagic meso.bray.nmds <- metaMDS(meso, distance="bray", trymax=100) mds.fig.meso <- ordiplot(meso.bray.nmds, display="sites") ordiellipse(mds.fig.meso, metadata.meso$Season, label = F,

col=2:5, conf=0.95, lwd=1.5, cex=1.5, title(main="Mesopelagic"))

#bathypelagic

93

bathy.bray.nmds <- metaMDS(bathy, distance="bray", trymax=100) mds.fig.bathy <- ordiplot(bathy.bray.nmds, display="sites") ordiellipse(mds.fig.bathy, metadata.bathy$Season, label = F,

col=2:5, conf=0.95, lwd=1.5, cex=1.5, title(main="Bathypelagic"))

#by year

#surface par(mfrow=c(2,2)) mds.fig.srf <- ordiplot(srf.bray.nmds, display="sites") ordiellipse(mds.fig.srf, metadata.srf$Year, label = T,

col=1:2, conf=0.95, lwd=1.5, cex=1.5, title(main="Surface"))

#epipelagic mds.fig.epi <- ordiplot(epi.bray.nmds, display="sites") ordiellipse(mds.fig.epi, metadata.epi$Year, label = F,

col=1:2, conf=0.95, lwd=1.5, cex=1.5, title(main="Epipelagic"))

#mesopelagic mds.fig.meso <- ordiplot(meso.bray.nmds, display="sites") ordiellipse(mds.fig.meso, metadata.meso$Year, label = F,

col=1:2, conf=0.95, lwd=1.5, cex=1.5, title(main="Mesopelagic"))

#bathypelagic mds.fig.bathy <- ordiplot(bathy.bray.nmds, display="sites") ordiellipse(mds.fig.bathy, metadata.bathy$Year, label = F,

col=1:2, conf=0.95, lwd=1.5, cex=1.5, title(main="Bathypelagic"))

94

##Part3: PICRUSt Temporal Analyses------

#externally integrate normalized PICRUSt datasets of cruises DP02, DP03, DP04

#subset for KEGG Level 3 metabolism pathways, and import metabolism.integrated <- read_xlsx(file.choose(), header=T) metadata.integrated <- read_xlsx(file.choose(), header=T)

##3.1 Beta diversity using Bray-Curtis dissimilarity index----- temporal.bray <- vegdist(metabolism.integrated, method = "bray")

#Test the effect of season (May vs Aug) adonis(temporal.bray~Season, data = metadata.integrated)

#Test the effect of year (2015 vs 2016) adonis(temporal.bray~Year, data = metadata.integrated)

#Test the effect of cruise (DP02 vs DP03 vs DP04) adonis(temporal.bray~Cruise, data = metadata.integrated)

##3.2 SIMPER test to obtain similarity/dissimilarity percentages------

#Test the effect of season (May vs Aug) season.simp <- simper(metabolism.integrated,

metadata.integrated$Season, permutations = 999) sink("temporal-simper_season.csv") summary(season.simp)

95

sink()

#to summarize dissimilarity percentages by group/site lapply(season.simp, FUN=function(x){x$overall)

#Test the effect of year (2015 vs 2016) year.simp <- simper(metabolism.integrated,

metadata.integrated$Year, permutations = 999) sink("temporal-simper_year.csv") summary(year.simp) sink()

#to summarize dissimilarity percentages by group/site lapply(year.simp, FUN=function(x){x$overall)

96

APPENDIX V Supplemental Figures

Figure 23. Weighted UniFrac PCoA of taxonomic abundance with scaled coordinates. The plot shows strong stratification of taxa along depth zones.

97

Figure 24. NMDS plot of metabolism across seasons (May vs August) at different depth zones. Metabolism was more similar across seasons in bathypelagic.

98

Figure 25. NMDS plot of metabolism across year (2015 vs 2016) at different depth zones. Metabolism was similar across year in the bathypelagic, as compared to the surface.

99

APPENDIX VI Manuscript – Environmental Microbiology

Functional Dynamics of Microbial Communities Across the Pelagic Northern Gulf of Mexico

Deepesh Tourani1, Cole G. Easson2, Matthew W. Johnston3, Jose V. Lopez3

1 Department of Marine and Environmental Sciences, Halmos College of Natural Sciences and Oceanography, Nova Southeastern University, Dania Beach, FL, United States

2 Middle Tennessee State University, Murfreesboro, TN, United States

3 Department of Biological Sciences, Halmos College of Natural Sciences and Oceanography, Nova Southeastern University, Dania Beach, FL, United States

Deepesh Tourani [email protected]

Dr. Jose V. Lopez [email protected]

100

Abstract

Microbial communities, or microbiomes, are the major drivers of global biogeochemical cycles, acting as primary producers and decomposers across the water column in the oceans. Thus, they reflect changes in physicochemical properties and nutrient composition of the ocean. However, this correlation between ecological changes and the function of marine microbiomes is poorly understood. As part of the DEEPEND Consortium, previously sequenced 16S rRNA gene data of 466 samples from cruises in May and August of 2016 was used to characterize the function of microbiomes across the pelagic Northern Gulf of Mexico (NGoM). The Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt) approach was used for predicting biomolecular function based on the KEGG database inferred from 16S rRNA sequences. This is the first comprehensive study characterizing the function of pelagic microbial communities across the GoM. Metabolism was the most abundant function at KEGG Level 1, and further analyses were centered on pathways within this function. Strong depth zone stratification of metabolic function was observed (p<0.001), with a major shift in function between euphotic zone and aphotic zone, associated with a major differential abundance of photosynthetic functional signatures. These functions were followed by methane metabolism in abundance, indicating that the microbial communities switch to primarily chemosynthetic processes in the aphotic zones. Five physicochemical environmental variables resulted in a strong association with function, explaining more variation in metabolism (DP03: 63.2%, DP04: 77.3%) than taxa (DP03: 14.3%, DP04: 25.8%). Among these variables, temperature was the major environmental driver of function (DP03: 55.4%, DP04: 68%). GIS spatial analysis of methane metabolism across the aphotic zone showed that offshore sites had relatively higher metabolism than nearshore. Temporal analyses showed photosynthetic primary productivity was significantly different across season but not year, which may be attributed to high seasonal outflow of the Mississippi river.

Keywords: microbial ecology, microbiome, Gulf of Mexico, DEEPEND, QIIME, PICRUSt

101

Introduction

A significant portion of the Earth’s primary production occurs in the oceans, which occurs throughout the water column from the sunlit upper layers to the deep pelagic. Microbes are the sole primary producers in the oceans, providing organic carbon for higher trophic levels. Microbes also act as decomposers throughout the water column, scavenging and recycling the organic carbon. Hence they are affected by ecosystem changes such as changes in circulation, and nutrient input, essentially acting as ‘first responders’. Large-scale studies in the past decade have analyzed the dynamics of these communities and their correlation and interaction with marine environments across the global oceans (Rusch et al., 2007; Sunagawa et al., 2015). These studies found that microbial community structure and function is strongly driven by environmental factors. The structure of ocean microbial communities is primarily driven by depth, while in the euphotic zone, temperature is the major environmental factor, followed by Dissolved Oxygen (DO), driving the taxonomic composition and function of microbial communities (Sunagawa et al., 2015). Microbial communities reflect changes in nutrient composition and physicochemical properties such as temperature, salinity, and DO (DeLong and Karl, 2005; Xu, 2006; Fuhrman, 2009). The microbial ecology of marine environments differentiates between pelagic habitats and offers biological insight into the overall functioning and health of the oceans and constituent fauna.

An emerging trend in the upper layers of the global oceans is that temperature is the main environmental variable driving community structure (Sunagawa et al., 2015; Logares et al., 2018). Metabolic functions such as photosynthesis and photoheterotrophy are the most abundant in this region, where Calvin cycle carbon fixation enzymes were highly expressed. The Tara Oceans study also found genes responsible for aerobic respiration and photosynthesis in the mesopelagic, suggesting the presence of upper-layer microbes in the sinking marine snow, as they adapt to different niches and carry out remineralization. As depth increases and sunlight and temperature decrease, microbial communities switch to chemosynthesis processes and expand into metabolic niches, increasing functional richness with depth (Orcutt et al., 2011; Sunagawa et al., 2015). Nitrate, nitrite, and sulfur reduction metabolism processes increase to support the energy requirements. Most heterotrophic respiration is fueled by sinking organic matter (Mestre et al., 2018). These processes support nutrient cycles in the aphotic zone by fixing or remineralizing

102

organic and inorganic matter. Along with sinking organic matter, environments such as hydrothermal vents, seeps, and sediments also show high microbial activity.

Microbial communities, or microbiomes, have only been studied recently, with the application of Next-Gen Sequencing (NGS) in 16S rRNA and metagenomic surveys. Microbial ecology studies from different marine environments are mostly based on the composition of microbial communities, with fewer studies on function. Analyses of such communities revealed patterns that could not be explained by taxonomic composition. Global trends show similar taxonomic composition across different spatial environments (Sunagawa et al., 2015), and varying community structure in similar environments (Fernández et al., 1999). Based on metabolic pathways, marine water bodies can be differentiated and environmental stressors can also be identified (Rusch et al., 2007). To study marine microbial community dynamics, functional profiling can also take into account globally connected water masses which may transport microbes across global oceans and show unexplained taxonomic trends. Recent studies have shown higher correlation between environmental conditions and function than composition (Louca, Parfrey, and Doebeli, 2016).

The idea of functional redundancy explains these variations by suggesting that community functional dynamics are independent of community composition. Taxonomic studies hence do not allow enough resolution to effectively analyze community dynamics with respect to biogeochemical cycles. A possible explanation for this disparity is the lack of inclusion of phenomena such as HGT, which allows microbes in an environment to transfer genetic material to perform ecologically relevant functions (Boucher et al., 2003; Falkowski, Fenchel, and Delong, 2008). Taxonomically-different clades can thus perform similar metabolic functions, contributing to survival of whole community.

Studying community function is also challenging as it requires metagenomic approaches, which are resource and computationally intensive. The Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt) approach assists in finding out whole community gene composition and community function using 16S sequence data (Langille et al., 2013). Recent applications of PICRUSt include studies on the coastal Arabian Sea (Kumar, Mishra, and Jha, 2019), the Mariana Trench (Gao et al., 2019), the effect of hydrocarbons and

103

dispersants on marine biofilms (Salerno et al., 2018), and dinoflagellate (Zhou et al., 2018) and phytoplankton blooms (Nowinski et al., 2019).

The Gulf of Mexico (GoM) is an ocean basin between the North and South American continents, providing high economic and ecological value to the region (Goolsby et al., 1999; Shen et al., 2016; Ward and Tunnell, 2017). One of the major geographic features of the Northern GoM (NGoM) is the Mississippi River delta (Fig. 2) which drains the contiguous 48 states of the United States and accounts for 90% of the freshwater input into the GoM (Rabalais et al., 1996). The Mississippi river flows through regions of high agricultural activity and urban settlements and serves as a drainage for runoffs from these areas, thus accumulating sediments and nutrients. As the river flows into the GoM, it transfers sediments and nutrients along with the freshwater, thereby acting as a major driving force shaping the microbial communities and the ecosystem (Rabalais, Turner, and Wiseman Jr, 2002). An ongoing bottom-water hypoxic zone present in the summer (and in progress since 1985), combined with events such as the Deepwater Horizon oil spill (DWHOS) in 2010, has disrupted this fragile ecosystem and has decreased the survival rates of fauna across all trophic levels (Rabalais et al., 2007). The Deep-Pelagic Nekton Dynamics (DEEPEND) Consortium was established to characterize the meso- and bathypelagic zones of the NGoM and establish baseline data characterizing pelagic fauna and biogeochemical cycles (Sutton et al., 2017).

The composition of microbial communities of the GoM has been well characterized using NGS over the past decade. A geophysical feature of the NGoM driving the community structure is the Mississippi River plume. The effects of the oil spill and resultant change in pelagic microbial communities was also observed in beach samples off Louisiana, showing the relatively quick coastal impacts with presence of hydrocarbon-degrading taxa such as Gammaproteobacteria (Lamendella et al., 2014). Given the lack of baseline data, the DEEPEND Consortium aims to characterize the structure and function of microbial communities across the NGoM. Analysis from the biannual cruises in 2015 showed strong depth selection of microbial communities, correlated with salinity and turbidity at upper layers and circulation patterns at depth (Easson and Lopez, 2019). While much is known about the community composition and diversity, there are fewer functional studies on the pelagic GoM. The interactions and effects of oceanographic features such as MR plume and the Loop Current, on the metabolic roles of microbes and the overall functional

104

composition and structure are not clearly understood. As part of this study, microbial 16S sequence data and PICRUSt was used to identify and characterize the structure and function of microbial communities from the NGoM, and quantitatively assist future environmental impact assessments.

Methods

Seawater samples from the NGoM were previously obtained from bi-annual cruises in 2016 as part of the DEEPEND Consortium (Fig 3). Sample processing and sequence data made available for this study was previously generated by Dr. Cole G. Easson. The two cruises from 2016 analyzed in this study for community structural and functional analyses sampled 9 stations during 30 April-14 May (DP03), and 12 stations during 5-9 August (DP04). Seawater was collected at each station across four depth zones: Surface (0-10m), Epipelagic (20-200m), Mesopelagic (201-1000m) and Bathypelagic (1000-1600m), and three water types: Common water, Mixed water and Loop Current. Each station had day and night sampling across the four depth zones (Table 1). Environmental parameters included in the analyses for this study include temperature, salinity, chlorophyll a concentration, and DO (Supplementary Data I).

Seawater was filtered through 0.45 µm filter membranes, and DNA was extracted following standard Earth Microbiome Project (EMP) protocols. The V4 hypervariable region of the 16S rRNA gene was amplified using primers 515F and 806R and sequenced on Illumina MiSeq by Dr. Cole G. Easson (for detailed methods, see: Easson and Lopez, 2019). The resulting raw paired-end sequence data for a total of 476 microbial samples from cruises DP03 (n=293) and DP04 (n=183) used for this study was publicly available from the NCBI Sequence Read Archive database under Accession #PRJNA429259, and the Gulf of Mexico Research Initiative Information and Data Cooperative (GRIIDC) database (https://data.gulfresearchinitiative.org, RFP-IV, DEEPEND) under UDI R4.x257.228:0008. Nine ONOFF samples from DP03 were excluded from further analyses and CTD41.S.2 was lost during taxonomic assignment, leaving a total of 283 samples for this cruise. The respective environmental data is available on GRIIDC under UDI R4.x257.230:0011 (DP03) and R4.x257.230:0012 (DP04). PICRUSt data has been submitted as dataset R4.x257.000:0011 and DOI: 10.7266/531X8N14.

105

Figure 3. DP03 DP04 Sampling stations. DP03 cruise covered 9 stations and DP04 cruise covered 12 stations across north Gulf of Mexico as part of the DEEPEND Consortium. The basemap is ETOPO1 bathymetry.

Table 1. Sampling summary. Sample location and cruise description. Microbial samples from each station and depth had 3 replicates (included).

Taxonomic Assignment

Taxonomic assignment of 16S sequence data was carried out on Quantitative Insights Into Microbial Ecology (QIIME) ver. 1.9.1 (Caporaso et al., 2010). The script multiple_split_libraries_fastq.py, with the option Phred_offset 33, was run on the per-sample sequence data for demultiplexing. Demultiplexed data was then used as input to assign taxonomy and pick OTUs using the pick_closed_reference_otus.py script, the standard 97% sequence similarity cutoff, and Greengenes ver. 13.8 (DeSantis et al., 2006) as the reference database. This workflow was recommended for the data to be compatible with the PICRUSt pipeline (Langille et al., 2013).

Functional Prediction

Functional inference of sequence data was carried out using PICRUSt ver. 1.1.3 (Langille et al., 2013) locally in the Python-based Conda programming environment. The OTU table output from taxonomic assignment was used as input for the script normalize_by_copy_number.py to normalize the OTU abundance for gene copy number. Per-sample functional predictions are then carried out based on Kyoto Encyclopedia of Genes and Genomes (KEGG) database Orthologs (KO) using the corrected OTUs, using the script predict_metagenomes.py (Kanehisa et al., 2007). Weighted Nearest Sequenced Taxon Index (NSTI) values were also calculated with the option -a --with_confidence. KOs were then combined into biological pathways using categorize_by_function.py.

106

Statistical Analyses

Raw 16S sequence data was summarized using FastQC (Andrews, 2010) and MultiQC (Ewels et al., 2016) on the Galaxy platform (Afgan et al., 2018). Whole abundance OTU and PICRUSt data was summarized by depth zone using the QIIME script summarize_taxa_through_plots.py. Further statistical analyses were carried out in R Studio ver. 1.1.453 using the packages ggplot2, clusterSim, and vegan (Oksanen et al., 2011). Raw OTU abundance data was cleaned up to reduce noise by removing OTUs that were present once or had a total absolute abundance of 1. Due to the dilute nature of marine environment, the cleanup process was less stringent than other microbial environments such as aquatic or soil to ensure sufficient coverage. Whole abundance OTU and PICRUSt tables were then normalized across each sample using the option n10 of the data.Normalization function in clusterSim package. All further statistical tests were conducted on the relative abundance data.

Taxonomic and functional data were analyzed using the vegan package in R Studio. Beta diversity was analyzed using Bray-Curtis dissimilarity index. Bray-Curtis dissimilarity was calculated using the vegdist function. Principal Coordinate Analysis (PCoA) was used to visualize this dissimilarity matrix in a two-dimensional plot using the function betadisper. The function adonis was used for pairwise comparison between Bray-Curtis dissimilarity and environmental data. To find the features causing the (dis)similarity, the simper function was used which calculates Similarity of Percentages. DP02 data was only included for temporal analyses.

Results

Following EMP Ontology (EMPO) classification, the microbial environment for this study is defined as free-living (level 1), saline (level 2), and surface and water (level 3) (Thompson et al., 2017). Cruise DP03 was undertaken during the end of spring season which sees a higher outflow of MR river, lower surface temperatures and just before the onset of hypoxia (Rabalais, Turner, and Wiseman Jr, 2002). Cruise DP04 was undertaken near the end of summer season which sees a relatively less outflow from MR, has higher surface temperatures, and is after the hypoxia season. Sampling depths in the epipelagic zone corresponded to the deep chlorophyll maximum (DCM) and in the mesopelagic zone to oxygen minimum zone (OMZ). DP03 consists of an

107

average 76,053 sequences per sample with an average sequence length of 253 bp. Samples with duplicate MiSeq sequencing runs were combined to get accurate averages. DP04 sequence data consists of average 107,738 sequences per sample with an average sequence length of 251 bp.

Depth Stratification

For the PICRUSt functional data, weighted NSTI values were calculated as recommended by Langille et al., (2013). Mean weighted NSTI values for DP03 (0.17 ± 0.042) and DP04 (0.17 ± 0.047) were relatively high, indicating fewer marine metagenomes available in reference databases. PICRUSt output resulted in a total of 310 KEGG level 3 functional pathways for DP03 and 314 pathways for DP04. PICRUSt data was also summarized using the summarize_taxa_through_plots.py script after reorganizing the raw functional abundance output into a QIIME-compatible BIOM format. Average relative abundance of KEGG level 1 pathways across the water column shows that Metabolism was the dominant functional category (DP03: 53.1%, DP04: 52.4%). The next known abundant pathways were Genetic Information Processing (GIP) (DP03: 18.0%, DP04: 17.8%) and Environmental Information Processing (EIP) (DP03: 11.0%, DP04: 11.0%).

For KEGG level 2 analyses, percentages are calculated across respective KEGG level 1 pathways. The most abundant metabolic pathway was amino acid metabolism (DP03: 21.3%, DP04: 21.2%), followed by carbohydrate metabolism (DP03: 18.7%, DP04: 18.6%) (Appendix III). The abundance of amino acid metabolism and carbohydrate metabolism, increased with depth, while the following two, energy metabolism and metabolism of cofactors and vitamins, decreased (Fig. 13, 14). Among GIP and EIP, membrane transport and replication and repair processes decreased with depth, while translation, signal transduction and folding, sorting and degradation, and transcription increased.

Figure 13 and Figure 14. Average Relative Abundance of KEGG Level 2 Metabolic Pathways from DP03 and DP04. Depth profiles of KEGG level 2 pathways under the level 1 category of Metabolism, representing 91.2% and 90.9% of metabolism for cruise DP03 and DP04

108

respectively. Dashed lines represent pathway abundance decreasing from surface to bathypelagic.

Further analyses were centered on metabolism at KEGG level 2 and 3 pathways. At the finest functional resolution, KEGG level 3, metabolic functions varied significantly across individual depth zones (adonis test, DP03: p<0.001, DP04: p<0.001). PCoA plots based on Bray- Curtis distance of KEGG level 3 metabolic functions showed that metabolism in the surface and epipelagic zones clustered distinct from that in mesopelagic and bathypelagic zones, with a slight extension of epipelagic function into the aphotic zone during DP03 (Fig. 17, 18).

Figure 17. DP03 Metabolism Bray-Curtis PCoA. Clustering of metabolism in euphotic and aphotic zone based on a PCoA plot. The difference in function between epipelagic and mesopelagic (8.8%, SIMPER) was twice more than that between other consecutive depth zones.

Figure 18. DP04 Metabolism Bray-Curtis PCoA. Distinct clustering of metabolism in euphotic and aphotic zone based on a PCoA plot. The difference in function between epipelagic and mesopelagic (8.36%, SIMPER) was more than that between other consecutive depth zones.

Across consecutive depth zones, the largest difference was between epipelagic and mesopelagic (SIMPER summary, DP03: 8.8%, DP04: 8.36%), twice more than surface-epipelagic and mesopelagic-bathypelagic. The major pathways significantly different between epipelagic and mesopelagic were photosynthesis proteins, photosynthesis, and porphyrin and chlorophyll metabolism (SIMPER cumulative sum, DP03: 25.1%, DP04: 27.7%) (Table 2). These were followed by methane metabolism (DP03: 3.7%, DP04: 3.4%).

A CCA test was conducted using a forward-selecting model building function to analyze the effects of environmental variables on community function. Environmental variables explained 63.2% (DP03) and 77.3% (DP04) of the variation across water column, with temperature being

109

the major environmental driver (R2 adjusted, DP03: 0.554, DP04: 0.679). Across geospatial features, metabolism across stations of both cruises was significantly different in the euphotic (adonis test, p<0.001) and aphotic zones (adonis test, p<0.002). Across common water, mixed water, and Loop Current water classifications, metabolism was significantly different across euphotic zones of both cruises (adonis test, p<0.001) and across aphotic zone in DP03 (adonis test, p<0.05). However, metabolism was not different across aphotic zone of DP04.

Temporal Analyses – Structure and Dynamics

Depth zones explained 70.6% variation in metabolism during DP04 as compared to 63.6% variation during DP03. Functional abundance data from three cruises, DP02, DP03 and DP04 was used for the following temporal analyses. Microbial metabolism was significantly different across seasons of May and August (adonis test, dF = 1, F = 14.133, R2 = 0.02305, p<0.001). The top three differential metabolic functions across seasons were photosynthesis proteins, photosynthesis, and porphyrin and chlorophyll metabolism (SIMPER, cumulative sum 22.1%). Metabolism was also significantly different annually across 2015 and 2016 (adonis test, dF = 1, F = 22.476, R2 = 0.03617, p<0.001). Although the top three differential functions were the same as those across seasons, they were not significantly different annually. Across individual cruises, these photosynthetic functions were significantly different across DP03-DP04 (SIMPER, p<0.01), and not across DP02-DP03 or DP02-DP04. Overall, metabolism across seasons and year was more stable with increase in depth (Appendix V).

Discussion

Marine microbial communities constitute a significant portion of the ocean biomass and are the sole primary producers across depth zones. However, they have only recently been studied for structure, function, distribution, and dynamics. Large-scale studies such as GOS (Rusch et al., 2007), Malaspina expedition (Duarte, 2015), and Tara Oceans (Sunagawa et al., 2015) studied global ocean trends and dynamics of microbiome. Small-scale studies analyzed regional microbial profiles, studying local effects of currents and drainage, and found distinct community structure across freshwater and marine water bodies. To this end, one of the goals of the DEEPEND

110

consortium is to establish a baseline for microbial community structure and dynamics across the NGoM (Sutton et al., 2017). As part of this goal, the primary objective of this study was to characterize the functional profiles of the NGoM microbiome. Overall, a total of 466 16S rRNA amplicon samples from cruises during May and August of 2016 were analyzed for microbial community dynamics.

Depth stratification of microbial community function

At the broadest level, KEGG level 1, metabolism was the most abundant functional category for both DP03 and DP04 (relative abundance >50%), followed by Genetic Information Processing (GIP), Unclassified, and Environmental Information Processing (EIP) (Fig. 12). Depth profiles of the top three known level 1 categories were also similar across respective KEGG level 2 functions (Fig. 13, 14). At KEGG level 2 of metabolism, amino acid metabolism and carbohydrate metabolism pathways increased with depth and were the most abundant, indicating higher biochemical function in deeper waters. A targeted functional study on carbon source utilization by surface microbial communities from the Arabian Sea also found carbohydrates and amino acids as the most utilized substrates (Kumar, Mishra, and Jha, 2019).

All further analyses were centered on the level 3 pathways of metabolism. Metabolic functions were significantly different across depth zones for both cruises (adonis test, p<0.001). As the physicochemical parameters change noticeably from euphotic to aphotic zone, microbial communities adapt to this change by undergoing a major functional shift (Fig. 17, 18). Photosynthesis proteins, photosynthesis, and Porphyrin and chlorophyll metabolism were the major differential pathways between epipelagic and mesopelagic zones, accounting for 25% and 28% of functional abundance for DP03 and DP04 respectively (Table 3, 4). This shift in function quantitatively supports the significance of sunlight and photosynthesis in driving the oceanic microbial function, as a major portion of the photosynthesis-derived organic carbon is exported to deeper layers (Falkowski, Barber, and Smetacek, 1998; Mestre et al., 2018). Thus any changes in microbial function in the euphotic zone will lead to changes in community function across depth zones, in turn affecting availability of organic material to higher trophic levels and global biogeochemical cycles (Falkowski, Barber, and Smetacek, 1998).

111

Table 3: SIMPER test on metabolic pathways of DP03. Major pathways causing functional shift from epipelagic to mesopelagic in DP03.

Table 4: SIMPER test on metabolic pathways of DP04. Major pathways causing functional shift from epipelagic to mesopelagic in DP04.

Contrary to early theories, studies in the past two decades have shown that microbes are present and active in the deep pelagic (Arístegui et al., 2009). In this study, the energy metabolism of microbial communities switched from light-dependent photosynthesis in the euphotic zone, to chemosynthetic processes such as methane metabolism, other sources of carbon fixation, and butanoate metabolism in the aphotic zone. In the GoM, a post-oil spill study from the DWHOS site found high amounts of methane in the deep pelagic, accounting for 15% of the released hydrocarbons (Reddy et al., 2012). Microbial studies in the months following the oil spill found high abundance of methanotrophic bacteria and low abundance in the following month, suggesting insufficient methane consumption (Joye, Teske, and Kostka, 2014). For this study in 2016, methane metabolism was slightly higher in May (DP03: 2.2%) as compared to August (DP04: 2.15%), with highest abundance in the mesopelagic zone.

Quantitative studies based on Net Primary Productivity (NPP) by definition do not include carbon fixed by non-photosynthetic processes, which primarily occur in the aphotic zone, making it important to study those processes and include them in biogeochemical budgets (Orcutt et al., 2011). As photosynthesis is replaced by chemosynthetic processes as the method of primary productivity in the sunlight-deprived aphotic zone, studying these processes can help identify dynamics of processes driving microbial communities. Activity of microbial communities in the deeper layers have also been connected strongly to the sinking particle flux (Nagata et al., 2000; Mestre et al., 2018). Microbes are known to colonize this particle flux, called marine snow, due to their high organic matter content. The switch in function between the euphotic and aphotic zone may also be attributed to this increased reliance on readily available organic carbon as energy source (Orcutt et al., 2011). Targeted deep ocean studies may be required to tease apart the function

112

and driving factors of particle flux-associated microbes versus free-living communities and how much do they contribute to marine nutrient cycles and global biogeochemical cycles.

Environmental Drivers and Functional Redundancy

Environmental variables often explain more variation in community function more composition (King et al., 2013; Thompson et al., 2017). For cruises DP03 and DP04, environmental variables explained 63% and 77% variation in metabolic function respectively. Temperature was the major environmental driver and positively correlated with metabolic function during DP03 and negatively during DP04. Although function varied significantly across depth zones (adonis test, p<0.001), temperature, and not individual sampling depth, was the major environmental driver of community function across the water column (Table 5). Compared with OTU taxonomic abundance, temperature explained only 8.1% and 15.3% variation, while it explained 55.4% and 68% of variation in function (Table 2, 5). This underlines the significance of temperature in driving local microbial community structure and function along with its influence across global marine environments (Sunagawa et al., 2015; Thompson et al., 2017).

This difference in community taxa and community function is also observed when comparing relative abundance of taxonomic composition and metabolic potential. Variation in abundance of phylum-level taxa (relative abund. SD, DP03: 0.069, DP04: 0.072) was higher than that of metabolic pathways (relative abund. SD, DP03: 0.0082, DP04: 0.0081). A study on the global ocean microbiome also found high variation in taxonomic abundance and relatively stable gene functions (Sunagawa et al., 2015). The results above highlight this variation between taxonomic and functional potential of marine microbial communities, and support the idea of functional redundancy, where community function more closely reflects environmental conditions.

Taxonomic analyses may not always be conclusive, as a study on oil spills found high abundance of hydrocarbon degraders before and after oil spill in sediment and seawater, but enhanced hydrocarbon genes only post oil spill (Neethu et al., 2018). The current microbial classification system also does not accommodate for processes such as HGT and high mutation rates in the taxonomy, which are key for carrying out selective functions for niche adaptation (Medini et al., 2005). The pan-genome concept addresses this discrepancy by suggesting that the

113

genetic repertoire of a microbe comprises of its core genome, which is shared across species and is essential for basic survival functions, and a dispensable genome which allows for selective adaptive functions and differentiates between strains (Medini et al., 2005; Vernikos et al., 2015). Hence, functional redundancy potentially offers a buffering capacity for microbial communities, where the underlying function may remain within a small range while the composition may vary considerably (Sunagawa et al., 2015; Louca, Parfrey, and Doebeli, 2016). It is this underlying function which is seen more closely associated with environmental factors.

Table 5: CCA of environmental variables on metabolic function. Effect of environmental variables on community function of DP03 and DP04 showed temperature as the major environmental driver.

Spatial, Mesoscale, and Temporal Dynamics

There are varying reports of spatial diversity of microbial taxa from the NGoM, with cases for both, presence (Mason et al., 2016; Easson and Lopez, 2019) and absence (King et al., 2013) of spatial variability, depending on sampling time and location. Metabolic function in the NGoM was significantly different across sampling stations in the euphotic and aphotic zones in May (DP03) and August (DP04) of 2016. To further analyze what factors could be causing these differences, mesoscale-level water classification, based on sea-surface height and temperature, across the GoM was used (Johnston et al., 2019). Sampling stations were classified as present in common water, mixed water, or the LC region, with mixed water defined as intermediate to common water and LC. Function was significantly different across water classification in the euphotic zone during both sampling times, May (p=0.007) and August (p=0.001). While microbial function was also significantly different across the aphotic zone during May (p=0.023), it was similar in the aphotic zone during August (p=0.109). This observation can be explained by the combined effect of high residence time and the presence of strong water column stratification. As the water is stratified during summer, there is less vertical mixing and the water currents in deeper layers do not cause sufficient mixing, leading to similar microbial activity across the deep waters of the NGoM and the function in aphotic being effectively isolated from that in the euphotic zone.

114

The euphotic zone has been extensively studied for microbial composition and functional dynamics across the global oceans, with emphasis on surface primary production as it is responsible for ~46% of global primary productivity and feeds the deeper layers with organic matter (Logares et al., 2018). However, supply of organic matter is less than the microbial activity noted in the aphotic zone, indicating significant primary production in the deep (Acinas et al., 2019). Methane metabolism is one of the primary methods of fixing inorganic carbon in the deep. The abundance of methane metabolism at the site closest to the shore, B175, was higher in late summer of 2016 (Fig. 22) than in early spring (Fig. 21), indicating seasonal variation. Another sampling site slightly further offshore, B252, had high metabolism throughout both seasons.

No correlation was found with mesoscale water classification, possibly due to low sample coverage of the LC, a weak LC during both cruises, or the absence of LC in the bathypelagic. The GoM seabed has high hydrocarbon content, including approximately 22,000 methane seeps (Joye et al., 2014). While specific natural seeps were not mapped along with the cruise stations, this difference in metabolism across the two relatively close sites may be due to the presence of a natural seep at/near station B252 responsible for a high steady release of methane, causing increased metabolism than a nearby site. Studies on microbial taxa from the GoM have shown that methane may be a key driving force structuring the community across water column and needs to be further studied (Rakowski et al., 2015). A longer time-scale analysis might also be needed to study temporal changes on microbial methane metabolism in the aphotic zone. Due to the higher number of hydrocarbon seeps in the GoM as compared with other marine ecosystems, there is higher concentration of methane in the deeper layers (Rakowski et al., 2015). These concentrations also increased 10-1000 times after the DWHOS (Joye et al., 2011). As methane is a major greenhouse gas and responsible for global ocean warming (Lelieveld, Crutzen, and Dentener, 1998), among other large-scale effects, studying methane microbial dynamics may help understand the factors driving and affecting these dynamics.

The strength of association of environmental variables with microbial communities differs between taxa and function, with seasonality strongly affecting how these variables drive microbial communities (Sunagawa et al., 2015). For the current dataset, environmental variables explained more variation of function than of taxa (Table 2, 5). Along with spatial dynamics, time-series studies of marine microbiomes can help determine seasonal and annual patterns, along with

115

potential changes to function caused by irregular natural or anthropogenic events such as hurricanes and oil spills. Functional data from cruise DP02 (Bos et al., 2018) was integrated with the current study of DP03 and DP04 for temporal analyses.

The stratification of oceanic depth zones during summer months is evident from the metabolic activity of microbial communities, with depth zones explaining more metabolic variation during end of summer season (August 2016, DP04) and the distinct clustering of euphotic zones and aphotic zones, than during end of spring (May 2016, DP03). Photosynthetic primary productivity overall was significantly different across seasons (May vs August), but not annually (2015 vs 2016), supporting a previous study that also found surface primary productivity differs seasonally (Gilbert et al., 2012). Testing for this seasonal variation individually between the three time points of this study, photosynthetic functional abundance was found significantly different between May 2016 and August 2016, but not between August 2015 and May 2016. While the spring season in the NGoM is associated with higher primary productivity than summer, productivity across summer of 2015 was similar to that of spring 2016. A possible explanation for this deviation may be the presence of a strong LC observed during August 2015. Cell abundance studies across the NGoM have shown that the upwelling eddies formed due to the LC enhance nutrient mixing across the water column, thereby causing an increase in microbial plankton abundance and primary productivity (Williams et al., 2015). This deviation may also be due to the lack of sufficient sampling within the LC, as only two sites were sampled within the LC during May 2016 (DP03). Long-term seasonal and annual sampling in the LC might be needed to verify the above analysis. As the LC and the MR outflow are two major mesoscale circulations in the GoM, further studies focused on water circulation and microbial community function can help understand potential pelagic effects of the increased nutrient loading from the Mississippi River runoff.

Considerations, Developments, and Future Applications

PICRUSt continues to be used in studying aquatic and marine microbial communities with reasonably accurate functional inferences, which will potentially improve with the application of PICRUSt2 (Douglas et al., 2019). While deep-sequencing metagenomics studies offer in-depth

116

insights into the gene content and function of marine microbial communities, they are often expensive and computationally intensive. The 16S amplicon-based tools such as PICRUSt thus offer cost-effective alternatives. Both approaches have in effect highlighted the importance of analyzing microbial function to study marine environments. Microbial analyses can also support studies on marine chemical data and nutrient cycles.

Development of high-throughput sequencing standards, single-gene surveys, metagenomic surveys, and curation of sequence databases are key methods to accurately analyze microbial communities in natural environments. As marine microbial community composition and function has evolved distinctively from other natural environments, a marine environment-specific database such as OMRGC (Sunagawa et al., 2015) can serve as a repository and reference database for future amplicon, metagenomic and metatranscriptomic analyses, avoiding ecosystem bias.

Conclusion

The current study is the first comprehensive report of microbial community function from the surface to bathypelagic zone across the pelagic Northern Gulf of Mexico. Metabolic functional data across cruises in May and August of 2016 showed strong depth stratification, with a major shift in function from the euphotic to aphotic zone. Photosynthetic functional signatures such as photosynthesis proteins and photosynthesis metabolism were the major functions driving this shift. Sunlight-dependent primary productivity at the surface was replaced by methane metabolism as the most abundant pathway for primary productivity in the aphotic zone. Methane metabolism in the aphotic zone mapped across the sampling sites showed high abundance in the farthest sites. Analysis of association of physicochemical environmental variables showed temperature as the major driver among other parameters: salinity, absolute depth, chlorophyll a concentration, and dissolved oxygen. Environmental variables overall also had a stronger correlation with function (63% and 77%), explaining more variation than taxonomic composition (14% and 26%).

Metabolic function was also significantly different across the euphotic and aphotic zone of both sampling seasons, indicating spatial variability. Based on mesoscale water classification, common water, mixed water and Loop Current, function was found similar across the deeper layers of the NGoM during late summer, August 2016, which may be attributed to a strong stratification

117

of water column. Previously analyzed data from an earlier cruise in August 2015, was included for temporal analyses. Photosynthetic primary productivity overall was significantly different across seasons (May and August), but not annually (2015 and 2016). This analysis shows potential pelagic effects of the Mississippi River as early spring season of May corresponds with a higher river outflow. Marine microbes are key drivers of the biogeochemical cycles, and studying their function can help assess ecological changes and environmental effects which may also affect higher trophic levels. The analyses of functional data presented here contribute to studying the dynamics of microbial communities of the GoM and establishing a functional baseline for the region.

118

References

Acinas, S. G., Sánchez, P., Salazar, G., Cornejo-Castillo, F. M., Sebastián, M., Logares, R., et al. (2019) Metabolic Architecture of the Deep Ocean Microbiome. bioRxiv, 635680.

Afgan, E., Baker, D., Batut, B., Van Den Beek, M., Bouvier, D., Čech, M., et al. (2018) The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update. Nucleic acids research, 46(W1), W537-W544.

Andrews, S. (2010) FastQC: a quality control tool for high throughput sequence data. In: Babraham Bioinformatics, Babraham Institute, Cambridge, United Kingdom.

Arístegui, J., Gasol, J. M., Duarte, C. M., & Herndld, G. J. (2009) Microbial oceanography of the dark ocean's pelagic realm. Limnology and Oceanography, 54(5), 1501-1529.

Bos, R. P., Lopez, J., Easson, C., & White, J. (2018) PICRUSt Analysis of Shallow and Deep- Water Microbial Communities in the Gulf of Mexico.

Boucher, Y., Douady, C. J., Papke, R. T., Walsh, D. A., Boudreau, M. E. R., Nesbø, C. L., et al. (2003) Lateral gene transfer and the origins of prokaryotic groups. Annual review of genetics, 37(1), 283-328. Caporaso, J. G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F. D., Costello, E. K., et al. (2010) QIIME allows analysis of high-throughput community sequencing data. Nature methods, 7(5), 335.

DeLong, E. F. (2005) Microbial community genomics in the ocean. Nature Reviews Microbiology, 3(6), 459.

DeSantis, T. Z., Hugenholtz, P., Larsen, N., Rojas, M., Brodie, E. L., Keller, K., et al. (2006) Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Applied and Environmental Microbiology, 72(7), 5069-5072.

Douglas, G. M., Maffei, V. J., Zaneveld, J., Yurgel, S. N., Brown, J. R., Taylor, C. M., et al. (2019) PICRUSt2: An improved and extensible approach for metagenome inference. bioRxiv, 672295.

Easson, C. G., & Lopez, J. V. (2018) Depth-dependent environmental drivers of microbial plankton community structure in the northern Gulf of Mexico. Frontiers in microbiology, 9.

Ewels, P., Magnusson, M., Lundin, S., & Käller, M. (2016) MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics, 32(19), 3047 -3048.

Falkowski, P. G., Barber, R. T., & Smetacek, V. (1998) Biogeochemical controls and feedbacks on ocean primary production. Science, 281(5374), 200-206.

119

Falkowski, P. G., Fenchel, T., & Delong, E. F. (2008) The microbial engines that drive Earth's biogeochemical cycles. Science, 320(5879), 1034-1039.

Fernández, A., Huang, S., Seston, S., Xing, J., Hickey, R., Criddle, C., & Tiedje, J. (1999) How stable is stable? Function versus community composition. Appl. Environ. Microbiol., 65(8), 3697-3704.

Fuhrman, J. A., Steele, J. A., Hewson, I., Schwalbach, M. S., Brown, M. V., Green, J. L., & Brown, J. H. (2008) A latitudinal diversity gradient in planktonic marine bacteria. Proceedings of the National Academy of Sciences, 105(22), 7774-7778.

Gao, Z. M., Huang, J. M., Cui, G. J., Li, W. L., Li, J., Wei, Z. F., et al. (2019) In situ meta‐omic insights into the community compositions and ecological roles of hadal microbes in the Mariana Trench. Environmental microbiology.

Gilbert, J. A., Steele, J. A., Caporaso, J. G., Steinbrück, L., Reeder, J., Temperton, B., et al. (2012) Defining seasonal marine microbial community dynamics. The ISME journal, 6(2), 298.

Goolsby, D. A., Battaglin, W. A., Lawrence, G. B., Artz, R. S., Aulenbach, B. T., Hooper, R. P., et al. (1999) Flux and sources of nutrients in the Mississippi-Atchafalaya River Basin. National Oceanic and Atmospheric Administration National Ocean Service Coastal Ocean Program.

Johnston, M. W., Milligan, R. J., Easson, C. G., DeRada, S., English, D. C., Penta, B., & Sutton, T. T. (2019) An empirically validated method for characterizing pelagic habitats in the Gulf of Mexico using ocean model data. Limnology and Oceanography: Methods.

Joye, S. B., MacDonald, I. R., Leifer, I., & Asper, V. (2011) Magnitude and oxidation potential of hydrocarbon gases released from the BP oil well blowout. Nature Geoscience, 4(3), 160.

Joye, S. B., Teske, A. P., & Kostka, J. E. (2014) Microbial dynamics following the Macondo oil well blowout across Gulf of Mexico environments. BioScience, 64(9), 766-777.

Kanehisa, M., Araki, M., Goto, S., Hattori, M., Hirakawa, M., Itoh, M., et al. (2007) KEGG for linking genomes to life and the environment. Nucleic acids research, 36(suppl_1), D480- D484.

King, G. M., Smith, C., Tolar, B., & Hollibaugh, J. T. (2013) Analysis of composition and structure of coastal to mesopelagic bacterioplankton communities in the northern gulf of Mexico. Frontiers in microbiology, 3, 438.

120

Kumar, R., Mishra, A., & Jha, B. (2019) Bacterial community structure and functional diversity in subsurface seawater from the western coastal ecosystem of the Arabian Sea, India. Gene, 701, 55-64.

Lamendella, R., Strutt, S., Borglin, S. E., Chakraborty, R., Tas, N., Mason, O. U., et al. (2014) Assessment of the Deepwater Horizon oil spill impact on Gulf coast microbial communities. Frontiers in microbiology, 5, 130.

Langille, M. G., Zaneveld, J., Caporaso, J. G., McDonald, D., Knights, D., Reyes, J. A., et al. (2013) Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nature biotechnology, 31(9), 814.

Lelieveld, J., Crutzen, P. J., & Dentener, F. J. (1998) Changing concentration, lifetime and climate forcing of atmospheric methane. Tellus B, 50(2), 128-150.

Logares, R., Deutschmann, I. M., Giner, C. R., Krabberød, A. K., Schmidt, T. S., Rubinat-Ripoll, L., et al. (2018) Different processes shape prokaryotic and picoeukaryotic assemblages in the sunlit ocean microbiome. bioRxiv, 374298.

Louca, S., Parfrey, L. W., & Doebeli, M. (2016) Decoupling function and taxonomy in the global ocean microbiome. Science, 353(6305), 1272-1277.

Mason, O. U., Canter, E. J., Gillies, L. E., Paisie, T. K., & Roberts, B. J. (2016) Mississippi river plume enriches microbial diversity in the northern gulf of Mexico. Frontiers in microbiology, 7, 1048.

Medini, D., Donati, C., Tettelin, H., Masignani, V., & Rappuoli, R. (2005) The microbial pan- genome. Current opinion in genetics & development, 15(6), 589-594.

Mestre, M., Ruiz-González, C., Logares, R., Duarte, C. M., Gasol, J. M., & Sala, M. M. (2018) Sinking particles promote vertical connectivity in the ocean microbiome. Proceedings of the National Academy of Sciences, 115(29), E6799-E6807.

Nagata, T., Fukuda, H., Fukuda, R., & Koike, I. (2000) Bacterioplankton distribution and production in deep Pacific waters: Large–scale geographic variations and possible coupling with sinking particle fluxes. Limnology and Oceanography, 45(2), 426-435.

Neethu, C., Saravanakumar, C., Purvaja, R., Robin, R., & Ramesh, R. (2019) oil-spill triggered shift in Indigenous Microbial structure and Functional Dynamics in Different Marine environmental Matrices. Scientific reports, 9(1), 1354.

Nowinski, B., Smith, C. B., Thomas, C. M., Esson, K., Marin, R., Preston, C. M., et al. (2019) Microbial metagenomes and metatranscriptomes during a coastal phytoplankton bloom. Scientific data, 6(1), 129.

121

Oksanen, J., Blanchet, F. G., Kindt, R., Legendre, P., Minchin, P. R., O’hara, R., et al. (2011) vegan: Community ecology package. R package version, 117-118.

Orcutt, B. N., Sylvan, J. B., Knab, N. J., & Edwards, K. J. (2011) Microbial ecology of the dark ocean above, at, and below the seafloor. Microbiol. Mol. Biol. Rev., 75(2), 361-422.

Rabalais, N. N., Turner, R. E., Justić, D., Dortch, Q., Wiseman, W. J., & Gupta, B. K. S. (1996) Nutrient changes in the Mississippi River and system responses on the adjacent continental shelf. Estuaries, 19(2), 386-407.

Rabalais, N. N., Turner, R., Gupta, B. S., Boesch, D., Chapman, P., & Murrell, M. (2007) Hypoxia in the northern Gulf of Mexico: Does the science support the plan to reduce, mitigate, and control hypoxia? Estuaries and Coasts, 30(5), 753-772.

Rabalais, N. N., Turner, R. E., & Wiseman Jr, W. J. (2002) Gulf of Mexico hypoxia, aka “The dead zone”. Annual Review of ecology and Systematics, 33(1), 235-263.

Rakowski, C., Magen, C., Bosman, S., Gillies, L., Rogers, K., Chanton, J., & Mason, O. U. (2015) Methane and microbial dynamics in the Gulf of Mexico water column. Frontiers in Marine Science, 2, 69.

Reddy, C. M., Arey, J. S., Seewald, J. S., Sylva, S. P., Lemkau, K. L., Nelson, R. K., et al. (2012) Composition and fate of gas and oil released to the water column during the Deepwater Horizon oil spill. Proceedings of the National Academy of Sciences, 109(50), 20229-20234.

Rusch, D. B., Halpern, A. L., Sutton, G., Heidelberg, K. B., Williamson, S., Yooseph, S., et al. (2007) The Sorcerer II global ocean sampling expedition: northwest Atlantic through eastern tropical Pacific. PLoS biology, 5(3), e77.

Salerno, J., Little, B., Lee, J., & Hamdan, L. J. (2018) Exposure to crude oil and chemical dispersant may impact marine microbial biofilm composition and steel corrosion. Frontiers in Marine Science, 5(196)

Shen, Y., Fichot, C. G., Liang, S. K., & Benner, R. (2016) Biological hot spots and the accumulation of marine dissolved organic matter in a highly productive ocean margin. Limnology and Oceanography, 61(4), 1287-1300.

Sunagawa, S., Coelho, L. P., Chaffron, S., Kultima, J. R., Labadie, K., Salazar, G., et al. (2015) Structure and function of the global ocean microbiome. Science, 348(6237), 1261359.

Sutton, T., Cook, A., Boswell, K. M., Bracken-Grissom, H. D., DeRada, S., English, D., et al. (2017) Deep-Pelagic Research in the Gulf of Mexico: The DEEPEND Consortium.

Thompson, L. R., Sanders, J. G., McDonald, D., Amir, A., Ladau, J., Locey, K. J., et al. (2017) A communal catalogue reveals Earth’s multiscale microbial diversity. Nature, 551(7681).

122

Vernikos, G., Medini, D., Riley, D. R., & Tettelin, H. (2015) Ten years of pan-genome analyses. Current opinion in microbiology, 23, 148-154.

Ward, C. H., & Tunnell, J. W. (2017) Habitats and Biota of the Gulf of Mexico: An Overview. In Habitats and Biota of the Gulf of Mexico: Before the Deepwater Horizon Oil Spill (pp. 1- 54): Springer.

Williams, A. K., McInnes, A. S., Rooker, J. R., & Quigg, A. (2015) Changes in microbial plankton assemblages induced by mesoscale oceanographic features in the northern Gulf of Mexico. PloS one, 10(9), e0138230.

Xu, J. (2006) Invited review: microbial ecology in the age of genomics and metagenomics: concepts, tools, and recent advances. Molecular ecology, 15(7), 1713-1731.

Zhou, J., L Richlen, M., R Sehein, T., M Kulis, D., Anderson, D., & Cai, Z. (2018) Microbial community structure and associations during a marine dinoflagellate bloom. Frontiers in microbiology, 9, 1201.

123