UC Irvine UC Irvine Electronic Theses and Dissertations

Title Phylogenetic and Functional Biogeography of Marine Bacteria

Permalink https://escholarship.org/uc/item/3z64m8m8

Author Hatosy, Stephen Mark

Publication Date 2015

License https://creativecommons.org/licenses/by/4.0/

Peer reviewed|Thesis/dissertation


UNIVERSITY OF CALIFORNIA, IRVINE

Phylogenetic and Functional Biogeography of Marine Bacteria

DISSERTATION

submitted in partial satisfaction of the requirements for the degree of

DOCTOR OF PHILOSOPHY

in Biological Sciences

by

Stephen Mark Hatosy, Jr.

Dissertation Committee:
Associate Professor Adam C. Martiny, Chair
Associate Professor Steven Allison
Professor Brandon Gaut

2015

Chapter 1 © 2013 Ecological Society of America All other material © 2015 Stephen Mark Hatosy, Jr.

DEDICATION

To my family who fostered my love of science


TABLE OF CONTENTS

LIST OF FIGURES

LIST OF TABLES

ACKNOWLEDGMENTS

CURRICULUM VITAE

ABSTRACT OF THE DISSERTATION

INTRODUCTION

CHAPTER 1: Beta-diversity of marine bacteria depends on scale

CHAPTER 2: Taxonomic scale influences temporal dynamics

CHAPTER 3: The ocean as a global reservoir of antibiotic resistance genes

LIST OF FIGURES

Figure 1.1 Pairwise community similarity heatmaps

Figure 1.2 Temporal decay curves

Figure 1.3 Variance decomposition

Figure S1.1 Spectral periodogram

Figure S1.2 Moran’s I community similarity correlograms

Figure S1.3 Moran’s I environmental similarity correlograms

Figure S1.4 Moran’s I environmental distance correlograms

Figure S1.5 Environmental variance decomposition

Figure 2.1 Time series plots

Figure 2.2 Phylogenetic tree and time series

Figure 2.3 Correlation between water temperature and relative abundance

Figure S2.1 Spectral density plots

Figure 3.1 Clone frequencies across sample locations and antibiotics

Figure 3.2 Abundance and amino acid similarity to known AR genes

Figure 3.3 Relative abundance of marine and non-marine taxa

Figure 3.4 Clone frequencies from genomic libraries

LIST OF TABLES

Table 1.1 Dataset summary

Table 1.2 Community regression statistics

Table 2.1 rpoC1 library summary

Table 2.2 Pearson’s correlation coefficients

Table S3.1 Minimum inhibitory concentrations

Table S3.2 Kruskal-Wallis and PERMANOVA statistics

Table S3.3 Previously uncharacterized AR genes

Table S3.4 Known AR genes in marine taxa

Table S3.5 Protein functional types in marine taxa

Table S3.6 ARDB and GenBank hits to E. coli clones

Table S3.7 GenBank hits to Synechococcus WH8102 clones

ACKNOWLEDGMENTS

First, I would like to thank my advisor, Professor Adam Martiny, for his unending support during my dissertation and for treating me as a colleague.

I would like to thank my committee members, Professor Steven Allison and Professor Brandon Gaut, for reminding me to think about the big picture.

In addition, I thank Professor Jennifer Martiny and Professor Bradford Hawkins for insightful comments on dissertation chapters. And thank you to Claudia Weihe for her molecular biology knowledge.

I thank the Ecological Society of America for permission to include copyrighted material as Chapter 1 of my dissertation. I also thank my co-authors on that publication, Rohan Sachdeva, Joshua Steele, and Jed Fuhrman.

Financial support was provided by the University of California, Irvine, the NSF Biological Oceanography Program, and the Department of Education Graduate Assistance in Areas of National Need.

I also thank the Ecology and Evolutionary Biology graduate students for sharing advice, analyses, expertise, and the occasional beer.

I thank my parents Steve and Phyllis Hatosy and my sister Viki Hatosy for their support and encouragement, especially towards the end of the dissertation when I needed it the most.

And finally, I thank LuAnna Dobson for her endless support and constructive criticism when I needed it.

CURRICULUM VITAE

Stephen M. Hatosy

Education:
University of California, Irvine, Ecology and Evolutionary Biology, Ph.D., 2015
University of California, Berkeley, Molecular Environmental Biology, B.Sc., 2006
Orange Coast College, Biology, A.A., 2004

Additional education:
Cold Spring Harbor Laboratory, Advanced Bacterial Genetics, 2011

Appointments:
2009 – 2014 Teaching Assistant, University of California, Irvine, CA
2007 – 2009 Teaching Assistant, Orange Coast College, Costa Mesa, CA
2007 Research Assistant, US Geological Survey, Menlo Park, CA
2005 Undergraduate Researcher, Richard B. Gump South Pacific Research Station, Moorea, French Polynesia
2005 – 2006 Research Assistant, University of California, Berkeley, CA
2002 – 2003 NSF Teaching Scholarship Program, Orange Coast College, Costa Mesa, CA
2002 – 2004 Teaching Assistant, Orange Coast College, Costa Mesa, CA
2002 – 2003 Aquarium Manager, Orange Coast College Public Aquarium, Costa Mesa, CA

Mentoring Activities:
2010 – Present Graduate mentor for 24 undergraduate students at UCI

Publications:
Hatosy, S.M., A.C. Martiny. The ocean as a global reservoir of antibiotic resistance genes. In review.
Hatosy, S.M., J.B.H. Martiny, R. Sachdeva, J. Steele, J.A. Fuhrman, A.C. Martiny. 2013. Beta-diversity of marine bacteria depends on scale. Ecology. 94:1894-1904
Allison, S, Y. Chao, J.D. Fararra, S. Hatosy, A.C. Martiny. 2012. Fine-scale temporal variation in marine ectoenzymes of coastal southern California. Frontiers in Microbiology. doi:10.3389/fmicb.2012.00301
Freitas, S, S. Hatosy, J.A. Fuhrman, S.M. Huse, D.B.M. Welch, M.L. Sogin, A.C. Martiny. 2012. Global distribution and diversity of marine Verrucomicrobia. ISME J. 6:1499-1505

Awards:
2014 The Data Incubator Data Science Fellowship Semi-finalist
2013 Dept. of Education Graduate Assistance in Areas of National Need fellowship
2012 Dept. of Education Graduate Assistance in Areas of National Need fellowship
2012 Orange Coast College Garrison Fellow
2010 NSF Graduate Research Fellowship Honorable Mention

Service:
2013-2014 Graduate Student Representative, UC Irvine, CA
2012-2013 Graduate Student Invited Speaker Organizer, UC Irvine, CA
2011-2012 Graduate Student Symposium Organizer, UC Irvine, CA
2012 Reviewer for Aquatic Microbial Ecology

Outreach:
2012 Community Science Night
2010-2012 Ask-A-Scientist Night
2002-2003 Career Day Speaker

Presentations:
2014 Hatosy, S.M., A.J. Field, A.C. Martiny. Marine environments are a reservoir for potential antibiotic resistance genes. Southern California Geobiology Symposium, USC.
2013 Lee, J.A., C. Mouginot, S.M. Hatosy, A. Talarmin, A.C. Martiny. Temporal Variability in Oceanic Nutrient Ratios along the Southern California Coast. Undergraduate Research Opportunity Program Symposium, UCI.
2013 Lee, J.A., C. Mouginot, S.M. Hatosy, A. Talarmin, A.C. Martiny. Variability of the C:N:P ratios of the particulate organic matter at Newport Beach Pier. Southern California Geobiology Conference, Caltech.
2012 Hatosy, S.M., A.J. Field, A.C. Martiny. Widespread potential for antibiotic resistance genes in marine environments. American Society for Microbiology.
2011 Abedrabbo, S., S.M. Hatosy, A.C. Martiny. Diversity and its limits: Variation within cyanobacterial ecotypes. Excellence in Research Symposium, UCI.
2011 Freitas, S., S.M. Hatosy, A. Martiny. Spatial and temporal variation of Verrucomicrobia. Undergraduate Research Opportunity Program Symposium, UCI.
2010 Hatosy, S.M., V. Neino, J. Lee, T. Tran, H. Bui, A. Heetland, D. Ho, C. League, B. Takagi, S. Kathuria, S. Allison, A. Martiny. Multi-temporal scale variation of activity. International Symposium on Microbial Ecology.

ABSTRACT OF THE DISSERTATION

Phylogenetic and Functional Biogeography of Marine Bacteria

By

Stephen Mark Hatosy, Jr.

Doctor of Philosophy in Biological Sciences

University of California, Irvine, 2015

Professor Adam Martiny, Chair

Communities vary across space and time. In addition, they may vary differently at different spatial and temporal scales. It is well established that marine bacterial communities are temporally structured by seasons. However, there are other environmental changes that occur at different temporal scales. Here, we quantified the community variation at three temporal scales across three time series. We found that communities varied at all three time scales, with the majority of community variation occurring within seasons. Additionally, we found that community variation correlated with different environmental variables at different temporal scales.

Next, I wanted to identify the effect phylogenetic scale has on biodiversity patterns. Most of bacterial ecology defines a taxon as a group that shares more than 97% similarity of the 16S rRNA. However, these groups are not independent of each other. They share evolutionary history charted by divergences into different niches. In order to identify the role of phylogenetic resolution on population dynamics, I undertook a 4.5-year sampling project and analysed the variation in relative abundance at different taxonomic levels (genus level, clade level, and subgroup level). I found that the frequency of variation increased as I moved to finer phylogenetic resolutions. In addition, the correlation with temperature also changed with phylogenetic resolution.

Finally, I wanted to identify antibiotic resistance genes in marine environments. Natural environments are quickly being discovered to harbor a number of antibiotic resistance genes. However, marine environments have been mostly overlooked even though they cover more than 70% of Earth’s surface and hold on the order of 10^29 bacterial cells. As part of my dissertation, I wished to fill this knowledge gap. By using the culture-independent method of functional metagenomics, I discovered genes that conferred antibiotic resistance. Of these genes, 28% were identified as previously known antibiotic resistance genes. The majority were not previously known to confer resistance, but had activities similar to antibiotic resistance genes (e.g. transport pumps, enzymatic degradation, etc.). I also identified that the majority of these genes were found in marine bacteria, such as Pelagibacter, Roseobacter, and Prochlorococcus. Therefore, I have uncovered a potential global reservoir of antibiotic resistance genes.

INTRODUCTION

Patterns of biodiversity have been of interest to ecologists for several decades and the role of spatial scale on turnover has been known since the description of the species-area relationship (Preston 1960). The role of spatial scale has since been investigated using the negative relationship between community similarity and distance between those communities (the distance decay curve) (Harte and Kinzig 1997, Harte et al. 1999, Morlon et al. 2008). Such scale dependent patterns have been observed in forest communities (Nekola and White 1999, Condit et al. 2002), intertidal communities (Tsujino et al. 2009), and microbial communities (Ramette and Tiedje 2007).

However, distance decay patterns are not limited to spatial community turnover. Temporal changes in community similarity can also be measured using distance decay. These patterns have been observed in a variety of communities including fish, invertebrate larvae, microalgae, fossil diatoms, and leaf-surface bacteria (Cermeno and Falkowski 2009, Redford and Fierer 2009, Korhonen et al. 2010, Magurran and Henderson 2010), and as with spatial patterns, temporal turnover is scale dependent (Cermeno and Falkowski 2009, Korhonen et al. 2010). Of the diversity of communities that have been studied using distance decay, marine bacterial communities have received little attention.

Marine environments are seasonally and inter-annually dynamic ecosystems (DuRand et al. 2001, Cloern et al. 2010), and marine microbial communities vary seasonally in ways that correspond to environmental variation (Fuhrman et al. 2006, Gilbert et al. 2012). However, the nature of these variations is not well understood. How much does composition change season-to-season, or year-to-year? Now, recent advances in sequencing technology and continuous long-term time series of microbial communities (>4 years) have made these analyses more tractable.

In my first chapter, we use a temporal distance-decay approach on three marine microbial communities over seasonal and inter-annual time scales to answer the following questions: 1) How much do intra-seasonal, seasonal and inter-annual turnover contribute to community composition variation? 2) How do intra-seasonal, seasonal and inter-annual variations compare across the three different habitats? 3) What environmental factors drive variation at these three temporal scales?

In addition to temporal scales, the taxonomic resolution studied may also influence microbial biodiversity patterns (Cho and Tiedje 2000, Fulthorpe et al. 2008). For instance, the marine cyanobacterium Prochlorococcus is distributed along a depth gradient; however, this distribution is partitioned into high-light adapted and low-light adapted ecotypes (Moore et al. 1999). And when these ecotypes are defined by finer taxonomic resolution, they display biogeographic patterns in response to water temperature and nitrate concentration (Martiny et al. 2009).

Similarly, the marine cyanobacterium Synechococcus also follows taxonomic resolution trends. Marine Synechococcus forms a monophyletic clade, which is divided into ten subclades (Ferris and Palenik 1998). Using quantitative PCR assays, it was observed that four of these clades were present in California coastal waters, and each clade displayed a unique temporal pattern (Tai and Palenik 2009). Therefore, in my second chapter, I use sequence analysis of the RNA polymerase gene (rpoC1) from the cyanobacteria Prochlorococcus and Synechococcus to answer the following questions: 1) Which Prochlorococcus and Synechococcus clades exist in the ocean off Newport Beach, and 2) How do population dynamics change at various taxonomic and temporal scales?

In my third chapter, I address the question of diversity patterns by looking at functional diversity in antibiotic resistance genes. Antibiotic resistant bacteria continue to be a public health concern. The topic of antibiotic resistance has generally focused on resistance in human pathogenic bacteria. This work has focused on bacteria that are easily culturable; however, non-pathogenic bacteria can also be resistant to antibiotics and only 1% of these microbes are culturable (Mazel and Davies 1999, Martinez 2009).

Recent advances in molecular methods and sequencing technologies have uncovered diverse antibiotic resistance genes (AR genes) in soil microbial communities (Riesenfeld et al. 2004), the human gut (Sommer et al. 2009), and seagulls (Martiny et al. 2011).

The results of such studies illustrate our lack of understanding of the diversity and distribution of antibiotic resistance in a range of microbial habitats. These habitats generally have microorganisms living in close association, so there is the potential for antagonistic interactions to include microbial derived antibiotics. But what about environments that are typically diffuse, such as oceans?

There is a general understanding that there is not a large role for antibiotic resistance in oceans because they are highly dilute environments, and antibiotics would not reach lethal concentrations. However, there are a few instances when antibiotic resistant bacteria have been isolated from seawater (Hermansson et al. 1987, De Souza et al. 2006, Baker-Austin et al. 2009, de Oliveira et al. 2010). In addition, secondary metabolites from marine bacteria have been identified as antibiotic compounds (Gontang et al. 2010, Haste et al. 2010, Edlund et al. 2011), and bacteria isolated from seawater can interact antagonistically using antibiotic compounds (Long and Azam 2001). Because of these interactions and the rich taxonomic diversity of marine microbial communities (Rusch et al. 2007, Yooseph et al. 2007), there is the potential for widespread occurrence of antibiotic resistance. Since oceans represent 71% of Earth’s surface, they could be global reservoirs of antibiotic resistance; antibiotic resistant bacteria in marine habitats could therefore have global implications for public health and marine ecology. Therefore, for my third chapter I use a culture-independent method to identify genes that confer resistance to antibiotics to answer the following questions: 1) What is the frequency of resistance to different antibiotics in specific marine environments, 2) what is the diversity of marine AR genes, and 3) are these genes harbored by marine bacteria?

References:

Baker-Austin, C., J. McArthur, A. Lindell, M. Wright, R. Tuckfield, J. Gooch, L. Warner, J. Oliver, and R. Stepanauskas. 2009. Multi-site analysis reveals widespread antibiotic resistance in the marine pathogen Vibrio vulnificus. Microbial Ecology 57:151-159.

Cermeno, P. and P. G. Falkowski. 2009. Controls on diatom biogeography in the ocean. Science 325:1539-1541.

Cho, J. C. and J. M. Tiedje. 2000. Biogeography and degree of endemicity of fluorescent Pseudomonas strains in soil. Applied and Environmental Microbiology 66:5448-5456.

Cloern, J. E., K. A. Hieb, T. Jacobson, B. Sanso, E. Di Lorenzo, M. T. Stacey, J. L. Largier, W. Meiring, W. T. Peterson, T. M. Powell, M. Winder, and A. D. Jassby. 2010. Biological communities in San Francisco Bay track large-scale climate forcing over the North Pacific. Geophysical Research Letters 37.

Condit, R., N. Pitman, E. G. Leigh, Jr., J. Chave, J. Terborgh, R. B. Foster, P. Nunez, S. Aguilar, R. Valencia, G. Villa, H. C. Muller-Landau, E. Losos, and S. P. Hubbell. 2002. Beta-diversity in tropical forest trees. Science 295:666-669.

de Oliveira, A. J. F. C., P. T. R. de França, and A. B. Pinto. 2010. Antimicrobial resistance of heterotrophic marine bacteria isolated from seawater and sands of recreational beaches with different organic pollution levels in southeastern Brazil: evidences of resistance dissemination. Environmental Monitoring and Assessment 169:375-384.

De Souza, M. J., S. Nair, P. A. L. Bharathi, and D. Chandramohan. 2006. Metal and antibiotic-resistance in psychrotrophic bacteria from Antarctic Marine waters. Ecotoxicology 15:379-384.

DuRand, M., R. Olson, and S. Chisholm. 2001. Phytoplankton population dynamics at the Bermuda Atlantic Time-series station in the Sargasso Sea. Deep-Sea Research Part II 48:1983-2003.

Edlund, A., S. Loesgen, W. Fenical, and P. R. Jensen. 2011. Geographic Distribution of Secondary Metabolite Genes in the Marine Actinomycete Salinispora arenicola. Applied and Environmental Microbiology 77:5916-5925.

Ferris, M. J. and B. Palenik. 1998. Niche adaptation in ocean cyanobacteria. Nature 396:226-228.

Fuhrman, J., I. Hewson, M. Schwalbach, J. Steele, M. Brown, and S. Naeem. 2006. Annually reoccurring bacterial communities are predictable from ocean conditions. Proceedings of the National Academy of Sciences (USA) 103:13104-13109.

Fulthorpe, R. R., L. F. Roesch, A. Riva, and E. W. Triplett. 2008. Distantly sampled soils carry few species in common. ISME Journal 2:901-910.

Gilbert, J. A., J. A. Steele, J. G. Caporaso, L. Steinbruck, J. Reeder, B. Temperton, S. Huse, A. C. McHardy, R. Knight, I. Joint, P. Somerfield, J. A. Fuhrman, and D. Field. 2012. Defining seasonal marine microbial community dynamics. ISME Journal 6:298-308.

Gontang, E. A., S. P. Gaudencio, W. Fenical, and P. R. Jensen. 2010. Sequence-Based Analysis of Secondary-Metabolite in Marine Actinobacteria. Applied and Environmental Microbiology 76:2487-2499.

Harte, J. and A. Kinzig. 1997. On the implications of species-area relationships for endemism, spatial turnover, and food web patterns. Oikos 80:417-427.

Harte, J., S. McCarthy, K. Taylor, A. Kinzig, and M. L. Fischer. 1999. Estimating species-area relationships from plot to landscape scale using species spatial-turnover data. Oikos 86:45-54.

Haste, N. M., V. R. Perera, K. N. Maloney, D. N. Tran, P. Jensen, W. Fenical, V. Nizet, and M. E. Hensler. 2010. Activity of the streptogramin antibiotic etamycin against methicillin-resistant Staphylococcus aureus. Journal of Antibiotics 63:219-224.

Hermansson, M., G. W. Jones, and S. Kjelleberg. 1987. Frequency of antibiotic and heavy-metal resistance, pigmentation, and plasmids in bacteria of the marine air-water-interface. Applied and Environmental Microbiology 53:2338-2342.

Korhonen, J. J., J. Soininen, and H. Hillebrand. 2010. A quantitative analysis of temporal turnover in aquatic species assemblages across ecosystems. Ecology 91:508-517.

Long, R. A. and F. Azam. 2001. Antagonistic interactions among marine pelagic bacteria. Applied and Environmental Microbiology 67:4975-4983.

Magurran, A. E. and P. A. Henderson. 2010. Temporal turnover and the maintenance of diversity in ecological assemblages. Philosophical Transactions of the Royal Society B 365:3611-3620.

Martinez, J. L. 2009. The role of natural environments in the evolution of resistance traits in pathogenic bacteria. Proceedings of the Royal Society B 276:2521-2530.

Martiny, A. C., J. B. H. Martiny, C. Weihe, A. Field, and J. C. Ellis. 2011. Functional Metagenomics Reveals Previously Unrecognized Diversity of Antibiotic Resistance Genes in Gulls. Frontiers in Microbiology 2.

Martiny, A. C., A. P. K. Tai, D. Veneziano, F. Primeau, and S. W. Chisholm. 2009. Taxonomic resolution, ecotypes and the biogeography of Prochlorococcus. Environmental Microbiology 11:823-832.

Mazel, D. and J. Davies. 1999. Antibiotic resistance in microbes. Cellular and Molecular Life Sciences.

Moore, L. R., G. Rocap, and S. W. Chisholm. 1999. Physiology and molecular phylogeny of coexisting Prochlorococcus ecotypes. Nature 393:464-467.

Morlon, H., G. Chuyong, R. Condit, S. Hubbell, D. Kenfack, D. Thomas, R. Valencia, and J. Green. 2008. A general framework for the distance-decay of similarity in ecological communities. Ecology Letters 11:904-917.

Nekola, J. and P. White. 1999. The distance decay of similarity in biogeography and ecology. Journal of Biogeography 26:867-878.

Preston, F. 1960. Time and space and the variation of species. Ecology 41:611-627.

Ramette, A. and J. M. Tiedje. 2007. Multiscale responses of microbial life to spatial distance and environmental heterogeneity in a patchy ecosystem. Proceedings of the National Academy of Sciences (USA) 104:2761-2766.

Redford, A. J. and N. Fierer. 2009. Bacterial Succession on the Leaf Surface: A Novel System for Studying Successional Dynamics. Microbial Ecology 58:189-198.

Riesenfeld, C., R. Goodman, and J. Handelsman. 2004. Uncultured soil bacteria are a reservoir of new antibiotic resistance genes. Environmental Microbiology 6:981-989.

Rusch, D. B., A. L. Halpern, G. Sutton, K. B. Heidelberg, S. Williamson, S. Yooseph, D. Y. Wu, J. A. Eisen, J. M. Hoffman, K. Remington, K. Beeson, B. Tran, H. Smith, H. Baden-Tillson, C. Stewart, J. Thorpe, J. Freeman, C. Andrews-Pfannkoch, J. E. Venter, K. Li, S. Kravitz, J. F. Heidelberg, T. Utterback, Y. H. Rogers, L. I. Falcon, V. Souza, G. Bonilla-Rosso, L. E. Eguiarte, D. M. Karl, S. Sathyendranath, T. Platt, E. Bermingham, V. Gallardo, G. Tamayo-Castillo, M. R. Ferrari, R. L. Strausberg, K. Nealson, R. Friedman, M. Frazier, and J. C. Venter. 2007. The Sorcerer II Global Ocean Sampling expedition: Northwest Atlantic through Eastern Tropical Pacific. Plos Biology 5:398-431.

Sommer, M. O. A., G. Dantas, and G. M. Church. 2009. Functional Characterization of the Antibiotic Resistance Reservoir in the Human Microflora. Science 325:1128-1131.

Tai, V. and B. Palenik. 2009. Temporal variation of Synechococcus clades at a coastal Pacific Ocean monitoring site. ISME Journal 3:903-915.

Tsujino, M., M. Hori, T. Okuda, M. Nakaoka, T. Yamamoto, and T. Noda. 2009. Distance decay of community dynamics in rocky intertidal sessile assemblages evaluated by transition matrix models. Population Ecology 52:171-180.

Yooseph, S., G. Sutton, D. B. Rusch, A. L. Halpern, S. J. Williamson, K. Remington, J. A. Eisen, K. B. Heidelberg, G. Manning, W. Z. Li, L. Jaroszewski, P. Cieplak, C. S. Miller, H. Y. Li, S. T. Mashiyama, M. P. Joachimiak, C. van Belle, J. M. Chandonia, D. A. Soergel, Y. F. Zhai, K. Natarajan, S. Lee, B. J. Raphael, V. Bafna, R. Friedman, S. E. Brenner, A. Godzik, D. Eisenberg, J. E. Dixon, S. S. Taylor, R. L. Strausberg, M. Frazier, and J. C. Venter. 2007. The Sorcerer II Global Ocean Sampling expedition: Expanding the universe of protein families. Plos Biology 5:432-466.

Chapter 1

Beta-diversity of marine bacteria depends on temporal scale

Abstract

Factors controlling the spatial distribution of bacterial diversity have been intensely studied, whereas less is known about temporal changes. To address this, we tested whether the mechanisms that underlie bacterial temporal β-diversity vary across different scales in three marine microbial communities. While seasonal turnover was detected, at least 73% of the community variation occurred at intra-seasonal temporal scales, suggesting that episodic events are important in structuring marine microbial communities. In addition, turnover at different temporal scales appeared to be driven by different factors. Intra-seasonal turnover was significantly correlated with environmental variables such as phosphate and silicate concentrations, while seasonal and inter-annual turnover were related to nitrate concentration and temporal distance. We observed a strong link between the magnitude of environmental variation and bacterial β-diversity in different communities. Analogous to spatial biogeography, we found different rates of community changes across temporal scales.

Key words: beta-diversity, biodiversity, biogeography, distance decay, microbial ecology

Introduction

A well described pattern in community ecology is the negative relationship between community similarity and physical distance (Harte and Kinzig 1997). Such “distance decay curves” are a directional measure of beta-diversity, or variation in community composition among samples (Anderson et al. 2011). This pattern can result from a variety of mechanisms, including environmental heterogeneity, dispersal limitation, migration, and stochastic events (Hubbell 2001, Hanson et al. 2012). The effects of these mechanisms depend on the spatial scale over which they occur, and such scale-dependent patterns have been observed for a variety of taxa in many environments, e.g. plant communities (Nekola and White 1999, Condit et al. 2002), sessile invertebrate and algal communities (Tsujino et al. 2009), and microbial communities (Ramette and Tiedje 2007, Martiny et al. 2011).

Analogous to patterns in space, the abundances of individual taxa as well as the diversity of whole communities are also dynamic over time (Korhonen et al. 2010, Magurran and Henderson 2010). In aquatic environments, studies have shown that microbial communities vary over different time scales (Fuhrman et al. 2006, Kara and Shade 2009, Gilbert et al. 2012, Jones et al. 2012). However, we know little about the relative changes in microbial community composition across different time scales, and the underlying factors that control any such patterns. Because microbial communities commonly consist of thousands of taxa, a beta-diversity approach using temporal distance decay curves and variance partitioning across temporal scales can be useful for the investigation of highly complex communities (Anderson et al. 2011). Using these approaches, we analyzed three marine microbial communities to address the following questions: 1) How does beta-diversity in marine microbial communities vary over intra-seasonal (<90 days), seasonal, and inter-annual time scales? 2) Which environmental variables are important drivers of beta-diversity at these temporal scales? 3) How do temporal changes in beta-diversity compare across three different marine environments? The results of this study indicate that the temporal beta-diversity of marine bacterial communities depends on scale and that this diversity is driven in large part by environmental variation.

Methods

San Pedro Basin communities

We analyzed an eight-year time series from the surface (5 meters, SPB5) and a six-year time series from deep water (890 meters, SPB890) in the San Pedro Basin (33.5° N, 118.4° W) (Fuhrman et al. 2006), with samples collected once per month. Operational taxonomic units (OTUs) were defined using automated ribosomal intergenic spacer analysis (ARISA) (Fuhrman et al. 2006). The relative abundance of each OTU was determined by calculating the area under the fluorescence peak relative to the area under all the peaks. The SPB5 dataset consisted of eighty-five samples and 391 OTUs. The SPB890 dataset had forty-three samples and 367 OTUs.
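The peak-area normalization described above can be illustrated with a minimal Python sketch. The function name `relative_abundance` and the peak-area values are hypothetical and chosen only for illustration; the original analysis was not necessarily implemented this way.

```python
import numpy as np

def relative_abundance(peak_areas):
    """Convert a samples x OTUs array of peak areas to proportions per sample."""
    areas = np.asarray(peak_areas, dtype=float)
    totals = areas.sum(axis=1, keepdims=True)  # area under all peaks, per sample
    if np.any(totals <= 0):
        raise ValueError("every sample needs a positive total peak area")
    return areas / totals

# Hypothetical peak areas for two samples and three OTUs (illustrative only).
areas = np.array([[120.0, 60.0, 20.0],
                  [10.0, 30.0, 60.0]])
props = relative_abundance(areas)
```

Each row of `props` sums to one, so samples with different total fluorescence become comparable.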

English Channel community

The English Channel data set (EC5) was a six-year time series at the L4 Ocean Observatory (50.2° N, 4.2° W) (Gilbert et al. 2012). Surface water samples were collected once per month between January 2003 and December 2008. OTUs were defined at 97% sequence similarity of the V-6 variable region of 16S rRNA (Gilbert et al. 2012). The dataset consisted of 73 samples, a total of 752,028 sequences with an average of 10,301 sequences per sample, and 7614 OTUs identified. We also reduced the number of OTUs to make the EC dataset more comparable to the SPB datasets in three ways: 1) we kept the 400 most abundant taxa across all samples, 2) we removed taxa that were less than 0.1% of the sequences for each sample, and 3) we removed taxa that were less than 1% of the sequences from each sample. Because yearly autocorrelations (Moran’s I) and relative intra-seasonal variance (PERMANOVA) of the reduced EC datasets (0.12 to 0.14 and 59% to 60%) were similar to those of the whole dataset (0.15 and 60%), we focused on the whole dataset. Proportional abundances from EC5 and SPB were used in the community analysis. The environmental analysis used the variables water temperature and nitrate, phosphate, and silicate concentration because these were variables shared between the three time series. Some of the samples did not have associated environmental data; therefore, the environmental time series are shorter than the community time series.
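The OTU reductions listed above might be sketched as follows. This is a sketch under stated assumptions: the text does not specify exactly how the per-sample thresholds were applied, so here a taxon's count is set to zero within any sample where it falls below the threshold. The helper names and the toy counts are hypothetical.

```python
import numpy as np

def keep_top_taxa(counts, n_keep):
    """Keep the n_keep taxa (columns) with the largest totals across all samples."""
    counts = np.asarray(counts)
    totals = counts.sum(axis=0)
    top = np.argsort(totals)[::-1][:n_keep]
    return counts[:, np.sort(top)]  # preserve the original column order

def drop_rare_per_sample(counts, min_fraction):
    """Assumed interpretation: zero out a taxon in each sample where it is rare."""
    counts = np.asarray(counts)
    fractions = counts / counts.sum(axis=1, keepdims=True)
    return np.where(fractions < min_fraction, 0, counts)

# Toy sequence counts (hypothetical): two samples, four taxa.
counts = np.array([[50, 30, 19, 1],
                   [40, 40, 0, 20]])
top2 = keep_top_taxa(counts, 2)               # analogous to keeping the 400 most abundant
filtered = drop_rare_per_sample(counts, 0.05)  # analogous to the 0.1% and 1% cut-offs
```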

Time-series analysis

Pairwise similarity among all samples was calculated using the Bray-Curtis metric from the ‘vegan’ package in R using both untransformed and square-root-transformed OTU abundances (Oksanen et al. 2012). The square root transformation only increased the Moran’s I autocorrelation at yearly intervals (0.15 to 0.20) and the variance decomposition pattern was unchanged (intra-seasonal variation contributing 62% of the variation and inter-annual variation contributing 12%); therefore, we only included untransformed data for these analyses. The English Channel data were randomly re-sampled 10,000 times without replacement to standardize the number of sequences in a sample (4101 sequences) and the average of the randomizations was used for further analysis. Temporal distance was calculated as the Euclidean distance between sample dates. Nutrient data (nitrate, phosphate, and silicate) were logarithmically transformed and all environmental data (nutrients and temperature) were standardized to a mean of zero and a standard deviation of one. Environmental similarity was calculated as 1 - Euclidean distance of all environmental variables. We used matrix-based heat maps to color code similarity values and visualize community similarity with the ‘gplots’ package in R (Warnes et al. 2011).
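The similarity and distance matrices described in this paragraph can be sketched in Python with NumPy rather than the R packages named in the text. The sketch assumes a natural logarithm for the nutrient transformation and a sample standard deviation (ddof = 1) for standardization, neither of which is stated in the text; the function names and toy values are hypothetical.

```python
import numpy as np

def bray_curtis_similarity(abund):
    """Pairwise Bray-Curtis similarity (1 - dissimilarity) for samples x OTUs."""
    x = np.asarray(abund, dtype=float)
    n = x.shape[0]
    sim = np.ones((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            diss = np.abs(x[i] - x[j]).sum() / (x[i] + x[j]).sum()
            sim[i, j] = sim[j, i] = 1.0 - diss
    return sim

def temporal_distance(days):
    """Absolute difference in days between every pair of sample dates."""
    d = np.asarray(days, dtype=float)
    return np.abs(d[:, None] - d[None, :])

def environmental_similarity(env, nutrient_cols):
    """1 - Euclidean distance on log-transformed, standardized variables."""
    e = np.asarray(env, dtype=float).copy()
    e[:, nutrient_cols] = np.log(e[:, nutrient_cols])  # assumption: natural log
    z = (e - e.mean(axis=0)) / e.std(axis=0, ddof=1)   # assumption: sample SD
    diff = z[:, None, :] - z[None, :, :]
    return 1.0 - np.sqrt((diff ** 2).sum(axis=2))

# Toy data (hypothetical): three samples, three OTUs, four environmental variables.
abund = np.array([[0.5, 0.5, 0.0], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]])
env = np.array([[15.0, 1.0, 0.3, 2.0], [18.0, 0.5, 0.2, 1.5], [14.0, 2.0, 0.4, 3.0]])
community_sim = bray_curtis_similarity(abund)
time_dist = temporal_distance([0, 30, 365])
env_sim = environmental_similarity(env, nutrient_cols=[1, 2, 3])
```

Note that 1 - Euclidean distance is not bounded below by zero, so environmental similarity values can be negative for very dissimilar samples.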

Spectral density analysis was used to investigate the occurrence of seasonal cycles (which we defined as one cycle per year). For this analysis, community similarities were averaged within thirty-day windows and treated as a time series set at monthly intervals. We averaged within these windows because samples were not always collected thirty days apart, so the number of days between samples varied. These averaged time series were demeaned and detrended, and the spectral densities were calculated using discrete Fourier transformations via fast Fourier transforms in the ‘timeSeries’ package in R (Wuertz and Chalabi 2012). We used the spatial autocorrelation coefficient, Moran’s I, to calculate the magnitude of correlation between community and environmental similarities at different temporal distances with the ‘ape’ package in R (Paradis et al. 2004). Individual Bray-Curtis similarities and Euclidean distances were treated as observations, and these observations were grouped into 30-day temporal classes. Environmental samples lacking data for any parameter were removed from the Moran’s I calculation. Moran’s I coefficients were then calculated between all observations in each temporal class and between temporal classes.
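The spectral step can be sketched as follows, assuming the similarities have already been averaged into an evenly spaced monthly series. The sketch removes a least-squares line (which also removes the mean) and evaluates a plain discrete Fourier transform rather than a fast Fourier transform. The function names and the scaling of the power are choices made for this example, not those of the ‘timeSeries’ package.

```python
import cmath
import math


def detrend(values):
    """Remove the least-squares straight line (this also removes the mean)."""
    n = len(values)
    t_mean = (n - 1) / 2.0
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    slope = sxy / sxx
    return [v - v_mean - slope * (t - t_mean) for t, v in enumerate(values)]


def periodogram(values, samples_per_year=12):
    """Return (frequency in cycles per year, power) pairs from a plain DFT."""
    x = detrend(values)
    n = len(x)
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        spectrum.append((k * samples_per_year / n, abs(coeff) ** 2 / n))
    return spectrum
```

A seasonal community would be expected to show its largest power near one cycle per year, for example via `max(periodogram(series), key=lambda pair: pair[1])`.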

Seasonal (3-month intervals), annual, and intra-seasonal variations were estimated by decomposing the Bray-Curtis community and environmental similarity indices using a permutation analysis of variance (PERMANOVA) with ‘adonis’ in the ‘vegan’ package in R (Oksanen et al. 2012).
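The variance partition behind a PERMANOVA can be illustrated for a single grouping factor (for example, season). The sketch below takes a dissimilarity matrix (1 - similarity) and returns the fraction of variation explained by the factor together with the pseudo-F statistic. It is a simplified one-factor sketch and not the ‘adonis’ routine, which handles several terms and supplies permutation p-values. The function name is an assumption for this example.

```python
def permanova_r2(dist, groups):
    """Partition squared distances for one factor; return (R2, pseudo-F).

    dist   -- square, symmetric dissimilarity matrix (list of lists)
    groups -- one group label per sample, in matrix order
    """
    n = len(dist)
    ss_total = sum(dist[i][j] ** 2
                   for i in range(n) for j in range(i + 1, n)) / n
    members = {}
    for idx, label in enumerate(groups):
        members.setdefault(label, []).append(idx)
    ss_within = 0.0
    for idx_list in members.values():
        ss = sum(dist[i][j] ** 2
                 for pos, i in enumerate(idx_list)
                 for j in idx_list[pos + 1:])
        ss_within += ss / len(idx_list)
    ss_among = ss_total - ss_within
    n_groups = len(members)
    # Undefined when every within-group distance is zero (ss_within == 0).
    pseudo_f = (ss_among / (n_groups - 1)) / (ss_within / (n - n_groups))
    return ss_among / ss_total, pseudo_f
```

The unexplained fraction, `1 - R2`, is the within-group share of the variation for that factor.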

Multiple regression on matrices

We tested whether the turnover slopes were significantly different from zero at four a priori defined temporal scales: less than 60 days (2 months; intra-seasonal turnover), 60-183 days (2-6 months) and 183-365 days (6-12 months) (seasonal turnover), and greater than 365 days (12 months; inter-annual turnover). We estimated the significance by randomization of the community similarity matrix at each temporal scale (9,999 times).
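A minimal Python sketch of this randomization test is given below, assuming that samples (and therefore whole rows and columns of the similarity matrix) are shuffled while the sampling dates stay fixed. The function names, the two-sided comparison, and the half-open window `lo <= distance < hi` are assumptions for the example.

```python
import random


def slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def turnover_slope_test(similarity, days, lo, hi, n_perm=9999, seed=0):
    """Slope of similarity against temporal distance within [lo, hi) days.

    Returns (observed slope per day, permutation p-value).
    """
    n = len(days)
    pairs = [(i, j) for i in range(n) for j in range(i)
             if lo <= abs(days[i] - days[j]) < hi]
    x = [abs(days[i] - days[j]) for i, j in pairs]
    observed = slope(x, [similarity[i][j] for i, j in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(order)  # relabel samples; dates stay in place
        y = [similarity[order[i]][order[j]] for i, j in pairs]
        if abs(slope(x, y)) >= abs(observed):
            hits += 1
    return observed, hits / (n_perm + 1)
```

The slope returned here is in similarity units per day; multiplying by 365 and by 100 would express it in percentage points per year.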

To examine the factors influencing community similarity, we regressed the community similarity matrix on temporal distance and on the similarity of individual environmental variables (silicate, nitrate, phosphate, and water temperature) using multiple regressions on matrices (MRM) (Goslee and Urban 2007, Martiny et al. 2011) at the four temporal scales. This analysis estimated regression coefficients and tested for significance by permuting the community similarity matrix while holding the environmental and temporal distance matrices constant. The predictor variables were standardized to a mean of zero and a standard deviation of one, and the significance of the partial correlation coefficients was tested using a one-tailed t-test.
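The regression step of an MRM can be sketched as an ordinary least-squares fit on the unfolded (lower-triangle) entries of each matrix, with predictors standardized first. The sketch below covers only the coefficient estimation and is not the ‘ecodist’ implementation; the function names and the normal-equations solver are assumptions for this example. Significance would still require permuting the response matrix, as described above.

```python
def zscore(values):
    """Standardize to mean zero and (sample) standard deviation one."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / (n - 1)) ** 0.5
    return [(v - mean) / sd for v in values]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def mrm_coefficients(response, predictors):
    """Fit response on z-scored predictors; return [intercept, b1, b2, ...].

    response   -- unfolded similarity values, one per sample pair
    predictors -- list of unfolded predictor vectors in the same pair order
    """
    cols = [[1.0] * len(response)] + [zscore(p) for p in predictors]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * y for u, y in zip(ci, response)) for ci in cols]
    return solve(xtx, xty)
```

Because the predictors are standardized, the fitted coefficients are directly comparable in magnitude across variables measured in different units.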

Results

To characterize and quantify the temporal turnover of marine microbial composition, we first calculated pairwise community similarity between all samples at a location. This was done for communities from the English Channel surface waters (EC5), San Pedro Basin surface waters (SPB5), and San Pedro Basin deep water (SPB890) (Figure 1.1A-C). Over the entire time series, community similarity varied widely in all three communities (Table 1.1). The surface communities at EC5 and SPB5 had nearly identical global average similarities, superimposed with periods of high and low similarity at twelve-month intervals (Table 1.1). Specifically, the thirty-day averages of similarity at EC5 and SPB5 oscillated between 0.28 and 0.57 at one-year periods. Seasonal turnover in surface communities could also be seen from the cross-section of sample 1 compared to all other samples (Figure 1.1A and B). In comparison, the deep-water community at SPB890 varied irregularly between similarity values of 0.44 and 0.72 (Figure 1.1C).

Plots of community similarity versus temporal distance showed periodic patterns for the surface regions (Figure 1.2A-B), so we next estimated the frequencies of oscillation with a spectral density analysis and Moran’s I correlation analysis. The spectral density analysis showed a strong peak at one cycle per year for the surface communities at the EC5 and SPB5 sites (Figure S1.1), confirming the visual observations of seasonal changes in both regions. In contrast, the SPB890 community did not show any periodic signal. The Moran’s I analysis also supported the spectral analysis because it showed that the surface communities were seasonally autocorrelated at one-year intervals (Figure S1.2). In addition, Moran’s I revealed that seasonal variation of the microbial community was 3.9 times greater at EC5 compared to SPB5 (Figure S1.2A-B).
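To make the Moran’s I statistic concrete, the sketch below computes it for one temporal class using binary weights: a pair of observations receives weight 1 when its time separation falls inside the class and 0 otherwise. For simplicity it treats a single series of values as the observations, whereas the analysis here treated pairwise similarities as observations and used the ‘ape’ package. The function name and the half-open lag window are assumptions for this example.

```python
def morans_i(values, times, lag_lo, lag_hi):
    """Moran's I with binary weights for pairs lag_lo <= |dt| < lag_hi."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = 0.0
    weight_sum = 0
    for i in range(n):
        for j in range(n):
            if i != j and lag_lo <= abs(times[i] - times[j]) < lag_hi:
                cross += dev[i] * dev[j]
                weight_sum += 1
    if weight_sum == 0:
        raise ValueError("no sample pairs fall in this lag class")
    return (n / weight_sum) * cross / sum(d * d for d in dev)
```

Evaluating the function over successive 30-day classes gives a correlogram; a seasonal series would be expected to show positive values near 365 days and negative values near half a year.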

We next asked whether the rate of community turnover (i.e., the slope of the temporal distance decay curve) varied over different temporal scales. We observed a significant (MRM; p < 0.0001) inter-annual decline in the mean similarity in the EC surface communities, whereas we did not observe this trend at the two SPB sites (Figure 1.2 and Table 1.2). On average, overall community similarity declined 1.2% per year at EC5, from an initial average similarity of 0.44 to 0.40. We also found extensive community variation at intra-seasonal time intervals (Figures 1.1 and 1.2). For instance, samples that were collected 28 days apart ranged between 0.098 and 0.71 similarity. At no point were two samples completely different or identical (Figure 1.2). As an example of the importance of episodic events on marine community composition, we observed the occurrence of nearly unique communities in all three regions (illustrated by the solid blue columns and rows in Figure 1.1). Such events occurred as few as one time at EC5 and as many as seven times at SPB5. Furthermore, the decay of community similarity at all three sites during the first 60 days was significantly faster than during the period between 60 and 183 days (Table 1.2 and Figure 1.2). In other words, the community changed more rapidly over short time scales than seasonally or inter-annually.

Based on the above analyses, it was clear that community turnover varied at different temporal scales. To examine this further, we next decomposed the similarity variation into annual, seasonal, and intra-seasonal (<90 days) components to quantify the contribution of different time-scales to overall community variation. The majority of variation (> 60%) in all regions was associated with short-term changes, whereas the seasonal component contributed between 11 and 23% (Figure 1.3). Inter-annual variation was the smallest component and represented less than 11% of the total variation.

What factors controlled these temporal patterns? We found that different environmental variables as well as temporal distance explained the turnover at different temporal scales. At the 30-60 day temporal scale, the rapid turnover at EC5 was correlated with changes in phosphate and silicate concentration, while turnover was correlated with temporal distance alone at SPB5 (Table 1.2). Within the 60-183 day and 183-365 day temporal scales, turnover at EC5 appeared to be correlated with nitrate+nitrite and phosphate concentration and temporal distance; however, at SPB5 turnover was again correlated with temporal distance (Table 1.2). Over 365 days, community turnover at EC5 could be explained by nitrate+nitrite concentration, while at SPB5 turnover was correlated with water temperature (Table 1.2). Thus, different factors were correlated with beta-diversity at different temporal scales.

Given that environmental factors appeared to be driving much of the temporal beta-diversity, we also directly investigated the temporal patterns of environmental conditions. First, we observed a yearly repeating seasonal oscillation in environmental similarity at the two surface sites (Figure S1.3). This seasonality was found in temperature and nitrate, phosphate, and silicate concentrations at EC5 (Figure S1.4A), whereas only temperature and phosphate changed seasonally at the SPB5 site. However, the absolute seasonal change in temperature was slightly higher at EC5 (~7-8°C) than at SPB5 (~5-6°C) (Figure S1.4). In contrast, environmental conditions at the SPB890 deep-water site showed little seasonality (Figure S1.3C). Overall, seasonal changes were strongest at EC5, intermediate at SPB5, and weakest at SPB890, whereas inter-annual changes were limited across all three regions. However, all three sites also displayed extensive variation (> 47%) at the intra-seasonal scale (Figure S1.5).

Discussion

Analogous to many studies of spatial biogeography (Preston 1960, Nekola and White 1999, Ramette and Tiedje 2007, Martiny et al. 2011), we see distinct patterns of bacterial turnover over time at three different temporal scales (i.e., seasonal, inter-annual, and intra-seasonal) in several ocean regions. Thus, temporal beta-diversity of bacteria appears to depend on scale, just as spatial beta-diversity does.

Specifically, within seasons (~0-60 days), we observed the highest community turnover. High short-term community turnover has also been described in other freshwater and marine microbial studies (Kara and Shade 2009, Korhonen et al. 2010), suggesting this pattern is a general phenomenon. Such rapid changes can be driven by at least two mechanisms. First, microbial composition may be tracking the environment. In support of this mechanism, we observe that most variation in environmental conditions occurs at intra-seasonal scales at all three sites (Figure 1.3). Further, the rapid compositional turnover at EC5 is correlated with changes in phosphate and silicate concentration (Table 1.2). Such a correlation might suggest that bacterial communities are tracking these nutrients, although it is also possible that they co-vary with phytoplankton communities, which are responding to changes in nutrient concentration at these scales (Kent et al. 2007). High variation in environmental conditions over short time intervals (<1 month) has also been observed in other coastal marine systems (Cloern and Nichols 1985), whereas open ocean regions are generally less variable. Thus, if environmental variation is the key driver of short-term microbial beta-diversity, we would expect intra-seasonal beta-diversity to be generally lower in open ocean regions than in coastal marine communities.

A second mechanism that could contribute to the rapid turnover of marine bacteria within seasons is ecological drift (Hubbell 2001). Whereas samples taken a few days apart from one another may share a “historical” connection (their composition may be highly similar because there has not been time for taxon abundance to respond to new environmental conditions), this connection should be less apparent after several weeks, as mixing of water parcels takes place. In this way, ecological drift could contribute to the “steepness” of the distance decay curve on short time scales, just as it appears to do at small spatial scales (Condit et al. 2002, Martiny et al. 2011). In support of this mechanism, we observe a significant relationship between temporal distance and microbial composition at short time scales (<60 days) in the SPB5 dataset even after controlling for the measured environmental variables. However, this pattern could also be due to unmeasured abiotic and biotic variables (Hanson et al. 2012); thus, additional work is needed to assess the importance of ecological drift for temporal beta-diversity in microbial communities. It is possible that differences between communities could be attributed to analytical artifacts, since DNA fingerprinting and sequencing may introduce various errors and likely identify only a subset of the microbial biomass (Sogin et al. 2006). However, such analytical variation may contribute little to the temporal patterns. Jones and colleagues (2012) found that ARISA could differentiate communities that were more than 5% different based on the Sorensen index. It therefore appears unlikely that analytical artifacts would influence the different temporal patterns of community change observed.

Within the 60-183 day and 183-365 day temporal scales, the average absolute rate of community change declined in comparison to the first 60 days. In the near-surface communities, we also saw a seasonal oscillation that has been observed in previous studies (Fuhrman et al. 2006, Gilbert et al. 2012). The deep-water community also varied between seasons, but not in a repeatable pattern. Thus, the bacterial community composition at depth may not be responding immediately to changes in primary production, an interaction that has been described across lakes (Kent et al. 2007). It is also likely that communities at this depth are driven by non-repeatable fluxes in particulate organic matter. The seasonal variation at the English Channel is nearly three times greater than at the surface of the San Pedro Basin, which is likely due to a stronger absolute seasonal change in environmental conditions at EC5 (Figure S1.3A-B). The relationship between the strength of community and environmental turnover has also been observed in the seasonal dynamics of amphibians across latitudes (Canavero et al. 2009) and thus is not limited to marine microorganisms.

At the inter-annual time-scale, temporal community turnover was correlated with temporal distance and environmental similarity. As discussed above, one interpretation of the turnover at EC5 is that this pattern may be caused by historical differences due to ecological or genetic drift, because of the correlation with temporal distance after controlling for environmental effects. Alternatively, surface communities may be responding to environmental variation at longer time-scales. The latter interpretation is supported by our analyses, as beta-diversity at this time-scale is significantly correlated with nitrate+nitrite at EC5 and with water temperature at SPB5 (Table 1.2). Thus, variation of surface communities at this temporal scale is possibly driven by long-term changes in environmental conditions, including the North Atlantic Oscillation and El Niño/Southern Oscillation events. In contrast, we did not observe any systematic inter-annual trend in community composition or environmental similarity at 890 meters, a pattern that is similar to the general pattern of inter-annual stability seen in freshwater systems (Shade et al. 2007, Crump et al. 2009).

It should be noted that in our comparison of these ocean regions, different methods were used to infer taxonomy (ARISA fingerprinting vs. 16S rRNA sequencing). The English Channel site has almost 20 times more identified taxa than either region of the San Pedro Basin (Table 1.1). Thus, it is possible that a methodological difference influences observed turnover patterns between the communities (Anderson et al. 2011). However, there are four reasons why the differences may be biological and not methodological. First, the EC community composition diversity was reduced to simulate the community resolution limits of ARISA, but this did not affect the patterns. Second, ARISA was used for both regions of the San Pedro Basin, yet the deep-water community did not have recurrent seasonal variations. Third, both surface communities share many similarities in the overall temporal scale patterns. Last, the magnitude of environmental variation follows the same pattern as community variation across the regions and temporal scales. Thus, the patterns observed do not appear to be an artifact of methodologies.

Decay patterns of community composition occur across space and time, and across taxonomic groups (Nekola and White 1999, Korhonen et al. 2010). For macroorganisms, the pattern at each scale can sometimes be attributed to a particular process. These processes include sampling error at small scales, ecological processes at intermediate scales, and evolutionary processes at large scales (e.g., Preston 1960, Soininen 2010). For marine bacteria, it is possible that similar processes control community composition and turnover at different temporal scales. Here, ecological processes, namely responses to fluctuations in the environmental conditions, appear to influence beta-diversity at the intra-seasonal, seasonal, and inter-annual scales. However, different environmental factors control changes in microbial communities at different temporal scales. In addition to direct environmental selection, we also see some evidence of ecological drift, whereby past environmental conditions can influence beta-diversity at short time-scales. Our analysis demonstrates that microbial communities show unique patterns of temporal beta-diversity depending on the time-scale and environmental conditions. Thus, predicting how microbial communities will respond to future environmental changes requires knowledge of the factors influencing communities at multiple temporal scales.

Literature Cited

Anderson, M. J., T. O. Crist, J. M. Chase, M. Vellend, B. D. Inouye, A. L. Freestone, N. J. Sanders, H. V. Cornell, L. S. Comita, K. F. Davies, S. P. Harrison, N. J. Kraft, J. C. Stegen, and N. G. Swenson. 2011. Navigating the multiple meanings of beta diversity: a roadmap for the practicing ecologist. Ecology Letters 14:19-28.

Canavero, A., M. Arim, and A. Brazeiro. 2009. Geographic variations of seasonality and coexistence in communities: The role of diversity and climate. Austral Ecology 34:741-750.

Cloern, J. E. and F. H. Nichols. 1985. Time scales and mechanisms of estuarine variability, a synthesis from studies of San Francisco Bay. Hydrobiologia 129:229-237.

Condit, R., N. Pitman, E. G. Leigh, Jr., J. Chave, J. Terborgh, R. B. Foster, P. Nunez, S. Aguilar, R. Valencia, G. Villa, H. C. Muller-Landau, E. Losos, and S. P. Hubbell. 2002. Beta-diversity in tropical forest trees. Science 295:666-669.

Crump, B. C., B. J. Peterson, P. A. Raymond, R. M. W. Amon, A. Rinehart, J. W. McClelland, and R. M. Holmes. 2009. Circumpolar synchrony in big river bacterioplankton. Proceedings of the National Academy of Sciences (USA) 106:21208-21212.

Fuhrman, J., I. Hewson, M. Schwalbach, J. Steele, M. Brown, and S. Naeem. 2006. Annually reoccurring bacterial communities are predictable from ocean conditions. Proceedings of the National Academy of Sciences (USA) 103:13104-13109.

Gilbert, J. A., J. A. Steele, J. G. Caporaso, L. Steinbruck, J. Reeder, B. Temperton, S. Huse, A. C. McHardy, R. Knight, I. Joint, P. Somerfield, J. A. Fuhrman, and D. Field. 2012. Defining seasonal marine microbial community dynamics. ISME Journal 6:298-308.

Goslee, S. C. and D. L. Urban. 2007. The ecodist package for dissimilarity-based analysis of ecological data. Journal of Statistical Software 22:1-19.

Hanson, C. A., J. A. Fuhrman, M. C. Horner-Devine, and J. B. Martiny. 2012. Beyond biogeographic patterns: processes shaping the microbial landscape. Nature Reviews Microbiology 10:497-506.

Harte, J. and A. Kinzig. 1997. On the implications of species-area relationships for endemism, spatial turnover, and food web patterns. Oikos 80:417-427.

Hubbell, S. P. 2001. The unified neutral theory of biodiversity and biogeography. Princeton University Press, Princeton, New Jersey, USA.

Jones, S. E., T. A. Cadkin, R. J. Newton, and K. D. McMahon. 2012. Spatial and temporal scales of aquatic bacterial beta diversity. Frontiers in Microbiology. doi: 10.3389/fmicb.2012.00318

Kara, E. and A. Shade. 2009. Temporal dynamics of South End tidal creek (Sapelo Island, Georgia) bacterial communities. Applied and Environmental Microbiology 75:1058-1064.

Kent, A. D., A. C. Yannarell, J. A. Rusak, E. W. Triplett, and K. D. McMahon. 2007. Synchrony in aquatic microbial community dynamics. ISME Journal 1:38-47.

Korhonen, J. J., J. Soininen, and H. Hillebrand. 2010. A quantitative analysis of temporal turnover in aquatic species assemblages across ecosystems. Ecology 91:508-517.

Magurran, A. E. and P. A. Henderson. 2010. Temporal turnover and the maintenance of diversity in ecological assemblages. Philosophical Transactions of the Royal Society B 365:3611-3620.

Martiny, J. B. H., J. A. Eisen, K. Penn, S. D. Allison, and M. C. Horner-Devine. 2011. Drivers of bacterial β-diversity depend on spatial scale. Proceedings of the National Academy of Sciences (USA) 108:7850-7854.

Nekola, J. and P. White. 1999. The distance decay of similarity in biogeography and ecology. Journal of Biogeography 26:867-878.

Oksanen, J., F. G. Blanchet, R. Kindt, P. Legendre, P. R. Minchin, R. B. O'Hara, G. L. Simpson, P. Solymos, M. H. H. Stevens, and H. Wagner. 2012. vegan: Community Ecology Package. R package version 2.0-3. http://CRAN.R-project.org/package=vegan

Paradis, E., J. Claude, and K. Strimmer. 2004. APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20:289-290.

Preston, F. 1960. Time and space and the variation of species. Ecology 41:611-627.

Ramette, A. and J. M. Tiedje. 2007. Multiscale responses of microbial life to spatial distance and environmental heterogeneity in a patchy ecosystem. Proceedings of the National Academy of Sciences (USA) 104:2761-2766.

Shade, A., A. D. Kent, S. E. Jones, R. J. Newton, E. W. Triplett, and K. D. McMahon. 2007. Interannual dynamics and phenology of bacterial communities in a eutrophic lake. Limnology and Oceanography 52:487-494.

Sogin, M., H. Morrison, J. Huber, D. Welch, S. Huse, P. Neal, J. Arrieta, and G. Herndl. 2006. Microbial diversity in the deep sea and the underexplored “rare biosphere”. Proceedings of the National Academy of Sciences (USA) 103:12115-12120.

Soininen, J. 2010. Species turnover along abiotic and biotic gradients: patterns in space equal patterns in time? BioScience 60:433-439.

Tsujino, M., M. Hori, T. Okuda, M. Nakaoka, T. Yamamoto, and T. Noda. 2009. Distance decay of community dynamics in rocky intertidal sessile assemblages evaluated by transition matrix models. Population Ecology 52:171-180.

Warnes, G. R., B. Bolker, L. Bonebakker, R. Gentleman, W. Huber, A. Liaw, T. Lumley, M. Maechler, A. Magnusson, S. Moeller, M. Schwartz, and B. Venables. 2011. gplots: Various R programming tools for plotting data. R package version 2.10.1. http://CRAN.R-project.org/package=gplots

Wuertz, D. and Y. Chalabi. 2012. timeSeries: Rmetrics - Financial Time Series Objects. R package version 2160.95. http://CRAN.R-project.org/package=timeSeries


Table 1.1. Summary of the datasets used in the analyses. EC – English Channel, SPB – San Pedro Basin. Global mean is the average of all pairwise similarities. Global min and global max are the minimum and maximum similarities within all pairwise similarities. The last three columns are Bray-Curtis similarities.

| Source | Depth (meters) | Latitude | Longitude | Duration (years) | Number of samples | Number of OTUs | Global mean | Global min | Global max |
|---|---|---|---|---|---|---|---|---|---|
| EC | 5 | 50.3 | -4.2 | 5.9 | 73 | 7614 | 0.43 | 0.060 | 0.81 |
| SPB | 5 | 33.6 | -118.4 | 8.3 | 85 | 391 | 0.42 | 0.042 | 0.83 |
| SPB | 890 | 33.6 | -118.4 | 4.7 | 43 | 367 | 0.54 | 0.12 | 0.83 |


Table 1.2. Community turnover regression and multiple regression on matrices (MRM) at different temporal scales at EC5 – English Channel (5 m), SPB5 – San Pedro Basin (5 m), SPB890 – San Pedro Basin (890 m). Community turnover is the regression slope reported as percentage points per year (p-value). MRM slopes are reported as regression slope (p-value). Nutrient concentrations were log10 transformed before analysis.

| Site | Temporal scale (days) | Community turnover | Temporal distance | Water temperature | Silicate | Nitrate+Nitrite | Phosphate |
|---|---|---|---|---|---|---|---|
| EC5 | <60 | -56.8 (0.017) | n.s. | 0.27 (0.027) | n.s. | 0.36 (0.019) | n.s. |
| EC5 | 60-183 | -27.1 (<0.001) | 2.02 (0.001) | n.s. | 0.41 (0.001) | 0.15 (0.038) | n.s. |
| EC5 | 183-365 | 33.4 (<0.001) | -2.24 (0.001) | n.s. | 0.33 (0.002) | n.s. | n.s. |
| EC5 | >365 | -1.20 (0.024) | n.s. | n.s. | 0.51 (0.001) | n.s. | n.s. |
| SPB5 | <60 | -58.8 (0.012) | 1.52 (0.022) | n.s. | n.s. | n.s. | n.s. |
| SPB5 | 60-183 | -7.9 (0.286) | | | | | |
| SPB5 | 183-365 | 17.6 (0.001) | -1.29 (0.081) | n.s. | n.s. | n.s. | n.s. |
| SPB5 | >365 | -0.45 (0.14) | n.s. | n.s. | n.s. | n.s. | 0.19 (0.036) |
| SPB890 | <60 | -86.0 (0.014) | n.d.* | n.d.* | n.d.* | n.d.* | n.d.* |
| SPB890 | 60-183 | -28.0 (0.002) | n.d.* | n.d.* | n.d.* | n.d.* | n.d.* |
| SPB890 | 183-365 | 0.099 (0.496) | n.d.* | n.d.* | n.d.* | n.d.* | n.d.* |
| SPB890 | >365 | 1.15 (0.108) | n.d.* | n.d.* | n.d.* | n.d.* | n.d.* |

n.s.: Not significant. n.d.: Not determined. *There were not enough environmental data from SPB890 to perform an MRM analysis.

[Figure 1.1 image not reproduced. Labels recovered from the image: panels A) English Channel (5 m), B) and C) San Pedro Basin; community similarity (Bray-Curtis); delta time (years); time (months).]

Figure 1.1. Heat maps of pairwise community similarity over time for A) the English Channel 5 meters, B) San Pedro Basin 5 meters, and C) San Pedro Basin 890 meters. Hot colors represent high similarity between two samples. Cool colors represent low similarity between two samples. The panels to the left of the heat maps are cross-sections of one sample compared to all other samples. Arrows show the samples used in cross-sections. Plots represent different time durations but were resized.

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0.0 0.0 1.0 1.0

●●

● ● ● ● ● ● ● ● ● ● ● 0.8 ● ● ● ● 0.8 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ●● ●● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● Community similarity between samples (Bray-Curtis) ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● 0.6 ● ● ● ● ● ● ● ● 0.6 ● ●●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0.4 ● ● 0.4 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0.2 ● ● ● 0.2 ● ● ● ● ● ● ● ● ● ● ●

0.0 0123456789 Figure 2 Temporal distance between samples (years)

Figure 1.2. Temporal decay curves for bacterial communities at the English Channel 5 meters, San Pedro Basin 5 meters, and San Pedro Basin 890 meters. Community similarity is measured as the Bray-Curtis percent similarity, and delta time is the time distance between two samples in years. Red lines represent linear regressions at different temporal scales (0-60 days, 61-183 days, 184-365 days, and >365 days).
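The similarity measure and temporal bins named in this caption can be illustrated with a minimal sketch. It assumes the standard Bray-Curtis formula on raw abundance vectors and returns a proportion between 0 and 1 (multiply by 100 for percent similarity). The function names and example vectors are hypothetical and are not taken from the original analysis.

```python
def bray_curtis_similarity(a, b):
    """Bray-Curtis similarity (0 to 1) between two abundance vectors."""
    if len(a) != len(b):
        raise ValueError("abundance vectors must have the same length")
    shared = sum(min(x, y) for x, y in zip(a, b))
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty")
    return 2.0 * shared / total


def temporal_scale(days):
    """Assign a pairwise time lag (in days) to the bins used for the regressions."""
    if days <= 60:
        return "0-60 days"
    if days <= 183:
        return "61-183 days"
    if days <= 365:
        return "184-365 days"
    return ">365 days"


# Hypothetical OTU counts for two samples taken 90 days apart.
sample_a = [2, 2]
sample_b = [1, 3]
print(bray_curtis_similarity(sample_a, sample_b), temporal_scale(90))
```

A separate linear regression of similarity on time lag would then be fit within each bin.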

[Figure 1.3. Stacked bar chart of community variation (percent of total sum of squares, 0 to 100) split into inter-annual, seasonal, and intra-seasonal components for the English Channel (5 m), San Pedro Basin (5 m), and San Pedro Basin (890 m).]

Figure 1.3. Variance decomposition of community similarity into annual, seasonal, and intra-seasonal components for the English Channel (5m), the San Pedro Basin (5m), and the San Pedro Basin (890m). The variance decomposition was done using PERMANOVA on a Bray-Curtis similarity matrix for all samples from each region.

Supplemental information

[Figure S1.1. Periodogram of power spectral density against frequency (cycles per year, 0 to 6) for the English Channel (5 m), San Pedro Basin (5 m), and San Pedro Basin (890 m).]

Figure S1.1. Spectral periodogram of English Channel (5m), San Pedro Basin (5m), and San Pedro Basin (890m). The sum of the spectral densities was normalized to one for all three time-series. The surface communities have the highest peaks at one cycle per year.

[Figure S1.2. Correlogram panels A) English Channel (5 m), B) San Pedro Basin (5 m), and C) San Pedro Basin (890 m): Moran’s I (community similarity, -0.2 to 0.2) against temporal class (years, 0 to 9).]

Figure S1.2. Moran’s I correlograms of community similarity and temporal distance at

A) English Channel (5m), B) San Pedro Basin (5m), and C) San Pedro Basin (890m).

Both English Channel (5m) and San Pedro Basin (5m) show temporal autocorrelation at yearly intervals, while San Pedro Basin (890m) shows no periodicity.


Figure S1.3. Moran’s I correlograms of environmental similarity and temporal distance at A) English Channel (5m), B) San Pedro Basin (5m), and C) San Pedro Basin (890m).

Both English Channel (5m) and San Pedro Basin (5m) show periodicity at yearly intervals, while San Pedro Basin (890m) shows periodicity at a different time scale.


Figure S1.4. Moran’s I correlograms of the Euclidean distances of the individual environmental parameters water temperature, nitrate, phosphate, and silicate. A)

English Channel (5m), B) San Pedro Basin (5m), and C) San Pedro Basin (890m).

Seasonal variation was observed for all parameters at English Channel (5m), but seasonal autocorrelations were only observed for temperature at San Pedro Basin (5m and 890m).


Figure S1.5. Variance decomposition of environmental similarity into annual, seasonal, and intra-seasonal components for the English Channel (5m), San Pedro Basin (5m), and San Pedro Basin (890m). The variance decomposition was done using

PERMANOVA on a Euclidean distance matrix for all samples from each region.

Chapter 2

Temporal and taxonomic scale affects dynamics of marine

cyanobacteria in a coastal upwelling system

Abstract

Organisms are distributed through space and time, and the patterns of these distributions depend on the scale that is used. Organisms are also related phylogenetically; however, the role of phylogenetic scale in biodiversity patterns has received little attention. Here we ask whether biodiversity patterns depend on taxonomic resolution. To answer that question, we collected seawater samples over a four-year period at two-day intervals. We sequenced the gamma subunit of the RNA polymerase of the marine cyanobacteria Synechococcus and Prochlorococcus, and tracked the temporal patterns at the genus, clade (groups within genera), and subgroup (groups within clades) levels. In addition, we correlated these taxonomic groupings with water temperature, phosphate concentration, and nitrate+nitrite concentration. We found that, at the genus level, relative abundance varied at periods of two years and correlated with the environmental variables. At clade-level resolution, Clade I varied seasonally and each clade correlated with all three environmental variables. At the subgroup level, patterns changed as well. Most subgroups varied seasonally, and Clade I subgroups 1 and 2 also varied at 6-month periods. Correlations with environmental parameters also weakened across several subgroups, and in some cases the direction of correlation reversed between the clade level and subgroup level. Therefore, the taxonomic resolution one uses is important, as it can affect the pattern observed.

Introduction

Temporal patterns of microbial community variation depend on the scale used

(Hatosy et al., 2013; Korhonen et al., 2010). In addition, the taxonomic scale that is used may also influence microbial biodiversity patterns (Fulthorpe et al., 2008; Cho and

Tiedje, 2000). The marine cyanobacteria Prochlorococcus and Synechococcus are globally important contributors to carbon cycling (Liu et al., 1995; Li, 1994).

Prochlorococcus is the dominant of the two in the tropics, while Synechococcus is dominant at high latitudes and in coastal regions (Zwirglmaier et al., 2008;

Partensky et al., 1999). Throughout the tropical ocean, the Prochlorococcus clade is composed of ecotypes that partition in different environments according to light, temperature, and nutrient concentrations (Moore et al., 1998; Johnson et al., 2006;

Rocap et al., 2003; Martiny et al., 2009). In addition, Prochlorococcus ecotype abundances vary seasonally (Malmstrom et al., 2010). Based on multiple loci,

Synechococcus is divided into multiple clades (Mazard et al., 2012). Two of these clades (Clade I and Clade IV) are dominant in southern California and they vary seasonally with Clade I increasing in abundance during late spring, followed by Clade IV

(Tai and Palenik, 2009).

The southern California Bight is a dynamic marine ecosystem where the physical environment can change over the course of a day to weeks (Santoro et al., 2010;

Allison et al., 2012). In addition, microbial communities change over different temporal scales (Hatosy et al., 2013; Korhonen et al., 2010; Kara and Shade, 2009). While we know how Prochlorococcus and Synechococcus vary over seasonal time scales, we know little about high-frequency variation in these two cyanobacterial groups. In addition, while the ecotypic variation in response to environmental drivers has been studied in tropical, open-ocean Prochlorococcus populations, such variation is poorly understood for Synechococcus and temperate, coastally occurring Prochlorococcus.

Here we address this lack of knowledge using high-throughput sequencing of the RNA polymerase gene from a four-year time series to answer the following questions: 1) which Prochlorococcus and Synechococcus ecotypes exist in the ocean off Newport Beach, 2) do these ecotypes partition into more ecotypes at finer taxonomic resolutions, 3) what environmental variables correspond to these ecotypes, and 4) how do population dynamics change at various taxonomic and temporal scales?

Methods

Sample collection

Seven hundred forty-seven samples of seawater (2 L) were collected in replicate at the Microbes of the Coastal Region of Orange County (MiCRO) time series at Newport Beach, California, USA (33.61°N, 117.93°W) between September 2009 and September 2012. Samples were collected three times per week. Water samples were transported to the laboratory at 15-20°C within 30 minutes of collection. Seawater was pre-filtered through a glass microfiber filter (Whatman GF/D), and bacteria were collected on a 0.22 µm polyethersulfone Sterivex filter (Millipore). An additional 20 mL of seawater was manually pushed through a 0.22 µm polyethersulfone syringe filter (Millipore) and frozen at -20°C for nutrient analysis. Nutrient samples were sent to the Marine Science Institute (University of California – Santa Barbara) to measure the concentrations of nitrate+nitrite, ortho-phosphate, ammonium, and silicate. Nutrient concentrations below the detection limit were assigned a value of half the detection limit. Nitrate+nitrite, phosphate, ammonia, and silicate concentrations were then log10 transformed.
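The nutrient preprocessing described above (substituting half the detection limit, then applying a log10 transform) can be sketched as follows. This is a minimal illustration and not the original analysis code; the function name, the example concentrations, and the detection limit of 0.05 are hypothetical.

```python
import math


def transform_nutrient(values, detection_limit):
    """Replace values below the detection limit with half the limit, then take log10."""
    transformed = []
    for value in values:
        if value < detection_limit:
            value = detection_limit / 2.0
        transformed.append(math.log10(value))
    return transformed


# Hypothetical phosphate concentrations (µM) with an assumed detection limit of 0.05 µM.
phosphate = [0.22, 0.01, 0.94]
print(transform_nutrient(phosphate, detection_limit=0.05))
```

Substituting a fixed fraction of the detection limit keeps every value positive, so the log transform is always defined.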

Bacterial DNA was extracted from the Sterivex filters using a protocol adapted from Boström et al. (2004). Extraction started with an overnight freeze in SET (lysis) buffer, followed by a 30-minute incubation at 37°C with lysozyme. Next, proteinase K and SDS were added and incubated at 55°C overnight. DNA was then precipitated with ice-cold isopropanol, pelleted by centrifugation, and resuspended in Tris buffer in a 37°C water bath for 30 minutes. DNA was then purified using a genomic DNA Clean and Concentrator kit (Zymo Research Corp.).

rpoC1 amplification

The rpoC1 gene was amplified using a two-step PCR. First, the gene was amplified using the degenerate rpoC1 primers SAC1039R (5’-CYTGYTTNCCYTCDATDATRT) and 5M_newF (5’-GARCARATHGTYTAYTTYA) for ten cycles of 95°C for 30 seconds, 50°C for 40 seconds, and 72°C for 1 minute. Following this, rpoC1 primers containing Roche 454 Titanium flow cell adaptor sequences were added to the reaction: SAC1039R-LibL (5’-CTATGCGCCTTGCCAGCCCGCTCAGCYTGYTTNCCYTCDATDATRT) and 5M_newF_LibL (CCATCTCATCCCTGCGTGTCTCCGACTCAGXXXXXXXXXXXXGARCARATHGTYTAYTTYA), where XXXXXXXXXXXX represents a unique 12-nucleotide Golay barcode (Hamady et al., 2008). This second step ran for twenty-three cycles of 95°C for 30 seconds, 50°C for 40 seconds, and 72°C for 1 minute, followed by a final step at 72°C for 10 minutes. PCR reactions were performed in duplicate and pooled. Dimers less than 500 nucleotides long were removed using AMPure magnetic bead purification (Beckman-Coulter). Purified DNA from each sample was quantified using the Quant-iT high sensitivity assay (Invitrogen), and all samples were pooled to equimolar concentrations for sequencing.
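To make the degeneracy of the two rpoC1 primers concrete, the sketch below counts how many distinct sequences each primer represents, using the standard IUPAC nucleotide codes. The primer sequences are those given in the text with the line break removed; the function and variable names are hypothetical.

```python
# Number of bases represented by each IUPAC nucleotide code.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def primer_degeneracy(primer):
    """Return the number of distinct sequences encoded by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= IUPAC_DEGENERACY[base]
    return total


SAC1039R = "CYTGYTTNCCYTCDATDATRT"
FIVE_M_NEWF = "GARCARATHGTYTAYTTYA"

for name, seq in (("SAC1039R", SAC1039R), ("5M_newF", FIVE_M_NEWF)):
    print(name, len(seq), primer_degeneracy(seq))
```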

Bioinformatics

Raw reads were quality filtered in QIIME using default parameters and then denoised in QIIME using denoise_wrapper.py. Denoised reads were clustered into

OTUs using uclust in QIIME with a 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%,

96%, 97%, 98% and 99% sequence similarity. The longest representative read from each OTU was then used for taxonomic assignment. A reference database of

Synechococcus and Prochlorococcus rpoC1 sequences was constructed from sequenced genomes and environmental sequences greater than 841 nucleotides.

Reference sequences were aligned with MUSCLE (Edgar, 2004), trimmed to 841 nucleotides, and dereplicated using the derep_fulllength command in usearch (Edgar,

2010). A maximum likelihood reference tree was constructed in RAxML using

Synechococcus WH5701 as an outgroup.

Representative query sequences from the uclust clustering were blasted against our rpoC1 database using blastn with an e-value cutoff of 10. Representative sequences that did not contain gaps in the BLAST alignment or introduce gaps in the subject sequence were used for taxonomy assignment. For clade temporal analysis, OTUs that hit to the same clade were collapsed so that each clade was represented by a single OTU. In addition, representative sequences were taken from the thirty most abundant OTUs from the three most relatively abundant clades. These sequences were then clustered using PHYLIP (Felsenstein, 1989) to identify potential subclusters within these clades.
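Collapsing OTUs that hit the same clade, and then expressing each clade as a relative abundance within a sample, can be sketched as below. This is an illustrative reconstruction under simple assumptions (per-sample read counts in a dictionary and a precomputed OTU-to-clade lookup); the OTU identifiers and counts are hypothetical.

```python
from collections import defaultdict


def collapse_by_clade(otu_counts, otu_to_clade):
    """Sum read counts of all OTUs assigned to the same clade."""
    clade_counts = defaultdict(int)
    for otu, count in otu_counts.items():
        clade = otu_to_clade.get(otu)
        if clade is not None:  # OTUs without a clade assignment are skipped
            clade_counts[clade] += count
    return dict(clade_counts)


def relative_abundance(counts):
    """Convert counts to proportions of the sample total."""
    total = sum(counts.values())
    if total == 0:
        return {key: 0.0 for key in counts}
    return {key: value / total for key, value in counts.items()}


# Hypothetical counts for one sample.
otu_counts = {"otu1": 30, "otu2": 20, "otu3": 50}
otu_to_clade = {"otu1": "Clade I", "otu2": "Clade I", "otu3": "Clade IV"}

clades = collapse_by_clade(otu_counts, otu_to_clade)
print(relative_abundance(clades))
```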

The frequency at which environmental variables and cyanobacterial relative abundance varied was calculated using least-squares spectral analysis (the Lomb-Scargle algorithm). The raw time series was fit to a series of sinusoidal waves to determine the relative contribution of each wave frequency to the variation.
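A minimal sketch of the classical Lomb-Scargle periodogram is given below to show how unevenly sampled data can be fit with sinusoids at a set of trial frequencies. It is a plain-Python illustration, not the software used in this study (which the text does not name). The synthetic series, random seed, and frequency grid are assumptions chosen only for demonstration.

```python
import math
import random


def lomb_scargle(times, values, freqs):
    """Classical Lomb-Scargle power at each frequency (cycles per time unit)."""
    mean = sum(values) / len(values)
    y = [v - mean for v in values]
    powers = []
    for f in freqs:
        w = 2.0 * math.pi * f
        # The offset tau makes the sine and cosine terms orthogonal at this frequency.
        s2 = sum(math.sin(2.0 * w * t) for t in times)
        c2 = sum(math.cos(2.0 * w * t) for t in times)
        tau = math.atan2(s2, c2) / (2.0 * w)
        cos_terms = [math.cos(w * (t - tau)) for t in times]
        sin_terms = [math.sin(w * (t - tau)) for t in times]
        yc = sum(a * b for a, b in zip(y, cos_terms))
        ys = sum(a * b for a, b in zip(y, sin_terms))
        cc = sum(c * c for c in cos_terms)
        ss = sum(s * s for s in sin_terms)
        powers.append(0.5 * (yc * yc / cc + ys * ys / ss))
    return powers


# Synthetic example: an annual cycle sampled at 300 irregular times over 3 years.
rng = random.Random(0)
times = sorted(rng.uniform(0.0, 3.0) for _ in range(300))
values = [0.5 + 0.3 * math.sin(2.0 * math.pi * t) for t in times]
freqs = [0.1 * k for k in range(1, 61)]  # 0.1 to 6.0 cycles per year

powers = lomb_scargle(times, values, freqs)
peak_freq = freqs[powers.index(max(powers))]
print(peak_freq)
```

Because the method does not require evenly spaced observations, it suits time series with gaps or irregular sampling.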

Correlation to environmental parameters

To identify potential drivers of population variation, we correlated relative abundance of Prochlorococcus, Synechococcus, Clade I, Clade IV, HLI, and the subclades to water temperature using Pearson’s correlation.
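Pearson’s correlation between a relative abundance series and water temperature can be computed as in the following sketch. The function name and the example values are hypothetical and serve only to illustrate the calculation.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired observations: water temperature (°C) and relative abundance.
temperature = [12.0, 14.5, 16.0, 18.5, 21.0]
abundance = [0.05, 0.08, 0.07, 0.12, 0.15]
print(round(pearson_r(temperature, abundance), 2))
```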

Results

Sequencing data

To characterize and quantify the temporal turnover of marine cyanobacteria at different taxonomic scales, we first sequenced bacterial DNA from 747 seawater samples from a 4.5 year time series. The sequencing resulted in 2,369,259 raw reads.

Denoising removed 2% of the reads, leaving 2,319,650 quality reads, which averaged

3,105 reads per sample (Table 2.1). To identify clusters of cyanobacteria, reads were clustered at 97% nucleotide similarity. This produced 2,603 clusters. Clusters that shared BLAST hits were collapsed, leaving 24 clusters. Over the course of the time series, Synechococcus was relatively more abundant than Prochlorococcus (Figure 2.1).

The majority of Synechococcus reads belonged to Clades I and IV (30% and 56%, respectively). The dominant Prochlorococcus clade was high-light I (HLI) (9% of cyanobacterial diversity). These three clades represented the most abundant clades.

Time series analysis

Environmental variables varied across the time series. Water temperature, salinity, phosphate, and silicate varied seasonally (Figure 2.1, Figure S2.1). Water temperature ranged from 11.5˚C to 21.8˚C, and peaked during late summer and early fall. Salinity ranged from 32.0 ppt to 33.5 ppt with troughs during winter (Figure 2.1).

Phosphate ranged from 0.05 µM to 0.94 µM, peaking in spring and winter (Figure 2.1).

Nitrate+nitrite ranged from 0.1 µM to 12.0 µM and tended to peak in spring (Figure 2.1).

Synechococcus and Prochlorococcus varied over the course of the time series

(Figure 2.1) with significant variation occurring at seasonal cycles (1 cycle year-1) and one cycle every two years (Figure S2.1). Synechococcus relative abundance generally peaked around spring and decreased during fall (Figure 2.1). This pattern was reversed for Prochlorococcus. To increase phylogenetic resolution, we looked at Clades I and IV from Synechococcus and HLI from Prochlorococcus. As with Synechococcus and

Prochlorococcus, the relative abundance of all three clades varied throughout the years

(Figure 2.2). The relative abundance of Clade I peaked during spring and troughed during fall (Figure 2.2), and varied at a period of one year and once every two years

(Figure S2.1). Clade IV relative abundance peaked in summer 2011, but troughed in summer 2012 (Figure 2.2). The frequency spectrum of Clade IV shows significant variation at one cycle every two years (Figure S2.1). The relative abundance of HLI followed a similar pattern as Prochlorococcus (Figure 2.1 and 2.2 and Figure S2.1).

We next increased taxonomic resolution by splitting Clades I and IV and HLI into subgroups. Clade I and Clade IV were split into three subgroups and HLI was split into two subgroups (Figure 2.1). Unlike Clade I as a whole, the Clade I subgroups did not have significant variation at any frequency (Figure 2.1). The majority of variation for subgroups 1 and 2 was at two cycles per year (Figure 2.2 and Figure S2.1). Clade I subgroup 1 peaked during the winter and slightly in the summer, subgroup 2 peaked during the fall and spring, and subgroup 3 peaked during summer (Figure 2.2).

The patterns of the Clade IV subgroups also differed from the pattern of the whole clade. Clade IV subgroups 2 and 3 varied seasonally (Figure 2.2 and Figure

S2.1). In addition, the majority of variation for all three subgroups occurred at one cycle every two years, which is similar to the clade as a whole (Figure S2.1). However, unlike the whole clade, the three subgroups also varied at a higher frequency (Figure S2.1).

Clade IV subgroup 2 varied between higher relative abundance late winter to spring and lower relative abundance during the summer while Clade IV subgroup 3 had higher relative abundance in spring and summer and lower relative abundance during winter.

Clade IV subgroup 1 had higher relative abundance during fall and lower relative abundance during spring (Figure 2.2).

The temporal patterns of the two subgroups from HLI were similar to the clade as a whole, but frequencies differed in the relative contribution to variation. Both subgroups varied seasonally (Figure 2.2 and Figure S2.1) where HLI subgroup 1 peaked during spring and summer and troughed during the winter while HLI subgroup 2 had the opposite pattern (Figure 2.2). Unlike the clade as a whole, the majority of variation for both subgroups occurred at one cycle every year.

Water temperature correlation

Because the temporal patterns changed based on taxonomic resolution, we correlated relative abundance with water temperature. At the broadest taxonomic resolution, Synechococcus was negatively correlated with water temperature and

Prochlorococcus was positively correlated with water temperature (r = -0.45 and 0.45, respectively) (Table 2.2 and Figure 2.3). At the clade level, Clade I followed a similar pattern to Synechococcus as a whole (Figure 2.3, Table 2.2 and Figure S2.3), but Clade

IV diverged from the Synechococcus pattern. Clade IV was positively correlated with water temperature (r = 0.32) (Table 2.2 and Figure S2.3). HLI maintained a similar pattern to Prochlorococcus (Figure 2.3, Table 2.2 and Figure S2.3).

Further division of the taxonomy continued to change the pattern. Clade I subgroup 1 had a similar pattern to Clade I with respect to water temperature (r = -0.27)

(Table 2.2). Clade I subgroup 2 was less correlated with temperature. Clade I subgroup

3 had higher relative abundance in warmer water (>18˚C) (Figure 2.3 and Table 2.2).

Discussion

Using high-throughput sequencing of the RNA polymerase gamma subunit

(rpoC1), we were able to identify clades of Prochlorococcus and marine Synechococcus in an ocean upwelling zone. Overall, Synechococcus was relatively more abundant than

Prochlorococcus with the relative abundance of Prochlorococcus rising during the warm temperatures of the summer and late fall. The general abundance of Synechococcus is not surprising in the nutrient-rich region of the southern California Bight, as Synechococcus tends to dominate in such waters (Zwirglmaier et al., 2008). The increase in

Prochlorococcus abundance during the summer/fall and positive correlation with water temperature is supported by temperature dependent culture experiments and ocean surveys (Johnson et al., 2006; Flombaum et al., 2013). At the clade level,

Synechococcus Clades I and IV contributed most to the cyanobacterial diversity at the MiCRO time series, which was similar to patterns seen at the Scripps Pier in La Jolla,

CA (Tai and Palenik, 2009).

We also found that temporal patterns changed as taxonomic resolution changed.

At the genus level, Synechococcus and Prochlorococcus varied at a period of two years

(Figure S2.2). Such a result is surprising because patterns of these marine cyanobacteria tend to be seasonal (Tai and Palenik, 2009; Malmstrom et al., 2010). In our case, the temporal variation in cyanobacteria appears to coincide with temporal variation in ammonia concentrations. At finer taxonomic resolution (clade level), we detected seasonality for

Clade I (Figure 2.2). Clade I peaked in spring while Clade IV did not have a clear seasonal pattern, similar to a 2005-2007 time series at Scripps Pier (Tai and Palenik,

2009). HLI also lacked a clear seasonal pattern, and its pattern coincided with that of Prochlorococcus, possibly because HLI was the most abundant

Prochlorococcus clade in our time series.

Further division of the clades into subgroups revealed different patterns from the other taxonomic resolutions. Clade I subgroups appear to partition based on water temperature, with subgroup 1 as a low-temperature variant and subgroup 3 as a high-temperature variant (Figure 2.3). The subgroups of Clade IV and HLI correlate only weakly with water temperature. The change in correlation with taxonomic resolution in general, and the weakening of some correlations at finer resolution in particular, highlights the importance of choosing the proper phylogenetic scale for a study. For instance, environmental drivers may be important at some taxonomic scales and not others. Martiny et al. (2009) found that relationships with environmental variables such as light, temperature, and nitrate changed when the taxon definition changed (based on sequence similarity). The decrease in correlations at the

finer taxonomic resolutions suggests that the environmental variables we used play a

lesser role in structuring the dynamics of these populations.

Dynamics of populations defined at finer taxonomic resolutions could be

controlled by biotic factors, which we did not measure. In particular, predation by

cyanophages could alter the abundance of these subgroups. The dynamics of

cyanophage strains in southern California were temporally variable (Clasen et al.,

2013). In chemostats, Synechococcus and cyanophage grown together varied

temporally over the course of five months compared to Synechococcus grown alone

suggesting that phage could influence population dynamics at shorter time scales

(Marston et al. 2012).

Taxa are influenced by abiotic and biotic interactions. Taxa also share

evolutionary history with each other, and this history is itself influenced by abiotic and

biotic interactions. Here we show that taking into account that evolutionary history to

define taxa affects the patterns that you observe and could affect the inferences you

make about the process behind those patterns. Therefore, as with spatial and temporal

scale, one must choose the appropriate taxonomic scale for the question being

answered.

References

Allison SD, Chao Y, Farrara JD, Hatosy S, Martiny AC. (2012). Fine-scale temporal

variation in marine extracellular enzymes of coastal southern California. Front

Microbiol 3:301.

Boström K, Simu K, Hagstrom A, Riemann L. (2004). Optimization of DNA extraction for

quantitative marine bacterioplankton community analysis. Limnol Oceanogr Methods

2:365–373.

Cho JC, Tiedje JM. (2000). Biogeography and degree of endemicity of fluorescent

Pseudomonas strains in soil. Appl Environ Microbiol 66:5448–5456.

Clasen J, Hanson C, Ibrahim Y, Weihe C, Marston M, Martiny J. (2013). Diversity and

temporal dynamics of Southern California coastal marine cyanophage isolates. Aquat

Microb Ecol 69:17–31.

Edgar RC. (2004). MUSCLE: multiple sequence alignment with high accuracy and high

throughput. Nucleic Acids Res 32:1792–7.

Edgar RC. (2010). Search and clustering orders of magnitude faster than BLAST.

Bioinformatics 26:2460–1.

Felsenstein J. (1989). PHYLIP - Phylogeny Inference Package (Version 3.2). Cladistics

5:164 – 166.

Flombaum P, Gallegos JL, Gordillo RA, Rincón J, Zabala LL, Jiao N, et al. (2013).

Present and future global distributions of the marine Cyanobacteria Prochlorococcus

and Synechococcus. Proc Natl Acad Sci U S A 110:9824–9.

Fulthorpe RR, Roesch LF, Riva A, Triplett EW. (2008). Distantly sampled soils

carry few species in common. ISME J 2:901–10.

Hamady M, Walker JJ, Harris JK, Gold NJ, Knight R. (2008). Error-correcting barcoded

primers for pyrosequencing hundreds of samples in multiplex. Nat Methods 5:235–7.

Hatosy SM, Martiny JBH, Sachdeva R, Steele J, Fuhrman JA, Martiny AC. (2013). Beta

diversity of marine bacteria depends on temporal scale. Ecology 94:1898–1904.

Johnson ZI, Zinser ER, Coe A, Mcnulty NP, Woodward EMS, Chisholm SW. (2006).

Niche partitioning among Prochlorococcus ecotypes along ocean-scale

environmental gradients. Science 311:1737–1741.

Kara E, Shade A. (2009). Temporal dynamics of South End tidal creek (Sapelo Island,

Georgia) bacterial communities. Appl Environ Microbiol 75:1058–64.

Korhonen JJ, Soininen J, Hillebrand H. (2010). A quantitative analysis of temporal

turnover in aquatic species assemblages across ecosystems. Ecology 91:508–17.

Li W. (1994). Primary production of prochlorophytes, cyanobacteria, and eucaryotic

ultraphytoplankton: measurements from flow cytometric sorting. Limnol Oceanogr

39:169–175.

Liu H, Nolla HA, Campbell L. (1995). Prochlorococcus growth rate and contribution to

primary production in the equatorial and subtropical North Pacific Ocean.

Malmstrom RR, Coe A, Kettler GC, Martiny AC, Frias-Lopez J, Zinser ER, et al. (2010).

Temporal dynamics of Prochlorococcus ecotypes in the Atlantic and Pacific oceans.

ISME J 4:1252–64.

Marston MF, Pierciey FJ, Shepard A, Gearin G, Qi J, Yandava C, et al. (2012). Rapid

diversification of coevolving marine Synechococcus and a virus. Proc Natl Acad Sci

U S A 109:4544–9.

Martiny AC, Tai APK, Veneziano D, Primeau F, Chisholm SW. (2009). Taxonomic

resolution, ecotypes and the biogeography of Prochlorococcus. Environ Microbiol

11:823–32.

Mazard S, Ostrowski M, Partensky F, Scanlan DJ. (2012). Multi-locus sequence

analysis, taxonomic resolution and biogeography of marine Synechococcus. Environ

Microbiol 14:372–86.

Moore LR, Rocap G, Chisholm SW. (1998). Physiology and molecular phylogeny of

coexisting Prochlorococcus ecotypes. Nature 393:464–467.

Partensky F, Hess W, Vaulot D. (1999). Prochlorococcus, a marine photosynthetic

prokaryote of global significance. Microbiol Mol Biol Rev 63:106–127.

Rocap G, Larimer FW, Lamerdin J, Malfatti S, Chain P, Ahlgren NA, et al. (2003).

Genome divergence in two Prochlorococcus ecotypes reflects oceanic niche

differentiation. Nature 424:1042–7.

Santoro AE, Nidzieko NJ, Dijken GL Van, Arrigo KR, Boehm AB. (2010). Contrasting

spring and summer phytoplankton dynamics in the nearshore Southern California

Bight. Limnol Oceanogr 55:264–278.

Tai V, Palenik B. (2009). Temporal variation of Synechococcus clades at a coastal

Pacific Ocean monitoring site. ISME J 3:903–915.

Zwirglmaier K, Jardillier L, Ostrowski M, Mazard S, Garczarek L, Vaulot D, et al. (2008).

Global phylogeography of marine Synechococcus and Prochlorococcus reveals a

distinct partitioning of lineages among oceanic biomes. Environ Microbiol 10:147–61.

Table 2.1. Summary of rpoC1 libraries

Number of reads (pre denoising)     2,369,259
Number of reads (post denoising)    2,319,650
Mean Temp (˚C)                      16.6
Mean NO2+NO3 (µM)                   1.00
Mean PO4 (µM)                       0.22
Mean NH3 (µM)                       1.30
Mean Salinity (ppt)                 33.2
Mean SiO4 (µM)                      2.50

Table 2.2. Pearson’s correlation coefficients between relative abundance and environmental variables

Taxon                   Water Temperature
Prochlorococcus          0.45
HLI                      0.50
HLI subgroup 1           0.06
HLI subgroup 2          -0.06
Synechococcus           -0.45
Clade I                 -0.54
Clade I subgroup 1      -0.27
Clade I subgroup 2       0.04
Clade I subgroup 3       0.58
Clade IV                 0.32
Clade IV subgroup 1      0.10
Clade IV subgroup 2     -0.08
Clade IV subgroup 3      0.01


Figure 2.1. Temporal variation of Synechococcus and Prochlorococcus relative abundance and environmental variables. Gray lines represent raw data. Black lines represent low-pass filter smoothing.


Figure 2.2. Neighbor-joining tree of RNA polymerase gamma subunit and temporal variation of the relative abundance of the top three abundant clades from the marine cyanobacteria, and clade subgroups. Gray lines represent raw data. Black lines represent low-pass filter smoothing.


Figure 2.3. Correlation between water temperature and relative abundance of all taxonomic resolutions. Regression line is the linear regression.

[Figure S2.1 panels. Legible panel labels include Pro, Syn, HLI, Clade I, Clade IV, Other, Silicate, Salinity, Phosphate, and Ammonia.]

Figure S2.1. Spectral density plots of time series. Pro – Prochlorococcus, Syn – Synechococcus, HLI – High-light I, Ix – Clade I subgroup x, IVx – Clade IV subgroup x, HLIx – High-light I subgroup x.

Chapter 3

The ocean as a global reservoir of antibiotic resistance genes

Abstract: Recent studies of natural environments have revealed vast genetic reservoirs of antibiotic resistance (AR) genes. Soil bacteria and human pathogens share AR genes, and AR genes have been discovered in a variety of habitats. However, little is known about the presence and diversity of AR genes in marine environments or about which organisms host them. To address this, we identified the diversity of genes conferring resistance to ampicillin, tetracycline, nitrofurantoin, and sulfadimethoxine in diverse marine environments using functional metagenomics (the cloning and screening of random DNA fragments). Marine environments hosted a diversity of AR-conferring genes. Antibiotic-resistant clones were found at all sites, with 28% of the genes identified as known AR genes (beta-lactamases, bicyclomycin resistance pumps, etc.). However, the majority of AR genes were not previously classified as such, but were similar to proteins such as transport pumps, , and . Furthermore, 44% of the genes conferring antibiotic resistance were found in abundant marine taxa (e.g., Pelagibacter, Prochlorococcus, Vibrio, etc.). Therefore, we uncovered a previously unknown diversity of genes that conferred an AR phenotype among marine environments, which makes the ocean a global reservoir of both clinically relevant and potentially novel AR genes.

Keywords: Functional metagenomics, microbial ecology, public health

Introduction:

The spread of antibiotic resistance (AR) is critically important to human health. Past research has focused on resistance in clinical environments (e.g., hospitals), but the rise of community-acquired infections of resistant bacteria has fueled interest in AR genes in natural environments (Martínez 2008, Forsberg et al. 2012, Gibson et al. 2014). Natural environments can be important because they can act as reservoirs of AR genes (Martínez 2008, Forsberg et al. 2012). Such environments include soils (Riesenfeld et al. 2004, Allen et al. 2009), glaciers (Segawa et al. 2013), and animals (Foti et al. 2009, Miller et al. 2009, Martiny et al. 2011). Additionally, the frequency of AR in human hosts is higher than previously thought (Pike et al. 2002, Sommer et al. 2009), and the same AR genes found in soil bacteria have also been found in clinical pathogens (Forsberg et al. 2012). One set of environments that has received little attention is marine environments. Oceans are dilute systems, and hence there may be little selection for antibiotic production because compounds can rapidly diffuse away from the producer (Allison 2005).

However, there are two possible hypotheses for the occurrence of AR in marine environments. The first is the classical interpretation of antibiotic resistance as a response to antibiotics. Coastal runoff of AR bacteria from terrestrial sources could introduce AR into marine environments. In this case, we expect to find AR genes in bacterial taxa non-native to marine environments. Alternatively, selection for AR could occur within marine environments, due to antibiotic runoff or in situ antibiotic production. Antagonistic microbial interactions can occur on marine snow (Long and Azam 2001) or in small parcels of seawater (Sher et al. 2011, Cordero et al. 2012). These interactions may include the production of antibiotics and subsequent selection for resistance.

The second hypothesis is that the sheer genetic diversity present in marine systems increases the likelihood of finding genes that confer resistance. Dedicated antibiotic resistance genes are not needed to make bacteria resistant. For example, in bacteria living in habitats containing heavy metals, non-antibiotic efflux pumps can make the bacteria resistant to antibiotics (Pike et al. 2002, De Souza et al. 2006, Martinez et al. 2009).

Despite the large expanse of the oceans, we currently have little understanding of the presence, diversity of organisms, or types of genes responsible for AR in the marine environment. To address this limitation in our understanding of AR in natural environments, and to test the two hypotheses above, we used functional metagenomics (i.e., the cloning and functional screening of DNA fragments from communities) to identify genes conferring resistance to specific antibiotics in marine waters. We specifically asked: 1) What is the frequency of resistance to different antibiotics in specific marine environments, 2) what is the diversity of marine AR genes, and 3) are these genes harbored by marine bacteria?

Materials and Methods:

Sample collection

Seawater was collected from Newport Bay (33° 37’ 29.8” N, 117° 53’ 35.2” W), a natural bay that has freshwater influence from San Diego Creek and Delhi Channel; Agua Hedionda Lagoon (33° 8’ 44.1” N, 117° 20’ 35.8” W), which contains an aquaculture facility; Los Angeles Harbor (33° 42’ 37.0” N, 118° 15’ 23.5” W); the San Pedro Ocean Time-series (33° 33’ 00” N, 118° 24’ 00” W), an open-ocean site that has coastal influence; and the Hawaii Ocean Time-series (22° 45’ 00” N, 158° 00’ 00” W), which is an open-ocean site. These locations were chosen because they represented a range of marine environments with different proximities to the coast. Eight to sixteen liters of seawater were collected in replicate and prefiltered through 2.7 micron glass fiber filters (Whatman GF/D, Pittsburgh, PA) and then collected on 0.22 micron polyethersulfone Sterivex filters (Millipore, Billerica, MA).

DNA was extracted from the Sterivex filters using a protocol modified from Boström et al. (2004). 1620 µl of Tris-EDTA-Sucrose buffer was added to each filter, and the filters were frozen for at least 24 hours. Filters were then thawed, and 180 µl of lysozyme buffer was added and incubated at 37˚C for 30 minutes. 180 µl of proteinase K and 100 µl of sodium dodecyl sulfate were then added and incubated at 55˚C overnight. Sodium acetate and cold isopropanol were added to precipitate macromolecules, and the solution was left at -20˚C for at least 1 hour. The precipitate was pelleted by centrifugation at 15,000 × g for 30 minutes. The supernatant was decanted and the pellet was resuspended in Tris buffer using a 37˚C water bath for 30 min. DNA was then purified using a genomic DNA Clean and Concentrator kit (Zymo Corp., Irvine, CA).

Library construction

At least 2 µg of DNA from each replicate was sonically sheared to 3 kb using an S2 Focused Acoustic Shearer (Covaris Inc., Woburn, MA). Fragments of 2-4 kb were gel extracted using a Zymo Gel Extraction kit (Zymo Corp., Irvine, CA). DNA was treated with an End-It end repair kit (Epicentre, Madison, WI) to create blunt ends on the DNA fragments. Fragments were then ligated into the pZE-21 plasmid (Lutz and Bujard 1997, Sommer et al. 2009) using a Fast-Link ligation kit (Epicentre, Madison, WI). The ligation reaction was then purified using a DNA Clean and Concentrator-5 kit (Zymo Corp., Irvine, CA). 4 µl of ligation was added to 50 µl of Lucigen (Middleton, WI) electrocompetent E. cloni® cells in a 2 mm electroporation cuvette. The cells were electroporated at 1800 V, 250 ohms, 50 µF, and immediately transferred to 975 µl of Recovery Media (Lucigen, Middleton, WI). Transformed cells were allowed to recover at 37˚C for 1 hour. After 1 hour, the cultures were diluted 1:10 and 1:100, and 1 µl of each was plated on an LB + kanamycin (50 µg/ml) plate and incubated overnight to determine the initial titer. 3 mL of LB and 50 µg/ml of kanamycin were added to the cultures, which were incubated at 37˚C for 1.5 hours. These cultures were diluted 1:100 and 1 µl was plated on LB + kanamycin (50 µg/ml) and incubated overnight at 37˚C to determine the titer of successful transformants. Libraries were then preserved with 10% glycerol and stored at -80˚C.

Approximately 10⁶ cells in glycerol stocks were plated on LB plates with ampicillin (60 µg/ml) (a semisynthetic antibiotic that arrests cell wall synthesis), tetracycline (8 µg/ml) (a naturally produced antibiotic that inhibits protein synthesis), sulfadimethoxine (700 µg/ml) (a synthetic antibiotic that inhibits folic acid synthesis), or nitrofurantoin (5 µg/ml) (a synthetic antibiotic that damages intracellular macromolecules). Antibiotic concentrations were the minimum concentrations needed to inhibit growth of E. cloni® cells transformed with the pZE-21 plasmid. Antibiotic-resistant clones were picked and grown in 200 µl LB + kanamycin (50 µg/ml) in a 96-well plate overnight at 37˚C. Plates were sent to Beckman-Coulter for Sanger sequencing of the plasmid ends.
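The titer determinations and the plating of a fixed number of cells described above come down to simple dilution arithmetic. The following is a minimal Python sketch of that arithmetic only. The function names and the colony count in the usage note are hypothetical and are not taken from this study.

```python
def titer_cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ul: float) -> float:
    """Estimate colony-forming units per ml of the undiluted culture.

    colonies:          colonies counted on the plate
    dilution_factor:   100 for a 1:100 dilution, 10 for 1:10, and so on
    plated_volume_ul:  volume spread on the plate, in microliters
    """
    if colonies < 0 or dilution_factor <= 0 or plated_volume_ul <= 0:
        raise ValueError("counts must be non-negative; dilution and volume must be positive")
    # 1000 microliters per ml converts the plated volume to ml.
    return colonies * dilution_factor * 1000.0 / plated_volume_ul


def plating_volume_ul(target_cells: float, titer: float) -> float:
    """Volume (in microliters) of a stock with the given titer (cfu/ml)
    that contains approximately target_cells cells."""
    if target_cells <= 0 or titer <= 0:
        raise ValueError("target_cells and titer must be positive")
    return target_cells * 1000.0 / titer
```

For example, under the assumption that 50 colonies grew from 1 µl of a 1:100 dilution, `titer_cfu_per_ml(50, 100, 1.0)` gives 5 × 10⁶ cfu/ml, and `plating_volume_ul(1e6, 5e6)` gives 200 µl as the volume holding roughly 10⁶ cells.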

It is likely that this approach underestimates the frequency and diversity of AR-conferring genes in the ocean. First, our functional metagenomics assay only identified genes that are expressed in E. coli. Second, our insert size (2-4 kb) limits detection to simple genetic systems.

Identification of known antibiotic resistance genes

We compared sequences to the Antibiotic Resistance Genes Database (ARDB) (Liu and Pop 2009) and the non-redundant (nr) database from GenBank using BLAST. Nucleotide sequences were compared to an amino acid database (blastx) using the BLOSUM62 substitution matrix and an e-value cutoff of 10. Genes that were previously classified as AR genes in either the ARDB or GenBank were labeled as “known AR genes.” Genes that were not known AR genes were classified as “previously unclassified AR genes.”
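The labeling rule described above can be written as a small decision function. This is a sketch under stated assumptions, not the pipeline used in this study. The `Hit` record, its `is_ar_annotated` flag, and the function name are hypothetical, and how a GenBank hit was judged to be an AR gene is not specified in the text, so it is represented here by a precomputed flag.

```python
from dataclasses import dataclass
from typing import Optional

E_VALUE_CUTOFF = 10.0  # e-value cutoff stated in the text


@dataclass
class Hit:
    """Best blastx hit of one clone against one database (hypothetical record)."""
    subject: str
    e_value: float
    is_ar_annotated: bool  # assumption: True if the subject is annotated as an AR gene


def classify_clone(ardb_hit: Optional[Hit], genbank_hit: Optional[Hit]) -> str:
    """Label a resistant clone as a known or previously unclassified AR gene."""
    for hit in (ardb_hit, genbank_hit):
        if hit is not None and hit.e_value <= E_VALUE_CUTOFF and hit.is_ar_annotated:
            return "known AR gene"
    # Every sequenced clone conferred resistance in the screen, so a clone
    # without an AR-annotated hit is still counted as an AR gene.
    return "previously unclassified AR gene"
```

A clone with a passing, AR-annotated hit in either database is labeled known; a clone whose only hits are to non-AR proteins, or that has no hits, is labeled previously unclassified.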

Identifying marine vs. non-marine taxa

We compared sequences to the 'non-redundant' database from NCBI using BLAST. We then took the top hit for each sequence and searched the taxon against the EnvDB (Pignatelli et al. 2009). Records for each taxon were then identified as either 'marine' or 'non-marine'. A record was considered 'marine' if the environment of that record was 'saline water', 'saline sediment', 'marine host', 'freshwater-saline waters interface', 'soil-saline waters interface', or 'hydrothermal'. If more than 50% of the environmental records were 'marine', then that taxon was classified as 'marine'. The taxon was labeled as 'unknown' if the identification was broad (e.g. alphaproteobacteria). We also determined the functional type of protein from these hits (e.g. , transporter). Using the top hit from GenBank, we searched UniProt for molecular functions. If the UniProt record identified an Enzyme Commission number or a molecular function (e.g. transport, DNA binding, etc.), we assigned that function to the gene. If there was no Enzyme Commission number or molecular function, we identified that type as 'unknown'.
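The marine versus non-marine rule above is a majority vote over environment records. A minimal Python sketch of that rule follows. The function name and the `is_broad` argument are hypothetical, and treating a taxon with no environment records as 'unknown' is an assumption that the text does not state.

```python
from typing import Iterable

# Environment labels that the text counts as 'marine'.
MARINE_ENVIRONMENTS = frozenset({
    "saline water",
    "saline sediment",
    "marine host",
    "freshwater-saline waters interface",
    "soil-saline waters interface",
    "hydrothermal",
})


def classify_taxon(environment_records: Iterable[str], is_broad: bool = False) -> str:
    """Classify a taxon as 'marine', 'non-marine', or 'unknown'.

    environment_records: environment label of each record found for the taxon
    is_broad:            True when the identification is too broad to classify
                         (e.g. alphaproteobacteria)
    """
    records = list(environment_records)
    if is_broad or not records:  # assumption: no records means 'unknown'
        return "unknown"
    marine = sum(1 for env in records if env in MARINE_ENVIRONMENTS)
    # Strictly more than 50% of the records must be marine.
    return "marine" if marine / len(records) > 0.5 else "non-marine"
```

Note that a taxon with exactly half of its records in marine environments is classified as 'non-marine', because the rule requires more than 50%.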

Strain genomic libraries

Control libraries were constructed similarly to the functional metagenomic libraries above. Cultures of antibiotic-sensitive Escherichia coli and Synechococcus strain WH8102 were grown to a density of 10⁶ cells ml⁻¹ and syringe filtered through a Sterivex filter. DNA was extracted and clone libraries were constructed as above for the seawater samples. Clones resistant to ampicillin, tetracycline, and nitrofurantoin were paired-end sequenced at Beckman-Coulter.

Statistical analyses

Differences in the frequency of resistance-positive clones, the frequency of known AR genes, and the frequency of marine taxa between locations and between antibiotics were tested using the Kruskal-Wallis test in R (R Core Team 2013). Differences in composition of known AR genes and unclassified AR genes between locations and antibiotics were tested using permutational ANOVA from the ‘vegan’ package in R (Oksanen et al. 2013) with 999 permutations.
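The analyses in this study were run in R. Purely as an illustration of what the Kruskal-Wallis test computes, the sketch below re-implements the H statistic (with the standard tie correction) in plain Python. It is not the code used here; the function name is hypothetical, and a p-value would still have to be obtained by comparing H to a chi-square distribution with (number of groups − 1) degrees of freedom.

```python
from itertools import chain


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic for two or more groups of observations."""
    if len(groups) < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")

    pooled = sorted(chain.from_iterable(groups))
    n = len(pooled)

    # Assign average ranks to tied values and accumulate the tie term.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                                 # size of this tie block
        rank_of[pooled[i]] = (i + 1 + j) / 2.0    # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j

    # H = 12 / (N (N + 1)) * sum(R_k^2 / n_k) - 3 (N + 1)
    rank_sum_term = 0.0
    for g in groups:
        r = sum(rank_of[x] for x in g)
        rank_sum_term += r * r / len(g)
    h = 12.0 / (n * (n + 1)) * rank_sum_term - 3.0 * (n + 1)

    correction = 1.0 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction
```

Each argument would hold the replicate values for one location or one antibiotic, for example the per-replicate frequencies of resistant clones. The permutational ANOVA is a separate, multivariate procedure and is not covered by this sketch.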

Results:

In order to quantify the extent and diversity of AR genes in marine environments, we applied functional metagenomics to screen DNA from five marine sites against four antibiotics (ampicillin, tetracycline, sulfadimethoxine, and nitrofurantoin), which differed in their mode of activity (Table S3.1). We found resistant clones at all sites, with mean frequencies ranging from 1.6 × 10⁻⁶ to 8.7 × 10⁻⁵ AR positives per transformant (Figure 3.1, Table S3.2). Environments differed significantly in their frequencies, with Los Angeles (LA) Harbor having the highest average frequency of resistant clones (8.7 × 10⁻⁵) (Figure 3.1, Table S3.2). The environment with the second highest frequency of resistant clones was the open-ocean site at the Hawaii Ocean Time-series (HOT) (7.2 × 10⁻⁵), and the lowest frequency was observed at the Agua Hedionda Lagoon, Carlsbad, CA (1.6 × 10⁻⁶).
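The frequencies reported above are expressed as AR positives per transformant. A minimal Python sketch of that calculation is shown here, assuming the frequency for each replicate is the number of resistant colonies divided by the number of transformants plated, and that the site or antibiotic mean is the arithmetic mean over replicates. The function names and example counts are hypothetical.

```python
from typing import Iterable, Tuple


def resistance_frequency(resistant_clones: int, transformants_plated: float) -> float:
    """AR positives per transformant for a single screen."""
    if resistant_clones < 0 or transformants_plated <= 0:
        raise ValueError("resistant_clones must be >= 0 and transformants_plated > 0")
    return resistant_clones / transformants_plated


def mean_frequency(replicates: Iterable[Tuple[int, float]]) -> float:
    """Arithmetic mean of per-replicate frequencies.

    replicates: (resistant_clones, transformants_plated) pairs
    """
    freqs = [resistance_frequency(r, t) for r, t in replicates]
    if not freqs:
        raise ValueError("at least one replicate is required")
    return sum(freqs) / len(freqs)
```

Under these assumptions, 87 resistant colonies from 10⁶ plated transformants corresponds to a frequency of 8.7 × 10⁻⁵.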

Frequencies of resistance also varied between antibiotics (Figure 3.1, Table S3.2). Nitrofurantoin had the highest frequency of resistant clones (mean = 9.7 × 10⁻⁵ per transformant), while sulfadimethoxine had the lowest frequency (mean = 9.4 × 10⁻⁶ per transformant). This is noteworthy because both sulfadimethoxine and nitrofurantoin are fully synthetic antibiotics and thus not produced by microorganisms.

We next examined the diversity of resistance genes. We divided the genes into two main categories: previously known vs. unclassified AR genes (Figure 3.2 and Table S3.3). Known genes made up 28% of the sequenced clones and ranged between 10% and 100% in amino acid similarity to other AR genes (Figure 3.2B). Of the known AR genes, the majority were identified as multi-drug efflux pumps (bcr, 36%) or beta-lactamases (bl2b_tem1, 29%) (Figure 3.2A). TEM1 beta-lactamase (bl2b_tem1) shared the highest similarity to sequences in the Antibiotic Resistance Genes Database (ARDB) (>80%) (Figure 3.2B). The multi-drug efflux pump bcr was represented by two groups differing in their similarity to other AR genes (Figure 3.2B). In addition, the genes that conferred resistance to nitrofurantoin and sulfadimethoxine all shared lower than 75% sequence similarity to other AR genes (Figure 3.2B).

The sites did not differ significantly in their overall frequencies of known AR gene types (Table S3.2). However, sites did vary in the composition of known AR genes based on a permutational ANOVA (Table S3.2), so each contained a different set of AR genes. For instance, the Agua Hedionda Lagoon was characterized by an abundance of bicyclomycin pumps (bcr), while the open-ocean HOT site had mostly TEM1 beta-lactamases (bl2b_tem1). In addition, the frequency of known AR genes did not differ by antibiotic, but there was a difference in the composition of known AR genes on each antibiotic (Figure 3.2 and Table S3.2). Tetracycline screens had a high frequency of bicyclomycin resistance pumps (69%), followed by the tetracycline efflux pump tet41 (11%). Ampicillin screens mostly contained beta-lactamase genes, of which the majority were TEM1 (bl2b_tem1) (65%). Other beta-lactamases were present, but at lower frequencies (2.5–3% per gene). There were few known AR genes isolated on synthetic sulfadimethoxine and nitrofurantoin (Figure 3.2). Those isolated on sulfadimethoxine included target-modified dihydropteroate synthase (sul1, sul2, and sul3) and target-modified dihydrofolate reductase (dfra24). Three types of known resistance genes were found on nitrofurantoin: the ABC transporter bcrA, the MFS transporter rosB, and the penicillin binding protein pbp2. Each of these genes represented 33% of the known AR genes found on nitrofurantoin.

The majority of genes identified (72%) did not match known AR genes in either ARDB or GenBank (Table S3.3). We therefore grouped these genes by function (e.g. oxidoreductases, , transport pump protein, etc.). Similar to the known AR genes, we also observed differences in the composition of gene functions present between locations (Table S3.2). Genes of unknown function formed the plurality at HOT (47%), Newport Bay (25%) and Agua Hedionda Lagoon (19%), while they represented 13% and 14% of the genes at San Pedro Channel and LA Harbor, respectively. Beyond these unknown types, HOT and Newport Bay were characterized by hydrolases, which comprised 12% and 16% of the gene functions identified there, respectively. Newport Bay was also dominated by ligases (19% of gene functions). LA Harbor and San Pedro were both defined by the same gene functions, which included oxidoreductases (19% and 31%, respectively), ligases (14% and 15%, respectively), and (14% and 11%, respectively). Agua Hedionda Lagoon was characterized by oxidoreductases, DNA binding proteins, and transporters (15%, 12%, and 10% of the functional types, respectively).

Across antibiotics, the majority of unclassified AR genes were detected on tetracycline (44%), followed by ampicillin (27%) and then nitrofurantoin and sulfadimethoxine (both 14%). Over half of resistance (54%) to ampicillin was conferred by genes of unknown function. Resistance to tetracycline was predominantly conferred by oxidoreductases, ligases, DNA binding proteins, and regulatory proteins. Transferases and hydrolases were the dominant functional types isolated on sulfadimethoxine, and nitrofurantoin resistance was predominantly conferred by oxidoreductases and ligases.

We next asked which taxa hosted the genes and whether the organisms were native to marine environments. Each sample contained a mixture of putative marine bacteria (44%), non-marine bacteria (25%), eukaryotes (10%), vectors (4%), viruses (1%), and unassigned taxa (15%) (Figure 3.1, 3.2B, and 3.3). The marine bacterial taxa included abundant lineages like Pelagibacter, Prochlorococcus, and Roseobacter (Figure 3.3). The putative non-marine taxa were characterized by Escherichia, Parvibaculum, Flavobacterium, and Rhodobacteraceae (Figure 3.3). Across environments, marine bacteria made up the majority of resistant clones except at HOT, which was characterized by an even distribution of marine and non-marine taxa (Figure 3.1). The resistant marine bacteria at HOT were dominated by Prochlorococcus (29%), Pelagibacter (21%), and Vibrio (18%), whereas sites closer to shore like Newport Bay and LA Harbor were dominated by Ruegeria (19% and 15%, respectively). The San Pedro Channel was dominated by Pelagibacter (22%) and Roseovarius (17%), and Agua Hedionda Lagoon was dominated by Octadecabacter (16%).

The composition of resistant taxa also varied across antibiotics. Ampicillin resistance was mostly found in non-marine taxa (43%), while resistance to tetracycline and nitrofurantoin was identified predominantly in marine taxa (57% and 56%, respectively). Resistance to sulfadimethoxine was found in both marine and non-marine taxa (43% and 42%, respectively). Among the marine taxa, resistance to ampicillin was primarily observed in Vibrio (19%) and Prochlorococcus (12%). Resistance to tetracycline was predominantly found in Octadecabacter (13%), Roseovarius (12%), and Ruegeria (12%). Resistance to nitrofurantoin and sulfadimethoxine was mostly observed in Pelagibacter (30% and 18%, respectively), while resistance to sulfadimethoxine was also found in Puniceispirillum (15%).

Both known and unclassified AR genes were found in marine taxa (Tables S3.4 and S3.5). Overall, the dominant AR gene type was the bicyclomycin resistance pump (bcr). This gene was predominantly found in Octadecabacter and Dinoroseobacter (Table S3.4). The next most abundant known AR gene was the TEM1 beta-lactamase, which was only found in Vibrio (Table S3.4). Organisms such as Prochlorococcus contained genes like rosB (a putative potassium antiporter), vansd and vanxd (D-alanine-D-alanine activity) (Table S3.4). Among the marine taxa, the gene functions of previously unclassified AR genes were predominantly ligases (18%), oxidoreductases (15%), and DNA binding proteins (15%) (Table S3.5). Ruegeria contained the most unclassified AR genes (13% of the total), with Pelagibacter, Roseovarius, and Roseobacter each hosting 9% of the total unclassified AR genes (Table S3.5).

Some marine taxa were host to several functional types. For example, Roseobacter was host to AR-conferring DNA binding proteins, oxidoreductases, regulators, and transferases, and Pelagibacter was characterized by transferases, ligases, , and oxidoreductases (Table S3.5). Some taxa predominantly hosted one or two types, such as Ruegeria, which was characterized by ligases and DNA binding proteins, and Silicibacter, which hosted AR-conferring regulators (Table S3.5).

It was unexpected to find a range of AR genes in abundant open-ocean bacteria like Synechococcus, Prochlorococcus and Pelagibacter. To examine this further, we also analyzed genomic clone libraries of the antibiotic-sensitive marine cyanobacterium Synechococcus WH8102 and of Escherichia coli. We screened for the presence of resistance genes using the same procedure as with the environmental samples. Although the strains themselves were sensitive to antibiotics, we found multiple individual genes conferring resistance (Figure 3.4). In E. coli we identified both previously known AR genes (e.g. beta-lactamases) and unclassified AR genes (Table S3.6). The majority of known AR genes conferred resistance to ampicillin and were identified as either the beta-lactamase bl1_ec or the multi-drug efflux pump ykkC (Table S3.6). The hits to bl1_ec had high amino acid similarity (>72%) to sequences in the ARDB, while hits to ykkC had lower amino acid similarity (<58%) (Table S3.6). Resistance to tetracycline included known antibiotic efflux pumps such as bcr, carA, macB, and marA, and known resistance to nitrofurantoin was conferred by the penicillin binding protein pbp2 (Table S3.6). In contrast, there were no known AR genes in the Synechococcus libraries.

The unclassified AR genes from E. coli included hypothetical proteins, oxidoreductases, transport proteins, regulatory proteins, etc. (Table S3.6). For Synechococcus, we did identify unclassified AR genes that conferred resistance to nitrofurantoin, which were a combination of hypothetical proteins, regulatory proteins, an oxidoreductase, a , a , a , a ligase, and an (Table S3.7). Thus, we clearly identified AR genes in antibiotic-sensitive strains of both marine Synechococcus and E. coli.

Discussion:

Using a functional metagenomic assay, we identified antibiotic resistance genes at multiple marine sites and resistance to four antibiotics. In addition, these genes were found in a range of marine bacteria, and while some of the genes were known antibiotic resistance genes, the majority were previously unclassified as such. Our results were consistent with the limited past studies of marine samples, which were based on cell culture, polymerase chain reaction, and sequence-based metagenomics (Dang et al. 2007, Baker-Austin et al. 2009, Port et al. 2012). However, this study greatly expands our knowledge of the diversity of organisms and genes responsible for these resistance patterns by linking resistance phenotypes to genotypes and identifying potentially novel resistance genes. While the estimated frequencies of resistance we found (up to 0.9% of cells) are lower than those estimated in gulls (up to 5% resistant cells) (Martiny et al. 2011) and aquaculture sediments (up to 15% resistant cells) (Buschmann et al. 2012), this study highlights the sheer abundance and diversity of potential AR genes across marine sites.

Based on the classic hypothesis, antibiotic resistance evolved in response to exposure to antibiotics. We did find genes that are known to confer resistance in marine and non-marine bacteria, which supports this hypothesis. The resistance we found to nitrofurantoin and sulfadimethoxine could be the result of antibiotic effluents. Because nitrofurantoin and sulfadimethoxine are synthetic antibiotics, microbes would not experience these molecules in natural microbe-microbe interactions. Resistant bacteria have been isolated from marine environments in proximity to aquaculture facilities or waterways in proximity to human influence (Alonso et al. 2001, Dang et al. 2007, Buschmann et al. 2012, Zhao and Dang 2012). In addition, Port and colleagues (2012) showed that frequencies of known antibiotic resistance genes were higher near-shore, so anthropogenic inputs can influence resistance patterns. Based on these observations, we would expect a lower frequency of AR genes in the open-ocean samples versus samples closer to shore. However, we found that frequencies of known AR genes were similar in offshore (e.g. HOT) and near-shore environments (e.g. LA Harbor). Thus, it appears that human influence is not sufficient to explain our results. In addition, there are possible antagonistic in situ microbial interactions known to occur on particles (Long and Azam 2001). As we removed particles by pre-filtering, particle-associated interactions are not likely to be an important mechanism in our system. However, microbial interactions can possibly also occur in the water column using diffusible compounds (Long and Azam 2001, Sher et al. 2011). If such interactions are common, this can lead to the widespread occurrence of AR genes as observed here.

Our second hypothesis was that the biocomplexity of these diverse communities would increase the probability of finding genes that conferred resistance. Past studies have shown that non-antibiotic efflux pumps can be used to transport antibiotics out of cells (Pike et al. 2002, De Souza et al. 2006, Martinez et al. 2009), and E. coli adapted to high temperature in antibiotic-free culture also evolved resistance to rifampicin (Rodríguez-Verdugo et al. 2013). In support of this, we found many genes that conferred resistance in our assay, and the functions of these genes are similar to known AR genes (e.g. transporters, oxidoreductases, hydrolases), but their primary function may not be associated with resistance. We also found that genes from two antibiotic-sensitive strains could confer resistance in a new host, and we saw resistance to synthetic antibiotics like sulfadimethoxine and nitrofurantoin. This finding was unexpected but may partly explain the high occurrence and diversity of AR genes in other environments like the human gut (Sommer et al. 2009). Thus, organisms living in the oceans (and other environments) may harbor a large potential reservoir of AR genes that can be activated under the right circumstances (e.g., with the right promoter) but do not currently cause resistance.

AR is no longer just a clinical issue (Forsberg et al. 2012). The discovery of natural reservoirs of resistance genes highlights the role natural environments play in the dissemination of AR genes. Here, we add oceans to the list. Covering 70% of Earth’s surface area, oceans host on the order of 10²⁹ bacterial cells (Whitman et al. 1998). Given that oceans are prone to rapid vertical and horizontal transport of water masses and are a key element of human commerce and recreation, this environment may be an important global reservoir of AR genes. Furthermore, given that most of the genes we identified may not be AR genes in the strictest sense, this global reservoir exists in the absence of human impact and is a potential source of AR genes in the proper selective environment.

References:

Allen, H. K., L. A. Moe, J. Rodbumrer, A. Gaarder, and J. Handelsman. 2009.

Functional metagenomics reveals diverse beta-lactamases in a remote Alaskan

soil. The ISME Journal 3:243–251.

Allison, S. D. 2005. Cheaters, diffusion and nutrients constrain decomposition by

microbial enzymes in spatially structured environments. Ecology Letters 8:626–635.

Alonso, A., P. Sanchez, and J. L. Martinez. 2001. Environmental selection of antibiotic

resistance genes. Environmental Microbiology 3:1–9.

Baker-Austin, C., J. V. McArthur, A. H. Lindell, M. S. Wright, R. C. Tuckfield, J. Gooch,

L. Warner, J. Oliver, and R. Stepanauskas. 2009. Multi-site analysis reveals

widespread antibiotic resistance in the marine pathogen Vibrio vulnificus. Microbial

Ecology 57:151–159.

Boström, K., K. Simu, A. Hagstrom, and L. Riemann. 2004. Optimization of DNA

extraction for quantitative marine bacterioplankton community analysis. Limnology

and Oceanography: Methods 2:365–373.

Buschmann, A. H., A. Tomova, A. López, M. A. Maldonado, L. A. Henríquez, L. Ivanova,

F. Moy, H. P. Godfrey, and F. C. Cabello. 2012. Salmon aquaculture and

antimicrobial resistance in the marine environment. PLOS ONE 7:e42724.

Cordero, O. X., H. Wildschutte, B. Kirkup, S. Proehl, L. Ngo, F. Hussain, F. Le Roux, T.

Mincer, and M. F. Polz. 2012. Ecological populations of bacteria act as socially

cohesive units of antibiotic production and resistance. Science (New York, N.Y.)

337:1228–1231.

Dang, H., X. Zhang, L. Song, Y. Chang, and G. Yang. 2007. Molecular determination of

oxytetracycline-resistant bacteria and their resistance genes from mariculture

environments of China. Journal of Applied Microbiology 103:2580–2592.

Forsberg, K. J., A. Reyes, B. Wang, E. M. Selleck, M. O. A. Sommer, and G. Dantas.

2012. The shared antibiotic resistome of soil bacteria and human pathogens.

Science (New York, N.Y.) 337:1107–1111.

Foti, M., C. Giacopello, T. Bottari, V. Fisichella, D. Rinaldo, and C. Mammina. 2009.

Antibiotic resistance of Gram negatives isolates from loggerhead sea turtles

(Caretta caretta) in the central Mediterranean Sea. Marine Pollution Bulletin

58:1363–1366.

Gibson, M. K., K. J. Forsberg, and G. Dantas. 2014. Improved annotation of antibiotic

resistance determinants reveals microbial resistomes cluster by ecology. The ISME

Journal:1–10.

Liu, B., and M. Pop. 2009. ARDB-Antibiotic Resistance Genes Database. Nucleic Acids

Research 37:D443–447.

Long, R., and F. Azam. 2001. Antagonistic interactions among marine pelagic bacteria.

Applied and Environmental Microbiology 67:4975–4983.

Lutz, R., and H. Bujard. 1997. Independent and tight regulation of transcriptional units in

Escherichia coli via the LacR/O, the TetR/O and AraC/I1-I2 regulatory elements.

Nucleic Acids Research 25:1203–1210.

Martínez, J. L. 2008. Antibiotics and antibiotic resistance genes in natural environments.

Science (New York, N.Y.) 321:365–367.

Martinez, J. L., M. B. Sánchez, L. Martínez-Solano, A. Hernandez, L. Garmendia, A.

Fajardo, and C. Alvarez-Ortega. 2009. Functional role of bacterial multidrug efflux

pumps in microbial natural ecosystems. FEMS Microbiology Reviews 33:430–449.

Martiny, A. C., J. B. H. Martiny, C. Weihe, A. Field, and J. C. Ellis. 2011. Functional

metagenomics reveals previously unrecognized diversity of antibiotic resistance

genes in gulls. Frontiers in Microbiology 2:238.

Miller, R. V., K. Gammon, and M. J. Day. 2009. Antibiotic resistance among bacteria

isolated from seawater and penguin fecal samples collected near Palmer Station,

Antarctica. Canadian Journal of Microbiology 55:37–45.

Oksanen, A. J., R. Kindt, P. Legendre, B. O. Hara, G. L. Simpson, M. H. H. Stevens,

and H. Wagner. 2013. vegan: Community Ecology Package.

Pignatelli, M., A. Moya, and J. Tamames. 2009. EnvDB, a database for describing the

environmental distribution of prokaryotic taxa. Environmental Microbiology Reports

1:191–197.

Pike, R., V. Lucas, P. Stapleton, M. S. Gilthorpe, G. Roberts, R. Rowbury, H. Richards,

P. Mullany, and M. Wilson. 2002. Prevalence and antibiotic resistance profile of

mercury-resistant oral bacteria from children with and without mercury amalgam

fillings. Journal of Antimicrobial Chemotherapy 49:777–783.

Port, J. A., J. C. Wallace, W. C. Griffith, and E. M. Faustman. 2012. Metagenomic

profiling of microbial composition and antibiotic resistance determinants in Puget

Sound. PLOS ONE 7:e48000.

R Core Team. 2013. R: A language and environment for statistical computing.

Riesenfeld, C. S., R. M. Goodman, and J. Handelsman. 2004. Uncultured soil bacteria

are a reservoir of new antibiotic resistance genes. Environmental Microbiology

6:981–989.

Rodríguez-Verdugo, A., B. S. Gaut, and O. Tenaillon. 2013. Evolution of Escherichia

coli rifampicin resistance in an antibiotic-free environment during thermal stress.

BMC Evolutionary Biology 13:50.

Segawa, T., N. Takeuchi, A. Rivera, A. Yamada, Y. Yoshimura, G. Barcaza, K. Shinbori,

H. Motoyama, S. Kohshima, and K. Ushida. 2013. Distribution of antibiotic

resistance genes in glacier environments. Environmental Microbiology Reports

5:127–134.

Sher, D., J. W. Thompson, N. Kashtan, L. Croal, and S. W. Chisholm. 2011. Response

of Prochlorococcus ecotypes to co-culture with diverse marine bacteria. The ISME

Journal 5:1125–1132.

Sommer, M. O. A., G. Dantas, and G. M. Church. 2009. Functional characterization of

the antibiotic resistance reservoir in the human microflora. Science (New York,

N.Y.) 325:1128–1131.

De Souza, M.-J., S. Nair, P. A. Loka Bharathi, and D. Chandramohan. 2006. Metal and

antibiotic-resistance in psychrotrophic bacteria from Antarctic marine waters.

Ecotoxicology (London, England) 15:379–384.

Whitman, W. B., D. C. Coleman, and W. J. Wiebe. 1998. Prokaryotes: The unseen

majority. Proceedings of the National Academy of Sciences, USA 95:6578–6583.

Zhao, J., and H. Dang. 2012. Coastal seawater bacteria harbor a large reservoir of

plasmid-mediated quinolone resistance determinants in Jiaozhou Bay, China.

Microbial Ecology 64:187–199.

Acknowledgments: We thank Alyssa Kent for collecting seawater from the Hawaii Ocean Time Series and Jennifer Martiny for helpful comments on the manuscript. In addition, we would like to acknowledge funding from the Department of Education Graduate Assistance in Areas of National Need fellowship P200A120144.

[Figure 3.1: heatmap of log10(Frequency), color scale from -6.5 to -2. Rows: LA1, LA2, SP1, SP2, AH1, AH2, HOT1, HOT2, NB1, NB2. Columns: Sul, Tet, Nit, Amp. Host organism legend: Marine, Non-marine, Eukaryote, Unknown, Plasmid, Virus.]

Figure 3.1. Antibiotic resistant clone frequencies across samples and antibiotics. White squares represent low frequency, and darker squares represent higher frequency. Rows represent sample sites: Los Angeles Harbor (LA), San Pedro Channel (SP), Hawaii Ocean Time Series (HOT), Newport Bay (NB), and Agua Hedionda Lagoon (AH). Numbers next to sample sites represent replicate samples. Columns represent antibiotics on which clones were screened: sulfadimethoxine (Sul), tetracycline (Tet), nitrofurantoin (Nit), and ampicillin (Amp). 'Other' taxa represent bacterial sequences that could not be assigned to 'marine' or 'non-marine' because the taxonomic designation was too broad (e.g., Alphaproteobacteria).

[Figure 3.2: (A) Gene abundance (y-axis, 0 to 60) by gene (x-axis), in panels by antibiotic (Ampicillin, Tetracycline, Nitrofurantoin, Sulfadimethoxine), colored by Site (AH, HOT, LA, NB, SP). Gene labels: bcr, tetz, tetc, teta, sul1, neo, sul3, sul2, yajR, bcra, rosb, tet41, MFS, pbp2, matE, marA, vanre, marR, vansd, vanxd, vanhd, vanhb, dfra24, cml_e1, cml_e3, bl3_gim, bl1_pse, bl3_shw, bl2b_tem, bl2c_pse1, bl2b_tem1, bl2d_oxa2. (B) Percent identity to top hit (y-axis, 0 to 100), with symbols by Organism (Non-marine, Marine, Eukaryote, Unknown, Vector).]

Figure 3.2. The abundance and percent amino acid identity to known AR genes. (A) The abundance of known AR genes across sample sites. (B) The percent identity between AR genes from marine environments and their top hits in the Antibiotic Resistance Gene Database or GenBank. Each symbol represents one sequence; the symbol type indicates the host organism carrying the gene, identified using GenBank and EnvDB.

[Figure 3.3: relative abundance of taxa, grouped as Marine and Non-marine. Legend labels, in extracted order: Pelagibacter, Octadecabacter, Flavobacterium, Polaribacter, Flavobacteria, Oceanicola, Parvibaculum, Escherichia, Prochlorococcus, Marine [?]-proteobacterium, Pedobacter, Puniceispirillum, Loktanella, Prevotella, Clostridium, Erythrobacter, Roseobacter, Burkholderia, Dinoroseobacter, Pseudomonas, Acinetobacter, Rhodobacteraceae, Roseovarius, Rhodobacterales, Other, Variovorax, Ruegeria, Silicibacter, Synechococcus, Vibrio, Other, Uncultured marine group II euryarchaeote.]

Figure 3.3. Relative abundance of marine and non-marine bacterial taxa within all samples. “Other” groups together taxa that each made up less than 2% relative abundance.

[Figure 3.4: bar chart. x-axis: Antibiotic (Amp, Nit, Sul, Tet). y-axis: Frequency (number of positive clones/transformant), 0 to 0.0002. Bars by Strain: E. coli, Synechococcus WH8102.]

Figure 3.4. The frequency of antibiotic resistant clones from the genomic libraries of E. coli (red bars) and Synechococcus WH8102 (blue bars) on each of four antibiotics: ampicillin (Amp), nitrofurantoin (Nit), sulfadimethoxine (Sul), and tetracycline (Tet).
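The frequency plotted in Figures 3.1 and 3.4 is the number of positive (resistant) clones per transformant, shown on a log10 scale in Figure 3.1. The following is a minimal Python sketch of that arithmetic, not the code used in this study. The function names are hypothetical, the floor of 10^-6.5 is an assumption borrowed from the lower end of the Figure 3.1 color scale so that zero counts remain plottable, and the counts in the example are invented for illustration.

```python
import math


def clone_frequency(positive_clones: int, transformants: int) -> float:
    """Return resistant clones per transformant screened."""
    if transformants <= 0:
        raise ValueError("transformants must be positive")
    return positive_clones / transformants


def log10_frequency(freq: float, floor: float = 10 ** -6.5) -> float:
    """Return log10 of a frequency, clamped at an assumed floor.

    The floor keeps samples with zero positive clones on the scale.
    """
    return math.log10(max(freq, floor))


# Hypothetical counts, for illustration only (not data from this study).
freq = clone_frequency(positive_clones=20, transformants=100_000)
print(freq, log10_frequency(freq))
```

Clamping at a floor is only one way to handle zero counts; adding a pseudocount before taking the logarithm would be an alternative.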


Table S3.1. The minimum inhibitory concentrations used for the antibiotic screens. The concentrations inhibited growth of E. cloni 10G cells (Lucigen, Middleton, WI) carrying empty pZE-21 plasmid.

Antibiotic | Natural/synthetic | Minimum inhibitory concentration (µg/ml) | Antibiotic activity
Ampicillin | Semi-synthetic | 60 | Arrests cell wall synthesis
Tetracycline | Natural | 8 | Inhibits protein synthesis
Sulfadimethoxine | Synthetic | 700 | Inhibits folic acid synthesis
Nitrofurantoin | Synthetic | 5 | Damages intracellular macromolecules


Table S3.2. Statistical results for Kruskal-Wallis tests and permutational analysis of variance.

Model | Kruskal-Wallis χ² | Pseudo-F | d.f. | P-value
positive clone frequencies across sites | 15.6 | - | 4 | 0.004*
positive clone frequency across antibiotics | 9.1 | - | 3 | 0.03*
frequency of known AR genes across sites | 7.4 | - | 4 | 0.1
frequency of known AR genes across antibiotics | 6.7 | - | 3 | 0.08
composition of known AR genes across sites | - | 4.1 | 4 | 0.01*
composition of known AR genes across antibiotics | - | 4.6 | 3 | 0.009*
composition of gene functions across sites | - | 3.6 | 4 | 0.004*
frequency of marine taxa across sites | 7.6 | - | 4 | 0.1
frequency of marine taxa across antibiotics | 5.2 | - | 3 | 0.2
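The Kruskal-Wallis rows of Table S3.2 can be illustrated with a short worked example. The sketch below is written in plain Python and is not the analysis code used in this study (the reference list cites R and the vegan package). It computes the tie-corrected H statistic from groups of observations and converts a chi-square statistic to an upper-tail P-value using the closed-form series for integer degrees of freedom. The function names and the toy groups are hypothetical; only the two statistics in the final lines are taken from Table S3.2.

```python
import math
from itertools import chain


def average_ranks(values):
    """Rank values from 1..N, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def kruskal_wallis_h(groups):
    """Return (tie-corrected H, degrees of freedom) for a list of groups."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks, tie_sizes = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction, len(groups) - 1


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= half / r
            total += term
        return math.exp(-half) * total
    total, term = 0.0, math.sqrt(2 * x / math.pi)
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


# Toy data (hypothetical): three fully separated groups of three values.
h, df = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(h, df, chi2_sf(h, df))

# Statistics reported in Table S3.2 (15.6 with 4 d.f., 9.1 with 3 d.f.).
print(round(chi2_sf(15.6, 4), 3), round(chi2_sf(9.1, 3), 2))
```

The last line recomputes P-values from the reported test statistics and degrees of freedom, which gives a quick consistency check on the first two rows of the table. The permutational analysis of variance rows (Pseudo-F) rely on permutation and are not covered by this sketch.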


Table S3.3. The abundance of previously uncharacterized AR genes across sample site and host organism. These are sequences that did not match sequences in the ARDB. Gene names are the top hit to the non-redundant database in GenBank.

Columns AH to SP give counts by sample site; columns Non-marine to Virus give counts by host organism.

Gene | AH | HOT | LA | NB | SP | Non-marine | Marine | Eukaryote | Unknown | Virus
(S)-2-hydroxy-acid oxidase | 0 | 0 | 0 | 0 | 2 | 0 | 2 | 0 | 0 | 0
2-oxoglutarate dehydrogenase | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0
3-alpha,7-alpha,12-alpha-trihydroxy-5-beta-cholest-24-enoyl-CoA hydratase | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
3-hydroxyacyl-CoA dehydrogenase | 2 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0
3-isopropylmalate dehydratase | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0
6-phosphofructokinase | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0
ABC transporter | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 0
Acetyl/propionyl-CoA carboxylase | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Acetylornithine aminotransferase | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Acetyltransferase | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Acyl-CoA thioesterase | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0
Adenylate cyclase | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Adenylate/Guanylate Cyclase | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Alanyl-tRNA synthetase | 0 | 0 | 0 | 5 | 0 | 0 | 5 | 0 | 0 | 0
Alcohol dehydrogenase | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0
Aldehyde dehydrogenase | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Alpha/beta fold family hydrolase | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Amidotransferase | 0 | 2 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0
Ammonium transporter family protein | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0
Ankyrin repeat protein | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0
Anthranilate synthase | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0
Arabinose efflux protein | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0
Aspartate aminotransferase | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Aspartate/ornithine carbamoyltransferase | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0
ATP-dependent metalloprotease | 0 | 0 | 0 | 8 | 0 | 8 | 0 | 0 | 0 | 0
Beta-carbonic anhydrase | 9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 9 | 0
Beta-ketoacyl synthase | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Betaine aldehyde dehydrogenase | 5 | 0 | 1 | 0 | 0 | 0 | 6 | 0 | 0 | 0
Bifunctional deaminase-reductase | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0
Bifunctional dihydrofolate/folylpolyglutamate synthase | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
IMP cyclohydrolase | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0
Carbonate dehydratase | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Carboxyl transferase | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Chaperonin | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0
Choline dehydrogenase | 0 | 0 | 0 | 0 | 7 | 0 | 7 | 0 | 0 | 0
Chromate transporter | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Conserved hypothetical protein | 11 | 1 | 1 | 0 | 0 | 8 | 3 | 0 | 2 | 0
Cysteine desulfurase | 0 | 2 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0
D-ala-D-ala dipeptidase | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
D-amino acid oxidase | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
DEAD/DEAH box helicase | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Delta 1-pyrroline-5-carboxylate reductase | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Diaminopimelate epimerase | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Dienelactone hydrolase | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0
Dihydrolipoamide dehydrogenase | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 2 | 0
Dihydroneopterin aldolase | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0
Dihydroxy-acid dehydratase | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0
DNA gyrase | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
DNA helicase | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0
DNA mismatch | 1 | 1 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0
DNA polymerase III | 3 | 0 | 0 | 1 | 0 | 1 | 3 | 0 | 0 | 0
DNA primase | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
DNA topoisomerase | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
DNA-binding protein | 21 | 0 | 1 | 6 | 1 | 1 | 28 | 0 | 0 | 0
DNA-directed RNA polymerase | 0 | 0 | 2 | 0 | 0 | 0 | 2 | 0 | 0 | 0
DSBA oxidoreductase | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0
DSBA-like thioredoxin domain protein | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Esterase | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Exonuclease domain protein | 0 | 2 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0
FAD dependent oxidoreductase/aminomethyl transferase | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0
FAD-binding domain protein | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
FAD/FMN-containing dehydrogenase | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Ferrochelatase | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
FeS assembly ATPase SufC | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0
FeS assembly protein SufB | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
FeS assembly protein SufD | 0 | 2 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0
Fmu | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Formate dehydrogenase | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Formate--tetrahydrofolate ligase | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0
Fumarate reductase/succinate dehydrogenase | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0
gag polyprotein | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0
Gamma-glutamyl phosphate reductase | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0
Glucokinase | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Gluconolactonase | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Glutamate synthase | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Glycine dehydrogenase | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0
Glycyl-tRNA synthetase | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0
Glyoxylate reductase | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0
GMC oxidoreductase | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
GP30.3 family protein | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
GTP cyclohydrolase | 1 | 0 | 1 | 0 | 0 | 2 | 0 | 0 | 0 | 0
HlyD family secretion protein | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Homoserine/homoserine lactone efflux protein | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Hypothetical membrane protein | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Hypothetical protein | 30 | 31 | 6 | 17 | 5 | 10 | 22 | 45 | 4 | 8
Hypothetical transport protein | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Inosine-5-monophosphate dehydrogenase | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Integral membrane protein DUF6 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Integrase catalytic region | 2 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0
Ion channel | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0
IS1 transposase | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0
Isocitrate dehydrogenase | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0
Isoleucyl-tRNA synthetase | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0
Ketol-acid reductoisomerase | 9 | 0 | 1 | 1 | 0 | 0 | 4 | 0 | 7 | 0
L-sorbosone dehydrogenase | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Leucyl-tRNA synthetase | 1 | 0 | 0 | 1 | 0 | 0 | 2 | 0 | 0 | 0
Lon-B peptidase | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0
Long-chain fatty-acid-CoA ligase | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
LrgB-like family | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0
LysM domain-containing protein | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Major facilitator family protein | 6 | 0 | 1 | 2 | 0 | 6 | 2 | 0 | 1 | 0
Malate dehydrogenase | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Mandelate racemase | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0
MFS-type transporter | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Molecular chaperone DnaK | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Molybdenum biosynthesis protein A | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0
N6-adenine-specific methylase | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0
Na+/Ca2+-exchanging protein | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
NADPH-ferredoxin reductase | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0
Nitrate reductase | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0
Nitrite and sulfite reductase | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Nitrogen regulatory protein PII | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0
O-methyltransferase | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0
Oligoendopeptidase F | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0
Oligoribonuclease | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Orphan protein | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Outer membrane receptor | 0 | 0 | 2 | 0 | 0 | 0 | 2 | 0 | 0 | 0
Oxidoreductase, GMC family protein | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0
parB-like protein partition protein | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0
Peptidase | 2 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0
Phage integrase | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0
Phage tail fiber protein | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0
Phenylacetic acid degradation protein | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Phosphoenolpyruvate carboxylase | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0
Phosphoglucomutase/phosphomannomutase | 0 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 0
Phosphoglucosamine mutase | 0 | 0 | 0 | 3 | 0 | 1 | 2 | 0 | 0 | 0
Phosphomethylpyrimidine kinase | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0
Phosphoribosylamino-imidazole-carboxylase | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 5 | 0 | 0
Phosphoribosylanthranilate isomerase | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0
Phosphoribosylformylglycinamidine synthase | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0
Phosphoserine aminotransferase | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0
Pleckstrin | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0
Polyprenyl P-hydroxybenzoate and phenylacrylic acid decarboxylases | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Porin | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0
Porphobilinogen deaminase | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Predicted protein | 1 | 0 | 3 | 2 | 3 | 0 | 0 | 9 | 0 | 0
Probable oxidoreductase | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0
Propionyl-CoA carboxylase | 14 | 0 | 9 | 6 | 4 | 7 | 26 | 0 | 0 | 0
Protein-L-isoaspartate O-methyltransferase | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Protocatechuate 4,5-dioxygenase | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0
Putative Gingipain R | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Pyridoxamine 5'-phosphate oxidase | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Pyruvate decarboxylase | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0
Quinolinate synthetase | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Recombinase A | 0 | 0 | 1 | 1 | 0 | 0 | 2 | 0 | 0 | 0
Recombination protein recR | 6 | 0 | 0 | 0 | 0 | 0 | 6 | 0 | 0 | 0
RecX family transcriptional regulator | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0
Resolvase | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Ribose 5-phosphate isomerase A | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0
Ribose-phosphate pyrophosphokinase | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Ribosomal large subunit pseudouridine synthase D | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
rrf2 transcriptional regulator | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Sarcosine oxidase | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0
SecD export membrane protein | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0
Secreted protein | 2 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0
Segregation and condensation protein A | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Shikimate kinase I | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Short chain dehydrogenase/reductase | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Single-strand binding protein | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0
Sodium-glucose/galactose cotransporter | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Sodium/solute symporter | 3 | 0 | 1 | 0 | 0 | 0 | 4 | 0 | 0 | 0
SRP | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0
Succinyl-CoA synthetase | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Superfamily II DNA and RNA helicase | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0
Tetratricopeptide TPR_2 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0
Thymidylate synthase | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1
Transcription elongation factor NusA | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0
Transcription-repair coupling factor | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0
Transcriptional regulator | 1 | 0 | 0 | 0 | 1 | 0 | 2 | 0 | 0 | 0
Transcriptional regulator, AraC | 2 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0
Transcriptional regulator, XRE | 13 | 0 | 3 | 1 | 2 | 0 | 19 | 0 | 0 | 0
Transketolase | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Translocator protein, LysE family | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Transposase | 1 | 0 | 1 | 0 | 0 | 2 | 0 | 0 | 0 | 0
Triose-phosphate isomerase | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0
tRNA (uracil-5-)-methyltransferase | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0
Two component transcriptional regulator | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Type IIA topoisomerase | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0
Uncharacterized protein | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0
Unknown function | 2 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0
Valyl-tRNA synthetase | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0
Total | 204 | 71 | 63 | 74 | 54 | 105 | 232 | 64 | 56 | 9

AH - Agua Hedionda Lagoon, HOT - Hawaii Ocean Time Series, LA - Los Angeles Harbor, NB - Newport Bay, SP - San Pedro Ocean Time Series


Table S3.4. The number of hits to each previously known antibiotic resistance gene in marine taxa. Genes were identified using the ARDB or described as resistance proteins in GenBank.

Taxa | bcr | bl2_tem1 | bl2c_pse1 | bl2d_oxa2 | cml_e1 | matE | MFS | pbp2 | rosb | vanre | vansd | vanxd
Dinoroseobacter | 9 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Jannaschia | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Marinobacter | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0
Oceanobacter | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Oceanospirillum | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Octadecabacter | 20 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Pelagibacter | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Photobacterium | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Prochlorococcus | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1
Puniceispirillum | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0
Roseovarius | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Shewanella | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
uncultured marine group II euryarchaeote | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0
Vibrio | 0 | 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Total | 35 | 7 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1


Table S3.5. The number of hits to protein functional types in marine taxa.

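Columns give the number of hits per gene function.

Taxa | Ligase | DNA binding | Oxidoreductase | Unknown | Regulation | Transferase | Transport | Hydrolase | Isomerase | Lyase | DNA repair | Receptor | Chaperone | RNA binding
Alcanivorax | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Alteromonas | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Cellulophaga | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Dinoroseobacter | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Erythrobacter | 1 | 0 | 0 | 2 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0
Glaciecola | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Kordia | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Leeuwenhoekiella | 0 | 0 | 2 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Loktanella | 0 | 3 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Lyngbya | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0
Maribacter | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
marine [?]-proteobacterium | 0 | 0 | 6 | 3 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0
Marinobacter | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Marinomonas | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Maritimibacter | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Methylophaga | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Microscilla | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Moritella | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Oceanicola | 3 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0
Oceanospirillum | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Octadecabacter | 0 | 0 | 0 | 1 | 3 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0
Odyssella | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0
Owenweeksia | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0
Pelagibaca | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Pelagibacter | 3 | 0 | 2 | 4 | 1 | 5 | 2 | 1 | 2 | 2 | 0 | 0 | 0 | 0
Phaeobacter | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Polaribacter | 1 | 6 | 1 | 1 | 0 | 0 | 4 | 1 | 0 | 0 | 0 | 0 | 0 | 0
Prochlorococcus | 1 | 0 | 2 | 2 | 0 | 2 | 1 | 2 | 0 | 0 | 0 | 0 | 1 | 0
Pseudoalteromonas | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0
Puniceispirillum | 2 | 0 | 1 | 0 | 0 | 2 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0
Robiginitalea | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Roseibium | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0
Roseobacter | 1 | 6 | 4 | 5 | 3 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Roseovarius | 9 | 7 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
Ruegeria | 17 | 9 | 0 | 1 | 2 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Ruthia | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0
Shewanella | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Silicibacter | 0 | 0 | 0 | 0 | 11 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1
Stappia | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0
Sulfitobacter | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Synechococcus | 0 | 0 | 3 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0
uncultured marine group II euryarchaeote | 0 | 0 | 0 | 1 | 0 | 3 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0
Vibrio | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Zunongwangia | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Total | 41 | 36 | 36 | 28 | 23 | 20 | 15 | 12 | 9 | 8 | 2 | 2 | 1 | 1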


Table S3.6. Top hits to the ARDB and GenBank for genes cloned from E. coli genomic

DNA which conferred resistance in E. coli host cells. Table S6. Top hits to the ARDB and GenBank for genes cloned from E. coli genomic DNA, which conferred resistance in E. coli host cells.

ARDB hit GenBank hit Accession Amino Accession Amino Clone Antibiotic number Description acid ID number Description acid ID fumarate reductase Ecoli_1F Amp YP_002806354 bl1_ec 1 NP_757087 0.87 subunit D outer membrane Ecoli_1R Amp - - - NP_710018 0.99 lipoprotein Ecoli_2F Amp - - - ZP_06656271 taurine dioxygenase 0.5 Ecoli_3F Amp YP_002806354 bl1_ec 1 ACT97394 AmpC beta-lactamase 1

Ecoli_3R Amp BAE78154 bl1_ec 1 3IXG AmpC beta-lactamase 0.97

Ecoli_4F Amp YP_002806354 bl1_ec 0.96 ACT97394 AmpC beta-lactamase 0.97

Ecoli_4R Amp BAE78154 bl1_ec 0.99 3IWQ AmpC beta-lactamase 0.82 periplasmic nitrate Ecoli_5F Amp - - - YP_217252 0.6 reductase periplasmic nitrate Ecoli_5R Amp - - - ZP_02901751 0.96 reductase Ecoli_6F Amp YP_002806354 bl1_ec 0.99 ACT97394 AmpC beta-lactamase 1 outer membrane Ecoli_6R Amp ABS73653 ykkC 0.53 2ACO 0.88 lipoprotein Blc Ecoli_7F Amp ZP_04534300 bl1_ec 0.99 ACT97394 AmpC beta-lactamase 1 outer membrane Ecoli_7R Amp BAE78154 bl1_ec 1 ZP_03069379 1 lipoprotein Blc Ecoli_8F Amp YP_002806354 bl1_ec 0.99 ACT97394 AmpC beta-lactamase 0.95

Ecoli_8R Amp BAE78154 bl1_ec 1 ACT97420 AmpC beta-lactamase 0.96

Ecoli_9F Amp YP_002806354 - 0.98 ACT97394 AmpC beta-lactamase 0.85 outer membrane Ecoli_9R Amp - - - NP_710018 0.99 lipoprotein Blc Ecoli_10F Amp ZP_04534300 bl1_ec 0.99 ACT97394 AmpC beta-lactamase 1 SMR family small Ecoli_10R Amp ABS73653 ykkC 0.53 ZP_06664871 multidrug resistance 0.87 protein Ecoli_11F Amp ZP_04534300 bl1_ec 0.98 ACT97394 AmpC beta-lactamase 1

Ecoli_11R Amp BAE78154 bl1_ec 1 3IXG AmpC beta-lactamase 0.9

Ecoli_12F Amp YP_002806354 bl1_ec 1 ACT97394 AmpC beta-lactamase 1 Ecoli_12R Amp - - - NP_757081 hypothetical protein 0.97 unnamed protein Ecoli_13F Amp - - - CBI55270 0.48 Ecoli_13R Amp BAE78154 bl1_ec 1 ACT97394 AmpC beta-lactamase 1

Ecoli_14F Amp YP_002806354 bl1_ec 1 ACT97394 AmpC beta-lactamase 1 Ecoli_14R Amp ABS73653 ykkC 0.53 CAA49570 sugEL 0.9 Ecoli_15F Amp YP_002806354 bl1_ec 1 ACT97394 AmpC beta-lactamase 1

,," "

Table S3.6. continued

ARDB hit GenBank hit Accession Amino Accession Amino Clone Antibiotic number Description acid ID number Description acid ID Ecoli_16F Amp ZP_04534300 bl1_ec 0.99 ACT97394 AmpC beta-lactamase 0.99 ammonium compound- Ecoli_16R Amp ABS73653 ykkC 0.53 ZP_03061458 resistance protein 0.89 SugE Ecoli_17F Amp ZP_04534300 bl1_ec 0.99 ACT97394 AmpC beta-lactamase 1 Ecoli_17R Amp - - - CAA49570 sugEL 0.93 Ecoli_18F Amp YP_002806354 bl1_ec 1 ACT97394 AmpC beta-lactamase 1 potassium:proton Ecoli_18R Amp - - - NP_414589 0.73 antiporter fumarate reductase Ecoli_19F Amp YP_002806354 bl1_ec 0.98 NP_757087 0.87 subunit D outer membrane Ecoli_19R Amp BAE78154 bl1_ec 1 ZP_03069379 1 lipoprotein Blc fumarate reductase Ecoli_20F Amp - - - NP_757087 0.87 subunit D Ecoli_20R Amp BAE78154 bl1_ec 1 3IWI AmpC beta-lactamase 0.88 fumarate reductase Ecoli_21F Amp YP_002806354 bl1_ec 0.72 NP_757087 0.87 subunit D outer membrane Ecoli_21R Amp BAE78154 bl1_ec 1 ACT97648 1 lipoprotein Blc Ecoli_22F Amp YP_002806354 bl1_ec 1 ACT97394 AmpC beta-lactamase 1 outer membrane Ecoli_22R Amp ABS73653 ykkC 0.53 NP_757085 0.9 lipoprotein Blc Ecoli_23F Amp YP_002806354 bl1_ec 1 ACT97394 AmpC beta-lactamase 0.93 outer membrane Ecoli_23R Amp ABS73653 ykkC 0.51 ZP_03069379 0.98 lipoprotein Blc Ecoli_24F Amp YP_002806354 bl1_ec 1 ACT97394 AmpC beta-lactamase 0.99 outer membrane Ecoli_24R Amp ABS73653 ykkC 0.53 ZP_03069379 1 lipoprotein Blc Ecoli_25F Amp YP_002806354 bl1_ec 1 ACT97394 AmpC beta-lactamase 1 Ecoli_25R Amp ABS73653 ykkC 0.53 CAA49570 sugEL 0.9 Ecoli_26F Amp YP_002806354 bl1_ec 1 ACT97394 AmpC beta-lactamase 1 Ecoli_26R Amp ZP_04534300 bl1_ec 0.99 CAA49570 sugEL 0.95 Ecoli_27F Amp - - - ACT97394 AmpC beta-lactamase 1 outer membrane Ecoli_27R Amp ABS73653 ykkC 0.53 NP_710018 0.87 lipoprotein Blc Ecoli_28F Amp ZP_04004609 bl1_ec 0.99 ACT97398 AmpC beta-lactamase 1 Ecoli_28R Amp ABS73653 ykkC 0.54 CAA49570 sugEL 0.9 fumarate reductase Ecoli_29F Amp YP_002806354 bl1_ec 1 NP_757087 0.87 subunit D Ecoli_29R 
Amp BAE78154 bl1_ec 1 1I5Q AmpC beta-lactamase 0.93

Ecoli_30F Amp YP_002806354 bl1_ec 1 ACT97394 AmpC beta-lactamase 1

,-" "

Table S3.6. continued

ARDB hit GenBank hit Accession Amino Accession Amino Clone Antibiotic number Description acid ID number Description acid ID outer membrane Ecoli_30R Amp - - - ZP_03069379 1 lipoprotein Blc Ecoli_31F Amp ZP_04534300 bl1_ec 0.99 ACT97394 AmpC beta-lactamase 1 Ecoli_31R Amp ABS73653 ykkC 0.58 CAA49570 sugEL 0.94 Ecoli_32F Amp ZP_04534300 bl1_ec 0.99 ACT97394 AmpC beta-lactamase 1 outer membrane Ecoli_32R Amp - - - ZP_03069379 1 lipoprotein Blc Ecoli_33F Amp YP_002806354 bl1_ec 1 ACT97394 AmpC beta-lactamase 0.92 Ecoli_33R Amp - - - NP_757081 hypothetical protein 0.98 DNA-binding Ecoli_34F Tet - - - AP_002151 transcriptional 1 repressor Ecoli_34R Tet - - - NP_757081 hypothetical protein 0.61 putative helicase, ATP- Ecoli_35F Tet - - - ZP_06352119 0.86 dependent O- Ecoli_35R Tet - - - ZP_07244647 acetylserine/cysteine 0.96 export protein EamA putative helicase, ATP- Ecoli_36F Tet - - - ZP_06352119 0.79 dependent Ecoli_36R Tet - - - ZP_02793228 hypothetical protein 0.94 predicted multidrug Ecoli_37F Tet - - - NP_414982 transporter subunits of 0.93 ABC superfamily multidrug resistance- Ecoli_37R Tet YP_001453760 macB 0.51 AAB40205 like ATP-binding 1 protein Ecoli_38F Tet - - - YP_539470 hypothetical protein 0.97 oxidoreductase, Ecoli_38R Tet - - - ZP_07134808 aldo/keto reductase 1 family protein right origin-binding Ecoli_39F Tet - - - NP_291009 0.98 protein phosphoglycerate Ecoli_39R Tet - - - NP_313380 1 mutase right origin-binding Ecoli_40F Tet - - - NP_291009 1 protein Ecoli_40R Tet - - - YP_001746852 hypothetical protein 0.93 multiple antibiotic Ecoli_41F Tet - - - ZP_03064302 resistance protein 0.99 MarR O- Ecoli_41R Tet - - - ZP_07244647 acetylserine/cysteine 1 export protein EamA redox-sensitive Ecoli_42F Tet - - - ZP_07124080 transcriptional 1 activator SoxR cyclic diguanylate Ecoli_42R Tet - - - ZP_07169733 0.85 phosphodiesterase Multiple antibiotic Ecoli_43F Tet - - - ABF03767 resistance protein 1 marA

-." " Table S3.6. continued

ARDB hit GenBank hit Accession Amino Accession Amino Clone Antibiotic number Description acid ID number Description acid ID O- Ecoli_43R Tet - - - ZP_07244647 acetylserine/cysteine 0.93 export protein EamA DNA-binding Ecoli_44F Tet - - - AP_002151 transcriptional 1 repressor O- Ecoli_44R Tet - - - ZP_07244647 acetylserine/cysteine 1 export protein EamA redox-sensitive Ecoli_45F Tet - - - ZP_07124080 transcriptional 1 activator SoxR Ecoli_45R Tet - - - NP_291010 hypothetical protein 1 right origin-binding Ecoli_46F Tet - - - NP_291009 1 protein Ecoli_46R Tet - - - YP_001746852 hypothetical protein 0.93 DNA-binding Ecoli_47F Tet - - - NP_290695 transcriptional 1 regulator SoxS putative signal Ecoli_47R Tet - - - YP_002415202 1 transduction protein DEAD/DEAH box Ecoli_48F Tet - - - ZP_07150143 0.93 helicase O- Ecoli_48R Tet - - - ZP_07244647 acetylserine/cysteine 1 export protein EamA DNA-binding Ecoli_49F Tet - - - BAJ11987 transcriptional 1 regulator cyclic diguanylate Ecoli_49R Tet - - - ZP_07140304 1 phosphodiesterase DNA-binding Ecoli_50F Tet - - - AP_002151 transcriptional 1 repressor O- Ecoli_50R Tet - - - ZP_07244647 acetylserine/cysteine 1 export protein EamA right origin-binding Ecoli_51F Tet - - - NP_291009 1 protein phosphoglycerate Ecoli_51R Tet - - - NP_313380 1 mutase DNA-binding Ecoli_52F Tet - - - AP_002151 transcriptional 1 repressor O- Ecoli_52R Tet - - - ZP_07244647 acetylserine/cysteine 1 export protein EamA DNA-binding Ecoli_53F Tet - - - AP_002151 transcriptional 1 repressor major facilitator Ecoli_53R Tet - - - ACT97580 0.87 family transporter right origin-binding Ecoli_54F Tet - - - NP_291009 1 protein Ecoli_54R Tet - - - YP_001746852 hypothetical protein 0.93

-%" "

Table S3.6. continued

ARDB hit GenBank hit Accession Amino Accession Amino Clone Antibiotic number Description acid ID number Description acid ID DNA-binding Ecoli_55F Tet - - - NP_290695 transcriptional 1 regulator SoxS cyclic diguanylate Ecoli_55R Tet - - - ZP_07169733 0.89 phosphodiesterase Bcr/CflA family Ecoli_56F Tet CAA45230 bcr 0.98 ZP_07150141 1 protein Ecoli_56R Tet - - - ZP_07120164 hypothetical protein 0.98 Multiple Antibiotic Ecoli_57F Tet - - - 1JGS Resistance Repressor, 0.92 MarR O- Ecoli_57R Tet - - - ZP_07244647 acetylserine/cysteine 0.84 export protein EamA non-ribosomal peptide Ecoli_58R Tet - - - CAM34292 0.95 synthetase transporter, major Ecoli_59F Tet - - - ZP_07137802 facilitator family 0.84 protein putative 2-deoxy-D- Ecoli_59R Tet - - - ZP_07124391 gluconate 3- 1 dehydrogenase right origin-binding Ecoli_60F Tet - - - NP_291009 1 protein phosphoglycerate Ecoli_60R Tet - - - ZP_03051554 1 mutase family protein integral membrane Ecoli_61F Tet - - - ZP_07186351 0.89 protein, PqiA family ABC transporter Ecoli_61R Tet AAC32027 carA 0.51 EEF06825 0.96 family protein multiple antibiotic Ecoli_62F Tet - - - ZP_06990264 resistance protein 1 marA transporter, major Ecoli_62R Tet - - - ZP_07164887 facilitator family 0.96 protein multidrug transporter Ecoli_63F Tet - - - ZP_07447277 0.9 membrane multidrug transporter Ecoli_63R Tet YP_001453760 macB 0.5 YP_002291750 0.94 membrane right origin-binding Ecoli_64F Tet - - - NP_291009 1 protein transglycosylase SLT Ecoli_64R Tet - - - ZP_07096244 0.99 domain protein cyclic diguanylate Ecoli_65F Tet - - - ZP_07152972 0.99 phosphodiesterase cyclic diguanylate Ecoli_65R Tet - - - ZP_07140304 0.95 phosphodiesterase Ecoli_66F Nit - - - BAI55489 tyrosine-protein kinase 1 amylovoran export Ecoli_66R Nit - - - ZP_07174446 outer membrane 0.89 protein AmsH Ecoli_67F Nit - - - YP_002406175 carbonic anhydrase 0.92

-&" "

Table S3.6. continued

ARDB hit GenBank hit Accession Amino Accession Amino Clone Antibiotic number Description acid ID number Description acid ID polysaccharide Ecoli_67R Nit - - - ZP_07246326 1 deacetylase electron transfer Ecoli_68F Nit - - - ADA75118 flavoprotein subunit 0.74 ygcR FAD binding domain Ecoli_68R Nit - - - ZP_07164409 0.97 protein Ecoli_69F Nit - - - YP_539654 sensor kinase DpiB 0.99 sensory histidine Ecoli_69R Nit - - - YP_003220624 0.99 kinase CitA dITP/XTP Ecoli_70F Nit - - - NP_417429 1 pyrophosphatase Ecoli_70R Nit - - - AAA69117 hypothetical protein 1 Ecoli_71F Nit - - - ZP_06934414 hypothetical protein 1 lipid A export Ecoli_71R Nit - - - ZP_07219958 permease/ATP- 0.98 binding protein MsbA Ecoli_72F Nit - - - YP_851762 rare lipoprotein A 0.99 penicillin-binding Ecoli_72R Nit ZP_03837173 pbp2 0.81 ZP_07098061 0.98 protein 2 cystathionine beta- Ecoli_73F Nit - - - ZP_03034036 1 lyase AraC-type Ecoli_73R Nit - - - ZP_07210309 transcriptional 1 regulator Putative transposase Ecoli_74F Nit - - - P76119 0.9 yncI Ecoli_74R Nit - - - NP_415977 conserved protein 1 Predicted P-loop- Ecoli_75F Nit - - - CBK86782 1 containing kinase RNA polymerase Ecoli_75R Nit - - - ZP_06655296 0.86 sigma-54 factor Ecoli_76F Nit - - - ZP_07134251 HtpX 1 Ecoli_76R Nit - - - ZP_07165283 peptidase 0.87 Ecoli_77F Nit - - - ZP_07134251 HtpX 1 Ecoli_77R Nit - - - ZP_07165283 peptidase 0.87 bifunctional chorismate Ecoli_78F Nit - - - ZP_06939494 1 mutase/prephenate dehydrogenase phospho-2-dehydro-3- Ecoli_78R Nit - - - ZP_07446679 deoxyheptonate 1 aldolase pyruvate formate- Ecoli_79F Nit - - - ZP_04873264 lyase 2 activating 0.95 enzyme formate Ecoli_79R Nit - - - ZP_04873263 0.92 acetyltransferase 2 phenylacetic acid Ecoli_80F Nit - - - ZP_07162560 degradation protein 1 PaaY autotransporter beta- Ecoli_80R Nit - - - ZP_07246949 0.9 domain protein


Table S3.6. continued

| Clone | Antibiotic | ARDB accession number | ARDB description | ARDB amino acid ID | GenBank accession number | GenBank description | GenBank amino acid ID |
|---|---|---|---|---|---|---|---|
| Ecoli_81F | Nit | - | - | - | ZP_07246946 | conserved hypothetical protein | 1 |
| Ecoli_81R | Nit | - | - | - | ZP_03032646 | phosphatidate cytidylyltransferase | 0.87 |
| Ecoli_82F | Nit | - | - | - | ZP_04871366 | yjgB | 0.92 |
| Ecoli_82R | Nit | - | - | - | ZP_07123836 | oxidoreductase, zinc-binding dehydrogenase family protein | 0.8 |
| Ecoli_83F | Nit | - | - | - | ZP_07144257 | argininosuccinate lyase | 0.93 |
| Ecoli_83R | Nit | - | - | - | ZP_03031412 | acetylglutamate kinase | 0.92 |
| Ecoli_84F | Nit | - | - | - | YP_851762 | rare lipoprotein A | 0.99 |
| Ecoli_84R | Nit | ZP_03837173 | pbp2 | 0.81 | ZP_07098061 | penicillin-binding protein 2 | 1 |
| Ecoli_85F | Nit | - | - | - | YP_001457423 | zinc-binding dehydrogenase family oxidoreductase | 1 |
| Ecoli_85R | Nit | - | - | - | YP_001725984 | ribonuclease I | 0.96 |
| Ecoli_86F | Nit | - | - | - | YP_542608 | D-arabinose 5-phosphate isomerase | 0.95 |
| Ecoli_86R | Nit | - | - | - | NP_289772 | 3-deoxy-D-manno-octulosonate 8-phosphate phosphatase | 1 |
| Ecoli_87F | Nit | - | - | - | NP_414902 | taurine dioxygenase | 0.94 |
| Ecoli_87R | Nit | - | - | - | NP_757099 | hypothetical protein | 0.8 |
| Ecoli_88F | Nit | - | - | - | ZP_07186568 | oxidoreductase, zinc-binding dehydrogenase family protein | 0.95 |
| Ecoli_88R | Nit | - | - | - | YP_003499223 | sugar ABC transporter permease | 0.81 |
| Ecoli_89F | Nit | - | - | - | ZP_07151899 | thiamine biosynthesis/tRNA modification protein ThiI | 1 |
| Ecoli_89R | Nit | - | - | - | AAA82704 | ThiJ | 0.98 |
| Ecoli_90F | Nit | - | - | - | ZP_06934414 | hypothetical protein | 1 |
| Ecoli_90R | Nit | - | - | - | ZP_07219958 | lipid A export permease/ATP-binding protein MsbA | 0.99 |
| Ecoli_91F | Nit | - | - | - | ZP_04873017 | ATPase family | 1 |
| Ecoli_91R | Nit | - | - | - | ZP_04873019 | RbsP | 0.89 |
| Ecoli_92F | Nit | - | - | - | ZP_07190304 | LysM domain protein | 1 |
| Ecoli_92R | Nit | - | - | - | 1JF9 | Selenocysteine Lyase | 0.99 |
| Ecoli_93F | Nit | - | - | - | YP_001725984 | ribonuclease I | 1 |
| Ecoli_93R | Nit | - | - | - | ZP_06936990 | Citrate carrier | 0.95 |
| Ecoli_94F | Nit | - | - | - | NP_311509 | SsrA-binding protein | 1 |
| Ecoli_94R | Nit | - | - | - | ZP_07177633 | GTP-binding protein TypA/BipA | 1 |
| Ecoli_95F | Nit | - | - | - | XP_002275926 | hypothetical protein | 0.48 |


Table S3.7. Genes cloned from Synechococcus WH8102 genomic DNA that conferred resistance in E. coli. Descriptions are top hits to GenBank.

| Clone | Accession | Description | Amino acid ID |
|---|---|---|---|
| Syn_1F | NP_896696 | Hypothetical protein | 1.00 |
| Syn_1R | NP_896697 | Arylsulfatase regulatory protein | 0.84 |
| Syn_2F | NP_896101 | Phosphoribosylformylglycinamidine synthase II | 0.89 |
| Syn_2R | NP_896102 | Amidophosphoribosyltransferase | 1.00 |
| Syn_3F | NP_898541 | Allantoate | 0.87 |
| Syn_3R | NP_898540 | Hypothetical protein | 0.84 |
| Syn_4F | NP_896191 | Hypothetical protein | 0.86 |
| Syn_4R | NP_896190 | DNA gyrase subunit B | 0.92 |
| Syn_5F | NP_898357 | Transcriptional regulator | 1.00 |
| Syn_5R | NP_898362 | NADH dehydrogenase subunit I | 0.98 |
| Syn_6F | NP_898613 | Hypothetical protein | 0.90 |
| Syn_6R | NP_898615 | Threonine synthase | 0.94 |
