
INVESTIGATION OF THE MICROBIOMES IN TWO FULL-SCALE DRINKING WATER DISTRIBUTION SYSTEMS

A DISSERTATION SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY

MICHAEL BRANDON WAAK

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

RAYMOND M. HOZALSKI & TIMOTHY M. LAPARA, CO-ADVISERS

AUGUST 2018

© 2018 Michael Brandon Waak ALL RIGHTS RESERVED

Acknowledgements

There are many people who have earned my gratitude for their contribution to my time in graduate school. My advisors, Ray Hozalski and Tim LaPara, along with the faculty and staff of the Department of Civil, Environmental, and Geo-Engineering at the University of Minnesota (UMN), have been supportive in multiple roles: advisors, mentors, constructive critics, colleagues, and friends. I am most thankful for their commitment to my success and well-being. I must also acknowledge my adoptive department, the Institute for Civil and Environmental Engineering at the Norwegian University of Science and Technology (NTNU), which welcomed me to Norway on multiple occasions. Cynthia Hallé, in particular, served as formal supervisor during my study abroad. She also acted informally as my third academic advisor prior to and after my two semesters in Trondheim. My fellow graduate students, friends, and colleagues at both UMN and NTNU have likewise been a necessary part of the experience by providing support, insight, and joy, whether it was weathering difficult coursework, enjoying a happy hour, or scooping buckets of water out of a sinking canoe. My family and friends, including father Gerald, mother Claudette, brother Kyle, and friend Chris Howe, among countless others, must also be recognized for supporting me throughout this academic and personal journey. It has meant the world.

There are others who made my graduate work possible. I would like to acknowledge the water utilities that supported this work financially and by providing human resources. They showed great passion for their work, and it is because of their dedication and service that many of us can enjoy safe drinking water in the United States and Norway. This work, including my study abroad at NTNU, was also supported financially by the Norwegian Center for International Cooperation in Education (grant NNA-2012/10128).
Additional technical and academic support was provided by the UMN Genomics Center, the Minnesota Supercomputing Institute, and the UMN Libraries, whose expertise and guidance help make UMN a world-class research university.

Finally, additional credit goes to the UMN Water Resources Center, the College of Natural Resources at the University of Wisconsin – Stevens Point, the Water & Environmental Analysis Laboratory, and the Boy Scouts of America, which were critical to my early enthusiasm in science, the environment, and water resources.

Dedication

To the family, friends, and mentors who have held me up over the years and the giants whose shoulders I stand upon

Abstract

The drinking water distribution system (DWDS) microbiome can impact public health as well as distribution infrastructure. Though the majority of bacterial biomass in the DWDS is associated with biofilms on the walls of water mains and other surfaces, these biofilms remain poorly understood because they are difficult to access. Using culture-independent methods targeting marker genes, including real-time quantitative polymerase chain reaction (qPCR) and high-throughput sequencing of PCR amplicons, the microbiomes of two full-scale systems were investigated: a DWDS in the United States that maintains a chloramine residual and another in Norway that intentionally has very low or no residual disinfectant in the distributed water. This work demonstrates that residual chloramine is a fundamental factor affecting the microbiome in a chloraminated DWDS. Not all changes to the microbiome due to chloramine, however, may be desirable. Namely, non-tuberculous mycobacteria (NTM) and ammonia-oxidizing bacteria (AOB) in water-main biofilms benefit from residual chloramine, and both of these taxa pose possible concerns to water utilities and their consumers: NTM include some opportunistic pathogens (especially Mycobacterium avium complex, or MAC), and AOB may contribute to biologically accelerated chloramine decay. Still, chloramine appeared to generally work as desired. Biofilm biomass was significantly lower in the chloraminated DWDS, despite ostensibly more favorable conditions for bacterial growth, and most taxa in the bulk drinking water were not observed in the biofilms. Legionellae, which may include some opportunistic pathogens, were significantly reduced in the biofilms of the chloraminated DWDS, and no MAC were detected in either system. Characterization of the NTM indicated nearly all in the chloraminated DWDS were Mycobacterium gordonae-like, while various phylogenetically different species of novel NTM were present in the no-residual DWDS.
Chloramine-derived ammonia also appeared to support an AOB community in the chloraminated DWDS composed primarily of Nitrosomonas oligotropha-like taxa. Abiotic reaction of nitrite with the chloramine likely hinders complete biotic nitrification; nitrite-oxidizing bacteria (NOB) are denied available nitrite. Conversely, AOB, NOB, and ammonia-oxidizing archaea were all present in the no-residual DWDS despite little or no ammonia in the drinking water. Finally, corrosion-associated bacteria like Desulfovibrio spp. were common underneath corrosion tubercles in both systems. Microbiological activity may therefore contribute to corrosion of cast-iron water mains, regardless of whether a disinfectant residual is maintained in the bulk drinking water. This work provides novel evidence that residual chloramine alters the DWDS microbiome by reducing total biomass and diversity of water-main biofilms, though the remaining taxa may still pose management challenges. Future work will need to extend this type of research to other systems before the findings can be assumed to apply generally.

Contents

List of Tables

List of Figures

1 Introduction

2 Comparison of the microbiomes of two drinking water distribution systems — with and without residual chloramine disinfection
  2.1 Introduction
  2.2 Materials & Methods
    2.2.1 Drinking water distribution systems
    2.2.2 Water quality
    2.2.3 Sample collection
    2.2.4 High-throughput 16S rRNA gene sequencing
    2.2.5 Real-time qPCR
  2.3 Results
    2.3.1 Water quality
    2.3.2 Taxonomic profiles
    2.3.3 Gene marker concentrations
    2.3.4 Alpha diversity
    2.3.5 Beta diversity
  2.4 Discussion
  2.5 Conclusion

3 Occurrence of Legionella spp. in water-main biofilms from two drinking water distribution systems
  3.1 Introduction
  3.2 Materials & Methods
    3.2.1 Drinking water distribution systems
    3.2.2 Water-main biofilms
    3.2.3 Tap water samples
    3.2.4 DNA extraction
    3.2.5 Real-time quantitative PCR (qPCR)
    3.2.6 Characterization of Legionella-like 16S rRNA Genes
    3.2.7 Water quality analyses
    3.2.8 Data analysis and statistics
    3.2.9 Data availability
  3.3 Results
    3.3.1 Water quality
    3.3.2 Quantification of total Bacteria
    3.3.3 Characterization of Legionella-like 16S rRNA genes
    3.3.4 Quantification of Legionella spp.
    3.3.5 Biofilms from seasonal dead-end water mains
  3.4 Discussion

4 Occurrence of Mycobacterium spp. in two drinking water distribution systems, with and without residual chloramine
  4.1 Introduction
  4.2 Materials & Methods
    4.2.1 Drinking water distribution systems
    4.2.2 Water quality
    4.2.3 Sample collection
    4.2.4 Real-time qPCR
    4.2.5 Characterization of amplicon sequences
  4.3 Results
    4.3.1 Water quality
    4.3.2 Quantification of marker genes
    4.3.3 Characterization of 16S rRNA gene amplicons
    4.3.4 Characterization of hsp65 gene amplicons
  4.4 Discussion

5 Conclusions

References

Appendix A Acronyms and Glossary
  Appendix A.1 Acronyms
  Appendix A.2 Glossary

Appendix B Protocol for Retrieving Water-Main Cross-sections
  Appendix B.1 Standard Operating Procedure
  Appendix B.2 Visual Protocol

Appendix C Supporting Information for Chapter 2

Appendix D Supporting Information for Chapter 3

Appendix E Supporting Information for Chapter 4

List of Tables

2.1 Total bacteria, AOB, and AOA via qPCR
3.1 Total bacteria and legionellae via qPCR
4.1 Total bacteria and mycobacteria via qPCR
A.1 Acronyms
C.1 Assessment of bias in beta diversity due to library size
C.2 PCR primers and thermoprofiles for 16S rRNA and amoA genes
C.3 Summary of qPCR reactions targeting 16S rRNA and amoA genes
C.4 Group-wise comparisons of Shannon index
C.5 Group-wise comparisons of inverse Simpson index
D.1 Summary of water-main metadata
D.2 Summary of tap-water metadata
D.3 PCR primers, probes, and thermoprofiles for ssrA, mip, and wzm genes
D.4 qPCR standards for ssrA, mip, and wzm genes
D.5 Summary of qPCR reactions targeting ssrA, mip, and wzm genes
D.6 Drinking water utility monitoring methods
D.7 Water quality parameters for the chloraminated and no-residual systems
E.1 PCR primers, probes, and thermoprofiles for atpE and hsp65 genes and the ITS region of MAC
E.2 qPCR standards for atpE genes and the ITS region of MAC
E.3 Summary of qPCR reactions targeting atpE genes and the ITS region of MAC
E.4 Reference hsp65 gene sequences used for naïve Bayesian classifier
E.5 Taxonomic classification of hsp65 gene OTUs

List of Figures

2.1 Taxonomic profiles and qPCR gene concentrations in water-main biofilms
2.2 Taxonomic profiles and qPCR gene concentrations in drinking water
2.3 Taxonomic profiles under tubercles
2.4 Alpha diversity of water-main biofilms, drinking water, and under tubercles
2.5 Beta diversity of water-main biofilms and drinking water
2.6 Beta diversity under tubercles
3.1 Photographs of several water mains in two drinking water distribution systems
3.2 Total and Legionella-like gene markers in water-main biofilms
3.3 Total and Legionella-like gene markers in tap water
4.1 Marker gene concentrations via qPCR and select 16S rRNA gene ASVs in water-main biofilms
4.2 Marker gene concentrations via qPCR and select 16S rRNA gene ASVs in drinking water
4.3 Heat map of mycobacterial hsp65 genes in biofilms and drinking water
C.1 Ordination of sample collection sites
C.2 Photos of water mains
C.3 AOB- and NOB-like 16S rRNA gene sequences
C.4 Comparison of AOB-like OTUs versus amoA:16S rRNA gene ratios
C.5 Alpha diversity versus library size
C.6 Beta diversity: all samples
C.7 PCoA of unweighted UniFrac: Water-main biofilms and drinking water
C.8 PCoA of Bray-Curtis dissimilarity: Water-main biofilms and drinking water
C.9 PCoA of unweighted UniFrac and Bray-Curtis dissimilarity: Under tubercle
D.1 AOC versus distance to treatment plant
D.2 Raw water temperatures in the chloraminated and no-residual systems
D.3 Temperatures of raw and distributed water in the chloraminated system
D.4 Phylogenetic analysis of Legionella-like OTUs
D.5 Legionella-like OTUs versus ssrA:16S rRNA gene ratios
D.6 Quantities of mip genes in biofilms and drinking water via qPCR
D.7 Quantities of wzm genes in biofilms and drinking water via qPCR
D.8 Incidence rates of legionellosis in the U.S. and Norway
E.1 Phylogenetic tree of mycobacterial hsp65 gene sequences
E.2 Methylobacterium- versus Mycobacterium-like ASVs
E.3 atpE:16S rRNA gene ratios versus Mycobacterium-like ASVs
E.4 Heat map of mycobacterial hsp65 gene ASVs in biofilms and drinking water

Chapter 1

Introduction

In the United States, drinking water derived from surface water is required to contain a residual disinfectant, such as free chlorine or chloramine. Though residual disinfectants are generally an effective control against new microbial growth during the hours, days, or weeks that water traverses the drinking water distribution system (DWDS), they have well-known drawbacks (see Appendix A for a list of acronyms and a glossary defining common terms). Other countries, such as the Netherlands, Germany, and Norway, have no such requirement during normal operating conditions, and therefore residual disinfectants are usually minimal or absent in the treated drinking water [1]. In many such cases, alternative microbial control strategies are implemented to provide safe drinking water to consumers. These may include source water protection, aggressive DWDS maintenance (e.g., corrosion management, main replacement, and water-main flushing), and biologically active filtration to limit assimilable organic carbon (AOC) in treated water [1, 2]. In Norway, however, source water is typically of high quality with low water temperatures throughout the year (i.e., <20°C). As a result, Norwegian drinking water often undergoes minimal treatment: only alkalinity adjustment to control corrosion and primary disinfection with medium-pressure UV radiation (≥40 mJ cm⁻²) and/or low doses of free chlorine (e.g., ≤1 mg L⁻¹ as Cl₂).

There has been ongoing discussion regarding the use of residual disinfection. The primary question is whether treatment and management of the water supply ought to place greater emphasis on biological suppression or biological stabilization. Notable drawbacks of residual chlorine or chloramine include potentially harmful disinfection by-products, consumer complaints about a ‘chemical’ taste or odor, and corrosion of infrastructure and premise plumbing [1, 3, 4].
Residual disinfectants may also be unreliable in parts of the DWDS due to disinfectant decay or reaction with various compounds [3, 5]. Furthermore, a review of European countries that do not actively maintain disinfectant residuals found no higher risk of waterborne disease in those countries compared to other European and North American countries that maintain a residual [1]. Even when maintaining a disinfectant residual, biofilms will develop on the walls of water mains and other surfaces throughout the DWDS; disinfectant-tolerant microbes may be enriched in the presence of chlorine or chloramine, and some of these genera (e.g., Mycobacterium spp.) include pathogenic species [6, 7]. Disinfection has also been purported to increase incidence of antibiotic resistance genes in treated drinking water [8, 9].

On the other hand, there are practical considerations that may make residual disinfection either necessary or preferable. In many parts of the world, drinking water sources may be particularly susceptible to microbial contamination or favor subsequent growth (e.g., high AOC, inorganic nutrients, and water temperatures) [10, 11]. Water quality and availability are likely to become even more stressed by population growth and climate change [12, 13]. The treatment necessary to produce biologically stable water without the need for residual disinfection (i.e., slow-rate biofiltration) may be untenable for many water utilities (e.g., due to cost or space constraints) compared to suppression with a residual disinfectant, and may even be worse for the environment [14]. Even if utilizing a high-quality water source, there may be previously unrecognized risks. There is mounting evidence that groundwater, which is typically not required to contain a residual disinfectant, contributes to enteric disease in the U.S. and abroad [15, 16]. Residual disinfection adds an additional safeguard against potential post-treatment sources of microbial contamination.
Water-main biofilms, which account for over 95% of the biomass in the DWDS, may be a reservoir of waterborne pathogens [17]. Deteriorating infrastructure could also compromise water supplies, either acutely (i.e., catastrophic water-main breaks) or repeatedly over time (e.g., leaky connections concurrent with transient negative-pressure events) [18, 19].

The objective of this work was to ascertain how residual chloramine influences the DWDS microbiome (the total of all microbes in the distribution network) relative to the microbiome of a DWDS operated with no residual disinfectant. To this end, drinking water and water mains from two systems (a chloraminated DWDS in the U.S. and a no-residual DWDS in Norway) were investigated using culture-independent molecular methods, including real-time quantitative polymerase chain reaction (qPCR) and sequencing of PCR-amplified gene markers (herein also called amplicon sequences). A particularly novel aspect of this work was the investigation of water mains from two full-scale systems on two continents (see Appendix B for the protocol for retrieving water-main cross-sections). Though the majority of biomass is attached to water-main walls as biofilms, water mains are an under-studied aspect of the DWDS microbiome due to their general inaccessibility. Water mains are usually buried underground and thus require costly excavation to access, concurrent with disruption to local traffic and water service.

In Chapter 2, the bacterial microbiome is broadly described, with emphasis on how chloramine appears to affect one DWDS versus a no-residual DWDS. Total bacterial biomass and biomass associated with ammonia-oxidizing bacteria (AOB) and archaea (AOA) were quantified in water-main biofilms and drinking water using, respectively, bacterial 16S ribosomal RNA (rRNA) genes and the amoA genes of Nitrosomonas oligotropha and AOA.
16S rRNA gene amplicons were sequenced to gather ecological information, including taxonomic composition and diversity, about bacterial communities in water-main biofilms, drinking water, and corrosion tubercles. In addition to stark contrasts between the two systems (particularly in the biofilms), the taxonomic profiles suggested the presence of two bacterial genera that include opportunistic waterborne pathogens: Legionella spp. and Mycobacterium spp. Opportunistic pathogens are an emerging class of environmental microbes that may cause disease upon exposure, especially in specific human populations, such as immunocompromised persons and the elderly.

Incidence of legionellae was scrutinized with follow-up analyses in Chapter 3. Legionella spp., the etiological agents of Legionnaires’ disease and Pontiac fever, were quantified via qPCR targeting three marker genes: ssrA (total Legionella spp.), mip (L. pneumophila), and wzm (L. pneumophila serogroup 1). Additional phylogenetic analysis of the Legionella-like 16S rRNA gene amplicon sequences was also performed against sequences from characterized Legionella spp. and related taxa. Importantly, it is shown that the water-main biofilms may serve as a reservoir of legionellae after initial water treatment and that chloramine may aid in reducing their abundance.

Though chloramine may have reduced legionellae in the biofilms of the chloraminated DWDS, it was less clear whether another opportunistic pathogen may have benefited from the disinfectant. In Chapter 4, additional investigation targeted non-tuberculous mycobacteria (NTM), which were particularly prominent in the biofilms from the chloraminated DWDS [Chapter 2]. Total Mycobacterium spp. and M. avium complex (MAC), a common cause of environmentally acquired bacterial infection, were quantified via qPCR targeting, respectively, the atpE gene and the 16S-23S rRNA internal transcribed spacer (ITS) region.
Mycobacterial diversity was assessed using the hsp65 gene, and the incidences of Mycobacterium- and Methylobacterium-like 16S rRNA genes were assessed for possible correlation, which has been reported elsewhere [20, 21]. No clinically significant NTM, such as MAC, were detected in either system. Though abundances of mycobacteria were higher in the chloraminated DWDS, mycobacterial diversity appeared lower in that system. Maintenance of a high chloramine residual (e.g., 4 mg Cl₂ L⁻¹) may select for specific NTM species, such as M. gordonae.

Taken as a whole, this work demonstrates that not all changes to the DWDS microbiome due to residual chloramine may be desirable. Namely, mycobacteria and AOB in water-main biofilms benefit from residual chloramine, and both of these taxa pose possible concerns to water utilities and their consumers. Still, this work suggests that, in many ways, residual chloramine works as desired, fundamentally altering the DWDS microbiome.

Chapter 2

Comparison of the microbiomes of two drinking water distribution systems — with and without residual chloramine disinfection

FOREWORD

In this study, water-main biofilms, drinking water, and corrosion-associated samples (under tubercle) were collected from a chloraminated DWDS in the U.S. (residual = 3.8±0.1 mg Cl₂ L⁻¹) and a no-residual DWDS in Norway (residual = 0.08±0.01 mg Cl₂ L⁻¹). PCR-amplified fragments of 16S rRNA genes were sequenced using high-throughput Illumina MiSeq. In addition, AOB and AOA were assessed, respectively, via real-time qPCR targeting fragments of ammonia monooxygenase genes (amoA) from Nitrosomonas oligotropha and archaea. Estimates of total bacterial biomass, as 16S rRNA gene concentrations, are also provided. There were clear differences between the two systems, especially in the water-main biofilms. Water-main biofilms in the chloraminated DWDS were dominated by Mycobacterium- and Nitrosomonas-like bacteria and had significantly lower alpha diversity than the corresponding drinking water (p < 0.001 using Shannon and inverse Simpson indices). In contrast, biofilms in the no-residual DWDS were as diverse as the bulk water (p > 0.05 for Shannon and Simpson indices). Furthermore, chloramine appeared to significantly decrease bacterial biomass in the biofilms of the chloraminated DWDS compared to the no-residual DWDS, despite the no-residual DWDS having colder water with lower AOC and inorganic nutrients. AOB, however, appeared significantly more abundant in the chloraminated DWDS (p = 2.4 × 10⁻⁶), exhibiting higher levels of N. oligotropha-like amoA genes in the biofilms (up to 1.0 × 10⁷ copies cm⁻² versus a maximum of 2.6 × 10⁴ copies cm⁻² in the no-residual DWDS). This was likely due to the availability of chloramine-derived ammonia in that system. Archaeal amoA genes were only detected in the water and biofilms of the no-residual DWDS and were present in significantly greater quantities (p = 0.02) than the N. oligotropha-like amoA gene concentrations of that system.
These results suggest a greater role of AOA than AOB in nitrification in the no-residual DWDS, which was consistent with other low-ammonia/low-nitrogen environments. In both systems, water-main biofilms were taxonomically distinct from drinking water, suggesting that microbiological monitoring of tap water provides an incomplete assessment of the DWDS microbiome. The under-tubercle samples in both systems were distinct from the respective biofilms and drinking water, though not as distinct from one another despite coming from separate distribution systems on two continents. The specific niche under cast-iron corrosion deposits may attract certain specialized taxa, regardless of locale. In summary, these findings suggest chloramine is among the most significant determinants of biomass, diversity, and taxonomic composition in the water-main biofilms of a chloraminated DWDS. In contrast, nutrient availability and other environmental conditions are likely the critical factors affecting biofilm biomass and diversity in a no-residual DWDS. Though clearly different systems, less clear is whether one management approach yields a more ideal microbiome—in terms of microbiological stability and overall water quality and safety. Chloramine appears to decrease biofilm biomass—as intended—but creates a biofilm community dominated by chloramine-resistant mycobacteria and nitrite-producing AOB. Both constituents pose potential concerns for water utilities.

2.1 INTRODUCTION

Drinking water supplies derived from surface waters are routinely disinfected prior to distribution to inactivate or suppress pathogenic microbes present in the source water. Upon exiting a water treatment facility, drinking water may spend hours, days, or even weeks in the distribution system before reaching consumers. In some countries, including the United States, a residual disinfectant is used to limit undesirable biofilm growth on water-main surfaces and suppress bacteria levels in the bulk water. Free chlorine (HOCl/OCl⁻) and chloramines (primarily monochloramine, NH₂Cl) are common disinfectants. Chloramine has gained some popularity over free chlorine for use as a residual disinfectant largely because of reduced formation of halogenated disinfection byproducts, but there are other potential benefits [22]. Unlike free chlorine, however, chloramine is susceptible to accelerated decay when ammonia-oxidizing microbes produce nitrite (NO₂⁻), which reacts with chloramine to produce additional ammonia [23]. Furthermore, the selective pressure exerted by chlorine or chloramine may create unique communities composed of resilient microbes [7, 24].

Residual disinfection is not used in many parts of the world, particularly in European countries, due to its various drawbacks. Safe drinking water supplies can be achieved without residual disinfection through alternative management strategies that control water-main biofilms [1, 3]. These strategies include biofiltration to produce biologically stable drinking water [25] and aggressive maintenance of the distribution infrastructure, such as regular flushing of water mains [2]. Unfortunately, despite efforts to suppress bacterial growth, water-main biofilms are inevitable. The question remains which scenario, the presence or absence of residual disinfectant, generally results in a more desirable (or less problematic) microbiome.

Due to the high surface area-to-volume ratio of water mains in the drinking water distribution system (DWDS), water-main biofilms comprise more than 95% of the total biomass [17]. Water-main biofilms are also a relatively permanent feature of the microbiome compared to transient microbes in the water. Biofilm communities are particularly difficult to investigate due to the inaccessibility of buried water mains. Nonetheless, biofilm bacteria may significantly decrease water quality [6], increase corrosion of distribution infrastructure [26], and prolong the survival of waterborne pathogens [27].
There is great interest in the extent to which residual disinfection reduces biofilm formation and suppresses problematic organisms in full-scale distribution systems.

Herein, we compare the results from an assessment of the microbiomes of two full-scale systems, one DWDS in the United States that maintains a chloramine residual and another in Norway that intentionally has very low or no residual disinfectant in the distributed water. A unique aspect of this study is the assessment of bacterial communities in the drinking water and the water-main biofilms to obtain a more complete picture of the DWDS microbiomes.

2.2 MATERIALS & METHODS

In the present study, water mains and associated drinking water were collected from the two aforementioned full-scale systems. The interior surfaces of water mains were sampled for surface biofilms in direct contact with the drinking water, in addition to bacterial communities under corrosion-associated features such as tubercles, when present. Drinking water samples were filtered to isolate suspended cells. DNA extracted from water, biofilm, and under-tubercle bacteria was sequenced to assess bacterial community composition and diversity. In addition, total bacteria and ammonia-oxidizing bacteria and archaea (AOB and AOA, respectively) were assessed via real-time quantitative polymerase chain reaction (qPCR) of 16S ribosomal RNA (rRNA) and ammonia monooxygenase genes (amoA), respectively.

2.2.1 Drinking water distribution systems

The chloraminated system in the United States treats river water with lime softening, recarbonation, alum coagulation, sedimentation, filtration, and free chlorine disinfection. Chloramines are produced prior to distribution, with an initial residual of 3.8±0.1 mg Cl₂ L⁻¹ (mean ± standard deviation). In contrast, the no-residual system obtains water from a lake in a protected watershed. Raw water is withdrawn at a depth of 50 m, passed through granular marble beds (primarily CaCO₃) to increase alkalinity and water hardness, and then disinfected with free chlorine and medium-pressure UV light (40 mJ cm⁻²). Treated water had an initial residual of 0.08±0.01 mg Cl₂ L⁻¹.

2.2.2 Water quality

Total chlorine was determined during collection of water samples using a Pocket Colorimeter II with DPD powder pillows (Hach Company, Loveland, CO, USA). Assimilable organic carbon (AOC) was measured in subsequent sampling throughout the two systems (August 2015 and May 2017 for chloraminated and no-residual systems, respectively) using the Pseudomonas fluorescens strain P-17/Spirillum sp. strain NOX method [28]. The chloraminated utility provided raw water temperatures and treated water pH, total chlorine, hardness, free ammonia (NH₃ + NH₄⁺), nitrate (NO₃⁻), and orthophosphate (PO₄³⁻) for 2014. Water temperatures, total chlorine, free ammonia, and nitrate were also provided for 13 monitoring sites in the distribution system for 2014. Raw water temperatures and treated water pH, total chlorine, hardness, ammonium (NH₄⁺), and nitrate were provided by the no-residual utility for 2014–2015. Because it was not measured by the water utility, free ammonia was estimated from ammonium using pH and temperature, as previously described [29]. In addition to water temperature, orthophosphate and total phosphorus were measured at three sites in the distribution system during subsequent sampling using a photometric method (Hach assay number LCK 349). Methods used by the water utilities for analysis of water quality comply with the requirements of either the United States Environmental Protection Agency (chloraminated DWDS) or Standards Norway (no-residual DWDS).
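The estimate of free ammonia from ammonium, pH, and temperature can be illustrated with a short sketch. It assumes a commonly used temperature-dependent pKa relation for the ammonium/ammonia equilibrium (pKa = 0.09018 + 2729.92/T, with T in kelvin); the exact formulation in [29] may differ. The function names and the example values are hypothetical.

```python
def ammonium_pka(temp_c: float) -> float:
    """pKa of NH4+ <-> NH3 + H+ at a given temperature (assumed relation)."""
    temp_k = temp_c + 273.15
    return 0.09018 + 2729.92 / temp_k


def free_ammonia_from_ammonium(ammonium: float, ph: float, temp_c: float) -> float:
    """Estimate free ammonia (NH3 + NH4+) from a measured NH4+ concentration.

    The NH3:NH4+ ratio follows from the Henderson-Hasselbalch equation,
    so total = NH4+ * (1 + 10**(pH - pKa)). Output units match the input.
    """
    ratio_nh3_to_nh4 = 10 ** (ph - ammonium_pka(temp_c))
    return ammonium * (1.0 + ratio_nh3_to_nh4)


# Hypothetical example: 0.02 mg N/L ammonium at pH 8.0 and 5 degrees C
estimate = free_ammonia_from_ammonium(0.02, ph=8.0, temp_c=5.0)
print(f"{estimate:.4f}")
```

In cold, near-neutral water the un-ionized fraction is small, so the estimate stays close to the measured ammonium value under these assumptions.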

2.2.3 Sample collection

Water mains. Sections of water main were cut and removed from the DWDS within 2 hours of water shutoff. Prior to cutting, water-main exteriors were scraped clean of soil and disinfected with a chlorine bleach rinse (approximately 400 mg Cl₂ L⁻¹). Depending on the water-main material, either a hydraulic chain cutter (unlined grey cast iron or mortar-lined ductile cast iron) or handheld chop saw (unlined ductile cast iron) was used. Water was allowed to evacuate from the water main during removal to minimize contamination. The ends of the sample section were sealed with clean plastic wrap, and the sample was transported to the lab for immediate sampling. In the laboratory, 3–4 regions of the water main were gently scraped with a flame-sterilized steel microspatula to sample biofilm (median sample area 1.3 cm², range 0.4–8.5 cm²). Samples were released into 0.5 mL of lysis buffer (5% sodium dodecyl sulfate, 120 mM sodium phosphate, pH 8) to begin DNA extraction. If corrosion tubercles were present in the water main (commonly in unlined water mains), 2–4 tubercles were pried up and the underlying surfaces were gently scraped to recover the solids and associated bacteria (median sample area 1.3 cm², range 0.5–5.5 cm²). In total, nine water mains were collected from nine sites in the chloraminated DWDS (sites C2–C10), and six water mains were collected from three sites in the no-residual DWDS (sites N5–N7; Figure C.1 in the Supporting Information, SI [Appendix C]). In the chloraminated DWDS, these included water mains of either unlined grey cast iron or mortar-lined ductile cast iron that ranged in age from 40 to 127 years, while mains in the no-residual DWDS were unlined and made of either grey or ductile cast iron, ranging in age from 44 to 109 years (Fig. C.2). Sites with more than one water main were designated with lower-case letters (e.g., N7a and N7b).
It was determined after water-main collection that sites C7–C10 may be atypical for the chloraminated DWDS because they are shut off during the winter months (Fig. C.1). More explanation is provided in the SI Materials & Methods. Because operation of these water mains was atypical, biofilm samples from these sites were omitted from statistical comparisons between the two systems; drinking water samples from these sites were included, however, because the water was sampled during normal, uninterrupted operation.

Drinking water. Water was sampled either directly from the water main via pitot tubes (chloraminated DWDS) or from faucets in nearby residential or commercial buildings (no-residual DWDS). To reduce bacterial contamination from the pitot valves or premise plumbing, metal taps were flame-sterilized while plastic taps were rinsed with chlorine bleach. Taps were flushed free of stagnant water for up to 5 min, and then samples were collected in 3–4 autoclave-sterilized bottles and transported to the laboratory in a cooler. Each sample was individually vacuum-filtered through a 47-mm diameter, 0.2-µm pore-size nitrocellulose membrane (EMD Millipore, Billerica, MA, USA) within an hour of collection (median filtrate volume 1000 mL, range 747–1265 mL). Filters were submerged in 0.5 mL lysis buffer to begin DNA extraction. In total, water samples were collected from six sites in the chloraminated DWDS (sites C1, C2, C4, C6, C7, and C9) and six sites in the no-residual DWDS (sites N1–N4, N6–N7), plus treated water from the treatment plant of that system (site N0; Fig. C.1).

DNA extraction. To recover DNA from lysis buffer, samples were subjected to three freeze-thaw cycles and a 90-min incubation at 70°C. DNA was extracted using the FastDNA SPIN Kit (MP Biomedicals, Santa Ana, CA, USA) and then stored at −20°C.

2.2.4 High-throughput 16S rRNA gene sequencing

Bacterial 16S rRNA genes were PCR-amplified targeting the V3 region (primers 341F/534R) and sequenced to gather microbial community information [30, 31]. To pre-screen samples for sufficient biomass, 16S rRNA gene copy numbers were determined via qPCR, as previously reported [32]. Samples were processed only if the copy number was at least 10 times greater than that of no-template controls, equivalent to 1.3 × 10⁴ or more gene copies. Subsequent PCR amplification was run for a number of cycles equal to the quantification cycle (Cq) plus an additional 5–10 cycles to produce PCR amplicons. Purified products were pooled by equal mass into a single amplicon library prior to paired-end MiSeq sequencing (2 × 150 bp; Illumina, Inc., San Diego, CA, USA), as previously described [32]. Sequence reads were trimmed, filtered, and stitched together using the metagenomics-pipeline v1.5 [33]. Operational taxonomic units (OTUs) were determined using subsampled open-reference clustering in QIIME v1.9.1 [34, 35]. Briefly, the USEARCH 6.1 and UCLUST wrappers [36] were used, respectively, to cluster OTUs at 97% similarity and then assign taxonomy using reference sequences and taxonomy from SILVA release 128 [37, 38] and the default settings in QIIME. Global singletons were removed. OTU sequences that failed to align to a SILVA core alignment via PyNAST [39] were also removed. A phylogenetic tree of representative sequences was created with FastTree [40] and rooted using the midpoint.

Within-sample (alpha) diversity was evaluated using the Shannon and inverse Simpson indices (henceforth abbreviated to ‘Simpson’). These were computed using estimated singleton counts to avoid potentially inflated estimates from spurious singletons [41]. Kruskal-Wallis tests were used to detect significant differences among groups, and if the resulting p was significant (i.e., <0.05), a Conover-Iman test was used for post hoc testing, with no adjustment to the p values [42].
Effect of library size was assessed as a confounding variable using Spearman’s rank correlation; no significant correlation was observed for either Shannon or Simpson indices (p = 0.72 and 0.97, respectively; Fig. C.5).

Between-sample (beta) diversity was assessed using generalized UniFrac distances, without normalization of sequence counts (i.e., with variable library sizes) [43]. Unweighted UniFrac and Bray-Curtis dissimilarity were also considered as alternative measurements to check the robustness of perceived visual trends and clusters. Detailed information on these alternatives is provided in the SI Materials & Methods. Dimensional reduction was performed using principal coordinates analysis (PCoA) via the ape package [44] in R software [45]. Library size was determined to be a significant albeit minor confounding variable (permuted p = 0.001; R² = 0.03), assessed via the vegan package [46] using permutational multivariate analysis of variance (PERMANOVA; 999 random permutations). Two normalization methods (subsampling without replacement [47] and cumulative sum scaling [48]) yielded similar PERMANOVA results, suggesting normalization was not necessary to control for varying library sizes. All permuted p and R² values are summarized in Table C.1.
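For readers unfamiliar with the two alpha-diversity indices named above, the following minimal sketch computes them from a vector of OTU counts. It is illustrative only: it uses the plain textbook definitions with natural logarithms (an assumption) and does not implement the singleton-count estimation of reference [41] that was applied in this study. The function names are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def inverse_simpson_index(counts):
    """Inverse Simpson index 1 / sum(p_i ** 2) over OTUs with nonzero counts."""
    total = sum(counts)
    return 1.0 / sum((c / total) ** 2 for c in counts if c > 0)
```

For a perfectly even community of S OTUs, the Shannon index equals ln(S) and the inverse Simpson index equals S; a community dominated by one OTU drives both toward their minima (0 and 1, respectively).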

2.2.5 Real-time qPCR

qPCR was performed on a CFX Connect Real-Time PCR Detection System (Bio-Rad Laboratories, Inc., Hercules, CA, USA) targeting the amoA genes of Nitrosomonas oligotropha [49] and ammonia-oxidizing archaea [50]. qPCR reactions (final volume = 25 µL) consisted of 12.5 µL Bio-Rad iTaq SYBR Green Supermix with ROX, 25 µg bovine serum albumin (Roche Diagnostics, Indianapolis, IN, USA), optimized concentrations of forward and reverse primers to reduce primer dimer, 0.5 µL DNA template, and molecular biology-grade water (Sigma-Aldrich, St. Louis, MO, USA). Forward and reverse primers were synthesized by Integrated DNA Technologies, Inc. (Skokie, IL, USA). PCR primers and thermoprofiles are summarized in Table C.2. Standard curves were generated using serially diluted solutions of either plasmid DNA (N. oligotropha-like amoA genes) or custom gBlocks gene fragments (archaeal amoA genes). Plasmid DNA was prepared from PCR amplification using positive controls, followed by ligation with pGEM-T Easy cloning vectors (Promega, Madison, WI, USA) and transformation into JM109. After purification with the QIAprep Spin Miniprep Kit (QIAGEN, Hilden, Germany), plasmid DNA was stained with Hoechst 33258 dye and quantified on a TD-700 fluorometer (Turner Designs, Sunnyvale, CA, USA) using calf thymus DNA as a standard. For archaeal amoA standards, custom gBlocks gene fragments were synthesized by Integrated DNA Technologies using a 150-bp fragment of amoA from Nitrosopumilus maritimus (GenBank accession HM345610) as reference, which included the 135-bp target sequence. Amplification efficiencies of the standard curves ranged from 93.0 to 99.7% and from 94.4 to 96.6% for N. oligotropha-like and archaeal amoA, respectively. Additional standard curve statistics are summarized in Table C.3. Specificity of PCR products was confirmed using melt curves.
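The relationship between a qPCR standard curve and its amplification efficiency can be sketched as follows. This is a generic illustration, not the instrument software used in the study: it assumes an ordinary least-squares fit of Cq against log10 of the standard copy number and the conventional efficiency formula E = 10^(−1/slope) − 1. The function names are hypothetical.

```python
def fit_standard_curve(log10_copies, cq_values):
    """Least-squares line Cq = slope * log10(copies) + intercept."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency as a fraction; 1.0 means perfect doubling every cycle."""
    return 10.0 ** (-1.0 / slope) - 1.0


def copies_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate copies per reaction."""
    return 10.0 ** ((cq - intercept) / slope)
```

A slope near −3.32 corresponds to an efficiency of 100%; the efficiencies reported above (roughly 93–100%) would correspond to slopes between about −3.5 and −3.3 under this convention.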
The limit of quantification (LOQ) was defined as the lowest standard to reliably amplify without primer dimer (5 and 130 copies for N. oligotropha-like and archaeal amoA, respectively). qPCR of amoA genes was only performed for water-main biofilms and drinking water samples; AOB and AOA were not considered relevant for the ostensibly oxygen-deficient under-tubercle communities. Gene copy numbers for water-main biofilms and drinking water were normalized by sample area or filtrate volume and then log10-transformed. The method LOQ (LOQm) was calculated for each sample by applying the same normalization to the LOQ copy number. The 16S rRNA gene copy numbers for under-tubercle samples are not shown due to the lack of an appropriate normalization parameter. Surface area was deemed inappropriate as there appeared to be substantial differences in the masses of corrosion solids recovered from underneath different tubercles. Dry mass of corrosion solids may have been a suitable parameter, but unfortunately, the masses were not quantified. Left-censored observations, including non-detects, were replaced with arbitrarily low values prior to rank-based statistical comparisons. For comparisons between groups, the cendiff function in the R package NADA [51] was used to perform generalized Wilcoxon tests [52]. Comparisons of paired data were performed using a modified sign test [53], as previously coded using R [54].

2.3 RESULTS

2.3.1 Water quality

As previously reported [32], the chloraminated DWDS had high seasonal variation in distributed water temperature (median 16.1°C; range 1.8–34.1°C) and was usually warmer than the no-residual DWDS (median 7.2°C; range 5.2–8.5°C). AOC appeared higher in the chloraminated DWDS (range 238–343 versus 81–109 µg acetate-C L−1), as did free ammonia (<0.02–0.97 mg N L−1), nitrate (0.36–1.73 mg N L−1), and phosphorus (0.17–0.52 mg P L−1). In the no-residual DWDS, free ammonia and phosphorus were consistently below detection (<0.02 mg N L−1 and <0.05 mg P L−1, respectively), and nitrate had a range of 0.22–0.26 mg N L−1. Water generally had greater hardness in the chloraminated DWDS after lime softening (44–93 mg CaCO3 L−1) than water in the no-residual DWDS (36–54 mg CaCO3 L−1), which had been hardened with granular marble. The two systems had similar pH (7.5–9.5 versus 7.8–8.6, chloraminated and no residual, respectively).

2.3.2 Taxonomic profiles

Unprocessed paired-end sequence reads are available online from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (accession SRP148989). After processing the sequences, the median library size was 74,883 reads (range 17,041–324,640 reads per sample). OTU picking yielded 41,815 unique OTUs among all samples. An analysis of dominant genera—defined here as any genus comprising at least 5% of sequences in two samples—revealed differences in the taxonomic profiles of the chloraminated and no-residual systems.
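The definition of a dominant genus given above can be expressed as a simple filter. The sketch below is illustrative: it reads "in two samples" as "in at least two samples" (an assumption), and the data structure, function name, and example values are hypothetical rather than taken from the study.

```python
def dominant_genera(rel_abundance, min_fraction=0.05, min_samples=2):
    """Return genera reaching min_fraction of reads in >= min_samples samples.

    rel_abundance maps genus -> {sample_id: fraction of sequence reads}.
    """
    dominant = []
    for genus, per_sample in rel_abundance.items():
        hits = sum(1 for frac in per_sample.values() if frac >= min_fraction)
        if hits >= min_samples:
            dominant.append(genus)
    return sorted(dominant)


# Hypothetical relative abundances, for illustration only.
example = {
    "Genus A": {"S1": 0.90, "S2": 0.40, "S3": 0.01},  # two samples >= 5%
    "Genus B": {"S1": 0.00, "S2": 0.02, "S3": 0.09},  # only one sample >= 5%
    "Genus C": {"S3": 0.059, "S4": 0.05},             # two samples >= 5%
}
print(dominant_genera(example))
```

Requiring the threshold in more than one sample keeps a genus that spikes in a single observation from being labeled dominant for the whole data set.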

Water-main biofilms. Biofilms were taxonomically and structurally different between the two systems (Fig. 2.1a). Biofilms in the chloraminated DWDS were unevenly distributed, skewed by three dominant genera: Mycobacterium (≤91.7% of sequence reads), Nitrosomonas (≤39.6%), and Methylobacterium (≤10.3%). In contrast, water-main biofilms in the no-residual DWDS were more evenly distributed and included many taxa: Hyphomicrobium (≤13.2%), Amphiplicatus (≤12.4%), Nitrospira (≤9.7%), Methyloglobulus (≤6.4%), H16 of family Desulfurellaceae (≤6.1%), and Woodsholea (≤5.9%), as well as uncharacterized genera of order TRA3-20 within (≤13.0%) and family MNG7 within order Rhizobiales (≤6.8%). No sequences from the winter-shutoff samples of the chloraminated DWDS were assessed due to insufficient biomass (i.e., all 16S rRNA gene copy numbers were <1.3 × 10⁴ copies per reaction, equivalent to 10 times the copy number in no-template controls).

Drinking water. Drinking water samples were composed of different taxa than the respective biofilms in both systems (Fig. 2.2a). Drinking water taxa found commonly in both systems included uncharacterized members of the LD12 freshwater group of , Polynucleobacter, Limnohabitans, Flavobacterium, and ‘Candidatus Methylopumilus.’ Nitrosomonas-like OTUs were prominent at sites C6 and C7 in the chloraminated DWDS. In the no-residual DWDS, the hgcI clade of family Sporichthyaceae and uncharacterized members of families Comamonadaceae and Anaerolineaceae were also dominant. Polaromonas-like OTUs were detected intermittently, particularly at sites N3 and N6 but also N7.

Under tubercle. Desulfovibrio-like OTUs were commonly detected under tubercles of both systems (Fig. 2.3). These were especially dominant in the chloraminated DWDS (≤98.4% of sequence reads, compared to ≤47.0% in the no-residual DWDS). Other notable taxa found commonly in both systems included Desulfosporosinus, Holophaga, Bradyrhizobium, Acinetobacter, uncharacterized Comamonadaceae, Sideroxydans, and Gallionella. Sulfuricurvum-like OTUs were especially dominant in a single observation in the no-residual DWDS (site N5a).

[Figure 2.1 plots. Panel a: % sequence reads at sites C2–C10 (chloraminated DWDS) and N5a, N5b, N5c, N6, N7a, N7b (no-residual DWDS); legend: Other, Nitrospira, Amphiplicatus, Mycobacterium, Nitrosomonas, H16 (Desulfurellaceae), Uncharacterized (TRA3-20), Uncharacterized (MNG7), Methyloglobulus, Hyphomicrobium, Woodsholea, Methylobacterium. Panel b: log10 (copies cm−2) at the same sites; legend: Bacterial 16S rRNA genes, Nitrosomonas oligotropha-like amoA, Archaeal amoA.]

Figure 2.1: Characterization of water-main biofilms: (a) Taxonomic profiles of dominant genera via high-throughput sequencing of PCR-amplified 16S rRNA gene fragments, and (b) marker gene concentrations via real-time qPCR. Samples not sequenced due to low 16S rRNA gene copy numbers are represented with empty space.

[Figure 2.2 plots. Panel a: % sequence reads at sites C1, C2, C4, C6, C7, C9 (chloraminated DWDS) and N0, N1, N2, N3, N4, N6, N7 (no-residual DWDS); legend: Other, hgcI clade (Sporichthyaceae), Uncharacterized (LD12 freshwater group), Polaromonas, Polynucleobacter, Uncharacterized (Comamonadaceae), Limnohabitans, Uncharacterized (Anaerolineaceae), Flavobacterium, Nitrosomonas, 'Ca. Methylopumilus', OM43 clade (Methylophilaceae). Panel b: log10 (copies L−1) at the same sites; legend: Bacterial 16S rRNA genes, Nitrosomonas oligotropha-like amoA, Archaeal amoA.]

Figure 2.2: Characterization of drinking water: (a) Taxonomic profiles of dominant genera via high-throughput sequencing of PCR-amplified 16S rRNA gene fragments, and (b) gene concentrations via real-time qPCR. Samples not sequenced due to low 16S rRNA gene copy numbers are represented with empty space.

[Figure 2.3 plot. % sequence reads at sites C2, C5, C7, C8, C9, C10 (chloraminated DWDS) and N5a, N5b, N6, N7a, N7b (no-residual DWDS); legend: Other, Holophaga, Sideroxydans, Desulfovibrio, Desulfosporosinus, Gallionella, Uncharacterized (Comamonadaceae), Sulfuricurvum, Acinetobacter, Bradyrhizobium, Geobacter.]

Figure 2.3: Taxonomic profiles of dominant genera in under-tubercle communities via high-throughput sequencing of PCR-amplified 16S rRNA gene fragments. Samples not sequenced due to low biomass are represented with empty space.

Table 2.1: Summary of marker gene concentrations via real-time qPCR*

Target | Statistic | Biofilms, chloraminated† (copies cm−2) | Biofilms, no residual (copies cm−2) | Biofilms, p value‡ | Drinking water, chloraminated (copies L−1) | Drinking water, no residual (copies L−1) | Drinking water, p value‡ | Source
N. oligotropha-like amoA | Median | 4.7 × 10⁴ | 1.1 × 10³ | 1.8 × 10⁻⁷ | 2.6 × 10⁴ | 7.9 × 10² | 5.0 × 10⁻¹⁰ | This study
N. oligotropha-like amoA | Min. | 2.0 × 10³ | <3.2 × 10² | | 2.1 × 10³ | <2.2 × 10² | |
N. oligotropha-like amoA | Max. | 1.0 × 10⁷ | 2.6 × 10⁴ | | 4.4 × 10⁶ | 7.4 × 10³ | |

Nitrification-related taxa. AOB have traditionally encompassed two lineages of class Proteobacteria: genus Nitrosococcus in the gamma subclass and family Nitrosomonadaceae in the beta subclass, which includes genera Nitrosomonas and Nitrosospira [55]. Of these AOB-like taxa, OTUs assigned to family Nitrosomonadaceae were ubiquitous among biofilm and drinking water samples in both systems (Fig. C.3). Nitrosomonas-like OTUs, in particular, were dominant in the taxonomic profiles of biofilms from the chloraminated DWDS (Fig. 2.1a), whereas the majority of Nitrosomonadaceae in the no-residual DWDS were from uncharacterized genera. No Nitrosococcus-like OTUs were detected in either DWDS. NOB comprise at least seven genera in four phyla [56]. The genera Nitrospira, Nitrolancea, Nitrococcus, Nitrotoga, and Nitrobacter have been associated with engineered environments, while Nitrospina and ‘Candidatus Nitromaritima’ are associated with marine environments [56]. Nitrospira-like OTUs were prominent in the no-residual DWDS but were also detected in addition to Nitrotoga- and Nitrobacter-like OTUs in the chloraminated DWDS (Fig. C.3).

2.3.3 Gene marker concentrations

N. oligotropha-like amoA and archaeal amoA gene concentrations, as well as previously reported values of bacterial 16S rRNA gene concentrations [32], are shown in Figs. 2.1b and 2.2b and summarized in Table 2.1.
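The normalization behind these concentrations (copies per reaction scaled to the whole DNA extract, divided by sample area or filtrate volume, then log10-transformed, as described in the methods) can be sketched as below. The 0.5 µL template volume comes from the qPCR methods; the total extract volume is not stated in this chapter, so the 100 µL default is a purely hypothetical placeholder, and the sketch also assumes complete DNA recovery. The function name is hypothetical.

```python
import math


def log10_normalized_copies(copies_per_reaction, sample_size,
                            template_ul=0.5, extract_ul=100.0):
    """log10 gene copies per unit of sample (cm^2 of pipe wall or L of water).

    sample_size is the scraped area (cm^2) or the filtrate volume (L).
    extract_ul is an assumed total DNA extract volume (hypothetical value).
    """
    copies_in_extract = copies_per_reaction * (extract_ul / template_ul)
    return math.log10(copies_in_extract / sample_size)


# Applying the same function to the LOQ copy number gives a per-sample
# method LOQ (LOQm) on the same scale as the normalized measurements.
loq_m = log10_normalized_copies(130, sample_size=1.3)
print(round(loq_m, 2))
```

Because each sample has its own area or volume, the LOQm differs between samples, which is why left-censored values in Table 2.1 appear with sample-specific "<" thresholds.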

16S rRNA genes. Bacterial 16S rRNA gene concentrations in water-main biofilms of the −6 chloraminated DWDS were frequently

2.3.4 Alpha diversity

Shannon and Simpson indices for all samples are provided in Fig. 2.4. Post hoc testing after significant Kruskal-Wallis tests for both indices (p < 0.0001) indicated the chloraminated DWDS biofilm, drinking water, and under-tubercle samples had significantly lower Shannon (p values of <0.0001, <0.0001, and 0.0006, respectively) and Simpson indices (p values of <0.0001, <0.0001, and 0.0014) than the corresponding sample groups from the no-residual DWDS. Water-main biofilms in the no-residual DWDS had similar Shannon and Simpson indices to the drinking water samples from that DWDS (p = 0.79 and 0.29, respectively), while in the chloraminated DWDS, the biofilms and drinking water samples had significantly different Shannon and Simpson indices (p < 0.0001 for both). In the chloraminated DWDS, water-main biofilms were not significantly different than under tubercle (p = 0.63 for both Shannon and Simpson). All p values are summarized in Tables C.4 and C.5.

[Figure 2.4 plots. Panels a and b each show three sample groups (water-main biofilms, drinking water, under tubercle), with chloraminated and no-residual samples side by side.]

Figure 2.4: Within-sample (alpha) diversity of water-main biofilm, drinking water, and under tubercle using (a) Shannon index (higher = richer and more even) and (b) inverse Simpson index (higher = more even). Shared symbols indicate no significant group-wise difference (i.e., p > 0.05), and bars indicate medians.

2.3.5 Beta diversity

PCoA of generalized UniFrac distances clustered by distribution system and sample type, which was consistent with the results of unweighted UniFrac and Bray-Curtis dissimilarity (Fig. C.6). To investigate potentially subtler ecological trends, water-main biofilms and drinking water were compared separately from under-tubercle samples (Fig. 2.5). Principal axis 1 of generalized UniFrac distances separated the samples by distribution system, while axes 2 and 3 (Figs. 2.5a and 2.5b, respectively) separated drinking water from water-main biofilms. The same phenomenon was seen with unweighted UniFrac (Fig. C.7) and Bray-Curtis dissimilarity (Fig. C.8). Under-tubercle samples, considered as a separate group due to their isolation from the biofilms and bulk drinking water, showed moderate partitioning by distribution system, albeit less pronounced than the biofilm and drinking water clusters (Fig. 2.6). This behavior was also observed with unweighted UniFrac and Bray-Curtis dissimilarity (Fig. C.9).
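Of the three beta-diversity measures mentioned above, Bray-Curtis dissimilarity is the simplest to write out, so it is sketched here as an illustration of what a between-sample distance measures. This is the standard textbook formula, not the study's R workflow; generalized and unweighted UniFrac additionally require the phylogenetic tree and are not shown. The function name is hypothetical, and whether counts are normalized beforehand is left to the caller.

```python
def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two samples over the same OTU order.

    BC = 1 - 2 * sum(min(a_i, b_i)) / (sum(a) + sum(b)); 0 means identical
    composition and 1 means no shared OTUs.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("both samples must list the same OTUs")
    shared = sum(min(a, b) for a, b in zip(counts_a, counts_b))
    total = sum(counts_a) + sum(counts_b)
    return 1.0 - 2.0 * shared / total
```

A matrix of such pairwise values is the input to PCoA, which places samples in a low-dimensional space so that distances between points approximate the dissimilarities.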

2.4 DISCUSSION

Bacteria in the DWDS, whether attached to water-main walls and other surfaces as biofilms or suspended in the water, are important in that they can have public health impacts as well as impacts on distribution system infrastructure. Detailed characterizations of the microbiomes in full-scale distribution systems, however, are lacking owing to the difficulty of obtaining biofilms from water mains. Simply sampling the water is insufficient, as biofilm bacteria represent more than 95% of the bacteria present in the DWDS [17]. In this investigation, water and water-main biofilms were sampled from two distribution systems to characterize and compare their microbiomes: one DWDS that maintains a chloramine residual and another with no residual disinfectant. The results provide novel information on relationships between biofilms and suspended biomass within a DWDS and suggest possible effects of the presence or absence of a residual disinfectant on the diversity and community composition of the DWDS microbiome.

[Figure 2.5 plots. Panel a: Axis 1 (29.5%) versus Axis 2 (15.6%); panel b: Axis 1 (29.5%) versus Axis 3 (12.4%). Legend: sample type (water-main biofilms, drinking water) and distribution system (chloraminated, no residual).]

Figure 2.5: Principal coordinates analysis of generalized UniFrac distances: Between-sample (beta) diversity of water-main biofilms versus drinking water in two drinking water distribution systems. (a) Principal axes 1 and 2 and (b) principal axes 1 and 3. Percentage = variance explained.

In both systems, drinking water had markedly different taxonomic profiles than the corresponding water-main biofilms. This was in agreement with previous studies [57, 58] and further strengthens the argument that tap water samples provide an incomplete assessment of the DWDS microbiome. Among drinking water samples, the observed genera are commonly associated with rivers and lakes, consistent with the source waters of both systems. Limnohabitans, Polynucleobacter, Flavobacterium, ‘Candidatus Methylopumilus,’ the hgcI clade, and the LD12 freshwater group are all important bacterial plankton [59–63]. The prevalence of plankton-like taxa in the water and limited overlap with taxa from the corresponding biofilms, however, suggested separate communities in the water and biofilms. This finding is not surprising given the vast differences between suspended and surface-associated microbial lifestyles [64].

[Figure 2.6 plot. Axis 1 (31.6%) versus Axis 2 (12.6%). Legend: distribution system (chloraminated, no residual).]

Figure 2.6: Principal coordinates analysis of generalized UniFrac distances: Between-sample (beta) diversity under corrosion tubercles of two drinking water distribution systems. Percentage = variance explained.

Based on the dominant taxa, water-main biofilms in both systems appeared to be adapted for nutrient scarcity. This is consistent with a meta-analysis of diverse bacterial communities across distinct habitats that found all exhibited predictable patterns during primary succession, resulting in mature, late-succession communities well-adapted to low-resource conditions [65]. In the two systems, taxa considered facultative methylotrophs were common in the biofilms. A feature of methylotrophs is their ability to use single-carbon compounds, such as methanol, formate, and other methylated compounds, as sole carbon and energy source [66]. Facultative methylotrophs, including Mycobacterium, Methylobacterium, and Hyphomicrobium spp., have been associated with biofilms in water meters and premise plumbing [66, 67]. In the present study, Methylobacterium- and especially Mycobacterium-like OTUs were primarily associated with the chloraminated DWDS, consistent with other chloraminated systems [7, 68], while Hyphomicrobium-like OTUs were prevalent in the no-residual DWDS. Our findings were also consistent with a survey of 17 municipal distribution systems that found Mycobacterium and Methylobacterium spp.
were enriched in drinking water with high chloramine concentrations, whereas Hyphomicrobium spp. (among other genera) were more abundant in water with low concentrations of free chlorine [69]. Though methylotrophy is common among environmental bacteria [70], the variety of potential substrates these organisms can utilize may make them particularly well-adapted for low-nutrient environments, such as drinking water distribution systems [66].

The presence of residual chloramine appeared to further differentiate biofilms from the bulk drinking water. In the chloraminated DWDS, the lower richness and evenness of biofilms (relative to the corresponding drinking water) was consistent with selection by chloramine disinfection. It was therefore not surprising that the dominant taxa were also taxa likely well-adapted to the presence of chloramine: chloramine-tolerant mycobacteria and ammonia-seeking Nitrosomonas. In contrast, the drinking water and biofilms in the no-residual DWDS shared similar richness and evenness, although they comprised different taxa.

Though detected in both systems, Mycobacterium-like OTUs were far more abundant in the chloraminated DWDS, similar to another chloraminated DWDS in the United States [7]. Environmental mycobacteria include opportunistic pathogens, such as the Mycobacterium avium complex [71]. 16S rRNA gene sequences, however, cannot distinguish between pathogenic and non-pathogenic types [72]. Nonetheless, mycobacteria are often associated with drinking water supplies because their waxy, hydrophobic cell membranes afford them tolerance to chlorine and chloramine [73]. Sharing a similar ecological niche to mycobacteria, Methylobacterium spp. may outcompete mycobacteria when disinfectant residuals are diminished [21].
Notably, Methylobacterium-like OTUs, along with Nitrosomonas, were most prominent in two heavily tuberculated water mains (sites C2 and C5), though it is unclear whether this was related to the tuberculation because of our limited sample quantity. Nonetheless, certain corrosion products can impart a chloramine demand, resulting in ammonia production and chloramine loss [74].

Environmental conditions, including seasonally warmer water temperatures and consistently higher assimilable nutrients (i.e., AOC, ammonia, nitrate, phosphorus), appeared more favorable to microbial growth in the chloraminated DWDS [32]. Yet, bacterial biomass was significantly lower in the drinking water and especially the water-main biofilms of that system compared to the no-residual DWDS [32]. The lower biomass in the chloraminated DWDS was attributed to bacterial inactivation by the residual chloramine. Lower species richness due to chloramine, however, may have also contributed to a decrease in total biomass, because such communities are limited to the metabolic capabilities of the taxa present; many potential substrates that sustain growth may go under-utilized. Notably, the taxonomy of 16S rRNA gene sequences in the no-residual DWDS biofilms hinted at the presence of diverse chemotrophic lifestyles, including methane oxidation (Methyloglobulus spp. [75]), ammonia or nitrite oxidation (Nitrospira spp. [56]), and sulfur reduction (family Desulfurellaceae [76]). Though we did not assess whether these metabolic functions were present and active, the higher diversity of taxa observed in the no-residual DWDS should permit utilization of more diverse substrates.

The high relative abundances of Nitrosomonas-like OTUs in the chloraminated DWDS biofilms may pose challenges for the management of nitrification in that system. AOB, including Nitrosomonas spp., convert ammonia to nitrite.
Nitrite is typically unstable and quickly converted to nitrate, either biologically by NOB or abiotically by chloramine. Nitrite and nitrate are both toxic at high concentration, with maximum contaminant levels in the United States of 1.0 and 10.0 mg N L−1, respectively. Notably, nitrite and chloramine react abiotically to produce ammonia and nitrate [77]. With active AOB present in a chloraminated DWDS to convert the ammonia to nitrite, this can result in a rapid loss of residual chloramine in a process known as biologically-accelerated chloramine decay [5]. The abiotic reaction between nitrite and chloramine likely explains the deficiency of NOB relative to AOB in the chloraminated DWDS. In canonical two-step nitrification, the expected biomass yield of NOB is approximately 50% of that of AOB, which implies a theoretical NOB:AOB ratio of 0.5 when NOB have uncontested access to nitrite [78]. There appeared to be an orders-of-magnitude deficiency of NOB-like OTUs relative to AOB-like OTUs in the chloraminated DWDS biofilms. The abiotic reaction between chloramine and nitrite is fast and likely to outcompete NOB for nitrite [77, 79]. In contrast, the NOB:AOB ratio was approximately equal to or greater than 0.5 in the no-residual DWDS, consistent with complete nitrification of ammonia to nitrate by biotic conversion. Though free ammonia was not detected in the drinking water of the no-residual DWDS, there were nitrification-relevant taxa present in both the drinking water and the biofilms of that system. The higher concentration of archaeal amoA genes over N. oligotropha-like amoA genes was consistent with an ammonia-deficient environment [80]. In addition, Nitrospira-like OTUs were abundant in the no-residual DWDS and may have included comammox Nitrospira, as they are efficient at utilizing ammonia in nutrient-poor environments and have been previously observed in drinking water systems [81, 82].
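The competing pathways discussed above can be summarized as overall reactions. These are standard textbook stoichiometries written for monochloramine, offered as an aid to the reader; they are not taken from references [77] or [78], and the abiotic reaction is shown only in its net form.

```latex
\begin{align}
\mathrm{NH_3 + 1.5\,O_2} &\rightarrow \mathrm{NO_2^- + H^+ + H_2O}
  && \text{(ammonia oxidation by AOB)} \\
\mathrm{NO_2^- + 0.5\,O_2} &\rightarrow \mathrm{NO_3^-}
  && \text{(nitrite oxidation by NOB)} \\
\mathrm{NH_2Cl + NO_2^- + H_2O} &\rightarrow \mathrm{NH_3 + NO_3^- + H^+ + Cl^-}
  && \text{(abiotic reaction with monochloramine)}
\end{align}
```

The third reaction consumes nitrite without NOB and regenerates ammonia for AOB, which is why the cycle can deplete the chloramine residual while leaving little nitrite available to support NOB growth.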
Overall, the apparent diversity of nitrifiers in the no-residual DWDS (archaeal and N. oligotropha-like amoA genes in conjunction with NOB-like OTUs) and primarily only Nitrosomonas-like AOB in the chloraminated DWDS was consistent with the broader observations of richness and evenness in the two systems. Though of benefit to AOB as an ammonia source, chloramine may nonetheless only benefit chloramine-tolerant AOB, in this case Nitrosomonas spp. with N. oligotropha-like amoA genes. Our findings are consistent with previous reports that chloramine selects for N. oligotropha [83].

Despite the stark contrasts in drinking water and biofilm communities from the two distribution systems, the under-tubercle communities from the two systems were relatively similar. Under-tubercle samples from the chloraminated DWDS were, however, significantly less diverse than those from the no-residual DWDS. We had assumed that the under-tubercle communities from the chloraminated DWDS had been sufficiently shielded from residual chloramine, due to their physical separation from the bulk water and the reactivity of chloramine with corrosion products [74]. It may be that chloramine contributes indirectly to a lower diversity under the tubercles by selectively culling primary colonizers, which presumably originate from the chloraminated water. The diversity and taxonomic composition of water-main bacterial communities, however, may be significantly influenced by pipe material [84], which we unfortunately cannot address due to our limited sample quantities.

Desulfovibrio-like OTUs were commonly the most dominant taxonomic group in the chloraminated DWDS, which was consistent with the tubercles of another chloraminated DWDS [7]. Sulfate-reducing Desulfovibrio spp. are commonly associated with microbiologically influenced corrosion and corroded, iron-rich environments [7, 26, 84, 85].
The electron donor for these organisms is unknown but could be H2 from reduction of protons via iron corrosion. The higher relative abundances of Desulfovibrio-like OTUs in the chloraminated DWDS may be attributed to sulfate in the treated water (27.6 ± 3.2 mg L−1 as SO4²−) from coagulation with aluminum sulfate (Al2(SO4)3) during water treatment. In contrast, no sulfate was added to water in the no-residual DWDS during treatment, and Desulfovibrio-like OTUs were prominent in only a few samples.

2.5 CONCLUSION

In addition to nutrient scarcity, chloramine is likely a fundamental factor influencing water-main biofilms in a chloraminated drinking water distribution system. Richness and evenness were significantly lower in the biofilms of the chloraminated DWDS (relative to the corresponding water samples). In contrast, there was no significant difference in the within-sample diversity between water-main biofilms and drinking water from the no-residual DWDS. Total biomass was also significantly lower in the chloraminated DWDS, despite seasonally higher water temperatures and higher AOC, ammonia, nitrate, and phosphorus. Additional investigation of water-main biofilms from full-scale systems is necessary, however, to determine whether the observations for the chloraminated and no-residual systems in this study extend to other similar systems. Nonetheless, the ostensible shift in taxonomic composition toward mycobacteria and AOB (particularly Nitrosomonas with N. oligotropha-like amoA genes) suggested that certain taxa tolerate residual chloramine and may even benefit from it. These taxa may pose problems: Some mycobacteria are potential pathogens, and AOB may contribute to accelerated chloramine decay. In addition, corrosion-associated Desulfovibrio spp. were common underneath corrosion tubercles in both systems, suggesting that microbiological activity may contribute to cast-iron corrosion regardless of whether a disinfectant residual is maintained in the bulk drinking water. Though the chloramine appeared to work as intended (reducing biomass and preventing most taxa present in the bulk water from integrating into the biofilms), the remaining consortium of biofilm taxa may pose management challenges. It therefore remains unclear whether the effectiveness of chloramine outweighs the potential problems arising from its selection of specific, well-adapted taxa.

SUPPORTING INFORMATION

Supplementary text to the Materials & Methods, including description of winter-shutoff sites and library-size normalization for beta metrics; supplementary figures, including ordination of sample collection sites, photos of water mains, AOB- and NOB-like 16S rRNA gene sequences, comparison of AOB-like OTUs versus amoA:16S rRNA gene ratios, alpha diversity versus library size, beta metrics of all samples together, PCoA of unweighted UniFrac and Bray-Curtis dissimilarity for water-main biofilms versus drinking water in addition to under-tubercle samples; supplementary tables, including effect of library size on beta diversity using PERMANOVA, PCR primer sequences and thermoprofiles, real-time qPCR summary statistics, and Conover-Iman tests of Shannon and inverse Simpson indices.

ACKNOWLEDGEMENTS

We thank the two water utilities involved in this research for providing access to their drinking water distribution systems. This work was primarily supported with a grant from the water utility in the United States, which wishes to remain anonymous. The water utility in Norway contributed additional financial support. Collaboration between the University of Minnesota (UMN) and the Norwegian University of Science and Technology, including international travel and lodging expenses, was possible with funding from the Norwegian Center for International Cooperation in Education (grant NNA-2012/10128). Kyle Sandberg and Hanna Temme assisted in water sample collection. The UMN Genomics Center sequenced the 16S rRNA gene amplicons and provided additional technical support. Sequence analysis was possible using resources from the Minnesota Supercomputing Institute. The authors declare no competing financial interest.

Chapter 3

Occurrence of Legionella spp. in water-main biofilms from two drinking water distribution systems

FOREWORD

Following taxonomic characterization of the 16S rRNA gene amplicons, Legionella-like OTUs were detected in the water-main biofilms and drinking water samples from both the chloraminated DWDS in the United States and the no-residual DWDS in Norway. Although these OTUs comprised small fractions of the amplicon libraries (up to 0.5% of sequence reads in a given sample), we hypothesized that there may be greater biomass of legionellae in the no-residual DWDS, because 16S rRNA gene concentrations were significantly higher in both the biofilm and drinking water samples of that system (p = 1.4 × 10−6 and 0.005, respectively). This was concerning from a public health perspective because Legionella spp., especially L. pneumophila, are the etiological agents behind Pontiac fever and Legionnaires’ disease.

In this study, we investigated the occurrence of Bacteria and Legionella spp. in the water-main biofilms and drinking water using real-time qPCR. We also performed additional analysis of the Legionella-like OTUs to confirm their phylogenetic similarity to characterized strains of Legionella spp. and not related taxa. Despite generally higher water temperatures and assimilable organic carbon concentrations in the chloraminated DWDS, total Bacteria and Legionella spp. were significantly lower in water-main biofilms and tap water of that system (p < 0.05). Gene markers of Legionella spp. (ssrA genes) were not detected in the biofilms of the chloraminated DWDS (0 of 35 samples) but were frequently detected in biofilms from the no-residual DWDS (10 of 23 samples; maximum concentration = 7.8 × 10^4 gene copies cm−2). Low levels of L. pneumophila-like mip genes were detected near the limit of quantification in 1 biofilm sample of the chloraminated DWDS and 2 samples from the no-residual DWDS. No gene markers of L. pneumophila serogroup 1 (wzm genes) were detected in either DWDS. This investigation suggests that water-main biofilms in the DWDS may serve as a source of legionellae for tap water and premise plumbing systems, and that residual chloramine may aid in reducing their abundance.

Note: The remainder of Chapter 3 has been previously published [32]. It is reproduced in part with permission from:

Waak MB, LaPara TM, Hallé C, Hozalski RM. Occurrence of Legionella spp. in water-main biofilms from two drinking water distribution systems. Environ Sci Technol 2018; 52: 7630–7639, doi: 10.1021/acs.est.8b01170.

© 2018 American Chemical Society

3.1 INTRODUCTION

After entering the drinking water distribution system (DWDS), drinking water may spend hours, days, or even weeks before reaching consumer taps. In addition to the above-ground infrastructure (i.e., pumps, water towers, and reservoirs), the predominant components of the DWDS are the hundreds to thousands of kilometers of water mains buried underground. Because of the relatively high surface area-to-volume ratio and often harsh environment of the bulk water, more than 95% of the drinking water microbiome (the sum of all microorganisms within the DWDS) exists in thin biofilms on the walls of the water mains [17]. Biofilm bacteria may exacerbate corrosion of iron water mains, produce unpleasant tastes and odors, decrease residual concentrations of disinfectant (commonly chlorine and/or chloramine), and shed viable bacterial cells into treated drinking water [2, 7, 86, 87]. Of particular concern is the role of biofilms as reservoirs of waterborne pathogens, including opportunistic microbes like Legionella spp. [27, 88, 89]. Infections caused by Legionella spp. are known as legionellosis, which can manifest as two clinical syndromes. Legionnaires’ disease is a severe pneumonia that is fatal for 1 in 10 cases, and Pontiac fever involves milder respiratory symptoms, similar to influenza [10]. While all legionellae are believed to be pathogenic, 90–95% of clinical cases of Legionnaires’ disease have been attributed to L. pneumophila, with serogroup 1 causing up to 70% of cases [10]. Legionellosis is caused by inhalation of contaminated water droplets; drinking tap water that contains Legionella spp. is not known to cause illness [10]. Outbreaks have been associated with contaminated premise plumbing (e.g., pipes and showerheads), spas and decorative water features, and cooling towers [90], although public water supplies have been long suspected as a means for the dispersal of Legionella to such environments [91, 92].
To minimize the risk of pathogen exposure via tap water, a residual disinfectant is used in many parts of the world, particularly in the United States [1]. The residual disinfectant is intended to provide continuous suppression of the DWDS microbiome and prevent pathogen growth. Two common disinfectants in drinking water are free chlorine (HOCl/OCl−) and combined chlorine (primarily as monochloramine, NH2Cl) [93]. Despite their widespread usage, chlorine and chloramine have well-known drawbacks. Chlorine-based disinfectants impart an unpleasant taste and odor to tap water. In addition, the reaction of chlorine or chloramine with organic matter in the water produces disinfection byproducts, which may have adverse health effects [22]. Furthermore, chlorine and chloramine may enhance leaching of lead and other metals from plumbing if adequate corrosion prevention has not been implemented [4, 94].

In contrast, disinfectant residuals are often low or nonexistent in many European cities. The reduced reliance on chlorination may be due to the ability to obtain water from very high-quality sources, the implementation of rigorous treatment barriers to remove bacteria and nutrients from the water, as well as aggressive programs to maintain the DWDS infrastructure [3]. A recent review of epidemiological data concluded that incidences of waterborne diseases were no higher in European cities compared to cities in the United States, which suggests that safe drinking water can be attained without the need for residual disinfection [1].

In the present study, we collected water-main biofilms and tap water samples from a chloramine-containing DWDS in the United States and a DWDS in Norway that does not maintain a residual disinfectant. We initially used real-time quantitative polymerase chain reaction (qPCR) targeting 16S ribosomal RNA (rRNA) genes to assess bacterial biomass in biofilms and tap water. After Legionella-like operational taxonomic units were identified by high-throughput sequencing of PCR-amplified 16S rRNA gene fragments, additional qPCR was performed to quantify legionellae-specific gene markers in both DWDSs.

3.2 MATERIALS & METHODS

3.2.1 Drinking water distribution systems

Water-main sections and tap water were obtained at various locations in two drinking water distribution systems: a chloramine-containing system in the United States (chloraminated system) and a system in Norway that does not maintain a disinfectant residual (no-residual system). Water mains and tap water samples were collected May–November 2014 in the chloraminated system and over several trips in June 2014, October 2014, and May 2015 in the no-residual system. The chloraminated system withdraws raw water directly from a river, and treatment includes lime softening, recarbonation, alum coagulation, sedimentation, filtration, and disinfection. Primary disinfection is achieved using free chlorine and then the residual is quenched with ammonia to form chloramines before distribution to consumers. The total chlorine concentration in the finished water is 3.8±0.1 mg Cl2 L−1 (mean ± standard deviation). In contrast, the no-residual system treats water from a lake in a protected and relatively pristine watershed. The soft raw water is hardened by passage through beds of granular calcium carbonate (CaCO3), and then dosed with free chlorine to achieve a minimum of 0.05 mg Cl2 L−1 total chlorine after a 30-min contact time. The water is further disinfected by UV radiation (40 mJ cm−2 using medium-pressure lamps) and then distributed to consumers with a free chlorine residual of 0.08±0.01 mg Cl2 L−1.

3.2.2 Water-main biofilms

Sections of water mains were removed from the distribution systems during routine maintenance activities. Photographs of several representative water mains are provided in Figure 3.1. The sample collection sites were dictated by utility valve or main replacement schedules. Only water mains with diameters of 15–20 cm (or 6–8 in) were collected; larger mains were avoided to keep the transport and handling of sampled sections manageable. Water mains were taken out of service by closing adjacent gate-valves no more than 12 hours prior to removal. To limit contamination of the pipe interior, the water was allowed to remain in the main such that water evacuated when the main was cut. To mitigate contamination of the sample while cutting, the exterior of the water main was brushed to remove adhered soil and then rinsed with a chlorine-bleach solution (400 mg L−1) using a hand-operated sprayer. Cast iron pipes made of gray iron or cement-lined ductile iron were cut with a hydraulic pipe cutter to produce samples of 30–60 cm longitudinal length. As the hydraulic pipe cutter simply crushed unlined ductile iron water mains, a hand-held chop saw was used for those mains. In these cases, longer samples (75–100 cm) were collected and scraped for biofilms at least 20 cm from a cut end to minimize the risk of biofilm disturbance or contamination from the chop saw. A flame-sterilized steel microspatula was used to gently scrape three to four internal surfaces of the water mains to sample surface biofilms (median area 1.3 cm2; range 0.4–8.5 cm2); each biofilm scraping was treated as an individual biological replicate. The microspatula was dipped and vigorously swirled in lysis buffer (5% sodium dodecyl sulfate, 120 mM sodium phosphate, pH 8) to release sampled biofilms, with flame sterilization of the microspatula between uses. Each sample set was accompanied by three negative controls, in which the flame-sterilized microspatula was dipped in lysis solution. These controls were extracted for DNA simultaneously with the biofilm samples.

Ten sampling sites were utilized for biomass sampling in the chloraminated DWDS (C1–C10) and seven in the no-residual DWDS (N1–N7). Ideally, corresponding samples of water and water-main biofilms would have been obtained from various locations throughout each system. Although water can be readily collected throughout most systems, sites for water-main collection could not be selected by our team but were dictated by utility DWDS maintenance schedules. Unfortunately, because of accessibility and other issues it was not always possible to obtain water from areas near water-main sampling and vice versa. In total, nine water mains were collected from nine sites in the chloraminated DWDS (sites C2–C10), and six water mains were collected from three sites (N5–N7) in the no-residual system (Table D.1 in the Supporting Information, SI [Appendix D]). Water mains from the same site typically came from different sides of a 3- or 4-way pipe intersection, may have had different pipe ages and materials, and were designated with lower-case letters (e.g., a and b).

Figure 3.1: Photographs of several water mains in two drinking water distribution systems: (A) a 118-year-old cast-iron water main from the chloraminated drinking water distribution system (site C7; original internal diameter = 15.2 cm); (B) a 40-year-old ductile iron water main with cement-mortar lining from the chloraminated drinking water distribution system (site C6; diam. = 15.2 cm); (C) a 109-year-old cast-iron main from the no-residual system (site N7a; original diam. = 15.0 cm); and (D) an unlined 46-year-old ductile water main from the no-residual system (site N5a; diam. = 15.0 cm).

3.2.3 Tap water samples

Tap water samples were collected either directly from the water main via pitot valves (chloraminated system) or via nearby public or private faucets (no-residual system). Taps were flame-sterilized and flushed up to 1 min for pitot valves and 5 min for faucets prior to sample collection to evacuate stagnant water. Water was then sampled using autoclave-sterilized glass bottles. Three to four biological replicates were collected in separate bottles at each location (approximately 1 L each) and transported to the laboratory in a cooler. Samples were individually vacuum-filtered through separate 0.22-µm nitrocellulose membrane filters (47-mm diameter; EMD Millipore; Billerica, MA) within an hour of collection. The median volume of filtrate was 1000 mL (range 747–1265 mL). Filters were directly submerged in microcentrifuge tubes containing lysis buffer to initiate DNA extraction. Each set of tap water samples was accompanied by three negative control filters, which were prepared by vacuum-filtering 2.0 mL of PCR-grade water (Sigma-Aldrich; St. Louis, MO). In total, water was collected from six sites in the chloraminated DWDS (sites C1, C2, C4, C6, C7, and C9) and seven sites in the no-residual DWDS (N1–N4, N6, and N7) plus the treatment plant of that system (N0; Table D.2).

3.2.4 DNA extraction

To recover DNA after submersion in lysis buffer, the samples were subjected to three freeze-thaw cycles followed by a 90-min incubation at 70°C. DNA was extracted using the FastDNA SPIN Kit (MP Biomedicals; Santa Ana, CA) according to the manufacturer’s instructions. Extracted DNA samples were stored at −20°C until further analysis.

3.2.5 Real-time quantitative PCR (qPCR)

Real-time quantitative polymerase chain reaction (qPCR) was used to quantify several target genes. Primers targeting the V3 region of the 16S rRNA gene (341F/534R) were used to measure total bacterial biomass, as described previously [30]. Similarly, qPCR was used to quantify three different gene targets as measures of Legionella biomass: ssrA for Legionella spp. (primers PanLegF/PanLegR and probe PanLegP [95]), mip for L. pneumophila (primers LpF/LpR and probe LpP [95]), and wzm for L. pneumophila serogroup 1 (primers P65/P66 [96]). Full primer/probe sequences, PCR reaction concentrations, and PCR thermoprofiles are described in Table D.3. The oligonucleotides used for creating qPCR standard curves are summarized in Table D.4. Due to background amplification of 16S rRNA genes in reaction reagents, there was no limit of detection (LOD) for this assay, and the limit of quantification (LOQ) was defined as 10 times the gene copy number observed in no-template controls, or 1.3 × 10^4 copies per reaction. The LOD and LOQ for the qPCR reactions targeting ssrA, mip, and wzm were 5 and 10 copies per reaction, respectively, as previously described [97]. Amplification efficiencies, LOQs, LODs, and standard curves are summarized in Table D.5. The method limits of detection and quantification (LODm and LOQm, respectively) for each sample were defined as the LOD or LOQ normalized to the surface area (biofilms) or filtrate volume (water). PCR reaction chemistry and quality control protocols are described in the SI Materials & Methods.
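The normalization of a per-reaction limit to a method limit (LODm or LOQm) can be illustrated with a minimal Python sketch. The function name, the template volume per reaction, and the DNA extract volume are hypothetical placeholders introduced only for illustration (the actual reaction chemistry is described in the SI), and the sketch assumes complete DNA recovery during extraction.

```python
def method_limit(limit_copies_per_reaction, template_volume_ul,
                 extract_volume_ul, sample_size):
    """Normalize a per-reaction qPCR limit to the sampled quantity.

    sample_size is the scraped area (cm2) for biofilms or the filtrate
    volume (L) for water, so the result is in copies cm-2 or copies L-1.
    """
    if min(template_volume_ul, extract_volume_ul, sample_size) <= 0:
        raise ValueError("volumes and sample size must be positive")
    # Copies in the whole DNA extract that correspond to the per-reaction limit.
    copies_in_extract = (limit_copies_per_reaction
                         * extract_volume_ul / template_volume_ul)
    return copies_in_extract / sample_size


# Hypothetical volumes (NOT from the source): 2 uL of template per reaction,
# drawn from a 100 uL DNA extract.
TEMPLATE_UL = 2.0
EXTRACT_UL = 100.0

# LOQ of 10 copies per reaction (ssrA, mip, wzm) applied to the median
# biofilm scraping (1.3 cm2) and the median filtrate volume (1.0 L).
loq_m_biofilm = method_limit(10, TEMPLATE_UL, EXTRACT_UL, 1.3)
loq_m_water = method_limit(10, TEMPLATE_UL, EXTRACT_UL, 1.0)
print(loq_m_biofilm, loq_m_water)
```

Because the scraped area and filtrate volume differ between samples, each sample has its own LODm and LOQm even though the per-reaction limits are fixed.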

3.2.6 Characterization of Legionella-like 16S rRNA Genes

The V3 region of the 16S rRNA gene was PCR-amplified using modified versions of the 341F/534R primers and sequenced using the MiSeq platform (Illumina, Inc.; San Diego, CA) to gather microbial community data, as previously described [31]. Samples with detectable but non-quantifiable 16S rRNA gene copy numbers via qPCR were omitted from analysis to avoid biases from reagent contamination [98]. Raw sequence reads were processed via the ‘metagenomics-pipeline’ version 1.5 [33]. Briefly, after quality trimming, filtering, stitching of paired-end reads, and identification of chimeric sequences, operational taxonomic units (OTUs) were clustered at 99% similarity for ‘species’-level analysis [99, 100] and assigned consensus taxonomy using the SILVA 99% reference sequences and taxonomy (release 128) [37, 38]. Singleton OTUs and alignment failures were removed prior to data analysis. Using the number of amplicon sequences per sample, we defined a practical limit of detection (LODseq) as 1/[sequence depth + 1]. A description of sample preparation prior to sequencing and detailed information about the bioinformatics pipeline are provided in the SI.

To confirm taxonomic identity of Legionella-like OTUs, representative sequences of the 5 most frequently observed Legionella-like OTUs in each sample type from both DWDSs (19 OTUs total when pooled) were manually searched using the National Center for Biotechnology Information (NCBI) BLAST web utility [101, 102] with the GenBank [103] nucleotide collection. A multiple sequence alignment was then performed with MAFFT [104] using the 19 Legionella-like OTU sequences in addition to published 16S rRNA gene sequences (trimmed to the 341F/534R V3 region) of 27 Legionella spp. and 4 phylogenetically related non-legionellae species (i.e., class ). A phylogenetic tree was constructed with FastTree [40] and rooted using the midpoint.
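The practical sequencing detection limit defined in this section, LODseq = 1/[sequence depth + 1], can be illustrated with a short Python sketch. The function names are illustrative, and the scaling to sequences per 10,000 is an assumption made here for readability; the sequencing depths are the median and range reported in Section 3.3.3.

```python
def lod_seq(sequence_depth):
    """Practical limit of detection as a relative abundance: 1 / (depth + 1)."""
    if sequence_depth < 0:
        raise ValueError("sequence depth cannot be negative")
    return 1.0 / (sequence_depth + 1)


def per_10000(read_count, sequence_depth):
    """Express a read count as sequences per 10,000 reads in the sample."""
    return 10_000 * read_count / sequence_depth


# Sequencing depths reported in Section 3.3.3 (median and range).
depths = {"minimum": 29_558, "median": 76_358, "maximum": 305_986}

for label, depth in depths.items():
    # A deeper library can resolve rarer OTUs, so LODseq falls with depth.
    print(label, depth, lod_seq(depth) * 10_000)
```

Under this definition, a single read in a sample always lies just above that sample's LODseq, so the limit tracks library size rather than a fixed threshold.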

3.2.7 Water quality analyses

Total chlorine in water samples was determined immediately after sample collection using the N,N-diethyl-1,4-phenylenediamine (DPD) method and a Pocket Colorimeter II (Hach Company; Loveland, CO). The water utilities provided daily total chlorine concentrations for finished water for 2014 and average daily temperatures of the raw water at the treatment plants for 2014–2015. Temperatures of distributed water at 15 monitoring taps across the chloraminated DWDS were also provided. Comparable data were not available for the no-residual DWDS, so temperature measurements were taken at four locations during May–June 2017. Assimilable organic carbon (AOC) was measured using the P-17/NOX method [28, 105] at three locations of varying distances from the treatment plant in each DWDS in 2015 (chloraminated system) or 2016–2017 (no-residual system). A detailed description of the method is provided in the SI. The chloraminated water utility provided daily pH as well as hardness, total ammonia (NH3 + NH4+ + combined chlorine), free ammonia (NH3 + NH4+), nitrate (NO3−), and total phosphorus (P) concentrations for the treated water during 2014. The no-residual utility measures pH daily while hardness, ammonium (NH4+), and nitrate are measured much less frequently (i.e., several times per year) due to highly consistent raw and treated water quality. Total and free ammonia were calculated from 2014 ammonium concentrations based on pH and water temperature, as previously described [29]. The no-residual utility does not monitor phosphorus, so total phosphorus (Hach LCK 349) was determined in May 2017 for raw and treated water. Methods used by the water utilities for water quality analyses are summarized in Table D.6.
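As an illustration of how free ammonia (NH3 + NH4+) can be estimated from a measured ammonium concentration using pH and temperature, the following Python sketch applies acid-base speciation. The temperature dependence of the ammonium pKa used here is the widely cited empirical relation attributed to Emerson et al. (1975), which is an assumption on our part; the procedure in reference [29] may differ. The ammonium value is a hypothetical placeholder, while the pH and temperature are the medians reported for the no-residual system.

```python
def ammonium_pka(temp_c):
    """pKa of NH4+ versus temperature (assumed Emerson et al. 1975 relation)."""
    return 0.09018 + 2729.92 / (temp_c + 273.15)


def nh3_to_nh4_ratio(ph, temp_c):
    """Molar ratio NH3:NH4+ from the Henderson-Hasselbalch relation."""
    return 10.0 ** (ph - ammonium_pka(temp_c))


def free_ammonia_from_ammonium(nh4_mg_n_per_l, ph, temp_c):
    """Free ammonia (NH3 + NH4+, as mg N L-1) from ammonium (mg N L-1)."""
    # Both species are expressed as N, so the molar ratio applies directly.
    return nh4_mg_n_per_l * (1.0 + nh3_to_nh4_ratio(ph, temp_c))


# Hypothetical ammonium concentration (NOT a measured value from the source),
# with the median pH (8.1) and distributed water temperature (7.2 degrees C)
# reported for the no-residual system.
nh4 = 0.010
free_nh3_n = free_ammonia_from_ammonium(nh4, ph=8.1, temp_c=7.2)
print(ammonium_pka(7.2), free_nh3_n)
```

In cold, mildly basic water the pKa sits well above the pH, so nearly all free ammonia is present as ammonium and the correction is small.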

3.2.8 Data analysis and statistics

Statistical tests and data transformations were performed with R software [45]. Hypothesis testing was performed using generalized Wilcoxon rank sum tests [54] on log10-transformed gene concentrations, with non-detects (i.e.,

Unprocessed 16S rRNA gene sequence reads are available online from the NCBI Sequence Read Archive (accession SRP148989). All data that support the findings of this study are available from the corresponding author upon request.

3.3 RESULTS

3.3.1 Water quality

Water quality parameters for the two DWDSs are summarized in Table D.7. Both DWDSs had mildly to moderately basic finished water (median pH 8.7 and 8.1 for chloraminated and no residual, respectively). The chloraminated system had moderately hard water after lime softening (median 73 mg CaCO3 L−1; range 44–93 mg CaCO3 L−1), while the no-residual system had soft water after hardening (median 50 mg CaCO3 L−1; range 36–54 mg CaCO3 L−1). AOC was consistently higher in the chloraminated system (238–343 versus 81–109 µg C L−1; see Fig. D.1 for effect of distance from water treatment plant). Free ammonia and total phosphorus concentrations were consistently below detection limit in the no-residual system (<0.02 mg N L−1 and <0.05 mg P L−1) and ranged from <0.02 to 0.97 mg N L−1 and 0.17 to 0.52 mg P L−1, respectively, in the chloraminated system. The median nitrate concentration in the chloraminated system was also more than three times that in the no-residual system (0.86 versus 0.24 mg N L−1). Daily temperature averages for raw water at the inlets of the treatment plants are provided for 2014–2015 (Fig. D.2). The chloraminated system raw water had a median daily temperature of 12.2°C for 2014–2015 (range 1.1–32.1°C), while temperatures measured in the DWDS during that period had a median of 16.1°C (range 1.8–34.1°C). With the exception of the summer months (June–September), distributed water was generally warmer than the raw water (Fig. D.3). Conversely, the no-residual system raw water was consistently colder than the chloraminated DWDS (median 3.9°C; range 0.1–5.5°C). Distributed water temperatures in the no-residual system were slightly warmer than the raw water, with a median of 7.2°C (range 5.2 to 8.5°C).

3.3.2 Quantification of total Bacteria

Total Bacteria concentrations (as 16S rRNA gene copies) in all samples are summarized in Table 3.1. Total Bacteria concentrations in water-main biofilms in the chloraminated system −6 were frequently

3.3.3 Characterization of Legionella-like 16S rRNA genes

Sequencing of 16S rRNA gene fragments provided a median of 76,358 high-quality paired-end sequence reads per sample (range 29,558–305,986). Reads were clustered at 99% sequence similarity into a total of 115,494 OTUs. Of these, 678 were classified as genus Legionella. Phylogenetic analysis of the 19 most frequently observed OTUs indicated high similarity to 16S rRNA gene sequences of known Legionella spp. (Fig. D.4). The Legionella-like OTUs were detected among sequenced samples from both DWDSs, although at a lower frequency in both the water-main biofilms (89% vs. 100%) and tap water (95% vs. 100%) of the chloraminated system. These OTUs were a small fraction of the community profiles (up to 0.5% of sequences in a given sample; Figs. 3.2B and 3.3B for biofilms and tap water, respectively). Relative abundances of Legionella-like OTUs were lower in the tap water samples of the chloraminated system (p = 3.2 × 10−4), but there was no significant difference between relative abundances in the biofilms of the two systems (p = 0.95). No Legionella-like OTUs were present among six negative controls that were sequenced.

3.3.4 Quantification of Legionella spp.

Legionella spp. (i.e., ssrA gene copy) concentrations were significantly lower in biofilms (p = 9.0 × 10−3) and tap water (p = 0.045) of the chloraminated DWDS (Figs. 3.2C and 3.3C), 2 with no detection in the biofilms of that system (i.e.,

3.3.5 Biofilms from seasonal dead-end water mains

After samples were analyzed via real-time qPCR of 16S rRNA genes, four water mains from the chloraminated system (sites C7–C10) exhibited uniformly low 16S rRNA gene copies. We identified these four sites as having come from along a corridor with bridge crossings, where gate-valves on either end of the bridges are closed and the exposed water mains under the bridges drained to prevent freezing and rupturing during the cold-weather months (approximately late autumn to mid-spring). These water-main sampling sites therefore effectively become dead-ends and experience recurring, months-long periods of stagnation, which may be responsible for the observed low biomass. The 16S rRNA gene concentrations in biofilms from the seasonal dead-end water mains were significantly lower than in the other biofilms from the chloraminated DWDS (i.e., sites C2–C6; p = 2.0 × 10−3). Because sites C7–C10 were potentially atypical for the system, Wilcoxon comparisons between biofilms from the two DWDSs were performed again with these samples excluded (Table 3.1). This did not affect significance of comparisons or interpretation of results. Tap water samples from these areas (sites C7 and C9) were collected during regular operation and therefore not considered atypical.

Table 3.1: Real-time qPCR observations for water-main biofilms and tap water in two drinking water distribution systems*
Columns: taxonomic target (gene); water-main biofilms (copies cm−2) for chloraminated and no residual, with p value†,‡; tap water (copies L−1) for chloraminated and no residual, with p value‡.
All Bacteria, FOD (%): water-main biofilms 35/35 (100) chloraminated and 23/23 (100) no residual; tap water 23/23 (100) chloraminated and 22/22 (100) no residual.
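The sensitivity check described in this section (repeating the between-system comparison with sites C7–C10 excluded) can be sketched in Python. This is a simplified illustration only: it uses an ordinary two-sided Wilcoxon rank sum (Mann-Whitney) test with a normal approximation and tie correction, not the generalized Wilcoxon test for censored data used in this study, and all gene concentrations below are made-up placeholder values rather than measurements.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation, tie-corrected).

    Returns (U statistic for x, p value). No continuity correction is applied.
    """
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0      # mean of ranks i+1 .. j
        t = j - i                          # size of this tie group
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    z = (u - n1 * n2 / 2.0) / math.sqrt(var_u)
    return u, math.erfc(abs(z) / math.sqrt(2.0))


def pool(values_by_site, exclude=frozenset()):
    """Flatten per-site replicate values, skipping excluded sites."""
    return [v for site, vals in values_by_site.items()
            if site not in exclude for v in vals]


SEASONAL_DEAD_ENDS = frozenset({"C7", "C8", "C9", "C10"})

# PLACEHOLDER log10 gene concentrations (hypothetical, for illustration only).
chloraminated = {"C2": [6.1, 6.4], "C3": [6.0, 6.3], "C7": [5.1, 5.2],
                 "C8": [5.0, 5.3]}
no_residual = {"N5": [7.2, 7.5], "N6": [7.0, 7.4], "N7": [7.1, 7.6]}

for label, excluded in (("all sites", frozenset()),
                        ("dead-ends excluded", SEASONAL_DEAD_ENDS)):
    u, p = rank_sum_test(pool(chloraminated, excluded), pool(no_residual))
    print(label, u, p)
```

Running both comparisons side by side shows whether a conclusion depends on the potentially atypical sites; a faithful reproduction would additionally need to handle non-detects as censored observations.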

3.4 DISCUSSION

The seminal finding of this investigation is that water-main biofilms in drinking water distribution systems can harbor Legionella spp., which may have implications for public health and management of public water supplies. To our knowledge, the present study is the first to demonstrate gene markers of Legionella spp. in biofilms collected from water mains in full-scale DWDSs. We are unable to directly compare our results on Legionella occurrence in water-main biofilms with results from other studies because of the absence of such studies in the peer-reviewed literature. This lack of information on microbial communities in water-main biofilms is likely due to the difficulty in obtaining water-main sections that are buried underground, and therefore generally inaccessible.

As a consequence, water system biofilms analyzed for legionellae typically are collected from more convenient locations such as water meters and faucets. Schwake et al. detected Legionella spp. in 9 of 35 (26%) and L. pneumophila in 5 of 35 (14%) water-meter biofilms via PCR [106]. Legionella spp., including L. pneumophila serogroup 1, have also been detected in biofilm samples collected from faucets, though conditions in premise plumbing likely differ from the DWDS [107]. Biofilms grown on cast-iron coupons exposed to tap water at 25°C harbored L. pneumophila at concentrations of 10^3–10^5 mip copies cm−2 [106], which were higher than the mip concentrations observed in this study.

Figure 3.2: Total and Legionella-like bacterial gene markers in water-main biofilms of the chloraminated and no-residual drinking water distribution systems: (A) 16S rRNA genes via qPCR (total Bacteria); (B) Legionella-like fraction of OTUs via sequencing of 16S rRNA genes; and (C) ssrA copies via qPCR (total Legionella spp.).

Figure 3.3: Total and Legionella-like bacterial gene markers in tap water of the chloraminated and no-residual drinking water distribution systems: (A) 16S rRNA genes via qPCR (total Bacteria); (B) Legionella-like fraction of OTUs via sequencing of 16S rRNA genes; and (C) ssrA copies via qPCR (total Legionella spp.).

Using 16S rRNA gene sequencing together with qPCR afforded us the opportunity to compare two different methods to assess Legionella occurrence in two water systems. It is unclear why Legionella-like 16S rRNA gene sequences were more commonly detected than ssrA or why the ssrA:16S rRNA gene ratio did not correlate significantly with the Legionella-like sequences. A search of the primer sequences using the Primer-BLAST utility [108] indicated that Legionella spp. harbor three or four 16S rRNA gene copies per genome in contrast to one ssrA copy per genome, which could make the 16S rRNA gene the more sensitive marker. Notably, ssrA genes were detected only in samples that contained Legionella-like OTUs. There may also be PCR amplification biases, inaccurate taxonomic classifications (though the most abundant sequences appeared phylogenetically consistent with Legionella), and/or primer specificity biases (e.g., non-specific 16S rRNA gene primers or overly specific ssrA primers and probes). Interestingly, the three biofilms that were positive for L. pneumophila-like mip genes were negative for ssrA. At very low target gene levels in the PCR reaction well (i.e., <10 copies), it may be that different PCR amplification efficiencies resulted in positives of one marker gene and negatives of another.

Legionella spp. were observed in tap water from both DWDSs (up to 2.0 × 10^3 and 2.8 × 10^3 ssrA copies L−1 for chloraminated and no-residual systems, respectively), at concentrations lower than previously reported for treated surface water (range 2.1 × 10^4–7.8 × 10^5 copies L−1 of Legionella-specific 16S rRNA genes) [109]. Tap water concentrations were also lower than observations from small-building taps in Flint, Michigan (range 1.2 × 10^4–2.5 × 10^6 copies L−1 of Legionella-specific 23S rRNA genes), where legionellosis outbreaks had occurred [94].
These studies assessed 16S rRNA and 23S rRNA genes, however, both likely having multiple copies per genome, thus complicating direct comparisons between our ssrA results and these literature values [95]. L. pneumophila is regarded as the most clinically relevant species of legionellae [10]. The failure to detect mip in tap water and L. pneumophila serogroup 1 (as wzm) in tap water or biofilms from either system suggests they are either not present or very low in abundance (median LOQm <1.5 × 102 copies cm−2 in water-main biofilms and <1.9 × 102 copies L−1 in tap water). The reasons for the low abundance of L. pneumophila and especially serogroup 1 in these systems are unclear but is a welcome result from a public health perspective. A survey of tap water across the U.S. found that chlorinated systems were often positive for gene markers of L. pneumophila, generally in the range of 102–104 L. pneumophila-like 16S rRNA gene copies L−1 [110]. The survey also found that L. pneumophila were less frequently detected in chloraminated tap water 48 than in tap water containing free chlorine. Based on analysis of several key water quality parameters, we suspect the observed difference in abundances of Bacteria and Legionella spp. between the two DWDSs was due to the presence of the substantial chloramine residual in one system and not the other. AOC, which is typically the limiting factor for heterotrophic bacterial growth in DWDSs [105], was higher in the chloraminated system. The chloraminated system also had consistently higher concentrations of nitrogen and phosphorus, which are necessary nutrients for microbial growth. In addition, the temperature in the chloraminated system, though often <20°C, was more favorable for the growth of mesophiles like Legionella than the consistently colder no-residual system. We cannot confirm that Legionella spp. 
are growing in either of the studied distribution systems, but our data indicate that Legionella cells persist in water-main biofilms at temperatures well below their ideal growth range. The occurrence of Legionella in these systems and other cold-water environments, such as cold-climate lakes [111–113] and tap water [109, 110], could be due to the cold water inducing a viable but non-culturable state [114]. Previous studies have observed conflicting results on whether higher pH or hardness significantly benefits Legionella growth [115, 116], but pH and hardness were slightly higher in the chloraminated system water. Certainly, many other factors that were not investigated could have contributed to the observed differences between the two systems, including trace nutrient concentrations and hydraulic conditions at the sampling sites.

The ostensible effectiveness of chloramines at controlling Legionella spp. observed in this work is consistent with other research, including studies in which a switch from free chlorine to monochloramine decreased Legionella colonization of building hot water systems [117, 118]. Chloramine may be more effective against biofilm bacteria because its selective reactivity with extracellular polymeric substances allows it to penetrate deeper into biofilms [119]. It is also known that planktonic Legionella are susceptible to inactivation via chloramine [120], and intracellular Legionella within host Acanthamoeba are effectively inactivated by chloramine [121]. Thus, maintenance of a chloramine residual may deter the proliferation of Legionella in DWDSs due to its effectiveness at inactivating the organism in planktonic, biofilm, and intracellular states, as well as by generally decreasing biofilms and limiting most bacteria in the DWDS microbiome.

Only two water systems were compared: one with a chloramine residual and one without any residual disinfectant. 
Thus, it is unclear whether our findings are representative of other similarly operated systems. Certainly, more studies of the DWDS microbiome are needed, especially of the microbial communities present in water-main biofilms. One potential bias is that different timelines for sample collection were used (May–November 2014 for the chloraminated DWDS; June 2014, October 2014, and May 2015 for the no-residual system). Although both systems were sampled during early summer months, the sampling of the chloraminated system extended into fall months. For biofilm samples, little effect is expected as there is evidence that biofilm populations in water systems become stable after 500 days of development [122]. Another potential bias is that the different sources of tap water (i.e., pitot valves in the chloraminated system; premise taps in the no-residual system) could have resulted in contamination of water samples with premise plumbing organisms in the no-residual system. To mitigate potential premise plumbing effects, the tap water was flushed for 5 min compared to only 1–2 min in other studies [123].

It is difficult to quantify the risk associated with Legionella in the biofilms and tap water of drinking water distribution systems. Concentrations causing infection have been estimated to range from 7.8 × 10⁵ to 7.8 × 10⁸ colony-forming units (CFU) cm−2 for premise plumbing biofilms and 3.5 × 10⁶ to 3.5 × 10⁸ CFU L−1 for water [124]. Concentrations observed in this study were well below both of these ranges, though this comparison is complicated by ambiguous conversions between CFUs and gene copy numbers. Perhaps of greater concern, however, is the role of the DWDS as a reservoir of legionellae [91]. Drinking water contaminated by Legionella released from water-main biofilms may inoculate hot water systems, premise plumbing, and cooling systems. 
The warmer temperatures and lower chlorine or chloramine residuals in building water systems may then favor Legionella growth, resulting in increased infection risk [125–127]. Legionnaires’ disease has been linked to use of drinking water derived from surface waters [128], though risk of hospital-associated legionellosis outbreaks has been shown to be significantly lower in communities with chloraminated water compared to water with other disinfectants [129]. There are recent examples of outbreaks tentatively linked to public water supplies [94, 130].

The present study advances our knowledge of the potential role of the DWDS in Legionella transmission by demonstrating that DWDS biofilms harbor Legionella spp. Although only two DWDSs were considered, the results suggest that residual chloramine was the main factor behind the lower abundances in the chloraminated system. Because Legionella spp. are present in biofilms, DWDSs may serve as a source of these organisms to residential and commercial plumbing systems, even though the organisms may be absent in the finished water leaving the treatment plant.

We considered the incidence of legionellosis in both cities involved in this investigation, as well as the national incidence rates of the United States and Norway, using publicly available epidemiological data (Fig. D.8) [131, 132]. The incidence rates were generally higher in the metropolitan area served by the DWDS with no residual, which is consistent with our findings regarding the higher levels of Legionella spp. in the biofilms and tap water of that system. Linking epidemiological data to the drinking water supply, however, is tenuous without additional research. Observed incidence rates are also subject to confounding factors that may complicate regional comparisons (e.g., frequency of diagnostic testing, age structure of the population, and frequency of travel). 
Thus, more research is needed to determine whether higher Legionella occurrence rates in water-main biofilms and drinking water supplies lead to a higher risk of contracting Legionnaires’ disease and Pontiac fever.

SUPPORTING INFORMATION

Supplementary text to the Materials & Methods, including real-time qPCR, characterization of Legionella-like 16S rRNA genes, assimilable organic carbon, data analysis and statistical treatments; supplementary figures showing AOC versus distance from water treatment plant, water temperatures, phylogenetic analysis of Legionella-like 16S rRNA gene sequences, Legionella-like OTU relative abundances versus ssrA:16S rRNA gene ratios, qPCR results for the mip and wzm genes in water-main biofilms and tap water, legionellosis incidence rates; tables giving water main and tap water metadata, qPCR primers/probes and thermoprofiles, qPCR standards, qPCR standard curves, water quality analytical methods, and supplementary water quality parameters.

ACKNOWLEDGEMENTS

We wish to thank the two water utilities involved in this study for providing access to their respective drinking water distribution systems. This work was supported primarily with a grant from the water utility in the United States, which wishes to remain anonymous. Additional financial support for partially covering the cost of laboratory supplies was provided by the water utility in Norway. Collaboration between the University of Minnesota and the Norwegian University of Science and Technology, including travel and lodging between the United States and Norway, was made possible with funding from the Norwegian Center for International Cooperation in Education (grant NNA-2012/10128). Kyle Sandberg and Hanna Temme from the University of Minnesota assisted in collecting tap water samples from Norway in 2014. The University of Minnesota Genomics Center analyzed the 16S rRNA gene amplicons via sequencing. Sequence analysis was possible with resources from the Minnesota Supercomputing Institute. The authors declare no competing financial interest.

Chapter 4

Occurrence of Mycobacterium spp. in two drinking water distribution systems, with and without residual chloramine

FOREWORD

In Chapter 3, water-main biofilms were shown to be a potential reservoir of Legionella spp., particularly in the no-residual DWDS. Because conditions were ostensibly more favorable to bacterial growth in the chloraminated DWDS, residual chloramine may have helped reduce the abundance of legionellae in that system. Mycobacterium spp., however, dominated the biofilms of the chloraminated DWDS and were also present in the no-residual DWDS. Despite reducing legionellae, there was concern that residual chloramine may simply replace one genus of opportunistic pathogens with another, namely non-tuberculous mycobacteria (NTM) like the M. avium complex (MAC), which can cause pulmonary disease. NTM are frequently found in chloraminated distribution systems because they are tolerant of the residual disinfectant.

In this study, additional investigation was performed to quantify and classify the NTM in the two distribution systems. Total NTM and MAC were quantified via qPCR targeting, respectively, mycobacterial atpE genes and the ITS region of MAC. Concentrations of mycobacterial atpE genes were not significantly different in drinking water from the two systems (p = 0.09) but were significantly higher in the biofilms from the chloraminated DWDS (p = 5 × 10⁻⁹). No MAC were detected in either DWDS, however. Characterization of mycobacterial hsp65 genes indicated that M. gordonae was the most prominent species present in the chloraminated DWDS. The dominance of a single species in the chloraminated DWDS was consistent with the low diversity of 16S rRNA genes in biofilms from that system [Chapter 2], suggesting that residual chloramine may also limit NTM diversity. In contrast, there was more phylogenetic diversity of hsp65 genes in the no-residual DWDS. These hsp65 genes could not be taxonomically classified, however, and may represent novel NTM. 
Finally, Mycobacterium- and Methylobacterium-like 16S rRNA genes did not exhibit negative correlation, as previously observed in premise plumbing [20, 21]. Because methylobacteria were minor members of the water-main biofilms in the chloraminated DWDS, consistent chloramine concentrations in the DWDS may give mycobacteria an advantage over methylobacteria. This investigation shows that, despite increasing the abundance of M. gordonae in the water-main biofilms, residual chloramine may have suppressed other NTM in the DWDS microbiome of a chloraminated DWDS.

4.1 INTRODUCTION

Drinking water supplies derived from surface waters are routinely disinfected to suppress pathogenic microbes. In the United States, residual concentrations of either free chlorine (HOCl/OCl⁻) or chloramines (primarily monochloramine, NH2Cl) are maintained to reduce microbial growth during the hours to weeks that treated water passes through the drinking water distribution system (DWDS). Though effective at reducing bacterial biomass within DWDS biofilms, the residual disinfectant may encourage the presence of disinfectant-tolerant microbes, particularly Mycobacterium spp. [Chapter 2] [7, 24, 67, 69, 133].

Non-tuberculous mycobacteria (NTM) are ubiquitous in soil and water and include opportunistically pathogenic species [71]. The Mycobacterium avium complex (MAC), notably, is the primary cause of NTM-related disease worldwide [134]. Pulmonary infections are the most common form of disease, but NTM may also infect the skin, lymph nodes, and other soft tissues [134]. Incidence is typically 1.0–1.8 per 100,000 people among industrialized nations, which is comparable to incidence of the more commonly known Legionnaires’ disease, caused by opportunistic Legionella spp. [134]. NTM-related illnesses, however, are not monitored by surveillance agencies (e.g., the U.S. Centers for Disease Control and Prevention), so incidence of NTM infections is thought to be severely underreported [89, 134–137].

Illnesses caused by NTM originate exclusively from exposure to environmental sources [134]. MAC infections in individuals with HIV/AIDS are usually from contact with potable water [138], and other disease-causing NTM species are recovered almost exclusively from municipal water supplies [134]. The increasing incidence of pulmonary infections due to NTM may be linked to the modern preference for showers over baths, with aerosolized tap water transmitting the infectious cells [139, 140]. 
Biofilms in shower heads commonly test positive for NTM species [21], but there is additional interest in whether NTM in drinking water might also originate from NTM in water-main biofilms throughout the DWDS. These biofilms may act as a reservoir of another opportunistic pathogen, Legionella spp., though residual chloramine may reduce their abundance [32]. Unclear, however, is whether residual chloramine might also reduce other opportunistic pathogens, particularly when residual chloramine may actually encourage NTM in the water-main biofilms of chloraminated systems [Chapter 2] [7].

The primary goal of this investigation was to determine whether water-main biofilms in two full-scale distribution systems were a reservoir of opportunistic mycobacteria. Mycobacterium-like 16S ribosomal RNA (rRNA) gene amplicons were previously recovered from the drinking water and water-main biofilms of a chloraminated DWDS in the United States and a DWDS in Norway that intentionally operates with little or no residual disinfectant [Chapter 2]. In the present study, we used two genetic markers of mycobacteria to assess their concentrations in the biofilms and drinking water of these two systems. We also performed additional characterization of mycobacterial genes to identify unique amplicon sequence variants and attempt species-level taxonomic classification.

4.2 MATERIALS & METHODS

Water mains and associated drinking water samples were collected from two full-scale distribution systems. Interior surfaces of water mains were sampled for biofilms, and drinking water was filtered to gather suspended cells. Extracted DNA from biofilms and drinking water was previously assessed for quantities of 16S rRNA genes via real-time quantitative polymerase chain reaction (qPCR) and sequenced to gather bacterial community information from PCR-amplified 16S rRNA genes [Chapter 2] [32]. These sequences were re-processed to identify amplicon sequence variants of Mycobacterium- and Methylobacterium-like 16S rRNA genes; there is interest in the ecological interactions between methylobacteria and mycobacteria [20, 21]. To assess species composition of NTM, we sequenced PCR amplicons of the 65-kD heat-shock protein gene (hsp65) of Mycobacterium spp. via high-throughput Illumina MiSeq. Additionally, qPCR was performed to quantify total Mycobacterium spp. and MAC using, respectively, mycobacterial atpE genes and the 16S-23S internal transcribed spacer (ITS) region of MAC.

4.2.1 Drinking water distribution systems

The chloraminated system in the United States uses lime softening, recarbonation, alum coagulation, sedimentation, filtration, and free chlorine disinfection to treat river water. Chloramines are produced with an initial residual concentration of 3.8±0.1 mg Cl2 L−1 (mean ± standard deviation). In contrast, the no-residual DWDS obtains raw lake water at a depth of 50 m. This water is passed through granular marble beds (primarily CaCO3) for corrosion control and then disinfected with free chlorine and medium-pressure UV light (40 mJ cm−2). There is a minor residual concentration of free chlorine leaving the treatment plant (0.08±0.01 mg Cl2 L−1).

4.2.2 Water quality

Total chlorine was measured during water collection using a Pocket Colorimeter II with DPD powder pillows (Hach Company, Loveland, CO, USA). Assimilable organic carbon (AOC) was measured subsequently throughout the two DWDSs (August 2015 and May 2017 for chloraminated and no-residual systems, respectively) using the Pseudomonas fluorescens strain P-17/Spirillum sp. strain NOX method [28]. The utility that uses chloramines provided temperatures of raw water and pH, total chlorine, hardness, free ammonia (NH3 + NH4⁺), nitrate (NO3⁻), and orthophosphate (PO4³⁻) for treated water in 2014. Water temperatures, total chlorine, free ammonia, and nitrate were provided from 13 DWDS monitoring sites for 2014. Raw water temperatures and pH, total chlorine, hardness, ammonium (NH4⁺), and nitrate of treated water from 2014–2015 were provided by the utility that uses no residual. Because the no-residual utility does not measure it, free ammonia was calculated from ammonium concentrations using pH and temperature, as previously described [29]. Water temperature, as well as orthophosphate and total phosphorus (Hach assay number LCK 349), were measured at three sites in the no-residual DWDS during subsequent sampling in 2017. The methods used by the two utilities for monitoring drinking water quality comply with the requirements of either the U.S. Environmental Protection Agency (chloraminated DWDS) or Standards Norway (no-residual DWDS).
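
The free ammonia calculation mentioned above can be illustrated with a minimal sketch. It assumes the calculation follows the Henderson-Hasselbalch relation with the temperature-dependent pKa fit of Emerson et al. (1975); the procedure in [29] may use different constants, and the function names and input values below are hypothetical.

```python
def ammonium_pka(temp_c: float) -> float:
    """pKa of NH4+ versus temperature (ASSUMPTION: Emerson et al. 1975 fit)."""
    temp_k = temp_c + 273.15
    return 0.09018 + 2729.92 / temp_k


def free_ammonia_from_ammonium(ammonium_mg_n_per_l: float, ph: float,
                               temp_c: float) -> float:
    """Return NH3 + NH4+ (mg N/L) from measured NH4+ (mg N/L), pH, and temperature."""
    # Henderson-Hasselbalch: [NH3] / [NH4+] = 10^(pH - pKa)
    ratio_nh3_to_nh4 = 10 ** (ph - ammonium_pka(temp_c))
    return ammonium_mg_n_per_l * (1.0 + ratio_nh3_to_nh4)


# Hypothetical inputs chosen only for illustration (not measured values).
print(round(free_ammonia_from_ammonium(0.010, ph=8.2, temp_c=7.0), 4))
```

In cold water at the near-neutral to mildly alkaline pH reported for the no-residual system, the un-ionized fraction is small, so the calculated free ammonia stays close to the measured ammonium concentration.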

4.2.3 Sample collection

Water-main biofilms. Water-main sections were removed from the DWDS, as previously described [32]. Briefly, the exteriors of the mains were cleaned and disinfected with a chlorine-bleach rinse (approximately 400 mg Cl2 L−1). Within 2 hours of water shutoff, a length of water main (30–100 cm) was cut out of the DWDS with a tool chosen according to the pipe material: a hydraulic chain cutter for unlined grey and mortar-lined ductile cast iron or a handheld chop saw for unlined ductile cast iron. After sealing with clean plastic, the section of water main was moved to the laboratory, where 3–4 regions were gently scraped with a microspatula to collect biofilms (median sample area 1.3 cm2, range 0.4–8.5 cm2). Biofilm samples were released into 0.5 mL of lysis solution (5% sodium dodecyl sulfate, 120 mM sodium phosphate buffer, pH 8) to start DNA extraction. At every sample site, negative controls were collected in triplicate by placing the flame-sterilized microspatula in 0.5 mL lysis solution, similar to the biofilm samples. Nine water mains were collected from nine sites in the chloraminated DWDS (sites C2–C10), and six water mains were collected from three sites in the no-residual DWDS (sites N5–N7). In the chloraminated DWDS, these included both unlined grey and mortar-lined ductile cast iron that had been in operation 40–127 years. In the no-residual DWDS, all water mains were unlined, made of either grey or ductile cast iron, and in operation 44–109 years. Water mains from the same site were designated with lower-case letters (e.g., N7a and N7b). Water mains from sites C7–C10 in the chloraminated DWDS may have been atypical because they acted as dead ends every year during the winter; flow was restricted near these water mains to prevent freezing in adjacent parts of the DWDS that were exposed to the atmosphere. As a result, water-main samples from these sites were omitted from statistical comparisons between the two systems; drinking water samples from these locations, on the other hand, were included in comparisons because they were taken during normal operation.

Drinking water. In the chloraminated DWDS, water was sampled directly from water mains via pitot tubes, while in the no-residual DWDS, water was taken from faucets in nearby residential or commercial buildings. To reduce microbial contamination, metal taps were flame-sterilized and plastic taps were rinsed with a chlorine bleach solution. Taps were flushed for up to 5 min, and then 3–4 samples were collected in autoclave-sterilized bottles. After transport to the laboratory in a cooler, and within 1 hour after collection, each sample was vacuum-filtered through a 47-mm diameter, 0.2-µm pore-size nitrocellulose membrane (EMD Millipore, Billerica, MA, USA). The median filtrate volume was 1000 mL (range 747–1265 mL). Filters were placed in 0.5 mL lysis solution to begin DNA extraction. During each sampling event, triplicate negative controls were collected by filtering 2.0-mL aliquots of molecular biology-grade water through clean filters. These negative controls were placed in 0.5 mL lysis solution, like the drinking water samples. Ideally, water was collected at locations corresponding to the biofilm sample locations, but this was not always possible due to time constraints or limited access to suitable taps. Water samples were collected from six sites in the chloraminated DWDS (sites C1, C2, C4, C6, C7, and C9) and six sites in the no-residual DWDS (sites N1–N4, N6–N7) as well as treated water leaving the treatment plant from that system (site N0).

DNA extraction. Samples and negative controls were subjected to three freeze-thaw cycles and a 90-min incubation at 70°C. DNA was then extracted from the lysis solution using the FastDNA SPIN Kit (MP Biomedicals, Santa Ana, CA, USA) and stored at −20°C until subsequent analysis.

4.2.4 Real-time qPCR

We assessed mycobacterial atpE genes and the ITS region of MAC via real-time qPCR on a CFX Connect Real-Time PCR Detection System (Bio-Rad Laboratories, Inc., Hercules, CA, USA), using previously described methods [141, 142]. qPCR reactions consisted of 10.0 µL Bio-Rad SsoAdvanced Universal Probes Supermix, 20 µg bovine serum albumin (Roche Diagnostics, Indianapolis, IN, USA), appropriate concentrations of forward and reverse primers and probes, 1 µL DNA template, and molecular biology-grade water (Sigma-Aldrich, St. Louis, MO, USA) for a final volume of 20 µL. Primers and probes were synthesized by Integrated DNA Technologies, Inc. (Skokie, IL, USA). Primer sequences, concentrations, and PCR thermoprofiles are summarized in Table E.1 in the Supplementary Information [Appendix E].

Standard curves were created with serially diluted solutions of custom gBlocks gene fragments. These were synthesized by Integrated DNA Technologies using reference genes obtained from GenBank [103], which are summarized in Table E.2. A positive control containing genomic DNA from MAC was analyzed to verify qPCR targeting the ITS region of MAC. No-template controls and the negative controls (collected and extracted concurrently with biofilms and drinking water) exhibited no amplification during either assay. Amplification efficiencies of the standard curves ranged from 92.1 to 94.4% and from 92.6 to 93.8% for mycobacterial atpE genes and the ITS region of MAC, respectively. Additional standard curve information is summarized in Table E.3. The limit of quantification (LOQ) was defined as the lowest standard to reliably amplify (10 copies for both qPCR targets). Copy numbers for water-main biofilms and drinking water were normalized by sample area or filtrate volume and then log10-transformed. The method LOQ (LOQm) was calculated for each sample by applying the same normalization to the LOQ copy number. For comparing gene concentrations, the cendiff function in the NADA package [51] of R statistical software [45] was used to perform generalized Wilcoxon tests [52, 54]. Left-censored observations (i.e.,
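
The efficiency and normalization steps described in this section can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the efficiency formula is the standard relation between standard-curve slope and efficiency, the DNA extract volume is an assumed placeholder because it is not stated in this section, and all function names are hypothetical.

```python
import math


def amplification_efficiency(slope: float) -> float:
    """Efficiency from the slope of Cq versus log10(copies); 1.0 means 100%."""
    return 10 ** (-1.0 / slope) - 1.0


def normalized_log10(copies_per_reaction: float, template_ul: float,
                     extract_ul: float, sample_size: float) -> float:
    """log10 copies per cm^2 (biofilm) or per L (water).

    sample_size is the scraped area in cm^2 or the filtered volume in L.
    """
    copies_in_extract = copies_per_reaction * extract_ul / template_ul
    return math.log10(copies_in_extract / sample_size)


LOQ_COPIES = 10.0    # lowest reliably amplified standard (from the text)
TEMPLATE_UL = 1.0    # template volume per reaction (from the text)
EXTRACT_UL = 100.0   # ASSUMPTION: DNA extract volume, not given in this section

# Method LOQ for a biofilm sample with the median scraped area (1.3 cm^2)
loq_m_biofilm = normalized_log10(LOQ_COPIES, TEMPLATE_UL, EXTRACT_UL, 1.3)
# Method LOQ for a water sample with the median filtered volume (1.0 L)
loq_m_water = normalized_log10(LOQ_COPIES, TEMPLATE_UL, EXTRACT_UL, 1.0)

print(round(amplification_efficiency(-3.5), 3))
print(round(loq_m_biofilm, 2), round(loq_m_water, 2))
```

Because the same normalization is applied to the LOQ as to the measured copy numbers, each sample carries its own LOQm, which is what makes a censored-data test appropriate for comparing the two systems.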

4.2.5 Characterization of amplicon sequences

16S rRNA genes. As previously reported, purified PCR amplicons of bacterial 16S rRNA genes were sequenced to gather bacterial community information [32]. Amplicons were produced targeting the V3 region with custom versions of the 341F/534R primers (i.e., with Illumina adapters and index sequences), consistent with established methods [30, 31]. Primers and PCR thermoprofiles are summarized in Table E.1. Briefly, samples were sequenced if their copy numbers (via qPCR) were at least 10 times greater than no-template controls (equivalent to 1.3 × 10⁴ gene copies). Subsequent PCR amplification was performed based on the quantification cycle (Cq) values, with 5–10 additional cycles to produce amplicon libraries. Purified products were pooled with equal mass into a single amplicon library and then sequenced with 2 × 150 bp MiSeq (Illumina, Inc., San Diego, CA, USA). Unprocessed sequence reads are available from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (accession SRP148989). For sequence analysis, a custom reference database of 16S rRNA gene fragments was created from the SILVA 128 reference sequences and taxonomy (99% operational taxonomic units) [37, 38]. The reference sequences were trimmed to the target fragment of the V3 region (i.e., amplified by the 338F/518R primers) using the feature-classifier plugin in QIIME 2 v2018.2 [34].

hsp65 genes. PCR amplicons of mycobacterial hsp65 genes were sequenced to gather taxonomic information about the NTM present in the two systems. A 441-bp fragment was amplified using custom, Illumina-compatible versions of the Tb11/Tb12 PCR primers [143]. PCR reactions consisted of 25.0 µL Bio-Rad iTaq SYBR Green Supermix with ROX, 1 µg µL−1 of bovine serum albumin, 500 nmol L−1 of forward and reverse primers, 1 µL of template genomic DNA, and molecular biology-grade water (final volume = 50 µL). Primer sequences and PCR thermoprofile are provided in Table E.1. 
PCR products were purified with the QIAquick PCR Purification Kit (Qiagen; Hilden, Germany) and then quantified by the University of Minnesota Genomics Center (UMGC) using the PicoGreen assay (Thermo Fisher Scientific; Waltham, MA). Equal masses of all purified hsp65 gene amplicons were pooled to create a single library prior to sequencing. Finally, the amplicon library was sequenced with 2 × 250 bp Illumina MiSeq by UMGC, consistent with previously described methods [7, 144]. Unprocessed sequence reads are available from the NCBI Sequence Read Archive (accession SRP152199). A custom reference database of 163 published mycobacterial hsp65 gene sequences was obtained from GenBank [103]. These included sequences from characterized strains and type cultures of Mycobacterium spp., summarized in Table E.4. Reference sequences were manually curated and checked for proper sequence orientation. A custom, QIIME-compatible database based on GenBank taxonomy was created to classify the hsp65 gene sequences.

Amplicon sequence variants. The 16S rRNA and hsp65 gene sequence reads were processed using similar pipelines in QIIME 2 to produce amplicon sequence variants (ASVs) [145]. First, forward and reverse primer sequences were removed using cutadapt [146]. Next, filtering, de-replication, sample inference, chimeric sequence identification, and joining of paired-end reads were performed using DADA2 [147]. Additional filtering was performed using the VSEARCH [148] implementation of the UCHIME method [149] to identify and remove de novo chimeras in addition to borderline chimeric sequences. BLAST+ [150] was used to remove non-target ASVs. For 16S rRNA genes, ASVs from two separate MiSeq runs were merged and then filtered to exclude sequences with less than 97% alignment and 70% identity to the reference sequences. For hsp65, ASVs were filtered against the reference hsp65 sequences, requiring 100% alignment and 90% sequence identity, based on intra-species divergence rates for this gene [151].

hsp65 gene operational taxonomic units. To confirm taxonomic assignment and assess phylogenetic similarity to published NTM sequences, ASVs were clustered by 95% similarity into operational taxonomic units (OTUs) using the vsearch plugin in QIIME 2. The 95% similarity value was within the range of known intra-species divergence rates for mycobacterial hsp65 genes [151] and was selected for convenience to reduce the total number of reference sequences (i.e., due to the high number of ASVs present). Each OTU was represented using the centroid sequence within the OTU cluster. The OTUs were searched manually with the NCBI BLASTn v2.8.0 web utility [152] (limited to order Corynebacteriales within the non-redundant nucleotide collection to identify phylogenetically related non-target sequences, if any). 
The corresponding sequences of the top hits, in addition to several other high-scoring or clinically relevant NTM, were then compiled together with the representative OTU sequences and aligned using MAFFT [104] in the alignment plugin of QIIME 2. A phylogenetic tree was constructed with FastTree [40] and rooted using the midpoint.

Taxonomic classification. Two naïve Bayesian classifiers were trained via the feature-classifier plugin, using the reference sequence sets and their corresponding taxonomy designations. ASVs were classified with 95% or higher bootstrap confidence.
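
The BLAST+ filtering thresholds given under "Amplicon sequence variants" can be expressed as a short sketch. It assumes that "alignment" refers to the percentage of the ASV covered by the alignment (query coverage), which is an interpretation rather than something the text states; the `BlastHit` class, the threshold dictionary, and the example hits are hypothetical and do not reproduce the actual pipeline.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    asv_id: str
    query_coverage: float    # percent of the ASV aligned (ASSUMED meaning of "alignment")
    percent_identity: float  # percent identity within the aligned region


# Threshold values are taken from the text; the layout is illustrative.
THRESHOLDS = {
    "16S": {"min_coverage": 97.0, "min_identity": 70.0},
    "hsp65": {"min_coverage": 100.0, "min_identity": 90.0},
}


def keep_target_asvs(hits: list[BlastHit], marker: str) -> set[str]:
    """Return IDs of ASVs with at least one hit meeting both thresholds."""
    limits = THRESHOLDS[marker]
    return {
        h.asv_id
        for h in hits
        if h.query_coverage >= limits["min_coverage"]
        and h.percent_identity >= limits["min_identity"]
    }


# Hypothetical hits for illustration only.
hits = [
    BlastHit("asv1", 100.0, 99.8),
    BlastHit("asv2", 100.0, 88.0),  # identity too low for hsp65
    BlastHit("asv3", 96.0, 95.0),   # alignment too short for either marker
]
print(sorted(keep_target_asvs(hits, "hsp65")))
```

The stricter hsp65 thresholds reflect the narrower target (a single genus), whereas the looser identity threshold for 16S rRNA genes only needs to exclude non-bacterial or off-target amplicons.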

Correlation of Mycobacterium- and Methylobacterium-like ASVs. Relative abundances of Mycobacterium- and Methylobacterium-like 16S rRNA gene ASVs were normalized by 16S rRNA gene copy numbers (via qPCR) to reduce compositional bias prior to assessing Spearman rank correlations [153]. Alternatively, a correlation network of all genera was inferred via SparCC, using the average of 20 iterations and a bootstrap procedure to calculate pseudo-p values from 100 permutations [154].
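
The copy number normalization and rank correlation described above can be sketched as follows. The sketch assumes that normalization means multiplying each relative abundance by the total 16S rRNA gene copy number of the same sample; the helper names and the four-sample data set are hypothetical, and the SparCC analysis is not reproduced.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def to_absolute(relative_abundance, total_16s_copies):
    """Scale relative abundances by qPCR 16S rRNA gene copies per sample."""
    return [r * c for r, c in zip(relative_abundance, total_16s_copies)]


# Hypothetical values for four samples (illustration only).
copies = [1e4, 5e4, 2e5, 8e5]
myco = to_absolute([0.60, 0.40, 0.30, 0.10], copies)
methylo = to_absolute([0.01, 0.02, 0.01, 0.03], copies)
print(spearman_rho(myco, methylo))
```

Relative abundances within a sample are constrained to sum to one, so an increase in one genus forces an apparent decrease in others; scaling by total copy number is one way to reduce that artifact before ranking.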

4.3 RESULTS

4.3.1 Water quality

We previously reported higher water temperatures and nutrient concentrations in the chloraminated DWDS compared to the no-residual DWDS [32]. In summary, the chloraminated DWDS had high seasonal variation in distributed water temperatures (median 16.1°C; range 1.8–34.1°C) and was typically warmer than the no-residual DWDS (median 7.2°C; range 5.2–8.5°C). Nutrients were also higher in the chloraminated DWDS compared to the no-residual DWDS, including AOC (238–343 versus 81–109 µg acetate-C L−1, respectively), free ammonia (<0.02–0.97 vs. <0.02 mg N L−1), nitrate (0.36–1.73 vs. 0.22–0.26 mg N L−1), and phosphorus (0.17–0.52 vs. <0.05 mg P L−1). Water hardness was higher in the chloraminated DWDS (44–93 vs. 36–54 mg CaCO3 L−1), but water pH was similar (chloraminated = 7.5–9.5 vs. no residual = 7.8–8.6).

4.3.2 Quantification of marker genes

Bacterial 16S rRNA genes. As previously reported [32], bacterial 16S rRNA gene concentrations in water-main biofilms were significantly lower in the chloraminated DWDS than the no-residual DWDS (p = 1 × 10⁻⁶; Fig. 4.1A). The median observation in the chloraminated DWDS was

Mycobacterial atpE genes and the ITS region of MAC. Mycobacterial atpE genes were commonly detected in the water-main biofilms and drinking water from both systems, but biofilm concentrations were significantly higher in the chloraminated DWDS (p = 5 × 10⁻⁹; Figs. 4.1A and 4.2A). In the chloraminated DWDS, atpE genes were observed in all 19 biofilm samples from full-operation water mains and 5 of 16 biofilm samples from water mains with seasonal shutoff. The median observation among full-operation water mains was 9.3 × 10⁴ copies cm−2 and was higher than the median in the no-residual DWDS, which was

4.3.3 Characterization of 16S rRNA gene amplicons

There were 45 Mycobacterium-like 16S rRNA gene ASVs comprising 650,927 sequence reads. The read frequency per sample ranged from 5 to 124,509 (median = 653 reads). Conversely, there were only 5 Methylobacterium-like ASVs, with total frequency of 40,005 reads. These ranged from 12 to 5,552 reads per sample (median = 204). Relative abundances of these Mycobacterium- and Methylobacterium-like ASVs are presented in Figs. 4.1B and 4.2B for water-main biofilms and drinking water, respectively.

The copy number-normalized ratios of Mycobacterium- and Methylobacterium-like ASVs were plotted to assess correlation (Fig. E.2). Spearman rank correlations indicated no significant correlation between these two taxonomic groups in water-main biofilms from the no-residual DWDS (p = 0.23, Spearman’s ρ = 0.26) and drinking water from either system (p = 0.08 and 0.59; ρ = 0.4 and 0.12, chloraminated and no-residual systems, respectively). There was a significant positive correlation, however, in biofilms from the chloraminated DWDS (p = 0.01, ρ = 0.80). With the exception of the chloraminated biofilm samples, these results were generally in agreement with SparCC, where there was no inferred correlation between the two genera in chloraminated biofilms (permuted p = 0.86, correlation = −0.02), chloraminated drinking water (perm. p = 0.76, cor. = 0.02), no-residual biofilms (perm. p = 0.92, cor. = 0.01), and no-residual drinking water (perm. p = 0.44, cor. = −0.05).

Relative abundances of Mycobacterium-like 16S rRNA gene ASVs from high-throughput sequencing were also compared to the ratios of atpE genes to bacterial 16S rRNA genes via real-time qPCR (Fig. E.3). These correlated strongly (Spearman’s ρ = 0.81, p < 2 × 10⁻¹⁶), though the relative abundances of Mycobacterium-like ASVs were always greater than the corresponding atpE:16S rRNA gene ratios.

4.3.4 Characterization of hsp65 gene amplicons

In total, there were 50 ASVs representing 45,766 hsp65 gene sequence reads. Library size varied from 9 to 6,769 reads per sample (median = 384). After taxonomic classification, 11 ASVs were identified as M. gordonae with 98.0–100.0% confidence (median = 98.5%). The remaining 39 ASVs failed species-level assignment using the naïve Bayesian classifier. There were 9 ASVs among the water-main biofilms and 6 ASVs among the drinking water samples in the chloraminated DWDS. In the no-residual DWDS, there were 32 ASVs in the biofilms and 8 ASVs in the drinking water. After clustering the ASVs by 95% similarity, there were 15 unique hsp65 gene OTUs. Using the BLASTn web utility to classify the OTUs, the top-scoring GenBank hits had 91.3–99.8% identity to published mycobacterial hsp65 sequences and are summarized in Table E.5. Only OTU 1, which represented the 11 M. gordonae-like ASVs, was confidently classified with the BLASTn web utility. BLASTn was in agreement with the previous classification (99.8% identity to M. gordonae, with 100% query coverage). All other OTUs, which had failed species-level classification with the naïve Bayesian classifier, were less confidently classified to the species level (i.e., with either <97% identity or <100% query coverage). The top hits of these OTUs were consistently from among published Mycobacterium spp., however, which suggested these OTUs were likely from genus Mycobacterium. They are likely novel NTM without characterized representatives in GenBank. Phylogenetic analysis further supported the unclassified OTUs as novel NTM, as opposed to other genera with hsp65 genes (Fig. E.1). A heat map of the hsp65 gene OTUs was created to visualize occurrence of the OTUs among the water-main biofilms and drinking water samples in both systems (Fig. 4.3). A heat map depicting all 50 ASVs is also provided (Fig. E.4).
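The species-level criterion described above (at least 97% identity together with 100% query coverage) can be expressed as a simple decision rule. The sketch below is illustrative only: the `BlastHit` record and function name are hypothetical, the thresholds are taken from the text, and the second example hit uses the lowest identity reported above with an assumed coverage value.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    """Hypothetical record for the top-scoring BLASTn hit of one OTU."""
    otu: str
    species: str
    percent_identity: float
    query_coverage: float


def is_confident_species_call(hit, min_identity=97.0, min_coverage=100.0):
    """True when the hit meets both thresholds stated in the text."""
    return (hit.percent_identity >= min_identity
            and hit.query_coverage >= min_coverage)


# OTU 1 values are those reported in the text.
otu1 = BlastHit("OTU 1", "M. gordonae", 99.8, 100.0)
# Illustrative hit: 91.3% is the lowest identity reported; the species label
# and coverage value are assumptions made for this example.
example_low = BlastHit("OTU X", "Mycobacterium sp.", 91.3, 100.0)

for hit in (otu1, example_low):
    label = "confident" if is_confident_species_call(hit) else "less confident"
    print(hit.otu, hit.species, label)
```

Requiring both conditions is the conservative choice: a hit with high identity over only part of the query, or full coverage at low identity, is not treated as a species-level match.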

Table 4.1: Summary of marker gene concentrations via real-time qPCR*

                             Biofilms (copies cm⁻²)                      Drinking water (copies L⁻¹)
Target         Statistic     Chloraminated†   No residual   p value‡     Chloraminated   No residual   p value‡
               n             19               23                         23              22
Bacterial      nobs          9 (47%)          23 (100%)                  20 (87%)        22 (100%)
Mycobacterial  nobs          19 (100%)        9 (39%)                    17 (74%)        22 (100%)
               Median        9.3 × 10⁴

* LOQm, method limit of quantification; nobs, incidences observed (i.e., >LOQm); min., minimum; max., maximum; n.a., not applicable
† Observations from seasonal shut-off water mains excluded
‡ Group-wise comparisons (chloraminated versus no residual) using generalized Wilcoxon tests

[Figure 4.1 plot. Panel A: log10(copies cm⁻²) of bacterial 16S rRNA genes, mycobacterial atpE genes, and M. avium complex-like ITS region by sample site (C2–C10 in the chloraminated DWDS; N5a, N5b, N5c, N6, N7a, and N7b in the no-residual DWDS). Panel B: sequences per 10,000 (log scale, 10⁰ to 10⁴) of Mycobacterium-like and Methylobacterium-like ASVs at the same sites.]

Figure 4.1: Characterization of water-main biofilms: (A) Marker gene concentrations via real-time qPCR, and (B) relative abundance of Mycobacterium- and Methylobacterium-like amplicon sequence variants (ASVs) of 16S rRNA gene fragments. Samples not sequenced due to low 16S rRNA gene copy numbers are represented with empty space. Sites C7–C10 were subjected to seasonal shutoff.

[Figure 4.2 plot. Panel A: log10(copies L⁻¹) of bacterial 16S rRNA genes, mycobacterial atpE genes, and M. avium complex-like ITS region by sample site (C1, C2, C4, C6, C7, and C9 in the chloraminated DWDS; N0, N1, N2, N3, N4, N6, and N7 in the no-residual DWDS). Panel B: sequences per 10,000 (log scale, 10⁰ to 10⁴) of Mycobacterium-like and Methylobacterium-like ASVs at the same sites.]

Figure 4.2: Characterization of drinking water: (A) Marker gene concentrations via real-time qPCR, and (B) relative abundance of Mycobacterium- and Methylobacterium-like amplicon sequence variants (ASVs) of 16S rRNA gene fragments. Samples not sequenced due to low 16S rRNA gene copy numbers are represented with empty space.

4.4 DISCUSSION

This investigation demonstrates that residual chloramine facilitates a reservoir primarily of Mycobacterium gordonae in the water-main biofilms of a chloraminated DWDS. In contrast, M. frederiksbergense dominated another chloraminated DWDS, accounting for up to 85.7% of hsp65 gene sequences in that DWDS [7]. Nonetheless, in both that DWDS and the chloraminated DWDS in the present study, the NTM-dominated biofilm communities consisted mostly of a single NTM species. This suggests that residual chloramine may not encourage all NTM equally, but rather, other factors may favor certain NTM in the DWDS microbiome. It remains unclear, however, whether deterministic factors contribute to this selection or whether the process is stochastic. Because this trend has only been observed in water-main biofilms from two systems, additional surveys of chloraminated systems are necessary. Nonetheless, it would be of particular concern to public health if chloraminated biofilms could primarily harbor a pathogenic species, such as M. avium.

Though there were similar concentrations of NTM in drinking water from the chloraminated and no-residual systems, biofilm concentrations were significantly higher in the chloraminated DWDS. High relative abundances of NTM have been previously observed in biofilms from chloraminated systems (i.e., >90% of 16S rRNA gene amplicon sequences) [Chapter 2] [7], but fractions of marker gene sequences can be misleading [153]. Dominance of a particular taxon can be achieved without any changes to its concentration, simply by eliminating other taxa. Total bacteria concentrations were significantly lower in the chloraminated DWDS compared to the no-residual DWDS, despite more favorable temperatures, AOC, and inorganic nutrients [32]. In contrast, the significantly higher biofilm concentrations of mycobacterial atpE genes in the chloraminated DWDS suggest that the NTM in that system benefit from reduced competition.
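The point that a taxon can become dominant without any change in its own concentration can be made concrete with a short calculation. The concentrations below are hypothetical values chosen for illustration and are not measurements from either system.

```python
def relative_abundance(taxon_copies, total_copies):
    """Fraction of the total marker gene copies contributed by one taxon."""
    return taxon_copies / total_copies


# Hypothetical gene copy concentrations (copies per cm^2), not study data.
ntm = 1.0e4             # NTM concentration, identical in both scenarios
others_before = 9.9e5   # all other taxa before suppression
others_after = 1.0e3    # all other taxa after suppression

before = relative_abundance(ntm, ntm + others_before)
after = relative_abundance(ntm, ntm + others_after)

# The NTM concentration is unchanged, yet its share of the community rises
# from about 1% to about 91% because the other taxa were removed.
print(f"before: {before:.1%}, after: {after:.1%}")
```

This is why the absolute atpE gene concentrations from qPCR, rather than sequence fractions alone, are needed to argue that NTM actually increased in the chloraminated biofilms.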
Though many bacterial taxa are suppressed by residual free chlorine and chloramine, NTM can tolerate these disinfectants due to their hydrophobic cell membranes [73]. As a result, the NTM may increase in response to both the favorable temperatures and uncontested nutrient substrates. Unfortunately, we cannot compare our water-main biofilm concentrations to other studies due to a lack of such data in the literature. The atpE gene concentrations in the drinking water of the chloraminated and no-residual systems (median = 9.3 × 10⁴ and 2.9 × 10³ copies L⁻¹, respectively), however, were lower than observed in another chloraminated DWDS (median = 6.9 × 10⁶ copies L⁻¹) [133].

Figure 4.3: Heat map of mycobacterial hsp65 gene operational taxonomic units (clustered by 95% similarity) in water-main biofilms and drinking water of two drinking water distribution systems. A pseudo-count of 1 was added to frequencies prior to logarithmic transformation of undetected OTU observations (i.e., 0 = no detection).

Figure 4.3 values, log10(frequency); rows are sample sites and columns are operational taxonomic units 1 to 15, left to right:

Chloraminated DWDS, biofilm
C2 3.6 0 0 0 0 0 0 0 0 0 0 0 0 0 0
C2 3.1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
C3 3.8 0 2.8 0 0 0 0 0 0 0 0 0 0 0 0
C3 3.8 0 2.8 0 0 0 0 0 0 0 0 0 0 0 0
C4 2.7 0 0 0 0 0 0 0 0 0 0 0 0 0 0
C5 3.6 0 0 0 0 0 0 0 0 0 0 0 0 0 0
C5 2.4 0 0 0 0 0 0 0 0 0 0 0 0 0 0
C6 3.1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
C6 3.1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
C6 3.5 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Chloraminated DWDS, water
C4 3.1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
C4 2.2 0 0 0 0 0 0 2.1 0 0 0 0 0 0 0
C9 2.4 0 0 0 0 0 0 0 0 0 0 0 0 0 0
C9 1.6 0 0 0 0 0 0 0 0 0 0 0 0 0 0

No-residual DWDS, biofilm
N5a 0 0 0 2.3 0 0 0 0 0 0 0 0 0 0 0
N5a 0 0 0 2.6 0 0 0 0 0 0 0 0 0 0 0
N5a 0 0 0 2.3 0 0 0 0 0 0 0 0 0 0 0
N5c 0 0 0 0 0 0 1.1 0 0 0 0 0 0 0 0
N5c 0 0 0 2.1 0 0 2.1 0 0 0 0 0 0 0 0
N6 0 1.6 0 0 0 0 0 0 0 1.3 0 0 0 0 0
N7a 0 3.6 0 0 0 0 0 0 0 0 0 0 1.1 0 0.6
N7a 0 3.1 0 0 0 0 0 0 0 0 0 1.3 0 0 0
N7a 0 2.3 0 0 1.3 0 0 0 0 0 0 0 0 0 0
N7a 0 3.8 0 0 2.2 0 1 0 0 0 0 0 0 0 0
N7b 0 2.1 0 2.4 0 2.5 0 0 0 0 1.3 0 0 0 0

No-residual DWDS, water
N1 0 0 0 0 1.2 0 0 0 0 0 0 0 0 0 0
N1 0 0 0 0 2.9 0 0 0 0 0 0 0 0 0 0
N3 0 2.4 0 0 0 0 0 0 1.8 0 0 0 0 0 0
N4 0 0 0 0 1.3 0 0 0 0 0 0 0 0 0 0
N4 0 2.3 0 0 2.2 0 0 0 1.3 1.3 0 0 0 0 0
N7 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0

The apparent absence of MAC in both systems was a welcome finding for public health. It is still possible MAC are present either in low concentrations or intermittently in the drinking water. Though we did not investigate biofilms in premise plumbing, these environments are especially favorable for growth of MAC [125]. Thus, a single incidence of MAC in the water could contaminate faucets or shower heads. Our findings indicate, however, that water-main biofilms in these two systems were not significant reservoirs of MAC. Still, the NTM that were present may pose a minor but persistent threat to certain human populations. M. gordonae have been previously observed to cause infection [134, 155]. Such cases have been rare, however, despite M. gordonae being one of the most commonly observed NTM in drinking water [156, 157]. Additionally, because many of the NTM were novel, particularly in the no-residual DWDS, it is unclear whether these NTM could cause infection. The novel NTM did not have phylogenetically similar hsp65 genes to known opportunistic pathogens, however.

High concentrations of residual chloramine may favor certain NTM, particularly M. gordonae, over other NTM.
The chloraminated DWDS produces drinking water with a residual concentration near 4.0 mg Cl₂ L⁻¹, which is the maximum residual disinfectant level permitted by the U.S. Safe Drinking Water Act. In tap water, the median concentration was 3.3 mg Cl₂ L⁻¹ (range = 0.3–4.1 mg Cl₂ L⁻¹). M. gordonae have appeared to increase in biofilms after treatment with high chlorine concentrations (i.e., for chlorine shock treatment) as well as in a water system that produced super-oxidized water for surgeries [20, 158]. In contrast, MAC in chloraminated water increased with high water age [133]. The total chlorine concentrations, which may decrease with increasing water age, appeared lower in that DWDS than the chloraminated DWDS in this study (median = 2.2, range = 0.5–2.9 mg Cl₂ L⁻¹). Furthermore, though we were unable to calculate within-sample diversity of hsp65 gene ASVs due to small library sizes, the biofilms in the chloraminated DWDS yielded the highest number of sequence reads despite exhibiting fewer ASVs and only 3 OTUs. The no-residual DWDS, on the other hand, consistently had small library sizes yet contained most of the ASVs (12 of the 15 OTUs). This gave the impression that these NTM communities, despite being minor fractions of the total bacteria in the no-residual DWDS, were more diverse than the NTM in the chloraminated DWDS.

Mycobacteria and methylobacteria have been previously associated with chloraminated systems [69]. This was initially observed when a 16S rRNA gene survey found that Methylobacterium and Mycobacterium seldom co-existed in shower-head biofilms [20]. In another study, cultured colonies of one genus were a good predictor for absence of the other genus in shower-head biofilms [21]. Both genera tolerate residual chloramine and may directly compete for a similar ecological niche; methylobacteria may outcompete mycobacteria when residual chloramine is diminished [21].
In the present study, no clear relationship was observed between these two genera. Methylobacterium-like ASVs were generally minor fractions of the 16S rRNA gene profiles, however, compared to the Mycobacterium-like ASVs in the water-main biofilms from the chloraminated DWDS. Mycobacteria may outcompete methylobacteria in the water-main biofilms because residual chloramine concentrations are less likely to diminish drastically, as is common in premise plumbing, unless conditions in the DWDS chronically favor chloramine decay (e.g., long periods of stagnation or nitrification) [5, 125]. The copy number-normalized fractions of the genera suggested a positive correlation in the biofilms of the chloraminated DWDS, which was inconsistent with the negative correlations previously reported [20, 21]. Because of the disagreement with the inferred correlations via SparCC, however, this may have been due either to limited sample quantities or other biases from amplicon sequencing and qPCR.

SUPPORTING INFORMATION

Supplementary figures, including a phylogenetic tree of mycobacterial hsp65 gene sequences, Methylobacterium- versus Mycobacterium-like ASVs, atpE:16S rRNA gene ratios versus Mycobacterium-like ASVs, and a heat map of mycobacterial hsp65 gene ASVs in biofilms and drinking water of the two distribution systems; supplementary tables, including PCR primer and probe sequences and thermoprofiles, sequences and accession numbers for qPCR standards, a summary of real-time qPCR reactions, reference hsp65 gene sequences from GenBank used for the naïve Bayesian classifier, and taxonomic classification of hsp65 OTUs.

ACKNOWLEDGEMENTS

We thank the two water utilities for providing access to their drinking water distribution systems. This work was supported primarily with a grant from the water utility in the United States. The water utility in Norway contributed additional financial resources. Collaboration between the University of Minnesota (UMN) and the Norwegian University of Science and Technology was possible with funding from the Norwegian Center for International Cooperation in Education (grant NNA-2012/10128). Kyle Sandberg and Hanna Temme assisted in water sample collection. The UMN Genomics Center sequenced the 16S rRNA and atpE gene amplicons and provided additional technical support. Sequence analysis was possible using resources from the Minnesota Supercomputing Institute. We thank Dr. Sara-Jane Haig for providing DNA extract of M. avium for use as a positive control in real-time qPCR. The authors declare no competing financial interest.

Chapter 5

Conclusions

Bacteria in the DWDS microbiome can impact public health as well as distribution system infrastructure. Because more than 95% of the bacterial biomass in the DWDS is present in biofilms [17], water samples alone may be insufficient for comprehensive monitoring of the microbiome. Water-main biofilms are challenging to investigate due to their general inaccessibility, such that detailed characterizations of DWDS microbiomes are lacking. In this investigation, drinking water and water-main biofilms were sampled from two distribution systems to characterize and compare their microbiomes: a system that maintains residual chloramine and another with no residual disinfectant. The results provide novel information on relationships between biofilms and suspended biomass within a system and suggest possible effects of the presence or absence of a residual disinfectant on the DWDS microbiome. The primary contributions of this work can be summarized in six points:

Residual chloramine may limit the ecological diversity of water-main biofilms in a chloraminated DWDS [Chapter 2]. There were no significant differences in within-sample diversity of biofilm communities versus the corresponding drinking water in the no-residual system. In contrast, community richness and evenness were significantly lower in the biofilms of the chloraminated DWDS relative to its water. Nonetheless, in both systems, drinking water was composed of different taxa than the respective biofilm communities. Microbial monitoring of the water alone will therefore misrepresent the taxonomic diversity within the overall DWDS microbiome. Furthermore, due to the suppression of nearly all but the most well-adapted taxa, biofilms in chloraminated systems may be more predictable. There may be specific environmental bacteria common among chloraminated systems, either due to their resilience to chloramine (e.g., Mycobacterium spp.) or their affinity for chloramine-derived ammonia (i.e., AOB) [Chapter 2] [5, 7, 68, 83]. Additional investigation of water-main biofilms from full-scale systems is necessary, however, to determine whether the observations in this study extend to the microbiomes of other similar systems.

Despite ostensibly more favorable growth conditions than the no-residual DWDS, the chloraminated DWDS had lower bacterial biomass in the drinking water and especially the water-main biofilms [Chapters 2, 3, and 4] [32]. The chloraminated DWDS had higher water temperatures and AOC, ammonia, nitrate, and phosphorus concentrations than the no-residual DWDS. In addition to the maintenance of residual chloramine, water treatment, particularly the presence of coagulation, sedimentation, and filtration, may explain the lower concentrations of suspended bacteria in the chloraminated DWDS than the no-residual DWDS. The lower biomass in the water-main biofilms of the chloraminated DWDS, however, appears to be primarily a result of the residual chloramine. This finding is consistent with residual chloramine as an effective strategy for limiting biofilm biomass. In addition to suppressing bacterial growth directly, either in the bulk water or by penetrating biofilm extracellular polymeric substances [119], residual chloramine may indirectly limit growth by making the microbiome less metabolically efficient. Growth-supporting substrates may go under-utilized due to the suppression of chloramine-susceptible taxa that would otherwise occupy available niches. In contrast, the no-residual DWDS, which was not biologically stable (i.e., AOC >10 µg L⁻¹ [3]) but nonetheless had lower AOC, inorganic nutrients, and water temperatures than the chloraminated DWDS, had many taxa present that are typically associated with diverse metabolic functions. Thus, in systems with relatively high nutrient loads in the source water and where financial or spatial constraints complicate the ability of the water utility to produce biologically stable drinking water, residual chloramine may be an effective strategy for managing the microbiome.

Water-main biofilms may serve as a reservoir of Legionella spp., and residual chloramine may aid in reducing this reservoir [Chapter 3] [32]. Legionella-like 16S rRNA gene sequences were detected in the drinking water and biofilms of both the chloraminated and no-residual systems. Legionellae are of importance to public health because, like MAC, they are opportunistic waterborne pathogens of environmental origin [89]. Though phylogenetic analysis of the Legionella-like 16S rRNA genes in both systems supported their classification as genus Legionella, Legionella-like ssrA genes were only detected in the biofilms of the no-residual DWDS. These markers were detected in the drinking water of both systems, however, and the temperature and nutrient conditions of the chloraminated DWDS were ostensibly more favorable for sustaining legionellae. These findings point to residual chloramine as an important factor in this discrepancy. Detection of L. pneumophila-like mip genes in some biofilms of both systems nonetheless indicated that the most pathogenic species of legionellae may be present, though no wzm genes of L. pneumophila serogroup 1 were detected. Thus, residual chloramine may be an important tool for reducing the reservoir of legionellae in the water-main biofilms. Residual chloramine may also act as an additional safeguard for preventing suspended legionellae (potentially released from water-main biofilms) from contaminating premise plumbing systems via the drinking water.

Water-main biofilms in a chloraminated system were composed predominantly of nontuberculous mycobacteria [Chapters 2 and 4]. The most dominant genus observed in the biofilms of the chloraminated DWDS was Mycobacterium [Chapter 2], which was consistent with another chloraminated DWDS [7]. The presence of NTM is of possible concern to public health because there are Mycobacterium spp. that can cause infection in immune-compromised individuals [71]. Quantities of mycobacterial atpE genes were not significantly different in the drinking water of the two systems but were significantly higher in the biofilms of the chloraminated DWDS. This suggested that biofilm-associated NTM may thrive in the presence of residual chloramine due to both their tolerance to the disinfectant and the decreased competition for limited resources [Chapter 4] [73]. Despite the incidence of NTM in both systems, however, no MAC-like 16S-23S ITS fragments were detected in either the biofilms or the drinking water [Chapter 4], which was favorable from a public health perspective because MAC infections are the most common cause of NTM-related illness [134]. Characterization of mycobacterial hsp65 genes suggested that M. gordonae was uniformly the most prominent species present in the chloraminated DWDS [Chapter 4]. M. gordonae are generally the most observed NTM in drinking water systems but are rarely associated with illness [134, 157]. The dominance of a single species in the chloraminated DWDS (i.e., M. gordonae) was consistent with the alpha diversity of 16S rRNA genes in biofilms from that system [Chapter 2]. In contrast, NTM were less abundant in the no-residual DWDS, but there were more hsp65 gene ASVs present [Chapter 4]. These sequences could not be classified to a species, however, and may represent novel mycobacteria.
Though residual chloramine has been previously associated with high incidence of NTM in chloraminated water systems, it may selectively favor specific species, such as M. gordonae—decreasing overall NTM diversity. Finally, Mycobacterium- and Methylobacterium-like 16S rRNA gene ASVs did not exhibit negative correlation, as previously observed in biofilms from chloraminated premise plumbing systems [20, 21]. Because methylobacteria were minor fractions of the water-main biofilms in the chloraminated DWDS, it may be that consistent chloramine concentrations in the DWDS give NTM an advantage over methylobacteria.

Chloramine may encourage growth of Nitrosomonas oligotropha-like AOB in water-main biofilms [Chapter 2]. AOB are potentially problematic for water utilities because AOB contribute to biologically-accelerated chloramine decay [5]. Characterization of 16S rRNA genes provided evidence that Nitrosomonas-like taxa were prominent in the biofilms of the chloraminated DWDS [Chapter 2]. Quantities of N. oligotropha-like amoA genes indicated that the functional potential for ammonia oxidation by AOB was more prevalent in the chloraminated DWDS than the no-residual DWDS. Despite low concentrations of ammonia, however, the no-residual DWDS had both AOB and AOA present. AOA outnumbered the N. oligotropha-like AOB, consistent with a low-ammonia environment. The relative abundances of NOB in the no-residual DWDS were also consistent with the expected AOB:NOB ratios if complete, two-step nitrification were occurring. Due to the prominence of Nitrospira-like taxa, comammox organisms may also have been present, as they have been observed in other low-ammonia drinking water systems [81, 82]. On the other hand, an apparent orders-of-magnitude deficiency of NOB in the chloraminated DWDS suggested that abiotic reaction of nitrite with chloramine hinders two-step biotic nitrification while enhancing chloramine decay. The frequent detection of Nitrosomonas-like 16S rRNA genes concurrently with N. oligotropha-like amoA genes, and low or non-existent incidence of AOA and NOB, also indicated that chloramine may specifically encourage N. oligotropha in the water-main biofilms while decreasing nitrifier diversity. This was consistent with other work [83] and mirrors the apparent effect of chloramine in decreasing overall biofilm bacteria diversity [Chapter 2] as well as biofilm NTM diversity [Chapter 4].

The under-tubercle communities from the two systems were relatively similar and included taxa often associated with microbiologically influenced corrosion [Chapter 2]. Under-tubercle samples from the chloraminated DWDS, which were composed mostly of Desulfovibrio-like taxa, were, however, significantly less diverse than those from the no-residual DWDS. It may be that chloramine contributes indirectly to a lower diversity under the tubercles by suppressing potential colonizers, which presumably originate from the chloraminated drinking water. The diversity and taxonomic composition of these communities, however, may be significantly influenced by pipe material [84] or other factors, such as sulfate concentration. Unlike the chloraminated DWDS, sulfate was not added to water in the no-residual DWDS during treatment, and Desulfovibrio-like taxa were prominent in only a few samples. Nonetheless, the presence of taxa in both systems that are associated with microbiologically influenced corrosion suggests that presence or absence of residual chloramine may not significantly reduce or prevent microbiologically influenced corrosion.

References

1. Rosario-Ortiz F, Rose J, Speight V, von Gunten U, Schnoor J. How do you like your tap water? Science 2016; 351: 912–914.

2. Douterelo I, Husband S, Loza V, Boxall J. Dynamics of biofilm regrowth in drinking water distribution systems. Appl Environ Microb 2016; 82: 4155–4168.

3. van der Kooij D, van Lieverloo JHM, Schellart JA, Hiemstra P. Distributing drinking water without disinfectant: Highest achievement or height of folly? J Water SRT—Aqua 1999; 48: 31–37.

4. Edwards MA, Triantafyllidou S, Best D. Elevated blood lead in young children due to lead-contaminated drinking water: Washington, DC, 2001–2004. Environ Sci Technol 2009; 43: 1618–1623.

5. Krishna KCB, Sathasivan A, Sarker DC. Evidence of soluble microbial products accelerating chloramine decay in nitrifying bulk water samples. Water Res 2012; 46: 3977–3988.

6. Liu G, Verberk JQJC, van Dijk JC. Bacteriology of drinking water distribution systems: An integral and multidimensional review. Appl Microbiol Biotechnol 2013; 97: 9265–9276.

7. Gomez-Smith CK, LaPara TM, Hozalski RM. Sulfate reducing bacteria and mycobacteria dominate the biofilm communities in a chloraminated drinking water distribution system. Environ Sci Technol 2015; 49: 8432–8440.

8. Bai X, Ma X, Xu F, Li J, Zhang H, Xiao X. The drinking water treatment process as a potential source of affecting the bacterial antibiotic resistance. Sci Total Environ 2015; 533: 24–31.

9. Zhang S, Lin W, Yu X. Effects of full-scale advanced water treatment on antibiotic resistance genes in the Yangtze Delta area in China. FEMS Microbiol Ecol 2016; 92: fiw065–9.

10. World Health Organization. Legionella and the Prevention of Legionellosis. WHO Press, Geneva, Switzerland, 2007.

11. World Health Organization. Fact Sheet: Drinking Water, 2018. URL http://www.who.int/news-room/fact-sheets/detail/drinking-water.

12. Delpla I, Jung AV, Baures E, Clement M, Thomas O. Impacts of climate change on surface water quality in relation to drinking water production. Environ Int 2009; 35: 1225–1233.

13. Liyanage CP, Yamada K. Impact of population growth on the water quality of natural water bodies. Sustainability 2017; 9: 1405–14.

14. Jones CH, Shilling EG, Linden KG, Cook SM. Life cycle environmental impacts of disinfection technologies used in small drinking water systems. Environ Sci Technol 2018; 52: 2998–3007.

15. Ravenscroft P, Mahmud ZH, Islam MS, Hossain AKMZ, Zahid A, Saha GC, Ali AHMZ, Islam K, Cairncross S, Clemens JD, Islam MS. The public health significance of latrines discharging to groundwater used for drinking. Water Res 2017; 124: 192–201.

16. Murphy HM, Prioleau MD, Borchardt MA, Hynds PD. Review: Epidemiological evidence of groundwater contribution to global enteric disease, 1948–2015. Hydrogeol J 2017; 25: 981–1001.

17. Flemming HC, Percival SL, Walker JT. Contamination potential of biofilms in water distribution systems. Wa Sci Technol 2002; 2: 271–280.

18. LeChevallier MW, Gullick RW, Karim MR, Friedman M, Funk JE. The potential for health risks from intrusion of contaminants into the distribution system from pressure transients. J Water Health 2003; 1: 3–14.

19. American Water Works Association. Buried No Longer. Tech. rep., Denver, Colorado, 2012.

20. Feazel LM, Baumgartner LK, Peterson KL, Frank DN, Harris JK, Pace NR. Opportunistic pathogens enriched in showerhead biofilms. PNAS 2009; 106: 16393–16399.

21. Falkinham III JO, Williams MD, Kwait R, Lande L. Methylobacterium spp. as an indicator for the presence or absence of Mycobacterium spp. Int J Myco 2016; 5: 240–243.

22. Stalter D, O’Malley E, von Gunten U, Escher BI. Fingerprinting the reactive toxicity pathways of 50 drinking water disinfection by-products. Water Res 2016; 91: 19–30.

23. American Water Works Association. Nitrification. Tech. rep., U.S. Environmental Protection Agency, Washington, DC, 2002.

24. Chiao TH, Clancy TM, Pinto A, Xi C, Raskin L. Differential resistance of drinking water bacterial populations to monochloramine disinfection. Environ Sci Technol 2014; 48: 4038–4047.

25. Prest EI, Hammes F, van Loosdrecht MCM, Vrouwenvelder JS. Biological stability of drinking water: Controlling factors, methods, and challenges. Front Microbiol 2016; 7: 1–24.

26. Beech IB, Sunner J. Biocorrosion: Towards understanding interactions between biofilms and metals. Curr Opin Biotech 2004; 15: 181–186.

27. Wingender J, Flemming HC. Biofilms in drinking water and their role as reservoir for pathogens. Int J Hyg Envir Heal 2011; 214: 417–423.

28. Rice EW, Baird RB, Eaton AD, Clesceri LS (eds.) Standard Methods for the Examination of Water and Wastewater. 22nd edn. American Public Health Association, American Water Works Association, Water Environment Federation, Washington, DC, 2012.

29. Emerson K, Russo RC, Lund RE, Thurston RV. Aqueous ammonia equilibrium calculations: Effect of pH and temperature. J Fish Res Board Can 1975; 32: 2379–2383.

30. Muyzer G, De Waal EC, Uitterlinden AG. Profiling of complex microbial-populations by denaturing gradient gel-electrophoresis analysis of polymerase chain reaction-amplified genes coding for 16S rRNA. Appl Environ Microb 1993; 59: 695–700.

31. Bartram AK, Lynch MDJ, Stearns JC, Moreno-Hagelsieb G, Neufeld JD. Generation of multimillion-sequence 16S rRNA gene libraries from complex microbial communities by assembling paired-end Illumina reads. Appl Environ Microb 2011; 77: 3846–3852.

32. Waak MB, LaPara TM, Hallé C, Hozalski RM. Occurrence of Legionella spp. in water-main biofilms from two drinking water distribution systems. Environ Sci Technol 2018; 52: 7630–7639.

33. Garbe JR, Gould TJ, Gohl DM, Knights D, Beckman KB. Metagenomics pipeline, 2017. URL https://bitbucket.org/jgarbe/gopher-pipelines.

34. Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, Fierer N, Peña AG, Goodrich JK, Gordon JI, Huttley GA, Kelley ST, Knights D, Koenig JE, Ley RE, Lozupone CA, McDonald D, Muegge BD, Pirrung M, Reeder J, Sevinsky JR, Turnbaugh PJ, Walters WA, Widmann J, Yatsunenko T, Zaneveld J, Knight R. QIIME allows analysis of high-throughput community sequencing data. Nat Methods 2010; 7: 335–336.

35. Rideout JR, He Y, Navas-Molina JA, Walters WA, Ursell LK, Gibbons SM, Chase J, McDonald D, Gonzalez A, Robbins-Pianka A, Clemente JC, Gilbert JA, Huse SM, Zhou HW, Knight R, Caporaso JG. Subsampled open-reference clustering creates consistent, comprehensive OTU definitions and scales to billions of sequences. PeerJ 2014; 2: e545.

36. Edgar RC. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 2010; 26: 2460–2461.

37. Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, Peplies J, Glöckner FO. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res 2012; 41: D590–D596.

38. Yilmaz P, Parfrey LW, Yarza P, Gerken J, Pruesse E, Quast C, Schweer T, Peplies J, Ludwig W, Glöckner FO. The SILVA and “All-species Living Tree Project (LTP)” taxonomic frameworks. Nucleic Acids Res 2013; 42: D643–D648.

39. Caporaso JG, Bittinger K, Bushman FD, DeSantis TZ, Andersen GL, Knight R. PyNAST: a flexible tool for aligning sequences to a template alignment. Bioinformatics 2010; 26: 266–267.

40. Price MN, Dehal PS, Arkin AP. FastTree 2 – Approximately maximum-likelihood trees for large alignments. PLoS ONE 2010; 5: e9490.

41. Chiu CH, Chao A. Estimating and comparing microbial diversity in the presence of sequencing errors. PeerJ 2016; 4: e1634.

42. Moran MD. Arguments for rejecting the sequential Bonferroni in ecological studies. Oikos 2003; 100: 403–405.

43. Chen J, Bittinger K, Charlson ES, Hoffmann C, Lewis J, Wu GD, Collman RG, Bushman FD, Li H. Associating microbiome composition with environmental covariates using generalized UniFrac distances. Bioinformatics 2012; 28: 2106–2113.

44. Paradis E, Claude J, Strimmer K. APE: Analyses of Phylogenetics and Evolution in R language. Bioinformatics 2004; 20: 289–290.

45. R Core Team. R: A language and environment for statistical computing. R Foundation, Vienna, Austria, 2014. URL www.R-project.org.

46. Oksanen J, Blanchet FG, Friendly M, Kindt R, Legendre P, McGlinn D, Minchin PR, O’Hara RB, Simpson GL, Solymos P, Stevens MHH, Szoecs E, Wagner H. vegan: Community Ecology Package, 2018. URL https://CRAN.R-project.org/package=vegan.

47. Weiss S, Xu ZZ, Peddada S, Amir A, Bittinger K, Gonzalez A, Lozupone C, Zaneveld JR, Vazquez-Baeza Y, Birmingham A, Hyde ER, Knight R. Normalization and microbial differential abundance strategies depend upon data characteristics. Microbiome 2017; 5: 27.

48. Paulson JN, Stine OC, Bravo HC, Pop M. Differential abundance analysis for microbial marker-gene surveys. Nat Methods 2013; 10: 1200–1202.

49. Harms G, Layton AC, Dionisi HM, Gregory IR, Garrett VM, Hawkins SA, Robinson KG, Sayler GS. Real-time PCR quantification of nitrifying bacteria in a municipal wastewater treatment plant. Environ Sci Technol 2003; 37: 343–351.

50. Meinhardt KA, Bertagnolli A, Pannu MW, Strand SE, Brown SL, Stahl DA. Evaluation of revised polymerase chain reaction primers for more inclusive quantification of ammonia-oxidizing archaea and bacteria. Env Microbiol Rep 2015; 7: 354–363.

51. Lopaka L. NADA: Nondetects And Data Analysis for environmental data, 2017. URL https://CRAN.R-project.org/package=NADA.

52. Peto R, Peto J. Asymptotically efficient rank invariant test procedures. J R Stat Soc Ser A-G 1972; 135: 185–207.

53. Fong DYT, Kwan CW, Lam KF, Lam KSL. Use of the sign test for the median in the presence of ties. Am Stat 2003; 57: 237–240.

54. Huston C, Juarez-Colunga E. Guidelines for Computing Summary Statistics for Data-Sets Containing Non-Detects. Tech. rep., Bulkley Valley Research Centre, 2009.

55. Purkhold U, Pommerening-Röser A, Juretschko S, Schmid MC, Koops HP, Wagner M. Phylogeny of all recognized species of ammonia oxidizers based on comparative 16S rRNA and amoA sequence analysis: Implications for molecular diversity surveys. Appl Environ Microb 2000; 66: 5368–5382.

56. Daims H, Lücker S, Wagner M. A new perspective on microbes formerly known as nitrite-oxidizing bacteria. Trends Microbiol 2016; 24: 699–712.

57. Liu G, Bakker GL, Li S, Vreeburg JHG, Verberk JQJC, Medema GJ, Liu WT, van Dijk JC. Pyrosequencing reveals bacterial communities in unchlorinated drinking water distribution system: An integral study of bulk water, suspended solids, loose deposits, and pipe wall biofilm. Environ Sci Technol 2014; 48: 5467–5476.

58. Pinto AJ, Xi C, Raskin L. Bacterial community structure in the drinking water microbiome is governed by filtration processes. Environ Sci Technol 2012; 46: 8851–8859.

59. Zwart G, Crump BC, Kamst-van Agterveld MP, Hagen F, Han SK. Typical freshwater bacteria: an analysis of available 16S rRNA gene sequences from plankton of lakes and rivers. Aquat Microb Ecol 2002; 28: 141–155.

60. Watanabe K, Komatsu N, Ishii Y, Negishi M. Effective isolation of bacterioplankton genus Polynucleobacter from freshwater environments grown on photochemically degraded dissolved organic matter. FEMS Microbiol Ecol 2009; 67: 57–68.

61. Salcher MM, Pernthaler J, Posch T. Seasonal bloom dynamics and ecophysiology of the freshwater sister clade of SAR11 bacteria ‘that rule the waves’ (LD12). ISME J 2011; 5: 1242–1252.

62. Llirós M, İnceoğlu Ö, García-Armisen T, Anzil A, Leporcq B, Pigneur LM, Viroux L, Darchambeau F, Descy JP, Servais P. Bacterial community composition in three freshwater reservoirs of different alkalinity and trophic status. PLoS ONE 2014; 9: e116145.

63. Salcher MM, Neuenschwander SM, Posch T, Pernthaler J. The ecology of pelagic freshwater methylotrophs assessed by a high-resolution monitoring and isolation campaign. ISME J 2015; 9: 2442–2453.

64. Donlan RM. Biofilms: Microbial life on surfaces. Emerg Infect Dis 2002; 8: 881–890.

65. Ortiz-Álvarez R, Fierer N, de los Ríos A, Casamayor EO, Barberán A. Consistent changes in the taxonomic structure and functional attributes of bacterial communities during primary succession. ISME J 2018; 1–10. doi: 10.1038/s41396-018-0076-2.

66. Liu R, Zhu J, Yu Z, Joshi D, Zhang H, Lin W, Yang M. Molecular analysis of long-term biofilm formation on PVC and cast iron surfaces in drinking water distribution system. J Environ Sci 2014; 26: 865–874.

67. Ling F, Hwang C, LeChevallier MW, Andersen GL, Liu WT. Core-satellite populations and seasonality of water meter biofilms in a metropolitan drinking water distribution system. ISME J 2016; 10: 582–595.

68. Kelly JJ, Minalt N, Culotti A, Pryor M, Packman A. Temporal variations in the abundance and composition of biofilm communities colonizing drinking water distribution pipes. PLoS ONE 2014; 9: e98542.

69. Stanish LF, Hull NM, Robertson CE, Harris JK, Stevens MJ, Spear JR, Pace NR. Factors influencing bacterial diversity and community composition in municipal drinking waters in the Ohio River Basin, USA. PLoS ONE 2016; 11: e0157966.

70. Chistoserdova L, Kalyuzhnaya MG. Current trends in methylotrophy. Trends Microbiol 2018; 26: 703–714.

71. Falkinham III JO. Nontuberculous mycobacteria in the environment. Clin Chest Med 2002; 23: 529–551.

72. Beye M, Fahsi N, Raoult D, Fournier PE. Careful use of 16S rRNA gene sequence similarity values for the identification of Mycobacterium species. New Microbe and New Infect 2018; 22: 24–29.

73. Luh J, Tong N, Raskin L, Mariñas BJ. Inactivation of Mycobacterium avium with monochloramine. Environ Sci Technol 2008; 42: 8051–8056.

74. Woolschlager JE, Rittmann BE, Piriou P, Schwartz B. Developing an effective strategy to control nitrifier growth using the Comprehensive Disinfection and Water Quality Model (CDWQ). In: World Water and Environmental Resources Congress 2001. American Society of Civil Engineers, Orlando, Florida.

75. Deutzmann JS, Hoppert M, Schink B. Characterization and phylogeny of a novel methanotroph, Methyloglobulus morosus gen. nov., spec. nov. Syst Appl Microbiol 2014; 37: 165–169.

76. Greene AC. The Family Desulfurellaceae. In: Rosenberg E, DeLong EF, Lory S, Stackebrandt E, Thompson F (eds.) The Prokaryotes: Deltaproteobacteria and Epsilonproteobacteria. Springer, Berlin, Germany, 2014; pp. 135–142.

77. Vikesland PJ, Ozekin K, Valentine RL. Monochloramine decay in model and distribution system waters. Water Res 2001; 35: 1766–1776.

78. Winkler MKH, Bassin JP, Kleerebezem R, Sorokin DY, van Loosdrecht MCM. Unravelling the reasons for disproportion in the ratio of AOB and NOB in aerobic granular sludge. Appl Microbiol Biotechnol 2012; 94: 1657–1666.

79. Nowka B, Daims H, Spieck E. Comparison of oxidation kinetics of nitrite-oxidizing bacteria: Nitrite availability as a key factor in niche differentiation. Appl Environ Microb 2015; 81: 745–753.

80. Martens-Habbena W, Berube PM, Urakawa H, de la Torre JR, Stahl DA. Ammonia oxidation kinetics determine niche separation of nitrifying Archaea and Bacteria. Nature 2009; 461: 976–979.

81. Kits KD, Sedlacek CJ, Lebedeva EV, Han P, Bulaev A, Pjevac P, Daebeler A, Romano S, Albertsen M, Stein LY, Daims H, Wagner M. Kinetic analysis of a complete nitrifier reveals an oligotrophic lifestyle. Nature 2017; 549: 269–272.

82. Wang Y, Ma L, Mao Y, Jiang X, Xia Y, Yu K, Li B, Zhang T. Comammox in drinking water systems. Water Res 2017; 116: 332–341.

83. Regan JM, Harrington GW, Baribeau H, De Leon R, Noguera DR. Diversity of nitrifying bacteria in full-scale chloraminated distribution systems. Water Res 2003; 37: 197–205.

84. Ren H, Wang W, Liu Y, Liu S, Lou L, Cheng D, He X, Zhou X, Qiu S, Fu L, Liu J, Hu B. Pyrosequencing analysis of bacterial communities in biofilms from different pipe materials in a city drinking water distribution system of East China. Appl Microbiol Biotechnol 2015; 99: 10713–10724.

85. Bolton N, Critchley M, Fabien R, Cromar N, Fallowfield H. Microbially influenced corrosion of galvanized steel pipes in aerobic water systems. J Appl Microbiol 2010; 39: 283–289.

86. Krishna KCB, Sathasivan A, Ginige MP. Microbial community changes with decaying chloramine residuals in a lab-scale system. Water Res 2013; 47: 4666–4679.

87. Pizarro GE, Vargas IT. Biocorrosion in drinking water pipes. Wa Sci Technol 2016; 16: 881–887.

88. Reynolds KA, Mena KD, Gerba CP. Risk of waterborne illness via drinking water in the United States. Rev Environ Contam Toxicol 2008; 192: 117–158.

89. Ashbolt NJ. Environmental (saprozoic) pathogens of engineered water systems: Understanding their ecology for risk assessment and management. Pathogens 2015; 4: 390–405.

90. Fields BS, Benson RF, Besser RE. Legionella and Legionnaires’ disease: 25 years of investigation. Clin Microbiol Rev 2002; 15: 506–526.

91. Stout JE, Yu VL, Best MG. Ecology of Legionella pneumophila within water distribution systems. Appl Environ Microb 1985; 49: 221–228.

92. Stout JE, Yu VL, Muraca P. Isolation of Legionella pneumophila from the cold water of hospital ice machines: Implications for origin and transmission of the organism. Infect Control 1985; 6: 141–146.

93. Kim BR, Anderson JE, Mueller SA, Gaines WA, Kendall AM. Literature review: efficacy of various disinfectants against Legionella in water systems. Water Res 2002; 36: 4433–4444.

94. Schwake DO, Garner E, Strom OR, Pruden A, Edwards MA. Legionella DNA markers in tap water coincident with a spike in Legionnaires’ disease in Flint, MI. Environ Sci Technol Lett 2016; 3: 311–315.

95. Benitez AJ, Winchell JM. Clinical application of a multiplex real-time PCR assay for simultaneous detection of Legionella species, Legionella pneumophila, and Legionella pneumophila serogroup 1. J Clin Microbiol 2013; 51: 348–351.

96. Mérault N, Rusniok C, Jarraud S, Gomez-Valero L, Cazalet C, Marin M, Brachet E, Aegerter P, Gaillard JL, Etienne J, Herrmann JL, the DELPH-I Study Group, Lawrence C, Buchrieser C. Specific real-time PCR for simultaneous detection and identification of Legionella pneumophila serogroup 1 in water and clinical samples. Appl Environ Microb 2011; 77: 1708–1717.

97. Collins S, Jorgensen F, Willis C, Walker JT. Real-time PCR to supplement gold-standard culture-based detection of Legionella in environmental samples. J Appl Microbiol 2015; 119: 1158–1169.

98. Salter SJ, Cox MJ, Turek EM, Calus ST, Cookson WO, Moffatt MF, Turner P, Parkhill J, Loman NJ, Walker AW. Reagent and laboratory contamination can critically impact sequence-based microbiome analyses. BMC Biol 2014; 12: 12.

99. Stackebrandt E, Ebers J. Taxonomic parameters revisited: tarnished gold standards. Microbiol Today 2006; 33: 152–155.

100. Kim M, Oh HS, Park SC, Chun J. Towards a taxonomic coherence between average nucleotide identity and 16S rRNA gene sequence similarity for species demarcation of prokaryotes. Int J Syst Evol Micr 2014; 64: 346–351.

101. Zhang Z, Schwartz S, Wagner L, Miller W. A greedy algorithm for aligning DNA sequences. J Comput Biol 2000; 7: 203–214.

102. Morgulis A, Coulouris G, Raytselis Y, Madden TL, Agarwala R, Schäffer AA. Database indexing for production MegaBLAST searches. Bioinformatics 2008; 24: 1757–1764.

103. Benson DA, Cavanaugh M, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. GenBank. Nucleic Acids Res 2013; 41: D36–D42.

104. Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol Biol Evol 2013; 30: 772–780.

105. LeChevallier MW, Shaw NE, Kaplan LA, Bott TL. Development of a rapid assimilable organic-carbon method for water. Appl Environ Microb 1993; 59: 1526–1531.

106. Schwake DO, Alum A, Abbaszadegan M. Impact of environmental factors on Legionella populations in drinking water. Pathogens 2015; 4: 269–282.

107. Rodríguez-Martínez S, Sharaby Y, Pecellín M, Brettar I, Höfle MG, Halpern M. Spatial distribution of Legionella pneumophila MLVA-genotypes in a drinking water system. Water Res 2015; 77: 119–132.

108. Ye J, Coulouris G, Zaretskaya I, Cutcutache I, Rozen S, Madden TL. Primer-BLAST: A tool to design target-specific primers for polymerase chain reaction. BMC Bioinformatics 2012; 13: 134.

109. Wullings BA, van der Kooij D. Occurrence and genetic diversity of uncultured Legionella spp. in drinking water treated at temperatures below 15°C. Appl Environ Microb 2006; 72: 157–166.

110. Donohue MJ, O’Connell K, Vesper SJ, Mistry JH, King D, Kostich M, Pfaller S. Widespread molecular detection of Legionella pneumophila serogroup 1 in cold water taps across the United States. Environ Sci Technol 2014; 48: 3145–3152.

111. Campbell J, Bibb WF, Lambert MA, Eng S, Steigerwalt AG, Allard J, Moss CW, Brenner DJ. Legionella sainthelensi: A new species of Legionella isolated from water near Mt. St. Helens. Appl Environ Microb 1984; 47: 369–373.

112. Hiorns WD, Methé BA, Nierzwicki-Bauer SA, Zehr JP. Bacterial diversity in Adirondack Mountain lakes as revealed by 16S rRNA gene sequences. Appl Environ Microb 1997; 63: 2957–2960.

113. Carvalho FRS, Nastasi FR, Gamba RC, Foronda AS, Pellizari VH. Occurrence and diversity of Legionellaceae in polar lakes of the Antarctic Peninsula. Curr Microbiol 2008; 57: 294–300.

114. Paszko-Kolva C, Shahamat M, Yamamoto H, Sawyer T, Vives-Rego J, Colwell RR. Survival of Legionella pneumophila in the aquatic environment. Microb Ecol 1991; 22: 75–83.

115. Lasheras A, Boulestreau H, Rogues AM, Ohayon-Courtes C, Labadie JC, Gachie JP. Influence of amoebae and physical and chemical characteristics of water on presence and proliferation of Legionella species in hospital water systems. Am J Infect Control 2006; 34: 520–525.

116. Zanetti F, Stampi S, De Luca G, Fateh-Moghadam P, Antonietta M, Sabattini B, Checchi L. Water characteristics associated with the occurrence of Legionella pneumophila in dental units. Eur J Oral Sci 2000; 108: 22–28.

117. Moore MR, Pryor M, Fields BS, Lucas C, Phelan M, Besser RE. Introduction of monochloramine into a municipal water system: Impact on colonization of buildings by Legionella spp. Appl Environ Microb 2006; 72: 378–383.

118. Flannery B, Gelling LB, Vugia DJ, Weintraub JM, Salerno JJ, Conroy MJ, Stevens VA, Rose CE, Moore MR, Fields BS, Besser RE. Reducing Legionella colonization of water systems with monochloramine. Emerg Infect Dis 2006; 12: 588–596.

119. Xue Z, Lee WH, Coburn KM, Seo Y. Selective reactivity of monochloramine with extracellular matrix components affects the disinfection of biofilm and detached clusters. Environ Sci Technol 2014; 48: 3832–3839.

120. Mansi A, Amori I, Marchesi I, Marcelloni AM, Proietto AR, Ferranti G, Magini V, Valeriani F, Borella P. Legionella spp. survival after different disinfection procedures: Comparison between conventional culture, qPCR and EMA-qPCR. Microchem J 2014; 112: 65–69.

121. Dupuy M, Mazoua S, Berne F, Bodet C, Garrec N, Herbelin P, Ménard-Szczebara F, Oberti S, Rodier MH, Soreau S, Wallet F, Héchard Y. Efficiency of water disinfectants against Legionella pneumophila and Acanthamoeba. Water Res 2011; 45: 1087–1094.

122. Martiny AC, Jorgensen TM, Albrechtsen HJ, Arvin E, Molin S. Long-term succession of structure and diversity of a biofilm formed in a model drinking water distribution system. Appl Environ Microb 2003; 69: 6899–6907.

123. Wang H, Bédard E, Prévost M, Camper AK, Hill VR, Pruden A. Methodological approaches for monitoring opportunistic pathogens in premise plumbing: A review. Water Res 2017; 117: 68–86.

124. Schoen ME, Ashbolt NJ. An in-premise model for Legionella exposure during showering events. Water Res 2011; 45: 5826–5836.

125. Falkinham III JO, Pruden A, Edwards MA. Opportunistic premise plumbing pathogens: Increasingly important pathogens in drinking water. Pathogens 2015; 4: 373–386.

126. Lesnik R, Brettar I, Höfle MG. Legionella species diversity and dynamics from surface reservoir to tap water: From cold adaptation to thermophily. ISME J 2015; 10: 1064–1080.

127. van Heijnsbergen E, Schalk JAC, Euser SM, Brandsema PS, den Boer JW, de Roda Husman AM. Confirmed and potential sources of Legionella reviewed. Environ Sci Technol 2015; 49: 4797–4815.

128. den Boer JW, Coutinho RA, Yzerman EPF, van der Sande MAB. Use of surface water in drinking water production associated with municipal Legionnaires’ disease incidence. J Epidemiol Commun H 2008; 62: e1.

129. Kool JL, Carpenter JC, Fields BS. Effect of monochloramine disinfection of municipal drinking water on risk of nosocomial Legionnaires’ disease. Lancet 1999; 353: 272–277.

130. Cohn PD, Gleason JA, Rudowski E, Tsai SM, Genese CA, Fagliano JA. Community outbreak of legionellosis and an environmental investigation into a community water system. Epidemiol Infect 2014; 143: 1322–1331.

131. Adams DA, Thomas KR, Jajosky RA, Foster L, Sharp P, Onweh DH, Schley AW, Anderson WJ. Summary of notifiable infectious diseases and conditions—United States, 2014. Morb Mortal Wkly Rep 2016; 63: 1–152.

132. Norwegian Institute of Public Health. Norwegian Surveillance System for Communicable Diseases: Legionellosis cases per county, 2010–2015, 2017. URL http://www.msis.no.

133. Haig SJ, Kotlarz N, LiPuma JJ, Raskin L. A high-throughput approach for identification of nontuberculous mycobacteria in drinking water reveals relationship between water age and Mycobacterium avium. mBio 2018; 9: e02354–17.

134. Griffith DE, Aksamit T, Brown-Elliott BA, Catanzaro A, Daley C, Gordin F, Holland SM, Horsburgh R, Huitt G, Iademarco MF, Iseman M, Olivier K, Ruoss S, von Reyn CF, Wallace Jr RJ, Winthrop K. An official ATS/IDSA statement: Diagnosis, treatment, and prevention of nontuberculous mycobacterial diseases. Am J Respir Crit Care Med 2007; 175: 367–416.

135. Fairchok MP, Rouse JH, Morris SL. Age-dependent humoral responses of children to mycobacterial antigens. Clin Diagn Lab Immunol 1995; 2: 443–447.

136. von Reyn CF, Horsburgh CR, Olivier KN, Barnes PF, Waddell R, Warren C, Tvaroha S, Jaeger AS, Lein AD, Alexander LN, Weber DJ, Tosteson ANA. Skin test reactions to Mycobacterium tuberculosis purified protein derivative and Mycobacterium avium sensitin among health care workers and medical students in . . . . Int J Tuberc Lung Dis 2001; 5: 1122–1128.

137. Lin C, Russell C, Soll B, Chow D, Bamrah S, Brostrom R, Kim W, Scott J, Bankowski MJ. Increasing prevalence of nontuberculous mycobacteria in respiratory specimens from US-affiliated Pacific Island jurisdictions. Emerg Infect Dis 2018; 24: 485–491.

138. von Reyn CF, Marlow JN, Barber TW, Falkinham III JO, Arbeit RD. Persistent colonisation of potable water as a source of Mycobacterium avium infection in AIDS. Lancet 1994; 343: 1137–1141.

139. O’Brien DP, Currie BJ, Krause VL. Nontuberculous mycobacterial disease in northern Australia: a case series and review of the literature. Clin Infect Dis 2000; 31: 958–967.

140. Prevots DR, Marras TK. Epidemiology of human pulmonary infection with nontuberculous mycobacteria. Clin Chest Med 2015; 36: 13–34.

141. Radomski N, Roguet A, Lucas FS, Veyrier FJ, Cambau E, Accrombessi H, Moilleron R, Behr MA, Moulin L. atpE gene as a new useful specific molecular target to quantify Mycobacterium in environmental samples. BMC Microbiol 2013; 13: 277.

142. Rocchetti TT, Silbert S, Gostnell A, Kubasek C, Widen R. Validation of a multiplex real-time PCR assay for detection of Mycobacterium spp., Mycobacterium tuberculosis complex, and Mycobacterium avium complex directly from clinical samples by use of the BD Max open system. J Clin Microbiol 2016; 54: 1644–1647.

143. Telenti A, Marchesi F, Balz M, Bally F, Böttger EC, Bodmer T. Rapid identification of mycobacteria to the species level by polymerase chain reaction and restriction enzyme analysis. J Clin Microbiol 1993; 31: 175–178.

144. Ringuet H, Akoua-Koffi C, Honore S, Varnerot A, Vincent V, Berche P, Gaillard JL, Pierre-Audigier C. hsp65 sequencing for identification of rapidly growing mycobacteria. J Clin Microbiol 1999; 37: 852–857.

145. Callahan BJ, McMurdie PJ, Holmes SP. Exact sequence variants should replace operational taxonomic units in marker-gene data analysis. ISME J 2017; 11: 2639–2643.

146. Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 2011; 17: 10–12.

147. Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJA, Holmes SP. DADA2: High-resolution sample inference from Illumina amplicon data. Nat Methods 2016; 13: 581–583.

148. Rognes T, Flouri T, Nichols B, Quince C, Mahé F. VSEARCH: a versatile open source tool for metagenomics. PeerJ 2016; 4: e2584.

149. Edgar RC, Haas BJ, Clemente JC, Quince C, Knight R. UCHIME improves sensitivity and speed of chimera detection. Bioinformatics 2011; 27: 2194–2200.

150. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. BLAST+: architecture and applications. BMC Bioinformatics 2009; 10: 421.

151. Kim SH, Shin JH. Identification of nontuberculous mycobacteria using multilocous sequence analysis of 16S rRNA, hsp65, and rpoB. J Clin Lab Anal 2017; 32: e22184.

152. Altschul SF, Madden TL, Schäffer AA, Zhang JH, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997; 25: 3389–3402.

153. Jackson DA. Compositional data in community ecology: The paradigm or peril of proportions? Ecology 1997; 78: 929–940.

154. Friedman J, Alm EJ. Inferring correlation networks from genomic survey data. PLoS Comput Biol 2012; 8: e1002687.

155. Lalande V, Barbut F, Varnerot A, Febvre M, Nesa D, Wadel S, Vincent V, Petit JC. Pseudo-outbreak of Mycobacterium gordonae associated with water from refrigerated fountains. J Hosp Infect 2001; 48: 76–79.

156. Falkinham III JO, Norton CD, LeChevallier MW. Factors influencing numbers of Mycobacterium avium, Mycobacterium intracellulare, and other mycobacteria in drinking water distribution systems. Appl Environ Microb 2001; 67: 1225–1231.

157. Donohue MJ, Mistry JH, Donohue JM, O’Connell K, King D, Byran J, Covert T, Pfaller S. Increased frequency of nontuberculous mycobacteria detection at potable water taps within the United States. Environ Sci Technol 2015; 49: 6127–6133.

158. Fujita J, Nanki N, Negayama K, Tsutsui S, Taminato T, Ishida T. Nosocomial contamination by Mycobacterium gordonae in hospital water supply and super-oxidized water. J Hosp Infect 2002; 51: 65–68.

159. McMurdie PJ, Holmes S. Waste Not, Want Not: Why rarefying microbiome data is inadmissible. PLoS Comput Biol 2014; 10: e1003531.

160. Brenner DJ, Steigerwalt AG, McDade JE. Classification of the Legionnaires’ disease bacterium: Legionella pneumophila, genus novum, species nova, of the Family Legionellaceae, familia nova. Ann Intern Med 1979; 90: 656–658.

161. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 2014; 30: 2114–2120.

162. Masella AP, Bartram AK, Truszkowski JM, Brown DG, Neufeld JD. PANDAseq: PAired-eND Assembler for Illumina sequences. BMC Bioinformatics 2012; 13: 31. doi: 10.1186/1471-2105-13-31.

163. Lopaka L. NADA: Nondetects And Data Analysis for environmental data, 2013. URL https://cran.r-project.org/web/packages/NADA.

164. McMurdie PJ, Paulson JN. biomformat: An interface package for the BIOM file format, 2016. URL http://bioconductor.org/packages/biomformat.

165. Wickham H. ggplot2: Elegant graphics for data analysis, 2009. URL http://ggplot2.org.

166. Wilke CO. cowplot: Streamlined plot theme and plot annotations for ‘ggplot2’, 2017. URL https://CRAN.R-project.org/package=cowplot.

167. Standards Norway: 13.060–Water quality. Oslo, Norway, 2017. URL https://www.standards.no.

Appendix A

Acronyms and Glossary

Care has been taken to minimize the use of jargon and acronyms, but this cannot always be achieved. All uncommon acronyms have been defined in-text upon first use as well as in the captions of figures and footers of tables. For convenience to the reader, all acronyms from Chapters 1–5, in addition to undefined acronyms (i.e., common in popular usage), are summarized in Table A.1 with their explicit meaning. Names of chemical compounds (i.e., molecular formulas), abbreviated organizational names, and software packages with names derived from acronyms are not included in this list. This appendix also defines jargon terms in a glossary.

A.1 ACRONYMS

Table A.1: Acronyms

Acronym   Meaning
AOA       ammonia-oxidizing archaea
AOB       ammonia-oxidizing bacteria
AOC       assimilable organic carbon
ASV       amplicon sequence variant
DNA*      deoxyribonucleic acid
DWDS      drinking water distribution system
EPS       extracellular polymeric substances
FOD       frequency of detection
ITS       internal transcribed spacer
LOD       limit of detection
LODm      limit of detection for the method
LODseq    limit of detection for high-throughput sequencing
LOQ       limit of quantification
LOQm      limit of quantification for the method
MAC       Mycobacterium avium complex
NOB       nitrite-oxidizing bacteria
NTM       non-tuberculous mycobacteria
OTU       operational taxonomic unit
PCR       polymerase chain reaction
qPCR      quantitative polymerase chain reaction
RNA*      ribonucleic acid
rRNA      ribosomal ribonucleic acid
spp.*     species, plural

*Undefined in-text due to common usage

A.2 GLOSSARY

• 16S-23S internal transcribed spacer (ITS) region – non-functional conserved region between the coding regions for the small and large subunits of rRNA in bacteria (16S and 23S, respectively); specifically referring to the ITS region of Mycobacterium avium complex [142] within this dissertation

• 16S ribosomal RNA gene – house-keeping gene, present in all Bacteria (often multiple times per genomic unit), that codes for the small subunit of ribosomal RNA; typically referring to the V3 hyper-variable region [30] within the scope of this dissertation

• atpE gene – adenosine triphosphate (ATP) synthase subunit C gene; specifically referring to the atpE gene of Mycobacterium spp. [141] in this dissertation

• biofilms – matrices of sticky extracellular polymeric substances (EPS), attached to surfaces, in which microbes may form complex communities; biofilms have defined architecture, encourage the exchange of genetic material between cells, and may enable communication between cells (i.e., quorum sensing); as they grow and mature, biofilms may shed into other mediums, such as flowing water [64]

• biologically-accelerated chloramine decay – the rapid loss of residual chloramine from drinking water due to microbial activity; primarily the result of chloramine reacting with nitrite (produced by ammonia-oxidizing microbes), which produces more ammonia in addition to nitrate; though predominantly caused by ammonia oxidizers, other taxa may be involved (e.g., via the production of extracellular polymeric substances) [5]

• drinking water distribution system (DWDS) – the vast network of water mains and pipes, reservoirs, water towers, and other pumps and storage basins that deliver potable drinking water to consumers; within this dissertation, specifically referring to the infrastructure after water treatment and before premise plumbing

• hsp65 gene – 65-kDa heat-shock protein gene; typically referring to the hsp65 gene of Mycobacterium spp. [143] within this dissertation

• legionellae – common term (plural) for the genus, Legionella

• methylobacteria – common term (plural) for the genus, Methylobacterium

• mip gene – macrophage infectivity potentiator gene of Legionella spp.; specifically referring to the mip gene of pathogenic Legionella pneumophila [95] within this dissertation

• microbiome – the totality of all microbes in a specified environment, which may include bacteria, archaea, and single-celled eukaryotes; in this dissertation, typically referring to the bacterial fraction of the microbiome

• mycobacteria – common term (plural) for the genus, Mycobacterium

• non-tuberculous mycobacteria (NTM) – environmentally ubiquitous Mycobacterium spp., including species capable of causing disease in humans and other animals, which are differentiated from the tuberculosis-causing species, M. tuberculosis; NTM may, however, cause tuberculous lesions in skin and other soft tissues [71]

• opportunistic pathogens – microbes found in natural and engineered environments that may cause disease in specific human populations (e.g., immunocompromised individuals and the elderly) upon exposure to environmental sources; opportunistic pathogens are not spread from human-to-human or human-to-animal contact, unlike traditional pathogens causing communicable disease [89]

• ssrA gene – transfer-messenger RNA gene; specifically referring to the ssrA gene in all Legionella spp. [95] within this dissertation

• wzm gene – highly conserved gene fragment for the ABC transporter of lipopolysaccharide O-antigen in Legionella pneumophila serogroup 1 [96]

Appendix B

Protocol for Retrieving Water-Main Cross-sections

There were two versions of the water-main sampling protocol: a standard operating procedure (SOP) and a Quick-Reference Guide. The SOP was used for detailing the protocol to the water utilities during initial planning. A visual summary of the method was also created with instructions in both United States English and Norwegian bokmål. The Quick-Reference Guide was used as a supplement to the SOP and was especially useful for reiterating the protocol while at the collection sites during excavation.

Standard Operating Procedure: Extraction of a Pipe Sample for Biofilm Analysis

Site Preparation (Overseen by the water utility) 1. A water main (diameter 6–12 in or 15–30 cm) is selected by the utility and approved by the researcher to ensure compliance with research goals and protocol. 2. The utility crew excavates the water main and secures a trench. The pipe exterior is cleaned with a chain cleaner to remove soil deposits and loose debris. 3. Water should be shut off or rerouted no more than 12 hours prior to sample extraction to prevent significant changes in internal microbial communities. Pipe Extraction (researcher MUST be present) 4. The pipe exterior and cutting device (e.g., hinged reed cutter, chain or hydraulic “snapping” cutter, or chop saw) are rinsed with a chlorine solution (provided by the researcher) to minimize contamination during pipe extraction. 5. As the pipe is cut, water will evacuate the water main and pool in the trench. The utility should have a pump prepared onsite to remove this water. Pooled water should NOT contact the sample once extraction has begun. If trench water contacts the pipe, the sample is assumed contaminated and rendered invalid. 6. The maintenance crew makes two vertical incisions approximately 12–25 in, or 30–65 cm, apart (longitudinal distance; longer for a chop saw to avoid overheating and agitation). 7. If the pipe does not readily separate from the water main, a third cut may be necessary to relieve pressure. 8. Once removed, the sample is carefully placed in a clean bag (provided by the researcher), with the plastic taped secure around the ends of the pipe sample. Care should be taken to minimize contact (e.g., hands, arms) with the inside of the bag. The bag is secured with tape to prevent the sample from moving within the bag. The pipe sample may cause a mess during transit and should be placed in a cooler (smaller sections) or on additional plastic sheeting (longer sections). It should be transported immediately to the laboratory for sampling.

9. Collection of Water Samples (OPTIONAL): During pipe extraction, at least three 1-L samples of water (preferably four or more) may be collected from near the pipe sample for analysis of bulk drinking water. Use sterile glass bottles (autoclaved at 121°C for 15 min) or commercially-available, disposable sterile bags for water sampling (e.g., Nasco Whirl-paks®). The researcher should follow standard microbiological sampling protocol (i.e., rinse relevant surfaces with a chlorine solution and/or flame the collection point). Residual chlorine of the bulk water should be measured immediately, such as with a Hach Pocket Colorimeter.

Additional Notes Regarding Quality Assurance/Quality Control (QA/QC)

a) The pipe must always be secured by wooden wedges, rope/cables (attached to an excavator/backhoe), or by a maintenance crew member’s hands. The segment must NOT be allowed to drop or roll freely. Once removed, the sample should be placed immediately in a clean bag; placing the pipe directly on the ground or other non-disinfected surface should be avoided unless absolutely necessary.
b) When removing the sample, wedges or pry bars should not be forced into the incisions nor should rigorous hammering be used to dislodge the sample. Excessive force will damage internal structures, and insertion of foreign objects into the pipe may introduce microbiological contamination.
c) When handling the pipe sample, the segment should be grabbed around the pipe exterior and gently cradled; hands should NOT enter the interior of the pipe or grasp the internal pipe walls.
d) Notes on tools for opening the water main:
• A hinged Reed™ pipe cutter may be least invasive at opening the water main but is laborious, time-consuming, and may experience difficulty with cement-lined pipes. Extra blade wheels should be readily available, as they are prone to break during cutting. Furthermore, the time required per cut (about 20 min) may be especially burdensome for the water utility and can increase the risk of contamination due to the duration in which the pipe will be open while in the trench.
• Chain or hydraulic “snapping” cutters tend to work well, but are not meant for ductile iron. While intended only for gray cast iron, success is possible with cement-lined ductile iron (although not recommended by the manufacturer).
• Chop saws are quickest and the most convenient method for the water utility (and therefore may enable more sample opportunities). They work with all pipe materials and with pipes containing cement lining or heavy corrosion scaling and tuberculation. Chop saws are riskiest, however, as they may overly agitate the pipe, introduce contamination, and overheat the edges. Use a chop saw only if water remains in the pipe. The water will evacuate during the cutting, thus keeping the pipe from overheating and expelling dirt and debris outward. When using a chop saw, a longer cut of pipe is recommended (up to 3 ft or 1 m), and biofilm sampling should occur at least 4–8 in, or 10–20 cm, from the cut ends.

Safety Procedures/Concerns

The primary safety concerns occur during removal of the pipe section. Utility personnel are responsible for all pipe-removal procedures in the trench (e.g., cutting, securing, lifting). These activities require operation of special equipment as well as activity within confined spaces. As it is an active construction site, the researcher should be wearing or have access to a safety helmet, reflective vest, and ear protection while supervising pipe extraction onsite.

Quick-Reference Guide (English text; the original also gives each item in Norwegian bokmål)

Protocol 1: Before Removal

• Turn the water off within 12 hours. The pipe should not be taken offline more than 12 hours before pipe sampling. Bacteria may change in stagnant water.
• Rinse off the pipe. Remove excess soil from the pipe with a descaler. We will provide a chlorine solution in a sprayer for the workers to use. Use this to disinfect the pipe.
• Have the water pump ready to use. The pipe should still contain water. When this drains during removal, a pump should be ready. The sample is invalid if it touches dirty water.

Protocol 2: Water Main Removal (researchers must be present!)

• Secure and saw. Secure the pipe so it does not fall into dirt or water. Make two cuts with a chop saw. A third cut may be necessary for easier removal.
• Take it up and out. Once the pipe is free, take it out of the pit to the road to be unfastened.

Protocol 3: After Removal

• Sample handling. Do not turn vertical! Do not put fingers inside!
• Sample is sealed. The researcher will seal and tape the pipe sample. The pipe is ready to go to the lab.

Important Information

• Questions? Michael Waak, [email protected]
• About the sample. The pipe should contain water! [Diagram labels: OUT; WATER; 15–20 cm (6–8 inches); 60–70 cm (25–30 inches).]
• If the pipe contains water while the saw is cutting it, there is no overheating and contamination does not enter the pipe.
• We would like to know the year of installation and the material (e.g., cast iron).

Appendix C

Supporting Information for Chapter 2

SI Materials & Methods

Water mains from winter-shutoff sites

Though collected during normal operation, the water mains at these sites are shut off annually during the cold-weather months due to freezing concerns. Although technically still operational and full of drinking water, these mains lie adjacent to gate valves that are closed to completely halt flow. Water in the main on the opposite side of each gate valve is drained to prevent ice formation and expansion damage. These dead ends are likely stagnant for months-long periods every year.

Library-size normalization for beta metrics

Beta diversity was assessed using the generalized UniFrac distances and checked using the unweighted UniFrac distances and Bray-Curtis dissimilarity as alternatives. Depending on the metric, read counts were normalized to account for the uneven library sizes [159]. No normalization was performed for generalized UniFrac (i.e., original counts were used) [43]. For unweighted UniFrac, counts were equally subsampled to the lowest count of any sample [47]. Bray-Curtis dissimilarity was calculated using read counts normalized with cumulative sum scaling [48].
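The subsampling step described above can be illustrated with a minimal sketch. It is not the analysis code used for this work: the sample names, counts, and helper functions (`rarefy`, `rarefy_to_minimum`, `bray_curtis`) are hypothetical, and the Bray-Curtis function is shown only to illustrate the formula on even-depth counts (in this study, Bray-Curtis was instead computed on counts normalized by cumulative sum scaling, which is not implemented here).

```python
import random


def rarefy(counts, depth, rng):
    """Draw `depth` reads without replacement from one sample's OTU counts."""
    # Expand the count vector into one entry per read, labelled by OTU index.
    pool = [otu for otu, n in enumerate(counts) for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds library size")
    subsampled = [0] * len(counts)
    for otu in rng.sample(pool, depth):
        subsampled[otu] += 1
    return subsampled


def rarefy_to_minimum(table, seed=0):
    """Subsample every sample to the smallest library size in the table."""
    rng = random.Random(seed)
    depth = min(sum(row) for row in table.values())
    return {name: rarefy(row, depth, rng) for name, row in table.items()}


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two count vectors of equal length."""
    total = sum(u) + sum(v)
    return sum(abs(a - b) for a, b in zip(u, v)) / total if total else 0.0


# Hypothetical OTU table: rows are samples, columns are OTUs.
toy_table = {
    "biofilm_1": [50, 30, 20, 0],
    "water_1": [5, 10, 0, 25],
    "tubercle_1": [200, 100, 50, 50],
}
even = rarefy_to_minimum(toy_table)
for name, row in even.items():
    print(name, row, sum(row))
```

Fixing the random seed makes the subsample reproducible; without it, each run draws a different subset of reads and downstream distances vary slightly.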

Figure C.1: Ordination of sample collection sites relative to drinking water treatment plant in the (a) chloraminated and (b) no-residual drinking water distribution systems, and total chlorine (red) and assimilable organic carbon (blue) concentrations in water.

[Photographs of the collected water mains; panel labels:]

| Site | Sampled | Material | Original diameter | Installed (age at sampling) |
|---|---|---|---|---|
| C2 | 2014/08/05 | Grey cast iron, unlined | 15 cm | 1922 (103 years) |
| C3 | 2014/08/08 | Ductile iron, mortar lined | 20 cm | 1972 (42 years) |
| C4 | 2014/09/26 | Grey cast iron, unlined | 15 cm | 1909 (105 years) |
| C5 | 2014/10/02 | Grey cast iron, unlined | 15 cm | 1911 (103 years) |
| C6 | 2014/10/07 | Ductile iron, mortar lined | 15 cm | 1974 (40 years) |
| C7 | 2014/10/09 | Grey cast iron, unlined | 15 cm | 1896 (118 years) |
| C8 | 2014/10/15 | Grey cast iron, unlined | 15 cm | 1903 (111 years) |
| C9 | 2014/11/20 | Grey cast iron, unlined | 20 cm | 1887 (127 years) |
| C10 | 2014/11/26 | Grey cast iron, unlined | 15 cm | 1887 (127 years) |
| N5a | 2014/10/29 | Ductile iron, mortar lined | 15 cm | 1968 (46 years) |
| N5b | 2014/10/29 | Ductile iron, mortar lined | 15 cm | 1967 (47 years) |
| N5c | 2014/10/29 | Ductile iron, mortar lined | 15 cm | 1967 (47 years) |
| N6 | 2015/05/20 | Grey cast iron, unlined | 15 cm | 1957 (58 years) |
| N7a | 2015/05/28 | Grey cast iron, unlined | 15 cm | 1906 (109 years) |
| N7b | 2015/05/28 | Ductile iron, unlined | 15 cm | 1971 (44 years) |

Figure C.2: Water mains collected from the chloraminated (a–i) and no-residual (j–o) drinking water distribution systems.

[Plots: sequences per 10 000 (log scale) by site for the chloraminated and no-residual systems; (a) sites C2, C3, C4, C5, C6, N5a, N5b, N5c, N6, N7a, N7b; (b) sites C1, C2, C4, C6, C7, C9, N0, N1, N2, N3, N4, N6, N7. Legend: Nitrosomonas, Nitrosospira, uncultured Nitrosomonadaceae, other Nitrosomonadaceae, Nitrospira, Nitrotoga, Nitrobacter.]

Figure C.3: 16S rRNA gene sequence profiles of bacterial genera associated with ammonia oxidation ( • ) and nitrite oxidation ( ∇ ) in (a) water-main biofilms and (b) drinking water.


[Plots: (a) log10(Nitrosomonas) and (b) log10(Nitrosomonadaceae) versus log10(amoA:16S rRNA gene copies); points distinguished by distribution system (chloraminated, no residual) and sample type (drinking water, water-main biofilms).]

Figure C.4: Comparison of operational taxonomic units (OTUs) of potential ammonia-oxidizing bacteria (AOB) versus Nitrosomonas oligotropha-like amoA:16S rRNA gene ratios: (a) genus Nitrosomonas-like OTUs and (b) family Nitrosomonadaceae-like OTUs.

[Plots: (a) Shannon index versus sequence depth, Spearman's ρ = −0.035 (p = 0.72); (b) inverse Simpson index versus sequence depth, Spearman's ρ = 0.004 (p = 0.97). Sequence depth is on a log scale (10^4 to 10^6). Points distinguished by sample type (water-main biofilms, drinking water, under tubercle) and distribution system (chloraminated, no residual).]

Figure C.5: Alpha diversity as a function of library size: (a) Shannon index and (b) inverse Simpson index.
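The check summarized in Figure C.5 (whether alpha diversity tracks library size) can be sketched as follows. This is a hedged illustration and not the original R analysis: the toy libraries are hypothetical, the Shannon index is assumed to use the natural logarithm, and `spearman_rho` is a hand-written rank correlation with no p value.

```python
import math


def shannon(counts):
    """Shannon index, H' = -sum(p_i * ln p_i), over non-zero OTU counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def inverse_simpson(counts):
    """Inverse Simpson index, 1 / sum(p_i^2)."""
    total = sum(counts)
    return 1.0 / sum((c / total) ** 2 for c in counts if c > 0)


def _ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranked values."""
    rx, ry = _ranks(x), _ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (sx * sy)


# Hypothetical libraries (OTU counts per sample).
libraries = [[500, 300, 200], [40, 40, 10, 10], [9000, 500, 400, 100], [60, 30, 5, 5]]
depths = [sum(lib) for lib in libraries]
rho_shannon = spearman_rho(depths, [shannon(lib) for lib in libraries])
rho_simpson = spearman_rho(depths, [inverse_simpson(lib) for lib in libraries])
print(rho_shannon, rho_simpson)
```

A rho near zero, as reported in Figure C.5, suggests that the diversity estimates were not driven by sequencing depth.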

[Ordination plots: (a) Axis 1 (26.3%) versus Axis 2 (8.5%); (b) Axis 1 (24.9%) versus Axis 2 (14.3%); (c) Axis 1 (21.5%) versus Axis 2 (11.8%). Points distinguished by sample type (water-main biofilms, drinking water, under tubercle) and distribution system (chloraminated, no residual).]

Figure C.6: Beta diversity of water-main biofilms, drinking water, and under-tubercle samples from two drinking water distribution systems using principal coordinates analysis of (a) generalized UniFrac distances, (b) unweighted UniFrac distances, and (c) Bray-Curtis dissimilarity. Axes indicate percent variance explained.

[Ordination plots: (a) Axis 1 (23.9%) versus Axis 2 (9.51%); (b) Axis 1 (23.9%) versus Axis 3 (5.83%). Points distinguished by sample type (water-main biofilms, drinking water) and distribution system (chloraminated, no residual).]

Figure C.7: Beta diversity of water-main biofilms and drinking water from two drinking water distribution systems using principal coordinates analysis of unweighted UniFrac: (a) principal axis 2 versus 1 and (b) principal axis 3 versus 1. Axes indicate percent variance explained.

[Ordination plots: (a) Axis 1 (30.5%) versus Axis 2 (11.4%); (b) Axis 1 (30.5%) versus Axis 3 (7.1%). Points distinguished by sample type (water-main biofilms, drinking water) and distribution system (chloraminated, no residual).]

Figure C.8: Beta diversity of water-main biofilms and drinking water from two drinking water distribution systems using principal coordinates analysis of Bray-Curtis dissimilarity: (a) principal axis 2 versus 1 and (b) principal axis 3 versus 1. Axes indicate percent variance explained.

[Ordination plots: (a) Axis 1 (17%) versus Axis 2 (12.1%); (b) Axis 1 (13.4%) versus Axis 2 (10.1%). Points distinguished by distribution system (chloraminated, no residual).]

Figure C.9: Beta diversity under tubercles from two drinking water distribution systems using principal coordinates analysis of (a) Bray-Curtis dissimilarity and (b) unweighted UniFrac measurements. Axes indicate percent variance explained.

Table C.1: Effect of library size on beta diversity metrics using different normalization methods

| Metric and normalization | All samples, R² | All samples, p* | Biofilm vs. water, R² | Biofilm vs. water, p* | Under tubercle, R² | Under tubercle, p* |
|---|---|---|---|---|---|---|
| Generalized UniFrac | | | | | | |
| No normalization (original counts) | 0.03197 | 0.001 | 0.04498 | 0.001 | 0.07573 | 0.016 |
| Cumulative sum scaling | 0.03573 | 0.001 | 0.05224 | 0.001 | 0.06702 | 0.012 |
| Subsampling without replacement† | 0.02971 | 0.001 | 0.04300 | 0.001 | 0.07537 | 0.023 |
| Unweighted UniFrac | | | | | | |
| No normalization (original counts) | 0.04961 | 0.001 | 0.06178 | 0.001 | 0.05059 | 0.037 |
| Cumulative sum scaling | 0.03300 | 0.001 | 0.06178 | 0.001 | 0.05721 | 0.015 |
| Subsampling without replacement† | 0.01693 | 0.012 | 0.03055 | 0.002 | 0.05127 | 0.046 |
| Bray-Curtis dissimilarity | | | | | | |
| No normalization (original counts) | 0.06429 | 0.001 | 0.10603 | 0.001 | 0.08771 | 0.004 |
| Cumulative sum scaling | 0.03892 | 0.001 | 0.05752 | 0.001 | 0.05567 | 0.004 |
| Subsampling without replacement† | 0.03233 | 0.001 | 0.05054 | 0.001 | 0.04791 | 0.133 |

NOTE: In the original table, italics indicate the final normalization method used for determination of the corresponding metric (no normalization for generalized UniFrac, subsampling for unweighted UniFrac, and cumulative sum scaling for Bray-Curtis dissimilarity).
* Permuted (n = 999)
† Even library size equal to lowest sequence depth among included samples (‘all samples’ and ‘under tubercle’ = 17 041 sequences per sample; ‘biofilm vs. water’ = 30 304 sequences per sample)

Table C.2: PCR primer sequences and thermoprofiles

| Target | Primer name and sequence (5′ → 3′) | Size, bp | PCR thermoprofile | Reference |
|---|---|---|---|---|
| Bacterial 16S rRNA genes, V3 | 341F: CCT ACG GGA GGC AGC AG; 534R: ATT ACC GCG GCT GCT GG | ~200 | 1 min at 95°C; 30 cycles of 15 s at 95°C and 1 min at 60°C | [31] |
| Nitrosomonas oligotropha-like amoA | amo550F: TCA GTA GCY GAC TAC ACM GG; amo754R: CTT TAA CAT AGT AGA AAG CGG | 205 | 1 min at 95°C; 40 cycles of 15 s at 95°C, 1 min at 56°C | [49] |
| Archaeal amoA | GenAOAF: ATA GAG CCT CAA GTA GGA AAG TTC TA; GenAOAR: CCA AGC GGC CAT CCA GCT GTA TGT CC | 135 | 10 min at 95°C; 40 cycles of 15 s at 95°C, 30 s at 55°C | [50] |

Table C.3: Summary of real-time quantitative PCR reactions

| Target | LOQ, copy number | Amplification efficiency | R² | Slope | Intercept |
|---|---|---|---|---|---|
| Nitrosomonas oligotropha-like amoA | 5 | 97.2% | 0.994 | −3.39 | 38.37 |
| | 5 | 99.7% | 0.990 | −3.33 | 38.26 |
| | 5 | 93.0% | 0.994 | −3.50 | 38.03 |
| Archaeal amoA | 130 | 94.4% | 0.999 | −3.46 | 41.26 |
| | 130 | 96.6% | 0.999 | −3.41 | 41.04 |
| | 130 | 94.8% | 0.996 | −3.45 | 41.26 |

LOQ, limit of quantification

Table C.4: p values for group-wise comparisons of Shannon index using the Conover-Iman test

Kruskal-Wallis test: p = 2.4 × 10^−15

| | Water-main biofilm, chlor. | Water-main biofilm, no res. | Drinking water, chlor. | Drinking water, no res. | Under tubercle, chlor. | Under tubercle, no res. |
|---|---|---|---|---|---|---|
| Water-main biofilm, chlor. | 1 | – | – | – | – | – |
| Water-main biofilm, no res. | <0.0001 | 1 | – | – | – | – |
| Drinking water, chlor. | <0.0001 | <0.0001 | 1 | – | – | – |
| Drinking water, no res. | <0.0001 | 0.7852 | <0.0001 | 1 | – | – |
| Under tubercle, chlor. | 0.6285 | <0.0001 | <0.0001 | <0.0001 | 1 | – |
| Under tubercle, no res. | 0.0104 | <0.0001 | 0.0001 | <0.0001 | 0.0006 | 1 |

Bold indicates significant p values (α = 0.05)
Chlor., chloraminated drinking water distribution system; no res., no-residual drinking water distribution system

Table C.5: p values for group-wise comparisons of inverse Simpson index using the Conover-Iman test

Kruskal-Wallis test: p = 3.2 × 10^−15

| | Water-main biofilm, chlor. | Water-main biofilm, no res. | Drinking water, chlor. | Drinking water, no res. | Under tubercle, chlor. | Under tubercle, no res. |
|---|---|---|---|---|---|---|
| Water-main biofilm, chlor. | 1 | – | – | – | – | – |
| Water-main biofilm, no res. | <0.0001 | 1 | – | – | – | – |
| Drinking water, chlor. | <0.0001 | <0.0001 | 1 | – | – | – |
| Drinking water, no res. | <0.0001 | 0.2878 | <0.0001 | 1 | – | – |
| Under tubercle, chlor. | 0.6253 | <0.0001 | <0.0001 | <0.0001 | 1 | – |
| Under tubercle, no res. | 0.0180 | <0.0001 | <0.0001 | <0.0001 | 0.0014 | 1 |

Bold indicates significant p values (α = 0.05)
Chlor., chloraminated drinking water distribution system; no res., no-residual drinking water distribution system

Appendix D

Supporting Information for Chapter 3

This appendix contains the Supporting Information (SI) for a previously published article [32]. It is reproduced in part with permission from:

Waak MB, LaPara TM, Hallé C, Hozalski RM. Occurrence of Legionella spp. in water-main biofilms from two drinking water distribution systems. Environ Sci Technol 2018; 52: 7630–7639, doi: 10.1021/acs.est.8b01170.

© 2018 American Chemical Society

SI Materials & Methods

Real-time quantitative polymerase chain reaction (qPCR)

A Bio-Rad CFX Connect Real-Time PCR Detection System was used to perform real-time PCR. PCR reactions consisted of Bio-Rad iTaq SYBR Green Supermix with ROX or Bio-Rad SsoAdvanced Universal Probes Supermix, 1 µg µL−1 of bovine serum albumin (Roche Diagnostics; Indianapolis, IN), appropriate concentrations of forward and reverse primers and probe (if applicable; see Table D.3 for final concentrations), 1 µL of template genomic DNA, and molecular biology-grade water (Sigma-Aldrich; St. Louis, MO) for a final reaction volume of 25 µL. Quantification cycles (Cq) were determined using single threshold and baseline subtracted curve fit settings in Bio-Rad CFX Manager v3.1 software. All primers and probe Lp-P (targeting mip) were synthesized by Integrated DNA Technologies, Inc. (IDT; Coralville, Iowa), and probe PanLeg-P (for ssrA) was synthesized as a custom TaqMan probe by Applied Biosystems (Foster City, CA).

All qPCR assays were initially performed using universal SYBR Green chemistry. Melt curve analysis suggested non-target amplification of high-molecular weight products in PCR reactions targeting ssrA and mip; this was later confirmed with electrophoresis gels (data not shown), although there also appeared to be products consistent with the intended target amplicon sizes (Table D.3). As a result, fluorophore-labelled probes PanLeg-P (ssrA) and Lp-P (mip) were utilized [95], with PCR thermoprofiles previously designed for environmental samples [97].

Standard curves for qPCR were derived from serial 1:10 dilutions of custom IDT gBlocks Gene Fragments composed of the target DNA sequences flanked by 10–20 additional bases on either side. These sequences were obtained from the GenBank database [103] and are provided in Table D.4. To be acceptable, standard curves had to comprise a minimum of five valid points (i.e., the quantifiable range spanned five orders of magnitude), exhibit amplification efficiency between 90% and 110%, and have a coefficient of determination (R²) greater than 0.985. No-template controls were used during each qPCR assay to validate the method and monitor for contamination. For PCR performed with SYBR Green universal dye, the specificity of PCR products was validated by visual inspection of melt curves. For all PCR assays, amplification efficiencies of individual standards and individual unknown samples were visually compared to ensure similarity. Negative controls collected during tap water and biofilm sampling, and processed along with DNA samples, were used to monitor contamination. Genomic DNA from Legionella pneumophila subsp. pneumophila [160] (ATCC 33152D-5) was used as a positive control during qPCR of ssrA, mip, and wzm. Randomly selected samples were further assessed in duplicate, with the addition of genomic Legionella DNA, to test for inhibition.

Characterization of Legionella-like 16S rRNA genes

Samples were amplified 5–10 cycles beyond the Cq value (determined via real-time qPCR), until the amplification curve approached the plateau phase, to produce amplicon products. PCR reactions consisted of Bio-Rad iTaq SYBR Green Supermix with ROX (Life Science Research, Hercules, CA), 1 µg µL−1 of bovine serum albumin (Roche Diagnostics; Indianapolis, IN), 500 nmol L−1 of forward and reverse primer, 1 µL of template genomic DNA, and molecular biology-grade water (Sigma-Aldrich; St. Louis, MO) for a final volume of 50 µL. PCR-amplified 16S rRNA gene fragments were purified using the QIAquick PCR Purification Kit (Qiagen; Hilden, Germany) and then quantified by the University of Minnesota Genomics Center (UMGC) using the PicoGreen assay (Thermo Fisher Scientific; Waltham, MA). Equal masses of the PCR-amplified DNA were pooled from all samples to create a library for paired-end (2 × 150 bp) sequencing at UMGC, using the MiSeq platform (Illumina, Inc.; San Diego, California).

A metagenomics pipeline previously developed for data processing was used to filter and assign taxonomy to the 16S rRNA amplicon sequence reads [33]. Paired-end reads were purged of Illumina adapter sequences using Trimmomatic [161] and then stitched together and purged of primer sequences using PANDAseq [162]. Chimeric sequences were identified and removed via QIIME v1.9.1 [34] using the USEARCH 6.1 wrapper (using both reference-based and de novo methods) [36]. Open-reference operational taxonomic units (OTUs) were clustered at 99% sequence identity using the USEARCH 6.1 wrapper and the SILVA 128 99% reference sequences [37], with cluster seeds selected as representative OTU sequences. OTUs were then assigned consensus taxonomy (i.e., 0.51 minimum consensus fraction of 3 maximum accepts with 90% or greater similarity) using the UCLUST wrapper in QIIME [36] and the SILVA 128 taxonomic database [38]. Finally, the OTU table was filtered of single-read OTUs and OTUs that failed alignment against the SILVA 128 core alignment using the PyNAST wrapper in QIIME [39].

Assimilable organic carbon (AOC)

Tap water samples were collected from public or private faucets after 5–10 min of flushing to evacuate stagnant water from the premise plumbing. A 1-L tap water sample was collected at each site in a Whirl-Pak disposable sampling bag (Nasco; Fort Atkinson, Wisconsin). The sampling bags were previously determined to contribute no appreciable AOC during quality control testing. The 1-L tap water sample was vacuum-filtered through a 0.22-µm nitrocellulose membrane filter, with the first 500 mL of filtrate discarded to avoid potential AOC contributions from the filter. The second 500-mL aliquot was dechlorinated with 0.5 mL of a 190-mM, AOC-free sodium thiosulfate (Na2S2O3) solution and dispensed into four pre-cleaned 40-mL glass vials (certified <10 ppb total organic carbon; Thermo Fisher Scientific; Waltham, Massachusetts). The vials were tightly capped and pasteurized in a hot-water bath at 70°C for 30 min. Three vials were co-inoculated with P-17 and NOX for final concentrations of 500 colony-forming units (CFU) mL−1 of each strain, while the fourth vial was a negative control. All four vials were then held at 15°C for 7 days. On day 7, serial dilutions of the samples were prepared with a buffered phosphate solution. Dilutions were plated onto R2A agar, and colonies were enumerated after 2–3 days of growth in an incubator at room temperature. The geometric mean of colony counts was used for conversion to AOC via the standard yield factors [28].

Data analysis and statistics

Statistics and data transformations were performed in R software [45] using packages NADA [163] and biomformat [164], as well as statistical tests and code previously described [54]. Plots were generated in R using the packages ggplot2 [165] and cowplot [166]. Due to the high occurrence of left-censored observations within the datasets (i.e., below the method limits of quantification, or 50% censored, the median was indeterminate and reported as

[Plot: AOC (µg C L−1, 0–300) versus distance from treatment plant (km, 0–15); series: chloraminated, no residual.]

Figure D.1: Assimilable organic carbon (AOC) versus direct distance of tap location from the drinking water treatment plant.
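The conversion from colony counts to AOC values such as those plotted in Figure D.1 can be sketched as below. This is an illustration under stated assumptions, not the calculation code used in this work: the yield factors (4.1 × 10^6 CFU per µg acetate-C for strain P-17 and 1.2 × 10^7 CFU per µg acetate-C for strain NOX) are the commonly cited values for the P-17/NOX bioassay and are assumed here rather than taken from this document, the colony counts are hypothetical, and the function names are illustrative.

```python
import math

# Assumed yield factors (CFU per microgram of acetate carbon); verify against
# the method reference before use.
P17_YIELD = 4.1e6
NOX_YIELD = 1.2e7


def geometric_mean(values):
    """Geometric mean of strictly positive replicate counts."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean requires positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def aoc_ug_c_per_litre(p17_cfu_per_ml, nox_cfu_per_ml):
    """AOC as acetate-carbon equivalents (ug C per L) from replicate counts."""
    p17 = geometric_mean(p17_cfu_per_ml)
    nox = geometric_mean(nox_cfu_per_ml)
    # CFU/mL divided by CFU/ug C gives ug C/mL; multiply by 1000 mL per L.
    return (p17 / P17_YIELD + nox / NOX_YIELD) * 1000.0


# Hypothetical day-7 plate counts (CFU per mL) from three inoculated vials.
p17_counts = [3.9e5, 4.1e5, 4.4e5]
nox_counts = [1.1e6, 1.2e6, 1.3e6]
print(round(aoc_ug_c_per_litre(p17_counts, nox_counts), 1))
```

The geometric mean is used because plate counts from replicate vials tend to vary multiplicatively, so averaging on the log scale reduces the influence of a single high count.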

[Plot: temperature (°C, 0–30) by month for 2014 and 2015; series: chloraminated, no residual.]

Figure D.2: Daily average temperatures of raw water entering the chloraminated and no-residual drinking water treatment plants, 2014–2015.

[Plots: (A) temperature (°C, 0–35) by month for raw water and tap water; (B) median change (°C, −5 to +10) by month.]

Figure D.3: Monthly water temperatures for the chloraminated drinking water distribution system, 2014–2015: (A) Temperature of raw water at the treatment plant and tap water from 15 monitoring points in the distribution system, and (B) net change in median temperatures.

[Phylogenetic tree; scale bar 0.05. OTUs are marked by sample type (water-main biofilms, tap water) and distribution system (chloraminated, no residual). Tip labels in order: Uncultured Legionella sp. (KP822856.1); Legionella-like OTU 2 (this study); Legionella-like OTU 8 (this study); Legionella lansingensis (LT906451.1); Legionella-like OTU 14 (this study); Legionella wadsworthii (JF720382.1); Uncultured Legionella sp. (KM624098.1); Legionella geestiana (NR_044957.1); Legionella worsleiensis (JF720415.1); Legionella clemsonensis (KX694517.1); Legionella anisa (JF720397.1); Legionella drozanskii (FJ544434.1); Legionella pneumophila strain Philadelphia-1 (CP013742.1); Legionella pneumophila subsp. pneumophila (CP021286.1); Legionella pneumophila strain Pontiac (CP016029.2); Legionella dumoffii (KU143915.1); (LC085648.1); Legionella micdadei (CP020615.1); Legionella bozemanii (EF474026.1); Legionella waltersii (LT906442.1); Legionella taurinensis (AB638719.1); Legionella fallonii (LN614827.1); Legionella moravica (JF720413.1); Legionella quinlivanii (JF720360.1); Legionella feeleii (JF720366.1); Legionella donaldsonii (KM504126.1); Legionella-like OTU 6, OTU 13, OTU 5, OTU 16, OTU 12, OTU 9, and OTU 1 (this study); Legionella norrlandica (KJ796839.1); Legionella-like OTU 19 and OTU 11 (this study); Uncultured Legionella sp. (AY924057.1); Legionella-like OTU 3, OTU 15, OTU 17, OTU 18, OTU 7, OTU 10, and OTU 4 (this study); Legionella sp. S003 (FJ544437.1); Legionella israelensis (JF720422.1); (MG966450.1); Escherichia coli (CP019213.2); Pseudomonas aeruginosa (CP015650.1); burnetii (CP014563.1).]

Figure D.4: Phylogenetic analysis of most-observed Legionella-like operational taxonomic units versus 16S rRNA gene sequences of known Legionella spp. and phylogenetically similar taxa (GenBank accession numbers).

[Plot: log10(relative abundance), −5 to −2, versus log10(ssrA:16S rRNA gene); points distinguished by sample type (tap water, biofilm) and distribution system (chloraminated, no residual).]

Figure D.5: Relative abundances of Legionella-like operational taxonomic units versus ratios of ssrA marker gene copy number to 16S rRNA gene copy number.

[Plots: (A) log10(copies cm−2) at sites C2, C3, C4, C5, C6, C7, C8, C9, C10, N5a, N5b, N5c, N6, N7a, N7b; (B) log10(copies L−1) at sites C1, C2, C4, C6, C7, C9, N0, N1, N2, N3, N4, N6, N7; each grouped by chloraminated and no-residual system. Legend: quantifiable mip; detectable mip; method limit of quantification (LOQmeth; tail indicates no detection).]

Figure D.6: Total Legionella pneumophila in the chloraminated and no-residual drinking water distribution systems, as determined by qPCR targeting mip genes in: (A) water-main biofilms, and (B) filtered tap water.

[Plots: (A) log10(copies cm−2) at sites C2, C3, C4, C5, C6, C7, C8, C9, C10, N5a, N5b, N5c, N6, N7a, N7b; (B) log10(copies L−1) at sites C1, C2, C4, C6, C7, C9, N0, N1, N2, N3, N4, N6, N7; each grouped by chloraminated and no-residual system. Legend: quantifiable wzm; detectable wzm; method limit of quantification (LOQmeth; tail indicates no detection).]

Figure D.7: Total Legionella pneumophila serogroup 1 in the chloraminated and no-residual drinking water distribution systems, as determined by qPCR targeting wzm genes in: (A) water-main biofilms, and (B) filtered tap water. Sites C7–C10 of the chloraminated system experience seasonal shutoff and may not represent typical water-main biofilms from that system. Site N0 of the no-residual system is treated water leaving the treatment plant.

[Plot: cases per 100,000 (0–3.0) by year, 2010–2015; series: regional incidence for the chloraminated system (county) and the no-residual system (county), and national incidence for the United States and Norway.]

Figure D.8: Incidence rate of legionellosis, 2010–2015, for the metropolitan areas served by the chloraminated (United States) and no-residual (Norway) drinking water distribution systems and the national incidence rates for each country.

Table D.1: Summary of water mains from the chloraminated and no-residual drinking water distribution systems, 2014–2015

| Site ID | Sampling date | Install year | Pipe material | Diameter, cm | Distance from WTP, km |
|---|---|---|---|---|---|
| Chloraminated system | | | | | |
| C2 | 2014–08–05 | 1911 | Cast iron (gray), unlined | 15.2 | 8.1 |
| C3 | 2014–08–08 | 1972 | Cast iron (ductile), mortar-lined | 20.3 | 7.7 |
| C4 | 2014–09–26 | 1909 | Cast iron (gray), unlined | 15.2 | 4.6 |
| C5 | 2014–10–02 | 1904 | Cast iron (gray), unlined | 15.2 | 12.0 |
| C6 | 2014–10–07 | 1974 | Cast iron (ductile), mortar-lined | 15.2 | 6.5 |
| C7 | 2014–10–09 | 1896 | Cast iron (gray), unlined | 15.2 | 10.5 |
| C8 | 2014–10–15 | 1903 | Cast iron (gray), unlined | 15.2 | 10.5 |
| C9 | 2014–11–20 | 1887 | Cast iron (gray), unlined | 20.3 | 10.5 |
| C10 | 2014–11–26 | 1887 | Cast iron (gray), unlined | 15.2 | 10.6 |
| No-residual system | | | | | |
| N5a | 2014–10–29 | 1968 | Cast iron (ductile), unlined | 15.0 | 5.8 |
| N5b | 2014–10–29 | 1967 | Cast iron (ductile), unlined | 15.0 | 5.8 |
| N5c | 2014–10–29 | 1967 | Cast iron (ductile), unlined | 15.0 | 5.8 |
| N6 | 2015–05–20 | 1957 | Cast iron (gray), unlined | 15.0 | 8.8 |
| N7a | 2015–05–28 | 1906 | Cast iron (gray), unlined | 15.0 | 8.4 |
| N7b | 2015–05–28 | 1971 | Cast iron (ductile), unlined | 15.0 | 8.4 |

WTP, water treatment plant


Table D.2: Summary of tap water collected from the chloraminated and no-residual drinking water distribution systems, 2014–2015

| Site ID | Sampling date | Total chlorine, mg Cl2 L−1 | Distance from WTP, km |
|---|---|---|---|
| Chloraminated system | | | |
| C1 | 2014–05–20 | 2.8 | 16.4 |
| C2 | 2014–07–16 | 3.5 | 8.1 |
| C4 | 2014–09–26 | 3.9 | 4.4 |
| C6 | 2014–10–07 | 3.1 | 6.5 |
| C7 | 2014–10–09 | 3.2 | 10.7 |
| C9 | 2014–11–20 | 3.5 | 10.7 |
| No-residual system | | | |
| N0 (a) | 2014–06–16 | 0.05 | 0 |
| N1 | 2014–06–17 | <0.02 | 6.2 |
| N2 | 2014–06–17 | 0.02 | 7.5 |
| N3 | 2014–06–17 | 0.05 | 3.1 |
| N4 | 2014–06–17 | 0.02 | 6.3 |
| N6 | 2015–05–21 | <0.02 | 8.8 |
| N7 | 2015–05–28 | <0.02 | 8.4 |

WTP, water treatment plant
(a) Treated water leaving plant

Table D.3: Forward (F) and reverse (R) primer and probe (P) sequences and thermoprofiles for analysis by real-time PCR

| Gene (taxonomic target) | Primer/probe name and sequence (5′ → 3′) | Concentration, nmol L−1 | Size, bp | Thermoprofile | Ref. |
|---|---|---|---|---|---|
| 16S rRNA (all Bacteria) | 341F: CCT ACG GGA GGC AGC AG | 500 | ~200 | 1 min at 95°C; 30 cycles of 15 s at 95°C and 1 min at 60°C | [30, 31] |
| | 534R: ATT ACC GCG GCT GCT GG | 250 | | | |
| ssrA (Legionella spp.) | PanLegF: GGC GAC CTG GCT TC | 500 | 101 | 10 min at 95°C; 45 cycles of 15 s at 95°C and 1 min at 60°C | [95, 97] |
| | PanLegR: GGT CAT CGT TTG CAT TTA TAT TTA | 500 | | | |
| | PanLegP: 6-FAM / ACG TGG GTT GCA A / MGBNFQ | 100 | | | |
| mip (L. pneumophila) | LpF: TTG TCT TAT AGC ATT GGT GCC G | 500 | 115 | 10 min at 95°C; 45 cycles of 15 s at 95°C and 1 min at 60°C | [95, 97] |
| | LpR: CCA ATT GAG CGC CAC TCA TAG | 500 | | | |
| | LpP: 6-FAM / CGG AAG CAA /ZEN/ TGG CTA AAG GCA TGC A / IBFQ | 100 | | | |
| wzm (L. pneumophila serogroup 1) | P65: CAA AGG GCG TTA CAG TCA AAC C | 500 | 75 | 2 min at 95°C; 35 cycles of 15 s at 95°C and 30 s at 60°C | [96] |
| | P66: CAA ACA CCC CAA CCG TAA TCA | 250 | | | |

6-FAM, 6-FAM fluorescein reporter dye; MGBNFQ, TaqMan minor groove binder non-fluorescent quencher; ZEN, ZEN internal fluorescence quencher; IBFQ, Iowa Black dark quencher

Table D.4: Custom gBlocks Gene Fragments used as oligonucleotide standards during real-time qPCR and respective reference genomic DNA sequences from GenBank

Taxonomic target (gene) | Reference genome | GenBank accession no. | Region (length)
All Bacteria (16S rRNA genes) | Escherichia sp. UIWRF0630 16S ribosomal RNA gene, partial sequence | KR190116.1 | base 248 to 471 (224 bp)
All Legionella spp. (ssrA) | Legionella pneumophila strain Philadelphia 1 ATCC, complete genome | CP015927.1 | base 172901 to 173032 (132 bp)
L. pneumophila (mip) | Legionella pneumophila strain Philadelphia 1 ATCC, complete genome | CP015927.1 | base 915183 to 915327 (145 bp)
L. pneumophila serogroup 1 (wzm) | Legionella pneumophila serogroup 1 lipopolysaccharide biosynthesis gene cluster | AJ007311.1 | base 5421 to 5552 (132 bp)

Table D.5: Summary of real-time quantitative PCR reactions

Gene target | Assay | LOQ, copy number | LOD, copy number | Amplification efficiency | R² | Slope | Intercept
16S rRNA | 1 | 1.3 × 10⁴ | n.d. | 98.6% | 0.994 | −3.36 | 38.46
16S rRNA | 2 | 1.3 × 10⁴ | n.d. | 99.7% | 0.989 | −3.33 | 38.29
16S rRNA | 3 | 1.3 × 10⁴ | n.d. | 93.3% | 1.000 | −3.49 | 39.58
ssrA | 1 | 10 | 5 | 94.6% | 0.998 | −3.46 | 40.61
ssrA | 2 | 10 | 5 | 98.0% | 0.998 | −3.37 | 39.93
mip | 1 | 10 | 5 | 96.8% | 1.000 | −3.40 | 39.89
mip | 2 | 10 | 5 | 96.4% | 1.000 | −3.41 | 38.91
wzm | 1 | 10 | 5 | 97.4% | 0.998 | −3.39 | 38.68
wzm | 2 | 10 | 5 | 93.2% | 0.992 | −3.50 | 39.85

LOQ, limit of quantification; LOD, limit of detection; n.d., not defined
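
The slopes, intercepts, and amplification efficiencies in Table D.5 can be related through the usual qPCR standard-curve relations, Cq = slope × log10(copies) + intercept and E = 10^(−1/slope) − 1. The following is a minimal sketch under the assumption that the reported efficiencies were derived this way (the table itself does not state the formula); the function names are illustrative and do not come from the dissertation.

```python
# Sketch: standard-curve arithmetic for the assays summarized in Table D.5.
# Assumption: Cq = slope * log10(copies) + intercept, E = 10**(-1/slope) - 1.

def amplification_efficiency(slope: float) -> float:
    """Efficiency (fraction) from the slope of Cq versus log10(copy number)."""
    return 10.0 ** (-1.0 / slope) - 1.0


def copies_from_cq(cq: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to estimate copy number from a measured Cq."""
    return 10.0 ** ((cq - intercept) / slope)


# (gene target, assay, slope, intercept, reported efficiency) from Table D.5
TABLE_D5 = [
    ("16S rRNA", 1, -3.36, 38.46, 0.986),
    ("16S rRNA", 2, -3.33, 38.29, 0.997),
    ("16S rRNA", 3, -3.49, 39.58, 0.933),
    ("ssrA", 1, -3.46, 40.61, 0.946),
    ("ssrA", 2, -3.37, 39.93, 0.980),
    ("mip", 1, -3.40, 39.89, 0.968),
    ("mip", 2, -3.41, 38.91, 0.964),
    ("wzm", 1, -3.39, 38.68, 0.974),
    ("wzm", 2, -3.50, 39.85, 0.932),
]

for target, assay, slope, intercept, reported in TABLE_D5:
    computed = amplification_efficiency(slope)
    print(f"{target} assay {assay}: computed {computed:.1%}, reported {reported:.1%}")
```

Efficiencies recomputed from the two-decimal slopes differ from the reported values by a few tenths of a percentage point at most, which is consistent with rounding of the slope. `copies_from_cq` applies the same curve in the other direction, for example to convert a sample Cq into a copy number before comparing it with the LOQ.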


Table D.6: Methods used by the two drinking water utilities for monitoring water chemistry

Parameter | Method
Chloraminated system
pH | Electrometric method (SM 4500-H⁺)
Hardness | EDTA titration method (SM 2340)
Total chlorine | DPD colorimetric & titration methods (SM 4500-Cl)
Total/dissolved organic carbon (TOC/DOC) | Heated persulfate oxidation method (SM 5310)
Total ammonia | Indophenol (Hach® method 10268) and indosalicylate (EPA Method 350.1)
Free ammonia, NH₃/NH₄⁺ | Indophenol (Hach® methods 10200/10201)
Nitrite, NO₂⁻ | EPA Diazotization Method (Hach® method 8507)
Nitrate, NO₃⁻ | Dimethylphenol method (Hach TNTplus™ 835; Method 10206)
Total phosphorus | Ascorbic acid method (Hach TNTplus™ 843; Method 10209/10210)
No-residual system
pH | Electrometric method (NS-EN ISO 10523:2012)
Hardness | Mg²⁺ + Ca²⁺ calculation method via ion chromatography (NS-EN ISO 14911:1999)
Total chlorine | DPD colorimetric & titration methods (NS-EN ISO 7393-(1,2):2000)
Total/dissolved organic carbon (TOC/DOC) | Heated persulfate oxidation method (NS-EN 1484:1997)
Ammonium, NH₄⁺ | Ion chromatography method (NS-EN ISO 14911:1999)
Nitrate, NO₃⁻ | Ion chromatography method (NS-EN ISO 10304-1:2009)

SM, Standard Methods [28]; NS-EN, Norwegian Standard (English) [167]; EPA, U.S. Environmental Protection Agency; ISO, International Organization for Standardization
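
Table D.6 lists hardness in the no-residual system as a value calculated from magnesium and calcium measured by ion chromatography. As a hedged illustration of what such a calculation looks like, the sketch below uses the conversion factors of the hardness-by-calculation approach in Standard Methods (2.497 for Ca and 4.118 for Mg, both in mg L⁻¹, giving mg CaCO₃ L⁻¹). Whether the Norwegian utility used exactly these factors is an assumption; the function name and the example concentrations are hypothetical and are not measurements from either system.

```python
# Sketch: hardness by calculation from Ca and Mg concentrations.
# Assumption: factors follow the Standard Methods hardness-by-calculation
# approach; they convert mg/L of each cation to mg/L as CaCO3.

CA_TO_CACO3 = 2.497  # mg CaCO3 per mg Ca
MG_TO_CACO3 = 4.118  # mg CaCO3 per mg Mg


def hardness_as_caco3(ca_mg_per_l: float, mg_mg_per_l: float) -> float:
    """Total hardness in mg CaCO3 per litre from Ca and Mg in mg per litre."""
    if ca_mg_per_l < 0 or mg_mg_per_l < 0:
        raise ValueError("concentrations must be non-negative")
    return CA_TO_CACO3 * ca_mg_per_l + MG_TO_CACO3 * mg_mg_per_l


# Hypothetical concentrations, for illustration only
example_hardness = hardness_as_caco3(ca_mg_per_l=16.0, mg_mg_per_l=2.4)
print(f"hardness = {example_hardness:.1f} mg CaCO3 per litre")
```

Expressing both systems' hardness in the same unit (mg CaCO₃ L⁻¹) is what allows titration-based and calculation-based results to be placed side by side.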

Table D.7: Median water quality parameters (and ranges, minimum–maximum) for the chloraminated and no-residual drinking water distribution systems

Parameter | Units | Chloraminated | No residual
Raw water (treatment plant influent)
Location | | United States | Norway
Source | | River | Lake (inlet depth 50 m)
Daily average temperatures | °C | 10.0 (1.1–32.1) | 3.8 (0.1–5.4)
Treated water (treatment plant effluent)
pH | unitless | 8.7 (7.5–9.5) | 8.1 (7.8–8.6)
Hardness | mg CaCO₃ L⁻¹ | 73 (44–93) | 50 (36–54)
Total chlorine | mg Cl₂ L⁻¹ | 3.8 (3.2–4.1) | 0.08 (0.05–0.12)
Organic carbon: Total (TOC) | mg C L⁻¹ | 4.8 (3.7–8.0) | 3.4 (2.8–4.1)
Organic carbon: Dissolved (DOC) | mg C L⁻¹ | 4.6 (3.6–7.6) | ᵃ
Organic carbon: Assimilable (AOC) | µg C L⁻¹ | n.a. | 135
Organic carbon: AOC, P-17 | µg C L⁻¹ | n.a. | 25
Organic carbon: AOC, NOX | µg C L⁻¹ | n.a. | 110
Inorganic nitrogen: Total ammoniaᵇ | mg N L⁻¹ | 0.82 (0.53–1.83) | <0.02ᶜ
Inorganic nitrogen: Free ammonia, NH₃/NH₄⁺ | mg N L⁻¹ | 0.11 (<0.02–0.97) | <0.02ᶜ
Inorganic nitrogen: Ammonium, NH₄⁺ | mg N L⁻¹ | 0.09 (<0.02–0.85)ᵈ | <0.02 (<0.02)
Inorganic nitrogen: Nitrate, NO₃⁻ | mg N L⁻¹ | 0.86 (0.36–1.73) | 0.24 (0.22–0.26)
Total phosphorus | mg P L⁻¹ | 0.29 (0.17–0.52) | <0.05
Tap water (drinking water distribution system)
Temperaturesᵉ | °C | 16.1 (1.8–34.1) | 7.2 (5.2–8.5)
Total chlorine | mg Cl₂ L⁻¹ | 3.3 (0.3–4.1) | <0.02 (<0.02–0.08)
Inorganic nitrogen: Total ammoniaᵇ | mg N L⁻¹ | 0.77 (0.07–1.02) | n.a.
Inorganic nitrogen: Free ammonia, NH₃/NH₄⁺ | mg N L⁻¹ | 0.18 (<0.02–0.50) | n.a.
Inorganic nitrogen: Nitrite, NO₂⁻ | mg N L⁻¹ | 0.023 (<0.002–0.456) | n.a.
Inorganic nitrogen: Nitrate, NO₃⁻ | mg N L⁻¹ | 0.70 (0.12–1.84) | n.a.
AOC | µg C L⁻¹ | 297 (238–343) | 101 (81–109)
AOC, P-17 | µg C L⁻¹ | 31 (30–34) | 14 (12–26)
AOC, NOX | µg C L⁻¹ | 263 (208–312) | 81 (67–97)

n.a., not available
ᵃ DOC is consistently equal to TOC
ᵇ Total ammonia includes NH₃/NH₄⁺ and monochloramine, NH₂Cl
ᶜ Calculated conservatively based on fractionation of NH₃/NH₄⁺ at pH 8.6, temperature 5.4°C, and 0.02 mg NH₄⁺-N L⁻¹, as previously described [29]
ᵈ Calculated based on fractionation of NH₃/NH₄⁺ at known pH, temperature, and free ammonia, as previously described [29]
ᵉ Distribution system temperatures for the chloraminated system were measured monthly from 15 monitoring locations for 2014–2015. Temperatures from the no-residual system were collected from three taps in May 2017 for this investigation.

Appendix E

Supporting Information for Chapter 4

[Figure E.1 graphic not reproduced. Legend: sample type (water-main biofilms, tap water); distribution system (chloraminated, no residual). Tips: OTUs 1–15 and GenBank reference sequences. Scale bar: 0.05.]

Figure E.1: Phylogenetic tree of mycobacterial hsp65 operational taxonomic units versus sequences of known Mycobacterium spp. and related taxa from GenBank.

[Figure E.2 graphic not reproduced. x-axis: Mycobacterium-like 16S rRNA genes; y-axis: Methylobacterium-like 16S rRNA genes. Legend: sample type (water-main biofilms, drinking water); distribution system (chloraminated, no residual).]

Figure E.2: Methylobacterium- versus Mycobacterium-like amplicon sequence variants (ASVs) in water-main biofilms and drinking water of two drinking water distribution systems. Relative abundances have been normalized by 16S rRNA gene copy numbers. ND = no such ASV detected.

[Figure E.3 graphic not reproduced. x-axis: Mycobacterium-like sequences (10⁻⁵ to 10⁰); y-axis: atpE:16S rRNA gene ratios (10⁻⁵ to 10⁰). Legend: sample type (water-main biofilms, drinking water); distribution system (chloraminated, no residual).]

Figure E.3: Ratios of mycobacterial atpE genes to bacterial 16S rRNA genes (via real-time qPCR) versus Mycobacterium-like amplicon sequence variants (via high-throughput sequencing of 16S rRNA genes) in water-main biofilms and drinking water from two drinking water distribution systems. Relative abundances of 0 and qPCR values below the method limit of quantification have been substituted with 1×10⁻⁵ for the logarithmic scale.

[Figure E.4 graphic not reproduced. Bottom x-axis: amplicon sequence variant (1–50). Top x-axis: operational taxonomic unit, read left to right as ASVs 1–12, OTU 1; 13–23, OTU 2; 24, OTU 3; 25–30, OTU 4; 31–34, OTU 5; 35, OTU 6; 36–37, OTU 7; 38–44, OTUs 8–14 (one ASV each); 45–50, OTU 15. y-axis: sample site, grouped by distribution system and sample type. Chloraminated DWDS, biofilm: C2-1, C2-3, C3-1, C3-3, C4-2, C5-1, C5-4, C6-2, C6-3, C6-4; water: C4-2, C4-3, C9-3, C9-4. No-residual DWDS, biofilm: N5a-2, N5a-3, N5a-4, N5c-1, N5c-3, N6-1, N7a-1, N7a-2, N7a-3, N7a-4, N7b-3; water: N1-1, N1-3, N3-3, N4-1, N4-3, N7-3. Color scale: log10(frequency), 0–3.]

Figure E.4: Heat map of mycobacterial hsp65 gene amplicon sequence variants (ASVs) in water-main biofilms and drinking water of two drinking water distribution systems. Operational taxonomic unit assignments for each ASV are indicated on the top x-axis, representing ASVs clustered by 95% similarity.

Table E.1: PCR primer and probe sequences and thermoprofiles*

Target | Primer/probe name and sequence (5′ → 3′) | Concentration, nmol L⁻¹ | PCR thermoprofile | Ref.
Mycobacterial atpE genes (164 bp) | atpE-F: CGG YGC CGG TAT CGG YGA | 500 | 45 s at 95°C; 40 cycles of 3 s at 95°C, 30 s at 60°C | [141]
 | atpE-R: CGA AGA CGA ACA RSG CCA T | 500 | |
 | atpE-P: FAM-ACS GTG ATG /ZEN/ AAG AAC GGB GTR AA-IBFQ | 50 | |
ITS region of M. avium complex (90 bp) | MAC-F: TTG GGC CCT GAG ACA ACA CT | 1800 | 10 min at 95°C; 40 cycles of 15 s at 95°C, 1 min at 60°C | [142]
 | MAC-R: GCA ACC ACT ATC CAA TAC TCA AAC AC | 1800 | |
 | MAC-P: FAM-CCG TGT GGA /ZEN/ GTC CCT CCA TCT TGG-IBFQ | 400 | |
Mycobacterial hsp65 genes (441 bp) | Tb11: ACC AAC GAT GGT GTG TCC AT | 500 | 2 min at 98°C; 32–45 cycles of 15 s at 98°C, 30 s at 60°C | [143, 144]
 | Tb12: CTT GTC GAA CCG CAT ACC CT | 500 | |
Bacterial 16S rRNA genes, V3 (~197 bp) | 341F: CCT ACG GGA GGC AGC AG | 500 | 1 min at 95°C; 30 cycles of 15 s at 95°C and 1 min at 60°C | [31]
 | 534R: ATT ACC GCG GCT GCT GG | 250 | |

*ITS, internal transcribed spacer; FAM, 6-carboxyfluorescein; ZEN, ZEN™ internal quencher; IBFQ, Iowa Black fluorescent quencher

Table E.2: Reference genomic DNA for creation of custom gBlocks Gene Fragments (qPCR standards)*

Target | Reference genome | GenBank accession no. | Region (length)
Bacterial 16S rRNA genes | Escherichia sp. UIWRF0630 16S ribosomal RNA gene, partial sequence | KR190116.1 | base 248 to 471 (227 bp)
Mycobacterial atpE genes | Mycobacterium avium subsp. avium strain DJO-44271, complete genome | CP009614.1 | base 1377859 to 1378052 (194 bp)
ITS region of M. avium complex | Mycobacterium avium subsp. avium strain DJO-44271, complete genome | CP009614.1 | base 1388871 to 1388990 (120 bp)

*ITS, internal transcribed spacer


Table E.3: Summary of real-time quantitative PCR reactions*

Target | LOQ, copy number | Amplification efficiency | R² | Slope | Intercept
Mycobacterial atpE genes | 10 | 92.1% | 0.998 | −3.53 | 39.66
Mycobacterial atpE genes | 10 | 94.4% | 0.998 | −3.46 | 39.42
ITS region of M. avium complex | 10 | 93.8% | 1.000 | −3.48 | 39.17
ITS region of M. avium complex | 10 | 92.6% | 0.999 | −3.51 | 39.59

*LOQ, limit of quantification; ITS, internal transcribed spacer

Table E.4: Reference hsp65 gene sequences from GenBank used for naïve Bayesian classifier*

Accession No. | Species/subspecies (strain)
U55832.1 | M. abscessus (ATCC 14472)
CU458896.1 | M. abscessus (ATCC 19977)
AM902964.1 | M. aemonae (DSM 45058T)
FJ617583.1 | M. africanum (ATCC 25420)
AY438080.1 | M. agri (CIP 105391)
AF547804.1 | M. aichiense (DSM 44147)
GU564405.1 | M. algericum (DSM 45454)
AF547805.1 | M. alvei (CIP 103464)
KC481266.1 | M. angelicum (JCM 18266)
KP644751.1 | M. arceuilense (strain 269)
AB239922.1 | M. arupense (strain CST7052)
AF547806.1 | M. asiaticum (DSM 44297)
GU362517.1 | M. asiaticum (JCM 18266)
AY859677.1 | M. aubagnense (CIP 108543)
AY438081.1 | M. aurum (CIP 104465)
AF547807.1 | M. austroafricanum (CIP 105395)
AF126030.1 | M. avium subsp. avium (ATCC 25291)
CP009360.2 | M. avium subsp. hominissuis (strain OCU464)
AF547809.1 | M. avium subsp. paratuberculosis (CIP 103963)
AF547810.1 | M. avium subsp. silvaticum (ATCC 49884)
AY859679.1 | M. barrassiae (CIP 108545)
AY943195.1 | M. boenickei (CIP 107829)
AF547811.1 | M. bohemicum (CIP 105811)
AY859675.1 | M. bolletii (CIP 108541)
AF547812.1 | M. botniense (DSM 44537)
CP009449.1 | M. bovis (ATCC BAA935)
AF547815.1 | M. branderi (CIP 104592)
JF491333.1 | M. brisbanense (DSM 44680)
AF547816.1 | M. brumae (CIP 103465)
JX154104.1 | M. canariasense (strain InDRE 904DT)
AF547884.1 | M. caprae (CIP 105776)
AF547817.1 | M. celatum (CIP 106109)
AF547818.1 | M. chelonae (CIP 104535)
CP015278.1 | M. chimaera (DSM 44623)
AF547819.1 | M. chitae (CIP 105383)
FJ172327.1 | M. chlorophenolicum (ATCC 25793)
AF547821.1 | M. chubuense (CIP 106810)
JX154106.1 | M. colombiense (strain InDRE 9m)
AY859678.1 | M. conceptionense (CIP 108544)
AF547822.1 | M. confluentis (CIP 105510)
AF547823.1 | M. conspicuum (CIP 105165)
AF547824.1 | M. cookii (CIP 105396)
DQ124111.1 | M. cosmeticum (DSM 44829)
AF547825.1 | M. diernhoferi (CIP 105384)
AF547826.1 | M. doricum (DSM 44339)
AF547827.1 | M. duvalii (CIP 104539)
AF547828.1 | M. elephantis (CIP 106831)
JX154109.1 | M. engbaekii (strain InDRE Chiapas1942)
AF547829.1 | M. fallax (CIP 81.39)
AF547830.1 | M. farcinogenes (DSM 43637)
AF547831.1 | M. flavescens (CIP 104533)
DQ350162.1 | M. florentinum (DSM 44852)
CP014258.1 | M. fortuitum subsp. fortuitum (DSM 46621)
AF547834.1 | M. frederiksbergense (DSM 44346)
DQ184963.1 | M. frederiksbergense (strain OA128Y)
AF547835.1 | M. gadium (CIP 105388)
AF547836.1 | M. gastri (CIP 104530)
AF547837.1 | M. genavense (DSM 44424)
AF547838.1 | M. gilvum (DSM 44503)
AF547839.1 | M. goodii (CIP 106349)
AF547840.1 | M. gordonae (CIP 104529)
EF601222.1 | M. gordonae (isolate 3599)
FJ643458.1 | M. gordonae (isolate MGO-b)
AM398480.1 | M. gordonae (strain 28887)
KF432745.1 | M. gordonae (strain GS10011)
JX154108.1 | M. gordonae (strain InDRE Chiapas 1137)
DQ350160.1 | M. hackensackense (DSM 44833)
GQ245967.1 | M. haemophilum (ATCC 29548)
AF547842.1 | M. hassiacum (CIP 105218)
AF547843.2 | M. heckeshornense (DSM 44428)
AF547844.1 | M. heidelbergense (CIP 105424)
AY438083.1 | M. hiberniae (DSM 44241)
AY438084.1 | M. holsaticum (DSM 44478)
AY458077.1 | M. houstonense (ATCC 49403)
AF547846.1 | M. interjectum (DSM 44064)
AF547847.1 | M. intermedium (CIP 104542)
DQ284774.1 | M. intracellulare (ATCC 13950)
AF547848.1 | M. intracellulare (CIP 104243)
KF432780.1 | M. intracellulare (strain HN11072)
CP006835.1 | M. kansasii (ATCC 12478)
AF547849.1 | M. kansasii (CIP 104589)
AB232365.1 | M. kansasii (strain KM-7)
AY438649.1 | M. komossense (CIP 105293)
HG673725.1 | M. koreense (DSM 45576T)
AF547850.1 | M. kubicae (CIP 106428)
AB239920.1 | M. kumamotonense (strain CST7247)
AB370171.1 | M. kyorinense (strain KUM 60204)
HM030495.1 | M. lacticola (ATCC 9626)
AY438090.1 | M. lacus (DSM 44577)
AF547851.1 | M. lentiflavum (CIP 105465)
FM211192.1 | M. leprae (strain Br4923)
AY550232.1 | M. lepraemurium (strain variant TS130)
AM421340.1 | M. llatzerense (strain MG12)
KP676901.1 | M. lutetiense (strain 24)
AF547852.1 | M. madagascariense (CIP 104538)
AF547853.1 | M. mageritense (CIP 104973)
DQ184958.1 | M. mageritense (strain O237W)
AF547854.1 | M. malmoense (CIP 105775)
FJ232523.1 | M. mantenii (strain NLA000401474)
AF456470.1 | M. marinum (ATCC 927)
AF547856.1 | M. microti (CIP 104256)
KU361326.1 | M. monacense (strain L152Z&Z)
AY943204.1 | M. montefiorense (DSM 44602)
KP676916.1 | M. montmartrense (strain 28)
AY859680.1 | M. moriokaense (CIP 105393)
AF547858.1 | M. mucogenicum (CIP 105223)
AF547859.1 | M. murale (CIP 105980)
GQ153294.1 | M. nebraskense (ATCC BAA837)
AF547860.1 | M. neoaurum (CIP 105387)
AY458076.1 | M. neworleansense (ATCC 49404)
AF547861.1 | M. nonchromogenicum (DSM 44164)
EU600390.1 | M. noviomagense (strain NLA000500338)
AF547862.1 | M. novocastrense (CIP 105546)
AF547863.1 | M. obuense (CIP 106803)
AY943200.1 | M. palustre (DSM 44572)
AF547864.1 | M. parafortuitum (CIP 106802)
AY943201.1 | M. parascrofulaceum (CIP 108112)
AY337276.1 | M. parascrofulaceum (strain U02532)
HM602042.1 | M. paraseoulense (DSM 45000)
JX154107.1 | M. parmense (strain InDRE 1121)
AF547865.1 | M. peregrinum (CIP 105382)
AF547866.1 | M. phlei (CIP 105389)
AY859676.1 | M. phocaicum (CIP 108542)
AY496138.1 | M. porcinum (ATCC BAA328)
AF547868.1 | M. poriferae (CIP 105394)
HM602035.1 | M. psychrotolerans (DSM 44697)
AF547869.1 | M. pulveris (CIP 106804)
JF510463.1 | M. pyrenivorans (DSM 44605)
AF547870.1 | M. rhodesiae (CIP 106806)
EU921671.1 | M. riyadhense (strain NLA000201958)
LT629971.1 | M. rutilum (DSM 45405)
DQ866777.1 | M. salmoniphilum (ATCC 13758)
JF491331.1 | M. saskatchewanense (DSM 44616)
AF547871.1 | M. scrofulaceum (CIP 105416)
KC010486.1 | M. sediminis (strain YIM M13028)
AY684045.1 | M. senegalense (ATCC 35796)
AY684046.1 | M. senegalense (ATCC BAA849)
FJ268582.1 | M. senuense (DSM 44999)
JF491322.1 | M. seoulense (DSM 44998)
JF491329.1 | M. septicum (DSM 44393)
EU371505.1 | M. setense (CIP 109395)
HG673723.1 | M. sherrisii (DSM 45441)
AF547874.1 | M. shimoidei (DSM 44152)
AF547875.1 | M. simiae (CIP 104531)
AF547876.1 | M. smegmatis (CIP 104444)
HM755949.1 | M. sp. (ATCC BAA2131)
AF547877.1 | M. sphagni (DSM 44076)
AF434731.1 | M. szulgai (ATCC 35799)
JF491308.1 | M. szulgai (ATCC 35799)
AF547879.1 | M. terrae (CIP 104321)
GQ478699.1 | M. terrae (strain P51)
AF547880.1 | M. thermoresistibile (CIP 105390)
AF547881.1 | M. tokaiense (CIP 106807)
GQ153291.1 | M. triplex (ATCC 700071)
AF547883.1 | M. triviale (DSM 44153)
AP012340.1 | M. tuberculosis (strain Erdman ATCC 35801)
AF547887.1 | M. tusciae (CIP 106367)
AP017635.1 | M. ulcerans subsp. shinshuense (ATCC 33728)
AF547889.1 | M. vaccae (CIP 105934)
AY438091.1 | M. vanbaalenii (strain PYR-1)
EU834056.1 | M. vulneris (strain NLA009601918 variant II)
AF547890.1 | M. wolinskyi (CIP 106348)
AF547891.1 | M. xenopi (CIP 104035)

*ATCC: American Type Culture Collection, Manassas, Virginia, United States; CIP: Collection of Institut Pasteur, Paris, France; DSM: German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany; JCM: Japan Collection of Microorganisms, Koyadai, Tsukuba, Ibaraki, Japan

Table E.5: Taxonomic classification of hsp65 95% operational taxonomic units (OTUs)

OTU | Naïve Bayesian classifier: taxon | Naïve Bayesian classifier: confidence | BLASTn web utility (highest scoring hits): taxon (accession no.) | Identity | Coverage
1 | M. gordonae | 1.000 | M. gordonae GS10011 (KF432745.1) | 99.8% | 100%
2 | Mycobacterium | 1.000 | M. sp. 05FL-43-159LA (EU619898.1) | 96.3% | 100%
3 | Mycobacterium | 1.000 | M. paraterrae 05-2522 (EU919228.1) | 97.6% | 95.3%
4 | Mycobacterium | 1.000 | M. sp. LTG 466 (KY853653.1); M. manitobense DSM 44615 (DQ350158.1) | 93.8% | 100%
5 | Mycobacterium | 1.000 | M. lentiflavum CIP 105465 (AF547851.1) | 96.3% | 100%
6 | Mycobacterium | 1.000 | M. sp. LTG 466 (KY853653.1) | 92.3% | 100%
7 | Mycobacterium | 1.000 | M. sp. FI-13041 (KJ957808.1) | 91.3% | 99.5%*
8 | Mycobacterium | 1.000 | M. parascrofulaceum CIP 108112 (AY943201.1); M. gordonae SN 601 (AF434735.1)† | 93.5% | 100%
9 | Mycobacterium | 1.000 | M. sp. 05FL-43-159LA (EU619898.1) | 95.3% | 100%
10 | Mycobacterium | 1.000 | M. sp. FL04-5-253F (EU619883.1) | 95.0% | 100%
11 | Mycobacterium | 1.000 | M. manitobense DSM 44615 (DQ350158.1) | 95.8% | 100%
12 | Mycobacterium | 1.000 | M. sp. FL04-5-253F (EU619883.1) | 95.8% | 100%
13 | Mycobacterium | 1.000 | M. sp. E1455 (AY379077.1) | 93.3% | 100%
14 | Mycobacterium | 1.000 | M. mageritense 04K678 (EU732652.1) | 96.8% | 100%
15 | Mycobacterium | 1.000 | M. stomatepiae KKA 2409 (JX091372.1) | 95.2% | 100%

* 4 gap characters
† Identical sequence to M. parascrofulaceum ATCC BAA614 (GQ153295.1)
