University of Tennessee, Knoxville TRACE: Tennessee Research and Creative Exchange

Masters Theses Graduate School

8-2013

Factors Influencing Diarrheal Pathogen Presence in Tubewells of Bangladesh

Kati Anne Ayers [email protected]

Follow this and additional works at: https://trace.tennessee.edu/utk_gradthes

Part of the Environmental Microbiology and Microbial Ecology Commons

Recommended Citation Ayers, Kati Anne, "Factors Influencing Diarrheal Pathogen Presence in Tubewells of Bangladesh." Master's Thesis, University of Tennessee, 2013. https://trace.tennessee.edu/utk_gradthes/2386

This Thesis is brought to you for free and open access by the Graduate School at TRACE: Tennessee Research and Creative Exchange. It has been accepted for inclusion in Masters Theses by an authorized administrator of TRACE: Tennessee Research and Creative Exchange. For more information, please contact [email protected].

To the Graduate Council:

I am submitting herewith a thesis written by Kati Anne Ayers entitled "Factors Influencing Diarrheal Pathogen Presence in Tubewells of Bangladesh." I have examined the final electronic copy of this thesis for form and content and recommend that it be accepted in partial fulfillment of the requirements for the degree of Master of Science, with a major in Geology.

Larry D. McKay, Major Professor

We have read this thesis and recommend its acceptance:

Alice C. Layton, Annette S. Engel

Accepted for the Council: Carolyn R. Hodges

Vice Provost and Dean of the Graduate School

(Original signatures are on file with official student records.)

FACTORS INFLUENCING DIARRHEAL PATHOGEN PRESENCE IN TUBEWELLS OF BANGLADESH

A Thesis Presented for the Master of Science Degree The University of Tennessee, Knoxville

Kati Anne Ayers August 2013

Copyright © 2013 by Kati Anne Ayers

All Rights Reserved


For my husband, Tyler Ayers, for his continued support, patience, and encouragement.

Also for my grandfather, Henry T. Abell. May he celebrate with me in spirit.


ACKNOWLEDGEMENTS

I would like to thank Drs. Larry McKay and Alice Layton, my thesis co-advisors, for their patience and guidance in training me to think and write like a scientist. I would also like to thank my third committee member, Dr. Annette Engel, for her expertise and for challenging me and helping to keep me on track. Special thanks also to Dan Williams, Abby Smartt, Andrew Ferguson, Brian Mailloux, Peter Knappett, and Alexander van Geen for field and lab support.

This project was partially supported by NIH/FIC R01 TW008066.


ABSTRACT

Diarrheal disease pathogens remain a major concern in developing countries as is the leading cause of hospitalization of young children worldwide. A recent study has shown shallow groundwater in rural Bangladesh to be contaminated with bacterial and viral pathogens, but found no correlation between rotavirus and any fecal indicator or environmental parameter during the monsoon season of July 2009. The objectives of this thesis were to examine the non-relationship between pathogens and fecal indicators, as well as to improve the understanding of the seasonal transport of viral pathogens, especially rotavirus, in shallow, sandy aquifers of Bangladesh. This was achieved by comparing the previously published July 2009 data to new measurements of samples collected from the same tubewells in March 2009 and January 2011, as well as surface water data collected during the same months. Quantitative PCR (qPCR) was used to measure the concentrations of several pathogens and fecal indicators, including E. coli, Shigella, Bacteroides, rotavirus, and adenovirus in a total of 25 surface water and 145 well water samples. was also measured, but only in March 2009 well water samples. These measurements were used to compare pathogen presence to molecular and environmental fecal indicators, including pH, seasonality, and Cl/Br ratios. The viral community was also characterized using metagenomic sequence analysis. In both surface and ground water samples, the frequency of occurrence and mean log concentration of the viral pathogens (primarily rotavirus) exceeded those of the bacterial pathogens, with mean log concentrations being at least ten-fold higher in most cases. Rotavirus G12 showed no statistically significant differences between the months sampled and was not predicted by the presence of culturable E. coli, total coliform, or molecular E. coli. Rotavirus concentration in groundwater was also not correlated to temperature, dissolved oxygen (DO), or oxidation reduction potential (ORP), which were the only other environmental parameters found to have significant differences between the months sampled. This study indicates that E. coli, temperature, DO, and ORP are not useful indicators of rotavirus presence or concentration in the shallow aquifers of rural Bangladesh.

1. INTRODUCTION

2. MATERIALS AND METHODS

2.1 Study Site

2.2 Field Measurements and Sampling

2.3 Extraction of DNA and RNA from filters

2.4 Single-stranded cDNA synthesis

2.5 Pathogen Detection

2.6 Statistical Analysis

2.7 Double-stranded cDNA Synthesis

2.8 Metagenomic Sequencing and Analysis

3. RESULTS

3.1 Concentrations of fecal indicators and pathogens in surface and groundwater

3.3 Analysis of fecal indicators and pathogens in well water by month sampled

3.4 Analysis of environmental parameters in well water by month sampled

3.5 Analysis of fecal indicators and environmental parameters in select wells

3.6 Correlation of rotavirus with significant parameters

3.7 Comparison of rotavirus to fecal indicators

3.8 Metagenomic analysis of surface and groundwater samples

4. DISCUSSION AND CONCLUSIONS

4.1 The Purposes of this Study

4.2 Surface to groundwater transmission pathway

4.3 Rotavirus relationship to fecal indicating parameters

4.4 Rotavirus relationship to environmental parameters

4.5 Viral pathogen presence in surface and groundwater virome

4.6 Conclusions

LIST OF REFERENCES

APPENDIX

VITA


LIST OF TABLES

TABLE 1. MICROBIAL AND ENVIRONMENTAL PARAMETERS

TABLE 2. GENE TARGETS AND PRIMER AND PROBE SEQUENCES

TABLE 3. REDUCTION OF FECAL INDICATORS AND PATHOGENS

TABLE 4. STATISTICAL VALUES OF GROUNDWATER PARAMETERS

TABLE A 1. PHYSICAL CHARACTERISTICS OF POND AND CANAL SAMPLES

TABLE A 2. CONCENTRATIONS OF MOLECULAR INDICATOR AND PATHOGENICITY GENES IN SURFACE WATER

TABLE A 3. CONCENTRATIONS OF MOLECULAR INDICATORS AND PATHOGENICITY GENES IN GROUNDWATER FROM ALL TUBEWELLS

TABLE A 4. CONCENTRATION OF MOLECULAR INDICATORS AND PATHOGENICITY GENES IN GROUNDWATER FROM ALL TUBEWELLS

TABLE A 5. CHEMICAL COMPOSITION OF GROUNDWATER FROM ALL TUBEWELLS

TABLE A 6. CONTINUED CHEMICAL COMPOSITION OF GROUNDWATER FROM ALL TUBEWELLS

TABLE A 7. CONCENTRATIONS OF MOLECULAR INDICATOR AND PATHOGENICITY GENES IN GROUNDWATER FROM 28 SELECTED TUBEWELLS

TABLE A 8. CHEMICAL COMPOSITION OF GROUNDWATER FROM 28 SELECTED TUBEWELLS

TABLE A 9. PATHOGEN AND FECAL INDICATOR PREVALENCE AND ABUNDANCE

TABLE A 10. PHAGE SEQUENCE ABUNDANCES IN THE GROUNDWATER VIROME USING MG-RAST

TABLE A 11. VIRAL SEQUENCE ABUNDANCES IN THE GROUNDWATER VIROME USING MG-RAST

*Tables beginning with “A” are in the appendix.

LIST OF FIGURES

FIGURE 1. FLOW CHART OF STATISTICAL ANALYSES

FIGURE 2. SURFACE WATER FECAL INDICATOR AND PATHOGEN CONCENTRATIONS

FIGURE 3. GROUNDWATER FECAL INDICATOR AND PATHOGEN CONCENTRATIONS

FIGURE 4. GROUNDWATER CULTURABLE E. COLI AND TOTAL COLIFORM CONCENTRATIONS AS DETERMINED BY COLILERT TESTING

FIGURE 5. SCATTER PLOT OF REDUCTION IN THE CONCENTRATIONS OF FECAL INDICATORS AND PATHOGENS

FIGURE 6. COMPARISON OF CULTURABLE E. COLI IN ALL WELLS BY MONTH

FIGURE 7. COMPARISON OF TOTAL COLIFORM IN ALL WELLS BY MONTH

FIGURE 8. COMPARISON OF ROTAVIRUS IN ALL WELLS BY MONTH

FIGURE 9. CONTINGENCY ANALYSIS OF ROTAVIRUS IN SELECT WELLS BY MONTH

FIGURE 10. COMPARISON OF ADENOVIRUS IN ALL WELLS BY MONTH

FIGURE 11. COMPARISON OF TEMPERATURE IN ALL WELLS BY MONTH

FIGURE 12. COMPARISON OF PH IN ALL WELLS BY MONTH

FIGURE 13. COMPARISON OF DISSOLVED OXYGEN IN ALL WELLS BY MONTH

FIGURE 14. COMPARISON OF REDUCTION POTENTIAL IN ALL WELLS BY MONTH

FIGURE 15. COMPARISON OF TOTAL COLIFORM IN SELECT WELLS (N=28) BY MONTH

FIGURE 16. COMPARISON OF ROTAVIRUS IN SELECT WELLS (N=28) BY MONTH

FIGURE 17. COMPARISON OF ADENOVIRUS IN SELECT WELLS (N=28) BY MONTH

FIGURE 18. COMPARISON OF TEMPERATURE IN SELECT WELLS (N=28) BY MONTH

FIGURE 19. COMPARISON OF DISSOLVED OXYGEN IN SELECT WELLS (N=28) BY MONTH

FIGURE 20. COMPARISON OF REDUCTION POTENTIAL IN SELECT WELLS (N=28) BY MONTH

FIGURE 21. ROTAVIRUS CONTINGENCY ANALYSIS VERSUS E. COLI IN 28 SELECTED WELLS

FIGURE 22. ROTAVIRUS CONTINGENCY ANALYSIS VERSUS TOTAL COLIFORM IN 28 SELECTED WELLS

FIGURE 23. TAXONOMIC COMPOSITION OF SURFACE WATER C13

FIGURE 24. TAXONOMIC COMPOSITION OF IN SURFACE WATER SAMPLE C13

FIGURE 25. TAXONOMIC COMPOSITION OF GROUNDWATER 21754

FIGURE 26. TAXONOMIC COMPOSITION OF HERPESVIRALES IN GROUNDWATER SAMPLE 21754

FIGURE 27. RECRUITMENT HISTOGRAM OF ROTAVIRUS

FIGURE 28. RECRUITMENT HISTOGRAM OF PROCHLOROCOCCUS PHAGE P-SSM7

FIGURE 29. SEQUENCE ALIGNMENT OF PROCHLOROCOCCUS PHAGE PRESENT IN A SURFACE WATER SAMPLE (C1-3) AND A GROUNDWATER SAMPLE (21754)

FIGURE A 1. COMPARISON OF MOLECULAR E. COLI IN ALL WELLS BY MONTH

FIGURE A 2. COMPARISON OF ALLBACTEROIDES IN ALL WELLS BY MONTH

FIGURE A 3. COMPARISON OF VIBRIO IN ALL WELLS BY MONTH

FIGURE A 4. COMPARISON OF SHIGELLA IN ALL WELLS BY MONTH

FIGURE A 5. COMPARISON OF CL/BR RATIOS IN ALL WELLS BY MONTH

FIGURE A 6. COMPARISON OF NITRATE CONCENTRATIONS IN ALL WELLS BY MONTH

FIGURE A 7. COMPARISON OF PHOSPHATE CONCENTRATIONS IN ALL WELLS BY MONTH

FIGURE A 8. COMPARISON OF SULFATE CONCENTRATIONS IN ALL WELLS BY MONTH

FIGURE A 9. COMPARISON OF CULTURABLE E. COLI IN 28 SELECT WELLS BY MONTH

FIGURE A 10. COMPARISON OF PH IN SELECT WELLS BY MONTH

FIGURE A 11. LINEAR REGRESSION ANALYSIS OF ROTAVIRUS CONCENTRATION WITH CL/BR RATIOS FOR MARCH AND JULY

FIGURE A 12. LINEAR REGRESSION ANALYSIS OF ROTAVIRUS CONCENTRATION WITH TEMPERATURE VARIATION

FIGURE A 13. LINEAR REGRESSION ANALYSIS OF ROTAVIRUS CONCENTRATION WITH ORP

FIGURE A 14. LINEAR REGRESSION ANALYSIS OF ROTAVIRUS CONCENTRATION WITH DO

FIGURE A 15. LINEAR REGRESSION ANALYSIS OF ROTAVIRUS CONCENTRATION WITH TOTAL COLIFORM CONCENTRATION

FIGURE A 17. PROCHLOROCOCCUS PHAGE P-SSM7 SEQUENCES FOUND IN (A) THE SURFACE WATER SAMPLE (C1-3) AND (B) THE GROUNDWATER SAMPLE (21754)

*Figures beginning with “A” are in the appendix.

1. INTRODUCTION

Waterborne diarrheal disease pathogens are a major water quality concern in rural Bangladesh because of monsoonal flooding, poor sanitary conditions of the region, and reliance on untreated well water for drinking, among other factors. Diarrheal disease pathogens can be transmitted by multiple routes, including direct person-to-person and waterborne transmission (1, 2). Several pathogens are of concern for their contribution to devastating diarrheal diseases, including Shigella, Vibrio, enterotoxigenic Escherichia coli (ETEC), adenovirus, norovirus, and especially rotavirus, which remains the leading cause of severe diarrheal disease in infants and young children under five worldwide (3). Numerous shallow tubewells have been installed in Bangladesh since the 1970s and have provided an improved water source relative to the contaminated surface water that was previously used for drinking (4). These wells were generally thought not to require any further treatment of the groundwater due to natural filtration and attenuation of contaminants in the sediments (5). However, recent research (5) finds this not to be the case, as many human pathogens, such as Shigella, Vibrio, ETEC, rotavirus, and adenovirus, were still detected in groundwater samples from the shallow tubewells in July 2009, with rotavirus being the dominant pathogen detected. Although the pathogens of concern differ between countries, viral transmission from surface water to groundwater aquifers and groundwater drinking wells seems to exist (6). Several factors could influence the presence of these pathogens, such as population density, poor sanitation, monsoonal climate, or hydrogeological conditions (7).

According to the World Health Organization (WHO), culture-based tests for total coliform and culturable E. coli are the typical methods of testing drinking water for potential pathogen contamination (8, 9). However, in recent years the ability of these fecal indicator bacteria (FIB) to predict pathogen presence has been under scrutiny due to poor correlation with pathogens caused by factors such as non-fecal sources of E. coli, strain diversity, and differences in transport characteristics (5, 10). As a result of these shortcomings, other microorganisms are being explored for their abilities to indicate fecal contamination and for use as alternatives to the current FIBs. Among these are Bacteroides, enterococci, bacteriophages, Clostridium perfringens, adenoviruses, and Torque teno virus (TTV) (5, 11). The monitoring of drinking water for actual pathogens and FIBs has expanded greatly over the past twenty years because of the development of rapid molecular-based detection methods, such as quantitative PCR (qPCR) (12, 13). However, these methods are more expensive than culture-based total coliform and E. coli methods and require substantial technical equipment and expertise. As a result, there is still a need to better understand the relationship between FIBs, which are generally not pathogens, and pathogens.

Environmental parameters that are indicative of fecal contamination can also act as a useful supplement to microbial monitoring methods. Parameters such as Cl/Br ratios and nitrate are inexpensive to measure and could prove useful for this purpose. Cl/Br ratios in the 150-540 (14) or 300-600 (15) range have been suggested as a tracer for sewage contamination because both chloride and bromide occur in sewage and are stable in water-rock interactions, making determination of anthropogenic sources possible. Nitrate is also cited as being high in areas of agricultural use due to fertilizer or in urban residential areas due to sewage, though it is not specified as an indicator of fecal pathogen contamination (15, 16) and, in contrast, was found to have a negative correlation to rotavirus by Ferguson et al. (5). A complete listing of environmental parameters considered to indicate fecal contamination of groundwater can be found in Table 1.
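The Cl/Br screening idea described above can be illustrated with a short sketch. The two ranges are those quoted in the text (14, 15); treating them as mass ratios of concentrations in mg/L, and the function names and example concentrations, are assumptions made only for illustration and are not taken from the thesis.

```python
# Illustrative sketch only. Ranges are those cited in the text (refs. 14 and 15);
# interpreting them as mass ratios of mg/L concentrations is an assumption.
SEWAGE_RANGES = {"ref_14": (150.0, 540.0), "ref_15": (300.0, 600.0)}


def cl_br_ratio(chloride_mg_per_l, bromide_mg_per_l):
    """Return the Cl/Br ratio, or None when bromide is zero or missing."""
    if not bromide_mg_per_l or bromide_mg_per_l <= 0:
        return None
    return chloride_mg_per_l / bromide_mg_per_l


def sewage_flags(ratio):
    """For each cited range, report whether the ratio falls inside it."""
    if ratio is None:
        return {name: False for name in SEWAGE_RANGES}
    return {name: low <= ratio <= high
            for name, (low, high) in SEWAGE_RANGES.items()}


# Hypothetical well: 45 mg/L chloride and 0.12 mg/L bromide.
ratio = cl_br_ratio(45.0, 0.12)
print(round(ratio, 1), sewage_flags(ratio))
```

A ratio inside one range but outside the other (for example, 200) shows why the choice of reference range matters when using this tracer.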


Bacteria inherently possess different transport characteristics than viruses, and therefore are unable to provide a complete prediction of enteric pathogen presence or absence (5, 17). Thus, a better understanding of the attachment and/or transport of these pathogens in the environment is needed. A recent field study in Bangladesh (6) illustrates that seasonality plays a role in E. coli concentrations in groundwater, with a peak during the early monsoon season and lower detection during the dry season. Though it cannot be assumed that viruses will travel in the aquifer in the same manner as bacteria, due to their smaller size and unique cell properties (18), rotavirus concentrations may also display a seasonal pattern. This idea is supported by recent studies but with conflicting results. Specifically, one study reports a year-round prevalence of rotavirus-induced disease in diarrheal patients with two seasonal peaks, one in the winter and one in the monsoon season (19). Another report recognizes two seasonal peaks but claims no distinct seasonal pattern in its study area of Bangladesh (20). Though water contamination is an important pathway of pathogen transmission, it is not the only explanation for the spread of these diseases. Person-to-person contact through the fecal-oral route could also be responsible for the peaks seen in rotavirus disease (21).

A study by Pieper et al. (22) has also shown pH to influence virus transport. Every particle has an isoelectric point (pHiep), the pH at which the particle carries no net electric charge. As the pH of the surrounding environment approaches a particle's pHiep, a positively charged particle, for example, becomes less positive and eventually reverses charge once the pH passes the pHiep. The pHiep of ferric oxyhydroxide (FeOOH) is 7-8, and, according to the Pieper et al. study, an increase in groundwater pH approaching 7 or 8 is expected to increase virus transport in the environment by decreasing the positive charge of the FeOOH and reducing the attraction between FeOOH and the virus particles (22). Similarly, for rotavirus, with a pHiep of ~4.5 (18), its charge will become less negative as the pH of the surrounding groundwater decreases and approaches 4.5. Elevated concentrations of phosphate and sulfate are also able to increase virus transport by binding to hematite suspensions and causing charge reversal (22, 23). Other environmental parameters found in literature to influence virus transport are listed in Table 1. The study by Ferguson et al. (5) showed no statistical correlation between rotavirus and the fecal indicators culturable E. coli, molecular E. coli (mE. coli), total coliform, or F+RNA. Similarly, the only significant correlation between rotavirus and any environmental parameter was a negative correlation to nitrate levels.
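The isoelectric-point reasoning above can be sketched as a simple sign comparison. This is a deliberately crude illustration: it tracks only the sign of the net charge, not its magnitude, and the FeOOH value of 7.5 is an assumed midpoint of the 7-8 range given in the text. The function names are hypothetical.

```python
def charge_sign(ph, ph_iep):
    """Sign of net surface charge: +1 below the isoelectric point, -1 above, 0 at it."""
    if ph < ph_iep:
        return 1
    if ph > ph_iep:
        return -1
    return 0


def opposite_charges(ph, virus_iep=4.5, mineral_iep=7.5):
    """True when virus and mineral carry opposite net charges (electrostatic attraction).

    virus_iep=4.5 is the rotavirus value quoted in the text; mineral_iep=7.5 is an
    assumed midpoint of the 7-8 range quoted for FeOOH.
    """
    return charge_sign(ph, virus_iep) * charge_sign(ph, mineral_iep) < 0


for ph in (4.0, 6.5, 8.5):
    print(ph, opposite_charges(ph))
```

Under these assumptions, attraction is expected only at pH values between the two isoelectric points, which is consistent with the expectation that raising groundwater pH toward 7-8 weakens retention of virus particles on FeOOH.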

The primary purposes of this thesis were to examine the non-relationships between rotavirus and FIBs found by Ferguson et al. and to improve the understanding of seasonal transport of viral pathogens, especially rotavirus, in the shallow, sandy aquifers of rural Bangladesh by comparing the previously published July 2009 data (5) to data collected from the same tubewells in March 2009 and January 2011, as well as surface water data collected during the same months. The diarrheal pathogens Shigella, Vibrio, adenovirus, and rotavirus were detected in the groundwater samples from these dates, and their concentrations were compared to those detected in surface water to investigate the possibility of a surface to groundwater transmission pathway. Molecular techniques, such as qPCR, were used to compare rotavirus presence in groundwater to that of mE. coli, Bacteroides, and adenovirus, as well as Cl/Br ratios and nitrate concentrations. These parameters were examined for their suitability as alternative indicators to the currently used culture-based methods for pathogens in rural Bangladesh. Because the fecal indicators typically accepted by WHO are bacteria and their transport characteristics likely differ from those of viruses (5, 9), rotavirus presence and concentration were also compared to the environmental parameters of pH, sulfate, phosphate, and month sampled, which are known to influence virus transport.


Lastly, metagenomic sequencing was performed to characterize the viral community and to identify any unknown pathogens present in the surface and well waters.


Table 1: Microbial and environmental parameters. Listed are those identified as possible indicators of fecal contamination or virus transport. Parameters from this list that were measured in the field or laboratory are marked in the Measured column.

Parameter; Common Reporting Units; Measured; Used for Statistical Analysis; Viral Transport; Fecal Contamination; Ref.
E. coli (23s) Log copies/ml X X X (11)
ETEC E. coli Log copies/ml X
Bacteroides Log copies/ml X X
E. coli (colilert) MPN/100ml X X X
Total Coliform MPN/100ml X X X (11)
Rotavirus Log copies/ml X X
Enterococci Log copies/ml X (11)
Bacteriophages Log copies/ml X (11)
C. perfringens Log copies/ml X (11)
Adenoviruses Log copies/ml X X X (11)
Torque teno virus Log copies/ml X (11)
Shigella Log copies/ml X
Vibrio Log copies/ml X
Norovirus Log copies/ml X
Temperature °C X X X (24)
Organic matter X (22)
pH X X X (22)
Bivalent cations mg/L X (22)
Ionic Strength mg/L X (22)
DO mg/L X X (14)
Cl- mg/L X X (14)
Br- mg/L X X
Cl/Br - X X X (14)
Depth (m) m X
Redox (Eh) V X
Arsenic mg/L X
Fluoride mg/L
Nitrate mg/L X X (25)
Potassium mg/L X (25)
Nitrite mg/L X (14)
Sulfate mg/L X (22)
Phosphate mg/L X X X (22, 26)
Salinity mg/L
Ammonia mg/L X (14)
Ammonium mg/L X (26)
Radium mg/L X (26)
*Time (Seasonal) month X X X (6)

*Seasonal time represents infiltration rate by known seasonal trend.


2. MATERIALS AND METHODS

2.1 Study Site

The ponds, canals, and tubewells used for sampling are located in the village of Bara Haldia (23.370°N; 90.646°E) in Matlab upazilla of Bangladesh. The ponds and canals are thought to receive discharge from latrines due to unsanitary septic system infrastructure (27). Notable physical characteristics of the sampled surface waters are listed in Table A1 in the appendix.

Bangladesh has a typical monsoonal climate; January is the coldest, driest month, with temperatures around 26°C and less than one centimeter of average rainfall. Both temperatures and rainfall gradually increase until the monsoon season begins in May or June, and both decrease at the end of the monsoon season, around September, until the end of the year (6, 28). Sampling was arranged so that two collections represented the dry season and one represented the monsoon season. Because Bangladesh is low-lying and located at the intersection of the Ganges, Brahmaputra, and Meghna rivers, the entire region floods during the monsoon season, allowing the rapid buildup of young, unconsolidated sediment (29).

The villages in the study area are underlain by a thick sequence of alternating layers of fine to medium grained sand (aquifers) and silt or clay layers (aquitards) (5). These layers range in thickness from a few meters to a few tens of meters.

2.2 Field Measurements and Sampling

Sample collection was performed by Barnard College, NY, and Columbia University, NY, in March 2009, July 2009, and January 2011 and, along with tubewell construction, is further detailed by Ferguson et al. (5). A total of 60 shallow household tubewells, ranging from 7-37 m (23-121 ft) in depth, 12 ponds, and 3 canals were sampled for notable diarrheal disease-causing pathogens. The tubewells were generally constructed out of 2.5 inch diameter PVC pipe with a 5 foot screen and installed by the hand-flapper method (5). The bottom portion of the well annulus was filled with coarse sand, then packed with sediment removed during drilling, and sometimes capped using clay. Private household wells are rarely installed with a grout seal, but some households install a concrete and brick platform around their tubewell (5). Two well volumes were purged from the tubewells prior to groundwater collection. Groundwater samples were collected for analysis of chloride, bromide, nitrate, sulfate, and phosphate by ion chromatography (Dionex, Sunnyvale, CA). These samples were shipped and stored at -20°C until analysis could be completed. Measurements of pH, ORP, and temperature (YSI, Yellow Springs, Ohio) were also collected on site. Duplicate surface and groundwater samples were also collected and analyzed for total coliform and culturable E. coli using the Most Probable Number (MPN) based Colilert™ assay (IDEXX Laboratories, Inc, Westbrook, ME). Samples were considered negative for E. coli if either replicate measured <1 MPN/100 ml and were labeled 0.5 MPN/100 ml for statistical analysis.
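The non-detect rule stated above can be written as a small helper. The detection threshold (1 MPN/100 ml) and the substituted value (0.5 MPN/100 ml) come from the text; how positive duplicates were combined is not stated, so averaging them here is purely an assumption, and the function name is hypothetical.

```python
DETECTION_LIMIT = 1.0    # MPN/100 ml, from the text
NON_DETECT_VALUE = 0.5   # MPN/100 ml, value assigned to negative samples in the text


def ecoli_for_statistics(replicate_a, replicate_b):
    """Return one MPN/100 ml value per sample from two Colilert replicates."""
    # Rule from the text: a sample is negative if either replicate is below 1 MPN/100 ml.
    if replicate_a < DETECTION_LIMIT or replicate_b < DETECTION_LIMIT:
        return NON_DETECT_VALUE
    # Assumption (not stated in the text): positive samples use the replicate mean.
    return (replicate_a + replicate_b) / 2.0


print(ecoli_for_statistics(0.0, 3.1))  # one replicate below the limit
print(ecoli_for_statistics(2.0, 4.0))  # both replicates positive
```

Substituting half the detection limit keeps negative samples in the data set when values are later log-transformed, since the logarithm of zero is undefined.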

For molecular analysis of bacterial and viral pathogens, 2-8 L of groundwater and 0.2 L of surface water were filtered in duplicate through 0.22 µm nitrocellulose filters (Millipore Corporation, Billerica, MA or Corning, Corning, NY). One of the duplicate filters was used for DNA analysis and the other was used for RNA analysis. Water samples used for RNA analysis were acidified before filtration to pH 2.5-3 with HCl (1:1 v/v) and amended with MgCl2 to a final concentration of 1 mM per Sinton et al. (30). The filters were removed from the plastic and placed in sterile petri dishes that were then wrapped in parafilm. The samples were immediately placed on dry ice and transported at -20°C to Barnard College and stored at -80°C until shipped at -20°C to the University of Tennessee for molecular DNA and RNA analysis.


2.3 Extraction of DNA and RNA from filters

Groundwater and surface water filter samples were used for DNA and RNA extractions. Using sterile laboratory procedures, ¼ to ½ of each filter was sliced into ~1 mm squares and placed into Lysing Matrix E tubes (FastDNA Spin Kit for Soil or FastRNA Pro Soil-Direct Kit, MP Biomedicals, LLC, Solon, OH). For DNA extractions, 978µL kit-provided Sodium Phosphate Buffer and 122µL kit-provided MT Buffer were added to the Lysing Matrix E tubes containing the filter pieces. The tubes were homogenized in the FastPrep® Instrument for 40 sec at a speed setting of 6.0. The tubes were then centrifuged at 14,000xg for 10 min to pellet debris. The supernatant was transferred to a clean 2mL microcentrifuge tube, 250µL kit-provided PPS was added, and the tube was inverted 10 times by hand. The samples were then centrifuged at 14,000xg for 5 min to pellet precipitate, and the supernatant was transferred to a clean 15mL tube along with 1mL resuspended kit-provided Binding Matrix. The tubes were placed on a rotator for 2 min to allow binding of the DNA and then allowed to sit in a rack for 3 min to settle. 500µL of the supernatant was removed before transferring 600µL of the mixture to a SPIN™ Filter and centrifuging at 14,000xg for 1 min. The flow-through was discarded from the SPIN™ Filter, and the remaining binding matrix mixture was added and centrifuged as before. 500µL of kit-provided SEW-M was added, and the samples were centrifuged at 14,000xg for 1 min, discarding the flow-through. The samples were then centrifuged a second time at 14,000xg for 2 min to dry the matrix. The filters were transferred to a clean 2mL microcentrifuge tube and allowed to air dry for 5 min at room temperature, and the binding matrix was gently resuspended in 50µL kit-provided DES. Samples were centrifuged at 14,000xg for 1 min to elute DNA. Another 50µL kit-provided DES was added to each tube and centrifuged as before to yield a final volume of 100µL DNA.
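The quantities given so far (2-8 L filtered, ¼ to ½ of each filter extracted, 100 µL final eluate) are enough to sketch how a gene concentration measured in the extract could be related back to the original water sample. The formula below is a generic back-calculation that assumes complete recovery at every step; it is not taken from the thesis, and the function name and example values are hypothetical.

```python
import math


def copies_per_ml_water(copies_per_ul_extract, elution_volume_ul=100.0,
                        filter_fraction=0.5, volume_filtered_l=2.0):
    """Back-calculate gene copies per mL of water from copies per µL of extract.

    Assumes 100% recovery during filtration and extraction (a simplification).
    """
    total_copies_in_extract = copies_per_ul_extract * elution_volume_ul
    # Only a fraction of the filter was extracted, so scale up to the whole filter.
    copies_on_whole_filter = total_copies_in_extract / filter_fraction
    return copies_on_whole_filter / (volume_filtered_l * 1000.0)


# Hypothetical sample: 10 copies/µL in the eluate, half a filter, 2 L filtered.
value = copies_per_ml_water(10.0)
print(value, math.log10(value))
```

Because the filtered volume ranged from 2 to 8 L and the extracted filter fraction from ¼ to ½, the same extract concentration can correspond to water concentrations that differ several-fold, so both values need to be recorded per sample.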


For RNA extractions, 1mL kit-provided RNApro™ Soil Lysis Solution was added to the Lysing Matrix E tubes containing the filter pieces, and the tube was inverted. The tube was then homogenized in the FastPrep® Instrument for 40 sec at a speed setting of 6.0 and centrifuged at 14,000xg for 5 min at room temperature. The supernatant was transferred to a clean 2mL microcentrifuge tube, and 750µL kit-provided phenol:chloroform (1:1) was added. The samples were vortexed for 10 sec and incubated at room temperature for 5 min before centrifuging them at 14,000xg for 5 min at 4°C. The upper aqueous phase was then transferred to a clean 2mL microcentrifuge tube along with 200µL kit-provided Inhibitor Removal Solution, and the tubes were inverted by hand. The tubes were then centrifuged at 14,000xg for 5 min at room temperature to pellet precipitate, and the supernatant was transferred to a clean 2mL microcentrifuge tube along with 660µL cold 100% isopropanol. The tubes were inverted by hand 5 times and placed at -20°C for at least 30 min. The samples were then centrifuged at 14,000xg for 15 min at 4°C, the supernatant was discarded, and the pellet was washed in 500µL cold 70% ethanol. The ethanol was then removed, and the pellets were allowed to air dry for 5 min at room temperature before resuspending in 200µL kit-provided DEPC water and adding 600µL kit-provided RNAMATRIX® Binding Solution and 10µL kit-provided RNAMATRIX® Slurry. The samples were then placed on a rotator for 5 min and microcentrifuged for approximately 10 sec. The supernatant was discarded, and the RNA was resuspended in 500µL kit-provided RNAMATRIX® Wash Solution. The samples were microcentrifuged again for 10 sec, and the supernatant was discarded before microcentrifuging a second time for 10 sec; the samples were then allowed to air dry for 5 min at room temperature. 50µL kit-provided DEPC water was added to each tube, vortexed, and incubated at room temperature for 5 min. The samples were microcentrifuged for 10 sec, and the supernatant was transferred to a clean RNA-free microcentrifuge tube. Another 50µL kit-provided DEPC water was added to each tube, vortexed, incubated for 5 min, and microcentrifuged for 10 sec to achieve a final volume of 100µL RNA.

Prior to qPCR, extracted DNA/RNA was quantified using a NanoDrop Spectrophotometer (ThermoScientific, Wilmington, DE) and diluted to obtain a concentration of ~5 ng/µL. If OD 260/280 readings of DNA extractions were less than 1.0, they were cleaned using the QIAquick® PCR Purification Kit (Qiagen, Valencia, CA) to minimize PCR inhibition. 250µL kit-provided Buffer PB was added to 50µL PCR reaction. The sample was then added to a QIAquick column and centrifuged for 60 sec to bind the DNA. The flow-through was discarded, and 0.75mL kit-provided Buffer PE was added to the same QIAquick column and centrifuged for 60 sec. The flow-through was again discarded before the QIAquick column was centrifuged once more for 1 min to remove residual wash buffer. The DNA was eluted into a clean 1.5mL microcentrifuge tube by adding 50µL kit-provided Buffer EB to the QIAquick column and centrifuging for 1 min. The DNA extractions were stored at -20°C.
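The dilution to ~5 ng/µL mentioned above follows from C1·V1 = C2·V2. The sketch below is illustrative only: the 5 ng/µL target is from the text, while the 50 µL final volume, the function name, and the example stock concentration are assumptions.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul=5.0, final_volume_ul=50.0):
    """Return (extract_ul, diluent_ul) needed to reach the target concentration.

    Uses C1 * V1 = C2 * V2. The 50 µL final volume is an assumed example value.
    """
    if stock_ng_per_ul <= target_ng_per_ul:
        # Extract is already at or below the target, so it is used undiluted.
        return final_volume_ul, 0.0
    extract_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return extract_ul, final_volume_ul - extract_ul


# Hypothetical NanoDrop reading of 40 ng/µL.
print(dilution_volumes(40.0))
```

The dilution factor (stock concentration divided by target concentration) would also need to be carried through when converting qPCR results back to the concentration in the undiluted extract.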

RNA extractions from the March sampling period were also cleaned using RNA Clean & Concentrator™-5 (Zymo Research, Irvine, CA). 50µL kit-provided RNA Binding Buffer was mixed with 50µL 95% ethanol. 100µL of this mixture was mixed with 50µL of extracted sample RNA. The mixture was transferred to the Zymo-Spin™ IC Column in a collection tube and centrifuged at 12,000xg for 1 min. After discarding the flow-through, 400µL kit-provided RNA Prep Buffer was added to the column and centrifuged at 12,000xg for 1 min, and the flow-through was discarded. 800µL kit-provided RNA Wash Buffer was then added to the column and centrifuged at 12,000xg for 30 sec. The flow-through was discarded, and the wash step was repeated with 400µL kit-provided RNA Wash Buffer. The column was centrifuged in a new collection tube at 12,000xg for 2 min. The column was then transferred to an RNase-free tube, and 50µL kit-provided DNase/RNase-Free water was added and allowed to sit for 1 min at room temperature. The column was centrifuged at 10,000xg for 30 sec to elute RNA. The RNA extractions were stored at -80°C.

2.4 Single-stranded cDNA synthesis

The extracted RNA from groundwater samples was converted to single-stranded cDNA, along with a control blank of kit provided DEPC-treated water, using two different methods: a modified SuperScript III First-Strand Synthesis SuperMix for qRT-PCR protocol (Invitrogen, Carlsbad, CA) and the GoScript™ Reverse Transcription System (Promega, Madison, WI). For the first method, 8µL diluted RNA was added to 10µL kit provided 2X RT reaction mix and 2µL kit provided RT enzyme on ice in 0.2mL RNase-free PCR tubes (Ambion-Life Technologies, Carlsbad, CA). The tubes were then incubated at 25°C for 10 min, 50°C for 30 min, 85°C for 5 min, and cooled to 4°C using a PTC-225 Peltier Thermal Cycler Gradient Cycler (MJ Research). The tubes were put on ice, and 1µL kit provided E. coli RNase H was added to each. Then, the tubes were incubated at 37°C for 20 min, and the resulting cDNA samples were diluted 1:10 for use with qPCR. For the second method, 4µL diluted RNA was added to 1µL kit provided random primers, heated to 70°C for 5 min, and cooled to 4°C for 5 min using a C1000 Touch™ Thermal Cycler (Bio-Rad, Hercules, CA). Samples were then stored on ice until the reverse transcription mix was added. The reverse transcription mix was prepared by combining the following kit provided components in order: 2.1µL nuclease-free water, 4µL GoScript™ 5X reaction buffer, 6.4µL MgCl₂, 1µL PCR nucleotide mix, 0.5µL recombinant RNasin® ribonuclease inhibitor, and 1µL GoScript™ reverse transcriptase. 15µL reverse transcription mix was added to each tube, and the tubes were heated to 25°C for 5 min, 42°C for 60 min, and cooled to 4°C. The single-stranded cDNA was stored at -20°C until used for qPCR.
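As a quick arithmetic check on the second method, the listed component volumes can be summed to confirm that they account for the 15µL of reverse transcription mix added to each tube. The sketch below is illustrative only: the dictionary keys and variable names are labels chosen here and are not part of the kit protocol.

```python
# Component volumes (µL) of the reverse transcription mix, as listed in the text.
rt_mix_ul = {
    "nuclease-free water": 2.1,
    "5X reaction buffer": 4.0,
    "MgCl2": 6.4,
    "PCR nucleotide mix": 1.0,
    "RNasin ribonuclease inhibitor": 0.5,
    "reverse transcriptase": 1.0,
}

# Total mix volume per tube, rounded to avoid floating-point noise.
mix_total_ul = round(sum(rt_mix_ul.values()), 2)

# Each tube already holds 4 µL diluted RNA plus 1 µL random primers.
template_ul = 4.0 + 1.0
final_reaction_ul = round(mix_total_ul + template_ul, 2)

print(f"mix per tube: {mix_total_ul} µL")
print(f"final reaction volume: {final_reaction_ul} µL")
```

The same dictionary can be multiplied by the number of tubes (plus a small overage, if desired) when a master mix is prepared.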


2.5 Pathogen Detection

Quantitative PCR was used to measure copies of genes/μL in the filtered samples of both surface and groundwater for the following pathogens and FIB: mE. coli, Bacteroides, ETEC, Vibrio, Shigella, adenovirus, rotavirus, and norovirus. 2.5µL of each DNA/cDNA sample was loaded in triplicate onto a 96 well plate, with the third well also containing a plasmid DNA 10⁵ spike to determine PCR inhibition. A blank negative control sample was also prepared with 2.5µL filtered HPLC water and run with each set of samples. A rotavirus standard was also made from a synthesized 200 nucleotide oligomer based on VP6 from X94617 contained in a plasmid (GenScript, Piscataway, NJ) and cloned into PCR4-TOPO cloning vector (5). This standard was also run on every 96 well plate in a dilution series from 10⁷ to 10¹. Each well also contained a quantitative PCR amplification reaction consisting of 12.5µL 2X concentrated ABsolute Blue qPCR Mix (Thermo Fisher Scientific, Waltham, MA), 0.75µL each of the appropriate forward and reverse primers (Table 2) at 20μM stock concentration, and 0.5µL of the appropriate probe (Table 2). The remainder of each 25µL total reaction volume was obtained by adding 0.2µL filtered HPLC water (Fisher Scientific, Pittsburgh, PA). The qPCR conditions consisted of 50°C for 2 min, followed by 95°C for 15 min, and up to 50 cycles of 95°C for 30 sec and 60°C (or 55°C for E. coli) for 45 sec. The measured concentrations were converted to copies of genes/100mL for comparison with culturable E. coli and total coliform measurements. Samples were considered negative for each molecular marker if at least one of the duplicate samples measured <1 copy per PCR reaction, and such samples were labeled 10 copies/100mL for statistical analysis.
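Quantification against a dilution series such as the 10⁷ to 10¹ rotavirus standard is commonly done by fitting a standard curve of threshold cycle (Ct) against log10 copies. The following is a minimal sketch of that general approach together with the non-detect rule described above. The Ct values are hypothetical placeholders and not data from this study, the function names are illustrative, and the conversion from copies per reaction to copies/100mL is left as an input because it depends on sample volumes not given in this section.

```python
import math

def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(ct_values) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copies per reaction."""
    return 10 ** ((ct - intercept) / slope)

def amplification_efficiency(slope):
    """Efficiency implied by the slope (1.0 corresponds to perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0

def apply_non_detect_rule(duplicate_copies_per_reaction, copies_per_100ml, nd_value=10.0):
    """Label a sample nd_value if at least one duplicate is <1 copy per reaction."""
    if any(c < 1.0 for c in duplicate_copies_per_reaction):
        return nd_value
    return copies_per_100ml

# Hypothetical, perfectly linear Ct values for the 10^7 .. 10^1 dilution series.
standard_copies = [10 ** k for k in range(7, 0, -1)]
standard_ct = [38.0 - 3.3 * math.log10(c) for c in standard_copies]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
estimated_copies = copies_from_ct(28.1, slope, intercept)
efficiency = amplification_efficiency(slope)
```

In practice the Ct values come from the instrument software, and the fitted slope and intercept differ from plate to plate.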

2.6 Statistical Analysis

Data from fecal indicator bacteria and bacterial and viral pathogens from all surface and groundwater samples collected from all three months were organized in Microsoft Office Excel 2007. Concentration and percent presence (calculated as number of wells contaminated divided by total wells) of each of these organisms were statistically analyzed with JMP® Pro 9.0.3 (SAS Institute Inc., Cary, NC) along with the measurements of temperature, pH, ORP, DO, Cl/Br ratios, and concentrations of nitrate, phosphate, and sulfate. Pearson test statistics from contingency tables were used to analyze differences in the number of contaminated wells by month sampled; the F-values from ANOVA were used to analyze differences in the concentrations of contamination or measurements of environmental parameters; and the R² and p-values from linear regression were used to determine any relationship between rotavirus concentrations and other parameters varying significantly by month sampled. Molecular and environmental parameters were first analyzed as a population of all measured wells in order to determine variance between months sampled. However, not all wells were measured for every parameter analyzed in this study, so those parameters found to differ significantly between months sampled in all wells were also analyzed in a smaller dataset of 28 wells with measurements for all parameters. Parameters having a significant difference between months in the 28 selected wells were further analyzed for correlation with rotavirus concentration using regression analysis. All graphs were created using SigmaPlot for Windows Version 11.0 (Systat Software, Inc.). A flow chart of this statistical elimination process is illustrated in Figure 1. Along with these statistical analyses, metagenomic analysis was performed with Metavir to characterize the viral community found in the surface and groundwater samples.
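For readers unfamiliar with the two test statistics named above, the sketch below computes a one-way ANOVA F-ratio and a Pearson chi-square statistic for a contingency table from first principles. It is not the JMP procedure used in this study: the input numbers are hypothetical, the function names are illustrative, and p-values and the Tukey-Kramer HSD comparison are omitted because they require distribution functions outside the Python standard library.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples, one per group."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

def pearson_chi_square(table):
    """Return (chi-square, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof

# Hypothetical log concentrations for three sampling months.
january = [1.0, 2.0, 3.0]
march = [2.0, 3.0, 4.0]
july = [5.0, 6.0, 7.0]
f_ratio, df_b, df_w = one_way_anova_f([january, march, july])

# Hypothetical counts: rows = marker absent/present, columns = two months.
chi2, dof = pearson_chi_square([[10, 20], [20, 10]])
```

With real data, each group would hold the log-transformed concentrations for one sampling month, and each contingency table would hold counts of wells with and without the marker.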

2.7 Double-stranded cDNA Synthesis

Following qPCR quantification, one surface (Cl-3) and one groundwater (21754) sample were chosen from March 2009 sampling for metagenomic sequencing due to their high quality of RNA and consistently high pathogen concentrations. The single-stranded cDNA of each of these

Table 2. Gene targets and primer and probe sequences. Listed are those used for quantitative PCR assays to target specific pathogens in surface and groundwater samples.

| Assay Organism (Relevance) | Gene Target (Assay type and annealing temperature) | Primer/Probe Name and Sequence | Ref. |
|---|---|---|---|
| Molecular E. coli (Fecal Indicator) | EC23S (Fluorogenic Probe, 55°C) | EC23Sf, 5'-GAGCCTGAATCAGTGTGTGTG-3'; EC23Sr, 5'-ATTTTTGTGTACGGGGCTGT-3'; EC23Srv1bhq, 5'-(FAM)CGCCTTTCCAGACGCTTCCAC(BHQ-1)-3' | (31) |
| ETEC E. coli (Enterotoxigenic E. coli strains) | eltA (Heat labile toxin LT) (Fluorogenic Probe, 60°C) | Elt311f, 5'-TCTGAATATAGCTCCGGCAGA-3'; Elt414r, 5'-CAACCTTGTGGTGCATGATGA-3'; Elt383Taqr, 5'-TTCTCTCCAAGCTTGGTGATCCGGT-3' | (5) |
| All Bacteroides (Fecal Indicator) | AllBac (Fluorogenic Probe, 60°C) | AllBac296f, 5'-GAGAGGAAGGTCCCCCAC-3'; AllBac412r, 5'-CGCTACTTGGCTGGTTCAG-3'; AllBac375Bhqr, 5'-(FAM)CCATTGACCAATATTCCTCACTGCTGCCT(BHQ-1)-3' | (32) |
| Vibrio | ompW (SYBR green, 60°C) | OmpWf, 5'-CACCAAGAAGGTGACTTTATTGTG-3'; OmpWr, 5'-GGTTTGTCGAATTAGCTTCACC-3' | (33) |
| Shigella and EIEC E. coli, EPEC (Shigella and multiple types of E. coli pathogens) | stx1 or vtx1 (shiga toxin 1) (Fluorogenic Probe, 60°C) | IbStx1m64f, 5'-GCAAAGACGTATGTAGATTCGCT-3'; IbStx1m209r, 5'-ATCTATCCCTCTGACATCAACTG-3'; IbStx1mTaq119f, 5'-ATGTCATTCGCTCTGCAATAGGTACTC-3' | (5) |
| Adenovirus | 40/41 hexon gene (Fluorogenic Probe, 60°C) | AV40/41-117f, 5'-CAGCCTGGGGAACAATTCAG-3'; AV40/41-258r, 5'-CAGCGTAAAGCGCACTTTGTAA-3'; AV40/41-157BHQ, 5'-(FAM)ACCCACGATGTAACCACAGACAGGTC(BHQ-1)-3' | (34) |
| Rotavirus cDNA | G12VP6 (Fluorogenic Probe, 60°C) | G12RV23f, 5'-GGAGGTTCTGTATTCATTGTCA-3'; G12RV176r, 5'-CCAATTCCTCCAGTTTGAAAG-3'; G12RV55fTaq, 5'-AAAGATGCTAGGGATAAGATTGTTGAAGGTAC-3' | (5) |
| Norovirus RNA | HuNV G1 (Fluorogenic Probe, 56°C) | COG1f, 5'-CGYTGGATGCGNTTYCATGA-3'; COG1r, 5'-CCTTAGACGGATCATCATTYAC-3'; RING1(a)-TP, 5'-FAM-AGATYGCGATCYCCTGTCCA-TAMRA-3'; RING1(b)-TP, 5'-FAM-AGATCGCGGTCTCCTGTCCA-TAMRA-3' | (35) |
| | HuNV G2 (Fluorogenic Probe, 56°C) | COG2f, 5'-CARGARBCNATGTTYAGRTGGATGAG-3'; COG2r, 5'-TCGACGCCATCTTCATTCACA-3'; RING2-TP, 5'-FAM-TGGGAGGGCGATCGCAATCT-TAMRA-3' | |

[Figure 1: flow chart. All data were split into environmental data and molecular data, and each set was analyzed by ANOVA and the Tukey-Kramer HSD test. Parameters with no significant difference by month: graph placed in appendix. Parameters with a significant difference by month: analyzed again in the 28 wells with all parameters by ANOVA and the Tukey-Kramer HSD test. No significant difference by month in the 28 wells: graph placed in appendix. Significant difference by month in the 28 wells: regression and contingency analysis of rotavirus and the significant parameters. Significant relationship = contribution to rotavirus presence; no significant relationship = graph placed in appendix.]

Figure 1. Flow chart of statistical analyses. The steps shown were used to eliminate parameters that were not found to have statistically significant differences between months sampled and to compare those with statistically significant differences to rotavirus concentrations.
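The elimination process in Figure 1 can be read as a short chain of decisions on p-values. The sketch below is one hypothetical way to express it. The significance threshold of 0.05 is an assumption made here (it is not stated in this section), the function name and return strings are illustrative, and the example p-values are those reported in Table 4.

```python
ALPHA = 0.05  # assumed significance threshold; not stated in the text

def classify_parameter(p_all_wells, p_select_wells=1.0, p_regression=1.0, alpha=ALPHA):
    """Follow the Figure 1 flow chart for one parameter.

    A default of 1.0 stands for "not tested", which is treated as not significant.
    """
    if p_all_wells >= alpha:
        return "appendix: no significant difference by month in all wells"
    if p_select_wells >= alpha:
        return "appendix: no significant difference by month in the 28 selected wells"
    if p_regression >= alpha:
        return "appendix: no significant relationship to rotavirus"
    return "contribution to rotavirus presence"

# p-values as reported in Table 4.
culturable_e_coli = classify_parameter(0.002, 0.4)
total_coliform = classify_parameter(0.001, 0.03, 0.5)
molecular_e_coli = classify_parameter(0.08)

for name, outcome in [("culturable E. coli", culturable_e_coli),
                      ("total coliform", total_coliform),
                      ("mE. coli", molecular_e_coli)]:
    print(f"{name}: {outcome}")
```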

samples was converted into double-stranded cDNA using the SuperScript Double-Stranded cDNA Synthesis Kit (Invitrogen, Carlsbad, CA). 91μL kit provided DEPC-treated water, 30μL kit provided 5X Second-Strand Reaction Buffer, 3μL kit provided 10mM dNTP mix, 1μL kit provided E. coli DNA ligase (10U/μL), and 4μL kit provided E. coli DNA Polymerase were added on ice to 21μL single-stranded cDNA and incubated in a PTC-225 Peltier Thermal Cycler Gradient Cycler (MJ Research) at 16°C for 2 hr. 2μL kit provided T4 DNA Polymerase was then added and the samples were incubated again at 16°C for 5 min. The samples were put on ice, and 10μL kit provided 0.5M EDTA was added. The samples were then cleaned using Agencourt AMPure XP - PCR Purification (Beckman Coulter, Inc, Indianapolis, IN). The samples were quantified using a NanoDrop Spectrophotometer (ThermoScientific, Wilmington, DE) and qPCR, following the above protocol for the RVG12 assay (Table 2). The samples were also run on a 1.5% agarose gel for verification of RVG12 presence. The double-stranded cDNA was then amplified using the illustra™ GenomiPhi™ V2 DNA Amplification Kit (GE Healthcare, Buckinghamshire, UK) with 10ng cDNA. 2μL double-stranded DNA was mixed with 8μL sample buffer provided by the kit. This mixture was heated to 95°C for 3 min and cooled to 4°C on ice. 9μL kit provided reaction buffer and 1μL kit provided enzyme mix were added to each reaction mix on ice and incubated at 30°C for 3 hr, 65°C for 10 min, and cooled to 4°C. Amplified samples were quantified using a NanoDrop Spectrophotometer (ThermoScientific, Wilmington, DE) and diluted to 5ng/µL before being run on the qPCR RVG12 assay to measure concentrations against a known positive sample.

2.8 Metagenomic Sequencing and Analysis

A library of the amplified double-stranded cDNA was prepared for sequencing following the manufacturer protocol using the Rapid Library Preparation Method (Roche, Mannheim, Germany). Briefly, the libraries were loaded onto an RNA 6000 PicoChip following manufacturer protocol (Agilent Technologies, Santa Clara, CA) for verification of sample base pair size, and the DNA was quantified using a Quant-iT™ PicoGreen® dsDNA Assay Kit (Invitrogen, Carlsbad, CA). Once appropriate amounts of double-stranded DNA were determined, the libraries were amplified using the emPCR Amplification Method (Roche, Mannheim, Germany) and sequenced using the GS FLX+ instrument and XL+ kit Sequencing Method (Roche, Mannheim, Germany). 454 sequencing was performed at Oak Ridge National Laboratory.

The fasta sequence output files of C1-3 (surface water) and 21754 (groundwater) were uploaded into Metavir (36) online to filter out bacterial and unknown virus sequences and process against known virus for characterization of the viral community and identification of viral pathogens, such as Torque Teno virus and rotavirus. The fasta sequence files were also uploaded into MG-RAST (37) to identify any virome overlap between the sequences in the surface and groundwater samples.


3. RESULTS

3.1 Concentrations of fecal indicators and pathogens in surface and groundwater

Fecal indicator bacteria and bacterial and viral pathogens were found in surface and groundwater samples collected in all three sampling months. The fecal indicator Bacteroides was present in the highest concentrations of detected parameters and was found in 100% of both surface and groundwater samples (Figures 2 & 3). Culturable E. coli was present in 20, 23, and 47% of wells in January, March, and July, respectively, with concentrations ranging from below the detection limit of 0.5 MPN/100mL to the too numerous to count (TNTC) value of 3000 MPN/100mL (Figure 4). Total coliform concentrations also ranged from below the detection limit of 0.5 MPN/100mL to the TNTC value of 3000 MPN/100mL, and total coliforms were present in 57, 65, and 94% of wells in January, March, and July, respectively (Figure 4). Like culturable E. coli, total coliforms were most prevalent in the monsoon season month of July, as determined by percent presence. Viral pathogens were not only found in both surface water and groundwater samples, but were also found in higher concentrations than bacterial pathogens in both types of samples, though not always with statistical significance. Rotavirus was present in 40% of surface water samples in January and 100% in both March and July, with concentrations ranging from below the detection limit of 10 gene copies/100mL to 4.2 × 10⁶ gene copies/100mL. Rotavirus was also present in 10, 20, and 27% of wells in January, March, and July, respectively, with concentrations ranging from below the detection limit of 10 gene copies/100mL to 1.4 × 10⁵ gene copies/100mL (Figures 2 & 3). Adenovirus was present in 80, 0, and 7% of surface water samples in January, March, and July, respectively, with concentrations ranging from below the detection limit of 10 gene copies/100mL to 1.6 × 10⁴ gene copies/100mL. Adenovirus was also only present in 20% of the wells during January and 0% of wells in both

[Figure 2: bar graph. x-axis: Month (January, March, July); y-axis: log Concentration (gene copies/100mL); series: mE. coli, Bacteroides, Rotavirus, Adenovirus, Shigella + ETEC E. coli, Vibrio. Markers that were BDL in a given month are labeled on the plot.]

Figure 2. Surface water fecal indicator and pathogen concentrations. The concentrations were determined by quantitative PCR assays. The bar graph was generated from arithmetic mean concentrations of each marker for each month sampled from 12 ponds and 3 canals. The table with the raw values is in the appendix (Table A9). BDL = all samples were below detection limit for that month sampled. Shigella and ETEC E. coli were BDL in January and adenovirus was BDL in March.

[Figure 3: bar graph. x-axis: Month (January, March, July); y-axis: log Concentration (gene copies/100mL); series: mE. coli, Bacteroides, Rotavirus, Adenovirus, Shigella + ETEC E. coli, Vibrio. Markers that were BDL in a given month are labeled on the plot.]

Figure 3. Groundwater fecal indicator and pathogen concentrations. The concentrations were determined by quantitative PCR assays. The bar graph was generated from arithmetic mean concentrations of each marker for each month sampled from up to 60 shallow tubewells. The table with the raw values is in the appendix (Table A9). BDL = all samples below detection limit for that month sampled. Adenovirus was BDL in March and July, and Shigella and ETEC E. coli were BDL in March.

[Figure 4: bar graph. x-axis: Month (January, March, July); y-axis: log Concentration (MPN/100mL); series: E. coli, Total Coliform; dashed line at ND = 0.25.]

Figure 4. Groundwater culturable E. coli and total coliform concentrations as determined by Colilert testing. The bar graphs were generated from geometric mean concentrations of each marker for each month sampled from up to 60 shallow tubewells. The table with the raw values is in the appendix (Table A8). The horizontal dashed line reflects the non-detect (ND) value of 0.25 MPN/100mL.

March and July, with concentrations ranging from below the detection limit of 10 gene copies/100mL to 2.2 × 10² gene copies/100mL. Bacterial pathogens were all below the detection limit in surface water samples in January and were present in <33% of samples in March and <14% in July. Shigella was also absent from surface water samples in March (Figure 2). ETEC E. coli was below the detection limit of 1 gene copy/PCR reaction in all wells for all months, and Shigella was below the detection limit in all wells in March. Shigella and Vibrio were each present in <9% of wells in all three months (Figure 3). The arithmetic mean concentrations and percent of wells contaminated with each of these fecal indicators and pathogens are also listed in Table A9 in the appendix.

Reduction values of the bacterial fecal indicators and viral and bacterial pathogens from surface water to groundwater were calculated for all three months by subtracting the log concentration of the FIB/pathogen in the groundwater from its log concentration in the surface water, which is equivalent to taking the log of the ratio of the two concentrations (for example, mE. coli in January: log(70462.12/52.95) = 3.12; Table 3). The resulting values represent the amount that each FIB or pathogen was reduced in concentration from the surface waters to the groundwater, as shown in Figure 5 and Table 3. The highest log reduction was seen in the fecal indicators, mE. coli and Bacteroides, with a range for mE. coli of 0.80-3.44 and an arithmetic mean of 2.45 across all three months, and a range for Bacteroides of 2.81-3.56 and an arithmetic mean of 3.28. The viral pathogens, rotavirus and adenovirus, had higher log reductions than the bacterial pathogens, Shigella, ETEC E. coli, and Vibrio. However, the viral pathogens had higher concentrations in both the surface and groundwater than the bacterial pathogens, which were only present in surface water at concentrations at or below the detection limit, making accurate reduction calculations difficult.
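The log reduction values reported in Table 3 can be reproduced as the base-10 logarithm of the ratio of surface water to groundwater concentration. The sketch below does this for mE. coli and adenovirus using the Table 3 concentrations. Two points are assumptions inferred from the reported numbers rather than stated procedure: concentrations below the detection limit (listed as 0.00) are replaced by the detection limit of 10 gene copies/100mL, and no value is returned when both concentrations are below the detection limit. Function and variable names are illustrative.

```python
import math
from statistics import mean, stdev

DETECTION_LIMIT = 10.0  # gene copies/100mL (Table 3 footnote)

def log_reduction(surface, groundwater, detection_limit=DETECTION_LIMIT):
    """log10(surface / groundwater), with BDL values set to the detection limit.

    Returns None when both concentrations are below the detection limit
    (the "nc" entries in Table 3).
    """
    if surface < detection_limit and groundwater < detection_limit:
        return None
    s = max(surface, detection_limit)
    g = max(groundwater, detection_limit)
    return math.log10(s) - math.log10(g)

# Table 3 concentrations (gene copies/100mL) for January, March, July.
me_coli_surface = [70462.12, 720.52, 381231.28]
me_coli_ground = [52.95, 115.07, 138.07]
adeno_surface = [941.71, 0.00, 15.43]
adeno_ground = [14.03, 0.00, 0.00]

me_coli_lr = [log_reduction(s, g) for s, g in zip(me_coli_surface, me_coli_ground)]
adeno_lr = [log_reduction(s, g) for s, g in zip(adeno_surface, adeno_ground)]

me_coli_mean = mean(me_coli_lr)
me_coli_sd = stdev(me_coli_lr)  # sample standard deviation

print([round(v, 2) for v in me_coli_lr], round(me_coli_mean, 2), round(me_coli_sd, 2))
```

A negative value, as for rotavirus in January in Table 3, means the mean groundwater concentration exceeded the mean surface water concentration.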

3.3 Analysis of fecal indicators and pathogens in well water by month sampled

The molecular data from all wells showed some significant differences in the fecal indicators, culturable E. coli and total coliform, as well as rotavirus and adenovirus, between the months

[Figure 5: scatter plot. x-axis: log Mean of Positives in Groundwater (1 to 8); y-axis: log Mean of Positives in Surface Water (1 to 8); series: mE. coli, Bacteroides, rotavirus, adenovirus, Shigella, Vibrio; a 1:1 line of no retention is drawn.]

Figure 5. Scatter plot showing reduction in the concentrations of fecal indicators and pathogens between the surface water and groundwater. Bacteria are represented by circles and viruses are represented by diamonds. The arithmetic mean of the log concentrations of fecal indicators and pathogens for each month from surface water were divided by those in groundwater to calculate reduction. The values are reported as log concentrations. Organisms on the 1:1 line would have no apparent retention.

Table 3. Reduction of Fecal Indicators and Pathogens. The geometric mean concentrations measured in surface water and groundwater for each sampling month are provided followed by the log of the reduction in concentrations from surface to groundwater of each pathogen or fecal indicator. Concentrations are in gene copies/100mL.

| Fecal Indicator or Pathogen | Surface Water, January | Surface Water, March | Surface Water, July | Groundwater, January | Groundwater, March | Groundwater, July | Log Reduction, January | Log Reduction, March | Log Reduction, July | Mean | Standard Deviation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| mE. coli | 70462.12 | 720.52 | 381231.28 | 52.95 | 115.07 | 138.07 | 3.12 | 0.80 | 3.44 | 2.45 | 1.44 |
| Bacteroides | 16935741.55 | 3383306.45 | 16505226.63 | 5699.12 | 5218.48 | 4541.35 | 3.47 | 2.81 | 3.56 | 3.28 | 0.41 |
| rotavirus | 153.84 | 8964.11 | 297119.75 | 222.92 | 2878.50 | 3144.92 | -0.16 | 0.49 | 1.98 | 0.77 | 1.09 |
| adenovirus | 941.71 | 0.00 | 15.43 | 14.03 | 0.00 | 0.00 | 1.83 | nc | 0.19 | 0.67 | 1.00 |
| Shigella + ETEC E. coli | 0.00 | 20.62 | 24.32 | 10.28 | 0.00 | 12.18 | -0.01 | 0.31 | 0.30 | 0.20 | 0.18 |
| Vibrio | 0.00 | 67.51 | 13.91 | 10.99 | 12.87 | 10.45 | -0.04 | 0.72 | 0.12 | 0.27 | 0.40 |

* Detection limit = 10 gene copies/100mL
* nc = no value calculated because both the surface water and groundwater concentrations were below the detection limit.

sampled using ANOVA and the Tukey-Kramer HSD test (Table 4, Figures 6-8, 10). Both culturable E. coli and total coliform concentrations were significantly higher in July than in March and January, and March and January concentrations were not significantly different. However, rotavirus concentrations were significantly lower in January than in both March and July and were not significantly different between March and July. By contingency analysis, rotavirus presence in the wells was not significantly different between any of the three months sampled (p = 0.2, Figure 9). Adenovirus concentrations were significantly higher in January and not significantly different between March and July. The fecal indicators, mE. coli and Bacteroides, and the pathogens, Vibrio and Shigella, showed no significant differences between months sampled and were excluded from further statistical analysis (Figures A1-A4).

3.4 Analysis of environmental parameters in well water by month sampled

In the analysis of data from all wells, statistically significant differences by the Tukey-Kramer HSD test were found in the environmental parameters of temperature, pH, DO, and ORP between months sampled (Figures 11-14). Temperature increased by month, as expected based on season, while pH and ORP did not follow a seasonal trend. The March pH was significantly higher than both January and July, and the January pH was also significantly higher than that of July. DO measurements were also significantly higher in March than in July. However, ORP was significantly higher in January than both March and July, and July was significantly higher than March.

No statistically significant differences were found in nitrate, phosphate, sulfate, or Cl/Br ratios between months sampled. Data were data available for concentrations of these parameters for January (Figures A5-A8). Due to the wide range of Cl/Br ratios, regression

Table 4. Statistical values of groundwater parameters. All microbial data were log transformed prior to statistical analysis. Molecular parameters are in log gene copies/100mL. Culturable E. coli and total coliform are in log MPN/100mL. BDL = All samples measured below detectable limit for that parameter.

| Parameter | Mean/Std Dev | Variation in All Wells by Month | Variation in Select Wells by Month | Correlation to rotavirus |
|---|---|---|---|---|
| E. coli (colilert) | -0.21/0.76 | F-ratio: 6.8; p=0.002 | F-ratio: 0.8; p=0.4 | |
| Total Coliform | 0.76/1.20 | F-ratio: 7.0; p=0.001 | F-ratio: 3.8; p=0.03 | R²=0.005; p=0.5 |
| mE. coli (23s) | 1.98/0.87 | F-ratio: 2.6; p=0.08 | | |
| Bacteroides | 3.73/1.02 | F-ratio: 0.05; p=1.0 | | |
| Rotavirus | 1.53/1.03 | F-ratio: 4.3; p=0.02 | F-ratio: 3.8; p=0.03 | |
| Adenovirus | 1.04/0.20 | F-ratio: 9.1; p=0.0002 | F-ratio: 6.3; p=0.003 | Not enough positive samples |
| Norovirus | BDL/0 | | | |
| Shigella | 1.03/0.19 | F-ratio: 2.6; p=0.08 | | |
| ETEC E. coli | BDL/0 | | | |
| Vibrio | 1.06/0.34 | F-ratio: 1.3; p=0.3 | | |
| Temperature | 25.88/0.87°C | F-ratio: 57.4; p<0.0001 | F-ratio: 64.2; p<0.0001 | R²=0.05; p=0.05 |
| pH | 6.9/0.5 | F-ratio: 190.7; p<0.0001 | F-ratio: 2.8; p=0.07 | |
| DO | 0.79/0.15 mg/L | F-ratio: 14.5; p=0.0003 | F-ratio: 6.9; p=0.01 | R²=0.0001; p=0.9 |
| Redox (ORP) | 470.85/1095.40 mV | F-ratio: 141.6; p<0.0001 | F-ratio: 77.9; p<0.0001 | R²=0.02; p=0.2 |
| Cl/Br | 793.97/1861.9 | F-ratio: 2.1; p=0.2 | | |
| Nitrate | 1.20/3.86 mg/L | F-ratio: 1.2; p=0.3 | | |
| Phosphate | 0.34/0.49 mg/L | F-ratio: 0.2; p=0.6 | | |
| Sulfate | 3.73/16.17 mg/L | F-ratio: 0.1; p=0.7 | | |

[Figure 6: box plot. x-axis: Month (January, March, July); y-axis: log Concentration (MPN/100mL); dashed line at ND = 0.25.]

Figure 6. Comparison of culturable E. coli in all wells by month. The ANOVA F value of 6.8 showed significant variance, p = 0.002. In the Tukey-Kramer HSD test, July (n = 50) was significantly higher than both March (n = 57) and January (n = 42), p = 0.004 and p = 0.007, respectively. However, March and January were not significantly different. The horizontal dashed line reflects the non-detect (ND) value of 0.25 MPN/100mL. The points plotted above the whiskers represent outliers in the dataset.

[Figure 7: box plot. x-axis: Month (January, March, July); y-axis: log Concentration (MPN/100mL); dashed line at ND = 0.25.]

Figure 7. Comparison of total coliform in all wells by month. The ANOVA F value of 7.0 showed significant variance, p = 0.001. In the Tukey-Kramer HSD test, July (n = 50) was significantly higher than both March (n = 57) and January (n = 42), p = 0.02 and p = 0.001, respectively. However, March and January were not significantly different. The horizontal dashed line reflects the non-detect (ND) value of 0.25 MPN/100mL. The points plotted above and below the whiskers represent outliers in the dataset.

[Figure 8: box plot. x-axis: Month (January, March, July); y-axis: log Concentration (gene copies/100mL); dashed line at ND = 10.]

Figure 8. Comparison of rotavirus in all wells by month. The ANOVA F value of 4.3 showed significant variance, p = 0.02. In the Tukey-Kramer HSD test, March (n = 44) was significantly higher than January (n = 39), p = 0.02. However, March and July were not significantly different, nor were July and January. The horizontal dashed line reflects the non-detect (ND) value of 10 gene copies/100mL. The points plotted above and below the whiskers represent outliers in the dataset.

[Figure 9: contingency plot. x-axis: Month.]

Figure 9. Contingency analysis of rotavirus in select wells by month. The Pearson test statistic showed that rotavirus presence was not significantly different between months, p = 0.2. 0 = absent, 1 = present.

[Figure 10: box plot. x-axis: Month (January, March, July); y-axis: log Concentration (gene copies/100mL); dashed line at the non-detect (ND) value.]

Figure 10. Comparison of adenovirus in all wells by month. The ANOVA F value of 9.1 showed significant variance, p = 0.0002. In the Tukey-Kramer HSD test, January (n = 42) was significantly higher than both March (n = 54) and July (n = 46), p = 0.0006 and p = 0.001, respectively. All samples in both March and July were below detectable limit, and thus were not significantly different. The horizontal dashed line reflects the non-detect (ND) value of 10 gene copies/100mL. The points plotted above the whiskers represent outliers in the dataset.

[Figure 11: box plot titled "Temperature Variation in All Wells by Month". x-axis: Month (January, March, July); y-axis: Temperature (°C).]

Figure 11. Comparison of temperature in all wells by month. The ANOVA F value of 57.4 showed significant variance, p < 0.0001. In the Tukey-Kramer HSD test, January (n = 50) was significantly lower than both March (n = 50) and July (n = 44), p < 0.001. However, March and July were not significantly different. The points plotted above and below the whiskers represent outliers in the dataset.

[Figure 12: box plot. x-axis: Month (January, March, July); y-axis: pH.]

Figure 12. Comparison of pH in all wells by month. The ANOVA F value of 190.7 showed significant variance, p < 0.0001. In the Tukey-Kramer HSD test, March (n = 50) was significantly higher than both January (n = 49) and July (n = 40), p < 0.0001, and January was significantly higher than July, p < 0.0001. The points plotted above and below the whiskers represent outliers in the dataset.

[Figure 13: box plot titled "Dissolved Oxygen Variation in All Wells by Month". x-axis: Month (March, July); y-axis: DO (mg/L).]

Figure 13. Comparison of dissolved oxygen in all wells by month. The ANOVA F value of 14.5 showed significant variance, p = 0.0003. In the Tukey-Kramer HSD test, March (n = 50) was significantly higher than July (n = 46), p = 0.0001. DO measurements were not available for the January sampling period. The points plotted above and below the whiskers represent outliers in the dataset.

[Figure 14: box plot titled "Variation in Reduction Potential in All Wells by Month". x-axis: Month (January, March, July); y-axis: ORP (mV).]

Figure 14. Comparison of reduction potential in all wells by month. The ANOVA F value of 141.6 showed significant variance, p < 0.0001. In the Tukey-Kramer HSD test, January (n = 42) was significantly higher than both March (n = 50) and July (n = 46), p < 0.001, and July was significantly higher than March, p = 0.005. The points plotted above and below the whiskers represent outliers in the dataset.

analysis was still run between these and rotavirus concentrations for March and July, but no significant correlation was found (R² = 0.005, p = 0.6, Figure A11). Geometric means and standard deviations of all environmental parameters for the tubewell groundwater samples for each month, as well as the measured pathogen and fecal indicator concentrations, are listed in Table 4.

3.5 Analysis of fecal indicators and environmental parameters in select wells

Twenty-eight wells were selected that had complete data for all fecal indicators and environmental parameters. Parameters that showed significant variation between months sampled in the larger data set were also analyzed using analysis of variance by month, followed by the Tukey-Kramer HSD test.

The only fecal indicator that varied significantly in concentration by months sampled in the 28 selected wells was total coliform, which differed between July and January (p = 0.02) but not between March and January (Figure 15). Rotavirus concentrations in the 28 selected wells were also significantly higher in July than January (p = 0.02) with no significant difference between March and January (Figure 16). Adenovirus was significantly higher in January than both March and July (p = 0.008, Figure 17). However, this analysis was based on only 5 positive samples in the selected wells for January, and adenovirus was not detected in any samples during the March or July sampling. Temperature, DO, and ORP were significantly different between months sampled using the 28 well dataset (Figures 18-20). Temperature in well water increased from January to July. January and July samples had oxidizing conditions with geometric mean ORP measurements of 1465mV and 64mV, respectively, and March had reducing conditions with a geometric mean ORP measurement of -269mV. The ORP values in all months were significantly different from

[Figure 15: box plot. x-axis: Month (January, March, July); y-axis: log Concentration (MPN/100mL); dashed line at ND = 0.25.]

Figure 15. Comparison of total coliform in select wells (n=28) by month. The ANOVA F value of 3.8 showed significant variance, p = 0.03. In the Tukey-Kramer HSD test, July was significantly higher than January, p = 0.02. However, January and March were not significantly different. The horizontal dashed line reflects the non-detect (ND) value of 0.25 MPN/100mL. The points plotted above and below the whiskers represent outliers in the dataset.

[Figure 16: box plot. x-axis: Month (January, March, July); y-axis: log Concentration (gene copies/100mL); dashed line at ND = 10.]

Figure 16. Comparison of rotavirus in select wells (n=28) by month. The ANOVA F value of 3.9 showed significant variance, p = 0.03. In the Tukey-Kramer HSD test, July was significantly higher than January, p = 0.02. However, January and March were not significantly different. The horizontal dashed line reflects the non-detect (ND) value of 10 gene copies/100mL. The points plotted above the whiskers represent outliers in the dataset.

[Figure 17: box plot. x-axis: Month (January, March, July); y-axis: log Concentration (gene copies/100mL); dashed line at ND = 10.]

Figure 17. Comparison of adenovirus in select wells (n=28) by month. The ANOVA F value of 6.3 showed significant variance, p = 0.003. In the Tukey-Kramer HSD test, January was significantly higher than both March and July, p = 0.008. All samples in both March and July were below detectable limit, and thus were not significantly different. The horizontal dashed line reflects the non-detect (ND) value of 10 gene copies/100mL. The points plotted above the whiskers represent outliers in the dataset.

one another. The DO measurements were not available from the January sampling time, but March measurements were significantly higher than July (p = 0.006). In most cases, parameters that were significantly different in all wells were also significantly different in the 28 selected wells subset, indicating the smaller dataset represented the larger dataset. The exceptions were rotavirus and ORP, for which the March values differed between the larger dataset and the 28 well subset. The March rotavirus concentration dropped from a median log concentration of ~1.8 gene copies/100mL for all wells to non-detect levels for the 28 selected wells, and the ORP interquartile range (IQR) was expanded in the box and whisker plot of the 28 selected wells.

3.6 Correlation of rotavirus with significant parameters

Rotavirus concentration in groundwater samples was related to month sampled but was not

related to temperature variation (R2 = 0.046, p = 0.0497, Figure 22). The scatter plot showed

a spike in rotavirus concentrations between 26°C and 27°C, but this could be attributed to

the increase seen in rotavirus in July. Although there were some non-detect samples in July,

both January and March had mostly non-detectable rotavirus measurements in the 28

selected wells. Overall, rotavirus concentration was not correlated with temperature (R² =

0.05, p = 0.05, Figure A12), ORP (R² = 0.02, p = 0.2, Figure A13) or DO (R² = 0.0001, p =

0.9, Figure A14). Thus, none of the environmental parameters were related to rotavirus

presence or concentration.
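To make the R² values above concrete: an R² of 0.046 means that temperature accounts for less than 5% of the variance in rotavirus concentration. The following is a minimal sketch of how the slope, intercept, and R² of a simple least-squares regression are obtained. The function name and the paired values are hypothetical and do not come from the study dataset.

```python
def linear_fit_r_squared(x, y):
    """Least-squares line y = slope * x + intercept, plus R squared."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a single predictor, R squared is the squared Pearson correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical temperature (deg C) and log rotavirus concentration pairs.
temperature = [23.2, 24.8, 25.9, 26.4, 26.9, 27.3]
log_rotavirus = [1.0, 1.0, 3.1, 1.0, 4.2, 1.0]
slope, intercept, r2 = linear_fit_r_squared(temperature, log_rotavirus)
print(f"slope = {slope:.2f}, R^2 = {r2:.3f}")
```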

3.7 Comparison of rotavirus to fecal indicators

Pearson test statistics of contingency tables of all wells indicated that rotavirus presence was

independent of both molecular (p=0.4) and culturable E. coli presence (p=0.4, Figure 21), as

well as total coliform presence (p=0.9, Figure 22). In addition, linear regression analysis

[Figure 18 plot: box-and-whisker plot of temperature (°C) by month (January, March, July); y-axis from 22 to 28.]

Figure 18. Comparison of temperature in select wells (n=28) by month. The ANOVA F value of 64.2 showed significant variance, p < 0.0001. In the Tukey-Kramer HSD test, January was significantly lower than both March and July, p <0.0001. However, March and July were not significantly different. The points plotted above and below the whiskers represent outliers in the dataset.

[Figure 19 plot, "Dissolved Oxygen Variation in Select Wells by Month": DO (mg/L) by month (March, July); y-axis from 0.2 to 1.4.]

Figure 19. Comparison of dissolved oxygen in select wells (n=28) by month. The ANOVA F value of 6.9 showed significant variance, p = 0.01. In the Tukey-Kramer HSD test, March was significantly higher than July, p = 0.006. DO measurements were not available for the January sampling period. The points plotted above and below the whiskers represent outliers in the dataset.

[Figure 20 plot, "Reduction Potential Variation in Select Wells by Month": ORP (mV) by month (January, March, July); y-axis from -1000 to 5000.]

Figure 20. Comparison of reduction potential in select wells (n=28) by month. The ANOVA F value of 77.9 showed significant variance, p <0.0001. In the Tukey-Kramer HSD test, January was significantly higher than both March and July, p <0.0001. July was also significantly higher than March, p = 0.03. The points plotted above and below the whiskers represent outliers in the dataset.

between total coliform and rotavirus concentrations showed no significant correlation (R² = 0.005, p = 0.5, Figure A15). Thus, none of the fecal indicators were related to rotavirus presence or concentration.
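The independence tests above compare presence/absence of rotavirus against presence/absence of each indicator in a 2 x 2 contingency table. The sketch below shows one way such a Pearson chi-square statistic and its p-value (one degree of freedom, no continuity correction) could be computed. The function name and the well counts are hypothetical, and the statistical package used for the thesis may apply different options.

```python
import math


def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]].

    Rows: rotavirus present / absent. Columns: indicator present / absent.
    Returns (chi2, p_value) for one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts for 28 wells, for illustration only.
chi2, p = pearson_chi_square_2x2(5, 3, 14, 6)
print(f"chi2 = {chi2:.3f}, p = {p:.2f}")
```

A p-value well above 0.05, as reported for every indicator here, means the data give no evidence that rotavirus presence depends on indicator presence.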

3.8 Metagenomic analysis of surface and groundwater samples

A total of 32,443 sequences were obtained from the surface water sample and 68,044 from the groundwater sample in the virome analysis by Metavir. Of the total sequence reads in the surface water virome, 97% were identified as either dsDNA or ssDNA viruses, 0.3% were ssRNA viruses, and 0.04% were identified as being from rotavirus (Figure 23). The majority of sequences (61%) found in the surface water sample were identified as Caudovirales bacteriophage. Several other sequences similar to phages (including Inovirus and Gokushovirinae) as well as plant viruses (, , and Nanoviridae) were identified in lower percentages (Figure 23). Although Torque Teno virus was expected to be present in the viromes due to its suggested use as a fecal indicator for viruses, it was not found in either the surface water or groundwater sample.

Of the total sequence reads in the groundwater virome, 79% were identified as dsDNA or ssDNA viruses, and 16% were identified as being rotavirus A (Figure 25). Many of the same viruses and phages found in the viromes using Metavir were also found using MG-RAST (Tables A10 & A11). All of the rotavirus sequence reads found in the virome were located around the VP6 gene marker (Figure 27). The presence of a high number of reads located only in the VP6 gene marker led to a suspicion that the March RNA sample may have been contaminated with the cloned VP6 carried in plasmid DNA. Because of this, the March RNA samples were re-extracted and retested for rotavirus. Only the re-extracted data were used in the analysis of this study. Aside from rotavirus, the only other human pathogen

[Figure 21 panel labels: mE. coli; E. coli]

Figure 21. Rotavirus contingency analysis versus E. coli in 28 selected wells. The Pearson test statistic indicates that rotavirus presence was independent of both molecular, p = 0.08, and culturable E. coli presence, p = 0.1. 0 = absent, 1 = present.

[Figure 22 panel label: Total Coliform]

Figure 22. Rotavirus contingency analysis versus total coliform in 28 selected wells. The Pearson test statistic indicates that rotavirus presence was independent of total coliform presence, p = 0.7844. 0 = absent, 1 = present.

sequence identified in either the surface water or groundwater virome using Metavir was from Human herpesvirus. Human herpesvirus 4, Human herpesvirus 2, and Human herpesvirus 6B were each present in the surface water sample at 0.01% of the total sequences. One sequence (0.04% of the total viruses) matching Human herpesvirus 8 was identified in the groundwater sample (Figures 24 & 26).

Cyanophage sequences matching Prochlorococcus phage were present in both the surface water sample and the groundwater sample (Figure 28). The presence of these phage sequences in the surface water sample was not surprising due to the photosynthetic nature of the bacterial host, Prochlorococcus. However, their presence in groundwater, with fewer sequence reads, suggests the possibility that the phage was transported from the surface water to the groundwater. Further sequence analysis of this phage using both MG-RAST (37) and Metavir (36) revealed a ssbinding protein sequence for Prochlorococcus phage P-SSM7 present in both the surface water sample and the groundwater sample (Figure A17). These two sequences were aligned using BLAST (38) and showed 100% identity (Figure 29).
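In the alignment of Figure 29, the query coordinates run from 1 to 143 while the subject coordinates run from 143 down to 1, which in BLAST nucleotide output conventionally indicates that the match lies on the opposite strand of the subject. The sketch below illustrates the two operations behind such a comparison: taking a reverse complement and computing percent identity over an ungapped alignment. The function names are hypothetical, and the "subject" read is constructed here for illustration rather than taken from the sequence files.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_identity(aligned_a, aligned_b):
    """Percent of matching positions in two equal-length, ungapped sequences."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(aligned_a.upper(), aligned_b.upper()) if x == y)
    return 100.0 * matches / len(aligned_a)


# Final 23 bases of the query as printed in Figure 29.
query_tail = "ACAAGATCAAGGAACAACTCGAG"
# Hypothetical subject read stored on the opposite strand.
subject_read = reverse_complement(query_tail)
# Flip the subject back onto the query strand before comparing.
identity = percent_identity(query_tail, reverse_complement(subject_read))
print(f"{identity:.0f}% identity over {len(query_tail)} bases")
```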


Figure 23. Taxonomic composition of virus sequences identified in the surface water C1-3 sample. Pie chart was created using Metavir.


Figure 24. Taxonomic composition of Herpesvirales sequences in surface water sample C1-3. Pie chart was created using Metavir. Human Herpesvirus 4, 2, and 6B were identified.

Figure 25. Taxonomic composition of virus sequences identified in the groundwater 21754 sample. Pie chart was created using Metavir.


Figure 26. Taxonomic composition of Herpesvirales sequences in groundwater sample 21754. Human Herpesvirus 8 was identified at 0.04% of the total viruses.

[Figure 27 annotation: VP-6]

Figure 27. Recruitment histogram of rotavirus. Both surface water (red) and groundwater (blue) viromes are represented. All rotavirus sequence reads occur at the VP6 gene marker location, which is the region carried on the plasmid that was used for preparing the standard. There are also more sequence reads in the groundwater sample than in the surface water sample.


Figure 28. Recruitment histogram of Prochlorococcus phage P-SSM7. Both surface (red) and groundwater (blue) virome are represented. Cyanophage sequences were identified in both the surface and groundwater virome with overlapping sequence reads.


Query  1    TTGCTCGCAAGCAAAAGCGCAAGTTGACGTACATCGCAAACGTTCTTGTGATCTCTGACG  60
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  143  TTGCTCGCAAGCAAAAGCGCAAGTTGACGTACATCGCAAACGTTCTTGTGATCTCTGACG  84

Query  61   CCAAGCGTCCGCAAAATGAAGGTAAGGTTTTCCTGTTCAAGTTCGGAAAGAAGATTTTCG  120
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  83   CCAAGCGTCCGCAAAATGAAGGTAAGGTTTTCCTGTTCAAGTTCGGAAAGAAGATTTTCG  24

Query  121  ACAAGATCAAGGAACAACTCGAG  143
            |||||||||||||||||||||||
Sbjct  23   ACAAGATCAAGGAACAACTCGAG  1

Figure 29. Sequence alignment of Prochlorococcus phage sequences present in a surface water sample (C1-3) and a groundwater sample (21754). Sequences were downloaded from MG-RAST and aligned using BLAST, which showed 100% identity, giving evidence for transport between the surface water and groundwater. Query = Prochlorococcus phage P-SSM7 ssbinding protein from groundwater. Subject = Prochlorococcus phage P-SSM7 ssbinding protein from surface water.


4. DISCUSSION AND CONCLUSIONS

4.1 The Purposes of this Study

Diarrheal disease causes many hospitalizations in Bangladesh, with rotavirus being the leading pathogen affecting children worldwide. Ferguson et al. (5) found through molecular methods that both bacterial and viral pathogens were present in the groundwater of Bara Haldia, Bangladesh in July 2009 and that rotavirus was not correlated to any environmental or fecal indicating parameters. Thus, this study analyzed the published data from July 2009, and also included data from March 2009 and January 2011. The three main purposes of this study were to investigate a transmission pathway of pathogens from surface water to groundwater, to determine if a relationship existed between rotavirus concentration and fecal indicating parameters with added time points, and to determine if a relationship existed between rotavirus concentration and parameters shown to influence virus transport. A virome was also constructed and analyzed with Metavir (36) and MG-RAST (37) to characterize the potential viral community present in the surface water and groundwater, to identify any potential viral fecal indicators present, and to determine any overlap in the surface water and groundwater viromes.

4.2 Surface to groundwater transmission pathway

The presence of diarrheal pathogens in groundwater has raised a concern that a transmission pathway may exist from surface water to groundwater and that the natural filtration that occurs through sediments is not always enough to provide safe drinking water (5). Since there are no regulations in place for Bangladesh, US Environmental Protection Agency (USEPA) regulations were used in this study as a guideline for regulatory groundwater limits. The Ground Water Rule established by the USEPA requires that a 10,000-fold (4-log) reduction of virus concentration occur between a potential virus source and its groundwater well (39). Any aquifer producing less reduction is termed hydrogeologically sensitive and is subject to potential groundwater pathogen contamination in its supply wells (40). The presence of bacterial indicators and viral pathogens in the surface water and groundwater of Bangladesh for the sampling months in 2009 and 2011 provides the opportunity to assess the hydrogeological sensitivity of the area.

The viral pathogens, specifically rotavirus, were found in greater concentrations than bacterial pathogens in both surface water and groundwater. The relative concentration of rotavirus was 10 to 100-fold higher in groundwater than that of any other measured pathogen. The bacterial fecal indicators, mE. coli and Bacteroides, had greater reductions than rotavirus, which had a log reduction of 0.77. These reductions are less than the USEPA target, suggesting that the aquifers in this area would be considered hydrogeologically sensitive according to the USEPA regulation. The bacterial fecal indicators measured showed higher reductions from the surface water to the groundwater in Bara Haldia, Bangladesh than the viral pathogens, suggesting a difference in their transport and a potential explanation for the lack of correlation between bacterial fecal indicators and rotavirus.
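For scale, a log reduction is the base-10 logarithm of the ratio of source concentration to well concentration, so the 4-log target corresponds to a 10,000-fold decrease, while a log reduction of 0.77 corresponds to only about a 5.9-fold decrease. The sketch below works through this arithmetic. The function names and the example concentrations are hypothetical; only the 0.77 value and the 4-log target come from the text above.

```python
import math

FOUR_LOG_TARGET = 4.0  # USEPA Ground Water Rule target cited in the text


def log_reduction(source_conc, well_conc):
    """Base-10 log reduction between a source and a well concentration."""
    if source_conc <= 0 or well_conc <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(source_conc / well_conc)


def fold_reduction(log_red):
    """Convert a log reduction back to a fold decrease."""
    return 10 ** log_red


# Hypothetical concentrations (gene copies/100mL) giving a 4-log reduction.
print(log_reduction(1e6, 1e2) >= FOUR_LOG_TARGET)

# Rotavirus log reduction reported in the text.
rotavirus_log_reduction = 0.77
print(f"{fold_reduction(rotavirus_log_reduction):.1f}-fold decrease")
print(rotavirus_log_reduction >= FOUR_LOG_TARGET)
```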

4.3 Rotavirus relationship to fecal indicating parameters

The presence or absence of total coliform and E. coli as fecal indicator bacteria in water is the current basis for determining whether water is considered safe to drink worldwide (41, 42). The common assumption is that the presence of E. coli or total coliforms (MPN > 1) is indicative of the potential for the water to contain human pathogens. However, previous studies indicate that these fecal indicators are not related to the presence of viruses (42-44), as was also the case in this study. Of a total of 131 wells (from all months) used in this study and having both fecal indicator and rotavirus concentration data, 18% of the wells had both a fecal indicator (culturable E. coli or total coliform) and rotavirus present, 6% of the wells had rotavirus only, 56% of the wells had an indicator(s) but no rotavirus, and 20% of the wells had neither a fecal indicator nor rotavirus. Contingency analysis of rotavirus versus both mE. coli and culturable E. coli, as well as total coliform in all wells, indicated that the presence of rotavirus was independent of the presence of any of these three indicators. The presence of rotavirus in wells without a fecal indicator was most troubling with respect to protecting people from waterborne pathogens, because these wells would be considered clean and safe to drink from based on the lack of E. coli.

The concentrations of culturable E. coli or total coliform in groundwater also did not follow the same pattern as rotavirus concentrations. Rotavirus concentration in the 28 selected wells peaked in July 2009 and decreased in January 2011. However, there were no significant differences in mE. coli concentrations between months sampled, and culturable E. coli showed significantly higher concentrations in July 2009 than in both January 2011 and March 2009, confirming the results from Ferguson et al. (5). Another possible explanation for the lack of relationship between fecal indicators and rotavirus concentrations, besides differences in transport from surface water to groundwater, was that a rotavirus vaccination campaign was implemented in Bara Haldia between July 2009 and January 2011. Effective vaccination of children would reduce the amount of rotavirus carried in stools and thus lower the presence of rotavirus in the groundwater during January 2011.

Chloride/bromide ratios have been used to indicate sewage contamination of water (14). Although high chloride concentrations also can occur in groundwater from several non-sewage sources, such as brine, seawater intrusion, and urban road salting, the ratio of chloride to bromide can be used to discriminate between these sources. Groundwater with relatively high ratios (~150-600) indicates sewage-associated contamination (14, 15). Although the Cl/Br ratios did not vary significantly between months sampled, and were not significantly correlated to rotavirus presence, their ranges of 61-1011 and 56-16512 in March and July 2009, respectively, and means of 467 and 428, respectively, suggest that the groundwater was contaminated by surface water sewage effluent.
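As a minimal sketch of the screening logic described above, a Cl/Br ratio can be computed for each sample and compared with the approximate sewage-associated range of 150-600. The function names are hypothetical, the bounds are simply the approximate range quoted in the text, and the example assumes both ions are expressed in the same units; whether a mass or molar ratio is intended is not specified in this section.

```python
def cl_br_ratio(chloride, bromide):
    """Ratio of chloride to bromide, both in the same concentration units."""
    if bromide <= 0:
        raise ValueError("bromide concentration must be positive")
    return chloride / bromide


def in_sewage_range(ratio, low=150.0, high=600.0):
    """True if the ratio falls in the approximate sewage-associated range."""
    return low <= ratio <= high


# Monthly mean ratios reported in the text.
for label, mean_ratio in (("March 2009", 467), ("July 2009", 428)):
    print(label, mean_ratio, in_sewage_range(mean_ratio))

# Extremes of the reported ranges fall outside the approximate range.
for extreme in (56, 61, 1011, 16512):
    print(extreme, in_sewage_range(extreme))
```

Both monthly means fall inside the range, whereas the extremes of the reported ranges do not, so the interpretation rests on the typical rather than the extreme samples.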

The presence of nitrate concentrations greater than the USEPA acceptable limit of 10 mg/L (46) in three groundwater samples also suggests contamination with surface water sewage effluents. Other studies indicate nitrate concentrations can be high in areas of sewage effluent contamination (15, 16, 45), with a report of nitrate concentrations as high as 25 mg/L in groundwater receiving surface water effluent recharge in the study by Khan et al. (16). Although Bara Haldia is also an agricultural area, which could generate high nitrate concentrations in the groundwater if inorganic fertilizer is used, a likely source is the numerous latrine effluent ponds typically located throughout the villages (6).

4.4 Rotavirus relationship to environmental parameters

The unreliability of fecal indicators to predict rotavirus presence highlights the need for finding new mechanisms for predicting viral occurrence. One method would be by understanding the transport of viral pathogens in groundwater or attachment to particles in sediment. Pieper et al. (22) show that the environmental pH could influence the presence of viruses in groundwater by causing charge reversal and detachment of the virus from sediment particles, such as ferric oxyhydroxide (FeOOH). Thus, a change in the groundwater pH could be related to a change in the presence of rotavirus in the groundwater. In this study, pH significantly varied between all months sampled (F = 190.7, p < 0.0001, Figure 12). Liang and Morgan (23) showed that increased phosphate or sulfate concentrations could also cause a charge reversal of FeOOH and, therefore, result in the detachment of virus particles. Thus, an increase in the concentration of these ions should correlate to an increase in virus concentration in the groundwater (22). However, neither phosphate (F = 0.3, p = 0.6) nor sulfate (F = 0.1, p = 0.7) significantly differed between months sampled for all the wells. The combined pH, sulfate, and phosphate results suggest that while pH may be related to rotavirus presence in the groundwater, more sample time points are needed to explain this relationship. Rotavirus presence in the groundwater wells is not related to sulfate or phosphate.

4.5 Viral pathogen presence in surface and groundwater virome

Although viruses are abundant in the environment, they are difficult to identify by culture- and molecular-based techniques. Thus, alternative measures such as 454 pyrosequencing of environmental samples have been initiated to characterize the diversity within the viral community (47). Analysis of the metagenomic viromes of this study with Metavir identified two human pathogenic viruses present in both surface water and groundwater, rotavirus and human herpesvirus. This sequence analysis also indicated the presence of multiple types of bacteriophage. The abundance of phages in the surface water and groundwater viromes, along with the differing reduction rates of the bacterial indicators and viral pathogens found in this study, suggests that these sequences may provide useful new phage indicators for viral pathogens. With improvements in sequencing methodologies, analysis of metaviromes may become a useful tool for monitoring viral pathogens in water samples.


4.6 Conclusions

The presence of pathogens in both surface water and groundwater indicated that pathogen transmission occurs between surface water and groundwater. This pathway was further supported by sequence analysis that showed a Prochlorococcus phage P-SSM7 ssbinding protein sequence with 100% identity in both the surface water and the groundwater samples. However, more sequencing of subsequent samples would strengthen this evidence by providing more information about the amount of overlap between the surface water and groundwater viromes.

The presence of the fecal indicators (total coliform, culturable E. coli, molecular E. coli, and Bacteroides) in groundwater did not indicate the presence of rotavirus in the groundwater. Rotavirus concentrations also were not significantly related to any of these fecal indicator concentrations.

Furthermore, rotavirus concentrations were not significantly correlated to any environmental parameters measured. Many of these parameters were not measured in January 2011, and the limited amount of data may have biased the analysis toward finding no relationship. Thus, future studies may require more time points and more sampling locations in order to better understand if any of these parameters relate to rotavirus presence. The results of this study provided insight into the presence of diarrheal pathogens in the groundwater of rural Bangladesh as well as the possibility of transport between the surface water and groundwater through molecular and metagenomic sequencing analyses. Although rotavirus presence in the groundwater showed no relationship to the bacterial fecal indicators currently used, sequencing results suggest exciting future research for the use of new phage as indicators of viruses.


LIST OF REFERENCES


1. WHO. Diarrhoeal disease. 2009 [updated 2009; cited]; Available from: http://www.who.int/mediacentre/factsheets/fs330/en/index.html.
2. Gallay A, De Valk H, Cournot M, Ladeuil B, Hemery C, Castor C, et al. A large multi-pathogen waterborne community outbreak linked to faecal contamination of a groundwater system, France, 2000. Clinical Microbiology and Infection. 2006;12(6):561-70.
3. WHO. Rotavirus. 2011 [updated 2011; cited 2012 August 28]; Available from: http://www.who.int/nuvi/rotavirus/en/.
4. Carrel M, Escamilla V, Messina J, Giebultowicz S, Winston J, Yunus M, et al. Diarrheal disease risk in rural Bangladesh decreases as tubewell density increases: a zero-inflated and geographically weighted analysis. International Journal of Health Geographics. 2011;10:41.
5. Ferguson AS, Layton AC, Mailloux BJ, Culligan PJ, Williams DE, Smartt AE, et al. Comparison of fecal indicators with pathogenic bacteria and rotavirus in groundwater. Science of The Total Environment. 2012;431(0):314-22.
6. Knappett PSK, McKay LD, Layton A, Williams DE, Alam MJ, Huq MR, et al. Implications of Fecal Bacteria Input from Latrine-Polluted Ponds for Wells in Sandy Aquifers. Environmental Science & Technology. 2011;46(3):1361-70.
7. Leber J, Rahman MM, Ahmed KM, Mailloux B, van Geen A. Contrasting Influence of Geology on E. coli and Arsenic in Aquifers of Bangladesh. Ground Water. 2011;49(1):111-23.
8. Glassmeyer ST, Furlong ET, Kolpin DW, Cahill JD, Zaugg SD, Werner SL, et al. Transport of Chemical and Microbial Compounds from Known Wastewater Discharges: Potential for Use as Indicators of Human Fecal Contamination. Environmental Science & Technology. 2005;39(14):5157-69.
9. Guidelines for Drinking-Water Quality. 4 ed: World Health Organization; 2011.
10. Cook K, Bolster C, Ayers K, Reynolds D. Escherichia coli Diversity in Livestock Manures and Agriculturally Impacted Stream Waters. Current Microbiology. 2011;63(5):439-49.
11. Figueras MJ, Borrego JJ. New Perspectives in Monitoring Drinking Water Microbial Quality. International Journal of Environmental Research and Public Health. 2010;7(12):4179-202.
12. Lavender JS, Kinzelman JL. A cross comparison of QPCR to agar-based or defined substrate test methods for the determination of Escherichia coli and enterococci in municipal water quality monitoring programs. Water Research. 2009;43(19):4967-79.
13. Wade TJ, Calderon RL, Brenner KP, Sams E, Beach M, Haugland R, et al. High sensitivity of children to swimming-associated gastrointestinal illness - Results using a rapid assay of recreational water quality. Epidemiology. 2008;19(3):375-83.
14. Vengosh A, Pankratov I. Chloride/Bromide and Chloride/Fluoride Ratios of Domestic Sewage Effluents and Associated Contaminated Ground Water. Ground Water. 1998;36(5):815-24.
15. Panno SV, Hackley KC, Hwang HH, Greenberg SE, Krapac IG, Landsberger S, et al. Characterization and Identification of Na-Cl Sources in Ground Water. Ground Water. 2006;44(2):176-87.
16. Khan MS, Ahmad SR, ur Rahman Z, Ishaque M. Estimation and Distribution of Nitrate Contamination in Groundwater of Wah Town, its Causes and Management. Pakistan Journal of Nutrition. 2012;11(4):332-5.
17. Hazen TC. Fecal coliforms as indicators in tropical waters: A review. Toxicity Assessment. 1988;3(5):461-77.
18. Gutierrez L, Mylon SE, Nash B, Nguyen TH. Deposition and Aggregation Kinetics of Rotavirus in Divalent Cation Solutions. Environmental Science & Technology. 2010;44(12):4552-7.
19. Rahman M, Sultana R, Ahmed G, Nahar S, Hassan Z, Salada F, et al. Prevalence of G2P[4] and G12P[6] Rotavirus, Bangladesh. Emerging Infectious Diseases. January 2007;13(1):18-24.
20. Siddique AK, Ahmed S, Iqbal A, Sobhan A, Poddar G, Azim T, et al. Epidemiology of Rotavirus and Cholera in Children Aged Less Than Five Years in Rural Bangladesh. Journal of Health, Population, and Nutrition. 2011;29(1):8.
21. van Doorn L-J, Kleter B, Hoefnagel E, Stainier I, Poliszczak A, Colau B, et al. Detection and Genotyping of Human Rotavirus VP4 and VP7 Genes by Reverse Transcriptase PCR and Reverse Hybridization. Journal of Clinical Microbiology. 2009;47(9):2704-12.
22. Pieper AP, Ryan JN, Harvey RW, Amy GL, Illangasekare TH, Metge DW. Transport and Recovery of Bacteriophage PRD1 in a Sand and Gravel Aquifer: Effect of Sewage-Derived Organic Matter. Environmental Science & Technology. 1997;31(4):1163-70.
23. Liang L, Morgan JJ. Chemical aspects of iron oxide coagulation in water: Laboratory studies and implications for natural systems. Aquatic Sciences - Research Across Boundaries. 1990;52(1):32-55.
24. Harvey RW, Ryan JN. Use of PRD1 bacteriophage in groundwater viral transport, inactivation, and attachment studies. FEMS Microbiology Ecology. 2004;49(1):3-16.
25. Whitehead E, Hiscock K, Dennis P. Evidence for sewage contamination of the Sherwood Sandstone aquifer beneath Liverpool, UK. Impacts of Urban Growth on Surface Water and Groundwater Quality. 1999(259):179-85.
26. Boehm AB, Shellenbarger GG, Paytan A. Groundwater Discharge: Potential Association with Fecal Indicator Bacteria in the Surf Zone. Environmental Science & Technology. 2004;38(13):3558-66.
27. van Geen A, Ahmed KM, Akita Y, Alam MJ, Culligan PJ, Emch M, et al. Fecal Contamination of Shallow Tubewells in Bangladesh Inversely Related to Arsenic. Environmental Science & Technology. 2011;45(4):1199-205.
28. Emch M, Yunus M, Escamilla V, Feldacker C, Ali M. Local population and regional environmental drivers of cholera in Bangladesh. Environmental Health. 2010;9:2.
29. von Brömssen M, Häller Larsson S, Bhattacharya P, Hasan MA, Ahmed KM, Jakariya M, et al. Geochemical characterisation of shallow aquifer sediments of Matlab Upazila, Southeastern Bangladesh - Implications for targeting low-As aquifers. Journal of Contaminant Hydrology. 2008;99(1-4):137-49.
30. Sinton LW, Finlay RK, Reid AJ. A simple membrane filtration-elution method for the enumeration of F-RNA, F-DNA and somatic coliphages in 100-ml water samples. Journal of Microbiological Methods. 1996;25(3):257-69.
31. Knappett PSK, Escamilla V, Layton A, McKay LD, Emch M, Williams DE, et al. Impact of population and latrines on fecal contamination of ponds in rural Bangladesh. Science of The Total Environment. 2011;409(17):3174-82.
32. Layton A, McKay L, Williams D, Garrett V, Gentry R, Sayler G. Development of Bacteroides 16S rRNA Gene TaqMan-Based Real-Time PCR Assays for Estimation of Total, Human, and Bovine Fecal Pollution in Water. Applied and Environmental Microbiology. 2006;72(6):4214-24.
33. Gibson KE, Schwab KJ. Detection of Bacterial Indicators and Human and Bovine Enteric Viruses in Surface Water and Groundwater Sources Potentially Impacted by Animal and Human Wastes in Lower Yakima Valley, Washington. Applied and Environmental Microbiology. 2011;77(1):355-62.
34. Rajal VB, McSwain BS, Thompson DE, Leutenegger CM, Wuertz S. Molecular quantitative analysis of human viruses in California stormwater. Water Research. 2007;41(19):4287-98.
35. Kageyama T, Kojima S, Shinohara M, Uchida K, Fukushi S, Hoshino FB, et al. Broadly Reactive and Highly Sensitive Assay for Norwalk-Like Viruses Based on Real-Time Quantitative Reverse Transcription-PCR. J Clin Microbiol. 2003;41(4):1548-57.
36. Roux S, Faubladier M, Mahul A, Paulhe N, Bernard A, Debroas D, et al. Metavir: a web server dedicated to virome analysis. Bioinformatics.27(21):3074-5.
37. Meyer F, Paarmann D, D'Souza M, Olson R, Glass EM, Kubal M, et al. The metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes. BMC Bioinformatics. 2008;9:386. PMCID: 2563014.
38. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403-10.
39. Agency UEP. National Primary Drinking Water Regulations: Ground Water Rule. Rules and Regulations. 2006;71(216).
40. Bhattacharjee S, Ryan JN, Elimelech M. Virus transport in physically and geochemically heterogeneous subsurface porous media. Journal of Contaminant Hydrology. 2002;57:161-87.
41. Payment P, Locas A. Pathogens in Water: Value and Limits of Correlation with Microbial Indicators. Ground Water. 2011;49(1):4-11.
42. Francy DS, Helsel DR, Nally RA. Occurrence and Distribution of Microbiological Indicators in Groundwater and Stream Water. Water Environment Research. 2000;72(2):152-61.
43. Marzouk Y, Goyal SM, Gerba CP. Prevalence of in Ground Water of Israel. Ground Water. 1979;17(5):487-91.
44. Bushon RN. Fecal indicator viruses: U.S. Geological Survey Techniques of Water-Resources Investigations. 2003;book 9(chap. A7):section 7.2.
45. Williams AE, Lund LJ, Johnson JA, Kabala ZJ. Natural and Anthropogenic Nitrate Contamination of Groundwater in a Rural Community, California. Environmental Science & Technology. 1998;32(1):32-9.
46. EPA. Drinking Water Contaminants. 2012 [updated 2012; cited]; Available from: http://water.epa.gov/drink/contaminants/index.cfm.
47. Alhamlan FS, Ederer MM, Brown CJ, Coats ER, Crawford RL. Metagenomics-based analysis of viral communities in dairy lagoon wastewater. Journal of Microbiological Methods. 2013;92(2):183-8.


APPENDIX


Table A 1. Physical Characteristics of Pond and Canal Samples.

Sample ID  Type   Village      Latitude  Longitude  Physical Characteristics
IEP-2      Pond   Bara Haldia  23.370    90.648     Fishing pond. Owner puts cow dung into pond for fish.
IEP-3      Pond   Bara Haldia  23.373    90.645     Near IEP-5.
IEP-4      Pond   Bara Haldia  23.371    90.648     Used to store water for bathing during dry season. Level goes down in dry season, but never completely dries up.
IEP-5      Pond   Bara Haldia  23.373    90.644     Near IEP3.
IEP-6      Pond   Sardarkandi  23.349    90.659     Several latrines draining into it.
CI-1       Canal  Bara Haldia  23.370    90.647     Canal behind well #21778.
CI-2       Canal  Bara Haldia  23.371    90.646     Extension of CI-1 where it narrows to a small stream.
CI-3       Canal  Bara Haldia  23.372    90.644     Sewage canal with dozens of latrines draining into it.
BHP-1      Pond   Bara Haldia  23.368    90.648     Muddy, Just North of Road, 10 x 25 m. No hanging latrines.
BHP-2      Pond   Bara Haldia  23.369    90.648     Very muddy, brown with oily slick, 10 x 10m. No hanging latrines.
BHP-3      Pond   Bara Haldia  23.369    90.647     Very dirty, brown with brown scum, 25 x 50m. No hanging latrines.
BHP-4      Pond   Bara Haldia  23.370    90.647     Brown with some scum, medium dirty, 50 x 25m. Can't see latrines.
BHP-5      Pond   Bara Haldia  23.369    90.647     Brown, but no scum, tree fallen in pond. 9 x 50m. Can't see latrines.
BHP-6      Pond   Bara Haldia  23.368    90.649     Small, filthy black, 10 x 6m. One nearby latrine but not hanging.
BHP-7      Pond   Bara Haldia  23.368    90.649     Small brown. 6 x 6m.


Table A 2. Concentrations of molecular indicator and pathogenicity genes in surface water. 100 to 250 mL of water was filtered for each sample. Units are in gene copies/100mL. Red text = below detection limit.

Columns: Sample ID, followed by January, March, and July values for each of log E. coli (23S rRNA); log Bacteroides (AllBac) (16S rRNA); log Rotavirus G12-like (VP6); log Adenovirus 40/41 (hexon gene); log Shigella (ipaH); log ETEC E. coli (eltA); log Vibrio (ompW).

IEP-2 3.85 1.00 5.24 7.09 6.06 6.77 1.00 3.88 3.38 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 3.20 3.00
IEP-3 1.00 1.00 5.75 4.38 6.08 8.03 1.00 3.89 5.01 1.00 1.00 1.00 1.00 1.00 1.90 1.00 1.00 2.39 1.00 1.00 1.00
IEP-4 4.49 1.00 5.30 7.74 6.35 5.85 1.00 4.24 3.38 1.00 1.00 1.00 1.00 2.00 1.00 1.00 1.00 1.00 1.00 1.00
IEP-5 4.40 1.00 5.93 7.17 6.47 7.65 4.06 4.54 4.79 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
IEP-6 5.27 8.30 1.00 4.20 1.00 1.00 1.00
CI-1 5.08 6.05 7.80 7.86 4.11 1.00 1.00 1.00 1.00 1.00 1.00
CI-2 3.63 5.84 7.05 8.33 4.22 1.00 1.00 1.00 1.00 1.00 1.00
CI-3 4.67 4.88 5.54 7.46 6.64 7.32 4.07 4.20 6.62 2.89 1.00 1.00 1.00 1.00 1.00 1.00 3.15 3.90 1.00 2.90 1.00
BHP-1 5.28 5.97 4.70 7.70 7.43 6.66 4.00 3.61 1.00 1.00 1.00 1.00 1.00 1.00 3.26 1.00 1.00 1.00 1.00
BHP-2 4.83 4.02 5.42 7.91 6.46 6.98 1.00 4.25 3.89 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
BHP-3 5.43 5.26 5.94 7.89 6.74 6.51 3.74 4.87 3.46 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 3.40 1.00
BHP-4 1.00 5.35 6.18 7.22 1.00 3.64 1.00 1.00 1.00 1.00 1.00 1.00
BHP-5 1.00 5.79 5.38 6.72 2.60 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
BHP-6 1.00 6.07 6.52 8.39 3.06 1.00 1.00 1.00 1.00 1.00 1.00 4.45 1.00
BHP-7 9.27 4.17 5.21 6.63 6.27 6.75 1.00 3.56 2.92 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
Total Wells 10 14 14 10 14 14 10 12 3 10 14 14 10 14 14 10 12 14 10 12 14


Table A 3. Concentrations of molecular indicators and pathogenicity genes in groundwater from all tubewells. 2000 to 8000 mL of water was filtered for each sample. Units are in log gene copies/100mL unless otherwise noted. Red text = below detection limit. Any replicate samples are shown as their average.

Sample ID | log E. coli (MPN/100 mL) | log total coliform (MPN/100 mL) | log E. coli (23S rRNA) | log Bacteroides (AllBac) (16S rRNA) | log Rotavirus G12-like (VP6)
Month | January March July | January March July | January March July | January March July | January March July
21682 -0.60 -0.60 -0.60 1.49 -0.60 -0.60 2.42 1.00 3.04 6.03 4.01 3.98 1.00 3.14 1.00
21683 -0.60 2.49 2.35 3.41
21739 0.72 -0.60 -0.60 0.99 -0.60 0.49 1.23 1.00 1.70 3.51 2.57 2.95 1.00 1.00 1.00
21740 -0.60 -0.60 2.16 0.62 1.00 1.00 3.08 2.66 1.00
21741 -0.60 0.31 1.70 2.79
21743 -0.30 -0.60 -0.60 -0.30 -0.60 -0.60 1.31 1.00 1.00 4.28 1.27 2.50 1.00 1.00 1.00
21744 -0.60 -0.60 -0.30 -0.60 1.98 1.85 1.20 3.08 2.02 2.95 3.39 2.12 2.33 1.00 2.53
21745 -0.60 -0.30 1.23 0.00 1.32 2.61 2.11 2.11 2.32 4.93 4.20 1.00 1.00
21746 -0.60 -0.60 -0.60 0.18 -0.60 0.62 1.41 1.95 4.74 1.00 1.00
21747 2.19 5.12
21748 -0.60 -0.60 -0.60 2.27 -0.60 0.18 1.75 2.49 1.60 4.94 3.42 1.00 1.00 1.00
21749 -0.60 -0.60 -0.60 0.41 1.52 2.14 3.30 3.82 1.00 1.00
21750 -0.60 -0.60 -0.60 -0.60 0.00 0.31 1.32 2.23 1.00 3.87 3.73 4.12 2.41 3.25 1.00
21752 -0.60 -0.60 0.72 -0.60 0.18 2.69 2.14 1.91 2.79 2.40 4.89 4.75 1.00 1.00 1.00
21753 -0.60 -0.60 0.87 0.00 1.00 1.70 5.39 1.00
21754 -0.60 -0.30 0.72 2.32 0.91 2.08 2.07 1.00 1.00 3.41 3.29 3.80 1.00 1.00 1.00
21755 -0.60 -0.60 -0.60 -0.60 -0.60 0.61 1.57 1.55 2.40 2.84 3.15 3.07 1.00 1.00 1.00
21756 -0.30 3.48 3.45 3.88 3.02
21757 -0.60 -0.60 0.88 0.18 -0.30 1.52 3.52 2.47 3.19 3.11 3.19 4.60 1.00 1.00 4.53
21758 -0.60 2.03 0.41 3.17 1.00 4.87 4.22 7.45 4.56 1.00
21759 -0.60 -0.60 -0.30 1.36 0.00 0.67 2.85 1.00 2.53 4.54 4.44 3.75 1.00 1.00 1.00
21760 -0.60 0.41 3.48 -0.60 1.97 3.48 1.00 2.61 4.24 4.47 4.92 6.72 1.00 1.00 1.00
21762 -0.60 -0.60 0.00 0.00 2.00 1.04 2.47 2.14 2.64 3.08 1.00 1.00
21763 -0.60 1.86 1.79 4.08 3.40
21764 -0.60 -0.60 0.49 -0.60 -0.60 1.73 1.00 1.88 2.30 1.73 3.62 3.30 1.00 3.10 3.67
21765 0.18 -0.60 0.41 2.60 -0.60 2.07 1.00 2.06 1.59 4.29 2.94 2.58 1.00 1.00 1.00
21766 -0.60 -0.60 -0.60 0.30 -0.60 1.40 2.08 2.18 1.85 4.02 2.66 2.84 1.00 3.60
21767 -0.60 -0.60 -0.60 0.18 2.58 1.65 3.92 1.00 1.00
21768 -0.60 -0.60 -0.60 -0.60 -0.30 -0.30 1.00 2.81 1.00 3.82 2.66 1.00 1.00 1.00
21769 -0.60 -0.30 0.00 1.04 1.87 1.26 4.46 2.82 1.00
21770 -0.60 1.73 -0.60 -0.30 3.48 0.41 1.80 3.38 2.09 1.63 4.24 2.57 1.00 1.00 1.00
21771 -0.60 -0.60 0.31 -0.60 1.65 1.26 1.00 1.00 1.62 2.43 3.14 3.89 1.00 1.00 1.00
21773 -0.60 -0.30 2.60 3.36
21774 -0.30 0.31 -0.60 0.56 2.06 1.22 1.00 1.08 2.47 2.71 4.50 2.49 2.21 1.00 4.25
21775 1.62 -0.60 -0.30 2.65 0.18 1.53 2.48 1.66 1.78 4.29 3.30 2.23 1.00 2.72 4.75
21776 -0.60 -0.60 -0.60 -0.30 1.16 0.90 1.45 1.98 2.81 2.46 3.58 3.75 3.09 4.42
21777 1.48 -0.60 0.56 1.60 -0.60 2.21 2.82 1.49 2.76 4.54 4.88 4.45 1.00 1.00 4.46
21778 0.18 -0.60 0.67 2.49 -0.30 2.39 1.72 1.00 2.51 1.00 2.31 1.00 3.97
21779 -0.60 -0.60 0.55 -0.60 2.19 2.04 2.00 4.46 1.00 4.03 4.28 3.97 1.00 1.00 2.48
21780 -0.60 0.56 -0.60 0.71 1.27 -0.60 1.00 2.04 2.19 1.61 3.88 3.64 1.00 1.00 1.00
21781 -0.60 -0.60 0.76 -0.60 1.65 1.98 2.13 2.55 3.09 3.78 2.62 4.07 1.00 1.00 1.00
21782 -0.60 0.93 2.85 0.40 2.49 2.95 1.64 4.08 5.18 4.55 4.30 5.70 1.00 1.00
21783 0.41 1.66 3.11 4.37 4.68
21784 0.67 0.00 -0.60 2.46 1.07 0.90 1.89 1.99 1.00 5.07 4.23 4.45 1.00 1.36 1.00
21785 -0.60 -0.60 0.84 -0.60 -0.60 3.20 1.99 1.00 2.87 3.93 3.47 1.00 3.10 1.00
21786 -0.60 -0.60 -0.60 -0.60 -0.60 1.02 1.17 1.56 1.40 3.26 2.78 2.56 1.00 3.33 1.00
21787 -0.31 -0.60 1.39 0.72 2.48 2.72 1.69 3.58 2.83 3.72 1.00 2.81 2.68
21788 -0.60 -0.60 1.55 1.02 3.23 1.37 3.12 3.78 1.00
21789 -0.60 0.56 -0.60 -0.60 2.35 0.97 2.50 1.12 4.41 5.91 3.20 2.08
21790 -0.60 -0.60 -0.60 0.00 0.18 0.31 2.44 2.24 1.00 5.20 4.09 4.26 1.00 1.00 1.00
21791 -0.60 -0.60 1.00 4.17
21792 -0.60 -0.60 -0.60 1.64 0.31 0.62 1.00 3.64 1.00 5.17 4.50 4.22 1.00 1.00 1.00
21793 -0.60 2.00 -0.60 -0.60 3.48 0.49 2.15 3.96 1.84 4.26 4.93 1.00 1.00 2.06
21794 -0.60 -0.60 -0.60 -0.60 0.99 1.30 1.29 1.00 1.95 3.43 3.90 2.45 1.00 1.00
21795 -0.60 -0.60 1.00 3.96
21796 -0.60 -0.60 0.62 0.74 -0.60 1.22 1.82 1.99 2.42 4.33 3.20 4.20 1.00 1.00 1.00
21797 -0.30 2.54 1.64 3.48 3.61 5.38 1.00
21798 -0.60 -0.60 0.62 -0.60 -0.60 1.45 1.00 2.20 1.94 4.66 3.02 3.41 1.00 1.00 1.75
21800 -0.60 -0.60 1.50 -0.60 -0.60 1.37 1.00 2.39 3.23 2.51 1.00 1.00
Total Wells 42 57 50 42 57 50 41 55 50 37 55 44 39 44 48


Table A 4. Concentration of molecular indicators and pathogenicity genes in groundwater from all tubewells. 2000 to 8000 mL of water was filtered for each sample. Units are in log gene copies/100 mL unless otherwise noted. Red text = below detection limit. Any replicate samples are shown as their average.

Sample ID | log Adenovirus 40/41 (hexon gene) | log Norovirus (G1) | log Norovirus (G2) | log Shigella (ipaH) | log ETEC E. coli (eltA) | log Vibrio (omp W)
Month | January March July | March | March | January March July | January March July | January March July
21682 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21683 1.00 1.00 1.00 1.00
21739 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21740 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21741 1.00 1.00 1.00 1.00
21743 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21744 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21745 1.18 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21746 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21747 1.00 1.00 1.00 2.66
21748 2.34 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 3.16 1.00
21749 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21750 1.47 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21752 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21753 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21754 1.73 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.30 1.00 1.00
21755 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21756 1.00 1.00 1.00 3.75
21757 1.00 1.00 1.00 1.00 1.00 1.00 1.00 2.68 1.00 1.00 1.00 1.00 1.00 1.00
21758 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21759 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21760 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21762 1.00 1.00 1.00 1.00 1.00 1.00
21763 1.00 1.00 1.00 1.00 1.00 1.00
21764 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21765 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21766 1.00 1.00 1.00 1.00 1.00 1.57 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21767 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.30
21768 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21769 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21770 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21771 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21773 1.00 1.00 1.00 1.00
21774 1.00 1.00 1.00 1.00 1.00 1.00 1.00 2.30 1.00 1.00 1.00 1.00 1.00 1.00
21775 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.48 1.00 1.00 1.00 1.00 1.00 1.00
21776 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.44 1.00 1.00 1.00 1.00 1.00 1.37
21777 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21778 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21779 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21780 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 2.45 1.00 1.00
21781 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.02
21782 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21783 1.00 1.00 1.00 1.00 1.00 1.00
21784 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21785 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.22 1.00 1.00
21786 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21787 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21788 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21789 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21790 1.89 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21791 1.00 1.00 1.00 1.00
21792 2.26 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21793 1.69 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21794 1.63 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21795 1.00 1.00 1.00 1.00
21796 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21797 1.00 1.00 1.00 1.00
21798 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
21800 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
Total Wells 42 54 46 40 40 42 55 50 42 55 50 42 55 49

Table A 5. Chemical composition of groundwater from all tubewells. 2000 to 8000 mL of water was filtered for each sample. DO units are in mg/L.

Sample ID | Latitude | Longitude | Depth (m) | Temp (°C) | pH | DO | ORP (mV)
Month | | | | January March July | January March July | March July | January March July
21682 23.365 90.649 36.58 24.20 26.30 26.73 6.77 7.52 0.76 0.87 1436.00 -198.20 40.10
21683
21739 23.368 90.649 22.86 23.10 25.59 26.29 6.62 7.48 6.73 0.97 0.79 450.00 -230.10 54.90
21740 23.368 90.649 13.72 26.63 26.10 26.63 6.21 7.43 6.21 0.85 0.48 -233.70 82.10
21741 23.366 90.650
21743 23.366 90.651 18.29 25.40 26.01 26.71 7.06 7.22 6.26 0.82 0.88 495.00 -257.20 121.40
21744 23.364 90.651 16.76 25.10 27.19 6.77 6.65 0.89 1597.00 88.30
21745 23.364 90.651 36.58 25.50 26.20 26.09 6.68 7.47 6.90 0.98 0.72 1700.00 -226.20 157.40
21746 23.364 90.651 16.76 25.60 25.84 6.70 0.89 1574.00 136.70
21747 23.365 90.649
21748 23.366 90.649 13.72 25.60 26.30 6.80 7.38 0.96 0.75 768.00 -203.20 25.70
21749 23.366 90.649 32 26.27 26.36 7.12 0.73 0.52 -662.00 70.90
21750 23.366 90.648 27.43 24.20 26.38 26.56 6.78 7.06 6.77* 1.24 0.63 880.00 -652.10 87.30
21752 23.367 90.648 15.24 23.80 26.07 26.54 6.93 7.04 0.77 0.89 725.00 -464.20 -41.20
21753 23.367 90.648 15.24 27.33 25.85 27.33 6.24 7.39 6.24 0.64 0.79 -162.60 -68.50
21754 23.370 90.645 10.67 24.80 25.64 26.23 6.53 7.04 6.44 1.00 0.40 807.00 -730.00 137.30
21755 23.369 90.645 7.62 25.30 26.28 7.03 7.86 0.92 752.00 -179.20
21756 23.371 90.645 12.19 25.94 7.77 0.90 -177.40
21757 23.371 90.645 13.72 24.00 25.99 26.91 6.88 7.78 6.88 0.51 0.39 2899.00 -184.40 149.40
21758 23.371 90.645 15.24 26.71 26.15 26.71 6.38 7.93 6.38 0.86 0.72 -208.10 122.70
21759 23.371 90.645 15.24 25.90 26.19 26.33 6.88 7.44 6.28 0.77 0.65 1659.00 -212.60 142.50
21760 23.371 90.644 9.14 25.60 26.02 26.75 6.81 7.75 6.80 0.91 0.80 1751.00 -168.30 145.70
21762 23.372 90.646 15.24 25.10 6.55 1010.00
21763 23.372 90.647 15.24 26.06 8.27 0.98 -239.60
21764 23.371 90.646 15.24 24.70 26.20 26.32 7.02 7.41 6.24 1.18 0.91 4513.00 -222.60 189.30
21765 23.371 90.646 15.24 25.00 26.41 26.47 6.99 7.50 6.32 0.84 0.77 4144.00 -205.60 119.10
21766 23.371 90.646 15.24 23.30 26.27 26.03 6.81 7.59 6.47 0.78 0.86 2482.00 -206.70 -207.70
21767 23.370 90.646 15.24 26.11 26.40 26.11 6.17 7.54 6.17 0.89 0.55 -213.20 121.00
21768 23.369 90.647 19.81 24.50 26.30 26.54 6.83 7.48 6.69 0.55 0.72 664.00 -218.50 126.40
21769 23.369 90.648 21.34 26.97 26.34 26.97 6.18 7.07 6.18 0.73 0.87 -721.20 91.70
21770 23.368 90.647 18.29 25.20 26.23 26.23 6.88 7.01 6.38 0.92 0.81 577.00 -727.30 130.60
21771 23.373 90.646 22.86 26.00 26.26 26.86 6.91 7.02 6.87 0.85 0.59 642.00 -727.80 122.40
21773 23.371 90.646 18.29 26.18 8.39 0.92 -224.10
21774 23.371 90.648 14.33 24.60 26.28 26.57 6.98 7.96 6.31 0.81 0.62 1261.00 -658.50 176.10
21775 23.371 90.647 16.15 25.50 26.28 26.49 6.86 7.92 6.39 0.65 0.83 1242.00 -226.90 139.90
21776 23.370 90.649 10.06 25.00 25.96 26.15 6.83 7.40 6.10 0.63 0.67 393.00 -144.70 199.60
21777 23.700 90.648 9.14 25.20 25.97 26.51 6.92 7.93 6.49 0.71 0.88 1360.00 -662.90 134.80
21778 23.370 90.648 9.14 24.10 26.45 26.20 7.22 7.66 6.46 0.76 0.83 423.40 -286.70 140.30
21779 23.371 90.645 16.76 26.00 26.15 25.94 6.99 7.72 6.17 0.79 0.91 2526.00 -191.00 149.10
21780 23.368 90.648 18.29 24.40 26.04 26.62 6.88 7.52 6.39 0.90 0.82 960.00 -219.20 -121.10
21781 23.370 90.647 16.76 24.20 26.24 26.85 6.74 7.60 6.21 0.78 0.88 1269.00 -206.30 32.80
21782 23.371 90.646 13.72 24.80 26.37 25.83 6.85 7.37 6.45 0.98 0.67 3640.00 -223.30 143.60
21783 23.372 90.645 16.76 26.16 7.32 0.89 -218.20
21784 23.372 90.645 16.76 23.40 26.17 26.17 6.51 7.92 6.23 0.81 0.51 2586.00 -219.60 15.40
21785 23.372 90.644 15.24 25.60 26.23 26.05 6.78 7.34 6.02 0.87 0.62 2732.00 -267.40 84.40
21786 23.372 90.644 16.76 24.60 25.86 26.32 6.92 7.43 6.49 0.97 0.75 1777.00 -219.70 132.10
21787 23.373 90.644 16.76 23.10 26.56 6.88 6.41 0.82 2833.00 161.20
21788 23.373 90.644 18.29 26.57 26.04 26.57 6.45 7.21 6.45 0.78 0.82 -282.50 169.50
21789 23.373 90.644 18.29 24.80 25.98 6.75 7.65 0.81 3774.00 -210.40
21790 23.373 90.644 16.76 24.80 26.10 25.98 6.73 7.43 6.35 0.74 0.78 3244.00 -190.70 162.00
21791 23.372 90.644 16.76 26.14 7.49 0.79 -211.20
21792 23.373 90.644 10.67 26.00 26.55 25.98 6.88 7.06 6.39 0.80 0.64 3036.00 -711.70 162.80
21793 23.374 90.644 18.29 23.00 26.39 26.57 6.94 7.15 6.18 0.89 0.68 2986.00 -252.10 70.10
21794 23.373 90.644 10.67 26.10 26.09 26.30 6.91 7.09 6.51 0.98 0.68 1626.00 -261.10 -26.20
21795 23.373 90.644 13.72 25.82 7.19 0.80 -262.10
21796 23.372 90.644 18.29 24.60 26.42 26.69 6.74 7.40 6.87 0.71 0.74 2198.00 -225.10 191.60
21797 23.372 90.644 13.72 26.07 6.70
21798 23.372 90.643 10.67 25.80 26.37 26.44 6.84 7.73 6.17 0.82 0.84 1402.00 -184.20 188.70
21800 23.333 90.644 13.72 25.20 26.68 6.74 6.56 0.47 2983.00 140.80
Total Wells 50 50 44 49 50 40 50 46 42 50 46

* The July pH for well 21750 was recorded as 16.77 in the Microsoft Access file. This was assumed to be a typo for 6.77.


Table A 6. Continued chemical composition of groundwater from all tubewells. 2000 to 8000 mL of water was filtered for each sample. Units are in mg/L. Red text = below detection limit.

Sample ID | Chloride | Bromide | Cl/Br | Nitrate | Phosphate | Sulfate
Month | March July | March July | March July | March July | March July | March July
21682 49.47 45.14 0.26 0.48 427.51 212.57 0.05 6.05 0.21 0.10 0.12 1.68
21683
21739 7.88 8.13 0.02 0.31 981.05 59.77 0.02 0.28 1.58 0.92 1.15 1.52
21740 4.44 0.01 2002.91 0.35 0.46 0.10
21741
21743 7.23 6.94 0.03 0.01 639.21 3129.00 0.03 0.42 2.74 2.86 0.07 0.10
21744
21745 89.65 99.56 0.48 0.64 420.84 352.42 0.02 2.28 0.12 0.32 0.12 0.13
21746 27.34 0.48 128.08 4.25 0.10 0.10
21747
21748 8.30 7.71 0.05 0.31 405.08 55.83 0.05 0.41 0.08 0.31 0.10 0.10
21749 278.48 271.86 1.30 1.35 483.36 455.38 0.05 0.05 0.06 0.10 0.10 0.10
21750 11.68 11.69 0.04 0.33 725.48 79.01 0.03 3.03 0.34 0.32 0.11 0.16
21752 25.83 21.96 0.11 0.36 551.31 136.56 0.02 0.31 0.64 0.69 0.05 0.10
21753 7.34 7.81 0.02 0.01 839.96 3521.83 0.03 0.63 0.14 0.40 0.39 0.10
21754 43.69 25.70 0.38 0.32 261.07 180.14 0.30 0.41 0.39 0.32 0.29 0.18
21755 14.28 0.06 581.06 0.04 0.37 0.07
21756 105.40 0.42 565.74 0.12 0.09 0.06
21757 586.85 3.12 423.43 0.31 0.34 0.06
21758 97.49 107.35 0.66 0.77 333.56 313.04 0.05 0.43 0.10 0.10 0.10 0.10
21759 80.74 94.62 0.46 0.80 393.82 267.94 0.05 0.27 0.12 0.31 0.24 0.10
21760 143.66 136.39 0.87 0.73 373.19 418.78 0.54 0.66 0.35 0.10 3.86 2.84
21762
21763 407.26 1.90 484.01 0.07 0.06 1.37
21764 871.63 933.85 4.53 4.72 433.72 445.60 0.05 0.05 0.29 0.39 0.10 0.10
21765 711.26 740.10 3.66 3.87 437.64 431.35 0.05 0.05 0.10 0.10 0.10 0.10
21766 351.18 314.23 1.73 1.72 456.96 411.75 0.32 0.28 1.12 0.42 0.09 0.13
21767 13.51 0.31 96.87 0.41 0.35 0.10
21768 16.18 0.34 108.65 0.61 0.32 0.97
21769 9.36 7.53 0.04 0.01 503.34 3393.19 0.03 0.27 0.17 0.54 0.10 0.10
21770 8.52 6.52 0.31 0.01 61.47 2940.85 0.27 0.51 0.43 0.10 1.51 0.10
21771 11.34 0.03 781.80 0.05 0.12 0.07
21773 280.89 1.18 537.46 0.20 0.10 14.32
21774 69.96 69.96 0.35 0.35 444.79 444.79 0.05 0.05 0.05 0.10 1.26 1.26
21775 22.38 0.14 352.45 0.23 0.31 0.71
21776 36.63 0.01 16511.53 0.05 0.10 24.21
21777 9.73 0.28 79.01 0.05 1.70 0.54
21778 4.76 0.01 2147.06 0.05 0.05 0.82
21779 367.48 1.85 448.11 0.34 0.33 0.10
21780 53.78 46.48 0.53 0.43 229.91 246.35 0.24 1.56 0.33 0.31 1.34 0.10
21781 50.67 0.23 507.37 0.05 0.08 0.22
21782 685.76 604.67 3.58 3.24 431.49 420.40 0.01 0.37 0.14 0.32 0.08 1.40
21783 457.05 2.12 485.96 0.02 0.53 0.10
21784 346.58 362.28 1.50 1.24 521.41 657.42 0.11 10.13 0.27 0.73 0.11 0.44
21785 401.94 417.73 1.17 1.21 771.89 780.99 20.87 25.75 0.05 0.31 73.87 126.67
21786 150.46 142.70 0.57 0.79 590.55 405.76 0.05 0.25 0.08 0.44 0.10 0.10
21787 508.25 506.50 2.56 2.61 447.23 437.96 0.05 0.34 0.32 0.10 0.10 1.11
21788 647.56 665.85 3.18 3.36 458.51 446.15 0.00 0.05 0.10 0.10 0.05 1.28
21789 766.78 3.78 456.94 0.05 0.06 0.05
21790 657.67 639.59 3.16 3.14 469.65 459.66 0.00 0.37 0.06 0.10 0.05 1.51
21791
21792 517.97 595.33 2.77 3.02 421.83 444.22 0.05 0.35 0.10 0.10 0.10 1.16
21793 428.78 439.66 2.25 2.14 429.18 463.41 0.05 0.05 0.10 0.10 0.10 0.10
21794 470.32 531.73 1.96 2.58 540.49 463.84 0.01 0.37 0.11 0.10 20.43 8.39
21795 864.35 4.38 444.52 0.05 0.10 0.10
21796 233.16 234.93 1.35 0.99 389.92 535.37 0.31 6.63 0.29 0.10 0.11 1.62
21797
21798 41.58 30.08 0.09 0.37 1011.03 181.37 5.55 1.21 0.01 0.40 5.79 2.91
21800 472.41 2.26 471.45 0.05 0.10 1.19
Total Wells 42 42 42 42 42 42 42 42 42 42 42 42


Table A 7. Concentrations of molecular indicator and pathogenicity genes in groundwater from 28 selected tubewells. 2000 to 8000 mL of water was filtered for each sample. Molecular units are in log gene copies/100 mL. Red text = below detection limit. Any replicate samples are shown as their average.

Well ID | log E. coli (MPN/100 mL) | log total coliform (MPN/100 mL) | log E. coli (23S rRNA) | log Bacteroides (AllBac) (16S rRNA) | log Rotavirus G12-like (VP6) | log Adenovirus 40/41 (hexon gene)
Month | January March July | January March July | January March July | January March July | January March July | January March July
21739 0.72 -0.60 -0.60 0.99 -0.60 0.49 1.23 1.00 1.70 3.51 2.57 2.95 1.00 1.00 1.00 1.00 1.00 1.00
21743 -0.30 -0.60 -0.60 -0.30 -0.60 -0.60 1.31 1.00 1.00 4.28 1.27 2.50 1.00 1.00 1.00 1.00 1.00 1.00
21744 -0.60 -0.60 -0.30 -0.60 1.98 1.85 1.20 3.08 2.02 2.95 3.39 2.12 2.33 1.00 2.53 1.00 1.00
21750 -0.60 -0.60 -0.60 -0.60 0.00 0.31 1.32 2.23 1.00 3.87 3.73 4.12 2.41 3.25 1.00 1.47 1.00 1.00
21754 -0.60 -0.30 0.72 2.32 0.91 2.08 2.07 1.00 1.00 3.41 3.29 3.80 1.00 1.00 1.00 1.73 1.00 1.00
21755 -0.60 -0.60 -0.60 -0.60 -0.60 0.61 1.57 1.55 2.40 2.84 3.15 3.07 1.00 1.00 1.00 1.00 1.00
21757 -0.60 -0.60 0.88 0.18 -0.30 1.52 3.52 2.47 3.19 3.11 3.19 4.60 1.00 1.00 4.53 1.00 1.00 1.00
21759 -0.60 -0.60 -0.30 1.36 0.00 0.67 2.85 1.00 2.53 4.54 4.44 3.75 1.00 1.00 1.00 1.00 1.00 1.00
21760 -0.60 0.41 3.48 -0.60 1.97 3.48 1.00 2.61 4.24 4.47 4.92 6.72 1.00 1.00 1.00 1.00 1.00 1.00
21764 -0.60 -0.60 0.49 -0.60 -0.60 1.73 1.00 1.88 2.30 1.73 3.62 3.30 1.00 3.10 3.67 1.00 1.00 1.00
21765 0.18 -0.60 0.41 2.60 -0.60 2.07 1.00 2.06 1.59 4.29 2.94 2.58 1.00 1.00 1.00 1.00 1.00 1.00
21770 -0.60 1.73 -0.60 -0.30 3.48 0.41 1.80 3.38 2.09 1.63 4.24 2.57 1.00 1.00 1.00 1.00 1.00 1.00
21771 -0.60 -0.60 0.31 -0.60 1.65 1.26 1.00 1.00 1.62 2.43 3.14 3.89 1.00 1.00 1.00 1.00 1.00 1.00
21774 -0.30 0.31 -0.60 0.56 2.06 1.22 1.00 1.08 2.47 2.71 4.50 2.49 2.21 1.00 4.25 1.00 1.00 1.00
21775 1.62 -0.60 -0.30 2.65 0.18 1.53 2.48 1.66 1.78 4.29 3.30 2.23 1.00 2.72 4.75 1.00 1.00 1.00
21776 -0.60 -0.60 -0.60 -0.30 1.16 0.90 1.45 1.98 2.81 2.46 3.58 3.75 1.00 3.09 4.42 1.00 1.00 1.00
21777 1.48 -0.60 0.56 1.60 -0.60 2.21 2.82 1.49 2.76 4.54 4.88 4.45 1.00 1.00 4.46 1.00 1.00 1.00
21779 -0.60 -0.60 0.55 -0.60 2.19 2.04 2.00 4.46 1.00 4.03 4.28 3.97 1.00 1.00 2.48 1.00 1.00 1.00
21780 -0.60 0.56 -0.60 0.71 1.27 -0.60 1.00 2.04 2.19 1.61 3.88 3.64 1.00 1.00 1.00 1.00 1.00 1.00
21781 -0.60 -0.60 0.76 -0.60 1.65 1.98 2.13 2.55 3.09 3.78 2.62 4.07 1.00 1.00 1.00 1.00 1.00 1.00
21784 0.67 0.00 -0.60 2.46 1.07 0.90 1.89 1.99 1.00 5.07 4.23 4.45 1.00 1.36 1.00 1.00 1.00 1.00
21786 -0.60 -0.60 -0.60 -0.60 -0.60 1.02 1.17 1.56 1.40 3.26 2.78 2.56 1.00 3.33 1.00 1.00 1.00 1.00
21790 -0.60 -0.60 -0.60 0.00 0.18 0.31 2.44 2.24 1.00 5.20 4.09 4.26 1.00 1.00 1.00 1.89 1.00 1.00
21792 -0.60 -0.60 -0.60 1.64 0.31 0.62 1.00 3.64 1.00 5.17 4.50 4.22 1.00 1.00 1.00 2.26 1.00 1.00
21793 -0.60 2.00 -0.60 -0.60 3.48 0.49 2.15 3.96 1.84 3.03 4.26 4.93 1.00 1.00 2.06 1.69 1.00 1.00
21794 -0.60 -0.60 -0.60 -0.60 0.99 1.30 1.29 1.00 1.95 2.53 3.43 3.90 2.45 1.00 1.00 1.63 1.00 1.00
21796 -0.60 -0.60 0.62 0.74 -0.60 1.22 1.82 1.99 2.42 4.33 3.20 4.20 1.00 1.00 1.00 1.00 1.00 1.00
21798 -0.60 -0.60 0.62 -0.60 -0.60 1.45 1.00 2.20 1.94 4.66 3.02 3.41 1.00 1.00 1.75 1.00 1.00 1.00


Table A 8. Chemical composition of groundwater from 28 selected tubewells. 2000 to 8000 mL of water was filtered for each sample. DO units are in mg/L, and ORP units are in mV.

Well ID | Temp (°C) January | Temp (°C) March | Temp (°C) July | pH January | pH March | pH July | DO March | DO July | ORP January | ORP March | ORP July
21739 | 23.10 | 25.59 | 26.29 | 6.62 | 7.48 | 6.73 | 0.97 | 0.79 | 450.00 | -230.10 | 54.90
21743 | 25.40 | 26.01 | 26.71 | 7.06 | 7.22 | 6.26 | 0.82 | 0.51 | 495.00 | -257.20 | 121.40
21744 | 25.10 | 26.26 | 27.19 | 6.77 | 7.16 | 6.65 | 0.83 | 0.88 | 1597 | -250.10 | 88.30
21750 | 24.20 | 26.38 | 26.56 | 6.78 | 7.06 | 6.77* | 1.24 | 0.75 | 880 | -652.10 | 87.30
21754 | 24.80 | 25.64 | 26.23 | 6.53 | 7.04 | 6.44 | 1.00 | 0.89 | 807.00 | -730.00 | 137.30
21755 | 25.30 | 26.28 | 26.91 | 7.03 | 7.86 | 6.69 | 0.92 | 0.88 | 752.00 | -179.20 | 150.10
21757 | 24.00 | 25.99 | 26.91 | 6.88 | 7.78 | 6.88 | 0.51 | 0.79 | 2899.00 | -184.40 | 149.40
21759 | 25.90 | 26.19 | 26.33 | 6.88 | 7.44 | 6.28 | 0.77 | 0.39 | 1659.00 | -212.60 | 142.50
21760 | 25.60 | 26.02 | 26.75 | 6.81 | 7.75 | 6.80 | 0.91 | 0.72 | 1751.00 | -168.30 | 145.70
21764 | 24.70 | 26.20 | 26.32 | 7.02 | 7.41 | 6.24 | 1.18 | 0.80 | 4513.00 | -222.60 | 189.30
21765 | 25.00 | 26.41 | 26.47 | 6.99 | 7.50 | 6.32 | 0.84 | 0.91 | 4144.00 | -205.60 | 119.10
21770 | 25.20 | 26.23 | 26.23 | 6.88 | 7.01 | 6.38 | 0.92 | 0.87 | 577.00 | -727.30 | 130.60
21771 | 26.00 | 26.26 | 26.31 | 6.91 | 7.02 | 6.99 | 0.85 | 0.81 | 642 | -727.80 | 122.40
21774 | 24.60 | 26.28 | 26.57 | 6.98 | 7.96 | 6.31 | 0.81 | 0.59 | 1261.00 | -658.50 | 176.10
21775 | 25.50 | 26.28 | 26.49 | 6.86 | 7.92 | 6.39 | 0.65 | 0.62 | 1242.00 | -226.90 | 139.90
21776 | 25.00 | 25.96 | 26.15 | 6.83 | 7.40 | 6.10 | 0.63 | 0.83 | 393.00 | -144.70 | 199.60
21777 | 25.20 | 25.97 | 26.51 | 6.92 | 7.93 | 6.49 | 0.71 | 0.67 | 1360.00 | -662.90 | 134.80
21779 | 26.00 | 26.15 | 25.94 | 6.99 | 7.72 | 6.17 | 0.79 | 0.83 | 2526.00 | -191.00 | 149.10
21780 | 24.40 | 26.04 | 26.62 | 6.88 | 7.52 | 6.39 | 0.90 | 0.91 | 960.00 | -219.20 | -121.10
21781 | 24.20 | 26.24 | 26.85 | 6.74 | 7.60 | 6.21 | 0.78 | 0.82 | 1269.00 | -206.30 | 32.80
21784 | 23.40 | 26.17 | 26.17 | 6.51 | 7.92 | 6.23 | 0.81 | 0.67 | 2586.00 | -219.60 | 15.40
21786 | 24.60 | 25.86 | 26.32 | 6.92 | 7.43 | 6.49 | 0.97 | 0.62 | 1777.00 | -219.70 | 132.10
21790 | 24.80 | 26.10 | 25.98 | 6.73 | 7.43 | 6.35 | 0.74 | 0.82 | 3244.00 | -190.70 | 162.00
21792 | 26.00 | 26.55 | 25.98 | 6.88 | 7.06 | 6.39 | 0.80 | 0.78 | 3036.00 | -711.70 | 162.80
21793 | 23.00 | 26.39 | 26.57 | 6.94 | 7.15 | 6.18 | 0.89 | 0.64 | 2986 | -252.10 | 70.10
21794 | 26.10 | 26.09 | 26.30 | 6.91 | 7.09 | 6.51 | 0.98 | 0.68 | 1626 | -261.10 | -26.20
21796 | 24.60 | 26.42 | 26.69 | 6.74 | 7.40 | 6.87 | 0.71 | 0.68 | 2198.00 | -225.10 | 191.60
21798 | 25.80 | 26.37 | 26.44 | 6.84 | 7.73 | 6.17 | 0.82 | 0.84 | 1402.00 | -184.20 | 188.70

* The July pH for well 21750 was recorded as 16.77 in the Microsoft Access file. This was assumed to be a typo for 6.77.


Table A 9. Pathogen and fecal indicator prevalence and abundance. Both surface water and groundwater measurements are listed. All concentrations are arithmetic means in log gene copies/100 mL.

Surface Water Groundwater January March July January March July Target % Positive % Positive % Positive % Positive % Positive % Positive (mean) (mean) (mean) (mean) (mean) (mean) 90.0% 50.0% 100% 75.6% 74.5% 82.0% mE. coli (4.85) (2.86) (5.58) (1.72) (2.06) (2.11) 100% 100% 100% 100% 98.2% 100% Bacteroides (7.23) (6.53) (7.22) (3.77) (3.71) (3.72) 40.0% 100% 100% 10.3% 34.1% 27.1% Rotavirus (2.19) (3.95) (5.47) (1.14) (1.75) (1.64) 80.0% 7.1% 19.0% Adenovirus 0% 0% 0% (2.97) (1.19) (1.15) Shigella + 16.7% 21.4% 2.4% 8.0% 0% 0% ETEC E. coli (1.39) (1.39) (1.01) (1.08) 33.3% 7.1% 7.1% 5.5% 4.1% Vibrio 0% (1.83) (1.14) (1.05) (1.12) (1.01)
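
As a rough check on how the prevalence and mean values in Table A 9 could be derived, the sketch below assumes that a sample counts as positive when its log value exceeds the non-detect placeholder of 1.00, and that the mean is taken over all samples, non-detects included. The source does not state either rule explicitly. The function name `prevalence_and_mean` is hypothetical, and the input is one reading of the January groundwater Vibrio column of Table A 4 (39 non-detects plus detections of 1.30, 2.45, and 1.22).

```python
def prevalence_and_mean(log_values, nd_log=1.00):
    """Return (% positive, arithmetic mean) for a list of log10 concentrations.

    Assumption: values equal to nd_log are non-detects and still enter the mean.
    """
    positives = [v for v in log_values if v > nd_log]
    percent_positive = 100.0 * len(positives) / len(log_values)
    mean_log = sum(log_values) / len(log_values)
    return percent_positive, mean_log


# January groundwater Vibrio, as read from Table A 4 (an interpretation)
january_vibrio = [1.00] * 39 + [1.30, 2.45, 1.22]
pct, mean_log = prevalence_and_mean(january_vibrio)
print(f"{pct:.1f}% positive, mean {mean_log:.2f}")
```

Under these assumptions the result agrees with the 7.1% (1.05) listed for January groundwater Vibrio, which suggests, but does not prove, that the table was computed this way.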

[Box plot. y-axis: log concentration (gene copies/100 mL), 1e+0 to 1e+10, with a dashed line at ND = 10; x-axis: month (January, March, July).]

Figure A 1. Comparison of molecular E. coli in all wells by month. The ANOVA F value of 2.6 showed no significant difference among months, p = 0.08. January n = 41, March n = 61, and July n = 57. The horizontal dashed line reflects the non-detect (ND) value of 10 gene copies/100 mL. The points plotted above the whiskers represent outliers in the dataset.
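
The F values reported in these figure captions come from a one-way ANOVA across sampling months. As an illustration only, the sketch below computes the F statistic and its degrees of freedom in plain Python. The function name `one_way_anova_f` and the sample values are hypothetical; they are neither the thesis data nor the software the author used. The p-value additionally requires the F distribution and is omitted here.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical log10 gene copies/100 mL for three months (not thesis data)
january = [1.0, 2.0, 3.0]
march = [2.0, 3.0, 4.0]
july = [3.0, 4.0, 5.0]
f_value, df_between, df_within = one_way_anova_f([january, march, july])
print(f"F({df_between}, {df_within}) = {f_value:.2f}")
```

A large F relative to the F distribution with these degrees of freedom would indicate that at least one monthly mean differs from the others.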

[Box plot. y-axis: log concentration (gene copies/100 mL), 1e+0 to 1e+10, with a dashed line at ND = 10; x-axis: month (January, March, July).]

Figure A 2. Comparison of all Bacteroides in all wells by month. The ANOVA F value of 0.05 showed no significant difference among months, p = 0.1. January n = 39, March n = 61, and July n = 50. The horizontal dashed line reflects the non-detect (ND) value of 10 gene copies/100 mL. The points plotted above and below the whiskers represent outliers in the dataset.

[Box plot. y-axis: log concentration (gene copies/100 mL), 1e+0 to 1e+10, with a dashed line at ND = 10; x-axis: month (January, March, July).]

Figure A 3. Comparison of Vibrio in all wells by month. The ANOVA F value of 1.3 showed no significant difference among months, p = 0.3. January n = 48, March n = 60, and July n = 55. The horizontal dashed line reflects the non-detect (ND) value of 10 gene copies/100 mL. The points plotted above the whiskers represent outliers in the dataset.

[Box plot. y-axis: log concentration (gene copies/100 mL), 1e+0 to 1e+10, with a dashed line at ND = 10; x-axis: month (January, March, July).]

Figure A 4. Comparison of Shigella in all wells by month. The ANOVA F value of 2.6 showed no significant difference among months, p = 0.08. January n = 48, March n = 60, and July n = 56. The horizontal dashed line reflects the non-detect (ND) value of 10 gene copies/100 mL. The points plotted above the whiskers represent outliers in the dataset.

[Box plot titled "Variability of Cl/Br Ratios by Month". y-axis: Cl/Br (mol/L), 0 to 18000; x-axis: month (March, July).]

Figure A 5. Comparison of Cl/Br ratios in all wells by month. The ANOVA F value of 2.09 showed no significant difference between months, p = 0.2. The points plotted above and below the whiskers represent outliers in the dataset.
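
The Cl/Br values tabulated in Table A 6 appear to be molar ratios rather than mass ratios. For well 21745 in March, 89.65 mg/L chloride and 0.48 mg/L bromide give roughly 421 on a molar basis, close to the listed 420.84. The sketch below shows that conversion. The function name `cl_br_molar_ratio` is hypothetical, and the molar masses are standard atomic weights assumed here, not values taken from the thesis. Small differences from the table presumably reflect rounding of the tabulated concentrations.

```python
# Molar masses in g/mol (standard atomic weights; an assumption, not from the source)
CL_MOLAR_MASS = 35.453
BR_MOLAR_MASS = 79.904


def cl_br_molar_ratio(chloride_mg_l, bromide_mg_l):
    """Convert chloride and bromide mass concentrations (mg/L) to a molar Cl/Br ratio."""
    if bromide_mg_l <= 0:
        raise ValueError("bromide concentration must be positive")
    chloride_mmol_l = chloride_mg_l / CL_MOLAR_MASS
    bromide_mmol_l = bromide_mg_l / BR_MOLAR_MASS
    return chloride_mmol_l / bromide_mmol_l


# Well 21745, March (concentrations as listed in Table A 6)
ratio = cl_br_molar_ratio(89.65, 0.48)
print(f"Cl/Br molar ratio: {ratio:.1f}")
```

Because bromide sits in the denominator, values near the detection limit (for example 0.01 mg/L) produce very large ratios, which is consistent with the high outliers in the table.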

[Box plot. y-axis: nitrate (mg/L), 0 to 30; x-axis: month (March, July).]

Figure A 6. Comparison of nitrate concentrations in all wells by month. The ANOVA F value of 1.2 showed no significant difference between months, p = 0.3. The points plotted above the whiskers represent outliers in the dataset.

[Box plot. y-axis: PO4 (mg/L), 0.0 to 3.5; x-axis: month (March, July).]

Figure A 7. Comparison of phosphate concentrations in all wells by month. The ANOVA F value of 0.2 showed no significant difference between months, p = 0.6. The points plotted above and below the whiskers represent outliers in the dataset.

[Box plot. y-axis: sulfate (mg/L), 0 to 140; x-axis: month (March, July).]

Figure A 8. Comparison of sulfate concentrations in all wells by month. The ANOVA F value of 0.1 showed no significant difference between months, p = 0.7. The points plotted above the whiskers represent outliers in the dataset.

[Box plot. y-axis: log concentration (MPN/100 mL), 1e-1 to 1e+4, with a dashed line at ND = 0.25; x-axis: month (January, March, July).]

Figure A 9. Comparison of culturable E. coli in 28 select wells by month. The ANOVA F value of 0.8 showed no significant difference among months, p = 0.4. The horizontal dashed line reflects the non-detect (ND) value of 0.25 MPN/100 mL. The points plotted above the whiskers represent outliers in the dataset.
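
The tables report non-detects as the log of the ND value: log10(0.25) is about -0.60 for culture-based MPN results, and log10(10) is 1.00 for gene copies. A minimal sketch of that substitution follows. The function name `log10_with_nd` is hypothetical, and treating any value below the ND value as a non-detect is an assumption about how the data were prepared.

```python
import math


def log10_with_nd(value, nd_value):
    """Return log10(value), substituting nd_value for missing or sub-ND results."""
    if value is None or value < nd_value:
        value = nd_value  # assumption: non-detects are set to the ND value
    return math.log10(value)


# Culturable E. coli non-detect (ND = 0.25 MPN/100 mL)
print(round(log10_with_nd(0.0, 0.25), 2))
# Molecular target non-detect (ND = 10 gene copies/100 mL)
print(round(log10_with_nd(None, 10.0), 2))
# A detected molecular result of 1000 gene copies/100 mL
print(round(log10_with_nd(1000.0, 10.0), 2))
```

Substituting a fixed value keeps non-detects on the log axis, at the cost of compressing all sub-ND results onto a single line in the box plots.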

[Box plot. y-axis: pH, 5.5 to 9.0; x-axis: month (January, March, July).]

Figure A 10. Comparison of pH in select wells by month. The ANOVA F value of 2.8 showed no significant difference among months, p = 0.07. The points plotted above and below the whiskers represent outliers in the dataset.

[Scatter plot. y-axis: rotavirus (log gene copies/100 mL), 0 to 5; x-axis: Cl/Br, 0 to 4000.]

Figure A 11. Linear regression analysis of rotavirus concentration with Cl/Br ratios for March and July. The simple scatter plot was created using data from the 28 selected wells. The linear regression statistics showed no significant correlation, R² = 0.005, p = 0.6.
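
The R² values quoted for these regressions measure the fraction of variance in rotavirus concentration explained by a straight-line fit. The sketch below computes the least-squares slope, intercept, and R² in plain Python. The function name `linear_regression` and the data points are hypothetical and are not the values plotted in the thesis. The p-value for the slope is not computed.

```python
def linear_regression(x, y):
    """Return (slope, intercept, r_squared) for a simple least-squares fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For one predictor, R squared equals the squared Pearson correlation
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical Cl/Br ratios and rotavirus log gene copies/100 mL (not thesis data)
cl_br = [100.0, 400.0, 450.0, 800.0, 3000.0]
rotavirus = [1.0, 3.1, 1.0, 2.5, 1.0]
slope, intercept, r_squared = linear_regression(cl_br, rotavirus)
print(f"slope = {slope:.5f}, intercept = {intercept:.2f}, R^2 = {r_squared:.3f}")
```

The function divides by the variance of each variable, so it fails if either input is constant.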

[Scatter plot. y-axis: rotavirus (log gene copies/100 mL), 0 to 5; x-axis: temperature (°C), 22 to 28.]

Figure A 12. Linear regression analysis of rotavirus concentration with temperature variation. The simple scatter plot was created using data from the 28 selected wells. The linear regression statistics showed a significant but very weak correlation, p = 0.05, R² = 0.05.

[Scatter plot. y-axis: rotavirus (log gene copies/100 mL), 0 to 5; x-axis: ORP (mV), -1000 to 5000.]

Figure A 13. Linear regression analysis of rotavirus concentration with ORP. The simple scatter plot was created using data from the 28 selected wells. The linear regression statistics showed no significant correlation, p = 0.2, R² = 0.02.

[Scatter plot. y-axis: rotavirus (gene copies/100 mL), 0 to 5; x-axis: DO (mg/L), 0.2 to 1.4.]

Figure A 14. Linear regression analysis of rotavirus concentration with DO. The simple scatter plot was created using data from the 28 selected wells. The linear regression statistics showed no significant correlation, p = 0.9, R² = 0.0001.

[Scatter plot. y-axis: rotavirus (log gene copies/100 mL), 0 to 5; x-axis: total coliform (log MPN/100 mL), -1 to 4.]

Figure A 15. Linear regression analysis of rotavirus concentration with total coliform concentration. The simple scatter plot was created using data from the 28 selected wells. The linear regression statistics showed no significant correlation, p = 0.5, R² = 0.005.


Table A 10. Phage sequence abundances in the groundwater virome using MG-RAST. Phages are listed in order of abundance of sequence hits. The two most abundant entries are present in the virome because the sample was contaminated with the cloning vector.

Source | Organism | Abundance
GenBank | phagemid cloning vector pA2 | 473
GenBank | phagemid vector pBK-CMV | 228
GenBank | Enterobacteria phage DE3 | 91
GenBank | Vibrio phage CTX | 8
GenBank | Enterobacteria phage lambda | 7
GenBank | uncultured phage MedDCM-OCT-S11-C561 | 7
GenBank | Treponema phagedenis F0421 | 5
GenBank | Aureococcus anophagefferens | 2
GenBank | uncultured phage MedDCM-OCT-S04-C231 | 2
GenBank | Enterobacteria phage f1 | 1
GenBank | Prochlorococcus phage P-SSM7 | 1
GenBank | Pseudomonas phage MP29 | 1
GenBank | Pseudomonas phage PA1/KOR/2010 | 1
GenBank | Shigella phage SfX | 1
GenBank | uncultured phage MedDCM-OCT-S04-C714 | 1


Table A 11. Viral sequence abundances in the groundwater virome using MG-RAST. Viruses are listed in order of abundance of sequence hits. Rotavirus A and hepatitis C virus were the viral pathogens present according to this program.

Source | Organism | Abundance
GenBank | Human rotavirus A | 58
GenBank | Rotavirus A human/Belgium/BE00009/2005/G1P[8] | 31
GenBank | Rotavirus A human/Belgium/BE00012/2006/G1P[8] | 31
GenBank | Rotavirus A human/Belgium/BE00020/2006/G1P[8] | 31
GenBank | Rotavirus A PO/CE-M-06-0003/Canada/2006/G2P[27]I | 14
GenBank | Human immunodeficiency virus 1 | 17
GenBank | Acanthamoeba polyphaga | 6
GenBank | Rotavirus subgroup 2 | 4
GenBank | Micromonas sp. RCC1109 virus MpV1 | 2
GenBank | Rotavirus A human/Bethesda/DC5064/1977/G4P[8] | 1
GenBank | Rotavirus A human/Bethesda/DC5115/1977/G4P[8] | 1
GenBank | Hepatitis C virus | 1


(A) >4489038.3|HMH8K2V02CZEA7|GenBank|ADO99101.1 ssDNA binding protein [Prochlorococcus phage P-SSM7] TTGATGATCGTTTTTGGCAACCAGAAGTTGACGCCGCTGGCAACGGATACGC AGTTATCCGCTTCCTTGATACTCCAGCCGTTGACGGTGAAGATGGTCTGCCG TGGGTACAGATTTGGTCACACGGTTTCCAGGGTCCAGGTGGTTGGTACATTG AGAATTCTCTCACAACTCTTGGCAAGACCGACCCTGTTTCTGAGTACAACAC TGTTCTGTGGAACTCAGGTATCGAAGCAAATAAGGAAATTGCTCGCAAGCA AAAGCGCAAGTTGACGTACATCGCAAACGTTCTTGTGATCTCTGACGCCAAG CGTCCGCAAAATGAAGGTAAGGTTTTCCTGTTCAAGTTCGGAAAGAAGATTT TCGACAAGATCAAGGAACAACTCGAGCCGCAGTTTGCTGATGAGACACCAA TGAATCCGTTTGACTTCTGGAAGGGTGCAAACTTCAAGATCAAGATTCGTAA TGTGGAAGGCTATCGTAACTATGACAAGTCGGAGTTTGAATCTCCTGCTGCA TTGTTCAATAGCGACGACGCGCAGATCGAAAAGGTCTGGAAGTCTGCATATT CACTCAAGGATTTCTTGAAGCCTGATAACTTCAAGTCCTATGATGAACTCAA GGCGAAGTTGGACAAGGTGCTAGGTGCTGGTGGCGCAACTGCTGCCGCAGC CAAGAGATCAACGATGAGGAACGCACCTGCTCCG

(B) >4489039.3|HMH8K2V01CA4B0|GenBank|ADO99101.1 ssDNA binding protein [Prochlorococcus phage P-SSM7] TTGCTCGCAAGCAAAAGCGCAAGTTGACGTACATCGCAAACGTTCTTGTGAT CTCTGACGCCAAGCGTCCGCAAAATGAAGGTAAGGTTTTCCTGTTCAAGTTC GGAAAGAAGATTTTCGACAAGATCAAGGAACAACTCGAGCCGCAGTTTGCT GATGAGACACCAATGAATCCGTTTGACTTCTGGAAGGGTGCAAACTTCAAG ATCAAGATTCGTAATGTGGAAGGCTATCGTAACTATGACAAGTCGGAGTTTG AATCTCCTGCTGCATTGTTCAATAGCGACGACGCGCAGATCGAAAAGGTCTG GAAGTCTGCATATTCACTCAAGGATTTCTTGAAGCCTGATAACTTCAAGTCC TATGATGAACTCAAGGCGAAGTTGGACAAGGTGCTAGGTGCTGGTGGCGCA ACTGCTGCCGCAGCCAAGAAGATCAACGATGAGGAAGCACCTGCTCCTGTC GTTCGGTCAGCGCCAGCCAAGAAGGTCACTGCTGAGGATGTCTCAGTTGATG ACGATGATATGGC

Figure A 16. Prochlorococcus phage P-SSM7 sequences found in (A) the surface water sample (CI-3) and (B) the groundwater sample (21754). BLAST alignment of these sequences showed 100% identity, suggesting transport between the surface water and groundwater of Bangladesh.
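
As a simple way to look at what two reads share, the sketch below finds the longest exactly matching stretch between two nucleotide strings using dynamic programming. This is not BLAST and is stricter than a BLAST local alignment, since it allows no mismatches or gaps. The function name `longest_common_substring` and the short example reads are hypothetical, not the sequences above.

```python
def longest_common_substring(a, b):
    """Return the longest substring that occurs exactly in both a and b."""
    best_len = 0
    best_end = 0  # end index (exclusive) of the best match within a
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the match that ended at the previous base of each read
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


# Hypothetical short reads, for illustration only
read_a = "TTGATGATCG"
read_b = "GGATGATCAA"
shared = longest_common_substring(read_a, read_b)
print(shared, len(shared))
```

For reads of a few hundred bases this quadratic approach is fast enough. An actual identity claim should still rest on an alignment tool that reports the aligned length and any mismatches.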


VITA

Kati Ayers was born on August 27, 1988, in Gallatin, Tennessee, to parents Stacy Coates and Heather Wysong. She attended Westmoreland Elementary, Middle, and High School in Westmoreland, Tennessee, but graduated from River Hill High School in Clarksville, Maryland, in 2006. She went on to earn a Bachelor of Science degree in Biology with a minor in Mathematics from Western Kentucky University in 2010. She attended the University of Tennessee from 2011 to 2013, where she began as a graduate teaching assistant for geology and finished as a graduate research assistant for Dr. Terry Hazen. She is expected to graduate in August 2013 from the Department of Earth and Planetary Sciences with a Master of Science degree in Geology.
