Closed-Tube Barcoding of Sequence Variants (Possible Species) Within the Naegleria Genus

Master’s Thesis

Presented to

The Faculty of the Graduate School of Arts and Sciences Brandeis University Department of Biology Professor Lawrence J. Wangh, Advisor

In Partial Fulfillment of the Requirements for the Degree

Master of Science in Biology

by John Deng

May 2017

Copyright by

John Deng

© 2017

Acknowledgements

I would like to thank Professor Lawrence Wangh for giving me the opportunity to pursue this project and for providing me with the financial means to make it possible. His guidance and advice throughout this process have been invaluable. I would like to thank J. Aquiles Sanchez for his constant support and direction. I would also like to thank Nicky Sirianni for teaching me how to run experiments and Adam Osborne for his assistance with the interpretation of difficult data and problems. I am grateful to Heather Schiller and to Professor Chandler Fulton for providing Naegleria samples. Finally, thank you to all other members of the Wangh laboratory for helping to make this project successful. I have learned so much about how to be a better scientist from this process. Thank you.

This research was supported in part by the Division of Science Undergraduate Research Fellowship and a Sprout grant.


Abstract

Closed-Tube Barcoding of Sequence Variants (Possible Species) Within the Naegleria Genus

A thesis presented to the Department of Biology

Graduate School of Arts and Sciences Brandeis University Waltham, Massachusetts

By John Deng

This thesis provides an approach to identify and characterize the enormous diversity of the Naegleria genus with Closed-Tube Barcoding. The thesis comprises two parts: 1) the Naegleria-Assay, which can be used to identify any species of Naegleria on earth, and 2) the Fowleri-Assay, which can be used to selectively detect the “brain-eating” Naegleria fowleri.

In the construction of these assays, Closed-Tube Barcoding was used to generate fluorescent signatures of the CO1 gene for sequence variants of the genus Naegleria. The Naegleria-Assay can generate a unique fluorescent signature for each of the sixteen tested species of Naegleria and can even differentiate between strains within a single species. With a few adjustments to the thermal cycle and reagents, this assay can be transformed into the Fowleri-Assay, which can identify as few as ten copies of Naegleria fowleri in a background of up to 10,000 copies of DNA from other Naegleria species. The efficiency of this reaction is 91%. These assays indicate that a single set of probes can be used to characterize an entire diverse genus, can provide critical information about the diversity and range of the Naegleria genus, and can identify the pathogenic species Naegleria fowleri.

Table of Contents

Introduction
  Traditional Methods of Species Identification are Inadequate
  Is DNA Barcoding a Solution for Rapid Identification of Species?
  Benefits of DNA Barcoding
  Limitations of DNA Barcoding
  Closed-Tube Barcoding
  Closed-Tube Barcoding of Naegleria
Materials and Methods
  Final PCR Composition
  Primer Design
  Taq Polymerase
  ThermaStop™
  ThermaStop™ Concentration
  ThermaMark™ Concentration
  Probe Design
  Probe Concentration
  Other PCR Reagents
  Naegleria Target DNA
  Target DNA Concentration
  Thermal Cycle
  Final PCR Composition and Setup
Results
  In Silico Tests of Naegleria Consensus Probes Against Selected
  In Silico Test of Naegleria Consensus Probes Against the Naegleria Genus
  Naegleria-Assay Fluorescent Signatures
  Fowleri-Assay
  Testing Naegleria-Assay Conditions with the Fowleri-Assay
  Increasing the Stringency of the Fowleri-Assay
  Increasing the Copy Number of Related Naegleria Species
  Testing ThermaMark™ Concentrations
  Testing Different Versions of ThermaMark™
  Dilution series in a background of 10^5 copies of synthetic Naegleria gruberi DNA
  Dilution series in a background of real Naegleria gruberi DNA
Discussion
  Naegleria Assay
  Fowleri-Assay
  Challenges
  Future Directions
Conclusion
References

List of Tables

Table 1: Sequence and Melting Temperatures of Fowleri-Assay and Naegleria-Assay
Table 2: ON/OFF Probe Sequences
Table 3: List of Other Tested Species In-Silico
Table 4: Target Sequences
Table 5: Tested Naegleria Strains and Species
Table 6: Thermal Cycle

List of Figures

Figure 1: ThermaStop™ structure
Figure 2: ThermaMark™ structure
Figure 3: Diagram of Probe binding to amplicon
Figure 4: Mastermix Preparation
Figure 6: Related Eukaryotes Pseudo-color map
Figure 7: Pseudo-color map of the Naegleria genus
Figure 8: N. gruberi
Figure 9: N. gruberi
Figure 10: N. gruberi
Figure 11: N. gruberi
Figure 12: N. clarki
Figure 13: N. canariensis
Figure 14: N. fultoni
Figure 15: N. pagei
Figure 16: N. minor
Figure 17: N. schusteri
Figure 18: N. tenerifensis
Figure 19: N. antarctica
Figure 20: N. galeacystis
Figure 21: N. pringsheimi
Figure 22: N. australiensis
Figure 23: N. tihangensis
Figure 24: N. dunnebacki
Figure 25: N. gruberi total curve
Figure 26: A comparison of two similar signatures N. clarki and N. canariensis
Figure 27: Amplification plot of EDF258
Figure 28: Fowleri-Assay protocol with 5 cycles at fifty degrees
Figure 29: Fowleri-Assay protocol without 5 cycles at fifty degrees
Figure 30: Fowleri-Assay protocol with increased copy numbers of NEG-M and 5c1
Figure 31: FAM amplification plots for different ThermaMark concentrations
Figure 32: Amplification plots of different dilutions of –TM/Low Tm TM/High Tm TM
Figure 33: Dilution series +/- ThermaMark in a background of 10^5 copies of NEG-M
Figure 34: Fowleri Assay Dilution Series of Naegleria fowleri and Naegleria gruberi
Figure 35: Standard curve of the dilution series performed in Figure 34


Introduction

Traditional Methods of Species Identification are Inadequate

Identification and characterization of the species that inhabit our planet is central to our understanding of the biodiversity of life on earth. Ernst Mayr (1942) defined a species as a group of organisms that can interbreed and are reproductively isolated from other such groups. Long before that, Carl Linnaeus and others in the mid-1700s developed taxonomic systems for classifying animals and plants that were largely based on morphological and behavioral criteria. Charles Darwin also devoted a great deal of time to distinguishing species, particularly barnacles, on the basis of morphology (see Rebecca Stott, “Darwin and the Barnacle”). These efforts and many others over the last two centuries have resulted in the naming of at least 1.7 million species (Waugh, 2007). However, it is estimated that over 90% of extant species have not yet been identified or characterized (Mora et al. 2011).

Over the history of the earth, there have been five periods of profound devastation that each resulted in the loss of between fifty and ninety percent of all species. Such a rapid loss of biodiversity is known as a mass extinction and can be identified through dramatic changes in the fossil record. The most famous of these is the Cretaceous-Tertiary (K-T) mass extinction, which was responsible for the extinction of the dinosaurs. However, an earlier event known as the Permian mass extinction, or the Great Dying, was much more calamitous, wiping out over ninety percent of all species. Although there are varying hypotheses for the exact causes of these mass extinctions, ranging from fluctuations in sea level to variations in carbon dioxide to bolide impacts, there is no doubt that each of these events was accompanied by profound ecological devastation (Holland et al. 2016; Veron, 2008).

Many scientists argue that the Earth is undergoing an anthropogenic sixth mass extinction due to the widespread destruction of ecosystems around the world. Estimates place the current extinction rate at 100 to 1,000 times the background rate (Pimm et al. 2014). The species that have gone extinct within the past century would have taken up to 10,000 years to disappear under normal circumstances (Ceballos et al. 2015). One of the most important consequences of this extinction is the overwhelming loss of biodiversity, that is, of the many species that coexist with humans on this earth. Not only are these species integral and fascinating organisms, but many of them also have important implications for the food and health industries.

The fact that it has taken over two centuries to characterize only 10% of all the species on earth indicates that traditional morphological methods are not adequate to capture the species that are quickly disappearing from our world. Traditional morphological methods require trained taxonomists to be able to identify and characterize species and the current rate of categorizing biodiversity cannot hope to keep up with the loss in species every year. Additionally, traditional morphological methods suffer from a number of problems that will be detailed below.

Oftentimes, a single species can look like multiple species due to phenotypic plasticity, developmental changes, and ecological differences. As standard morphological methods of identification and characterization are largely visual, changes in the outward appearance of the same species can result in the inaccurate characterization of a single species as multiple different species (Pecnikar et al. 2014). An example of phenotypic plasticity is found within the genus Carychium, a group of terrestrial snails. The genus comprises over twenty species, and species differentiation is based mainly on characteristics of the mature shell. However, the shell of Carychium species has been found to vary under different environmental conditions, making identification of specific samples difficult (Weigand et al. 2011). As a result, individuals of the same Carychium species collected under different environmental conditions can be mistaken for different species.

Another issue is that speciation is not always accompanied by morphological change. Due to morphological and behavioral similarity, some closely related animal species have been incorrectly categorized as a single species. Species of this type are known as cryptic species (Bickford et al. 2007). For many cryptic species, morphological similarity may arise from stabilizing selection within a narrow ecological niche (Bickford et al. 2007). An example is the neotropical skipper butterfly Astraptes fulgerator, which was found to be a complex of at least 10 species (Hebert et al. 2004). The discovery of cryptic species suggests that the level of biodiversity is greater than previously believed.

Another reason that the identification and characterization of species moves at such a slow pace is the difficulty of characterizing small multicellular organisms. Even experts may find it labor intensive, expensive, or impossible to classify very small organisms purely on the basis of morphology or reproductive behavior. Numerically, however, small organisms make up the majority of all the species on earth (Blaxter, 2016). It is therefore understandable, for instance, that the majority of nematodes have not been classified while 90% of all vertebrates have been identified (Waugh, 2007).

When it comes to unicellular organisms, the problem of classification becomes even greater. Not only are these organisms even more difficult to distinguish morphologically, but a significant proportion of them reproduce asexually. For unicellular organisms that do not reproduce sexually, there is no way to test the biological definition of a species. Even organisms that do reproduce sexually often reproduce clonally through cell division. This raises a real problem of how to define and characterize these species, and the need for identification is heightened by the fact that many of these organisms are pathogens that affect human health.

Is DNA Barcoding a Solution for Rapid Identification of Species?

Because identification of small organisms is technically difficult, scientists have turned to molecular methods for identification of species. These methods aim to discriminate between species by analyzing their DNA. As there is often no morphological or behavioral way to identify these species, molecular methods have gained broad acceptance among groups working with small organisms (Giovannoni et al. 1990; Pace, 1997).

In 2003, Hebert and colleagues published a paper that extended the molecular methods traditionally used for small organisms to all groups of life, an approach they described as DNA barcoding (Hebert et al. 2003). DNA barcoding uses a single region of DNA to serve as a global bioidentification system for all species on earth (Hebert et al. 2003). In this process, the DNA of an organism is isolated and purified, followed by PCR amplification and sequencing. The resulting sequence can then be compared to other sequences in a database to identify the species.

The selection of a suitable gene is integral to DNA barcoding being able to serve as a global bioidentification system. Not only must this gene be conserved across all species on earth, but variation in this gene must also make it possible to tell members of different species apart while still grouping members of the same species together. This requires that, within this gene, interspecies variation be significantly greater than intraspecies variation. In other words, the mutation rate of this particular segment of DNA must be slow enough that intraspecies variation is minimal, but at the same time fast enough to produce adequate interspecies variation for differentiation between species (Waugh, 2007).
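The requirement that variation between species exceed variation within species is often called the barcoding gap. The following is a minimal sketch of how such a gap could be checked on aligned sequences. The function names, the use of a simple proportion of differing sites as the distance, and the toy sequences are all assumptions made for illustration; they are not taken from Hebert et al. or from this thesis.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gap(records):
    """Return (max intraspecies distance, min interspecies distance).

    records is a list of (species_label, aligned_sequence) pairs.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return max(intra, default=0.0), min(inter, default=float("nan"))


# Invented 10-nucleotide toy sequences, two per hypothetical species.
records = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "ACGAACCTTC"),
    ("species_B", "ACGAACCTTT"),
]
max_intra, min_inter = barcode_gap(records)
print(f"max intraspecies = {max_intra:.2f}, min interspecies = {min_inter:.2f}")
```

A marker is useful as a barcode for a given group only when the smallest interspecies distance is clearly larger than the largest intraspecies distance, as it is in this toy data set.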

The gene chosen for the analysis and differentiation of species was the mitochondrial CO1 gene (Hebert et al. 2003). The CO1 gene codes for the first subunit of cytochrome c oxidase, a large transmembrane protein complex that serves as the terminal enzyme of the electron transport chain. It reduces oxygen to water and pumps protons across the membrane to create an electrochemical potential (Iwata et al. 1995). Aside from rare exceptions such as the microbial genus Monocercomonoides, this gene is found in all eukaryotes on earth, making it an ideal target for a standardized barcoding region. The core of the protein is made up of three subunits, and the first subunit is embedded within the membrane of the mitochondrial cristae (Iwata et al. 1995). The nature of this transmembrane protein suggests that it is subject not only to tight structural constraints, as it is embedded in the membrane, but also to tight functional constraints, as it is essential for oxidative phosphorylation. In fact, despite extensive variation at the DNA level, there are strong constraints on the amino acid sequence and the resulting protein structure. Most sequence variation occurs in functionally less critical parts of the protein such as the loops, whereas the active site that transfers electrons from copper to heme is the most conserved (Pentinsaari et al. 2016).

Due to these constraints on the CO1 gene, robust and conserved primers have been identified that allow for amplification across many animal species (Folmer et al. 1994). This allows a single set of primers to be used for barcoding many different species. In addition, mtDNA consists of a single double-stranded circular molecule. Each mitochondrion contains several copies of this molecule, and as each cell contains multiple mitochondria, the CO1 gene is present in vast excess over nuclear gene copies in every cell (Waugh, 2007). This allows for accurate amplification and identification even from a small piece of the organism. In most animals, mtDNA rarely undergoes recombination, is devoid of introns, and is inherited in a haploid manner through the mother (a few exceptions arise in some species of bivalve molluscs) (Hebert et al. 2003; Krishnamurthy et al. 2012; Ladoukakis et al. 2017). MtDNA also has a higher mutation rate than nuclear DNA (Waugh, 2007), but the CO1 gene evolves more slowly than other mtDNA genes due to its structural and functional constraints. These properties result in the CO1 gene changing at a rate that is consistent with speciation. As such, the CO1 gene has been found to be well suited to identifying and differentiating species in most animal genera.

Benefits of DNA Barcoding

The benefits of DNA barcoding over traditional methods are significant. They include the ability to acquire species information from a limited or partial sample, the opportunity for more people to contribute without taxonomic expertise, the identification of cryptic species and species with phenotypic plasticity, and faster and cheaper identification of species. Indeed, there is already a large body of literature on the efficacy of DNA barcoding, and it has proven essential to the study and identification of cryptic species and species that exhibit phenotypic plasticity (Hebert et al. 2004; Rivera et al. 2009; Weigand et al. 2011). As rapid and accurate species identification is very important in many ecological studies, DNA barcoding has also been used with great success in biodiversity, bio-assessment, and biomonitoring studies (Pecnikar et al. 2014).

Beyond taxonomic studies, DNA barcoding may have additional uses in public health, in the detection of pathogens and parasites and the monitoring of their spread (Azpurua et al. 2010). Traditional morphological identification methods often make it difficult to differentiate or identify pathogens or their associated vectors. Not only are pathogens sometimes invisible to the naked eye, but they often pass through several life stages that differ drastically in their characteristics (Pecnikar et al. 2014). In addition, certain species within a genus may be pathogenic whereas other species within the same genus may have no effect on human health (Grace et al. 2015). Thus, it is of utmost importance to be able to quickly and accurately identify pathogens and parasites. For example, in the case of leishmaniasis, a disease that can cause skin lesions, ulcers, and death, scientists used DNA barcoding to identify two specific species of the genus Leishmania that have a much higher incidence of transmitting leishmaniasis than the other twenty known pathogenic species of Leishmania (Azpurua et al. 2010). The use of DNA barcoding to monitor species that have important implications for health and agriculture will most likely allow for rapid advances in detection and identification.

As a result of these advances, the DNA barcoding initiative has led to the formation of the International Barcode of Life project (iBOL), the largest biodiversity genomics initiative ever attempted. This initiative aims to create a data-rich DNA reference library that will be the basis of future species identification. At this moment, iBOL comprises over 120 organizations in 25 countries and holds records for millions of specimens (barcodeoflife.org).

Limitations of DNA Barcoding

Clearly, there are enormous benefits to adopting DNA barcoding in conjunction with traditional morphological methods, as it provides scientists with more extensive, accurate, and rapid identification methods. However, is DNA barcoding truly faster, more efficient, and cheaper than traditional morphological methods? At present, DNA barcoding involves isolating and purifying DNA from a sample, amplifying the CO1 gene target with universal primers, and then sequencing both strands of the amplicon (Pecnikar et al. 2014). In 2014, a study compared the cost and time needed for traditional morphological identification against DNA barcoding in the bio-assessment of the San Gabriel Watershed in California (Stein et al. 2014). The authors found that DNA barcoding cost 1.7 to 3.4 times more than traditional morphological approaches. The cost of processing a 200-count macroinvertebrate sample by DNA barcoding was $1500, whereas the cost of identifying the same sample through morphological methods was $880 (Stein et al. 2014). This disparity was primarily due to the high cost of DNA isolation, amplification, and sequencing compared with the cost of sorting and identifying samples by morphological characteristics (Stein et al. 2014). The costs for DNA barcoding end up at around $5 per sample, with half going to sequencing and half going to isolation and amplification.

The upside of DNA barcoding is that identification is much faster than with traditional morphological methods: from sample collection to identification, DNA barcoding takes three to five weeks, whereas traditional methods take six months (Stein et al. 2014). The data reported by Stein et al. are consistent with those of other studies, indicating that DNA barcoding may not be as cost effective an identification tool as previously believed (Cameron et al. 2006). Although DNA barcoding can quickly and accurately identify species, further progress is evidently needed in order to decrease costs.

Closed-Tube Barcoding

This thesis describes the use of Closed-Tube Barcoding for identification of CO1 sequence variants (possible species) within a genus of single-celled amoeboflagellates. Closed-Tube Barcoding was invented in the Wangh laboratory at Brandeis University (Rice et al. 2014; Sirianni et al. 2016). It distinguishes species on the basis of sequence variations in the CO1 target sequence, but uses a suite of technologies invented in the Wangh laboratory that decrease the time needed for DNA purification and reduce the need for sequencing, the most costly and time-consuming aspects of traditional barcoding (see above). Closed-Tube Barcoding uses Linear-After-The-Exponential PCR (LATE-PCR) to amplify the target and Lights-On/Lights-Off probes to generate a unique fluorescent signature for each species.

Linear-After-The-Exponential PCR (LATE-PCR) is an advanced form of non-symmetric PCR that efficiently generates single-stranded DNA (Sanchez et al. 2004; Pierce et al. 2005). Because the limiting and excess primers are used at different concentrations, single-stranded product is generated efficiently within a closed tube: amplification is exponential until the limiting primer is depleted, after which the excess primer continues to generate single strands in a linear fashion. The amount of single-stranded amplicon generated by the end of amplification is 10- to 20-fold in excess of the double-stranded amplicon. Following amplification, the single strands are analyzed with sets of fluorescent probes that bind in a temperature-specific manner as the temperature is lowered (Sanchez et al. 2004; Pierce et al. 2005). In conventional qPCR, probe detection occurs during the annealing step, which requires that the melting temperature of the probe be higher than that of the primers and puts tight constraints on probe design. In LATE-PCR, probe detection occurs at endpoint, which allows the use of shorter probes with lower melting temperatures. Such probes provide more discrimination and lower background and can be used at higher concentrations (Sanchez et al. 2004; Pierce et al. 2005).
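The switch from exponential to linear amplification can be illustrated with a deliberately simplified toy model. The sketch below is not the kinetic model of Sanchez et al. or Pierce et al.; the function name, the idea of counting primers and strands as whole copies, and the single per-cycle efficiency parameter are all assumptions made for illustration.

```python
def late_pcr(initial_copies, limiting_primer, cycles, efficiency=1.0):
    """Toy LATE-PCR model returning (double_stranded, single_stranded) copies.

    Exponential phase: each duplex yields up to `efficiency` new duplexes per
    cycle, and every new duplex consumes one limiting-primer molecule.
    Linear phase: once the limiting primer is gone, each duplex templates
    `efficiency` new single strands per cycle from the excess primer.
    """
    ds = float(initial_copies)
    ss = 0.0
    primer = float(limiting_primer)
    for _ in range(cycles):
        if primer > 0:
            new_duplexes = min(ds * efficiency, primer)
            ds += new_duplexes
            primer -= new_duplexes
        else:
            ss += ds * efficiency
    return ds, ss


# Hypothetical numbers chosen only to keep the arithmetic easy to follow.
ds, ss = late_pcr(initial_copies=1, limiting_primer=15, cycles=10)
print(ds, ss, ss / ds)
```

In this idealized model the ratio of single-stranded to double-stranded product simply equals the number of cycles run after the limiting primer is exhausted, which gives some intuition for how a 10- to 20-fold excess of single strands could arise. Real reactions are less efficient in the linear phase, so the correspondence should not be read quantitatively.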

Lights-On/Lights-Off probes are pairs of hybridization probes, each pair consisting of a Lights-On probe and a Lights-Off probe (Rice et al. 2012). Each Lights-On probe carries a fluorophore on one end and a quencher on the other, whereas a Lights-Off probe lacks a fluorophore but carries a quencher. When a Lights-On probe is bound to its target, it emits more fluorescence than it does in solution because hybridization rigidly separates the fluorophore from the quencher (Rice et al. 2012). Lights-Off probes are designed to anneal next to Lights-On probes and turn off the fluorescence generated by the Lights-On probes (Rice et al. 2014).

The process of Closed-Tube Barcoding begins with a sample of the organism, which can be lysed with either proteinase K or chaotropic salts (Pierce et al. 2002; Hartshorn et al. 2005). Because of the high copy number of mtDNA in a single cell, all that is then needed to prepare the DNA for amplification is dilution of the lysate. This rapid approach has been shown not only to be sensitive down to the single-molecule level (Osborne et al. 2009), but also to be faster than traditional methods of DNA isolation. As probe detection occurs at endpoint, any inhibition can be remedied with additional cycles to increase single-strand production (Sirianni et al. 2016). Following amplification, the temperature is lowered to allow for probe hybridization. As the temperature is then slowly increased, the probes melt off at their respective melting temperatures, and the fluorescence of the reaction shifts as a result. By taking the derivative of the fluorescence as a function of temperature, we can generate a fluorescent signature that is easier to read (Rice et al. 2012). Because probes designed to be complementary to the sequence of one species will have a lower melting temperature on the divergent sequence of any other species, the fluorescent signature generated will be unique for each species. This fluorescent signature can then be compared to a library of fluorescent signatures. If a match is found, there is no need for sequencing, as the species can be identified solely by its fluorescent signature. If no match is found, the amplicon can be sequenced with Dilute’N’Go sequencing, which simplifies the preparation and sequencing of amplicons (Jia et al. 2010). Its fluorescent signature is then added to the library for future reference. Closed-Tube Barcoding has already been used to identify five commercially available and beneficial species of nematodes in a proof-of-concept experiment (Rice et al. 2014). A single set of ten Lights-On/Lights-Off probes and primers was used alongside LATE-PCR to generate a unique fluorescent signature for each species (Rice et al. 2014).
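The two computational steps described above, taking the derivative of fluorescence with respect to temperature and comparing the result to a library, can be sketched as follows. This is only an illustration under stated assumptions: the melt curves are synthetic sigmoids rather than real Lights-On/Lights-Off data, the matching criterion (Pearson correlation with an arbitrary threshold of 0.95) is an assumption and not the method used in the Wangh laboratory, and all function and species names are hypothetical.

```python
import math


def signature(temps, fluor):
    """dF/dT by central differences (one-sided differences at the two ends)."""
    n = len(temps)
    if n != len(fluor) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    deriv = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        deriv.append((fluor[hi] - fluor[lo]) / (temps[hi] - temps[lo]))
    return deriv


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def best_match(query, library, threshold=0.95):
    """Return (name, r) for the best library hit, or (None, r) if below threshold."""
    name, r = max(((k, pearson(query, v)) for k, v in library.items()),
                  key=lambda kv: kv[1])
    return (name, r) if r >= threshold else (None, r)


def synthetic_melt(temps, tm):
    """Made-up sigmoid melt curve with midpoint tm, used only as test data."""
    return [1.0 / (1.0 + math.exp((t - tm) / 2.0)) for t in temps]


temps = [25 + i for i in range(61)]  # 25 to 85 degrees C in 1-degree steps
library = {
    "species_X": signature(temps, synthetic_melt(temps, 50)),
    "species_Y": signature(temps, synthetic_melt(temps, 65)),
}
# A query with the same shape as species_X but 10% lower overall signal.
query = signature(temps, [0.9 * f for f in synthetic_melt(temps, 50)])
name, r = best_match(query, library)
print(name, round(r, 3))
```

Correlation is used here because it ignores overall signal intensity, which can vary between reactions; a sample whose best correlation falls below the threshold would be the kind of sample sent on for sequencing.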

As previously shown in biodiversity and bio-assessment studies, DNA barcoding may be faster than traditional methods, but its cost may still be too high. This is mainly because more sequencing is performed than is necessary; a large number of the specimens in the iBOL database end up being repeats of known sequences (barcodeoflife.org). This is not only unnecessary, but may also hinder the widespread adoption of barcoding. Using Closed-Tube Barcoding to pre-screen samples will allow for more rapid and cheaper identification than traditional DNA barcoding methods. Closed-Tube Barcoding is not a substitute for DNA barcoding, as it does not identify specific nucleotides, but is rather a tool for rapidly identifying samples at low cost in biodiversity and bio-assessment studies (Sirianni et al. 2016).

To prove the efficacy of Closed-Tube Barcoding, further questions need to be answered. First, how wide a range of species can a single set of Lights-On/Lights-Off probes and a pair of primers cover? A single set of probes was shown to cover five species of nematodes, but the true range of a single set of probes is unknown. Knowledge of this range is essential to illustrating the capabilities of the technology. The second question, which is central to biodiversity studies, is whether Closed-Tube Barcoding can be used to detect a specific species of interest against a background of many other related species. Oftentimes what is important in such studies is the detection of a single pathogenic organism within a multitude of other species. The ability of this technology to detect pathogenic species among other species has important implications for human health.

Closed-Tube Barcoding of Naegleria

In order to answer these questions, a Closed-Tube Barcoding assay was constructed for detection of CO1 sequence variants within the genus Naegleria. Naegleria are unicellular protists that can be found in freshwater and permanently wet soil all over the earth (Fulton, 1970; Fulton, 1993) and are characterized by the ability to change from amoebae into swimming flagellates (Fulton, 1970). The Naegleria genus is divided into around 40 species, but morphological differentiation of these species is almost impossible (Robison et al. 1992).

The Naegleria genus was chosen for two main reasons. The first is that the genus is tremendously diverse (Fulton, 1970). The fact that these protists are found all the way from the temperate continents to Antarctica raises many questions about the movement and range of this genus. Can a single set of primers and probes cover and accurately distinguish the members of a genus that has been thought to be as diverse as the tetrapods (Fulton, 1993)? Demonstrating the ability to characterize the Naegleria genus not only has important implications for the taxonomic distance that a single set of primers and probes can cover in other groups of organisms, but also allows Naegleria species to be identified and characterized, providing more information about their widespread presence across the world.

The second reason is Naegleria fowleri, a pathogenic member of the Naegleria genus commonly known as the “brain-eating amoeba”, for which infection is almost always fatal (Grace et al. 2015). Naegleria fowleri, like most other Naegleria species, lives in warm freshwater. However, humans can be exposed to this particular species through many recreational water activities. It is believed that infection occurs when water enters the nasal cavity, from which the amoeba moves along the olfactory nerve to the olfactory bulb. Once the amoeba is within the olfactory bulb, activation of the innate immune system through macrophages and neutrophils ultimately results in primary amebic meningoencephalitis (John, 1982). Although Naegleria fowleri infection is difficult to detect and almost always fatal, it is thankfully relatively rare (Grace et al. 2015). There are, however, increasing fears that Naegleria fowleri will move northward with climate change.

Currently established detection methods rely on traditional culturing of possible Naegleria samples on plates, followed by morphological analysis to identify Naegleria, and finally molecular methods to detect the presence of N. fowleri. These methods are effective, but they are time consuming and require a variety of different techniques (Streby et al. 2015). A few molecular assays specific for the detection of Naegleria fowleri have been developed, but mainly for clinical diagnostics (Reveiller et al. 2002; Qvarnstrom et al. 2006; Ahmad et al. 2011). A study that evaluated the efficacy of these assays for environmental samples found that, of the currently available molecular assays, only two are sufficient for environmental samples. These two assays suffer from amplification and specificity issues respectively: the first has an amplification efficiency of 76.8%, and the second amplifies the related species Willaertia magna (Streby et al. 2015). In addition, these studies have not measured levels of N. fowleri in the presence of other Naegleria species, a likely scenario when testing water samples for N. fowleri. Thus, it is important that molecular tests for the presence of N. fowleri be improved. The identification and characterization of this species can provide a wealth of information on the ecological niche and activity of a species that is both genetically and physiologically divergent from most other Naegleria species (Fritz-Laylin et al. 2010).

Thus, the goals of this study have been twofold. The first goal was to design and construct a Closed-Tube Barcoding assay that could distinguish all species within the highly diverse genus Naegleria. This assay had to be rapid, sensitive, highly specific, easy to use, quantitatively accurate, and low cost. The second goal was to use this Naegleria assay to identify the pathogenic amoeba Naegleria fowleri in environmental samples.

Materials and Methods

The majority of the experiments used LATE-PCR and Lights-On/Lights-Off probes to generate and optimize fluorescent signatures. The assay targeting the entire Naegleria genus will be referred to as the Naegleria-Assay and the assay specific to N. fowleri will be referred to as the Fowleri-Assay. The general PCR protocol will be outlined below, followed by an in depth examination of the choice and amounts of reagents used in the optimization of the assays.

Final PCR Composition

For both assays, each sample was run at a total volume of 25 µL. Each sample contained the following: 1X PCR Buffer, 3mM MgCl2, 50nM of each ON Probe, 150nM of each OFF Probe, 0.2µM dNTPs, 0.24X SYBR, 50nM limiting primer, 1µM excess primer, 1.25 units Taq, and 2µL of Naegleria DNA. The remainder of the sample was made up of ddH2O.

The Fowleri-Assay differed from the general Naegleria-Assay in the addition of 1.25 units ThermaStop™ and 50nM ThermaMark™ (each of two strands), and in the use of Syd Taq instead of Platinum Taq. Otherwise, the two assays used the same PCR reagents and concentrations.

Primer Design

Primers that had been previously generated in a study comparing the sequence relatedness of different Naegleria strains were used for the general Naegleria-Assay (Broekman, 2011). Those primers were derived from the sequence of the NEG-M (Naegleria gruberi) strain, with a few mismatches to increase binding affinity to differing sequences. This previous study established that these primers, in combination with a low annealing temperature for the first five cycles of PCR, were able to amplify almost every Naegleria species, as well as Willaertia magna and Tetramitus rostratus, species from distantly related genera of protozoans.

One exception to the above statement is Naegleria fowleri itself. Due to the extraordinary CO1 sequence difference between Naegleria fowleri and the rest of the Naegleria species, preliminary results indicated that the standard Naegleria-Assay primers were not able to bind and amplify Naegleria fowleri efficiently. As such, new primers that are specific to Naegleria fowleri and do not pick up other Naegleria species were designed.

In the experiments described here, the concentration of the limiting primer was 50nM and the concentration of the excess primer was 1µM. Because lowering the concentration of the limiting primer causes its melting temperature (Tm) to decrease, the length and composition of the limiting primer were deliberately adjusted to guarantee that the limiting primer Tm and the excess primer Tm abide by the LATE-PCR rule that $T_{m0}^{L} - T_{m0}^{X} \geq 0$, where $T_{m0}^{L}$ and $T_{m0}^{X}$ are the initial melting temperatures of the limiting and excess primers (Sanchez et al. 2004). The primer sequences and melting temperatures are shown below.

Table 1: Sequences and Melting Temperatures of Fowleri-Assay and Naegleria-Assay Primers, Assuming Matched Sequence

Assay Primer Sequence (5’ to 3’) Tm0 (°C)

Fowleri-Assay N. fowleri EP TCCCCTCCTCCTACTGGATCATAGAAAGAAGTATTGAAATTTC 70

Fowleri-Assay N. fowleri LP GGTAATGCCTATCCTTTTTGGTGGGTTCGGTAAT 72

Naegleria-Assay NEG EP TACTGGGTCATAGAAAGAAGTATTAAAATTAC 65

Naegleria-Assay NEG LP AGTAATGCCAATCTTATTTGGAGGATTTGGTAAC 69

Note: EP, excess primer; LP, limiting primer. Melting temperature for Naegleria-Assay is for NEG-M (Naegleria gruberi), and melting temperature for Fowleri-Assay is for Naegleria fowleri.
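As a quick arithmetic check, the LATE-PCR rule can be applied to the Tm0 values listed in Table 1. The following is a minimal sketch only: the function and dictionary names are hypothetical, and the Tm0 values are simply copied from Table 1 (matched-sequence values).

```python
# Tm0 values (°C) copied from Table 1; all names here are illustrative only.
PRIMER_TM0 = {
    "Fowleri-Assay": {"limiting": 72.0, "excess": 70.0},
    "Naegleria-Assay": {"limiting": 69.0, "excess": 65.0},
}


def satisfies_late_pcr_rule(tm_limiting, tm_excess):
    """Return True when Tm0(limiting) - Tm0(excess) >= 0."""
    return (tm_limiting - tm_excess) >= 0


for assay, tm in PRIMER_TM0.items():
    delta = tm["limiting"] - tm["excess"]
    ok = satisfies_late_pcr_rule(tm["limiting"], tm["excess"])
    print(f"{assay}: Tm0(L) - Tm0(X) = {delta:+.1f} °C, rule satisfied: {ok}")
```

Both primer pairs satisfy the rule for their matched targets; as the following paragraph notes, the true Tm0 values for most other strains are unknown.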

For the general Naegleria-Assay, the initial Tm0 of the limiting primer and the Tm0 of the excess primer could not be determined for the majority of the strains tested, because the sequence to which the primers would bind was unknown. These strains had been sequenced using the same consensus primers. As a result, the sequences generated all contain the same complementary consensus primer sequence, which is not necessarily indicative of their true sequence.

Taq Polymerase

Taq Polymerase was utilized in both assays at 1.25U. Platinum Taq, a “hot start” DNA polymerase, was utilized in the general Naegleria-Assay. Platinum Taq was originally utilized in the Fowleri-Assay as well. However, due to cost considerations, Platinum Taq was replaced first by Invitrogen Taq and then by Syd Taq in the Fowleri-Assay.

ThermaStop™

ThermaStop™ is a hot-start-like reagent that prevents mispriming and increases the specificity of Taq Polymerase. It consists of a single-stranded DNA sequence with quenchers on both ends. The single strand forms a hairpin that sequesters Taq and prevents Taq activity at lower temperatures. Because ThermaStop™ activity results from self-hybridization of the DNA into a hairpin, ThermaStop™ retains its activity and function following PCR. Therefore, unlike other hot-start reagents, which remain inactive once the temperature has been raised, ThermaStop™ can resume its activity after the temperature drops again and the hairpin re-forms. This allows for increased specificity and suppression of non-specific priming, which is important for the Fowleri-Assay.

ThermaStop™ Concentration

ThermaStop™ was used within the Fowleri-Assay at a concentration of 1.25U. ThermaStop™ was not used within the general Naegleria-Assay because the two assays differ in annealing temperature. The general Naegleria-Assay begins with 5 cycles at an annealing temperature of fifty degrees, whereas the Fowleri-Assay has an annealing temperature of sixty-two degrees. ThermaStop™ is active at fifty degrees; thus, it would most likely inhibit the ability of the primers to anneal to their targets and result in decreased yield. A depiction of the structure of ThermaStop™ is shown below.

Figure 1: ThermaStop™ structure. Yellow circles represent fluorophores and blue squares represent quenchers.

ThermaMark™

ThermaMark™ is an internal marker that allows for normalization between experiments and rules out the effects of evaporation. ThermaMark™ consists of two complementary strands of DNA, one with a fluorophore on the 5’ end and a cap at the 3’ end, and the other containing a quencher on the 3’ end. The strand with the fluorophore acts as a Lights-ON probe, and the strand with the two quenchers acts as a Lights-OFF probe to quench the signal. As the exact sequence and the melting temperature of ThermaMark™ are known, it generates a characteristic fluorescent valley at its melting temperature. The location of the fluorescent valley can be compared between experiments and used to normalize fluorescent signature results.

ThermaMark™ also possesses characteristics and functions similar to those of ThermaStop™. However, because its melting temperature and structure differ from those of ThermaStop™, it has been found to enhance the specificity of Taq and prevent mis-priming at higher temperatures. For the Fowleri-Assay, specificity is key, and preventing amplification of any other related Naegleria species is essential. Many different concentrations and variations of ThermaMark™ were tested within the assay. One variation fluoresced in the Quasar channel and possessed a higher melting temperature; it will be referred to as High-Tm ThermaMark™. Another variation fluoresced in the Cal-Red channel and possessed a lower melting temperature; it will be referred to as Low-Tm ThermaMark™.

Figure 2: ThermaMark™ structure. Yellow circles represent fluorophores and blue squares represent quenchers. Gray square represents a cap.

ThermaMark™ Concentration

After thorough testing of the two types of ThermaMark™, the Low-Tm ThermaMark™ was ultimately used at 50nM within the Fowleri-Assay. The Low-Tm ThermaMark™ was not tested within the general Naegleria-Assay due to time constraints.

Probe Design

A set of nine Lights-ON/Lights-OFF probes was designed to coat the amplicon of the CO1 gene. A Lights-ON probe consists of a fluorophore at one end and a quencher at the other end, and produces a peak in fluorescence at its melting temperature. A Lights-OFF probe carries a quencher at each end, and produces a valley in fluorescence at its melting temperature. The probes were forty nucleotides long in order to minimize signal fluctuation that may not be representative of species. The fluorescence from the ON probes was viewed in the Quasar channel. To maximize signal fluctuation from the interaction between Lights-ON and Lights-OFF probes, Lights-OFF probes are usually designed to have a lower melting temperature than the corresponding Lights-ON probes. However, as the probes used here bind to a variety of different targets with different melting temperatures, this rule could not always be followed. Nine probes were used to increase the chance of fluorescent signature differences between samples.

Since the probes needed not only to bind to each Naegleria sample but also to produce a unique fluorescent signal for each species, the sequence of the probes was paramount to the success of the study. To address the extensive variation within the Naegleria genus, a large number of tentative probe sequences were generated.

Each tentative probe was designed to complement a different Naegleria strain, specifically a strain that differed in sequence from other species over the respective stretch of forty nucleotides. Once a set of nine probes had been created using these criteria, the probes were tested using VISUAL OMP software (DNA Software), which provided information on binding free energy and melting temperature. Melting temperatures between probes and Naegleria sequences were exported to an Excel file, where they were color coded by range. Every 5-degree range corresponded to a specific color, creating an easily distinguishable pseudo-color map of the melting temperature range. For each probe segment of forty nucleotides, many different probe sequences were tested against the variety of Naegleria species to maximize melting temperature differences between species. Probes that resulted in unique melting temperatures between as many samples as possible were retained, whereas probes that produced similar melting temperatures between species were rejected. Mismatches were added to certain sequences to generate greater variance in melting temperature between species. This process was repeated until a single set of nine probes resulted in a unique “heat map signature” for each species (Sirianni et al. 2016).
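The color-coding step can be illustrated with a short sketch. This is not the spreadsheet procedure actually used. It is a hypothetical Python equivalent that assigns each predicted Tm to one of the temperature ranges given in the legend of the pseudo-color map (Figure 6), and the example Tm values are invented for illustration.

```python
# Upper edges (°C) of the Tm ranges, following the legend of the pseudo-color
# map in Figure 6. Labels are plain strings standing in for colors.
BIN_EDGES = [37, 42, 47, 52, 57, 62, 67, 72]
BIN_LABELS = ["25-37", "37.1-42", "42.1-47", "47.1-52",
              "52.1-57", "57.1-62", "62.1-67", "67.1-72"]


def tm_bin(tm):
    """Map a predicted probe-target Tm (°C) to its range label."""
    if tm < 25:
        return "<25"  # below the useful range: scored as probe dropout
    for edge, label in zip(BIN_EDGES, BIN_LABELS):
        if tm <= edge:
            return label
    return ">72"


def signature(tms):
    """Return the tuple of range labels for one species (one Tm per probe)."""
    return tuple(tm_bin(t) for t in tms)


# Hypothetical Tm values for two species across nine probes (NOT study data).
species_a = [70.1, 66.3, 58.0, 61.5, 49.9, 55.2, 63.0, 68.4, 59.7]
species_b = [70.1, 60.2, 58.0, 44.0, 49.9, 55.2, 20.5, 68.4, 59.7]

print(signature(species_a))
print(signature(species_b))
print("distinguishable:", signature(species_a) != signature(species_b))
```

Two species are predicted to be distinguishable in this scheme when their label tuples differ for at least one probe.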

Table 2: ON/OFF Probe Sequences

ON/OFF Sequence (5’ to 3’) Concentration

Probe 1 ON TGGTGCACCAGATATGTCTATTCCAAGATTAAATAATTTT 50nM

Probe 2 OFF AGTTTTTGGTTATTACCAGCTGCTATTTTATTAGCTGTTT 150nM

Probe 3 ON TATCTACATACTCAGAAGAAGGTCCAGGAACAGGATGGAC 50nM

Probe 4 ON ATTATATCCACCATTATCTTCATTACAGTCTCATTCTGGA 50nM

Probe 5 OFF GCTTCAATTGATTTATTGATTTTTAGTTTTCATTTAGTAG 150nM

Probe 6 ON GAATAGGATCAATTGTAGCAGGTATTAATTTTATTTGTAC 50nM

Probe 7 ON AATTTTCTATTATAAAAATGAAGCTATGTTTAATAAAGAT 50nM

Probe 8 OFF TTACCATTATTTGTATGGTCAGTTGCAGTAACATCTTTTT 150nM

Probe 9 ON TAGTAATAGTAGCAATACATGTATTAGCAGCAGCTATTAC 50nM

Note: Quencher used was BHQ II. Fluorophore used was Quasar 670.

Figure 3: Diagram of Probe binding to amplicon with respective fluorophores and quenchers. Yellow circles represent fluorophores and blue squares represent quenchers.

The probes designed above were also tested in silico against a variety of different eukaryotes to observe how well the Naegleria probes would fare at picking up a wide range of eukaryotic species. The species listed below represent a wide variety of eukaryotes selected from an analysis of Naegleria’s evolutionary history (Fritz-Laylin et al. 2010).

Table 3: List of Other Tested Species In-Silico

Species Accession Number (GenBank)

Plasmodium reichenowi HM000117

Plasmodium gallinaceum AB250690

Plasmodium malariae AB250690

Leishmania tarentola M10126

Euglena gracilis AF160864

Euglena gracilis U49052.1

Reclinomonas americana NC_001823

Paramecium bursaria 5J905152

Paramecium sexaurelia FJ905147

Pfiesteria piscicida AF463412.2

Thalassiosira punctigera GQ844276

Porphyra lanceolata JN028584

Porphyra umbilicalis JN028584

Chlamydomonas reinhardtii NC_001638.1

Physarum polycephalum AB027295.1

Dictyostelium discoideum D50297.1

Neurospora crassa X14669

Saccharomyces castellii NC_003920

Saccharomyces cerevisiae NC001224

Blastocladiella emersonii DQ287690

Hydra magnipapillata BN001180

Hydra sinensis JF951861

Tetrahymena pyriformis AF160864

Tetrahymena thermophila AF396436

Trypanosoma equiperdum DQ401129.1

Probe Concentration

Probes were initially used at 100nM of ON probe and 300nM of OFF probe. ON probes consisted of a Quasar 670 fluorophore and a Black Hole Quencher II. OFF probes consisted of two Black Hole Quencher II modifications. Due to differences in amplitude between samples as well as cost factors, the probe concentrations were halved to 50nM of ON probe and 150nM of OFF probe. These concentrations were chosen because it is of utmost importance that the reaction runs in target excess, meaning an excess of the target template rather than an excess of ON probes. With a large amount of target and a low amount of ON probes, the fluorescent curve will have a similar amplitude regardless of the amount of DNA template. As the DNA samples we acquire can have variable concentrations of DNA, it is important for the reaction to be in target excess. In target excess, however, some probes may bind to one target strand while other probes bind to a separate target strand. OFF probes that bind to a target strand other than the one bound by the ON probes cannot turn off the fluorescence of those ON probes. Thus, the OFF probe concentration of 150nM is intended to ensure that the OFF probes coat all strands and can turn off any ON probe located on any strand. As such, we expect similar fluorescent amplitudes for a single sequence, regardless of the amount of starting DNA.

Other PCR Reagents:

In both assays, SYBR was utilized at 0.24X, dNTPs at 0.2µM, Invitrogen Buffer at 1X, and MgCl2 at 3mM.

Naegleria Target DNA:

Naegleria samples were acquired from a previous study that examined sequence similarity between different Naegleria species (Broekman, 2011). These samples had been prepared and lysed using QuantiLyse, a lysis buffer containing Proteinase K (Pierce et al. 2002). The strains and species tested are shown in Table 5 below.

Due to the pathogenic nature of Naegleria fowleri, no DNA isolates were available. To generate a synthetic Naegleria fowleri CO1 sequence, the known sequences of Naegleria fowleri strains V419 (GenBank: KX580903.1) and V511 (GenBank: KX580902.1) were used as a template to design a synthetic oligonucleotide. The sequence was acquired as a gBlocks gene fragment (IDT), alongside a gBlocks gene fragment for strain NEG-M (Naegleria gruberi) to serve as a control. The synthetic N. fowleri sequence and the NEG-M target sequence are shown in Table 4 below.

Table 4: Target Sequences

N. fowleri Target Sequence

TATTAATGCTTTTCGTCGTGGTAATGCCTATCCTTTTT
GGTGGGTTCGGTAATTATTTTGCACCTATTCTAATAG
GAGCACCTGATATGTCTTTTCCTAGATTAAATAATTTT
AGTTTTTGGTTATTACCGGGTGCTATTTTATTGGCTGT
TTTGGCTACTTATTCTGAAGGAGGTCCAGGTACAGGT
TGGACAGTATACCCTCCATTATCTTCTCTACAATCTCA
CTCAGGTTCAAGTGTAGATTTAATGATATTTAGCTTTC
ATTTAGTAGGTATCGGTTCAATAATAGCAGCTATAAA
TTTTATATGTACTATTTTTTATTATAAAAATGAAGCAA
TGTTTAATAAAGACCTGCCCTTATTCGTATGGTCGGT
AGCAATTACTTCGTTTTTAGTAATAGTAGCAATACCA
GTGTTAGCCGCAGCAATAACATTATTATTATTCGATA
GAAATTTCAATACTTCTTTCTATGATCCAGTAGGAGG
AGGAGATGTGGTT

NEG-M Target Sequence

AGTAATGCCAATCTTATTTGGAGGATTTGGTAACTA
TTTTGTCAATTTTAATTGGTGCACCAGATATGTCTTTT
CCAAGATTAAATAATTTTAGTTTTTGGTTATTGCCTGG
TGCTATTTTGTTAGCTGTATTAGCTACTTATTCAGAAG
GAGGACCAGGTACAGGATGGACAGTATATCCACCAT
TGTCTTCTTTACAATCTCACTCAGGAGCAAGCGTAGA
TTTAATGATATTTAGCTTTCACTTAGTAGGTATTGGAT
CTATCGTAGCAGCTATTAACTTCATTTGTACAATTTTT
TATTATAAAAATGAAGCAATGTTTAATAAAGACTTAC
CATTATTTGTTTGGTCTGTAGCAGTTACTTCTTTTTTA
GTAATTGTAGCGTAATTTTAATACTTCTTTCTA

Note: Bolded and underlined bases indicate limiting primer and excess primer binding sites respectively.

Table 5: Tested Naegleria Strains and Species

Strain Species

NEG-H Naegleria gruberi

NEGc1 Naegleria gruberi

BR6 Naegleria gruberi

H1C2 Naegleria gruberi

1518/14 Naegleria clarkei

1518/27 Naegleria canariensis

NG885 Naegleria fultoni

WT043 Naegleria minor

1518/7 Naegleria pagei

EDF258 Naegleria pussardi

1518/1F Naegleria schusteri

1518/26 Naegleria tenerifensis

1518/21 Naegleria tihangensis

CDCV419 Naegleria dunnebacki

NG408 Naegleria antarctica

AV500 Naegleria galeacystis

NB1 Naegleria pringsheimi

HA59 Naegleria australiensis

Note: All samples other than H1C2 were prepared from previously lysed Naegleria (Broekman, 2011). H1C2 was prepared in PrimeStore.

Target DNA Concentration

Due to varying amounts of Naegleria present within each sample, the amount of DNA differed between samples. This makes choosing a standard dilution very difficult, as the same dilution can result in vastly different amounts of DNA. For samples within the general Naegleria-Assay, a 1:10 dilution of sample into Tris was used, based on a dilution series for the NEG-H control (N. gruberi).

For the Fowleri-Assay, different concentrations of N. fowleri, ranging from ten copies to 10^7 copies, were used to assess sensitivity and reproducibility.
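To illustrate the range of template amounts, the sketch below lists the copy numbers in a ten-fold serial dilution from 10^7 copies down to ten copies. The ten-fold step is an assumption made for illustration, as the intermediate concentrations are not stated here, and the function name is hypothetical.

```python
def serial_dilution(start_copies, end_copies, fold=10):
    """List copy numbers from start_copies down to end_copies in fold steps.

    The ten-fold default is an assumption for illustration only.
    """
    if fold <= 1 or end_copies <= 0 or start_copies < end_copies:
        raise ValueError("need fold > 1 and start_copies >= end_copies > 0")
    series = []
    copies = start_copies
    while copies >= end_copies:
        series.append(copies)
        copies //= fold  # integer division keeps whole copy numbers
    return series


levels = serial_dilution(10**7, 10)
print(levels)
print(len(levels), "dilution levels")
```

Under this assumption the series spans seven levels, that is, six orders of magnitude of starting template.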

Thermal Cycle

There were many different combinations of temperatures and times tested within each assay. For the Naegleria-Assay, five cycles in which the annealing temperature was dropped to fifty degrees were necessary to successfully pick up species that were not as closely matched to the primers. Only five cycles were necessary at the lower temperature, as the resulting amplicons are perfectly matched to the primers. For the Fowleri-Assay, the primers were unique to Naegleria fowleri and specificity was desired, so the five cycles at a lower annealing temperature were not used.

The melt for both assays started at 25 degrees and held for 39 seconds at each temperature before increasing by one degree. The final thermal cycles for each assay are shown below.

Table 6: Thermal Cycle

Naegleria-Assay

Step Temp (°C) Time Cycles

Denature 95 3 min 1

Denature 95 5 sec 5

Anneal 50 10 sec 5

Extension 72 45 sec 5

Denature 95 5 sec 60

Anneal 62 10 sec 60

Extension 72 45 sec 60

Melt 25-90 39 sec every degree

Fowleri-Assay

Step Temp (°C) Time Cycles

Denature 95 3 min 1

Denature 95 5 sec 75

Anneal 62 10 sec 75

Extension 72 45 sec 75

Melt 25-90 39 sec every degree
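For reference, the two thermal profiles in Table 6 can be written as a small data structure, from which the total programmed hold time follows. This is a rough sketch only: the variable and function names are hypothetical, instrument ramp times are ignored, and it assumes that the melt holds at every whole degree from 25 to 90 inclusive.

```python
# Each stage is (number of cycles, [(step name, temp in °C, hold in seconds)]).
# Values are taken from Table 6; the names are illustrative.
CYCLE = [("denature", 95, 5), ("anneal", 62, 10), ("extension", 72, 45)]
LOW_STRINGENCY = [("denature", 95, 5), ("anneal", 50, 10), ("extension", 72, 45)]

NAEGLERIA_ASSAY = [(1, [("denature", 95, 180)]), (5, LOW_STRINGENCY), (60, CYCLE)]
FOWLERI_ASSAY = [(1, [("denature", 95, 180)]), (75, CYCLE)]
MELT = {"start": 25, "end": 90, "hold_s": 39}


def amplification_time_s(stages):
    """Sum of programmed hold times over all cycles (ramp times excluded)."""
    return sum(cycles * sum(sec for _, _, sec in steps) for cycles, steps in stages)


def melt_time_s(melt):
    """Hold time of the melt, assuming one hold per whole degree, inclusive."""
    n_steps = melt["end"] - melt["start"] + 1
    return n_steps * melt["hold_s"]


for name, stages in [("Naegleria-Assay", NAEGLERIA_ASSAY), ("Fowleri-Assay", FOWLERI_ASSAY)]:
    total = amplification_time_s(stages) + melt_time_s(MELT)
    print(f"{name}: about {total / 60:.0f} min of programmed holds")
```

Actual run times will be longer, since the instrument also needs time to ramp between temperatures.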

Final PCR Composition and Setup

For both assays, each sample was run at a total volume of 25 µL. Each sample contained the following: 1X PCR Buffer, 3mM MgCl2, 50nM of each ON Probe, 150nM of each OFF Probe, 0.2µM dNTPs, 0.24X SYBR, 50nM limiting primer, 1µM excess primer, 1.25 units Taq, and 2µL of Naegleria DNA. The remainder of the sample was made up of ddH2O.

The Fowleri-Assay differed from the general Naegleria-Assay in the addition of 1.25 units ThermaStop™ and 50nM ThermaMark™, and in the use of Syd Taq instead of Platinum Taq. Otherwise, the two assays used the same PCR reagents and concentrations.

PCR setup began with the creation of the Mastermix in a separate PCR room with sterile equipment. All reagents other than the DNA were added to the Mastermix, which was then separated into labeled tubes for each triplicate. The NTC tubes were separated in triplicate and capped inside the PCR room. The other tubes were taken outside the PCR room to avoid contamination. Appropriate DNA samples were added to these tubes, followed by vortexing to thoroughly mix the sample. These tubes were similarly separated into their technical replicates (in triplicate). The final PCR tubes were spun down to pull down any liquid stuck to the sides and then placed within the qPCR machine. PCR was done in a Stratagene Mx3005P Sequence Detector (Stratagene, La Jolla, CA). A diagram of the PCR setup is shown below.

Figure 4: Mastermix Preparation
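The amount of Mastermix needed for a run follows from the 25 µL reaction volume and the 2 µL of DNA added per reaction. The sketch below is illustrative only: the function name is hypothetical, and the 10% pipetting overage is an assumption that is not stated in the protocol above.

```python
REACTION_UL = 25.0  # total reaction volume, from the PCR composition above
DNA_UL = 2.0        # template volume added outside the PCR room


def master_mix_volume_ul(n_samples, replicates=3, overage=0.10):
    """Mastermix volume (µL) for n_samples run in technical replicates.

    n_samples should include the NTC. The overage fraction is an assumed
    pipetting allowance, not a value taken from the protocol.
    """
    per_reaction = REACTION_UL - DNA_UL  # 23 µL of Mastermix per reaction
    n_reactions = n_samples * replicates
    return per_reaction * n_reactions * (1 + overage)


# Example: three DNA samples plus one NTC, each in triplicate.
print(round(master_mix_volume_ul(4), 1), "µL of Mastermix")
```

Setting `overage=0` gives the exact volume consumed by the reactions themselves.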

Results

In Silico Tests of Naegleria Consensus Probes Against Selected Eukaryotes

Professor Chandler Fulton and his colleagues recently published the sequence of the Naegleria gruberi genome (Fritz-Laylin et al. 2010). The summary of that study is quoted below. Figure 2 from their paper illustrates that Naegleria is an ancient genus that branches from the Heteroloboseans.

“Genome sequences of diverse free-living protists are essential for understanding eukaryotic evolution and molecular and cell biology. The free-living amoeboflagellate Naegleria gruberi belongs to a varied and ubiquitous protist clade (Heterolobosea) that diverged from other eukaryotic lineages over a billion years ago. Analysis of the 15,727 protein-coding genes encoded by Naegleria’s 41 Mb nuclear genome indicates a capacity for both aerobic respiration and anaerobic metabolism with concomitant hydrogen production, with fundamental implications for the evolution of organelle metabolism. The Naegleria genome facilitates substantially broader phylogenomic comparisons of free-living eukaryotes than previously possible, allowing us to identify thousands of genes likely present in the pan-eukaryotic ancestor, with 40% likely eukaryotic inventions. Moreover, we construct a comprehensive catalog of amoeboid-motility genes. The Naegleria genome, analyzed in the context of other protists, reveals a remarkably complex ancestral eukaryote with a rich repertoire of cytoskeletal, sexual, signaling, and metabolic modules” (Fritz-Laylin et al., 2010).

Figure 2: Consensus Cladogram of Selected Eukaryotes Consensus cladogram of selected eukaryotes relevant to our comparative analyses, highlighting six major groups with widespread support in diverse molecular phylogenies (Burki et al., 2008; Rodriguez-Ezpeleta et al., 2007 ; Yoon et al., 2008). The dotted polytomy indicates uncertainty regarding the order of early branching events. Representative taxa are shown on the right, with glyphs indicating flagellar and/or actin-based amoeboid movement. Although commonly referred to as “amoeboid,” Trichomonas does not undergo amoeboid locomotion. The inset depicts three contending hypotheses for the root. Root A: early divergence of unikonts and bikonts ( Stechmann and Cavalier-Smith, 2002). Root B: the largely parasitic POD lineage branching first, followed by JEH (including Naegleria) ( Ciccarelli et al., 2006). Root C: POD and JEH uniting to form the “Excavates” (Supplemental Information). The branches connecting Naegleria to humans are highlighted in green, with a black triangle indicating their last common ancestor.

Figure 5: Figure 2 from Fritz-Laylin et al. 2010

At the suggestion of Professor Fulton, I began my efforts to build an assay for Naegleria species by in silico alignment of the available CO1 sequences of many of the species listed in Figure 2 of Fritz-Laylin et al. (2010), followed by the design of nine Lights-On/Lights-Off probes for the Naegleria genus. I then calculated and compared the melting temperatures of each of these probes against the CO1 sequences of each of these distantly related eukaryotes (Figure 6). Only the CO1 sequence of N. gruberi hybridized to all of the probes. The CO1 sequence of Tetramitus rostratus hybridized to six of the nine probes within the useful temperature range, which extends down to 25°C. Probes that did not hybridize at 25°C or above are too mismatched to their target sequence and are scored as probe drop-outs. I concluded that the probe set that I designed based on the N. gruberi CO1 sequence was likely to be uniquely suited for analysis of the members of the Naegleria genus. The pseudo-color map of the probe-target melting temperatures is shown below.

[Pseudo-color map: rows are species and columns are Probes 1-9 (ON, OFF, ON, ON, OFF, ON, ON, OFF, ON). Cell colors, not reproduced here, encode probe Tm (°C) in the ranges >72, 67.1-72, 62.1-67, 57.1-62, 52.1-57, 47.1-52, 42.1-47, 37.1-42, 25-37, and <25. The final column gives the total Tm (°C) for each species.]

Species Total Tm (°C)

Trypanosoma equiperdum 49

Pfiesteria piscicida 59

Tetrahymena thermophila 65

Physarum polycephalum 78

Tetrahymena pyriformis 88

Euglena gracilis 93

Paramecium bursaria 109

Dictyostelium discoideum 110

Chlamydomonas reinhardtii 110

Neurospora crassa 125

Paramecium sexaurelia 135

Saccharomyces castellii 144

Porphyra lanceolata 157

Plasmodium reichenowi 162

Leishmania tarentola 165

Plasmodium malariae 168

Plasmodium gallinaceum 169

Saccharomyces cerevisiae 170

Hydra magnipapillata 225

Hydra sinensis 206

Blastocladiella emersonii 261

Porphyra umbilicalis 268

Thalassiosira punctigera 270

Reclinomonas americana 276

Tetramitus rostratus 365

Naegleria gruberi 487

Figure 6: Related eukaryotes pseudo-color map: The heat map shows that the probe set does not hybridize strongly to any of the more distantly related species. In fact, the Tm of most probes would theoretically be less than 25 degrees. A probe binding with a Tm of less than 25 degrees is called a probe dropout and is not useful for generating a fluorescent signature. These results suggest that the set of probes would be specific for the Naegleria genus.

In Silico Test of Naegleria Consensus Probes Against the Naegleria Genus

Construction of the Naegleria-Assay required a single set of probes that generates a unique fluorescent signature for each Naegleria species. To achieve this, a variety of tentative probe sequences, each complementary to a different Naegleria strain, were generated. The Tm of each probe/target pair was calculated using VISUAL OMP and exported to Excel. A pseudo-color map was created with specific temperature ranges corresponding to specific colors. Probes whose probe-target hybrids fell in the upper or lower ranges for most of the species were deemed uninformative and replaced with probes that bound in the middle of the color scale. Similarly, probes that had similar melting temperatures for every Naegleria species were rejected. This process was repeated until a set of consensus probes was obtained that resulted in a unique pseudo-color map for each species.
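The rejection of uninformative probes can be sketched as a simple filter on the spread of predicted melting temperatures across species. This is a hypothetical illustration rather than the procedure actually used, which relied on visual inspection of the pseudo-color map. The Tm values are invented, and the 5-degree threshold (the width of one color range) is an assumption.

```python
# Hypothetical predicted Tm values (°C) per probe and species. NOT study data.
PREDICTED_TM = {
    "probe_A": {"species_1": 61.0, "species_2": 60.5, "species_3": 61.2},
    "probe_B": {"species_1": 66.0, "species_2": 54.0, "species_3": 43.5},
}


def tm_spread(tms_by_species):
    """Difference between the highest and lowest predicted Tm for one probe."""
    values = list(tms_by_species.values())
    return max(values) - min(values)


def informative_probes(predicted, min_spread=5.0):
    """Keep probes whose Tm differs across species by at least min_spread.

    min_spread is an assumed threshold, chosen as the width of one color range.
    """
    return [name for name, tms in predicted.items() if tm_spread(tms) >= min_spread]


print(informative_probes(PREDICTED_TM))
```

In this toy example the first candidate probe binds every species at nearly the same temperature and is rejected, whereas the second is retained.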

Figure 7: Pseudo-color map of the Naegleria genus: The figure above displays the pseudo-color map for the nine consensus probes that were designed for the Naegleria genus (Sirianni et al, 2016). These probes were tested against different strains of Naegleria, along with related amoeboflagellates Willaertia magna and Tetramitus rostratus. The color map above shows that there is a unique color map for each different species of Naegleria, predicting a unique fluorescent signature for each different species.

Naegleria-Assay Fluorescent Signatures

The results presented thus far are based on software-dependent analysis of melting temperatures for each of the nine probes analyzed separately. Fluorescent signatures, in contrast, are empirically derived curves that result from the composite melt curves of all probes in the set. As the fluorescent signature comprises the composite interactions of all ON and OFF probes, a fluorescent signature is much more informative than a pseudo-color map. The resultant shape, curvature, and steepness of peaks and valleys are all characteristic of a unique fluorescent signature. The in silico results above indicate that a single set of probes can be used to cover a single diverse genus.

Following the development of a set of consensus probes that were hypothesized to bind and characterize the different species of Naegleria, these probes were run using LATE-PCR (see Materials and Methods) with sixteen different Naegleria strains and species in order to test whether each CO1 sequence displayed a unique fluorescent signature. Four of these test strains were classified as Naegleria gruberi because their CO1 sequences differed by only 1 or 2 nucleotides. The other twelve strains had been previously identified from their sequences as type strains for different Naegleria species.

All Naegleria strains except H1C2 were lysed in QuantiLyse by Ronit Kaufman and had been stored frozen since that time (Kaufman, 2011). An aliquot from each stock was diluted 1:10 prior to adding 2µL to the amplification reaction. Strain H1C2 was prepared in 2016 by lysis in PrimeStore® and was then diluted 1:30 to dilute out the lysis reagent. Each amplification was run for 5 thermal cycles at an annealing temperature of 50 degrees (low stringency) followed by 60 cycles at an annealing temperature of 62 degrees. Depending on the sample, the single-stranded DNA in each reaction began to accumulate anywhere from 27-50 cycles. At the end of the reaction the temperature was lowered to 25 degrees in order to hybridize the Lights-On/Lights-Off probes to the target sequence. The probes were then melted off to generate a fluorescent contour, which was then converted into a first derivative fluorescent signature (Sanchez et al. 2003). As samples other than H1C2 dated from 2011, DNA quality may have been affected by an unknown number of freeze-thaw cycles. All reactions were run in duplicate or triplicate to demonstrate their reproducibility. The results are shown in the plots below.
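The conversion of a melt contour into a first derivative signature can be sketched numerically. The code below is a generic central-difference estimate of dF/dT, not the software actually used in this study. The function name and the example readings are hypothetical, and some analysis packages plot the negative derivative instead.

```python
def first_derivative(temps, fluorescence):
    """Central-difference estimate of dF/dT at each interior temperature.

    Returns a list of (temperature, dF/dT) pairs.
    """
    if len(temps) != len(fluorescence) or len(temps) < 3:
        raise ValueError("need two equal-length lists with at least 3 points")
    derivs = []
    for i in range(1, len(temps) - 1):
        d_f = fluorescence[i + 1] - fluorescence[i - 1]
        d_t = temps[i + 1] - temps[i - 1]
        derivs.append((temps[i], d_f / d_t))
    return derivs


# Hypothetical readings at one-degree steps (NOT study data).
temps = [25, 26, 27, 28, 29, 30, 31]
fluor = [100.0, 101.0, 104.0, 110.0, 113.0, 112.0, 108.0]

for temp, slope in first_derivative(temps, fluor):
    print(f"{temp} °C: dF/dT = {slope:+.2f}")
```

Peaks and valleys in the resulting curve correspond to the temperatures at which ON and OFF probes melt off the target.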

Figure 8: N. gruberi

Figure 9: N. gruberi

Figure 10: N. gruberi

Figure 11: N. gruberi: H1C2 DNA was prepared with PrimeStore

Figure 12: N. clarki

Figure 13: N. canariensis

Figure 14: N. fultoni

Figure 15: N. pagei

Figure 16: N. minor

Figure 17: N. schusteri

Figure 18: N. tenerifensis

Figure 19: N. antarctica

Figure 20: N. galeacystis

Figure 21: N. pringsheimi

Figure 22: N. australiensis

Figure 23: N. tihangensis

Figure 24: N. dunnebacki dissociation curve: No data points indicating Tm of peaks and valleys were added due to differences between replicates. Only two replicates amplified.

The resultant fluorescent signatures clearly show that each Naegleria species tested possesses its own unique signature. In Figures 12 and 13, N. clarki (1518/14) and N. canariensis (1518/27), which possess four base pair differences, seem to have very similar fluorescent signatures. However, when they are overlapped it is clear that there is a shift in their peaks and valleys. N. clarki and N. canariensis are shown together below (Figure 26).

It was found that even within a species, different strains can be differentiated from one another. For example, NEG-H and NEGc1 possess a single base change, but that change is visible on their characteristic fluorescent signatures (Figure 25).

Figure 25: N. gruberi total curve: Different strains of Naegleria gruberi are plotted. Technical replicates for each strain were averaged.

Figure 26: A comparison of two similar signatures N. clarki and N. canariensis

Fluorescent signatures were not obtained for Naegleria pussardi (EDF258). The SYBR green amplification plots indicate that within 50 cycles the reactions did not have enough time to make single-stranded DNA (Figure 27). These samples from 2011 either had so little DNA that they took a very long time to start amplification or, more likely, their sequences were so poorly matched to the primers that amplification started very inefficiently, if at all.

Figure 27: Amplification plot of EDF258: EDF258 did not have enough time to make single-stranded DNA, resulting in a lack of a fluorescent signal.

Fowleri-Assay

The results above demonstrate that a single set of probes can distinguish between both closely and distantly related species of Naegleria. However, the regular NEG primers used for amplification of Naegleria are too mismatched to the Naegleria fowleri CO1 sequence to initiate amplification. For that reason, a new set of N. fowleri-specific primers was designed in order to selectively amplify this species (See Materials and Methods). As N. fowleri is the only pathogenic species of Naegleria, it was deemed important to be able to detect N. fowleri under the conditions in which it would be found in nature. As such, it was also important to be able to differentiate N. fowleri from other similar species of Naegleria that may be present in a water sample. The following results detail the important experiments leading to the final assay.

Testing Naegleria-Assay Conditions with the Fowleri-Assay

The conditions used for the Naegleria-Assay were initially used as a baseline for the Fowleri-Assay. The Naegleria-Assay protocol was tested with synthetic oligonucleotides of Naegleria fowleri, Naegleria gruberi (NEG-M), and Naegleria americana (5c1), and with PrimeStore-prepared Naegleria gruberi (H1C2). The synthetic oligonucleotides were used at 10^3 copies each, and the PrimeStore-prepared Naegleria gruberi DNA was used at a 1:30 dilution. The thermal cycle was the same as in the Naegleria-Assay, with 5 initial cycles at a 50-degree annealing temperature followed by 50 cycles at a 62-degree annealing temperature. Instead of the Naegleria-Assay primers, which do not amplify N. fowleri, newly designed N. fowleri primers were used (See Table X). The results are shown below.

Figure 28: Fowleri-Assay protocol with 5 cycles at fifty degrees: Quasar dissociation curve is shown on the right and the FAM amplification plot is shown on the left. Both fowleri and H1C2 were amplified.

It was demonstrated that the Naegleria fowleri sample amplified and produced a distinct fluorescent signal in Quasar. However, the PrimeStore®-prepared Naegleria gruberi (H1C2) also showed amplification and a distinct fluorescent signal. Amplification of this sample but not of the synthetic Naegleria gruberi (NEG-M) suggests that the H1C2 sample contains more copies of the CO1 gene than the NEG-M sample.

Increasing the Stringency of the Fowleri-Assay

The previous experiment showed that the Naegleria-Assay protocol is not stringent enough to specifically amplify Naegleria fowleri while discriminating against other species of the Naegleria genus. Therefore, the assay protocol was changed to increase the stringency of the reaction. The 5 initial cycles at an annealing temperature of 50 degrees were removed, and the assay was set to 50 cycles at an annealing temperature of 62 degrees. The same samples as in the previous experiment were used at the same concentrations (10^3 copies), and the results are shown below.

Figure 29: Fowleri-Assay protocol without 5 cycles at fifty degrees: Quasar dissociation curve is shown on the right and FAM amplification plot is shown on the left.

It was demonstrated that removing the 5 initial cycles at 50 degrees was sufficient to suppress amplification of related Naegleria species. The previously amplified Naegleria gruberi sample (H1C2) was no longer amplified in this higher-stringency assay.

Increasing the Copy Number of Related Naegleria Species

The previous experiment demonstrated that 10^3 copies of Naegleria gruberi (NEG-M) and Naegleria americana (5c1) are not amplified by the high-stringency Fowleri-Assay. Therefore, higher copy numbers of Naegleria gruberi and Naegleria americana were tested to determine whether increased copy numbers would be picked up by the assay. A dilution series with 10^7, 10^5, and 10^3 copies of NEG-M and 5c1 was performed. The PrimeStore®-prepared Naegleria gruberi sample (H1C2) was not included, as any sample diluted less than 1:30 has been shown to inhibit PCR. The results are shown below.

Figure 30: Fowleri-Assay protocol with increased copy numbers of NEG-M and 5c1: Quasar dissociation curve is shown on the left and FAM amplification plot is shown on the right. No samples other than the positive control, Naegleria fowleri, showed amplification.

It was demonstrated that even at up to 10^7 copies of Naegleria gruberi (NEG-M) and Naegleria americana (5c1), there was no amplification or fluorescent signature produced by the Fowleri-Assay. One note is that reproducibility among the technical replicates of the Naegleria fowleri sample seemed particularly poor.

Testing ThermaMark™ Concentrations

In order to further reduce any mispriming and allow for normalization between different experiments, ThermaMark™ was added to the Fowleri-Assay. ThermaMark™ was added at 25 nM, 50 nM, and 75 nM to 10^5 copies of N. fowleri. The results are shown in the plot below.

Figure 31: FAM amplification plots for different ThermaMark™ concentrations

The results indicate that 25 nM ThermaMark™ is not inhibitory to the reaction, but any concentration of ThermaMark™ above 25 nM increasingly inhibits the PCR.

Testing Different versions of ThermaMark™

The High-Tm ThermaMark™ used at 50 nM has been shown to be inhibitory to the Fowleri-Assay. Attempts were made to decrease the concentration of ThermaMark™, but concentrations below 50 nM do not produce a noticeable enough valley. Therefore, a lower-Tm ThermaMark™ in Cal Red was tested instead. Its melting temperature is lower than that of the previously used ThermaMark™, which may cause it to interfere less with the polymerase and reduce inhibition. This will be referred to as Low-Tm ThermaMark™. A dilution series with 10^5, 10^4, 10^3, 10^2, and 10^1 copies of DNA was tested with No ThermaMark™, Low-Tm ThermaMark™, and High-Tm ThermaMark™. In addition, new aliquots of DNA were prepared in an attempt to improve reproducibility among technical replicates. The results are shown in the plots below.

Figure 32: Amplification plots of different dilutions of –TM/Low Tm TM/High Tm TM: 10^5 copies of TM (top left) down to 10^1 copies of TM (bottom left). At 10^5 to 10^2 copies of DNA, the presence of High-Tm TM inhibits the reaction, whereas Low-Tm TM has much less of an inhibitory effect. At 10 copies of DNA, the CTs of the samples were on top of each other. Bottom right and bottom left show the Low-Tm TM and High-Tm TM, respectively.

It was found that the Low-Tm ThermaMark™ was significantly less inhibitory than the High-Tm ThermaMark™. Whilst the High-Tm ThermaMark™ clearly delayed the Ct’s, the Low-Tm ThermaMark™ had hardly any effect when compared to the No ThermaMark™ sample. At lower copy numbers of DNA, the difference between the ThermaMark™ samples decreased. At 10 copies of DNA, the Ct’s of the three different samples were virtually identical. The Low-Tm ThermaMark™ curve is shown on the bottom right in contrast to the High-Tm ThermaMark™ on the bottom left. Although the High-Tm Quasar ThermaMark™ has greater amplitude, these samples also suffered from PCR inhibition. The Low-Tm Cal-Red ThermaMark™ provides a clear marker to rule out evaporation. Reproducibility among technical replicates was also improved by the preparation of new aliquots of DNA. This Low-Tm ThermaMark™ was used for all following experiments.

Dilution series in a background of 10^5 copies of synthetic Naegleria gruberi DNA

This experiment was done with a dilution series of Naegleria fowleri at 10^5, 10^4, 10^3, 10^2, and 10 copies in a background of 10^5 copies of synthetic NEG-M DNA. This was done to test the sensitivity and specificity of the assay in picking up low amounts of N. fowleri in a background of other Naegleria species. The assay was run with and without ThermaMark™ in order to assess possible inhibitory effects of ThermaMark™ on the assay. The results are shown in the plots below.

Figure 33: Dilution series +/- ThermaMark™ in a background of 10^5 copies of NEG-M (gruberi): The background of Naegleria gruberi DNA did not have a significant effect until ten copies of fowleri DNA. At ten copies, the characteristic valley between 45 and 55 degrees was less defined.

Dilution series in a background of real Naegleria gruberi DNA

The final assay was tested in order to determine the reliability and specificity of the Fowleri-Assay in picking up Naegleria fowleri DNA within a background of real Naegleria samples prepared in PrimeStore®. This experiment was done with a dilution series of Naegleria fowleri at 10^5, 10^4, 10^3, 10^2, and 10 copies in a background of real Naegleria gruberi DNA. This was intended to test whether the presence of other Naegleria DNA would be picked up or would interfere with the amplification of Naegleria fowleri. The reaction was run with and without ThermaMark™ in order to determine whether ThermaMark™ would have a negative effect on the assay. The results are shown below.

Figure 34: Fowleri-Assay Dilution Series (10^5 copies to 10 copies) of Naegleria fowleri and 10^5 copies of Naegleria gruberi (H1C2) with and without ThermaMark™: Quasar dissociation curves of +/- TM are shown in the top two figures, the FAM amplification plot is shown on the bottom left, and the melt curve for the TM sample is shown on the bottom right.

This experiment demonstrated that all dilutions of Naegleria fowleri amplified successfully and had clear dissociation curves that were representative of the Naegleria fowleri species. There was no evidence or signal that 10^5 copies of Naegleria gruberi DNA had any effect on the amplification or fluorescent signal, even at lower copy numbers. There was no negative effect of ThermaMark™ on the reaction, as the CT values for the dilution series with and without ThermaMark™ are right on top of each other. The melt curve shows the production of a single product, indicating no nonspecific amplification of other products.

A standard curve was generated for this experiment in order to assess PCR efficiency. A standard curve plots the CT values against the log of the copy number, and PCR efficiency can be derived from the slope of the line. Ideal PCR efficiency levels are between 90% and 110%. The results are shown in the plots below.

Figure 35: Standard curve of the dilution series performed in Figure 34: The equation used for the calculation of PCR efficiency is the following: $\text{Efficiency} = 100\,(-1 + 10^{-1/m})$, where $m$ is the slope of the standard curve.
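The efficiency calculation in the caption above can be sketched as follows. The sketch assumes the standard curve is an ordinary least-squares line of CT against log10 copy number; the function names and the CT values in the usage example are hypothetical and are not the data plotted in Figure 35.

```python
def fit_slope(log10_copies, ct_values):
    """Least-squares slope of CT (y) against log10 copy number (x)."""
    if len(log10_copies) != len(ct_values) or len(log10_copies) < 2:
        raise ValueError("need at least two paired points")
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, ct_values))
    return sxy / sxx


def pcr_efficiency_percent(slope):
    """Efficiency = 100 * (-1 + 10 ** (-1 / m)), with m the standard-curve slope."""
    return 100.0 * (-1.0 + 10.0 ** (-1.0 / slope))


if __name__ == "__main__":
    # Hypothetical dilution series: 10^5 down to 10^1 copies.
    log10_copies = [5, 4, 3, 2, 1]
    ct_values = [22.1, 25.6, 29.2, 32.7, 36.3]
    m = fit_slope(log10_copies, ct_values)
    print(round(m, 3), round(pcr_efficiency_percent(m), 1))
```

A slope of about -3.32 cycles per tenfold dilution corresponds to perfect doubling in every cycle (100% efficiency); shallower slopes give values above 100% and steeper slopes give values below it.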

Taken together, these results demonstrate that addition of 50 nM ThermaMark™ increases both the specificity (more precise replicates) and the efficiency of the reaction, from 86.9% without ThermaMark™ to 91.3% with ThermaMark™. Moreover, in the presence of 50 nM ThermaMark™ the assay is efficient down to low copy numbers even in a vast excess of non-fowleri Naegleria DNA.

Discussion

Naegleria-Assay

The Naegleria-Assay was able to amplify and produce a unique fluorescent signature for eighteen out of the twenty tested strains. Out of these twenty tested strains, NEG-H, BR6, H1C2, and NEGc1 were Naegleria gruberi, with the remaining strains each representing a separate Naegleria species (see Table 5). Naegleria pussardi (EDF258) did not produce a fluorescent signature because it did not amplify in the given number of cycles; this is most likely due to poor primer binding during the initial five cycles of amplification. This problem could be remedied by lowering the annealing temperature even further in these cycles or by increasing the length or number of cycles to further reduce assay stringency.

These results suggest that a single set of nine probes and primers can rapidly amplify and screen virtually all species within the diverse Naegleria genus. These fluorescent signatures can be compiled to form a “library” of fluorescent signatures. This assay can increase the speed of detecting and categorizing the many species of Naegleria present all around the world. Fluorescent signatures generated from any new sample could be compared with those in the library, rapidly assigning samples to certain species. In the case of unknown signatures, these samples can then be sequenced via Dilute-N’-Go sequencing (Jia et al. 2010) and then added to the library. Similar assays can be designed for rapid screening of any other microscopic eukaryotic genera.
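One way such a library comparison could work is sketched below. It is a hypothetical illustration, not a method used in this thesis: it assumes each signature is stored as a list of first derivative values sampled on the same temperature grid, uses the Pearson correlation as the similarity measure, and applies an arbitrary cutoff (`min_corr`) below which a sample is treated as unknown and would be sent for sequencing.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length signatures."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("signatures must share the same temperature grid")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a flat signature cannot be correlated")
    return cov / math.sqrt(var_a * var_b)


def best_match(query, library, min_corr=0.95):
    """Return (name, r) of the closest library entry, or (None, r) if too weak."""
    best_name, best_r = None, -1.0
    for name, signature in library.items():
        r = pearson(query, signature)
        if r > best_r:
            best_name, best_r = name, r
    if best_r < min_corr:
        return None, best_r  # unknown signature: candidate for sequencing
    return best_name, best_r


if __name__ == "__main__":
    # Hypothetical signatures; the names are placeholders, not thesis data.
    library = {
        "species A": [0.0, 1.0, 0.0, -1.0, 0.0],
        "species B": [1.0, 0.0, -1.0, 0.0, 1.0],
    }
    print(best_match([0.0, 2.0, 0.0, -2.0, 0.1], library))
```

Because correlation ignores overall amplitude, this measure tolerates the amplitude fluctuations seen between replicates, but closely related signatures that differ only by a small temperature shift may need a stricter cutoff or a direct comparison of peak and valley temperatures.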

Fowleri-Assay

The Fowleri-Assay is sensitive, specific, and efficient. Figure 34 demonstrates that even ten copies of the CO1 sequence are detected and amplified. At ten copies, all three technical replicates amplified with no difference between fluorescent signatures. The fluorescent signatures are consistent between 10^5 copies and 10^1 copies. There are slight fluctuations in the amplitude of the signatures, but the shape of the curve is consistent.

These samples were all run in a background of a 1:30 dilution of H1C2 (Naegleria gruberi), which is the lowest dilution possible with PrimeStore® without inhibiting the reaction. Although the CO1 copy number in H1C2 is unknown, Figure 28 suggests that the H1C2 sample has greater than 1000 copies, as a 1:30 dilution of H1C2 (Naegleria gruberi) is amplified whereas 1000 copies of NEG-M (Naegleria gruberi) are not.

Figure 33 demonstrates that this assay also runs in the presence of 100,000 copies of synthetic NEG-M (Naegleria gruberi). At copy numbers above 10^1, there is no change in fluorescent signatures. The primers do not amplify or pick up 10^7 copies of the most similar known Naegleria sequence, demonstrating the specificity of the assay (Figure 30). Currently, this assay provides this information in 2.5 hours. This is a rapid diagnostic assay in a single tube that can determine the presence of Naegleria fowleri without the need for sequencing, which is expensive, labor intensive, and slower. Finally, as shown in Figure 35, the Fowleri-Assay is both highly reproducible and efficient (91.3%) when 50 nM ThermaMark™ is added to the reaction mixture.

The sensitivity and specificity of this assay in picking up a single pathogenic species of Naegleria while discriminating against other species has enormous potential in diagnostics in the field. This assay raises the possibility of testing water samples for Naegleria fowleri while discriminating against any other species of Naegleria that are present within the sample. Preparation of Naegleria using PrimeStore® is accomplished in a single step that lyses and inactivates biological samples while preserving nucleic acids. This will make it possible to collect and test water samples and cerebrospinal fluid samples for the presence of N. fowleri without the need to culture Naegleria on plates. These improvements will decrease both the time and the cost of detecting N. fowleri in critically ill patients, as well as in very large numbers of water samples collected in the field.

The Fowleri-Assay exhibits its own distinctive fluorescent signature that can readily be distinguished from all other fluorescent signatures for Naegleria species tested thus far.

As there can be intraspecies differences in CO1 sequence, this assay also provides a way to categorize any new fowleri strains, each with its own distinctive fluorescent signature.

Challenges

One of the problems with DNA Barcoding as a whole is the presence of intraspecies sequence divergence in the CO1 gene. This intraspecies divergence can cause the same species to have populations with slightly different fluorescent signatures. As these differences are often indicative of geographical differences, multiple fluorescent signatures may be needed to characterize a single species.

Future Directions

In terms of the Naegleria-Assay, there is still more to be done. The strains that did not amplify, EDF258, will need to be repeated with an increased number of thermal cycles to obtain fluorescent signatures from these two species. The Naegleria-Assay thermal cycling protocol still needs to be optimized to determine whether a decrease in the annealing temperature during the first five cycles is sufficient to pick up these distantly related strains of Naegleria. The addition of a Low-Tm ThermaMark™ should also be tested in the Naegleria-Assay, to determine whether the use of ThermaMark™ inhibits the amplification of DNA that is not perfectly matched to the primers. In addition, the melt temperature should be increased in half-degree rather than whole-degree steps, in order to increase the resolution of the temperatures of specific peaks and valleys. To improve fluorescent signatures, each of the three ON-OFF-ON probe triplets can be assigned a different color. In this three-color assay, the temperature range is used three times, which allows for more informative signatures to differentiate between closely matched sequences.

In terms of the Fowleri-Assay, this assay still needs to be tested with real Naegleria fowleri DNA. Because N. fowleri is pathogenic, a synthetic gBlocks target was used, but real N. fowleri DNA is needed to establish the efficacy of the assay. In addition, the thermal cycle can be decreased from 75 cycles to 60 cycles, as even at ten copies, fowleri can be detected. ThermaGo™, which is a reagent that increases the specificity of Taq similarly to ThermaMark™, can be added to the reaction to increase the specificity even further.

Conclusion

Naegleria are a diverse group of freshwater amoebae found throughout the world. The Naegleria-Assay was developed in order to rapidly characterize different Naegleria species around the world without the need for sequencing. Currently, this assay can differentiate between every Naegleria species tested, but will need to be optimized and possibly changed to a three-color assay to differentiate between other Naegleria species.

The opportunistic pathogen Naegleria fowleri, found primarily in freshwater in the Southern United States, is implicated in primary amoebic meningoencephalitis with infections almost always leading to death. The Fowleri-Assay can detect down to 10 copies of Naegleria fowleri in a background of other Naegleria species, serving as a possible test to detect fowleri in bodies of water.

These assays provide quick and affordable tests for the detection and mapping of Naegleria species around the world, providing insight into a diverse genus that is largely unknown.

References

Azpurua J., De La Cruz D., Valderama A., Windsor D. 2010. Lutzomyia sand fly diversity and rates of infection by Wolbachia and an exotic Leishmania species on Barro Colorado Island, Panama. PLoS Negl. Trop. Dis., 4(3):e627

Barcode of Life. (2016). Barcodeoflife.org. Retrieved 9 December 2016, from http://www.barcodeoflife.org/content/about/what-ibol

Bickford, D., Lohman, J. D., et al. 2007. Cryptic species as a window on diversity and conservation. Trends in Ecology and Evolution, 22(3):148-155

Blaxter, M. 2016. Imagining Sisyphus happy: DNA barcoding and the unnamed majority. Philosophical Transactions Of The Royal Society B: Biological Sciences, 371(1702): 20150329

Cameron S., Rubinoff D., Will K. 2006. Who will actually use DNA barcoding and what will it cost? Systematic Biology, 55(5):844–847

Ceballos G., Ehrlich P.R., Barnosky A.D., García A., Pringle R.M., Palmer T.M. 2015. Accelerated modern human–induced species losses: Entering the sixth mass extinction. Science Advances. 1(5):e1400253

Desalle, R., Egan M. G., Siddall, M. 2005. The unholy trinity: taxonomy, species delimitation and DNA barcoding. Phil. Trans. R. Soc. B., 360:1905–1916

Hebert, P., Stoeckle, M., Zemlak, T., and Francis, C. 2004. Identification of Birds through DNA Barcodes. Plos Biology, 2(10), e312.

Folmer, O., Black, M., Hoeh, W., Lutz, R. & Vrijenhoek, R. 1994. DNA primers for amplification of mitochondrial cytochrome c oxidase subunit I from diverse metazoan invertebrates. Mol. Mar. Biol. Biotechnol., 3:294–299

Fritz-Laylin LK, SE Prochnik, et al. 2010. The Genome of Naegleria gruberi illuminates early eukaryotic versatility. Cell, 140: 631-642.

Fulton, C. 1970. Amebo-flagellates as research partners: the laboratory biology of Naegleria and Tetramitus. Methods Cell Physiol. 4: 341–476.

Fulton, C. 1993. Naegleria: A research partner for cell and developmental biology. J. Eukaryot. Microbiol., 40: 520–532

Giovannoni, S., Britschgi, T., Moyer, C., & Field, K. 1990. Genetic diversity in Sargasso Sea bacterioplankton. Nature, 345(6270): 60-63.

Grace, E., Asbill, S., Virga, K. 2015. Naegleria fowleri: Pathogenesis, Diagnosis, and Treatment Options. Antimicrobial Agents and Chemotherapy, 59(11): 6677-6681.

Hartshorn, C., Anshelevich, A., Wangh, L. 2005. Rapid, single-tube method for quantitative preparation and analysis of RNA and DNA in samples as small as one cell. BMC Biotechnol. 5:2

Hebert, P., Penton, E., Burns, J., Janzen, D., & Hallwachs, W. 2004. Ten species in one: DNA barcoding reveals cryptic species in the neotropical skipper butterfly Astraptes fulgerator. Proceedings Of The National Academy Of Sciences, 101(41): 14812-14817.

Hebert, P. D. N., A. Cywinska, et al. 2003. Biological identifications through DNA barcodes. Proceedings of the Royal Society of London Series Biological Sciences, 270(1512): 313-321.

Holland SM. 2016. Ecological disruption precedes mass extinction. Proceedings of the National Academy of Sciences of the United States of America, 113(30):8349-8351.

Iwata, S., Ostermeier, C., Ludwig, B., Michel, H. 1995. Structure at 2.8 Å resolution of cytochrome c oxidase from Paracoccus denitrificans. Nature, 376:660-669.

Jia Y., Osborne A., Rice J.E., Wangh L. 2010. Dilute-“N”-Go dideoxy sequencing of all DNA strands generated in multiplex LATE-PCR assays. Nucleic Acids Research, 38(11):e119

John, D.T. 1982. Primary Amebic Meningoencephalitis and the Biology of Naegleria fowleri. Ann. Rev. Microbiol., 36: 101-23.

Krishnamurthy, K., Francis, R. 2012. A critical review on the utility of DNA barcoding in biodiversity conservation. Biodiversity And Conservation, 21(8), 1901-1919.

Ladoukakis E., Zouros E. 2017. Evolution and inheritance of animal mitochondrial DNA: rules and exceptions. Journal of Biological Research, 24:2.

Mayr, Ernst. 1942. Systematics and the Origin of Species, from the Viewpoint of a Zoologist. New York: Columbia Univ.

Meyer, A., T. D. Kocher, P. Basasibwaki, and A. C. Wilson. 1990. Monophyletic origin of Lake Victoria cichlid fishes suggested by mitochondrial DNA sequences. Nature, 347:550–553.

Mora, C., Tittensor, D.P., Adl, S., Simpson, A., Worm, B. 2011. How many species are there on earth and in the ocean? PLoS Biology, 9(8): e1001127

Pace, N. 1997. A Molecular View of Microbial Diversity and the Biosphere. Science, 276(5313):734-740.

Pecnikar, Z., Buzan, E. 2014. 20 years since the introduction of DNA barcoding: from theory to application. Journal of Applied Genetics, 55:43–52

Pentinsaari, M., Salmela, H., Mutanen, M., & Roslin, T. 2016. Molecular evolution of a widely-adopted taxonomic marker (COI) across the animal tree of life. Scientific Reports, 6(1)

Pierce, K., Sanchez, J., Rice, J.E., and Wangh, L. 2005. Linear-After-The-Exponential (LATE)-PCR: Primer design criteria for high yields of specific single-stranded DNA and improved real-time detection. Proc. Natl. Acad. Sci. U.S.A., 102: 8609–8614.

Pierce, K., Rice, J. E., Sanchez, J., Wangh, L. 2002. QuantiLyse™: reliable DNA amplification from single cells. BioTechniques, 32: 1106-1111.

Pimm, S., Jenkins, C., Abell, R., Brooks, T., Gittleman, J., & Joppa, L. et al. 2014. The biodiversity of species and their rates of extinction, distribution, and protection. Science, 344: 997-1007.

Rice, J.E, Reis, A.H., Jr., Rice, L.M., Carver-Brown, R.K., and Wangh, L.J. 2012. Fluorescent signatures for variable DNA sequences. Nucleic Acids Res. 40(21): e164.

Rice, L.M., Reis, A.H., Jr., and Wangh, L.J. 2014. Virtual barcoding using LATE-PCR and Lights-On/Lights-Off probes: identification of nematode species in a closed-tube reaction. Mitochondrial DNA, 27(2): 1358–1363.

Rivera, J. Currie, D. 2009. Identification of Nearctic black flies using DNA barcodes (Diptera: Simuliidae). Molecular Ecology Resources, 9:224-236.

Robinson, B.S., Christy, P., Hayes, S.J., and Dobson, P.J. 1992. Discontinuous genetic variation among mesophilic Naegleria isolates: further evidence that N. gruberi is not a single species. J. Protozool., 39: 702–712

Rubinoff, D. 2006. Essays: Utility of Mitochondrial DNA Barcodes in Species Conservation. Conservation Biology, 20(4), 1026-1033.

Sanchez, J.A., Pierce, K.E., Rice, J.E., and Wangh, L.J. 2004. Linear-After-The-Exponential (LATE)-PCR: an advanced method of asymmetric PCR and its uses in quantitative real-time analysis. Proc. Natl. Acad. Sci. U.S.A., 101(7): 1933–1938.

Sirianni, N., Huijun, Y., Rice, J., Kaufman, R., Deng, J., Fulton, C., Wangh, L. 2016. Closed-Tube Barcoding. Genome, 59(11): 1049-1061.

Stein, E., Martinez, M., Stiles, S., Miller, P., Zakharov, E. 2014. Is DNA Barcoding Actually Cheaper and Faster than Traditional Morphological Methods: Results from a Survey of Freshwater Bio-assessment Efforts in the United States. Plos ONE, 9(4): e95525.

Stoeckle M.Y., Hebert P. 2008. Barcode of life: DNA tags help classify animals. Sci. Am., 298(10):39–43

Streby A., Mull B.J., Levy K., Hill V.R. 2015. Comparison of real-time PCR methods for the detection of Naegleria fowleri in surface water and sediment. Parasitology Research, 114(5): 1739-1746.

Veron, J. 2008. Mass Extinctions And Ocean Acidification: Biological Constraints On Geological Dilemmas. Coral Reefs, 27(3): 459-472

Waugh, J. 2007. DNA barcoding in animal species: progress, potential and pitfalls. BioEssays, 29: 188-197.

Weigand, A., Jochum, A., Pfenninger, M., Steinke, D., Klussmann-Kolb, A. 2011. A new approach to an old conundrum—DNA barcoding sheds new light on phenotypic plasticity and morphological stasis in microsnails (Gastropoda, Pulmonata, Carychiidae). Molecular Ecology Resources, 11: 255-265.
