DESIGN AND CHARACTERIZATION OF CANDIDATE SINGLE-STRANDED DNA APTAMERS THAT BIND TO AND DETECT BACTEROIDES FRAGILIS TOXIN SUBTYPES BFT-1 AND BFT-2

By Payam Fathi

A thesis submitted to Johns Hopkins University in conformity with the requirements for the degree of Master of Science

Baltimore, Maryland April 2017

ABSTRACT

Enterotoxigenic Bacteroides fragilis (ETBF) is a diarrheal pathogen that secretes a small

family of 20 kDa metalloproteases that have been designated B. fragilis toxins (BFT).

BFTs bind to a currently uncharacterized receptor on the surface of epithelial cells

resulting in the cleavage of host molecules that mediate cell-cell adhesion leading to loss

of barrier integrity and chronic colitis. ETBF has also been associated with colorectal

cancer and inflammatory bowel disease. Currently there are no well characterized and

widely available molecular tools that are able to specifically label or inhibit BFTs. DNA

aptamers are a promising class of versatile oligomers that can be designed to specifically

bind to a vast array of target molecules. Herein, we identified and characterized ssDNA

aptamers that show enhanced binding to the BFT subtypes BFT-1 and BFT-2. Ultimately, these aptamers may provide a relatively inexpensive way to label and possibly inhibit

BFTs. A systematic evolution of ligands by exponential enrichment (SELEX) approach employing magnetic beads was used to generate aptamer pools with enriched binding specificity to BFTs. A SYBR Green-based real-time PCR assay was developed to monitor the enrichment of aptamer pools against BFT. Next generation sequencing of the aptamer pools on an Illumina MiSeq generated sequence data that were analyzed using publicly available graphical user interface (GUI)-based webservers. Sequence alignment and structure predictions were employed to identify and characterize eleven unique aptamers with promise for further workup. These experiments outline a lower-cost and technically simple approach for the generation of ssDNA aptamers against bacterial toxins.

Thesis readers: Cynthia L. Sears, MD and Alan L. Scott, PhD

PREFACE

I would like to acknowledge and thank my thesis advisor Dr. Cynthia L. Sears for all of

her effort and support during my time at Hopkins. Dr. Sears hired me as a technician in

her lab almost four years ago and has been a great mentor to me during my time in the

lab. She was instrumental in fostering my interest in the human gut microbiome, an area

of study which I hope to continue during my PhD years. I would like to also acknowledge

Dr. Shaoguang Wu for her help, guidance, and tutoring while I have been at the Sears

Lab. I am very appreciative of Cindy and Shaoguang entertaining my ideas and enthusiasm for conducting pilot studies in the lab. I am sure there are few, if any, places

where my opinions and thoughts would have held the same weight as they have in the

Sears Lab. The degree of independence that I have been allowed while working with

Cindy has taught me invaluable skills with respect to designing and conducting sound

science, and for that I am forever grateful. Additionally, I would like to thank Dr.

Christine Craig for her mentoring while she was at the Sears Lab. Thank you to Dr.

Franck Housseau for writing me an outstanding recommendation letter that helped me

gain acceptance to a PhD program. I would like to thank Dr. Alan L. Scott for serving as

my MMI advisor and for always providing helpful advice. I’d like to also thank my

colleagues in the Sears Lab for their help troubleshooting my experiments,

commiseration, and for dealing with my general pessimistic grumpiness.

TABLE OF CONTENTS

ABSTRACT
PREFACE
LIST OF TABLES
LIST OF FIGURES
INTRODUCTION
  A structurally analogous framework for the Gut Microbiome
  Bacteroides fragilis and the Human Gut Microbiome
  Medical and Public Health significance of ETBF
  Aptamers: A new-old method for the generation of molecules that bind to targets with high specificity
  Aptamers as antibody-like molecules “Aptabodies”
  Aptamers as toxin neutralizing compounds
  Purpose of Study
  Specific Aims
CHAPTER 2: Biotinylation of BFT
  Purpose
  Materials and Methods
    B. fragilis Toxin BFT-1 and BFT-2 Purification
    NHS-Sulfo Biotinylation Reaction
    Dot Blot Analysis of Biotinylated BFT-1 and BFT-2
    HT29/C1 Cell Rounding Assay
  Results
  Discussion
  Chapter 2 Figures
    Streptavidin HRP Dot Blot visualizes the Biotinylation of BFT
    Biotinylation of BFT-2 does not disrupt toxin activity
CHAPTER 3: Optimization of SELEX PCR Conditions
  Purpose
  Materials and Methods
    Random Library and Primers
    PCR reactions and optimization conditions
  Results
  Discussion
  Chapter 3 Figures
    PCR optimization results in robust amplification of the Aptamer Library
CHAPTER 4: Magnetic Bead SELEX for the Isolation of Aptamers Binding to BFT-1 and BFT-2

  Purpose
  Materials and Methods
    Buffer Preparation
    Preparation of Pre-selection and Selection M280 Streptavidin Beads
    First SELEX Cycle
    SELEX Cycles Two Through Ten
    FAM-labeled Aptamer Binding and FACS CALIBUR
    SYBR Green Real-Time PCR Assays
    (ii) BSA Blocking Real-Time PCR Assays
    (iii) BSA co-coat Real-Time PCR Assays
  Results
  Discussion
  Chapter 4 Figures
    Graphical Overview of SELEX Process
    Estimating binding capacity of biotinylated DNA amplicons to M280 Streptavidin Beads
    Microscopic Visualization of FAM-Labeled DNA Aptamers and BFT-M280 Complex
    FACS CALIBUR Flow Cytometry of FAM-labeled Aptamer pools BFT-M280 Complex
    Affinity of final SELEX round DNA Pools for Toxin detected by Real-Time SYBR Green PCR Assays
CHAPTER 5: Next Generation Sequencing of Aptamer Pools
  Purpose
  Materials and Methods
    Indexing Aptamer Pools for Next Generation Sequencing
    Adapting NEB NEXT Protocols for Aptamer NGS preparation
    NEBNext End Prep
    Illumina Sequencing Adaptor Ligation
    PCR Amplification and Illumina Barcoding
    Measuring Concentration of Sequencing Pools
    MiSeq Nano Chipset Preparation and Sequencing
  Results
  Discussion
  Chapter 5 Figures
    Barcode Indexing of SELEX DNA Pools
    NEBNext Illumina Sequencing Library Gel Extraction
    Illumina Sequencing Library Quantification
CHAPTER 6: Analysis of Sequence Data to Characterize BFT Binding Candidates
  Purpose

  Materials and Methods
    Post-processing of NGS Sequence reads with GALAXY
    Comparing aptamer copy numbers for candidate elucidation
    Motif Analysis of Aptamer Pools
    Statistical Analysis of Aptamer Pool Enrichment
    Clustal Omega Alignment of Sequences
    Secondary Structure Prediction of Enriched Sequences
  Results
  Discussion
  Chapter 6 Figures
    Aptamer Pools were not equally loaded when preparing sequencing samples
    Graphical Visualization of Multiplexed Sequence Reads
    Sequencing reads of later pools identify potentially enriched aptamer sequences
    Identical Aptamer Sequences present in BFT-1 and BFT-2 SELEX Pool 10 highlighted in green
    Analysis of high copy number sequences in BFT-1 SELEX Aptamer pools
    Prediction of Evolutionary Change in SELEX Sequences in BFT-1 Pool
    Analysis of high copy number sequences in BFT-2 SELEX Aptamer pools
    Motif Analysis of Aptamer Pools identifies enrichment of random-region sequence motifs
    Clustal Omega Alignment of high-copy number sequences shows high diversity in SELEX Pool 10 for BFT-1 and BFT-2
    SELEX round motif enrichment tracks relative changes in Aptamer Pools
    Fisher’s Exact Test statistics of motif enrichment in SELEX rounds
    Secondary Structure Analysis of High Copy Number Sequences
SUMMARY
FUTURE WORK
BIBLIOGRAPHY
CURRICULUM VITAE

LIST OF TABLES

Table 1: Demultiplexing Mixed Index DNA Pool Sequencing Read Output
Table 2: BFT Aptamer Sequencing Read Metrics
Table 3: Comparing BFT-1 and BFT-2 SELEX 10 Aptamer Populations
Table 4: Features of BFT-1 SELEX 10 Aptamers
Table 5: Features of BFT-2 SELEX 10 Aptamers
Table 6: Contingency tables and Fisher’s Exact Test p-values

LIST OF FIGURES

Figure 1: Dialyzed BFT-2 and Pre-biotinylated BFT-1 Biotin Labeling Dot Blot
Figure 2: HT29/C1 Cell BFT Toxin Assay
Figure 3: PCR optimization of SELEX conditions
Figure 4: Graphical Representation of SELEX cycle process
Figure 5: M280 Binding Capacity BFT SELEX PCR
Figure 6: FAM-Labeled BFT-2 Aptamer Microscopy
Figure 7: FAM-Labeled BFT-2 FACS Calibur Flow Cytometry
Figure 8: Aptamer Enrichment Monitoring Using Real-Time SYBR Green PCR
Figure 9: Aptamer Real-Time SYBR Green PCR with BSA Blocking
Figure 10: Aptamer Real-Time SYBR Green PCR with BSA Co-incubation
Figure 11: SELEX PCR Pool Index PCR
Figure 12: Aptamer Sequencing Libraries for NGS using Illumina MiSeq
Figure 13: Aptamer Sequencing Library Quantification
Figure 14: Graphical Representation of Aptamer PCR Indexing
Figure 15: Sequence Motifs in BFT Aptamer Pools
Figure 16: Phylogeny Analysis of BFT Aptamer Pools
Figure 17: Normalized Motif Tracking Across SELEX for BFT-1 and BFT-2
Figure 18: Capturing Potential Aptamer Sequence Evolution
Figure 19: Secondary Structure Analysis of BFT Aptamers

INTRODUCTION

A structurally analogous framework for the Gut Microbiome

Advances in sequencing techniques provide deeper insight as to how higher organisms interact with and are influenced by their microbial symbiotic communities.

The overused phrase “we are more bacteria than human” has evolved as a simplified shorthand for the complicated and dynamic interactions that are the result of an incredibly complex coevolution of mammals and their largely single-cell symbionts.

The current body of scientific literature has established the prominent role of an

organism’s microbiome in influencing the development and maturation of early immune

responses. More complex examples of this influence have been shown in studies in

model systems comparing the expression of bacterial communities in the gut,

including responses to dietary changes, and how these components result in immune

fluctuations. It is clear that the interactions between multi-cellular host organisms and

their microbiomes have profound effects on health and disease. This interaction is also

shown to be reciprocal: the homeostatic physiological state also affects the composition

of an organism’s microbiome. What we also know is that the microbiome is susceptible

to a large degree of natural variation across time and developmental stages of the host

organism. In summary, there exists a bi-directional influence between the environment

and the microbiome, the microbiome and the host, and lastly the environment and the

host. To better examine this web of complex interrelationships, there exists a need to

develop a theoretical model that allows for better understanding of these interactions.

There are a number of instances where the gut microbiome has been referred to as

an underappreciated “virtual organ”. This body of literature places the gut microbiome in

this context due to evidence of the synthesis of endocrine signaling molecules by gut

bacteria. The term ‘organ’ is used in this context to emphasize the critical

importance of microbiome contributions to host physiology.

It is my opinion that the word “organ” can be used to describe the gut microbiome

from a structural perspective. One can argue that an organ, simplified, is composed of two discrete components: a cellular (living) component and a matrix (supportive) component. The influences of the matrix component on cellular development have been clearly demonstrated. Studies examining the influence of extracellular matrices on cell

development have shown that cellular differentiation is dependent on mechanical stresses

and shear forces exerted on the cells interacting with the matrix. That is, the resultant

phenotype of a given group of cells is influenced by its interactions

with the local matrix environment. This same analogy can be extended to simplify the

interactions between the microbiome, host, and environment.

This framework proposes that the gut microbiome be contextualized with regard

to the interactions of the bacterial communities (cellular component) with the

environments that directly surround them (matrix component). The

environments that the bacteria directly contact can be simplified into either the luminal particulate matter, composed of incompletely digested material from the host diet, or the mucus layer secreted by the host to protect against direct bacterial contact with epithelial cells.

Using this analogy, we can simplify the interactions between diet and the microbiome. More directly stated: the dietary contents in the lumen influence bacterial gene expression (or species-specific colonization) by providing a structural matrix with which the bacteria can interact. This can be extended to bacterial components at the mucosal surface interacting with the host cell barrier. Mucosal surface and host cell barriers act as a scaffolding matrix that influences the gene expression and colonization of interacting bacteria. This context is also echoed in research showing that bacterial biofilms produce a matrix that drastically alters the genetic properties and population dynamics of the involved species, although in this case the matrix is produced by the bacteria rather than arising from interactions with environmental components.

Now let us assume that this structural context holds true. With this in mind, it is interesting to think of the effects of antibiotics, the most commonly used intervention for modifying host microbial communities. Simply stated, in this model, introduction of antibiotics has the effect of stripping away a portion of the cellular component, modifying the proportional cellular profile relative to a given matrix that is not affected by the antibiotic. While the net effect is elimination of the target, these therapies also cause a considerable level of secondary fallout that may result in subsequent negative

consequences for the host. Granted, the ability of the microbiome to recover following

antibiotic exposure is remarkable; however, this contextualization helps to put into

perspective the need to develop more specific modulators of bacterial populations by

considering their spatial structure within the host.

With this in mind, the experiments described below outline attempts to design and

characterize ssDNA aptamers that are able to bind the bacterial toxin, B. fragilis toxin or

BFT. Ultimately, the goal of such research is to introduce molecules that are able to

selectively bind bacteria or bacterial toxins and modify a very specific component of the microbial community residing within the host.

Bacteroides fragilis and the Human Gut Microbiome

Studies investigating the role of the human gut microbiome in maintaining

physiological homeostasis have seen a marked increase in prominence within the past

two decades. Earlier studies on the composition of gut microbial communities have

shown a remarkable amount of variation between individuals and have highlighted

drastic differences in the makeup of bacterial taxa that colonize the mucosal surface sites

when compared to the luminal contents of the intestines.1 These and earlier published

studies provided a foundation for a push to more thoroughly document and model the

human microbiome through the establishment of the Human Microbiome Project.2 As a

result of these efforts, and due in part to remarkable strides in sequencing technologies,

the diversity of microbiota within this compartment has been resolved to a high degree.

This information has shown that the Bacteroidetes are one of the consistently high

abundance phyla that make up the gut microbiome.3

Studies examining the effects of the microbiome on the host show an association

between levels of Bacteroidetes bacterial colonization and obesity in mice.4 Attempts to

recapitulate this observation in human populations have shown inconsistent results, with

a more recently published meta-analysis not finding a statistically significant change

across the studies with respect to Bacteroidetes colonization.5 The study does conclude

that a significant difference in the microbiome of obese versus healthy adults still exists

but specific changes are more difficult to clearly distinguish.6 A recently published

review on the human associated Bacteroidetes phylum describes the Bacteroides,

Porphyromonas, and Prevotella genera as dominant players in the human

gastrointestinal tract.5

The Bacteroides genus is composed of over twenty members as categorized

through 16S rRNA sequencing.7 A subset of this genus is designated as the Bacteroides fragilis group and contains 10 species that have been previously associated with

anaerobic bacteremia in humans.8 Other studies have suggested that members of the

Bacteroides fragilis group play an important role in the development of the immune

system, and in regulating metabolic status in the gut microbiome.9,10 Within this group

the namesake species Bacteroides fragilis sensu stricto has been characterized as both a pathogen and symbiotic bacterium depending on its ability to encode and secrete

Bacteroides fragilis toxin (BFT). This distinction has led to the classification of

Bacteroides fragilis s.s. strains as non-enterotoxigenic or nontoxigenic B. fragilis (NTBF) and enterotoxigenic B. fragilis (ETBF).11 ETBF has been established as a causative agent

of diarrheal disease, especially in underdeveloped regions.12 Additionally, higher rates of

ETBF bft gene presence have been detected in human mucosal samples from patients

with IBD and colorectal cancer.13

ETBF strains contain a six-kilobase B. fragilis pathogenicity island (BfPAI) in

which the gene encoding the B. fragilis toxin (bft) resides. The bft gene encodes a

pre-proprotein containing a signal sequence, prodomain, and active toxin domain.14 Three

subtypes of the toxin have been identified as zinc-dependent metalloproteases and called

BFT-1, BFT-2, and BFT-3. The toxin subtypes are encoded by alleles designated bft1,

bft2, and bft3, respectively. All of the ETBF isolates examined to date carry and express

only a single allelic variant of bft. BFT-2 has been shown to exhibit the greatest toxicity

in vitro.15 Secreted BFT binds to an as-yet uncharacterized receptor on the surface of

epithelial cells, resulting in a robust signaling cascade culminating in E-cadherin junction

cleavage, Stat3 activation, IL-8 secretion, and colonic epithelial cell proliferation.16-18

These events result in a persistent colitis phenotype in C57BL/6J mice and in a

potent TH17-dependent tumor phenotype in the ApcMin mouse model.19,20

Until recently, the activation and secretion mechanisms of BFT were unknown. A study published in 2016 identified a cysteine peptidase called Fragipain as the enzyme that cleaves the 44 kDa proprotein to its ~20 kDa active form.21 Follow-up studies

showed that BFT promotes lethal anaerobic septicemia in mice, and that Fragipain is

required for secretion of active toxin during anaerobic sepsis. However, when Fragipain-

deficient ETBF was inoculated in mice, colitis was still observed.22 This observation

supports the idea that the mechanisms of BFT secretion may be altered by the bacterium

based on signals received through its surrounding environment. Bacteroides fragilis

secretes outer membrane vesicles (OMVs), packaged with proteases and other enzymes,

into the surrounding space. These OMVs have been shown to interact directly with host

cells.23 Reports of BFT-2 detection in ETBF-secreted OMVs represent other potential

mechanisms through which BFT can be delivered to the surface of host cells.24

Medical and Public Health significance of ETBF

ETBF has emerged as a mucosal-associated microbe causing chronic colitis and

contributing to colon tumor formation in mice. Studies investigating the epidemiology of

ETBF in a number of gut pathologies have been conducted and the role this pathogen plays in causing diarrhea in humans has long been established.25 A study examining

diarrheal disease in Bangladesh has associated ETBF with inflammatory diarrhea.26

A study investigating the role of ETBF carriage in IBD populations found an increased

detection of the bft gene in patients with active disease.27 Studies have also detected

increased proportions of bft in both stool and colonic mucosa of patients with colorectal

cancer.13,28 Additionally, ETBF has been shown to be elevated in tumor tissue compared

to paired normal tissue in patients with colorectal cancer.29,30 Lastly, ETBF has also been

associated with traveler’s diarrhea in a study comparing populations from three different

countries.31

These studies highlight the role ETBF plays in causing various gastro-intestinal

pathologies. Interestingly, a vast number of the studies examining the presence of bft in

patient populations do not identify the toxin subtype. One study examining the incidence

of diarrhea in a population of children in India showed that the ETBF strains contained

only bft1 or bft3.32 A study examining bft carriage in a Brazilian population also detected

only bft1 or bft3 but not bft2.33 In contrast, separate studies on populations in Turkey and

Vietnam detected only bft1- or bft2-containing isolates but not bft3.34,35 In all, these studies

show that the carriage of bft subtypes differs across populations: bft1 is

detected across populations globally, bft3 may be more prevalent in South Asia and South

America, and bft2 is detected more often in East Asia and Europe.

While there is significant variation in the toxin subtypes carried across

populations, ETBF is detected in diarrheal illnesses globally. Diarrheal disease has been

categorized as a leading cause of malnutrition and subsequent mortality in developing

countries.36 As a causative agent of diarrheal disease in humans, ETBF infections may

play a significant role in the health status of susceptible populations. Novel methods that

are capable of reducing the severity of ETBF infections through direct inhibition of BFT

toxins can potentially have significant positive impacts on global public health.

Aptamers: A new-old method for the generation of molecules that bind to targets with high specificity

Nucleic acid aptamers are single-stranded oligonucleotide polymers designed to exhibit specificity and selectivity to a given molecular target of interest. These multifunctional polymers were first described in the scientific literature in the early

1990s. Of note, three separate groups described their work in the design and selection of aptamers within the same time period.37-39 Following these initial studies, very little

mainstream scientific attention focused on aptamer synthesis strategies and applications.

However, within the past decade, most likely due to rapid advances in oligonucleotide synthesis chemistry and a significant decrease in the cost of biochemical techniques, aptamers have seen a resurgence in the primary literature. PubMed MeSH aptamer keyword searches show a rapid increase in publications related to aptamer research, from fewer than

100 indexed articles published in 2003 to over 900 publications in 2016.

Aptamer generation utilizes either DNA or RNA to generate single-stranded oligonucleotide polymers of various lengths, each of which exhibits a distinct ribose or deoxyribose backbone and tertiary structure. Differences between the two aptamer classes in the structural properties of the underlying nucleotides result in structurally distinct shapes even against a similar protein target. Although some experiments have demonstrated that DNA aptamers are capable of binding a target molecule with affinity similar to that of RNA aptamers, this utility is molecule specific and does not carry over perfectly. More specifically, researchers were able to demonstrate that an RNA aptamer sequence converted to its corresponding single-stranded DNA equivalent was capable of binding with similar affinity to the target molecule of interest.40 Further studies examining these differences between DNA and RNA aptamers specific to dopamine show that the DNA aptamer is capable of exhibiting increased sensitivity when used in a detection assay. Experiments examining the differences in binding affinity between DNA and RNA aptamers designed against the same target riboflavin molecule showed a decreased binding affinity of DNA aptamers when compared to RNA. This shows that slight differences in the ribose backbone may influence the binding specificity of aptamers. This same study also demonstrated the ability of RNA aptamers to

distinguish between the specific redox states of nicotinamide.41 RNA aptamers have been

shown to be more functional in that they bind to a wider variety of targets with increased specificity and are sensitive to smaller structural variations in the surface of a target molecule. In comparison, DNA aptamers have shown increased sensitivity when used in detection assays, and they have greater thermal and structural stability in vivo.

Research has been conducted to further understand and refine synthesis techniques to allow for more effective targeting of molecules. Studies have shown that when using traditional synthesis strategies, aptamers tend to bind overlapping areas of the molecular target. This tendency has been attributed to the hydrophilic nature of oligonucleotide base pairs.42 Given this, there has been an impetus to utilize modified

nucleotides in the synthesis strategies to allow for the generation of aptamers capable of

binding to more hydrophobic residues of a given target of interest.

Considerable research has focused on examining different aptamer synthesis approaches used to generate highly specific and selective aptamers for a given molecular target. These protocols have adapted the original synthesis strategy developed by Tuerk and Gold, termed the Systematic Evolution of Ligands by Exponential Enrichment and abbreviated as SELEX.38 The SELEX process utilizes an initial nucleotide library

composed of random fragments of oligonucleotides of a specific length. This initial pool

is incubated with the purified target of interest in conditions that allow for interaction of a

select pool of random fragments that show specificity. Following this, the unbound

fragments are removed from the pool through a series of wash steps. This leaves a

population of oligonucleotides that have bound to sites on the target of interest. These

oligos are then removed from the target through denaturation and amplified using readily

available PCR protocols. The steps outlined above are repeated for a series of cycles to

allow for selection of target aptamers that have increased specificity to the target of

interest. During repeated cycling and PCR amplification, the DNA polymerase introduces

mutations into the oligonucleotide sequences, which further contributes to the final

specificity of the aptamer.43
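
The selection cycle described above can be illustrated with a toy simulation. The Python sketch below is not the protocol used in this thesis; it only mimics the logic of SELEX (random library, binding, washing, amplification with occasional polymerase errors). The binding motif, the binding probabilities, the pool size, and the error rate are all hypothetical values chosen for illustration, and every function name is invented for this example.

```python
import random

BASES = "ACGT"


def random_library(size, length, rng):
    """Build a starting pool of random oligonucleotides of a fixed length."""
    return ["".join(rng.choice(BASES) for _ in range(length)) for _ in range(size)]


def toy_affinity(seq, motif):
    """Hypothetical binding probability: high if the motif is present, low otherwise."""
    return 0.9 if motif in seq else 0.05


def mutate(seq, rate, rng):
    """Copy a sequence, redrawing each base at random with probability `rate`.

    A redrawn base can match the original, so the effective substitution
    rate is slightly below `rate`. This stands in for polymerase errors.
    """
    return "".join(rng.choice(BASES) if rng.random() < rate else b for b in seq)


def selex_round(pool, motif, pool_size, error_rate, rng):
    """One cycle: incubate with target, wash away non-binders, amplify binders."""
    # Incubation and wash: each oligo is retained with its binding probability.
    bound = [s for s in pool if rng.random() < toy_affinity(s, motif)]
    if not bound:
        return []
    # Elution and PCR: resample the bound oligos back up to the pool size.
    return [mutate(rng.choice(bound), error_rate, rng) for _ in range(pool_size)]


def motif_fraction(pool, motif):
    """Fraction of the pool that carries the (hypothetical) binding motif."""
    return sum(motif in s for s in pool) / len(pool) if pool else 0.0


def run_selex(rounds=10, pool_size=5000, length=20, motif="GGTTGG", seed=0):
    """Run several cycles and record the motif fraction after each one."""
    rng = random.Random(seed)
    pool = random_library(pool_size, length, rng)
    history = [motif_fraction(pool, motif)]
    for _ in range(rounds):
        pool = selex_round(pool, motif, pool_size, 0.001, rng)
        history.append(motif_fraction(pool, motif))
    return history


if __name__ == "__main__":
    for cycle, fraction in enumerate(run_selex()):
        print(f"cycle {cycle}: motif fraction = {fraction:.3f}")
```

In this simplified model, sequences carrying the motif are rare in the starting library and come to dominate the pool over successive cycles, which is the enrichment behavior that repeated selection and amplification are intended to produce. Real binding is graded rather than all-or-nothing, so the sketch should be read only as an intuition aid.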

Modifications and enhancements in the basic protocol have been developed to

provide more efficient selection of aptamers with desired binding properties. One

example of such modifications includes the addition of a counter-selection step.

Counter-selection introduces molecules that are similar to the target of interest, allowing for

selection of aptamers that discriminate between the target and very similar compounds.44 Modifications

to the process by which aptamer-target complexes are separated from the unbound pool

during the selection process have also been studied. Flu-Mag selection is one such

process incorporating fluorescent labeling of DNA to quantify the overall development of

target specific aptamers over the selection process. Magnetic beads coupled to

streptavidin are used to rapidly elute target-aptamer conjugates from solution thereby

decreasing the total time required to carry out a single selection cycle.45

Aptamers as antibody-like molecules “Aptabodies”

Antibodies have emerged as the primary biological agent utilized in diagnostics, and have significance in therapeutic applications as well. The sensitivity and specificity conferred by antibodies had not been paralleled prior to the introduction of aptamers.

Aptamers and antibodies can be developed with similar sensitivity and specificity to a

given target of interest. Aptamers are advantageous in that they are able to be synthesized

in vitro and are able to incorporate synthetic chemical modifications to modify in vivo functions. Additionally, by virtue of their small size, aptamers can be considered largely immune-inert with respect to stimulating a host immune response in stark contrast to antibodies. Finally, aptamers can be synthesized cheaply and are more tolerant of high temperature than antibodies.46

Antibodies hold the advantage of having been studied and used extensively for a much longer period of time than aptamers. This has allowed for a better understanding of the properties of antibodies and allows predictions to be made regarding their kinetics and functions. In contrast, although there are studies attempting to relate aptamer structural conformations and in vivo functionality, this has been found to be difficult and convoluted. Additionally, aptamers are extremely susceptible to degradation by host nucleases and elimination by kidney filtration in vivo.47

Aptamers as toxin neutralizing compounds

Studies have examined the feasibility of using aptamers as molecular labels and

potential inhibitors of a wide variety of toxins. A series of studies designed DNA

aptamers targeted against snake venom bungarotoxins of Bungarus multicinctus. These

studies were able to design aptamers targeted against both the alpha and beta

bungarotoxins.48 Follow-up studies published in 2016 show that aptamers designed

against alpha-bungarotoxin cross-react and neutralize Naja atra cardiotoxins.48 Another

study demonstrated the utility of DNA aptamers in binding and neutralizing

Staphylococcus aureus alpha-toxin. The authors of this study were able to synthesize

DNA aptamers that resulted in specific binding and in vitro neutralization of staph alpha

toxin. The study was specifically able to show a decrease in cell death following

incubation of toxin and aptamer.49

Purpose of Study

The goal of the work presented in this thesis was to identify and characterize

single stranded DNA (ssDNA) aptamers that bind to and/or inhibit Bacteroides fragilis

toxin (BFT). Previous attempts to develop a robust antibody to the toxin have not been fruitful. Indeed, prior to a study published in March of 2017, there were no records of monoclonal antibodies able to specifically bind to or inhibit BFT.50 The experiments

outlined here aimed to establish a relatively low cost and technically simple pipeline for

the synthesis and characterization of ssDNA aptamers that bind to two subtypes of BFT.

Once these novel tools are generated, they can be used to test the hypothesis that

treatment of ongoing ETBF-driven colitis with BFT-neutralizing aptamers reduces gut

pathology.

Specific Aims

The experiments outlined below were designed to address the following three aims:

1) Adapt a SELEX protocol for the generation of ssDNA aptamer pools that exhibit

high affinity binding to Bacteroides fragilis toxin subtypes 1 and 2 (BFT-1, BFT-

2).

2) Apply a next generation sequence (NGS) strategy to characterize unique

candidate aptamers targeting BFT-1 and BFT-2.

3) Develop a cost-effective, accessible, and easy-to-use analysis pipeline for MiSeq

Nano NGS chipset data using publicly available webservers to establish an easily

utilizable platform for aptamer identification, characterization, and validation.

CHAPTER 2: Biotinylation of BFT

Purpose

A strategy for the separation of the particular aptamers that bind specifically to

the target BFT molecules from a solution that contains millions of non-interacting DNA

oligomers is required for enrichment and identification. Streptavidin-coated magnetic

beads provide a platform that allows for rapid pull down of target/aptamer complexes by

taking advantage of the strong non-covalent interaction between biotin and streptavidin.

Biotinylation of BFT protein is required to allow for incorporation of magnetic bead

separation in the SELEX process. The goal of the experiments in this Chapter was to

generate biotinylated BFT toxins that retain toxin activity and are able to be coupled to

streptavidin-coated magnetic beads for use in the SELEX process.

Materials and Methods

B. fragilis Toxin BFT-1 and BFT-2 Purification

Previously purified aliquots of active BFT-1 and BFT-2 were used for the biotinylation reaction and subsequent aptamer candidate generation. Toxin purification methods have been detailed in the literature.51,52 Briefly, enterotoxigenic B. fragilis

strains VPI 13784 (secreting BFT-1) and 86-4332-2-2 (secreting BFT-2) were grown in large volumes of Brain-Heart Infusion (BHI) broth anaerobically. Cultures were

centrifuged, supernatants were sterile filtered and concentrated using ultrafiltration

membranes. Concentrated supernatants were bound to a phenyl-sepharose hydrophobic

interaction chromatography (HIC) column equilibrated in 0.05 M Tris-HCl, 1.5 M NaCl

(GE Healthcare). Supernatants bound to the column were washed sequentially with

0.05M Tris-HCl containing 1.5, 1.0, or 0.5M NaCl. Protein bound to the HIC column was eluted with 25% ethanol in 0.05M Tris-HCl. Following HIC column elution, the BFT

protein-containing samples were purified by FPLC using Mono Q anion exchange

chromatography supplemented with 6M urea. Column fractions were pooled and dialyzed

against Tris-glycine buffer to remove the urea. The toxins were subsequently

concentrated using Centricon Plus-70 centrifugal filter unit with a 10 kDa cutoff

(Millipore-Sigma). Toxin purity and activity were confirmed through silver staining and

HT29 cell rounding assays.51

NHS-Sulfo Biotinylation Reaction

Separate aliquots each containing fifteen micrograms of active purified BFT-2 as well as aliquots of a previously biotinylated active purified BFT-1 were dialyzed for buffer exchange from Tris-glycine buffer to Phosphate Buffered Saline pH 7.4 (PBS) using Tube-O-DIALYZER (G-Biosciences) mini dialysis. Samples were dialyzed for 8 hours at room temperature with gentle stirring. The dialysis buffer was exchanged twice during the process. EZ-link No-Weigh Sulfo-NHS-Biotin (Thermo Scientific) was used to biotinylate toxins at room temperature for 1 hour employing a 50-fold molar excess of biotin in the reaction. Unreacted biotin was removed by running samples through 0.5 mL

Zeba Spin Desalting Columns (Thermo Scientific) with a 7 kDa cutoff. Samples were

recovered in PBS and the protein concentration measured (A280) using NanoDrop 2000c

UV-Vis (Thermo Scientific).
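
The 50-fold molar excess used in the biotinylation reaction can be sketched as a short calculation. This is an illustration only: it assumes a toxin mass of about 20 kDa (the approximate size of BFT) and a nominal reagent molecular weight of about 443.4 g/mol for Sulfo-NHS-Biotin, which should be checked against the product sheet. The variable and function names are hypothetical.

```python
# Worked arithmetic for a 50-fold molar excess of biotin reagent over toxin.
TOXIN_MASS_UG = 15.0          # from the text: 15 micrograms of BFT per aliquot
TOXIN_MW_G_PER_MOL = 20_000   # assumption: ~20 kDa toxin
BIOTIN_MW_G_PER_MOL = 443.4   # assumption: nominal MW of Sulfo-NHS-Biotin
MOLAR_EXCESS = 50             # from the text


def nmol_from_mass(mass_ug: float, mw_g_per_mol: float) -> float:
    """Convert a mass in micrograms to nanomoles (ug / (g/mol) = umol)."""
    return mass_ug / mw_g_per_mol * 1000.0


toxin_nmol = nmol_from_mass(TOXIN_MASS_UG, TOXIN_MW_G_PER_MOL)
biotin_nmol = toxin_nmol * MOLAR_EXCESS
# g/mol is numerically equal to ng/nmol, so divide by 1000 to get micrograms.
biotin_ug = biotin_nmol * BIOTIN_MW_G_PER_MOL / 1000.0

print(f"toxin:  {toxin_nmol:.2f} nmol")
print(f"biotin: {biotin_nmol:.1f} nmol (about {biotin_ug:.1f} ug)")
```

Under these assumptions each 15 µg aliquot corresponds to under one nanomole of toxin, so the reagent requirement is in the tens of nanomoles.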

Dot Blot Analysis of Biotinylated BFT-1 and BFT-2

Ten nanograms each of dialyzed unbiotinylated BFT-2, biotinylated BFT-1,

biotinylated BFT-2, and re-biotinylated BFT-1 were spotted on nitrocellulose membrane

paper and allowed to dry. Biotin-labeled goat anti-mouse IgG (Abcam ab6788) was used

as a positive control. Membranes were dried at room temperature and blocked for four

hours with 5% Milk in PBS-Tween 0.05% (PBST). Membranes were washed three times

for five minutes per wash with PBST. Pierce Streptavidin Poly-HRP (Thermo Scientific)

diluted 15,000-fold in blocking buffer was applied to membrane and incubated for 1 hour at room temperature with shaking. Following incubation with the detector solution, the

membrane was washed three times with blocking buffer and three times with PBST with

two minutes between washes. SuperSignal West Pico Chemiluminescent Substrate kit

(Thermo Scientific) was used according to manufacturer’s protocol. Excess reagent was

drained and membrane was exposed to X-ray film for 15 minutes and processed.

HT29/C1 Cell Rounding Assay

The ability of biotinylated BFT to retain toxin activity was tested using previously

established protocols.53-55 Briefly, HT29/C1 cells exhibit robust changes in cell morphology when incubated with BFT. This cell rounding and detachment was measured

semi-quantitatively on a scale of 1 to 4 with 4 indicating the highest level of cell

rounding. HT29/C1 cells were grown in a 96-well tissue culture treated plate (Corning) to

~80% confluence at 37°C, 10% CO2, in DMEM (4.5 g/L Glucose, No Sodium Pyruvate)

supplemented with 10% FBS. Cells were washed and incubated with DMEM at 37°C,

10% CO2 for 30 minutes prior to assay. Cells were incubated with a serial 1:2 dilution

ladder starting with 300 ng biotinylated BFT-2 in 100 µL DMEM for 3 hours.

Additionally, biotinylated BFT-2 with Dynabeads M-280 Streptavidin-coated magnetic

beads (Thermo Fisher) samples were incubated to assess any changes in toxin activity as

a result of magnetic bead binding.

Results

The results of BFT biotinylation can be seen in the dot-blot presented in Figure 1.

Previously biotinylated and dialyzed BFT-1 was used in this experiment to determine if the toxin retained biotin sites that are still able to bind streptavidin after storage at -80°C

for an extended period of time. Biotin groups present in previously stored BFT-1 still

exhibit binding to streptavidin. Figure 1 demonstrates that both toxins were successfully

biotinylated and are capable of binding streptavidin.

Figure 2 shows the results of the semi-quantitative analysis of the HT29/C1 toxin

cell assay. Biotinylated BFT-2 is shown to exhibit cellular cytotoxicity indicating that

biotinylation of the toxin does not inhibit in vitro activity. Interestingly, the presence of

streptavidin coated M280 beads resulted in a significant drop in observed toxin activity.

Following mixing of biotinylated BFT-2 with M280 beads and application to the HT29

cells the toxin activity was severely diminished at 3 hours. Incubation of the toxin-bead

complex for 24 hours prior to cell assay completely removed any observed toxic effects.

This could be a potential result of the bulky nature of the magnetic beads preventing the

toxin bound to the bead from effectively engaging the cells to exert cytotoxic effect.

Lastly, the supernatant remaining after the toxin-bead complex was pulled from solution showed low levels of toxin activity. Toxin activity in the supernatant following incubation with SA-M280 indicated that binding of the available BFT to the beads was incomplete.

Discussion

The experiments in this Chapter demonstrate that BFT is amenable to biotinylation and that toxin activity was conserved after labeling. Interestingly, toxin activity was significantly compromised following binding to the M280 beads. This may be a result of the biotinylated toxin binding to the streptavidin beads rather than to cells. Comparing the activity of the post-bind buffer to the toxin scores for biotinylated BFT-2 shows that at 3 hours the post-bind buffer contains enough toxin to exert cytotoxic effects similar to biotinylated BFT-2 at a concentration of ~19 ng/mL. If we assume that the ratio of toxin activity provides insight into the relative percent binding of the biotinylated toxin to the M280 scaffold, the ratio of 19:300 provides an estimate of the percent unbound toxin remaining in solution following 24 hours of incubation (~6%). This observation supports the idea that incubation of biotinylated BFT-2 with SA-M280 for 24 hours results in the majority of available toxin binding to the beads.
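
The 19:300 estimate above can be reproduced with a short calculation. The sketch below also places the ~19 value on the 1:2 dilution ladder that starts at 300. The doses are treated as nominal values exactly as reported in the text, the ladder length of six points is an assumption made for illustration, and the function name is hypothetical.

```python
# Sketch: locate the activity-equivalent dose on the 1:2 dilution ladder and
# estimate the fraction of toxin left unbound after bead incubation.
START_DOSE = 300.0      # top of the dilution ladder (from the text)
OBSERVED_EQUIV = 19.0   # activity-equivalent dose of the post-bind buffer (from the text)


def dilution_ladder(start: float, steps: int, factor: float = 2.0) -> list:
    """Return the doses of a serial dilution, highest first."""
    return [start / factor ** i for i in range(steps)]


ladder = dilution_ladder(START_DOSE, 6)  # six points is an assumption
nearest = min(ladder, key=lambda dose: abs(dose - OBSERVED_EQUIV))
unbound_fraction = OBSERVED_EQUIV / START_DOSE

print(ladder)
print(f"nearest ladder point: {nearest}")
print(f"estimated unbound toxin: {unbound_fraction:.1%}")
```

The nearest ladder point is the fifth dilution (four halvings of the starting dose), which is consistent with roughly 6% of the toxin remaining in solution.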

Chapter 2 Figures

Streptavidin HRP Dot Blot visualizes the Biotinylation of BFT

Figure 1: Dot Blot of biotinylation of dialyzed BFT-2 and pre-biotinylated BFT-1. Previously biotinylated BFT-1 was still recognized by streptavidin, and re-biotinylating BFT-1 had no effect on its ability to bind streptavidin. Unbiotinylated dialyzed BFT-2 is clearly negative, in contrast to its robust recognition by streptavidin following biotinylation.

Biotinylation of BFT-2 does not disrupt toxin activity

Figure 2: HT29/C1 Cell Assay results from biotinylated BFT-2 following 3 hours of incubation. There was a marked decrease in toxin activity immediately after adding the M280 Streptavidin coated beads or after 24 hours of incubation with the beads prior to assay. BFT-2 activity following binding to M280 beads was significantly reduced in the buffer, indicating high levels of toxin binding to the beads.

CHAPTER 3: Optimization of SELEX PCR Conditions

Purpose

Aptamer libraries were generated using standard phosphoramidite-based

automated DNA synthesis. Briefly, the aptamer oligomers are synthesized from the 3’

end starting with attaching the first phosphoramidite nucleoside containing a 4,4’-

dimethoxytrityl (5’-DMT) at the 5’ hydroxyl position to a resin support. The DMT group

was then removed through a process called detritylation. The detritylated resin supported

nucleoside was then reacted with the next 5’-DMT containing nucleoside to extend the

chain by one base to form a phosphite triester linkage between the two nucleosides. Any

unreacted bases from the previous reaction were capped to avoid the extension of unwanted oligomer chains. The detritylation process was repeated, the next phosphoramidite nucleoside was added to the chain, and the cycle was continued to synthesize the first constant region of the aptamer chain. The random regions were then created by incubating the detritylated base at the end of the constant region with molar equivalents of 5’-DMT-containing bases. This process was repeated 20 times to extend the random regions of the aptamer chains to the appropriate length. Following this, the second constant primer region was synthesized as outlined above. The final pool of aptamers was

then decoupled from the resin support using ammonium hydroxide and the final solution

was deprotected to form the oligonucleotide phosphate backbone. The solution

containing full-length aptamers and partially reacted mutant sequences was then HPLC

purified to remove any oligonucleotides that were not the appropriate length.56,57

The purified aptamer pools were then used in the SELEX process to allow for enrichment

of sequences that specifically interact with the BFT. Aptamers were incubated with the

BFT and any unbound sequences were washed off. Sequences bound to the BFT were then removed and PCR amplified to enrich for the aptamers exhibiting enhanced

binding specificity to the target. This cycle was repeated ten times to further remove any non-specific binding sequences, resulting in a final pool of sequences that contains candidate aptamers that specifically bind to BFTs. Successful SELEX selection protocols are highly dependent on efficient PCR amplification of the single-stranded DNA oligomers following the selection rounds. Optimizing PCR conditions for robust amplification of target strands reduces both the time required to carry out selection and PCR bias. The goal of the experiments

below was to optimize the PCR conditions for the SELEX cycles to generate a PCR

program resulting in robust amplification of aptamer sequences while also decreasing the

introduction of PCR biases.

Materials and Methods

Random Library and Primers

The phosphodiester DNA aptamer libraries were obtained from Trilink Biotech

(San Diego, CA, Product Number 0-32001). The library was produced as a 100 µM

solution of single-stranded DNA molecules of 66 bases. Each single-stranded oligomer

contained three domains: a central 20 base (N20) random sequence domain, a 5’ primer

binding domain (FWDSELEX: 5’-TAGGGAAGAGAAGGACATATGAT-3’) and a 3’ primer binding domain (REVSELEX: 5’-TCAAGTGGTCATGTACTAGTCAA-3’).

The FWDSELEX, REVSELEX, and Biotinylated REVSELEX (5’-(biotin)-

TCAAGTGGTCATGTACTAGTCAA-3’; Trilink Biotech) primers were suspended to a final concentration of 100 µM in ultrapure water.
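
The three-domain layout of the library can be sketched in a few lines of code. The sketch assumes that the 3’ primer binding domain of each library strand is the reverse complement of REVSELEX, which is what primer annealing would require; the text does not state the library strand sequence explicitly. The function names and the fixed random seed are hypothetical.

```python
import random

FWD_SELEX = "TAGGGAAGAGAAGGACATATGAT"   # 5' primer binding domain (from the text)
REV_SELEX = "TCAAGTGGTCATGTACTAGTCAA"   # reverse primer sequence (from the text)
RANDOM_LEN = 20                         # N20 random domain (from the text)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def random_library_member(rng: random.Random) -> str:
    """Build one illustrative library strand: 5' arm + N20 + 3' arm."""
    core = "".join(rng.choice("ACGT") for _ in range(RANDOM_LEN))
    # Assumption: the 3' arm is the site the reverse primer anneals to.
    return FWD_SELEX + core + reverse_complement(REV_SELEX)


member = random_library_member(random.Random(0))
library_length = len(member)             # 23 + 20 + 23 bases
possible_sequences = 4 ** RANDOM_LEN     # size of the N20 sequence space

print(member)
print(library_length, possible_sequences)
```

With two 23-base arms and a 20-base core the strand length comes to 66 bases, matching the stated library length, and the N20 domain spans roughly 1.1 x 10^12 possible sequences.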

PCR reactions and optimization conditions

A previously published magnetic bead-based ssDNA aptamer SELEX protocol

was adapted for use with BFT.58 PCR conditions were optimized using the manufacturer’s recommended guidelines (Protocol 1) and compared to methods highlighted in previously published protocols (Protocol 2).58

Protocol 1) 25 µL reaction volumes containing a 4 nM final concentration (100

nM stock) of aptamer library combined with 1x MgCl2-free PCR buffer, 1 mM MgCl2, 2

µM each of the FWDSELEX and biotin-labeled REVSELEX primers, 1.25 U Platinum

Taq DNA polymerase, 0.2 mM dNTP mix, and ultrapure water. Reactions were amplified

at 95°C for 5 minutes followed by 30 cycles of 95°C 30 seconds, 50°C 30 seconds, 72°C

30 seconds, and a final extension at 72°C for 5 minutes.

Protocol 2) 25 µL reaction volumes containing a 4 nM final concentration of

aptamer library combined with 1x MgCl2-free PCR buffer, 1 mM MgCl2, 1.1 µM

FWDSELEX, 0.9 µM biotin-labeled REVSELEX primer, 1.25 U Platinum Taq, and

ultrapure water. Reactions were amplified at 95°C for 5 minutes followed by 30 cycles of

95°C 1 minute, 54°C 1 minute, 72°C 1.5 minutes, and a final extension at 72°C for 5

minutes.
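
The two cycling programs can be written down as data to compare them side by side. The sketch below encodes the hold times given for Protocols 1 and 2 and sums them; ramp times between temperatures are ignored, so the totals are lower bounds on instrument run time. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CyclingProgram:
    """Hold times (minutes) for a three-step PCR program."""
    name: str
    initial_denature_min: float
    cycles: int
    denature_min: float
    anneal_min: float
    extend_min: float
    final_extend_min: float

    def hold_time_min(self) -> float:
        """Total programmed hold time, excluding ramping between steps."""
        per_cycle = self.denature_min + self.anneal_min + self.extend_min
        return self.initial_denature_min + self.cycles * per_cycle + self.final_extend_min


# Values taken from the protocol descriptions in the text.
PROTOCOL_1 = CyclingProgram("Protocol 1", 5, 30, 0.5, 0.5, 0.5, 5)
PROTOCOL_2 = CyclingProgram("Protocol 2", 5, 30, 1.0, 1.0, 1.5, 5)

for program in (PROTOCOL_1, PROTOCOL_2):
    print(program.name, program.hold_time_min(), "min")
print("extension ratio:", PROTOCOL_2.extend_min / PROTOCOL_1.extend_min)
```

Protocol 2 roughly doubles the total hold time and triples the per-cycle extension step relative to Protocol 1.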

In addition to the cycling conditions, reactions were supplemented with 0.5 µg/µL

BSA, 3% DMSO, or a combination of the two. Aliquots containing 10 µL PCR reaction

were mixed with 5 µL 5X BlueJuice gel loading buffer and run on an EtBr-stained 2.5%

agarose gel in SB buffer at 130V for 1 hour with a 100 bp DNA ladder (NEB). Gels were

imaged using ChemiDoc MP imaging system (BioRad).

Results

The goal of this aim was to define the PCR conditions for the SELEX selection

process that resulted in efficient amplification with a minimum of PCR bias. The PCR

reactions were tested using 100 femtomoles of ssDNA aptamer library. Multiplying the

total input amount of DNA (in moles) by Avogadro’s constant (6.0221409 x 10^23 per mole) yielded a total of approximately 6 x 10^10 strands of random ssDNA aptamer.
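
The strand count can be checked with a short calculation. The sketch below converts the 4 nM final library concentration in a 25 µL reaction to moles (100 femtomoles) and multiplies by Avogadro's constant; it also applies the same conversion to the one nanomole of library used in the first SELEX cycle. The function and variable names are hypothetical.

```python
AVOGADRO = 6.0221409e23  # per mole, value as quoted in the text


def strands_from_concentration(conc_nM: float, volume_uL: float) -> float:
    """Number of molecules in a given volume at a given nanomolar concentration."""
    moles = conc_nM * 1e-9 * volume_uL * 1e-6  # (mol/L) x (L)
    return moles * AVOGADRO


pcr_input_strands = strands_from_concentration(4, 25)   # 100 fmol test reaction
first_cycle_strands = 1e-9 * AVOGADRO                   # 1 nmol library input

print(f"PCR test input:    {pcr_input_strands:.2e} strands")
print(f"First SELEX cycle: {first_cycle_strands:.2e} strands")
```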

Figure 3 presents the PCR products resulting from Protocols 1 and 2. The results

clearly show a much more robust amplicon profile using Protocol 2 modified from Hover

and Mayer.58 The significantly increased length of the extension step during the 30 cycles

most likely played a large role in this observed difference between Protocol 1 and

Protocol 2.

Each amplification protocol was also run in the presence of BSA, DMSO, or BSA

+ DMSO. BSA and DMSO supplementation was used to test whether the

robustness of PCR product generation could be increased through better separation of G-C rich strands

which would allow for better access to the polymerase. In Protocol 1 there is a substantial

increase in PCR amplicons in the presence of BSA (lane 4) and BSA + DMSO (lane 5)

suggesting that addition of BSA increases the efficiency of the PCR reaction. This

difference was not observed in protocol 2 PCR reactions possibly due to the increased

PCR efficiency achieved by increasing the length of extension steps in the PCR protocol.

The potential differences are not readily observable in Protocol 2 because a gel-based analysis is too insensitive to distinguish slight differences in PCR efficiency.

Discussion

Optimization of the SELEX PCR showed that Protocol 2, adapted from Hover and Mayer,58 resulted in robust amplicon generation. This protocol will be followed when carrying out SELEX selection PCR amplification. Protocol 1 showed a distinct increase in product upon addition of BSA to the PCR reaction. Because of this increase in PCR product formation, it was decided that SELEX PCR would benefit from the use of BSA during PCR amplification.

Chapter 3 Figures

PCR optimization results in robust amplification of the Aptamer Library

Figure 3: PCR optimization of SELEX conditions using the manufacturer’s recommended conditions or a modified protocol published in the literature.58 Lanes 1-5 for each reaction set contained: (1) No Aptamer library, (2) Aptamer library Only, (3) Aptamer library + DMSO Only, (4) Aptamer library + BSA Only, or (5) Aptamer library + BSA + DMSO.

CHAPTER 4: Magnetic Bead SELEX for the Isolation of Aptamers Binding to BFT-1 and BFT-2

Purpose

The most common approach used to determine if certain ssDNA aptamers exhibit

binding to target proteins utilizes radiolabeling of the DNA aptamer library.58 This is

commonly accomplished using T4 polynucleotide kinase (PNK) to transfer

the radiolabeled γ-phosphate of [γ-32P]-ATP to the 5’ hydroxyl end of the aptamer strand. This approach presents a number of issues, including the requirement for specialized laboratory space and equipment in order to use radiolabeled reagents.

Further, radiolabeling aptamer assays require additional experimental steps that prevent direct assaying of the ssDNA aptamer pools following selection against BFTs. The goal of this chapter was to enrich for aptamers that preferentially bind BFT isoforms by employing a modification of the magnetic bead-based protocol published by Hover and

Mayer.58 In order to avoid the use of radiolabeling procedures and to examine the

aptamer pools directly following selection rounds, a SYBR Green-based Real-Time PCR

assay was developed. This assay was used to detect any increase in affinity of aptamer

pools for the target BFT protein through the progressive rounds of selection. Results of

these experiments were used to monitor the selection process to confirm the presence of

aptamers that specifically bind to BFTs prior to next generation sequencing of aptamer

pools.

Materials and Methods

Buffer Preparation

Wash Buffer consisted of 1x PBS containing 1 mM MgCl2. Selection Buffer (5x)

was made by suspending powdered DMEM-High Glucose (Sigma-Aldrich D5648) in 200

µL ultrapure water supplemented with 7.35 mM MgCl2 and 0.5% weight by volume

BSA. Bind Buffer (2x) was composed of 10 mM Tris-HCl (pH 7.5), 1mM EDTA, and 2

M NaCl.

Preparation of Pre-selection and Selection M280 Streptavidin Beads

The pre-selection bead matrix was prepared by washing 400 µL of streptavidin-

coated M280 magnetic beads (SA-M280; Thermo Fisher Scientific) three times with 250

µL Wash Buffer and then suspending the beads in 1.6 mL 1x Selection Buffer. Selection

bead matrix containing either biotinylated BFT-1 or biotinylated BFT-2 bound to the SA-

M280 beads was prepared using the following procedure. SA-M280 beads were washed three times in 250 µL Wash Buffer and suspended in 1x Selection Buffer. After washing,

200 µL of SA-M280 was incubated with 6 µg biotinylated BFT-1 or 6 µg biotinylated

BFT-2 overnight at 4°C on a head to tail shaker. Following incubation, the BFT-SA-

M280 complex was washed twice with 100 µL 1x Selection Buffer and suspended in 800

µL 1x Selection Buffer.

First SELEX Cycle

One nanomole of the ssDNA N20 Aptamer library was diluted in 80 µL 1x

selection Buffer and 80 µL pre-selection Matrix. The aptamer pre-selection matrix was

incubated at room temperature for 30 minutes and re-suspended with a pipette every 3

minutes. Following incubation with the preselection matrix, a magnet was used to adhere

the beads to the side of the tube and supernatant containing the unbound ssDNA fraction

was transferred to a 160 µL reaction containing the Selection Matrix. The reaction was

maintained at room temperature for 30 minutes with mixing every 3 minutes. Beads

containing ssDNA aptamer bound fraction were separated by placing the tubes against a

PureProteome magnetic stand (Millipore-Sigma) for 30 seconds. Supernatant was

aspirated and the beads were suspended and washed in 160 µL 1x Selection Buffer for 5

minutes before being separated as before. Following washing, the beads were suspended in 100 µL ultrapure water and heated at 95°C for 3 minutes to release the BFT-bound

aptamers. After removing the beads using the magnet, the supernatant containing the

ssDNA aptamers was used as template for PCR amplification following the first SELEX

round. Sixteen, 50 µL reaction volumes each containing 5 µL of ssDNA aptamer from

the SELEX step, 1x MgCl2 free PCR buffer, 1 mM MgCl2, 1.1 µM FWDSELEX primer,

0.9 µM biotinylated REVSELEX, 2.5 U Platinum Taq polymerase (Invitrogen), 0.5

µg/µL BSA in ultrapure water were amplified using the following cycling conditions:

95°C for 5 minutes followed by 30 cycles (95°C 1 minute, 54°C 1 minute, 72°C 1.5 minutes) and a final extension at 72°C for 5 minutes. After amplification, 10 µL of the

PCR product was mixed with 5 µL 5x BlueJuice Buffer and run on a 2.5% EtBr-stained

agarose gel in SB Buffer at 130V for 1 hour along with a 100 bp ladder (NEB) and

used to estimate the binding capacity of M280 beads for biotinylated dsDNA PCR

product.

SELEX Cycles Two Through Ten

PCR products containing amplified ssDNA aptamer from the previous selection

round were pooled and used for separating the forward strand for use in the next selection

round. SA-M280 volumes containing 1.5 mg beads (150 µL) were washed twice with

250 µL 1X Bind Buffer and suspended in 500 µL 1X Bind Buffer. Washed SA-M280

beads were combined with 500 µL pooled PCR reaction volumes, 500 µL 2x Bind

Buffer, and incubated at room temperature for 30 minutes on a head to tail shaker. Beads containing dsDNA with biotinylated reverse strand bound to streptavidin were separated and suspended in 30 µL 0.15 M NaOH. Following incubation for 3 minutes, 15 µL 0.3 M

HCl was added. A 5x Selection buffer volume of 16 µL was added and pH estimated by

blotting 1 µL on dry pH Test Strips (Sigma-Aldrich). Solutions were pH neutralized

using NaOH and final volume was adjusted to 80 µL to yield eluted ssDNA in 1x

Selection Buffer for use in next SELEX cycle.
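
The strand-separation volumes above can be checked with simple arithmetic. The sketch below confirms that the NaOH and HCl additions are equimolar and that 16 µL of 5x Selection Buffer in a final volume of 80 µL gives a 1x solution. The function and variable names are hypothetical, and the calculation ignores any liquid carried over with the beads.

```python
def micromoles(volume_uL: float, molarity_M: float) -> float:
    """Amount of solute: uL x mol/L = umol."""
    return volume_uL * molarity_M


naoh_umol = micromoles(30, 0.15)   # 30 uL of 0.15 M NaOH (from the text)
hcl_umol = micromoles(15, 0.3)     # 15 uL of 0.3 M HCl (from the text)

BUFFER_STOCK_X = 5                 # 5x Selection Buffer
BUFFER_VOLUME_UL = 16
FINAL_VOLUME_UL = 80
final_buffer_x = BUFFER_STOCK_X * BUFFER_VOLUME_UL / FINAL_VOLUME_UL

volume_before_adjust_uL = 30 + 15 + BUFFER_VOLUME_UL
water_to_add_uL = FINAL_VOLUME_UL - volume_before_adjust_uL

print(f"NaOH {naoh_umol:.2f} umol vs HCl {hcl_umol:.2f} umol")
print(f"final buffer strength: {final_buffer_x}x, top-up volume: {water_to_add_uL} uL")
```

Because the acid and base amounts match, the pH strip check and the small NaOH adjustment described above serve as fine-tuning rather than bulk neutralization.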

Eluted ssDNA pools contained in an 80 µL volume of 1x Selection Buffer were

incubated in 80 µL pre-selection Matrix for 30 minutes with mixing at 3 minute intervals.

Pre-selection beads were separated by magnetic removal and the supernatant was

incubated in 160 µL of the selection beads at room temperature for 30 minutes with

mixing at 3 minute intervals. Selection beads with bound ssDNA fraction were removed

and washed in 160 µL 1X Selection Matrix.

For the first 5 rounds, the number of wash steps was increased by two with each

subsequent SELEX cycle. SELEX cycles from round 6 to 10 had eight wash

steps with two 5-minute incubation steps at wash numbers 3 and 6. Following the wash

steps the beads were suspended in 100 µL ultrapure water and incubated at 95°C for 3

minutes to elute ssDNA bound to BFT. BFT-SA-M280 beads were discarded and the ssDNA fraction was used for PCR amplification. SELEX cycles were repeated a total of 10 times for each toxin separately to isolate aptamers binding to BFT-1 and BFT-2.

FAM-labeled Aptamer Binding and FACSCalibur

FWDSELEX containing 6-Carboxyfluorescein (5’(6-FAM)-FWDSELEX; Sigma-

Aldrich) and biotinylated REVSELEX (5’(biotin)-REVSELEX) primers were used to

PCR amplify 10 µL ssDNA from BFT-2 Pool-10. The PCR product was bound to SA-

M280 beads and FAM-containing forward aptamer sequences were eluted using NaOH and recovered in the supernatant fraction. FAM-labeled aptamer pools were incubated with BFT-2-SA-M280 bound magnetic beads. An Olympus BX61 fluorescence microscope with a FITC filter was used to visualize 5’-(6-FAM)-aptamer-BFT-2-SA-M280 bead complexes. Beads that contained FAM signals were also quantitated by flow cytometry using a FACSCalibur (Becton Dickinson).

SYBR Green Real-Time PCR Assays

BFT-1, BFT-2, or BSA aliquots containing 50 ng protein each were incubated in

separate MicroAmp Optical Reaction Plate (Applied Biosystems) wells overnight at 4°C.

Wells were washed with PBS and incubated with 10 µL ssDNA from BFT-1 or BFT-2

SELEX Pool 10 for 3 hours at room temperature. Solutions were aspirated and 50 µL

reactions were prepared containing 1X SYBR Green Master Mix (Applied Biosystems),

0.2 µM Forward and Reverse Primers, in ultrapure water. Real-time PCR reactions were

measured on ABI 7500 Real-Time PCR machine with the following reaction parameters:

50°C 2 minutes, 95°C 10 minutes, 40 cycles of 95°C for 15 seconds followed by 60°C for

1 minute. ABI 7500 Software was used to determine reaction CT values.

(ii) BSA Blocking Real-Time PCR Assays

Wells with 50 ng BFT-1, BFT-2, or BSA were co-incubated with 50 ng BSA in

MicroAmp Optical Reaction Plate wells overnight at 4°C. ssDNA aptamer pools from

BFT-1 or BFT-2 SELEX Pools 2, 5, and 10 were diluted forty-fold and incubated at room temperature for 1 hour. Solutions were aspirated and wells were washed 3 times with ultrapure water. SYBR green real-time PCR reactions were performed as described.

(iii) BSA co-coat Real-Time PCR Assays

Wells with 50 ng BFT-1, BFT-2, or BSA were incubated separately in MicroAmp

Optical Reaction Plate wells overnight at 4°C. Solutions were aspirated and blocked with

50 ng BSA for 4 hours. Blocking solution was aspirated and wells were washed 3 times with ultrapure water. ssDNA aptamer pools from BFT-1 or BFT-2 SELEX Pools 2, 5, and

10 were diluted forty-fold and incubated at room temperature for 1 hour. Solutions were aspirated and wells were washed 3 times with ultrapure water. SYBR green real-time

PCR reactions were performed as described.

Results

A graphical representation of a SELEX cycle is presented in Figure 4. The 66

base-long aptamers containing 23 base flanking primer arms with a 20 base random

central region were first incubated in a negative selection step to remove any sequences

that bind to magnetic beads or streptavidin. Magnetic beads were immobilized and the

supernatant containing the remaining aptamer sequences was removed and incubated

with toxin-bound magnetic beads. This complex was incubated, washed, and the

supernatant was discarded. Beads containing bound aptamers were then heated to

dissociate ssDNA secondary structures and release the aptamers from the toxin scaffold.

This pool of free ssDNA was PCR amplified with a biotinylated reverse primer and the

biotin-labeled PCR amplicons were bound to streptavidin-conjugated beads. High pH

conditions were used to dissociate the ssDNA aptamers from the complementary

biotinylated oligomer and the selection cycle was repeated.

Figure 5 shows an aliquot of PCR product derived from the second cycle of

SELEX. The band intensity was compared to that of a standardized amount of DNA contained in the 100 base pair ladder to estimate the dsDNA binding capacity of the streptavidin-coated SA-M280 magnetic beads. The amount of DNA generated after one round of

SELEX PCR was estimated to be between 4 μg and 15 μg. The generous estimate of 15

μg was used to determine the volume of magnetic beads used to bind the biotinylated

strand of the PCR amplicons. The manufacturer’s stated binding capacity of SA-M280

beads was 10 μg dsDNA per milligram of SA-M280. This is equivalent to the amount of

SA-M280 contained in a 100 μL aliquot from the manufacturer-supplied stock solution.

Therefore, to bind an estimated maximum of 15 μg biotinylated DNA, a 150 μL aliquot of

SA-M280 was used per selection round. The SELEX process was then continued for both

BFT-1 and BFT-2 for a total of ten rounds. Following the 10th round of selection, eluted ssDNA was used to characterize the specificity of the aptamer pools prior to sequencing and further work-up.
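
The bead volume calculation described above can be expressed as a small function. The sketch uses the stated binding capacity of 10 μg dsDNA per milligram of beads and a stock concentration of 10 mg/mL, the latter being implied by the statement that 1 mg of beads is contained in 100 μL of stock. The function and constant names are hypothetical.

```python
BINDING_CAPACITY_UG_PER_MG = 10.0   # ug dsDNA per mg beads (from the text)
BEAD_STOCK_MG_PER_ML = 10.0         # implied by the text: 1 mg beads per 100 uL


def bead_volume_uL(dna_ug: float) -> float:
    """Volume of bead stock needed to bind a given mass of biotinylated dsDNA."""
    bead_mg = dna_ug / BINDING_CAPACITY_UG_PER_MG
    return bead_mg / BEAD_STOCK_MG_PER_ML * 1000.0


low_estimate_uL = bead_volume_uL(4.0)    # lower gel-based estimate of PCR yield
high_estimate_uL = bead_volume_uL(15.0)  # upper estimate used for the protocol

print(f"{low_estimate_uL:.0f} uL to {high_estimate_uL:.0f} uL of SA-M280 stock")
```

Sizing the bead volume to the upper yield estimate means the beads should not be the limiting factor when capturing the biotinylated strand.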

FAM-labeled DNA aptamers were generated using a 5’-FAM labeled forward primer arm in a PCR reaction using ssDNA from BFT-2 SELEX round 10 and the biotin-labeled reverse primer. An overview of the workflow is shown in Figure 6A. The 5’-

(FAM) ssDNA aptamers were then incubated with BFT-2 bound to magnetic beads. We predicted that interactions between the labeled aptamer strands, BFT-2, and the SA-M280 beads would allow us to see labeled beads using fluorescence microscopy. However, no fluorescence was observed under the FITC channel (Figure 6B) indicating that, under the conditions used, there is no evidence of FAM labeled aptamer pool bound to the BFT-

SA-M280 bead complex. It is possible that fluorescence microscopy was not sufficiently sensitive to detect a limited number of FAM-labeled molecules on the beads. We next employed flow cytometry as a more sensitive way to detect aptamer binding to the beads.

Next, the 5’-(FAM) ssDNA-BFT-2 M280 beads were examined using

FACSCalibur in a further attempt to characterize any potential binding of aptamer pools to the toxin. SA-M280 beads without fluorescently labeled ssDNA aptamer incubation were used as a negative baseline control to establish the appropriate cut off gates. Next ssDNA pools containing 5’-(FAM) labeled aptamers were incubated with either SA-

M280 or BFT-2-SA-M280 beads to assess whether there was a difference in observed fluorescence signal. Figure 7 A-C shows the results of the analysis depicting the side-scatter associated with the magnetic beads on the y-axis and measuring the corresponding

FITC channel fluorescence signal on the x-axis. There is no appreciable change in

fluorescence signal in the groups incubated with FAM labeled aptamer pools when

compared to SA-M280 negative controls not incubated with FAM labeled aptamers.

Figure 7D shows the histogram comparing the counts observed on the x-axis with the

corresponding FITC channel fluorescence for all groups. It is observed from Figure 7D

that there is no change in the histograms of either group incubated with FAM-labeled aptamer when compared to the SA-M280 only negative control. Overall, the

FACSCalibur results show no detectable binding of FAM-labeled ssDNA aptamer pools to either the SA-M280 or BFT-2-SA-M280 beads.

A SYBR Green Real-Time PCR assay was developed to determine if the aptamer pools exhibit any increased selectivity to BFT. Aliquots of 10 µL ssDNA from the last round of selection were incubated in wells that were coated with BFT-1, BFT-2 or BSA.

Enrichment of the populations was determined by evaluating Ct values. A lower Ct value indicates increased aptamer binding. Figure 8 describes the assay setup and shows that both the BFT-1 and BFT-2 aptamer pools exhibited increased affinity to the toxin-coated wells compared to BSA. There was a larger difference between BFT-1 and the BSA control compared to BFT-2, which implies there was greater target specificity in the

BFT-1 aptamer pool.

Due to the very early appearance of the fluorescence signal in the real time PCR assay, modified versions were designed to better examine the enrichment properties in the SELEX pools. Figure 9 shows the modified assay using a BSA blocking step designed to decrease non-specific binding of sequences to the plate. Using DNA from

SELEX rounds 2, 5, and 10 the average fold-change relative to the no protein wells were determined and plotted in Figure 9B. There was an increase in the average fold change

for BFT-1 SELEX 10 over BSA that was not seen for SELEX rounds 2 and 5. In contrast, BFT-2 showed an increased fold change starting from SELEX 2 and continuing through all pools. Figure 9C shows that the ratio of fold change for the last round of SELEX in BFT-1 is equal to that of BFT-2 during the second round of selection. Additionally, BFT-2 fold changes increased with selection rounds, possibly indicating increased binding to BFT compared to BSA. The data presented in Figure 9D measured the increase in the ratio of enrichment between BFT and BSA for SELEX rounds 10 and 2. Higher ratios of enrichment were observed when comparing the toxin pools for both BFT-1 and BFT-2 relative to the BSA groups, indicating there was selection for aptamers that preferentially bound to the BFTs across SELEX rounds 2 to 10.
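The three quantities plotted in Figure 9B-D (fold change, ratio of fold change, and ratio of enrichment) can be illustrated with a short sketch. It assumes the common 2^(dCt) convention with 100% amplification efficiency, which is not stated explicitly in the text, and every Ct value below is a hypothetical placeholder rather than measured data.

```python
# Hypothetical Ct values (NOT measured data); keys are (well coating, SELEX round).
ct = {
    ("none", 2): 20.0, ("BSA", 2): 19.5, ("BFT", 2): 19.0,
    ("none", 10): 20.0, ("BSA", 10): 19.0, ("BFT", 10): 17.0,
}

def fold_change(ct_sample, ct_reference):
    """Fold change relative to a reference well, assuming 100% PCR efficiency."""
    return 2.0 ** (ct_reference - ct_sample)

def metrics(selex_round):
    """Return (BFT fold change, BSA fold change, BFT/BSA ratio) for one round."""
    fc_bft = fold_change(ct[("BFT", selex_round)], ct[("none", selex_round)])
    fc_bsa = fold_change(ct[("BSA", selex_round)], ct[("none", selex_round)])
    return fc_bft, fc_bsa, fc_bft / fc_bsa

fc_bft_2, fc_bsa_2, ratio_2 = metrics(2)      # Figure 9B/9C style values, round 2
fc_bft_10, fc_bsa_10, ratio_10 = metrics(10)  # Figure 9B/9C style values, round 10

# Figure 9D style values: enrichment of the late pool over the early pool.
enrich_bft = fc_bft_10 / fc_bft_2
enrich_bsa = fc_bsa_10 / fc_bsa_2

print(f"ratio of fold change, round 2:  {ratio_2:.2f}")
print(f"ratio of fold change, round 10: {ratio_10:.2f}")
print(f"enrichment 10 vs 2: BFT {enrich_bft:.2f}, BSA {enrich_bsa:.2f}")
```

With these placeholder numbers the toxin wells show a larger enrichment ratio than the BSA wells, which is the pattern the text interprets as selection toward the toxin.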

While the BSA blocking experiments provided some insight into whether or not there is enrichment for binding to BFT, to further examine the experimental approach, a modified protocol was carried out utilizing an incubation of BSA and BFT together

(Figure 10A). In contrast to the results from the other BSA blocking protocol, there was a high fold change seen for BFT-1+BSA that decreased over subsequent SELEX rounds. A different trend is seen when comparing the average fold change of BFT-2+BSA to the

BSA group. The BFT-2+BSA and BSA groups incubated with SELEX round 2 DNA show a low average fold change but experience an increase in the fold change as selection cycles are increased. BFT-2+BSA incubated with SELEX round 5 DNA has a much higher average fold change compared to BSA alone. Figure 10C shows the results of taking the ratio of the fold change between the toxin with BSA and BSA only wells. The ratios are lower than in the previous blocking protocol in all pools except for BFT-2+BSA SELEX round 5 compared to BSA only. The data obtained from the co-incubation experiment provide insight into how the selection rounds alter the binding of sequences that do not exhibit specificity to the BFTs.

Discussion

The SELEX process for both BFT-1 and BFT-2 used 1 nanomole of starting random aptamer library per toxin. Multiplying this amount by Avogadro’s constant yielded a total starting input of approximately 6 x 10^14 ssDNA aptamer strands.
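The arithmetic behind this estimate can be checked with a few lines of Python. The second half of the sketch compares the input against the theoretical diversity of the library, under the assumption (based only on the name "N20 Aptamer Library") that the random region is 20 nucleotides long.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def strands_from_nanomoles(nmol):
    """Number of molecules contained in the given amount (nanomoles)."""
    return nmol * 1e-9 * AVOGADRO

starting_strands = strands_from_nanomoles(1.0)
print(f"{starting_strands:.2e}")  # 6.02e+14

# Assumption: a 20-nucleotide random region, as suggested by the library name.
RANDOM_REGION_LENGTH = 20
possible_sequences = 4 ** RANDOM_REGION_LENGTH
copies_per_sequence = starting_strands / possible_sequences
print(f"{possible_sequences:.2e} possible sequences, "
      f"about {copies_per_sequence:.0f} copies of each on average")
```

Under that assumption the starting pool would, on average, contain several hundred copies of every possible random-region sequence.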

The microscopy and flow cytometry experiments designed to detect aptamer

binding using 5’-(FAM) labeled aptamer pools may have been affected by the presence

of a pool containing a large number of non-specific binding sequences compared to binding sequences resulting in the fluorescence signal being below the detection limits.

The FAM labeling may be affected by the high pH step during the PCR process used to dissociate the strand from the biotinylated arm. Finally, the lack of signal detection may have been affected by the length of time the fluorescently labeled aptamers were incubated with the control groups. Overall, it was not possible to detect any changes in fluorescence signal in these experiments, which led to the development of the SYBR Green-based PCR assay.

The SYBR Green Real-Time PCR assay allowed for a number of interesting observations regarding the specificity and selectivity of the Aptamer SELEX pools. The initial experiment, which added a presumably large amount of DNA to the wells, caused

early signal detection by PCR with signals being detected by the second PCR cycle. This

very early signal is generally considered to be an unreliable estimate of true enrichment

in Real-Time PCR Assays. Interestingly, the early Ct values still showed favorable

binding towards BFT when compared to BSA. This experiment also demonstrated that the aptamer pools following SELEX round 10 still contained a large amount of DNA that is not specific towards the toxin and easily binds to either BSA or the empty well. Overall, this first experiment is still able to show a preferential binding of the later SELEX aptamer pool to BFTs, which supports the conclusion that the SYBR Green assay is a promising method of measuring aptamer enrichment towards the target of interest.

Dilution of the aptamer DNA pools and incorporation of the BSA blocking step were used to decrease the total number of nonspecific binding interactions and to see if this resulted in a significant difference between toxin and control wells. Interestingly, this addition allowed for better tracking of the changes in selectivity over time. It can be seen from Figure 9C that the fold changes of BSA stay relatively steady over selection rounds, whereas there is a marked increase in the average fold changes of PCR fluorescence signal for both BFT-1 and BFT-2. This may provide insight as to how the selection against a given target is proceeding. It seems that BFT-2 aptamer selection generates higher specificity much faster than BFT-1, since the results of Figure 9B show that at BFT-2 SELEX round 5 the fold change ratio is approximately equal to that of BFT-1 SELEX pool 10.

Additionally, we can see that between rounds 5 and 10 there is not a drastic increase in fold changes. This might allow for tracking of SELEX processes so that the selection cycles can be modified based on the output of the previous round. From

Figure 9C we see that the ratio of fold-changes for BFT-1 remained around one for both

SELEX rounds 2 and 5 and increased in the 10th round. The ratio of fold change for BFT-2, however, increases dramatically and is as high at round 2 as BFT-1 is at round 10. This implies that BFT-2 selection generates a larger population of binding sequences early on. This large shift is again shown to be diminished in Figure 9D, which shows that the ratios of enrichment at the end of selection compared to round 2 are approximately similar between the two toxins.

A final interesting observation can be made from the co-incubation of BFT with

BSA. The ratio of fold change is skewed much closer to one compared to blocking with

BSA, showing that the aptamer pools are preferentially binding to the toxin. If the fold change ratio is high, it would imply that the aptamer pool is composed of a good number of non-specific binders, since the addition of BSA seems to effectively diminish binding of the aptamers to the well. The data from the real time PCR assays provide an interesting insight into the evolution of the aptamer pools across selection cycles. We can see that the pools exhibit preferred binding to toxin wells, indicating that our selection cycles were successful in generating specific aptamer sequence pools. The incubation of toxin and BSA provides evidence of the ability of this assay to track specificity of the aptamer pool. It seems that selection of BFT-2 may have been disrupted in pool five, since a large ratio of fold change is seen. Additionally, the final enrichment ratio comparing selection rounds 10 to 2 (Figure 9D) shows that the final enrichment is similar to BFT-1. This can be explained by assuming either that the selection rounds from 6-10 were not very effective or that there was contamination at or before round 5, which was corrected by round 10. The SYBR Green assay provides a readily accessible format to easily query the selectivity of aptamer pools, and allows for interesting observations about the properties of these pools to be made. This assay may prove useful in providing a faster and simpler method to track aptamer evolution compared to the current commonly used method of radiolabeled ATP.

Chapter 4 Figures

Graphical Overview of SELEX Process

Figure 4: Graphical simplification of SELEX cycle process. A. The first half of a SELEX cycle starts with ssDNA aptamer pools incubated in Pre-selection and Selection matrix. B. Sequences bound to toxin are eluted, PCR amplified, and separated using NaOH for the next round.

Estimating binding capacity of biotinylated DNA amplicons to M280 Streptavidin Beads

Figure 5: PCR product from BFT SELEX cycle 2. DNA concentrations were estimated by comparing band intensities to a standard 100 base pair ladder (NEB) containing 0.5 micrograms of DNA and used to calculate the volume of SA-M280 beads needed to bind dsDNA SELEX PCR products. The ladder contains a standardized mass of each band; the 100 bp and 200 bp bands, containing 48 and 25 ng of DNA, respectively, were used to estimate the concentration of DNA in the PCR product.

Microscopic Visualization of FAM-Labeled DNA Aptamers and BFT-M280 Complex

Figure 6: FAM-Labeled BFT-2 SELEX 10 Microscopy A. Overview of FAM labeled aptamer generation of BFT-2 Selex Pool 10 DNA. B. Brightfield microscopy image of M280 streptavidin coated beads. No fluorescence was detected on FITC channel (image not shown).

FACSCalibur Flow Cytometry of FAM-labeled Aptamer Pools and BFT-M280 Complex

Figure 7: FACSCalibur Flow cytometry of FAM labeled aptamer pools conjugated to BFT2-SA-M280 Beads. A. SA-M280 bead only without incubation with FAM labeled aptamer B. FAM labeled aptamers incubated with SA-M280 beads C. FAM labeled aptamers incubated with BFT2-SA-M280 beads. D. Histogram of A-C comparing counts with fluorescence signal. There is no evidence of binding of FAM labeled aptamers to the toxin bead complex.

Affinity of final SELEX round DNA Pools for Toxin detected by Real-Time SYBR Green PCR Assays

Figure 8: Real-Time SYBR Green PCR Assay comparing enrichment of BFT-1 and BFT-2 SELEX Pool 10 to BSA. The optical plates were incubated with toxin overnight, washed, and incubated with DNA. Signal from both the BFT-1 and BFT-2 wells is detected earlier than signal from the BSA non-specific protein control well.

Figure 9: A. Schematic diagram of the SYBR Green Real-Time PCR assay with BSA blocking. A BSA blocking step before incubation with DNA from SELEX pools 2, 5, and 10 was used to decrease the non-specific interaction of aptamers. B. Average fold change compared to wells not coated with protein was used to track the selective binding of aptamer pools to BFT. Increased fold changes are seen in the toxin wells for both toxins. A slight decrease is seen in BFT-1 with SELEX 5 DNA. C. Ratios of fold changes compared to control wells were used to determine if SELEX DNA pools favored BFT versus BSA. D. The ratio of fold changes between later and earlier SELEX DNA pools for BFT- and BSA-coated wells was used to determine favorability of DNA pools. Ratios closer to one imply no selection towards a target. BFT enrichment ratios are more than twice as large as those of the respective BSA pools, indicating favorable enrichment to toxin.

Figure 10: A. Schematic of co-incubation of BSA control protein with BFT to assess the contribution of non-specific binding to the PCR assay. B. Average fold changes of toxin co-incubated wells are expected to be closer to those of BSA wells if most of the amplification is due to non-specific interactions. Interestingly, the fold changes decrease across SELEX cycles for BFT-1, possibly indicating less favorable binding of aptamers and possible selection against BFT-1. C. Ratio of fold changes for toxin co-incubated wells compared to BSA wells, used to see if there is selection across SELEX cycles. Low ratios are expected for aptamer pools that favor binding to toxin.

CHAPTER 5: Next Generation Sequencing of Aptamer Pools

Purpose

Prior to the adoption of next generation sequencing methods for aptamer studies,

the most common approach to identifying binding candidates required ligation of the

final selection pool of aptamers into vectors that were then transformed into competent E.

coli strains (unpublished, general observation). Following transformation, single colonies

were isolated and the aptamer sequence was PCR amplified from the plasmid DNA prior

to sequencing. As selection rounds for the target were increased it was assumed that the

resultant sequences would be dominated by aptamers that exhibited specificity for the

target. This approach assumed that the sequences that are highest in abundance are due to

actual selection and not a result of cloning or PCR bias.

Following the demonstration in the previous chapter of evidence for enriched

preparations containing BFT-1-binding or BFT-2-binding aptamers, the next goal was to analyze the sequence composition of the pools. The following experiments used a modified protocol published by Tolle and Mayer to prepare aptamer pools following

SELEX for Illumina MiSeq next generation sequencing.59 Incorporation of a next

generation approach to more thoroughly assess the composition of aptamer pools allowed

for the possibility of obtaining information regarding the composition of a larger

proportion of sequences in aptamer pools. This approach provides an opportunity to track

the evolution of these sequences across selection rounds in order to identify any candidate binders that are not highly selected in the PCR amplification steps.

Materials and Methods

Indexing Aptamer Pools for Next Generation Sequencing

In the work described here the aptamer sequences will undergo two levels of labeling, which will be referred to as indexing and barcoding. Index sequences will be added to the aptamers to distinguish the different aptamer pools. Aptamers will then receive barcodes to distinguish the different pools for sequencing.

Pairs of FWDSELEX and REVSELEX primers, each containing one of the following 12 index sequences incorporated at the 5’ end, were synthesized (Sigma-Aldrich): 1-ATCACG, 2-CGATGT, 3-TTAGGC, 4-TGACCA, 5-ACAGTG, 6-GCCAAT, 7-CAGATC, 8-ACTTGA, 9-GATCAG, 10-TAGCTT, 11-GGCTAC, and 12-CTTGTA. Four 50 µL PCR reactions were prepared, each containing 1 µL ssDNA template, 1 µM each of the indexed forward and reverse primers, 1x Pfu Reaction Buffer, 50 µM each dATP, dCTP, dGTP, and dTTP, and 2.5 U Pfu Polymerase (Promega) in ultrapure water. ssDNA from

N20 Aptamer Library, BFT-1 Selex Cycles 4, 7, 9, 10 and BFT-2 Selex Cycles 4, 7, 9, 10 were used in the indexing reaction. A second indexing reaction for BFT-1 and BFT-2

SELEX Cycles 10 was done using index pair 10 and 11, respectively. The PCR conditions described in Chapter 4 were used, but only for 8 cycles. The pooled 200 µL dsDNA indexed PCR product was purified using NucleoSpin Gel and PCR Clean-Up Kit

(Macherey-Nagel) and eluted in 30 µL Elution Buffer (5 mM Tris/HCl, pH 8.5). All 30

µL of each purified PCR product was loaded onto a 3% ethidium bromide-stained agarose gel in SB running buffer and run at 130 V for 30 minutes. PCR products of the correct size were gel extracted and the DNA was purified using NucleoSpin Gel and PCR Clean-Up Kit.

Following purification DNA concentrations were measured using NanoDrop 2000c

(ThermoFisher) and indexed pools 1-9 encompassing DNA from N20 Aptamer Library,

BFT-1 and BFT-2 SELEX cycles 4, 7, 9, and 10 were combined at an equimolar ratio to form the Mixed Index Pool, which was concentrated to a final volume of 30 µL in

Elution Buffer using NucleoSpin Gel and PCR Clean-Up columns. DNA from the BFT-1

SELEX Pool 10 Index 10 and BFT-2 SELEX Pool 10 Index 11 reactions were processed as separate sequencing pools.
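The equimolar pooling step can be sketched as follows. The sketch converts NanoDrop readings (ng/µL) to molar concentration using the 78 bp indexed amplicon length given in Figure 11 and the common approximation of 660 g/mol per base pair of dsDNA, then scales every pool to the most dilute one. The readings and the 20 µL maximum volume are hypothetical values chosen only for illustration.

```python
BP_MASS = 660.0  # approximate g/mol per base pair of dsDNA (common approximation)

def ng_per_ul_to_nM(conc_ng_ul, length_bp):
    """Convert a dsDNA mass concentration (ng/µL) to nanomolar."""
    return conc_ng_ul * 1e6 / (BP_MASS * length_bp)

def equimolar_volumes(conc_nM, max_volume_ul):
    """Volume of each pool that contributes equal moles.

    The most dilute pool contributes max_volume_ul; all other pools are
    scaled down to match its molar amount (nM * µL = fmol).
    """
    lowest = min(conc_nM.values())
    target_fmol = lowest * max_volume_ul
    return {name: target_fmol / conc for name, conc in conc_nM.items()}

# Hypothetical NanoDrop readings in ng/µL (not measured data).
readings = {"index 1": 10.0, "index 2": 20.0, "index 3": 40.0}
conc_nM = {name: ng_per_ul_to_nM(c, 78) for name, c in readings.items()}
volumes = equimolar_volumes(conc_nM, max_volume_ul=20.0)

for name in readings:
    print(f"{name}: {conc_nM[name]:.1f} nM, add {volumes[name]:.1f} µL")
```

Because the calculation depends directly on the measured concentrations, any inaccuracy in the absorbance readings carries straight through to the molar ratio of the pooled libraries.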

Adapting NEB NEXT Protocols for Aptamer NGS preparation

NEBNext Ultra II DNA Library Prep Kit for Illumina and NEBNext Multiplex

Oligos for Illumina Index Set 1 (NEB) were used to prepare sequencing pools containing barcoded aptamers. These specific steps were performed according to available NEB protocols to prepare the three sequencing pools (Mixed Index Pool, BFT-1 Selex Pool 10

Index 10, and BFT-2 Selex Pool 10 Index 11) for Illumina MiSeq sequencing:

NEBNext End Prep

Sequencing pools containing 30 µL of DNA were combined with 20 µL of sterile ultrapure water and added to a sterile DNase- and RNase-free 1.5 mL centrifuge tube containing 3 µL NEBNext Ultra II End Prep Enzyme Mix and 7 µL NEBNext Ultra II End Prep Reaction Buffer. The reaction was mixed thoroughly and pipetted into PCR tubes,

placed in an ABI SimpliAmp Thermal Cycler with lid set to 105°C, and incubated for 30

minutes at 20°C followed by 30 minutes at 65°C, holding at 4°C when finished.

Illumina Sequencing Adaptor Ligation

The product of the End Prep reaction (60 µL) was combined with 30 µL NEBNext Ultra II Ligation Master Mix, 1 µL NEBNext Ligation Enhancer, and 2.5 µL NEBNext Adaptor for Illumina, mixed thoroughly, and spun down with a benchtop centrifuge. The reactions were incubated at 20°C for 15 minutes. USER Enzyme (3.0 µL; NEB) was added to the ligation mix and incubated at 37°C. Ligation reactions were cleaned up using NucleoSpin Gel and PCR Clean-up columns (Macherey-Nagel) and eluted in 30 µL of Elution Buffer.

PCR Amplification and Illumina Barcoding

Aliquots (15 µL) of the purified adaptor-ligated DNA fragments were thoroughly

mixed with 25 µL of NEBNext Ultra II Q5 Master Mix, 5 µL of Index Primer from

NEBNext Index Set 1, and 5 µL of Universal PCR Primer. The mixture was PCR amplified using SimpliAmp Thermal Cycler (Applied Biosystems/ThermoFisher) with the following conditions: 98°C for 30 seconds, 4 cycles (98°C for 10 seconds, 65°C for

75 seconds), followed by 65°C for 5 minutes and a 4°C hold. PCR products (50 µL) and

5x BlueJuice (10 µL) were loaded on a 3% ethidium bromide-stained agarose gel in SB buffer and run for 30 minutes at 130 V. Gels were imaged using a BioRad ChemiDoc MP imager. The correct size bands were identified, gel extracted, purified using NucleoSpin

Gel and PCR Clean-up columns, and eluted in 30 µL Elution Buffer. BFT-1 Selex Pool

10 containing the aptamer FWDSELEX and REVSELEX primer index 10 was barcoded with Illumina Index 6. BFT-2 Selex Pool 10 containing the aptamer FWDSELEX and

REVSELEX primer index 11 was barcoded with Illumina Index 12. Finally, the BFT Mixed Index Pool containing the aptamer FWDSELEX and REVSELEX primer indexes 1-9 was barcoded with Illumina Index 4.

Measuring Concentration of Sequencing Pools

DNA concentrations of the Illumina barcoded sequencing libraries were measured

using NEBNext Library Quant Kit for Illumina. The manufacturer’s protocol was

followed to determine the concentrations of sequencing libraries for MiSeq analysis.

NEBNext Library Quant Dilution Buffer was used to dilute the starting libraries 1:1,000, 1:10,000, 1:100,000, 1:10,000,000, and 1:1,000,000,000. The initial 1:1,000 dilution was made by adding 1 µL sequencing library DNA to 999 µL 1x NEBNext Dilution Buffer. Subsequent ten-fold serial dilutions were made by adding 10 µL diluted library to 90 µL 1x Dilution Buffer.

Volumes of 4 µL of the diluted libraries or 4 µL of the standards were mixed with 16 µL

NEBNext Library Quant Master Mix with Primers and loaded in ABI MicroAmp optical

reaction plates. Dilution buffer only was used as a no-template control reaction.

Reactions were thoroughly mixed and briefly centrifuged to remove bubbles. The qPCR

assays were run in triplicate on an ABI 7500 Real-Time PCR machine at 95°C for 1

minute, and 35 cycles at 95°C for 15 seconds, 63°C for 45 seconds using SYBR/FAM

with low ROX normalization. Data were analyzed by averaging the standard library values, and linear regression of the quantification cycle (Cq) versus log concentration was used to determine the slope and intercept. Cq values for the sequencing libraries that

were flanked by at least two standards were used to determine concentration of undiluted

aptamer sequencing pools.
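The standard-curve calculation described above can be sketched in a few lines. The standard concentrations follow those listed in the Figure 13 caption; the Cq values and the example library dilution are hypothetical, and any fragment-size correction that the kit protocol may call for is omitted.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

standards_pM = [10.0, 1.0, 0.1, 0.01]    # standards named in Figure 13
standard_cq = [10.0, 13.3, 16.6, 19.9]   # hypothetical averaged Cq values

slope, intercept = fit_line([math.log10(c) for c in standards_pM], standard_cq)

def undiluted_conc_pM(cq, dilution_factor):
    """Interpolate a diluted library on the curve, then undo the dilution."""
    return 10 ** ((cq - intercept) / slope) * dilution_factor

# Hypothetical library: Cq of 14.95 measured at a 1:100,000 dilution.
library_nM = undiluted_conc_pM(14.95, 1e5) / 1000.0
efficiency = 10 ** (-1.0 / slope) - 1.0  # amplification efficiency implied by the slope

print(f"slope {slope:.2f}, intercept {intercept:.2f}, efficiency {efficiency:.0%}")
print(f"undiluted library concentration: {library_nM:.1f} nM")
```

Restricting the calculation to library Cq values that are flanked by standards, as described in the text, keeps the estimate within the interpolated range of the curve.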

MiSeq Nano Chipset Preparation and Sequencing

Illumina MiSeq Nano sequencing cartridges have a sequencing capacity of one million reads. Illumina barcoded sequence pools of known concentration were each assigned a specific proportion of reads. The Mixed Index pool

containing indexed aptamer sequences across multiple selection rounds was assigned

70% of the total reads on the sequencing lane. The remaining 30% was split evenly such

that 15% of the run was assigned to each of BFT-1 SELEX Pool 10 and BFT-2 SELEX Pool

10. Estimated concentrations of the barcoded sequencing pools were used to combine relative fractions to a final concentration of 4 nM in a final volume of 20 µL. The combined DNA libraries (5 µL) were mixed with 5 µL of 0.2 N NaOH and incubated for

5 minutes then added to 990 µL of chilled MiSeq HT1 hybridization buffer. The denatured pooled sequencing libraries were further diluted in HT1 buffer to a final concentration of 12 pM. PhiX sequencing control (40 µL at 20 pM; Illumina) in HT1

buffer was added to 560 µL of the 12 pM pooled sequencing library and loaded into an

Illumina MiSeq Nano Chipset cartridge. The Illumina cartridge was loaded into a MiSeq sequencer.
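The pooling and dilution arithmetic in this section can be checked with a short sketch. It uses the library concentrations reported in the Results of this chapter and the read fractions given above; the direct-pipetting volumes it prints are illustrative only, since volumes this small would in practice require an intermediate dilution, which is an assumption not described in the text.

```python
# Library concentrations (nM) as reported in the Results of this chapter.
library_nM = {"BFT-1 Pool 10": 51.6, "BFT-2 Pool 10": 27.9, "Mixed Index": 531.4}
# Fraction of the run assigned to each sequencing pool.
read_share = {"BFT-1 Pool 10": 0.15, "BFT-2 Pool 10": 0.15, "Mixed Index": 0.70}

FINAL_NM, FINAL_UL = 4.0, 20.0
total_fmol = FINAL_NM * FINAL_UL  # nM * µL = fmol

# Volume of each undiluted library needed to supply its share of the pool.
volumes_ul = {k: read_share[k] * total_fmol / library_nM[k] for k in library_nM}
buffer_ul = FINAL_UL - sum(volumes_ul.values())

# Denaturation: 5 µL of the 4 nM pool + 5 µL 0.2 N NaOH + 990 µL HT1 buffer.
denatured_pM = FINAL_NM * 1000.0 * 5.0 / (5.0 + 5.0 + 990.0)

# PhiX spike-in: 40 µL at 20 pM added to 560 µL of the 12 pM library.
phix_amol = 40.0 * 20.0   # pM * µL = amol
lib_amol = 560.0 * 12.0
phix_fraction = phix_amol / (phix_amol + lib_amol)

for name, vol in volumes_ul.items():
    print(f"{name}: {vol:.3f} µL")
print(f"buffer to 20 µL: {buffer_ul:.2f} µL")
print(f"after denaturation: {denatured_pM:.0f} pM")
print(f"PhiX molar fraction of loaded DNA: {phix_fraction:.1%}")
```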

Results

Figure 11 shows the gel image following barcode indexing of aptamer pools. Both

the BFT-1 and BFT-2 pool 10s show very robust PCR generation when compared to all

the other pools. The formation of secondary products can also be seen in the higher

molecular weight bands on the PCR gel. Gel extraction of the DNA bands resulted in a

significant loss in yield (general observation, data not shown). Any pool that did not yield

DNA was re-amplified and re-extracted. Concentrations were measured using NanoDrop,

and the concentration of the lowest pool was used to calculate the amount of DNA to add

to the Mixed Index sequencing library. BFT-1 SELEX pool 10 index 10 and BFT-2

SELEX pool 10 index 11 were used as standalone pools to generate sequencing libraries containing only sequences from the last round of selection. The remaining BFT-1 and

BFT-2 indexed SELEX pools were used to generate the Mixed Index sequencing library.

The Mixed Index library was therefore composed of a mixture of aptamers from a range of SELEX rounds for both BFT-1 and BFT-2. In place of using the DNA from the BFT selection round 10 pools with index 10 and index 11 the BFT pools containing index 5 and 9 were used to represent sequences from the final round of selection in the Mixed

Index sequencing pool.

Figure 12 shows the gel containing the Illumina barcoded libraries. The Mixed

Index pool contained indexed sequences from the stock aptamer library as well as aptamer pools from selection rounds 4, 7, 9, and 10 for both BFT-1 and BFT-2. This pool shows a much larger amount of DNA compared to the BFT pool 10 libraries.

Additionally, the Illumina library generation resulted in a number of higher and lower molecular weight secondary PCR products (Figure 12). The bands extracted for sequencing are contained within the red boxes in Figure 12.

Following gel extraction and purification, the DNA concentrations were measured using qPCR and the resulting real-time plots are shown in Figure 13. The Cq values that were in the middle of the standard ladders were chosen for calculating the library concentrations. The standard ladder Cq values were plotted on log-plot and the equation

was used to calculate average library concentrations. The libraries were determined to have concentrations of 51.6 nM, 27.9 nM, and 531.4 nM for the BFT-1 SELEX Pool 10, BFT-2 SELEX Pool 10, and Mixed Index Pool, respectively. The libraries were sequenced at a final concentration of 12 pM.

Discussion

The PCR reactions to generate the indexed DNA pools showed variations in the robustness of generated PCR products, which may have had an effect on the makeup of the aptamer pools that were used for sequencing. The large amount of DNA present in the sequencing library gel may have resulted in the carryover of PCR products that were not of the correct length, which would have affected the quality of sequencing reads. The use of the NanoDrop to estimate the concentration of barcoded DNA pools was a significant shortcoming in the generation of a high quality sequencing library. It resulted in uneven loading of the sequencing pools, which directly affected the read output for a few of the SELEX rounds and made it impossible to track the evolution of the aptamer sequences across all of the selected SELEX rounds.

Chapter 5 Figures

Barcode Indexing of SELEX DNA Pools

Figure 11: SELEX DNA PCR pools were amplified and run on a 3% agarose gel. The addition of barcode indexes increases the size of aptamers by 12 bp for a total size of 78 bp. Robust amplification is seen for Pool 10 DNA for both toxins. Pool 10 DNA from both toxins was also indexed separately to sequence alongside the mixed pool.

NEBNext Illumina Sequencing Library Gel Extraction

Figure 12: Following amplification of sequencing ligation reactions to add Illumina barcode and Universal primer fragments, the reactions were run on a 3% agarose gel and bands indicated by the red boxes were extracted. Illumina barcodes and sequencing primers add approximately 125 bp to the length of dsDNA, bringing the expected size of the library to ~200 bp.

Illumina Sequencing Library Quantification

Figure 13: Real-Time qPCR outputs for NEBNext Library quantification of prepared aptamer sequencing pools. Determining sequencing pool concentrations was necessary to calculate the amount of DNA to add to the sequencing chip. A. qPCR output for NEBNext standards with 10, 1.0, 0.1, and 0.01 pM concentrations. B. Output of standards with sequencing library qPCR curves used for data analysis. C, D, E. qPCR output for BFT-1, BFT-2, and Mixed Index pools, respectively. Dilutions of 10^-4, 10^-5, 10^-6, 10^-7, and 10^-9 were used. The 10^-9 curve shows that the Mixed Index pool concentration was higher than that of the other pools due to its earlier appearance.

CHAPTER 6: Analysis of Sequence Data to Characterize BFT Binding Candidates

Purpose

Sequencing of aptamer pools using next generation sequencing (NGS) platforms generates a large amount of sequence data that needs to be efficiently analyzed. A number of aptamer sequence processing programs are widely available.

However, a familiarity with command-line scripting is required for the efficient use of these data analysis pipelines. One of the principal goals considered in designing the previous experiments was to underscore the ease of access and utility of aptamer generation techniques to laboratories without previous experience using SELEX or NGS tools. The use of publicly available graphical user interface (GUI)-based webservers for data analysis circumvents the need to use command-line scripting making analysis of aptamer sequences more readily accessible.

Successful MiSeq sequencing of the barcoded aptamer pools generated approximately one million high quality reads that needed analysis in order to provide insight to the overall success of aptamer selection and candidate generation against the

BFT toxins. Only publicly available GUI-based webservers were used to analyze the results of the SELEX aptamer pool sequences. The data analysis conducted in this

chapter is used to demonstrate a straightforward pipeline that can be easily adapted for

use with aptamer sequence analysis.

Materials and Methods

Post-processing of NGS Sequence reads with GALAXY

The 150 base paired-end MiSeq sequencing runs generated three fastq.gz

sequencing files that were processed with toolsets available on Galaxy project webservers

(https://galaxyproject.org/public-galaxy-servers/).60 The FastQC package was used to

analyze the quality of sequencing reads and to assess overrepresented sequences, average read lengths, and base content.61 Paired-end reads were merged using PEAR.62 Fastq

assembled files were subsequently converted to fasta format and de-multiplexed using the

Barcode Splitter tool. Index sequences (refer to the Chapter 5 section Indexing Aptamer

Pools for Next Generation Sequencing) and the forward primer (FWDSELEX) sequence

were used to de-multiplex all files. Unmatched files were uploaded, de-multiplexed using barcode index sequences with the reverse primer (REVSELEX) sequence and the reverse complement sequence was obtained using the Reverse-Complement tool. The reverse complement sequence file and initial de-multiplexed files were combined using Collapse

Collection tool. Sequences from the BFT-1 SELEX Pool 10 reads were merged together using. The same was repeated for BFT-2 SELEX Pool 10 reads. The Trim Sequences tool was used to remove any barcode indexes. The Clip Adapter tool was used to remove the reverse primer arm followed by Trim Sequences to remove the forward primer arm in one subset of files. Another set of files had only the indexes removed and retained all forward

and reverse primer regions. A third set of files filtered the final aptamer sequences for

length, discarding any reads that were less than 66 bases using the “Filter sequences by length” tool.
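The logic of these post-processing steps (de-multiplexing by index, reverse-complementing reads in the opposite orientation, trimming indexes, and filtering by length) can be sketched in plain Python. This is not the Galaxy implementation: the primer sequences below are hypothetical placeholders rather than the thesis primers, only three of the twelve indexes are shown, and the sketch assumes each merged read carries a six-base index at both ends.

```python
# Subset of the 12 index sequences listed in Chapter 5.
INDEXES = {1: "ATCACG", 2: "CGATGT", 3: "TTAGGC"}
INDEX_LEN = 6
# Hypothetical placeholders, NOT the FWDSELEX/REVSELEX primers used in the thesis.
FWD_PRIMER = "GGACAGTCAAGCTTGACCTAGCA"
REV_PRIMER_RC = "TCGATCCGATTGCAGGTCTAGAC"
MIN_LENGTH = 66  # reads shorter than this after index trimming are discarded

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]

def assign_index(read):
    """Return the index number if the read starts with index + forward primer."""
    for number, tag in INDEXES.items():
        if read.startswith(tag + FWD_PRIMER):
            return number
    return None

def demultiplex(reads):
    """Sort reads into index pools, trying the reverse complement of misses."""
    pools = {number: [] for number in INDEXES}
    unmatched = []
    for read in reads:
        for candidate in (read, reverse_complement(read)):
            number = assign_index(candidate)
            if number is not None:
                trimmed = candidate[INDEX_LEN:-INDEX_LEN]  # strip both indexes
                if len(trimmed) >= MIN_LENGTH:
                    pools[number].append(trimmed)
                break
        else:
            unmatched.append(read)
    return pools, unmatched

# Toy example: one read in each orientation plus one unrelated read.
core = FWD_PRIMER + "ACGTACGTACGTACGTACGT" + REV_PRIMER_RC
read = INDEXES[2] + core + reverse_complement(INDEXES[2])
pools, unmatched = demultiplex([read, reverse_complement(read), "TTTTTTTT"])
print(len(pools[2]), "reads assigned to index 2;", len(unmatched), "unmatched")
```

Both orientations of the toy read end up in the same pool in the selection orientation, which mirrors the purpose of the Reverse-Complement step described above.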

Comparing aptamer copy numbers for candidate elucidation

The collapse sequences tool was used to count the repeated sequences within the

aptamer pools. This was done using trimmed sequences as well as full-length primer

sequences to determine if there is any obvious enrichment in the final pools.

Additionally, sequences present in both BFT-1 and BFT-2 pools were analyzed using

Collapse Collection tool to join the sequencing reads in both SELEX Pool 10 files and the highest frequency reads were compared to the original counts to determine similarities between pools.
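The counting and pool-comparison logic can be sketched with the Python standard library. The reads below are toy sequences, not data from the sequencing run. A sequence is flagged as shared when its count in the combined pool exceeds its count in either individual pool, which mirrors the comparison described above.

```python
from collections import Counter

def collapse(reads):
    """Count identical sequences and list them from most to least abundant."""
    return Counter(reads).most_common()

# Toy reads standing in for the two SELEX Pool 10 files (not real data).
bft1_pool10 = ["AAGT", "AAGT", "CCTA", "GGGA"]
bft2_pool10 = ["CCTA", "CCTA", "CCTA", "TTAC"]

counts1, counts2 = Counter(bft1_pool10), Counter(bft2_pool10)
combined = counts1 + counts2  # equivalent to collapsing the joined files

# A sequence present in both pools has a combined count larger than
# its count in either pool alone.
shared = {
    seq: (counts1[seq], counts2[seq])
    for seq in combined
    if combined[seq] > max(counts1[seq], counts2[seq])
}

print(collapse(bft2_pool10))
print(shared)
```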

Motif Analysis of Aptamer Pools

Multiple Em for Motif Elicitation (MEME) webservers were utilized to determine

the presence of motifs in the DNA aptamer pools. Motifs found in later pools were

compared to earlier pools to determine if there are any overlapping sequences.63

Statistical Analysis of Aptamer Pool Enrichment

MEME webservers were used to determine the presence of 8-15 base motifs in

the most highly enriched sequences from SELEX Pool 10 for both BFT-1 and BFT-2.

The number of reads in each aptamer pool containing these motifs was quantified using

the MEME webserver Find Motifs (FIMO) tool. Contingency (2x2) tables comparing the

number of reads with motifs to the remaining reads in the sequenced aptamer pools were

generated. Data were not normalized and Fisher’s Exact Test was used to determine if

there was a statistically significant difference in the enrichment of the motifs across

sequenced pools.
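A two-sided Fisher's Exact Test on a 2x2 contingency table can be written directly from the hypergeometric distribution. The sketch below is a minimal pure-Python version suitable for small counts, and the motif counts in the example are hypothetical. For tables built from hundreds of thousands of reads, a dedicated statistics package would be a more practical choice.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Rows: two aptamer pools. Columns: reads with motif, reads without motif.
    The p-value sums the probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * tolerance
    )

# Hypothetical counts: 30 of 1000 reads carry the motif in a late pool,
# 5 of 1000 reads carry it in an early pool.
p_value = fisher_exact_two_sided(30, 970, 5, 995)
print(f"p = {p_value:.2e}")
```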

Clustal Omega Alignment of Sequences

Sequences with more than 3 reads were aligned using Clustal Omega for Selex

pool 10. Phylogeny plots were generated to determine the overall homogeneity present

within aptamer pools.64

Secondary Structure Prediction of Enriched Sequences

Clustal Omega alignments were processed using webservers to determine any

obvious consensus aptamer sequences. Secondary structures and conserved structures of

the most enriched aptamer sequences were visualized using the RNAstructure and mfold

webservers.65-67

Results

The post-processing of the Mixed Index Pool sequence data shown in Table 1

suggested that the DNA concentrations obtained using the NanoDrop instrument were not accurate. The sequence heterogeneity in the earlier pools limited the ability to properly determine sequence evolution over selection rounds. Figure 14 shows the potential sequencer read output as either being 5’ oriented, which was the orientation of the aptamers used for selection, or 3’ oriented (reverse complement reads). If the sequences had a 3’ orientation, the reverse complement sequences were generated in order to process the reads in a way that provides useful aptamer sequence information.

Table 1 also shows that about five percent of the reads were unusable. The Mixed Index

Pool, which was constructed to be composed of 11% of each sub-library with an expected

read output of 700,000 reads, generated over 584,300 sequence reads. This same process was repeated for the independently sequenced BFT-1 and BFT-2 pools, which generated 164,000 and 133,000 reads, respectively, out of an expected 150,000 reads each (data not

shown). Table 1 shows a large degree of variability in the actual percentage total reads

that each sub-library yielded from sequencing. BFT-1 SELEX rounds 4, 7, and 9 had

fewer reads than expected with yields of 0%, 1.88%, and 8.1%, respectively. BFT-2

SELEX rounds 7 and 9 had fewer reads than expected with yields of 0% and 1.67%, respectively. The BFT-1 SELEX round 10, BFT-2 SELEX round 4, BFT-2 SELEX round 10, and Stock

Aptamer Library pools had more reads than expected with yields of 31.84%, 14.69%,

21.47%, and 14.89%, respectively.
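
The deviation from even loading can be made explicit with a short calculation. The sketch below uses the percentages reported above and assumes nine sub-libraries (four BFT-1 rounds, four BFT-2 rounds, and the Stock Aptamer Library), which is inferred from the pools listed and from the stated ~11% share per sub-library; the variable and function names are illustrative only.

```python
# Observed percentage of total reads per sub-library, as reported in the text.
observed_percent = {
    "BFT-1 round 4": 0.0,
    "BFT-1 round 7": 1.88,
    "BFT-1 round 9": 8.1,
    "BFT-1 round 10": 31.84,
    "BFT-2 round 4": 14.69,
    "BFT-2 round 7": 0.0,
    "BFT-2 round 9": 1.67,
    "BFT-2 round 10": 21.47,
    "Stock Aptamer Library": 14.89,
}

# Even loading across nine sub-libraries (an assumption) gives ~11.1% each.
expected_percent = 100.0 / len(observed_percent)


def deviation_from_expected(observed, expected):
    """Return observed minus expected share, in percentage points, per pool."""
    return {pool: pct - expected for pool, pct in observed.items()}


deviations = deviation_from_expected(observed_percent, expected_percent)

# Share of reads assigned to any sub-library; the remainder was unusable.
assigned_percent = sum(observed_percent.values())
unassigned_percent = 100.0 - assigned_percent
```

The assigned share sums to roughly 94.5%, which is consistent with the statement that about five percent of the reads were unusable.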

Table 2 describes the properties of the aptamer pools regarding the number of enriched sequences and the number of sequences that were usable for analysis. Aptamer sequences that had two or more reads output through sequence analysis were categorized as highly enriched sequences. The BFT-2 selection round 10 had a considerable number of unusable sequence reads, which either contained forward or reverse constant regions or had fewer than 15 bases.
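
The filtering and copy number counting summarized in Table 2 could be reproduced along the following lines. This is a minimal sketch, not the webserver workflow used in this study: the constant-region arguments are hypothetical placeholders, and the 15-base cutoff and the two-read enrichment threshold are taken from the description above.

```python
from collections import Counter


def count_usable_reads(reads, forward_constant, reverse_constant, min_length=15):
    """Count copies of each usable random-region read.

    Reads that still contain a constant region, or that are shorter than
    min_length bases, are discarded before counting.
    """
    counts = Counter()
    for read in reads:
        if forward_constant in read or reverse_constant in read:
            continue  # constant (primer) region carried over into the read
        if len(read) < min_length:
            continue  # too short to be a usable random region
        counts[read] += 1
    return counts


def enriched_sequences(counts, min_copies=2):
    """Return (sequence, copies) pairs with at least min_copies reads."""
    return [(seq, n) for seq, n in counts.most_common() if n >= min_copies]
```

Raising `min_copies` to 3, 5, or 10 gives the other copy number categories listed in Table 2.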

Tables 4 and 5 examine the properties of the toxin-binding aptamer sequences from the 10th selection round. Read counts ranged from 2 to 137 reads

per sequence for the highly enriched aptamer pools. The highest read counts were

seen in the BFT-2 pool, where one sequence had 137 reads and the second highest had 13

reads. The highest read count seen in the BFT-1 pool was 10. The majority of the

sequences in both pools had only two reads. Table 3 shows the comparison between the

most highly enriched (three or more sequence reads) sequences across BFT-1 and BFT-2 selection round 10. The table was generated by combining the sequences from the 10th

selection round and looking for an increase in the sequencing reads in the resulting

combined aptamer pools. This resulted in two sequences that had a different number of

reads when compared to the separate BFT SELEX 10 pools. These sequences were

highlighted in green and depict two aptamer sequences that were present in SELEX pool

10 of both BFT-1 and BFT-2. It should be noted that the number of sequence reads for

these aptamers was higher in BFT-2 than in BFT-1. Future work will be done to

determine if these sequences can bind specifically to both toxins and to determine

whether the sequences are a result of PCR biasing or represent aptamers that may

nonspecifically interact with other proteins such as BSA.
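
The search for sequences present in both round 10 pools amounts to intersecting two tables of read counts. A minimal sketch, assuming each pool has already been reduced to a `Counter` mapping sequence to read count (the function name is illustrative):

```python
from collections import Counter


def shared_sequences(counts_a, counts_b):
    """Return {sequence: (reads_in_a, reads_in_b)} for sequences in both pools."""
    shared = counts_a.keys() & counts_b.keys()
    return {seq: (counts_a[seq], counts_b[seq]) for seq in shared}
```

Returning the read counts from both pools side by side makes it straightforward to see in which pool a shared sequence is more highly represented.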

Table 4 depicts the highest copy number aptamer sequences from BFT-1 that showed some degree of sequence identity. Interestingly, comparing one sequence from round 7 with one from round 10 shows a high degree of sequence identity between aptamers 2-2S3 and 22-2S3. Figure 15 shows the predicted secondary structures of these aptamers. A change of two bases at positions 27 and 28, from CC to GG, resulted in a significant shift in the predicted secondary structure of these aptamers through the introduction of a larger stem-loop structure. Table 5 describes the high abundance sequences from BFT-2 selection round 10. Interestingly, the highest copy number aptamer sequence from this round is also found in both round 9 and round 4, indicating the early generation and continued selection of this aptamer. Within the higher copy number aptamers there is a cluster that has a large amount of sequence identity. Figure 16 compares the motifs

present in both SELEX 10 aptamer pools. Six of the nine motifs (having 3 or more

copies) have some degree of overlap.
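
A base-by-base comparison of the kind used to identify the two-base change between the round 7 and round 10 sequences can be sketched as follows. The function assumes two sequences of equal length that are already in register; it is an illustration and not the alignment procedure used in this study.

```python
def differing_positions(seq_a, seq_b):
    """Return (position, base_a, base_b) for each mismatch, using 1-based positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length; align them first")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]
```

Applied to two 40-base placeholder sequences that differ only by CC versus GG at positions 27 and 28, the function returns those two positions and the bases involved.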

To better visualize the distribution of sequences in the last SELEX pool the

sequences were aligned using Clustal Omega and the alignment was used to generate a

dendrogram describing the relative similarity between the sequences. Figure 17 shows the

degree of sequence similarity between the aptamers selected against BFT-1 or BFT-2.

The BFT-1 pool seems to be less diverse than the BFT-2 pool. Sequence alignment and dendrogram

generation for the pooled BFT aptamer sequences show four main clustering branches.

Interestingly, the sequences for both toxins are distributed throughout the branches. In

order to assess the statistical significance of the enrichment across selection rounds,

MEME webservers were used to find 8-15 base-long sequence motifs present in the highest copy number aptamer sequences from the 10th selection round of both toxins.

These motifs were then counted across selection rounds, normalized, and plotted in

Figure 18, which also displays the corresponding motif sequences and lengths.

It can be seen that there was an increase in the normalized motif count across selection

rounds for BFT-1. The normalized motif counts for BFT-2 show a sharp increase from

SELEX round 4 to 9 and then drop in round 10. Although the lack of sequencing depth

prevented tracking of aptamer pool evolution through copy number counts, we could

deduce there was a change in the composition of these pools based on the normalized

motif counts, which showed that the selected motif composition changed significantly

across selection rounds.
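
The motif counting and normalization described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure actually used: motifs are written with bracketed wobble sites (for example `AC[GT]A`), which conveniently matches regular expression character-class syntax, and each pool is represented as a list of reads. The function names are hypothetical.

```python
import re


def count_motif(sequences, motif):
    """Count the sequences that contain a motif at least once.

    Bracketed wobble sites, such as AC[GT]A, are interpreted as regular
    expression character classes.
    """
    pattern = re.compile(motif)
    return sum(1 for seq in sequences if pattern.search(seq))


def per_million(count, total_reads):
    """Normalize a raw count to counts per million sequence reads."""
    if total_reads == 0:
        return 0.0
    return count / total_reads * 1_000_000
```

Normalizing to counts per million allows pools with very different read depths to be plotted on the same axis, whereas the raw counts are retained for the statistical tests.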

Contingency tables containing the motif sequence counts compared to sequences

without motifs are shown in Table 6. The motif counts were not normalized for statistical

analysis. Fisher’s Exact Tests were conducted on 2x2 contingency tables containing the

sequence count to determine if there is enrichment between the comparison groups. Table

6 shows that comparison of all selection rounds to the no selection groups reveals a

statistically significant difference indicating there was demonstrable enrichment for these

motifs across the SELEX process.
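
For readers who wish to reproduce this type of comparison, a two-sided Fisher’s Exact Test on a 2x2 contingency table can be computed directly from the hypergeometric distribution. The sketch below uses only the Python standard library and is not the tool used in this study; the function name and argument layout are illustrative. The table is laid out as [[a, b], [c, d]], where a and b are the reads with and without the motif in one pool and c and d are the corresponding counts in the comparison pool.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's Exact Test p-value for the table [[a, b], [c, d]].

    The p-value is the total probability of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x motif reads falling in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    p_obs = prob(a)
    # A small relative tolerance guards against floating-point ties.
    p_value = sum(p for p in probs if p <= p_obs * (1 + 1e-7))
    return min(p_value, 1.0)
```

This direct enumeration is adequate for small illustrative tables; for pools containing hundreds of thousands of reads, a dedicated statistics package would be considerably faster.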

Additionally, when comparing the motif counts in SELEX round 10 to the other

selection rounds for both BFTs, there is a statistically significant difference in

the motif counts relative to earlier rounds, showing that enrichment of these

motifs is still occurring after 10 rounds of selection. Interestingly, there is no

statistically significant difference when comparing the 7th and 9th selection rounds of BFT-1.

This implies that there is little change in the assessed motif populations during these rounds. Overall, the statistical tests comparing the change in motifs contained in the highest copy number aptamer sequences show that the 10th selection round of both toxins

is significantly different from the earlier pools which implies that there is still selection

for toxin binding aptamers occurring at this stage. This implies that the number of

selection cycles used in the SELEX process could have been increased to further enrich

the aptamer pools containing potential binding sequences.

Figure 19 shows the predicted secondary structure for the eleven candidate

aptamer sequences. Evaluation of the overlap between the highlighted random regions

and the constant primer regions shows that the predicted secondary structures of many

aptamer sequences include portions of the constant primer regions. Figure 19 also shows

aptamer sequences that have identical secondary structures. There were two cases

where a pair of distinct aptamer sequences had identical predicted secondary structures (10-6 and

9-6; 33-3 and 24-2). There was one case where three different aptamer sequences (16-4,

2-137, 23-2) are predicted to share an identical secondary structure. From this data we can

see that some of the high copy number sequences from the BFT-2 aptamer pool share identical predicted secondary structures. BFT-1 aptamers are also predicted to have highly similar structures. For example, aptamers 16-2 and 28-2 have minor differences in their predicted smaller stem-loop structure, which can be attributed to the differences in the random regions of these sequences. The BFT-1 aptamers 26-2 and 3-10 also have very similar predicted structures, with a variation of a single base in the minor stem loop. These data support the notion that the aptamer sequences present within the final selection pool of both BFTs may have been selected for specificity of binding to the toxins, since they share similar secondary structures.

Discussion

The use of a NanoDrop to measure the DNA concentrations of the indexed pools

most likely resulted in inaccuracies in the reported concentrations. A better estimate of

the DNA concentrations of the pools could have been obtained with a PicoGreen-based protocol

(ThermoFisher, Cat No P11496). Ideally, the indexed aptamer pools would have yielded

~77,000 quality sequencing reads per sub-library and generated ~700,000 total reads.

Additionally, it was expected that this level of sequencing depth would be adequate to

reveal sequences with high numbers of repeats. The read outputs from sequencing were

far from the expected percentages. Very few sequencing pools were within 5% of the

expected read output. The BFT SELEX round 10 pools were expected to account for the majority of

the reads recovered through sequencing, at 52% of all the sequencing reads

(520,000 of 1,000,000 total reads). Instead, the BFT-1 SELEX round 10 pool

made up ~350,000 reads and the BFT-2 SELEX round 10 pool made up ~258,000 reads,

which together composed ~69% of all sequencing reads (608,000 of 881,000 total reads).

Overall, the sequencing read outputs were far from expected, and the resulting lack of sequencing depth in the earlier aptamer pools made it impossible to track aptamer pool evolution.

While the lack of sequencing depth prevented tracking of aptamer pool evolution

through sequencing analysis of different selection rounds, the data did provide useful

insight regarding the potential binding candidate aptamers within the pools. Of note is the ability to track the potential evolution of a single aptamer sequence across three rounds of selection into the final pool.

The change in two bases resulted in a significant shift in the predicted structure of

the variable region of the aptamer. This observation highlights the limitations of

relying completely on dendrogram-based analysis to predict similar binding sequences.

Utilizing these graphs, it is possible to visualize any overlapping similarity between aptamer sequences generated in the separate BFT-1 and BFT-2 SELEX processes. It can be hypothesized that aptamer sequences that are similar between both toxin SELEX experiments might possibly bind to both BFTs, whereas sequences that are unique to each toxin might bind only to their respective toxins. The dendrograms generated for the aptamer sequences allow for a method to visualize closely related aptamer sequences in order to narrow down the total number of candidate aptamers for further workup.
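
As a simple complement to the dendrograms, closely related candidates can also be listed by computing pairwise percent identity over an existing multiple sequence alignment. The sketch below is a swapped-in illustration and does not reproduce the Clustal Omega tree-building method; it assumes the aligned sequences are supplied as a dictionary of equal-length strings with `-` marking gaps, and the function names and the 90% threshold are hypothetical.

```python
from itertools import combinations


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def similar_pairs(aligned, threshold=90.0):
    """Return (name_a, name_b, identity) for pairs at or above the threshold."""
    pairs = []
    for (name_a, seq_a), (name_b, seq_b) in combinations(aligned.items(), 2):
        identity = percent_identity(seq_a, seq_b)
        if identity >= threshold:
            pairs.append((name_a, name_b, identity))
    return pairs
```

As noted above for the two-base change, high sequence identity does not guarantee a shared predicted structure, so a list of this kind is best treated as a starting point for structure prediction rather than a substitute for it.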

Statistical tests analyzing motif counts identified in the most highly enriched sequences from the last selection round showed that the 10th round contained a significantly different motif count than the earlier selection rounds. This indicates that increasing the total number of selection rounds may have resulted in further enrichment of the aptamer pools. The sequences analyzed provide a large number of aptamer candidates that can be assessed for their ability to selectively bind and possibly inhibit one or both subtypes of the toxin.

Chapter 6 Figures

Aptamer Pools were not equally loaded when preparing sequencing samples

Table 1: A. Output from demultiplexing the Mixed Index DNA Pool sequencing reads. B. The reverse complemented data are also displayed. C. The table shows that the aptamer index pools were not equally loaded: variation is seen in all pools, and BFT-1 Round 4 and BFT-2 Round 7 have no reads. Approximately 5% of reads were not recovered, possibly due to misalignment and removal of the index. The expected read percentages of an evenly loaded pool are shown and compared to the total percent reads yielded from sequencing. There is marked deviation from the expected percent yields, showing that the mixed index aptamer pool was not evenly loaded.

Graphical Visualization of Multiplexed Sequence Reads

Figure 14: PCR pools were indexed on both strands to maximize the number of recovered reads from the sequencing run. Paired-end sequencing does not distinguish which strand is the preferred strand to sequence, so the sequencing pools also contain reads of the reverse complement strand.

Sequencing reads of later pools identify potentially enriched aptamer sequences

Sequence Enrichment Copy Number

                  BFT-1 SELEX Round               BFT-2 SELEX Round
Copy Number       10      9       7       Copy Number       10      9       4
>10               3       0       0       >=10              7       0       0
>=5               3       0       0       >=5               5       0       0
>=3               6       2       0       >=3               24      0       2
>=2               38      6       2       >=2               14      3       10
>=1               NA      17      23      >=1               NA      22      13

Sequences Containing Forward or Reverse Constant Regions

                  BFT-1 SELEX Round               BFT-2 SELEX Round
                  10      9       7                         10      9       4
Count             0       1       0       Count             16      0       0

Sequences with Less than 15 Bases

                  BFT-1 SELEX Round               BFT-2 SELEX Round
                  10      9       7                         10      9       4
Count             9       5       1       Count             11      2       3

Table 2: Candidate aptamers were determined based on their enrichment across SELEX cycles and in the final aptamer pool. There was a more enriched profile seen in BFT-2, with seven sequences having more than 10 copies in the final pool compared to 3 in BFT-1. BFT-2 sequences also contained a high number of sequences with repeated primer regions. Reads containing primer regions or fewer than 20 bases in the random sequence were discarded.

Identical Aptamer Sequences present in BFT-1 and BFT-2 Selex Pool 10 highlighted in green

Table 3: Searching the highest copy number sequences in BFT-2 SELEX pool 10 revealed the presence of identical sequences in BFT-1 SELEX pool 10, shown highlighted in green. The sequences were more highly enriched in the BFT-2 pool, with 5 and 3 reads, respectively, compared to 1 read for each sequence in the BFT-1 pool.

Analysis of high copy number sequences in BFT-1 SELEX Aptamer pools

Table 4: The most highly represented sequences for BFT-1 revealed certain features present within the aptamer pools. Sequence 7-2-2 from round seven is very similar to sequence 10-22-2 in the round 10 pool. Sequences with similarly conserved features are highlighted in the same color (Blue, Green, Gray, or Yellow), with the exact matches between the two sequences denoted in red. The percent population was determined by dividing the number of aptamer repeats by the total number of reads from sequencing. This percentage was used to generate the normalized value of calculated expected reads per million sequences.

Prediction of Evolutionary Change in SELEX Sequences in BFT-1 Pool

Figure 15: Structure prediction of the highly similar sequences in BFT-1 Pool 7 and Pool 10 may track the evolution of a sequence through increases in secondary structure across SELEX. A change in nucleotides 27 and 28 (boxed in red) in SELEX 10 introduces a larger stem-loop structure.

Analysis of high copy number sequences in BFT-2 SELEX Aptamer pools

Table 5: The most highly represented sequences for BFT-2 revealed certain features present within the aptamer pools. One sequence, which was detected through all available rounds of SELEX, is highlighted in green. Additionally, a cluster of sequences with significant homology to each other is shown in yellow. The percent population was determined by dividing the number of aptamer repeats by the total number of reads from sequencing. This percentage was used to generate the normalized value of calculated expected reads per million sequences.

Motif Analysis of Aptamer Pools identifies enrichment of random-region sequence motifs

Figure 16: MEME motif searching algorithms revealed a set of motifs, present in at least three sequences, that were conserved between both BFT-1 and BFT-2. Bit values describe the proportion of bases that occupy a given region in a motif. A bit value of 2 represents a base that is found in all of the motif repeats identified. The sizes of the bases in the figures represent the relative proportion of bases that occupy that given position, with larger base images corresponding to higher frequencies of detection in a given motif position.

Clustal Omega Alignment of high-copy number sequences shows high diversity in Selex Pool 10 for BFT-1 and BFT-2

Figure 17: Clustal Omega sequence alignment was used to generate phylogeny trees of BFT toxin aptamer pool 10 for the high copy number sequences with two or more reads. A combined phylogeny tree was also generated to identify potential aptamer sequences that can interact with both toxins. BFT-2 Pool seems to have a greater amount of diversity with more branching of sequences. Many of the sequences in both pools seem to be closely related.

SELEX round motif enrichment tracks relative changes in Aptamer Pools

Figure 18: Motif tracking normalized to motif counts per million sequence reads across selection rounds of BFT-1 and BFT-2. Motifs 8-15 bases in length were identified in the most highly enriched aptamer sequences and were used to determine statistically significant difference in enrichment across aptamer pools. The motif sequences are listed below the plots and wobble sites are enclosed in brackets.

Fisher’s Exact Test statistics of motif enrichment in SELEX rounds

Table 6: Contingency tables of sequences that contain the 8-15 base motifs identified from the highest copy number aptamer sequences. The comparison groups of the 2x2 contingency tables, containing the raw counts of reads with motifs and the total reads not containing motifs, are displayed. Fisher’s Exact Test p-values and statistical significance are shown below the raw data. There was no statistically significant difference in motif enrichment for BFT-1 SELEX round 7 when compared to SELEX round 9.

Secondary Structure Analysis of High Copy Number Sequences

Figure 19: Structure prediction for eleven candidate aptamer sequences for both BFT-1 and BFT-2, with the random sequence regions highlighted in yellow. A. BFT-1 sequences 16-2 and 28-2 share a similar predicted structure. Aptamers 26-2 and 3-10 contain the same stem-loop at the 3’ end. B. BFT-2 aptamers contain a large amount of structural overlap, with 7 of the 9 aptamers sharing identical secondary structures. Aptamer sequence identifiers with identical secondary structures are boxed in red.

SUMMARY

Using a modified magnetic bead SELEX protocol, I enriched for ssDNA aptamers that could bind to Enterotoxigenic Bacteroides fragilis toxin subtypes BFT-1 and BFT-2.

A SYBR Green-based Real-Time PCR assay was designed to monitor changes in specificity of aptamer pools to toxin across selection rounds. Aptamer pools were barcoded and sequenced using MiSeq nano sequencing. Resulting sequencing reads were analyzed using publicly available webservers to generate potential toxin binding aptamer candidates. The evolution of an aptamer structure from the early to late rounds of selection was described, in which a two-nucleotide sequence changed from C-C to G-G. Clustal Omega multiple sequence alignment and dendrogram analysis were used to inform the candidate selection process. Eleven structurally unique aptamers, comprising a collection of fourteen aptamer sequences, were described for further workup. This work has shown that aptamer selection combined with next generation sequencing and data analysis using publicly available GUI-based webservers is a feasible approach that can be adopted by labs that do not have any previous experience in aptamer generation.

FUTURE WORK

The aptamer candidates described in this study need to be assessed for their

capacity to bind to BFT-1 or BFT-2. A SYBR Green-based binding assay will be adapted

for use with BFT. Following confirmation of specific binding to BFTs through the SYBR

Green binding assay, future experiments can be conducted to map the interaction coordinates between the nucleotides of the ssDNA aptamers and the amino acids of the BFT protein using X-ray crystallography.

There are a number of areas in which further work would benefit the candidate

selection process. The SYBR Green-based protein assay would need to be validated using

previously evaluated aptamers to determine adaptability to general use. The uneven

loading of aptamer sequences from earlier selection rounds prevented complete analysis

of the SELEX cycle for the BFTs. A Real-Time qPCR-based method of quantifying

the aptamer pools would have provided a more accurate estimate of concentration.

Although the use of the MiSeq Nano chipset resulted in adequate sequencing resolution for identifying potential aptamer candidates, the reliability of the data generated would have greatly benefited from deeper sequencing. This increased resolution would have allowed for more sequencing reads across each selection round.

Overall, the lack of sequencing depth from the earlier selection rounds was the greatest weakness in the experiments used to generate BFT-binding aptamer candidates.

Future experiments will determine the ability of these aptamers to be used to

fluorescently label the toxin for in vitro applications or for use in immunohistochemistry

staining of tissues in order to visualize the presence of BFTs within tumors and ETBF-infected tissues. Additionally, it will be possible to examine the ability of these aptamers to inhibit BFT, ultimately to see if they can be applied in preventing BFT-mediated vascular collapse or chronic colitis in murine models.

BIBLIOGRAPHY

1. Eckburg PB, Bik EM, Bernstein CN, et al. Diversity of the human intestinal microbial flora. Science

(New York, N.Y.). 2005;308(5728):1635.

2. Turnbaugh PJ, Ley RE, Hamady M, Fraser-Liggett C, Knight R, Gordon JI. The human microbiome project. Nature. 2007;449(7164):804.

3. Huttenhower C, Gevers D, Knight R, et al. Structure, function and diversity of the healthy human microbiome. Nature. 2012;486(7402):207.

4. Mahowald MA, Magrini V, Turnbaugh PJ, Gordon JI, Ley RE, Mardis ER. An obesity-associated gut microbiome with increased capacity for energy harvest. Nature. 2006;444(7122):1027-131. http://dx.doi.org/10.1038/nature05414. doi: 10.1038/nature05414.

5. Johnson E, Heaver S, Walters W, Ley R. Microbiome and metabolic disease: Revisiting the bacterial phylum bacteroidetes. J Mol Med. 2017;95(1):1-8. http://search.proquest.com/docview/1852948687. doi:

10.1007/s00109-016-1492-2.

6. Walters WA, Xu Z, Knight R. Meta‐analyses of human gut microbes associated with obesity and IBD.

FEBS Letters. 2014;588(22):4223-4233. http://onlinelibrary.wiley.com/doi/10.1016/j.febslet.2014.09.039/abstract. doi:

10.1016/j.febslet.2014.09.039.

7. Hannah M. Wexler. Bacteroides: The good, the bad, and the nitty-gritty. Clinical Microbiology Reviews.

2007;20(4):593-621. http://cmr.asm.org/content/20/4/593.abstract. doi: 10.1128/CMR.00008-07.

8. Ellie J. C. Goldstein. Anaerobic bacteremia. Clinical Infectious Diseases. 1996;23(Supplement 1):S101. http://www.jstor.org/stable/4459895. doi: 10.1093/clinids/23.Supplement_1.S97.

9. Maier E, Anderson RC, Roy NC. Understanding how commensal obligate anaerobic bacteria regulate immune functions in the large intestine. Nutrients. 2015;7(1):45-73. http://www.ncbi.nlm.nih.gov/pubmed/25545102. doi: 10.3390/nu7010045.

10. Meng Wu, Nathan P McNulty, Dmitry A Rodionov, et al. Genetic determinants of in vivo fitness and diet responsiveness in multiple human gut bacteroides. Science. 2015;350(6256):aac5992. http://www.ncbi.nlm.nih.gov/pubmed/26430127. doi: 10.1126/science.aac5992.

11. Cynthia L Sears, Abby L Geis, Franck Housseau. Bacteroides fragilis subverts mucosal biology: From symbiont to colon carcinogenesis. The Journal of clinical investigation. 2014;124(10):4166-4172.

12. Wick E, Sears C. Bacteroides spp. and diarrhea. Current Opinion in Infectious Diseases.

2010;23(5):470-474. http://www.ncbi.nlm.nih.gov/pubmed/20697287. doi:

10.1097/QCO.0b013e32833da1eb.

13. Boleij A, Hechenbleikner EM, Goodwin AC, et al. The bacteroides fragilis toxin gene is prevalent in the colon mucosa of colorectal cancer patients. Clinical Infectious Diseases. 2015;60(2):208-215. http://www.narcis.nl/publication/RecordID/oai:repository.ubn.ru.nl:2066%2F157007. doi:

10.1093/cid/ciu787.

14. Sears CL. The toxins of bacteroides fragilis. Toxicon: Official Journal Of The International Society On

Toxinology. 2001;39(11):1737.

15. Gyung-Tae Chung, Augusto A. Franco, Shaoguang Wu, et al. Identification of a third metalloprotease toxin gene in extraintestinal isolates of bacteroides fragilis. Infection and Immunity. 1999;67(9):4945-4949. http://iai.asm.org/content/67/9/4945.abstract.

16. Shaoguang Wu, Jai Shin, Guangming Zhang, Mitchell Cohen, Augusto Franco, Cynthia L. Sears. The bacteroides fragilis toxin binds to a specific intestinal epithelial cell receptor. Infection and Immunity.

2006;74(9):5382-5390. http://iai.asm.org/content/74/9/5382.abstract. doi: 10.1128/IAI.00060-06.

17. Wick EC, Rabizadeh S, Albesiano E, et al. Stat3 activation in murine colitis induced by enterotoxigenic bacteroides fragilis. Inflammatory Bowel Diseases. 2014;20(5):821-834. http://www.ncbi.nlm.nih.gov/pubmed/24704822. doi: 10.1097/MIB.0000000000000019.

18. Shaoguang Wu, Kuei-Cheng Lim, Julie Huang, Roxan F. Saidi, Cynthia L. Sears. Bacteroides fragilis enterotoxin cleaves the zonula adherens protein, E-cadherin. Proceedings of the National Academy of

Sciences of the United States of America. 1998;95(25):14979-14984. http://www.jstor.org/stable/46655. doi: 10.1073/pnas.95.25.14979.

19. Ki-Jong Rhee, Shaoguang Wu, XinQun Wu, et al. Induction of persistent colitis by a human commensal, enterotoxigenic bacteroides fragilis, in wild-type C57BL/6 mice. Infection and Immunity.

2009;77(4):1708-1718. http://iai.asm.org/content/77/4/1708.abstract. doi: 10.1128/IAI.00814-08.

20. Huso DL, Pardoll DM, Rhee K, et al. A human colonic commensal promotes colon tumorigenesis via activation of T helper type 17 T cell responses. Nature Medicine. 2009;15(9):1016-1022. http://dx.doi.org/10.1038/nm.2015. doi: 10.1038/nm.2015.

21. Herrou J, Choi VM, Bubeck Wardenburg J, Crosson S. Activation mechanism of the bacteroides fragilis cysteine peptidase, fragipain. Biochemistry. 2016;55(29):4077. http://www.ncbi.nlm.nih.gov/pubmed/27379832.

22. Vivian M Choi, Julien Herrou, Aaron L Hecht, et al. Activation of bacteroides fragilis toxin by a novel bacterial protease contributes to anaerobic sepsis in mice. Nature medicine. 2016;22(5):563-567. http://www.ncbi.nlm.nih.gov/pubmed/27089515. doi: 10.1038/nm.4077.

23. Elhenawy W, Debelyy MO, Feldman MF. Preferential packing of acidic glycosidases and proteases into bacteroides outer membrane vesicles. mBio. 2014;5(2):909. doi: 10.1128/mBio.00909-14.

24. Zakharzhevskaya NB, Tsvetkov VB, Vanyushkina AA, et al. Interaction of bacteroides fragilis toxin with outer membrane vesicles reveals new mechanism of its secretion and delivery. Frontiers In Cellular

And Infection Microbiology. 2017;7:2.

25. L L Myers, D S Shoop, L L Stackhouse, et al. Isolation of enterotoxigenic bacteroides fragilis from humans with diarrhea. Journal of Clinical Microbiology. 1987;25(12):2330-2333. http://jcm.asm.org/content/25/12/2330.abstract.

26. Cynthia L. Sears, Salequl Islam, Amit Saha, et al. Association of enterotoxigenic bacteroides fragilis infection with inflammatory diarrhea. Clinical Infectious Diseases. 2008;47(6):797-803. http://www.jstor.org/stable/40307753. doi: 10.1086/591130.

27. Basset C, Holton J, Bazeos A, Vaira D, Bloom S. Are helicobacter species and enterotoxigenic bacteroides fragilis involved in inflammatory bowel disease? Dig Dis Sci. 2004;49(9):1425-1432. http://www.ncbi.nlm.nih.gov/pubmed/15481314. doi: DDAS.0000042241.13489.88.

28. Ulger Toprak N, Yagci A, Gulluoglu BM, et al. A possible role of bacteroides fragilis enterotoxin in the aetiology of colorectal cancer. Clinical Microbiology & Infection. 2006;12(8):782-786. http://www.ingentaconnect.com/content/bsc/clm/2006/00000012/00000008/art00011. doi: 10.1111/j.1469-0691.2006.01494.x.

29. Katie S Viljoen, Amirtha Dakshinamurthy, Paul Goldberg, Jonathan M Blackburn. Quantitative profiling of colorectal cancer-associated bacteria reveals associations between fusobacterium spp., enterotoxigenic bacteroides fragilis (ETBF) and clinicopathological features of colorectal cancer. PLoS

One. 2015;10(3). http://search.proquest.com/docview/1661596268. doi: 10.1371/journal.pone.0119462.

30. Zhou Y, He H, Xu H, et al. Association of oncogenic bacteria with colorectal cancer in south china.

Oncotarget. 2016. doi: 10.18632/oncotarget.13094.

31. Zhi-Dong Jiang, Herbert L. DuPont, Eric L. Brown, et al. Microbial etiology of travelers' diarrhea in mexico, guatemala, and india: Importance of enterotoxigenic bacteroides fragilis and arcobacter species.

Journal of Clinical Microbiology. 2010;48(4):1417-1419. http://jcm.asm.org/content/48/4/1417.abstract. doi: 10.1128/JCM.01709-09.

32. Ramamurthy D, Pazhani GP, Sarkar A, et al. Case-control study on the role of enterotoxigenic bacteroides fragilis as a cause of diarrhea among children in kolkata, india. PloS one. 2013;8(4):e60622. http://www.ncbi.nlm.nih.gov/pubmed/23577134. doi: 10.1371/journal.pone.0060622.

33. V. R. C. Merino, V. Nakano, C. Liu, Y. Song, S. M. Finegold, M. J. Avila-Campos. Quantitative detection of enterotoxigenic bacteroides fragilis subtypes isolated from children with and without diarrhea.

Journal of Clinical Microbiology. 2011;49(1):416-418. http://jcm.asm.org/content/49/1/416.abstract. doi:

10.1128/JCM.01556-10.

34. Vu Nguyen T, Le Van P, Le Huy C, Weintraub A. Diarrhea caused by enterotoxigenic bacteroides fragilis in children less than 5 years of age in hanoi, vietnam. Anaerobe. 2005;11(1):109-114. http://www.sciencedirect.com/science/article/pii/S1075996404000897. doi:

10.1016/j.anaerobe.2004.10.004.

35. Akpınar M, Aktaş E, Cömert F, Külah C, Sümbüloğlu V. Evaluation of the prevalence of enterotoxigenic bacteroides fragilis and the distribution bft gene subtypes in patients with diarrhea.

Anaerobe. 2010;16(5):505-509. http://www.sciencedirect.com/science/article/pii/S1075996410001319. doi:

10.1016/j.anaerobe.2010.08.002.

36. Rice AL, Sacco L, Hyder A, Black RE. Malnutrition as an underlying cause of childhood deaths associated with infectious diseases in developing countries. Bulletin of the World Health Organization.

2000;78(10):1207. http://search.proquest.com/docview/229545678.

37. Selection in vitro of an RNA enzyme that specifically cleaves single-stranded DNA. .

38. C Tuerk, L Gold. Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase. Science. 1990;249(4968):505-510. http://www.sciencemag.org/content/249/4968/505.abstract. doi: 10.1126/science.2200121.

39. Szostak JW, Ellington AD. In vitro selection of RNA molecules that bind specific ligands. Nature.

1990;346(6287):818-822. http://www.ncbi.nlm.nih.gov/pubmed/1697402. doi: 10.1038/346818a0.

40. Walsh R, DeRosa MC. Retention of function in the DNA homolog of the RNA dopamine aptamer.

Biochemical and Biophysical Research Communications. 2009;388(4):732-735. http://www.sciencedirect.com/science/article/pii/S0006291X09016593. doi: 10.1016/j.bbrc.2009.08.084.

41. C T Lauhon, J W Szostak. RNA aptamers that bind flavin and nicotinamide redox cofactors. Journal of the American Chemical Society. 1995;117(4):1246-1257. http://www.ncbi.nlm.nih.gov/pubmed/11539282. doi: 10.1021/ja00109a008.

42. Hasegawa H, Savory N, Abe K, Ikebukuro K. Methods for improving aptamer binding affinity.

Molecules (Basel, Switzerland). 2016;21(4):421. http://www.ncbi.nlm.nih.gov/pubmed/27043498. doi:

10.3390/molecules21040421.

43. Kim YS, Gu MB. Advances in aptamer screening and small molecule aptasensors. Advances in biochemical engineering/biotechnology. 2014;140:29. http://www.ncbi.nlm.nih.gov/pubmed/23851587.

44. Diafa S, Hollenstein M. Generation of aptamers with an expanded chemical repertoire. Molecules.

2015. http://boris.unibe.ch/72412/. doi: 10.3390/molecules200916643.

45. Kimoto M, Yamashige R, Matsunaga K, Yokoyama S, Hirao I. Generation of high-affinity DNA aptamers using an expanded genetic alphabet. Nature biotechnology. 2013;31(5):453. http://www.ncbi.nlm.nih.gov/pubmed/23563318. doi: 10.1038/nbt.2556.

46. Yu Y, Liang C, Lv Q, et al. Molecular selection, modification and development of therapeutic oligonucleotide aptamers. International journal of molecular sciences. 2016;17(3):358. http://www.ncbi.nlm.nih.gov/pubmed/26978355. doi: 10.3390/ijms17030358.

47. Nimjee SM, White RR, Becker RC, Sullenger BA. Aptamers as therapeutics. Annual review of pharmacology and toxicology. 2017;57(1):61-79. doi: 10.1146/annurev-pharmtox-010716-104558.

48. Chen YJ, Tsai CY, Hu WP, Chang LS. DNA aptamers against taiwan banded krait α-bungarotoxin recognize taiwan cobra cardiotoxins. Toxins. 2016;8(3):66. http://search.proquest.com/docview/1791594102.

49. Vivekananda J, Salgado C, Millenbaugh NJ. DNA aptamers as a novel approach to neutralize staphylococcus aureus α-toxin. Biochemical and biophysical research communications. 2014;444(3):433-438. http://www.ncbi.nlm.nih.gov/pubmed/24472539. doi: 10.1016/j.bbrc.2014.01.076.

50. Mootien S, Kaplan PM. Monoclonal antibodies specific for bacteroides fragilis enterotoxins BFT1 and BFT2 and their use in immunoassays. PLoS One. 2017;12(3). http://search.proquest.com/docview/1874149972. doi: 10.1371/journal.pone.0173128.

51. Wu S, Dreyfus LA, Tzianabos AO, Hayashi C, Sears CL. Diversity of the metalloprotease toxin produced by enterotoxigenic bacteroides fragilis. Infection and Immunity. 2002;70(5):2463-2471. http://iai.asm.org/content/70/5/2463.abstract. doi: 10.1128/IAI.70.5.2463-2471.2002.

52. Kharlampieva DD, Manuvera VA, Podgorny OV, et al. Purification and characterisation of recombinant bacteroides fragilis toxin-2. Biochimie. 2013;95(11):2123. http://www.ncbi.nlm.nih.gov/pubmed/23954621. doi: 10.1016/j.biochi.2013.08.005.

53. Mundy LM, Sears CL. Detection of toxin production by bacteroides fragilis: Assay development and screening of extraintestinal clinical isolates. Clinical Infectious Diseases: An Official Publication Of The Infectious Diseases Society Of America. 1996;23(2):269.

54. Weikel CS, Grieco FD, Reuben J, Myers LL, Sack RB. Human colonic epithelial cells, HT29/C1, treated with crude bacteroides fragilis enterotoxin dramatically alter their morphology. Infection and Immunity. 1992;60(2):321-327. http://iai.asm.org/content/60/2/321.abstract.

55. Wu S, Morin PJ, Maouyo D, Sears CL. Bacteroides fragilis enterotoxin induces c-myc expression and cellular proliferation. Gastroenterology. 2003;124(2):392-400. http://www.sciencedirect.com/science/article/pii/S0016508502158987. doi: 10.1053/gast.2003.50047.

56. Brown Jr. T, Brown T. Solid-phase oligonucleotide synthesis. In: Nucleic acids book. http://www.atdbio.com/nucleic-acids-book: ATDBio Ltd.

57. Penner G, Paul N. Validation of random library for aptamer selection. https://www.trilinkbiotech.com/tech/aptamer.asp Web site.

58. Mayer G, Höver T. In vitro selection of ssDNA aptamers using biotinylated target proteins. Methods in molecular biology (Clifton, N.J.). 2009;535:19. http://www.ncbi.nlm.nih.gov/pubmed/19377986.

59. Tolle F, Mayer G. Preparation of SELEX samples for next-generation sequencing. Methods in molecular biology (Clifton, N.J.). 2016;1380:77. http://www.ncbi.nlm.nih.gov/pubmed/26552817.

60. Afgan E, Baker D, van den Beek M, et al. The galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2016 update. Nucleic acids research. 2016;44(W1):W10. http://www.ncbi.nlm.nih.gov/pubmed/27137889. doi: 10.1093/nar/gkw343.

61. Andrews S. FastQC: A quality control tool for high throughput sequence data. http://www.bioinformatics.babraham.ac.uk/projects/fastqc/.

62. Zhang J, Kobert K, Flouri T, Stamatakis A. PEAR: A fast and accurate illumina paired-end reAd mergeR. Bioinformatics (Oxford, England). 2014;30(5):614-620. http://www.ncbi.nlm.nih.gov/pubmed/24142950. doi: 10.1093/bioinformatics/btt593.

63. Bailey TL, Boden M, Buske FA, et al. MEME suite: Tools for motif discovery and searching. Nucleic Acids Research. 2009;37(suppl_2):W208. http://search.proquest.com/docview/200617824. doi: 10.1093/nar/gkp335.

64. Sievers F, Higgins DG. Clustal omega, accurate alignment of very large numbers of sequences. Methods in molecular biology (Clifton, N.J.). 2014;1079:105. http://www.ncbi.nlm.nih.gov/pubmed/24170397.

65. Bellaousov S, Reuter JS, Seetin MG, Mathews DH. RNAstructure: Web servers for RNA secondary structure prediction and analysis. Nucleic Acids Research. 2013;41(W1):W474. doi: 10.1093/nar/gkt290.

66. Bernhart SH, Hofacker IL, Will S, Gruber AR, Stadler PF. RNAalifold: Improved consensus structure prediction for RNA alignments. BMC bioinformatics. 2008;9(1):474. http://www.ncbi.nlm.nih.gov/pubmed/19014431. doi: 10.1186/1471-2105-9-474.

67. Zuker M. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic acids research. 2003;31(13):3406-3415. http://www.ncbi.nlm.nih.gov/pubmed/12824337. doi: 10.1093/nar/gkg595.

CURRICULUM VITAE

Payam Fathi

August 1990. Oxford, Mississippi

Education

Degree Master of Science

Major Molecular Microbiology and Immunology

Expected May 2017

Institution Bloomberg School of Public Health

W. Harry Feinstone Department of Molecular Microbiology and Immunology,

Johns Hopkins University

Degree Bachelor of Science

Major Bioengineering

Awarded May 2012

Institution A. James Clark School of Engineering

Fischell Department of Bioengineering, University of Maryland, College Park

Degree Bachelor of Science

Major Biochemistry

Awarded December 2012

Institution College of Computer, Mathematical, and Natural Sciences

Department of Chemistry and Biochemistry, University of Maryland, College Park

Research Experience

Position Graduate Research Assistant- Sears Laboratory

Location Johns Hopkins University Bloomberg School of Public Health

Dates September 2015- Present

Position Research Technologist- Sears Laboratory

Location Division of Infectious Disease, Johns Hopkins University School of Medicine

Bloomberg-Kimmel Institute for Cancer Immunotherapy, Johns Hopkins

Dates July 2013- Present

Position Research Assistant

Location Biophysics of Cellular Movement Laboratory, University of Maryland, College Park

Dates May 2011- March 2013

Position Research Assistant

Location Molecular Mechanics and Self-Assembly Laboratory, University of Maryland, College Park

Dates May 2010- August 2010

Position Research Assistant/Technical Writer

Location Human Performance Laboratory, University of Maryland, College Park

Dates January 2010 – May 2011

Publications

Prindeze NJ, Fathi P, Mino MJ, Mauskar NA, Travis TE, Paul DW, Moffatt LT, Shupp JW. Examination of the Early Diagnostic Applicability of Active Dynamic Thermography for Burn Wound Depth Assessment and Concept Analysis. Journal of Burn Care and Research, November 2014.

Fathi P, Vranis NM, Paryavi E. Diagnosis and Treatment of Upper Extremity Mucormycosis Infections. Journal of Hand Surgery, May 2015.

Fathi P, Wu S. Isolation, detection, and characterization of Enterotoxigenic Bacteroides fragilis in clinical samples. Open Microbiology Journal, April 2016.

Orberg ET, Fan H, Tam A, Dejea CM, Destefano Shields CE, Wu S, Chung L, Finard BB, Wu X, Fathi P, Ganguly S, Fu J, Pardoll DM, Sears CL, Housseau F. The Myeloid Immune Signature of Enterotoxigenic Bacteroides fragilis Induced Murine Colon Tumorigenesis. Mucosal Immunology, June 2016.

Drewes JL, White JR, Dejea CM, Fathi P, Iyadori T, Vadivelu J, Roslani AC, Wick E, Mongodin EF, Loke MF, Thulasi K, Gan HM, Goh KL, Chong HY, Kumar S, Wanyiri JW, Sears CL. Meta-analysis of high-resolution 16S gene-based profiling and biofilm status reveals consortia associated with colorectal cancer. March 2017, Submitted.

Dejea CM, Fathi P, Craig JM, Boleij AM, Taddese R, Geis AL, Wu X, Destefano Shields CE, Heichenbleikner L, Huso DL, Giardiello FM, Kinzler K, Vogelstein B, Wick EC, Pardoll DM, Sears CL. Early stage of Familial Adenomatous Polyposis associates with biofilms comprised of carcinogenic Escherichia coli and Bacteroides fragilis. May 2017, Submitted.

Manuscripts in Preparation

Fathi P, Dejea CM, Allen J, Boleij AM, Heichenbleikner L, Romans K, Wick EL, Housseau F, Pardoll DM, Sears CL. Differential Pathogenicity of Colibactin Producing E. coli Isolates.

Boleij AM, Wu S, Fathi P, Wu X, Dalton WB, Milligan G, Zabransky D, Park B, Sears CL. Intestinal Epithelial GPR35 reacts to Bacteroides fragilis toxin and acts as a regulator for gut infection by Enterotoxigenic Bacteroides fragilis.

Orberg ET, Geis AL, Chung L, Finard BB, Dejea CM, Chan JL, Chen A, Destefano Shields CE, Wu S, Tam A, Fathi P, McAllister F, Metz P, Fan H, Wu X, Ganguli S, Van Meerbeke S, Huso DL, Pardoll DM, Sears CL, Housseau F. IL-17 Orchestrates the cooperation between Colonic Epithelial Cells and Myeloid Cells to promote oncogenic properties of Bacteroides fragilis enterotoxin.
