MITOCHONDRIAL CYTOCHROME OXIDASE I SEQUENCE POLYMORPHISMS REVEAL POPULATION GENETIC DIVERSITY OF WUCHERERIA BANCROFTI IN PAPUA NEW GUINEA

By

AKSHAYA RAMESH

Submitted in partial fulfillment of the requirements for the degree of Master of Science

Thesis Advisor: Dr. Peter A Zimmerman

Department of Biology CASE WESTERN RESERVE UNIVERSITY

August, 2012

CASE WESTERN RESERVE UNIVERSITY SCHOOL OF GRADUATE STUDIES

We hereby approve the thesis/dissertation of

Akshaya Ramesh

candidate for the MS degree*.

(signed) Dr. Roy Ritzmann (Chair of the committee)

Dr. Peter Zimmerman Dr. Daniel Tisch Dr. Michael Benard

(date) 5/22/2012

*We also certify that written approval has been obtained for any proprietary material contained therein.


Dedicated to my beloved Father and Grandmother


Table of Contents

Chapter 1: Lymphatic Filariasis: Global burden and Epidemiology
1.0. Introduction
1.1. Global burden
1.1.2. Lymphatic filariasis in Papua New Guinea
1.2. Global Alliance to eliminate lymphatic filariasis (GAELF/GPELF)
1.3. Life cycle of the filarial nematode: Wuchereria bancrofti
1.4. Vectors of W. bancrofti in Papua New Guinea
1.5. Genetics of W. bancrofti populations
1.6. Objectives

Chapter 2: The complete mitochondrial genome sequence of the filarial nematode, W. bancrofti
2.0. Introduction
2.1. Methods
2.1.1. Genomic DNA extraction and amplification
2.1.2. W. bancrofti mitochondrial genome amplification strategy
2.1.3. Agarose gel electrophoresis and gel extraction
2.1.4. Sequence assembly and gene annotation
2.2. Results and Discussion
2.2.1. General features of the mitochondrial genome of W. bancrofti
2.2.2. Protein coding genes
2.2.3. Codon Usage and amino acid composition
2.2.4. Ribosomal RNA genes
2.2.5. Transfer RNA genes
2.2.6. Non coding regions
2.3. Conclusions and future directions

Chapter 3: Genetic diversity of Wuchereria bancrofti in Dreikikir district, East Sepik province, Papua New Guinea
3.0. Introduction
3.1. Methods
3.1.1. Study sites and sample selection
3.1.2. Genomic DNA extraction and amplification
3.1.3. PCR amplification of the Cytochrome oxidase1 (cox1) gene and visualization of the products
3.1.4. Sequencing of the cox1 gene
3.1.5. Sequence analysis: Alignment of sequences and Sequence editing
3.1.6. Genetic heterogeneity of W. bancrofti
3.1.7. Test for neutrality
3.1.8. Population structure of W. bancrofti
3.2. Results
3.2.1. Genetic heterogeneity of W. bancrofti
3.2.2. Test for compliance to the neutral model of evolution
3.2.3. Genetic structure of W. bancrofti populations
3.2.4. Isolation by distance
3.3. Discussion
3.3.1. Genetic heterogeneity of W. bancrofti populations
3.3.2. Population structure
3.3.3. What mechanisms are responsible for structuring populations of W. bancrofti in PNG?
3.4. Conclusions and future directions

References

List of Tables

Table 1: Fifteen primer sets and COX1 primers used in the present study to amplify the complete mt genome and study the genetic heterogeneity of W. bancrofti.
Table 2: Summary statistics for 14 individuals from 6 villages with the observed number of haplotypes, haplotype diversity and test for neutrality
Table 3: Matrix with the pairwise Fst values for the six villages
Table 4: Fst values for the individuals from five villages
Table 5: AMOVA statistic generated using Arlequin 3.0 and DnaSP 5.0 across the villages (n=6)

List of Figures

Figure 1: Linear representation of the complete mitochondrial genome of W. bancrofti from a Papua New Guinean isolate
Figure 2: Map of the study sites, adapted from Bockarie et al., 1998
Figure 3: Multidimensional Scaling plot for the six villages in the ESP using pairwise Fst
Figure 4: Haplotype network for W. bancrofti hosts (n=14) across six villages in the ESP
Figure 5: Haplotype network for W. bancrofti hosts Peneng
Figure 6: Haplotype network for W. bancrofti hosts Albulum1
Figure 7: Haplotype network for W. bancrofti hosts Albulum2
Figure 8: Haplotype network for W. bancrofti hosts Yautong1
Figure 9: Haplotype network for W. bancrofti hosts Yautong2
Figure 10: Haplotype network for W. bancrofti hosts Moihuak
Figure 11: Histogram of genetic distances among hosts in the 6 villages
Figure 12: Histogram of genetic distances among hosts in the 6 villages
Figure 13: Comparison of Fst between the Peneng and Moihuak with respect to the other study sites
Figure 14: IBD plot for the six villages in ESP, PNG

Acknowledgements

It is a pleasure to thank the many people who made this thesis possible.

I consider it a great privilege to have been given the opportunity to work at the Center for Global Health and Diseases at Case Western Reserve University. I would like to sincerely thank Dr. Peter Zimmerman, my advisor and mentor, for his crucial guidance in this project, constant support and encouragement. I would also like to thank Dr. Daniel Tisch for his help with the analysis of my data and motivation throughout my thesis writing period. I would like to take this opportunity to thank Dr. Michael Benard for introducing me to population genetics and lending me his expert views and precious time.

I am extremely grateful to Dr. Scott Small, a post-doctoral fellow at the Zimmerman lab, for constantly guiding me through the data; his untiring help, constant advice and support have substantially shaped the findings of my thesis. I unreservedly acknowledge with gratitude all the help and support provided by all the members of the Zimmerman lab. I deeply value the association with Dr. Rajeev Mehlotra, Krufinta Bun, Tenisha Phipps, Cara Halldin, Kyle Logue, Chad Schaber, Bangan John, Barnie Willie and Melinda Zikursh, who have provided great help, caring and support besides practical advice in completion of this project. I would also like to thank Zachary Kloos, a great friend and fellow researcher, for introducing me to Wuchereria bancrofti and helping me through my project.

I wish to thank my best friend from high school (Soundarya Rangaraj), best friends as an undergraduate (Lakshmi Priya and Amrutha Pattamatta) and best friend in graduate school (Kirsten Eichelman) for their emotional support, entertainment and help through difficult times.

I owe my gratitude to the Department of Biology at Case Western Reserve University and other members of the Center for Global Health and Diseases for extending their continuous support and guidance throughout my project.

Last, and most importantly, I wish to thank my parents Sri Vidya Ramesh and Ramesh Veeraraghavan, my grandmother, Chandra Renganathan, my Aunt and Uncle for their never-ending love, care and support. To them I dedicate this thesis.

List of Abbreviations

ATP Annual Transmission Potential
atp6 ATP synthase subunit 6
cob cytochrome b
cox1-3 cytochrome c oxidase subunits 1-3
DALY disability-adjusted life year
DEC Diethylcarbamazine
ESP East Sepik province
GAELF/GPELF Global Alliance to eliminate lymphatic filariasis
gDNA Genomic DNA
LD Linkage Disequilibrium
LDR-FMA Ligase detection reaction-fluorescent microsphere assay
LF Lymphatic filariasis
MDA Mass drug administration
MFI Median Fluorescence Intensity
mt Mitochondrial
nad1-6 NADH dehydrogenase subunits 1-6
nad4L NADH dehydrogenase subunit 4L
nt nucleotide
PacELF Pacific Program to Eliminate Lymphatic Filariasis
PNG Papua New Guinea
rrn ribosomal RNA
trn transfer RNA
WHO World Health Organization

Mitochondrial cytochrome oxidase I (COXI) sequence polymorphisms reveal population genetic diversity of Wuchereria bancrofti in Papua New Guinea

Abstract

by

AKSHAYA RAMESH

Wuchereria bancrofti is the primary causative agent of lymphatic filariasis, estimated to affect 120 million people in 80 countries. Several chemotherapeutic programs to eliminate this parasite have been introduced, which are likely to result in changes in the genetic structure of W. bancrofti populations. Despite constituting a major public health burden, this parasite remains poorly understood with respect to its mitochondrial sequence and population biology. To address this knowledge gap, the complete mitochondrial genome of W. bancrofti was sequenced, following which a portion of the cytochrome oxidase 1 gene was amplified from individuals in the East Sepik Province of Papua New Guinea. The present study suggests that W. bancrofti populations are highly heterogeneous with a moderate genetic structure across the East Sepik Province. This study has facilitated exploration into W. bancrofti diversity and provides insights into patterns of transmission, an essential component of public health interventions aimed at eliminating lymphatic filariasis.

Chapter 1

Lymphatic Filariasis: Global burden and Epidemiology

1.0. Introduction

1.1. Global burden

Lymphatic filariasis (LF) is a neglected tropical disease, primarily of the poor, affecting around 120 million people worldwide and endemic in 80 countries. LF is endemic in Africa, South America, the Indian subcontinent, South East Asia, the Pacific islands and the eastern Mediterranean. Around 118 million people are estimated to have clinical symptoms of the disease, with 74 million being microfilaraemic, which includes hidden renal and lymphatic pathology; another 27 million have hydrocoele. Additionally, 16 million people are reported to have elephantiasis, a chronic form of the disease. The disability-adjusted life year (DALY) burden due to LF, a measure of overall disease burden expressed as the number of years lost due to ill-health, is 5.5 million (Global programme to eliminate lymphatic filariasis: Annual Report on Lymphatic Filariasis, 2002; Molyneux, Bradley, Hoerauf, Kyelem, & Taylor, 2003).

1.1.2. Lymphatic filariasis in Papua New Guinea

Among the endemic islands, Papua New Guinea (PNG) has the highest estimated population at risk, with almost 50% of the entire population at risk of infection. PNG has also been reported to have one of the highest rates of microfilaremia in the Pacific island countries, with 39% of the population being microfilaremia positive (Bockarie & Kazura, 2003; Michael & Bundy, 1997). The Dreikikir district, located in the East Sepik province (ESP), has one of the highest intensities of parasitemia in PNG (Kazura et al., 1984).

1.2. Global Alliance to eliminate lymphatic filariasis (GAELF/GPELF)

The Global Alliance to eliminate lymphatic filariasis was launched in 1997 by the World Health Organization (WHO) with a goal to eradicate LF as a public health burden by 2020. Six intervention programs have been set up, one in each of the regions where LF is endemic (Africa, South America, India, South East Asia, the Pacific and the Mediterranean), to help reduce the global burden and monitor the progress of elimination. In the Pacific region, around 4.13 million people in 16 island countries are estimated to be at risk (American Samoa, Cook Islands, Marshall Islands, Fiji, French Polynesia, Kiribati, Micronesia, New Caledonia, Niue, Palau, Papua New Guinea, Samoa, Tonga, Tuvalu, Vanuatu, and Wallis & Futuna). Accordingly, the Pacific Program to Eliminate Lymphatic Filariasis (PacELF) was set up as the Pacific arm of the GPELF. To interrupt transmission of LF, 9 of these countries implemented the Mass Drug Administration (MDA) program, wherein the entire population at risk of infection is treated with drugs until microfilaremia levels remain below the level necessary to sustain transmission. The two most commonly used drug regimens are combination therapies of albendazole (400 mg) with ivermectin (150 µg/kg) or diethylcarbamazine (DEC) (6 mg/kg) with albendazole (400 mg), given once a year for five consecutive years. Additional therapies have included treatment with DEC-fortified cooking salt on a daily basis for 6-12 months (Global programme to eliminate lymphatic filariasis: Annual Report on Lymphatic Filariasis, 2002).

The most important factors for the success of the GPELF are an optimal drug regimen, the time-course of drug administration, the potential development of drug resistance and vector control (Michael et al., 2004). Bockarie and colleagues initiated a trial in 1994 to compare the effectiveness of combination treatment (DEC and ivermectin) with that of DEC alone in reducing the transmission of LF in PNG. The MDA trial in PNG was based solely on community-level drug administration. Individuals above 5 years of age in 14 communities were included in the study. While people in seven communities received DEC 6 mg/kg, people in the other seven communities received DEC 6 mg/kg and ivermectin 400 µg/kg. Among the 2219 individuals who received treatment, the authors reported that microfilarial density had declined in all the communities, with a steeper decrease in individuals who had received the combination treatment. Combination treatment has been reported to be more effective owing to the macrofilaricidal activity of DEC and the microfilaricidal activity of ivermectin (Moulia-Pelat et al., 1995). Four years after treatment, a reduction in the transmission potential was observed in the high transmission villages to 84% and in the moderate transmission villages to 97% (Bockarie et al., 1998). Both the microfilarial positivity and the transmission potential dropped substantially one year following treatment (Bockarie et al., 1998).

The World Health Organization (WHO) has recommended a minimum drug administration period of 5 years to interrupt transmission of LF (Global programme to eliminate lymphatic filariasis: Annual Report on Lymphatic Filariasis, 2002). As discussed earlier, studies have indicated that while a one year administration of the drugs does effectively reduce the microfilarial positivity rate, a five year administration period is more effective (Bockarie et al., 1998). A five year administration period has been found not only to reduce the community level of microfilariae, but also to render the female parasites reproductively 'dead', affecting the parasite life cycle (Cupp, Sauerbrey, & Richards, 2011). Addition of vector control decreased the number of target years needed to achieve the desired endemicity level (Michael et al., 2004).

Michael and colleagues modeled the potential for the development of drug resistance using a one-locus two-allele model. The authors predicted that the parasite populations are likely to develop resistance to ivermectin before albendazole, and that resistance is likely to develop after apparently low levels of allelic change, assuming that resistance to ivermectin and to albendazole is conferred by dominant and recessive alleles, respectively. These assumptions were based on previously recorded reports of resistance (Michael et al., 2004). They concluded by stating the urgent need to develop population genetic markers to monitor these changes and to incorporate changes in the population biology of these parasites during the mass drug interventions. Also, use of a combination of drugs is likely to reduce the possibility of development of drug resistance (Michael et al., 2004). Michael and colleagues found that inclusion of vector control alongside the MDA would interrupt transmission to a greater extent and would also act synergistically with drugs to reach the GPELF target of 0.5% microfilarial prevalence. Recently, on the 13th of March 2012, WHO announced the integration of vector management to control LF, which has previously been reported to be imperative for the success of the GPELF (Global programme to eliminate lymphatic filariasis: Annual Report on Lymphatic Filariasis, 2002; Michael et al., 2004).
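To make the dominant-versus-recessive assumption concrete, the following is a minimal sketch of a generic one-locus two-allele selection recursion. It is not the model of Michael et al. (2004): the selection coefficient `s`, the starting allele frequency, the number of generations and the Hardy-Weinberg assumption are all illustrative choices made here.

```python
def next_freq(p, w_rr, w_rs, w_ss):
    """One generation of selection at a locus with alleles R (resistant) and S.

    p is the frequency of R. Genotypes are assumed to be in Hardy-Weinberg
    proportions before selection (an assumption of this sketch).
    """
    q = 1.0 - p
    mean_fitness = p * p * w_rr + 2 * p * q * w_rs + q * q * w_ss
    return p * (p * w_rr + q * w_rs) / mean_fitness


def simulate(p0, generations, dominant, s=0.5):
    """Follow the R allele when susceptible genotypes have fitness 1 - s under treatment."""
    w_rr, w_ss = 1.0, 1.0 - s
    # Heterozygotes survive treatment only if resistance is dominant.
    w_rs = 1.0 if dominant else 1.0 - s
    trajectory = [p0]
    for _ in range(generations):
        trajectory.append(next_freq(trajectory[-1], w_rr, w_rs, w_ss))
    return trajectory


# Hypothetical starting frequency and selection strength, for illustration only.
dominant_run = simulate(p0=0.01, generations=20, dominant=True)
recessive_run = simulate(p0=0.01, generations=20, dominant=False)
print(f"dominant resistance allele after 20 generations:  {dominant_run[-1]:.3f}")
print(f"recessive resistance allele after 20 generations: {recessive_run[-1]:.3f}")
```

Under these assumptions a rare dominant allele rises much faster than a rare recessive one, because heterozygotes are already protected. This is the qualitative reason a dominant resistance gene would be expected to spread first.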

1.3. Life cycle of the filarial nematode: Wuchereria bancrofti

Wuchereria bancrofti, Brugia malayi and Brugia timori are threadlike parasitic worms that cause LF. These parasites are transmitted to humans by infected female mosquitoes through a puncture in the skin, after which L3 larvae develop into adult worms. The adult worms reside in the host lymphatic system, causing the lymphatic vessels to dilate, which results in slow and ineffective movement of the lymph fluid. The life span of the adult worms varies between 4-6 years, during which they produce millions of microfilariae. Microfilariae are small larvae that are released at the lymph nodes and migrate to the thoracic duct via blood circulation. Mosquitoes bite infected humans during a subsequent blood meal and in the process take up microfilariae circulating in the blood. The microfilariae (L1 larvae) enter the mosquito, pass to its stomach, penetrate the gut wall, and then enter the body cavity. The microfilariae mature to L3 larvae through two molts within the mosquito. Mature L3 larvae migrate to the mosquito's proboscis and are transmitted to humans during the next blood meal (Global programme to eliminate lymphatic filariasis: Annual Report on Lymphatic Filariasis, 2002).

1.4. Vectors of W. bancrofti in Papua New Guinea

The principal vectors involved in the transmission of W. bancrofti in PNG are the Anopheles punctulatus complex of mosquitoes (Anopheles punctulatus, An. koliensis and An. farauti) (Bryan, 2006; Michael & Bundy, 1997); to a lesser extent, Culex and Mansonia spp. of mosquitoes are also involved. Biting catches for the An. punctulatus group ranged from 2-12% and infective rates ranged from 0.4-3.5% in the villages located in the East Sepik province of PNG (Michael & Bundy, 1997). Though filariasis and malaria in PNG have the same vectors, their distribution varies across highly endemic areas (Michael & Bundy, 1997).

1.5. Genetics of W. bancrofti populations

The genetic diversity of W. bancrofti influences response to drug treatment and parasite fecundity, and thus has important implications for the on-going elimination program in PNG (Prichard, 2001). Mass drug administration is likely to eliminate susceptible worms, as a result of which the susceptible genes are not passed on to future generations (Prichard, 2001). As a result of chemotherapeutic programs, the parasite population is expected to go through a bottleneck. Monitoring the genetic diversity of these parasites thus enables us to understand the response of the parasite population to treatment and allows the identification of resistant strains (Volkman, Neafsey, Schaffner, Park, & Wirth, 2012).

Ardelli and colleagues studied the P-glycoprotein locus of O. volvulus to determine the impact of repeated treatment with ivermectin (a substrate for P-glycoprotein) on these parasite populations in Ghana, Africa, and reported a decrease in the diversity of these parasite populations. Interestingly, individuals who were not treated with ivermectin also showed decreased diversity. The authors offer two explanations for this observation: 1) ivermectin was imposing selection on worm populations, as a result of which only resistant worms could reproduce, or 2) individuals were selectively infected with worms already possessing resistance to the ivermectin treatment. The highest decrease in diversity was observed after 6 annual ivermectin treatments (Ardelli, Guerriero, & Prichard, 2006).

1.6. Objectives

The overall goal of the thesis is to develop genetic tools for W. bancrofti and then apply them to a population of W. bancrofti in PNG. I accomplish this by concentrating on the following objectives:

1. Develop genetic markers for Wuchereria bancrofti.

2. Determine the genetic diversity of W. bancrofti populations by sequencing the cytochrome oxidase I gene (obtained in objective 1) from six villages [Peneng, Albulum1, Albulum2, Yautong1, Yautong2 and Moihuak] in the Dreikikir District, East Sepik Province, Papua New Guinea.

3. Test the hypothesis that genetic heterogeneity of W. bancrofti is influenced by geographic distance.

Chapter 2

The complete mitochondrial genome sequence of the filarial nematode, W. bancrofti

2.0. Introduction

Although W. bancrofti is the major contributor to the global burden of LF, its genetics remains poorly understood. Previously, the only genetic data available for W. bancrofti was a portion of the cytochrome oxidase 1 (cox1) gene used in a phylogenetic analysis of filarial nematodes (Casiraghi, Anderson, Bandi, Bazzocchi, & Genchi, 2001).

Mitochondrial (mt) DNA has been reported as an ideal marker for phylogenetic analysis owing to its distinctive and non-recombinant nature, maternal mode of inheritance, simple genetic structure, rapid pace of evolution and ease of isolation (Avise et al., 1987).

Hu and colleagues sequenced the complete mt genome sequences of N. americanus from two geographic isolates and identified several differences between these isolates, demonstrating the importance of mt DNA markers in studies aimed at discerning the genetic basis of epidemiological patterns (Hu, Chilton, Abs El-Osta, & Gasser, 2003). In addition to its usefulness as a genetic marker in molecular epidemiological surveys, W. bancrofti mitochondrial DNA sequence data may also prove useful in the identification of new anthelmintic drug targets. While one research group has reported a novel inhibitor of complex I in Ascaris suum mitochondria (Okimoto & Wolstenholme, 1990), the potential for other drugs targeting nematode mt remains relatively unexplored (Feagin, 2000).

The complete mt genomes of Dirofilaria immitis, Onchocerca volvulus, Setaria digitata and Brugia malayi have been reported to date (Hu, Gasser, Abs El-Osta, & Chilton, 2003; Keddie, Higazi, & Unnasch, 1998; Yatawara, Wickramasinghe, Rajapakse, & Agatsuma, 2010; Ghedin et al., 2007). These genomes are approximately 13.6-14.3 kb and encode 12 protein coding genes, 2 ribosomal RNAs (rrns), 22 transfer RNAs (trns) and a putative control region (Hu & Gasser, 2006).

Nematode mt genomes are highly compact, with short intergenic regions and overlapping genes that are all transcribed in the same direction (Hu & Gasser, 2006). The protein coding genes are predicted to employ incomplete stop codons, which are thought to be modified post-transcriptionally to form complete stop codons. Other noteworthy features of nematode mt DNA are its use of TGA to encode tryptophan and of AGA and AGG to encode serine (Feagin, 2000), and its unique start codons such as ATT and TTG (Hu & Gasser, 2006).
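The effect of these codon reassignments can be illustrated with a short sketch that translates the same sequence under the standard genetic code and under a modified table. Only the three reassignments named above (TGA, AGA, AGG) are applied. Any other differences in the full invertebrate mitochondrial code are deliberately left out, and the test sequence is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons in the order TTT, TTC, TTA, TTG, TCT, ...
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), STANDARD_AA)
}

# Apply only the reassignments described in the text:
# TGA -> tryptophan (W), AGA and AGG -> serine (S).
NEMATODE_MT_CODE = dict(STANDARD_CODE, TGA="W", AGA="S", AGG="S")


def translate(seq, code):
    """Translate a DNA sequence codon by codon, ignoring any trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(code[seq[i:i + 3]] for i in range(0, usable, 3))


example = "TGAAGAAGG"  # hypothetical sequence chosen to contain all three codons
print(translate(example, STANDARD_CODE))     # *RR
print(translate(example, NEMATODE_MT_CODE))  # WSS
```

Annotating a nematode mt gene with the standard table would therefore introduce a premature stop at every TGA codon, which is why the choice of translation table matters for gene annotation.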

Here we report the complete mitochondrial genome of W. bancrofti from a Papua New Guinean isolate. In reporting this sequence, we describe its gene order, codon usage bias, and nucleotide composition.

2.1. Methods

2.1.1. Genomic DNA extraction and amplification

Genomic DNA (gDNA) was extracted from whole blood samples collected from individuals living in a W. bancrofti endemic region of Dreikikir District, ESP, PNG. All blood samples were collected under clinical protocols approved by the institutional review boards at Papua New Guinea Institute of Medical Research and University Hospitals Case Medical Center. DNA was extracted using a QIAamp 96 DNA Blood Kit (QIAGEN, Valencia, CA). W. bancrofti positive samples were identified by a post-PCR ligase detection reaction-fluorescent microsphere assay (LDR-FMA) (Mehlotra et al., 2010) and blood smear microscopy. As a secondary confirmation, part of a repetitive region of nuclear W. bancrofti DNA (GenBank accession no. AY297458) was PCR amplified using forward primer 5ʹ-GATGGTGTATAATAGCAGCA-3ʹ and reverse primer 5ʹ-GTCATTTATTTCTCCGTCGACTGTC-3ʹ. The resulting PCR products were used as template in a ligase detection reaction (LDR) in which an upstream W. bancrofti-specific primer 5ʹ-tacactttatcaaatcttacaatcTATATCTGCCCATAGAAATAACTA-3ʹ was ligated to a downstream conserved sequence primer phos-5ʹ-CGGTGGATCTCTGGTTATCACTCTG-3ʹ-biotin using Taq DNA ligase (New England Biolabs, Ipswich, MA). Fluorescing MagPlex-TAG microspheres (Luminex, Austin, TX) were hybridized to the 5ʹ-ends of the resulting LDR products. Following this hybridization step, ligated products were further labeled at their 3ʹ-ends with streptavidin-PE and their median fluorescence intensity (MFI) was determined using a Bio-Plex suspension array system (Bio-Rad, Hercules, CA). W. bancrofti-positive samples were identified as those whose MFI exceeded 300. These samples were stored at 4°C for PCR amplification and the negative samples were stored at -80°C for future reference.
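The positivity call described above amounts to a simple threshold rule, sketched below. Only the cutoff of 300 comes from the text. The sample names and MFI readings are hypothetical and serve purely as an illustration.

```python
MFI_THRESHOLD = 300  # cutoff stated in the text


def is_wb_positive(mfi, threshold=MFI_THRESHOLD):
    """Return True when the median fluorescence intensity exceeds the cutoff."""
    return mfi > threshold


# Hypothetical readings, for illustration only.
readings = {"sample_A": 1520.0, "sample_B": 300.0, "sample_C": 87.5}
calls = {name: is_wb_positive(mfi) for name, mfi in readings.items()}
print(calls)
```

Because the text specifies that the MFI must exceed 300, a reading of exactly 300 is treated as negative in this sketch.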

2.1.2. W. bancrofti mt genome amplification strategy

Putative W. bancrofti fragments (9725 bp) were retrieved from the Broad Institute's filarial worm database by a BLAST search against the complete mt genome of B. malayi (GenBank accession no. AF538716). Retrieved sequences were aligned to the B. malayi mt genome using MacVector 11.1 (MacVector, Cary, NC). The resulting alignment suggested that the Broad's whole-genome shotgun sequencing efforts had not achieved complete coverage of the W. bancrofti mt genome. To confirm that the existing sequences represented W. bancrofti mt DNA and to bridge the gaps in the alignment, 15 primer sets were designed to amplify the entire mt genome in 15 overlapping fragments (Table 1). The number assigned to both primers within each of the pairs that appear in Table 1 roughly represents the first position in the MacVector alignment to which the forward primer was predicted to anneal. Furthermore, the F and R designations that appear in the primer names indicate the orientation of each primer with regard to gene transcription, F (Forward) and R (Reverse) (Ramesh et al., 2012).
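Because each primer-set number approximates the start of its forward primer, the spacing between consecutive numbers gives a rough lower bound on the amplicon length needed for neighbouring fragments to overlap. The sketch below uses the primer-set numbers listed with the PCR conditions and the genome length reported in Section 2.2.1. It assumes that alignment positions can be treated as genome coordinates and that the genome is circular, neither of which is stated explicitly in the text.

```python
GENOME_LENGTH = 13637  # bp, W. bancrofti mt genome length reported in Section 2.2.1

# Primer-set numbers: approximate forward-primer start positions in the alignment.
PRIMER_SETS = [41, 701, 1751, 2721, 3691, 4651, 5611, 6425,
               7441, 8461, 9441, 10421, 11420, 12381, 13119]


def spacings(starts, genome_length):
    """Distance from each forward-primer start to the next, wrapping around the circle."""
    ordered = sorted(starts)
    following = ordered[1:] + [ordered[0] + genome_length]
    return [nxt - cur for cur, nxt in zip(ordered, following)]


gaps = spacings(PRIMER_SETS, GENOME_LENGTH)
for start, gap in zip(sorted(PRIMER_SETS), gaps):
    print(f"primer set {start}: next forward primer begins {gap} bp downstream")
print(f"largest spacing: {max(gaps)} bp")
```

Under these assumptions the spacings range from 559 to 1050 bp, so each fragment must be somewhat longer than its spacing to overlap the next one.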

PCR amplification was performed on 3 μL of template previously determined to be positive for W. bancrofti. The reactions varied in their use of buffer. Reactions involving primer sets 701, 1751, 2721, 3691, 5611, 6425, 7441, and 11420 used 10-fold dilutions of a 10X buffer containing 16.6 mM (NH4)2SO4, 10 mM β-mercaptoethanol, 3.4 or 6.7 mM MgSO4, and 67 mM Tris-HCl (pH 8.0, 8.3, 8.5, or 8.8). Reactions involving primer sets 41, 4651, 8461, 9441, 10421, 12381, and 13119 used 10-fold dilutions of a 10X buffer containing 500 mM KCl, 0.1% (w/v) gelatin, 7.5 or 15 mM MgCl2, and 100 mM Tris-HCl at pH 8.3. The PCR master mixtures (25 μL) contained 2.5 μL 10X buffer, 2 μL dNTPs (5 nmol each dNTP), 0.6 μL each primer (6 pmol), 0.6 μL Taq DNA polymerase (2.5 units) and 3 μL pooled gDNA (22 ng). An MJ Research PTC-225 thermocycler (MJ Research, Waltham, MA) was used to amplify all 15 fragments under the following thermocycling conditions: 92°C for 2 min (initial denaturation), followed by 40 cycles of denaturation at 92°C for 30 s, annealing at 48°C for 30 s, and extension at 72°C for 60 s (Ramesh et al., 2012).
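The reaction set-up above can be checked with a little arithmetic, sketched below. The component volumes and cycling steps are taken from the text. That the remaining volume is made up with water is an assumption, since the text does not state the diluent, and the run-time estimate ignores ramping between temperatures.

```python
REACTION_VOLUME_UL = 25.0

# Per-reaction volumes in microlitres, as listed in the text.
COMPONENTS_UL = {
    "10X buffer": 2.5,
    "dNTPs": 2.0,
    "forward primer": 0.6,
    "reverse primer": 0.6,
    "Taq DNA polymerase": 0.6,
    "pooled gDNA": 3.0,
}


def water_volume(total_ul, components):
    """Volume needed to bring the listed components up to the final reaction volume."""
    return round(total_ul - sum(components.values()), 2)


def cycling_seconds(initial_denaturation=120, cycles=40, steps=(30, 30, 60)):
    """Programmed hold time: initial denaturation plus all cycles, excluding ramp time."""
    return initial_denaturation + cycles * sum(steps)


water_ul = water_volume(REACTION_VOLUME_UL, COMPONENTS_UL)
total_s = cycling_seconds()
print(f"water (assumed diluent) per reaction: {water_ul} uL")
print(f"programmed cycling time: {total_s} s ({total_s / 60:.0f} min)")
```

With the listed volumes, 15.7 μL remains to reach 25 μL, and the programmed holds add up to 4920 s, or 82 min.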

2.1.3. Agarose gel electrophoresis and gel extraction

Amplified products were run on 2% (w/v) agarose gels, stained with SYBR Gold (Molecular Probes, Eugene, OR), and visualized using a Storm 860 molecular imaging system with ImageQuant 5.2 software (Molecular Dynamics, Sunnyvale, CA). Products of the nucleotide length predicted from MacVector were all extracted using a QIAquick Gel Extraction Kit (QIAGEN, Valencia, CA). These purified PCR products were then TOPO-TA cloned into the pCR2.1-TOPO vector and transformed into chemically competent E. coli TOP10 cells according to the manufacturer's protocol (Invitrogen, Carlsbad, CA). Multiple clones were obtained for the 15 individual fragments, and sequencing reactions of plasmid inserts using M13 universal primers were performed by Beckman Coulter Genomics (Danvers, MA).

2.1.4. Sequence assembly and gene annotation

Sequences were edited using Sequencher 4.8 (Gene Codes, Ann Arbor, MI) and assembled to form contigs. Consensus sequences for the 15 individual fragments were then assembled in Geneious 5.3 (Biomatters, Auckland, NZ) and aligned to the complete mt genome reported previously for B. malayi (Ghedin et al., 2007). B. malayi was used as a template for annotation of the complete mt genome of W. bancrofti to determine the boundaries of the rrn and protein-coding genes, as well as those of the non-coding AT-rich region. All trn genes were identified using both ARWEN and tRNAscan-SE 1.21 (Laslett & Canbäck, 2008; Lowe & Eddy, 1997; Schattner, Brooks, & Lowe, 2005). Tandem and inverted repeats were determined using Tandem Repeats Finder and EMBOSS, respectively (Benson, 1999; Rice, 2000). Geneious 5.3 was used to determine pairwise amino acid identities and nucleotide identities. Codon usage bias in the mt protein coding genes was estimated using CodonO (Angellotti, Bhuiyan, Chen, & Wan, 2007) (Ramesh et al., 2012).

2.2. Results and Discussion

2.2.1. General features of the mitochondrial genome of W. bancrofti

A schematic representation of the W. bancrofti mt genome is given in Figure 1. The mt genome organization of W. bancrofti, including the positions of the trn genes, is identical to that of B. malayi, D. immitis, S. digitata and O. volvulus (Ramesh et al., 2012).

The length of the complete mt genome of W. bancrofti is 13,637 bp, shorter than the mt genomes of B. malayi (13,657 bp), O. volvulus (13,747 bp), D. immitis (13,814 bp), C. elegans (13,794 bp) and A. suum (14,284 bp), thus making it amongst the smallest metazoan mt genomes sequenced to date. The overall base composition of the genome is highly A-T rich, with A, 20.1%; T, 54.5%; G, 18.1%; C, 7.2%. The mt genome of W. bancrofti contains 12 protein-coding genes (atp6, cob, cox1-3, nad1-6 and nad4L), 2 ribosomal RNAs (rrnS, rrnL), 22 transfer RNAs (trns) and a putative control/AT rich region. The genome lacks the atp8 gene, similar to most nematodes with the exception of T. spiralis (Lavrov & Brown, 2001) (Ramesh et al., 2012).
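Base composition of this kind is a straightforward per-nucleotide tally. The sketch below shows how such percentages can be computed from a sequence, and derives the combined A+T content from the reported values. The toy sequence is hypothetical, and the function is an illustration rather than the procedure used in Geneious.

```python
from collections import Counter


def base_composition(seq):
    """Percentage of A, C, G and T among the unambiguous bases of a sequence."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


# Hypothetical toy sequence, for illustration only.
toy = base_composition("ATTTGC")
print(toy)

# Percentages reported in the text for the complete W. bancrofti mt genome.
reported = {"A": 20.1, "T": 54.5, "G": 18.1, "C": 7.2}
at_content = round(reported["A"] + reported["T"], 1)
print(f"A+T content from the reported values: {at_content}%")
```

The reported values give an A+T content of 74.6%. They sum to 99.9% rather than 100%, which is consistent with rounding each base to one decimal place.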

All the genes are transcribed in the same direction. Short intergenic regions (1-46 nucleotides) are interspersed through the genome, the longest lying between the trnW and nad6 genes. Six pairs of genes overlap with adjacent trn genes by 1-3 nucleotides and 3 pairs of trn genes overlap by 1-2 nucleotides, consistent with the extreme economy observed in several metazoan mt genomes (Hu, Gasser, et al., 2003; Hu et al., 2002; Keddie et al., 1998; Yatawara et al., 2010). The cob gene shares 1 and 2 nucleotides with trnQ and trnL, respectively. The nad1 and trnY genes overlap by 3 nucleotides, far fewer than the number of nucleotides reported to be shared between the nad1 and trnF genes in D. immitis, O. volvulus and S. digitata (21, 23 and 26 nucleotides, respectively). It has been suggested that overlapping genes in metazoan mitochondria are likely to be regulated by an RNA editing activity (Hu et al., 2002).
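Overlaps and intergenic gaps of this kind follow directly from the annotated gene coordinates. The sketch below shows the calculation for 1-based inclusive coordinates on the same strand. The coordinates used are hypothetical and do not correspond to the actual W. bancrofti annotation.

```python
def overlap_bp(first, second):
    """Nucleotides shared by two genes given as 1-based inclusive (start, end) pairs."""
    (start1, end1), (start2, end2) = first, second
    return max(0, min(end1, end2) - max(start1, start2) + 1)


def intergenic_bp(upstream, downstream):
    """Nucleotides lying strictly between an upstream gene and the next gene downstream."""
    return max(0, downstream[0] - upstream[1] - 1)


# Hypothetical coordinates, for illustration only.
gene_a = (100, 1182)   # a protein coding gene
gene_b = (1180, 1235)  # an adjacent trn gene starting inside gene_a
gene_c = (1282, 1700)  # the next gene downstream

print(overlap_bp(gene_a, gene_b))     # 3
print(intergenic_bp(gene_b, gene_c))  # 46
```

A gene pair therefore contributes either an overlap or an intergenic spacer, never both, which is why the two quantities are reported separately.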

2.2.2. Protein coding genes

The mt genome of W. bancrofti encodes 12 protein-coding genes (atp6, cob, cox1-3, nad1-6 and nad4L). Four of them use ATT as the translation initiation codon (cox1-3 and cob) and two of them use TTG (nad1 and nad4). The other translation initiation codons used are TTA (nad2), TGT (nad6), GTA (nad4L), ATA (atp6), CTT (nad3) and TTT (nad5). TTT has been reported as an initiation codon for the nad5 gene in B. malayi (Ghedin et al., 2007) and S. digitata (Yatawara et al., 2010), and CTT has been reported as the initiation codon for nad3 in O. volvulus, D. immitis and B. malayi (Ghedin et al., 2007; Hu, Gasser, et al., 2003; Keddie et al., 1998). The codons TTA, TAT and GTA are rarely used as initiation codons in the nematodes sequenced to date; however, they have been reported as initiation codons for the nad4L gene in B. malayi and D. immitis (Ghedin et al., 2007; Hu, Gasser, et al., 2003). TGT has not been reported as an initiation codon in nematodes and appears to be unique to W. bancrofti. While 7 of the 12 protein coding genes use TAG/TAA as translation termination codons, the other 5 genes (nad1-3, cob and cox2) use truncated codons such as a T or a TA for termination. Interestingly, the genes utilizing a truncated stop codon are followed only by trns, and the truncated codons are likely to be converted to TAA after RNA editing by polyadenylation (Ojala, Montoya, & Attardi, 1981).

The protein coding genes have a high T content (47.7-65%) in comparison to the

A, C and G content with the AT content ranging from 67.8 - 81%. Relatively long poly T tracts ranging from 8-15 Ts are found in abundance in the protein coding genes of the mt genome. These are shorter than the poly T tracts reported in the mt genome of S. digitata,

O. volvulus, D. immitis and S. stercoralis (8-24 Ts) (Hu, Gasser, et al., 2003; Hu et al.,

2002; Keddie et al., 1998; Yatawara et al., 2010).

Among the different protein coding genes, the third codon position has a higher T content (68.5%) in comparison to the first (47.3%) and second (51%) codon positions. G

(18-22.6%), followed by A (22.2-11.9%), are the second and third most frequently used nucleotides in the first and third codon positions respectively. Though the overall mt genome is biased against C, there is selection for a higher C content (7.9 -12%) in the first and second codon positions. Similar results have been observed in other nematodes

(Hu, Gasser, et al., 2003; Hu et al., 2002; Keddie et al., 1998; Yatawara et al., 2010).

The predicted lengths of all the genes in W. bancrofti are very similar to those of

B. malayi (≤ 1 amino acid difference), D. immitis and O. volvulus (≤ 5 amino acid differences). With the exception of the cox1 gene, the predicted lengths are similar to those of N. americanus, A. duodenale, C. elegans (≤8 amino acid differences), A. suum and S. stercoralis (≤6 amino acid differences). The length of the cox1 gene is highly


variable (≤35 amino acid differences); however W. bancrofti, B. malayi, D. immitis and

O. volvulus have an upstream TA that might terminate the protein, resulting in a protein of length 535 amino acids. Pairwise comparisons of all the protein coding genes with other secernentean nematodes show that cox1 and cob are the most conserved proteins while

nad6 and atp6 are least conserved (Ramesh et al., 2012).

2.2.3. Codon Usage and amino acid composition

Codon bias arises from translationally optimal codons, which are recognized by the most abundant tRNA species and are preferred in strongly expressed genes, maximizing translation efficiency (Sharp & Matassi, 1994). Of the possible 64 codons, all but 6 (CTC,

CCC, ACA, ACG, GCG and CGC) are used. The most frequently used codons are T-rich,

TTT (Phenylalanine, 17.9%), GTT (Valine, 7.9%), TTG (Leucine, 7.1%), TAT

(Tyrosine, 6.2%), ATT (Isoleucine, 5.4%), TTA (Leucine, 4.9%) and TCT (Serine,

4.6%). Though phenylalanine, leucine and serine are among the most frequently used

amino acids; TTC (Phenylalanine), CTN (Leucine), TCC, TCA and TCG (Serine) are

seldom used due to the strong bias against C in this genome. Within each codon family, T

is preferred in the third position over G, A and C, indicative of the strong T bias in the mt

genome. Amino acids Glutamine (CAR) and Histidine (CAY) are the least used amino

acids. This pattern of a strong bias for T and against C has been reported in several other

nematodes as well (Hu, Gasser, et al., 2003; Hu et al., 2002; Keddie et al., 1998;

Yatawara et al., 2010) (Ramesh et al., 2012).
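The codon-usage and third-position T-content figures reported above can be tallied with a few lines of code. The following is a minimal sketch, assuming an in-frame coding sequence supplied as a plain string; the function names and the toy sequence are hypothetical and are not taken from the thesis analysis.

```python
from collections import Counter


def codon_usage(cds: str) -> dict[str, float]:
    """Return the percentage of each codon in an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    return {codon: 100.0 * n / len(codons) for codon, n in counts.items()}


def third_position_t_content(cds: str) -> float:
    """Return the percentage of third codon positions occupied by T."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    thirds = cds[2:usable:3]  # every third base, starting at codon position 3
    return 100.0 * thirds.count("T") / len(thirds)


# Hypothetical toy sequence built from T-rich codons named in the text.
toy_cds = "".join(["ATT", "TTT", "GTT", "TTG", "TAT", "TTT", "TAA"])
usage = codon_usage(toy_cds)
t3 = third_position_t_content(toy_cds)
```

In this toy sequence TTT accounts for 2 of 7 codons and T occupies 5 of 7 third positions; applied to the concatenated protein coding genes, the same tally would yield figures of the kind reported above.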

2.2.4. Ribosomal RNA genes


The mt small and large subunit ribosomal RNA genes were identified based on sequence comparisons to other nematodes. The rrn genes have an AT content of 77.6%.

Eleven poly-T tracts (8-20 Ts) are present in the rrn genes; the rrnL gene contains the longest homopolymer tract of 20 Ts in the entire mt genome. The rrnS is located between the nad4L and trnY genes and rrnL is located between the trnH and nad3 genes. The size of the rrnS is 672 bp, similar to other nematodes (684 bp in O. volvulus, 672 bp in B. malayi,

687 bp in D. immitis and 672 bp in S. digitata). The rrnS gene is 94.4% similar to that of

B. malayi and 89% similar to S. digitata, O. volvulus and D. immitis. It has much lower similarity with C. elegans and A. suum (65.4 and 66.5% respectively). The size of the rrnL is 972 bp and it shares a high sequence similarity with B. malayi, S. digitata, O. volvulus and D. immitis (83.4-92.6%) and a lower similarity with C. elegans and A. suum

(68.2 and 70.4% respectively) (Hu, Gasser, et al., 2003; Hu et al., 2002; Keddie et al.,

1998; Yatawara et al., 2010) (Ramesh et al., 2012).

2.2.5. Transfer RNA genes

A total of 22 trn genes (53-59 nt in length) were identified in the W. bancrofti mt genome sequence. Twenty of the trn structures possess a TV-replacement loop instead of a TΨC arm and variable loop. The remaining two trn (trnS) structures lack a DHU arm but possess a DHU-replacement loop. Ohtsuki and colleagues have proposed that loss of the TΨC and DHU arms from these trns has resulted in two nematode mt elongation factor Tu proteins

(EF-Tu1 and EF-Tu2) that exhibit different modes of trn binding (Ohtsuki et al., 2001;

Ohtsuki, Sato, Watanabe, & Watanabe, 2002). Mismatches in the aminoacyl acceptor stem, anticodon stem, DHU stem or the TΨC arm are hypothesized to be corrected by

RNA editing (Hu, Gasser, et al., 2003) (Ramesh et al., 2012).


2.2.6. Non coding regions

The non-coding regions of W. bancrofti account for 2.8% of the entire mt genome. The longest non-coding region, located between the cox3 and trnA genes, is

likely to represent the control region, due to its high AT content (83.9%). The control

region is shorter than the previously reported control regions for S. digitata (506bp), B.

malayi (283bp) and D. immitis (362bp). Unlike A. suum and C. elegans, the control

region of W. bancrofti lacks AT dinucleotide repeats and the 43 nucleotide tandem repeat

motifs (CR1-CR6) reported in C. elegans (Ronald Okimoto, Macfarlane, Clary, &

Wolstenholme, 1992). The AT-rich region possesses two copies of a 10 nt tandem repeat

and seven pairs of inverted repeats ranging from 5 to 28 nucleotides. Inverted repeats have

been previously reported in the control region of S. stercoralis, A. suum, O. volvulus, A.

duodenale and N. americanus (Hu, Gasser, et al., 2003; Hu, Chilton, et al., 2003; Keddie

et al., 1998; Okimoto et al., 1992). Long non-coding regions located between the nad4

and cox1 genes in A. suum and C. elegans are absent in the control region of W. bancrofti

(Okimoto et al., 1992). An additional short intergenic region of 46 nucleotides is located

between trnW and nad6 gene which can be folded into a stem-and-loop secondary

structure. Intergenic regions at identical locations have been reported in O. volvulus

(46bp), S. digitata (39bp) and D. immitis (52bp) (Hu, Gasser, et al., 2003; Hu et al., 2002;

Keddie et al., 1998; Yatawara et al., 2010) though no secondary structures were detected

(Ramesh et al., 2012).


2.3. Conclusions and future directions

The W. bancrofti mt genome sequence reported herein provides a wealth of new

molecular markers that may be used to study genetic variation in this important human

parasite. The complete mt genome of W. bancrofti was also obtained from both Chennai,

India (GenBank Accession: JQ316200) and Mali, Africa (GenBank Accession: JN367461).

All 3 mt genomes (PNG, India and Africa) have similar genome organization and

contain over 250 polymorphisms, with most of the polymorphisms residing in nad4,

atp6, cox1 and cox2 (Ramesh et al., 2012). Investigation of a portion of the mitochondrial

cox1 gene from W. bancrofti microfilariae in PNG suggests that multiple parasite strains

infect individuals living in these areas (Chapter 3). Uncovering the epidemiological

consequences of this genetic variation will be essential to the success of programs aimed

at control and elimination of lymphatic filariasis in regions where W. bancrofti is

endemic.


Chapter 3

Genetic diversity of Wuchereria bancrofti in Dreikikir district, East Sepik Province,

Papua New Guinea

3.0. Introduction

Wuchereria bancrofti is the primary causative agent of lymphatic filariasis (LF), a

disfiguring and debilitating vector-borne disease estimated to affect 120 million people in

80 countries (Global programme to eliminate lymphatic filariasis: Annual Report on

Lymphatic Filariasis, 2002). The GPELF goal to eliminate LF is predicated on

interrupting the transmission of infection by Mass Drug Administration (MDA). MDA

[Diethylcarbamazine (DEC) with/without ivermectin] has been shown to reduce the prevalence of W. bancrofti infection in humans and mosquitoes (Bockarie et al., 1998).

Another study in PNG has shown that the levels of microfilaremia decreased by 86-98%

when the residents were treated with DEC, with or without ivermectin (Bockarie et al.,

2002).

Long-term chemotherapeutic programs, such as MDA, can lead to changes in the

parasite population structure, thus altering management programs and potentially causing

resurgence of resistant strains after their cessation (Esterre, Plichart, Sechan, & Nguyen,

2001; Harb et al., 1993; Sunish et al., 2003). A phenylalanine to tyrosine

substitution at position 200 of the beta-tubulin gene, associated with benzimidazole

resistance, has been reported in T. circumcincta, H. contortus, C. oncophora and in W.

bancrofti as well (Churcher, Schwab, Prichard, & Basáñez, 2008; Elard & Humbert,

1999; Kwa, Veenstra, & Roos, 1993, 1994; Prichard, 2001; Schwab, Boakye, Kyelem, &


Prichard, 2005; Winterrowd, Pomroy, Sangster, Johnson, & Geary, 2003). Resistance to

ivermectin has also been reported at the P-glycoprotein locus in O. volvulus, a close

genetic relative of W. bancrofti (Ardelli et al., 2006).

In PNG, MDA programs through 1998 showed that population-level administration of anti-filarial drugs significantly reduced prevalence of W. bancrofti

infection in human and mosquito populations in hyper-endemic communities (Bockarie et

al., 1998; Bockarie et al., 2002). However, since 1998 parasite transmission has shown

signs of recovering to pre-MDA levels. Also, Won and colleagues have reported similar

resurgence of W. bancrofti infection prevalence when annual MDA was missed in one

year of a multi-year treatment program in Haiti (Won, Rochars, Kyelem, Streit, &

Patrick, 2009). These results suggest that either the drugs used to kill W. bancrofti or the

current drug administration strategy is not completely effective due to changing

population dynamics.

To date population genetic data have not been available for W. bancrofti from

PNG. It follows that the genetic diversity of the parasite has not been measured and there is

no information regarding its breeding population size, its potential for variation, nor its

capacity to evade or recover from MDA. As a result, it is not possible to monitor

progress toward elimination of this parasite population, or assess potential for emergence

of drug resistance.

Here, we apply the genetic markers developed in Chapter 2 to assess the genetic

diversity of W. bancrofti populations in Dreikikir District, ESP, PNG. We sequenced a

portion of the cytochrome oxidase 1 (cox1) gene from W. bancrofti in 14 patients belonging to 6 villages (Figure 2:

Peneng, Albulum1, Albulum2, Yautong1, Yautong2 and Moihuak) with the main

objective of understanding the dynamics of lymphatic filariasis. We accomplished this

through two objectives: 1) we evaluated the genetic diversity of the parasite populations

for the entire ESP region as well as for each village and individual, and 2) we determined whether

there was genetic structure among the parasite populations in the ESP.

Previous RFLP studies across India and Southeast Asia have shown that W. bancrofti populations are highly heterogeneous (Bisht, Hoti, Thangadurai, & Das, 2006;

Dhamodharan, Das, Hoti, Das, & Dash, 2008a; Hoti, Thangadurai, Dhamodharan, & Das,

2008a; Pradeep Kumar, Patra, Hoti, & Das, 2002; Thangadurai, Hoti, Kumar, & Das,

2006), with a lack of genetic structure in parasites of vertebrates (Braisher,


Gemmell, Grenfell, & Amos, 2004; Higazi, Klion, Boussinesq, & Unnasch, 2004).

However, not all nematodes show this pattern; Blouin and colleagues have shown that

Ascaris populations exhibit more genetic structure (Blouin, Liu, & Berry, 1999).

Other factors such as life cycle, population sizes and geographic distances impact the structure of the parasite population (Blouin et al., 1999; Wright, 1943). Specific to this study, the population structure could also be influenced by a difference in the Annual

Transmission Potential (ATP) among the six villages. While five of the villages (Peneng,

Albulum1, Albulum2, Yautong1 and Yautong2) are reported to experience a higher mosquito biting rate (high transmission zone), Moihuak has been reported to experience a much lower biting rate (moderate transmission zone) (Bockarie et al., 1998). However, the current study included only one village in the moderate transmission zone; consequently, I sought to determine whether the W. bancrofti populations were isolated by distance.

Based on the previous reports of W. bancrofti populations from RFLP data, I expected to see a highly heterogeneous W. bancrofti population in the ESP, PNG.

Preliminary analysis on human migration in the ESP has indicated that a large proportion of individuals migrated/moved between 1998-2008 across the villages in the ESP of PNG

(Bun et al., 2012 unpublished data). Consequently, I expected to observe no genetic structure among these W. bancrofti populations owing to dispersal of the infection due to human movement.

3.1. Methods

3.1.1. Study sites and sample selection


Whole blood samples were obtained from individuals living in a W. bancrofti

endemic region of Dreikikir District, ESP, PNG. In order to accurately characterize the

genetic heterogeneity of W. bancrofti, individuals chosen for the study were between 10 and

30 years of age, had a mean microfilarial density of 350 MF/ml and were reported to have

limited migration among villages over the 10-year post-MDA period (1998-2008) (Bun et

al., 2012, unpublished data). Based on these criteria, fourteen individuals from six

villages [Number of individuals from every village: Peneng (n=3), Albulum1 (n=3),

Albulum2 (n=3), Yautong1 (n=2), Yautong2 (n=2), Moihuak (n=2)] were included in the

study. The blood samples were collected under clinical protocols approved by the

institutional review boards at Papua New Guinea Institute of Medical Research and

University Hospitals Case Medical Center.

3.1.2. Genomic DNA extraction and amplification

Genomic DNA (gDNA) was extracted from whole blood using a QIAamp 96

DNA Blood Kit (QIAGEN, Valencia, CA). W. bancrofti positive samples were identified

by a post-PCR ligase detection reaction-fluorescent microsphere assay (LDR-FMA)

(Mehlotra et al., 2010) and blood smear microscopy. A secondary confirmation used part of a repetitive region of nuclear W. bancrofti DNA (GenBank accession no.

AY297458), which was PCR amplified following the protocols in Chapter 2.

3.1.3. PCR amplification of the Cytochrome oxidase1 (cox1) gene and visualization of the products

Primers (Table 1) were designed to amplify 650 bp of the partial cox1 (Casiraghi et al., 2001). The PCR master mixtures (25 μL) contained 2.5 μL 10X buffer (16.6 mM


(NH4)2SO4, 10 mM β-mercaptoethanol, 3.4 or 6.7 mM MgSO4, and 67 mM Tris-HCl, pH

8.8), 2 μL dNTPs (5 nmol each dNTP), 0.6 μL each primer (6 pmol), 0.6 μL Taq DNA

polymerase (2.5 units) and 1 μL of DNA template. An MJ Research PTC-225

thermocycler (MJ Research, Waltham, MA) was used to amplify the cox1 fragments

under the thermocycling conditions: 92°C for 2 min (initial denaturation), followed by 40

cycles of denaturation at 92°C for 30s, annealing at 50°C for 30s, and extension at 72°C

for 60 s followed by a final extension at 72°C for 5 minutes.

The amplification products were separated on 2% (w/v) agarose gels and the

gels were stained with SYBR Gold (Molecular Probes, Eugene, OR) and visualized using

a Storm 860 molecular imaging system with ImageQuant 5.2 software (Molecular

Dynamics, Sunnyvale, CA). These PCR products were TA cloned into pCR2.1-TOPO vector and transformed into chemically competent E. coli TOP10 cells according to the manufacturer’s protocol (Invitrogen, Carlsbad, CA).

3.1.4. Sequencing of the cox1 gene

An enzyme reaction, referred to in the literature as exo-sap, was used to clean

PCR products prior to cycle-sequencing reactions. The exo-sap master mixture (10 μL) contained 0.1 μL exonuclease, 0.5 μL antarctic phosphatase and 7 μL of PCR product.

The reaction was then run in an MJ Research PTC-225 thermocycler (MJ Research,

Waltham, MA) under the thermo-cycling conditions: 37°C for 30 min followed by 95°C for 5 min. Samples were then diluted to a concentration of 2 ng/µl per 100 bp and prepared for cycle sequencing. Cycle sequencing used a modified reaction protocol with 2µl of

5X Sequencing buffer, 1µl of Big dye terminator, 0.36µl of primer (3.2 pmol), 2.4µl of


water and 4 µl of diluted PCR product from the exo-sap reaction. The thermo-cycling

condition for cycle sequencing was an initial denaturation at 96°C for 30s followed by 25

cycles of denaturation at 96°C for 10 seconds, annealing at 50°C for 5 seconds and the

extension step at 60°C for 4 min.

Following cycle sequencing PCR products were purified to remove unincorporated dyes and single stranded DNA. Purification was performed by adding

2.5µl of 125mM EDTA followed by 30µL of 100% ethanol to each sample. Samples were then mixed by inversion, incubated at room temperature for 15 minutes, and centrifuged at 3000g for 30 min at 4°C. Next, the supernatant was discarded and

30µL of 70% ethanol was added to the individual samples, which were centrifuged for 15 minutes at 1650g. The supernatant was then discarded and the samples were placed in a vacuum chamber for 15 minutes to ensure complete drying. Upon completion of drying, 10µl of HiDi formamide

(Applied Biosystems, Foster City, CA) was added to individual wells and loaded onto the

ABI PRISM 3700 DNA Sequencer (Applied Biosystems, Foster City, CA).

3.1.5. Sequence analysis: Alignment of sequences and Sequence editing

All the sequences were edited and assembled using Sequencher 4.8 (Gene Codes,

Ann Arbor, MI) and CodonCode Aligner 3.5 (CodonCode Corporation, Dedham, MA). The

following steps were performed to obtain sequences of high quality:

1) The sequences were imported and double peaks for the bases were corrected

automatically by CodonCode Aligner 3.5

2) The ends of all the sequences were trimmed to discard the low quality bases


3) The sequences were assembled and a consensus sequence was generated in

CodonCode Aligner 3.5

4) All sites with a PHRED score of less than 20 were corrected automatically in

CodonCode Aligner 3.5 to match the consensus sequence.

5) Sites with PHRED scores between 20 and 30 were visually investigated and

corrected to consensus if an ambiguous base was reported

6) The other sequences with less than 300 quality bases (PHRED<30) were removed

from the assembly.

7) The manually edited sequences were then imported into Geneious 5.3 and aligned

against the complete mitochondrial genome of W. bancrofti (GenBank Accession

No. JF 557722) to check for correct translation frame.

8) UCHIME, a chimera detection program, was used to detect chimeric (recombinant)

molecules (Edgar, Haas, Clemente, Quince, & Knight, 2011). The UCHIME

algorithm divides the input sequences into four non-overlapping sequences

(chunks) and builds a reference database based on the consensus sequence. The

algorithm then matches these chunks to the reference database and selects two of

the best matches to generate a three-way multiple alignment. Based on the percent

identity of the alignment, a score is assigned and a chimeric molecule is reported

if the score exceeds a pre-determined threshold (Edgar et al., 2011). The chimeric

molecules detected by the program were not included in the study.

9) Following the methods of Krawczak et al. (1989), polymorphisms that occurred fewer

than twice in the sequence assembly were treated as Taq DNA polymerase

errors and were modified to match the consensus sequence (Krawczak, Reiss,

Schmidtke & Rösler, 1989).
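The final editing step above (reverting polymorphisms seen fewer than twice to the consensus) can be sketched in code. The following is a minimal illustration only, assuming equal-length, pre-aligned sequences and a simple majority-rule consensus per alignment column; the function name, the `min_count` parameter and the toy reads are hypothetical and do not reproduce the CodonCode Aligner workflow.

```python
from collections import Counter


def revert_rare_variants(aligned: list[str], min_count: int = 2) -> list[str]:
    """Replace bases seen fewer than `min_count` times in their alignment
    column with that column's most common base."""
    columns = [Counter(col) for col in zip(*aligned)]
    consensus = [col.most_common(1)[0][0] for col in columns]
    cleaned = []
    for seq in aligned:
        bases = [
            base if columns[i][base] >= min_count else consensus[i]
            for i, base in enumerate(seq)
        ]
        cleaned.append("".join(bases))
    return cleaned


# Hypothetical toy alignment: the G at position 1 of the last read is a
# singleton and is reverted; the T at position 3 occurs twice and is kept.
reads = ["ACGT", "ACGT", "ACTT", "ACTT", "GCGT"]
cleaned_reads = revert_rare_variants(reads)
```

The threshold of two follows the rule stated in step 9; a stricter or more lenient cut-off would only require changing `min_count`.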

3.1.6. Genetic heterogeneity of W. bancrofti

An assembly of all the sequences from the six villages [Peneng, Albulum1,

Albulum2, Yautong1, Yautong2, Moihuak] was generated using Geneious 5.3 (Biomatters,

Auckland, NZ). The haplotype (gene) diversity (Nei, 1987) and θ (Watterson, 1975), a measure of the neutral mutation rate, were estimated using DnaSP 5.0 (Librado & Rozas,

2009) for the East Sepik Province (among all the six villages), W. bancrofti populations among hosts in a village and W. bancrofti populations within hosts (parasite infrapopulation).
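For orientation, the two diversity measures named above can be written out directly. The sketch below assumes equal-length aligned sequences with no gaps or ambiguous bases, uses Nei's sample-size-corrected haplotype diversity, and reports Watterson's θ per sequence (not per site); DnaSP may treat gaps and missing data differently, so this is an illustration and not a reimplementation. Function names and the toy sample are hypothetical.

```python
from collections import Counter


def haplotype_diversity(seqs: list[str]) -> float:
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


def watterson_theta(seqs: list[str]) -> float:
    """Watterson's estimator: segregating sites divided by the harmonic
    number a_n = 1 + 1/2 + ... + 1/(n-1)."""
    n = len(seqs)
    segregating = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating / a_n


# Hypothetical toy sample: 3 haplotypes among 4 sequences, 2 segregating sites.
sample = ["ACGT", "ACGT", "ACTT", "GCTT"]
hd = haplotype_diversity(sample)
theta = watterson_theta(sample)
```

For this toy sample the haplotype diversity is 5/6 and θ is 12/11, which can be checked by hand from the two formulas.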

3.1.7. Test for neutrality

All sequences were tested for deviations from the neutral model of evolution using Strobeck’s S (Sh) (Strobeck, 1987), which is a measure of the probability of

obtaining k haplotypes, given the estimated value of θ, under Ewens' sampling

formula (Ewens, 1972). Both the expected number of haplotypes and Sh were estimated

using DnaSP 5.0 (Librado & Rozas, 2009).
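The quantities behind this test can be sketched as follows. Under Ewens' sampling formula the expected number of haplotypes in a sample of n sequences is the sum of θ/(θ+i) for i = 0 to n-1, and the full distribution of the haplotype count follows from a simple recursion in which the (j+1)-th sequence carries a new haplotype with probability θ/(θ+j). The code below assumes that Sh is read as the probability of observing at most the observed number of haplotypes; that reading, and all function names, are assumptions made for illustration and may not match the DnaSP implementation in detail.

```python
def expected_haplotypes(n: int, theta: float) -> float:
    """Expected number of haplotypes in n sequences under Ewens' formula."""
    return sum(theta / (theta + i) for i in range(n))


def haplotype_count_distribution(n: int, theta: float) -> list[float]:
    """Return P(K = k) for k = 0..n, built up one sequence at a time."""
    probs = [0.0] * (n + 1)
    probs[1] = 1.0  # the first sequence always founds one haplotype
    for j in range(1, n):
        p_new = theta / (theta + j)  # chance that sequence j+1 is a new haplotype
        nxt = [0.0] * (n + 1)
        for k in range(1, j + 1):
            nxt[k] += probs[k] * (1.0 - p_new)
            nxt[k + 1] += probs[k] * p_new
        probs = nxt
    return probs


def prob_at_most(n: int, theta: float, k_obs: int) -> float:
    """Probability of observing k_obs or fewer haplotypes (assumed reading of Sh)."""
    return sum(haplotype_count_distribution(n, theta)[: k_obs + 1])


# Hypothetical inputs chosen only to exercise the functions.
dist = haplotype_count_distribution(10, 2.0)
mean_k = sum(k * p for k, p in enumerate(dist))
```

A value of this probability close to 1 would indicate more haplotypes than expected under neutrality, which is how the Sh values in this chapter are interpreted.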

3.1.8. Population structure of W. bancrofti

To quantify genetic differentiation and gene flow for the ESP and W. bancrofti

populations among hosts in a village, Snn (Hudson, 2000; Hudson, Boos, & Kaplan,

1992), the sequence nearest-neighbor statistic, and pairwise Fst, a measure of the

deviation of allelic frequencies from Hardy-Weinberg expectations (Wright & McPhee, 1925), were estimated using DnaSP 5.0 (Librado & Rozas, 2009) and Arlequin 3.5 (Excoffier &


Lischer, 2010). Snn is a powerful sequence-based statistic for testing genetic differentiation and is recommended for use in cases of high mutation rate, as implicated in the nematode mt genome (Anderson, Blouin, & Beech, 1998), and small sample sizes

(Hudson, 2000; Hudson et al., 1992). Haplotype networks were generated using TCS

1.21 (Clement, Posada & Crandall, 2000) for both the ESP and W. bancrofti populations among hosts in a village.
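To make the Snn statistic concrete, the sketch below follows the description in Hudson (2000): for each sequence, the nearest neighbors are the other sequences at the smallest pairwise distance, ties are shared equally, and Snn is the average fraction of nearest neighbors sampled from the same locality. Distances are plain mismatch counts between aligned sequences, and significance is assessed by permuting locality labels. Function names, the default number of permutations and the toy data are hypothetical; DnaSP was the tool actually used.

```python
import random


def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def snn(seqs: list[str], localities: list[str]) -> float:
    """Average fraction of each sequence's nearest neighbors that share its locality."""
    total = 0.0
    for i, seq in enumerate(seqs):
        dists = [(hamming(seq, other), j) for j, other in enumerate(seqs) if j != i]
        nearest = min(d for d, _ in dists)
        neighbours = [j for d, j in dists if d == nearest]  # ties are all kept
        same = sum(localities[j] == localities[i] for j in neighbours)
        total += same / len(neighbours)
    return total / len(seqs)


def snn_permutation_p(seqs, localities, n_perm=1000, seed=0):
    """Fraction of label permutations with Snn at least as large as observed."""
    rng = random.Random(seed)
    observed = snn(seqs, localities)
    labels = list(localities)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if snn(seqs, labels) >= observed:
            hits += 1
    return hits / n_perm


# Hypothetical toy data: two hosts that share no haplotypes.
toy_seqs = ["AAAA", "AAAA", "TTTT", "TTTT"]
separated = snn(toy_seqs, ["host1", "host1", "host2", "host2"])
mixed = snn(toy_seqs, ["host1", "host2", "host1", "host2"])
p_value = snn_permutation_p(toy_seqs, ["host1", "host1", "host2", "host2"], n_perm=200)
```

With completely separated hosts every nearest neighbor comes from the same host, so the statistic reaches its maximum of 1, the situation described for Moihuak in the Results.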

Analysis of genetic variance across the villages was conducted using an analysis of molecular variance (AMOVA) approach within the program Arlequin 3.5

(Excoffier & Lischer, 2010). AMOVA calculates standard variance components and correlation measures for a maximum of three hierarchical levels of population subdivision (Pfenninger, Bahl, & Streit, 1996). For the three-way AMOVA statistic, the populations were partitioned by individual villages. A multi-dimensional scaling plot was generated from pairwise Fst values using XLSTAT (Addinsoft) software

(Fahmy, 2003). Genetic distances (distances between species inferred from the nucleic acid sequences) were calculated using PHYLIP 3.6 (Felsenstein, 1989) and their frequencies were plotted for both within and between the hosts in the six villages.

Isolation by distance (IBD) of W. bancrofti populations in the ESP was computed using the Isolation by distance web service (Bohonak, 1999).
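Tests of isolation by distance are commonly framed as a Mantel test, in which the correlation between a genetic distance matrix and a geographic distance matrix is compared with correlations obtained after randomly relabeling the populations. The sketch below illustrates that general idea only; it assumes symmetric square matrices given as lists of lists, uses a one-sided permutation p-value, and is not a description of the internals of the web service cited above. The village coordinates and Fst values are invented for the example.

```python
import random
from itertools import combinations


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel(gen, geo, n_perm=999, seed=0):
    """Return (r, p): matrix correlation and one-sided permutation p-value."""
    n = len(gen)
    pairs = list(combinations(range(n), 2))
    geo_vec = [geo[i][j] for i, j in pairs]
    observed = pearson([gen[i][j] for i, j in pairs], geo_vec)
    rng = random.Random(seed)
    order = list(range(n))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(order)  # relabel populations in the genetic matrix
        permuted = [gen[order[i]][order[j]] for i, j in pairs]
        if pearson(permuted, geo_vec) >= observed:
            hits += 1
    return observed, hits / (n_perm + 1)


# Hypothetical example: four villages on a line, Fst proportional to distance.
positions = [0.0, 1.0, 2.0, 3.0]
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [[0.1 * d for d in row] for row in geo]
r, p = mantel(gen, geo, n_perm=199)
```

Permuting population labels, instead of shuffling individual matrix cells, keeps the dependence among pairwise values that share a population, which is the reason a Mantel-type test is preferred over an ordinary correlation test for distance matrices.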

3.2. Results

After our sequence assembly process (outlined above), we were left with 388

sequences (281 nt in length) belonging to fourteen individuals from six villages. An overview of the number of individuals from each village is represented in Table 2. W.


bancrofti populations were sequenced from 3 individuals each in Peneng (n=55 sequences, 13

haplotypes), Albulum1 (n=98 sequences, 53 haplotypes), Albulum2 (n=124 sequences,

40 haplotypes) and 2 individuals each in Yautong1 (n=24 sequences, 12 haplotypes),

Yautong2 (n=37 sequences, 21 haplotypes) and Moihuak (n=50 sequences, 17

haplotypes).

3.2.1. Genetic heterogeneity of W. bancrofti

A total of 388 sequences representing 124 haplotypes were observed among the

six villages. The haplotype diversity (HD), a measure of uniqueness of a particular

haplotype in a given population (Hartl & Clark, 1997), for the entire ESP was quite high,

HD= 0.93 (Table 2). The high haplotype diversity is suggestive of a highly heterogeneous

W. bancrofti population in the ESP region of PNG. Haplotype diversity within each of

the six villages was more variable (HD=0.75-0.95). This suggested that the villages were

not equal in their HD across the study sites with some villages having higher haplotype diversity than the entire ESP. W. bancrofti populations were highly heterogeneous within individual hosts as well (HD=0.63-1.00), with variation among hosts even within the

same village.


θ, which is estimated from the number of nucleotide sites that are polymorphic (Hartl & Clark,

1997) in a given set of sequences, was 10.55 across the ESP. A high number of polymorphic sites is suggestive of a highly diverse parasite population among these six villages. θ varied between 3.44 and 7.17 in the individual villages, suggesting a diverse nature of W. bancrofti populations in accordance with the high haplotype diversity observed earlier. A similar trend was detected amongst the parasite infrapopulation (parasite populations within the host) with θ varying between 1.89 and 6.43 (Table 2). This trend suggests that high haplotype diversity is not always correlated with a high number of polymorphic sites: some populations have low θ but high HD, suggesting that their haplotypes might not differ by many substitutions.


3.2.2. Test for compliance to the neutral model of evolution

Strobeck's S (Strobeck, 1987) (Sh) is a measure of the probability of capturing the

expected number of haplotypes as given by Ewens’ sampling distribution. Ewens’

sampling distribution is based on the neutral model of evolution and assumes a constant

population size and no selection (Ewens, 1972). For the entire ESP we found 124 haplotypes, which was more than predicted under the neutral model (expected number of haplotypes = 22; Sh = 1.00) (Table 2). Consistent with this observation, we found more

haplotypes than expected in all villages (Sh > 0.95) with the exception of Moihuak.

Within the host populations only the villages of Moihuak and Peneng were consistently

within the neutral expectation for the number of haplotypes (Table 2), with all other

villages having more haplotypes than expected.

3.2.3. Genetic structure of W. bancrofti populations

Genetic structure of a population is typically measured by the variation in

subpopulation allele frequencies in comparison to the total population. Wright’s classic

Fst statistic quantifies this deviation. Wright interpreted the range of Fst as 0-0.05 to indicate little genetic differentiation, 0.05-0.15 to indicate moderate genetic differentiation, 0.15-0.25 to indicate high genetic differentiation, and values greater than

0.25 to indicate very high genetic differentiation (Hartl & Clark, 1997). A matrix of pairwise Fst values among villages (Table 3) was used to generate a multi-dimensional scaling (MDS) plot in XLSTAT. The MDS provides a visual representation of the relationship among the W. bancrofti populations in the six villages based on these genetic distances (Figure 3). The MDS plot indicates that W. bancrofti populations in Peneng

and Moihuak are most differentiated. A Shepard diagram, which measures the reliability of the MDS plot, indicates that the plot generated was of high quality (figure not shown).

Pairwise Fst values were also generated for W. bancrofti populations between hosts in each of the six villages (Table 4). While moderate genetic structure of these parasite

populations was detected among hosts in Peneng and Albulum2, high to very high genetic structure was detected in Albulum1 and Moihuak. A general lack of genetic structure of W. bancrofti populations among hosts was observed in Yautong2 (Hartl &

Clark, 1997) (Table 4).
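The qualitative ranges of Fst quoted above can be expressed as a simple lookup, which is convenient when labeling many pairwise values. The sketch below is an illustration only: the function name is hypothetical, and the treatment of values falling exactly on a boundary (assigned to the higher category) and of negative estimates (treated as little differentiation) is an assumption, since the text does not specify either.

```python
def classify_fst(fst: float) -> str:
    """Map an Fst value to the qualitative categories quoted in the text."""
    if fst < 0.05:
        return "little"      # also covers slightly negative estimates
    if fst < 0.15:
        return "moderate"
    if fst < 0.25:
        return "high"
    return "very high"


# Values taken from the AMOVA results reported in this chapter.
labels = {value: classify_fst(value) for value in (0.08, 0.31)}
```

Applied to the AMOVA results, an Fst of 0.08 falls in the moderate range and an Fst of 0.31 in the very high range.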

The nearest neighbor statistic, Snn, is a measure of how often the nearest sequence neighbor, sequence identity, is from the same locality in geographic space (Hudson,

2000). An Snn of 1 is indicative of a highly differentiated population and a value of 0.5 is indicative of a panmictic population with p-values estimated by permutation tests

(Hudson, 2000). For the entire ESP region, Snn was determined to be 0.50 (p<0.001)


suggestive of significant structure. Within the villages I observed moderate to high

levels of genetic structure (Peneng: Snn =0.65, P<0.001; Albulum1: Snn =0.71, P<0.001;

Albulum2: Snn =0.59, P<0.001; Yautong2: Snn =0.75, P=0.004). The Snn statistic for

Moihuak was the highest with a value of 1 (p<0.001) (Table 2), indicating that the W.

bancrofti populations were genetically differentiated with a very high level of genetic

structure.

A three-way AMOVA was performed assuming no partition among the six

villages using Arlequin 3.5 (Excoffier & Lischer, 2010) (Table 5). The AMOVA

indicated that 69.02% of the variance is accounted for by the parasites within the hosts

with an Fst = 0.08 (p<0.001) indicative of a moderate genetic structure. Of the remaining variance, 22.71% was accounted for by the parasite populations among the hosts in the villages (Fst = 0.31, p<0.001) and 8.27% by differences among villages.

Two-way AMOVAs were also performed for individual villages (data not shown)

indicating that 77-98% of the genetic diversity was due to variation within the parasite

infrapopulation in all villages, with the exception of Moihuak, where only 40% of the

variation was accounted for by the parasite infrapopulation. Interestingly, when the three-way

AMOVA was repeated excluding Moihuak, 90% of the variation was due to the heterogeneity within the parasite infrapopulation (data not shown).

The haplotype network (Figure 4) generated using TCS 1.21 revealed the existence of 5 dominant haplotypes across the entire ESP. The five dominant haplotypes

(haplotypes with the highest frequencies) were represented at varying frequencies in each of the six villages. The most dominant haplotype (Frequency=88) was represented only in

Albulum1, Albulum2, Yautong1 and Yautong2. Interestingly, W. bancrofti populations from Moihuak carried the 5 most dominant haplotypes at extremely low frequencies (1-2). Haplotype networks were also generated for W. bancrofti populations among the hosts in the six villages (Figures 5-10). The haplotype network for Peneng indicated that 3 haplotypes were shared between the hosts (Table 2, Figure 5). Seven haplotypes were shared among hosts in Albulum1 (Table 2, Figure 6) and of the 40 haplotypes observed in Albulum2, 8 haplotypes were shared among the hosts (Table 2, Figure 7).

For Yautong1, the haplotype network (Figure 8) indicated that host TWG 436 was 1 mutational step away from the dominant haplotype of individual TWG 363. Of the 37 sequences obtained from W. bancrofti populations in Yautong2, 2 haplotypes were shared between the hosts (Table 2, Figure 9). For these five villages, the other haplotypes were found to be closely related (1 to 13 mutational steps) to the dominant haplotypes in the village. The haplotype network for Moihuak indicates no sharing of haplotypes between hosts, responsible for the previously described high levels of genetic structure (Table 2,

Figure 10).


I evaluated within-host structure by comparing genetic distances both within and among hosts for the six villages (Figures 11 and 12). Genetic distances ranged from 0 to 0.050 within hosts and from 0 to 0.060 among hosts. This suggests that the

W. bancrofti populations within hosts are more closely related than the overall parasite population (Figures 11, 12).


3.2.4. Isolation by distance

To determine the cause of genetic structure we tested the fit of our data to the isolation by distance model (IBD) (Bohonak, 1999). The IBD model tests the assumption


that there is complete continuity in distribution of populations (Bohonak, 1999). Under

the IBD model more remote populations will show the highest level of differentiation

owing to the intervening geographic distance (Wright, 1943). A comparison between the

easternmost (Peneng) and the westernmost (Moihuak) villages with respect to the other

study sites was plotted (Figure 13) for Fst values. I observed a cline from the east to the

west with the highest Fst values between Moihuak and Peneng. This pattern is indicative of

isolation by distance (Xu et al., 2012). I also tested the correlation between Fst and

geographic distance using the six villages (correlation r² = 0.90; p<0.001) (Figure

14).
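A correlation between pairwise Fst and geographic distance is commonly assessed with a Mantel test, because the entries of a pairwise matrix are not independent. The sketch below shows one way such a test can be set up. The site positions and Fst values are hypothetical, and the number of permutations and the one-sided p-value are assumptions rather than the settings used in this study.

```python
import random
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def upper_triangle(matrix, order):
    """Values above the diagonal, with rows and columns taken in `order`."""
    return [matrix[order[i]][order[j]]
            for i, j in combinations(range(len(order)), 2)]

def mantel(fst, dist, permutations=999, seed=0):
    """Observed r and a one-sided permutation p-value (Mantel test)."""
    rng = random.Random(seed)
    identity = list(range(len(fst)))
    d = upper_triangle(dist, identity)
    observed = pearson_r(upper_triangle(fst, identity), d)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel populations in the Fst matrix only
        if pearson_r(upper_triangle(fst, order), d) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)

# Hypothetical example: six sites along an east-west transect (km).
positions_km = [0, 3, 5, 9, 12, 20]
n = len(positions_km)
dist = [[abs(a - b) for b in positions_km] for a in positions_km]
# Invented Fst values that increase with distance, plus a small perturbation.
fst = [[0.0 if i == j else 0.02 * dist[i][j] + 0.01 * ((i + j) % 2)
        for j in range(n)] for i in range(n)]

r, p = mantel(fst, dist)
print(f"r = {r:.2f}, r^2 = {r * r:.2f}, p = {p:.3f}")
```

Permuting the population labels of one matrix, rather than shuffling individual matrix entries, preserves the dependence among pairwise values that share a population.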

3.3. Discussion

3.3.1. Genetic heterogeneity of W. bancrofti populations

To my knowledge, the current study is the first attempt to study the population diversity of W. bancrofti using DNA sequence data. The study was conducted on individuals in the ESP who were previously treated with DEC alone or in combination with ivermectin. Previous studies using RFLP markers have reported high genetic heterogeneity and a general lack of genetic structure (Blouin et al., 1999; Churcher et al., 2008; Criscione, Poulin, & Blouin, 2005; Hoti, Thangadurai, Dhamodharan, & Das, 2008b; Prichard, 2001; Thangadurai et al., 2006). Studies on W. bancrofti populations have been carried out in India, the Andaman Islands, Thailand and Myanmar using RFLP markers. These studies reported highly heterogeneous W. bancrofti populations, which the authors attributed to the large parasite infrapopulation (Braisher, Gemmell, Grenfell, & Amos, 2004; Dhamodharan, Das, Hoti, Das, & Dash, 2008; Hoti et al., 2008; Pradeep Kumar, Patra, Hoti, & Das, 2002; Thangadurai et al., 2006; Nuchprayoon, Junpee, & Poovorawan, 2007). A study of 40 trichostrongylid sequences yielded 31-39 unique haplotypes (Blouin et al., 1999), indicating highly diverse populations. Braisher and colleagues also reported a highly heterogeneous population of Teladorsagia species in Britain using mt markers (Braisher et al., 2004). In accordance with these findings, the W. bancrofti populations in the current study were also highly heterogeneous (Table 2), most likely due to the highly diverse nature of the parasite infrapopulation. Strobeck's S was significant for most of the W. bancrofti populations within the hosts in the ESP, suggesting a signal of selection or of population expansion from a prior bottleneck (Venkatesan, Westbrook, Hauer, & Rasgon, 2007) (Table 2).

In this study, most of the diversity was observed within the parasite infrapopulation in the ESP (Table 5). The high diversity of W. bancrofti populations within these hosts is most likely due to the high mutation rate in nematode mitochondria (Anderson et al., 1998). Apart from the mutation rate, polymorphisms also depend on migration rates and effective population sizes (Prichard, 2001). Large effective population sizes contribute to a highly heterogeneous mixture of W. bancrofti populations within the host. Mosquitoes have also been shown to exhibit 'site fidelity' in mark-recapture studies (Churcher et al., 2008). Studies in Burkina Faso and East Africa have shown a density-dependent process whereby multiple infective L3 larvae are transmitted in a single bite by Anopheles gambiae and Culex quinquefasciatus (Gasarasi, 2000; Gyapong et al., 2002). Thus a filarial infection is the outcome of several infective bites (Menon & Rajagopalan, 1977) accumulated over long periods of exposure (Hoti, Thangadurai, Dhamodharan, & Das, 2008a), resulting in high genetic diversity.

Interestingly, Peneng, Albulum1, Albulum2, Yautong1 and Yautong2 are located in the high transmission zone, where mosquitoes are reported to have higher biting rates (a higher Annual Transmission Potential, ATP) than mosquitoes in the moderate transmission zone (Bockarie et al., 1998). Mosquitoes in Moihuak have a lower ATP, as a result of which fewer infective L3 larvae are transmitted among the hosts. Accordingly, while most of the genetic diversity (77-98%) was observed within the parasite infrapopulation in the high transmission villages, 60% of the diversity in Moihuak (a moderate transmission village) was due to differences in W. bancrofti populations among hosts (data not shown).

3.3.2. Population structure

In the present study, moderate to high levels of genetic differentiation were observed in Peneng, Albulum1, Albulum2 and Yautong2, with very high levels of differentiation in Moihuak. Existing reports on nematode populations describe a general lack of genetic structure. For example, Loa loa populations from South Cameroon have been reported to represent a fairly heterogeneous population with a general lack of genetic structure (Fst = 0.01-0.04) (Higazi et al., 2004). A similar trend was observed in trichostrongylid species in Britain (Braisher et al., 2004). Contrary to these reports, Thangadurai and colleagues observed a high degree of genetic differentiation (Fst = 0.7978) between two strains of W. bancrofti across the Eastern and Western Ghats (Thangadurai et al., 2006) and attributed this difference to geographical isolation. Accordingly, the W. bancrofti populations were tested for isolation by distance, the results of which are discussed at the end of the chapter.

The haplotype network for all six villages (Figure 4) indicated the existence of a few dominant haplotypes that occupy the central position and give rise to the other haplotypes. Interestingly, the rare haplotypes were always located at the tips of the network. A similar trend was seen in another study, where 188 mitochondrial sequences of S. lindsayae gave rise to 12 haplotypes (Courtright et al., 2000). No clear grouping by locality was observed, with the exception of the parasite populations in Moihuak, which were several mutational steps away from the parasite populations in the other five villages.

Genetic distances of W. bancrofti populations both within and among hosts, calculated using PHYLIP 3.6, provide evidence of relatedness within a host. While the mean genetic distance for W. bancrofti populations within hosts was 0.01 (range: 0-0.05), the mean genetic distance among hosts was 0.02 (range: 0-0.06) (p < 0.001) (Figures 11, 12). These results suggest that parasites within a host are more closely related to each other than to parasites in other hosts. This pattern, while not conclusive, suggests the need for follow-up experiments to understand how parasite family groups or sib-ships are transmitted from mosquitoes to hosts.

3.3.3. What mechanisms are responsible for structuring populations of W. bancrofti in PNG?

Several factors, such as environmental barriers, life histories and historical processes, are responsible for shaping the genetic structure of populations. Additionally, the geographic distribution of species can influence the ability of individuals to find mates. In widely dispersed species, or species with small home ranges, mating tends to occur more often between nearby populations than between distant populations. As a result of this tendency, populations that live near each other are genetically more similar than populations that live further apart. The resulting pattern is termed isolation by distance (IBD) and results in a clinal distribution of traits across a geographic region (Balloux & Lugon-Moulin, 2002; Wright, 1943). In populations of W. bancrofti in the ESP, I observed a positive correlation between genetic and geographic distances (r2 = 0.90) and a genetic cline in Fst values from east to west (Figures 13, 14). These data lead me to conclude that isolation by distance is driving population differentiation in W. bancrofti populations (Wright, 1943). However, it should be kept in mind that human host migration also significantly influences the population structure of W. bancrofti, and a thorough analysis of host migration should be performed to assess its impact on the genetic structure of this parasite.

W. bancrofti undergoes development in both the human host and the mosquito vector; transmission dynamics therefore offer two possible routes to isolation by distance: human migration and vector dispersal. Preliminary analysis of migration data has indicated several migration events of individuals across the ESP of PNG (Bun et al., 2012, unpublished data). A high rate of human migration would homogenize the parasite population, making it an unlikely contributor to the isolation of these W. bancrofti populations by geographic distance.

The more likely explanation for the isolation by distance of W. bancrofti populations is vector dispersal. Anopheline mosquitoes are the most common vectors of W. bancrofti in the six study sites (Bockarie et al., 1998). The transmission of W. bancrofti by mosquitoes depends on both the biting behavior and the transmission efficiency of the vector (Churcher et al., 2008). Charlwood and Bryan used a mark-recapture experiment to conclude that Anopheles mosquitoes have a limited flight distance of 2 km (Charlwood & Bryan, 1987). All villages within the study site are farther apart than 2 km (Bockarie et al., 1998), with the exception of Albulum1 and Albulum2 (1.6 km) and Yautong1 and Yautong2 (1.5 km). Thus it is likely that the limited dispersal of these mosquitoes is driving the pattern of IBD detected in the W. bancrofti populations.

3.4. Conclusions and future directions

To my knowledge, this study is the first attempt to study the population diversity of W. bancrofti (cox1 gene; Chapter 3) using DNA sequence data (mt genome described in Chapter 2). W. bancrofti populations in the Dreikikir district, East Sepik Province of Papua New Guinea are highly heterogeneous and exhibit a moderate level of genetic structure. Most of the variance was associated with the high level of diversity within the parasite infrapopulation. Parasites across the ESP were isolated by geographic distance, probably due to the limited dispersal of L3 larvae by infective mosquitoes.

An interesting pattern was observed in the W. bancrofti populations of Moihuak (Figures 4 and 10), a moderate transmission village, in that these parasites were isolated from the parasites in the high transmission zone (Peneng, Albulum1, Albulum2, Yautong1, Yautong2). This observation might suggest isolation of the W. bancrofti populations based on the Annual Transmission Potential; however, hosts from other moderate transmission villages should be sampled to shed further light on this hypothesis.

Michael and colleagues have used population genetic models to gain insights into the development of drug resistance in W. bancrofti populations and have highlighted the importance of understanding the population biology of this parasite and of studying vector biting patterns in Mass Drug Administration interventions (Michael et al., 2004).

A better understanding of the population structure and genetic differentiation of this parasite will provide important insights into patterns of transmission, disease outcome, and anthelmintic drug resistance, and influence the design and implementation of public health interventions aimed at eliminating this disease.

References

Anderson, T. J. C., Blouin, M. S., & Beech, R. N. (1998). Population Biology of Parasitic Nematodes: Applications of Genetic Markers. Advances in Parasitology, 41, 219-283.

Angellotti, M. C., Bhuiyan, S. B., Chen, G., & Wan, X.-F. (2007). CodonO: codon usage bias analysis within and across genomes. Nucleic acids research, 35(Web Server issue), W132-6. doi:10.1093/nar/gkm392

Ardelli, B. F., Guerriero, S. B., & Prichard, R. K. (2006). Ivermectin imposes selection pressure on P-glycoprotein from Onchocerca volvulus: linkage disequilibrium and genotype diversity. Parasitology, 132(Pt 3), 375-86. doi:10.1017/S0031182005008991

Avise, J. C., Arnold, J., Ball, R. M., Bermingham, E., Neigel, J. E., Reeb, C. A., Saunders, N. C., et al. (1987). Intraspecific phylogeography: the mitochondrial DNA bridge between population genetics and systematics. Annual Review of Ecology and Systematics, 18, 489-522.

Balloux, F., & Lugon-Moulin, N. (2002). The estimation of population differentiation with microsatellite markers. Molecular ecology, 11(2), 155-65. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/11856418

Benson, G. (1999). Tandem repeats finder: a program to analyze DNA sequences. Nucleic acids research, 27(2), 573-80. Retrieved from http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=148217&tool=pmcentrez&rendertype=abstract

Bisht, R., Hoti, S. L., Thangadurai, R., & Das, P. K. (2006). Isolation of Wuchereria bancrofti microfilariae from archived stained blood slides for use in genetic studies and amplification of parasite and endosymbiont genes. Acta tropica, 99(1), 1-5. doi:10.1016/j.actatropica.2005.12.009

Blouin, M. S., Liu, J., & Berry, R. E. (1999). Life cycle variation and the genetic structure of nematode populations. Heredity, 83(Pt 3), 253-9. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/10504422

Bockarie, M. J., Alexander, N. D., Hyun, P., Dimber, Z., Bockarie, F., Ibam, E., Alpers, M. P., et al. (1998). Randomised community-based trial of annual single-dose diethylcarbamazine with or without ivermectin against Wuchereria bancrofti infection in human beings and mosquitoes. Lancet, 351(9097), 162-8. doi:10.1016/S0140-6736(97)07081-5

Bockarie, M. J., & Kazura, J. W. (2003). Lymphatic filariasis in Papua New Guinea: prospects for elimination. Medical microbiology and immunology, 192(1), 9-14. doi:10.1007/s00430-002-0153-y

Bockarie, M. J., Tisch, D. J., Kastens, W., Alexander, N. D. E., Dimber, Z., Bockarie, F., Ibam, E., et al. (2002). Mass treatment to eliminate filariasis in Papua New Guinea. The New England journal of medicine, 347(23), 1841-8. doi:10.1056/NEJMoa021309

Bohonak, A. J. (1999). IBD (Isolation by Distance): A Program for Analyses of Isolation by Distance, (1994), 153-154.

Braisher, T. L., Gemmell, N. J., Grenfell, B. T., & Amos, W. (2004). Host isolation and patterns of genetic variability in three populations of Teladorsagia from sheep. International journal for parasitology, 34(10), 1197-204. doi:10.1016/j.ijpara.2004.06.005

Bryan, J. H. (2006). Vectors of Wuchereria bancrofti in the Sepik Provinces of Papua New Guinea, (March 1983).

Casiraghi, M., Anderson, T. J., Bandi, C., Bazzocchi, C., & Genchi, C. (2001). A phylogenetic analysis of filarial nematodes: comparison with the phylogeny of Wolbachia endosymbionts. Parasitology, 122 Pt 1, 93-103. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/11197770

Charlwood, J. D., & Bryan, J. H. (1987). A mark-recapture experiment with the filariasis vector Anopheles punctulatus in Papua New Guinea. Annals of tropical medicine and parasitology, 81(4), 429-436. Maney. Retrieved from http://cat.inist.fr/?aModele=afficheN&cpsidt=7381440

Churcher, T. S., Schwab, A. E., Prichard, R. K., & Basáñez, M.-G. (2008). An analysis of genetic diversity and inbreeding in Wuchereria bancrofti: implications for the spread and detection of drug resistance. PLoS neglected tropical diseases, 2(4), 1-9. doi:10.1371/journal.pntd.0000211

Clement, M., Posada, D., & Crandall, K. (2000). TCS: Phylogenetic network estimation using statistical parsimony. Molecular Ecology, 9(10), 1657-1660.

Courtright, E. M., Wall, D. H., Virginia, R. A., Frisse, L. M., Vida, J. T., & Thomas, W. K. (2000). Nuclear and Mitochondrial DNA Sequence Diversity in the Antarctic Nematode Scottnema lindsayae. Journal of nematology, 32(2), 143-53. Retrieved from http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2620446&tool=pmcentrez&rendertype=abstract

Criscione, C. D., Poulin, R., & Blouin, M. S. (2005). Molecular ecology of parasites: elucidating ecological and microevolutionary processes. Molecular ecology, 14(8), 2247-57. doi:10.1111/j.1365-294X.2005.02587.x

Cupp, E. W., Sauerbrey, M., & Richards, F. (2011). Elimination of human onchocerciasis: history of progress and current feasibility using ivermectin (Mectizan®) monotherapy. Acta tropica, 120 Suppl, S100-8. Elsevier B.V. doi:10.1016/j.actatropica.2010.08.009

Dhamodharan, R., Das, M. K., Hoti, S. L., Das, P. K., & Dash, A. P. (2008a). Genetic variability of diurnally sub-periodic Wuchereria bancrofti in Nicobarese tribe of Nicobar group of Islands, Andaman and Nicobar Islands, India. Parasitology research, 103(1), 59-66. doi:10.1007/s00436-008-0927-2

Dhamodharan, R., Das, M. K., Hoti, S. L., Das, P. K., & Dash, A. P. (2008b). Genetic variability of diurnally sub-periodic Wuchereria bancrofti in Nicobarese tribe of Nicobar group of Islands, Andaman and Nicobar Islands, India. Parasitology research, 103(1), 59-66. doi:10.1007/s00436-008-0927-2

Edgar, R. C., Haas, B. J., Clemente, J. C., Quince, C., & Knight, R. (2011). UCHIME improves sensitivity and speed of chimera detection. Bioinformatics (Oxford, England), 27(16), 2194-200. doi:10.1093/bioinformatics/btr381

Elard, L., & Humbert, J. F. (1999). Importance of the mutation of amino acid 200 of the isotype 1 beta-tubulin gene in the benzimidazole resistance of the small-ruminant parasite Teladorsagia circumcincta. Parasitology research, 85(6), 452-6. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/10344538

Esterre, P., Plichart, C., Sechan, Y., & Nguyen, N. L. (2001). The impact of 34 years of massive DEC chemotherapy on Wuchereria bancrofti infection and transmission: the Maupiti cohort. Tropical medicine & international health : TM & IH, 6(3), 190-5. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/11299035

Ewens, W. J. (1972). The Sampling Theory of Selectively Neutral Alleles. Theoretical Population Biology, 3, 87-112.

Excoffier, L., & Lischer, H. E. L. (2010). Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Molecular ecology resources, 10(3), 564-7. doi:10.1111/j.1755-0998.2010.02847.x

Fahmy, T. (2003). XLSTAT-Pro 7.0. Paris, France.

Feagin, J. E. (2000). Mitochondrial genome diversity in parasites. International journal for parasitology, 30(4), 371-90. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/10731561

Felsenstein, J. (1989). PHYLIP - Phylogeny Inference Package (Version 3.2). Cladistics, 5, 164-166.

Gasarasi, D. (2000). The transmission dynamics of bancroftian filariasis: the distribution of the infective larvae of Wuchereria bancrofti in Culex quinquefasciatus and Anopheles gambiae and its effect on parasite escape from the vector. Trans R Soc Trop Med Hyg, 94, 341-347.

Ghedin, E., Wang, S., Spiro, D., Caler, E., Zhao, Q., Crabtree, J., Allen, J. E., et al. (2007). Draft genome of the filarial nematode parasite Brugia malayi. Science (New York, N.Y.), 317(5845), 1756-60. doi:10.1126/science.1145406

Global programme to eliminate lymphatic filariasis: Annual Report on Lymphatic Filariasis. (2002). World Health.

Gyapong, J. O., Kyelem, D., Kleinschmidt, I., Agbo, K., Ahouandogbo, F., Gaba, J., Owusu-Banahene, G., et al. (2002). The use of spatial analysis in mapping the distribution of bancroftian filariasis in four West African countries. Annals of Tropical Medicine and Parasitology, 96(7), 695-705. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/12537631

Harb, M., Faris, R., Gad, A. M., Hafez, O. N., Ramzy, R., & Buck, A. A. (1993). The resurgence of lymphatic filariasis in the Nile delta. Bulletin of the World Health Organization, 71(1), 49-54. Retrieved from http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2393420&tool=pmcentrez&rendertype=abstract

Hartl, D. L., & Clark, A. G. (1997). Principles of Population Genetics (3rd ed.). Sinauer Associates Inc.

Higazi, T. B., Klion, A. D., Boussinesq, M., & Unnasch, T. R. (2004). Genetic heterogeneity in Loa loa parasites from southern Cameroon: A preliminary study. Filaria journal, 3(1), 4. doi:10.1186/1475-2883-3-4

Hoti, S. L., Thangadurai, R., Dhamodharan, R., & Das, P. K. (2008a). Genetic heterogeneity of Wuchereria bancrofti populations at spatially hierarchical levels in Pondicherry and surrounding areas, south India. Infection, genetics and evolution : journal of molecular epidemiology and evolutionary genetics in infectious diseases, 8(5), 644-52. doi:10.1016/j.meegid.2008.06.002

Hoti, S. L., Thangadurai, R., Dhamodharan, R., & Das, P. K. (2008b). Genetic heterogeneity of Wuchereria bancrofti populations at spatially hierarchical levels in Pondicherry and surrounding areas, south India. Infection, genetics and evolution : journal of molecular epidemiology and evolutionary genetics in infectious diseases, 8(5), 644-52. doi:10.1016/j.meegid.2008.06.002

Hu, M., Chilton, N. B., Abs El-Osta, Y. G., & Gasser, R. B. (2003). Comparative analysis of mitochondrial genome data for Necator americanus from two endemic regions reveals substantial genetic variation. International Journal for Parasitology, 33(9), 955-963. doi:10.1016/S0020-7519(03)00129-2

Hu, M., Gasser, R. B., Abs El-Osta, Y. G., & Chilton, N. B. (2003). Structure and organization of the mitochondrial genome of the canine heartworm, Dirofilaria immitis. Parasitology, 127(1), 37-51. doi:10.1017/S0031182003003275

Hu, M., Chilton, N. B., & Gasser, R. B. (2002). The mitochondrial genomes of the human hookworms, Ancylostoma duodenale and Necator americanus (Nematoda: Secernentea). International journal for parasitology, 32(2), 145-58. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/11812491

Hu, M., & Gasser, R. B. (2006). Mitochondrial genomes of parasitic nematodes - progress and perspectives. Trends in parasitology, 22(2), 78-84. doi:10.1016/j.pt.2005.12.003

Hudson, R. R. (2000). A new statistic for detecting genetic differentiation. Genetics, 155(4), 2011-4. Retrieved from http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1461195&tool=pmcentrez&rendertype=abstract

Hudson, R. R., Boos, D. D., & Kaplan, N. L. (1992). A statistical test for detecting geographic subdivision. Molecular biology and evolution, 9(1), 138-51. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/1552836

Kazura, J. W., Spark, R., Forsyth, K., Brown, G., Heywood, P., Peters, P., & Alpers, M. (1984). Parasitologic and clinical features of bancroftian filariasis in a community in East Sepik Province, Papua New Guinea. The American journal of tropical medicine and hygiene, 33(6), 1119-23. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/6391222

Keddie, E. M., Higazi, T., & Unnasch, T. R. (1998). The mitochondrial genome of Onchocerca volvulus: sequence, structure and phylogenetic analysis. Molecular and biochemical parasitology, 95(1), 111-27. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/9763293

Krawczak, M., Reiss, J., Schmidtke, J., & Rösler, U. (1989). Polymerase chain reaction: replication errors and reliability of gene diagnosis. Nucl. Acids Res, 17(6), 2197-2201.

Kwa, M. S., Veenstra, J. G., & Roos, M. H. (1993). Molecular characterisation of beta-tubulin genes present in benzimidazole-resistant populations of Haemonchus contortus. Molecular and biochemical parasitology, 60(1), 133-43. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/8366887

Kwa, M. S., Veenstra, J. G., & Roos, M. H. (1994). Benzimidazole resistance in Haemonchus contortus is correlated with a conserved mutation at amino acid 200 in beta-tubulin isotype 1. Molecular and biochemical parasitology, 63(2), 299-303. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/19616042

Laslett, D., & Canbäck, B. (2008). ARWEN: a program to detect tRNA genes in metazoan mitochondrial nucleotide sequences. Bioinformatics (Oxford, England), 24(2), 172-5. doi:10.1093/bioinformatics/btm573

Lavrov, D. V., & Brown, W. M. (2001). Trichinella spiralis mtDNA: a nematode mitochondrial genome that encodes a putative ATP8 and normally structured tRNAs and has a gene arrangement relatable to those of coelomate metazoans. Genetics, 157(2), 621-37. Retrieved from http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1461501&tool=pmcentrez&rendertype=abstract

Librado, P., & Rozas, J. (2009). DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics (Oxford, England), 25(11), 1451-2. doi:10.1093/bioinformatics/btp187

Lowe, T. M., & Eddy, S. R. (1997). tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic acids research, 25(5), 955-64. Retrieved from http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=146525&tool=pmcentrez&rendertype=abstract

Mehlotra, R. K., Gray, L. R., Blood-Zikursh, M. J., Kloos, Z., Henry-Halldin, C. N., Tisch, D. J., Thomsen, E., et al. (2010). Molecular-based assay for simultaneous detection of four Plasmodium spp. and Wuchereria bancrofti infections. The American journal of tropical medicine and hygiene, 82(6), 1030-3. doi:10.4269/ajtmh.2010.09-0665

Menon, P. K., & Rajagopalan, P. K. (1977). Mosquito control potential of some species of indigenous fishes in Pondicherry. Indian Journal of Medical Research, 66(5), 765-771.

Michael, E., & Bundy, D. A. P. (1997). Global mapping of lymphatic filariasis. Parasitology today (Personal ed.), 13(12), 472-6. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/15275135

Michael, E., Malecela-Lazaro, M. N., Simonsen, P. E., Pedersen, E. M., Barker, G., Kumar, A., & Kazura, J. W. (2004). Mathematical modelling and the control of lymphatic filariasis. The Lancet, 44, 223-234.

Moulia-Pelat, J., Glaziou, P., Weil, G., Nguyen, L., Gaxotte, P., & Nicolas, L. (1995). Combination ivermectin plus diethylcarbamazine, a new effective tool for control of lymphatic filariasis. Annals of Tropical Medicine and Parasitology, 45, 9-12.

Molyneux, D. H., Bradley, M., Hoerauf, A., Kyelem, D., & Taylor, M. J. (2003). Mass drug treatment for lymphatic filariasis and onchocerciasis. Trends in Parasitology, 19(11), 516-522. doi:10.1016/j.pt.2003.09.004

Nei, M. (1987). Molecular Evolutionary Genetics (p. 294). Columbia University Press.

Ohtsuki, T., Watanabe, Y., Takemoto, C., Kawai, G., Ueda, T., Kita, K., Kojima, S., et al. (2001). An "elongated" translation elongation factor Tu for truncated tRNAs in nematode mitochondria. The Journal of biological chemistry, 276(24), 21571-7. doi:10.1074/jbc.M011118200

Ohtsuki, T., Sato, A., Watanabe, Y.-I., & Watanabe, K. (2002). A unique serine-specific elongation factor Tu found in nematode mitochondria. Nature structural biology, 9(9), 669-73. doi:10.1038/nsb826

Ojala, D., Montoya, J., & Attardi, G. (1981). tRNA punctuation model of RNA processing in human mitochondria. Nature, 290, 470-474.

Okimoto, R., & Wolstenholme, D. R. (1990). A set of tRNAs that lack either the T psi C arm or the dihydrouridine arm: towards a minimal tRNA adaptor. The EMBO journal, 9(10), 3405-11. Retrieved from http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=552080&tool=pmcentrez&rendertype=abstract

Okimoto, R., Macfarlane, J. L., Clary, D., & Wolstenholme, D. R. (1992). The Mitochondrial Genomes of Two Nematodes, Caenorhabditis elegans and Ascaris suum. Genetics, 130(3), 471-498.

Pfenninger, M., Bahl, A., & Streit, B. (1996). Isolation by distance in a population of a small land snail Trochoidea geyeri: evidence from direct and indirect methods. Proceedings: Biological Sciences, 263(1374), 1211-1217.

Pradeep Kumar, N., Patra, K. P., Hoti, S. L., & Das, P. K. (2002). Genetic variability of the human filarial parasite, Wuchereria bancrofti in South India. Acta tropica, 82(1), 67-76. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/11904105

Prichard, R. (2001). Genetic variability following selection of Haemonchus contortus with anthelmintics. Trends in parasitology, 17(9), 445-53. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/11530357

Ramesh, A., Small, S. T., Kloos, Z. A., Kazura, J. W., Nutman, T. B., Serre, D., & Zimmerman, P. A. (2012). The complete mitochondrial genome sequence of the filarial nematode Wuchereria bancrofti from three geographic isolates provides evidence of complex demographic history. Molecular and biochemical parasitology, 1-10. Elsevier B.V. doi:10.1016/j.molbiopara.2012.01.004

Rice, P. (2000). EMBOSS: The European Molecular Biology Open Software Suite. Trends in Genetics, 16(6), 276-277.

Schattner, P., Brooks, A. N., & Lowe, T. M. (2005). The tRNAscan-SE, snoscan and snoGPS web servers for the detection of tRNAs and snoRNAs. Nucleic acids research, 33(Web Server issue), W686-9. doi:10.1093/nar/gki366

Schwab, A. E., Boakye, D. A., Kyelem, D., & Prichard, R. K. (2005). Detection of benzimidazole resistance-associated mutations in the filarial nematode Wuchereria bancrofti and evidence for selection by albendazole and ivermectin combination treatment. The American journal of tropical medicine and hygiene, 73(2), 234-8. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/16103581

Sharp, P. M., & Matassi, G. (1994). Codon usage and genome evolution. Current opinion in genetics & development, 4(6), 851-60. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/7888755

Strobeck, C. (1987). Average number of nucleotide differences in a sample from a single subpopulation: a test for population subdivision. Genetics, 117(1), 149-53. Retrieved from http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1203183&tool=pmcentrez&rendertype=abstract

Sunish, I. P., Rajendran, R., Mani, T. R., Gajanana, A., Reuben, R., & Satyanarayana, K. (2003). Long-term population migration: an important aspect to be considered during mass drug administration for elimination of lymphatic filariasis. Tropical medicine & international health : TM & IH, 8(4), 316-21. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/12667150

Thangadurai, R., Hoti, S. L., Kumar, N. P., & Das, P. K. (2006). Phylogeography of human lymphatic filarial parasite, Wuchereria bancrofti in India. Acta tropica, 98(3), 297-304. doi:10.1016/j.actatropica.2006.06.004

Venkatesan, M., Westbrook, C. J., Hauer, M. C., & Rasgon, J. L. (2007). Evidence for a population expansion in the West Nile virus vector Culex tarsalis. Molecular biology and evolution, 24(5), 1208-18. doi:10.1093/molbev/msm040

Volkman, S. K., Neafsey, D. E., Schaffner, S. F., Park, D. J., & Wirth, D. F. (2012). Harnessing genomics and genome biology to understand malaria biology. Nature reviews. Genetics, 13(5), 315-328. Nature Publishing Group. doi:10.1038/nrg3187

Watterson, G. (1975). On the number of segregating sites in genetical models without recombination. Theoretical Population Biology, 7, 265-276.

Winterrowd, C., Pomroy, W., Sangster, N., Johnson, S., & Geary, T. (2003). Benzimidazole-resistant β-tubulin alleles in a population of parasitic nematodes (Cooperia oncophora) of cattle. Veterinary Parasitology, 117(3), 161-172. doi:10.1016/j.vetpar.2003.09.001

Won, K. Y., Rochars, M. B. D., Kyelem, D., Streit, T. G., & Patrick, J. (2009). Assessing the Impact of a Missed Mass Drug Administration in Haiti. PLoS neglected tropical diseases, 3(8), 8-10. doi:10.1371/journal.pntd.0000443

Wright, S. (1943). Isolation by distance. Genetics, 28(March), 114-138.

Wright, S., & McPhee, H. C. (1925). An approximate method of calculating coefficients of inbreeding and relationship. J. Agric. Res, 31, 377-383.

Xu, S., Pugach, I., Stoneking, M., Kayser, M., Jin, L., & the HUGO Pan-Asian SNP Consortium. (2012). Genetic dating indicates that the Asian-Papuan admixture through Eastern Indonesia corresponds to the Austronesian expansion. Proceedings of the National Academy of Sciences of the United States of America. doi:10.1073/pnas.1118892109

Yatawara, L., Wickramasinghe, S., Rajapakse, R. P. V. J., & Agatsuma, T. (2010). The complete mitochondrial genome of Setaria digitata (Nematoda: Filarioidea): Mitochondrial gene content, arrangement and composition compared with other nematodes. Molecular and biochemical parasitology, 173(1), 32-8. Elsevier B.V. doi:10.1016/j.molbiopara.2010.05.004
